Abstract
In Wales, 24.8% of children aged 4-5 years live with overweight/obesity. Obesity is linked to the development of multiple long-term conditions. We aimed to predict childhood obesity using healthcare and wider demographic, socioeconomic, and area-level data. The Secure Anonymized Information Linkage (SAIL) Databank in Wales contains routinely collected, individual-level anonymized data from health records and administrative sources. Two subsamples were created. The first was restricted to singleton births between 15 March 2010 and 28 March 2012 so that Census 2011 data could be included. The second included births after 1 January 2014 so that early-life measurements could be included. Age- and sex-adjusted body mass index (BMI) at 4-5 years was used to define the outcome of overweight/obesity (≥91st centile). Backward stepwise logistic regression with multivariable fractional polynomials was used to develop models in stages. Data were available on 53 815 children at 4-5 years in the census subsample and 60 990 children in the early-life subsample. Maternal BMI, smoking, marital status, birthweight, ethnic group, gender, and breastfeeding at birth were retained in all models. Additional variables were retained when census and area-level factors were added, but the increase in discrimination (area under the curve, AUC) was marginal (0.66-0.67). In the second subsample, AUC improved from 0.67 to 0.79 as factors up to weight at 27 months were incorporated. The factors identified from healthcare records were largely consistent with the existing literature. Including census data provided additional insights, though the increase in model discrimination was marginal. Childhood obesity can act as a mediator on the pathway to multiple long-term conditions, and risk identification tools may help to target early prevention.