Abstract
INTRODUCTION: Knee pain significantly impairs health and quality of life among middle-aged and older adults. However, the predictive utility of lipid metabolism biomarkers for knee pain risk remains underexplored. METHODS: This study used data from the China Health and Retirement Longitudinal Study (CHARLS, 2011-2013) to investigate the association between lipid-related metabolic indicators and the risk of knee pain. Multiple lipid biomarkers and composite indices, including the lipid accumulation product (LAP), the triglyceride-glucose (TyG) index, and TyG-BMI, were included as predictors. Five machine learning models were developed and their predictive performance was evaluated. Model interpretation was conducted using SHAP (SHapley Additive exPlanations) to identify the most influential predictors. RESULTS: A higher prevalence of knee pain was observed in high-altitude, cold regions such as Qinghai and Sichuan provinces. Composite metabolic indices (LAP, TyG, and TyG-BMI) exhibited stronger predictive power than traditional single lipid markers. Among the models, the Stacked Ensemble algorithm achieved the best performance, with an AUC of 0.85 and a Brier score of 0.13. SHAP analysis highlighted LAP and TyG-related indices as the top contributors to prediction outcomes. DISCUSSION: These findings emphasize the importance of lipid metabolism indicators in the early identification of knee pain risk. The integration of interpretable machine learning approaches and composite metabolic indices offers a promising strategy for personalized prevention in aging populations.
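The abstract does not state the formulas used for the composite indices. As an illustration only, the following minimal Python sketch computes LAP, TyG, and TyG-BMI from their commonly cited definitions, assuming triglycerides and fasting glucose are recorded in mg/dL and waist circumference in cm. The function names, the unit conversion factor, and the sex coding ("male"/"female") are assumptions made for this sketch and are not taken from the study.

```python
import math

# Assumed conversion factor: triglycerides in mg/dL per 1 mmol/L.
TG_MGDL_PER_MMOLL = 88.57


def tyg_index(tg_mgdl, glucose_mgdl):
    """TyG = ln(TG [mg/dL] * fasting glucose [mg/dL] / 2)."""
    return math.log(tg_mgdl * glucose_mgdl / 2.0)


def tyg_bmi(tg_mgdl, glucose_mgdl, bmi):
    """TyG-BMI = TyG * BMI [kg/m^2]."""
    return tyg_index(tg_mgdl, glucose_mgdl) * bmi


def lap(waist_cm, tg_mgdl, sex):
    """LAP = (waist [cm] - offset) * TG [mmol/L].

    The offset is 65 for men and 58 for women in the commonly cited
    definition. Values are not clipped here, so a waist below the
    offset yields a negative LAP.
    """
    if sex == "male":
        offset = 65.0
    elif sex == "female":
        offset = 58.0
    else:
        raise ValueError("sex must be 'male' or 'female'")
    tg_mmoll = tg_mgdl / TG_MGDL_PER_MMOLL
    return (waist_cm - offset) * tg_mmoll


if __name__ == "__main__":
    # Hypothetical participant, not drawn from CHARLS data.
    print(tyg_index(150.0, 100.0))
    print(tyg_bmi(150.0, 100.0, 25.0))
    print(lap(90.0, 150.0, "male"))
```

Each index combines a lipid measure with a second quantity (glucose, BMI, or waist circumference), which is what distinguishes these composite indices from single lipid markers such as triglycerides alone.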