Abstract
BACKGROUND: Remnant cholesterol (RC), a key component of triglyceride-rich lipoproteins, is an established predictor of atherosclerotic cardiovascular disease. However, its longitudinal associations with the development of a broad spectrum of other chronic diseases, particularly in middle-aged and older populations, remain largely unexplored.

METHODS: This nationwide prospective cohort study analyzed data from 8,828 adults (aged ≥ 45 years) in the China Health and Retirement Longitudinal Study (CHARLS). We used Cox proportional hazards models to assess longitudinal associations between RC and 14 new-onset chronic conditions, each ascertained via self-reported doctor diagnosis. In addition, interpretable machine learning models were developed and validated, with SHapley Additive exPlanations (SHAP) analysis used to quantify feature importance.

RESULTS: In fully adjusted Cox models, elevated RC was significantly associated with increased risks of new-onset diabetes, dyslipidemia, kidney disease, and liver disease (hazard ratios [HRs] ranging from 1.16 to 1.30, all P < 0.05). Among the machine learning models, XGBoost demonstrated excellent predictive performance for all four conditions (areas under the curve [AUCs] ranging from 0.819 to 0.906). In SHAP analysis, RC was consistently identified as one of the top features contributing to the model’s classification output, highlighting its potential utility in risk stratification.

CONCLUSION: Using both Cox regression and machine learning in a large prospective cohort, this study identifies RC as a robust, independent risk factor for incident diabetes, dyslipidemia, kidney disease, and liver disease. Our findings support incorporating RC into routine screening to enhance clinical risk stratification and targeted interventions.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13098-026-02106-2.
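
The two summary statistics reported above, hazard ratios from Cox models and AUCs from the classifiers, can be illustrated with a minimal sketch. This is not the study's analysis code: the function names `hazard_ratio` and `auc` and all input numbers are hypothetical and chosen only for illustration. The first function converts a Cox log-hazard coefficient and its standard error into a hazard ratio with an approximate 95% Wald confidence interval; the second computes an AUC as the probability that a randomly chosen case receives a higher predicted score than a randomly chosen non-case.

```python
import math


def hazard_ratio(beta, se, z=1.96):
    """Turn a Cox log-hazard coefficient into a hazard ratio.

    Returns (HR, lower, upper), where the bounds form an approximate
    95% Wald confidence interval when z = 1.96.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def auc(labels, scores):
    """Pairwise (rank-based) AUC for binary labels coded 0/1.

    Counts the fraction of case/non-case pairs in which the case has
    the higher score; tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical inputs for illustration only (not values from the study).
hr, lower, upper = hazard_ratio(beta=0.18, se=0.05)
print(f"HR = {hr:.2f} (95% CI {lower:.2f} to {upper:.2f})")

toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(f"AUC = {toy_auc:.2f}")
```

A hazard ratio above 1 whose confidence interval excludes 1 corresponds to a statistically significant increase in risk at the 5% level, which is how the reported associations would typically be read. An AUC of 0.5 indicates no discrimination and 1.0 indicates perfect separation of cases from non-cases.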