Abstract
BACKGROUND: The prevalence of cardiovascular metabolic diseases (CMD) is rising among adults aged ≥ 45 years, and vision impairment (VI) is common in this population. This study aimed to identify the key determinants of VI in individuals with CMD and to develop risk prediction models.
METHODS: We analyzed data collected in 2011 (n = 1,926) and 2015 (n = 3,033) within the China Health and Retirement Longitudinal Study (CHARLS). Risk factors were selected using least absolute shrinkage and selection operator (LASSO) regression, followed by multivariable logistic regression analysis. Eight machine learning (ML) algorithms were applied: logistic regression (LR), gradient boosting machine (GBM), XGBoost, LightGBM, CatBoost, AdaBoost, neural network (NN), and support vector machine (SVM). Model performance was evaluated using receiver operating characteristic (ROC) curves, calibration assessments, and decision curve analysis.
RESULTS: Eleven predictors were significantly associated with VI in CMD patients: hearing impairment, depressive symptoms, pain, lower uric acid levels, poorer self-rated health, functional limitations, multimorbidity, reduced cognitive function, poorer sleep quality, and histories of glaucoma and cataract surgery. Among the eight ML algorithms, LR achieved the most stable performance, with areas under the ROC curve (AUCs) of 0.705 (2015 training set), 0.693 (2015 internal validation set), and 0.695 (2011 temporal validation set). Shapley Additive exPlanations (SHAP) analysis ranked the relative contribution of the predictors, and a nomogram was developed for individualized risk estimation.
CONCLUSIONS: We established an LR-based prediction model for VI in patients with CMD aged ≥ 45 years that showed stable accuracy and favorable interpretability in clinical settings. This tool may support timely recognition of, and intervention for, eye health risks in this population, particularly in settings with limited ophthalmic resources.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12886-025-04596-6.