Abstract
BACKGROUND: Diabetes significantly increases the risk of cognitive impairment, particularly mild cognitive impairment (MCI). Early identification of individuals at risk for MCI is crucial for timely intervention. This study aimed to develop and validate a machine learning-based model to predict MCI in patients with type 2 diabetes mellitus (T2DM). METHODS: Participants with T2DM who had completed cognitive assessments were included. Feature selection was performed using statistical methods and genetic programming to reduce collinearity. Six classification models were trained and evaluated using cross-validation and hyperparameter tuning. External validation was performed with cohorts from the Jiangsu DiabEtes COgnitive Dysfunction Early Diagnosis and Intervention (DECODE) study and the Third National Health and Nutrition Examination Survey (NHANES III). SHapley Additive exPlanations (SHAP) analysis was used to identify key predictors, and a web interface was developed for practical application. RESULTS: A total of 2074 participants were included. Significant predictors were education, age, GCA index (glycolipid metabolism), systolic blood pressure, estimated glomerular filtration rate (eGFR), body mass index (BMI), and diabetes duration. The support vector classifier (SVC) model achieved the highest performance, with an AUC of 0.74 ± 0.04, an F1 score of 0.62 ± 0.06, and a recall of 0.74 ± 0.09 in internal validation. External validation with the DECODE cohort yielded an AUC of 0.80, an F1 score of 0.80, and a recall of 0.89. NHANES III validation confirmed the model's reliability in predicting MCI risk. CONCLUSIONS: This study compared machine learning models for diagnosing MCI in patients with T2DM. The SVC model demonstrated strong efficacy and accuracy, highlighting the potential of machine learning in diagnosing MCI in this population.
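
The internal-validation figures are reported as a value ± a spread. As a minimal sketch of how such summaries can be computed, the Python below derives AUC, F1 score, and recall for each cross-validation fold and then reports them as mean ± standard deviation. It rests on assumptions not stated in the abstract: that the ± term is the sample standard deviation across folds, and that predictions are thresholded at 0.5. The function names and the `toy_folds` data are hypothetical and invented for illustration; they are not the study's code or data.

```python
from statistics import mean, stdev


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn) for binary labels where 1 = MCI and 0 = no MCI."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, fn


def recall_score(y_true, y_pred):
    """Fraction of true MCI cases that the model flagged."""
    tp, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if tp + fn else 0.0


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    tp, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize(values):
    """Return (mean, sample standard deviation) of per-fold metric values."""
    return mean(values), stdev(values)


# Hypothetical per-fold (labels, predicted scores); invented values, NOT study data.
toy_folds = [
    ([1, 1, 0, 0, 1, 0], [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]),
    ([1, 0, 0, 1, 0, 1], [0.8, 0.1, 0.6, 0.7, 0.3, 0.4]),
    ([0, 1, 1, 0, 0, 1], [0.2, 0.9, 0.5, 0.4, 0.6, 0.8]),
]

if __name__ == "__main__":
    threshold = 0.5  # assumed decision threshold
    aucs, f1s, recalls = [], [], []
    for labels, scores in toy_folds:
        preds = [1 if s >= threshold else 0 for s in scores]
        aucs.append(roc_auc(labels, scores))
        f1s.append(f1_score(labels, preds))
        recalls.append(recall_score(labels, preds))
    for name, vals in (("AUC", aucs), ("F1", f1s), ("Recall", recalls)):
        m, sd = summarize(vals)
        print(f"{name}: {m:.2f} ± {sd:.2f}")
```

AUC is computed from the continuous scores, whereas F1 and recall depend on the chosen threshold, so the three metrics can move independently when the threshold changes.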