Abstract
Diabetic peripheral vascular disease (DPVD) and diabetic foot (DF) are major complications that lead to disability in diabetic patients and severely impair their quality of life. This study gathered cross-sectional data from 1240 patients with type 2 diabetes and its complications in the departments of vascular surgery and endocrinology of the Second Affiliated Hospital of Zhejiang University School of Medicine. In pre-processing, samples with serious data loss were eliminated, and the remaining missing values were imputed using methods such as MICEforest. Random forest (RF), support vector machine (SVM), backpropagation neural network (BPNN), extreme gradient boosting (XGBoost), and SHapley Additive exPlanations (SHAP) were then employed to rank the importance of 27 indicators, and the entropy weight method was applied to assign a comprehensive weight to each indicator. Finally, a genetic algorithm-optimized backpropagation neural network (GA-BPNN) was introduced to construct a prediction model for diabetes complications, and the SHAP algorithm was applied to obtain the weight and importance ranking of each risk factor in the prediction model. The comprehensive weighting approach identified the top 17 key indicators. Among the five classification models evaluated, GA-BPNN exhibited the best performance in both classification tasks, diabetes and DPVD (G1) and DPVD and DF (G2), achieving areas under the receiver operating characteristic curve (AUC) of 0.79 and 0.89, accuracies of 0.78 and 0.80, and F1-scores of 0.77 and 0.83, respectively. Furthermore, hypothesis testing indicated that indicators such as fibrinogen and C-reactive protein differed significantly between groups, and SHAP feature importance analysis likewise highlighted the influence of these features in identifying diabetic complications. GA-BPNN can therefore be employed as a prediction model for DPVD and DF.
In feature selection, the comprehensive weighting method and SHAP analysis identified the key features. In summary, this study constructed a comprehensive prediction model based on machine learning and interpretable algorithms, integrating diabetes-specific indicators, traditional cardiovascular risk factors, coagulation function, inflammatory markers, and cardiac structural parameters. The model can effectively identify patients at high risk of diabetic complications and uncover potentially relevant features, thereby supporting subsequent efforts to reduce the incidence of these complications.
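
As an illustration of the comprehensive weighting step, the following is a minimal sketch of the standard entropy weight method. It assumes that each indicator receives one non-negative importance score from each ranking method and that these scores are arranged as a matrix (rows are indicators, columns are methods); this layout, the function names, and the toy values are assumptions made for illustration and are not taken from the study.

```python
import math


def entropy_weights(scores):
    """Return one entropy-based weight per ranking method (column).

    scores: list of rows, one row per indicator; each row holds one
    non-negative importance value per ranking method (assumed layout).
    """
    n = len(scores)          # number of indicators
    m = len(scores[0])       # number of ranking methods
    divergences = []
    for j in range(m):
        column = [row[j] for row in scores]
        total = sum(column)
        # Share of each indicator within method j.
        shares = [v / total for v in column]
        # Shannon entropy scaled to [0, 1]; 0 * ln(0) is treated as 0.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        # A method that spreads importance evenly carries little information.
        divergences.append(max(0.0, 1.0 - entropy))
    d_sum = sum(divergences)
    if d_sum <= 1e-12:
        # Every method is uninformative: fall back to equal weights.
        return [1.0 / m] * m
    return [d / d_sum for d in divergences]


def composite_scores(scores):
    """Combine the per-method shares of each indicator using entropy weights."""
    weights = entropy_weights(scores)
    m = len(weights)
    totals = [sum(row[j] for row in scores) for j in range(m)]
    return [
        sum(weights[j] * row[j] / totals[j] for j in range(m))
        for row in scores
    ]


# Hypothetical importance scores: 4 indicators x 3 ranking methods.
toy_scores = [
    [0.40, 0.25, 0.70],
    [0.30, 0.25, 0.10],
    [0.20, 0.25, 0.10],
    [0.10, 0.25, 0.10],
]
combined = composite_scores(toy_scores)
ranking = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
print(ranking)
```

In this sketch, a method that scores all indicators equally (the second column of the toy data) receives a weight close to zero, so the combined ranking is driven by the methods that discriminate between indicators. Selecting a fixed number of top-ranked indicators then reduces to truncating `ranking`.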