Early Type 2 diabetes risk prediction using explainable machine learning in a two-stage approach

Abstract

BACKGROUND: Diabetes is a chronic disease characterized by elevated blood glucose levels. Without early detection and proper management, it can lead to serious complications and increased healthcare costs. Its global prevalence is rising, and many cases remain undiagnosed. In this study, we developed an explainable machine learning model for predicting diabetes using a two-stage approach.

METHODS: Five machine learning (ML) models were trained and evaluated in two stages: Multi-Layer Perceptron (MLP), Support Vector Machine, K-Nearest Neighbor, Extreme Gradient Boosting (XGBoost), and Naïve Bayes. In Stage one, a public dataset of 520 samples was used, and Shapley Additive exPlanations (SHAP) and MLP weights were applied for feature selection. In Stage two, the same models were trained and evaluated on a dataset of 270,943 samples collected in Rwanda. SHAP was further used to explain the model output.

RESULTS: In Stage one, the MLP model achieved the best performance on the public dataset, with an accuracy of 95.19%. Feature selection identified the top 10 influential predictors of diabetes risk, including those recommended by diabetes care providers in Rwanda. In Stage two, the XGBoost model outperformed the other models, achieving an accuracy of 97.14%.

CONCLUSION: This study presents a two-stage, explainable machine learning framework for systematic screening for type 2 diabetes. The first stage evaluates risk based on reported symptoms, while the second stage incorporates demographic, anthropometric, and vital sign data for a refined risk assessment. Integrating these models into the mUzima mobile application can enhance community health workers' capacity to identify and refer high-risk individuals. By enabling early and accurate detection, the proposed approach has the potential to reduce undiagnosed diabetes and support improved disease management.
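To illustrate the idea behind the SHAP-based feature ranking mentioned in the Methods, the following is a minimal sketch of an exact Shapley value computation for a single sample. It is not the authors' pipeline and does not use the SHAP library: the function `shapley_values`, the toy scoring function `toy_risk`, its coefficients, and the three symptom names are all hypothetical and chosen only for illustration. Features outside a coalition are replaced by a baseline value, which is one common convention and is assumed here.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample x, relative to a baseline input.

    Features absent from a coalition take their baseline value.
    The cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without_i = [x[j] if j in coalition else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_risk(features):
    """Hypothetical risk score over three binary symptoms (not the paper's model)."""
    polyuria, polydipsia, weight_loss = features
    return (0.1 + 0.4 * polyuria + 0.3 * polydipsia + 0.1 * weight_loss
            + 0.1 * polyuria * polydipsia)


sample = [1, 1, 0]      # hypothetical patient: polyuria and polydipsia reported
baseline = [0, 0, 0]    # assumed reference: no symptoms reported
phi = shapley_values(toy_risk, sample, baseline)

# Rank features by absolute contribution, as a feature-selection step might do
ranking = sorted(range(len(phi)), key=lambda i: abs(phi[i]), reverse=True)
```

The contributions sum to the difference between the score for the sample and the score for the baseline, which is the property that makes Shapley values usable both for ranking features and for explaining individual predictions. For real models with many features, exact enumeration is infeasible, and approximations such as those in the SHAP library would be used instead.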
