Development of an explainable machine learning model for predicting the occurrence of advanced diabetic kidney disease



Abstract

AIMS: This study aimed to develop an interpretable machine learning (ML) model for predicting the occurrence of advanced diabetic kidney disease (DKD), so that patients can be identified at an early stage of the disease and receive timely and appropriate clinical intervention.

METHODS: Variable selection was performed using a combination of the least absolute shrinkage and selection operator (LASSO) and recursive feature elimination (RFE). Prediction models were constructed and validated using eight ML algorithms, and their performance was evaluated using the area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), F1 score, Brier score, calibration curve, and decision curve analysis (DCA). The SHapley Additive exPlanation (SHAP) and partial dependence plot (PDP) methods were used to interpret the model both locally and globally. Finally, the prediction model was deployed as a Shiny-based web application for direct use by clinicians and patients.

RESULTS: Serum creatinine, age, hemoglobin, serum urea, serum ALP, serum UA, platelet count, serum osmolality, serum bicarbonate, and monocyte count were identified as the most important variables in the advanced DKD model. Eight ML models were developed using these ten variables. Among them, the logistic regression (LR) model demonstrated accurate predictive ability in both internal and external validation, with AUCs of 0.948 (95% CI: 0.920-0.975) and 0.898 (95% CI: 0.883-0.913), respectively. The LR model also performed well in terms of accuracy, sensitivity, PPV, NPV, F1 score, and Brier score. The calibration curve and DCA indicated a high degree of consistency between the predicted and observed risks of the LR model, with a net benefit approaching full coverage. The LR-based model is freely available to clinicians as an online calculator: https://dev2333.shinyapps.io/logistics1/.

CONCLUSION: This study developed and validated an interpretable LR model for predicting the occurrence of advanced DKD. The LR model can assist clinical practice by identifying individuals at higher risk of advanced DKD at an early stage, allowing patients to receive timely and personalized treatment and providing a reliable basis for improving patient prognosis and optimizing the use of medical resources.
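The evaluation metrics listed in the METHODS can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `evaluate`, the 0.5 decision threshold, and the toy labels and probabilities are assumptions made purely for illustration, and the study's own data and software are not reproduced here. AUC is computed as the fraction of positive/negative pairs in which the positive case receives the higher predicted risk, with ties counted as one half.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    f1: float
    brier: float
    auc: float


def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def evaluate(y_true, y_prob, threshold=0.5):
    """Compute the abstract's scalar metrics for binary labels (1 = advanced DKD).

    The threshold of 0.5 is an illustrative assumption, not a value from the study.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Brier score: mean squared difference between predicted risk and outcome.
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

    # AUC: probability that a random positive is ranked above a random negative.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)

    return BinaryMetrics(
        accuracy=_ratio(tp + tn, len(y_true)),
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
        f1=_ratio(2 * tp, 2 * tp + fp + fn),
        brier=brier,
        auc=_ratio(wins, len(pos) * len(neg)),
    )


# Synthetic toy data, invented for this example only.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_probs = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
print(evaluate(toy_labels, toy_probs))
```

Sensitivity, specificity, PPV, NPV, and F1 depend on the chosen threshold, whereas AUC and the Brier score are computed from the predicted probabilities directly, which is one reason abstracts such as this one report both kinds of measure.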
