Identification and validation of an explainable early-stage chronic kidney disease prediction model: a multicenter retrospective study

Abstract

BACKGROUND: Chronic kidney disease (CKD) is a significant global public health issue, affecting approximately 10% of adults. Because early-stage CKD lacks obvious symptoms, it is often not diagnosed in time, and the disease can progress gradually to end-stage renal disease (ESRD). This study applied machine learning (ML) methods to patient clinical data to develop an early CKD prediction model applicable to individuals without diabetes, hypertension, or coronary heart disease. The model aims to improve the accuracy of early CKD risk assessment, thereby delaying disease progression and improving patient outcomes.

METHODS: This retrospective multicenter study was conducted in China and included patients with CKD and healthy individuals who underwent physical examinations from February 2021 to April 2024. Six ML methods, including Decision Tree, Multilayer Perceptron, and XGBoost, were used to predict CKD from different combinations of features such as routine blood tests, urinalysis, and blood biochemistry. Multiple evaluation metrics, including AUC and F1 score, were used to compare prediction performance. The SHAP interpretability method was applied to assess feature importance and explain the final model's results.

FINDINGS: Data from three hospitals were used, with the dataset divided into training and internal validation sets (CKD: 11,436 cases; non-CKD: 10,004 cases) and an external validation set (CKD: 350 cases; non-CKD: 473 cases). Among the six ML models, XGBoost performed best. Among the feature combinations, "blood routine + urinalysis + basic information" yielded the best performance (AUC = 0.9235, external validation AUC = 0.8962). A web tool was also developed to facilitate the application of early CKD risk prediction in clinical practice.

INTERPRETATION: This study applied an interpretable ML model to predict early CKD effectively. Even with the relatively low-cost "blood routine + urinalysis + basic information" combination, the model demonstrated high prediction accuracy. The method has potential for clinical application and may help identify early CKD, reducing the risk of disease progression.

FUNDING: This research was supported by the National Key Research and Development Program of China (2023YFC3502903, 2022YFC3502302), the National Natural Science Foundation of China (82074580), and the Science and Technology Project of Jiangsu Provincial Research Institute of Chinese Medicine Schools (JSZYLP2024011).
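The two evaluation metrics named in METHODS, AUC and F1 score, can be illustrated with a minimal pure-Python sketch. This is not the study's code: the function names `roc_auc` and `f1_at_threshold`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the numbers bear no relation to the reported results.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_at_threshold(y_true: Sequence[int], scores: Sequence[float],
                    threshold: float = 0.5) -> float:
    """F1 score (harmonic mean of precision and recall) after
    turning scores into labels at an assumed threshold."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = CKD, 0 = non-CKD; scores are model outputs.
labels = [1, 1, 1, 0, 0, 0]
risk_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

print(f"AUC = {roc_auc(labels, risk_scores):.4f}")
print(f"F1  = {f1_at_threshold(labels, risk_scores):.4f}")
```

AUC is threshold-free and summarizes how well the scores rank cases, while F1 depends on the chosen cutoff, which is one reason for reporting both when comparing models.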
