Data-driven explainable chronic kidney disease detection using RF based data imputation and meta-ensemble learning

Abstract

Chronic kidney disease (CKD) is a progressive condition with significant public health impact, and early detection is critical for effective intervention. This study proposes a hybrid, data-driven framework that improves CKD prediction through robust preprocessing and optimized ensemble learning. The preprocessing stage combines Random Forest (RF)-based imputation of missing values, categorical feature encoding, and the synthetic minority oversampling technique (SMOTE) to address class imbalance. The prediction stage is a weighted ensemble of the top-performing classifiers (Decision Tree, Logistic Regression, and Gaussian Naïve Bayes), whose weights are optimized with the Grey Wolf Optimizer (GWO) to enhance predictive accuracy. Evaluated on the UCI CKD dataset, the framework outperforms individual classifiers and conventional ensemble methods, achieving an accuracy of 98.75%, precision of 98.8%, recall of 98.6%, and F1-score of 98.7%. Explainable AI (XAI) techniques, including SHAP and LIME, are used to analyze feature contributions, providing interpretable insights and confirming the clinical relevance of the predictions. Overall, the framework offers a transparent, reliable, and computationally efficient clinical decision support model that bridges the gap between data-driven AI and nephrology practice.
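
To make the GWO-weighted ensemble idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes each base classifier has already produced a probability of CKD per sample, that the ensemble is a weighted soft vote thresholded at 0.5, and that the GWO fitness is plain accuracy on a tuning set. The function names (`ensemble_accuracy`, `gwo_weights`), the hyperparameters, and the toy probabilities are all hypothetical and chosen only for illustration; the abstract does not specify any of them.

```python
import random


def ensemble_accuracy(weights, probas, labels):
    """Accuracy of weighted soft voting.

    probas[m][i] is model m's predicted probability of CKD for sample i.
    """
    total = sum(weights)
    if total <= 0:
        return 0.0  # degenerate weight vector: treat as worst case
    correct = 0
    for i, y in enumerate(labels):
        score = sum(w * p[i] for w, p in zip(weights, probas)) / total
        correct += int((score >= 0.5) == bool(y))
    return correct / len(labels)


def gwo_weights(probas, labels, n_wolves=10, n_iter=30, seed=0):
    """Search ensemble weights in [0, 1] with a basic Grey Wolf Optimizer."""
    rng = random.Random(seed)
    dim = len(probas)

    def fitness(w):
        return ensemble_accuracy(w, probas, labels)

    # One wolf starts at equal weights, so the result is never worse than
    # plain averaging on the tuning data; the rest start at random positions.
    wolves = [[1.0 / dim] * dim]
    wolves += [[rng.random() for _ in range(dim)] for _ in range(n_wolves - 1)]

    best, best_fit = list(wolves[0]), fitness(wolves[0])
    for t in range(n_iter + 1):
        # The three fittest wolves (alpha, beta, delta) lead the pack.
        ranked = sorted(wolves, key=fitness, reverse=True)
        alpha, beta, delta = (list(w) for w in ranked[:3])
        alpha_fit = fitness(alpha)
        if alpha_fit > best_fit:
            best, best_fit = alpha, alpha_fit
        if t == n_iter:
            break  # final pass only scores the last population
        a = 2.0 * (1.0 - t / n_iter)  # exploration factor, shrinks from 2 toward 0
        for w in wolves:
            for d in range(dim):
                pulled = 0.0
                for leader in (alpha, beta, delta):
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    pulled += leader[d] - A * abs(C * leader[d] - w[d])
                w[d] = min(1.0, max(0.0, pulled / 3.0))  # keep weights in [0, 1]

    total = sum(best)
    return [w / total for w in best]  # normalize so the weights sum to 1


# Toy tuning set (made-up values): one informative model and two noisy ones.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
probas = [
    [0.9, 0.8, 0.7, 0.9, 0.1, 0.2, 0.3, 0.1],  # stand-in for Decision Tree
    [0.6, 0.3, 0.3, 0.2, 0.4, 0.7, 0.2, 0.6],  # stand-in for Logistic Regression
    [0.4, 0.6, 0.2, 0.7, 0.6, 0.3, 0.8, 0.1],  # stand-in for Gaussian Naive Bayes
]
weights = gwo_weights(probas, labels)
print("weights:", [round(w, 3) for w in weights])
print("accuracy:", ensemble_accuracy(weights, probas, labels))
```

In a real pipeline the probabilities would come from classifiers fitted on the imputed, encoded, and SMOTE-balanced training data, and the weights would be tuned on held-out data rather than on the test set, so that the reported accuracy is not inflated by the weight search.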
