A Differential Evolution-Based Optimized Ensemble for Balanced and Imbalanced Medical Datasets

Abstract

BACKGROUND: Class imbalance is a frequent and severe problem in medical datasets, where instances of the minority class are usually the high-risk or disease-positive cases. Most traditional classifiers are biased towards the majority class, which results in a poor detection rate for the minority class and, in turn, reduced confidence in prediction systems used in medical applications. METHODS: In this paper, we present Optimized Ensemble by Differential Evolution (OEDE), a novel ensemble learning framework that addresses this problem. OEDE combines three heterogeneous base learners (Logistic Regression, Random Forest, and XGBoost) and trains each of them using class-balancing techniques. The framework then uses Differential Evolution (DE) to search for the ensemble weights that maximize the area under the ROC curve (AUC) on a validation dataset. RESULTS: We conducted experiments on four real-world medical datasets, with imbalance ratios ranging from 1.89 to 14.6, evaluating OEDE on the original data and on SMOTE- and ADASYN-balanced versions. The experimental results show a substantial performance gain for OEDE on the challenging Thoracic dataset, where it achieved an AUC of 70.08%, outperforming the standard Random Forest (50.82%) and AdaBoost (47.15%) baselines by over 19 percentage points. On the Cervical Cancer dataset, the model achieved a peak AUC of 97.89%. The results indicate that the proposed OEDE consistently outperforms or is competitive with traditional ensemble models in terms of AUC, F1-score, and Recall. ROC curve analysis also confirmed OEDE's superior discriminative capability. CONCLUSION: The proposed OEDE framework effectively improves minority class detection in imbalanced medical datasets. Its robust and flexible design makes it a promising tool for healthcare risk prediction tasks in which minority class cases must be identified reliably.
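The weight search described in METHODS can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices here are assumptions: the data is synthetic (`make_classification`), scikit-learn's `GradientBoostingClassifier` stands in for XGBoost, class balancing is approximated with `class_weight="balanced"` rather than SMOTE or ADASYN, and the helper names `normalize` and `negative_auc` are hypothetical. SciPy's `differential_evolution` minimizes its objective, so the sketch minimizes the negative validation AUC.

```python
import numpy as np
from scipy.optimize import differential_evolution
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic data with roughly 10% positives stands in for a medical dataset.
X, y = make_classification(n_samples=600, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Three dissimilar base learners; GradientBoostingClassifier is a stand-in for XGBoost.
models = [
    LogisticRegression(class_weight="balanced", max_iter=1000),
    RandomForestClassifier(n_estimators=50, class_weight="balanced", random_state=0),
    GradientBoostingClassifier(n_estimators=50, random_state=0),
]

# One column of positive-class probabilities per base learner, on the validation split.
val_probs = np.column_stack(
    [m.fit(X_train, y_train).predict_proba(X_val)[:, 1] for m in models])

def normalize(raw_weights):
    """Map a non-negative vector to weights that sum to 1 (uniform if all zero)."""
    w = np.asarray(raw_weights, dtype=float)
    total = w.sum()
    return w / total if total > 0 else np.full(len(w), 1.0 / len(w))

def negative_auc(raw_weights):
    """Objective for DE: negative AUC of the weighted average of probabilities."""
    return -roc_auc_score(y_val, val_probs @ normalize(raw_weights))

# Search each raw weight in [0, 1]; the weights are normalized inside the objective.
result = differential_evolution(negative_auc, bounds=[(0.0, 1.0)] * len(models),
                                seed=0, maxiter=30, popsize=10)

weights = normalize(result.x)
ensemble_auc = -result.fun
base_aucs = [roc_auc_score(y_val, val_probs[:, i]) for i in range(len(models))]

print("base learner AUCs:", np.round(base_aucs, 4))
print("ensemble weights:", np.round(weights, 3))
print("ensemble validation AUC:", round(ensemble_auc, 4))
```

Because the weights are tuned on the validation split, the validation AUC printed here is an optimistic estimate; in this sketch a separate held-out test split would be needed to report an unbiased figure.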
