Lifestyle data-based multiclass obesity prediction with interpretable ensemble models incorporating SHAP and LIME analysis



Abstract

Obesity is a major public health concern. Predicting obesity risk from lifestyle data can guide targeted interventions, but current models remain limited. This study first evaluates ensemble learning methods and then combines approaches to improve prediction accuracy and generalizability. Four ensemble techniques (boosting, bagging, stacking, and voting) were tested. Five boosting and five bagging models were constructed alongside voting and stacking models. Hyperparameter tuning optimized performance, and feature importance analysis guided potential feature elimination. In phase two, hybrid stacking and voting models integrated the best-performing boosting and bagging models to enhance predictive capability. Model robustness was assessed through k-fold cross-validation and statistical validation. SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) improved interpretability by analyzing feature contributions. Hybrid stacking and voting models outperformed other ensemble methods, with stacking achieving the best performance (accuracy: 96.88%, precision: 97.01%, and recall: 96.88%). Feature importance analysis identified key predictors, including sex, weight, food habits, and alcohol consumption. The results demonstrated that hybrid ensembles significantly improved obesity risk prediction while preserving interpretability. Integrating multiple ensemble and explainability techniques provides a reliable framework for obesity prediction, supporting clinical decisions and personalized healthcare strategies to mitigate obesity risk.
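To make the idea of per-feature contributions concrete, the following is a minimal sketch of the Shapley value computation that SHAP is built on. It is not the study's pipeline and does not use the SHAP library, which relies on faster approximations; it enumerates every feature subset by brute force, which is only practical for a handful of features. The function `toy_risk_score`, its coefficients, and the feature names, instance, and baseline values are all hypothetical and chosen purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance by enumerating all feature subsets.

    Features outside a subset are replaced by their baseline value, so each
    value says how much a feature moves the prediction away from the baseline.
    """
    n = len(instance)
    features = list(range(n))

    def value(subset):
        # Model output when only the features in `subset` take the instance's values.
        x = [instance[i] if i in subset else baseline[i] for i in features]
        return predict(x)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Classic Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk_score(x):
    """Hypothetical risk score with one interaction term (not the study's model)."""
    weight_kg, alcohol_freq, veg_servings = x
    return (0.02 * weight_kg + 0.1 * alcohol_freq
            - 0.05 * veg_servings + 0.001 * weight_kg * alcohol_freq)


# Assumed example values: [weight in kg, alcohol frequency, vegetable servings].
instance = [90, 2, 1]
baseline = [70, 0, 3]

phi = shapley_values(toy_risk_score, instance, baseline)
for name, contribution in zip(["weight", "alcohol", "vegetables"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the additivity property that gives SHAP its name. LIME takes a different route: it fits a simple local surrogate model around the instance rather than averaging over feature coalitions.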
