Explainable artificial intelligence approaches for predicting depression by combining feature selection methods and machine learning classifiers



Abstract

OBJECTIVE: Depression is a significant global health challenge, and its diagnosis and treatment are complex and multifaceted. This study explores whether combining multiple feature selection (FS) methods with an explainable artificial intelligence (XAI) method, SHapley Additive exPlanations (SHAP), can enhance predictive accuracy in depression classification models built on large-scale national survey data.

METHODS: Using microdata from the National Mental Health Survey of Korea (2021), which covers 5511 Korean adults, this research evaluates how different combinations of FS method and machine learning classifier affect model performance, and identifies nondiagnostic socioeconomic, psychological, and lifestyle factors associated with clinically diagnosed depression. We apply diverse FS methods (e.g., ReliefF, Markov Blanket, and Information Gain) and compare their performance across 12 machine learning classifiers.

RESULTS: The optimal FS method depends on the machine learning classifier architecture: ReliefF performs best with Stacking (F2-score = 0.9851), and Markov Blanket performs best with ExtraTrees and LightGBM (F2-scores = 0.9848 and 0.9838, respectively). After excluding core diagnostic criteria variables to avoid circularity, our analysis shows that social distress (loneliness), reluctance to seek professional help, quality of life measures, and physical health comorbidities emerge as highly influential nondiagnostic predictors.

CONCLUSION: Our findings advance the field by (1) systematically demonstrating that FS method effectiveness varies by machine learning classifier type, (2) providing a dual-layer XAI framework that combines FS with SHAP for comprehensive interpretability, and (3) identifying culturally specific risk factors in an underrepresented Asian population using high-quality data collected face to face. These contributions provide methodological guidance for researchers developing interpretable depression prediction models and offer clinically actionable insights for identifying at-risk individuals in Korean populations.
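The results above are reported as F2-scores. As a reminder of what that metric measures, the following is a minimal sketch of the standard F-beta definition with beta = 2, which weights recall more heavily than precision. This matters in screening settings where missing a true case is costlier than a false alarm. The function names and the toy labels are illustrative assumptions and are not taken from the study's code or data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, and false negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, fp, fn


def fbeta_score(y_true, y_pred, beta=2.0, positive=1):
    """Standard F-beta: (1 + beta^2) * P * R / (beta^2 * P + R).

    With beta = 2 this is the F2-score; beta = 1 gives the usual F1.
    Returns 0.0 when there are no true positives.
    """
    tp, fp, fn = confusion_counts(y_true, y_pred, positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy labels (hypothetical): 4 true cases, 3 detected, 3 false alarms.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 1, 0, 0, 0]

f1 = fbeta_score(y_true, y_pred, beta=1.0)
f2 = fbeta_score(y_true, y_pred, beta=2.0)
print(f"F1 = {f1:.4f}, F2 = {f2:.4f}")
```

In this toy case precision is 0.5 and recall is 0.75, so the F2-score sits closer to recall than the F1-score does. That is the behavior that makes F2 a natural choice when recall is the priority.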
