Machine learning and SHAP values explain the association between social determinants of health and post-stroke depression

Abstract

OBJECTIVE: To develop and validate a machine learning model that incorporates social determinants of health (SDoH) to assess post-stroke depression (PSD), and to examine the association between SDoH and disease outcomes.

METHODS: Data were obtained from the National Health and Nutrition Examination Survey. Logistic regression was used to analyse the association between SDoH and PSD, and Cox regression was used to assess the association between SDoH and all-cause mortality in PSD. The Boruta algorithm was used for feature selection, and four machine learning models (CatBoost, logistic regression, multilayer perceptron, and random forest) were constructed and compared on predictive performance, calibration, and clinical applicability. SHAP values were computed to quantify the predictive importance of each feature in the best-performing model.

RESULTS: Logistic regression analysis revealed a significant positive correlation between SDoH and PSD prevalence (p for trend < 0.0001). Of the four models, CatBoost (AUC = 0.966) showed the best overall predictive performance. Decision curve analysis (DCA) and the calibration curve indicated that the CatBoost model had considerable clinical utility and consistent predictive performance. Ten-fold cross-validation further confirmed the model's robustness and generalization ability.

CONCLUSIONS: A linear relationship exists between SDoH and PSD, and CatBoost performed best in predicting PSD. SHAP values emphasize the importance of SDoH.
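
To make the SHAP step concrete, the following is a minimal sketch of how Shapley values split a single prediction among its input features. It is not the study's pipeline. The scoring function `toy_score`, its coefficients, the feature names, and the baseline values are all hypothetical and chosen only for illustration. The sketch enumerates every feature coalition by brute force, which is only feasible for a handful of features; the SHAP library relies on faster algorithms for tree-based models such as CatBoost.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size that excludes feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_score(features):
    """Hypothetical risk score with one interaction term (not from the study)."""
    low_income, food_insecure, age_decades = features
    return (0.5 * low_income
            + 0.3 * food_insecure
            + 0.1 * age_decades
            + 0.4 * low_income * food_insecure)


x = [1, 1, 7]          # hypothetical individual being explained
baseline = [0, 0, 6]   # hypothetical reference individual
phi = shapley_values(toy_score, x, baseline)

for name, value in zip(["low_income", "food_insecure", "age_decades"], phi):
    print(f"{name}: {value:.3f}")
```

The three contributions sum to the difference between the score at `x` and the score at `baseline`, and the interaction term is shared equally between the two features involved in it. This additivity is what allows per-feature SHAP values to be ranked and compared within one model.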
