A novel machine learning framework for stroke type identification in resource constrained settings with robustness to missing data

Abstract

Stroke is the third leading cause of disability and mortality worldwide. Accurate identification of stroke type (ischemic or hemorrhagic) is critical for guiding treatment; however, it typically requires costly neuroimaging, which is often inaccessible in rural areas of developing countries. This study uses machine learning (ML) to identify stroke type from clinical data alone, with the aim of developing a cost-effective method for resource-limited settings that lack neuroimaging facilities. A dataset of 2,190 stroke patients with 79 clinical attributes was collected in-house and used to develop the proposed ML framework. The framework handles missing data through Multiple Imputation by Chained Equations (MICE), so it continues to function when some laboratory test results are unavailable. The study also addresses target leakage through statistical tests and uses SHAP analysis to identify the attributes most important for classification. The proposed framework achieves 82.42% weighted accuracy, 82.33% accuracy, 82.19% sensitivity, 82.65% specificity, and an 86.68% F1-score. Notably, with only the 19 most significant attributes identified via SHAP, the framework maintains a weighted accuracy of 82.20%. Prospective validation on an independent dataset demonstrates a 16.42% improvement over the best-performing clinical score, Siriraj. The proposed ML framework may help reduce the time to treatment for patients in resource-limited settings by enabling prompt primary care and timely referral to stroke-ready facilities.
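
The abstract does not define "weighted accuracy". The reported figures are consistent with it being the mean of sensitivity and specificity (often called balanced accuracy), but that reading is an assumption, not something the abstract states. The sketch below shows how the reported metrics are conventionally computed from binary confusion counts. The toy labels, the choice of which stroke type counts as the positive class, and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # positive cases predicted positive
    fn: int  # positive cases predicted negative
    tn: int  # negative cases predicted negative
    fp: int  # negative cases predicted positive


def confusion_counts(y_true, y_pred, positive=1):
    """Tally binary confusion counts; `positive` marks the positive class label."""
    c = ConfusionCounts(0, 0, 0, 0)
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                c.tp += 1
            else:
                c.fn += 1
        else:
            if p == positive:
                c.fp += 1
            else:
                c.tn += 1
    return c


def sensitivity(c):
    return c.tp / (c.tp + c.fn)


def specificity(c):
    return c.tn / (c.tn + c.fp)


def accuracy(c):
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def balanced_accuracy(c):
    # Assumed reading of "weighted accuracy": mean of the per-class recalls.
    return (sensitivity(c) + specificity(c)) / 2


def f1_score(c):
    return 2 * c.tp / (2 * c.tp + c.fp + c.fn)


# Hypothetical toy labels, not data from the study (1 = positive class).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)

# Consistency check against the figures reported in the abstract.
reported_mean = round((82.19 + 82.65) / 2, 2)
print(reported_mean)  # 82.42, matching the reported weighted accuracy
```

Unlike plain accuracy, the mean of sensitivity and specificity does not reward a classifier for favoring the more common stroke type, which may be why the abstract reports both figures.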