Early Prediction of Adverse Stroke Outcomes Using Nonclinical Factors and Missing Data: A Machine Learning Study

Abstract

INTRODUCTION: Early prediction of stroke outcomes using prognostic tools may help clinical decision-making and inform resource allocation. However, the clinical information required by prediction tools is often missing. We evaluated the performance of machine learning (ML) models that predict adverse stroke outcome at 90 days post-admission and that exploit nonclinical data and missingness alongside traditional clinical and demographic predictors.

METHODS: We used routine hospital data from UK clinical sites (NHS SafeHaven) to train three gradient-boosted models. We compared baseline clinical features with nonclinical features and missingness for predicting a composite 90-day adverse stroke outcome: mortality, stroke recurrence, or new care-home discharge. Model validation used 10% of the data. Model performance was evaluated by accuracy (correct predictions/total predictions) and area under the receiver operating characteristic curve (AUC), and DeLong's test was used to compare the performance of the three models. We used the Brier score to evaluate model calibration. SHapley Additive exPlanations (SHAP) analyses determined the contribution of each model feature to the prediction of adverse stroke outcome.

RESULTS: The final sample included 3,530 stroke patients, 51% of whom were male (mean age = 72 years; SD = 14). Clinical data were incomplete, with five clinical features having >63% missing values. The performance of the three models was not significantly different (p = 0.5-0.9). The model with nonclinical and missingness features demonstrated 71% accuracy and an AUC of 0.76, with a Brier score of 0.19. Nonclinical factors, such as time to clinical assessment and time to admission, were among the five most important predictors of adverse stroke outcome (mean |SHAP| = 0.03 and 0.05), alongside Glasgow Coma Scale (0.08), age (0.03), and temperature (0.02). Missing clinical values (pulse and LDL) predicted adverse stroke outcome (mean |SHAP| = 0.02 and 0.02) and were correlated with age (ρ = 0.2), arrival by ambulance (ρ = 0.3), length of stay (ρ = -0.3), and transient ischaemic attack (ρ = 0.3).

CONCLUSION: We demonstrate that nonclinical factors and missingness of data can assist in early prediction of 90-day adverse stroke outcomes. As these factors are often well documented in electronic health systems, they could complement or supplement traditional clinical predictive factors.
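The idea of using missingness as a feature, and the evaluation metrics named in the abstract (accuracy, AUC, Brier score), can be illustrated with a minimal sketch. This is not the study's code: the function names, the 0.5 classification threshold, and the toy values are assumptions chosen for illustration, and the AUC is computed with the simple pairwise (Mann-Whitney) definition rather than any particular library routine.

```python
from typing import List, Optional, Sequence


def missingness_indicators(values: Sequence[Optional[float]]) -> List[int]:
    """Return 1 where a measurement is missing (None) and 0 where it is present."""
    return [1 if v is None else 0 for v in values]


def accuracy(y_true: Sequence[int], y_prob: Sequence[float],
             threshold: float = 0.5) -> float:
    """Correct predictions / total predictions (threshold is an assumption)."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(1 for pred, y in zip(preds, y_true) if pred == y)
    return correct / len(y_true)


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(y_prob, y_true)) / len(y_true)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive case scores above a random negative
    case, counting ties as one half (pairwise definition of ROC AUC)."""
    pos = [p for p, y in zip(y_prob, y_true) if y == 1]
    neg = [p for p, y in zip(y_prob, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: pulse readings with gaps, outcomes, and model scores.
    pulse = [72.0, None, 88.0, None]
    outcomes = [1, 0, 1, 0]
    scores = [0.9, 0.2, 0.4, 0.6]

    print("pulse_missing:", missingness_indicators(pulse))
    print("accuracy:", accuracy(outcomes, scores))
    print("AUC:", auc(outcomes, scores))
    print("Brier score:", round(brier_score(outcomes, scores), 4))
```

In a real pipeline the indicator column would be added to the feature matrix alongside the (possibly imputed) clinical value, so the model can learn from the fact that a measurement was not recorded. A lower Brier score indicates better-calibrated probabilities, while AUC measures ranking quality independently of any threshold.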
