Prediction of recurrent ischemic stroke using machine learning from real-world data



Abstract

BACKGROUND: Recurrent ischemic stroke (RIS) is a significant challenge in Malaysia, affecting approximately 33% of patients. However, studies that use artificial intelligence (AI) to predict RIS from real-world data remain very limited. This study aimed to develop and evaluate Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and RUSBoost models for predicting RIS using real-world data from the Malaysian National Neurology Registry.

METHODS: We conducted a retrospective study of 7,697 patients registered in the National Neurology Registry in Malaysia (2009-2016). We developed and evaluated several machine learning models, including SVM, KNN, and RUSBoost, to predict RIS. The Synthetic Minority Over-Sampling Technique (SMOTE) was applied to the training data to handle class imbalance. Ten-fold cross-validation was used to assess the robustness and accuracy of the models, and performance was evaluated by accuracy, sensitivity, specificity, positive predictive value (PPV), and area under the ROC curve (AUROC).

RESULTS: Among the evaluated models, RUSBoost demonstrated the strongest and most clinically relevant performance on the validation (test) folds under stratified ten-fold cross-validation, achieving an AUROC of 0.943, a sensitivity of 86.5%, and a PPV of 40.2% on the original imbalanced dataset, which represents a favourable balance between sensitivity and PPV. The application of SMOTE during training improved model discrimination for RUSBoost (training-fold AUROC = 0.986). SHAP analysis showed that age, race, glucose level, hypertension, hyperlipidemia, and duration of diabetes were the most significant factors linked to an increased risk of RIS.

CONCLUSION: This study demonstrates that applying machine learning models to real-world clinical data is a promising approach for predicting the risk of ischemic stroke recurrence. RUSBoost emerged as the most reliable and generalisable model for clinical risk prediction and proved effective in improving prediction accuracy and identifying patients at highest risk, while SMOTE enhanced model learning during training. The findings highlight the importance of integrating AI technologies into clinical practice to support early treatment decisions and enhance preventive interventions, opening new pathways for better patient care and reducing the health burden of recurrent stroke.
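
The evaluation criteria named in the abstract (accuracy, sensitivity, specificity, PPV, and AUROC) can be illustrated with a minimal plain-Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the ten toy labels and scores are all assumptions made up for illustration and have no connection to the registry data. The sketch mainly shows how the metrics are defined for a rare positive class such as stroke recurrence.

```python
# Toy illustration only: labels and scores are invented, not registry data.
# 1 = recurrent stroke (minority class), 0 = no recurrence.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]


def confusion_counts(labels, predictions):
    """Return (tp, fp, tn, fn) for binary labels and predictions."""
    tp = sum(1 for t, p in zip(labels, predictions) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(labels, predictions) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(labels, predictions) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(labels, predictions) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(labels, predictions):
    """Accuracy, sensitivity, specificity and PPV from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(labels, predictions)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # share of true recurrences detected
        "specificity": tn / (tn + fp),  # share of non-recurrences cleared
        "ppv": tp / (tp + fp),          # share of alarms that are correct
    }


def auroc(labels, model_scores):
    """AUROC as the probability that a random positive outranks a random
    negative, counting ties as one half (the Mann-Whitney formulation)."""
    pos = [s for t, s in zip(labels, model_scores) if t == 1]
    neg = [s for t, s in zip(labels, model_scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Assumed decision threshold of 0.5 (hypothetical; the abstract gives none).
y_pred = [1 if s >= 0.5 else 0 for s in scores]

m = classification_metrics(y_true, y_pred)
auc = auroc(y_true, scores)

for name, value in m.items():
    print(f"{name}: {value:.3f}")
print(f"auroc: {auc:.3f}")
```

Because PPV divides by all positive predictions, a handful of false alarms among a large majority class can pull it down even when sensitivity and AUROC are high, which is one reason these metrics are usually reported together on imbalanced data.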
