Construction and comparison of short-term prognosis prediction model based on machine learning in acute ischemic stroke

基于机器学习的急性缺血性卒中短期预后预测模型的构建与比较

阅读:1

Abstract

OBJECTIVE: To construct and compared the short-term prognosis prediction models of acute ischemic stroke (AIS) by machine learning (ML). METHODS: Retrospectively study. The group W (mRS≤3) was clustered, and combined with group P (mRS>3) to form the post-clustering dataset for modeling. The "glmnet", "rpart", "xgboost", "randomForest", "neuralnet" packages were used to construct ML models. The accuracy, sensitivity, specificity, positive predict value (PPV), negative predict value (NPV) among the models were compared. Four external clinical datasets were used for external clinical validation. The optimal prediction model was determined by variable screening ability, model visualization, and external clinical validation performance. RESULTS: The post-clustering dataset contains 139 patients (group W) and 122 patients (group P). The neutrophil multiplied by D-dimer (NDM) has predictive value in all ML prediction models in this study. In the decision tree model, NDM(Q) occupies the first tree node, When NDM≤5.62 and the age<74.5, the probability of poor prognosis of AIS is less than 20 %. When NDM>5.62 and accompanied by pneumonia, the incidence of poor prognosis of AIS is about 90 %. In the Random Forest (RF) model, NDM(Q) had the highest Gini index. The variable combination screened by the RF model had the best performance in the neural network, and the accuracy, sensitivity, specificity, PPV, and NPV of the external validation were 0.800, 0.774, 0.833, 0.857, and 0.741, respectively. The RF model had the best performance in the external clinical validation datasets, with accuracies of 0.646, 0.697, 0.695, and 0.713, respectively. CONCLUSIONS: NDM shows predictive value for AIS short-term prognosis in all ML models in this study. The optimal model in screening characteristic variables and the performance of in external clinical datasets was RF model. In the analysis of medical data with small sample size and outcome as categorical variables, RF could be used as the main algorithm to build a model.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。