Comparison of machine learning models for predicting stroke risk in hypertensive patients: Lasso regression model, random forest model, Boruta algorithm model, and Boruta algorithm combined with Lasso regression model

Abstract

This study compared the performance of 4 machine learning models (Lasso regression, random forest, the Boruta algorithm, and the Boruta algorithm combined with Lasso regression) in predicting stroke risk among hypertensive patients, and evaluated the strengths and weaknesses of each to identify a more clinically valuable prediction model. The study included 3472 hypertensive patients, of whom 312 had experienced a stroke and 3160 had not. Various health indicators were analyzed with each of the 4 models. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), the precision-recall curve, the calibration curve, and decision curve analysis, to assess classification ability, precision, calibration, and clinical benefit. The Lasso regression and Boruta algorithm models both achieved an AUC of 0.716, the highest classification ability among the models. The Boruta algorithm combined with Lasso regression achieved an AUC of 0.705, slightly lower than those 2 models, but still showed good predictive capability and offered better interpretability due to feature selection. The random forest model had the lowest AUC, 0.626, indicating weaker classification performance than the others. Overall, the Lasso regression and Boruta algorithm models performed similarly in classification ability, both demonstrating moderate predictive power, whereas the random forest model performed relatively poorly. The Boruta combined with Lasso regression model was precise in variable selection but had limited clinical utility. The Lasso regression model therefore appears to be the most balanced for predicting stroke risk and is the model recommended by this study.
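Two of the evaluation measures named above, the AUC and the net benefit used in decision curve analysis, can be illustrated with a minimal sketch. The Python below is not the authors' code: the function names, the toy labels, and the toy predicted risks are all hypothetical, and only the cohort counts (312 strokes among 3472 patients) come from the abstract. It assumes the standard definitions, that is, AUC as the probability that a randomly chosen positive case is scored above a randomly chosen negative case, and net benefit at threshold probability pt as TP/n - (FP/n) * pt / (1 - pt).

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive case has the higher score; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], scores: Sequence[float],
                threshold: float) -> float:
    """Net benefit of treating everyone whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def treat_all_net_benefit(prevalence: float, threshold: float) -> float:
    """Net benefit of the 'treat everyone' reference strategy."""
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Cohort counts taken from the abstract.
N_STROKE, N_TOTAL = 312, 3472
prevalence = N_STROKE / N_TOTAL  # roughly 9% of patients had a stroke

# Hypothetical toy data (NOT study data): true labels and predicted risks.
y_toy = [0, 0, 1, 0, 1, 1, 0, 0]
p_toy = [0.05, 0.10, 0.35, 0.40, 0.60, 0.80, 0.20, 0.15]

if __name__ == "__main__":
    print(f"stroke prevalence in cohort: {prevalence:.4f}")
    print(f"toy AUC: {roc_auc(y_toy, p_toy):.3f}")
    print(f"toy net benefit at pt=0.30: {net_benefit(y_toy, p_toy, 0.30):.3f}")
    print(f"treat-all net benefit at pt=0.10: "
          f"{treat_all_net_benefit(prevalence, 0.10):.4f}")
```

Because only about 9% of the cohort had a stroke, the prevalence is also the precision a random classifier would attain on the precision-recall curve, which is why that curve is a useful complement to the AUC for data this imbalanced. The pairwise AUC loop is quadratic in the number of cases and is meant for illustration only; a rank-based implementation would be preferable for a cohort of this size.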
