Developing an Interpretable Machine Learning Model for Early Prediction of Cardiovascular Involvement in Systemic Lupus Erythematosus

Abstract

BACKGROUND: Cardiovascular disease is a leading cause of death in systemic lupus erythematosus (SLE). Early prediction of cardiac involvement is critical for improving patient outcomes. This study aimed to identify key factors associated with cardiac involvement in SLE and to develop an interpretable machine learning (ML) model for risk prediction. METHODS: We conducted a retrospective analysis of 1,023 SLE patients hospitalized in Shenzhen People's Hospital between January 2000 and December 2021. The median age at hospitalization was 31 years (IQR: 25-39 years), 92.1% of patients were female, and 18.77% developed cardiovascular involvement during a median follow-up of 3,737 days (IQR: 1,920-5,246 days). The most predictive features were selected by taking the intersection of the features chosen by three feature selection techniques: Random Forest, LASSO, and XGBoost. Models were trained on 70% of the dataset and tested on the remaining 30%. Among seven evaluated algorithms, the Gradient Boosting Machine (GBM) demonstrated the best performance on the test set. Model interpretability was assessed using the DALEX package, which generated feature importance plots and instance-level breakdown profiles to visualize the model's decision-making logic. RESULTS: Over a median follow-up of 3,737 days, 192 (18.77%) patients developed cardiac involvement. Seven key predictors (arthritis, hypertension, HDL-C, LDL-C, total cholesterol, CRP, and ESR) were identified from 51 clinical and biological variables recorded at admission. The GBM model performed best among the seven models (AUC: 0.748, accuracy: 0.779, precision: 0.605, recall: 0.338, F1 score: 0.433). CONCLUSION: This study is the first to develop an interpretable ML model to predict the risk of cardiac involvement in SLE. The GBM model showed the best performance among the models evaluated, and its interpretability allows clinicians to visualize the decision-making process, facilitating early identification of high-risk patients.
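The reported test-set metrics can be cross-checked against one another, since the F1 score is the harmonic mean of precision and recall. The following is a minimal sketch of that check, not the authors' code: the helper name `f1_from_precision_recall` is hypothetical, and the inputs are the rounded values quoted in the abstract, so small rounding differences are expected.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1 score, i.e. the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values reported in the abstract for the GBM model on the test set.
precision, recall, reported_f1 = 0.605, 0.338, 0.433

f1 = f1_from_precision_recall(precision, recall)
print(f"F1 recomputed from rounded precision and recall: {f1:.4f}")

# Event rate: 192 of 1,023 patients developed cardiac involvement.
event_rate = 192 / 1023
print(f"Event rate: {event_rate:.2%}")
```

The recomputed F1 agrees with the reported 0.433 to within rounding of the inputs, and the event rate matches the reported 18.77%. The recall of 0.338 means that, at the decision threshold used, the model flagged roughly one third of the patients who went on to develop cardiac involvement.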
