An interpretable LightGBM model for predicting coronary heart disease: Enhancing clinical decision-making with machine learning

Abstract

BACKGROUND: Coronary heart disease (CHD) is among the leading contributors to the global burden of cardiovascular disease. Traditional diagnostic methods, such as coronary angiography and electrocardiography, face challenges including high cost, subjectivity, and high misdiagnosis rates. To address these issues, this study proposes a CHD prediction framework based on the LightGBM algorithm, aiming to improve both the accuracy and the interpretability of CHD risk prediction.

METHODS: This study used three publicly available datasets: BRFSS_2015, Framingham, and Z-Alizadeh Sani. The BRFSS_2015 dataset was used for model training, and the Framingham and Z-Alizadeh Sani datasets were used for validation. Data preprocessing included cleaning, feature engineering, and handling of missing values. LightGBM was selected for its efficiency and performance, and SHAP (SHapley Additive exPlanations) values were used to make the model's predictions interpretable. Model performance was evaluated using accuracy, precision, recall, F1-score, and AUROC. A CHD scoring system was developed from the model's predictions to assist clinicians in risk assessment.

RESULTS: On the BRFSS_2015 dataset, the LightGBM model achieved an accuracy of 90.60% and an AUROC of 81.06%. After parameter tuning, accuracy rose to 90.61% and AUROC to 81.11%. On the Framingham dataset, accuracy improved from 83.96% to 85.26% and AUROC from 62.86% to 67.37%. On the Z-Alizadeh Sani dataset, accuracy improved from 78.69% to 80.33% and precision from 74.40% to 76.36%.

CONCLUSIONS: SHAP analysis revealed that age, smoking status, diabetes, hypertension, and high cholesterol were the most influential features in predicting CHD risk. The CHD scoring system gives clinicians a user-friendly tool for assessing patient risk levels.
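The evaluation metrics named in the abstract can be illustrated with a minimal, standard-library Python sketch. This is not the authors' evaluation code: the function names (`classification_metrics`, `auroc`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the study. AUROC is computed here by its rank interpretation, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 at a fixed threshold (assumed 0.5)."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    # A correctly ordered pair scores 1, a tie 0.5, a misordered pair 0.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data invented for illustration: 8 negatives, 2 positives.
    labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
    scores = [0.1, 0.2, 0.1, 0.3, 0.2, 0.4, 0.6, 0.3, 0.7, 0.4]
    print(classification_metrics(labels, scores))
    print("AUROC:", auroc(labels, scores))
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUROC depends only on how the scores rank the cases, which is why the two kinds of metric are reported side by side.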
