Real-world risk stratification for coronary heart disease: a one-year prediction model using health information exchange data

Abstract

BACKGROUND: Coronary heart disease (CHD), the most common form of heart disease, progresses over years before culminating in serious cardiac events. Early prediction and intervention are critical to reducing CHD-related morbidity, mortality, and healthcare burden.

OBJECTIVE: To develop and validate a machine learning model using statewide electronic health records (EHRs) to predict 1-year risk of CHD in the general population of Maine, enabling targeted preventive strategies.

METHODS: Two population-based cohorts were constructed from the Maine Health Information Exchange (HIE): a retrospective cohort for model training and calibration (2015–2017, N = 1,042,124), and a prospective cohort for external validation (2016–2018, N = 1,040,158). EHR features included demographics, diagnoses, procedures, medications, labs, and utilization metrics. A multistage modeling pipeline (statistical filtering, XGBoost-based feature selection, risk prediction, and isotonic regression calibration) was used to construct the final model. Validation included discrimination, calibration, and survival analysis.

RESULTS: The final XGBoost model achieved strong discrimination: AUC = 0.952 (95% CI: 0.950–0.954) in the retrospective cohort and 0.888 (95% CI: 0.885–0.890) in the prospective cohort. Based on calibrated risk probabilities, the population was stratified into five risk categories: very low (92.30%, N = 960,021), low (6.79%, N = 70,676), medium (0.85%, N = 8,888), high (0.05%, N = 554), and very high (0.002%, N = 19). Among the very high-risk group, 11 individuals (57.89%) developed CHD within one year.

CONCLUSIONS: This statewide, HIE-based CHD risk prediction model demonstrates robust performance and real-world applicability. It enables early identification of high-risk individuals and supports population-scale precision prevention through evidence-informed, proactive care.
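The abstract names isotonic regression as the calibration stage but does not describe how it was implemented. As a minimal sketch only, the block below fits a non-decreasing step function with the pool-adjacent-violators algorithm in plain Python. The function names `fit_isotonic` and `calibrate`, the toy scores and labels, and the step-function lookup are assumptions made for illustration; they are not the authors' code or data.

```python
from bisect import bisect_left
from itertools import groupby


def fit_isotonic(scores, labels):
    """Fit a non-decreasing step function from raw scores to event rates.

    Uses pool-adjacent-violators: neighbouring blocks are merged until the
    block means are monotone in the score. Returns (uppers, rates), where
    rates[i] applies to scores up to and including uppers[i].
    """
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block: [label_sum, count, upper_score]
    # Tied scores are pooled first so one score maps to one fitted value.
    for score, group in groupby(pairs, key=lambda p: p[0]):
        group_labels = [label for _, label in group]
        blocks.append([float(sum(group_labels)), len(group_labels), score])
        # Merge backwards while the previous block's mean is not lower.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] >= blocks[-1][0] / blocks[-1][1]
        ):
            label_sum, count, upper = blocks.pop()
            blocks[-1][0] += label_sum
            blocks[-1][1] += count
            blocks[-1][2] = upper
    uppers = [block[2] for block in blocks]
    rates = [block[0] / block[1] for block in blocks]
    return uppers, rates


def calibrate(score, uppers, rates):
    """Map one raw score to its calibrated probability (step lookup)."""
    idx = bisect_left(uppers, score)
    return rates[min(idx, len(rates) - 1)]


# Toy data (hypothetical): raw model scores and observed 1-year outcomes.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
toy_labels = [0, 0, 1, 0, 1, 1]
uppers, rates = fit_isotonic(toy_scores, toy_labels)
print(uppers, rates)  # [0.2, 0.4, 0.6] [0.0, 0.5, 1.0]
```

Library implementations such as scikit-learn's `IsotonicRegression` interpolate between fitted points rather than using a pure step lookup, so their outputs can differ from this sketch between the knots.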
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12889-025-24266-y.
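The tier percentages reported in RESULTS can be rechecked from the counts given there. The short sketch below recomputes each tier's share of the prospective cohort and the observed 1-year CHD rate in the very high-risk tier; the variable and function names are illustrative, and all numbers are taken from the abstract.

```python
# Tier sizes as reported in the abstract for the prospective cohort.
tier_counts = {
    "very low": 960_021,
    "low": 70_676,
    "medium": 8_888,
    "high": 554,
    "very high": 19,
}
cohort_size = 1_040_158  # prospective cohort N from the abstract


def tier_shares(counts, total):
    """Return each tier's share of the cohort as a percentage."""
    return {tier: 100.0 * n / total for tier, n in counts.items()}


shares = tier_shares(tier_counts, cohort_size)

# Observed 1-year CHD rate in the very high tier: 11 of 19 individuals.
very_high_event_rate = 100.0 * 11 / tier_counts["very high"]

for tier, pct in shares.items():
    print(f"{tier:>9}: {pct:.3f}%")
print(f"very high tier event rate: {very_high_event_rate:.2f}%")
```

The five tier counts sum to the cohort size, and the recomputed shares round to the percentages quoted in the abstract.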
