Predicting the frequent exacerbator phenotype in COPD: development and validation of a multicenter real-world prediction model

Abstract

BACKGROUND: The frequent exacerbator phenotype (FEP) of chronic obstructive pulmonary disease (COPD) impairs quality of life and increases both healthcare burden and mortality. This study aims to develop an interpretable machine learning model for the early prediction of FEP to improve patient prognosis.

METHODS: Retrospective data were collected from the electronic health records (EHRs) of two hospitals for three independent cohorts of patients hospitalized for the first time due to acute exacerbation of COPD (AECOPD). Patients were categorized into frequent exacerbation and nonfrequent exacerbation groups according to whether they experienced two or more exacerbations requiring hospitalization during a 12-month follow-up period. Feature variables were selected via univariate regression combined with the Boruta algorithm. Nine machine learning models were developed and validated via 5-fold cross-validation. The optimal prediction model was selected by weighing performance on the test set, performance on two independent external datasets, and clinical requirements. Global and local interpretability of the model was provided by SHapley Additive exPlanations (SHAP). Restricted cubic splines (RCS) were used to analyze the dose-response relationships between continuous variables and FEP. Finally, the model was deployed on the Shiny platform.

RESULTS: This study included a development cohort of 1,310 patients and two external validation cohorts of 418 and 200 patients. The datasets comprised 64 variables, including demographic information, blood indices, and comorbidities. After feature screening, 14 key variables were retained to construct the machine learning models. In model performance comparisons, the stacking ensemble model demonstrated superior predictive efficacy, generalization ability, and control of the missed-diagnosis rate. SHAP value analysis ranked the contributions of the 14 key variables to the prediction of FEP. RCS analysis further revealed dose-response relationships between nine key continuous variables and FEP. Finally, the research team developed a web-based interactive prediction tool (https://aipd.shinyapps.io/FEPCOPD/).

CONCLUSION: This study developed a robust stacking ensemble prediction model for FEP in COPD patients, leveraging multidimensional, clinically accessible data. With the interactive prediction tool deployed on the Shiny platform, primary care providers can conveniently use the model to identify patients with this high-risk phenotype early.

CLINICAL TRIAL NUMBER: MR-43-23-040012.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12911-025-03281-4.
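
The grouping rule stated in METHODS (two or more exacerbations requiring hospitalization during a 12-month follow-up period) can be illustrated with a minimal Python sketch. This is not the authors' code. The `Patient` record, its field names, the 365-day window, and the choice to start follow-up at discharge from the index admission (and therefore not to count the index admission itself) are all assumptions made for illustration; the abstract does not specify these details.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

FOLLOW_UP_DAYS = 365    # assumption: "12-month follow-up" approximated as 365 days
FREQUENT_THRESHOLD = 2  # from the abstract: two or more hospitalized exacerbations


@dataclass
class Patient:
    """Hypothetical record layout; field names are illustrative only."""
    patient_id: str
    index_discharge: date      # assumed start of follow-up
    readmissions: List[date]   # dates of later AECOPD hospitalizations


def is_frequent_exacerbator(p: Patient) -> bool:
    """Return True if the patient meets the frequent-exacerbation rule."""
    window_end = p.index_discharge + timedelta(days=FOLLOW_UP_DAYS)
    # Count only hospitalizations strictly after the index discharge
    # and no later than the end of the follow-up window.
    n_events = sum(1 for d in p.readmissions if p.index_discharge < d <= window_end)
    return n_events >= FREQUENT_THRESHOLD


patients = [
    Patient("A", date(2021, 1, 10), [date(2021, 4, 2), date(2021, 11, 20)]),
    Patient("B", date(2021, 3, 5), [date(2021, 9, 1), date(2022, 6, 15)]),
    Patient("C", date(2021, 6, 1), []),
]
labels = {p.patient_id: is_frequent_exacerbator(p) for p in patients}
print(labels)
```

In this toy data, patient A has two hospitalizations inside the window and is labeled a frequent exacerbator, whereas patient B's second hospitalization falls outside the window and patient C has none. A binary label of this kind would serve as the outcome variable for the downstream models.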
