Abstract
BACKGROUND: In-hospital cardiac arrest (IHCA) is a major adverse event with a high risk of death. Machine learning (ML) models for predicting prognosis in cardiac arrest (CA) patients have been established, but obstacles to their clinical application remain. This study developed an ensemble learning (EL) model based on clinical information to predict the death risk of IHCA patients. METHODS AND RESULTS: This retrospective cohort study used data from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database and the eICU Collaborative Research Database. Patients (age ≥ 18 years) with CA identified by ICD-9/10 codes were included. Eight candidate ML models were selected for a soft voting ensemble. Features were sequentially eliminated based on feature importance scores to reduce input complexity without compromising model performance. The final model was externally validated with the MIMIC-IV database and deployed as a web application. Overall, 4,068 patients were included. In the internal validation cohort, the EL model outperformed the single ML models, with an accuracy of 0.842, precision of 0.830, recall of 0.839, F1 score of 0.835, and AUC of 0.898, and showed better calibration across the spectrum of survival probabilities. Furthermore, the prediction performance of the EL model showed no obvious decline when only the top seven features (HCO₃⁻, Glasgow Coma Scale, white blood cell count, international normalized ratio, hematocrit, body temperature, and blood urea nitrogen) were retained. In external validation, performance decreased slightly but remained acceptable for deployment as a clinically feasible web application. CONCLUSION: The EL model outperformed single ML models in predicting the death risk of IHCA patients. The seven key features identified enabled the parsimonious EL model to reliably estimate the death risk.
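As an illustration of the soft voting step named in the abstract, the sketch below averages the death probabilities predicted by several models for each patient and thresholds the mean. It is a minimal sketch, not the study's implementation: the function name `soft_vote`, the equal weighting, the 0.5 threshold, and the example probabilities are all assumptions made for illustration, and the eight candidate models themselves are not reproduced.

```python
from typing import List, Sequence, Tuple


def soft_vote(
    model_probs: Sequence[Sequence[float]], threshold: float = 0.5
) -> Tuple[List[float], List[int]]:
    """Equal-weight soft voting for a binary outcome (hypothetical helper).

    model_probs[m][i] is model m's predicted probability of death for
    patient i. Returns the per-patient mean probability and the label
    obtained by thresholding that mean.
    """
    if not model_probs:
        raise ValueError("need at least one model")
    n_patients = len(model_probs[0])
    if any(len(p) != n_patients for p in model_probs):
        raise ValueError("all models must score the same patients")

    n_models = len(model_probs)
    # Average the predicted probabilities across models, patient by patient.
    mean_probs = [
        sum(p[i] for p in model_probs) / n_models for i in range(n_patients)
    ]
    # Assumed decision rule: predict death when the mean reaches the threshold.
    labels = [int(m >= threshold) for m in mean_probs]
    return mean_probs, labels


# Made-up probabilities from three models for four patients (illustration only).
probs = [
    [0.9, 0.2, 0.6, 0.4],
    [0.8, 0.1, 0.5, 0.5],
    [0.7, 0.3, 0.7, 0.3],
]
mean_probs, labels = soft_vote(probs)
```

Averaging probabilities, rather than counting class votes, lets a confident model outweigh several uncertain ones, which is one plausible reason an ensemble can be better calibrated than its individual members.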