Abstract
Hospital readmissions prolong patient suffering and increase healthcare expenditures. Although several studies have developed prediction models aimed at reducing readmissions, most have achieved only modest predictive accuracy. To improve upon prior approaches, we conducted an overview of systematic reviews to identify the most relevant predictor variables and then developed machine learning models in a retrospective, multisite study across eight hospitals. The patient sample comprised 200,799 inpatient stays from eligible hospitalizations, based on the Centers for Medicare and Medicaid Services (CMS) definition of unplanned readmissions within 30 days of discharge. We constructed random forest models and evaluated out-of-sample performance using the area under the receiver operating characteristic curve (AUC) across different train-test splits. The hospital-wide sample was divided into medical and surgical cohorts to investigate how predictor importance differs between patient populations. The average AUC was 0.78 ± 0.01 (mean ± standard deviation [SD]). Patients' diagnoses were the most important predictor variables (contributing 18.4% ± 0.15 to the model's decision, mean ± standard error [SE]), followed by nursing assessments (11.2% ± 0.04, mean ± SE) and procedural information (10.8% ± 0.09, mean ± SE). Medications and prior healthcare use (e.g., prior emergency encounters) were more important in the medical than in the surgical cohort, whereas procedural information and healthcare provider information (e.g., physician caseload) were more relevant in the surgical than in the medical cohort. In conclusion, we have established the feasibility of using Swiss electronic medical record (EMR) data to accurately predict unplanned readmissions. The reported variable importances may guide future research and inform the development of clinical decision support systems aimed at reducing readmissions.
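
The abstract reports an AUC averaged over train-test splits and variable importances aggregated by category, but it does not state how these quantities were computed. The following is a minimal sketch of one way such summaries could be obtained, under stated assumptions: the AUC is computed as the pairwise ranking probability, the spread across splits is the sample SD, and a category's importance is the sum of its variables' importances expressed as a percentage. All function names, variable names, and numbers are hypothetical illustrations and are not taken from the study.

```python
from collections import defaultdict
from statistics import mean, stdev


def auc(labels, scores):
    """AUC as the probability that a randomly chosen readmitted stay (label 1)
    scores higher than a randomly chosen non-readmitted stay (label 0).
    Ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize_splits(auc_per_split):
    """Mean and sample SD of AUC values, one value per train-test split.
    (Assumption: the abstract's SD is taken over splits.)"""
    return mean(auc_per_split), stdev(auc_per_split)


def importance_by_category(importances, category_of):
    """Sum per-variable importances into categories, as percentages of the total.
    (Assumption: category importance is additive over its variables.)"""
    totals = defaultdict(float)
    for name, value in importances.items():
        totals[category_of[name]] += value
    grand_total = sum(totals.values())
    return {cat: 100.0 * v / grand_total for cat, v in totals.items()}


# Toy, made-up values for illustration only (not study data).
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.4, 0.4, 0.1]
toy_auc = auc(toy_labels, toy_scores)

toy_mean, toy_sd = summarize_splits([0.77, 0.78, 0.79])

toy_categories = importance_by_category(
    {"icd_main": 0.3, "icd_secondary": 0.1, "n_procedures": 0.1},
    {"icd_main": "diagnoses", "icd_secondary": "diagnoses",
     "n_procedures": "procedures"},
)
```

In practice, the scores passed to `auc` would be out-of-sample predicted probabilities from a fitted random forest, and the per-variable importances would come from that model; both are left outside this sketch to keep it dependency-free.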