Abstract
OBJECTIVES: The aim of this work was to train machine learning models to identify patients at end of life with clinically meaningful diagnostic accuracy, using 30-day mortality in patients discharged from the emergency department (ED) as a proxy.
DESIGN: Retrospective, population-based registry study.
SETTING: Swedish health services.
PRIMARY AND SECONDARY OUTCOME MEASURES: All-cause 30-day mortality.
METHODS: Electronic health records (EHRs) and administrative data were used to train six supervised machine learning models to predict all-cause mortality within 30 days in patients discharged from EDs in southern Sweden, Europe.
PARTICIPANTS: The models were trained using 65 776 ED visits and validated on 55 164 visits from a separate ED to which the models were not exposed during training.
RESULTS: The outcome occurred in 136 visits (0.21%) in the development set and in 83 visits (0.15%) in the validation set. The model with the highest discrimination attained ROC-AUC 0.95 (95% CI 0.93 to 0.96), with sensitivity 0.87 (95% CI 0.80 to 0.93) and specificity 0.86 (95% CI 0.86 to 0.86) on the validation set.
CONCLUSIONS: Multiple models displayed excellent discrimination on the validation set and outperformed available indexes for short-term mortality prediction in terms of ROC-AUC (by indirect comparison). The practical utility of the models is enhanced because the data they were trained on did not require costly de novo collection but were real-world data generated as a by-product of routine care delivery.
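The event rates and accuracy measures reported above follow from simple ratios. The following is a minimal sketch, not the authors' analysis code: the function names `event_rate` and `sensitivity_specificity` are hypothetical, and the confusion-matrix counts in the last step are illustrative toy values, not figures from the study. Only the visit and event counts are taken from the abstract.

```python
def event_rate(events: int, visits: int) -> float:
    """Proportion of visits followed by the outcome (here, death within 30 days)."""
    if visits <= 0:
        raise ValueError("visits must be positive")
    return events / visits


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Counts reported in the abstract.
dev_rate = event_rate(136, 65_776)  # development set
val_rate = event_rate(83, 55_164)   # validation set
print(f"development: {dev_rate:.2%}, validation: {val_rate:.2%}")

# Toy counts (assumed for illustration only, NOT study data).
toy_sens, toy_spec = sensitivity_specificity(tp=9, fn=1, tn=80, fp=20)
print(f"toy sensitivity: {toy_sens:.2f}, toy specificity: {toy_spec:.2f}")
```

Run as written, the first line of output reproduces the reported event rates of 0.21% and 0.15%.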