Abstract
BACKGROUND: Pulmonary embolism (PE) remains one of the leading causes of cardiovascular mortality. The wide range of reported mortality rates reflects the heterogeneity of PE, which presents with varied clinical characteristics. The European Society of Cardiology (ESC) guidelines stratify PE into high-, intermediate-, and low-risk groups, yet intermediate-high-risk PE lacks tailored management. This study seeks to improve risk stratification through a retrospective, machine learning-based analysis of intermediate-high-risk PE cases.
METHODS: We retrospectively identified a cohort of 79 patients with intermediate-high-risk PE from two clinical centers. We performed unsupervised hierarchical cluster analysis using the average linkage method, based on 11 standardized variables [age, sex, chronic lung disease, chronic heart disease, diabetes, Pulmonary Embolism Severity Index (PESI) score, mean arterial pressure, oxygen index, B-type natriuretic peptide (BNP), D-dimer, and treatment]. We compared baseline characteristics, laboratory data, and clinical outcomes among the resulting clusters using one-way analysis of variance.
RESULTS: Agglomerative hierarchical clustering, utilizing 10 clinical variables, identified three distinct clusters: cluster 1 (6 cases), a dominant cluster 2 (67 cases), and cluster 3 (6 cases). Age, chronic lung disease, chronic heart disease, and diabetes differed significantly among clusters (P<0.05). The three clusters differed significantly in length of stay in the intensive care unit (ICU) (P=0.03), hospitalization expenses (P<0.001), and 1- and 3-year PE mortality (P<0.001 and P=0.003, respectively); the difference in total length of stay did not reach the P<0.05 threshold (P=0.052). Computed tomography pulmonary angiography (CTPA)-confirmed absorption was comparable among the three clusters (P=0.95). Cluster 3 had the poorest prognosis, with the longest ICU stay, the highest costs, and the highest 1- and 3-year PE mortality.
CONCLUSIONS: In this study, hierarchical clustering of the admission characteristics of patients with intermediate-high-risk PE identified disease subtypes related to prognosis. Machine learning-based classification may contribute to risk stratification for patients with intermediate-high-risk PE.
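The workflow described under METHODS (standardize the variables, run agglomerative hierarchical clustering with average linkage, then compare an outcome across clusters with one-way analysis of variance) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the data are synthetic placeholders generated at random, the function names `cluster_patients` and `compare_across_clusters` are hypothetical, and the Euclidean distance metric is an assumption, since the abstract does not state which metric was used.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.stats import f_oneway, zscore


def cluster_patients(features, n_clusters=3):
    """Standardize each column, then cut an average-linkage tree into clusters.

    The Euclidean metric is an assumption; the abstract names only the
    average linkage method.
    """
    standardized = zscore(features, axis=0)
    tree = linkage(standardized, method="average", metric="euclidean")
    # criterion="maxclust" cuts the tree so that at most n_clusters remain.
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def compare_across_clusters(values, labels):
    """One-way ANOVA of a single outcome across the cluster labels."""
    groups = [values[labels == k] for k in np.unique(labels)]
    return f_oneway(*groups)


# Synthetic placeholder data, NOT the study data: 79 rows and 11 columns,
# drawn as three well-separated groups so the example has structure to find.
rng = np.random.default_rng(0)
group_sizes = [67, 6, 6]
group_centers = [0.0, 6.0, -6.0]
features = np.vstack([
    rng.normal(center, 1.0, size=(size, 11))
    for center, size in zip(group_centers, group_sizes)
])

labels = cluster_patients(features, n_clusters=3)

# Hypothetical outcome (for example, ICU days), also random.
icu_days = rng.gamma(2.0, 2.0, size=features.shape[0])
f_stat, p_value = compare_across_clusters(icu_days, labels)

print("cluster sizes:", np.bincount(labels)[1:])
print("ANOVA p-value:", p_value)
```

With real data, binary variables such as sex or diabetes would need a deliberate encoding before standardization, and the choice of three clusters would normally be justified from the dendrogram rather than fixed in advance.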