Abstract
BACKGROUND: Candidemia is a rare but life-threatening bloodstream infection that remains difficult to predict with conventional risk stratification approaches. As a result, empiric antifungal therapy is often delayed even in high-risk patients, highlighting the need for improved predictive strategies. METHODS: We developed a deep learning model (PyTorch_EHR) to predict 7-day candidemia risk using electronic health record data from two large cohorts (Houston Methodist Hospital System [HMHS] and MIMIC-IV) of adult inpatients who underwent at least one blood culture. Model performance was compared with that of logistic regression (LR), LightGBM, and established intensive care unit candidemia scores. We further implemented a two-step prediction framework integrating candidemia and 30-day mortality risk models to inform empiric antifungal decision-making. RESULTS: Among 213,404 and 107,507 patients in the HMHS and MIMIC-IV cohorts, candidemia occurred in fewer than 1% (851 [0.4%] and 634 [0.6%], respectively). PyTorch_EHR outperformed LR, LightGBM, and existing candidemia scores in both cohorts, particularly in terms of area under the precision-recall curve (AUPRC). By integrating 30-day mortality risk, the two-step framework identified an additional 20 and 28 candidemia cases beyond the one-step model, increasing coverage to 61% (121/199) and 46% (68/147) in HMHS and MIMIC-IV, respectively. Many patients identified by the two-step framework had high mortality yet did not receive empiric antifungal therapy (61.1% in HMHS; 82.6% in MIMIC-IV). CONCLUSION: A two-step deep learning framework integrating candidemia and mortality risk may support early identification of high-risk patients and facilitate timely empiric antifungal therapy. Prospective studies are warranted to confirm these findings.
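
The coverage figures above imply a one-step baseline that the abstract does not state directly. The short sketch below backs it out from the reported counts. It assumes that the additional 20 and 28 cases are included in the two-step numerators (121 and 68) and that the denominators (199 and 147) are the same for both approaches; the dictionary keys and the function name `implied_coverage` are illustrative and not taken from the study.

```python
# Counts reported in the abstract: cases captured by the two-step framework,
# the coverage denominator, and the cases added beyond the one-step model.
reported = {
    "HMHS": {"two_step_hits": 121, "denominator_cases": 199, "added_by_step_two": 20},
    "MIMIC-IV": {"two_step_hits": 68, "denominator_cases": 147, "added_by_step_two": 28},
}


def implied_coverage(row):
    """Return (one_step_coverage, two_step_coverage) as fractions.

    Assumption: the added cases are part of the two-step numerator and the
    denominator is unchanged between the one-step and two-step approaches.
    """
    one_step_hits = row["two_step_hits"] - row["added_by_step_two"]
    n = row["denominator_cases"]
    return one_step_hits / n, row["two_step_hits"] / n


for cohort, row in reported.items():
    one_step, two_step = implied_coverage(row)
    print(f"{cohort}: one-step {one_step:.1%}, two-step {two_step:.1%}")
```

Under these assumptions, the one-step model would cover roughly 51% (101/199) of cases in HMHS and 27% (40/147) in MIMIC-IV, so the relative gain from the second step appears larger in MIMIC-IV.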