Abstract
BACKGROUND: Multidrug-resistant Pseudomonas aeruginosa (MDR-PA) infections present a critical healthcare challenge, often progressing to sepsis with high mortality. Current prediction tools lack specificity for drug-resistant organisms, hindering the early identification of high-risk patients. This study aimed to develop and validate an interpretable machine learning (ML) model to predict sepsis development in patients with MDR-PA infections.
METHODS: We conducted a multicenter retrospective study analyzing 2,001 patients with laboratory-confirmed MDR-PA infections from two major medical centers between January 2019 and May 2025. The derivation cohort included 1,182 patients, while 819 patients from an independent center served as the external validation cohort. Feature selection was performed using a hybrid approach combining LASSO regression and support vector machine-recursive feature elimination (SVM-RFE). Seven ML algorithms were evaluated, with model interpretability enhanced via SHapley Additive exPlanations (SHAP). A web-based calculator was subsequently developed to facilitate clinical implementation.
RESULTS: The sepsis incidence was approximately 7% across cohorts. Feature selection identified six key predictors: calcium level, chronic obstructive pulmonary disease (COPD), red blood cell distribution width-standard deviation (RDW-SD), intra-abdominal infection, invasive catheters, and prior antibiotic exposure. The Random Forest model demonstrated superior performance, achieving an AUC of 1.000 in the SMOTE-balanced training set, 0.837 in internal validation, and 0.816 in external validation. SHAP analysis highlighted COPD and calcium levels as the most significant contributors to sepsis risk.
CONCLUSIONS: This study presents the first interpretable ML model specifically tailored for predicting sepsis onset in patients with MDR-PA infections. By addressing the limitations of general sepsis scores, our validated model and accompanying web-based tool provide clinicians with a precise, visualizable decision-support system to optimize early intervention strategies.
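For readers unfamiliar with two of the quantities mentioned above, the following is a minimal sketch of how an AUC can be computed and how SMOTE-style oversampling creates synthetic minority-class cases. It is not the study's code: the abstract does not describe the implementation, the function names `auc` and `smote_like` are hypothetical, and the toy numbers are invented purely for illustration. An actual analysis would more likely rely on established libraries such as scikit-learn and imbalanced-learn.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def smote_like(minority, n_new, k=2, seed=0):
    """SMOTE-style oversampling (simplified, illustrative only): each
    synthetic point is a random interpolation between a minority-class
    point and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority-class points")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, other)))
    return synthetic


# Toy data (hypothetical, not from the study).
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2)]
new_points = smote_like(minority, n_new=4)
```

Because each synthetic point lies on a line segment between two real minority cases, oversampling of this kind changes the class balance of the training data only. Metrics computed on the balanced training set therefore describe a different case mix from the validation cohorts.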