Abstract
BACKGROUND: Severe pulmonary complications (SPCs) after cancer surgery substantially increase morbidity, mortality, and healthcare burden. Yet clinicians lack accurate, practical tools to predict whether SPCs will occur and how patients who develop them will fare.
METHODS: We conducted a retrospective cohort study of cancer patients undergoing surgery at Hunan Cancer Hospital between June 2023 and June 2025. Two datasets were established: (1) 434 patients (227 with SPCs and 207 controls) for predicting SPC occurrence, and (2) the 227 SPC patients with complete follow-up data for predicting 28-day and 90-day survival. Six supervised machine learning classifiers were developed: linear discriminant analysis (LDA), support vector machine (SVM), random forest (RDF), decision tree (DST), adaptive boosting (ADA), and extremely randomized trees (EXT). Hyperparameters were optimized using grid search with five-fold stratified cross-validation. Performance was assessed on the testing sets using the Brier score, precision, recall, F1-score, and AUC. SHapley Additive exPlanations (SHAP) were used for model interpretability, and the finalized models were deployed as web-based applications.
RESULTS: For SPC occurrence prediction, the EXT model performed best (AUC = 0.813). For 28-day mortality prediction, EXT achieved the highest discrimination (AUC = 0.921), whereas RDF performed best for 90-day mortality (AUC = 0.899). SHAP analysis identified preoperative hypertension, ECOG score, and intraoperative blood loss as the most influential predictors of SPC occurrence. Postoperative SOFA score, APACHE II score, and blood urea nitrogen (BUN) were the key predictors of 28-day and 90-day mortality. PRESCO (https://presco.streamlit.app/), an online tool, provides real-time prediction of SPC occurrence and survival outcomes.
CONCLUSIONS: We developed and validated machine learning models that accurately predict SPC occurrence and 28-day and 90-day survival in cancer patients after surgery. The deployed online tool gives clinicians easy access to these models for personalized risk stratification and perioperative decision-making in oncology.
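
For readers unfamiliar with the evaluation metrics named in the METHODS, the following is a minimal sketch of how the Brier score, precision, recall, F1-score, and AUC can be computed from predicted probabilities. It is not the authors' code. It assumes that AUC refers to the area under the ROC curve and that class labels are obtained with a 0.5 probability threshold; neither is stated in the abstract. The function names and the toy labels and probabilities are hypothetical and are not study data.

```python
from typing import Sequence, Tuple


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def precision_recall_f1(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float, float]:
    """Precision, recall, and F1 after thresholding probabilities (0.5 is an assumed cut-off)."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 1)
    fp = sum(1 for y, yh in zip(y_true, y_pred) if y == 0 and yh == 1)
    fn = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 1 = event (e.g. SPC), 0 = no event.
y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]

brier = brier_score(y_true, y_prob)
precision, recall, f1 = precision_recall_f1(y_true, y_prob)
auc = roc_auc(y_true, y_prob)
print(f"Brier={brier:.3f} precision={precision:.3f} recall={recall:.3f} "
      f"F1={f1:.3f} AUC={auc:.3f}")
```

The Brier score reflects calibration as well as discrimination, whereas AUC depends only on how cases are ranked, which is one reason to report both.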