Abstract
BACKGROUND: Treatment responses to immunochemotherapy vary widely among patients with unresectable esophageal squamous cell carcinoma (ESCC), and robust prognostic tools are lacking. This study aimed to develop and validate an interpretable machine learning (ML) model for survival prediction and risk stratification in this population. METHODS: We retrospectively analyzed a cohort of 323 patients with unresectable ESCC treated with immunochemotherapy between 2019 and 2025. Using the XGBoost algorithm, we integrated baseline clinical features (age, tumor location, TNM stage) and laboratory parameters (albumin, globulin, blood glucose) to construct a prognostic model. SHapley Additive exPlanations (SHAP), a game theory-based framework for model interpretability, was used to quantify the contribution of each feature to the predictions. An external validation cohort (n=48) was used to assess generalizability. The primary endpoint was overall survival (OS). RESULTS: The model achieved AUC values of 0.794 (internal test) and 0.689 (external test), with calibration curves demonstrating strong concordance between predicted and observed survival rates. Key prognostic factors included tumor response, age, hypoalbuminemia, hyperglobulinemia, and hyperglycemia. Risk stratification using a nomogram-derived cutoff (total score ≥50) revealed significantly inferior 2-year OS in high-risk versus low-risk patients (21.3% vs 58.6%, P<0.001). CONCLUSION: This interpretable ML model effectively predicts survival outcomes in patients with unresectable ESCC receiving immunochemotherapy, offering a data-driven tool for personalized therapeutic decision-making. Multicenter prospective trials are warranted to validate its clinical utility.
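
To illustrate the idea behind the SHAP attributions mentioned in the Methods, the following is a minimal sketch of an exact Shapley value computation for a single patient. It is not the study's pipeline: the function `toy_risk`, its coefficients, and the patient and reference values are hypothetical, chosen only so the arithmetic is easy to follow. A real analysis of an XGBoost model would more likely rely on a dedicated library than on this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are set to their baseline value,
    which is one common (simplifying) way to "remove" a feature.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for target in names:
        others = [f for f in names if f != target]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                # Input without the target feature (target stays at baseline)
                without = {f: (x[f] if f in coalition else baseline[f]) for f in names}
                # Same input with the target feature switched to the patient's value
                with_target = dict(without, **{target: x[target]})
                total += weight * (predict(with_target) - predict(without))
        phi[target] = total
    return phi


def toy_risk(p):
    # Hypothetical linear risk score, NOT the model from the study
    return 0.02 * p["age"] - 0.05 * p["albumin"] + 0.03 * p["globulin"]


# Hypothetical patient and reference (baseline) values
patient = {"age": 70, "albumin": 30, "globulin": 35}
reference = {"age": 60, "albumin": 40, "globulin": 30}

phi = shapley_values(toy_risk, patient, reference)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

In this toy case the attributions sum to the difference between the patient's score and the reference score, which is the additivity property that makes Shapley-based explanations attractive for ranking prognostic factors.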