Abstract
Neoadjuvant immunotherapy has increased rates of pathological complete response (pCR) in thoracic malignancies and has renewed interest in organ-preserving strategies. In esophageal squamous cell carcinoma (ESCC), the Surgery As Needed for Oesophageal cancer (SANO) trial has shown that active surveillance may be feasible for selected patients who achieve a clinical complete response (cCR) after neoadjuvant chemoradiotherapy. Nevertheless, cCR remains an imperfect surrogate for pCR, and residual disease may still be present despite apparently negative post-treatment assessments. In this context, the recent study by Qi and colleagues is particularly noteworthy. In a multicenter retrospective cohort of 335 patients with ESCC treated with neoadjuvant chemoimmunotherapy followed by surgery, the authors developed and externally validated an interpretable multimodal radiopathomics model that integrates pretreatment contrast-enhanced computed tomography (CT) radiomics with hematoxylin and eosin (H&E) whole-slide image pathomics. Their intermediate fusion model outperformed unimodal radiomics, unimodal pathomics, and a late fusion approach across validation cohorts, while also providing feature-level and case-level interpretability. This work is important for two reasons: it illustrates why multimodal prediction may capture treatment response better than single-modality assessment, and it relies on data already generated in routine care. At the same time, substantial challenges remain in manual annotation, pathology field selection, digital workflow standardization, and external generalizability. We believe this study represents an important step toward biologically informed, clinically usable prediction of pCR and offers a valuable framework for refining organ-preserving strategies in the era of neoadjuvant immunotherapy.