Abstract
This multicenter study aimed to quantify CT-based intratumoral heterogeneity (ITH) and to provide a novel, valuable dimension for deep learning radiomics in predicting lymph node metastases (LNM) in early-stage lung adenocarcinoma (LUAD). Data from 787 early-stage LUAD patients who underwent surgery at three centers were retrospectively analyzed. The ITHscore was generated by unsupervised clustering of 2D tumor subregions on CT. A fusion predictive model incorporating deep learning features, the Radscore, and the ITHscore was developed and validated. The ITHscore was an important predictor, achieving AUCs of 0.813, 0.807, and 0.718 in the three test sets, respectively. Gene Set Enrichment Analysis of test dataset D revealed the epithelial-mesenchymal transition pathway as the top enriched gene signature (NES = 1.71, p < 0.001) in the high-ITHscore group. The fusion model demonstrated the best prediction performance, with AUCs of 0.832, 0.884, and 0.815 in the test sets. Correlation and interpretability analyses confirmed that these three feature sets, drawn from different dimensions, act synergistically in capturing imaging phenotypes. The ITHscore can be used as a novel imaging biomarker for predicting LNM in early-stage LUAD. The deep learning radiomics model fusing the ITHscore provides new insights for promoting early, non-invasive, personalized treatment of LUAD. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1038/s41598-025-26331-3.
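To illustrate the general idea of deriving a heterogeneity score from unsupervised clustering of tumor subregions, the following is a minimal sketch and not the study's actual pipeline. It makes several assumptions that do not come from the abstract: each tumor pixel is described by a single intensity value, clustering is plain k-means with a hypothetical `k = 3`, and heterogeneity is summarized as the normalized Shannon entropy of the cluster proportions. The function names `kmeans_1d`, `entropy_heterogeneity`, and `ith_sketch` are invented for this example.

```python
import math
import random


def kmeans_1d(values, k, iters=50, seed=0):
    """Plain Lloyd's k-means on scalar values; returns one cluster label per value."""
    distinct = sorted(set(values))
    k = min(k, len(distinct))  # cannot form more clusters than distinct values
    rng = random.Random(seed)
    centers = rng.sample(distinct, k)
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: nearest center for each value.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def entropy_heterogeneity(labels, k):
    """Normalized Shannon entropy of cluster proportions, in [0, 1]; requires k >= 2."""
    n = len(labels)
    h = 0.0
    for c in range(k):
        p = labels.count(c) / n
        if p > 0:
            h -= p * math.log(p)
    return h / math.log(k)


def ith_sketch(pixel_values, k=3):
    """Illustrative heterogeneity score (an assumption, not the paper's ITHscore)."""
    labels = kmeans_1d(pixel_values, k)
    return entropy_heterogeneity(labels, k)


if __name__ == "__main__":
    homogeneous = [40.0] * 12
    mixed = [0.0] * 4 + [100.0] * 4 + [200.0] * 4
    print(ith_sketch(homogeneous))  # a uniform tumor scores at the low end
    print(ith_sketch(mixed))        # three equal, well-separated groups score near the top
```

In this sketch a tumor whose pixels all fall into one cluster scores 0, and a tumor split evenly across all `k` clusters scores close to 1. A real pipeline would cluster richer local features than raw intensity and might also account for the spatial arrangement of the clusters, which this entropy-based summary ignores.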