Abstract
In the postoperative management of early-stage non-small cell lung cancer (NSCLC), accurately identifying patients at high risk of progression is essential for developing personalized follow-up schedules and adjuvant treatment strategies. In multi-center settings, however, heterogeneity in both models and data limits the flexibility and generalizability of traditional federated learning. To address this limitation, we propose a heterogeneous federated learning model (HFLM) that allows each center to adopt a different model architecture and uses a robust feature transfer strategy to mitigate the impact of heterogeneous data. Using CT images from 892 early-stage NSCLC patients across four medical institutions, HFLM achieved AUCs of 0.863 (95% CI, 0.8072–0.9192), 0.837 (95% CI, 0.7204–0.9530), 0.846 (95% CI, 0.7349–0.9564), and 0.847 (95% CI, 0.6971–0.9963). Cross-validation and stratified analyses further support its generalization and stability across centers.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1038/s41598-025-30565-6.