Abstract
OBJECTIVES: Lymph node metastasis (LNM) is an important determinant of stage and prognosis in patients with lung adenocarcinoma. This study aimed to evaluate how well a stacking ensemble learning model based on 18F-FDG PET/CT radiomic features and clinical risk factors predicts LNM in lung adenocarcinoma, and to elucidate the biological basis of the predictive features through pathological analysis.

METHODS: Ninety patients diagnosed with lung adenocarcinoma who underwent PET/CT were retrospectively analyzed and randomly divided into training and testing sets in a 7:3 ratio. Stacking ensemble learning models were developed from radiomic features combined with clinical risk factors. The predictive performance of each model was assessed using the area under the receiver operating characteristic curve (AUC). Additionally, Spearman's correlation analysis was used to investigate the association between LNM-predictive features and pathological features.

RESULTS: Multivariable logistic regression identified the bronchial cut-off sign and serum carcinoembryonic antigen (CEA) as clinical risk factors. The Stacking-combined model outperformed the logistic regression, random forest, and naive Bayes combined models, with AUC values of 0.971 and 0.901 in the training and testing sets, respectively. No radiomic-pathomic correlation remained significant after false discovery rate (FDR) correction (all q > 0.05), although exploratory analysis showed nominal associations (uncorrected P < 0.05) for some feature pairs. However, radiomic features were significantly associated with Ki-67 expression: PET_GLRLM_LongRunHighGreyLevelEmphasis (r = 0.610, q < 0.001) and CT_INTENSITY-BASED_IntensityBasedEnergy (r = 0.332, q = 0.004).

CONCLUSIONS: The stacking ensemble learning model based on 18F-FDG PET/CT radiomics shows potential for predicting LNM in lung adenocarcinoma, and the quantitative radiomic features appear to be biologically meaningful.
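
The distinction drawn in the RESULTS between nominal associations (uncorrected P < 0.05) and FDR-significant ones (q < 0.05) can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure, which the abstract does not name; the function name `benjamini_hochberg` and the p-values are hypothetical and are not data from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as the input p-values.

    Assumption: the abstract reports FDR-corrected q-values without naming the
    procedure; Benjamini-Hochberg is used here only as a common choice.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each q-value is the minimum of
    # p * m / rank over its own rank and all higher ranks (keeps q monotone).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical uncorrected p-values for five feature pairs (illustrative only).
p_values = [0.012, 0.04, 0.03, 0.20, 0.50]
q_values = benjamini_hochberg(p_values)

for p, q in zip(p_values, q_values):
    print(f"p = {p:.3f}  q = {q:.3f}  nominal: {p < 0.05}  FDR-significant: {q < 0.05}")
```

In this illustrative input, three pairs are nominally significant but none survives correction, which mirrors the pattern the abstract describes for the radiomic-pathomic correlations.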