Abstract
BACKGROUND: Radiomics analysis of 18F-FDG PET/CT enables extraction of high-dimensional quantitative features that may serve as imaging biomarkers for tumor biology. In non-small cell lung cancer (NSCLC), such features could potentially predict immune checkpoint biomarker status, such as PD-L1 expression, and histological subtype without the need for invasive tissue sampling.
METHODS: We retrospectively analyzed pre-treatment 18F-FDG PET/CT scans of 115 patients with histologically confirmed NSCLC (adenocarcinoma or squamous cell carcinoma) and known PD-L1 status. Radiomic features were extracted from primary tumors and combined with basic clinical variables (“naïve features”). Principal component analysis (PCA) and hierarchical clustering were performed to assess the separability of classes defined by either PD-L1 status or histology. Multiple machine learning classifiers, including random forest (RF), were trained using radiomic-only, naïve-only, and combined feature sets. Model performance was evaluated by area under the receiver operating characteristic curve (AUC-ROC).
RESULTS: For PD-L1 prediction, the RF model trained on radiomic features achieved the highest performance (AUC-ROC = 0.83, 95% CI = [0.75, 0.91]), outperforming the combined-feature model (AUC-ROC = 0.81, 95% CI = [0.73, 0.89]) and the naïve-only generalized linear and RF models (AUC-ROC ≈ 0.5). Heatmap analysis of selected radiomic features showed significant clustering of PD-L1-positive and PD-L1-negative tumors (Fisher’s exact p = 7 × 10⁻⁴). For histological subtype classification, the best-performing RF model achieved an AUC-ROC of 0.76 using radiomic features. PCA showed only partial separation between adenocarcinoma and squamous cell carcinoma (PC1 = 38%, PC2 = 21%), whereas hierarchical clustering yielded a significant class separation (p = 0.0234).
CONCLUSIONS: This pilot study indicates that 18F-FDG PET/CT radiomics can predict PD-L1 expression non-invasively with good discrimination and provides moderate discrimination between the major NSCLC histological subtypes. Clinical variables alone had negligible predictive value for PD-L1 status. Radiomic models may complement histopathology, particularly when tissue sampling is limited or repeated biomarker assessment is required. Prospective, multi-center validation is needed to confirm generalizability and facilitate clinical translation.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12967-026-08029-w.
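As an illustration of the evaluation metric reported in RESULTS, the following minimal Python sketch computes an AUC-ROC together with a 95% confidence interval. It is not the authors' pipeline: the abstract does not state how the intervals were obtained, so the percentile bootstrap used here is an assumption, and the labels, scores, class proportions, and function names (`auc_roc`, `bootstrap_ci`) are hypothetical, with only the cohort size of 115 taken from the text.

```python
import random


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC-ROC (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample holds one class only
        stats.append(auc_roc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[round((alpha / 2) * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical data: 115 patients with simulated classifier scores.
data_rng = random.Random(42)
labels = [1] * 50 + [0] * 65
scores = ([data_rng.gauss(0.6, 0.2) for _ in range(50)]
          + [data_rng.gauss(0.4, 0.2) for _ in range(65)])

auc = auc_roc(labels, scores)
lower, upper = bootstrap_ci(labels, scores)
print(f"AUC-ROC = {auc:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

In practice the scores would be out-of-sample predicted probabilities from the trained classifier (for example from cross-validation), since an AUC-ROC computed on training data overstates performance.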