Abstract
BACKGROUND: Non-invasive prediction of EGFR mutation status in lung adenocarcinoma (LUAD) is critical for treatment planning, particularly in small pulmonary nodules where tissue genotyping is limited. However, the consolidation-to-tumor ratio (CTR), a clinically relevant imaging biomarker, has rarely been incorporated into radiomics-based models.

OBJECTIVE: To develop and validate an interpretable CT radiomics model incorporating CTR and clinical features for predicting EGFR mutation status in LUAD patients with nodules ≤3 cm.

METHODS: This retrospective study included 492 patients with pathologically confirmed LUAD who underwent preoperative non-contrast chest CT between January 2017 and December 2022. Tumors were manually segmented for radiomic feature extraction, and CTR was measured for each lesion. Radiomic texture features were computed with PyRadiomics using a fixed gray-level bin width. Feature selection was performed using analysis of variance (ANOVA) and mutual information filtering, followed by recursive feature elimination (RFE) with a random forest base estimator. Three random forest classifiers were constructed: a radiomics-only model, a clinical-only model, and a combined radiomics-clinical model. Model performance was assessed by the area under the receiver operating characteristic curve (AUC) with 95% confidence intervals (CIs), and interpretability was evaluated using SHapley Additive exPlanations (SHAP).

RESULTS: The combined model achieved the best performance (AUC, 0.74 [95% CI: 0.69-0.79] in training; 0.76 [95% CI: 0.66-0.85] in testing), outperforming the radiomics-only (AUC, 0.69) and clinical-only (AUC, 0.60) models in the testing cohort. CTR was the most influential feature according to SHAP analysis.

CONCLUSION: An interpretable radiomics model integrating CTR and clinical features enables effective non-invasive prediction of EGFR mutation status in small LUAD nodules, supporting molecular risk stratification when tissue genotyping is unavailable.
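
The performance metric named in the Methods, AUC with a 95% CI, can be illustrated with a short sketch. The abstract does not state how the intervals were derived; the example below assumes a percentile bootstrap over cases, which is one common choice and not necessarily the one used in this study. The function names `auc` and `bootstrap_auc_ci` and the toy labels and scores are hypothetical and for illustration only.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point AUC plus a percentile-bootstrap (1 - alpha) interval.
    ASSUMPTION: percentile bootstrap; the study's CI method is not stated."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample cases with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return auc(labels, scores), lower, upper


# Toy data (hypothetical): 1 = EGFR-mutant, 0 = wild-type.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
point, lower, upper = bootstrap_auc_ci(toy_labels, toy_scores)
print(f"AUC {point:.2f} [95% CI: {lower:.2f}-{upper:.2f}]")
```

With only a few dozen test cases per class, intervals of this kind are wide, which is consistent with the broader interval reported for the testing cohort than for the training cohort.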