Abstract
BACKGROUND: Sublobar resection is suitable for peripheral stage I lung adenocarcinoma (LUAD). However, if tumor spread through air spaces (STAS) is present, lobectomy is considered for its survival benefit. STAS status therefore guides the surgical approach for peripheral stage I LUAD. This study aimed to identify radiological features associated with STAS in peripheral stage I LUAD and to develop a radiomics-based machine learning (ML) model to predict STAS and support surgical decision-making by thoracic surgeons. METHODS: We retrospectively analyzed patients who underwent surgical treatment for lung tumors from January 2022 to December 2023, focusing on clinical peripheral stage I LUAD. High-resolution computed tomography (CT) scans were used to extract 1,581 radiomics features. Least absolute shrinkage and selection operator (LASSO) regression was applied to select the features most relevant to predicting STAS, reducing overfitting and improving predictive performance. Ten ML algorithms were evaluated with 10-fold cross-validation using the area under the receiver operating characteristic curve (AUROC), accuracy, recall, F1-score, and Matthews correlation coefficient (MCC). SHapley Additive exPlanations (SHAP) values were calculated to provide interpretability and to illustrate the contribution of individual features to the model's predictions. Additionally, a user-friendly web application was developed to enable clinicians to use these predictive models in real time to assess the risk of STAS. RESULTS: STAS was significantly associated with radiological features including the longest diameter, consolidation-to-tumor ratio (CTR), and the presence of spiculation. The Random Forest (RF) model for the 3-mm peritumoral extension demonstrated strong predictive performance, with a mean recall of 0.717, mean accuracy of 0.891, mean F1-score of 0.758, mean MCC of 0.708, and mean AUROC of 0.944. SHAP analyses highlighted the most influential radiomics features, clarifying the model's decision-making process. CONCLUSIONS: The RF model, employing specific intratumoral and 3-mm peritumoral radiomics features, was highly effective in predicting STAS in peripheral stage I LUAD. This model is recommended for clinical use to optimize surgical strategies for LUAD patients, supported by a real-time web application for STAS risk assessment.
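The evaluation metrics named in the Methods (recall, accuracy, F1-score, MCC, AUROC, averaged over cross-validation folds) can be illustrated with a minimal pure-Python sketch. It is not the study's code: the function names (`classification_metrics`, `auroc`, `mean_over_folds`), the 0.5 decision threshold, and the toy labels and scores are all hypothetical and chosen only for illustration, with 1 standing for STAS-positive.

```python
from math import sqrt


def classification_metrics(y_true, y_pred):
    """Recall, accuracy, F1 and MCC for binary labels (1 = STAS-positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC uses all four confusion-matrix cells, so it stays informative
    # when the classes are imbalanced.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"recall": recall, "accuracy": accuracy, "f1": f1, "mcc": mcc}


def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_over_folds(fold_metrics):
    """Average each metric across cross-validation folds."""
    keys = fold_metrics[0].keys()
    return {k: sum(m[k] for m in fold_metrics) / len(fold_metrics)
            for k in keys}


# Toy fold (illustrative values, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.35, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

fold = classification_metrics(y_true, y_pred)
fold["auroc"] = auroc(y_true, scores)
summary = mean_over_folds([fold])  # with 10 folds, pass all 10 dicts
```

In a 10-fold setting, one such dictionary would be produced per held-out fold and passed to `mean_over_folds`, giving per-metric means of the kind reported in the Results.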