Abstract
BACKGROUND: Lung cancer remains one of the leading causes of cancer-related deaths worldwide. This study combined clinical risk factors with intratumoral radiomics, peritumoral radiomics, and intratumoral subregional (habitat) features extracted from computed tomography (CT) lung-window images. Models were built on each feature set individually and in combination to classify solid pulmonary nodules as benign or malignant and to identify the optimal model, with the aim of improving diagnostic accuracy while minimizing unnecessary invasive procedures. METHODS: CT images of 230 pathologically confirmed solitary solid pulmonary nodules were retrospectively collected from two hospitals. Of the 167 patients from the first hospital, 20% (n=34) served as the test set and the remaining 80% (n=133) were used as the training and development set with 5-fold cross-validation. Data from the second hospital (n=63) served as an external test set. Intratumoral and peritumoral regions of interest (ROIs) were delineated on lung-window images, and radiomics features were extracted from them. Multiple machine learning algorithms, including Logistic Regression (LR), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Linear Support Vector Classifier (Linear SVC), among others, were employed to construct predictive models for distinguishing benign from malignant solid pulmonary nodules. RESULTS: A triple-feature model (intratumoral, peritumoral, clinical) achieved superior diagnostic performance [area under the receiver operating characteristic curve (AUC): training 0.932, 95% confidence interval (CI): 0.897-0.960; test 0.833, 95% CI: 0.773-0.890; external test 0.741, 95% CI: 0.618-0.864] with high sensitivity and specificity. The intratumoral-peritumoral dual-modality model showed the best cross-center robustness (external test AUC: 0.808, 95% CI: 0.700-0.922). The habitat imaging model, which captures intratumoral heterogeneity, achieved an AUC of 0.750 (95% CI: 0.676-0.825). Decision curve analysis confirmed the clinical utility of the triple-feature model.
SHapley Additive exPlanations (SHAP) analysis identified age, gender, and key radiomics features (e.g., gradient_firstorder_Skewness_Intra) as the top predictors. Multi-center testing supported the generalizability of the integrated framework, positioning it as a potential tool for reducing invasive procedures in pulmonary nodule management. CONCLUSIONS: The multi-combination models developed in this study improve diagnostic accuracy in distinguishing benign from malignant solid pulmonary nodules, with the triple-feature model demonstrating the highest diagnostic performance. This approach has the potential to spare patients from unnecessary invasive procedures and to strengthen clinical decision-making in the management of pulmonary nodules.
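The AUC values above are reported with 95% confidence intervals, but the abstract does not state how those intervals were computed. As an illustration only, the following minimal sketch assumes a percentile bootstrap, which is one common choice and not necessarily the authors' method. The function names `auc` and `bootstrap_auc_ci` and the example data are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the Mann-Whitney statistic: the probability that a malignant
    case (label 1) scores higher than a benign case (label 0); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point AUC plus a percentile-bootstrap (1 - alpha) interval.
    Assumption: cases are resampled with replacement at the patient level."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains a single class; AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return auc(labels, scores), lower, upper


if __name__ == "__main__":
    # Hypothetical predicted probabilities for eight nodules (not study data).
    y_true = [0, 0, 0, 1, 1, 1, 0, 1]
    y_prob = [0.1, 0.3, 0.6, 0.4, 0.8, 0.9, 0.2, 0.7]
    point, lower, upper = bootstrap_auc_ci(y_true, y_prob)
    print(f"AUC = {point:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

With a small test set, such as the 63 external cases here, bootstrap intervals are typically wide, which is consistent with the broader external-test intervals reported above.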