Machine learning-based prediction of invasiveness in lung adenocarcinoma presenting as ground-glass nodules using radiomics and clinical CT features

Abstract

BACKGROUND: Lung adenocarcinoma (LA), the predominant histological subtype of lung cancer, frequently manifests as ground-glass nodules (GGNs) on computed tomography (CT). Preoperative discrimination of invasiveness, which is critical for guiding surgical and therapeutic decisions, remains challenging because radiological assessment is subjective and conventional methods have limited sensitivity. This multicenter study aimed to develop a robust, non-invasive predictive framework that integrates radiomics and clinical CT features using machine learning (ML) to stratify the invasiveness of LA presenting as GGNs.

METHODS: A retrospective dual-cohort analysis was conducted on 357 patients with pathologically confirmed LA. The primary cohort (n = 312) was randomly divided into a training cohort (n = 249) and a test cohort (n = 63) at an 8:2 ratio. The external validation cohort consisted of 45 patients. Radiomics features (n = 1129) were extracted from high-resolution CT (HRCT), and clinical CT features (n = 16) were evaluated by blinded radiologists. Principal component analysis (PCA) and the least absolute shrinkage and selection operator (LASSO) were each used to reduce the dimensionality of the radiomics features, and five ML algorithms (XGBoost, SVM, Random Forest, Logistic Regression, LightGBM) were trained to predict invasiveness (low: minimally invasive adenocarcinoma or Grade 1 invasive adenocarcinoma; high: Grade 2 or 3 invasive adenocarcinoma). Model performance was assessed using the area under the curve (AUC), sensitivity, specificity, and decision curve analysis. Calibration curves were plotted, and SHapley Additive exPlanations (SHAP) was used to interpret the predictive models.

RESULTS: The Random Forest (RF) model built on clinical CT features and PCA-derived radiomics features performed best, with an AUC of 0.854 in the training cohort, 0.769 in the test cohort, and 0.778 in the external validation cohort. Key predictive features included PCA-derived radiomic components and clinical CT features. The clinical CT features-PCA radiomics RF model significantly outperformed the clinical-only models and the clinical CT features-LASSO radiomics model.

CONCLUSIONS: Integration of radiomics and clinical CT features via ML, particularly RF, enables accurate preoperative prediction of LA invasiveness in GGNs. This approach is more objective than conventional radiological assessment and may help optimize personalized treatment strategies. Further validation in larger, prospective cohorts is warranted to confirm clinical utility.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12885-025-14983-3.
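
The abstract names its evaluation metrics but does not show how they are computed. As a rough illustration only, the sketch below computes AUC, sensitivity, specificity, and the net benefit used in decision curve analysis in plain Python. The function names, the 0.5 threshold, and the eight toy labels and scores are all hypothetical and are not taken from the study; the authors' actual implementation is not described in the abstract.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion_counts(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), calling a case positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, fp, tn, fn = confusion_counts(labels, scores, threshold)
    return tp / (tp + fn), tn / (tn + fp)


def net_benefit(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> float:
    """Decision-curve net benefit at one threshold probability pt:
    TP/n - FP/n * pt / (1 - pt)."""
    tp, fp, _, _ = confusion_counts(labels, scores, threshold)
    n = len(labels)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = high invasiveness, 0 = low invasiveness.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
nb = net_benefit(labels, scores, threshold=0.5)
print(f"AUC={auc:.4f} sensitivity={sens:.2f} specificity={spec:.2f} net benefit={nb:.2f}")
```

A full decision curve would repeat `net_benefit` over a range of threshold probabilities and compare the model against the "treat all" and "treat none" strategies; the single threshold here only illustrates the formula.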
