Automated quantification of tumor-infiltrating lymphocytes by machine learning reveals prognostic and immunogenomic features in lung cancer



Abstract

Tumor-infiltrating lymphocytes (TILs) are key components of the tumor microenvironment (TME) and are recognized as prognostic and predictive biomarkers in non-small cell lung cancer (NSCLC). However, manual TIL assessment on hematoxylin and eosin (H&E)-stained slides is subjective and poorly reproducible. This study aimed to develop and validate an automated, machine learning–based framework for TIL quantification and to explore its associations with immunogenomic features and patient outcomes. H&E-stained slides and transcriptomic, genomic, and clinical data from lung adenocarcinoma patients were retrieved from The Cancer Genome Atlas (TCGA). An automated TIL quantification pipeline was built in QuPath (v0.5.1) with stain normalization, watershed cell segmentation, and a supervised cell classifier to identify tumor cells, stromal cells, and TILs. In a separate step, a random forest model based on aggregated Haralick texture features and tumor stage was trained to classify patients into high- and low-TIL subgroups. TIL density cut-offs were defined by maximally selected rank statistics. Survival was analyzed via the Kaplan–Meier method and Cox regression. ssGSEA, ESTIMATE, GSVA, and WGCNA were applied to characterize immune infiltration and transcriptomic modules. Somatic mutations were compared between groups, and drug sensitivity was predicted via GDSC-derived ridge regression models. Model performance was evaluated via 10-fold cross-validation with SMOTE oversampling. Automated quantification was highly concordant with pathologist review and RNA-seq-based inference. An optimal TIL cut-off of 135 cells/mm² was used to stratify patients into high- and low-density groups. High-TIL tumors were enriched for adaptive immune infiltration, antigen presentation, and TCR signaling, and exhibited greater mutational diversity, whereas low-TIL tumors were enriched in ribosome biogenesis and protein translation pathways. Prognostically, high TIL density was associated with improved overall survival (HR = 0.48, 95% CI: 0.29–0.79; P = 0.004). Predicted IC50 values did not differ between groups for standard chemotherapies but varied for selected compounds. The Haralick-based classification model achieved an AUC of 0.87 (95% CI: 0.835–0.901) in internal cross-validation, which improved to 0.892 (95% CI: 0.848–0.913) when tumor stage was incorporated. This study demonstrated that automated TIL quantification is feasible and prognostically relevant in lung cancer and may provide a hypothesis-generating marker of immune activation for future immunotherapy studies; however, direct validation in immunotherapy-treated cohorts is required before clinical implementation.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1038/s41598-026-37076-y.
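The cut-off step in the abstract (a TIL density threshold chosen by maximally selected rank statistics, then used to split patients into high- and low-density groups) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names (`til_density`, `stratify`, `logrank_z`, `best_cutoff`) and the toy cohort are hypothetical. The scan only finds the cut-off with the largest absolute standardized log-rank statistic; it omits the multiplicity-adjusted p-value that a full maximally selected rank statistics procedure would also report.

```python
from math import sqrt

def til_density(til_count, area_mm2):
    """TILs per square millimetre of analysed tissue."""
    if area_mm2 <= 0:
        raise ValueError("area_mm2 must be positive")
    return til_count / area_mm2

def stratify(density, cutoff=135.0):
    """Label a patient 'high' or 'low'; 135 cells/mm^2 is the reported cut-off."""
    return "high" if density >= cutoff else "low"

def logrank_z(times, events, in_high):
    """Standardized log-rank statistic for the high group versus the low group."""
    obs_minus_exp = 0.0
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n_high = sum(1 for i in at_risk if in_high[i])
        dead = [i for i in at_risk if times[i] == t and events[i]]
        d = len(dead)
        d_high = sum(1 for i in dead if in_high[i])
        p = n_high / n
        obs_minus_exp += d_high - d * p           # observed minus expected deaths
        if n > 1:
            variance += d * p * (1 - p) * (n - d) / (n - 1)  # hypergeometric variance
    return obs_minus_exp / sqrt(variance) if variance > 0 else 0.0

def best_cutoff(densities, times, events, min_group=2):
    """Scan observed densities; return (cutoff, |z|) with the largest |z|."""
    best = (None, 0.0)
    for c in sorted(set(densities)):
        in_high = [d >= c for d in densities]
        k = sum(in_high)
        if k < min_group or len(densities) - k < min_group:
            continue  # skip splits that leave a group too small
        z = abs(logrank_z(times, events, in_high))
        if z > best[1]:
            best = (c, z)
    return best

# Made-up toy cohort (not data from the study): density, follow-up time, event flag.
densities = [50, 80, 100, 150, 200, 300]
times = [5, 6, 7, 20, 22, 25]
events = [1, 1, 1, 1, 1, 1]
cutoff, z = best_cutoff(densities, times, events)
groups = [stratify(d, cutoff) for d in densities]
```

In practice the scan would run on per-patient densities exported from the image analysis step, with a larger minimum group size than the toy value of 2 used here.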
