An integrated automated deep learning framework for annotating tumor-infiltrating lymphocytes in lung adenocarcinoma pathology

Authors: Li, Xia; Wei, Kang-Lai; Huang, Zhao-Quan; Huang, Zi-Yan

Journal: Frontiers in Bioinformatics
Year: 2026
Volume and article number: 2026;6:1764743
DOI: 10.3389/fbinf.2026.1764743
Research areas: cell biology, oncology

Abstract

OBJECTIVE: Quantitative analysis of tumor-infiltrating lymphocytes (TILs) is crucial in computational pathology studies of lung adenocarcinoma. However, acquiring large-scale, fully annotated datasets remains a major obstacle for the supervised learning approaches that currently dominate high-precision modeling. To address this data bottleneck, we developed a fully automated pipeline for the precise annotation of tissue contours, tumor parenchyma, and lymphocytes in whole-slide images (WSIs).

METHODS: This study used WSI data from The Cancer Genome Atlas (TCGA) cohort. Two pathologists performed comprehensive manual annotations in QuPath, and a third, senior pathologist subsequently reviewed all annotations. The resulting training dataset comprised over 20,000 annotated units. These data supported three core modules: an OpenCV-based image processing pipeline for tissue contour detection, a lightweight U²-NetP model for tumor parenchyma segmentation, and a YOLOv7 object detection framework for identifying TILs within stromal regions. The pipeline was validated on both an independent internal cohort and an external hospital cohort, and its outputs were benchmarked against semi-quantitative assessments from expert pathologists.

RESULTS: The pipeline performed consistently on internal and external data. For tissue contour detection, the OpenCV-based pipeline achieved a Dice coefficient of 90.90% on the test set. For the learning-based tasks, the tumor parenchyma segmentation model achieved a Dice coefficient of 87.17% on the internal test set and maintained comparable accuracy on the external cohort, with Dice coefficients ranging from 85.09% to 91.78%. In the particularly challenging task of lymphocyte detection, the YOLOv7-based model attained an F1-score of 78.84% and an mAP@0.5 of 81.16% on the test set, and performance was sustained on external data. Critically, the automated TIL quantifications showed excellent agreement with independent pathologist assessments (ICC > 0.96). The optimized lightweight architectures make the pipeline an accessible option for large-scale WSI analysis in computational pathology.

CONCLUSION: This study developed a fully automated annotation pipeline for lung adenocarcinoma WSIs. By generating high-quality annotations of stromal TILs, the pipeline provides a reliable data foundation for subsequent computational pathology research and supports the advancement of artificial intelligence applications in pathology.
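For readers less familiar with the metrics reported above, the following is a minimal sketch of how a Dice coefficient and an F1-score are conventionally computed. It is not the authors' evaluation code: the function names, the flat 0/1 mask representation, and the toy inputs are assumptions made for illustration. The abstract also does not state how predicted and annotated lymphocytes were matched before counting true and false positives.

```python
def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks.

    Masks are assumed (for this sketch) to be flat sequences of 0/1 values
    of equal length, e.g. a flattened segmentation mask.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2.0 * tp / denom


# Toy data (hypothetical, not taken from the study).
pred_mask = [1, 1, 0, 0, 1, 0]
true_mask = [1, 0, 0, 1, 1, 0]
print(round(dice_coefficient(pred_mask, true_mask), 4))  # 0.6667
print(f1_score(tp=8, fp=2, fn=2))  # 0.8
```

For binary masks the Dice coefficient equals the pixel-wise F1-score, which is why the two functions reduce to the same formula when true positives, false positives, and false negatives are counted per pixel.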
