ResViT-GANNet: a deep learning framework for classifying breast cancer histopathology images using multimodal attention and GAN-based augmentation

Abstract

BACKGROUND: Breast cancer remains the most commonly diagnosed malignancy among women worldwide. Histopathological image analysis is the clinical gold standard for diagnosis; however, the high resolution and complexity of these images, together with limited annotated data, pose significant challenges for traditional deep learning methods. This study aims to develop a robust classification framework capable of effectively analyzing high-resolution histopathological images.

METHODS: We propose ResViT-GANNet, a novel dual-branch deep learning architecture that integrates a residual convolutional network with channel attention and a vision transformer with multi-layer token fusion. This design is specifically intended to capture both fine-grained local pathological features and long-range global semantic representations. A key novelty of our framework is the Token-Aligned Multimodal Attention (TAMA) module, which combines heterogeneous features from both branches through multi-head attention and token-wise alignment. To address limited and imbalanced data, we incorporated synthetic histopathology images generated with StyleGAN2-ADA into the training set. Extensive experiments on the BACH and BreakHis datasets demonstrate superior performance, with statistical significance confirmed through rigorous evaluation.

RESULTS: On the BACH dataset (4-class classification), ResViT-GANNet achieved an accuracy of 96.40%, precision of 96.34%, recall of 96.36%, and an F1-score of 96.35%. These results significantly outperformed baseline methods including TransMIL (85.83%), CTransPath (88.75%), and SwinCNN (92.89%), with p-values < 0.01 and large effect sizes (Cohen’s d > 1.0). Incorporating synthetic data yielded an average accuracy improvement of 3.3%. On the BreakHis dataset (8-class classification across four magnification levels), the model attained an average accuracy of 98.22%, with per-class accuracies ranging from 97.25% to 99.50%. Grad-CAM visualizations further confirmed enhanced interpretability and highlighted critical histological features relevant for classification.

CONCLUSIONS: ResViT-GANNet substantially improves classification performance on complex, high-resolution histopathology images. The major contributions of this work include a parallel dual-branch architecture enabling synergistic local–global feature learning, a token-aligned multimodal fusion mechanism, and the integration of generative augmentation with explainable AI. Together, these innovations enhance model generalization and robustness, underscoring the potential of ResViT-GANNet as a clinically useful decision-support system for breast cancer diagnosis.

TRIAL REGISTRATION: Not applicable.
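
The abstract reports accuracy, precision, recall, F1-score, and Cohen's d, but does not say how they were averaged or which effect-size variant was used. The sketch below shows one common way to compute these quantities and should not be read as the authors' evaluation code. It assumes macro averaging over classes (F1 taken as the mean of per-class F1 values) and Cohen's d for two independent samples with a pooled standard deviation. The function names `macro_metrics` and `cohens_d` and all input values are hypothetical toy data, not results from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def macro_metrics(y_true, y_pred, num_classes):
    """Return (accuracy, macro precision, macro recall, macro F1).

    Assumption: macro averaging, i.e. each class contributes equally.
    The abstract does not state which averaging scheme was used.
    """
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for c in range(num_classes):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        # Guard against division by zero for classes never predicted or never present.
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return accuracy, mean(precisions), mean(recalls), mean(f1s)


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation.

    Assumption: each sample holds per-run (or per-fold) scores of one model.
    Each sample needs at least two values.
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


if __name__ == "__main__":
    # Toy 4-class labels (illustrative only, not data from the paper).
    y_true = [0, 0, 1, 1, 2, 2, 3, 3]
    y_pred = [0, 0, 1, 1, 2, 2, 3, 0]
    acc, prec, rec, f1 = macro_metrics(y_true, y_pred, num_classes=4)
    print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")

    # Toy per-run scores for two hypothetical models.
    print(f"Cohen's d = {cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0]):.2f}")
```

By the usual convention, a value of d above 0.8 is described as a large effect, so the reported d > 1.0 means the mean scores differ by more than one pooled standard deviation.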
