A Novel Hybrid CNN-ViT-Based Bi-Directional Cross-Guidance Fusion-Driven Breast Cancer Detection Model

Abstract

Accurate and early identification of breast cancer from mammography is key to reducing breast cancer mortality, but automated analysis remains challenging because of subtle lesion appearances, heterogeneous breast density, and modality-related variability. Standard Convolutional Neural Networks (CNNs) excel at capturing localized textures, whereas Vision Transformers (ViTs) capture long-range dependencies; however, each on its own often struggles to produce a unified representation that consistently supports diagnostic decision-making. To address these limitations, this study presents a dual-stream framework that integrates ConvNeXt for high-fidelity local feature extraction with Swin Transformer V2 for hierarchical global context modeling. A Bi-Directional Cross-Guidance (BDCG) mechanism harmonizes interactions between the two feature domains so that each stream's representation is informed by the other. Furthermore, a Prototype-Anchored Similarity Head (PASH) stabilizes classification by replacing linear separation with distance-based reasoning. Experiments on two benchmark datasets demonstrate the effectiveness of the proposed method. On Dataset 1, the model achieves an accuracy of 98.8%, precision of 98.7%, recall of 98.6%, and F1-score of 97.2%, outperforming existing CNN, ViT, and hybrid models while also providing a lower inference time (8.3 ms/image). On the more heterogeneous Dataset 2, the model maintains strong performance, with an accuracy of 97.0%, precision of 95.4%, recall of 94.8%, and F1-score of 95.1%, demonstrating its resilience to domain shift and imaging variability. These results underscore the value of structured multi-scale feature interaction and prototype-driven classification for robust mammographic analysis. The consistent performance across internal and external evaluations indicates that the proposed framework has the potential to be applied reliably in computer-aided screening systems.
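
To make the idea of a prototype-based classification head more concrete, the following is a minimal illustrative sketch and not the authors' implementation. The abstract states only that PASH relies on distance-based reasoning rather than linear separation; the use of cosine similarity, the temperature value, the function names, and the toy three-dimensional vectors below are all assumptions introduced for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def prototype_head(feature, prototypes, temperature=0.1):
    """Class probabilities from the similarity of a feature to each class prototype.

    Assumption: one prototype vector per class, cosine similarity as the
    measure, and a fixed temperature. The source does not specify these.
    """
    logits = [cosine_similarity(feature, p) / temperature for p in prototypes]
    return softmax(logits)


# Hypothetical fused feature and two hypothetical class prototypes
# (index 0: benign, index 1: malignant).
prototypes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
feature = [0.9, 0.2, 0.1]

probs = prototype_head(feature, prototypes)
predicted = max(range(len(probs)), key=probs.__getitem__)
```

In this sketch the predicted class is simply the prototype closest to the feature, so the decision depends on relative similarity rather than on a learned linear boundary. In a trained model the prototypes would typically be learned parameters or class-wise feature means, which is again an assumption rather than a detail given in the abstract.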
