ViTCNN: a robust hybrid CNN-Vision Transformer based deep learning framework for multi-disease diagnosis in women's healthcare



Abstract

Accurate and efficient detection of multiple diseases from diagnostic images remains a major challenge, especially for women's health conditions such as breast cancer, cervical cancer, and Polycystic Ovary Syndrome (PCOS). Each of these diseases presents its own imaging characteristics and visual patterns, which makes detecting all of them with a single model highly challenging. To address this, we propose a hybrid deep learning framework that combines EfficientNetB0 and a Vision Transformer for multi-disease detection. The shared-backbone, multi-head architecture of the proposed framework integrates the strong spatial feature extraction of EfficientNetB0 with the contextual reasoning of the Vision Transformer, so that the model captures both local and global disease features. The framework was trained on a dataset containing several thousand annotated diagnostic images using a two-stage learning strategy: 70 epochs of initial training followed by 30 epochs of fine-tuning. Experimental results show strong diagnostic performance: after initial training, the approach achieves accuracies of 97.64% for breast cancer, 94.28% for cervical cancer, and 98.10% for PCOS. These improve to 98.82%, 95.96%, and 98.96%, respectively, after the fine-tuning stage. Future work will focus on dataset expansion and clinical validation for real-world diagnostic deployment.
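To put the reported fine-tuning gains in perspective, the short sketch below recomputes them from the accuracies quoted in the abstract. It is only an illustrative calculation: the function names `fine_tuning_gain` and `summarize` are hypothetical, and the "relative error reduction" metric is an assumption added here for interpretation, not a figure reported by the authors.

```python
# Accuracies (%) quoted in the abstract: (after initial training, after fine-tuning).
REPORTED = {
    "breast cancer": (97.64, 98.82),
    "cervical cancer": (94.28, 95.96),
    "PCOS": (98.10, 98.96),
}


def fine_tuning_gain(before, after):
    """Return (absolute gain in percentage points, relative error reduction in %).

    The relative error reduction is a derived metric (an assumption of this
    sketch): the share of the remaining misclassifications removed by fine-tuning.
    """
    gain = after - before
    error_before = 100.0 - before
    error_after = 100.0 - after
    reduction = 100.0 * (error_before - error_after) / error_before
    return round(gain, 2), round(reduction, 1)


def summarize(reported=REPORTED):
    """Map each disease to its (absolute gain, relative error reduction)."""
    return {task: fine_tuning_gain(b, a) for task, (b, a) in reported.items()}


if __name__ == "__main__":
    for task, (gain, reduction) in summarize().items():
        print(f"{task}: +{gain} points, about {reduction}% fewer errors")
```

Under these assumptions, the absolute gains are 1.18, 1.68, and 0.86 percentage points, which correspond to removing roughly 50%, 29%, and 45% of the errors that remained after initial training.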
