Abstract
Accurate and efficient detection of multiple diseases from diagnostic images remains a major challenge, especially for women's health conditions such as breast cancer, cervical cancer, and Polycystic Ovary Syndrome (PCOS). Each of these diseases presents its own imaging characteristics and visual patterns, which makes detecting all of them with a single model highly challenging. To address this, we propose a hybrid deep learning framework that combines EfficientNetB0 and a Vision Transformer for multi-disease detection. The framework's shared-backbone, multi-head architecture integrates the strong spatial feature extraction of EfficientNetB0 with the contextual reasoning of the Vision Transformer, allowing the model to capture both local and global features of each disease. The framework was trained on a dataset containing several thousand annotated diagnostic images using a two-stage learning strategy: 70 epochs of initial training followed by 30 epochs of fine-tuning. After initial training, our approach achieved accuracies of 97.64% for breast cancer, 94.28% for cervical cancer, and 98.10% for PCOS; fine-tuning improved these to 98.82%, 95.96%, and 98.96%, respectively. Future work will focus on dataset expansion and clinical validation for real-world diagnostic deployment.
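To put the reported fine-tuning gains in perspective, the short sketch below recomputes, for each disease, the absolute accuracy gain and the relative reduction in error rate from the two sets of accuracies quoted above. It is illustrative arithmetic only, not part of the proposed framework: the names `REPORTED`, `fine_tuning_gain`, and `SUMMARY` are hypothetical, and the comparison assumes that both accuracies for a given disease were measured on the same evaluation set.

```python
# Accuracies (%) quoted in the abstract: (after initial training, after fine-tuning).
REPORTED = {
    "breast cancer": (97.64, 98.82),
    "cervical cancer": (94.28, 95.96),
    "PCOS": (98.10, 98.96),
}


def fine_tuning_gain(before, after):
    """Return (absolute gain in percentage points, relative error reduction in %).

    Assumes `before` and `after` are accuracies on the same evaluation set,
    so that error rate = 100 - accuracy is comparable across the two stages.
    """
    gain = after - before
    error_before = 100.0 - before
    return round(gain, 2), round(100.0 * gain / error_before, 1)


# Hypothetical summary table keyed by disease name.
SUMMARY = {task: fine_tuning_gain(*acc) for task, acc in REPORTED.items()}

if __name__ == "__main__":
    for task, (gain, reduction) in SUMMARY.items():
        print(f"{task}: +{gain:.2f} points, {reduction:.1f}% fewer errors")
```

Under these assumptions, the gains are between about one and two percentage points in absolute terms, but they correspond to a much larger share of the remaining errors, since the error rates after initial training are already small.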