Abstract
Producing high-quality segmentation masks for medical images is a fundamental challenge in biomedical image analysis. Recent research has explored supervised learning with large volumes of labeled data to improve segmentation across medical imaging modalities, as well as unsupervised learning with unlabeled data to segment without detailed annotations. However, a significant hurdle remains: constructing a model that can segment diverse medical images in a zero-shot manner, without any annotations. In this work, we introduce the attention diffusion zero-shot unsupervised system (ADZUS), a novel method that leverages self-attention diffusion models to segment biomedical images without requiring any prior labels. The method combines the representational strengths of a pre-trained diffusion model with self-attention mechanisms that enable context-aware, detail-sensitive segmentation. Experimental results show that ADZUS outperforms state-of-the-art models on a range of medical imaging datasets, including skin lesion, chest X-ray infection, and white blood cell segmentation, achieving Dice scores ranging from 88.7% to 92.9% and IoU scores from 66.3% to 93.3%. The success of ADZUS in zero-shot settings could lower data-labeling costs and ease adaptation to new medical imaging tasks, enhancing the diagnostic capabilities of AI-based medical imaging technologies.