Abstract
Medical image segmentation (MedISeg) remains a challenging task. Current methods fall into two categories: specialized but poorly generalizable lightweight models, and powerful yet resource-intensive pretrained models such as SAM-Med. Although existing combination methods seek to compensate for these respective weaknesses, the excessive involvement of SAM-Med often limits the performance of the lightweight model during training. To address this issue, we propose MuGu, a novel mutual-guidance learning framework that enables a bidirectional exchange of learning ability between the pretrained model and the lightweight model. Specifically, MuGu introduces two core mechanisms: 1) a Confidence Prompt Guidance (CPG) mechanism that dynamically selects the samples with the lowest inference confidence as SAM-Med's next prompts, thereby guiding SAM-Med's participation in the lightweight model's training; 2) an Ensemble Structure Boundary Guidance (ESBG) mechanism that fuses the predictions of SAM-Med and the lightweight backbone via an Adaptive Combination Attention (ACA) module to produce refined boundary features, which are then incorporated into the optimization of the lightweight backbone. Extensive experiments on four public 2D/3D segmentation datasets demonstrate that MuGu achieves state-of-the-art performance relative to strong baselines.
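The confidence-based sample selection underlying CPG can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `select_lowest_confidence` and the choice of mean per-pixel max-probability as the confidence measure are assumptions for exposition.

```python
import numpy as np

def select_lowest_confidence(prob_maps):
    """Return the index of the sample the lightweight model is least
    confident about, to be used as SAM-Med's next prompt.

    prob_maps: list of arrays of shape (num_classes, H, W), each a
    per-pixel class-probability map (rows of probabilities sum to 1
    over the class axis).
    """
    # Confidence per sample: average over pixels of the maximum class
    # probability (an assumed, commonly used confidence proxy).
    confidences = [np.max(p, axis=0).mean() for p in prob_maps]
    return int(np.argmin(confidences))

# Toy usage: two samples, 2 classes, 4x4 probability maps.
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 2, 4, 4))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
next_prompt_idx = select_lowest_confidence(list(probs))
```

In a training loop, the selected sample would be handed to SAM-Med as its next prompt, so that SAM-Med's guidance concentrates on the cases where the lightweight model struggles most.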