Abstract
Deep learning has driven significant advances in medical image segmentation, a fundamental step in computer-aided diagnosis, yet balancing local detail preservation against global context modeling remains a challenge. This paper proposes CSWin-MDKDNet, a Transformer-based architecture enhanced with a Multi-dimensional Selective Fusion (MDSF) module and a Knowledge Distillation loss (KD-loss). The MDSF module refines multi-scale feature fusion through channel-spatial attention, while the KD-loss mitigates feature redundancy in deep layers. Evaluated on the Synapse (multi-organ CT), ACDC (cardiac MRI), and ISIC2018 datasets, our model achieves state-of-the-art performance, with Dice similarity coefficient (DSC) scores of 81.82% on Synapse, 91.76% on ACDC, and 91.64% on ISIC2018, outperforming existing methods in accuracy.
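
For readers unfamiliar with the reported metric, the DSC between a predicted mask P and a ground-truth mask G is conventionally defined as 2|P ∩ G| / (|P| + |G|). The sketch below illustrates that standard definition for a single binary mask. The function name `dice_coefficient` and the flat 0/1 list representation are assumptions made for illustration, and this is not the evaluation code used in the paper.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient for two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (an illustrative representation, not the paper's data format).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")

    # |P ∩ G|: pixels labeled foreground in both masks
    intersection = sum(p * t for p, t in zip(pred, target))
    # |P| + |G|: total foreground pixels across both masks
    total = sum(pred) + sum(target)

    # Assumption: two empty masks are treated as a perfect match
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# One overlapping pixel, three foreground pixels in total: 2 * 1 / 3
score = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
```

Under this definition a score of 1.0 means the two masks coincide exactly and 0.0 means they share no foreground pixels. The percentages quoted above correspond to this score multiplied by 100.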