Abstract
Accurate prostate zonal segmentation remains challenging due to domain shifts across institutions and the need for precise anatomical delineation. While transformer-based models like the Segment Anything Model (SAM) show promise, they struggle in medical domains, particularly in continual learning settings, where they are prone to catastrophic forgetting. This study introduces SynESAM, a hybrid transformer-CNN segmentation framework that combines SAM’s image encoder with a CNN-based decoder. SynESAM employs synchronous learning strategies, specifically Elastic Weight Consolidation (EWC) and variational EWC (vEWC), to preserve prior knowledge during continual learning. The model was trained and evaluated on two datasets, UAB (in-house) and ProstateX (public), using Dice Similarity Coefficient (DSC) and Mean Surface Distance (MSD) as metrics. SynESAM significantly outperformed 16 benchmark models on both datasets. On UAB, it improved average DSC by 9.84% for the transition zone (TZ) and 18.54% for the peripheral zone (PZ), with worst-case DSC gains of 18.96% (TZ) and 31.53% (PZ). MSD improved by 40.93% (TZ) and 30.02% (PZ). In cross-dataset testing, SynESAM-EWC achieved up to 19.73% DSC improvement and 25.84% MSD reduction for PZ. SynESAM-vEWC showed slightly lower yet consistent gains. Overall, SynESAM provides robust, anatomically accurate segmentation across datasets by integrating transformer generalization, CNN efficiency, and continual learning strategies that mitigate catastrophic forgetting.
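The abstract names DSC as an evaluation metric and EWC as a forgetting-mitigation strategy without giving formulas. As a minimal sketch, assuming the standard textbook definitions rather than the authors' actual implementation, the two quantities can be written as follows. The function names, the dictionary-of-arrays parameter layout, and the weight `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice Similarity Coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: treated here as perfect agreement (a convention, not from the paper).
        return 1.0
    return float(2.0 * intersection / denom)


def ewc_penalty(params, old_params, fisher, lam=1.0):
    """Standard EWC quadratic penalty: (lam / 2) * sum_i F_i * (theta_i - theta_old_i)^2.

    params, old_params, fisher: dicts mapping a parameter name to a NumPy array
    (hypothetical layout). fisher holds diagonal Fisher information estimates
    computed on the earlier task.
    """
    total = 0.0
    for name in params:
        diff = np.asarray(params[name]) - np.asarray(old_params[name])
        total += np.sum(np.asarray(fisher[name]) * diff ** 2)
    return float(0.5 * lam * total)


# Toy usage: a 2x2 mask pair and a two-weight parameter vector.
dsc = dice_coefficient([[1, 1], [0, 0]], [[1, 0], [0, 0]])
penalty = ewc_penalty(
    {"w": np.array([1.0, 2.0])},
    {"w": np.array([0.0, 2.0])},
    {"w": np.array([2.0, 5.0])},
    lam=4.0,
)
```

In a continual learning loop, a penalty of this form would typically be added to the segmentation loss on the new dataset, so that weights with high Fisher information for the earlier dataset are discouraged from drifting. How SynESAM weights or schedules this term is not stated in the abstract.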