Abstract
Deep learning has recently made remarkable progress in remote sensing image segmentation, and hybrid architectures that integrate convolutional neural networks (CNNs) and Transformers have emerged as a promising solution, particularly for high-resolution imagery. However, challenges remain in complex remote sensing scenes, especially in capturing detailed boundary structures and small-scale targets. One key limitation is suboptimal cross-level feature fusion within the encoder, which causes semantic misalignment and hinders the precise segmentation of small objects and fine structural details. In addition, the decoding stage lacks explicit boundary guidance, so edge information is frequently lost during feature reconstruction, compromising the delineation of object contours in intricate environments. To address these issues, we propose a novel hybrid architecture named Boundary-Guided Semantic Compensation Network (BGSC-Net). Our framework integrates two key components: a Cross-Level Semantic Compensation Module (CLSCM), which dynamically fuses high-level semantics with low-level spatial details to improve small-object segmentation, and an Auxiliary Boundary Supervision Module (ABSM), which strengthens structural modeling of blurry or complex boundaries through explicit boundary modeling and an auxiliary supervision strategy that jointly optimizes the edge and main segmentation branches. Experiments show that BGSC-Net achieves superior segmentation performance, with mIoU scores of 87.57% on Potsdam, 85.61% on Vaihingen, 55.05% on LoveDA, and 74.77% on UAVid. To further validate its generalization capability in specialized fine-grained segmentation tasks, we evaluate the model on our challenging self-constructed Mangrove Species Fine-grained Segmentation Dataset (MSFSD), where it achieves an mIoU of 89.58%, confirming its practical utility for precise mangrove species mapping.
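For readers unfamiliar with the reported metric, mean intersection over union (mIoU) averages, over classes, the ratio of correctly labeled pixels to the union of predicted and ground-truth pixels for that class. The following is a minimal sketch of that standard definition only; the function name `mean_iou` and the flat label lists are illustrative assumptions, and the exact evaluation protocol behind the scores above (for example, which classes are ignored on each benchmark) is not specified in the abstract.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the ground truth.

    pred, target: flat sequences of integer class labels of equal length
    (hypothetical inputs; real pipelines usually accumulate over whole datasets).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    inter = [0] * num_classes         # pixels where prediction and label agree
    pred_count = [0] * num_classes    # pixels predicted as each class
    target_count = [0] * num_classes  # pixels labeled as each class

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            inter[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - inter[c]
        if union > 0:  # skip classes absent from both prediction and labels
            ious.append(inter[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 0]
    # Per-class IoU: 1/3, 2/3, 1/2
    print(mean_iou(pred, target, num_classes=3))
```

Skipping classes with an empty union is one common convention; other implementations count such classes as zero or one, which changes the average.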