Abstract
As the spatial resolution of remote sensing imagery continues to improve, the complexity of the information it contains also increases. Remote sensing images typically feature wide imaging ranges, dispersed distributions of similar land objects, complex boundary shapes, and dense small targets, all of which pose severe challenges for semantic segmentation. To address these challenges, we propose a channel reconstruction and dual attention dynamic fusion network (CRDFNet), a semantic segmentation network for remote sensing images that effectively integrates global and local context. To better handle complex boundary shapes, we design a channel feature aggregation module (CFAM), which extracts spatially redundant information during feature fusion and enhances high-resolution detail features. Through a channel reconstruction block, CFAM promotes the alignment of fine-grained information from the encoder with high-level semantic information from the decoder, effectively aggregating the multi-scale features extracted by the encoder and improving segmentation accuracy. To improve the segmentation of small targets, we further propose a dual attention feature refinement module (DAFRM), which fuses the shallow spatial features of the encoder with the deep semantic features of the decoder through a dynamic fusion mechanism guided by dual attention. Experimental results on the Potsdam, Vaihingen, UAVid, and MSIDBG datasets show that CRDFNet outperforms existing methods in terms of F1 score, overall accuracy (OA), and mean Intersection over Union (mIoU), validating its effectiveness.
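For readers less familiar with the three reported metrics, the following is a minimal sketch of how OA, mean F1, and mIoU are commonly computed from a pixel-level confusion matrix. It uses the standard textbook definitions and is not taken from the CRDFNet evaluation code. The function name `segmentation_metrics`, the example matrix, and the convention of scoring an empty class as 0 are all assumptions made for illustration; the evaluation protocol of a given benchmark may differ (for example, in which classes are included).

```python
def segmentation_metrics(conf):
    """Compute OA, mean F1, and mIoU from a confusion matrix.

    conf[i][j] is the number of pixels whose true class is i and whose
    predicted class is j (hypothetical layout chosen for this sketch).
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    f1_scores, ious = [], []
    for c in range(n):
        tp = conf[c][c]
        fp = sum(conf[r][c] for r in range(n)) - tp  # predicted c, true class differs
        fn = sum(conf[c]) - tp                       # true c, predicted class differs
        f1_den = 2 * tp + fp + fn
        iou_den = tp + fp + fn
        # Assumption: a class with no true or predicted pixels scores 0.
        f1_scores.append(2 * tp / f1_den if f1_den else 0.0)
        ious.append(tp / iou_den if iou_den else 0.0)
    # Overall accuracy: correctly classified pixels over all pixels.
    oa = sum(conf[c][c] for c in range(n)) / total if total else 0.0
    return {
        "OA": oa,
        "mean_F1": sum(f1_scores) / n,
        "mIoU": sum(ious) / n,
    }


if __name__ == "__main__":
    # Toy two-class example with made-up pixel counts.
    example = [[50, 10],
               [5, 35]]
    print(segmentation_metrics(example))
```

OA rewards correct pixels regardless of class, so it can be dominated by large classes, whereas mean F1 and mIoU average per-class scores and are therefore more sensitive to performance on small or rare targets.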