Abstract
Accurate road extraction from remote sensing images is crucial for autonomous driving, urban planning, and route planning. However, existing methods struggle with scale variation, occlusion, and blurred road boundaries. To tackle these challenges, this paper proposes a heterogeneous dual-decoder network (HDDNet), which addresses them simultaneously through two functionally complementary decoders. Specifically, the main decoder incorporates a Dynamic Snake Grouping Dilation (DSGD) module, which combines road morphological features with grouped multi-scale receptive fields to better capture narrow and multi-scale roads. The auxiliary decoder integrates a Multi-directional Connectivity and Boundary Enhancement (MCBE) module, which jointly optimizes road connectivity and boundary refinement by exploiting the directional consistency between the road body and its edges. Finally, a Dual Attention Feature Fusion (DAFF) module interactively learns and fuses the outputs of the two decoders along both the spatial and channel dimensions, improving the accuracy and robustness of the feature representations. Systematic experiments on three representative public datasets (DeepGlobe, Ottawa, and CHN6-CUG) show that the proposed method significantly outperforms current mainstream approaches on the road extraction task, achieving Intersection over Union (IoU) scores of 71.36%, 91.85%, and 67.27%, respectively, validating the effectiveness and robustness of HDDNet across diverse road scenarios.