Integrating simplified Swin-T with modified EFS-Net for attention-guided underwater pipeline segmentation in complex underwater environments

Abstract

Accurate segmentation of underwater pipelines is essential for marine infrastructure inspection. However, deep learning models often struggle under extreme underwater conditions such as low light, sea snow, and sea fog, leading to poor generalization on unseen data. Existing approaches typically optimize for either accuracy or computational efficiency, leaving the trade-off between the two unresolved. This paper introduces a novel hybrid architecture, the Swin Transformer-EFSNet fusion network, which delivers state-of-the-art accuracy with significantly reduced computational complexity and strong generalization capability. The model employs a dual-encoder design: a lightweight Swin Transformer branch captures long-range contextual relationships, while a modified EFSNet branch is optimized for efficient local feature extraction. Their outputs are dynamically integrated by a three-head cross-attention fusion module that prioritizes salient spatial and contextual information before the final segmentation mask is decoded. We also present the HOMOMO dataset, a new benchmark containing images with challenging conditions such as low light, fog, sea snow, and complex occlusions (e.g., pipelines buried under sand or covered by vegetation). Extensive experiments on HOMOMO and two public datasets demonstrate that our method outperforms strong baselines, including UNet, SwinUNet, TransUNet, Mask2Former, YOLOv5, YOLOv11, and YOLOv12. On HOMOMO, our model achieves an mIoU of 98.44% and an F-boundary score of 82.01%, surpassing the best-performing baseline by 8.43% and 5.34%, respectively. Crucially, the proposed model generalizes well to unseen data, demonstrating robustness against domain shifts. By effectively balancing global and local processing, our hybrid design achieves high accuracy without imposing heavy computational costs. These results establish a new paradigm for efficient and reliable visual perception in underwater environments, paving the way for practical autonomous inspection systems.
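
To make the dual-encoder fusion concrete, the PyTorch sketch below shows one plausible way to wire a three-head cross-attention module that merges two same-resolution feature maps. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `CrossAttentionFusion`, the 96-channel embedding, the matching spatial resolutions of the two branches, and the choice of the Swin (global) features as queries and the EFSNet (local) features as keys/values are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    """Hypothetical three-head cross-attention fusion of two encoder streams.

    Queries come from the global-context (Swin) branch; keys and values come
    from the local-detail (EFSNet) branch, so contextual features attend to
    local structure. Names and dimensions are illustrative only.
    """

    def __init__(self, dim: int = 96, num_heads: int = 3):
        super().__init__()
        # embed_dim must be divisible by num_heads (96 / 3 = 32 per head)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, swin_feat: torch.Tensor, efs_feat: torch.Tensor) -> torch.Tensor:
        # Both inputs: (B, C, H, W) feature maps at the same resolution (assumed).
        b, c, h, w = swin_feat.shape
        q = swin_feat.flatten(2).transpose(1, 2)   # (B, H*W, C) query tokens
        kv = efs_feat.flatten(2).transpose(1, 2)   # (B, H*W, C) key/value tokens
        fused, _ = self.attn(q, kv, kv)            # cross-attention across branches
        fused = self.norm(fused + q)               # residual + norm, standard Transformer style
        return fused.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    fusion = CrossAttentionFusion(dim=96, num_heads=3)
    swin = torch.randn(1, 96, 32, 32)   # global-context branch output (assumed shape)
    efs = torch.randn(1, 96, 32, 32)    # local-detail branch output (assumed shape)
    print(fusion(swin, efs).shape)      # torch.Size([1, 96, 32, 32])
```

Using the global branch as the query side means every spatial location weighs local EFSNet features by their relevance to the scene-level context, which is one way the "attention-guided" prioritization of salient spatial and contextual information described in the abstract could be realized; the fused map would then feed the segmentation decoder.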
