Multi-scale parallel gated local feature transformer


Abstract

Visual Simultaneous Localization and Mapping (VSLAM) is a crucial technology for autonomous mobile vision robots. However, existing methods often suffer from low localization accuracy and poor robustness in scenarios with significant scale variations and low-texture environments, primarily due to insufficient feature extraction and reduced matching precision. To address these challenges, this paper proposes an improved multi-scale local feature matching algorithm based on LoFTR, named MSpGLoFTR. First, we introduce a Multi-Scale Local Attention Module (MSLAM), which achieves feature fusion and resolution alignment through multi-scale window partitioning and a shared multi-layer perceptron (MLP). Second, a Multi-Scale Parallel Attention Module is designed to capture features across various scales, enhancing the model's adaptability to large-scale features and highly similar pixel regions. Finally, a Gated Convolutional Network (GCN) mechanism is incorporated to dynamically adjust weights, emphasizing key features while suppressing background noise, thereby further improving matching precision and robustness. Experimental results demonstrate that MSpGLoFTR outperforms LoFTR in terms of matching precision, relative pose estimation performance, and adaptability to complex scenarios. Notably, it excels in environments with significant illumination changes, scale variations, and viewpoint shifts. This makes MSpGLoFTR an efficient and robust feature matching solution for complex vision tasks.
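The gating idea described above — dynamically weighting features so that salient responses are emphasized and background noise is suppressed — can be illustrated with a minimal element-wise gating sketch. This is not the paper's actual GCN module (which uses learned convolutions inside the matching network); the function names and toy values below are hypothetical, chosen only to show the mechanism.

```python
import numpy as np

def sigmoid(x):
    # squash logits into [0, 1] to form a soft gate
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(features, gate_logits):
    """Element-wise gating: a sigmoid gate scales each feature location,
    passing high-confidence responses and attenuating the rest."""
    gate = sigmoid(gate_logits)
    return gate * features

# Toy 4x4 single-channel feature map, all ones for clarity.
feats = np.ones((4, 4))

# Hypothetical gate logits: strongly positive on the top half
# (salient region), strongly negative on the bottom (background).
logits = np.vstack([np.full((2, 4), 5.0), np.full((2, 4), -5.0)])

out = gated_fusion(feats, logits)
# Top rows pass through nearly unchanged; bottom rows are suppressed.
```

In the full model the gate logits would be produced by a learned convolution over the feature map rather than fixed constants, so the emphasis adapts to the input scene.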
