Abstract
With the rapid advancement of unmanned aerial vehicles (UAVs), image matching algorithms for visual navigation have seen increasingly wide use. However, extensive research demonstrates that existing algorithms suffer from insufficient matching accuracy and long response times in challenging aerial scenes, and therefore cannot meet the accuracy requirements of UAV visual navigation. In this paper, we propose an efficient image matching method for UAV visual navigation, named DALGlue, which combines a convolutional neural network feature extraction algorithm with a feature matching network based on a linear attention mechanism. DALGlue preprocesses the collected aerial images with the dual-tree complex wavelet transform, which enhances structural information and fine details. Compared with processing raw images directly, the dual-tree complex wavelet transform module alleviates the edge blurring caused by UAV dynamic flight. Then, an adaptive spatial feature fusion module is developed to extract image features and compute feature points and descriptors. In addition, we employ a linear attention mechanism to aggregate image features, which effectively reduces computational cost while improving the network's representational quality. Finally, the Sinkhorn algorithm is used to compute the assignment matrix and output the optimal assignment. DALGlue strikes a unique balance between accuracy and real-time performance and can operate under strict computational and memory constraints. Compared with the state-of-the-art method LightGlue, the experimental results show that DALGlue obtains an 11.8 percentage-point improvement in mean matching accuracy (MMA). On the MegaDepth-1500 benchmark, DALGlue achieves AUC@5°/10°/20° values of 57.01, 73.00, and 84.11 respectively, effectively improving matching precision.