Abstract
To improve the detection of low-slow-small UAV targets against complex backgrounds, this paper introduces a method that combines spatial and temporal information. It comprises (1) an improved YOLO detector for small UAVs, (2) a moving-target detection module, and (3) a strategy that fuses static detection with dynamic judgment. We first present an improved YOLO11 static detector that combines SPD-Conv, BiFPN, and a detection head for the high-resolution layers. We then design a moving-target detection algorithm that helps the YOLO detector capture subtle motion features, and finally introduce a strategy for fusing static detection with dynamic judgment. Experiments on small-UAV datasets covering a variety of sky, mountain, and building backgrounds show that the proposed approach, MSM-YOLO, improves Precision, Recall, and mAP50 by 12.1%, 29.5%, and 29.6%, respectively, over the baseline YOLO11 detector. MSM-YOLO reaches a Precision, Recall, and mAP50 of 94%, 92%, and 86.3%, respectively, enabling effective detection of small UAV targets in complex scenes. Ablation experiments confirm the contribution of each module. The method was further deployed on a redesigned RK3588 embedded system, where it reaches 100 fps after optimization, demonstrating its effectiveness and practicality for air-to-air UAV detection applications.
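As a minimal illustration of why SPD-Conv suits small targets, the sketch below shows only the space-to-depth rearrangement that SPD-Conv applies before a non-strided convolution: the feature map is downsampled by moving pixels into extra channels rather than discarding them. The function name `space_to_depth`, the nested-list layout, and the channel ordering are assumptions made for this example; the sketch omits the convolution and is not the implementation used in MSM-YOLO.

```python
from typing import List

Grid = List[List[float]]  # one channel: H rows of W values


def space_to_depth(channels: List[Grid], scale: int = 2) -> List[Grid]:
    """Rearrange C channels of size HxW into C*scale^2 channels of size (H/scale)x(W/scale).

    Every input value appears exactly once in the output, so no pixel is lost,
    unlike a strided convolution or pooling layer.
    """
    height, width = len(channels[0]), len(channels[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")

    out: List[Grid] = []
    # For each offset (dy, dx) inside a scale x scale block, take the
    # sub-sampled grid that starts at that offset and emit it as a new channel.
    for dy in range(scale):
        for dx in range(scale):
            for channel in channels:
                out.append([row[dx::scale] for row in channel[dy::scale]])
    return out


# Hypothetical 1-channel, 4x4 feature map used only for demonstration.
feature_map = [[[0, 1, 2, 3],
                [4, 5, 6, 7],
                [8, 9, 10, 11],
                [12, 13, 14, 15]]]
rearranged = space_to_depth(feature_map, scale=2)
print(len(rearranged), len(rearranged[0]), len(rearranged[0][0]))  # 4 2 2
```

With `scale=2`, a target that occupies only a few pixels keeps all of them across the four output channels, whereas a stride-2 convolution would sample only part of that neighbourhood.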