Pedestrian detection in aerial image based on convolutional neural network with attention mechanism and multi-scale prediction

Abstract

Pedestrian object detection is crucial in intelligent systems such as traffic management and surveillance. Traditional machine learning methods have shown drawbacks, including low accuracy and slow processing. Convolutional Neural Network (CNN)-based algorithms have achieved notable progress, but mainstream CNNs still struggle with slow speed and low accuracy, particularly for small and occluded targets from aerial perspectives. In this paper, we propose a Multi-Scale Attention YOLO (MSA-YOLO) algorithm to address these issues. MSA-YOLO incorporates a Squeeze, Excitation, and Cross Stage Partial (SECSP) channel attention module to extract richer pedestrian features with minimal additional parameters. A multi-scale prediction module is also introduced to capture information across different scales, improving small object detection and reducing missed detections. To evaluate our approach, we manually collect and annotate the Aerial Pedestrian dataset (AP dataset), which, to our knowledge, provides more annotations, varied scenes, and diverse view angles than comparable existing datasets. The high-resolution images in the AP dataset allow for capturing more detailed pedestrian features, which can enhance model performance. Experimental results show that, on the AP dataset, MSA-YOLO demonstrates clear advantages over several widely used object detection and pedestrian detection models developed in recent years, indicating its potential dual benefits in terms of accuracy and efficiency.
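The abstract does not spell out the internals of SECSP, so the following is only a minimal sketch of the generic squeeze-and-excitation step that channel attention modules of this kind build on, not the authors' module. The function name `se_channel_attention`, the weight matrices `w1` and `w2`, and the plain nested-list tensor layout are all assumptions made for illustration; the Cross Stage Partial part and the multi-scale prediction heads are not modeled here.

```python
import math


def se_channel_attention(feature_map, w1, w2):
    """Generic squeeze-and-excitation over a C x H x W feature map.

    feature_map: list of C channels, each a list of H rows of W floats.
    w1: hypothetical bottleneck weights, shape (C_reduced x C).
    w2: hypothetical expansion weights, shape (C x C_reduced).
    Returns (rescaled feature map, per-channel gates).
    """
    # Squeeze: global average pooling reduces each channel to one scalar.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(weights, squeezed)))
        for weights in w1
    ]

    # Excitation, step 2: expansion layer followed by a sigmoid gate per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(weights, hidden))))
        for weights in w2
    ]

    # Scale: reweight every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Illustrative call on a made-up 2-channel, 2x2 feature map.
example_map = [[[1.0, 3.0], [5.0, 7.0]], [[2.0, 2.0], [2.0, 2.0]]]
example_scaled, example_gates = se_channel_attention(
    example_map, w1=[[1.0, 1.0]], w2=[[1.0], [-1.0]]
)
```

The two small weight matrices are the only learned parameters in this step, which is consistent with the abstract's point that channel attention adds few parameters: the cost grows with the number of channels, not with the spatial resolution of the image.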
