Multi-dimensional attention transformer for vehicle and pedestrian detection in adverse weather

Abstract

Real-time object detection in adverse weather and low-light conditions is crucial for applications such as autonomous driving and intelligent surveillance. This paper presents MDAT-YOLO, a novel object detection framework designed to balance accuracy and efficiency in challenging environments. The model integrates multi-dimensional attention mechanisms and transformer-based enhancements to strengthen feature extraction and adaptability. It introduces two core modules: DWConv_O, an optimized depthwise separable convolution layer, and ODConv++, an omni-dimensional dynamic convolution module that enhances spatial, channel, and kernel-level interactions for improved feature selectivity and dynamic response. A lightweight C3 Transformer (C3TR) block further reduces computational overhead while maintaining strong representational capacity. MDAT-YOLO is evaluated on four benchmark datasets: RTTS, VOC-Foggy, ExDark, and a custom foggy VOC-PASCAL subset, achieving accuracy improvements of 70.50%, 65.14%, 77.40%, and 49.00%, respectively. The model sustains real-time speeds of up to 145 FPS, demonstrating robustness and practicality for real-world deployment under diverse environmental conditions.
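To make the efficiency argument behind a depthwise separable layer concrete, the following minimal sketch compares the weight count of a standard convolution with that of a generic depthwise separable convolution (a per-channel k x k depthwise step followed by a 1 x 1 pointwise step). It is an illustration under stated assumptions, not the paper's DWConv_O: the abstract does not describe that module's internals, and the function names and layer sizes below are hypothetical.

```python
def standard_conv_params(c_in, c_out, k, bias=False):
    """Weight count of a standard k x k convolution (c_in -> c_out channels)."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, k, bias=False):
    """Weight count of a generic depthwise separable convolution.

    Depthwise step: one k x k filter per input channel.
    Pointwise step: a 1 x 1 convolution mixing c_in channels into c_out.
    """
    depthwise = c_in * k * k + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes chosen only for illustration.
    c_in, c_out, k = 128, 256, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")
```

For these illustrative sizes the separable form needs roughly one ninth of the weights, which is the general reason such layers are attractive in real-time detectors; the actual savings in MDAT-YOLO depend on details not given in the abstract.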
