Abstract
Autonomous driving systems rely on the accurate detection of distant, small traffic signs for safe and efficient navigation. Existing detection algorithms face three main challenges: they capture the subtle visual features of small targets poorly, they are degraded by complex background clutter, and they must achieve real-time inference with computationally lightweight models. To address these challenges, we propose YOLO-AML, which reduces computational complexity through parameter-free spatial transformations and low-channel convolution operations while preserving the fine-grained features of small objects. The proposed Normalization-based Attention with sigmoid and tanh (NAST) module employs a hybrid gating mechanism to regulate the distribution of attention weights, suppressing background noise without introducing additional convolutional overhead. The C2PSA-LSKA (CLSKA) module, integrated into the backbone network, enlarges the receptive field while keeping the parameter count low, which mitigates the problem of traffic signs being obscured by background clutter. In addition, a Normalized Wasserstein Distance (NWD) loss function is introduced to alleviate the vanishing gradients commonly encountered with extremely small objects. Experimental results indicate that the optimized model reduces the number of parameters by 17% and the computational complexity by 16.8%, reaches a detection speed of 72.2 FPS, and improves detection accuracy by 2.0%. Grad-CAM heatmap visualizations further confirm the model's enhanced feature discriminability and robustness against background interference. Overall, YOLO-AML demonstrates significant improvements in detection performance under complex real-world driving scenarios.
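
As a rough illustration of why a Wasserstein-based loss can help with extremely small boxes, the sketch below follows the commonly cited NWD formulation, in which each bounding box is modeled as a 2D Gaussian. It is not necessarily the exact loss used in YOLO-AML. The function names, the (cx, cy, w, h) box layout, and the value of the normalizing constant `c` are assumptions made for illustration only; `c` is dataset dependent.

```python
import math

def iou(a, b):
    """Plain IoU of two boxes given as (cx, cy, w, h)."""
    ax1, ay1 = a[0] - a[2] / 2, a[1] - a[3] / 2
    ax2, ay2 = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1 = b[0] - b[2] / 2, b[1] - b[3] / 2
    bx2, by2 = b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def nwd(a, b, c=12.8):
    """Normalized Wasserstein Distance between two (cx, cy, w, h) boxes.

    Each box is treated as a 2D Gaussian with mean (cx, cy) and standard
    deviations (w/2, h/2). `c` is an assumed, dataset-dependent constant.
    """
    w2_squared = (
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2) ** 2
        + ((a[3] - b[3]) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)

def nwd_loss(a, b, c=12.8):
    """Loss form: 0 for identical boxes, approaching 1 as boxes diverge."""
    return 1.0 - nwd(a, b, c)

# Two tiny 4x4 boxes that do not overlap (hypothetical values).
pred_near = (10.0, 10.0, 4.0, 4.0)
pred_far = (4.0, 10.0, 4.0, 4.0)
target = (16.0, 10.0, 4.0, 4.0)

print("IoU near:", iou(pred_near, target), "IoU far:", iou(pred_far, target))
print("NWD near:", nwd(pred_near, target), "NWD far:", nwd(pred_far, target))
```

In this example both predictions have an IoU of zero with the target, so an IoU-only loss cannot distinguish the nearer box from the farther one, whereas the NWD value still varies smoothly with the center distance.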