Abstract
Visually impaired individuals face significant challenges in navigating safely and independently, particularly under adverse weather conditions such as fog. To address this issue, we propose YOLO-Extreme, an enhanced object detection framework based on YOLOv12 and designed for robust navigation assistance in foggy environments. The architecture incorporates three novel modules: a Dual-Branch Bottleneck Block (DBB) that captures both local spatial and global semantic features; a Multi-Dimensional Collaborative Attention Module (MCAM) that models joint spatial-channel attention to enhance salient obstacle features and suppress background interference under fog; and a Channel-Selective Fusion Block (CSFB) that performs robust multi-scale feature integration. Comprehensive experiments on the Real-world Task-driven Traffic Scene (RTTS) foggy dataset show that YOLO-Extreme achieves state-of-the-art detection accuracy while maintaining high inference speed, outperforming both existing dehaze-then-detect pipelines and mainstream object detectors. To verify the generalization capability of the proposed framework, we further conduct cross-dataset experiments on Foggy Cityscapes, where YOLO-Extreme consistently delivers superior detection performance across diverse foggy urban scenes. The proposed framework improves the reliability and safety of assistive navigation for visually impaired individuals under challenging weather conditions, offering practical value for real-world deployment.