Abstract
Effective insect detection is crucial for sustainable cotton production, yet traditional monitoring methods remain labor-intensive, inefficient, and environmentally detrimental. This study introduces Enhanced YOLO12, a novel deep learning architecture for real-time cotton insect detection. Building on the YOLO12 framework, the proposed model integrates an optimized Spatial Pyramid Pooling (SPP) module and attention-based feature extraction to improve detection accuracy while maintaining computational efficiency. To ensure robustness, we developed and evaluated multiple baseline models (standard YOLO11 and YOLO12) and custom architectures (YOLO12_Fusion, YOLO11-BRA-Net, YOLO11_CBAM, and Enhanced Hybrid YOLO12). In our experiments, Enhanced Hybrid YOLO12 achieved the best performance, with a precision of 0.942, a recall of 0.876, an mAP50 of 0.945, and an mAP50-95 of 0.735. It outperforms the standard YOLO12 (0.925, 0.848, 0.913, and 0.662, respectively) on all four metrics. These results indicate that Enhanced Hybrid YOLO12 can be considered a state-of-the-art framework for precision agriculture, combining high detection accuracy with real-time capability. They therefore support the use of this deep learning model in pest management applications.