Abstract
Crop leaves in natural growth environments are often imaged against complex backgrounds of soil, weeds, and neighboring plants under variable lighting conditions, while leaf spots are small and crop diseases span a wide variety of scales. To address these challenges, this paper proposes BGM-YOLO, a new model structure aimed at improving both detection accuracy and inference speed. First, the GSBottleneck module is used to enhance the C2f module of the YOLOv8n model, yielding the GSC2f module, which reduces computational cost and increases inference efficiency. Next, a multiscale bitemporal fusion module (BFM) is incorporated to improve the effectiveness and robustness of feature fusion across different levels. Finally, a median-enhanced spatial and channel attention block (MECS) is developed that combines channel and spatial attention mechanisms, effectively improving the capture and fusion of small-scale features. Experimental results show that BGM-YOLO improves the mean average precision (mAP) by 3.9% over the original model. In crop disease detection tasks, BGM-YOLO achieves higher detection accuracy and a lower false negative rate, confirming its practical value in complex application scenarios.