Abstract
As automation advances in modern agriculture, demand is growing for intelligent post-harvest sorting of fruits and vegetables. Tomato is a significant global agricultural product, and detecting and sorting defective fruit is essential to ensure quality and improve economic value. However, the traditional detection method, manual screening, is inefficient and labor-intensive. Therefore, a defect detection model named YOLO-RGDD is proposed based on YOLOv12s to identify five types of tomato surface defects (scars, gaps, white spots, spoilage, and dents). First, the original C3k2 and A2C2f modules of YOLOv12 were replaced with RFEM in the backbone network to enhance feature extraction for small targets without increasing computational complexity. Second, a Dysample-Slim-Neck was developed for YOLO-RGDD to reduce computational complexity and enhance the detection of minor defects. Finally, dynamic convolution replaced the conventional convolution in the detection head to reduce the model's parameter count. Experimental results show that the average precision, recall, and F1-score of the proposed YOLO-RGDD model for tomato defect detection reach 88.5%, 85.7%, and 87.0%, respectively, surpassing advanced object detection algorithms. Additionally, the computational complexity of YOLO-RGDD is 16.1 GFLOPs, which is 24.8% lower than that of the original YOLOv12s model (21.4 GFLOPs), facilitating the model's deployment in automated agricultural production.
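
The headline figures can be checked with simple arithmetic. The sketch below is illustrative and not part of the proposed method: the helper names `relative_reduction` and `f1_score` are hypothetical, and the inputs are the rounded values quoted in the abstract. It recomputes the relative GFLOPs reduction and the F1-score as the harmonic mean of precision and recall.

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Percentage by which `new` is lower than `baseline`."""
    return (baseline - new) / baseline * 100.0


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions)."""
    return 2.0 * precision * recall / (precision + recall)


# Values quoted in the abstract (rounded as published).
reduction = relative_reduction(baseline=21.4, new=16.1)  # GFLOPs
f1 = f1_score(precision=0.885, recall=0.857)

print(f"GFLOPs reduction: {reduction:.1f}%")
print(f"F1 from rounded precision/recall: {f1 * 100:.1f}%")
```

The reduction comes out at about 24.8%, matching the stated figure. The F1 recomputed from the rounded precision and recall is roughly 87.1%, within rounding distance of the reported 87.0%; the small gap is presumably due to the reported score being computed from unrounded or per-class values, which is an assumption here rather than something the abstract states.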