Abstract
Harvesting robots in complex orchard environments often fail to recognize and localize fruit because branches and leaves severely occlude it. To address this problem, this paper proposes an occlusion avoidance method that combines active perception with a lightweight variant of YOLOv8n, the detector developed by Ultralytics in the United States. First, to meet the stringent real-time requirements of the active perception system, a lightweight YOLOv8n model was developed. It reduces computational redundancy by incorporating the C2f-FasterBlock module and strengthens key feature representation by integrating the SE (squeeze-and-excitation) attention mechanism, significantly improving inference speed while maintaining high detection accuracy. Second, an end-to-end active perception model based on ResNet50 and multi-modal fusion was designed. Given the current observation image, this model predicts the optimal movement direction for the robotic arm so that the arm actively avoids occlusions and obtains a more complete field of view. The model was trained on a matrix dataset constructed through the robot's dynamic exploration in real-world scenarios, yielding a direct mapping from visual perception to motion planning. Experimental results show that the proposed lightweight YOLOv8n model achieves a mAP of 0.885 on the apple detection task and a frame rate of 83 FPS, with the parameter count reduced to 1,983,068 and the model weight file reduced to 4.3 MB, significantly outperforming the baseline model. In the active perception experiments, the proposed method effectively guided the robotic arm to quickly find observation positions with minimal occlusion, substantially improving the success rate of target recognition and the overall operational efficiency of the system. These results provide preliminary technical validation and a feasible exploratory pathway for developing agricultural harvesting robot systems suited to complex real-world environments.
Note that this study was validated primarily in controlled environments. Future work requires large-scale testing in diverse real-world orchards, together with further system optimization and performance evaluation under more realistic conditions, including natural lighting variation, complex weather, and actual occlusion patterns.
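
For readers unfamiliar with the SE attention mechanism mentioned above, the following is a minimal sketch of a generic squeeze-and-excitation step in plain Python. It is not the implementation used in this work: the function name `se_block`, the bias-free weight matrices `w1` and `w2`, and the toy values in the usage lines are all illustrative assumptions.

```python
import math


def se_block(feature_map, w1, w2):
    """Generic squeeze-and-excitation on a C x H x W nested list.

    w1 has shape (C/r) x C and w2 has shape C x (C/r); both are
    hypothetical, bias-free weights chosen for illustration only.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: reweight every value in a channel by that channel's gate.
    scaled = [
        [[v * g for v in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy usage: two 2 x 2 channels, reduced to one hidden unit.
fmap = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
scaled, gates = se_block(fmap, w1=[[0.5, 0.5]], w2=[[0.0], [1.0]])
```

In this toy case the second channel receives the larger gate and is therefore emphasized relative to the first, which is the channel reweighting behavior that SE attention is meant to provide.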