Abstract
In modern smart dairy farms, precise feed management and accurate monitoring of dairy cows' feeding behavior are crucial for improving production efficiency and reducing feeding costs. In practice, however, complex environmental factors such as varying illumination, frequent occlusion, and densely packed targets pose significant challenges to real-time visual perception. To address these issues, this paper proposes BFDet-YOLO, a lightweight multi-target detection model for the joint detection of dairy cows' feeding behavior and feed density levels in pasture environments. Built on the YOLOv10 framework, the model incorporates four targeted improvements: (1) a bidirectional feature fusion network (BiFPN) to strengthen the otherwise insufficient multi-scale feature interaction between dairy cows (large targets) and feed particles (small targets); (2) a lightweight downsampling module (Adown) to preserve fine-grained features of feed particles and reduce the risk of missed detections of small targets; (3) an attention-enhanced detection head (SEAM) to mitigate occlusion caused by overlapping cows and accumulated feed; and (4) an improved bounding box regression loss function (DIoU) to improve the localization accuracy of non-overlapping small targets. Additionally, this paper constructs a pasture-specific dataset integrating dairy cows' feeding behavior and feed distribution information, which is annotated and expanded by combining public datasets with on-site monitoring data. Experimental results demonstrate that BFDet-YOLO outperforms the original YOLOv10 and other mainstream object detection models in detection accuracy and robustness while maintaining a substantially smaller model size. On the constructed dataset, the model achieves 95.7% mAP@0.5 and 70.7% mAP@0.5:0.95 with only 1.85 M parameters.
These results validate the effectiveness and deployability of the proposed method, providing a reliable visual perception solution for intelligent feeding systems and smart pasture management.
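
As a brief illustration of improvement (4), the following is a minimal sketch of the standard DIoU loss for a single pair of axis-aligned boxes. It is not the paper's implementation: the function name `diou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for this example. The sketch shows why DIoU helps with non-overlapping boxes: even when IoU is zero, the normalized center-distance term still changes as the predicted box moves toward the ground truth.

```python
def diou_loss(box_pred, box_gt, eps=1e-9):
    """DIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative sketch of the standard definition:
        loss = 1 - IoU + rho^2 / c^2
    where rho is the distance between box centers and c is the diagonal
    of the smallest box enclosing both inputs.
    """
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h

    # Union area and IoU.
    area_pred = (px2 - px1) * (py2 - py1)
    area_gt = (gx2 - gx1) * (gy2 - gy1)
    iou = inter / (area_pred + area_gt - inter + eps)

    # Squared distance between the two box centers.
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2

    # Squared diagonal of the smallest enclosing box.
    enc_w = max(px2, gx2) - min(px1, gx1)
    enc_h = max(py2, gy2) - min(py1, gy1)
    c2 = enc_w ** 2 + enc_h ** 2

    return 1.0 - iou + rho2 / (c2 + eps)


# Two disjoint predictions have the same IoU (zero) against the target,
# but the one whose center is farther away receives the larger loss.
target = (0.0, 0.0, 1.0, 1.0)
near = diou_loss((2.0, 0.0, 3.0, 1.0), target)
far = diou_loss((5.0, 0.0, 6.0, 1.0), target)
```

With a plain IoU loss both disjoint predictions would score identically, giving no signal about which one is closer; the distance penalty is what separates them here.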