Abstract
Under large-scale farming conditions, automated sow estrus detection is crucial for improving reproductive efficiency, optimizing breeding management, and reducing labor costs. Conventional estrus detection relies heavily on human expertise, which introduces subjective variability and reduces both accuracy and efficiency. Failure to identify estrus promptly and pair animals effectively lowers breeding success rates and raises overall husbandry costs. To meet the need for automated detection of sow estrus in large-scale pig farms, this study proposes a method for detecting vulvar status and estrus in sows based on multi-dimensional feature crossing. The method adopts a dual optimization strategy. First, the Bi-directional Feature Pyramid Network-Selective Decoding Integration (BiFPN-SDI) module performs bidirectional, weighted fusion of the backbone's low-level texture features and high-level semantic features, retaining the multi-dimensional cues most relevant to vulvar morphology and producing a scale-aligned, minimally redundant feature map. Second, the Spatially Enhanced Attention Module head (SEAM-Head), a channel attention mechanism embedded in the detection head, amplifies key hyperemia-related signals while suppressing background noise, so that the two components work together to localize bounding boxes more precisely. To adapt the model to edge computing environments, Masked Generative Distillation (MGD) is used to compress the model while maintaining detection speed and accuracy. Finally, two features are extracted from the bounding box of the vulvar region: the aspect ratio of the target area and the red saturation obtained with a dual-threshold method in the HSV color space. These features are the inputs to a lightweight Multilayer Perceptron (MLP) classifier that determines the estrus state.
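As an illustration of the two hand-crafted features named above, the following minimal sketch computes a bounding-box aspect ratio and a red-saturation score. It assumes that the "dual-threshold" step refers to selecting the two hue bands at either end of the HSV hue axis where red lies; the threshold values, function names, and the width-over-height convention are hypothetical and are not taken from this study.

```python
import colorsys

# Hypothetical thresholds for illustration only; the study's values are not stated here.
LOW_RED_MAX_HUE = 10 / 360    # lower red band: hue from 0 to 10 degrees
HIGH_RED_MIN_HUE = 340 / 360  # upper red band: hue from 340 to 360 degrees
MIN_SATURATION = 0.3          # ignore washed-out pixels
MIN_VALUE = 0.2               # ignore very dark pixels


def aspect_ratio(box):
    """Width divided by height for a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return (x2 - x1) / (y2 - y1)


def red_saturation(pixels):
    """Mean saturation of the pixels that fall in either red hue band.

    `pixels` is an iterable of (r, g, b) tuples with channels in 0..255.
    Returns 0.0 when no pixel passes the thresholds.
    """
    saturations = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        in_red_band = h <= LOW_RED_MAX_HUE or h >= HIGH_RED_MIN_HUE
        if in_red_band and s >= MIN_SATURATION and v >= MIN_VALUE:
            saturations.append(s)
    return sum(saturations) / len(saturations) if saturations else 0.0


def estrus_features(box, pixels):
    """Two-element feature vector that a small MLP classifier could consume."""
    return [aspect_ratio(box), red_saturation(pixels)]
```

In practice the pixels would be those cropped from the detected vulvar bounding box, and the resulting two-element vector would be passed to the classifier.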
The dataset comprised 1400 annotated samples, divided into training, testing, and validation sets in an 8:1:1 ratio. On-farm evaluations in commercial pig facilities show that the proposed system attains an 85% estrus detection success rate. Following lightweight optimization, inference latency fell from 24.29 ms to 18.87 ms and the model size shrank from 32.38 MB to 3.96 MB on the same machine, while the mean Average Precision (mAP) remained at 0.941; the accuracy loss from model compression was kept below 1%. Moreover, the model performs robustly under complex lighting and occlusion conditions, enabling real-time processing from vulvar localization to estrus detection and providing an efficient and reliable technical solution for automated estrus monitoring in large-scale pig farms.
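The figures above can be put in relative terms with a short calculation. The sketch below, whose helper names are hypothetical, derives the per-split sample counts implied by an 8:1:1 ratio and the relative reductions in latency and model size from the reported before and after values.

```python
def split_counts(n_samples, ratios=(8, 1, 1)):
    """Integer sample counts per split; any rounding remainder goes to the first split."""
    total = sum(ratios)
    counts = [n_samples * r // total for r in ratios]
    counts[0] += n_samples - sum(counts)
    return tuple(counts)


def relative_reduction(before, after):
    """Fraction by which a quantity decreased, e.g. 0.25 for a 25% reduction."""
    return (before - after) / before


train, test, val = split_counts(1400)
latency_cut = relative_reduction(24.29, 18.87)  # inference latency in ms
size_cut = relative_reduction(32.38, 3.96)      # model size in MB

print(f"train/test/val: {train}/{test}/{val}")
print(f"latency reduced by {latency_cut:.1%}, size reduced by {size_cut:.1%}")
```

With the reported values this corresponds to 1120, 140, and 140 samples per split, a latency reduction of roughly 22%, and a model size reduction of roughly 88%.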