Abstract
Accurate feeding-state monitoring is essential for improving feeding management, reducing feed waste, and supporting water quality and fish welfare in aquaculture. However, existing vision-based methods often rely on subjective labels or computationally expensive temporal models, which limits practical on-farm deployment. Here, we propose an objective, edge-deployable framework for motion-driven feeding-state quantification and binary feeding/non-feeding recognition from top-view videos. The framework integrates frame-pair dense optical-flow encoding with a lightweight network (EfficientFeedingNet) to enable real-time deployment. Using an optical-flow-derived motion-intensity signal (V-Value), we automatically delineate feeding-response intervals and construct a perception-based dataset (Perceptual Dataset) with reproducible binary labels, alongside an observer-labeled Intuitive Dataset. Across representative backbones, models trained on the Perceptual Dataset achieve >90% test accuracy and outperform counterparts trained on the Intuitive Dataset by 13.13–18.46 percentage points. The proposed EfficientFeedingNet attains 96.53% test accuracy while remaining lightweight for edge deployment; on a Jetson Orin NX, it runs at 7.0 ms per image (143.24 fps). Overall, the proposed framework provides a practical basis for timely, data-driven feeding decisions in precision aquaculture.
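
To make the abstract's pipeline concrete, the sketch below illustrates one plausible reading of the motion-intensity step: it assumes the V-Value is the mean dense optical-flow magnitude per frame pair (the paper's exact definition is not reproduced here) and uses a simple hypothetical thresholding rule to delineate feeding-response intervals. The flow is computed with OpenCV's Farneback method; the `threshold` and `min_len` parameters are illustrative assumptions, not values from the paper.

```python
import cv2
import numpy as np

def motion_intensity(prev_gray, curr_gray):
    """Mean dense optical-flow magnitude between two grayscale frames.

    A proxy for the paper's V-Value signal; the actual definition in the
    paper may differ (this formulation is an assumption).
    """
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, curr_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    mag, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    return float(np.mean(mag))

def feeding_intervals(v_values, threshold, min_len=5):
    """Return maximal runs of frame indices whose motion intensity
    exceeds `threshold`, as (start, end) pairs.

    `threshold` and `min_len` are hypothetical parameters used only
    for illustration.
    """
    intervals, start = [], None
    for i, v in enumerate(v_values):
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            if i - start >= min_len:
                intervals.append((start, i))
            start = None
    if start is not None and len(v_values) - start >= min_len:
        intervals.append((start, len(v_values)))
    return intervals
```

Under this reading, frames inside the returned intervals would supply the reproducible binary labels of the Perceptual Dataset, with the remaining frames labeled non-feeding.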