Abstract
The size and mass of fish are crucial parameters in aquaculture management. However, existing research primarily estimates fish size and mass under ideal conditions, which limits its application in actual aquaculture scenarios with complex water quality and fluctuating lighting. A non-contact size and mass measurement framework is proposed for complex underwater environments. It integrates FishKP-YOLOv11, an improved module based on YOLOv11, with stereo vision technology and a Random Forest model. The framework combines the detected 2D key points with binocular stereo vision to reconstruct 3D key point coordinates. Fish size is computed from these 3D key points, and a Random Forest model maps size to mass. To validate the framework, a grass carp dataset for key point detection was constructed. The experimental results indicate that the mean average precision (mAP) of FishKP-YOLOv11 surpasses that of various versions of YOLOv5 through YOLOv12. The mean absolute errors (MAEs) for length and width estimation are 0.35 cm and 0.10 cm, respectively, and the MAE for mass estimation is 2.7 g. These results suggest that the proposed framework is well suited for application in actual breeding environments.
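The geometric step of the framework, lifting matched 2D key points to 3D and measuring the distance between them, can be illustrated with a minimal sketch. It assumes an ideal rectified stereo pair (depth from disparity, Z = f·B/d), which may differ from the calibration and reconstruction actually used in the paper. The function names, camera parameters, and pixel coordinates below are hypothetical values chosen for illustration. The key point detection and Random Forest mass regression steps are not shown.

```python
import math


def triangulate(u_left, v_left, u_right, f_px, cx, cy, baseline_cm):
    """Back-project one matched key point from a rectified stereo pair.

    Assumption (not from the paper): an ideal rectified pinhole pair, so the
    same key point lies on the same image row in both views.
    f_px is the focal length in pixels, (cx, cy) the principal point,
    baseline_cm the camera separation. Returns (X, Y, Z) in centimetres.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = f_px * baseline_cm / disparity   # depth from disparity
    x = (u_left - cx) * z / f_px         # lateral offset
    y = (v_left - cy) * z / f_px         # vertical offset
    return (x, y, z)


def distance_3d(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical camera parameters and key point pixel coordinates.
F_PX, CX, CY, BASELINE_CM = 1000.0, 640.0, 360.0, 6.0

head = triangulate(540.0, 360.0, 440.0, F_PX, CX, CY, BASELINE_CM)
tail = triangulate(840.0, 360.0, 740.0, F_PX, CX, CY, BASELINE_CM)

# Body length as the 3D distance between the head and tail key points.
length_cm = distance_3d(head, tail)
print(f"estimated length: {length_cm:.2f} cm")
```

Body width could be obtained the same way from a second pair of key points, and the resulting size features would then serve as inputs to the mass regressor.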