Abstract
To accurately detect the picking area and pose of pineapple fruit under complex backgrounds and varying lighting conditions, this study proposes LTHRNet, a pineapple keypoint detection model based on an improved LiteHRNet. Images of pineapple fruits were collected under different lighting conditions, and six keypoints were defined to characterize the morphological features of the fruit. LTHRNet incorporates the LKA_Stem module to enhance initial feature extraction, the D-Mixer module to capture both global and local feature relationships, and the MS-FFN module to achieve multi-scale feature fusion. The model also employs parallel sub-networks at different resolutions to preserve high-resolution feature information and improve the spatial accuracy of keypoint localization. Experimental results show that LTHRNet achieves 93.5% KAP(0.5) and 95.1% KAR(0.5), outperforming the compared models in detection accuracy and in robustness under challenging lighting and occlusion conditions, at a detection speed of 21.1 fps. For pose estimation, the average offset angle (AOA) of LTHRNet is 2.37°, significantly lower than that of the other models. These results indicate that LTHRNet provides accurate and robust keypoint localization and pose estimation data for pineapple harvesting and offers a useful reference for pose recognition in other fruit-picking tasks.
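The abstract does not state how the average offset angle (AOA) is computed. As a minimal sketch, assuming the fruit pose is represented by an axis through two of the six keypoints and that AOA is the mean absolute angle between the predicted axis and the annotated axis, the metric could be evaluated as follows. The function names and the choice of keypoints are hypothetical and are not taken from the paper.

```python
import math


def offset_angle_deg(pred_a, pred_b, true_a, true_b):
    """Unsigned angle in degrees between the predicted axis (pred_a -> pred_b)
    and the annotated axis (true_a -> true_b). Points are (x, y) pairs.

    Assumption: the pose axis is defined by two keypoints; the paper may
    define it differently.
    """
    px, py = pred_b[0] - pred_a[0], pred_b[1] - pred_a[1]
    tx, ty = true_b[0] - true_a[0], true_b[1] - true_a[1]
    dot = px * tx + py * ty      # proportional to cos(angle)
    cross = px * ty - py * tx    # proportional to sin(angle)
    return abs(math.degrees(math.atan2(cross, dot)))


def average_offset_angle(samples):
    """Mean offset angle over samples, each a tuple
    (pred_a, pred_b, true_a, true_b)."""
    if not samples:
        raise ValueError("samples must not be empty")
    angles = [offset_angle_deg(*s) for s in samples]
    return sum(angles) / len(angles)
```

Using `atan2` on the cross and dot products avoids the loss of precision that `acos` suffers for nearly parallel axes, which is the regime of interest when offsets are only a few degrees.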