Abstract
Because pedestrian goal information is usually unknown in practice, research on inverse reinforcement learning (IRL) for social robots has focused on trajectory planning based on current motion direction, other pedestrians, and obstacles. However, social robots typically have clear navigation goals, so the practicality of these methods is debatable. Moreover, predicting trajectories over longer distances poses significant challenges for such trajectory planning methods. Therefore, this paper proposes a goal-oriented autonomous decision-making (GO-ADM) method, which addresses the socially compatible navigation problem of robots through sequential actions at each discrete time step, eliminating the need to predict pedestrian trajectories. Specifically, this paper defines and collects goal-oriented expert demonstrations and proposes a collaborative interactive IRL framework for social robots. By exploring both explicit and implicit norms of pedestrian motion in the expert demonstrations, a goal-oriented reward function is obtained and integrated into GO-ADM to achieve goal-oriented autonomous decision-making for social robots. Additionally, GO-ADM incorporates a penalty function for social safety distance to further ensure the safety of human-robot interaction. Because pedestrian navigation in real-world scenarios is predominantly longitudinal, this study constructs a realistic training environment to validate GO-ADM on both longitudinal-dominant and lateral-dominant navigation tasks. The results demonstrate that, under both longitudinal-dominant and lateral-dominant navigation tasks, the robot is capable of making rational autonomous decisions, with final average deviations from the destination of less than 0.13 m and 0.23 m, respectively, and average deviation rates of less than 0.18% and 1.1%, respectively. Furthermore, repeated experiments demonstrate that GO-ADM achieves a higher success rate than other social navigation and decision-making algorithms.
Under severe interference conditions, with a maximum noise of 0.5 m, the success rate of GO-ADM exceeds 75%, demonstrating strong robustness against unknown interference.
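
The abstract does not state the form of the social-safety-distance penalty, so the following is only a minimal sketch of one common choice: a hinge penalty that is zero outside a safety radius and grows linearly inside it, subtracted from a learned reward. The names `safety_penalty` and `shaped_reward`, the parameters `safe_dist` and `weight`, and the hinge form itself are all assumptions made for illustration, not the paper's definitions.

```python
import math

def safety_penalty(robot_xy, pedestrians_xy, safe_dist=1.0, weight=1.0):
    """Hypothetical hinge penalty: 0 when every pedestrian is at least
    safe_dist away, otherwise weight * (safe_dist - distance), summed
    over all pedestrians inside the safety radius."""
    penalty = 0.0
    for ped_xy in pedestrians_xy:
        d = math.dist(robot_xy, ped_xy)  # Euclidean distance in metres
        if d < safe_dist:
            penalty += weight * (safe_dist - d)
    return penalty

def shaped_reward(learned_reward, robot_xy, pedestrians_xy, **kwargs):
    """Assumed combination: learned (IRL) reward minus the safety penalty."""
    return learned_reward - safety_penalty(robot_xy, pedestrians_xy, **kwargs)
```

In this sketch, a pedestrian 0.5 m from the robot with `safe_dist=1.0` and `weight=1.0` lowers the step reward by 0.5, while a pedestrian 2 m away leaves it unchanged.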