Abstract
This paper addresses the challenge of safe path planning for mobile robots operating in human-shared environments, where human movement is inherently stochastic. To this end, we propose a reinforcement-learning-based path planning algorithm that accounts for human-related uncertainty at the planning level. The algorithm first employs a Markov decision process (MDP) learner to explore the environment and generate multiple candidate paths. Next, to reduce computational redundancy, a path-eliminator module filters out similar paths using a proposed diversity metric, preserving path diversity with minimal overhead. A Monte Carlo-simulated human risk predictor is then integrated into the decision-making unit to select the safest of the remaining candidates. The integrated algorithm enables robots to generate safe and efficient trajectories without frequent re-planning, even amid stochastic human behavior. Simulation results demonstrate the effectiveness of the proposed method. In a high-density setting (a 40×40 grid map with 10 humans), it reduces the average number of conflicts by 69.8%, 54.8%, and 73.4% compared with the A*, MDP, and RRT baselines, respectively, while improving the task success rate by 94.4%, 70.7%, and 118.75% relative to the same baselines.