Abstract
This paper addresses the challenge of simultaneously achieving collision avoidance, path quality, and short execution time in trajectory planning for industrial robots operating in confined spaces. A multi-objective integrated trajectory optimization algorithm is proposed based on the TD3 (Twin Delayed Deep Deterministic Policy Gradient) reinforcement learning framework. First, the motion generation mechanism is improved with a Butterworth filter and a dynamic noise attenuation strategy to enhance trajectory smoothness. Second, a genetic algorithm is employed to automatically tune the hyperparameters of TD3, and a prioritized experience replay mechanism is introduced to make better use of critical experiences, thereby improving convergence speed and stability. Finally, a composite reward function based on time-distance information is designed to guide the industrial robot toward shorter trajectory execution times while remaining collision-free. The Fairino Robot5 robotic arm model serves as the training object, with simulation experiments conducted in a PyBullet environment and physical experiments conducted on real hardware. Compared with the RRT method, the manual teaching method, the traditional TD3 method, and the SAC method, the actual trajectory execution time is reduced by 57.03%, 22.94%, 26.05%, and 20.5%, respectively.
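
To make the smoothing idea concrete, the following is a minimal sketch of how a low-pass Butterworth filter and a decaying exploration-noise scale might be combined on a single joint's action sequence. It is an illustration under stated assumptions, not the configuration used in this paper: the filter order (second), the cutoff and sampling frequencies, the exponential decay schedule, and the names `butterworth2_coeffs`, `lowpass`, and `noise_scale` are all hypothetical.

```python
import math
import random


def butterworth2_coeffs(cutoff_hz, sample_hz):
    """Second-order low-pass Butterworth biquad (bilinear transform).

    The order and frequencies are assumptions made for this sketch.
    """
    k = math.tan(math.pi * cutoff_hz / sample_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b = (b0, 2.0 * b0, b0)                      # feed-forward coefficients
    a = (2.0 * (k * k - 1.0) * norm,            # feedback coefficient a1
         (1.0 - math.sqrt(2.0) * k + k * k) * norm)  # feedback coefficient a2
    return b, a


def lowpass(samples, b, a):
    """Filter one joint's action sequence (direct form I, zero initial state)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def noise_scale(episode, sigma0=0.2, decay=0.995, sigma_min=0.01):
    """Hypothetical exponential attenuation of exploration noise per episode."""
    return max(sigma_min, sigma0 * decay ** episode)


if __name__ == "__main__":
    random.seed(0)
    b, a = butterworth2_coeffs(cutoff_hz=2.0, sample_hz=50.0)
    sigma = noise_scale(episode=100)
    # Noisy actions around a constant target command of 0.5 (arbitrary units).
    raw = [0.5 + random.gauss(0.0, sigma) for _ in range(200)]
    smooth = lowpass(raw, b, a)
    print("noise scale:", sigma, "last smoothed action:", smooth[-1])
```

A causal filter of this kind introduces some lag, so the cutoff frequency trades smoothness against responsiveness; the floor `sigma_min` keeps a small amount of exploration late in training.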