Discovery of skill-switching criteria for learning agile quadruped locomotion

Abstract

This study develops a hierarchical learning and optimization framework that can learn and achieve well-coordinated multi-skill locomotion. The learned multi-skill policy can switch between skills automatically and naturally while tracking arbitrarily positioned goals and can recover from failures promptly. The proposed framework is composed of a deep reinforcement learning process and an optimization process. First, the contact pattern is incorporated into the reward terms to learn different types of gaits as separate policies without the need for any other references. Then, a higher-level policy is learned to generate weights for individual policies to compose multi-skill locomotion in a goal-tracking task setting. Skills are automatically and naturally switched according to the distance to the goal. The appropriate distances for skill switching are incorporated into the reward calculation for learning the high-level policy and are updated by an outer optimization loop as learning progresses. We first demonstrate successful multi-skill locomotion in comprehensive tasks on a simulated Unitree A1 quadruped robot. We also deploy the learned policy in the real world, showcasing trotting, bounding, galloping, and their natural transitions as the goal position changes. Moreover, the learned policy can react to unexpected failures at any time, perform prompt recovery, and successfully resume locomotion. Compared to baselines, our proposed approach achieves all the learned agile skills with improved learning performance, enabling smoother and more continuous skill transitions.
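
To make the three ingredients in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes (a) a contact-pattern reward that scores how many feet match a desired stance/swing pattern, (b) a high-level policy whose output logits are turned into weights that blend the joint targets of the individual skill policies, and (c) a switching reward that favors the skill preferred at the current goal distance. The function names, the softmax blending of actions, the ordering of skills by distance, and the threshold values are all illustrative assumptions; the abstract does not specify them, and the paper may blend policies at a different level.

```python
import math

# Hypothetical skill ordering, from short to long goal distances (an assumption).
SKILLS = ("trot", "bound", "gallop")


def contact_reward(desired, measured):
    """Fraction of feet whose measured contact state matches the desired pattern."""
    matches = sum(1 for d, m in zip(desired, measured) if d == m)
    return matches / len(desired)


def softmax(logits):
    """Turn high-level policy logits into weights that are positive and sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def blend_actions(weights, actions):
    """Weighted sum of the per-skill joint targets (one action list per skill)."""
    n_joints = len(actions[0])
    return [sum(w * a[j] for w, a in zip(weights, actions)) for j in range(n_joints)]


def preferred_skill(distance, thresholds):
    """Index of the skill favored at this goal distance.

    thresholds are the switching distances in ascending order; in the paper
    these are tuned by an outer optimization loop, here they are plain inputs.
    """
    for i, t in enumerate(thresholds):
        if distance < t:
            return i
    return len(thresholds)


def switching_reward(weights, distance, thresholds):
    """Reward the weight that the high-level policy puts on the favored skill."""
    return weights[preferred_skill(distance, thresholds)]


if __name__ == "__main__":
    weights = softmax([0.2, 1.5, 0.1])  # made-up high-level policy logits
    thresholds = [1.0, 3.0]             # made-up switching distances in metres
    favored = SKILLS[preferred_skill(2.0, thresholds)]
    print(favored, switching_reward(weights, 2.0, thresholds))
```

In this sketch the switching distances are fixed inputs. The outer loop described in the abstract would adjust them between training rounds, which is why `thresholds` is passed as an argument instead of being hard-coded.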
