Human Strategy Adaptation in Reinforcement Learning Resembles Policy Gradient Ascent.

Authors: Xiong Hua-Dong, Ji-An Li, Wilson Robert C, Mattar Marcelo G
A hallmark of intelligence is the ability to adapt behavior to changing environments, which requires adapting one's own learning strategies. This phenomenon is known as learning to learn in cognitive science and meta-learning in artificial intelligence. While this phenomenon is well-established in humans and animals, no quantitative framework exists for characterizing the trajectories through which biological agents adapt their learning strategies. Previous computational studies, which either assume fixed strategies or use task-optimized neural networks, do not explain how humans refine strategies through experience. Here we show that humans adjust their reinforcement learning strategies in a manner that resembles gradient-based online optimization. We introduce DynamicRL, a framework that uses neural networks to track how participants' learning parameters (e.g., learning rates and decision temperatures) evolve throughout experiments. Across four diverse bandit tasks, DynamicRL consistently outperforms traditional reinforcement learning models with fixed parameters, demonstrating that humans continuously adapt their strategies over time. These dynamically estimated parameters reveal trajectories that systematically increase expected rewards, with updates significantly aligned with policy gradient ascent directions. Furthermore, this learning process operates across multiple timescales: strategy parameters update more slowly than behavioral choices, and update effectiveness correlates with local gradient strength in the reward landscape. Our work offers a generalizable approach for characterizing meta-learning trajectories, bridging theories of biological and artificial intelligence by providing a quantitative method for studying how adaptive behavior is optimized through experience.
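The core alignment analysis described in the abstract can be illustrated with a minimal sketch. The code below is not the authors' DynamicRL implementation; it is a hypothetical illustration, assuming a softmax Q-learning agent on a two-armed Bernoulli bandit, that (1) estimates expected reward as a function of the strategy parameters (learning rate α, decision temperature β) by Monte Carlo simulation, (2) approximates the gradient of that reward landscape by finite differences, and (3) measures the cosine alignment between a (hypothetical) observed parameter update and the gradient-ascent direction. All numbers and names here are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def run_bandit(alpha, beta, p=(0.8, 0.2), n_trials=100):
    """Simulate a softmax Q-learning agent on a two-armed Bernoulli
    bandit with reward probabilities p; return the mean reward earned."""
    q = np.zeros(2)
    total = 0.0
    for _ in range(n_trials):
        logits = beta * q
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()                 # softmax choice policy
        a = rng.choice(2, p=probs)
        r = float(rng.random() < p[a])
        q[a] += alpha * (r - q[a])           # delta-rule value update
        total += r
    return total / n_trials

def expected_reward(alpha, beta, n_reps=100):
    """Monte Carlo estimate of expected reward at strategy (alpha, beta)."""
    return float(np.mean([run_bandit(alpha, beta) for _ in range(n_reps)]))

def reward_gradient(alpha, beta, eps=0.05):
    """Central finite-difference gradient of expected reward
    with respect to the strategy parameters."""
    d_alpha = (expected_reward(alpha + eps, beta)
               - expected_reward(alpha - eps, beta)) / (2 * eps)
    d_beta = (expected_reward(alpha, beta + eps)
              - expected_reward(alpha, beta - eps)) / (2 * eps)
    return np.array([d_alpha, d_beta])

# Hypothetical strategy-parameter estimates at two time points
theta_t = np.array([0.10, 1.0])
theta_next = np.array([0.15, 1.5])

update = theta_next - theta_t
grad = reward_gradient(*theta_t)
# Cosine alignment: near +1 means the observed update points
# along the policy-gradient-ascent direction.
cos = update @ grad / (np.linalg.norm(update) * np.linalg.norm(grad))
```

In the study, the per-trial parameter estimates would come from the DynamicRL network rather than being fixed by hand, and the alignment statistic would be aggregated across participants and time points; this sketch only shows the shape of the computation.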
