Distributed representations of temporally accumulated reward prediction errors in the mouse cortex.

小鼠皮层中时间累积的奖励预测误差的分布式表征

阅读:17
Reward prediction errors (RPEs) quantify the difference between expected and actual rewards, serving to refine future actions. Although reinforcement learning (RL) provides ample theoretical evidence suggesting that the long-term accumulation of these error signals improves learning efficiency, it remains unclear whether the brain uses similar mechanisms. To explore this, we constructed RL-based theoretical models and used multiregional two-photon calcium imaging in the mouse dorsal cortex. We identified a population of neurons whose activity was modulated by varying degrees of RPE accumulation. Consequently, RPE-encoding neurons were sequentially activated within each trial, forming a distributed assembly. RPE representations in mice aligned with theoretical predictions of RL, emerging during learning and being subject to manipulations of the reward function. Interareal comparisons revealed a region-specific code, with higher-order cortical regions exhibiting long-term encoding of RPE accumulation. These results present an additional layer of complexity in cortical RPE computation, potentially augmenting learning efficiency in animals.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。