Dopamine dynamics during stimulus-reward learning in mice can be explained by performance rather than learning.

The reward prediction error (RPE) hypothesis posits that phasic dopamine (DA) activity in the ventral tegmental area (VTA) encodes the difference between expected and actual rewards to drive reinforcement learning. However, emerging evidence suggests DA may instead regulate behavioral performance. Here, we used force sensors to measure subtle movements in head-fixed mice during a Pavlovian stimulus-reward task, while recording and manipulating VTA DA activity. We identified distinct DA neuron populations tuned to forward and backward force exertion. They are active during both spontaneous and conditioned behaviors, independent of learning or reward predictability. Variations in force and licking fully account for DA dynamics traditionally attributed to RPE, including variations in firing rates related to reward magnitude, probability, and omission. Optogenetic manipulations further confirmed that DA modulates force exertion and behavioral transitions in real time, without affecting learning. Our findings challenge the RPE hypothesis and instead suggest that VTA DA neurons dynamically adjust the gain of motivated behaviors, controlling their latency, direction, and intensity during performance.

Journal:	Nature Communications	Impact factor:	15.700
Year:	2025	Citation:	2025 Oct 13; 16(1):9081
DOI:	10.1038/s41467-025-64132-4
