Abstract
Behavioral outcomes are rarely certain, requiring subjects to discriminate between available choices by using feedback to guide future decisions. Probabilistic reversal learning (PRL) tasks test subjects' ability to learn and flexibly adapt to changes in reward contingencies. Cortico-striatal circuitry has been broadly implicated in flexible decision-making, though what role these circuits play remains unclear. In this study, we leveraged the fast temporal dynamics of local field potentials to precisely identify the role that cortico-striatal networks play during PRL reward feedback. We measured widespread (32-channel) local field potential activity in male Long-Evans rats during a PRL task wherein a target response delivered reward on 80% of trials while a non-target response delivered reward on 20% of trials. When subjects learned those reward probabilities, contingencies were reversed. We found that reward-evoked oscillations at beta (15-30 Hz) and high gamma (>70 Hz) frequencies marked positive reward valence and reflected the probability of reward. Activity and connectivity at beta frequencies between orbitofrontal cortex, anterior insula, medial prefrontal cortex, and ventral striatum during expected rewards were correlated with behavioral performance and with specific aspects of value/exploitative behavior as defined by a reinforcement learning computational model. Finally, we found that modulating beta activity in orbitofrontal cortex with optogenetic (20 Hz) stimulation promoted maladaptive behavior when stimulation was provided during non-target responses, consistent with our data and computational model predictions. Reward-evoked beta oscillations may reflect a crucial component underlying reward learning, and erroneous elevations in this physiological signal may contribute to maladaptive task performance and behavioral disruptions.
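For readers unfamiliar with the task structure, the 80%/20% contingency and its reversal can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not the model used in this study: the learner is a generic delta-rule (Q-learning) agent with softmax choice, and the learning rate `alpha`, inverse temperature `beta`, and the reversal criterion of `criterion` consecutive target responses are hypothetical values chosen for illustration only. The abstract does not specify how learning of the reward probabilities was defined.

```python
import math
import random


def simulate_prl(n_trials=400, p_target=0.8, p_nontarget=0.2,
                 alpha=0.3, beta=5.0, criterion=8, seed=0):
    """Simulate a two-choice probabilistic reversal learning session.

    p_target / p_nontarget follow the 80%/20% reward probabilities
    described in the text. alpha, beta, and criterion are ASSUMED
    illustrative values, not parameters reported by the study.
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]          # learned value of each response option
    target = 0              # index of the currently high-probability option
    streak = 0              # consecutive target responses
    reversals = 0
    n_target_choices = 0

    for _ in range(n_trials):
        # Two-option softmax reduces to a logistic of the value difference.
        p_choose_1 = 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))
        choice = 1 if rng.random() < p_choose_1 else 0

        # Probabilistic reward delivery depends on the current target.
        p_reward = p_target if choice == target else p_nontarget
        reward = 1.0 if rng.random() < p_reward else 0.0

        # Delta-rule update: move the chosen value toward the outcome.
        q[choice] += alpha * (reward - q[choice])

        if choice == target:
            n_target_choices += 1
            streak += 1
        else:
            streak = 0

        # ASSUMED reversal rule: swap contingencies after a run of
        # consecutive target responses.
        if streak >= criterion:
            target = 1 - target
            streak = 0
            reversals += 1

    return {
        "reversals": reversals,
        "target_fraction": n_target_choices / n_trials,
        "q": q,
    }


if __name__ == "__main__":
    print(simulate_prl())
```

In a sketch like this, raising `alpha` makes the simulated agent track reversals faster at the cost of overreacting to the 20% of unrewarded target trials, which is the trade-off between flexibility and exploitation that the task is designed to probe.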