Abstract
Acting successfully in dynamic environments requires learning supported by two systems that differ in computational demand: a fast, model-free system that repeats rewarded actions, and a more effortful, model-based system that uses a mental model of the task structure to guide flexible, goal-directed decisions. A key open question is whether people engage effortful model-based strategies to the same extent when deciding for themselves as when deciding for others, and which computations underpin any self-other differences. Using a two-step task with reinforcement-learning drift-diffusion modelling in 92 adults, we found that deciding for others slowed model-free learning and reduced reliance on model-based strategies, with the latter effect partially mediated by differences in non-decision time. Moreover, individual differences in social value orientation predicted the self-other discrepancy in model-based decision-making, with more prosocial individuals showing smaller discrepancies. Together, these findings identify the computational mechanisms underpinning prosocial model-based decision-making and demonstrate how individual differences modulate these computations.