Abstract
Latent learning experiments were critical in shaping Tolman's cognitive map theory. In a spatial navigation task, latent learning means that animals acquire knowledge of their environment through exploration, such that pre-exposed animals learn a subsequent reward-learning task faster than naive animals. This enhancement has been shown to depend on the design of the pre-exposure phase. Here, we hypothesize that the deep successor representation (DSR), a recent computational model of cognitive map formation, can account for this modulation of latent learning because it is sensitive to the statistics of behavior during exploration. In our model, exploration aligned with the future reward location significantly improves reward learning compared to random, misdirected, or no exploration, consistent with experimental findings. This effect generalizes across different action selection strategies. We show that these performance differences follow from the spatial information encoded in the structure of the DSR acquired during pre-exposure. In summary, this study sheds light on the mechanisms underlying latent learning and on how such learning shapes cognitive maps, and thereby their effectiveness in goal-directed spatial tasks.
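The abstract's central claim, that a successor representation (SR) absorbs the statistics of exploratory behavior, can be illustrated with a minimal tabular sketch. This is not the deep model studied in the paper: the 1-D track environment, the bias parameter, the discount factor, and the reward placement below are all illustrative assumptions. The tabular SR has the closed form M = (I - γP)^{-1}, so two exploration policies P yield two different maps M, and the value function V = Mw inherits that difference.

```python
import numpy as np

def sr_matrix(P, gamma=0.95):
    """Closed-form tabular successor representation M = (I - gamma*P)^{-1}.
    M[s, s'] is the expected discounted future occupancy of s' starting
    from s under the exploration policy encoded in P."""
    n = P.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * P)

def track_policy(n_states, p_right):
    """Random walk on a 1-D track with reflecting ends. p_right > 0.5
    biases exploration toward the far end, where the reward will later
    be placed (an illustrative stand-in for goal-aligned exploration)."""
    P = np.zeros((n_states, n_states))
    for s in range(n_states):
        left, right = max(s - 1, 0), min(s + 1, n_states - 1)
        P[s, left] += 1 - p_right
        P[s, right] += p_right
    return P

n = 10
M_random = sr_matrix(track_policy(n, 0.5))   # undirected exploration
M_goal = sr_matrix(track_policy(n, 0.75))    # exploration biased toward reward site

# Reward placed at the far end; under the SR, value is V = M @ w.
w = np.zeros(n)
w[-1] = 1.0
V_random, V_goal = M_random @ w, M_goal @ w

# Goal-aligned exploration yields an SR whose start-state value already
# assigns more credit to the future reward location.
print(V_goal[0] > V_random[0])  # True
```

In this toy setting, a single reward observation propagates value back to the start state far more effectively through the goal-aligned SR than through the randomly explored one, which is the intuition behind the pre-exposure effects reported in the abstract.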