Nucleus accumbens dopamine release reflects Bayesian inference during instrumental learning

Albert J Qü; Lung-Hao Tai; Christopher D Hall; Emilie M Tu; Maria K Eckstein; Karyna Mishchanchuk; Wan Chen Lin; Juliana B Chase; Andrew F MacAskill; Anne G E Collins; Samuel J Gershman; Linda Wilbrecht

doi:10.1101/2023.11.10.566306

bioRxiv [Preprint]. 2023 Nov 13:2023.11.10.566306. doi: 10.1101/2023.11.10.566306.

Authors

Albert J Qü¹,², Lung-Hao Tai³, Christopher D Hall⁴, Emilie M Tu¹, Maria K Eckstein⁵, Karyna Mishchanchuk⁶, Wan Chen Lin³, Juliana B Chase¹, Andrew F MacAskill⁶, Anne G E Collins¹,³, Samuel J Gershman⁷,⁸, Linda Wilbrecht¹,³

Affiliations

¹ Department of Psychology, University of California, Berkeley, CA, 94720, USA.
² Center for Computational Biology, University of California, Berkeley, CA, 94720, USA.
³ Helen Wills Neuroscience Institute, University of California, Berkeley, CA, 94720, USA.
⁴ Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, London, W1T 4JG, UK.
⁵ Google DeepMind, London, UK.
⁶ Department of Neuroscience, Physiology and Pharmacology, University College London, UK.
⁷ Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA, 02138, USA.
⁸ Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.

Abstract

Dopamine release in the nucleus accumbens has been hypothesized to signal reward prediction error, the difference between observed and predicted reward, suggesting a biological implementation for reinforcement learning. Rigorous tests of this hypothesis require assumptions about how the brain maps sensory signals to reward predictions, yet this mapping is still poorly understood. In particular, the mapping is non-trivial when sensory signals provide ambiguous information about the hidden state of the environment. Previous work using classical conditioning tasks has suggested that reward predictions are generated conditional on probabilistic beliefs about the hidden state, such that dopamine implicitly reflects these beliefs. Here we test this hypothesis in the context of an instrumental task (a two-armed bandit), where the hidden state switches repeatedly. We measured choice behavior and recorded dLight signals reflecting dopamine release in the nucleus accumbens core. Model comparison based on the behavioral data favored models that used Bayesian updating of probabilistic beliefs. These same models also quantitatively matched the dopamine measurements better than non-Bayesian alternatives. We conclude that probabilistic belief computation plays a fundamental role in instrumental performance and associated mesolimbic dopamine signaling.
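As a rough illustration of the idea in the abstract, the sketch below shows how a reward prediction error can be computed conditional on a probabilistic belief about a hidden state in a two-armed bandit. It is a generic hidden-state Bayesian filter, not the authors' fitted model: the function name `belief_update`, the reward probabilities `p_high` and `p_low`, and the switch probability `hazard` are all assumptions chosen for illustration and do not come from the paper.

```python
def belief_update(belief, choice, reward, p_high=0.8, p_low=0.2, hazard=0.05):
    """One trial of a hidden-state Bayesian filter for a two-armed bandit.

    belief : P(arm 0 is currently the high-reward arm) before the trial.
    choice : 0 or 1, the arm chosen on this trial.
    reward : 0 or 1, the observed outcome.
    p_high, p_low, hazard : illustrative values (assumptions, not from the paper).
    Returns (rpe, next_belief).
    """
    # Reward prediction: expected reward of the chosen arm under the belief.
    p_choice_is_high = belief if choice == 0 else 1.0 - belief
    expected = p_choice_is_high * p_high + (1.0 - p_choice_is_high) * p_low

    # Reward prediction error: observed minus predicted reward.
    rpe = reward - expected

    # Likelihood of the observed outcome given a reward probability p.
    def likelihood(p):
        return p if reward == 1 else 1.0 - p

    # Hidden state A: arm 0 is the high arm. Hidden state B: arm 1 is.
    lik_a = likelihood(p_high if choice == 0 else p_low)
    lik_b = likelihood(p_low if choice == 0 else p_high)

    # Bayes' rule: posterior probability that arm 0 is the high arm.
    posterior = belief * lik_a / (belief * lik_a + (1.0 - belief) * lik_b)

    # Account for a possible hidden-state switch before the next trial.
    next_belief = posterior * (1.0 - hazard) + (1.0 - posterior) * hazard
    return rpe, next_belief


if __name__ == "__main__":
    belief = 0.5  # start with no preference between the two hidden states
    for choice, reward in [(0, 1), (0, 1), (0, 0), (1, 1)]:
        rpe, belief = belief_update(belief, choice, reward)
        print(f"choice={choice} reward={reward} rpe={rpe:+.3f} belief={belief:.3f}")
```

In this sketch the same reward produces a smaller prediction error once the belief already favors the chosen arm, which is the sense in which a dopamine-like signal could implicitly reflect the belief.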

Publication types

Preprint
