Human decision making balances reward maximization and policy compression

Lucy Lai; Samuel J Gershman

doi:10.1371/journal.pcbi.1012057

PLoS Comput Biol. 2024 Apr 26;20(4):e1012057. doi: 10.1371/journal.pcbi.1012057. eCollection 2024 Apr.

Authors

Lucy Lai¹,², Samuel J Gershman³

Affiliations

¹ Program in Neuroscience, Harvard University, Cambridge, Massachusetts, United States of America.
² Theoretical Sciences Visiting Program, Okinawa Institute of Science and Technology Graduate University, Onna, Okinawa, Japan.
³ Department of Psychology and Center for Brain Science, Harvard University, Cambridge, Massachusetts, United States of America.

Abstract

Policy compression is a computational framework that describes how capacity-limited agents trade reward for simpler action policies to reduce cognitive cost. In this study, we present behavioral evidence that humans prefer simpler policies, as predicted by a capacity-limited reinforcement learning model. Across a set of tasks, we find that people exploit structure in the relationships between states, actions, and rewards to "compress" their policies. In particular, compressed policies are systematically biased towards actions with high marginal probability, thereby discarding some state information. This bias is greater when there is redundancy in the reward-maximizing action policy across states, and increases with memory load. These results could not be explained qualitatively or quantitatively by models that did not make use of policy compression under a capacity limit. We also confirmed the prediction that time pressure should further reduce policy complexity and increase action bias, based on the hypothesis that actions are selected via time-dependent decoding of a compressed code. These findings contribute to a deeper understanding of how humans adapt their decision-making strategies under cognitive resource constraints.
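The reward-complexity trade-off described in the abstract can be illustrated with a minimal sketch. It assumes the standard rate-distortion formulation, which is not spelled out in this record: policy complexity is the mutual information I(S;A) between states and actions, and the optimal policy under a capacity limit takes the form π(a|s) ∝ P(a) exp(β Q(s,a)), found by Blahut-Arimoto-style iteration. The function names, the toy reward matrix `Q`, and the values of `beta` are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import numpy as np

def compressed_policy(Q, p_state, beta, n_iter=500):
    """Iterate pi(a|s) ∝ P(a) * exp(beta * Q[s, a]) (assumed form, see lead-in)."""
    n_actions = Q.shape[1]
    marginal = np.full(n_actions, 1.0 / n_actions)  # start from a uniform P(a)
    for _ in range(n_iter):
        logits = beta * Q + np.log(marginal)
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        policy = np.exp(logits)
        policy /= policy.sum(axis=1, keepdims=True)
        # Update the marginal action distribution; clamp to avoid log(0).
        marginal = np.maximum(p_state @ policy, 1e-12)
    return policy

def policy_complexity(policy, p_state):
    """Mutual information I(S;A) in bits."""
    marginal = p_state @ policy
    joint = p_state[:, None] * policy
    outer = np.broadcast_to(marginal, policy.shape)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(policy[mask] / outer[mask])))

def expected_reward(policy, Q, p_state):
    return float(np.sum(p_state[:, None] * policy * Q))

# Hypothetical task: action 0 is optimal in two of three states (redundancy).
Q = np.array([[1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
p_state = np.full(3, 1.0 / 3.0)

low = compressed_policy(Q, p_state, beta=1.0)   # tight capacity limit
high = compressed_policy(Q, p_state, beta=5.0)  # looser capacity limit

for name, pi in [("low beta", low), ("high beta", high)]:
    print(name,
          "complexity (bits):", round(policy_complexity(pi, p_state), 3),
          "reward:", round(expected_reward(pi, Q, p_state), 3))
```

In this toy setting, the low-`beta` policy is simpler and earns less reward, and in the third state it leans towards action 0, the action with the highest marginal probability, which is the kind of action bias the abstract describes.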

Copyright: © 2024 Lai, Gershman. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Adult
Cognition / physiology
Computational Biology
Decision Making* / physiology
Female
Humans
Male
Models, Psychological
Reinforcement, Psychology
Reward*
Young Adult

Grants and funding

This research was supported by a Harvard Brain Science Initiative Bipolar Disorder Seed Grant to SJG (https://brain.harvard.edu/psychiatric-disorders/); the National Science Foundation Graduate Research Fellowship, DGE-1745303, to LL (https://www.nsfgrfp.org/); and the 28Twelve Foundation Harvey Fellowship to LL (https://www.28twelvefoundation.org/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.