Person: Kool, Wouter
Last Name: Kool
First Name: Wouter
Search Results (showing 1 - 3 of 3)
Publication: Competition and Cooperation Between Multiple Reinforcement Learning Systems (Elsevier, 2018)
Kool, Wouter; Cushman, Fiery; Gershman, Samuel
Most psychological research on reinforcement learning has depicted two systems locked in battle for control of behavior: a flexible but computationally expensive “model-based” system and an inflexible but cheap “model-free” system. However, the complete picture is more complex, with the two systems cooperating in myriad ways. We focus on two issues at the frontier of this research program. First, how is the conflict between these systems adjudicated? Second, how can the systems be combined to harness the relative strengths of each? This chapter reviews recent work on competition and cooperation between the two systems, highlighting the computational principles that govern different forms of interaction.

Publication: When Does Model-Based Control Pay Off? (Public Library of Science, 2016)
Kool, Wouter; Cushman, Fiery; Gershman, Samuel
Many accounts of decision making and reinforcement learning posit the existence of two distinct systems that control choice: a fast, automatic system and a slow, deliberative system. Recent research formalizes this distinction by mapping these systems to “model-free” and “model-based” strategies in reinforcement learning. Model-free strategies are computationally cheap but sometimes inaccurate, because action values can be accessed by inspecting a look-up table constructed through trial and error. In contrast, model-based strategies compute action values through planning in a causal model of the environment, which is more accurate but also more cognitively demanding. It is assumed that this trade-off between accuracy and computational demand plays an important role in the arbitration between the two strategies, but we show that neither the hallmark task for dissociating model-free and model-based strategies nor several related variants embody such a trade-off. We describe five factors that reduce the effectiveness of the model-based strategy on these tasks by reducing its accuracy in estimating reward outcomes and decreasing the importance of its choices. Based on these observations, we describe a version of the task that formally and empirically obtains an accuracy-demand trade-off between model-free and model-based strategies. Moreover, we show that human participants spontaneously increase their reliance on model-based control on this task, compared to the original paradigm. Our novel task and our computational analyses may prove important in subsequent empirical investigations of how humans balance accuracy and demand.

Publication: Cost-Benefit Arbitration Between Multiple Reinforcement-Learning Systems (SAGE Publications, 2017-07-21)
Kool, Wouter; Gershman, Samuel; Cushman, Fiery
Human behavior is sometimes determined by habit and other times by goal-directed planning. Modern reinforcement-learning theories formalize this distinction as a competition between a computationally cheap but inaccurate model-free system that gives rise to habits and a computationally expensive but accurate model-based system that implements planning. It is unclear, however, how people choose to allocate control between these systems. Here, we propose that arbitration occurs by comparing each system’s task-specific costs and benefits. To investigate this proposal, we conducted two experiments showing that people increase model-based control when it achieves greater accuracy than model-free control, and especially when the rewards of accurate performance are amplified. In contrast, they are insensitive to reward amplification when model-based and model-free control yield equivalent accuracy. This suggests that humans adaptively balance habitual and planned action through on-line cost-benefit analysis.
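
The abstracts above repeatedly contrast a model-free system (action values read from a look-up table built by trial and error) with a model-based system (action values computed by planning in a causal model), arbitrated by a cost-benefit comparison. The sketch below is a minimal illustration of that distinction, not the published models from these papers: the two-state environment, the learning rate ALPHA, and the sigmoid arbitration_weight rule are assumptions introduced for clarity, though mixing the two systems' values with a weight w follows a convention common in this literature.

```python
import numpy as np

# Hypothetical two-state, two-action environment; all names and parameter
# values here are illustrative, not taken from the papers above.
N_ACTIONS = 2
ALPHA = 0.1  # model-free learning rate (assumed)

# Model-free system: a look-up table of action values, updated cheaply by
# trial and error (one temporal-difference increment per observed reward).
q_mf = np.zeros(N_ACTIONS)

def model_free_update(action, reward):
    """One table read and one increment per trial: cheap but slow to adapt."""
    q_mf[action] += ALPHA * (reward - q_mf[action])

# Model-based system: action values computed by planning in a causal model
# of the environment (known transition probabilities times learned
# per-state reward estimates).
transitions = np.array([[0.7, 0.3],   # P(next state | action 0)
                        [0.3, 0.7]])  # P(next state | action 1)
reward_estimates = np.zeros(2)        # expected reward in each next state

def model_based_values():
    """A forward sweep through the model: accurate but more demanding."""
    return transitions @ reward_estimates

# Cost-benefit arbitration (illustrative rule): rely more on the model-based
# system when its expected accuracy gain outweighs its computational cost.
def arbitration_weight(expected_gain, compute_cost):
    """Map (gain - cost) through a sigmoid to a weight w in [0, 1]."""
    return 1.0 / (1.0 + np.exp(-(expected_gain - compute_cost)))

def combined_values(expected_gain, compute_cost):
    """Blend the two systems' valuations as w * MB + (1 - w) * MF."""
    w = arbitration_weight(expected_gain, compute_cost)
    return w * model_based_values() + (1 - w) * q_mf

# Example: one model-free update, then a blended valuation.
model_free_update(0, reward=1.0)
reward_estimates[:] = [1.0, 0.0]
print(combined_values(expected_gain=0.5, compute_cost=0.2))
```

Under this toy rule, amplifying the rewards at stake raises expected_gain and shifts the weight toward model-based control, which loosely mirrors the behavioral pattern the 2017 paper reports.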