Publication: Habits of thought: Model-free reinforcement learning over cognitive operations
Abstract
People often persist in doing things that they know intellectually are no longer good for them. They mindlessly take their usual route to work, even though they know it's closed for construction, or reach for their morning coffee even though they're trying to quit caffeine. These persistent behaviors often go under the label of habits, and contemporary theories have had great success characterizing the computational cognitive mechanisms underlying them. In particular, work on model-free reinforcement learning has shown how habits can arise from direct associations between actions and reward (e.g. representations of the form "reaching for morning coffee = ++", computed through past experience with the action).
Yet, computational accounts of habits are lacking in a key way. Intuitively, people don't just persist in external actions, like reaching for morning coffee. They also persist in internal, cognitive patterns. For instance, a person might form a habit of fantasizing about coffee, or of planning how to get her next cup. Despite the purported importance of these "habits of thought" for people's mental lives, they are noticeably absent from computational accounts of habits. These accounts have typically modeled habits as forming over simple motor actions (like pulling a lever) or external choices (like selecting buttons in a laboratory decision task) -- and have not investigated habits over more internal, abstract types of cognitive operations (like setting the goal of getting coffee).
This dissertation fills that gap. Here, I demonstrate that people inflexibly persist in two types of internal, cognitive operations -- setting a goal to pursue (Chapter 1) and generating a decision option to consider (Chapter 2) -- after being rewarded for them, even when those rewards are known to be irrelevant for the present context. I formally model these patterns as model-free reinforcement learning over internal operations, and show that habits of thought can serve a useful function: They help make model-based planning tractable by narrowing its scope and channeling it down a small number of promising paths. Finally, I find that some cognitive operations do not exhibit this type of habit of thought: In our experiments (Chapter 3), people don't employ model-free reinforcement learning to select chunked action sequences (e.g. sequences of button presses mentally chunked together as a unit), and instead use only model-based planning to choose sequences. Together, this work provides a precise account of the cognitive mechanisms that can underlie habits of thought; rigorously demonstrates the existence of such habits; suggests their adaptive function; and begins to map out their boundary conditions.
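As an illustration of the "direct associations between actions and reward" mentioned in the abstract, the following is a minimal sketch of a model-free value update applied to a cognitive operation (setting a goal) rather than a motor action. It is not the dissertation's model: the delta-rule update is a standard textbook simplification, and the operation names, reward values, learning rate, and function names are all hypothetical, chosen only to show how a cached value can keep driving goal selection after the reward is known to have changed.

```python
def update_cached_value(value, reward, learning_rate=0.1):
    """One delta-rule update: nudge the cached value toward the observed reward."""
    return value + learning_rate * (reward - value)


def learn_from_experience(rewards_by_operation, episodes=50, learning_rate=0.1):
    """Build cached values for each operation from repeated past rewards."""
    values = {op: 0.0 for op in rewards_by_operation}
    for _ in range(episodes):
        for op, reward in rewards_by_operation.items():
            values[op] = update_cached_value(values[op], reward, learning_rate)
    return values


def choose_model_free(cached_values):
    """Pick the operation with the highest cached value, ignoring current knowledge."""
    return max(cached_values, key=cached_values.get)


def choose_model_based(current_rewards):
    """Pick the operation that is best given what is known about the present context."""
    return max(current_rewards, key=current_rewards.get)


# Hypothetical past experience: setting the coffee goal was rewarded, tea was not.
past_rewards = {"set goal: get coffee": 1.0, "set goal: get tea": 0.0}
cached = learn_from_experience(past_rewards)

# Hypothetical present context: the person now knows coffee is no longer rewarding.
current_rewards = {"set goal: get coffee": 0.0, "set goal: get tea": 1.0}

habitual_goal = choose_model_free(cached)               # driven by cached values
deliberate_goal = choose_model_based(current_rewards)   # driven by current knowledge
```

In this toy setup the model-free chooser still selects the coffee goal because its cached value has not been updated, while the model-based chooser switches to tea. Under the same assumptions, restricting planning to the few operations with high cached values is one simple way to picture how such habits could narrow the scope of model-based planning.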