Publication: The Price of Personalization: An Application of Contextual Bandits to Mobile Health
Date
2018-06-29
Authors
Published Version
Journal Title
Journal ISSN
Volume Title
Publisher
The Harvard community has made this article openly available.
Citation
Research Data
Abstract
One goal in healthcare is to accurately personalize treatments: to maintain overall treatment efficacy while minimizing both the harm to and the number of mistreated patients. With the recent prevalence of mobile devices, it has become possible to rapidly collect data and leverage it to personalize the treatment of long-term diseases, as in the HeartSteps study, an adaptive mHealth (mobile health) intervention application for maintaining cardiovascular health.
We frame the HeartSteps study as a contextual multi-armed bandit (MAB) problem, a reinforcement learning setting in which an agent must choose the optimal treatment action from several options based on contextual information.
We investigate and test several variants of the Thompson Sampling heuristic, a lightweight but effective reinforcement learning algorithm, to solve the contextual MAB problem as applied to HeartSteps. Experimental bootstrapping results are interpreted and then used to corroborate theoretically backed modifications to Thompson Sampling, guiding the future design of HeartSteps to maximize overall treatment performance while minimizing the variance of per-patient performance. Through these evaluations, we examine the price of personalization: the trade-off between optimizing overall treatment efficacy and optimizing the fairness of individual treatment efficacies.
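To make the heuristic named above concrete, the following Python sketch implements Thompson Sampling for a linear contextual bandit with one Gaussian posterior per arm (in the style of Agrawal and Goyal). The class name, the linear-Gaussian reward model, and all parameters are illustrative assumptions, not the thesis's actual algorithm or code.

```python
import numpy as np

class LinearThompsonSampling:
    """Thompson Sampling with a Gaussian linear reward model per arm.

    For each arm we keep ridge-regression statistics (A = X^T X + reg*I,
    b = X^T y), draw one parameter vector from the induced posterior at
    each decision point, and act greedily on that draw.
    """

    def __init__(self, n_arms, n_features, v=1.0, reg=1.0):
        self.v = v  # posterior-width scale: larger => more exploration
        self.A = [reg * np.eye(n_features) for _ in range(n_arms)]
        self.b = [np.zeros(n_features) for _ in range(n_arms)]

    def select_arm(self, context):
        scores = []
        for A, b in zip(self.A, self.b):
            mean = np.linalg.solve(A, b)            # ridge estimate of theta
            cov = self.v ** 2 * np.linalg.inv(A)    # posterior covariance
            theta = np.random.multivariate_normal(mean, cov)  # one draw
            scores.append(context @ theta)
        return int(np.argmax(scores))  # greedy w.r.t. the sampled model

    def update(self, arm, context, reward):
        self.A[arm] += np.outer(context, context)
        self.b[arm] += reward * context

# Hypothetical usage on synthetic data: two arms (e.g. send a suggestion
# or not), three context features per decision point.
rng = np.random.default_rng(0)
bandit = LinearThompsonSampling(n_arms=2, n_features=3)
for t in range(1000):
    x = rng.normal(size=3)                                   # context
    a = bandit.select_arm(x)
    r = float(x[0] * (a == 1) + rng.normal(scale=0.1))       # synthetic reward
    bandit.update(a, x, r)
```

Because the per-arm posteriors concentrate at different rates for different patients' contexts, a per-patient bootstrap of this loop is one way to surface the efficacy-versus-fairness trade-off the abstract describes.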
Description
Other Available Sources
Keywords
Computer Science, Mathematics
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service