Publication: A Novel Algorithm for Calculating Explicit Sampling Probabilities
Abstract
Reinforcement learning (RL) is a branch of machine learning that tackles sequential decision-making problems via an agent-environment framework, with the objective of maximizing a scalar reward signal. Many online RL algorithms lack explicit sampling probabilities, that is, the probability that an action is selected given the observed history up to that point. These probabilities are necessary for calculating the estimators used in off-policy evaluation. Our primary contribution is a Monte Carlo integration (MCI) based algorithm that closely matches the performance of randomized least squares value iteration (RLSVI), an efficient Bayesian RL algorithm that does not offer explicit sampling probabilities. We also present an application of this algorithm in the context of the ADAPTS-HCT clinical trial, which uses a novel hierarchical RL algorithm in a dyadic patient-caregiver structure to improve medication adherence following hematopoietic stem cell transplantation.
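To illustrate the general idea of recovering an explicit sampling probability by Monte Carlo integration, the following is a minimal sketch and not the algorithm developed in this work. It assumes, purely for illustration, that each action's value has an independent Gaussian posterior and that the agent acts greedily on a single posterior draw; the function name `mc_action_probabilities` and its parameters are hypothetical. The probability that an action is selected is then estimated as the fraction of draws in which that action has the largest sampled value.

```python
import random


def mc_action_probabilities(means, stds, n_draws=10_000, seed=0):
    """Estimate P(action is selected) by Monte Carlo integration.

    Assumption (illustrative only): action values have independent
    Gaussian posteriors N(means[a], stds[a]^2), and the agent picks the
    action whose sampled value is largest.
    """
    rng = random.Random(seed)
    counts = [0] * len(means)
    for _ in range(n_draws):
        # One joint posterior draw of all action values.
        draws = [rng.gauss(m, s) for m, s in zip(means, stds)]
        # Greedy action for this draw.
        best = max(range(len(draws)), key=draws.__getitem__)
        counts[best] += 1
    # Empirical selection frequencies approximate the sampling probabilities.
    return [c / n_draws for c in counts]


# Two actions with posterior means 0.0 and 0.5 and unit standard deviations.
probs = mc_action_probabilities(means=[0.0, 0.5], stds=[1.0, 1.0])
print(probs)
```

In this two-action case the exact probability of selecting the second action is Φ(0.5/√2), roughly 0.64, so the estimate can be checked against a closed form. Probabilities estimated this way could then serve as the denominators of importance weights in an off-policy estimator, at the cost of Monte Carlo error that shrinks as `n_draws` grows.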