Publication:

Regularization in Reinforcement Learning: Equivalences and Novel Methods

Date

2025-05-12

The Harvard community has made this article openly available.

Citation

Rathnam, Sarah V. 2025. Regularization in Reinforcement Learning: Equivalences and Novel Methods. Doctoral Dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Reinforcement learning (RL) is a powerful framework for sequential decision-making, with applications ranging from robotics to healthcare. However, in real-world settings such as mobile health (mHealth), RL faces challenges due to limited data and the need to generalize beyond observed experience. Regularization, a set of techniques that constrain model complexity to prevent overfitting and promote generalization, plays a crucial role in overcoming these challenges. This dissertation critically examines existing RL regularization methods, uncovers novel connections between them, and introduces new approaches inspired by the challenges of mobile health studies.

One focus of this work is establishing theoretical connections between existing regularization methods. We prove that discount regularization produces the same optimal policy as a Bayesian prior on the transition function and a penalized Q-function, and is also equivalent to a truncated lambda return. These relationships reveal underlying assumptions and limitations of discount regularization.
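As a rough illustration of the kind of equivalence described above, the sketch below compares two planners on a small random tabular MDP. It assumes that discount regularization means planning with a discount factor `gamma_reg` lower than the evaluation discount `gamma`, and it stands in for the Bayesian prior with a simple mixture of the transition model and a uniform next-state distribution, weighted by `gamma_reg / gamma`. The mixture form, the function names, and the MDP itself are illustrative assumptions, not the dissertation's construction.

```python
import random

def value_iteration(T, R, gamma, iters=2000):
    """Return the greedy policy and Q-table of a tabular MDP.

    T[s][a] is a list of next-state probabilities; R[s][a] is a reward.
    """
    n_states, n_actions = len(R), len(R[0])
    V = [0.0] * n_states
    for _ in range(iters):
        Q = [[R[s][a] + gamma * sum(p * v for p, v in zip(T[s][a], V))
              for a in range(n_actions)] for s in range(n_states)]
        V = [max(row) for row in Q]
    policy = [max(range(n_actions), key=lambda a: Q[s][a])
              for s in range(n_states)]
    return policy, Q

def random_mdp(n_states, n_actions, rng):
    """Build a hypothetical MDP with random transitions and rewards."""
    T = []
    for _ in range(n_states):
        row = []
        for _ in range(n_actions):
            weights = [rng.random() for _ in range(n_states)]
            total = sum(weights)
            row.append([w / total for w in weights])
        T.append(row)
    R = [[rng.random() for _ in range(n_actions)] for _ in range(n_states)]
    return T, R

def mix_with_uniform(T, weight):
    """Shrink every transition distribution toward the uniform distribution."""
    uniform = 1.0 / len(T)
    return [[[weight * p + (1.0 - weight) * uniform for p in dist]
             for dist in row] for row in T]

rng = random.Random(0)
T, R = random_mdp(n_states=5, n_actions=3, rng=rng)
gamma, gamma_reg = 0.95, 0.80

# Planner 1: original model, lower (regularized) discount factor.
policy_discount, _ = value_iteration(T, R, gamma_reg)

# Planner 2: model shrunk toward uniform (assumed prior), original discount.
T_mixed = mix_with_uniform(T, gamma_reg / gamma)
policy_prior, _ = value_iteration(T_mixed, R, gamma)

print("discount-regularized policy:", policy_discount)
print("prior-regularized policy:   ", policy_prior)
```

In this sketch the uniform component adds the same amount to the value of every state-action pair, so the ranking of actions in each state is unchanged and the two planners choose the same greedy policy. The mixture weight is the same for every state-action pair, regardless of how much data supports each one.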

A second focus of this work is the introduction of novel regularization methods. First, we introduce a state-action-specific regularization method that mitigates the limitations of discount regularization uncovered in our analysis. Second, we propose a novel regularization approach based on Bayesian hypothesis testing that leverages data from a prior study to improve learning while adapting to differences between the environments of the prior and current studies. This is particularly useful in mobile health applications, where feedback is sparse and exploration is limited.

Through theoretical analysis and empirical validation, this dissertation advances the understanding of RL regularization methods and introduces new techniques that enhance generalization in data-constrained environments. These contributions provide a principled foundation for improving RL applications in healthcare and beyond.

Keywords

machine learning, regularization, reinforcement learning, applied mathematics, computer science, statistics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
