Task-Relevant Generative Models and Safe Reinforcement Learning with Applications to Clinical Decision-Making

Date

2025-08-15

Citation

Sharma, Abhishek. 2025. Task-Relevant Generative Models and Safe Reinforcement Learning with Applications to Clinical Decision-Making. Doctoral Dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Clinical decision-making is inherently complex and high-stakes, frequently requiring sequential decisions under uncertainty. Automated discovery of clinical concepts and reinforcement learning (RL) have great potential to assist clinicians, yet current approaches face several practical challenges. First, generative models often fail to differentiate clinically relevant structures from irrelevant noise, compromising interpretability and predictive utility. Second, traditional RL approaches optimize for fixed objectives and thus cannot adequately accommodate varying clinical goals or patient-specific preferences. Finally, prevalent offline RL methods tend to be unsafe, overly conservative, or reliant on impractical assumptions, such as access to the behavior policy, which restricts their real-world usability.

This thesis introduces methods designed to address each of these challenges. First, Chapter 3 introduces prediction-focused Gaussian Mixture Models (pf-GMM) and Hidden Markov Models (pf-HMM) for identifying and clustering clinically relevant features from noisy, high-dimensional data. Second, Chapter 4 proposes a Robust Decision-Focused (RDF) model-based RL framework. This framework learns transition dynamics that perform consistently well across changing clinical reward preferences, ensuring high-quality decisions in diverse clinical scenarios. Third, Chapter 5 presents Decision-Point RL (DPRL), an offline RL methodology that identifies high-confidence "decision points" in clinical data for targeted, minimal policy adjustments, backed by theoretical safety guarantees. These contributions are validated through rigorous theoretical analyses and extensive empirical evaluations using synthetic benchmarks, medical simulators (e.g., cancer treatment), and real-world clinical datasets (e.g., a hypotension cohort from the MIMIC-IV dataset, an HIV dataset, and electronic health records from a hospital system).

The methods presented demonstrate improvements in predictive accuracy, clinical interpretability, robustness against reward shifts, and safety compared to conventional approaches. Collectively, these contributions advance the development of interpretable, robust, and safe machine learning methods, helping to bridge the gap between machine learning research and real-world clinical practice and to improve decision-making in healthcare.

Keywords

generative models, machine learning for healthcare, probabilistic machine learning, reinforcement learning, artificial intelligence, computer science

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth in the Terms of Service.
