
Trustworthy Machine Learning for Medicine


Date

2025-04-22




Citation

Han, Tessa. 2025. Trustworthy Machine Learning for Medicine. Doctoral Dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Machine learning is achieving significant breakthroughs in various applications, including medical research, from analyzing genomic data to accelerating drug discovery and designing personalized treatment plans for patients. However, as machine learning is applied to such high-stakes domains where model errors and biases can adversely impact human lives, there is a growing focus on building models that not only have high accuracy but that are also trustworthy. This dissertation studies three areas of trustworthy machine learning – interpretability, robustness, and safety alignment – and addresses key challenges in each area. In the area of interpretability, we develop a theoretical framework to understand the mathematical properties of explanation methods, elucidating their commonalities and differences, explaining why different methods can generate disagreeing explanations, and providing a principled approach to select among methods. In the area of robustness, we develop algorithms to efficiently estimate a model’s average-case robustness, enabling an accurate and efficient characterization of real-world model behavior for large-scale applications. Lastly, in the area of safety alignment, we develop a novel benchmark dataset and evaluate and improve the medical safety of large language models, finding that publicly-available medical large language models do not meet medical safety standards and that fine-tuning them on safety demonstrations can improve their safety while preserving their medical knowledge. Altogether, this research advances the conceptual understanding and practical application of trustworthy machine learning, especially in the medical domain, and paves the way for future research.


Keywords

Computer science, Bioinformatics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
