Publication:
New methods for representation learning, uncertainty quantification, and causal inference in biomedical machine learning

Date

2022-09-07

Published Version

Citation

Kompa, Benjamin. 2022. New methods for representation learning, uncertainty quantification, and causal inference in biomedical machine learning. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Machine learning has made tremendous progress in recent years, driven by large datasets, increased computational capacity, and the unreasonable effectiveness of inductive biases for text and image data. These same inductive biases (e.g., convolutions) have also proved to be natural fits for biomedical data, from genomes to medical records. However, biomedical applications of machine learning face additional challenges, such as privacy concerns, dataset shift after deployment, and the estimation of causal effects in the presence of confounding. In this dissertation, we develop a model that learns an embedding space of medical concepts across multiple private sources of healthcare data and, in the cui2vec R package, provide new benchmarks for assessing models’ understanding of medical knowledge. We introduce a Unified Feature Disentanglement Network, trained on the Cancer Genome Atlas, that can yield insights into key genes in oncological development. Additionally, we examine the coverage properties of popular approximate Bayesian machine learning models and find that they fail to adequately adjust their uncertainty measures under dataset shift. Finally, we present a new neural-network-based approach for estimating causal effects in the presence of unmeasured confounding. Collectively, these methods address core challenges for biomedical applications of machine learning and provide foundations for future research directions.

Keywords

Bioinformatics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
