Publication:
What's Missing from Machine Learning for Medicine? New Methods for Causal Effect Estimation and Representation Learning from EHR Data

Date

2023-05-09

Published Version

Journal Title

Journal ISSN

Volume Title

Publisher

The Harvard community has made this article openly available.


Journal Issue

Citation

Bellamy, David. 2023. What's Missing from Machine Learning for Medicine? New Methods for Causal Effect Estimation and Representation Learning from EHR Data. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Research Data

Abstract

This thesis explores applications of deep learning to clinical and epidemiologic data analysis, focusing on neural networks for causal effect estimation and clinical risk prediction. I claim that neural networks have a significant role to play in the future of causal inference, and in Chapter 2 I present empirical results demonstrating the superiority of neural networks for solving the fundamental integral equation of proximal inference. In Chapters 3 and 4, I discuss the limitations of deep learning in the context of clinical risk prediction. Chapter 3 analyzes the field's progress on modeling structured medical data with deep learning and shows that it has stagnated. I propose several reasons for this stagnation and address some of them with a novel Transformer architecture in Chapter 4. This pre-trained Transformer, called Labrador, does not consistently outperform tree-based methods on downstream fine-tuning tasks despite strong pre-training results. This observation motivates several concluding arguments that I present in the final chapter. Among them, I emphasize that the full potential of deep learning in medical artificial intelligence has yet to be realized. To realize some of this potential, I argue that multi-modal models are required and that coordinated institutional efforts will be necessary to assemble the resources for large-scale data collection and model training. Finally, I discuss some of the remarkable capabilities recently observed in GPT-4 and hypothesize that large language models, despite being purely predictive in nature, can learn about causality by training on sufficiently large datasets. I claim that this emergent understanding of causality and acquisition of world models should be anticipated as an outcome of next-token prediction. I use the human cortex as an existence proof for this claim and draw a connection between next-token prediction and the predictive coding theory of information processing in the cortex. In conclusion, I discuss the implications of large language models for the practice of causal inference, and vice versa.
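For readers unfamiliar with proximal causal inference, the "fundamental integral equation" the abstract refers to is, in the standard formulation of the proximal inference literature, the outcome bridge function equation. The notation below is illustrative (it is not taken from the thesis itself): Y is the outcome, A the treatment, X measured covariates, Z a treatment-inducing proxy of the unmeasured confounder, and W an outcome-inducing proxy.

```latex
% Outcome bridge function equation of proximal causal inference.
% A solution h(w, a, x) is called an outcome bridge function; the
% equation is a Fredholm integral equation of the first kind in h.
\mathbb{E}\,[\,Y \mid Z, A, X\,]
  = \int h(w, A, X)\, \mathrm{d}F(w \mid Z, A, X)
  = \mathbb{E}\big[\, h(W, A, X) \mid Z, A, X \,\big]
```

Under suitable completeness conditions, a solution h identifies counterfactual means via E[h(W, a, X)], which is why solving this equation accurately (e.g., with neural networks, as Chapter 2 argues) matters for effect estimation.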

Description

Other Available Sources

Keywords

Causal Inference, Deep Learning, Electronic Health Records, Healthcare, Medical AI, Neural Networks, Artificial intelligence, Epidemiology, Statistics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth in the Terms of Service.

Endorsement

Review

Supplemented By

Referenced By
