Towards Practical Applications of Machine Learning in Healthcare with Federated Learning

Date

2024-05-13

The Harvard community has made this article openly available.

Citation

Zhong, Aoxiao. 2024. Towards Practical Applications of Machine Learning in Healthcare with Federated Learning. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Federated Learning (FL) has emerged as a significant tool in healthcare machine learning, enabling institutions to collaboratively train models while maintaining data privacy. This dissertation describes the implementation of a real-world healthcare FL project and addresses the challenge of domain shift for more effective model deployment.

We begin by detailing a practical application of FL during the SARS-CoV-2 pandemic. Twenty institutions collaborated on a healthcare FL study to develop EXAM (EMR CXR AI Model), which predicts the future oxygen requirements of symptomatic patients from vital signs, laboratory data, and chest X-rays. EXAM achieved an Area Under the Curve (AUC) of over 0.92, improving on locally trained models by 16% and increasing generalizability by 38%. This project demonstrated FL's ability to enable rapid scientific collaboration without data exchange, producing a model that generalized across heterogeneous, unharmonized datasets and provided the healthcare community with a validated tool to combat COVID-19.

Next, we address a specialized non-IID FL challenge termed Domain-mixed FL, in which each client's data is assumed to be a mixture of several predefined domains. We propose a novel method, FedDAR, which learns a shared domain representation and personalized prediction models in a decoupled manner. We prove that FedDAR achieves linear convergence in simplified settings, and extensive empirical studies on both synthetic and real-world datasets show that it outperforms existing FL methods.

Finally, we explore the multi-dimensional domain shift problem prevalent in healthcare ML applications. We introduce a novel strategy using an ensemble of mixtures of experts (EMoE), each expert tailored to adapt to shifts along different dimensions. This approach is designed to be versatile and robust, suitable for both centralized and federated learning settings. Rigorous testing on various real-world datasets has shown that our method outperforms contemporary domain generalization and personalized federated learning approaches, effectively managing the complexities of multi-dimensional domain shifts.

Keywords

Federated Learning, Healthcare, Machine Learning, Electrical engineering

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
