Publication:

Learning gene regulation with cross-modal integration of observations and perturbations

Date

2025-06-05

Citation

Ryu, Jayoung Kim. 2025. Learning gene regulation with cross-modal integration of observations and perturbations. Doctoral Dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Biological systems are studied by observing and perturbing them. The recent abundance of omics data across modalities, at ever higher resolution and scale, together with an explosion of perturbation techniques, creates critical opportunities for learning gene regulation. This PhD dissertation presents innovative cross-modal integration approaches for understanding gene regulatory mechanisms. It demonstrates how such integration provides multi-dimensional insights into gene regulation: how components in multiple modalities interact across cellular contexts, and how perturbations and sequence variants affect those axes of gene regulation.

Specifically, I developed four computational methods, CORAL, SIMBA+, BEAN, and Labeled Gromov-Wasserstein Optimal Transport, each of which tackles a distinct task toward a better understanding of gene regulation mechanisms, using different statistical and machine learning models. CORAL enables upstream and downstream transfer learning on single-cell multiomics data. SIMBA+ provides cell-state-specific mechanistic insights into sequence variants and phenotypes by integrating single-cell multiomics data with genome-wide association studies. BEAN integrates genotype and phenotype from high-throughput CRISPR base editing screens to improve the power of the screen. Labeled Gromov-Wasserstein Optimal Transport integrates multimodal single-cell perturbation screens for cross-modality response prediction and feature interpretation.

Overall, this PhD dissertation takes a major step forward in machine learning and statistical approaches for learning gene regulation through cross-modal integration of observational and perturbational datasets. Through this integration, I provide major insights into gene regulation, variant interpretation, disease mechanisms, and clinical discovery.

Keywords

Algorithm, CRISPR, Genomics, Machine Learning, Perturbation, Bioinformatics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth in the Terms of Service.