Publication:

Advancing Molecular and Functional Understanding of Cells with Artificial Intelligence

Date

2025-06-05

The Harvard community has made this article openly available.

Citation

He, Yichun. 2025. Advancing Molecular and Functional Understanding of Cells with Artificial Intelligence. Doctoral Dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Rapid advancements in biotechnology are transforming our ability to measure biological systems across multiple modalities and scales. On the molecular level, spatially resolved single-cell transcriptomics enables comprehensive profiling of gene expression while preserving the spatial architecture of tissues. On the functional level, flexible brain-machine interfaces permit stable, long-term recordings of single-neuron activity throughout behavioral learning and disease progression. These innovations have increasingly shifted the life sciences toward a data-centric paradigm. However, traditional computational modeling approaches remain limited in their capacity to extract meaningful biological insights from heterogeneous, high-dimensional data, especially for decoding complex cell states and functions across space and time. To address this critical gap, this dissertation introduces a suite of computational methods that integrate artificial intelligence and machine learning (AI/ML) with cutting-edge biotechnologies. These methods are designed to interpret large-scale, multimodal biological data and feed insights back into experimental design for iterative discovery. Chapter 2 presents ClusterMap, a spatially informed, unsupervised clustering framework for single-cell and tissue segmentation directly from in situ transcriptomic data. Building on this, Chapter 3 constructs a comprehensive spatial atlas of the mouse central nervous system by integrating single-cell gene expression and spatial data at subcellular resolution. To generalize spatial analysis across datasets and technologies, Chapter 4 introduces FuseMap, a universal deep-learning framework that harmonizes multiple brain atlases into a common coordinate framework, enabling gene imputation, tissue region annotation, and cross-dataset integration.
Expanding beyond transcriptomics, Chapter 5 introduces AutoSort, a real-time multimodal spike sorting algorithm for stable long-term neural recordings, and UnitedNet, a multi-task learning model that jointly performs cell-type identification, cross-modal prediction, and feature relevance discovery across diverse single-cell modalities. In summary, this dissertation presents novel AI-powered frameworks that bridge molecular and functional modalities at the single-cell level. These approaches not only enhance our ability to decode the complex architecture and dynamics of biological systems but also provide a foundation for future integrative studies in development, disease, and therapeutic response.

Keywords

Bioengineering

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
