Publication:
Integrative analysis of large-scale single-cell genomics data in immune-mediated diseases

Date

2023-05-02

Citation

Kang, Joyce Blossom. 2023. Integrative analysis of large-scale single-cell genomics data in immune-mediated diseases. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Recent advances in single-cell RNA sequencing have enabled the fine-grained characterization of heterogeneous cell types and states. However, integrating cells and effectively aligning shared cell states from disparate sources, such as different datasets, individuals, tissues, and species, remains a key computational challenge. In this thesis, we develop and apply novel methods for integrating complex single-cell datasets, with particular applications towards immune-mediated diseases. We first introduce Symphony, an algorithm for building large-scale, integrated single-cell reference atlases in a convenient, portable format that enables mapping query datasets within seconds. Symphony precisely localizes query cells within a stable reference-defined cell embedding, enabling the reproducible downstream transfer of diverse types of cell annotations to the query. We benchmark the performance of Symphony and demonstrate its utility in multiple real-world examples, including inferring discrete cell types, positions along a continuous developmental trajectory, and surface protein expression. Next, we conduct a large-scale integrative analysis to understand how cell states influence genetic regulation of gene expression within the HLA locus, which is critical in autoimmune and infectious diseases, transplantation, and cancer. By integrating >1.13 million immune cells from 1,073 individuals and three tissues, we identify cell-type-specific expression quantitative trait loci (eQTLs) for classical HLA genes. We show that many eQTL effects are dynamic across cell states, especially for HLA-DQ genes. Dynamic HLA regulation may contribute to interindividual variation in antigen presentation levels and immune responses. 
Finally, we explore cross-species integration using single-cell data from myeloid cells of human patients with systemic lupus erythematosus (SLE) and four mouse models of SLE, enabling the comparison of conserved and divergent gene expression patterns across cell states and species. We identify a classical monocyte cell state that is expanded in diseased mice and correlates with disease activity in human patients. Overall, this work demonstrates the power of integrative single-cell analysis for mapping disease-relevant cell states and dynamic gene regulatory mechanisms across multiple data sources.

Keywords

data integration, gene regulation, HLA, immune-mediated diseases, single-cell, Bioinformatics, Genetics, Immunology

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth in the Terms of Service.
