Publication: Integrative analysis of large-scale single-cell genomics data in immune-mediated diseases
Date
2023-05-02
Citation
Kang, Joyce Blossom. 2023. Integrative analysis of large-scale single-cell genomics data in immune-mediated diseases. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.
Abstract
Recent advances in single-cell RNA sequencing have enabled the fine-grained characterization of heterogeneous cell types and states. However, integrating cells and effectively aligning shared cell states from disparate sources, such as different datasets, individuals, tissues, and species, remains a key computational challenge. In this thesis, we develop and apply novel methods for integrating complex single-cell datasets, with a particular focus on applications to immune-mediated diseases.
We first introduce Symphony, an algorithm for building large-scale, integrated single-cell reference atlases in a convenient, portable format that enables mapping query datasets within seconds. Symphony precisely localizes query cells within a stable reference-defined cell embedding, enabling the reproducible downstream transfer of diverse types of cell annotations to the query. We benchmark the performance of Symphony and demonstrate its utility in multiple real-world examples, including inferring discrete cell types, positions along a continuous developmental trajectory, and surface protein expression.
Next, we conduct a large-scale integrative analysis to understand how cell states influence genetic regulation of gene expression within the HLA locus, which is critical in autoimmune and infectious diseases, transplantation, and cancer. By integrating >1.13 million immune cells from 1,073 individuals and three tissues, we identify cell-type-specific expression quantitative trait loci (eQTLs) for classical HLA genes. We show that many eQTL effects are dynamic across cell states, especially for HLA-DQ genes. Dynamic HLA regulation may contribute to interindividual variation in antigen presentation levels and immune responses.
Finally, we explore cross-species integration using single-cell data on myeloid cells from human patients with systemic lupus erythematosus (SLE) and from four mouse models of SLE, enabling the comparison of conserved and divergent gene expression patterns across cell states and species. We identify a classical monocyte cell state that is expanded in diseased mice and correlates with disease activity in human patients.
Overall, this work demonstrates the power of integrative single-cell analysis for mapping disease-relevant cell states and dynamic gene regulatory mechanisms across multiple data sources.
Keywords
data integration, gene regulation, HLA, immune-mediated diseases, single-cell, Bioinformatics, Genetics, Immunology
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service