|dc.description.abstract||The regulation of gene expression determines the molecular tools a cell has at its disposal to perform its basic functions, respond to its environment, and replicate itself and its genetic material. With the development of single-cell transcriptomics, we have an unprecedented ability to observe the gene expression of individual cells and how it varies across cell types and environmental conditions. However, a major challenge in biology remains understanding the complex biochemical circuitry that underlies expression and how it is corrupted in disease.
In this thesis, I describe two projects utilizing single-cell transcriptomics to characterize and interpret gene expression. In the first project, we develop a tool for exploring single-cell data, consensus non-negative matrix factorization (cNMF), which identifies signatures of co-varying genes and their relative contributions to each cell. Unlike hard clustering, cNMF models each cell as a weighted mixture of expression signatures, allowing it to account for multiple independent axes of variation within a cell. We show with simulated data and in several real datasets that the signatures inferred by cNMF can frequently be interpreted as “identity” programs that are characteristic of discrete cell types, or “activity” programs that are induced in cells carrying out specific activities (e.g., cell replication, adaptation to a hypoxic environment, or response to signals from neighboring cells). We identify unexpected activity programs that reproduce across multiple datasets, highlighting cNMF’s ability to draw out the latent sources of variation within cells in a tissue.
In the second project, we characterize changes in the transcriptomes and in selected protein markers of circulating immune cells in a lethal challenge model of Ebola virus (EBOV) in rhesus macaques. We describe the dynamics of the host response in each circulating immune cell type and relate these to the clinical phenomena of Ebola virus disease. For example, T cells acquire an apoptosis expression signature consistent with observed lymphopenia, and monocytes lose expression of MHC class II proteins, which could underlie the lack of an adaptive immune response observed in severe clinical cases. Moreover, as EBOV has an RNA genome and makes poly-adenylated mRNA, we detect infected cells and identify host expression markers associated with tropism. We find that monocytes expressing a macrophage-differentiation program and co-expressing CD14 and CD16 are the predominant target of EBOV among circulating cells in vivo. Finally, while virtually all uninfected cells express the anti-viral interferon response program, we show that EBOV suppresses this program in infected cells while up-regulating putative pro-viral genes such as DYNLL1 and HSPA5. We thus demonstrate that EBOV shapes the gene expression of infected cells to evade the antiviral response and enhance its own replication.
In summary, this thesis describes two projects centered on the analysis of single-cell data. The first develops a general approach for discerning the gene expression programs underlying cell type and cellular activities. The second employs this method as part of a general analysis of the dynamics of the host response to Ebola virus infection. Together, this body of work advances both our ability to interpret transcriptomic data and our specific understanding of the immune response to Ebola virus.||