Differential Abundant Cell Population Analysis in COVID-19 PBMC and Immune Checkpoint Blockade Single Cell RNA Sequencing Data
Citation: Qian, Gege. 2021. Differential Abundant Cell Population Analysis in COVID-19 PBMC and Immune Checkpoint Blockade Single Cell RNA Sequencing Data. Master's thesis, Harvard Medical School.
Abstract: Understanding the immunological changes underlying tumors and the disease microenvironments of other disorders, such as SARS-CoV-2 infection, has been challenging because it involves measuring genomic, epigenetic, and molecular changes in a myriad of cells. With the advent of single-cell technologies, it is now possible to assess the transcriptome and chromatin accessibility at single-cell resolution. These technologies are currently being deployed in cancer and immunological disorders to study the underlying immunological changes. These applications have also exposed the need for new statistical methods to handle the increasing data complexity of single-cell experiments.
One such application is characterizing transcriptomic profiles to identify differences in cell population abundance between two biological conditions, arguably the most fundamental application of scRNA-Seq analysis. However, current single-cell approaches perform this analysis at the sample level, which leaves insufficient statistical power to capture differential abundance because sample sizes in scRNA-Seq data are small. Further, they ignore scRNA-Seq-specific confounding factors such as inefficient extraction of genetic material, amplified sample-specific bias, and differences introduced by various sequencing techniques. Here we developed an in silico approach (scDiffPop) that performs a robust statistical analysis at the individual-cell level to determine biologically meaningful differences in cell type abundance. Compared with other methods, the commonly adopted DESeq is relatively robust to outliers and computationally efficient when dealing with large samples. However, its false discovery rate (FDR) control, like that of the other methods, is sensitive to sample size. scDiffPop pools related cell types based on their hierarchical relationship and performs sample-level DESeq on the larger meta-groups to gain stronger statistical power. After validating scDiffPop with several positive and negative tests, we applied it to COVID-19 and immune checkpoint blockade (ICB) peripheral blood mononuclear cell (PBMC) datasets to explore which cell populations are most responsible for the pathological phenotypes. In the COVID-19 scenario, we identified γδ T cells, IgG plasmablasts, and CD14+ monocytes as the crucial immune populations that respond most strongly to the viral infection. Applying scDiffPop to the Yuen et al. dataset, composed of 5 responders and 5 non-responders to anti-PD-1 treatment, we found that CXCR4- NK cells and RUNX3+ NK cells are enriched in responders, whereas monocyte populations are more abundant in non-responders.
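
The pooling idea described in the abstract can be illustrated with a minimal sketch. This is not the scDiffPop implementation: the hierarchy, the cell-type labels, the counts, and the helper name `pool_counts` are all hypothetical and chosen only to show how per-sample counts for leaf cell types might be summed into larger meta-groups before a sample-level test is run on them.

```python
from collections import defaultdict

# Hypothetical cell-type hierarchy (child -> parent); not taken from the thesis.
PARENT = {
    "CD14+ Monocyte": "Monocyte",
    "CD16+ Monocyte": "Monocyte",
    "Monocyte": "Myeloid",
    "NK": "Lymphoid",
    "T cell": "Lymphoid",
    "Myeloid": "PBMC",
    "Lymphoid": "PBMC",
}


def pool_counts(leaf_counts, parent=PARENT):
    """Add each leaf cell type's count to itself and to every ancestor."""
    pooled = defaultdict(int)
    for cell_type, n in leaf_counts.items():
        node = cell_type
        while node is not None:
            pooled[node] += n
            node = parent.get(node)  # None once the root is passed
    return dict(pooled)


# Made-up cell counts for a single sample.
sample = {"CD14+ Monocyte": 120, "CD16+ Monocyte": 30, "NK": 50, "T cell": 300}
pooled = pool_counts(sample)
```

Under these assumptions, each meta-group (for example `Monocyte` or `Lymphoid`) carries a larger count per sample than any of its children, which is the property the abstract relies on for added statistical power. The differential test itself is outside the scope of this sketch.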
Citable link to this page: https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37368632