Identifying and Quantifying Novel Bacteria
Nchinda-Pungong, Nkaziewoh Ndabong
Citation: Nchinda-Pungong, Nkaziewoh Ndabong. 2021. Identifying and Quantifying Novel Bacteria. Bachelor's thesis, Harvard College.
Abstract: The tree of life lies at the heart of biology, but major gaps persist among bacteria. Attempts to identify these missing microbes face challenges in determining which organisms are poorly characterized and where to find them. Here, we have devised a bioinformatics-based pipeline for identifying novel organisms and assessing their relative abundance in different environments based on 16S sequences. Using data from GTDB, we validate that the 16S V4 region can be used to estimate the novelty of an organism’s whole genome. Then, we apply the pipeline to 16S SILVA data, estimating how many organisms remain to be discovered at each taxonomic level. We also determine that V4 sequencing is likely to underestimate genome novelty relative to the full 16S. Next, we apply the pipeline to datasets from the Earth Microbiome Project, assessing the relative abundance of novel organisms in different environments. Our results indicate that soil samples contain the highest volume of novel bacteria, but the optimal environment for microbial discovery varies based on the desired taxonomic level of novel organisms and laboratory sequencing capacity. We then apply the pipeline to standardized samples collected from several environments, determining that salt marsh soil contains a high density of novel organisms. Lastly, we use the pipeline to enrich one marsh sample for novel organisms, assembling a novel Gracilibacteria genome in the process. This pipeline allows researchers to compare environments for microbial sequencing and enrich for novel organisms, speeding up the rate at which we discover novel bacteria.
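The abstract does not spell out how novelty is scored, so the following is only a minimal sketch of one way the idea could be expressed, not the thesis's actual pipeline. It assumes each 16S sequence has already been aligned to its closest reference, scores novelty by percent identity, and reports the share of reads that fall below a cutoff. The cutoff values, function names, and sample data are all illustrative assumptions.

```python
# Illustrative sketch only: names, cutoffs, and data are assumptions,
# not taken from the thesis.

# Hypothetical 16S percent-identity cutoffs, ordered from broadest rank
# to narrowest. A query whose best hit falls below a cutoff is treated
# as novel at that rank.
THRESHOLDS = [
    ("phylum", 75.0),
    ("class", 78.5),
    ("order", 82.0),
    ("family", 86.5),
    ("genus", 94.5),
    ("species", 98.7),
]


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Columns where both sequences have a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def novelty_level(best_identity: float):
    """Broadest rank at which the query looks novel, or None if it is known."""
    for rank, cutoff in THRESHOLDS:
        if best_identity < cutoff:
            return rank
    return None


def novel_fraction(read_counts: dict, best_identities: dict, rank: str) -> float:
    """Fraction of reads in a sample whose sequence is novel at `rank`.

    A sequence novel at a broader rank also counts as novel at `rank`,
    because its best identity is below the narrower cutoff as well.
    """
    cutoff = dict(THRESHOLDS)[rank]
    total = sum(read_counts.values())
    if total == 0:
        return 0.0
    novel = sum(
        count
        for seq_id, count in read_counts.items()
        if best_identities[seq_id] < cutoff
    )
    return novel / total


if __name__ == "__main__":
    # Made-up sample: two sequence variants with read counts and best-hit identities.
    counts = {"asv_a": 30, "asv_b": 70}
    identities = {"asv_a": 90.0, "asv_b": 99.5}
    print(novelty_level(identities["asv_a"]))
    print(novel_fraction(counts, identities, "genus"))
```

Under these assumptions, comparing `novel_fraction` across samples at a chosen rank is one way to rank environments by their density of uncharacterized organisms.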
Citable link to this page: https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37368578
- FAS Theses and Dissertations