
dc.contributor.advisor: Huttenhower, Curtis
dc.contributor.author: Nchinda-Pungong, Nkaziewoh Ndabong
dc.date.accessioned: 2021-07-19T04:12:53Z
dash.embargo.terms: 2022-06-23
dc.date.created: 2021
dc.date.issued: 2021-06-23
dc.date.submitted: 2021
dc.identifier.citation: Nchinda-Pungong, Nkaziewoh Ndabong. 2021. Identifying and Quantifying Novel Bacteria. Bachelor's thesis, Harvard College.
dc.identifier.other: 28411043
dc.identifier.uri: https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37368578
dc.description.abstract: The tree of life lies at the heart of biology, but major gaps persist among bacteria. Attempts to identify these missing microbes face challenges in determining which organisms are poorly characterized and where to find them. Here, we have devised a bioinformatics-based pipeline for identifying novel organisms and assessing their relative abundance in different environments based on 16S sequences. Using data from GTDB, we validate that the 16S V4 region can be used to estimate the novelty of an organism’s whole genome. Then, we apply the pipeline to 16S SILVA data, estimating how many organisms remain to be discovered at each taxonomic level. We also determine that V4 sequencing is likely to underestimate genome novelty relative to the full 16S. Next, we apply the pipeline to datasets from the Earth Microbiome Project, assessing the relative abundance of novel organisms in different environments. Our results indicate that soil samples contain the highest volume of novel bacteria, but the optimal environment for microbial discovery varies based on the desired taxonomic level of novel organisms and laboratory sequencing capacity. We then apply the pipeline to standardized samples collected from several environments, determining that salt marsh soil contains a high density of novel organisms. Lastly, we use the pipeline to enrich one marsh sample for novel organisms, assembling a novel Gracilibacteria genome in the process. This pipeline allows researchers to compare environments for microbial sequencing and enrich for novel organisms, speeding up the rate at which we discover novel bacteria.
dc.format.mimetype: application/pdf
dc.language.iso: en
dash.license: LAA
dc.subject: 16S
dc.subject: Bacteria
dc.subject: Bioinformatics
dc.subject: Metagenomics
dc.subject: Taxonomy
dc.subject: Bioengineering
dc.subject: Bioinformatics
dc.subject: Microbiology
dc.title: Identifying and Quantifying Novel Bacteria
dc.type: Thesis or Dissertation
dash.depositing.author: Nchinda-Pungong, Nkaziewoh Ndabong
dash.embargo.until: 2022-06-23
dc.date.available: 2021-07-19T04:12:53Z
thesis.degree.date: 2021
thesis.degree.grantor: Harvard College
thesis.degree.level: Bachelor's
thesis.degree.level: Undergraduate
thesis.degree.name: AB
dc.contributor.committeeMember: Eddy, Sean
dc.contributor.committeeMember: Cluzel, Philippe
dc.type.material: text
thesis.degree.department: Biomedical Engineering AB
dc.identifier.orcid: 0000-0002-8877-5200
dash.author.email: nkazinchinda@gmail.com

