A Case Study for Large-Scale Human Microbiome Analysis Using JCVI’s Metagenomics Reports (METAREP)

Author
Goll, Johannes
Thiagarajan, Mathangi
Abubucker, Sahar
Huttenhower, Curtis
Yooseph, Shibu
Methé, Barbara A.
Published Version
https://doi.org/10.1371/journal.pone.0029044
Citation
Goll, Johannes, Mathangi Thiagarajan, Sahar Abubucker, Curtis Huttenhower, Shibu Yooseph, and Barbara A. Methé. 2012. A case study for large-scale human microbiome analysis using JCVI’s metagenomics reports (metarep). PLoS ONE 7(6): e29044.
Abstract
As metagenomic studies continue to increase in number, sequence volume and complexity, the scalability of biological analysis frameworks has become a rate-limiting factor to meaningful data interpretation. To address this issue, we have developed JCVI Metagenomics Reports (METAREP) as an open source tool to query, browse, and compare extremely large volumes of metagenomic annotations. Here we present improvements to this software, including the implementation of a dynamic weighting of taxonomic and functional annotation, support for distributed searches, advanced clustering routines, and integration of additional annotation input formats. The utility of these improvements to data interpretation is demonstrated through the application of multiple comparative analysis strategies to shotgun metagenomic data produced by the National Institutes of Health Roadmap for Biomedical Research Human Microbiome Project (HMP) (http://nihroadmap.nih.gov). Specifically, the scalability of the dynamic weighting feature is evaluated and established by its application to the analysis of over 400 million weighted gene annotations derived from 14 billion short reads as predicted by the HMP Unified Metabolic Analysis Network (HUMAnN) pipeline. Further, the capacity of METAREP to facilitate the identification and simultaneous comparison of taxonomic and functional annotations, including biological pathway and individual enzyme abundances from hundreds of community samples, is demonstrated by providing scenarios that describe how these data can be mined to answer biological questions related to the human microbiome. These strategies provide users with a reference for conducting similar large-scale metagenomic analyses using METAREP with their own sequence data, while in this study they reveal insights into the nature and extent of variation in taxonomic and functional profiles across body habitats and individuals.
Over one thousand HMP WGS datasets and the latest open source code are available at http://www.jcvi.org/hmp-metarep.
Other Sources
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3374610/pdf/
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Citable link to this page
http://nrs.harvard.edu/urn-3:HUL.InstRepos:10497282
Collections
- SPH Scholarly Articles [6329]
Related items
Showing items related by title, author, creator and subject.
- A genome-to-genome analysis of associations between human genetic variation, HIV-1 sequence diversity, and viral control
  Bartha, István; Carlson, Jonathan M; Brumme, Chanson J; McLaren, Paul J; Brumme, Zabrina L; John, Mina; Haas, David W; Martinez-Picado, Javier; Dalmau, Judith; López-Galíndez, Cecilio; Casado, Concepción; Rauch, Andri; Günthard, Huldrych F; Bernasconi, Enos; Vernazza, Pietro; Klimkait, Thomas; Yerly, Sabine; O’Brien, Stephen J; Listgarten, Jennifer; Pfeifer, Nico; Lippert, Christoph; Fusi, Nicolo; Kutalik, Zoltán; Allen, Todd M; Müller, Viktor; Harrigan, P Richard; Heckerman, David; Telenti, Amalio; Fellay, Jacques (eLife Sciences Publications, Ltd, 2013)
  HIV-1 sequence diversity is affected by selection pressures arising from host genomic factors. Using paired human and viral data from 1071 individuals, we ran >3000 genome-wide scans, testing for associations between host ...
- MetaRef: a pan-genomic database for comparative and community microbial genomics
  Huang, Katherine; Brady, Arthur; Mahurkar, Anup; White, Owen; Gevers, Dirk; Huttenhower, Curtis; Segata, Nicola (Oxford University Press, 2013)
  Microbial genome sequencing is one of the longest-standing areas of biological database development, but high-throughput, low-cost technologies have increased its throughput to an unprecedented number of new genomes per ...
- The Global Invertebrate Genomics Alliance (GIGA): Developing Community Resources to Study Diverse Invertebrate Genomes
  Giribet, Gonzalo; Bracken-Grissom, Heather; Collins, Allen G.; Collins, Timothy; Crandall, Keith; Distel, Daniel; Dunn, Casey; Haddock, Steven; Knowlton, Nancy; Martindale, Mark; Medina, Mónica; Messing, Charles; O'Brien, Stephen J.; Paulay, Gustav; Putnam, Nicolas; Ravasi, Timothy; Rouse, Greg W.; Ryan, Joseph F.; Schulze, Anja; Wörheide, Gert; Adamska, Maja; Bailly, Xavier; Browne, William E.; Diaz, M. Christina; Evans, Nathaniel; Flot, Jean-François; Gofarty, Nicole; Johnston, Matthew; Kamel, Bishoy; Kawahara, Akito Y.; Laberge, Tammy; Lavrov, Dennis; Michonneau, François; Moroz, Leonid L.; Oakley, Todd; Osborne, Karen; Pomponi, Shirley A.; Rhodes, Adelaide; Rodriguez-Lanetty, Mauricio; Santos, Scott R.; Thacker, Robert W.; Van de Peer, Yves; Santos, Scott R.; Satoh, Nori; Thacker, Robert W.; Voolstra, Christian R.; Welch, David Mark; Winston, Judith; Zhou, Xin (Oxford University Press, 2013)
  Over 95% of all metazoan (animal) species comprise the “invertebrates,” but very few genomes from these organisms have been sequenced. We have, therefore, formed a “Global Invertebrate Genomics Alliance” (GIGA). Our intent ...