Publication:

MetaGen: reference-free learning with multiple metagenomic samples

Date

2017

Publisher

BioMed Central

Citation

Xing, Xin, Jun S. Liu, and Wenxuan Zhong. 2017. “MetaGen: reference-free learning with multiple metagenomic samples.” Genome Biology 18 (1): 187. doi:10.1186/s13059-017-1323-y. http://dx.doi.org/10.1186/s13059-017-1323-y.

Abstract

A major goal of metagenomics is to identify and study the entire collection of microbial species in a set of targeted samples. We describe a statistical metagenomic algorithm that simultaneously identifies microbial species and estimates their abundances without using reference genomes. As a trade-off, the method requires multiple metagenomic samples, usually ≥10, to produce highly accurate binning results. Compared with reference-free methods based primarily on k-mer distributions or coverage information, the proposed approach achieves higher species binning accuracy and is particularly powerful when sequencing coverage is low. We demonstrate the performance of the new method in both simulation and real metagenomic studies. The MetaGen software is available at https://github.com/BioAlgs/MetaGen.

Electronic supplementary material: The online version of this article (doi:10.1186/s13059-017-1323-y) contains supplementary material, which is available to authorized users.
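
To make the idea of binning from multiple samples concrete, the following is a minimal illustrative sketch and not the MetaGen implementation. It assumes each contig is summarized by its read counts across samples and fits a plain multinomial mixture by EM, so that contigs whose counts vary across samples in the same way fall into the same bin. The function name `multinomial_mixture_em`, its parameters, the farthest-point initialization, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np


def multinomial_mixture_em(counts, n_bins, n_iter=50, pseudo=1e-3):
    """Cluster rows of a (contigs x samples) count matrix with a multinomial mixture.

    Illustrative only: a generic EM fit, not the algorithm from the paper.
    """
    counts = np.asarray(counts, dtype=float)
    # Per-contig relative abundance across samples (smoothed to avoid zeros).
    freqs = (counts + pseudo) / (counts + pseudo).sum(axis=1, keepdims=True)

    # Deterministic farthest-point initialization of the bin profiles.
    chosen = [0]
    for _ in range(1, n_bins):
        dist = np.min([np.abs(freqs - freqs[c]).sum(axis=1) for c in chosen], axis=0)
        chosen.append(int(np.argmax(dist)))
    profiles = freqs[chosen].copy()
    weights = np.full(n_bins, 1.0 / n_bins)

    for _ in range(n_iter):
        # E-step: posterior bin membership, computed in log space for stability.
        log_post = counts @ np.log(profiles).T + np.log(weights)
        log_post -= log_post.max(axis=1, keepdims=True)
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: update mixing weights and per-bin sample profiles.
        weights = resp.mean(axis=0) + 1e-12
        weights /= weights.sum()
        profiles = resp.T @ counts + pseudo
        profiles /= profiles.sum(axis=1, keepdims=True)

    return resp.argmax(axis=1), profiles, weights


# Synthetic example: two "species" whose abundance varies differently over 10 samples.
rng = np.random.default_rng(0)
profile_a = np.arange(1, 11) / np.arange(1, 11).sum()
profile_b = profile_a[::-1]
counts = np.vstack([
    rng.multinomial(200, profile_a, size=20),  # 20 contigs from species A
    rng.multinomial(200, profile_b, size=20),  # 20 contigs from species B
])
labels, profiles, weights = multinomial_mixture_em(counts, n_bins=2)
```

In this toy setting the two groups of contigs differ only in how their read counts change from sample to sample, which is the kind of signal that becomes available when many samples are sequenced. The number of bins is fixed by hand here; a real tool would also need to choose it from the data.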

Keywords

Metagenomics, Binning, Mixture model, Multinomial, Unsupervised learning

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
