Publication:
NEAT: a framework for building fully automated NGS pipelines and analyses

Date

2016

Publisher

BioMed Central
The Harvard community has made this article openly available.

Citation

Schorderet, Patrick. 2016. “NEAT: a framework for building fully automated NGS pipelines and analyses.” BMC Bioinformatics 17 (1): 53. doi:10.1186/s12859-016-0902-3. http://dx.doi.org/10.1186/s12859-016-0902-3.

Abstract

Background: The analysis of next generation sequencing (NGS) data has become a standard task for many laboratories in the life sciences. Though several tools exist to support users in the manipulation of such datasets on various levels, few are built on the basis of vertical integration. Here, we present the NExt generation Analysis Toolbox (NEAT), which allows non-expert users, including wet-lab scientists, to comprehensively build, run and analyze NGS data through double-clickable executables without any programming experience.

Results: In comparison to many publicly available tools, including Galaxy, NEAT provides three main advantages: (1) Through the development of double-clickable executables, NEAT is efficient (completes in under 24 hours), easy to implement and intuitive; (2) Storage space, maximum number of job submissions, wall time and cluster-specific parameters can be customized, as NEAT is run on the institution’s cluster; (3) NEAT allows users to visualize and summarize NGS data rapidly and efficiently using various built-in exploratory data analysis tools, including metagenomic and differentially expressed gene analysis. To simplify the control of the workflow, NEAT projects are built around a unique and centralized file containing sample names, replicates, conditions, antibodies, and alignment, filtering and peak-calling parameters, as well as cluster-specific paths and settings. Moreover, the small-sized files produced by NEAT allow users to easily manipulate, consolidate and share datasets from different users and institutions.

Conclusions: NEAT provides biologists and bioinformaticians with a robust, efficient and comprehensive tool for the analysis of massive NGS datasets. Frameworks such as NEAT not only allow novice users to overcome the increasing number of technical hurdles due to the complexity of manipulating large datasets, but also provide more advanced users with tools that ensure high reproducibility standards in the NGS era.

NEAT is publicly available at https://github.com/pschorderet/NEAT.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-0902-3) contains supplementary material, which is available to authorized users.

Keywords

Bioinformatics, Genomics, High throughput sequencing, RNAseq, ChIPseq, NGS pipelines

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
