Publication:

Sensitive and highly resolved identification of RNA-protein interaction sites in PAR-CLIP data

Loading...
Thumbnail Image

Date

2015

Journal Title

Journal ISSN

Volume Title

Publisher

BioMed Central
The Harvard community has made this article openly available. Please share how this access benefits you.

Research Projects

Organizational Units

Journal Issue

Citation

Comoglio, Federico, Cem Sievers, and Renato Paro. 2015. “Sensitive and highly resolved identification of RNA-protein interaction sites in PAR-CLIP data.” BMC Bioinformatics 16 (1): 32. doi:10.1186/s12859-015-0470-y. http://dx.doi.org/10.1186/s12859-015-0470-y.

Abstract

Background: PAR-CLIP is a recently developed Next Generation Sequencing-based method enabling transcriptome-wide identification of interaction sites between RNA and RNA-binding proteins. The PAR-CLIP procedure induces specific base transitions that originate from sites of RNA-protein interactions and can therefore guide the identification of binding sites. However, additional sources of transitions, such as cell type-specific SNPs and sequencing errors, challenge the inference of binding sites and suitable statistical approaches are crucial to control false discovery rates. In addition, a highly resolved delineation of binding sites followed by an extensive downstream analysis is necessary for a comprehensive characterization of the protein binding preferences and the subsequent design of validation experiments. Results: We present a statistical and computational framework for PAR-CLIP data analysis. We developed a sensitive transition-centered algorithm specifically designed to resolve protein binding sites at high resolution in PAR-CLIP data. Our method employes a Bayesian network approach to associate posterior log-odds with the observed transitions, providing an overall quantification of the confidence in RNA-protein interaction. We use published PAR-CLIP data to demonstrate the advantages of our approach, which compares favorably with alternative algorithms. Lastly, by integrating RNA-Seq data we compute conservative experimentally-based false discovery rates of our method and demonstrate the high precision of our strategy. Conclusions: Our method is implemented in the R package wavClusteR 2.0. The package is distributed under the GPL-2 license and is available from BioConductor at http://www.bioconductor.org/packages/devel/bioc/html/wavClusteR.html. Electronic supplementary material The online version of this article (doi:10.1186/s12859-015-0470-y) contains supplementary material, which is available to authorized users.

Description

Research Data

Keywords

PAR-CLIP, RNA, RNA binding proteins, Bayesian statistics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service

Endorsement

Review

Supplemented By

Related Stories