Integrated Genome Analysis Suggests that Most Conserved Non-Coding Sequences are Regulatory Factor Binding Sites

Title: Integrated Genome Analysis Suggests that Most Conserved Non-Coding Sequences are Regulatory Factor Binding Sites
Author: Cloonan, Nicole; Kuersten, Scott; Grimmond, Sean; Hemberg, Martin; Gray, Jesse M.; Greenberg, Michael Eldon; Kreiman, Gabriel

Note: Order does not necessarily reflect citation order of authors.

Citation: Hemberg, Martin, Jesse M. Gray, Nicole Cloonan, Scott Kuersten, Sean Grimmond, Michael E. Greenberg, and Gabriel Kreiman. 2012. Integrated genome analysis suggests that most conserved non-coding sequences are regulatory factor binding sites. Nucleic Acids Research 40(16): 7858-7869.
Abstract: More than 98% of a typical vertebrate genome does not code for proteins. Although non-coding regions are sprinkled with short (<200 bp) islands of evolutionarily conserved sequences, the function of most of these unannotated conserved islands remains unknown. One possibility is that unannotated conserved islands could encode non-coding RNAs (ncRNAs); alternatively, unannotated conserved islands could serve as promoter-distal regulatory factor binding sites (RFBSs) like enhancers. Here we assess these possibilities by comparing unannotated conserved islands in the human and mouse genomes to transcribed regions and to RFBSs, relying on a detailed case study of one human and one mouse cell type. We define transcribed regions by applying a novel transcript-calling algorithm to RNA-Seq data obtained from total cellular RNA, and we define RFBSs using ChIP-Seq and DNase-hypersensitivity assays. We find that unannotated conserved islands are four times more likely to coincide with RFBSs than with unannotated ncRNAs. Thousands of conserved RFBSs can be categorized as insulators based on the presence of CTCF or as enhancers based on the presence of p300/CBP and H3K4me1. While many unannotated conserved RFBSs are transcriptionally active to some extent, the transcripts produced tend to be unspliced, non-polyadenylated and expressed at levels 10- to 100-fold lower than annotated coding or ncRNAs. Extending these findings across multiple cell types and tissues, we propose that most conserved non-coding genomic DNA in vertebrate genomes corresponds to promoter-distal regulatory elements.
Published Version: doi:10.1093/nar/gks477
Other Sources: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3439890/pdf/
Terms of Use: This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:10536037
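The comparison described in the abstract, asking how often conserved islands coincide with RFBSs versus transcribed regions, amounts to an interval-overlap count. The following is a minimal sketch of that idea in Python, not the authors' pipeline: the function names, the half-open (chromosome, start, end) convention, and the toy coordinates are all assumptions made for illustration and do not come from the paper's data.

```python
from bisect import bisect_left
from collections import defaultdict


def build_index(features):
    """Group half-open (chrom, start, end) features by chromosome,
    merge overlapping or touching ones, and keep sorted start/end lists."""
    by_chrom = defaultdict(list)
    for chrom, start, end in features:
        by_chrom[chrom].append((start, end))

    index = {}
    for chrom, ivs in by_chrom.items():
        ivs.sort()
        merged = [list(ivs[0])]
        for s, e in ivs[1:]:
            if s <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], e)
            else:
                merged.append([s, e])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def overlaps(index, chrom, start, end):
    """Return True if [start, end) overlaps any indexed feature on chrom."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Last merged feature that starts before the query ends; because merged
    # features are disjoint and sorted, it is the only candidate to check.
    i = bisect_left(starts, end) - 1
    return i >= 0 and ends[i] > start


def fraction_overlapping(islands, features):
    """Fraction of islands that overlap at least one feature."""
    index = build_index(features)
    hits = sum(overlaps(index, c, s, e) for c, s, e in islands)
    return hits / len(islands)


# Toy coordinates, invented purely for illustration.
islands = [("chr1", 100, 200), ("chr1", 500, 650),
           ("chr2", 50, 120), ("chr2", 900, 1000)]
rfbs = [("chr1", 150, 400), ("chr1", 600, 700), ("chr2", 100, 300)]
transcripts = [("chr1", 640, 800)]

frac_rfbs = fraction_overlapping(islands, rfbs)
frac_transcripts = fraction_overlapping(islands, transcripts)
print(frac_rfbs, frac_transcripts)
```

Merging the features first keeps each lookup to a single binary search. A real analysis would also need a background model, for example shuffled intervals, to judge whether an observed overlap fraction exceeds chance.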