Publication:

Transcription factors, coregulators, and epigenetic marks are linearly correlated and highly redundant

Date

2017

Publisher

Public Library of Science

Citation

Ahsendorf, Tobias, Franz-Josef Müller, Ved Topkar, Jeremy Gunawardena, and Roland Eils. 2017. “Transcription factors, coregulators, and epigenetic marks are linearly correlated and highly redundant.” PLoS ONE 12 (12): e0186324. doi:10.1371/journal.pone.0186324. http://dx.doi.org/10.1371/journal.pone.0186324.

Abstract

The DNA microstates that regulate transcription include sequence-specific transcription factors (TFs), coregulatory complexes, nucleosomes, histone modifications, DNA methylation, and parts of the three-dimensional architecture of genomes, which could create an enormous combinatorial complexity across the genome. However, many proteins and epigenetic marks are known to colocalize, suggesting that the information content encoded in these marks can be compressed. It has so far proved difficult to understand this compression in a systematic and quantitative manner. Here, we show that simple linear models can reliably predict the data generated by the ENCODE and Roadmap Epigenomics consortia. Further, we demonstrate that a small number of marks can predict all other marks with high average correlation across the genome, systematically revealing the substantial information compression that is present in different cell lines. We find that the linear models for activating marks are typically cell line-independent, while those for silencing marks are predominantly cell line-specific. Of particular note, a nuclear receptor corepressor, transducin beta-like 1 X-linked receptor 1 (TBLR1), was highly predictive of other marks in two hematopoietic cell lines. The methodology presented here shows how the potentially vast complexity of TFs, coregulators, and epigenetic marks at eukaryotic genes is highly redundant and that the information present can be compressed onto a much smaller subset of marks. These findings could be used to efficiently characterize cell lines and tissues based on a small number of diagnostic marks and suggest how the DNA microstates, which regulate the expression of individual genes, can be specified.
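The idea of predicting one mark from a small set of others with a linear model can be illustrated with a minimal sketch. This is not the authors' pipeline: the data below are synthetic, the three "predictor marks" and the weights are invented for illustration, and the helper names (`fit_linear_model`, `predict`, `pearson`) are hypothetical. It only shows ordinary least squares fitted on one set of genomic regions and scored by Pearson correlation on held-out regions.

```python
import numpy as np


def fit_linear_model(predictors, target):
    """Ordinary least squares with an intercept; returns [intercept, w1, w2, ...]."""
    design = np.column_stack([np.ones(len(predictors)), predictors])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def predict(predictors, coef):
    """Apply fitted coefficients (intercept first) to a predictor matrix."""
    design = np.column_stack([np.ones(len(predictors)), predictors])
    return design @ coef


def pearson(a, b):
    """Pearson correlation between two 1-D signals."""
    return float(np.corrcoef(a, b)[0, 1])


# Synthetic stand-in for mark signals over genomic regions (assumption:
# rows are regions, columns are marks; real data would come from ChIP-seq
# or similar tracks).
rng = np.random.default_rng(0)
n_regions = 2000
predictor_marks = rng.normal(size=(n_regions, 3))
true_weights = np.array([0.8, -0.5, 0.3])  # invented for illustration
target_mark = predictor_marks @ true_weights + rng.normal(scale=0.2, size=n_regions)

# Fit on the first 1500 regions, evaluate on the remaining 500.
train, test = slice(0, 1500), slice(1500, None)
coef = fit_linear_model(predictor_marks[train], target_mark[train])
r = pearson(predict(predictor_marks[test], coef), target_mark[test])
print(f"held-out Pearson correlation: {r:.3f}")
```

A high held-out correlation in such a setup means the target mark carries little information beyond what the predictor marks already encode, which is the sense of redundancy the abstract describes.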

Keywords

Biology and Life Sciences, Cell Biology, Chromosome Biology, Chromatin, Chromatin Modification, Histone Modification, Genetics, Epigenetics, Gene Expression, DNA methylation, DNA, DNA modification, Biochemistry, Nucleic acids, RNA, Non-coding RNA, Long non-coding RNAs, Proteins, DNA-binding proteins, Genetic Loci, Transcription Factors, Gene Regulation, Regulatory Proteins

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
