dc.contributor.author: Asthana, Saurabh
dc.contributor.author: Roytberg, Mikhail
dc.contributor.author: Stamatoyannopoulos, John
dc.contributor.author: Sunyaev, Shamil R.
dc.date.accessioned: 2012-02-28T16:29:10Z
dc.date.issued: 2007
dc.identifier.citation: Asthana, Saurabh, Mikhail Roytberg, John Stamatoyannopoulos, and Shamil Sunyaev. 2007. Analysis of sequence conservation at nucleotide resolution. PLoS Computational Biology 3(12): e254. [en_US]
dc.identifier.issn: 1553-734X [en_US]
dc.identifier.uri: http://nrs.harvard.edu/urn-3:HUL.InstRepos:8268112
dc.description.abstract: One of the major goals of comparative genomics is to understand the evolutionary history of each nucleotide in the human genome sequence, and the degree to which it is under selective pressure. Ascertainment of selective constraint at nucleotide resolution is particularly important for predicting the functional significance of human genetic variation and for analyzing the sequence substructure of cis-regulatory sequences and other functional elements. Current methods for analysis of sequence conservation are focused on delineation of conserved regions comprising tens or even hundreds of consecutive nucleotides. We therefore developed a novel computational approach designed specifically for scoring evolutionary conservation at individual base-pair resolution. Our approach estimates the rate at which each nucleotide position is evolving, computes the probability of neutrality given this rate estimate, and summarizes the result in a Sequence CONservation Evaluation (SCONE) score. We computed SCONE scores in a continuous fashion across 1% of the human genome for which high-quality sequence information from up to 23 genomes is available. We show that SCONE scores are clearly correlated with the allele frequency of human polymorphisms in both coding and noncoding regions. We find that the majority of noncoding conserved nucleotides lie outside of longer conserved elements predicted by other conservation analyses, and are experiencing ongoing selection in modern humans as evident from the allele frequency spectrum of human polymorphism. We also applied SCONE to analyze the distribution of conserved nucleotides within functional regions. These regions are markedly enriched in individually conserved positions and short (<15 bp) conserved “chunks.” Our results collectively suggest that the majority of functionally important noncoding conserved positions are highly fragmented and reside outside of canonically defined long conserved noncoding sequences. A small subset of these fragmented positions may be identified with high confidence. [en_US]
dc.language.iso: en_US [en_US]
dc.publisher: Public Library of Science [en_US]
dc.relation.isversionof: doi://10.1371/journal.pcbi.0030254 [en_US]
dc.relation.hasversion: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2230682/pdf/ [en_US]
dash.license: LAA
dc.subject: computational biology [en_US]
dc.subject: Homo (human) [en_US]
dc.subject: mammals [en_US]
dc.title: Analysis of Sequence Conservation at Nucleotide Resolution [en_US]
dc.type: Journal Article [en_US]
dc.description.version: Version of Record [en_US]
dc.relation.journal: PLoS Computational Biology [en_US]
dash.depositing.author: Sunyaev, Shamil R.
dc.date.available: 2012-02-28T16:29:10Z
dash.affiliation.other: HMS^Health Sciences and Technology [en_US]
dash.affiliation.other: HMS^Medicine-Brigham and Women's Hospital [en_US]
dc.identifier.doi: 10.1371/journal.pcbi.0030254 [*]
dash.contributor.affiliated: Sunyaev, Shamil

