Extracting Sequence Features to Predict Protein–DNA Interactions: A Comparative Study

Title: Extracting Sequence Features to Predict Protein–DNA Interactions: A Comparative Study
Author: Zhou, Qing; Liu, Jun

Note: Order does not necessarily reflect citation order of authors.

Citation: Zhou, Qing and Jun S. Liu. 2008. Extracting sequence features to predict protein–DNA interactions: A comparative study. Nucleic Acids Research 36(12): 4137–4148.
Abstract: Predicting how and where proteins, especially transcription factors (TFs), interact with DNA is an important problem in biology. We present here a systematic study of predictive modeling approaches to the TF–DNA binding problem, which have frequently been shown to be more efficient than methods based only on position-specific weight matrices (PWMs). In these approaches, a statistical relationship between genomic sequences and gene expression or ChIP-binding intensities is inferred through a regression framework, and influential sequence features are identified by variable selection. We examine several state-of-the-art learning methods, including stepwise linear regression, multivariate adaptive regression splines, neural networks, support vector machines, boosting, and Bayesian additive regression trees (BART). These methods are applied to both simulated datasets and two whole-genome ChIP-chip datasets, on the TFs Oct4 and Sox2 respectively, in human embryonic stem cells. We find that, with proper learning methods, predictive modeling approaches can significantly improve the predictive power and identify more biologically interesting features, such as TF–TF interactions, than the PWM approach. In particular, BART and boosting show the best and the most robust overall performance among all the methods.
Published Version: http://dx.doi.org/10.1093/nar/gkn361
Terms of Use: This article is made available under the terms and conditions applicable to Open Access Policy Articles, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#OAP
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:2757494
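
The abstract describes relating sequence features to binding intensities and then selecting the influential features. As a rough illustration only, the sketch below builds two kinds of features (overlapping k-mer counts and a best-window PWM log-odds score) and ranks them by absolute Pearson correlation with a response. This marginal ranking is a much simpler stand-in for the variable selection the abstract mentions; it is not the paper's method. All function names, the toy PWM, the sequences, and the intensities are hypothetical and invented for this example.

```python
import math
from itertools import product


def kmer_counts(seq, k=2):
    """Count overlapping occurrences of every DNA k-mer in seq."""
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing non-ACGT symbols
            counts[kmer] += 1
    return counts


def pwm_max_score(seq, pwm, background=0.25):
    """Best log-odds score of the PWM over all windows of seq.

    pwm is a list of dicts, one per motif position, mapping each base
    to a strictly positive probability (assumed format, not the paper's).
    """
    width = len(pwm)
    best = -math.inf
    for i in range(len(seq) - width + 1):
        score = sum(
            math.log(pwm[j][seq[i + j]] / background) for j in range(width)
        )
        best = max(best, score)
    return best


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def rank_features(seqs, response, pwm, k=2):
    """Rank features by |correlation| with the response (marginal screening)."""
    features = {}
    for seq in seqs:
        row = kmer_counts(seq, k)
        row["pwm_max"] = pwm_max_score(seq, pwm)
        for name, value in row.items():
            features.setdefault(name, []).append(value)
    scored = [(name, pearson(vals, response)) for name, vals in features.items()]
    return sorted(scored, key=lambda t: abs(t[1]), reverse=True)


if __name__ == "__main__":
    # Toy PWM with consensus "ACG"; probabilities are made up.
    toy_pwm = [
        {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
        {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
        {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    ]
    # Made-up sequences and intensities, for illustration only.
    toy_seqs = ["ACGTACGT", "TTTTACGA", "GGGGCCCC", "ACGACGAC"]
    toy_intensity = [2.1, 1.4, 0.2, 2.6]
    for name, r in rank_features(toy_seqs, toy_intensity, toy_pwm)[:5]:
        print(f"{name}\t{r:+.3f}")
```

In a real analysis the ranking step would be replaced by one of the learning methods named in the abstract, such as stepwise regression, boosting, or BART, which account for features jointly rather than one at a time.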

This item appears in the following Collection(s)

  • FAS Scholarly Articles
    Peer-reviewed scholarly articles from the Faculty of Arts and Sciences of Harvard University