Complexity in Protein-DNA Sequence Recognition
Rogers, Julia Maria
Citation: Rogers, Julia Maria. 2017. Complexity in Protein-DNA Sequence Recognition. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.
Abstract
Gene expression is controlled in part by the binding of transcription factors to genomic regulatory sequences. A transcription factor’s preferences for interacting with particular DNA sequences affect which genomic regions it can bind, and consequently, which genes are under its control. This work seeks to better understand these sequence preferences for two classes of transcription factors, and to identify which features of a transcription factor determine its specificity.
I first present a study of the binding specificities of Transcription Activator-Like Effectors, or TALEs. TALEs are frequently used as tools in genome engineering applications, due to their seemingly simple and predictable DNA recognition rules. TALE proteins bind DNA through a series of repeats, where each repeat binds one DNA base. Through a comprehensive evaluation of the binding specificities of TALE proteins, I show that the protein context of a TALE repeat affects its specificity for its target base. This improved understanding of TALE-DNA interactions enabled the development of a suite of computational tools to aid in the design of TALE proteins for use in genome engineering.
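The "one repeat, one base" picture described above can be sketched as a simple lookup. The following is a minimal illustration only. It assumes the commonly cited repeat-variable diresidue (RVD) assignments from the general TALE literature (NI to A, HD to C, NG to T, NN to G), which are not stated in this abstract. The names `RVD_TO_BASE`, `predict_target`, and `count_mismatches` are hypothetical and do not come from the dissertation or its design tools. The sketch deliberately ignores the protein-context effects that this work reports.

```python
# Simplified, context-free RVD-to-base lookup.
# ASSUMPTION: these assignments come from the general TALE literature,
# not from this abstract, and NN is treated as G-only for simplicity.
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def predict_target(rvds):
    """Return the DNA site predicted by the naive one-repeat-one-base model."""
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)


def count_mismatches(rvds, site):
    """Count positions where the naive model disagrees with a candidate site."""
    if len(rvds) != len(site):
        raise ValueError("repeat array and site must have the same length")
    return sum(
        1 for rvd, base in zip(rvds, site.upper()) if RVD_TO_BASE[rvd] != base
    )


if __name__ == "__main__":
    repeats = ["NI", "HD", "NG", "NN"]
    print(predict_target(repeats))            # naive predicted site
    print(count_mismatches(repeats, "ACTA"))  # mismatches against a candidate site
```

Because every repeat is scored independently here, a model of this kind cannot capture the finding that a repeat's specificity depends on its protein context; a context-aware model would need scores conditioned on neighboring repeats or on repeat position.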
I also present two studies of the binding specificity of the forkhead family of transcription factors, one of the major classes of eukaryotic transcription factors. While forkhead proteins were expected to recognize the canonical forkhead DNA binding site, I show that proteins in this family in fact display more varied binding specificities. By combining phylogenetic analysis with functional assays of binding specificity, I show that three branches of the forkhead family independently evolved the ability to recognize an alternate DNA motif. Additionally, some individual proteins within these branches can recognize both the alternate and canonical motifs. This bispecificity is surprising, given the extent of the differences between the canonical and alternate motifs.
I next describe a study into the mechanistic basis of this bispecificity. I present the co-crystal structures of the DNA binding domain of the bispecific protein FoxN3 in complex with both the canonical and alternate DNA sites, which show that FoxN3 adopts similar conformations to recognize both sequences. Interestingly, the DNA adopts strikingly different conformations in the two structures. Differences in the positions of the N- and C-terminal ends of the FoxN3 DNA binding domain suggest that these protein elements could contribute to the ability of this forkhead domain to recognize the alternate motif.
Overall, the work presented here highlights the complexity of the interactions between proteins and DNA. These studies motivate the need for further investigation into the mechanisms underlying these interactions in order to better understand the regulation of gene expression.
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:41140214