Publication:
Molecular phenotypes from evolutionary sequences: Method development and biological applications

Date

2019-05-16

Citation

Green, Anna Gustafson. 2019. Molecular phenotypes from evolutionary sequences: Method development and biological applications. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.

Abstract

Evolutionary pressure to create and preserve useful traits leaves traces in the genomes of organisms. These traces can be detected and interpreted with statistical methods to make inferences about an organism's phenotype and evolutionary history. Protein-protein interactions are one such trait: they leave traces detectable as dependencies between residues of two different proteins. Protein-protein interactions underlie many important biological processes, and the ability to detect all of them at residue resolution would be an important step toward understanding organismal phenotype. In this thesis I present methodological advances, and their accompanying applications, for making biological discoveries about protein-protein interactions from evolutionary sequence data using a method called evolutionary couplings. In chapter 1, I show the first application of evolutionary couplings at proteome scale, predicting interactions and interfaces for hundreds of proteins in the Escherichia coli genome, with emphasis on proteins that cannot be studied with current experimental methods. I also introduce the largest non-redundant dataset of protein-protein interactions with known structures to date and present it as a resource for future analysis and method development. In chapter 2, I detail a comprehensive, user-friendly command-line application built to make evolutionary couplings analysis accessible to non-advanced users, and a Python package built to facilitate development of new functionality. In chapter 3, I use a statistical model based on pairwise amino acid preferences to analyze features of specificity in the eukaryotic protocadherin superfamily. In chapter 4, I predict and analyze structures in the bacterial elongasome and divisome, demonstrating the fine biological details and testable hypotheses that these computational methodologies can yield.
In this thesis, I have sought both to develop and apply computational methods and to facilitate their use for discovery in molecular biology.

Keywords

computational biology, genomics, structural biology, evolutionary biology, microbiology

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
