
dc.contributor.advisor: Hogle, James M. [en_US]
dc.contributor.advisor: Sunyaev, Shamil R. [en_US]
dc.contributor.author: Jordan, Daniel Michael [en_US]
dc.date.accessioned: 2015-07-17T16:29:35Z
dc.date.created: 2015-05 [en_US]
dc.date.issued: 2015-05-08 [en_US]
dc.date.submitted: 2015 [en_US]
dc.identifier.citation: Jordan, Daniel Michael. 2015. Predicting the Effects of Missense Variation on Protein Structure, Function, and Evolution. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences. [en_US]
dc.identifier.uri: http://nrs.harvard.edu/urn-3:HUL.InstRepos:17464216
dc.description.abstract: Estimating the effects of missense mutations is a problem with many important applications in a variety of fields, including medical genetics, evolutionary theory, population genetics, and protein structure and design. Many popular methods exist to solve this problem, the most widely used of which are PolyPhen-2 and SIFT. These methods, along with most other popular methods, rely on multiple sequence alignments of orthologous protein sequences. Based on the amino acids observed in each column of the alignment, they produce a profile describing how tolerated each amino acid is at each position. They then compare the wild-type and variant amino acids to this profile to produce a prediction. In practice, these methods are fast, robust, and relatively reliable. However, from a theoretical perspective, they have at least three significant shortcomings: 1. They use effects on selection as a proxy for effects on phenotype and protein structure and function. 2. They treat each position as independent, ruling out most forms of interactions between sites. 3. They do not explicitly model the process of evolution, instead assuming that sequences we observe more or less represent an equilibrium state. With the recent explosion of sequencing technology, as well as the steady increase of computational power, we are now beginning to have enough data to investigate these simplifications and see how much they really affect the performance of these methods. In this dissertation, I present three such investigations. First, I describe a modified predictor designed to predict risk for a specific disease, hypertrophic cardiomyopathy (HCM), rather than general selective effect. This method achieves significantly higher accuracy than methods without such specific domain knowledge. Next, I describe a model of pairwise interactions between sites, demonstrating both statistically and with in vivo evidence that approximately 7-12% of disease-causing variants may be mispredicted by these methods due to such interactions. Finally, I describe a hybrid method that uses an alignment-based estimator to inform a parametric model of evolution, resulting in a small but significant improvement in accuracy. [en_US]
dc.description.sponsorship: Biophysics [en_US]
dc.format.mimetype: application/pdf [en_US]
dc.language.iso: en [en_US]
dash.license: LAA [en_US]
dc.subject: Biophysics, General [en_US]
dc.subject: Biology, Genetics [en_US]
dc.subject: Biology, Biostatistics [en_US]
dc.title: Predicting the Effects of Missense Variation on Protein Structure, Function, and Evolution [en_US]
dc.type: Thesis or Dissertation [en_US]
dash.depositing.author: Jordan, Daniel Michael [en_US]
dc.date.available: 2015-07-17T16:29:35Z
thesis.degree.date: 2015 [en_US]
thesis.degree.grantor: Graduate School of Arts & Sciences [en_US]
thesis.degree.level: Doctoral [en_US]
thesis.degree.name: Doctor of Philosophy [en_US]
dc.contributor.committeeMember: Liu, Jun S. [en_US]
dc.contributor.committeeMember: Morton, Cynthia C. [en_US]
dc.contributor.committeeMember: Shakhnovich, Eugene I. [en_US]
dc.type.material: text [en_US]
thesis.degree.department: Biophysics [en_US]
dash.identifier.vireo: http://etds.lib.harvard.edu/gsas/admin/view/299 [en_US]
dc.description.keywords: genomics; bioinformatics; human genetics; hypertrophic cardiomyopathy; molecular diagnosis; variant effect prediction; PolyPhen; molecular evolution; epistasis [en_US]
dash.author.email: daniel.jordan@aya.yale.edu [en_US]
dash.identifier.drs: urn-3:HUL.DRS.OBJECT:25164013 [en_US]
dash.identifier.orcid: 0000-0002-5318-8225 [en_US]
dash.contributor.affiliated: Jordan, Daniel Michael
dc.identifier.orcid: 0000-0002-5318-8225

