dc.contributor.advisor: MacDonald, Marcy [en_US]
dc.contributor.author: Samocha, Kaitlin E. [en_US]
dc.date.accessioned: 2017-07-25T14:42:48Z
dc.date.created: 2016-05 [en_US]
dc.date.issued: 2016-05-06 [en_US]
dc.date.submitted: 2016 [en_US]
dc.identifier.citation: Samocha, Kaitlin E. 2016. Modeling Rare Protein-Coding Variation to Identify Mutation-Intolerant Genes With Application to Disease. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences. [en_US]
dc.identifier.uri: http://nrs.harvard.edu/urn-3:HUL.InstRepos:33493508
dc.description.abstract: Sequencing exomes (the 1% of the genome that codes for proteins) has increased the rate at which the genetic basis of a patient’s disease is determined. Unfortunately, when a patient does not carry a well-established pathogenic variant, it is extremely challenging to establish which of the tens of thousands of variants identified in that individual is contributing to their disease. In these situations, variants must be prioritized to make further investigation more manageable. In this thesis, we have focused on creating statistical frameworks and models to aid in the interpretation of rare variants and to establish gene-level metrics for variant prioritization. We developed a sensitive and specific workflow to detect newly arising (de novo) variants from exome sequencing data of parent-child trios, and created a sequence-context-based mutational model. This mutational model was the basis of a rigorous statistical framework to evaluate the significance of de novo variant burden not only globally, but also per gene. When we applied this framework to de novo variants identified in patients with an autism spectrum disorder, we found a global excess of de novo loss-of-function variants as well as two genes that harbored significantly more de novo loss-of-function variants than expected. We also used the mutational model to predict the expected number of rare (minor allele frequency < 0.1%) variants in exome sequencing datasets of reference individuals. We found a significant depletion of missense and loss-of-function variants in a subset of genes, indicating that these genes are under strong evolutionary constraint. Specifically, we identified 3,230 genes that are intolerant of loss-of-function variation, and that set of genes is enriched for established dominant and haploinsufficient disease genes. Similarly, we searched for regions within genes that were intolerant of missense variation. The most missense-depleted 15% of the exome contains 83% of reported pathogenic variants found in haploinsufficient disease genes that cause severe disease. Additionally, both gene-level and region-level constraint metrics highlight a set of de novo variants from patients with a neurodevelopmental disorder that are more likely to be pathogenic, supporting the utility of these metrics when interpreting rare variants within the context of disease. [en_US]
dc.description.sponsorship: Medical Sciences [en_US]
dc.format.mimetype: application/pdf [en_US]
dc.language.iso: en [en_US]
dash.license: LAA [en_US]
dc.subject: Biology, Genetics [en_US]
dc.subject: Biology, Bioinformatics [en_US]
dc.title: Modeling Rare Protein-Coding Variation to Identify Mutation-Intolerant Genes With Application to Disease [en_US]
dc.type: Thesis or Dissertation [en_US]
dash.depositing.author: Samocha, Kaitlin E. [en_US]
dc.date.available: 2017-07-25T14:42:48Z
thesis.degree.date: 2016 [en_US]
thesis.degree.grantor: Graduate School of Arts & Sciences [en_US]
thesis.degree.level: Doctoral [en_US]
thesis.degree.name: Doctor of Philosophy [en_US]
dc.contributor.committeeMember: Hirschhorn, Joel [en_US]
dc.contributor.committeeMember: Seidman, Christine [en_US]
dc.contributor.committeeMember: Hurles, Matthew [en_US]
dc.type.material: text [en_US]
thesis.degree.department: Medical Sciences [en_US]
dash.identifier.vireo: http://etds.lib.harvard.edu/gsas/admin/view/895 [en_US]
dc.description.keywords: genomics; exome; sequencing; mutation; de novo; autism; constraint [en_US]
dash.author.email: ksamocha@gmail.com [en_US]
dash.contributor.affiliated: Samocha, Kaitlin E.

