Show simple item record

dc.contributor.advisor: Daly, Mark J.
dc.contributor.author: Artomov, Mykyta
dc.date.accessioned: 2019-05-20T10:23:25Z
dc.date.created: 2017-05
dc.date.issued: 2017-05-10
dc.date.submitted: 2017
dc.identifier.uri: http://nrs.harvard.edu/urn-3:HUL.InstRepos:40046483
dc.description.abstract: With the recent rapid decrease in the price of exome and genome sequencing, the amount of available sequencing data has increased dramatically. While the analysis of common genetic variation has succeeded with GWAS and fine-mapping methodology, a systematic large-scale approach to the analysis and interpretation of rare protein-coding DNA variation is still in its early days. Rare-variant analysis, unlike GWAS, enables deep insight into personalized disease-predisposing factors and a better understanding of the underlying biology and thus facilitates potential new drug discoveries. In this thesis, we focused on developing methods for interpreting genetic association results using protein-protein interaction models, to aid the prioritization of disease risk genes and provide insights into the biological pathways involved. We created a composite approach for rare DNA variation analysis in case-control cohorts. Our approach was initially tested in a medium-sized cohort of focal segmental glomerulosclerosis patients, identifying several new risk genes that were validated using a proof-of-concept mouse model. This methodology was then extended to the large-scale analysis of a germline cancer cohort (over 2,000 samples matched to more than 7,000 controls). We identified common features shared by known cancer-predisposing genes and created a strategy for identifying new cancer-driving genes. A list of novel candidate genes was created for several cancer phenotypes, and some of the candidates were validated in a mouse model, which confirmed the tumor suppressor activity of the encoded proteins. Analysis of genetic risk factors provides only unstructured pieces of information about the biology of a disorder. Generally, after the associated loci are identified, massive follow-up studies are required, first, to prove the causal relationship and, most importantly, to understand the molecular mechanism of causality.
Which locus should be prioritized for protein-level studies is currently determined from empirical knowledge of protein function. The experimentally established functions of individual proteins are then integrated to identify the pathways affected by the disease. As an alternative to this extensive approach, we developed a statistical framework that integrates genetic association data from multiple sources (GWAS, RVAS, etc.) and finds the protein-protein network that returns the best cumulative association score. Using a Bayesian model, the association results are then refined with evidence of each gene's appearance in the best network. Our method provides a ranked list of genes prioritized by both association strength and integration into the functional pathway. Such an approach is essential for understanding the biology of disorders for which an adequate animal model cannot be built, such as autism, schizophrenia, and other neuropsychiatric diseases.
dc.description.sponsorship: Chemistry and Chemical Biology
dc.format.mimetype: application/pdf
dc.language.iso: en
dash.license: LAA
dc.subject: Biology, Bioinformatics
dc.subject: Biology, Genetics
dc.title: A Framework for Protein-Level Interpretation of Genetic Associations and Integration With Large-Scale DNA Sequencing Analysis
dc.type: Thesis or Dissertation
dash.depositing.author: Artomov, Mykyta
dc.date.available: 2019-05-20T10:23:25Z
thesis.degree.date: 2017
thesis.degree.grantor: Graduate School of Arts & Sciences
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy
dc.contributor.committeeMember: Tao, Hensin
dc.contributor.committeeMember: Shakhnovich, Eugene
dc.type.material: text
thesis.degree.department: Chemistry and Chemical Biology
dash.identifier.vireo: http://etds.lib.harvard.edu/gsas/admin/view/1560
dc.description.keywords: Genetic association studies; Cancer genetics
dash.author.email: n.artomov@gmail.com

