Building maps from genetic sequences to biological function
Riesselman, Adam Joseph
MetadataShow full item record
CitationRiesselman, Adam Joseph. 2019. Building maps from genetic sequences to biological function. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.
AbstractPredicting how changes to the genetic code will alter the characteristics of an organism is a fundamental question in biology and genetics. Typically, measurements of the true functional landscape relating genotype to phenotype are noisy and costly to obtain. Though high-throughput DNA sequencing and synthesis can shed light on biological constraints in organisms, inferring relationships from these high-dimensional, multi-scale data to make predictions about new biological sequences is a formidable task. Here, I aim to build algorithms that map genetic sequences to biological function. In Chapter 1, I examine how deep latent variable models of evolutionary sequences can predict the effects of mutations in an unsupervised manner. In Chapter 2, I discuss how deep autoregressive models can be applied to genetic data for variant effect prediction and the synthesis of a diverse synthetic nanobody library. In Chapter 3, I explore how sparse Bayesian logistic regression can efficiently summarize laboratory affinity maturation experiments to improve nanobody binding affinity. In Chapter 4, I show how to integrate genetic, proteomic, and metabolomic data to optimize thiamine biosynthesis in E. coli. In Chapter 5, I propose future research directions, including extensions to both the analytical methods and biological systems discussed. These results show that probabilistic algorithms of genetic sequence data can both explain phenotypic variation and be used to design proteins and organisms with improved properties.
Citable link to this pagehttp://nrs.harvard.edu/urn-3:HUL.InstRepos:41121291
- FAS Theses and Dissertations