Machine-guided design and evolution of biological systems: from the protein to the genome scale
Abstract
Evolution has shown that mutation and selection over billions of years can produce complex molecules and organisms that thrive in a diverse range of environments. As biological engineers, we would like to systematize the navigation of genetic landscapes to find solutions to urgent health and technological needs. In this thesis, I approach the engineering of biological systems from the perspective of design. I illustrate the view of design as an iterative framework of satisfying engineering constraints while discovering and testing degrees of freedom in biological systems. Beginning at the genome scale, I describe a software framework for encoding design rules for recoding genomes and its application to the design of an E. coli strain using only 57 of 64 codons. The genome is being assembled and tested in 50-kilobase segments and we have verified that over 50% of the recoded genome design can functionally complement. Where design rules break down, we leverage DNA synthesis and genome editing to generate targeted diversity and update the design rules. Next, I describe how a model-guided approach that prioritizes mutations to test can augment adaptive laboratory evolution. A 63-codon genomically recoded organism that we previously engineered suffered from impaired fitness and we used our approach to discover a minimal set of high-impact edits that recover 59% of the fitness defect. Finally, I discuss ongoing work to augment design and evolution of proteins by training machine learning models that learn from and guide high-throughput mapping of fitness landscapes. I describe lessons learned in a proof-of-concept study mapping the fitness landscape of the green fluorescent protein and implications for engineering of other proteins.
The unifying contribution of this dissertation is a demonstration at multiple scales of how to systematically integrate DNA synthesis, sequencing, high-throughput assays, and computational methods to interrogate biological systems and learn design principles that expand our engineering capabilities.
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:40049979
- FAS Theses and Dissertations