Publication: Statistical models of protein mutational effects, directed evolution, and clinical chemistry
Abstract
Randomness is at the heart of life. For one, it mutates genes and thereby the properties of biomolecules like DNA, RNA, and proteins. Biomolecular changes can effect other changes at higher levels, causing variations that manifest in such states as disease. Randomness flows the other way too. Environments fluctuate, causing changes from the outside in. Living things are complex and diverse, and so are the forces that pervade them, genetically, physiologically, and environmentally.
Science progresses, in statistical terms, by inference and simulation: learning from data to predict aspects of different situations. But life's complexity and diversity make this difficult. There's a huge variety of possible proteins and many ways in which they might be improved toward a purpose or disrupted. There are many diseases, many possible markers thereof, and many different people with many different lives. Biology is in these and other ways a noisy science. When we interpret biological data and when we use biology to engineer and to make us healthier, we need to account for randomness, to learn kernels of truth and gently expand their scope.
Through questions I've asked during my PhD, I've taken a small part in this struggle. My approach has been to develop concise statistical models to identify underlying trends in biological data and overarching principles for applications in engineering and health. To better understand the effects of single amino acid substitutions on proteins, I ask in Chapter 1 to what degree these effects are explained by certain physicochemical features of the substitution. I ask in Chapter 2 how stringently experimenters should select variants while engineering biomolecules via directed evolution. And in Chapter 3, I formalize a statistical foundation for interpreting clinical chemistry tests with respect to health. It's been my hope that, beyond satisfying my curiosity, my answers to each problem are general enough to serve as straightforward frameworks on which others can comfortably build.