Towards Safe Genome Editing and Rapid Disease Detection: Deep Bayesian Active Learning for Model-Driven CRISPR Guide Design
Band, Neil Benjamin
Citation: Band, Neil Benjamin. 2020. Towards Safe Genome Editing and Rapid Disease Detection: Deep Bayesian Active Learning for Model-Driven CRISPR Guide Design. Bachelor's thesis, Harvard College.
Abstract: Scientists use genome editing tools – nucleases which cut genetic material at targeted locations – to associate genes with diseases, detect viruses, and engineer agriculture, with the future goal of correcting human genetic disorders. CRISPR-Cas systems enable cheap genome editing using a synthesized ribonucleic acid (RNA) guide to direct a Cas nuclease to the desired cut site. “Guide design” procedures improve the accuracy and safety of CRISPR-Cas interventions by selecting promising candidate RNA guides using a computational model to predict outcomes based on a training dataset.
The small size of CRISPR-Cas datasets discourages guide design researchers from using neural networks, a powerful and data-hungry class of models. This thesis presents the first application of Bayesian neural networks (BNNs) – a variant which better handles data scarcity – in genome editing. BNNs are applied to two of the world’s largest CRISPR-Cas datasets, achieving the same accuracy as state-of-the-art approaches with up to 141 times less data and up to 37% higher relative accuracy with equal data.
BNNs can readily improve CRISPR guide design, including in Cas13 protocols for cheap and rapid SARS-CoV-2 detection. This work demonstrates the first instance of computer-driven CRISPR experiment design, in which BNNs outperform human expertise in building an effective training dataset.
Citable link to this page: https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37364719
- FAS Theses and Dissertations