Nucleotide-Level Modeling of Genetic Regulation Using Dilated Convolutional Neural Networks
Citation
Gupta, Ankit. 2017. Nucleotide-Level Modeling of Genetic Regulation Using Dilated Convolutional Neural Networks. Bachelor's thesis, Harvard College.
Abstract
The expression of genes is the product of a complex regulatory process whose complete nature remains elusive. To better understand gene regulation, this work seeks to improve on efforts to model the locations of regulatory elements of the human genome directly from raw sequences of nucleotides. Past work on this modeling task has incorporated only small amounts of DNA context for each prediction, making it difficult to model the complex long-term dependencies that arise from DNA's three-dimensional conformation.

In this work, we model these long-term dependencies using dilated convolutional neural networks, which offer the scaling properties of convolutions while modeling long-term dependencies with the performance of recurrent neural networks (RNNs). We show that this architecture is effective at modeling the locations of transcription factor binding sites, histone modifications, and DNase hypersensitivity sites. We develop and release a novel dataset for this larger-context modeling task, and show that dilated convolutions perform better than standard deep convolutional neural networks and RNN-based architectures at modeling the locations of regulatory markers in the human genome.
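To illustrate why dilation helps with long-range context, the following is a minimal sketch in plain Python. It is not the thesis's architecture: the kernel size, layer count, and dilation schedule are assumptions chosen for illustration, and the function names `receptive_field` and `dilated_conv1d` are hypothetical. It shows that stacking stride-1 convolutions with exponentially growing dilation widens the receptive field much faster than stacking undilated ones with the same number of layers.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in input positions, of stacked stride-1 1D convolutions.

    Each layer with dilation d extends the field by (kernel_size - 1) * d.
    """
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1D dilated convolution (cross-correlation form) over a list.

    Kernel taps are applied to inputs spaced `dilation` positions apart.
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(len(kernel)))
        for i in range(len(signal) - span)
    ]


# Assumed illustration: six layers, kernel size 3.
standard = receptive_field(3, [1, 1, 1, 1, 1, 1])   # no dilation
dilated = receptive_field(3, [1, 2, 4, 8, 16, 32])  # dilation doubles per layer

print(standard, dilated)
```

With these assumed settings, six undilated layers see 13 input positions while six dilated layers see 127, with the same number of weights per layer.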
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Citable link to this page
http://nrs.harvard.edu/urn-3:HUL.InstRepos:39011027
Collections
- FAS Theses and Dissertations [5370]