Nucleotide-Level Modeling of Genetic Regulation Using Dilated Convolutional Neural Networks
Citation: Gupta, Ankit. 2017. Nucleotide-Level Modeling of Genetic Regulation Using Dilated Convolutional Neural Networks. Bachelor's thesis, Harvard College.
Abstract: The expression of genes is the product of a complex regulatory process whose complete nature remains elusive. To better understand gene regulation, this work seeks to improve on efforts to model the locations of regulatory elements of the human genome directly from raw sequences of nucleotides. Past work on this problem has incorporated only small amounts of DNA for the prediction task, making it difficult to model the complex long-term dependencies that arise from DNA’s three-dimensional conformation.
In this work, we model these long-term dependencies using dilated convolutional neural networks, which offer the scaling properties of convolutions while modeling long-term dependencies with the performance of recurrent neural networks (RNNs). We show that this architecture is effective at modeling the locations of transcription factor binding sites, histone modifications, and DNase hypersensitivity sites. We develop and release a novel dataset for this larger-context modeling task, and show that dilated convolutions perform better than standard deep convolutional neural networks and RNN-based architectures at modeling the locations of regulatory markers in the human genome.
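To illustrate why dilation helps with larger sequence context, the following is a minimal sketch in plain Python, not the thesis's actual architecture. The kernel size of 3, the six-layer depth, the doubling dilation schedule, and the helper names `receptive_field`, `one_hot`, and `dilated_conv1d` are all assumptions chosen for illustration. It compares how many nucleotides a single output position can see in an ordinary convolutional stack versus a dilated one, and applies one dilated filter to a one-hot encoded DNA string.

```python
def receptive_field(kernel_size, dilations):
    """Nucleotides visible to one output of stacked stride-1 1D convolutions."""
    field = 1
    for dilation in dilations:
        # Each layer widens the view by (kernel_size - 1) * dilation positions.
        field += (kernel_size - 1) * dilation
    return field


def one_hot(sequence):
    """Encode a DNA string as rows of 4 channels in A, C, G, T order."""
    alphabet = "ACGT"
    return [[1.0 if base == letter else 0.0 for letter in alphabet]
            for base in sequence]


def dilated_conv1d(inputs, kernel, dilation):
    """Valid-mode dilated convolution with a single output channel.

    inputs: list of positions, each a list of channel values
    kernel: list of taps, each a list of per-channel weights
    """
    span = (len(kernel) - 1) * dilation  # distance between first and last tap
    outputs = []
    for start in range(len(inputs) - span):
        total = 0.0
        for tap, weights in enumerate(kernel):
            position = inputs[start + tap * dilation]
            total += sum(w * x for w, x in zip(weights, position))
        outputs.append(total)
    return outputs


# Assumed configuration: six layers, kernel size 3.
standard_layers = [1, 1, 1, 1, 1, 1]    # ordinary convolutions
dilated_layers = [1, 2, 4, 8, 16, 32]   # dilation doubles at each layer

standard_context = receptive_field(3, standard_layers)
dilated_context = receptive_field(3, dilated_layers)
print(standard_context, dilated_context)

# A two-tap filter that responds to an 'A' at both taps, spaced 4 bases apart.
adenine_pair = [[1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
response = dilated_conv1d(one_hot("ACGTACGT"), adenine_pair, dilation=4)
print(response)
```

With the same number of layers and the same number of weights per layer, the dilated stack in this sketch covers a context that grows exponentially with depth, while the ordinary stack grows only linearly.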
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:39011027
- FAS Theses and Dissertations