Publication: Nucleotide-Level Modeling of Genetic Regulation Using Dilated Convolutional Neural Networks
Date
2017-07-14
Authors
Gupta, Ankit
Citation
Gupta, Ankit. 2017. Nucleotide-Level Modeling of Genetic Regulation Using Dilated Convolutional Neural Networks. Bachelor's thesis, Harvard College.
Abstract
The expression of genes is the product of a complex regulatory process whose complete nature remains elusive. To better understand gene regulation, this work seeks to improve on efforts to model the locations of regulatory elements of the human genome directly from raw nucleotide sequences. Past work on this modeling task has incorporated only small amounts of DNA context for each prediction, making it difficult to model the complex long-term dependencies that arise from DNA's three-dimensional conformation.
In this work, we model these long-term dependencies using dilated convolutional neural networks, which offer the scaling properties of convolutions while capturing long-term dependencies with the performance of recurrent neural networks (RNNs). We show that this architecture is effective at modeling the locations of transcription factor binding sites, histone modifications, and DNase hypersensitivity sites. We develop and release a novel dataset for this larger-context modeling task, and show that dilated convolutions perform better than standard deep convolutional neural networks and RNN-based architectures at modeling the locations of regulatory markers in the human genome.
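As a rough illustration of why dilation helps with long-range context, the sketch below compares the receptive field of a stack of standard convolutions with a stack whose dilation doubles at each layer, and applies a single dilated filter to a toy signal. It is not the thesis's architecture: the kernel size of 3, the six layers, the doubling schedule, and the helper names `one_hot`, `receptive_field`, and `dilated_conv1d` are all assumptions made for this example.

```python
def one_hot(seq):
    """Encode a DNA string as rows of 4 floats in the order A, C, G, T."""
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    rows = []
    for base in seq.upper():
        row = [0.0, 0.0, 0.0, 0.0]
        if base in index:  # unknown bases such as N stay all-zero
            row[index[base]] = 1.0
        rows.append(row)
    return rows


def receptive_field(kernel_size, dilations):
    """Receptive field, in positions, of stacked stride-1 1D convolutions."""
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


def dilated_conv1d(signal, kernel, dilation):
    """Unpadded 1D dilated cross-correlation over a list of floats."""
    span = (len(kernel) - 1) * dilation
    out = []
    for start in range(len(signal) - span):
        taps = (kernel[k] * signal[start + k * dilation]
                for k in range(len(kernel)))
        out.append(sum(taps))
    return out


# Six layers with kernel size 3 (both values are illustrative assumptions).
standard = receptive_field(3, [1, 1, 1, 1, 1, 1])
dilated = receptive_field(3, [1, 2, 4, 8, 16, 32])

# One channel (the "A" indicator) of a toy sequence, filtered with dilation 2.
encoded = one_hot("ACGTA")
a_channel = [row[0] for row in encoded]
filtered = dilated_conv1d(a_channel, [1.0, 0.0, -1.0], 2)
```

With these assumed settings, the standard stack sees 13 positions while the dilated stack sees 127, using the same number of layers and weights. That exponential growth in context is the property that makes dilated convolutions attractive for long input sequences.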
Keywords
Computer Science, Biology, Bioinformatics, Statistics
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service