Publication:

Towards Learning Regulatory Elements of Promoter Sequences With Deep Learning

Date

2017-07-14

Abstract

Promoters play a key role in gene regulation. Although progress has been made in understanding the elements that make up a promoter, identifying all of its regulatory elements remains challenging because promoter sequences are highly variable. In this thesis, I aim to identify regulatory elements in promoter regions using deep learning. Specifically, I employ a convolutional neural network (CNN) to classify whether a given genomic sequence contains a promoter or is drawn from one of several null models, i.e. background sequences. I compare the performance of the CNN against each null model and perform saliency analysis to visualize what the network has learned. The main finding is that the null model must be carefully selected to avoid learning confounding signals such as nucleotide biases. When the null model was a dinucleotide shuffle of transcription start site sequences, the network recovered known regulatory elements associated with bi-directional promoters.
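To make the dinucleotide-shuffle null model concrete, the following is a minimal sketch of one common way to shuffle a sequence while preserving its dinucleotide counts (the Eulerian-path approach of Altschul and Erickson). The abstract does not say which implementation the thesis uses, so the function names `dinucleotide_shuffle` and `_reaches` and the example sequence are illustrative assumptions rather than the author's code.

```python
import random
from collections import Counter, defaultdict


def _reaches(start, target, last_edge):
    """Follow the chosen 'last edges' from start; True if we arrive at target."""
    seen = set()
    node = start
    while node != target:
        if node in seen:  # walked into a cycle that never reaches target
            return False
        seen.add(node)
        node = last_edge[node]
    return True


def dinucleotide_shuffle(seq, rng=None):
    """Return a random sequence with the same dinucleotide counts as seq.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    rng = rng or random.Random()
    if len(seq) < 3:
        return seq

    # Treat each adjacent pair (a, b) as a directed edge a -> b.
    edges = defaultdict(list)
    for a, b in zip(seq, seq[1:]):
        edges[a].append(b)
    last = seq[-1]

    # For every symbol except the final one, pick the edge to be used last.
    # Resample until following those edges always leads to the final symbol.
    while True:
        last_edge = {v: rng.choice(outs) for v, outs in edges.items() if v != last}
        if all(_reaches(v, last, last_edge) for v in last_edge):
            break

    # Shuffle the remaining edges of each symbol, keeping the chosen edge last.
    order = {}
    for v, outs in edges.items():
        outs = list(outs)
        if v != last:
            outs.remove(last_edge[v])
            rng.shuffle(outs)
            outs.append(last_edge[v])
        else:
            rng.shuffle(outs)
        order[v] = outs

    # Walk the graph from the first symbol, consuming edges in the new order.
    result = [seq[0]]
    used = defaultdict(int)
    current = seq[0]
    for _ in range(len(seq) - 1):
        nxt = order[current][used[current]]
        used[current] += 1
        result.append(nxt)
        current = nxt
    return "".join(result)


if __name__ == "__main__":
    original = "ACGTTGCAACGTAGCTAGGCTTAACCGGATATCG"  # made-up example sequence
    shuffled = dinucleotide_shuffle(original, random.Random(0))
    print(shuffled)
    # The dinucleotide counts of the shuffled and original sequences match.
    print(Counter(zip(shuffled, shuffled[1:])) == Counter(zip(original, original[1:])))
```

Because the shuffled background keeps both the nucleotide and dinucleotide composition of the real sequences, a classifier trained against it cannot separate the classes on those statistics alone, which is one plausible reading of why this null model avoids the nucleotide-bias confound described above.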

Keywords

Computer Science

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
