Publication:

Multiplicative Feature-Based Attention for Transfer Learning in Deep Convolutional Neural Networks

Date

2017-10-13

Published Version

The Harvard community has made this article openly available.

Abstract

Recent years have seen dramatic progress in the use of deep convolutional neural networks (CNNs) to solve problems in computer vision. Typically, CNNs for computer vision are trained to perform a single task, such as image classification, with a fixed set of target image categories. In contrast, our own brains must flexibly solve many different tasks using the same neuronal “hardware.” One mechanism that may play a role in this flexibility is attention, which allows the brain to dynamically weight neural representations in a top-down, task-dependent way. While some past efforts have explored adding attention to deep neural networks (Mnih et al., 2014; Xu et al., 2016), these have mostly focused on spatial attention, which allocates attention to specific locations in space. Here, we explore feature-based attention, in which attention amplifies certain task-relevant feature detectors rather than spatial locations. We investigate feature-based attention in neural networks in the context of transfer learning. A CNN is first trained to perform a reference task; next, a multiplicative weighting function is learned that amplifies certain filters to improve performance on a new task. Because this multiplicative weighting function has relatively few parameters, it can be learned quickly, yielding rapid improvements in performance on the new task. Consistent with our expectations, we find that the filters with the highest initial discriminative ability are amplified the most, and we analyze which parts of the new-task images are most amplified. This work has the potential both to advance practical methods for rapid transfer learning and to provide insights into how featural attention might behave in the brain.
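
The two-stage procedure in the abstract (train a CNN on a reference task, then learn only a multiplicative weighting over its filters for a new task) can be illustrated with a minimal NumPy sketch. This is not the author's implementation. It assumes that the frozen CNN has already been reduced to one pooled activation per filter for each image, that the new-task readout `W` is fixed, and that the weighting is a single gain per filter trained by plain gradient descent on a softmax cross-entropy loss. The names `forward`, `loss_and_grad`, and `learn_gains` are hypothetical.

```python
import numpy as np


def softmax(z):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def forward(feats, gains, W):
    """Class probabilities for the new task.

    feats: (N, C) pooled filter activations from the frozen CNN (assumed given).
    gains: (C,)   multiplicative attention weights, one per filter.
    W:     (C, K) frozen linear readout for the K new-task classes (assumed given).
    """
    return softmax((feats * gains) @ W)


def loss_and_grad(feats, labels, gains, W):
    """Mean cross-entropy and its gradient with respect to the gains only."""
    n = feats.shape[0]
    probs = forward(feats, gains, W)
    loss = -np.log(probs[np.arange(n), labels] + 1e-12).mean()
    # Gradient of the mean loss with respect to the logits.
    dlogits = probs.copy()
    dlogits[np.arange(n), labels] -= 1.0
    dlogits /= n
    # logits[n, k] = sum_c feats[n, c] * gains[c] * W[c, k]
    grad = ((dlogits @ W.T) * feats).sum(axis=0)
    return loss, grad


def learn_gains(feats, labels, W, steps=300, lr=0.05):
    """Fit the C gains by gradient descent; the CNN and readout stay frozen."""
    gains = np.ones(feats.shape[1])  # start from the unmodulated network
    for _ in range(steps):
        _, grad = loss_and_grad(feats, labels, gains, W)
        gains -= lr * grad
    return gains
```

Only `C` numbers are trained here, one per filter, which is the sense in which such a weighting has relatively few parameters. After fitting, comparing the learned `gains` against each filter's initial discriminative ability would be one way to probe which filters get amplified.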

Keywords

Computer Science, Biology, Neuroscience

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
