Publication: Learning Interpretable and Bias-Free Models for Visual Question Answering
Date
2019-08-23
Authors
Grand, Gabriel
Citation
Grand, Gabriel. 2019. Learning Interpretable and Bias-Free Models for Visual Question Answering. Bachelor's thesis, Harvard College.
Abstract
Visual Question Answering (VQA) is an innovative test of artificial intelligence that challenges machines to answer human-generated questions about everyday images. Over the past several years, steadily ascending scores on VQA benchmarks have generated the impression of swift progress towards systems capable of human-level reasoning. However, closer inspection of current VQA datasets and models reveals serious methodological issues lurking behind the façade of progress. Many popular VQA datasets contain systematic language biases that enable models to cheat by answering questions “blindly” without considering visual context. Meanwhile, the predominant approach to VQA relies on black-box neural networks that render this kind of cheating hard to detect, and even harder to prevent.
In light of these issues, this work presents two sets of original research findings addressing the twin problems of interpretability and bias in VQA. We first aim to endow VQA models with the capacity to better explain their decisions by pointing to visual counterexamples. Our experiments suggest that VQA models overlook key semantic distinctions between visually-similar images, indicating an over-reliance on language biases. Motivated by this result, we introduce a technique called adversarial regularization that is designed to mitigate the effects of language bias on learning. We demonstrate that adversarial regularization makes VQA models more robust to latent biases in the data, and improves their ability to generalize to new domains. Drawing on our findings, we recommend a set of design principles for future VQA benchmarks to promote the development of interpretable and bias-free models.
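The abstract does not spell out the mechanism, but adversarial regularization against language bias is commonly implemented by attaching a question-only adversary to the shared question encoder through a gradient reversal layer: the adversary tries to predict the answer from the question alone, while the reversed gradients push the encoder to discard purely linguistic shortcuts. The sketch below illustrates that general idea only; the module names (GradReverse, QuestionOnlyAdversary, training_step), the loss weighting, and the assumption that the VQA model exposes its question encoding are illustrative, not the thesis's exact implementation.

    # Minimal sketch of adversarial regularization against language bias.
    # Hypothetical names and architecture; not the thesis's exact model.
    import torch
    import torch.nn as nn

    class GradReverse(torch.autograd.Function):
        """Identity on the forward pass; negates and scales gradients on backward."""
        @staticmethod
        def forward(ctx, x, lambd):
            ctx.lambd = lambd
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            return -ctx.lambd * grad_output, None

    class QuestionOnlyAdversary(nn.Module):
        """Predicts the answer from the question encoding alone.
        Training it through gradient reversal discourages the shared
        question encoder from relying on language-only shortcuts."""
        def __init__(self, q_dim, num_answers, lambd=0.1):
            super().__init__()
            self.lambd = lambd
            self.classifier = nn.Sequential(
                nn.Linear(q_dim, q_dim), nn.ReLU(), nn.Linear(q_dim, num_answers)
            )

        def forward(self, q_encoding):
            reversed_q = GradReverse.apply(q_encoding, self.lambd)
            return self.classifier(reversed_q)

    # Hypothetical training step: main VQA loss plus the adversary's loss.
    # Assumes vqa_model(image, question) also returns its question encoding.
    def training_step(vqa_model, adversary, image, question, answer, criterion):
        logits, q_encoding = vqa_model(image, question)
        adv_logits = adversary(q_encoding)
        return criterion(logits, answer) + criterion(adv_logits, answer)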
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth in the Terms of Service.