Show simple item record

dc.contributor.author           Grand, Gabriel
dc.date.accessioned             2020-08-28T09:18:37Z
dc.date.created                 2019-03
dc.date.issued                  2019-08-23
dc.date.submitted               2019
dc.identifier.citation          Grand, Gabriel. 2019. Learning Interpretable and Bias-Free Models for Visual Question Answering. Bachelor's thesis, Harvard College.
dc.identifier.uri               https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37364587
dc.description.abstract         Visual Question Answering (VQA) is an innovative test of artificial intelligence that challenges machines to answer human-generated questions about everyday images. Over the past several years, steadily ascending scores on VQA benchmarks have generated the impression of swift progress towards systems capable of human-level reasoning. However, closer inspection of current VQA datasets and models reveals serious methodological issues lurking behind the façade of progress. Many popular VQA datasets contain systematic language biases that enable models to cheat by answering questions “blindly” without considering visual context. Meanwhile, the predominant approach to VQA relies on black-box neural networks that render this kind of cheating hard to detect, and even harder to prevent. In light of these issues, this work presents two sets of original research findings addressing the twin problems of interpretability and bias in VQA. We first aim to endow VQA models with the capacity to better explain their decisions by pointing to visual counterexamples. Our experiments suggest that VQA models overlook key semantic distinctions between visually similar images, indicating an over-reliance on language biases. Motivated by this result, we introduce a technique called adversarial regularization that is designed to mitigate the effects of language bias on learning. We demonstrate that adversarial regularization makes VQA models more robust to latent biases in the data, and improves their ability to generalize to new domains. Drawing on our findings, we recommend a set of design principles for future VQA benchmarks to promote the development of interpretable and bias-free models.
dc.description.sponsorship      Computer Science
dc.format.mimetype              application/pdf
dc.language.iso                 en
dash.license                    LAA
dc.title                        Learning Interpretable and Bias-Free Models for Visual Question Answering
dc.type                         Thesis or Dissertation
dash.depositing.author          Grand, Gabriel
dc.date.available               2020-08-28T09:18:37Z
thesis.degree.date              2019
thesis.degree.grantor           Harvard College
thesis.degree.level             Undergraduate
thesis.degree.name              AB
dc.type.material                text
thesis.degree.department        Computer Science
thesis.degree.discipline-joint  Mind Brain Behavior
dash.identifier.vireo
dc.identifier.orcid             0000-0003-1920-0021
dash.author.email               gabrieljgrand@gmail.com
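
For context on the abstract above: a common formulation of adversarial regularization for VQA (in the style of Ramakrishnan et al., 2018) trains a question-only adversary to predict the answer from the question alone, while a gradient-reversal layer pushes the shared question encoder to discard the language biases the adversary exploits. The sketch below illustrates that idea only; all names (GradReverse, AdversariallyRegularizedVQA, q_feat, v_feat, lambd) and dimensions are illustrative assumptions, not the implementation used in the thesis.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; scales and negates gradients on the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse the gradient flowing back into the question features.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

class AdversariallyRegularizedVQA(nn.Module):
    """Hypothetical VQA head with a question-only adversary.

    The main head answers from fused image+question features. The adversary
    answers from question features alone; gradient reversal means the
    adversary minimizes its loss while the upstream question encoder is
    updated to make question-only prediction harder, discouraging reliance
    on language priors.
    """
    def __init__(self, q_dim=512, v_dim=512, n_answers=1000, lambd=0.1):
        super().__init__()
        self.fusion = nn.Linear(q_dim + v_dim, 512)
        self.main_head = nn.Linear(512, n_answers)
        self.adversary = nn.Linear(q_dim, n_answers)  # question-only head
        self.lambd = lambd

    def forward(self, q_feat, v_feat):
        fused = torch.relu(self.fusion(torch.cat([q_feat, v_feat], dim=-1)))
        main_logits = self.main_head(fused)
        adv_logits = self.adversary(grad_reverse(q_feat, self.lambd))
        return main_logits, adv_logits

# Usage sketch (shapes and the upstream encoders are assumed):
model = AdversariallyRegularizedVQA()
q = torch.randn(8, 512)            # question features from some encoder
v = torch.randn(8, 512)            # image features from some encoder
y = torch.randint(0, 1000, (8,))   # answer labels
main_logits, adv_logits = model(q, v)
ce = nn.CrossEntropyLoss()
loss = ce(main_logits, y) + ce(adv_logits, y)
loss.backward()
```

The gradient-reversal layer is the key design choice here: a single backward pass lets the adversary head descend its own loss while the shared question encoder upstream receives the negated gradient, yielding the min-max training dynamic without a separate optimization loop.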

