Publication:
Right for the Right Reasons: Training Neural Networks to be Interpretable, Robust, and Consistent with Expert Knowledge

Date

2021-05-10

The Harvard community has made this article openly available.

Citation

Ross, Andrew Slavin. 2021. Right for the Right Reasons: Training Neural Networks to be Interpretable, Robust, and Consistent with Expert Knowledge. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Neural networks are among the most accurate machine learning methods in use today. However, their opacity and fragility under distribution shifts make them difficult to trust in critical applications. Recent efforts to develop explanations for neural networks have produced tools that shed light on the implicit rules behind predictions. These tools can help us identify when networks are right for the wrong reasons, or, equivalently, when they will fail under distribution shifts that should not affect predictions. However, such explanations are not always at the right level of abstraction, and, more importantly, cannot correct the problems they reveal. In this thesis, we explore methods for training neural networks to make predictions for better reasons, both by incorporating explanations into the training process and by learning representations that better match human concepts. These methods produce models that are more interpretable to users and more robust to distribution shifts.
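
To make "incorporating explanations into the training process" concrete, the following is a minimal PyTorch sketch of an input-gradient penalty in the spirit of the thesis title: the model's first-order explanation is pushed away from features an expert has marked as irrelevant. The function name rrr_loss, the mask convention (1 = irrelevant feature), and the weight lam are illustrative assumptions, not the thesis's exact implementation.

    import torch
    import torch.nn.functional as F

    def rrr_loss(model, x, y, mask, lam=1.0):
        # Standard prediction loss on the (possibly annotated) batch.
        x = x.clone().detach().requires_grad_(True)
        logits = model(x)
        ce = F.cross_entropy(logits, y)
        # Input gradient of the summed log-probabilities: a simple
        # first-order "explanation" of what the prediction depends on.
        log_prob_sum = F.log_softmax(logits, dim=1).sum()
        grads, = torch.autograd.grad(log_prob_sum, x, create_graph=True)
        # Penalize explanation mass on features the annotation mask
        # marks as irrelevant, so the optimizer is steered toward
        # predictions that do not rely on them.
        penalty = (mask * grads).pow(2).sum()
        return ce + lam * penalty

Minimizing the cross-entropy term alone can leave a model right for the wrong reasons; the penalty term (built with create_graph=True so it is itself differentiable) trades a little training freedom for predictions that no longer depend on the masked features.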

Keywords

interpretability, machine learning, representation learning, robustness, Computer science

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth in the Terms of Service.
