Publication:

Using Linear Approximations to Explain Complex, Blackbox Classifiers

Date

2020-06-17

The Harvard community has made this article openly available.

Citation

Ross, Alexis Jihye. 2020. Using Linear Approximations to Explain Complex, Blackbox Classifiers. Bachelor's thesis, Harvard College.

Abstract

Machine learning models have the potential to aid human decision-making in a variety of domains. However, many cannot be safely deployed because they are so complex that they are essentially “black boxes” to humans. Given this fact, the need for an independent method of explaining predictions made by such models arises. This thesis discusses how local linear approximations can be used to explain complex, blackbox classifiers. The first part of this thesis draws upon a philosophical account of causal explanation to argue that local linear approximations derived through sampling methods can be effective causal explanations of blackbox classifier predictions. The second part of this thesis proposes an original end-to-end framework for generating actionable counterfactuals that change classifier predictions. Empirical findings are presented which suggest that this method is a promising avenue for future work.
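
To make the idea of a sampling-based local linear approximation concrete, here is a minimal sketch. It is not the thesis's implementation: the `blackbox` function, the Gaussian sampling scheme, the proximity kernel, and all parameter values are assumptions chosen for illustration. The sketch perturbs a single input, queries the classifier on each perturbed point, and fits a proximity-weighted linear model whose coefficients indicate how each feature locally moves the prediction.

```python
import math
import random


def blackbox(x):
    """Hypothetical stand-in for an opaque classifier; returns P(class = 1)."""
    x1, x2 = x
    return 1.0 / (1.0 + math.exp(-(x1 * x1 + 3.0 * x2 - 2.0)))


def solve(a, b):
    """Solve the small linear system a * w = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w


def local_linear_explanation(f, x0, n_samples=2000, scale=0.3, seed=0):
    """Fit f(x0 + delta) ~ intercept + coefs . delta by weighted least squares.

    Samples are drawn from a Gaussian around x0 (an assumed sampling scheme)
    and weighted by an assumed Gaussian proximity kernel.
    """
    rng = random.Random(seed)
    d = len(x0)
    # Accumulate the weighted normal equations (Z^T W Z) w = Z^T W y.
    ztz = [[0.0] * (d + 1) for _ in range(d + 1)]
    zty = [0.0] * (d + 1)
    for _ in range(n_samples):
        delta = [rng.gauss(0.0, scale) for _ in range(d)]
        y = f([x0[i] + delta[i] for i in range(d)])  # query the black box
        dist2 = sum(v * v for v in delta)
        weight = math.exp(-dist2 / (2.0 * scale ** 2))  # nearer points count more
        z = [1.0] + delta  # leading 1.0 is the intercept column
        for i in range(d + 1):
            zty[i] += weight * z[i] * y
            for j in range(d + 1):
                ztz[i][j] += weight * z[i] * z[j]
    w = solve(ztz, zty)
    return w[0], w[1:]


x0 = [1.0, 0.5]
intercept, coefs = local_linear_explanation(blackbox, x0)
print("prediction at x0:", blackbox(x0))
print("local intercept:", intercept)
print("local feature effects:", coefs)
```

Because the features are centred on `x0`, the intercept approximates the classifier's output at that point, and each coefficient reads as the approximate change in predicted probability per unit change in that feature near `x0`. The approximation is only meaningful locally; a different `x0` or `scale` can yield different coefficients.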

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
