Using Linear Approximations to Explain Complex, Blackbox Classifiers
Ross, Alexis Jihye
Citation: Ross, Alexis Jihye. 2020. Using Linear Approximations to Explain Complex, Blackbox Classifiers. Bachelor's thesis, Harvard College.
Abstract: Machine learning models have the potential to aid human decision-making in a variety of domains. However, many cannot be safely deployed because they are so complex that they are essentially “black boxes” to humans. Given this fact, the need for an independent method of explaining predictions made by such models arises. This thesis discusses how local linear approximations can be used to explain complex, blackbox classifiers. The first part of this thesis draws upon a philosophical account of causal explanation to argue that local linear approximations derived through sampling methods can be effective causal explanations of blackbox classifier predictions. The second part of this thesis proposes an original end-to-end framework for generating actionable counterfactuals that change classifier predictions. Empirical findings are presented which suggest that this method is a promising avenue for future work.
Citable link to this page: https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37364684
- FAS Theses and Dissertations