Constrained Bayesian Neural Networks

Date

2019-10-25

Citation

Lorch, Lars Christian. 2019. Constrained Bayesian Neural Networks. Bachelor's thesis, Harvard College.

Abstract

Neural networks are central to many of the recent empirical breakthroughs in machine learning, but their inability to model the uncertainty of predictions makes them inadequate for safety-critical domains. Bayesian neural networks (BNNs) extend standard neural networks by modeling probability distributions over predictions, allowing us to judge how confident we should be in a given prediction. However, even though BNNs define a theoretical framework for incorporating prior beliefs about the parameters, they are unable to encode interpretable prior knowledge in function space, where most experts have prior domain knowledge. In this thesis, I present a rigorous and interpretable approach to imposing prior constraints in the input-output space on the distributions modeled by Bayesian neural networks. By formulating a general constraint prior, the novel method can be applied to arbitrary inequality constraints and treated as a black box with any inference technique normally used with BNNs. Extensive evaluations show qualitatively and quantitatively that Constrained Bayesian neural networks successfully incorporate complex yet interpretable constraints into the functions they model. Furthermore, Constrained BNNs do not negatively affect the objectives or advantages of the inference methods and can even guide mean-field variational inference approaches to higher unconstrained ELBO values than standard BNNs on average. Finally, novel multimodal posterior predictive distributions are shown in special constraint cases. To obtain these results, a new formulation of variational inference with a general Gaussian mixture variational family is derived and compared with a state-of-the-art sampling method.
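
To make the idea of a constraint prior over inequality constraints more concrete, the following is a minimal illustrative sketch and not the formulation derived in the thesis. It assumes a tiny one-hidden-layer network, an isotropic Gaussian prior on the weights, a hypothetical constraint f(x) <= 1 on a set of constraint inputs, and a simple linear penalty with a made-up `strength` parameter. All function names and values are assumptions chosen for illustration.

```python
import math

HIDDEN = 8                      # hypothetical hidden-layer width
N_WEIGHTS = 3 * HIDDEN + 1      # input weights, hidden biases, output weights, output bias


def mlp(weights, x):
    """Tiny one-hidden-layer tanh network f(x; w) with scalar input and output."""
    w_in = weights[:HIDDEN]
    b_in = weights[HIDDEN:2 * HIDDEN]
    w_out = weights[2 * HIDDEN:3 * HIDDEN]
    b_out = weights[3 * HIDDEN]
    hidden = (math.tanh(a * x + b) for a, b in zip(w_in, b_in))
    return sum(v * h for v, h in zip(w_out, hidden)) + b_out


def log_gaussian_prior(weights, sigma=1.0):
    """Isotropic Gaussian log-prior on the weights, up to an additive constant."""
    return -0.5 * sum(w * w for w in weights) / sigma ** 2


def violation(y, upper=1.0):
    """Amount by which the assumed constraint y <= upper is violated (0 if satisfied)."""
    return max(y - upper, 0.0)


def log_constrained_prior(weights, constraint_inputs, strength=100.0):
    """Unnormalized log-prior: weight-space prior minus a function-space penalty.

    The penalty is evaluated on network outputs at the constraint inputs, so the
    result is still just a function of the weights and could, in principle, be
    handed to any inference routine that accepts an unnormalized log-density.
    """
    penalty = sum(violation(mlp(weights, x)) for x in constraint_inputs)
    return log_gaussian_prior(weights) - strength * penalty


constraint_inputs = [-1.0, -0.5, 0.0, 0.5, 1.0]

# All-zero weights give f(x) = 0 everywhere, which satisfies f(x) <= 1.
satisfying = [0.0] * N_WEIGHTS

# An output bias of 5 gives f(x) = 5 everywhere, which violates the constraint.
violating = [0.0] * N_WEIGHTS
violating[-1] = 5.0

print(log_constrained_prior(satisfying, constraint_inputs))
print(log_constrained_prior(violating, constraint_inputs))
```

Because the penalty depends only on the network's outputs at chosen inputs, the constraint is stated in the input-output space while the resulting density remains a function of the weights. This is one simple way to read the "black box" property mentioned in the abstract; the thesis itself should be consulted for the actual prior.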

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service