Publication:

Information Geometric Approaches for Neural Network Algorithms

Date

2016-06-21

Abstract

This thesis applies differential geometry to learning algorithms for stochastic multilayer perceptrons and Boltzmann machines. Traditional methods rely on gradient descent, but because the probability distributions of the algorithm outputs form a non-Euclidean space, the gradient with respect to the model parameters is not in general the true direction of steepest descent. We instead use a natural gradient derived from the Fisher metric on statistical manifolds. For multilayer perceptrons, the challenge lies in deriving the inverse Fisher matrix, for which we provide explicit forms in some simple cases and approximations for more general feedforward networks. For Boltzmann machines, we discuss the theory of exponential families, elucidating relationships between the Fisher metric, the Kullback-Leibler divergence, and particular geometric connections on exponential families. These relationships provide a view of Boltzmann machine learning in terms of geodesics and form the foundation for another application of the natural gradient. Throughout, we provide simulation results for some of the algorithms discussed, demonstrating the practical power of the natural gradient in some cases.
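As a rough illustration of the natural gradient idea described in the abstract, the sketch below uses a deliberately tiny model that is not taken from the thesis: a single Bernoulli unit with logit parameter `theta`, which is a one-parameter exponential family. For this toy case the Fisher information is `p * (1 - p)`, so the "inverse Fisher matrix" is a scalar division. All function and variable names, the target probability, the starting point, and the step size are assumptions chosen for illustration only.

```python
import math

def sigmoid(theta):
    """Success probability of a Bernoulli unit with logit parameter theta."""
    return 1.0 / (1.0 + math.exp(-theta))

def kl_bernoulli(q, p):
    """KL(q || p) between Bernoulli distributions with success probabilities q, p."""
    return q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))

def fisher(theta):
    """Fisher information of the Bernoulli family in the logit coordinate."""
    p = sigmoid(theta)
    return p * (1 - p)

def grad_kl(theta, q):
    """Ordinary gradient of KL(q || p_theta) with respect to theta."""
    return sigmoid(theta) - q

def descend(theta, q, lr, steps, natural):
    """Run plain or natural gradient descent on KL(q || p_theta)."""
    for _ in range(steps):
        g = grad_kl(theta, q)
        if natural:
            # Natural gradient: precondition by the inverse Fisher information.
            g = g / fisher(theta)
        theta -= lr * g
    return theta

# Hypothetical settings for the toy problem (assumptions, not thesis values).
q = 0.9          # target success probability
theta0 = -2.0    # initial logit
lr = 0.1         # step size
steps = 50

theta_plain = descend(theta0, q, lr, steps, natural=False)
theta_nat = descend(theta0, q, lr, steps, natural=True)
kl_plain = kl_bernoulli(q, sigmoid(theta_plain))
kl_nat = kl_bernoulli(q, sigmoid(theta_nat))
print("KL after plain gradient descent:  ", kl_plain)
print("KL after natural gradient descent:", kl_nat)

# The Fisher metric is the second-order term of the KL divergence:
# KL(p_theta || p_{theta + delta}) is approximately 0.5 * F(theta) * delta**2.
theta, delta = 0.3, 1e-3
kl_small = kl_bernoulli(sigmoid(theta), sigmoid(theta + delta))
fisher_from_kl = 2.0 * kl_small / delta ** 2
print("Fisher from KL expansion:", fisher_from_kl, "closed form:", fisher(theta))
```

With these settings the natural gradient run ends much closer to the target distribution than the plain run for the same step count. The sketch says nothing about the multilayer or Boltzmann machine cases, where obtaining the inverse Fisher matrix is the hard part the abstract refers to.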

Keywords

Mathematics, Computer Science

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
