Publication: Learning structured representations in neural networks
Abstract
Deep learning has revolutionized engineering fields that traditionally relied on feature extraction and model-based approaches, yielding groundbreaking results in computer vision, natural language processing, and their intersection. These advances were made possible by the widespread use of GPUs for training deep networks; the resulting models frequently have billions (if not trillions) of parameters and require the equivalent of years of compute to train. However, many learning problems in science and engineering are bounded by constraints: either user-defined (such as cost, compute, or memory constraints) or problem-defined (such as data or task symmetries, or physical constraints).
In this dissertation, we translate such constraints into structured priors and show how that structure can be embedded in neural networks by construction. In the first part, we propose architectures that follow optimization-based priors (which we term user-defined) with provable guarantees; in the second part, we propose architectures that have algebraic structure (which we term problem-defined). Both frameworks lead to neural networks with provable priors and significantly reduced parameter counts, while maintaining (or improving) performance on downstream tasks.