Deep Latent Variable Models of Natural Language

Date

2020-05-14

The Harvard community has made this article openly available.

Citation

Kim, Yoon. 2020. Deep Latent Variable Models of Natural Language. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.

Abstract

Understanding natural language involves complex underlying processes by which meaning is extracted from surface form. One approach to operationalizing such phenomena in computational models of natural language is through probabilistic latent variable models, which can encode structural dependencies among observed and unobserved variables of interest within a probabilistic framework. Deep learning, on the other hand, offers an alternative computational approach to modeling natural language through end-to-end learning of expressive, global models, where any phenomena necessary for the task are captured implicitly within the hidden layers of a neural network. This thesis explores a synthesis of deep learning and latent variable modeling for natural language processing applications. We study a class of models called deep latent variable models, which parameterize components of probabilistic latent variable models with neural networks, thereby retaining the modularity of latent variable models while at the same time exploiting rich parameterizations enabled by recent advances in deep learning. We experiment with different families of deep latent variable models to target a wide range of language phenomena, from word alignment to parse trees, and apply them to core natural language processing tasks including language modeling, machine translation, and unsupervised parsing.

We also investigate key challenges in learning and inference that arise when working with deep latent variable models for language applications. A standard approach for learning such models is through amortized variational inference, in which a global inference network is trained to perform approximate posterior inference over the latent variables. However, a straightforward application of amortized variational inference is often insufficient for many applications of interest, and we consider several extensions to the standard approach that lead to improved learning and inference. In summary, each chapter presents a deep latent variable model tailored for modeling a particular aspect of language, and develops an extension of amortized variational inference for addressing the particular challenges brought on by the latent variable model being considered. We anticipate that these techniques will be broadly applicable to other domains of interest.

Keywords

natural language processing, deep learning, latent variable models

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
