Sequential Discrete Latent Variables for Language Modeling

Singh, Rachit

dc.contributor.author	Singh, Rachit
dc.date.accessioned	2019-03-26T11:07:53Z
dc.date.created	2018-05
dc.date.issued	2018-06-29
dc.date.submitted	2018
dc.identifier.uri	http://nrs.harvard.edu/urn-3:HUL.InstRepos:38811559
dc.description.abstract	We introduce a variant of the variational RNN (VRNN) model with discrete latent states to increase interpretability in RNN-based language models. Finding that naively training the model results in the same posterior collapse phenomenon observed in many other autoregressive tasks, we take the special case of an HMM where exact inference is tractable and examine the optimization challenges in that setting. We learn that sampling to compute the optimization objective likely causes optimization of the inference network to be intractable. Since the exact ELBO can be computed in the case of an HMM, we train an inference network for an HMM generative model (without any posterior collapse), then initialize a VRNN using the HMM's parameters and inference network. We find that fine-tuning this model and adding non-Markovian transitions between latent time steps lets the model approach an LSTM-based language model's performance, while maintaining a sparse discrete latent state.
dc.format.mimetype	application/pdf
dc.language.iso	en
dash.license	LAA
dc.subject	Computer Science
dc.subject	Mathematics
dc.title	Sequential Discrete Latent Variables for Language Modeling
dc.type	Thesis or Dissertation
dash.depositing.author	Singh, Rachit
dc.date.available	2019-03-26T11:07:53Z
thesis.degree.date	2018
thesis.degree.grantor	Harvard College
thesis.degree.level	Undergraduate
thesis.degree.name	AB
dc.type.material	text
thesis.degree.department	Computer Science
dash.identifier.vireo	http://etds.lib.harvard.edu/college/admin/view/293
dash.author.email	rachitsingh@outlook.com

Files in this item

Name: SINGH-SENIORTHESIS-2018.pdf
Size: 2.128 MB
Format: PDF

This item appears in the following Collection(s)

FAS Theses and Dissertations [6136]
