Show simple item record

dc.contributor.author	Singh, Rachit
dc.date.accessioned	2019-03-26T11:07:53Z
dc.date.created	2018-05
dc.date.issued	2018-06-29
dc.date.submitted	2018
dc.identifier.uri	http://nrs.harvard.edu/urn-3:HUL.InstRepos:38811559
dc.description.abstract	We introduce a variant of the variational RNN (VRNN) model with discrete latent states to increase the interpretability of RNN-based language models. Finding that naively training the model results in the same posterior collapse phenomenon observed in many other autoregressive tasks, we take the special case of an HMM, where exact inference is tractable, and examine the optimization challenges in that setting. We learn that sampling to estimate the optimization objective likely makes optimization of the inference network intractable. Since the exact ELBO can be computed for an HMM, we train an inference network for an HMM generative model (without any posterior collapse), then initialize a VRNN using the HMM's parameters and inference network. We find that fine-tuning this model and adding non-Markovian transitions between latent time steps lets the model approach an LSTM-based language model's performance while maintaining a sparse discrete latent state.
dc.format.mimetype	application/pdf
dc.language.iso	en
dash.license	LAA
dc.subject	Computer Science
dc.subject	Mathematics
dc.title	Sequential Discrete Latent Variables for Language Modeling
dc.type	Thesis or Dissertation
dash.depositing.author	Singh, Rachit
dc.date.available	2019-03-26T11:07:53Z
thesis.degree.date	2018
thesis.degree.grantor	Harvard College
thesis.degree.level	Undergraduate
thesis.degree.name	AB
dc.type.material	text
thesis.degree.department	Computer Science
dash.identifier.vireo	http://etds.lib.harvard.edu/college/admin/view/293
dash.author.email	rachitsingh@outlook.com
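
Note on the abstract: the approach described above hinges on the fact that, in the HMM special case, the sequential ELBO can be evaluated exactly rather than estimated by sampling. As a rough illustration (the factorization of q written below is an assumed VRNN-style form for exposition, not a formula taken from the thesis), the bound in question is

    \log p(x_{1:T}) \;\ge\; \mathbb{E}_{q(z_{1:T}\mid x_{1:T})}\!\left[ \sum_{t=1}^{T} \log p(x_t \mid z_{\le t}, x_{<t}) + \log p(z_t \mid z_{<t}, x_{<t}) - \log q(z_t \mid z_{<t}, x_{1:T}) \right]

When the transitions reduce to p(z_t \mid z_{t-1}) and the emissions to p(x_t \mid z_t), with z_t ranging over K discrete states, both this expectation and the marginal likelihood itself can be computed exactly with the forward algorithm in O(T K^2) time, which is why no sampling is needed in that setting.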

