Show simple item record

dc.contributor.author: Dahl, George E.
dc.contributor.author: Adams, Ryan Prescott
dc.contributor.author: Larochelle, Hugo
dc.date.accessioned: 2013-12-13T15:29:56Z
dc.date.issued: 2012
dc.identifier: Quick submit: 2013-08-08T12:15:38-04:00
dc.identifier.citation: Dahl, George E., Ryan Prescott Adams, and Hugo Larochelle. 2012. Training restricted Boltzmann machines on word observations. In Proceedings of the 29th International Conference on Machine Learning, Edinburgh, Scotland, June 26 – July 1, 2012, ed. John Langford and Joelle Pineau, 679-686. Edinburgh: International Machine Learning Society.
dc.identifier.isbn: 9781450312851
dc.identifier.uri: http://nrs.harvard.edu/urn-3:HUL.InstRepos:11375693
dc.description.abstract: The restricted Boltzmann machine (RBM) is a flexible tool for modeling complex data; however, there have been significant computational difficulties in using RBMs to model high-dimensional multinomial observations. In natural language processing applications, words are naturally modeled by K-ary discrete distributions, where K is determined by the vocabulary size and can easily be in the hundreds of thousands. The conventional approach to training RBMs on word observations is limited because it requires sampling the states of K-way softmax visible units during block Gibbs updates, an operation that takes time linear in K. In this work, we address this issue by employing a more general class of Markov chain Monte Carlo operators on the visible units, yielding updates with computational complexity independent of K. We demonstrate the success of our approach by training RBMs on hundreds of millions of word n-grams using larger vocabularies than previously feasible and using the learned features to improve performance on chunking and sentiment classification tasks, achieving state-of-the-art results on the latter.
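The abstract contrasts exact block-Gibbs sampling of a K-way softmax visible unit, which costs O(K) per update, with MCMC updates whose cost is independent of K. The following is a minimal sketch of that contrast, not the authors' implementation: the weight shapes, the uniform proposal, and all numeric values are illustrative assumptions, and the paper's actual operators are more general.

```python
import numpy as np

rng = np.random.default_rng(0)

K, H = 1000, 16                   # vocabulary size, number of hidden units (illustrative)
W = rng.normal(0, 0.1, (K, H))    # weights for one word position (hypothetical values)
b = rng.normal(0, 0.1, K)         # visible biases
h = rng.integers(0, 2, H)         # a fixed binary hidden state for this update

def score(k):
    # Unnormalized log-probability of word k given the hidden state h.
    # Cost depends only on H, not on the vocabulary size K.
    return b[k] + W[k] @ h

def softmax_sample():
    # Exact block-Gibbs update of the softmax visible unit:
    # computes logits for every word in the vocabulary, so it is O(K).
    logits = b + W @ h
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return rng.choice(K, p=p)

def mh_update(k, n_steps=5):
    # Metropolis-Hastings update with a uniform proposal: each step
    # evaluates the score of only two words, so its cost per step is
    # independent of K.  The uniform proposal is symmetric, so the
    # acceptance ratio reduces to a difference of scores.
    for _ in range(n_steps):
        k_new = rng.integers(K)              # propose a word uniformly at random
        log_alpha = score(k_new) - score(k)
        if np.log(rng.random()) < log_alpha:
            k = k_new
    return k
```

Both samplers target the same conditional distribution over words; the MH chain trades exactness per update for per-step cost that no longer grows with the vocabulary.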
dc.description.sponsorship: Engineering and Applied Sciences
dc.language.iso: en_US
dc.publisher: International Machine Learning Society
dc.relation.isversionof: http://icml.cc/2012/papers/364.pdf
dc.relation.hasversion: http://arxiv.org/pdf/1202.5695v2.pdf
dash.license: OAP
dc.subject: learning
dc.subject: machine learning
dc.title: Training Restricted Boltzmann Machines on Word Observations
dc.type: Conference Paper
dc.date.updated: 2013-08-08T16:16:08Z
dc.description.version: Author's Original
dc.rights.holder: George E. Dahl; Ryan Prescott Adams; Hugo Larochelle
dash.depositing.author: Adams, Ryan Prescott
dc.date.available: 2013-12-13T15:29:56Z
dc.relation.book: Proceedings of the 29th International Conference on Machine Learning
workflow.legacycomments: FLAG2 I'm not sure about posting the publisher's version. It's possible, even likely, though, that the "publisher's version" is identical to the manuscript in this case. This looks to me like it was built with a LaTeX template beforehand, and was not altered afterwards. No page numbers, etc. If this is actually a manuscript, we can post OAP. Committed 12/13/13 by eek per CS convention of using LaTeX and statement on first page of "Appearing in...", making a call to deposit this OAP.
dash.contributor.affiliated: Adams, Ryan Prescott

