ObjectiveGAN: Using Generative Adversarial Networks and Reinforcement Learning to Fine-Tune Sequence Generation Models
Author
Guimaraes, Gabriel Lima
Abstract
When generating sequences with recurrent neural networks, naive reinforcement learning can be used to give "hints" and guide the model's generative process towards an arbitrary objective criterion on the output data. For example, when generating music one might reward the model for staying within a single key, or when generating molecule strings one might reward outputs that are valid chemical compounds. Very often, this type of heuristic backfires: the model becomes lazy or greedy, generates uninteresting data, and can even fail to improve on the given objective. Traditional approaches have tweaked the objective function, adding domain-specific penalties and rewards to prevent the model from becoming greedy. While this method has been successful in improving the desired objective, effective reward functions can be hard to craft and rely heavily on domain-specific knowledge.
This thesis introduces ObjectiveGAN as a solution to this problem. We employ Generative Adversarial Networks (GANs) to increase the entropy of the generative process and prevent it from being greedy, ultimately improving the objective we are interested in. In contrast with traditional RL methods that depend on carefully crafted heuristics to work well, ObjectiveGAN also works with simple heuristics by adding a dynamic GAN component to the reward function. This GAN component allows the model to maximize the hard-coded objective while retaining information learned from the training data.
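As a rough illustration of the idea of adding a GAN component to a hard-coded reward, the sketch below blends a discriminator score with a simple objective score. The linear blend, the `weight` parameter, and the names `blended_reward`, `toy_objective`, and `toy_discriminator` are all assumptions made for illustration; the abstract does not specify how the two terms are combined, and the placeholder discriminator stands in for a trained network.

```python
from typing import Callable


def blended_reward(
    sequence: str,
    objective: Callable[[str], float],
    discriminator: Callable[[str], float],
    weight: float = 0.5,
) -> float:
    """Mix a discriminator score with a hard-coded objective score.

    Assumption: a linear blend with a fixed weight in [0, 1]. Both scores
    are expected to lie in [0, 1].
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return weight * discriminator(sequence) + (1.0 - weight) * objective(sequence)


def toy_objective(sequence: str) -> float:
    """Hypothetical stand-in for a validity check: 1.0 if parentheses balance."""
    depth = 0
    for ch in sequence:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return 0.0
    return 1.0 if depth == 0 else 0.0


def toy_discriminator(sequence: str) -> float:
    """Hypothetical placeholder; a real discriminator would be a trained model."""
    return 0.25


# Reward for a sequence that passes the toy validity check.
reward = blended_reward("C(C)O", toy_objective, toy_discriminator, weight=0.5)
```

With `weight=0.0` the reward reduces to the hard-coded objective alone, which corresponds to the naive setting described above; raising `weight` gives more influence to the term that reflects the training data.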
We implement ObjectiveGAN in the context of molecule generation in chemistry and show that it can be used to generate a large percentage of new valid molecules that are not present in the training set.
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Citable link to this page
http://nrs.harvard.edu/urn-3:HUL.InstRepos:38811489
Collections
- FAS Theses and Dissertations [5848]