Publication:

Just for Laughs: Utilizing Machine Learning to Rate and Generate Humorous Analogies

Date

2017-07-14

Abstract

This thesis presents a procedure for generating humorous 4-word analogies of the form humans : water :: Texans : barbecue. Using a neural word embedding, we created a system that constructs comedic 4-tuples of words, starting from an initial pool of funny analogies written and rated by Amazon Mechanical Turk (AMT) users. Our procedure involved 4 main steps:

  1. Generating a collection of “funny words” by using a support vector machine (SVM) to classify words from the embedding that are similar to words used in the analogies written by AMT users.
  2. Generating funny pairs of words taken from the collection of funny words, and classifying them to obtain more likely funny words. Negative examples were randomly generated pairs from the embedding, and positive examples were pairs from the AMT users’ analogies.
  3. Generating matchings of generated pairs by another SVM classifier. Negative examples were random 4-tuples of words from the embedding; positive examples were complete humorous analogies obtained from AMT.

Our method was shown to perform significantly better than the following baselines:

  • random 4-tuples of words from the embedding
  • random 4-tuples of words from the “funny pool” of words we classified
  • random matching of funny pairs we generated

We assessed the performance in terms of the mean score obtained per analogy in each baseline, and in terms of the maximum funniness score obtained in each category (7/10 for fully computer-generated analogies, 5/10 for random matching of pairs, 4/10 for random "funny" words, and 3/10 for random words). To further establish the usefulness of neural word embeddings for capturing humor and generating comedic structures, we introduced a "funniness" score prediction, which showed a positive correlation with actual ratings obtained from AMT users, and performed a Turing test, in which 35% of computer-generated analogies were mistaken for human-made ones.
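
The classification idea behind the first step can be illustrated with a minimal sketch. The abstract does not state which kernel, features, or library were used, so everything below is an assumption: a linear SVM trained by subgradient descent on the hinge loss, and a made-up 2-D "embedding" standing in for a real neural word embedding. The word lists, the vectors, and the function names are hypothetical.

```python
"""Toy sketch: a linear SVM that labels words as "funny" from their vectors."""

# Hypothetical 2-D "embedding"; a real one has hundreds of dimensions.
EMBEDDING = {
    "barbecue": (2.0, 1.0),
    "banana": (1.5, 2.0),
    "pickle": (2.5, 1.5),
    "table": (-2.0, -1.0),
    "report": (-1.5, -2.0),
    "window": (-2.5, -1.5),
    "waffle": (1.8, 1.2),    # unlabeled candidate
    "ledger": (-1.0, -1.0),  # unlabeled candidate
}

# Assumed labels: positives play the role of words taken from human-written
# analogies, negatives the role of words drawn at random from the embedding.
POSITIVE_WORDS = ["barbecue", "banana", "pickle"]
NEGATIVE_WORDS = ["table", "report", "window"]
CANDIDATES = ["waffle", "ledger"]


def decision_value(w, b, x):
    """Signed score w.x + b; a positive value means the "funny" class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=200):
    """Minimise lam/2 * ||w||^2 + mean hinge loss by subgradient descent."""
    dim = len(samples[0])
    n = len(samples)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regulariser
        grad_b = 0.0
        for x, y in zip(samples, labels):
            # Only samples inside the margin contribute to the subgradient.
            if y * decision_value(w, b, x) < 1:
                for i in range(dim):
                    grad_w[i] -= y * x[i] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def classify_words(words, w, b):
    """Return the words whose vectors fall on the "funny" side."""
    return [word for word in words if decision_value(w, b, EMBEDDING[word]) > 0]


samples = [EMBEDDING[word] for word in POSITIVE_WORDS + NEGATIVE_WORDS]
labels = [1] * len(POSITIVE_WORDS) + [-1] * len(NEGATIVE_WORDS)
weights, bias = train_linear_svm(samples, labels)
funny_candidates = classify_words(CANDIDATES, weights, bias)
print(funny_candidates)
```

The same pattern could in principle be reused for pairs and 4-tuples, for example by concatenating the word vectors into one feature vector before classification. That representation is an assumption made here for illustration; the abstract does not describe how pairs or 4-tuples were encoded.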

Keywords

Computer Science

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
