Publication:
Modeling Human Behavior in Space Invaders

Date

2019-08-23

The Harvard community has made this article openly available.

Citation

Lennon, James. 2019. Modeling Human Behavior in Space Invaders. Bachelor's thesis, Harvard College.

Abstract

AI systems deployed in the real world must be able to interact and cooperate effectively with the people who use and benefit from them. To make this possible, these systems need a realistic model of how humans will behave in various situations; either overestimating or underestimating human performance can lead to strongly suboptimal outcomes. To this end, this thesis proposes a new algorithm for imitation learning, working in the Atari 2600 Space Invaders environment. We first modify GAIL (Generative Adversarial Imitation Learning), a state-of-the-art deep imitation learning algorithm, to work in Atari environments and verify that the modified version scales to more complex environments more effectively than the original algorithm. We then build a framework for evaluating and comparing human imitators, developing a set of statistics that consider both in-environment performance and descriptive similarity. The new method breaks the problem of human imitation into two subproblems: creating an agent that plays the game well, and learning a "corrective" function that modifies this agent to play in a human manner. This hybrid approach is fast to train and can be easily tuned along a spectrum, trading off closer matching of human behavior against higher-level performance. The approach shows promising results across the evaluation statistics: it achieves a high likelihood of the data under the learned policy, produces a score distribution matching that of the human data, and matches the human distribution of actions as it acts in the environment.
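
Two of the evaluation statistics named in the abstract, the likelihood of the human data under a learned policy and the match between action distributions, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the dictionary-based policy format, and the choice of total variation distance as the comparison measure are all assumptions made here for illustration.

```python
import math
from collections import Counter


def mean_log_likelihood(policy_probs, human_actions):
    """Average log-probability a policy assigns to the recorded human actions.

    policy_probs[t] is a dict mapping action -> probability at step t
    (a hypothetical format chosen for this sketch).
    """
    eps = 1e-12  # floor to avoid log(0) when the policy rules an action out
    total = 0.0
    for probs, action in zip(policy_probs, human_actions):
        total += math.log(max(probs.get(action, 0.0), eps))
    return total / len(human_actions)


def action_distribution(actions):
    """Empirical frequency of each action in a trajectory."""
    counts = Counter(actions)
    n = len(actions)
    return {a: c / n for a, c in counts.items()}


def total_variation(p, q):
    """Total variation distance between two action distributions (assumed metric)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(a, 0.0) - q.get(a, 0.0)) for a in keys)


# Toy data: a uniform two-action policy and short human/agent trajectories.
policy = [{"LEFT": 0.5, "FIRE": 0.5}, {"LEFT": 0.5, "FIRE": 0.5}]
human = ["LEFT", "FIRE"]
agent = ["LEFT", "LEFT"]

ll = mean_log_likelihood(policy, human)
tv = total_variation(action_distribution(human), action_distribution(agent))
```

A higher mean log-likelihood and a smaller distance between action distributions would both indicate closer agreement with the human data under these assumed definitions.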

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
