dc.contributor.author: Zhang, Haoqi
dc.contributor.author: Parkes, David C.
dc.contributor.author: Chen, Yiling
dc.date.accessioned: 2010-04-27T14:29:17Z
dc.date.issued: 2009
dc.identifier.citation: Zhang, Haoqi, David C. Parkes, and Yiling Chen. 2009. Policy teaching through reward function learning. In Proceedings of the tenth ACM Conference on Electronic Commerce : July 6-10, 2009, Stanford, California, ed. J. Chuang, 295-304. New York: ACM Press. [en_US]
dc.identifier.isbn: 978-1-60558-458-4 [en_US]
dc.identifier.uri: http://nrs.harvard.edu/urn-3:HUL.InstRepos:3996846
dc.description.abstract: Policy teaching considers a Markov Decision Process setting in which an interested party aims to influence an agent's decisions by providing limited incentives. In this paper, we consider the specific objective of inducing a pre-specified desired policy. We examine both the case in which the agent's reward function is known and unknown to the interested party, presenting a linear program for the former case and formulating an active, indirect elicitation method for the latter. We provide conditions for logarithmic convergence, and present a polynomial time algorithm that ensures logarithmic convergence with arbitrarily high probability. We also offer practical elicitation heuristics that can be formulated as linear programs, and demonstrate their effectiveness on a policy teaching problem in a simulated ad-network setting. We extend our methods to handle partial observations and partial target policies, and provide a game-theoretic interpretation of our methods for handling strategic agents. [en_US]
dc.description.sponsorship: Engineering and Applied Sciences [en_US]
dc.language.iso: en_US [en_US]
dc.publisher: Association for Computing Machinery [en_US]
dc.relation.isversionof: http://portal.acm.org/citation.cfm?id=1566417&dl=ACM [en_US]
dc.relation.hasversion: http://www.eecs.harvard.edu/econcs/pubs/zhangec09.pdf [en_US]
dash.license: META_ONLY
dc.subject: active indirect elicitation [en_US]
dc.subject: environment design [en_US]
dc.subject: policy teaching [en_US]
dc.subject: preference elicitation [en_US]
dc.subject: preference learning [en_US]
dc.title: Policy Teaching Through Reward Function Learning [en_US]
dc.type: Monograph or Book [en_US]
dc.description.version: Version of Record [en_US]
dash.depositing.author: Parkes, David C.
dash.embargo.until: 10000-01-01
dc.identifier.doi: 10.1145/1566374.1566417
dash.contributor.affiliated: Zhang, Haoqi
dash.contributor.affiliated: Chen, Yiling
dash.contributor.affiliated: Parkes, David

