Accounting for Human Learning when Inferring Human Preferences

Abstract

Inverse reinforcement learning (IRL) is a powerful tool for learning reward functions from demonstrations. However, standard IRL assumes that the human demonstrator is stationary, that is, that their policy does not change during the demonstrations. In this paper, we study IRL when the human is still learning, so that their policy improves over the course of the demonstrations. We show that observing a learning human can be more informative about their preferences than observing a human with a fixed policy, and that standard IRL techniques perform poorly when the human is learning in an unfamiliar environment.
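To make the stationarity issue concrete, here is a minimal toy sketch, not the paper's method: a Boltzmann Q-learner in a two-armed bandit, with a grid-search maximum-likelihood fit under two observer models (one assuming a stationary demonstrator, one replaying the learner's updates). The bandit, the update rule, and all parameters (`true_theta`, `alpha`, `beta`, `T`) are illustrative assumptions.

```python
# Toy illustration (assumed setup, not the paper's algorithm): demonstrations
# come from a Q-learning human; we fit the reward gap two ways.
import numpy as np

rng = np.random.default_rng(0)
true_theta = np.array([0.0, 1.0])  # deterministic arm rewards (assumed)
alpha, beta, T = 0.2, 3.0, 200     # learning rate, rationality, horizon

def softmax(q):
    z = np.exp(beta * (q - q.max()))
    return z / z.sum()

def simulate_learner(theta):
    """Roll out a Boltzmann Q-learner; return its action sequence."""
    q, actions = np.zeros(2), []
    for _ in range(T):
        a = rng.choice(2, p=softmax(q))
        q[a] += alpha * (theta[a] - q[a])  # reward of arm a is theta[a]
        actions.append(a)
    return actions

def loglik_stationary(theta, actions):
    """Standard IRL: actions assumed Boltzmann w.r.t. the true rewards."""
    p = softmax(theta)
    return sum(np.log(p[a]) for a in actions)

def loglik_learning(theta, actions):
    """Learning-aware: replay the learner's (deterministic) Q trajectory."""
    q, ll = np.zeros(2), 0.0
    for a in actions:
        ll += np.log(softmax(q)[a])
        q[a] += alpha * (theta[a] - q[a])
    return ll

actions = simulate_learner(true_theta)
grid = [np.array([0.0, d]) for d in np.linspace(-2, 2, 81)]
best_stat = max(grid, key=lambda th: loglik_stationary(th, actions))
best_learn = max(grid, key=lambda th: loglik_learning(th, actions))
print("true reward gap:         ", true_theta[1])
print("stationary-IRL estimate: ", best_stat[1])
print("learning-aware estimate: ", best_learn[1])
```

Because the learner explores near-uniformly early on, the stationary model tends to misread those exploratory actions as indifference between the arms, while the learning-aware model attributes them to an unconverged policy.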

Publication
NeurIPS 2020 HAMLETS Workshop
Lawrence Chan