Publications

Evaluating Language-Model Agents on Realistic Autonomous Tasks

We create four agents from Claude and GPT-4 to investigate the ability of frontier language models to perform autonomous replication and adaptation.

Optimal Cost Design for Model Predictive Control
Benefits of assistance over reward learning

We illustrate the benefits of agents that try to assist humans over agents that learn a reward during training and then maximize that reward after deployment.

Accounting for Human Learning when Inferring Human Preferences