Publications

(2023). Towards the Scalable Evaluation of Cooperativeness in Language Models. Under review.

(2023). Reclaiming the Digital Commons: A Public Data Trust for Training Data. Under review.

(2023). Characterizing Manipulation from AI Systems. Under review.

(2023). Harms from Increasingly Agentic Algorithmic Systems. Under review.

(2021). Scoring Rules for Performative Binary Prediction. In NeurIPS 2021 Workshop on Strategic ML.

(2021). Loss of Control: "Normal Accidents" and AI Systems. In ICLR 2021 RAI Workshop.

(2021). The Limits of Global Inclusion in AI Development. In AAAI 2021 RDAI Workshop, selected for spotlight.

(2020). Inverse Policy Evaluation for Value-based Sequential Decision-making. Preprint.

(2020). Training Recurrent Neural Networks Online by Learning Explicit State Variables. In ICLR 2020.

(2020). Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning. In AAAI 2020.
