Serena Booth

I'm hiring PhD Students and a Postdoc at Brown!

Apply to the CS PhD program here, and mention my name in your application.
I'm looking for interest in human-AI/human-robot interaction, reinforcement learning, and/or AI Policy.

Here are my research, teaching, and diversity statements from my 2023 faculty job search.

Aligning AI and Robot Behaviors with Human Intents

People must be able to easily specify, model, inspect, and revise AI and robot behaviors. I design methods and tools to enable these interactions.

Specify

Writing specifications for AI systems is critical yet notoriously hard because these systems lack common sense reasoning, making it easy to write specifications that result in unintended and potentially dangerous side effects. I study how experts and non-experts write specifications, and how these specifications should be interpreted.

The Perils of Trial-and-Error Reward Design: Misdesign through Overfitting and Invalid Task Specifications

Trial-and-error reward design is unsanctioned, but the implications of this widespread practice have not been studied. We conduct empirical computational and user study experiments, and we find that trial and error leads to the design of reward functions which are overfit and otherwise misdesigned. Even in a trivial setting, we find that reward function misdesign is rampant. Published at AAAI Conference on Artificial Intelligence, 2023.

Project Webpage

A visual comparing optimal advantage with partial return. The partial return is the same (0) whether the robot makes progress toward the goal or moves away from it. The advantage is higher when the robot makes progress toward the goal, as desired.

Learning Optimal Advantage from Preferences and Mistaking it for Reward

Typical reinforcement learning from human feedback (RLHF) approaches assume that human preferences arise only from trajectory segments' sums of reward. In past work, we showed that regret is a better model of human preferences (published at TMLR 2024). In this work, we consider the consequences if preferences arise instead from this better-supported regret preference model. Published at AAAI Conference on Artificial Intelligence, 2024.

Project Webpage
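The contrast between partial return and optimal advantage described above can be made concrete with a toy example. The sketch below is illustrative only and is not the code from the paper: the one-dimensional chain, the discount factor, the reward of 1 for stepping into the goal, and every function name are assumptions chosen to reproduce the situation in the figure, where both segments have a partial return of 0.

```python
import math

GAMMA = 0.9  # assumed discount factor for this toy chain


def reward(next_distance):
    """Assumed reward: 1 for the step that reaches the goal, 0 otherwise."""
    return 1.0 if next_distance == 0 else 0.0


def optimal_value(distance):
    """V* for a state `distance` steps from the goal (the goal is terminal)."""
    return 0.0 if distance == 0 else GAMMA ** (distance - 1)


def partial_return(segment):
    """Sum of rewards along a segment, given as a list of distances visited."""
    return sum(reward(nxt) for nxt in segment[1:])


def summed_advantage(segment):
    """Sum of optimal advantages A*(s, a) = Q*(s, a) - V*(s) along a segment."""
    total = 0.0
    for cur, nxt in zip(segment, segment[1:]):
        q_star = reward(nxt) + GAMMA * optimal_value(nxt)
        total += q_star - optimal_value(cur)
    return total


def preference_probability(score_a, score_b):
    """Logistic model of the probability that segment A is preferred to B."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))


toward = [3, 2]  # one step toward the goal
away = [3, 4]    # one step away from the goal

p_partial_return = preference_probability(
    partial_return(toward), partial_return(away))
p_advantage = preference_probability(
    summed_advantage(toward), summed_advantage(away))

print("partial-return model:", p_partial_return)
print("advantage-based model:", p_advantage)
```

Under these assumptions, the partial-return model is indifferent between the two segments, while the advantage-based model prefers the segment that makes progress toward the goal.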

Inspect

After writing a specification and using some algorithm to optimize it, how can a person assess whether a robot or an AI has learned the behavior that meets their needs and expectations? Is it aligned to their intent?

A project overview image. It shows a decision surface, with highlighted points corresponding to an adversarial example, a picture of a corgi, a picture of a corgi butt, and a picture of a loaf of bread. The level sets for 50 percent confidence examples (e.g., the corgi butt and the adversarial examples) are highlighted.

Bayes-TrEx: A Bayesian Sampling Approach to Model Transparency by Example

Looking at expressive examples can help us better understand neural network behaviors and design better models. Bayes-TrEx is a tool to find these expressive examples. Published at AAAI Conference on Artificial Intelligence, 2021.

Project Webpage

RoCUS: Robot Controller Understanding via Sampling (An Extension to Bayes-TrEx)

We sample representative robot behaviors. We show how exposing these representative behaviors can help with the revision of a dynamical system robot controller's specifications. Published at Conference on Robot Learning (CoRL), 2021.

Project Webpage

Do Feature Attribution Methods Work?

We design a principled evaluation mechanism for assessing feature attribution methods, and contribute to the growing body of literature suggesting these methods cannot be trusted in the wild. Published at AAAI Conference on Artificial Intelligence, 2022.

Project Webpage

A scene showing how a user might view a logical summary and a system state. The image shows a car with cars to its left, right, and behind. The description says 'I speed up when one or both of: (1) Both of: - a vehicle is not in front of me. - the next exit is not 42. (2) All of: - a vehicle is to my right. - a vehicle is not in front of me. - a vehicle is behind me.'

Communicating Logical Statements Effectively

How should we best present logical sentences to a human? I study whether different logical forms are easier or harder for people to parse. I find that people are more resilient than anticipated! Published at International Joint Conference on AI (IJCAI), 2019.

Project Webpage

Model

How do humans come to understand the behavioral patterns encoded in a specification? More generally, how do humans maintain and mitigate uncertainty about their beliefs about AI systems?

A visual overview of the Variation Theory of Learning

Revisiting Human-Robot Teaching and Learning Through the Lens of Human Concept Learning Theory

We look at how cognitive theories of human concept learning should inform human-robot interaction interfaces, especially for teaching and learning tasks. Published at ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2022.

Project Webpage

Above: two side-by-side photos of a robot moving with contrast, in which a robot has two different policies (i and not i) in the same environment. Below: two side-by-side photos of a robot moving with generalization, in which a robot has the same policy (i) in two different environments.

Varying How We Teach: Adding Contrast Helps Humans Learn about Robot Motions

In this preliminary work, we test the consequences of applying some of the insights from the Variation Theory of Learning to assist humans in learning about robot motions. Published at HRI Workshop on Human-Interactive Robot Learning, 2023.

Project Webpage

Sidequests

Machine Learning Practice Outside Big Tech: How Resource Constraints Challenge Responsible Development

We interviewed industry practitioners from startups, government, and non-tech companies about their use of machine learning in developing products. We analyze these interviews with thematic analysis. Published at AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society (AIES), 2021.

Project Webpage

A small TurtleBot robot, kitted out with a cookie delivery box.

Piggybacking Robots: Human-robot Overtrust in University Dormitory Security

My award-winning undergraduate senior thesis set out to answer whether we place too much trust in robotic systems, specifically in the physical security domain. Published at ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2017.

Project Webpage

Media Coverage

A comic depicting a robot with a plate of cookies, trying to enter someone's house.

PhD Comics

A comic depicting a robot trying to get a human to take a cookie in exchange for placing themself in danger.

Soonish by Kelly and Zach Weinersmith

Advocacy

Serena and colleague Willie Boag standing in front of a doorway in Congress.

Science Policy

In 2021-2022, I served as President and in 2020-2021 as Vice President of MIT's Science Policy Initiative. I advocate for using science to inform policy, and for using policy to make science just and equitable. Pictured above with colleague Willie Boag. Not an endorsement for Senator Alexander.

Serena's students posing for a photo on a staircase.

Equity and Inclusion

I'm a strong advocate for the inclusion of women and underrepresented minorities in science. In 2019, I served as co-president of MIT's GW6: Graduate Women of Course 6. Pictured above: high school students from an introductory CS class I taught in Puebla, Mexico.

Conferences and Journals

Position: Strong Consumer Protection is an Inalienable Defense for AI Safety in the United States

Serena Booth

International Conference on Machine Learning (ICML) 2025

Towards Improving Reward Design in RL: A Reward Alignment Metric for RL Practitioners

Calarina Muslimani, Kerrick Johnstonbaugh, Suyog Chandramouli, Serena Booth, W. Bradley Knox, Matthew E Taylor

Reinforcement Learning Conference (RLC) 2025

Goals vs. Rewards: Towards a Comparative Study of Objective Specification Mechanisms

Septia Rani, Serena Booth, Sarath Sreedharan

Reinforcement Learning Conference (RLC) 2025

Models of Human Preference for Learning Reward Functions

W. Bradley Knox, Stephane Hatgis-Kessell, Serena Booth, Scott Niekum, Peter Stone, Alessandro Allievi

Transactions on Machine Learning Research (TMLR) 2024

Webpage

Learning Optimal Advantage from Preferences and Mistaking it for Reward

W. Bradley Knox, Stephane Hatgis-Kessell, Sigurdur Orn Adalgeirsson, Serena Booth, Anca Dragan, Peter Stone, Scott Niekum

AAAI Conference on Artificial Intelligence 2024

Webpage

Quality-Diversity Generative Sampling for Learning with Synthetic Data

Allen Chang, Matthew Fontaine, Serena Booth, Maja Matarić, Stefanos Nikolaidis

AAAI Conference on Artificial Intelligence 2024

The Perils of Trial-and-Error Reward Design: Misdesign through Overfitting and Invalid Task Specifications

Serena Booth, W. Bradley Knox, Julie Shah, Scott Niekum, Peter Stone, Alessandro Allievi

AAAI Conference on Artificial Intelligence 2023

Webpage

Extended Abstract: Graduate Student Descent Considered Harmful? A Proposal for Studying Overfitting in Reward Functions

Serena Booth, W. Bradley Knox, Julie Shah, Scott Niekum, Peter Stone, Alessandro Allievi

Multidisciplinary Conference on Reinforcement Learning and Decision Making 2022

Spotlight, Extended Abstract: Partial Return Poorly Explains Human Preferences

W. Bradley Knox, Stephane Hatgis-Kessell, Serena Booth, Scott Niekum, Peter Stone, Alessandro Allievi

Multidisciplinary Conference on Reinforcement Learning and Decision Making 2022

Revisiting Human-Robot Teaching and Learning Through the Lens of Human Concept Learning Theory

Serena Booth, Sanjana Sharma, Sarah Chung, Julie Shah, Elena L. Glassman

ACM/IEEE International Conference on Human-Robot Interaction (HRI) 2022

Webpage

Do Feature Attribution Methods Correctly Attribute Features?

Yilun Zhou, Serena Booth, Marco Ribeiro, Julie Shah

AAAI Conference on Artificial Intelligence 2022

Webpage

Bayes-TrEx: A Bayesian Sampling Approach to Model Transparency by Example

Serena Booth, Yilun Zhou, Ankit Shah, Julie Shah

AAAI Conference on Artificial Intelligence 2021

Webpage

RoCUS: Robot Controller Understanding via Sampling

Yilun Zhou, Serena Booth, Nadia Figueroa, Julie Shah

Conference on Robot Learning (CoRL) 2021

Webpage

Machine Learning Practice Outside Big Tech: How Resource Constraints Challenge Responsible Development

Aspen Hopkins, Serena Booth

AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society (AIES) 2021

Webpage

Evaluating the Interpretability of the Knowledge Compilation Map: Communicating Logical Statements Effectively

Serena Booth, Christian Muise, Julie Shah

International Joint Conference on AI (IJCAI) 2019

Webpage

Piggybacking Robots: Human-robot Overtrust in University Dormitory Security

Serena Booth, James Tompkin, Hanspeter Pfister, Jim Waldo, Krzysztof Gajos, Radhika Nagpal

ACM/IEEE International Conference on Human-Robot Interaction (HRI) 2017

Webpage

Workshops

Learning Optimal Advantage from Preferences and Mistaking it for Reward

W. Bradley Knox, Stephane Hatgis-Kessell, Sigurdur Orn Adalgeirsson, Serena Booth, Anca Dragan, Peter Stone, Scott Niekum

2023 ICML Workshop on The Many Facets of Preference-based Learning (MFPL) 2023

Varying How We Teach: Adding Contrast Helps Humans Learn about Robot Motions

Tiffany Horter, Elena L. Glassman, Julie Shah, Serena Booth

HRI Workshop on Human-Interactive Robot Learning 2023

Webpage

The Irrationality of Neural Rationale Models

Yiming Zheng, Serena Booth, Julie Shah, Yilun Zhou

2022 NAACL Workshop on Trustworthy Natural Language Processing (TrustNLP) 2022

Do Feature Attribution Methods Correctly Attribute Features?

Yilun Zhou, Serena Booth, Marco Ribeiro, Julie Shah

NeurIPS 2021 XAI4Debugging Workshop 2021

How to Understand Your Robot: A Design Space Informed by Human Concept Learning

Serena Booth, Sanjana Sharma, Sarah Chung, Julie Shah, Elena L. Glassman

ICRA 2021 Workshop on Social Intelligence in Humans and Robots (SIHR) 2021

Sampling Prediction-Matching Examples in Neural Networks: A Probabilistic Programming Approach

Serena Booth, Ankit Shah, Yilun Zhou, Julie Shah

AAAI 2020 Workshop on Statistical Relational Artificial Intelligence (StarAI) 2020

Modeling Blackbox Agent Behaviour via Knowledge Compilation

Christian Muise, Salomon Wollenstein Betech, Serena Booth, Julie Shah, Yasaman Khazaeni

AAAI 2020 Workshop on Plan, Activity, and Intent Recognition (PAIR) 2020