Autumn 2023 Colloquium

Organizers: Jerry Savage, Abhishek Gupta, Maya Cakmak, Josh Smith

No Robotics Colloquium 10/06/2023
Becoming Teammates: Designing Assistive, Collaborative Machines
Chien-Ming Huang (Johns Hopkins University) 10/13/2023

Abstract: The growing power of computing and AI promises a near-term future of human-machine teamwork. In this talk, I will present my research group’s efforts in understanding the complex dynamics of human-machine interaction and in designing intelligent machines that assist and collaborate with people. I will focus on 1) tools for onboarding machine teammates and authoring machine assistance, 2) methods for detecting, and broadly managing, errors in collaboration, and 3) building blocks of knowledge needed to enable ad hoc human-machine teamwork. I will also highlight our recent work on designing assistive, collaborative machines to support older adults aging in place.

Biography: Chien-Ming Huang is the John C. Malone Assistant Professor in the Department of Computer Science at the Johns Hopkins University. His research focuses on designing interactive AI that assists and collaborates with people. He publishes in top-tier venues in HRI, HCI, and robotics, including Science Robotics, HRI, CHI, and CSCW. His research has received media coverage from MIT Technology Review, Tech Insider, and Science Nation. Huang completed his postdoctoral training at Yale University and received his Ph.D. in Computer Science from the University of Wisconsin–Madison. He is a recipient of the NSF CAREER award. https://www.cs.jhu.edu/~cmhuang/

Perception and Decision-Making Systems for Human-Robot Teaming in Safety-Critical Environments
Angelique Taylor (Cornell Tech) 10/20/2023

Abstract: In this talk, I will present recent work on developing perception and decision-making systems that enable robots to team with groups of people. My core focus is on problems that robots encounter in human-robot teaming, including perception of human groups and social navigation, particularly in safety-critical environments. First, I will discuss computer vision methods I developed that enable robots to detect and track their teammates in real-world settings. Building on this, I designed a social navigation system that allows robots to deliver materials to healthcare workers in the emergency department (ED), using acuity-aware image features to account for the severity of patients' conditions while navigating. This work will help robots avoid interrupting care delivery. Ultimately, my work will enable robots to operate in safety-critical, human-centered environments, helping to improve patient outcomes and alleviate clinician workload.
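As a toy illustration of the acuity-aware navigation idea (a sketch, not the speaker's actual system), the planner below inflates traversal costs around patients in proportion to an estimated acuity score, so shortest paths detour around the most severe cases. The grid costmap, radius, and cost weighting are all illustrative assumptions.

```python
# Hypothetical sketch: acuity-weighted costmap + shortest-path planning.
import heapq

import numpy as np


def acuity_costmap(shape, patients, base_cost=1.0, radius=3):
    """Grid costmap; `patients` is a list of ((row, col), acuity) pairs,
    acuity in [0, 1] with 1 = most severe. Names are illustrative."""
    cost = np.full(shape, base_cost)
    for (pr, pc), acuity in patients:
        for r in range(max(0, pr - radius), min(shape[0], pr + radius + 1)):
            for c in range(max(0, pc - radius), min(shape[1], pc + radius + 1)):
                dist = max(abs(r - pr), abs(c - pc))
                # Penalty decays with distance and scales with acuity.
                cost[r, c] += 10.0 * acuity * (radius - dist + 1) / (radius + 1)
    return cost


def dijkstra(cost, start, goal):
    """Minimum-cost path on the costmap (4-connected grid)."""
    dist, prev, pq = {start: 0.0}, {}, [(0.0, start)]
    while pq:
        d, node = heapq.heappop(pq)
        if node == goal:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < cost.shape[0] and 0 <= nc < cost.shape[1]:
                nd = d + cost[nr, nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = node
                    heapq.heappush(pq, (nd, (nr, nc)))
    path, node = [], goal
    while node != start:
        path.append(node)
        node = prev[node]
    path.append(start)
    return path[::-1]


# A high-acuity patient near the room center pushes the path to the edges.
cmap = acuity_costmap((20, 20), patients=[((10, 10), 0.9)])
print(dijkstra(cmap, start=(0, 0), goal=(19, 19)))
```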

Biography: Angelique Taylor is an Assistant Professor in the Information Science Department at Cornell Tech and Director of the Artificial Intelligence and Robotics Lab (AIRLab) focusing on research at the intersection of robotics, computer vision, and artificial intelligence. The AIRLab designs robots and augmented reality systems that sense and model groups of people in real-world, safety-critical environments and develops machine learning algorithms to support team collaboration. Before joining Cornell, Angelique was a Visiting Research Scientist at Meta Reality Labs Research working on AI to support multi-user collaboration in AR/VR. She received her Ph.D. in Computer Science and Engineering from the University of California San Diego in 2021. She has received the Google Award for Inclusion Research, NSF GRFP, Microsoft Dissertation Award, the Google Anita Borg Memorial Fellowship, the Arthur J. Schmitt Presidential Fellowship, a GEM Fellowship, and an award from the National Center for Women & Information Technology (NCWIT).

Updates from NVIDIA’s Seattle Robotics Lab: Task and Motion Planning, Visuomotor Transformers, and Fine-grained Robot Manipulation
Caelan Garrett, Ankit Goyal, and Iretiayo Akinola (NVIDIA’s Seattle Robotics Lab) 10/27/2023

Abstract: In this talk, we present three robotics research directions we are conducting at NVIDIA’s Seattle Robotics Lab. First, Task and Motion Planning (TAMP) systems, while automated, typically require a detailed planning model – a challenge in contact-rich scenarios. We show that TAMP can be combined with 1) teleoperation to efficiently generate data to train visuomotor transformer policies and 2) imitation learning to improve the robustness of deployed policies. Second, explicit 3D representations like voxels are more effective than camera-based methods for object manipulation but are computationally demanding. To address this, we introduce RVT, a multi-view transformer that outperforms the prior state-of-the-art method (PerAct) with a 26% higher task success rate, 36× faster training, and 2.3× faster inference. Finally, in the IndustReal project, we embrace a simulation-first approach to contact-rich assembly by developing high-fidelity simulations and leveraging reinforcement learning to effectively solve simulated assembly tasks. Ultimately, we employ sim2real techniques to successfully achieve policy transfer from the simulated environment to the real world.
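As a minimal sketch of the first direction's pipeline (assumed interfaces, not NVIDIA's code), a planner can serve as a demonstration oracle whose (observation, action) traces supervise a visuomotor policy via behavior cloning. The observation/action dimensions, the stand-in oracle, and the policy architecture below are placeholders.

```python
# Hypothetical sketch: planner-generated data trains a policy by
# behavior cloning (supervised regression onto expert actions).
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM = 32, 7  # e.g. scene features, 7-DoF arm action (assumed)


def tamp_oracle(obs):
    """Stand-in for a TAMP solver: returns an 'expert' action for obs."""
    return torch.tanh(obs[..., :ACT_DIM])  # placeholder expert behavior


policy = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(), nn.Linear(64, ACT_DIM))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for step in range(200):
    obs = torch.randn(128, OBS_DIM)      # simulated observations
    expert_act = tamp_oracle(obs)        # planner-generated supervision
    loss = nn.functional.mse_loss(policy(obs), expert_act)
    opt.zero_grad()
    loss.backward()
    opt.step()
```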

Biography: Caelan Garrett, Ankit Goyal, and Iretiayo Akinola are Research Scientists at NVIDIA Research, where they work on advancing the state of the art in robotic manipulation. Caelan obtained his PhD from MIT, and his research focuses on TAMP using classical and learning-based methods. Ankit obtained his PhD from Princeton University, and his research focuses on 3D computer vision for robotic manipulation. Iretiayo obtained his PhD from Columbia University, and his research focuses on sim-to-real robotic manipulation from multi-modal (visual and tactile) perception.

Bridging State and Action: Towards Continual Reinforcement Learning
Khimya Khetarpal (Google DeepMind) 11/03/2023

Abstract: My research goal is to develop Artificial Intelligence (AI) systems that can learn to efficiently represent world knowledge, plan with it, and adapt to changes over time through learning and interaction. I tackle this problem using reinforcement learning (RL), which allows agents to learn by trial and error from interaction with their environment. Towards the goal of building machines that develop broadly intelligent behavior, in this talk I will present a framework that enables AI agents to “represent” and “reason” about their environment through the lens of affordances. Affordances play a dual role: they reduce the number of action possibilities available in any given situation, and they facilitate learning more efficient and precise transition models from data. Next, to reason and make predictions selectively across different time scales, I will present an approach that uses affordances to build temporally abstract partial models. I will share trade-offs between single-step models and temporally extended partial models by quantifying the loss incurred when using them for planning or learning. The theoretical guarantees I present provide insights and decouple the role of affordances from temporal abstraction. In a nutshell, the lens of affordances offers multi-fold benefits: 1) faster planning across different timescales, 2) improved sample efficiency for learning transition models, and 3) robust generalization.
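As a toy illustration of the first benefit (my sketch, not the speaker's code), affordances can be read as a per-state action mask: only the afforded subset of actions is ever evaluated, shrinking the branching factor for planning and learning. The state/action counts and random affordance table below are illustrative.

```python
# Hypothetical sketch: affordances as an action mask over a tabular Q-function.
import numpy as np

N_STATES, N_ACTIONS = 5, 4
rng = np.random.default_rng(0)

# affordances[s, a] = 1 if action a is afforded (plausibly applicable) in s.
affordances = rng.integers(0, 2, size=(N_STATES, N_ACTIONS))
affordances[affordances.sum(axis=1) == 0, 0] = 1  # ensure one afforded action

Q = rng.normal(size=(N_STATES, N_ACTIONS))  # stand-in value estimates


def greedy_afforded_action(state):
    """Argmax of Q restricted to afforded actions: unafforded actions are
    excluded outright, so search branches only over the afforded subset."""
    masked_q = np.where(affordances[state] == 1, Q[state], -np.inf)
    return int(np.argmax(masked_q))


for s in range(N_STATES):
    print(s, affordances[s], greedy_afforded_action(s))
```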

Biography: Khimya Khetarpal is a Research Scientist at Google DeepMind. She earned her Ph.D. in Computer Science from the Reasoning and Learning Lab at McGill University and Mila, advised by Doina Precup. She is broadly interested in artificial intelligence and reinforcement learning. Khimya’s work has appeared in leading AI journals and conferences including NeurIPS, ICML, AAAI, AISTATS, ICLR, The Knowledge Engineering Review, ACM, JAIR, and TMLR. Her work has also been featured in MIT Technology Review. She was recognized as a TMLR expert reviewer in 2023, named one of the Rising Stars in EECS 2020, a finalist in the Three Minute Thesis (3MT) competition at AAAI 2019, selected for the Doctoral Consortium at AAAI 2019, and awarded a Best Paper Award (3rd Prize) at an ICML 2018 workshop on lifelong learning. Throughout her career, she has sought to actively mentor through initiatives such as co-founding the Mila peer advising initiative, teaching and assisting at the AI4Good Lab, volunteering with Skype a Scientist, and mentoring at FIRST Robotics.

No Robotics Colloquium (Holiday) 11/10/2023
RoboCat: A self-improving robotic agent
Coline Devin (Google DeepMind) 11/17/2023

Abstract: The ability to leverage heterogeneous robotic experience from different robots and tasks to quickly master novel skills and embodiments has the potential to transform robot learning. Inspired by recent advances in foundation models for vision and language, we propose a multi-embodiment, multi-task generalist agent for robotic manipulation. This agent, named RoboCat, is a visual goal-conditioned decision transformer capable of consuming action-labelled visual experience. This data spans a large repertoire of motor control skills from simulated and real robotic arms with varying sets of observations and actions. With RoboCat, we demonstrate the ability to generalise to new tasks and robots, both zero-shot and through adaptation using only 100–1000 examples for the target task. We also show how a trained model itself can be used to generate data for subsequent training iterations, thus providing a basic building block for an autonomous improvement loop. We investigate the agent’s capabilities with large-scale evaluations both in simulation and on three different real robot embodiments. We find that as we grow and diversify its training data, RoboCat not only shows signs of cross-task transfer but also becomes more efficient at adapting to new tasks.
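The self-improvement loop from the abstract can be sketched schematically (assumptions throughout; `Agent`, `rollout`, and the success check below are placeholders, not RoboCat's APIs): fine-tune on a handful of demonstrations, roll out the fine-tuned agent, and fold successful episodes back into the training set.

```python
# Hypothetical sketch of an autonomous improvement loop.
import random


class Agent:
    """Placeholder agent; `fit` and `act` stand in for decision-transformer
    training and goal-conditioned inference."""

    def fit(self, episodes):
        self.n_seen = len(episodes)

    def act(self, obs, goal):
        return random.random()  # dummy action


def rollout(agent, goal, horizon=10):
    """Run the agent for one episode; success is a stand-in for a real
    goal-image match or task-completion check."""
    episode = [(t, agent.act(t, goal)) for t in range(horizon)]
    success = random.random() < 0.5  # placeholder success signal
    return episode, success


dataset = [[("demo", i)] for i in range(100)]  # ~100 target-task demos
agent = Agent()
for cycle in range(3):
    agent.fit(dataset)
    # Self-generated data: keep only successful episodes, mirroring the
    # abstract's "data for subsequent training iterations".
    for _ in range(50):
        episode, success = rollout(agent, goal="goal_image")
        if success:
            dataset.append(episode)
    print(f"cycle {cycle}: dataset size {len(dataset)}")
```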

Biography: Coline Devin is a senior research scientist at Google DeepMind. She received her PhD in Computer Science from UC Berkeley, advised by Sergey Levine, Trevor Darrell, and Pieter Abbeel. She was an NSF Graduate Research Fellow and has published work at NeurIPS, ICLR, CoRL, ICRA, and IROS.

No Robotics Colloquium (Holiday) 11/24/2023
Robotics & The New Cyberlaw
Ryan Calo (UW Law) 12/01/2023

Abstract: The ascendance of the Internet wrought great social, cultural, and economic changes. It also launched the academic movement known as “cyberlaw.” The themes of this movement reflect the essential qualities of the Internet, i.e., the set of characteristics that distinguish the Internet from predecessor and constituent technologies. Now a new set of technologies is ascending, one with arguably different essential qualities. This talk examines how the mainstreaming of robotics—for instance, drones and driverless cars—will affect legal and policy discourse, and explores whether cyberlaw is still the right home for the resulting doctrinal and academic conversation.

Biography: Ryan Calo is the Lane Powell and D. Wayne Gittinger Professor at the University of Washington School of Law. He is a founding co-director (with Batya Friedman and Tadayoshi Kohno) of the interdisciplinary UW Tech Policy Lab and a co-founder (with Chris Coward, Emma Spiro, Kate Starbird, and Jevin West) of the UW Center for an Informed Public. Professor Calo holds a joint appointment at the Information School and an adjunct appointment at the Paul G. Allen School of Computer Science and Engineering. Professor Calo's research on law and emerging technology appears in leading law reviews (California Law Review, Columbia Law Review, Duke Law Journal, UCLA Law Review, and University of Chicago Law Review) and technical publications (MIT Press, Nature, Artificial Intelligence) and is frequently referenced by the national media. His work has been translated into at least four languages. Professor Calo has testified three times before the United States Senate and organized events on behalf of the National Science Foundation, the National Academy of Sciences, and the Obama White House. He has been a speaker at President Obama's Frontiers Conference, the Aspen Ideas Festival, and NPR's Weekend in Washington. Professor Calo is a board member of the R Street Institute and an affiliate scholar at the Stanford Law School Center for Internet and Society (CIS), where he was a research fellow, and the Yale Law School Information Society Project (ISP). He serves on numerous advisory boards and steering committees, including University of California's People and Robots Initiative, the Electronic Frontier Foundation (EFF), the Center for Democracy and Technology (CDT), the Electronic Privacy Information Center (EPIC), Without My Consent, the Foundation for Responsible Robotics, and the Future of Privacy Forum. In 2011, Professor Calo co-founded the premier North American annual robotics law and policy conference, We Robot, with Michael Froomkin and Ian Kerr. Professor Calo worked as an associate in the Washington, D.C. office of Covington & Burling LLP and clerked for the Honorable R. Guy Cole, Chief Judge of the U.S. Court of Appeals for the Sixth Circuit. Prior to law school at the University of Michigan, Professor Calo investigated allegations of police misconduct in New York City. He holds a B.A. in Philosophy from Dartmouth College. Professor Calo won the Phillip A. Trautman 1L Professor of the Year Award in 2014 and 2017 and was awarded the Washington Law Review Faculty Award in 2019.

Ani Kembhavi (AI2) 12/08/2023

Biography: Ani Kembhavi leads the Perceptual Reasoning and Interaction Research (PRIOR) group at AI2. He is also an Affiliate Associate Professor in the Department of Computer Science & Engineering at the University of Washington. He is interested in research problems at the intersection of vision, language, and embodiment. He graduated from the University of Maryland with a PhD in 2010, under the supervision of Prof. Larry Davis. Prior to joining AI2, he worked at Microsoft's Bing, building large-scale machine learning systems on the Image and Video Relevance team.