Organizers: Selest Nashef, Patrícia Alves-Oliveira, Josh Smith, Byron Boots, Maya Cakmak, Dieter Fox, Sam Burden, Siddhartha S. Srinivasa
Abstract: Geometric perception is the task of estimating geometric models (e.g., object pose and 3D structure) from sensor measurements and priors (e.g., point clouds and neural network detections). It is a fundamental building block for robotics applications ranging from intelligent transportation to space autonomy. The ubiquitous presence of outliers (measurements that provide little or no information about the models to be estimated) makes it theoretically intractable to perform estimation with guaranteed optimality. Despite this theoretical intractability, safety-critical robotics applications still demand trustworthiness and performance guarantees from perception algorithms. In this talk, I present certifiable outlier-robust geometric perception, a new paradigm for designing tractable algorithms that enjoy rigorous performance guarantees, i.e., they return an optimal estimate with a certificate of optimality for the majority of problem instances, but declare failure and provide a measure of suboptimality for worst-case instances. In particular, I present two general-purpose algorithms in the certifiable perception toolbox: (i) an estimator that uses graph theory to prune gross outliers and leverages graduated non-convexity to compute the optimal model estimate with a high probability of success, and (ii) a certifier that employs sparse semidefinite programming (SDP) relaxation and a novel SDP solver to endow the estimator with an optimality certificate, or to escape local minima otherwise. The estimator is fast and robust against up to 99% random outliers in practical perception applications, and the certifier can compute high-accuracy optimality certificates for large-scale problems beyond the reach of existing SDP solvers. I showcase certifiable outlier-robust perception on robotics applications such as scan matching, satellite pose estimation, and vehicle pose and shape estimation. I conclude by remarking on opportunities for integrating certifiable perception with big data, machine learning, and safe control towards trustworthy autonomy.
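The graduated non-convexity (GNC) idea named in the abstract can be illustrated on a toy robust-estimation problem. The sketch below is not the speaker's implementation; it is a minimal GNC loop with the Geman-McClure robust loss applied to 1D location estimation, where a surrogate loss is annealed from nearly convex (large mu) toward the original non-convex loss (mu = 1). All names and parameter values are illustrative assumptions.

```python
import numpy as np

def gnc_gm_mean(x, c=1.0, mu_init=1e3, iters=50):
    """Toy GNC with the Geman-McClure loss for robust 1D location estimation.

    mu starts large, making the surrogate nearly convex, and is annealed
    toward 1, recovering the original non-convex robust loss.
    """
    est = x.mean()  # non-robust initialization
    mu = mu_init
    for _ in range(iters):
        r2 = (x - est) ** 2                        # squared residuals
        w = (mu * c**2 / (r2 + mu * c**2)) ** 2    # GM weights under surrogate
        est = np.sum(w * x) / np.sum(w)            # weighted least-squares update
        mu = max(1.0, mu / 1.4)                    # anneal toward original loss
    return est
```

Each iteration solves a weighted least-squares problem; as mu shrinks, outliers receive vanishing weight, so the estimate converges to the inlier consensus even under heavy contamination.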
Biography: Heng Yang is a final-year Ph.D. candidate in the Laboratory for Information & Decision Systems and the Department of Mechanical Engineering at the Massachusetts Institute of Technology (MIT), working with Prof. Luca Carlone. He holds a B.S. degree from Tsinghua University and an S.M. degree from MIT, both in Mechanical Engineering. His research interests include large-scale convex optimization, semidefinite relaxation, robust estimation, and machine learning, applied to robotics and trustworthy autonomy. His work includes developing certifiable outlier-robust machine perception algorithms, large-scale semidefinite programming solvers, and self-supervised geometric perception frameworks. Heng Yang is a recipient of the Best Paper Award in Robot Vision at the 2020 IEEE International Conference on Robotics and Automation (ICRA), a Best Paper Award Honorable Mention from the 2020 IEEE Robotics and Automation Letters (RA-L), and a Best Paper Award Finalist at the 2021 Robotics: Science and Systems (RSS) conference. He is a Class of 2021 RSS Pioneer.
Abstract: Robots learning in the wild can use input from human teachers to improve their learning. However, people can be imperfect teachers, which can hinder learning when robots expect a teacher to be a constantly present and correct oracle. To address this issue, we create algorithms for robots learning from imperfect teachers, who may be inattentive to the robot or give inaccurate information. These algorithms are based on Interactive Reinforcement Learning (interactive RL), which enables robots to draw on both their environmental reward function and additional feedback or advice from their teachers. They allow robots to learn with or without human attention and to make use of both correct and incorrect feedback, giving more people the ability to successfully teach a robot.
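As a rough illustration of the interactive RL setting described above (not the speaker's actual algorithms), a single tabular Q-learning update can fold in a teacher signal that may be absent or incorrect. The feedback encoding, the shaping weight beta, and all names here are illustrative assumptions.

```python
def interactive_q_update(Q, s, a, r, s_next, feedback,
                         alpha=0.1, gamma=0.9, beta=0.5):
    """One tabular Q-learning step with optional teacher feedback.

    feedback: +1 (teacher approves), -1 (teacher disapproves), or None when
    the teacher is inattentive -- the agent then learns from the
    environmental reward alone.
    """
    shaped_r = r if feedback is None else r + beta * feedback  # reward shaping
    best_next = max(Q[s_next].values(), default=0.0)           # greedy bootstrap
    Q[s][a] += alpha * (shaped_r + gamma * best_next - Q[s][a])
    return Q
```

Because the update degrades gracefully to plain Q-learning when `feedback` is `None`, the agent keeps learning whether or not the human is paying attention, which is the core requirement the abstract describes.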
Biography: Taylor Kessler Faulkner is a PhD candidate in the Computer Science Department at the University of Texas at Austin, where she is advised by Prof. Andrea Thomaz. Her work is at the intersection of Human-Robot Interaction and Robot Learning. Specifically, she focuses on designing algorithms that make it easier for non-expert people to teach and interact with learning robots.
Abstract: Many robotics tasks, even seemingly simple procedural tasks like assembly and cleaning, require a continuous cycle of planning, learning, adapting, and executing diverse skills and sub-tasks. However, deep reinforcement learning algorithms developed for short-horizon tasks often fail on long-horizon tasks, suffering from the high dimensionality of inputs and high sample complexity. It is thus hard to scale and generalize learning agents to long-horizon, complex tasks. To this end, my research centers on enabling autonomous agents to perform long-horizon, complex physical tasks. More specifically, I focus on how complex robotics tasks can be addressed by modularizing long-horizon tasks into multiple sub-tasks and skills, and I address the following three main challenges: (1) how to generalize policies and reinforcement learning algorithms, (2) how to compose learned skills for a long-horizon task, and (3) how and what to learn from demonstrations. In this talk, I will describe some recent work from my lab and discuss future directions.
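To make the modularization concrete, here is a hypothetical sketch (not taken from the talk) of chaining learned skills under a high-level scheduler: each skill runs until its own termination condition fires, and the scheduler then picks the next skill or declares the task done. All function names and the control structure are illustrative assumptions.

```python
def execute_task(state, skills, scheduler, max_steps=1000):
    """Chain sub-skills to solve a long-horizon task.

    skills: dict mapping name -> (policy_fn, termination_fn)
    scheduler: high-level policy mapping state -> skill name, or None
    once the overall task goal is reached.
    """
    for _ in range(max_steps):
        name = scheduler(state)
        if name is None:                 # task-level goal reached
            return state, True
        policy, terminated = skills[name]
        while not terminated(state):     # run the low-level skill to its subgoal
            state = policy(state)
    return state, False                  # skill-selection budget exhausted
```

The point of the decomposition is that each `policy_fn` faces only a short-horizon subproblem, while the scheduler reasons over skills rather than raw actions, shrinking the effective horizon at both levels.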
Biography: Joseph Lim is an associate professor in the Kim Jaechul School of Artificial Intelligence at KAIST, where he leads the Cognitive Learning for Vision and Robotics (CLVR) lab. Previously, he was an assistant professor at the University of Southern California (USC). Before that, he completed his PhD at the Massachusetts Institute of Technology under the guidance of Professor Antonio Torralba, followed by a half-year postdoc under Professor William Freeman at MIT and a year-long postdoc under Professor Fei-Fei Li at Stanford University. He received his bachelor's degree at the University of California, Berkeley, where he worked in the Computer Vision lab under the guidance of Professor Jitendra Malik. He has also spent time at Microsoft Research, Adobe Creative Technologies Lab, Google, and Facebook AI Research.
Abstract: As robots enter our homes, they must solve complex, long-horizon tasks that require multiple skills, including navigation, pick-and-place, and interaction with articulated objects. While imitation and reinforcement learning have both achieved remarkable successes in decision-making tasks such as game playing and some simpler robotics tasks, more complex tasks have thus far been difficult to solve. In this talk, I'll discuss our research agenda for moving towards this ability. First, I will present a fast, photo-realistic simulation environment (Habitat 2.0, work with Dhruv Batra and Meta) that supports fast physics simulation (including articulated objects, through several unique optimizations), larger-scale RL experimentation, and fast scientific iteration. Second, I will present methods for the hierarchical decomposition of tasks into skills and sub-policies, both for object rearrangement tasks within Habitat 2.0 and for vision-language navigation (Robo-VLN), in order to more successfully learn longer-horizon tasks. Importantly, we compare to traditional sense-plan-act non-learning approaches and show that they are brittle and require significant tuning and privileged information to do well, and that all methods (learned and non-learned) still cannot fully solve complex tasks requiring the chaining of multiple skills. This leads to a number of open areas, and I will discuss ongoing work on self-supervised learning, skill chaining, and improved inverse reinforcement learning to enable more effective long-horizon robot task learning.
Biography: Zsolt Kira is an Assistant Professor at the Georgia Institute of Technology and Associate Director of Georgia Tech's Machine Learning Center. His work lies at the intersection of machine learning and artificial intelligence for perception and robotics. Current projects and interests relate to Embodied AI, focusing on tackling more complex, compositional tasks and on the integration of learning and planning. On the machine learning front, his group has made significant contributions to moving beyond the current limitations of supervised machine learning, tackling un-/self-/semi-supervised methods, out-of-distribution detection, model calibration, learning under imbalance, continual/lifelong learning, and adaptation. He is especially interested in the intersection of these areas, in order to fully leverage the stream of data that embodied robots typically encounter. Prof. Kira has grown a portfolio of projects funded by NSF, ONR, DARPA, industry, and the IC community, has over 45 publications in top venues, and has received several best paper/student paper awards.
Abstract: Robots today are typically limited to interacting with rigid, opaque objects with known object models. However, the objects in our daily lives are often non-rigid, can be transparent or reflective, and are diverse in shape and appearance. One reason for the limitations of current methods is that computer vision and robot planning are often treated as separate fields. I argue that, to enhance the capabilities of robots, we should design state representations that account for both the perception and planning algorithms needed for the robotics task. I will show how we can develop novel perception and planning algorithms to assist with the tasks of manipulating cloth, articulated objects, and transparent and reflective objects. By thinking about the downstream task while jointly developing perception and planning algorithms, we can significantly improve our progress on difficult robot tasks.