Organizers: Tapomayukh Bhattacharjee, Maya Cakmak, Dieter Fox, Siddhartha S. Srinivasa
Abstract: The expanding application of automation and the resulting increase in the complexity of robotic systems present unique challenges to the designers and operators of these systems. Designers face the task of predicting conditions for suboptimal operation and diagnosing them when they arise. Operators, on the other hand, need to understand the reasoning behind the actions of increasingly complex robotic systems. As autonomous robots become more prevalent, the explainability of their actions will take on new significance. This talk explores principles of explainable robotics applied to systems comprising both classical and ML-based components, and proposes strategies to improve explainability in these systems.
Biography: Nima Keivan is a principal engineer at Amazon, where he helps guide the technology and science development of Amazon's next generation of industrial autonomous mobility. He joined Amazon through its acquisition of Canvas Technology, where he was CTO and co-founder. Canvas developed vision-based autonomous mobility solutions capable of working safely with and around people in industrial applications. He holds a PhD in Computer Science with a focus on visual-inertial dense and sparse SLAM, as well as planning and control for agile autonomous ground vehicles.
Abstract: Reinforcement Learning (RL) is recognized as a promising approach to autonomously learn complex behaviors directly from sensor observations and suitable reward functions. The paradigm of RL has enjoyed tremendous success in domains such as Atari games, Go, and StarCraft. However, the need for large amounts of on-policy data has limited its practicality when applied to real-world robotics. This talk presents a combination of approaches developed at X in collaboration with Google and DeepMind, which aim to address this issue: 1) collaborative robot learning, 2) off-policy RL, and 3) the use of simulation and domain adaptation to reduce data requirements by a factor of 100. We show results on the task of grasping objects from bins using a monocular camera, as well as a waste-sorting application at the Everyday Robot Project where the use of RL is crucial for continual learning on real data.
Biography: Mrinal Kalakrishnan is a staff roboticist at X (formerly Google X), where he leads the robot learning effort on the Everyday Robot Project. His work focuses on developing robot learning algorithms for navigation and manipulation that can handle the complexity of real-world applications in human environments. He received his PhD in Computer Science from the University of Southern California, where he focused on robot learning and motion planning for manipulation and legged locomotion.
Abstract: Many real-world planning applications involve complex relationships defined on high-dimensional, continuous variables. For example, robotic manipulation requires planning with kinematic, collision, visibility, stability, stiffness, and motion constraints involving object poses, robot configurations, and robot trajectories. These constraints typically require specialized procedures to sample satisfying values. We introduce PDDLStream, an extension of the Planning Domain Definition Language (PDDL) that supports a generic, declarative specification for these sampling procedures while treating their implementations as black boxes. Each procedure can either be supplied by an engineer or learned from data. We provide a domain-independent algorithm that lazily queries these samplers in order to focus on sampling spaces that are relevant to the task. Our approach supports deterministic, cost-sensitive planning in belief space, allowing us to address partially observable domains. Finally, we demonstrate an online replanning system, built on top of our planners, that is capable of completing multi-step manipulation tasks in a stochastic kitchen environment.
Biography: Caelan Garrett is a fifth-year PhD student at MIT in the Learning and Intelligent Systems group within CSAIL. He is advised by Professors Tomás Lozano-Pérez and Leslie Pack Kaelbling. His research is on integrating robot motion planning, discrete AI planning, and machine learning to flexibly and efficiently plan for autonomous mobile manipulators operating in human environments. He is a recipient of the NSF Graduate Research Fellowship. He has interned at Amazon Robotics in the autonomous fulfillment industry, at Optimus Ride in the autonomous vehicle industry, and at NVIDIA’s Seattle Robotics Lab.
Abstract: In the domain of image and video analysis, much of the deep learning revolution has been focused on narrow, high-level classification tasks that are defined through carefully curated, retrospective data sets. However, most real-world applications, particularly those involving complex, multi-step manipulation activities, occur "in the wild." These systems demand a richer, fine-grained task representation that is informed by the application context and which supports quantitative analysis and compositional synthesis. As a result, the challenges inherent in both high-accuracy, fine-grained analysis and the performance of perception-based activities are manifold, spanning representation, recognition, and task and motion planning. In this talk, I’ll summarize our work addressing these challenges. I’ll first describe SAFER, our approach to interpretable, attribute-based activity detection. SAFER operates in both pre-trained and zero-shot settings and has been applied to a variety of applications ranging from surveillance to surgery. Along the way, I’ll describe our related work on future prediction as a means to perform self-supervised pretraining of models for activity recognition. I’ll then briefly describe complementary work on machine learning approaches for systems supporting perception-based planning and execution of manipulation tasks. I’ll close with some recent work on end-to-end training of a robot manipulation system that leverages architecture search and fine-grained task rewards to achieve state-of-the-art performance on complex, multi-step manipulation tasks.
Biography: Greg Hager is the Mandell Bellmore Professor of Computer Science at Johns Hopkins University and Founding Director of the Malone Center for Engineering in Healthcare. Professor Hager’s research interests include computer vision, vision-based and collaborative robotics, time-series analysis of image data, and applications of image analysis and robotics in medicine and in manufacturing. He is a member of the CISE Advisory Committee and the governing board of the International Foundation of Robotics Research, and a former member of the Board of Directors of the Computing Research Association. He previously served as Chair of the Computing Community Consortium. In 2014, he was awarded a Hans Fischer Fellowship at the Institute for Advanced Study of the Technical University of Munich, and in 2017 he was named a TUM Ambassador. Professor Hager has served on the editorial boards of IEEE TRO, IEEE PAMI, IJCV, and ACM Transactions on Computing for Healthcare. He is a Fellow of the ACM and IEEE for his contributions to vision-based robotics, and a Fellow of AAAS, the MICCAI Society, and AIMBE for his contributions to imaging and his work on the analysis of surgical technical skill. Professor Hager is a co-founder of Clear Guide Medical and Ready Robotics.
Abstract: My research is driven by the puzzle of why humans can effortlessly manipulate any kind of object while it is so hard to reproduce this skill on a robot. Humans can easily cope with uncertainty in perceiving the environment and in the effect of manipulation actions. One hypothesis is that humans are exceptionally accurate in perceiving and predicting how their environment will evolve. Therefore, improving the accuracy of perception and prediction is one way forward. In this talk, I would like to advocate for a different view on this problem: What if we never reach perfect accuracy? If we accept that premise, then an important focus toward more robust robotic manipulation is to develop methods that can cope with a base level of uncertainty and unexpected events. I will present three approaches that embrace uncertainty in robotic manipulation. First, I present an approach where one robot scaffolds the learning of another robot by optimally placing physical fixtures in the environment. When optimally placed, these fixtures funnel uncertainty and thereby dramatically increase the learning speed of the manipulation task. Second, I present an approach that goes beyond a single manipulation task by performing task and motion planning. We propose to combine a logic planner with a trajectory optimizer, where the output is a sequence of Cartesian frames that are defined relative to an object. This object-centric approach has the advantage that the plan remains valid even if the environment changes in an unforeseen way. Third, I present an approach for deformable object manipulation, a challenging task due to its high-dimensional state space and complex dynamics. Despite large degrees of uncertainty, the system is robust thanks to a continuously re-planning model-predictive control approach.
Biography: Jeannette Bohg is an Assistant Professor of Computer Science at Stanford University. She was a group leader at the Autonomous Motion Department (AMD) of the MPI for Intelligent Systems until September 2017. Before joining AMD in January 2012, Jeannette Bohg was a PhD student at the Division of Robotics, Perception and Learning (RPL) at KTH in Stockholm. In her thesis, she proposed novel methods for multi-modal scene understanding for robotic grasping. She also studied at Chalmers in Gothenburg and at the Technical University of Dresden, where she received her Master in Art and Technology and her Diploma in Computer Science, respectively. Her research focuses on perception and learning for autonomous robotic manipulation and grasping. She is specifically interested in developing methods that are goal-directed, real-time, and multi-modal, such that they can provide meaningful feedback for execution and learning. Jeannette Bohg has received several awards, most notably the 2019 IEEE International Conference on Robotics and Automation (ICRA) Best Paper Award, the 2019 IEEE Robotics and Automation Society Early Career Award, and the 2017 IEEE Robotics and Automation Letters (RA-L) Best Paper Award.
Abstract: Developing robots to robustly and autonomously operate in unknown, unstructured environments is one of the most challenging problems in the industry. The changing dynamics of a battlefield make that problem even worse. This presentation will highlight several sub-problems that the Army Research Laboratory has been working to solve to reduce soldier risk. First, how can we increase the variety of unknown objects a robot can grasp? Self-sealing suction is one technology that may help by enabling local pulling forces to be used in a scalable manner. A functional description, extensive evaluation, and applications of this technology will be discussed. Second, what does the robot do if it inadvertently flips over? Operations in unstructured environments can be unpredictable and require robust error handling, such as the ability to self-right. A framework for analyzing this problem and its solutions will be presented, enabling a deeper understanding of the link between a robot’s morphology and its ability to self-right. Finally, how can we achieve human-like functionality in unstructured environments? Progress to date using the RoMan platform shows a path forward by opening a hinged container to retrieve a soft bag, disassembling a stack of debris, and removing a tree from a roadway.
Biography: Dr. Chad C. Kessens is a robotic manipulation researcher in the Autonomous Systems Division of the US Army Research Laboratory (ARL) in Maryland. While an undergrad at Caltech, he began his career in robotics as a summer research fellow at the Jet Propulsion Laboratory, where he constructed the first prototype for the AXEL program, aimed at Mars exploration. He went on to earn his M.S. in Mechanical Engineering from Washington University in St. Louis, studying traumatic brain injury and the stress/strain fields resulting from dynamic loading of brain tissue. After working in full threat spectrum defense for two years with Battelle, he returned to robotics through ARL, developing solutions for rapid autonomous door opening. He went on to invent a scalable grasping technology using local pulling contact forces as part of his Ph.D. work at the University of Maryland, conducted jointly with his ARL work, and completed the degree in 2018. He subsequently taught a robot modeling course in UMD’s professional master’s program for two years. Concurrently, he worked to understand the relationship between a robot's morphology and its ability to self-right under various conditions. Most recently, he led ARL’s Robotics Collaborative Technology Alliance mobile manipulation task, of which the University of Washington was a critical part.
Biography: Tom Williams is an Assistant Professor of Computer Science at the Colorado School of Mines, where he directs the Mines Interactive Robotics Research Lab. Prior to joining Mines, Tom earned a joint PhD in Computer Science and Cognitive Science from Tufts University in 2017. Tom’s research focuses on enabling and understanding natural language based human-robot interaction that is sensitive to environmental, cognitive, social, and moral context. His work is funded by grants from NSF, ARL, and USAFA, as well as by Early Career awards from both NASA and the US Air Force.
Abstract: In this talk, I will discuss the importance of morally sensitive language generation in human-robot interaction, and the constraints this places both on the types of algorithms suitable for robotic NLG and on the ways in which those algorithms should be evaluated. My argument will leverage current algorithmic work in human-robot interaction; experimental findings from my laboratory; surprising theories from moral philosophy; and a fictional character from 1960s Argentine literature.