Organizers: Jerry Savage, Abhishek Gupta, Maya Cakmak, Josh Smith
Abstract: At long last, robot hands are becoming truly dexterous. It took advances in sensor design, mechanisms, and computational motor learning all working together, but we’re finally starting to see true dexterity, in our lab as well as in others. This talk will focus on the path our lab took to get here, and on questions for the future. From a mechanism design perspective, I will present our work on optimizing an underactuated hand’s transmission mechanism jointly with the grasping policy that uses it, an approach we refer to as “Hardware as Policy”. From a sensing perspective, I will present our optics-based tactile finger, providing accurate touch information over a multi-curved three-dimensional surface with no blind spots. From a motor learning perspective, I will talk about learning tactile-based policies for dexterous in-hand manipulation and object recognition. Finally, we can discuss implications for the future: how do we consolidate these gains by making dexterity more robust, versatile, and general, and what new applications can it enable?
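To make the “Hardware as Policy” idea concrete, here is a minimal sketch of treating a hardware parameter (a toy transmission ratio) as extra entries in the policy parameter vector and optimizing both with a single gradient-free loop. The grasp model, reward, and dimensions below are illustrative stand-ins, not the actual mechanism or objective from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS = rng.normal(size=(32, 4))  # fixed batch of simulated tactile readings

def grasp_reward(params):
    """Toy objective coupling hardware and policy: params[0] is the
    transmission ratio, params[1:] are linear policy weights."""
    ratio, w = params[0], params[1:]
    torques = OBS @ w * ratio               # actuation through the transmission
    return -np.mean((torques - 1.0) ** 2)   # target unit contact torque

# One parameter vector holds hardware + policy; random search updates both.
theta = rng.normal(size=5)
for _ in range(500):
    candidate = theta + 0.1 * rng.normal(size=theta.size)
    if grasp_reward(candidate) > grasp_reward(theta):
        theta = candidate

print(f"optimized transmission ratio: {theta[0]:+.3f}")
print(f"final reward: {grasp_reward(theta):.4f}")
```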
Biography: Matei Ciocarlie is an Associate Professor in the Mechanical Engineering Department at Columbia University, with affiliated appointments in Computer Science and the Data Science Institute. His work focuses on robot motor control, mechanism and sensor design, planning and learning, all aiming to demonstrate complex motor skills such as dexterous manipulation. Matei completed his Ph.D. at Columbia University in New York; before joining the faculty at Columbia, Matei was a Research Scientist and then Group Manager at Willow Garage, Inc., and then a Senior Research Scientist at Google, Inc. In these positions, Matei contributed to the development of the open-source Robot Operating System (ROS), and led research projects in areas such as hand design, manipulation under uncertainty, and assistive robotics. In recognition of his work, Matei was awarded the Early Career Award by the IEEE Robotics and Automation Society, a Young Investigator Award by the Office of Naval Research, a CAREER Award by the National Science Foundation, and a Sloan Research Fellowship by the Alfred P. Sloan Foundation.
Abstract: Deep learning has enabled rapid advances in perception, planning, and natural language understanding for robots. However, current learning-based systems lack any formal assurances when required to generalize to novel scenarios. For example, perception systems can fail to identify or localize unfamiliar objects, and large language model (LLM)-based planners can hallucinate outputs that lead to unsafe outcomes when executed by robots. How can we rigorously quantify the uncertainty of machine learning components such that robots know when they don’t know and can act accordingly? In this talk, I will present our group’s work on developing principled theoretical and algorithmic techniques for providing formal assurances on learning-enabled robots that act based on rich sensory inputs (e.g., vision) and natural language instructions. The key technical insight is to leverage and extend powerful methods from conformal prediction and generalization theory for rigorous uncertainty quantification in a way that complements and scales with the growing capabilities of foundation models. I will present experimental validation of our methods for providing strong statistical guarantees on LLM planners that ask for help when they are uncertain, and for vision-based navigation and manipulation.
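As a generic illustration of the split conformal prediction machinery behind “asking for help” (not the group’s actual pipeline; the calibration scores, option names, and Beta-distributed confidences are assumptions made for the example), the sketch below calibrates a confidence threshold on held-out data and triggers a help request whenever the resulting prediction set contains more than one action.

```python
import numpy as np

rng = np.random.default_rng(0)

# Calibration data: model confidence assigned to the TRUE option on
# 500 held-out tasks (stand-in values drawn from a Beta distribution).
cal_scores = rng.beta(5, 2, size=500)
eps = 0.1                                   # target miscoverage; ~90% coverage
k = int(np.floor(eps * (len(cal_scores) + 1)))
q = np.sort(cal_scores)[k - 1]              # finite-sample conformal threshold

def prediction_set(option_scores):
    """Keep every candidate action whose score clears the threshold."""
    return [o for o, s in option_scores.items() if s >= q]

# Hypothetical scores from a language-model planner for one instruction.
options = {"pick up the sponge": 0.55, "pick up the bowl": 0.40, "wait": 0.05}
S = prediction_set(options)
if len(S) != 1:
    print("uncertain among", S, "-> ask a human for help")
else:
    print("execute:", S[0])
```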
Biography: Anirudha Majumdar is an Assistant Professor at Princeton University in the Mechanical and Aerospace Engineering department and Associated Faculty in the Computer Science department. He also holds a part-time position as a Visiting Research Scientist at the Google AI Lab in Princeton. Majumdar received a Ph.D. in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology in 2016, and a B.S.E. in Mechanical Engineering and Mathematics from the University of Pennsylvania in 2011. Subsequently, he was a postdoctoral scholar at Stanford University from 2016 to 2017 at the Autonomous Systems Lab in the Aeronautics and Astronautics department. Majumdar is a recipient of the Sloan Fellowship, ONR Young Investigator Program (YIP) award, NSF CAREER award, Google Faculty Research Award (twice), Amazon Research Award (twice), Young Faculty Researcher Award from the Toyota Research Institute, Best Student Paper Award at the Conference on Robot Learning (CoRL), Paper of the Year Award from the International Journal of Robotics Research (IJRR), Best Conference Paper Award at the International Conference on Robotics and Automation (ICRA), and the Excellence in Teaching Award (Princeton SEAS).
Abstract: Imitating humans in the real world or employing reinforcement learning in simulated worlds are the most widely used approaches for training robots today. While learning from human supervision is effective, building generalizable robots requires collecting human trajectories at scale, which is prohibitively expensive. On the other hand, reinforcement learning in simulation is slow, ineffective for long-horizon tasks, and sensitive to reward shaping and auxiliary losses. In this talk, I will present our recent surprising finding: imitating shortest-path planners in simulation can produce agents that proficiently navigate, explore, and manipulate objects in the real world. Our agent SPOC uses no human demonstrations, no reinforcement learning, and no depth sensors, and makes no assumptions about the target environment. A key factor that enables this result is the scale and diversity of the training data -- made possible by our recent work on procedurally generating simulations via ProcTHOR and HoloDeck and on massively scaling up 3D assets via our openly available Objaverse resource. Finally, I will present an easy and effective method to fine-tune robots to your home via a real-to-sim approach called Phone2Proc.
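The core recipe of imitating a shortest-path planner can be sketched compactly: below, a BFS expert on a toy gridworld labels each state with an optimal action, and a small softmax classifier clones those labels. The 8x8 grid and linear model are stand-ins chosen for illustration; SPOC itself trains a far larger policy in procedurally generated houses.

```python
import numpy as np
from collections import deque

H = W = 8
GOAL = (7, 7)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # up, down, left, right

# BFS from the goal yields distance-to-go for every cell.
dist = np.full((H, W), np.inf)
dist[GOAL] = 0
frontier = deque([GOAL])
while frontier:
    r, c = frontier.popleft()
    for dr, dc in ACTIONS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < H and 0 <= nc < W and dist[nr, nc] == np.inf:
            dist[nr, nc] = dist[r, c] + 1
            frontier.append((nr, nc))

def expert_action(r, c):
    """Shortest-path expert: step to the neighbor closest to the goal."""
    nbrs = [dist[r + dr, c + dc] if 0 <= r + dr < H and 0 <= c + dc < W
            else np.inf for dr, dc in ACTIONS]
    return int(np.argmin(nbrs))

# Behavior cloning: softmax regression on normalized (row, col, bias).
cells = [(r, c) for r in range(H) for c in range(W) if (r, c) != GOAL]
X = np.c_[np.array(cells) / 7.0, np.ones(len(cells))]
y = np.array([expert_action(r, c) for r, c in cells])
Wmat = np.zeros((3, 4))
for _ in range(2000):
    p = np.exp(X @ Wmat)
    p /= p.sum(1, keepdims=True)
    p[np.arange(len(y)), y] -= 1.0           # gradient of cross-entropy
    Wmat -= 0.5 * X.T @ p / len(y)

print("cloned-policy accuracy:", np.mean((X @ Wmat).argmax(1) == y))
```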
Biography: Ani Kembhavi is the Senior Director of Computer Vision at the Allen Institute for Artificial Intelligence (AI2) in Seattle. He is also an Affiliate Associate Professor in the Computer Science & Engineering department at the University of Washington. He obtained his PhD at the University of Maryland, College Park, and spent 5 years at Microsoft. His research interests lie at the intersection of computer vision, natural language processing, and embodiment. His work has been awarded a Best Paper Award at CVPR 2023, an Outstanding Paper Award at NeurIPS 2022, an AI2 Test of Time Award in 2020, and an NVIDIA Pioneer Award in 2018.
Abstract: Inverse RL provides a strong framework for imitating humans' planning behavior, yet no approach has successfully addressed planetary-scale problems with hundreds of millions of states and demonstration trajectories. In this talk, I'll share how we scaled IRL algorithms (e.g., via dominant-eigenvector-inspired initialization) and the challenges we faced along the way. I'll also share our key observation that classic IRL methods face a trade-off between the use of cheap, deterministic planners and expensive yet robust stochastic policies. This insight is leveraged in Receding Horizon Inverse Planning (RHIP), a new generalized method that interpolates between classic IRL algorithms and provides fine-grained control over performance trade-offs. The talk culminates in a policy that achieves a 16-24% improvement in Google Maps route quality and, to the best of our knowledge, represents the largest published benchmark of IRL algorithms in a real-world setting to date.
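A toy reconstruction of the receding-horizon idea described above (the chain graph, rewards, and two-action dynamics are invented for illustration and are not Google's implementation): soft, MaxEnt-style Bellman backups model the expensive stochastic policy within the horizon H, while hard-max backups stand in for the cheap deterministic planner beyond it, so H=0 recovers the cheap planner and large H recovers the robust stochastic policy.

```python
import numpy as np

n = 6                                         # chain of road segments; goal is node 5
r = np.array([-1., -1., -5., -1., -1., 0.])   # per-node reward (negative cost)
succs = [[min(s + 1, n - 1), min(s + 2, n - 1)] for s in range(n)]

def backup(V, soft):
    new = V.copy()
    for s in range(n - 1):                    # goal node keeps value 0
        q = np.array([r[s] + V[t] for t in succs[s]])
        new[s] = np.logaddexp.reduce(q) if soft else q.max()
    return new

def rhip_values(H, iters=50):
    V = np.zeros(n)
    for _ in range(iters):        # cheap deterministic planner beyond the horizon
        V = backup(V, soft=False)
    for _ in range(H):            # expensive stochastic backups within the horizon
        V = backup(V, soft=True)
    return V

print("H=0 (pure deterministic):", np.round(rhip_values(0), 2))
print("H=3 (receding horizon)  :", np.round(rhip_values(3), 2))
```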
Biography: Matt Barnes is a senior engineer at Google Research in Seattle. Matt obtained his PhD in Robotics at Carnegie Mellon and was a postdoctoral scholar at the University of Washington with Sidd Srinivasa. His interests focus on designing large-scale systems at the intersection of learning and planning, and exploring the unique research challenges associated with bringing these systems into real-world use by billions of users.
Abstract: Fine manipulation, such as cutting fingernails, threading a needle, or performing delicate surgical tasks like removing clots from organs, is ubiquitous in daily life. Automating these tasks through robotic systems offers significant economic potential. Unlike existing robotic solutions that automate specific problems through dedicated systems and structures, my research aims to empower general-purpose hardware systems to automate fine manipulation challenges without imposing additional setup complexity or requiring extensive human intervention. I have utilized data-driven approaches like imitation learning and reinforcement learning to formulate precise, robust, and adaptive policies. In scenarios where demonstrations are available, I have devised frameworks that enhance the robustness of imitation learning agents, enabling their success in fine manipulation tasks. Conversely, for scenarios where obtaining demonstrations is impractical or costly, I have introduced a training paradigm that enhances the sample efficiency of reinforcement learning agents, allowing them to develop strategies that exhibit exceptional dexterity, potentially surpassing human capabilities with only 30 minutes of data. In summary, combining learning methods with structure and priors can not only reduce human effort in the automation process but also improve the precision and robustness of robots.
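One common ingredient behind this kind of sample efficiency (shown here as a generic sketch of demonstration-seeded replay, not necessarily the speaker's exact training paradigm) is to keep expert transitions in the replay buffer and oversample them while online experience is still scarce.

```python
import random

class MixedReplay:
    """Replay buffer that mixes scarce online transitions with a fixed
    set of expert demonstrations (oversampled via demo_frac)."""
    def __init__(self, demos, demo_frac=0.5):
        self.demos, self.online, self.demo_frac = list(demos), [], demo_frac

    def add(self, transition):
        self.online.append(transition)

    def sample(self, batch_size):
        k = int(batch_size * self.demo_frac) if self.online else batch_size
        batch = random.sample(self.demos, min(k, len(self.demos)))
        if self.online:
            batch += random.choices(self.online, k=batch_size - len(batch))
        return batch

# Usage: seed with (state, action, reward, next_state) tuples from ~30
# minutes of teleoperation, then add online transitions as RL proceeds.
buffer = MixedReplay(demos=[(f"s{i}", "a", 0.0, f"s{i+1}") for i in range(10)])
buffer.add(("s0", "a_new", 1.0, "s1"))
print(buffer.sample(4))
```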
Biography: Liyiming Ke is a final-year PhD candidate at the University of Washington, advised by Sidd Srinivasa. Her research is dedicated to pushing the boundaries of fine motor skills in robotic systems using data-driven approaches. She has developed theoretical frameworks for imitation learning and built a low-cost chopsticks robot platform capable of fine manipulation and grasping in dynamic environments. She was selected as one of the Rising Stars in EECS in 2023 and has conducted research internships at Facebook AI Research and Microsoft Research.