Robotics | UW CSE Robotics

Winter 2025 Colloquium

Organizers: Jerry Savage, Abhishek Gupta, Maya Cakmak, Josh Smith

Learning-from-observation2.0
Katsu Ikeuchi (Principal Researcher/Research Manager, Microsoft) 01/31/2025

Abstract: We are developing a Learning-from-Observation (LfO) system that acquires robotic behaviors through the observation of human demonstrations. Unlike the bottom-up approach known as "Learning-from-Demonstration" or "Imitation Learning," which replicates human movements as they are, we are employing a top-down approach (top-down learning-from-observation). This method entails observing only the critical components of human actions through a task model representation (akin to Minsky's frame), generating an abstract representation based on these observations, which is subsequently mapped onto the robot's behavior. The advantages of this top-down approach include the ability to generalize and correct observational errors by utilizing an intermediate task model representation, thereby enhancing the affinity with large language models. Furthermore, by tailoring the mapping to each individual robot, the system can be applied to different robotic platforms without necessitating significant modifications to the recognition system. The initial step of the system involves the utilization of a large language model (LLM) to comprehend the "what-to-do" from human demonstrations and subsequently retrieve the corresponding task model. This task model directs the CNN-based observation module to focus on specific aspects of human behavior and fills in the requisite parameters for "how-to-do," thereby completing the intermediate representation. Based on this finalized task model, the system activates the appropriate agents from a pre-trained group of agents—trained through reinforcement learning on the "how-to-do" aspect—to execute the robot's actions. This presentation will provide a comprehensive overview of the system architecture, the design methodologies for the pre-trained skill sets, and other pertinent details. Furthermore, it will discuss a comparison between this hybrid approach, which integrates traditional robotic techniques with LLMs, and end-to-end (E2E) methodologies, including foundation models.

Biography: Dr. Ikeuchi joined Microsoft in 2015, following distinguished tenures at MIT's Artificial Intelligence Laboratory, Japan's National Institute of Advanced Industrial Science and Technology (AIST), Carnegie Mellon University's Robotics Institute (CMU-RI), and the University of Tokyo. His research interests span computer vision, robotics, and Intelligent Transportation Systems (ITS). He has served as the Editor-in-Chief of the International Journal of Computer Vision (IJCV) and the International Journal of Intelligent Transportation Systems (IJITS), as well as the Encyclopedia of Computer Vision. Dr. Ikeuchi has also chaired numerous international conferences, including IROS95, CVPR96, ICCV03, ITSW07, ICRA09, ICPR12, and ICCV17. He has been the recipient of several prestigious awards, such as the IEEE PAMI Distinguished Researcher Award, the Okawa Award, the Funai Award, the IEICE Outstanding Achievements and Contributions Award, as well as the Medal of Honor with Purple Ribbon from the Emperor of Japan. Dr. Ikeuchi is a Fellow of IEEE, IAPR, IEICE, IPSJ, and RSJ. He earned his Ph.D. in Information Engineering from the University of Tokyo and his Bachelor's degree in Mechanical Engineering from Kyoto University.

Human-in-the-Loop Machine Learning for Robot Navigation and Manipulation
Peter Stone (Professor of Computer Science, University of Texas at Austin) 02/07/2025

Abstract: While there have been huge advances in Machine Learning in recent years, many of the successes have relied on immense amounts of training data. Especially for sequential-decision-making tasks (the realm of reinforcement learning), obtaining such data from online experience can take a very long time. On the other hand, learning can often be dramatically accelerated by leveraging human input, for example as demonstrations of successful task executions, as interventions to correct mistakes, or simply as evaluative feedback separating "correct" actions from incorrect actions. This talk focuses on such Human-in-the-Loop Machine Learning for robotics tasks, covering both navigation, especially in tightly constrained spaces, and manipulation in open-world settings.

Biography: Dr. Peter Stone's main research interest in AI is understanding how we can best create complete intelligent agents. He considers adaptation, interaction, and embodiment to be essential capabilities of such agents. Thus, his research focuses mainly on machine learning, multiagent systems, and robotics. For him, the most exciting research topics are those inspired by challenging real-world problems. He believes that complete, successful research includes both precise, novel algorithms and fully implemented and rigorously evaluated applications. His application domains have included robot soccer, autonomous bidding agents, autonomous vehicles, autonomic computing, and social agents.

Implementing Robot Task Planning with Learned Manipulation Skills
Sebastian Castro (Roboticist/Applied Scientist) 02/21/2025

Abstract: Robotics as a field has a constantly growing repository of fundamental techniques for perception, motion planning, navigation, and control. Lately, this has been accelerated by robots becoming more ubiquitous in industry, as well as a surge of research in machine learning and optimization based approaches. As we become equipped with the ability to program robots with a variety of skills, it naturally follows to think about composition of these skills to achieve high-level goal specifications. This talk will introduce the landscape of tools that enable planning at the task level, as well as common behavior abstractions to ground task plans to robust, executable skills in the real world. Then, we describe some promising research directions in composition of learned manipulation skills, with a goal of operationalizing robot task planning for real-world tasks while reducing dependency on domain-specific, hand-engineered solutions.

Biography: Sebastian Castro is a roboticist and applied scientist at the Robotics and AI Institute, working on task planning and composition of learned manipulation skills. He holds Bachelor’s and Master’s degrees from Cornell University in mechanical engineering, with a concentration on dynamics, systems, and control, applied to high-level planning and control of modular robots. His prior professional experience includes technical content development and marketing for robotics competitions with MathWorks, and robotics software engineering at MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), Boston Dynamics, and PickNik Robotics. Sebastian also devotes personal time to robotics education through blog posts, open-source software, and education-focused talks and workshops.

Modeling Humans for Humanoid Robots
Xiaolong Wang (Assistant Professor of Computer Science and Engineering, UC San Diego) 02/28/2025

Abstract: Xiaolong Wang is an Assistant Professor in the ECE department at the University of California, San Diego, and a Visiting Professor at NVIDIA Research. He received his Ph.D. in Robotics at Carnegie Mellon University. His postdoctoral training was at the University of California, Berkeley. His research focuses on the intersection between computer vision and robotics. His specific interest lies in representation learning with videos and physical robotic interaction data. These comprehensive representations are utilized to facilitate the learning of human-like robot skills, with the goal of generalizing the robot to interact effectively with a wide range of objects and environments in the real physical world. He is the recipient of the Sloan Research Fellowship, J. K. Aggarwal Prize, NSF CAREER Award, Intel Rising Star Faculty Award, Best Paper Awards at IROS and ICRA, and Research Awards from Sony, Amazon, Adobe, and CISCO.

Biography: Having a humanoid robot operating like a human has been a long-standing goal in robotics. The humanoid robot provides a general-purpose platform to conduct diverse tasks we do in our daily lives. In this talk, I will present a 2-level learning framework designed to equip humanoid robots with robust mobility and manipulation skills, enabling them to generalize across diverse tasks, objects, and environments. The first level focuses on training a Vision-Language-Action (VLA) model with human video data. This VLA can predict “mid-level” actions on precise robot movements and trajectories. The second level involves developing low-level robot manipulation skills through human hand imitation, and low-level humanoid whole-body control skills via human body imitation. By combining human VLA with low-level robot skills, this framework offers a scalable pathway toward realizing general-purpose humanoid robots.

Jeannette Bohg (Assistant Professor of Computer Science, Stanford University) 03/07/2025

Robotics Research at the University of Tsukuba
Fumihide Tanaka (Professor, Institute of Systems and Information Engineering University of Tsukuba) 03/14/2025

Abstract: I would like to take this opportunity to introduce robotics research at the University of Tsukuba. Physically/socially assistive robotics, object manipulation, mobile robots, wearable robots, social robotics and human-robot interaction.

Biography: Fuhimide Tanaka is a Professor in the Institute of Systems and Information Engineering at The University of Tsukuba. His background includes roles and education at Tokyo Institute of Technology (Ph.D.), Sony Corporation, Sony Intelligent Dynamics Laboratories, University of California, San Diego, University of Tokyo, and now University of Tsukuba. He has been involved in the development of Sony's robots and SoftBank's Pepper robot. His current research focuses on AI robots to assist the aging society.