Autumn 2015 Colloquium

Organizers: Tanner Schmidt, Dieter Fox

Physically Situated Dialog: Opportunities and Challenges
Dan Bohus (Microsoft Research) 10/09/2015

Abstract: Most research to date on spoken language interaction has focused on supporting dialog with single users in limited domains and contexts. Efforts in this space have led to significant progress, including wide-scale deployments of telephony-based systems and voice-enabled mobile assistants. At the same time, numerous and important challenges in the realm of physically situated, open-world interaction have remained largely unaddressed, e.g., language interaction with robots in a public space, in-car systems, ambient assistants, etc. In this talk, I will give an overview of the Situated Interaction project at Microsoft Research, which aims to address some of these challenges. Specifically, I will outline a core set of communicative competencies required for supporting dialog in physically situated settings, such as models of multiparty engagement, turn-taking, and interaction planning, and I will present samples of our work as part of a broader research agenda in this space.

Biography: Dan Bohus is a Senior Researcher in the Adaptive Systems and Interaction Group at Microsoft Research. His research agenda is focused on physically situated, open-world spoken language interaction. Before joining Microsoft Research, Dan received his Ph.D. degree (2007) in Computer Science from Carnegie Mellon University.

Aerial autonomy at insect scale: What flying insects can tell us about robotics and vice versa
Sawyer Fuller (UW) 10/16/2015

Abstract: Insect-sized aerial robots will be deployed where their small size, low cost, and maneuverability give them an advantage over larger robots. For example, they could deploy in swarms to follow airborne plumes to locate gas leaks in dense piping infrastructure. However, miniaturization poses challenges because the physics of scaling dictates that many technologies used in larger aircraft cannot operate effectively at the size of insects. These include propellers, the Global Positioning System, and general-purpose microprocessors. Insects have overcome these challenges by evolving a scale-appropriate flight apparatus whose robustness and agility surpass anything man-made. For example, using only senses carried onboard, they can land on flowers buffeted by wind or deftly avoid a flyswatter. But how they do this is not fully understood. My research aims to better understand insect capabilities through experimental study and use this to create autonomous robot counterparts with competitive performance. I will describe experiments I performed on flies that revealed that they sense and compensate for wind to improve flight agility. And I will describe flight control demonstrations on a fly-sized flapping-wing robot stabilized by insect-inspired sensors. The results indicate that, under the severe power and weight constraints and unfamiliar physics at this scale, success will require designs with intimately coupled sensing and mechanics, new low-power control architectures, and careful observation of the techniques used by biology.

From Polynomials to Humanoid Robots
Russ Tedrake (MIT) 10/23/2015

Abstract: The last few years have seen absolutely incredible advances in the field of robotics, with massive new investments from major companies including Google, Apple, and Uber. At the heart of these advances are algorithms, often using mathematical optimization, which allow our machines to better interpret massive streams of incoming data, to decide how and where to move, and even to balance and not fall down while they are executing those plans. In this talk, I'll describe some of those advances in the context of controlling a 400 lb humanoid robot in a disaster response scenario and an airplane that can dart through a forest at 30 mph. And I'd like to send a clear message -- there is still a lot of work to be done! Even small improvements in our mathematical foundations, such as the algorithms which check if a polynomial equation is uniformly greater than zero, can make our robots more capable of moving through the world.

Biography: Russ Tedrake is the X Consortium Associate Professor of Electrical Engineering and Computer Science, Aeronautics and Astronautics, and Mechanical Engineering at MIT, the Director of the Center for Robotics at the Computer Science and Artificial Intelligence Lab, and the leader of Team MIT's entry in the DARPA Robotics Challenge. He is a recipient of the NSF CAREER Award, the MIT Jerome Saltzer Award for undergraduate teaching, the DARPA Young Faculty Award in Mathematics, the 2012 Ruth and Joel Spira Teaching Award, and was named a Microsoft Research New Faculty Fellow. Russ received his B.S.E. in Computer Engineering from the University of Michigan, Ann Arbor, in 1999, and his Ph.D. in Electrical Engineering and Computer Science from MIT in 2004, working with Sebastian Seung. After graduation, he joined the MIT Brain and Cognitive Sciences Department as a Postdoctoral Associate. During his education, he also spent time at Microsoft, Microsoft Research, and the Santa Fe Institute.

Factor Graphs for Flexible Inference in Robotics and Vision
Frank Dellaert (Skydio) 10/30/2015

Abstract: Simultaneous Localization and Mapping (SLAM) and Structure from Motion (SFM) are important and closely related problems in robotics and vision. I will show how both SLAM and SFM instances can be posed in terms of a graphical model, a factor graph, and that inference in these graphs can be understood as variable elimination. The overarching theme of the talk will be to emphasize the advantages and intuition that come with seeing these problems in terms of graphical models. For example, while the graphical model perspective is completely general, linearizing the non-linear factors and assuming Gaussian noise yields the familiar direct linear solvers such as Cholesky and QR factorization. Based on these insights, we have developed both batch and incremental algorithms defined on graphs in the SLAM/SFM domain. I will also discuss my recent work on using polynomial bases for trajectory optimization, inspired by pseudospectral optimal control, which is made easy by the new Expressions language in GTSAM 4, currently under development.
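As a rough illustration of the abstract's point that linear factors with Gaussian noise reduce to a direct solve, the sketch below works through a toy one-dimensional pose chain. It is not taken from the talk or from GTSAM: the factor list, noise values, and function names are all hypothetical, and the Cholesky routine is a plain textbook version written out so the example has no dependencies.

```python
import math


def cholesky(h):
    """Return lower-triangular L with L * L^T == h (h symmetric positive definite)."""
    n = len(h)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = h[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def solve_linear_factors(factors, n_vars):
    """Least-squares estimate for linear Gaussian factors.

    Each factor is (row, measurement, sigma) and encodes row . x = measurement
    with noise standard deviation sigma. The factors are accumulated into the
    normal equations H x = g, which are then solved by Cholesky factorization.
    """
    h = [[0.0] * n_vars for _ in range(n_vars)]
    g = [0.0] * n_vars
    for row, z, sigma in factors:
        w = 1.0 / sigma ** 2  # weight each factor by its inverse variance
        for i in range(n_vars):
            g[i] += w * row[i] * z
            for j in range(n_vars):
                h[i][j] += w * row[i] * row[j]

    low = cholesky(h)
    # Forward substitution: solve L y = g.
    y = [0.0] * n_vars
    for i in range(n_vars):
        y[i] = (g[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    # Back substitution: solve L^T x = y.
    x = [0.0] * n_vars
    for i in reversed(range(n_vars)):
        tail = sum(low[k][i] * x[k] for k in range(i + 1, n_vars))
        x[i] = (y[i] - tail) / low[i][i]
    return x


# Hypothetical toy problem: three scalar poses x0, x1, x2.
FACTORS = [
    ([1.0, 0.0, 0.0], 0.0, 1.0),   # prior: x0 = 0
    ([-1.0, 1.0, 0.0], 1.0, 1.0),  # odometry: x1 - x0 = 1
    ([0.0, -1.0, 1.0], 1.0, 1.0),  # odometry: x2 - x1 = 1
    ([-1.0, 0.0, 1.0], 2.2, 1.0),  # loop-closure-style factor: x2 - x0 = 2.2
]

ESTIMATE = solve_linear_factors(FACTORS, 3)
print(ESTIMATE)
```

The slightly inconsistent last measurement is spread across the chain rather than absorbed by a single pose. In a nonlinear problem the same solve would be repeated on re-linearized factors at each iteration.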

Biography: Frank Dellaert is currently on leave from the Georgia Institute of Technology for a stint as Chief Scientist of Skydio, a startup founded by MIT grads to create intuitive interfaces for micro-aerial vehicles. When not on leave, he is a Professor in the School of Interactive Computing and Director of the Robotics PhD program at Georgia Tech. His research interests lie in the overlap of robotics and computer vision, and he is particularly interested in graphical model techniques to solve large-scale problems in mapping and 3D reconstruction. You can find out about his group’s research and publications at and The GTSAM toolbox which embodies many of the ideas his group has worked on in the past few years is available for download at

Modeling Human Communication Dynamics
Louis-Philippe Morency (CMU) 11/13/2015

Abstract: Human face-to-face communication is a little like a dance, in that participants continuously adjust their behaviors based on verbal and nonverbal cues from the social context. Today's computers and interactive devices are still lacking many of these human-like abilities to hold fluid and natural interactions. Leveraging recent advances in machine learning, audio-visual signal processing and computational linguistics, my research focuses on creating computational technologies able to analyze, recognize and predict subtle human communicative behaviors in social context. I formalize this new research endeavor with a Human Communication Dynamics framework, addressing four key computational challenges: behavioral dynamic, multimodal dynamic, interpersonal dynamic and societal dynamic. Central to this research effort is the introduction of new probabilistic models able to learn the temporal and fine-grained latent dependencies across behaviors, modalities and interlocutors. In this talk, I will present some of our recent achievements modeling multiple aspects of human communication dynamics, motivated by applications in healthcare (depression, PTSD, suicide, autism), education (learning analytics), business (negotiation, interpersonal skills) and social multimedia (opinion mining, social influence).

Biography: Louis-Philippe Morency is an Assistant Professor in the Language Technologies Institute at Carnegie Mellon University, where he leads the Multimodal Communication and Machine Learning Laboratory (MultiComp Lab). He received his Ph.D. and Master's degrees from the MIT Computer Science and Artificial Intelligence Laboratory. In 2008, Dr. Morency was selected as one of "AI's 10 to Watch" by IEEE Intelligent Systems. He has received 7 best paper awards in multiple ACM- and IEEE-sponsored conferences for his work on context-based gesture recognition, multimodal probabilistic fusion and computational models of human communication dynamics. Dr. Morency was lead Co-PI for the DARPA-funded multi-institution effort called SimSensei, which was recently named one of the year’s top ten most promising digital initiatives by the NetExplo Forum, in partnership with UNESCO.

Real-time dense methods for 3D perception
Tom Whelan (Oculus Research) 11/20/2015

Abstract: In the past few years, real-time dense methods have exploded onto the scene of robotics and general 3D perception. Key to the high-level algorithms that make the most of this kind of data is the underlying method for generating and reconstructing a dense 3D representation. This talk will first give an overview of the Kintinuous system for large-scale real-time dense SLAM, as well as a number of more recent results in use cases such as object detection, semantics and in-the-loop robotic control. Secondly, the recently published ElasticFusion system for real-time comprehensive dense 3D reconstruction will be presented, posing an alternative map-centric approach to the SLAM problem. Finally, some very recent results on extracting advanced surface information from the scene in real time will be shown, paving the way for the future of real-time dense methods.

Biography: Dr. Thomas Whelan is currently a Research Scientist at Oculus Research in Redmond working with the Surreal Vision team. Prior to this he spent one year as a postdoctoral research fellow at the Dyson Robotics Laboratory at Imperial College London, led by Prof. Andrew J. Davison. He was previously a Ph.D. student at the National University of Ireland Maynooth under a 3-year post-graduate scholarship from the Irish Research Council. In 2012 he spent 3 months as a visiting researcher in Prof. John Leonard’s group at CSAIL, MIT, funded by a Science Foundation Ireland Short-Term Travel Fellowship. He received his B.Sc. (Hons) in Computer Science & Software Engineering from the National University of Ireland Maynooth in 2011. His research focuses on developing methods for dense real-time perception and its applications in SLAM and robotics.

Robust Distributed Control Policies for Multi-Robot Systems
Seth Hutchinson (UIUC) 12/04/2015

Abstract: In this talk, I will describe our recent progress in developing fault-tolerant distributed control policies for multi-robot systems. We consider two problems: rendezvous and coverage. For the former, the goal is to bring all robots to a common location, while for the latter the goal is to deploy robots to achieve optimal coverage of an environment. We consider the case in which each robot is an autonomous decision maker that is anonymous (i.e., robots are indistinguishable to one another), memoryless (i.e., each robot makes decisions based upon only its current information), and dimensionless (i.e., collision checking is not considered). Each robot has a limited sensing range, and is able to directly estimate the state of only those robots within that sensing range, which induces a network topology for the multi-robot system. We assume that it is not possible for the fault-free robots to identify the faulty robots (e.g., due to the anonymous property of the robots). For each problem, we provide an efficient computational framework and analysis of algorithms, all of which converge in the face of faulty robots under a few assumptions on the network topology and sensing abilities.

Biography: Seth Hutchinson received his Ph.D. from Purdue University in 1988. In 1990 he joined the faculty at the University of Illinois at Urbana-Champaign, where he is currently a Professor in the Department of Electrical and Computer Engineering, the Coordinated Science Laboratory, and the Beckman Institute for Advanced Science and Technology. He served as Associate Department Head of ECE from 2001 to 2007. He currently serves on the editorial boards of the International Journal of Robotics Research and the Journal of Intelligent Service Robotics, and chairs the steering committee of the IEEE Robotics and Automation Letters. He was Founding Editor-in-Chief of the IEEE Robotics and Automation Society's Conference Editorial Board (2006-2008), and Editor-in-Chief of the IEEE Transactions on Robotics (2008-2013). He has published more than 200 papers on the topics of robotics and computer vision, and is coauthor of the books "Principles of Robot Motion: Theory, Algorithms, and Implementations," published by MIT Press, and "Robot Modeling and Control," published by Wiley. Hutchinson is a Fellow of the IEEE.

Toward General-Purpose Manipulation of Deformable Objects
Dmitry Berenson (WPI) 12/11/2015

Abstract: Imagine a robot that could perceive and manipulate rigid objects as skillfully as a human adult. Would a robot that had such amazing capabilities be able to perform the range of practical manipulation tasks we expect in settings such as the home? Consider that this robot would still be unable to prepare a meal, do laundry, or make a bed because these tasks involve deformable object manipulation. Unlike in rigid-body manipulation, where methods exist for general-purpose pick-and-place tasks regardless of the size and shape of the object, no such methods exist for a similarly broad and practical class of deformable object manipulation tasks. The problem is indeed challenging, as these objects are not straightforward to model and have infinite-dimensional configuration spaces, making it difficult to apply established motion planning approaches. Our approach seeks to bypass these difficulties by representing deformable objects using simplified geometric models at both the global and local planning levels. Though we cannot predict the state of the object precisely, we can nevertheless perform tasks such as cable routing, cloth folding, and surgical probe insertion in geometrically complex environments. Building on this work, our new projects in this area aim to blend exploration of the model space with goal-directed manipulation of deformable objects and to generalize the methods we have developed to motion planning for soft robot arms, where we can exploit contact to mitigate the actuation uncertainty inherent in these systems.

Biography: Dmitry Berenson received a BS in Electrical Engineering from Cornell University in 2005 and received his Ph.D. degree from the Robotics Institute at Carnegie Mellon University in 2011, where he was supported by an Intel PhD Fellowship. He completed a post-doc at UC Berkeley and started as an Assistant Professor in Robotics Engineering and Computer Science at WPI in 2012. He founded and directs the Autonomous Robotic Collaboration (ARC) Lab at WPI, which focuses on motion planning, manipulation, and human-robot collaboration.