FRR: Symmetric Policy Learning for Robotic Manipulation

$866,735FY2023CSENSF

Northeastern University, Boston MA

Investigators

Abstract

Machine learning has recently had a major impact on robotics, enabling robots to learn to solve problems in ways that would have been difficult to program manually. Unfortunately, most of this learning currently happens in simulation, and not in the real world. This approach is necessary because today’s machine learning algorithms generally require large amounts of data to learn anything meaningful and simulation is the most direct way of bringing this data to bear. However, there are challenges that come from mismatches between simulation and real experience and, ideally, robots would be able to learn from trial-and-error experience in the physical world as humans and animals do. By simplifying and generalizing real experiences, these systems can get more out of the limited amount of real-world data. This project explores an approach to achieve needed simplifications by incorporating problem symmetries into machine learning. Preliminary work suggests that this is a promising approach and we expect to be able to significantly improve the efficacy of robotic learning in general. This work has significant potential to impact applications in a variety of fields including defense applications, space applications, warehousing and logistics applications, healthcare applications, and applications in the home. The results of this work will be disseminated widely both in the research community and to the public at large. Nearly all planning and learning methods used in robotics today depend on models of the world – models that are sometimes wrong. Ideally, robotic systems would have the ability to adapt online to the nuances of the real world as they are encountered. The dominant paradigms for this type of adaptation are reinforcement learning (RL) and imitation learning (IL). However, today’s RL and IL algorithms are not nearly sample efficient enough to learn directly in the real world. Sample efficiency means that the algorithm can learn thoroughly from a small number of experiences. If sample efficiency were improved to the point that one could meaningfully do RL on physical robotic systems, it could dramatically improve the reliability of robotic control policies, especially for hard-to-model problems like contact-rich manipulation. This is the focus of this project – to improve sample efficiency so that robots can learn and adapt online directly in the real world via RL and learn from a small number of demonstrations via IL. The project will achieve this goal by leveraging a new class of symmetric neural models that encode problem symmetries present in many robotics domains. Preliminary work suggests these models can speed up learning by orders of magnitude in some cases. The project has the following main aims: 1) to expand current symmetric learning methods to handle domains with imperfect symmetries; 2) to explore object factored symmetric models; 3) to explore symmetric learning in visual force domains; 4) to explore policy learning directly on physical robotic systems. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

View original record on NSF Award Search →