Data-driven Policy Transfer with Imprecise Perception Simulation

Martin Pecka, Karel Zimmermann, Matěj Petrlík and Tomáš Svoboda

Czech Technical University in Prague, Faculty of Electrical Engineering, Department of Cybernetics

Center for Robotics and Autonomous Systems

This paper presents a complete pipeline for learning continuous motion control policies for a mobile robot when only a non-differentiable physics simulator of robot-terrain interactions is available. Because the robot's multi-modal state estimation is complex and difficult to simulate, we simultaneously learn a generative model that refines the simulator outputs. We propose a coarse-to-fine learning paradigm in which coarse motion planning alternates with imitation learning and policy transfer to the real robot. The policy is optimized jointly with the generative model. We evaluate the method on a real-world platform in a batch of experiments.
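To make the alternation described in the abstract concrete, the following is a minimal toy sketch in Python, not the authors' implementation. Everything in it is an assumption made for illustration: the one-dimensional state, the step-shaped cost in `simulate_cost`, the linear "real sensor" in `real_perception`, and the use of least-squares lines for both the perception refinement model and the policy. It also fits the two models one after the other in each round, which is a simplification of the joint optimization the paper describes.

```python
import random


def simulate_cost(state, action):
    """Hypothetical black-box simulator: step-shaped, non-differentiable cost."""
    return abs(round(state) - action)


def coarse_plan(state, candidates):
    """Coarse planner: exhaustive search over a small discrete action set."""
    return min(candidates, key=lambda a: simulate_cost(state, a))


def fit_line(xs, ys):
    """Ordinary least squares for y = w * x + b; returns (w, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = cov_xy / var_x
    return w, mean_y - w * mean_x


def simulated_perception(state):
    """Idealized simulated sensor (assumption): reports the state directly."""
    return state


def real_perception(state, rng):
    """Hypothetical real sensor: scaled, offset, and slightly noisy."""
    return 0.8 * state + 0.5 + rng.gauss(0.0, 0.01)


def train(rounds=3, samples=200, seed=0):
    rng = random.Random(seed)
    candidates = list(range(6))  # coarse action set 0..5
    refine = (1.0, 0.0)          # perception refinement model (w, b)
    policy = (0.0, 0.0)          # linear policy (w, b)
    for _ in range(rounds):
        states = [rng.uniform(0.0, 5.0) for _ in range(samples)]
        # 1. Coarse planning in the simulator yields expert actions.
        expert = [coarse_plan(s, candidates) for s in states]
        # 2. Fit the refinement model on paired simulated/real observations.
        sim_obs = [simulated_perception(s) for s in states]
        real_obs = [real_perception(s, rng) for s in states]
        refine = fit_line(sim_obs, real_obs)
        # 3. Imitation learning: fit the policy on refined observations.
        refined = [refine[0] * o + refine[1] for o in sim_obs]
        policy = fit_line(refined, expert)
    return refine, policy


if __name__ == "__main__":
    refine, policy = train()
    print("refinement model (w, b):", refine)
    print("policy (w, b):", policy)
```

In this toy setting the policy never queries the simulator's gradient: it only imitates actions found by search, and it is trained on observations that have passed through the learned refinement model, so it can be applied to the (hypothetical) real sensor readings without retraining.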

Submitted to IROS 2018 with RAL option. Available on arXiv.

Schema of the algorithm. DEM

Video of the training progress

FullHD H.264, 28 MB:

Video from testing on highly unstructured terrain