Recorded May 1st, 2018
“The current dominant paradigm of imitation learning relies on strong supervision of expert actions for learning both what to and how to imitate.
We propose an alternative paradigm wherein an agent first explores the world without any expert supervision and then distills its own experience into a goal-conditioned skill policy using a novel forward consistency loss formulation.”
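To make the idea of "forward consistency" concrete, here is a minimal sketch; it is an illustration under stated assumptions, not the authors' implementation. It assumes a toy, hand-written forward model in which two actuators both push a scalar state, so that different actions can reach the same next state. A plain action-matching loss penalizes any action that differs from the recorded one, while a forward-consistency-style loss only penalizes ending up in the wrong state. All names (`forward_model`, `action_matching_loss`, `forward_consistency_loss`) are hypothetical, and in practice the forward model would be learned rather than fixed.

```python
# Hypothetical sketch: every name below is an assumption made for illustration,
# not something taken from the quoted abstract.

def forward_model(state, action):
    """Toy dynamics (assumed): both action components shift the scalar state."""
    return state + sum(action)

def action_matching_loss(predicted_action, recorded_action):
    """Squared error on actions: any deviation from the recorded action is penalized."""
    return sum((p - r) ** 2 for p, r in zip(predicted_action, recorded_action))

def forward_consistency_loss(state, predicted_action, next_state):
    """Squared error on outcomes: only the state reached by the action matters."""
    return (forward_model(state, predicted_action) - next_state) ** 2

# One transition gathered during exploration: state, action taken, resulting state.
state = 0.0
recorded_action = (1.0, 0.0)
next_state = forward_model(state, recorded_action)

# The policy proposes a different action that reaches the same next state.
predicted_action = (0.0, 1.0)

action_loss = action_matching_loss(predicted_action, recorded_action)
consistency_loss = forward_consistency_loss(state, predicted_action, next_state)
print(action_loss, consistency_loss)
```

In this toy setup the proposed action is penalized by the action-matching loss but not by the consistency loss, because it reaches the same next state as the recorded action.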
https://pathak22.github.io/zeroshot-imitation/