Offline Learning from Demonstrations and Unlabeled Experience

Zolna, Konrad; Novikov, Alexander; Konyushkova, Ksenia; Gulcehre, Caglar; Wang, Ziyu; Aytar, Yusuf; Denil, Misha; de Freitas, Nando; Reed, Scott

arXiv:2011.13885 (cs)

[Submitted on 27 Nov 2020]

Abstract: Behavior cloning (BC) is often practical for robot learning because it allows a policy to be trained offline without rewards, by supervised learning on expert demonstrations. However, BC does not effectively leverage what we will refer to as unlabeled experience: data of mixed and unknown quality without reward annotations. This unlabeled data can be generated by a variety of sources such as human teleoperation, scripted policies and other agents on the same robot. Towards data-driven offline robot learning that can use this unlabeled experience, we introduce Offline Reinforced Imitation Learning (ORIL). ORIL first learns a reward function by contrasting observations from demonstrator and unlabeled trajectories, then annotates all data with the learned reward, and finally trains an agent via offline reinforcement learning. Across a diverse set of continuous control and simulated robotic manipulation tasks, we show that ORIL consistently outperforms comparable BC agents by effectively leveraging unlabeled experience.
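
The first two steps named in the abstract (learn a reward by contrasting demonstrator and unlabeled observations, then annotate the data with it) can be illustrated with a toy sketch. This is not the authors' implementation: the logistic-regression reward model, the synthetic 2-D observations, and the names train_reward and annotate are all assumptions made for illustration, and the final offline reinforcement learning step is omitted.

```python
import numpy as np

def train_reward(demo_obs, unlabeled_obs, steps=500, lr=0.5):
    """Fit a logistic classifier: demonstrator observations -> 1, unlabeled -> 0.

    Hypothetical stand-in for a learned reward model; plain gradient descent
    on the binary cross-entropy loss.
    """
    x = np.concatenate([demo_obs, unlabeled_obs])
    y = np.concatenate([np.ones(len(demo_obs)), np.zeros(len(unlabeled_obs))])
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # predicted P(demonstrator)
        grad = p - y                            # d(loss)/d(logit) per sample
        w -= lr * (x.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def annotate(obs, w, b):
    """Use the classifier output in [0, 1] as the reward for each observation."""
    return 1.0 / (1.0 + np.exp(-(obs @ w + b)))

# Synthetic data (assumption): demonstrator observations cluster near (2, 2);
# the unlabeled set mixes expert-like observations with poor ones near (-2, -2).
rng = np.random.default_rng(0)
demo_obs = rng.normal(loc=2.0, scale=0.5, size=(200, 2))
unlabeled_obs = np.concatenate([
    rng.normal(loc=2.0, scale=0.5, size=(100, 2)),   # expert-like, but unlabeled
    rng.normal(loc=-2.0, scale=0.5, size=(300, 2)),  # low-quality experience
])

w, b = train_reward(demo_obs, unlabeled_obs)
rewards = annotate(unlabeled_obs, w, b)  # one reward per unlabeled observation
```

In this toy setting the expert-like part of the unlabeled set receives a higher average reward than the low-quality part, which is the signal a downstream offline reinforcement learning agent would then train on.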

Comments:	Accepted to Offline Reinforcement Learning Workshop at Neural Information Processing Systems (2020)
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO); Machine Learning (stat.ML)
Cite as:	arXiv:2011.13885 [cs.LG]
	(or arXiv:2011.13885v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2011.13885

Submission history

From: Konrad Zolna
[v1] Fri, 27 Nov 2020 18:20:04 UTC (1,319 KB)
