RvS: What is Essential for Offline RL via Supervised Learning?

Emmons, Scott; Eysenbach, Benjamin; Kostrikov, Ilya; Levine, Sergey

arXiv:2112.10751 (cs)

[Submitted on 20 Dec 2021 (v1), last revised 11 May 2022 (this version, v2)]

Abstract: Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL. When does this hold true, and which algorithmic components are necessary? Through extensive experiments, we boil supervised learning for offline RL down to its essential elements. In every environment suite we consider, simply maximizing likelihood with a two-layer feedforward MLP is competitive with state-of-the-art results of substantially more complex methods based on TD learning or sequence modeling with Transformers. Carefully choosing model capacity (e.g., via regularization or architecture) and choosing which information to condition on (e.g., goals or rewards) are critical for performance. These insights serve as a field guide for practitioners doing Reinforcement Learning via Supervised Learning (which we coin "RvS learning"). They also probe the limits of existing RvS methods, which are comparatively weak on random data, and suggest a number of open problems.
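The abstract's "choosing which information to condition on (e.g., goals or rewards)" can be illustrated with a minimal sketch of how an offline trajectory might be turned into supervised (input, target action) pairs. This is not the authors' code. It assumes, for illustration only, that reward conditioning means appending the average of the remaining rewards to each state; the names `reward_to_go_average` and `build_supervised_pairs` are hypothetical. A policy network such as the two-layer MLP mentioned above would then be fit to these pairs by maximum likelihood.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def reward_to_go_average(rewards: Sequence[float]) -> List[float]:
    """For each timestep t, return the mean of rewards[t:].

    Assumption: this average is the conditioning signal ("outcome").
    Other choices, such as a future goal state, are equally possible.
    """
    averages = [0.0] * len(rewards)
    running_sum = 0.0
    # Walk backwards so each suffix sum is computed in O(1).
    for t in range(len(rewards) - 1, -1, -1):
        running_sum += rewards[t]
        averages[t] = running_sum / (len(rewards) - t)
    return averages


def build_supervised_pairs(
    states: Sequence[Vector],
    actions: Sequence[Vector],
    rewards: Sequence[float],
) -> List[Tuple[Vector, Vector]]:
    """Turn one trajectory into (model_input, target_action) pairs.

    Each model input is the state concatenated with the outcome that
    was actually achieved from that state onward in the dataset.
    """
    outcomes = reward_to_go_average(rewards)
    pairs = []
    for state, action, outcome in zip(states, actions, outcomes):
        pairs.append((tuple(state) + (outcome,), tuple(action)))
    return pairs


# Hypothetical three-step trajectory with 1-D states and actions.
pairs = build_supervised_pairs(
    states=[(0.0,), (1.0,), (2.0,)],
    actions=[(1.0,), (1.0,), (-1.0,)],
    rewards=[1.0, 0.0, 2.0],
)
```

At evaluation time, under the same assumption, the learned policy would be queried with the current state and a desired outcome value in place of the achieved one.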

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as: arXiv:2112.10751 [cs.LG] (or arXiv:2112.10751v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2112.10751

Submission history

From: Scott Emmons
[v1] Mon, 20 Dec 2021 18:55:16 UTC (3,438 KB)
[v2] Wed, 11 May 2022 03:17:44 UTC (1,476 KB)
