Stochastically Dominant Distributional Reinforcement Learning

Martin, John D.; Lyskawinski, Michal; Li, Xiaohu; Englot, Brendan

arXiv:1905.07318 (cs)

[Submitted on 17 May 2019 (v1), last revised 7 Oct 2020 (this version, v4)]

Abstract: We describe a new approach for managing aleatoric uncertainty in the Reinforcement Learning (RL) paradigm. Instead of selecting actions according to a single statistic, we propose a distributional method based on the second-order stochastic dominance (SSD) relation. This relation compares the inherent dispersion of the random returns induced by actions, producing a more comprehensive and robust evaluation of the environment's uncertainty. The necessary conditions for SSD require estimators to predict accurate second moments. To accommodate this, we map the distributional RL problem to a Wasserstein gradient flow, treating the distributional Bellman residual as a potential energy functional. We propose a particle-based algorithm for which we prove optimality and convergence. Our experiments characterize the algorithm's performance and demonstrate how uncertainty and performance are better balanced using an SSD policy than with other risk measures.
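
To make the SSD relation concrete, here is a minimal sketch of a second-order stochastic dominance test between two empirical return distributions, each represented by equally weighted samples (particles). It is not the paper's algorithm: it only illustrates the standard definition, under which X dominates Y when E[(t - X)+] <= E[(t - Y)+] for every threshold t. The names `lower_partial_mean`, `ssd_dominates`, `safe`, and `risky` are hypothetical and chosen for illustration.

```python
def lower_partial_mean(samples, t):
    """Return E[(t - X)+] for an equally weighted empirical distribution."""
    return sum(max(t - x, 0.0) for x in samples) / len(samples)


def ssd_dominates(x, y, tol=1e-12):
    """Check whether samples x dominate samples y in the second order.

    For empirical distributions, t -> E[(t - X)+] is piecewise linear with
    kinks at the sample values, so it is enough to compare the two functions
    at every sample value that appears in x or y.
    """
    grid = sorted(set(x) | set(y))
    return all(
        lower_partial_mean(x, t) <= lower_partial_mean(y, t) + tol
        for t in grid
    )


# Two hypothetical actions with the same mean return but different dispersion.
safe = [1.0, 1.0, 1.0, 1.0]
risky = [0.0, 0.0, 2.0, 2.0]

safe_over_risky = ssd_dominates(safe, risky)
risky_over_safe = ssd_dominates(risky, safe)
print(safe_over_risky, risky_over_safe)
```

Both hypothetical actions have mean return 1, so a mean-only criterion cannot separate them. The SSD test prefers the less dispersed one, which is the kind of comparison the abstract describes.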

Comments:	Accepted to the 2020 International Conference on Machine Learning
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1905.07318 [cs.LG]
	(or arXiv:1905.07318v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1905.07318

Submission history

From: John Martin Jr
[v1] Fri, 17 May 2019 15:15:08 UTC (507 KB)
[v2] Sat, 23 Nov 2019 16:57:49 UTC (1,021 KB)
[v3] Thu, 5 Mar 2020 23:16:27 UTC (3,242 KB)
[v4] Wed, 7 Oct 2020 15:27:25 UTC (3,235 KB)
