MBDP: A Model-based Approach to Achieve both Robustness and Sample Efficiency via Double Dropout Planning

Zhang, Wanpeng; Xiao, Xi; Yao, Yao; Chen, Mingzhe; Luo, Dijun

arXiv:2108.01295 (cs)

[Submitted on 3 Aug 2021 (v1), last revised 2 May 2024 (this version, v2)]

Abstract: Model-based reinforcement learning is a widely accepted way to reduce the excessive sample demands of reinforcement learning. However, the predictions of dynamics models are often not accurate enough, and the resulting bias may lead to catastrophic decisions due to insufficient robustness. It is therefore highly desirable to investigate how to improve the robustness of model-based RL algorithms while maintaining high sample efficiency. In this paper, we propose Model-Based Double-dropout Planning (MBDP) to balance robustness and efficiency. MBDP consists of two dropout mechanisms: rollout-dropout aims to improve robustness at a small cost in sample efficiency, while model-dropout is designed to compensate for the lost efficiency at a slight expense of robustness. By combining them in a complementary way, MBDP provides a flexible control mechanism that meets different demands for robustness and efficiency by tuning the two corresponding dropout ratios. The effectiveness of MBDP is demonstrated both theoretically and experimentally.
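The abstract names two dropout ratios but does not say how candidates are selected for dropping. The sketch below is one possible reading, not the authors' implementation: it assumes model-dropout discards the ensemble members with the largest validation error, and rollout-dropout discards the highest-return rollouts so that planning relies on the more pessimistic ones. The function names, the `alpha` and `beta` parameter names, and both selection criteria are assumptions made for illustration.

```python
from typing import List, Sequence


def keep_fraction(items: Sequence, scores: Sequence[float], drop_ratio: float) -> List:
    """Drop the `drop_ratio` fraction of items with the highest scores.

    Returns the surviving items ordered by ascending score.
    At least one item is always kept.
    """
    if not 0.0 <= drop_ratio < 1.0:
        raise ValueError("drop_ratio must be in [0, 1)")
    if len(items) != len(scores):
        raise ValueError("items and scores must have the same length")
    n_keep = max(1, len(items) - int(drop_ratio * len(items)))
    order = sorted(range(len(items)), key=lambda i: scores[i])
    return [items[i] for i in order[:n_keep]]


def model_dropout(models: Sequence, validation_errors: Sequence[float], alpha: float) -> List:
    # Assumption: drop the ensemble members with the largest validation error.
    return keep_fraction(models, validation_errors, alpha)


def rollout_dropout(rollouts: Sequence, returns: Sequence[float], beta: float) -> List:
    # Assumption: drop the highest-return rollouts, keeping the pessimistic ones.
    return keep_fraction(rollouts, returns, beta)


if __name__ == "__main__":
    kept_models = model_dropout(["m0", "m1", "m2", "m3"], [0.3, 0.1, 0.9, 0.2], alpha=0.25)
    kept_rollouts = rollout_dropout(["a", "b", "c", "d"], [10.0, 2.0, 7.0, 5.0], beta=0.5)
    print(kept_models)
    print(kept_rollouts)
```

Under these assumptions, raising `beta` makes the planner more conservative, while `alpha` controls how many ensemble members are excluded from rollouts. Setting both to 0 keeps every model and every rollout.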

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2108.01295 [cs.LG]
	(or arXiv:2108.01295v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2108.01295

Submission history

From: Wanpeng Zhang
[v1] Tue, 3 Aug 2021 04:55:16 UTC (1,639 KB)
[v2] Thu, 2 May 2024 14:38:51 UTC (1,637 KB)
