Adversarial Policies: Attacking Deep Reinforcement Learning

Gleave, Adam; Dennis, Michael; Wild, Cody; Kant, Neel; Levine, Sergey; Russell, Stuart

arXiv:1905.10615 (cs)

[Submitted on 25 May 2019 (v1), last revised 17 Jan 2021 (this version, v3)]

Abstract: Deep reinforcement learning (RL) policies are known to be vulnerable to adversarial perturbations to their observations, similar to adversarial examples for classifiers. However, an attacker is not usually able to directly modify another agent's observations. This might lead one to wonder: is it possible to attack an RL agent simply by choosing an adversarial policy acting in a multi-agent environment so as to create natural observations that are adversarial? We demonstrate the existence of adversarial policies in zero-sum games between simulated humanoid robots with proprioceptive observations, against state-of-the-art victims trained via self-play to be robust to opponents. The adversarial policies reliably win against the victims but generate seemingly random and uncoordinated behavior. We find that these policies are more successful in high-dimensional environments, and induce substantially different activations in the victim policy network than when the victim plays against a normal opponent. Videos are available at this https URL.
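The threat model in the abstract, an attacker who controls only its own policy and plays against a fixed victim in a zero-sum game, can be illustrated with a toy sketch. The example below is not the paper's setup: it swaps the simulated humanoid environments for rock-paper-scissors, and the victim is a hypothetical frozen mixed strategy that slightly over-plays rock rather than a policy trained via self-play. All names here (`PAYOFF`, `train_adversary`, `expected_payoff`) are illustrative assumptions. The adversary treats the frozen victim as part of the environment and ascends the exact policy gradient of its expected payoff.

```python
import math

# Payoff to the adversary in rock-paper-scissors: PAYOFF[a][v] for
# adversary action a and victim action v (0=rock, 1=paper, 2=scissors).
# The game is zero-sum, so the victim's payoff is the negative of this.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def softmax(logits):
    """Convert logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def action_values(victim_probs):
    """Expected adversary payoff of each action against the fixed victim."""
    return [
        sum(PAYOFF[a][v] * victim_probs[v] for v in range(3))
        for a in range(3)
    ]


def expected_payoff(adversary_probs, victim_probs):
    """Expected adversary payoff when both players use mixed strategies."""
    q = action_values(victim_probs)
    return sum(p * qa for p, qa in zip(adversary_probs, q))


def train_adversary(victim_probs, steps=2000, lr=1.0):
    """Gradient ascent on softmax logits against a frozen victim.

    Assumption for illustration: the victim never adapts, so its policy
    acts as a fixed part of the adversary's environment.
    """
    logits = [0.0, 0.0, 0.0]
    q = action_values(victim_probs)  # constant because the victim is frozen
    for _ in range(steps):
        probs = softmax(logits)
        baseline = sum(p * qa for p, qa in zip(probs, q))
        # Exact gradient of the expected payoff w.r.t. each softmax logit.
        logits = [
            l + lr * p * (qa - baseline)
            for l, p, qa in zip(logits, probs, q)
        ]
    return softmax(logits)


if __name__ == "__main__":
    victim = [0.5, 0.3, 0.2]  # hypothetical victim that over-plays rock
    adversary = train_adversary(victim)
    print("adversary policy:", [round(p, 3) for p in adversary])
    print("expected payoff:", round(expected_payoff(adversary, victim), 3))
```

In this toy game the adversary simply learns the best response to a visibly biased victim. The abstract's claim is stronger: the victims there were trained to be robust to opponents, and the adversarial policies win while generating seemingly random and uncoordinated behavior.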

Comments: Presented at ICLR 2020
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (stat.ML)
ACM classes: I.2.6
Cite as: arXiv:1905.10615 [cs.LG] (or arXiv:1905.10615v3 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.1905.10615

Submission history

From: Adam Gleave
[v1] Sat, 25 May 2019 15:23:19 UTC (7,861 KB)
[v2] Tue, 11 Feb 2020 19:54:47 UTC (6,882 KB)
[v3] Sun, 17 Jan 2021 19:25:56 UTC (6,259 KB)
