Putting words into the system's mouth: A targeted attack on neural machine translation using monolingual data poisoning

Wang, Jun; Xu, Chang; Guzmán, Francisco; El-Kishky, Ahmed; Tang, Yuqing; Rubinstein, Benjamin I. P.; Cohn, Trevor

arXiv:2107.05243 (cs)

[Submitted on 12 Jul 2021]

Abstract: Neural machine translation (NMT) systems are known to be vulnerable to adversarial test inputs. As we show in this paper, they are also vulnerable to training-time attacks. Specifically, we propose a poisoning attack in which a malicious adversary inserts a small poisoned sample of monolingual text into the training set of a system trained using back-translation. This sample is designed to induce a specific, targeted translation behaviour, such as peddling misinformation. We present two methods for crafting poisoned examples and show that a handful of instances, amounting to only 0.02% of the training set, is sufficient to mount a successful attack. We outline a defence method that partly ameliorates the problem. However, we stress that this is a blind spot in modern NMT that demands immediate attention.
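To give a sense of scale for the 0.02% figure, the following is a minimal arithmetic sketch, not code from the paper. The function name `poison_budget` and the training-set sizes are hypothetical, and the sketch assumes the rate is measured against the size of the clean training set.

```python
from fractions import Fraction
import math

# Rate quoted in the abstract: 0.02% of the training set.
# Stored as an exact fraction to avoid floating-point rounding.
POISON_RATE = Fraction(2, 10_000)


def poison_budget(train_size: int, rate: Fraction = POISON_RATE) -> int:
    """Smallest whole number of instances that reaches `rate` of `train_size`.

    Assumption: the rate is taken relative to the clean training set size.
    """
    if train_size < 0:
        raise ValueError("train_size must be non-negative")
    return math.ceil(train_size * rate)


if __name__ == "__main__":
    # Hypothetical training-set sizes, chosen only for illustration.
    for n in (100_000, 1_000_000, 5_000_000):
        print(f"{n:>9,} training instances -> {poison_budget(n):,} poisoned")
```

Under these assumptions, a training set of one million instances would correspond to 200 poisoned instances.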

Comments:	Findings of ACL, to appear
Subjects:	Computation and Language (cs.CL); Cryptography and Security (cs.CR)
Cite as:	arXiv:2107.05243 [cs.CL]
	(or arXiv:2107.05243v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2107.05243

Submission history

From: Jun Wang
[v1] Mon, 12 Jul 2021 08:07:09 UTC (174 KB)
