Scheduled Sampling Based on Decoding Steps for Neural Machine Translation

Liu, Yijin; Meng, Fandong; Chen, Yufeng; Xu, Jinan; Zhou, Jie

arXiv:2108.12963 (cs)

[Submitted on 30 Aug 2021 (v1), last revised 31 Aug 2021 (this version, v2)]

Abstract: Scheduled sampling is widely used to mitigate the exposure bias problem in neural machine translation. Its core motivation is to simulate the inference scenario during training by replacing ground-truth tokens with predicted tokens, thus bridging the gap between training and inference. However, vanilla scheduled sampling is based solely on training steps and treats all decoding steps equally. That is, it simulates an inference scenario with uniform error rates, which does not match real inference, where later decoding steps usually have higher error rates due to error accumulation. To alleviate this discrepancy, we propose scheduled sampling methods based on decoding steps, which increase the chance of selecting predicted tokens as the decoding step grows. Consequently, we can simulate the inference scenario more realistically during training, better bridging the gap between training and inference. Moreover, we investigate scheduled sampling based on both training steps and decoding steps for further improvements. Experimentally, our approaches significantly outperform the Transformer baseline and vanilla scheduled sampling on three large-scale WMT tasks. Additionally, our approaches generalize well to the text summarization task on two popular benchmarks.
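
The idea in the abstract, choosing predicted tokens more often at later decoding steps, can be illustrated with a minimal sketch. The abstract does not state which schedule the authors use, so the exponential decay below, the `decay` parameter, and the function names `teacher_forcing_prob` and `mix_decoder_inputs` are all assumptions made for illustration, not the paper's implementation.

```python
import random


def teacher_forcing_prob(step, decay=0.99):
    """Probability of feeding the ground-truth token at decoding step `step` (0-based).

    Assumed schedule: exponential decay in the decoding step, so later
    steps are more likely to receive the model's own prediction.
    """
    return decay ** step


def mix_decoder_inputs(gold, predicted, decay=0.99, rng=None):
    """Build decoder inputs by choosing, per position, gold or predicted tokens.

    `gold` and `predicted` are equal-length token sequences; `predicted`
    stands in for the output of a first decoding pass (hypothetical here).
    """
    rng = rng or random.Random()
    mixed = []
    for step, (gold_tok, pred_tok) in enumerate(zip(gold, predicted)):
        keep_gold = rng.random() < teacher_forcing_prob(step, decay)
        mixed.append(gold_tok if keep_gold else pred_tok)
    return mixed


if __name__ == "__main__":
    gold = ["wir", "gehen", "heute", "nach", "hause"]
    predicted = ["wir", "gehe", "heute", "zu", "haus"]
    print(mix_decoder_inputs(gold, predicted, decay=0.8, rng=random.Random(0)))
```

With `decay=1.0` the sketch reduces to plain teacher forcing, because every position keeps its ground-truth token. Smaller values push the later positions toward predicted tokens, mimicking the higher error rates the abstract attributes to later decoding steps.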

Comments:	Code is at this https URL. To appear in EMNLP-2021 main conference. arXiv admin note: text overlap with arXiv:2107.10427
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2108.12963 [cs.CL]
	(or arXiv:2108.12963v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2108.12963

Submission history

From: Yijin Liu
[v1] Mon, 30 Aug 2021 02:41:42 UTC (4,201 KB)
[v2] Tue, 31 Aug 2021 06:21:16 UTC (2,080 KB)
