In this paper, we proposed a simple approach to filtering noisy sentence pairs from a synthetic parallel corpus generated with back-translation. We measured the ...
Experimental results on the IWSLT 2017 Korean→English translation task show that despite using much less data, this method outperforms the baseline NMT ...
Improving Neural Machine Translation by Filtering Synthetic Parallel ...
Abstract: Synthetic data has been shown to be effective in training state-of-the-art neural machine translation (NMT) systems. Because the synthetic data is ...
Jan 6, 2023 · New research from IBM, UC San Diego explores synthetic parallel data as a new means of pre-training machine translation models.
Guanghao Xu, Youngjoong Ko, Jungyun Seo: Improving Neural Machine Translation by Filtering Synthetic Parallel Data. Entropy 21(12): 1213 (2019).
2.1 Previous Work. Considering the limited size of noisy parallel data, data augmentation methods are commonly used to generate more noisy training materials.
Aug 22, 2024 · This paper proposes a novel way of utilizing a monolingual corpus on the source side to assist Neural Machine Translation (NMT) in low-resource ...
Feb 8, 2024 · Synthetic parallel data: Similar to the English-German setup, 3.2 million Turkish sentences were back-translated into English to create ...
We propose a method to effectively expand the training data via filtering the pseudo-parallel corpus using quality estimation based on sentence-level round- ...
Improving Neural Machine Translation by Filtering Synthetic Parallel Data. Entropy 2019, 21, 1213. https://doi.org/10.3390/e21121213. AMA Style. Xu G, Ko Y ...