Combining statistical machine translation and translation memories with domain adaptation
Proceedings of the 19th Nordic Conference of Computational …, 2013•aclanthology.org
Abstract
Since the emergence of translation memory software, translation companies and freelance translators have been accumulating translated text for various languages and domains. This data has the potential of being used for training domain-specific machine translation systems for corporate or even personal use. But while the resulting systems usually perform well in translating domain-specific language, their out-of-domain vocabulary coverage is often insufficient due to the limited size of the translation memories. In this paper, we demonstrate that small in-domain translation memories can be successfully complemented with freely available general-domain parallel corpora such that (a) the number of out-of-vocabulary (OOV) words is reduced while (b) the in-domain terminology is preserved. In our experiments, German–French and German–Italian statistical machine translation systems geared to marketing texts of the automobile industry have been significantly improved using Europarl and OpenSubtitles data, both in terms of automatic evaluation metrics and human judgement.