Robust cross-lingual knowledge base question answering via knowledge distillation
Data Technologies and Applications
ISSN: 2514-9288
Article publication date: 30 April 2021
Issue publication date: 11 October 2021
Abstract
Purpose
Previous knowledge base question answering (KBQA) models only consider the monolingual scenario and cannot be directly extended to the cross-lingual scenario, in which the language of the questions differs from that of the knowledge base (KB). Although a machine translation (MT) model can bridge the gap by translating questions into the language of the KB, the noise in the translated questions can accumulate and sharply impair the final performance. Therefore, the authors propose a method to improve the robustness of KBQA models in the cross-lingual scenario.
Design/methodology/approach
The authors propose a knowledge distillation-based robustness enhancement (KDRE) method. Specifically, a monolingual model (the teacher) is first trained on ground truth (GT) data. Then, to imitate practical noise, a noise-generating model is designed to inject two types of noise into questions: general noise and translation-aware noise. Finally, the noisy questions are fed into the student model. Meanwhile, the student model is jointly trained on the GT data and on distilled data, which are derived from the teacher when it is fed the GT questions.
Findings
The experimental results demonstrate that KDRE improves the performance of models in the cross-lingual scenario. KDRE improves the performance of each module in the KBQA model, and the knowledge distillation (KD) and the noise-generating model complement each other in boosting model robustness.
Originality/value
The authors are the first to extend KBQA models from the monolingual to the cross-lingual scenario, and the first to apply KD to KBQA in order to develop robust cross-lingual models.
Acknowledgements
This research is supported by the National key research and development program under grant no. 2020YFC1521503; the National Natural Science Foundation of China under grant no. 61672102, no. 61073034, no. 61370064 and no. 60940032; the National Social Science Foundation of China under grant no. BCA150050; the Program for New Century Excellent Talents in the University of Ministry of Education of China under grant no. NCET-10-0239; and the Open Project Sponsor of Beijing Key Laboratory of Intelligent Communication Software and Multimedia under grant no. ITSM201493.
Citation
Wang, S. and Dang, D. (2021), "Robust cross-lingual knowledge base question answering via knowledge distillation", Data Technologies and Applications, Vol. 55 No. 5, pp. 661-681. https://doi.org/10.1108/DTA-12-2020-0312
Publisher
Emerald Publishing Limited
Copyright © 2021, Emerald Publishing Limited