Generating keyword queries for natural language queries to alleviate lexical chasm problem

X Liu, S Pan, Q Zhang, YG Jiang, X Huang - Proceedings of the 27th …, 2018 - dl.acm.org
Proceedings of the 27th ACM International Conference on Information and …, 2018 - dl.acm.org
In recent years, the task of reformulating natural language queries has received considerable attention from both industry and academia. Because of the lexical chasm between natural language queries and web documents, using natural language queries directly as retrieval input usually gives unsatisfactory results. In this work, we formulate the task as a translation problem: converting natural language queries into keyword queries. Because the natural language queries that users enter are diverse and multi-faceted, general encoder-decoder models cannot effectively handle low-frequency and out-of-vocabulary words. We propose a novel encoder-decoder method with two decoders: the pointer decoder first extracts query terms directly from the source text via a copying mechanism, and the generator decoder then generates query terms using two attention modules that simultaneously consider the source text and the extracted query terms. For training and evaluation, we also propose a semi-automatic method to construct a large-scale dataset of natural language query and keyword query pairs. Experimental results on this dataset demonstrate that our model can achieve better performance than previous state-of-the-art methods.
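The two-decoder idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`pointer_extract`, `generator_context`), the hand-set pointer scores, and the two-dimensional vectors are all assumptions made for illustration, and a real model would learn these quantities. The sketch only shows why copying helps with out-of-vocabulary words (the pointer step selects source positions, so a rare token such as the made-up `xq900` can be output without being in any vocabulary) and how a generator step might combine one attention context over the source with another over the extracted terms.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys):
    """Dot-product attention: returns (weights over keys, weighted-sum context)."""
    weights = softmax([sum(q * k for q, k in zip(query, key)) for key in keys])
    dim = len(keys[0])
    context = [sum(w * key[d] for w, key in zip(weights, keys)) for d in range(dim)]
    return weights, context

def pointer_extract(source_tokens, pointer_scores, k):
    """Copy the k highest-scoring source tokens, kept in source order.

    Selection is over source positions, not a fixed vocabulary, so
    out-of-vocabulary tokens can be copied verbatim.
    """
    probs = softmax(pointer_scores)
    top = sorted(range(len(source_tokens)), key=lambda i: probs[i], reverse=True)[:k]
    return [source_tokens[i] for i in sorted(top)]

def generator_context(state, source_vecs, extracted_vecs):
    """Concatenate two attention contexts: one over the source text and
    one over the terms the pointer step extracted (assumed combination)."""
    _, src_ctx = attend(state, source_vecs)
    _, ext_ctx = attend(state, extracted_vecs)
    return src_ctx + ext_ctx

# Toy natural language query; the scores are hand-set, not learned.
source = ["how", "do", "i", "reset", "my", "xq900", "router"]
scores = [0.1, 0.0, 0.0, 2.0, 0.1, 3.0, 2.5]
extracted = pointer_extract(source, scores, k=3)

# Toy 2-d vectors standing in for encoder states (hypothetical values).
ctx = generator_context([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0]])
```

Here `extracted` holds the three copied terms and `ctx` is the four-dimensional vector a generator step could feed into its output layer; how the paper actually combines the two attention modules is not specified in the abstract.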
ACM Digital Library