Abstract
In the era of deep learning, modeling for most natural language processing (NLP) tasks has converged to several mainstream paradigms. For example, we usually adopt the sequence labeling paradigm to solve tasks such as part-of-speech (POS) tagging, named entity recognition (NER), and chunking, and adopt the classification paradigm to solve tasks like sentiment analysis. With the rapid progress of pre-trained language models, recent years have witnessed a rising trend of paradigm shift: solving one NLP task by reformulating it in another paradigm. Paradigm shift has achieved great success on many tasks and is becoming a promising way to improve model performance. Moreover, some of these paradigms have shown great potential to unify a large number of NLP tasks, making it possible to build a single model that handles diverse tasks. In this paper, we review this phenomenon of paradigm shift in recent years, highlighting several paradigms that have the potential to solve many different NLP tasks.
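To make the notion of reformulation concrete, the minimal sketch below contrasts the two paradigms on sentiment analysis. It is an illustration of ours rather than anything from the paper: it assumes the Hugging Face transformers library is installed, and the model choices, prompt wording, and label words ("great"/"terrible") are arbitrary.

```python
# Illustrative sketch, not the paper's implementation. Assumes the Hugging
# Face `transformers` library; models, prompt, and label words are arbitrary.
from transformers import pipeline

review = "The movie was surprisingly touching and beautifully shot."

# Classification paradigm: a fine-tuned, task-specific head maps the
# sentence directly to a sentiment label.
classifier = pipeline("sentiment-analysis")
print(classifier(review))  # e.g., [{'label': 'POSITIVE', 'score': 0.99}]

# Masked-LM paradigm: the same task recast as cloze-style mask filling;
# a verbalizer maps candidate label words back to sentiment classes.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
prompt = review + " Overall, it was a [MASK] movie."
for candidate in fill_mask(prompt, targets=["great", "terrible"]):
    print(candidate["token_str"], round(candidate["score"], 4))
```

Under the cloze formulation no task-specific head is trained; the pre-trained masked language model itself scores the label words, which is one reason prompt-based reformulations are attractive in few-shot settings.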
Acknowledgements
This work was supported by the National Natural Science Foundation of China (No. 62022027).
Additional information
Colored figures are available in the online version at https://link.springer.com/journal/11633
Author information
Tian-Xiang Sun received the B.Eng. degree in software engineering from Xidian University, China in 2019. During 2019–2020, he was an applied scientist intern at Amazon Shanghai AI Lab, China. Since 2019, he has been a Ph.D. candidate in the School of Computer Science, Fudan University, China. He serves as a reviewer for ICML, ACL, EMNLP, AAAI, IJCAI, and COLING.
His research interests include natural language processing and deep learning.
Xiang-Yang Liu received the B.Eng. degree in intelligent science and technology from Xidian University, China in 2020. Since 2020, he has been a master student in the School of Computer Science, Fudan University, China.
His research interests include natural language processing and deep learning.
Xi-Peng Qiu received the B.Sc. and Ph.D. degrees in computer science from Fudan University, China in 2001 and 2006, respectively. He is currently a professor in the School of Computer Science, Fudan University, China.
His research interests include natural language processing and deep learning.
Xuan-Jing Huang received the Ph.D. degree in computer science from Fudan University, China in 1998. She is currently a professor in the School of Computer Science, Fudan University, China. She has served as program co-chair of EMNLP 2021, CCL 2019, CCL 2016, NLPCC 2017, and SMP 2015, as an organizer of WSDM 2015, and as competition chair of CIKM 2014. She has been included in the 2020 Women in AI List and the AI 2000 Most Influential Scholar Annual List, jointly announced by the Tsinghua-Chinese Academy of Engineering Joint Research Center for Knowledge and Intelligence and the Institute for Artificial Intelligence of Tsinghua University, as well as the 2020 Women in Tech List by Forbes China. She has published more than 100 papers in major computer science conferences and journals.
Her research interests include artificial intelligence, natural language processing, information retrieval and social media processing.
Rights and permissions
This article is licensed under a Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Sun, TX., Liu, XY., Qiu, XP. et al. Paradigm Shift in Natural Language Processing. Mach. Intell. Res. 19, 169–183 (2022). https://doi.org/10.1007/s11633-022-1331-6