
The effect of morphology in named entity recognition with sequence tagging

Published online by Cambridge University Press:  27 July 2018

ONUR GÜNGÖR
Affiliation:
Department of Computer Engineering, Bogazici University, Istanbul; Huawei R&D Center, Istanbul, Turkey. E-mail: onurgu@boun.edu.tr
TUNGA GÜNGÖR
Affiliation:
Department of Computer Engineering, Bogazici University, Istanbul, Turkey. E-mail: gungort@boun.edu.tr
SUZAN ÜSKÜDARLI
Affiliation:
Department of Computer Engineering, Bogazici University, Istanbul, Turkey. E-mail: suzan.uskudarli@boun.edu.tr

Abstract

This work proposes a sequential tagger for named entity recognition in morphologically rich languages. It examines several schemes for representing the morphological analysis of a word in the context of named entity recognition. Word representations are formed by concatenating word and character embeddings with morphological embeddings based on these schemes. The impact of these representations is measured by training and evaluating a sequential tagger composed of a conditional random field layer on top of a bidirectional long short-term memory layer. Experiments with Turkish, Czech, Hungarian, Finnish and Spanish produce state-of-the-art results for all five languages, indicating that representing morphological information improves performance.
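The concatenation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The dimensions, the `EmbeddingTable` helper, the random initialisation and the averaging of character and tag vectors are all assumptions made for illustration; a trained system would learn these components, and the example tags are only an illustrative analysis of the Turkish word "evlerinde".

```python
import random

# Hypothetical dimensions, chosen only to keep the example small.
WORD_DIM, CHAR_DIM, MORPH_DIM = 4, 3, 2

rng = random.Random(0)  # fixed seed so the sketch is reproducible


class EmbeddingTable:
    """Hypothetical lookup table; creates a random vector for each new key."""

    def __init__(self, dim):
        self.dim = dim
        self.vectors = {}

    def lookup(self, key):
        if key not in self.vectors:
            # Random values stand in for embeddings that would be learned.
            self.vectors[key] = [rng.uniform(-0.1, 0.1) for _ in range(self.dim)]
        return self.vectors[key]


word_table = EmbeddingTable(WORD_DIM)
char_table = EmbeddingTable(CHAR_DIM)
morph_table = EmbeddingTable(MORPH_DIM)


def average(vectors, dim):
    """Element-wise mean; an assumed stand-in for a learned composition."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def word_representation(word, morph_tags):
    """Concatenate word, character-based and morphological vectors."""
    word_vec = word_table.lookup(word.lower())
    char_vec = average([char_table.lookup(c) for c in word], CHAR_DIM)
    morph_vec = average([morph_table.lookup(t) for t in morph_tags], MORPH_DIM)
    return word_vec + char_vec + morph_vec  # list concatenation


# Illustrative morphological tags for "evlerinde" (assumed analysis).
rep = word_representation("evlerinde", ["Noun", "A3pl", "P3sg", "Loc"])
print(len(rep))
```

In a full tagger, one such vector per token would be fed to the bidirectional long short-term memory layer, whose outputs are then scored by the conditional random field layer.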

Type
Article
Copyright
Copyright © Cambridge University Press 2018 

Footnotes

This research was supported by Boğaziçi University Research Fund (BAP) under Grant 13083.
