
A 3-phase approach based on sequential mining and dependency parsing for enhancing hypernym patterns performance

Published online by Cambridge University Press:  22 September 2021

Ahmad Issa Alaa Aldine
Affiliation:
University Bretagne Sud, IRISA Lab, France – Vannes Email: ahmad.issa-alaa-eddine@univ-ubs.fr, giuseppe.berio@univ-ubs.fr, nicolas.bechet@irisa.fr Lebanese University, Lebanon Email: ahmad.faour@ul.edu.lb
Mounira Harzallah
Affiliation:
LINA - University of Nantes, France E-mail: mounira.harzallah@univ-nantes.fr
Giuseppe Berio
Affiliation:
University Bretagne Sud, IRISA Lab, France – Vannes Email: ahmad.issa-alaa-eddine@univ-ubs.fr, giuseppe.berio@univ-ubs.fr, nicolas.bechet@irisa.fr
Nicolas Béchet
Affiliation:
University Bretagne Sud, IRISA Lab, France – Vannes Email: ahmad.issa-alaa-eddine@univ-ubs.fr, giuseppe.berio@univ-ubs.fr, nicolas.bechet@irisa.fr
Ahmad Faour
Affiliation:
Lebanese University, Lebanon Email: ahmad.faour@ul.edu.lb

Abstract

Patterns have been extensively used to extract hypernym relations from texts. The most popular patterns are Hearst’s patterns, formulated as regular expressions mainly based on lexical information. Experiments have reported good precision and low recall for such patterns. Thus, several approaches have been developed for improving recall. While these approaches perform better in terms of recall, it remains quite difficult to further increase recall without degrading precision. In this paper, we propose a novel 3-phase approach based on sequential pattern mining to improve pattern-based approaches in terms of both precision and recall by (i) using a rich pattern representation based on grammatical dependencies, (ii) discovering new hypernym patterns, and (iii) extending hypernym patterns with anti-hypernym patterns to prune wrongly extracted hypernym relations. The results of experiments on three corpora confirm that, using our approach, we are able to learn sequential patterns and combine them to outperform existing hypernym patterns in terms of precision and recall. The comparison to unsupervised distributional baselines for hypernym detection shows that, as expected, our approach yields much better performance. When compared to supervised distributional baselines for hypernym detection, our approach can be shown to be complementary and much less tightly coupled with training datasets and corpora.
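
To make the precision/recall trade-off of lexical patterns concrete, the following is a minimal sketch of a single Hearst-style "such as" pattern written as a regular expression and scored against a toy gold standard. It is not the approach proposed in the paper: it uses no dependency parsing, sequential pattern mining, or anti-hypernym patterns. The names `SUCH_AS`, `SEP`, `extract_pairs`, and `precision_recall`, the example sentences, and the gold pairs are all hypothetical, and noun phrases are assumed to be single words.

```python
import re

# Separators allowed between listed hyponyms: ", ", ", and ", ", or ", " and ", " or ".
SEP = r", (?:and |or )?| and | or "

# Simplified lexical pattern: "<hypernym> such as <hyponym>, <hyponym> and <hyponym>".
# Assumption: every noun phrase is a single word, which real systems do not assume.
SUCH_AS = re.compile(rf"(\w+),? such as (\w+(?:(?:{SEP})\w+)*)")


def extract_pairs(text):
    """Return the set of (hyponym, hypernym) pairs matched by SUCH_AS."""
    pairs = set()
    for match in SUCH_AS.finditer(text.lower()):
        hypernym, hyponym_list = match.groups()
        for hyponym in re.split(SEP, hyponym_list):
            pairs.add((hyponym, hypernym))
    return pairs


def precision_recall(predicted, gold):
    """Compute precision and recall of predicted pairs against a gold set."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy corpus and gold standard, invented for illustration only.
text = (
    "She keeps animals such as cats, dogs and horses. "
    "A sparrow is a small bird."
)
gold = {
    ("cats", "animals"),
    ("dogs", "animals"),
    ("horses", "animals"),
    ("sparrow", "bird"),  # not expressed with "such as", so the pattern misses it
}

predicted = extract_pairs(text)
precision, recall = precision_recall(predicted, gold)
print(sorted(predicted))
print(precision, recall)
```

On this toy input the pattern extracts three correct pairs and misses the fourth, giving a precision of 1.0 and a recall of 0.75, which mirrors the "good precision, low recall" behaviour described above on a very small scale.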

Type
Research Article
Copyright
© The Author(s), 2021. Published by Cambridge University Press on behalf of Asian Journal of Law and Society
