
Directional distributional similarity for lexical inference

Published online by Cambridge University Press: 11 October 2010

LILI KOTLERMAN
Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
e-mail: lili.dav@gmail.com

IDO DAGAN
Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
e-mail: dagan@cs.biu.ac.il

IDAN SZPEKTOR
Yahoo! Research, Building 30 Matam Park, Haifa 31905, Israel
e-mail: idan@yahoo-inc.com

MAAYAN ZHITOMIRSKY-GEFFET
Department of Information Science, Bar-Ilan University, Ramat Gan, Israel
e-mail: maayan.geffet@gmail.com

Abstract

Distributional word similarity is most commonly perceived as a symmetric relation. Yet, directional relations are abundant in lexical semantics and in many Natural Language Processing (NLP) settings that require lexical inference, making symmetric similarity measures less suitable for their identification. This paper investigates the nature of directional (asymmetric) similarity measures that aim to quantify distributional feature inclusion. We identify desired properties of such measures for lexical inference, specify a particular measure based on Average Precision that addresses these properties, and demonstrate the empirical benefit of directional measures for two different NLP datasets.
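To make the notion of distributional feature inclusion concrete, the sketch below scores how much of one word's weighted feature set is covered by another word's features. It is only an illustration under stated assumptions: it is a generic weighted inclusion ratio, not the Average Precision-based measure specified in the paper, and the function name `inclusion_score` and the toy feature weights are hypothetical.

```python
def inclusion_score(u_feats, v_feats):
    """Share of u's total feature weight carried by features that v also has.

    Both arguments map a context feature to a non-negative weight.
    This is a generic inclusion ratio used for illustration only; it is
    NOT the Average Precision-based measure described in the paper.
    """
    total = sum(u_feats.values())
    if total == 0:
        return 0.0
    shared = sum(w for f, w in u_feats.items() if f in v_feats)
    return shared / total


# Toy, made-up feature weights (assumption: not data from the paper).
dog = {"bark": 2.0, "eat": 1.0, "run": 1.0}
animal = {"eat": 3.0, "run": 2.0, "sleep": 2.0, "bark": 1.0, "fly": 1.0}

narrow_to_broad = inclusion_score(dog, animal)  # direction: dog -> animal
broad_to_narrow = inclusion_score(animal, dog)  # direction: animal -> dog
```

With these toy weights the score is higher in the narrow-to-broad direction than in the reverse, which is the asymmetry a symmetric similarity measure cannot express.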

Type: Papers
Copyright © Cambridge University Press 2010
