Directional distributional similarity for lexical inference

LILI KOTLERMAN; IDO DAGAN; IDAN SZPEKTOR; MAAYAN ZHITOMIRSKY-GEFFET

doi:10.1017/S1351324910000124

Published online by Cambridge University Press: 11 October 2010

LILI KOTLERMAN: Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel. E-mail: lili.dav@gmail.com
IDO DAGAN: Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel. E-mail: dagan@cs.biu.ac.il
IDAN SZPEKTOR: Yahoo! Research, Building 30 Matam Park, Haifa 31905, Israel. E-mail: idan@yahoo-inc.com
MAAYAN ZHITOMIRSKY-GEFFET: Department of Information Science, Bar-Ilan University, Ramat Gan, Israel. E-mail: maayan.geffet@gmail.com

Abstract

Distributional word similarity is most commonly perceived as a symmetric relation. Yet, directional relations are abundant in lexical semantics and in many Natural Language Processing (NLP) settings that require lexical inference, making symmetric similarity measures less suitable for their identification. This paper investigates the nature of directional (asymmetric) similarity measures that aim to quantify distributional feature inclusion. We identify desired properties of such measures for lexical inference, specify a particular measure based on Average Precision that addresses these properties, and demonstrate the empirical benefit of directional measures for two different NLP datasets.
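
The idea of distributional feature inclusion can be illustrated with a minimal sketch. This is not the Average Precision based measure that the paper specifies; it is a simple weighted inclusion ratio, and the function name `inclusion_score` and the toy feature vectors are assumptions made up for illustration. It only shows why such a score is directional: the features of a narrower term may be largely covered by those of a broader term, while the reverse need not hold.

```python
def inclusion_score(source, target):
    """Fraction of source's total feature weight carried by features
    that also occur with target. Illustrative only, not the paper's measure."""
    total = sum(source.values())
    if total == 0:
        return 0.0
    shared = sum(w for f, w in source.items() if f in target)
    return shared / total


# Hypothetical feature vectors: context feature -> association weight.
dog = {"bark": 2.0, "eat": 1.0, "sleep": 1.0}
animal = {"eat": 3.0, "sleep": 2.0, "bark": 1.0, "fly": 2.0, "swim": 2.0}

print(inclusion_score(dog, animal))  # 1.0: every feature of "dog" occurs with "animal"
print(inclusion_score(animal, dog))  # 0.6: "fly" and "swim" do not occur with "dog"
```

In this toy data the score from `dog` to `animal` is higher than the score in the opposite direction, which a symmetric similarity measure cannot express.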

Type: Papers
Information: Natural Language Engineering, Volume 16, Special Issue 4: Distributional Lexical Semantics, October 2010, pp. 359–389

DOI: https://doi.org/10.1017/S1351324910000124
Copyright: Copyright © Cambridge University Press 2010
