Design and realization of a modular architecture for textual entailment

SEBASTIAN PADÓ; TAE-GIL NOH; ASHER STERN; RUI WANG; ROBERTO ZANOLI

doi:10.1017/S1351324913000351

Published online by Cambridge University Press: 13 December 2013

SEBASTIAN PADÓ
Institute for Natural Language Processing, Stuttgart University, 70569 Stuttgart, Germany. E-mail: pado@ims.uni-stuttgart.de
TAE-GIL NOH
Institute of Computational Linguistics, Heidelberg University, 69120 Heidelberg, Germany. E-mail: noh@cl.uni-heidelberg.de
ASHER STERN
Department of Computer Science, Bar-Ilan University, Ramat Gan 52900, Israel. E-mail: astern7@gmail.com
RUI WANG
German Research Center for Artificial Intelligence, 66123 Saarbrücken, Germany. E-mail: rui.wang@dfki.de
ROBERTO ZANOLI
Human Language Technology, Fondazione Bruno Kessler, 38123 Trento, Italy. E-mail: zanoli@fbk.eu

Abstract

A key challenge at the core of many Natural Language Processing (NLP) tasks is the ability to determine which conclusions can be inferred from a given natural language text. This problem, called the Recognition of Textual Entailment (RTE), has initiated the development of a range of algorithms, methods, and technologies. Unfortunately, research on Textual Entailment (TE), like semantics research more generally, is fragmented into studies focussing on various aspects of semantics such as world knowledge, lexical and syntactic relations, or more specialized kinds of inference. This fragmentation has problematic practical consequences. Notably, interoperability among the existing RTE systems is poor, and reuse of resources and algorithms is mostly infeasible. This also makes systematic evaluations very difficult to carry out. Finally, textual entailment presents a wide array of approaches to potential end users with little guidance on which to pick. Our contribution to this situation is the novel EXCITEMENT architecture, which was developed to enable and encourage the consolidation of methods and resources in the textual entailment area. It decomposes RTE into components with strongly typed interfaces. We specify (a) a modular linguistic analysis pipeline and (b) a decomposition of the ‘core’ RTE methods into top-level algorithms and subcomponents. We identify four major subcomponent types, including knowledge bases and alignment methods. The architecture was developed with a focus on generality, supporting all major approaches to RTE and encouraging language independence. We illustrate the feasibility of the architecture by constructing mappings of major existing systems onto the architecture. The practical implementation of this architecture forms the EXCITEMENT open platform. It is a suite of textual entailment algorithms and components which contains the three systems named above, including linguistic-analysis pipelines for three languages (English, German, and Italian), and comprises a number of linguistic resources. By addressing the problems outlined above, the platform provides a comprehensive and flexible basis for research and experimentation in textual entailment and is available as open source software under the GNU General Public License.
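
To make the idea of decomposing RTE into components with typed interfaces more concrete, the following Python sketch separates a linguistic analysis stage, a knowledge-base subcomponent, and a top-level decision algorithm. It is only an illustration under stated assumptions: all names (AnnotatedPair, AnalysisPipeline, KnowledgeBase, EntailmentAlgorithm, WhitespacePipeline, ToyLexicon, CoverageAlgorithm, recognize) are hypothetical and are not taken from the EXCITEMENT open platform, and the token-coverage decision rule is a deliberately trivial stand-in for a real RTE method.

```python
from dataclasses import dataclass
from typing import FrozenSet, Protocol, Tuple


@dataclass(frozen=True)
class AnnotatedPair:
    """Text/hypothesis pair after linguistic analysis (here: tokens only)."""
    text_tokens: Tuple[str, ...]
    hypothesis_tokens: Tuple[str, ...]


class AnalysisPipeline(Protocol):
    """Hypothetical interface for the linguistic analysis stage."""
    def analyze(self, text: str, hypothesis: str) -> AnnotatedPair: ...


class KnowledgeBase(Protocol):
    """Hypothetical interface for one subcomponent type: a knowledge base."""
    def entails(self, lhs: str, rhs: str) -> bool: ...


class EntailmentAlgorithm(Protocol):
    """Hypothetical interface for a top-level decision algorithm."""
    def decide(self, pair: AnnotatedPair) -> bool: ...


class WhitespacePipeline:
    """Toy pipeline: lowercases and splits on whitespace."""
    def analyze(self, text: str, hypothesis: str) -> AnnotatedPair:
        return AnnotatedPair(tuple(text.lower().split()),
                             tuple(hypothesis.lower().split()))


class ToyLexicon:
    """Toy knowledge base holding directional word-level rules (lhs -> rhs)."""
    def __init__(self, rules: FrozenSet[Tuple[str, str]]) -> None:
        self._rules = rules

    def entails(self, lhs: str, rhs: str) -> bool:
        return lhs == rhs or (lhs, rhs) in self._rules


class CoverageAlgorithm:
    """Toy algorithm: every hypothesis token must be licensed by a text token."""
    def __init__(self, kb: KnowledgeBase) -> None:
        self._kb = kb

    def decide(self, pair: AnnotatedPair) -> bool:
        return all(
            any(self._kb.entails(t, h) for t in pair.text_tokens)
            for h in pair.hypothesis_tokens
        )


def recognize(pipeline: AnalysisPipeline, algorithm: EntailmentAlgorithm,
              text: str, hypothesis: str) -> bool:
    """Run analysis, then hand the annotated pair to the decision algorithm."""
    return algorithm.decide(pipeline.analyze(text, hypothesis))
```

Because the algorithm depends only on the KnowledgeBase interface and the driver only on AnalysisPipeline and EntailmentAlgorithm, any of the three parts can be replaced (for example, a pipeline for another language or a different lexical resource) without touching the others, which is the kind of reuse the abstract argues for.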

Type: Articles
Information: Natural Language Engineering, Volume 21, Issue 2, March 2015, pp. 167–200

DOI: https://doi.org/10.1017/S1351324913000351
Copyright: Copyright © Cambridge University Press 2013
