Javad Hosseini
Javad Hosseini is a researcher at Google Research, UK, working on natural language inference, reasoning, and problems related to the factuality of large language models. Before joining Google, Javad earned his PhD at the Institute for Language, Cognition and Computation (ILCC), University of Edinburgh, under the supervision of Mark Steedman. He obtained his MSc in computer science from the University of Washington while working with Hanna Hajishirzi, Oren Etzioni, and Su-In Lee. He earned his MSc and BSc (1st rank) in Computer Software Engineering from Sharif University of Technology.
Research Areas
Authored Publications
Resolving Indirect Referring Expressions for Entity Selection
Silvia Pareti
Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2023)
Preview abstract
Recent advances in language modeling have enabled new conversational systems. In particular, it is often desirable for people to make choices among specified options when using such systems. We address the problem of reference resolution, when people use natural expressions to choose between real world entities. For example, given the choice "Should we make a Simnel cake or a Pandan cake?" a natural response from a non-expert may be indirect: "let's make the green one". Such natural expressions have been little studied for reference resolution. We argue that robustly understanding such language has large potential for improving naturalness in dialog, recommendation, and search systems. We create AltEntities (Alternative Entities), a new public dataset of 42K entity pairs and expressions (referring to one entity in the pair), and develop models for the disambiguation problem. Consisting of indirect referring expressions across three domains, our corpus enables for the first time the study of how language models can be adapted to this task. We find they achieve 82%-87% accuracy in realistic settings, which, while reasonable, also invites further advances.
View details
Complementary Roles of Inference and Language Models in Open-domain QA
Liang Cheng
Mark Steedman
Proceedings of the 2nd Workshop on Pattern-based Approaches to NLP in the Age of Deep Learning (2023)
Preview abstract
Answering open-domain questions through unsupervised methods poses challenges for both machine-reading (MR) and language model (LM)-based approaches. The MR-based approach suffers from sparsity issues in extracted knowledge graphs (KGs), while the performance of the LM-based approach significantly depends on the quality of the retrieved context for questions. In this paper, we compare these approaches and propose a novel methodology that leverages directional predicate entailment (inference) to address these limitations. We use entailment graphs (EGs), with natural language predicates as nodes and entailment as edges, to enhance parsed KGs by inferring unseen assertions, effectively mitigating the sparsity problem in the MR-based approach. We also show EGs improve context retrieval for the LM-based approach. Additionally, we present a Boolean QA task, demonstrating that EGs exhibit comparable directional inference capabilities to large language models (LLMs). Our results highlight the importance of inference in open-domain QA and the improvements brought by leveraging EGs.
View details
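The abstract above describes using an entailment graph (predicates as nodes, directional entailment as edges) to densify a sparse knowledge graph by inferring unseen assertions. A minimal sketch of that idea, with invented predicates and triples (not the paper's code or data):

```python
# Toy illustration: densify a sparse KG with a directional entailment graph.
# All predicate names and triples below are invented examples.

def densify(kg_triples, entailment_edges):
    """Add every triple implied by the entailment graph's directional edges.

    kg_triples: set of (subject, predicate, object) extracted from text.
    entailment_edges: dict mapping a premise predicate to the set of
        hypothesis predicates it entails.
    """
    inferred = set(kg_triples)
    frontier = set(kg_triples)
    while frontier:  # take the transitive closure over entailment edges
        nxt = set()
        for subj, pred, obj in frontier:
            for hyp in entailment_edges.get(pred, ()):
                triple = (subj, hyp, obj)
                if triple not in inferred:
                    inferred.add(triple)
                    nxt.add(triple)
        frontier = nxt
    return inferred

kg = {("ibuprofen", "relieves", "headache")}
eg = {"relieves": {"treats"}, "treats": {"affects"}}
dense = densify(kg, eg)
# the unseen assertions (ibuprofen, treats, headache) and
# (ibuprofen, affects, headache) are now available to answer questions
```

The same inferred assertions are what mitigate the sparsity problem for the MR-based approach described in the abstract.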
Sources of LLM Hallucination in Natural Language Inference
Nick McKenna
Tianyi Li
Liang Cheng
Mark Johnson
Mark Steedman
Findings of the Association for Computational Linguistics: EMNLP 2023
Preview abstract
Large Language Models (LLMs) are claimed to be capable of Natural Language Inference (NLI), necessary for applied tasks like question answering and summarization. We present a series of behavioral studies on several LLM families (LLaMA, GPT-3.5, and PaLM) which probe their behavior using controlled experiments. We establish two biases originating from pretraining which predict much of their behavior, and show that these are major sources of hallucination in generative LLMs. First, memorization at the level of sentences: we show that, regardless of the premise, models falsely label NLI test samples as entailing when the hypothesis is attested in training data, and that entities are used as "indices" to access the memorized data. Second, statistical patterns of usage learned at the level of corpora: we further show a similar effect when the premise predicate is less frequent than that of the hypothesis in the training data, a bias following from previous studies. We demonstrate that LLMs perform significantly worse on NLI test samples which do not conform to these biases than those which do, and we offer these as valuable controls for future LLM evaluation.
View details
Preview abstract
Transformer encoders contextualize token representations by attending to all other tokens at each layer, leading to quadratic increase in compute effort with the input length. In practice, however, the input text of many NLP tasks can be seen as a sequence of related segments (e.g., the sequence of sentences within a passage, or the hypothesis and premise in NLI). While attending across these segments is highly beneficial for many tasks, we hypothesize that this interaction can be delayed until later encoding stages. To this end, we introduce Layer-adjustable Interactions in Transformers (LAIT). Within LAIT, segmented inputs are first encoded independently, and then jointly. This partial two-tower architecture bridges the gap between a Dual Encoder's ability to pre-compute representations for segments and a fully self-attentive Transformer's capacity to model cross-segment attention. Also, LAIT can be introduced only when finetuning, effectively converting an existing pretrained Transformer into the hybrid of the two aforementioned architectures, and providing an intuitive control over the performance-efficiency tradeoff. Experimenting on a wide range of NLP tasks, we find LAIT to significantly improve efficiency while preserving accuracy.
View details
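The abstract above describes LAIT's core mechanism: segments are encoded independently for the first few layers and jointly afterwards. One way to picture this (a sketch assuming the hybrid is realized with per-layer attention masks; the paper's actual implementation may differ) is a block-diagonal mask in the early layers that becomes a full mask once cross-segment interaction begins:

```python
# Illustrative sketch of layer-adjustable interaction via attention masks.
# The function and parameter names are ours, not the paper's.

def lait_attention_mask(segment_lengths, layer, independent_layers):
    """Return an n x n boolean mask; True means attention is allowed."""
    n = sum(segment_lengths)
    if layer >= independent_layers:
        # later layers: full self-attention across all segments
        return [[True] * n for _ in range(n)]
    # early layers: block-diagonal, each segment attends only to itself
    mask = [[False] * n for _ in range(n)]
    start = 0
    for length in segment_lengths:
        for i in range(start, start + length):
            for j in range(start, start + length):
                mask[i][j] = True
        start += length
    return mask

# e.g. a 3-token hypothesis and a 4-token premise, as in NLI
early = lait_attention_mask([3, 4], layer=0, independent_layers=6)
late = lait_attention_mask([3, 4], layer=6, independent_layers=6)
```

Because the early, segment-independent layers never mix segments, their outputs can be precomputed and cached per segment, which is the source of the efficiency gain the abstract reports.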
Language models are poor learners of directional inference
Tianyi Li
Sabine Weber
Mark Steedman
Findings of the Association for Computational Linguistics: EMNLP 2022, pp. 903-921
Preview abstract
We examine RoBERTa LMs' competence in directional predicate entailment with prompt fine-tuning. Through analysis, we find that, contrary to previous evidence of success, they show limited capability for directional inference; moreover, existing datasets are either ignorant of directionality or infested by spurious correlations, allowing models to overfit to dataset artefacts. In response, we present BoOQA (Boolean Open QA), an extrinsic, robust, multi-lingual evaluation benchmark for directional predicate entailments, independent of existing training sets. On BoOQA, we establish baselines and verify that existing LM-prompting models are not competent directional entailment learners, while entailment graphs are cursed by sparsity. We bring the open problem of directional predicate entailment into the spotlight and advocate for research along this line.
View details
Cross-lingual Inference with A Chinese Entailment Graph
Tianyi Li
Sabine Weber
Liane Guillou
Mark Steedman
Findings of the Association for Computational Linguistics: ACL 2022, pp. 1214-1233
Preview abstract
Predicate entailment detection is a crucial task for question-answering from text, where previous work has explored unsupervised learning of entailment graphs from typed open relation triples. In this paper, we present the first pipeline for building Chinese entailment graphs, which involves a novel high-recall open relation extraction (ORE) method and the first Chinese fine-grained entity typing dataset under the FIGER type ontology. Through experiments on the Levy-Holt dataset, we verify the strength of our Chinese entailment graph, and reveal the cross-lingual complementarity: on the parallel Levy-Holt dataset, an ensemble of Chinese and English entailment graphs beats both monolinguals, and raises unsupervised SOTA by 4.7 AUC points.
View details
Open-Domain Contextual Link Prediction and its Complementarity with Entailment Graphs
Shay B. Cohen
Mark Johnson
Mark Steedman
Findings of the Association for Computational Linguistics: EMNLP 2021, pp. 2790-2802
Preview abstract
An open-domain knowledge graph (KG) has entities as nodes and natural language relations as edges, and is constructed by extracting (subject, relation, object) triples from text. The task of open-domain link prediction is to infer missing relations in the KG. Previous work has used standard link prediction for the task. Since triples are extracted from text, we can ground them in the larger textual context in which they were originally found. However, standard link prediction methods only rely on the KG structure and ignore the textual context of the triples. In this paper, we introduce the new task of open-domain contextual link prediction which has access to both the textual context and the KG structure to perform link prediction. We build a dataset for the task and propose a model for it. Our experiments show that context is crucial in predicting missing relations. We also demonstrate the utility of contextual link prediction in discovering out-of-context entailments between relations, in the form of entailment graphs (EG), in which the nodes are the relations. The reverse holds too: out-of-context EGs assist in predicting relations in context.
View details
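The abstract above argues that textual context is crucial for predicting missing relations: standard link prediction sees only the KG structure, while contextual link prediction also sees the sentence a triple came from. A toy illustration of why that helps (not the paper's model; the relation names, scores, and sentence are invented):

```python
# Toy illustration: KG structure alone leaves two candidate relations tied,
# but the textual context the triple was extracted from breaks the tie.

def score(relation, context, kg_prior):
    """Combine a structure-only prior with lexical overlap against context."""
    overlap = sum(w in context.lower().split() for w in relation.split("."))
    return kg_prior.get(relation, 0.0) + overlap

candidates = ["born.in", "died.in"]
prior = {"born.in": 0.5, "died.in": 0.5}  # structure alone: a tie
context = "the article notes that Chopin was born in Zelazowa Wola"

best = max(candidates, key=lambda r: score(r, context, prior))
```

A real contextual link predictor would encode the context with a neural model rather than count word overlap, but the division of labor is the same: the KG supplies structural evidence, and the grounding text disambiguates between structurally plausible relations.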
Multivalent Entailment Graphs for Question Answering
Nick McKenna
Liane Guillou
Sander Bijl de Vroe
Mark Johnson
Mark Steedman
Conference on Empirical Methods in Natural Language Processing (EMNLP, long papers) (2021), pp. 10758-10768
Preview abstract
Drawing inferences between open-domain natural language predicates is a necessity for true language understanding. There has been much progress in unsupervised learning of entailment graphs for this purpose. We make three contributions: (1) we reinterpret the Distributional Inclusion Hypothesis to model entailment between predicates of different valencies, like DEFEAT(Biden, Trump) |= WIN(Biden); (2) we actualize this theory by learning unsupervised Multivalent Entailment Graphs of open-domain predicates; and (3) we demonstrate the capabilities of these graphs on a novel question answering task. We show that directional entailment is more helpful for inference than non-directional similarity on questions of fine-grained semantics. We also show that drawing on evidence across valencies answers more questions than by using only the same valency evidence.
View details
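The abstract above reinterprets the Distributional Inclusion Hypothesis so that entailment can hold between predicates of different valencies, e.g. DEFEAT(Biden, Trump) |= WIN(Biden). A minimal sketch of that inclusion test, with invented data (the projection step and scoring below are our simplification, not the paper's method):

```python
# Illustrative Distributional Inclusion Hypothesis check across valencies.
# Entities and counts are invented examples.

def dih_score(premise_args, hypothesis_args):
    """Fraction of premise argument tuples also attested for the hypothesis.

    Under the DIH, a high score is evidence that the premise predicate
    entails the hypothesis predicate (the direction matters).
    """
    if not premise_args:
        return 0.0
    return len(premise_args & hypothesis_args) / len(premise_args)

# Binary DEFEAT vs unary WIN: project DEFEAT's argument pairs onto their
# first slot so the valencies match before applying inclusion.
defeat_pairs = {("Biden", "Trump"), ("Ali", "Foreman")}
win_entities = {("Biden",), ("Ali",), ("Bolt",)}
defeat_first_arg = {(subj,) for subj, _ in defeat_pairs}

forward = dih_score(defeat_first_arg, win_entities)   # DEFEAT |= WIN
backward = dih_score(win_entities, defeat_first_arg)  # WIN |= DEFEAT?
```

The asymmetry of the two scores is the point: every defeater is a winner, but not every winner defeated someone, which is exactly the directional entailment that non-directional similarity measures cannot capture.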