Mutual Information Alleviates Hallucinations in Abstractive Summarization

van der Poel, Liam; Cotterell, Ryan; Meister, Clara

arXiv:2210.13210 (cs)

[Submitted on 24 Oct 2022 (v1), last revised 29 Oct 2022 (this version, v2)]

Abstract: Despite significant progress in the quality of language generated from abstractive summarization models, these models still exhibit the tendency to hallucinate, i.e., output content not supported by the source document. A number of works have tried to fix the problem, or at least uncover its source, with limited success. In this paper, we identify a simple criterion under which models are significantly more likely to assign higher probability to hallucinated content during generation: high model uncertainty. This finding offers a potential explanation for hallucinations: when uncertain about a continuation, models default to favoring text with high marginal probability, i.e., text that occurs with high frequency in the training set. It also motivates possible routes for real-time intervention during decoding to prevent such hallucinations. We propose a decoding strategy that switches to optimizing for the pointwise mutual information of the source and target token, rather than purely the probability of the target token, when the model exhibits uncertainty. Experiments on the XSum dataset show that our method decreases the probability of hallucinated tokens while maintaining the ROUGE and BERTScore values of top-performing decoding strategies.
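The decoding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that uncertainty is measured as the entropy of the next-token distribution, that the marginal probability comes from a separate source-free language model, and that decoding is greedy. The function names and the `threshold` and `weight` parameters are hypothetical and chosen only for illustration.

```python
import math


def entropy(log_probs):
    """Shannon entropy (in nats) of a distribution given as log-probabilities."""
    return -sum(math.exp(lp) * lp for lp in log_probs if lp > -math.inf)


def pick_next_token(cond_log_probs, marg_log_probs, threshold=2.0, weight=1.0):
    """Greedy next-token choice with an uncertainty-gated PMI score.

    cond_log_probs: log p(y | source, prefix) for every vocabulary item.
    marg_log_probs: log p(y | prefix) from a source-free language model
                    (an assumption of this sketch).
    threshold, weight: hypothetical hyperparameters, not values from the paper.
    """
    if entropy(cond_log_probs) >= threshold:
        # Uncertain step: score each token by its pointwise mutual
        # information with the source, log p(y | x, prefix) - log p(y | prefix).
        scores = [c - weight * m for c, m in zip(cond_log_probs, marg_log_probs)]
    else:
        # Confident step: keep the ordinary likelihood objective.
        scores = list(cond_log_probs)
    return max(range(len(scores)), key=scores.__getitem__)


# Toy vocabulary of four tokens. Token 0 is frequent regardless of the source
# (high marginal probability); token 1 is better supported by the source.
cond = [math.log(p) for p in (0.4, 0.3, 0.2, 0.1)]
marg = [math.log(p) for p in (0.7, 0.1, 0.1, 0.1)]

print(pick_next_token(cond, marg, threshold=1.0))  # uncertain: PMI scoring
print(pick_next_token(cond, marg, threshold=2.0))  # confident: likelihood
```

In the toy example the conditional distribution has an entropy of about 1.28 nats. With a threshold of 1.0 the step counts as uncertain, so the PMI score penalizes the generically frequent token 0 and selects token 1. With a threshold of 2.0 the step counts as confident and the most probable token under the summarization model is kept.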

Comments:	EMNLP 2022
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2210.13210 [cs.CL]
	(or arXiv:2210.13210v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2210.13210

Submission history

From: Clara Meister
[v1] Mon, 24 Oct 2022 13:30:54 UTC (177 KB)
[v2] Sat, 29 Oct 2022 18:42:46 UTC (177 KB)
