Contrastive Decoding: Open-ended Text Generation as Optimization

Xiang Lisa Li; Ari Holtzman; Daniel Fried; Percy Liang; Jason Eisner; Tatsunori B. Hashimoto; Luke Zettlemoyer; Mike Lewis

doi:10.18653/v1/2023.acl-long.687

Contrastive Decoding: Open-ended Text Generation as Optimization

Xiang Lisa Li, Ari Holtzman, Daniel Fried, Percy Liang, Jason Eisner, Tatsunori Hashimoto, Luke Zettlemoyer, Mike Lewis

Abstract

Given a language model (LM), maximum probability is a poor decoding objective for open-ended generation, because it produces short and repetitive text. On the other hand, sampling can often produce incoherent text that drifts from the original topics. We propose contrastive decoding (CD), a reliable decoding approach that optimizes a contrastive objective subject to a plausibility constraint. The contrastive objective returns the difference between the likelihood under a large LM (called the expert, e.g. OPT-13B) and a small LM (called the amateur, e.g. OPT-125M), and the constraint ensures that the outputs are plausible. CD is inspired by the fact that the failures of larger LMs (e.g., repetition, inco- herence) are even more prevalent in smaller LMs, and that this difference signals which texts should be preferred. CD requires zero additional training, and produces higher quality text than decoding from the larger LM alone. It also works across model scales (OPT-13B and GPT2-1.5B) and significantly outperforms four strong decoding algorithms (e.g., nucleus, top-k) in automatic and human evaluations across wikipedia, news and story domains.

Anthology ID:: 2023.acl-long.687
Volume:: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:: July
Year:: 2023
Address:: Toronto, Canada
Editors:: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:: ACL
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 12286–12312
Language:
URL:: https://aclanthology.org/2023.acl-long.687
DOI:: 10.18653/v1/2023.acl-long.687
Bibkey:
Cite (ACL):: Xiang Lisa Li, Ari Holtzman, Daniel Fried, Percy Liang, Jason Eisner, Tatsunori Hashimoto, Luke Zettlemoyer, and Mike Lewis. 2023. Contrastive Decoding: Open-ended Text Generation as Optimization. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 12286–12312, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):: Contrastive Decoding: Open-ended Text Generation as Optimization (Li et al., ACL 2023)
Copy Citation:
PDF:: https://aclanthology.org/2023.acl-long.687.pdf
Video:: https://aclanthology.org/2023.acl-long.687.mp4

PDF Cite Search Video