Training and Inference on Any-Order Autoregressive Models the Right Way

Shih, Andy; Sadigh, Dorsa; Ermon, Stefano

arXiv:2205.13554 (cs)

[Submitted on 26 May 2022 (v1), last revised 24 Oct 2022 (this version, v2)]

Abstract: Conditional inference on arbitrary subsets of variables is a core problem in probabilistic inference with important applications such as masked language modeling and image inpainting. In recent years, the family of Any-Order Autoregressive Models (AO-ARMs), closely related to popular models such as BERT and XLNet, has shown breakthrough performance in arbitrary conditional tasks across a sweeping range of domains. Despite this success, in this paper we identify significant improvements to be made to previous formulations of AO-ARMs. First, we show that AO-ARMs suffer from redundancy in their probabilistic model, i.e., they define the same distribution in multiple different ways. We alleviate this redundancy by training on a smaller set of univariate conditionals that still maintains support for efficient arbitrary conditional inference. Second, we upweight the training loss for univariate conditionals that are evaluated more frequently during inference. Our method leads to improved performance with no compromises on tractability, giving state-of-the-art likelihoods in arbitrary conditional modeling on text (Text8), image (CIFAR10, ImageNet32), and continuous tabular data domains.
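
The redundancy mentioned in the abstract can be illustrated with a toy example. The sketch below is not taken from the paper: the joint table, the helper names (`marginal`, `conditional`, `joint_via_order`, `count_univariate_conditionals`) and the probability values are all hypothetical, chosen only to show that one joint distribution can be written as a product of univariate conditionals in more than one order, and how quickly the number of distinct univariate conditionals grows with the number of variables.

```python
from itertools import combinations
from math import factorial, isclose

# Toy joint over two binary variables (values are arbitrary, for illustration only).
joint = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}


def marginal(var, value):
    """p(x_var = value), obtained by summing the joint table."""
    return sum(p for x, p in joint.items() if x[var] == value)


def conditional(var, value, given_var, given_value):
    """p(x_var = value | x_given_var = given_value)."""
    num = sum(
        p for x, p in joint.items()
        if x[var] == value and x[given_var] == given_value
    )
    return num / marginal(given_var, given_value)


def joint_via_order(x, order):
    """Rebuild p(x) by the chain rule, visiting the two variables in `order`."""
    first, second = order
    return marginal(first, x[first]) * conditional(second, x[second], first, x[first])


def count_univariate_conditionals(n):
    """Count pairs (target i, context S) where S is any subset of the other n - 1 variables."""
    total = 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            total += sum(1 for _ in combinations(others, k))
    return total


# Both orderings recover the same joint: two parameterizations of one distribution.
for x, p in joint.items():
    assert isclose(joint_via_order(x, (0, 1)), p)
    assert isclose(joint_via_order(x, (1, 0)), p)

n = 4
print(count_univariate_conditionals(n), "univariate conditionals;", factorial(n), "orderings")
```

For n variables there are n * 2^(n-1) such univariate conditionals and n! orderings, each ordering giving a complete factorization of the same joint. This only illustrates the source of the redundancy; which subset of conditionals the authors train on, and how they weight them, is described in the paper itself.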

Comments:	36th Conference on Neural Information Processing Systems (NeurIPS 2022)
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2205.13554 [cs.LG]
	(or arXiv:2205.13554v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2205.13554

Submission history

From: Andy Shih
[v1] Thu, 26 May 2022 18:00:02 UTC (1,084 KB)
[v2] Mon, 24 Oct 2022 08:39:05 UTC (1,108 KB)
