Plug-and-Play Diffusion Distillation

Hsiao, Yi-Ting; Khodadadeh, Siavash; Duarte, Kevin; Lin, Wei-An; Qu, Hui; Kwon, Mingi; Kalarot, Ratheesh

arXiv:2406.01954 (cs)

[Submitted on 4 Jun 2024 (v1), last revised 14 Jun 2024 (this version, v2)]

Abstract: Diffusion models have shown tremendous results in image generation. However, due to the iterative nature of the diffusion process and its reliance on classifier-free guidance, inference is slow. In this paper, we propose a new distillation approach for guided diffusion models in which an external lightweight guide model is trained while the original text-to-image model remains frozen. We show that our method reduces the inference computation of classifier-free guided latent-space diffusion models by almost half, and that the trainable parameters amount to only 1% of the base model's. Furthermore, once trained, our guide model can be applied to various fine-tuned, domain-specific versions of the base diffusion model without additional training: this "plug-and-play" functionality drastically reduces inference computation while maintaining the visual fidelity of generated images. Empirically, we show that our approach produces visually appealing results and achieves an FID score comparable to the teacher's with as few as 8 to 16 steps.
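The "almost half" figure follows from how classifier-free guidance (CFG) is normally evaluated: each sampling step runs the denoiser twice, once without and once with the text condition, and combines the two predictions. The toy sketch below only counts forward passes to make that arithmetic concrete. It is not the paper's implementation: `CountingDenoiser`, `cfg_step`, `guided_step`, and the update rule are hypothetical stand-ins, the abstract does not say how the guide model interacts with the frozen base model, and the guide model's own (small) cost is ignored here.

```python
# Toy illustration: count denoiser forward passes with and without
# classifier-free guidance (CFG). All names are hypothetical; this is
# not the paper's implementation.


class CountingDenoiser:
    """Stand-in for a frozen diffusion denoiser that counts forward passes."""

    def __init__(self):
        self.calls = 0

    def __call__(self, x, t, cond):
        self.calls += 1
        # Arbitrary toy prediction; cond=None plays the role of the empty prompt.
        shift = 0.0 if cond is None else 1.0
        return [0.1 * xi + shift for xi in x]


def cfg_step(model, x, t, cond, scale):
    """Standard CFG: two passes, combined as eps_u + scale * (eps_c - eps_u)."""
    eps_u = model(x, t, None)
    eps_c = model(x, t, cond)
    return [u + scale * (c - u) for u, c in zip(eps_u, eps_c)]


def guided_step(model, x, t, cond):
    """Assumed distilled step: one pass of the base model per step.
    The lightweight guide model's cost is deliberately left out."""
    return model(x, t, cond)


def count_passes(step_fn, num_steps, **kwargs):
    """Run a toy sampling loop and return the number of denoiser calls."""
    model = CountingDenoiser()
    x = [0.0, 0.0]
    for t in range(num_steps, 0, -1):
        eps = step_fn(model, x, t, "a prompt", **kwargs)
        x = [xi - 0.1 * ei for xi, ei in zip(x, eps)]  # toy update, not a real sampler
    return model.calls


cfg_passes = count_passes(cfg_step, 16, scale=7.5)
guided_passes = count_passes(guided_step, 16)
print(cfg_passes, guided_passes)  # prints "32 16"
```

Under these assumptions, 16 CFG steps cost 32 base-model passes, while the single-pass variant costs 16 plus whatever the guide model adds, which is why the saving is "almost" rather than exactly half.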

Comments:	IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024. Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2406.01954 [cs.CV]
	(or arXiv:2406.01954v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2406.01954

Submission history

From: Mingi Kwon
[v1] Tue, 4 Jun 2024 04:22:47 UTC (45,518 KB)
[v2] Fri, 14 Jun 2024 15:53:07 UTC (45,519 KB)
