Pruning then Reweighting: Towards Data-Efficient Training of Diffusion Models

Li, Yize; Zhang, Yihua; Liu, Sijia; Lin, Xue

arXiv:2409.19128 (cs)

[Submitted on 27 Sep 2024 (v1), last revised 1 Oct 2024 (this version, v2)]

Abstract: Despite the remarkable generation capabilities of Diffusion Models (DMs), training and inference remain computationally expensive. Previous work has focused on accelerating diffusion sampling, while data-efficient diffusion training has often been overlooked. In this work, we investigate efficient diffusion training from the perspective of dataset pruning. Inspired by the principles of data-efficient training for generative models such as generative adversarial networks (GANs), we first extend the data selection scheme used in GANs to DM training: data features are encoded by a surrogate model, and a score criterion is then applied to select the coreset. To further improve generation performance, we employ a class-wise reweighting approach, which derives class weights through distributionally robust optimization (DRO) over a pre-trained reference DM. For a pixel-wise DM (DDPM) on CIFAR-10, experiments show that our method outperforms existing approaches and achieves image synthesis quality comparable to that of the original full-data model, with a speed-up of 2.34× to 8.32×. Additionally, our method generalizes to latent DMs (LDMs), e.g., Masked Diffusion Transformer (MDT) and Stable Diffusion (SD), and achieves competitive generation capability on ImageNet. Code is available here (this https URL).
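
The two stages named in the abstract (score-based coreset selection, then class-wise reweighting) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the score criterion or the DRO update, so the distance-to-class-mean score, the exponentiated-gradient weight update, and the names `select_coreset` and `dro_class_weights` are all assumptions made for illustration, and the features and losses are synthetic.

```python
import numpy as np

def select_coreset(features, labels, keep_ratio):
    """Keep the top-scoring fraction of samples in each class.

    Score (an assumption for illustration): negative Euclidean distance to
    the class mean in feature space, so more typical samples score higher.
    """
    keep = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        center = features[idx].mean(axis=0)
        scores = -np.linalg.norm(features[idx] - center, axis=1)
        k = max(1, int(round(keep_ratio * len(idx))))
        # argsort is ascending, so reverse it to take the k highest scores.
        keep.extend(idx[np.argsort(scores)[::-1][:k]])
    return np.sort(np.array(keep))

def dro_class_weights(class_losses, ref_losses, weights, step_size=1.0):
    """One exponentiated-gradient step on class weights (assumed update rule).

    Classes whose loss exceeds the reference model's loss are up-weighted;
    the result is renormalized so the weights sum to one.
    """
    excess = np.maximum(class_losses - ref_losses, 0.0)
    new_w = weights * np.exp(step_size * excess)
    return new_w / new_w.sum()

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 16))  # stand-in for surrogate-model embeddings
labels = np.repeat(np.arange(4), 50)   # 4 classes, 50 samples each
coreset = select_coreset(features, labels, keep_ratio=0.2)

weights = np.full(4, 0.25)
class_losses = np.array([0.9, 0.5, 0.7, 0.5])  # made-up per-class losses
ref_losses = np.array([0.5, 0.5, 0.5, 0.5])    # made-up reference-model losses
weights = dro_class_weights(class_losses, ref_losses, weights)
```

In this sketch the coreset keeps the same fraction of every class, and the class weights would then scale each class's contribution to the training loss on the pruned data; how the paper actually combines the two stages is described in the full text, not in the abstract.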

Comments:	Under Review
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2409.19128 [cs.CV]
	(or arXiv:2409.19128v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2409.19128

Submission history

From: Yize Li [view email]
[v1] Fri, 27 Sep 2024 20:21:19 UTC (281 KB)
[v2] Tue, 1 Oct 2024 18:40:07 UTC (281 KB)
