Exploring Low-Dimensional Subspaces in Diffusion Models for Controllable Image Editing

Chen, Siyi; Zhang, Huijie; Guo, Minzhe; Lu, Yifu; Wang, Peng; Qu, Qing

arXiv:2409.02374 (cs)

[Submitted on 4 Sep 2024 (v1), last revised 10 Sep 2024 (this version, v2)]

Abstract: Recently, diffusion models have emerged as a powerful class of generative models. Despite their success, there is still limited understanding of their semantic spaces. This makes it challenging to achieve precise and disentangled image generation without additional training, especially in an unsupervised way. In this work, we improve the understanding of their semantic spaces through two intriguing observations: within a certain range of noise levels, (1) the learned posterior mean predictor (PMP) in the diffusion model is locally linear, and (2) the singular vectors of its Jacobian lie in low-dimensional semantic subspaces. We provide a solid theoretical basis to justify the linearity and low-rankness of the PMP. These insights allow us to propose an unsupervised, single-step, training-free LOw-rank COntrollable image editing (LOCO Edit) method for precise local editing in diffusion models. LOCO Edit identifies editing directions with desirable properties: homogeneity, transferability, composability, and linearity. These properties benefit greatly from the low-dimensional semantic subspace. Our method can further be extended to unsupervised or text-supervised editing in various text-to-image diffusion models (T-LOCO Edit). Finally, extensive experiments demonstrate the effectiveness and efficiency of LOCO Edit. The code will be released at this https URL.
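
The two observations in the abstract (a low-rank Jacobian and local linearity along its singular vectors) can be illustrated on a toy problem. The sketch below is not the authors' implementation and involves no diffusion model: the map `f(x) = B tanh(A x)`, the dimensions `d` and `r`, and the step size `lam` are all hypothetical stand-ins chosen so that the Jacobian has rank at most `r` by construction. It only shows what "take the top right singular vector of the Jacobian and move along it" means numerically.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 32, 3  # hypothetical ambient dimension and intrinsic rank

# Toy stand-in for a posterior mean predictor (assumption, not a real model):
# f(x) = B tanh(A x), whose Jacobian has rank at most r by construction.
A = 0.1 * rng.standard_normal((r, d)) / np.sqrt(d)
B = rng.standard_normal((d, r))

def f(x):
    return B @ np.tanh(A @ x)

def jacobian(x):
    # Closed-form Jacobian of the toy map: B diag(1 - tanh^2(A x)) A.
    # For a real network one would obtain this with automatic differentiation.
    return B @ np.diag(1.0 - np.tanh(A @ x) ** 2) @ A

x = rng.standard_normal(d)
J = jacobian(x)
U, S, Vt = np.linalg.svd(J)

# Low-rankness: count singular values that are not numerically zero.
numerical_rank = int(np.sum(S > 1e-8 * S[0]))

# Local linearity: step along the top right singular vector v and compare
# the true change in f with the first-order prediction lam * J v.
v = Vt[0]
lam = 0.1
exact_change = f(x + lam * v) - f(x)
linear_change = lam * (J @ v)  # equals lam * S[0] * U[:, 0]
rel_error = np.linalg.norm(exact_change - linear_change) / np.linalg.norm(linear_change)

print("numerical rank of J:", numerical_rank)
print("relative linearization error:", rel_error)
```

In this toy setting the first-order prediction `lam * S[0] * U[:, 0]` plays the role of an edit in output space, while `v` is the direction applied in input space; whether a real PMP behaves this way at a given noise level is the empirical and theoretical claim of the paper, not something this sketch verifies.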

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2409.02374 [cs.CV]
	(or arXiv:2409.02374v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2409.02374

Submission history

From: Siyi Chen
[v1] Wed, 4 Sep 2024 01:47:01 UTC (24,973 KB)
[v2] Tue, 10 Sep 2024 20:36:25 UTC (24,977 KB)
