Synthesizing Moving People with 3D Control

Li, Boyi; Rajasegaran, Jathushan; Gandelsman, Yossi; Efros, Alexei A.; Malik, Jitendra

arXiv:2401.10889 (cs)

[Submitted on 19 Jan 2024]

Abstract: In this paper, we present a diffusion-model-based framework for animating people from a single image, given a target 3D motion sequence. Our approach has two core components: a) learning priors about invisible parts of the human body and clothing, and b) rendering novel body poses with proper clothing and texture. For the first part, we learn an in-filling diffusion model to hallucinate unseen parts of a person given a single image. We train this model in texture-map space, which makes it more sample-efficient since it is invariant to pose and viewpoint. For the second part, we develop a diffusion-based rendering pipeline that is controlled by 3D human poses. This produces realistic renderings of novel poses of the person, including clothing, hair, and plausible in-filling of unseen regions. This disentangled approach allows our method to generate a sequence of images that is faithful to the target motion in 3D pose and to the input image in visual similarity. In addition, the 3D control allows a person to be rendered along various synthetic camera trajectories. Our experiments show that our method is more resilient than prior methods when generating prolonged motions and varied, challenging, and complex poses. Please check our website for more details: this https URL.
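
To make the texture-map idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical per-pixel UV map (for example, from a fitted body mesh) and a foreground mask, both of which are illustrative inputs that the abstract does not specify. Visible pixels are scattered into a small texture map, and the texels that are never hit are the "unseen" regions that an in-filling model would need to hallucinate. The function name `unproject_to_texture` and the texture size are likewise assumptions.

```python
import numpy as np

def unproject_to_texture(image, uv, fg_mask, tex_size=8):
    """Scatter visible foreground pixels into a UV texture map.

    image:   (H, W, 3) float array, the single input photo.
    uv:      (H, W, 2) float array in [0, 1], hypothetical per-pixel
             texture coordinates (u, v); an assumed input.
    fg_mask: (H, W) bool array, True where the pixel lies on the person.
    Returns the partial texture and a bool mask of texels that were seen.
    """
    texture = np.zeros((tex_size, tex_size, 3), dtype=image.dtype)
    seen = np.zeros((tex_size, tex_size), dtype=bool)
    # Convert continuous UV coordinates to integer texel indices.
    idx = np.clip((uv * tex_size).astype(int), 0, tex_size - 1)
    rows = idx[..., 1][fg_mask]  # v selects the texture row
    cols = idx[..., 0][fg_mask]  # u selects the texture column
    # If several pixels land on one texel, the last write wins.
    texture[rows, cols] = image[fg_mask]
    seen[rows, cols] = True
    return texture, seen
```

Because the partial texture and its `seen` mask are indexed by body-surface coordinates rather than image coordinates, the same texel refers to the same body location regardless of pose or camera, which is the invariance the abstract credits for sample efficiency. The complement `~seen` marks where in-filling would be needed.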

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2401.10889 [cs.CV]
	(or arXiv:2401.10889v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2401.10889

Submission history

From: Boyi Li
[v1] Fri, 19 Jan 2024 18:59:11 UTC (5,760 KB)
