LDM3D-VR: Latent Diffusion Model for 3D VR

Stan, Gabriela Ben Melech; Wofk, Diana; Aflalo, Estelle; Tseng, Shao-Yen; Cai, Zhipeng; Paulitsch, Michael; Lal, Vasudev

[Submitted on 6 Nov 2023]

Abstract: Latent diffusion models have proven to be state-of-the-art in the creation and manipulation of visual outputs. However, as far as we know, the generation of depth maps jointly with RGB is still limited. We introduce LDM3D-VR, a suite of diffusion models targeting virtual reality development that includes LDM3D-pano and LDM3D-SR. These models enable the generation of panoramic RGBD based on textual prompts and the upscaling of low-resolution inputs to high-resolution RGBD, respectively. Our models are fine-tuned from existing pretrained models on datasets containing panoramic/high-resolution RGB images, depth maps and captions. Both models are evaluated in comparison to existing related methods.

Comments:	Accepted to Workshop on Diffusion Models, NeurIPS 2023
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2311.03226 [cs.CV]
	(or arXiv:2311.03226v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2311.03226

Submission history

From: Gabriela Ben Melech
[v1] Mon, 6 Nov 2023 16:12:10 UTC (6,230 KB)
