Learning the 3D Fauna of the Web

Li, Zizhang; Litvak, Dor; Li, Ruining; Zhang, Yunzhi; Jakab, Tomas; Rupprecht, Christian; Wu, Shangzhe; Vedaldi, Andrea; Wu, Jiajun

arXiv:2401.02400 (cs)

[Submitted on 4 Jan 2024 (v1), last revised 1 Apr 2024 (this version, v2)]

Abstract: Learning 3D models of all animals on the Earth requires massively scaling up existing solutions. With this ultimate goal in mind, we develop 3D-Fauna, an approach that learns a pan-category deformable 3D animal model for more than 100 animal species jointly. One crucial bottleneck of modeling animals is the limited availability of training data, which we overcome by simply learning from 2D Internet images. We show that prior category-specific attempts fail to generalize to rare species with limited training images. We address this challenge by introducing the Semantic Bank of Skinned Models (SBSM), which automatically discovers a small set of base animal shapes by combining geometric inductive priors with semantic knowledge implicitly captured by an off-the-shelf self-supervised feature extractor. To train such a model, we also contribute a new large-scale dataset of diverse animal species. At inference time, given a single image of any quadruped animal, our model reconstructs an articulated 3D mesh in a feed-forward fashion within seconds.

Comments:	The first two authors contributed equally to this work. The last three authors contributed equally. Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2401.02400 [cs.CV]
	(or arXiv:2401.02400v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2401.02400

Submission history

From: Zizhang Li
[v1] Thu, 4 Jan 2024 18:32:48 UTC (6,672 KB)
[v2] Mon, 1 Apr 2024 04:56:37 UTC (6,870 KB)
