S2R-DepthNet: Learning a Generalizable Depth-specific Structural Representation

Chen, Xiaotian; Wang, Yuwang; Chen, Xuejin; Zeng, Wenjun

arXiv:2104.00877 (cs)

[Submitted on 2 Apr 2021 (v1), last revised 15 Jun 2021 (this version, v2)]

Abstract: Humans can infer the 3D geometry of a scene from a sketch instead of a realistic image, which indicates that spatial structure plays a fundamental role in understanding the depth of scenes. We are the first to explore the learning of a depth-specific structural representation, which captures the essential features for depth estimation and ignores irrelevant style information. Our S2R-DepthNet (Synthetic to Real DepthNet) generalizes well to unseen real-world data directly, even though it is trained only on synthetic data. S2R-DepthNet consists of: a) a Structure Extraction (STE) module, which extracts a domain-invariant structural representation from an image by disentangling the image into domain-invariant structure and domain-specific style components, b) a Depth-specific Attention (DSA) module, which learns task-specific knowledge to suppress depth-irrelevant structures for better depth estimation and generalization, and c) a depth prediction (DP) module, which predicts depth from the depth-specific representation. Without access to any real-world images, our method even outperforms state-of-the-art unsupervised domain adaptation methods that use real-world images of the target domain for training. In addition, when using a small amount of labeled real-world data, we achieve state-of-the-art performance under the semi-supervised setting. The code and trained models are available at this https URL.
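The three-module layout described in the abstract can be illustrated with a minimal data-flow sketch. This is not the authors' implementation: the class name `S2RPipeline`, the `Grid` type, and the three `toy_*` functions are hypothetical stand-ins for the learned networks, and gating the structure map by element-wise multiplication with the attention output is an assumption made for illustration. The sketch only shows how an image might pass through STE, then DSA, then DP.

```python
from dataclasses import dataclass
from typing import Callable, List

Grid = List[List[float]]  # toy stand-in for an H x W map (hypothetical type)


def elementwise_mul(a: Grid, b: Grid) -> Grid:
    """Multiply two equally sized grids entry by entry."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


@dataclass
class S2RPipeline:
    """Hypothetical wiring of the three modules named in the abstract."""
    ste: Callable[[Grid], Grid]  # image -> structure map
    dsa: Callable[[Grid], Grid]  # structure map -> attention weights in [0, 1]
    dp: Callable[[Grid], Grid]   # depth-specific representation -> depth map

    def __call__(self, image: Grid) -> Grid:
        structure = self.ste(image)
        attention = self.dsa(structure)
        # Assumption: attention gates the structure map multiplicatively,
        # suppressing entries the attention module marks as irrelevant.
        depth_specific = elementwise_mul(structure, attention)
        return self.dp(depth_specific)


# Toy stand-ins below are NOT the learned networks; they only make the
# data flow executable.
def toy_ste(image: Grid) -> Grid:
    """Absolute horizontal difference as a crude 'structure' signal."""
    return [[abs(row[i] - row[i - 1]) if i > 0 else 0.0
             for i in range(len(row))] for row in image]


def toy_dsa(structure: Grid) -> Grid:
    """Keep strong responses (>= 0.5) and zero out the rest."""
    return [[1.0 if v >= 0.5 else 0.0 for v in row] for row in structure]


def toy_dp(rep: Grid) -> Grid:
    """Placeholder depth head: scale the representation by a constant."""
    return [[10.0 * v for v in row] for row in rep]


pipeline = S2RPipeline(ste=toy_ste, dsa=toy_dsa, dp=toy_dp)
image = [[0.0, 0.1, 0.9], [0.2, 0.2, 1.0]]
depth = pipeline(image)
```

Passing the modules in as callables keeps the sketch independent of any deep learning framework; in practice each callable would be a trained network, with only the composition order taken from the abstract.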

Comments:	Accepted by CVPR 2021 (oral)
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2104.00877 [cs.CV]
	(or arXiv:2104.00877v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2104.00877

Submission history

From: Xiaotian Chen
[v1] Fri, 2 Apr 2021 03:55:41 UTC (9,530 KB)
[v2] Tue, 15 Jun 2021 07:24:40 UTC (11,935 KB)
