A Compacted Structure for Cross-domain learning on Monocular Depth and Flow Estimation

Chen, Yu; Cao, Xu; Lin, Xiaoyi; Huang, Baoru; Zhou, Xiao-Yun; Zheng, Jian-Qing; Yang, Guang-Zhong

arXiv:2208.11993 (cs)

[Submitted on 25 Aug 2022]

Abstract: Accurate motion and depth recovery is important for many robot vision tasks, including autonomous driving. Most previous studies have achieved cooperative multi-task interaction via either pre-defined loss functions or cross-domain prediction. This paper presents a multi-task scheme that achieves mutual assistance by means of our Flow to Depth (F2D), Depth to Flow (D2F), and Exponential Moving Average (EMA) mechanisms. The F2D and D2F mechanisms enable multi-scale information integration between the optical flow and depth domains based on differentiable shallow nets. A dual-head mechanism predicts optical flow for rigid and non-rigid motion in a divide-and-conquer manner, which significantly improves optical flow estimation performance. Furthermore, to make the predictions more robust and stable, EMA is used in our multi-task training. Experimental results on the KITTI datasets show that our multi-task scheme outperforms other multi-task schemes and provides marked improvements in the prediction results.
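The abstract does not specify how EMA is applied during training. As a minimal sketch, assuming it means the common practice of keeping an exponentially averaged copy of the model parameters, the update rule can be illustrated as follows. The function name `ema_update`, the `decay` value, and the toy parameter dictionary are hypothetical and are not taken from the paper.

```python
def ema_update(shadow, params, decay=0.99):
    """Return updated EMA parameters.

    Assumption (not from the paper): EMA is applied to model weights as
        shadow <- decay * shadow + (1 - decay) * params
    """
    return {
        name: decay * shadow[name] + (1.0 - decay) * value
        for name, value in params.items()
    }


# Toy usage: a single scalar "weight" that the optimizer holds at 1.0.
shadow = {"w": 0.0}
for _ in range(3):
    params = {"w": 1.0}  # stand-in for the weights after a training step
    shadow = ema_update(shadow, params, decay=0.5)
```

With a decay close to 1, the averaged copy changes slowly, which tends to smooth out step-to-step noise in the weights. Whether the authors average weights or predictions is not stated in the abstract.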

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2208.11993 [cs.CV]
	(or arXiv:2208.11993v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2208.11993

Submission history

From: Xu Cao
[v1] Thu, 25 Aug 2022 10:46:29 UTC (11,919 KB)
