SelfVIO: Self-Supervised Deep Monocular Visual-Inertial Odometry and Depth Estimation

Almalioglu, Yasin; Turan, Mehmet; Sari, Alp Eren; Saputra, Muhamad Risqi U.; de Gusmão, Pedro P. B.; Markham, Andrew; Trigoni, Niki

arXiv:1911.09968 (cs)

[Submitted on 22 Nov 2019 (v1), last revised 23 Jul 2020 (this version, v2)]

Abstract: In the last decade, numerous supervised deep learning approaches requiring large amounts of labeled data have been proposed for visual-inertial odometry (VIO) and depth map estimation. To overcome the data limitation, self-supervised learning has emerged as a promising alternative, exploiting constraints such as geometric and photometric consistency in the scene. In this study, we introduce a novel self-supervised deep learning-based VIO and depth map recovery approach (SelfVIO) using adversarial training and self-adaptive visual-inertial sensor fusion. SelfVIO learns to jointly estimate 6 degrees-of-freedom (6-DoF) ego-motion and a depth map of the scene from unlabeled monocular RGB image sequences and inertial measurement unit (IMU) readings. The proposed approach is able to perform VIO without the need for IMU intrinsic parameters and/or the extrinsic calibration between the IMU and the camera. We provide comprehensive quantitative and qualitative evaluations of the proposed framework, comparing its performance with state-of-the-art VIO, VO, and visual simultaneous localization and mapping (VSLAM) approaches on the KITTI, EuRoC and Cityscapes datasets. Detailed comparisons show that SelfVIO outperforms state-of-the-art VIO approaches in terms of pose estimation and depth recovery, making it a promising approach among existing methods in the literature.
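
The photometric consistency constraint mentioned in the abstract is commonly realized as a view-synthesis error: a pixel in one frame is back-projected with its predicted depth, moved by the predicted relative pose, and re-projected into a neighboring frame, where the intensities are compared. The following is a minimal NumPy sketch of that general idea only. It is not SelfVIO's actual loss, which the abstract does not specify; the function names `reproject` and `photometric_l1`, the pinhole intrinsics `K`, and the nearest-neighbour sampling are all assumptions made for illustration.

```python
import numpy as np

def reproject(pixel, depth, K, R, t):
    """Map a target-view pixel (u, v) with known depth into the source view.

    K is a 3x3 pinhole intrinsic matrix; (R, t) is the assumed rigid
    transform from the target camera frame to the source camera frame.
    """
    u, v = pixel
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # back-project to a ray with z = 1
    point_target = depth * ray                      # 3-D point in the target frame
    point_source = R @ point_target + t             # same point in the source frame
    proj = K @ point_source                         # homogeneous pixel coordinates
    return proj[:2] / proj[2]

def photometric_l1(target, source, depth, K, R, t):
    """Mean absolute intensity difference between target pixels and the
    source pixels they reproject to (nearest-neighbour sampling; pixels
    that land outside the source image are ignored)."""
    h, w = target.shape
    errors = []
    for v in range(h):
        for u in range(w):
            us, vs = np.rint(reproject((u, v), depth[v, u], K, R, t)).astype(int)
            if 0 <= us < w and 0 <= vs < h:
                errors.append(abs(float(target[v, u]) - float(source[vs, us])))
    return float(np.mean(errors)) if errors else 0.0
```

In a self-supervised setup of this kind, the depth and pose fed into such a function would come from the networks being trained, so that minimizing the error needs no ground-truth labels. Practical systems typically use differentiable bilinear sampling rather than the rounding used here.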

Comments:	15 pages, submitted to The IEEE Transactions on Robotics (T-RO) journal, under review
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
Cite as:	arXiv:1911.09968 [cs.CV]
	(or arXiv:1911.09968v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1911.09968

Submission history

From: Yasin Almalioglu
[v1] Fri, 22 Nov 2019 10:51:09 UTC (1,984 KB)
[v2] Thu, 23 Jul 2020 13:37:41 UTC (5,030 KB)
