I Like to Move It: 6D Pose Estimation as an Action Decision Process

Busam, Benjamin; Jung, Hyun Jun; Navab, Nassir

Computer Science > Computer Vision and Pattern Recognition

arXiv:2009.12678 (cs)

[Submitted on 26 Sep 2020 (v1), last revised 30 Nov 2020 (this version, v2)]

Abstract: Object pose estimation is an integral part of robot vision and AR. Previous 6D pose retrieval pipelines treat the problem either as a regression task or discretize the pose space to classify. We change this paradigm and reformulate the problem as an action decision process in which an initial pose is updated in incremental discrete steps that sequentially move a virtual 3D rendering towards the correct solution. A neural network iteratively estimates likely moves from a single RGB image and thereby determines an acceptable final pose. In contrast to other approaches that train object-specific pose models, we learn a decision process. This allows for a lightweight architecture that naturally generalizes to unseen objects. A coherent stop action for process termination enables dynamic reduction of the computation cost when changes in a video sequence are insignificant. Instead of a static inference time, the runtime thereby adapts automatically to the object motion. Robustness and accuracy of our action decision network are evaluated on Laval and YCB video scenes, where we significantly improve the state of the art.
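The iterative scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the six-value pose parameterization, the step sizes, and all function names are assumptions made for illustration. In particular, `make_oracle_policy` is a hypothetical stand-in that reads the target pose directly, whereas the paper's neural network would choose moves from an RGB image and a rendering.

```python
from typing import Callable, Tuple

# Simplified pose (tx, ty, tz, rx, ry, rz); an assumption for this sketch.
Pose = Tuple[float, ...]

STOP = -1
# Hypothetical step sizes: one unit per translation axis, two per rotation axis.
STEP = (1.0, 1.0, 1.0, 2.0, 2.0, 2.0)


def apply_action(pose: Pose, action: int) -> Pose:
    """Apply one discrete move: action 2*i is +STEP[i], action 2*i+1 is -STEP[i]."""
    axis, sign = divmod(action, 2)
    moved = list(pose)
    moved[axis] += STEP[axis] if sign == 0 else -STEP[axis]
    return tuple(moved)


def refine_pose(initial: Pose,
                choose_action: Callable[[Pose], int],
                max_steps: int = 100) -> Tuple[Pose, int]:
    """Update the pose step by step until the policy chooses STOP."""
    pose = initial
    for step in range(max_steps):
        action = choose_action(pose)
        if action == STOP:
            return pose, step  # number of moves actually taken
        pose = apply_action(pose, action)
    return pose, max_steps


def make_oracle_policy(target: Pose) -> Callable[[Pose], int]:
    """Stand-in for the learned network: it knows the target (assumption)."""
    def choose(pose: Pose) -> int:
        residuals = [t - p for t, p in zip(target, pose)]
        # Pick the axis with the largest residual measured in step units.
        axis = max(range(6), key=lambda i: abs(residuals[i]) / STEP[i])
        if abs(residuals[axis]) <= STEP[axis] / 2:
            return STOP  # every axis is within half a step of the target
        return 2 * axis + (0 if residuals[axis] > 0 else 1)
    return choose


target = (3.0, 0.0, -2.0, 4.0, 0.0, 0.0)
policy = make_oracle_policy(target)
final_pose, moves = refine_pose((0.0,) * 6, policy)
still_pose, still_moves = refine_pose(target, policy)
```

In this toy setting, a start far from the target takes several moves, while a start that already matches the target stops immediately. This mirrors the idea that the stop action lets computation cost depend on how much the object has moved between frames.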

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2009.12678 [cs.CV]
	(or arXiv:2009.12678v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2009.12678

Submission history

From: Benjamin Busam
[v1] Sat, 26 Sep 2020 20:05:42 UTC (43,168 KB)
[v2] Mon, 30 Nov 2020 19:03:28 UTC (35,921 KB)
