Visibility Aware Human-Object Interaction Tracking from Single RGB Camera

Xie, Xianghui; Bhatnagar, Bharat Lal; Pons-Moll, Gerard

arXiv:2303.16479 (cs)

[Submitted on 29 Mar 2023 (v1), last revised 31 Oct 2023 (this version, v2)]

Abstract: Capturing the interactions between humans and their environment in 3D is important for many applications in robotics, graphics, and vision. Recent works to reconstruct the 3D human and object from a single RGB image do not have consistent relative translation across frames because they assume a fixed depth. Moreover, their performance drops significantly when the object is occluded. In this work, we propose a novel method to track the 3D human, object, contacts between them, and their relative translation across frames from a single RGB camera, while being robust to heavy occlusions. Our method is built on two key insights. First, we condition our neural field reconstructions for human and object on per-frame SMPL model estimates obtained by pre-fitting SMPL to a video sequence. This improves neural reconstruction accuracy and produces coherent relative translation across frames. Second, human and object motion from visible frames provides valuable information to infer the occluded object. We propose a novel transformer-based neural network that explicitly uses object visibility and human motion to leverage neighbouring frames to make predictions for the occluded frames. Building on these insights, our method is able to track both human and object robustly even under occlusions. Experiments on two datasets show that our method significantly improves over the state-of-the-art methods. Our code and pretrained models are available at: this https URL

Comments:	accepted to CVPR 2023, edited acknowledgement
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2303.16479 [cs.CV]
	(or arXiv:2303.16479v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2303.16479

Submission history

From: Xianghui Xie
[v1] Wed, 29 Mar 2023 06:23:44 UTC (23,645 KB)
[v2] Tue, 31 Oct 2023 16:27:27 UTC (23,645 KB)
