GEARS: Local Geometry-aware Hand-object Interaction Synthesis

Zhou, Keyang; Bhatnagar, Bharat Lal; Lenssen, Jan Eric; Pons-Moll, Gerard

arXiv:2404.01758 (cs)

[Submitted on 2 Apr 2024 (v1), last revised 11 May 2024 (this version, v3)]

Abstract: Generating realistic hand motion sequences in interaction with objects has gained increasing attention with the growing interest in digital humans. Prior work has illustrated the effectiveness of employing occupancy-based or distance-based virtual sensors to extract hand-object interaction features. Nonetheless, these methods show limited generalizability across object categories, shapes and sizes. We hypothesize that this is due to two reasons: 1) the limited expressiveness of employed virtual sensors, and 2) scarcity of available training data. To tackle this challenge, we introduce a novel joint-centered sensor designed to reason about local object geometry near potential interaction regions. The sensor queries for object surface points in the neighbourhood of each hand joint. As an important step towards mitigating the learning complexity, we transform the points from global frame to hand template frame and use a shared module to process sensor features of each individual joint. This is followed by a spatio-temporal transformer network aimed at capturing correlation among the joints in different dimensions. Moreover, we devise simple heuristic rules to augment the limited training sequences with vast static hand grasping samples. This leads to a broader spectrum of grasping types observed during training, in turn enhancing our model's generalization capability. We evaluate on two public datasets, GRAB and InterCap, where our method shows superiority over baselines both quantitatively and perceptually.
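
To make the sensor idea in the abstract concrete, here is a minimal sketch of a joint-centered query followed by a change of frame. It is not the authors' implementation: the function names, the `radius` and `k` parameters, and the use of a single rotation matrix to stand in for the hand template frame are all assumptions made for illustration.

```python
import math

def to_local_frame(point, rotation, origin):
    """Map a global-frame point into a local frame as R^T (p - origin).

    `rotation` is a 3x3 nested list whose columns are the local axes
    expressed in the global frame (an assumed representation).
    """
    d = [point[i] - origin[i] for i in range(3)]
    # R^T d: dot product of d with each column of R.
    return [sum(rotation[r][c] * d[r] for r in range(3)) for c in range(3)]

def joint_sensor(joint, rotation, object_points, radius, k):
    """Return up to k object surface points within `radius` of `joint`,
    nearest first, expressed in the joint's local frame."""
    near = [(math.dist(p, joint), p) for p in object_points]
    near = [dp for dp in near if dp[0] <= radius]
    near.sort(key=lambda dp: dp[0])
    return [to_local_frame(p, rotation, joint) for _, p in near[:k]]

# Hypothetical toy input: one joint, a frame rotated 90 degrees about z,
# and three object surface points (the last one is out of range).
R = [[0.0, -1.0, 0.0],
     [1.0,  0.0, 0.0],
     [0.0,  0.0, 1.0]]
pts = [[1.0, 0.5, 0.0], [1.0, 0.0, 0.2], [5.0, 5.0, 5.0]]
feats = joint_sensor([1.0, 0.0, 0.0], R, pts, radius=1.0, k=2)
```

In this sketch the same function is applied to every joint, which loosely mirrors the shared per-joint processing the abstract mentions; the learned module and the spatio-temporal transformer are out of scope here.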

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2404.01758 [cs.CV]
	(or arXiv:2404.01758v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2404.01758

Submission history

From: Keyang Zhou
[v1] Tue, 2 Apr 2024 09:18:52 UTC (1,452 KB)
[v2] Thu, 4 Apr 2024 08:03:04 UTC (1,449 KB)
[v3] Sat, 11 May 2024 19:55:40 UTC (1,449 KB)
