CAT-NeRF: Constancy-Aware Tx$^2$Former for Dynamic Body Modeling

Zhu, Haidong; Zheng, Zhaoheng; Zheng, Wanrong; Nevatia, Ram

Abstract:This paper addresses the problem of human rendering in the video with temporal appearance constancy. Reconstructing dynamic body shapes with volumetric neural rendering methods, such as NeRF, requires finding the correspondence of the points in the canonical and observation space, which demands understanding human body shape and motion. Some methods use rigid transformation, such as SE(3), which cannot precisely model each frame's unique motion and muscle movements. Others generate the transformation for each frame with a trainable network, such as neural blend weight field or translation vector field, which does not consider the appearance constancy of general body shape. In this paper, we propose CAT-NeRF for self-awareness of appearance constancy with Tx$^2$Former, a novel way to combine two Transformer layers, to separate appearance constancy and uniqueness. Appearance constancy models the general shape across the video, and uniqueness models the unique patterns for each frame. We further introduce a novel Covariance Loss to limit the correlation between each pair of appearance uniquenesses to ensure the frame-unique pattern is maximally captured in appearance uniqueness. We assess our method on H36M and ZJU-MoCap and show state-of-the-art performance.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2304.07915 [cs.CV]
	(or arXiv:2304.07915v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2304.07915

Computer Science > Computer Vision and Pattern Recognition

Title:CAT-NeRF: Constancy-Aware Tx$^2$Former for Dynamic Body Modeling

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators