Authors:
Takashi Shimizu; Fumihiko Sakaue and Jun Sato
Affiliation:
Nagoya Institute of Technology, Japan
Keyword(s):
Human Poses, Camera Motions, CNN, RNN, LSTM, Deep Learning.
Abstract:
In this paper, we propose a novel method for recovering 3D human poses and camera motions from sequential
images by using a CNN and an LSTM. Human pose estimation with deep learning has been studied extensively
in recent years. However, existing methods mainly aim to classify 2D human motions in images. Although some
methods have recently been proposed for recovering 3D human poses, they consider only single-frame poses
and do not use the sequential properties of human actions efficiently. Furthermore, the existing methods
recover only 3D poses relative to the viewpoint. To address these limitations, our method recovers 3D
human poses and 3D camera motions simultaneously from sequential input images. In our network, a CNN is
combined with an LSTM so that the network can learn the sequential properties of 3D human poses and
camera motions efficiently. The efficiency of the proposed method is evaluated using both real and
synthetic images.