Self-supervising Action Recognition by Statistical Moment and Subspace Descriptors

Wang, Lei; Koniusz, Piotr

doi:10.1145/3474085.3475572

arXiv:2001.04627 (cs)

[Submitted on 14 Jan 2020 (v1), last revised 5 Aug 2021 (this version, v2)]

Abstract: In this paper, we build on the concept of self-supervision by taking RGB frames as input and learning to predict both action concepts and auxiliary descriptors, e.g., object descriptors. So-called hallucination streams are trained to predict auxiliary cues, are simultaneously fed into classification layers, and are then hallucinated at the testing stage to aid the network. We design and hallucinate two descriptors, one leveraging four popular object detectors applied to training videos, and the other leveraging image- and video-level saliency detectors. The first descriptor encodes the detector- and ImageNet-wise class prediction scores, confidence scores, spatial locations of bounding boxes, and frame indexes to capture the spatio-temporal distribution of features per video. The second descriptor encodes spatio-angular gradient distributions of saliency maps and intensity patterns. Inspired by the characteristic function of the probability distribution, we capture four statistical moments on the above intermediate descriptors. As the numbers of coefficients in the mean, covariance, coskewness and cokurtosis grow linearly, quadratically, cubically and quartically, respectively, w.r.t. the dimension of feature vectors, we describe the covariance matrix by its leading n' eigenvectors (the so-called subspace) and we capture skewness/kurtosis rather than the costly coskewness/cokurtosis. We obtain state-of-the-art results on five popular datasets, including Charades and EPIC-Kitchens.
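
The moment-based encoding described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `moment_subspace_descriptor`, the parameters `n_sub` (standing in for n') and `eps`, and the choice to return the four parts separately are all assumptions made for illustration. It computes the mean, the leading eigenvectors of the covariance matrix (the subspace), and per-dimension skewness and kurtosis for one set of intermediate feature vectors.

```python
import numpy as np

def moment_subspace_descriptor(X, n_sub=3, eps=1e-8):
    """Hypothetical helper: summarize N feature vectors of dimension d.

    X     : array of shape (N, d), one intermediate descriptor per row.
    n_sub : number of leading covariance eigenvectors to keep (n' in the abstract).
    eps   : small constant guarding against division by a zero standard deviation.
    """
    X = np.asarray(X, dtype=np.float64)
    n, d = X.shape

    # First moment: d coefficients.
    mu = X.mean(axis=0)
    Xc = X - mu

    # Second moment: the full covariance has d*d coefficients, so keep only
    # the n_sub eigenvectors with the largest eigenvalues (d*n_sub coefficients).
    cov = Xc.T @ Xc / max(n - 1, 1)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_sub]     # indices of the largest n_sub
    subspace = eigvecs[:, order]                  # shape (d, n_sub)

    # Third and fourth moments per dimension (2*d coefficients in total),
    # instead of the d**3 coskewness and d**4 cokurtosis tensors.
    z = Xc / (Xc.std(axis=0) + eps)
    skewness = (z ** 3).mean(axis=0)
    kurtosis = (z ** 4).mean(axis=0)

    return mu, subspace, skewness, kurtosis


rng = np.random.default_rng(0)
features = rng.normal(size=(200, 6))              # toy data: 200 vectors, d = 6
mu, subspace, skewness, kurtosis = moment_subspace_descriptor(features, n_sub=3)
compact_size = mu.size + subspace.size + skewness.size + kurtosis.size
full_size = 6 + 6 ** 2 + 6 ** 3 + 6 ** 4          # mean + cov + coskew + cokurt
```

For d = 6 and n' = 3, the compact form stores d + d·n' + 2d = 36 coefficients, versus d + d² + d³ + d⁴ = 1554 for the full set of moments. Note that eigenvectors are defined only up to sign, so any downstream use of the subspace would need to account for that ambiguity.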

Comments:	ACM MM'21
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2001.04627 [cs.CV]
	(or arXiv:2001.04627v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2001.04627
Related DOI:	https://doi.org/10.1145/3474085.3475572

Submission history

From: Piotr Koniusz
[v1] Tue, 14 Jan 2020 05:03:54 UTC (2,974 KB)
[v2] Thu, 5 Aug 2021 15:25:12 UTC (595 KB)
