Hierarchical Modeling for Task Recognition and Action Segmentation in Weakly-Labeled Instructional Videos

Ghoddoosian, Reza; Sayed, Saif; Athitsos, Vassilis

arXiv:2110.05697 (cs)

[Submitted on 12 Oct 2021]

Abstract: This paper focuses on task recognition and action segmentation in weakly-labeled instructional videos, where only the ordered sequence of video-level actions is available during training. We propose a two-stream framework, which exploits semantic and temporal hierarchies to recognize top-level tasks in instructional videos. Further, we present a novel top-down weakly-supervised action segmentation approach, where the predicted task is used to constrain the inference of fine-grained action sequences. Experimental results on the popular Breakfast and Cooking 2 datasets show that our two-stream hierarchical task modeling significantly outperforms existing methods in top-level task recognition for all datasets and metrics. Additionally, using our task recognition framework in the proposed top-down action segmentation approach consistently improves the state of the art, while also reducing segmentation inference time by 80-90 percent.
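
The top-down idea in the abstract, where a predicted task narrows the set of action sequences considered at inference, can be illustrated with a small toy sketch. This is not the authors' implementation. It assumes, purely for illustration, that candidate action sequences are grouped by task, and the names `transcripts_by_task`, `constrained_candidates`, `pick_transcript`, and `toy_score` are hypothetical, as are the task and action labels.

```python
from typing import Callable, Dict, List, Tuple

Transcript = Tuple[str, ...]

def constrained_candidates(
    transcripts_by_task: Dict[str, List[Transcript]],
    predicted_task: str,
) -> List[Transcript]:
    """Keep only the action sequences associated with the predicted task."""
    return list(transcripts_by_task.get(predicted_task, []))

def pick_transcript(
    candidates: List[Transcript],
    score: Callable[[Transcript], float],
) -> Transcript:
    """Return the highest-scoring candidate; each candidate is scored once."""
    return max(candidates, key=score)

# Hypothetical candidate sequences, grouped by top-level task.
transcripts_by_task: Dict[str, List[Transcript]] = {
    "coffee": [("take_cup", "pour_coffee"),
               ("take_cup", "pour_coffee", "pour_milk")],
    "tea": [("take_cup", "add_teabag", "pour_water")],
    "cereal": [("take_bowl", "pour_cereals", "pour_milk")],
}

def toy_score(transcript: Transcript) -> float:
    """Stand-in for a real alignment score (assumption): reward actions that
    match a fixed set of 'observed' actions and penalize the rest."""
    observed = {"take_cup", "pour_coffee", "pour_milk"}
    actions = set(transcript)
    return len(observed & actions) - len(actions - observed)

# Without a task prediction, every sequence must be scored.
all_candidates = [t for ts in transcripts_by_task.values() for t in ts]

# With a task prediction, only that task's sequences are scored.
task_candidates = constrained_candidates(transcripts_by_task, "coffee")
best = pick_transcript(task_candidates, toy_score)
```

In this toy setting the constrained search scores 2 candidates instead of 4. The sketch only shows why restricting candidates reduces the work per video; it does not reproduce the speedup reported in the abstract.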

Comments:	Accepted in WACV 2022
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2110.05697 [cs.CV]
	(or arXiv:2110.05697v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2110.05697

Submission history

From: Reza Ghoddoosian
[v1] Tue, 12 Oct 2021 02:32:15 UTC (2,033 KB)
