Ego-exo4d: Understanding skilled human activity from first-and third-person perspectives
We present Ego-Exo4D, a diverse, large-scale, multimodal, multiview video dataset
and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric …
Learning to predict activity progress by self-supervised video alignment
G Donahue, E Elhamifar - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
In this paper, we tackle the problem of self-supervised video alignment and activity progress
prediction using in-the-wild videos. Our proposed self-supervised representation learning …
Put myself in your shoes: Lifting the egocentric perspective from exocentric videos
We investigate exocentric-to-egocentric cross-view translation, which aims to generate a first-
person (egocentric) view of an actor based on a video recording that captures the actor from …
An outlook into the future of egocentric vision
What will the future be? We wonder! In this survey, we explore the gap between current
research in egocentric vision and the ever-anticipated future, where wearable computing …
Retrieval-augmented egocentric video captioning
Understanding human actions from first-person-view videos poses significant challenges.
Most prior approaches explore representation learning on egocentric videos only while …
FinePseudo: improving pseudo-labelling through temporal-alignability for semi-supervised fine-grained action recognition
Real-life applications of action recognition often require a fine-grained understanding of
subtle movements, e.g., in sports analytics, user interactions in AR/VR, and surgical videos …
Synchronization is all you need: Exocentric-to-egocentric transfer for temporal action segmentation with unlabeled synchronized video pairs
We consider the problem of transferring a temporal action segmentation system initially
designed for exocentric (fixed) cameras to an egocentric scenario, where wearable cameras …
Spherical World-Locking for Audio-Visual Localization in Egocentric Videos
Egocentric videos provide comprehensive contexts for user and scene understanding,
spanning multisensory perception to behavioral interaction. We propose Spherical World …
EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric Views of Procedural Activities in the Real World
Being able to map the activities of others into one's own point of view is a fundamental
human skill, even from a very early age. Taking a step toward understanding this human …
Fusing Personal and Environmental Cues for Identification and Segmentation of First-Person Camera Wearers in Third-Person Views
As wearable cameras become more popular, an important question emerges: how to identify
camera wearers within the perspective of conventional static cameras. The drastic difference …