Mar 5, 2021 · Abstract: We propose a two-stream convolutional network for audio recognition that operates on time-frequency spectrogram inputs.
The Slow pathway has high channel capacity while the Fast pathway operates at a fine-grained temporal resolution. We showcase the importance of our two-stream ...
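The division of labour between the two pathways can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the temporal stride `alpha`, and the channel ratio `beta` are hypothetical choices made for illustration, and the values used here are assumptions rather than figures taken from the snippets above. The sketch assumes the Slow pathway sees a temporally subsampled spectrogram while the Fast pathway sees every frame with fewer channels.

```python
from typing import List, Tuple

# A spectrogram stored as a list of time frames, each a list of frequency bins.
Spectrogram = List[List[float]]


def split_pathways(spec: Spectrogram, alpha: int = 4) -> Tuple[Spectrogram, Spectrogram]:
    """Return (slow_input, fast_input) for one spectrogram.

    alpha is an assumed temporal stride: the Slow input keeps every
    alpha-th frame, the Fast input keeps all frames.
    """
    if alpha < 1:
        raise ValueError("alpha must be >= 1")
    slow_input = spec[::alpha]  # coarse temporal resolution
    fast_input = spec           # fine-grained temporal resolution
    return slow_input, fast_input


def pathway_channels(slow_channels: int, beta: int = 8) -> Tuple[int, int]:
    """Return (slow_channels, fast_channels).

    beta is an assumed channel ratio: the Fast pathway gets beta times
    fewer channels, leaving the Slow pathway with the higher capacity.
    """
    if beta < 1 or slow_channels % beta != 0:
        raise ValueError("slow_channels must be a positive multiple of beta")
    return slow_channels, slow_channels // beta


# Toy spectrogram: 16 time frames, 3 frequency bins each.
spec = [[float(t * 10 + f) for f in range(3)] for t in range(16)]
slow, fast = split_pathways(spec, alpha=4)
print(len(slow), len(fast))          # number of frames seen by each pathway
print(pathway_channels(64, beta=8))  # channel widths of each pathway
```

With these assumed settings the Slow input has a quarter of the frames and the Fast pathway an eighth of the channels, which is one way to realise the capacity versus temporal-resolution trade-off the abstract describes.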
Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch - ekazakos/auditory-slow-fast.
SLOW-FAST AUDITORY STREAMS FOR AUDIO RECOGNITION. International Conference on Acoustics, Speech and Signal Processing (ICASSP).
This work learns Slow-Fast auditory streams with separable convolutions and multi-level lateral connections in a two-stream convolutional network for audio ...
We train and evaluate the Auditory SlowFast (ASF) [27] and Self-Supervised Audio Spectrogram Transformer (SSAST) [26] audio encoder networks, with both a linear ...
EPIC-SOUNDS is a large-scale dataset of audio annotations capturing temporal extents and class labels within the audio stream of the egocentric videos.