End-to-end Networks for Supervised Single-channel Speech Separation

Venkataramani, Shrikant; Smaragdis, Paris

arXiv:1810.02568 (eess)

[Submitted on 5 Oct 2018]

Abstract: The performance of single-channel source separation algorithms has improved greatly in recent times with the development and deployment of neural networks. However, many such networks continue to operate on the magnitude spectrogram of a mixture, and produce an estimate of source magnitude spectrograms, to perform source separation. In this paper, we interpret these steps as additional neural network layers and propose an end-to-end source separation network that allows us to estimate the separated speech waveform by operating directly on the raw waveform of the mixture. Furthermore, we also propose the use of masking-based end-to-end separation networks that jointly optimize the mask and the latent representations of the mixture waveforms. These networks show a significant improvement in separation performance compared to existing architectures in our experiments. To train these end-to-end models, we investigate the use of composite cost functions that are derived from objective evaluation metrics as measured on waveforms. We present subjective listening test results that demonstrate the improvement attained by using masking-based end-to-end networks and also reveal insights into the performance of these cost functions for end-to-end source separation.
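
The idea of treating the spectrogram front end as network layers can be illustrated with a minimal NumPy sketch. It is not the authors' architecture. It only shows that framing a waveform and multiplying by a fixed cosine/sine basis reproduces the magnitude spectrogram, so the same step could in principle be replaced by a layer with learnable weights. The function names, the frame length, the hop size, the Hann window, and the simple scale-dependent SDR cost at the end are all assumptions made for illustration, and the cost is not the composite cost function studied in the paper.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Slice a 1-D waveform into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def dft_layer_weights(frame_len):
    """Fixed weights of a linear layer that computes the real DFT.

    Returns the real and imaginary bases, each of shape
    (frame_len, frame_len // 2 + 1).
    """
    n = np.arange(frame_len)[:, None]
    k = np.arange(frame_len // 2 + 1)[None, :]
    angle = 2.0 * np.pi * n * k / frame_len
    return np.cos(angle), -np.sin(angle)


def magnitude_front_end(x, frame_len=64, hop=16):
    """Magnitude spectrogram written as window -> linear layer -> nonlinearity."""
    frames = frame_signal(x, frame_len, hop) * np.hanning(frame_len)
    w_re, w_im = dft_layer_weights(frame_len)
    re = frames @ w_re  # linear layer with cosine weights
    im = frames @ w_im  # linear layer with sine weights
    return np.sqrt(re ** 2 + im ** 2)


def neg_sdr(estimate, target, eps=1e-8):
    """Negative signal-to-distortion ratio in dB, computed on waveforms.

    Assumed, simplified cost: lower values mean a better estimate.
    """
    noise = target - estimate
    ratio = (np.sum(target ** 2) + eps) / (np.sum(noise ** 2) + eps)
    return -10.0 * np.log10(ratio)


# Usage: compare the layer-style front end against NumPy's FFT.
t = np.arange(1000) / 8000.0
mixture = np.sin(2 * np.pi * 440.0 * t) + 0.5 * np.sin(2 * np.pi * 1200.0 * t)
spec = magnitude_front_end(mixture)
reference = np.abs(
    np.fft.rfft(frame_signal(mixture, 64, 16) * np.hanning(64), axis=1)
)
matches_fft = np.allclose(spec, reference)
```

Because the basis is an ordinary weight matrix here, an end-to-end variant would let training adjust those weights, and a waveform-level cost such as `neg_sdr` could then be applied to the network output directly.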

Subjects:	Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
Cite as:	arXiv:1810.02568 [eess.AS]
	(or arXiv:1810.02568v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.1810.02568

Submission history

From: Shrikant Venkataramani
[v1] Fri, 5 Oct 2018 08:44:27 UTC (6,580 KB)
