Adversarial Speaker Adaptation

Meng, Zhong; Li, Jinyu; Gong, Yifan

doi:10.1109/ICASSP.2019.8682510

arXiv:1904.12407 (cs)

[Submitted on 29 Apr 2019]

Abstract: We propose a novel adversarial speaker adaptation (ASA) scheme, in which adversarial learning is applied to regularize the distribution of deep hidden features in a speaker-dependent (SD) deep neural network (DNN) acoustic model to be close to that of a fixed speaker-independent (SI) DNN acoustic model during adaptation. An additional discriminator network is introduced to distinguish the deep features generated by the SD model from those produced by the SI model. In ASA, with a fixed SI model as the reference, an SD model is jointly optimized with the discriminator network to minimize the senone classification loss, and simultaneously to mini-maximize the SI/SD discrimination loss on the adaptation data. With ASA, a senone-discriminative deep feature is learned in the SD model with a similar distribution to that of the SI model. With such a regularized and adapted deep feature, the SD model can perform improved automatic speech recognition on the target speaker's speech. Evaluated on the Microsoft short message dictation dataset, ASA achieves 14.4% and 7.9% relative word error rate improvements for supervised and unsupervised adaptation, respectively, over an SI model trained on 2600 hours of data, with 200 adaptation utterances per speaker.
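
The two losses named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the label convention (the discriminator outputs the probability that a feature came from the SD model), and the trade-off weight `lam` are assumptions introduced here for illustration. The discriminator is taken to minimize the SI/SD discrimination loss, while the SD model minimizes the senone loss and at the same time tries to maximize the discrimination loss, which is one reading of "mini-maximize".

```python
import math

def senone_cross_entropy(posteriors, labels):
    """Mean negative log-probability of the correct senone per frame.

    posteriors: list of per-frame senone probability lists from the SD model.
    labels: list of correct senone indices, one per frame.
    """
    return -sum(math.log(p[y]) for p, y in zip(posteriors, labels)) / len(labels)

def discrimination_loss(d_on_sd, d_on_si):
    """Binary cross-entropy of the discriminator.

    Assumed convention: each value is the discriminator's estimate of
    P(feature came from the SD model); SD features have target 1, SI features 0.
    """
    n = len(d_on_sd) + len(d_on_si)
    loss_sd = -sum(math.log(p) for p in d_on_sd)
    loss_si = -sum(math.log(1.0 - p) for p in d_on_si)
    return (loss_sd + loss_si) / n

def asa_objectives(posteriors, labels, d_on_sd, d_on_si, lam=1.0):
    """Return the quantity each player minimizes (lam is a hypothetical weight)."""
    l_senone = senone_cross_entropy(posteriors, labels)
    l_disc = discrimination_loss(d_on_sd, d_on_si)
    return {
        # The discriminator tries to tell SD features from SI features.
        "discriminator": l_disc,
        # The SD model classifies senones while confusing the discriminator.
        "sd_model": l_senone - lam * l_disc,
    }

if __name__ == "__main__":
    # Toy values: two frames, three senones, one SD and one SI feature score.
    toy = asa_objectives(
        posteriors=[[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
        labels=[0, 1],
        d_on_sd=[0.6],
        d_on_si=[0.4],
    )
    print(toy)
```

When the discriminator cannot separate the two feature distributions, its outputs approach 0.5 and the discrimination loss approaches log 2, which is the regime the SD model is pushed toward under this sketch. In practice the subtraction is often realized with a gradient reversal layer inside an automatic differentiation framework; that detail is not stated in the abstract.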

Comments:	5 pages, 2 figures, ICASSP 2019
Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
Cite as:	arXiv:1904.12407 [cs.LG]
	(or arXiv:1904.12407v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1904.12407
Journal reference:	2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, United Kingdom
Related DOI:	https://doi.org/10.1109/ICASSP.2019.8682510

Submission history

From: Zhong Meng
[v1] Mon, 29 Apr 2019 00:38:16 UTC (230 KB)
