Subspace LHUC for Fast Adaptation of Deep Neural Network Acoustic Models
L Samarakoon, KC Sim - Interspeech, 2016 - isca-archive.org
Abstract
Recently, the learning hidden unit contributions (LHUC) method was proposed for the adaptation of deep neural network (DNN) based acoustic models for automatic speech recognition (ASR). In LHUC, a set of speaker-dependent (SD) parameters is estimated to linearly recombine the hidden units in an unsupervised fashion. Although LHUC performs considerably well, its gains diminish as the amount of adaptation data decreases. Moreover, the per-speaker footprint of LHUC adaptation runs into thousands of parameters, which is undesirable. Therefore, in this work, we propose subspace LHUC, where the SD parameters are estimated in a subspace and connected to the various layers through a new set of adaptively trained weights. We evaluate subspace LHUC on the Aurora4 and AMI IHM tasks. Experimental results show that subspace LHUC outperforms standard LHUC adaptation. With utterance-level fast adaptation, subspace LHUC achieves 11.3% and 4.5% relative improvements over standard LHUC on the Aurora4 and AMI IHM tasks, respectively. Furthermore, subspace LHUC reduces the per-speaker footprint by 94% compared with standard LHUC adaptation.
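To make the contrast in the abstract concrete, the following is a minimal sketch (not the authors' code) of the two adaptation schemes at inference time. It assumes the common LHUC formulation in which each hidden activation is rescaled by an amplitude function 2*sigmoid(r); the layer sizes, the subspace dimension, and the per-layer projection matrices A_l (standing in for the "adaptively trained weights" mentioned in the abstract) are illustrative assumptions, not values from the paper.

```python
# Sketch: standard LHUC vs. subspace LHUC rescaling of hidden activations.
# All dimensions and the 2*sigmoid(.) amplitude function are assumptions
# consistent with the LHUC literature, not specifics from this abstract.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_layers, n_hidden, subspace_dim = 3, 2048, 100  # illustrative sizes

# Hypothetical speaker-independent hidden activations, one vector per layer.
hidden = [rng.standard_normal(n_hidden) for _ in range(n_layers)]

# --- Standard LHUC: one SD parameter per hidden unit per layer -------------
r_sd = [rng.standard_normal(n_hidden) * 0.1 for _ in range(n_layers)]
lhuc_out = [2.0 * sigmoid(r) * h for r, h in zip(r_sd, hidden)]
lhuc_footprint = n_layers * n_hidden          # SD parameters per speaker

# --- Subspace LHUC: a small SD vector projected to every layer -------------
# Only v is speaker dependent; the projection matrices A_l are trained
# adaptively and shared across speakers.
v = rng.standard_normal(subspace_dim) * 0.1
A = [rng.standard_normal((n_hidden, subspace_dim)) * 0.01
     for _ in range(n_layers)]
sub_out = [2.0 * sigmoid(A_l @ v) * h for A_l, h in zip(A, hidden)]
sub_footprint = subspace_dim                  # SD parameters per speaker

print(f"per-speaker footprint: LHUC={lhuc_footprint}, subspace={sub_footprint}")
```

Under these assumed sizes, the speaker-dependent storage drops from n_layers * n_hidden scalars to a single subspace_dim vector, which is the kind of footprint reduction the abstract reports.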