SNR-based mask compensation for computational auditory scene analysis applied to speech recognition in a car environment

JH Park, SM Kim, JS Yoon, HK Kim, SJ Lee, Y Lee
INTERSPEECH, 2010 - isca-archive.org
Abstract
In this paper, we propose a computational auditory scene analysis (CASA)-based front-end for two-microphone speech recognition in a car environment. One of the important issues in CASA is accurately estimating the mask information used to separate target speech from noisy multi-microphone speech. To this end, the time-frequency mask information is compensated using the signal-to-noise ratio obtained from a beamformer, to adjust for the amount of noise included in the noisy speech. We evaluate the performance of an automatic speech recognition (ASR) system that employs a CASA-based front-end with the proposed mask compensation method, and compare it with systems that employ a CASA-based front-end without mask compensation and a beamforming-based front-end. The CASA-based front-end achieves an average word error rate (WER) reduction of 8.57% when the proposed mask compensation method is applied. In addition, the CASA-based front-end with the proposed method provides a relative WER reduction of 26.52% compared with the beamforming-based front-end.