
Robust speech enhancement techniques for ASR in non-stationary noise and dynamic environments

G Liu, D Dimitriadis, E Bocchieri - INTERSPEECH, 2013 - academia.edu


Abstract

In current ASR systems, the presence of competing speakers greatly degrades recognition performance. This problem is even more prominent in hands-free, far-field ASR systems such as “Smart-TV” systems, where reverberation and non-stationary noise pose additional challenges. Furthermore, speakers are most often not standing still while speaking. To address these issues, we propose a cascaded system that includes Time Difference of Arrival (TDOA) estimation, multi-channel Wiener filtering, non-negative matrix factorization (NMF), multi-condition training, and robust feature extraction, each of which additively improves the overall performance. The final cascaded system achieves an average relative improvement in ASR word accuracy of 50% and 45% for the CHiME 2011 (non-stationary noise) and CHiME 2012 (non-stationary noise plus speaker head movement) tasks, respectively.
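As an illustration of the first stage of the cascade, the sketch below estimates a time difference of arrival between two microphone channels. The abstract does not say which TDOA estimator the authors use; GCC-PHAT is swapped in here only because it is a widely used choice, and the function name `gcc_phat_tdoa` and its parameters are hypothetical, not taken from the paper.

```python
import numpy as np

def gcc_phat_tdoa(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref`, in seconds.

    Assumption: GCC-PHAT is used as a stand-in estimator; the paper's
    abstract does not specify the actual TDOA method.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)

    # Cross-power spectrum with PHAT weighting (keep phase, drop magnitude).
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n)

    # Limit the search range to physically plausible lags if requested.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)

    # Reorder so that index `max_shift` corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs
```

A positive return value means `sig` lags `ref`. For a moving speaker, as in the CHiME 2012 condition, such an estimate would presumably have to be recomputed on short frames rather than once per utterance.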
