A new approach to minimize utterance verification error rate for a specific operating point.
WH Au, MH Siu - INTERSPEECH, 2003 - isca-archive.org
Abstract
In many telephony applications that use speech recognition, it is important to identify and reject out-of-vocabulary words, or utterances that contain no keywords, by means of utterance verification (UV). Typically, UV is based on the likelihood ratio of a target model versus an alternative model. The “goodness” of these models, and the particular criteria used to estimate them, can have a significant impact on UV performance. Because the UV problem can be treated as a two-class classification problem, minimum classification error (MCE) training is a natural choice. Earlier work has focused on MCE training to reduce the total number of classification errors. In this paper, we extend the MCE approach to minimize error rates rather than the total error count. In particular, we focus on the error rates at specific operating points and show how this can yield a significant reduction in equal error rate (EER) for phone verification on TIMIT and on a non-native kids corpus. Although the technique is developed for utterance verification, it can also be generalized to other verification tasks such as speaker verification.
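The likelihood-ratio decision and the EER mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the paper's MCE training procedure; it only shows, under simple assumptions, how a log-likelihood-ratio score is thresholded and how an approximate EER can be read off a threshold sweep. The function names (`llr`, `error_rates`, `approx_eer`) and the convention that higher scores favor the target model are assumptions made for illustration.

```python
def llr(target_loglik, alt_loglik):
    """Log-likelihood ratio of the target model versus the alternative model."""
    return target_loglik - alt_loglik


def error_rates(target_scores, impostor_scores, threshold):
    """Return (false_accept_rate, false_reject_rate) at one operating point.

    Assumption: a score >= threshold means "accept".
    """
    false_rejects = sum(s < threshold for s in target_scores)
    false_accepts = sum(s >= threshold for s in impostor_scores)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(target_scores))


def approx_eer(target_scores, impostor_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    Returns (eer_estimate, threshold), where the two error rates are closest.
    """
    best = None
    for t in sorted(set(target_scores) | set(impostor_scores)):
        fa, fr = error_rates(target_scores, impostor_scores, t)
        gap = abs(fa - fr)
        if best is None or gap < best[0]:
            best = (gap, (fa + fr) / 2, t)
    return best[1], best[2]
```

Moving the threshold trades false acceptances against false rejections, which is why an error rate is always tied to a particular operating point; the EER is the point where the two rates coincide.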