Statistical Testing on ASR Performance via Blockwise Bootstrap

Liu, Zhe; Peng, Fuchun

arXiv:1912.09508 (stat)

[Submitted on 19 Dec 2019 (v1), last revised 20 May 2020 (this version, v2)]

Abstract:A common question in automatic speech recognition (ASR) evaluations is how reliable an observed word error rate (WER) improvement is when comparing two ASR systems. Statistical hypothesis testing and confidence intervals (CIs) can be used to tell whether the improvement is real or only due to random chance. The bootstrap resampling method has been popular for such significance analysis because it is intuitive and easy to use. However, this method fails when the data are dependent, which is common in speech: for example, ASR performance on utterances from the same speaker could be correlated. In this paper we present a blockwise bootstrap approach: by dividing the evaluation utterances into nonoverlapping blocks, this method resamples the blocks instead of the original data. We show that the resulting variance estimator of the absolute WER difference between two ASR systems is consistent under mild conditions. We also demonstrate the validity of the blockwise bootstrap method on both synthetic and real-world speech data.
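The resampling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `blockwise_bootstrap_wer_diff`, its parameters, the choice of speakers as blocks, and the percentile interval are all assumptions made for illustration. The abstract only states that utterances are divided into nonoverlapping blocks and that the blocks are resampled.

```python
import random
import statistics
from collections import defaultdict


def blockwise_bootstrap_wer_diff(errors_a, errors_b, ref_words, block_ids,
                                 n_boot=1000, seed=0):
    """Bootstrap the absolute WER difference (system A minus system B).

    Hypothetical helper: inputs are per-utterance word error counts for each
    system, per-utterance reference word counts, and a block label (for
    example a speaker ID) for each utterance.
    """
    # Aggregate per-utterance counts into per-block totals.
    totals = defaultdict(lambda: [0, 0, 0])
    for ea, eb, n, b in zip(errors_a, errors_b, ref_words, block_ids):
        totals[b][0] += ea
        totals[b][1] += eb
        totals[b][2] += n
    blocks = list(totals.values())

    def wer_diff(sample):
        # WER difference = (errors of A - errors of B) / reference words.
        words = sum(t[2] for t in sample)
        return (sum(t[0] for t in sample) - sum(t[1] for t in sample)) / words

    point = wer_diff(blocks)
    rng = random.Random(seed)
    # Draw whole blocks with replacement, so that utterances in the same
    # block always stay together in every bootstrap replicate.
    replicates = sorted(
        wer_diff(rng.choices(blocks, k=len(blocks))) for _ in range(n_boot)
    )
    std_err = statistics.stdev(replicates)
    # Assumed choice: a 95% percentile interval over the replicates.
    lo = replicates[int(0.025 * n_boot)]
    hi = replicates[min(n_boot - 1, int(0.975 * n_boot))]
    return point, std_err, (lo, hi)


# Toy data (made up): six utterances from three speakers.
errors_a = [2, 1, 0, 3, 1, 2]
errors_b = [1, 1, 0, 2, 0, 2]
ref_words = [10, 8, 5, 12, 7, 9]
speakers = ["s1", "s1", "s2", "s2", "s3", "s3"]
point, std_err, ci = blockwise_bootstrap_wer_diff(
    errors_a, errors_b, ref_words, speakers, n_boot=200
)
print(point, std_err, ci)
```

Giving every utterance its own block label reduces this sketch to the ordinary utterance-level bootstrap, which makes it easy to compare the two variance estimates on the same data.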

Comments:	6 pages, 2 figures
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:1912.09508 [stat.ML]
	(or arXiv:1912.09508v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1912.09508

Submission history

From: Zhe Liu
[v1] Thu, 19 Dec 2019 19:20:09 UTC (68 KB)
[v2] Wed, 20 May 2020 22:51:25 UTC (149 KB)
