UZH TILT: A Kaldi recipe for Swiss German Speech to Standard German Text.
SwissText/KONVENS, 2020 • ceur-ws.org
Abstract
Swiss German Speech-to-Text (STT) is a challenging task because no single dominant pronunciation or standardised orthography exists. This is compounded by a severe lack of appropriate training data. One potential avenue, investigated as part of the GermEval 2020 Task 4 on Low-Resource Speech-to-Text, is to translate spoken Swiss German into standard German text implicitly through STT. In this paper, we describe our proposed system, which uses the Kaldi Speech Recognition Toolkit to implement a time delay neural network (TDNN) Acoustic Model (AM) with an extended pronunciation lexicon and language model. Using this approach, we achieve a word error rate of 45.45% on the held-out test set.
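For readers unfamiliar with the metric reported above, word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the system output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring tool used for the shared task; the function name `word_error_rate` and the example sentences are hypothetical, and whitespace tokenisation without any text normalisation is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: tokens are whitespace-separated and compared exactly
    (no case folding or punctuation stripping).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one deleted word and one substituted word
    # against a five-word reference.
    print(word_error_rate("das ist ein kleiner test", "das ist kleiner text"))
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.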