Deep factorization for speech signal

L Li, D Wang, Y Chen, Y Shi, Z Tang… - … on Acoustics, Speech …, 2018 - ieeexplore.ieee.org
Various informative factors are mixed in speech signals, leading to great difficulty when
decoding any of the factors. An intuitive idea is to factorize each speech frame into individual
informative factors, though it turns out to be highly difficult. Recently, we found that speaker
traits, which were assumed to be long-term distributional properties, are actually short-time
patterns, and can be learned by a carefully designed deep neural network (DNN). This
discovery motivated a cascade deep factorization (CDF) framework that will be presented in …

Deep factorization for speech signal

D Wang, L Li, Y Shi, Y Chen, Z Tang - arXiv preprint arXiv:1706.01777, 2017 - arxiv.org
Speech signals are a complex intermingling of various informative factors, and this information
blending makes decoding any of the individual factors extremely difficult. A natural idea is to
factorize each speech frame into independent factors, though it turns out to be even more
difficult than decoding each individual factor. A major obstacle is that the speaker trait,
a major factor in speech signals, has been suspected to be a long-term distributional pattern
and so not identifiable at the frame level. In this paper, we demonstrated that the speaker …