A study on auditory feature spaces for speech-driven lip animation
Interspeech, 2011•inria.hal.science
We present in this paper a study on auditory feature spaces for speech-driven face animation. The goal is to provide solid analytic grounding for the descriptive capability of some well-known features in relation to lip sync. A set of audio features describing the temporal and spectral shape of the speech signal was computed on annotated audio extracts. The dimension of the input feature space was reduced with PCA, and the contribution of each input feature was investigated to determine the most descriptive ones. The resulting feature space is analyzed quantitatively and qualitatively for the description of acoustic units (phonemes, visemes, etc.), and we demonstrate that using some low-level features in addition to MFCCs increases the relevance of the feature space. Finally, we evaluate the stability of these features with respect to the gender of the speaker.
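The analysis pipeline the abstract describes (stack per-frame audio features, reduce with PCA, score each input feature's contribution to the retained components) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the matrix sizes and the variance-weighted loading score are assumptions, and synthetic random data stands in for the annotated audio extracts.

```python
import numpy as np

# Hypothetical stand-in for a matrix of per-frame audio features
# (e.g. MFCCs plus low-level temporal/spectral descriptors).
rng = np.random.default_rng(0)
n_frames, n_features = 500, 16          # assumed sizes, not from the paper
X = rng.normal(size=(n_frames, n_features))

# Standardize so features with different scales contribute comparably.
X = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via SVD of the standardized data matrix.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
explained = s**2 / np.sum(s**2)         # variance ratio per component

# Keep enough components to cover 90% of the variance (assumed threshold).
k = int(np.searchsorted(np.cumsum(explained), 0.90)) + 1

# Score each input feature by its squared loadings on the retained
# components, weighted by the variance each component explains.
contrib = (Vt[:k] ** 2 * explained[:k, None]).sum(axis=0)
ranking = np.argsort(contrib)[::-1]     # most descriptive features first
```

In a real setting, `X` would hold the features extracted from the annotated speech corpus, and `ranking` would indicate which inputs (MFCC coefficients, low-level descriptors, etc.) dominate the reduced space.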