Discrimination of Speech Activity and Impact Noise Using an Accelerometer and a Microphone in a Car Environment

SM Kim, HK Kim, SJ Lee, YK Lee - … Conference, FGCN 2011, Held as Part …, 2011 - Springer
SM Kim, HK Kim, SJ Lee, YK Lee
Communication and Networking: International Conference, FGCN 2011, Held as …, 2011 - Springer
Abstract
In this paper, we propose an algorithm to discriminate speech from vehicle body impact noise in a car. Depending on road conditions, such as large bumps or unpaved stretches, impact noise from the car body may interfere with the detection of voice commands for a speech-enabled in-car service, degrading service performance. The proposed algorithm classifies each analysis frame of the microphone input signal into one of four categories: speech, impact noise, background noise, or mixed speech and impact noise. The classification is based on a likelihood ratio test (LRT) using statistical models constructed by combining signals from the microphone with those from an accelerometer. In other words, the different characteristics captured by acoustic and mechanical sensing enable better discrimination of voice commands from noise emanating from the vehicle body. The algorithm is evaluated on a corpus of speech recorded in a car moving at an average speed of 30-50 km/h, with impact noise at signal-to-noise ratios (SNRs) from -3 to 1 dB, where the SNR is defined as the ratio of the power of the speech signal to that of the impact noise. The experiments show that the proposed algorithm achieves a discrimination accuracy of 85%.
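The abstract does not specify the features or statistical models, so the following is only a minimal sketch of how a four-way, likelihood-based frame classifier of this kind might look. It assumes a two-dimensional feature per frame (microphone log energy and accelerometer log energy) and one diagonal Gaussian per class; choosing the class with the highest log-likelihood is equivalent to running pairwise likelihood ratio tests with a threshold of zero. The names `TOY_MODELS`, `classify_features`, and `snr_db`, as well as all numeric values, are hypothetical and are not taken from the paper.

```python
import numpy as np

CLASSES = ("speech", "impact", "background", "mixed")

# Hypothetical (mean, variance) per class over the feature vector
# [mic_log_energy, acc_log_energy]. Values are illustrative only.
TOY_MODELS = {
    "speech": (np.array([2.0, -8.0]), np.array([1.0, 1.0])),
    "impact": (np.array([2.0, 2.0]), np.array([1.0, 1.0])),
    "background": (np.array([-8.0, -8.0]), np.array([1.0, 1.0])),
    "mixed": (np.array([4.0, 2.0]), np.array([1.0, 1.0])),
}


def frame_log_energy(signal, frame_len):
    """Split a 1-D signal into non-overlapping frames; return log energy per frame."""
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    # Small constant avoids log(0) on silent frames.
    return np.log(np.sum(frames ** 2, axis=1) + 1e-12)


def gaussian_loglik(x, mean, var):
    """Log-likelihood of each row of x under a diagonal Gaussian."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var, axis=-1)


def classify_features(feats, models=TOY_MODELS):
    """Assign each feature row to the class with the highest log-likelihood.

    This equals pairwise likelihood ratio tests with a zero log-threshold.
    """
    feats = np.atleast_2d(np.asarray(feats, dtype=float))
    scores = np.stack([gaussian_loglik(feats, *models[c]) for c in CLASSES], axis=1)
    return [CLASSES[i] for i in np.argmax(scores, axis=1)]


def classify_frames(mic, acc, frame_len=256, models=TOY_MODELS):
    """Classify time-aligned microphone and accelerometer signals frame by frame."""
    feats = np.stack(
        [frame_log_energy(mic, frame_len), frame_log_energy(acc, frame_len)], axis=1
    )
    return classify_features(feats, models)


def snr_db(speech, impact):
    """SNR in dB as defined in the abstract: speech power over impact-noise power."""
    p_speech = np.mean(np.asarray(speech, dtype=float) ** 2)
    p_impact = np.mean(np.asarray(impact, dtype=float) ** 2)
    return 10.0 * np.log10(p_speech / p_impact)
```

In this toy setup the accelerometer dimension is what separates speech from impact noise, since both may produce similar microphone energy; this mirrors the abstract's point about combining acoustic and mechanical sensing, but a real system would need models trained on labelled recordings and tuned decision thresholds.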