A Novel Scheme for Single-Channel Speech Dereverberation
Abstract
1. Introduction
2. Previous Work
2.1. Wu and Wang Method
2.2. Spendred Algorithm
2.3. The K-SVD Algorithm
2.4. Denoising with Minimal Statistics
3. The Proposed Method
3.1. Sparsification
3.2. RIR Envelope Estimation
- in the low frequency region
- there was a form of coloration in the frequency spectrum
- the overall spectral energy was different
3.3. Spectral Subtraction
4. Quality and Intelligibility Metrics
5. Performance Evaluation
5.1. Experimental Setup
5.2. Evaluation
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Tonelli, M.; Mitianoudis, N.; Davies, M. A Maximum Likelihood approach to blind Audio de-reverberation. In Proceedings of the 7th International Conference on Digital Audio Effects (DAFX-04), Naples, Italy, 5–8 October 2004.
- Elko, G.W. Microphone array systems for hands-free telecommunication. Speech Commun. 1996, 20, 229–240.
- Gaubitch, N.D. Blind Identification of Acoustic Systems and Enhancement of Reverberant Speech. Ph.D. Thesis, Imperial College, University of London, London, UK, 2006.
- Yegnanarayana, B.; Murthy, P.S. Enhancement of reverberant speech using LP residual signal. IEEE Trans. Speech Audio Process. 2000, 8, 267–281.
- Gillespie, B.W.; Florencio, D.A.F.; Malvar, H.S. Speech de-reverberation via maximum-kurtosis sub-band adaptive filtering. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Salt Lake City, UT, USA, 7–11 May 2001; pp. 3701–3704.
- Tonelli, M. Blind Speech Dereverberation. Prog. Vib. Acoust. 2014, 2, 1–37.
- Wu, M.; Wang, D. A two-stage algorithm for one-microphone reverberant speech enhancement. IEEE Trans. Audio Speech Lang. Process. 2006, 14, 774–784.
- Doire, C.S.J.; Brookes, M.; Naylor, P.A.; Hicks, C.M.; Betts, D.; Dmour, M.A.; Holdt-Jensen, S. Single-Channel Online Enhancement of Speech Corrupted by Reverberation and Noise. IEEE Trans. Audio Speech Lang. Process. 2017, 25, 572–587.
- Doire, C.S.J. Single-Channel Enhancement of Speech Corrupted by Reverberation and Noise. Ph.D. Thesis, Imperial College, London, UK, 2016.
- Aharon, M.; Elad, M.; Bruckstein, A.M. The K-SVD: An Algorithm for Designing of Overcomplete Dictionaries for Sparse Representation. IEEE Trans. Signal Process. 2006, 54, 4311–4322.
- Martin, R. Noise power spectral density estimation based on optimal smoothing and minimum statistics. IEEE Trans. Speech Audio Process. 2001, 9, 504–512.
- Stewart, R.; Sandler, M. Statistical measures of early reflections of room impulse responses. In Proceedings of the 10th International Conference on Digital Audio Effects (DAFx-07), Bordeaux, France, 10–15 December 2007; pp. 59–62.
- Sumarac-Pavlovic, D.; Mijic, M.; Kurtovic, H. A simple impulse sound source for measurements in room acoustics. Appl. Acoust. 2008, 69, 378–383.
- Repp, B.H. The sound of two hands clapping: An exploratory study. J. Acoust. Soc. Am. 1987, 81, 1100–1109.
- Blesser, B. An interdisciplinary synthesis of reverberation viewpoints. J. Audio Eng. Soc. 2001, 49, 867–903.
- Georganti, E.; Mourjopoulos, J.; Jacobsen, F. Analysis of room transfer function and reverberant signal statistics. Anal. Room Transf. Funct. Reverberant Signal Stat. 2008, 123, 376.
- Georganti, E.; Zarouchas, T.; Mourjopoulos, J. Reverberation analysis via response and signal statistics. In Proceedings of the 128th AES Convention, London, UK, 23–25 May 2010.
- Schroeder, M.R. New Method of Measuring Reverberation Time. J. Acoust. Soc. Am. 1965, 37, 409–412.
- Meyer, J.; Simmer, K.U.; Kammeyer, K.D. Comparison of one-and two-channel noise-estimation techniques. In Proceedings of the 5th International Workshop Acoustic Echo Control Noise Reduction, London, UK, 11–12 September 1997; pp. 17–20.
- Martin, R. Spectral subtraction based on minimum statistics. In Proceedings of the European Signal Processing Conference, Edinburgh, UK, 13–16 September 1994; pp. 1182–1185.
- Hu, Y.; Loizou, P.C. A comparative intelligibility study of speech enhancement algorithms. In Proceedings of the ICASSP, Honolulu, HI, USA, 15–20 April 2007.
- Shannon, R.; Zeng, F.-G.; Kamath, V.; Wygonski, J.; Ekelid, M. Speech recognition with primarily temporal cues. Science 1995, 270, 303–304.
- Wang, D.; Kjems, U.; Pedersen, M.S.; Boldt, J.B.; Lunner, T. Speech perception of noise with binary gains. J. Acoust. Soc. Am. 2008, 124, 2303–2307.
- Tribolet, J.M.; Noll, P.; McDermott, B.J. A study of complexity and quality of speech waveform coders. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Tulsa, OK, USA, 10–12 April 1978.
- ITU-T: Recommendation P.862: Perceptual Evaluation of Speech Quality (PESQ): An Objective Method for End-to-End Speech Quality Assessment of Narrow-Band Telephone Networks and Speech Codecs; ITU: Geneva, Switzerland, 2001.
- Quackenbush, S.R.; Clements, M.A. Objective Measures of Speech Quality; Prentice-Hall: Englewood Cliffs, NJ, USA, 1988.
- Hu, Y.; Loizou, P.C. Evaluation of objective quality measures for speech enhancement. IEEE Trans. Audio Speech Lang. Process. 2008, 16, 229–238.
- Ma, J.; Hu, Y.; Loizou, P.C. Objective measures for predicting speech intelligibility in noisy conditions based on new band-importance functions. J. Acoust. Soc. Am. 2009, 125, 3387–3405.
- Boldt, J.B.; Ellis, D.P.W. A Simple Correlation-Based Model of Intelligibility for Nonlinear Speech Enhancement and Separation. In Proceedings of the 17th European Signal Processing Conference (EUSIPCO), Glasgow, UK, 24–28 August 2009.
- Taal, C.H.; Hendriks, R.C.; Heusdens, R.; Jensen, J. A short-time objective intelligibility measure for time-frequency weighted noisy speech. In Proceedings of the ICASSP, Dallas, TX, USA, 14–19 March 2010.
- Taal, C.H.; Hendriks, R.C.; Heusdens, R.; Jensen, J. An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech. IEEE Trans. Audio Speech Lang. Process. 2011, 19, 2125–2136.
- Falk, T.H.; Zheng, C.; Chan, W.-Y. A Non-Intrusive Quality and Intelligibility Measure of Reverberant and Dereverberated Speech. IEEE Trans. Audio Speech Lang. Process. 2010, 18, 1766–1774.
- Blumensath, T.; Davies, M.E. Stagewise Weak Gradient Pursuits. IEEE Trans. Signal Process. 2009, 57, 4333–4346.
Room Type | Reverberation Time (s) |
---|---|
Small room | 0.5 |
Classroom | 1.3 |
Amphitheater | 1.8 |
Simulation | 0.3 |
Metric | Reference | WW [7] | Spendred [8] | Proposed |
---|---|---|---|---|
Simulated Dataset | ||||
fwSNRseg (dB) | −4.01 | 4.01 | 4.07 | 4.02 |
PESQ | 2 | 2.19 | 2.3 | 2.36 |
NSECGT | 0.92 | 0.81 | 0.88 | 0.89 |
STOI | 0.79 | 0.5 | 0.75 | 0.76 |
SRRseg (dB) | −20.24 | −29.22 | −15 | −18.37 |
SRMR (dB) | 5.55 | 6.12 | 7.33 | 7.65 |
Recorded Dataset | ||||
fwSNRseg (dB) | −29.07 | 27.1 | 13.11 | 29.07 |
PESQ | 1.4 | 1.18 | 1.6 | 1.62 |
NSECGT | 0.78 | 0.71 | 0.79 | 0.81 |
STOI | 0.49 | 0.33 | 0.48 | 0.481 |
SRRseg (dB) | −27.55 | −28.8 | −24.54 | −25.85 |
SRMR (dB) | 3.49 | 8.4 | 5.7 | 3.63 |
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Kilis, N.; Mitianoudis, N. A Novel Scheme for Single-Channel Speech Dereverberation. Acoustics 2019, 1, 711-725. https://doi.org/10.3390/acoustics1030042