Multi-Channel Speech Enhancement

Chunjian Li
DICOM, Aalborg University

3/24/2006

Lecture notes for Speech Communications

Methods & applied fields

Dual-channel spectral subtraction
- noise reduction in speech

Adaptive Noise Canceling (ANC)
- noise reduction and interference elimination
- echo canceling
- adaptive beamforming

Blind Source Separation (BSS)

Blind Source Extraction (BSE)

Dual-channel spectral subtraction
- Hanson and Wong, ICASSP 1984.


The method

The exponent is chosen to be a = 1, based on listening tests and a spectral distortion measure.
The noisy phase is used in the reconstruction of the signal.
The estimate of the noise spectrum is either obtained from a reference channel or estimated from the noisy signal itself, assuming the SNR is very low (about -12 dB).
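
The points above can be sketched for a single frame as follows. This is a minimal illustration, not the exact procedure of Hanson and Wong: the function name `spectral_subtract`, the clipping of negative differences to zero, and the use of one FFT frame without windowing or overlap-add are all assumptions made for the example. The noise magnitude `noise_mag` is assumed to come from a reference channel.

```python
import numpy as np

def spectral_subtract(noisy_frame, noise_mag, a=1.0):
    """Subtract a noise magnitude spectrum from one frame, keeping the noisy phase.

    noisy_frame : real-valued time-domain frame
    noise_mag   : noise magnitude spectrum, one value per rfft bin (assumed given)
    a           : exponent of the subtraction domain (a = 1 on the slide)
    """
    Y = np.fft.rfft(noisy_frame)
    # Subtract in the |.|^a domain; negative results are clipped to zero (assumption)
    diff = np.maximum(np.abs(Y) ** a - noise_mag ** a, 0.0)
    S_mag = diff ** (1.0 / a)
    # Reconstruct with the phase of the noisy signal
    S_hat = S_mag * np.exp(1j * np.angle(Y))
    return np.fft.irfft(S_hat, n=len(noisy_frame))
```

A noise estimate taken from a reference frame `ref_frame` could be passed as `np.abs(np.fft.rfft(ref_frame))`.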

Revisiting the phase issue


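To see the dependency of the magnitude on the phase:

$$
\begin{aligned}
|\hat{S}(f)| &= \left( |S(f) + N(f)|^{a} - |N(f)|^{a} \right)^{1/a} \\
&= \left( \left| |S(f)|\, e^{j\theta(f)} + |N(f)| \right|^{a} - |N(f)|^{a} \right)^{1/a} \\
&= |N(f)| \left[ \left( \frac{|S(f)|^{2}}{|N(f)|^{2}} + 1 + 2\,\frac{|S(f)|}{|N(f)|}\cos\theta(f) \right)^{a/2} - 1 \right]^{1/a}
\end{aligned}
$$

where $\theta(f)$ is the phase difference between the two signals.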


It is clear that the estimate of the signal magnitude spectrum depends
on both the SNR and the phase difference. The phase is nevertheless not
estimated in this method, because the quality of the enhanced speech is acceptable without it.
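
A small numerical check can make this dependency concrete. The sketch below assumes the noise magnitude is known exactly and evaluates the magnitude estimate as a function of the SNR and the phase difference; the function name `estimated_magnitude` and the clipping at zero are illustrative assumptions.

```python
import numpy as np

def estimated_magnitude(s_mag, n_mag, theta, a=1.0):
    """|S_hat(f)| obtained when the true noise magnitude is subtracted from |S + N|.

    s_mag, n_mag : true signal and noise magnitudes at one frequency
    theta        : phase difference between signal and noise (radians)
    a            : exponent of the subtraction domain
    """
    ratio = s_mag / n_mag
    # |S + N|^2 / |N|^2 from the law of cosines
    noisy_pow = ratio ** 2 + 1.0 + 2.0 * ratio * np.cos(theta)
    inner = np.maximum(noisy_pow ** (a / 2.0) - 1.0, 0.0)
    return n_mag * inner ** (1.0 / a)
```

With a = 1 the estimate equals the true magnitude only when the two signals are in phase (theta = 0); with a = 2 it is exact when they are in quadrature (theta = pi/2). For other phase differences the estimate is biased, and the bias grows as the SNR drops.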


Comments

The simplest (and somewhat unrealistic) way of exploiting multiple channels.
Aims at improving intelligibility.
Significant intelligibility gains only at very low SNR (about -12 dB).
Unvoiced speech is not processed.


Adaptive Noise Canceling

First proposed by Widrow et al. [1] in 1975.
It is adaptive because it uses an adaptive filter, typically updated with the LMS algorithm.
The objective: estimate the noise in the primary channel using the noise recorded in the secondary channel, and subtract the estimate from the primary channel recordings.

[1] B. Widrow, J. R. Glover, J. M. McCool, et al., "Adaptive noise canceling: Principles and applications," Proceedings of the IEEE, vol. 63, pp. 1692-1716, Dec. 1975.


Signal model


Signal estimation
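The estimated signal:

$$\hat{s}(n) = y(n) - \hat{d}_1(n), \qquad \hat{d}_1(n) = \sum_{i=0}^{M-1} h(i)\, d_2(n-i)$$

The optimization criterion:

$$\hat{\mathbf{h}} = \arg\min_{\mathbf{h}} E\left\{ \left[ y(n) - \sum_{i=0}^{M-1} h(i)\, d_2(n-i) \right]^{2} \right\}$$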


Signal estimation
The minimization can be solved by applying the orthogonality principle:

$$r_{yd_2}(\tau) - \sum_{i=0}^{M-1} h(i)\, r_{d_2}(\tau - i) = 0, \qquad \tau = 0, \dots, M-1$$

This can be solved in the same way as the normal equations.
But it is usually solved by sequential algorithms such as the LMS
algorithm. The advantages of the LMS are:
- No matrix inversion, low complexity
- Fully adaptive, suitable for non-stationary signal and noise
- Low delay
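
For comparison with the sequential approach, the direct (batch) solution of the normal equations can be sketched as below. This is only an illustration under stated assumptions: the correlations are replaced by sample averages over a finite record, samples before the start of the record are taken as zero, and the function name `wiener_taps` is hypothetical.

```python
import numpy as np

def wiener_taps(y, d2, M):
    """Batch solution of the normal equations for the taps h(0), ..., h(M-1).

    y  : primary channel (speech plus noise d1)
    d2 : reference channel (noise only)
    M  : number of filter taps
    """
    N = len(y)
    # Column i holds d2 delayed by i samples (zeros before the start of the record)
    D = np.column_stack(
        [np.concatenate([np.zeros(i), d2[:N - i]]) for i in range(M)]
    )
    R = D.T @ D / N   # sample estimate of the correlation matrix of d2
    r = D.T @ y / N   # sample estimate of the cross-correlation between y and d2
    return np.linalg.solve(R, r)
```

The call to `np.linalg.solve` is the matrix inversion step that the LMS algorithm avoids.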


LMS
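- It is a sequential, gradient-descent minimization method.
- The estimate of the weights is updated each time a new sample is available:

$$\mathbf{h}_k = \mathbf{h}_{k-1} - \mu_k\, \mathbf{g}$$

where the elements of the gradient vector of the mean-squared error $\varepsilon$ are:

$$g(\tau) = \frac{\partial \varepsilon}{\partial h(\tau)} = -2\left[ r_{yd_2}(\tau) - \sum_{i=0}^{M-1} h(i)\, r_{d_2}(\tau - i) \right]$$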

LMS
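Or, in matrix form:

$$\mathbf{g} = -2\left( \mathbf{r}_{yd_2} - \mathbf{R}_{d_2}\mathbf{h} \right)$$

The most important trick in this sequential implementation is to
approximate the correlation matrix and the cross-correlation vector by
their instantaneous estimates:

$$\hat{\mathbf{R}}_{d_2} = \mathbf{d}_2 \mathbf{d}_2^{H}, \qquad \hat{\mathbf{r}}_{yd_2} = \mathbf{d}_2\, y(n)$$

where $\mathbf{d}_2 = [d_2(n), \dots, d_2(n-M+1)]^{T}$.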

LMS
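The step size is often chosen empirically, as long as the following
condition is satisfied for stability:

$$0 < \mu < \frac{1}{\lambda_{\max}}$$

where $\lambda_{\max}$ is the largest eigenvalue of the matrix $\mathbf{R}_{d_2}$. (This bound corresponds to the gradient with the factor 2; formulations that absorb the factor into the step size give $2/\lambda_{\max}$.)

The larger the step size, the faster the convergence, but also the
larger the estimation variance.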
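
Putting the update rule and the instantaneous estimates together gives the following sketch of an LMS noise canceller. It assumes real-valued signals and a fixed step size `mu`; the function name `lms_anc` and the zero initialization of the taps are illustrative choices, not taken from the slides.

```python
import numpy as np

def lms_anc(y, d2, M, mu):
    """Adaptive noise canceller using LMS (real-valued signals, fixed step size).

    y  : primary channel, speech plus noise
    d2 : reference channel, noise only
    M  : number of filter taps
    mu : step size
    Returns the enhanced signal and the final tap vector.
    """
    h = np.zeros(M)
    s_hat = np.zeros(len(y))
    buf = np.zeros(M)                 # [d2(n), d2(n-1), ..., d2(n-M+1)]
    for n in range(len(y)):
        buf = np.roll(buf, 1)
        buf[0] = d2[n]
        # Subtract the noise estimate; the residual is the speech estimate
        s_hat[n] = y[n] - h @ buf
        # Instantaneous gradient: g = -2 * buf * s_hat[n]; step h <- h - mu * g
        h = h + 2.0 * mu * s_hat[n] * buf
    return s_hat, h
```

Each iteration costs O(M) operations and no correlation matrix is ever formed, which is where the low complexity comes from.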


Comments

The LMS belongs to the family of stochastic gradient algorithms.
The algorithm is based on instantaneous estimates of the correlation functions, which have high variance. But the algorithm works well because of its iterative nature, which averages the estimates over time.
Low complexity: O(M) per sample, where M is the filter order.
Although the derivation is based on the WSS assumption, the algorithm is applicable to non-stationary signals, due to the sequential implementation.

Implementation issues of ANC

Microphones must be sufficiently separated in space or separated by acoustic barriers.
Typically 1500 taps are needed => large misadjustment => pronounced echo => must use small step size => long convergence time.
Different delays from the sources to the two microphones must be taken care of.
Frequency-domain LMS can reduce the number of taps needed.
ANC can be generalized to a multi-channel system, which can be seen as a generalized beamforming system.

Eliminating cross-talk
Cross-talk: if the signal is also captured in the reference channel, the ANC
will suppress part of the signal. Cross-talk can be reduced by employing
two adaptive filters within a feedback loop.


Beamforming

Compared to ANC, beamforming is truly a spatial filtering technique.
First, locate the source direction; then form a beam directed at the source.
The source location problem is an analogy of the spectral analysis problem, with the frequency domain replaced by the spatial domain.

A simple array model

Planar wave
Uniform linear array (ULA)
Sensor responses are identical and LTI
Sensors are omnidirectional
One parameter to estimate: the direction of arrival (DOA)

ULA


ULA
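The signal model:

$$\mathbf{y}(t) = \mathbf{a}(\theta)\, s(t) + \mathbf{e}(t)$$

where the array transfer vector is:

$$\mathbf{a}(\theta) = \left[ 1,\; e^{-j\omega_c \tau_2},\; \dots,\; e^{-j\omega_c \tau_m} \right]^{T}$$

Here $\tau_m$ is the delay at sensor m with reference to the first sensor, and $\omega_c$ is the
center frequency of the signal. By defining the spatial frequency as:

$$\omega_s = \omega_c\, \frac{d \sin\theta}{c}$$

where d is the sensor spacing and c is the propagation speed, we can write the array transfer vector as:

$$\mathbf{a}(\theta) = \left[ 1,\; e^{-j\omega_s},\; \dots,\; e^{-j(m-1)\omega_s} \right]^{T}$$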

ULA

A direct analogy exists between frequency analysis and spatial analysis through the spatial frequency.
To avoid spatial aliasing:

$$d \le \lambda / 2$$

where $\lambda$ is the signal wavelength.
All frequency analysis techniques can be applied to the DOA estimation problem.
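
The array transfer vector and the aliasing condition can be illustrated with a few lines of code. The sketch assumes a narrowband signal and expresses the spatial frequency through the wavelength, using omega_c / c = 2 pi / lambda; the function name `steering_vector` and its parameters are illustrative.

```python
import numpy as np

def steering_vector(theta, m, d, wavelength):
    """Array transfer vector a(theta) of a ULA with m sensors spaced d apart.

    theta      : direction of arrival in radians, measured from broadside
    wavelength : wavelength of the narrowband signal, same unit as d
    """
    # Spatial frequency: omega_s = omega_c * d * sin(theta) / c = 2*pi*d*sin(theta)/lambda
    omega_s = 2.0 * np.pi * d * np.sin(theta) / wavelength
    return np.exp(-1j * omega_s * np.arange(m))

# With d equal to one wavelength, broadside and endfire arrivals look identical
aliased = np.allclose(steering_vector(0.0, 4, 1.0, 1.0),
                      steering_vector(np.pi / 2, 4, 1.0, 1.0))
# With d equal to half a wavelength, they are distinguishable
resolved = not np.allclose(steering_vector(0.0, 4, 0.5, 1.0),
                           steering_vector(np.pi / 2, 4, 0.5, 1.0))
```

This is the spatial counterpart of temporal aliasing: sampling the wavefront too coarsely in space makes two different directions produce the same sensor outputs.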


Spatial filtering

Analogy between spatial filter and temporal filter


Spatial filtering

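The spatially filtered signal:

$$x(t) = \mathbf{h}^{*}\mathbf{a}(\theta)\, s(t)$$

Objective: find the filter that passes undistorted the signals with a given DOA, and attenuates all the other DOAs as much as possible:

$$\min_{\mathbf{h}}\; \mathbf{h}^{*}\mathbf{h} \quad \text{subject to} \quad \mathbf{h}^{*}\mathbf{a}(\theta) = 1$$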
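
The constrained minimization above has the standard closed-form solution h = a(theta) / m, the delay-and-sum beamformer; this solution is not stated on the slide and is added here for illustration. The sketch below assumes a half-wavelength ULA, and all function names are hypothetical.

```python
import numpy as np

def steering_vector(theta, m, d_over_lambda=0.5):
    """ULA array transfer vector for m sensors (spacing given in wavelengths)."""
    omega_s = 2.0 * np.pi * d_over_lambda * np.sin(theta)
    return np.exp(-1j * omega_s * np.arange(m))

def delay_and_sum_weights(theta0, m, d_over_lambda=0.5):
    """Minimum-norm filter satisfying h* a(theta0) = 1, i.e. h = a(theta0) / m."""
    return steering_vector(theta0, m, d_over_lambda) / m

def beam_pattern(h, thetas, d_over_lambda=0.5):
    """Power response |h* a(theta)|^2 over a grid of directions."""
    m = len(h)
    # np.vdot conjugates its first argument, so this computes h* a(theta)
    return np.array(
        [abs(np.vdot(h, steering_vector(t, m, d_over_lambda))) ** 2 for t in thetas]
    )
```

For example, `beam_pattern(delay_and_sum_weights(np.deg2rad(20), 8), np.deg2rad(np.linspace(-90, 90, 181)))` gives a pattern with unit gain at 20 degrees; adding sensors narrows the main lobe.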


The beam pattern


Restrictions to beamforming

Very sensitive to array geometry; needs good calibration
Has only directivity, no selectivity in range or other location parameters
Frequency response is not flat
Ambient noise is assumed to be spatially white
Beam width (or selectivity) depends on the size of the array
Spatial aliasing problem

Blind Source Separation (BSS)

MIMO systems
Spatial processing techniques with no knowledge of array geometry
Invisible beam
Arbitrarily high spatial resolution
Do not depend on signal frequency
Spatial noise is not assumed to be white
Not a spatial sampling system

Solutions to BSS

Independent Component Analysis (ICA) [2]
Independent Factor Analysis (IFA) [3]

[2] A. Hyvärinen, J. Karhunen, and E. Oja, Independent Component Analysis, John Wiley & Sons, Inc., 2001.
[3] H. Attias, "Independent factor analysis," Neural Computation, 1999.


Independent component analysis (ICA)

Instantaneous mixing
The number of sensors is greater than or equal to the number of sources
No system noise
The sources (components) are independent of each other
The sources are non-Gaussian processes

ICA model
Cocktail party problem. Three sources, three sensors:

$$
\begin{aligned}
x_1(t) &= a_{11} s_1(t) + a_{12} s_2(t) + a_{13} s_3(t) \\
x_2(t) &= a_{21} s_1(t) + a_{22} s_2(t) + a_{23} s_3(t) \\
x_3(t) &= a_{31} s_1(t) + a_{32} s_2(t) + a_{33} s_3(t)
\end{aligned}
$$

Or, in matrix form:

$$\mathbf{x} = \mathbf{A}\mathbf{s}$$

Neither s nor A is known, so the system cannot be solved by linear algebra alone.
If the sources are independent and non-Gaussian, the A matrix can
be found by maximizing the non-Gaussianity of the source estimates.


Contrast function
An iterative gradient method. First initialize the A matrix.
If the mixing matrix A is square and non-singular, move it to the left:

$$\mathbf{A}^{-1}\mathbf{x} = \mathbf{s}$$

Calculate the non-Gaussianity of s, and find the next estimate of A that
gives a higher non-Gaussianity. Iterate until convergence.
The contrast function is the objective function to maximize or minimize.
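
As one concrete instance of such an iteration, the sketch below uses kurtosis as the contrast function. Note the substitutions: instead of the plain gradient update described above, it uses the fixed-point kurtosis iteration of FastICA (from reference [2]), and it estimates the rows of a demixing matrix W on whitened data rather than A itself. The function names `whiten` and `ica_kurtosis` are hypothetical.

```python
import numpy as np

def whiten(x):
    """Center and whiten x (shape: channels x samples)."""
    x = x - x.mean(axis=1, keepdims=True)
    eigval, eigvec = np.linalg.eigh(np.cov(x))
    V = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T
    return V @ x

def ica_kurtosis(x, n_iter=200, seed=0):
    """Estimate the sources by maximizing |kurtosis|, one component at a time."""
    z = whiten(x)
    k = z.shape[0]
    rng = np.random.default_rng(seed)
    W = np.zeros((k, k))                 # rows are the demixing vectors
    for p in range(k):
        w = rng.standard_normal(k)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            u = w @ z                    # current source estimate
            # Fixed-point step for the kurtosis contrast on whitened data
            w_new = (z * u ** 3).mean(axis=1) - 3.0 * w
            # Deflation: stay orthogonal to the components already found
            w_new -= W[:p].T @ (W[:p] @ w_new)
            w_new /= np.linalg.norm(w_new)
            # Converged when the direction stops changing (the sign may flip)
            converged = abs(abs(w_new @ w) - 1.0) < 1e-10
            w = w_new
            if converged:
                break
        W[p] = w
    return W @ z
```

The sources are recovered only up to permutation, sign, and scale, which is an inherent ambiguity of the model x = As.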


Maximizing non-Gaussianity

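"Non-Gaussian is independent": by the central limit theorem, a mixture of independent sources is closer to Gaussian than the sources themselves, so the most non-Gaussian combinations of the observations correspond to the independent sources.
Measuring non-Gaussianity
- by kurtosis
- by negentropy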
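
Of the two measures, kurtosis is the simpler to compute. The sketch below uses the excess kurtosis of the standardized signal, which is zero for a Gaussian, negative for sub-Gaussian and positive for super-Gaussian distributions; the function name `excess_kurtosis` is an illustrative choice. Negentropy is not shown.

```python
import numpy as np

def excess_kurtosis(u):
    """Sample excess kurtosis: E[u^4] - 3 after standardizing u."""
    u = (u - u.mean()) / u.std()
    return np.mean(u ** 4) - 3.0

rng = np.random.default_rng(0)
n = 200_000
k_gauss = excess_kurtosis(rng.standard_normal(n))   # close to 0
k_unif = excess_kurtosis(rng.uniform(-1, 1, n))     # negative (sub-Gaussian)
k_lapl = excess_kurtosis(rng.laplace(size=n))       # positive (super-Gaussian)
```

Kurtosis is sensitive to outliers because of the fourth power, which is one reason negentropy-based measures are often preferred in practice.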


ICA methods

ICA by maximizing non-Gaussianity
ICA by Maximum Likelihood
ICA by minimizing mutual information
ICA by nonlinear decorrelation


Extensions to ICA

Noisy ICA
ICA with non-square mixing matrix
Independent Factor Analysis
Convolutive mixture
Methods using time structure


Blind Source Extraction

Only interested in one or a few sources out of many (feature extraction)
Saves computation
The exact number of sources need not be known


BSE

D. Mandic and A. Cichocki, "An Online Algorithm for Blind Extraction of Sources with Different Dynamical Structures."
