Kalman Filter: EKF, UKF


EKF, UKF

Pieter Abbeel
UC Berkeley EECS

Many slides adapted from Thrun, Burgard and Fox, Probabilistic Robotics


Kalman Filter
n  Kalman Filter = special case of a Bayes’ filter with dynamics model and
sensory model being linear Gaussian:

2 -1

Kalman Filtering Algorithm
n  At time 0:
n  For t = 1, 2, …
n  Dynamics update:

n  Measurement update:

Nonlinear Dynamical Systems


n  Most realistic robotic problems involve nonlinear functions:

n  Versus linear setting:

Linearity Assumption Revisited
[Figure: a Gaussian p(x) pushed through a linear function y = ax + b yields a Gaussian p(y).]

Non-linear Function
[Figure: a Gaussian p(x) pushed through a non-linear function yields a non-Gaussian p(y); the "Gaussian of p(y)" shown has the mean and variance of y under p(y).]

EKF Linearization (1)

EKF Linearization (2)

p(x) has high variance relative to the region in which the linearization is accurate.

EKF Linearization (3)

p(x) has small variance relative to the region in which the linearization is accurate.

EKF Linearization: First Order Taylor Series Expansion
- Dynamics model: for x_t "close to" \mu_t we have:

  f_t(x_t, u_t) \approx f_t(\mu_t, u_t) + F_t (x_t - \mu_t),   with F_t = \frac{\partial f_t}{\partial x}(\mu_t, u_t)

- Measurement model: for x_t "close to" \mu_t we have:

  h_t(x_t) \approx h_t(\mu_t) + H_t (x_t - \mu_t),   with H_t = \frac{\partial h_t}{\partial x}(\mu_t)

EKF Linearization: Numerical

n  Numerically compute Ft column by column:

n  Here ei is the basis vector with all entries equal to zero,
except for the i’t entry, which equals 1.
n  If wanting to approximate Ft as closely as possible then ²
is chosen to be a small number, but not too small to avoid
numerical issues
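A sketch of this finite-difference scheme; the function name and the default step size are my own choices:

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Approximate the Jacobian of f at x, column by column.

    eps trades off linearization accuracy against numerical round-off.
    """
    fx = f(x)
    n = len(x)
    J = np.zeros((len(fx), n))
    for i in range(n):
        e_i = np.zeros(n)
        e_i[i] = 1.0                          # i-th basis vector
        J[:, i] = (f(x + eps * e_i) - fx) / eps
    return J
```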

Ordinary Least Squares


n  Given: samples {(x(1), y(1)), (x(2), y(2)), …, (x(m), y(m))}

n  Problem: find function of the form f(x) = a0 + a1 x that fits


the samples as well as possible in the following sense:

Ordinary Least Squares
n  Recall our objective:
n  Let’s write this in vector notation:
n  , giving:

n  Set gradient equal to zero to find extremum:

(See the Matrix Cookbook for matrix identities, including derivatives.)
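A small NumPy illustration of the normal-equation solve; the data below is made up for this sketch, not the example from the slides:

```python
import numpy as np

# Hypothetical 1-D samples (x^(i), y^(i))
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([4.8, 6.7, 8.9, 10.7])

# Design matrix X: each row is [1, x^(i)], the 1 accounting for the intercept a0
X = np.column_stack([np.ones_like(x), x])

# Normal equations: a = (X^T X)^{-1} X^T y
a = np.linalg.solve(X.T @ X, X.T @ y)
# Numerically preferable equivalent: np.linalg.lstsq(X, y, rcond=None)
```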

Ordinary Least Squares


n  For our example problem we obtain a = [4.75; 2.00]

a0 + a1 x

Ordinary Least Squares

[Figure: samples (x_1, x_2, y) in 3-D with the fitted plane overlaid.]

More generally:
- f(x) = a_0 + a_1 x_1 + a_2 x_2 + ... + a_n x_n

n  In vector notation:


n  , gives:

n  Set gradient equal to zero to find extremum (exact same


derivation as two slides back):

Vector Valued Ordinary Least Squares Problems
- So far we have considered approximating a scalar valued function from
  samples {(x^(1), y^(1)), (x^(2), y^(2)), ..., (x^(m), y^(m))} with
  f(x) = a^T \bar{x}
- A vector valued function is just many scalar valued functions, and we
  can approximate it the same way by solving an OLS problem multiple
  times. Concretely, let y \in R^p and f(x) = A \bar{x}; then we have:

  \min_A \sum_{i=1}^{m} \| y^{(i)} - A \bar{x}^{(i)} \|_2^2

- In our vector notation, with the (y^{(i)})^T stacked as the rows of Y:

  \min_A \| Y - X A^T \|_F^2

- This can be solved by solving a separate ordinary least squares problem
  to find each row of A

Vector Valued Ordinary Least Squares Problems
- Solving the OLS problem for each row gives us:

  A = Y^T X (X^T X)^{-1}

- Each OLS problem has the same structure: every row of A shares the
  common factor (X^T X)^{-1}, which need only be computed once.
Vector Valued Ordinary Least Squares and EKF Linearization
- Approximate x_{t+1} = f_t(x_t, u_t) with an affine function a_0 + F_t x_t
  by running least squares on samples from the function:
  {(x_t^(1), y^(1) = f_t(x_t^(1), u_t)), (x_t^(2), y^(2) = f_t(x_t^(2), u_t)), ...,
   (x_t^(m), y^(m) = f_t(x_t^(m), u_t))}
- Similarly for the measurement model z_t = h_t(x_t); see the sketch below.

OLS and EKF Linearization: Sample Point Selection
- OLS vs. traditional (tangent) linearization:

[Figure: the OLS fit matches the function over a region, while the
traditional (tangent) linearization only matches it at the linearization
point.]

OLS Linearization: Choosing Sample Points
- Perhaps the most natural choice: the mean \mu_t together with points one
  standard deviation away along each principal axis, e.g. \mu_t \pm (\Sigma_t^{1/2})_i
- This is a reasonable way of trying to cover the region with reasonably
  high probability mass.

Analytical vs. Numerical Linearization
n  Numerical (based on least squares or finite differences) could
give a more accurate “regional” approximation. Size of
region determined by evaluation points.
n  Computational efficiency:
n  Analytical derivatives can be cheaper or more expensive
than function evaluations
n  Development hint:
n  Numerical derivatives tend to be easier to implement
n  If deciding to use analytical derivatives, implementing finite
difference derivative and comparing with analytical results
can help debugging the analytical derivatives

EKF Algorithm
n  At time 0:
n  For t = 1, 2, …
n  Dynamics update:

n  Measurement update:

EKF Summary
n  Highly efficient: Polynomial in measurement dimensionality k
and state dimensionality n:
O(k2.376 + n2)

n  Not optimal!


n  Can diverge if nonlinearities are large!
n  Works surprisingly well even when all assumptions are
violated!

34

Linearization via Unscented Transform

[Figure: side-by-side comparison of the EKF and UKF approximations.]

UKF Sigma-Point Estimate (2)

[Figure: side-by-side comparison of the EKF and UKF approximations.]

UKF Sigma-Point Estimate (3)

[Figure: side-by-side comparison of the EKF and UKF approximations.]

UKF Sigma-Point Estimate (4)

[Julier and Uhlmann, 1997]


UKF Intuition: Why It Can Perform Better
- Assume we know the distribution over X and it has a mean \bar{x}
- Y = f(X)
- The EKF approximates f by a first order expansion and ignores the
  higher-order terms
- The UKF uses f exactly, but approximates p(x).

Self-Quiz
- When would the UKF significantly outperform the EKF?

[Figure: a function y = f(x) that is flat at the mean of p(x) but varies
over the region of high probability mass.]

- Analytical derivatives, finite-difference derivatives, and least
  squares will all end up with a horizontal linearization
  ⇒ they'd predict zero variance in Y = f(X)

A crude preliminary investigation of whether we can get the EKF to match
the UKF by a particular choice of points used in the least squares
fitting. (Beyond the scope of the course, just included for completeness.)

Original Unscented Transform
- Picks a minimal set of sample points that match the 1st, 2nd and 3rd
  moments of a Gaussian:

  X^0 = \bar{x},                                  W^0 = \kappa / (n + \kappa)
  X^i = \bar{x} + (\sqrt{n + \kappa}\, L)_i,      W^i = 1 / (2(n + \kappa)),       i = 1, ..., n
  X^{i+n} = \bar{x} - (\sqrt{n + \kappa}\, L)_i,  W^{i+n} = 1 / (2(n + \kappa)),   i = 1, ..., n

- \bar{x} = mean, P_{xx} = covariance, i -> i-th column, x \in R^n
- \kappa: extra degree of freedom to fine-tune the higher order moments of
  the approximation; when x is Gaussian, n + \kappa = 3 is a suggested
  heuristic
- L = \sqrt{P_{xx}} can be chosen to be any matrix satisfying:
  L L^T = P_{xx}

[Julier and Uhlmann, 1997]
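A sketch of this sigma-point construction; folding the \sqrt{n + \kappa} factor into the Cholesky factor is an implementation choice of this sketch:

```python
import numpy as np

def sigma_points(x_bar, Pxx, kappa):
    """2n+1 sigma points and weights of the original unscented transform."""
    n = len(x_bar)
    # Any L with L @ L.T == (n + kappa) * Pxx works; Cholesky is one choice
    L = np.linalg.cholesky((n + kappa) * Pxx)
    pts = [x_bar]
    weights = [kappa / (n + kappa)]
    for i in range(n):
        pts.append(x_bar + L[:, i])
        pts.append(x_bar - L[:, i])
        weights.extend([1.0 / (2 * (n + kappa))] * 2)
    return np.array(pts), np.array(weights)   # weights sum to 1
```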

Unscented Kalman Filter
- Dynamics update:
  - Can simply use the unscented transform and estimate the mean and
    variance at the next time step from the sample points
- Observation update:
  - Use sigma points from the unscented transform to compute the
    covariance matrix between x_t and z_t. Then can do the standard
    update.
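A minimal sketch combining the two updates, reusing sigma_points() from the sketch above; this follows the structure described here, not a verbatim transcription of Table 3.4:

```python
import numpy as np

def ukf_step(mu, Sigma, u, z, f, h, Q, R, kappa=1.0):
    """One UKF step: unscented dynamics update, then observation update."""
    # Dynamics update: push sigma points through f, re-estimate the moments
    pts, w = sigma_points(mu, Sigma, kappa)
    Y = np.array([f(x, u) for x in pts])
    mu_bar = w @ Y
    Sigma_bar = (Y - mu_bar).T @ np.diag(w) @ (Y - mu_bar) + Q

    # Observation update: fresh sigma points around the predicted belief
    pts, w = sigma_points(mu_bar, Sigma_bar, kappa)
    Z = np.array([h(x) for x in pts])
    z_hat = w @ Z
    S = (Z - z_hat).T @ np.diag(w) @ (Z - z_hat) + R        # innovation cov.
    Sigma_xz = (pts - mu_bar).T @ np.diag(w) @ (Z - z_hat)  # cross-covariance
    K = Sigma_xz @ np.linalg.inv(S)                          # Kalman gain
    mu_new = mu_bar + K @ (z - z_hat)
    Sigma_new = Sigma_bar - K @ S @ K.T
    return mu_new, Sigma_new
```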

[Table 3.4 in Probabilistic Robotics]

UKF Summary
n  Highly efficient: Same complexity as EKF, with a constant factor
slower in typical practical applications
n  Better linearization than EKF: Accurate in first two terms of
Taylor expansion (EKF only first term) + capturing more
aspects of the higher order terms
n  Derivative-free: No Jacobians needed
n  Still not optimal!

