II. The Exponential Function: II.1. Smooth Functions Defined by Power Series
$(a, b)(a', b') := (aa', ab' + a'b)$. Then $TA$ is a unital Banach algebra.
Writing $\varepsilon := (0, 1)$, each element of $TA$ can be written in a unique fashion as $(a, b) = a + \varepsilon b$, and the multiplication satisfies
$$(a + \varepsilon b)(a' + \varepsilon b') = aa' + \varepsilon(ab' + a'b).$$
In particular, $\varepsilon^2 = 0$.
Proof. That $TA$ is a unital algebra is a trivial verification. That the norm is submultiplicative follows from
$$\|(a, b)(a', b')\| = \|aa'\| + \|ab' + a'b\| \leq \|a\|\,\|a'\| + \|a\|\,\|b'\| + \|a'\|\,\|b\| \leq (\|a\| + \|b\|)(\|a'\| + \|b'\|) = \|(a, b)\|\,\|(a', b')\|.$$
This proves that $(TA, \|\cdot\|)$ is a unital Banach algebra, the unit being $\mathbf{1} = (1, 0)$.
The completeness of $TA$ follows easily from the completeness of $A$ (Exercise).
Lemma II.1.5. Let $(c_n)_{n\in\mathbb{N}}$ be a sequence in $\mathbb{K}$ and $r > 0$ with $\sum_{n=0}^\infty |c_n| r^n < \infty$.
Further let $A$ be a finite-dimensional unital Banach algebra. Then
$$f \colon B_r(0) := \{x \in A \colon \|x\| < r\} \to A, \qquad x \mapsto \sum_{n=0}^\infty c_n x^n$$
defines a smooth function. Its derivative is given by
$$df(x) = \sum_{n=0}^\infty c_n\, dp_n(x),$$
where $p_n(x) = x^n$ is the $n$th power map, whose derivative is given by
$$dp_n(x)y = x^{n-1}y + x^{n-2}yx + \ldots + xyx^{n-2} + yx^{n-1}.$$
For $\|x\| < r$ and $y \in M_d(\mathbb{K})$ with $xy = yx$ we obtain in particular
$$dp_n(x)y = nx^{n-1}y \qquad\text{and}\qquad df(x)y = \sum_{n=1}^\infty c_n n x^{n-1}y.$$
Proof. We observe that the series defining $f(x)$ converges for $\|x\| < r$ by the Comparison Test (for series in Banach spaces). We shall prove by induction over $k \in \mathbb{N}$ that all such functions $f$ are $C^k$-functions.

Step 1: First we show that $f$ is a $C^1$-function. We define $\lambda_n \colon A \to A$ by
$$\lambda_n(h) := x^{n-1}h + x^{n-2}hx + \ldots + xhx^{n-2} + hx^{n-1}.$$
Then $\lambda_n$ is a continuous linear map with $\|\lambda_n\| \leq n\|x\|^{n-1}$. Furthermore,
$$p_n(x+h) = (x+h)^n = x^n + \lambda_n(h) + r_n(h),$$
where
$$\|r_n(h)\| \leq \binom{n}{2}\|h\|^2\|x\|^{n-2} + \binom{n}{3}\|h\|^3\|x\|^{n-3} + \ldots + \|h\|^n = \sum_{k\geq 2}\binom{n}{k}\|h\|^k\|x\|^{n-k}.$$
In particular $\lim_{h\to 0}\frac{\|r_n(h)\|}{\|h\|} = 0$, and therefore $p_n$ is differentiable in $x$ with $dp_n(x) = \lambda_n$. The series
$$\lambda(h) := \sum_{n=0}^\infty c_n \lambda_n(h)$$
converges absolutely in $\operatorname{End}(A)$ by the Ratio Test since $\|x\| < r$:
$$\sum_{n=0}^\infty |c_n|\,\|\lambda_n\| \leq \sum_{n=0}^\infty |c_n|\, n\, \|x\|^{n-1} < \infty.$$
We thus obtain for each $x$ with $\|x\| < r$ a continuous linear map $\lambda \in \operatorname{End}(A)$.
Now let $h$ satisfy $\|x\| + \|h\| < r$, i.e., $\|h\| < r - \|x\|$. Then
$$f(x+h) = f(x) + \lambda(h) + r(h), \qquad r(h) := \sum_{n=2}^\infty c_n r_n(h),$$
where
$$\|r(h)\| \leq \sum_{n=2}^\infty |c_n|\,\|r_n(h)\| \leq \sum_{n=2}^\infty |c_n| \sum_{k=2}^n \binom{n}{k}\|h\|^k\|x\|^{n-k} \leq \sum_{k=2}^\infty \Bigl(\sum_{n=k}^\infty |c_n|\binom{n}{k}\|x\|^{n-k}\Bigr)\|h\|^k < \infty$$
follows from $\|x\| + \|h\| < r$ because
$$\sum_{n \geq k \geq 0} |c_n|\binom{n}{k}\|x\|^{n-k}\|h\|^k = \sum_n |c_n|\,(\|x\| + \|h\|)^n \leq \sum_n |c_n| r^n < \infty.$$
30 II. The exponential function 22. April 2007
Therefore the continuity of real-valued functions represented by a power series yields
$$\lim_{h\to 0}\frac{\|r(h)\|}{\|h\|} = \sum_{k=2}^\infty \Bigl(\sum_{n=k}^\infty |c_n|\binom{n}{k}\|x\|^{n-k}\Bigr)\, 0^{k-1} = 0.$$
This proves that $f$ is a $C^1$-function with the required derivative.
Step 2: To complete our proof by induction, we now show that if all functions $f$ as above are $C^k$, then they are also $C^{k+1}$. In view of Step 1, this implies that they are smooth.
To set up the induction, we consider the Banach algebra $TA$ from Lemma II.1.4 and apply Step 1 to this algebra to obtain a smooth function
$$F \colon \{x + \varepsilon h \in TA \colon \|x + \varepsilon h\| = \|x\| + \|h\| < r\} \to TA, \qquad F(x + \varepsilon h) = \sum_{n=0}^\infty c_n (x + \varepsilon h)^n.$$
We further note that
$$(x + \varepsilon h)^n = x^n + \varepsilon\, dp_n(x)h,$$
because $\varepsilon^2 = 0$. This implies the formula
$$F(x + \varepsilon h) = f(x) + \varepsilon\, df(x)h,$$
i.e., the extension $F$ of $f$ to $TA$ describes the first order Taylor expansion of $f$ in each point $x \in A$. Our induction hypothesis implies that $F$ is a $C^k$-function.
Let $x_0 \in A$ with $\|x_0\| < r$ and pick a basis $h_1, \ldots, h_d$ of $A$ with $\|h_i\| < r - \|x_0\|$. Then all functions $x \mapsto df(x)h_i$ are defined and $C^k$ on a neighborhood of $x_0$, and this implies that the function
$$B_r(0) \to \operatorname{Hom}(A, A), \qquad x \mapsto df(x)$$
is $C^k$. This in turn implies that $f$ is $C^{k+1}$.
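As a numerical illustration of Lemma II.1.5 (a sketch in Python with NumPy; the test matrix and the choice $c_n = 1$, i.e. the geometric series $f(x) = \sum_n x^n = (\mathbf{1} - x)^{-1}$, are our own, not taken from the notes), a difference quotient of $f$ matches $df(x)y = \sum_n n x^{n-1}y = (\mathbf{1}-x)^{-2}y$ for commuting $y$:

```python
import numpy as np

def f(x, terms=200):
    """Partial sums of the geometric series sum_n x^n (valid for ||x|| < 1)."""
    acc = np.zeros_like(x)
    p = np.eye(x.shape[0])
    for _ in range(terms):
        acc = acc + p
        p = p @ x
    return acc

x = np.array([[0.1, 0.2], [0.0, 0.3]])
y = x @ x                      # y = x^2 commutes with x

# Series value agrees with the closed form (1 - x)^{-1}.
assert np.allclose(f(x), np.linalg.inv(np.eye(2) - x))

# Directional derivative via a small step matches (1 - x)^{-2} y.
h = 1e-6
numeric = (f(x + h * y) - f(x)) / h
exact = np.linalg.matrix_power(np.linalg.inv(np.eye(2) - x), 2) @ y
assert np.allclose(numeric, exact, atol=1e-4)
```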
The following proposition shows in particular that inserting elements of a Banach algebra into power series is compatible with composition.
Proposition II.1.6. (a) On the set $P_R$ of power series of the form
$$f(z) := \sum_{n=0}^\infty a_n z^n, \qquad a_n \in \mathbb{K},$$
converging on the open disc $B_R(0) := \{z \in \mathbb{K} \colon |z| < R\}$, we define for $r < R$:
$$\|f\|_r := \sum_{n=0}^\infty |a_n| r^n.$$
Then $\|\cdot\|_r$ is a norm with the following properties:
(1) $\|\cdot\|_r$ is submultiplicative: $\|fg\|_r \leq \|f\|_r \|g\|_r$.
(2) The polynomials $f_N(z) := \sum_{n=0}^N a_n z^n$ satisfy $\|f - f_N\|_r \to 0$.
(3) If $A \in M_n(\mathbb{K})$ satisfies $\|A\| < R$, then $f(A) := \sum_{n=0}^\infty a_n A^n$ converges. We further have
$$\|f(A)\| \leq \|f\|_r \qquad\text{for}\qquad \|A\| \leq r < R,$$
and for $f, g \in P_R$ we have
$$(fg)(A) = f(A)g(A).$$
(b) If $g \in P_S$ with $\|g\|_s < R$ for all $s < S$ and $f \in P_R$, then $f \circ g \in P_S$ defines an analytic function on the open disc of radius $S$, and for $A \in M_n(\mathbb{K})$ with $\|A\| < S$ we have $\|g(A)\| < R$ and the Composition Formula
$$(1.1) \qquad f(g(A)) = (f \circ g)(A).$$
Proof. (1) First we note that $P_R$ is the set of all power series $f(z) = \sum_{n=0}^\infty a_n z^n$ for which $\|f\|_r < \infty$ holds for all $r < R$. We leave the easy argument that $\|\cdot\|_r$ is a norm to the reader. If $\|f\|_r, \|g\|_r < \infty$ holds for $g(z) = \sum_{n=0}^\infty b_n z^n$, then the Cauchy Product Formula implies that
$$\|fg\|_r = \sum_{n=0}^\infty \Bigl|\sum_{k=0}^n a_k b_{n-k}\Bigr| r^n \leq \sum_{n=0}^\infty \sum_{k=0}^n |a_k|\,|b_{n-k}|\, r^k r^{n-k} = \|f\|_r \|g\|_r.$$
(2) follows immediately from $\|f - f_N\|_r = \sum_{n > N} |a_n| r^n \to 0$.
(3) The relation $\|f(A)\| \leq \|f\|_r$ follows from $\|a_n A^n\| \leq |a_n| r^n$ and the Comparison Test for absolutely convergent series in a Banach space. The relation $(fg)(A) = f(A)g(A)$ follows from the Cauchy Product Formula (Exercise II.1.3) because the series $f(A)$ and $g(A)$ converge absolutely.
(b) We may w.l.o.g. assume that $\mathbb{K} = \mathbb{C}$ because everything in the case $\mathbb{K} = \mathbb{R}$ can be obtained by restriction. Our assumption implies that $g(B_S(0)) \subseteq B_R(0)$, so that $f \circ g$ defines a holomorphic function on the open disc $B_S(0)$. For $s < S$ and $\|g\|_s < r < R$ we then derive
$$\|f \circ g\|_s = \Bigl\|\sum_{n=0}^\infty a_n g^n\Bigr\|_s \leq \sum_{n=0}^\infty |a_n|\,\|g\|_s^n \leq \|f\|_r.$$
For $s := \|A\|$ we obtain $\|g(A)\| \leq \|g\|_s < R$, so that $f(g(A))$ is defined.
For $s < r < R$ we then have
$$\|f(g(A)) - f_N(g(A))\| \leq \|f - f_N\|_r \to 0.$$
Likewise
$$\|(f \circ g)(A) - (f_N \circ g)(A)\| \leq \|(f \circ g) - (f_N \circ g)\|_s \leq \|f - f_N\|_r \to 0,$$
and we get
$$(f \circ g)(A) = \lim_{N\to\infty}(f_N \circ g)(A) = \lim_{N\to\infty} f_N(g(A)) = f(g(A))$$
because the Composition Formula trivially holds if $f$ is a polynomial.
Exercises for Section II.1

Exercise II.1.1. Let $X_1, \ldots, X_n$ be finite-dimensional normed spaces and $\beta \colon X_1 \times \ldots \times X_n \to Y$ an $n$-linear map.
(a) Show that $\beta$ is continuous. Hint: Choose a basis in each space $X_j$ and expand accordingly.
(b) Show that there exists a constant $C \geq 0$ with
$$\|\beta(x_1, \ldots, x_n)\| \leq C\|x_1\| \cdots \|x_n\| \qquad\text{for}\qquad x_i \in X_i.$$

Exercise II.1.2. Let $Y$ be a Banach space and $a_{n,m}$, $n, m \in \mathbb{N}$, elements in $Y$ with
$$\sum_{n,m} \|a_{n,m}\| := \sup_{N\in\mathbb{N}} \sum_{n,m \leq N} \|a_{n,m}\| < \infty.$$
(a) Show that
$$A := \sum_{n=1}^\infty \sum_{m=1}^\infty a_{n,m} = \sum_{m=1}^\infty \sum_{n=1}^\infty a_{n,m}$$
and that both iterated sums exist.
(b) Show that for each sequence $(S_n)_{n\in\mathbb{N}}$ of finite subsets $S_n \subseteq \mathbb{N} \times \mathbb{N}$, $n \in \mathbb{N}$, with $S_n \subseteq S_{n+1}$ and $\bigcup_n S_n = \mathbb{N} \times \mathbb{N}$, we have
$$A = \lim_{n\to\infty} \sum_{(j,k)\in S_n} a_{j,k}.$$

Exercise II.1.3. (Cauchy Product Formula) Let $X, Y, Z$ be Banach spaces and $\beta \colon X \times Y \to Z$ a continuous bilinear map. Suppose that $x := \sum_{n=0}^\infty x_n$ is absolutely convergent in $X$ and that $y := \sum_{n=0}^\infty y_n$ is absolutely convergent in $Y$. Then
$$\beta(x, y) = \sum_{n=0}^\infty \sum_{k=0}^n \beta(x_k, y_{n-k}).$$
Hint: Use Exercises II.1.1(b) and II.1.2(b).
II.2. Elementary properties of the exponential function

After the preparations of the preceding section, it is now easy to see that the matrix exponential function defines a smooth map on $M_n(\mathbb{K})$. In this section we describe some elementary properties of this function. As group theoretic consequences for $GL_n(\mathbb{K})$, we show that it has no small subgroups and that all one-parameter groups are smooth and given by the exponential function.
For $x \in M_n(\mathbb{K})$ we define
$$(2.1) \qquad e^x := \sum_{n=0}^\infty \frac{1}{n!}x^n.$$
The absolute convergence of the series on the right follows directly from the estimate
$$\sum_{n=0}^\infty \frac{1}{n!}\|x^n\| \leq \sum_{n=0}^\infty \frac{1}{n!}\|x\|^n = e^{\|x\|}$$
and the Comparison Test for absolute convergence of a series in a Banach space.
We define the exponential function of $M_n(\mathbb{K})$ by
$$\exp \colon M_n(\mathbb{K}) \to M_n(\mathbb{K}), \qquad \exp(x) := e^x.$$
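Numerically, the partial sums of (2.1) converge quickly; a minimal sketch (Python with NumPy/SciPy; `exp_series` is our own helper, and SciPy's `expm`, which uses a Padé approximation rather than the series, serves as reference):

```python
import numpy as np
from scipy.linalg import expm

def exp_series(x, terms=30):
    """Partial sums of e^x = sum_n x^n / n!."""
    acc = np.eye(x.shape[0])
    p = np.eye(x.shape[0])
    for n in range(1, terms):
        p = p @ x / n          # maintain p = x^n / n!
        acc = acc + p
    return acc

x = np.array([[0.0, -1.0], [1.0, 0.0]])
assert np.allclose(exp_series(x), expm(x))
```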
Proposition II.2.1. The exponential function $\exp \colon M_n(\mathbb{K}) \to M_n(\mathbb{K})$ is smooth. For $xy = yx$ we have
$$(2.2) \qquad d\exp(x)y = \exp(x)y = y\exp(x),$$
and in particular
$$d\exp(0) = \operatorname{id}_{M_n(\mathbb{K})}.$$
Proof. The smoothness follows from Lemma II.1.5. To verify the formula for the differential, we note that for $xy = yx$, Lemma II.1.5 implies that
$$d\exp(x)y = \sum_{n=1}^\infty \frac{1}{n!}nx^{n-1}y = \sum_{n=0}^\infty \frac{1}{n!}x^n y = \exp(x)y.$$
For $x = 0$, the relation $\exp(0) = \mathbf{1}$ now implies in particular that $d\exp(0)y = y$.
Lemma II.2.2. Let $x, y \in M_n(\mathbb{K})$.
(i) If $xy = yx$, then $\exp(x+y) = \exp x \exp y$.
(ii) $\exp(M_n(\mathbb{K})) \subseteq GL_n(\mathbb{K})$, $\exp(0) = \mathbf{1}$, and $(\exp x)^{-1} = \exp(-x)$.
(iii) For $g \in GL_n(\mathbb{K})$ we have $ge^x g^{-1} = e^{gxg^{-1}}$.
Proof. (i) Using the general form of the Cauchy Product Formula (Exercise II.1.3), we obtain
$$\exp(x+y) = \sum_{k=0}^\infty \frac{(x+y)^k}{k!} = \sum_{k=0}^\infty \frac{1}{k!}\sum_{\ell=0}^k \binom{k}{\ell}x^\ell y^{k-\ell} = \sum_{k=0}^\infty \sum_{\ell=0}^k \frac{x^\ell}{\ell!}\,\frac{y^{k-\ell}}{(k-\ell)!} = \Bigl(\sum_{p=0}^\infty \frac{x^p}{p!}\Bigr)\Bigl(\sum_{\ell=0}^\infty \frac{y^\ell}{\ell!}\Bigr).$$
(ii) From (i) we derive in particular $\exp x \exp(-x) = \exp 0 = \mathbf{1}$, which implies (ii).
(iii) is a consequence of $gx^n g^{-1} = (gxg^{-1})^n$ and the continuity of the conjugation map $c_g(x) := gxg^{-1}$ on $M_n(\mathbb{K})$.
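The three assertions are easy to illustrate numerically (a sketch with NumPy/SciPy; the diagonal test matrices, which commute trivially, and the random conjugating matrix are our own choices):

```python
import numpy as np
from scipy.linalg import expm

x = np.diag([0.5, -1.0, 2.0])
y = np.diag([1.0, 3.0, -0.5])   # diagonal matrices commute

# (i) exp(x + y) = exp(x) exp(y) for commuting x, y
assert np.allclose(expm(x + y), expm(x) @ expm(y))

# (ii) (exp x)^{-1} = exp(-x)
assert np.allclose(np.linalg.inv(expm(x)), expm(-x))

# (iii) g e^x g^{-1} = e^{g x g^{-1}}
rng = np.random.default_rng(0)
g = rng.normal(size=(3, 3)) + 3 * np.eye(3)   # generically invertible
assert np.allclose(g @ expm(x) @ np.linalg.inv(g),
                   expm(g @ x @ np.linalg.inv(g)))
```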
Remark II.2.3. (a) For $n = 1$, the exponential function
$$\exp \colon M_1(\mathbb{R}) \cong \mathbb{R} \to \mathbb{R}^\times \cong GL_1(\mathbb{R}), \qquad x \mapsto e^x,$$
is injective, but this is not the case for $n > 1$. In fact,
$$\exp\begin{pmatrix} 0 & -2\pi \\ 2\pi & 0 \end{pmatrix} = \mathbf{1}$$
follows from
$$\exp\begin{pmatrix} 0 & -t \\ t & 0 \end{pmatrix} = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}, \qquad t \in \mathbb{R}.$$
This example is nothing but the real picture of the relation $e^{2\pi i} = 1$.
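A sketch of the computation behind this remark (Python with SciPy's `expm`; the helper `rot_gen` is our own name for the rotation generator):

```python
import numpy as np
from scipy.linalg import expm

def rot_gen(t):
    """Generator of the rotation group: exp maps it to a rotation by t."""
    return np.array([[0.0, -t], [t, 0.0]])

t = 0.7
assert np.allclose(expm(rot_gen(t)),
                   [[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

# exp is not injective on M_2(R): a full turn maps back to the identity.
assert np.allclose(expm(rot_gen(2 * np.pi)), np.eye(2))
```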
Proposition II.2.4. There exists an open neighborhood $U$ of $0$ in $M_n(\mathbb{K})$ such that the map
$$\exp|_U \colon U \to GL_n(\mathbb{K})$$
is a diffeomorphism onto an open neighborhood of $\mathbf{1}$ in $GL_n(\mathbb{K})$.
Proof. We have already seen that $\exp$ is a smooth map and that $d\exp(0) = \operatorname{id}_{M_n(\mathbb{K})}$. Therefore the assertion follows from the Inverse Function Theorem.
If $U$ is as in Proposition II.2.4 and $V := \exp(U)$, we define
$$\log_V := (\exp|_U)^{-1} \colon V \to U \subseteq M_n(\mathbb{K}).$$
We shall see below why this function deserves to be called a logarithm function.
The following theorem means that the group $GL_n(\mathbb{K})$ contains no subgroups that are small in the sense that they lie arbitrarily close to the identity.
No Small Subgroup Theorem

Theorem II.2.5. There exists an open neighborhood $V$ of $\mathbf{1}$ in $GL_n(\mathbb{K})$ such that $\{\mathbf{1}\}$ is the only subgroup of $GL_n(\mathbb{K})$ contained in $V$.
Proof. Let $U$ be as in Proposition II.2.4 and assume furthermore that $U$ is convex and bounded. We set $U_1 := \frac{1}{2}U$. Let $G \subseteq V := \exp U_1$ be a subgroup of $GL_n(\mathbb{K})$ and $g \in G$. Then we write $g = \exp x$ with $x \in U_1$ and assume that $x \neq 0$. Let $k \in \mathbb{N}$ be maximal with $kx \in U_1$ (the existence of $k$ follows from the boundedness of $U$). Then
$$g^{k+1} = \exp(k+1)x \in G \subseteq V$$
implies the existence of $y \in U_1$ with $\exp(k+1)x = \exp y$. Since $\frac{k+1}{2}x \in [0, k]x \subseteq U_1$ by convexity, we have $(k+1)x \in 2U_1 = U$, and since $\exp|_U$ is injective, we obtain $(k+1)x = y \in U_1$, contradicting the maximality of $k$. Therefore $g = \mathbf{1}$.
A one-parameter (sub)group of a group $G$ is a group homomorphism $\gamma \colon (\mathbb{R}, +) \to G$. The following result describes all differentiable one-parameter subgroups of $GL_n(\mathbb{K})$.
One-parameter Group Theorem

Theorem II.2.6. For each $x \in M_d(\mathbb{K})$ the map
$$\gamma \colon (\mathbb{R}, +) \to GL_d(\mathbb{K}), \qquad t \mapsto \exp(tx),$$
is a smooth group homomorphism solving the initial value problem
$$\gamma(0) = \mathbf{1} \qquad\text{and}\qquad \gamma'(t) = \gamma(t)x.$$
Conversely, every differentiable one-parameter group of $GL_d(\mathbb{K})$ is of this form.
Proof. That $t \mapsto \exp(tx)$ is smooth and solves the initial value problem follows from Proposition II.2.1. If, conversely, $\gamma$ is a differentiable one-parameter group, then
$$\gamma'(t) = \lim_{s\to 0}\frac{\gamma(t+s) - \gamma(t)}{s} = \lim_{s\to 0}\gamma(t)\,\frac{\gamma(s) - \gamma(0)}{s} = \gamma(t)\gamma'(0),$$
so that with $x := \gamma'(0)$ the curve $\gamma$ solves the initial value problem above for $|t| \leq \varepsilon$. In particular, $\gamma$ is smooth. Hence $\gamma(t) = \exp(tx)$ for $|t| \leq \varepsilon$, but then $\gamma(nt) = \gamma(t)^n = \exp(ntx)$ for $n \in \mathbb{N}$ leads to $\gamma(t) = \exp(tx)$ for each $t \in \mathbb{R}$.
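Both the homomorphism property and the differential equation $\gamma'(t) = \gamma(t)x$ can be checked numerically; a sketch (NumPy/SciPy; the test matrix and step sizes are our own):

```python
import numpy as np
from scipy.linalg import expm

x = np.array([[0.2, 1.0], [0.0, -0.3]])
gamma = lambda t: expm(t * x)

# Homomorphism property gamma(s + t) = gamma(s) gamma(t)
s, t = 0.4, 1.3
assert np.allclose(gamma(s + t), gamma(s) @ gamma(t))

# gamma'(t) = gamma(t) x, via a central difference quotient
h = 1e-5
deriv = (gamma(t + h) - gamma(t - h)) / (2 * h)
assert np.allclose(deriv, gamma(t) @ x, atol=1e-6)
```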
Exercises for Section II.2

Exercise II.2.1. Let $D \in M_n(\mathbb{K})$ be a diagonal matrix. Calculate its operator norm.

Exercise II.2.2. If $A$ is a Banach algebra with unit element $\mathbf{1}$ and $g \in A$ satisfies $\|g - \mathbf{1}\| < 1$, then $g$ is invertible, i.e., there exists an element $h \in A$ with $hg = gh = \mathbf{1}$. Hint: For $x := \mathbf{1} - g$ the Neumann series $y := \sum_{n=0}^\infty x^n$ converges. Show that $y$ is an inverse of $g$.
Exercise II.2.3. (a) Calculate $e^{tN}$ for $t \in \mathbb{K}$ and the nilpotent Jordan block
$$N = \begin{pmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & 0 \end{pmatrix} \in M_n(\mathbb{K}),$$
i.e., the matrix with ones on the superdiagonal and zeros elsewhere.
(b) If $A$ is a block diagonal matrix $\operatorname{diag}(A_1, \ldots, A_k)$, then $e^A$ is the block diagonal matrix $\operatorname{diag}(e^{A_1}, \ldots, e^{A_k})$.
(c) Calculate $e^{tA}$ for a matrix $A \in M_n(\mathbb{C})$ given in Jordan Normal Form. Hint: Use (a) and (b).
Exercise II.2.4. Recall that a matrix $x$ is said to be nilpotent if $x^d = 0$ for some $d \in \mathbb{N}$, and $y$ is called unipotent if $y - \mathbf{1}$ is nilpotent.
Let $a, b \in M_n(\mathbb{K})$ be commuting matrices.
(a) If $a$ and $b$ are nilpotent, then $a + b$ is nilpotent.
(b) If $a$ and $b$ are diagonalizable, then $a + b$ and $ab$ are diagonalizable.
(c) If $a$ and $b$ are unipotent, then $ab$ is unipotent.
Exercise II.2.5. (Jordan decomposition)
(a) (Additive Jordan decomposition) Show that each complex matrix $X \in M_n(\mathbb{C})$ can be written in a unique fashion as
$$X = X_s + X_n \qquad\text{with}\qquad [X_s, X_n] = 0,$$
where $X_n$ is nilpotent and $X_s$ diagonalizable. Hint: Existence (Jordan normal form), Uniqueness (what can you say about nilpotent diagonalizable matrices?).
(b) (Multiplicative Jordan decomposition) Show that each invertible complex matrix $g \in GL_n(\mathbb{C})$ can be written in a unique fashion as
$$g = g_s g_u \qquad\text{with}\qquad g_s g_u = g_u g_s,$$
where $g_u$ is unipotent and $g_s$ diagonalizable. Hint: Existence: Put $g_u := \mathbf{1} + g_s^{-1}g_n$.
(c) If $X = X_s + X_n$ is the additive Jordan decomposition, then $e^X = e^{X_s}e^{X_n}$ is the multiplicative Jordan decomposition of $e^X$.
(d) $A \in M_n(\mathbb{C})$ commutes with a diagonalizable matrix $D$ if and only if $A$ preserves all eigenspaces of $D$.
(e) $A \in M_n(\mathbb{C})$ commutes with $X$ if and only if it commutes with $X_s$ and $X_n$. Hint: If $A$ commutes with $X$, it preserves the generalized eigenspaces of $X$ (verify this!), and this implies that it commutes with $X_s$, which is diagonalizable and whose eigenspaces are the generalized eigenspaces of $X$.
Exercise II.2.6. Let $A \in M_n(\mathbb{C})$. Show that the set
$$e^{\mathbb{R}A} = \{e^{tA} \colon t \in \mathbb{R}\}$$
is bounded in $M_n(\mathbb{C})$ if and only if $A$ is diagonalizable with purely imaginary eigenvalues. Hint: Choose a matrix $g \in GL_n(\mathbb{C})$ for which $A' := gAg^{-1}$ is in Jordan normal form.

II.3. The logarithm function

Lemma II.3.1. The series
$$\log(\mathbf{1} + x) := \sum_{k=1}^\infty (-1)^{k+1}\frac{x^k}{k}$$
converges for $x \in M_d(\mathbb{K})$ with $\|x\| < 1$ and defines a smooth function
$$\log \colon B_1(\mathbf{1}) \to M_d(\mathbb{K}).$$
For $\|x\| < 1$ and $y \in M_d(\mathbb{K})$ with $xy = yx$ we have
$$(d\log)(\mathbf{1} + x)y = (\mathbf{1} + x)^{-1}y.$$
Proof. The convergence follows from
$$\sum_{k=1}^\infty \Bigl|(-1)^{k+1}\Bigr|\frac{r^k}{k} = -\log(1 - r) < \infty$$
for $r < 1$, so that the smoothness follows from Lemma II.1.5.
If $x$ and $y$ commute, then the formula for the derivative in Lemma II.1.5 leads to
$$(d\log)(\mathbf{1} + x)y = \sum_{k=1}^\infty (-1)^{k+1}x^{k-1}y = (\mathbf{1} + x)^{-1}y$$
(here we used the Neumann series; cf. Exercise II.2.2).
Proposition II.3.2. (a) For $x \in M_d(\mathbb{K})$ with $\|x\| < \log 2$ we have
$$\log(\exp x) = x.$$
(b) For $g \in GL_d(\mathbb{K})$ with $\|g - \mathbf{1}\| < 1$ we have $\exp(\log g) = g$.
Proof. (a) We apply Proposition II.1.6 with $\exp \in P_S$, $S = \log 2$, $R = e^{\log 2} = 2$ and $\|\exp\|_s \leq e^s < e^S = 2$ for $s < S$. We thus obtain $\log(\exp x) = x$ for $\|x\| < \log 2$.
(b) Next we apply Proposition II.1.6 with $f = \exp$, $S = 1$ and $g(z) = \log(1 + z)$ to obtain $\exp(\log g) = g$.
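A numerical round trip for part (a) (a sketch; `log_series` implements the series from Lemma II.3.1, SciPy's `expm` supplies $\exp$, and the test matrix is our own):

```python
import numpy as np
from scipy.linalg import expm

def log_series(g, terms=60):
    """log g = sum_{k>=1} (-1)^{k+1} (g - 1)^k / k, valid for ||g - 1|| < 1."""
    n = g.shape[0]
    d = g - np.eye(n)
    acc = np.zeros_like(d)
    p = np.eye(n)
    for k in range(1, terms):
        p = p @ d
        acc = acc + (-1) ** (k + 1) * p / k
    return acc

x = np.array([[0.1, 0.3], [-0.2, 0.05]])
assert np.linalg.norm(x, 2) < np.log(2)     # hypothesis of II.3.2(a)
assert np.allclose(log_series(expm(x)), x)  # log(exp x) = x
```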
The exponential function on nilpotent matrices

Proposition II.3.3. Let
$$U := \{g \in GL_d(\mathbb{K}) \colon (g - \mathbf{1})^d = 0\}$$
be the set of unipotent matrices and
$$N := \{x \in M_d(\mathbb{K}) \colon x^d = 0\}$$
the set of nilpotent matrices. Then $U = \mathbf{1} + N$, and
$$\exp_N := \exp|_N \colon N \to U$$
is a homeomorphism whose inverse is given by
$$\log_U \colon g \mapsto \sum_{k=1}^\infty (-1)^{k+1}\frac{(g - \mathbf{1})^k}{k} = \sum_{k=1}^{d-1}(-1)^{k+1}\frac{(g - \mathbf{1})^k}{k}.$$
Proof. First we observe that for $x \in N$ we have
$$e^x - \mathbf{1} = xa \qquad\text{with}\qquad a := \sum_{n=1}^d \frac{1}{n!}x^{n-1}.$$
In view of $xa = ax$, this leads to $(e^x - \mathbf{1})^d = x^d a^d = 0$. Therefore $\exp_N(N) \subseteq U$.
Similarly we obtain for $g \in U$ that
$$\log_U(g) = (g - \mathbf{1})\sum_{k=1}^d (-1)^{k+1}\frac{(g - \mathbf{1})^{k-1}}{k} \in N.$$
For $x \in N$ the curve
$$F \colon \mathbb{R} \to M_d(\mathbb{K}), \qquad t \mapsto \log_U\bigl(\exp_N(tx)\bigr)$$
is a polynomial function, and Proposition II.3.2 implies that $F(t) = tx$ for $\|tx\| < \log 2$. This implies that $F(t) = tx$ for each $t \in \mathbb{R}$ and hence that
$$\log_U\bigl(\exp_N(x)\bigr) = F(1) = x.$$
Likewise we see that for $g = \mathbf{1} + x \in U$ the curve
$$G \colon \mathbb{R} \to M_d(\mathbb{K}), \qquad t \mapsto \exp_N\bigl(\log_U(\mathbf{1} + tx)\bigr)$$
is polynomial with $G(t) = \mathbf{1} + tx$ for $\|tx\| < 1$. Therefore $\exp_N\bigl(\log_U(g)\bigr) = G(1) = \mathbf{1} + x = g$. This proves that the functions $\exp_N$ and $\log_U$ are inverse to each other.
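On nilpotent and unipotent matrices both series terminate, so $\exp_N$ and $\log_U$ can be evaluated exactly as finite sums; a sketch (NumPy, with $d = 4$ and a single nilpotent Jordan block as our own test case):

```python
import numpy as np
from math import factorial

d = 4
N = np.diag([1.0, 1.0, 1.0], k=1)   # nilpotent: N^4 = 0

def exp_N(x):
    """exp on nilpotent matrices: the series stops after d terms."""
    return sum(np.linalg.matrix_power(x, n) / factorial(n) for n in range(d))

def log_U(g):
    """log on unipotent matrices: finite sum over k = 1, ..., d-1."""
    z = g - np.eye(d)
    return sum((-1) ** (k + 1) * np.linalg.matrix_power(z, k) / k
               for k in range(1, d))

assert np.allclose(np.linalg.matrix_power(N, d), 0)
g = exp_N(N)                      # unipotent: (g - 1)^4 = 0
assert np.allclose(log_U(g), N)   # exact inverse, no convergence issues
assert np.allclose(exp_N(log_U(g)), g)
```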
Corollary II.3.4. Let $X \in \operatorname{End}(V)$ be a nilpotent endomorphism of the $\mathbb{K}$-vector space $V$ and $v \in V$. Then the following are equivalent:
(1) $X.v = 0$.
(2) $e^X.v = v$.
Proof. Clearly $X.v = 0$ implies $e^X.v = \sum_{n=0}^\infty \frac{1}{n!}X^n.v = v$.
If, conversely, $e^X.v = v$, then
$$X.v = \log(e^X).v = \sum_{k=1}^\infty (-1)^{k+1}\frac{(e^X - \mathbf{1})^k}{k}.v = 0.$$
The exponential function on hermitian matrices

For the following proof we recall that for a hermitian $d \times d$-matrix $A$ we have
$$\|A\| = \max\{|\lambda| \colon \det(A - \lambda\mathbf{1}) = 0\}$$
(Exercise II.3.1).

Proposition II.3.5. The restriction
$$\exp_P := \exp|_{\operatorname{Herm}_d(\mathbb{K})} \colon \operatorname{Herm}_d(\mathbb{K}) \to \operatorname{Pd}_d(\mathbb{K})$$
is a diffeomorphism onto the open subset $\operatorname{Pd}_d(\mathbb{K})$ of positive definite matrices in $\operatorname{Herm}_d(\mathbb{K})$.
Proof. For $x^* = x$ we have $(e^x)^* = e^{x^*} = e^x$, and if $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of $x$, then $e^{\lambda_1}, \ldots, e^{\lambda_n}$ are the eigenvalues of $e^x$. Therefore $e^x$ is positive definite for each hermitian matrix $x$.
If, conversely, $g \in \operatorname{Pd}_d(\mathbb{K})$, then let $v_1, \ldots, v_n$ be an orthonormal basis of eigenvectors for $g$ with $g.v_j = \lambda_j v_j$. Then $\lambda_j > 0$ for each $j$, and we define $\log_H(g) \in \operatorname{Herm}_d(\mathbb{K})$ by $\log_H(g).v_j := (\log\lambda_j)v_j$, $j = 1, \ldots, n$. From this construction of the logarithm function it is clear that
$$\log_H \circ \exp_P = \operatorname{id}_{\operatorname{Herm}_d(\mathbb{K})} \qquad\text{and}\qquad \exp_P \circ \log_H = \operatorname{id}_{\operatorname{Pd}_d(\mathbb{K})}.$$
For two real numbers $x, y > 0$ we have
$$\log(xy) = \log x + \log y.$$
From this we obtain for $\lambda > 0$ the relation
$$(3.1) \qquad \log_H(\lambda g) = (\log\lambda)\mathbf{1} + \log_H(g)$$
by following what happens on each eigenspace of $g$.
The relation
$$\log(x) = \sum_{k=1}^\infty (-1)^{k+1}\frac{(x-1)^k}{k}$$
for $x \in \mathbb{R}$ with $|x - 1| < 1$ implies that for $\|g - \mathbf{1}\| < 1$ we have
$$\log_H(g) = \sum_{k=1}^\infty (-1)^{k+1}\frac{(g - \mathbf{1})^k}{k}.$$
This proves that $\log_H$ is smooth in $B_1(\mathbf{1}) \cap \operatorname{Herm}_d(\mathbb{K})$, hence in a neighborhood of $g_0$ if $\|g_0 - \mathbf{1}\| < 1$ (Lemma II.3.1), which means that for each eigenvalue $\mu$ of $g_0$ we have $|\mu - 1| < 1$ (Exercise II.2.1). If this condition is not satisfied, then we choose $\lambda > 0$ such that $\|\lambda g\| < 2$. Then $\|\lambda g - \mathbf{1}\| < 1$, and we obtain with (3.1) the formula
$$\log_H(g) = -(\log\lambda)\mathbf{1} + \log_H(\lambda g) = -(\log\lambda)\mathbf{1} + \sum_{k=1}^\infty (-1)^{k+1}\frac{(\lambda g - \mathbf{1})^k}{k}.$$
Therefore $\log_H$ is smooth on the whole open cone $\operatorname{Pd}_d(\mathbb{K})$, so that $\log_H = \exp_P^{-1}$ implies that $\exp_P$ is a diffeomorphism.
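The spectral construction of $\log_H$ in the proof translates directly into code; a sketch (NumPy/SciPy; the random symmetric matrix stands in for an element of $\operatorname{Herm}_3(\mathbb{R})$ and is our own choice):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
a = rng.normal(size=(3, 3))
x = (a + a.T) / 2                 # real symmetric, i.e. hermitian for K = R

g = expm(x)
# g is symmetric positive definite: its eigenvalues are e^{lambda_j} > 0
assert np.allclose(g, g.T)
assert np.all(np.linalg.eigvalsh(g) > 0)

# log_H via the spectral decomposition g.v_j = mu_j v_j
mu, v = np.linalg.eigh(g)
log_H = v @ np.diag(np.log(mu)) @ v.T
assert np.allclose(log_H, x)      # inverse of exp_P on Herm_d
```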
Corollary II.3.6. The group $GL_d(\mathbb{K})$ is homeomorphic to
$$U_d(\mathbb{K}) \times \mathbb{R}^{\dim_{\mathbb{R}}(\operatorname{Herm}_d(\mathbb{K}))}$$
with
$$\dim_{\mathbb{R}}(\operatorname{Herm}_d(\mathbb{K})) = \begin{cases} \frac{d(d+1)}{2} & \text{for } \mathbb{K} = \mathbb{R}, \\ d^2 & \text{for } \mathbb{K} = \mathbb{C}. \end{cases}$$
Exercises for Section II.3.

Exercise II.3.1. Show that for a hermitian matrix $A \in \operatorname{Herm}_n(\mathbb{K})$ and the euclidean norm on $\mathbb{K}^n$ we have
$$\|A\| := \sup\{\|Ax\| \colon \|x\| \leq 1\} = \max\{|\lambda| \colon \det(A - \lambda\mathbf{1}) = 0\}.$$
Hint: Write $x \in \mathbb{K}^n$ as a sum $x = \sum_j x_j$, where $Ax_j = \lambda_j x_j$, and calculate $\|Ax\|^2$ in these terms.
Exercise II.3.2. The exponential function
$$\exp \colon M_n(\mathbb{C}) \to GL_n(\mathbb{C})$$
is surjective. Hint: Use the multiplicative Jordan decomposition: Each $g \in GL_n(\mathbb{C})$ can be written in a unique way as $g = du$ with $d$ diagonalizable and $u$ unipotent with $du = ud$; see also Proposition II.3.3.
II.4. The Baker-Campbell-Hausdorff-Dynkin Formula

In this section we derive a formula which expresses the product $\exp x \exp y$ of two sufficiently small matrices $x, y$ as the exponential image $\exp(x * y)$ of an element $x * y$ which can be described in terms of Lie brackets. The (local) multiplication $*$ is called the Baker-Campbell-Hausdorff Multiplication, and the explicit series describing this product the Dynkin series.
The discussion of $x * y$ requires some preparation. We start with the adjoint representation of $GL_n(\mathbb{K})$. This is the group homomorphism
$$\operatorname{Ad} \colon GL_n(\mathbb{K}) \to \operatorname{Aut}(M_n(\mathbb{K})), \qquad \operatorname{Ad}(g)x := gxg^{-1},$$
where $\operatorname{Aut}(M_n(\mathbb{K}))$ is the group of algebra automorphisms of $M_n(\mathbb{K})$. For $x \in M_n(\mathbb{K})$ we further define a linear map
$$\operatorname{ad}(x) \colon M_n(\mathbb{K}) \to M_n(\mathbb{K}), \qquad \operatorname{ad} x(y) := [x, y].$$
Lemma II.4.1. For each $x \in M_n(\mathbb{K})$ we have
$$\operatorname{Ad}(\exp x) = \exp(\operatorname{ad} x).$$
Proof. We define the linear maps
$$L_x \colon M_n(\mathbb{K}) \to M_n(\mathbb{K}), \qquad y \mapsto xy,$$
and
$$R_x \colon M_n(\mathbb{K}) \to M_n(\mathbb{K}), \qquad y \mapsto yx.$$
Then $L_x R_x = R_x L_x$ and $\operatorname{ad} x = L_x - R_x$. Therefore Lemma II.2.2(i) leads to
$$\operatorname{Ad}(\exp x)y = e^x y e^{-x} = e^{L_x}e^{-R_x}y = e^{L_x - R_x}y = e^{\operatorname{ad} x}y.$$
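Lemma II.4.1 can be tested by realizing $\operatorname{ad} x = L_x - R_x$ as an $n^2 \times n^2$ matrix acting on the flattened $y$; a sketch (NumPy/SciPy; with NumPy's row-major `reshape`, the map $y \mapsto xy$ becomes $x \otimes \mathbf{1}$ and $y \mapsto yx$ becomes $\mathbf{1} \otimes x^\top$; this vec convention and the random matrices are our own choices):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
n = 3
x = rng.normal(size=(n, n))
y = rng.normal(size=(n, n))

# ad x = L_x - R_x acting on vec(y), row-major (C order) convention
ad_x = np.kron(x, np.eye(n)) - np.kron(np.eye(n), x.T)

lhs = expm(x) @ y @ np.linalg.inv(expm(x))        # Ad(exp x) y
rhs = (expm(ad_x) @ y.reshape(-1)).reshape(n, n)  # exp(ad x) y
assert np.allclose(lhs, rhs)
```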
The differential of the exponential function

Proposition II.4.2. Let $x \in M_n(\mathbb{K})$ and $\lambda_{\exp x}(y) := (\exp x)y$ the left multiplication by $\exp x$. Then
$$d\exp(x) = \lambda_{\exp x} \circ \frac{\mathbf{1} - e^{-\operatorname{ad} x}}{\operatorname{ad} x} \colon M_n(\mathbb{K}) \to M_n(\mathbb{K}),$$
where the fraction on the right means $\Phi(\operatorname{ad} x)$ for the entire function
$$\Phi(z) := \frac{1 - e^{-z}}{z} = \sum_{k=1}^\infty \frac{(-z)^{k-1}}{k!}.$$
The series $\Phi(x)$ converges for each $x \in M_n(\mathbb{K})$.
Proof. First let $\gamma \colon [0, 1] \to M_n(\mathbb{K})$ be a smooth curve. Then
$$\Gamma(t, s) := \exp(-s\gamma(t))\,\frac{d}{dt}\exp(s\gamma(t))$$
defines a map $[0, 1]^2 \to M_n(\mathbb{K})$ which is $C^1$ in each argument. We calculate
$$\frac{\partial\Gamma}{\partial s}(t, s) = -\exp(-s\gamma(t))\,\gamma(t)\,\frac{d}{dt}\exp(s\gamma(t)) + \exp(-s\gamma(t))\,\frac{d}{dt}\bigl(\gamma(t)\exp(s\gamma(t))\bigr)$$
$$= -\exp(-s\gamma(t))\,\gamma(t)\,\frac{d}{dt}\exp(s\gamma(t)) + \exp(-s\gamma(t))\,\gamma'(t)\exp(s\gamma(t)) + \exp(-s\gamma(t))\,\gamma(t)\,\frac{d}{dt}\exp(s\gamma(t))$$
$$= e^{-s\operatorname{ad}\gamma(t)}\gamma'(t),$$
where we used Lemma II.4.1 in the last step. Integration over $[0, 1]$ with respect to $s$ now leads to
$$\Gamma(t, 1) = \Gamma(t, 0) + \int_0^1 e^{-s\operatorname{ad}\gamma(t)}\gamma'(t)\,ds = \int_0^1 e^{-s\operatorname{ad}\gamma(t)}\gamma'(t)\,ds.$$
Next we note that for $x \in M_n(\mathbb{K})$ we have
$$\int_0^1 e^{-s\operatorname{ad} x}\,ds = \int_0^1 \sum_{k=0}^\infty \frac{(-\operatorname{ad} x)^k}{k!}s^k\,ds = \sum_{k=0}^\infty (-\operatorname{ad} x)^k \int_0^1 \frac{s^k}{k!}\,ds = \sum_{k=0}^\infty \frac{(-\operatorname{ad} x)^k}{(k+1)!} = \Phi(\operatorname{ad} x).$$
We thus obtain for $\gamma(t) = x + ty$ the relation
$$\Gamma(0, 1) = \exp(-x)\,d\exp(x)y = \int_0^1 e^{-s\operatorname{ad} x}y\,ds = \Phi(\operatorname{ad} x)y.$$
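Proposition II.4.2 is easy to test numerically: build $\operatorname{ad} x$ as a matrix, sum the series for $\Phi$, and compare with a difference quotient of $\exp$; a sketch (NumPy/SciPy; the row-major vec convention, the truncation at 25 terms, and the random test matrices are our own choices):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
n = 3
x = 0.3 * rng.normal(size=(n, n))
y = rng.normal(size=(n, n))

# ad x on vec(y) in row-major convention: kron(x, I) - kron(I, x^T)
ad_x = np.kron(x, np.eye(n)) - np.kron(np.eye(n), x.T)

# Phi(ad x) = sum_{k>=1} (-ad x)^{k-1} / k!
phi = np.zeros((n * n, n * n))
term = np.eye(n * n)
fact = 1.0
for k in range(1, 25):
    fact *= k                      # fact = k!
    phi += term / fact
    term = term @ (-ad_x)

dexp_y = expm(x) @ (phi @ y.reshape(-1)).reshape(n, n)

# Compare with a central difference quotient of exp at x in direction y.
h = 1e-6
numeric = (expm(x + h * y) - expm(x - h * y)) / (2 * h)
assert np.allclose(dexp_y, numeric, atol=1e-6)
```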
Lemma II.4.3. For
$$\Phi(z) = \frac{1 - e^{-z}}{z} := \sum_{k=1}^\infty (-1)^{k-1}\frac{z^{k-1}}{k!}, \qquad z \in \mathbb{C},$$
and
$$\Psi(z) = \frac{z\log z}{z - 1} := z\sum_{k=1}^\infty \frac{(-1)^{k-1}}{k}(z-1)^{k-1} = z\sum_{k=0}^\infty \frac{(-1)^k}{k+1}(z-1)^k \qquad\text{for}\qquad |z - 1| < 1,$$
we have
$$\Psi(e^z)\Phi(z) = 1 \qquad\text{for}\qquad z \in \mathbb{C},\ |z| < \log 2.$$
Proof. If $|z| < \log 2$, then $|e^z - 1| < 1$, and we obtain from $\log(e^z) = z$:
$$\Psi(e^z)\Phi(z) = \frac{e^z z}{e^z - 1}\cdot\frac{1 - e^{-z}}{z} = \frac{e^z - 1}{e^z - 1} = 1.$$
In view of the Composition Formula (1.1) (Proposition II.1.6), the same identity as in Lemma II.4.3 holds if we insert linear maps $L \in \operatorname{End}(M_n(\mathbb{K}))$ with $\|L\| < \log 2$ into the power series $\Phi$ and $\Psi$:
$$\Psi(\exp L)\Phi(L) = (\Psi \circ \exp)(L)\,\Phi(L) = \bigl((\Psi \circ \exp)\cdot\Phi\bigr)(L) = \operatorname{id}_{M_n(\mathbb{K})}.$$
Here we use that $\|L\| < \log 2$ implies that all expressions are defined, and in particular that $\|\exp L - \mathbf{1}\| < 1$, as a consequence of the estimate
$$(4.1) \qquad \|\exp L - \mathbf{1}\| \leq e^{\|L\|} - 1.$$
The derivation of the BCH formula follows a similar scheme as the proof of Proposition II.4.2. Here we consider $x, y \in V_o := B(0, \log\sqrt{2})$. For $\|x\|, \|y\| < \log\sqrt{2} = \frac{1}{2}\log 2$ and $|t| \leq 1$ we obtain in particular
$$\|\exp x \exp ty - \mathbf{1}\| \leq e^{\|x\| + \|y\|} - 1 < e^{\log 2} - 1 = 1.$$
Therefore $\exp x \exp ty$ lies for $|t| \leq 1$ in the domain of the logarithm function (Lemma II.3.1). We therefore define for $t \in [-1, 1]$:
$$F(t) := \log(\exp x \exp ty).$$
To estimate the norm of $F(t)$, we note that for $g := \exp x \exp ty$, $|t| \leq 1$, and $\|x\|, \|y\| < r$ we have
$$\|\log g\| \leq \sum_{k=1}^\infty \frac{\|g - \mathbf{1}\|^k}{k} = -\log\bigl(1 - \|g - \mathbf{1}\|\bigr) < -\log\bigl(1 - (e^{2r} - 1)\bigr) = -\log(2 - e^{2r}).$$
For $r := \frac{1}{2}\log\bigl(2 - \frac{\sqrt{2}}{2}\bigr) < \frac{\log 2}{2} = \log\sqrt{2}$ we obtain in particular
$$\|F(t)\| < -\log(2 - e^{2r}) = -\log\Bigl(\frac{\sqrt{2}}{2}\Bigr) = \log\sqrt{2}.$$
Next we calculate
$$\frac{d}{dt}\exp(F(t)) = \frac{d}{dt}(\exp x \exp ty) = (\exp x \exp ty)\,y = (\exp F(t))\,y.$$
Using Proposition II.4.2 again, we obtain
$$(4.3) \qquad y = \bigl(\exp F(t)\bigr)^{-1}(d\exp)\bigl(F(t)\bigr)F'(t) = \frac{\mathbf{1} - e^{-\operatorname{ad} F(t)}}{\operatorname{ad} F(t)}F'(t).$$
The next step is to rewrite (4.3) with the function $\Phi$ from Lemma II.4.3 as
$$\Phi\bigl(\operatorname{ad} F(t)\bigr)F'(t) = y.$$
We claim that $\|\operatorname{ad}(F(t))\| < \log 2$. From $\|ab - ba\| \leq 2\|a\|\,\|b\|$ we derive $\|\operatorname{ad} a\| \leq 2\|a\|$ for $a \in M_n(\mathbb{K})$. Therefore
$$\|\operatorname{ad} F(t)\| \leq 2\|F(t)\| < 2\log\sqrt{2} = \log 2,$$
so that the discussion above and (4.3) lead to
$$(4.4) \qquad F'(t) = \Psi\bigl(\exp(\operatorname{ad} F(t))\bigr)y.$$
Proposition II.4.4. For $x, y \in M_n(\mathbb{K})$ with $\|x\|, \|y\| < \frac{1}{2}\log\bigl(2 - \frac{\sqrt{2}}{2}\bigr)$ we have
$$\log(\exp x \exp y) = x + \int_0^1 \Psi\bigl(\exp(\operatorname{ad} x)\exp(t\operatorname{ad} y)\bigr)y\,dt,$$
with $\Psi$ as in Lemma II.4.3.
Proof. With (4.4) and the preceding remarks we get
$$F'(t) = \Psi\bigl(\exp(\operatorname{ad} F(t))\bigr)y = \Psi\bigl(\operatorname{Ad}(\exp F(t))\bigr)y = \Psi\bigl(\operatorname{Ad}(\exp x \exp ty)\bigr)y$$
$$= \Psi\bigl(\operatorname{Ad}(\exp x)\operatorname{Ad}(\exp ty)\bigr)y = \Psi\bigl(\exp(\operatorname{ad} x)\exp(t\operatorname{ad} y)\bigr)y.$$
Moreover, we have
$$F(0) = \log(\exp x) = x.$$
By integration we therefore obtain the formula.
Proposition II.4.5. For $x, y \in M_n(\mathbb{K})$ with $\|x\|, \|y\| < \frac{1}{2}\log\bigl(2 - \frac{\sqrt{2}}{2}\bigr)$ we have
$$x * y := \log(\exp x \exp y) = x + \sum_{\substack{k, m \geq 0 \\ p_i + q_i > 0}} \frac{(-1)^k}{(k+1)(q_1 + \ldots + q_k + 1)}\,\frac{(\operatorname{ad} x)^{p_1}(\operatorname{ad} y)^{q_1}\cdots(\operatorname{ad} x)^{p_k}(\operatorname{ad} y)^{q_k}(\operatorname{ad} x)^m}{p_1!q_1!\cdots p_k!q_k!m!}\,y.$$
Proof. We only have to rewrite the expression in Proposition II.4.4:
$$\int_0^1 \Psi\bigl(\exp(\operatorname{ad} x)\exp(t\operatorname{ad} y)\bigr)y\,dt$$
$$= \int_0^1 \sum_{k=0}^\infty \frac{(-1)^k}{k+1}\bigl(\exp(\operatorname{ad} x)\exp(t\operatorname{ad} y) - \operatorname{id}\bigr)^k \exp(\operatorname{ad} x)\exp(t\operatorname{ad} y)\,y\,dt$$
$$= \int_0^1 \sum_{\substack{k \geq 0 \\ p_i + q_i > 0}} \frac{(-1)^k}{k+1}\,\frac{(\operatorname{ad} x)^{p_1}(t\operatorname{ad} y)^{q_1}\cdots(\operatorname{ad} x)^{p_k}(t\operatorname{ad} y)^{q_k}}{p_1!q_1!\cdots p_k!q_k!}\,\exp(\operatorname{ad} x)y\,dt$$
(using $\exp(t\operatorname{ad} y)y = y$)
$$= \sum_{\substack{k, m \geq 0 \\ p_i + q_i > 0}} \frac{(-1)^k}{k+1}\,\frac{(\operatorname{ad} x)^{p_1}(\operatorname{ad} y)^{q_1}\cdots(\operatorname{ad} x)^{p_k}(\operatorname{ad} y)^{q_k}(\operatorname{ad} x)^m}{p_1!q_1!\cdots p_k!q_k!m!}\,y\int_0^1 t^{q_1 + \ldots + q_k}\,dt$$
$$= \sum_{\substack{k, m \geq 0 \\ p_i + q_i > 0}} \frac{(-1)^k\,(\operatorname{ad} x)^{p_1}(\operatorname{ad} y)^{q_1}\cdots(\operatorname{ad} x)^{p_k}(\operatorname{ad} y)^{q_k}(\operatorname{ad} x)^m\,y}{(k+1)(q_1 + \ldots + q_k + 1)\,p_1!q_1!\cdots p_k!q_k!m!}.$$
The power series in Proposition II.4.5 is called the Dynkin Series. We observe that it does not depend on the size $n$ of the matrices we consider. For practical purposes it often suffices to know the first terms of the Dynkin series:

Corollary II.4.6. Let $x, y \in M_n(\mathbb{K})$ with $\|x\|, \|y\| < \frac{1}{2}\log\bigl(2 - \frac{\sqrt{2}}{2}\bigr)$. Then we have
$$x * y = x + y + \frac{1}{2}[x, y] + \frac{1}{12}[x, [x, y]] + \frac{1}{12}[y, [y, x]] + \ldots$$
Proof. One has to collect the summands in Proposition II.4.5 corresponding to $p_1 + q_1 + \ldots + p_k + q_k + m \leq 2$.
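Truncating the Dynkin series after the terms shown in Corollary II.4.6 leaves an error of total degree at least four in $(x, y)$; a numerical sketch (NumPy/SciPy; `logm` computes the matrix logarithm, and the scaling and tolerance are our own estimates):

```python
import numpy as np
from scipy.linalg import expm, logm

def bch2(x, y):
    """Dynkin series through the terms listed in Corollary II.4.6."""
    b = lambda a, c: a @ c - c @ a   # commutator bracket [a, c]
    return x + y + b(x, y) / 2 + b(x, b(x, y)) / 12 + b(y, b(y, x)) / 12

rng = np.random.default_rng(4)
t = 1e-2
x = t * rng.normal(size=(3, 3))
y = t * rng.normal(size=(3, 3))

exact = logm(expm(x) @ expm(y))
# The omitted terms have total degree >= 4, so the error is O(t^4).
assert np.allclose(bch2(x, y), exact, atol=1e-6)
```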
Product and Commutator Formula

We have seen in Lemma II.1.1 that the exponential image of a sum $x + y$ can be computed easily if $x$ and $y$ commute. In this case we also have for the commutator $[x, y] := xy - yx = 0$ the formula $\exp[x, y] = \mathbf{1}$. The following proposition gives a formula for $\exp(x + y)$ and $\exp([x, y])$ in the general case.
If $g, h$ are elements of a group $G$, then $(g, h) := ghg^{-1}h^{-1}$ is called their commutator. On the other hand, we call for two matrices $A, B \in M_n(\mathbb{K})$ the expression
$$[A, B] := AB - BA$$
their commutator bracket.
Proposition II.4.7. For $x, y \in M_d(\mathbb{K})$ the following assertions hold:
(i) (Trotter Product Formula)
$$\lim_{k\to\infty}\Bigl(e^{\frac{1}{k}x}e^{\frac{1}{k}y}\Bigr)^k = e^{x+y}.$$
(ii) (Commutator Formula)
$$\lim_{k\to\infty}\Bigl(e^{\frac{1}{k}x}e^{\frac{1}{k}y}e^{-\frac{1}{k}x}e^{-\frac{1}{k}y}\Bigr)^{k^2} = e^{xy - yx}.$$
Proof. (i) From Corollary II.4.6 we derive
$$(4.5) \qquad \lim_{k\to\infty}k\Bigl(\frac{x}{k} * \frac{y}{k}\Bigr) = x + y.$$
Applying the exponential function, we obtain (i).
(ii) We consider the function
$$\gamma(t) := tx * ty * (-tx) * (-ty),$$
which is defined for sufficiently small $t \in \mathbb{R}$ and smooth. In view of
$$\exp\bigl(x * y * (-x)\bigr) = \exp x \exp y \exp(-x) = \exp\bigl(\operatorname{Ad}(\exp x)y\bigr) = \exp\bigl(e^{\operatorname{ad} x}y\bigr)$$
(Lemma II.4.1), we have
$$(4.6) \qquad x * y * (-x) = e^{\operatorname{ad} x}y,$$
and therefore the Chain Rule for Taylor Polynomials yields
$$\gamma(t) = tx * ty * (-tx) * (-ty) = \bigl(e^{t\operatorname{ad} x}ty\bigr) * (-ty) = \Bigl(ty + t^2[x, y] + \frac{t^3}{2}[x, [x, y]] + \ldots\Bigr) * (-ty)$$
$$= ty + t^2[x, y] - ty + \frac{1}{2}[ty, -ty] + t^2 r(t) = t^2[x, y] + t^2 r(t),$$
where $\lim_{t\to 0} r(t) = 0$. We therefore have
$$\gamma(0) = \gamma'(0) = 0 \qquad\text{and}\qquad \frac{\gamma''(0)}{2} = [x, y].$$
This leads to
$$(4.7) \qquad \lim_{k\to\infty}k^2\Bigl(\Bigl(\frac{1}{k}x\Bigr) * \Bigl(\frac{1}{k}y\Bigr) * \Bigl(-\frac{1}{k}x\Bigr) * \Bigl(-\frac{1}{k}y\Bigr)\Bigr) = \frac{\gamma''(0)}{2} = [x, y].$$
Applying $\exp$ leads to the Commutator Formula.
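The Trotter Product Formula converges roughly like $1/k$, which is easy to observe numerically; a sketch (NumPy/SciPy; the scaling of the random matrices and the tolerances are our own choices):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(5)
x = 0.3 * rng.normal(size=(3, 3))
y = 0.3 * rng.normal(size=(3, 3))

def trotter(k):
    """k-fold product (e^{x/k} e^{y/k})^k."""
    return np.linalg.matrix_power(expm(x / k) @ expm(y / k), k)

err = lambda k: np.linalg.norm(trotter(k) - expm(x + y))
assert err(256) < err(16) < err(2)   # error decays roughly like 1/k
assert err(1024) < 1e-2
```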
Exercises for Section II.4.

Exercise II.4.1. If $(V, \cdot)$ is an associative algebra, then we have $\operatorname{Aut}(V, \cdot) \subseteq \operatorname{Aut}(V, [\cdot, \cdot])$.

Exercise II.4.2. (a) $\operatorname{Ad} \colon GL_n(\mathbb{K}) \to \operatorname{Aut}\bigl(M_n(\mathbb{K})\bigr)$ is a group homomorphism.
(b) For each Lie algebra $\mathfrak{g}$ the map $\operatorname{ad} \colon \mathfrak{g} \to \operatorname{der}(\mathfrak{g})$, $\operatorname{ad} x(y) := [x, y]$, is a homomorphism of Lie algebras.

Exercise II.4.3. Let $V$ be a finite-dimensional vector space, $F \subseteq V$ a subspace and $\gamma \colon [0, T] \to V$ a continuous curve with $\gamma([0, T]) \subseteq F$. Then for all $t \in [0, T]$:
$$I_t := \int_0^t \gamma(\tau)\,d\tau \in F.$$
Hint: Use the linearity of the integral to see that every linear functional vanishing on $F$ vanishes on $I_t$. Why does this imply the assertion?
Exercise II.4.4. On each finite-dimensional Lie algebra $\mathfrak{g}$ there exists a norm with
$$\|[x, y]\| \leq \|x\|\,\|y\|, \qquad x, y \in \mathfrak{g},$$
i.e., $\|\operatorname{ad} x\| \leq \|x\|$. Hint: If $\|\cdot\|_1$ is any norm on $\mathfrak{g}$, then the continuity of the bracket implies that $\|[x, y]\|_1 \leq C\|x\|_1\|y\|_1$. Modify $\|\cdot\|_1$ to obtain $\|\cdot\|$.
Exercise II.4.5. Let $\mathfrak{g}$ be a Lie algebra with a norm as in Exercise II.4.4. Then for $\|x\| + \|y\| < \ln 2$ the Dynkin series
$$x * y = x + \sum_{\substack{k, m \geq 0 \\ p_i + q_i > 0}} \frac{(-1)^k}{(k+1)(q_1 + \ldots + q_k + 1)}\,\frac{(\operatorname{ad} x)^{p_1}(\operatorname{ad} y)^{q_1}\cdots(\operatorname{ad} x)^{p_k}(\operatorname{ad} y)^{q_k}(\operatorname{ad} x)^m}{p_1!q_1!\cdots p_k!q_k!m!}\,y$$
converges absolutely. Hint: Show that
$$\|x * y\| \leq \|x\| + e^{\|x\|}\|y\|\sum_{k\geq 0}\frac{1}{k+1}\bigl(e^{\|x\|+\|y\|} - 1\bigr)^k.$$
Exercise II.4.6. Prove Corollary II.4.6.
Exercise II.4.7. Let $V$ and $W$ be vector spaces and $q \colon V \times V \to W$ a skew-symmetric bilinear map. Then
$$[(v, w), (v', w')] := \bigl(0, q(v, v')\bigr)$$
is a Lie bracket on $\mathfrak{g} := V \oplus W$. For $x, y, z \in \mathfrak{g}$ we have $\bigl[x, [y, z]\bigr] = 0$.
Exercise II.4.8. Let $\mathfrak{g}$ be a Lie algebra with $\bigl[x, [y, z]\bigr] = 0$ for $x, y, z \in \mathfrak{g}$. Then
$$x * y := x + y + \frac{1}{2}[x, y]$$
defines a group structure on $\mathfrak{g}$. An example of such a Lie algebra is the three-dimensional Heisenberg algebra
$$\mathfrak{g} = \left\{\begin{pmatrix} 0 & x & y \\ 0 & 0 & z \\ 0 & 0 & 0 \end{pmatrix} \colon x, y, z \in \mathbb{K}\right\}.$$