
MAT067 University of California, Davis Winter 2007

Inner Product Spaces


Isaiah Lankham, Bruno Nachtergaele, Anne Schilling
(March 2, 2007)

The abstract definition of a vector space only takes into account algebraic properties of the
addition and scalar multiplication of vectors. For vectors in $\mathbb{R}^n$, for example, we also
have geometric intuition involving the length of vectors and the angles between them. In
this section we discuss inner product spaces, which are vector spaces with an inner product
defined on them. The inner product allows us to introduce the notion of the length (or norm)
of a vector and concepts such as orthogonality.

1 Inner product
In this section, $V$ is a finite-dimensional, nonzero vector space over $\mathbb{F}$.
Definition 1. An inner product on $V$ is a map
$$\langle \cdot\,, \cdot \rangle : V \times V \to \mathbb{F}, \qquad (u, v) \mapsto \langle u, v \rangle$$

with the following properties:


1. Linearity in the first slot: $\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle$ for all $u, v, w \in V$, and $\langle au, v \rangle = a\langle u, v \rangle$;

2. Positivity: $\langle v, v \rangle \ge 0$ for all $v \in V$;

3. Positive definiteness: $\langle v, v \rangle = 0$ if and only if $v = 0$;

4. Conjugate symmetry: $\langle u, v \rangle = \overline{\langle v, u \rangle}$ for all $u, v \in V$.


Remark 1. Recall that every real number x ∈ R equals its complex conjugate. Hence for
real vector spaces the condition about conjugate symmetry becomes symmetry.
Definition 2. An inner product space is a vector space over $\mathbb{F}$ together with an inner
product $\langle \cdot\,, \cdot \rangle$.
Copyright © 2007 by the authors. These lecture notes may be reproduced in their entirety for non-commercial purposes.

Example 1. Let $V = \mathbb{F}^n$ and $u = (u_1, \ldots, u_n), v = (v_1, \ldots, v_n) \in \mathbb{F}^n$. Then
$$\langle u, v \rangle = \sum_{i=1}^{n} u_i \overline{v_i}.$$
For $\mathbb{F} = \mathbb{R}$, this is the usual dot product
$$u \cdot v = u_1 v_1 + \cdots + u_n v_n.$$
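For concreteness, here is a minimal numerical sketch of this inner product (numpy; the vectors are illustrative assumptions, and note that numpy's `vdot` conjugates its first argument):

```python
import numpy as np

u = np.array([1 + 2j, 3 + 0j])
v = np.array([2 - 1j, 0 + 1j])

# <u, v> = sum_i u_i * conj(v_i).  np.vdot conjugates its FIRST argument,
# so passing v first puts the conjugate on the second slot, as in Example 1.
inner_uv = np.vdot(v, u)

# Conjugate symmetry: <u, v> = conj(<v, u>).
assert np.isclose(inner_uv, np.conj(np.vdot(u, v)))

# For F = R this reduces to the usual dot product.
x, y = np.array([1.0, 2.0]), np.array([3.0, 4.0])
assert np.isclose(np.vdot(y, x), x @ y)
```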

For a fixed vector $w \in V$, one may define the map $T : V \to \mathbb{F}$ by $Tv = \langle v, w \rangle$. This map
is linear by condition 1 of Definition 1. This implies in particular that $\langle 0, w \rangle = 0$ for every
$w \in V$. By conjugate symmetry we also have $\langle w, 0 \rangle = 0$.

Lemma 2. The inner product is anti-linear in the second slot, that is, $\langle u, v + w \rangle = \langle u, v \rangle + \langle u, w \rangle$ for all $u, v, w \in V$, and $\langle u, av \rangle = \bar{a}\langle u, v \rangle$.

Proof. For the additivity, note that
$$\langle u, v + w \rangle = \overline{\langle v + w, u \rangle} = \overline{\langle v, u \rangle + \langle w, u \rangle} = \overline{\langle v, u \rangle} + \overline{\langle w, u \rangle} = \langle u, v \rangle + \langle u, w \rangle.$$
Similarly,
$$\langle u, av \rangle = \overline{\langle av, u \rangle} = \overline{a \langle v, u \rangle} = \bar{a}\,\overline{\langle v, u \rangle} = \bar{a}\langle u, v \rangle.$$

Note that the convention in physics is often different. There the second slot is linear,
whereas the first slot is anti-linear.

2 Norms
The norm of a vector is the analogue of its length. It is formally defined as follows.

Definition 3. Let $V$ be a vector space over $\mathbb{F}$. A map
$$\|\cdot\| : V \to \mathbb{R}, \qquad v \mapsto \|v\|$$
is a norm on $V$ if

1. $\|v\| = 0$ if and only if $v = 0$;

2. $\|av\| = |a|\,\|v\|$ for all $a \in \mathbb{F}$ and $v \in V$;

3. Triangle inequality: $\|v + w\| \le \|v\| + \|w\|$ for all $v, w \in V$.


Note that in fact $\|v\| \ge 0$ for all $v \in V$, since
$$0 = \|v - v\| \le \|v\| + \|{-v}\| = 2\|v\|.$$

Next we want to show that a norm can in fact be defined from an inner product via
$$\|v\| = \sqrt{\langle v, v \rangle} \quad \text{for all } v \in V.$$

Properties 1 and 2 follow easily from points 1 and 3 of Definition 1. The triangle inequality
requires proof (which we give in Theorem 5).
Note that for $V = \mathbb{R}^n$ this norm is the familiar length of a vector. Namely, for $v = (x_1, \ldots, x_n) \in \mathbb{R}^n$ we have
$$\|v\| = \sqrt{x_1^2 + \cdots + x_n^2}.$$

[Figure: a vector $v \in \mathbb{R}^3$ with coordinates $x_1$, $x_2$, $x_3$.]
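A quick numerical sketch (numpy, illustrative values): the norm defined through the inner product agrees with the built-in Euclidean norm.

```python
import numpy as np

v = np.array([1.0, -2.0, 2.0])

# ||v|| = sqrt(<v, v>) for the standard inner product on R^n.
norm_from_inner = np.sqrt(np.vdot(v, v))

assert np.isclose(norm_from_inner, np.linalg.norm(v))  # both equal 3.0
```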

3 Orthogonality
Using the inner product, we can now define the notion of orthogonality, prove the Pythagorean
theorem, and prove the Cauchy-Schwarz inequality, which will in turn enable us to prove the
triangle inequality. This will show that $\|v\| = \sqrt{\langle v, v \rangle}$ indeed defines a norm.

Definition 4. Two vectors $u, v \in V$ are orthogonal (in symbols, $u \perp v$) if and only if $\langle u, v \rangle = 0$.
Note that the zero vector is the only vector that is orthogonal to itself. In fact, the zero
vector is orthogonal to all vectors v ∈ V .
Theorem 3 (Pythagorean Theorem). If $u, v \in V$ with $u \perp v$, then
$$\|u + v\|^2 = \|u\|^2 + \|v\|^2.$$

Proof. Suppose $u, v \in V$ are such that $u \perp v$. Then
$$\|u + v\|^2 = \langle u + v, u + v \rangle = \|u\|^2 + \|v\|^2 + \langle u, v \rangle + \langle v, u \rangle = \|u\|^2 + \|v\|^2.$$

Note that the converse of the Pythagorean Theorem holds for real vector spaces, since in
this case $\langle u, v \rangle + \langle v, u \rangle = 2\,\mathrm{Re}\,\langle u, v \rangle = 0$.
Given two vectors $u, v \in V$ with $v \ne 0$, we can uniquely decompose $u$ into a piece parallel
to $v$ and a piece orthogonal to $v$. This is also called the orthogonal decomposition. More
precisely, we write
$$u = u_1 + u_2,$$
where $u_1 = a v$ and $u_2 \perp v$. Namely, write $u_2 = u - u_1 = u - a v$. For $u_2$ to be orthogonal to
$v$ we need
$$0 = \langle u - a v, v \rangle = \langle u, v \rangle - a \|v\|^2.$$
Solving for $a$ yields $a = \langle u, v \rangle / \|v\|^2$, so that
$$u = \frac{\langle u, v \rangle}{\|v\|^2}\, v + \left( u - \frac{\langle u, v \rangle}{\|v\|^2}\, v \right). \tag{1}$$
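The decomposition (1) is easy to compute numerically; here is a minimal numpy sketch with made-up vectors:

```python
import numpy as np

u = np.array([2.0, 1.0, 1.0])
v = np.array([1.0, 1.0, 0.0])

# a = <u, v> / ||v||^2 gives the parallel piece u1 = a*v;
# the remainder u2 = u - u1 is orthogonal to v.
a = np.vdot(v, u) / np.vdot(v, v)
u1 = a * v
u2 = u - u1

assert np.allclose(u1 + u2, u)          # u = u1 + u2
assert np.isclose(np.vdot(v, u2), 0.0)  # u2 is orthogonal to v
```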

Theorem 4 (Cauchy-Schwarz inequality). For all $u, v \in V$ we have
$$|\langle u, v \rangle| \le \|u\|\,\|v\|.$$
Furthermore, equality holds if and only if $u$ and $v$ are linearly dependent, i.e., are scalar
multiples of each other.
Proof. If $v = 0$, then both sides of the inequality are zero. Hence assume that $v \ne 0$.
Consider the orthogonal decomposition
$$u = \frac{\langle u, v \rangle}{\|v\|^2}\, v + w,$$
where $w \perp v$. By the Pythagorean theorem we have
$$\|u\|^2 = \left\| \frac{\langle u, v \rangle}{\|v\|^2}\, v \right\|^2 + \|w\|^2 = \frac{|\langle u, v \rangle|^2}{\|v\|^2} + \|w\|^2 \ge \frac{|\langle u, v \rangle|^2}{\|v\|^2}.$$
Multiplying both sides by $\|v\|^2$ and taking the square root yields the Cauchy-Schwarz inequality.
Note that we get equality in the above argument if and only if $w = 0$. But by (1) this
means that $u$ and $v$ are linearly dependent.
The Cauchy-Schwarz inequality has many different proofs. Here is another one.
Proof. For given $u, v \in V$, consider the norm squared of the vector $u + r e^{i\theta} v$ for $r \in \mathbb{R}$:
$$0 \le \|u + r e^{i\theta} v\|^2 = \|u\|^2 + r^2 \|v\|^2 + 2\,\mathrm{Re}\big(r e^{i\theta} \langle u, v \rangle\big).$$
Since $\langle u, v \rangle$ is a complex number, one can choose $\theta$ so that $e^{i\theta} \langle u, v \rangle$ is real. The right-hand
side is then a parabola $a r^2 + b r + c$ with real coefficients, and it lies on or above the real axis,
i.e., $a r^2 + b r + c \ge 0$ for all $r$. This happens precisely when the parabola has at most one
real root, i.e., when the discriminant satisfies $b^2 - 4ac \le 0$. In our case this means
$$4 |\langle u, v \rangle|^2 - 4 \|u\|^2 \|v\|^2 \le 0.$$
Equality holds only if $r$ can be chosen such that $u + r e^{i\theta} v = 0$, which means that $u$ and $v$
are scalar multiples of each other.
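As a numerical sanity check of the Cauchy-Schwarz inequality (a numpy sketch with arbitrary complex vectors):

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(4) + 1j * rng.standard_normal(4)
v = rng.standard_normal(4) + 1j * rng.standard_normal(4)

# |<u, v>| <= ||u|| ||v||   (Cauchy-Schwarz)
assert abs(np.vdot(v, u)) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-12

# Equality for linearly dependent vectors:
w = (2 - 3j) * u
assert np.isclose(abs(np.vdot(w, u)), np.linalg.norm(u) * np.linalg.norm(w))
```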

We now come to the proof of the triangle inequality, which shows that $\|v\| = \sqrt{\langle v, v \rangle}$
indeed defines a norm.

Theorem 5 (Triangle inequality). For all $u, v \in V$ we have
$$\|u + v\| \le \|u\| + \|v\|.$$

Proof. By a straightforward calculation, we obtain
$$\|u + v\|^2 = \langle u + v, u + v \rangle = \langle u, u \rangle + \langle v, v \rangle + \langle u, v \rangle + \langle v, u \rangle = \|u\|^2 + \|v\|^2 + 2\,\mathrm{Re}\,\langle u, v \rangle.$$
Note that $\mathrm{Re}\,\langle u, v \rangle \le |\langle u, v \rangle|$, so that using the Cauchy-Schwarz inequality we obtain
$$\|u + v\|^2 \le \|u\|^2 + \|v\|^2 + 2\|u\|\,\|v\| = \big(\|u\| + \|v\|\big)^2.$$
Taking the square root of both sides gives the triangle inequality.

Remark 6. Note that equality holds in the triangle inequality if and only if $v = r u$ or $u = r v$
for some $r \ge 0$. Namely, equality in the proof occurs only if $\langle u, v \rangle = \|u\|\,\|v\|$, which is equivalent
to the statement above.

[Figure: the triangle inequality; the vectors $u$, $v$, $u + v$ and $u$, $v'$, $u + v'$.]

Theorem 7 (Parallelogram equality). For all $u, v \in V$ we have
$$\|u + v\|^2 + \|u - v\|^2 = 2\big(\|u\|^2 + \|v\|^2\big).$$

Proof. By direct calculation,
$$\begin{aligned}
\|u + v\|^2 + \|u - v\|^2 &= \langle u + v, u + v \rangle + \langle u - v, u - v \rangle \\
&= \|u\|^2 + \|v\|^2 + \langle u, v \rangle + \langle v, u \rangle + \|u\|^2 + \|v\|^2 - \langle u, v \rangle - \langle v, u \rangle \\
&= 2\big(\|u\|^2 + \|v\|^2\big).
\end{aligned}$$

[Figure: a parallelogram with sides $u$, $v$ and diagonals $u + v$, $u - v$.]
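A one-line numerical check of the parallelogram equality (numpy sketch, arbitrary vectors):

```python
import numpy as np

u, v = np.array([2.0, -1.0, 3.0]), np.array([0.5, 4.0, -2.0])

lhs = np.linalg.norm(u + v)**2 + np.linalg.norm(u - v)**2
rhs = 2 * (np.linalg.norm(u)**2 + np.linalg.norm(v)**2)
assert np.isclose(lhs, rhs)  # parallelogram equality
```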

4 Orthonormal bases
We now define the notion of orthogonal and orthonormal bases of an inner product space.
As we will see later, orthonormal bases have very special properties that simplify many
calculations.
Definition 5. Let $V$ be an inner product space with inner product $\langle \cdot\,, \cdot \rangle$. A list of nonzero
vectors $(e_1, \ldots, e_m)$ in $V$ is called orthogonal if
$$\langle e_i, e_j \rangle = 0 \quad \text{for all } 1 \le i \ne j \le m.$$
The list $(e_1, \ldots, e_m)$ is called orthonormal if
$$\langle e_i, e_j \rangle = \delta_{ij} \quad \text{for all } i, j = 1, \ldots, m,$$
where $\delta_{ij}$ is the Kronecker delta symbol, which is $1$ if $i = j$ and $0$ otherwise.
Proposition 8. Every orthogonal list of nonzero vectors in $V$ is linearly independent.
Proof. Let $(e_1, \ldots, e_m)$ be an orthogonal list of vectors in $V$, and suppose $a_1, \ldots, a_m \in \mathbb{F}$ are
such that
$$a_1 e_1 + \cdots + a_m e_m = 0.$$
Then
$$0 = \|a_1 e_1 + \cdots + a_m e_m\|^2 = |a_1|^2 \|e_1\|^2 + \cdots + |a_m|^2 \|e_m\|^2.$$
Note that $\|e_k\| > 0$ for all $k = 1, \ldots, m$ since every $e_k$ is a nonzero vector. Also $|a_k|^2 \ge 0$.
Hence the only solution to this equation is $a_1 = \cdots = a_m = 0$.
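Numerically, Proposition 8 can be illustrated by stacking an orthogonal list as the rows of a matrix and checking full row rank (a numpy sketch with made-up vectors):

```python
import numpy as np

# An orthogonal list of nonzero (not yet normalized) vectors in R^3.
e1 = np.array([1.0, 1.0, 0.0])
e2 = np.array([1.0, -1.0, 2.0])
assert np.isclose(np.vdot(e1, e2), 0.0)  # orthogonal

# Stacked as rows they have full row rank, i.e. they are linearly independent.
assert np.linalg.matrix_rank(np.vstack([e1, e2])) == 2
```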
Definition 6. An orthonormal basis of a finite-dimensional inner product space $V$ is an
orthonormal list of vectors that is a basis (in particular, it spans $V$).
Clearly, any orthonormal list of length $\dim V$ is a basis of $V$.
Example 2. The canonical basis of $\mathbb{F}^n$ is orthonormal.
Example 3. The list $\left( \left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right), \left(\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right) \right)$ is an orthonormal basis of $\mathbb{R}^2$.
The next theorem shows that the coefficients of a vector v ∈ V in terms of an orthonormal
basis are easy to compute via the inner product.
Theorem 9. Let $(e_1, \ldots, e_n)$ be an orthonormal basis of $V$. Then for all $v \in V$ we have
$$v = \langle v, e_1 \rangle e_1 + \cdots + \langle v, e_n \rangle e_n$$
and $\|v\|^2 = \sum_{k=1}^{n} |\langle v, e_k \rangle|^2$.

Proof. Let $v \in V$. Since $(e_1, \ldots, e_n)$ is a basis of $V$, there exist unique scalars $a_1, \ldots, a_n \in \mathbb{F}$
such that
$$v = a_1 e_1 + \cdots + a_n e_n.$$
Taking the inner product of both sides with $e_k$ yields $\langle v, e_k \rangle = a_k$. The formula for $\|v\|^2$
then follows by expanding $\langle v, v \rangle$ and using orthonormality, as in the proof of Proposition 8.
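A numpy sketch of Theorem 9, using the orthonormal basis of $\mathbb{R}^2$ from Example 3 (the vector $v$ is an illustrative choice):

```python
import numpy as np

s = 1 / np.sqrt(2)
e1, e2 = np.array([s, s]), np.array([s, -s])  # orthonormal basis from Example 3
v = np.array([3.0, 1.0])

# The coefficients are just inner products with the basis vectors.
a1, a2 = np.vdot(e1, v), np.vdot(e2, v)
assert np.allclose(v, a1 * e1 + a2 * e2)                # v = <v,e1> e1 + <v,e2> e2
assert np.isclose(np.linalg.norm(v)**2, a1**2 + a2**2)  # ||v||^2 = sum |<v,ek>|^2
```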

5 The Gram-Schmidt orthogonalization procedure


We now come to a very important algorithm, called the Gram-Schmidt orthogonalization
procedure. This algorithm makes it possible to construct, from any list of linearly independent
vectors (in particular, from any basis), a corresponding orthonormal list (respectively, an orthonormal basis).

Theorem 10. If $(v_1, \ldots, v_m)$ is a linearly independent list of vectors in $V$, then there exists
an orthonormal list $(e_1, \ldots, e_m)$ such that
$$\mathrm{span}(v_1, \ldots, v_k) = \mathrm{span}(e_1, \ldots, e_k) \quad \text{for all } k = 1, \ldots, m. \tag{2}$$

Proof. The proof is constructive; that is, we will actually construct the vectors $e_1, \ldots, e_m$
with the desired properties. Since $(v_1, \ldots, v_m)$ is linearly independent, $v_k \ne 0$ for all $k =
1, 2, \ldots, m$. Set $e_1 = \frac{v_1}{\|v_1\|}$. This is a vector of norm $1$ that satisfies (2) for $k = 1$. Next set
$$e_2 = \frac{v_2 - \langle v_2, e_1 \rangle e_1}{\|v_2 - \langle v_2, e_1 \rangle e_1\|}.$$
This is in fact the normalized version of the orthogonal decomposition (1),
$$w = v_2 - \langle v_2, e_1 \rangle e_1,$$
where $w \perp e_1$. Note that $\|e_2\| = 1$ and $\mathrm{span}(e_1, e_2) = \mathrm{span}(v_1, v_2)$.


Now suppose $e_1, \ldots, e_{k-1}$ have been constructed such that $(e_1, \ldots, e_{k-1})$ is an orthonormal
list and $\mathrm{span}(v_1, \ldots, v_{k-1}) = \mathrm{span}(e_1, \ldots, e_{k-1})$. Then define
$$e_k = \frac{v_k - \langle v_k, e_1 \rangle e_1 - \langle v_k, e_2 \rangle e_2 - \cdots - \langle v_k, e_{k-1} \rangle e_{k-1}}{\|v_k - \langle v_k, e_1 \rangle e_1 - \langle v_k, e_2 \rangle e_2 - \cdots - \langle v_k, e_{k-1} \rangle e_{k-1}\|}.$$
Since $(v_1, \ldots, v_k)$ is linearly independent, we know that $v_k \notin \mathrm{span}(v_1, \ldots, v_{k-1})$, and hence also
$v_k \notin \mathrm{span}(e_1, \ldots, e_{k-1})$. Hence the norm in the definition of $e_k$ is not zero, and $e_k$ is
well-defined (we are not dividing by zero). Also, a vector divided by its norm has norm $1$,
so $\|e_k\| = 1$. Furthermore,
$$\langle e_k, e_i \rangle = \left\langle \frac{v_k - \langle v_k, e_1 \rangle e_1 - \cdots - \langle v_k, e_{k-1} \rangle e_{k-1}}{\|v_k - \langle v_k, e_1 \rangle e_1 - \cdots - \langle v_k, e_{k-1} \rangle e_{k-1}\|},\; e_i \right\rangle = \frac{\langle v_k, e_i \rangle - \langle v_k, e_i \rangle}{\|v_k - \langle v_k, e_1 \rangle e_1 - \cdots - \langle v_k, e_{k-1} \rangle e_{k-1}\|} = 0$$
for $1 \le i < k$. Hence $(e_1, \ldots, e_k)$ is orthonormal.


From the definition of ek we see that vk ∈ span(e1 , . . . , ek ) so that span(v1 , . . . , vk ) ⊂
span(e1 , . . . , ek ). Since both lists (e1 , . . . , ek ) and (v1 , . . . , vk ) are linearly independent, they
must span subspaces of the same dimension and therefore are the same subspace. Hence (2)
holds.
Example 4. Take $v_1 = (1, 1, 0)$ and $v_2 = (2, 1, 1)$ in $\mathbb{R}^3$. The list $(v_1, v_2)$ is linearly independent
(check!). Then
$$e_1 = \frac{v_1}{\|v_1\|} = \frac{1}{\sqrt{2}}(1, 1, 0).$$
Next,
$$e_2 = \frac{v_2 - \langle v_2, e_1 \rangle e_1}{\|v_2 - \langle v_2, e_1 \rangle e_1\|}.$$
The inner product is $\langle v_2, e_1 \rangle = \frac{1}{\sqrt{2}} \langle (2, 1, 1), (1, 1, 0) \rangle = \frac{3}{\sqrt{2}}$, so that
$$u_2 = v_2 - \langle v_2, e_1 \rangle e_1 = (2, 1, 1) - \frac{3}{2}(1, 1, 0) = \frac{1}{2}(1, -1, 2).$$
Calculating the norm of $u_2$, we obtain $\|u_2\| = \sqrt{\frac{1}{4}(1 + 1 + 4)} = \frac{\sqrt{6}}{2}$. Hence, normalizing this
vector, we obtain
$$e_2 = \frac{u_2}{\|u_2\|} = \frac{1}{\sqrt{6}}(1, -1, 2).$$
The list $(e_1, e_2)$ is therefore orthonormal and has the same span as $(v_1, v_2)$.
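Here is a short implementation sketch of the procedure (numpy; the function name `gram_schmidt` is ours), checked against Example 4:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a linearly independent list, as in the proof of Theorem 10."""
    es = []
    for v in vectors:
        # Subtract the components of v along the already-constructed e_i.
        w = v - sum(np.vdot(e, v) * e for e in es)
        es.append(w / np.linalg.norm(w))  # nonzero norm by linear independence
    return es

v1, v2 = np.array([1.0, 1.0, 0.0]), np.array([2.0, 1.0, 1.0])
e1, e2 = gram_schmidt([v1, v2])

assert np.allclose(e1, np.array([1.0, 1.0, 0.0]) / np.sqrt(2))
assert np.allclose(e2, np.array([1.0, -1.0, 2.0]) / np.sqrt(6))
```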
Corollary 11. Every finite-dimensional inner product space has an orthonormal basis.
Proof. Let $(v_1, \ldots, v_n)$ be a basis for $V$. This list is linearly independent and spans $V$. Apply
the Gram-Schmidt procedure to this list to obtain an orthonormal list $(e_1, \ldots, e_n)$, which still
spans $V$. By Proposition 8 this list is linearly independent and hence a basis of $V$.
Corollary 12. Every orthonormal list of vectors in V can be extended to an orthonormal
basis of V .
Proof. Let $(e_1, \ldots, e_m)$ be an orthonormal list of vectors in $V$. By Proposition 8 this list
is linearly independent and hence can be extended to a basis $(e_1, \ldots, e_m, v_1, \ldots, v_k)$ of $V$
by the Basis Extension Theorem. Now apply the Gram-Schmidt procedure to obtain a
new orthonormal basis $(e_1, \ldots, e_m, f_1, \ldots, f_k)$. The first $m$ vectors do not change since they
are already orthonormal. The list still spans $V$ and is linearly independent by Proposition 8,
and therefore forms a basis.
Recall that we proved that for a complex vector space $V$ there is always a basis with
respect to which a given operator $T \in \mathcal{L}(V, V)$ is upper-triangular. We would like to extend
this result to require the additional property of orthonormality.

Corollary 13. Let $V$ be an inner product space over $\mathbb{F}$ and $T \in \mathcal{L}(V, V)$. If $T$ is upper-triangular
with respect to some basis, then $T$ is upper-triangular with respect to some orthonormal
basis.

Proof. Let $(v_1, \ldots, v_n)$ be a basis of $V$ with respect to which $T$ is upper-triangular. Apply
the Gram-Schmidt procedure to obtain an orthonormal basis $(e_1, \ldots, e_n)$. Note that
$$\mathrm{span}(e_1, \ldots, e_k) = \mathrm{span}(v_1, \ldots, v_k) \quad \text{for all } 1 \le k \le n.$$
We proved before that $T$ is upper-triangular with respect to a basis $(v_1, \ldots, v_n)$ if and only
if $\mathrm{span}(v_1, \ldots, v_k)$ is invariant under $T$ for all $1 \le k \le n$. Since these spans are unchanged in
passing to the basis $(e_1, \ldots, e_n)$, $T$ is still upper-triangular with respect to this new basis.

6 Orthogonal projections and minimization problems


Definition 7. Let $V$ be a finite-dimensional inner product space and $U \subset V$ a subset of $V$.
Then the orthogonal complement of $U$ is the set
$$U^\perp = \{ v \in V \mid \langle u, v \rangle = 0 \text{ for all } u \in U \}.$$
Note that $U^\perp$ is in fact always a subspace of $V$ (as you should check!), and that
$$\{0\}^\perp = V, \qquad V^\perp = \{0\}.$$
Also, if $U_1 \subset U_2$, then $U_2^\perp \subset U_1^\perp$.


Furthermore, if $U \subset V$ is not only a subset but a subspace, then we will now show that
$$V = U \oplus U^\perp \quad \text{and} \quad (U^\perp)^\perp = U.$$

Theorem 14. If $U \subset V$ is a subspace of $V$, then $V = U \oplus U^\perp$.

Proof. We need to show that

1. $V = U + U^\perp$;

2. $U \cap U^\perp = \{0\}$.

To show 1, let $(e_1, \ldots, e_m)$ be an orthonormal basis of $U$. Then for all $v \in V$ we can write
$$v = \underbrace{\langle v, e_1 \rangle e_1 + \cdots + \langle v, e_m \rangle e_m}_{u} + \underbrace{v - \langle v, e_1 \rangle e_1 - \cdots - \langle v, e_m \rangle e_m}_{w}. \tag{3}$$
The vector $u \in U$, and
$$\langle w, e_j \rangle = \langle v, e_j \rangle - \langle v, e_j \rangle = 0 \quad \text{for all } j = 1, 2, \ldots, m,$$
since $(e_1, \ldots, e_m)$ is an orthonormal list of vectors. Hence $w \in U^\perp$. This implies that
$V = U + U^\perp$.
To prove 2, let $v \in U \cap U^\perp$. Then $v$ has to be orthogonal to every vector in $U$, in particular
to itself, so that $\langle v, v \rangle = 0$. However, this implies $v = 0$, so that $U \cap U^\perp = \{0\}$.

Example 5. $\mathbb{R}^2$ is the direct sum of two orthogonal lines, and $\mathbb{R}^3$ is the direct sum of a plane
and a line orthogonal to this plane. For example,
$$\mathbb{R}^2 = \{(x, 0) \mid x \in \mathbb{R}\} \oplus \{(0, y) \mid y \in \mathbb{R}\},$$
$$\mathbb{R}^3 = \{(x, y, 0) \mid x, y \in \mathbb{R}\} \oplus \{(0, 0, z) \mid z \in \mathbb{R}\}.$$

Theorem 15. If $U \subset V$ is a subspace of $V$, then $U = (U^\perp)^\perp$.

Proof. First we show that $U \subset (U^\perp)^\perp$. Let $u \in U$. Then for all $v \in U^\perp$ we have $\langle u, v \rangle = 0$.
Hence $u \in (U^\perp)^\perp$ by the definition of $(U^\perp)^\perp$.
Next we show that $(U^\perp)^\perp \subset U$. Suppose $0 \ne v \in (U^\perp)^\perp$ with $v \notin U$. Decompose $v$
according to Theorem 14 as
$$v = u_1 + u_2 \in U \oplus U^\perp,$$
where $u_1 \in U$ and $u_2 \in U^\perp$. Then $u_2 \ne 0$ since $v \notin U$. Furthermore, $\langle u_2, v \rangle = \langle u_2, u_2 \rangle \ne 0$.
But then $v$ is not in $(U^\perp)^\perp$, which contradicts our initial assumption. Hence we must have
$(U^\perp)^\perp \subset U$.
By Theorem 14 we have the decomposition $V = U \oplus U^\perp$ for every subspace $U \subset V$.
Hence we can define the orthogonal projection $P_U$ of $V$ onto $U$ as follows. Every $v \in V$
can be uniquely written as $v = u + w$, where $u \in U$ and $w \in U^\perp$. Define
$$P_U : V \to V, \qquad v \mapsto u.$$
Clearly, $P_U$ is a projection operator, since $P_U^2 = P_U$. It also satisfies
$$\mathrm{range}\, P_U = U, \qquad \mathrm{null}\, P_U = U^\perp,$$
so that $\mathrm{range}\, P_U \perp \mathrm{null}\, P_U$. Therefore $P_U$ is called an orthogonal projection.


The decomposition of a vector $v \in V$ into a piece in $U$ and a piece in $U^\perp$ given in (3)
yields the following formula for $P_U$:
$$P_U v = \langle v, e_1 \rangle e_1 + \cdots + \langle v, e_m \rangle e_m,$$
where $(e_1, \ldots, e_m)$ is an orthonormal basis of $U$.
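A numpy sketch of this projection formula, using the orthonormal basis from Example 4 (the helper name `project` is ours):

```python
import numpy as np

# Orthonormal basis of the plane U from Example 4.
e1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
e2 = np.array([1.0, -1.0, 2.0]) / np.sqrt(6)

def project(v, basis):
    """P_U v = <v,e1> e1 + ... + <v,em> em for an orthonormal basis of U."""
    return sum(np.vdot(e, v) * e for e in basis)

v = np.array([1.0, 2.0, 3.0])
Pv = project(v, [e1, e2])

assert np.allclose(project(Pv, [e1, e2]), Pv)  # P_U^2 = P_U
# v - P_U v lies in the orthogonal complement of U:
assert np.allclose([np.vdot(e1, v - Pv), np.vdot(e2, v - Pv)], [0.0, 0.0])
```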


Let us now apply the inner product to the following minimization problem: given a
subspace $U \subset V$ and a vector $v \in V$, find the vector $u \in U$ closest to $v$,
that is, the $u$ for which $\|v - u\|$ is smallest. The next proposition shows that $P_U v$ is the closest
point in $U$ to the vector $v$ and that this minimizer is unique.

Proposition 16. Let $U \subset V$ be a subspace of $V$ and $v \in V$. Then
$$\|v - P_U v\| \le \|v - u\| \quad \text{for every } u \in U.$$
Furthermore, equality holds if and only if $u = P_U v$.

Proof. Let $u \in U$ and set $P := P_U$ for short. Then
$$\|v - P v\|^2 \le \|v - P v\|^2 + \|P v - u\|^2 = \|(v - P v) + (P v - u)\|^2 = \|v - u\|^2,$$
where the second equality follows from the Pythagorean Theorem 3, since $v - P v \in U^\perp$ and $P v - u \in U$.
Equality holds only if $\|P v - u\|^2 = 0$, which is equivalent to $P v = u$.
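Finally, a quick numerical illustration of Proposition 16 (a numpy sketch reusing the subspace of Example 4): the projection $P_U v$ beats every other point of $U$.

```python
import numpy as np

e1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)   # orthonormal basis of U (Example 4)
e2 = np.array([1.0, -1.0, 2.0]) / np.sqrt(6)
v = np.array([1.0, 2.0, 3.0])
Pv = np.vdot(e1, v) * e1 + np.vdot(e2, v) * e2  # P_U v

rng = np.random.default_rng(1)
for _ in range(100):
    c1, c2 = rng.standard_normal(2)
    u = c1 * e1 + c2 * e2                       # a random point of U
    assert np.linalg.norm(v - Pv) <= np.linalg.norm(v - u) + 1e-12
```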
