Part IA - Vectors and Matrices: Theorems With Proof


Part IA — Vectors and Matrices

Theorems with proof

Based on lectures by N. Peake


Notes taken by Dexter Chua

Michaelmas 2014

These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.

Complex numbers
Review of complex numbers, including complex conjugate, inverse, modulus, argument
and Argand diagram. Informal treatment of complex logarithm, n-th roots and complex
powers. de Moivre’s theorem. [2]

Vectors
Review of elementary algebra of vectors in R3 , including scalar product. Brief discussion
of vectors in Rn and Cn ; scalar product and the Cauchy-Schwarz inequality. Concepts
of linear span, linear independence, subspaces, basis and dimension.
Suffix notation: including summation convention, δij and εijk . Vector product and
triple product: definition and geometrical interpretation. Solution of linear vector
equations. Applications of vectors to geometry, including equations of lines, planes and
spheres. [5]

Matrices
Elementary algebra of 3 × 3 matrices, including determinants. Extension to n × n
complex matrices. Trace, determinant, non-singular matrices and inverses. Matrices as
linear transformations; examples of geometrical actions including rotations, reflections,
dilations, shears; kernel and image. [4]
Simultaneous linear equations: matrix formulation; existence and uniqueness of solu-
tions, geometric interpretation; Gaussian elimination. [3]
Symmetric, anti-symmetric, orthogonal, hermitian and unitary matrices. Decomposition
of a general matrix into isotropic, symmetric trace-free and antisymmetric parts. [1]

Eigenvalues and Eigenvectors


Eigenvalues and eigenvectors; geometric significance. [2]
Proof that eigenvalues of hermitian matrix are real, and that distinct eigenvalues give
an orthogonal basis of eigenvectors. The effect of a general change of basis (similarity
transformations). Diagonalization of general matrices: sufficient conditions; examples
of matrices that cannot be diagonalized. Canonical forms for 2 × 2 matrices. [5]
Discussion of quadratic forms, including change of basis. Classification of conics,
cartesian and polar forms. [1]
Rotation matrices and Lorentz transformations as transformation groups. [1]


Contents
0 Introduction

1 Complex numbers
1.1 Basic properties
1.2 Complex exponential function
1.3 Roots of unity
1.4 Complex logarithm and power
1.5 De Moivre’s theorem
1.6 Lines and circles in C

2 Vectors
2.1 Definition and basic properties
2.2 Scalar product
2.2.1 Geometric picture (R2 and R3 only)
2.2.2 General algebraic definition
2.3 Cauchy-Schwarz inequality
2.4 Vector product
2.5 Scalar triple product
2.6 Spanning sets and bases
2.6.1 2D space
2.6.2 3D space
2.6.3 Rn space
2.6.4 Cn space
2.7 Vector subspaces
2.8 Suffix notation
2.9 Geometry
2.9.1 Lines
2.9.2 Plane
2.10 Vector equations

3 Linear maps
3.1 Examples
3.1.1 Rotation in R3
3.1.2 Reflection in R3
3.2 Linear Maps
3.3 Rank and nullity
3.4 Matrices
3.4.1 Examples
3.4.2 Matrix Algebra
3.4.3 Decomposition of an n × n matrix
3.4.4 Matrix inverse
3.5 Determinants
3.5.1 Permutations
3.5.2 Properties of determinants
3.5.3 Minors and Cofactors

4 Matrices and linear equations
4.1 Simple example, 2 × 2
4.2 Inverse of an n × n matrix
4.3 Homogeneous and inhomogeneous equations
4.3.1 Gaussian elimination
4.4 Matrix rank
4.5 Homogeneous problem Ax = 0
4.5.1 Geometrical interpretation
4.5.2 Linear mapping view of Ax = 0
4.6 General solution of Ax = d

5 Eigenvalues and eigenvectors
5.1 Preliminaries and definitions
5.2 Linearly independent eigenvectors
5.3 Transformation matrices
5.3.1 Transformation law for vectors
5.3.2 Transformation law for matrix
5.4 Similar matrices
5.5 Diagonalizable matrices
5.6 Canonical (Jordan normal) form
5.7 Cayley-Hamilton Theorem
5.8 Eigenvalues and eigenvectors of a Hermitian matrix
5.8.1 Eigenvalues and eigenvectors
5.8.2 Gram-Schmidt orthogonalization (non-examinable)
5.8.3 Unitary transformation
5.8.4 Diagonalization of n × n Hermitian matrices
5.8.5 Normal matrices

6 Quadratic forms and conics
6.1 Quadrics and conics
6.1.1 Quadrics
6.1.2 Conic sections (n = 2)
6.2 Focus-directrix property

7 Transformation groups
7.1 Groups of orthogonal matrices
7.2 Length preserving matrices
7.3 Lorentz transformations


0 Introduction


1 Complex numbers
1.1 Basic properties
Proposition. For z = a + ib, we have z z̄ = a² + b² = |z|².
Proposition. z⁻¹ = z̄/|z|².

Theorem (Triangle inequality). For all z1 , z2 ∈ C, we have

|z1 + z2 | ≤ |z1 | + |z2 |.

Alternatively, we have |z1 − z2 | ≥ ||z1 | − |z2 ||.

1.2 Complex exponential function


Lemma.
$$\sum_{n=0}^\infty \sum_{m=0}^\infty a_{mn} = \sum_{r=0}^\infty \sum_{m=0}^r a_{r-m,m}$$

Proof.
$$\begin{aligned}
\sum_{n=0}^\infty \sum_{m=0}^\infty a_{mn} &= a_{00} + a_{01} + a_{02} + \cdots \\
&\quad + a_{10} + a_{11} + a_{12} + \cdots \\
&\quad + a_{20} + a_{21} + a_{22} + \cdots \\
&= (a_{00}) + (a_{10} + a_{01}) + (a_{20} + a_{11} + a_{02}) + \cdots \\
&= \sum_{r=0}^\infty \sum_{m=0}^r a_{r-m,m}
\end{aligned}$$

Theorem. exp(z1 ) exp(z2 ) = exp(z1 + z2 )

Proof.
$$\begin{aligned}
\exp(z_1)\exp(z_2) &= \sum_{n=0}^\infty \sum_{m=0}^\infty \frac{z_1^m}{m!}\frac{z_2^n}{n!} \\
&= \sum_{r=0}^\infty \sum_{m=0}^r \frac{z_1^{r-m}}{(r-m)!}\frac{z_2^m}{m!} \\
&= \sum_{r=0}^\infty \frac{1}{r!}\sum_{m=0}^r \frac{r!}{(r-m)!\,m!}\, z_1^{r-m} z_2^m \\
&= \sum_{r=0}^\infty \frac{(z_1 + z_2)^r}{r!}
\end{aligned}$$

Theorem. e^{iz} = cos z + i sin z.


Proof.
$$\begin{aligned}
e^{iz} &= \sum_{n=0}^\infty \frac{i^n}{n!} z^n \\
&= \sum_{n=0}^\infty \frac{i^{2n}}{(2n)!} z^{2n} + \sum_{n=0}^\infty \frac{i^{2n+1}}{(2n+1)!} z^{2n+1} \\
&= \sum_{n=0}^\infty \frac{(-1)^n}{(2n)!} z^{2n} + i\sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!} z^{2n+1} \\
&= \cos z + i\sin z
\end{aligned}$$

1.3 Roots of unity


Proposition. If ω = exp(2πi/n) with n ≥ 2, then 1 + ω + ω^2 + · · · + ω^{n−1} = 0.
Proof. Two proofs are provided:
(i) Consider the equation z^n = 1. The sum of all its n roots is minus the coefficient of z^{n−1}, which is 0. The roots are precisely 1, ω, ω^2, · · · , ω^{n−1}, so 1 + ω + ω^2 + · · · + ω^{n−1} = 0.
(ii) Since ω^n − 1 = (ω − 1)(1 + ω + · · · + ω^{n−1}) and ω ≠ 1, dividing by (ω − 1)
gives 1 + ω + · · · + ω^{n−1} = (ω^n − 1)/(ω − 1) = 0.
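For a concrete illustration, take n = 3. Then ω = exp(2πi/3) = −1/2 + (√3/2)i and ω^2 = −1/2 − (√3/2)i, so 1 + ω + ω^2 = 1 − 1/2 − 1/2 + 0i = 0, as claimed.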

1.4 Complex logarithm and power


1.5 De Moivre’s theorem
Theorem (De Moivre’s theorem).

cos nθ + i sin nθ = (cos θ + i sin θ)n .

Proof. First prove for the n ≥ 0 case by induction. The n = 0 case is true since
it merely reads 1 = 1. We then have
$$\begin{aligned}
(\cos\theta + i\sin\theta)^{n+1} &= (\cos\theta + i\sin\theta)^n(\cos\theta + i\sin\theta) \\
&= (\cos n\theta + i\sin n\theta)(\cos\theta + i\sin\theta) \\
&= \cos(n+1)\theta + i\sin(n+1)\theta.
\end{aligned}$$
If n < 0, let m = −n. Then m > 0 and
$$\begin{aligned}
(\cos\theta + i\sin\theta)^{-m} &= (\cos m\theta + i\sin m\theta)^{-1} \\
&= \frac{\cos m\theta - i\sin m\theta}{(\cos m\theta + i\sin m\theta)(\cos m\theta - i\sin m\theta)} \\
&= \frac{\cos(-m\theta) + i\sin(-m\theta)}{\cos^2 m\theta + \sin^2 m\theta} \\
&= \cos(-m\theta) + i\sin(-m\theta) \\
&= \cos n\theta + i\sin n\theta
\end{aligned}$$
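For example, taking n = 2 recovers the double angle formulae: comparing real and imaginary parts of (cos θ + i sin θ)² = cos²θ − sin²θ + 2i sin θ cos θ = cos 2θ + i sin 2θ gives cos 2θ = cos²θ − sin²θ and sin 2θ = 2 sin θ cos θ.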


1.6 Lines and circles in C


Theorem (Equation of straight line). The equation of a straight line through
z0 and parallel to w is given by

z w̄ − z̄w = z0 w̄ − z̄0 w.

Theorem. The general equation of a circle with center c ∈ C and radius ρ ∈ R+ can be given by

z z̄ − c̄z − cz̄ = ρ² − cc̄.
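For example, taking center c = i and radius ρ = 1 gives z z̄ + iz − iz̄ = 0; writing z = x + iy this becomes x² + y² − 2y = 0, i.e. x² + (y − 1)² = 1, the expected circle.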


2 Vectors
2.1 Definition and basic properties
2.2 Scalar product
2.2.1 Geometric picture (R2 and R3 only)
2.2.2 General algebraic definition

2.3 Cauchy-Schwarz inequality


Theorem (Cauchy-Schwarz inequality). For all x, y ∈ Rn ,

|x · y| ≤ |x||y|.

Proof. Consider the expression |x − λy|². We must have

|x − λy|² ≥ 0
(x − λy) · (x − λy) ≥ 0
λ²|y|² − 2λ(x · y) + |x|² ≥ 0.

Viewing this as a quadratic in λ, we see that the quadratic is non-negative and
thus cannot have two distinct real roots. Thus the discriminant ∆ ≤ 0. So

4(x · y)² ≤ 4|y|²|x|²
(x · y)² ≤ |x|²|y|²
|x · y| ≤ |x||y|.
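As a quick numerical check, take x = (1, 2, 2) and y = (2, 1, 2) in R³. Then x · y = 8 while |x||y| = 3 · 3 = 9, so |x · y| ≤ |x||y| holds (strictly, since x and y are not parallel).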

Corollary (Triangle inequality).

|x + y| ≤ |x| + |y|.

Proof.

|x + y|2 = (x + y) · (x + y)
= |x|2 + 2x · y + |y|2
≤ |x|2 + 2|x||y| + |y|2
= (|x| + |y|)2 .

So

|x + y| ≤ |x| + |y|.

2.4 Vector product


Proposition.
$$\mathbf{a}\times\mathbf{b} = (a_1\hat{\imath} + a_2\hat{\jmath} + a_3\hat{k}) \times (b_1\hat{\imath} + b_2\hat{\jmath} + b_3\hat{k}) = (a_2b_3 - a_3b_2)\hat{\imath} + \cdots = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}$$
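For instance, with a = (1, 2, 3) and b = (4, 5, 6) this gives a × b = (2·6 − 3·5, 3·4 − 1·6, 1·5 − 2·4) = (−3, 6, −3), which is easily checked to be orthogonal to both a and b.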


2.5 Scalar triple product


Proposition. If a parallelepiped has sides represented by vectors a, b, c that
form a right-handed system, then the volume of the parallelepiped is given by
[a, b, c].
Proof. The area of the base of the parallelepiped is given by |b||c| sin θ = |b × c|.
Thus the volume = |b × c||a| cos φ = |a · (b × c)|, where φ is the angle between
a and the normal to b and c. However, since a, b, c form a right-handed system,
we have a · (b × c) ≥ 0. Therefore the volume is a · (b × c).
Theorem. a × (b + c) = a × b + a × c.
Proof. Let d = a × (b + c) − a × b − a × c. We have

d · d = d · [a × (b + c)] − d · (a × b) − d · (a × c)
= (b + c) · (d × a) − b · (d × a) − c · (d × a)
=0

Thus d = 0.

2.6 Spanning sets and bases


2.6.1 2D space
Theorem. The coefficients λ, µ are unique.
Proof. Suppose that r = λa + µb = λ′a + µ′b. Take the vector product with a
on both sides to get (µ − µ′)a × b = 0. Since a × b ≠ 0, then µ = µ′. Similarly,
λ = λ′.

2.6.2 3D space
Theorem. If a, b, c ∈ R³ are non-coplanar, i.e. a · (b × c) ≠ 0, then they form
a basis of R³.
Proof. For any r, write r = λa + µb + νc. Performing the scalar product
with b × c on both sides, one obtains r · (b × c) = λa · (b × c) + µb · (b × c) +
νc · (b × c) = λ[a, b, c]. Thus λ = [r, b, c]/[a, b, c]. The values of µ and ν can
be found similarly. Thus each r can be written as a linear combination of a, b
and c.
By the formula derived above, it follows that if αa + βb + γc = 0, then
α = β = γ = 0. Thus they are linearly independent.

2.6.3 Rn space
2.6.4 Cn space

2.7 Vector subspaces


2.8 Suffix notation
Proposition. (a × b)i = εijk aj bk
Proof. By expansion of the formula.

9
2 Vectors IA Vectors and Matrices (Theorems with proof)

Theorem. ε_{ijk}ε_{ipq} = δ_{jp}δ_{kq} − δ_{jq}δ_{kp}

Proof. Proof by exhaustion:
$$\text{RHS} = \begin{cases} +1 & \text{if } j = p \text{ and } k = q \\ -1 & \text{if } j = q \text{ and } k = p \\ 0 & \text{otherwise} \end{cases}$$
LHS: Summing over i, the only non-zero terms are when j, k ≠ i and p, q ≠ i.
If j = p and k = q, LHS is (−1)² or (+1)² = 1. If j = q and k = p, LHS is
(+1)(−1) or (−1)(+1) = −1. All other possibilities result in 0.
Proposition.
a · (b × c) = b · (c × a)
Proof. In suffix notation, we have

a · (b × c) = ai (b × c)i = εijk bj ck ai = εjki bj ck ai = b · (c × a).

Theorem (Vector triple product).

a × (b × c) = (a · c)b − (a · b)c.

Proof.

[a × (b × c)]i = εijk aj (b × c)k


= εijk εkpq aj bp cq
= εijk εpqk aj bp cq
= (δip δjq − δiq δjp )aj bp cq
= aj bi cj − aj ci bj
= (a · c)bi − (a · b)ci
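As a quick check with standard basis vectors, take a = b = î and c = ĵ. Then a × (b × c) = î × (î × ĵ) = î × k̂ = −ĵ, while (a · c)b − (a · b)c = 0·î − 1·ĵ = −ĵ, in agreement.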

Proposition. (a × b) · (a × c) = (a · a)(b · c) − (a · b)(a · c).


Proof.

LHS = (a × b)i (a × c)i


= εijk aj bk εipq ap cq
= (δjp δkq − δjq δkp )aj bk ap cq
= aj bk aj ck − aj bk ak cj
= (a · a)(b · c) − (a · b)(a · c)

2.9 Geometry
2.9.1 Lines
Theorem. The equation of a straight line through a and parallel to t is

(x − a) × t = 0 or x × t = a × t.


2.9.2 Plane
Theorem. The equation of a plane through b with normal n is given by

x · n = b · n.

2.10 Vector equations


3 Linear maps
3.1 Examples
3.1.1 Rotation in R3
3.1.2 Reflection in R3

3.2 Linear Maps


Theorem. Consider a linear map f : U → V , where U, V are vector spaces.
Then im(f ) is a subspace of V , and ker(f ) is a subspace of U .
Proof. Both are non-empty since f (0) = 0.
If x, y ∈ im(f ), then ∃a, b ∈ U such that x = f (a), y = f (b). Then
λx + µy = λf (a) + µf (b) = f (λa + µb). Now λa + µb ∈ U since U is a vector
space, so there is an element in U that maps to λx + µy. So λx + µy ∈ im(f )
and im(f ) is a subspace of V .
Suppose x, y ∈ ker(f ), i.e. f (x) = f (y) = 0. Then f (λx + µy) = λf (x) +
µf (y) = λ0 + µ0 = 0. Therefore λx + µy ∈ ker(f ).

3.3 Rank and nullity


Theorem (Rank-nullity theorem). For a linear map f : U → V ,

r(f ) + n(f ) = dim(U ).

Proof. (Non-examinable) Write dim(U ) = n and n(f ) = m. If m = n, then f is


the zero map, and the proof is trivial, since r(f ) = 0. Otherwise, assume m < n.
Suppose {e1 , e2 , · · · , em } is a basis of ker f . Extend this to a basis of the
whole of U to get {e1 , e2 , · · · , em , em+1 , · · · , en }. To prove the theorem, we
need to prove that {f (em+1 ), f (em+2 ), · · · , f (en )} is a basis of im(f ).
(i) First show that it spans im(f ). Take y ∈ im(f ). Thus ∃x ∈ U such that
y = f (x). Then

y = f (α1 e1 + α2 e2 + · · · + αn en ),

since e1 , · · · en is a basis of U . Thus

y = α1 f (e1 ) + α2 f (e2 ) + · · · + αm f (em ) + αm+1 f (em+1 ) + · · · + αn f (en ).

The first m terms map to 0, since e1 , · · · em is the basis of the kernel of f .


Thus
y = αm+1 f (em+1 ) + · · · + αn f (en ).

(ii) To show that they are linearly independent, suppose

αm+1 f (em+1 ) + · · · + αn f (en ) = 0.

Then
f (αm+1 em+1 + · · · + αn en ) = 0.


Thus αm+1 em+1 + · · · + αn en ∈ ker(f ). Since {e1 , · · · , em } span ker(f ),


there exist some α1 , α2 , · · · αm such that

αm+1 em+1 + · · · + αn en = α1 e1 + · · · + αm em .

But e1 , · · · , en form a basis of U and are hence linearly independent, so rearranging
the above into a single linear combination equal to 0 forces αi = 0 for all i.

Then the only solution to the equation αm+1 f (em+1 ) + · · · + αn f (en ) = 0
is αi = 0, and they are linearly independent by definition.
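For example, consider the projection f : R³ → R³ given by f (x, y, z) = (x, y, 0). Its image is the xy-plane and its kernel is the z-axis, so r(f ) = 2 and n(f ) = 1, and indeed r(f ) + n(f ) = 3 = dim(R³).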

3.4 Matrices
3.4.1 Examples
3.4.2 Matrix Algebra
Proposition.
(i) (AT )T = A.

(ii) If x is a column vector with components x1 , x2 , · · · , xn , then xT is the row vector (x1 x2 · · · xn ).

(iii) (AB)T = B T AT , since (AB)Tij = (AB)ji = Ajk Bki = Bki Ajk
= (B T )ik (AT )kj = (B T AT )ij .
Proposition. tr(BC) = tr(CB)

Proof. tr(BC) = Bik Cki = Cki Bik = (CB)kk = tr(CB)

3.4.3 Decomposition of an n × n matrix


3.4.4 Matrix inverse
Proposition. (AB)−1 = B −1 A−1
Proof. (B −1 A−1 )(AB) = B −1 (A−1 A)B = B −1 B = I.

3.5 Determinants
3.5.1 Permutations
Proposition. Any q-cycle can be written as a product of 2-cycles.

Proof. (1 2 3 · · · q) = (1 2)(2 3)(3 4) · · · (q − 1 q).


Proposition.
$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$$


3.5.2 Properties of determinants


Proposition. det(A) = det(AT ).

Proof. Take a single term Aσ(1)1 Aσ(2)2 · · · Aσ(n)n and let ρ be another permuta-
tion in Sn . We have

Aσ(1)1 Aσ(2)2 · · · Aσ(n)n = Aσ(ρ(1))ρ(1) Aσ(ρ(2))ρ(2) · · · Aσ(ρ(n))ρ(n)

since the right hand side is just re-ordering the order of multiplication. Choose
ρ = σ −1 and note that ε(σ) = ε(ρ). Then
$$\det(A) = \sum_{\rho\in S_n} \varepsilon(\rho)\, A_{1\rho(1)} A_{2\rho(2)} \cdots A_{n\rho(n)} = \det(A^T).$$

Proposition. If matrix B is formed by multiplying every element in a single row


of A by a scalar λ, then det(B) = λ det(A). Consequently, det(λA) = λn det(A).

Proof. Each term in the sum contains exactly one factor from the scaled row, so each
term is multiplied by λ and det(B) = λ det(A). Multiplying the whole matrix by λ
scales each of the n rows by λ in turn, giving det(λA) = λⁿ det(A).
Proposition. If 2 rows (or 2 columns) of A are identical, the determinant is 0.
Proof. wlog, suppose columns 1 and 2 are the same. Then
$$\det(A) = \sum_{\sigma\in S_n} \varepsilon(\sigma) A_{\sigma(1)1} A_{\sigma(2)2} \cdots A_{\sigma(n)n}.$$
Now write an arbitrary σ in the form σ = ρ(1 2). Then ε(σ) = ε(ρ)ε((1 2)) = −ε(ρ). So
$$\det(A) = \sum_{\rho\in S_n} -\varepsilon(\rho) A_{\rho(2)1} A_{\rho(1)2} A_{\rho(3)3} \cdots A_{\rho(n)n}.$$
But columns 1 and 2 are identical, so Aρ(2)1 = Aρ(2)2 and Aρ(1)2 = Aρ(1)1 . So
det(A) = − det(A) and det(A) = 0.

Proposition. If 2 rows or 2 columns of a matrix are linearly dependent, then


the determinant is zero.
Proof. Suppose in A, (column r) + λ(column s) = 0. Define
$$B_{ij} = \begin{cases} A_{ij} & j \neq r \\ A_{ij} + \lambda A_{is} & j = r \end{cases}.$$
Then det(B) = det(A) + λ det(matrix with column r = column s) = det(A).
Then we can see that the rth column of B is all zeroes. So each term in the sum
contains one zero and det(A) = det(B) = 0.
Proposition. Given a matrix A, if B is a matrix obtained by adding a multiple
of a column (or row) of A to another column (or row) of A, then det A = det B.

Corollary. Swapping two rows or columns of a matrix negates the determinant.

14
3 Linear maps IA Vectors and Matrices (Theorems with proof)

Proof. We do the column case only. Let A = (a1 · · · ai · · · aj · · · an ). Then

det(a1 · · · ai · · · aj · · · an ) = det(a1 · · · ai + aj · · · aj · · · an )
= det(a1 · · · ai + aj · · · aj − (ai + aj ) · · · an )
= det(a1 · · · ai + aj · · · − ai · · · an )
= det(a1 · · · aj · · · − ai · · · an )
= − det(a1 · · · aj · · · ai · · · an )

Alternatively, we can prove this from the definition directly, using the fact that
the sign of a transposition is −1 (and that the sign is multiplicative).
Proposition. det(AB) = det(A) det(B).
Proof. First note that
$$\sum_\sigma \varepsilon(\sigma) A_{\sigma(1)\rho(1)} A_{\sigma(2)\rho(2)} \cdots A_{\sigma(n)\rho(n)} = \varepsilon(\rho)\det(A),$$
i.e. swapping columns (or rows) an even/odd number of times gives a factor of +1/−1
respectively. We can prove this by writing σ = µρ.
Now
$$\begin{aligned}
\det AB &= \sum_\sigma \varepsilon(\sigma)(AB)_{\sigma(1)1}(AB)_{\sigma(2)2}\cdots(AB)_{\sigma(n)n} \\
&= \sum_\sigma \varepsilon(\sigma) \sum_{k_1,k_2,\cdots,k_n} A_{\sigma(1)k_1}B_{k_1 1}\cdots A_{\sigma(n)k_n}B_{k_n n} \\
&= \sum_{k_1,\cdots,k_n} B_{k_1 1}\cdots B_{k_n n} \underbrace{\sum_\sigma \varepsilon(\sigma)A_{\sigma(1)k_1}A_{\sigma(2)k_2}\cdots A_{\sigma(n)k_n}}_{S}
\end{aligned}$$
Now consider the many different S’s. If in S, two of k_1 , · · · , k_n are equal, then S
is the determinant of a matrix with two columns the same, i.e. S = 0. So we only
have to consider the sum over distinct k_i ’s. Thus the k_i ’s are a permutation
of 1, · · · , n, say k_i = ρ(i). Then we can write
$$\begin{aligned}
\det AB &= \sum_\rho B_{\rho(1)1}\cdots B_{\rho(n)n} \sum_\sigma \varepsilon(\sigma) A_{\sigma(1)\rho(1)}\cdots A_{\sigma(n)\rho(n)} \\
&= \sum_\rho B_{\rho(1)1}\cdots B_{\rho(n)n}\,(\varepsilon(\rho)\det A) \\
&= \det A \sum_\rho \varepsilon(\rho)B_{\rho(1)1}\cdots B_{\rho(n)n} \\
&= \det A \det B
\end{aligned}$$

Corollary. If A is orthogonal, det A = ±1.


Proof.

AAT = I
det AAT = det I
det A det AT = 1
(det A)2 = 1
det A = ±1


Corollary. If U is unitary, | det U | = 1.


Proof. We have det U † = (det U T )∗ = det(U )∗ . Since U U † = I, we have
det(U ) det(U )∗ = 1.
Proposition. In R3 , orthogonal matrices represent either a rotation (det = 1)
or a reflection (det = −1).

3.5.3 Minors and Cofactors


Theorem (Laplace expansion formula). For any particular fixed i,
$$\det A = \sum_{j=1}^n A_{ji}\,\Delta_{ji}.$$

Proof.
$$\det A = \sum_{j_i=1}^n A_{j_i i} \sum_{j_1,\cdots,\bar{j_i},\cdots,j_n} \varepsilon_{j_1 j_2 \cdots j_n}\, A_{j_1 1} A_{j_2 2} \cdots \overline{A_{j_i i}} \cdots A_{j_n n},$$
where the bar denotes an index or factor that is omitted (the factor A_{j_i i} has been pulled out of the inner sum).
Let σ ∈ Sn be the permutation which moves j_i to the i-th position, and leaves
everything else in its natural order, i.e.
$$\sigma = \begin{pmatrix} 1 & \cdots & i & i+1 & i+2 & \cdots & j_i - 1 & j_i & j_i + 1 & \cdots & n \\ 1 & \cdots & j_i & i & i+1 & \cdots & j_i - 2 & j_i - 1 & j_i + 1 & \cdots & n \end{pmatrix}$$
if j_i > i, and similarly for other cases. To perform this permutation, |i − j_i|
transpositions are made. So ε(σ) = (−1)^{i−j_i}.
Now consider the permutation ρ ∈ Sn
$$\rho = \begin{pmatrix} 1 & \cdots & & \bar{j_i} & \cdots & n \\ j_1 & \cdots & \bar{j_i} & & \cdots & j_n \end{pmatrix}$$
The composition ρσ reorders (1, · · · , n) to (j_1 , j_2 , · · · , j_n ). So ε(ρσ) = ε_{j_1\cdots j_n} =
ε(ρ)ε(σ) = (−1)^{i−j_i} ε_{j_1\cdots \bar{j_i}\cdots j_n}. Hence the original equation becomes
$$\begin{aligned}
\det A &= \sum_{j_i=1}^n A_{j_i i}\,(-1)^{i-j_i} \sum_{j_1\cdots\bar{j_i}\cdots j_n} \varepsilon_{j_1\cdots\bar{j_i}\cdots j_n}\, A_{j_1 1}\cdots \overline{A_{j_i i}}\cdots A_{j_n n} \\
&= \sum_{j_i=1}^n A_{j_i i}(-1)^{i-j_i} M_{j_i i} \\
&= \sum_{j_i=1}^n A_{j_i i}\,\Delta_{j_i i} \\
&= \sum_{j=1}^n A_{ji}\,\Delta_{ji}
\end{aligned}$$
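As a worked illustration, expand $A = \begin{pmatrix}1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 10\end{pmatrix}$ along its first column (i = 1). The cofactors are $\Delta_{11} = \begin{vmatrix}5 & 6\\ 8 & 10\end{vmatrix} = 2$, $\Delta_{21} = -\begin{vmatrix}2 & 3\\ 8 & 10\end{vmatrix} = 4$ and $\Delta_{31} = \begin{vmatrix}2 & 3\\ 5 & 6\end{vmatrix} = -3$, so det A = 1(2) + 4(4) + 7(−3) = −3, which agrees with a direct expansion.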


4 Matrices and linear equations


4.1 Simple example, 2 × 2
4.2 Inverse of an n × n matrix
Lemma.
$$\sum_k A_{ik}\,\Delta_{jk} = \delta_{ij}\det A.$$
Proof. If i ≠ j, then consider an n × n matrix B, which is identical to A except
the jth row is replaced by the ith row of A. So ∆jk of B = ∆jk of A, since ∆jk
does not depend on the elements in row j. Since B has a duplicate row, we know
that
$$0 = \det B = \sum_{k=1}^n B_{jk}\Delta_{jk} = \sum_{k=1}^n A_{ik}\Delta_{jk}.$$
If i = j, then the expression is det A by the Laplace expansion formula.


Theorem. If det A ≠ 0, then A⁻¹ exists and is given by
$$(A^{-1})_{ij} = \frac{\Delta_{ji}}{\det A}.$$
Proof.
$$(A^{-1})_{ik} A_{kj} = \frac{\Delta_{ki}}{\det A} A_{kj} = \frac{\delta_{ij}\det A}{\det A} = \delta_{ij}.$$
So A⁻¹A = I.
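For a 2 × 2 matrix $A = \begin{pmatrix}a & b\\ c & d\end{pmatrix}$ the cofactors are ∆11 = d, ∆12 = −c, ∆21 = −b and ∆22 = a, so the theorem gives the familiar
$$A^{-1} = \frac{1}{ad - bc}\begin{pmatrix}d & -b\\ -c & a\end{pmatrix},$$
valid whenever ad − bc ≠ 0.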

4.3 Homogeneous and inhomogeneous equations


4.3.1 Gaussian elimination

4.4 Matrix rank


Theorem. The column rank and row rank are equal for any m × n matrix.
Proof. Let r be the row rank of A. Write the biggest set of linearly independent
rows as v1T , v2T , · · · vrT or in component form vkT = (vk1 , vk2 , · · · , vkn ) for k =
1, 2, · · · , r.
Now denote the ith row of A as rTi = (Ai1 , Ai2 , · · · Ain ).
Note that every row of A can be written as a linear combination of the v’s.
(If ri cannot be written as a linear combination of the v’s, then it is independent
of the v’s and v is not the maximum collection of linearly independent rows)
Write
$$\mathbf{r}_i^T = \sum_{k=1}^r C_{ik}\,\mathbf{v}_k^T$$
for some coefficients Cik with 1 ≤ i ≤ m and 1 ≤ k ≤ r.

Now the elements of A are
$$A_{ij} = (\mathbf{r}_i)_j = \sum_{k=1}^r C_{ik}(\mathbf{v}_k)_j,$$
or, reading this column by column,
$$\begin{pmatrix} A_{1j} \\ A_{2j} \\ \vdots \\ A_{mj} \end{pmatrix} = \sum_{k=1}^r v_{kj} \begin{pmatrix} C_{1k} \\ C_{2k} \\ \vdots \\ C_{mk} \end{pmatrix}.$$
So every column of A can be written as a linear combination of the r column
vectors ck = (C1k , C2k , · · · , Cmk )T . Then the column rank of A ≤ r, the row rank of A.
Apply the same argument to AT to see that the row rank is ≤ the column
rank.

4.5 Homogeneous problem Ax = 0


4.5.1 Geometrical interpretation
4.5.2 Linear mapping view of Ax = 0

4.6 General solution of Ax = d


5 Eigenvalues and eigenvectors


5.1 Preliminaries and definitions
Theorem (Fundamental theorem of algebra). Let p(z) be a polynomial of degree
m ≥ 1, i.e.
$$p(z) = \sum_{j=0}^m c_j z^j,$$
where cj ∈ C and cm ≠ 0.
Then p(z) = 0 has precisely m (not necessarily distinct) roots in the complex
plane, accounting for multiplicity.
Theorem. λ is an eigenvalue of A iff
det(A − λI) = 0.
Proof. (⇒) Suppose that λ is an eigenvalue and x is the associated eigenvector.
We can rearrange the equation in the definition above to
(A − λI)x = 0
and thus
x ∈ ker(A − λI)
But x ≠ 0. So ker(A − λI) is non-trivial and det(A − λI) = 0. The (⇐) direction
is similar.
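For example, take the rotation matrix $A = \begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}$. Then det(A − λI) = λ² + 1 = 0 gives λ = ±i: the rotation of R² through π/2 has no real eigenvalues, but over C it has the two eigenvalues ±i, with eigenvectors (1, ∓i)ᵀ.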

5.2 Linearly independent eigenvectors


Theorem. Suppose n×n matrix A has distinct eigenvalues λ1 , λ2 , · · · , λn . Then
the corresponding eigenvectors x1 , x2 , · · · , xn are linearly independent.
Proof. Proof by contradiction: Suppose x1 , x2 , · · · , xn are linearly dependent.
Then we can find non-zero constants di for i = 1, 2, · · · , r, such that
d1 x1 + d2 x2 + · · · + dr xr = 0.
Suppose that this is the shortest non-trivial linear combination that gives 0 (we
may need to re-order xi ).
Now apply (A − λ1 I) to the whole equation to obtain
d1 (λ1 − λ1 )x1 + d2 (λ2 − λ1 )x2 + · · · + dr (λr − λ1 )xr = 0.
We know that the first term is 0, while the others are not (since we assumed
λi ≠ λj for i ≠ j). So
d2 (λ2 − λ1 )x2 + · · · + dr (λr − λ1 )xr = 0,
and we have found a shorter linear combination that gives 0. Contradiction.

5.3 Transformation matrices


5.3.1 Transformation law for vectors
Theorem. Denote a vector by u with respect to the basis {ei } and by ũ with respect to the basis {ẽi }.
Then
u = P ũ and ũ = P⁻¹u


5.3.2 Transformation law for matrix


Theorem.
à = P −1 AP.

5.4 Similar matrices


Proposition. Similar matrices have the following properties:

(i) Similar matrices have the same determinant.


(ii) Similar matrices have the same trace.
(iii) Similar matrices have the same characteristic polynomial.

Proof. They are proven as follows:


(i) det B = det(P −1 AP ) = (det A)(det P )−1 (det P ) = det A
(ii)

tr B = Bii
= Pij−1 Ajk Pki
= Ajk Pki Pij−1
= Ajk (P P −1 )kj
= Ajk δkj
= Ajj
= tr A

(iii)

pB (λ) = det(B − λI)


= det(P −1 AP − λI)
= det(P −1 AP − λP −1 IP )
= det(P −1 (A − λI)P )
= det(A − λI)
= pA (λ)

5.5 Diagonalizable matrices


Theorem. Let λ1 , λ2 , · · · , λr , with r ≤ n, be the distinct eigenvalues of A. Let
B1 , B2 , · · · , Br be the bases of the eigenspaces Eλ1 , Eλ2 , · · · , Eλr correspondingly.
Then the set $B = \bigcup_{i=1}^r B_i$ is linearly independent.

Proof. Write $B_1 = \{x_1^{(1)}, x_2^{(1)}, \cdots, x_{m(\lambda_1)}^{(1)}\}$. Then m(λ1 ) = dim(Eλ1 ), and similarly for all Bi .


Consider the following general linear combination of all elements in B, i.e. the equation
$$\sum_{i=1}^r \sum_{j=1}^{m(\lambda_i)} \alpha_{ij}\, x_j^{(i)} = 0.$$
The first sum is summing over all eigenspaces, and the second sum sums over
the basis vectors in Bi . Now apply the matrix
$$\prod_{k=1,2,\cdots,\bar{K},\cdots,r} (A - \lambda_k I)$$
to the above sum, for some arbitrary K (the bar indicates that the factor k = K is omitted). We obtain
$$\sum_{j=1}^{m(\lambda_K)} \alpha_{Kj} \left( \prod_{k=1,2,\cdots,\bar{K},\cdots,r} (\lambda_K - \lambda_k) \right) x_j^{(K)} = 0.$$
Since the $x_j^{(K)}$ are linearly independent (BK is a basis), αKj = 0 for all j. Since
K was arbitrary, all αij must be zero. So B is linearly independent.
Proposition. A is diagonalizable iff all its eigenvalues have zero defect.

5.6 Canonical (Jordan normal) form


Theorem. Any 2 × 2 complex matrix A is similar to exactly one of
$$\begin{pmatrix}\lambda_1 & 0 \\ 0 & \lambda_2\end{pmatrix},\quad \begin{pmatrix}\lambda & 0 \\ 0 & \lambda\end{pmatrix},\quad \begin{pmatrix}\lambda & 1 \\ 0 & \lambda\end{pmatrix}$$

Proof. For each case:

(i) If A has two distinct eigenvalues, then the eigenvectors are linearly independent.
Then we can use P formed from the eigenvectors as its columns, and P⁻¹AP is diagonal.

(ii) If λ1 = λ2 = λ and dim Eλ = 2, then write Eλ = span{u, v}, with
u, v linearly independent. Now use {u, v} as a new basis of C² and
$$\tilde{A} = P^{-1}AP = \begin{pmatrix}\lambda & 0 \\ 0 & \lambda\end{pmatrix} = \lambda I$$
Note that since P⁻¹AP = λI, we have A = P (λI)P⁻¹ = λI. So A is
isotropic, i.e. the same with respect to any basis.

(iii) If λ1 = λ2 = λ and dim(Eλ ) = 1, then Eλ = span{v}. Now choose a basis
of C² as {v, w}, where w ∈ C² \ Eλ .
We know that Aw ∈ C². So Aw = αv + βw. Hence, if we change basis to
{v, w}, then
$$\tilde{A} = P^{-1}AP = \begin{pmatrix}\lambda & \alpha \\ 0 & \beta\end{pmatrix}.$$
However, A and Ã both have eigenvalue λ with algebraic multiplicity 2.
So we must have β = λ. To make α = 1, let u = (Ã − λI)w. We know
u ≠ 0 since w is not in the eigenspace. Then
$$(\tilde{A} - \lambda I)u = (\tilde{A} - \lambda I)^2 w = \begin{pmatrix}0 & \alpha \\ 0 & 0\end{pmatrix}\begin{pmatrix}0 & \alpha \\ 0 & 0\end{pmatrix} w = 0.$$
So u is an eigenvector of Ã with eigenvalue λ.
We have u = Ãw − λw. So Ãw = u + λw.
Change basis to {u, w}. Then A with respect to this basis is $\begin{pmatrix}\lambda & 1 \\ 0 & \lambda\end{pmatrix}$.
This is a two-stage process: P sends the basis to {v, w} and then a matrix Q
sends it to the basis {u, w}. So the similarity transformation is Q⁻¹(P⁻¹AP )Q =
(P Q)⁻¹A(P Q).
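As a concrete instance of case (iii), consider $A = \begin{pmatrix}3 & 1\\ -1 & 1\end{pmatrix}$. Its characteristic polynomial is (3 − λ)(1 − λ) + 1 = (λ − 2)², so λ = 2 has algebraic multiplicity 2, but A − 2I = $\begin{pmatrix}1 & 1\\ -1 & -1\end{pmatrix}$ has rank 1, so dim Eλ = 1. Hence A cannot be diagonalized, and by the theorem it is similar to $\begin{pmatrix}2 & 1\\ 0 & 2\end{pmatrix}$.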
Proposition. (Without proof) The canonical form, or Jordan normal form,
exists for any n × n matrix A. Specifically, there exists a similarity transformation
such that A is similar to a matrix Ã that satisfies the following properties:
(i) Ãαα = λα , i.e. the diagonal consists of the eigenvalues.
(ii) Ãα,α+1 = 0 or 1.

(iii) Ãij = 0 otherwise.

5.7 Cayley-Hamilton Theorem


Theorem (Cayley-Hamilton theorem). Every n × n complex matrix satisfies
its own characteristic equation.
Proof. We will only prove for diagonalizable matrices here. So suppose for our
matrix A, there is some P such that D = diag(λ1 , λ2 , · · · , λn ) = P −1 AP . Note
that
Di = (P −1 AP )(P −1 AP ) · · · (P −1 AP ) = P −1 Ai P.
Hence
pD (D) = pD (P −1 AP ) = P −1 [pD (A)]P.
Since similar matrices have the same characteristic polynomial, pD = pA , so

pA (D) = P⁻¹[pA (A)]P.

However, we also know that Di = diag(λi1 , λi2 , · · · λin ). So

pA (D) = diag(pA (λ1 ), pA (λ2 ), · · · , pA (λn )) = diag(0, 0, · · · , 0)

since the eigenvalues are roots of pA (λ) = 0. So 0 = pA (D) = P −1 pA (A)P and


thus pA (A) = 0.
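As a direct check, take $A = \begin{pmatrix}1 & 2\\ 3 & 4\end{pmatrix}$, whose characteristic polynomial is pA (λ) = λ² − 5λ − 2. Then $A^2 = \begin{pmatrix}7 & 10\\ 15 & 22\end{pmatrix}$, and indeed A² − 5A − 2I = 0, as the theorem asserts (here A happens to have distinct eigenvalues, so the diagonalizable case proved above applies directly).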

5.8 Eigenvalues and eigenvectors of a Hermitian matrix


5.8.1 Eigenvalues and eigenvectors
Theorem. The eigenvalues of a Hermitian matrix H are real.
Proof. Suppose that H has eigenvalue λ with eigenvector v ≠ 0. Then

Hv = λv.

We pre-multiply by v† , a 1 × n row vector, to obtain

v† Hv = λv† v (∗)


We take the Hermitian conjugate of both sides. The left hand side is

(v† Hv)† = v† H † v = v† Hv

since H is Hermitian. The right hand side is

(λv† v)† = λ∗ v† v

So we have
v† Hv = λ∗ v† v.
From (∗), we know that λv† v = λ∗ v† v. Since v ≠ 0, we know that v† v =
v · v ≠ 0. So λ = λ∗ and λ is real.
Theorem. The eigenvectors of a Hermitian matrix H corresponding to distinct
eigenvalues are orthogonal.
Proof. Let

Hvi = λi vi (i)
Hvj = λj vj . (ii)

Pre-multiply (i) by vj† to obtain

vj† Hvi = λi vj† vi . (iii)

Pre-multiply (ii) by vi† and take the Hermitian conjugate to obtain

vj† Hvi = λj vj† vi . (iv)

Equating (iii) and (iv) yields

λi vj† vi = λj vj† vi .

Since λi ≠ λj , we must have vj† vi = 0. So their inner product is zero and they are
orthogonal.
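For instance, $H = \begin{pmatrix}2 & i\\ -i & 2\end{pmatrix}$ is Hermitian, and det(H − λI) = (2 − λ)² − 1 gives the real eigenvalues λ = 1 and λ = 3. Corresponding eigenvectors are v1 = (−i, 1)ᵀ and v2 = (i, 1)ᵀ, and v1†v2 = (i)(i) + (1)(1) = 0, so they are orthogonal, in line with both theorems above.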

5.8.2 Gram-Schmidt orthogonalization (non-examinable)


5.8.3 Unitary transformation
5.8.4 Diagonalization of n × n Hermitian matrices
Theorem. An n × n Hermitian matrix has precisely n orthogonal eigenvectors.
Proof. (Non-examinable) Let λ1 , λ2 , · · · , λr be the distinct eigenvalues of H (r ≤
n), with a set of corresponding orthonormal eigenvectors B = {v1 , v2 , · · · , vr }.
Extend to a basis of the whole of Cn

B 0 = {v1 , v2 , · · · , vr , w1 , w2 , · · · , wn−r }

Now use Gram-Schmidt to create an orthonormal basis

B̃ = {v1 , v2 , · · · , vr , u1 , u2 , · · · , un−r }.


Now write
$$P = \begin{pmatrix} \uparrow & \uparrow & & \uparrow & \uparrow & & \uparrow \\ v_1 & v_2 & \cdots & v_r & u_1 & \cdots & u_{n-r} \\ \downarrow & \downarrow & & \downarrow & \downarrow & & \downarrow \end{pmatrix}$$
We have shown above that this is a unitary matrix, i.e. P −1 = P † . So if we
change basis, we have

P −1 HP = P † HP
 
λ1 0 ··· 0 0 0 ··· 0
 0 λ2 ··· 0 0 0 ··· 0 
 .. .. .. .. ..
 
.. .. 
.
 . . . . . . 0 

0 0 ··· λr 0 0 ··· 0 
= 
0
 0 ··· 0 c11 c12 ··· c1,n−r 

0
 0 ··· 0 c21 c22 ··· c2,n−r 

. .. .. .. .. .. .. ..
 ..

. . . . . . . 
0 0 ··· 0 cn−r,1 cn−r,2 ··· cn−r,n−r

Here C is an (n − r) × (n − r) Hermitian matrix. The eigenvalues of C are also


eigenvalues of H because det(H − λI) = det(P † HP − λI) = (λ1 − λ) · · · (λr −
λ) det(C − λI). So the eigenvalues of C are the eigenvalues of H.
We can keep repeating the process on C until we finish all rows. For example,
if the eigenvalues of C are all distinct, there are n − r orthonormal eigenvectors
wj (for j = r + 1, · · · , n) of C. Let
$$Q = \begin{pmatrix} I_r & 0 \\ 0 & W \end{pmatrix}, \qquad W = \begin{pmatrix} \uparrow & \uparrow & & \uparrow \\ w_{r+1} & w_{r+2} & \cdots & w_n \\ \downarrow & \downarrow & & \downarrow \end{pmatrix},$$
i.e. Q has an r × r identity matrix block in the top left corner, an (n − r) × (n − r)
block W whose columns are the wj in the bottom right corner, and all other entries 0.
Since the columns of Q are orthonormal, Q is unitary. So Q† P † HP Q =
diag(λ1 , λ2 , · · · , λr , λr+1 , · · · , λn ), where the first r λs are distinct and the re-
maining ones are copies of previous ones.
The n linearly-independent eigenvectors are the columns of P Q.

5.8.5 Normal matrices


Proposition. Let N be a normal matrix. Then:
(i) If λ is an eigenvalue of N , then λ∗ is an eigenvalue of N † .
(ii) The eigenvectors of distinct eigenvalues are orthogonal.
(iii) A normal matrix can always be diagonalized with an orthonormal basis of
eigenvectors.


6 Quadratic forms and conics


Theorem. Hermitian forms are real.
Proof. (x† Hx)∗ = (x† Hx)† = x† H † x = x† Hx. So (x† Hx)∗ = x† Hx and it is
real.

6.1 Quadrics and conics


6.1.1 Quadrics
6.1.2 Conic sections (n = 2)

6.2 Focus-directrix property


7 Transformation groups
7.1 Groups of orthogonal matrices
Proposition. The set of all n × n orthogonal matrices P forms a group under
matrix multiplication.
Proof.
0. If P, Q are orthogonal, then consider R = P Q. RRT = (P Q)(P Q)T =
P (QQT )P T = P P T = I. So R is orthogonal.
1. I satisfies II T = I. So I is orthogonal and is the identity of the group.
2. Inverse: if P is orthogonal, then P −1 = P T by definition, which is also
orthogonal.
3. Matrix multiplication is associative since function composition is associative.

7.2 Length preserving matrices


Theorem. Let P ∈ O(n). Then the following are equivalent:
(i) P is orthogonal
(ii) |P x| = |x| for all x
(iii) (P x)T (P y) = xT y for all x, y, i.e. (P x) · (P y) = x · y.
(iv) If (v1 , v2 , · · · , vn ) are orthonormal, so are (P v1 , P v2 , · · · , P vn )
(v) The columns of P are orthonormal.
Proof. We do them one by one:
(i) ⇒ (ii): |P x|2 = (P x)T (P x) = xT P T P x = xT x = |x|2
(ii) ⇒ (iii): |P (x + y)|2 = |x + y|2 . The right hand side is

(xT + yT )(x + y) = xT x + y T y + yT x + xT y = |x|2 + |y|2 + 2xT y.

Similarly, the left hand side is

|P x + P y|2 = |P x|2 + |P y|2 + 2(P x)T P y = |x|2 + |y|2 + 2(P x)T P y.

So (P x)T P y = xT y.
(iii) ⇒ (iv): (P vi )T P vj = viT vj = δij . So P vi ’s are also orthonormal.
(iv) ⇒ (v): Take the vi ’s to be the standard basis. So the columns of P , being
P ei , are orthonormal.
(v) ⇒ (i): The columns of P are orthonormal. Then (P P T )ij = Pik Pjk =
(Pi ) · (Pj ) = δij , viewing Pi as the ith column of P . So P P T = I.

7.3 Lorentz transformations

