Part IA - Vectors and Matrices: Theorems With Proof


Part IA — Vectors and Matrices

Theorems with proof

Based on lectures by N. Peake


Notes taken by Dexter Chua

Michaelmas 2014

These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.

Complex numbers
Review of complex numbers, including complex conjugate, inverse, modulus, argument
and Argand diagram. Informal treatment of complex logarithm, n-th roots and complex
powers. de Moivre’s theorem. [2]

Vectors
Review of elementary algebra of vectors in R3 , including scalar product. Brief discussion
of vectors in Rn and Cn ; scalar product and the Cauchy-Schwarz inequality. Concepts
of linear span, linear independence, subspaces, basis and dimension.
Suffix notation: including summation convention, δij and εijk . Vector product and
triple product: definition and geometrical interpretation. Solution of linear vector
equations. Applications of vectors to geometry, including equations of lines, planes and
spheres. [5]

Matrices
Elementary algebra of 3 × 3 matrices, including determinants. Extension to n × n
complex matrices. Trace, determinant, non-singular matrices and inverses. Matrices as
linear transformations; examples of geometrical actions including rotations, reflections,
dilations, shears; kernel and image. [4]
Simultaneous linear equations: matrix formulation; existence and uniqueness of solu-
tions, geometric interpretation; Gaussian elimination. [3]
Symmetric, anti-symmetric, orthogonal, hermitian and unitary matrices. Decomposition
of a general matrix into isotropic, symmetric trace-free and antisymmetric parts. [1]

Eigenvalues and Eigenvectors


Eigenvalues and eigenvectors; geometric significance. [2]
Proof that eigenvalues of hermitian matrix are real, and that distinct eigenvalues give
an orthogonal basis of eigenvectors. The effect of a general change of basis (similarity
transformations). Diagonalization of general matrices: sufficient conditions; examples
of matrices that cannot be diagonalized. Canonical forms for 2 × 2 matrices. [5]
Discussion of quadratic forms, including change of basis. Classification of conics,
cartesian and polar forms. [1]
Rotation matrices and Lorentz transformations as transformation groups. [1]


Contents
0 Introduction

1 Complex numbers
1.1 Basic properties
1.2 Complex exponential function
1.3 Roots of unity
1.4 Complex logarithm and power
1.5 De Moivre’s theorem
1.6 Lines and circles in C

2 Vectors
2.1 Definition and basic properties
2.2 Scalar product
2.2.1 Geometric picture (R2 and R3 only)
2.2.2 General algebraic definition
2.3 Cauchy-Schwarz inequality
2.4 Vector product
2.5 Scalar triple product
2.6 Spanning sets and bases
2.6.1 2D space
2.6.2 3D space
2.6.3 Rn space
2.6.4 Cn space
2.7 Vector subspaces
2.8 Suffix notation
2.9 Geometry
2.9.1 Lines
2.9.2 Plane
2.10 Vector equations

3 Linear maps
3.1 Examples
3.1.1 Rotation in R3
3.1.2 Reflection in R3
3.2 Linear Maps
3.3 Rank and nullity
3.4 Matrices
3.4.1 Examples
3.4.2 Matrix Algebra
3.4.3 Decomposition of an n × n matrix
3.4.4 Matrix inverse
3.5 Determinants
3.5.1 Permutations
3.5.2 Properties of determinants
3.5.3 Minors and Cofactors

4 Matrices and linear equations
4.1 Simple example, 2 × 2
4.2 Inverse of an n × n matrix
4.3 Homogeneous and inhomogeneous equations
4.3.1 Gaussian elimination
4.4 Matrix rank
4.5 Homogeneous problem Ax = 0
4.5.1 Geometrical interpretation
4.5.2 Linear mapping view of Ax = 0
4.6 General solution of Ax = d

5 Eigenvalues and eigenvectors
5.1 Preliminaries and definitions
5.2 Linearly independent eigenvectors
5.3 Transformation matrices
5.3.1 Transformation law for vectors
5.3.2 Transformation law for matrix
5.4 Similar matrices
5.5 Diagonalizable matrices
5.6 Canonical (Jordan normal) form
5.7 Cayley-Hamilton Theorem
5.8 Eigenvalues and eigenvectors of a Hermitian matrix
5.8.1 Eigenvalues and eigenvectors
5.8.2 Gram-Schmidt orthogonalization (non-examinable)
5.8.3 Unitary transformation
5.8.4 Diagonalization of n × n Hermitian matrices
5.8.5 Normal matrices

6 Quadratic forms and conics
6.1 Quadrics and conics
6.1.1 Quadrics
6.1.2 Conic sections (n = 2)
6.2 Focus-directrix property

7 Transformation groups
7.1 Groups of orthogonal matrices
7.2 Length preserving matrices
7.3 Lorentz transformations


0 Introduction


1 Complex numbers
1.1 Basic properties
Proposition. For z = a + ib, we have z z̄ = a² + b² = |z|².
Proposition. z⁻¹ = z̄/|z|².

Theorem (Triangle inequality). For all z1 , z2 ∈ C, we have

|z1 + z2 | ≤ |z1 | + |z2 |.

Alternatively, we have |z1 − z2 | ≥ ||z1 | − |z2 ||.

1.2 Complex exponential function


Lemma.
$$\sum_{n=0}^\infty \sum_{m=0}^\infty a_{mn} = \sum_{r=0}^\infty \sum_{m=0}^r a_{r-m,m}$$

Proof.
$$\begin{aligned}
\sum_{n=0}^\infty \sum_{m=0}^\infty a_{mn} &= a_{00} + a_{01} + a_{02} + \cdots \\
&\quad + a_{10} + a_{11} + a_{12} + \cdots \\
&\quad + a_{20} + a_{21} + a_{22} + \cdots \\
&= (a_{00}) + (a_{10} + a_{01}) + (a_{20} + a_{11} + a_{02}) + \cdots \\
&= \sum_{r=0}^\infty \sum_{m=0}^r a_{r-m,m}
\end{aligned}$$

Theorem. exp(z1 ) exp(z2 ) = exp(z1 + z2 )

Proof.
$$\begin{aligned}
\exp(z_1)\exp(z_2) &= \sum_{n=0}^\infty \sum_{m=0}^\infty \frac{z_1^m}{m!}\frac{z_2^n}{n!} \\
&= \sum_{r=0}^\infty \sum_{m=0}^r \frac{z_1^{r-m}}{(r-m)!}\frac{z_2^m}{m!} \\
&= \sum_{r=0}^\infty \frac{1}{r!}\sum_{m=0}^r \frac{r!}{(r-m)!\,m!}\, z_1^{r-m} z_2^m \\
&= \sum_{r=0}^\infty \frac{(z_1 + z_2)^r}{r!}
\end{aligned}$$

Theorem. e^{iz} = cos z + i sin z.


Proof.
$$\begin{aligned}
e^{iz} &= \sum_{n=0}^\infty \frac{i^n}{n!} z^n \\
&= \sum_{n=0}^\infty \frac{i^{2n}}{(2n)!} z^{2n} + \sum_{n=0}^\infty \frac{i^{2n+1}}{(2n+1)!} z^{2n+1} \\
&= \sum_{n=0}^\infty \frac{(-1)^n}{(2n)!} z^{2n} + i\sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!} z^{2n+1} \\
&= \cos z + i\sin z
\end{aligned}$$

1.3 Roots of unity


Proposition. If ω = exp(2πi/n) with n ≥ 2, then 1 + ω + ω^2 + · · · + ω^{n−1} = 0.
Proof. Two proofs are provided:
(i) Consider the equation z^n = 1. The sum of all its n roots is minus the coefficient of z^{n−1}, which is 0. The roots are precisely 1, ω, ω^2, · · · , ω^{n−1}, so 1 + ω + ω^2 + · · · + ω^{n−1} = 0.
(ii) Since ω^n − 1 = (ω − 1)(1 + ω + · · · + ω^{n−1}) and ω ≠ 1, dividing by (ω − 1)
gives 1 + ω + · · · + ω^{n−1} = (ω^n − 1)/(ω − 1) = 0.
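For a concrete illustration, take n = 3. Then ω = exp(2πi/3) = −1/2 + (√3/2)i and ω^2 = −1/2 − (√3/2)i, so 1 + ω + ω^2 = 1 − 1/2 − 1/2 + 0i = 0, as claimed.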

1.4 Complex logarithm and power


1.5 De Moivre’s theorem
Theorem (De Moivre’s theorem).

cos nθ + i sin nθ = (cos θ + i sin θ)n .

Proof. First prove for the n ≥ 0 case by induction. The n = 0 case is true since
it merely reads 1 = 1. We then have
$$\begin{aligned}
(\cos\theta + i\sin\theta)^{n+1} &= (\cos\theta + i\sin\theta)^n(\cos\theta + i\sin\theta) \\
&= (\cos n\theta + i\sin n\theta)(\cos\theta + i\sin\theta) \\
&= \cos(n+1)\theta + i\sin(n+1)\theta.
\end{aligned}$$
If n < 0, let m = −n. Then m > 0 and
$$\begin{aligned}
(\cos\theta + i\sin\theta)^{-m} &= (\cos m\theta + i\sin m\theta)^{-1} \\
&= \frac{\cos m\theta - i\sin m\theta}{(\cos m\theta + i\sin m\theta)(\cos m\theta - i\sin m\theta)} \\
&= \frac{\cos(-m\theta) + i\sin(-m\theta)}{\cos^2 m\theta + \sin^2 m\theta} \\
&= \cos(-m\theta) + i\sin(-m\theta) \\
&= \cos n\theta + i\sin n\theta
\end{aligned}$$
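For example, taking n = 2 recovers the double angle formulae: comparing real and imaginary parts of (cos θ + i sin θ)² = cos²θ − sin²θ + 2i sin θ cos θ = cos 2θ + i sin 2θ gives cos 2θ = cos²θ − sin²θ and sin 2θ = 2 sin θ cos θ.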


1.6 Lines and circles in C


Theorem (Equation of straight line). The equation of a straight line through
z0 and parallel to w is given by

z w̄ − z̄w = z0 w̄ − z̄0 w.

Theorem. The general equation of a circle with center c ∈ C and radius ρ ∈ R+ can be given by

z z̄ − c̄z − cz̄ = ρ² − cc̄.
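For example, taking center c = i and radius ρ = 1 gives z z̄ + iz − iz̄ = 0; writing z = x + iy this becomes x² + y² − 2y = 0, i.e. x² + (y − 1)² = 1, the expected circle.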


2 Vectors
2.1 Definition and basic properties
2.2 Scalar product
2.2.1 Geometric picture (R2 and R3 only)
2.2.2 General algebraic definition

2.3 Cauchy-Schwarz inequality


Theorem (Cauchy-Schwarz inequality). For all x, y ∈ Rn ,

|x · y| ≤ |x||y|.

Proof. Consider the expression |x − λy|². We must have

|x − λy|² ≥ 0
(x − λy) · (x − λy) ≥ 0
λ²|y|² − 2λ(x · y) + |x|² ≥ 0.

Viewing this as a quadratic in λ, we see that the quadratic is non-negative and
thus cannot have two distinct real roots. Thus the discriminant ∆ ≤ 0. So

4(x · y)² ≤ 4|y|²|x|²
(x · y)² ≤ |x|²|y|²
|x · y| ≤ |x||y|.
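As a quick numerical check, take x = (1, 2, 2) and y = (2, 1, 2) in R³. Then x · y = 8 while |x||y| = 3 · 3 = 9, so |x · y| ≤ |x||y| holds (strictly, since x and y are not parallel).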

Corollary (Triangle inequality).

|x + y| ≤ |x| + |y|.

Proof.

|x + y|2 = (x + y) · (x + y)
= |x|2 + 2x · y + |y|2
≤ |x|2 + 2|x||y| + |y|2
= (|x| + |y|)2 .

So

|x + y| ≤ |x| + |y|.

2.4 Vector product


Proposition.
$$\mathbf{a}\times\mathbf{b} = (a_1\hat{\imath} + a_2\hat{\jmath} + a_3\hat{k}) \times (b_1\hat{\imath} + b_2\hat{\jmath} + b_3\hat{k}) = (a_2b_3 - a_3b_2)\hat{\imath} + \cdots = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}$$
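For instance, with a = (1, 2, 3) and b = (4, 5, 6) this gives a × b = (2·6 − 3·5, 3·4 − 1·6, 1·5 − 2·4) = (−3, 6, −3), which is easily checked to be orthogonal to both a and b.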


2.5 Scalar triple product


Proposition. If a parallelepiped has sides represented by vectors a, b, c that
form a right-handed system, then the volume of the parallelepiped is given by
[a, b, c].
Proof. The area of the base of the parallelepiped is given by |b||c| sin θ = |b × c|.
Thus the volume = |b × c||a| cos φ = |a · (b × c)|, where φ is the angle between
a and the normal to b and c. However, since a, b, c form a right-handed system,
we have a · (b × c) ≥ 0. Therefore the volume is a · (b × c).
Theorem. a × (b + c) = a × b + a × c.
Proof. Let d = a × (b + c) − a × b − a × c. We have

d · d = d · [a × (b + c)] − d · (a × b) − d · (a × c)
= (b + c) · (d × a) − b · (d × a) − c · (d × a)
=0

Thus d = 0.

2.6 Spanning sets and bases


2.6.1 2D space
Theorem. The coefficients λ, µ are unique.
Proof. Suppose that r = λa + µb = λ′a + µ′b. Take the vector product with a
on both sides to get (µ − µ′)a × b = 0. Since a × b ≠ 0, then µ = µ′. Similarly,
λ = λ′.

2.6.2 3D space
Theorem. If a, b, c ∈ R³ are non-coplanar, i.e. a · (b × c) ≠ 0, then they form
a basis of R³.
Proof. For any r, write r = λa + µb + νc. Performing the scalar product
with b × c on both sides, one obtains r · (b × c) = λa · (b × c) + µb · (b × c) +
νc · (b × c) = λ[a, b, c]. Thus λ = [r, b, c]/[a, b, c]. The values of µ and ν can
be found similarly. Thus each r can be written as a linear combination of a, b
and c.
By the formula derived above, it follows that if αa + βb + γc = 0, then
α = β = γ = 0. Thus they are linearly independent.

2.6.3 Rn space
2.6.4 Cn space

2.7 Vector subspaces


2.8 Suffix notation
Proposition. (a × b)i = εijk aj bk
Proof. By expansion of the formula.

9
2 Vectors IA Vectors and Matrices (Theorems with proof)

Theorem. ε_{ijk}ε_{ipq} = δ_{jp}δ_{kq} − δ_{jq}δ_{kp}

Proof. Proof by exhaustion:
$$\text{RHS} = \begin{cases} +1 & \text{if } j = p \text{ and } k = q \\ -1 & \text{if } j = q \text{ and } k = p \\ 0 & \text{otherwise} \end{cases}$$
LHS: Summing over i, the only non-zero terms are when j, k ≠ i and p, q ≠ i.
If j = p and k = q, LHS is (−1)² or (+1)² = 1. If j = q and k = p, LHS is
(+1)(−1) or (−1)(+1) = −1. All other possibilities result in 0.
Proposition.
a · (b × c) = b · (c × a)
Proof. In suffix notation, we have

a · (b × c) = ai (b × c)i = εijk bj ck ai = εjki bj ck ai = b · (c × a).

Theorem (Vector triple product).

a × (b × c) = (a · c)b − (a · b)c.

Proof.

[a × (b × c)]i = εijk aj (b × c)k


= εijk εkpq aj bp cq
= εijk εpqk aj bp cq
= (δip δjq − δiq δjp )aj bp cq
= aj bi cj − aj ci bj
= (a · c)bi − (a · b)ci
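As a quick check with standard basis vectors, take a = b = î and c = ĵ. Then a × (b × c) = î × (î × ĵ) = î × k̂ = −ĵ, while (a · c)b − (a · b)c = 0·î − 1·ĵ = −ĵ, in agreement.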

Proposition. (a × b) · (a × c) = (a · a)(b · c) − (a · b)(a · c).


Proof.

LHS = (a × b)i (a × c)i


= εijk aj bk εipq ap cq
= (δjp δkq − δjq δkp )aj bk ap cq
= aj bk aj ck − aj bk ak cj
= (a · a)(b · c) − (a · b)(a · c)

2.9 Geometry
2.9.1 Lines
Theorem. The equation of a straight line through a and parallel to t is

(x − a) × t = 0 or x × t = a × t.


2.9.2 Plane
Theorem. The equation of a plane through b with normal n is given by

x · n = b · n.

2.10 Vector equations


3 Linear maps
3.1 Examples
3.1.1 Rotation in R3
3.1.2 Reflection in R3

3.2 Linear Maps


Theorem. Consider a linear map f : U → V , where U, V are vector spaces.
Then im(f ) is a subspace of V , and ker(f ) is a subspace of U .
Proof. Both are non-empty since f (0) = 0.
If x, y ∈ im(f ), then ∃a, b ∈ U such that x = f (a), y = f (b). Then
λx + µy = λf (a) + µf (b) = f (λa + µb). Now λa + µb ∈ U since U is a vector
space, so there is an element in U that maps to λx + µy. So λx + µy ∈ im(f )
and im(f ) is a subspace of V .
Suppose x, y ∈ ker(f ), i.e. f (x) = f (y) = 0. Then f (λx + µy) = λf (x) +
µf (y) = λ0 + µ0 = 0. Therefore λx + µy ∈ ker(f ).

3.3 Rank and nullity


Theorem (Rank-nullity theorem). For a linear map f : U → V ,

r(f ) + n(f ) = dim(U ).

Proof. (Non-examinable) Write dim(U ) = n and n(f ) = m. If m = n, then f is


the zero map, and the proof is trivial, since r(f ) = 0. Otherwise, assume m < n.
Suppose {e1 , e2 , · · · , em } is a basis of ker f . Extend this to a basis of the
whole of U to get {e1 , e2 , · · · , em , em+1 , · · · , en }. To prove the theorem, we
need to prove that {f (em+1 ), f (em+2 ), · · · , f (en )} is a basis of im(f ).
(i) First show that it spans im(f ). Take y ∈ im(f ). Thus ∃x ∈ U such that
y = f (x). Then

y = f (α1 e1 + α2 e2 + · · · + αn en ),

since e1 , · · · en is a basis of U . Thus

y = α1 f (e1 ) + α2 f (e2 ) + · · · + αm f (em ) + αm+1 f (em+1 ) + · · · + αn f (en ).

The first m terms map to 0, since e1 , · · · em is the basis of the kernel of f .


Thus
y = αm+1 f (em+1 ) + · · · + αn f (en ).

(ii) To show that they are linearly independent, suppose

αm+1 f (em+1 ) + · · · + αn f (en ) = 0.

Then
f (αm+1 em+1 + · · · + αn en ) = 0.


Thus αm+1 em+1 + · · · + αn en ∈ ker(f ). Since {e1 , · · · , em } span ker(f ),


there exist some α1 , α2 , · · · αm such that

αm+1 em+1 + · · · + αn en = α1 e1 + · · · + αm em .

But e1 , · · · , en form a basis of U and are hence linearly independent, so rearranging
the above into a single linear combination equal to 0 forces αi = 0 for all i.

Then the only solution to the equation αm+1 f (em+1 ) + · · · + αn f (en ) = 0
is αi = 0, and they are linearly independent by definition.
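For example, consider the projection f : R³ → R³ given by f (x, y, z) = (x, y, 0). Its image is the xy-plane and its kernel is the z-axis, so r(f ) = 2 and n(f ) = 1, and indeed r(f ) + n(f ) = 3 = dim(R³).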

3.4 Matrices
3.4.1 Examples
3.4.2 Matrix Algebra
Proposition.
(i) (AT )T = A.

(ii) If x is a column vector with components x1 , x2 , · · · , xn , then xT is the row vector (x1 x2 · · · xn ).

(iii) (AB)T = B T AT , since (AB)Tij = (AB)ji = Ajk Bki = Bki Ajk
= (B T )ik (AT )kj = (B T AT )ij .
Proposition. tr(BC) = tr(CB)

Proof. tr(BC) = Bik Cki = Cki Bik = (CB)kk = tr(CB)

3.4.3 Decomposition of an n × n matrix


3.4.4 Matrix inverse
Proposition. (AB)−1 = B −1 A−1
Proof. (B −1 A−1 )(AB) = B −1 (A−1 A)B = B −1 B = I.

3.5 Determinants
3.5.1 Permutations
Proposition. Any q-cycle can be written as a product of 2-cycles.

Proof. (1 2 3 · · · q) = (1 2)(2 3)(3 4) · · · (q − 1 q).


Proposition.
$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$$


3.5.2 Properties of determinants


Proposition. det(A) = det(AT ).

Proof. Take a single term Aσ(1)1 Aσ(2)2 · · · Aσ(n)n and let ρ be another permuta-
tion in Sn . We have

Aσ(1)1 Aσ(2)2 · · · Aσ(n)n = Aσ(ρ(1))ρ(1) Aσ(ρ(2))ρ(2) · · · Aσ(ρ(n))ρ(n)

since the right hand side is just re-ordering the order of multiplication. Choose
ρ = σ −1 and note that ε(σ) = ε(ρ). Then
$$\det(A) = \sum_{\rho\in S_n} \varepsilon(\rho)\, A_{1\rho(1)} A_{2\rho(2)} \cdots A_{n\rho(n)} = \det(A^T).$$

Proposition. If matrix B is formed by multiplying every element in a single row


of A by a scalar λ, then det(B) = λ det(A). Consequently, det(λA) = λn det(A).

Proof. Each term in the sum contains exactly one factor from the scaled row, so each
term is multiplied by λ and det(B) = λ det(A). Multiplying the whole matrix by λ
scales each of the n rows by λ in turn, giving det(λA) = λⁿ det(A).
Proposition. If 2 rows (or 2 columns) of A are identical, the determinant is 0.
Proof. wlog, suppose columns 1 and 2 are the same. Then
$$\det(A) = \sum_{\sigma\in S_n} \varepsilon(\sigma) A_{\sigma(1)1} A_{\sigma(2)2} \cdots A_{\sigma(n)n}.$$
Now write an arbitrary σ in the form σ = ρ(1 2). Then ε(σ) = ε(ρ)ε((1 2)) = −ε(ρ). So
$$\det(A) = \sum_{\rho\in S_n} -\varepsilon(\rho) A_{\rho(2)1} A_{\rho(1)2} A_{\rho(3)3} \cdots A_{\rho(n)n}.$$
But columns 1 and 2 are identical, so Aρ(2)1 = Aρ(2)2 and Aρ(1)2 = Aρ(1)1 . So
det(A) = − det(A) and det(A) = 0.

Proposition. If 2 rows or 2 columns of a matrix are linearly dependent, then


the determinant is zero.
Proof. Suppose in A, (column r) + λ(column s) = 0. Define
$$B_{ij} = \begin{cases} A_{ij} & j \neq r \\ A_{ij} + \lambda A_{is} & j = r \end{cases}.$$
Then det(B) = det(A) + λ det(matrix with column r = column s) = det(A).
Then we can see that the rth column of B is all zeroes. So each term in the sum
contains one zero and det(A) = det(B) = 0.
Proposition. Given a matrix A, if B is a matrix obtained by adding a multiple
of a column (or row) of A to another column (or row) of A, then det A = det B.

Corollary. Swapping two rows or columns of a matrix negates the determinant.

14
3 Linear maps IA Vectors and Matrices (Theorems with proof)

Proof. We do the column case only. Let A = (a1 · · · ai · · · aj · · · an ). Then

det(a1 · · · ai · · · aj · · · an ) = det(a1 · · · ai + aj · · · aj · · · an )
= det(a1 · · · ai + aj · · · aj − (ai + aj ) · · · an )
= det(a1 · · · ai + aj · · · − ai · · · an )
= det(a1 · · · aj · · · − ai · · · an )
= − det(a1 · · · aj · · · ai · · · an )

Alternatively, we can prove this from the definition directly, using the fact that
the sign of a transposition is −1 (and that the sign is multiplicative).
Proposition. det(AB) = det(A) det(B).
Proof. First note that
$$\sum_\sigma \varepsilon(\sigma) A_{\sigma(1)\rho(1)} A_{\sigma(2)\rho(2)} \cdots A_{\sigma(n)\rho(n)} = \varepsilon(\rho)\det(A),$$
i.e. swapping columns (or rows) an even/odd number of times gives a factor of +1/−1
respectively. We can prove this by writing σ = µρ.
Now
$$\begin{aligned}
\det AB &= \sum_\sigma \varepsilon(\sigma)(AB)_{\sigma(1)1}(AB)_{\sigma(2)2}\cdots(AB)_{\sigma(n)n} \\
&= \sum_\sigma \varepsilon(\sigma) \sum_{k_1,k_2,\cdots,k_n} A_{\sigma(1)k_1}B_{k_1 1}\cdots A_{\sigma(n)k_n}B_{k_n n} \\
&= \sum_{k_1,\cdots,k_n} B_{k_1 1}\cdots B_{k_n n} \underbrace{\sum_\sigma \varepsilon(\sigma)A_{\sigma(1)k_1}A_{\sigma(2)k_2}\cdots A_{\sigma(n)k_n}}_{S}
\end{aligned}$$
Now consider the many different S’s. If in S, two of k_1 , · · · , k_n are equal, then S
is the determinant of a matrix with two columns the same, i.e. S = 0. So we only
have to consider the sum over distinct k_i ’s. Thus the k_i ’s are a permutation
of 1, · · · , n, say k_i = ρ(i). Then we can write
$$\begin{aligned}
\det AB &= \sum_\rho B_{\rho(1)1}\cdots B_{\rho(n)n} \sum_\sigma \varepsilon(\sigma) A_{\sigma(1)\rho(1)}\cdots A_{\sigma(n)\rho(n)} \\
&= \sum_\rho B_{\rho(1)1}\cdots B_{\rho(n)n}\,(\varepsilon(\rho)\det A) \\
&= \det A \sum_\rho \varepsilon(\rho)B_{\rho(1)1}\cdots B_{\rho(n)n} \\
&= \det A \det B
\end{aligned}$$

Corollary. If A is orthogonal, det A = ±1.


Proof.

AAT = I
det AAT = det I
det A det AT = 1
(det A)2 = 1
det A = ±1


Corollary. If U is unitary, | det U | = 1.


Proof. We have det U † = (det U T )∗ = det(U )∗ . Since U U † = I, we have
det(U ) det(U )∗ = 1.
Proposition. In R3 , orthogonal matrices represent either a rotation (det = 1)
or a reflection (det = −1).

3.5.3 Minors and Cofactors


Theorem (Laplace expansion formula). For any particular fixed i,
$$\det A = \sum_{j=1}^n A_{ji}\,\Delta_{ji}.$$

Proof.
$$\det A = \sum_{j_i=1}^n A_{j_i i} \sum_{j_1,\cdots,\bar{j_i},\cdots,j_n} \varepsilon_{j_1 j_2 \cdots j_n}\, A_{j_1 1} A_{j_2 2} \cdots \overline{A_{j_i i}} \cdots A_{j_n n},$$
where the bar denotes an index or factor that is omitted (the factor A_{j_i i} has been pulled out of the inner sum).
Let σ ∈ Sn be the permutation which moves j_i to the i-th position, and leaves
everything else in its natural order, i.e.
$$\sigma = \begin{pmatrix} 1 & \cdots & i & i+1 & i+2 & \cdots & j_i - 1 & j_i & j_i + 1 & \cdots & n \\ 1 & \cdots & j_i & i & i+1 & \cdots & j_i - 2 & j_i - 1 & j_i + 1 & \cdots & n \end{pmatrix}$$
if j_i > i, and similarly for other cases. To perform this permutation, |i − j_i|
transpositions are made. So ε(σ) = (−1)^{i−j_i}.
Now consider the permutation ρ ∈ Sn
$$\rho = \begin{pmatrix} 1 & \cdots & & \bar{j_i} & \cdots & n \\ j_1 & \cdots & \bar{j_i} & & \cdots & j_n \end{pmatrix}$$
The composition ρσ reorders (1, · · · , n) to (j_1 , j_2 , · · · , j_n ). So ε(ρσ) = ε_{j_1\cdots j_n} =
ε(ρ)ε(σ) = (−1)^{i−j_i} ε_{j_1\cdots \bar{j_i}\cdots j_n}. Hence the original equation becomes
$$\begin{aligned}
\det A &= \sum_{j_i=1}^n A_{j_i i}\,(-1)^{i-j_i} \sum_{j_1\cdots\bar{j_i}\cdots j_n} \varepsilon_{j_1\cdots\bar{j_i}\cdots j_n}\, A_{j_1 1}\cdots \overline{A_{j_i i}}\cdots A_{j_n n} \\
&= \sum_{j_i=1}^n A_{j_i i}(-1)^{i-j_i} M_{j_i i} \\
&= \sum_{j_i=1}^n A_{j_i i}\,\Delta_{j_i i} \\
&= \sum_{j=1}^n A_{ji}\,\Delta_{ji}
\end{aligned}$$
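As a worked illustration, expand $A = \begin{pmatrix}1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 10\end{pmatrix}$ along its first column (i = 1). The cofactors are $\Delta_{11} = \begin{vmatrix}5 & 6\\ 8 & 10\end{vmatrix} = 2$, $\Delta_{21} = -\begin{vmatrix}2 & 3\\ 8 & 10\end{vmatrix} = 4$ and $\Delta_{31} = \begin{vmatrix}2 & 3\\ 5 & 6\end{vmatrix} = -3$, so det A = 1(2) + 4(4) + 7(−3) = −3, which agrees with a direct expansion.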


4 Matrices and linear equations


4.1 Simple example, 2 × 2
4.2 Inverse of an n × n matrix
Lemma.
$$\sum_k A_{ik}\,\Delta_{jk} = \delta_{ij}\det A.$$
Proof. If i ≠ j, then consider an n × n matrix B, which is identical to A except
the jth row is replaced by the ith row of A. So ∆jk of B = ∆jk of A, since ∆jk
does not depend on the elements in row j. Since B has a duplicate row, we know
that
$$0 = \det B = \sum_{k=1}^n B_{jk}\Delta_{jk} = \sum_{k=1}^n A_{ik}\Delta_{jk}.$$
If i = j, then the expression is det A by the Laplace expansion formula.


Theorem. If det A ≠ 0, then A⁻¹ exists and is given by
$$(A^{-1})_{ij} = \frac{\Delta_{ji}}{\det A}.$$
Proof.
$$(A^{-1})_{ik} A_{kj} = \frac{\Delta_{ki}}{\det A} A_{kj} = \frac{\delta_{ij}\det A}{\det A} = \delta_{ij}.$$
So A⁻¹A = I.
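For a 2 × 2 matrix $A = \begin{pmatrix}a & b\\ c & d\end{pmatrix}$ the cofactors are ∆11 = d, ∆12 = −c, ∆21 = −b and ∆22 = a, so the theorem gives the familiar
$$A^{-1} = \frac{1}{ad - bc}\begin{pmatrix}d & -b\\ -c & a\end{pmatrix},$$
valid whenever ad − bc ≠ 0.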

4.3 Homogeneous and inhomogeneous equations


4.3.1 Gaussian elimination

4.4 Matrix rank


Theorem. The column rank and row rank are equal for any m × n matrix.
Proof. Let r be the row rank of A. Write the biggest set of linearly independent
rows as v1T , v2T , · · · vrT or in component form vkT = (vk1 , vk2 , · · · , vkn ) for k =
1, 2, · · · , r.
Now denote the ith row of A as rTi = (Ai1 , Ai2 , · · · Ain ).
Note that every row of A can be written as a linear combination of the v’s.
(If ri cannot be written as a linear combination of the v’s, then it is independent
of the v’s and v is not the maximum collection of linearly independent rows)
Write
$$\mathbf{r}_i^T = \sum_{k=1}^r C_{ik}\,\mathbf{v}_k^T$$
for some coefficients Cik with 1 ≤ i ≤ m and 1 ≤ k ≤ r.

Now the elements of A are
$$A_{ij} = (\mathbf{r}_i)_j = \sum_{k=1}^r C_{ik}(\mathbf{v}_k)_j,$$
or, reading this column by column,
$$\begin{pmatrix} A_{1j} \\ A_{2j} \\ \vdots \\ A_{mj} \end{pmatrix} = \sum_{k=1}^r v_{kj} \begin{pmatrix} C_{1k} \\ C_{2k} \\ \vdots \\ C_{mk} \end{pmatrix}.$$
So every column of A can be written as a linear combination of the r column
vectors ck = (C1k , C2k , · · · , Cmk )T . Then the column rank of A ≤ r, the row rank of A.
Apply the same argument to AT to see that the row rank is ≤ the column
rank.

4.5 Homogeneous problem Ax = 0


4.5.1 Geometrical interpretation
4.5.2 Linear mapping view of Ax = 0

4.6 General solution of Ax = d


5 Eigenvalues and eigenvectors


5.1 Preliminaries and definitions
Theorem (Fundamental theorem of algebra). Let p(z) be a polynomial of degree
m ≥ 1, i.e.
$$p(z) = \sum_{j=0}^m c_j z^j,$$
where cj ∈ C and cm ≠ 0.
Then p(z) = 0 has precisely m (not necessarily distinct) roots in the complex
plane, accounting for multiplicity.
Theorem. λ is an eigenvalue of A iff
det(A − λI) = 0.
Proof. (⇒) Suppose that λ is an eigenvalue and x is the associated eigenvector.
We can rearrange the equation in the definition above to
(A − λI)x = 0
and thus
x ∈ ker(A − λI)
But x ≠ 0. So ker(A − λI) is non-trivial and det(A − λI) = 0. The (⇐) direction
is similar.
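For example, take the rotation matrix $A = \begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}$. Then det(A − λI) = λ² + 1 = 0 gives λ = ±i: the rotation of R² through π/2 has no real eigenvalues, but over C it has the two eigenvalues ±i, with eigenvectors (1, ∓i)ᵀ.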

5.2 Linearly independent eigenvectors


Theorem. Suppose n×n matrix A has distinct eigenvalues λ1 , λ2 , · · · , λn . Then
the corresponding eigenvectors x1 , x2 , · · · , xn are linearly independent.
Proof. Proof by contradiction: Suppose x1 , x2 , · · · , xn are linearly dependent.
Then we can find non-zero constants di for i = 1, 2, · · · , r, such that
d1 x1 + d2 x2 + · · · + dr xr = 0.
Suppose that this is the shortest non-trivial linear combination that gives 0 (we
may need to re-order xi ).
Now apply (A − λ1 I) to the whole equation to obtain
d1 (λ1 − λ1 )x1 + d2 (λ2 − λ1 )x2 + · · · + dr (λr − λ1 )xr = 0.
We know that the first term is 0, while the others are not (since we assumed
λi ≠ λj for i ≠ j). So
d2 (λ2 − λ1 )x2 + · · · + dr (λr − λ1 )xr = 0,
and we have found a shorter linear combination that gives 0. Contradiction.

5.3 Transformation matrices


5.3.1 Transformation law for vectors
Theorem. Denote a vector by u with respect to the basis {ei } and by ũ with respect to the basis {ẽi }.
Then
u = P ũ and ũ = P⁻¹u


5.3.2 Transformation law for matrix


Theorem.
à = P −1 AP.

5.4 Similar matrices


Proposition. Similar matrices have the following properties:

(i) Similar matrices have the same determinant.


(ii) Similar matrices have the same trace.
(iii) Similar matrices have the same characteristic polynomial.

Proof. They are proven as follows:


(i) det B = det(P −1 AP ) = (det A)(det P )−1 (det P ) = det A
(ii)

tr B = Bii
= Pij−1 Ajk Pki
= Ajk Pki Pij−1
= Ajk (P P −1 )kj
= Ajk δkj
= Ajj
= tr A

(iii)

pB (λ) = det(B − λI)


= det(P −1 AP − λI)
= det(P −1 AP − λP −1 IP )
= det(P −1 (A − λI)P )
= det(A − λI)
= pA (λ)

5.5 Diagonalizable matrices


Theorem. Let λ1 , λ2 , · · · , λr , with r ≤ n, be the distinct eigenvalues of A. Let
B1 , B2 , · · · , Br be the bases of the eigenspaces Eλ1 , Eλ2 , · · · , Eλr correspondingly.
Then the set $B = \bigcup_{i=1}^r B_i$ is linearly independent.

Proof. Write $B_1 = \{x_1^{(1)}, x_2^{(1)}, \cdots, x_{m(\lambda_1)}^{(1)}\}$. Then m(λ1 ) = dim(Eλ1 ), and similarly for all Bi .


Consider the following general linear combination of all elements in B, i.e. the equation
$$\sum_{i=1}^r \sum_{j=1}^{m(\lambda_i)} \alpha_{ij}\, x_j^{(i)} = 0.$$
The first sum is summing over all eigenspaces, and the second sum sums over
the basis vectors in Bi . Now apply the matrix
$$\prod_{k=1,2,\cdots,\bar{K},\cdots,r} (A - \lambda_k I)$$
to the above sum, for some arbitrary K (the bar indicates that the factor k = K is omitted). We obtain
$$\sum_{j=1}^{m(\lambda_K)} \alpha_{Kj} \left( \prod_{k=1,2,\cdots,\bar{K},\cdots,r} (\lambda_K - \lambda_k) \right) x_j^{(K)} = 0.$$
Since the $x_j^{(K)}$ are linearly independent (BK is a basis), αKj = 0 for all j. Since
K was arbitrary, all αij must be zero. So B is linearly independent.
Proposition. A is diagonalizable iff all its eigenvalues have zero defect.

5.6 Canonical (Jordan normal) form


Theorem. Any 2 × 2 complex matrix A is similar to exactly one of
$$\begin{pmatrix}\lambda_1 & 0 \\ 0 & \lambda_2\end{pmatrix},\quad \begin{pmatrix}\lambda & 0 \\ 0 & \lambda\end{pmatrix},\quad \begin{pmatrix}\lambda & 1 \\ 0 & \lambda\end{pmatrix}$$

Proof. For each case:

(i) If A has two distinct eigenvalues, then the eigenvectors are linearly independent.
Then we can use P formed from the eigenvectors as its columns, and P⁻¹AP is diagonal.

(ii) If λ1 = λ2 = λ and dim Eλ = 2, then write Eλ = span{u, v}, with
u, v linearly independent. Now use {u, v} as a new basis of C² and
$$\tilde{A} = P^{-1}AP = \begin{pmatrix}\lambda & 0 \\ 0 & \lambda\end{pmatrix} = \lambda I$$
Note that since P⁻¹AP = λI, we have A = P (λI)P⁻¹ = λI. So A is
isotropic, i.e. the same with respect to any basis.

(iii) If λ1 = λ2 = λ and dim(Eλ ) = 1, then Eλ = span{v}. Now choose a basis
of C² as {v, w}, where w ∈ C² \ Eλ .
We know that Aw ∈ C². So Aw = αv + βw. Hence, if we change basis to
{v, w}, then
$$\tilde{A} = P^{-1}AP = \begin{pmatrix}\lambda & \alpha \\ 0 & \beta\end{pmatrix}.$$
However, A and Ã both have eigenvalue λ with algebraic multiplicity 2.
So we must have β = λ. To make α = 1, let u = (Ã − λI)w. We know
u ≠ 0 since w is not in the eigenspace. Then
$$(\tilde{A} - \lambda I)u = (\tilde{A} - \lambda I)^2 w = \begin{pmatrix}0 & \alpha \\ 0 & 0\end{pmatrix}\begin{pmatrix}0 & \alpha \\ 0 & 0\end{pmatrix} w = 0.$$
So u is an eigenvector of Ã with eigenvalue λ.
We have u = Ãw − λw. So Ãw = u + λw.
Change basis to {u, w}. Then A with respect to this basis is $\begin{pmatrix}\lambda & 1 \\ 0 & \lambda\end{pmatrix}$.
This is a two-stage process: P sends the basis to {v, w} and then a matrix Q
sends it to the basis {u, w}. So the similarity transformation is Q⁻¹(P⁻¹AP )Q =
(P Q)⁻¹A(P Q).
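As a concrete instance of case (iii), consider $A = \begin{pmatrix}3 & 1\\ -1 & 1\end{pmatrix}$. Its characteristic polynomial is (3 − λ)(1 − λ) + 1 = (λ − 2)², so λ = 2 has algebraic multiplicity 2, but A − 2I = $\begin{pmatrix}1 & 1\\ -1 & -1\end{pmatrix}$ has rank 1, so dim Eλ = 1. Hence A cannot be diagonalized, and by the theorem it is similar to $\begin{pmatrix}2 & 1\\ 0 & 2\end{pmatrix}$.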
Proposition. (Without proof) The canonical form, or Jordan normal form,
exists for any n × n matrix A. Specifically, there exists a similarity transformation
such that A is similar to a matrix Ã that satisfies the following properties:
(i) Ãαα = λα , i.e. the diagonal consists of the eigenvalues.
(ii) Ãα,α+1 = 0 or 1.

(iii) Ãij = 0 otherwise.

5.7 Cayley-Hamilton Theorem


Theorem (Cayley-Hamilton theorem). Every n × n complex matrix satisfies
its own characteristic equation.
Proof. We will only prove for diagonalizable matrices here. So suppose for our
matrix A, there is some P such that D = diag(λ1 , λ2 , · · · , λn ) = P −1 AP . Note
that
Di = (P −1 AP )(P −1 AP ) · · · (P −1 AP ) = P −1 Ai P.
Hence
pD (D) = pD (P −1 AP ) = P −1 [pD (A)]P.
Since similar matrices have the same characteristic polynomial, pD = pA , so

pA (D) = P⁻¹[pA (A)]P.

However, we also know that Di = diag(λi1 , λi2 , · · · λin ). So

pA (D) = diag(pA (λ1 ), pA (λ2 ), · · · , pA (λn )) = diag(0, 0, · · · , 0)

since the eigenvalues are roots of pA (λ) = 0. So 0 = pA (D) = P −1 pA (A)P and


thus pA (A) = 0.
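As a direct check, take $A = \begin{pmatrix}1 & 2\\ 3 & 4\end{pmatrix}$, whose characteristic polynomial is pA (λ) = λ² − 5λ − 2. Then $A^2 = \begin{pmatrix}7 & 10\\ 15 & 22\end{pmatrix}$, and indeed A² − 5A − 2I = 0, as the theorem asserts (here A happens to have distinct eigenvalues, so the diagonalizable case proved above applies directly).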

5.8 Eigenvalues and eigenvectors of a Hermitian matrix


5.8.1 Eigenvalues and eigenvectors
Theorem. The eigenvalues of a Hermitian matrix H are real.
Proof. Suppose that H has eigenvalue λ with eigenvector v ≠ 0. Then

Hv = λv.

We pre-multiply by v† , a 1 × n row vector, to obtain

v† Hv = λv† v (∗)


We take the Hermitian conjugate of both sides. The left hand side is

(v† Hv)† = v† H † v = v† Hv

since H is Hermitian. The right hand side is

(λv† v)† = λ∗ v† v

So we have
v† Hv = λ∗ v† v.
From (∗), we know that λv† v = λ∗ v† v. Since v ≠ 0, we know that v† v =
v · v ≠ 0. So λ = λ∗ and λ is real.
Theorem. The eigenvectors of a Hermitian matrix H corresponding to distinct
eigenvalues are orthogonal.
Proof. Let

Hvi = λi vi (i)
Hvj = λj vj . (ii)

Pre-multiply (i) by vj† to obtain

vj† Hvi = λi vj† vi . (iii)

Pre-multiply (ii) by vi† and take the Hermitian conjugate to obtain

vj† Hvi = λj vj† vi . (iv)

Equating (iii) and (iv) yields

λi vj† vi = λj vj† vi .

Since λi ≠ λj , we must have vj† vi = 0. So their inner product is zero and they are
orthogonal.
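For instance, $H = \begin{pmatrix}2 & i\\ -i & 2\end{pmatrix}$ is Hermitian, and det(H − λI) = (2 − λ)² − 1 gives the real eigenvalues λ = 1 and λ = 3. Corresponding eigenvectors are v1 = (−i, 1)ᵀ and v2 = (i, 1)ᵀ, and v1†v2 = (i)(i) + (1)(1) = 0, so they are orthogonal, in line with both theorems above.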

5.8.2 Gram-Schmidt orthogonalization (non-examinable)


5.8.3 Unitary transformation
5.8.4 Diagonalization of n × n Hermitian matrices
Theorem. An n × n Hermitian matrix has precisely n orthogonal eigenvectors.
Proof. (Non-examinable) Let λ1 , λ2 , · · · , λr be the distinct eigenvalues of H (r ≤
n), with a set of corresponding orthonormal eigenvectors B = {v1 , v2 , · · · , vr }.
Extend to a basis of the whole of Cn

B 0 = {v1 , v2 , · · · , vr , w1 , w2 , · · · , wn−r }

Now use Gram-Schmidt to create an orthonormal basis

B̃ = {v1 , v2 , · · · , vr , u1 , u2 , · · · , un−r }.


Now write
$$P = \begin{pmatrix} \uparrow & \uparrow & & \uparrow & \uparrow & & \uparrow \\ v_1 & v_2 & \cdots & v_r & u_1 & \cdots & u_{n-r} \\ \downarrow & \downarrow & & \downarrow & \downarrow & & \downarrow \end{pmatrix}$$
We have shown above that this is a unitary matrix, i.e. P −1 = P † . So if we
change basis, we have

P −1 HP = P † HP
 
λ1 0 ··· 0 0 0 ··· 0
 0 λ2 ··· 0 0 0 ··· 0 
 .. .. .. .. ..
 
.. .. 
.
 . . . . . . 0 

0 0 ··· λr 0 0 ··· 0 
= 
0
 0 ··· 0 c11 c12 ··· c1,n−r 

0
 0 ··· 0 c21 c22 ··· c2,n−r 

. .. .. .. .. .. .. ..
 ..

. . . . . . . 
0 0 ··· 0 cn−r,1 cn−r,2 ··· cn−r,n−r

Here C is an (n − r) × (n − r) Hermitian matrix. The eigenvalues of C are also


eigenvalues of H because det(H − λI) = det(P † HP − λI) = (λ1 − λ) · · · (λr −
λ) det(C − λI). So the eigenvalues of C are the eigenvalues of H.
We can keep repeating the process on C until we finish all rows. For example,
if the eigenvalues of C are all distinct, there are n − r orthonormal eigenvectors
wj (for j = r + 1, · · · , n) of C. Let
$$Q = \begin{pmatrix} I_r & 0 \\ 0 & W \end{pmatrix}, \qquad W = \begin{pmatrix} \uparrow & \uparrow & & \uparrow \\ w_{r+1} & w_{r+2} & \cdots & w_n \\ \downarrow & \downarrow & & \downarrow \end{pmatrix},$$
i.e. Q has an r × r identity matrix block in the top left corner, an (n − r) × (n − r)
block W whose columns are the wj in the bottom right corner, and all other entries 0.
Since the columns of Q are orthonormal, Q is unitary. So Q† P † HP Q =
diag(λ1 , λ2 , · · · , λr , λr+1 , · · · , λn ), where the first r λs are distinct and the re-
maining ones are copies of previous ones.
The n linearly-independent eigenvectors are the columns of P Q.

5.8.5 Normal matrices


Proposition. Let N be a normal matrix. Then:
(i) If λ is an eigenvalue of N , then λ∗ is an eigenvalue of N † .
(ii) The eigenvectors of distinct eigenvalues are orthogonal.
(iii) A normal matrix can always be diagonalized with an orthonormal basis of
eigenvectors.


6 Quadratic forms and conics


Theorem. Hermitian forms are real.
Proof. (x† Hx)∗ = (x† Hx)† = x† H † x = x† Hx. So (x† Hx)∗ = x† Hx and it is
real.

6.1 Quadrics and conics


6.1.1 Quadrics
6.1.2 Conic sections (n = 2)

6.2 Focus-directrix property


7 Transformation groups
7.1 Groups of orthogonal matrices
Proposition. The set of all n × n orthogonal matrices P forms a group under
matrix multiplication.
Proof.
0. If P, Q are orthogonal, then consider R = P Q. RRT = (P Q)(P Q)T =
P (QQT )P T = P P T = I. So R is orthogonal.
1. I satisfies II T = I. So I is orthogonal and is the identity of the group.
2. Inverse: if P is orthogonal, then P −1 = P T by definition, which is also
orthogonal.
3. Matrix multiplication is associative since function composition is associative.

7.2 Length preserving matrices


Theorem. Let P ∈ O(n). Then the following are equivalent:
(i) P is orthogonal
(ii) |P x| = |x| for all x
(iii) (P x)T (P y) = xT y for all x, y, i.e. (P x) · (P y) = x · y.
(iv) If (v1 , v2 , · · · , vn ) are orthonormal, so are (P v1 , P v2 , · · · , P vn )
(v) The columns of P are orthonormal.
Proof. We do them one by one:
(i) ⇒ (ii): |P x|2 = (P x)T (P x) = xT P T P x = xT x = |x|2
(ii) ⇒ (iii): |P (x + y)|2 = |x + y|2 . The right hand side is

(xT + yT )(x + y) = xT x + y T y + yT x + xT y = |x|2 + |y|2 + 2xT y.

Similarly, the left hand side is

|P x + P y|2 = |P x|2 + |P y|2 + 2(P x)T P y = |x|2 + |y|2 + 2(P x)T P y.

So (P x)T P y = xT y.
(iii) ⇒ (iv): (P vi )T P vj = viT vj = δij . So P vi ’s are also orthonormal.
(iv) ⇒ (v): Take the vi ’s to be the standard basis. So the columns of P , being
P ei , are orthonormal.
(v) ⇒ (i): The columns of P are orthonormal. Then (P P T )ij = Pik Pjk =
(Pi ) · (Pj ) = δij , viewing Pi as the ith column of P . So P P T = I.

7.3 Lorentz transformations

