Maxtrix Algebra For Engineers
Maxtrix Algebra For Engineers
Maxtrix Algebra For Engineers
Jeffrey R. Chasnov
Copyright ○
c 2018, 2019 by Jeffrey Robert Chasnov
This work is licensed under the Creative Commons Attribution 3.0 Hong Kong License. To view a copy of this
license, visit http://creativecommons.org/licenses/by/3.0/hk/ or send a letter to Creative Commons, 171 Second
Street, Suite 300, San Francisco, California, 94105, USA.
Preface
View an Introduction on YouTube
These are my lecture notes for my online Coursera course, Matrix Algebra for Engineers. I have
divided these notes into chapters called Lectures, with each Lecture corresponding to a video on
Coursera. I have also uploaded all my Coursera videos to YouTube, and links are placed at the top of
each Lecture.
There are problems at the end of each lecture chapter and I have tried to choose problems that
exemplify the main idea of the lecture. Students taking a formal university course in matrix or linear
algebra will usually be assigned many more additional problems, but here I follow the philosophy
that less is more. I give enough problems for students to solidify their understanding of the material,
but not too many problems that students feel overwhelmed and drop out. I do encourage students to
attempt the given problems, but if they get stuck, full solutions can be found in the Appendix.
There are also additional problems at the end of coherent sections that are given as practice quizzes
on the Coursera platform. Again, students should attempt these quizzes on the platform, but if a
student has trouble obtaining a correct answer, full solutions are also found in the Appendix.
The mathematics in this matrix algebra course is at the level of an advanced high school student, but
typically students would take this course after completing a university-level single variable calculus
course. There are no derivatives and integrals in this course, but student’s are expected to have a
certain level of mathematical maturity. Nevertheless, anyone who wants to learn the basics of matrix
algebra is welcome to join.
Jeffrey R. Chasnov
Hong Kong
July 2018
iii
Contents
I Matrices 1
1 Definition of a matrix 5
3 Special matrices 9
4 Transpose matrix 13
6 Inverse matrix 17
7 Orthogonal matrices 21
8 Rotation matrices 23
9 Permutation matrices 25
10 Gaussian elimination 33
12 Computing inverses 39
13 Elementary matrices 43
14 LU decomposition 45
v
vi CONTENTS
15 Solving (LU)x = b 47
16 Vector spaces 57
17 Linear independence 59
19 Gram-Schmidt process 65
21 Null space 71
23 Column space 77
25 Orthogonal projections 83
29 Laplace expansion 99
Matrices
1
3
In this week’s lectures, we learn about matrices. Matrices are rectangular arrays of numbers or
other mathematical objects and are fundamental to engineering mathematics. We will define matrices
and how to add and multiply them, discuss some special matrices such as the identity and zero matrix,
learn about transposes and inverses, and define orthogonal and permutation matrices.
4
Lecture 1
Definition of a matrix
View this lecture on YouTube
An m-by-n matrix is a rectangular array of numbers (or other mathematical objects) with m rows
and n columns. For example, a two-by-two matrix A, with two rows and two columns, looks like
!
a b
A= .
c d
The first row has elements a and b, the second row has elements c and d. The first column has elements
a and c; the second column has elements b and d. As further examples, two-by-three and three-by-two
matrices look like
! a d
a b c
B= , C = b e .
d e f
c f
Of special importance are column matrices and row matrices. These matrices are also called vectors.
The column vector is in general n-by-one and the row vector is one-by-n. For example, when n = 3,
we would write a column vector as
a
x = b ,
Here, the matrix element of A in the ith row and the jth column is denoted as aij .
5
6 LECTURE 1. DEFINITION OF A MATRIX
a) Write down the three-by-three matrix with ones on the diagonal and zeros elsewhere.
b) Write down the three-by-four matrix with ones on the diagonal and zeros elsewhere.
c) Write down the four-by-three matrix with ones on the diagonal and zeros elsewhere.
Matrices can be added only if they have the same dimension. Addition proceeds element by element.
For example, ! ! !
a b e f a+e b+ f
+ = .
c d g h c+g d+h
Matrices can also be multiplied by a scalar. The rule is to just multiply every element of the matrix.
For example, ! !
a b ka kb
k = .
c d kc kd
Matrices (other than the scalar) can be multiplied only if the number of columns of the left matrix
equals the number of rows of the right matrix. In other words, an m-by-n matrix on the left can only
be multiplied by an n-by-k matrix on the right. The resulting matrix will be m-by-k. Evidently, matrix
multiplication is generally not commutative. We illustrate multiplication using two 2-by-2 matrices:
! ! ! ! ! !
a b e f ae + bg a f + bh e f a b ae + c f be + d f
= , = .
c d g h ce + dg c f + dh g h c d ag + ch bg + dh
First, the first row of the left matrix is multiplied against and summed with the first column of the right
matrix to obtain the element in the first row and first column of the product matrix. Second, the first
row is multiplied against and summed with the second column. Third, the second row is multiplied
against and summed with the first column. And fourth, the second row is multiplied against and
summed with the second column.
In general, an element in the resulting product matrix, say in row i and column j, is obtained by
multiplying and summing the elements in row i of the left matrix with the elements in column j of
the right matrix. We can formally write matrix multiplication in terms of the matrix elements. Let A
be an m-by-n matrix with matrix elements aij and let B be an n-by-p matrix with matrix elements bij .
Then C = AB is an m-by-p matrix, and its ij matrix element can be written as
n
cij = ∑ aik bkj .
k =1
Notice that the second index of a and the first index of b are summed over.
7
8 LECTURE 2. ADDITION AND MULTIPLICATION OF MATRICES
1 3 4 0 0 4
4. Prove the associative law for matrix multiplication. That is, let A be an m-by-n matrix, B an n-by-p
matrix, and C a p-by-q matrix. Then prove that A(BC) = (AB)C.
Special matrices
View this lecture on YouTube
The zero matrix, denoted by 0, can be any size and is a matrix consisting of all zero elements. Multi-
plication by a zero matrix results in a zero matrix. The identity matrix, denoted by I, is a square matrix
(number of rows equals number of columns) with ones down the main diagonal. If A and I are the
same sized square matrices, then
AI = IA = A,
and multiplication by the identity matrix leaves the matrix unchanged. The zero and identity matrices
play the role of the numbers zero and one in matrix multiplication. For example, the two-by-two zero
and identity matrices are given by
! !
0 0 1 0
0= , I= .
0 0 0 1
A diagonal matrix has its only nonzero elements on the diagonal. For example, a two-by-two diagonal
matrix is given by !
d1 0
D= .
0 d2
Usually, diagonal matrices refer to square matrices, but they can also be rectangular.
A band (or banded) matrix has nonzero elements only on diagonal bands. For example, a three-by-
three band matrix with nonzero diagonals one above and one below a nonzero main diagonal (called
a tridiagonal matrix) is given by
d1 a1 0
B = b1 d2 a2 .
0 b2 d3
An upper or lower triangular matrix is a square matrix that has zero elements below or above the
diagonal. For example, three-by-three upper and lower triangular matrices are given by
a b c a 0 0
U = 0 d e , L = b d 0 .
0 0 f c e f
9
10 LECTURE 3. SPECIAL MATRICES
a) A and C only
b) A and D only
c) B and C only
d) B and D only
11
12 LECTURE 3. SPECIAL MATRICES
Lecture 4
Transpose matrix
View this lecture on YouTube
The transpose of a matrix A, denoted by AT and spoken as A-transpose, switches the rows and
columns of A. That is,
a11 a12 ··· a1n a11 a21 ··· am1
a21
a22 ··· a2n
a12
T
a22 ··· am2
if A =
.. .. .. ,
.. then A =
.. .. .. .
..
. . . . . . . .
am1 am2 ··· amn a1n a2n ··· amn
Evidently, if A is m-by-n then AT is n-by-m. As a simple example, view the following transpose pair:
T
a d !
a b c
b e = .
d e f
c f
A less obvious fact is that the transpose of the product of matrices is equal to the product of the
transposes with the order of multiplication reversed, i.e.,
(AB)T = BT AT .
If A is a square matrix, and AT = A, then we say that A is symmetric. If AT = −A, then we say that A
is skew symmetric. For example, 3-by-3 symmetric and skew symmetric matrices look like
a b c 0 b c
b d e , −b 0 e .
c e f −c −e 0
13
14 LECTURE 4. TRANSPOSE MATRIX
2. Show using the transpose operator that any square matrix A can be written as the sum of a sym-
metric and a skew-symmetric matrix.
The inner product (or dot product or scalar product) between two vectors is obtained from the ma-
trix product of a row vector times a column vector. A row vector can be obtained from a column
vector by the transpose operator. With the 3-by-1 column vectors u and v, their inner product is given
by
v1
uT v = u1 u2 u3 v2 = u1 v1 + u2 v2 + u3 v3 .
v3
If the inner product between two vectors is zero, we say that the vectors are orthogonal. The norm of a
vector is defined by
1/2 1/2
||u|| = uT u = u21 + u22 + u23 .
If the norm of a vector is equal to one, we say that the vector is normalized. If a set of vectors are
mutually orthogonal and normalized, we say that these vectors are orthonormal.
An outer product is also defined, and is used in some applications. The outer product between u
and v is given by
u1 u1 v1 u1 v2 u1 v3
uvT = u2 v1 v2 v3 = u2 v1 u2 v2 u2 v3 .
u3 u3 v1 u3 v2 u3 v3
Notice that every column is a multiple of the single vector u, and every row is a multiple of the single
vector vT .
15
16 LECTURE 5. INNER AND OUTER PRODUCTS
c f
square matrix and that the sum of its diagonal elements is the sum of the squares of all the elements
of A.
2. The trace of a square matrix B, denoted as Tr B, is the sum of the diagonal elements of B. Prove that
Tr(AT A) is the sum of the squares of all the elements of A.
Inverse matrix
View this lecture on YouTube
Square matrices may have inverses. When a matrix A has an inverse, we say it is invertible and
denote its inverse by A−1 . The inverse matrix satisfies
AA−1 = A−1 A = I.
If A and B are invertible matrices, then (AB)−1 = B−1 A−1 . Furthermore, if A is invertible then so is
AT , and (AT )−1 = (A−1 )T .
It is illuminating to derive the inverse of a general 2-by-2 matrix. Write
! ! !
a b x1 x2 1 0
= ,
c d y1 y2 0 1
and try to solve for x1 , y1 , x2 and y2 in terms of a, b, c, and d. There are two inhomogeneous and two
homogeneous linear equations:
To solve, we can eliminate y1 and y2 using the two homogeneous equations, and find x1 and x2 using
the two inhomogeneous equations. The solution for the inverse matrix is found to be
! −1 !
a b 1 d −b
= .
c d ad − bc −c a
The term ad − bc is just the definition of the determinant of the two-by-two matrix:
!
a b
det = ad − bc.
c d
The determinant of a two-by-two matrix is the product of the diagonals minus the product of the
off-diagonals. Evidently, a two-by-two matrix A is invertible only if det A ̸= 0. Notice that the inverse
of a two-by-two matrix, in words, is found by switching the diagonal elements of the matrix, negating
the off-diagonal elements, and dividing by the determinant.
Later, we will show that an n-by-n matrix is invertible if and only if its determinant is nonzero.
This will require a more general definition of the determinant.
17
18 LECTURE 6. INVERSE MATRIX
2. Prove that if A and B are same-sized invertible matrices , then (AB)−1 = B−1 A−1 .
a) AT BT CT
b) AT CT BT
c) CT AT BT
d) CT BT AT
a) A + AT
b) AAT
c) A − AT
d) AT A
!
2 2
3. Which matrix is the inverse of ?
1 2
!
1 −2
2
a)
2 −1 2
!
1 −2 2
b)
2 1 −2
!
1 2 2
c)
2 −1 −2
!
1 −2 −2
d)
2 1 2
19
20 LECTURE 6. INVERSE MATRIX
Lecture 7
Orthogonal matrices
View this lecture on YouTube
Q−1 = QT
The norm of a vector is also called its length, so we can also say that orthogonal matrices preserve
lengths.
21
22 LECTURE 7. ORTHOGONAL MATRICES
Rotation matrices
View this lecture on YouTube
A matrix that rotates a vector in space doesn’t change the vector’s length and so should be an orthog-
y'
y
r
x' x
onal matrix. Consider the two-by-two rotation matrix that rotates a vector through an angle θ in the
x-y plane, shown above. Trigonometry and the addition formula for cosine and sine results in
x ′ = r cos (θ + ψ) y′ = r sin (θ + ψ)
= r (cos θ cos ψ − sin θ sin ψ) = r (sin θ cos ψ + cos θ sin ψ)
= x cos θ − y sin θ = x sin θ + y cos θ.
The above two-by-two matrix is a rotation matrix and we will denote it by Rθ . Observe that the rows
and columns of Rθ are orthonormal and that the inverse of Rθ is just its transpose. The inverse of Rθ
rotates a vector by −θ.
23
24 LECTURE 8. ROTATION MATRICES
2. Find the three-by-three matrix that rotates a three-dimensional vector an angle θ counterclockwise
around the z-axis.
Permutation matrices
View this lecture on YouTube
Another type of orthogonal matrix is a permutation matrix. An n-by-n permutation matrix, when
multiplying on the left permutes the rows of a matrix, and when multiplying on the right permutes
the columns. Clearly, permuting the rows of a column vector will not change its norm.
For example, let the string {1, 2} represent the order of the rows or columns of a two-by-two matrix.
Then the permutations of the rows or columns are given by {1, 2} and {2, 1}. The first permutation is
no permutation at all, and the corresponding permutation matrix is simply the identity matrix. The
second permutation of the rows or columns is achieved by
! ! ! ! ! !
0 1 a b c d a b 0 1 b a
= , = .
1 0 c d a b c d 1 0 d c
The rows or columns of a three-by-three matrix have 3! = 6 possible permutations, namely {1, 2, 3},
{1, 3, 2}, {2, 1, 3}, {2, 3, 1}, {3, 1, 2}, {3, 2, 1}. For example, the row or column permutation {3, 1, 2} is
obtained by
0 0 1 a b c g h i a b c 0 1 0 c a b
1 0 0 d e f = a b c , d e f 0 0 1 = f d e .
0 1 0 g h i d e f g h i 1 0 0 i g h
Notice that the permutation matrix is obtained by permuting the corresponding rows (or columns) of
the identity matrix. This is made evident by observing that
PA = (PI)A, AP = A(PI),
where P is a permutation matrix and PI is the identity matrix with permuted rows. The identity matrix
is orthogonal, and so is the matrix obtained by permuting its rows.
25
26 LECTURE 9. PERMUTATION MATRICES
2. Find the inverses of all the three-by-three permutation matrices. Explain why some matrices are
their own inverses, and others are not.
2. Which matrix rotates a three-by-one column vector an angle θ counterclockwise around the x-axis?
1 0 0
a) 0 cos θ − sin θ
0 sin θ cos θ
sin θ 0 cos θ
b) 0 1 0
cos θ 0 − sin θ
cos θ − sin θ 0
c) sin θ cos θ 0
0 0 1
cos θ sin θ 0
d) − sin θ cos θ 0
0 0 1
27
28 LECTURE 9. PERMUTATION MATRICES
3. Which matrix, when multiplying another matrix on the left, moves row one to row two, row two to
row three, and row three to row one?
0 1 0
a) 0 0 1
1 0 0
0 0 1
b) 1 0 0
0 1 0
0 0 1
c) 0 1 0
1 0 0
1 0 0
d) 0 0 1
0 1 0
29
31
In this week’s lectures, we learn about solving a system of linear equations. A system of linear
equations can be written in matrix form, and we can solve using Gaussian elimination. We will learn
how to bring a matrix to reduced row echelon form, and how this can be used to compute a matrix
inverse. We will also learn how to find the LU decomposition of a matrix, and how to use this
decomposition to efficiently solve a system of linear equations.
32
Lecture 10
Gaussian elimination
View this lecture on YouTube
3 −4 4 x3 −6
or symbolically as Ax = b.
The standard numerical algorithm used to solve a system of linear equations is called Gaussian
elimination. We first form what is called an augmented matrix by combining the matrix A with the
column vector b:
−3 2 −1 −1
6 −6 7 −7 .
3 −4 4 −6
Row reduction is then performed on this augmented matrix. Allowed operations are (1) interchange
the order of any rows, (2) multiply any row by a constant, (3) add a multiple of one row to another
row. These three operations do not change the solution of the original equations. The goal here is
to convert the matrix A into upper-triangular form, and then use this form to quickly solve for the
unknowns x.
We start with the first row of the matrix and work our way down as follows. First we multiply the
first row by 2 and add it to the second row. Then we add the first row to the third row, to obtain
−3 2 −1 −1
0 −2 5 −9 .
0 −2 3 −7
33
34 LECTURE 10. GAUSSIAN ELIMINATION
We then go to the second row. We multiply this row by −1 and add it to the third row to obtain
−3 2 −1 −1
0 −2 5 −9 .
0 0 −2 2
The original matrix A has been converted to an upper triangular matrix, and the transformed equations
can be determined from the augmented matrix as
These equations can be solved by back substitution, starting from the last equation and working
backwards. We have
x3 = −1,
1
x2 = − (−9 − 5x3 ) = 2,
2
1
x1 = − (−1 + x3 − 2x2 ) = 2.
3
x3 −1
When performing Gaussian elimination, the diagonal element that is used during the elimination
procedure is called the pivot. To obtain the correct multiple, one uses the pivot as the divisor to the
matrix elements below the pivot. Gaussian elimination in the way done here will fail if the pivot is
zero. If the pivot is zero, a row interchange must first be performed.
Even if no pivots are identically zero, small values can still result in an unstable numerical compu-
tation. For very large matrices solved by a computer, the solution vector will be inaccurate unless row
interchanges are made. The resulting numerical technique is called Gaussian elimination with partial
pivoting, and is usually taught in a standard numerical analysis course.
35
(a)
(b)
x1 − 2x2 + 3x3 = 1,
− x1 + 3x2 − x3 = −1,
2x1 − 5x2 + 5x3 = 1.
If we continue the row elimination procedure so that all the pivots are one, and all the entries above
and below the pivots are eliminated, then we say that the resulting matrix is in reduced row echelon
form. We notate the reduced row echelon form of a matrix A as rref(A). For example, consider the
three-by-four matrix
1 2 3 4
A = 4 5 6 7 .
6 7 8 9
6 7 8 9 0 −5 −10 −15 0 1 2 3 0 0 0 0
0 0 0 0
We say that the matrix A has two pivot columns, that is two columns that contain a pivot position with
a one in the reduced row echelon form. Note that rows may need to be exchanged when computing
the reduced row echelon form.
37
38 LECTURE 11. REDUCED ROW ECHELON FORM
(a)
3−7 −2 −7
A = −3 5 1 5
6 −4 0 2
(b)
1 2 1
A = 2 4 1
3 6 2
Computing inverses
View this lecture on YouTube
By bringing an invertible matrix to reduced row echelon form, that is, to the identity matrix, we
can compute the matrix inverse. Given a matrix A, consider the equation
AA−1 = I,
for the unknown inverse A−1 . Let the columns of A−1 be given by the vectors a1−1 , a2−1 , and so on.
The matrix A multiplying the first column of A−1 is the equation
T
Aa1−1 = e1 , with e1 = 1 0 ... 0 ,
Aai−1 = ei ,
for i = 1, 2, . . . , n. The method then is to do row reduction on an augmented matrix which attaches
the identity matrix to A. To find A−1 , elimination is continued until one obtains rref(A) = I.
We illustrate below:
−3 2−1 1 0 0 −3 2 −1 1 0 0
6 −6 7 0 1 0 → 0 −2 5 2 1 0 →
3 −4 4 0 0 1 0 −2 3 1 0 1
−3 2−1 1 0 0 −3 0 4 3 1 0
0 −2 5 2 1 0 → 0 −2 5 2 1 0 →
0 0 −2 −1 −1 1 0 0 −2 −1 −1 1
−3 0 0 1 −1 2 1 0 0 −1/3 1/3 −2/3
0 −2 0 −1/2 −3/2 5/2 → 0 1 0 1/4 3/4 −5/4 ;
39
40 LECTURE 12. COMPUTING INVERSES
6 −4 0
4 −7 1 −2
1 −2 1 0
a) 0 1 −1 1
0 0 −2 −3
1 −2 1 0
b) 0 1 −1 1
0 0 −2 3
1 −2 1 0
c) 0 1 −1 1
0 0 −3 −2
1 −2 1 0
d) 0 1 −1 1
0 0 −3 2
0 0 1 2
1 2 0 0
b) 0 0 1 0
0 0 0 1
1 0 1 0
c) 0 1 0 0
0 0 1 1
1 0 0 0
d) 0 1 2 0
0 0 0 1
41
42 LECTURE 12. COMPUTING INVERSES
3 −7 −2
3. The inverse of −3 5 1 is
6 −4 0
4/3 2/3 1/2
a) 2 1 1/2
−3 −5 −1
2/3 1/2 4/3
b) 1 1/2 2
−3 −5 −1
2/3 4/3 1/2
c) 1 2 1/2
−5 −3 −1
2/3 4/3 1/2
d) 1 2 1/2
−3 −5 −1
Elementary matrices
View this lecture on YouTube
The row reduction algorithm of Gaussian elimination can be implemented by multiplying elemen-
tary matrices. Here, we show how to construct these elementary matrices, which differ from the
identity matrix by a single elementary row operation. Consider the first row reduction step for the
following matrix A:
−3 2 −1 −3 2 −1 1 0 0
A = 6 −6 7 → 0 −2 5 = M1 A, where M1 = 2 1 0 .
3 −4 4 3 −4 4 0 0 1
To construct the elementary matrix M1 , the number two is placed in column-one, row-two. This matrix
multiplies the first row by two and adds the result to the second row.
The next step in row elimination is
−3 2 −1 −3 2 −1 1 0 0
0 −2 5 → 0 −2 5 = M2 M1 A, where M2 = 0 1 0 .
3 −4 4 0 −2 3 1 0 1
Here, to construct M2 the number one is placed in column-one, row-three, and the matrix multiplies
the first row by one and adds the result to the third row.
The last step in row elimination is
−3 2 −1 −3 2 −1 1 0 0
0 −2 5 → 0 −2 5 = M3 M2 M1 A, where M3 = 0 1 0 .
0 −2 3 0 0 −2 0 −1 1
Here, to construct M3 the number negative-one is placed in column-two, row-three, and this matrix
multiplies the second row by negative-one and adds the result to the third row.
We have thus found that
M3 M2 M1 A = U,
where U is an upper triangular matrix. This discussion will be continued in the next lecture.
43
44 LECTURE 13. ELEMENTARY MATRICES
LU decomposition
View this lecture on YouTube
In the last lecture, we have found that row reduction of a matrix A can be written as
M3 M2 M1 A = U,
Now, the matrix M1 multiples the first row by two and adds it to the second row. To invert this
operation, we simply need to multiply the first row by negative-two and add it to the second row, so
that
1 0 0 1 0 0
M1 = 2 1 0 , M1−1 = −2 1 0 .
0 0 1 0 0 1
Similarly,
1 0 0 1 0 0 1 0 0 1 0 0
M2 = 0 1 0 , M2−1 = 0 1 0 ; M3 = 0 1 0 , M3−1 = 0 1 0 .
1 0 1 −1 0 1 0 −1 1 0 1 1
Therefore,
L = M1−1 M2−1 M3−1
is given by
1 0 0 1 0 0 1 0 0 1 0 0
L = −2 1 0 0 1 0 0 1 0 = −2 1 0 ,
0 0 1 −1 0 1 0 1 1 −1 1 1
which is lower triangular. Also, the non-diagonal elements of the elementary inverse matrices are
simply combined to form L. Our LU decomposition of A is therefore
−3 2 −1 1 0 0 −3 2 −1
6 −6 7 = −2 1 0 0 −2 5 .
3 −4 4 −1 1 1 0 0 −2
45
46 LECTURE 14. LU DECOMPOSITION
6 −4 0
Solving (LU)x = b
View this lecture on YouTube
The LU decomposition is useful when one needs to solve Ax = b for many right-hand-sides. With the
LU decomposition in hand, one writes
(LU)x = L(Ux) = b,
and lets y = Ux. Then we solve Ly = b for y by forward substitution, and Ux = y for x by backward
substitution. It is possible to show that for large matrices, solving (LU)x = b is substantially faster
than solving Ax = b directly.
−1 1 1 0 0 −2 −6
−1 1 1 y3 −6
y1 = −1,
y2 = −7 + 2y1 = −9,
y3 = −6 + y1 − y2 = 2.
0 0 −2 x3 2
47
48 LECTURE 15. SOLVING (LU)X = B
x3 = −1,
1
x2 = − (−9 − 5x3 ) = 2,
2
1
x1 = − (−1 − 2x2 + x3 ) = 2,
3
x3 −1
49
6 −4 0 2 −5 1 0 0 −1
2 1
51
52 LECTURE 15. SOLVING (LU)X = B
3 −7 −2
2. Which of the following is the LU decomposition of −3 5 1?
6 −4 0
1 0 0 3 −7 −2
a) −1 1 0 0 −2 −1
2 −5 1/2 0 0 −2
1 0 0 3 −7 −2
b) −1 1 0 0 −2 −1
2 −5 1 0 0 −1
1 0 0 3 −7 −2
c) −1 2 −1 0 −1 −1
2 −10 6 0 0 −1
1 0 0 3 −7 −2
d) −1 1 0 0 −2 −1
4 −5 1 −6 14 3
1 0 0 3 −7 −2 1
3. Suppose L = −1 1 0, U = 0 −2 −1, and b = −1. Solve LUx = b by letting
2 −5 1 0 0 −1 1
y = Ux. The solutions for y and x are
−1 1/6
a) y = 0, x = 1/2
1 −1
1 −1/6
b) y = 0, x = −1/2
−1 1
1 1/6
c) y = 0, x = −1/2
−1 1
−1 −1/6
d) y = 0, x = 1/2
1 1
Vector spaces
53
55
In this week’s lectures, we learn about vector spaces. A vector space consists of a set of vectors
and a set of scalars that is closed under vector addition and scalar multiplication and that satisfies
the usual rules of arithmetic. We will learn some of the vocabulary and phrases of linear algebra,
such as linear independence, span, basis and dimension. We will learn about the four fundamental
subspaces of a matrix, the Gram-Schmidt process, orthogonal projection, and the matrix formulation
of the least-squares problem of drawing a straight line to fit noisy data.
56
Lecture 16
Vector spaces
View this lecture on YouTube
A vector space consists of a set of vectors and a set of scalars. Although vectors can be quite gen-
eral, for the purpose of this course we will only consider vectors that are real column matrices. The
set of scalars can either be the real or complex numbers, and here we will only consider real numbers.
For the set of vectors and scalars to form a vector space, the set of vectors must be closed under
vector addition and scalar multiplication. That is, when you multiply any two vectors in the set by
real numbers and add them, the resulting vector must still be in the set.
As an example, consider the set of vectors consisting of all three-by-one column matrices, and let u
and v be two of these vectors. Let w = au + bv be the sum of these two vectors multiplied by the real
numbers a and b. If w is still a three-by-one matrix, that is, w is in the set of vectors consisting of all
three-by-one column matrices, then this set of vectors is closed under scalar multiplication and vector
addition, and is indeed a vector space. The proof is rather simple. If we let
u1 v1
u = u2 , v = v2 ,
u3 v3
then
au1 + bv1
w = au + bv = au2 + bv2
au3 + bv3
is evidently a three-by-one matrix, so that the set of all three-by-one matrices (together with the set of
real numbers) is a vector space. This space is usually called R3 .
Our main interest in vector spaces is to determine the vector spaces associated with matrices. There
are four fundamental vector spaces of an m-by-n matrix A. They are called the null space, the column
space, the row space, and the left null space. We will meet these vector spaces in later lectures.
57
58 LECTURE 16. VECTOR SPACES
2. Explain why the following sets of three-by-one matrices (with real number scalars) are vector spaces:
(a) The set of three-by-one matrices with zero in the first row;
(b) The set of three-by-one matrices with first row equal to the second row;
(c) The set of three-by-one matrices with first row a constant multiple of the third row.
Linear independence
View this lecture on YouTube
The set of vectors, {u1 , u2 , . . . , un }, are linearly independent if for any scalars c1 , c2 , . . . , cn , the equation
c1 u1 + c2 u2 + · · · + cn un = 0
has only the solution c1 = c2 = · · · = cn = 0. What this means is that one is unable to write any of
the vectors u1 , u2 , . . . , un as a linear combination of any of the other vectors. For instance, if there was
a solution to the above equation with c1 ̸= 0, then we could solve that equation for u1 in terms of the
other vectors with nonzero coefficients.
As an example consider whether the following three three-by-one column vectors are linearly
independent:
1 0 2
u = 0 , v = 1 , w = 3 .
0 0 0
Indeed, they are not linearly independent, that is, they are linearly dependent, because w can be written
in terms of u and v. In fact, w = 2u + 3v.
Now consider the three three-by-one column vectors given by
1 0 0
u = 0 , v = 1 , w = 0 .
0 0 1
These three vectors are linearly independent because you cannot write any one of these vectors as a
linear combination of the other two. If we go back to our definition of linear independence, we can
see that the equation
a 0
au + bv + cw = b = 0
b 0
59
60 LECTURE 17. LINEAR INDEPENDENCE
−1
1 1
(b) 1 , −1 , 1
1 1 −1
0
1 1
(c) 1 , 0 , 1
0 1 1
Given a set of vectors, one can generate a vector space by forming all linear combinations of that
set of vectors. The span of the set of vectors {v1 , v2 , . . . , vn } is the vector space consisting of all linear
combinations of v1 , v2 , . . . , vn . We say that a set of vectors spans a vector space.
For example, the set of vectors given by
1
0 2
0 , 1 , 3
0 0 0
spans the vector space of all three-by-one matrices with zero in the third row. This vector space is a
vector subspace of all three-by-one matrices.
One doesn’t need all three of these vectors to span this vector subspace because any one of these
vectors is linearly dependent on the other two. The smallest set of vectors needed to span a vector
space forms a basis for that vector space. Here, given the set of vectors above, we can construct a basis
for the vector subspace of all three-by-one matrices with zero in the third row by simply choosing two
out of three vectors from the above spanning set. Three possible bases are given by
1
0
1
2
0
2
0 , 1 , 0 , 3 , 1 , 3 .
0 0 0 0 0 0
Although all three combinations form a basis for the vector subspace, the first combination is usually
preferred because this is an orthonormal basis. The vectors in this basis are mutually orthogonal and
of unit norm.
The number of vectors in a basis gives the dimension of the vector space. Here, the dimension of
the vector space of all three-by-one matrices with zero in the third row is two.
61
62 LECTURE 18. SPAN, BASIS AND DIMENSION
b) The set of three-by-one matrices with the sum of all the rows equal to one.
c) The set of three-by-one matrices with the first row equal to the third row.
d) The set of three-by-one matrices with the first row equal to the sum of the second and third rows.
2
1 4
b) 1 , −1 , 6
1 2 −2
1 0 1
c) 0 , 1 , −1
−1 −1 0
3
3 2
d) 2 , 1 , 1
1 2 0
63
64 LECTURE 18. SPAN, BASIS AND DIMENSION
3. Which of the following is an orthonormal basis for the vector space of all three-by-one matrices
with the sum of all rows equal to zero?
1 1 −1
1
a) √ −1 , √ 1
2
2
0 0
1 1 1
1
b) √ −1 , √ 1
2
6
0 −2
1 1 1 0
1 1
c) √ −1 , √ 0 , √ 1
2
2 2
0 −1 −1
1 2 −1 −1
1 1
d) √ −1 , √ 2 , √ −1
6
6 6
−1 −1 2
Gram-Schmidt process
View this lecture on YouTube
Given any basis for a vector space, we can use an algorithm called the Gram-Schmidt process to
construct an orthonormal basis for that space. Let the vectors v1 , v2 , . . . , vn be a basis for some n-
dimensional vector space. We will assume here that these vectors are column matrices, but this process
also applies more generally.
We will construct an orthogonal basis u1 , u2 , . . . , un , and then normalize each vector to obtain an
orthonormal basis. First, define u1 = v1 . To find the next orthogonal basis vector, define
(uT1 v2 )u1
u2 = v2 − .
uT1 u1
Observe that u2 is equal to v2 minus the component of v2 that is parallel to u1 . By multiplying both
sides of this equation with uT1 , it is easy to see that uT1 u2 = 0 so that these two vectors are orthogonal.
The next orthogonal vector in the new basis can be found from
Here, u3 is equal to v3 minus the components of v3 that are parallel to u1 and u2 . We can continue in
this fashion to construct n orthogonal basis vectors. These vectors can then be normalized via
u1
b1 =
u T
, etc.
(u1 u1 )1/2
Since uk is a linear combination of v1 , v2 , . . . , vk , the vector subspace spanned by the first k basis
vectors of the original vector space is the same as the subspace spanned by the first k orthonormal
vectors generated through the Gram-Schmidt process. We can write this result as
span{u1 , u2 , . . . , uk } = span{v1 , v2 , . . . , vk }.
65
66 LECTURE 19. GRAM-SCHMIDT PROCESS
and construct an orthonormal basis for this subspace. Let u1 = v1 . Then u2 is found from
(uT1 v2 )u1
u2 = v2 −
uT1 u1
0 1 −2
2 1
= 1 − 1 = 1 .
3 3
1 1 1
Notice that the initial two vectors v1 and v2 span the vector subspace of three-by-one column matri-
ces for which the second and third rows are equal. Clearly, the orthonormal basis vectors constructed
from the Gram-Schmidt process span the same subspace.
67
68 LECTURE 20. GRAM-SCHMIDT PROCESS EXAMPLE
Use the Gram-Schmidt process to construct an orthonormal basis for this subspace.
2. Consider a subspace of all four-by-one column vectors with the following basis:
1
0 0
1 1 0
W = , , .
1 1 1
1
1 1
Use the Gram-Schmidt process to construct an orthonormal basis for this subspace.
a) v1
b) v2
c) v3
d) v4
( ! !)
1 1
2. The Gram-Schmidt process applied to {v1 , v2 } = , results in
1 −1
( ! !)
1 1 1 1
a) {u b2 } = √
b1 , u ,√
2 1 2 −1
( ! !)
1 1 0
b) {u b2 } = √
b1 , u ,
2 1 0
( ! !)
1 0
c) {u b2 } =
b1 , u ,
0 1
( ! !)
1 1 1 2
d) {u b2 } =
b1 , u √ ,√
3 2 3 −1
69
70 LECTURE 20. GRAM-SCHMIDT PROCESS EXAMPLE
1 0
3. The Gram-Schmidt process applied to {v1 , v2 } = 1 , 1 results in
−1 −1
1 1 0
1
a) {u b 2 } = √ 1 , √ 1
b1 , u
3
2
−1 1
1 1 −2
1
b) {u b 2 } = √ 1 , √ 1
b1 , u
3
6
−1 −1
1 1 1
1
c) {u b 2 } = √ 1 , √ −1
b1 , u
3 2
−1 0
1 1 1
1
d) {u b2 } =
b1 , u √ 1 , √ 0
3
2
−1 1
Null space
Clearly, if x and y are in the null space of A, then so is ax + by so that the null space is closed under
vector addition and scalar multiplication. If the matrix A is m-by-n, then Null(A) is a vector subspace
of all n-by-one column matrices. If A is a square invertible matrix, then Null(A) consists of just the
zero vector.
To find a basis for the null space of a noninvertible matrix, we bring A to reduced row echelon
form. We demonstrate by example. Consider the three-by-five matrix given by
−3 6 −1 1 −7
A = 1 −2 2 3 −1 .
2 −4 5 8 −4
By judiciously permuting rows to simplify the arithmetic, one pathway to construct rref(A) is
−3 6 −1 1 −7 1 −2 2 3 −1 1 −2 2 3 −1
1 −2 2 3 −1 → −3 6 −1 1 −7 → 0 0 5 10 −10 →
2 −4 5 8 −4 2 −4 5 8 −4 0 0 1 2 −2
1 −2 2 3 −1 1 −2 0 −1 3
0 0 1 2 −2 → 0 0 1 2 −2 .
0 0 5 10 −10 0 0 0 0 0
We call the variables associated with the pivot columns, x1 and x3 , basic variables, and the variables
associated with the non-pivot columns, x2 , x4 and x5 , free variables. Writing the basic variables on the
left-hand side of the Ax = 0 equations, we have from the first and second rows
x1 = 2x2 + x4 − 3x5 ,
x3 = −2x4 + 2x5 .
71
72 LECTURE 21. NULL SPACE
Eliminating x1 and x3 , we can write the general solution for vectors in Null(A) as
2x2 + x4 − 3x5 2 1 −3
x2 1 0 0
−2x4 + 2x5 = x2 0 + x4 −2 + x5 2 ,
x4 0 1 0
x5 0 0 1
where the free variables x2 , x4 , and x5 can take any values. By writing the null space in this form, a
basis for Null(A) is made evident, and is given by
2 1 −3
1 0 0
0 , −2 , 2 .
0 1 0
0 0 1
The null space of A is seen to be a three-dimensional subspace of all five-by-one column matrices. In
general, the dimension of Null(A) is equal to the number of non-pivot columns of rref(A).
73
1 0 1 1
An underdetermined system of linear equations Ax = b with more unknowns than equations may
not have a unique solution. If u is the general form of a vector in the null space of A, and v is any
vector that satisfies Av = b, then x = u + v satisfies Ax = A(u + v) = Au + Av = 0 + b = b. The
general solution of Ax = b can therefore be written as the sum of a general vector in Null(A) and a
particular vector that satisfies the underdetermined system.
As an example, suppose we want to find the general solution to the linear system of two equations
and three unknowns given by
2x1 + 2x2 + x3 = 0,
2x1 − 2x2 − x3 = 1,
The null space is determined from x1 = 0 and x2 = − x3 /2, and we can write
0
Null(A) = span −1 .
2
A particular solution for the inhomogeneous system is found by solving x1 = 1/4 and x2 + x3 /2 =
−1/4. Here, we simply take the free variable x3 to be zero, and we find x1 = 1/4 and x2 = −1/4. The
general solution to the original underdetermined linear system is the sum of the null space and the
particular solution and is given by
x1 0 1
1
x2 = a −1 + −1 .
4
x3 2 0
75
76 LECTURE 22. APPLICATION OF THE NULL SPACE
Column space
View this lecture on YouTube
The column space of a matrix is the vector space spanned by the columns of the matrix. When a
matrix is multiplied by a column vector, the resulting vector is in the column space of the matrix, as
can be seen from ! ! ! ! !
a b x ax + by a b
= =x +y .
c d y cx + dy c d
In general, Ax is a linear combination of the columns of A. Given an m-by-n matrix A, what is the
dimension of the column space of A, and how do we find a basis? Note that since A has m rows, the
column space of A is a subspace of all m-by-one column matrices.
Fortunately, a basis for the column space of A can be found from rref(A). Consider the example
−3 6 −1 1 −7 −2 1 0 −1 3
A = 1 −2 2 3 −1 , rref(A) = 0 0 1 2 −2 .
2 −4 5 8 −4 0 0 0 0 0
The matrix equation Ax = 0 expresses the linear dependence of the columns of A, and row operations
on A do not change the dependence relations. For example, the second column of A above is −2 times
the first column, and after several row operations, the second column of rref(A) is still −2 times the
first column.
It should be self-evident that only the pivot columns of rref(A) are linearly independent, and the
dimension of the column space of A is therefore equal to its number of pivot columns; here it is two.
A basis for the column space is given by the first and third columns of A, (not rref(A)), and is
−3
−1
1 , 2 .
2 5
Recall that the dimension of the null space is the number of non-pivot columns—equal to the
number of free variables—so that the sum of the dimensions of the null space and the column space
is equal to the total number of columns. A statement of this theorem is as follows. Let A be an m-by-n
matrix. Then
dim(Col(A)) + dim(Null(A)) = n.
77
78 LECTURE 23. COLUMN SPACE
1 0 1 1
In addition to the column space and the null space, a matrix A has two more vector spaces asso-
ciated with it, namely the column space and null space of AT , which are called the row space and the
left null space.
If A is an m-by-n matrix, then the row space and the null space are subspaces of all n-by-one
column matrices, and the column space and the left null space are subspaces of all m-by-one column
matrices.
The null space consists of all vectors x such that Ax = 0, that is, the null space is the set of all
vectors that are orthogonal to the row space of A. We say that these two vector spaces are orthogonal.
A basis for the row space of a matrix can be found from computing rref(A), and is found to be
rows of rref(A) (written as column vectors) with pivot columns. The dimension of the row space of A
is therefore equal to the number of pivot columns, while the dimension of the null space of A is equal
to the number of nonpivot columns. The union of these two subspaces make up the vector space of all
n-by-one matrices and we say that these subspaces are orthogonal complements of each other.
Furthermore, the dimension of the column space of A is also equal to the number of pivot columns,
so that the dimensions of the column space and the row space of a matrix are equal. We have
dim(Col(A)) = dim(Row(A)).
We call this dimension the rank of the matrix A. This is an amazing result since the column space and
row space are subspaces of two different vector spaces. In general, we must have rank(A) ≤ min(m, n).
When the equality holds, we say that the matrix is of full rank. And when A is a square matrix and of
full rank, then the dimension of the null space is zero and A is invertible.
79
80 LECTURE 24. ROW SPACE, LEFT NULL SPACE AND RANK
Check to see that null space is the orthogonal complement of the row space, and the left null space is
the orthogonal complement of the column space. Find rank(A). Is this matrix of full rank?
3 6 1 1
−2
4
1 −2
a) ,
0 0
0 0
0
0
b)
0
0
0
0
c)
−3
2
−2
1
d)
0
0
81
82 LECTURE 24. ROW SPACE, LEFT NULL SPACE AND RANK
x1 + 2x2 + x4 = 1,
2x1 + 4x2 + x3 + x4 = 1,
3x1 + 6x2 + x3 + x4 = 1,
is
0 −2
0 1
a) 0 + 0
a
1 0
−2 0
1 0
b) 0 + 0
a
0 1
0 0
0 0
c) 0 + −3
a
1 2
0 0
0 0
d) −3 + 0
a
2 1
1 2 0 1
3. What is the rank of the matrix 2 4 1 1?
3 6 1 1
a) 1
b) 2
c) 3
d) 4
Orthogonal projections
View this lecture on YouTube
Suppose that V is an n-dimensional vector space and W is a p-dimensional subspace of V. For fa-
miliarity, we assume here that all vectors are column matrices of fixed size. Let v be a vector in V and
let {s1 , s2 , . . . , s p } be an orthonormal basis for W. In general, the orthogonal projection of a vector v in
V onto the subspace W is given by
v = vprojW + (v − vprojW ),
v = a1 s1 + a2 s2 + · · · + a p s p + b1 t1 + b2 t2 + bn− p tn− p .
Using the orthonormality of the basis vectors, the orthogonal projection of v onto W in then seen to be
vprojW = a1 s1 + a2 s2 + · · · + a p s p ,
w = c1 s1 + c2 s2 + · · · + c p s p .
The distance between v and w is given by the norm ||v − w||, and we have
or ||v − vprojW || ≤ ||v − w||, a result that will be used later in the problem of least squares.
83
84 LECTURE 25. ORTHOGONAL PROJECTIONS
0 0
Suppose there is some experimental data that you want to fit by a straight line. This is called a
linear regression problem and an illustrative example is shown below.
y
Linear regression
y1 = β 0 + β 1 x1 , y2 = β 0 + β 1 x2 , ..., yn = β 0 + β 1 xn .
These equations constitute a system of n equations in the two unknowns β 0 and β 1 . The corresponding
matrix equation is given by
1 x1 y1
!
1 x2 y2
β0
.
. ..
=
.. .
. . β1 .
1 xn yn
This is an overdetermined system of equations with no solution. The problem of least squares is to
find the best solution.
We can generalize this problem as follows. Suppose we are given a matrix equation, Ax = b, that
has no solution because b is not in the column space of A. So instead we solve Ax = bprojCol(A) , where
bprojCol(A) is the projection of b onto the column space of A. The solution is then called the least-squares
solution for x.
85
86 LECTURE 26. THE LEAST-SQUARES PROBLEM
A unique solution to this matrix equation exists when the columns of A are linearly independent.
An interesting formula exists for the matrix which projects b onto the column space of A. Multi-
plying the normal equations on the left by A(AT A)−1 , we obtain
Notice that the projection matrix P = A(AT A)−1 AT satisfies P2 = P, that is, two projections is the
same as one. If A itself is a square invertible matrix, then P = I and b is already in the column space
of A.
As an example of the application of the normal equations, consider the toy least-squares problem of
fitting a line through the three data points (1, 1), (2, 3) and (3, 2). With the line given by y = β 0 + β 1 x,
the overdetermined system of equations is given by
1 1 ! 1
β0
1 2 = 3 .
β1
1 3 2
or ! ! !
3 6 β0 6
= .
6 14 β1 13
We can using Gaussian elimination to determine β 0 = 1 and β 1 = 1/2, and the least-squares line is
given by y = 1 + x/2. The graph of the data and the line is shown below.
87
88 LECTURE 27. SOLUTION OF THE LEAST-SQUARES PROBLEM
2
y
1 2 3
x
1
1
a) 1
3
−2
−1
1
b) −1
3
2
2
1
c) −1
3
−1
−2
1
d) 1
3
1
2. Suppose we have data points given by ( xn , yn ) = (1, 1), (2, 1), and (3, 3). If the data is to be fit by
the line y = β 0 + β 1 x, which is the overdetermined equation for β 0 and β 1 ?
1 1 ! 1
β0
a) 1 1 = 2
β1
3 1 3
1 1 ! 1
β0
b) 2 1 = 1
β1
3 1 3
1 1 ! 1
β0
c) 1 1 = 2
β1
1 3 3
1 1 ! 1
β0
d) 1 2 = 1
β1
1 3 3
91
92 LECTURE 27. SOLUTION OF THE LEAST-SQUARES PROBLEM
3. Suppose we have data points given by ( xn , yn ) = (1, 1), (2, 1), and (3, 3). Which is the best fit line
to the data?
1
a) y = +x
3
1
b) y = − + x
3
1
c) y = 1 + x
3
1
d) y = 1 − x
3
Solutions to the Practice quiz
Week IV
93
95
In this week’s lectures, we will learn about determinants and the eigenvalue problem. We will
learn how to compute determinants using a Laplace expansion, the Leibniz formula, or by row or
column elimination. We will formulate the eigenvalue problem and learn how to find the eigenvalues
and eigenvectors of a matrix. We will learn how to diagonalize a matrix using its eigenvalues and
eigenvectors, and how this leads to an easy calculation of a matrix raised to a power.
96
Lecture 28
We already showed that a two-by-two matrix A is invertible when its determinant is nonzero, where
a b
det A = = ad − bc.
c d
If A is invertible, then the equation Ax = b has the unique solution x = A−1 b. But if A is not invertible,
then Ax = b may have no solution or an infinite number of solutions. When det A = 0, we say that
the matrix A is singular.
It is also straightforward to define the determinant for a three-by-three matrix. We consider the
system of equations Ax = 0 and determine the condition for which x = 0 is the only solution. With
a b c x1
d e f x2 = 0,
g h i x3
one can do the messy algebra of elimination to solve for x1 , x2 , and x3 . One finds that x1 = x2 = x3 = 0
is the only solution when det A ̸= 0, where the definition, apart from a constant, is given by
An easy way to remember this result is to mentally draw the following picture:
a b c a b a b c a b
d e f d e
— d e f d e
g h i g h g h i g h
.
The matrix A is periodically extended two columns to the right, drawn explicitly here but usually only
imagined. Then the six terms comprising the determinant are made evident, with the lines slanting
down towards the right getting the plus signs and the lines slanting down towards the left getting the
minus signs. Unfortunately, this mnemonic only works for three-by-three matrices.
97
98 LECTURE 28. TWO-BY-TWO AND THREE-BY-THREE DETERMINANTS
2. Show that the three-by-three determinant changes sign when the first two rows are interchanged.
3. Let A and B be two-by-two matrices. Prove by direct computation that det AB = det A det B.
Laplace expansion
View this lecture on YouTube
There is a way to write the three-by-three determinant that generalizes. It is called a Laplace expansion
(also called a cofactor expansion or expansion by minors). For the three-by-three determinant, we have
a b c
d e f = aei + b f g + cdh − ceg − bdi − a f h
g h i
= a(ei − f h) − b(di − f g) + c(dh − eg),
Evidently, the three-by-three determinant can be computed from lower-order two-by-two determi-
nants, called minors. The rule here for a general n-by-n matrix is that one goes across the first row of
the matrix, multiplying each element in the row by the determinant of the matrix obtained by crossing
out that element’s row and column, and adding the results with alternating signs.
In fact, this expansion in minors can be done across any row or down any column. When the minor
is obtained by deleting the ith-row and j-th column, then the sign of the term is given by (−1)i+ j . An
easy way to remember the signs is to form a checkerboard pattern, exhibited here for the three-by-three
and four-by-four matrices:
+ − + −
+ − +
− + − +
− + − , .
+ − + −
+ − +
− + − +
We first expand in minors down the second column. The only nonzero contribution comes from the
99
100 LECTURE 29. LAPLACE EXPANSION
two in the third row, and we cross out the second column and third row (and multiply by a minus
sign) to obtain a three-by-three determinant:
1 0 0 −1
1 0 −1
3 0 0 5
= − 2 3 0 5 .
2 2 4 −3
1
5 0
1 0 5 0
We then again expand in minors down the second column. The only nonzero contribution comes
from the five in the third row, and we cross out the second column and third row (and mutiply by a
minus sign) to obtain a two-by-two determinant, which we then compute:
1
0 −1
1 −1
−2 3 0 5 = 10 = 80.
3 5
1 5 0
The trick here is to expand by minors across the row or column containing the most zeros.
101
4 0 3 2 0
Leibniz formula
View this lecture on YouTube
Another way to generalize the three-by-three determinant is called the Leibniz formula, or more de-
scriptively, the big formula. The three-by-three determinant can be written as
a b c
d e f = aei − a f h + b f g − bdi + cdh − ceg,
g h i
where each term in the formula contains a single element from each row and from each column. For
example, to obtain the third term b f g, b comes from the first row and second column, f comes from
the second row and third column, and g comes from the third row and first column. As we can choose
one of three elements from the first row, then one of two elements from the second row, and only
one element from the third row, there are 3! = 6 terms in the formula, and the general n-by-n matrix
without any zero entries will have n! terms.
The sign of each term depends on whether the choice of columns as we go down the rows is an even
or odd permutation of the columns ordered as {1, 2, 3, . . . , n}. An even permutation is when columns
are interchanged an even number of times, and an odd permutation is when they are interchanged an
odd number of times. Even permutations get a plus sign and odd permutations get a minus sign.
For the determinant of the three-by-three matrix, the plus terms aei, b f g, and cdh correspond to
the column orderings {1, 2, 3}, {2, 3, 1}, and {3, 1, 2}, which are even permutations of {1, 2, 3}, and
the minus terms a f h, bdi, and ceg correspond to the column orderings {1, 3, 2}, {2, 1, 3}, and {3, 2, 1},
which are odd permutations.
103
104 LECTURE 30. LEIBNIZ FORMULA
Properties of a determinant
View this lecture on YouTube
The determinant is a function that maps a square matrix to a scalar. It is uniquely defined by the
following three properties:
Property 1: The determinant of the identity matrix is one;
Property 2: The determinant changes sign under row interchange;
Property 3: The determinant is a linear function of the first row, holding all other rows fixed.
Using two-by-two matrices, the first two properties are illustrated by
1 0 a b c d
=1 and = − ;
0 1 c d a b
Both the Laplace expansion and Leibniz formula for the determinant can be proved from these three
properties. Other useful properties of the determinant can also be proved:
∙ The determinant is a linear function of any row, holding all other rows fixed;
∙ The determinant of an upper or lower triangular matrix is the product of the diagonal elements;
∙ The determinant of the product of two matrices is equal to the product of the determinants;
∙ The determinant of the inverse matrix is equal to the reciprical of the determinant;
∙ The determinant of the transpose of a matrix is equal to the determinant of the matrix.
Notably, these properties imply that Gaussian elimination, done on rows or columns or both, can be
used to simplify the computation of a determinant. Row interchanges and multiplication of a row by
a constant change the determinant and must be treated correctly.
105
106 LECTURE 31. PROPERTIES OF A DETERMINANT
2. Using the defining properties of a determinant, prove that the determinant is a linear function of
any row, holding all other rows fixed.
3. Using the results of the above problems, prove that if we add k times row-i to row-j, the determinant
doesn’t change.
0 −1 1
−3 3 3 0 −2
a) 48
b) 42
c) −42
d) −48
a e 0 0
b f g 0
2. The determinant of
c
is equal to
0 h i
d 0 0 j
c) agij − beij + ce f j − de f h
d) agij + beij − ce f j − de f h
3. Assume A and B are invertible n-by-n matrices. Which of the following identities is false?
b) det AT = det A
107
108 LECTURE 31. PROPERTIES OF A DETERMINANT
Lecture 32
Let A be a square matrix, x a column vector, and λ a scalar. The eigenvalue problem for A solves
Ax = λx
for eigenvalues λi with corresponding eigenvectors xi . Making use of the identity matrix I, the eigen-
value problem can be rewritten as
(A − λI)x = 0,
where the matrix (A − λI) is just the matrix A with λ subtracted from its diagonal. For there to be
nonzero eigenvectors, the matrix (A − λI) must be singular, that is,
det (A − λI) = 0.
This equation is called the characteristic equation of the matrix A. From the Leibniz formula, the char-
acteristic equation of an n-by-n matrix is an n-th order polynomial equation in λ. For each found λi , a
corresponding eigenvector xi can be determined directly by solving (A − λi I)x = 0 for x.
For illustration, we compute the eigenvalues of a general two-by-two matrix. We have
a−λ b
0 = det (A − λI) = = ( a − λ)(d − λ) − bc = λ2 − ( a + d)λ + ( ad − bc);
c d−λ
λ2 − Tr A λ + det A = 0,
109
110 LECTURE 32. THE EIGENVALUE PROBLEM
We compute here the two real eigenvalues and eigenvectors ! of a two-by-two matrix.
0 1
Example: Find the eigenvalues and eigenvectors of A = .
1 0
The characteristic equation of A is given by
λ2 − 1 = 0,
with solutions λ1 = 1 and λ2 = −1. The first eigenvector is found by solving (A − λ1 I)x = 0, or
! !
−1 1 x1
= 0.
1 −1 x2
The equation from the second row is just a constant multiple of the equation from the first row and
this will always be the case for two-by-two matrices. From the first row, say, we find x2 = x1 . The
second eigenvector is found by solving (A − λ2 I)x = 0, or
! !
1 1 x1
= 0,
1 1 x2
The eigenvectors can be multiplied by an arbitrary nonzero constant. Notice that λ1 + λ2 = Tr A and
that λ1 λ2 = det A, and analogous relations are true for any n-by-n matrix. In particular, comparing
the sum over all the eigenvalues and the matrix trace provides a simple algebra check.
111
112 LECTURE 33. FINDING EIGENVALUES AND EIGENVECTORS (1)
0 1 2
λ2 = 0,
so that there is a degenerate eigenvalue of zero. The eigenvector associated with the zero eigenvalue
is found from Bx = 0 and has zero second component. This matrix therefore has only one eigenvalue
and eigenvector, given by !
1
λ = 0, x = .
0
!
0 −1
Example: Find the eigenvalues of C = .
1 0
The characteristic equation of C is given by
λ2 + 1 = 0,
which has the imaginary solutions λ = ±i. Matrices with complex eigenvalues play an important role
in the theory of linear differential equations.
113
114 LECTURE 34. FINDING EIGENVALUES AND EIGENVECTORS (2)
a) 1 ± 3i
√
b) 1 ± 3
√
c) 3 3 ± 1
d) 3 ± i
115
116 LECTURE 34. FINDING EIGENVALUES AND EIGENVECTORS (2)
2 1 0
3. Which of the following is an eigenvector of 1 2 1?
0 1 2
1
a) 0
1
1
√
b) 2
1
0
c) 1
0
√
2
d) 1
√
2
Matrix diagonalization
View this lecture on YouTube
For concreteness, consider a two-by-two matrix A with eigenvalues and eigenvectors given by
! !
x11 x12
λ 1 , x1 = ; λ 2 , x2 = .
x21 x22
Generalizing, we define S to be the matrix whose columns are the eigenvectors of A, and Λ to be
the diagonal matrix with eigenvalues down the diagonal. Then for any n-by-n matrix with n linearly
independent eigenvectors, we have
AS = SΛ,
where S is an invertible matrix. Multiplying both sides on the right or the left by S−1 , we derive the
relations
A = SΛS−1 or Λ = S−1 AS.
To remember the order of the S and S−1 matrices in these formulas, just remember that A should be
multiplied on the right by the eigenvectors placed in the columns of S.
117
118 LECTURE 35. MATRIX DIAGONALIZATION
2. Prove that if the columns of an n-by-n matrix are linearly independent, then the matrix is invert-
ible. (A matrix whose columns are eigenvectors corresponding to distinct eigenvalues is therefore
invertible.)
!
a b
Example: Diagonalize the matrix A = .
b a
The eigenvalues of A are determined from
a − λ b
det(A − λI) = = ( a − λ)2 − b2 = 0.
b a − λ
Solving for λ, the two eigenvalues are given by λ1 = a + b and λ2 = a − b. The corresponding
eigenvector for λ1 is found from (A − λ1 I)x1 = 0, or
! ! !
−b b x11 0
= ;
b −b x21 0
Solving for the eigenvectors and normalizing them, the eigenvalues and eigenvectors are given by
! !
1 1 1 1
λ1 = a + b, x1 = √ ; λ2 = a − b, x2 = √ .
2 1 2 −1
The matrix S of eigenvectors can be seen to be orthogonal so that S−1 = ST . We then have
!
1 1 1
S= √ and S−1 = ST = S;
2 1 −1
119
120 LECTURE 36. MATRIX DIAGONALIZATION EXAMPLE
0 1 2
Powers of a matrix
View this lecture on YouTube
Diagonalizing a matrix facilitates finding powers of that matrix. Suppose that A is diagonalizable,
and consider
A2 = (SΛS−1 )(SΛS−1 ) = SΛ2 S−1 ,
In general, Λ p has the eigenvalues raised to the power of p down the diagonal, and
A p = SΛ p S−1 .
121
122 LECTURE 37. POWERS OF A MATRIX
1 2 1
ex = 1 + x + x + x3 + . . . .
2! 3!
1 2 1
eA = I + A + A + A3 + . . . .
2! 3!
where
e λ1 0 ... 0
0 e λ2 ... 0
Λ
e = .. .. .. .. .
.
. . .
0 0 ... eλn
123
124 LECTURE 38. POWERS OF A MATRIX EXAMPLE
125
126 LECTURE 38. POWERS OF A MATRIX EXAMPLE
1.
1 0 0
1 0 0 1 0 0 0
0 1 0
a) 0 1 0 b) 0 1 0 0 c)
0 0 1
0 0 1 0 0 1 0
0 0 0
127
128 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
!
4 7
2. AB = AC = .
8 14
2 3 4 2 2 2
3. AD = 2 6 12 , DA = 3 6 9 .
2 9 16 4 12 16
n n p p n p
4. [A(BC)]ij = ∑ aik [BC]kj = ∑ ∑ aik bkl clj = ∑ ∑ aik bkl clj = ∑ [AB]il clj = [(AB)C)]ij .
k =1 k =1 l =1 l =1 k =1 l =1
129
2. Let A be an m-by-p diagonal matrix, B a p-by-n diagonal matrix, and let C = AB. The ij element of
C is given by
p
cij = ∑ aik bkj .
k =1
Since A is a diagonal matrix, the only nonzero term in the sum is k = i and we have cij = aii bij . And
since B is a diagonal matrix, the only nonzero elements of C are the diagonal elements cii = aii bii .
3. Let A and B be n-by-n upper triangular matrices, and let C = AB. The ij element of C is given by
n
cij = ∑ aik bkj .
k =1
Since A and B are upper triangular, we have aik = 0 when k < i and bkj = 0 when k > j. Excluding the
zero terms from the summation, we have
j
cij = ∑ aik bkj ,
k =i
which is equal to zero when i > j proving that C is upper triangular. Furthermore, cii = aii bii .
130 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
! ! !
1 −1 −1 1 −2 2
2. a. =
−1 1 1 −1 2 −2
3. b. For upper triangular matrices A and B, aik = 0 when k < i and bkj = 0 when k > j.
131
p p
cTij = c ji = ∑ a jk bki = ∑ bikT aTkj .
k =1 k =1
2. The square matrix A + AT is symmetric, and the square matrix A − AT is skew symmetric. Using
these two matrices, we can write
1 1
A= A + AT + A − AT .
2 2
(AT A)T = AT A.
132 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
1.
! a d !
T a b c a2 + b2 + c2 ad + be + c f
A A= b e = .
d e f ad + be + c f d2 + e2 + f 2
c f
n n m m n
Tr(AT A) = ∑ (AT A) jj = ∑ ∑ aTji aij = ∑ ∑ a2ij ,
j =1 j =1 i =1 i =1 j =1
Taking the transpose of both sides of these two equations, using both IT = I and (AB)T = BT AT , we
obtain
(A−1 )T AT = I and AT (A−1 )T = I.
We can therefore conclude that AT is invertible and that (AT )−1 = (A−1 )T .
4. Let A be an invertible matrix, and suppose B and C are its inverse. To prove that B = C, we write
B = BI = B(AC) = (BA)C = C.
134 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
3. a. Exchange the diagonal elements, negate the off-diagonal elements, and divide by the determinant.
! −1 !
2 2 1 2 −2
We have = .
1 2 2 −1 2
135
2. Since I I = I, we have I−1 = I. And since IT = I, we have I−1 = IT and I is an orthogonal matrix.
136 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
2. The z-coordinate stays fixed, and the vector rotates an angle θ in the x-y plane. Therefore,
cos θ − sin θ 0
Rz = sin θ cos θ 0 .
0 0 1
137
1.
1 0 0 1 0 0 0 1 0
P123 = 0 1 0 , P132 = 0 0 1 , P213 = 1 0 0 ,
0 0 1 0 1 0 0 0 1
0 1 0 0 0 1 0 0 1
P231 = 0 0 1 , P312 = 1 0 0 , P321 = 0 1 0 .
1 0 0 0 1 0 1 0 0
2.
−1 −1 −1 −1
P123 = P123 , P132 = P132 , P213 = P213 , P321 = P321 ,
−1 −1
P231 = P312 , P312 = P231 .
The matrices that are their own inverses correspond to either no permutation or a single permutation
of rows (or columns), e.g., {1, 3, 2}, which permutes row (column) two and three. The matrices that are
not their own inverses correspond to two permutations, e.g., {2, 3, 1}, which permutes row (column)
one and two, and then two and three. For example, commuting rows by left multiplication, we have
−1 −1 −1
P231 = P213 P132 = P213 P132 .
−1
Because matrices in general do not commute, P231 ̸= P231 . Note also that the permutation matrices
are orthogonal, so that the inverse matrices are equal to the transpose matrices. Therefore, only the
symmetric permutation matrices can be their own inverses.
138 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
1. d. An!orthogonal matrix has orthonormal rows and columns. The rows and columns of the matrix
1 −1
are not orthonormal and therefore this matrix is not an orthogonal matrix.
0 0
2. a. The rotation matrix representing a counterclockwise rotation around the x-axis in the y-z plane
can be obtained from the rotation matrix representing a counterclockwise rotation around the z-axis in
the x-y plane by shifting the elements to the right one column and down one row, assuming a periodic
1 0 0
extension of the matrix. The result is 0 cos θ − sin θ .
0 sin θ cos θ
1 0 0 0 0 1
3. b. Interchange the rows of the identity matrix: 0 1 0 → 1 0 0.
0 0 1 0 1 0
139
1.
6 −4 0 2 0 10 4 16 0 0 −1 6
x3 = −6,
1
x2 = − ( x3 − 2) = 4,
2
1
x1 = (7x2 + 2x3 − 7) = 3.
3
x3 −6
2 −5 5 1 0 −1 −1 −1 0 0 1 −1
x3 = −1,
x2 = −2x3 = 2,
x1 = 2x2 − 3x3 + 1 = 8.
x3 −1
140 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
1.
6 −4 0 2 0 10 4 16 0 0 −1 6
3 0 0 9 1 0 0 3
0 −2 0 −8 → 0 1 0 4 .
0 0 −1 6 0 0 1 −6
3 6 2 0 0 −1 0 0 −1 0 0 0
1.
3 −7 −2 1 0 0 −7 −2
3 1 0 0
− 3 5 1 0 1 0 → 0 −2 −1 1 1 0 →
6 −4 0 0 0 1 0 10 4 −2 0 1
3 0 3/2 −5/2 −7/2 0 3 0 3/2 −5/2 −7/2 0
0 −2 −1 1 1 0 → 0 −2 −1 1 1 0 →
0 0 −1 3 5 1 0 0 −1 3 5 1
1 0 0 2/3 4/3 1/2
0 1 0 1 2 1/2 .
0 0 1 −3 −5 −1
Therefore,
−1
3−7 −2 2/3 4/3 1/2
−3 5 1 = 1 2 1/2 .
6 −4 0 −3 −5 −1
142 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
4 −7 1 −2 0 1 −3 −2 0 1 −3 −2 0 0 −2 −3
2. c. A matrix in reduced row echelon form has all its pivots equal to one, and all the entries
above and below the pivots eliminated. The only matrix that is not in reduced row echelon form is
1 0 1 0
0 1 0 0. The pivot in the third row, third column has a one above it in the first row, third
0 0 1 1
column.
3. d. There are many ways to do this computation by hand, and here is one way:
3 −7 −2 1 0 0 3 −7 −2 1 0 0 3 −7 −2 1 0 0
−3 5 1 0 1 0 → 0 −2 −1 1 1 0 → 0 −2 −1 1 1 0 →
6 −4 0 0 0 110 4 −2 0 1
0 0 0 −1 3 5 1
3 −7 0 −5 −10 −2 3 −7 0 −5 −10 −2 3 0 0 2 4 3/2
0 −2 −1 1 1 0 → 0 −2 0 −2 −4 −1 → 0 1 0 1 2 1/2 →
0 0 1 −3 −5 −1 0 0 1 −3 −5 −1 0 0 1 −3 −5 −1
−1
1 0 0 2/3 4/3 1/2 3 −7 −2 2/3 4/3 1/2
0 1 0 1 2 1/2 . Therefore, −3 5 1 = 1 2 1/2.
0 0 1 −3 −5 −1 6 −4 0 −3 −5 −1
143
1.
1 0 0 0
0 1 0 0
M=
0
.
0 1 0
0 2 0 1
144 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
1.
3 −7 −2 3 −7 −2 1 0 0 −7 −2
3
−3 5 1 → 0 −2 −1 = 1 1 0 −3 5 1
6 −4 0 6 −4 0 0 0 1 6 −4 0
3 −7 −2 3 −7 −2 1 0 0 3 −7 −2
0 −2 −1 → 0 −2 −1 = 0 1 0 0 −2 −1
6 −4 0 0 10 4 −2 0 1 6 −4 0
3 −7 −2 3 −7 −2 1 0 0 3 −7 −2
0 −2 −1 → 0 −2 −1 = 0 1 0 0 −2 −1
0 10 4 0 0 −1 0 5 1 0 10 4
Therefore,
3−7 −2 1 0 0 −7 −2
3
−3 5 1 = −1 1 0 0 −2 −1 .
6 −4 0 2 −5 1 0 0 −1
145
1. We know
3−7 −2 1 0 0 −7 −2
3
A = −3 5 1 = −1 1 0 0 −2 −1 = LU.
6 −4 0 2 −5 1 0 0 −1
To solve LUx = b, we let y = Ux, solve Ly = b for y, and then solve Ux = y for x.
(a)
−3
b = 3
y1 = −3,
−y1 + y2 = 3,
2y1 − 5y2 + y3 = 2,
(b)
1
b = −1
y1 = 1,
−y1 + y2 = −1,
2y1 − 5y2 + y3 = 1,
2. b.
3−7 −2 3 −7 −2 1 0 0
A = −3 5 1 → 0 −2 −1 = M1 A, where M1 = 1 1 0 ;
6 −4 0 6 −4 0 0 0 1
3 −7 −2 3 −7 −2 1 0 0
0 −2 −1 → 0 −2 −1 = M M A, where M = 0 1 0 ;
2 1 2
6 −4 0 0 10 4 −2 0 1
3 −7 −2 3 −7 −2 1 0 0
0 −2 −1 → 0 −2 −1 = M3 M2 M1 A, where M3 = 0 1 0 .
0 10 4 0 0 −1 0 5 1
Therefore,
1 0 0 −7 −2
3
A = −1 1 0 0 −2 −1 .
2 −5 1 0 0 −1
3. b. To solve LUx = b, let y = Ux. Then solve Ly = b for y and Ux = y for x. The equations given by
Ly = b are
y1 = 1,
−y1 + y2 = −1,
2y1 − 5y2 + y3 = 1.
1
Solution by forward substitution gives y = 0. The equations given by Ux = y are
−1
1
147
1. Let v be a vector in the vector space. Then both 0v and v + (−1)v must be vectors in the vector
space and both of them are the zero vector.
2. In all of the examples, the vector spaces are closed under scalar multiplication and vector addition.
148 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
1. Only (a) and (b) are linearly independent. (c) is linearly dependent.
149
1. b. A vector space must be closed under vector addition and scalar multiplication. The set of three-
by-one matrices with the sum of all the rows equal to one is not closed under vector addition and
scalar multiplication. For example, if you multiply a vector whose sum of all rows is equal to one by
the scalar k, then the resulting vector’s sum of all rows is equal to k.
0 0 0 2 −2 1 −1 0 −1
3
3 2
so that these sets of three matrices are linearly dependent. The remaining set 2 , 1 , 1 is
1 2 0
linearly independent.
3. b. Since a three-by-one matrix has three degrees of freedom, and the constraint that the sum
of all rows equals zero eliminates one degree of freedom, the basis should consist of two vectors.
1
We can arbitrarily take the first unnormalized vector to be −1. The vector orthogonal to this
0
1
first vector with sum of all rows equal to zero is 1. Normalizing both of these vectors, we get
−2
1 1 1
1
√ −1 , √ 1 .
2
6
0 −2
151
1.
(uT1 v4 )u1 (uT2 v4 )u2 (uT3 v4 )u3
u 4 = v4 − − − .
uT1 u1 uT2 u2 uT3 u3
152 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
1. Define
0 1
{v1 , v2 } = 1 , 1 .
−1 −1
(uT1 v2 )u1
u 2 = v2 −
uT1 u1
1 0 1
= 1 − 1 = 0 .
−1 −1 0
2. Define
1
0 0
1 1 0
{v1 , v2 , v3 } = , , .
1 1 1
1 1 1
(uT1 v2 )u1
u 2 = v2 −
uT1 u1
0 1 −3
1 3 1 1 1
1 − 4 1 = 4 1 ;
=
1 1 1
2. a. Since the vectors are already orthogonal, we need only normalize them to find
( ! !)
1 1 1 1
{u b2 } =
b1 , u √ ,√ .
2 1 2 −1
1
3. b. Let u1 = v1 = 1. Then,
−1
0 −2 1
(uT1 v2 )u1 2 1
u 2 = v2 − = 1 − 1 = 1 .
uT1 u1 3 3
−1 −1 −1
1 0 1 1 0 −1 0 1 0 0 1 −1 0 0 1 −1 0 0 1 −1
The equation Ax = 0 with the pivot variables on the left-hand sides is given by
x1 = −2x4 , x2 = x4 , x3 = x4 ,
−2x4 −2
x4 1
x = x4 1. A basis for the null
and a general vector in the nullspace can be written as
4
x4 1
−2
1
1.
space is therefore given by the single vector
1
156 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
2 −4 5 8 x3 −4
We form the augmented matrix and bring the first four columns to reduced row echelon form:
−3 6 −1 1 −7 1 −2 0 −1 3
1 −2 2 3 −1 → 0 0 1 2 −2 .
2 −4 5 8 −4 0 0 0 0 0
The null space is found from the first four columns, and writing the basic variables on the left-hand
side, we have the system
x1 = 2x2 + x4 , x3 = −2x4 ;
from which we can write the general form of the null space as
2x2 + x4 2 1
x2
= x2 1 + x4 0 .
−2x 0 −2
4
x4 0 1
The free variables x2 and x4 can be set to zero, and the particular solution is determined to be x1 = 3
and x3 = −2. The general solution to the underdetermined system of equations is therefore given by
2 1 3
1 0 0
0 + b −2 + −2 .
x = a
0 1 0
157
1. We find
1 1 1 0 1 0 0 2
A = 1 1 0 1 , rref(A) = 0 1 0 −1 ,
1 0 1 1 0 0 1 −1
and dim(Col(A)) = 3, with a basis for the column space given by the first three columns of A.
158 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
2 1 1 −3 0 0 0 0
Columns one, two, and four are pivot columns of A and columns one, two, and three are pivot columns
of AT . Therefore, the column space of A is given by
2 3 1
−1 −1 −1
Col(A) = span , , ;
1 2 1
1 −2 −1
x1 = − x3 + x5 , x2 = x3 − 2x5 , x4 = 2x5 ,
x5 0 1
It can be checked that the null space is the orthogonal complement of the row space and the left null
space is the orthogonal complement of the column space. The rank(A) = 3, and A is not of full rank.
160 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
1. d. To find the null space of a matrix, bring it to reduced row echelon form. We have
1 2 0 1 1 2 0 1 1 2 0 1 1 2 0 0
2 4 1 1 → 0 0 1 −1 → 0 0 1 −1 → 0 0 1 0 .
3 6 1 1 0 0 1 −2 0 0 0 −1 0 0 0 1
−2
1
With x1 = −2x2 , x3 = 0, and x4 = 0, a basis for the null space is .
0
0
2. b. This system of linear equations is underdetermined, and the solution will be a general vector
in the null space of the multiplying matrix plus a particular vector that satisfies the underdetermined
system of equations. The linear system in matrix form is given by
x1
1 2 0 1 1
x2
2 4 1 1 = 1 .
x3
3 6 1 1 1
x4
3 6 1 1 1 0 0 1 −2 −2 0 0 0 −1 −1 0 0 0 1 1
−2
1
A basis for the null space is , and a particular solution can be found by setting the free
0
0
variable x2 = 0. Therefore, x1 = x3 = 0 and x4 = 1, and the general solution is
x1 −2 0
x2
= a 1 + 0 ,
x 0 0
3
x4 0 1
1 2 0 1 1 2 0 0
3. c. The matrix in reduced row echelon form is 2 4 1 1 → 0 0 1 0 . The number of
3 6 1 1 0 0 0 1
pivot columns is three, and this is the rank.
161
When a = 1, b = c = 0, we have
1 −2 1
1 1
vprojW = 1 − 1 = 0 ;
3 3
1 1 0
1.
1 0 ! 1
1 1 β0 3
1
=
2
β1 3
1 3 4
163
or ! ! !
4 6 β0 11
= .
6 14 β1 21
The solution is β 0 = 7/5 and β 1 = 9/10, and the least-squares line is given by y = 7/5 + 9x/10.
164 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
1
−2 0 −1
1 1 1
vprojW = ( vT w 1 ) w 1 + ( vT w 2 ) w 2 = − 1 + 1 = − 1 .
2 6 3
−1 1 2
β 0 + β 1 x1 = y1 ,
β 0 + β 1 x2 = y2 ,
β 0 + β 1 x3 = y3 .
Substituting in the values for x and y, and writing in matrix form, we have
1 1 ! 1
β0
1 2 = 1 .
β1
1 3 3
1
The best fit line is therefore y = − + x.
3
165
1.
1 0 0
0 1 0 = 1 × 1 × 1 = 1.
0 0 1
2.
d e f
a b c = dbi + ecg + f ah − f bg − eai − dch
g h i
a b c
= −( aei + b f g + cdh − ceg − bdi − a f h) = − d e f.
g h i
3. Let ! !
a b e f
A= , B= .
c d g h
Then !
ae + bg a f + bh
AB = ,
ce + dg c f + dh
and
1. For each element chosen from the first row, there is only a single way to choose nonzero elements
from all subsequent rows. Considering whether the columns chosen are even or odd permutations of
the ordered set {1, 2, 3, 4}, we obtain
a b c d
e f 0 0
= a f hj − behj + cegj − degi.
0
g h 0
0 0 i j
168 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
1. Suppose the square matrix A has two zero rows. If we interchange these two rows, the determinant
of A changes sign according to Property 2, even though A doesn’t change. Therefore, det A = − det A,
or det A = 0.
2. To prove that the determinant is a linear function of row i, interchange rows 1 and row i using
Property 2. Use Property 3, then interchange rows 1 and row i again.
3. Consider a general n-by-n matrix. Using the linear property of the jth row, and that a matrix with
two equal rows has zero determinant, we have
.. .. .. .
. .. .. .
. .. .. .. .. ..
. . . . . . . . . . . .
a ... ain a ... ain a ... ain ai1 ... ain
i1 i1 i1
.. .. .. .
= . .. .. . .. .. .. .. ..
. . . . . . + k .. . . = . . . .
a j1 + kai1 ... a jn + kain a j1 ... a jn ai1 ... ain a j1 ... a jn
.. .. .. .
.. .. .. .
.. .. .. .. .. ..
. . . . . . . . . .
4.
2
0 −1 2 0 −1 2 0 −1
3 1 1 = 0 1 5/2 = 0 1 5/2 = 2 × 1 × 7/2 = 7.
0 −1 1 0 −1 1 0 0 7/2
169
1. a. To find the determinant of a matrix with many zero elements, perform a Laplace expansion
across the row or down the column with the most zeros. Choose the correct sign. We have for one
expansion choice,
−3 0 −2 0 0
−3 −2
0 0
2 −2 −2 0 0 −3 0 −2
2 −2 −2 0 −3 0
0 0 −2 0 0 = 2 = −4 2 −2 −2 = 8 = 48.
0 0 −2 0 2 −2
3 0 −3 2 −3 0 0 −2
−3 3 3 −2
−3 3 3 0 −2
2. b. We can apply the Leibniz formula by going down the first column. For each element in the first
column there is only one possible choice of elements from the other three columns. We have
a e 0 0
b f g 0
= a f hj − behj + cegj − degi.
c 0 h i
d 0 0 j
The signs are obtained by considering whether the following permutations of the rows {1, 2, 3, 4} are
even or odd: a f hj = {1, 2, 3, 4} (even); behj = {2, 1, 3, 4} (odd); cegj = {3, 1, 2, 4} (even); degi =
{4, 1, 2, 3} (odd).
1.
a − λ b c
0 = det(A − λI) = d e−λ f
g h i − λ
= ( a − λ)(e − λ)(i − λ) + b f g + cdh − c(e − λ) g − bd(i − λ) − ( a − λ) f h
= −λ3 + ( a + e + i )λ2 − ( ae + ai + ei − bd − cg − f h)λ + aei + b f g + cdh − ceg − bdi − a f h.
171
Therefore, 2 − λ = ±7, and the eigenvalues are λ1 = −5, λ2 = 9. The eigenvector for λ1 = −5 is
found from ! !
7 7 x1
= 0,
7 7 x2
0 1 0 x3
1
√
or x2 = 0 and x1 + x3 = 0, or x1 = 0. The eigenvector for λ2 = 2 − 2 is found from
−1
√
2 1 0 x1
√
1 2 1 x2 = 0.
√
0 1 2 x3
1. b. The characteristic equation det (A − λI) = 0 for a two-by-two matrix A results in the quadratic
equation λ2 − TrA λ + det A = 0, which for 2
√ yields λ − 3λ + 1 = 0. Application of
√ the given matrix
3± 9−4 3 5
the quadratic formula results in λ± = = ± .
2 2 2
3 − λ −1 √
2. d. The characteristic equation is det (A − λI) = = (3 − λ)2 + 1 = 0. With i = −1,
1 3−λ
the solution is λ± = 3 ± i.
3. b. One can either compute the eigenvalues and eigenvectors of the matrix, or test the given possible
answers. If we test the answers, then only one is an eigenvector, and we have
√
2 1 0 1 2+ 2 1
√ √ √ √
1 2 1 2 = 2 + 2 2 = (2 + 2) 2 .
√
0 1 2 1 2+2 1
175
c1 x1 + c2 x2 = 0.
To prove that x1 and x2 are linearly independent, we need to show that c1 = c2 = 0. Multiply the
above equation on the left by A and use Ax1 = λ1 x1 and Ax2 = λ2 x2 to obtain
c1 λ1 x1 + c2 λ2 x2 = 0.
from which we conclude that if λ1 ̸= λ2 , then c1 = c2 = 0 and x1 and x2 are linearly independent.
dim(Col(A)) + dim(Null(A)) = n.
Since the columns of A are linearly independent, we have dim(Col(A)) = n and dim(Null(A)) = 0.
If the only solution to Ax = 0 is the zero vector, then det A ̸= 0 and A is invertible.
176 APPENDIX A. PROBLEM AND PRACTICE QUIZ SOLUTIONS
0 1 2
1 1 1
√ √ √ √
λ1 = 2, x1 = 0 ; λ2 = 2 − 2, x1 = − 2 ; λ2 = 2 + 2, x1 = 2 .
−1 1 1
Notice that the three eigenvectors are mutually orthogonal. This will happen when the matrix is
symmetric. If we normalize the eigenvectors, the matrix with eigenvectors as columns will be an
orthogonal matrix. Normalizing the orthogonal eigenvectors (so that S−1 = ST ) , we have
√
1/ 2 1/2 1/2
√ √
S= 0 −1/ 2 1/ 2 .
√
−1/ 2 1/2 1/2
We therefore find
√ √ √
20 0 1/ 2 0−1/ 2 2 1 0 1/ 2 1/2 1/2
√ √ √ √
0 2 − 2 0 = 1/2 −1/ 2 1/2 1 2 1 0 −1/ 2 1/ 2
√ √ √
0 0 2+ 2 1/2 1/ 2 1/2 0 1 2 −1/ 2 1/2 1/2
177
1.
−1
eA = eSΛS
SΛ2 S−1 SΛ3 S−1
= I + SΛS−1 + + +...
2! 3!
Λ2 Λ3
= S I+Λ+ + + . . . S−1
2! 3!
= SeΛ S−1 .
Because Λ is a diagonal matrix, the powers of Λ are also diagonal matrices with the diagonal elements
raised to the specified power. Each diagonal element of eΛ contains a power series of the form
λ2i λ3
1 + λi + + i +...,
2! 3!
to find !n !
−1
1 2n −1 − 2n −1
= .
−1 1 − 2n −1 2n −1
179
!2 ! !100 !2 50
0 1 1 0 0 1 0 1
2. d. A simple calculation shows that = = I. Therefore = =
1 0 0 1 1 0 1 0
I50= I.
A more complicated calculation diagonalizes this symmetric
! matrix. The eigenvalues! and orthonor-
1 1 1 1
mal eigenvectors are found to be λ1 = 1, v1 = √ and λ2 = −1, v2 = √ . The diagonal-
2 1 2 −1
! ! ! !
0 1 1 1 1 1 0 1 1
ization then takes the form = . Then,
1 0 2 1 −1 0 −1 1 −1
!100 ! !100 ! ! ! !
0 1 1 1 1 1 0 1 1 1 1 1 1 0 1 1
= = = I.
1 0 2 1 −1 0 −1 1 −1 2 1 −1 0 1 1 −1
3. a. We have
!
I2 I3
1 1 e 0
I
e = I + I + + + · · · = I 1 + 1 + + + . . . = Ie1 = .
2! 3! 2! 3! 0 e