Linear Algebra in 4 Pages PDF
Abstract—This document will review the fundamental ideas of linear algebra. We will learn about matrices, matrix operations, and linear transformations, and discuss both the theoretical and computational aspects of linear algebra. The tools of linear algebra open the gateway to the study of more advanced mathematics. A lot of knowledge buzz awaits you if you choose to follow the path of understanding, instead of trying to memorize a bunch of formulas.

I. INTRODUCTION

Linear algebra is the math of vectors and matrices. Let n be a positive integer and let R denote the set of real numbers; then R^n is the set of all n-tuples of real numbers. A vector ~v ∈ R^n is an n-tuple of real numbers. The notation "∈ S" is read "element of S." For example, consider a vector that has three components:

    ~v = (v1, v2, v3) ∈ (R, R, R) ≡ R^3.

A matrix A ∈ R^{m×n} is a rectangular array of real numbers with m rows and n columns. For example, a 3 × 2 matrix looks like this:

        [ a11  a12 ]     [ R  R ]
    A = [ a21  a22 ]  ∈  [ R  R ]  ≡  R^{3×2}.
        [ a31  a32 ]     [ R  R ]

The purpose of this document is to introduce you to the mathematical operations that we can perform on vectors and matrices and to give you a feel for the power of linear algebra. Many problems in science, business, and technology can be described in terms of vectors and matrices, so it is important that you understand how to work with these.

Prerequisites

The only prerequisite for this tutorial is a basic understanding of high school math concepts¹ like numbers, variables, equations, and the fundamental arithmetic operations on real numbers: addition (denoted +), subtraction (denoted −), multiplication (denoted implicitly), and division (fractions). You should also be familiar with functions that take real numbers as inputs and give real numbers as outputs, f : R → R. Recall that, by definition, the inverse function f^{−1} undoes the effect of f. If you are given f(x) and you want to find x, you can use the inverse function as follows: f^{−1}(f(x)) = x. For example, the function f(x) = ln(x) has the inverse f^{−1}(x) = e^x, and the inverse of g(x) = √x is g^{−1}(x) = x^2.

II. DEFINITIONS

A. Vector operations

We now define the math operations for vectors. The operations we can perform on vectors ~u = (u1, u2, u3) and ~v = (v1, v2, v3) are: addition, subtraction, scaling, norm (length), dot product, and cross product:

    ~u + ~v = (u1 + v1, u2 + v2, u3 + v3)
    ~u − ~v = (u1 − v1, u2 − v2, u3 − v3)
    α~u = (αu1, αu2, αu3)
    ||~u|| = √(u1^2 + u2^2 + u3^2)
    ~u · ~v = u1v1 + u2v2 + u3v3
    ~u × ~v = (u2v3 − u3v2, u3v1 − u1v3, u1v2 − u2v1)

The dot product and the cross product of two vectors can also be described in terms of the angle θ between the two vectors. The formula for the dot product of the vectors is ~u · ~v = ||~u|| ||~v|| cos θ. We say two vectors ~u and ~v are orthogonal if the angle between them is 90°. The dot product of orthogonal vectors is zero: ~u · ~v = ||~u|| ||~v|| cos(90°) = 0.
The norm of the cross product is given by ||~u × ~v|| = ||~u|| ||~v|| sin θ. The cross product is not commutative: ~u × ~v ≠ ~v × ~u; in fact, ~u × ~v = −~v × ~u.

B. Matrix operations

We denote by A the matrix as a whole and refer to its entries as aij. The mathematical operations defined for matrices are the following:
• addition (denoted +):

      C = A + B  ⇔  cij = aij + bij.

• subtraction (the inverse of addition)
• matrix product. The product of matrices A ∈ R^{m×n} and B ∈ R^{n×ℓ} is another matrix C ∈ R^{m×ℓ} given by the formula

      C = AB  ⇔  cij = Σ_{k=1}^{n} aik bkj,

      [ a11  a12 ]                 [ a11b11 + a12b21   a11b12 + a12b22 ]
      [ a21  a22 ] [ b11  b12 ]  = [ a21b11 + a22b21   a21b12 + a22b22 ]
      [ a31  a32 ] [ b21  b22 ]    [ a31b11 + a32b21   a31b12 + a32b22 ]

• matrix inverse (denoted A^{−1})
• matrix transpose (denoted ^T):

      [ α1  α2  α3 ]^T     [ α1  β1 ]
      [ β1  β2  β3 ]    =  [ α2  β2 ].
                           [ α3  β3 ]

• matrix trace: Tr[A] ≡ Σ_{i=1}^{n} aii
• determinant (denoted det(A) or |A|)

Note that the matrix product is not a commutative operation: AB ≠ BA.

C. Matrix-vector product

The matrix-vector product is an important special case of the matrix-matrix product. The product of a 3 × 2 matrix A and the 2 × 1 column vector ~x results in a 3 × 1 vector ~y given by:

                 [ y1 ]   [ a11  a12 ]            [ a11x1 + a12x2 ]
    ~y = A~x  ⇔  [ y2 ] = [ a21  a22 ] [ x1 ]  =  [ a21x1 + a22x2 ]
                 [ y3 ]   [ a31  a32 ] [ x2 ]     [ a31x1 + a32x2 ]

                              [ a11 ]        [ a12 ]
                        = x1  [ a21 ]  + x2  [ a22 ]               (C)
                              [ a31 ]        [ a32 ]

                          [ (a11, a12) · ~x ]
                        = [ (a21, a22) · ~x ].                     (R)
                          [ (a31, a32) · ~x ]

There are two² fundamentally different yet equivalent ways to interpret the matrix-vector product. In the column picture, (C), the multiplication of the matrix A by the vector ~x produces a linear combination of the columns of the matrix: ~y = A~x = x1 A[:,1] + x2 A[:,2], where A[:,1] and A[:,2] are the first and second columns of the matrix A.
In the row picture, (R), multiplication of the matrix A by the vector ~x produces a column vector with coefficients equal to the dot products of the rows of the matrix with the vector ~x.

D. Linear transformations

The matrix-vector product is used to define the notion of a linear transformation, which is one of the key notions in the study of linear algebra. Multiplication by a matrix A ∈ R^{m×n} can be thought of as computing a linear transformation TA that takes n-vectors as inputs and produces m-vectors as outputs:

    TA : R^n → R^m.
¹ A good textbook to (re)learn high school math is minireference.com
² For more info see the video of Prof. Strang's MIT lecture: bit.ly/10vmKcL
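The equivalence of the column picture (C) and the row picture (R) of the matrix-vector product can also be verified numerically. A short sketch, assuming numpy and an arbitrary example matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])    # a 3x2 matrix
x = np.array([2.0, -1.0])     # a 2-vector of inputs

y = A @ x                     # the matrix-vector product y = Ax

# column picture (C): y is a linear combination of the columns of A
y_col = x[0] * A[:, 0] + x[1] * A[:, 1]

# row picture (R): each entry of y is the dot product of a row of A with x
y_row = np.array([A[i, :] @ x for i in range(3)])
```

Both reconstructions agree with `A @ x`, as the text claims.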
Instead of writing ~y = TA(~x) for the linear transformation TA applied to the vector ~x, we simply write ~y = A~x. Applying the linear transformation TA to the vector ~x corresponds to the product of the matrix A and the column vector ~x. We say TA is represented by the matrix A.
You can think of linear transformations as "vector functions" and describe their properties in analogy with the regular functions you are familiar with:

    function f : R → R        ⇔  linear transformation TA : R^n → R^m
    input x ∈ R               ⇔  input ~x ∈ R^n
    output f(x)               ⇔  output TA(~x) = A~x ∈ R^m
    g ◦ f = g(f(x))           ⇔  TB(TA(~x)) = BA~x
    function inverse f^{−1}   ⇔  matrix inverse A^{−1}
    zeros of f                ⇔  N(A) ≡ null space of A
    range of f                ⇔  C(A) ≡ column space of A = range of TA

Note that the combined effect of applying the transformation TA followed by TB on the input vector ~x is equivalent to the matrix product BA~x.

E. Fundamental vector spaces

A vector space consists of a set of vectors and all linear combinations of these vectors. For example, the vector space S = span{~v1, ~v2} consists of all vectors of the form ~v = α~v1 + β~v2, where α and β are real numbers.
We now define three fundamental vector spaces associated with a matrix A.
The column space of a matrix A is the set of vectors that can be produced as linear combinations of the columns of the matrix A:

    C(A) ≡ { ~y ∈ R^m | ~y = A~x for some ~x ∈ R^n }.

The column space is the range of the linear transformation TA (the set of possible outputs). You can convince yourself of this fact by reviewing the definition of the matrix-vector product in the column picture (C). The vector A~x contains x1 times the 1st column of A, x2 times the 2nd column of A, etc. Varying over all possible inputs ~x, we obtain all possible linear combinations of the columns of A, hence the name "column space."
The null space N(A) of a matrix A ∈ R^{m×n} consists of all the vectors that the matrix A sends to the zero vector:

    N(A) ≡ { ~x ∈ R^n | A~x = ~0 }.

The vectors in the null space are orthogonal to all the rows of the matrix. We can see this from the row picture (R): the output vector is ~0 if and only if the input vector ~x is orthogonal to all the rows of A.
The row space of a matrix A, denoted R(A), is the set of linear combinations of the rows of A. The row space R(A) is the orthogonal complement of the null space N(A). This means that for all vectors ~v ∈ R(A) and all vectors ~w ∈ N(A), we have ~v · ~w = 0. Together, the null space and the row space form the domain of the transformation TA: R^n = N(A) ⊕ R(A), where ⊕ stands for orthogonal direct sum.

F. Matrix inverse

By definition, the inverse matrix A^{−1} undoes the effects of the matrix A. The cumulative effect of applying A^{−1} after A is the identity matrix 1:

                    [ 1        0 ]
    A^{−1}A = 1 ≡   [    ...     ].
                    [ 0        1 ]

The identity matrix (ones on the diagonal and zeros everywhere else) corresponds to the identity transformation: T1(~x) = 1~x = ~x, for all ~x.
The matrix inverse is useful for solving matrix equations. Whenever we want to get rid of the matrix A in some matrix equation, we can "hit" A with its inverse A^{−1} to make it disappear. For example, to solve for the matrix X in the equation XA = B, multiply both sides of the equation by A^{−1} from the right: X = BA^{−1}. To solve for X in ABCXD = E, multiply both sides of the equation by D^{−1} on the right and by A^{−1}, B^{−1}, and C^{−1} (in that order) from the left: X = C^{−1}B^{−1}A^{−1}ED^{−1}.

III. COMPUTATIONAL LINEAR ALGEBRA

Okay, I hear what you are saying: "Dude, enough with the theory talk, let's see some calculations." In this section we'll look at one of the fundamental algorithms of linear algebra, called Gauss–Jordan elimination.

A. Solving systems of equations

Suppose we're asked to solve the following system of equations:

    1x1 + 2x2 = 5,
    3x1 + 9x2 = 21.        (1)

Without a knowledge of linear algebra, we could use substitution, elimination, or subtraction to find the values of the two unknowns x1 and x2.
Gauss–Jordan elimination is a systematic procedure for solving systems of equations based on the following row operations:
α) Adding a multiple of one row to another row
β) Swapping two rows
γ) Multiplying a row by a constant
These row operations allow us to simplify the system of equations without changing its solution.
To illustrate the Gauss–Jordan elimination procedure, we'll now show the sequence of row operations required to solve the system of linear equations described above. We start by constructing an augmented matrix as follows:

    [ 1  2 |  5 ]
    [ 3  9 | 21 ].

The first column in the augmented matrix corresponds to the coefficients of the variable x1, the second column corresponds to the coefficients of x2, and the third column contains the constants from the right-hand side.
The Gauss–Jordan elimination procedure consists of two phases. During the first phase, we proceed left-to-right by choosing a row with a leading one in the leftmost column (called a pivot) and systematically subtracting multiples of that row from all rows below it to get zeros in the rest of the column. In the second phase, we start with the rightmost pivot and use it to eliminate all the numbers above it in the same column. Let's see this in action.
1) The first step is to use the pivot in the first column to eliminate the variable x1 in the second row. We do this by subtracting three times the first row from the second row, denoted R2 ← R2 − 3R1:

    [ 1  2 | 5 ]
    [ 0  3 | 6 ].

2) Next, we create a pivot in the second row using R2 ← (1/3)R2:

    [ 1  2 | 5 ]
    [ 0  1 | 2 ].

3) We now start the backward phase and eliminate the second variable from the first row. We do this by subtracting two times the second row from the first row, R1 ← R1 − 2R2:

    [ 1  0 | 1 ]
    [ 0  1 | 2 ].

The matrix is now in reduced row echelon form (RREF), which is the "simplest" form it could be in. The solutions are: x1 = 1, x2 = 2.

B. Systems of equations as matrix equations

We will now discuss another approach for solving the system of equations. Using the definition of the matrix-vector product, we can express this system of equations (1) as a matrix equation:

    [ 1  2 ] [ x1 ]   [  5 ]
    [ 3  9 ] [ x2 ] = [ 21 ].

This matrix equation has the form A~x = ~b, where A is a 2 × 2 matrix, ~x is the vector of unknowns, and ~b is a vector of constants. We can solve for ~x by multiplying both sides of the equation by the matrix inverse A^{−1}:

    A^{−1}A~x = 1~x = [ x1 ] = A^{−1}~b = [  3  −2/3 ] [  5 ] = [ 1 ]
                      [ x2 ]              [ −1   1/3 ] [ 21 ]   [ 2 ].

But how did we know what the inverse matrix A^{−1} is?
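The worked Gauss–Jordan example can be replayed programmatically by applying the same three row operations to the augmented matrix. A minimal sketch, assuming Python with numpy (the document itself performs these steps by hand):

```python
import numpy as np

# augmented matrix [A | b] for the system 1x1 + 2x2 = 5, 3x1 + 9x2 = 21
M = np.array([[1.0, 2.0, 5.0],
              [3.0, 9.0, 21.0]])

M[1] = M[1] - 3 * M[0]   # R2 <- R2 - 3R1: eliminate x1 from the second row
M[1] = M[1] / 3          # R2 <- (1/3)R2: create a pivot in the second row
M[0] = M[0] - 2 * M[1]   # R1 <- R1 - 2R2: eliminate x2 from the first row

# M is now in reduced row echelon form; the last column holds the solution
x1, x2 = M[0, 2], M[1, 2]
```

The final augmented matrix matches the RREF obtained by hand, with the solution x1 = 1, x2 = 2 in the last column.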
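In practice, the inverse can be obtained from a numerical library rather than computed by hand. A sketch, again assuming numpy; note that `np.linalg.solve` is generally preferred in numerical work over forming A^{−1} explicitly:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 9.0]])
b = np.array([5.0, 21.0])

A_inv = np.linalg.inv(A)   # the inverse [[3, -2/3], [-1, 1/3]]
x = A_inv @ b              # x = A^{-1} b

# preferred numerical route: solve Ax = b without forming A^{-1}
x_solve = np.linalg.solve(A, b)
```

Both routes recover the solution x1 = 1, x2 = 2, and A_inv @ A reproduces the identity matrix, as the definition of the inverse requires.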
free variable, which we will denote s. We are looking for a vector with three unknowns and one free variable, (x1, s, x3, x4)^T, that obeys the conditions:

    [ 1  3  0  0 ] [ x1 ]   [ 0 ]       1x1 + 3s = 0
    [ 0  0  1  0 ] [ s  ] = [ 0 ]   ⇒       1x3 = 0
    [ 0  0  0  1 ] [ x3 ]   [ 0 ]           1x4 = 0
                   [ x4 ]

Let's express the unknowns x1, x3, and x4 in terms of the free variable s. We immediately see that x3 = 0 and x4 = 0, and we can write x1 = −3s. Therefore, any vector of the form (−3s, s, 0, 0), for any s ∈ R, is in the null space of A. We write N(A) = span{(−3, 1, 0, 0)^T}.
Observe that dim(C(A)) = dim(R(A)) = 3; this is known as the rank of the matrix A. Also, dim(R(A)) + dim(N(A)) = 3 + 1 = 4, which is the dimension of the input space of the linear transformation TA.

          [ λ1        0 ]
    Λ =   [    ...      ],   Q = [ ~eλ1  · · ·  ~eλn ],   then A = QΛQ^{−1}.
          [ 0        λn ]

Matrices that can be written this way are called diagonalizable.
The decomposition of a matrix into its eigenvalues and eigenvectors gives valuable insights into the properties of the matrix. Google's original PageRank algorithm for ranking webpages by "importance" can be formalized as an eigenvector calculation on the matrix of web hyperlinks.

VI. TEXTBOOK PLUG

If you're interested in learning more about linear algebra, check out the NO BULLSHIT GUIDE TO LINEAR ALGEBRA. The book is available via lulu.com, amazon.com, and also here: gum.co/noBSLA.
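As a closing computational note, the eigendecomposition A = QΛQ^{−1} described above can be computed numerically. A brief sketch, assuming numpy and an illustrative symmetric (hence diagonalizable) matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # a diagonalizable (symmetric) matrix

evals, Q = np.linalg.eig(A)         # eigenvalues; eigenvectors are the columns of Q
Lam = np.diag(evals)                # Lambda = diag(lambda_1, ..., lambda_n)

A_rebuilt = Q @ Lam @ np.linalg.inv(Q)   # reconstruct A = Q Lambda Q^{-1}
```

Reassembling Q, Λ, and Q^{−1} reproduces the original matrix, confirming the decomposition.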