Linear Algebra 2
Keshav Dogra∗
The following material is based on Chapter 1 of Sydsaeter et al., “Further Mathematics for
Economic Analysis” (henceforth FMEA) and Sergei Treil, “Linear Algebra Done Wrong” (available
1 Complex numbers
Complex numbers have the form a + bi where a, b ∈ R, and i = √−1 (so that i^2 = −1). Formally, complex numbers
can be regarded as 2-vectors (a, b), with the standard addition rule but with a new multiplication
rule. The rules for addition, multiplication and division of complex numbers are:
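The display below restates the standard rules in the notation above. It is a reconstruction from the usual definitions rather than a copy of the source; the second complex number c + di (with c, d ∈ R) is introduced here for illustration.

```latex
\begin{align*}
(a + bi) + (c + di) &= (a + c) + (b + d)i \\
(a + bi)(c + di)    &= (ac - bd) + (ad + bc)i \\
\frac{a + bi}{c + di} &= \frac{(ac + bd) + (bc - ad)i}{c^2 + d^2},
  \qquad c + di \neq 0
\end{align*}
```

The division rule follows from multiplying numerator and denominator by the conjugate c − di.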
2 Eigenvalues
Suppose we want to compute the nth power of a square matrix A multiplied by a nonzero vector
x, A^n x. This is a lot easier if there happens to be a scalar λ with the property that
Ax = λx (1)
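since then A^n x = λ^n x. Such a scalar λ is called an eigenvalue of A, and x an associated eigenvector.
Equation (1) can be rewritten as the homogeneous system (A − λI)x = 0. We know this homogeneous
system has a nonzero solution if and only if the determinant |A − λI| equals zero. For example,
suppose A is a 2 × 2 matrix, so (1) becomes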
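\[
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
= \lambda \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
\]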
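The determinant is
\[
|A - \lambda I| =
\begin{vmatrix} a_{11} - \lambda & a_{12} \\ a_{21} & a_{22} - \lambda \end{vmatrix}
= \lambda^2 - (a_{11} + a_{22})\lambda + (a_{11}a_{22} - a_{12}a_{21}) = 0
\]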
The eigenvalues are the (real or complex) solutions to this quadratic equation. Once we know the
eigenvalues λ1 , λ2 , we can find the corresponding eigenvectors by solving the homogeneous system
(A − λ1 I)x = 0 for x1 , and solving (A − λ2 I)x = 0 for x2 .
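As a minimal sketch of the 2 × 2 procedure just described, the Python below solves the characteristic quadratic and then picks one nonzero solution of (A − λI)x = 0. The function names `eigenvalues_2x2` and `eigenvector_2x2` and the example matrix are illustrative assumptions, not part of the original notes.

```python
import cmath

def eigenvalues_2x2(a11, a12, a21, a22):
    """Roots of lambda^2 - (a11 + a22)*lambda + (a11*a22 - a12*a21) = 0."""
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    # cmath.sqrt returns a complex root, so a negative discriminant is handled too.
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

def eigenvector_2x2(a11, a12, a21, a22, lam, tol=1e-12):
    """One nonzero solution of (A - lam*I)x = 0, assuming lam is an eigenvalue of A."""
    # Each candidate satisfies both rows of (A - lam*I)x = 0 when |A - lam*I| = 0.
    candidates = [(a12, lam - a11), (lam - a22, a21)]
    for x1, x2 in candidates:
        if abs(x1) > tol or abs(x2) > tol:
            return x1, x2
    # Both candidates vanish only if A - lam*I is the zero matrix,
    # in which case every nonzero vector is an eigenvector.
    return 1, 0

# Illustrative example: A = [[2, 1], [1, 2]].
lam1, lam2 = eigenvalues_2x2(2, 1, 1, 2)
x = eigenvector_2x2(2, 1, 1, 2, lam1)
print(lam1, lam2, x)
```

Eigenvectors are only determined up to a nonzero scalar multiple, so any rescaling of the returned vector is equally valid.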
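Suppose A is an n × n matrix. Define p(λ) = |A − λI|; then we have
\[
p(\lambda) = |A - \lambda I| =
\begin{vmatrix}
a_{11} - \lambda & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} - \lambda & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} - \lambda
\end{vmatrix}
= 0 \qquad (2)
\]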
(2) is called the characteristic equation of A. p(λ) is a polynomial of degree n in λ, and is called
the characteristic polynomial of A.
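More generally, consider a polynomial of degree n,
\[
P(\lambda) = a_n \lambda^n + a_{n-1} \lambda^{n-1} + \cdots + a_1 \lambda + a_0
\]
where a_n ≠ 0. There exist constants λ1 , λ2 , ..., λn , which may be real or complex, such that
\[
P(\lambda) = a_n (\lambda - \lambda_1)(\lambda - \lambda_2) \cdots (\lambda - \lambda_n)
\]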
λ1 , ..., λn are called zeros of P(λ) and roots of P(λ) = 0. Roots may be repeated (e.g., we may
have λ1 = λ2 ). The largest positive integer k such that (λ − λi)^k divides P(λ), that is, such that
P(λ) = (λ − λi)^k Q(λ) where Q is a polynomial, is called the multiplicity of the root λi . Provided
that a0 , a1 , ..., an are real, if a + bi is a root of P , its complex conjugate a − bi is also a root of P .
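Two small illustrations of these definitions; the polynomials are chosen here as examples and do not come from the source.

```latex
\begin{align*}
P(\lambda) &= (\lambda - 2)^2(\lambda + 1)
  && \text{root } 2 \text{ has multiplicity } 2,\ \text{root } -1 \text{ has multiplicity } 1 \\
P(\lambda) &= \lambda^3 - \lambda^2 + \lambda - 1
  = (\lambda - 1)(\lambda - i)(\lambda + i)
  && \text{real coefficients, so } i \text{ and } -i \text{ occur as a conjugate pair}
\end{align*}
```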
2.1 Trace
The trace of a square matrix A, denoted tr(A), is the sum of its diagonal elements, a11 + a22 + ... +
ann . The trace has the following properties:
tr(cA) = c(tr(A))
tr(A′) = tr(A)
tr(A + B) = tr(A) + tr(B)
tr(In ) = n
x′x = tr(x′x) = tr(xx′), where x is an n × 1 vector
tr(ABCD) = tr(BCDA) = tr(CDAB) = tr(DABC)
The last rule holds only for cyclic permutations. So tr(ABC) = tr(BCA) = tr(CAB), but
in general tr(ABC) ≠ tr(ACB).
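The cyclic rule can be checked numerically. The sketch below uses three hypothetical 2 × 2 matrices, chosen only so that the non-cyclic ordering gives a different trace; `matmul` and `trace` are small helper functions written for this example.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def trace(X):
    """Sum of the diagonal elements of a square matrix."""
    return sum(X[i][i] for i in range(len(X)))

# Illustrative matrices (assumptions for this example only).
A = [[1, 2], [3, 4]]
B = [[0, 1], [0, 0]]
C = [[0, 0], [1, 0]]

tr_ABC = trace(matmul(matmul(A, B), C))
tr_BCA = trace(matmul(matmul(B, C), A))
tr_CAB = trace(matmul(matmul(C, A), B))
tr_ACB = trace(matmul(matmul(A, C), B))  # not a cyclic permutation of ABC

print(tr_ABC, tr_BCA, tr_CAB, tr_ACB)
```

For these matrices the three cyclic orderings all give trace 1, while tr(ACB) = 4.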
If A is an n × n matrix with eigenvalues λ1 , ..., λn , then
|A| = λ1 λ2 ...λn
tr(A) = a11 + a22 + ... + ann = λ1 + λ2 + ... + λn
Note that you can use this to find the eigenvalues of a 2 × 2 matrix (once you have calculated its
trace and determinant).
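A short worked example of this shortcut, using a matrix chosen here for illustration: for a 2 × 2 matrix the characteristic equation can be written directly in terms of the trace and determinant.

```latex
\begin{align*}
A &= \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix},
  \qquad \operatorname{tr}(A) = 4, \qquad |A| = 3 \\
\lambda^2 - \operatorname{tr}(A)\,\lambda + |A| &= \lambda^2 - 4\lambda + 3
  = (\lambda - 1)(\lambda - 3) = 0 \\
\lambda_1 + \lambda_2 &= 1 + 3 = \operatorname{tr}(A),
  \qquad \lambda_1 \lambda_2 = 1 \cdot 3 = |A|
\end{align*}
```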
The eigenvalues of a triangular matrix are its diagonal elements.
3 Diagonalization
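Let A and P be n × n matrices with P invertible. Then A and P^{-1}AP have the same eigenvalues.
To see this, note that the two matrices have the same characteristic polynomial:
\[
|P^{-1}AP - \lambda I| = |P^{-1}(A - \lambda I)P| = |P^{-1}|\,|A - \lambda I|\,|P| = |A - \lambda I|
\]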
Since P^{-1}AP has the same eigenvalues as A, and the eigenvalues of a diagonal matrix are its
diagonal elements, it follows that if A is diagonalizable, P^{-1}AP = diag{λ1 , λ2 , ..., λn }. Further, A
is diagonalizable if and only if it has a set of n linearly independent eigenvectors x1 , ..., xn . In that
case, taking P to be the matrix whose columns are x1 , ..., xn ,
\[
P^{-1}AP = \operatorname{diag}\{\lambda_1, \lambda_2, \ldots, \lambda_n\}
\]
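One use of diagonalization is computing powers: if P^{-1}AP = D is diagonal, then A^n = P D^n P^{-1}, and D^n is obtained by raising each diagonal element to the nth power. The sketch below checks this for a hypothetical 2 × 2 matrix whose eigenvectors (1, 1) and (1, −1) were worked out by hand; the matrices and the helper name `power_via_diagonalization` are assumptions made for the example.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def power_via_diagonalization(P, P_inv, eigenvalues, n):
    """Compute A^n as P * D^n * P^{-1}, where D = diag(eigenvalues)."""
    size = len(eigenvalues)
    D_n = [[eigenvalues[i] ** n if i == j else 0 for j in range(size)]
           for i in range(size)]
    return matmul(matmul(P, D_n), P_inv)

# Illustrative example: A has eigenvalues 3 and 1 with eigenvectors (1, 1) and (1, -1).
A = [[2, 1], [1, 2]]
P = [[1, 1], [1, -1]]                      # columns are the eigenvectors
P_inv = [[Fraction(1, 2), Fraction(1, 2)],
         [Fraction(1, 2), Fraction(-1, 2)]]
eigenvalues = [3, 1]

A_to_5 = power_via_diagonalization(P, P_inv, eigenvalues, 5)

# Direct computation by repeated multiplication, for comparison.
direct = A
for _ in range(4):
    direct = matmul(direct, A)

print(A_to_5 == direct)
```

Fractions are used so that the comparison with the direct product is exact rather than subject to floating-point rounding.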
Theorem 3.1. Let λ1 , ..., λr be distinct eigenvalues of A. Then the corresponding eigenvectors
x1 , ..., xr are linearly independent.
Proof. By induction. The case when r = 1 is trivial: any single eigenvector x1 is nonzero, so it
forms a linearly independent set. Suppose the statement of the theorem holds for r − 1. Suppose
λ1 , ..., λr are distinct eigenvalues. Since the theorem holds for r − 1, we know that the first r − 1 of
the corresponding eigenvectors are linearly independent. Suppose some linear combination of all r
eigenvectors equals zero,
c1 x1 + . . . + cr xr = 0. (3)
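We want to show this must be the trivial linear combination, c1 = . . . = cr = 0. Premultiply both sides
by (A − λr I). Since (A − λr I)xi = (λi − λr )xi , the term in xr vanishes and we obtain
\[
c_1(\lambda_1 - \lambda_r)x_1 + \cdots + c_{r-1}(\lambda_{r-1} - \lambda_r)x_{r-1} = 0
\]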
Since the first r − 1 eigenvectors are linearly independent, and all the eigenvalues are distinct, we
must have c1 = . . . = cr−1 = 0. But then (3) becomes cr xr = 0, which can only hold if cr = 0. So
all r eigenvectors must be linearly independent.
• If A is symmetric, there exists an orthogonal matrix P (that is, P′ = P^{-1}) such that
P^{-1}AP = diag{λ1 , λ2 , ..., λn }, where the columns of P are eigenvectors of unit length.
4 Quadratic Forms
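A general quadratic form in two variables is
\[
Q(x_1, x_2) = a_{11}x_1^2 + a_{12}x_1x_2 + a_{21}x_2x_1 + a_{22}x_2^2
= \begin{pmatrix} x_1 & x_2 \end{pmatrix}
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
\]
Without loss of generality we can assume the matrix is symmetric, i.e. a21 = a12 : replacing both
off-diagonal elements by (a12 + a21 )/2 leaves Q unchanged.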
We want to know whether Q(x1 , x2 ) will have the same sign whatever the value of (x1 , x2 ). We
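call the quadratic form and its associated matrix:
• positive definite if Q(x1 , x2 ) > 0 for all (x1 , x2 ) ≠ (0, 0)
• positive semidefinite if Q(x1 , x2 ) ≥ 0 for all (x1 , x2 )
• negative definite if Q(x1 , x2 ) < 0 for all (x1 , x2 ) ≠ (0, 0)
• negative semidefinite if Q(x1 , x2 ) ≤ 0 for all (x1 , x2 )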
A quadratic form is indefinite if it is neither positive semidefinite nor negative semidefinite: that
is, if it sometimes takes positive values and sometimes takes negative values, depending on (x1 , x2 ).
Q(x1 , x2 ) is positive semidefinite if and only if a11 ≥ 0, a22 ≥ 0, and a11 a22 − (a12)^2 ≥ 0.
Q(x1 , x2 ) is negative semidefinite if and only if a11 ≤ 0, a22 ≤ 0, and a11 a22 − (a12)^2 ≥ 0.
Q(x1 , x2 ) is positive definite if and only if a11 > 0 and a11 a22 − (a12)^2 > 0.
Q(x1 , x2 ) is negative definite if and only if a11 < 0 and a11 a22 − (a12)^2 > 0.
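The four conditions above translate directly into a small classifier for the symmetric 2 × 2 case. This is a sketch only: the function name `classify_2x2` and the returned labels are assumptions, and the matrix is passed as its three distinct entries a11, a12, a22.

```python
def classify_2x2(a11, a12, a22):
    """Classify Q(x1, x2) = a11*x1^2 + 2*a12*x1*x2 + a22*x2^2 (symmetric matrix)."""
    det = a11 * a22 - a12 ** 2
    # Check the strict (definite) cases first, since they imply the weak ones.
    if a11 > 0 and det > 0:
        return "positive definite"
    if a11 < 0 and det > 0:
        return "negative definite"
    if a11 >= 0 and a22 >= 0 and det >= 0:
        return "positive semidefinite"
    if a11 <= 0 and a22 <= 0 and det >= 0:
        return "negative semidefinite"
    return "indefinite"

print(classify_2x2(2, 1, 2))    # det = 3 with a11 > 0
print(classify_2x2(1, 2, 1))    # det = -3
print(classify_2x2(0, 0, -1))   # det = 0 with a22 < 0
```

Note that the zero matrix satisfies both semidefinite conditions; this sketch reports it as positive semidefinite simply because that test comes first.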
A general quadratic form in n variables is
\[
Q(x_1, \ldots, x_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j = x'Ax
\]
where x = (x1 , ..., xn )′ and A = (aij )n×n . By the same argument as before, we can assume A is
symmetric without loss of generality.
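We call the quadratic form Q(x) = x′Ax and its associated symmetric matrix A:
• positive definite if x′Ax > 0 for all x ≠ 0
• positive semidefinite if x′Ax ≥ 0 for all x
• negative definite if x′Ax < 0 for all x ≠ 0
• negative semidefinite if x′Ax ≤ 0 for all x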
A quadratic form is indefinite if it is neither positive semidefinite nor negative semidefinite: that
is, if it sometimes takes positive values and sometimes takes negative values, depending on x.
A principal minor of order r of an n × n matrix A is a minor of A of order r that is obtained
by deleting the ‘same’ rows and columns (if the ith row is deleted, so is the ith column).1 We also
count the determinant |A| as a principal minor. A principal minor is called a leading principal
minor of order r if it is obtained by deleting the last n − r rows and columns; in other words, it is
the determinant of the matrix consisting of the first r rows and columns of A. We use ∆k to denote
an arbitrary principal minor of order k, and Dk to denote the leading principal minor of order k.
1 Recalling the definition of a minor, this means that a principal minor of A of order r is the determinant of a
matrix obtained by deleting n − r rows and n − r columns such that if the ith row is deleted, so is the ith column.
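Consider the symmetric matrix A and the associated quadratic form Q(x) = x′Ax.
• Q is positive definite iff Dk > 0 for k = 1, ..., n
• Q is positive semidefinite iff ∆k ≥ 0 for all principal minors of order k = 1, ..., n
• Q is negative definite iff (−1)^k Dk > 0 for k = 1, ..., n
• Q is negative semidefinite iff (−1)^k ∆k ≥ 0 for all principal minors of order k = 1, ..., n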
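The sketch below implements the principal-minor tests for a small symmetric matrix. It assumes the standard criteria: leading principal minors Dk decide definiteness (Dk > 0, or (−1)^k Dk > 0, for all k), while semidefiniteness requires checking every principal minor ∆k. The helper names are hypothetical, and the cofactor-expansion determinant is only practical for small n.

```python
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def principal_minors(A, k):
    """All principal minors of order k: keep the same k rows and columns."""
    n = len(A)
    return [det([[A[i][j] for j in idx] for i in idx])
            for idx in combinations(range(n), k)]

def leading_principal_minor(A, k):
    """Determinant of the first k rows and columns of A."""
    return det([row[:k] for row in A[:k]])

def classify(A):
    """Classify the quadratic form x'Ax for a symmetric matrix A."""
    orders = range(1, len(A) + 1)
    if all(leading_principal_minor(A, k) > 0 for k in orders):
        return "positive definite"
    if all((-1) ** k * leading_principal_minor(A, k) > 0 for k in orders):
        return "negative definite"
    if all(m >= 0 for k in orders for m in principal_minors(A, k)):
        return "positive semidefinite"
    if all((-1) ** k * m >= 0 for k in orders for m in principal_minors(A, k)):
        return "negative semidefinite"
    return "indefinite"

print(classify([[2, -1, 0], [-1, 2, -1], [0, -1, 2]]))
print(classify([[0, 0], [0, -1]]))
```

The second example shows why the semidefinite tests need all principal minors: both leading principal minors of that matrix are zero, yet the order-1 principal minor −1 rules out positive semidefiniteness.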
Sometimes it is useful to divide a matrix into submatrices. This is called partitioning. For
example, we could partition a matrix into a 2 × 2 array of submatrices:
\[
A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}
\]
2 This material was not covered in class and will not be required for the math camp exam.
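We can add, subtract and multiply partitioned matrices as if their submatrices were ordinary matrix
elements, provided the submatrices have conforming dimensions:
\[
\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}
+ \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}
= \begin{pmatrix} A_{11} + B_{11} & A_{12} + B_{12} \\ A_{21} + B_{21} & A_{22} + B_{22} \end{pmatrix}
\]
\[
AB = \begin{pmatrix}
A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\
A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22}
\end{pmatrix}
\]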
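The block multiplication rule can be verified numerically. The sketch below splits two square matrices into 2 × 2 arrays of blocks with a k × k top-left block, multiplies block by block, and compares with the ordinary product. The helper names and the example matrices are assumptions made for illustration.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def matadd(X, Y):
    """Elementwise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def split(M, k):
    """Partition M into blocks M11, M12, M21, M22 with M11 of size k x k."""
    top, bottom = M[:k], M[k:]
    return ([r[:k] for r in top], [r[k:] for r in top],
            [r[:k] for r in bottom], [r[k:] for r in bottom])

def assemble(M11, M12, M21, M22):
    """Glue four blocks back into one matrix."""
    return ([r1 + r2 for r1, r2 in zip(M11, M12)] +
            [r1 + r2 for r1, r2 in zip(M21, M22)])

def block_product(A, B, k):
    """Multiply A and B using the 2 x 2 block formula (requires 0 < k < n)."""
    A11, A12, A21, A22 = split(A, k)
    B11, B12, B21, B22 = split(B, k)
    return assemble(
        matadd(matmul(A11, B11), matmul(A12, B21)),
        matadd(matmul(A11, B12), matmul(A12, B22)),
        matadd(matmul(A21, B11), matmul(A22, B21)),
        matadd(matmul(A21, B12), matmul(A22, B22)),
    )

# Illustrative 3 x 3 matrices.
A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
B = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]

print(block_product(A, B, 1) == matmul(A, B))
print(block_product(A, B, 2) == matmul(A, B))
```

The same result is obtained for either choice of k, since the partition only changes how the sums in each entry of AB are grouped.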
where A11 is a k × k invertible matrix. If A is invertible, there exists an n × n matrix B such that
AB = I. Partitioning B in the same way as A, we have
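\[
AB = \begin{pmatrix}
A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\
A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22}
\end{pmatrix}
= \begin{pmatrix}
I_k & 0_{k \times (n-k)} \\
0_{(n-k) \times k} & I_{n-k}
\end{pmatrix}
\]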