Linear Algebra 2
Keshav Dogra∗
The following material is based on Chapter 1 of Sydsaeter et al., “Further Mathematics for
Economic Analysis” (henceforth FMEA) and Sergei Treil, “Linear Algebra Done Wrong” (available
at http://www.math.brown.edu/~treil/papers/LADW/LADW.html)
1 Complex numbers
Complex numbers have the form a + bi, where a, b ∈ R and i = √−1. Formally, complex numbers can be regarded as 2-vectors (a, b), with the standard addition rule but with a new multiplication rule. The rules for addition, multiplication and division of complex numbers are:

(a + bi) + (c + di) = (a + c) + (b + d)i
(a + bi)(c + di) = (ac − bd) + (ad + bc)i
(a + bi)/(c + di) = [(ac + bd) + (bc − ad)i]/(c² + d²), for c + di ≠ 0
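For example, a quick check of the multiplication rule (our own arithmetic, using i² = −1):

(1 + 2i)(3 + 4i) = 3 + 4i + 6i + 8i² = 3 + 10i − 8 = −5 + 10i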
2 Eigenvalues
Suppose we want to compute the nth power of a square matrix A multiplied by a nonzero vector x, Aⁿx. This is a lot easier if there happens to be a scalar λ with the property that
Ax = λx (1)
If λ and a nonzero x satisfy (1), λ is called an eigenvalue of A and x a corresponding eigenvector; then Aⁿx = Aⁿ⁻¹(Ax) = λAⁿ⁻¹x = · · · = λⁿx, so the power is easy to compute. Rewriting (1) as the homogeneous system (A − λI)x = 0, we know this system has a nonzero solution if and only if the determinant |A − λI| equals zero. For example, suppose A is a 2 × 2 matrix, so (1) becomes
\begin{pmatrix} a11 & a12 \\ a21 & a22 \end{pmatrix} \begin{pmatrix} x1 \\ x2 \end{pmatrix} = λ \begin{pmatrix} x1 \\ x2 \end{pmatrix}
The determinant is
|A − λI| = \begin{vmatrix} a11 − λ & a12 \\ a21 & a22 − λ \end{vmatrix} = λ² − (a11 + a22)λ + (a11 a22 − a12 a21) = 0
The eigenvalues are the (real or complex) solutions to this quadratic equation. Once we know the
eigenvalues λ1 , λ2 , we can find the corresponding eigenvectors by solving the homogeneous system
(A − λ1 I)x = 0 for x1 , and solving (A − λ2 I)x = 0 for x2 .
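As a concrete illustration (a worked example of our own, not from FMEA), let

A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}

Then |A − λI| = λ² − 4λ + 3 = (λ − 1)(λ − 3) = 0, so λ1 = 1 and λ2 = 3. Solving (A − λ1 I)x = 0 gives the eigenvector x1 = (1, −1)′, and solving (A − λ2 I)x = 0 gives x2 = (1, 1)′; any nonzero scalar multiples of these work equally well.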
Suppose A is an n × n matrix. Define p(λ) = |A − λI|; then we have
p(λ) = |A − λI| = \begin{vmatrix} a11 − λ & a12 & \cdots & a1n \\ a21 & a22 − λ & \cdots & a2n \\ \vdots & \vdots & \ddots & \vdots \\ an1 & an2 & \cdots & ann − λ \end{vmatrix} = 0 (2)
(2) is called the characteristic equation of A. p(λ) is a polynomial of degree n in λ, and is called
the characteristic polynomial of A.
Recall the fundamental theorem of algebra: for any polynomial P(λ) = an λⁿ + an−1 λⁿ⁻¹ + · · · + a1 λ + a0 of degree n, where an ≠ 0, there exist constants λ1, λ2, ..., λn, which may be real or complex, such that

P(λ) = an (λ − λ1)(λ − λ2) · · · (λ − λn)

λ1, ..., λn are called zeros of P(λ) and roots of P(λ) = 0. Roots may be repeated (e.g., we may have λ1 = λ2). The largest positive integer k such that (λ − λi)ᵏ divides P(λ), that is, such that P(λ) = (λ − λi)ᵏ Q(λ) where Q is a polynomial, is called the multiplicity of the root λi. Provided that a0, a1, ..., an are real, if a + bi is a root of P, its complex conjugate a − bi is also a root of P.
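For example (our own illustration): P(λ) = λ³ − 3λ² + 4 = (λ + 1)(λ − 2)², so −1 is a root of multiplicity 1 and 2 is a root of multiplicity 2. Likewise P(λ) = λ² + 1 = (λ − i)(λ + i) has the conjugate pair of complex roots i and −i. Applied to the characteristic polynomial p(λ), this factorization shows that an n × n matrix has exactly n eigenvalues, real or complex, counted with multiplicity.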
2.1 Trace
The trace of a square matrix A, denoted tr(A), is the sum of its diagonal elements, a11 + a22 + ... + ann. The trace has the following properties:
tr(cA) = c tr(A)
tr(A′) = tr(A)
tr(A + B) = tr(A) + tr(B)
tr(In) = n
x′x = tr(x′x) = tr(xx′), where x is an n × 1 vector
tr(ABCD) = tr(BCDA) = tr(CDAB) = tr(DABC)
The last rule holds for cyclic permutations only. So tr(ABC) = tr(BCA) = tr(CAB), but in general tr(ABC) ≠ tr(ACB).
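To see why x′x = tr(xx′), try a quick numerical check of our own, x = (1, 2)′: then x′x = 1 + 4 = 5, while

xx′ = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}

whose trace is also 1 + 4 = 5.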
If A is an n × n matrix with eigenvalues λ1, ..., λn, then
|A| = λ1 λ2 ...λn
tr(A) = a11 + a22 + ... + ann = λ1 + λ2 + ... + λn
Note that you can use this to find the eigenvalues of a 2 × 2 matrix (once you have calculated its
trace and determinant).
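Concretely: λ1 + λ2 = tr(A) and λ1 λ2 = |A|, so the eigenvalues of a 2 × 2 matrix solve λ² − tr(A)λ + |A| = 0. For the matrix A in the example above, tr(A) = 4 and |A| = 3, so λ² − 4λ + 3 = 0 and λ = 1, 3, as before.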
The eigenvalues of a triangular matrix are its diagonal elements.
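For instance (our own quick example), the triangular matrix \begin{pmatrix} 2 & 5 \\ 0 & 3 \end{pmatrix} has characteristic polynomial (2 − λ)(3 − λ), so its eigenvalues are 2 and 3 regardless of the off-diagonal entry 5.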
3 Diagonalization
Let A and P be n × n matrices with P invertible. Then A and P⁻¹AP have the same eigenvalues. To see this, note that the two matrices have the same characteristic polynomial:

|P⁻¹AP − λI| = |P⁻¹AP − λP⁻¹P| = |P⁻¹(A − λI)P| = |P⁻¹| |A − λI| |P| = |A − λI|

where the last step uses |P⁻¹| |P| = |P⁻¹P| = |I| = 1.
A is said to be diagonalizable if there exists an invertible P such that P⁻¹AP is a diagonal matrix. Since P⁻¹AP has the same eigenvalues as A, and the eigenvalues of a diagonal matrix are its diagonal elements, it follows that if A is diagonalizable, P⁻¹AP = diag{λ1, λ2, ..., λn}, where λ1, ..., λn are the eigenvalues of A. Further, A is diagonalizable if and only if it has a set of n linearly independent eigenvectors x1, ..., xn. In that case, taking P to be the matrix whose columns are these eigenvectors,

P⁻¹AP = diag{λ1, λ2, ..., λn}
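Continuing our 2 × 2 example: the eigenvectors (1, −1)′ and (1, 1)′ of A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} are linearly independent, so with

P = \begin{pmatrix} 1 & 1 \\ −1 & 1 \end{pmatrix},   P⁻¹ = (1/2) \begin{pmatrix} 1 & −1 \\ 1 & 1 \end{pmatrix}

one can check that P⁻¹AP = diag{1, 3}.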
Theorem 3.1. Let λ1 , ..., λr be distinct eigenvalues of A. Then the corresponding eigenvectors
x1 , ..., xr are linearly independent.
Proof. By induction. The case when r = 1 is trivial: any single eigenvector x1 is nonzero, so it
forms a linearly independent set. Suppose the statement of the theorem holds for r − 1. Suppose
λ1 , ..., λr are distinct eigenvalues. Since the theorem holds for r − 1, we know that the first r − 1 of
the corresponding eigenvectors are linearly independent. Suppose some linear combination of all r
eigenvectors equals zero,
c1 x1 + . . . + cr xr = 0. (3)
We want to show this must be the trivial linear combination, c1 = . . . = cr = 0. Premultiply both sides of (3) by (A − λr I). Using Axi = λi xi,

(A − λr I)(c1 x1 + . . . + cr xr) = c1 (λ1 − λr)x1 + . . . + cr−1 (λr−1 − λr)xr−1 + cr (λr − λr)xr = 0

and since λr − λr = 0, the last term drops out, leaving c1 (λ1 − λr)x1 + . . . + cr−1 (λr−1 − λr)xr−1 = 0.
Since the first r − 1 eigenvectors are linearly independent, and all the eigenvalues are distinct, we
must have c1 = . . . = cr−1 = 0. But then (3) becomes cr xr = 0, which can only hold if cr = 0. So
all r eigenvectors must be linearly independent.
If A is symmetric (A′ = A), then all its eigenvalues are real and A is always diagonalizable. In particular:

• There exists an orthogonal matrix P (i.e., P′P = I, so that P⁻¹ = P′) such that P⁻¹AP = diag{λ1, λ2, ..., λn}, where the columns of P are eigenvectors of A of unit length.
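In our running example, A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} is symmetric, and its eigenvectors (1, −1)′ and (1, 1)′ are orthogonal; normalizing them to unit length gives

P = (1/√2) \begin{pmatrix} 1 & 1 \\ −1 & 1 \end{pmatrix}

which satisfies P′P = I and P⁻¹AP = P′AP = diag{1, 3}.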
4 Quadratic Forms
A general quadratic form in two variables is

Q(x1, x2) = a11 x1² + a12 x1 x2 + a21 x2 x1 + a22 x2² = x′Ax

where x = (x1, x2)′ and A = (aij)2×2. Without loss of generality we can assume the matrix is symmetric, i.e. a21 = a12: replacing both a12 and a21 by (a12 + a21)/2 leaves Q unchanged.
We want to know whether Q(x1 , x2 ) will have the same sign whatever the value of (x1 , x2 ). We
call the quadratic form and its associated matrix:

• positive definite if Q(x1, x2) > 0 for all (x1, x2) ≠ (0, 0)
• positive semidefinite if Q(x1, x2) ≥ 0 for all (x1, x2)
• negative definite if Q(x1, x2) < 0 for all (x1, x2) ≠ (0, 0)
• negative semidefinite if Q(x1, x2) ≤ 0 for all (x1, x2)
A quadratic form is indefinite if it is neither positive semidefinite nor negative semidefinite: that
is, if it sometimes takes positive values and sometimes takes negative values, depending on (x1 , x2 ).
Q(x1, x2) is positive semidefinite if and only if a11 ≥ 0, a22 ≥ 0, and a11 a22 − a12² ≥ 0.
Q(x1, x2) is negative semidefinite if and only if a11 ≤ 0, a22 ≤ 0, and a11 a22 − a12² ≥ 0.
Q(x1, x2) is positive definite if and only if a11 > 0 and a11 a22 − a12² > 0.
Q(x1, x2) is negative definite if and only if a11 < 0 and a11 a22 − a12² > 0.
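For example (our own numbers): Q(x1, x2) = 3x1² + 4x1 x2 + 3x2² has symmetric matrix A = \begin{pmatrix} 3 & 2 \\ 2 & 3 \end{pmatrix}. Since a11 = 3 > 0 and a11 a22 − a12² = 9 − 4 = 5 > 0, Q is positive definite.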
More generally, a quadratic form in n variables is

Q(x1, ..., xn) = \sum_{i=1}^{n} \sum_{j=1}^{n} aij xi xj = x′Ax
where x = (x1, ..., xn)′ and A = (aij)n×n. By the same argument as before, we can assume A is symmetric without loss of generality.
We call the quadratic form Q(x) = x′Ax and its associated symmetric matrix A:

• positive definite if Q(x) > 0 for all x ≠ 0
• positive semidefinite if Q(x) ≥ 0 for all x
• negative definite if Q(x) < 0 for all x ≠ 0
• negative semidefinite if Q(x) ≤ 0 for all x
A quadratic form is indefinite if it is neither positive semidefinite nor negative semidefinite: that
is, if it sometimes takes positive values and sometimes takes negative values, depending on x.
A principal minor of order r of an n × n matrix A is a minor of A of order r that is obtained by deleting the ‘same’ rows and columns (if the ith row is deleted, so is the ith column).¹ We also count the determinant |A| itself as a principal minor. A principal minor is called a leading principal minor of order r if it is obtained by deleting the last n − r rows and columns; in other words, it is the determinant of the matrix consisting of the first r rows and columns of A. We use ∆k to denote an arbitrary principal minor of order k, and Dk to denote the leading principal minor of order k.

¹ Recalling the definition of a minor, this means that a principal minor of A of order r is the determinant of a matrix obtained by deleting n − r rows and n − r columns such that if the ith row is deleted, so is the ith column.
Consider the symmetric matrix A and the associated quadratic form Q(x) = x′Ax.
• Q is positive definite iff Dk > 0 for the leading principal minors of order k = 1, ..., n
• Q is negative definite iff (−1)ᵏ Dk > 0 for the leading principal minors of order k = 1, ..., n
• Q is positive semidefinite iff ∆k ≥ 0 for all principal minors of order k = 1, ..., n
• Q is negative semidefinite iff (−1)ᵏ ∆k ≥ 0 for all principal minors of order k = 1, ..., n
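As an illustration (an example of our own): for

A = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}

the leading principal minors are D1 = 2 > 0, D2 = 2 · 2 − 1 · 1 = 3 > 0 and D3 = |A| = 3 > 0, so Q(x) = x′Ax is positive definite.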
5 Partitioned matrices²

Sometimes it is useful to divide a matrix into submatrices. This is called partitioning. For
example, we could partition a matrix into a 2 × 2 array of submatrices:
A = \begin{pmatrix} A11 & A12 \\ A21 & A22 \end{pmatrix}
² This material was not covered in class and will not be required for the math camp exam.
We can add, subtract and multiply partitioned matrices as if their submatrices were ordinary matrix elements, provided the partitions are conformable:
\begin{pmatrix} A11 & A12 \\ A21 & A22 \end{pmatrix} + \begin{pmatrix} B11 & B12 \\ B21 & B22 \end{pmatrix} = \begin{pmatrix} A11 + B11 & A12 + B12 \\ A21 + B21 & A22 + B22 \end{pmatrix}

AB = \begin{pmatrix} A11 B11 + A12 B21 & A11 B12 + A12 B22 \\ A21 B11 + A22 B21 & A21 B12 + A22 B22 \end{pmatrix}
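A quick check of the multiplication rule (our own example), with conformable square blocks: if

A = \begin{pmatrix} I & C \\ 0 & I \end{pmatrix},   B = \begin{pmatrix} I & D \\ 0 & I \end{pmatrix}

then blockwise multiplication gives

AB = \begin{pmatrix} I·I + C·0 & I·D + C·I \\ 0·I + I·0 & 0·D + I·I \end{pmatrix} = \begin{pmatrix} I & C + D \\ 0 & I \end{pmatrix}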
Suppose now that A is an n × n matrix partitioned as above, where A11 is a k × k invertible matrix. If A is invertible, there exists an n × n matrix B such that AB = I. Partitioning B in the same way as A, we have
AB = \begin{pmatrix} A11 B11 + A12 B21 & A11 B12 + A12 B22 \\ A21 B11 + A22 B21 & A21 B12 + A22 B22 \end{pmatrix} = \begin{pmatrix} Ik & 0k×(n−k) \\ 0(n−k)×k & In−k \end{pmatrix}
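As a special case (a check of our own against these equations): if A12 = 0 and A21 = 0, then A22 must also be invertible, and the four block equations give B11 = A11⁻¹, B12 = 0, B21 = 0 and B22 = A22⁻¹; that is, a block-diagonal matrix is inverted block by block.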