Linear Algebra 2

Keshav Dogra∗

The following material is based on Chapter 1 of Sydsaeter et al., “Further Mathematics for
Economic Analysis” (henceforth FMEA) and Sergei Treil, “Linear Algebra Done Wrong” (available
at http://www.math.brown.edu/~treil/papers/LADW/LADW.html)

1 Complex numbers

Complex numbers have the form a + bi, where a, b ∈ R and i = √−1. Formally, complex numbers
can be regarded as 2-vectors (a, b), with the standard addition rule but with a new multiplication
rule. The rules for addition, multiplication and division of complex numbers are:

(a + bi) + (c + di) = (a + c) + (b + d)i

(a + bi)(c + di) = (ac − bd) + (ad + bc)i

(a + bi)/(c + di) = (a + bi)(c − di)/((c + di)(c − di)) = ((ac + bd) + (bc − ad)i)/(c² + d²)

The modulus of a complex number a + bi is √(a² + b²), sometimes denoted |a + bi|.
If z = x + iy, the complex conjugate of z is z̄ = x − iy. A complex number multiplied by its
complex conjugate yields a real number: specifically, zz̄ = x² + y² = |z|².
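
These rules can be checked directly in Python, whose built-in complex type implements exactly this arithmetic (a quick illustrative sketch, not part of FMEA):

z = 3 + 4j                    # a + bi with a = 3, b = 4
w = 1 - 2j                    # c + di with c = 1, d = -2

print(z + w)                  # (4+2j): (a+c) + (b+d)i
print(z * w)                  # (11-2j): (ac-bd) + (ad+bc)i
print(z / w)                  # (-1+2j): ((ac+bd) + (bc-ad)i)/(c² + d²)
print(abs(z))                 # 5.0: the modulus √(a² + b²)
print(z * z.conjugate())      # (25+0j): zz̄ = |z|², a real number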
For more details, see Appendix B.3 of FMEA, on which this section is based.
∗ Department of Economics, Columbia University, [email protected]

2 Eigenvalues
Suppose we want to compute the nth power of a square matrix A applied to a nonzero vector
x, that is, Aⁿx. This is much easier if there happens to be a scalar λ with the property that

Ax = λx (1)

Then Aⁿx = Aⁿ⁻¹(Ax) = λAⁿ⁻¹x = . . . = λⁿx, which is easy to compute.


A nonzero vector x that solves (1) is called an eigenvector, and the associated λ is called an
eigenvalue. λ may be real or complex; the elements of x may be either real or complex.
Equivalently, the eigenvectors of a matrix A are those vectors x that, when premultiplied by A,
remain proportional to the original vector.
If x is an eigenvector of A, then so is αx for every nonzero scalar α.
We can rewrite (1) as
(A − λI)x = 0

We know this homogeneous system has a nonzero solution if and only if the determinant |A − λI|
equals zero. For example, suppose A is a 2 × 2 matrix, so (1) becomes

\begin{pmatrix} a11 & a12 \\ a21 & a22 \end{pmatrix} \begin{pmatrix} x1 \\ x2 \end{pmatrix} = λ \begin{pmatrix} x1 \\ x2 \end{pmatrix}

The determinant is

|A − λI| = \begin{vmatrix} a11 − λ & a12 \\ a21 & a22 − λ \end{vmatrix} = λ² − (a11 + a22)λ + (a11 a22 − a12 a21) = 0

The eigenvalues are the (real or complex) solutions to this quadratic equation. Once we know the
eigenvalues λ1 , λ2 , we can find the corresponding eigenvectors by solving the homogeneous system
(A − λ1 I)x = 0 for x1 , and solving (A − λ2 I)x = 0 for x2 .
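
To make this concrete, here is a minimal NumPy sketch (the matrix is an arbitrary example; np.linalg.eig returns the eigenvalues in no guaranteed order):

import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Here λ² − (a11 + a22)λ + (a11 a22 − a12 a21) = λ² − 7λ + 10, so λ = 5, 2.
eigvals, eigvecs = np.linalg.eig(A)    # columns of eigvecs are eigenvectors
print(eigvals)                         # [5. 2.]

# Verify Ax = λx for the first eigenpair, and that Aⁿx = λⁿx.
lam, x = eigvals[0], eigvecs[:, 0]
print(np.allclose(A @ x, lam * x))     # True
print(np.allclose(np.linalg.matrix_power(A, 5) @ x, lam**5 * x))  # True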
Suppose A is an n × n matrix and define p(λ) = |A − λI|. Then we have

p(λ) = |A − λI| = \begin{vmatrix} a11 − λ & a12 & \cdots & a1n \\ a21 & a22 − λ & \cdots & a2n \\ \vdots & \vdots & \ddots & \vdots \\ an1 & an2 & \cdots & ann − λ \end{vmatrix} = 0        (2)

(2) is called the characteristic equation of A. p(λ) is a polynomial of degree n in λ, and is called
the characteristic polynomial of A.

Theorem 2.1. (Fundamental Theorem of Algebra)

Consider the polynomial of degree n,

P(λ) = an λⁿ + an−1 λⁿ⁻¹ + . . . + a1 λ + a0

where an ≠ 0. There exist constants λ1, λ2, ..., λn, which may be real or complex, such that

P(λ) = an (λ − λ1)(λ − λ2) . . . (λ − λn)

λ1, ..., λn are called zeros of P(λ) and roots of P(λ) = 0. Roots may be repeated (e.g., we may
have λ1 = λ2). The largest positive integer k such that (λ − λi)ᵏ divides P(λ), that is, such that
P(λ) = (λ − λi)ᵏ Q(λ) where Q is a polynomial, is called the multiplicity of the root λi. Provided
that a0, a1, ..., an are real, if a + bi is a root of P, then its complex conjugate a − bi is also a root of P.

By the Fundamental Theorem of Algebra, |A − λIn| = 0 has exactly n solutions (real or
complex), counting multiplicities. As in the 2 × 2 case, we can in principle solve for the n eigenvalues
λ1 , ..., λn and then solve the homogeneous systems (A − λ1 In )x1 = 0,...,(A − λn In )xn = 0 for the
corresponding eigenvectors x1 , ..., xn .
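
A short NumPy sketch of this connection (an illustration added here, not from FMEA; np.poly recovers the characteristic polynomial's coefficients, up to sign):

import numpy as np

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])    # a real matrix with complex eigenvalues

coeffs = np.poly(A)            # characteristic polynomial coefficients: λ² + 1
print(coeffs)                  # [1. 0. 1.]
print(np.roots(coeffs))        # [0.+1.j 0.-1.j]: a complex-conjugate pair
print(np.linalg.eigvals(A))    # the same pair, computed directly as eigenvalues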

2.1 Trace
The trace of a square matrix A, denoted tr(A), is the sum of its diagonal elements, a11 + a22 + ... +
ann. The trace has the following properties:

tr(cA) = c tr(A)
tr(A′) = tr(A)
tr(A + B) = tr(A) + tr(B)
tr(In) = n
x′x = tr(x′x) = tr(xx′), where x is an n × 1 vector
tr(ABCD) = tr(BCDA) = tr(CDAB) = tr(DABC)

The last rule holds only for cyclic permutations. So tr(ABC) = tr(BCA) = tr(CAB), but in general
tr(ABC) ≠ tr(ACB).
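
A quick numerical check of the cyclic property (a sketch with arbitrary random matrices):

import numpy as np

rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))

print(np.isclose(np.trace(A @ B @ C), np.trace(B @ C @ A)))  # True: cyclic
print(np.isclose(np.trace(A @ B @ C), np.trace(C @ A @ B)))  # True: cyclic
print(np.isclose(np.trace(A @ B @ C), np.trace(A @ C @ B)))  # False, generically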

If A is an n × n matrix with eigenvalues λ1, ..., λn, then

|A| = λ1 λ2 ...λn
tr(A) = a11 + a22 + ... + ann = λ1 + λ2 + ... + λn

Note that you can use this to find the eigenvalues of a 2 × 2 matrix (once you have calculated its
trace and determinant).
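
Concretely (a sketch, reusing the arbitrary matrix from above): the eigenvalues of a 2 × 2 matrix are the roots of λ² − tr(A)λ + |A| = 0.

import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
t, d = np.trace(A), np.linalg.det(A)   # t = 7, d = 10

disc = np.sqrt(t**2 - 4*d + 0j)        # +0j so complex eigenvalues are handled too
print((t + disc) / 2, (t - disc) / 2)  # (5+0j) (2+0j)
print(np.linalg.eigvals(A))            # [5. 2.], which agrees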
The eigenvalues of a triangular matrix are its diagonal elements.

3 Diagonalization
Let A and P be n × n matrices with P invertible. Then A and P⁻¹AP have the same eigenvalues.
To see this, note that the two matrices have the same characteristic polynomial:

|P⁻¹AP − λI| = |P⁻¹(A − λI)P| = |P⁻¹| |A − λI| |P| = |A − λI|

where the last step uses |P⁻¹| = 1/|P|.

An n × n matrix A is diagonalizable if there exist an invertible n × n matrix P and a diagonal
matrix D such that

P⁻¹AP = D

Since P⁻¹AP has the same eigenvalues as A, and the eigenvalues of a diagonal matrix are its
diagonal elements, it follows that if A is diagonalizable, then P⁻¹AP = diag{λ1, λ2, ..., λn}. Further, A
is diagonalizable if and only if it has a set of n linearly independent eigenvectors x1, ..., xn. In that
case,

P⁻¹AP = diag{λ1, λ2, ..., λn}

where P is the matrix with x1 , ..., xn as its columns.
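
Numerically (an illustrative sketch), P can be built from the eigenvectors returned by np.linalg.eig, and P⁻¹AP is then diagonal:

import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigvals, P = np.linalg.eig(A)             # the columns of P are eigenvectors of A
D = np.linalg.inv(P) @ A @ P
print(np.allclose(D, np.diag(eigvals)))   # True: P⁻¹AP = diag{λ1, λ2}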

Theorem 3.1. Let λ1 , ..., λr be distinct eigenvalues of A. Then the corresponding eigenvectors
x1 , ..., xr are linearly independent.

Proof. By induction. The case when r = 1 is trivial: any single eigenvector x1 is nonzero, so it
forms a linearly independent set. Suppose the statement of the theorem holds for r − 1. Suppose
λ1 , ..., λr are distinct eigenvalues. Since the theorem holds for r − 1, we know that the first r − 1 of
the corresponding eigenvectors are linearly independent. Suppose some linear combination of all r
eigenvectors equals zero,
c1 x1 + . . . + cr xr = 0. (3)

We want to show this must be the trivial linear combination, c1 = . . . = cr = 0. Premultiply both sides

by (A − λr I):

(A − λr I)c1 x1 + . . . + (A − λr I)cr−1 xr−1 + (A − λr I)cr xr = 0


c1 (λ1 − λr )x1 + . . . + cr−1 (λr−1 − λr )xr−1 + cr (λr − λr )xr = 0
c1 (λ1 − λr )x1 + . . . + cr−1 (λr−1 − λr )xr−1 = 0

Since the first r − 1 eigenvectors are linearly independent, and all the eigenvalues are distinct, we
must have c1 = . . . = cr−1 = 0. But then (3) becomes cr xr = 0, which can only hold if cr = 0. So
all r eigenvectors must be linearly independent.

Corollary 3.2. If an n × n matrix A has n distinct eigenvalues, then it is diagonalizable.

Note that this is a sufficient, not a necessary, condition.

A matrix P is orthogonal if P′ = P⁻¹. If the matrix A is symmetric, then

• All its n eigenvalues are real

• Eigenvectors corresponding to distinct eigenvalues are orthogonal

• There exists an orthogonal matrix P such that P⁻¹AP = diag{λ1, λ2, ..., λn}, where the
columns of P are eigenvectors of unit length.
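
NumPy's np.linalg.eigh is built around exactly these facts for symmetric matrices (a sketch with an arbitrary symmetric example):

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])        # symmetric

eigvals, P = np.linalg.eigh(A)    # real eigenvalues; orthonormal eigenvectors
print(eigvals)                    # [1. 3.]: all real
print(np.allclose(P.T @ P, np.eye(2)))             # True: P′ = P⁻¹, P is orthogonal
print(np.allclose(P.T @ A @ P, np.diag(eigvals)))  # True: P⁻¹AP is diagonal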

4 Quadratic Forms
A general quadratic form in two variables is

Q(x1, x2) = a11 x1² + a12 x1 x2 + a21 x2 x1 + a22 x2²

We can write this in matrix form as

Q(x1, x2) = (x1, x2) \begin{pmatrix} a11 & a12 \\ a21 & a22 \end{pmatrix} \begin{pmatrix} x1 \\ x2 \end{pmatrix}

Without loss of generality we can assume the matrix is symmetric, i.e. a21 = a12 .
We want to know whether Q(x1 , x2 ) will have the same sign whatever the value of (x1 , x2 ). We
call the quadratic form and its associated matrix:

• positive definite if Q(x1, x2) > 0 for all (x1, x2) ≠ (0, 0)

• positive semidefinite if Q(x1, x2) ≥ 0 for all (x1, x2) ≠ (0, 0)

• negative definite if Q(x1, x2) < 0 for all (x1, x2) ≠ (0, 0)

• negative semidefinite if Q(x1, x2) ≤ 0 for all (x1, x2) ≠ (0, 0)

A quadratic form is indefinite if it is neither positive semidefinite nor negative semidefinite: that
is, if it sometimes takes positive values and sometimes takes negative values, depending on (x1 , x2 ).
Q(x1, x2) is positive semidefinite if and only if a11 ≥ 0, a22 ≥ 0, and a11 a22 − a12² ≥ 0.
Q(x1, x2) is negative semidefinite if and only if a11 ≤ 0, a22 ≤ 0, and a11 a22 − a12² ≥ 0.
Q(x1, x2) is positive definite if and only if a11 > 0 and a11 a22 − a12² > 0.
Q(x1, x2) is negative definite if and only if a11 < 0 and a11 a22 − a12² > 0.
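
These conditions are mechanical to check. A small sketch (the helper function is my own, not from FMEA), writing the symmetric form as a11 x1² + 2 a12 x1 x2 + a22 x2²:

def classify_2x2(a11, a12, a22):
    """Classify the quadratic form a11 x1² + 2 a12 x1 x2 + a22 x2²."""
    det = a11 * a22 - a12**2              # the determinant a11 a22 − a12²
    if a11 > 0 and det > 0:
        return "positive definite"
    if a11 < 0 and det > 0:
        return "negative definite"
    if a11 >= 0 and a22 >= 0 and det >= 0:
        return "positive semidefinite"
    if a11 <= 0 and a22 <= 0 and det >= 0:
        return "negative semidefinite"
    return "indefinite"

print(classify_2x2(2.0, 1.0, 3.0))   # positive definite: 2 > 0 and 2·3 − 1² = 5 > 0
print(classify_2x2(1.0, 2.0, 1.0))   # indefinite: 1·1 − 2² = −3 < 0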

4.1 The General Case

A quadratic form in n variables is a function

Q(x1, ..., xn) = \sum_{i=1}^{n} \sum_{j=1}^{n} aij xi xj

We can write this in matrix form:

Q(x1, ..., xn) = Q(x) = x′Ax

where x = (x1, ..., xn)′ and A = (aij)n×n. By the same argument as before, we can assume A is
symmetric without loss of generality.
We call the quadratic form Q(x) = x′Ax and its associated symmetric matrix A:

• positive definite if Q(x) > 0 for all x ≠ 0

• positive semidefinite if Q(x) ≥ 0 for all x ≠ 0

• negative definite if Q(x) < 0 for all x ≠ 0

• negative semidefinite if Q(x) ≤ 0 for all x ≠ 0

A quadratic form is indefinite if it is neither positive semidefinite nor negative semidefinite: that
is, if it sometimes takes positive values and sometimes takes negative values, depending on x.
A principal minor of order r of an n × n matrix A is a minor of A of order r that is obtained
by deleting the ‘same’ rows and columns (if the ith row is deleted, so is the ith column).¹ We also
count the determinant |A| itself as a principal minor. A principal minor is called a leading principal
minor of order r if it is obtained by deleting the last n − r rows and columns; in other words, it is
the determinant of the matrix consisting of the first r rows and columns of A. We use ∆k to denote
an arbitrary principal minor of order k, and Dk to denote the leading principal minor of order k.

¹ Recalling the definition of a minor, this means that a principal minor of A of order r is the determinant of a
matrix obtained by deleting n − r rows and n − r columns such that if the ith row is deleted, so is the ith column.
Consider the symmetric matrix A and the associated quadratic form Q(x) = x′Ax.

• Q is positive definite iff Dk > 0 for k = 1, ..., n

• Q is positive semidefinite iff ∆k ≥ 0 for all principal minors of order k = 1, ..., n

• Q is negative definite iff (−1)ᵏ Dk > 0 for k = 1, ..., n

• Q is negative semidefinite iff (−1)ᵏ ∆k ≥ 0 for all principal minors of order k = 1, ..., n

Note that Q is NOT necessarily positive semidefinite if Dk ≥ 0 for k = 1, ..., n.


Another way to check the definiteness of a quadratic form is to check the signs of the eigenvalues
of its associated matrix. Let Q = x′Ax be a quadratic form, where A is symmetric, and let λ1, ..., λn
be the eigenvalues of A. Then

• Q is positive definite iff λ1 > 0, ..., λn > 0

• Q is positive semidefinite iff λ1 ≥ 0, ..., λn ≥ 0

• Q is negative definite iff λ1 < 0, ..., λn < 0

• Q is negative semidefinite iff λ1 ≤ 0, ..., λn ≤ 0

Note that the eigenvalues of A must be real, since A is symmetric.
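
A sketch of this eigenvalue test (the function name and tolerance are my own choices; np.linalg.eigvalsh is NumPy's eigenvalue routine for symmetric matrices):

import numpy as np

def definiteness(A, tol=1e-12):
    """Classify a symmetric matrix by the signs of its (real) eigenvalues."""
    lam = np.linalg.eigvalsh(A)           # real eigenvalues, in ascending order
    if np.all(lam > tol):
        return "positive definite"
    if np.all(lam < -tol):
        return "negative definite"
    if np.all(lam >= -tol):
        return "positive semidefinite"
    if np.all(lam <= tol):
        return "negative semidefinite"
    return "indefinite"

print(definiteness(np.array([[2.0, 1.0], [1.0, 2.0]])))  # positive definite (λ = 1, 3)
print(definiteness(np.array([[1.0, 2.0], [2.0, 1.0]])))  # indefinite (λ = −1, 3)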

Theorem 4.1. If A is positive definite, it can be decomposed as A = LL′, where L is a lower
triangular matrix with strictly positive diagonal entries. L is unique. This is called the Cholesky
decomposition of A.
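
NumPy implements this decomposition as np.linalg.cholesky, which returns the lower triangular factor (a quick sketch):

import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])            # symmetric positive definite

L = np.linalg.cholesky(A)             # lower triangular, positive diagonal
print(L)                              # [[2. 0.] [1. 1.41421356]]
print(np.allclose(L @ L.T, A))        # True: A = LL′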

5 Partitioned Matrices and Their Inverses

Sometimes it is useful to divide a matrix into submatrices. This is called partitioning.² For
example, we could partition a matrix into a 2 × 2 array of submatrices:

A = \begin{pmatrix} A11 & A12 \\ A21 & A22 \end{pmatrix}

² This material was not covered in class and will not be required for the math camp exam.

We can add, subtract and multiply partitioned matrices as if their submatrices are ordinary matrix
elements:

\begin{pmatrix} A11 & A12 \\ A21 & A22 \end{pmatrix} + \begin{pmatrix} B11 & B12 \\ B21 & B22 \end{pmatrix} = \begin{pmatrix} A11 + B11 & A12 + B12 \\ A21 + B21 & A22 + B22 \end{pmatrix}

AB = \begin{pmatrix} A11 B11 + A12 B21 & A11 B12 + A12 B22 \\ A21 B11 + A22 B21 & A21 B12 + A22 B22 \end{pmatrix}

Suppose A is an n × n matrix, partitioned as follows:

A = \begin{pmatrix} A11 & A12 \\ A21 & A22 \end{pmatrix}

where A11 is a k × k invertible matrix. If A is invertible, there exists an n × n matrix B such that
AB = I. Partitioning B in the same way as A, we have

AB = \begin{pmatrix} A11 B11 + A12 B21 & A11 B12 + A12 B22 \\ A21 B11 + A22 B21 & A21 B12 + A22 B22 \end{pmatrix} = \begin{pmatrix} Ik & 0k×(n−k) \\ 0(n−k)×k & In−k \end{pmatrix}

These four matrix equations can be solved for B to give

\begin{pmatrix} A11 & A12 \\ A21 & A22 \end{pmatrix}⁻¹ = \begin{pmatrix} A11⁻¹ + A11⁻¹ A12 ∆⁻¹ A21 A11⁻¹ & −A11⁻¹ A12 ∆⁻¹ \\ −∆⁻¹ A21 A11⁻¹ & ∆⁻¹ \end{pmatrix}

where ∆ = A22 − A21 A11⁻¹ A12. See FMEA, pp. 39-40 for the proof.
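
A numerical sanity check of the block-inverse formula against np.linalg.inv (a sketch; the shift by 4I just makes A and A11 comfortably invertible):

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)) + 4 * np.eye(4)
k = 2
A11, A12 = A[:k, :k], A[:k, k:]
A21, A22 = A[k:, :k], A[k:, k:]

A11i = np.linalg.inv(A11)
Di = np.linalg.inv(A22 - A21 @ A11i @ A12)       # ∆⁻¹, with ∆ = A22 − A21 A11⁻¹ A12

B = np.block([[A11i + A11i @ A12 @ Di @ A21 @ A11i, -A11i @ A12 @ Di],
              [-Di @ A21 @ A11i,                     Di]])
print(np.allclose(B, np.linalg.inv(A)))          # True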
