Lectures On Spectral Graph Theory PDF
Fan R. K. Chung
Author address:
University of Pennsylvania, Philadelphia, Pennsylvania 19104
E-mail address: [email protected]
Contents
Chapter 1. Eigenvalues and the Laplacian of a graph
1.1. Introduction
2.1. History
5.1. Quasi-randomness
Bibliography
CHAPTER 1
From the start, spectral graph theory has had applications to chemistry [27].
Eigenvalues were associated with the stability of molecules. Also, graph spectra
arise naturally in various problems of theoretical physics and quantum mechanics,
for example, in minimizing energies of Hamiltonian systems. The recent progress
on expander graphs and eigenvalues was initiated by problems in communication
networks. The development of rapidly mixing Markov chains has intertwined with
advances in randomized approximation algorithms. Applications of graph eigenvalues occur in numerous areas and in different guises. However, the underlying
mathematics of spectral graph theory through all its connections to the pure and
applied, the continuous and discrete, can be viewed as a single unified subject. It
is this aspect that we intend to cover in this book.
$$L(u, v) = \begin{cases} d_v & \text{if } u = v,\\ -1 & \text{if } u \text{ and } v \text{ are adjacent},\\ 0 & \text{otherwise.} \end{cases}$$
Let $T$ denote the diagonal matrix with the $(v, v)$-th entry having value $d_v$. The
Laplacian of $G$ is defined to be the matrix
$$\mathcal{L}(u, v) = \begin{cases} 1 & \text{if } u = v \text{ and } d_v \neq 0,\\ -\dfrac{1}{\sqrt{d_u d_v}} & \text{if } u \text{ and } v \text{ are adjacent},\\ 0 & \text{otherwise.} \end{cases}$$
We can write
$$\mathcal{L} = T^{-1/2} L T^{-1/2}$$
with the convention $T^{-1}(v, v) = 0$ for $d_v = 0$. We say $v$ is an isolated vertex if
$d_v = 0$. A graph is said to be nontrivial if it contains at least one edge.
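For a graph without isolated vertices, with adjacency matrix $A$, we have
$$\mathcal{L} = T^{-1/2} L T^{-1/2} = I - T^{-1/2} A T^{-1/2}.$$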
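As a numerical illustration of the definitions above, the following minimal sketch builds $L$ and $\mathcal{L} = T^{-1/2} L T^{-1/2}$ for a small graph and computes the eigenvalues. It assumes NumPy; the helper name `normalized_laplacian` and the example graph (a path on three vertices) are hypothetical choices made for this illustration only.

```python
import numpy as np

def normalized_laplacian(A):
    """Return T^{-1/2} (T - A) T^{-1/2} for a symmetric 0/1 adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    # Convention from the text: T^{-1}(v, v) = 0 when d_v = 0.
    inv_sqrt = np.zeros_like(d)
    inv_sqrt[d > 0] = 1.0 / np.sqrt(d[d > 0])
    L = np.diag(d) - A  # L(v, v) = d_v, L(u, v) = -1 for adjacent u, v
    return inv_sqrt[:, None] * L * inv_sqrt[None, :]

# Hypothetical example: the path 0 - 1 - 2.
A_path = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]])
calL = normalized_laplacian(A_path)
eigs = np.linalg.eigvalsh(calL)  # eigenvalues in ascending order
print(eigs)
```

For this path the computed eigenvalues are 0, 1 and 2 up to rounding, which are real and non-negative as expected for a symmetric matrix of this form.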
Since L is symmetric, its eigenvalues are all real and non-negative. We can
use the variational characterizations of those eigenvalues in terms of the Rayleigh
quotient of L (see, e.g. [162]). Let g denote an arbitrary function which assigns to
each vertex v of G a real value g(v). We can view g as a column vector. Then
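$$\frac{\langle g, \mathcal{L} g\rangle}{\langle g, g\rangle} = \frac{\langle g, T^{-1/2} L T^{-1/2} g\rangle}{\langle g, g\rangle} = \frac{\langle f, L f\rangle}{\langle T^{1/2} f, T^{1/2} f\rangle} = \frac{\sum_{u \sim v} (f(u) - f(v))^2}{\sum_{v} f(v)^2 d_v} \tag{1.1}$$
where $g = T^{1/2} f$ and $\sum_{u \sim v}$ denotes the sum over all unordered pairs $\{u, v\}$ for which
$u$ and $v$ are adjacent. Here $\langle f, g\rangle = \sum_x f(x) g(x)$ denotes the standard inner product
in $\mathbb{R}^n$. The sum $\sum_{u \sim v} (f(u) - f(v))^2$ is sometimes called the Dirichlet sum of $G$.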
$$\lambda_G = \lambda_1 = \inf_{f \perp T\mathbf{1}} \frac{\sum_{u \sim v} (f(u) - f(v))^2}{\sum_{v} f(v)^2 d_v}. \tag{1.2}$$
$$\inf \frac{\int_M |\nabla f|^2}{\int_M |f|^2}$$
We remark that the corresponding measure here for each edge is 1 although in the
general case for weighted graphs the measure for an edge is associated with the edge
weight (see Section 1.4.) The measure for each vertex is the degree of the vertex.
A more general notion of vertex weights will be considered in Section 2.5.
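$$\lambda_1 = \inf_f \sup_t \frac{\sum_{u \sim v} (f(u) - f(v))^2}{\sum_v (f(v) - t)^2 d_v} \tag{1.3}$$
$$= \inf_f \frac{\sum_{u \sim v} (f(u) - f(v))^2}{\sum_v (f(v) - \bar{f})^2 d_v} \tag{1.4}$$
where
$$\bar{f} = \frac{\sum_v f(v) d_v}{\operatorname{vol} G},$$
and $\operatorname{vol} G$ denotes the volume of the graph $G$, given by
$$\operatorname{vol} G = \sum_v d_v.$$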
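Using the fact that
$$N \sum_{i=1}^{N} (a_i - a)^2 = \sum_{i<j} (a_i - a_j)^2 \quad \text{for } a = \frac{1}{N}\sum_{i=1}^{N} a_i,$$
we have the following expression (which generalizes the one in [126]):
$$\lambda_1 = \operatorname{vol} G \, \inf_f \frac{\sum_{u \sim v} (f(u) - f(v))^2}{\sum_{u,v} (f(u) - f(v))^2 d_u d_v} \tag{1.5}$$
where $\sum_{u,v}$ denotes the sum over all unordered pairs of vertices $u, v$ in $G$.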
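$$\lambda_{n-1} = \sup_f \frac{\sum_{u \sim v} (f(u) - f(v))^2}{\sum_v f^2(v) d_v} \tag{1.6}$$
For a general $k$, we have
$$\lambda_k = \inf_f \sup_{g \in P_{k-1}} \frac{\sum_{u \sim v} (f(u) - f(v))^2}{\sum_v (f(v) - g(v))^2 d_v} \tag{1.7}$$
$$= \inf_{f \perp T P_{k-1}} \frac{\sum_{u \sim v} (f(u) - f(v))^2}{\sum_v f(v)^2 d_v} \tag{1.8}$$
where $P_i$ is the subspace generated by the harmonic eigenfunctions corresponding to $\lambda_j$, for $j \le i$.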
Example 1.5. For the cycle $C_n$ on $n$ vertices, the eigenvalues are $1 - \cos\frac{2\pi k}{n}$
for $k = 0, \ldots, n - 1$.
Example 1.6. For the $n$-cube $Q_n$ on $2^n$ vertices, the eigenvalues are $\frac{2k}{n}$ (with
multiplicity $\binom{n}{k}$) for $k = 0, \ldots, n$.
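The two examples above can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses NumPy, the helper names (`regular_spectrum`, `cycle_adjacency`, `cube_adjacency`) are hypothetical, and it relies on the fact that for a $k$-regular graph the Laplacian reduces to $I - A/k$.

```python
import numpy as np
from math import comb

def regular_spectrum(A):
    """Eigenvalues of I - A/k, the Laplacian of a k-regular graph."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)[0]
    return np.linalg.eigvalsh(np.eye(len(A)) - A / k)

def cycle_adjacency(n):
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
    return A

def cube_adjacency(n):
    N = 2 ** n
    A = np.zeros((N, N))
    for v in range(N):
        for b in range(n):
            A[v, v ^ (1 << b)] = 1  # neighbors differ in exactly one coordinate
    return A

n = 7  # cycle C_7
cycle_expected = np.sort([1 - np.cos(2 * np.pi * k / n) for k in range(n)])
cycle_ok = np.allclose(regular_spectrum(cycle_adjacency(n)), cycle_expected)

m = 4  # cube Q_4
cube_expected = np.sort([2 * k / m for k in range(m + 1) for _ in range(comb(m, k))])
cube_ok = np.allclose(regular_spectrum(cube_adjacency(m)), cube_expected)
print(cycle_ok, cube_ok)
```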
$$\sum_i \lambda_i \le n,$$
$$\lambda_1 \le \frac{n}{n-1},$$
with equality holding if and only if $G$ is the complete graph on $n$ vertices.
Also, for a graph $G$ without isolated vertices, we have
$$\lambda_{n-1} \ge \frac{n}{n-1}.$$
n1
1
$$f_1(v) = \begin{cases} d_b & \text{if } v = a,\\ -d_a & \text{if } v = b,\\ 0 & \text{if } v \neq a, b. \end{cases}$$
(iii) then follows from (1.2).
If G is connected, the eigenvalue 0 has multiplicity 1 since any harmonic eigenfunction with eigenvalue 0 assumes the same value at each vertex. Thus, (iv) follows
from the fact that the union of two disjoint graphs has as its spectrum the union
of the spectra of the original graphs.
(v) follows from equation (1.6) and the fact that
$$(f(x) - f(y))^2 \le 2(f^2(x) + f^2(y)).$$
Therefore
$$\lambda_i \le \sup_f \frac{\sum_{x \sim y} (f(x) - f(y))^2}{\sum_x f^2(x) d_x} \le 2.$$
Equality holds for $i = n - 1$ when $f(x) = -f(y)$ for every edge $\{x, y\}$ in $G$.
Therefore, since $f \neq 0$, $G$ has a bipartite connected component. On the other hand,
if $G$ has a connected component which is bipartite, we can choose the function $f$
so as to make $\lambda_{n-1} = 2$.
(vi) follows from the definition.
For bipartite graphs, the following slightly stronger result holds:
Lemma 1.8. The following statements are equivalent:
(i): G is bipartite.
(ii): $G$ has $i + 1$ connected components and $\lambda_{n-j} = 2$ for $1 \le j \le i$.
(iii): For each $\lambda_i$, the value $2 - \lambda_i$ is also an eigenvalue of $G$.
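Statement (iii) of Lemma 1.8 can be observed on a small bipartite graph. The following sketch assumes NumPy and uses the complete bipartite graph $K_{2,3}$ as a hypothetical example; the helper name `normalized_spectrum` is ours and assumes no isolated vertices.

```python
import numpy as np

def normalized_spectrum(A):
    """Eigenvalues of I - T^{-1/2} A T^{-1/2}; assumes no isolated vertices."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    s = 1.0 / np.sqrt(d)
    return np.linalg.eigvalsh(np.eye(len(A)) - s[:, None] * A * s[None, :])

# Hypothetical example: K_{2,3} with parts {0, 1} and {2, 3, 4}.
A = np.zeros((5, 5))
A[:2, 2:] = 1
A[2:, :2] = 1

lam = normalized_spectrum(A)   # ascending order
mirrored = np.sort(2 - lam)    # the values 2 - lambda_i, re-sorted
print(lam, mirrored)
```

The two printed arrays agree up to rounding: the spectrum is symmetric about 1 and the largest eigenvalue is 2.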
$$\lambda_1 \ge \frac{1}{D \operatorname{vol} G}.$$
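$u_0$ satisfying $f(u_0) f(v_0) < 0$. Let $P$ denote a shortest path in $G$ joining $u_0$ and $v_0$.
Then by (1.2) we have
$$\lambda_1 = \frac{\sum_{x \sim y} (f(x) - f(y))^2}{\sum_x f^2(x) d_x} \ge \frac{\sum_{\{x,y\} \in P} (f(x) - f(y))^2}{\operatorname{vol} G \, f^2(v_0)} \ge \frac{\frac{1}{D} (f(v_0) - f(u_0))^2}{\operatorname{vol} G \, f^2(v_0)} \ge \frac{1}{D \operatorname{vol} G}.$$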
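To see how the bound $\lambda_1 \ge 1/(D \operatorname{vol} G)$ compares with the actual value, one can evaluate both sides on a small graph. This is a minimal sketch assuming NumPy; the path on six vertices and the helper names `diameter` and `lambda_1` are hypothetical choices for illustration.

```python
import numpy as np
from collections import deque

def diameter(A):
    """Largest shortest-path distance, via breadth-first search from every vertex."""
    n = len(A)
    best = 0
    for src in range(n):
        dist = [-1] * n
        dist[src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in np.flatnonzero(A[u]):
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist))
    return best

def lambda_1(A):
    """Second smallest Laplacian eigenvalue; assumes no isolated vertices."""
    d = A.sum(axis=1)
    calL = np.eye(len(A)) - A / np.sqrt(np.outer(d, d))
    return np.linalg.eigvalsh(calL)[1]

# Hypothetical example: the path on 6 vertices.
A = np.diag(np.ones(5), 1)
A = A + A.T

D = diameter(A)
vol = A.sum()
lam1 = lambda_1(A)
bound = 1.0 / (D * vol)
print(lam1, bound)
```

Here the bound holds with considerable room to spare, which is typical: the lemma trades sharpness for generality.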
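$$f_\epsilon(y) = \begin{cases} f(x_0) + \dfrac{\epsilon}{d_{x_0}} & \text{if } y = x_0,\\[6pt] f(y) - \dfrac{\epsilon}{\operatorname{vol} G - d_{x_0}} & \text{otherwise.} \end{cases}$$
We have
$$\frac{\sum_{x \sim y} (f_\epsilon(x) - f_\epsilon(y))^2}{\sum_{x \in V} f_\epsilon^2(x) d_x} = \frac{\sum_{x \sim y} (f(x) - f(y))^2 + \sum_{y \sim x_0} \frac{2\epsilon (f(x_0) - f(y))}{d_{x_0}} - \sum_{y \neq x_0} \sum_{y' \sim y} \frac{2\epsilon (f(y) - f(y'))}{\operatorname{vol} G - d_{x_0}}}{\sum_{x \in V} f^2(x) d_x + 2\epsilon f(x_0) - \frac{2\epsilon}{\operatorname{vol} G - d_{x_0}} \sum_{y \neq x_0} f(y) d_y} + O(\epsilon^2)$$
$$= \frac{\sum_{x \sim y} (f(x) - f(y))^2 + \frac{2\epsilon \sum_{y \sim x_0} (f(x_0) - f(y))}{d_{x_0}} + \frac{2\epsilon \sum_{y \sim x_0} (f(x_0) - f(y))}{\operatorname{vol} G - d_{x_0}}}{\sum_{x \in V} f^2(x) d_x + 2\epsilon f(x_0) + \frac{2\epsilon f(x_0) d_{x_0}}{\operatorname{vol} G - d_{x_0}}} + O(\epsilon^2)$$
since $\sum_{x \in V} f(x) d_x = 0$, and $\sum_y \sum_{y' \sim y} (f(y) - f(y')) = 0$. The definition (1.2)
implies that
$$\frac{\sum_{x \sim y} (f_\epsilon(x) - f_\epsilon(y))^2}{\sum_{x \in V} f_\epsilon^2(x) d_x} \ge \frac{\sum_{x \sim y} (f(x) - f(y))^2}{\sum_{x \in V} f^2(x) d_x}.$$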
One can also prove the statement in Lemma 1.10 by recalling that $f = T^{-1/2} g$,
where $\mathcal{L} g = \lambda_G g$. Then
$$T^{-1} L f = T^{-1} (T^{1/2} \mathcal{L} T^{1/2})(T^{-1/2} g) = T^{-1/2} \lambda_G g = \lambda_G f,$$
and examining the entries gives the desired result.
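The identity $T^{-1} L f = \lambda_G f$ can be verified entrywise on a small example. The sketch below assumes NumPy; the graph (a triangle with one pendant vertex) and all variable names are hypothetical.

```python
import numpy as np

# Hypothetical example: triangle {0, 1, 2} with a pendant vertex 3 attached to 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d = A.sum(axis=1)
L = np.diag(d) - A
calL = L / np.sqrt(np.outer(d, d))   # T^{-1/2} L T^{-1/2}, entrywise

vals, vecs = np.linalg.eigh(calL)    # ascending eigenvalues
lam_G, g = vals[1], vecs[:, 1]       # calL g = lam_G g
f = g / np.sqrt(d)                   # f = T^{-1/2} g

lhs = (L @ f) / d                    # (1/d_x) * sum_{y ~ x} (f(x) - f(y))
rhs = lam_G * f
print(np.max(np.abs(lhs - rhs)))
```

The printed discrepancy is at the level of floating-point rounding, and one can also check that $\sum_x f(x) d_x = 0$ for this $f$.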
With a little linear algebra, we can improve the bounds on eigenvalues in terms
of the degrees of the vertices.
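We consider the trace of $(I - \mathcal{L})^2$. We have
$$\operatorname{Tr}(I - \mathcal{L})^2 = \sum_i (1 - \lambda_i)^2 \le 1 + (n - 1)\bar{\lambda}^2, \tag{1.9}$$
where
$$\bar{\lambda} = \max_{i \neq 0} |1 - \lambda_i|.$$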
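$$\operatorname{Tr}(I - \mathcal{L})^2 = \operatorname{Tr}(T^{-1/2} A T^{-1} A T^{-1/2}) = \sum_{x,y} \frac{A(x,y)^2}{d_x d_y} = \sum_x \frac{1}{d_x} - \sum_{x \sim y} \Big(\frac{1}{d_x} - \frac{1}{d_y}\Big)^2,$$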
$$\bar{\lambda}^2 \ge \frac{1}{n-1}\big(\operatorname{tr}(I - \mathcal{L})^2 - 1\big).$$
n1
=
(n 1)dH
2m
2
as m .
(1.12)
n1 1 + .
1
1 2
(
) dx
dx
dH
xV
$$\lambda_1 \le 1 - 2\frac{\sqrt{k-1}}{k}\Big(1 - \frac{2}{D}\Big) + \frac{2}{D}.$$
One way to bound eigenvalues from above is to consider contracting the
graph G into a weighted graph H (which will be defined in the next section). Then
the eigenvalues of G can be upper-bounded by the eigenvalues of H or by various
upper bounds on them, which might be easier to obtain. We remark that the proof
of Lemma 1.14 proceeds by basically contracting the graph into a weighted path.
We will prove Lemma 1.14 in the next section.
We note that Lemma 1.14 gives a proof (see [5]) that for any fixed $k$ and for
any infinite family of regular graphs with degree $k$,
$$\limsup \lambda_1 \le 1 - 2\frac{\sqrt{k-1}}{k}.$$
This bound is the best possible since it is sharp for the Ramanujan graphs (which
will be discussed in Chapter ??). We note that the cleaner version
$\lambda_1 \le 1 - 2\sqrt{k-1}/k$ is not true for certain graphs (e.g., 4-cycles or complete bipartite graphs). This example also illustrates that the assumption in Lemma 1.14
concerning $D \ge 4$ is essential.
1.4. Eigenvalues of weighted graphs
Before defining weighted graphs, we will say a few words about two different
approaches for giving definitions. We could have started from the very beginning
with weighted graphs, from which simple graphs arise as a special case in which
the weights are 0 or 1. However, the unique characteristic and special strength of
graph theory is its ability to deal with the {0, 1}-problems arising in many natural
situations. The clean formulation of a simple graph has conceptual advantages.
Furthermore, as we shall see, all definitions and subsequent theorems for simple
graphs can usually be easily carried out for weighted graphs. A weighted undirected
graph $G$ (possibly with loops) has associated with it a weight function $w : V \times V \to \mathbb{R}$ satisfying
$$w(u, v) = w(v, u)$$
and
$$w(u, v) \ge 0.$$
We note that if $\{u, v\} \notin E(G)$, then $w(u, v) = 0$. Unweighted graphs are just the
special case where all the weights are 0 or 1.
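In the present context, the degree $d_v$ of a vertex $v$ is defined to be
$$d_v = \sum_u w(u, v), \qquad \operatorname{vol} G = \sum_v d_v.$$
The matrix $L$ is given by
$$L(u, v) = \begin{cases} d_v - w(v, v) & \text{if } u = v,\\ -w(u, v) & \text{if } u \text{ and } v \text{ are adjacent},\\ 0 & \text{otherwise.} \end{cases}$$
In particular, for a function $f : V \to \mathbb{R}$, we have
$$L f(x) = \sum_{y:\, x \sim y} (f(x) - f(y))\, w(x, y).$$
Let $T$ denote the diagonal matrix with the $(v, v)$-th entry having value $d_v$. The
Laplacian of $G$ is defined to be
$$\mathcal{L} = T^{-1/2} L T^{-1/2}.$$
In other words, we have
$$\mathcal{L}(u, v) = \begin{cases} 1 - \dfrac{w(v, v)}{d_v} & \text{if } u = v \text{ and } d_v \neq 0,\\[6pt] -\dfrac{w(u, v)}{\sqrt{d_u d_v}} & \text{if } u \text{ and } v \text{ are adjacent},\\[6pt] 0 & \text{otherwise.} \end{cases}$$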
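We can still use the same characterizations for the eigenvalues of the generalized
versions of $\mathcal{L}$. For example,
$$\lambda_G := \lambda_1 = \inf_{g \perp T^{1/2}\mathbf{1}} \frac{\langle g, \mathcal{L} g\rangle}{\langle g, g\rangle} = \inf_{\substack{f \\ \sum f(x) d_x = 0}} \frac{\sum_{x \in V} f(x) L f(x)}{\sum_{x \in V} f^2(x) d_x} = \inf_{\substack{f \\ \sum f(x) d_x = 0}} \frac{\sum_{x \sim y} (f(x) - f(y))^2 w(x, y)}{\sum_{x \in V} f^2(x) d_x}. \tag{1.13}$$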
w(v , v ) =
a(k 1)i/2 ,
f (yj ) =
b(k 1)j/2 ,
f (z) =
0,
2 k1
1
1
uv
,
1
1
+
2
k
t+1
t+1
f (v) dv
v
since the ratio is maximized when $w(x_i, x_{i+1}) = k(k-1)^{i-1} = w(y_i, y_{i+1})$. This
completes the proof of the lemma.
1.5. Eigenvalues and random walks
In a graph $G$, a walk is just a sequence of vertices $(v_0, v_1, \ldots, v_s)$ with
$\{v_{i-1}, v_i\} \in E(G)$ for all $1 \le i \le s$. A random walk is determined by the transition
probabilities $P(u, v) = \operatorname{Prob}(x_{i+1} = v \mid x_i = u)$, which are independent of $i$. Clearly,
for each vertex $u$,
$$\sum_v P(u, v) = 1.$$
For any initial distribution $f : V \to \mathbb{R}$ with $\sum_v f(v) = 1$, the distribution after $k$
steps is just $f P^k$ (i.e., a matrix multiplication with $f$ viewed as a row vector where
$P$ is the matrix of transition probabilities). The random walk is said to be ergodic
if there is a unique stationary distribution $\pi(v)$ satisfying
$$\lim_{s \to \infty} f P^s(v) = \pi(v).$$
It is easy to see that necessary conditions for the ergodicity of $P$ are (i) irreducibility, i.e., for any $u, v \in V$, there exists some $s$ such that $P^s(u, v) > 0$, and (ii)
aperiodicity, i.e., g.c.d. $\{s : P^s(u, v) > 0\} = 1$. As it turns out, these are also
sufficient conditions. A major problem of interest is to determine the number of
steps $s$ required for $P^s$ to be close to its stationary distribution, given an arbitrary
initial distribution.
We say a random walk is reversible if
$$\pi(u) P(u, v) = \pi(v) P(v, u).$$
An alternative description for a reversible random walk can be given by considering
a weighted connected graph with edge weights satisfying
$$w(u, v) = w(v, u) = \pi(v) P(v, u)/c$$
where $c$ can be any constant chosen for the purpose of simplifying the values.
(For example, we can take $c$ to be the average of $\pi(v) P(v, u)$ over all $(v, u)$ with
$P(v, u) \neq 0$, so that the values for $w(v, u)$ are either 0 or 1 for a simple graph.)
The random walk on a weighted graph has as its transition probabilities
$$P(u, v) = \frac{w(u, v)}{d_u},$$
where $d_u = \sum_z w(u, z)$ is the (weighted) degree of $u$. The two conditions for
ergodicity are equivalent to the conditions that the graph be (i) connected and
(ii) non-bipartite. From Lemma 1.7, we see that (i) is equivalent to $\lambda_1 > 0$ and
(ii) implies $\lambda_{n-1} < 2$. As we will see later in (1.15), together (i) and (ii) imply
ergodicity.
We remind the reader that an unweighted graph has $w(u, v)$ equal to either 0
or 1. The usual random walk on an unweighted graph has transition probability
$1/d_v$ of moving from a vertex $v$ to any one of its neighbors. The transition matrix
$P$ then satisfies
$$P(u, v) = \begin{cases} 1/d_u & \text{if } u \text{ and } v \text{ are adjacent},\\ 0 & \text{otherwise.} \end{cases}$$
In other words,
$$f P(v) = \sum_{u:\, u \sim v} \frac{1}{d_u} f(u).$$
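Writing $f T^{-1/2} = \sum_i a_i \phi_i$, where $\phi_i$ denotes the orthonormal eigenfunction of $\mathcal{L}$ associated with $\lambda_i$, we have
$$a_0 = \frac{\langle f T^{-1/2}, \mathbf{1} T^{1/2}\rangle}{\|\mathbf{1} T^{1/2}\|} = \frac{1}{\sqrt{\operatorname{vol} G}}$$
since $\langle f, \mathbf{1}\rangle = 1$. We then have
$$\|f P^s - \pi\| = \|f P^s - \mathbf{1} T/\operatorname{vol} G\| = \|f P^s - a_0 \phi_0 T^{1/2}\| = \|f T^{-1/2} (I - \mathcal{L})^s T^{1/2} - a_0 \phi_0 T^{1/2}\|$$
$$= \Big\|\sum_{i \neq 0} (1 - \lambda_i)^s a_i \phi_i T^{1/2}\Big\| \le (1 - \lambda')^s \frac{\max_x \sqrt{d_x}}{\min_y \sqrt{d_y}} \le e^{-s\lambda'} \frac{\max_x \sqrt{d_x}}{\min_y \sqrt{d_y}} \tag{1.14}$$
where
$$\lambda' = \begin{cases} \lambda_1 & \text{if } 1 - \lambda_1 \ge \lambda_{n-1} - 1,\\ 2 - \lambda_{n-1} & \text{otherwise.} \end{cases}$$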
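The bound (1.14) can be compared with an actual random walk. The following sketch is an illustration only, assuming NumPy; the example graph (a triangle with a pendant vertex, which is connected and non-bipartite), the starting vertex, and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical example: triangle {0, 1, 2} with a pendant vertex 3 attached to 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d = A.sum(axis=1)
P = A / d[:, None]                  # P(u, v) = 1/d_u for adjacent u, v
pi = d / d.sum()                    # stationary distribution d_v / vol G

calL = np.eye(4) - A / np.sqrt(np.outer(d, d))
lam = np.linalg.eigvalsh(calL)      # ascending, lam[0] = 0
lam_prime = lam[1] if 1 - lam[1] >= lam[-1] - 1 else 2 - lam[-1]

f = np.array([0.0, 0.0, 0.0, 1.0])  # initial distribution: start at the pendant vertex
ratio = np.sqrt(d.max() / d.min())

errors, bounds = [], []
dist = f.copy()
for s in range(1, 21):
    dist = dist @ P                 # row vector f P^s
    errors.append(np.linalg.norm(dist - pi))
    bounds.append((1 - lam_prime) ** s * ratio)

print(errors[-1], bounds[-1])
```

In this example the $L_2$ error stays below the bound at every step, and both decay geometrically at a rate governed by $\lambda'$.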
Although $\lambda'$ occurs in the above upper bound for the distance between the
stationary distribution and the $s$-step distribution, in fact, only $\lambda_1$ is crucial in the
following sense. Note that $\lambda'$ is either $\lambda_1$ or $2 - \lambda_{n-1}$. Suppose the latter holds,
i.e., $\lambda_{n-1} - 1 \ge 1 - \lambda_1$. We can consider a modified random walk, called the lazy
walk, on the graph $G'$ formed by adding a loop of weight $d_v$ to each vertex $v$. The
new graph has Laplacian eigenvalues $\tilde{\lambda}_k = \lambda_k/2 \le 1$, which follows from equation
(1.13). Therefore,
$$1 - \tilde{\lambda}_1 \ge 1 - \tilde{\lambda}_{n-1} \ge 0,$$
and the convergence bound in $L_2$ distance in (1.14) for the modified random walk
becomes
$$\frac{2}{\lambda_1} \log\Big(\frac{\max_x \sqrt{d_x}}{\epsilon \min_y \sqrt{d_y}}\Big).$$
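The halving of the eigenvalues under the lazy walk is easy to confirm numerically with the weighted Laplacian of Section 1.4. This is a minimal sketch assuming NumPy; the helper name `weighted_spectrum` and the choice of the 4-cycle are hypothetical.

```python
import numpy as np

def weighted_spectrum(W):
    """Laplacian eigenvalues for a symmetric weight matrix W (loops on the diagonal)."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                 # d_v = sum_u w(u, v); assumes every d_v > 0
    L = np.diag(d) - W                # diagonal entries are d_v - w(v, v)
    return np.linalg.eigvalsh(L / np.sqrt(np.outer(d, d)))

# Hypothetical example: the 4-cycle, which is bipartite, so its largest eigenvalue is 2.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)

lam = weighted_spectrum(A)
lazy = weighted_spectrum(A + np.diag(A.sum(axis=1)))  # add a loop of weight d_v at each v
print(lam, lazy)
```

Adding the loops doubles every degree while leaving $L$ unchanged, so each eigenvalue is divided by 2 and none exceeds 1.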
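In general, suppose a weighted graph with edge weights $w(u, v)$ has eigenvalues
$\lambda_i$ with $\lambda_{n-1} - 1 \ge 1 - \lambda_1$. We can then modify the weights by choosing, for some
constant $c$,
$$w'(u, v) = \begin{cases} w(v, v) + c\, d_v & \text{if } u = v,\\ w(u, v) & \text{otherwise.} \end{cases} \tag{1.15}$$
The resulting weighted graph has eigenvalues
$$\lambda'_k = \frac{\lambda_k}{1 + c} = \frac{2\lambda_k}{\lambda_{n-1} + \lambda_1}$$
where
$$c = \frac{\lambda_1 + \lambda_{n-1}}{2} - 1 \le \frac{1}{2}.$$
Then we have
$$1 - \lambda'_1 = \lambda'_{n-1} - 1 = \frac{\lambda_{n-1} - \lambda_1}{\lambda_{n-1} + \lambda_1}.$$
The corresponding convergence bound in $L_2$ distance from (1.14) becomes
$$\frac{1}{\lambda'_1} \log\Big(\frac{\max_x \sqrt{d_x}}{\epsilon \min_y \sqrt{d_y}}\Big).$$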
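$$\Delta(s) = \max_{x, y} \frac{|P^s(y, x) - \pi(x)|}{\pi(x)}.$$
Let $\psi_x$ denote the characteristic function of $x$, and suppose
$$\psi_x T^{1/2} = \sum_i \alpha_i \phi_i, \qquad \psi_y T^{-1/2} = \sum_i \beta_i \phi_i,$$
where the $\phi_i$'s denote the eigenfunctions of the Laplacian $\mathcal{L}$ of the weighted graph associated with the random walk. In particular,
$$\alpha_0 = \frac{d_x}{\sqrt{\operatorname{vol} G}}, \qquad \beta_0 = \frac{1}{\sqrt{\operatorname{vol} G}}.$$
Let $A^*$ denote the transpose of $A$. We have
$$\Delta(t) = \max_{x, y} \frac{|\psi_y P^t \psi_x^* - \pi(x)|}{\pi(x)} = \max_{x, y} \frac{|\psi_y T^{-1/2} (I - \mathcal{L})^t T^{1/2} \psi_x^* - \pi(x)|}{\pi(x)}$$
$$\le \max_{x, y} \frac{\sum_{i \neq 0} |(1 - \lambda_i)^t \alpha_i \beta_i|}{d_x/\operatorname{vol} G} \le \bar{\lambda}^t \max_{x, y} \frac{\sum_{i \neq 0} |\alpha_i \beta_i|}{d_x/\operatorname{vol} G}$$
$$\le \bar{\lambda}^t \max_{x, y} \frac{\|\psi_x T^{1/2}\| \, \|\psi_y T^{-1/2}\|}{d_x/\operatorname{vol} G} = \bar{\lambda}^t \frac{\operatorname{vol} G}{\min_{x, y} \sqrt{d_x d_y}} \le e^{-t(1 - \bar{\lambda})} \frac{\operatorname{vol} G}{\min_x d_x},$$
where $\bar{\lambda} = \max_{i \neq 0} |1 - \lambda_i|$.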
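As a sanity check on the chain of inequalities above, the sketch below computes the relative pointwise distance directly from powers of $P$ and compares it with $\bar{\lambda}^t \operatorname{vol} G / \min_x d_x$. It assumes NumPy; the example graph and variable names are hypothetical.

```python
import numpy as np

# Hypothetical example: triangle {0, 1, 2} with a pendant vertex 3 attached to 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d = A.sum(axis=1)
vol = d.sum()
P = A / d[:, None]
pi = d / vol

lam = np.linalg.eigvalsh(np.eye(4) - A / np.sqrt(np.outer(d, d)))
lam_bar = np.max(np.abs(1 - lam[1:]))   # max over i != 0 of |1 - lambda_i|

deltas, bounds = [], []
for t in range(1, 21):
    Pt = np.linalg.matrix_power(P, t)
    # Relative pointwise distance: max over y (rows) and x (columns).
    deltas.append(np.max(np.abs(Pt - pi[None, :]) / pi[None, :]))
    bounds.append(lam_bar ** t * vol / d.min())

print(deltas[-1], bounds[-1])
```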
$$t \ge \frac{1}{1 - \bar{\lambda}} \log \frac{\operatorname{vol} G}{\epsilon \min_x d_x},$$
vol G
vol G
exp2t1 /(2+1 )
.
minx dx
minx dx
vol G
1
log
minx dx
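$$\sum_y f(y) \frac{|P^s(y, x) - \pi(x)|}{\pi(x)} \le \sum_y f(y) \Delta(s) \le \Delta(s).$$
$$\Delta_{TV}(s) = \max_{A \subset V(G)} \max_{y \in V(G)} \Big|\sum_{x \in A} (P^s(y, x) - \pi(x))\Big| = \frac{1}{2} \max_{y \in V(G)} \sum_{x \in V(G)} |P^s(y, x) - \pi(x)|.$$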
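The total variation distance is bounded above by the relative pointwise distance,
since
$$\Delta_{TV}(s) = \max_{\substack{A \subset V(G) \\ \operatorname{vol} A \le \operatorname{vol} G/2}} \max_{y \in V(G)} \Big|\sum_{x \in A} (P^s(y, x) - \pi(x))\Big| \le \max_{\substack{A \subset V(G) \\ \operatorname{vol} A \le \operatorname{vol} G/2}} \sum_{x \in A} \pi(x) \Delta(s) \le \frac{1}{2} \Delta(s).$$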
Therefore, any convergence bound using relative pointwise distance implies the
same convergence bound using total variation distance. There is yet another notion
of distance, sometimes called $\chi$-squared distance, denoted by $\Delta'(s)$ and defined by:
$$\Delta'(s) = \max_{y \in V(G)} \Big(\sum_{x \in V(G)} \frac{(P^s(y, x) - \pi(x))^2}{\pi(x)}\Big)^{1/2} \ge \max_{y \in V(G)} \sum_{x \in V(G)} |P^s(y, x) - \pi(x)| = 2\Delta_{TV}(s),$$
using the Cauchy-Schwarz inequality. $\Delta'(s)$ is also dominated by the relative pointwise distance (which we will mainly use in this book).
$$\Delta'(s) = \max_{x \in V(G)} \Big(\sum_{y \in V(G)} \frac{(P^s(x, y) - \pi(y))^2}{\pi(y)}\Big)^{1/2} \le \max_{x \in V(G)} \Big(\sum_{y \in V(G)} (\Delta(s))^2 \pi(y)\Big)^{1/2} \le \Delta(s).$$
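The ordering $2\Delta_{TV}(s) \le \Delta'(s) \le \Delta(s)$ of the three notions of distance can be observed directly. The following sketch assumes NumPy; the example graph and the helper name `distances` are hypothetical.

```python
import numpy as np

# Hypothetical example: triangle {0, 1, 2} with a pendant vertex 3 attached to 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d = A.sum(axis=1)
P = A / d[:, None]
pi = d / d.sum()

def distances(s):
    """Return (total variation, chi-squared, relative pointwise) distances after s steps."""
    diff = np.linalg.matrix_power(P, s) - pi[None, :]   # rows indexed by the start vertex y
    tv = 0.5 * np.max(np.abs(diff).sum(axis=1))
    chi = np.max(np.sqrt((diff ** 2 / pi[None, :]).sum(axis=1)))
    rel = np.max(np.abs(diff) / pi[None, :])
    return tv, chi, rel

results = [distances(s) for s in range(1, 11)]
for s, (tv, chi, rel) in enumerate(results, start=1):
    print(s, tv, chi, rel)
```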
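We note that
$$\Delta'(s)^2 \ge \sum_x \pi(x) \sum_y \frac{(P^s(x, y) - \pi(y))^2}{\pi(y)} = \sum_x \psi_x T^{1/2} (P^s - I_0) T^{-1} (P^s - I_0)^* T^{1/2} \psi_x^* = \sum_x \psi_x \big((I - \mathcal{L})^{2s} - I_0\big) \psi_x^*,$$
where $I_0$ denotes the projection onto the eigenfunction $\phi_0$, $\phi_i$ denotes the $i$-th
orthonormal eigenfunction of $\mathcal{L}$ and $\psi_x$ denotes the characteristic function of $x$.
Since
$$\psi_x = \sum_i \phi_i(x) \phi_i,$$
we have
$$\Delta'(s)^2 \ge \sum_x \psi_x \big((I - \mathcal{L})^{2s} - I_0\big) \psi_x^* = \sum_x \Big(\sum_i \phi_i(x) \phi_i\Big) \big((I - \mathcal{L})^{2s} - I_0\big) \Big(\sum_i \phi_i(x) \phi_i\Big)^*$$
$$= \sum_x \sum_{i \neq 0} \phi_i^2(x) (1 - \lambda_i)^{2s} = \sum_{i \neq 0} \sum_x \phi_i^2(x) (1 - \lambda_i)^{2s} = \sum_{i \neq 0} (1 - \lambda_i)^{2s}. \tag{1.16}$$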
The above theorem is often derived from the Plancherel formula. Here we have
employed a direct proof. We remark that for some graphs which are not vertex-transitive, a somewhat weaker version of (1.17) can still be used with additional
work (see [81] and the remarks in Section 4.6). Here we will use Theorem 1.18 to
consider random walks on an n-cube.
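Example 1.19. For the $n$-cube $Q_n$, our (lazy) random walk (as defined in
(1.15)) converges to the uniform distribution under the total variation distance, as
estimated as follows:
From Example 1.6, the eigenvalues of $Q_n$ are $2k/n$ of multiplicity $\binom{n}{k}$ for $k = 0, \ldots, n$. The adjusted eigenvalues for the lazy walk are $\lambda'_k = 2\lambda_k/(\lambda_{n-1} + \lambda_1) = 2k/(n+1)$. We then have
$$\Delta_{TV}(s) \le \frac{1}{2}\Delta'(s) \le \frac{1}{2}\Big(\sum_{k=1}^{n} \binom{n}{k}\Big(1 - \frac{2k}{n+1}\Big)^{2s}\Big)^{1/2} \le \frac{1}{2}\Big(\sum_{k=1}^{n} e^{k \log n - \frac{4ks}{n+1}}\Big)^{1/2} \le e^{-c}$$
if $s \ge \frac{1}{4} n \log n + cn$.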
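We can also compute the rate of convergence of the lazy walk under the relative pointwise distance. Suppose we denote vertices of $Q_n$ by subsets of an $n$-set.
Clearly,
$$\psi_X = \sum_S \frac{(-1)^{|S \cap X|}}{2^{n/2}} \phi_S,$$
where $\phi_S(X) = (-1)^{|S \cap X|}/2^{n/2}$ are the orthonormal eigenfunctions.
Therefore,
$$\frac{|P^s(X, Y) - \pi(Y)|}{\pi(Y)} = |2^n \psi_X P^s \psi_Y^* - 1| \le |2^n \psi_X P^s \psi_X^* - 1| = \sum_{S \neq \emptyset} \Big(1 - \frac{2|S|}{n+1}\Big)^s = \sum_{k=1}^{n} \binom{n}{k} \Big(1 - \frac{2k}{n+1}\Big)^s.$$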
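This implies
$$\Delta(s) \le \sum_{k=1}^{n} \binom{n}{k} \Big(1 - \frac{2k}{n+1}\Big)^s \le \sum_{k=1}^{n} e^{k \log n - \frac{2ks}{n+1}} \le e^{-c}$$
if
$$s \ge \frac{n \log n}{2} + cn.$$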
So, the rate of convergence under relative pointwise distance is about twice
that under the total variation distance for Qn .
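The closed form for the lazy walk on $Q_n$ can be checked against a direct computation. In the sketch below (assuming NumPy) we take the modified walk of (1.15) on $Q_n$ to have a loop of weight 1 at each vertex, so that every transition probability, including staying put, is $1/(n+1)$; this follows from $c = 1/n$ and $d_v = n$, but the code, the values $n = 4$ and $s = 6$, and the variable names are illustrative assumptions rather than part of the original text.

```python
import numpy as np
from math import comb

n, s = 4, 6          # small cube and an even number of steps (hypothetical values)
N = 2 ** n

P = np.zeros((N, N))
for v in range(N):
    P[v, v] = 1 / (n + 1)                 # loop of weight c * d_v = 1
    for b in range(n):
        P[v, v ^ (1 << b)] = 1 / (n + 1)  # each of the n neighbors

Ps = np.linalg.matrix_power(P, s)
direct = N * Ps[0, 0] - 1                 # (P^s(X, X) - pi(X)) / pi(X) with pi = 2^{-n}
closed_form = sum(comb(n, k) * (1 - 2 * k / (n + 1)) ** s for k in range(1, n + 1))
delta = np.max(np.abs(N * Ps - 1))        # relative pointwise distance after s steps

print(direct, closed_form, delta)
```

For even $s$ the three printed values coincide up to rounding, since the maximum over pairs $(X, Y)$ is then attained on the diagonal.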
In general, $\Delta_{TV}(s)$, $\Delta'(s)$ and $\Delta(s)$ can be quite different [81]. Nevertheless, a
convergence lower bound for any of these notions of distance (and the $L_2$-norm) is
$\lambda^{-1}$. This we will leave as an exercise. We remark that Aldous [4] has shown that
if $\Delta_{TV}(s) \le \epsilon$, then $P^s(y, x) \ge c_\epsilon \pi(x)$ for all vertices $x$, where $c_\epsilon$ depends only on
$\epsilon$.
Notes
For an induced subgraph of a graph, we can define the Laplacian with boundary
conditions. We will leave the definitions for eigenvalues with Neumann boundary
conditions and Dirichlet boundary conditions for Chapter ??.
The Laplacian for a directed graph is also very interesting. The Laplacian for
a hypergraph has very rich structures. However, in this book we mainly focus on
the Laplacian of a graph since the theory on these generalizations and extensions
is still being developed.
In some cases, the factor $\log \frac{\operatorname{vol} G}{\min_x d_x}$ in the upper bound for $\Delta(t)$ can be further
reduced. Recently, P. Diaconis and L. Saloff-Coste [100] introduced a discrete version of the logarithmic Sobolev inequalities which can reduce this factor further for
certain graphs (for $\Delta'(t)$). In Chapter 12, we will discuss some advanced techniques
for further bounding the convergence rate under the relative pointwise distance.