A First Course in Network Theory
Dr Philip A. Knight
Lecturer in Mathematics, University of Strathclyde, UK
Great Clarendon Street, Oxford, OX2 6DP,
United Kingdom
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide. Oxford is a registered trade mark of
Oxford University Press in the UK and in certain other countries
© Ernesto Estrada and Philip A. Knight 2015
The moral rights of the authors have been asserted
First Edition published in 2015
Impression: 1
All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, without the
prior permission in writing of Oxford University Press, or as expressly permitted
by law, by licence or under terms agreed with the appropriate reprographics
rights organization. Enquiries concerning reproduction outside the scope of the
above should be sent to the Rights Department, Oxford University Press, at the
address above
You must not circulate this work in any other form
and you must impose this same condition on any acquirer
Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America
British Library Cataloguing in Publication Data
Data available
Library of Congress Control Number: 2014955860
ISBN 978–0–19–872645–6 (hbk.)
ISBN 978–0–19–872646–3 (pbk.)
Printed and bound by
CPI Group (UK) Ltd, Croydon, CR0 4YY
Preface
The origins of this book can be traced to lecture notes we prepared for a class
entitled Introduction to Network Theory offered by the Department of Mathemat-
ics and Statistics at the University of Strathclyde and attended by undergraduate
students in the Honours courses in the department. The course has since been
extended, based on experience gained in teaching the material to graduate students
and postdoctoral researchers from a variety of backgrounds around the world.
To mathematicians, physicists, and computer scientists at Emory University in
Atlanta. To postgraduate students in biological and environmental sciences on
courses sponsored by the Natural Environment Research Council in the UK. To
Masters students on intensive short courses at the African Institute of Mathemat-
ical Sciences in both South Africa and Ghana. And to mathematicians, computer
scientists, physicists, and more at an International Summer School on Complex
Networks in Bertinoro, Italy.
Designing courses with a common thread suitable for students with very dif-
ferent backgrounds represents a big challenge. For example, the balance between
theory and application will vary significantly between students of mathematics
and students of computer sciences. An even greater challenge is to ensure that
those students with an interest in network theory, but who lack the normal quan-
titative backgrounds expected on a mathematics course, do not become frustrated
by being overloaded by seemingly unnecessary theory. We believe in the interdis-
ciplinary nature of the study of complex networks. The aim of this book is to
approach our students in an interdisciplinary fashion and as a consequence we
try to avoid a heavy mathematical bias. We have avoided a didactic ‘Theorem–
Proof’ approach, but we do not believe we have sacrificed rigour: the book is
replete with examples and solved problems which will lead students through the
theory as constructively as possible.
This book is written with senior undergraduate students and new graduate
students in mind. The major prerequisite is elementary algebra at a level one
would expect in the first year of an undergraduate science degree. To make this
book accessible for students from non-quantitative subjects we explain most of
the basic concepts of linear algebra needed to understand the more specific topics
of network theory. Nor is this material wasted on students coming from
more quantitative subjects. As well as providing a reminder of familiar concepts,
we expect they will encounter a number of simple results which are not typically
presented in undergraduate linear algebra courses. We assume no prerequisites
in graph theory since we believe this book contains all the necessary basic
concepts in that area to allow progress in network theory.
Finally, we would like to thank the colleagues and students who have helped
us and inspired us to write this book. In particular, we would like to thank Mary
McAuley for her patience and skill in organizing our material and Eusebio Vargas
for lending us his talents to produce the high quality illustrations of networks
which you will find in the book.
Ernesto Estrada
Philip A. Knight
Contents
3 How To Prove It 31
3.1 Motivation 31
3.2 Draw pictures 32
3.3 Use induction 34
3.4 Try to find a counterexample 35
3.5 Proof by contradiction 36
3.6 Make connections between concepts 37
3.7 Other general advice 38
Further reading 38
9 Degree Distributions 95
9.1 Motivation 95
9.2 General degree distributions 95
9.3 Scale-free networks 97
Further reading 100
Index 251
1 Introduction to Network Theory

In this chapter
We start with a brief introduction outlining some of the areas where we find networks in the real world. While our list is far from exhaustive, it highlights why they are such a fundamental topic in contemporary applied mathematics. We then take a step back and give a historical perspective of the contribution of mathematicians in graph theory, to see the origins of some of the terms and ideas we will use. Finally, we give an example to demonstrate some of the typical problems a network analyst can be expected to find answers to.

1.1 Overview of networks 1
1.2 History of graphs 5
1.3 What you will learn from this book 9
Further reading 11
You may have noticed that these concepts are not completely disjoint and it is
certainly the case that we may want to interpret one network from many differ-
ent points of view. Some examples of these classes of networks are illustrated in
Figures 1.1–1.4.
The problem that the city’s residents tried to solve was the following: is it possible to walk through the city and cross each bridge once and only once? It is not clear how long this problem had been taxing the populace, but nobody had found a suitable path when Euler entered the picture. He described the problem and his solution in a paper published in 1736. He was not interested in the problem per se, but in the fact that there was no mathematical equipment for tackling the problem despite its geometric flavour (Euler described it as an example of ‘geometry of position’). He wanted to avoid exhaustively listing all the possible paths and he found a way of recasting the problem that stripped it of all irrelevant features (such as the distances between points).

3 Picture taken from Euler, L., Solutio problematis ad geometriam situs pertinentis, Commentarii academiae scientiarum Petropolitanae, 8:128–140, 1741.
Euler spotted that the key to solving the problem lay in the number of bridges
connected to each piece of land; in particular, whether the number is even or
odd. Consider the north bank of the river, labelled C, which is reached by three
bridges. Suppose a path starts on C. The first time we cross one of its bridges we
leave C, the second time we cross one of its bridges we re-enter C and the final
crossing takes us out again. On the other hand, if a path does not start on C, after
the three bridge crossings it must end on C. The south bank, B, can be reached
by three bridges, and we will only finish on B if that is where we started.
We can conclude that each parcel of land that has an odd number of bridges
must either be the start or finish of a valid route. But there are five bridges on A,
three on C, and three on D. Since it is impossible to have a route with three end
points, no valid route exists.
Euler’s insight could be extended to the more general problem involving any
number of bridges linking any number of pieces of land. He stated the solution
as follows.
If there are more than two areas to which an odd number of bridges lead, then such a
journey is impossible.
If, however, the number of bridges is odd for exactly two areas, then the journey
is possible if it starts in either of these areas.
If, finally, there are no areas to which an odd number of bridges leads, then the
required journey can be accomplished starting from any area.
With these rules, the problem can always be solved.
But Euler’s approach showed how network problems could be framed
in such a way that they could be analysed with mathematical rigour. Implicit in
his approach were the notions of a graph, vertices, edges, vertex degree, and paths: the
atoms of network theory.
1.2.3 Trees

Network theory would have been of little interest today if it was limited to recreational applications. During the nineteenth century, though, it became apparent that network analysis could inform many areas of mathematics and science. One of the first such applications was in calculus, for which Arthur Cayley showed [...]

4 There are known to be over 26 trillion different closed tours on an 8 × 8 board which start and finish at the same square.

[...] contemporaries, and it was not until the twentieth century that much of the pioneering work in network theory was revisited and recognized as an application of matrix algebra.
1.2.5 Chemistry
In this book, the terms ‘network’ and ‘graph’ are synonymous. The use of the
word ‘graph’ in this context can be dated precisely to February 1878 where it ap-
peared in a paper by James Sylvester entitled Chemistry and Algebra. By 1850, it
was well known that molecules were formed from atoms; for example, that ethanol
had the chemical formula C₂H₅OH. Sylvester was contributing to ensuing devel-
opments into understanding the possible arrangements of the atoms in molecules.
The modern notation for representing molecules was essentially introduced by the
Scottish chemist Alexander Crum Brown in 1864. He highlighted the concept of
valency, an indication of how many bonds each atom in the molecule must be part of—directly related to the vertex degree.

A representation of ethanol is given in Figure 1.8. The valency of each of the atoms is clear and we can easily represent double and triple bonds by drawing multiple edges between two atoms. Notice the underlying network in this picture.

Figure 1.8 Ethanol

These diagrams made it plain that for some chemical formulae several different arrangements of atoms are possible: for example, propanol can be configured in two distinct ways as shown in Figure 1.9. The graphical representation made it clear that such isomers were an important topic in molecular chemistry. Mathematicians made a significant contribution to this embryonic field. For instance, Cayley was able to exploit his earlier work on trees in enumerating the number of isomers of alkanes.5 Each isomer has a carbon skeleton which is augmented with hydrogen atoms. Cayley realized that the number of isomers was equal to the number of different trees with n vertices (so long as none of these vertices had a degree of more than four). Sylvester built on this work by showing that even more abstract algebraic ideas could be related to the graphical structure of molecules.

Figure 1.9 Two forms of propanol

The different motives of chemists and mathematicians for understanding the structures that were being uncovered meant that the disciplines diverged soon afterwards; but more recently the ties have become closer again and network theory can be used to design novel molecules with particular properties.
growth; and potential strategies for improving its efficiency. Suppose for instance
that you are analysing the network represented in Figure 1.10, which describes
the social communication among a group of 36 individuals in a sawmill.
You are informed that the employees were asked to indicate the frequency with
which they discussed work matters with each of their colleagues. Two nodes are
connected if the corresponding individuals frequently discuss work matters. After
studying this book you will be able to make the following conclusions.
With this analysis in hand you will be able to understand how this small firm
is organized; where the potential bottlenecks in the communication among the
employees occur; how to improve structural organization towards an improved
efficiency and functionality; and also how to develop a model that will allow you
to simulate your proposed changes before they are implemented by the firm.
Enjoy the journey through network-land!
..................................................................................................
FURTHER READING
Barabási, A.-L., Linked: The New Science of Networks, Perseus Books, 2003.
Biggs, N.L., Lloyd, E.K., and Wilson, R.J., Graph Theory 1736–1936, Clarendon
Press, 1976.
Caldarelli, G. and Catanzaro, M., Networks: A Very Short Introduction, Oxford
University Press, 2012.
Estrada, E., The Structure of Complex Networks. Theory and Applications, Oxford
University Press, 2011.
2 General Concepts in Network Theory
Example 2.1
We introduce two very simple networks which we will use to illustrate some of the concepts in this chapter. They can
be represented diagrammatically as in Figure 2.1.
A natural vertex set to use for both networks is V = {1, 2, 3, 4}. Applying these vertex labels to the nodes from left
to right, the edge set of Gl is then
El = {(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2), (3, 4), (4, 3)},
and that of Gr is
Er = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (2, 3), (2, 3), (3, 1), (3, 2), (3, 2), (3, 2), (3, 4), (4, 3)}.
Figure 2.1 (a) Gl; (b) Gr
Neither the labelling nor the diagrammatic representation of a network is unique. For example, Gl can also be expressed
as (V , Em ) where
Em = {(1, 2), (1, 3), (1, 4), (2, 1), (2, 4), (3, 1), (4, 1), (4, 2)}.
Or as in Figure 2.2.
Definition 2.2
Examples 2.2
(i) Gl = (V , El ) is a subgraph of Gr = (V , Er ).
(ii) (V, El) and (V, Em) are isomorphic under the mapping f : V → V where f (1) = 3, f (2) = 2, f (3) = 4, f (4) = 1.
(iii) Ĝl = (V, {(1, 4), (4, 1), (2, 4), (4, 2)}), illustrated in Figure 2.3.
(iv) Letting F = {(1, 1), (2, 3), (2, 3), (3, 2), (3, 2)} gives (V, El) = (V, Er) – (V, F).
(v) A network with n nodes but no edges is called a null graph. Every node is isolated. We write Nn to denote this
network. It is not particularly exciting, but it arises in many theorems.
(vi) A network is regular if all nodes have the same degree. If this common degree is k, the network is called k-regular
or regular of degree k. The null graph is 0-regular and a single edge connecting two nodes is 1-regular. Kn is
(n – 1)-regular. For 0 < k < n – 1 there are several different k-regular networks with n nodes.
(vii) In a cycle graph with n nodes, Cn , the nodes can be ordered so that each is connected to its immediate neighbours.
It is 2-regular. If we remove an edge from Cn we get the path graph, Pn–1 . If we add a node to Cn and connect it
to every other node we get the wheel graph, Wn+1 . Examples are given in Figure 2.5.
Example 2.3
(ii) If G is a simple network with adjacency matrix A then its complement, Ḡ, has adjacency matrix E – I – A, where
E is a matrix of ones.
(iii) Starting with the cycle graph Cn we can add edges so that each node is linked to its k nearest neighbours clockwise
and anticlockwise. The resulting network is called a circulant network and its adjacency matrix is an example of
a circulant matrix. For example, if n = 7 and k = 2 the adjacency matrix is
\[ A = \begin{bmatrix} 0&1&1&0&0&1&1\\ 1&0&1&1&0&0&1\\ 1&1&0&1&1&0&0\\ 0&1&1&0&1&1&0\\ 0&0&1&1&0&1&1\\ 1&0&0&1&1&0&1\\ 1&1&0&0&1&1&0 \end{bmatrix}. \]
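The construction is easy to automate. Here is a minimal Python/NumPy sketch (the language is our choice; the text does not prescribe one) that builds the adjacency matrix of the circulant network for any n and k. With n = 7 and k = 2 it reproduces the matrix above.

```python
import numpy as np

def circulant_adjacency(n, k):
    """Adjacency matrix of the circulant network: node i is joined to
    its k nearest neighbours clockwise and anticlockwise."""
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for off in range(1, k + 1):
            A[i, (i + off) % n] = 1
            A[i, (i - off) % n] = 1
    return A

print(circulant_adjacency(7, 2))   # reproduces the 7-node, k = 2 matrix above
```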
If the edge (u, v) is in a simple network then so is (v, u). We will only include
one of each of these pairs (it doesn’t matter which) in the incidence matrix. In this
case, you may find in some references that the incidence matrix is defined so that
all the nonzero entries are set to one and our definition of the incidence matrix is
known as the oriented incidence matrix. There are many different conventions
for including loops in incidence matrices. Since we are primarily concerned with
simple networks it doesn’t really matter which convention we use.
We will look more at the connections between the adjacency and incidence
matrices when we look at the spectra of networks.
Example 2.5
\[ A^2 = \begin{bmatrix} 2&1&1&1\\ 1&2&1&1\\ 1&1&3&0\\ 1&1&0&1 \end{bmatrix}, \quad A^3 = \begin{bmatrix} 2&3&4&1\\ 3&2&4&1\\ 4&4&2&3\\ 1&1&3&0 \end{bmatrix}, \quad A^4 = \begin{bmatrix} 7&6&6&4\\ 6&7&6&4\\ 6&6&11&2\\ 4&4&2&3 \end{bmatrix}. \]
Note that the diagonal of A² gives the degree of each node. Can you
explain why this is true for all simple graphs? Can you find all four walks
of length three from node 1 to 3? How about all seven closed walks of
length four from node 1?
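The network behind this example is not identified on this page, but the powers printed above are exactly those of Gl from Example 2.1. Assuming that, a few lines of NumPy confirm the counts.

```python
import numpy as np

# Adjacency matrix of G_l from Example 2.1: edges (1,2), (1,3), (2,3), (3,4).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])

A2, A3, A4 = A @ A, A @ A @ A, np.linalg.matrix_power(A, 4)
print(np.diag(A2))      # [2 2 3 1] -- the node degrees
print(A3[0, 2])         # 4 walks of length three from node 1 to node 3
print(A4[0, 0])         # 7 closed walks of length four from node 1
```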
Note that the shortest walk in a network between distinct nodes is also the
shortest path. This can be formally established by an inductive proof.
A walk of length one must also be a path. For a longer walk between nodes u
and v simply consider the walk without the last edge (that connects w and v, say).
This must be the shortest such walk from u to w (why?) and therefore it is a path.
This path does not contain the node v (why?) and the edge (w, v) is not a part of
this path, so the walk from u to v must also be a path.
Note that the shortest closed walk is not necessarily the shortest cycle. In a
simple network one can move along any edge back and forth to find a closed
walk of length two from any node with positive degree. But the shortest cycle in
any simple network is at least three. There are general techniques for finding the
shortest cycle in any network, but we will not discuss them here.
We can find the length of the shortest walk between any two nodes by looking at
successive powers of A: it is the smallest value of p for which the (i, j)th element
of A^p is nonzero.
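This successive-powers idea translates directly into code. The sketch below is a straightforward, not especially efficient, implementation; the function name and test network are our own choices.

```python
import numpy as np

def shortest_walk_lengths(A, max_len=None):
    """d[i, j] = smallest p with (A^p)[i, j] > 0; -1 if never reached."""
    n = A.shape[0]
    d = np.where(A > 0, 1, -1)      # length-one walks are just edges
    np.fill_diagonal(d, 0)
    P = A.copy()
    for p in range(2, (max_len or n - 1) + 1):
        P = P @ A                    # P is now A^p
        d[(d == -1) & (P > 0)] = p   # first power with a nonzero entry
    return d

# G_l from Example 2.1: distances from node 1 are 0, 1, 1, 2.
A = np.array([[0,1,1,0],[1,0,1,0],[1,1,0,1],[0,0,1,0]])
print(shortest_walk_lengths(A)[0])   # [0 1 1 2]
```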
Examples 2.6

(i) Consider C5 (Figure 2.6). Can you write down A² and A³ just by looking at the network?

\[ A = \begin{bmatrix} 0&1&0&0&1\\ 1&0&1&0&0\\ 0&1&0&1&0\\ 0&0&1&0&1\\ 1&0&0&1&0 \end{bmatrix} \]

(ii) Using a natural ordering of the nodes, we can write the adjacency matrix, A, and A², and A³ as

\[ A = \begin{bmatrix} 0&1&0&0&0&0&0&0\\ 1&0&1&0&0&1&0&0\\ 0&1&0&1&0&0&1&0\\ 0&0&1&0&0&0&0&0\\ 0&0&0&0&0&1&0&0\\ 0&1&0&0&1&0&1&0\\ 0&0&1&0&0&1&0&1\\ 0&0&0&0&0&0&1&0 \end{bmatrix}, \quad A^2 = \begin{bmatrix} 1&0&1&0&0&1&0&0\\ 0&3&0&1&1&0&2&0\\ 1&0&3&0&0&2&0&1\\ 0&1&0&1&0&0&1&0\\ 0&1&0&0&1&0&1&0\\ 1&0&2&0&0&3&0&1\\ 0&2&0&1&1&0&3&0\\ 0&0&1&0&0&1&0&1 \end{bmatrix}, \quad A^3 = \begin{bmatrix} 0&3&0&1&1&0&2&0\\ 3&0&6&0&0&6&0&2\\ 0&6&0&3&2&0&6&0\\ 1&0&3&0&0&2&0&1\\ 1&0&2&0&0&3&0&1\\ 0&6&0&2&3&0&6&0\\ 2&0&6&0&0&6&0&3\\ 0&2&0&1&1&0&3&0 \end{bmatrix}. \]
Comparing these matrices, we see that the only entries that are always zero are (1, 8), (4, 5), and their symmetric
counterparts. These are the only ones which cannot be connected with paths of length three or fewer. We also see, for
example, that the minimum path lengths from node two to the other nodes are 1, 1, 2, 2, 1, 2, and 3, respectively.
All of this information is readily gleaned from the diagrammatic form of the network, but the adjacency matrix
encodes all this information in a way that can be manipulated algebraically.
Notice that every entry of the diagonal of A² is nonzero. None of these entries represents a circuit: they are ‘out and
back again’ walks along edges. Since the diagonal of A³ is zero, there are no triangles in the network: the only circuits
are permutations of 2 → 3 → 7 → 6 → 2.
Problem 2.1

Show that a network with adjacency matrix A is disconnected if and only if there is a permutation matrix P such that

\[ A = P \begin{bmatrix} X & Y \\ O & Z \end{bmatrix} P^T. \]

First we establish sufficiency. If A has this form then, since X and Z are square,

\[ A^2 = P \begin{bmatrix} X^2 & XY + YZ \\ O & Z^2 \end{bmatrix} P^T. \]
That is, the zero block remains and a simple induction confirms its presence in all
powers of A. This means that there is no path of any length between nodes cor-
responding to the rows of the zero block and nodes corresponding to its columns,
so by definition the network is not connected.
For necessity, we consider a disconnected network. We identify two nodes for
which there is no path from the first to the second and label them n and 1. If there
are any other nodes which cannot be reached from n then label these as nodes
2, 3, . . . , r. The remaining nodes, r + 1, . . . , n – 1 are all accessible from n. There
can be no path from a node in this second set to the first. For if such a path existed
(say between i and j) then there would be a path from n to j via i.
If there is no path between two nodes then they are certainly not adjacent.
Hence aij = 0 if r < i ≤ n and 1 ≤ j ≤ r, thus
\[ A = \begin{bmatrix} X & Y \\ O & Z \end{bmatrix}. \]
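As an illustration of this block structure (not part of the original problem), the sketch below uses breadth-first search to find the nodes unreachable from node 0 and, if the network is disconnected, reorders the adjacency matrix so that the zero block appears. The function and the test network are our own.

```python
import numpy as np
from collections import deque

def disconnection_permutation(A):
    """Return an ordering exposing the block-triangular form above
    (unreachable nodes first), or None if the network is connected."""
    n = A.shape[0]
    seen, queue = {0}, deque([0])
    while queue:                       # breadth-first search from node 0
        u = queue.popleft()
        for v in np.nonzero(A[u])[0]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    if len(seen) == n:
        return None
    return np.array([v for v in range(n) if v not in seen] + sorted(seen))

# Two disjoint edges: nodes {0,1} and {2,3}.
A = np.array([[0,1,0,0],[1,0,0,0],[0,0,0,1],[0,0,1,0]])
perm = disconnection_permutation(A)
print(perm)                         # [2 3 0 1]
print(A[np.ix_(perm, perm)])        # reordered matrix has zero off-diagonal blocks
```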
Problem 2.2

Show that if a simple network, G, has n nodes, m edges, and k components then

\[ n - k \le m \le \tfrac{1}{2}(n - k)(n - k + 1). \]
The lower bound can be established by induction on m. The result is trivial if
G = Nn , the null graph. Suppose G has m edges. If one removes a single edge
then the new network has n nodes, m – 1 edges, and K components where K = k
or K = k + 1. By the inductive hypothesis, n – K ≤ m – 1 and so n – k ≤ m.
For the upper bound, we note that if a network with n nodes and k components
has the greatest possible number of edges then every one of its components is a
complete graph. We leave it to the reader to show that we attain the maximum
edge number if k – 1 of the components are isolated vertices. The number of
edges in K_{n–k+1} is (n – k)(n – k + 1)/2.
Note that we can conclude that any simple network with n nodes and at least
(n – 1)(n – 2)/2 + 1 edges is connected.
Examples 2.7
(i) Of all networks with n nodes, the complete graph, Kn , has most edges. There are n – 1 edges emerging out of
each of the n nodes. Each of these edges is shared by two nodes. Thus the total number of edges is n(n – 1)/2.
Kn has a single component.
(ii) Of all networks with n nodes, the null graph, Nn , has most components, namely n. Nn has no edges.
(iii) Consider the network Gl ∪ Ga (where Gl and Ga were defined in Examples 2.1 and 2.6, respectively). It has
n = 12 nodes, m = 12 edges, and k = 2 components. Clearly n – k < m and m < (n – k)(n – k + 1)/2.
In Ĝl (the network of Examples 2.2), n = 4, k = 2 and m = 2, giving n – k = m and m < (n – k)(n – k + 1)/2.
In Gl , n = 4 and m = 4. Since m > (n – 1)(n – 2)/2 we know it must be connected without any additional
information.
One can also view connectivity from the perspective of vertices. A set of nodes
in a connected network is called a separating set if their removal (and the re-
moval of incident edges) disconnects the graph. If the smallest such set has size
k then the network is called k-connected and its connectivity, denoted κ(G), is k.
If κ(G) = 1 then a node whose removal disconnects the network is known as a
cut-vertex.
Example 2.9
(i) The left-hand network in Figure 2.9 is 1-connected. For the middle, κ =
2. k-connectivity does not really make sense in the right-hand network,
K4 . We cannot isolate any nodes without removing the whole of the rest
of the network, but by convention we let κ = n – 1 for Kn .
Notice that for each network the minimal separating set must contain a
node of maximal degree.
(ii) Figure 2.10 shows a network representing relationships between power-
ful families in fifteenth-century Florence. Notice that the Medici family
are a cut-vertex but no other family can cut out anything other than end
vertices.
Figure 2.10 Socio-economic ties between fifteenth-century Florentine families
(iii) In Figure 2.11, κ(G) = 2. Can you identify all the separating sets of size
two?
Notice that in Figure 2.11 the network has a number of distinct parts which
are highly connected: there are a number of subgraphs that are completely con-
nected. This is a sort of structure that arises in many practical applications and
is worth naming. Any subgraph of a simple network that is completely connected
is called a clique. The biggest such clique is the maximal clique. As with many of
the concepts we have seen, we can guarantee the existence of cliques of a certain
size in networks with sufficient edges. It was established over a century ago that
a simple graph with n nodes and more than n²/4 edges must contain a triangle.
This result was extended in 1941 by the Hungarian mathematician Turán who
showed that if a simple network with n nodes has m edges then there will be a
clique of at least size k if

\[ m > \frac{n^2(k-2)}{2(k-1)}. \]
Examples 2.10
(i) To prove his result, Turán devised a way of constructing the network with as many edges as possible with n nodes
and maximal clique size k.
To do this, one divides the nodes into k subsets with sizes as equal as possible. Nodes are connected by an edge
if and only if they belong to different subsets. We use Tn,k to denote this network. T5,3 is illustrated in Figure 2.12
for which m = 8.
Can you find the cliques of size four that are created by adding an extra edge?
(ii) Tn,2 is the densest graph without any triangles. T6,2 is illustrated in Figure 2.13 along with the adjacency matrix,
A. T6,2 has a special structure which we will discuss in more detail.
\[ A = \begin{bmatrix} 0&0&0&1&1&1\\ 0&0&0&1&1&1\\ 0&0&0&1&1&1\\ 1&1&1&0&0&0\\ 1&1&1&0&0&0\\ 1&1&1&0&0&0 \end{bmatrix} \]
Figure 2.13 T6,2
Triangles can be identified by looking at the diagonal of the cube of the adjacency matrix. It is easy to show that
A³ = 9A. The zero diagonal of A³ confirms the absence of triangles.
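As a check, the following sketch builds Turán networks by the construction in (i), with our own indexing choice for the near-equal parts, and verifies the claims made above for T5,3 and T6,2.

```python
import numpy as np
from itertools import combinations

def turan_adjacency(n, k):
    """Turán network T_{n,k}: split the nodes into k near-equal parts
    and join every pair of nodes in different parts."""
    part = [i % k for i in range(n)]
    A = np.zeros((n, n), dtype=int)
    for i, j in combinations(range(n), 2):
        if part[i] != part[j]:
            A[i, j] = A[j, i] = 1
    return A

print(turan_adjacency(5, 3).sum() // 2)      # 8 edges, as in Figure 2.12
A = turan_adjacency(6, 2)                    # T_{6,2}
A3 = np.linalg.matrix_power(A, 3)
print(np.array_equal(A3, 9 * A))             # True: A^3 = 9A
print(np.diag(A3))                           # all zeros: no triangles
```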
2.5.1 Trees
The word ‘tree’ evokes a similar picture for most people, and we can use it to
describe a particular structure in a network that frequently arises in practice. We
encountered trees in our history lesson.
There are lots and lots of trees! There are n^{n–2} distinct labelled trees with
n nodes. For n = 1, 2, 3, 4, 5, 6 this gives 1, 1, 3, 16, 125, 1296 trees before truly
explosive growth sets in. Counting unlabelled trees is much harder, and there is
no known formula in terms of the number of nodes, but their abundance appears
to grow exponentially in n.
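Cayley’s count can be verified by brute force for small n. The sketch below, our own and only feasible for small n, enumerates every set of n – 1 edges and keeps the connected ones.

```python
import numpy as np
from itertools import combinations

def count_labelled_trees(n):
    """A labelled graph on n nodes is a tree iff it has n-1 edges
    and is connected."""
    pairs = list(combinations(range(n), 2))
    count = 0
    for edges in combinations(pairs, n - 1):
        A = np.zeros((n, n), dtype=int)
        for u, v in edges:
            A[u, v] = A[v, u] = 1
        # connected iff (A + I)^{n-1} has no zero entries
        R = np.linalg.matrix_power(A + np.eye(n, dtype=int), n - 1)
        if (R > 0).all():
            count += 1
    return count

print([count_labelled_trees(n) for n in range(2, 6)])   # [1, 3, 16, 125]
```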
Examples 2.11
(ii) There are only two different unlabelled trees with four nodes, as illustrated in Figure 2.15. The left-hand tree
can be labelled in 4! = 24 ways, but only 12 of these are distinct since one half are just the reverses of the other. Once we
label the pivotal node of the right-hand tree (for which we have four choices) all labellings are equivalent.
Examples 2.12
(i) In Figure 2.16 we illustrate a network and two of the possible spanning trees.
(ii) The notion of a spanning tree can be expanded to disconnected networks. If we form a spanning tree for each
component and take their union, the result is a spanning forest.
(iii) The number of edges in a spanning tree/forest of G is called its cut-set rank, denoted by ξ (G). If a network has k
components ξ (G) = n – k. The number of edges removed from G to form the forest, m – n + k, is known as the
cycle rank and is denoted by γ (G).
For example, in (i) above, γ (G) = 3 and ξ (G) = 6.
There are many networks in real applications that are exactly or nearly bi-
partite. In Chapter 18 we will look at how to measure how close to bipartite a
network is in order to infer other properties. For now, we briefly discuss some of
the properties an exactly bipartite network possesses.
Examples 2.13
(i) The Turán network, Tn,2 , is bipartite. Recall that T6,2 has the adjacency
matrix
\[ A = \begin{bmatrix} 0&0&0&1&1&1\\ 0&0&0&1&1&1\\ 0&0&0&1&1&1\\ 1&1&1&0&0&0\\ 1&1&1&0&0&0\\ 1&1&1&0&0&0 \end{bmatrix}. \]
The odd powers of A have a zero diagonal, so there are no closed walks of odd length; hence every cycle in a bipartite
network has even length.
Bipartivity can be generalized to k-partivity. A network is k-partite if its nodes
can be partitioned into k sets V1 , V2 , . . . , Vk such that if u, v ∈ Vi then there is no
edge between them.
Examples 2.14
(i) Trees are bipartite. To show this, pick a node on a tree and colour it black. Then colour all its neighbours white.
Colour the nodes adjacent to the white nodes black and repeat until the whole tree is coloured. This could only
break down if we encounter a previously coloured node. If this were the case, we would have found a cycle in
the network, which is impossible in a tree. The nodes can then be divided into black and white sets. We show some appropriate colourings of
trees in Figure 2.18.
(ii) The maximal clique in a bipartite network has size 2, since Kn has odd cycles for n > 2.
(iii) The n node star graph, S1,n–1 has a single central node connected to all other n – 1 nodes and no other edges. S1,5
is illustrated in Figure 2.19.
..................................................................................................
FURTHER READING
Aldous, J.M. and Wilson, R.J., Graphs and Applications: An Introductory Approach,
Springer, 2003.
Bapat, R.B., Graphs and Matrices, Springer, 2011.
Chartrand, G. and Zhang, P., A First Course in Graph Theory, Dover, 2012.
Wilson, R.J., Introduction to Graph Theory, Prentice Hall, 2010.
3 How To Prove It

In this chapter
We motivate the necessity for rigorous proofs of results in network theory. Then we give some advice on how to prove results by using techniques such as induction and proof by contradiction. At the same time we encourage the student to use drawings and counterexamples and to build connections between different concepts to prove a result.

3.1 Motivation
You may have noticed that in this book we have not employed the ‘Theorem–
Proof ’ structure familiar to many textbooks in mathematics, and when you read
the title of this chapter maybe you thought, “Why should I care about proving
things rigorously?”. Let us start by considering a practical problem. Suppose you
are interested in constructing certain networks displaying the maximal possible
heterogeneity in their degrees. You figure out that a network in which every node
has a different degree will do it and you try by trial-and-error to construct such
a network. However, every time you attempt to draw such a network you end up
stymied by the fact that there is always at least one pair of nodes which share the
same degree. You try as hard as possible and may have even used a computer to
help you in generating such networks, but you have always failed. You then make
the following conjecture.
In any simple network with n ≥ 2 nodes, there should be at least two nodes which have
exactly the same degree.
But, are you sure about this? Is it not possible that you are missing something
and such a dreamed network can be constructed? The only way to be sure about
this statement is by means of a rigorous proof: a deductive argument that the statement is true. Indeed it has been said that:

Proofs are to the mathematician what experimental procedures are to the experimental scientist: in studying them one learns of new ideas, new concepts, new strategic devices which can be assimilated for one’s own research and be further developed.¹

1 Rav, Y., Why do we prove theorems? Philosophia Mathematica 15 (1999) 291–320.
There are many ways to establish the veracity of your conjecture. Here is a
very concise argument. First observe that in your network of n nodes, no node
can have degree bigger than n – 1. So for a completely heterogeneous set of de-
grees you may assume that the degrees of the n nodes of your imagined network
are 0, 1, 2, . . . , n – 2, n – 1. The node with degree n – 1 must be connected to all the
other nodes in the network. However, there is a node with degree 0, which contra-
dicts the previous statement. Consequently, you have proved by contradiction that
the statement is true. It has become a theorem and you can now be absolutely
convinced that such a dreamed network cannot exist.
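Before we had the proof, a computer search would have been a natural sanity check. The following brute-force sketch, our own and only feasible for small n, confirms that no simple network on up to five nodes has all degrees distinct.

```python
from itertools import combinations

def has_all_distinct_degrees(n):
    """Search every simple network on n nodes for pairwise distinct degrees."""
    pairs = list(combinations(range(n), 2))
    for r in range(len(pairs) + 1):
        for edges in combinations(pairs, r):
            deg = [0] * n
            for u, v in edges:
                deg[u] += 1
                deg[v] += 1
            if len(set(deg)) == n:
                return True
    return False

print([has_all_distinct_degrees(n) for n in range(2, 6)])
# [False, False, False, False] -- as the theorem guarantees
```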
When you read these statements and their proofs in any textbook they usually
look so beautiful, short, and insightful that your first impression is: “I will never
be able to construct something like that”. However, such a statement of a theorem
is usually the result of a long process in which hands have possibly got dirty along
the way; some not so beautiful, short, and insightful sketches of the proof were
advanced and then distilled until the final proof was produced. You, too, should
be able to produce such beautiful and condensed results if you train yourself and
know a few general rules and tricks. You can even create an algorithmic scheme
to generate such proofs. Something of the following sort.
1. Read carefully the statement and determine what the problem is asking you
to do.
2. Determine which information is provided and which assumptions are
made.
3. Try getting your hands dirty with some calculations to see yourself how
the problem looks in practice.
4. Plan a strategy for attacking the problem and select some of the many
techniques available to prove a theorem.
5. Sketch the proof.
6. Check that your solution is the one asked for by the problem as stated.
7. Simplify the proof as much as possible by eliminating all the superfluous
statements, assumptions, and calculations.
Some of these techniques for proving results in network theory are provided
in this chapter as a guide to students for solving their own problems. Have a look
through Chapter 2 and you will see that we have used some of these techniques in
the examples and problems. Hopefully, with practice, you can use them to solve
more general problems that you find during your independent work.
3.2 Draw pictures

Although network theory is very pictorial (and this book is filled with figures of networks), it is still
as analytic and rigorous as any branch of mathematics. There are many situations
in which it is not obvious as to how one starts solving a particular problem. A
drawing or a sketch can help trigger an idea that eventually leads to the solution.
Let us illustrate this situation with a particular example.
Example 3.1
Show that if there is a walk from the node v1 to v2, such that v1 ≠ v2, then there is a path from v1 to v2.
2. Notice that to establish that a path between nodes 1 and 6 exists we simply have to avoid visiting the same vertex twice.
For instance, avoiding visiting vertex 5 twice, we obtain the path shown in Figure 3.2(a), and if we avoid visiting
node 2 twice we obtain the path illustrated in Figure 3.2(b).
Figure 3.2 Two walks in a network that avoid visiting a node twice
Example 3.2
Prove that a connected network with n nodes is a tree if and only if it has exactly m = n – 1 edges.
1. First, gain some insights by drawing some pictures as we have recommended before.
2. For the if (⇒ ) part of the theorem we proceed as follows.
(a) Start by assuming that the network is a tree with n nodes.
(b) We can easily verify that the result is true for n = 1, because this corresponds to a network with one node
and zero edges, which is also a tree.
(c) Suppose now that the result is true for any k < n.
(d) Select a tree with n nodes and remove one edge. Because it is a tree, the result is a network with two disjoint
connected components, each of which is a tree with n1 and n2 nodes, respectively.
(e) Because n1 < n and n2 < n, by the induction hypothesis the result is true for these two trees. Hence they
have n1 – 1 and n2 – 1 edges, respectively.
(f) As n = n1 + n2 we can verify that, returning the edge that was removed, the total number of edges in the
network is m = (n1 – 1) + (n2 – 1) + 1 = n1 + n2 – 1 = n – 1, which proves the (⇒ ) part of the theorem.
Notice that we only used induction in the ‘if ’ part of the proof. We are free to combine as many individual techniques
as we like in creating proofs.
Example 3.3
You can start by drawing a bipartite network (following some previous advice). Now try to add a triangle to the
network. You will immediately realize that to add a triangle you necessarily need to connect two nodes which are in the
same disjoint set of the bipartite network. Thus, the graph to which you have added the triangle is no longer bipartite.
The same happens if you try a pentagon or a heptagon. Thus, a key ingredient in your proof should be the fact that
the existence of odd cycles necessarily implies connections between nodes in the same set of the bipartition, which
necessarily means destroying the bipartivity of the graph. We will see how to use this fact to prove this result using a
powerful technique in Section 3.5.
Example 3.4
For instance, assume that G is a bipartite network and that it contains an odd cycle (which will lead to a contradiction).
We can then proceed as follows.
To prove necessity (⇒)
1. Let V1 and V2 be the two disjoint sets of nodes in the bipartite network.
2. Let l = 2k + 1 be the length of the odd cycle in G, with nodes in order v1, v2, ..., v2k+1, v1.
3. Moving around the cycle, consecutive nodes must lie in alternate sets, so v1, v3, ..., v2k+1 all lie in the same set. But v2k+1 and v1 are adjacent, contradicting the fact that no edge joins two nodes of the same set.

To prove sufficiency (⇐)
1. Suppose that the network G has no cycle of odd length and assume that the network is connected. This second
assumption is not part of the statement of the theorem but splitting the network and the proof into individual
components lets us focus on the important details.
2. Select an arbitrary node vi .
3. Partition the nodes into two sets V1 and V2 so that any node at even distance from vi (including vi itself) is
placed in V1 and any node at odd distance from vi is placed in V2 . In particular, there is no node connected to
vi in V1 .
4. Now suppose that G is not bipartite. Then there exists at least one pair of adjacent nodes that lie in the same
partition Vi. Label these nodes vp and vq.
5. Suppose that the distance between vi and vp is k and that between vi and vq is l. Now construct a closed walk
that moves from vi to vp along a shortest path; then from vp to vq along their common edge; and finally back
from vq to vi along a shortest path. This walk is closed and has length k + l + 1, which must be odd since k and l
have the same parity (vp and vq lie in the same set).
6. Because every closed walk of odd length contains an odd cycle we conclude that the network has an odd cycle,
which contradicts our initial assumption and so the network must be bipartite.
7. If G has more than one component then we can complete the proof by constructing sets such as V1 and V2 for
each individual component and then combine them to form disjoint sets for the whole network. This final bit
of housekeeping does not require any additional contradictions to be established.
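The sufficiency half of this proof is constructive, and it is worth seeing it as an algorithm. The sketch below, written by us for illustration, colours nodes by the parity of their distance from a root, exactly as in steps 2 and 3, and reports an odd cycle when two adjacent nodes receive the same colour; the data format (a dict of neighbour sets) is our choice.

```python
from collections import deque

def bipartition(adj):
    """Even distance from a root -> V1 (colour 0), odd -> V2 (colour 1).
    Returns None if two neighbours get the same colour (odd cycle)."""
    colour = {}
    for root in adj:                       # one root per component (step 7)
        if root in colour:
            continue
        colour[root] = 0
        queue = deque([root])
        while queue:                       # BFS gives shortest distances
            u = queue.popleft()
            for v in adj[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]
                    queue.append(v)
                elif colour[v] == colour[u]:
                    return None            # adjacent nodes in the same set
    return colour

# C4 is bipartite; C3 (a triangle) is not.
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
c3 = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
print(bipartition(c4), bipartition(c3))   # a valid colouring, then None
```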
Example 3.5
Let us prove here only the statement that if the graph is regular then the previous equality holds, and we leave the only if part as an exercise. We start by noticing that

\[ \sum_{j=1}^{n} \lambda_j^2 = \mathrm{tr}(A^2). \]

We also know that the diagonal entries of A² are equal to the number of closed walks of length two starting (and ending) at the corresponding node, which is simply the degree of the node (kᵢ). That is,

\[ \mathrm{tr}(A^2) = \sum_{i=1}^{n} k_i. \]

Now, we make a connection with the Handshaking Lemma, which states that \( \sum_{i=1}^{n} k_i = 2m \) for any graph. Thus,

\[ \sum_{j=1}^{n} \lambda_j^2 = \mathrm{tr}(A^2) = 2m. \]

We can write the average degree as \( \bar{k} = \frac{1}{n}\sum_{i=1}^{n} k_i \) and then we have

\[ \bar{k} = \frac{1}{n} \sum_{j=1}^{n} \lambda_j^2. \]
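A quick numerical sanity check of the identity, using NumPy’s symmetric eigensolver; the test graph is an arbitrary one of our own.

```python
import numpy as np

# C5 plus a chord; any simple graph will do.
A = np.array([[0,1,0,0,1],
              [1,0,1,0,1],
              [0,1,0,1,0],
              [0,0,1,0,1],
              [1,1,0,1,0]])
m = A.sum() // 2
lam = np.linalg.eigvalsh(A)
print(np.allclose((lam**2).sum(), 2*m))   # True: sum of squared eigenvalues = 2m
print((lam**2).sum() / A.shape[0])        # 2.4, the average degree
```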
4 Data Analysis and Manipulation

In this chapter
We turn our attention to some of the phenomena we should be aware of when carrying out experiments. For example, experimental data are prone to error from many sources and we classify some of these sources. We give a brief overview of some of the techniques that we can use to make sense of the data. For all of these techniques, mathematicians and statisticians have developed effective computational approaches and we hope our discussion gives the student a flavour of the issues which should be considered to perform an accurate and meaningful analysis. We list some of the key statistical concepts that a successful student of network analysis needs in their arsenal and give an idea of some of the software tools available.

4.3 Processing data 42
4.4 Data statistics and random variables 46
4.5 Experimental tools 48
Further reading 51
4.1 Motivation
The focus of much of this book is to present the theory behind network analysis.
Many of the networks we choose to illustrate the theory are idealized in order
to accentuate the effectiveness of the analysis. Once we have developed enough
theory, though, we can start applying it to real life networks—if you leaf through
this book you will see examples based on complex biological, social, and transport
networks, to name just three—and with a sound understanding of the theory we
can ensure we can draw credible inferences when we analyse real networks. But
we should also be aware of the limitations of any analysis we attempt. When we
build a network to represent a food chain, can we be sure we have included all
the species? When we look at a social network where edges represent friendship,
can we be certain of the accuracy of all of our links? And if we study a network
of transport connections, how do we accommodate routes which are seasonal or
temporary?
The point is that any data collected from a real life setting are subject to error.
While we may be able to control or mitigate errors, we should always ensure that
the analytical techniques we use are sufficiently robust for us to be confident in
our results. In this chapter, after presenting a brief taxonomy of experimental
error, we look at some of the techniques available to us for processing and
analysing data.
Example 4.1

Suppose we wish to compute \( I_{20} \) where \( I_n = \int_0^1 x^n e^{x-1}\,dx \).

Note that \( I_0 = \int_0^1 e^{x-1}\,dx = 1 - e^{-1} \approx 0.6321 \) and using integration by parts we find

\[ I_{n+1} = \left[ x^{n+1} e^{x-1} \right]_0^1 - (n+1) \int_0^1 x^n e^{x-1}\,dx = 1 - (n+1) I_n. \]

Notice that for 0 ≤ x ≤ 1, if m > n then 0 ≤ x^m ≤ x^n, so the sequence {I_n} should be nonnegative and monotone decreasing.

Figure 4.2 shows I_n as computed on a computer with 16 digits of accuracy using the recurrence I_n = 1 – nI_{n–1} up to n = 19. At the next step it gives the value I_{20} = –30.192. The problem is caused by the accumulation of tiny rounding errors (by n = 20 the initial error in rounding I_0 has been multiplied by 20!). The problem can be overcome by rearranging the recurrence. For example, if we let I_n = (1 – I_{n+1})/(n + 1) and assume I_{30} = 0, we calculate I_{20} to full accuracy.
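Both recurrences are two lines of code each, so the instability is easy to reproduce. In the sketch below the forward loop loses everything, while the backward loop, started from the crude guess I₃₀ = 0, recovers I₂₀ ≈ 0.0455.

```python
import numpy as np

# Forward recurrence: catastrophically unstable.
I = 1 - np.exp(-1.0)                # I_0 = 1 - e^{-1}
for n in range(1, 21):
    I = 1 - n * I
print(I)                            # hopelessly wrong (large in magnitude)

# Backward recurrence: I_n = (1 - I_{n+1})/(n+1), starting from I_30 = 0.
J = 0.0
for n in range(30, 20, -1):         # J becomes I_29, I_28, ..., I_20
    J = (1 - J) / n
print(J)                            # about 0.04554; the initial error is divided
                                    # by 30*29*...*21, so it is negligible
```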
Example 4.2
Consider the network of sawmill employees we considered in Chapter 1. Suppose the mill is visited one year later and
the study is repeated but we find that two of the employees (who we know are still working at the mill) are not recorded
in our follow-up survey. To deal with the missing data we could choose one of the following options.
1. Ignore it. Maybe two missing participants will not skew our results.
2. Substitute by including the missing workers in our survey and connecting them to the network using the links
to fellow employees recorded in the previous study.
3. Suppose one of the missing workers had exactly the same connections in the original survey as a worker who
has not gone missing. In this case, we can substitute the missing data by assuming their links are still identical.
If there are no identical twins (in this sense) then this process could be expanded to substitute with the links of the
most similar node. Later in this book (Chapter 21) we will see ways of measuring this similarity between nodes.
4. A statistical analysis of the two networks may show some discrepancies which can be ameliorated by adding
links in particular places in the network. This is the essence of a process known as imputation. The analysis is
based on statistics such as the degree distribution, a concept we will also visit in Chapter 9.
For statistical reasons, imputation is the favoured approach for dealing with miss-
ing nodes as it can effectively remove bias. But dealing with the ‘known unknowns’
and the ‘unknown unknowns’ of data uncertainty can be fraught with danger for
even the most experienced of practitioners and you may be playing it safe by ap-
plying a simplistic approach to missing data. For particular classes of network,
sophisticated techniques exist for recovering missing data. Such techniques are
beyond the scope of this book. They are underpinned by theory on high-level
properties of networks such as degree distribution (see Chapter 9) and network
motifs (see Chapter 13).
for a more restrictive class of techniques for dealing with data subject to noise
(particularly unbiased noise). We define filtering to be the process of smoothing
the data. When dealing with networks, we may want to filter raw data when de-
termining whether to link nodes together (such as in constructing PPI networks
when the results of experiments are judged) but it is also an essential tool once we
start our analysis to get a smoother picture of our statistical results.
Example 4.4

[Figure: node degree plotted against time; (a) raw data, (b) data after smoothing.]

To get a better idea of whether this drop is real or just an artefact of the noise we have plotted a moving average of the data. At every point in time we have replaced the measured value with the average taken over (in this case) six successive time intervals. This ‘averages out’ the noise and appears to show that the apparent drop is a real phenomenon within the network.
There are many ways of choosing the aj (in our example they were uniform)
depending on what we are trying to achieve and the suspicions we have about the
nature of the contamination of our data. For i < k an appropriate filtering must
be chosen to properly define the initial points in the filtered data (for example, by
creating fictitious data x–k , . . . , x–1 ).
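A moving-average filter of the kind used in Example 4.4 takes only a few lines. The data here are synthetic, invented by us for illustration: a genuine drop buried in noise, smoothed with uniform weights aj over a window of six samples.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(200)
signal = 10.0 - 3.0 * (t > 120)             # a genuine drop at t = 120 ...
data = signal + rng.normal(0, 1.5, t.size)  # ... buried in noise

k = 6                                        # window of six time intervals
weights = np.ones(k) / k                     # uniform a_j, as in the example
smooth = np.convolve(data, weights, mode="valid")
print(data[115:125].round(1))                # noisy values around the drop
print(smooth[115:125].round(1))              # the drop survives the averaging
```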
Example 4.5

Let x and y be two variables and a and b be constants. A simple change of variables can be used to convert some nonlinear relationships between the variables into linear ones:

\[ y = ax^2 + b \;\Rightarrow\; y = aX + b, \quad X = x^2, \]
\[ y = ae^{bx} \;\Rightarrow\; Y = bx + \ln a, \quad Y = \ln y, \tag{4.2} \]
\[ y = ax^b \;\Rightarrow\; Y = bX + \ln a, \quad Y = \ln y, \; X = \ln x. \]
Suppose we are given data (x1 , y1 ), . . . , (xn , yn ) for two variables. The basic
principle of linear fitting is to find constants a and b so that the line y = ax + b
matches the data as closely as possible. We do this by minimizing the errors
ei = yi – axi – b over all choices of a and b. There are many ways of choosing
the error measure but if we assume that the errors can be modelled by a ran-
dom variable then it usually makes sense to minimize the Euclidean distance in the
errors, namely,
\[ \min_{a,b} \sum_{i=1}^{n} (y_i - ax_i - b)^2, \tag{4.3} \]
to find the least squares solution, for which there are many efficient computational
techniques.
Example 4.6
Given that the variables x and y are related, and that when x = 0, 2, and 3, y = 0, 4, and 9, respectively, we can estimate
the value of y when x = 1 in a number of ways. For example, we can assume that between x = 0 and x = 2, the
relationship between the variables is linear and hence when we interpolate to x = 1 we find y = 2. Or we notice that
our three data points lie on a quadratic curve and if we assume this relationship throughout then y = 1 at x = 1. We
could also fit other shapes to pass through the points or we could interpolate from the least squares linear fit to the
data, y = 2.86x – 0.43, which gives y ≈ 2.43 at x = 1.
Any of our choices can be used to extrapolate outside the measured values of x but notice that these will diverge
significantly as x increases.
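These fits are easily reproduced. The sketch below uses NumPy’s polyfit on the data of Example 4.6; the degree-1 fit gives the least squares line quoted above and the degree-2 fit recovers the quadratic through the points.

```python
import numpy as np

x, y = np.array([0.0, 2.0, 3.0]), np.array([0.0, 4.0, 9.0])

a, b = np.polyfit(x, y, 1)         # least squares line y = ax + b
print(a, b)                        # 2.857..., -0.428...  (y = 2.86x - 0.43)
print(a * 1 + b)                   # interpolated value at x = 1: about 2.43

c = np.polyfit(x, y, 2)            # the quadratic through the points
print(np.polyval(c, 1.0))          # 1.0: the quadratic interpolant is y = x^2
```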
As with all the other techniques of massaging data that we have presented
in this section, we can be most confident in our interpolated/extrapolated data
when there is an underlying justification for the method we use provided by the
theory. If we have a large amount of data we should be judicious in the use of
interpolation/extrapolation. Large amounts of data allow us to calculate unique
interpolants of great complexity (for example, with 25 pieces of data we can fit
a degree 24 polynomial). But these complicated functions can fluctuate wildly
between the given data and give outlandish results in the gaps. Extrapolation es-
pecially needs to be done carefully. Any discrepancy between our assumed fit and
the actual behaviour can be accentuated to ridiculous proportions and lead to
predictions which are physically impossible.
To get a feeling of what the results mean it is often useful to calculate certain representative statistics: averages, maxima, and minima. For a set of results x, some of the statistics we will make use of the most are the maximum, x_max, the minimum, x_min, the mean, x̄, the standard deviation, σ_x, and the variance, σ_x². If we label the individual elements of x as x₁, x₂, ..., x_n then

\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \text{and} \qquad \sigma_x^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2. \tag{4.4} \]

It is often natural (and profitable) to look for and measure correlations between variables. A simple way to compare two variables x and y is to measure the covariance between samples {x₁, x₂, ..., x_n} and {y₁, y₂, ..., y_n}, defined as

\[ \mathrm{cov}(x, y) = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}). \tag{4.5} \]
Powerful statistical techniques have been developed which boil down a large amount of information into a number between –1 and 1 which measures the dependency between two variables. For example, if we assume that x and y are linearly related then we can calculate the Pearson correlation coefficient of two samples as

\[ r = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y}. \tag{4.6} \]
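Equations (4.4)–(4.6) translate directly into code. The helper below follows the (1/n) population conventions used in the text and agrees with NumPy’s built-in; the test data are our own.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient via equations (4.4)-(4.6)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    cov = ((x - x.mean()) * (y - y.mean())).mean()   # equation (4.5)
    return cov / (x.std() * y.std())                 # equation (4.6)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.5 * x - 1.0 + np.array([0.1, -0.2, 0.0, 0.2, -0.1])
print(pearson_r(x, y))                # close to 1: a strong linear relation
print(np.corrcoef(x, y)[0, 1])        # NumPy's built-in agrees
```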
safe to approximate with continuous distributions and/or ones with infinite do-
mains such as the Gaussian. And (since we are mathematicians!) we will be taking
limits to infinity to get a complete understanding of the finite. But care should be
taken so that our use of random distributions does not give meaningless results
and the student should be prepared to use truncated approximations to idealized
distributions to avoid such misfortune.
Understanding unlabelled networks can give us rich insight into structure and
theory but in applications the actual labels attached to the nodes must be taken
into account if we want to name the most important node or the members of a
particular set. For these labelled graphs, a wide array of file formats have been
developed that are influenced by their author’s particular interests.
For unlabelled networks one can work exclusively with an adjacency matrix.
All mathematical software packages will have a format for storing matrices and
an initial analysis can be performed using well-established methods from linear
algebra. However, for large networks it may be necessary to use efficient storage
formats. Suppose you are analysing a simple network with n nodes and m edges. To store this information one simply needs a list of the edges and their end points. If we assign each node a number between 1 and n we can store all the information as a set of m pairs of numbers. Assuming each number requires four bytes of storage (which gives us around four billion numbers to play with) the whole network therefore requires 8m bytes of room in the computer’s memory. Unless instructed otherwise, most software packages will allocate eight bytes to store each entry of a matrix (in so-called double precision format). Thus the adjacency matrix requires 8n² bytes of storage.
Example 4.7

Consider a social network of around 100,000 people who are each connected on average to around 100 other individuals. We only need around 40 megabytes to store the links (m ≈ 5,000,000 since each edge adds two to the total of connections). If we form the adjacency matrix we need 8 × 10¹⁰ bytes or 80 gigabytes, a factor 2,000 times as big. While most modern computers have room for a file of 80 GB, it may not be easy to manipulate (for example, as of 2014 very few personal computers would be able to store the whole matrix in RAM) and the problem is exacerbated as n increases.
The point of Example 4.7 is that even for simple, unlabelled networks,
thought needs to go into the method of representing a network on a
computer. If we just want to store the edges then this can be done in a simple
two-column text file, or a spreadsheet. It is easy enough to work in a similar way
with directed networks and by adding an additional column one can also add
weights.
If one makes use of a simple file type then there are usually tools to convert
the network into a format which can be manipulated efficiently by your software
package of choice. If the amount of storage is critical, one can consider formats
that attempt to compress information as much as possible (for example graph6
and sparse6).
Simple text files can also be used to represent labelled networks but, again,
when m and n are large it often pays to consider a format which is optimal for the
software we want to use. Formats such as GML and Pajek have been designed to
make network data portable and use flexible hierarchical structures which allow
researchers to add detailed annotations to provide context, but one can also make
use of some of the other countless file types that were developed without networks
necessarily being at the forefront of the creators’ minds.
capabilities of the other packages we have mentioned for analysing very large net-
works but often a purely visual approach can lead one to uncover patterns which
can then be analysed more systematically. In particular, it is an excellent package
for creating arresting illustrations of networks and we have used it extensively in
producing this book.
..................................................................................................
FURTHER READING
Clarke, G.M. and Cooke, D., A Basic Course in Statistics, Edward Arnold, 1998.
Ellenberg, J., How Not To Be Wrong: The Hidden Maths of Everyday Life, Allen
Lane, 2014.
Lyons, L., A Practical Guide to Data Analysis for Physical Science Students,
Cambridge University Press, 1991.
Mendenhall, W., Beaver, R.J., and Beaver, B.M., Introduction to Probability and
Statistics, Brooks/Cole, 2012.
5 Algebraic Concepts in Network Theory
Definition 5.1 Let A ∈ ℝⁿˣⁿ; then its determinant, written det(A), is the quantity defined inductively by

\[ \det(A) = \begin{cases} a_{11}, & n = 1, \\ \displaystyle\sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det(A_{ij}), & n > 1, \end{cases} \]

for any fixed i, where A_{ij} denotes the submatrix formed from A by deleting its ith row and jth column.
Examples 5.1
(i) If A ∈ ℝ²ˣ² then det(A) = a₁₁a₂₂ – a₁₂a₂₁. And if A ∈ ℝ³ˣ³,

\[ \begin{aligned} \det(A) &= a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}) \\ &= a_{11}a_{22}a_{33} + a_{21}a_{32}a_{13} + a_{31}a_{12}a_{23} - a_{13}a_{22}a_{31} - a_{23}a_{32}a_{11} - a_{33}a_{12}a_{21}. \end{aligned} \]
(ii) Let

\[ A = \begin{bmatrix} 1 & 3 & 1 \\ 2 & 1 & 1 \\ 0 & 3 & 1 \end{bmatrix}. \]

Then det(A) = 1 + 6 + 0 – 0 – 3 – 6 = –2. There are three 2 × 2 principal minors: det(A₁₁) = 1 – 3 = –2,
det(A₂₂) = 1 – 0 = 1, and det(A₃₃) = 1 – 6 = –5.
(iii) Suppose that G is a simple connected network with three nodes. Then the adjacency matrix of G is either
\[ \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix} \]
or a permutation of
\[ \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}. \]
The first matrix represents K3 , a triangle, and the second is the adjacency matrix of P2 . The determinant of the
first matrix is two and that of the second is zero. In this very simple case we can characterize connected networks
through their determinant.
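Both the determinant in (ii) and the characterization in (iii) are easy to check numerically; the rounding in floating-point determinants is harmless at this size.

```python
import numpy as np

A = np.array([[1.0, 3.0, 1.0], [2.0, 1.0, 1.0], [0.0, 3.0, 1.0]])
print(np.linalg.det(A))                    # -2.0, up to rounding

def principal_minor(A, i):
    """Determinant of A with row i and column i deleted."""
    B = np.delete(np.delete(A, i, axis=0), i, axis=1)
    return np.linalg.det(B)

print([round(principal_minor(A, i)) for i in range(3)])   # [-2, 1, -5]

K3 = np.array([[0.,1.,1.],[1.,0.,1.],[1.,1.,0.]])          # triangle
P2 = np.array([[0.,1.,1.],[1.,0.,0.],[1.,0.,0.]])          # path
print(np.linalg.det(K3), np.linalg.det(P2))                # 2.0 and 0.0
```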
Definition 5.2 For any n-dimensional square matrix A there exist scalar values² λ and vectors³ x such that

\[ A\mathbf{x} = \lambda \mathbf{x}. \]

Any value of λ that satisfies this equation is called an eigenvalue. Any nonzero vector x that satisfies this equation is called an eigenvector.

For all values of λ we have A0 = λ0. The zero vector is not considered an eigenvector, though. Now

\[ A\mathbf{x} = \lambda \mathbf{x} \iff (A - \lambda I)\mathbf{x} = \mathbf{0}, \]

which means that the matrix A – λI is singular. We have therefore established the important result that the eigenvalues of A are the roots of the equation det(A – λI) = 0, a polynomial equation of degree n.

\[ \det(A - \lambda I) = b_0 + b_1 \lambda + b_2 \lambda^2 + \cdots + b_n \lambda^n \]

is known as the characteristic polynomial (or c.p.) of A. The equation det(A – λI) = 0 is known as the characteristic equation.

2 At least one and at most n.
3 At least one for each value of λ.
We can factorize the characteristic polynomial over its distinct roots as

det(A – λI) = (λ1 – λ)^{p1}(λ2 – λ)^{p2} · · · (λk – λ)^{pk},

where the λi are unique. Generally k = n and pi = 1 for all i. But sometimes we
encounter repeated eigenvalues for which pi > 1. The value of pi is known as the
algebraic multiplicity of λi . For each distinct eigenvalue there are between 1 and
pi linearly independent eigenvectors. The number of linearly independent eigenvectors associated with an eigenvalue is known as its geometric multiplicity.
Encoded in the characteristic polynomial are certain quantities that are very useful to us. We can link its coefficients to principal minors: (–1)^k b_{n–k} is the sum of all the k × k principal minors of A. Since tr(A) is also the sum of the principal 1 × 1 minors of A, –tr(A) is the coefficient of λ^{n–1} in the c.p. At the same time, if
A has eigenvalues λ1 , λ2 , . . . , λn then
det(λI – A) = (λ – λ1 )(λ – λ2 ) · · · (λ – λn )
and multiplying out the right-hand side we immediately see that the coefficient of λ^{n–1} is –λ1 – λ2 – · · · – λn, which means that the sum of the eigenvalues of a matrix equals its trace.
For a general real matrix A, the roots of the characteristic polynomial can be complex. Notice that if A is a real matrix then any complex eigenvalues must occur in conjugate pairs, since the coefficients of the characteristic polynomial are real.
Problem 5.1
Show that if x and y∗ are right and left eigenvectors corresponding to different eigenvalues then they are orthogonal, and hence conclude that if A ∈ R^{n×n} is symmetric then (i) the eigenvalues of A are all real and (ii) the eigenvectors of distinct eigenvalues of A are mutually orthogonal.
Suppose that x1 is the right eigenvector corresponding to λ1 ≠ 0 and y∗2 is the left eigenvector corresponding to λ2, where λ2 ≠ λ1. Then,

$$y_2^* x_1 = y_2^*\,\frac{Ax_1}{\lambda_1} = \frac{1}{\lambda_1}(y_2^* A)x_1 = \frac{\lambda_2}{\lambda_1}\, y_2^* x_1.$$
But λ2 ≠ λ1, so y∗2 x1 = 0. If λ1 = 0, then λ2 ≠ 0 and

$$y_2^* x_1 = \frac{(y_2^* A)x_1}{\lambda_2} = \frac{\lambda_1}{\lambda_2}\, y_2^* x_1 = 0.$$
If A = AT then by symmetry x = y in the above argument, hence the left and
right eigenvectors are identical.
(i) Since A is symmetric, (x∗ Ax)∗ = x∗ A∗ x = x∗ Ax, for any vector x. Now
(x∗ Ax)∗ is the complex conjugate of x∗ Ax, and for a number to equal its
complex conjugate it must be real. Now suppose that λ is an eigenvalue of
A with eigenvector x. Then x∗ Ax = x∗ (λx) = λx∗ x and this can only be
real if λ is real.
(ii) Suppose that x and y are eigenvectors with respective eigenvalues λ and μ. Then

μ y^T x = (Ay)^T x = y^T Ax = λ y^T x.

If λ ≠ μ this means that y^T x = 0, hence they are at right angles to each other.
Similarity transforms
When it comes to utilizing spectral information to analyse networks, we will want
to look at both eigenvalues and eigenvectors. Furthermore, the behaviour of ma-
trix functions is intimately tied to the interplay of eigenvalues and eigenvectors.
It is worthwhile, then, looking at the basic tools for understanding this interplay.
Similarity transformations are of utmost importance in this regard.
Given a nonsingular matrix X, a similarity transformation is a map of the form

S_X : A → X^{–1}AX.

Similar matrices share eigenvalues: if B = C^{–1}AC and Ax = λx then

BC^{–1}x = (C^{–1}AC)C^{–1}x = C^{–1}Ax = λC^{–1}x,

so C^{–1}x is an eigenvector of B corresponding to the eigenvalue λ.
Example 5.2
Let

$$A = \begin{bmatrix} -5 & 1 & 3\\ 18 & 2 & -9\\ -20 & 2 & 11 \end{bmatrix},\quad D = \begin{bmatrix} 1 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 5 \end{bmatrix},\quad X = \begin{bmatrix} 1 & 1 & 0\\ 0 & 1 & 3\\ 2 & 2 & -1 \end{bmatrix}.$$

One can check that AX = XD, so that A = XDX^{–1}: the columns of X are eigenvectors of A and the eigenvalues of A are 1, 2, and 5.
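As a quick numerical check, a few lines of Python (a sketch of ours, not from the book, assuming numpy) confirm the factorization.

```python
import numpy as np

A = np.array([[-5, 1, 3], [18, 2, -9], [-20, 2, 11]], dtype=float)
D = np.diag([1.0, 2.0, 5.0])
X = np.array([[1, 1, 0], [0, 1, 3], [2, 2, -1]], dtype=float)

# The columns of X are eigenvectors and D holds the eigenvalues.
print(np.allclose(A @ X, X @ D))                    # True
print(np.allclose(A, X @ D @ np.linalg.inv(X)))     # True
print(np.sort(np.linalg.eigvals(A)).real)           # [1. 2. 5.]
```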
Example 5.3
If
$$X = \begin{bmatrix} 4 & -2 & -7\\ 1 & 0 & -2\\ -2 & 1 & 4 \end{bmatrix}$$
then A = XJX^{–1}. The first column of X is an eigenvector of A and, up to scaling, it is the only eigenvector of the matrix. The second and third columns satisfy the equations

(A – I)x_{i+1} = x_i, i = 1, 2.
The diagonal blocks in the Jordan canonical form of a matrix are unique (up to their ordering), hence two matrices are similar if and only if their Jordan canonical forms have the same diagonal blocks.
The Jordan decomposition is of great theoretical use but in practice it can be
difficult to compute accurately as it can be highly sensitive to small perturbations.
An alternative is to use the Schur decomposition.
Problem 5.2
Show that for any n × n matrix A there is a unitary matrix U and an upper-
triangular matrix T such that
A = UTU ∗ . (5.1)
We use induction on the dimension. The result is trivial when n = 1. Suppose it holds for matrices of order n – 1 and let Ax = λx with x∗x = 1. Choose a unitary matrix V whose first column is x. Then

$$V^*AV = \begin{bmatrix} \lambda & b^*\\ 0 & B \end{bmatrix}$$

(for some vector b) and, by the inductive hypothesis, B = WSW∗ with W unitary and S upper triangular. Let U = V diag(1, W), which is unitary (why?). Then U∗AU = [[λ, b∗W],[0, S]], which is upper triangular. Call this matrix T; then A = UTU∗ is a Schur decomposition of A.
The matrix T in (5.1) is known as the Schur canonical form and UTU ∗ is the
Schur decomposition of A.
If T is triangular,

$$\det(T) = \prod_{i=1}^{n} t_{ii},$$

so the eigenvalues of A can be read off the diagonal of its Schur canonical form. When A is symmetric we can say more. Since

T∗ = (U∗AU)∗ = U∗A∗U = U∗AU = T,

T is both triangular and Hermitian, and so must be diagonal. Hence a symmetric matrix can be factorized as

A = QDQ^T, (5.2)

where Q is orthogonal and D is a diagonal matrix of eigenvalues; this factorization is often called the spectral decomposition.
Gershgorin's theorem states that, for any nonsingular X, every eigenvalue of A lies in the union of discs in the complex plane whose centres are the diagonal entries of X^{–1}AX and whose radii are the corresponding off-diagonal row sums. If we pick X carefully we can often get tight bounds on the locations of the eigenvalues. Simple choices of X can also be useful. Note that if the discs are disjoint then each contains a single eigenvalue of A.
Example 5.4
Suppose X = I; then

$$\Gamma_i = \left\{ z \in \mathbb{C} : |z - a_{ii}| \le \sum_{j \ne i} |a_{ij}| \right\}.$$
For example, if
$$A = \begin{bmatrix} 10 & 2 & 3\\ -1 & 0 & 2\\ 1 & -2 & 1 \end{bmatrix}$$
then
Γ1 = {z : |z – 10| ≤ 5}, Γ2 = {z : |z| ≤ 3}, Γ3 = {z : |z – 1| ≤ 3}.

If instead we take X = diag(6, 1, 1), the discs of X^{–1}AX are

Γ1 = {z : |z – 10| ≤ 5/6}, Γ2 = {z : |z| ≤ 8}, Γ3 = {z : |z – 1| ≤ 8},
and we can give a much better estimate of the largest eigenvalue, as illustrated in Figure 5.1.
Figure 5.1 Gershgorin discs for (a) X = I and (b) X = diag(6, 1, 1). The eigenvalues are indicated by black circles.
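A minimal sketch in Python (ours, assuming numpy) reproduces both sets of discs and compares them with the actual eigenvalues.

```python
import numpy as np

A = np.array([[10, 2, 3], [-1, 0, 2], [1, -2, 1]], dtype=float)

def gershgorin(M):
    """Return (centre, radius) for each Gershgorin disc of M."""
    radii = np.sum(np.abs(M), axis=1) - np.abs(np.diag(M))
    return list(zip(np.diag(M), radii))

print(gershgorin(A))                          # discs for X = I
X = np.diag([6.0, 1.0, 1.0])
print(gershgorin(np.linalg.inv(X) @ A @ X))   # discs for X = diag(6, 1, 1)
print(np.linalg.eigvals(A))                   # all lie inside the discs
```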
Given a symmetric matrix A, the Rayleigh quotient of a vector x ≠ 0 is x^T Ax / x^T x. If Ax = λx then

$$\frac{x^T A x}{x^T x} = \frac{x^T \lambda x}{x^T x} = \lambda.$$
The range of Rayleigh quotients for a given matrix will thus include the whole
spectrum (if that spectrum is real). Bounds on this range can be used to bound
the spectral radius (which is an upper bound on the size of Rayleigh quotients of
a given matrix).
The proof of Perron’s theorem can be found in many standard linear algebra
texts. The details are rather intricate and we omit them here, but it would be
remiss of us not to give a flavour of what the proof involves.
Problem 5.3
Show that if A > 0 then the following are true.

1. A^k > 0 for every positive integer k.
2. ρ(A) > 0.
3. If x ≥ 0 and x ≠ 0 then Ax > 0.

1. By induction. A > 0. Suppose B = A^k > 0. Then the (i, j)th entry of A^{k+1} is Σ_{l=1}^{n} b_{il}a_{lj} and the result follows since every term in this sum is positive.
2. If ρ(A) = 0 then A is nilpotent and A^k = O for k ≥ n. But we have just shown that A^k > 0 for a positive matrix.
3. The ith entry of Ax is Σ_{k=1}^{n} a_{ik}x_k. Every term is nonnegative and the sum can only equal zero if every element of x is zero.
For nonnegative matrices, point one of the Perron theorem is also true. But
the uniqueness of the largest eigenvalue is not guaranteed.
Examples 5.5
(i) $A = \begin{bmatrix} 0 & 1\\ 1 & 0 \end{bmatrix}$ is nonnegative and has eigenvalues ±1, showing point two of the Perron theorem does not hold for all nonnegative matrices.

(ii) Suppose A > 0 and that Ax = ρ(A)x where x > 0. Now let $B = \begin{bmatrix} A & O\\ O & A \end{bmatrix}$. Then it should be obvious that ρ(B) = ρ(A) and

$$B\begin{bmatrix} x\\ 0 \end{bmatrix} = \rho(B)\begin{bmatrix} x\\ 0 \end{bmatrix}, \qquad B\begin{bmatrix} x\\ -x \end{bmatrix} = \rho(A)\begin{bmatrix} x\\ -x \end{bmatrix},$$

proving that points three and four of the Perron theorem do not hold for all nonnegative matrices.
(iii) We could have used B = I in the last example.
Many nonnegative matrices do share all the key properties of positive matrices.
Whether they do or not depends on the pattern of zeros in the matrix which for
adjacency matrices, as previously stated, can be related to connectivity in networks.
A ∈ R^{n×n} is fully decomposable if there are permutations P and Q such that

$$A = P\begin{bmatrix} X & Y\\ O & Z \end{bmatrix}Q,$$

where X and Z are both square. In particular, if a network has k connected components then its adjacency matrix can be symmetrically permuted into the block diagonal form

A = P diag(A1, A2, . . . , Ak)P^T,

where Ai is the adjacency matrix of the ith component.
FURTHER READING
Horn, R.A. and Johnson, C.R., Matrix Analysis, Cambridge University Press,
2012.
Meyer, C.D., Matrix Analysis and Applied Linear Algebra, SIAM, 2000.
Strang, G., Linear Algebra and Its Applications, Brooks/Cole, 2004.
6 Spectra of Adjacency Matrices
In this chapter

Diverse physical phenomena can be understood by studying their spectral properties. An understanding of the harmonies of music can be developed by looking at characteristic frequencies that can be viewed as eigenfunctions; and astronomers can predict the chemical composition of unimaginably distant galaxies from the spectra of the electromagnetic radiation they emit.

In this chapter we look at some ways to define the spectrum of a network and what we can infer from the resulting eigenvalues. We will only consider undirected networks, which allows us to take advantage of some powerful tools from matrix algebra.
6.1 Motivation
The obvious place to start when looking for the spectrum of a network is the ad-
jacency matrix. For now, we will focus on simple networks. Since the adjacency
matrix is symmetric, the eigenvalues are real (by the spectral theorem) and since it
is nonnegative, its largest eigenvalue is real and positive (by the Perron–Frobenius
theorem). We can compare networks through the spectra of their adjacency matri-
ces but we can also calculate some useful network statistics from them, too. We
will assume that the spectrum of the adjacency matrix A is ordered so that
λ1 ≥ λ2 ≥ · · · ≥ λn .
Since A is symmetric and the eigenvalues are real, such an ordering is possible.
Examples 6.1

All the eigenvalues corresponding to complex ω have algebraic multiplicity 2, since ω^j and ω^{n–j} share the same real part. (Throughout this book e denotes a vector of ones and E a matrix of ones; their dimensions will vary but should be readily understood from the context in which they appear.)
(iv) If G is bipartite then its adjacency matrix can be permuted to the form

$$A = \begin{bmatrix} O & B\\ B^T & O \end{bmatrix}.$$

Suppose v is an eigenvector of A. Partition it as we have A, so $v = \begin{bmatrix} x\\ y \end{bmatrix}$. Now Av = λv and

$$Av = \begin{bmatrix} O & B\\ B^T & O \end{bmatrix}\begin{bmatrix} x\\ y \end{bmatrix} = \begin{bmatrix} By\\ B^T x \end{bmatrix},$$

so By = λx and B^T x = λy. For the complete bipartite network K_{mn} we can take B = E_{mn}, the m × n matrix of ones, and then

E_{mn}E_{mn}^T x = λE_{mn}y = λ²x.

But E_{mn}E_{mn}^T = nE, where E is an m × m matrix of ones. From our analysis of the complete graph, we know that the spectrum of nE is {mn, 0} and so the eigenvalues of K_{mn} are ±√(mn) and 0. We get a similar result if we consider E_{mn}^T E_{mn}y, and if we account for all the copies of the zero eigenvalue we find that it has algebraic multiplicity m + n – 2. Hence,

$$\sigma(K_{mn}) = \left\{ \sqrt{mn},\ [0]^{m+n-2},\ -\sqrt{mn} \right\}.$$
(vi) If G is disconnected then its spectrum is simply the union of the spectra
of the individual components.
(vii) If the adjacency matrix of G has characteristic polynomial
For general networks we can either compute (parts of) the spectrum nu-
merically or estimate certain eigenvalues. The most useful eigenvalues to have
information about are λ1 , λ2 , and λn (the most negative eigenvalue). Since A is
symmetric and nonnegative we can deduce several things about λ1 and λn .
By Gershgorin's theorem, the spectrum lies in discs centred on the diagonal elements of A, with radii equal to the sums of the off-diagonal elements in the corresponding rows. Since the eigenvalues are real, this means all eigenvalues lie in the interval [–n + 1, n – 1]. By the Perron–Frobenius theorem, the largest eigenvalue is nonnegative, so 0 ≤ λ1 ≤ n – 1 and, by symmetry, λ1 = 0 if and only if A = O. Note
that the adjacency matrices of all networks apart from that of Nn have negative
eigenvalues.
Another consequence of the Perron–Frobenius theorem is that |λn| < λ1 unless A is reducible or G is bipartite. We saw in an earlier example that the spectrum of a bipartite network is symmetric about zero, and hence |λn| = λ1 in that case. For a simple network G, A is reducible precisely when G has several components. And unless one of these components is bipartite, the smallest eigenvalue is guaranteed to satisfy |λn| < λ1, too.
We can get a lower bound on the biggest eigenvalue using the ratio of edges to nodes in a network. Since λ1 = ρ(A) is an upper bound on the size of Rayleigh quotients, for any x ≠ 0 we have λ1 ≥ x^T Ax / x^T x. Letting x = e gives

$$\lambda_1 \ge \frac{e^T A e}{e^T e} = \frac{2m}{n}.$$
Problem 6.1
Show that if kmax is the maximal degree of any node of a simple network with adjacency matrix A then λ1 ≥ √kmax.

Suppose node k has the highest degree and define x so that

$$x_i = \begin{cases} \sqrt{k_{\max}}, & i = k,\\ a_{ik}, & i \ne k. \end{cases}$$

Since x^T x = 2kmax,

$$\lambda_1 \ge \frac{x^T A x}{x^T x} \ge \frac{k_{\max}\sqrt{k_{\max}} + k_{\max}\sqrt{k_{\max}}}{2k_{\max}} = \sqrt{k_{\max}}.$$
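Both lower bounds are easy to test numerically. The sketch below (not from the book; it assumes networkx and numpy, and the sample graph is an arbitrary choice) checks them on a random network.

```python
import networkx as nx
import numpy as np

G = nx.erdos_renyi_graph(200, 0.05, seed=1)
A = nx.to_numpy_array(G)
lam1 = max(np.linalg.eigvalsh(A))          # largest adjacency eigenvalue

n, m = G.number_of_nodes(), G.number_of_edges()
kmax = max(dict(G.degree()).values())

print(lam1 >= 2 * m / n)       # Rayleigh bound with x = e: always True
print(lam1 >= np.sqrt(kmax))   # bound from Problem 6.1: always True
```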
Now write the characteristic polynomial as

$$\det(\lambda I - A) = c_0 + c_1\lambda + \dots + c_{n-1}\lambda^{n-1} + c_n\lambda^n.$$
As previously stated, the coefficients of the c.p. are related to the sums of the principal minors of A. Since A is simple its diagonal is zero, and hence so is that of any principal submatrix. All nonzeros in an adjacency matrix are ones, so each 2 × 2 principal minor equals 0 or –1, with one –1 for every edge; hence c_{n–2} = –m. Up to a symmetric permutation, a 3 × 3 principal submatrix with at least one nonzero is one of

$$\begin{bmatrix} 0&1&0\\ 1&0&0\\ 0&0&0 \end{bmatrix},\qquad \begin{bmatrix} 0&1&1\\ 1&0&0\\ 1&0&0 \end{bmatrix},\qquad \begin{bmatrix} 0&1&1\\ 1&0&1\\ 1&1&0 \end{bmatrix}.$$

Only the third of these has a nonzero determinant (it is two). Such a submatrix corresponds to a triangle in G and so –c_{n–3} counts twice the number of triangles in G.
We can use the spectrum of a network to count walks. In particular, since the
entries of Ap denote the number of walks of length p between nodes, the number
of closed walks lies along the diagonal and hence the sum of the closed walks of
length p is
$$\operatorname{tr}(A^p) = \sum_{i=1}^{n} \lambda_i^p. \qquad (6.1)$$
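In particular, tr(A³)/6 counts triangles, since each triangle contributes six closed walks of length three. A short sketch (ours, assuming networkx and numpy; the example network is an arbitrary choice) illustrates this.

```python
import networkx as nx
import numpy as np

G = nx.karate_club_graph()
lam = np.linalg.eigvalsh(nx.to_numpy_array(G))

closed_walks_3 = np.sum(lam ** 3)               # tr(A^3): six per triangle
print(round(closed_walks_3 / 6))                # 45 triangles
print(sum(nx.triangles(G).values()) // 3)       # direct count agrees
```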
Examples 6.2

One can readily confirm that the coefficients of λ^{n–2} and λ^{n–3} match the expected values in terms of edges and triangles.
(ii) The number of edges in the cycle graph Cn is n and there are no triangles for n ≥ 4; thus its characteristic polynomial is

λ^n – nλ^{n–2} + 0λ^{n–3} + · · · .
(iv) Since a bipartite network has no triangles, the c_{n–3} coefficient in the c.p. is zero.
(v) Suppose that the spectrum of the adjacency matrix of a network is symmetric about zero. Then if p is odd,

$$\operatorname{tr}(A^p) = \sum_{i=1}^{n} \lambda_i^p = 0,$$

so the network has no closed walks of odd length; in particular it has no triangles.
We can also use (6.1) to predict whether a network contains certain features.
Problem 6.2
Suppose G = (V, E) has m edges. Show that if λ1 > √m, then G contains a triangle.

Since m ≥ 0, λ1 > 0. From (6.1), $2\lambda_1^3 > 2\lambda_1 m \ge \lambda_1\sum_{i=1}^{n}\lambda_i^2 \ge \sum_{i=1}^{n}|\lambda_i^3|$, so $\lambda_1^3 > \sum_{i=2}^{n}|\lambda_i^3|$.

If t is the number of triangles then $6t = \sum_{i=1}^{n}\lambda_i^3 \ge \lambda_1^3 - \sum_{i=2}^{n}|\lambda_i^3| > 0$.

Since t must take an integer value, we have established the existence of at least one triangle.
While this may not be the most practical of results, it illustrates the point that
if we know or can estimate only limited parts of the spectrum, we may be able to
determine many characteristics of the network.
Example 6.3
Suppose that G is a connected k-regular network with n nodes and that its
adjacency matrix has only four distinct eigenvalues, namely λ1 > λ2 > λ3 > λ4, with multiplicities pi, such that

$$\sigma(G) = \left\{ [\lambda_1]^{p_1}, [\lambda_2]^{p_2}, [\lambda_3]^{p_3}, [\lambda_4]^{p_4} \right\}.$$
Since G is connected, p1 = 1. And since Σi pi = n (algebraic multiplicities always sum to the dimension), we get
1 + p2 + p3 + p4 = n.

Since G is k-regular, λ1 = k, and since tr(A) = 0,

k + p2λ2 + p3λ3 + p4λ4 = 0.

Similarly,

$$t = \frac{1}{6}\operatorname{tr}(A^3) = \frac{1}{6}\left( k^3 + p_2\lambda_2^3 + p_3\lambda_3^3 + p_4\lambda_4^3 \right).$$
Problem 6.3
Calculate the number of 4-cycles in the network described in Example 6.3 in
terms of k, n, the eigenvalues and their multiplicities.
Given the constraints that we have imposed, it can be shown that the number
of closed walks of a particular length is the same from any node in the network.
Suppose
u→v→w→x→u
is a closed walk of length four. Following the previous example, we can express the number of closed walks of length four as

$$\operatorname{tr}(A^4) = k^4 + p_2\lambda_2^4 + p_3\lambda_3^4 + p_4\lambda_4^4.$$

In some of these, the nodes are not all distinct and so do not count towards the total of 4-cycles.
There are three types of walk of length four with duplicate nodes.
1. u → v → u → v → u
2. u → v → u → x → u (x = v)
3. u → v → x → v → u (x = u)
Given a node u, there are k choices for v and k – 1 for x: any node adjacent to u (for type two) or to v (for type three) other than v or u, respectively.
In total this gives k walks of type one and k(k – 1) of each of types two and
three. We get the same number from any node in a k-regular graph, hence the
total number of closed walks of length four which are not cycles is nk(2k – 1).
Each 4-cycle represents eight different closed walks: start from one of its nodes
and move clockwise or anticlockwise.
Thus the total number of distinct 4-cycles is

$$\frac{1}{8}\left( k^4 + p_2\lambda_2^4 + p_3\lambda_3^4 + p_4\lambda_4^4 - nk(2k-1) \right).$$
Example 6.4
Suppose that the adjacency matrix A has distinct eigenvalues and that x is an
eigenvector of A such that Ax = λx.
If P^TAP = A then APx = PAx = λ(Px), and so Px is also an eigenvector of A corresponding to the eigenvalue λ. This is only possible if Px = ±x, which means that P²x = x. Since this is true for all eigenvectors, P² = I (and P = P^T).
The only permutations which have this property are the identity matrix
and ones where pairs of nodes are swapped with each other.
Examples 6.5
(i) Consider a bipartite network with principal eigenvalue λ. From Example 6.1(iv), we know that the signs of the
elements of the eigenvector associated with the eigenvalue –λ can be used to divide the nodes of the network into
its two parts.
(ii) We know that if ω is an nth root of 1 then

$$v = \begin{bmatrix} 1 & \omega & \omega^2 & \dots & \omega^{n-1} \end{bmatrix}^T$$

is an eigenvector of Cn (see Example 6.1(ii)) with corresponding eigenvalue 2 cos(2πj/n) for some j ∈ {1, 2, . . . , n}. In this case the principal eigenvector is e. All the nodes are identical in this network and so it is no surprise that they are indistinguishable. Note that e is an eigenvector of any regular network.

If n is even then Cn is bipartite and we can divide the network according to the signs of the elements of the eigenvector associated with the eigenvalue –2. In this case, the eigenvector is

$$v = \begin{bmatrix} 1 & -1 & 1 & \dots & -1 \end{bmatrix}^T,$$
Figure 6.1 Splitting the cycle graph C2n into two copies of the path graph Pn–2.
(iii) Using Example 6.1(ii) we can write a basis for the two-dimensional eigenspaces of C2n in the form

$$\left\{ \begin{bmatrix} 1\\ \omega\\ \omega^2\\ \vdots\\ \omega^{2n-1} \end{bmatrix},\ \begin{bmatrix} 1\\ \omega^{-1}\\ \omega^{-2}\\ \vdots\\ \omega^{1-2n} \end{bmatrix} \right\},$$

where ω^{2n} = 1.
Now choose such an ω, for which the eigenvalue is λ = ω + ω–1 , and call the basis elements v1 and v2 ,
respectively. Note that the first and (n + 1)th elements of x = v1 – v2 are both zero.
Since a1,n+1 = an+1,1 = 0 it follows that λ is an eigenvalue and x is an eigenvector of the matrix you obtain by
replacing the first and (n + 1)th rows and columns of A with zeros. For example, when n = 4 we end up with the
matrix

$$\begin{bmatrix} 0&0&0&0&0&0&0&0\\ 0&0&1&0&0&0&0&0\\ 0&1&0&1&0&0&0&0\\ 0&0&1&0&0&0&0&0\\ 0&0&0&0&0&0&0&0\\ 0&0&0&0&0&0&1&0\\ 0&0&0&0&0&1&0&1\\ 0&0&0&0&0&0&1&0 \end{bmatrix},$$
isolating two copies of the adjacency matrix of the path graph Pn–2. Thus each repeated eigenvalue of C2n is an eigenvalue of Pn–2, as claimed in Example 6.1(iii).
(iv) Consider the network in Figure 6.2. The principal eigenvector of the adjacency matrix is

$$\begin{bmatrix} 0.106 & 0.044 & 0.128 & 0.219 & 0.177 & 0.186 & 0.140 \end{bmatrix}^T.$$

Notice that the largest element (the fourth) is associated with the node at the hub of the network and that the smallest element is associated with the most peripheral. The eigenvector of the next largest eigenvalue is

$$\begin{bmatrix} -0.172 & -0.091 & 0.206 & -0.202 & 0.183 & -0.039 & 0.107 \end{bmatrix}^T.$$
If we split the nodes according to the signs of the elements of this vector we find a grouping of nodes that points towards a bipartite split in the network: we only need to remove the edge between nodes 4 and 6 to achieve this.
We will revisit applications of eigenvectors frequently in later chapters.
..................................................................................................
FURTHER READING

7 The Network Laplacian

Solutions of the equation Δu + λu = 0 over a bounded region tell us about characteristic modes associated with vibrations in the region, and Laplace's equation Δu = 0 can be related to equilibria in a system.

The graph Laplacian extends the idea of Δ to a discrete network. Starting from the definition of a derivative as the limit of differences,

$$f'(a) = \lim_{x\to a}\frac{f(x) - f(a)}{x - a},$$

we note that the incidence matrix lets us find differences between nodes in a network. Consider a function f which is assigned a value f(xi) = fi at each node. Then the differences between incident nodes can be found by calculating g = Bf, where B is the incidence matrix. To get the second difference at a node i, as we must to find Δ, we need to take the differences of all the first differences incident to i, namely B^T g = B^T Bf. With a little more rigour (we need to choose appropriate denominators for our differences and make sure signs of differences are properly matched together) we can show that the matrix L = B^T B is indeed a discrete analogue of the Laplacian operator.
Recall that row k of B contains only two nonzeros: a 1 and a –1 in the columns incident to the kth edge. There is a positive contribution to lii every time bki = ±1. That is, there is a positive contribution of 1 corresponding to every edge incident to i, and hence lii is equal to the degree of node i. If i ≠ j then bki bkj ≠ 0 if and only if edge k joins i and j, in which case bki bkj = –1. Therefore

L = D – A,

where D is the diagonal matrix of node degrees.
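One can verify the identity L = B^T B = D – A numerically; the following sketch (ours, not the book's, assuming networkx and numpy) does so for a small example network.

```python
import networkx as nx
import numpy as np

G = nx.cycle_graph(6)                                  # any simple network will do
B = nx.incidence_matrix(G, oriented=True).toarray().T  # one row per edge, entries +-1
D = np.diag([d for _, d in G.degree()])
A = nx.to_numpy_array(G)

print(np.allclose(B.T @ B, D - A))                     # True: L = B^T B = D - A
```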
Examples 7.1

(i) Consider the two networks illustrated in parts (a) and (b) of the accompanying figure. Ordering nodes from left to right and top to bottom, their Laplacian matrices are

$$\begin{bmatrix} 1&-1&0&0&0&0&0&0\\ -1&3&-1&0&0&-1&0&0\\ 0&-1&3&-1&0&0&-1&0\\ 0&0&-1&1&0&0&0&0\\ 0&0&0&0&1&-1&0&0\\ 0&-1&0&0&-1&3&-1&0\\ 0&0&-1&0&0&-1&3&-1\\ 0&0&0&0&0&0&-1&1 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 2&0&-1&0&-1&0&0\\ 0&1&-1&0&0&0&0\\ -1&-1&3&-1&0&0&0\\ 0&0&-1&4&-1&-1&-1\\ -1&0&0&-1&3&0&-1\\ 0&0&0&-1&0&2&-1\\ 0&0&0&-1&-1&-1&3 \end{bmatrix}.$$
(ii) The complete network, Kn has Laplacian nI – E. For the null network,
L = O. The Laplacian of the path graph Pn – 1 is an n × n matrix of the
form
$$\begin{bmatrix} 1 & -1 & 0 & \dots & 0\\ -1 & 2 & -1 & \ddots & \vdots\\ 0 & \ddots & \ddots & \ddots & 0\\ \vdots & \ddots & -1 & 2 & -1\\ 0 & \dots & 0 & -1 & 1 \end{bmatrix}.$$
For any network,

Le = (D – A)e = diag(Ae)e – Ae = Ae – Ae = 0,

so 0 is an eigenvalue of every graph Laplacian, with eigenvector e.
Problem 7.1
Suppose we order the eigenvalues of the graph Laplacian L of a simple network
G so that
λ1 ≥ λ2 ≥ · · · ≥ λ_{n–1} ≥ λn = 0.

Show that the following hold.

1. λ_{n–1} > 0 ⇐⇒ G is connected.
2. kmax ≤ λ1 ≤ 2kmax.
3. λ1 is bounded above by the maximum of the sum of degrees of adjacent nodes. That is,

$$\lambda_1 \le k_{\max}^{(ij)} = \max_{(i,j)\,:\,a_{ij}=1}\ (k_i + k_j).$$
There are ki nonzero terms in the sum and they are bounded above by the maximum degree of the nodes adjacent to i (call this k^{(i)}_max). Thus the largest point in the ith Gershgorin disc is ki + k^{(i)}_max. Taking the maximum over all i gives the desired bound.
Note that (7.1) gives the mean degree of the nodes adjacent to i. Thus we can improve our upper bound to

$$\lambda_1 \le \max_i\ (k_i + m_i),$$

where mi denotes the mean degree of the nodes adjacent to i.
The spectra of the adjacency matrix and the graph Laplacian share some
relations, but usually we can infer different information from each.
Examples 7.2
(iv) As with the adjacency matrix, we can use the eigenvalues and eigenvectors of the Laplacian of the cycle graph C2n to find the spectrum of the path graph (although the details differ). First observe that grouping together pairs of nodes in C2n gives us something that looks very like Pn–1 (see Figure 7.2).
Given a repeated eigenvalue λ = 2 – ω – ω–1 (ω2n = 1) of the Lapla-
cian, LC , of C2n we can form a linear combination of the eigenvectors
we gave in Chapter 6 to give another eigenvector. In particular,
$$z = \omega\begin{bmatrix} 1\\ \omega\\ \omega^2\\ \vdots\\ \omega^{2n-1} \end{bmatrix} + \begin{bmatrix} 1\\ \omega^{2n-1}\\ \omega^{2n-2}\\ \vdots\\ \omega \end{bmatrix} = \begin{bmatrix} \omega + 1\\ \omega^2 + \omega^{2n-1}\\ \vdots\\ \omega^{2n-1} + \omega^2\\ 1 + \omega \end{bmatrix}.$$

Figure 7.2 Pairing the nodes in the cycle graph C2n gives the path graph Pn–1. (Technically, we are treating Pn–1 as a quotient graph of C2n; the interested reader can find many more details in a textbook devoted to graph theory.)
The Laplacian of the complete bipartite network K_{mn} can be permuted to the form

$$L = \begin{bmatrix} nI_m & -E_{mn}\\ -E_{mn}^T & mI_n \end{bmatrix},$$

where E_{mn} is an m × n matrix of all ones. We can find its spectrum with a similar technique to that used to find the spectrum of its adjacency matrix.

Let $\begin{bmatrix} x\\ y \end{bmatrix}$ be an eigenvector for the eigenvalue λ. Then $L\begin{bmatrix} x\\ y \end{bmatrix} = \lambda\begin{bmatrix} x\\ y \end{bmatrix}$ gives

nx – E_{mn}y = λx and –E_{mn}^T x + my = λy.
Comparing the entries of the eigenvector x corresponding to the second smallest Laplacian eigenvalue with a threshold r splits the nodes into two clusters of the network. Choosing r to be the median value of x ensures that the clusters are evenly sized. Another popular choice of r is 0. Spectral clustering is one of many ways to divide a network into pieces. The vector x is known as the Fiedler vector.
We will consider a variety of methods for partitioning a network in Chapter 21.
Some of these will exploit spectral information but there are a variety of other
techniques, too.
Example 7.3
Figure 7.3 shows a network we saw in Chapter 2. The nodes have been
labelled to highlight the clustering given by the Fiedler vector.
The first five elements of the Fiedler vector have the opposite sign to the
others, suggesting one particular partition. Assuming x1 > 0, we find that
x1 = x2 > x3 > x4 > x5 > x6 > x7 > x10 > x8 > x13 > x9 > x12 > x11 .
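A minimal sketch of spectral bisection with the Fiedler vector (ours, not the book's; it assumes networkx and numpy, and the barbell graph is simply an illustrative choice):

```python
import networkx as nx
import numpy as np

G = nx.barbell_graph(5, 1)    # two dense clusters joined by a path node
L = nx.laplacian_matrix(G).toarray().astype(float)

vals, vecs = np.linalg.eigh(L)
fiedler = vecs[:, 1]          # eigenvector of the second smallest eigenvalue

# Split the nodes according to the signs of the Fiedler vector entries.
cluster1 = [v for v in G.nodes if fiedler[v] < 0]
cluster2 = [v for v in G.nodes if fiedler[v] >= 0]
print(cluster1, cluster2)     # the two bells end up in different clusters
```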
..................................................................................................
FURTHER READING

8 Classical Physics Analogies

8.1 Motivation
Analogies and metaphors are very useful in any branch of science. Complex
networks are already an abstraction of the connectivity patterns observed in real-
world complex systems in which we reduce complex entities to single nodes and
their complex relationships to the links of the network. It is very natural, therefore,
to use some physical analogies to study these networks so that we can use familiar
physical and mathematical concepts to understand the structure and dynamics
of networks; and we can use physical intuition about the abstract mathematical
concepts to study the structure/dynamics of networks.
In this chapter we focus on analogies based on classical physics. The first cor-
responds to the use of mass–spring systems in classical mechanics and the second
uses electrical circuits. In both cases we show how to gain intuition into the ana-
lysis of networks as well as how to import techniques from physics, such as the
resistance distance, which can be useful for the study of networks.
For instance, we understand mathematically the Fiedler vector of the Laplacian
matrix. Now if we observe that it is analogous to a vibrational mode of the nodes of
a mass–spring network, we can develop a physical picture of how this eigenvector
splits the nodes of the network into two clusters: one corresponding to the nodes
vibrating in one direction and the other to the nodes vibrating in the opposite
direction for a certain natural frequency. We fill in the details in Section 8.2.
8.2 Classical mechanical analogies

Newton's second law states that the force acting on a particle is F = mẍ(t), where ẍ(t) is the second derivative of the position with respect to time, i.e. the acceleration of the particle.
The state of a system with n degrees of freedom is fully determined by n
coordinates xi (t) and n velocities ẋi (t) for i = 1, . . . , n, and the system is described
by the Lagrangian L(x, ẋ, t). The Lagrangian function for a dynamical system is
its kinetic energy minus the potential function from which the generalized force
components are determined. That is, L = T – V , where T is the kinetic and V the
potential energies of the system. For a system with n particles
$$T = \frac{1}{2}\sum_i m_i \dot{x}_i^2, \qquad (8.2)$$

and

$$V = \frac{1}{2}\sum_{j,i} k_{ij}(x_j - x_i)^2, \qquad (8.3)$$

where the last summation is over all pairs of particles interacting with each other. Thus,

$$L = \frac{1}{2}\left[ \sum_i m_i \dot{x}_i^2 - \sum_{j,i} k_{ij}(x_j - x_i)^2 \right]. \qquad (8.4)$$

The generalized momenta are given by

$$p_i(t) = \frac{\partial L(x, \dot{x}, t)}{\partial \dot{x}_i}, \qquad (8.5)$$

and the Hamiltonian of the system is

$$H = \sum_{i=1}^{n} \dot{x}_i\frac{\partial L(x, \dot{x}, t)}{\partial \dot{x}_i} - L(x, \dot{x}, t) = \sum_{i=1}^{n} \dot{x}_i p_i - L(x, \dot{x}, t). \qquad (8.6)$$

For the systems considered here,

H = T + V. (8.7)
The so-called phase space of the system {(x, p)} is formed by the 2n-tuples (x, p) = (x1, x2, . . . , xn, p1, p2, . . . , pn), in which a path of the particle system is determined by the Hamilton equations

$$\dot{x}_i(t) = \frac{\partial H(x,p,t)}{\partial p_i} = \{x_i, H\}, \qquad \dot{p}_i(t) = -\frac{\partial H(x,p,t)}{\partial x_i} = \{p_i, H\},$$

where the Poisson bracket is defined by

$$\{A, B\} = \sum_{i=1}^{n}\left( \frac{\partial A}{\partial x_i}\frac{\partial B}{\partial p_i} - \frac{\partial B}{\partial x_i}\frac{\partial A}{\partial p_i} \right). \qquad (8.8)$$
We are interested in systems with several interconnected particles. So, for the purpose of illustrating the connections between classical mechanics and network theory, consider a mass–spring system like the system of three masses and two springs illustrated in Figure 8.1.

Figure 8.1 A mass–spring system with masses mi, spring constants ki, and positions xi.

The kinetic and potential energy for this system can be written as

$$T = \frac{1}{2}m_1\dot{x}_1^2 + \frac{1}{2}m_2\dot{x}_2^2 + \frac{1}{2}m_3\dot{x}_3^2, \qquad V = \frac{1}{2}k_1(x_1 - x_2)^2 + \frac{1}{2}k_2(x_3 - x_2)^2,$$

so that

$$L = \frac{1}{2}\left[ m_1\dot{x}_1^2 + m_2\dot{x}_2^2 + m_3\dot{x}_3^2 - k_1(x_1 - x_2)^2 - k_2(x_3 - x_2)^2 \right]. \qquad (8.9)$$
The equations of motion follow from the Euler–Lagrange equations,

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{x}_i} - \frac{\partial L}{\partial x_i} = 0, \qquad (8.10)$$

and for our system
$$\frac{\partial L}{\partial \dot{x}_1} = m_1\dot{x}_1, \quad \frac{\partial L}{\partial \dot{x}_2} = m_2\dot{x}_2, \quad \frac{\partial L}{\partial \dot{x}_3} = m_3\dot{x}_3, \qquad (8.11)$$

and

$$\frac{\partial L}{\partial x_1} = -k_1(x_1 - x_2), \quad \frac{\partial L}{\partial x_2} = k_1(x_1 - x_2) + k_2(x_3 - x_2), \quad \frac{\partial L}{\partial x_3} = -k_2(x_3 - x_2). \qquad (8.12)$$
It is evident that $\frac{d}{dt}\frac{\partial L}{\partial \dot{x}_i} = m_i\ddot{x}_i$, which is just Fi according to Newton's equation. Hence, for the system of three masses and two springs,

$$\begin{aligned} m_1\ddot{x}_1 &= -k_1 x_1 + k_1 x_2,\\ m_2\ddot{x}_2 &= k_1 x_1 + (-k_1 - k_2)x_2 + k_2 x_3, \qquad (8.13)\\ m_3\ddot{x}_3 &= k_2 x_2 - k_2 x_3, \end{aligned}$$
or more concisely as

$$\begin{bmatrix} m_1\ddot{x}_1\\ m_2\ddot{x}_2\\ m_3\ddot{x}_3 \end{bmatrix} = -\begin{bmatrix} k_1 & -k_1 & 0\\ -k_1 & k_1 + k_2 & -k_2\\ 0 & -k_2 & k_2 \end{bmatrix}\begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix}.$$

Notice that the matrix on the right-hand side of the equation is just the Laplacian matrix for a weighted network having three nodes and two edges, i.e. a path with three nodes. If we consider a system with m1 = m2 = m3 = m and k1 = k2 = k, then

$$\begin{bmatrix} \ddot{x}_1\\ \ddot{x}_2\\ \ddot{x}_3 \end{bmatrix} = \begin{bmatrix} -k/m & k/m & 0\\ k/m & -2k/m & k/m\\ 0 & k/m & -k/m \end{bmatrix}\begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix}. \qquad (8.16)$$
This analogy can be very helpful, not least when we use the eigenvectors of the Laplacian to make partitions of the nodes in a network. Recall that the Fiedler vector partitions a network into two clusters, and we can think of this partition as the grouping of the nodes according to the vibrational mode of the network corresponding to the slowest, or fundamental, natural frequency of the system, μ2^{1/2} (where μ2 is the smallest nonzero Laplacian eigenvalue).
Problem 8.1
Write down the Hamiltonian for the mass–spring network illustrated in Figure 8.3.

Figure 8.3 A network with edges labelled by force constants ki.

Clearly,

$$T = \frac{1}{2}\left( \frac{p_1^2}{m_1} + \frac{p_2^2}{m_2} + \frac{p_3^2}{m_3} + \frac{p_4^2}{m_4} + \frac{p_5^2}{m_5} \right)$$

and

$$V = \frac{1}{2}\left[ k_1(x_2 - x_1)^2 + k_2(x_3 - x_2)^2 + k_3(x_4 - x_3)^2 + k_4(x_5 - x_4)^2 + k_5(x_5 - x_1)^2 + k_6(x_4 - x_2)^2 \right].$$

Letting L_K denote the Laplacian of the network weighted by the spring constants,

$$L_K = \begin{bmatrix} k_1 + k_5 & -k_1 & 0 & 0 & -k_5\\ -k_1 & k_1 + k_2 + k_6 & -k_2 & -k_6 & 0\\ 0 & -k_2 & k_2 + k_3 & -k_3 & 0\\ 0 & -k_6 & -k_3 & k_3 + k_4 + k_6 & -k_4\\ -k_5 & 0 & 0 & -k_4 & k_4 + k_5 \end{bmatrix}, \qquad x = \begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4\\ x_5 \end{bmatrix},$$

gives

$$H(x, p) = T + \frac{1}{2}\,x^T L_K x.$$
8.3 Networks as electrical circuits

In an electrical circuit, voltages are created by the currents flowing through the edges of the circuit and their respective resistances. According to Ohm's law the voltage is related to the current I and the resistance R according to the relation V = IR. The inverse of the resistance is known as the conductance of the corresponding edge.

For a circuit we can represent the voltages, currents, and resistances of all the edges by means of vectors. That is, let v be the vector representing the voltages at each node of a circuit. Then the current at each edge is given by the vector representation of Ohm's law,

i = R^{–1}Bv, (8.19)

where B is the incidence matrix and R is the diagonal matrix of edge resistances. Equivalently, in terms of the diagonal matrix of conductances C = R^{–1},

i = CBv. (8.20)

The currents injected into the nodes from outside the circuit satisfy

i_ext = B^T i, (8.21)

so that i_ext = B^T CBv = Lv, where L is the Laplacian of the network weighted by the conductances. Since L is singular we solve for the voltages using its Moore–Penrose pseudoinverse L^+,

v = L^+ i_ext. (8.23)
Let us now suppose that we want to calculate the voltage at each node induced when a current of 1 amp enters at node p and a current of –1 amp leaves node q, so that i_ext = e_p – e_q. Using (8.23) we have

$$v = \begin{bmatrix} (L^+_{p1} - L^+_{q1}) & (L^+_{p2} - L^+_{q2}) & \dots & (L^+_{pn} - L^+_{qn}) \end{bmatrix}^T. \qquad (8.25)$$
If we want the difference in voltage created at the nodes p and q, we only need

$$\Omega_{pq} = v_p - v_q = (L^+_{pp} - L^+_{qp}) - (L^+_{pq} - L^+_{qq}) = L^+_{pp} + L^+_{qq} - 2L^+_{pq}. \qquad (8.26)$$

Notice that L^+_{pq} = L^+_{qp} because the network is undirected.

The voltage difference Ω_pq represents the effective resistance between the two nodes p and q, which indicates the potential drop measured when a unit current is injected at node p and extracted at node q.
Problem 8.2
Show that the effective resistance is a Euclidean distance between a pair of nodes.
By definition, the effective resistance is given by

$$\Omega_{pq} = L^+_{pp} + L^+_{qq} - 2L^+_{pq}.$$

Writing the spectral decomposition of L in terms of its eigenvalues μj and orthonormal eigenvectors qj (the trivial eigenpair μ1 = 0 is excluded from the pseudoinverse),

$$L^+_{pq} = \sum_{j=2}^{n} \mu_j^{-1} q_j(p)\, q_j(q),$$

so that

$$\Omega_{pq} = \sum_{j=2}^{n} \frac{1}{\mu_j}\left[ q_j(p) - q_j(q) \right]^2,$$

where

$$q_r = \begin{bmatrix} q_2(r) & q_3(r) & \dots & q_n(r) \end{bmatrix}^T$$

and M is a diagonal matrix of the eigenvalues of the Laplacian with the trivial one replaced by a nonzero. Hence

$$\Omega_{pq} = \left\| M^{-1/2}q_p - M^{-1/2}q_q \right\|^2,$$

i.e. Ω_pq is the square of the Euclidean distance between the points M^{–1/2}q_p and M^{–1/2}q_q.
The resistance distance between all pairs of nodes in the network can be represented in matrix form, namely as the resistance matrix Ω. This matrix can be written as

Ω = u e^T + e u^T – 2L^+,

where u is the vector containing the diagonal entries of L^+.
Example 8.1
Let us compare the shortest path distance and the resistance distance for the
network illustrated in Figure 8.3. These are
$$D = \begin{bmatrix} 0 & 1 & 2 & 2 & 1\\ & 0 & 1 & 1 & 2\\ & & 0 & 1 & 2\\ & & & 0 & 1\\ & & & & 0 \end{bmatrix} \quad\text{and}\quad \Omega = \begin{bmatrix} 0 & 0.727 & 1.182 & 0.909 & 0.727\\ & 0 & 0.636 & 0.545 & 0.909\\ & & 0 & 0.636 & 1.182\\ & & & 0 & 0.727\\ & & & & 0 \end{bmatrix}$$

(only the upper triangles are shown; both matrices are symmetric).
Although the pairs (1, 3) and (1, 4) are the same distance apart according
to the shortest path distance, the second pair is closer according to the resist-
ance distance. The reason is that 1 and 4 are part of a square in the network
while the smallest cycle 1 and 3 are part of is a pentagon. In fact, the short-
est resistance distance is between the nodes 2 and 4, which is the only pair
of nodes which are part of a triangle and a square at the same time. Thus,
the resistance distance appears to take into account not only the length of the
shortest path connecting two nodes, but also the cycles involving them.
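The resistance matrix is straightforward to compute from the pseudoinverse. The sketch below (ours, not the book's; it assumes numpy, and the edge list encodes the network of Figure 8.3 with nodes 1–5 as used in Problem 8.1) reproduces Ω.

```python
import numpy as np

edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1), (2, 4)]
n = 5
L = np.zeros((n, n))
for i, j in edges:                         # unit conductances
    L[i-1, i-1] += 1; L[j-1, j-1] += 1
    L[i-1, j-1] -= 1; L[j-1, i-1] -= 1

Lp = np.linalg.pinv(L)                     # Moore-Penrose pseudoinverse
u = np.diag(Lp)
Omega = u[:, None] + u[None, :] - 2 * Lp   # resistance matrix
print(np.round(Omega, 3))                  # matches the table above
```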
..................................................................................................
FURTHER READING
Doyle, P.G. and Snell, J.L., Random Walks and Electric Networks, John Wiley and
Sons, 1985.
Estrada, E. and Hatano, N., A vibrational approach to node centrality and
vulnerability in complex networks, Physica A 389:3648–3660, 2010.
Klein, D.J. and Randić, M., Resistance distance, Journal of Mathematical Chem-
istry 12:81–95, 1993.
Susskind, L. and Hrabovsky, G., Classical Mechanics: The Theoretical Minimum,
Penguin, 2014.
9 Degree Distributions

In this chapter

We start by introducing the concept of degree distribution. We analyse some of the most common degree distributions found in complex networks, such as the Poisson, exponential, and power-law degree distributions. We explore some of the main problems found when fitting real-world data to certain kinds of distributions.
9.1 Motivation
The study of degree distributions is particularly suited to the analysis of complex
networks. This kind of statistical analysis of networks is inappropriate for the
small graphs typically studied in graph theory. The aim is to find the best fit
for the probability distribution of the node degrees in a given network. From
a simple inspection of adjacency matrices of networks one can infer that there
are important differences in the way degrees are distributed. In this chapter we
introduce the tools which allow us to analyse these distributions in more detail.
In an Erdös–Rényi random network with linking probability p, each node degree follows a binomial distribution. For large n (and np = k̄ constant) one finds in the limit that the node degree follows a Poisson distribution,

$$\Pr(\deg(v) = k) \to \frac{\bar{k}^k e^{-\bar{k}}}{k!}.$$

Other networks display an exponential degree distribution of the form

Pr(deg(v) = k) = Ae^{–k/k̄}.
Many real networks can be found that have degree distributions similar to
those illustrated. But there is another distribution one frequently sees that deserves
attention. That is the power-law distribution
Pr(deg(v) = k) = Bk–γ .
One can show that such a distribution emerges in a random graph where as
new nodes are added they attach preferentially to nodes with high degree. This is
a model of popularity, and is often the predicted behaviour one expects to see in
social networks.
If a large network has a power-law degree distribution then one can expect to
see many end vertices and other nodes of low degree and a handful of very well
connected nodes.
In Figure 9.1 we illustrate some common distributions found in complex
networks.
Figure 9.1 Common examples of degree distributions found in complex networks, including (a) the Poisson distribution p(k) = e^{–k̄}k̄^k/k! and (b) the Gaussian distribution p(k) = e^{–(k–k̄)²/(2σ_k²)}/(√(2π)σ_k).

Taking logarithms of a power-law distribution p(k) = Ak^{–γ} gives
ln p(k) = –γ ln k + ln A,

where –γ is the slope and ln A the intercept of the function. Scaling the degree by a constant factor c only alters the intercept while the slope is preserved:

ln p(ck) = –γ ln k + ln A – γ ln c.

The existence of a scaling law in a system means that the phenomenon under study will reproduce itself on different time and/or space scales. That is, it has self-similarity.

In network theory we often use the cumulative quantity

$$P(k) = \sum_{k' \ge k} p(k') = 1 - F(k) + p(k), \qquad (9.2)$$

where F(k) = Σ_{k′ ≤ k} p(k′) is the cumulative distribution function. For a power-law distribution we can approximate the sum by an integral,

$$P(k) = \int_k^{\infty} A k'^{-\gamma}\, dk' = Bk^{-\gamma+1},$$

where B = A/(γ – 1).
Problem 9.1
A representation of the internet as an autonomous system gives a network with
4,885 nodes. The CDF of its degree distribution is illustrated in Figure 9.4 on a
log–log plot. The slope of the best fitting straight line is –1.20716 and the intercept
is –0.11811.
Figure 9.4 The CDF of an internet's degree distribution.
Recalling that the cumulative degree distribution P(k) represents the probability of finding a node with degree at least k, we note that

P(k) = 10^{–0.11811} k^{–1.20716} ≈ 0.7619 k^{–1.20716}.

The number of nodes with degree at least one is nP(1) (since P(k) = n(k)/n, where n(k) is the number of nodes with degree at least k), that is, about 4885 × 0.7619 ≈ 3,722 nodes.

Assuming there is only one node with degree equal to kmax, its degree will satisfy the equation nP(kmax) = 1. So

0.7619 k_max^{–1.20716} = n^{–1},

or

$$\log k_{\max} = \frac{\log(0.7619\, n)}{1.20716},$$

giving kmax ≈ 908.
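The arithmetic is easy to check; a tiny sketch (ours, not the book's) evaluates the estimate.

```python
import math

n = 4885
gamma_slope = 1.20716
A = 10 ** (-0.11811)              # intercept on the log-log plot, ~0.7619

kmax = (A * n) ** (1 / gamma_slope)
print(round(A, 4), round(kmax))   # 0.7619 and a maximal degree of about 908
```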
..................................................................................................
FURTHER READING

10 Clustering Coefficients of Networks

10.1 Motivation
Many real-world networks are characterized by the presence of a relatively large number of triangles. This characteristic feature of a network is a general consequence of high transitivity. For instance, in a social network it is highly probable that if Bob and Phil are both friends of Joe then they will eventually be introduced to each other by Joe, closing a transitive relation, i.e. forming a triangle. A natural relative measure compares the proportion of triangles existing in a network with the potential number of triangles it can support given the degrees of its nodes. In this chapter we study two methods of quantifying this property of a network, known as its clustering coefficient.
If node i has degree ki and takes part in ti triangles, its local clustering coefficient is

$$C_i = \frac{t_i}{k_i(k_i-1)/2} = \frac{2t_i}{k_i(k_i-1)}, \qquad (10.2)$$

the fraction of pairs of neighbours of i which are themselves adjacent. The average (Watts–Strogatz) clustering coefficient of the network is

$$\bar{C} = \frac{1}{n}\sum_i C_i. \qquad (10.3)$$
Example 10.1

Consider the network illustrated in Figure 10.1: a square on nodes 1–4 with the diagonal edge {1, 3}, and pendant nodes 5–8 attached to nodes 1–4, respectively.

Nodes 1 and 3 are equivalent. They both take part in two triangles and their degree is 4. Thus,

$$C_1 = C_3 = \frac{2\cdot 2}{4\cdot 3} = \frac{1}{3}. \qquad (10.4)$$

Nodes 2 and 4 each have degree 3 and take part in one triangle, so

$$C_2 = C_4 = \frac{1}{3}. \qquad (10.5)$$

Notice that because nodes 5–8 are not involved in any triangle we have Ci = 0 for i ≥ 5. Consequently,

$$\bar{C} = \frac{1}{8}\left( 4\cdot\frac{1}{3} \right) = \frac{1}{6}. \qquad (10.6)$$
An alternative measure, known as the Newman clustering coefficient or network transitivity, compares the number of triangles with the number of paths of length two in the whole network:

$$C = \frac{3t}{|P_2|} = \frac{3|C_3|}{|P_2|}, \qquad (10.7)$$

where t = |C3| is the number of triangles and |P2| the number of paths of length two.
Example 10.2
Consider again the network illustrated in Figure 10.1. We can obtain the
number of triangles in that network by using the spectral properties of the
adjacency matrix. That is,
$$t = \frac{1}{6}\operatorname{tr}(A^3) = 2. \qquad (10.8)$$
The number of paths of length two in the network can be obtained using the following formula (which we will justify in Chapter 13):

$$|P_2| = \sum_{i=1}^{n}\binom{k_i}{2} = \sum_{i=1}^{n}\frac{k_i(k_i-1)}{2} = 18. \qquad (10.9)$$
Thus,

$$C = \frac{3\times 2}{18} = \frac{1}{3}. \qquad (10.10)$$
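Assuming the layout of Figure 10.1 described above, both coefficients can be checked with networkx (a sketch of ours, not from the book):

```python
import networkx as nx

# Square 1-2-3-4 with diagonal 1-3, plus pendant nodes 5-8.
G = nx.Graph([(1, 2), (2, 3), (3, 4), (4, 1), (1, 3),
              (1, 5), (2, 6), (3, 7), (4, 8)])

print(nx.average_clustering(G))   # 1/6 ~ 0.1667
print(nx.transitivity(G))         # 1/3 ~ 0.3333
```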
Problem 10.1

Consider the network illustrated in Figure 10.2, formed by triangles joined at a central node (the dashed line indicates the existence of other triangles). Obtain an expression for the average clustering, C̄, and the network transitivity, C, in terms of the number of nodes n. Analyse your results as n → ∞.

To answer this problem, observe that there are two types of nodes in the network, which will be designated as i and j (see Figure 10.3, a labelling of Figure 10.2 to indicate nodes with different properties). There is one node of type i (the central node) and n – 1 nodes of type j.
The average clustering coefficient is then

$$\bar{C} = \frac{C_i + (n-1)C_j}{n}. \qquad (10.11)$$

Evidently, Cj = 1 and

$$C_i = \frac{2t}{k_i(k_i-1)}, \qquad (10.12)$$

where t is the number of triangles in the network (note that node i is involved in all of them) and ki is the degree of that node. It is easy to see that ki = 2t = n – 1. Then,

$$C_i = \frac{2t}{2t(2t-1)} = \frac{1}{2t-1} = \frac{1}{n-2} \qquad (10.13)$$
and

$$\bar{C} = \frac{C_i + (n-1)C_j}{n} = \frac{\frac{1}{n-2} + (n-1)\cdot 1}{n} = \frac{1}{n(n-2)} + \frac{n-1}{n}. \qquad (10.14)$$
" kt #
0.9
0.8 1 (n – 1) 2t(2t – 1)
|P2 | = = kt (kt – 1) = (2 × 1) +
0.7
t
2 2 t 2 2
0.6
= 2t + t(2t – 1) = t(2t + 1). (10.15)
0.5
C
0.4
Thus,
0.3
0.2 3t 3t 3 3
0.1 C= = = = . (10.16)
|P2 | t(2t + 1) 2t + 1 n
0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7
C As the number of nodes tends to infinity we have:
This indicates that the indices are accounting for different structural charac-
teristics of a network.
..................................................................................................
FURTHER READING
Estrada, E., The Structure of Complex Networks: Theory and Applications, Oxford
University Press, 2011, Chapter 4.5.1.
Newman, M.E.J., Networks: An Introduction, Oxford University Press, 2010,
Chapter 7.9.
11 Random Models of Networks

In this chapter

We introduce simple and general models for generating random networks: the Erdös–Rényi model, the Barabási–Albert model, and the Watts–Strogatz model. We study some of the general properties of the networks generated by using these models, such as their densities, average path lengths, and clustering coefficients.
11.1 Motivation
Every time that we look at a real-world network and analyse its most important topological properties it is worth considering how that network was created. In other words, we have to figure out what the mechanisms are behind the evolution of a group of nodes and links which give rise to the topological structure we observe.
Intuitively we can think about a model in which pairs of nodes are connected
with some probability. That is, if we start with a collection of n nodes and for
each of the n(n – 1)/2 possible links, we connect a pair of nodes u, v with certain
probability pu,v . Then, if we consider a set of network parameters to be fixed
and allow the links to be created by a random process, we can create models
that permit us to understand the influence of these parameters on the structure
of networks. Here we study some of the better known models that employ such
mechanisms.
6. When k̄ > 1, most of the nodes are gathered in a giant connected component, while the rest of the nodes are isolated in very small components. In Figure 11.3 we illustrate the change of the size of the main connected component in an ER random network with 1,000 nodes, as a function of the linking probability.

Figure 11.3 Connectivity of Erdös–Rényi random networks.

7. The structure of GER(n, p) changes as a function of p = k̄/(n – 1), giving rise to the following three stages.

(a) Subcritical, k̄ < 1, where all components are simple and very small. The size of the largest component is S = O(ln n).

(b) Critical, k̄ = 1, where the size of the largest component is S = O(n^{2/3}).

(c) Supercritical, k̄ > 1, where the probability that (f – ε)n < S < (f + ε)n is 1 when n → ∞, ε > 0, and where f = f(k̄) is the positive solution of the equation e^{–k̄f} = 1 – f. The rest of the components are very small, with the second largest having size about ln n.

In Figure 11.4 we illustrate this behaviour for an ER random network with 100 nodes and different linking probabilities; the nodes in the largest connected component are drawn in a darker shade. A simulation sketch is given after this list.
8. The largest eigenvalue of the adjacency matrix in an ER network grows proportionally to n, so that

$$\lim_{n\to\infty}\frac{\lambda_1(A)}{n} = p.$$

9. The second largest eigenvalue grows more slowly than λ1. In fact,

$$\lim_{n\to\infty}\frac{\lambda_2(A)}{n^{\varepsilon}} = 0$$

for every ε > 0.5.

10. The most negative eigenvalue grows in a similar way to λ2(A). Namely,

$$\lim_{n\to\infty}\frac{\lambda_n(A)}{n^{\varepsilon}} = 0$$

for every ε > 0.5.
11. The spectral density of an ER random network approaches Wigner's semicircle distribution,

$$\rho(\lambda) = \frac{\sqrt{4np(1-p) - \lambda^2}}{2\pi np(1-p)}, \qquad |\lambda| \le 2\sqrt{np(1-p)},$$

and zero elsewhere, as illustrated in Figure 11.5.

Figure 11.5 Spectral density for a network generated with the ER model.
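As promised above, here is a small simulation sketch (ours, not the book's; it assumes networkx) illustrating the three regimes of point 7. The exact component sizes will vary from run to run.

```python
import networkx as nx

n = 1000
for kbar in [0.5, 1.0, 2.0]:               # sub-, near-, and supercritical
    G = nx.erdos_renyi_graph(n, kbar / (n - 1), seed=7)
    giant = max(nx.connected_components(G), key=len)
    # For kbar = 2 the solution of exp(-2f) = 1 - f gives f ~ 0.797,
    # so the giant component should hold roughly 800 of the 1000 nodes.
    print(kbar, len(giant))
```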
11.3 The Barabási–Albert model

The ER model generates networks with Poisson degree distributions. However, it has been empirically observed that many networks in the real world have a fat-tailed degree distribution of some kind, which varies greatly from the distribution observed for ER random networks. A simple model to generate networks in which the probability of finding a node of degree k decays as a power law of the degree
the probability of finding a node of degree k decays as a power law of the degree
was put forward by Barabási and Albert in 1999. We initialize with a small network
with m0 nodes. At each step we add a new node u to the network and connect it
to m ≤ m0 of the existing nodes v ∈ V . The probability of attaching node u to
node v is proportional to the degree of v. That is, we are more likely to attach new
nodes to existing nodes with high degree. This process is known as preferential
attachment.
We can assume that our initial random network is connected and of ER type
with m0 nodes, GER = (V , E). In this case the Barabási–Albert (BA) algorithm
can be understood as a process in which small inhomogeneities in the degree
distribution of the ER network grow in time. A typical BA network is illustrated
in Figure 11.6.
Networks generated by this model have several global properties.

1. The degree distribution follows a power law,

$$p(k) = \frac{2d(d-1)}{k(k+1)(k+2)} \approx k^{-3}, \qquad (11.2)$$

as illustrated in Figure 11.8 and in the simulation sketch below.
Figure 11.8 Cumulative degree distribution for a network generated with the BA model.
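A quick experiment (a sketch of ours, not from the book; it assumes networkx and numpy, and the parameter values are arbitrary) recovers an exponent close to –3 from a simulated BA network.

```python
import networkx as nx
import numpy as np

G = nx.barabasi_albert_graph(50_000, 3, seed=1)
degs = np.array([d for _, d in G.degree()])

# Estimate the power-law exponent from the log-log degree histogram.
ks = np.arange(3, 200)
pk = np.array([(degs == k).mean() for k in ks])
mask = pk > 0
slope = np.polyfit(np.log(ks[mask]), np.log(pk[mask]), 1)[0]
print(slope)     # roughly -3 (the tail of the histogram is noisy)
```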
2. The average path length grows slightly more slowly than logarithmically,

$$\bar{l} = \frac{\ln n - \ln(d/2) - 1 - \gamma}{\ln\ln n + \ln(d/2)} + \frac{3}{2}, \qquad (11.3)$$

where γ ≈ 0.5772 is the Euler–Mascheroni constant.
Figure 11.9 (a) Comparison of the small-worldness of BA and ER networks. (b) Spectral density of a model BA network.
11.4 The Watts–Strogatz model

Milgram's celebrated small-world experiment, in which letters were passed between chains of acquaintances in an attempt to reach a designated target person, revealed two characteristic effects. First, the average number of steps needed for the letters to arrive at their target was around six. And second, there was a large amount of group inbreeding, which resulted in acquaintances of one individual feeding a letter back into his/her own circle, thus usually eliminating new contacts.

Although the ER model reproduces the first characteristic very well, i.e. that most nodes are separated by a very small average path length, it fails in reproducing the second. That is, the clustering coefficient in the ER network is very small in comparison with those observed in real-world systems. The model put forward by Watts and Strogatz in 1998 tries to sort out this situation.
First we form the circulant network with n nodes connected to k neighbours.
We then rewire some of its links: each of the original links has a probability p
(fixed beforehand) of having one of its end points moved to a new randomly
chosen node. If p is too high, meaning almost all links are random, we approach
the ER model.
The general process is illustrated in Figure 11.10. On the left is a circulant
graph and on the right is a random ER network. Somewhere in the middle are the
so-called ‘small-world’ networks.
In Figure 11.11 we illustrate the rewiring process, which is the basis of the
Watts–Strogatz (WS) model for small-world networks. Starting from a regular
circulant network with n = 20, k = 6 links are rewired with different choices of
probability p.
Networks generated by the WS model have several general properties, listed below.

1. The average clustering coefficient is given by

$$\bar{C} = \frac{3(k-2)}{4(k-1)}.$$

For large values of k, C̄ approaches 0.75.

2. The average path length decays very quickly from that of a circulant graph,

$$\bar{l} = \frac{(n-1)(n+k-1)}{2kn}, \qquad (11.5)$$
to approach that of a random network. In Figure 11.12 we illustrate the effect of changing the rewiring probability on both the average path length and the clustering coefficient.

Figure 11.12 The relative clustering coefficient Cp/C0 and the relative average path length lp/l0 as functions of the rewiring probability.

Problem 11.1

Let GER(n, p) be an Erdös–Rényi random network with n nodes and linking probability p. Use known facts about the spectra of ER random networks to show that if k̄ is the average degree of this network then the expected number of triangles tends to k̄³/6 as n → ∞.

For any graph the number of triangles is given by

$$t = \frac{1}{6}\operatorname{tr}(A^3) = \frac{1}{6}\sum_{j=1}^{n}\lambda_j^3. \qquad (11.6)$$

For an ER network λ1 grows like np, and, since |λi| ≤ max{|λ2|, |λn|} for i ≥ 2, with both λ2 and |λn| growing more slowly than n^ε for every ε > 0.5, the contribution of the remaining eigenvalues to (11.6) is negligible in the limit. Hence, since k̄ = (n – 1)p ~ np,

$$\lim_{n\to\infty} t = \frac{1}{6}\lambda_1^3 = \frac{1}{6}(np)^3 = \frac{\bar{k}^3}{6}. \qquad (11.8)$$
Problem 11.2

The data shown in Table 11.1 belong to a network having n = 1,000 nodes and m = 4,000 links. The network does not have any node with k ≤ 3. Let n(k) be the number of nodes with degree k. Determine whether this network was generated by the BA model.

Table 11.1 Degree frequencies in an example network (selected values).

k : 4, 5, 10, 20
n(k) : 343, 196, 23, 3

The probability that a node chosen at random has a given degree is shown in Table 11.2.

Table 11.2 Probability distribution of degrees in an example network (selected values).

k : 4, 5, 10, 20
p(k) : 0.343, 0.196, 0.023, 0.003

A sketch of the plot of k against p(k) in Figure 11.13 indicates that there is a fast decay of the probability with the degree, which is indicative of fat-tailed degree distributions, like the one produced by the BA model. If the network was generated with the BA model it has to have a PDF of the form p(k) ∼ k^{–3}, which means that ln p(k) ∼ –3 ln k + b. Given two degree values k1 and k2, the slope of a log–log plot is given by

$$\frac{\ln p(k_1) - \ln p(k_2)}{\ln k_1 - \ln k_2}.$$

Computing this for successive pairs of degrees in Table 11.2 gives slopes of approximately –2.5, –3.1, and –2.9, all close to –3, so the data are consistent with the BA model.
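The slope calculation is easily automated; the sketch below (ours, not the book's) reproduces the estimates.

```python
import math

p = {4: 0.343, 5: 0.196, 10: 0.023, 20: 0.003}
ks = sorted(p)
for k1, k2 in zip(ks, ks[1:]):
    slope = (math.log(p[k2]) - math.log(p[k1])) / (math.log(k2) - math.log(k1))
    print(k1, k2, round(slope, 2))   # roughly -2.51, -3.09, -2.94
```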
Figure 11.13 Degree distribution in an illustrative network.

..................................................................................................
FURTHER READING
Barabási, A.-L. and Albert, R., Emergence of scaling in random networks, Science illustrative network
286:509–512, 1999.
Bollobás, B., Mathematical results on scale-free random graphs, in Bernholdt, S.
and Schuster, H.G. (eds.), Handbook of Graph and Networks: From the Genome
to the Internet, Wiley-VCH, 1–32, 2003.
Bollobás, B., Random Graphs, Cambridge University Press, 2001.
Watts, D.J., Strogatz, S.H., Collective dynamics of ‘small-world’ networks, Nature
393:440–442, 1998.
12 Matrix Functions

In this chapter

As we have seen, we can analyse networks by understanding properties of their adjacency matrices. We now introduce some tools for manipulating matrices which will assist in a more detailed analysis, namely functions of matrices. The three most significant in terms of networks will be matrix polynomials, the resolvent, and the matrix exponential, and we give a brief introduction to each. We then present elements of a unifying theory for functions of matrices and give examples of some familiar scalar functions in an n × n dimensional setting.
12.1 Motivation
Amongst one’s earliest experiences of mathematics is the application of the basic
operations of arithmetic to whole numbers. As we mature mathematically, we see
that it is natural to extend the domain of these operations. In particular, we have
already exploited the analogues of these operations to develop matrix algebra.
Now we look at some familiar functions as applied to matrices. We will exploit
these ideas later to better understand networks, for example to develop measures
of centrality in Chapter 15 and to measure global properties of networks in Chap-
ters 17 and 18. We start with some familiar ideas that will prove vital in defining
a comprehensive generalization.
Note that we will only consider functions of square matrices. Throughout this chapter, assume that A ∈ R^{n×n} unless otherwise stated.

12.2 Matrix powers
Example 12.1

Clearly, lim_{p→∞} A^p = O if and only if ρ(A) < 1, and if ρ(A) > 1 then the powers of A grow unboundedly.

If ρ(A) = 1 and A is simple then the limiting behaviour is trickier to pin down. While the powers do not grow explosively, lim_{p→∞} A^p exists if and only if the only eigenvalue of size one is one itself.

As implied by the following example, things are more complicated for defective matrices.
Examples 12.2

(i) Let $J = \begin{bmatrix} \lambda & 1\\ 0 & \lambda \end{bmatrix}$. Then

$$J^2 = \begin{bmatrix} \lambda^2 & 2\lambda\\ 0 & \lambda^2 \end{bmatrix},\quad J^3 = \begin{bmatrix} \lambda^3 & 3\lambda^2\\ 0 & \lambda^3 \end{bmatrix},\quad J^p = \begin{bmatrix} \lambda^p & p\lambda^{p-1}\\ 0 & \lambda^p \end{bmatrix}.$$

Note that if |λ| < 1 then lim_{p→∞} pλ^{p–1} = 0 and the powers converge. If |λ| ≥ 1 they diverge.

(ii) Let

$$J = \begin{bmatrix} 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\\ 0&0&0&0 \end{bmatrix}.\ \text{Then}\ J^2 = \begin{bmatrix} 0&0&1&0\\ 0&0&0&1\\ 0&0&0&0\\ 0&0&0&0 \end{bmatrix},\ J^3 = \begin{bmatrix} 0&0&0&1\\ 0&0&0&0\\ 0&0&0&0\\ 0&0&0&0 \end{bmatrix},\ J^4 = O.$$

Notice that ρ(J) = 0. If all the eigenvalues of a matrix A are zero then it is said to be nilpotent and we can show that A^p = O for p ≥ n.
Given a polynomial pm(x) = a0 + a1x + · · · + amx^m, we define the corresponding matrix polynomial

pm(A) = a0I + a1A + · · · + amA^m.
Example 12.3

(i) Let p(x) = 1 – x² and $A = \begin{bmatrix} 1 & 1\\ 0 & 2 \end{bmatrix}$. Then $p(A) = I - \begin{bmatrix} 1 & 3\\ 0 & 4 \end{bmatrix} = \begin{bmatrix} 0 & -3\\ 0 & -3 \end{bmatrix}$.

(ii) Recall that the characteristic polynomial of A is given by p(z) = det(A – zI). This is a degree n polynomial with the property that p(z) = 0 ⇐⇒ z ∈ σ(A). The Cayley–Hamilton theorem tells us that p(A) = O.
If p(x) and q(x) are polynomials then we can define the rational function r(x) = p(x)/q(x) so long as q(x) ≠ 0. It is natural to write

r(A) = p(A)q(A)^{–1},

where q(A)^{–1} is the inverse matrix of q(A). This is well defined so long as q(A) is nonsingular, that is, so long as no eigenvalue of A is a root of q.
Examples 12.4

(i) Recall the geometric series

$$1 + z + z^2 + \dots + z^p = \frac{1 - z^{p+1}}{1 - z}.$$

If |z| < 1 then

$$\sum_{i=0}^{\infty} z^i = (1-z)^{-1}.$$

The function f(z) = (1 – z)^{–1} is an analytic continuation of $\sum_{i=0}^{\infty} z^i$ to the punctured plane C – {1}.

(ii) Analogously, if ρ(A) < 1 then

$$\sum_{i=0}^{\infty} A^i = (I - A)^{-1},$$

and for a scalar s with ρ(sA) < 1,

$$s\sum_{i=0}^{\infty}(sA)^i = s(I - sA)^{-1} = (zI - A)^{-1},$$

where z = 1/s.

The matrix (zI – A)^{–1} is defined so long as (zI – A) has no zero eigenvalues, which is the case so long as z ∉ σ(A). Given a matrix A we call R(z) = (zI – A)^{–1} the resolvent of A.

12.3 The matrix exponential

The matrix exponential is defined by extending the Maclaurin series of e^z,

$$e^{zA} = \sum_{k=0}^{\infty}\frac{(zA)^k}{k!}, \qquad (12.1)$$

for all z ∈ C. By bounding the size of the terms in this series we can use the comparison test (extended to matrix series) to show that for any square matrix A and any finite z ∈ C, the series is convergent.
Examples 12.5

(i) $e^{zI} = \sum_{k=0}^{\infty}\frac{(zI)^k}{k!} = \left( \sum_{k=0}^{\infty}\frac{z^k}{k!} \right) I = e^z I.$

(ii) Let D = diag(λ1, λ2, . . . , λn). Then

$$e^D = \sum_{k=0}^{\infty}\frac{D^k}{k!} = \sum_{k=0}^{\infty}\frac{1}{k!}\begin{bmatrix} \lambda_1^k & & \\ & \ddots & \\ & & \lambda_n^k \end{bmatrix} = \begin{bmatrix} e^{\lambda_1} & & \\ & \ddots & \\ & & e^{\lambda_n} \end{bmatrix}.$$

(iii) If A is symmetric then so is A^k and from (12.1) one can see that e^A is symmetric, too. Again, from (12.1), $e^{A^T} = (e^A)^T$.
Theorem 12.1 Let A and B be square matrices of the same size and let s, t ∈ C. The following properties hold for matrix exponentials.

1. e^O = I.
2. $\frac{d}{dt}e^{tA} = Ae^{tA}$.
3. e^{(s+t)A} = e^{sA}e^{tA}.
4. e^{A+B} = e^Ae^B if and only if AB = BA.
5. e^{tA} is nonsingular and (e^{tA})^{–1} = e^{–tA}.
6. If B is nonsingular then Be^{tA}B^{–1} = e^{tBAB^{–1}}.
7. $e^A = \lim_{k\to\infty}(I + A/k)^k$.
8. e^{tA} = (e^A)^t.
Problem 12.1

Show that if AB = BA then e^{A+B} = e^Ae^B.

By a simple induction, AB^k = B^kA for any k, and hence from (12.1), Ae^{tB} = e^{tB}A. Similarly, Xe^{tY} = e^{tY}X for any combination of X and Y chosen from A, B, and A + B.

Now let F(t) = e^{t(A+B)}e^{–Bt}e^{–At}. By the product rule,

F′(t) = (A + B)e^{(A+B)t}e^{–Bt}e^{–At} + e^{(A+B)t}(–B)e^{–Bt}e^{–At} + e^{(A+B)t}e^{–Bt}(–A)e^{–At},

and by commutativity the right-hand side of this expression is zero. Thus F(t) is a constant matrix, and since F(0) = e^Oe^Oe^O = I we know that

e^{t(A+B)} = e^{tA}e^{tB}.

Setting t = 1 gives the result.
Examples 12.6

(i) Let $J = \begin{bmatrix} 0&1&0\\ 0&0&1\\ 0&0&0 \end{bmatrix}$. Then $J^2 = \begin{bmatrix} 0&0&1\\ 0&0&0\\ 0&0&0 \end{bmatrix}$ and J³ = O, so $e^J = \begin{bmatrix} 1&1&\tfrac12\\ 0&1&1\\ 0&0&1 \end{bmatrix}$.

(ii) If $A = \begin{bmatrix} \lambda&1&0\\ 0&\lambda&1\\ 0&0&\lambda \end{bmatrix}$ then A = λI + J (where J is taken from (i)). So

$$e^A = e^{\lambda I + J} = e^{\lambda I}e^J = e^{\lambda}\begin{bmatrix} 1&1&\tfrac12\\ 0&1&1\\ 0&0&1 \end{bmatrix}.$$

(iii) Let $A = \begin{bmatrix} 0&1\\ 0&1 \end{bmatrix}$. Since A^k = A for all k ≥ 1,

$$e^A = I + \sum_{k=1}^{\infty}\frac{A}{k!} = \begin{bmatrix} 1 & \sum_{k=1}^{\infty}1/k!\\ 0 & \sum_{k=0}^{\infty}1/k! \end{bmatrix} = \begin{bmatrix} 1 & e-1\\ 0 & e \end{bmatrix}.$$

Write A = D + F where $D = \begin{bmatrix} 0&0\\ 0&1 \end{bmatrix}$, $F = \begin{bmatrix} 0&1\\ 0&0 \end{bmatrix}$. Since D is diagonal and F² = O, $e^D = \begin{bmatrix} 1&0\\ 0&e \end{bmatrix}$ and $e^F = \begin{bmatrix} 1&1\\ 0&1 \end{bmatrix}$.

So $e^De^F = \begin{bmatrix} 1&1\\ 0&e \end{bmatrix}$ and $e^Fe^D = \begin{bmatrix} 1&e\\ 0&e \end{bmatrix}$. Neither of these equals e^A, since DF ≠ FD.
Examples 12.7

(i) $A = \begin{bmatrix} 5 & 3\\ -6 & -4 \end{bmatrix} = \begin{bmatrix} 1 & 1\\ -1 & -2 \end{bmatrix}\begin{bmatrix} 2 & 0\\ 0 & -1 \end{bmatrix}\begin{bmatrix} 1 & 1\\ -1 & -2 \end{bmatrix}^{-1}$, hence

$$e^A = \begin{bmatrix} 1 & 1\\ -1 & -2 \end{bmatrix}\begin{bmatrix} e^2 & 0\\ 0 & e^{-1} \end{bmatrix}\begin{bmatrix} 2 & 1\\ -1 & -1 \end{bmatrix} = \begin{bmatrix} 2e^2 - e^{-1} & e^2 - e^{-1}\\ 2e^{-1} - 2e^2 & 2e^{-1} - e^2 \end{bmatrix}$$

and $e^{tA} = \begin{bmatrix} 2e^{2t} - e^{-t} & e^{2t} - e^{-t}\\ 2e^{-t} - 2e^{2t} & 2e^{-t} - e^{2t} \end{bmatrix}$.

(ii) $B = \begin{bmatrix} -5 & 2\\ -15 & 6 \end{bmatrix} = \begin{bmatrix} 1 & 2\\ 3 & 5 \end{bmatrix}\begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix}\begin{bmatrix} 1 & 2\\ 3 & 5 \end{bmatrix}^{-1}$, hence

$$e^B = \begin{bmatrix} 1 & 2\\ 3 & 5 \end{bmatrix}\begin{bmatrix} e & 0\\ 0 & 1 \end{bmatrix}\begin{bmatrix} -5 & 2\\ 3 & -1 \end{bmatrix} = \begin{bmatrix} 6 - 5e & 2e - 2\\ 15 - 15e & 6e - 5 \end{bmatrix}.$$

(iii) $e^Ae^B = \begin{bmatrix} 2e^2 - e^{-1} & e^2 - e^{-1}\\ 2e^{-1} - 2e^2 & 2e^{-1} - e^2 \end{bmatrix}\begin{bmatrix} 6 - 5e & 2e - 2\\ 15 - 15e & 6e - 5 \end{bmatrix} = \begin{bmatrix} -290.36 & 128.93\\ 278.08 & -123.50 \end{bmatrix}$ (to two decimal places). We can show that

$$e^{A+B} = \begin{bmatrix} -1.759 & -0.931\\ 3.910 & -2.131 \end{bmatrix},$$

so e^{A+B} ≠ e^Ae^B, as expected since A and B do not commute.
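These numerical values are easy to reproduce; the sketch below (ours, not the book's, assuming numpy and scipy) does so.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[5, 3], [-6, -4]], dtype=float)
B = np.array([[-5, 2], [-15, 6]], dtype=float)

print(np.round(expm(A) @ expm(B), 2))   # [[-290.36, 128.93], [278.08, -123.5]]
print(np.round(expm(A + B), 3))         # [[-1.759, -0.931], [3.91, -2.131]]
```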
As is the case for matrix powers, the asymptotic behaviour of e^{tA} is linked directly to the spectrum of A. If A is simple then

$$e^{tA} = X\begin{bmatrix} e^{\lambda_1 t} & & \\ & \ddots & \\ & & e^{\lambda_n t} \end{bmatrix}X^{-1}, \qquad (12.2)$$

where the λi are the eigenvalues of A (and X contains the eigenvectors). We can infer that if A is simple then lim_{t→∞} e^{tA} = O if and only if all the eigenvalues of A lie to the left of the imaginary axis in the complex plane; that the limit diverges if any of the eigenvalues lie strictly to the right; and that things are more complicated when eigenvalues lie on (but not to the right of) the imaginary axis.
The matrix exponential has a number of roles in network theory. As an
example, consider the following.
Problem 12.2
Show that a network with adjacency matrix A is connected if and only if e A > 0.
If there is a walk of length p between nodes i and j then there is a nonzero in the (i, j)th element of A^p. If a network is connected then a nonzero must eventually appear in the sequence

I, I + A, I + A + A²/2!, I + A + A²/2! + A³/3!, . . .

in every position. Since every term in the exponential series is nonnegative, no entry can be cancelled out, and so e^A > 0. Conversely, if the network is disconnected then there are nodes i and j with no walk between them, so the (i, j)th entry of A^p is zero for every p and the (i, j)th entry of e^A is zero, too.
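The criterion is simple to test numerically; a sketch (ours, not the book's, assuming numpy and scipy) follows.

```python
import numpy as np
from scipy.linalg import expm

# Connected: a path on four nodes.
A1 = np.array([[0,1,0,0],[1,0,1,0],[0,1,0,1],[0,0,1,0]], dtype=float)
# Disconnected: two separate edges.
A2 = np.array([[0,1,0,0],[1,0,0,0],[0,0,0,1],[0,0,1,0]], dtype=float)

print(np.all(expm(A1) > 0))   # True:  the network is connected
print(np.all(expm(A2) > 0))   # False: the network is disconnected
```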
Examples 12.8

(i) Let $A = \begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix}$ and suppose X² = A. Since A is nilpotent then so is X. But then X² = O, a contradiction, so A has no square roots. Notice that A is one of the (infinitely many) solutions to the equation X² = O.

(ii) Suppose A = SDS^{–1} where D = diag(λ1, λ2, . . . , λn) and let E = diag(μ1, μ2, . . . , μn), where μi is one of the square roots of λi. Then

(SES^{–1})² = SE²S^{–1} = A.

If λi ≠ 0 then there are two choices for μi and by going through the various permutations we get (up to) 2^n different solutions to X² = A.

(iii) If X² = A then (–X)² = A, too.

(iv) The only two solutions to $X^2 = \begin{bmatrix} 1 & 1\\ 0 & 1 \end{bmatrix}$ are $X = \pm\begin{bmatrix} 1 & \tfrac12\\ 0 & 1 \end{bmatrix}$.
From the examples we see that the matrix equation X^2 = A exhibits an unexpectedly complicated behaviour compared to the scalar analogue, particularly for defective matrices. To define a matrix square root function we need to give a unique value to \sqrt{A}. Recall that the principal square root of z ∈ C, written \sqrt{z}, is the solution of ω^2 = z with smallest principal argument. If z is real and positive then so is \sqrt{z}.
If A is simple and has the factorization A = XDX^{-1} where D = diag(λ₁, …, λₙ) then the principal square root of A is the unique matrix

\sqrt{A} = X\,\mathrm{diag}\!\left(\sqrt{\lambda_1},\ldots,\sqrt{\lambda_n}\right)X^{-1}. (12.3)

If A is defective then the square root function \sqrt{A} is not defined. If all the eigenvalues of a simple matrix are real and positive then the same is true of its principal square root. For most applications, the principal square root is the right one to take. Having defined the principal square root, there is an obvious extension to pth roots by replacing \sqrt{\lambda_i} in (12.3) with \lambda_i^{1/p}.
Combining pth powers, qth roots, and the inverse, one can define a principal
rth power of a matrix for any r ∈ Q. For irrational values of r we need to make use
of the matrix exponential in a similar way to the extension of irrational powers of
scalars.
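The principal square root can be sketched directly from (12.3). The function name below is mine, and SciPy's sqrtm is used only as an independent check:

    import numpy as np
    from scipy.linalg import sqrtm

    def principal_sqrt(A):
        lam, X = np.linalg.eig(A)                 # A = X diag(lam) X^{-1}
        return X @ np.diag(np.sqrt(lam)) @ np.linalg.inv(X)

    A = np.array([[3.0, 1.0], [2.0, 2.0]])        # simple; eigenvalues 4 and 1
    S = principal_sqrt(A)
    print(np.allclose(S @ S, A))                  # True
    print(np.allclose(S, sqrtm(A)))               # True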
Example 12.9
If A is simple we can extend the last example using the Jordan decomposition to show that e^{pA} = \left(e^A\right)^p for rational values of p. We can extend the definition of matrix powers to irrational indices by saying A^t = e^{tX} where e^X = A.²

² But how do we find X?
If the function f has the power series expansion

f(z) = \sum_{k=0}^{\infty} a_kz^k,

then we can define

f(A) = \sum_{k=0}^{\infty} a_kA^k. (12.4)
There are other ways of extending the definition of scalar functions to matrices, too. We will give just one more. If you have studied complex analysis you may recall that Cauchy's integral formula tells us that if f is analytic inside some region R and a ∈ R then
f(a) = \frac{1}{2\pi i}\oint_C \frac{f(z)}{z-a}\,dz.

For a simple matrix A = XDX^{-1} the power series definition satisfies

Xf(D)X^{-1} = \sum_{k=0}^{\infty} a_kXD^kX^{-1} = \sum_{k=0}^{\infty} a_kA^k = f(A).
Problem 12.3
Show that if f (z) is analytic in a region R containing the spectrum of A then (12.5)
and (12.6) are equivalent.
If z ∉ σ(A) then (zI − A)^{-1} is well defined and (zI − A)^{-1} = X(zI − D)^{-1}X^{-1}. Note that (zI − D)^{-1} is diagonal with entries of the form 1/(z − λᵢ). So, since σ(A) ∩ C = ∅,

\frac{1}{2\pi i}\oint_C f(z)(zI-A)^{-1}\,dz = X\left(\frac{1}{2\pi i}\oint_C f(z)(zI-D)^{-1}\,dz\right)X^{-1}

= X\left(\frac{1}{2\pi i}\oint_C f(z)\,\mathrm{diag}\!\left(\frac{1}{z-\lambda_1},\ldots,\frac{1}{z-\lambda_n}\right)dz\right)X^{-1}

= X\,\mathrm{diag}\!\left(\frac{1}{2\pi i}\oint_C\frac{f(z)}{z-\lambda_1}\,dz,\ \ldots,\ \frac{1}{2\pi i}\oint_C\frac{f(z)}{z-\lambda_n}\,dz\right)X^{-1}

= X\,\mathrm{diag}\bigl(f(\lambda_1),\ldots,f(\lambda_n)\bigr)X^{-1}.
It can be shown using some basic tools of analysis that (12.4) and (12.6) are also equivalent for defective matrices.³ If J is a (p+1) × (p+1) defective Jordan block of the eigenvalue λ then one can show that if we define

f(J) = \begin{bmatrix} f(\lambda) & f'(\lambda) & \cdots & \dfrac{f^{(p)}(\lambda)}{p!}\\ & f(\lambda) & \ddots & \vdots\\ & & \ddots & f'(\lambda)\\ & & & f(\lambda)\end{bmatrix} (12.7)

then the consequent extension of (12.5) is equivalent to the other definitions, too.
Examples 12.10
(i) If f(z) = \sqrt{z} then f'(z) is undefined at z = 0. If J = \begin{bmatrix}0 & 1\\ 0 & 0\end{bmatrix} then (12.7) is undefined, there is no Maclaurin series for f(z), and the conditions for the Cauchy integral formula are not met because any contour enclosing σ(J) includes zero. Recall that J has no square root.

(ii) If A = O then we still cannot use (12.4) or (12.6) to define \sqrt{A}, but (12.5) works fine.
If f(z) is analytic at z ∈ σ(A) then the value of f(A) given by our equivalent definitions is called the primary value of f(A). Applications for values other than the primary are limited and we will not consider them further.

Using (12.4) we can establish a number of generic properties that any matrix function satisfies. If f is analytic on the spectrum of A then, for example, f(A^T) = f(A)^T and f(XAX^{-1}) = Xf(A)X^{-1}.

³ One uses the fact that simple matrices form a dense set amongst all square matrices.
Problem 12.4
Show that if AB = BA and f is analytic on the spectrum of A and B then f(A)A = Af(A), f(A)B = Bf(A), and f(A)f(B) = f(B)f(A).

f(A)A = \left(\sum_{m=0}^{\infty} a_mA^m\right)A = \sum_{m=0}^{\infty} a_mA^{m+1} = A\left(\sum_{m=0}^{\infty} a_mA^m\right) = Af(A).
12.5.1 log A
X is a logarithm of A if e^X = A. Suppose A = SDS^{-1} where D = diag(λ₁, …, λₙ). Since A is nonsingular, all the λⱼ are nonzero and we may take log A = S\,\mathrm{diag}(\log\lambda_1,\ldots,\log\lambda_n)S^{-1}, using the principal logarithms of the eigenvalues.
Examples 12.11
log(1 + z) = z - \frac{z^2}{2} + \frac{z^3}{3} - \cdots,

and so, for matrices whose spectral radius is less than one,

log(I + A) = A - \frac{1}{2}A^2 + \frac{1}{3}A^3 - \cdots.

Strictly speaking, we have not shown that each of the logs in this expression is the primary value, but there is a large class of matrices for which this is true.

(iii) Let X = log A. By Theorem 12.1, A^p = (e^X)^p = e^{pX}. Taking logs of each side gives

log A^p = p log A.
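A quick numerical sketch (code mine) of these logarithm identities, using a matrix with positive eigenvalues so that the principal values are real:

    import numpy as np
    from scipy.linalg import expm, logm

    A = np.array([[2.0, 1.0], [1.0, 2.0]])        # eigenvalues 3 and 1
    X = logm(A)
    print(np.allclose(expm(X), A))                # True: e^{log A} = A
    print(np.allclose(logm(A @ A), 2 * X))        # True: log A^2 = 2 log A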
We can use Euler's formula and write

\cos A = \frac{1}{2}\left(e^{iA} + e^{-iA}\right), \qquad \sin A = \frac{1}{2i}\left(e^{iA} - e^{-iA}\right).

We can use these identities to show that many trigonometric identities still hold when we use matrix arguments. For example, the addition formulae for cosine and sine are true for matrices so long as the matrices involved commute.
Example 12.12
As with their scalar equivalents, cos A and sin A arise naturally in the solution of (systems of) second-order ODEs. They are far removed from their original role in trigonometry!
\cosh A = \frac{1}{2}\left(e^A + e^{-A}\right), \qquad \sinh A = \frac{1}{2}\left(e^A - e^{-A}\right).
Examples 12.13
\sinh A = \frac{e^A - e^{-A}}{2} = \sum_{k=0}^{\infty}\frac{A^{2k+1}}{(2k+1)!}.

\cosh A = \sum_{k=0}^{\infty}\frac{A^{2k}}{(2k)!}.
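These identities are easy to test numerically; the sketch below (code mine) uses SciPy's matrix trigonometric functions:

    import numpy as np
    from scipy.linalg import cosm, sinm, sinhm, expm

    A = np.array([[1.0, 1.0], [0.0, 2.0]])
    # cos^2 A + sin^2 A = I, and sinh A = (e^A - e^{-A})/2.
    print(np.allclose(cosm(A) @ cosm(A) + sinm(A) @ sinm(A), np.eye(2)))  # True
    print(np.allclose(sinhm(A), (expm(A) - expm(-A)) / 2))                # True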
13 Fragment-based Measures

In this chapter
significance. We illustrate the concept by studying motifs in some real-world networks. We then outline mathematical methods to quantify the number of small subgraphs in networks analytically. We develop some general techniques that can be adapted to search for other fragments.
13.1 Motivation
In many real-life situations, we are able to identify small structural pieces of a
system which are responsible for certain functional properties of the whole sys-
tem. Biologists, chemists, and engineers usually isolate these small fragments of
the system to understand how they work and gain understanding of their roles in
the whole system. These kinds of structural fragments exist in complex networks.
In Chapter 11 we saw that triangles can indicate transitive relations in social net-
works. They also play a role in interactions between other entities in complex
systems. In this chapter we develop techniques to quantify some of the simplest
but most important fragments or subgraphs in networks. We also show how to
determine whether the presence of these fragments in a real-world network is just a manifestation of a random underlying process, or whether it signifies something more significant.
In network theory fragments are synonymous with subgraphs. Typical sub-
graphs are illustrated in Figure 13.1. In general, a subgraph can be formed
by one (connected subgraph) or more (disconnected subgraphs) connected
components, and they may be cyclic or acyclic.
[Figure 13.1: typical subgraphs found in networks.]

|S_{1,1}| = m = \frac{1}{2}\sum_{i=1}^{n}k_i. (13.1)
2 i=1
The next star subgraph is S1,2 . Copies of S1,2 in a network can be enumer-
ated by noting that they are formed from any two edges incident to a common
node. That is, |S1,2 | is equal to the number of times that the nodes attached to a
particular node can be combined in pairs. This is simply
n " #
1
n
ki
|S1,2 | = = ki (ki – 1). (13.2)
i=1
2 2 i=1
\mu_k = \operatorname{tr}(A^k) = \sum_{i=1}^{n}\lambda_i^k,
but we can rewrite the right-hand side of this expression in terms of particular
subgraphs. Before continuing, visualize what happens with a CW of length two.
Each such walk represents an edge. But in an undirected network there are two
closed walks along each edge (i, j), namely i → j → i and j → i → j. Thus,
μ2 = 2|S1,1 | = 2|P1 |.
Similarly, there are six CWs of length three around every triangle i, j,k since
we can start from any one of its three nodes and move either clockwise or anti-
clockwise: i → j → k → i; i → k → j → i; j → k → i → j; j → i → k → j;
k → i → j → k; k → j → i → k. So,
μ3 = 6|C3 |.
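A small numerical sketch (code mine) of these two moment identities:

    import numpy as np

    A = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 1],
                  [1, 1, 0, 0],
                  [0, 1, 0, 0]])       # one triangle plus a pendant edge
    lam = np.linalg.eigvalsh(A)
    print(round(np.sum(lam**2) / 2))   # 4 edges:     mu_2 = 2|S_{1,1}|
    print(round(np.sum(lam**3) / 6))   # 1 triangle:  mu_3 = 6|C_3|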
Things begin to get messy for longer CWs as there are several subgraphs as-
sociated with such walks. For example, a CW of length four can be generated by
moving along the same edge four times. This can be done in two ways
i → j → i → j → i and j → i → j → i → j.
We could also walk along two edges and then return to the origin in two ways,
i → j → k → j → i and k → j → i → j → k.
There are two ways of visiting two nearest neighbours before returning to the
origin,
j → i → j → k → j and j → k → j → i → j.
And finally, there are eight ways of completing a cycle of length four in a square
i, j, k, l since we can start from any node and go clockwise or anticlockwise. For
example, starting from node i gives
i → j → k → l → i and i → l → k → j → i.
Consequently, \mu_4 = 2|P_1| + 4|P_2| + 8|C_4|.
Problem 13.1
Let G be a regular network with n = 2r nodes of degree k and spectrum σ(G) = {[k]^1, [1]^{r-1}, [-1]^{r-1}, [-k]^1}. Find the number of triangles and squares in G.

The number of triangles is

t = \frac{1}{6}\operatorname{tr}(A^3) = \frac{1}{6}\sum_{i=1}^{n}\lambda_i^3 = \frac{1}{6}\left[k^3 + (-k)^3 + (r-1)\left(1^3 + (-1)^3\right)\right] = 0.
The number of squares is given by |C_4| = \mu_4/8 - |P_1|/4 - |P_2|/2. Since each node has degree k,

|P_1| = \frac{kn}{2} = kr

and

|P_2| = \sum_{i=1}^{n}\binom{k}{2} = \frac{nk(k-1)}{2} = rk(k-1). (13.6)

Given that

\mu_4 = \sum_{i=1}^{n}\lambda_i^4 = 2k^4 + 2(r-1), (13.7)

we conclude that

|C_4| = \frac{k^4 + (r-1)}{4} - \frac{rk(k-1)}{2} - \frac{rk}{4} = \frac{k^4 + r - 1 - 2rk(k-1) - rk}{4} = \frac{k^4 + r - rk(2k-1) - 1}{4}. (13.8)
Example 13.1
A network of the type described in Problem 13.1 is the cube Q₃, which has the spectrum

σ(G) = {[3]^1, [1]^3, [-1]^3, [-3]^1}.
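A sketch (code mine) confirming both counts for Q₃ with NetworkX, using k = 3 and r = 4 in (13.8), which gives |C₄| = 6 (the six faces of the cube):

    import networkx as nx
    import numpy as np

    G = nx.hypercube_graph(3)
    lam = np.linalg.eigvalsh(nx.to_numpy_array(G))
    t = np.sum(lam**3) / 6                 # triangles
    mu4 = np.sum(lam**4)
    P1, P2 = 12, 24                        # kr and rk(k-1) with k = 3, r = 4
    C4 = mu4 / 8 - P1 / 4 - P2 / 2
    print(round(t), round(C4))             # 0 6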
|T_{3,1}| = \sum_{k_i>2} t_i(k_i - 2), (13.9)

where t_i denotes the number of triangles containing node i.
Example 13.2
[Figure 13.3: the network used in this example.]
Using (13.9) and concentrating only on those nodes with degree larger than two we have

|T_{3,1}| = 2 \times (4-2) + 1 \times (3-2) + 2 \times (4-2) + 1 \times (3-2) = 10.
Let us add another edge and consider the fragment illustrated in Figure 13.4. This subgraph is known as the cricket graph, which we designate by Cr. Here again, we can use a technique that combines the calculation of the two subgraphs forming this fragment. That is, this fragment is characterized by a node i that is simultaneously part of a triangle and a star S_{1,2}. Using t_i as before, we consider nodes for which k_i > 3. If t_i > 0 then node i has k_i − 2 additional nodes which are attached to it. These k_i − 2 nodes can be combined in pairs, \binom{k_i-2}{2}, to form all the S_{1,2} subgraphs in which node i is involved.

[Figure 13.4: Illustration of the cricket graph.]

The number of crickets involving node i is then

|Cr_i| = t_i\binom{k_i-2}{2} (13.10)

and hence

|Cr| = \sum_{k_i\ge 4} t_i\binom{k_i-2}{2} = \frac{1}{2}\sum_{k_i\ge 4} t_i(k_i-2)(k_i-3). (13.11)
Example 13.3
|Cr| = \frac{1}{2}\bigl(2(4-2)(4-3) + 2(4-2)(4-3)\bigr) = 4. (13.12)
[Figure 13.5: Illustration of the four cricket subgraphs within the network in Figure 13.3.]
The diamond graph (D) is characterized by the existence of two connected nodes (1 and 3) which are also connected by two paths of length two (1-2-3 and 1-4-3). It is illustrated in Figure 13.6. To calculate the number of diamonds in a network we note that the number of walks of length two between two connected nodes is given by (A^2)_{ij}A_{ij}, and hence the number of pairs of paths of length two between two connected nodes i, j is given by

\binom{(A^2)_{ij}A_{ij}}{2}.

[Figure 13.6: The diamond graph.]

Summing over all pairs (and halving, since each edge is counted twice),

|D| = \frac{1}{2}\sum_{i,j}\binom{(A^2)_{ij}A_{ij}}{2} = \frac{1}{4}\sum_{i,j}(A^2)_{ij}A_{ij}\left[(A^2)_{ij}A_{ij} - 1\right]. (13.13)
Problem 13.2
Find an expression for |C₅|, the number of pentagons in a network.

A CW of length l = 2d + 1 necessarily visits only nodes in subgraphs containing at least one odd cycle. So a CW of length five can visit only the nodes of a triangle, C₃; a tadpole, T_{3,1}; or a pentagon, C₅. Hence

\mu_5 = a|C_3| + b|T_{3,1}| + c|C_5| (13.14)

and

|C_5| = \frac{1}{c}\left(\mu_5 - a|C_3| - b|T_{3,1}|\right). (13.15)
We have seen how to calculate |C₃| and |T_{3,1}|, hence our task is to determine the coefficients a, b, and c.

To find a we must enumerate all the CWs of length five in a triangle. This can be done by calculating tr(A_C^5) where A_C is the adjacency matrix of C₃. From Chapter 6 we know that the eigenvalues of C₃ are 2, −1, and −1, hence

a = \sum_j \lambda_j^5 = 2^5 + 2(-1)^5 = 30. (13.16)
To find b we can enumerate all the CWs of length five involving all the nodes of T_{3,1}. This is done in Figure 13.7 and we see that b = 10.

[Figure 13.7: the ten closed walks of length five in T_{3,1} that visit every node, e.g. i → j → k → l → j → i, i → j → l → k → j → i, j → k → l → j → i → j, j → l → k → j → i → j, j → i → j → l → k → j, k → l → j → i → j → k, k → j → i → j → l → k, l → j → i → j → k → l, ….]

We could also proceed in a similar way as for the triangle, but in the tadpole T_{3,1} not every CW of length five visits all the nodes of the fragment. That is, there are CWs of length five which only visit the nodes of the triangle in T_{3,1}. Thus,

b = \operatorname{tr}(A_T^5) - a (13.17)

where A_T is the adjacency matrix of T_{3,1}. Computing A_T^5 explicitly we find that tr(A_T^5) = 40 and, again,

b = 40 - 30 = 10.
Finally, to find c, note that for every node in C₅ there is one CW of length five in a clockwise direction and another anticlockwise, e.g. i → j → k → l → m → i and i → m → l → k → j → i. Thus, c = 10. Finally,

|C_5| = \frac{1}{10}\left(\mu_5 - 30|C_3| - 10|T_{3,1}|\right). (13.18)
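The whole recipe fits in a few lines; the sketch below (code mine) applies (13.18) to the pentagon itself, for which |C₅| = 1:

    import numpy as np

    n = 5
    A = np.zeros((n, n))
    for i in range(n):                            # build the cycle C5
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1

    lam = np.linalg.eigvalsh(A)
    mu5 = np.sum(lam**5)
    C3 = np.sum(lam**3) / 6                       # 0 here
    deg = A.sum(axis=0)
    A3 = np.linalg.matrix_power(A, 3)
    ti = np.array([A3[i, i] / 2 for i in range(n)])        # triangles at node i
    T31 = sum(ti[i] * (deg[i] - 2) for i in range(n) if deg[i] > 2)  # 0 here
    print(round((mu5 - 30 * C3 - 10 * T31) / 10))          # 1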
F2:  |F_2| = \frac{1}{6}\operatorname{tr}(A^3)

F3:  |F_3| = \sum_{(i,j)\in E}(k_i-1)(k_j-1) - 3|F_2|

F4:  |F_4| = \frac{1}{6}\sum_i k_i(k_i-1)(k_i-2)

F5:  |F_5| = \frac{1}{8}\left(\operatorname{tr}(A^4) - 4|F_1| - 2m\right)

F6:  |F_6| = \sum_{k_i>2} t_i(k_i-2)

F7:  |F_7| = \frac{1}{4}\sum_{i,j}(A^2)_{ij}A_{ij}\left[(A^2)_{ij}A_{ij} - 1\right]
F8:  |F_8| = \frac{1}{10}\left(\operatorname{tr}(A^5) - 30|F_2| - 10|F_6|\right)

F9:  |F_9| = \frac{1}{2}\sum_{k_i\ge 4} t_i(k_i-2)(k_i-3)
1 "(A2 )ij #
F10 |F10 | = (ki – 2) × – 2|F7 |
2 i i, j
2
F11: |F_{11}| = \sum_{(i,j)\in E}(A^2)_{ij}(k_i-2)(k_j-2) - 2|F_7|

F12: |F_{12}| = \sum_i t_i\left(\sum_{j\ne i}(A^2)_{ij}\right) - 6|F_2| - 2|F_6| - 4|F_7|

F13: |F_{13}| = \frac{1}{2}\sum_i t_i(t_i-1) - 2|F_7|
F14: |F_{14}| = \sum_{(i,j)\in E}(A^3)_{ij}(A^2)_{ij} - 9|F_2| - 2|F_6| - 4|F_7|

F15: |F_{15}| = \frac{1}{12}\left(\operatorname{tr}(A^6) - 2m - 12|F_1| - 24|F_2| - 6|F_3| - 12|F_4| - 48|F_5| - 36|F_7| - 12|F_{10}| - 24|F_{13}|\right)
F16: |F_{16}| = \frac{1}{2}\sum_i (k_i-2)B_i - 2|F_{14}|, where

B_i = (A^5)_{ii} - 20t_i - 8t_i(k_i-2) - 2\sum_{(i,j)\in E}(A^2)_{ij}(k_j-2) - 2\sum_{(i,j)\in E}\left(t_j - (A^2)_{ij}\right)
"(A2 )ij #
F17 |F17 | =
(i, j)∈E
3
"(A2 )ij #
F18 |F18 | = ti · – 6|F7 | – 2|F14 | – 6|F17 |
i i = j
2
Z_i = \frac{N_i^{real} - \bar N_i^{random}}{\sigma_i^{random}}, (13.19)

where N_i^{real} is the number of times the subgraph i appears in the real network, and \bar N_i^{random} and \sigma_i^{random} are the average and standard deviation of the number of times that i appears in an ensemble of random networks, respectively. Similarly, the relative abundance of a given fragment can be estimated using the statistic

\alpha_i = \frac{N_i^{real} - \bar N_i^{random}}{N_i^{real} + \bar N_i^{random}}. (13.20)
of the ith motif with respect to the other motifs in the network. The resulting component of the significance profile vector is given by

SP_i = \frac{Z_i}{\sqrt{\sum_j Z_j^2}}. (13.21)
Problem 13.3
The connected component of the protein–protein interaction network of yeast has
2,224 nodes and 6,609 links. It has been found computationally that the number
of triangles in that network is 3,530. Determine the relative abundance of this
fragment in order to see whether it is a motif in this network.
We use the formula
\alpha_i = \frac{t_i^{real} - \bar t_i^{random}}{t_i^{real} + \bar t_i^{random}},
where we know that tireal = 3,530. We have to estimate tirandom . Let us consider
Erdös-Rényi random networks with 2,224 nodes and 6,609 links for which
p = \frac{2m}{n(n-1)} = 0.00267.
[Figure 13.9: Motifs and anti-motifs in undirected networks. Significance profiles for fragments 1–17 in six real-world networks: Internet, Airports, Thesaurus, Drug users, Yeast PPI, and Prison inmates.]
We also know that for large n, λ₁ → pn and all the other eigenvalues are negligible, so we use the approximation

\bar t_i^{random} = \frac{1}{6}\sum_{j=1}^{n}\lambda_j^3 \approx \frac{\lambda_1^3}{6} = \frac{(np)^3}{6}.
Thus, \bar t_i^{random} ≈ 35. This estimate is very good indeed. For instance, the average number of triangles in 100 realizations of an ER network is \bar t_i^{random} = 35.4 ± 6.1. Using the value \bar t_i^{random} ≈ 35 we obtain α_i = 0.98, which is very close to one. We conclude that the number of triangles in the yeast PPI is significantly larger than expected by chance, and we can consider it as a network motif.
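A Monte Carlo sketch of this estimate (parameters from the text, code mine; a small ensemble is used for speed):

    import networkx as nx
    import numpy as np

    n, m, t_real = 2224, 6609, 3530
    p = 2 * m / (n * (n - 1))
    counts = []
    for seed in range(20):
        G = nx.fast_gnp_random_graph(n, p, seed=seed)
        counts.append(sum(nx.triangles(G).values()) / 3)

    t_rand = np.mean(counts)
    print(t_rand)                                 # roughly 35
    print((t_real - t_rand) / (t_real + t_rand))  # alpha close to 0.98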
..................................................................................................
FURTHER READING
Alon, N., Yuster, R., and Zwick, U., Finding and counting given length cycles,
Algorithmica 17:209–223, 1997.
Milo, R. et al., Network motifs: Simple building blocks of complex networks,
Science 298:824–827, 2002.
Milo, R. et al., Superfamilies of evolved and designed networks, Science
303:1538–1542, 2004.
14 Classical Node Centrality

In this chapter
The concept of node centrality is motivated and introduced. Some properties of the degree of a node are analysed along with extensions to consider non-nearest neighbours. Two centralities based on shortest paths on the network are defined—the closeness and betweenness centrality—and differences between them are described. We finish this chapter with a few problems to illustrate how to find analytical expressions for these centralities in certain classes of networks.

(14.1 Motivation; 14.2 Degree centrality; 14.3 Closeness centrality; 14.4 Betweenness centrality; Further reading)
14.1 Motivation
The notion of centrality of a node first arose in the context of social sciences
and is used in the determination of the most ‘important’ nodes in a network.
There are a number of characteristics, not necessarily correlated, which can be
used in determining the importance of a node. These include its ability to com-
municate directly with other nodes, its closeness to many other nodes, and its
indispensability to act as a communicator between different parts of a network.
Considering each of these characteristics in turn leads to different central-
ity measures. In this chapter we study such measures and illustrate the different
qualities of a network that they can highlight.
k_i = \sum_{j=1}^{n} a_{ij} = (e^TA)_i = (Ae)_i. (14.1)
The following are some elementary facts about the degree centrality. You are
invited to prove these yourself.
1. k_i = (A^2)_{ii}.
2. \sum_{i=1}^{n} k_i = 2m, where m is the number of links.
3. In a directed network, \sum_{i=1}^{n} k_i^{in} = \sum_{i=1}^{n} k_i^{out} = m, where m is the number of links.
Example 14.1
[Figure 14.1: a small example network.]
Example 14.2
Let us consider the network displayed in Figure 14.2 together with its adjacency matrix
A = \begin{bmatrix} 0 & 1 & 0 & 1 & 1 & 0\\ 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0\\ 1 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}.

[Figure 14.2: a directed network on six nodes.]
Example 14.3
Let us now consider a real-world network. It corresponds to the food web of St Martin island in the Caribbean, in
which nodes represent species and food sources and the directed links indicate what eats what in the ecosystem. Here
we represent the networks in Figure 14.3 by drawing the nodes as circles with radius proportional to the corresponding
in-degree in (a) and out-degree in (b).
Figure 14.3 Food webs in St Martin with nodes drawn as circles of radii proportional to (a) in-degree and (b) out-degree
The in- and out-degree vectors are calculated in exactly the same way as in the last example. In this case, every
node has a label which corresponds to the identity of the species in question. In analysing this network according to
the in- and out-degree, we can point out the following observations which are of relevance for the functioning of this
ecosystem.
• Nodes with high out-degree are predators with a large variety of prey. Examples include the lizards Anolis gingivinus (the Anguilla Bank anole) and Anolis pogus, and birds such as the pearly-eyed thrasher and the yellow warbler.
• High in-degree nodes represent species and organic matter which are eaten by many others in this ecosystem,
such as leaves, detritus, and insects such as aphids.
• In general, top predators are not predated by other species, thus having significantly higher out-degree than
in-degree.
• The sources with zero in-degree are all birds: the pearly-eyed thrasher, yellow warbler, kestrel, and grey kingbird.
• Highly predated species are not usually prolific predators, thus they have high in-degree but low out-degree.
• The sinks are all associated with plants or detritus.
In a directed network a node has in- and out-closeness centrality. The first corresponds to how close this node is to the nodes it receives information from. The out-closeness centrality indicates how close the node is to those it sends information to. In directed networks the shortest path is a pseudo-distance due to a possible lack of symmetry.
Example 14.4
[Figure 14.4: a network with ten nodes.]

The vector of distance sums is

s = De = (e^TD)^T = [15\ 22\ 17\ 14\ 19\ 20\ 21\ 18\ 26\ 26]^T.
We use (14.3) to measure the closeness centrality of each node. For instance, for node 1,

CC(1) = \frac{9}{15} = 0.6.

Overall,

CC = [0.600\ 0.409\ 0.529\ 0.643\ 0.474\ 0.450\ 0.428\ 0.500\ 0.346\ 0.346]^T,
indicating that the most central node is node 4. Notice that in this case the degree centrality identifies another node
(namely 1) as the most important whereas in Figure 14.1 node 2 has both the highest degree and closeness centralities.
Example 14.5
We consider here the air transportation network of the USA, where the nodes represent the airports in the USA and the
links represent the existence of at least one flight connecting the two airports. In Figure 14.5 we illustrate this network
in which the nodes are represented as circles with radii proportional to the closeness centrality.
The most central airports according to the closeness centrality are given in Table 14.1.
The first four airports in this list (and the sixth) correspond to airports in the geographic centre area of continental
USA. The other three are airports located on the west coast. The first group are important airports in connecting
the East and West of the USA with an important traffic also between north and south of the continental USA. The
second group represents airports with important connections between the main USA and Alaska, as well as overseas
territories like Hawaii and other Pacific islands. The most highly ranked airports according to degree centrality are
given in Table 14.2. Notice that the group of west coast airports is absent.
Problem 14.1
Let CC(i) be the closeness centrality of the ith node in the path network Pn–1
labelled 1 – 2 – 3 – 4 – · · · – (n – 1) – n.
(a) Find a general expression for the closeness centrality of the ith node in
Pn–1 in terms of i and n only.
(b) Simplify the expressions found in (a) for the node(s) at the centre of the
path Pn–1 (for both odd and even values of n).
(c) Show that the closeness centrality of these central nodes is the largest in a
path Pn–1 .
(a) We start by considering the sum of all the distances from one node to the rest of the nodes in the path.

Table 14.3 The sum of all the distances from one node.

Node    \sum_{j\ne i} d_{ij}
1       1 + 2 + \cdots + (n-1)
2       1 + 1 + 2 + \cdots + (n-2)
3       2 + 1 + 1 + 2 + \cdots + (n-3)
...     ...
i       (i-1) + (i-2) + \cdots + 2 + 1 + 1 + 2 + \cdots + (n-i)
It is important to notice here that for each node the sum of the distances corresponds to a 'right' sum, i.e. the sum of the distances to all nodes located to the right of node i, 1 + 2 + ⋯ + (n − i), and a 'left' sum, i.e. the sum of the distances to all nodes located to the left of i, (i − 1) + (i − 2) + ⋯ + 2 + 1. These two sums are given, respectively, by

1 + 2 + \cdots + (n-i) = \frac{(n-i)(n-i+1)}{2}, (14.5)

(i-1) + (i-2) + \cdots + 2 + 1 = \frac{(i-1)i}{2}. (14.6)

Then, by substituting into the formula (14.3) we obtain

CC(i) = \frac{n-1}{\frac{(i-1)i}{2} + \frac{(n-i)(n-i+1)}{2}}, (14.7)
or, equivalently,

CC(i) = \frac{2(n-1)}{(i-1)i + (n-i)(n-i+1)}. (14.8)
(b) For a path with an odd number of nodes the central node is i = \frac{n+1}{2}. By substitution into (14.8) we obtain

CC\!\left(\frac{n+1}{2}\right) = \frac{2(n-1)}{\left(\frac{n+1}{2}-1\right)\frac{n+1}{2} + \left(n-\frac{n+1}{2}\right)\left(n-\frac{n+1}{2}+1\right)}, (14.9)

which reduces to

CC\!\left(\frac{n+1}{2}\right) = \frac{4}{n+1}. (14.10)

For a path with an even number of nodes the central nodes are i = \frac{n}{2} and i = \frac{n}{2}+1. Now,

CC\!\left(\frac{n}{2}\right) = \frac{2(n-1)}{\left(\frac{n}{2}-1\right)\frac{n}{2} + \left(n-\frac{n}{2}\right)\left(n-\frac{n}{2}+1\right)}, (14.11)

which reduces to

CC\!\left(\frac{n}{2}\right) = \frac{4(n-1)}{n^2}. (14.12)

We recover the same value when i = \frac{n}{2}+1.
(c) Simply consider

CC(i+1) - CC(i) = \frac{2(n-1)}{(i+1)i + (n-i-1)(n-i)} - \frac{2(n-1)}{i(i-1) + (n-i)(n-i+1)}, (14.13)

which is positive if i < n/2 and negative if i > n/2, so CC(i) reaches its maximum value at the centre of the path.
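A numerical check of (14.8) against a standard library routine (code mine; recall that the text's P_{n−1} is the path on n nodes):

    import networkx as nx

    n = 7
    G = nx.path_graph(n)                       # nodes 0, ..., n-1
    cc = nx.closeness_centrality(G)            # (n-1) / sum of distances
    for i in range(1, n + 1):                  # i as in the text, 1-based
        formula = 2 * (n - 1) / ((i - 1) * i + (n - i) * (n - i + 1))
        print(i, round(cc[i - 1], 4), round(formula, 4))   # columns agree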
BC(i) = \sum_{j\ne i\ne k}\frac{\rho(j,i,k)}{\rho(j,k)}, (14.15)
where ρ( j, k) is the number of shortest paths connecting the node j to the node
k, and ρ( j, i, k) is the number of these shortest paths that pass through node i in
the network.
If the network is directed, the term ρ( j, i, k) refers to the number of directed
paths from the node j to the node k that pass through the node i, and ρ( j, k) to
the total number of directed paths from the node j to the node k.
Example 14.6
We consider again the network used in Figure 14.4 and we explain how to
obtain the betweenness centrality for the node labelled as one. For this, we
construct Table 14.4 in which we give the number of shortest paths from any
pair of nodes that pass through the node 1, ρ( j, 1, k). We also report the total
number of shortest paths from these pairs of nodes ρ( j, k).
The betweenness centrality of the node 1 is simply the total sum of the
terms in the last column of Table 14.4,
BC(1) = \sum_{j,k}\frac{\rho(j,1,k)}{\rho(j,k)} = 12.667.

Repeating the computation for the other nodes indicates that node 4 is the most central one, i.e. it is the most important in allowing communication between other pairs of nodes.
Table 14.4 Shortest paths through node 1.

( j, k)    ρ( j, 1, k)    ρ( j, k)    ρ( j, 1, k)/ρ( j, k)
2,4        1              2           1/2
2,5        2              3           2/3
2,6        1              1           1
2,7        1              1           1
2,8        1              2           1/2
2,9        1              2           1/2
2,10       1              2           1/2
3,6        1              1           1
3,7        1              1           1
4,6        1              2           1/2
4,7        1              1           1
6,8        1              2           1/2
6,9        1              2           1/2
6,10       1              2           1/2
7,8        1              1           1
7,9        1              1           1
7,10       1              1           1
Total                                 12.667
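Betweenness computations of this kind are routine in NetworkX; the sketch below (example mine, not Figure 14.4) checks (14.15) on a star, where every leaf-to-leaf shortest path passes through the hub:

    import networkx as nx

    G = nx.star_graph(5)                                 # node 0 is the hub
    bc = nx.betweenness_centrality(G, normalized=False)
    print(bc[0], bc[1])                                  # 10.0 0.0, since C(5,2) = 10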
Example 14.7
In Figure 14.6 we illustrate the urban street network of the central part of
Cordoba, Spain. The most central nodes according to the betweenness cor-
respond to those street intersections which surround the central part of the
city and connect it with the periphery.
Problem 14.2
Let G be a tree with n = n₁ + n₂ + 1 nodes and the structure displayed in Figure 14.7. State conditions for the nodes labelled a, b, and c to have the largest value of betweenness centrality.
We start by considering the betweenness centrality of node a. Let us designate by V₁ and V₂ the two branches of the graph, the first containing n₁ and the second n₂ nodes.

[Figure 14.7: A network formed by joining two star networks together. Dashed lines indicate the existence of other equivalent nodes.]

Fact 1 Because the network is a tree, the number of shortest paths from p to q that pass through node k, ρ(p, k, q), is the same as the number of shortest paths from p to q, ρ(p, q). That is, ρ(p, k, q) = ρ(p, q).

Fact 2 There are n₁ nodes in the branch V₁. Let us denote by i any node in this branch which is not a, and by j any node in V₂ which is not c. There are n₁ − 1 shortest paths from nodes i to node b. That is,

\rho(i, a, b) = n_1 - 1. (14.16)

Fact 3 We can easily calculate the number of paths from a node i to any node in the branch V₂ which go through node a. Because there are n₁ − 1 nodes of type i and n₂ nodes in the branch V₂, the number of such paths is (n₁ − 1)n₂.

Fact 4 Any path going from a node denoted by i to another such node passes through the node a. Because there are n₁ − 1 nodes of the type i we have that the number of these paths is given by

\rho(i, a, i) = \binom{n_1-1}{2} = \frac{(n_1-1)(n_1-2)}{2}. (14.17)
Therefore the total number of paths containing the node a, and consequently its betweenness centrality, is

BC(a) = 2(n_1-1) + (n_1-1)(n_2-1) + \frac{(n_1-1)(n_1-2)}{2} = \frac{(n_1-1)(2n_2+n_1)}{2}.
By symmetry,

BC(c) = \frac{(n_2-1)(2n_1+n_2)}{2}. (14.18)
To calculate the betweenness centrality for node b we observe that every path
from the n1 nodes in branch V1 to the n2 nodes in branch V2 passes through
node b. Consequently,
BC(b) = n1 n2 . (14.19)
Obviously, all nodes apart from a, b, and c have zero betweenness centrality.
We consider in turn the conditions for the three remaining nodes to be central.
In order for node a to have the maximum BC, the following conditions are necessary: first, BC(a) ≥ BC(b), and second, BC(a) ≥ BC(c).
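A numerical check of the three formulas (construction mine, with n₁ = 6 and n₂ = 4 branch nodes):

    import networkx as nx

    n1, n2 = 6, 4
    G = nx.Graph()
    G.add_edge('a', 'b'); G.add_edge('b', 'c')
    G.add_edges_from(('a', f'i{k}') for k in range(n1 - 1))   # leaves of V1
    G.add_edges_from(('c', f'j{k}') for k in range(n2 - 1))   # leaves of V2

    bc = nx.betweenness_centrality(G, normalized=False)
    print(bc['a'], (n1 - 1) * (2 * n2 + n1) / 2)   # 35.0 35.0
    print(bc['b'], n1 * n2)                        # 24.0 24
    print(bc['c'], (n2 - 1) * (2 * n1 + n2) / 2)   # 24.0 24.0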
..................................................................................................
FURTHER READING
Borgatti, S.P., Centrality and network flow, Social Networks 27:55–71, 2005.
Borgatti, S.P. and Everett, M.G., A graph-theoretic perspective on centrality,
Social Networks 28:466–484, 2006.
Brandes, U. and Erlebach, T. (Eds.), Network Analysis: Methodological Founda-
tions, Springer, 2005, Chapters 3–5.
Estrada, E., The Structure of Complex Networks: Theory and Applications, Oxford
University Press, 2011, Chapter 7.
Wasserman, S. and Faust, K., Social Network Analysis: Methods and Applications,
Cambridge University Press, 1994, Chapter 5.
15 Spectral Node Centrality

In this chapter
The necessity of considering the influence of a node beyond its nearest neighbours is motivated. We introduce centrality measures that account for long-range effects of a node, such as the Katz index, eigenvector centrality, the PageRank index, and subgraph centrality. A common characteristic of these centrality measures is that they can be expressed in terms of spectral properties of the networks.

(15.1 Motivation; 15.2 Katz centrality; 15.3 Eigenvector centrality; 15.4 Subgraph centrality; Further reading)
15.1 Motivation
Suppose we use a network to model a contagious disease amongst a population.
Nodes represent individuals and edges represent potential routes of infection be-
tween these individuals. We illustrate a simple example in Figure 15.1. We focus
on the nodes labelled 1 and 4 and ask which of them has the higher risk of con-
tagion. Node 1 can be infected from nodes 2 and 3, while node 4 can be infected
from 5 and 6. From this point of view it looks like both nodes are at the same level
of risk. However, while 2 and 3 cannot be infected by any other node, nodes 5
and 6 can be infected from nodes 7 and 8, respectively. Thus, we can intuitively
think that 4 is at a greater risk than 1 as a consequence of the chain of transmis-
sion of the disease. Local centrality measures like node degree do not account for
a centrality that goes beyond the first nearest neighbours, so we need other kinds
of measures to account for such effects. In this chapter we study these measures
and illustrate the different qualities of a network that they can highlight.
Figure 15.1 A simple network. Nodes 1 and 4 are rivals for the title of most central
The series in (15.1) is related to the resolvent function (zI − A)^{-1}. In particular, we saw in Example 12.4(iii) that the series converges so long as α < 1/ρ(A), in which case

K_i = \left[(I-\alpha A)^{-1}e\right]_i. (15.2)
The Katz index can be expressed in terms of the eigenvalues and eigenvectors of the adjacency matrix. From the spectral decomposition A = QDQ^T (see Chapter 5),

K_i = \sum_j\sum_l \frac{q_j(i)q_j(l)}{1-\alpha\lambda_j}. (15.3)
When deriving his index, Katz ignored the contribution from A⁰ = I and instead used

K_i = \left[\left((I-\alpha A)^{-1} - I\right)e\right]_i. (15.4)

While the values given by (15.2) and (15.4) are different, the rankings are exactly the same. We will generally use (15.2) because of the nice mathematical properties of the resolvent.
Example 15.1
Node 4 has the highest Katz index, followed by node 1 which accords with
our intuition on the level of risk of each of these nodes in the network.
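A sketch of (15.2) for this network (the edge list below is read off from the description of Figure 15.1; the code is mine):

    import numpy as np

    edges = [(1, 2), (1, 3), (1, 4), (4, 5), (4, 6), (5, 7), (6, 8)]
    A = np.zeros((8, 8))
    for i, j in edges:
        A[i - 1, j - 1] = A[j - 1, i - 1] = 1

    lam1 = max(np.linalg.eigvalsh(A))
    alpha = 0.9 / lam1                           # any 0 < alpha < 1/lambda_1
    K = np.linalg.solve(np.eye(8) - alpha * A, np.ones(8))
    print(np.argsort(-K) + 1)                    # node 4 first, then node 1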
K_i^{out} = \left[(I-\alpha A)^{-1}e\right]_i, \qquad K_i^{in} = \left[e^T(I-\alpha A)^{-1}\right]_i.
Example 15.2
[Figure 15.2: a directed network on six nodes.]

K^{in} = [1.50\ 2.25\ 2.62\ 1.00\ 1.00\ 2.31]^T, \qquad K^{out} = [1.88\ 1.75\ 1.50\ 2.69\ 1.88\ 1.00]^T.
Notice that nodes 2 and 3 are each pointed to by two nodes. However,
node 3 is more central because it is pointed to by nodes with greater centrality
than those pointing to 2. In fact, node 6 is more central than node 2 because
the only node pointing to it is the most important one in the network. On the
other hand, out-Katz identifies node 4 as the most central one. It is the only
node having out-degree of two.
\nu = \sum_{k=1}^{\infty}\alpha^{k-1}A^ke = \left(\sum_{k=1}^{\infty}\alpha^{k-1}\sum_{j=1}^{n}\lambda_j^kq_jq_j^T\right)e = \frac{1}{\alpha}\left(\sum_{j=1}^{n}\sum_{k=1}^{\infty}(\alpha\lambda_j)^kq_jq_j^T\right)e = \frac{1}{\alpha}\left(\sum_{j=1}^{n}\frac{1}{1-\alpha\lambda_j}q_jq_j^T\right)e.

Now, let the parameter α approach the inverse of the largest eigenvalue of the adjacency matrix from below, i.e. α → 1/λ₁⁻. Then

\lim_{\alpha\to 1/\lambda_1^-}(1-\alpha\lambda_1)\nu = \lim_{\alpha\to 1/\lambda_1^-}\frac{1-\alpha\lambda_1}{\alpha}\left(\sum_{j=1}^{n}\frac{q_jq_j^T}{1-\alpha\lambda_j}\right)e = \lambda_1\left(\sum_{i=1}^{n}q_1(i)\right)q_1 = \gamma q_1.
Thus the eigenvector associated with the largest eigenvalue of the adjacency matrix is a centrality measure conceptually similar to the Katz index. Accordingly, the eigenvector centrality of the node i is given by q₁(i), the ith component of the principal eigenvector q₁ of A. Typically, we normalize q₁ so that its Euclidean length is one. By the Perron–Frobenius theorem we can choose q₁ so that all of its components are nonnegative.
Examples 15.3
(i) The eigenvector centralities for the nodes of the network in Figure 15.1 are
Here again node 4 is the one with the highest centrality, followed by node 1. Node 4 is connected to nodes which
are higher in centrality than the nodes to which node 1 is connected. High degree is not the only factor considered
by this centrality measure. The most central nodes are generally connected to other highly central nodes.
(ii) Sometimes being connected to a few very important nodes makes a node more central than being connected to
many not so central ones. For instance, in Figure 15.3, node 4 is connected to only three other nodes, while
1 is connected to four. However, 4 is more central than 1 according to the eigenvector centrality because it is
connected to two nodes with relatively high centrality while 1 is mainly connected to peripheral nodes. The vector
of centralities is
q_1 = [0.408\ 0.167\ 0.167\ 0.500\ 0.408\ 0.167\ 0.167\ 0.167\ 0.408\ 0.167\ 0.167\ 0.167]^T.
[Figure 15.3: the network on 13 nodes used in Example 15.3(ii).]
Problem 15.1
Let G be a simple connected network with n nodes and adjacency matrix A with
spectral decomposition QDQT . Let Nk (i) be the number of walks of length k
starting at node i. Let
s_k(i) = \frac{N_k(i)}{\sum_{j=1}^{n}N_k(j)}
be the ith element of the vector sk . Show that if G is not bipartite then there is a
scalar α such that as k → ∞, sk → αq1 almost surely. That is, the vector sk will
tend to rank nodes identically to eigenvector centrality.
Since A^k = QD^kQ^T,

s_k(i) = \frac{e_i^TA^ke}{e^TA^ke} = \frac{e_i^TQD^kQ^Te}{e^TQD^kQ^Te} = \frac{q_i^TD^kr}{r^TD^kr},

where r = Q^Te and q_i^T is the ith row of Q. Dividing the numerator and the denominator by λ₁^k, every term except the one involving λ₁ vanishes as k → ∞ (since G is connected and not bipartite, |λⱼ| < λ₁ for j > 1), and hence

\lim_{k\to\infty} s_k = \alpha q_1,

as desired. Note that we require e₁^Tr ≠ 0 in this analysis, which is almost surely true for a network chosen at random.
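A sketch of this convergence (example graph mine; it contains a triangle, so it is not bipartite):

    import numpy as np

    edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)]   # triangle with a tail
    A = np.zeros((5, 5))
    for i, j in edges:
        A[i, j] = A[j, i] = 1

    s = np.ones(5)
    for _ in range(200):
        s = A @ s
        s /= s.sum()                      # s_k(i) = N_k(i) / sum_j N_k(j)

    q1 = np.linalg.eigh(A)[1][:, -1]      # principal eigenvector of A
    q1 = np.abs(q1) / np.abs(q1).sum()    # scaled like s
    print(np.allclose(s, q1, atol=1e-8))  # True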
Example 15.4
[Figure 15.4: A directed network highlighting the difference between left and right eigenvector centrality.]

The left and right eigenvector centralities of the network in Figure 15.4 are

x = [0.592\ 0.288\ 0.366\ 0.465\ 0.465]^T, \qquad y = [0.592\ 0.465\ 0.366\ 0.288\ 0.465]^T.
Notice the differences in the rankings of nodes 4 and 5. According to the right eigenvector, both nodes are ranked second most central. They both point to the most central node of the network according to this criterion, node 1. However, according to the left eigenvector, while node 5 is still the second most important, node 4 has been relegated to the least central position. Node 5 is pointed to by the most central node, but node 4 is pointed to only by a node with low centrality.
S = D^{-1}H,

P = \alpha S + \frac{1-\alpha}{n}ee^T. (15.6)
Example 15.5
[Figure: a directed network on seven nodes.]

With α = 0.85, node 1 has a higher PageRank than node 4 due to its in-link from node 2. In this example, the rankings vary little as we change α.
We can easily generalize this idea and work with other weighted sums of the powers of the adjacency matrix, namely

f(A) = \sum_{l=0}^{\infty} c_lA^l. (15.7)

The coefficients c_l must ensure that the series converges; they should give more weight to small powers of the adjacency matrix than to the larger ones; and they should produce positive numbers for all i ∈ V.
Notice that if the first of the three requirements holds then (15.7) defines a matrix function and we can use the theory introduced in Chapter 12. The diagonal entries, f_i(A) = [f(A)]_{ii}, are directly related to subgraphs in the network and the second requirement ensures that more weight is given to the smaller than to the bigger ones.
Example 15.6
Let us examine (15.7) when we truncate the series at l = 5 and select c_l = \frac{1}{l!} to find an expression for f_i(A). Using information collected in Chapter 13 on enumerating small subgraphs we know that

(A^2)_{ii} = |F_1(i)|, (15.8)

(A^3)_{ii} = 2|F_2(i)|, (15.9)

(A^4)_{ii} = |F_1(i)| + |F_3(i)| + 2|F_4(i)| + 2|F_5(i)|, (15.10)

(A^5)_{ii} = 10|F_2(i)| + 2|F_6(i)| + 2|F_7(i)| + 4|F_8(i)| + 2|F_9(i)|. (15.11)

Then

f_i(A) = (c_2+c_4)|F_1(i)| + (2c_3+10c_5)|F_2(i)| + c_4|F_3(i)| + 2c_4|F_4(i)| + 2c_4|F_5(i)| + 2c_5|F_6(i)| + 2c_5|F_7(i)| + 4c_5|F_8(i)| + 2c_5|F_9(i)|. (15.12)

By using c_l = \frac{1}{l!} we get

f_i(A) = \frac{13}{24}|F_1(i)| + \frac{5}{12}|F_2(i)| + \frac{1}{24}|F_3(i)| + \frac{1}{12}|F_4(i)| + \frac{1}{12}|F_5(i)| + \frac{1}{60}|F_6(i)| + \frac{1}{60}|F_7(i)| + \frac{1}{30}|F_8(i)| + \frac{1}{60}|F_9(i)|. (15.13)

Clearly, the edges (and hence node degrees) are making the largest contribution to the centrality, followed by paths of length two, triangles, and so on.
To define subgraph centrality we do not truncate (15.7) but work with the matrix functions which arise with particular choices of coefficients c_l. Some of the most well known are

EE_i = \left[\sum_{l=0}^{\infty}\frac{A^l}{l!}\right]_{ii} = \left(e^A\right)_{ii}, (15.14)

EE_i^{odd} = \left[\sum_{l=0}^{\infty}\frac{A^{2l+1}}{(2l+1)!}\right]_{ii} = (\sinh(A))_{ii}, (15.15)

EE_i^{even} = \left[\sum_{l=0}^{\infty}\frac{A^{2l}}{(2l)!}\right]_{ii} = (\cosh(A))_{ii}, (15.16)

EE_i^{res} = \left[\sum_{l=0}^{\infty}\alpha^lA^l\right]_{ii} = \left[(I-\alpha A)^{-1}\right]_{ii}, \quad 0 < \alpha < 1/\lambda_1. (15.17)
Notice that EE^{odd} and EE^{even} take into account only contributions from odd or even closed walks in the network, respectively. We will refer generically to EE as the subgraph centrality. Using the spectral decomposition of the adjacency matrix, these indices can be represented in terms of the eigenvalues and eigenvectors of the adjacency matrix as follows:
EE_i = \sum_{l=1}^{n} q_l(i)^2\exp(\lambda_l),

EE_i^{odd} = \sum_{l=1}^{n} q_l(i)^2\sinh(\lambda_l),

EE_i^{even} = \sum_{l=1}^{n} q_l(i)^2\cosh(\lambda_l),

EE_i^{res} = \sum_{l=1}^{n}\frac{q_l(i)^2}{1-\alpha\lambda_l}, \quad 0 < \alpha < 1/\lambda_1.
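A short sketch (code mine) comparing the two expressions for EE_i on an arbitrary small network:

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 1],
                  [1, 1, 0, 0],
                  [0, 1, 0, 0]], dtype=float)

    EE_direct = np.diag(expm(A))
    lam, Q = np.linalg.eigh(A)
    EE_spectral = (Q**2) @ np.exp(lam)          # EE_i = sum_l q_l(i)^2 e^{lambda_l}
    print(np.allclose(EE_direct, EE_spectral))  # True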
Example 15.7
Example 15.8
In the Eurovision song contest, countries vote for their favourite songs from
other countries. We can represent these countries as nodes and the votes as
directed edges. The aggregate voting over the 2000–2013 contests has been
measured with links weighted according to the sum of votes between coun-
tries over the 14 years.2 The countries can be grouped together according
to their pattern of votes. Groups which vote in a similar way are represented
by the directed network illustrated in Figure 15.8. The labels correspond to
countries as follows.
..................................................................................................
FURTHER READING
Langville, A.N. and Meyer, C.D., Google’s PageRank and Beyond: The Science of
Search Engine Rankings, Princeton University Press, 2006.
Estrada, E., The Structure of Complex Networks: Theory and Applications, Oxford
University Press, 2011, Chapter 7.2.
Newman, M.E.J., Networks: An Introduction, Oxford University Press, 2010, Chapter 7.
16 Quantum Physics Analogies

In this chapter
We introduce the basic principles and formalism of quantum mechanics. We study the quantum harmonic oscillator and introduce ladder operators. Then, we introduce the simplest model to deal with quantum (electronic) systems, the tight-binding model. We show that the Hamiltonian of the tight-binding model of a system represented by a network corresponds to the adjacency matrix of that network and its eigenvalues correspond to the energy levels of the system. We briefly introduce the Hubbard and Ising models.

(16.1 Motivation; 16.2 Quantum mechanical analogies; 16.3 Tight-binding models; 16.4 Some specific quantum-mechanical systems; Further reading)
16.1 Motivation
In Chapter 8 we used classical mechanics analogies to study networks. In a simi-
lar way, we can use quantum mechanics analogies. Quantum mechanics is the
mechanics of the microworld. That is, the study of the mechanical properties of
particles which are beyond the limits of our perception, such as electrons and
photons. We remark here again that our aim is not simply to consider complex
networks in which the entities represented by nodes behave quantum mechanic-
ally but to use quantum mechanics as a metaphor that allows us to interpret some
of the mathematical concepts we use for studying networks in an amenable phys-
ical way. At the same time we aim to use elements of the arsenal of techniques
and methods developed for studying quantum systems in the analysis of complex
networks. Analogies are just that: analogies.
As we will see in this chapter, through the lens of quantum mechanics we can
interpret the spectrum of the adjacency matrix of a network as the energy levels
of a quantum system in which an electron is housed at each node of the network.
This will prepare the terrain for more sophisticated analysis of networks in terms
of how information is diffused through their nodes and links. And we will inves-
tigate other theoretical tools, such as the Ising model, which have applications in
the analysis of social networks. Thus, when we apply these models to networks
we will be equipped with a better understanding of the physical principles used
in them.
H(x, p) = \frac{p^2}{2m} + \frac{1}{2}kx^2 = \frac{p^2}{2m} + \frac{1}{2}m\omega^2x^2. (16.8)

\hat H = \frac{1}{2m}\hat p_x^2 + \frac{1}{2}m\omega^2\hat x^2. (16.9)

\hat p_x = -i\hbar\frac{\partial}{\partial x}. (16.10)
Let us first see what happens if we apply these two operators in a different order to a given function φ(x). That is,

\hat x\hat p_x\varphi(x) = -i\hbar x\frac{\partial\varphi(x)}{\partial x}, \qquad \hat p_x\hat x\varphi(x) = -i\hbar\frac{\partial}{\partial x}[x\varphi(x)] = -i\hbar\varphi(x) - i\hbar x\frac{\partial\varphi(x)}{\partial x},

so that

(\hat x\hat p_x - \hat p_x\hat x)\varphi(x) = i\hbar\varphi(x). (16.11)

That is, the momentum and the coordinates do not commute. A common alternative representation of (16.11) is [\hat x, \hat p_x] = i\hbar.
Since

\hat p_x^2 = \left(-i\hbar\frac{\partial}{\partial x}\right)\left(-i\hbar\frac{\partial}{\partial x}\right) = -\hbar^2\frac{\partial^2}{\partial x^2}, (16.12)

we can rewrite (16.9) to express the Hamiltonian operator for the SHO as

\hat H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{1}{2}m\omega^2\hat x^2. (16.13)
From (16.6),

-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \frac{1}{2}m\omega^2x^2\psi = E\psi. (16.14)

Letting u = \sqrt{\frac{m\omega}{\hbar}}\,x and \varepsilon = \frac{2E}{\hbar\omega}, (16.14) becomes

\frac{d^2\psi}{du^2} + (\varepsilon - u^2)\psi = 0. (16.15)
du2
Solutions of this second order differential equation can be written as
2 /2
ψ j (u) = Hj (u)e–u , (16.16)
2 d j –z2
Hj (z) = (–1)j ez (e ). (16.17)
dzj
Thus
" #1/4 "8 #
1 mω mω 2
ψ j (x) = √ Hj x e–mωx /2h̄ . (16.18)
2 n! π h̄
n h̄
Applying the series solution to the Schrödinger equation we obtain the values
of the energy of the quantum SHO
" #
1
Ej = h̄ω j + , j = 0, 1, 2, . . . . (16.19)
2
This is notably different from the classical SHO because now the energy can
take only certain discrete values, i.e. it is quantized. Indeed the first energy levels
of the SHO are:
1 3 5
E0 = h̄ω, E1 = h̄ω, E2 = h̄ω, . . . .
2 2 2
A useful technique in solving the quantum SHO is to use the so-called ladder operators. The annihilation (lowering) operator ĉ and the creation (raising) operator ĉ† are defined as

\hat c = \sqrt{\frac{m\omega}{2\hbar}}\left(\hat x + \frac{i\hat p}{m\omega}\right), (16.20)

\hat c^{\dagger} = \sqrt{\frac{m\omega}{2\hbar}}\left(\hat x - \frac{i\hat p}{m\omega}\right). (16.21)
The Hamiltonian of the quantum SHO can be written in terms of the ladder operators as

\hat H = \hbar\omega\left(\hat c^{\dagger}\hat c + \frac{1}{2}\right) = \hbar\omega\left(\hat N + \frac{1}{2}\right), (16.23)

where \hat N = \hat c^{\dagger}\hat c is the number operator.
Problem 16.1
Show that ĉ lowers the energy of a state by an amount h̄ω and that ĉ † raises the
energy by the same amount.
From (16.22), \hat H\hat c = \hat c\hat H - \hbar\omega\hat c. Now consider the effect of \hat H on the action of applying the annihilation operator to a state of the system, \hat c|j⟩. That is,

\hat H\left(\hat c|j\rangle\right) = \left(\hat c\hat H - \hbar\omega\hat c\right)|j\rangle = \left(E_j - \hbar\omega\right)\left(\hat c|j\rangle\right),

so ĉ lowers the energy E_j of |j⟩ by ħω. Similarly, \hat H\hat c^{\dagger} = \hat c^{\dagger}\hat H + \hbar\omega\hat c^{\dagger}, and

\hat H\left(\hat c^{\dagger}|j\rangle\right) = \left(E_j + \hbar\omega\right)\left(\hat c^{\dagger}|j\rangle\right),

which indicates that the operator ĉ† has raised the energy E_j of |j⟩ by ħω.
where V (rj – rk ) is the potential describing the interactions between electrons and
U (rj ) is an external potential which we will assume is zero.
The electron has a property which is unknown in classical physics, called the spin. It is an intrinsic form of angular momentum and mathematically it can be described by a state in the Hilbert space C², which is spanned by the basis vectors |±⟩. Using the ladder operators previously introduced, the Hamiltonian (16.24) can be written as

\hat H = -\sum_{ij} t_{ij}\,\hat c_i^{\dagger}\hat c_j + \frac{1}{2}\sum_{ijkl} V_{ijkl}\,\hat c_i^{\dagger}\hat c_k^{\dagger}\hat c_l\hat c_j, (16.26)
where t_{ij} and V_{ijkl} are integrals which control the hopping of an electron from one site to another and the interaction between electrons, respectively. We can further simplify our Hamiltonian if we suppose that the electrons do not interact with each other, so all the V_{ijkl} equal zero. This method, which is known as the tight-binding approach or the Hückel molecular orbital method, is very useful for calculating the properties of solids and molecules, like graphene. The Hamiltonian of the system becomes

\hat H_{tb} = -\sum_{ij} t_{ij}\,\hat c_{i\rho}^{\dagger}\hat c_{j\rho}, (16.27)

where \hat c_{i\rho}^{\dagger} creates (and \hat c_{i\rho} annihilates) an electron with spin ρ at the node i. We can now separate the in-site energy α_i from the transfer energy β_{ij} and write the Hamiltonian as

\hat H_{tb} = \sum_{i\rho}\alpha_i\,\hat c_{i\rho}^{\dagger}\hat c_{i\rho} + \sum_{\langle ij\rangle\rho}\beta_{ij}\,\hat c_{i\rho}^{\dagger}\hat c_{j\rho}, (16.28)

where the second sum is carried out over all pairs of nearest neighbours. Consequently, in a network with n nodes, the Hamiltonian (16.28) is reduced to an n × n matrix,
where the second sum is carried out over all pairs of nearest-neighbours. Con-
sequently, in a network with n nodes, the Hamiltonian (16.28) is reduced to an
n × n matrix,
⎧
⎪
⎨ αi , i = j,
Ĥij = βij , i is connected to j, (16.29)
⎪
⎩ 0, otherwise.
Ĥ = αI + βA, (16.30)
where I is the identity matrix, and A is the adjacency matrix of the graph repre-
senting the electronic system. The energy levels of the system are simply given by
the eigenvalues of the adjacency matrix of the network:
Ej = α + βλj . (16.31)
Notice that since β < 0 we can interpret the eigenvalues of the adjacency
matrix of a network as the negative of the energy levels of a tight-binding system,
as described in this section. This will be very useful when we introduce statistical
mechanics concepts for networks. For each energy level the molecular orbitals are
constructed as linear combinations of the corresponding atomic orbitals for all the
atoms in the system. That is,
\psi_j = \sum_i c_j(i)q_j(i), (16.32)
where qj (i) is the ith entry of the jth eigenvector of the adjacency matrix and cj (i)
are coefficients of a linear combination.
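As an illustrative sketch (parameter values mine), the levels (16.31) for the cycle C₆ — the Hückel treatment of benzene — follow from the eigenvalues of its adjacency matrix:

    import numpy as np

    n = 6
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1

    alpha, beta = 0.0, -1.0
    E = alpha + beta * np.linalg.eigvalsh(A)    # E_j = alpha + beta * lambda_j
    print(np.round(np.sort(E), 3))              # [-2, -1, -1, 1, 1, 2]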
Example 16.1
where t is the hopping parameter and U > 0 indicates that the electrons repel
each other.
Notice that if there is no electron–electron repulsion (U = 0 ), we recover the
tight-binding Hamiltonian studied in the previous section.
(i) Two spins interact only if they are located in nearest-neighbour nodes.
(ii) The interaction between every pair of spins has the same strength.
(iii) The energy of the system decreases with the interaction between two identical spins and increases otherwise.
(iv) Each spin can interact with an external magnetic field H.

[Figure 16.2: A network with nodes signed according to spin up (+) or down (−).]
Often we let J = β = (kB T )–1 be the inverse temperature of the system, where
kB is the Boltzmann constant (more details in Chapter 20). Then, at low tempera-
ture, configurations in which most spins are aligned have lower energy. It is easy
to imagine some potential applications of the Ising model in studying complex
networks. If we consider a social network in which the nodes represent people
and the links their social interactions, the spin can represent a vote in favour or
against a certain statement. One can use the model to investigate whether the local
alignment of voting or opinions among the nodes can generate a global state of
consensus in the whole network.
17 Global Properties of Networks I

In this chapter
We study the correlation between the degrees of the nodes connected by links in a network. Using these correlations, we classify whether a network is assortative or disassortative, indicating the tendency of high-degree nodes to be connected to each other or to low-degree nodes, respectively. We show how to represent this statistical index in a combinatorial expression. We also study other global properties of networks, such as the reciprocity and returnability indices in directed networks.
17.1 Motivation
Characterizing complex networks at a global scale is necessary for many reasons.
For example, we can learn about the global topological organization of a given
network; it also allows us to compare networks with each other and to obtain
information about potential universal mechanisms that give rise to networks with
similar structural properties. First we will uncover global topological properties by
analysing how frequently the high-degree nodes, or hubs, in a network are con-
nected to each other. We will check the average reciprocity of links in a directed
network and the degree to which information departing from a node of a network
can return to its source after wandering around the nodes and links. In Chapter 18
we will look at other important global topological properties of networks, too.
Example 17.1
In Figure 17.1 we have plotted, as one dot per link, the ordered pairs of degrees at the endpoints of the links of two real-world networks. The social network illustrated is an example of an assortative network, while the mini internet illustrated in the same figure is disassortative.

Figure 17.1 (a) Social network of the American corporate elite; (b) the internet at the autonomous system level
Let e(k_i, k_j) be the fraction of links that connect a node of degree k_i to a node of degree k_j. For mathematical convenience, we will consider 'excess degrees', which are simply one less than the degree of the corresponding nodes. Let p(k_j) be the probability that a node selected at random in the network has degree k_j. Then, the Pearson correlation coefficient for the degree–degree correlation is given by

r = \frac{1}{\sigma_q^2}\sum_{k_i,k_j} k_ik_j\left[e(k_i,k_j) - q(k_i)q(k_j)\right], (17.1)

where

q(k_j) = \frac{(k_j+1)p(k_j+1)}{\sum_i k_ip(k_i)} (17.2)

represents the distribution of the excess degree of a node at the end of a randomly chosen link and σ_q² is the variance of the distribution q(k_j). We call
this index the assortativity coefficient of a network for obvious reasons. It can be rewritten as

r = \frac{\frac{1}{m}\sum_{(i,j)\in E}k_ik_j - \left(\frac{1}{2m}\sum_{(i,j)\in E}(k_i+k_j)\right)^2}{\frac{1}{2m}\sum_{(i,j)\in E}(k_i^2+k_j^2) - \left(\frac{1}{2m}\sum_{(i,j)\in E}(k_i+k_j)\right)^2}, (17.3)

where m = |E|.
A revealing property of assortativity can be found by showing that the denominator of (17.3) is nonnegative. We can confirm that

\sum_{(i,j)\in E}(k_i^2+k_j^2) = \sum_i k_i^3 \quad\text{and}\quad \sum_{(i,j)\in E}(k_i+k_j) = \sum_i k_i^2,
Example 17.2
The two networks illustrated in Figure 17.2 correspond to food webs. The first represents mostly macroinvertebrates,
fishes, and birds associated with an estuarine sea-grass community, Halodule wrightii, at St Marks National Wildlife
Refuge in Florida. The second represents trophic interactions between birds and predators and arthropod prey of
Anolis lizards on the island of St Martin in the Lesser Antilles. The first network has 48 nodes and 218 links and the
second has 44 nodes and 218 links.
The assortativity coefficients for these two networks are r = 0.118 for St Marks and r = −0.153 for St Martin. In St Marks, low-degree species prefer to join other low-degree ones, while high-degree species are preferentially linked to other high-degree ones. On the other hand, in the food web of St Martin, the species with a large number of trophic interactions are preferentially linked to low-degree ones.
Figure 17.2 Illustration of two food webs with different assortativity coefficient values
Example 17.3
Consider the two small networks illustrated in Figure 17.3. The networks are almost identical, except that the one on the right has a path of length two instead of one of length one attached to the cycle. Despite their close structural similarity, the two have very different degree assortativity.
If we analyse what has been written in the literature about the meaning of
assortativity, it is evident that a clear structural interpretation is necessary. For in-
stance, it has been said that in assortatively mixed networks ‘the high-degree vertices
will tend to stick together in a subnetwork or core group of higher mean degree than the
network as a whole’1 . However, this may not be visible to the naked eye.
Example 17.4
In the network illustrated in Figure 17.4, the nodes enclosed in the circle indicated with a broken line are among the ones with the highest degree in the network and they are apparently clumped together. Thus we might expect that this network is assortative. However, the assortativity coefficient for this network is r = −0.304, showing that it is highly disassortative.

Figure 17.4 Apparent assortativity within a network

¹ Newman, M.E.J., Assortative mixing in networks, Physical Review Letters 89:208701, 2002.
Then

\sum_{(i,j)\in E}k_ik_j = \sum_{(i,j)\in E}(k_i+k_j) - m + 3|C_3| + |P_3| = m + 2|P_2| + |P_3| + 3|C_3|. (17.6)

To deal with the final term we first rewrite the expression for the number of star subgraphs S_{1,3} as

|S_{1,3}| = \sum_i\binom{k_i}{3} = \frac{1}{6}\sum_i k_i(k_i-1)(k_i-2) = \frac{1}{6}\sum_i k_i^3 - \frac{1}{2}\sum_i k_i^2 + \frac{1}{3}\sum_i k_i. (17.7)

Then

\sum_{(i,j)\in E}(k_i^2+k_j^2) = \sum_i k_i^3 = 6|S_{1,3}| + 3\sum_i k_i^2 - 2\sum_i k_i = 6|S_{1,3}| + 6|P_2| + 2m. (17.8)
Substituting all these terms into (17.3) gives

r = \frac{\frac{1}{m}\left(m + 2|P_2| + |P_3| + 3|C_3|\right) - \frac{1}{4m^2}\left(2|P_2| + 2m\right)^2}{\frac{1}{2m}\left(6|S_{1,3}| + 6|P_2| + 2m\right) - \frac{1}{4m^2}\left(2|P_2| + 2m\right)^2}, (17.9)

which simplifies to

r = \frac{|P_3| + 3|C_3| - |P_2|^2/m}{3|S_{1,3}| + |P_2| - |P_2|^2/m}. (17.10)
Alternatively, let |P_{r/s}| = |P_r|/|P_s| and |P_1| = m. Multiplying and dividing the numerator by |P_2| we obtain

r = \frac{|P_2|\left(|P_{3/2}| + \frac{3|C_3|}{|P_2|} - |P_{2/1}|\right)}{3|S_{1,3}| + |P_2|\left(1 - |P_{2/1}|\right)}. (17.11)

Since C = \frac{3|C_3|}{|P_2|} is the Newman clustering coefficient,

r = \frac{|P_2|\left(|P_{3/2}| + C - |P_{2/1}|\right)}{3|S_{1,3}| + |P_2|\left(1 - |P_{2/1}|\right)}. (17.12)
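The combinatorial expression can be checked against a library routine; the sketch below (code mine, on an arbitrary test network) evaluates (17.10) and NetworkX's Pearson degree correlation:

    import networkx as nx
    import numpy as np

    G = nx.karate_club_graph()
    deg = dict(G.degree())
    m = G.number_of_edges()
    k = np.array([deg[v] for v in G.nodes()])

    C3 = sum(nx.triangles(G).values()) / 3
    P2 = np.sum(k * (k - 1)) / 2
    P3 = sum((deg[u] - 1) * (deg[v] - 1) for u, v in G.edges()) - 3 * C3
    S13 = np.sum(k * (k - 1) * (k - 2)) / 6

    r = (P3 + 3 * C3 - P2**2 / m) / (3 * S13 + P2 - P2**2 / m)
    print(round(r, 4))
    print(round(nx.degree_assortativity_coefficient(G), 4))   # the two agree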
Problem 17.1
Use the combinatorial expression for the assortativity coefficient to show that the path of n nodes of infinite length is neutral, i.e. r(G) → 0 as n → ∞.

In a path C = 0 and |S_{1,3}| = 0, so the assortativity formula simplifies further to

r(P_{n-1}) = \frac{|P_2|\left(|P_{3/2}| - |P_{2/1}|\right)}{|P_2|\left(1 - |P_{2/1}|\right)}.

In P_{n-1} we have |P_1| = n-1, |P_2| = n-2, and |P_3| = n-3, so

r(P_{n-1}) = \frac{(n-1)(n-3) - (n-2)^2}{(n-1)(n-2) - (n-2)^2} = -\frac{1}{n-2}.

So clearly, \lim_{n\to\infty} r(P_{n-1}) = -\lim_{n\to\infty}\frac{1}{n-2} = 0.
Table 17.2 Assortativity coefficient in terms of path ratios and clustering coefficient in
complex networks
Problem 17.2
Use the combinatorial expression for the assortativity coefficient to show that the star of n nodes has the maximum possible disassortativity, i.e. r(G) = −1.

In the star with n nodes C = 0 and |P_3| = 0. Also, |P_1| = m = n-1, |P_2| = \frac{1}{2}(n-1)(n-2), and |S_{1,3}| = \frac{1}{6}(n-1)(n-2)(n-3). Thus,

r(S_{1,n-1}) = \frac{\frac{1}{2}(n-1)(n-2)\left(-\frac{1}{2}(n-2)\right)}{\frac{1}{2}(n-1)(n-2)(n-3) + \frac{1}{2}(n-1)(n-2) - \frac{1}{4}(n-1)(n-2)^2}.

This simplifies to

r(S_{1,n-1}) = \frac{-\frac{1}{4}(n-1)(n-2)^2}{\frac{1}{2}(n-1)(n-2)\cdot\frac{1}{2}(n-2)} = \frac{-\frac{1}{2}(n-2)}{\frac{1}{2}(n-2)} = -1.
Example 17.5
[Figure 17.5: a directed network with six nodes and nine links.]
In the network illustrated in Figure 17.5 there are four reciprocal links (1 – 2, 2 – 1, 3 – 4, and 4 – 3), thus L ↔ = 4, and
the total number of links is L = 9. The probability that a link picked randomly in this network is reciprocal is r = 4/9.
The normalized reciprocity index is then

\rho = \frac{30\cdot 4 - 81}{30\cdot 9 - 81} = 0.206.
In Table 17.3 we illustrate some values of the reciprocity index for real-world networks.
Problem 17.3
A network with n nodes has reciprocity equal to –0.25. How many links should
become bidirectional for the network to show reciprocity equal to 0.1?
The reciprocity for this network in its current state is given by

\frac{n(n-1)L_1^{\leftrightarrow} - L^2}{n(n-1)L - L^2} = -\frac{1}{4},

where L₁↔ is the number of bidirectional links in the network. So,

L_1^{\leftrightarrow} = \frac{5L^2 - n(n-1)L}{4n(n-1)}.
Notice that because L1↔ > 0, the number of directed links is bounded by
L > n(n – 1)/5.
Now, let L₂↔ be the number of bidirectional links when the reciprocity is 0.1. Then,

L_2^{\leftrightarrow} = \frac{9L^2 + n(n-1)L}{10n(n-1)}.
Let ΔL↔ = L₂↔ − L₁↔ be the required increase in the number of reciprocal links. Consequently, the number of reciprocal links should be increased by

\Delta L^{\leftrightarrow} = \frac{7\left[n(n-1)L - L^2\right]}{20n(n-1)}.
For instance, if the network has n = 22, L = 100, and L₁↔ = 2, the number of reciprocal links should increase by ΔL↔ ≈ 27. This means that the new network should have L₂↔ = 29 in order to have ρ ≈ 0.1. You can convince yourself by substituting the values into (17.15).
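A sketch of this calculation (function and names mine, following the reciprocity formula (17.15)):

    def rho(n, L, L_bi):
        N = n * (n - 1)
        return (N * L_bi - L**2) / (N * L - L**2)

    print(round(rho(22, 100, 2), 3))    # -0.251, about -0.25
    print(round(rho(22, 100, 29), 3))   # 0.094, close to the target 0.1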
K_r = \frac{\operatorname{tr}(\exp(D)) - n}{\operatorname{tr}(\exp(A)) - n}, (17.16)

where D is the adjacency matrix of the directed network and A is the adjacency matrix of the same network when all edges are considered to be undirected. This index is bounded in the range 0 ≤ K_r ≤ 1, where the lower bound is obtained for a network with no returnable cycle and the upper bound is obtained for any network with a symmetric adjacency matrix.
Problem 17.4
Find the returnability of the directed triads shown in Figure 17.7.
The returnability for these networks can be written

K_r = \frac{tr(\exp(D)) - 3}{tr(\exp(A)) - 3},

where tr(\exp(D)) = \sum_{j=1}^{3}\exp(\lambda_j(D)) and tr(\exp(A)) = \exp(2) + 2\exp(-1) because the underlying undirected graph is K_3.

Figure 17.7 Two triangles with different returnabilities

In order to find the eigenvalues of
the adjacency matrix of the directed network A, we should find the roots of the
polynomial
\begin{vmatrix} -\lambda & 0 & 1 \\ 1 & -\lambda & 1 \\ 1 & 1 & -\lambda \end{vmatrix} = -\lambda^3 + 2\lambda + 1 = 0,
which are λ_1 = \frac{1+\sqrt{5}}{2}, λ_2 = -1, and λ_3 = \frac{1-\sqrt{5}}{2}. Notice that λ_1 = ϕ, the
golden ratio, and λ_3 = 1 − ϕ. Hence,
K_r(A) = \frac{e^{\varphi} + e^{-1} + e^{1-\varphi} - 3}{e^{2} + 2e^{-1} - 3} = 0.576.
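The same number can be obtained directly from the matrix exponential. A minimal sketch, assuming Python with numpy and scipy (scipy.linalg.expm computes the exponential of a matrix); the adjacency matrix below is the directed triad analysed above:

import numpy as np
from scipy.linalg import expm

D = np.array([[0, 0, 1],
              [1, 0, 1],
              [1, 1, 0]])           # directed triad with characteristic polynomial as above
A = np.ones((3, 3)) - np.eye(3)     # underlying undirected graph K3
Kr = (np.trace(expm(D)) - 3) / (np.trace(expm(A)) - 3)
print(Kr)                           # approx 0.576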
FURTHER READING
Estrada, E., The Structure of Complex Networks: Theory and Applications, Oxford
University Press, 2011, Chapters 2.3, 4.5.2, and 5.4.
Newman, M.E.J., Networks: An Introduction, Oxford University Press, 2010,
Chapter 7.
Newman, M.E.J., ‘Assortative Mixing in Networks’, Physical Review Letters,
89:208701, (2002).
18 Global Properties of Networks II
18.1 Motivation
Consider two networks with structures such as those displayed in Figure 18.1.
Network A displays a very regular, homogeneous type of structure, while network
B has a few regions which are more densely connected than others, and which we
term structural heterogeneities. Now suppose we use a signal θ that propagates
locally through the links of the network from a randomly selected node to other
nodes relatively close by. At the same time we emit another signal ϑ, which starts
at the same node and propagates on a longer length scale. After a certain time,
both signals return to the original node. Because of the homogeneity of network
A at both scales (close neighbourhood of a node and global network) the times
taken by θ and ϑ to return to the original node are linearly correlated. This is true
for any node of the network. However, in B the signal’s path will be influenced
by the heterogeneity in the network and as a consequence, a lack of correlation
between the return times of the two signals is observed. The level of correlation
between the two signals characterizes the type of structure that a network has at
a global scale. In this chapter, we are going to find a mathematical way to obtain
such kinds of correlations for general networks.
Problem 18.1
Show that the expansion of a cycle network tends to zero as the size of the network
tends to infinity.
The cycle Cn is an example of a connected graph which is divided into two
connected components by removing two edges. Note that the boundary of any
non-empty set S of nodes from Cn must contain at least two edges and since
|S| ≤ |V|/2 = n/2,
|∂S|/|S| is bounded below by 2/(n/2) = 4/n. This bound is attained (for even
n) if we take S to be a string of n/2 connected nodes. Consequently,
\lim_{n\to\infty}\phi(C_n) = \lim_{n\to\infty}\frac{4}{n} = 0.
Problem 18.2
Show that a complete network has expansion constant φ(K_n) = n/2 if n is even and
φ(K_n) = (n + 1)/2 if n is odd.
If |S| = m then each node in S has n − m edges connecting it to nodes outside S, and so
|∂S|/|S| = n − m. The infimum is attained when m is as large as possible, that is,
m = n/2 if n is even and m = (n − 1)/2 if n is odd.
A key result in the theory of expander graphs connects the expansion constant
to the eigenvalues of the adjacency matrix of the network. Let λ1 > λ2 ≥ · · · ≥ λn
be these eigenvalues. Then the expansion factor is bounded by
\frac{\lambda_1 - \lambda_2}{2} \le \phi(G) \le \sqrt{2\lambda_1(\lambda_1 - \lambda_2)}.    (18.2)
Thus the larger the spectral gap, λ1 – λ2 , the larger the expansion of the graph.
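The two bounds in (18.2) are cheap to evaluate from the spectrum. A minimal sketch, assuming Python with numpy and networkx:

import networkx as nx
import numpy as np

G = nx.complete_graph(10)                        # Problem 18.2 gives phi(K_10) = 5
lam = np.linalg.eigvalsh(nx.to_numpy_array(G))   # eigenvalues in ascending order
gap = lam[-1] - lam[-2]                          # spectral gap lambda_1 - lambda_2
print(gap / 2, np.sqrt(2 * lam[-1] * gap))       # 5.0 and approx 13.42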
The odd subgraph centrality of a node i can be written in terms of the spectrum as

EE_{odd}(i) = q_1^2(i)\sinh(\lambda_1) + \sum_{j=2}^{n} q_j^2(i)\sinh(\lambda_j).    (18.3)

For a network with a sufficiently large spectral gap the first term dominates, so that EE_{odd}(i) \approx q_1^2(i)\sinh(\lambda_1), and taking logarithms gives

\ln q_1(i) \approx \frac{1}{2}\left[\ln EE_{odd}(i) - \ln\sinh(\lambda_1)\right].    (18.5)
For any network, the straight line (18.5) defines the ideal situation in which
the local and global environments of all the nodes are highly correlated, i.e. an
ideal topological homogeneity in the network. However, the values of ln q1 (i)
and ln EE odd (i) can deviate from a straight line if the network is not par-
ticularly homogeneous. Such deviations can be quantified, for instance, by
measuring the deviation of the eigenvector centrality of the given node from the
relationship (18.5) using
\Delta\ln q_1(i) = \ln\left[\frac{q_1^2(i)\,\sinh(\lambda_1)}{EE_{odd}(i)}\right]^{1/2}.    (18.6)
Now we can identify the following four general types of correlation between
the local and global environments of the nodes in a network. These can in turn be
shown to represent four structural classes of network topology.
Figure 18.3 Spectral scaling for networks in Class I: (a) spectral scaling; (b) network structural pattern

Figure 18.4 Spectral scaling for networks in Class II

Figure 18.6 Spectral scaling for networks in Class IV: (a) spectral scaling; (b) network structural pattern
Examples 18.2
(i) Figure 18.7 is a pictorial representation of the food web of St Martin Island and its spectral scaling. The network
clearly belongs to Class I.
(ii) Figure 18.8 is an illustration of a protein (left) and its protein residue network (right). In the network, nodes
represent the amino acids and the links represent pairs of amino acids interacting physically. The spectral scaling
plot in Figure 18.9 shows a clear Class II type for the topology of this network. Proteins are known to fold in 3D,
leaving some holes, which in general represent physical cavities where ligands dock. These holes may represent
binding sites for potential drugs.
Figure 18.7 Spectral scaling of the St Martin food web. Notice that the scaling perfectly corresponds to a Class I network
Figure 18.8 Cartoon representation of a protein (a) and its residue interaction network (b)
Figure 18.9 Spectral scaling of the residue network in Figure 18.8(b)
Problem 18.3
Show that an Erdös–Rényi random network belongs to the first class of homoge-
neous networks as n → ∞.
Recall that as n → ∞, the eigenvalues of an Erdös–Rényi network satisfy
λ1 → np, λj≥2 → 0.
Thus, as n → ∞,

EE_{odd}(i) = q_1^2(i)\sinh(np) + \sum_{j=2}^{n} q_j^2(i)\sinh(0) = q_1^2(i)\sinh(np),

so that

\ln q_1(i) = \frac{1}{2}\left[\ln EE_{odd}(i) - \ln\sinh(np)\right]

holds exactly, i.e. Δ ln q_1(i) = 0 for every node and the network belongs to Class I.
Problem 18.4
It is known that the largest eigenvalue of the adjacency matrix of a certain protein–
protein interaction (PPI) network is 3.8705. Further analysis has shown that the
eigenvector centrality and the odd subgraph centrality of one of its nodes are 0.2815
and 0.5603, respectively. Is this PPI network an example of a homogeneous network?
Or does it belong to the class of networks with holes?
If the PPI network were in Class I, then Δ ln q_1(i) ≈ 0 for all i ∈ V. Using the
given data, we have

\Delta\ln q_1(i) = \ln\left[\frac{0.2815^2\,\sinh(3.8705)}{0.5603}\right]^{1/2} = 0.6105,
which is far from zero. Thus the network does not belong to the class of homo-
geneous networks. In addition, because Δ ln q_1(i) > 0 for at least one node, the
network cannot belong to the class of networks with holes. The only possibilities
that remain are that the network is in Class III or Class IV but without additional
data we cannot determine which.
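The classification can be automated: compute the principal eigenpair, the odd subgraph centralities, and the departures Δ ln q_1(i) from the line (18.5). A minimal sketch, assuming Python with numpy, scipy, and networkx (the tolerance is our choice):

import networkx as nx
import numpy as np
from scipy.linalg import sinhm

G = nx.karate_club_graph()
A = nx.to_numpy_array(G)
lam, Q = np.linalg.eigh(A)          # eigenvalues in ascending order
q1 = np.abs(Q[:, -1])               # principal eigenvector (eigenvector centralities)
EE_odd = np.diag(sinhm(A))          # odd subgraph centralities
dev = np.log(q1) - 0.5 * (np.log(EE_odd) - np.log(np.sinh(lam[-1])))
tol = 1e-2                          # tolerance for 'zero deviation' is ours
if np.all(np.abs(dev) < tol):
    print('Class I: homogeneous network')
elif np.all(dev < tol):
    print('Class II: network with holes')
elif np.all(dev > -tol):
    print('Class III')
else:
    print('Class IV')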
b_c = 1 - \frac{m_d}{m},    (18.7)
where the subindex c is introduced to indicate that this index may have to be found
computationally. That is, we need to search for links that destroy the bipartivity in
the network in a computationally intensive way. We aim to find the best partition
of the nodes into two almost disjoint subsets. Once we have found such a ‘best
bipartition’, we can simply count the number of links that are connecting nodes in
the same set. Although the method is very simple and intuitive to state, the computation of this index is an example of an NP-complete problem.¹ Computational approximations for finding this index have been reported in the literature but we are not going to explain them here. Instead we will consider other approaches that exploit some of the spectral properties of bipartite networks.

¹ In short, this means that it is extremely costly to compute for even fairly modestly sized networks.
Recall that a bipartite network does not contain any odd-length cycle. Because
every closed walk of odd length involves at least one cycle of odd length, we also
know that a bipartite network does not contain any closed walks of odd length.
Consequently, we can identify bipartivity if tr(sinh(A)) = 0 or, equivalently, because tr(exp(A)) = tr(cosh(A)) + tr(sinh(A)), if tr(cosh(A)) = tr(exp(A)).
Using these simple facts we can design an index to account for the degree of
bipartivity of a network by taking the proportion of even closed walks to the total
number of closed walks in the network giving
b_s = \frac{tr(\cosh(A))}{tr(e^{A})} = \frac{\sum_{j=1}^{n}\cosh(\lambda_j)}{\sum_{j=1}^{n}\exp(\lambda_j)}.    (18.10)
Example 18.3
We consider the effect of adding a new edge, e, to a network G with spectral bipartivity index bs (G). Let us add a new
edge to G and calculate bs (G + e).
Since cosh x > sinh x for all x,
EE_{even} = \sum_{j=1}^{n}\cosh(\lambda_j) > \sum_{j=1}^{n}\sinh(\lambda_j) = EE_{odd}.
Let a and b be the contributions of e to the even and odd closed walks in G + e, respectively, and assume that b ≥ a.
Then

b_s(G + e) = \frac{EE_{even} + a}{EE + a + b} \le \frac{EE_{even}}{EE} = b_s(G),

since a\,EE_{odd} \le b\,EE_{even}; adding such an edge cannot increase the spectral bipartivity.
Suppose we start from a bipartite network with bs = 1 and add new links. How
small can we make bs ? And for what network?
Problem 18.5
Show that bs → 1/2 in Kn as n → ∞.
The eigenvalues of Kn are n – 1 with multiplicity one and –1 with multiplicity
n – 1 so
b_s(K_n) = \frac{\cosh(n-1) + (n-1)\cosh(-1)}{\exp(n-1) + (n-1)\exp(-1)} \to \frac{1}{2}
as n → ∞.
Of all graphs with n nodes, the complete graph is the one with the largest num-
ber of odd cycles. So limn→∞ bs (Kn ) = 1/2 is the minimum value of bs that any
network can take.
Another way of accounting for the global bipartivity of a network is to consider
the difference of the number of closed walks of even and odd length, and then to
normalize the index by the sum of closed walks. That is,
b_e = \frac{\sum_{j=1}^{n}\cosh(\lambda_j) - \sum_{j=1}^{n}\sinh(\lambda_j)}{\sum_{j=1}^{n}\cosh(\lambda_j) + \sum_{j=1}^{n}\sinh(\lambda_j)} = \frac{tr(\exp(-A))}{tr(\exp(A))} = \frac{\sum_{j=1}^{n}\exp(-\lambda_j)}{\sum_{j=1}^{n}\exp(\lambda_j)}.    (18.13)
Clearly, 0 ≤ be ≤ 1.
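Both indices follow in a few lines from the spectrum of A. A minimal sketch, assuming Python with numpy and networkx (the function name is ours):

import networkx as nx
import numpy as np

def bipartivity(G):
    lam = np.linalg.eigvalsh(nx.to_numpy_array(G))
    bs = np.sum(np.cosh(lam)) / np.sum(np.exp(lam))   # equation (18.10)
    be = np.sum(np.exp(-lam)) / np.sum(np.exp(lam))   # equation (18.13)
    return bs, be

print(bipartivity(nx.complete_bipartite_graph(2, 3)))  # (1.0, 1.0): bipartite
print(bipartivity(nx.complete_graph(20)))              # approx (0.5, 0): far from bipartite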
Problem 18.6
Show that be = 1 for any bipartite network and that be (Kn ) → 0 as n → ∞.
For a bipartite network the spectrum of the adjacency matrix is symmetrically
distributed around zero. Since sinh is an odd function,

\sum_{j=1}^{n}\sinh(\lambda_j) = 0

and so b_e = 1.

Using the eigenvalues of the adjacency matrix of a complete graph we have

b_e(K_n) = \frac{\exp(1-n) + (n-1)\exp(1)}{\exp(n-1) + (n-1)\exp(-1)} \to 0,    (18.14)
as n → ∞.
Problem 18.7
Suppose that we repeatedly add edges to a network G and each time we do, we
add more odd closed walks than even ones. Show that b_e decreases monotonically with
the addition of edges.
Suppose again that a and b are the contributions of the new edge e to even and
odd closed walks, respectively, and b ≥ a. Then
Adding (b + EE)EE_{odd} to each side of the inequality b\,EE_{even} \ge a\,EE_{odd} gives

\frac{EE_{odd} + b}{EE + a + b} \ge \frac{EE_{odd}}{EE}.

Similarly,

\frac{EE_{even}}{EE} \ge \frac{EE_{even} + a}{EE + a + b}.

Hence

b_e(G) = \frac{EE_{even}}{EE} - \frac{EE_{odd}}{EE} \ge \frac{EE_{even} + a}{EE + a + b} - \frac{EE_{odd} + b}{EE + a + b} = b_e(G + e),
as desired.
Examples 18.4
(i) In Figure 18.10 we show how the values of the two spectral indices of bipartivity change as we add links to the
complete bipartite network K2,3 .
(ii) Figure 18.11a illustrates a food web among invertebrates in an area of grassland in England. The values of the
spectral bipartivity indices for this network are bs = 0.766 and be = 0.532, both indicating that the network
displays some bipartite-like structure due to the different trophic layers of interaction among the species. We will
see how to find such bipartitions in Chapter 21.
(iii) Figure 18.11b illustrates an electronic sequential logic circuit where nodes represent logic gates. The values of
the spectral bipartivity indices for this network are bs = 0.948 and be = 0.897, both indicating that the network
displays a high bipartivity.
Figure 18.11 (a) A food network in English grassland (b) An electronic circuit network
Problem 18.8
Find the conditions for which an Erdös–Rényi (ER) network will have be ≈ 1 and
those for which be ≈ 0 when the number of nodes, n, is sufficiently large.
As n → ∞ the eigenvalues of an ER network are given by λ_1 = np, λ_{j≥2} = 0.
Thus,

b_e(ER) = \frac{\exp(-np) + (n-1)}{\exp(np) + (n-1)}.
There is a regime in which the term n − 1 dominates the numerator and denominator in the limit, and another in which the exponential terms dominate. In the first case the limit approaches 1, which happens if exp(np)/(n − 1) → 0. In the second case, the limit approaches 0, which happens if exp(np)/(n − 1) → ∞. Consequently, b_e(ER) ≈ 1 if exp(np) ≪ n − 1. This condition is equivalent to

p \ll \frac{\ln(n-1)}{n}.    (18.15)
Sometimes p < ln(n − 1)/n is sufficient. For example, in an ER network with
1,000 nodes and 2,990 links we measure be (ER) ≈ 0.9. Notice that here np ≈ 6
while ln(n – 1) ≈ 7. However in most cases where (18.15) is not satisfied we can
expect very low bipartivity in ER networks.
FURTHER READING
Estrada, E., Spectral scaling and good expansion properties in complex networks,
Europhys. Lett. 73:649–655, 2006.
Hoory, S., Linial, N. and Wigderson, A., Expander graphs and their applications.
Bulletin of the American Mathematical Society 43:439–561, 2006.
Sarnak, P., What is an Expander? Notices of the AMS, 51:761–763, 2004.
19 Communicability in Networks
19.1 Motivation
The transmission of information is one of the principal functions of complex
networks. Such communication among the nodes of a network can represent the
interchange of thoughts or opinions in social networks, the transfer of information
from one molecule or cell to another by means of chemical, electrical, or other
kind of signal, or the routes of transportation of any material. It is intuitive to think
that this communication mainly takes place by using the shortest route connecting
a pair of nodes. That is, assuming that information is transmitted through the
shortest paths of a network. This is represented in Figure 19.1 (a) by the blue
path between the two marked nodes in the network. However, in any network
different from a tree, there are many other routes for communication between
any pair of nodes. This abundance of alternative routes is of great relevance when
there are failures in some of the links in the shortest path, or simply if there is
heavy traffic along some routes. Some of these alternative routes are illustrated
in Figure 19.1 (b) where the central link of the shortest path connecting the two
marked nodes has been removed.
We conclude that the nodes in a network have improved communication if
there are many relatively short alternative routes between them, other than the
shortest path. The larger the number of these alternative routes the better the
communicability between the corresponding pair of nodes. This is the topic of this
chapter.
G_{pq} = \sum_{k=0}^{\infty} c_k\,(A^k)_{pq},    (19.1)
where the coefficients ck must fulfil the same requirements stated when we
introduced subgraph centrality. By selecting ck = 1/k! we obtain
G_{pq} = \sum_{k=0}^{\infty}\frac{(A^k)_{pq}}{k!} = (\exp(A))_{pq}.    (19.2)
Recall that Gpp measures the subgraph centrality of a node. Using the spec-
tral decomposition of the adjacency matrix the communicability function can be
expressed as
G_{pq} = \sum_{j=1}^{n} q_j(p)q_j(q)\exp(\lambda_j).    (19.3)
G^{odd}_{pq} = (\sinh(A))_{pq},    (19.4)

G^{even}_{pq} = (\cosh(A))_{pq},    (19.5)

G^{res}_{pq} = \left[(I - \alpha A)^{-1}\right]_{pq}, \quad 0 < \alpha < 1/\lambda_1.    (19.6)
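Numerically, all of these communicability functions come straight from matrix functions. A minimal sketch, assuming Python with scipy and networkx:

import networkx as nx
from scipy.linalg import expm, sinhm, coshm

A = nx.to_numpy_array(nx.cycle_graph(6))
G_comm = expm(A)                    # G_pq = (e^A)_pq, equation (19.2)
G_odd, G_even = sinhm(A), coshm(A)  # equations (19.4) and (19.5)
print(G_comm[0, 3])                 # antipodal nodes of the 6-cycle
print(G_odd[0, 3] + G_even[0, 3])   # identical: e^A = cosh(A) + sinh(A)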
For the complete network K_n, whose eigenvalues are λ_1 = n − 1 and λ_j = −1 for j ≥ 2, we can write

G_{pq}(K_n) = q_1(p)q_1(q)e^{n-1} + e^{-1}\sum_{j=2}^{n} q_j(p)q_j(q).    (19.7)

And since the eigenvector matrix Q has orthonormal rows and columns with
q_1 = e/\sqrt{n} (e being the all-ones vector), we have \sum_{j=2}^{n} q_j(p)q_j(q) = \delta_{pq} - 1/n, so that for p ≠ q

G_{pq}(K_n) = \frac{e^{n-1}}{n} - \frac{e^{-1}}{n} \to \infty
as n → ∞.
Now, we turn to the path P_n with n nodes. In this case, we know that the
eigenvalues are

\lambda_j = 2\cos\left(\frac{j\pi}{n+1}\right), \quad j = 1, \ldots, n,    (19.8)

with eigenvector entries q_j(p) = \sqrt{\frac{2}{n+1}}\,\sin\left(\frac{jp\pi}{n+1}\right). So

G_{pq}(P_n) = \sum_{j=1}^{n}\sqrt{\frac{2}{n+1}}\sin\left(\frac{jp\pi}{n+1}\right)\sqrt{\frac{2}{n+1}}\sin\left(\frac{jq\pi}{n+1}\right)e^{2\cos\left(\frac{j\pi}{n+1}\right)}
 = \frac{2}{n+1}\sum_{j=1}^{n}\sin\left(\frac{jp\pi}{n+1}\right)\sin\left(\frac{jq\pi}{n+1}\right)e^{2\cos\left(\frac{j\pi}{n+1}\right)}.

Using the standard trigonometric identity 2 sin θ sin ϑ = cos(θ − ϑ) − cos(θ + ϑ),

G_{pq}(P_n) = \frac{1}{n+1}\sum_{j=1}^{n}\left[\cos\left(\frac{j\pi(p-q)}{n+1}\right) - \cos\left(\frac{j\pi(p+q)}{n+1}\right)\right]e^{2\cos\left(\frac{j\pi}{n+1}\right)}
 = \frac{1}{n+1}\sum_{j=1}^{n}\left[\cos\left(\frac{j\pi(p-q)}{n+1}\right)e^{2\cos\left(\frac{j\pi}{n+1}\right)} - \cos\left(\frac{j\pi(p+q)}{n+1}\right)e^{2\cos\left(\frac{j\pi}{n+1}\right)}\right].

As n → ∞ each of these sums converges to an integral representation of a modified Bessel function of the first kind, so that G_{pq}(P_n) ≈ I_{p-q}(2) − I_{p+q}(2).
Figure 19.2 Plot of the modified Bessel functions of the first kind, Iγ(x) for γ = 0, 1, …, 5. The vertical line at x = 2 helps illustrate that Iγ(2) → 0 as γ grows
Figure 19.2 illustrates Bessel functions of the first kind. We see that as γ →
∞, Iγ (2) → 0 very quickly. Consequently, Gn1 (Pn ) ≈ (In–1 (2) – In+1 (2)) → 0 as
n → ∞.
For instance, G1,5 (P5 ) = 0.048 and G1,10 (P10 ) = 2.98 × 10–6 .
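These values, and the Bessel-function approximation, are easy to reproduce. A minimal sketch, assuming Python with numpy and scipy (scipy.special.iv is the modified Bessel function I_ν):

import numpy as np
from scipy.linalg import expm
from scipy.special import iv

for n in (5, 10):
    A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)  # path with n nodes
    exact = expm(A)[0, n - 1]                # G_{1,n}(P_n)
    bessel = iv(n - 1, 2) - iv(n + 1, 2)     # I_{n-1}(2) - I_{n+1}(2)
    print(n, exact, bessel)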
Example 19.1
Consider the network of interconnections between the different regions of the visual cortex of the macaque. Figure 19.3
illustrates a planar map of the regions of the visual cortex (a) and the map of communicability between pairs of regions
(b). As can be seen, there are a few small regions which communicate intensely.
Figure 19.3 (a) A representation of the macaque cortex (b) The communicability between regions
Problem 19.1
Let S1,n–1 be a star with n nodes in which we have labelled the central node
as 1. Given that the eigenvectors associated with the largest and the smallest
eigenvalues are
q_1 = \frac{1}{\sqrt{2(n-1)}}\left[\sqrt{n-1}\;\; 1\;\; \cdots\;\; 1\right]^T

and

q_n = \frac{1}{\sqrt{2(n-1)}}\left[-\sqrt{n-1}\;\; 1\;\; \cdots\;\; 1\right]^T,
2(n – 1)
find expressions for the communicability between any pair of nodes in the
network.
In the star graph there are two nonequivalent types of pairs of nodes. One
is formed by the central node and any of the nodes of degree one, the other is
formed by any pair of nodes of degree one. Let us designate the communicability
between the first type by Gp1 (S1,n–1 ) and the second by Gpq (S1,n–1 ).
By substituting the values of the eigenvalues and eigenvectors into the
expression for the communicability we have
G_{p1}(S_{1,n-1}) = \frac{1}{\sqrt{2}}\,\frac{e^{\sqrt{n-1}}}{\sqrt{2(n-1)}} - \frac{1}{\sqrt{2}}\,\frac{e^{-\sqrt{n-1}}}{\sqrt{2(n-1)}} + \sum_{j=2}^{n-1} q_j(1)q_j(p), \quad p \neq 1.    (19.13)
From the orthonormality of the eigenvectors we deduce that
0 = \sum_{j=1}^{n} q_j(1)q_j(p) = \sum_{j=2}^{n-1} q_j(1)q_j(p) + q_1(1)q_1(p) + q_n(1)q_n(p).    (19.14)
Thus,
\sum_{j=2}^{n-1} q_j(1)q_j(p) = 0,    (19.15)
and if p ≠ 1,

G_{p1}(S_{1,n-1}) = \frac{1}{\sqrt{n-1}}\left(\frac{e^{\sqrt{n-1}} - e^{-\sqrt{n-1}}}{2}\right) = \frac{\sinh\left(\sqrt{n-1}\right)}{\sqrt{n-1}}.
Similarly,
G_{pq}(S_{1,n-1}) = \frac{e^{\sqrt{n-1}}}{2(n-1)} + \frac{e^{-\sqrt{n-1}}}{2(n-1)} + \sum_{j=2}^{n-1} q_j(p)q_j(q).    (19.16)

From the orthonormality of the eigenvectors again,

\sum_{j=2}^{n-1} q_j(p)q_j(q) = -\frac{1}{n-1},    (19.17)

and therefore

G_{pq}(S_{1,n-1}) = \frac{1}{n-1}\left[\cosh\left(\sqrt{n-1}\right) - 1\right].    (19.18)
Notice that the communicability between the central node and any other node
is determined only by walks of odd length while that between any two noncentral
nodes is determined by even length walks only.
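The closed forms (19.13)–(19.18) can be checked against the matrix exponential of a star. A minimal sketch, assuming Python with numpy, scipy, and networkx:

import networkx as nx
import numpy as np
from scipy.linalg import expm

n = 6
Gc = expm(nx.to_numpy_array(nx.star_graph(n - 1)))  # star S_{1,n-1}; node 0 is central
s = np.sqrt(n - 1)
print(Gc[0, 1], np.sinh(s) / s)                 # central-peripheral pair, cf. (19.13)
print(Gc[1, 2], (np.cosh(s) - 1) / (n - 1))     # two peripheral nodes, cf. (19.18)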
We now look for routes that not only maximize the communication between a pair of nodes but also reduce the disruption in the communication due to the transitivity of the relationships.
In order to identify routes that maximize the communication between nodes p
and q, it is intuitive to think about the communicability function Gpq , which ac-
counts for the amount of information that departs from p and successfully arrives
at q. The ‘disruption’ in the communication is represented by the information
that departs from p and after wandering around the nodes of the network returns
again to p. A natural index to account for this disrupted information is Gpp . If
we assume that the communication between p and q is bidirectional, there is also
disruption in the information sent from q, accounted for by Gqq .
We define an index that accounts for the amount of information disrupted
minus the amount of information that arrives at its destination with
\xi_{pq} \overset{def}{=} G_{pp} + G_{qq} - 2G_{pq}.    (19.19)
Minimizing ξpq represents the case in which we minimize loss of information and
at the same time maximize the information that finally arrives at its destination.
We can show that ξpq can be viewed as the square of an appropriately defined
Euclidean distance between the nodes p and q of the network.
Let A = QDQT be the spectral decomposition of the adjacency matrix of a
network and denote the pth row of Q by uTp . Then we can write
\xi_{pq} = (u_p - u_q)^T e^{D}(u_p - u_q) = \left(e^{D/2}(u_p - u_q)\right)^T\left(e^{D/2}(u_p - u_q)\right),

so that, writing x_p = e^{D/2}u_p,

\xi_{pq} = (x_p - x_q)^T(x_p - x_q) = \|x_p - x_q\|^2,

hence

\sqrt{\xi_{pq}} = \|x_p - x_q\|    (19.20)
is a Euclidean distance. For obvious reasons, we will call ξpq the communicability
distance between the nodes p and q of a graph.
We can define an analogue of the distance matrix for the communicability
distance as follows. Let s = [EE_{11}\; EE_{22}\; \cdots\; EE_{nn}]^T be a column vector of
the subgraph centralities of every node in the graph, let 1 be the all-ones vector, and let

\Xi = s\mathbf{1}^T + \mathbf{1}s^T - 2G    (19.21)

be the matrix of squared communicability distances. The communicability distance matrix X(G) is then obtained entrywise,

X(G)_{pq} = \sqrt{\Xi_{pq}}.    (19.22)
Problem 19.2
Consider the network illustrated in Figure 19.5. Find the routes connecting nodes
1 and 6 with shortest path and communicability distances.
First, we identify the paths connecting nodes 1 and 6. There are six, namely,

1 → 2 → 4 → 6,          1 → 2 → 5 → 4 → 6,
1 → 2 → 3 → 5 → 4 → 6,  1 → 2 → 5 → 3 → 4 → 6,
1 → 2 → 3 → 4 → 6,      1 → 9 → 8 → 7 → 6.

Figure 19.5 Calculate the communicability of the coloured nodes

In order to find the one with the shortest path distance, we simply sum the
shortest path distances between the pairs of nodes forming the path (in this unweighted
network the distance between adjacent nodes is 1). It is obvious that the
shortest path distance is 3 for 1 → 2 → 4 → 6.

For the communicability distance we proceed in a similar way and calculate
ξ12 + ξ24 + ξ46, ξ12 + ξ23 + ξ34 + ξ46, and so on. In order to calculate the communicability
distances we need
G = e^A =

[2.482 2.597 1.625 1.704 1.625 0.465 0.299 0.709 1.628
 2.597 6.530 5.510 5.842 5.510 1.704 0.470 0.333 0.905
 1.625 5.510 5.558 5.510 5.190 1.625 0.416 0.167 0.416
 1.704 5.842 5.510 6.530 5.510 2.597 0.905 0.333 0.470
 1.625 5.510 5.190 5.510 5.558 1.625 0.416 0.167 0.416
 0.465 1.704 1.625 2.597 1.625 2.482 1.628 0.709 0.299
 0.299 0.470 0.416 0.905 0.416 1.628 2.285 1.594 0.704
 0.709 0.333 0.167 0.333 0.167 0.709 1.594 2.280 1.594
 1.628 0.905 0.416 0.470 0.416 0.299 0.704 1.594 2.285].
Note that s is the diagonal of G and from (19.21) and (19.22) we obtain
X(G) =

[0.000 1.954 2.188 2.367 2.188 2.008 2.042 1.829 1.230
 1.954 0.000 1.033 1.173 1.033 2.367 2.806 2.854 2.647
 2.188 1.033 0.000 1.033 0.858 2.188 2.648 2.739 2.648
 2.367 1.173 1.033 0.000 1.033 1.954 2.647 2.854 2.806
 2.188 1.033 0.858 1.033 0.000 2.188 2.648 2.739 2.648
 2.008 2.367 2.188 1.954 2.188 0.000 1.230 1.829 2.042
 2.042 2.806 2.648 2.647 2.648 1.230 0.000 1.174 1.778
 1.829 2.854 2.739 2.854 2.739 1.829 1.174 0.000 1.174
 1.230 2.647 2.648 2.806 2.648 2.042 1.778 1.174 0.000].
In this case, ξ12 + ξ24 + ξ46 = 5.081 but ξ19 + ξ98 + ξ87 + ξ76 = 4.808. This means
that according to the communicability distance, the route 1 → 9 → 8 → 7 → 6
is shorter than 1 → 2 → 4 → 6 and it can be confirmed that no other route has
shorter communicability distance. The two routes are marked in Figure 19.6.
In order to understand the differences between the two routes we just need
to recall the definition of communicability distance ξpq = Gpp + Gqq – 2Gpq .
Figure 19.6 A contrast between distance and communicability minimization: (a) shortest path; (b) shortest communicability route
It indicates a route that maximizes the communicability between the two nodes
and minimizes the disruption. The route 1 → 9 → 8 → 7 → 6 certainly reduces
the chances of getting lost along the way without increasing the length of the route
excessively.
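The whole calculation can be reproduced from scratch. A minimal sketch, assuming Python with numpy, scipy, and networkx; the edge list below is inferred from the six paths listed in the solution:

import networkx as nx
import numpy as np
from scipy.linalg import expm

edges = [(1, 2), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5),
         (4, 6), (1, 9), (9, 8), (8, 7), (7, 6)]
G = nx.Graph(edges)
nodes = sorted(G)
Gc = expm(nx.to_numpy_array(G, nodelist=nodes))
s = np.diag(Gc)
X = np.sqrt(np.add.outer(s, s) - 2 * Gc)   # communicability distance matrix (19.22)

def route_length(route):
    idx = [nodes.index(v) for v in route]
    return sum(X[p, q] for p, q in zip(idx, idx[1:]))

print(route_length([1, 2, 4, 6]))      # approx 5.081
print(route_length([1, 9, 8, 7, 6]))   # approx 4.808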
A good analogy for understanding the differences between the shortest path
and the communicability distance is the following. Suppose every node is rep-
resented by a ball of mass proportional to its subgraph centrality. Then a node
participating in many small subgraphs will have a very large mass. Now place the
network onto a rubber sheet which will be deformed according to the masses of
the corresponding nodes. Then to travel from one node to another we need to
follow the ‘geodesic’ paths, which include the deformations of the sheet. Conse-
quently, as illustrated in Figure 19.7, going from one node to another by using a
route that involves nodes of large masses (large subgraph centrality) will increase
the length of the trajectory greatly in comparison with those routes involving low
mass nodes only. In the picture, there are two alternative routes between nodes
1 and 3. The route 1 → 2 → 3 involves node 2 which has very low subgraph
centrality and barely deforms the rubber sheet. The route 1 → 4 → 3 involves
node 4 which has a large subgraph centrality and produces a large deformation
of the sheet. So the second route can be considered to involve a longer trajectory
due to the large deformation of the ‘space’ produced by node 4.
20 Statistical Physics Analogies

…network and a network in a thermal reservoir, respectively. We derive the fundamental formulae for entropy, Helmholtz, and Gibbs free energies for classical and quantum systems. We make an interpretation of the concept of temperature as applied to network sciences.
20.1 Motivation
In Chapters 8 and 16 we introduced classical and quantum mechanical analo-
gies for studying complex networks. Here, we develop analogies from statistical
mechanics to use in network sciences. The term ‘statistical mechanics’ was intro-
duced by Gibbs in the nineteenth century as a means to emphasize the necessity
of using statistical tools in describing macroscopic physical systems, since it is
impossible to deduce the properties of such systems by analysing the mechanical
properties of its individual constituents. It is possible to obtain some macroscopic
measurements of the system and the goal of statistical mechanics is to use stat-
istical methods to connect these macroscopic properties of the system with the
microscopic structure and dynamics occurring in them.
A fundamental concept of both thermodynamics and statistical mechanics is
that of entropy. This concept has proved useful in areas far removed from thermal
physics and it is now ubiquitous in many fields of physical, biological, and social
sciences. In this chapter we will prepare the terrain for understanding entropy
and its implications for studying networks. We will consider briefly the follow-
ing fundamental questions: What is the physical meaning of entropy? How is it
connected to the information content of a network? What is the network tempera-
ture? What is the connection between network entropy and other thermodynamic
properties?
Zeroth law: If the systems A and B are in thermal equilibrium with a third
system C, then A and B must also be in thermal equilibrium with each other.

First law: The internal energy U of an isolated system is constant. The change
of internal energy produced by a process that causes the system to absorb
heat Q and do work W is given by ΔU = Q − W.

Second law: For an isolated system, the entropy S is a state function. No process
taking place in the system can decrease the entropy. That is, ΔS ≥ 0.
If the system absorbs an infinitesimal amount of heat δQ (the system is not
isolated), the entropy changes by
dS = \frac{\delta Q}{T},    (20.1)
" #
∂H
T= . (20.2)
∂S p
F = H – TS. (20.3)
S = kB ln N , (20.4)
E = \frac{p^2}{2m} + \frac{m\omega^2 x^2}{2},    (20.5)
which can be written as the equation of an ellipse
1 = \frac{x^2}{x_{max}^2} + \frac{p^2}{p_{max}^2},    (20.6)
where

p_{max} = \sqrt{2mE},    (20.7)

x_{max} = \sqrt{\frac{2E}{m\omega^2}},    (20.8)
and the ‘surfaces’ of constant energy are described by ellipses with these axes. In
this case, the volume of the phase space is the area of the ellipse
\pi p_{max} x_{max} = \pi\sqrt{\frac{4E^2}{\omega^2}} = \frac{2\pi E}{\omega}.    (20.9)
Suppose that the energy can only be measured with a degree of uncertainty ΔE. The area of the accessible phase space is given by

\Delta a = \frac{2\pi(E + \Delta E)}{\omega} - \frac{2\pi E}{\omega} = 2\pi\frac{\Delta E}{\omega}.    (20.10)

Thus, the number of microscopic states accessible to the system is obtained by
dividing Δa by h_0, the size of a cell in which the phase space is partitioned:

N = \frac{\Delta a}{h_0} = \frac{2\pi\Delta E}{h_0\omega}.    (20.11)
The first two terms in (20.19) are constant and if we define the constant

Z = \frac{\Omega_T(E_T)}{\Omega_R(E_T)}    (20.20)

then

P(E) = \frac{1}{Z}\,\Omega(E)\exp(-\beta E).    (20.22)
Z
The Helmholtz free energy of the system can now be obtained from the
partition function using the expression
F = –β –1 ln Z. (20.24)
For instance, for the classical SHO the partition function is written as

Z = \int\frac{dx\,dp}{h_0}\exp\left[-\beta\left(\frac{p^2}{2m} + \frac{m\omega^2 x^2}{2}\right)\right].    (20.25)

Evaluating the two Gaussian integrals, and taking h_0 = h = 2πħ, gives

Z = \frac{k_B T}{\hbar\omega},    (20.26)
so

F = -k_B T\ln\left(\frac{k_B T}{\hbar\omega}\right) = k_B T\ln(\beta\hbar\omega),    (20.27)

and

S = -\left(\frac{\partial F}{\partial T}\right)_V = k_B\left[\ln\left(\frac{k_B T}{\hbar\omega}\right) + 1\right].    (20.28)
Problem 20.1
Find expressions for the partition function, entropy, Helmholtz, and Gibbs free
energies of the simple quantum harmonic oscillator.
For the quantum SHO we have seen that

\hat{H} = \hbar\omega\left(\hat{N} + \frac{1}{2}\right),

so

Z = \sum_{j=0}^{\infty}\exp\left[-\beta\hbar\omega\left(j + \frac{1}{2}\right)\right] = \exp\left(-\frac{\beta\hbar\omega}{2}\right)\sum_{j=0}^{\infty}\exp(-\beta\hbar\omega j).

Summing the geometric series,

Z = \frac{\exp(-\beta\hbar\omega/2)}{1 - \exp(-\beta\hbar\omega)} = \frac{1}{\exp(\beta\hbar\omega/2) - \exp(-\beta\hbar\omega/2)} = \frac{1}{2\sinh(\beta\hbar\omega/2)} = \frac{1}{2}\operatorname{csch}\left(\frac{\beta\hbar\omega}{2}\right).
There are alternative expressions for the entropy of a system in the quantum
canonical ensemble and we now derive one. The probability of finding the system
in a quantum state with energy Ej is given by
p_j = \frac{1}{Z}\exp(-\beta E_j),    (20.29)

with the condition that \sum_{j=1}^{n} p_j = \frac{1}{Z}\sum_{j=1}^{n}\exp(-\beta E_j) = 1.
Then from (20.24),

S = -\left(\frac{\partial F}{\partial T}\right)_V = k_B\ln Z + k_B T\,\frac{1}{k_B T^2}\left(-\frac{\partial}{\partial\beta}\ln Z\right)
 = k_B\left[\ln Z - \frac{\beta}{Z}\sum_j(-E_j)\exp(-\beta E_j)\right] = k_B\left[\ln Z\sum_j p_j - \frac{\beta}{Z}\sum_j(-E_j)\exp(-\beta E_j)\right]
 = k_B\sum_j p_j(\ln Z + \beta E_j) = -k_B\sum_j p_j\ln p_j.    (20.31)
Problem 20.2
Consider a tight-binding model for a network with parameters α̃ = 0 and β̃ = –1.
Define the partition function, entropy, Helmholtz, and Gibbs free energies for a
network.
The Hamiltonian for the tight-binding model is Ĥ = α̃I + β̃A, and the partition
function of the network with α̃ = 0 and β̃ = −1 is

Z = tr(\exp(-\beta\hat{H})) = tr(\exp(\beta A)).

In terms of the eigenvalues of the adjacency matrix,

Z = \sum_{j=1}^{n}\exp(\beta\lambda_j).
Notice that the partition function of the network is just the sum of the subgraph
centralities of the network. The index tr(exp(βA)) is usually called the Estrada
index of the network. Using (20.31) we can obtain the entropy of the network in
the canonical ensemble. Since E_j = −λ_j, we have p_j = exp(βλ_j)/Z and so

S = -k_B\sum_j p_j(\beta\lambda_j - \ln Z) = -\frac{1}{T}\sum_j\lambda_j p_j + k_B\ln Z\sum_j p_j = -\frac{1}{T}\sum_j\lambda_j p_j + k_B\ln Z,    (20.32)

which can be rearranged to give

-\beta^{-1}\ln Z = -\sum_j\lambda_j p_j - TS.

Using the expression F = H − TS we obtain H = -\sum_j\lambda_j p_j and F = -\beta^{-1}\ln Z.
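The network versions of these quantities take only a few lines. A minimal sketch, assuming Python with numpy and networkx, with k_B = 1 (the function name is ours):

import networkx as nx
import numpy as np

def network_thermodynamics(G, beta=1.0):
    lam = np.linalg.eigvalsh(nx.to_numpy_array(G))
    Z = np.sum(np.exp(beta * lam))   # partition function; the Estrada index at beta = 1
    p = np.exp(beta * lam) / Z       # occupation probabilities p_j
    S = -np.sum(p * np.log(p))       # entropy with k_B = 1
    H = -np.sum(lam * p)             # H = -sum_j lambda_j p_j, as derived above
    F = -np.log(Z) / beta            # Helmholtz free energy F = -ln(Z)/beta
    return Z, S, H, F

Z, S, H, F = network_thermodynamics(nx.path_graph(5))
print(np.isclose(F, H - S))          # checks F = H - TS with T = 1/beta = 1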
21 Communities in Networks

…more densely connected among themselves than with the rest of the nodes of the network. We study some of the methods that aim to find such structures in networks. We finish with a method to detect anti-communities (bipartitions) in networks.
21.1 Motivation
In real-world networks, nodes frequently group together forming densely con-
nected clusters which are poorly connected with other parts of the network.
Clusters, also known as communities in network theory, may form for many
reasons. For instance, we all belong to clusters formed by our friends and re-
latives. Inside these groups one can find a relatively high density of ties but in
many cases these are poorly connected to other groups in society. Clusters can
also be formed due to similarities among the nodes. For instance, groups of pro-
teins with similar functions in a protein–protein interaction network may be more
densely connected with each other than with proteins which have different func-
tions. In this chapter, we study how to find these communities of nodes based on
the information provided by the topological structure of the network.
Examples 21.1
(i) In Figure 21.1 we illustrate two well-known cases of networks with communities. The first corresponds to the
friendship ties among individuals in a karate club at a US university. At some point in time, the members of
this social network were polarized into two different factions due to an argument between the instructor and
the president. These two factions, represented in Figure 21.1(a) in different colours, act as cohesive groups
which can be considered as independent entities in the network. Another example is provided in Figure 21.1(b).
The network represents 62 bottlenose dolphins living in Doubtful Sound, New Zealand. Links are drawn between
two animals if they are seen together more frequently than expected at random. These dolphins split into groups
after one particular dolphin moved away for a period of time.
(ii) Political polarization in the USA leads to a number of clustered networks. We illustrate this with an example
based on political literature. Figure 21.2 represents a network of books on US politics published around the time
of the 2004 presidential election and sold on Amazon.com. There is a link between two books if they share several
common purchasers. As can be seen, there is a clear congregation into two main communities. These represent
the purchases of consumers of conservative literature on one side and of liberal literature on the other.
In these examples, the clusters were induced by empirical evidence. The ques-
tion that arises is whether we can find such partitions in networks without any
other information than that provided by the topological structure of the network.
When looking at the graph Laplacian we saw that such an approach is viable
via the Fiedler vector. We will return to that subject in Section 21.3, but we will
consider several other approaches, too.
Let k_i^{int} and k_i^{ext} denote the number of links connecting node i to nodes inside and outside its cluster C, respectively. The number of links inside C is

m_C = \frac{1}{2}\sum_{i\in C} k_i^{int},

and the internal density of C is δ_{int}(C) = 2m_C/(n_C(n_C − 1)), where n_C = |C|. The density of the whole network is

\delta(G) = \frac{2m}{n(n-1)}.

There can be at most n_1 n_2 edges between two groups of nodes of size n_1 and
n_2. So we define the inter-cluster density to be

\delta_{ext}(C) = \frac{m_{C\bar{C}}}{n_C(n - n_C)} = \frac{\sum_{i\in C} k_i^{ext}}{n_C(n - n_C)}.
Example 21.2
We can calculate the quantities defined above for the networks in Figure 21.1.
In the karate club network, the values of the internal densities of the com-
munities C1 (followers of the instructor) and C2 (followers of the president)
are δint (C1 ) = 0.26 and δint (C2 ) = 0.24, respectively, and the inter-cluster
density is δext (C1 ) = δext (C2 ) = 0.035. The total density of this network is
δ(G) = 0.14.
For the network of bottlenose dolphins, the cluster represented in blue has
δint (C1 ) = 0.26, while the one represented in red has δint (C2 ) = 0.14 and the
external density is δext (C1 ) = δext (C2 ) = 0.007, while the total density of the
network is δ(G) = 0.08.
For these clusters, found experimentally, the internal density of every cluster is
significantly larger than both its external density and the total density of the network.
Another observation is that every pair of nodes in a community is joined by at
least one path consisting only of edges and nodes of the same community. That is,
communities are internally connected. We can loosely define a cluster, or community,
as a connected group of nodes whose internal density is significantly larger than its
external one.
This principle is the basis for the construction of networks with explicit clus-
ters, which are known as benchmark graphs. Although our two examples relate
to social networks, clustering is important in applications involving many other
types of network, too.
• \bigcup_{i=1}^{p} V_i = V and V_i \cap V_j = \emptyset for i \neq j.

The process can be generalized to weighted networks, where the cut-set is
defined as the sum of the weights of the links crossing subsets.
Example 21.3
Let us consider the network illustrated in Figure 21.3. We will return to this network repeatedly in this chapter and we
name it Gclus . A random partition of the nodes into subsets of equal size is represented by the dotted line.
Moving v1 unbalances the number of nodes in each partition. Rather than simply analysing the gain obtained by
moving one node from one partition to another, we need to quantify the gain produced by swapping two nodes in
opposite partitions. If v1 ∈ V1 and v2 ∈ V2 then the gain of interchanging is given by
g(v_1, v_2) = \begin{cases} g(v_1) + g(v_2) - 2, & v_1 \sim v_2,\\ g(v_1) + g(v_2), & \text{otherwise}.\end{cases}
In our example, since v1 and v2 are not adjacent, g(v1 , v2 ) = 3 and the cut size is reduced from 5 to 2. The new
partition is shown in Figure 21.5.
This node swapping process forms the basis for the Kernighan–Lin algo-
rithm. We start with a balanced bisection {V1 , V2 } and we compute the cut size,
C0 (V1 , V2 ), of the bisection. Then for k = 1, 2, . . . , r, where r = min(|V1 |, |V2 |)
we carry out the following steps.
• Find the pair of nodes v1 ∈ V1 and v2 ∈ V2 which give the biggest value of
g(v1 , v2 ) (which may be negative).
• Label these nodes vk1 and vk2 .
• For any node u adjacent to either vk1 or vk2 we update the value of g(u).
• Calculate the cut size C_k(V_1, V_2) that results from exchanging v_{k1} and v_{k2}
between V_1 and V_2.
• Repeat the whole process until no further improvement is achieved for the
cut size.
The algorithm requires a time proportional to the third power of the number of
nodes in the network, O(n3 ), but improvements can be made to reduce this cost
significantly. In brief, these changes improve the process for swapping nodes;
use a fixed number of iterations; and only evaluate gain for nodes close to the
partition boundary. We introduce this algorithm here not because it is used today
for detecting communities in networks but because it helps us to understand the
intuition behind the partitioning methods for detecting communities.
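An implementation of this bisection heuristic ships with networkx; a minimal usage sketch:

import networkx as nx
from networkx.algorithms.community import kernighan_lin_bisection

G = nx.karate_club_graph()
V1, V2 = kernighan_lin_bisection(G, seed=1)  # balanced bisection by node swapping
print(sorted(V1), sorted(V2))
print(nx.cut_size(G, V1, V2))                # number of links crossing the bisection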
In 1988, Powers showed that eigenvectors of the adjacency matrix can also
be used to find partitions in a network. The idea behind these methods is
that the eigenvector qA2 corresponding to the second largest eigenvalue has both
positive and negative components, allowing a partition of the network according
to the sign pattern of this eigenvector.
Example 21.4
For Gclus, the Fiedler vector of the Laplacian and the second eigenvector of the adjacency matrix are

qL2 = [−0.30 −0.13 0.18 0.29 0.28 0.32 0.34 0.22 −0.12 −0.22 −0.33 −0.51]^T,
qA2 = [−0.50 −0.34 0.08 0.13 0.24 0.23 0.16 0.16 −0.31 −0.42 −0.37 −0.20]^T.
Both eigenvectors produce the same bipartition of this network as the one
produced by the Kernighan–Lin algorithm.
Example 21.5
In Figure 21.6 we show the clusters induced in the karate club by using two eigenvectors. In (a) we have used the
Laplacian matrix and in (b) the adjacency matrix. Notice that in picture (b) not all of the clusters are connected.
Figure 21.6 Two partitions of the Zachary karate club induced by multiple eigenvectors
Example 21.6
In Figure 21.7 we highlight the two links with highest edge betweenness
centrality for Gclus .
Removing these links partitions the network in the same way as by using
the Kernighan–Lin approach but does not require an initial bisection.
• Calculate the edge betweenness centrality for all links in the network.
• Remove all links with the largest edge betweenness.
• Repeat these steps on the new network and continue until all links have been
removed.
Example 21.7
The dendrogram for Gclus is illustrated in Figure 21.8. The top dotted line indicates the division of the network into
two communities, each formed by six nodes. The second dotted line indicates a division into four communities having
3, 3, 5, and 1 nodes, respectively.
21.5 Modularity
Girvan and Newman proposed modularity as a measure of quality of clusters.
They start with the assumption that a cluster is a structural element of a network
that has been formed in a far from random process. If we consider the actual
density of links in a community, it should be significantly larger than the density
we would expect if the links in the network were formed by a random process.
Definition 21.2 Let G(V , E) be a network of n nodes and m edges with ad-
jacency matrix A and suppose we have divided the nodes into nC clusters
V1 , V2 , . . . , VnC . Define sir to equal 1 if node i is in cluster r and 0 otherwise.
Then the modularity of the partitioning is given by
Q = \frac{1}{2m}\sum_{r=1}^{n_C}\sum_{i,j=1}^{n}\left(a_{ij} - \frac{k_i k_j}{2m}\right)s_{ir}s_{jr},
Modularity can be interpreted as the sum over all partitions of the difference
between the fraction of links inside each partition and the expected fraction, by
considering a random network with the same degree for each node, giving
Q = \sum_{k=1}^{n_C}\left[\frac{|E_k|}{m} - \frac{1}{4m^2}\left(\sum_{j\in V_k}k_j\right)^2\right],    (21.1)
where |Ek | is the number of links between nodes in the kth partition of the net-
work. If the number of intra-cluster links is no bigger than the expected value for
a random network then Q = 0. The maximum modularity is one and the more
positive the value of Q, the more convincing is the community structure.
Example 21.8
The number of edges and sum of degrees for the two communities are

|E_{C_1}| = 7, \quad \sum_{j\in V_{C_1}} k_j = 16, \qquad |E_{C_2}| = 9, \quad \sum_{j\in V_{C_2}} k_j = 20,

so that

Q = \left[\frac{7}{18} - \left(\frac{16}{36}\right)^2\right] + \left[\frac{9}{18} - \left(\frac{20}{36}\right)^2\right] = 0.383.
The goal of finding this value is to compare it with the modularity of other
partitions of the same network, as we will see later, in Section 21.6.
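Equation (21.1) translates directly into code, and networkx provides the same quantity for cross-checking. A minimal sketch, assuming Python with networkx (the function name is ours); here we evaluate the two observed factions of the karate club:

import networkx as nx
from networkx.algorithms.community import modularity

def Q(G, clusters):
    # Equation (21.1): sum over clusters of |E_k|/m - (degree sum / 2m)^2
    m = G.number_of_edges()
    total = 0.0
    for C in clusters:
        Ek = G.subgraph(C).number_of_edges()
        dk = sum(d for _, d in G.degree(C))
        total += Ek / m - (dk / (2 * m)) ** 2
    return total

G = nx.karate_club_graph()
club = nx.get_node_attributes(G, 'club')
parts = [{v for v in G if club[v] == 'Mr. Hi'},
         {v for v in G if club[v] != 'Mr. Hi'}]
print(Q(G, parts), modularity(G, parts))   # the two values agree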
21.6 Communities based on communicability
G_{pq} = q_1(p)q_1(q)\exp(\lambda_1) + \sum_{j=2}^{n} q_j(p)q_j(q)\exp(\lambda_j).    (21.2)
Recall from Chapter 18 that if a network has a large spectral gap (λ1 ≫ λ2)
then we can expect it to be homogeneous, which indicates that it contains no
communities or clusters due to the lack of any cut-set in the network. If this is
the case, the term q1 ( p)q1 (q) exp(λ1 ) dominates (21.2). Whether there is a large
spectral gap or not, we can interpret this term as the extent to which the network
forms a single cohesive unit and use the other terms to indicate potential clusters.
One approach is to make use of the signs of the elements of eigenvectors. This
is illustrated in Figure 21.11 where we show the sign of the eigenvectors corres-
ponding to the four largest eigenvalues of Gclus . As we have seen, the sign pattern
of the second largest eigenvector of the adjacency matrix induces a partition of
the network. We represent this by using two different kinds of arrows for positive
and negative contributions for λ2 , λ3 , and λ4 .
We can use this sign pattern of the eigenvectors to express the sum of the
contributions of the non-principal eigenvalues and eigenvectors to the communicability
function as

\sum_{j=2}^{n} q_j(p)q_j(q)e^{\lambda_j} = \left[\sum_{j=2}^{n} q_j^{+}(p)q_j^{+}(q)e^{\lambda_j} + \sum_{j=2}^{n} q_j^{-}(p)q_j^{-}(q)e^{\lambda_j}\right] - \left|\sum_{j=2}^{n} q_j^{+}(p)q_j^{-}(q)e^{\lambda_j} + \sum_{j=2}^{n} q_j^{-}(p)q_j^{+}(q)e^{\lambda_j}\right|,    (21.3)
where q+j (p) indicates that the entry corresponding to the node p of the jth
eigenvector is positive.
Now, consider a cluster formed by nodes which have the same sign contribu-
tion to the communicability function. A physical interpretation is that two nodes
with the same sign in one eigenvector are ‘vibrating’ in the same direction, so
should be coupled together. On the contrary, if two nodes have different signs
for the same eigenvector they are in different clusters because they are vibrating
in different phases. This analogy allows us to group the two contributions of this
term to the communicability in the intra- and inter-cluster communicabilities,
\sum_{j=2}^{n} q_j(p)q_j(q)\exp(\lambda_j) = G_{pq}^{intra\text{-}cluster} - \left|G_{pq}^{inter\text{-}cluster}\right|,    (21.4)
which means that we only need to calculate ΔG_{pq} = G_{pq} − q_1(p)q_1(q)exp(λ_1) in order
to determine the difference between the intra- and inter-cluster communicabilities.
However, because of the definition of a community, we need to check this quantity for
every pair of nodes. In order to do that we can use the following algorithm.

• Compute ΔG_{pq} = G_{pq} − q_1(p)q_1(q)exp(λ_1) for every pair of nodes p and q.
• Form the communicability graph: it has the same nodes as G, with p and q connected if and only if ΔG_{pq} > 0.
• Find the cliques of the communicability graph; these (possibly overlapping) cliques are the communities.
Example 21.10
For Gclus, subtracting the contribution of the principal eigenpair from the communicability function gives
ΔG =

[ 3.49  1.84 −0.58 −0.78 −1.37 −1.18 −0.81 −0.93  1.73  1.86  2.12  1.65
  1.84  2.29  0.30 −0.28 −0.93 −1.19 −0.90 −0.97  0.72  1.70  1.16  0.59
 −0.58  0.30  1.16  0.83  0.43 −0.57 −0.64 −0.29 −0.75 −0.47 −0.51 −0.27
 −0.78 −0.28  0.83  1.49  0.76 −0.35 −0.55 −0.38 −0.84 −0.68 −0.54 −0.29
 −1.37 −0.93  0.43  0.76  1.15  0.41 −0.10  0.14 −1.09 −1.20 −0.95 −0.50
 −1.18 −1.19 −0.57 −0.35  0.41  1.48  1.05  0.63 −0.70 −1.04 −0.80 −0.42
 −0.81 −0.90 −0.64 −0.55 −0.10  1.05  1.53  0.68 −0.37 −0.71 −0.55 −0.29
 −0.93 −0.97 −0.29 −0.38  0.14  0.63  0.68  0.84 −0.13 −0.78 −0.72 −0.39
  1.73  0.72 −0.75 −0.84 −1.09 −0.70 −0.37 −0.13  2.05  1.60  1.10  0.56
  1.86  1.70 −0.47 −0.68 −1.20 −1.04 −0.71 −0.78  1.60  2.78  1.93  0.47
  2.12  1.16 −0.51 −0.54 −0.95 −0.80 −0.55 −0.72  1.10  1.93  2.38  0.71
  1.65  0.59 −0.27 −0.29 −0.50 −0.42 −0.29 −0.39  0.56  0.47  0.71  1.63 ].
In Figure 21.13 we have reordered the nodes of ΔG in order to illustrate the existence of a positive diagonal
block formed by nodes 1, 2, 9, 10, 11, 12, which has a negative level of communication with the rest of the nodes. The
contour plot, however, fails to highlight the community formed by nodes 3, 4, 5, 6, 7, 8.
We transform ΔG into a communicability graph by replacing nonpositive values by zeroes and positive values by ones.
This (0, 1)-matrix represents a new graph consisting of the same set of nodes as the network under analysis, but in
which two nodes are connected if and only if their intra-cluster communicability is larger than the inter-cluster one.
This communicability graph is illustrated in Figure 21.14.
We now find the cliques in the communicability graph, which correspond to the following groups of nodes:
C1 = {1, 2, 9, 10, 11, 12}, C2 = {3, 4, 5}, C3 = {5, 6, 8}, C4 = {6, 7, 8}.
These cliques correspond to the communities identified by the communicability method and they can be represented
by overlapping sets as shown in Figure 21.15.
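The whole procedure, from ΔG to the cliques, takes only a few lines. A minimal sketch, assuming Python with numpy, scipy, and networkx (the function name is ours):

import networkx as nx
import numpy as np
from scipy.linalg import expm

def communicability_communities(G):
    A = nx.to_numpy_array(G)
    lam, Q = np.linalg.eigh(A)
    q1 = np.abs(Q[:, -1])                                # principal eigenvector
    dG = expm(A) - np.exp(lam[-1]) * np.outer(q1, q1)    # Delta G_pq
    mask = (dG > 0).astype(int)
    np.fill_diagonal(mask, 0)
    comm_graph = nx.from_numpy_array(mask)               # communicability graph
    nodes = list(G)
    return [sorted(nodes[i] for i in c) for c in nx.find_cliques(comm_graph)]

print(communicability_communities(nx.karate_club_graph()))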
We want to analyse whether this partition of Gclus into four (overlapping) communities is a better representation than
those obtained previously which partition the network into two communities, so we calculate the modularity of the new
partition. The number of edges and sum of degrees in each community are |E_{C_1}| = 7, |E_{C_2}| = |E_{C_3}| = |E_{C_4}| = 3, and

\sum_{j\in V_{C_1}} k_j = 16, \quad \sum_{j\in V_{C_2}} k_j = 10, \quad \sum_{j\in V_{C_3}} k_j = 12, \quad \sum_{j\in V_{C_4}} k_j = 10,

so that

Q = \left[\frac{7}{18} - \left(\frac{16}{36}\right)^2\right] + \left[\frac{3}{18} - \left(\frac{10}{36}\right)^2\right] + \left[\frac{3}{18} - \left(\frac{12}{36}\right)^2\right] + \left[\frac{3}{18} - \left(\frac{10}{36}\right)^2\right] = 0.426,
which is larger than that of Q = 0.383 previously found by the partitions introduced by the methods considered earlier:
Kernighan–Lin, Girvan–Newman, and the two spectral clustering techniques. Consequently, in this particular case,
the partition introduced by communicability is, at least in terms of modularity, better than the bipartition previously
considered.
21.7 Anti-communities
In Chapter 18 we showed how to measure the degree of bipartivity in a network.
We now show how to find bipartitions in complex networks. For obvious reasons
we call these clusters anti-communities in the network.
Consider a bipartite network and let p and q be two nodes which are in the
two different disjoint partitions of the network. Since there are no walks of even
length starting at p and ending at q,

\tilde{G}_{pq} = [\exp(-A)]_{pq} = -[\sinh(A)]_{pq} < 0.    (21.7)

However, if p and q are in the same partition then, since there are no walks of
odd length connecting them, due to the lack of odd cycles in the bipartite graph,

\tilde{G}_{pq} = [\cosh(A)]_{pq} > 0.    (21.8)
• Form the anti-communicability matrix \tilde{G} = \exp(-A).
• Define

\hat{G}_{pq} = \begin{cases} 1, & \tilde{G}_{pq} > 0,\\ 0, & \tilde{G}_{pq} \le 0 \text{ or } p = q.\end{cases}

• Find the cliques of the anti-communicability graph whose adjacency matrix is \hat{G}; these cliques are the anti-communities.
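A minimal sketch of this bipartition procedure, assuming Python with numpy, scipy, and networkx (the function name is ours); on a complete bipartite graph it recovers the two partitions exactly:

import networkx as nx
import numpy as np
from scipy.linalg import expm

def anti_communities(G):
    Gt = expm(-nx.to_numpy_array(G))   # anti-communicability matrix exp(-A)
    Ghat = (Gt > 0).astype(int)
    np.fill_diagonal(Ghat, 0)          # exclude p = q
    H = nx.from_numpy_array(Ghat)      # anti-communicability graph
    nodes = list(G)
    return [sorted(nodes[i] for i in c) for c in nx.find_cliques(H)]

print(anti_communities(nx.complete_bipartite_graph(3, 4)))
# [[0, 1, 2], [3, 4, 5, 6]] -- the two disjoint sets of the bipartition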
Example 21.12
Consider the network illustrated in Figure 21.18. This graph has spectral bipartivity bS = 0.64. Then
G̃ =

[ 10.15  6.83  7.09  7.11  6.66  8.19 −5.62 −7.98 −7.29 −7.28 −9.02 −8.73
   6.83  8.03  4.17  5.87  5.43  6.72 −6.31 −4.32 −6.56 −5.65 −7.28 −7.04
   7.09  4.17  8.05  5.62  5.50  6.73 −5.42 −6.62 −4.30 −6.56 −7.29 −7.05
   7.11  5.87  5.62  8.17  3.93  6.80 −5.68 −5.96 −6.62 −4.32 −7.98 −7.10
   6.66  5.43  5.50  3.93  7.37  6.36 −5.39 −5.68 −5.42 −6.31 −5.62 −6.77
   8.19  6.72  6.73  6.80  6.36  9.38 −6.77 −7.10 −7.05 −7.04 −8.73 −7.28
  −5.62 −6.31 −5.42 −5.68 −5.39 −6.77  7.37  3.93  5.50  5.43  6.66  6.36
  −7.98 −4.32 −6.62 −5.96 −5.68 −7.10  3.93  8.17  5.62  5.87  7.11  6.80
  −7.29 −6.56 −4.30 −6.62 −5.42 −7.05  5.50  5.62  8.05  4.17  7.09  6.73
  −7.28 −5.65 −6.56 −4.32 −6.31 −7.04  5.43  5.87  4.17  8.03  6.83  6.72
  −9.02 −7.28 −7.29 −7.98 −5.62 −8.73  6.66  7.11  7.09  6.83 10.15  8.19
  −8.73 −7.04 −7.05 −7.10 −6.77 −7.28  6.36  6.80  6.73  6.72  8.19  9.38 ]
and

Ĝ =

[ 0 1 1 1 1 1 0 0 0 0 0 0
  1 0 1 1 1 1 0 0 0 0 0 0
  1 1 0 1 1 1 0 0 0 0 0 0
  1 1 1 0 1 1 0 0 0 0 0 0
  1 1 1 1 0 1 0 0 0 0 0 0
  1 1 1 1 1 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 1 1 1 1 1
  0 0 0 0 0 0 1 0 1 1 1 1
  0 0 0 0 0 0 1 1 0 1 1 1
  0 0 0 0 0 0 1 1 1 0 1 1
  0 0 0 0 0 0 1 1 1 1 0 1
  0 0 0 0 0 0 1 1 1 1 1 0 ]
The anti-communicability graph with adjacency matrix Ĝ is the disconnected network illustrated in Figure 21.19.
This anti-communicability graph has only the two cliques C_1 = {1, 2, 3, 4, 5, 6} and C_2 = {7, 8, 9, 10, 11, 12}.
These two cliques are the anti-communities in the network. This can be emphasized by redrawing the network as
in Figure 21.20.
Using the approach detailed in this section, we can find the best bipartitions
for the PPI and the advisors–advisees networks illustrated in Figure 21.16. The
results are shown in Figure 21.21.
FURTHER READING