Networks, 2nd Edition
Mark Newman

Solutions to Exercises

If you find errors in these solutions, please let the author know. Suggestions for improvements are also welcome. Please email
Mark Newman, [email protected]. Please do not post these solutions on the Web or elsewhere in electronic form. Copyright © 2018
Mark Newman.

6 Mathematics of networks

Exercise 6.1:
a) Undirected
b) Directed, approximately acyclic
c) Planar, tree, directed or undirected depending on the representation
d) Undirected, approximately planar
e) Directed or undirected depending on the network
f) Citation networks, food webs
g) The web, the network of who says they're friends with whom
h) A river network, a plant or a tree or their root system
i) A road network, the network of adjacencies of countries
j) Any affiliation network, recommender networks, keyword indices
k) A web crawler
l) Draw data from a professionally curated index such as the Science Citation Index or Scopus, or from an automated citation crawler such as Google Scholar
m) A literature search
n) Questionnaires or interviews
o) An appropriate map

Exercise 6.2: The maximum number of edges is (n choose 2), because there are (n choose 2) distinct places to put an edge and each can have only one edge in a simple network. The minimum is n − 1, because we require that the network be connected and n − 1 is the minimum number of edges that will achieve this—see the discussion at the top of page 123.

Exercise 6.3: The matrices are as follows:

a) A = [ 0 1 0 0 1 ]
       [ 0 0 1 0 0 ]
       [ 1 0 0 0 1 ]
       [ 0 1 1 0 0 ]
       [ 0 0 0 0 0 ]

b) B = [ 1 0 1 0 0 ]
       [ 0 1 1 0 0 ]
       [ 0 0 0 1 0 ]
       [ 0 1 1 1 1 ]

c) BᵀB = [ 0 0 1 0 0 ]
         [ 0 0 1 1 1 ]
         [ 1 1 0 1 1 ]
         [ 0 1 1 0 1 ]
         [ 0 1 1 1 0 ]

Exercise 6.4:
a) k = A1
b) m = (1/2) 1ᵀA1
c) N = A²
d) (1/6) Tr A³

Exercise 6.5:
a) A 3-regular graph has three ends of edges per node, and hence 3n ends in total. But the total number of ends of edges is also equal to 2m, which is an even number. Hence n must be even.
b) A tree with n nodes has m = n − 1 edges. Hence the average degree is 2m/n = 2(n − 1)/n < 2.
c) The connectivity of A and C must be at least y, because if there are y paths from B to C and x > y paths from A to B, then there are at least y paths all the way from A to C. On the other hand, the connectivity of A and C cannot be greater than y by the same argument: if there were more than y paths from A to C and more than y paths from B to A, then there would be more than y paths from B to C (via A). Hence the connectivity of A and C must be exactly y.

Exercise 6.6: Let the eigenvector element at the central node be x1. By symmetry the elements at the peripheral nodes all have the same value. Let us denote this value x2. Then the


eigenvalue equation looks like this:

   [ 0 1 1 1 ··· ] [ x1 ]       [ x1 ]
   [ 1 0 0 0 ··· ] [ x2 ]       [ x2 ]
   [ 1 0 0 0 ··· ] [ x2 ]  =  λ [ x2 ]
   [ 1 0 0 0 ··· ] [ x2 ]       [ x2 ]
   [ ⋮ ⋮ ⋮ ⋮  ⋱ ] [ ⋮  ]       [ ⋮  ]

where λ is the leading eigenvalue. This implies that (n − 1)x2 = λx1 and x1 = λx2. Eliminating x1 and x2 from these equations we find that λ = √(n − 1). The equation x1 = λx2 then implies that x1 and x2 have the same sign, which means that this must be the leading eigenvalue (by the Perron–Frobenius theorem—see the discussion on page 160 and the footnote on page 161).

Exercise 6.7:
a)
   Total ingoing edges = Σ_{i=1}^{r} k_i^in,
   Total outgoing edges = Σ_{i=1}^{r} k_i^out.

b) The number of edges running to nodes 1 . . . r from nodes r + 1 . . . n is equal to the total number of edges running to nodes 1 . . . r minus the number originating at nodes 1 . . . r. In other words, it is equal to the difference of the two expressions above:

   Number of edges = Σ_{i=1}^{r} (k_i^in − k_i^out).

c) All outgoing edges at node r + 1 must attach to nodes in the range 1 . . . r and hence the number of edges outgoing from node r + 1 can be no greater than the total number from nodes r + 1 . . . n to nodes 1 . . . r. Thus

   k_{r+1}^out ≤ Σ_{i=1}^{r} (k_i^in − k_i^out).

Similarly, all edges ingoing at node r must originate from nodes in the range r + 1 . . . n and hence the total number ingoing at node r can be no greater than the total number from nodes r + 1 . . . n to nodes 1 . . . r. Thus

   k_r^in ≤ Σ_{i=1}^{r} (k_i^in − k_i^out).

Noting that the total number of edges is m = Σ_{i=1}^{n} k_i^in = Σ_{i=1}^{n} k_i^out, this can also be written as

   k_r^in ≤ Σ_{i=r+1}^{n} (k_i^out − k_i^in).

Exercise 6.8: The total number of edges attached to nodes of type 1 is n1 c1. The total number attached to nodes of type 2 is n2 c2. But each edge is attached to one node of each type and hence these two numbers must be equal: n1 c1 = n2 c2.

Exercise 6.9: The network contains an expansion of UG, and hence is nonplanar by Kuratowski's theorem:

[Figure: the network, with an expansion of UG picked out.]

(The five-fold symmetric appearance of the network might lead one at first to hypothesize that it contains an expansion of K5, but upon reflection we see that this is clearly impossible, since every node has degree 3, whereas every node in K5 has degree 4. Thus if the network is to be nonplanar it must contain an expansion of UG.)

Exercise 6.10: The edge connectivity is two. To prove this we display two edge-independent paths and a cut set of size two thus:

[Figure: two copies of the network, each with nodes A and B marked; one shows two edge-independent paths from A to B, the other a cut set of size two.]

This constitutes a proof because the existence of two independent paths proves that the connectivity must be at least two, while the existence of the cut set proves that the connectivity can be no greater than two.

Exercise 6.11:
a) n = 1, m = 0, f = 1.
b) n → n′ = n + 1, m → m′ = m + 1, f → f′ = f.
c) n → n′ = n, m → m′ = m + 1, f → f′ = f + 1.
d) The relation is f + n − m = 2. Clearly it is true for case (a), which can be conveniently used as a starting point for induction. The relation is also preserved by (b) and (c). And any connected planar graph can be built up by adding its nodes and edges one by one, i.e., by a combination of moves (b) and (c). Hence by induction f + n − m = 2 is true for all such graphs.
e) In a simple graph all faces have at least three edges, except for the "outside" face that extends to infinity, and each edge has two sides. Therefore the number of edges m


times two is at least as great as f − 1 times three, or f ≤ 2m/3 + 1. Substituting this inequality into the relation from part (d) we get m ≤ 3n − 3, and combining this with the relation for the average degree c = 2m/n we get

   c ≤ 6 − 6/n < 6.

Exercise 6.12:
a) The number of paths from s to t of length r is [A^r]_st and each has weight α^r. Thus the sum of the weights for paths of length r is [(αA)^r]_st and the sum for paths of all lengths (including length zero) is

   Z_st = Σ_{r=0}^{∞} [(αA)^r]_st = [(I − αA)⁻¹]_st.

b) The sum converges if |α| < 1/κ1, where κ1 is the largest eigenvalue of A (which is always positive, a consequence of the Perron–Frobenius theorem).
c) The derivative in the problem is given by

   ∂ log Z_st / ∂ log α = (α/Z_st) ∂Z_st/∂α = (α/Z_st) Σ_r r α^{r−1} [A^r]_st = (1/Z_st) Σ_r r [(αA)^r]_st.

Let m be the number of paths from s to t of length ℓ_st. Then the leading terms in Z_st and the sum above are

   Z_st = Σ_r [(αA)^r]_st = m α^{ℓst} + O(α^{ℓst+1}),
   Σ_r r [(αA)^r]_st = m ℓ_st α^{ℓst} + O(α^{ℓst+1}).

Substituting into the previous result we then get

   ∂ log Z_st / ∂ log α = (m ℓ_st α^{ℓst} + O(α^{ℓst+1})) / (m α^{ℓst} + O(α^{ℓst+1})) = ℓ_st + O(α).

Taking the limit α → 0 then gives the required result.

Exercise 6.13:
a) There are k_i^in incoming edges at node i and the sum of the trophic levels at their other ends is Σ_j A_ij x_j. Thus the average trophic level is (1/k_i^in) Σ_j A_ij x_j and the result follows.
b) For species with no prey, k_i^in = 0 and so x_i is undetermined. We can fix this by artificially setting k_i^in = 1 for autotrophs (or indeed setting it to any nonzero value). Then the equation for x_i can be rewritten in vector form as

   x = D⁻¹Ax + 1,

where D is the matrix with the in-degrees down the diagonal (or 1 for nodes with zero in-degree), and 1 is the vector (1, 1, 1, . . .). Rearranging this expression then gives the required result.

7 Measures and metrics

Exercise 7.1:
a) We note that [A1]_i = Σ_j A_ij = k_i = k and hence A1 = k1.
b) The vector x of Katz centralities is given by

   x = (I − αA)⁻¹1 = (I + αA + α²A² + · · ·)1.

Noting, as above, that A1 = k1 for the regular graph, this then becomes

   x = (1 + αk + α²k² + · · ·)1 = 1/(1 − αk) 1,

and hence x_i = 1/(1 − αk) for all i.
c) Betweenness centrality and closeness centrality are the obvious choices.

Exercise 7.2: Starting at any node, there is one node at distance 0, two at distance 1, two at distance 2, and so forth up to a maximum distance of (n − 1)/2. So the mean distance is

   (2/n) Σ_{k=1}^{(n−1)/2} k = (2/n) (n² − 1)/8 = (n² − 1)/4n.

And the closeness is the reciprocal of this, or 4n/(n² − 1).

Exercise 7.3:
a) The equivalence is most easily demonstrated in the reverse direction. We write the series as x = Σ_{k=0}^{∞} (αA)^k 1, then

   αAx + 1 = αA Σ_{k=0}^{∞} (αA)^k 1 + 1 = 1 + Σ_{k=1}^{∞} (αA)^k 1 = Σ_{k=0}^{∞} (αA)^k 1 = x.

Alternatively, we can rearrange the definition of x according to Eq. (7.7) as x = (I − αA)⁻¹1 and then expand the matrix inverse as a geometric series.
b) In the limit of small α we can neglect terms in the series beyond the first two to get x ≃ 1 + αA1, which implies that the centrality of node i is x_i = 1 + αk_i, which is linear in the degrees. Hence in this limit the Katz centrality is, apart from additive and multiplicative constants, the same as the degree centrality—higher degree implies higher Katz centrality.
c) Let us express 1 as a linear combination of the eigenvectors v_r of the adjacency matrix, 1 = Σ_r c_r v_r, for some choice of coefficients c_r. (For a directed network, we use the right eigenvectors.) Then

   (αA)^k 1 = Σ_r (αA)^k c_r v_r = Σ_r c_r (ακ_r)^k v_r,


where κ_r is the eigenvalue corresponding to eigenvector v_r. Summing over k we now have

   x = Σ_{k=0}^{∞} (αA)^k 1 = Σ_r c_r Σ_{k=0}^{∞} (ακ_r)^k v_r = Σ_r c_r v_r / (1 − ακ_r),

so long as ακ_r < 1. Now as we take the limit α → 1/κ1, all terms in the sum remain finite except the term in r = 1, which diverges. Hence this term dominates in the limit and x becomes proportional to the leading eigenvector v1.

Exercise 7.4: Every node in this network is symmetry-equivalent, so we only have to calculate the closeness of one of them. Starting at any node there is one node at distance 0, three at distance 1, and the remaining six are at distance 2. So the mean distance is (0 + 3 + 12)/10 = 3/2 and the closeness centrality is the reciprocal, 2/3.

Exercise 7.5: The vector x of PageRank scores is given by Eq. (7.11) to be x = (I − αAD⁻¹)⁻¹1. In this network, however, all out-degrees are 1 and hence D = D⁻¹ = I and

   x = (I − αA)⁻¹1 = (I + αA + α²A² + · · ·)1.

(The central node has out-degree zero but, as discussed in Section 7.4, we conventionally set the out-degree to one to avoid dividing by zero, and this change has no effect on the PageRank values.) But recall now that the matrix element [A^d]_ij counts the number of paths of length d from j to i and hence n_i^(d) = Σ_j [A^d]_ij = [A^d 1]_i is the number of paths of length d from all nodes to node i. Since the network is a directed tree, however, there is at most one path between each pair of nodes, and hence n_i^(d) is also the number of nodes that have distance exactly d from i. Thus, if the central node is node 1, then

   n_1^(d) = Σ_i δ_{d_i, d},

where d_i is the distance from node i to the central node and δ_mn is the Kronecker delta. Now the PageRank of the central node is

   x_1 = Σ_{d=0}^{∞} α^d [A^d 1]_1 = Σ_{d=0}^{∞} α^d n_1^(d) = Σ_{d=0}^{∞} α^d Σ_i δ_{d_i, d} = Σ_i Σ_{d=0}^{∞} α^d δ_{d_i, d} = Σ_i α^{d_i}.

Exercise 7.6: Let L1 and R1 be the sums of the distances from node 1 to nodes in the left and right shaded regions and similarly for L2 and R2 with node 2. Then we note that the distance from node 2 to any node in the left shaded region is 1 greater than the corresponding distance from node 1 to the same node and hence L2 = L1 + n1. Similarly R1 = R2 + n2. Adding these two expressions gives

   L1 + R1 + n1 = L2 + R2 + n2.

Noting that the closeness centralities are given by C1 = n/(L1 + R1) and C2 = n/(L2 + R2), we then recover the required result.

Exercise 7.7:
a) Because the network is a tree there is only a single shortest path between any pair of nodes. The particular node of interest lies on the shortest path between every pair of nodes except pairs where both members fall in the same disjoint region. There are Σ_i n_i² such pairs, and n² pairs in total, so x = n² − Σ_i n_i².
b) The removal of the ith node divides the line graph into two disjoint regions of sizes n1 = i − 1 and n2 = n − i. Applying the formula we then find that the betweenness of the ith node is 2(n − i + 1)i − 1.

Exercise 7.8:
a) The four leftmost nodes form a 3-core.
b) There are eight edges, of which six are reciprocated, so r = 3/4.
c) The two nodes have two common neighbors and they have degrees 4 and 5 respectively, so their cosine similarity is

   σ = 2/√(4 × 5) = 1/√5.

Exercise 7.9:
a) Every node in the first network is connected to at least three of the others, so the entire network is a 3-core. In the second network, however, there is one node with only two neighbors. Removing this node and then iteratively removing all subsequent nodes with two or fewer neighbors, we end up removing the entire network. Hence there is no 3-core in the second network, despite its close similarity to the first.
b) If you consider single nodes to be strongly connected components then there are three such components in this network:

[Figure: the network with its three strongly connected components marked.]

(Depending on who you ask, a single node may or may not be considered a strongly connected component.)
c) Top to bottom and left to right the local clustering coefficients of the nodes are 0, 1, 1/3, 1, and 2/3.
d) In terms of the quantities e_r and a_r defined on page 205 of the book, we have e11 = 3/10, e22 = 1/2, a1 = 2/5, and a2 = 3/5. Then the modularity is

   Q = e11 − a1² + e22 − a2² = 7/25.


e) There are n² paths total and all of them start, end, or pass through the central node except for those that start and end at the same peripheral node, of which there are n − 1. Hence the betweenness of the central node is n² − (n − 1).

Exercise 7.10: There are three independent paths between every pair of nodes in a 3-component. A node in a 3-core, on the other hand, need only have edges connecting it to three other members of the 3-core, which is a weaker condition. This network, for example, is a single 3-core, but has two 3-components:

[Figure: a network that is a single 3-core but has two 3-components.]

Exercise 7.11: One third of the edges are not reciprocated and two thirds are, so r = 2/3.

Exercise 7.12:
a) It is balanced, as one can show by exhaustively verifying that all loops contain an even number of minus signs.
b) All balanced graphs are clusterable and hence this one is too. Here are the clusters:

[Figure: the network divided into its clusters.]

Exercise 7.13: The number of times the color changes as we go around a loop is equal to the number of minus signs. If this number is odd, then we change an odd number of times, meaning that we end up with the opposite color from the one we started with when we get back to the starting node. Thus the last edge around the loop will not be satisfied: either it is positive and joins unlike colors or it is negative and joins like ones. If all loops have an even number of minus signs, on the other hand, we never run into problems, and the entire network can be colored in this way. Then we simply divide the network into contiguous groups of like-colored nodes. By definition all edges within such a cluster are positive and all edges between different-colored clusters are negative. There are no edges between clusters of the same color, because if there were the clusters would be considered one large one, not two smaller ones. Hence the network is clusterable in the sense defined by Harary.

Exercise 7.14: Assume the network satisfies Davis's criterion of having no loops with exactly one negative edge. Performing the coloring as described and then adding back in the negative edges, we see that a negative edge will fall between two nodes of the same color if and only if those nodes are in the same component, meaning that they are connected by a path of positive edges. That path plus the newly added negative edge then form a loop with exactly one negative edge. But by hypothesis there are no such loops in the network and hence no negative edges can fall between nodes in the same component: they only fall between nodes in different components. Hence all edges between components are negative. Given that all edges within components are by definition positive (since this is how we constructed the components in the first place), the graph is therefore clusterable and the components of the positive-edge network are the clusters.

Exercise 7.15: This question is most simply answered in vector notation. Summing over j is equivalent to multiplying by the uniform vector 1 = (1, 1, 1, . . .), which gives:

   σ1 = (D − αA)⁻¹1 = [(I − αAD⁻¹)D]⁻¹1 = D⁻¹(I − αAD⁻¹)⁻¹1 = D⁻¹x,

where x = (I − αAD⁻¹)⁻¹1 is the vector of PageRank scores (see Eq. (7.11)). Noting that D⁻¹ is the diagonal matrix with the reciprocals of the degrees along its diagonal, this then completes the proof.

Exercise 7.16:
a) The numbers e_r are simple—they are just the diagonal entries in the table. To get the a_r we need to add the fraction of couples in which the woman is in group r and the fraction in which the man is in group r, then divide by two (since a_r is defined as the fraction of ends of edges in group r and the edge corresponding to each couple has two ends). Thus we have a1 = (0.323 + 0.289)/2 = 0.306, and similarly a2 = 0.226, a3 = 0.400, and a4 = 0.068. Then the modularity is

   Q = 0.258 + 0.157 + 0.306 + 0.016 − 0.306² − 0.226² − 0.400² − 0.068² = 0.428.

b) Applying the same approach to the second set of data gives a1 = 0.345, a2 = 0.250, and a3 = 0.395, and the modularity with respect to political alignment is

   Q = 0.25 + 0.15 + 0.30 − 0.345² − 0.250² − 0.395² = 0.362.

c) Both of these modularity values are quite high as such things go—values above Q = 0.3 are often considered significant—so there seems to be substantial homophily in these populations, meaning that couples tend to have similar ethnicity and similar political views significantly more often than one would expect by random chance.
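The arithmetic in Exercise 7.16(a) can be checked directly; this is a small sketch (not part of the original solutions), using only the e_r values and marginal fractions quoted above:

```python
# Modularity Q = sum_r e_rr - sum_r a_r^2 for the ethnicity data of
# Exercise 7.16(a).  The e_r values and the two marginal fractions
# used to build a_1 are the numbers quoted in the solution.
e = [0.258, 0.157, 0.306, 0.016]
a = [(0.323 + 0.289) / 2, 0.226, 0.400, 0.068]

Q = sum(e) - sum(x * x for x in a)
print(round(Q, 3))  # -> 0.428
```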

5
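The Q = 7/25 modularity quoted in Exercise 7.9(d) can be confirmed with exact rational arithmetic; a minimal sketch using Python's standard fractions module, with the e and a values from that solution:

```python
from fractions import Fraction as F

# Modularity Q = e11 - a1^2 + e22 - a2^2 with the values from
# Exercise 7.9(d): e11 = 3/10, e22 = 1/2, a1 = 2/5, a2 = 3/5.
e = [F(3, 10), F(1, 2)]
a = [F(2, 5), F(3, 5)]

Q = sum(err - ar ** 2 for err, ar in zip(e, a))
print(Q)  # -> 7/25
```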
Networks (2nd Edition)
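The closed-form Katz result of Exercise 7.1(b), x_i = 1/(1 − αk) on a k-regular graph, is easy to verify by summing the defining series x = Σ_r (αA)^r 1 directly. A sketch in plain Python; the ring of n = 8 nodes (so k = 2) and α = 0.3 are arbitrary test choices with α < 1/k:

```python
# Katz centrality on a k-regular graph: summing the series
# x = sum_r (alpha*A)^r 1 should give x_i = 1/(1 - alpha*k) at every
# node (Exercise 7.1(b)).  The ring of n = 8 nodes (k = 2) and
# alpha = 0.3 are arbitrary choices satisfying alpha < 1/k.
n, alpha, k = 8, 0.3, 2
neighbors = [[(i - 1) % n, (i + 1) % n] for i in range(n)]  # ring edges

x = [0.0] * n
term = [1.0] * n                     # current term (alpha*A)^r 1, r = 0
for _ in range(200):                 # plenty of terms for convergence
    x = [xi + ti for xi, ti in zip(x, term)]
    term = [alpha * sum(term[j] for j in neighbors[i]) for i in range(n)]

# every x[i] should be close to 1/(1 - alpha*k) = 2.5
```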

8 The large-scale structure of networks

Exercise 8.1:
a) If you double the area of the carpet it will take about twice as long to vacuum, so the complexity is O(n).
b) One finds words in a dictionary, roughly speaking, by binary search—open the book at a random point, decide whether the word you want is backward or forward from where you are, open the book at another random point in that direction, and repeat. Each time you do this you decrease the distance to the desired word by, on average, a factor of two. When the distance gets down to one word, you have found the word you want. The number k of factors of two needed to do this is given by 2^k = n, so k = log2 n and the complexity is O(log n).

Exercise 8.2:
a) Perform a breadth-first search starting from the given node to find the distance to all other nodes, then average those distances and take the reciprocal to get the closeness centrality. The breadth-first search takes time O(m + n) and the average takes time O(n), so the overall running time is O(m + n).
b) Use Dijkstra's algorithm. If implemented using a binary heap, the time complexity would be O((m + n) log n).
c) Use repeated breadth-first searches. Start at node 1 and perform a breadth-first search to find all the nodes in the component it belongs to. Then find the next node that is not in that component and start another breadth-first search from there to find all the nodes in its component. Repeat until there are no nodes left that are not in any of the previously discovered components. Each breadth-first search takes time O(n_c + m_c), where n_c and m_c denote the numbers of nodes and edges in the component. Summing over all components, the total running time is O(n + m).
d) You could use a truncated version of the augmenting path algorithm, in which you repeatedly find independent paths, but stop when you have found two—there is no need to keep going beyond this point if your aim is only to find two paths.

Exercise 8.3:
a) Multiplying an n × n matrix into an n-element vector involves n² multiplies and n² additions. So the running time for the complete calculation is O(n²).
b) Set up an n-element array to represent the result vector, which takes time O(n), then add to it one term for each non-zero element in the adjacency matrix, of which there are 2m. So the total running time is O(m + n).
c) At first sight this calculation is going to take time O(n²) as in part (a) above. But we can do it faster by noting that the modularity matrix can be written in matrix notation as B = A − kkᵀ/2m, where k is the n-element vector with the degrees k_i as its elements. When we multiply an arbitrary vector v by this matrix we get

   Bv = Av − (kᵀv/2m) k.

The first term can be evaluated in time O(m + n) as in part (b). The inner product kᵀv in the second term takes time O(n) to evaluate, then we simply multiply it by k/2m, which takes a further time O(n). Thus the total time for the computation is O(m + n).

Exercise 8.4:
a) O(n)
b) On each round of the algorithm we take O(n) time to find the highest-degree node, then time O(m/n) to remove it (see Table 8.2 on page 228), for a total of O(n + m/n) time per round. There are n rounds, so the total running time is O(n² + m). (Normally m is less than n², so to leading order it can be ignored and the running time is O(n²).)
c) If we use a heap we can find the highest-degree node in time O(1) and remove it from the heap in time O(log n) and from the network in time O(m/n), for a total time of O(m/n + log n) per round of the algorithm, to leading order. Over n rounds the whole calculation then takes time O(m + n log n).
d) We place all the numbers in the heap, which takes O(log n) time per number or O(n log n) for all of them, then repeatedly find and remove the largest one. Finding the largest one takes time O(1) and removing it takes time O(log n), and hence the total running time for sorting n numbers is O(n log n) to leading order.
e) Make a histogram of the degrees as follows. First, create an array of n integers, initially all equal to zero, which takes time O(n). This array represents the bins in our histogram. Then go through the degrees one by one and for each degree k increase the count in the kth bin by one. This also takes time O(n), and at the end of the process the kth bin will contain the number of degrees equal to k. Now print out the contents of the histogram in order from largest degrees to smallest, going through each bin in turn and printing out separately each of the node degrees it contains. For instance if the k = 5 bin contains three nodes, print out a 5 three times. This too takes time O(n). The end result will be a printed list of the degrees in decreasing order, which takes time O(n) to generate. This algorithm is (a version of) radix sort.

Exercise 8.5:
a) For a network stored in adjacency list format, one normally has the degrees stored separately as well, in which case calculating their mean is simply a matter of summing them and then dividing by n, which takes O(n) time.
b) The simplest way to calculate the median is to sort the list of degrees in either increasing or decreasing order and then find the one in the middle of the list. As discussed


in Exercise 8.4, this sorting takes either O(n log n) time or O(n) time, depending on the algorithm used, and hence finding the median takes the same time.
c) One would use Dijkstra's algorithm, which has complexity O((m + n) log n).
d) This requires us to calculate the minimum node cut set, which we can do using the augmenting path algorithm. The running time is O((m + n)m/n), as shown in Section 8.7.2.

Exercise 8.6:
a) If we know the true distance to every node at distance d or less then any node without an assigned distance must have distance d + 1 or greater. If such a node is adjacent to one with distance d, however, then there is at least one path to it of length d + 1, via its distance-d neighbor. Hence its distance must be exactly d + 1.
b) There must be at least one path of length d + 1 to the node in question (let us call it node v). The distance along that path to the penultimate node u (which is a neighbor of v) is then d, meaning that u has distance no greater than d. But u can also have distance no less than d, since if it did then there would be a path of length less than d + 1 to v and hence its distance would not be d + 1. Thus u, which is a neighbor of v, must have distance d.

Exercise 8.7:
a) To find the diameter of a network we must perform a breadth-first search starting from each node, then take the largest distance found in any such search. Each breadth-first search takes time O(m + n) and there are n searches, so the total running time is O(n(m + n)).
b) Listing the neighbors of a node i is simply a matter of running along the appropriate row of the adjacency list, which takes time of order the average length of the row, which is ⟨k⟩. To list the second neighbors, however, we need to list separately all the k_j neighbors of each of the first neighbors j. But the number of ends of edges that attach to node j is, by definition, k_j, and hence we are k_j times more likely to be attached to node j than to a node with degree 1. Taking the appropriate weighted average, the expected number of neighbors of a neighbor is

   Σ_j k_j × k_j / Σ_j k_j = ⟨k²⟩/⟨k⟩,

and the average total number of second neighbors is ⟨k⟩ times this, or ⟨k²⟩, which is also the amount of time it will take to list the second neighbors.

Exercise 8.8: To calculate the reciprocity you have to go through each edge in the network in turn, of which there are m, and ask whether any of the edges outgoing from the node at the tail end of the edge connect back to the node at the head. To check through all the outgoing edges takes a time O(m/n) on average, so the whole calculation takes O(m²/n).

If the degrees were correlated you could have a problem, because more edges could end at nodes with high out-degree than you would expect on average. That would increase the amount of work you needed to do to check all the outgoing edges. Imagine, for instance, the extreme case of a star-like network in which half of all edges pointed inwards to a single hub node and the other half pointed outwards. Then for the (1/2)m ingoing edges you would need to check the destinations of (1/2)m outgoing edges for reciprocity, and the whole calculation would take time O(m²), which is much larger than O(m²/n).

Exercise 8.9:
a) x_i = Σ_j α^{d_ij}.
b) Use breadth-first search to calculate and store all the distances, then run through all nodes performing the sum above.
c) The time complexity for each node is O(m + n) and O(n(m + n)) for all nodes.

9 Network statistics and measurement error

Exercise 9.1:
a) Confusions arising because authors have the same name. Confusions arising because the same author may give their name differently on different papers. Missing papers.
b) Some pages are not reachable from the starting point. Dynamically generated pages might have to be excluded to avoid getting into infinite loops or trees.
c) Laboratory experimental error of many kinds. Missing pathways.
d) Subjectivity on the part of participants. Missing participants. Inaccurate specification of what a "friend" is.
e) Out-of-date or inaccurate maps.

Exercise 9.2:
a) The likelihood is

   L = Π_{i=1}^{n} μ e^{−μx_i} = μ^n e^{−μ Σ_i x_i}.

b) The log-likelihood is log L = n log μ − μ Σ_{i=1}^{n} x_i, and, differentiating with respect to μ and setting the result to zero, we get

   1/μ = (1/n) Σ_{i=1}^{n} x_i.

Exercise 9.3: After the algorithm converges, the parameter values are α = 0.598, β = 0.202, and ρ = 0.415, and the probabilities of edges existing between each pair of nodes are as follows:


[Figure: the network, with the inferred probability of an edge marked on each pair of nodes.]

Exercise 9.4:
a) We are certain about every pair that is connected by an edge in any of these paths: they definitely have an edge. We are also certain about the non-existence of any edge that would shorten a path. For instance, if there were an edge between node 5 and node 7, then the shortest path to node 7 would go along that edge. Since it does not, we know that the edge does not exist. For this reason we can be sure there is no edge between the pairs (1,3), (1,4), (1,6), (1,7), (1,8), (2,7), and (5,7). The remaining pairs are all uncertain: they could have an edge or not.
b) (i) They are definitely connected if they have an edge in any of the paths. (ii) They are definitely not connected if adding an edge would create a shorter path to any node. (iii) All other edges are uncertain.

Exercise 9.5:
a) Dropping the parameters α, β, and ρ from our notation for the sake of brevity, we have

   P(O_ij = 1) = P(O_ij = 1, A_ij = 1) + P(O_ij = 1, A_ij = 0)
              = P(O_ij = 1 | A_ij = 1) P(A_ij = 1) + P(O_ij = 1 | A_ij = 0) P(A_ij = 0).

b) We have:

   P(O_ij = 1 | A_ij = 1) = α,
   P(O_ij = 1 | A_ij = 0) = β,
   P(A_ij = 1) = ρ,
   P(A_ij = 0) = 1 − ρ.

Substituting all of these into the expressions above and in the question gives the required answer.
c) For the reality mining example we have α = 0.4242, β = 0.0043, and ρ = 0.0335, which gives a false discovery rate of 0.226. In other words, more than one in five observed edges is actually wrong. In a sparse network there are very few edges, and even if all of them were false positives they would still only be a tiny fraction of all node pairs, so the value of β would be small.

Exercise 9.6: If observations of edges are reliable, then there are no false positives: when we see an edge, it's really there. So β = 0 in this case. Substituting β = 0 into Eq. (9.29), we get

   Q_ij = ρ(1 − α)^N / [ρ(1 − α)^N + (1 − ρ)]

if E_ij = 0 and Q_ij = 1 otherwise. Substituting these values into Eq. (9.30) then gives

   α = Σ_{i<j} E_ij / (N Σ_{i<j} Q_ij),

with ρ given by the same expression as previously and β = 0.

10 The structure of real-world networks

Exercise 10.1:
a) Every node is connected by an edge to every other, so the shortest path between any two distinct nodes has length 1, and hence the diameter is also 1.
b) The furthest points on the lattice are opposite corners. To reach one corner from the other you have to go L steps along and L steps down for a total of 2L steps, so the diameter is 2L. The equivalent result for a d-dimensional hypercubic lattice is dL. The number of nodes on the hypercubic lattice is n = (L + 1)^d, which implies that L = n^{1/d} − 1. Thus the diameter as a function of n is d(n^{1/d} − 1).
c) On the first step we reach k nodes. On each of the subsequent d − 1 steps the tree branches by a factor of k − 1, so the number of nodes is multiplied by k − 1. After d steps, therefore, we reach k(k − 1)^{d−1} nodes. The total number n_d of nodes reachable in d steps or less is then given by the sum of this quantity over distances 1 to d, plus 1 for the central node:

   n_d = 1 + Σ_{m=1}^{d} k(k − 1)^{m−1} = 1 + k Σ_{m=0}^{d−1} (k − 1)^m = 1 + (k/(k − 2)) [(k − 1)^d − 1].

When this number is equal to n we have reached the whole network, and the diameter is the side-to-side distance in the network, which is twice the corresponding
d) The false discovery rate is relatively large because the ob- value of d. Setting the above expression equal to n and
servations are unreliable: in the language of the reality rearranging for d we get
mining study, many pairs of people who are observed in
log[1 + (k − 2)(n − 1)/k]
proximity do not actually have a connection in the net- diameter  2 .
work. Even though this is true, however, the false positive log(k − 1)
rate is still small because most people who do not have d) The diameter of networks (a) and (c) grows logarithmi-
a connection are never observed in proximity. This is cally or slower with n and hence these networks show
just because the graph is sparse: most pairs of people are the small-world effect. Network (b) does not, although
never observed in proximity at all. To put that another one could argue that it does in the limit of large d.

Solutions to exercises
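The shell-counting argument of Exercise 10.1(c) is easy to verify directly. The following short sketch (not part of the manual; the values of k and d are chosen arbitrarily) counts the nodes shell by shell and compares the total against the closed-form expression 1 + k[(k − 1)^d − 1]/(k − 2):

```python
# Verify 1 + k[(k-1)^d - 1]/(k-2) by explicit shell-by-shell counting.
k, d = 3, 5

# Count nodes reachable in d steps or less from the central node:
# the first shell has k nodes, and each later shell grows by a factor k-1.
total = 1            # the central node itself
shell = k            # number of nodes in the current shell
for step in range(d):
    total += shell
    shell *= k - 1

formula = 1 + k * ((k - 1)**d - 1) // (k - 2)
print(total, formula)
```

For k = 3 and d = 5 the shells contain 3, 6, 12, 24, and 48 nodes, and both counts give 94.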

Exercise 10.2:

a) The constant is fixed by the normalization condition Σ_{k=0}^∞ p_k = 1, which means that

    1 = C Σ_{k=0}^∞ a^k = C/(1 − a).

Hence C = 1 − a.

b)
    P = Σ_{m=k}^∞ p_m = (1 − a) Σ_{m=k}^∞ a^m = a^k.

c) There are m ends of edges attached to each node of degree m and there are np_m such nodes, so the total number of ends of edges attached to nodes of degree m is mnp_m. The number of ends of edges attached to nodes of degree k or greater is given by this quantity summed over m thus:

    Σ_{m=k}^∞ mnp_m = n(1 − a) Σ_{m=k}^∞ m a^m = n a^k (k − ka + a)/(1 − a).

The total number of ends of edges is the same expression with k = 0, which is just na/(1 − a). Dividing one by the other, we then find that

    W = a^k [1 − k(1 − a^{−1})].

d) Eliminating k between our expressions for P and W we then have the claimed result.

e) The unphysical values W > 1 all fall in the range 0 < k < 1. However, k is only allowed to take integer values, so W is never greater than 1 in any real-world situation.

Exercise 10.3: α = 2.53 ± 0.34

Exercise 10.4: The numerator of (10.27) is

    Σ_ij (A_ij − k_i k_j/2m) k_i k_j = Σ_ij A_ij k_i k_j − (1/2m) Σ_i k_i² Σ_j k_j² = S_e − S₂²/S₁,

where we have made use of 2m = Σ_i k_i = S₁. Likewise, the denominator is

    Σ_ij (k_i δ_ij − k_i k_j/2m) k_i k_j = Σ_i k_i³ − (1/2m) Σ_i k_i² Σ_j k_j² = S₃ − S₂²/S₁.

Dividing numerator by denominator and multiplying top and bottom by S₁ then gives the required answer.

Exercise 10.5:

a) The one on the right is roughly scale-free, as we can tell because the cumulative distribution is approximately a straight line on the logarithmic scales used in the figure.

b) The slope of the line in the right-hand figure is approximately 1.1, so the exponent is α = 2.1 (because the slope of the cumulative plot is one less than the exponent).

c) Rearranging Eq. (10.24) for P gives P = W^{(α−1)/(α−2)}, and setting W = 1/2 and α = 2.1 then gives P = 4.9 × 10⁻⁴, or about 0.05%. In other words, less than 1/20 of a percent of the best connected nodes have a half of all the edge ends.

Exercise 10.6:

a) The average degree is (n_m − 1)p_m = A(n_m − 1)^{1−β}.

b) The probability that two of your neighbors are connected within your group is just the probability that any two nodes are connected, which is p_m = A(n_m − 1)^{−β}.

c) Eliminating n_m between the previous two expressions gives the required answer.

d) For the local clustering to fall off as ⟨k⟩^{−3/4} we need β/(1 − β) = 3/4, or β = 3/7.

11 Random graphs

Exercise 11.1:

a) The probability of any particular set of three nodes forming a triangle is p³, and there are (n choose 3) possible such sets. Hence the expected number of triangles in the network is

    (n choose 3) p³ = (1/6) n(n − 1)(n − 2) c³/(n − 1)³ ≃ (1/6) c³,

where the approximate equality becomes exact in the limit of large n. Note that the appearance of triangles in different positions is not independent, since some triangles share edges, but this makes no difference to the result: the expected number of triangles is equal to the expected number in each position times the number of positions, regardless of whether the triangles are independent (because the average of a sum is equal to a sum of averages).

b) Similarly, the probability of a connected triple in any particular position is p² and the number of possible positions is the number of ways of choosing the central node of the triple then choosing two others, which is n × (n − 1 choose 2). Thus the expected number of connected triples is

    (1/2) n(n − 1)(n − 2) c²/(n − 1)² ≃ (1/2) nc².

c) Following Eq. (7.28), the clustering coefficient is

    C = [3 × (1/6)c³] / [(1/2) nc²] = c/n.
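The arithmetic in Exercise 10.5(c) takes only a couple of lines to check. As a quick sketch (not part of the manual), evaluating P = W^{(α−1)/(α−2)} for the values used above gives an exponent of (2.1 − 1)/(2.1 − 2) = 11 and hence P = 2^{−11}:

```python
# Check of Exercise 10.5(c): fraction P of the highest-degree nodes
# holding a fraction W of the edge ends, for power-law exponent alpha.
alpha = 2.1
W = 0.5

P = W ** ((alpha - 1) / (alpha - 2))   # Eq. (10.24) rearranged for P
print(P)                               # about 4.9e-4, i.e. roughly 0.05%
```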

Exercise 11.2:

a) 1 − S is the probability, averaged over all nodes, that a node does not belong to the giant component. For a node specifically of degree k to not belong to the giant component all of its k neighbors must not belong to the giant component, which happens with probability (1 − S)^k.

b) By Bayes' rule, the probability P(k | not in GC) of a node having degree k given that it is not in the giant component is related to the probability P(not in GC | k) that it is not in the giant component given that it has degree k thus:

    P(k | not in GC) = P(not in GC | k) P(k)/P(not in GC)
                     = (1 − S)^k [e^{−c} c^k/k!]/(1 − S)
                     = e^{−c} c^k (1 − S)^{k−1}/k!.

Exercise 11.3: Here is an example program to solve this problem, written in Python:

    from math import log
    from numpy import empty,zeros
    from random import randrange

    n = 1000000            # Number of nodes
    c = 2*log(2)           # Mean degree
    m = int(n*c/2)         # Number of edges
    edge = empty(n,set)    # Adjacency list
    for i in range(n):
        edge[i] = set()

    # Place the edges
    for k in range(m):
        i = randrange(n)
        j = randrange(n)
        while (i==j) or (i in edge[j]):
            i = randrange(n)
            j = randrange(n)
        edge[i].add(j)
        edge[j].add(i)

    # Create queue and set up breadth-first search
    q = empty(n,int)       # Queue array
    d = zeros(n,int)       # Component labels
    c = 0                  # Number of components
    maxs = 0               # Largest component

    for v in range(n):
        if d[v]==0:
            q[0] = v       # First node in queue
            pin = 1        # Write pointer
            pout = 0       # Read pointer
            c += 1
            d[v] = c       # Label node v

            # Main loop
            while pin>pout:

                # Pull a node off the queue
                i = q[pout]
                pout += 1

                # Check its neighbors
                for j in edge[i]:
                    if d[j]==0:
                        d[j] = c
                        q[pin] = j
                        pin += 1

            # Check if this is the largest component
            if pin>maxs: maxs = pin

    print("Largest component has size",maxs)

On a typical run the program prints

    Largest component has size 500644

In other words it has found a value of

    S = 500 644/1 000 000 = 0.500644.

The true value is S = 1/2, so we are off by less than 0.1%.

Exercise 11.4:

a) The average degree is given by

    c = −ln(1 − S)/S = −ln(1/2)/(1/2) = 2 ln 2 = 1.38...

b)
    p₅ = e^{−c} c⁵/5! = 0.0107...

or about 1%.

c) It is not a member of the giant component if and only if none of its five neighbors are, which happens with probability (1/2)⁵ = 1/32. Thus it is a member of the giant component with probability 1 − 1/32 = 31/32 = 0.96875, or about 97%.

d) We can use Bayes' rule thus:

    P(k | in g.c.) = P(in g.c. | k) P(k)/P(in g.c.) = (31/32) × [e^{−c} c⁵/5!] × 2 = 0.0207...

or about 2%, twice as high as the fraction in the network as a whole.

Exercise 11.5:

a) The probability of having no edges to the giant component is simply equal to the probability of not being in the giant component, which is 1 − S = e^{−cS} by Eq. (11.16). Alternatively, if we want a more elaborate proof, the probability of being connected to the giant component via a particular other node is the probability p of having an

edge to that node times the probability S that the node is itself in the giant component for a total probability of pS. Then the probability of not being connected via that node is 1 − pS, and the probability of not being connected via any other node is (1 − pS)^{n−1}. Putting p = c/(n − 1) and taking the limit of large n then gives the required result.

b) There are n − 1 ways to choose the one node, and probability pS that we have a connection to that node and that the node is itself in the giant component. We must not be connected via any of the other nodes, which happens with probability e^{−cS} as in part (a), so the total probability is (n − 1)pS e^{−cS} = cS e^{−cS}.

c) The probability of not being in the giant bicomponent is the probability of having exactly zero or one connections to it, which gives 1 − T = e^{−cS} + cS e^{−cS}, which in turn gives the required result. (Interesting question: Why does having two connections to the giant component mean that you are in the giant bicomponent?)

d) The probability S satisfies S = 1 − e^{−cS}. We can use this result to eliminate cS in favor of −ln(1 − S) and hence show that T = S + (1 − S) ln(1 − S). Since the final term (1 − S) ln(1 − S) is negative this makes T < S, except when S is either 0 or 1, in which case the final term vanishes and T = S.

Exercise 11.6:

a) All edges are independent in a random graph, which is still true after we remove the giant component. Moreover, while edge probabilities in the small components are lower than edge probabilities in the network as a whole, the nodes are indistinguishable, so all edge probabilities are the same. Hence the small components form a random graph, but one with no giant component, meaning that the average degree is less than 1.

b) Part (a) assumes that if there is no giant component then the average degree is less than 1, but this doesn't necessarily imply that if the average degree is greater than 1 there must be a giant component. That the latter is true we can however see by setting (1 − S)c < 1 and rearranging to get S > 1 − 1/c, which implies that S > 0 if c > 1.

Exercise 11.7:

a) There are n(n − 1) ordered pairs of nodes and a fraction p of them are joined by directed edges, so there are m = n(n − 1)p edges on average. Then the average degree is

    c = m/n = (n − 1)p.

b) If we keep all m edges but discard their directions then the average (undirected) degree is 2m/n = 2(n − 1)p = 2c, where c in this expression is the directed degree from part (a). Setting the average degree to 2c in the standard equation for the giant component of an undirected random graph then gives the required result.

c) The probability of being joined to the giant strongly connected component via an outgoing edge to one particular node is equal to the probability p that there is an outgoing edge to that node times the probability S that the node is in the giant strongly connected component, for a total probability of pS. And the probability of not being joined to the giant strongly connected component via that node is one minus this, or 1 − pS. Then the total probability of not being joined to the giant strongly connected component via an outgoing edge to any of the n − 1 other nodes in the network is

    (1 − pS)^{n−1} = (1 − cS/(n − 1))^{n−1} ≃ e^{−cS},

where we have made use of p = c/(n − 1) from part (a), and the last equality becomes exact in the limit of large n. The probability of being joined to the giant strongly connected component via at least one outgoing edge is one minus this value, or 1 − e^{−cS} in the large-n limit. Similarly the probability of being joined to the giant strongly connected component via at least one ingoing edge is also 1 − e^{−cS}. And the probability of having both of these conditions satisfied at once, which is also equal to the probability S of being in the giant strongly connected component, is S = (1 − e^{−cS})².

Exercise 11.8:

a) There are n − i nodes with labels higher than i and probability p of an edge from each one, so the expected in-degree is (n − i)p. There are i − 1 nodes with labels lower than i so the expected out-degree is similarly (i − 1)p.

b) There are i nodes with labels i and lower and n − i with labels higher than i. Hence there are i(n − i) possible pairs and each is connected with probability p for an expected number of edges i(n − i)p.

c) The largest value occurs when i = n/2. The smallest values occur when i = 1 or i = n − 1, which both give (n − 1)p. (Or you could say the smallest values occur for i = 0 and i = n, which give zero.)

Exercise 11.9:

a) For each node there are (n − 1 choose 2) pairs of others with which it could form a triangle, and each triangle is present with probability c/(n − 1 choose 2), for an average of c triangles per node. Each triangle contributes two edges to the degree, so the average degree is 2c.

b) The probability p_t of having t triangles follows the binomial distribution

    p_t = ((n − 1 choose 2) choose t) p^t (1 − p)^{(n − 1 choose 2) − t} ≃ e^{−c} c^t/t!,

where the final equality is exact in the limit of large n. The degree is twice the number of triangles and hence t = k/2 and

    p_k = e^{−c} c^{k/2}/(k/2)!

so long as k is even. Odd values of k cannot occur, so p_k = 0 for k odd.
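The degree distribution just derived can be sanity-checked numerically: summing p_k over even k should give 1, and the mean degree should come out at 2c, as in part (a). A minimal sketch (the value c = 1.5 is an arbitrary choice, not from the exercise):

```python
from math import exp, factorial

# Check of Exercise 11.9(b): p_k = e^{-c} c^{k/2} / (k/2)! for even k.
c = 1.5

norm = 0.0
mean = 0.0
for t in range(100):            # t = k/2 = number of triangles
    p = exp(-c) * c**t / factorial(t)
    norm += p                   # accumulate total probability
    mean += 2 * t * p           # accumulate mean degree, k = 2t

print(norm, mean)               # should be 1 and 2c
```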

c) As shown above, there are on average c triangles around each node and hence nc is the total number of triangles in the network times three (since each triangle appears around three different nodes and gets counted three times). The number of connected triples around a node of degree k = 2t is (2t choose 2) = t(2t − 1) and there are np_t nodes with t triangles, with p_t as above. So the total number of connected triples is

    n e^{−c} Σ_{t=0}^∞ t(2t − 1) c^t/t! = n e^{−c} (2c² + c) e^c = nc(2c + 1).

(The sum is a standard one that can be found in tables, but it's also reasonably straightforward to do by hand if you know the right tricks.) From Eq. (7.28) the clustering coefficient is now equal to

    C = (number of triangles) × 3 / (number of connected triples) = nc/[nc(2c + 1)] = 1/(2c + 1).

d) Let u be the probability that a node is not in the giant component. If a node is not in the giant component then it must be that for each of the (n − 1 choose 2) distinct pairs of other nodes in the network either (a) that pair does not form a triangle with our node (probability 1 − p) or (b) the pair does form a triangle (probability p) but neither member of the pair is itself in the giant component (probability u²). Thus the analog of Eq. (11.12) for this model is

    u = (1 − p + pu²)^{(n − 1 choose 2)}.

Putting p = c/(n − 1 choose 2) and taking the limit of large n this becomes u = e^{−c(1−u²)}. Putting S = 1 − u we then find that S = 1 − e^{−cS(2−S)}.

e) Rearranging for c in terms of S we have

    c = −ln(1 − S)/[S(2 − S)],

and for S = 1/2 this gives c = (4/3) ln 2. Substituting into the expression for the clustering coefficient above then gives

    C = 1/[1 + (8/3) ln 2] = 0.351...

12 The configuration model

Exercise 12.1: Imagine matching pairs of stubs one by one. The number of ways of choosing the first pair from the 2m total stubs is (2m choose 2) = (1/2)(2m)(2m − 1). The number of ways of choosing the second pair from the remaining 2m − 2 stubs is (2m − 2 choose 2) = (1/2)(2m − 2)(2m − 3), and so forth. The number of ways of choosing all m pairs is thus

    [2m(2m − 1)/2] × [(2m − 2)(2m − 3)/2] × ... × [2 × 1/2] = (2m)!/2^m.

However, there are m! ways of choosing the same set of pairs in different orders, each of which gives rise to the same overall matching, so the total number of matchings is the above number divided by m!.

Note that the number of matchings is not the same as the number of networks: if we are only concerned with the topology and not with which particular stub matches with which, then the count is different. See the following exercise for a discussion of this issue.

Exercise 12.2: As discussed in the question, there is in general an overall factor of Π_i k_i! that comes from the permutation of the stubs at each node. However, if there is a multiedge between two nodes, then this factor gets modified. Consider, for instance, two nodes i, j with a double edge between them such that stub A on one node is connected to stub A on the other and stub B is connected to stub B. Now if we permute the two stubs on one of the nodes we will have A connected to B and B connected to A, but if we permute the stubs on both nodes simultaneously we still have A connected to A and B to B: the matching hasn't changed. Indeed, we can see that for a multiedge with multiplicity A_ij any permutation of the stubs at one end has no effect on the matching if we perform the same permutation at the other end. There are A_ij! permutations of the multiedge, which gives the factor of Π_{i<j} A_ij! in the question.

For self-edges the argument is similar. If we have (1/2)A_ii self-edges at node i, then any permutation of stubs at one end of each edge has no effect on the matching if we permute those at the other ends in the same way, for a factor of [(1/2)A_ii]!. But in addition, we can also swap the two stubs at opposite ends of the same self-edge and it will have no effect on the matching, which gives us a factor of 2^{A_ii/2}. So overall, we have

    2^{A_ii/2} [(1/2)A_ii]! = 2^{A_ii/2} (1 × 2 × ... × (1/2)A_ii) = 2 × 4 × ... × A_ii = A_ii!!

Putting everything together, we get the expression in the question.

Exercise 12.3:

a) To average over every edge, we simply sum over all pairs i, j for which A_ij = 1, giving

    ⟨x⟩_edge = Σ_ij A_ij x_i / Σ_ij A_ij = (1/2m) Σ_i k_i x_i,

where we have made use of Eqs. (6.12) and (6.13).
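The identity in part (a), equating the average over the 2m ends of edges with (1/2m) Σ_i k_i x_i, can be verified on any small example. The graph and values below are a made-up illustration, not taken from the exercise:

```python
# Check of Exercise 12.3(a) on a small undirected graph.
# Nodes 0..3, edges (0,1), (1,2), (1,3); x_i is an arbitrary node quantity.
edges = [(0, 1), (1, 2), (1, 3)]
x = [1.0, 2.0, 3.0, 4.0]

m = len(edges)
k = [0] * len(x)                 # node degrees
for i, j in edges:
    k[i] += 1
    k[j] += 1

# Direct average of x over the 2m ends of edges
direct = sum(x[i] + x[j] for i, j in edges) / (2 * m)

# The formula (1/2m) sum_i k_i x_i
formula = sum(k[i] * x[i] for i in range(len(x))) / (2 * m)

print(direct, formula)           # both give 14/6
```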

b) The difference is given by

    ⟨x⟩_edge − ⟨x⟩ = (1/2m) Σ_i k_i x_i − (1/n) Σ_i x_i
                   = (n/2m) [(1/n) Σ_i k_i x_i − ((1/n) Σ_i k_i)((1/n) Σ_i x_i)]
                   = [⟨kx⟩ − ⟨k⟩⟨x⟩]/(2m/n) = cov(k, x)/⟨k⟩.

Exercise 12.4:

a) The degree distribution is

    p_m = 1 if m = k, and 0 otherwise.

The generating functions are g0(z) = z^k and g1(z) = z^{k−1}.

b) The giant component has size S = 1 − g0(u) where u = g1(u). In the present case, where g1(z) = z^{k−1}, the latter equation gives u = u^{k−1}, which has solutions u = 0, 1 when k ≥ 3. The solution u = 1 is the trivial solution which is always present because g1(1) = 1. The solution u = 0 corresponds to a giant component with size S = 1 − g0(0) = 1, i.e., a giant component that fills the whole network. (Technically, it is possible for a node, by chance, not to belong to the giant component: for instance two nodes of degree three could be connected together by three parallel edges to form a component of size two. But our calculation tells us that the fraction of nodes in such small components tends to zero as the size of the network becomes large, so that as n → ∞ the chance of a randomly chosen node belonging to the giant component approaches 1.)

c) For k = 1 each node has only one neighbor, which also has degree 1, so all components consist of two nodes joined by a single edge. Hence we have n/2 small components of size 2 each.

d) For k = 2 the situation is more complicated. In this case the equation u = g1(u) says that u = u, which is trivially true but tells us nothing. However, we can see that for k = 2 all components must take the form of rings, whose size distribution we can calculate as follows.

Let π_s be the probability that a randomly chosen node belongs to a ring of size s. Consider starting at a random node, moving to one of its neighbors, and keeping on walking around the ring until we get back to the starting node. When we have visited r nodes in this manner there are n − r nodes left in the network that we have not visited, which have 2(n − r) stubs of edges attached to them. Along with the one unused stub attached to the starting node (the stub that we have not yet traversed) this leaves 2(n − r) + 1 stubs free, to any of which, with equal probability, the next step of the walk could take us. Only one of these stubs is the one connected to the starting node, so the probability that we loop back to the starting node on the next step is 1/[2(n − r) + 1] and the probability that we don't is one minus this. So the probability that we loop back on precisely the sth step is

    π_s = [1/(2(n − s) + 1)] Π_{r=1}^{s−1} [1 − 1/(2(n − r) + 1)].

Taking logs of both sides we get

    ln π_s = −ln[2(n − s) + 1] + Σ_{r=1}^{s−1} ln[1 − 1/(2(n − r) + 1)],

and, assuming n − s to be large, this gives

    ln π_s ≃ −ln[2(n − s)] − (1/2) Σ_{r=1}^{s−1} 1/(n − r)
           = −ln[2(n − s)] − (1/2) Σ_{r=n−s+1}^{n−1} 1/r
           = −ln[2(n − s)] − (1/2) [Σ_{r=1}^{n−1} 1/r − Σ_{r=1}^{n−s} 1/r]
           ≃ −ln[2(n − s)] − (1/2) [ln n − ln(n − s)],

where in the last line we have used Σ_{r=1}^{M} 1/r ≃ γ + ln M, where γ is Euler's constant. Taking exponentials of both sides again then gives

    π_s = 1/[2n √(1 − s/n)].

Another way of expressing this result is to say that the probability π(x) dx that the size s of the component lies between nx and n(x + dx) is

    π(x) dx = π_s × n dx = dx/[2√(1 − x)].

So the average size of the component to which a node belongs is

    ⟨s⟩ = ∫₀¹ nx π(x) dx = (1/2) n ∫₀¹ x/√(1 − x) dx = (2/3) n.

Thus the average component is a giant component in this network, but it doesn't fill the entire network, meaning there must be more than one giant component. One way to think about this unusual situation is that small components in a configuration model are trees, meaning they have no loops. But all components in this case are necessarily rings, which have loops, so we must not have any small components, only giant ones.

Exercise 12.5: In a network of this kind the generating functions g0 and g1 for the degree distribution and excess degree distribution take the form

    g0(z) = p₂z² + p₃z³ + ...,    g1(z) = q₁z + q₂z² + ...,

so that Eq. (12.30) takes the form u = q₁u + q₂u² + ..., which, in addition to the standard trivial solution at u = 1, has the non-trivial solution u = 0. Substituting this value into Eq. (12.27) then gives S = 1, implying that (in the limit n → ∞ at least) the giant component occupies 100% of the network.

Exercise 12.6: A common issue for those attempting this exercise is incorrect code for generating the network. Recall that generating a configuration model network involves choosing pairs of stubs at random and connecting them to form edges. A common error is instead to choose pairs of nodes at random and then connect them if they have any available stubs. This is not the same thing and will not give the correct answer.

Here is an example program in Python to perform the required calculation:

    from numpy import empty,zeros,ones,concatenate
    from random import random,shuffle

    n = 100000
    p1 = 0.6

    # Generate the degrees
    k = empty(n,int)
    for i in range(n):
        if random()<p1: k[i] = 1
        else: k[i] = 3

    # Create an empty adjacency list
    edge = empty(n,list)
    for i in range(n): edge[i] = []

    # Create the network
    stub = concatenate([ i*ones(k[i],int) \
                         for i in range(n) ])
    shuffle(stub)
    for e in range(0,len(stub),2):
        i,j = stub[e:e+2]
        edge[i].append(j)
        edge[j].append(i)

    # Perform repeated breadth-first searches to
    # find components
    q = empty(n,int)
    d = zeros(n,int)   # Component labels
    c = 0              # Components found so far
    maxs = 0           # Largest component

    for v in range(n):
        if d[v]==0:
            q[0] = v   # First node
            pin = 1    # Write pointer
            pout = 0   # Read pointer
            c += 1     # Number of components
            d[v] = c   # Label node v

            while pin>pout:

                # Pull a node off the queue
                i = q[pout]
                pout += 1

                # Check its neighbors
                for j in edge[i]:
                    if d[j]==0:
                        d[j] = c
                        q[pin] = j
                        pin += 1

            # Check if this is the largest component
            if pin>maxs: maxs = pin

    print("Largest component has size",maxs)

On a typical run the program prints

    Largest component has size 64574

In other words it has found a value of S of

    S = 64 574/100 000 = 0.64574.

The true value from Eq. (12.41) is S = 0.65, so we are off by less than 1%.

Here is what the figure should look like:

[Figure: size of the largest component (vertical axis, 0 to 100 000) as a function of p₁ (horizontal axis, 0 to 1).]

From this plot it appears that the phase transition falls at around p₁ = 0.75 ± 0.05. Putting p₃ = 1 − p₁ in Eq. (12.40), we see that there should be a giant component whenever p₁ < 3/4, so the analytic and numerical results are in good agreement.

Exercise 12.7:

a) The generating function is given by

    g(z) = Σ_{k=0}^n p_k z^k = Σ_{k=0}^n (n choose k) (pz)^k (1 − p)^{n−k} = (pz + 1 − p)^n.
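The closed form just derived can be confirmed numerically by comparing the explicit series against (pz + 1 − p)^n at a sample point. This is a quick sketch, not part of the manual, and the values of n, p, and z are arbitrary:

```python
from math import comb

# Check of Exercise 12.7(a): the generating function of the binomial
# distribution equals (pz + 1 - p)^n.
n, p, z = 10, 0.3, 0.7

series = sum(comb(n, k) * p**k * (1 - p)**(n - k) * z**k
             for k in range(n + 1))
closed = (p * z + 1 - p) ** n

print(series, closed)   # the two values agree
```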

b) The moments are

    ⟨k⟩ = [dg/dz]_{z=1} = [np(pz + 1 − p)^{n−1}]_{z=1} = np,

    ⟨k²⟩ = [(z d/dz)² g(z)]_{z=1}
         = [npz(pz + 1 − p)^{n−1} + n(n − 1)p²z²(pz + 1 − p)^{n−2}]_{z=1}
         = pn + p²n(n − 1).

Hence the variance is

    σ² = ⟨k²⟩ − ⟨k⟩² = pn + p²n(n − 1) − (pn)² = p(1 − p)n.

c) As shown in Section 12.10.5, the generating function for the sum of two independent draws from the same distribution is the square of the generating function for a single draw (see Eq. (12.95)). In the present case, therefore, the generating function for the sum of two draws is

    [g(z)]² = (pz + 1 − p)^{2n},

which is also the generating function for the draw of a single number k from a binomial distribution with maximum 2n. Thus we can immediately write down the distribution of the sum: it is (2n choose k) p^k (1 − p)^{2n−k}, as hypothesized in the question.

Exercise 12.8:

a) If F_k is the kth Fibonacci number with k = 0, 1, 2, ..., then

    f(z) = Σ_{k=0}^∞ F_k z^k = 1 + Σ_{k=1}^∞ F_k z^k,

    f(z)/z = 1/z + Σ_{k=1}^∞ F_k z^{k−1} = 1/z + 1 + Σ_{k=1}^∞ F_{k+1} z^k,

    z f(z) = Σ_{k=0}^∞ F_k z^{k+1} = Σ_{k=1}^∞ F_{k−1} z^k.

Multiplying F_{k−1} + F_k = F_{k+1} by z^k and summing over k from 1 to ∞ we get

    Σ_{k=1}^∞ F_{k−1} z^k + Σ_{k=1}^∞ F_k z^k = Σ_{k=1}^∞ F_{k+1} z^k,

or

    z f(z) + f(z) − 1 = f(z)/z − 1/z − 1,

which rearranges to give

    f(z) = 1/(1 − z − z²).

As z is increased from zero this expression will at some point diverge. The divergence happens when the denominator reaches zero, i.e., when z² + z − 1 = 0, which has the solution z = (√5 − 1)/2 = 0.618...

b) The generating function is

    g(z) = Σ_{k=1}^∞ a_k z^k = z + Σ_{k=2}^∞ Σ_{j=1}^{k−1} a_j a_{k−j} z^k = z + Σ_{j=1}^∞ Σ_{k=j+1}^∞ a_j a_{k−j} z^k,

where we have reversed the order of the summations, being careful to keep the limits correct. Now making the substitution l = k − j, this becomes

    g(z) = z + Σ_{j=1}^∞ a_j z^j Σ_{l=1}^∞ a_l z^l = z + [g(z)]².

Thus g satisfies g² − g + z = 0, which has the solutions g(z) = (1 ± √(1 − 4z))/2. To determine which sign is correct we can, for instance, check the zeroth order term in the series expansion of g(z), which should be zero. Setting z = 0, we get a₀ = g(0) = (1 ± 1)/2, which tells us that we need to take the minus sign to get a correct expression. Thus

    g(z) = (1/2)[1 − √(1 − 4z)].

Exercise 12.9:

a) From Eq. (12.89) the mean-square size is given by

    ⟨s²⟩ = [(z d/dz)² h0(z)]_{z=1} = [dh0/dz]_{z=1} + [d²h0/dz²]_{z=1}.

And from Eq. (12.123)

    [dh0/dz]_{z=1} = g0(u) + g0'(u) h1'(1),

with u = h1(1) being the solution of u = g1(u) and the prime indicating a derivative. Similarly,

    [d²h0/dz²]_{z=1} = 2g0'(u) h1'(1) + g0''(u) [h1'(1)]² + g0'(u) h1''(1).

Putting these expressions into the formula for ⟨s²⟩, we get

    ⟨s²⟩ = g0(u) + 3g0'(u) h1'(1) + g0''(u) [h1'(1)]² + g0'(u) h1''(1).

It remains to calculate the values of h1'(1) and h1''(1), which we can do from Eq. (12.124). We find that

    h1'(1) = u/[1 − g1'(u)],

    h1''(1) = 2u g1'(u)/[1 − g1'(u)]² + u² g1''(u)/[1 − g1'(u)]³,

where we have again used u = g1(u). After this, the rest is just algebra.

b) For a Poisson distribution we have g0(z) = g1(z) = e^{c(z−1)} and when there is no giant component we have u = 1. Thus:

    g0(u) = g1(u) = 1,    g0'(u) = g1'(u) = c,    g0''(u) = g1''(u) = c²,

which means

    h1'(1) = 1/(1 − c),    h1''(1) = 2c/(1 − c)² + c²/(1 − c)³,

and

    ⟨s²⟩ = 1 + 3c/(1 − c) + 3c²/(1 − c)² + c³/(1 − c)³ = 1/(1 − c)³.

Exercise 12.10:

a)
    Σ_{k=0}^∞ p_k = (1/2) Σ_{k=0}^∞ (1/2)^k = (1/2) × 1/(1 − 1/2) = 1.

b)
    g0(z) = 1/(2 − z),    g1(z) = 1/(2 − z)².

c)
    c₁ = g0'(1) = 1,    c₂/c₁ = g1'(1) = 2  ⇒  c₂ = 2.

d) Yes, because c₂/c₁ > 1 (see Section 12.6).

e) Following Eq. (12.133) this probability is

    π₃ = (c₁/2!) [d/dz [g1(z)]³]_{z=0} = (1/2) c₁ [d/dz (2 − z)^{−6}]_{z=0}
       = (1/2) c₁ [6(2 − z)^{−7}]_{z=0} = 3/128.

Exercise 12.11:

a) For this degree distribution Eq. (12.30) takes the form

    u = (1 − a)²/(1 − au)².

Rearranging we then retrieve the given cubic.

b) The cubic can be factorized in the form

    (u − 1)[a²u² − a(2 − a)u + (1 − a)²] = 0,

and hence when u ≠ 1 we have a²u² − a(2 − a)u + (1 − a)² = 0.

c) The quadratic equation has solutions

    u = (1/2a)[2 − a ± √((2 − a)² − 4(1 − a)²)] = (1/a)[1 − a/2 ± √(a − (3/4)a²)].

However, if we choose the plus sign we get u > 1, which is not allowed since u is a probability, so we must have

    u = (1/a)[1 − a/2 − √(a − (3/4)a²)].

Then the size of the giant component, as a fraction of the size of the network, is

    S = 1 − g0(u) = 1 − (1 − a)/(1 − au) = 1 − (1 − a)/[a/2 + √(a − (3/4)a²)]
      = 3/2 − √(a^{−1} − 3/4).

d) To get S > 0 we need 3/2 > √(a^{−1} − 3/4), or 9/4 > a^{−1} − 3/4. Rearranging for a then gives the required result.

Exercise 12.12:

a) A node of degree k does not belong to the giant component if none of its neighbors do, which happens with probability u^k. Hence the probability that the node does belong to the giant component, given that it has degree k, is

    P(GC | k) = 1 − u^k.

b) Applying Bayes' rule:

    P(k | GC) = P(k) P(GC | k)/P(GC) = p_k (1 − u^k)/S,

where S is, as usual, the fraction of nodes in the giant component, which is given by Eq. (12.27).

c) The average degree in the giant component is

    Σ_k k P(k | GC) = (1/S) Σ_k k p_k (1 − u^k) = [Σ_k k p_k − cu Σ_k q_k u^k]/S
                    = [c − cu g1(u)]/S = c(1 − u²)/S,

where we have made use of the definition of the excess degree distribution q_k = (k + 1)p_{k+1}/c and Σ_k q_k u^k = g1(u) = u.

d) The inequality is trivially true because, given that u lies between zero and one (because it is a probability), every term in the sum is non-positive, and hence so is the entire sum.

Expanding the sum in the inequality, we get

    Σ_{jk} p_j p_k (j u^j + k u^k − j u^k − k u^j) = 2 Σ_k k p_k u^k − 2 Σ_j j p_j Σ_k p_k u^k ≤ 0,

which can be written in terms of the generating functions as 2cu g₁(u) − 2c g₀(u) ≤ 0, or equivalently u² ≤ 1 − S, where we have used Eq. (12.27) again and Eq. (12.30). Rearranging, we find that the probability S of any node belonging to the giant component satisfies

    S ≤ 1 − u² = P(GC|k = 2),

where we have used our result from part (a) above.
e) The average degree in the giant component is given by the expression in part (c) to be c(1 − u²)/S. But from part (d) we see that (1 − u²)/S ≥ 1, so

    c(1 − u²)/S ≥ c,

and hence the average degree in the giant component is greater than or equal to the average degree in the network as a whole.

Exercise 12.13: For a network with Poisson degree distribution, g₁(z) = e^{c(z−1)} and [g₁(z)]^s = e^{cs(z−1)}. Then

    dⁿ/dzⁿ [g₁(z)]^s = (cs)ⁿ e^{cs(z−1)},

and Eq. (12.133) gives

    π_s = [⟨k⟩/(s − 1)!] (cs)^{s−2} e^{−cs} = (cs)^{s−1} e^{−cs}/s!,

where we have made use of ⟨k⟩ = c.

Exercise 12.14: The expected degree of node i is the sum of the expected number of edges between i and every node in the network:

    c_i = Σ_j p_ij = Σ_j K f_i f_j = A f_i,

where A = K Σ_j f_j. Thus f_i = c_i/A and p_ij = K f_i f_j = K c_i c_j/A². Substituting back into the above equation again, we then get

    c_i = Σ_j p_ij = (K c_i/A²) Σ_j c_j = K c_i (2m/A²),

where m = ½ Σ_j c_j is the expected number of edges in the network. Rearranging, we find that A² = 2mK and hence p_ij = c_i c_j/2m.

Exercise 12.15:
a) Equation (12.54) tells us that

    ⟨s⟩ = 1 + g₀′(1)/[1 − g₁′(1)],

and substituting in the values of g₀′ and g₁′ then gives the required result.
b) The probability of belonging to a component of size 1 is just π₁ = p₀. There is only one way to belong to a component of size 2: you must have degree 1, which happens with probability p₁, and the node at the end of your single edge must also have degree 1. The probability that the node at the end of an edge has degree 1 is the same as the probability that it has excess degree 0, which is q₀. Thus

    π₂ = p₁ q₀ = p₁ · p₁/⟨k⟩ = p₁²/(p₁ + 2p₂ + 3p₃).

There are two different ways to be in a component of size 3: either you have degree 2 and each of your neighbors has excess degree 0, or you have degree 1, your single neighbor has excess degree 1, and their other neighbor has excess degree 0, so the total probability is

    π₃ = p₂ q₀² + p₁ q₁ q₀ = 3p₁² p₂/(p₁ + 2p₂ + 3p₃)².

One can also derive these expressions from the general formula, Eq. (12.133).

Exercise 12.16:
a) We require that Σ_k p_k = 1, which implies that C Σ_k k a^k = 1, or C = (1 − a)²/a.
b) The mean degree is given by

    ⟨k⟩ = Σ_k k p_k = C Σ_k k² a^k = [(1 − a)²/a] × (a + a²)/(1 − a)³ = (1 + a)/(1 − a).

c) The mean-square degree is

    ⟨k²⟩ = Σ_k k² p_k = C Σ_k k³ a^k = [(1 − a)²/a] × (a + 4a² + a³)/(1 − a)⁴ = (1 + 4a + a²)/(1 − a)².

d) From Eq. (12.24) we know that the phase transition occurs at the point where ⟨k²⟩ − 2⟨k⟩ = 0. Using the expressions above, this gives

    (1 + 4a + a²)/(1 − a)² − 2(1 + a)/(1 − a) = 0,

or 3a² + 4a − 1 = 0, which has solutions a = ⅙(−4 ± √28). Only
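The location of the phase transition can be confirmed numerically by summing the distribution directly. A minimal check (truncating the sums, which is safe since the terms decay geometrically):

```python
# Check of Exercise 12.16: for p_k = C k a^k with C = (1-a)^2/a, verify the
# closed forms for <k> and <k^2> and that <k^2> - 2<k> = 0 (Eq. 12.24) at
# a = (sqrt(7) - 2)/3.
from math import sqrt

a = (sqrt(7) - 2) / 3
C = (1 - a)**2 / a

kmax = 2000  # truncation point; a < 1 so the tail is negligible
norm = sum(C * k * a**k for k in range(1, kmax))
mean_k = sum(k * C * k * a**k for k in range(1, kmax))
mean_k2 = sum(k**2 * C * k * a**k for k in range(1, kmax))
print(norm, mean_k, mean_k2, mean_k2 - 2*mean_k)
```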

17
Networks (2nd Edition)

Exercise 12.17:
a) Using the "pure" power-law form of Eq. (12.57), the fundamental generating functions are given in (12.63) and (12.64) to be

    g₀(u) = [1/ζ(α)] Σ_{k=1}^∞ k^{−α} u^k

and

    g₁(u) = [1/ζ(α − 1)] Σ_{k=1}^∞ k^{−α+1} u^{k−1}.

The probability u then satisfies

    u = [1/(u ζ(α − 1))] Σ_{k=1}^∞ k^{−(α−1)} u^k = [1/ζ(α − 1)] Σ_{k=0}^∞ (k + 1)^{−(α−1)} u^k,

and the fraction of nodes in the giant component—i.e., nodes that are functional—is S = 1 − g₀(u).
b) A rough numerical solution for u from the equation above gives u = 0.470. Then S = 0.614. So the model suggests that about 61% of the Internet should be working at any one time. Actually about 97% of the Internet is working at a time. Why the difference? Probably because people work very hard to be connected to the giant component—the Internet is not much use unless you are connected. So the giant component is likely to have size close to S = 1.

Exercise 12.18:
a) Consider the node you arrive at by following an edge (in its forward direction). If the in- and out-degrees are uncorrelated then the average number of edges leaving that node is simply equal to the average number leaving any node, which is c. If the number of reachable nodes is to grow on average the further you go, this number should be greater than 1. Hence a giant out-component exists if (and only if) c > 1.
b) The argument for the giant in-component is identical, except that you follow edges backwards.
c) If there is a giant in- or out-component then there is necessarily a giant weakly connected component, since the largest weakly connected component is trivially at least as large as the largest in- or out-component. The giant strongly connected component is only a little more tricky. As discussed in Section 6.12.1, a strongly connected component is equal to the intersection of the in- and out-components of any of its member nodes. If we have giant in- and out-components, and if all nodes are equally likely to belong to them (since the nodes in a random graph are indistinguishable), then they will have an intersection that fills a non-vanishing fraction of the network. This intersection is the giant strongly connected component.
d) Let u be the probability that when following a randomly chosen edge we reach only a vanishing fraction of the whole network, i.e., the probability that that edge does not lead to a node in the giant out-component. This is true if, from the node at that edge's end, none of the other nodes that are reachable are themselves in the giant out-component, which occurs with probability u^k, where k is the node's out-degree.
The probability of landing at a node with in-degree j is P(j) = j p_j^in/⟨j⟩, where p_j^in is the in-degree distribution. And the probability of that node having out-degree k is

    Σ_j P(k|j) P(j) = (1/⟨j⟩) Σ_j j P(k|j) p_j^in = (1/⟨j⟩) Σ_j j p_{jk}.

Then the probability that the node we reach is not in the giant out-component is equal to u^k averaged over this distribution:

    u = (1/⟨j⟩) Σ_{jk} j p_{jk} u^k.

This has a trivial solution at u = 1, but it can also have a non-trivial solution u < 1 if the slope of the right-hand side is greater than 1 at u = 1. Performing the derivative, this gives us the condition for the existence of a giant out-component: ⟨jk⟩ − ⟨j⟩ > 0. For uncorrelated degrees, where ⟨jk⟩ = ⟨j⟩⟨k⟩ = c², this gives c > 1 as in part (a). For correlated degrees, we can write the covariance of in- and out-degree as ρ = ⟨jk⟩ − ⟨j⟩⟨k⟩ so that ⟨jk⟩ = ρ + c², and hence the condition for the giant component is

    ⟨jk⟩ − ⟨j⟩ = ρ + c² − c > 0.

e) Indeed we would, and of course it does.

Exercise 12.19:
a) In the ordinary configuration model the average excess degree is (⟨k²⟩ − ⟨k⟩)/⟨k⟩ (see Eq. (12.18)). In the bipartite version, there are two average excess degrees, one for nodes of type A and one for nodes of type B. For a giant component to exist, the product of these two should be greater than 1, so that the number of reachable nodes increases as we go further away from any starting point. Thus the condition for a giant component is

    [(⟨k²⟩_A − ⟨k⟩_A)/⟨k⟩_A] × [(⟨k²⟩_B − ⟨k⟩_B)/⟨k⟩_B] > 1,

or equivalently

    ⟨k²⟩_A ⟨k²⟩_B − ⟨k²⟩_A ⟨k⟩_B − ⟨k²⟩_B ⟨k⟩_A > 0.

b) A node of type A at the end of an edge is not in the giant component if all of its neighbors (which are of type B) are not in the giant component. If the node has excess degree k this happens with probability u_B^k. Averaging over the excess degree distribution then gives the required result. The formula for u_B is analogous.
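The rough numerical solution quoted in Exercise 12.17(b) can be reproduced by fixed-point iteration. This sketch assumes the exponent α = 2.5 (an assumed value, chosen because it is consistent with the quoted figures u = 0.470 and S = 0.614) and hard-codes the required zeta-function values:

```python
# Numerical check of Exercise 12.17(b): iterate
#   u = (1/zeta(alpha-1)) * sum_{k>=0} (k+1)^{-(alpha-1)} u^k
# for a pure power law with alpha = 2.5 (assumed), then compute
#   S = 1 - g0(u),  g0(u) = (1/zeta(alpha)) * sum_{k>=1} k^{-alpha} u^k.
zeta_15 = 2.6123753486854883   # zeta(1.5)
zeta_25 = 1.3414872572509171   # zeta(2.5)
K = 400                        # series truncation; u < 1 so the tail is tiny

u = 0.5
for _ in range(200):
    u = sum((k + 1)**-1.5 * u**k for k in range(K)) / zeta_15

g0_u = sum(k**-2.5 * u**k for k in range(1, K)) / zeta_25
S = 1 - g0_u
print(round(u, 3), round(S, 3))
```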


c) A node of type A is not in the giant component if none of its k neighbors (which are of type B) are in the giant component, which happens with probability u_B^k. Averaging over the degree distribution of nodes of type A, we then find that S_A = 1 − g₀^A(u_B), where g₀^A(z) is the generating function for the degree distribution of type-A nodes.

13 Models of network formation

Exercise 13.1:
a) Just before the jth node is added there are j − 1 nodes in total in the network, which means the rescaled time τ for the ith node, as defined in Eq. (13.42), is τ = i/(j − 1), and the expected in-degree of the ith node is given by Eq. (13.58) to be

    ⟨q_i⟩ = a(τ^{−c/(c+a)} − 1) = a[(i/(j − 1))^{−c/(c+a)} − 1].

b) If the in-degree of the ith node were exactly q_i when the jth node was added, then the probability that a particular one of the c edges emerging from j connects to i would be given by Eq. (13.1) to be (q_i + a)/[(j − 1)(c + a)], and the total probability of an edge from j to i would be c times this. The expected value of this probability is thus

    P_ij = c(⟨q_i⟩ + a)/[(j − 1)(c + a)] = c{a[(i/(j − 1))^{−c/(c+a)} − 1] + a}/[(j − 1)(c + a)] = [ca/(c + a)] i^{−c/(c+a)} (j − 1)^{−a/(c+a)}.

Note that this probability remains the same as the network grows to any larger size n > j, because no edges are ever added or removed between i and j at any subsequent time.

Exercise 13.2:
a) The average number of citations received is the same as the average number made, which is 30. In network terms, the average in-degree and out-degree of a directed network are equal (see Section 6.10.2).
b) The parameters for Price's model in this case are c = a = 30. (The values of a and c must be equal to give α = 3.) Using these values and Eq. (13.9) we find the probability of having zero citations to be

    p₀ = 1/16 = 0.0625,

or a little over 6%.
c) Using Eq. (13.34) we find the fraction of papers with 100 or more citations to be

    P₁₀₁ = B(130, 2)/B(30, 2) = [Γ(130)Γ(2)/Γ(132)] × [Γ(32)/(Γ(30)Γ(2))] = (31 × 30)/(131 × 130) = 0.0546…,

or about 5.5%.
d) From Eq. (13.49) the probability of having zero citations is

    π₀(τ) = τ^{ca/(c+a)} = τ¹⁵.

For the 100th paper we have τ = 100/10000 = 10⁻², which gives a probability of 10⁻³⁰ that the paper has no citations. On the other hand, for the 100th-to-last paper τ = 0.99 and the probability of having no citations is 0.99¹⁵ = 0.860, or about 86%.

Exercise 13.3:
a) Equation (13.58) tells us that the expected in-degree of a node is a function only of the rescaled time τ and the model parameters. The current rescaled time is 10/n for the tenth node and 1/n for the first node. The tenth node will thus have as many citations as the first currently has when its rescaled time is equal to 1/n, which will happen when the network has size 10n. If time is measured in nodes added, the interval of time before this happens is 10n − n = 9n, or nine times the current number of nodes.
b) The average is given by

    [1/(τ₂ − τ₁)] ∫_{τ₁}^{τ₂} γ(τ) dτ = [a/(τ₂ − τ₁)] ∫_{τ₁}^{τ₂} (τ^{−c/(c+a)} − 1) dτ = (c + a)(τ₂^{a/(c+a)} − τ₁^{a/(c+a)})/(τ₂ − τ₁) − a.

c) For papers in the first 10% we have τ₁ = 0 and τ₂ = 0.1. Plugging these figures and the values for a and c into the above expression gives an average number of citations of 152.7. For the last 10% we have τ₁ = 0.9 and τ₂ = 1.0, which gives an average number of citations of 0.2, meaning that most papers in this interval have no citations. These numbers are of comparable magnitude to published figures for real citations.

Exercise 13.4:
a) The equations are

    p_q = ½[q p_{q−1} − (q + 1) p_q]

for q > 0, and

    p₀ = 1 − ½ p₀.

b) Multiplying by z^q and summing over q = 0 … ∞, we get

    g₀(z) = Σ_{q=0}^∞ p_q z^q = 1 + ½ Σ_{q=1}^∞ q p_{q−1} z^q − ½ Σ_{q=0}^∞ (q + 1) p_q z^q
          = 1 + ½ z Σ_{q=0}^∞ (q + 1) p_q z^q − ½ Σ_{q=0}^∞ (q + 1) p_q z^q
          = 1 + ½(z − 1)[Σ_{q=0}^∞ q p_q z^q + Σ_{q=0}^∞ p_q z^q]
          = 1 + ½(z − 1)[z g₀′(z) + g₀(z)].
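The numbers quoted in Exercise 13.2 can be checked directly. A sketch using log-gamma arithmetic (to avoid overflow in the gamma functions):

```python
# Check of Exercise 13.2: with c = a = 30, the probability of zero citations
# is B(a, 3)/B(a, 2) = 1/16, and the fraction with 100 or more citations is
# B(100 + a, 2)/B(a, 2) = (31*30)/(131*130) (Eq. 13.34).
from math import lgamma, exp

def log_beta(x, y):
    # log of the Euler beta function B(x, y) = Gamma(x)Gamma(y)/Gamma(x+y)
    return lgamma(x) + lgamma(y) - lgamma(x + y)

a = 30.0
p0 = exp(log_beta(a, 3) - log_beta(a, 2))          # probability of 0 citations
P101 = exp(log_beta(100 + a, 2) - log_beta(a, 2))  # fraction with >= 100
print(p0, P101)
```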


c) Differentiating h(z), we get

    dh/dz = [(1 − z)²(z³ g₀′ + 3z² g₀) + 2(1 − z) z³ g₀]/(1 − z)⁴
          = [z²/(1 − z)³][(1 − z)(z g₀′ + 3g₀) + 2z g₀]
          = 2z²/(1 − z)³,

where in the last line we have made use of the equation for the generating function above to eliminate one factor of z g₀′ + g₀.
d) The differential equation for h(z) is quite easy to solve. We write

    2z²/(1 − z)³ = 2/(1 − z)³ − 4/(1 − z)² + 2/(1 − z)

and then integrate to get

    h(z) = 1/(1 − z)² − 4/(1 − z) − 2 ln(1 − z) + constant.

We note that g₀(0) = p₀, so h(0) = 0, which means that the constant must be equal to 3. Then

    g₀(z) = [(1 − z)²/z³] h(z) = [3z² − 2z − 2(1 − z)² ln(1 − z)]/z³.

Taking the limit z → 1 and recalling that lim_{ε→0} ε ln ε = 0 gives us g₀(1) = 1 as expected. The limit z → 0 is a little trickier. Perhaps the simplest way to derive it is to expand h(z) about zero, which to leading order gives h(z) = ⅔z³ + O(z⁴). Then g₀(z) = ⅔(1 − z)² + (1 − z)² O(z) for small z, which gives g₀(0) = ⅔. You can confirm using the equation for p₀ in part (a) that this is indeed the correct value.
e) The mean in-degree is equal to g₀′(1). Writing g₀(z) = N(z)/z³ with N(z) = 3z² − 2z − 2(1 − z)² ln(1 − z), differentiation gives

    dg₀/dz = [z N′(z) − 3N(z)]/z⁴,    N′(z) = 4z + 4(1 − z) ln(1 − z).

Taking the limit z → 1, this gives a mean in-degree of 4 − 3 = 1, which is the correct answer: the mean out-degree of the network is c = 1, and in a directed network the mean in- and out-degrees are equal.

Exercise 13.5:
a) The total number of edges is nc, so the total number of ends of edges is 2nc and the average degree is ⟨k⟩ = 2nc/n = 2c.
b) The probability of connecting to a node of degree k is

    c × (k + a)/Σ_i(k_i + a) × n p_k(n) = c (k + a)/(2m + na) × n p_k(n) = [c/(2c + a)](k + a) p_k(n).

So the master equation is

    (n + 1) p_k(n + 1) = n p_k(n) + [c/(2c + a)][(k − 1 + a) p_{k−1}(n) − (k + a) p_k(n)].

Taking the limit of large n this becomes

    p_k = [c/(2c + a)][(k − 1 + a) p_{k−1} − (k + a) p_k].

The only exception is for nodes of degree exactly c, for which the equation reads

    p_c = 1 − c[(c + a)/(2c + a)] p_c.

c) Rearranging the last equation gives

    p_c = (2c + a)/[2c + a + c(c + a)].

Exercise 13.6:
a) The probability of a new edge attaching to a node of in-degree q when there are n nodes total in the network is 1/n, the number of such nodes is n p_q(n), and the number of new edges per node added is c, for a total attachment probability of

    c × (1/n) × n p_q(n) = c p_q(n).

Hence the number of nodes of in-degree q at the (n + 1)th step of the growth process is

    (n + 1) p_q(n + 1) = n p_q(n) + c p_{q−1} − c p_q,

for q ≥ 1 and

    (n + 1) p₀(n + 1) = n p₀(n) + 1 − c p₀(n),

for q = 0. Taking the limit n → ∞ and writing p_q = p_q(∞), we have master equations

    p_q = c p_{q−1} − c p_q,    p₀ = 1 − c p₀.

b) Rearranging the second of these equations gives us p₀ = 1/(c + 1) and the first gives us

    p_q = [c/(c + 1)] p_{q−1},

which implies that

    p_q = [c/(c + 1)]^q p₀ = c^q/(c + 1)^{q+1}.
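The closed-form solution of Exercise 13.6(b) can be verified against its master equations. A small deterministic check (the choice c = 3 is arbitrary):

```python
# Check of Exercise 13.6(b): the distribution p_q = c^q/(c+1)^(q+1)
# satisfies p_q = c(p_{q-1} - p_q) for q >= 1 and p_0 = 1 - c p_0,
# and is correctly normalized.
c = 3  # hypothetical number of out-edges per added node

def p(q):
    return c**q / (c + 1)**(q + 1)

err0 = abs(p(0) - (1 - c * p(0)))
err = max(abs(p(q) - c * (p(q - 1) - p(q))) for q in range(1, 200))
total = sum(p(q) for q in range(2000))
print(err0, err, total)
```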


Exercise 13.7:
a) The correctly normalized probability of a specific end of an edge attaching to node i is

    k_i/Σ_j k_j = k_i/2m,

and two ends are added with each edge, so the total probability is twice this.
b) We lose a node of degree k when it gains an extra edge and gain one when a node of degree k − 1 gains an edge. Thus the master equation is

    n p_k(m + 1) = n p_k(m) + [(k − 1)/m] n p_{k−1}(m) − (k/m) n p_k(m),

and a factor of n cancels throughout. For k = 0 the same argument applies except that there is no way to gain new nodes of degree zero. We can represent this situation using the same master equation but with the convention that p_{−1} = 0 for all m. Thus, with this convention, we have

    p_k(m + 1) = p_k(m) + [(k − 1)/m] p_{k−1}(m) − (k/m) p_k(m),

for all m.
c) Given that the average degree is c = 2m/n when there are m edges in the network, then when there are m + 1 edges the average degree is

    2(m + 1)/n = 2m/n + 2/n = c + 2/n.

Putting m = ½nc and writing p_k as a function of average degree, the master equation then becomes

    p_k(c + 2/n) = p_k(c) + [(k − 1)/(½nc)] p_{k−1}(c) − [k/(½nc)] p_k(c),

or

    c [p_k(c + 2/n) − p_k(c)]/(2/n) = (k − 1) p_{k−1}(c) − k p_k(c).

Now we take the limit n → ∞ and recover the required differential equation.
d) Multiplying both sides of the differential equation for p_k(c) by z^k and summing over all k we get

    c ∂g/∂c = Σ_{k=0}^∞ (k − 1) p_{k−1} z^k − Σ_{k=0}^∞ k p_k z^k = z(z − 1) Σ_{k=0}^∞ k p_k z^{k−1} = z(z − 1) ∂g/∂z,

where we have made use of p_{−1} = 0 again.
e) Substituting g(c, z) = f(c − c/z) into the left- and right-hand sides of the equation separately gives

    c ∂g/∂c = c(1 − 1/z) f′(c − c/z),
    z(z − 1) ∂g/∂z = z(z − 1)(c/z²) f′(c − c/z),

which are trivially equal.
f) Setting g(1, z) = z gives us f(1 − 1/z) = z, and making the substitution x = 1 − 1/z then gives f(x) = 1/(1 − x). Thus the general solution for g is

    g(c, z) = 1/(1 − c + c/z).

g) Rewriting the generating function in the form

    g(c, z) = (z/c) × 1/[1 − (1 − 1/c)z],

we can expand the second factor as a geometric series in z to give

    g(c, z) = (z/c) Σ_{m=0}^∞ (1 − 1/c)^m z^m.

Now we can easily read off the coefficients of the expansion, which give us our degree distribution. Since there is no term in z⁰, there are no nodes of degree zero, and for degree k ≥ 1 we have

    p_k(c) = (1/c)(1 − 1/c)^{k−1} = (c − 1)^{k−1}/c^k.

Exercise 13.8:
a) The correctly normalized probability that a particular new edge attaches to a previous node i is

    (q_i + a_i)/Σ_i(q_i + a_i) = (q_i + a_i)/(nc + nā),

where we have made use of Σ_i q_i = nc and assumed that n⁻¹ Σ_i a_i is a good approximation to the mean ā, which is valid when n is large. Given that c new edges are added for each new node, the total probability that node i gains a new edge upon the addition of a node is c times the expression above.
b) Let p_q(a, n) da be the fraction of nodes with in-degree q and a values in the range a to a + da when the network has n nodes, and the total number of such nodes is n times this. Then the expected number of new incoming edges acquired by such nodes when a single new node is added to the network is

    n p_q(a, n) da × c × (q + a)/[n(c + ā)] = [c/(c + ā)](q + a) p_q(a, n) da.

Now we can write a master equation, the equivalent of Eq. (13.5), thus:

    (n + 1) p_q(a, n + 1) = n p_q(a, n) + [c/(c + ā)][(q − 1 + a) p_{q−1}(a, n) − (q + a) p_q(a, n)].
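The solution of Exercise 13.7(f,g) can be verified numerically by comparing the closed-form generating function with the series built from the coefficients p_k(c). A minimal check at sample values (the choices c = 3, z = 0.5 are arbitrary, within the series' radius of convergence):

```python
# Check of Exercise 13.7(f,g): g(c,z) = 1/(1 - c + c/z) should equal the
# power series with coefficients p_k = (c-1)^(k-1)/c^k for k >= 1,
# and those coefficients should sum to 1.
c = 3.0
z = 0.5

g_closed = 1.0 / (1.0 - c + c / z)
g_series = sum((c - 1)**(k - 1) / c**k * z**k for k in range(1, 400))
total = sum((c - 1)**(k - 1) / c**k for k in range(1, 400))
print(g_closed, g_series, total)
```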


This equation is correct for q > 0. For q = 0 we have

    (n + 1) p₀(a, n + 1) = n p₀(a, n) + ρ(a) − [ca/(c + ā)] p₀(a, n),

where ρ(a) is the probability distribution from which a is drawn.
In the limit of large n, using the shorthand p_q(a) = p_q(a, ∞), these equations become

    p_q(a) = [c/(c + ā)][(q − 1 + a) p_{q−1}(a) − (q + a) p_q(a)],

for q > 0 and

    p₀(a) = ρ(a) − [ca/(c + ā)] p₀(a),

for q = 0. By a series of manipulations similar to those leading to Eq. (13.21), we can then show that

    p_q(a) = [B(q + a, 2 + ā/c)/B(a, 1 + ā/c)] ρ(a),

where B(x, y) is the Euler beta function. (Other equivalent forms are also possible if you write the beta function in terms of its constituent gamma functions.)
c) From Eq. (13.25) we see that for large values of q this expression falls off as q^{−(2+ā/c)}, so the exponent is 2 + ā/c, independent of a. Indeed, the entire network has a power-law degree distribution with this same exponent, as we can see by writing p_q(a) = f(a) q^{−(2+ā/c)} in the large-q regime, where f(a) is a q-independent function whose value we can if we wish calculate from Eq. (13.25). The complete degree distribution for the network is then given by integrating p_q(a) over all a:

    p_q = ∫ p_q(a) da = q^{−(2+ā/c)} ∫ f(a) da,

and hence the degree distribution has a power-law tail with exponent 2 + ā/c.

Exercise 13.9:
a) The probability that the first end of the edge falls in a component of size r is a_r and the probability that the second falls in a component of size s is a_s. Hence the probability of both is a_r a_s (or twice that if one includes the probability that it happens the other way around). The total probability of forming a new component of size k is therefore Σ_{j=1}^{k−1} a_j a_{k−j}.
b) The probability of joining to a component of any size other than k is 2a_k(1 − a_k) = 2a_k − 2a_k². If the two components have the same size then the probability is just a_k² with no factor of 2.
c) The number of nodes in components of size k is n a_k(n). This number goes up by k when a new component of size k forms and down by k when one is lost, except when we join two components of size k together, in which case we lose 2k nodes, k for each of the components. Hence the master equation takes the form

    (n + 1) a_k(n + 1) = n a_k(n) − βk[2a_k(n) − 2a_k(n)²] − 2βk a_k(n)² + βk Σ_{j=1}^{k−1} a_j(n) a_{k−j}(n)
                       = n a_k(n) − 2βk a_k(n) + βk Σ_{j=1}^{k−1} a_j(n) a_{k−j}(n).

d) For components of size 1 we have

    (n + 1) a₁(n + 1) = n a₁(n) − 2β a₁(n) + 1.

e) Writing a_k(n) = a_k(n + 1) = a_k in the steady state, the equations in the question follow in a straightforward fashion.
f) Multiplying by z^k and summing over k, we get

    Σ_{k=1}^∞ (1 + 2βk) a_k z^k = z + β Σ_{k=2}^∞ k z^k Σ_{j=1}^{k−1} a_j a_{k−j}.

Defining g(z) = Σ_{k=1}^∞ a_k z^k, this can be written

    g(z) + 2βz dg/dz = z + βz (d/dz) Σ_{k=2}^∞ Σ_{j=1}^{k−1} a_j z^j a_{k−j} z^{k−j}
                     = z + βz (d/dz) Σ_{j=1}^∞ a_j z^j Σ_{l=1}^∞ a_l z^l
                     = z + βz (d/dz) [g(z)]²,

where we have made the substitution l = k − j in the second line. Thus g + 2βz g′ = z + 2βz g g′, which can be rearranged to give the equation in the question.

Exercise 13.10:
a) This is just algebra.
b) We write

    ∫₀¹ π_q(τ) dτ = [Γ(q + a)/(Γ(q + 1)Γ(a))] ∫₀¹ τ^{ca/(c+a)} (1 − τ^{c/(c+a)})^q dτ,

then we make the substitution u = τ^{c/(c+a)}, which gives

    ∫₀¹ π_q(τ) dτ = (1 + a/c)[Γ(q + a)/(Γ(q + 1)Γ(a))] ∫₀¹ u^{a+a/c} (1 − u)^q du.
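The steady-state equations of Exercise 13.9 and the generating-function identity of part (f) can be checked together: solve the recursion for a_k numerically, then evaluate both sides of g + 2βzg′ = z + 2βzgg′. A sketch (the values β = 1 and z = 0.5 are arbitrary test choices):

```python
# Check of Exercise 13.9(e,f): solve the steady-state equations
#   (1 + 2*beta*k) a_k = delta_{k,1} + beta*k * sum_{j=1}^{k-1} a_j a_{k-j}
# recursively, then verify g + 2*beta*z*g' = z + 2*beta*z*g*g' numerically.
beta = 1.0
K = 80
a = [0.0] * (K + 1)
a[1] = 1.0 / (1.0 + 2.0 * beta)
for k in range(2, K + 1):
    conv = sum(a[j] * a[k - j] for j in range(1, k))
    a[k] = beta * k * conv / (1.0 + 2.0 * beta * k)

z = 0.5  # well inside the radius of convergence, so truncation error is tiny
g  = sum(a[k] * z**k for k in range(1, K + 1))
gp = sum(k * a[k] * z**(k - 1) for k in range(1, K + 1))
residual = (g + 2*beta*z*gp) - (z + 2*beta*z*g*gp)
print(residual)
```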


Using Eq. (13.33) we can now perform the integral to get

    ∫₀¹ π_q(τ) dτ = (1 + a/c)[Γ(q + a)/(Γ(q + 1)Γ(a))] B(q + 1, a + 1 + a/c)
                  = [Γ(q + a)Γ(2 + a/c)/Γ(q + a + 2 + a/c)] × [Γ(a + 1 + a/c)/(Γ(a)Γ(1 + a/c))]
                  = B(q + a, 2 + a/c)/B(a, 1 + a/c),

which recovers Eq. (13.21).

14 Community structure

Exercise 14.1:
a) The total number of edges in the graph is n − 1, and no matter where the cut falls the number of edges within groups following the cut will be n − 2. So the fraction of edges within groups is (n − 2)/(n − 1). The total number of ends of edges is 2(n − 1) and the numbers of ends that fall in the two groups are 2r − 1 and 2(n − r) − 1. So the fractions of ends of edges in the two groups are (2r − 1)/[2(n − 1)] and [2(n − r) − 1]/[2(n − 1)]. Plugging these numbers into Eq. (7.58), we get

    Q = (n − 2)/(n − 1) − [(2r − 1)/(2(n − 1))]² − [(2(n − r) − 1)/(2(n − 1))]²,

which simplifies to give the result in the question.
b) Differentiating with respect to r and setting the result to zero gives 4n − 8r = 0, or r = ½n, so the best split is the one into two parts of equal sizes. The requirement that n be even is necessary because r must be an integer.

Exercise 14.2: There are many ways to solve this problem but, for example, this short Python program will calculate the modularity matrix and its leading eigenvector:

from numpy import array,outer
from numpy.linalg import eigh

# Adjacency matrix, nodes numbered left to right in the picture
A = array( [[ 0, 1, 1, 0, 0, 0 ],
            [ 1, 0, 1, 0, 0, 0 ],
            [ 1, 1, 0, 1, 0, 0 ],
            [ 0, 0, 1, 0, 1, 1 ],
            [ 0, 0, 0, 1, 0, 1 ],
            [ 0, 0, 0, 1, 1, 0 ]], int)
k = sum(A)                  # vector of node degrees
twom = sum(k)               # 2m, twice the number of edges
B = A - outer(k,k)/twom     # modularity matrix
x,v = eigh(B)               # eigenvalues in ascending order
vector = v[:,-1]            # eigenvector of the largest eigenvalue
print(vector)

For the leading eigenvector, the program gives (0.444, 0.444, 0.325, −0.325, −0.444, −0.444). For the purposes of this program the nodes were numbered left to right in the picture, so this vector tells us that the left three nodes are in one community and the right three are in the other.

Exercise 14.3:
a) The likelihood is

    P(k₁, …, k_n|μ) = ∏_{i=1}^n e^{−μ} μ^{k_i}/k_i!,

and the log-likelihood L is the logarithm of this expression:

    L = −nμ + Σ_{i=1}^n (k_i log μ − log k_i!).

b) To derive the maximum likelihood estimate of μ we differentiate with respect to μ and set the result to zero:

    −n + (1/μ) Σ_{i=1}^n k_i = 0,

or

    μ = (1/n) Σ_{i=1}^n k_i.

In other words, the best estimate of μ is just given by the standard formula for the mean.

Exercise 14.4:
a) If the network were in fact generated from the model described, then every observed edge (B_ij = 1) was generated with probability p and every non-edge (B_ij = 0) with probability 1 − p. The total probability can then be conveniently written

    P(B|p) = ∏_{i=1}^{n₁} ∏_{j=1}^{n₂} p^{B_ij} (1 − p)^{1−B_ij} = p^{Σ_ij B_ij} (1 − p)^{Σ_ij (1−B_ij)} = p^m (1 − p)^{n₁n₂−m},

where m = Σ_ij B_ij is the total number of edges in the network. This expression is analogous to Eq. (14.26) for the ordinary random graph.
b) Differentiating with respect to p and setting the result to zero gives

    p = m/(n₁n₂).
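The maximum likelihood result of Exercise 14.3 can also be confirmed numerically, comparing the closed-form estimate with a brute-force grid search over the log-likelihood. A sketch with hypothetical data:

```python
# Check of Exercise 14.3: the log-likelihood L(mu) = -n*mu + sum_i k_i*log(mu)
# (dropping the constant log k_i! terms) is maximized at the sample mean.
from math import log

ks = [3, 1, 4, 1, 5, 9, 2, 6]  # hypothetical degree data
n = len(ks)

def L(mu):
    return -n * mu + sum(k * log(mu) for k in ks)

mu_hat = sum(ks) / n                      # closed-form MLE: the mean
grid = [i / 1000 for i in range(1, 20001)]
mu_grid = max(grid, key=L)                # brute-force maximizer
print(mu_hat, mu_grid)
```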


Exercise 14.5:
a) The definition of m_rs is

    m_rs = Σ_ij A_ij δ_{c_i,r} δ_{c_j,s},

which equals the number of edges between group r and group s, or twice that number when r = s. Thus for this network we have

    m₁₁ = 10,  m₁₂ = m₂₁ = 1,  m₂₂ = 10,

and n₁ = 4, n₂ = 4. Then the log profile likelihood for the split shown is

    L = Σ_rs m_rs log[m_rs/(n_r n_s)]
      = 10 × log[10/(4×4)] + 2 × log[1/(4×4)] + 10 × log[10/(4×4)]
      = −14.95…

b) The six other possible values for the log profile likelihood, that result from moving one of the six possible symmetry-distinct nodes to the other group, are:

    6 × log[6/(3×3)] + 6 × log[3/(3×5)] + 10 × log[10/(5×5)] = −21.25…
    4 × log[4/(3×3)] + 8 × log[4/(3×5)] + 10 × log[10/(5×5)] = −22.98…
    6 × log[6/(3×3)] + 4 × log[2/(3×5)] + 12 × log[12/(5×5)] = −19.30…
    12 × log[12/(5×5)] + 6 × log[3/(5×3)] + 4 × log[4/(3×3)] = −21.71…
    10 × log[10/(5×5)] + 6 × log[3/(5×3)] + 6 × log[6/(3×3)] = −21.25…
    10 × log[10/(5×5)] + 8 × log[4/(5×3)] + 4 × log[4/(3×3)] = −22.98…

All of these are lower (more negative) than the log-likelihood for the even split.

Exercise 14.6:
a) Writing the terms for ordinary edges and self-edges separately, the likelihood is

    P(A|θ) = ∏_{i<j} e^{−θ_iθ_j} (θ_iθ_j)^{A_ij}/A_ij! × ∏_i e^{−θ_i²/2} (½θ_i²)^{A_ii/2}/(A_ii/2)!,

bearing in mind that the number of self-edges at node i is A_ii/2. Taking logs, we then get the log-likelihood

    L = Σ_{i<j} [A_ij log(θ_iθ_j) − θ_iθ_j − log A_ij!] + Σ_i [½A_ii log(½θ_i²) − ½θ_i² − log(A_ii/2)!]
      = ½ Σ_ij [A_ij log(θ_iθ_j) − θ_iθ_j] + constants.

b) Differentiating, we find that the maximum with respect to θ_l is given by

    ∂L/∂θ_l = ½ Σ_i A_il/θ_l + ½ Σ_j A_lj/θ_l − ½ Σ_i θ_i − ½ Σ_j θ_j = 0.

Noting that A is symmetric (because the network is undirected), this is equivalent to k_l/θ_l − Σ_i θ_i = 0, where we have used k_l = Σ_i A_il. Rearranging, we then get

    θ_l = k_l/Σ_i θ_i.

Summing both sides over l then gives Σ_l θ_l = 2m/Σ_i θ_i, and hence (Σ_i θ_i)² = 2m, so Σ_i θ_i = √(2m) and θ_l = k_l/√(2m). Thus the mean number of edges between i and j is

    θ_i θ_j = k_i k_j/2m.

Exercise 14.7:
a) Let us number the nodes thus:

    1   3
      5
    2   4

The table of cosine similarities then looks like this:

         1      2      3      4      5
    1    –    0.408  0.354  0.816  0.500
    2  0.408    –    0.577  0.333  0.816
    3  0.354  0.577    –    0.577  0.354
    4  0.816  0.333  0.577    –    0.408
    5  0.500  0.816  0.354  0.408    –

b) The biggest similarities are between nodes 1 and 4 and between nodes 2 and 5 (both pairs have the same similarity), so these pairs join up first. The biggest similarities between the three remaining clusters are then between 3 and (1, 4) and between 3 and (2, 5), so these join up as well. Then the resulting dendrogram looks like this:

    (dendrogram with leaves ordered 1, 4, 2, 5, 3)

In practice, one might draw the joins between 1 and 4 and between 2 and 5 at different levels, to indicate the operation of a clustering algorithm that chooses randomly between pairs with identical similarity. Similarly the joining of the groups (1, 4), (2, 5) and 3 might be divided into two joins at different levels.


15 Percolation and network resilience

Exercise 15.1:
a) The generating functions for the degree distribution and excess degree distribution are g₀(z) = z⁴ and g₁(z) = z³. Hence Eq. (15.4) becomes u = 1 − φ + φu³, which is a cubic equation. However, we know that u = 1 is always a solution, so we can factor that out, and we find that φu² + φu + φ − 1 = 0, so

    u = [−φ ± √(4φ − 3φ²)]/2φ.

The negative solution is disallowed since u is a probability, so we take the positive one. Then the size of the giant cluster is

    S = 1 − g₀(u) = 1 − u⁴ = 1 − [√(4φ − 3φ²) − φ]⁴/16φ⁴.

b) The critical probability occurs at the point where S = 0, which gives 3φ = √(4φ − 3φ²), or φ_c = ⅓.
c) We have S = 1 when φ = √(4φ − 3φ²), which gives φ = 1. So the giant cluster fills the entire network when the occupation probability reaches 1. This can happen because in a regular graph with k ≥ 3 the giant component also fills the entire network (see Exercise 12.4).

Exercise 15.2:
a) Since the occupied nodes are chosen at random, any pair of them is connected by an edge independently with probability p, just as is the case for any nodes in the network. Hence they form a random graph with the same edge probability p as the network as a whole. The expected number of occupied nodes is n′ = nφ, and hence, from Eq. (11.6), the mean degree is

    c′ = (n′ − 1)p = (nφ − 1)p ≃ cφ,

where the approximation becomes exact in the limit of large n for fixed φ.
b) A percolation cluster corresponds to a component in the subgraph composed of the occupied nodes. From Eq. (11.29), the average size of the component that a randomly chosen occupied node belongs to, when S = 0, is

    1/(1 − c′) = 1/(1 − cφ).

By convention, however, ⟨s⟩ is defined as the average size of the cluster any node belongs to, occupied or not, which is 1/(1 − cφ) as above if the node is occupied or zero if it is unoccupied, for an average value of

    ⟨s⟩ = φ/(1 − cφ).

c) The node belongs to a cluster of size zero if it is unoccupied, which happens with probability 1 − φ. If it is occupied the results of Section 12.10.9 along with those of Exercise 12.13 tell us that the probability of belonging to a cluster of size s is φ (the probability of being occupied) times the probability of being in a component of size s within the network of occupied nodes, which gives the expression in the question.

Exercise 15.3:
a) The mean and mean-square degrees are

    ⟨k⟩ = p₁ + 2p₂ + 3p₃,    ⟨k²⟩ = p₁ + 4p₂ + 9p₃,

so from Eq. (15.9)

    φ_c = ⟨k⟩/(⟨k²⟩ − ⟨k⟩) = (p₁ + 2p₂ + 3p₃)/[(p₁ + 4p₂ + 9p₃) − (p₁ + 2p₂ + 3p₃)] = (p₁ + 2p₂ + 3p₃)/(2p₂ + 6p₃).

b) If p₁ > 3p₃ then

    φ_c > (3p₃ + 2p₂ + 3p₃)/(2p₂ + 6p₃) = 1,

so there can be no giant cluster. (Indeed if p₁ > 3p₃ there is no giant component—see Eq. (12.40)—so obviously there can be no giant cluster.) The result does not depend on p₂ because you can add as many degree-2 nodes to the network as you like and it makes no difference—they just sit in the middle of edges but don't change the overall network structure.
c) The probability u satisfies

    u = 1 − φ + φ g₁(u) = 1 − φ + φ(q₀ + q₁u + q₂u²),

or equivalently

    φq₂u² + (φq₁ − 1)u + 1 − φ + φq₀ = 0.

Since u = 1 is necessarily a solution, the left-hand side of this equation must contain a factor of u − 1. Knowing this, it's straightforward to show that the equation can be factorized as

    (u − 1)(φq₂u − 1 + φ − φq₀) = 0,

and hence if u < 1 so that there is a giant cluster, u must satisfy φq₂u − 1 + φ − φq₀ = 0, or

    u = (1 − φ + φq₀)/φq₂.

The size of the giant cluster is

    S = φ[1 − g₀(u)] = φ(1 − p₁u − p₂u² − p₃u³),

and combining the last two equations gives us our solution.
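The solution of Exercise 15.1 can be checked numerically: the closed-form root should satisfy the original fixed-point equation, S should vanish at φ_c = 1/3, and S should reach 1 at φ = 1. A sketch (the value φ = 0.5 is an arbitrary test point):

```python
# Check of Exercise 15.1: for a 4-regular network,
#   u = (-phi + sqrt(4*phi - 3*phi**2))/(2*phi)
# should satisfy u = 1 - phi + phi*u**3, with S = 1 - u**4.
from math import sqrt

def u_of(phi):
    return (-phi + sqrt(4*phi - 3*phi**2)) / (2*phi)

phi = 0.5
u = u_of(phi)
residual = u - (1 - phi + phi * u**3)
S = 1 - u**4
S_at_third = 1 - u_of(1/3)**4   # should vanish at the threshold phi_c = 1/3
S_at_one = 1 - u_of(1.0)**4     # should equal 1 at phi = 1
print(u, residual, S, S_at_third, S_at_one)
```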


Exercise 15.4: From Eq. (12.102), the generating function for Exercise 15.7:
the excess degree distribution is a) The position of the phase transition is given by Eq. (15.41),
 1 − a 2 f10 (1)  1. The excess degree distribution is q k 
11 (z)  , (k + 1)p k+1 /hki  (k + 1)−α+1 /ζ(α − 1) so from Eq. (15.35)
1 − az
and thus k0
1 Õ
2a(1 − a)2 f1 (z)  k −α+1 z k−1 .
110 (z)  . ζ(α − 1)
(1 − az)3 k1

Setting z  1, we get Differentiating this expression and setting z  1, the con-


dition for the phase transition is
1 1−a
φc  0  . Ík0 −1
11 (1) 2a
k0
(k − 1)k −α+1
 1.
ζ(α − 1)
Exercise 15.5:
a) There is an occupied gray bond for every unoccupied black bond, so the fraction of occupied gray bonds is 1 − φ.
b) No gray bond can cross a black bond, so if there is a path of black bonds from side to side there can be no gray path crossing it, and hence no gray path from top to bottom. Conversely, if there is no path of black bonds from side to side, then there is at least one gap through which the gray path can slip and connect top and bottom.
c) The results of part (b) imply that (in the limit of large lattice size) if the black bonds percolate the gray ones do not and vice versa. Both systems however are square lattices and hence have the same percolation threshold φ_c. Thus when the black bonds percolate we must have φ > φ_c and at the same time the gray bonds don't percolate so we must have 1 − φ < φ_c. Combining these two relations we have 1 − φ < φ_c < φ. Now setting φ = 1/2 + ε, we find that 1/2 − ε < φ_c < 1/2 + ε, and letting ε → 0⁺ the result is established.

Exercise 15.6:
a) Equation (15.4) tells us that u = 1 − φ + φ g1(u). For a Poisson distribution the excess degree generating function is g1(z) = e^{c(z−1)}, so the equation is u = 1 − φ + φ e^{c(u−1)}.
b) In order to not be in the giant cluster a node must not be connected to the giant cluster along any of its edges, which happens with probability u^k.
c) Applying Bayes' rule, we have

   P(k | not in g.c.) = P(not in g.c. | k) P(k)/P(not in g.c.) = p_k u^k/(1 − S) = p_k u^k/g0(u).

d) The mean degree is

   Σ_{k=0}^∞ k p_k u^k / g0(u) = u g0'(u)/g0(u) = u · c e^{c(u−1)}/e^{c(u−1)} = cu.

Exercise 15.7:
a) … Rearranging then gives the required result.
b) Rearranging the equation given in the question we get

   Σ_{k=k0+1}^∞ k^{−x} = ζ(x) − Σ_{k=1}^{k0} k^{−x}.

Using the trapezoidal rule on the last term and dropping the error terms gives

   Σ_{k=k0+1}^∞ k^{−x} ≈ ∫_{k0+1}^∞ k^{−x} dk + (1/2)(k0+1)^{−x} = (k0+1)^{−x+1}/(x−1) + (1/2)(k0+1)^{−x},

where we have assumed that x > 1, so that the term at k = ∞ vanishes. Combining results then gives the required expression.
c) Using the result from part (b) in the expression from part (a) tells us that the phase transition occurs approximately when

   [ζ(α−2) − (1/2)(k0+1)^{−α+2} − (k0+1)^{−α+3}/(α−3)] − [ζ(α−1) − (1/2)(k0+1)^{−α+1} − (k0+1)^{−α+2}/(α−2)] = ζ(α−1).

Dropping the subleading terms in k0, i.e., those in (k0+1)^{−α+1} and (k0+1)^{−α+2}, we are left with the result given in the question.
d) Setting α = 2.5 and solving for k0 gives k0 = 11.17. So k0 is about 11.

Exercise 15.8:
a) The probability u obeys (15.33), which says u = 1 − f1(1) + f1(u). There is a trivial solution of this equation at u = 1, but it doesn't give a giant cluster—the probability of not being in the giant cluster is 1. For a giant cluster we need another solution with u < 1. This can only occur if the slope of the function 1 − f1(1) + f1(u) is steeper than 1 at u = 1, i.e., if

   (d/du)[1 − f1(1) + f1(u)]|_{u=1} > 1,

which gives f1'(1) > 1 as in the question.
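Self-consistency equations of this kind, such as the Poisson case u = 1 − φ + φe^{c(u−1)} of Exercise 15.6(a), are easily solved numerically by fixed-point iteration. A minimal sketch, with illustrative parameter values (φ = 0.4, c = 5, not taken from the exercises), for which φc = 2 > 1 and a nontrivial root therefore exists:

```python
from math import exp

# Solve u = 1 - phi + phi*exp(c*(u - 1)) by fixed-point iteration.
# phi = 0.4 and c = 5 are illustrative values only; since the slope
# criterion gives phi*g1'(1) = phi*c = 2 > 1, a root u < 1 exists.
phi, c = 0.4, 5.0
u = 0.5                       # any starting guess in [0, 1)
for _ in range(200):
    u = 1 - phi + phi * exp(c * (u - 1))

residual = abs(u - (1 - phi + phi * exp(c * (u - 1))))
# The iteration converges to u = 0.681 (to three decimal places).
```

The same loop solves an equation of the form in Exercise 15.8(a) for any generating function f1 by swapping the update line.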

Solutions to exercises

b) For this model we have

   f0(z) = Σ_{k=0}^∞ p_k φ_k z^k = (1−a) Σ_{k=0}^∞ (abz)^k = (1−a)/(1−abz),

and

   g0(z) = Σ_{k=0}^∞ p_k z^k = (1−a)/(1−az).

Then Eq. (15.36) tells us that

   f1(z) = f0'(z)/g0'(1) = b(1−a)²/(1−abz)²,

and

   f1'(z) = 2ab²(1−a)²/(1−abz)³.

And there is a giant cluster when f1'(1) > 1, i.e., when

   2ab²(1−a)² > (1−ab)³,

as stated in the question.

Exercise 15.9:
a) Each of individual i's neighbors j has probability f of being chosen and then probability 1/k_j of nominating i for immunization. Thus the probability of receiving a nomination from neighbor j is f/k_j and summing over all neighbors then gives the required result.
b) The probability of not receiving a nomination from neighbor j is 1 − f/k_j, and the probability of not receiving a nomination from any node is Π_j (1 − f/k_j)^{A_ij}. Taking logs, we have

   log Π_j (1 − f/k_j)^{A_ij} = Σ_j A_ij log(1 − f/k_j) ≈ −f Σ_j A_ij/k_j = −f κ_i.

Taking the exponential again then gives the required result. The expansion of the logarithm is a good approximation if either the fraction f of nodes chosen vanishes (for instance if we choose a fixed number of nodes, independent of n) or the degrees of the neighboring nodes are large.
c) Low-degree neighbors mean a larger value of κ_i, which means that the probability 1 − e^{−f κ_i} of being vaccinated is also larger. Hence the node with low-degree neighbors is more likely to be vaccinated. This is perhaps not a good thing: between a node with high-degree neighbors and a node with low-degree neighbors, the one with high-degree neighbors is a bigger risk for disease spreading and therefore a more desirable target for vaccination.

Exercise 15.10: If we relabel one of the clusters at random then it is possible to join a large cluster to a cluster of size one but still relabel the large cluster, so that the cluster to which the relabeled nodes belong increases in size by only one node. In the worst case where we do this on every step of the algorithm, repeatedly adding a cluster of size one to the largest cluster, we would end up relabeling each node O(n) times before the largest cluster had grown to fill the whole network. The total work done relabeling all n nodes O(n) times would then be O(n²).

Exercise 15.11:
a) If the outbreak starts at a randomly chosen node, then the probability of starting in cluster m is s_m/n. And when it does so, it will infect s_m individuals. Hence the expected size of the outbreak is

   I = Σ_{m=1}^k (s_m/n) s_m = (1/n) Σ_{m=1}^k s_m².

The smallest value of I occurs when the clusters are of equal size, meaning that s_m = (n−r)/k. Substituting into the formula for I then gives the required result.
b) There is only one node that will make any difference at all if removed. This one:

   [Figure: the network with the single critical node highlighted.]

c) If you can remove two nodes, you should remove these:

   [Figure: the network with the two critical nodes highlighted.]

16 Epidemics on networks

Exercise 16.1: For a disease starting at a single node the probability of having an epidemic outbreak is S, and the probability of not having an epidemic outbreak is 1 − S. If the disease starts at c different nodes, chosen uniformly and independently, then the probability of not having an epidemic outbreak is (1 − S)^c, and the probability of having one is 1 − (1 − S)^c.
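The multi-seed probability 1 − (1 − S)^c approaches 1 quickly as c grows, which a couple of lines of arithmetic illustrate (the value S = 0.4 is arbitrary, chosen only for the example):

```python
# Probability of an epidemic outbreak when the disease starts at c
# independently chosen seed nodes, each of which triggers an epidemic
# with probability S.  S = 0.4 is an illustrative value only.
S = 0.4
prob = {c: 1 - (1 - S)**c for c in (1, 2, 5, 10)}
# prob[1] = 0.4, prob[2] = 0.64, and prob[10] is already above 0.99.
```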

Networks (2nd Edition)

Exercise 16.2: In terms of the total number of individuals S, X, and R in the three states the equations are

   dS/dt = −βS X/(S+X),
   dX/dt = βS X/(X+S) − γX,
   dR/dt = γX.

Dividing throughout by the total population size n then gives

   ds/dt = −β sx/(s+x),
   dx/dt = β sx/(x+s) − γx,
   dr/dt = γx.

Exercise 16.3:
a) Equation (16.29) is the same as Eq. (15.4) and, as shown in Eq. (15.17), the solution for the exponential distribution is

   u = a⁻¹ − (1/2)φ − √((1/4)φ² + φ(a⁻¹ − 1)).

b) The probability that a node with degree k does not belong to the giant cluster is u^k, and if it does not belong to the giant cluster then, with probability 1 in the limit of large network size, it is not infected. If it does belong to the giant cluster—which happens with probability 1 − u^k—then it may be infected or it may not, depending on whether the outbreak starts in the giant cluster. The probability that the outbreak starts in the giant cluster is S. Hence the total probability that our node of degree k is infected is S(1 − u^k).
The size of the giant cluster in the present case is given by Eq. (16.30) to be

   S = 1 − g0(u) = 1 − (1−a)/(1−au)
     = 1 − (1−a)/[(1/2)φa + √((1/4)φ²a² + φa(1−a))]
     = 1 − [√((1/4)φ²a² + φa(1−a)) − (1/2)φa]/(φa)
     = 3/2 − √(1/4 + (a⁻¹ − 1)/φ).

Putting the results together then gives us our expression for the probability of infection.
c) For a = 0.4 and φ = 0.9 we find that u = 0.804 and S = 0.116. Then the probability of infection for k = 0 is zero (obviously), for k = 1 is 0.023, and for k = 10 is 0.103.

Exercise 16.4:
a) Every node has excess degree 3, so g1(z) = z³ and φ_c = 1/g1'(1) = 1/3.
b) The size of the giant cluster is

   S = 1 − g0(u) = 1 − u⁴,

where u is the solution of u = 1 − φ + φ g1(u) = 1 − φ + φu³, so we have a cubic equation for u:

   φu³ − u + 1 − φ = 0.

Although cubics are usually hard to solve, this one can be simplified by noting that u = 1 is a solution, and hence u − 1 must be a factor. A little work reveals that the cubic equation can be rewritten as

   (u − 1)(φu² + φu + φ − 1) = 0.

If u ≠ 1, then we must have φu² + φu + φ − 1 = 0, which has solutions

   u = −1/2 ± √(4φ − 3φ²)/(2φ).

But u cannot be negative (since it is a probability), so we must choose the positive sign. Then

   S = 1 − (1/16)[√(4φ − 3φ²)/φ − 1]⁴.

Putting φ = 1/2 then gives the required result.

Exercise 16.5: Here is an example program to solve this problem, which is written in Python:

from numpy import empty,zeros
from random import random,randrange
from pylab import plot,show,xlabel

n = 10000
c = 5
m = n*c//2
phi = 0.4

# Generate the network
k = zeros(n,int)
edge = empty(n,set)
for i in range(n): edge[i] = set()
for e in range(m):
    i = randrange(n)
    j = randrange(n)
    while (i==j) or (i in edge[j]):
        i = randrange(n)
        j = randrange(n)
    edge[i].add(j)
    edge[j].add(i)

# Make all nodes susceptible except the starting node
S,I,R = 0,1,2
state = zeros(n,int)
v = randrange(n)
state[v] = I
# Create a queue to hold the infected nodes
q = empty(n,int)
q[0] = v
pin = 1
pout = 0

t = 0
endstep = 1
susceptible = n - 1
infected = 1
recovered = 0

tpoints = []
spoints = []
ipoints = []
rpoints = []

# Main loop
while pin>pout:

    # Pull an infected node off the queue
    v = q[pout]
    pout += 1

    # Check its neighbors one by one
    for i in edge[v]:
        if state[i]==S and random()<phi:
            state[i] = I
            q[pin] = i
            pin += 1
            infected += 1
            susceptible -= 1

    # Mark the infecting node as recovered
    state[v] = R
    infected -= 1
    recovered += 1

    # Check if the time step has ended
    # If it has, record the current state
    if pout==endstep:
        t += 1
        tpoints.append(t)
        spoints.append(susceptible)
        ipoints.append(infected)
        rpoints.append(recovered)
        endstep = pin

plot(tpoints,spoints)
plot(tpoints,ipoints)
plot(tpoints,rpoints)
xlabel("Time")
show()

And here is the graph the program produces:

   [Figure: numbers of susceptible, infected, and recovered nodes (vertical axis, 0 to 10000) plotted against time (horizontal axis, 0 to 20).]

Exercise 16.6:
a) Following the lines of the argument given in Section 16.3.2, we define u to be the probability that a node is not connected to the giant cluster via a specific one of its edges. For this to happen, either the edge is unoccupied, or the node at its other end is unoccupied, or both are occupied but the node at the other end is not connected to the giant cluster via any of its remaining edges. The probability that both the node and edge are occupied is φ_s φ_b, and the probability that at least one of them is unoccupied is 1 − φ_s φ_b. The probability that none of the remaining edges connects to the giant cluster is u^k, where k is the excess degree. Putting the terms together, the total probability that a given edge does not connect to the giant cluster is 1 − φ_s φ_b + φ_s φ_b u^k. Averaging over the distribution q_k of the excess degree then gives u = 1 − φ_s φ_b + φ_s φ_b g1(u) as in the question.
A given node does not belong to the giant cluster if either it is itself unoccupied (probability 1 − φ_s) or it is occupied (probability φ_s) but none of its edges lead to the giant component (probability u^k, where k is now the total degree). Putting these terms together gives a total probability of 1 − φ_s + φ_s u^k. Averaging over the distribution p_k of the total degree then gives the expression for 1 − S, which can be rearranged to give the required expression for S.
b) As in Fig. 15.2, the phase transition between the epidemic and non-epidemic regimes occurs at the point where the derivative of y = 1 − φ_s φ_b + φ_s φ_b g1(u) at u = 1 is equal to 1. Performing the derivative, rearranging for φ_s, and recalling that the fraction of individuals vaccinated is 1 − φ_s, then gives the required result.

Exercise 16.7:
a) The probability of a small outbreak is 1 − S, where S is the size of the giant percolation cluster in the network, given
by the solutions of Eqs. (16.29) and (16.30). Equivalently the probability of a small outbreak is g0(u), where g0 is the generating function for the degree distribution and u is the solution of Eq. (16.29).
b) Small outbreaks correspond to small percolation clusters in the bond percolation picture and π_s is the probability that a randomly chosen node belongs to a small cluster of size s. As with the calculation of small component sizes for the configuration model in Section 12.10.8, let us define ρ_t to be the probability that an edge from a node i leads to a component of size t when i has been removed from the network, and let h1(z) = Σ_t ρ_t z^t be the generating function for this probability. Then the generating function for the size of the component to which a randomly chosen node with degree k belongs is z[h1(z)]^k, where the leading factor of z is for the node itself and we have made use of the results about products of generating functions from Section 12.10.5. Averaging over the degree distribution p_k then gives the total generating function for small outbreaks

   h0(z) = Σ_{k=0}^∞ p_k z[h1(z)]^k = z g0(h1(z)).

The generating function h1(z) for the size of the cluster at the end of an edge has two terms. If the edge is unoccupied (probability 1 − φ) then the cluster has size zero, so the generating function has a constant term equal to 1 − φ. If the edge is occupied then by an argument analogous to that for h0 the generating function is equal to z[h1(z)]^k, where k is now the excess degree of the node at the end of the edge. Each term in this generating function now gets multiplied by φ, the probability that the edge is occupied and, after averaging over the distribution q_k of the excess degree, the full generating function is

   h1(z) = 1 − φ + Σ_{k=0}^∞ q_k φz[h1(z)]^k = 1 − φ + φz g1(h1(z)).

c) The mean size of a small outbreak when one happens is given by

   ⟨s⟩ = Σ_s sπ_s / Σ_s π_s = h0'(1)/(1 − S).

Putting 1 − S = g0(u), where u = h1(1), and performing the derivatives, we find that

   ⟨s⟩ = 1 + φ g0'(u) g1(u) / ( g0(u)[1 − φ g1'(u)] ),

and u = 1 − φ + φ g1(u) from the expression for h1 above. A number of alternative forms can be derived by making use of the substitutions g0'(u) = g0'(1) g1(u) = ⟨k⟩ g1(u), φ g1(u) = u − 1 + φ, or g0(u) = 1 − S.

Exercise 16.8:
a) At short times the average probability ⟨x_i⟩ that node i is infected satisfies

   d⟨x_i⟩/dt = β Σ_j A_ij ⟨s_i x_j⟩,

but at short times s_i = 1 for all i with probability approaching 1 as n → ∞. Thus ⟨s_i x_j⟩ → ⟨x_j⟩ and, dropping the notation ⟨. . .⟩ for the sake of brevity,

   dx_i/dt = β Σ_j A_ij x_j.

It is straightforward to demonstrate that x_i = x0 e^{βkt} is a solution to this equation with x0 = c/n.
b) The average probability of infection is the same for every node when t = 0—it is x_i = c/n for all i. Assuming it also remains the same for all subsequent times, x_i = x(t), then Eq. (16.55) becomes

   dx/dt = β(1 − x) Σ_j A_ij x = βkx(1 − x).

The x(t) that solves this equation is a solution of (16.55) for the given initial condition and, since solutions to first-order differential equations are unique once the initial condition is fixed, it follows that this is the only solution and hence that the probability of infection is indeed the same for all nodes at all times.
c) The equation can be solved by separating the variables:

   t = ∫_{c/n}^x dy/(βk y(1−y)) = (1/βk)[ ln(x/(1−x)) + ln((n−c)/c) ].

Rearranging for x(t) then gives the required answer.
d) The rate of appearance of new cases is n dx/dt, which is increasing when its derivative n d²x/dt² is positive (and decreasing when its derivative is negative). The inflection point occurs where the derivative is zero:

   0 = d²x/dt² = βk(1 − 2x) dx/dt.

Hence the inflection point occurs at x = 1/2, or when

   t = (1/βk) ln((n−c)/c).

Exercise 16.9:
a) The generating function is

   g1(z) = (p1 + 2p2 z + 3p3 z²)/(p1 + 2p2 + 3p3) = 1/7 + (2/7)z + (4/7)z².

b) Equation (16.84) becomes

   du/dt = (2/7)βu(2u² + u − 3) = (2/7)βu(u − 1)(2u + 3).


Separating the variables and integrating then gives

   t = 7/(30β) ∫ [ 4/(2y+3) − 3/(1−y) − 5/y ] dy
     = 7/(30β) [ ln( (2u+3)²(1−u)³/u⁵ ) − ln 25 ].

c) In the limit of long times u → 0 and we have

   t ∼ 7/(30β) ln(9/u⁵),

or u ∼ e^{−6βt/7}. Then

   s(t) = Σ_k p_k s_k(t) = s0 Σ_k p_k u^k ≈ (1/10)(3u + 3u² + 4u³).

The leading behavior for small u comes from the first term in parentheses, and hence s(t) ∼ e^{−6βt/7} at long times.
There was an error in this question in the first printing of the book in which the final answer was incorrectly given as e^{−21βt/2}.

Exercise 16.10:
a) Comparing the two equations we get

   κ1 = g1'(1) = Σ_{k=0}^∞ k q_k = (1/⟨k⟩) Σ_{k=0}^∞ k(k+1) p_{k+1} = (1/⟨k⟩) Σ_{k=0}^∞ (k−1)k p_k = (⟨k²⟩ − ⟨k⟩)/⟨k⟩.

b) For a Poisson degree distribution with mean c we have ⟨k⟩ = c and ⟨k²⟩ = c(c+1), so

   (⟨k²⟩ − ⟨k⟩)/⟨k⟩ = c,   √⟨k²⟩ = √(c(c+1)),

which clearly violates the inequality for all c > 0. The explanation is that, as discussed in footnote 18 on page 662, Eq. (16.100) is only approximate.

17 Dynamical systems on networks

Exercise 17.1:
a) Solutions of first-order differential equations are unique given the initial values of the variables, which means if a solution exists of the form given in the question then it is the solution—no other exists. Thus it is sufficient to substitute the given form into the equations and show that it is indeed a correct solution. Putting x_i(t) = x(t) for all i we get

   dx/dt = f(x) + Σ_j A_ij g(x, x) = f(x) + k g(x, x),

where we have made use of Σ_j A_ij = k_i = k. If x(t) is a solution of this equation then it is a solution of the original. Noting also that with x(0) = x0 it satisfies also the given boundary condition, the result is established.
b) This result is a special case of Eq. (17.57). For a k-regular network κ1 = k and the rest follows straightforwardly.

Exercise 17.2:
a) Equations (17.63) and (17.64) tell us that the system is stable if f'(x*) < 0 and if the largest eigenvalue of the Laplacian satisfies

   1/λ_n > [ (dg/dx) / (df/dx) ]_{x=x*}.

Meanwhile, Eq. (17.95) tells us that for all networks λ_n ≤ 2k_max. Combining the latter two inequalities then gives us the result we want.
b) As discussed in Section 17.2.2, a symmetric fixed point for this system is any point x_i = x* for all i with f(x*) = 0, and thus rx*(1 − x*) = 0, which gives x* = 0 or 1. However, we require f'(x*) < 0 for stability, which is not true at x* = 0 so long as r > 0, so the point at x* = 0 is unstable.
c) For x* = 1 and with the given forms for f and g we have dg/dx = −2ax and df/dx = r(1 − 2x), so the condition above, combined with λ_n ≤ 2k_max, reads

   1/k_max > −4ax*/(r(1 − 2x*)) |_{x*=1} = 4a/r.

Hence the system is stable at this fixed point if the maximum degree of a node in the network satisfies k_max < r/4a.

Exercise 17.3:
a) Setting x_i = x* for all i, we have a fixed point when

   0 = dx_i/dt = f(x*) + Σ_j (A_ij − A_ji) g(x*, x*) = f(x*) + (k − k) g(x*, x*) = f(x*).

b) Making the suggested substitution and expanding in ε_i to first order we find that

   dε_i/dt = f(x*) + ε_i f'(x*) + Σ_j (A_ij − A_ji) [ g(x*, x*) + ε_i ∂g(u,v)/∂u|_{u=v=x*} + ε_j ∂g(u,v)/∂v|_{u=v=x*} ]
           = ε_i f'(x*) + ∂g(u,v)/∂v|_{u=v=x*} Σ_j M_ij ε_j,

which has the required form provided

   α = f'(x*),   β = ∂g(u,v)/∂v|_{u=v=x*}.
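The bound λ_n ≤ 2k_max quoted from Eq. (17.95) in Exercise 17.2 is easy to check numerically. A sketch on a random graph (the graph model, seed, and parameters are arbitrary choices for illustration, not from the text):

```python
import numpy as np

# Check that the largest Laplacian eigenvalue obeys lambda_n <= 2*k_max.
rng = np.random.default_rng(0)
n, p = 80, 0.08
upper = np.triu(rng.random((n, n)) < p, 1)
A = (upper | upper.T).astype(float)   # symmetric adjacency, no self-loops
k = A.sum(axis=1)                     # degree sequence
L = np.diag(k) - A                    # graph Laplacian
lam = np.linalg.eigvalsh(L)
# The smallest eigenvalue is 0 (up to round-off) and the largest
# never exceeds twice the maximum degree.
```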


c) M^T = A^T − A = −(A − A^T) = −M.
d) If v is a right eigenvector then µv = Mv. Taking the transpose of both sides we get µv^T = v^T M^T = −v^T M, which establishes the result. Taking the complex conjugate of the equality given in the question we then have

   µ* = (v^T M v*)/(v^T v*) = −(v^T µ v*)/(v^T v*) = −µ.

Thus the real part of µ satisfies 2 Re µ = µ + µ* = µ − µ = 0 and hence µ is purely imaginary.
e) Writing the vector ε as a linear combination of the eigenvectors v_r of M, as in Eq. (17.42), we have ε(t) = Σ_r c_r(t) v_r and our equation for ε becomes

   Σ_r (dc_r/dt) v_r = (αI + βM) Σ_r c_r(t) v_r = Σ_r (α + βµ_r) c_r(t) v_r.

Comparing terms in each eigenvector then gives

   dc_r/dt = (α + βµ_r) c_r(t),

so that c_r(t) = c_r(0) exp[(α + βµ_r)t], which will decay to zero provided Re(α + βµ_r) < 0. But since α and β are real and µ_r is purely imaginary, this condition is simply equivalent to saying that α < 0.

Exercise 17.4: For the k-regular graph ⟨k⟩ = k and ⟨k²⟩ = k². For the random graph ⟨k⟩ = c and ⟨k²⟩ = c(c+1). For the star graph ⟨k⟩ = 2 − 2/n and ⟨k²⟩ = n − 1. The rest is straightforward, except perhaps for the case of the star graph, where the first inequality requires us to prove that √(n−1) ≥ 2 − 2/n, which is not entirely obvious. It can be proved by noting that it is true for n = 1, 2, 3, and 4 (which we can show just by calculating the values), and that for n ≥ 5 we have √(n−1) ≥ 2, while 2 − 2/n < 2 for all positive n.

Exercise 17.5:
a) The elements of the adjacency matrix for the square lattice can be written as

   A_{r,r'} = δ_{r+x̂,r'} + δ_{r−x̂,r'} + δ_{r+ŷ,r'} + δ_{r−ŷ,r'},

where x̂ and ŷ are unit vectors in the x and y directions and δ_{x,y} is the (vector) Kronecker delta. Then the rth element of Av is

   [Av]_r = Σ_{r'} A_{r,r'} v_{r'}
          = Σ_{r'} ( δ_{r+x̂,r'} + δ_{r−x̂,r'} + δ_{r+ŷ,r'} + δ_{r−ŷ,r'} ) exp(ik^T r')
          = exp[ik^T(r+x̂)] + exp[ik^T(r−x̂)] + exp[ik^T(r+ŷ)] + exp[ik^T(r−ŷ)]
          = 2(cos k_x + cos k_y) exp(ik^T r)
          = 2(cos k_x + cos k_y) v_r.

Hence Av = 2(cos k_x + cos k_y)v and v is an eigenvector of A with eigenvalue κ = 2(cos k_x + cos k_y).
This, however, ignores the periodic boundary conditions. If the vector v is to take only a single value on each node, then its value must stay the same when we loop around the boundary conditions and come back to the node we started at. In other words

   exp[ik^T(r + Lx̂)] = exp(ik^T r),

which implies that exp(ik_x L) = 1 or equivalently that k_x L = 2πn_1, where n_1 is an integer. Similarly k_y L = 2πn_2 with n_2 another integer.
b) We get unique eigenvectors only so long as −π < k_x < π. Outside this range we get copies of earlier eigenvectors because of aliasing. (In physics-speak, we must stay inside the first Brillouin zone.) Similarly we must have −π < k_y < π. These conditions in turn imply that our integers n_1 and n_2 satisfy −L/2 < n_1 < L/2 and −L/2 < n_2 < L/2. The most negative eigenvalue occurs for k_x = k_y = ±π, which gives κ = −4, and the most positive occurs for k_x = k_y = 0, which gives κ = 4. The smallest magnitude eigenvalue is κ = 0, which occurs for a variety of choices of k_x and k_y, but for instance occurs at k_x = k_y = π/2.

Exercise 17.6:
a) Substituting the given expression into the equations, both sides are equal to ω and hence the expression is indeed a solution.
b) Putting θ_i = ωt + ε_i and performing a Taylor expansion to first order we get

   ω + dε_i/dt = ω + Σ_j A_ij g(ε_i − ε_j) ≈ ω + Σ_j A_ij [ g(0) + (ε_i − ε_j) g'(0) ],

or

   dε_i/dt = g'(0) [ k_i ε_i − Σ_j A_ij ε_j ] = g'(0) Σ_j (k_i δ_ij − A_ij) ε_j,

which is equivalent to the expression given in the question.
c) We can write ε(t) as a linear combination of the eigenvectors v_r of the Laplacian: ε(t) = Σ_r c_r(t) v_r. Then we have

   Σ_r (dc_r/dt) v_r = g'(0) Σ_r c_r(t) L v_r = g'(0) Σ_r λ_r c_r(t) v_r.

Comparing terms gives

   dc_r/dt = g'(0) λ_r c_r(t),
which has the solution c_r(t) = c_r(0) exp[g'(0) λ_r t]. Since all eigenvalues of the Laplacian are nonnegative, all c_r will decay to zero if and only if g'(0) < 0, except for c_1, which corresponds to the zero eigenvalue λ_1 = 0 and remains constant. However, this eigenvalue has corresponding eigenvector (1, 1, 1, . . .), and hence a nonzero value of c_1 simply adds a constant value to all ε_i, which is equivalent to shifting the origin of time by a constant. By an appropriate choice of the point t = 0, therefore, we can always arrange for c_1 to be zero, so that our solution takes the form θ_i = ωt + ε_i, with ε_i(t) decaying to zero for all i.

18 Network search

Exercise 18.1:
a) Since the first page is randomly chosen, the probability is just 1/S_i.
b) If a page has not yet been crawled but the neighboring node at the other end of one of its incoming edges has, then it will be crawled on the next step. Thus the probability of being crawled on the next step is the probability the neighbor was crawled on the last, or a sum of such probabilities if there is more than one neighbor:

   p_i(r) = Σ_j A_ij p_j(r−1),

which in vector notation is p(r) = Ap(r−1). This result would be exact if the network were a tree of outgoing edges from the starting node, so that there was only one directed path to any node from the starting node. However, in a network like the web that contains loops and hence potentially has more than one path to a node, the result is only approximate: even if your neighbor was first crawled on the previous step you cannot assume you will be first crawled on the next, because you could have been crawled already on an earlier step. In such circumstances the derivation above fails. What one can say, however, is that if the crawl kept going even when it encounters a page it has encountered before, then our expression for p_i(r) would correctly give the expected number of times the crawl visits node i on step r.
c) The expected number of times the crawl visits each node in the first r steps is given by

   Σ_{t=0}^r p(t) = (I + A + A² + · · · + A^r) p(0).

As shown in footnote 2 on page 161, a high power of the adjacency matrix multiplied into any vector gives a vector proportional to the leading eigenvector v_1. Assuming the leading eigenvalue is much greater than 1, the expression above will be dominated by the later terms in the series, all of which will then give p(t) ≈ Cv_1 with some appropriate normalizing constant C. This result only works for small crawls because the expected number of visits is only equal to the probability of a visit if both are small.

Exercise 18.2:
a) This result is proved in the book. It is Eq. (6.44).
b) Upon arriving at node i we learn about k_i − 1 nodes. But the probability of arriving at node i is k_i/2m. Thus the average number of nodes we learn about is

   Σ_i (k_i/2m)(k_i − 1) = (1/(n⟨k⟩)) Σ_i (k_i² − k_i) = ⟨k²⟩/⟨k⟩ − 1,

where we have used 2m = n⟨k⟩ (see Eq. (6.15)). If the target exists on a fraction c of the nodes, then the expected number of copies found on any step is simply the number of nodes we learn about times c, as stated in the question.
c) The probability that the item exists on any particular node is c/n and the probability that it does not is 1 − c/n. If we learn about r nodes, then the probability it exists on none of them is (1 − c/n)^r. The log of this probability is

   r ln(1 − c/n) ≈ −cr/n,

and hence, taking exponentials again, the probability is roughly exp(−cr/n). In the present case we have r = ⟨k²⟩/⟨k⟩ − 1 and hence the result is established.
d) No, it is not true, because the result applies, as the question points out, only in the limit of a large number of steps. It doesn't apply on the first step. (Which is obvious anyway—on the first step you encounter only the neighbors of the first node.)

Exercise 18.3: As in the one-dimensional case, we divide the nodes into classes or zones at given distances from the target node. Using the "Manhattan distance," the zones end up diamond shaped, like this:

   [Figure: diamond-shaped zones centered on the target node of the square lattice.]


There are four ends of edges at every node on the square lattice and hence an average of 4p ends of shortcuts and 2np shortcuts in the whole system. The equivalent of the power-law condition on the lengths of shortcuts in the one-dimensional model is that in two dimensions each shortcut spans a given vector displacement with probability Kr^{−α}, where r is the Manhattan length of the vector. There are n places a shortcut that spans a given vector can fall, each with probability 1/n, and hence 2np × Kr^{−α}/n = 2pKr^{−α} is the probability of connection between any node pair.

The normalizing constant K is, as before, fixed by the fact that each shortcut must span some vector. The number of nodes at distance r from a given starting point is 4r and, if we assume a diamond-shaped system as in the picture above, with L nodes on a side, the maximum value of r is L. The normalization condition then says that K Σ_{r=1}^L (2r × r^{−α}) = 1, which by a calculation analogous to that for Eq. (18.5), implies that

   K ≈ (1/2)(2 − α)L^{α−2}   for α < 2,
       1/(2 ln L)            for α = 2,
       (α − 2)/α             for α > 2.

We assume without loss of generality that the target is in the center of our picture. As before, consider a message at a node in the kth class and let us ask what the probability is that there exists a shortcut from that node to a node in a lower class (i.e., a class closer to the target). The number of nodes in classes lower than k is easily shown to be 2^{2k} + (2^k − 1)², which is never less than 2^{2k}. The Manhattan distance from the message to the furthest of these nodes in a lower class is no greater than the maximum distance to the target, which is 2^k − 1, plus the distance out to the furthest node, which is 2^{k−1} − 1. Thus the distance is no greater than 2^k + 2^{k−1} − 2, which is always less than 2^{k+1}. Thus the probability of the message having a shortcut to a particular node in one of the lower classes is never less than 2pK 2^{−(k+1)α}, and the probability of having a shortcut to any of them is never less than 2^{2k} × 2pK 2^{−(k+1)α} = pK 2^{2k+1−(k+1)α}.

If the message does not have a shortcut to a better zone of the lattice from its current location, it gets passed along the lattice to a node one step closer to the target and tries again to find a shortcut there. The expected number of tries until it finds a shortcut is the reciprocal of the probability above, or

   1/(pK 2^{2k+1−(k+1)α}) = (1/pK) 2^{α−1} 2^{(α−2)k}.

In the worst case the message has to pass through all of the classes before reaching the target and there are log₂ L classes. Hence an upper bound on the expected number of steps is

   ℓ ≤ (1/pK) 2^{α−1} Σ_{k=0}^{log₂ L} 2^{(α−2)k}
     = (1/pK) 2^{α−1} (2^{(α−2)[1+log₂ L]} − 1)/(2^{α−2} − 1)
     = (1/pK) 2^{α−1} ((2L)^{α−2} − 1)/(2^{α−2} − 1).

Finally, making use of our expression for the normalizing constant K and noting that L ≤ √n, we have

   ℓ ≤ A n^{1−α/2}   if α < 2,
       B log² n      if α = 2,
       C n^{α/2−1}   if α > 2.

Thus it is indeed possible to find the target in log² n time, but only if α = 2.

Another random document with
no related content on Scribd:
THE FULL PROJECT GUTENBERG LICENSE
PLEASE READ THIS BEFORE YOU DISTRIBUTE OR USE THIS WORK

To protect the Project Gutenberg™ mission of promoting the


free distribution of electronic works, by using or distributing this
work (or any other work associated in any way with the phrase
“Project Gutenberg”), you agree to comply with all the terms of
the Full Project Gutenberg™ License available with this file or
online at www.gutenberg.org/license.

Section 1. General Terms of Use and


Redistributing Project Gutenberg™
electronic works
1.A. By reading or using any part of this Project Gutenberg™
electronic work, you indicate that you have read, understand,
agree to and accept all the terms of this license and intellectual
property (trademark/copyright) agreement. If you do not agree to
abide by all the terms of this agreement, you must cease using
and return or destroy all copies of Project Gutenberg™
electronic works in your possession. If you paid a fee for
obtaining a copy of or access to a Project Gutenberg™
electronic work and you do not agree to be bound by the terms
of this agreement, you may obtain a refund from the person or
entity to whom you paid the fee as set forth in paragraph 1.E.8.

1.B. “Project Gutenberg” is a registered trademark. It may only


be used on or associated in any way with an electronic work by
people who agree to be bound by the terms of this agreement.
There are a few things that you can do with most Project
Gutenberg™ electronic works even without complying with the
full terms of this agreement. See paragraph 1.C below. There
are a lot of things you can do with Project Gutenberg™
electronic works if you follow the terms of this agreement and
help preserve free future access to Project Gutenberg™
electronic works. See paragraph 1.E below.
1.C. The Project Gutenberg Literary Archive Foundation (“the
Foundation” or PGLAF), owns a compilation copyright in the
collection of Project Gutenberg™ electronic works. Nearly all the
individual works in the collection are in the public domain in the
United States. If an individual work is unprotected by copyright
law in the United States and you are located in the United
States, we do not claim a right to prevent you from copying,
distributing, performing, displaying or creating derivative works
based on the work as long as all references to Project
Gutenberg are removed. Of course, we hope that you will
support the Project Gutenberg™ mission of promoting free
access to electronic works by freely sharing Project
Gutenberg™ works in compliance with the terms of this
agreement for keeping the Project Gutenberg™ name
associated with the work. You can easily comply with the terms
of this agreement by keeping this work in the same format with
its attached full Project Gutenberg™ License when you share it
without charge with others.

1.D. The copyright laws of the place where you are located also
govern what you can do with this work. Copyright laws in most
countries are in a constant state of change. If you are outside
the United States, check the laws of your country in addition to
the terms of this agreement before downloading, copying,
displaying, performing, distributing or creating derivative works
based on this work or any other Project Gutenberg™ work. The
Foundation makes no representations concerning the copyright
status of any work in any country other than the United States.

1.E. Unless you have removed all references to Project


Gutenberg:

1.E.1. The following sentence, with active links to, or other


immediate access to, the full Project Gutenberg™ License must
appear prominently whenever any copy of a Project
Gutenberg™ work (any work on which the phrase “Project
Gutenberg” appears, or with which the phrase “Project
Gutenberg” is associated) is accessed, displayed, performed,
viewed, copied or distributed:

This eBook is for the use of anyone anywhere in the United


States and most other parts of the world at no cost and with
almost no restrictions whatsoever. You may copy it, give it
away or re-use it under the terms of the Project Gutenberg
License included with this eBook or online at
www.gutenberg.org. If you are not located in the United
States, you will have to check the laws of the country where
you are located before using this eBook.

1.E.2. If an individual Project Gutenberg™ electronic work is
derived from texts not protected by U.S. copyright law (does not
contain a notice indicating that it is posted with permission of the
copyright holder), the work can be copied and distributed to
anyone in the United States without paying any fees or charges.
If you are redistributing or providing access to a work with the
phrase “Project Gutenberg” associated with or appearing on the
work, you must comply either with the requirements of
paragraphs 1.E.1 through 1.E.7 or obtain permission for the use
of the work and the Project Gutenberg™ trademark as set forth
in paragraphs 1.E.8 or 1.E.9.

1.E.3. If an individual Project Gutenberg™ electronic work is
posted with the permission of the copyright holder, your use and
distribution must comply with both paragraphs 1.E.1 through
1.E.7 and any additional terms imposed by the copyright holder.
Additional terms will be linked to the Project Gutenberg™
License for all works posted with the permission of the copyright
holder found at the beginning of this work.

1.E.4. Do not unlink or detach or remove the full Project
Gutenberg™ License terms from this work, or any files
containing a part of this work or any other work associated with
Project Gutenberg™.

1.E.5. Do not copy, display, perform, distribute or redistribute
this electronic work, or any part of this electronic work, without
prominently displaying the sentence set forth in paragraph 1.E.1
with active links or immediate access to the full terms of the
Project Gutenberg™ License.

1.E.6. You may convert to and distribute this work in any binary,
compressed, marked up, nonproprietary or proprietary form,
including any word processing or hypertext form. However, if
you provide access to or distribute copies of a Project
Gutenberg™ work in a format other than “Plain Vanilla ASCII” or
other format used in the official version posted on the official
Project Gutenberg™ website (www.gutenberg.org), you must, at
no additional cost, fee or expense to the user, provide a copy, a
means of exporting a copy, or a means of obtaining a copy upon
request, of the work in its original “Plain Vanilla ASCII” or other
form. Any alternate format must include the full Project
Gutenberg™ License as specified in paragraph 1.E.1.

1.E.7. Do not charge a fee for access to, viewing, displaying,
performing, copying or distributing any Project Gutenberg™
works unless you comply with paragraph 1.E.8 or 1.E.9.

1.E.8. You may charge a reasonable fee for copies of or
providing access to or distributing Project Gutenberg™
electronic works provided that:

• You pay a royalty fee of 20% of the gross profits you derive from
the use of Project Gutenberg™ works calculated using the
method you already use to calculate your applicable taxes. The
fee is owed to the owner of the Project Gutenberg™ trademark,
but he has agreed to donate royalties under this paragraph to
the Project Gutenberg Literary Archive Foundation. Royalty
payments must be paid within 60 days following each date on
which you prepare (or are legally required to prepare) your
periodic tax returns. Royalty payments should be clearly marked
as such and sent to the Project Gutenberg Literary Archive
Foundation at the address specified in Section 4, “Information
about donations to the Project Gutenberg Literary Archive
Foundation.”

• You provide a full refund of any money paid by a user who
notifies you in writing (or by e-mail) within 30 days of receipt that
s/he does not agree to the terms of the full Project Gutenberg™
License. You must require such a user to return or destroy all
copies of the works possessed in a physical medium and
discontinue all use of and all access to other copies of Project
Gutenberg™ works.

• You provide, in accordance with paragraph 1.F.3, a full refund of
any money paid for a work or a replacement copy, if a defect in
the electronic work is discovered and reported to you within 90
days of receipt of the work.

• You comply with all other terms of this agreement for free
distribution of Project Gutenberg™ works.

1.E.9. If you wish to charge a fee or distribute a Project
Gutenberg™ electronic work or group of works on different
terms than are set forth in this agreement, you must obtain
permission in writing from the Project Gutenberg Literary
Archive Foundation, the manager of the Project Gutenberg™
trademark. Contact the Foundation as set forth in Section 3
below.

1.F.

1.F.1. Project Gutenberg volunteers and employees expend
considerable effort to identify, do copyright research on,
transcribe and proofread works not protected by U.S. copyright
law in creating the Project Gutenberg™ collection. Despite
these efforts, Project Gutenberg™ electronic works, and the
medium on which they may be stored, may contain “Defects,”
such as, but not limited to, incomplete, inaccurate or corrupt
data, transcription errors, a copyright or other intellectual
property infringement, a defective or damaged disk or other
medium, a computer virus, or computer codes that damage or
cannot be read by your equipment.

1.F.2. LIMITED WARRANTY, DISCLAIMER OF DAMAGES -
Except for the “Right of Replacement or Refund” described in
paragraph 1.F.3, the Project Gutenberg Literary Archive
Foundation, the owner of the Project Gutenberg™ trademark,
and any other party distributing a Project Gutenberg™ electronic
work under this agreement, disclaim all liability to you for
damages, costs and expenses, including legal fees. YOU
AGREE THAT YOU HAVE NO REMEDIES FOR NEGLIGENCE,
STRICT LIABILITY, BREACH OF WARRANTY OR BREACH
OF CONTRACT EXCEPT THOSE PROVIDED IN PARAGRAPH
1.F.3. YOU AGREE THAT THE FOUNDATION, THE
TRADEMARK OWNER, AND ANY DISTRIBUTOR UNDER
THIS AGREEMENT WILL NOT BE LIABLE TO YOU FOR
ACTUAL, DIRECT, INDIRECT, CONSEQUENTIAL, PUNITIVE
OR INCIDENTAL DAMAGES EVEN IF YOU GIVE NOTICE OF
THE POSSIBILITY OF SUCH DAMAGE.

1.F.3. LIMITED RIGHT OF REPLACEMENT OR REFUND - If
you discover a defect in this electronic work within 90 days of
receiving it, you can receive a refund of the money (if any) you
paid for it by sending a written explanation to the person you
received the work from. If you received the work on a physical
medium, you must return the medium with your written
explanation. The person or entity that provided you with the
defective work may elect to provide a replacement copy in lieu
of a refund. If you received the work electronically, the person or
entity providing it to you may choose to give you a second
opportunity to receive the work electronically in lieu of a refund.
If the second copy is also defective, you may demand a refund
in writing without further opportunities to fix the problem.

1.F.4. Except for the limited right of replacement or refund set
forth in paragraph 1.F.3, this work is provided to you ‘AS-IS’,
WITH NO OTHER WARRANTIES OF ANY KIND, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR
ANY PURPOSE.

1.F.5. Some states do not allow disclaimers of certain implied
warranties or the exclusion or limitation of certain types of
damages. If any disclaimer or limitation set forth in this
agreement violates the law of the state applicable to this
agreement, the agreement shall be interpreted to make the
maximum disclaimer or limitation permitted by the applicable
state law. The invalidity or unenforceability of any provision of
this agreement shall not void the remaining provisions.

1.F.6. INDEMNITY - You agree to indemnify and hold the
Foundation, the trademark owner, any agent or employee of the
Foundation, anyone providing copies of Project Gutenberg™
electronic works in accordance with this agreement, and any
volunteers associated with the production, promotion and
distribution of Project Gutenberg™ electronic works, harmless
from all liability, costs and expenses, including legal fees, that
arise directly or indirectly from any of the following which you do
or cause to occur: (a) distribution of this or any Project
Gutenberg™ work, (b) alteration, modification, or additions or
deletions to any Project Gutenberg™ work, and (c) any Defect
you cause.

Section 2. Information about the Mission of Project Gutenberg™

Project Gutenberg™ is synonymous with the free distribution of
electronic works in formats readable by the widest variety of
computers including obsolete, old, middle-aged and new
computers. It exists because of the efforts of hundreds of
volunteers and donations from people in all walks of life.

Volunteers and financial support to provide volunteers with the
assistance they need are critical to reaching Project
Gutenberg™’s goals and ensuring that the Project Gutenberg™
collection will remain freely available for generations to come. In
2001, the Project Gutenberg Literary Archive Foundation was
created to provide a secure and permanent future for Project
Gutenberg™ and future generations. To learn more about the
Project Gutenberg Literary Archive Foundation and how your
efforts and donations can help, see Sections 3 and 4 and the
Foundation information page at www.gutenberg.org.

Section 3. Information about the Project Gutenberg Literary Archive Foundation

The Project Gutenberg Literary Archive Foundation is a non-profit
501(c)(3) educational corporation organized under the
laws of the state of Mississippi and granted tax exempt status by
the Internal Revenue Service. The Foundation’s EIN or federal
tax identification number is 64-6221541. Contributions to the
Project Gutenberg Literary Archive Foundation are tax
deductible to the full extent permitted by U.S. federal laws and
your state’s laws.

The Foundation’s business office is located at 809 North 1500
West, Salt Lake City, UT 84116, (801) 596-1887. Email contact
links and up to date contact information can be found at the
Foundation’s website and official page at
www.gutenberg.org/contact

Section 4. Information about Donations to the Project Gutenberg Literary Archive Foundation

Project Gutenberg™ depends upon and cannot survive without
widespread public support and donations to carry out its mission
of increasing the number of public domain and licensed works
that can be freely distributed in machine-readable form
accessible by the widest array of equipment including outdated
equipment. Many small donations ($1 to $5,000) are particularly
important to maintaining tax exempt status with the IRS.

The Foundation is committed to complying with the laws
regulating charities and charitable donations in all 50 states of
the United States. Compliance requirements are not uniform
and it takes a considerable effort, much paperwork and many
fees to meet and keep up with these requirements. We do not
solicit donations in locations where we have not received written
confirmation of compliance. To SEND DONATIONS or
determine the status of compliance for any particular state visit
www.gutenberg.org/donate.

While we cannot and do not solicit contributions from states
where we have not met the solicitation requirements, we know
of no prohibition against accepting unsolicited donations from
donors in such states who approach us with offers to donate.

International donations are gratefully accepted, but we cannot
make any statements concerning tax treatment of donations
received from outside the United States. U.S. laws alone swamp
our small staff.

Please check the Project Gutenberg web pages for current
donation methods and addresses. Donations are accepted in a
number of other ways including checks, online payments and
credit card donations. To donate, please visit:
www.gutenberg.org/donate.

Section 5. General Information About Project Gutenberg™ electronic works

Professor Michael S. Hart was the originator of the Project
Gutenberg™ concept of a library of electronic works that could
be freely shared with anyone. For forty years, he produced and
distributed Project Gutenberg™ eBooks with only a loose
network of volunteer support.

Project Gutenberg™ eBooks are often created from several
printed editions, all of which are confirmed as not protected by
copyright in the U.S. unless a copyright notice is included. Thus,
we do not necessarily keep eBooks in compliance with any
particular paper edition.

Most people start at our website which has the main PG search
facility: www.gutenberg.org.

This website includes information about Project Gutenberg™,
including how to make donations to the Project Gutenberg
Literary Archive Foundation, how to help produce our new
eBooks, and how to subscribe to our email newsletter to hear
about new eBooks.