Networks, 2nd Edition
Mark Newman
Solutions to Exercises
If you find errors in these solutions, please let the author know. Suggestions for improvements are also welcome. Please email
Mark Newman, [email protected]. Please do not post these solutions on the Web or elsewhere in electronic form. Copyright © 2018
Mark Newman.
6 Mathematics of networks

Exercise 6.1:
a) Undirected
b) Directed, approximately acyclic
c) Planar, tree, directed or undirected depending on the representation
d) Undirected, approximately planar
e) Directed or undirected depending on the network
f) Citation networks, food webs
g) The web, the network of who says they're friends with whom
h) A river network, a plant or a tree or their root system
i) A road network, the network of adjacencies of countries
j) Any affiliation network, recommender networks, keyword indices
k) A web crawler
l) Draw data from a professionally curated index such as the Science Citation Index or Scopus, or from an automated citation crawler such as Google Scholar
m) A literature search
n) Questionnaires or interviews
o) An appropriate map

Exercise 6.2: The maximum number of edges is (n choose 2) because there are (n choose 2) distinct places to put an edge and each can have only one edge in a simple network. The minimum is n − 1 because we require that the network be connected and n − 1 is the minimum number of edges that will achieve this—see the discussion at the top of page 123.

Exercise 6.3: The matrices are as follows:

a)  A = ( 0 1 0 0 1 )
        ( 0 0 1 0 0 )
        ( 1 0 0 0 1 )
        ( 0 1 1 0 0 )
        ( 0 0 0 0 0 )

b)  B = ( 1 0 1 0 0 )
        ( 0 1 1 0 0 )
        ( 0 0 0 1 0 )
        ( 0 1 1 1 1 )

c)  BᵀB = ( 0 0 1 0 0 )
          ( 0 0 1 1 1 )
          ( 1 1 0 1 1 )
          ( 0 1 1 0 1 )
          ( 0 1 1 1 0 )

Exercise 6.4:
a) k = A1
b) m = ½ 1ᵀA1
c) N = A²
d) (1/6) Tr A³

Exercise 6.5:
a) A 3-regular graph has three ends of edges per node, and hence 3n ends total. But the total number of ends of edges is also equal to 2m, which is an even number. Hence n must be even.
b) A tree with n nodes has m = n − 1 edges. Hence the average degree is 2m/n = 2(n − 1)/n < 2.
c) The connectivity of A and C must be at least y, because if there are y paths from B to C and x > y paths from A to B, then there are at least y paths all the way from A to C. On the other hand the connectivity of A and C cannot be greater than y by the same argument: if there were more than y paths from A to C and more than y paths from B to A, then there would be more than y paths from B to C (via A). Hence the connectivity of A and C must be exactly y.

Exercise 6.6: Let the eigenvector element at the central node be x₁. By symmetry the elements at the peripheral nodes all have the same value. Let us denote this value x₂. Then the
eigenvalue equation looks like this:

    ( 0 1 1 1 ··· ) ( x₁ )      ( x₁ )
    ( 1 0 0 0 ··· ) ( x₂ )      ( x₂ )
    ( 1 0 0 0 ··· ) ( x₂ )  = λ ( x₂ ) ,
    ( 1 0 0 0 ··· ) ( x₂ )      ( x₂ )
    ( ⋮ ⋮ ⋮ ⋮  ⋱ ) (  ⋮ )      (  ⋮ )

where λ is the leading eigenvalue. This implies that (n − 1)x₂ = λx₁ and x₁ = λx₂. Eliminating x₁ and x₂ from these equations we find that λ = √(n − 1). The equation x₁ = λx₂ then implies that x₁ and x₂ have the same sign, which means that this must be the leading eigenvalue (by the Perron–Frobenius theorem—see the discussion on page 160 and the footnote on page 161).

Exercise 6.7:
a)
    Total ingoing edges = Σ_{i=1}^r k_i^in,
    Total outgoing edges = Σ_{i=1}^r k_i^out.

b) The number of edges running to nodes 1 . . . r from nodes r + 1 . . . n is equal to the total number of edges running to nodes 1 . . . r minus the number originating at nodes 1 . . . r. In other words, it is equal to the difference of the two expressions above:

    Number of edges = Σ_{i=1}^r (k_i^in − k_i^out).

Exercise 6.8: The total number of edges attached to nodes of type 1 is n₁c₁. The total number attached to nodes of type 2 is n₂c₂. But each edge is attached to one node of each type and hence these two numbers must be equal: n₁c₁ = n₂c₂.

Exercise 6.9: The network contains an expansion of UG, and hence is nonplanar by Kuratowski's theorem:

    [Figure: the network with an expansion of UG picked out.]

(The five-fold symmetric appearance of the network might lead one at first to hypothesize that it contains an expansion of K₅, but upon reflection we see that this is clearly impossible, since every node has degree 3, whereas every node in K₅ has degree 4. Thus if the network is to be nonplanar it must contain an expansion of UG.)

Exercise 6.10: The edge connectivity is two. To prove this we display two edge-independent paths and a cut set of size two thus:

    [Figure: two copies of the network with nodes A and B marked—one showing two edge-independent paths from A to B, the other a cut set of size two.]
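The matrix identities in Exercise 6.4 are easy to check numerically. A minimal sketch using numpy; the small example network here is my own choice, not one from the book:

```python
import numpy as np

# Adjacency matrix of a small undirected example network (an assumption,
# for illustration): a triangle on nodes 0, 1, 2 plus a pendant node 3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
one = np.ones(4)

k = A @ one                    # (a) degree vector k = A1
m = 0.5 * one @ A @ one        # (b) number of edges m = (1/2) 1^T A 1
N = A @ A                      # (c) N_ij = number of common neighbors of i and j
t = np.trace(A @ A @ A) / 6    # (d) number of triangles = (1/6) Tr A^3

assert k.tolist() == [2, 2, 3, 1]
assert m == 4       # 4 edges: the triangle's three plus the pendant edge
assert N[0, 1] == 1 # nodes 0 and 1 share one common neighbor (node 2)
assert t == 1       # exactly one triangle
```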
Let m be the number of paths from s to t of length ℓ_st. Then the leading terms in Z_st and the sum above are

    Z_st = Σ_r [(αA)^r]_st = m α^ℓ_st + O(α^(ℓ_st+1)),

    Σ_r r[(αA)^r]_st = m ℓ_st α^ℓ_st + O(α^(ℓ_st+1)).

Substituting into the previous result we then get

    ∂ log Z_st/∂ log α = [m ℓ_st α^ℓ_st + O(α^(ℓ_st+1))] / [m α^ℓ_st + O(α^(ℓ_st+1))] = ℓ_st + O(α).

Taking the limit α → 0 then gives the required result.

And the closeness is the reciprocal of this, or 4n/(n² − 1).

Exercise 7.3:
a) The equivalence is most easily demonstrated in the reverse direction. We write the series as x = Σ_{k=0}^∞ (αA)^k 1, then

    αAx + 1 = αA Σ_{k=0}^∞ (αA)^k 1 + 1 = Σ_{k=1}^∞ (αA)^k 1 + 1 = Σ_{k=0}^∞ (αA)^k 1 = x.
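The algebra in Exercise 7.3(a) can be confirmed numerically: the solution x of x = αAx + 1 matches the summed geometric series. A sketch with an arbitrary small matrix of my own choosing and α small enough for the series to converge:

```python
import numpy as np

# Small arbitrary adjacency matrix (an assumption, for illustration only).
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
alpha = 0.3          # must be below 1/(leading eigenvalue) for convergence
one = np.ones(3)

# Closed form: x solves (I - alpha*A) x = 1, i.e. x = alpha*A*x + 1.
x = np.linalg.solve(np.eye(3) - alpha * A, one)

# Truncated series  sum_k (alpha*A)^k 1  should approach the same vector.
series = sum(np.linalg.matrix_power(alpha * A, k) @ one for k in range(200))

assert np.allclose(x, alpha * A @ x + one)  # x satisfies the recurrence
assert np.allclose(x, series)               # and equals the series sum
```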
e) There are n² paths total and all of them start, end, or pass through the central node except for those that start and end at the same peripheral node, of which there are n − 1. Hence the betweenness of the central node is n² − (n − 1).

Exercise 7.10: There are three independent paths between every pair of nodes in a 3-component. A node in a 3-core, on the other hand, need only have edges connecting it to three other members of the 3-core, which is a weaker condition. This network, for example, is a single 3-core, but has two 3-components:

    [Figure: an example network.]

Exercise 7.11: One third of the edges are not reciprocated and two thirds are, so r = 2/3.

Exercise 7.12:
a) It is balanced, as one can show by exhaustively verifying that all loops contain an even number of minus signs.
b) All balanced graphs are clusterable and hence this one is too. Here are the clusters:

    [Figure: the network divided into its clusters.]

Exercise 7.13: The number of times the color changes as we go around a loop is equal to the number of minus signs. If this number is odd, then we change an odd number of times, meaning that we end up with the opposite color from the one we started with when we get back to the starting node. Thus the last edge around the loop will not be satisfied: either it is positive and joins unlike colors or it is negative and joins like ones. If all loops have an even number of minus signs, on the other hand, we never run into problems, and the entire network can be colored in this way. Then we simply divide the network into contiguous groups of like-colored nodes. By definition all edges within such a cluster are positive and all edges between different-colored clusters are negative. There are no edges between clusters of the same color, because if there were the clusters would be considered one large one, not two smaller ones. Hence the network is clusterable in the sense defined by Harary.

Exercise 7.14: Assume the network satisfies Davis's criterion of having no loops with exactly one negative edge. Performing the coloring as described and then adding back in the negative edges, we see that a negative edge will fall between two nodes of the same color if and only if those nodes are in the same component, meaning that they are connected by a path of positive edges. That path plus the newly added negative edge then form a loop with exactly one negative edge. But by hypothesis there are no such loops in the network and hence no negative edges can fall between nodes in the same component: they only fall between nodes in different components. Hence all edges between components are negative. Given that all edges within components are by definition positive (since this is how we constructed the components in the first place), the graph is therefore clusterable and the components of the positive-edge network are the clusters.

Exercise 7.15: This question is most simply answered in vector notation. Summing over j is equivalent to multiplying by the uniform vector 1 = (1, 1, 1, . . .), which gives:

    σ = (D − αA)⁻¹ 1 = [(I − αAD⁻¹)D]⁻¹ 1 = D⁻¹(I − αAD⁻¹)⁻¹ 1 = D⁻¹x,

where x = (I − αAD⁻¹)⁻¹ 1 is the vector of PageRank scores (see Eq. (7.11)). Noting that D⁻¹ is the diagonal matrix with the reciprocals of the degrees along its diagonal, this then completes the proof.

Exercise 7.16:
a) The numbers e_r are simple—they are just the diagonal entries in the table. To get the a_r we need to add the fraction of couples in which the woman is in group r and the fraction in which the man is in group r, then divide by two (since a_r is defined as the fraction of ends of edges in group r and the edge corresponding to each couple has two ends). Thus we have a₁ = (0.323 + 0.289)/2 = 0.306, and similarly a₂ = 0.226, a₃ = 0.400, and a₄ = 0.068. Then the modularity is

    Q = 0.258 + 0.157 + 0.306 + 0.016 − 0.306² − 0.226² − 0.400² − 0.068² = 0.428.

b) Applying the same approach to the second set of data gives a₁ = 0.345, a₂ = 0.250, and a₃ = 0.395, and the modularity with respect to political alignment is

    Q = 0.25 + 0.15 + 0.30 − 0.345² − 0.250² − 0.395² = 0.362.

c) Both of these modularity values are quite high as such things go—values above Q = 0.3 are often considered significant—so there seems to be substantial homophily in these populations, meaning that couples tend to have similar ethnicity and similar political views significantly more often than one would expect by random chance.
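The arithmetic in Exercise 7.16(a) takes only a few lines to verify; the e_rr and a_r values below are the ones quoted in the solution:

```python
# Modularity Q = sum_r e_rr - sum_r a_r^2, using the values from part (a).
e_diag = [0.258, 0.157, 0.306, 0.016]   # diagonal entries of the mixing matrix
a = [0.306, 0.226, 0.400, 0.068]        # fractions of edge ends in each group

Q = sum(e_diag) - sum(ar ** 2 for ar in a)
print(round(Q, 3))   # → 0.428
```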
8 The large-scale structure of networks

Exercise 8.1:
a) If you double the area of the carpet it will take about twice as long to vacuum, so the complexity is O(n).
b) One finds words in a dictionary, roughly speaking, by binary search—open the book at a random point, decide whether the word you want is backward or forward from where you are, open the book at another random point in that direction, and repeat. Each time you do this you decrease the distance to the desired word by, on average, a factor of two. When the distance gets down to one word, you have found the word you want. The number k of factors of two needed to do this is given by 2^k = n, so k = log₂ n and the complexity is O(log n).

Exercise 8.2:
a) Perform a breadth-first search starting from the given node to find the distance to all other nodes, then average those distances and take the reciprocal to get the closeness centrality. The breadth-first search takes time O(m + n) and the average takes time O(n), so the overall running time is O(m + n).
b) Use Dijkstra's algorithm. If implemented using a binary heap, the time complexity would be O((m + n) log n).
c) Use repeated breadth-first searches. Start at node 1 and perform a breadth-first search to find all the nodes in the component it belongs to. Then find the next node that is not in that component and start another breadth-first search from there to find all the nodes in its component. Repeat until there are no nodes left that are not in any of the previously discovered components. Each breadth-first search takes time O(n_c + m_c), where n_c and m_c denote the numbers of nodes and edges in the component. Summing over all components, the total running time is O(n + m).
d) You could use a truncated version of the augmenting path algorithm, in which you repeatedly find independent paths, but stop when you have found two—there is no need to keep going beyond this point if your aim is only to find two paths.

Exercise 8.3:
a) Multiplying an n × n matrix into an n-element vector involves n² multiplies and n² additions. So the running time for the complete calculation is O(n²).
b) Set up an n-element array to represent the matrix, which takes time O(n), then add to it one term for each non-zero element in the adjacency matrix, of which there are 2m. So the total running time is O(m + n).
c) At first sight this calculation is going to take time O(n²) as in part (a) above. But we can do it faster by noting that the modularity matrix can be written in matrix notation as B = A − kkᵀ/2m, where k is the n-element vector with the degrees k_i as its elements. When we multiply an arbitrary vector v by this matrix we get

    Bv = Av − (kᵀv/2m) k.

The first term can be evaluated in time O(m + n) as in part (b). The inner product kᵀv in the second term takes time O(n) to evaluate, then we simply multiply it by k/2m, which takes a further time O(n). Thus the total time for the computation is O(m + n).

Exercise 8.4:
a) O(n)
b) On each round of the algorithm we take O(n) time to find the highest-degree node, then time O(m/n) to remove it (see Table 8.2 on page 228), for a total of O(n + m/n) time per round. There are n rounds, so total running time is O(n² + m). (Normally m is less than n², so to leading order it can be ignored and the running time is O(n²).)
c) If we use a heap we can find the highest-degree node in time O(1) and remove it from the heap in time O(log n) and from the network in time O(m/n), for a total time of O(m/n + log n) per round of the algorithm, to leading order. Over n rounds the whole calculation then takes time O(m + n log n).
d) We place all the numbers in the heap, which takes O(log n) time per number or O(n log n) for all of them, then repeatedly find and remove the largest one. Finding the largest one takes time O(1) and removing it takes time O(log n), and hence the total running time for sorting n numbers is O(n log n) to leading order.
e) Make a histogram of the degrees as follows. First, create an array of n integers, initially all equal to zero, which takes time O(n). This array represents the bins in our histogram. Then go through the degrees one by one and for each degree k increase the count in the kth bin by one. This also takes time O(n), and at the end of the process the kth bin will contain the number of degrees equal to k. Now print out the contents of the histogram in order from largest degrees to smallest, going through each bin in turn and printing out separately each of the node degrees it contains. For instance if the k = 5 bin contains three nodes, print out a 5 three times. This too takes time O(n). The end result will be a printed list of the degrees in decreasing order, which takes time O(n) to generate. This algorithm is (a version of) radix sort.

Exercise 8.5:
a) For a network stored in adjacency list format, one normally has the degrees stored separately as well, in which case calculating their mean is simply a matter of summing them and then dividing by n, which takes O(n) time.
b) The simplest way to calculate the median is to sort the list of degrees in either increasing or decreasing order and then find the one in the middle of the list. As discussed
in Exercise 8.4, this sorting takes either O(n log n) time or O(n) time, depending on the algorithm used, and hence finding the median takes the same time.
c) One would use Dijkstra's algorithm, which has complexity O((m + n) log n).
d) This requires us to calculate the minimum node cut set, which we can do using the augmenting path algorithm. The running time is O((m + n)m/n), as shown in Section 8.7.2.

Exercise 8.6:
a) If we know the true distance to every node at distance d or less then any node without an assigned distance must have distance d + 1 or greater. If such a node is adjacent to one with distance d, however, then there is at least one path to it of length d + 1, via its distance-d neighbor. Hence its distance must be exactly d + 1.
b) There must be at least one path of length d + 1 to the node in question (let us call it node v). The distance along that path to the penultimate node u (which is a neighbor of v) is then d, meaning that u has distance no greater than d. But u can also have distance no less than d, since if it did then there would be a path of length less than d + 1 to v and hence its distance would not be d + 1. Thus u, which is a neighbor of v, must have distance d.

Exercise 8.7:
a) To find the diameter of a network we must perform a breadth-first search starting from each node, then take the largest distance found in any such search. Each breadth-first search takes time O(m + n) and there are n searches, so the total running time is O(n(m + n)).
b) Listing the neighbors of a node i is simply a matter of running along the appropriate row of the adjacency list, which takes time of order the average length of the row, which is ⟨k⟩. To list the second neighbors, however, we need to list separately all the k_j neighbors of each of the first neighbors j. But the number of ends of edges that attach to node j is, by definition, k_j, and hence we are k_j times more likely to be attached to node j than to a node with degree 1. Taking the appropriate weighted average, the expected number of neighbors of a neighbor is

    Σ_j k_j × k_j / Σ_j k_j = ⟨k²⟩/⟨k⟩,

and the average total number of second neighbors is ⟨k⟩ times this, or ⟨k²⟩, which is also the amount of time it will take to list the second neighbors.

Exercise 8.8: To calculate the reciprocity you have to go through each edge in the network in turn, of which there are m, and ask whether any of the edges outgoing from the node at the tail end of the edge connect back to the node at the head. To check through all the outgoing edges takes a time O(m/n) on average, so the whole calculation takes O(m²/n).

If the degrees were correlated you could have a problem, because more edges could end at nodes with high out-degree than you would expect on average. That would increase the amount of work you needed to do to check all the outgoing edges. Imagine, for instance, the extreme case of a star-like network in which half of all edges pointed inwards to a single hub node and the other half pointed outwards. Then for the m/2 ingoing edges you would need to check the destinations of m/2 outgoing edges for reciprocity, and the whole calculation would take time O(m²), which is much larger than O(m²/n).

Exercise 8.9:
a) x_i = Σ_j α^(d_ij).
b) Use breadth-first search to calculate and store all the distances, then run through all nodes performing the sum above.
c) The time complexity for each node is O(m + n) and O(n(m + n)) for all nodes.

9 Network statistics and measurement error

Exercise 9.1:
a) Confusions arising because authors have the same name. Confusions arising because the same author may give their name differently on different papers. Missing papers.
b) Some pages are not reachable from the starting point. Dynamically generated pages might have to be excluded to avoid getting into infinite loops or trees.
c) Laboratory experimental error of many kinds. Missing pathways.
d) Subjectivity on the part of participants. Missing participants. Inaccurate specification of what a "friend" is.
e) Out-of-date or inaccurate maps.

Exercise 9.2:
a) The likelihood is

    L = Π_{i=1}^n μ e^(−μx_i).

b) The log-likelihood is log L = n log μ − μ Σ_{i=1}^n x_i, and, differentiating with respect to μ and setting the result to zero, we get

    1/μ = (1/n) Σ_{i=1}^n x_i.

Exercise 9.3: After the algorithm converges, the parameter values are α = 0.598, β = 0.202, and ρ = 0.415, and the probabilities of edges existing between each pair of nodes are as follows:
    [Figure: the network with the inferred probability of each edge marked; clearly legible values include 0.022 and 0.119.]

Exercise 9.4:
a) We are certain about every pair that is connected by an edge in any of these paths: they definitely have an edge. We are also certain about the non-existence of any edge that would shorten a path. For instance, if there were an edge between node 5 and node 7, then the shortest path to node 7 would go along that edge. Since it does not, we know that the edge does not exist. For this reason we can be sure there is no edge between the pairs (1,3), (1,4), (1,6), (1,7), (1,8), (2,7), and (5,7). The remaining pairs are all uncertain: they could have an edge or not.
b) (i) They are definitely connected if they have an edge in any of the paths. (ii) They are definitely not connected if adding an edge would create a shorter path to any node. (iii) All other edges are uncertain.

Exercise 9.5:
a) Dropping the parameters α, β, and ρ from our notation for the sake of brevity, we have

    P(O_ij = 1) = P(O_ij = 1, A_ij = 1) + P(O_ij = 1, A_ij = 0)
                = P(O_ij = 1|A_ij = 1) P(A_ij = 1) + P(O_ij = 1|A_ij = 0) P(A_ij = 0).

b) We have:

    P(O_ij = 1|A_ij = 1) = α,
    P(O_ij = 1|A_ij = 0) = β,
    P(A_ij = 1) = ρ,
    P(A_ij = 0) = 1 − ρ.

Substituting all of these into the expressions above and in the question gives the required answer.
c) For the reality mining example we have α = 0.4242, β = 0.0043, and ρ = 0.0335, which gives a false discovery rate of 0.226. In other words, more than one in five observed edges is actually wrong.
d) The false discovery rate is relatively large because the observations are unreliable: in the language of the reality mining study, many pairs of people who are observed in proximity do not actually have a connection in the network. Even though this is true, however, the false positive rate is still small because most people who do not have a connection are never observed in proximity. This is just because the graph is sparse: most pairs of people are never observed in proximity at all. To put that another way, the pairs observed in proximity can only be a tiny fraction of all node pairs, so the value of β would be small.

Exercise 9.6: If observations of edges are reliable, then there are no false positives: when we see an edge, it's really there. So β = 0 in this case. Substituting β = 0 into Eq. (9.29), we get

    Q_ij = ρ(1 − α)^N / [ρ(1 − α)^N + 1 − ρ]

if E_ij = 0 and Q_ij = 1 otherwise. Substituting these values into Eq. (9.30) then gives

    α = Σ_{i<j} E_ij / (N Σ_{i<j} Q_ij),

with ρ given by the same expression as previously and β = 0.

10 The structure of real-world networks

Exercise 10.1:
a) Every node is connected by an edge to every other, so the shortest path between any two distinct nodes has length 1, and hence the diameter is also 1.
b) The furthest points on the lattice are opposite corners. To reach one corner from the other you have to go L steps along and L steps down for a total of 2L steps, so the diameter is 2L. The equivalent result for a d-dimensional hypercubic lattice is dL. The number of nodes on the hypercubic lattice is n = (L + 1)^d, which implies that L = n^(1/d) − 1. Thus the diameter as a function of n is d(n^(1/d) − 1).
c) On the first step we reach k nodes. On each of the subsequent d − 1 steps the tree branches by a factor of k − 1, so the number of nodes is multiplied by k − 1. After d steps, therefore, we reach k(k − 1)^(d−1) nodes. The total number n_d of nodes reachable in d steps or less is then given by the sum of this quantity over distances 1 to d, plus 1 for the central node:

    n_d = 1 + Σ_{m=1}^d k(k − 1)^(m−1) = 1 + k Σ_{m=0}^{d−1} (k − 1)^m = 1 + [k/(k − 2)][(k − 1)^d − 1].

When this number is equal to n we have reached the whole network, and the diameter is the side-to-side distance in the network, which is twice the corresponding value of d. Setting the above expression equal to n and rearranging for d we get

    diameter = 2 log[1 + (k − 2)(n − 1)/k] / log(k − 1).

d) The diameter of networks (a) and (c) grows logarithmically or slower with n and hence these networks show the small-world effect. Network (b) does not, although one could argue that it does in the limit of large d.
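The node count and the diameter formula in Exercise 10.1(c) can be sanity-checked numerically; a sketch for k = 3 (the choice of k is mine):

```python
import math

k = 3
for d in range(1, 8):
    # Direct count: the central node plus k(k-1)^(m-1) nodes at each distance m.
    n = 1 + sum(k * (k - 1) ** (m - 1) for m in range(1, d + 1))
    # Closed form derived in the solution: n_d = 1 + k[(k-1)^d - 1]/(k-2).
    assert n == 1 + k * ((k - 1) ** d - 1) // (k - 2)
    # The diameter formula should recover exactly 2d when n = n_d.
    diameter = 2 * math.log(1 + (k - 2) * (n - 1) / k) / math.log(k - 1)
    assert abs(diameter - 2 * d) < 1e-9
```

The last assertion works because 1 + (k − 2)(n_d − 1)/k simplifies to exactly (k − 1)^d, so the logarithms cancel to give d.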
```python
# Pull a node off the queue
i = q[pout]
pout += 1

# Check its neighbors
for j in edge[i]:
    if d[j] == 0:
        d[j] = c
        q[pin] = j
        pin += 1
```

Exercise 11.2:
a) 1 − S is the probability, averaged over all nodes, that a node does not belong to the giant component. For a node specifically of degree k to not belong to the giant component all of its k neighbors must not belong to the giant component, which happens with probability (1 − S)^k.
b) By Bayes' rule, the probability P(k|GC) of a node having degree k given that it is not in the giant component is related to the probability P(GC|k) that it is not in the giant component given that it has degree k thus:
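The breadth-first search code fragment in this section can be completed into a runnable function along the following lines. This is a sketch, not the book's full listing: the array names (q, pout, pin, d, edge) follow the fragment, and the initialization shown (including storing distance + 1 so that 0 can mean "unseen") is an assumption of mine:

```python
def bfs_distances(edge, s):
    """Distances from node s in a network given as an adjacency list
    `edge`, where edge[i] is the list of neighbors of node i."""
    n = len(edge)
    d = [0] * n          # d[j] = 0 marks "not yet reached", as in the fragment
    q = [0] * n          # the queue, stored in a fixed-size array
    pin = pout = 0       # write and read pointers into the queue

    q[pin] = s
    pin += 1
    d[s] = 1             # store distance + 1 so that 0 can mean "unseen"

    while pout < pin:
        # Pull a node off the queue
        i = q[pout]
        pout += 1
        c = d[i] + 1     # label for i's undiscovered neighbors

        # Check its neighbors
        for j in edge[i]:
            if d[j] == 0:
                d[j] = c
                q[pin] = j
                pin += 1

    # Undo the +1 offset; None marks unreachable nodes.
    return [x - 1 if x else None for x in d]
```

For instance, on a path of three nodes, bfs_distances([[1], [0, 2], [1]], 0) returns [0, 1, 2].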
edge to that node times the probability S that the node is itself in the giant component, for a total probability of pS. Then the probability of not being connected via that node is 1 − pS, and the probability of not being connected via any other node is (1 − pS)^(n−1). Putting p = c/(n − 1) and taking the limit of large n then gives the required result.
b) There are n − 1 ways to choose the one node, and probability pS that we have a connection to that node and that the node is itself in the giant component. We must not be connected via any of the other nodes, which happens with probability e^(−cS) as in part (a), so the total probability is (n − 1)pS e^(−cS) = cS e^(−cS).
c) The probability of not being in the giant bicomponent is the probability of having exactly zero or one connections to it, which gives 1 − T = e^(−cS) + cS e^(−cS), which in turn gives the required result. (Interesting question: Why does having two connections to the giant component mean that you are in the giant bicomponent?)
d) The probability S satisfies S = 1 − e^(−cS). We can use this result to eliminate cS in favor of −ln(1 − S) and hence show that T = S + (1 − S) ln(1 − S). Since the final term (1 − S) ln(1 − S) is negative this makes T < S, except when S is either 0 or 1, in which case the final term vanishes and T = S.

Exercise 11.6:
a) All edges are independent in a random graph, which is still true after we remove the giant component. Moreover, while edge probabilities in the small components are lower than edge probabilities in the network as a whole, the nodes are indistinguishable, so all edge probabilities are the same. Hence the small components form a random graph, but one with no giant component, meaning that the average degree is less than 1.
b) Part (a) assumes that if there is no giant component then the average degree is less than 1, but this doesn't necessarily imply that if the average degree is greater than 1 there must be a giant component. That the latter is true we can however see by setting (1 − S)c < 1 and rearranging to get S > 1 − 1/c, which implies that S > 0 if c > 1.

Exercise 11.7:
a) There are n(n − 1) ordered pairs of nodes and a fraction p of them are joined by directed edges, so there are m = n(n − 1)p edges on average. Then the average degree is c = m/n = (n − 1)p.
b) If we keep all m edges but discard their directions then the average (undirected) degree is 2m/n = 2(n − 1)p = 2c, where c in this expression is the directed degree from part (a). Setting the average degree to 2c in the standard equation for the giant component of an undirected random graph then gives the required result.
c) The probability of being joined to the giant strongly connected component via an outgoing edge to one particular node is equal to the probability p that there is an outgoing edge to that node times the probability S that the node is in the giant strongly connected component, for a total probability of pS. And the probability of not being joined to the giant strongly connected component via that node is one minus this, or 1 − pS. Then the total probability of not being joined to the giant strongly connected component via an outgoing edge to any of the n − 1 other nodes in the network is

    (1 − pS)^(n−1) = (1 − cS/(n − 1))^(n−1) ≃ e^(−cS),

where we have made use of p = c/(n − 1) from part (a), and the last equality becomes exact in the limit of large n. The probability of being joined to the giant strongly connected component via at least one outgoing edge is one minus this value, or 1 − e^(−cS) in the large-n limit. Similarly the probability of being joined to the giant strongly connected component via at least one ingoing edge is also 1 − e^(−cS). And the probability of having both of these conditions satisfied at once, which is also equal to the probability S of being in the giant strongly connected component, is S = (1 − e^(−cS))².

Exercise 11.8:
a) There are n − i nodes with labels higher than i and probability p of an edge from each one, so the expected in-degree is (n − i)p. There are i − 1 nodes with labels lower than i so the expected out-degree is similarly (i − 1)p.
b) There are i nodes with labels i and lower and n − i with labels higher than i. Hence there are i(n − i) possible pairs and each is connected with probability p for an expected number of edges i(n − i)p.
c) The largest value occurs when i = ½n. The smallest values occur when i = 1 or i = n − 1, which both give (n − 1)p. (Or you could say the smallest values occur for i = 0 and i = n, which give zero.)

Exercise 11.9:
a) For each node there are (n−1 choose 2) pairs of others with which it could form a triangle, and each triangle is present with probability c/(n−1 choose 2), for an average of c triangles per node. Each triangle contributes two edges to the degree, so the average degree is 2c.
b) The probability p_t of having t triangles follows the binomial distribution

    p_t = ((n−1 choose 2) choose t) p^t (1 − p)^((n−1 choose 2) − t) ≃ e^(−c) c^t/t!,

where the final equality is exact in the limit of large n. The degree is twice the number of triangles and hence t = k/2 and

    p_k = e^(−c) c^(k/2)/(k/2)!

so long as k is even. Odd values of k cannot occur, so p_k = 0 for k odd.
c) As shown above, there are on average c triangles around each
   node, and hence nc is the total number of triangles in the
   network times three (since each triangle appears around three
   different nodes and gets counted three times). The number of
   connected triples around a node of degree k = 2t is

      C(2t, 2) = t(2t − 1),

   and there are np_t nodes with t triangles, with p_t as above.
   So the total number of connected triples is

      n e^(−c) Σ_{t=0}^∞ [c^t/t!] t(2t − 1) = n e^(−c) (2c² + c) e^c
                                            = nc(2c + 1).

   (The sum is a standard one that can be found in tables, but
   it's also reasonably straightforward to do by hand if you know
   the right tricks.) From Eq. (7.28) the clustering coefficient
   is now equal to

      C = (number of triangles) × 3 / (number of connected triples)
        = nc / [nc(2c + 1)] = 1/(2c + 1).

d) Let u be the probability that a node is not in the giant
   component. If a node is not in the giant component then it must
   be that for each of the C(n−1, 2) distinct pairs of other nodes
   in the network either (a) that pair does not form a triangle
   with our node (probability 1 − p) or (b) the pair does form a
   triangle (probability p) but neither member of the pair is
   itself in the giant component (probability u²). Thus the analog
   of Eq. (11.12) for this model is

      u = (1 − p + pu²)^C(n−1, 2).

   Putting p = c/C(n−1, 2) and taking the limit of large n this
   becomes u = e^(−c(1−u²)). Putting S = 1 − u we then find that
   S = 1 − e^(−cS(2−S)).

e) Rearranging for c in terms of S we have

      c = −ln(1 − S) / [S(2 − S)],

   and for S = 1/2 this gives c = (4/3) ln 2. Substituting into
   the expression for the clustering coefficient above then gives

      C = 1/(1 + (8/3) ln 2) = 0.351…

12 The configuration model

Exercise 12.1: Imagine matching pairs of stubs one by one. The
number of ways of choosing the first pair from the 2m total stubs
is C(2m, 2) = (1/2) 2m(2m − 1). The number of ways of choosing the
second pair from the remaining 2m − 2 stubs is C(2m−2, 2) =
(1/2)(2m − 2)(2m − 3), and so forth. The number of ways of
choosing all m pairs is thus

   [2m(2m−1)/2] × [(2m−2)(2m−3)/2] × … × [2×1/2] = (2m)!/2^m.

However, there are m! ways of choosing the same set of pairs in
different orders, each of which gives rise to the same overall
matching, so the total number of matchings is the above number
divided by m!.

Note that the number of matchings is not the same as the number of
networks: if we are only concerned with the topology and not with
which particular stub matches with which, then the count is
different. See the following exercise for a discussion of this
issue.

Exercise 12.2: As discussed in the question, there is in general
an overall factor of ∏_i k_i! that comes from the permutation of
the stubs at each node. However, if there is a multiedge between
two nodes, then this factor gets modified. Consider, for instance,
two nodes i, j with a double edge between them such that stub A on
one node is connected to stub A on the other and stub B is
connected to stub B. Now if we permute the two stubs on one of the
nodes we will have A connected to B and B connected to A, but if
we permute the stubs on both nodes simultaneously we still have A
connected to A and B to B; the matching hasn't changed. Indeed, we
can see that for a multiedge with multiplicity A_ij any
permutation of the stubs at one end has no effect on the matching
if we perform the same permutation at the other end. There are
A_ij! permutations of the multiedge, which gives the factor of
∏_{i<j} A_ij! in the question.

For self-edges the argument is similar. If we have (1/2)A_ii
self-edges at node i, then any permutation of stubs at one end of
each edge has no effect on the matching if we permute those at the
other ends in the same way, for a factor of ((1/2)A_ii)!. But in
addition, we can also swap the two stubs at opposite ends of the
same self-edge and it will have no effect on the matching, which
gives us a factor of 2^(A_ii/2). So overall, we have

   2^(A_ii/2) ((1/2)A_ii)! = 2^(A_ii/2) (1 × 2 × … × (1/2)A_ii)
                           = 2 × 4 × … × A_ii = A_ii!!

Putting everything together, we get the expression in the
question.

Exercise 12.3:

a) To average over every edge, we simply sum over all pairs i, j
   for which A_ij = 1, giving

      ⟨x⟩_edge = Σ_ij A_ij x_i / Σ_ij A_ij = (1/2m) Σ_i k_i x_i,

   where we have made use of Eqs. (6.12) and (6.13).
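The matching count in Exercise 12.1 can be sanity-checked numerically: (2m)!/(2^m m!) should equal the double factorial (2m−1)!! obtained by pairing the stubs one at a time. A small sketch:

```python
from math import factorial

def matchings(m):
    """Number of distinct matchings of 2m stubs: (2m)! / (2^m m!)."""
    return factorial(2*m) // (2**m * factorial(m))

def double_factorial(n):
    """n!! = n(n-2)(n-4)... down to 1 (odd n here)."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

# Pairing stubs one at a time gives (2m-1)(2m-3)...1 = (2m-1)!!
for m in range(1, 8):
    print(m, matchings(m), double_factorial(2*m - 1))
```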
b) The difference is given by

      ⟨x⟩_edge − ⟨x⟩ = (1/2m) Σ_i k_i x_i − (1/n) Σ_i x_i

                    = (n/2m) [ (1/n) Σ_i k_i x_i
                               − (1/n) Σ_i k_i × (1/n) Σ_i x_i ]

                    = (⟨kx⟩ − ⟨k⟩⟨x⟩)/(2m/n) = cov(k, x)/⟨k⟩.

Exercise 12.4:

a) The degree distribution is …

… that we don't is one minus this. So the probability that we loop
back on precisely the sth step is

   π_s = [1/(2(n−s)+1)] ∏_{r=1}^{s−1} [1 − 1/(2(n−r)+1)].

Taking logs of both sides we get

   ln π_s = −ln[2(n−s)+1] + Σ_{r=1}^{s−1} ln[1 − 1/(2(n−r)+1)],
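The identity in part (b) of Exercise 12.3 is easy to verify on a small example; here is a sketch using a random graph and arbitrary node values (all parameter choices are illustrative):

```python
import random

random.seed(1)
n = 50

# Build a small random undirected graph as an adjacency matrix
A = [[0]*n for _ in range(n)]
for i in range(n):
    for j in range(i+1, n):
        if random.random() < 0.1:
            A[i][j] = A[j][i] = 1

k = [sum(row) for row in A]               # degrees
two_m = sum(k)
x = [random.random() for _ in range(n)]   # arbitrary node quantity

# Edge-end average of x versus the plain average of x
x_edge = sum(k[i]*x[i] for i in range(n)) / two_m
x_mean = sum(x)/n

# cov(k, x)/<k>  with  cov(k, x) = <kx> - <k><x>
k_mean = sum(k)/n
kx_mean = sum(k[i]*x[i] for i in range(n))/n
diff = (kx_mean - k_mean*x_mean)/k_mean

print(x_edge - x_mean, diff)  # the two should agree
```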
Exercise 12.6: A common issue for those attempting this exercise
is incorrect code for generating the network. Recall that
generating a configuration model network involves choosing pairs
of stubs at random and connecting them to form edges. A common
error is instead to choose pairs of nodes at random and then
connect them if they have any available stubs. This is not the
same thing and will not give the correct answer.

Here is an example program in Python to perform the required
calculation:

for i in range(n) ])
shuffle(stub)
for e in range(0,len(stub),2):
    i,j = stub[e:e+2]
    edge[i].append(j)
    edge[j].append(i)

# Perform repeated breadth-first searches to
# find components
q = empty(n,int)
d = zeros(n,int)   # Component labels
c = 0              # Components found so far

…

    # Check its neighbors
    for j in edge[i]:
        if d[j]==0:
            d[j] = c
            q[pin] = j
            pin += 1

    # Check if this is the largest component
    if pin>maxs: maxs = pin

[Figure: size of the largest component, rising to about 60,000
nodes, as a function of p_1 from 0 to 1.]
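For reference, the stub-matching step described above can be written out in full. The following is a minimal self-contained sketch (the degree sequence and network size are illustrative choices, and self-edges and multiedges are simply allowed, as in the standard configuration model):

```python
from random import shuffle, seed

seed(0)
n = 10
degree = [3, 2, 2, 2, 1, 1, 1, 2, 2, 2]   # illustrative degree sequence
assert sum(degree) % 2 == 0               # total degree must be even

# Make the stub list: node i appears degree[i] times
stub = []
for i in range(n):
    stub.extend([i] * degree[i])

# Shuffle the stubs and pair them off to form the edges
shuffle(stub)
edge = [[] for _ in range(n)]
for e in range(0, len(stub), 2):
    i, j = stub[e], stub[e + 1]
    edge[i].append(j)
    edge[j].append(i)

# Every node ends up with exactly its prescribed degree
print([len(edge[i]) for i in range(n)])
```

Because every stub is used exactly once, the realized degrees always match the prescribed sequence, which is the property the node-pair-choosing error destroys.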
b) The moments are

      ⟨k⟩ = [d/dz (pz + 1 − p)^n]_{z=1}
          = [np(pz + 1 − p)^(n−1)]_{z=1} = np,

      ⟨k²⟩ = [(z d/dz)² (pz + 1 − p)^n]_{z=1}
           = [npz(pz + 1 − p)^(n−1)
              + n(n−1)p²z²(pz + 1 − p)^(n−2)]_{z=1}
           = pn + p²n(n − 1).

As z is increased from zero this expression will at some point
diverge. The divergence happens when the denominator reaches zero,
i.e., when z² + z − 1 = 0, which has the solution
z = (1/2)(√5 − 1) = 0.618…

b) The generating function is

      h(z) = Σ_{k=1}^∞ a_k z^k
           = z + Σ_{k=2}^∞ Σ_{j=1}^{k−1} a_j a_{k−j} z^k
           = z + Σ_{j=1}^∞ Σ_{k=j+1}^∞ a_j a_{k−j} z^k,
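The two moments can be checked directly against the binomial distribution; a quick sketch with illustrative n and p:

```python
from math import comb

n, p = 20, 0.3

# Binomial distribution over k = 0, ..., n
pk = [comb(n, k) * p**k * (1-p)**(n-k) for k in range(n+1)]

mean   = sum(k * pk[k]   for k in range(n+1))
second = sum(k*k * pk[k] for k in range(n+1))

print(mean, n*p)                     # <k>   = np
print(second, n*p + p*p*n*(n-1))     # <k^2> = pn + p^2 n(n-1)
```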
where we have again used u = g_1(u). After this, the rest is just
algebra.

b) For a Poisson distribution we have g_0(z) = g_1(z) = e^(c(z−1))
   and when there is no giant component we have u = 1. Thus:

      g_0(u) = g_1(u) = 1,    g_0′(u) = g_1′(u) = c,
      g_0″(u) = g_1″(u) = c²,

   which means …

a) We have

      Σ_{k=0}^∞ p_k = (1/2) Σ_{k=0}^∞ (1/2)^k
                    = (1/2) × 1/(1 − 1/2) = 1.

b) The generating functions are

      g_0(z) = 1/(2 − z),    g_1(z) = 1/(2 − z)².

c) We find

      c_1 = g_0′(1) = 1,    c_2/c_1 = g_1′(1) = 2  ⇒  c_2 = 2.

d) Yes, because c_2/c_1 > 1 (see Section 12.6).

e) Following Eq. (12.133) this probability is

      π_3 = (c_1/2!) [d/dz [g_1(z)]³]_{z=0}
          = (1/2) c_1 [6(2 − z)^(−7)]_{z=0} = 3/128.

Exercise 12.11:

a) For this degree distribution Eq. (12.30) takes the form

      u = (1 − a)²/(1 − au)².

   Rearranging we then retrieve the given cubic.

b) The cubic can be factorized in the form

      (u − 1)[a²u² − a(2 − a)u + (1 − a)²] = 0,

   and hence when u ≠ 1 we have
   a²u² − a(2 − a)u + (1 − a)² = 0.

c) The quadratic equation has solutions

      u = (1/2a)[2 − a ± √((2 − a)² − 4(1 − a)²)]
        = (1/a)[1 − (1/2)a ± √(a − (3/4)a²)].

   However, if we choose the plus sign we get u > 1, which is not
   allowed since u is a probability, so we must have

      u = (1/a)[1 − (1/2)a − √(a − (3/4)a²)].

d) To get S > 0 we need 3/2 > √(a⁻¹ − 3/4), or 9/4 > a⁻¹ − 3/4.
   Rearranging for a then gives the required result.

Exercise 12.12:

a) A node of degree k does not belong to the giant component if
   none of its neighbors do, which happens with probability u^k.
   Hence the probability that the node does belong to the giant
   component, given that it has degree k, is

      P(GC|k) = 1 − u^k.

b) Applying Bayes' rule:

      P(k|GC) = P(k)P(GC|k)/P(GC) = p_k(1 − u^k)/S,

   where S is, as usual, the fraction of nodes in the giant
   component, which is given by Eq. (12.27).

c) The average degree in the giant component is

      Σ_k k P(k|GC) = (1/S) Σ_k k p_k (1 − u^k)
                    = [Σ_k k p_k − cu Σ_k q_k u^k]/S
                    = [c − cu g_1(u)]/S = c(1 − u²)/S,

   where we have made use of the definition of the excess degree
   distribution q_k = (k + 1)p_{k+1}/c and Σ_k q_k u^k = g_1(u)
   = u.

d) The inequality is trivially true because, given that u lies
   between zero and one (because it is a probability), every term
   in the sum is non-positive, and hence so is the entire sum.
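The values c_1 = 1 and c_2 = 2 found above for the distribution p_k = (1/2)^(k+1) can be confirmed by brute-force summation; a short sketch:

```python
# p_k = (1/2)^(k+1), k = 0, 1, 2, ...
K = 200  # truncation; the geometric tail is negligible
pk = [0.5**(k+1) for k in range(K)]

c1 = sum(k*p for k, p in enumerate(pk))       # mean degree

# Excess degree distribution q_k = (k+1) p_{k+1} / c1
qk = [(k+1)*pk[k+1]/c1 for k in range(K-1)]
c2 = c1 * sum(k*q for k, q in enumerate(qk))  # mean number of second neighbors

print(c1, c2)  # should be 1 and 2
```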
Exercise 12.17:

a) Using the "pure" power-law form of Eq. (12.57), the fundamental
   generating functions are given in (12.63) and (12.64) to be

      g_0(u) = (1/ζ(α)) Σ_{k=1}^∞ k^(−α) u^k

   and

      g_1(u) = (1/ζ(α − 1)) Σ_{k=1}^∞ k^(−α+1) u^(k−1).

   The probability u then satisfies

      u = (1/(uζ(α − 1))) Σ_{k=1}^∞ k^(−(α−1)) u^k
        = (1/ζ(α − 1)) Σ_{k=0}^∞ (k + 1)^(−(α−1)) u^k,

   and the fraction of nodes in the giant component, i.e., nodes
   that are functional, is S = 1 − g_0(u).

b) A rough numerical solution for u from the equation above gives
   u = 0.470. Then S = 0.614. So the model suggests that about 61%
   of the Internet should be working at any one time. Actually
   about 97% of the Internet is working at a time. Why the
   difference? Probably because people work very hard to be
   connected to the giant component; the Internet is not much use
   unless you are connected. So the giant component is likely to
   have size close to S = 1.

a) Consider the node you arrive at by following an edge (in its
   forward direction). If the in- and out-degrees are uncorrelated
   then the average number of edges leaving that node is simply
   equal to the average number leaving any node, which is c. If
   the number of reachable nodes is to grow on average the further
   you go, this number should be greater than 1. Hence a giant
   out-component exists if (and only if) c > 1.

b) The argument for the giant in-component is identical, except
   that you follow edges backwards.

c) If there is a giant in- or out-component then there is
   necessarily a giant weakly connected component, since the
   largest weakly connected component is trivially at least as
   large as the largest in- or out-component. The giant strongly
   connected component is only a little more tricky. As discussed
   in Section 6.12.1, a strongly connected component is equal to
   the intersection of the in- and out-components of any of its
   member nodes. If we have giant in- and out-components, and if
   all nodes are equally likely to belong to them (since the nodes
   in a random graph are indistinguishable), then they will have
   an intersection that fills a non-vanishing fraction of the
   network. This intersection is the giant strongly connected
   component.

d) Let u be the probability that when following a randomly chosen
   edge we reach only a vanishing fraction of the whole network,
   i.e., the probability that that edge does not lead to a node in
   the giant out-component. This is true if, from the node at that
   edge's end, none of the other nodes that are reachable are
   themselves in the giant out-component, which occurs with
   probability u^k, where k is the node's out-degree.

   The probability of landing at a node with in-degree j is
   P(j) = j p_j^in/⟨j⟩, where p_j^in is the in-degree
   distribution. And the probability of that node having
   out-degree k is

      Σ_j P(k|j)P(j) = (1/⟨j⟩) Σ_j j P(k|j) p_j^in
                     = (1/⟨j⟩) Σ_j j p_jk.

   Then the probability that the node we reach is not in the giant
   out-component is equal to u^k averaged over this distribution:

      u = (1/⟨j⟩) Σ_jk j p_jk u^k.

   This has a trivial solution at u = 1, but it can also have a
   non-trivial solution u < 1 if the slope of the right-hand side
   is greater than 1 at u = 1. Performing the derivative, this
   gives us the condition for the existence of a giant
   out-component: ⟨jk⟩ − ⟨j⟩ > 0. For uncorrelated degrees, where
   ⟨jk⟩ = ⟨j⟩⟨k⟩ = c², this gives c > 1 as in part (a). For
   correlated degrees, we can write the covariance of in- and
   out-degree as ρ = ⟨jk⟩ − ⟨j⟩⟨k⟩ so that ⟨jk⟩ = ρ + c², and
   hence the condition for the giant component is ρ + c² − c > 0.

e) Indeed we would, and of course it does.

Exercise 12.19:

a) In the ordinary configuration model the average excess degree
   is (⟨k²⟩ − ⟨k⟩)/⟨k⟩ (see Eq. (12.18)). In the bipartite
   version, there are two average excess degrees, one for nodes of
   type A and one for nodes of type B. For a giant component to
   exist, the product of these two should be greater than 1, so
   that the number of reachable nodes increases as we go further
   away from any starting point. Thus the condition for a giant
   component is

      [(⟨k²⟩_A − ⟨k⟩_A)/⟨k⟩_A] × [(⟨k²⟩_B − ⟨k⟩_B)/⟨k⟩_B] > 1,

   or equivalently

      ⟨k²⟩_A⟨k²⟩_B − ⟨k²⟩_A⟨k⟩_B − ⟨k²⟩_B⟨k⟩_A > 0.

b) A node of type A at the end of an edge is not in the giant
   component if all of its neighbors (which are of type B) are not
   in the giant component. If the node has excess degree k this
   happens with probability u_B^k. Averaging over the excess
   degree distribution then gives the required result. The formula
   for u_B is analogous.
Exercise 13.7:

a) The correctly normalized probability of a specific end of an
   edge attaching to node i is

      k_i / Σ_j k_j = k_i/2m,

   and two ends are added with each edge, so the total probability
   is twice this.

b) We lose a node of degree k when it gains an extra edge and gain
   one when a node of degree k − 1 gains an edge. Thus the master
   equation is

      np_k(m + 1) = np_k(m) + [(k−1)/m] np_{k−1}(m)
                    − [k/m] np_k(m),

   and a factor of n cancels throughout. For k = 0 the same
   argument applies except that there is no way to gain new nodes
   of degree zero. We can represent this situation using the same
   master equation but with the convention that p_{−1} = 0 for all
   m. Thus, with this convention, we have

      p_k(m + 1) = p_k(m) + [(k−1)/m] p_{k−1}(m) − [k/m] p_k(m),

   for all m.

c) Given that the average degree is c = 2m/n when there are m
   edges in the network, then when there are m + 1 edges the
   average degree is

      2(m + 1)/n = 2m/n + 2/n = c + 2/n.

   Putting m = (1/2)nc and writing p_k as a function of average
   degree, the master equation then becomes …

e) Substituting g(c, z) = f(c − c/z) into the left- and right-hand
   sides of the equation separately gives

      c ∂g/∂c = c(1 − 1/z) f′(c − c/z),

      z(z − 1) ∂g/∂z = z(z − 1)(c/z²) f′(c − c/z)
                     = c(1 − 1/z) f′(c − c/z),

   which are trivially equal.

f) Setting g(1, z) = z gives us f(1 − 1/z) = z, and making the
   substitution x = 1 − 1/z then gives f(x) = 1/(1 − x). Thus the
   general solution for g is

      g(c, z) = 1/(1 − c + c/z).

g) Rewriting the generating function in the form

      g(c, z) = (z/c) × 1/[1 − (1 − 1/c)z],

   we can expand the second factor as a geometric series in z to
   give

      g(c, z) = (z/c) Σ_{m=0}^∞ (1 − 1/c)^m z^m.

   Now we can easily read off the coefficients of the expansion,
   which give us our degree distribution. Since there is no term
   in z⁰, there are no nodes of degree zero, and for degree k ≥ 1
   we have

      p_k(c) = (1 − 1/c)^(k−1)/c = (c − 1)^(k−1)/c^k.

Exercise 13.8:

a) The correctly normalized probability that a particular new edge
   attaches to a previous node i is

      (q_i + a_i) / Σ_i (q_i + a_i) = (q_i + a_i)/(nc + nā),
This equation is correct for q > 0. For q = 0 we have

   (n + 1)p_0(a, n + 1) = np_0(a, n) + ρ(a)
                          − [ca/(c + ā)] p_0(a, n),

where ρ(a) is the probability distribution from which a is drawn.

In the limit of large n, using the shorthand p_q(a) = p_q(a, ∞),
these equations become

   p_q(a) = [c/(c + ā)] [(q − 1 + a)p_{q−1}(a) − (q + a)p_q(a)],

for q > 0 and

   p_0(a) = ρ(a) − [ca/(c + ā)] p_0(a),

for q = 0. By a series of manipulations similar to those leading
to Eq. (13.21), we can then show that … then we make the
substitution u = τ^(c/(c+a)), which gives

   ∫_0^1 π_q(τ) dτ = (1 + a/c) [Γ(q + a)/(Γ(q + 1)Γ(a))]
                     ∫_0^1 u^(a+a/c) (1 − u)^q du.

b) The probability of joining to a component of any size other
   than k is 2a_k(1 − a_k) = 2a_k − 2a_k². If the two components
   have the same size then the probability is just a_k² with no
   factor of 2.

c) The number of nodes in components of size k is na_k(n). This
   number goes up by k when a new component of size k forms and
   down by k when one is lost, except when we join two components
   of size k together, in which case we lose 2k nodes, k for each
   of the components. Hence the master equation takes the form

      (n + 1)a_k(n + 1) = na_k(n) − βk[2a_k(n) − 2a_k(n)²]
                          − 2βk a_k(n)²
                          + βk Σ_{j=1}^{k−1} a_j(n) a_{k−j}(n)

                        = na_k(n) − 2βk a_k(n)
                          + βk Σ_{j=1}^{k−1} a_j(n) a_{k−j}(n).

d) For components of size 1 we have

      (n + 1)a_1(n + 1) = na_1(n) − 2βa_1(n) + 1.
    [ 0, 0, 1, 0, 1, 1 ],
    [ 0, 0, 0, 1, 0, 1 ],
    [ 0, 0, 0, 1, 1, 0 ]], int)
k = sum(A)
twom = sum(k)
B = A - outer(k,k)/twom
x,v = eigh(B)
vector = v[:,-1]
print(vector)

For the leading eigenvector, the program gives (0.444, 0.444,
0.325, −0.325, −0.444, −0.444). For the purposes of this program
the nodes were numbered left to right in the picture, so this
vector tells us that the left three nodes are in one community and
the right three are in the other, like this:

   p^m (1 − p)^(n₁n₂−m),

where m = Σ_ij B_ij is the total number of edges in the network.
This expression is analogous to Eq. (14.26) for the ordinary
random graph.

b) Differentiating with respect to p and setting the result to
   zero gives

      p = m/(n₁n₂).
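The modularity program above is missing its opening lines in this copy. A complete runnable version is sketched below; the first three rows of the adjacency matrix are an assumption, reconstructed on the premise (consistent with the visible rows and the printed eigenvector) that the network is two triangles, on nodes 1-3 and 4-6, joined by an edge between nodes 3 and 4:

```python
from numpy import array, outer
from numpy.linalg import eigh

# Adjacency matrix: two triangles joined by the edge (3,4).
# The first three rows are an assumption reconstructed by symmetry.
A = array([[ 0, 1, 1, 0, 0, 0 ],
           [ 1, 0, 1, 0, 0, 0 ],
           [ 1, 1, 0, 1, 0, 0 ],
           [ 0, 0, 1, 0, 1, 1 ],
           [ 0, 0, 0, 1, 0, 1 ],
           [ 0, 0, 0, 1, 1, 0 ]], int)

k = A.sum(axis=0)             # degree vector
twom = k.sum()                # 2m, twice the number of edges
B = A - outer(k, k)/twom      # modularity matrix
w, v = eigh(B)                # eigh returns eigenvalues in ascending order
vector = v[:, -1]             # eigenvector of the largest eigenvalue
print(vector)
```

The signs of the first three components are opposite to those of the last three, splitting the network into the two triangles; the overall sign of the eigenvector is arbitrary.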
15 Percolation and network resilience

Exercise 15.1:

a) The generating functions for the degree distribution and excess
   degree distribution are g_0(z) = z⁴ and g_1(z) = z³. Hence
   Eq. (15.4) becomes u = 1 − φ + φu³, which is a cubic equation.
   However, we know that u = 1 is always a solution, so we can
   factor that out, and we find that φu² + φu + φ − 1 = 0, so

      u = [−φ ± √(4φ − 3φ²)]/2φ.

   The negative solution is disallowed since u is a probability,
   so we take the positive one. Then the size of the giant cluster
   is

      S = 1 − g_0(u) = 1 − u⁴ = 1 − [√(4φ − 3φ²) − φ]⁴/(16φ⁴).

b) The critical probability occurs at the point where S = 0, which
   gives 3φ = √(4φ − 3φ²), or φ_c = 1/3.

c) We have S = 1 when φ = √(4φ − 3φ²), which gives φ = 1. So the
   giant cluster fills the entire network when the occupation
   probability reaches 1. This can happen because in a regular
   graph with k ≥ 3 the giant component also fills the entire
   network (see Exercise 12.4).

Exercise 15.2:

a) Since the occupied nodes are chosen at random, any pair of them
   is connected by an edge independently with probability p, just
   as is the case for any nodes in the network. Hence they form a
   random graph with the same edge probability p as the network as
   a whole. The expected number of occupied nodes is n₀ = nφ, and
   hence, from Eq. (11.6), the mean degree is …

c) The node belongs to a cluster of size zero if it is unoccupied,
   which happens with probability 1 − φ. If it is occupied the
   results of Section 12.10.9 along with those of Exercise 12.13
   tell us that the probability of belonging to a cluster of size
   s is φ (the probability of being occupied) times the
   probability of being in a component of size s within the
   network of occupied nodes, which gives the expression in the
   question.

Exercise 15.3:

a) The mean and mean-square degrees are

      ⟨k⟩ = p₁ + 2p₂ + 3p₃,    ⟨k²⟩ = p₁ + 4p₂ + 9p₃,

   so from Eq. (15.9)

      φ_c = ⟨k⟩/(⟨k²⟩ − ⟨k⟩)
          = (p₁ + 2p₂ + 3p₃)/[(p₁ + 4p₂ + 9p₃) − (p₁ + 2p₂ + 3p₃)]
          = (p₁ + 2p₂ + 3p₃)/(2p₂ + 6p₃).

b) If p₁ > 3p₃ then

      φ_c > (3p₃ + 2p₂ + 3p₃)/(2p₂ + 6p₃) = 1,

   so there can be no giant cluster. (Indeed if p₁ > 3p₃ there is
   no giant component (see Eq. (12.40)), so obviously there can be
   no giant cluster.) The result does not depend on p₂ because you
   can add as many degree-2 nodes to the network as you like and
   it makes no difference: they just sit in the middle of edges
   but don't change the overall network structure.

c) The probability u satisfies

      u = 1 − φ + φg_1(u) = 1 − φ + φ(q₀ + q₁u + q₂u²),

   or equivalently …
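The formulas of Exercise 15.1 can be checked against each other numerically: the giant cluster should vanish exactly at φ_c = 1/3 and fill the network at φ = 1. A short sketch:

```python
from math import sqrt

def giant_cluster(phi):
    """S = 1 - u^4, with u the relevant root of phi*u^2 + phi*u + phi - 1 = 0."""
    u = (-phi + sqrt(4*phi - 3*phi**2)) / (2*phi)
    return 1 - u**4

print(giant_cluster(1/3))   # at phi_c = 1/3 the giant cluster just vanishes
print(giant_cluster(1.0))   # at phi = 1 it fills the network
```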
Exercise 15.4: From Eq. (12.102), the generating function for the
excess degree distribution is

   g_1(z) = (1 − a)²/(1 − az)²,

and thus

   g_1′(z) = 2a(1 − a)²/(1 − az)³.

Exercise 15.7:

a) The position of the phase transition is given by Eq. (15.41),
   f_1′(1) = 1. The excess degree distribution is
   q_k = (k + 1)p_{k+1}/⟨k⟩ = (k + 1)^(−α+1)/ζ(α − 1), so from
   Eq. (15.35)

      f_1(z) = (1/ζ(α − 1)) Σ_{k=1}^∞ k^(−α+1) z^(k−1).
b) For this model we have

      f_0(z) = Σ_{k=0}^∞ p_k φ_k z^k = (1 − a) Σ_{k=0}^∞ (abz)^k
             = (1 − a)/(1 − abz),

   and

      g_0(z) = Σ_{k=0}^∞ p_k z^k = (1 − a)/(1 − az).

   Then Eq. (15.36) tells us that

      f_1(z) = f_0′(z)/g_0′(1) = b(1 − a)²/(1 − abz)²,

   and

      f_1′(z) = 2ab²(1 − a)²/(1 − abz)³.

   And there is a giant cluster when f_1′(1) > 1, i.e., when

      2ab²(1 − a)² > (1 − ab)³,

   as stated in the question.

… relabeled nodes belong increases in size by only one node. In
the worst case where we do this on every step of the algorithm,
repeatedly adding a cluster of size one to the largest cluster, we
would end up relabeling each node O(n) times before the largest
cluster had grown to fill the whole network. The total work done
relabeling all n nodes O(n) times would then be O(n²).

Exercise 15.9:

a) Each of individual i's neighbors j has probability f of being
   chosen and then probability 1/k_j of nominating i for
   immunization. Thus the probability of receiving a nomination
   from neighbor j is f/k_j and summing over all neighbors then
   gives the required result.

b) The probability of not receiving a nomination from neighbor j
   is 1 − f/k_j, and the probability of not receiving a nomination
   from any node is ∏_j (1 − f/k_j)^A_ij. Taking logs, we have

      log ∏_j (1 − f/k_j)^A_ij = Σ_j A_ij log(1 − f/k_j)
                               ≃ −f Σ_j A_ij/k_j = −f κ_i.

Exercise 15.11:

a) If the outbreak starts at a randomly chosen node, then the
   probability of starting in cluster m is s_m/n. And when it does
   so, it will infect s_m individuals. Hence the expected size of
   the outbreak is

      I = Σ_{m=1}^k (s_m/n) s_m.

   The smallest value of I occurs when the clusters are of equal
   size, meaning that s_m = (n − r)/k. Substituting into the
   formula for I then gives the required result.

b) There is only one node that will make any difference at all if
   removed. This one:

c) If you can remove two nodes, you should remove these:
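The closed form for f_1′(1) in Exercise 15.7(b) can be checked against a numerical derivative; a sketch with illustrative a and b:

```python
a, b = 0.6, 0.9   # illustrative parameters, 0 < a, b < 1

def f1(z):
    return b*(1 - a)**2 / (1 - a*b*z)**2

# Closed-form derivative at z = 1
exact = 2*a*b**2*(1 - a)**2 / (1 - a*b)**3

# Central-difference estimate
h = 1e-6
numeric = (f1(1 + h) - f1(1 - h)) / (2*h)

print(exact, numeric)
# Giant cluster exists when f1'(1) > 1, i.e. 2ab^2(1-a)^2 > (1-ab)^3
print(exact > 1)
```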
Exercise 16.2: In terms of the total number of individuals S, X,
and R in the three states the equations are

   dS/dt = −βS X/(S + X),
   dX/dt = βS X/(S + X) − γX,
   dR/dt = γX.

Dividing throughout by the total population size n then gives

   ds/dt = −β sx/(s + x),
   dx/dt = β sx/(s + x) − γx,
   dr/dt = γx.

Exercise 16.3:

a) Equation (16.29) is the same as Eq. (15.4) and, as shown in
   Eq. (15.17), the solution for the exponential distribution is

      u = a⁻¹ − (1/2)φ − √((1/4)φ² + φ(a⁻¹ − 1)).

b) The probability that a node with degree k does not belong to
   the giant cluster is u^k, and if it does not belong to the
   giant cluster then, with probability 1 in the limit of large
   network size, it is not infected. If it does belong to the
   giant cluster (which happens with probability 1 − u^k) then it
   may be infected or it may not, depending on whether the
   outbreak starts in the giant cluster. The probability that the
   outbreak starts in the giant cluster is S. Hence the total
   probability that our node of degree k is infected is
   S(1 − u^k).

   The size of the giant cluster in the present case is given by
   Eq. (16.30) to be

      S = 1 − g_0(u) = 1 − (1 − a)/(1 − au)
        = 1 − (1 − a)/[(1/2)φa + √((1/4)φ²a² + φa(1 − a))]
        = 3/2 − √(1/4 + (a⁻¹ − 1)/φ).

   Putting the results together then gives us our expression for
   the probability of infection.

c) For a = 0.4 and φ = 0.9 we find that u = 0.804 and S = 0.116.
   Then the probability of infection for k = 0 is zero
   (obviously), for k = 1 is 0.023, and for k = 10 is 0.103.

Exercise 16.4:

a) Every node has excess degree 3, so g_1(z) = z³ and
   φ_c = 1/g_1′(1) = 1/3.

b) The size of the giant cluster is

      S = 1 − g_0(u) = 1 − u⁴,

   where u is the solution of u = 1 − φ + φg_1(u) = 1 − φ + φu³,
   so we have a cubic equation for u:

      φu³ − u + 1 − φ = 0.

   Although cubics are usually hard to solve, this one can be
   simplified by noting that u = 1 is a solution, and hence u − 1
   must be a factor. A little work reveals that the cubic equation
   can be rewritten as

      (u − 1)(φu² + φu + φ − 1) = 0.

   If u ≠ 1, then we must have φu² + φu + φ − 1 = 0, which has
   solutions

      u = −1/2 ± √(4φ − 3φ²)/(2φ).

   But u cannot be negative (since it is a probability), so we
   must choose the positive sign. Then

      S = 1 − u⁴ = 1 − (1/16)[√(4φ − 3φ²)/φ − 1]⁴.

   Putting φ = 1/2 then gives the required result.

Exercise 16.5: Here is an example program to solve this problem,
which is written in Python:

from numpy import empty,zeros
from random import random,randrange
from pylab import plot,show,xlabel

n = 10000
c = 5
m = n*c//2
phi = 0.4

# Generate the network
k = zeros(n,int)
edge = empty(n,set)
for i in range(n): edge[i] = set()
for e in range(m):
    i = randrange(n)
    j = randrange(n)
    while (i==j) or (i in edge[j]):
        i = randrange(n)
        j = randrange(n)
    edge[i].add(j)
    edge[j].add(i)

# Make all nodes susceptible except the starting node
S,I,R = 0,1,2
state = zeros(n,int)
v = randrange(n)
state[v] = I
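The numbers quoted in Exercise 16.3(c) follow directly from the formulas in parts (a) and (b); a quick sketch:

```python
from math import sqrt

a, phi = 0.4, 0.9

u = 1/a - phi/2 - sqrt(phi**2/4 + phi*(1/a - 1))
S = 1.5 - sqrt(0.25 + (1/a - 1)/phi)

def infection_probability(k):
    # A degree-k node is infected with probability S(1 - u^k)
    return S * (1 - u**k)

print(round(u, 3), round(S, 3))            # 0.804 and 0.116
print(round(infection_probability(1), 3))  # 0.023
print(round(infection_probability(10), 3)) # 0.103
```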
# Queue of infected nodes
q = empty(n,int)
q[0] = v
pin = 1      # queue write pointer
pout = 0     # queue read pointer

t = 0
endstep = 1
susceptible = n - 1
infected = 1
recovered = 0
tpoints = []
spoints = []
ipoints = []
rpoints = []

# Main loop
while pin>pout:

    # Pull an infected node off the queue
    v = q[pout]
    pout += 1

    # Check its neighbors one by one
    for i in edge[v]:
        if state[i]==S and random()<phi:
            state[i] = I
            q[pin] = i
            pin += 1
            infected += 1
            susceptible -= 1

    # Mark the infecting node as recovered
    state[v] = R
    infected -= 1
    recovered += 1

    # Check if the time step has ended
    # If it has, record the current state
    if pout==endstep:
        t += 1
        tpoints.append(t)
        spoints.append(susceptible)
        ipoints.append(infected)
        rpoints.append(recovered)
        endstep = pin

plot(tpoints,spoints)
plot(tpoints,ipoints)
plot(tpoints,rpoints)
xlabel("Time")
show()

[Figure: numbers of susceptible, infected, and recovered
individuals, with counts up to about 6,000, as a function of time
from 0 to 20.]

Exercise 16.6:

a) Following the lines of the argument given in Section 16.3.2, we
   define u to be the probability that a node is not connected to
   the giant cluster via a specific one of its edges. For this to
   happen, either the edge is unoccupied, or the node at its other
   end is unoccupied, or both are occupied but the node at the
   other end is not connected to the giant cluster via any of its
   remaining edges. The probability that both the node and edge
   are occupied is φ_s φ_b, and the probability that at least one
   of them is unoccupied is 1 − φ_s φ_b. The probability that none
   of the remaining edges connects to the giant cluster is u^k,
   where k is the excess degree. Putting the terms together, the
   total probability that a given edge does not connect to the
   giant cluster is 1 − φ_s φ_b + φ_s φ_b u^k. Averaging over the
   distribution q_k of the excess degree then gives
   u = 1 − φ_s φ_b + φ_s φ_b g_1(u), as in the question.

   A given node does not belong to the giant cluster if either it
   is itself unoccupied (probability 1 − φ_s) or it is occupied
   (probability φ_s) but none of its edges lead to the giant
   component (probability u^k, where k is now the total degree).
   Putting these terms together gives a total probability of
   1 − φ_s + φ_s u^k. Averaging over the distribution p_k of the
   total degree then gives the expression for 1 − S, which can be
   rearranged to give the required expression for S.

b) As in Fig. 15.2, the phase transition between the epidemic and
   non-epidemic regimes occurs at the point where the derivative
   of y = 1 − φ_s φ_b + φ_s φ_b g_1(u) at u = 1 is equal to 1.
   Performing the derivative, rearranging for φ_s, and recalling
   that the fraction of individuals vaccinated is 1 − φ_s, then
   gives the required result.

Exercise 16.7:

a) The probability of a small outbreak is 1 − S, where S is the
   size of the giant percolation cluster in the network, given
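The program of Exercise 16.5 is split across pages in this copy. A compact self-contained version of the same simulation scheme (smaller n, no plotting; the random-graph parameters are illustrative) is sketched below:

```python
from random import random, randrange, seed

seed(0)
n, c, phi = 1000, 5, 0.4
m = n*c//2

# Build a simple random graph with m edges
edge = [set() for _ in range(n)]
for _ in range(m):
    i, j = randrange(n), randrange(n)
    while i == j or i in edge[j]:
        i, j = randrange(n), randrange(n)
    edge[i].add(j)
    edge[j].add(i)

# SIR outbreak: breadth-first spread, each susceptible neighbor
# of an infected node is infected with probability phi
S, I, R = 0, 1, 2
state = [S]*n
start = randrange(n)
state[start] = I
queue = [start]
pout = 0
while pout < len(queue):
    v = queue[pout]
    pout += 1
    for w in edge[v]:
        if state[w] == S and random() < phi:
            state[w] = I
            queue.append(w)
    state[v] = R   # v recovers after trying to infect its neighbors

recovered = state.count(R)
print(recovered)   # final outbreak size
```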
The generating function h_1(z) for the size of the cluster at the
end of an edge has two terms. If the edge is unoccupied
(probability 1 − φ) then the cluster has size zero, so the
generating function has a constant term equal to 1 − φ. If the
edge is occupied then by an argument analogous to that for h_0 the
generating function is equal to z[h_1(z)]^k, where k is now the
excess degree of the node at the end of the edge. Each term in
this generating function now gets multiplied by φ, the probability
that the edge is occupied and, after averaging over the
distribution q_k of the excess degree, the full generating
function is

   h_1(z) = 1 − φ + φz Σ_k q_k [h_1(z)]^k

The x(t) that solves this equation is a solution of (16.55) for
the given initial condition and, since solutions to first-order
differential equations are unique once the initial condition is
fixed, it follows that this is the only solution and hence that
the probability of infection is indeed the same for all nodes at
all times.

c) The equation can be solved by separating the variables:

      t = ∫_{c/n}^x dy/[βk y(1 − y)]
        = (1/βk) [ln(x/(1 − x)) + ln((n − c)/c)].

   Rearranging for x(t) then gives the required answer.

d) The rate of appearance of new cases is n dx/dt, which is
   increasing when its derivative n d²x/dt² is positive (and
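Rearranging the separated-variables result in part (c) gives the logistic curve x(t) = c e^(βkt)/(n − c + c e^(βkt)), and substituting that back into the expression for t provides a quick numerical check (parameter values illustrative):

```python
from math import exp, log

beta, k, n, c = 0.3, 4.0, 1000.0, 5.0

def x_of_t(t):
    # logistic solution with x(0) = c/n
    return c*exp(beta*k*t) / (n - c + c*exp(beta*k*t))

def t_of_x(x):
    # the separated-variables result from part (c)
    return (1/(beta*k)) * (log(x/(1 - x)) + log((n - c)/c))

for t in [0.0, 1.0, 2.5, 5.0]:
    print(t, t_of_x(x_of_t(t)))  # round trip should recover t

print(x_of_t(0.0))  # equals c/n = 0.005 at t = 0
```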
   2(cos k_x + cos k_y) v_r .

   dc_r/dt = g′(0) λ_r c_r(t),
There are four ends of edges at every node on the square lattice
and hence an average of 4p ends of shortcuts and 2np shortcuts in
the whole system. The equivalent of the power-law condition on the
lengths of shortcuts in the one-dimensional model is that in two
dimensions each shortcut spans a given vector displacement with
probability Kr^(−α), where r is the Manhattan length of the
vector. There are n places a shortcut that spans a given vector
can fall, each with probability 1/n, and hence
2np × Kr^(−α)/n = 2pKr^(−α) is the probability of connection
between any node pair.

The normalizing constant K is, as before, fixed by the fact that
each shortcut must span some vector. The number of nodes at
distance r from a given starting point is 4r and, if we assume a
diamond-shaped system as in the picture above, with L nodes on a
side, the maximum value of r is L. The normalization condition
then says that K Σ_{r=1}^L (2r × r^(−α)) = 1, which by a
calculation analogous to that for Eq. (18.5), implies that

   K ≃ { (1/2)(2 − α)L^(α−2)   for α < 2,
         1/(2 ln L)            for α = 2,
         (α − 2)/α             for α > 2.

We assume without loss of generality that the target is in the
center of our picture. As before, consider a message at a node in
the kth class and let us ask what the probability is that there
exists a shortcut from that node to a node in a lower class (i.e.,
a class closer to the target). The number of nodes in classes
lower than k is easily shown to be 2^(2k) + (2^k − 1)², which is
never less than 2^(2k). The Manhattan distance from the message to
the furthest of these nodes in a lower class is no greater than
the maximum distance to the target, which is 2^k − 1, plus the
distance out to the furthest node, which is 2^(k−1) − 1. Thus the
distance is no greater than 2^k + 2^(k−1) − 2, which is always
less than 2^(k+1). Thus the probability of the message having a
shortcut to a particular node in one of the lower classes is never
less than 2pK·2^(−(k+1)α), and the probability of having a
shortcut to any of them is never less than

   2^(2k) × 2pK·2^(−(k+1)α) = pK·2^(2k+1−(k+1)α).

If the message does not have a shortcut to a better zone of the
lattice from its current location, it gets passed along the
lattice to a node one step closer to the target and tries again to
find a shortcut there. The expected number of tries until it finds
a shortcut is the reciprocal of the probability above, or

   1/[pK·2^(2k+1−(k+1)α)] = (1/pK) 2^(α−1) 2^((α−2)k).

In the worst case the message has to pass through all of the
classes before reaching the target and there are log₂ L classes.
Hence an upper bound on the expected number of steps is

   ℓ ≤ (2^(α−1)/pK) Σ_{k=0}^{log₂ L} 2^((α−2)k)
     = (2^(α−1)/pK) × [2^((α−2)(1+log₂ L)) − 1]/(2^(α−2) − 1)
     = (2^(α−1)/pK) × [(2L)^(α−2) − 1]/(2^(α−2) − 1).

Finally, making use of our expression for the normalizing constant
K and noting that L ≤ √n, we have

   ℓ ≤ { A n^(1−α/2)   if α < 2,
         B log² n      if α = 2,
         C n^(α/2−1)   if α > 2.

Thus it is indeed possible to find the target in log² n time, but
only if α = 2.