
Eidgenössische Technische Hochschule Zürich
Ecole polytechnique fédérale de Zurich
Politecnico federale di Zurigo
Federal Institute of Technology at Zurich

Department of Computer Science 26 September 2022


Markus Püschel, David Steurer
François Hublet, Goran Zuzic, Tommaso d'Orsi, Jingqiu Ding

Algorithms & Data Structures Exercise sheet 0 HS 22

The solutions for this sheet do not have to be submitted. The sheet will be solved in the first exercise
session on 26.09.2022.
Exercises that are marked by ∗ are challenge exercises.

Exercise 0.1 Induction.


a) Prove by mathematical induction that for any positive integer n,
$$1 + 2 + \cdots + n = \frac{n(n+1)}{2}.$$

• Base Case.
Let n = 1. Then:
$$1 = \frac{1 \cdot 2}{2}.$$

• Induction Hypothesis.
Assume that the property holds for some positive integer k. That is:
$$1 + 2 + \cdots + k = \frac{k(k+1)}{2}.$$

• Inductive Step.
We must show that the property holds for k + 1 summands:
$$1 + 2 + \cdots + k + (k+1) = \frac{k(k+1)}{2} + (k+1) = \frac{k(k+1) + 2(k+1)}{2} = \frac{(k+1)(k+2)}{2}.$$

By the principle of mathematical induction, this is true for any positive integer n.
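As a quick numeric sanity check (separate from the proof), the closed form can be verified for small n:

```python
# Check the closed form 1 + 2 + ... + n = n(n+1)/2 for the first 100 values of n.
for n in range(1, 101):
    assert sum(range(1, n + 1)) == n * (n + 1) // 2
```
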
b) (This subtask is from the August 2019 exam.) Let $T : \mathbb{N} \to \mathbb{R}$ be a function that satisfies the following two conditions:
$$T(n) \geq 4 \cdot T(n/2) + 3n \quad \text{whenever } n \text{ is divisible by } 2;$$
$$T(1) = 4.$$

Prove by mathematical induction that
$$T(n) \geq 6n^2 - 2n$$
holds whenever n is a power of 2, i.e., $n = 2^k$ with $k \in \mathbb{N}_0$.

• Base Case.
Let k = 0, so $n = 2^0 = 1$. Then:
$$T(1) = 4 \geq 6 \cdot 1^2 - 2 \cdot 1 = 4.$$

• Induction Hypothesis.
Assume that the property holds for some $m = 2^k$. That is:
$$T(m) \geq 6m^2 - 2m.$$

• Inductive Step. We must show that the property holds for $2m = 2^{k+1}$:
$$T(2m) \geq 4 \cdot T(m) + 3 \cdot 2m \geq 4(6m^2 - 2m) + 6m = 24m^2 - 2m \geq 24m^2 - 4m = 6 \cdot (2m)^2 - 2 \cdot (2m).$$

By the principle of mathematical induction, this is true for any integer n that is a power of 2.
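The bound can also be checked numerically. This sketch (an illustration, not a proof) takes the extremal function that satisfies the recurrence with equality, which every admissible T dominates on powers of 2, and verifies the bound:

```python
# Extremal function: T(1) = 4 and T(n) = 4*T(n//2) + 3n with equality.
# Any T satisfying the exercise's conditions is >= this one on powers of 2.
def T(n):
    return 4 if n == 1 else 4 * T(n // 2) + 3 * n

for k in range(15):
    n = 2 ** k
    assert T(n) >= 6 * n * n - 2 * n  # the claimed lower bound
```
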

Asymptotic Growth
When we estimate the number of elementary operations executed by algorithms, it is often useful to ignore lower-order terms and instead focus on the asymptotic growth defined below. We denote by $\mathbb{R}^+$ the set of all (strictly) positive real numbers and by $\mathbb{R}^+_0$ the set of nonnegative real numbers.

Definition 1. Let $f, g : \mathbb{N} \to \mathbb{R}^+$ be two functions. We say that f grows asymptotically faster than g if
$$\lim_{n \to \infty} \frac{g(n)}{f(n)} = 0.$$

This definition is also valid for functions defined on $\mathbb{R}^+$ instead of $\mathbb{N}$. In general, $\lim_{n \to \infty} \frac{g(n)}{f(n)}$ is the same as $\lim_{x \to \infty} \frac{g(x)}{f(x)}$ if the second limit exists.

Exercise 0.2 Comparison of functions part 1.


Show that
a) f (n) := n log n grows asymptotically faster than g(n) := n.
Solution:
$$\lim_{n \to \infty} \frac{n}{n \log n} = \lim_{n \to \infty} \frac{1}{\log n} = 0,$$

hence by Definition 1, f (n) := n log n grows asymptotically faster than g(n) := n.

b) $f(n) := n^3$ grows asymptotically faster than $g(n) := 10n^2 + 100n + 1000$.

Solution:
$$\lim_{n \to \infty} \frac{10n^2 + 100n + 1000}{n^3} = \lim_{n \to \infty} \left( \frac{10}{n} + \frac{100}{n^2} + \frac{1000}{n^3} \right) = 0,$$
hence by Definition 1, $f(n) := n^3$ grows asymptotically faster than $g(n) := 10n^2 + 100n + 1000$.

c) $f(n) := 3^n$ grows asymptotically faster than $g(n) := 2^n$.
Solution:
$$\lim_{n \to \infty} \frac{2^n}{3^n} = \lim_{n \to \infty} \left( \frac{2}{3} \right)^n = 0,$$
hence by Definition 1, $f(n) := 3^n$ grows asymptotically faster than $g(n) := 2^n$.
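The limits in parts a)–c) can be illustrated numerically: the ratio g(n)/f(n) from Definition 1 visibly shrinks as n grows. A minimal Python sketch:

```python
# Numeric illustration (not a proof): for each pair from a)-c), the ratio
# g(n)/f(n) decreases toward 0 as n grows.
import math

pairs = [
    (lambda n: n * math.log2(n), lambda n: n),                      # a)
    (lambda n: n ** 3,           lambda n: 10*n**2 + 100*n + 1000), # b)
    (lambda n: 3.0 ** n,         lambda n: 2.0 ** n),               # c)
]
for f, g in pairs:
    assert g(500) / f(500) < g(10) / f(10)  # ratio shrinks as n grows
```
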

The following theorem can be useful to compute some limits.

Theorem 1 (L'Hôpital's rule). Assume that the functions $f : \mathbb{R}^+ \to \mathbb{R}^+$ and $g : \mathbb{R}^+ \to \mathbb{R}^+$ are differentiable, $\lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = \infty$, and $g'(x) \neq 0$ for all $x \in \mathbb{R}^+$. If $\lim_{x \to \infty} \frac{f'(x)}{g'(x)} = C \in \mathbb{R}^+_0$ or $\lim_{x \to \infty} \frac{f'(x)}{g'(x)} = \infty$, then
$$\lim_{x \to \infty} \frac{f(x)}{g(x)} = \lim_{x \to \infty} \frac{f'(x)}{g'(x)}.$$

Exercise 0.3 Comparison of functions part 2.


Show that
a) $f(n) := n^{1.01}$ grows asymptotically faster than $g(n) := n \ln n$.
Solution: We apply Theorem 1 to compute
$$\lim_{x \to \infty} \frac{x \ln x}{x^{1.01}} = \lim_{x \to \infty} \frac{\ln x}{x^{0.01}} = \lim_{x \to \infty} \frac{(\ln x)'}{(x^{0.01})'} = \lim_{x \to \infty} \frac{1/x}{0.01 x^{-0.99}} = \lim_{x \to \infty} \frac{1}{0.01 x^{0.01}} = 0.$$

Hence by Definition 1, $f(n) := n^{1.01}$ grows asymptotically faster than $g(n) := n \ln n$.

b) $f(n) := e^n$ grows asymptotically faster than $g(n) := n$.

Solution: We apply Theorem 1 to compute
$$\lim_{x \to \infty} \frac{x}{e^x} = \lim_{x \to \infty} \frac{x'}{(e^x)'} = \lim_{x \to \infty} \frac{1}{e^x} = 0.$$

Hence by Definition 1, $f(n) := e^n$ grows asymptotically faster than $g(n) := n$.

c) $f(n) := e^n$ grows asymptotically faster than $g(n) := n^2$.

Solution: We apply Theorem 1 to compute
$$\lim_{x \to \infty} \frac{x^2}{e^x} = \lim_{x \to \infty} \frac{(x^2)'}{(e^x)'} = \lim_{x \to \infty} \frac{2x}{e^x} = 2 \lim_{x \to \infty} \frac{x'}{(e^x)'} = 2 \lim_{x \to \infty} \frac{1}{e^x} = 0.$$

Hence by Definition 1, $f(n) := e^n$ grows asymptotically faster than $g(n) := n^2$.

d)∗ $f(n) := 1.01^n$ grows asymptotically faster than $g(n) := n^{100}$.

Solution: Note that we can rewrite $\frac{g(x)}{f(x)}$ as
$$\frac{x^{100}}{1.01^x} = \frac{e^{100 \ln x}}{e^{x \ln 1.01}} = e^{100 \ln x - x \ln(1.01)}.$$

We have
$$\lim_{x \to \infty} \left( 100 \ln x - x \ln(1.01) \right) = \lim_{x \to \infty} x \left( 100 \frac{\ln x}{x} - \ln(1.01) \right) = -\infty,$$
and therefore $\lim_{x \to \infty} \frac{x^{100}}{1.01^x} = 0$. Hence by Definition 1, $f(n) := 1.01^n$ grows asymptotically faster than $g(n) := n^{100}$.

e)∗ $f(n) := \log_2 n$ grows asymptotically faster than $g(n) := \log_2 \log_2 n$.

Solution: Define $y := \log_2 x$. Then $y \to \infty$ as $x \to \infty$, and therefore $\lim_{x \to \infty} \frac{g(x)}{f(x)} = \lim_{y \to \infty} \frac{\log_2 y}{y}$. Remembering that $\log_2 y = \ln y / \ln 2$, we can apply Theorem 1 to compute
$$\lim_{y \to \infty} \frac{\log_2 y}{y} = \frac{1}{\ln 2} \lim_{y \to \infty} \frac{\ln y}{y} = \frac{1}{\ln 2} \lim_{y \to \infty} \frac{(\ln y)'}{y'} = \frac{1}{\ln 2} \lim_{y \to \infty} \frac{1/y}{1} = 0.$$

Hence by Definition 1, $f(n) := \log_2 n$ grows asymptotically faster than $g(n) := \log_2 \log_2 n$.


f)∗ $f(n) := 2^{\sqrt{\log_2 n}}$ grows asymptotically faster than $g(n) := \log_2^{100} n$.

Solution:
$$\lim_{n \to \infty} \frac{\log_2^{100} n}{2^{\sqrt{\log_2 n}}} = \lim_{n \to \infty} \frac{2^{\log_2 (\log_2^{100} n)}}{2^{\sqrt{\log_2 n}}} = \lim_{n \to \infty} \frac{2^{100 \log_2 \log_2 n}}{2^{\sqrt{\log_2 n}}} = \lim_{n \to \infty} 2^{100 \log_2 \log_2 n - \sqrt{\log_2 n}}.$$

Notice that
$$\lim_{n \to \infty} \left( 100 \log_2 \log_2 n - \sqrt{\log_2 n} \right) = \lim_{n \to \infty} -\sqrt{\log_2 n} \left( 1 - 100 \frac{\log_2 \log_2 n}{\sqrt{\log_2 n}} \right) = -\infty.$$

Hence
$$\lim_{n \to \infty} \frac{\log_2^{100} n}{2^{\sqrt{\log_2 n}}} = \lim_{n \to \infty} 2^{100 \log_2 \log_2 n - \sqrt{\log_2 n}} = 0.$$

Therefore, by Definition 1, $f(n) := 2^{\sqrt{\log_2 n}}$ grows asymptotically faster than $g(n) := \log_2^{100} n$.


g)∗ $f(n) := n^{0.01}$ grows asymptotically faster than $g(n) := 2^{\sqrt{\log_2 n}}$.
Solution:
$$\lim_{n \to \infty} \frac{2^{\sqrt{\log_2 n}}}{n^{0.01}} = \lim_{n \to \infty} \frac{2^{\sqrt{\log_2 n}}}{2^{\log_2 (n^{0.01})}} = \lim_{n \to \infty} \frac{2^{\sqrt{\log_2 n}}}{2^{0.01 \log_2 n}} = \lim_{n \to \infty} 2^{\sqrt{\log_2 n} - 0.01 \log_2 n}.$$

Notice that
$$\lim_{n \to \infty} \left( \sqrt{\log_2 n} - 0.01 \log_2 n \right) = \lim_{n \to \infty} -0.01 \log_2 n \left( 1 - \frac{\sqrt{\log_2 n}}{0.01 \log_2 n} \right) = -\infty.$$

Hence
$$\lim_{n \to \infty} \frac{2^{\sqrt{\log_2 n}}}{n^{0.01}} = \lim_{n \to \infty} 2^{\sqrt{\log_2 n} - 0.01 \log_2 n} = 0.$$

Therefore, by Definition 1, $f(n) := n^{0.01}$ grows asymptotically faster than $g(n) := 2^{\sqrt{\log_2 n}}$.
Exercise 0.4 Simplifying expressions.
Simplify the following expressions as much as possible without changing their asymptotic growth rates. Concretely, for each expression f(n) in the following list, find an expression g(n) that is as simple as possible and that satisfies $\lim_{n \to \infty} \frac{f(n)}{g(n)} \in \mathbb{R}^+$.

It is guaranteed that all functions in this exercise take values in $\mathbb{R}^+$ (you don't have to prove it).
a) $f(n) := 5n^3 + 40n^2 + 100$
Solution: Let $g(n) := n^3$. Then indeed we have
$$\lim_{n \to \infty} \frac{f(n)}{g(n)} = \lim_{n \to \infty} \left( 5 + \frac{40}{n} + \frac{100}{n^3} \right) = 5 \in \mathbb{R}^+.$$

b) $f(n) := 5n + \ln n + 2n^3 + \frac{1}{n}$

Solution: Let $g(n) := n^3$. Then indeed we have
$$\lim_{n \to \infty} \frac{f(n)}{g(n)} = \lim_{n \to \infty} \left( \frac{5}{n^2} + \frac{\ln n}{n^3} + 2 + \frac{1}{n^4} \right) = 2 \in \mathbb{R}^+.$$

c) $f(n) := n \ln n - 2n + 3n^2$
Solution: Let $g(n) := n^2$. Then indeed we have
$$\lim_{n \to \infty} \frac{f(n)}{g(n)} = \lim_{n \to \infty} \left( \frac{\ln n}{n} - \frac{2}{n} + 3 \right) = 3 \in \mathbb{R}^+.$$


d) $f(n) := 23n + 4n \log_5 (n^6) + 78\sqrt{n} - 9$

Solution: By the properties of logarithms, $4n \log_5 n^6 = 24n \log_5 n = 24n \frac{\ln n}{\ln 5}$. Let $g(n) := n \ln n$. Then indeed we have
$$\lim_{n \to \infty} \frac{f(n)}{g(n)} = \lim_{n \to \infty} \left( \frac{23}{\ln n} + \frac{24}{\ln 5} + \frac{78}{\sqrt{n} \ln n} - \frac{9}{n \ln n} \right) = \frac{24}{\ln 5} \in \mathbb{R}^+.$$


e) $f(n) := \log_2 \sqrt{n^5} + \sqrt{\log_2 n^5}$
Solution: By the properties of logarithms,
$$\log_2 \sqrt{n^5} = \frac{5}{2 \ln 2} \ln n,$$
and
$$\sqrt{\log_2 n^5} = \sqrt{\frac{5}{\ln 2}} \cdot \sqrt{\ln n}.$$
Let $g(n) := \ln n$. Then indeed we have
$$\lim_{n \to \infty} \frac{f(n)}{g(n)} = \lim_{n \to \infty} \left( \frac{5}{2 \ln 2} + \sqrt{\frac{5}{\ln 2}} \cdot \frac{1}{\sqrt{\ln n}} \right) = \frac{5}{2 \ln 2} \in \mathbb{R}^+.$$
f)∗ $f(n) := 2n^3 + \sqrt[4]{n}^{\,\log_5 \log_6 n} + \sqrt[7]{n}^{\,\log_8 \log_9 n}$
Solution:
$$\lim_{n \to \infty} \frac{\sqrt[7]{n}^{\,\log_8 \log_9 n}}{\sqrt[4]{n}^{\,\log_5 \log_6 n}} = \lim_{n \to \infty} \frac{n^{\frac{1}{7} \log_8 \log_9 n}}{n^{\frac{1}{4} \log_5 \log_6 n}} = \lim_{n \to \infty} n^{\frac{1}{7} \log_8 \log_9 n - \frac{1}{4} \log_5 \log_6 n}.$$

Notice that
$$\lim_{n \to \infty} \left( \frac{1}{7} \log_8 \log_9 n - \frac{1}{4} \log_5 \log_6 n \right) = -\infty,$$
since $\log_a x \leq \log_b y$ if $x \leq y$ and $a \geq b$. Hence
$$\lim_{n \to \infty} \frac{\sqrt[7]{n}^{\,\log_8 \log_9 n}}{\sqrt[4]{n}^{\,\log_5 \log_6 n}} = \lim_{n \to \infty} n^{\frac{1}{7} \log_8 \log_9 n - \frac{1}{4} \log_5 \log_6 n} = 0.$$

Moreover, we also have
$$\lim_{n \to \infty} \frac{2n^3}{\sqrt[4]{n}^{\,\log_5 \log_6 n}} = 2 \lim_{n \to \infty} n^{3 - \frac{1}{4} \log_5 \log_6 n} = 0.$$

Let $g(n) := n^{\frac{1}{4} \log_5 \log_6 n}$. Then indeed we have
$$\lim_{n \to \infty} \frac{f(n)}{g(n)} = 1 \in \mathbb{R}^+.$$

Exercise 0.5∗ Finding the range of your bow.


To celebrate your start at ETH, your parents gifted you a bow and (an infinite number of) arrows. You
would like to determine the range of your bow, in other words how far you can shoot arrows with it.
For simplicity we assume that all your arrow shots will cover exactly the same distance r, and we define
r as the range of your bow. You also know that this range is at least r ≥ 1 (meter).
You have at your disposal a ruler and a wall. You cannot directly measure the distance covered by an arrow shot (because the arrow slides some more distance on the ground after reaching distance r), so the only way you can get information about the range r is as follows. You can stand at a distance ℓ (of your choice) from the wall and shoot an arrow: if the arrow reaches the wall, you know that ℓ ≤ r, and otherwise you deduce that ℓ > r. By performing such an experiment with various choices of the distance ℓ, you will be able to determine r with more and more accuracy. Your goal is to do so with as few arrow shots as possible.
a) What is a fast strategy to find an upper bound on the range r? In other words, how can you find a distance D ≥ 1 such that r < D, using few arrow shots? The required number of shots might depend on the actual range r, so we will denote it by f(r). Good solutions should have f(r) ≤ 10 log₂ r for large values of r.
Solution: One possible fast strategy is to first shoot an arrow at distance 2 from the wall, and as long as the arrow reaches the wall, you double your distance to the wall for the next shot. More formally, let ℓᵢ denote your distance to the wall for the i-th shot. Then this strategy uses distances given by $\ell_i = 2^i$, and does this until you find a distance $\ell_t$ for which your arrow does not reach the wall. D is then given by $D = \ell_t = 2^t$, and the required number of shots is f(r) = t, the smallest integer t such that $r < 2^t$.

This strategy therefore needs $f(r) = \lceil \log_2 r \rceil$ shots, and indeed
$$f(r) = \lceil \log_2 r \rceil \leq 1 + \log_2 r \leq 10 \log_2 r$$
for all $r \geq 2^{1/9}$.
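The doubling strategy is easy to simulate. In this sketch the function names are illustrative, and the shot oracle `reaches` encodes the assumption that an arrow covers exactly distance r:

```python
import math

def upper_bound_shots(r):
    """Double the shooting distance until an arrow falls short; return (D, shots)."""
    reaches = lambda dist: dist <= r  # oracle: one arrow shot from distance `dist`
    t = 1
    while reaches(2 ** t):            # each oracle call is one shot
        t += 1
    return 2 ** t, t

for r in [1.5, 7, 100, 1000]:
    D, shots = upper_bound_shots(r)
    assert D / 2 <= r < D                       # D over-estimates r by at most 2x
    assert shots <= 10 * math.log2(r) or r < 2  # f(r) <= 10 log2 r for large r
```
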
b) You are now interested in determining r up to some additive error. More precisely, you should find an estimate r̃ such that the range is contained in the interval [r̃ − 1, r̃ + 1], i.e. r̃ − 1 ≤ r ≤ r̃ + 1. Denoting by g(r) the number of shots required by your strategy, your goal is to find a strategy with g(r) ≤ 10 log₂ r for all r sufficiently large.

Solution: You start by performing the strategy described in part (a). Note that this allows you to find a distance D such that r ∈ [D/2, D] using $f(r) = \lceil \log_2 r \rceil$ shots. You will then iteratively find smaller and smaller intervals [a, b] ⊆ [D/2, D] with r ∈ [a, b], until you get an interval whose length is at most 2 (and then you can take r̃ to be the center of this interval).

You start by shooting an arrow from distance (D/2 + D)/2 = (3/4)D. If the arrow reaches the wall, then you know that r ∈ [(3/4)D, D], and otherwise you deduce that r ∈ [D/2, (3/4)D]. Note that in both cases, the length of the interval of possible ranges r was divided by 2. In the next step, if you know that r ∈ [(3/4)D, D] then you shoot an arrow from distance ((3/4)D + D)/2, and if you know that r ∈ [D/2, (3/4)D] then you shoot an arrow from distance (D/2 + (3/4)D)/2, which allows you to again divide the length of the interval of possible ranges by 2. You carry on this procedure until you find an interval [a, b] of length b − a ≤ 2 satisfying r ∈ [a, b], and you define r̃ = (a + b)/2.

By construction, this strategy finds an estimate r̃ such that r̃ − 1 ≤ r ≤ r̃ + 1. Let's compute the number of required shots g(r). You start with $f(r) = \lceil \log_2 r \rceil$ shots in order to perform the strategy described in (a), and then you need t′ additional shots to find the interval [a, b]. Note that you start with the interval of possible ranges [D/2, D], which has length D/2, and with each additional shot you divide this length by 2, until you reach a length smaller than 2. Therefore, t′ is the smallest integer such that $D/2^{t'+1} \leq 2$, i.e. $D \leq 2^{t'+2}$. This means that $t' = \max\{\lceil \log_2 D \rceil - 2, 0\}$ (the maximum with 0 is taken because you cannot have a negative number of shots). This is at most $\lceil \log_2 2r \rceil = 1 + \lceil \log_2 r \rceil$ because D ≤ 2r, so the total number of required shots is
$$g(r) = f(r) + t' \leq f(r) + \lceil \log_2 r \rceil + 1 = 2\lceil \log_2 r \rceil + 1 \leq 2 \log_2 r + 3,$$
which is smaller than 10 log₂ r for all $r \geq 2^{3/8}$.
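The full strategy of part (b), doubling followed by bisection, can be simulated the same way (names are again illustrative):

```python
import math

def estimate_range(r):
    """Sketch of the part-(b) strategy: doubling phase, then bisection.
    Returns (r_estimate, total_shots); `reaches` is the shot oracle."""
    reaches = lambda dist: dist <= r
    t = 1
    while reaches(2 ** t):          # doubling phase from part (a)
        t += 1
    shots, D = t, 2.0 ** t
    a, b = D / 2, D                 # r is now known to lie in [D/2, D]
    while b - a > 2:                # each bisection shot halves the interval
        mid = (a + b) / 2
        shots += 1
        if reaches(mid):
            a = mid
        else:
            b = mid
    return (a + b) / 2, shots

for r in [1.2, 9.7, 500, 12345]:
    r_est, shots = estimate_range(r)
    assert r_est - 1 <= r <= r_est + 1            # additive error at most 1
    assert shots <= 2 * math.log2(r) + 3 or r < 2 # matches the g(r) bound
```
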
c) Coming back to part (a), is it possible to have a significantly faster strategy (for example with f(r) ≤ 10 log₂ log₂ r for large values of r)?

Solution: Let $h : \mathbb{R}^+ \to \mathbb{R}^+$ be any strictly increasing bijection; in particular, $\lim_{r \to \infty} h(r) = \infty$. We will show that there exists a strategy that finds some D > r using $f(r) := \lceil h(r) \rceil$ shots.

Since h is a strictly increasing bijection, it has an inverse $h^{-1} : \mathbb{R}^+ \to \mathbb{R}^+$ which is also strictly increasing. Moreover, we have $\lim_{r \to \infty} h^{-1}(r) = \infty$ because $\lim_{r \to \infty} h(r) = \infty$. The strategy is then to shoot the arrow at the i-th step from a distance of $h^{-1}(i)$, until we get to a step t″ where the arrow doesn't reach the wall (i.e. $h^{-1}(t'') > r$). The number of required shots is then t″, which is the smallest integer satisfying $h^{-1}(t'') > r$, or equivalently $t'' > h(r)$. Therefore, $t'' = \lceil h(r) \rceil$ as claimed.

For the particular example of f(r) ≤ 10 log₂ log₂ r, take the function $h(r) = \log_2 \log_2 r$. This corresponds to shooting an arrow from distance $h^{-1}(i) = 2^{2^i}$ in the i-th step. Then the number of required shots is
$$f(r) = \lceil \log_2 \log_2 r \rceil \leq 1 + \log_2 \log_2 r,$$
which is smaller than 10 log₂ log₂ r for all $r \geq 2^{2^{1/9}}$.


Department of Computer Science 26 September 2022


Markus Püschel, David Steurer
François Hublet, Goran Zuzic, Tommaso d’Orsi, Jingqiu Ding

Algorithms & Data Structures Exercise sheet 1 HS 22

The solutions for this sheet are submitted at the beginning of the exercise class on 3 October 2022.
Exercises that are marked by ∗ are “challenge exercises”. They do not count towards bonus points.
You can use results from previous parts without solving those parts.

Exercise 1.1 Guess the formula (1 point).


Consider the recursive formula defined by a1 = 1 and an+1 = 2an + 1. Find a simple closed formula
for an and prove that an follows it using induction.

Hint: Write out the first few terms. How fast does the sequence grow?
Solution:
Writing out the first few terms, we get: 1, 3, 7, 15, 31, etc. From this sequence, we guess the closed formula
$$a_n = 2^n - 1.$$

Now we prove $a_n = 2^n - 1$ by induction.


• Base Case.
For n = 1:
$$a_1 = 1 = 2^1 - 1,$$
so it is true for n = 1.
• Induction Hypothesis.
Now we assume that it is true for n = k, i.e., $a_k = 2^k - 1$.
• Induction Step.
We will prove that it is also true for n = k + 1:
$$a_{k+1} = 2a_k + 1 = 2(2^k - 1) + 1 = 2^{k+1} - 2 + 1 = 2^{k+1} - 1.$$
Hence it is true for n = k + 1.
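A quick check of the guessed closed formula against the recurrence for the first few terms:

```python
# a_1 = 1, a_{n+1} = 2 a_n + 1 should match a_n = 2^n - 1.
a = 1
for n in range(1, 21):
    assert a == 2 ** n - 1  # closed form holds for a_n
    a = 2 * a + 1           # advance the recurrence: a_{n+1} = 2 a_n + 1
```
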

Exercise 1.2 Sum of Squares.


Prove by mathematical induction that for every positive integer n,
$$1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}.$$

Solution:
• Base Case.
Let n = 1. Then:
$$1 = \frac{1 \cdot (1+1) \cdot (2+1)}{6} = 1.$$

• Induction Hypothesis.
Assume that the property holds for some positive integer k. That is,
$$1^2 + 2^2 + 3^2 + \cdots + k^2 = \frac{k(k+1)(2k+1)}{6}.$$

• Inductive Step.
We must show that the property holds for k + 1. Let's add $(k+1)^2$ to both sides of our induction hypothesis:
$$1^2 + 2^2 + \cdots + k^2 + (k+1)^2 = \frac{k(k+1)(2k+1)}{6} + (k+1)^2 = \frac{k(k+1)(2k+1) + 6(k+1)^2}{6} = \frac{(k+1)(2k^2 + k + 6k + 6)}{6} = \frac{(k+1)(2k^2 + 7k + 6)}{6} = \frac{(k+1)(k+2)(2k+3)}{6} = \frac{(k+1)((k+1)+1)(2(k+1)+1)}{6}.$$

By the principle of mathematical induction, this is true for any positive integer n.
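A numeric sanity check of the formula (not a substitute for the induction proof):

```python
# Check 1^2 + 2^2 + ... + n^2 = n(n+1)(2n+1)/6 for the first 100 values of n.
for n in range(1, 101):
    assert sum(i * i for i in range(1, n + 1)) == n * (n + 1) * (2 * n + 1) // 6
```
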

Exercise 1.3 Sums of powers of integers (1 point).

In this exercise, we fix an integer $k \in \mathbb{N}_0$.
(a) Show that, for all $n \in \mathbb{N}_0$, we have $\sum_{i=1}^{n} i^k \leq n^{k+1}$.

Solution:
As all terms in the sum are at most $n^k$, we have:
$$\sum_{i=1}^{n} i^k \leq \sum_{i=1}^{n} n^k = n \cdot n^k = n^{k+1}.$$

(b) Show that for all $n \in \mathbb{N}_0$, we have $\sum_{i=1}^{n} i^k \geq \frac{1}{2^{k+1}} \cdot n^{k+1}$.

Hint: Consider the second half of the sum, i.e., $\sum_{i=\lceil n/2 \rceil}^{n} i^k$. How many terms are there in this sum? How small can they be?

Solution:
We have:
$$\sum_{i=1}^{n} i^k \geq \sum_{i=\lceil n/2 \rceil}^{n} i^k \geq \sum_{i=\lceil n/2 \rceil}^{n} \left( \frac{n}{2} \right)^k = \left( n - \left\lceil \frac{n}{2} \right\rceil + 1 \right) \cdot \left( \frac{n}{2} \right)^k.$$

By definition of ⌈·⌉, we have $\lceil n/2 \rceil - 1 \leq n/2$, hence $n - \lceil n/2 \rceil + 1 \geq n/2$. Hence
$$\sum_{i=1}^{n} i^k \geq \frac{n}{2} \cdot \left( \frac{n}{2} \right)^k = \frac{1}{2^{k+1}} \cdot n^{k+1}.$$

Together, these two inequalities show that $C_1 \cdot n^{k+1} \leq \sum_{i=1}^{n} i^k \leq C_2 \cdot n^{k+1}$, where $C_1 = \frac{1}{2^{k+1}}$ and $C_2 = 1$ are two constants independent of n. Hence, when n is large, $\sum_{i=1}^{n} i^k$ behaves "almost like $n^{k+1}$", up to a constant factor.
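The two-sided bound can be checked numerically for small k and n:

```python
# Check (1/2^{k+1}) n^{k+1} <= sum_{i=1}^n i^k <= n^{k+1} for small k and n.
for k in range(5):
    for n in range(1, 50):
        s = sum(i ** k for i in range(1, n + 1))
        assert n ** (k + 1) / 2 ** (k + 1) <= s <= n ** (k + 1)
```
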

Exercise 1.4 Asymptotic growth (1 point).

Recall the concept of asymptotic growth that we introduced in Exercise sheet 0: If $f, g : \mathbb{N} \to \mathbb{R}^+$ are two functions, then:

• We say that f grows asymptotically slower than g if $\lim_{m \to \infty} \frac{f(m)}{g(m)} = 0$. If this is the case, we also say that g grows asymptotically faster than f.

Prove or disprove each of the following statements.
(a) $f(m) = 100m^3 + 10m^2 + m$ grows asymptotically slower than $g(m) = 0.001 \cdot m^5$.
Solution:
True, since
$$\lim_{m \to \infty} \frac{f(m)}{g(m)} = \lim_{m \to \infty} \frac{100m^3 + 10m^2 + m}{0.001 m^5} = \lim_{m \to \infty} \left( 10^5 m^{-2} + 10^4 m^{-3} + 10^3 m^{-4} \right) = 10^5 \cdot 0 + 10^4 \cdot 0 + 10^3 \cdot 0 = 0.$$

(b) $f(m) = \log(m^3)$ grows asymptotically slower than $g(m) = (\log m)^3$.
Solution:
True, since
$$\lim_{m \to \infty} \frac{f(m)}{g(m)} = \lim_{m \to \infty} \frac{\log(m^3)}{(\log m)^3} = \lim_{m \to \infty} \frac{3 \log m}{(\log m)^3} = 3 \lim_{m \to \infty} \frac{1}{(\log m)^2} = 3 \cdot 0 = 0.$$

(c) $f(m) = e^{2m}$ grows asymptotically slower than $g(m) = 2^{3m}$.
Hint: Recall that for all $n, m \in \mathbb{N}$, we have $n^m = e^{m \ln n}$.
Solution:
True, since
$$\lim_{m \to \infty} \frac{f(m)}{g(m)} = \lim_{m \to \infty} \frac{e^{2m}}{2^{3m}} = \lim_{m \to \infty} \frac{e^{2m}}{e^{3m \ln 2}} = \lim_{m \to \infty} e^{(2 - 3\ln 2)m} = \lim_{m \to \infty} e^{(-0.079\ldots) \cdot m} = 0.$$

(d) $f(m) = \sum_{i=1}^{m^2} i$ grows asymptotically slower than $g(m) = \sum_{i=1}^{m} i^2$.

Hint: You can reuse the inequalities from Exercise 1.3.

Solution:
False. With the inequalities from 1.3, we have $\sum_{i=1}^{m^2} i \geq \frac{1}{4} (m^2)^2 = \frac{1}{4} m^4$ (inequality from 1.3.b with k = 1 and n = m²) and $\sum_{i=1}^{m} i^2 \leq m^{2+1} = m^3$ (inequality from 1.3.a with k = 2 and n = m; alternatively, this also follows from the exact formula in Exercise 1.2). Hence, $\lim_{m \to \infty} \frac{f(m)}{g(m)} \geq \lim_{m \to \infty} \frac{\frac{1}{4} m^4}{m^3} = \lim_{m \to \infty} \frac{1}{4} m = +\infty$, and f does not grow asymptotically slower than g.
(e)* If f(m) grows asymptotically slower than g(m), then log(f(m)) grows asymptotically slower than log(g(m)).
Solution:
False. Consider f(m) = m and $g(m) = m^2$. We have $\lim_{m \to \infty} \frac{f(m)}{g(m)} = \lim_{m \to \infty} \frac{m}{m^2} = \lim_{m \to \infty} \frac{1}{m} = 0$, hence f grows asymptotically slower than g. However, $\log(f(m)) = \log m$ and $\log(g(m)) = \log(m^2) = 2 \log m$, therefore $\lim_{m \to \infty} \frac{\log(f(m))}{\log(g(m))} = \lim_{m \to \infty} \frac{\log m}{2 \log m} = \frac{1}{2} \neq 0$ and log(f(m)) does not grow asymptotically slower than log(g(m)).

(f)* $f(m) = \log(\sqrt{\log m})$ grows asymptotically slower than $g(m) = \sqrt{\log(\sqrt{m})}$.

Hint: You can use L'Hôpital's rule from sheet 0.

Solution:
True, since
$$\lim_{m \to \infty} \frac{f(m)}{g(m)} = \lim_{m \to \infty} \frac{\log(\sqrt{\log m})}{\sqrt{\log(\sqrt{m})}} = \lim_{m \to \infty} \frac{\left( \log \sqrt{\log m} \right)'}{\left( \sqrt{\log(\sqrt{m})} \right)'} \quad \text{(L'Hôpital's rule)}$$
$$= \lim_{m \to \infty} \frac{\frac{1}{2m \log m}}{\frac{1}{4m \sqrt{\log(\sqrt{m})}}} = \lim_{m \to \infty} \frac{2\sqrt{\log(\sqrt{m})}}{\log m} = \lim_{m \to \infty} \frac{\left( 2\sqrt{\log(\sqrt{m})} \right)'}{(\log m)'} \quad \text{(L'Hôpital's rule again)}$$
$$= \lim_{m \to \infty} \frac{\frac{1}{2m\sqrt{\log(\sqrt{m})}}}{\frac{1}{m}} = \lim_{m \to \infty} \frac{1}{2\sqrt{\log(\sqrt{m})}} = 0.$$

Exercise 1.5 Proving Inequalities.


(a) By induction, prove the inequality
$$\frac{1}{2} \cdot \frac{3}{4} \cdot \frac{5}{6} \cdot \ldots \cdot \frac{2n-1}{2n} \leq \frac{1}{\sqrt{3n+1}}, \quad n \geq 1.$$

Solution:
• Base Case.
For n = 1:
$$\frac{1}{2} \leq \frac{1}{\sqrt{4}},$$
which is an equality.
• Induction Hypothesis.
Now we assume that it is true for n = k, i.e.,
$$\frac{1}{2} \cdot \frac{3}{4} \cdot \frac{5}{6} \cdot \ldots \cdot \frac{2k-1}{2k} \leq \frac{1}{\sqrt{3k+1}}.$$

• Induction Step.
We will prove that it is also true for n = k + 1:
$$\frac{1}{2} \cdot \frac{3}{4} \cdot \frac{5}{6} \cdot \ldots \cdot \frac{2k-1}{2k} \cdot \frac{2k+1}{2k+2} \leq \frac{1}{\sqrt{3k+4}}.$$
Plugging in the induction hypothesis, it is sufficient to prove
$$\frac{1}{\sqrt{3k+1}} \cdot \frac{2k+1}{2k+2} \leq \frac{1}{\sqrt{3k+4}} \iff \frac{2k+1}{2k+2} \leq \frac{\sqrt{3k+1}}{\sqrt{3k+4}}.$$
Rewriting:
$$\frac{2k+1}{2k+2} \leq \sqrt{\frac{3k+1}{3k+4}} \iff \left( \frac{2k+1}{2k+2} \right)^2 \leq \frac{3k+1}{3k+4} \iff (4k^2+4k+1)(3k+4) \leq (4k^2+8k+4)(3k+1) \iff 12k^3+28k^2+19k+4 \leq 12k^3+28k^2+20k+4 \iff 0 \leq k.$$

Hence it is true for n = k + 1.


(b)* Replace $\sqrt{3n+1}$ by $\sqrt{3n}$ on the right side, and try to prove the new inequality by induction. This inequality is even weaker, hence it must be true. However, the induction proof fails. Try to explain to yourself how this is possible.
Solution:
Sometimes it is easier to prove more than less. This simple approach does not work for the weaker inequality because in each step we then use a weaker (and insufficiently strong!) induction hypothesis.


Department of Computer Science 3 October 2022


Markus Püschel, David Steurer
François Hublet, Goran Zuzic, Tommaso d’Orsi, Jingqiu Ding

Algorithms & Data Structures Exercise sheet 2 HS 22

The solutions for this sheet are submitted at the beginning of the exercise class on 10 October 2022.
Exercises that are marked by ∗ are “challenge exercises”. They do not count towards bonus points.
You can use results from previous parts without solving those parts.

Exercise 2.1 Induction.


(a) Prove via mathematical induction that for all integers n ≥ 5,
$$2^n > n^2.$$

Solution:
• Base Case.
Let n = 5. Then:
$$2^5 = 32 > 25 = 5^2.$$

• Induction Hypothesis.
Assume that the property holds for some integer k ≥ 5. That is,
$$2^k > k^2.$$

• Inductive Step.
We must show that the property holds for k + 1:
$$2^{k+1} = 2 \cdot 2^k \overset{\text{I.H.}}{>} 2 \cdot k^2 = k^2 + k^2 \geq k^2 + 5k = k^2 + 2k + 3k \geq k^2 + 2k + 15 > k^2 + 2k + 1 = (k+1)^2.$$

By the principle of mathematical induction, this is true for every integer n ≥ 5.
(b) Let x be a real number. Prove via mathematical induction that for every positive integer n, we have
$$(1+x)^n = \sum_{i=0}^{n} \binom{n}{i} x^i,$$
where
$$\binom{n}{i} = \frac{n!}{i!(n-i)!}.$$
We use the standard convention 0! = 1, so $\binom{n}{0} = \binom{n}{n} = 1$ for every positive integer n.

Hint: You can use the following fact without justification: for every 1 ≤ i ≤ n,
$$\binom{n}{i} + \binom{n}{i-1} = \binom{n+1}{i}.$$

Solution:
We will use the identity from the hint to show (via mathematical induction) that
$$(1+x)^n = \sum_{i=0}^{n} \binom{n}{i} x^i.$$

• Base Case.
Let n = 1. Then $(1+x)^1 = \binom{1}{0} x^0 + \binom{1}{1} x^1 = \sum_{i=0}^{1} \binom{1}{i} x^i$.

• Induction Hypothesis.
Assume that the property holds for some positive integer k. That is,
$$(1+x)^k = \sum_{i=0}^{k} \binom{k}{i} x^i.$$

• Inductive Step.
We must show that the property holds for k + 1:
$$(1+x)^{k+1} = (1+x)(1+x)^k \overset{\text{I.H.}}{=} (1+x) \sum_{i=0}^{k} \binom{k}{i} x^i = \sum_{i=0}^{k} \binom{k}{i} x^i + \sum_{i=0}^{k} \binom{k}{i} x^{i+1}$$
$$= \sum_{i=0}^{k} \binom{k}{i} x^i + \sum_{i=1}^{k+1} \binom{k}{i-1} x^i = \binom{k}{0} x^0 + \sum_{i=1}^{k} \left( \binom{k}{i} + \binom{k}{i-1} \right) x^i + \binom{k}{k} x^{k+1}$$
$$= \binom{k+1}{0} x^0 + \sum_{i=1}^{k} \binom{k+1}{i} x^i + \binom{k+1}{k+1} x^{k+1} = \sum_{i=0}^{k+1} \binom{k+1}{i} x^i.$$

By the principle of mathematical induction, this is true for every positive integer n.
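The binomial theorem can be sanity-checked numerically for a few values of n and x:

```python
# Compare (1+x)^n with sum_i C(n, i) x^i for several n and x (not a proof).
from math import comb, isclose

for n in range(1, 10):
    for x in [-1.5, -0.3, 0.0, 0.7, 2.0]:
        lhs = (1 + x) ** n
        rhs = sum(comb(n, i) * x ** i for i in range(n + 1))
        assert isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-9)
```
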
Exercise 2.2 Growth of Fibonacci numbers (1 point).
There are a lot of neat properties of the Fibonacci numbers that can be proved by induction. Recall that the Fibonacci numbers are defined by $f_0 = 0$, $f_1 = 1$ and the recursion relation $f_{n+1} = f_n + f_{n-1}$ for all n ≥ 1. For example, $f_2 = 1$, $f_5 = 5$, $f_{10} = 55$, $f_{15} = 610$.
(a) Prove that $f_{n+1} \leq 1.75^n$ for n ≥ 0.
Solution:
• Base Case. We prove that the inequality holds for n = 0 and n = 1.
For n = 0: $f_1 = 1 \leq 1.75^0 = 1$, which is true. For n = 1: $f_2 = 1 \leq 1.75^1 = 1.75$, which is true.
• Induction Hypothesis. We assume that it is true for n = k and n = k + 1, i.e.,
$$f_{k+1} \leq 1.75^k, \qquad f_{k+2} \leq 1.75^{k+1}.$$
• Inductive Step. We must show that the property holds for n = k + 2, k ≥ 0. We have:
$$f_{k+3} = f_{k+2} + f_{k+1} \leq 1.75^{k+1} + 1.75^k = 1.75^k (1.75 + 1) = 1.75^k \cdot 2.75 \leq 1.75^k \cdot 3.0625 = 1.75^k \cdot 1.75^2 = 1.75^{k+2}.$$

By the principle of mathematical induction, this is true for every integer n ≥ 0.


(b) Prove that $f_n \geq \frac{1}{3} \cdot 1.5^n$ for n ≥ 1.
Solution:
• Base Case. We prove that the inequality holds for n = 1 and n = 2.
For n = 1: $f_1 = 1 \geq 0.5 = \frac{1}{3} \cdot 1.5$, which is true.
For n = 2: $f_2 = 1 \geq 0.75 = \frac{1}{3} \cdot 1.5^2$, which is true.
• Induction Hypothesis. We assume that it is true for n = k and n = k + 1, i.e.,
$$f_k \geq \frac{1}{3} \cdot 1.5^k, \qquad f_{k+1} \geq \frac{1}{3} \cdot 1.5^{k+1}.$$
• Inductive Step. We must show that the property holds for n = k + 2, k ≥ 1. We have:
$$f_{k+2} = f_{k+1} + f_k \geq \frac{1}{3} 1.5^{k+1} + \frac{1}{3} 1.5^k = \frac{1}{3} 1.5^k (1.5 + 1) = \frac{1}{3} 1.5^k \cdot 2.5 \geq \frac{1}{3} 1.5^k \cdot 2.25 = \frac{1}{3} 1.5^k \cdot 1.5^2 = \frac{1}{3} 1.5^{k+2}.$$

By the principle of mathematical induction, this is true for every integer n ≥ 1.
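Both bounds from (a) and (b) can be checked numerically for the first few Fibonacci numbers:

```python
# Check f_{n+1} <= 1.75^n (part a) and f_n >= 1.5^n / 3 (part b) for small n.
fib = [0, 1]
for _ in range(40):
    fib.append(fib[-1] + fib[-2])

assert fib[10] == 55 and fib[15] == 610  # examples from the exercise
for n in range(40):
    assert fib[n + 1] <= 1.75 ** n
    if n >= 1:
        assert fib[n] >= 1.5 ** n / 3
```
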

Asymptotic Notation
When we estimate the number of elementary operations executed by algorithms, it is often useful to ignore constant factors and instead use the following kind of asymptotic notation, also called O-notation. We denote by $\mathbb{R}^+$ the set of all (strictly) positive real numbers and by $\mathbb{N}$ the set of all (strictly) positive integers.

Definition 1 (O-Notation). Let $n_0 \in \mathbb{N}$, $N := \{n_0, n_0 + 1, \ldots\}$ and let $f : N \to \mathbb{R}^+$. O(f) is the set of all functions $g : N \to \mathbb{R}^+$ such that there exists C > 0 such that for all $n \in N$, $g(n) \leq C f(n)$.

In general, we say that g ≤ O(f) if Definition 1 applies after restricting the domain to some $N = \{n_0, n_0 + 1, \ldots\}$. Some sources use the notation g = O(f) or g ∈ O(f) instead.
Instead of working with this definition directly, it is often easier to use limits in the way provided by the following theorem.

Theorem 1 (Theorem 1.1 from the script). Let $f : N \to \mathbb{R}^+$ and $g : N \to \mathbb{R}^+$.

• If $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$, then f ≤ O(g) and g ≰ O(f).
• If $\lim_{n \to \infty} \frac{f(n)}{g(n)} = C \in \mathbb{R}^+$, then f ≤ O(g) and g ≤ O(f).
• If $\lim_{n \to \infty} \frac{f(n)}{g(n)} = \infty$, then f ≰ O(g) and g ≤ O(f).

The theorem holds all the same if the functions are defined on $\mathbb{R}^+$ instead of N. In general, $\lim_{n \to \infty} \frac{f(n)}{g(n)}$ is the same as $\lim_{x \to \infty} \frac{f(x)}{g(x)}$ if the second limit exists.

The following theorem can also be helpful when working with O-notation.

Theorem 2. Let $f, g, h : N \to \mathbb{R}^+$. If f ≤ O(h) and g ≤ O(h), then
1. For every constant c ≥ 0, c · f ≤ O(h).
2. f + g ≤ O(h).

Notice that for all real numbers a, b > 1, $\log_a n = \log_a b \cdot \log_b n$ (where $\log_a b$ is a positive constant). Hence $\log_a n \leq O(\log_b n)$. So you don't have to write bases of logarithms in asymptotic notation; that is, you can just write O(log n).

Exercise 2.3 O-notation quiz.


(a) Prove or disprove the following statements. Justify your answer.
(1) $n^{\frac{2n+3}{n+1}} = O(n^2)$
Solution:
True by Theorem 1, since
$$\lim_{n \to \infty} \frac{n^{\frac{2n+3}{n+1}}}{n^2} = \lim_{n \to \infty} n^{\frac{2n+3}{n+1} - 2} = \lim_{n \to \infty} n^{\frac{2n+3-2n-2}{n+1}} = \lim_{n \to \infty} n^{\frac{1}{n+1}} = \lim_{n \to \infty} e^{\frac{\ln n}{n+1}} = 1.$$

(2) $e^{1.2n} = O(e^n)$

Solution:
False by Theorem 1, since
$$\lim_{n \to \infty} \frac{e^{1.2n}}{e^n} = \lim_{n \to \infty} e^{1.2n - n} = \lim_{n \to \infty} e^{0.2n} = \infty.$$

(3) $\log(n^4 + n^3 + n^2) = O(\log(n^3 + n^2 + n))$

Solution:
True by Theorem 1, since
$$\lim_{n \to \infty} \frac{\log(n^3 + n^2 + n)}{\log(n^4 + n^3 + n^2)} \overset{\text{L'Hôpital}}{=} \lim_{n \to \infty} \frac{\frac{3n^2 + 2n + 1}{n^3 + n^2 + n}}{\frac{4n^3 + 3n^2 + 2n}{n^4 + n^3 + n^2}} = \lim_{n \to \infty} \frac{(3n^2 + 2n + 1)(n^4 + n^3 + n^2)}{(n^3 + n^2 + n)(4n^3 + 3n^2 + 2n)}$$
$$= \lim_{n \to \infty} \frac{3n^6 + P(n)}{4n^6 + Q(n)} = \lim_{n \to \infty} \left( \frac{3n^6}{4n^6 + Q(n)} + \frac{P(n)}{4n^6 + Q(n)} \right) = \frac{3}{4} + \lim_{n \to \infty} \frac{P(n)}{4n^6 + Q(n)},$$
where deg P = deg Q = 5. For all α and k ≤ 5, $\frac{\alpha n^k}{4n^6 + Q(n)} \leq \frac{\alpha n^k}{4n^6} = \frac{\alpha}{4} n^{k-6} \to 0$, hence $\frac{P(n)}{4n^6 + Q(n)} \to 0$ by Theorem 2. Therefore,
$$\lim_{n \to \infty} \frac{\log(n^3 + n^2 + n)}{\log(n^4 + n^3 + n^2)} = \frac{3}{4},$$
and the claim follows from the second point of Theorem 1.

(b) Find f and g as in Theorem 1 such that f = O(g), but the limit $\lim_{n \to \infty} \frac{f(n)}{g(n)}$ does not exist. This proves that Theorem 1 provides sufficient, but not necessary, conditions for f = O(g).

Solution:
Let $f(n) = 2 + (-1)^n$ and g(n) = 1. We have $\frac{f(n)}{g(n)} = 2 + (-1)^n$, which oscillates between 1 and 3 and hence has no limit as n → ∞; nevertheless $f(n) \leq 3 = 3 \cdot g(n)$ for all n, so f ≤ O(g).
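A numeric illustration with the oscillating function f(n) = 2 + (−1)ⁿ (shifted by a constant so that f takes strictly positive values, as the definitions require):

```python
# f(n)/g(n) oscillates between 1 and 3 (no limit), yet f(n) <= 3*g(n) for all n,
# so f = O(g) even though Theorem 1 does not apply.
f = lambda n: 2 + (-1) ** n
g = lambda n: 1
ratios = [f(n) / g(n) for n in range(1, 11)]
assert ratios == [1, 3, 1, 3, 1, 3, 1, 3, 1, 3]
assert all(f(n) <= 3 * g(n) for n in range(1, 1000))
```
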

Exercise 2.4 Asymptotic growth of ln(n!).

Recall that the factorial of a positive integer n is defined as n! = 1 × 2 × · · · × (n − 1) × n.
a) Show that ln(n!) ≤ O(n ln n).
Hint: You can use the fact that $n! \leq n^n$ for n ≥ 1 without proof.
Solution:
From the hint, we have $n! \leq n^n$, which implies that ln(n!) ≤ n ln n and thus ln(n!) ≤ O(n ln n).
b) Show that n ln n ≤ O(ln(n!)).
Hint: You can use the fact that $\left( \frac{n}{2} \right)^{n/2} \leq n!$ for n ≥ 1 without proof.
Solution:
From the hint, we have $n! \geq \left( \frac{n}{2} \right)^{n/2}$. Now by the monotonicity of the logarithm we have
$$\ln(n!) \geq \ln \left( \left( \frac{n}{2} \right)^{n/2} \right) = \frac{n}{2} \left( \ln n - \ln 2 \right).$$
For n ≥ 4 we have $\ln n \geq 2 \ln 2$, so $\ln n - \ln 2 \geq \frac{1}{2} \ln n$, and hence $\ln(n!) \geq \frac{n}{4} \ln n$, i.e. $n \ln n \leq 4 \ln(n!)$ for all n ≥ 4. By Definition 1, n ln n ≤ O(ln(n!)).
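Numerically, ln(n!) indeed sits between (n/4) ln n and n ln n for moderate n; this is a quick look, not part of the solution, using `math.lgamma` (where lgamma(n+1) = ln(n!)):

```python
# Sandwich check: (n/4) ln n <= ln(n!) <= n ln n for a few values of n >= 4.
import math

for n in [4, 16, 64, 256, 1024]:
    log_fact = math.lgamma(n + 1)  # ln(n!) without computing n! exactly
    assert n / 4 * math.log(n) <= log_fact <= n * math.log(n)
```
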

Exercise 2.5 Triplet Search (2 points).


Given an array of n integers and an integer t, design an algorithm that checks whether there exist three (not necessarily distinct) elements a, b, c of the array such that a + b + c = t.
(a) Design a simple O(n3 ) algorithm.
Solution:
The algorithm can simply check all n3 triples (A[i], A[j], A[k]) of elements in A by using three
nested loops with indices (i, j, k) that iterate over all integers in [1, n]. For each such triple, we
check whether A[i] + A[j] + A[k] = t and report success (“YES”) if we ever find a satisfying triple.
Otherwise, we return failure (“NO”). The pseudocode is given below.

Algorithm 1 Input: an array A of n integers, and an integer t.


for i = 1, 2, . . . , n do
for j = 1, 2, . . . , n do
for k = 1, 2, . . . , n do
if A[i] + A[j] + A[k] = t then
return “YES” and exit
return “NO”

The algorithm clearly works by checking all n³ possibilities, hence it is trivially correct and its runtime is clearly O(n³).
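For concreteness, a direct Python translation of Algorithm 1 (0-indexed, names illustrative):

```python
# O(n^3) brute force: try every triple of indices (with repetition).
def triplet_sum_cubic(A, t):
    n = len(A)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if A[i] + A[j] + A[k] == t:
                    return True  # "YES"
    return False  # "NO"

assert triplet_sum_cubic([1, 5, 9], 15) is True   # 1 + 5 + 9 = 15
assert triplet_sum_cubic([1, 5, 9], 16) is False
```
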

(b) Suppose that the elements of the array are integers in the range [1, 100n], and that t ≤ 300n. Design a better algorithm with runtime O(n²) to solve the same problem, assuming the constraints.
Hint: You can use a separate array with O(n) entries to help you. Start with the “naive” algorithm
from (a) and try removing one of the loops with a smart lookup using the new array.
Hint: a + b + c = t implies that a = t − b − c.
Solution:
We use a separate O(n)-sized array B[1 . . . 100n], which is originally initialized to value 0. First,
in O(n) time, we mark every entry that appears in A with a value of 1 in B: more precisely, we set
B[A[i]] ← 1 for all i ∈ {1, . . . , n}. Then, we use two nested loops i, j to iterate over all possible n² pairs of elements from the array A. In each iteration, we check whether there exists an element
A[k] such that A[i] + A[j] + A[k] = t. In other words, we check if t − A[i] − A[j] is in the array,
which can be accomplished in O(1) time by checking if t−A[i]−A[j] fits within the range [1, 100n]
and then if B[t − A[i] − A[j]] = 1; if both of these are true, we output “Yes”. At the end of the
algorithm, if we didn’t output yet, we output “No”. A pseudocode equivalent to the above algorithm
is given below.

Algorithm 2 Input: an array A of n integers, and an integer t.


B[1 . . . 100n] ← (0, 0, . . . , 0)
for i = 1, 2, . . . , n do
B[A[i]] ← 1
for i = 1, 2, . . . , n do
for j = 1, 2, . . . , n do
if 1 ≤ t − A[i] − A[j] ≤ 100n AND B[t − A[i] − A[j]] = 1 then
return “YES” and exit
return “NO”

It is clear from the algorithm that the runtime is O(n²): the initialization of B takes O(n), and then each iteration over the n² pairs takes O(1) time, for a total of O(n) + n² · O(1) = O(n²) time.
Finally, we argue correctness. If there are no satisfying triplets in the input, the algorithm clearly
outputs “NO” as whenever the algorithm outputs “YES” it finds a satisfying triplet in the input
(namely, A[i], A[j], and some A[k] = t − A[i] − A[j] which is guaranteed to exist by algorithm
design). If there exists a satisfying triplet (say) i∗ , j ∗ , k ∗ , then during the nested loop iteration at
some point we will have i = i∗ and j = j ∗ . At that point, t − A[i∗ ] − A[j ∗ ] is within [1, 100n] and
t − A[i∗ ] − A[j ∗ ] exists in the array A (namely, as A[k ∗ ]), hence it would be found. This concludes
the correctness proof.
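A Python sketch of Algorithm 2 under the same assumptions (elements in [1, 100n]); arrays are 0-indexed here, unlike the pseudocode:

```python
# O(n^2) with a lookup table: for each pair (i, j), test membership of
# t - A[i] - A[j] in O(1) via the `present` array (B in the pseudocode).
def triplet_sum_quadratic(A, t):
    n = len(A)
    present = [False] * (100 * n + 1)  # B[1..100n], assuming A's values fit
    for a in A:
        present[a] = True
    for i in range(n):
        for j in range(n):
            c = t - A[i] - A[j]
            if 1 <= c <= 100 * n and present[c]:
                return True  # "YES"
    return False  # "NO"

assert triplet_sum_quadratic([1, 5, 9], 15) is True
assert triplet_sum_quadratic([1, 5, 9], 16) is False
```
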
(c)* Suppose now that, unlike in (b), we don’t have a bound on the size of the integers elements of A
nor on t (but we can still perform arithmetic operations on them in O(1) time). However, they are
given in increasing order in A, i.e., A[1] ≤ A[2] ≤ · · · ≤ A[n]. Design an O(n²) algorithm to solve
the same problem under these assumptions.
Hint: Exploit the increasing order of A to leverage the computation done in the previous step to help
you in the next one.
Solution:
We use two nested loops i, j to iterate over all possible n² pairs of elements from the array A,
in order. However, in parallel to j, we will also store a value k satisfying the invariant that k
is the smallest index in [1, n] such that A[i] + A[j] + A[k] ≥ t (or k = n if no such index exists). Each time we increment j we
need to keep decreasing k until the invariant would start failing. At that point, we check whether
A[i]+A[j]+A[k] = t and report we found a satisfying triple if this condition is ever true. Otherwise,
we report failure at the end. The pseudocode implementing this algorithm follows.

Algorithm 3 Input: an array A of n integers, and an integer t.


for i = 1, 2, . . . , n do
k←n
for j = 1, 2, . . . , n do
while k ≥ 2 and A[i] + A[j] + A[k − 1] ≥ t do . Can we decrease k and keep the invariant?
k ←k−1
if A[i] + A[j] + A[k] = t then
return “YES” and exit
return “No”

The above invariant is kept true (after the while loop terminates) since each time we increment
j, the value of A[i] + A[j] + A[k] increases, hence we need to decrease k a (possibly zero)
number of times until decreasing it any further would make A[i] + A[j] + A[k] < t. The while loop
directly implements this, hence the invariant is clearly satisfied.
For correctness, if there are no satisfying triplets, the algorithm cannot find any (since the algorithm
outputs “YES” only upon explicitly finding a satisfying triplet). On the other hand, if there exists
a satisfying triplet i∗ , j ∗ , k ∗ such that A[i∗ ] + A[j ∗ ] + A[k ∗ ] = t, then at some point we will have
i = i∗ , j = j ∗ . Then, by the invariant, k will be the smallest index with A[i∗ ] + A[j ∗ ] + A[k] ≥ t.
Since there exists a value k ∗ such that A[i∗ ] + A[j ∗ ] + A[k ∗ ] = t and A is sorted, this smallest
index k must also satisfy A[i∗ ] + A[j ∗ ] + A[k] = t. This proves the correctness of the algorithm.
For the runtime, there are at most n iterations of the i loop. In each such iteration of i, k only
decreases, starting from n, hence the total number of decrements is at most n (for each fixed
value of i). Hence, the total number of times the while loop iterates (in the entire program) is
O(n²). All other operations besides the while loop also take O(n²) time, hence the runtime is O(n²).
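A Python sketch of this algorithm (our transcription; with 0-based indexing, the guard on k becomes k ≥ 1):

```python
def three_sum_sorted(A, t):
    """O(n^2) check for A[i] + A[j] + A[k] = t in a sorted array A;
    indices may coincide, as in the pseudocode above."""
    n = len(A)
    for i in range(n):
        k = n - 1                    # candidate third index, per fixed i
        for j in range(n):
            # Decrease k while doing so keeps A[i] + A[j] + A[k] >= t.
            while k >= 1 and A[i] + A[j] + A[k - 1] >= t:
                k -= 1
            if A[i] + A[j] + A[k] == t:
                return True
    return False
```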


Departement of Computer Science 10 October 2022


Markus Püschel, David Steurer
François Hublet, Goran Zuzic, Tommaso d’Orsi, Jingqiu Ding

Algorithms & Data Structures Exercise sheet 3 HS 22

The solutions for this sheet are submitted at the beginning of the exercise class on 17 October 2022.
Exercises/questions marked by ∗ are “challenge exercises”. They do not count towards bonus points.
You can use results from previous parts without solving those parts.

Exercise 3.1 Some properties of O-Notation.


Let f : R+ → R+ and g : R+ → R+ .
(a) Show that if f ≤ O(g), then f² ≤ O(g²).
Solution:
Assume that f ≤ O(g). Then we can find T, C ∈ R+ such that for all x ≥ T , we have f (x) ≤
C · g(x). For all x ≥ T , we get f²(x) = f (x) · f (x) ≤ (C · g(x)) · (C · g(x)) = C² · g²(x), hence
f² ≤ O(g²).
(b) Does f ≤ O(g) imply 2^f ≤ O(2^g)? Prove it or provide a counterexample.
Solution:
The implication does not hold.
Consider f (n) = 2n, g(n) = n. Obviously, f ≤ O(g). However,

lim_{n→∞} 2^{f(n)} / 2^{g(n)} = lim_{n→∞} 2^{2n} / 2^n = lim_{n→∞} 2^n = ∞ ,

hence by Theorem 1 of Exercise sheet 1, 2^f ≰ O(2^g).


Another important example is f(n) = log₂ n and g(n) = log₄ n. As we already showed, f ≤ O(g).

However, 2^{f(n)} = n and 2^{g(n)} = √n, so by Theorem 1 of Exercise sheet 1, 2^f ≰ O(2^g).

Exercise 3.2 Substring counting (1 point).


Given an n-bit bitstring S (an array over {0, 1} of size n) and an integer k ≥ 0, we would like to count
the number of nonempty substrings of S with exactly k ones. For example, when S = “0110” and
k = 2, there are 4 such substrings: “011”, “11”, “110”, and “0110”.
(a) Design a “naive” algorithm that solves this problem with a runtime of O(n3 ). Justify its runtime
and correctness.
Solution:
We can for example use the following algorithm:
Algorithm 1 Naive substring counting
c←0 . Initialize counter of substrings with k ones
for i ← 0, . . . , n − 1 do . Enumerate all nonempty substrings S[i..j]
for j ← i, . . . , n − 1 do
x←0 . Initialize counter of ones
for ℓ ← i, . . . , j do . Count ones in substring
if S[ℓ] = 1 then
x←x+1
if x = k then . If there are k ones in substring, increment c
c←c+1
return c . Return number of substrings with k ones

We perform at most n iterations of each loop, leading to a total runtime in O(n³). The correctness
directly follows from the description of the algorithm (see comments above).
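A direct Python version of this naive algorithm (a sketch; the function name is ours):

```python
def count_substrings_naive(S, k):
    """Count nonempty substrings of bitstring S with exactly k ones, in O(n^3)."""
    n = len(S)
    c = 0
    for i in range(n):
        for j in range(i, n):                  # enumerate all substrings S[i..j]
            ones = sum(1 for l in range(i, j + 1) if S[l] == "1")
            if ones == k:
                c += 1
    return c
```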
(b) We say that a bitstring S 0 is a (non-empty) prefix of a bitstring S if S 0 is of the form S[0..i] where
0 ≤ i < length(S). For example, the prefixes of S = “0110” are “0”, “01”, “011” and “0110”.
Given an n-bit bitstring S, we would like to compute a table T indexed by 0..n such that for all i,
T [i] contains the number of prefixes of S with exactly i ones.
For example, for S = “0110”, the desired table is T = [1, 1, 2, 0, 0], since, of the 4 prefixes of S, one
prefix contains zero “1”s, one prefix contains one “1”, two prefixes contain two “1”s, and no prefix
contains three or four “1”s.
Describe an algorithm prefixtable that computes T from S in time O(n), assuming S has size n.
Solution:

Algorithm 2
function prefixtable(S)
T ← int[n + 1]
s←0
for i ← 0, . . . , n − 1 do
s ← s + S[i]
T [s] ← T [s] + 1
return T

Remark: This algorithm can also be applied on a reversed bitstring to compute the same table for
all suffixes of S. In the following, you can assume an algorithm suffixtable that does exactly this.
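A Python sketch of prefixtable, together with the suffixtable variant obtained by reversing the string (function names follow the pseudocode; the transcription is ours):

```python
def prefixtable(S):
    """T[i] = number of nonempty prefixes of bitstring S with exactly i ones; O(n)."""
    T = [0] * (len(S) + 1)
    s = 0                       # number of ones in the current prefix
    for c in S:
        s += int(c)
        T[s] += 1
    return T

def suffixtable(S):
    """The same table for the nonempty suffixes of S."""
    return prefixtable(S[::-1])
```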
(c) Let S be an n-bit bitstring. Consider an integer m ∈ {0, . . . , n − 1}, and divide bitstring S into two
substrings S[0..m] and S[m+1..n−1]. Using prefixtable and suffixtable, describe an algorithm
spanning(m, k, S) that returns the number of substrings S[i..j] of S that have exactly k ones and
such that i ≤ m < j. What is its complexity?
For example, if S = “0110”, k = 2, and m = 0, there exist exactly two such strings: “011” and
“0110”. Hence, spanning(m, k, S) = 2.
Hint: Each substring S[i..j] with i ≤ m < j can be obtained by concatenating a string S[i..m] that
is a suffix of S[0..m] and a string S[m + 1..j] that is a prefix of S[m + 1..n − 1].

Solution:
Each substring S[i..j] with i ≤ m < j is obtained by concatenating a string S[i..m] that is a suffix
of S[0..m] and a string S[m + 1..j] that is a prefix of S[m + 1..n − 1], such that the numbers of “1”
in S[i..m] and S[m + 1..j] sum up to k. Moreover, from each S[i..m] that contains p ≤ k ones, we
can build as many different sequences S[i..j] with k ones as there are substrings S[m + 1..j] with
k − p ones. We obtain the following algorithm:

Algorithm 3
function spanning(m, k, S)
T1 ← suffixtable(S[0..m])
T2 ← prefixtable(S[m + 1..n − 1])
return ∑_{p=max(0, k−(n−m−1))}^{min(k,m)} T1[p] · T2[k − p]

The complexity of this algorithm is O(n).


*(d) Using spanning, design an algorithm with a runtime of at most O(n log n) that counts the number
of nonempty substrings of an n-bit bitstring S with exactly k ones. (You can assume that n is a power
of two.)
Hint: Use the recursive idea from the lecture.
Solution:
Whenever n ≥ 2, we can distinguish between:
• Substrings with k ones located entirely in the first half of the bitstring, which we compute
recursively;
• Substrings with k ones located entirely in the second half of the bitstring, which we also
compute recursively;
• Substrings with k ones that span the two halves, which we can count using (c).
We obtain the following algorithm:

Algorithm 4 Clever substring counting


function countsubstr(S, k, i = 0, j = n − 1)
if i = j then
if k = 1 and S[i] = 1 then
return 1
else if k = 0 and S[i] = 0 then
return 1
else
return 0
else
m ← b(i + j)/2c
return countsubstr(S, k, i, m) + countsubstr(S, k, m + 1, j) + spanning(m, k, S)

The complexity of this algorithm is given by a recursive expression of the form A(n) = 2A(n/2) +
O(n), which, as in the lecture, yields a total complexity of O(n log n).
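Putting parts (b)–(d) together gives the following self-contained Python sketch (names and 0-based indexing are ours; inside the recursion, the split point m is translated into the coordinates of the slice S[i..j]):

```python
def prefixtable(S):
    """T[p] = number of nonempty prefixes of S with exactly p ones."""
    T = [0] * (len(S) + 1)
    s = 0
    for c in S:
        s += int(c)
        T[s] += 1
    return T

def suffixtable(S):
    """The same table for the nonempty suffixes of S."""
    return prefixtable(S[::-1])

def spanning(m, k, S):
    """Substrings S[i..j] with exactly k ones and i <= m < j (0-indexed)."""
    T1 = suffixtable(S[:m + 1])       # ones in the suffixes of S[0..m]
    T2 = prefixtable(S[m + 1:])       # ones in the prefixes of S[m+1..n-1]
    return sum(T1[p] * T2[k - p]
               for p in range(len(T1)) if 0 <= k - p < len(T2))

def countsubstr(S, k, i=None, j=None):
    """Nonempty substrings of S[i..j] with exactly k ones, in O(n log n)."""
    if i is None:
        i, j = 0, len(S) - 1
    if i == j:
        return 1 if k == int(S[i]) else 0
    m = (i + j) // 2
    return (countsubstr(S, k, i, m)
            + countsubstr(S, k, m + 1, j)
            + spanning(m - i, k, S[i:j + 1]))
```

On S = “0110” and k = 2 this returns 4, matching the example from the problem statement.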

Exercise 3.3 Counting function calls in loops (1 point).
For each of the following code snippets, compute the number of calls to f as a function of n. Provide
both the exact number of calls and a maximally simplified, tight asymptotic bound in big-O notation.

Algorithm 5
(a) f ()
i←0
while i ≤ n do
f ()
i←i+1

Solution:
This algorithm performs 1 + ∑_{i=0}^{n} 1 = 1 + (n + 1) = n + 2 = O(n) calls to f.

Algorithm 6
(b) i←0
while i² ≤ n do
f ()
f ()
for j ← 1, . . . , n do
f ()
i←i+1

Solution:
This algorithm performs ∑_{i=0}^{⌊√n⌋} (2 + n) = (2 + n)(⌊√n⌋ + 1) = O(n^{1.5}) calls to f.

Exercise 3.4 Fibonacci Revisited (1 point).


In this exercise we continue playing with the Fibonacci sequence.
(a) Write an O(n) algorithm that computes the nth Fibonacci number. As a reminder, Fibonacci num-
bers are a sequence defined as f0 = 0, f1 = 1, and fn+2 = fn+1 + fn for all integers n ≥ 0.
Remark: As shown in the last week’s exercise sheet, f_n grows exponentially (e.g., at least as fast as
Ω(1.5^n)). On a physical computer, working with these numbers often causes overflow issues as they
exceed variables’ value limits. However, for this exercise, you can freely ignore any such issue and
assume we can safely do arithmetic on these numbers.
Solution:

Algorithm 7
F ← int[n + 1]
F [0] ← 0
F [1] ← 1
for i ← 2, . . . , n do
F [i] ← F [i − 2] + F [i − 1]
return F [n]

At the end of iteration i of this algorithm, we have F [j] = fj for all 0 ≤ j ≤ i. Hence, at the end of
the last iteration, F [n] contains fn . Each of the n iterations has complexity O(1), yielding a total
complexity in O(n).
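In Python, the algorithm reads as follows (a sketch; the function name is ours):

```python
def fib(n):
    """Return the n-th Fibonacci number (f0 = 0, f1 = 1) in O(n) time."""
    F = [0] * (n + 1)
    if n >= 1:
        F[1] = 1
    for i in range(2, n + 1):
        F[i] = F[i - 2] + F[i - 1]
    return F[n]
```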
(b) Given an integer k ≥ 2, design an algorithm that computes the largest Fibonacci number fn such
that fn ≤ k. The algorithm should have complexity O(log k). Prove this.
Remark: Typically we express runtime in terms of the size of the input n. In this exercise, the runtime
will be expressed in terms of the input value k.
Hint: Use the bound proved in 2.2.(b).
Solution:
Consider the following algorithm, where we can just assume for now that K is ‘large enough’ so
that no access outside of the valid index range of the array is performed.

Algorithm 8
F ← int[K]
F [0] ← 0
F [1] ← 1
i=1
while F [i] ≤ k do
i←i+1
F [i] ← F [i − 2] + F [i − 1]
return F [i − 1]

After the ith iteration, we have F[j] = f_j for all 0 ≤ j ≤ i. The loop exits when the condition
F[i] = f_i > k is satisfied for the first time, and, in this case, F[i − 1] = f_{i−1} is the largest Fibonacci
number smaller than or equal to k. Using 2.2(b), we have k ≥ f_{i−1} ≥ (1/3) · 1.5^{i−1}, which can be rewritten as
i − 1 ≤ log_{1.5}(3k) = (ln 3 + ln k)/ln 1.5 ≤ 3(2 + ln k) = O(log k). Note that ln x denotes the natural logarithm;

we do not need to specify the base of the logarithm within O-notation since different bases are
equivalent up to constants (and get hidden in the O-notation). Therefore, the inner while loop can
only execute O(log k) iterations. We can choose K = 3(2 + ln k).
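A Python sketch of this algorithm; instead of preallocating an array of size K, it keeps only the last two Fibonacci numbers (which does not change the O(log k) bound):

```python
def largest_fib_upto(k):
    """Largest Fibonacci number f_n with f_n <= k, for k >= 2; O(log k) iterations."""
    prev, cur = 0, 1                   # the pair (f_{i-1}, f_i)
    while cur <= k:
        prev, cur = cur, prev + cur    # slide the window one index forward
    return prev                        # cur just exceeded k
```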
*(c) Given an integer k ≥ 2, consider the following algorithm:

Algorithm 9
while k > 0 do
find the largest n such that fn ≤ k
k ← k − fn

Prove that the loop body is executed at most O(log k) times.


Hint: First, prove that f_{n−1} ≥ (1/2) · f_n for all n.
Solution:

We have that fk = fk−1 + fk−2 for all k ≥ 2. Using fk−2 ≤ fk−1 (for k ≥ 2) we have:

f_k = f_{k−1} + f_{k−2}
≤ f_{k−1} + f_{k−1}
= 2 · f_{k−1} .

The last inequality can be rewritten as f_{k−1} ≥ (1/2) · f_k.


After any single iteration of the loop, the variable k is at least halved: if f_n is the largest Fibonacci
number with f_n ≤ k, then f_{n+1} > k, so f_n ≥ (1/2) · f_{n+1} > k/2 and hence k − f_n < k/2. By
straightforward induction, k must therefore be 0 after at most ⌊log₂ k⌋ + 1 = O(log k) steps.
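The loop of Algorithm 9 can be simulated to observe this bound empirically (a sketch; the function name is ours, and k.bit_length() equals ⌊log₂ k⌋ + 1):

```python
def greedy_fib_steps(k):
    """Repeatedly subtract the largest Fibonacci number <= k (Algorithm 9);
    return the number of loop iterations performed."""
    steps = 0
    while k > 0:
        prev, cur = 0, 1
        while cur <= k:                # find the largest f_n <= k
            prev, cur = cur, prev + cur
        k -= prev
        steps += 1
    return steps
```

Since k at least halves in every iteration, the number of steps never exceeds ⌊log₂ k⌋ + 1.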

Exercise 3.5 Iterative squaring.


In this exercise you are going to develop an algorithm to compute powers a^n, with a ∈ Z and n ∈
N, efficiently. For this exercise, we will treat multiplication of two integers as a single elementary
operation, i.e., for a, b ∈ Z you can compute a · b using one operation.
(a) Assume that n is even, and that you already know an algorithm A_{n/2}(a) that efficiently computes
a^{n/2}, i.e., A_{n/2}(a) = a^{n/2}. Given the algorithm A_{n/2}, design an efficient algorithm A_n(a) that
computes a^n.
Solution:

Algorithm 10 A_n(a)
x ← A_{n/2}(a)
return x · x

(b) Let n = 2^k, for k ∈ N₀. Find an algorithm that computes a^n efficiently. Describe your algorithm
using pseudo-code.
Solution:

Algorithm 11 Power(a, n)
if n = 1 then
return a
else
x ← Power(a, n/2)
return x · x

(c) Determine the number of elementary operations (i.e., integer multiplications) required by your
algorithm for part b) in O-notation. You may assume that bookkeeping operations don’t cost any-
thing. This includes handling of counters, computing n/2 from n, etc.
Solution:
Let T (n) be the number of elementary operations that the algorithm from part b) performs on input
a, n. Then

T (n) ≤ T (n/2) + 1 ≤ T (n/4) + 2 ≤ T (n/8) + 3 ≤ . . . ≤ T (1) + log₂ n ≤ O(log n) .

(d) Let Power(a, n) denote your algorithm for the computation of an from part b). Prove the correctness
of your algorithm via mathematical induction for all n ∈ N that are powers of two.
In other words: show that Power(a, n) = a^n for all n ∈ N of the form n = 2^k for some k ∈ N₀.
Solution:
• Base Case.
Let k = 0. Then n = 1 and Power(a, n) = a = a¹.
• Induction Hypothesis.
Assume that the property holds for some integer k ≥ 0. That is, Power(a, 2^k) = a^{2^k}.
• Inductive Step.
We must show that the property holds for k + 1.
Power(a, 2^{k+1}) = Power(a, 2^k) · Power(a, 2^k) = a^{2^k} · a^{2^k} = a^{2^{k+1}} ,
where the second equality uses the induction hypothesis.

By the principle of mathematical induction, this is true for any integer k ≥ 0 and n = 2^k.
*(e) Design an algorithm that can compute a^n for a general n ∈ N, i.e., n does not need to be a power
of two.
Hint: Generalize the idea from part a) to the case where n is odd, i.e., there exists k ∈ N such that
n = 2k + 1.
Solution:

Algorithm 12 Power(a, n)
if n = 1 then
return a
else
if n is odd then
x ← Power(a, (n − 1)/2)
return x · x · a
else
x ← Power(a, n/2)
return x · x
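A Python sketch of Algorithm 12 (our transcription; note that for odd n, (n − 1)/2 equals n // 2 in integer division):

```python
def power(a, n):
    """Compute a**n with O(log n) multiplications, for n >= 1."""
    if n == 1:
        return a
    x = power(a, n // 2)      # equals (n - 1) // 2 when n is odd
    if n % 2 == 1:
        return x * x * a
    return x * x
```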


Departement of Computer Science 17 October 2022


Markus Püschel, David Steurer
François Hublet, Goran Zuzic, Tommaso d’Orsi, Jingqiu Ding

Algorithms & Data Structures Exercise sheet 4 HS 22

The solutions for this sheet are submitted at the beginning of the exercise class on 24 October 2022.
Exercises that are marked by ∗ are “challenge exercises”. They do not count towards bonus points.
You can use results from previous parts without solving those parts.

Master Theorem. The following theorem is very useful for running-time analysis of divide-and-
conquer algorithms.

Theorem 1 (Master theorem). Let a, C > 0 and b ≥ 0 be constants and T : N → R+ a function such
that for all even n ∈ N,
T (n) ≤ aT (n/2) + C·n^b . (1)
Then for all n = 2^k, k ∈ N,
• If b > log₂ a, T (n) ≤ O(n^b).
• If b = log₂ a, T (n) ≤ O(n^{log₂ a} · log n).
• If b < log₂ a, T (n) ≤ O(n^{log₂ a}).
If the function T is increasing, then the condition n = 2^k can be dropped. If (1) holds with “=”, then we
may replace O with Θ in the conclusion.

This generalizes some results that you have already seen in this course. For example, the (worst-case)
running time of the Karatsuba algorithm satisfies T (n) ≤ 3T (n/2) + 100n, so a = 3 and b = 1 <
log₂ 3, hence T (n) ≤ O(n^{log₂ 3}). Another example is binary search: its running time satisfies T (n) ≤
T (n/2) + 100, so a = 1 and b = 0 = log₂ 1, hence T (n) ≤ O(log n).

Exercise 4.1 Applying Master theorem.


For this exercise, assume that n is a power of two (that is, n = 2^k, where k ∈ {0, 1, 2, 3, 4, . . .}).
a) Let T (1) = 1, T (n) = 4T (n/2) + 100n for n > 1. Using the Master theorem, show that T (n) ≤ O(n²).
Solution:
We can apply Theorem 1 with a = 4, b = 1 and C = 100. In this case, b < log₂ a, and therefore,
by the Master theorem, we have T (n) ≤ O(n^{log₂ a}) = O(n²).
b) Let T (1) = 5, T (n) = T (n/2) + (3/2)n for n > 1. Using the Master theorem, show that T (n) ≤ O(n).
Solution:
We can apply Theorem 1 with a = 1, b = 1 and C = 3/2. In this case, b > log₂ a, and therefore,
by the Master theorem, we have T (n) ≤ O(n^b) = O(n).
c) Let T (1) = 4, T (n) = 4T (n/2) + (7/2)n² for n > 1. Using the Master theorem, show that T (n) ≤
O(n² log n).
Solution:
We can apply Theorem 1 with a = 4, b = 2 and C = 7/2. In this case, b = log₂ a, and therefore,
by the Master theorem, we have T (n) ≤ O(n^{log₂ a} · log n) = O(n² log n).
The following definitions are closely related to O-Notation and are also useful in running time analysis
of algorithms.
Definition 1 (Ω-Notation). Let n0 ∈ N, N := {n0 , n0 + 1, . . .} and let f : N → R+ . Ω(f ) is the set
of all functions g : N → R+ such that f ∈ O(g). One often writes g ≥ Ω(f ) instead of g ∈ Ω(f ).
Definition 2 (Θ-Notation). Let n0 ∈ N, N := {n0 , n0 + 1, . . .} and let f : N → R+ . Θ(f ) is the set
of all functions g : N → R+ such that f ∈ O(g) and g ∈ O(f ). One often writes g = Θ(f ) instead of
g ∈ Θ(f ).

Exercise 4.2 Asymptotic notations.


a) Give the (worst-case) running time of the following algorithms in Θ-Notation.
1) Karatsuba algorithm.
Solution:
Θ(n^{log₂ 3})
2) Binary Search.
Solution:
Θ(log₂ n)
3) Bubble Sort.
Solution:
Θ(n²)
b) (This subtask is from January 2019 exam). For each of the following claims, state whether it is
true or false. You don’t need to justify your answers.

claim                                       true   false

n/log n ≤ O(√n)                              ☐      ☐

log(n!) ≥ Ω(n²)                              ☐      ☐

n^k ≥ Ω(k^n), if 1 < k ≤ O(1)                ☐      ☐

log₃ n⁴ = Θ(log₇ n⁸)                         ☐      ☐

Solution:

n/log n ≤ O(√n): false, since (n/log n)/√n = √n/log n → ∞.

log(n!) ≥ Ω(n²): false, since log(n!) = Θ(n log n).

n^k ≥ Ω(k^n), if 1 < k ≤ O(1): false, since every polynomial grows slower than every exponential with base k > 1.

log₃ n⁴ = Θ(log₇ n⁸): true, since both sides are Θ(log n).

c) (This subtask is from August 2019 exam). For each of the following claims, state whether it is
true or false. You don’t need to justify your answers.

claim                                       true   false

n/log n ≥ Ω(n^{1/2})                         ☐      ☐

log₇(n⁸) = Θ(log₃(n^n))                      ☐      ☐

3n⁴ + n² + n ≥ Ω(n²)                         ☐      ☐

(∗) n! ≤ O(n^{n/2})                          ☐      ☐

Solution:

n/log n ≥ Ω(n^{1/2}): true, since (n/log n)/√n = √n/log n → ∞.

log₇(n⁸) = Θ(log₃(n^n)): false, since the left side is Θ(log n) while the right side is Θ(n log n).

3n⁴ + n² + n ≥ Ω(n²): true.

(∗) n! ≤ O(n^{n/2}): false, as shown below.

Note that the last claim is a challenge: it was one of the hardest tasks of the exam. If you want a
grade of 6, you should be able to solve such exercises.
Solution:
All claims except for the last one are easy to verify using either the theorem about the limit of f(n)/g(n)
or simply the definitions of O, Ω and Θ. Thus, we only present the solution for the last one.
Note that for all n ≥ 1,
n! = 1 · 2 · · · n ≥ ⌈n/10⌉ · · · n ≥ ⌈n/10⌉^{0.9n} ≥ (n/10)^{0.9n} .
Let’s show that (n/10)^{0.9n} grows asymptotically faster than n^{n/2}:

lim_{n→∞} n^{n/2} / (n/10)^{0.9n} = lim_{n→∞} 10^{0.9n} · n^{−0.4n} = lim_{n→∞} (10^{9/4}/n)^{0.4n} = 0 .

Hence it is not true that (n/10)^{0.9n} ≤ O(n^{n/2}), and so it is not true that n! ≤ O(n^{n/2}).

Sorting and Searching.

Exercise 4.3 One-Looped Sort (1 point).


Consider the following pseudocode whose goal is to sort an array A containing n integers.

Algorithm 1 Input: array A[0 . . . n − 1].


i←0
while i < n do
if i = 0 or A[i] ≥ A[i − 1] then:
i←i+1
else
swap A[i] and A[i − 1]
i←i−1

(a) Show the steps of the algorithm on the input A = [10, 20, 30, 40, 50, 25] until termination. Specif-
ically, give the contents of the array A and the value of i after each iteration of the while loop.
Solution:
The initial state of the algorithm is:

A = [*10*, 20, 30, 40, 50, 25]   i = 0

We mark the element A[i] with asterisks for convenience. In the first 5 steps, the algorithm executes i ← i + 1
and gets to the state i = 5 without changing the array.

A = [10, *20*, 30, 40, 50, 25]   i = 1
A = [10, 20, *30*, 40, 50, 25]   i = 2
A = [10, 20, 30, *40*, 50, 25]   i = 3
A = [10, 20, 30, 40, *50*, 25]   i = 4
A = [10, 20, 30, 40, 50, *25*]   i = 5

Then, in the next 3 steps, the algorithm moves the element 25 into its correct sorted position in the
array:

A = [10, 20, 30, 40, *25*, 50]   i = 4
A = [10, 20, 30, *25*, 40, 50]   i = 3
A = [10, 20, *25*, 30, 40, 50]   i = 2

After that, in the next 4 steps, the algorithm again executes i ← i + 1 until i = n and we are done.

A = [10, 20, 25, *30*, 40, 50]   i = 3
A = [10, 20, 25, 30, *40*, 50]   i = 4
A = [10, 20, 25, 30, 40, *50*]   i = 5
A = [10, 20, 25, 30, 40, 50]     i = 6

(b) Explain why the algorithm correctly sorts any input array. Formulate a reasonable loop invariant,
prove it (e.g., using induction), and then conclude using invariant that the algorithm correctly sorts
the array.
Hint: Use the invariant “at the moment when the variable i gets incremented to a new value i = k for
the first time, the first k elements of the array are sorted in increasing order”.
Solution:
We prove the hinted loop invariant by induction.
• Base Case.
After the first while-loop iteration we always have i = 1, and the first element is trivially
sorted.
• Induction Hypothesis.
Assume now that the hypothesis holds for some 1 ≤ k ≤ n: when the variable i is, for the first
time, equal to k, the first k elements are sorted in increasing order.
• Inductive Step.
We must show that the property holds when i becomes k + 1 for the first time.
Suppose i = k for the first time. Examining the algorithm, we see that the algorithm inserts
A[k] into A[0 . . . k] by moving it to the left until it is in its correct place (i.e., its left neighbor
is not larger). This phase uses the same method as a single phase of the InsertionSort algorithm.
This makes the first k + 1 elements sorted, as required. Then, the algorithm increments i until
i = k + 1 (for the first time), proving the claim.
Proving this loop invariant immediately implies that, at termination when i = n, the first n ele-
ments are sorted, meaning that the entire array is sorted.
(c) Give a reasonable running-time upper bound, expressed in O-notation.
Solution:
Consider the above loop invariant for i = 1, 2, . . . , n. For each value k ≥ 1, between the first time
i = k and the first time i = k + 1 there are O(k) in-between steps. Since the algorithm terminates
when i = n, the number of steps required is O(1) + O(2) + O(3) + . . . + O(n − 1) = O(n²). The
final running time is thus upper-bounded by O(n²).
Remark: On a reverse-sorted array, it can be shown that the algorithm takes Ω(n²) steps, hence the
above O(n²) bound cannot be improved.
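A Python sketch of Algorithm 1 (our transcription; this single-loop scheme is also known as gnome sort):

```python
def one_looped_sort(A):
    """Sort the list A in place using the single loop from Algorithm 1."""
    i, n = 0, len(A)
    while i < n:
        if i == 0 or A[i] >= A[i - 1]:
            i += 1                              # in order: move right
        else:
            A[i], A[i - 1] = A[i - 1], A[i]     # out of order: swap and step back
            i -= 1
    return A
```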

Exercise 4.4 Searching for the summit (1 point).

Suppose we are given an array A[1 . . . n] with n unique integers that satisfies the following property.
There exists an integer k ∈ [1, n], called the summit index, such that A[1 . . . k] is a strictly increasing
array and A[k . . . n] is a strictly decreasing array. We say an array is valid if it satisfies the above
property.
(a) Provide an algorithm that finds this k with worst-case running time O(log n). Give the pseudocode
and argue why its worst-case running time is O(log n).
Note: Be careful about edge-cases! It could happen that k = 1 or k = n, and you don’t want to peek
outside of array bounds without taking due care.
Solution:
The summit index can be found using the following algorithm:

Algorithm 2 Find the summit


function findSummitIndex(T , i, j)
m ← b(i + j)/2c
if j = i then
return i
if T [m + 1] < T [m] then . m is right of the summit (or is the summit)
return findSummitIndex(T, i, m) . keep searching in the left half
else . m is strictly left of the summit
return findSummitIndex(T, m + 1, j) . keep searching in the right half
Input: Valid array T of length n with unique elements
Output: findSummitIndex(T, 1, n)

Let A(n) be the worst-case running time of this algorithm on an input array of length n. Then, A(n)
is such that A(n) ≤ A(n/2) + C where C is a constant, since a constant number of operations are
performed before a recursive call is performed on an array half the size. This is A(n) ≤ 1 · A(n/2) +
C · n⁰, hence, by the Master theorem, we have A(n) = O(log n) (case log₂ a = log₂ 1 = 0 = b, yielding
A(n) = O(n^b log n) = O(n⁰ log n) = O(log n)).
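A Python sketch of the summit search with 0-based indexing (the function name follows the pseudocode; the transcription is ours):

```python
def find_summit_index(T, i=None, j=None):
    """Return the index of the maximum of a strictly increasing, then
    strictly decreasing array T, in O(log n) time (0-indexed)."""
    if i is None:
        i, j = 0, len(T) - 1
    if i == j:
        return i
    m = (i + j) // 2
    if T[m + 1] < T[m]:                        # summit is at m or to its left
        return find_summit_index(T, i, m)
    return find_summit_index(T, m + 1, j)      # summit is strictly right of m
```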
(b) Given an integer x, provide an algorithm with running time O(log n) that checks if x appears in the
array or not. Describe the algorithm either in words or pseudocode and argue about its worst-case
running time.
Solution:
Consider the binary search algorithm for sorted integer arrays from the lecture. More precisely,
let the binary search algorithm for arrays sorted in ascending order be denoted by BS↑ , while the
binary search for arrays sorted in descending order is BS↓ . Assume that for c ∈ {↑, ↓}, BSc (T, x)
returns true if x is in T , and false otherwise. These two algorithms have running times O(log n).
We can now use BS↑ , BS↓ , and findSummitIndex as subroutines to find our element:

Algorithm 3 Search in a valid array


Input: Valid integer array T of length n with unique elements, integer x
k ← findSummitIndex(T, 1, n)
k1 ← BS↑ (T [1..k], x) . search in array T [1..k], sorted in ascending order

k2 ← BS↓ (T [k + 1..n], x) . search in array T [k + 1..n], sorted in descending order
Output: k1 or k2

This algorithm runs in time O(log n) + O(log n) + O(log n) = O(log n), since each of the three
subroutines has O(log n) running time.
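The whole search can be sketched in Python as follows (our own self-contained version: the summit search is written iteratively, and one binary-search helper covers both the ascending and the descending half):

```python
def contains(T, x):
    """Check in O(log n) whether x occurs in a valid
    (strictly increasing, then strictly decreasing) array T."""
    i, j = 0, len(T) - 1
    while i < j:                       # find the summit index k
        m = (i + j) // 2
        if T[m + 1] < T[m]:
            j = m
        else:
            i = m + 1
    k = i

    def bsearch(lo, hi, ascending):
        while lo <= hi:
            m = (lo + hi) // 2
            if T[m] == x:
                return True
            if (T[m] < x) == ascending:
                lo = m + 1             # x can only be further right
            else:
                hi = m - 1             # x can only be further left
        return False

    return bsearch(0, k, True) or bsearch(k + 1, len(T) - 1, False)
```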

Exercise 4.5 Counting function calls in loops (cont’d) (1 point).


For each of the following code snippets, compute the number of calls to f as a function of n. Provide
both the exact number of calls and a maximally simplified, tight asymptotic bound in big-O notation.

Algorithm 4
(a) i←0
while 2^i < n do
j←i
while j < n do
f ()
j ←j+1
i←i+1

Solution:
Given i, the inner loop performs ∑_{j=i}^{n−1} 1 = (n − 1) − i + 1 = n − i calls to f. The full algorithm
thus performs ∑_{i=0}^{⌈log₂ n⌉−1} (n − i) = n⌈log₂ n⌉ − ∑_{i=0}^{⌈log₂ n⌉−1} i = n⌈log₂ n⌉ − (⌈log₂ n⌉ − 1)⌈log₂ n⌉/2 =
O(n log n) calls to f.

Algorithm 5
(b) i←n
while i > 0 do
j←0
f ()
while j < n do
f ()
k←j
while k < n do
f ()
k ←k+1
j ←j+1
i ← ⌊i/2⌋

Solution:
Given i and j, the innermost loop performs ∑_{k=j}^{n−1} 1 = n − j calls to f. Hence, the second loop
(guarded by j < n) performs ∑_{j=0}^{n−1} (1 + (n − j)) = ∑_{j=0}^{n−1} ((n + 1) − j) = ∑_{j=2}^{n+1} j = (n+1)(n+2)/2 − 1 =
n(n+3)/2 calls to f. Finally, we observe that, if n ≥ 1, the outermost loop performs exactly
⌊log₂ n⌋ + 1 iterations: writing n = b_ℓ . . . b_0 in binary notation with ℓ = ⌊log₂ n⌋ and b_ℓ = 1, the
variable i contains exactly b_ℓ . . . b_i after i iterations, and is zero after exactly ℓ + 1 of them. Hence,
the full algorithm performs (⌊log₂ n⌋ + 1) · (1 + n(n+3)/2) = O(n² log n) calls to f.
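The closed-form count can be cross-checked by simulating the snippet and counting the calls (a sketch; the function name is ours, and n.bit_length() equals ⌊log₂ n⌋ + 1):

```python
def count_calls(n):
    """Count the calls to f() performed by the code snippet in (b)."""
    calls = 0
    i = n
    while i > 0:
        calls += 1                 # the lone f() in the outer loop
        for j in range(n):
            calls += 1             # f() at the top of the middle loop
            calls += n - j         # the innermost while loop
        i //= 2
    return calls
```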


Departement of Computer Science 24 October 2022


Markus Püschel, David Steurer
François Hublet, Goran Zuzic, Tommaso d’Orsi, Jingqiu Ding

Algorithms & Data Structures Exercise sheet 5 HS 22

The solutions for this sheet are submitted at the beginning of the exercise class on 31 October 2022.
Exercises that are marked by ∗ are “challenge exercises”. They do not count towards bonus points.
You can use results from previous parts without solving those parts.

Exercise 5.1 Heapsort (1 point).


Given the array [0, 7, 2, 8, 4, 6, 3, 1], we want to sort it in ascending order using Heapsort.
(a) Draw the tree interpretation of the array as a heap, before any call of RestoreHeapCondition.
Solution:
        0
      /   \
     7     2
    / \   / \
   8   4 6   3
  /
 1

(b) In the lecture you have learned a method to construct a heap from an unsorted array (see also pages
35–36 in the script). Draw the resulting max heap if this method is applied to the above array.
Solution:
We start from the heap drawn above. The root of the heap is at level 0. Heapifying the subtrees with
roots at level 2 leaves the heap unchanged (the only node at level 2 with a child is 8, and 8 ≥ 1).
Then, heapifying the subtrees with roots at level 1 yields:
        0
      /   \
     8     6
    / \   / \
   7   4 2   3
  /
 1

Finally, heapifying the subtree at the root node yields

        8
      /   \
     7     6
    / \   / \
   1   4 2   3
  /
 0

which corresponds to the array [8, 7, 6, 1, 4, 2, 3, 0].


(c) Sort the above array in ascending order with heapsort, beginning with the heap that you obtained
in (b). Draw the array after each intermediate step in which a key is moved to its final position.
Solution:
We begin with the max heap [8, 7, 6, 1, 4, 2, 3, 0]. We extract the root 8 and put it into the last
position in the array, i.e., we swap 8 with the last element 0, removing 8 from the heap, which
yields

        0
      /   \
     7     6
    / \   / \
   1   4 2   3

We then sift 0 downwards until the heap condition is restored:

        7
      /   \
     4     6
    / \   / \
   1   0 2   3

Now, the array is [7, 4, 6, 1, 0, 2, 3, 8] and contains the one-smaller heap in the front and the sorted
entries in the end.
The array after each subsequent step is as follows; the entries already at their final positions are
listed at the end of each array.
1) Swap 7 and 3: [3, 4, 6, 1, 0, 2, 7, 8]
Sift 3 down: [6, 4, 3, 1, 0, 2, 7, 8]
2) Swap 6 and 2: [2, 4, 3, 1, 0, 6, 7, 8]
Sift 2 down: [4, 2, 3, 1, 0, 6, 7, 8]
3) Swap 4 and 0: [0, 2, 3, 1, 4, 6, 7, 8]
Sift 0 down: [3, 2, 0, 1, 4, 6, 7, 8]
4) Swap 3 and 1: [1, 2, 0, 3, 4, 6, 7, 8]
Sift 1 down: [2, 1, 0, 3, 4, 6, 7, 8]
5) Swap 2 and 0: [0, 1, 2, 3, 4, 6, 7, 8]
Sift 0 down: [1, 0, 2, 3, 4, 6, 7, 8]
6) Swap 0 and 1: [0, 1, 2, 3, 4, 6, 7, 8]
done: [0, 1, 2, 3, 4, 6, 7, 8].
We are done.
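The whole procedure can be sketched in Python (our transcription; sift_down plays the role of RestoreHeapCondition):

```python
def sift_down(A, root, size):
    """Restore the max-heap condition for the subtree rooted at index `root`."""
    while 2 * root + 1 < size:
        child = 2 * root + 1
        if child + 1 < size and A[child + 1] > A[child]:
            child += 1                          # pick the larger child
        if A[root] >= A[child]:
            break
        A[root], A[child] = A[child], A[root]
        root = child

def heapsort(A):
    """Sort A in place: build a max heap, then repeatedly extract the maximum."""
    n = len(A)
    for root in range(n // 2 - 1, -1, -1):      # heapify bottom-up, as in (b)
        sift_down(A, root, n)
    for end in range(n - 1, 0, -1):             # extraction phase, as in (c)
        A[0], A[end] = A[end], A[0]
        sift_down(A, 0, end)
    return A
```

Building the heap from [0, 7, 2, 8, 4, 6, 3, 1] yields [8, 7, 6, 1, 4, 2, 3, 0], as in part (b).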

Exercise 5.2 Sorting algorithms.


Below you see four sequences of snapshots, each obtained in consecutive steps of the execution of
one of the following algorithms: InsertionSort, SelectionSort, QuickSort, MergeSort, and
BubbleSort. For each sequence, write down the corresponding algorithm.

Sequence 1:        Sequence 2:
3 6 5 1 2 4 8 7    3 6 5 1 2 4 8 7
3 6 5 1 2 4 8 7    3 5 1 2 4 6 7 8
3 5 6 1 2 4 8 7    3 1 2 4 5 6 7 8

Sequence 3:        Sequence 4:
3 6 5 1 2 4 8 7    3 6 5 1 2 4 8 7
3 6 1 5 2 4 7 8    1 6 5 3 2 4 8 7
1 3 5 6 2 4 7 8    1 2 5 3 6 4 8 7

Solution:
InsertionSort – BubbleSort – MergeSort – SelectionSort.

Exercise 5.3 Counting function calls in recursive functions (1 point).
For each of the following functions g, h, and k, provide an asymptotic bound in big-O notation on the
number of calls to f as a function of n. You can assume that n is a power of two.

Algorithm 1
(a) function g(n)
i←1
while i < n do
f ()
i←i+2
g(n/2)
g(n/2)
g(n/2)

Solution:
Denoting by G(n) the number of calls to f performed by g(n), we have
G(n) = 3G(n/2) + ⌊n/2⌋ ≤ 3 · G(n/2) + (1/2) · n¹ .

Since log₂ 3 > 1, the Master theorem yields G(n) ≤ O(n^{log₂ 3}) = O(n^{1.58...}).

Algorithm 2
(b) function h(n)
i←1
while i < n do
f ()
i←i+1
k(n)
k(n)
function k(n)
i←2
while i < n do
f ()
i ← i²
h(n/2)

Solution:
First, consider the number of calls to f performed in a call of k(n). Variable i takes the values
2, 2², 2⁴, 2⁸, ..., i.e., (2^{2^j})_{j≥0}. We leave the while loop when 2^{2^j} ≥ n, i.e., when 2^j ≥ log₂ n, or
j ≥ log₂ log₂ n. Hence, the number of iterations is ⌈log₂ log₂ n⌉.
Denoting by H(n) and K(n) respectively the number of calls to f performed by h(n) and k(n),
we have

H(n) = 2K(n) + n − 1
K(n) = H(n/2) + ⌈log₂ log₂ (n − 1)⌉ + 1

4
Injecting the definition of K(n) into the definition of H, we get

H(n) ≤ 2H(n/2) + 2 log₂ log₂ n + n ≤ 2 · H(n/2) + 3 · n¹

and since log₂ 2 = 1, the Master theorem yields H(n) ≤ O(n^{log₂ 2} · log n) = O(n log n). Since
K(n) ≤ H(n/2) + O(log log n), we immediately obtain K(n) ≤ O(n log n) + O(log log n) =
O(n log n) too.

Exercise 5.4 Bubble sort invariant.


Consider the pseudocode of the bubble sort algorithm on an integer array A[1, . . . , n]:

Algorithm 3 BubbleSort(A)
for 1 ≤ i ≤ n do
for 1 ≤ j ≤ n − i do
if A[j] > A[j + 1] then
t ← A[j]
A[j] ← A[j + 1]
A[j + 1] ← t
return A

(a) Formulate an invariant INV(i) that holds at the end of the i-th iteration of the outer for-loop.
Solution:
After i iterations of the outer for-loop, the subarray A[n − i + 1, . . . , n] is sorted and each element
from A[1, . . . , n − i] is not greater than each element from A[n − i + 1, . . . , n].
(b) Using the invariant from part (a), prove the correctness of the algorithm. Specifically, prove the
following three assertions:
(1) INV(1) holds.
(2) If INV(i) holds, then INV(i + 1) holds (for all 1 ≤ i < n).
(3) INV(n) implies that BubbleSort(A) correctly sorts the array A.
Solution:
(1) INV(1) means that after the first iteration of the outer for-loop, the largest element of A is at
position n. Suppose that this largest element was originally at position j for some 1 ≤ j ≤ n.
If j = n, the element will never be swapped by the first inner for-loop, and hence is still at
position n at the end, as desired. For j < n, this largest element will be swapped to position
j + 1 in the j-th iteration of the inner for-loop, and then swapped to position j + 2 in the next
iteration, and so on until it is swapped to position n. So in both cases it is at position n at the
end of the first for-loop.
(2) Let 1 ≤ i < n. Assuming that INV(i) holds, we know that before the (i + 1)st iteration of
the outer for-loop, the i last entries of the array are the i largest entries of the input array A
sorted in ascending order. Using a similar reasoning as in (i), we see that during the (i + 1)st
iteration, the largest element among the remaining part of the array (namely A[1, . . . , n − i])
will be placed at the last position of this remaining part, so that now the i + 1 last entries of
the array are the i+1 largest entries of the input array in ascending order. Therefore, INV(i+1)
holds.

(3) INV(n) means that the “subarray” A[1, . . . , n] is sorted. But this is actually the full array (since
A has length n) returned by BubbleSort(A), which shows that the algorithm correctly sorts
the array A.
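The invariant argument can also be checked mechanically. The following Python sketch (our transcription; 0-indexed, unlike the 1-indexed pseudocode) runs BubbleSort and asserts INV(i) after each iteration of the outer loop:

```python
def bubble_sort(A):
    """Bubble sort as in Algorithm 3, with INV(i) asserted after each outer iteration."""
    n = len(A)
    for i in range(1, n + 1):
        for j in range(n - i):  # compares 0-indexed positions j and j + 1
            if A[j] > A[j + 1]:
                A[j], A[j + 1] = A[j + 1], A[j]
        # INV(i): the last i entries are sorted, and no earlier entry exceeds them
        suffix = A[n - i:]
        assert suffix == sorted(suffix)
        assert all(x <= y for x in A[:n - i] for y in suffix)
    return A

print(bubble_sort([5, 1, 4, 2, 8]))  # → [1, 2, 4, 5, 8]
```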

Exercise 5.5 Guessing a pair of numbers (1 point).


Alice and Bob play the following game:
• Alice selects two integers 1 ≤ a, b ≤ 1000, which she keeps secret
• Then, Alice and Bob repeat the following:
– Bob chooses two integers (a′, b′)
– If a = a′ and b = b′, Bob wins
– If a > a′ and b > b′, Alice tells Bob ‘high!’
– If a < a′ and b < b′, Alice tells Bob ‘low!’
– Otherwise, Alice does not give any clue to Bob
Bob claims that he has a strategy to win this game in 12 attempts at most.
Prove that such a strategy cannot exist.
Hint: Represent Bob’s strategy as a decision tree. Each edge of the decision tree corresponds to one of Alice’s
answers, while each leaf corresponds to a win for Bob.
Hint: After defining the decision tree, you can consider the sequence k0 = 1, kn+1 = 3kn + 1, and prove
that kn = (3^(n+1) − 1)/2. The number of leaves in the decision tree of level n should be related to kn.
Solution:
Bob’s strategy can be represented as follows, where green arrows correspond to a win, red arrows to
‘high!’, blue arrows to ’low!’, and black arrows to the absence of a clue.

[decision tree figure omitted]

Each node of the corresponding tree has four children, of which one (corresponding to Bob winning
the game) has no children of its own, while the three others can have four children with the same structure as
their parent. Denoting by kn the number of leaves in a tree of level n + 1 of the above form, we see
that
k0 = 1 and kn+1 = 3kn + 1 for all n ≥ 0.

We will now prove by induction that, for all n ≥ 0, we have P(n): kn = (3^(n+1) − 1)/2.

Base case: for n = 0, k0 = 1 = (3^(0+1) − 1)/2, hence P(0).

Inductive step: Let n ≥ 0 and assume P(n). Then we have kn+1 = 3kn + 1 = 3 · (3^(n+1) − 1)/2 + 1 =
(3^(n+2) − 3)/2 + 1 = (3^(n+2) − 3 + 2)/2 = (3^(n+2) − 1)/2, hence P(n + 1).
In order for Bob’s strategy to allow him to win for any pair of integers chosen by Alice, the tree repre-
senting his strategy must have at least 1000 · 1000 = 10^6 leaves, which is the number of pairs (a, b) that
Alice can choose. If Bob’s statement were true, we would therefore have k11 ≥ 10^6. Now, k11 = (3^12 − 1)/2 < 10^6,
hence Bob cannot win in at most 12 attempts.
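The recurrence, its closed form, and the final comparison can be cross-checked numerically (a small Python sketch of ours, not part of the official solution):

```python
def k_rec(n):
    """k_0 = 1, k_{n+1} = 3 * k_n + 1, evaluated iteratively."""
    k = 1
    for _ in range(n):
        k = 3 * k + 1
    return k

def k_closed(n):
    """Closed form (3^(n+1) - 1) / 2 proven by induction above."""
    return (3 ** (n + 1) - 1) // 2

assert all(k_rec(n) == k_closed(n) for n in range(20))
print(k_rec(11), k_rec(11) < 10 ** 6)  # → 265720 True
```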


Department of Computer Science 31 October 2022


Markus Püschel, David Steurer
François Hublet, Goran Zuzic, Tommaso d’Orsi, Jingqiu Ding

Algorithms & Data Structures Exercise sheet 6 HS 22

The solutions for this sheet are submitted at the beginning of the exercise class on 07 November 2022.
Exercises that are marked by ∗ are “challenge exercises”. They do not count towards bonus points.
You can use results from previous parts without solving those parts.

Exercise 6.1 Longest ascending subsequence.


The longest ascending subsequence problem is concerned with finding a longest subsequence of a given
array A of length n such that the subsequence is sorted in ascending order. The subsequence does not
have to be contiguous and it may not be unique. For example if A = [1, 5, 4, 2, 8], a longest ascending
subsequence is 1, 5, 8. Other solutions are 1, 4, 8, and 1, 2, 8.
Given is the array:

[19, 3, 7, 1, 4, 15, 18, 16, 14, 6, 5, 10, 12, 19, 13, 17, 20, 8, 14, 11]

Use the dynamic programming algorithm from section 3.2. of the script to find the length of a longest
ascending subsequence and the subsequence itself. Provide the intermediate steps, i.e., DP-table up-
dates, of your computation.
Solution:
The solution is given by a one-dimensional DP table that we update in each round. After round i, the
entry DP [j] contains the smallest possible end value of an ascending sequence of length j that only
uses the first i entries of the array. In each round, we need to update exactly one entry. If there is
no ascending sequence of length j, we mark it by “-”. In order to visualise the algorithm, we display
the table after each round. Note that the algorithm does not create a new array in each round; it just
updates the single value that changes.
length 1 2 3 4 5 6 7 8 9
round 1 19 - - - - - - - -
round 2 3 - - - - - - - -
round 3 3 7 - - - - - - -
round 4 1 7 - - - - - - -
round 5 1 4 - - - - - - -
round 6 1 4 15 - - - - - -
round 7 1 4 15 18 - - - - -
round 8 1 4 15 16 - - - - -
round 9 1 4 14 16 - - - - -
round 10 1 4 6 16 - - - - -
round 11 1 4 5 16 - - - - -
round 12 1 4 5 10 - - - - -
round 13 1 4 5 10 12 - - - -
round 14 1 4 5 10 12 19 - - -
round 15 1 4 5 10 12 13 - - -
round 16 1 4 5 10 12 13 17 - -
round 17 1 4 5 10 12 13 17 20 -
round 18 1 4 5 8 12 13 17 20 -
round 19 1 4 5 8 12 13 14 20 -
round 20 1 4 5 8 11 13 14 20 -
The longest subsequence has length 8, since this is the largest length for which there is an entry in the
table after the final round. To obtain the subsequence itself, we work backwards: The last entry is 20. To
get the second-to-last value, we check out the left neighbor of 20 in the round in which 20 was entered
(round 17), which is 17. Then we go to the left neighbor of 17 in the round in which it entered the table
(round 16), and obtain 13. Continuing in this fashion, we obtain the sequence 1, 4, 5, 10, 12, 13, 17, 20.
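The round-by-round updates above can be implemented with binary search, giving an O(n log n) runtime. The sketch below is our Python rendering of the script's algorithm (all names are ours); it additionally stores predecessor indices so the subsequence itself can be recovered:

```python
from bisect import bisect_left

def longest_ascending_subsequence(A):
    """DP over 'smallest end value per length' (Section 3.2 of the script):
    tails[j] is the smallest end value of an ascending subsequence of length j + 1.
    prev[] stores predecessor indices so the subsequence can be reconstructed."""
    if not A:
        return []
    tails, tails_idx, prev = [], [], [None] * len(A)
    for i, x in enumerate(A):
        j = bisect_left(tails, x)  # first length whose current end value is >= x
        if j == len(tails):
            tails.append(x)
            tails_idx.append(i)
        else:
            tails[j] = x
            tails_idx[j] = i
        prev[i] = tails_idx[j - 1] if j > 0 else None
    seq, i = [], tails_idx[-1]  # walk back from the end of a longest subsequence
    while i is not None:
        seq.append(A[i])
        i = prev[i]
    return seq[::-1]

print(longest_ascending_subsequence([1, 5, 4, 2, 8]))  # → [1, 2, 8] (one optimum)
```

On the array from the exercise, the reconstruction yields 1, 4, 5, 10, 12, 13, 17, 20, matching the backward walk through the table.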

Exercise 6.2 Coin Conversion (1 point).


Suppose you live in a country where the transactions between people are carried out by exchanging
coins denominated in dollars. The country uses coins with k different values, where the smallest coin
has value of b1 = 1 dollar, while other coins have values of b2 , b3 , . . . , bk dollars. You received a bill
for n dollars and want to pay it exactly using the smallest number of coins. Assuming you have an
unlimited supply of each type of coin, define OPT to be the minimum number of coins you need to
pay exactly n dollars. Your task is to calculate OPT. All values n, k, b1 , . . . , bk are positive integers.
Example: n = 17, k = 3 and b = [1, 9, 6], then OPT = 4 because 17 can be obtained via 4 coins as
1 + 1 + 9 + 6. No way to obtain 17 with three or fewer coins exists. (A previous version had a typo
“k = 4” that was corrected to “k = 3”.)
(a) Consider the pseudocode of the following algorithm that “tries” to compute OPT.

Algorithm 1
1: Input: integers n, k and an array b = [1 = b1 , b2 , b3 , . . . , bk ].
2:
3: counter ← 0
4: while n > 0 do
5:     Let b[i] be the largest coin such that b[i] ≤ n.
6:     n ← n − b[i].
7:     counter ← counter + 1
8: Print(“min. number of required coins = ”, counter)

Algorithm 1 does not always produce the correct output. Show an example where the above algorithm
fails, i.e., when the output does not match OPT. Specify what the values of n, k, b are, what
is OPT and what does Algorithm 1 report.
Solution:
Set n = 12, k = 3, b = [1, 9, 6] (this is the same example as above except n = 12). Algorithm 1
returns 4 as it finds the sequence of coins [9, 1, 1, 1]. The correct answer is OPT = 2 because
12 = 6 + 6.
(b) Consider the pseudocode below. Provide an upper bound in O notation that bounds the time it
takes to compute f [n] (it should be given in terms of n and k). Give a short high-level explanation
of your answer. For full points your upper bound should be tight (but you do not have to prove its
tightness).

Algorithm 2
1: Input: integers n, k. Array b = [1 = b1 , b2 , b3 , . . . , bk ].
2:
3: Let f [0 . . . n] be an array of integers.
4: f [0] ← 0 ▷ Terminating condition.
5: for N ← 1 . . . n do
6:     f [N ] ← ∞ ▷ At first, we need ∞ coins. We try to improve upon that.
7:     for i ← 1 . . . k do
8:         if b[i] ≤ N then
9:             val ← 1 + f [N − b[i]] ▷ Use coin b[i]; it remains to optimally pay N − b[i].
10:            f [N ] ← min(f [N ], val)
11: Print(f [n])

Solution:
In the worst case, the algorithm completes in O(n · k) time. There are a total of n different states
f [1], . . . , f [n], and computing the answer for each state takes O(k) time (due to the inner for
loop). Therefore, the total runtime is O(n · k).
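A direct Python transcription of Algorithm 2 (our code, with the same O(n · k) behavior):

```python
def min_coins(n, b):
    """f[N] = minimum number of coins from b (where b contains the value 1)
    that sum to exactly N; returns f[n]."""
    INF = float("inf")
    f = [0] + [INF] * n
    for N in range(1, n + 1):
        for coin in b:  # inner loop over the k coin values
            if coin <= N:
                f[N] = min(f[N], 1 + f[N - coin])
    return f[n]

print(min_coins(17, [1, 9, 6]))  # → 4  (17 = 1 + 1 + 9 + 6)
print(min_coins(12, [1, 9, 6]))  # → 2  (12 = 6 + 6)
```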
(c) Let OPT(N ) be the answer (min. number of coins needed) when n = N . Algorithm 2 (correctly)
computes a function f [N ] that is equal to OPT(N ). Formally prove why this is the case, i.e., why
f [N ] = OP T (N ).
Hint: Use induction to prove the invariant f [n] = OP T (n). Assume the claim holds for all values of
n ∈ {1, 2, . . . , N − 1}. Then show the same holds for n = N .

Solution:
We use induction. For the base of the induction, f [0] = 0 = OP T (0) is trivially correct. Suppose
that f [n] = OP T (n) for all values of n ∈ {1, 2, . . . , N − 1}. It remains to prove the induction
step: that f [N ] is also correct.
Suppose that the optimal way to obtain N is via OP T (N ) = T ∗ coins: N = a1 + a2 + . . . + aT ∗
where ai ∈ {b1 , . . . , bk } for all i. Let x be the index such that a1 = bx . Then, in the inner for
loop (lines 7–10), after the variable i becomes equal to x, we will have that f [N ] ≤ 1 + f [N − bx ].
However, by assumption, we have that f [N − bx ] = OP T (N − bx ) is computed correctly, hence
OP T (N − bx ) ≤ T ∗ − 1 since N = a1 + a2 + . . . + aT ∗ can be rewritten as N − bx = a2 + . . . + aT ∗
(this uses T ∗ − 1 coins). Therefore, f [N ] ≤ 1 + OP T (N − bx ) ≤ 1 + T ∗ − 1 = T ∗ = OP T (N ).
We have shown f [N ] ≤ OP T (N ). It remains to argue that f [N ] ≥ T ∗ . Suppose the latter is not the
case and consider the moment when f [N ] (i.e., its corresponding variable solution) got assigned
a value less than T ∗ . At that moment, we have that 1 + f [N − bi ] < T ∗ . Rewriting, we have that
f [N − bi ] < T ∗ − 1. This means, by assumption and N − bi < N , that there exists a way to pay
N − bi using less than T ∗ − 1 coins. However, this implies that we can then pay N using less than
T ∗ coins: simply pay for N − bi and then use an additional coin bi . This contradicts the choice of
T ∗.
Hence, we proved that f [N ] = T ∗ = OP T (N ). The claim follows by induction.
(d) Rewrite Algorithm 2 to be recursive and use memoization. The running time and correctness should
not be affected.
Solution:

Algorithm 3
1: Input: integers n, k. Array b = [1 = b1 , b2 , b3 , . . . , bk ].
2: Global variable: memo[1 . . . n], initialized to −1.
3:
4: function f (N )
5:     if N = 0 then return 0
6:     if memo[N ] ≠ −1 then return memo[N ]
7:     solution ← ∞
8:     for i ← 1 . . . k do
9:         if b[i] ≤ N then
10:            val ← 1 + f (N − b[i]) ▷ Use coin b[i]; it remains to optimally pay N − b[i].
11:            solution ← min(solution, val) ▷ Check whether this is the best value seen so far.
12:    memo[N ] ← solution
13:    return solution
14:
15: Print(“OPT = ”, f (n))

Exercise 6.3 Longest common subsequence.


Given two arrays, A of length n and B of length m, we want to find their longest common
subsequence and its length. The subsequence does not have to be contiguous. For example, if A =

[1, 8, 5, 2, 3, 4] and B = [8, 2, 5, 1, 9, 3], a longest common subsequence is 8, 5, 3 and its length is 3.
Notice that 8, 2, 3 is another longest common subsequence.
Given are the two arrays:
A = [7, 6, 3, 2, 8, 4, 5, 1]
and
B = [3, 9, 10, 8, 7, 1, 2, 6, 4, 5],
Use the dynamic programming algorithm from Section 3.3 of the script to find the length of a longest
common subsequence and the subsequence itself. Show all necessary tables and information you used
to obtain the solution.
Solution:
As described in the lecture, DP [i, j] denotes the size of the longest common subsequence between the
strings A[1 . . . i] and B[1 . . . j]. Note that we assume that A has indices between 1 and 8, so A[1 . . . 0]
is empty, and similarly for B. Then we get the following DP-table:

0 1 2 3 4 5 6 7 8 9 10
0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 1 1 1 1 1 1
2 0 0 0 0 0 1 1 1 2 2 2
3 0 1 1 1 1 1 1 1 2 2 2
4 0 1 1 1 1 1 1 2 2 2 2
5 0 1 1 1 2 2 2 2 2 2 2
6 0 1 1 1 2 2 2 2 2 3 3
7 0 1 1 1 2 2 2 2 2 3 4
8 0 1 1 1 2 2 3 3 3 3 4
To find some longest common subsequence, we create an array S of length DP [n, m] and then we start
moving from cell (n, m) of the DP table in the following way:
If we are in cell (i, j) and DP [i − 1, j] = DP [i, j], we move to DP [i − 1, j].
Otherwise, if DP [i, j − 1] = DP [i, j], we move to DP [i, j − 1].
Otherwise, by definition of DP table, DP [i − 1, j − 1] = DP [i, j] − 1 and A[i] = B[j], so we assign
S[DP [i, j]] ← A[i] and then we move to DP [i − 1, j − 1].
We stop when i = 0 or j = 0.
Using this procedure we find the following longest common subsequence: S = [7, 6, 4, 5].
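The DP table and the backtracking procedure just described can be transcribed into Python as follows (our sketch; lists are 0-indexed, so A[i − 1] plays the role of A[i]):

```python
def longest_common_subsequence(A, B):
    """DP[i][j] = length of a longest common subsequence of A[1..i] and B[1..j]."""
    n, m = len(A), len(B)
    DP = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if A[i - 1] == B[j - 1]:
                DP[i][j] = DP[i - 1][j - 1] + 1
            else:
                DP[i][j] = max(DP[i - 1][j], DP[i][j - 1])
    # backtracking exactly as described above: prefer up, then left, else diagonal
    S, i, j = [], n, m
    while i > 0 and j > 0:
        if DP[i - 1][j] == DP[i][j]:
            i -= 1
        elif DP[i][j - 1] == DP[i][j]:
            j -= 1
        else:
            S.append(A[i - 1])
            i -= 1
            j -= 1
    return S[::-1]

A = [7, 6, 3, 2, 8, 4, 5, 1]
B = [3, 9, 10, 8, 7, 1, 2, 6, 4, 5]
print(longest_common_subsequence(A, B))  # → [7, 6, 4, 5]
```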

Exercise 6.4 Coin Collection (2 points).


Suppose you are playing a video game where your character’s goal is to collect as many coins as possible in a
two-dimensional m × n grid world (m rows by n columns). The world is given to you as a table
A[1 . . . m][1 . . . n] where each cell is either a coin (denoted as “C”), impassible (denoted as “#”), or
passable without coins (denoted as “.”).

Your character starts at (1, 1) (this cell will always be passable) and, in each turn, can move either
right or down (your choice), or stop at any time (ending the game). Moving right corresponds to
moving from (x, y) → (x, y + 1) and moving down is (x, y) → (x + 1, y). The goal is to determine the
maximum number of coins the player can collect (by moving into a cell).

For example, on the m × n = 5 × 6 grid depicted below, the player can collect 5 coins by following
the solid-red path (not reproduced here); this is the maximum possible, and the answer is 5. A
suboptimal dashed-blue path (also not reproduced) yields 4 coins.

    1 2 3 4 5 6
1   . C C . C .
2   . # C C C #
3   C . . . # #
4   . C # # . .
5   . C . . C .

Remark: Be careful not to peek into an element of the table that is out-of-bounds (i.e., not within [1, m] ×
[1, n]), as this can cause undefined behavior on a real computer.
(a) Write the pseudocode of a recursive function f (x, y) which takes as argument a position of the
character (x, y), and outputs the maximum number of coins that the character can collect if it
started at (x, y) (ignoring all coins it might have previously collected). For example, in the grid
above, f (1, 1) = 5, f (2, 1) = 4, f (5, 5) = 1, f (5, 6) = 0. The function does not need to be
memoized for this subtask.
Solution:

Algorithm 4
1: Input: integers m, n, grid A (seen as global read-only variables).
2:
3: function f (x, y)
4:     coinHere ← 1 if A[x][y] = ”C” and 0 otherwise
5:     ret ← coinHere
6:     if x + 1 ≤ m and A[x + 1][y] ≠ ”#” then
7:         goDown ← coinHere + f (x + 1, y)
8:         ret ← max(ret, goDown)
9:     if y + 1 ≤ n and A[x][y + 1] ≠ ”#” then
10:        goRight ← coinHere + f (x, y + 1)
11:        ret ← max(ret, goRight)
12:    return ret

(b) Prove that your algorithm terminates in finite time (even if possibly exponential in the size of the
input). Prove that the algorithm is correct.
Hint: (This hint is assuming you implemented part (a) in the most natural recursive way.) To prove the
algorithm completes in finite time, observe that x + y only increases and is bounded, hence no infinite
execution paths exist.
Hint: To prove the algorithm is correct, we simply need to prove the invariant which describes f (x, y)
(i.e., the first sentence of part (a)). Assume, by induction, the invariant holds for recursive calls f (x, y)
with strictly larger values of x + y, i.e., for those f (x′, y′) such that x′ + y′ > x + y. Argue that

then it also holds for f (x, y) — we do this by considering the optimal path P ∗ that starts at (x, y) and
consider three cases: if P ∗ ends immediately, if P ∗ initially goes to the right, or it goes down. Using
the inductive hypothesis, argue that in each of those cases f (x, y) becomes a value at least as large
as the number of coins collected on P ∗ . Similarly, by considering the three cases, argue that the final
value cannot be larger than that of P ∗ since otherwise we could find a better P ∗ . This, by induction,
establishes that f (x, y) is always equal to the number of coins on P ∗ .
Solution:
Finite time. Clearly, the parameters (x, y) of the function f satisfy x ∈ {1, 2, . . . , m} and y ∈
{1, 2, . . . , n} since this condition is true when the function is first called and the function ensures
it remains true upon subsequent calls. Furthermore, in each subsequent call of f , the value of x + y
(the sum of values of parameters) strictly increases; since x + y is also bounded within the range
[2, m + n] we conclude that f will eventually terminate.
Correctness. In short: we check all possible paths. Using induction, we prove the invariant that
f (x, y) reports the largest number of coins we can collect by starting at (x, y). Induction hypothesis:
the invariant holds for calls f (x, y) with strictly larger value of x + y. Induction step: let P ∗ be
the optimal path starting at (x, y). If P ∗ stops immediately, clearly f (x, y) is going to (correctly)
return 0/1 based on whether there is a coin on (x, y), hence f (x, y) will return the correct value.
From now on, let us define with val(P ∗ ) the number of coins on P ∗ .
If P∗ initially goes to the right, let P′ be a suffix of P∗ without the first cell (i.e., starting at (x, y +
1)) and let coinHere be 1 if there is a coin at (x, y) and 0 otherwise. By construction, we have
val(P∗) = coinHere + val(P′). Then, consider the call goRight = coinHere + f (x, y + 1): by
induction, f (x, y + 1) reports a value at least as large as val(P′), hence

f (x, y) ≥ goRight = coinHere + f (x, y + 1) ≥ coinHere + val(P′) = val(P∗).

Analogously, the same claim holds if P ∗ initially goes down.


We have proven f (x, y) ≥ val(P∗). We now prove f (x, y) cannot exceed val(P∗). For the sake
of contradiction, suppose f (x, y) > val(P∗). There are 3 cases. Case (1): the final value of f (x, y) was
assigned in line 5 (of Algorithm 4). This is impossible, as then f (x, y) = coinHere ≤ val(P∗),
a contradiction. Case (2): the final value of f (x, y) was assigned in line 11 (i.e., by going right),
i.e., f (x, y) = coinHere + f (x, y + 1). Then, let P′ be the optimal path starting at (x, y + 1).
By the induction hypothesis, we have that f (x, y + 1) = val(P′). But then, we can construct a
path starting at (x, y) that is better than P∗: simply prepend (x, y) to P′, which gives a value of
coinHere + val(P′) = f (x, y) > val(P∗). However, this contradicts the choice of P∗. Finally,
case (3), when the final value of f (x, y) was assigned in line 8 (i.e., going down), is completely
analogous to case (2) and we are done. Hence, f (x, y) cannot contain more coins than the optimal
path P∗. This proves the claim.
(c) Rewrite the pseudocode of the subtask (a), but apply memoization to the above f . Prove that calling
f (1, 1) will, in the worst-case, complete in O(m · n) time.
Solution:

Algorithm 5 (Algorithm 4 with memoization; the differences are lines 2, 5, and 14)
1: Input: integers m, n, grid A (seen as global read-only variables).
2: Global variable: memo[1 . . . m][1 . . . n], initialized to −1.
3:
4: function f (x, y)
5:     if memo[x][y] ≠ −1 then return memo[x][y]
6:     coinHere ← 1 if A[x][y] = ”C” and 0 otherwise
7:     ret ← coinHere
8:     if x + 1 ≤ m and A[x + 1][y] ≠ ”#” then
9:         goDown ← coinHere + f (x + 1, y)
10:        ret ← max(ret, goDown)
11:    if y + 1 ≤ n and A[x][y + 1] ≠ ”#” then
12:        goRight ← coinHere + f (x, y + 1)
13:        ret ← max(ret, goRight)
14:    memo[x][y] ← ret
15:    return ret

Of any two calls to f with the same arguments (x, y) (i.e., two calls f (x1 , y1 ) and
f (x2 , y2 ) where x1 = x2 and y1 = y2 ), at most one can proceed beyond the first if statement:
the second call will short-circuit due to memoization and exit immediately. Therefore, the number
of times the function f proceeds beyond the first “memoization” if is O(m·n). In each such call, the
number of operations excluding recursive calls is O(1), hence we conclude that the total runtime
is O(m · n).
(d) Write the pseudocode for an algorithm that computes the solution in O(m · n) time, but does not
use any recursion. Address the following aspects of your solution:
(a) Definition of the DP table: What are the dimensions of the table DP ? What is the meaning
of each entry?
(b) Computation of an entry: How can an entry be computed from the values of other entries?
(c) Specify the base cases, i.e., the entries that do not depend on others.
(d) Calculation order: In which order can entries be computed so that values needed for each
entry have been determined in previous steps?
(e) Extracting the solution: How can the final solution be extracted once the table has been filled?
(f) Running time: What is the running time of your solution?
(g) Explicitly write out the pseudocode.
Solution:
(a) DP [1 . . . m][1 . . . n]. The entry DP [x][y] corresponds to the maximum number of coins that
the character can collect if it started at (x, y) (ignoring all coins it might have previously
collected).
(b) Each entry DP [x][y] is equal to the maximum of three things: (1) whether there is a coin at
(x, y) (0 if not, 1 if yes; this corresponds to the path stopping here), (2) whether there is a coin
at (x, y) plus DP[x+1][y] (corresponds to continuing the path down), (3) whether there is a
coin at (x, y) plus DP[x][y+1] (corresponds to continuing the path right).

(c) The base case is essentially case (1) from part (b) above: if the path stops at (x, y) we initialize
DP [x][y] with 0 if there is no coin at (x, y) and 1 if there is one.
(d) One way to compute the entries is bottom-to-top: from last to first row in the outer loop, then
from last to first column in the inner loop.
(e) DP [1][1] contains the final solution.
(f) Running time is O(m·n) since there are O(m·n) table entries, and each entry can be computed
in O(1) time.
(g) (see below)

Algorithm 6
1: Input: integers m, n, grid A (seen as global read-only variables).
2:
3: Define a table dp[1 . . . m][1 . . . n].
4: for x ← m downto 1 do
5:     for y ← n downto 1 do
6:         coinHere ← 1 if A[x][y] = ”C” and 0 otherwise
7:         dp[x][y] ← coinHere
8:         if x + 1 ≤ m and A[x + 1][y] ≠ ”#” then
9:             goDown ← coinHere + dp[x + 1][y]
10:            dp[x][y] ← max(dp[x][y], goDown)
11:        if y + 1 ≤ n and A[x][y + 1] ≠ ”#” then
12:            goRight ← coinHere + dp[x][y + 1]
13:            dp[x][y] ← max(dp[x][y], goRight)
14: Print(dp[1][1])
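The bottom-up DP can be transcribed into Python as follows (our code; the example grid is our rendering of the 5 × 6 board from the exercise statement, so treat the concrete layout as illustrative):

```python
def max_coins(A):
    """Bottom-up DP in the spirit of Algorithm 6 on a 0-indexed grid: A is a list
    of strings over 'C' (coin), '#' (impassable), '.' (empty); start at (0, 0)."""
    m, n = len(A), len(A[0])
    dp = [[0] * n for _ in range(m)]
    for x in range(m - 1, -1, -1):
        for y in range(n - 1, -1, -1):
            coin_here = 1 if A[x][y] == "C" else 0
            best = coin_here  # option 1: stop here
            if x + 1 < m and A[x + 1][y] != "#":
                best = max(best, coin_here + dp[x + 1][y])  # option 2: go down
            if y + 1 < n and A[x][y + 1] != "#":
                best = max(best, coin_here + dp[x][y + 1])  # option 3: go right
            dp[x][y] = best
    return dp[0][0]

# Our rendering of the 5 x 6 example grid from the exercise statement.
grid = [".CC.C.",
        ".#CCC#",
        "C...##",
        ".C##..",
        ".C..C."]
print(max_coins(grid))  # → 5
```

Starting the search one row lower (rows 2 to 5 of the example) yields 4, matching f (2, 1) = 4 from subtask (a).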


Department of Computer Science 7 November 2022


Markus Püschel, David Steurer
François Hublet, Goran Zuzic, Tommaso d’Orsi, Jingqiu Ding

Algorithms & Data Structures Exercise sheet 7 HS 22

The solutions for this sheet are submitted at the beginning of the exercise class on 14 November 2022.
Exercises that are marked by ∗ are “challenge exercises”. They do not count towards bonus points.
You can use results from previous parts without solving those parts.

Exercise 7.1 k-sums (1 point).


We say that an integer n ∈ N is a k-sum if it can be written as a sum n = a1^k + · · · + ap^k where a1 , . . . , ap
are distinct natural numbers, for some arbitrary p ∈ N.
For example, 36 is a 3-sum, since it can be written as 36 = 1^3 + 2^3 + 3^3.
Describe a DP algorithm that, given two integers n and k, returns True if and only if n is a k-sum. Your
algorithm should have asymptotic runtime complexity at most O(n^(1+2/k)).
Hint: The intended solution has complexity O(n^(1+1/k)).
In your solution, address the following aspects:
1. Dimensions of the DP table: What are the dimensions of the DP table?
2. Definition of the DP table: What is the meaning of each entry?
3. Computation of an entry: How can an entry be computed from the values of other entries? Specify
the base cases, i.e., the entries that do not depend on others.
4. Calculation order: In which order can entries be computed so that values needed for each entry have
been determined in previous steps?
5. Extracting the solution: How can the solution be extracted once the table has been filled?
6. Running time: What is the running time of your solution?
Solution:
Given n and k, let m = ⌊n^(1/k)⌋ be the largest integer such that m^k ≤ n.
1. Dimensions of the DP table: DP [0 . . . n][0 . . . m]
2. Definition of the DP table: DP [i][j] is True if, and only if, i can be written as a sum i = a1^k + · · · + ap^k
where p ∈ N, the (aℓ)1≤ℓ≤p are distinct, and {a1 , . . . , ap } ⊆ {1..j}.
3. Computation of an entry: DP can be computed recursively as follows:

DP [0][j] = True                                   0 ≤ j ≤ m     (1)
DP [i][0] = False                                  0 < i ≤ n     (2)
DP [i][j] = DP [i − j^k][j − 1] or DP [i][j − 1]   j^k ≤ i ≤ n   (3)
DP [i][j] = DP [i][j − 1]                          otherwise.    (4)

Equation (1) expresses that 0 can always be written as an (empty) sum of distinct integers in any
interval {1..j}. Equation (2) says that non-zero values cannot be obtained as a sum of integers in
{1..0} = ∅. Equations (3) and (4) provide the recurrence relation. An integer i can be obtained as a
sum i = a1^k + · · · + ap^k of distinct integers in {1..j} iff either
(a) Some aℓ (for example, ap ) is j and the rest of the sum is a1^k + · · · + a_{p−1}^k = i − ap^k = i − j^k, such
that {a1 , . . . , a_{p−1}} ⊆ {1..j − 1}, or
(b) No aℓ is j, and a1^k + · · · + ap^k = i is a sum of integers from {1..j − 1}.
Case (a) corresponds to DP [i − j^k][j − 1], case (b) to DP [i][j − 1]. When j^k ≤ i, both cases are
possible, and we obtain formula (3); when j^k > i, only case (b) is possible, and we obtain (4).
4. Calculation order: Following the recurrence relations above, we can compute first by order of in-
creasing j, and then in an arbitrary order for i (for example, in increasing order).
5. Extracting the solution: The solution is DP [n][m], since any aℓ that appears in a sum a1^k + · · · + ap^k = n
is such that aℓ^k ≤ n, which implies aℓ ≤ ⌊n^(1/k)⌋ = m.
6. Running time: The running time of the solution is O(n · n^(1/k)) = O(n^(1+1/k)), as there are (n + 1) · (m + 1) =
O(nm) = O(n · n^(1/k)) entries in the table, we process each entry in O(1) time, and the solution is
extracted in O(1) time.
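A Python sketch of this table (ours); m is computed with integer arithmetic to avoid floating-point issues with n^(1/k):

```python
def is_k_sum(n, k):
    """DP[i][j]: can i be written as a sum of k-th powers of distinct
    integers from {1, ..., j}? Returns DP[n][m] with m = floor(n^(1/k))."""
    m = 1
    while (m + 1) ** k <= n:
        m += 1
    DP = [[False] * (m + 1) for _ in range(n + 1)]
    for j in range(m + 1):
        DP[0][j] = True  # 0 is the empty sum
    for j in range(1, m + 1):
        p = j ** k
        for i in range(1, n + 1):
            # either j is unused, or j^k contributes and the rest uses {1..j-1}
            DP[i][j] = DP[i][j - 1] or (p <= i and DP[i - p][j - 1])
    return DP[n][m]

print(is_k_sum(36, 3))  # → True  (36 = 1^3 + 2^3 + 3^3)
print(is_k_sum(5, 3))   # → False
```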

Exercise 7.2 Road trip.


You are planning a road trip for your summer holidays. You want to start from city C0 , and follow
the only road that goes to city Cn from there. On this road from C0 to Cn , there are n − 1 other
cities C1 , . . . , Cn−1 that you would be interested in visiting (all cities C1 , . . . , Cn−1 are right on the
road from C0 to Cn ). For each 0 ≤ i ≤ n, the city Ci is at kilometer ki of the road for some given
0 = k0 < k1 < . . . < kn−1 < kn .
You want to decide in which cities among C1 , . . . , Cn−1 you will make an additional stop (you will stop
in C0 and Cn anyway). However, you do not want to drive more than d kilometers without making a
stop in some city, for some given value d > 0 (we assume that ki < ki−1 + d for all i ∈ [n] so that
this is satisfiable), and you also don’t want to travel backwards (so from some city Ci you can only go
forward to cities Cj with j > i).
(a) Provide a dynamic programming algorithm that computes the number of possible routes from C0
to Cn that satisfy these conditions, i.e., the number of allowed subsets of stop-cities. In order to
get full points, your algorithm should have O(n2 ) runtime.
Address the same six aspects as in Exercise 7.1 in your solution.
Solution:
1. Dimensions of the DP table: The DP table is linear, and its size is n + 1.

2. Definition of the DP table: DP [i] is the number of possible routes from C0 to Ci (which stop at
Ci ).
3. Computation of an entry: Initialize DP [0] = 1.
For every i > 0, we can compute DP [i] using the formula
DP [i] = Σ_{0 ≤ j < i : ki ≤ kj + d} DP [j].    (5)

4. Calculation order: We can calculate the entries of DP from the smallest index to the largest
index.
5. Extracting the solution: All we have to do is read the value at DP [n].
6. Running time: For i = 0, DP [0] is computed in O(1) time. For i ≥ 1, the entry DP [i] is
computed in O(i) time (as we potentially need to take the sum of i entries). Therefore, the total
runtime is O(1) + Σ_{i=1}^{n} O(i) = O(n^2).
(b) If you know that ki > ki−1 + d/10 for every i ∈ [n], how can you turn the above algorithm into a
linear-time algorithm (i.e., an algorithm that has O(n) runtime)?
Solution:
Assuming that ki > ki−1 + d/10 for all i, we know that ki > ki−10 + d, and hence ki > kj + d for
all j ≤ i − 10. Therefore, the sum in formula (5) contains at most 10 terms DP [j] (and for each of
them we can check in constant time whether we should include it or not, i.e., whether ki ≤ kj + d).
So in this case the computation of the entry DP [i] takes time O(1) for all 0 ≤ i ≤ n, and hence
the total runtime is O(n).
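The O(n²) variant of formula (5) can be sketched in Python as follows (our code; the input is the sorted list of kilometer marks k_0, . . . , k_n, and the example instance is ours):

```python
def count_routes(k, d):
    """k = [k_0, ..., k_n] with k_0 = 0; DP[i] = number of admissible routes
    from C_0 that end with a stop at C_i. The answer is DP[n]."""
    n = len(k) - 1
    DP = [0] * (n + 1)
    DP[0] = 1
    for i in range(1, n + 1):
        DP[i] = sum(DP[j] for j in range(i) if k[i] <= k[j] + d)
    return DP[n]

# Stops at km 0, 2, 4, 6 with d = 4: routes {C1}, {C2}, {C1, C2} are allowed.
print(count_routes([0, 2, 4, 6], 4))  # → 3
```

Under the assumption of part (b), restricting j to the last 10 indices in the inner sum turns this into the linear-time variant.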

Exercise 7.3 Safe pawn lines (1 point).


On an N × M chessboard (N being the number of rows and M the number of columns), a safe pawn
line is a set of M pawns with exactly one pawn per column of the chessboard, and such that every two
pawns from adjacent columns are located diagonally to each other. When a pawn line is not safe, it is
called unsafe.
The first two chessboards below show safe pawn lines, the latter two unsafe ones. The line on the third
chessboard is unsafe because pawns d4 and e4 are located on the same row (rather than diagonally);
the line on the fourth chessboard is unsafe because pawn a5 has no diagonal neighbor at all.

[four chessboard diagrams omitted: the two safe and two unsafe pawn lines described above]

Describe a DP algorithm that, given N, M > 0, counts the number of safe pawn lines on an N × M
chessboard. In your solution, address the same six aspects as in Exercise 7.1. Your solution should have
complexity at most O(N M ).

Solution:
1. Dimensions of the DP table: DP [1 . . . N ][1 . . . M ]
2. Definition of the DP table: DP [i][j] counts the number of distinct safe pawn lines on an N × j
chessboard with the pawn in the last column located in row i. For example, for N = 4, we have
DP [3][3] = 3, since 3 safe pawn lines on a 4 × 3 chessboard have their last pawn in row 3, namely:

[three 4 × 3 chessboard diagrams omitted]

3. Computation of an entry: DP can be computed recursively as follows:

DP [i][1] = 1 1≤i≤N (6)


DP [1][j] = DP [2][j − 1] 1<j≤M (7)
DP [N ][j] = DP [N − 1][j − 1] 1<j≤M (8)
DP [i][j] = DP [i − 1][j − 1] + DP [i + 1][j − 1] 1 < i < N, 1 < j ≤ M (9)

Equation (6) solves the base case where the chessboard has only one column. In that case, there exists
exactly one safe pawn line. Equation (9) provides the general recurrence formula. The rationale
behind this formula it is as follows: a pawn line on a N × j chessboard with its last pawn in row i is
obtained by adding a single pawn located at (j, i) (the black pawn on the board below) to a pawn line
on an N × (j−1) chessboard (the red pawns on the board below). Clearly, the last pawn of the smaller
line must be on row i + 1 or i − 1. Hence, we have DP [i][j] = DP [i − 1][j − 1] + DP [i + 1][j − 1].
However, this is not true when we have the edge cases i = 1 or i = N . In these cases, only one
position is available for the last pawn of the smaller line, yielding formulae (7) and (8).

[Figure: a pawn line on a 6-column board; the black pawn in the last column extends the smaller line (red) on the first five columns.]
4. Calculation order: We first compute by order of increasing j, and then in an arbitrary order for i (for
example, in increasing order).
5. Extracting the solution: The solution is ∑_{i=1}^{N} DP[i][M].

6. Running time: The running time of the solution is O(M N ), as there are N M entries in the table
which are processed in O(1) time, and extracting the solution takes O(N ) ≤ O(M N ) time.
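Remark. The recurrence above can be sketched in Python as follows; the function name and the 0-indexed rows are our own, and a rolling array replaces the full table, since each column depends only on the previous one.

```python
def count_safe_pawn_lines(N, M):
    # prev[i] = number of safe lines on the first j columns whose last pawn
    # is in row i (0-indexed). Base case j = 1: one line per choice of row.
    prev = [1] * N
    for _ in range(M - 1):
        cur = [0] * N
        for i in range(N):
            if i > 0:
                cur[i] += prev[i - 1]  # smaller line ends one row lower
            if i < N - 1:
                cur[i] += prev[i + 1]  # smaller line ends one row higher
        prev = cur
    return sum(prev)  # sum over the row of the last pawn
```

For instance, on a 4 × 3 board this counts 10 safe lines, consistent with the example above where 3 of them have their last pawn in row 3.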

Exercise 7.4 String Counting (1 point).


Given a binary string S ∈ {0, 1}^n of length n, let f(S) be the length of the longest substring of consecutive 1s. For example, f(”0110001101110001”) = 3 because the string contains ”111” but not ”1111”. Given n and k, the goal is to count the number of binary strings S of length n where f(S) = k.
Write the pseudocode of an algorithm that, given positive integers n and k where k ≤ n, reports the
required answer. For full points, the running time of your solution can be any polynomial in n and k
(e.g., even O(n^11 · k^20) is acceptable).
Hint: The intended solution has complexity O(n·k^2).
In your solution, address the same six aspects as in Exercise 7.1.
Solution:
1. Dimensions of the DP table: DP [1 . . . n][0 . . . k + 1][0 . . . k + 1]
2. Definition of the DP table: Given a string S, let g(S) be the length of the (longest) suffix of “all
ones”. For example, g(”01011”) = 2, g(”010110”) = 0, g(”01101010111”) = 3. The entry
DP [len][maks][curr] represents the number of binary strings S of length exactly len where f (S) =
maks and g(S) = curr.
3. Computation of an entry: While each entry can be computed directly, in this case it is a bit easier to
compute it indirectly. Namely, we take the entire collection of strings S = {S1 , S2 , . . .} represented
by some entry dp[len][maks][curr] = |S| and append “0” to all of them: {S1 + ”0”, S2 + ”0”, . . .}.
All of them correspond to the entry dp[len + 1][maks][0], hence we increase the latter entry by
dp[len][maks][curr]. Similarly, we append “1” to all of S and obtain {S1 + ”1”, S2 + ”1”, . . .}. All
of them correspond to the entry dp[len + 1][max(maks, curr + 1)][curr + 1], hence we analogously
increase that entry by dp[len][maks][curr]. The base case is len = 1, where the only
strings are “0” and “1”. Hence, dp[1][1][1] = 1 and dp[1][0][0] = 1, while dp[1][1][0] = 0 and
dp[1][0][1] = 0.
4. Calculation order: The entries can be calculated in order of increasing len. There is no interaction
between entries with the same len, hence the order within the same value of len can be arbitrary.
5. Extracting the solution: The solution is extracted by summing up over all possible values curr of g(S): ∑_{curr=0}^{k} dp[n][k][curr].
6. Running time: The running time of the solution is O(n·k^2), as there are O(n·k^2) entries in the table, each of which is processed in O(1) time, and the solution is extracted in O(k) ≤ O(n·k^2) time.
7. Explicitly write out the full pseudocode.

Algorithm 1
1: Input: integers n, k.
2: Define dp[1 . . . n][0 . . . k + 1][0 . . . k + 1], initialized to 0.
3: dp[1][0][0] ← 1
4: dp[1][1][1] ← 1
5: for len ∈ {1, . . . , n − 1} do
6: for maks ∈ {0, . . . , k} do
7: for curr ∈ {0, . . . , k} do
8: val ← dp[len][maks][curr]
9: if val ≠ 0 then      ▷ Prevents going out-of-bounds.
10: (Note: let a +← b be the shorthand for a ← a + b.)
11: dp[len + 1][maks][0] +← val      ▷ Append 0.
12: dp[len + 1][max(maks, curr + 1)][curr + 1] +← val      ▷ Append 1.
13: sol ← 0
14: for curr ∈ {0, 1, . . . , k} do
15: sol ← sol + dp[n][k][curr]
16: Print(“solution = “, sol)
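Remark. A Python sketch of this DP (the names count_strings and ndp are ours): it keeps only the layer of the table for the current length, since each layer depends only on the previous one, and caps maks and curr at k + 1, because a string whose longest run already exceeds k can never satisfy f(S) = k.

```python
def count_strings(n, k):
    # dp[maks][curr]: number of strings of the current length whose longest
    # 1-run is maks and whose 1-suffix has length curr; values above k are
    # capped at k + 1 (such strings stay "dead" for the answer f(S) = k).
    K = k + 2
    dp = [[0] * K for _ in range(K)]
    dp[0][0] = 1  # the string "0"
    dp[1][1] = 1  # the string "1"
    for _ in range(n - 1):
        ndp = [[0] * K for _ in range(K)]
        for maks in range(K):
            for curr in range(K):
                val = dp[maks][curr]
                if val:
                    ndp[maks][0] += val  # append '0'
                    nm = min(max(maks, curr + 1), k + 1)
                    ndp[nm][min(curr + 1, k + 1)] += val  # append '1'
        dp = ndp
    return sum(dp[k][curr] for curr in range(k + 1))
```

As a sanity check, the counts over all k (including k = 0, the all-zeros string) must add up to 2^n.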

Exercise 7.5 Longest Snake.


You are given a game-board consisting of hexagonal fields F_1, . . . , F_n. The fields contain natural numbers v_1, . . . , v_n ∈ N. Two fields are neighbors if they share a border. We call a sequence of fields (F_{i_1}, . . . , F_{i_k}) a snake of length k if, for j ∈ {1, . . . , k − 1}, F_{i_j} and F_{i_{j+1}} are neighbors and their values satisfy v_{i_{j+1}} = v_{i_j} + 1. Figure 1 illustrates an example game board in which we highlighted the longest snake.
For simplicity you can assume that Fi are represented by their indices. Also you may assume that you
know the neighbors of each field. That is, to obtain the neighbors of a field Fi you may call N (Fi ),
which will return the set of the neighbors of Fi . Each call of N takes unit time.
(a) Provide a dynamic programming algorithm that, given a game-board F1 , . . . , Fn , computes the
length of the longest snake.

[Figure: hexagonal game board with numbered fields; the longest snake is highlighted.]
Figure 1: Example of a longest snake.

Hint: Your algorithm should solve this problem using O(n log n) time, where n is the number of
hexagonal fields.
Address the same six aspects as in Exercise 7.1 in your solution.
Solution:
1. Dimensions of the DP table: The DP table is linear, its size is n.
2. Definition of the DP table: DP [i] is the length of the longest snake with head Fi (that is, the
length of the longest snake of the form (Fj1 , . . . , Fjm−1 , Fi )).
3. Computation of an entry:

DP[i] = 1 + max{ DP[j] : F_j ∈ N(F_i), v_j = v_i − 1 }.

That is, we look at those neighbors of Fi that have values vj smaller than vi exactly by 1, and
choose the maximal value in the DP table among them. If there are no such neighbors, we define
max in this formula to be 0.
4. Calculation order: We first sort the hexagons by their values. Then we fill the table in ascending order of value, that is, in an order i_1, . . . , i_n such that v_{i_1} ≤ · · · ≤ v_{i_n}.
5. Extracting the solution: The output is max_{1≤i≤n} DP[i].

6. Running time: We compute the order in time O(n log n) by sorting v1 , . . . , vn . Then each entry
can be computed in time O(1) and finally we compute the output in time O(n). So the running
time of the algorithm is O(n log n).
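Remark. A Python sketch of this DP; the function name longest_snake_length is ours, fields are 0-indexed, and the adjacency oracle N is assumed to be passed in as a callback neighbors.

```python
def longest_snake_length(values, neighbors):
    # Sort field indices by value, then fill dp in ascending order of value:
    # dp[i] = length of the longest snake whose head is field i.
    n = len(values)
    dp = [1] * n
    for i in sorted(range(n), key=lambda i: values[i]):
        for j in neighbors(i):
            if values[j] == values[i] - 1:
                dp[i] = max(dp[i], 1 + dp[j])
    return max(dp)
```

For example, on a three-field board with values 3, 1, 2 where the field with value 2 neighbors the other two, the longest snake 1, 2, 3 has length 3.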
(b) Provide an algorithm that takes as input F1 , . . . Fn and a DP table from part a) and outputs the
longest snake. If there are more than one longest snake, your algorithm can output any of them.
State the running time of your algorithm in Θ-notation in terms of n.
Solution:
At the beginning we find a head of a longest snake, that is, some F_{j_1} such that DP[j_1] = max_{1≤i≤n} DP[i]. If DP[j_1] ≠ 1, we look at its neighbors and find some F_{j_2} such that DP[j_2] = DP[j_1] − 1. If DP[j_2] ≠ 1, then among the neighbors of F_{j_2} we find some F_{j_3} such that DP[j_3] = DP[j_2] − 1, and so on. We stop when DP[j_m] = 1 (where m is exactly the length of the longest snake). Then we output the snake (F_{j_1}, . . . , F_{j_m}).
The running time of this algorithm is Θ(n), since we use Θ(n) operations to find F_{j_1}, we need Θ(1) time to find each F_{j_k} for 1 < k ≤ m ≤ n (each hexagonal field has at most 6 neighbors), and Θ(m) time to output the snake.
Remark. An alternative solution would be to store the predecessor in a longest snake with head
Fi directly in DP [i] (in addition to the length of this longest snake), and store ∅ if the length of the
longest snake is just 1. Then, in order to recover a longest snake, we simply need to find a head of
a snake that has maximal length and then follow the sequence of predecessors until we reach an
entry DP [i] that has ∅ as predecessor.
*(c) Find a linear time algorithm that finds the longest snake. That is, provide an O(n) time algorithm
that, given a game-board F1 , . . . , Fn , outputs the longest snake (if there are more than one longest
snake, your algorithm can output any of them).
Solution:

We can use recursion with memoization. Similar to part a), we will fill an array S[1, . . . , n] of
lengths of longest snakes, that is, S[i] is the length of the longest snake with head Fi . Consider the
following pseudocode:

Algorithm 2 Fill-lengths(v1 , . . . , vn )
S[1], . . . , S[n] ← 0, . . . , 0
for i = 1, . . . , n do
if S[i] = 0 then
Move-to-tails(i, S, v1 , . . . , vn )
return S

where the procedure Move-to-tails(i, S, v1 , . . . , vn ) is:

Algorithm 3 Move-to-tails(i, S, v1 , . . . , vn )
for Fj ∈ N (Fi ) do
if vj = vi − 1 and S[j] = 0 then
Move-to-tails(j, S, v1 , . . . , vn )
S[i] ← 1 + max{ S[j] : F_j ∈ N(F_i), v_j = v_i − 1 }

As in part a), we assume that max over the empty set is 0. Let us show why this procedure is correct.
First, since the algorithm Move-to-tails is recursive, we have to check that it actually finishes. Move-
to-tails(i, S, v1 , . . . , vn ) is calling Move-to-tails only for indices j with vj < vi , and therefore an
easy induction on vj shows that the algorithm will always terminate. We now show the correctness
of Move-to-tails(i, S, v1 , . . . , vn ) by induction on vi .
Base case vi = 1: If vi = 1, then there is no j such that vj = vi − 1. Therefore, the max in Move-
to-tails(i, S, v1 , . . . , vn ) is empty, so S[i] is set to 1, which is indeed the length of a longest
snake with head Fi when vi = 1.
Induction hypothesis: After calling Move-to-tails(i, S, v1 , . . . , vn ) with vi = k, the value of S[i]
contains the length of the longest snake with head Fi .
Induction step k → k + 1: Let i be an index with vi = k + 1. Then for any Fj ∈ N (Fi ) such
that vj = vi − 1, we have vj = k, so by the induction hypothesis after calling Move-to-
tails(j, S, v1 , . . . , vn ) the value of S[j] contains the length of the longest snake with head Fj .
Therefore, after setting

S[i] = 1 + max{ S[j] : F_j ∈ N(F_i), v_j = v_i − 1 },

the value of S[i] indeed contains the length of the longest snake with head Fi .
After we fill S, we can use the same algorithm as in part b) to find a longest snake (we should
replace DP by S in the description of that algorithm).
For the runtime, we will show that for each i ∈ {1, . . . , n} we call Move-to-tails(i, S, v1 , . . . , vn ) ex-
actly once. Indeed, it is called only when S[i] = 0, and after the first call of Move-to-tails(i, S, v1 , . . . , vn )
has terminated, we have S[i] > 0 by the invariant for the rest of the algorithm. So Move-to-
tails(i, S, v1 , . . . , vn ) will not be called a second time after the first call has terminated. While
the first call of Move-to-tails(i, S, v1 , . . . , vn ) is running, Move-to-tails is only called for indices
j with vj < vi , which follows from a very simple induction. So Move-to-tails(i, S, v1 , . . . , vn ) is
also not called a second time while the first call is still running. So we have shown that Move-
to-tails(i, S, v1 , . . . , vn ) is called exactly once for each i. Therefore, the running time is linear in
n.
The technique that we used here is closely related to depth-first search and topological ordering of
a graph. These topics will be studied later in this course.
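Remark. The memoized recursion translates almost literally into Python (our own naming; for boards with very long value chains, Python's default recursion limit may need raising, or the recursion can be replaced by an explicit stack):

```python
def longest_snake_memo(values, neighbors):
    n = len(values)
    S = [0] * n  # S[i] = 0 means "not yet computed"

    def move_to_tails(i):
        if S[i]:
            return S[i]
        best = 0  # max over the empty set is 0, as in part a)
        for j in neighbors(i):
            if values[j] == values[i] - 1:
                best = max(best, move_to_tails(j))
        S[i] = 1 + best
        return S[i]

    return max(move_to_tails(i) for i in range(n))
```

Each field is filled exactly once, so the total work is linear in n (given the unit-time neighbor oracle).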

Eidgenössische Technische Hochschule Zürich
Ecole polytechnique fédérale de Zurich
Politecnico federale di Zurigo
Federal Institute of Technology at Zurich

Departement of Computer Science 14 November 2022


Markus Püschel, David Steurer
François Hublet, Goran Zuzic, Tommaso d’Orsi, Jingqiu Ding

Algorithms & Data Structures Exercise sheet 8 HS 22

The solutions for this sheet are submitted at the beginning of the exercise class on 21 November 2022.
Exercises that are marked by ∗ are “challenge exercises”. They do not count towards bonus points.
You can use results from previous parts without solving those parts.

Exercise 8.1 Exponential bounds for a sequence defined inductively.


Consider the sequence (a_n)_{n∈N} defined by

a_0 = 1,  a_1 = 1,  a_2 = 2,
a_i = a_{i−1} + 2a_{i−2} + a_{i−3}  ∀ i ≥ 3.

The goal of this exercise is to find exponential lower and upper bounds for a_n.
(a) Find a constant C > 1 such that a_n ≤ O(C^n) and prove your statement.
Solution:
Intuitively, the sequence (a_n)_{n∈N} seems to be increasing. Assuming so, we would have

a_i = a_{i−1} + 2a_{i−2} + a_{i−3} ≤ a_{i−1} + 2a_{i−1} + a_{i−1} = 4a_{i−1},

which yields

a_n ≤ 4a_{n−1} ≤ . . . ≤ 4^n a_0 = 4^n.

This only comes from an intuition, but it is a good way to guess what the upper bound could be.
Now let us actually prove (by induction) that a_n ≤ 4^n for all n ∈ N.
Induction Hypothesis. We assume that for some k ≥ 2 we have

a_k ≤ 4^k,  a_{k−1} ≤ 4^{k−1},  a_{k−2} ≤ 4^{k−2}.   (1)

Base case k = 2. Indeed we have a_0 = 1 ≤ 4^0, a_1 = 1 ≤ 4^1 and a_2 = 2 ≤ 4^2.


Inductive step (k → k + 1). Let k ≥ 2 and assume that the induction hypothesis (1) holds. To show that it also holds for k + 1, we need to check that a_{k+1} ≤ 4^{k+1}, a_k ≤ 4^k and a_{k−1} ≤ 4^{k−1}. The two last inequalities clearly hold since they are part of the induction hypothesis, so we only need to check that a_{k+1} ≤ 4^{k+1}. Indeed, by (1),

a_{k+1} = a_k + 2a_{k−1} + a_{k−2} ≤ 4^k + 2 · 4^{k−1} + 4^{k−2} ≤ 4^k + 2 · 4^k + 4^k = 4 · 4^k = 4^{k+1}.

Thus, a_n ≤ 4^n for all n ∈ N. In particular, we have shown that a_n ≤ O(C^n) for C = 4 > 1.
(b) Find a constant c > 1 such that a_n ≥ Ω(c^n) and prove your statement.
Solution:
If we again assume that the sequence is increasing, we would get

a_i = a_{i−1} + 2a_{i−2} + a_{i−3} ≥ a_{i−3} + 2a_{i−3} + a_{i−3} = 4a_{i−3},

which yields

a_n ≥ 4a_{n−3} ≥ . . . ≥ 4^{⌊n/3⌋} a_0 = 4^{⌊n/3⌋}.

So we will aim to prove a lower bound of the form a_n ≥ ε · 4^{n/3} for some constant ε > 0. We see that taking ε := min{1, 4^{−1/3}, 2 · 4^{−2/3}} = 4^{−1/3} will make the inequality satisfied for the base case, so let’s prove by induction that a_n ≥ 4^{−1/3} · 4^{n/3} for all n ∈ N.
Induction Hypothesis. We assume that for some k ≥ 2 we have

a_k ≥ 4^{−1/3} · 4^{k/3},  a_{k−1} ≥ 4^{−1/3} · 4^{(k−1)/3},  a_{k−2} ≥ 4^{−1/3} · 4^{(k−2)/3}.   (2)

Base case k = 2. Indeed we have a_0 = 1 ≥ 4^{−1/3} · 4^0, a_1 = 1 ≥ 4^{−1/3} · 4^{1/3} and a_2 = 2 ≥ 4^{1/3} = 4^{−1/3} · 4^{2/3}.
Inductive step (k → k + 1). Let k ≥ 2 and assume that the induction hypothesis (2) holds. To show that it also holds for k + 1, we need to check that a_{k+1} ≥ 4^{−1/3} · 4^{(k+1)/3}, a_k ≥ 4^{−1/3} · 4^{k/3} and a_{k−1} ≥ 4^{−1/3} · 4^{(k−1)/3}. The two last inequalities clearly hold since they are part of the induction hypothesis, so we only need to check that a_{k+1} ≥ 4^{−1/3} · 4^{(k+1)/3}. Indeed, by (2),

a_{k+1} = a_k + 2a_{k−1} + a_{k−2} ≥ 4^{−1/3} (4^{k/3} + 2 · 4^{(k−1)/3} + 4^{(k−2)/3})
        ≥ 4^{−1/3} (4^{(k−2)/3} + 2 · 4^{(k−2)/3} + 4^{(k−2)/3}) = 4^{−1/3} · 4 · 4^{(k−2)/3} = 4^{−1/3} · 4^{(k+1)/3}.

Thus, a_n ≥ 4^{−1/3} · 4^{n/3} for all n ∈ N. In particular, we have shown that a_n ≥ Ω(c^n) for c = 4^{1/3} > 1.
Remark. One can actually show that a_n = Θ(φ^n), where φ ≈ 2.148 is the unique positive solution of the equation x^3 = x^2 + 2x + 1.
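Remark. The two bounds are easy to confirm numerically for small n (a sanity check of ours, not a proof; we use 4^{−1/3} · 4^{n/3} = 4^{(n−1)/3}):

```python
def check_bounds(N):
    # Build a_0 .. a_N by the recurrence, then check the two bounds
    # 4^((n-1)/3) <= a_n <= 4^n for every n <= N.
    a = [1, 1, 2]
    for i in range(3, N + 1):
        a.append(a[i - 1] + 2 * a[i - 2] + a[i - 3])
    for n in range(N + 1):
        assert 4 ** ((n - 1) / 3) <= a[n] <= 4 ** n
    return a[N]
```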

Exercise 8.2 AVL trees (1 point).


(a) Draw the tree obtained by inserting the keys 1, 6, 8, 0, 3, 2, 9 in this order into an initially empty
AVL tree. Give also the intermediate states before and after each rotation that is performed during
the process.
Solution:
In the following, we write trees as root(left, right), with · denoting an empty subtree.
Insert 1, then 6 (no rotation needed):

    1(·, 6)

Insert 8 (the insertion unbalances node 1; rotate left at 1):

    1(·, 6(·, 8))  →  6(1, 8)

Insert 0 and 3 (no rotation needed):

    6(1(0, 3), 8)

Insert 2 (the insertion unbalances the root 6; rotate left at 1, then rotate right at 6):

    6(1(0, 3(2, ·)), 8)  →  6(3(1(0, 2), ·), 8)  →  3(1(0, 2), 6(·, 8))

Insert 9 (the insertion unbalances node 6; rotate left at 6):

    3(1(0, 2), 6(·, 8(·, 9)))  →  3(1(0, 2), 8(6, 9))

(b) Delete 0, 2, and 1 in this tree, and afterwards delete key 6 in the resulting tree. Give also the
intermediate states before and after each rotation is performed during the process.

Solution:
(Trees are written as root(left, right), with · denoting an empty subtree.)
Delete 0 and 2 (no rotation needed):

    3(1, 8(6, 9))

Delete 1 (the deletion unbalances the root 3; rotate right at 8, then rotate left at 3):

    3(·, 8(6, 9))  →  3(·, 6(·, 8(·, 9)))  →  6(3, 8(·, 9))

Delete 6:
Key 6 can either be replaced by its predecessor key, 3, or its successor key, 8. If key 6 is replaced by its predecessor, the root becomes unbalanced; rotate left at 3:

    3(·, 8(·, 9))  →  8(3, 9)

If key 6 is instead replaced by its successor, the tree is already balanced:

    8(3, 9)

Exercise 8.3 Augmented Binary Search Tree.

Consider a variation of a binary search tree, where each node has an additional member variable called
size. The purpose of the variable size is to indicate the size of the subtree rooted at this node. An
example of an augmented binary search tree (with integer data) can be seen below (Fig. 1).

    10 (size=7)
    ├── left:  7 (size=4)
    │   ├── left:  3 (size=1)
    │   └── right: 8 (size=2)
    │       └── right: 9 (size=1)
    └── right: 12 (size=2)
        └── right: 15 (size=1)

Figure 1: Augmented binary search tree

a) What is the relation between the size of a node and the sizes of its children?
Solution:
For every node in the tree, we have

node.size = node.left.size + node.right.size + 1.

Note that throughout the solution of this exercise, we adopt the convention that null.size = 0.
b) Describe in pseudo-code an algorithm VerifySizes(root) that returns true if all the sizes in the
tree are correct, and returns false otherwise. For example, it should return true given the tree in
Fig. 1, but false given the tree in Fig. 2.
What is the running time of your algorithm? Justify your answer.
Solution:

Algorithm 1 Verifying the sizes of the tree


function VerifySizes(root)
if root = null then
return true
else if VerifySizes(root.left) = false or VerifySizes(root.right) = false then
return false
else
CorrectSize ← 1 + root.left.size + root.right.size
return CorrectSize = root.size

    10 (size=7)
    ├── left:  7 (size=4)
    │   ├── left:  3 (size=1)
    │   └── right: 8 (size=2)
    │       └── right: 9 (size=1)
    └── right: 12 (size=5)   ← incorrect: should be 2
        └── right: 15 (size=1)

Figure 2: Augmented binary search tree with buggy size: incorrect size for node with data “12”

The above recursive algorithm visits every node of the tree exactly once. Furthermore, it performs
a constant number of operations O(1) at each node. Therefore, the runtime is O(n), where n is the
number of nodes in the tree.
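Remark. A Python version of VerifySizes (our own class and function names; instead of the null.size = 0 convention, we use None for empty subtrees and compute the true sizes bottom-up):

```python
class Node:
    def __init__(self, data, left=None, right=None, size=None):
        self.data, self.left, self.right = data, left, right
        # By default the correct size is computed; passing size lets tests
        # simulate a buggy tree.
        computed = 1 + (left.size if left else 0) + (right.size if right else 0)
        self.size = computed if size is None else size

def verify_sizes(root):
    def check(node):  # returns (all sizes correct?, true subtree size)
        if node is None:
            return True, 0
        ok_l, sz_l = check(node.left)
        ok_r, sz_r = check(node.right)
        true_size = 1 + sz_l + sz_r
        return ok_l and ok_r and node.size == true_size, true_size
    return check(root)[0]
```

Every node is visited once with O(1) work, matching the O(n) bound above.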
c) Suppose we have an augmented AVL tree (i.e., as above, each node has a size member variable).
Describe in pseudo-code an algorithm Select(root, k) which, given an augmented AVL tree and
an integer k, returns the k-th smallest element in the tree in O(log n) time.
Example: Given the tree in Fig. 1, for k = 3, Select returns 8; for k = 5, it returns 10; for k = 1, it
returns 3; etc.
Solution:

Algorithm 2 Selecting the k-th smallest element


function Select(root, k)
current ← root.left.size + 1
if k = current then
return root.data
else if k < current then
return Select(root.left, k)
else
return Select(root.right, k − current)

The above algorithm follows a downward path until it finds the correct node. Furthermore, it per-
forms a constant number of operations O(1) at each visited node. Therefore, the runtime of the
algorithm is O(h), where h is the height of the tree. Now since the tree is an AVL tree, we have
h = O(log n). We conclude that the runtime of the above algorithm is O(log n).
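Remark. Select in Python, on nodes represented as dictionaries with data, left, right and size fields (a hypothetical minimal representation of ours; the helper node builds a correctly sized node):

```python
def node(data, left=None, right=None):
    size = 1 + (left['size'] if left else 0) + (right['size'] if right else 0)
    return {'data': data, 'left': left, 'right': right, 'size': size}

def select(root, k):
    # current = rank of the root within its own subtree (assumes 1 <= k <= size)
    current = (root['left']['size'] if root['left'] else 0) + 1
    if k == current:
        return root['data']
    if k < current:
        return select(root['left'], k)
    return select(root['right'], k - current)
```

On the tree of Fig. 1, select returns 8 for k = 3, 10 for k = 5, and 3 for k = 1, as in the example.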
d)* To maintain the correct sizes for each node, we have to modify the AVL tree operations, insert
and remove. For this problem, we will consider only the modifications to the AVL-insert method

(i.e., you are not responsible for AVL-remove). Recall that AVL-insert first uses regular insert for
binary search trees, and then balances the tree if necessary via rotations.
• How should we update insert to maintain correct sizes for nodes?
During the balancing phase, AVL-insert performs rotations. Describe what updates need to be
made to the sizes of the nodes. (It is sufficient to describe the updates for left rotations, as right
rotations can be treated analogously.)
Solution:
The regular insert function follows a downward path and then adds the new node as a leaf at the
correct place. We only need to increment the variable size by 1 at each visited node, and set the
variable size of the added leaf to 1. The runtime of the modified function remains O(h), where h is
the height of the tree. If the tree is an AVL tree, then the runtime is O(log n).
Regarding AVL-insert, after modifying the regular insert function as we explained, we need to
modify the rotation functions Left-Rotate and Right-Rotate to maintain the correct size vari-
ables.
Suppose we are performing a right-rotation on the node y of the tree y(x(α, β), γ), or a left-rotation on the node x of the tree x(α, y(β, γ)) (trees written as root(left, right)):

    Rotate-Right:  y(x(α, β), γ)  →  x(α, y(β, γ))
    Rotate-Left:   x(α, y(β, γ))  →  y(x(α, β), γ)

In the above, α, β and γ represent subtrees. As can be easily seen, only x.size and y.size
need to be updated, and we can apply the relation in a) in the correct order:
• At the end of Rotate-Right, we apply

y.size ← y.left.size + y.right.size + 1,

and then
x.size ← x.left.size + x.right.size + 1.

• At the end of Rotate-Left, we apply

x.size ← x.left.size + x.right.size + 1,

and then
y.size ← y.left.size + y.right.size + 1.

As we can see, the runtime of the modified Right-Rotate (resp. Left-Rotate) function remains
O(1). Therefore, the runtime of AVL-insert remains O(log n).
Remark: It is also possible to modify AVL-delete to maintain the correctness of the size variables
while keeping the O(log n) runtime.

Exercise 8.4 Round and square brackets.
A string of characters on the alphabet {A, . . . , Z, (, ), [, ]} is called well-formed if either
1. It does not contain any brackets, or
2. It can be obtained from an empty string by performing a sequence of the following operations,
in any order and with an arbitrary number of repetitions:
(a) Take two non-empty well-formed strings a and b and concatenate them to obtain ab,
(b) Take a well-formed string a and add a pair of round brackets around it to obtain (a),
(c) Take a well-formed string a and add a pair of square brackets around it to obtain [a].
The above reflects the intuitive definition that all brackets in the string are ‘matched’ by a bracket of the
same type. For example, s = FOO(BAR[A]), is well-formed, since it is the concatenation of s1 = FOO,
which is well-formed by 1., and s2 = (BAR[A]), which is also well-formed. String s2 is well-formed
because it is obtained by operation 2(b) from s3 = BAR[A], which is well-formed as the concatenation
of well-formed strings s4 = BAR (by 1.) and s5 = [A] (by 2(c) and 1.). String t = FOO[(BAR]) is not
well-formed, since there is no way to obtain it from the above rules. Indeed, to be able to insert the
only pair of square brackets according to the rules, its content t1 = (BAR must be well-formed, but this
is impossible since t1 contains only one bracket.
Provide an algorithm that determines whether a string of characters is well-formed. Justify briefly why
your algorithm is correct, and provide a precise analysis of its complexity.
Hint: Use a data structure from the last lecture.
Solution:
We use a stack providing standard pop, push, and isEmpty operations. Given a stack S, S.pop() removes
and returns the element on top of the stack, if it exists, and a constant None otherwise, while S.push(x)
pushes x onto the top of the stack, and S.isEmpty() returns a boolean indicating whether the stack is
empty or not. Finally, we assume a function emptyStack that initializes and returns an empty stack.
Our algorithm is as follows:

Algorithm 3 Detecting well-formed strings


function IsWellFormed(s)
S ← emptyStack()
for i ∈ {0, ..., |s| − 1} do
if s[i] = “(” then
S.push(“(”)
else if s[i] = “[” then
S.push(“[”)
else if s[i] = “)” then
if S.pop() ≠ “(” then
return False
else if s[i] = “]” then
if S.pop() ≠ “[” then
return False
return S.isEmpty()

Correctness. First, we see that we can completely ignore non-bracket characters to determine well-
formedness. The correctness of our algorithm then results from the following invariant: for all s, the
for loop of IsWellFormed(s) terminates (without returning early) in a configuration with an empty
stack if and only if s is well-formed.
We can prove this by induction on the length of s.
Base case: If s has length 0, then it is empty. Then s is well-formed and IsWellFormed(s) indeed
terminates immediately with an empty stack.
Induction hypothesis: Let n > 0. Assume that for all s of length |s| ≤ n − 1, the for loop of
IsWellFormed(s) terminates (without returning early) in a configuration with an empty stack if and
only if s is well-formed.
Induction step: Let s be a string of length n. First, assume that s is well-formed. There are three cases:
• If s is of the form ab with 0 < |a|, |b| < |s|, then by our induction hypothesis the for loop
of IsWellFormed(a) and IsWellFormed(b) terminates in a configuration with an empty stack.
When running IsWellFormed(s), the first |a| iterations are exactly the same as in IsWellFormed(a), and
we end up with an empty stack after |a| iterations. Then, we run exactly the same |b| steps as in
IsWellFormed(b), ending up again with an empty stack. We successfully return True.
• If s is of the form (a), then running IsWellFormed(s) first pushes “(” onto the stack, and then runs the same steps (iterations 1 to |s| − 2) as in IsWellFormed(a), but with the additional “(” element remaining at the bottom of the stack. By our induction hypothesis, the stack contains only “(” after iteration |s| − 2, after which the last iteration removes “(” from the stack and the algorithm terminates successfully.
• The argument is similar for s = [a].
Conversely, assume that IsWellFormed(s) returns True. We distinguish between two cases:
• If S is empty in some intermediate iteration i ∈ {1, . . . , |s| − 1}, consider such an i. Then the
successful execution of IsWellFormed(s) is exactly the concatenation of two successful execu-
tions of IsWellFormed(s[0..i]) and IsWellFormed(s[i + 1..|s| − 1]). Hence, by our induction
hypothesis, s[0..i] and s[i + 1..|s| − 1] are well-formed, and their concatenation s is also well-
formed.
• If S is never empty in any intermediate iteration, then we observe that the first element pushed
onto the stack is never popped before the very last iteration. For this last pop to succeed, the
first and last character of s must be matching brackets (i.e., () or []). Moreover, as the element
at the bottom of the stack is never popped and the final stack is empty, iterations 1 to |s| − 2
must be exactly identical to a successful execution of IsWellFormed(s[1..|s|−2]). Hence, by our
induction hypothesis, s[1..|s|−2] is a well-formed string, and so is s which is either (s[1..|s|−2])
or [s[1..|s| − 2]].
Remark. The above constitutes a formal proof of the correctness of the algorithm, provided for the
sake of completeness. A more informal argument shall also be counted as correct.

Complexity. Each iteration of the for loop has runtime complexity O(1): stack operations are O(1),
and we execute at most one such operation per iteration, along with at most 5 constant-time tests
and at most one constant-time return statement. As there are |s| iterations in total and the rest of the
operations are constant-time, we get a total runtime complexity in O(|s|).
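Remark. In Python, a list serves as the stack; the mapping from closing to opening brackets below is our own shorthand for the four-way case distinction of Algorithm 3.

```python
def is_well_formed(s):
    stack = []
    opening_of = {')': '(', ']': '['}
    for ch in s:
        if ch in '([':
            stack.append(ch)
        elif ch in opening_of:
            # A pop from an empty stack counts as a mismatch, like pop()
            # returning None in the pseudocode.
            if not stack or stack.pop() != opening_of[ch]:
                return False
    return not stack  # True iff every opened bracket was closed
```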

Exercise 8.5 Computing with a stack (2 points).
In many programming languages, e.g., in Python, stacks are commonly used for evaluating arithmetic
expressions. Evaluating expressions usually happens in two steps. First, values are loaded into the
stack. Then, operations are applied stepwise on the top elements in order to obtain the desired value.

Figure 3: A stack S0 containing the numbers 4, 3, 2, and 7 (7 is the top of the stack)

In this exercise, we focus on the second phase, and consider the following three basic operations used
to compute with stacks:
pop: If there is at least one element in the stack, remove the top element of the stack. Otherwise, do
nothing.
add: If there are at least two elements in the stack, remove the top two elements, compute their sum, and push this sum back into the stack. If there are fewer than two elements in the stack, do nothing.
mul: If there are at least two elements in the stack, remove the top two elements, compute their product, and push this product back into the stack. If there are fewer than two elements in the stack, do nothing.
Below are examples of applications of pop, add, and mul, starting from the stack S0 = (4, 3, 2, 7) (written bottom to top):
(a) pop removes 7, leaving (4, 3, 2);
(b) add replaces 7 and 2 by 7 + 2 = 9, leaving (4, 3, 9);
(c) mul replaces 7 and 2 by 7 · 2 = 14, leaving (4, 3, 14).

We say that an integer i can be computed from a stack S if and only if there exists a sequence of pop, add,
and mul operations on S that ends with i on top of the stack. For example, the value (3 · 2) + 4 = 10
can be computed from the stack S0 above through the following sequence of operations (stacks written bottom to top):

    (4, 3, 2, 7)  —pop→  (4, 3, 2)  —mul→  (4, 6)  —add→  (10)
Figure 5: Computing 10 from S0

Given a stack S containing n integers S1 , . . . , Sn ∈ {1, . . . , k} (with S1 being the top of the stack) and
an integer c, you are tasked to design a DP algorithm which determines if c can be computed from S.
To obtain full points, your algorithm should have complexity at most O(c · n), but partial points will be awarded for any solution running in time O(k^n · n).
In your solution, address the following aspects:
1. Dimensions of the DP table: What are the dimensions of the DP table?
2. Definition of the DP table: What is the meaning of each entry?
3. Computation of an entry: How can an entry be computed from the values of other entries? Specify
the base cases, i.e., the entries that do not depend on others.
4. Calculation order: In which order can entries be computed so that values needed for each entry have
been determined in previous steps?
5. Extracting the solution: How can the solution be extracted once the table has been filled?
6. Running time: What is the running time of your solution?

Solution:
For i ∈ {1, . . . , n}, we denote by S[1..i] the stack containing the top i elements of S.
1. Dimensions of the DP table: DP [1 . . . c][1 . . . n]
2. Definition of the DP table: DP [i][j] is True if, and only if, i can be computed from the stack S[1..j]
and the stack produced by the computation contains only one element in the end, and False otherwise.
3. Computation of an entry: DP can be computed recursively as follows:

DP[i][1] = (i == S_1)
DP[i][j] = False                                                       j > 1, S_j > i
DP[i][j] = (i == S_j) or DP[i − S_j][j − 1]                            j > 1, S_j ≤ i, S_j ∤ i
DP[i][j] = (i == S_j) or DP[i − S_j][j − 1] or DP[i/S_j][j − 1]        j > 1, S_j | i.
The three terms of the disjunction in the last equation correspond to the three possible cases in which i can be computed from S[1..j], leaving a singleton stack in the end:
(a) By popping all of S[1..j − 1] and returning Sj = i (case i = Sj ),

(b) By computing i − S_j from S[1..j − 1], and then performing add (case DP[i − S_j][j − 1]) between i − S_j (which is now on top of the stack) and S_j,
(c) By computing i/S_j from S[1..j − 1], and then performing mul (case DP[i/S_j][j − 1]) between i/S_j (which is now on top of the stack) and S_j.
The second case is only possible if Sj ≤ i, the last if Sj is a divisor of i.
Note that since all numbers in the stack are positive, all intermediate values obtained during the
computation of c must be contained in {1, . . . , c}. Hence, considering only i ∈ {1, . . . , c} is suffi-
cient.
4. Calculation order: Following the recurrence relations above, we can compute first by order of in-
creasing j, and then by order of increasing i.
5. Extracting the solution: The solution is DP [c][1] or . . . or DP [c][n].
6. Running time: The running time of the solution is O(c · n + n) = O(c · n) as there are c · n entries
in the table, we process each entry in O(1) time, and the solution is computed in O(n) time.
Remark. In practice, a solution based on memoization might be more efficient for this problem, given
that many values of i may not be computable from the Sk .
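Remark. The recurrence can be sketched in Python as follows (our own function name; the stack is given top-first, and row 0 of the table is harmless padding, since all genuine intermediate values lie in {1, . . . , c}):

```python
def computable(stack, c):
    # stack[0] is the top; dp[i][j] = "i can be computed from the top j
    # elements, leaving a single element on the stack".
    n = len(stack)
    dp = [[False] * (n + 1) for _ in range(c + 1)]
    for j in range(1, n + 1):
        s = stack[j - 1]  # S_j, the j-th element from the top
        for i in range(1, c + 1):
            if j == 1:
                dp[i][j] = (i == s)
            elif s > i:
                dp[i][j] = False
            else:
                dp[i][j] = (i == s) or dp[i - s][j - 1] \
                    or (i % s == 0 and dp[i // s][j - 1])
    return any(dp[c][j] for j in range(1, n + 1))
```

For the stack S0 = (4, 3, 2, 7) with 7 on top, this confirms that 10 is computable (pop, mul, add as in Figure 5), while 1 is not.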
(*) Challenge question: Extend your algorithm to support the following additional operation:
neg: If there is at least one element in the stack, remove the top element x of the stack, and push −x
back into the stack. Otherwise, do nothing.
Solution:
With the neg operation at our disposal, a naïve approach would consist in adding a disjunct

. . . or DP [−i, j]

in the recursion. But this would break the calculation order, since entries with the same value of j
(concretely, all DP [i, j] and DP [−i, j]) would depend on each other. We observe, however, that i can
be computed if and only if −i can be computed (just apply neg). Hence, we can change the definition
of our table to be “DP [i][j] is True if, and only if, i or −i can be computed from the stack S[1..j], and
False otherwise”, and instead add only a case DP [i + Sj , j − 1] to the disjunction, corresponding to
an application of neg followed by an application of add. The last thing that needs to be changed is
the size of the first dimension of the table, since intermediate results can now be larger than i. A safe
bound for intermediate results is k^n.
Eidgenössische Technische Hochschule Zürich
Ecole polytechnique fédérale de Zurich
Politecnico federale di Zurigo
Federal Institute of Technology at Zurich

Department of Computer Science
21 November 2022
Markus Püschel, David Steurer
François Hublet, Goran Zuzic, Tommaso d’Orsi, Jingqiu Ding

Algorithms & Data Structures Exercise sheet 9 HS 22

The solutions for this sheet are submitted at the beginning of the exercise class on 28 November 2022.
Exercises that are marked by ∗ are “challenge exercises”. They do not count towards bonus points.
You can use results from previous parts without solving those parts.

Exercise 9.1 Party & Beer & Party & Beer.


For your birthday, you organize a party and invite some friends over at your place. Some of your friends
bring their partners, and it turns out that in the end everybody (including yourself) knows exactly 7
other people at the party (note that the relation of knowing someone is commutative, i.e. if you know
someone then this person also knows you and vice versa). Show that there must be an even number of
people at your party.
Solution:
Let n denote the number of people at your party. We can model the situation by a graph G = (V, E),
where the vertices V are the people who came to your party, and two vertices are connected by an edge
whenever they know each other. Since everybody knows exactly 7 other people at the party, we have
deg(v) = 7 for all vertices v ∈ V . Therefore, ∑_{v∈V} deg(v) = 7n since there are n vertices. On the
other hand, by the Handshaking lemma (Handschlaglemma) we also know that ∑_{v∈V} deg(v) = 2|E|,
and in particular the sum of the degrees must be an even number. In other words, 7n is an even number,
which implies that n must be even as well.
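The counting argument can be checked mechanically. The following Python sketch (not part of the original sheet; function names are our own) verifies it on K8, the complete graph on 8 vertices, where everyone knows exactly 7 others:

```python
from itertools import combinations

def degrees(n, edges):
    """Degrees of an undirected graph on vertices 0..n-1."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

# K8 models a party of 8 people in which everyone knows the 7 others.
n = 8
edges = list(combinations(range(n), 2))
deg = degrees(n, edges)
assert all(d == 7 for d in deg)    # everybody knows exactly 7 people
assert sum(deg) == 2 * len(edges)  # Handshaking lemma: degree sum = 2|E|
assert (7 * n) % 2 == 0            # so 7n is even, forcing n to be even
```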

Exercise 9.2 Transitive graphs (1 point).


We say that a graph G = (V, E) is
• transitive when, for any two edges {u, v} and {v, w} in E, the edge {u, w} is also in E;
• complete when its set of edges is {{u, v} | u, v ∈ V, u ̸= v};
• the disjoint sum of G1 = (V1 , E1 ), . . . , Gk = (Vk , Ek ) iff V = V1 ∪ · · · ∪ Vk , E = E1 ∪ · · · ∪ Ek ,
and the (Vi )1≤i≤k are pairwise disjoint.
Show that a graph is transitive if, and only if, it is a disjoint sum of complete graphs.
Solution:
We first show that disjoint sums of complete graphs are transitive (⇐), and then that any transitive
graph is a disjoint sum of complete graphs (⇒).
⇐ Let G = (V, E) be a disjoint sum of complete graphs G1 = (V1 , E1 ), . . . , Gk = (Vk , Ek ). Let
{u, v}, {v, w} ∈ E. Since G is a disjoint sum, there exists i ∈ {1..k} such that v ∈ Vi , {u, v} ∈ Ei , and
{v, w} ∈ Ei . Since every edge in Ei has both endpoints in Vi , we get u ∈ Vi and w ∈ Vi . From the assumption that Gi is complete,
we finally get {u, w} ∈ Ei ⊆ E.
⇒ Let G = (V, E) be a transitive graph. We can decompose G into its connected components G1 =
(V1 , E1 ), . . . , Gk = (Vk , Ek ). Clearly, G is the disjoint sum of its connected components. Let us now
show that each connected component is complete. Let i ∈ {1..k}. Consider u, v ∈ Vi with u ̸= v. As u
and v are in the same connected component of G, there exists a path u = w1 → w2 → · · · → wp = v
from u to v in Gi .
We will now show by induction on j ∈ {2, . . . , p}: P (j): “the edge {u, wj } is in Ei .”
Base case: j = 2. As u = w1 , . . . , wp forms a path in Gi , we have {u, w2 } = {w1 , w2 } ∈ Ei , which
immediately yields P (2).
Induction step: Let j ∈ {2, . . . , p − 1} such that P (j) holds. This means that we have an edge
{u, wj } ∈ Ei . Now, as w1 , . . . , wp is a path in Gi , we also have an edge {wj , wj+1 } ∈ Ei . Using
the transitive property of G, we obtain {u, wj+1 } ∈ E, and since G is a disjoint sum, we also have
{u, wj+1 } ∈ Ei . This shows P (j + 1).
Conclusion: P (j) holds for all j ∈ {2, . . . , p}.
For j = p, we obtain P (p): “the edge {u, wp } is in Ei ”. As wp = v and Ei ⊆ E, we get {u, v} ∈ E.
Hence Gi is complete, which concludes the proof.
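Both directions of the equivalence can be sanity-checked by brute force. The following Python sketch (illustrative; the helper names are our own) verifies, for every graph on 4 vertices, that transitivity holds exactly when the graph is a disjoint sum of complete graphs:

```python
from itertools import combinations

def is_transitive(n, edges):
    """Is the edge relation transitive in the sense of the exercise?"""
    E = {frozenset(e) for e in edges}
    return all(frozenset((u, w)) in E
               for u in range(n) for v in range(n) for w in range(n)
               if len({u, v, w}) == 3
               and frozenset((u, v)) in E and frozenset((v, w)) in E)

def is_disjoint_sum_of_complete_graphs(n, edges):
    """Is every connected component a complete graph?"""
    adj = {u: set() for u in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for s in range(n):
        if s not in seen:
            comp, stack = set(), [s]
            while stack:                      # iterative DFS for one component
                u = stack.pop()
                if u not in comp:
                    comp.add(u)
                    stack.extend(adj[u] - comp)
            seen |= comp
            comps.append(comp)
    E = {frozenset(e) for e in edges}
    return all(frozenset((u, v)) in E
               for comp in comps for u in comp for v in comp if u < v)

# Brute-force check of the equivalence over all graphs on 4 vertices.
pairs = list(combinations(range(4), 2))
for mask in range(2 ** len(pairs)):
    edges = [e for i, e in enumerate(pairs) if mask >> i & 1]
    assert is_transitive(4, edges) == is_disjoint_sum_of_complete_graphs(4, edges)
```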

Exercise 9.3 Star search, reloaded (1 point).


A star in an undirected graph G = (V, E) is a vertex that is adjacent to all other vertices. More formally,
v ∈ V is a star if and only if {{v, w} | w ∈ V \ {v}} ⊆ E.
In this exercise, we want to find a star in a graph G by walking through it. Initially, we are located at
some vertex v0 ∈ V . Each vertex has an associated flag (a Boolean) that is initially set to False. We
have access to the following constant-time operations:
• countNeighbors() returns the number of neighbors of the current vertex
• moveTo(i) moves us to the ith neighbor of the current vertex, where i ∈ {1..countNeighbors()}
• setFlag() sets the flag of the current vertex to True
• isSet() returns the value of the flag of the current vertex
• undo() undoes the latest action performed (the last movement or the last setting of a flag)
Assume that G has exactly one star and |G| = n. Give the pseudocode of an algorithm that finds the
star, i.e., your algorithm should always terminate in a configuration where the current vertex is a star
in G. To obtain full points, your algorithm must have complexity O(|V | + |E|), and must not introduce
any additional datastructures (no sets, no lists etc.). Show that your algorithm is correct and prove its
complexity. The behavior of your algorithm on graphs that do not contain a star can be disregarded.
Solution:
Consider the following algorithm:
In the following, we say that a vertex is marked iff its flag is set to True. In each iteration of the while
loop, a new, previously unmarked vertex is explored (if the vertex was already marked, the movement
towards this vertex would have been undone). Hence, in each iteration, either the current vertex has
n−1 neighbors and the algorithm terminates (case 1), or the number of vertices to be explored decreases

Algorithm 1 Star-finding algorithm
while countNeighbors() ̸= n − 1 do
setFlag()
for i = 1 to countNeighbors() do
moveTo(i)
if isSet() then
undo()
else
break

by exactly one (case 2), or the current vertex has no unmarked neighbors and we loop forever on this
vertex (case 3). Whenever the algorithm reaches the star s ∈ V , it successfully terminates (case 1), since
a vertex is a star if and only if it has n − 1 neighbors. Now, the star s is, by definition, a neighbor of all
vertices; in particular, s is always a neighbor of the current vertex. Hence, for case 3 to occur, the star s
must have been previously marked. But this never occurs, since the algorithm always terminates when
reaching the star. Hence, only cases 1 and 2 can happen, and the number of unmarked vertices decreases
by exactly one in each iteration until the star is eventually reached. This proves the correctness of the
algorithm.
The cost of each iteration of the while loop is O(1) + O(1) + ∑_{i=1}^{deg v} (O(1) + O(1) + O(1)) = O(1) + O(deg v), which sums up to at most ∑_{v∈V} (O(1) + O(deg v)) = O(|V |) + O(∑_{v∈V} deg v) = O(|V |) + O(2|E|) = O(|V | + |E|), as every vertex is explored at most once.
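The walking interface and Algorithm 1 can be simulated directly. The sketch below is an illustration (the Walker class and the example graph are our own, not part of the sheet); the star of the example graph is vertex 0:

```python
class Walker:
    """Simulates the exercise's interface on an undirected graph."""
    def __init__(self, adj, start):
        self.adj = adj                       # adj[v] = list of neighbours of v
        self.pos = start
        self.flag = {v: False for v in adj}
        self.history = []                    # action log, consumed by undo()

    def countNeighbors(self):
        return len(self.adj[self.pos])

    def moveTo(self, i):                     # i is 1-based, as in the exercise
        self.history.append(('move', self.pos))
        self.pos = self.adj[self.pos][i - 1]

    def setFlag(self):
        self.history.append(('flag', self.pos))
        self.flag[self.pos] = True

    def isSet(self):
        return self.flag[self.pos]

    def undo(self):
        kind, v = self.history.pop()
        if kind == 'move':
            self.pos = v
        else:
            self.flag[v] = False

def find_star(w, n):
    """Direct translation of the star-finding algorithm."""
    while w.countNeighbors() != n - 1:
        w.setFlag()
        for i in range(1, w.countNeighbors() + 1):
            w.moveTo(i)
            if w.isSet():
                w.undo()                     # step back from a marked vertex
            else:
                break
    return w.pos

# Example graph whose star is vertex 0 (adjacent to all others); start at 3.
adj = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1], 3: [0], 4: [0]}
assert find_star(Walker(adj, 3), 5) == 0
```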

Exercise 9.4 Domino.


(a) A domino set consists of all possible 6·5/2 + 6 = 21 different tiles of the form [x|y], where x and y are numbers from {1, 2, 3, 4, 5, 6}. The tiles are symmetric, so [x|y] and [y|x] are the same tile, which appears only once.
Show that it is impossible to form a line of all 21 tiles such that the adjacent numbers of any
consecutive tiles coincide.

(b) What happens if we replace 6 by an arbitrary n ≥ 2? For which n is it possible to line up all n(n − 1)/2 + n different tiles along a line?
Solution:
We directly solve the general problem.
First we note that we may neglect tiles of the form [x|x]. If we have a line without them, then we can
easily insert them to any place with an x. Conversely, if we have a line with them then we can just
remove them. Thus the problem with and without these tiles are equivalent.
Consider the following graph G with n vertices, labelled with {1, . . . , n}. We represent the domino
tile [x|y] by an edge between vertices x and y. Then the resulting graph G is a complete graph Kn , i.e.,
the graph where every pair of vertices is connected by an edge. A line of domino tiles corresponds to
a walk in this graph that uses every edge at most once, and vice versa. A complete line (of all tiles)

corresponds to an Eulerian walk in G. Thus we need to decide whether G = Kn has an Euler walk or
not.
Kn is obviously connected. If n is odd then all vertices have even degree n − 1, and thus the graph is
Eulerian. On the other hand, if n is even then all vertices have odd degree n − 1. If n ≥ 4 is even, then
there are more than 3 vertices of odd degree, and therefore Kn does not have an Euler walk. Finally,
for n = 2, the graph Kn is just an edge and has an Euler walk. Summarizing, there exists an Euler walk
if n = 2 or n is odd, and there is no Euler walk in all other cases. Hence, it is possible to line up the
domino tiles if n = 2 or n is odd, and it is impossible otherwise.
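The parity criterion can be confirmed by brute force for small n. The following Python sketch (illustrative; doubles are omitted, as justified at the start of the solution) searches for a complete line of tiles by backtracking:

```python
from itertools import combinations

def has_full_line(n):
    """Can all tiles [x|y] with x < y be arranged in one line end to end?"""
    tiles = list(combinations(range(1, n + 1), 2))

    def extend(end, used):
        # Try to append an unused tile whose one side matches the open end.
        if len(used) == len(tiles):
            return True
        for i, (x, y) in enumerate(tiles):
            if i in used:
                continue
            if x == end and extend(y, used | {i}):
                return True
            if y == end and extend(x, used | {i}):
                return True
        return False

    # Try every tile, in either orientation, as the first one in the line.
    return any(extend(end, {i})
               for i, (x, y) in enumerate(tiles) for end in (x, y))

# Matches the Euler-walk criterion: possible iff n = 2 or n is odd.
for n in range(2, 6):
    assert has_full_line(n) == (n == 2 or n % 2 == 1)
```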

Exercise 9.5 Introduction to Trees (1 point).


We start with a few definitions:
Definition 1. Let G = (V, E) be a graph.
• A sequence of vertices (v0 , v1 , . . . , vk ) (with vi ∈ V for all i) is a simple path iff all the vertices
are distinct (i.e., vi ̸= vj for 0 ≤ i < j ≤ k) and {vi , vi+1 } is an edge for each 0 ≤ i ≤ k − 1. We
say that v0 and vk are the endpoints of the path.
• A sequence of vertices (v0 , v1 , . . . , vk ) (with vi ∈ V for all i) is a simple cycle iff (1) v0 = vk ,
(2) all other vertices are distinct (i.e., vi ̸= vj for 0 ≤ i < j < k), and (3) {vi , vi+1 } is an edge for
each 0 ≤ i ≤ k − 1.
• A graph G is connected iff for every two vertices u, v ∈ V there exists a simple path with
endpoints u and v.
• A graph G is a tree iff it is connected and has no simple cycles.
In this exercise the goal is to prove a few basic properties of trees.
(a) A leaf is a vertex with degree 1. Prove that in every tree G with at least two vertices there exists a
leaf. (post-publication correction marked in red)
Hint: Consider the longest simple path in G. Prove that its endpoint is a leaf.
Solution:
Consider the longest simple path P = (v0 , v1 , v2 , . . . , vk−1 , vk ) in G. Let a := v0 be an endpoint
of P . We claim a is a leaf. Suppose for the sake of contradiction that this is not true, i.e., the
degree of a is at least 2. Hence, there exists a neighbor b ̸= v1 of a. Now, consider the path P ′ =
(b, v0 , v1 , . . . , vk ). This is a longer path, hence by choice of P , it cannot be simple. Therefore, since
b is the only new addition, there must exist an index i such that b = vi . But now, (b, v0 , v1 , . . . , vi )
is a simple cycle in G, a contradiction.
(b) Prove that every tree with n vertices has exactly n − 1 edges.
Hint: Prove by using induction on n. In the inductive step, use part (a) to find a leaf. Disconnect the
leaf from the tree and argue the remaining subgraph is also a tree. Apply the inductive hypothesis and
conclude.
Solution:
We proceed by induction on n.
Base case. When n = 1, there can only be 0 = n − 1 edges. When n = 2, there exists a unique
tree (two vertices connected by an edge), and that one has 1 = n − 1 edges. This completes the

base case.
Induction hypothesis. Assume that the hypothesis is true for every tree with n ≥ 2 vertices: it
contains n − 1 edges.
Inductive step. We now show the property holds for every tree G = (V, E) with |V | = n + 1
vertices.
Let u be a leaf in G (it must exist by part (a)), and let v be u’s only neighbor in the tree G = (V, E).
Consider the graph G′ := (V \ {u}, E \ {{u, v}}). We first argue that G′ is a tree.
Claim: G′ is connected. Proof of Claim: Let a, b ∈ V \ {u}. Since G is a tree, there exists a simple
path P in G with endpoints a, b. It is immediate that no simple path can contain a leaf except on
its endpoints (or the leaf’s only incident edge). Hence, P is also a simple path in G′ . Hence, a and
b are connected in G′ . Hence, G′ is connected. This completes the claim.
Claim: G′ has no simple cycles. Proof of Claim: Suppose for the sake of contradiction that P is a
simple cycle in G′ . But since G′ is a subgraph of G, P is also a simple cycle in G. However, G is a
tree and this is impossible. This completes the claim.
We have proven that G′ is a tree. It contains |V \ {u}| = (n + 1) − 1 = n vertices. Hence, by induction,
|E \ {{u, v}}| = n − 1. Therefore, |E| = n. This completes the inductive step and the proof.
(c) Prove that a graph with n vertices is a tree iff it has n − 1 edges and is connected.
Hint: One direction is immediate by part (b). For the other direction (every connected graph with n − 1
edges is a tree), use induction on n. First, prove there always exists a leaf by considering the average
degree. Then, disconnect the leaf from the graph and argue the remaining graph is still connected and
has exactly one less edge. Apply the inductive hypothesis and conclude.
Solution:
Suppose G is a tree. By definition, G is connected. By part (b), it has n − 1 edges. This completes
one direction of the implication.
We now prove the other direction. Suppose G is connected and has n − 1 edges. We proceed by
induction on n.
Base case. Let n = 1. The graph with a single vertex and 0 edges is trivially a tree. Let n = 2.
There exists one unique graph with two vertices and 1 edge, and that one graph is also obviously a
tree. This completes the base case.
Induction hypothesis. Assume the hypothesis: every connected graph with n ≥ 2 vertices and
n − 1 edges is a tree.
Inductive step. We now show the property holds for n + 1. Let G = (V, E) be a connected graph
with n + 1 vertices and n edges. The average degree in this graph is 2|E|/|V | = 2n/(n + 1) < 2.
Hence, there must exist a vertex u with degree 1 (no connected graph with at least 2 vertices can
have 0-degree vertices).
In other words, u is a leaf and let v be u’s only neighbor in G. Consider the graph G′ := (V \
{u}, E \ {{u, v}}). Clearly, G′ has n − 1 edges.
Claim: G′ is connected. Proof of Claim: Let a, b ∈ V \ {u}. Since G is connected, there exists a
simple path P in G with endpoints a, b. It is immediate that no simple path can contain a leaf except
on its endpoints (or the leaf’s only incident edge). Hence, P is also a simple path in G′ . Hence, a
and b are connected in G′ . Hence, G′ is connected. This completes the claim.
Therefore, we can apply the induction hypothesis on G′ and conclude G′ is a tree. It is simple to
conclude that then G is also a tree: any simple cycle in G must be fully contained in G′ (since it

cannot contain a leaf), and this is impossible since G′ is a tree.
(d) Write the pseudocode of an algorithm that is given a graph G as input and checks whether G is a
tree.
As input, you can assume that the algorithm has access to the number of vertices n, the number
of edges m, and to the edges {a1 , b1 }, {a2 , b2 }, . . . , {am , bm } (i.e., the algorithm has access to 2m
integers a1 , . . . , am , b1 , . . . , bm , where each edge of G is given by its endpoints ai and bi ). You can
assume that the graph is valid (specifically, 1 ≤ ai , bi ≤ n and ai ̸= bi ). The algorithm outputs
“YES” or “NO”, corresponding to whether G is a tree or not. Your algorithm must always complete
in time polynomial in n (e.g., even O(n10 m10 ) suffices).
Hint: Use part (c). There exists a (relatively) simple O(n + m) solution. However, the official solution
is O(n · m) for brevity and uses recursion to check if G is connected.

Example 1: n = 6, m = 5
a1 , b1 = 1, 3
a2 , b2 = 6, 1
a3 , b3 = 3, 5
a4 , b4 = 2, 3
a5 , b5 = 4, 1
Output: YES (the drawings of the example graphs do not survive in text form)

Example 2: n = 5, m = 4
a1 , b1 = 1, 3
a2 , b2 = 4, 5
a3 , b3 = 5, 2
a4 , b4 = 2, 4
Output: NO

Solution:

Algorithm 2
1: Input: integers n, m. Collection of integers a1 , b1 , a2 , b2 , . . . , am , bm .
2:
3: Let visited[1 . . . n] be a global variable, initialized to False.
4:
5: function walk(u) ▷ Find all neighbors of u that have not been visited and walk there.
6: visited[u] ← True
7: for i ← 1 . . . m do ▷ Iterate over all edges.
8: if ai = u and not visited[bi ] then
9: walk(bi )
10: if bi = u and not visited[ai ] then
11: walk(ai )
12:
13: walk(1) ▷ Find all vertices connected to 1.
14: connected ← True if visited[·] = [True, True, . . . , True] and connected ← False otherwise
15: if connected = T rue and m = n − 1 then ▷ Use the characterization from part (c).
16: Print(“YES”)
17: else
18: Print(“NO”)
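A direct Python translation of Algorithm 2 might look as follows (an illustrative sketch; function and variable names are our own). It is checked against the two examples from the exercise:

```python
def is_tree(n, m, edges):
    """Checks connectivity by walking from vertex 1, then applies part (c):
    a graph is a tree iff it is connected and has exactly n - 1 edges."""
    visited = [False] * (n + 1)          # vertices are 1-based

    def walk(u):
        # Find all neighbours of u that have not been visited and walk there.
        visited[u] = True
        for a, b in edges:
            if a == u and not visited[b]:
                walk(b)
            if b == u and not visited[a]:
                walk(a)

    walk(1)                              # find all vertices connected to 1
    connected = all(visited[1:])
    return "YES" if connected and m == n - 1 else "NO"

# The two examples from the exercise.
assert is_tree(6, 5, [(1, 3), (6, 1), (3, 5), (2, 3), (4, 1)]) == "YES"
assert is_tree(5, 4, [(1, 3), (4, 5), (5, 2), (2, 4)]) == "NO"
```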

Eidgenössische Technische Hochschule Zürich
Ecole polytechnique fédérale de Zurich
Politecnico federale di Zurigo
Federal Institute of Technology at Zurich

Department of Computer Science
28 November 2022
Markus Püschel, David Steurer
François Hublet, Goran Zuzic, Tommaso d’Orsi, Jingqiu Ding

Algorithms & Data Structures Exercise sheet 10 HS 22

The solutions for this sheet are submitted at the beginning of the exercise class on 5 December 2022.
Exercises that are marked by ∗ are “challenge exercises”. They do not count towards bonus points.
You can use results from previous parts without solving those parts.

Exercise 10.1 Depth-First Search (1 point).


Execute a depth-first search (Tiefensuche) on the following graph starting from vertex A. Use the algo-
rithm presented in the lecture. When processing the neighbors of a vertex, process them in alphabetical
order.

(Figure: a directed graph on the vertices A–H. In the original drawing, the edges are marked “T” for tree edges, “F” for forward edges, “B” for back edges, and “C” for cross edges, answering parts (a) and (d); the drawing itself does not survive in text form.)

(a) Mark the edges that belong to the depth-first tree (Tiefensuchbaum) with a “T” (for tree edge).
(b) For each vertex in the depth-first tree, give its pre- and post-number.
Solution:
A(1,14) B(2,13) C(4,11) D(3,12) E(6,7) F(5,8) G(9,10).
(c) Give the vertex ordering that results from sorting the vertices by pre-number. Give the vertex
ordering that results from sorting the vertices by post-number.
Solution:
Pre-ordering: A, B, D, C, F, E, G.
Post-ordering: E, F, G, C, D, B, A.
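Pre- and post-numbers can be computed mechanically. The sketch below is illustrative; since the sheet's figure does not survive extraction, it uses a small hypothetical digraph, and it follows the lecture's convention of processing neighbours in alphabetical order:

```python
def dfs_pre_post(adj):
    """Pre-/post-numbers of a DFS scanning vertices and neighbours in order."""
    pre, post, counter = {}, {}, [1]

    def visit(u):
        pre[u] = counter[0]; counter[0] += 1
        for v in sorted(adj[u]):
            if v not in pre:
                visit(v)
        post[u] = counter[0]; counter[0] += 1

    for u in sorted(adj):                  # restart on unreached vertices
        if u not in pre:
            visit(u)
    return pre, post

# Hypothetical small digraph, not the sheet's figure.
adj = {'A': ['B'], 'B': ['C', 'D'], 'C': [], 'D': ['C']}
pre, post = dfs_pre_post(adj)
assert pre['A'] == 1 and post['A'] == 8
# Nesting property: a tree descendant's interval lies inside its ancestor's.
assert pre['A'] < pre['B'] < post['B'] < post['A']
```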
(d) Mark every forward edge (Vorwärtskante) with an “F”, every backward edge (Rückwärtskante) with
an “B”, and every cross edge (Querkante) with a “C”.
(e) Does the above graph have a topological ordering? How can we use the above execution of depth-
first search to find a directed cycle?
Solution:
The decreasing order of the post-numbers gives a topological ordering whenever the graph is
acyclic. This is the case if and only if there are no back edges. If there is a back edge, then to-
gether with the tree edges between its end points it forms a directed cycle. In our graph, the only
back edge is E → D, and the tree edges from D to E are D → C, C → F, and F → E. Together they
form the directed cycle (D → C → F → E → D).
(f) Draw a scale from 1 to 16, and mark for every vertex v the interval Iv from pre-number to post-
number of v. What does it mean if Iu ⊂ Iv for two different vertices u and v?
Solution:
IA = [1, 14], IB = [2, 13], ID = [3, 12], IC = [4, 11], IF = [5, 8], IE = [6, 7], IG = [9, 10].

If Iu ⊂ Iv for two different vertices u and v, then u is visited during the call of DFS-Visit(v).
(g) Consider the graph above where the edge from E to D is removed and an edge from A to H is
added. How does the execution of depth-first search change? Which topological sorting does the
depth-first search give? If you sort the vertices by pre-number, does this give a topological sorting?
Solution:
The execution of depth-first search only changes in the last step, where H is visited (from A).
The topological sorting (reversed post-ordering) is: A, H, B, D, C, G, F, E.
The pre-ordering is A, B, D, C, F, E, G, H; it does not give a topological ordering, since there is an
edge (G, F) in the graph.

Exercise 10.2 Longest path in DAGs (1 point).


Given a directed graph G = (V, E) without directed cycles (i.e., a DAG), the goal is to find the number
of edges on the longest path in G.
Describe a dynamic-programming algorithm that, given G, returns the length of the longest path in G
in O(|V | + |E|) time. You can assume that V = {1, 2, . . . , n}, and that the graph is provided to you
as a pair (n, Adj) of an integer n = |V | and an adjacency list Adj. Your algorithm can access Adj[u],
which is a list of vertices to which u has a direct edge, in constant time. Formally, Adj[u] := {v ∈ V |
(u, v) ∈ E}.

Example: n = 5. Output: 3 (in the original figure, the longest path is highlighted in red; the drawing does not survive in text form).

In your solution, address the following aspects:


1. Dimensions of the DP table: What are the dimensions of the DP table?
2. Definition of the DP table: What is the meaning of each entry?
3. Computation of an entry: How can an entry be computed from the values of other entries? Specify
the base cases, i.e., the entries that do not depend on others.
4. Calculation order: In which order can entries be computed so that values needed for each entry have
been determined in previous steps?
5. Extracting the solution: How can the solution be extracted once the table has been filled?
6. Running time: What is the running time of your solution?
Solution:
For convenience, in the solution we will use the standard notation n := |V |, m := |E|.
1. Dimensions of the DP table: DP [1 . . . n], one for each node.
2. Definition of the DP table: DP [u] is the length of the longest path starting at u.
3. Computation of an entry: If u has no outgoing neighbor, we set DP [u] = 0. Otherwise, DP [u] =
maxv∈Adj[u] (1 + DP [v]).
4. Calculation order: We compute the entries in reverse topological order (i.e., the first node is one with
no outgoing edges and the last is one with no incoming edges). Remark. This is the crucial part
that differentiates this dynamic program from all prior ones.
5. Extracting the solution: We return maxu∈V DP [u].
6. Running time: We compute the entry for each node, contributing O(n) to the runtime. Furthermore,
each node u processes all outgoing edges from v, but every edge gets processed only once. Hence,
this contributes O(|E|) to the runtime. In total, we get O(|V | + |E|).
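A possible Python realization of this dynamic program is sketched below (illustrative; the reverse topological order is obtained here with Kahn's algorithm, and the example DAG is our own since the sheet's figure does not survive extraction):

```python
def longest_path_dag(n, Adj):
    """DP over reverse topological order; DP[u] = longest path starting at u."""
    # Topological order via Kahn's algorithm (in-degree counting).
    indeg = [0] * (n + 1)
    for u in range(1, n + 1):
        for v in Adj[u]:
            indeg[v] += 1
    stack = [u for u in range(1, n + 1) if indeg[u] == 0]
    topo = []
    while stack:
        u = stack.pop()
        topo.append(u)
        for v in Adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)

    DP = [0] * (n + 1)
    for u in reversed(topo):               # reverse topological order
        DP[u] = max((1 + DP[v] for v in Adj[u]), default=0)
    return max(DP[1:])

# Hypothetical DAG on 5 vertices; the longest path 1 -> 2 -> 4 -> 5 has 3 edges.
Adj = {1: [2, 3], 2: [4], 3: [], 4: [5], 5: []}
assert longest_path_dag(5, Adj) == 3
```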

Exercise 10.3 Subtree sum (1 point).


Definition 1. Given a directed graph G⃗ = (V, E⃗ ), we define its undirected version as G↔ = (V, E↔ ) with each directed edge u → v being transformed to an undirected u ↔ v. Formally, E↔ := {{u, v} | (u, v) ∈ E⃗ }.

Definition 2. A directed graph G = (V, E) is a tree rooted at r ∈ V if G’s undirected version G↔ is an undirected tree (see Exercise 9.5 for a definition) and every node is reachable from r via a directed path.

Write the pseudocode of an algorithm that, given a rooted tree G = (V, E), computes, for each vertex
v, the total number of vertices that are reachable from v (via directed paths). The algorithm should have
a runtime of O(|V | + |E|). You can assume V = {1, 2, . . . , n}. The graph will be given to the algorithm
as access to n, the root r ∈ V , and an adjacency list. Namely, the algorithm can access Adj[u], which
is a list of vertices to which u has a direct edge. Formally, Adj[u] := {v ∈ V | (u, v) ∈ E}.
Explain in a few sentences why your algorithm achieves the desired runtime.
Hint: If needed, you can use the fact that “in G, there is a unique path from the root to each vertex” without
proof.

Example: n = 6, r = 3, with edges 3 → 1, 3 → 5, 3 → 2, 1 → 6, 1 → 4 (as reconstructed from the tree drawing, which does not survive in text form). Output: [3, 1, 6, 1, 1, 1].

Solution:

Algorithm 1
1: Input: integers n, r. Adjacency list Adj[1 . . . n].
2:
3: Let reachable[1 . . . n] be a global variable, initialized to 0.
4:
5: function walk(u) ▷ Calculate the number of reachable vertices from u.
6: for each v in Adj[u] do ▷ Iterate over all children v.
7: walk(v)
8: reachable[u] ← reachable[u] + reachable[v]
9: reachable[u] ← reachable[u] + 1 ▷ u can reach itself.
10:
11: walk(r) ▷ Start walking from the root.
12: Print(reachable)

There is a unique path from the root to each vertex (easy to prove). Therefore, the function will be called
on each vertex at most once (contributing O(|V |)). Each vertex processes all outgoing edges, hence each
edge will be processed at most once (contributing O(|E|)). In total, we get runtime O(|V | + |E|).
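In Python, the same recursion might look as follows (an illustrative sketch; the child lists below are reconstructed from the example figure, so treat them as an assumption):

```python
def subtree_sizes(n, r, Adj):
    """reachable[v] = number of vertices reachable from v in a rooted tree."""
    reachable = [0] * (n + 1)          # vertices are 1-based

    def walk(u):
        for v in Adj[u]:               # iterate over all children of u
            walk(v)
            reachable[u] += reachable[v]
        reachable[u] += 1              # u can reach itself

    walk(r)                            # start walking from the root
    return reachable[1:]

# The example from the exercise: n = 6, root 3 (child lists reconstructed).
Adj = {1: [6, 4], 2: [], 3: [1, 5, 2], 4: [], 5: [], 6: []}
assert subtree_sizes(6, 3, Adj) == [3, 1, 6, 1, 1, 1]
```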

Exercise 10.4 Data structures for graphs.


Consider three types of data structures for storing a graph G with n vertices and m edges:
a) Adjacency matrix.
b) Adjacency lists:

4
1: 2, 3, 4, 5
2: 4, 1, 6
3: 1
4: 5, 2, 1
5: 4, 1
6: 2

c) Adjacency lists, and additionally we store the degree of each node, and there are pointers between
the two occurrences of each edge. (An edge appears in the adjacency list of each endpoint).
1 (deg: 4): 2, 3, 4, 5
2 (deg: 3): 4, 1, 6
3 (deg: 1): 1
4 (deg: 3): 5, 2, 1
5 (deg: 2): 4, 1
6 (deg: 1): 2

For each of the above data structures, what is the required memory (in Θ-Notation)?
Solution:
Θ(n2 ) for adjacency matrix, Θ(n + m) for adjacency list and improved adjacency list.
Which runtime (worst case, in Θ-Notation) do we have for the following queries? Give your answer
depending on n, m, and/or deg(u) and deg(v) (if applicable).
(a) Input: A vertex v ∈ V . Find deg(v).
Solution:
Θ(n) in adjacency matrix, Θ(1 + deg(v)) in adjacency list, Θ(1) in improved adjacency list.
(b) Input: A vertex v ∈ V . Find a neighbour of v (if a neighbour exists).
Solution:
Θ(n) in adjacency matrix, Θ(1) in adjacency list and in improved adjacency list.
(c) Input: Two vertices u, v ∈ V . Decide whether u and v are adjacent.
Solution:
Θ(1) in adjacency matrix, Θ(1+min{deg(v), deg(u)}) in adjacency list and in improved adjacency
list.
(d) Input: Two adjacent vertices u, v ∈ V . Delete the edge e = {u, v} from the graph.
Solution:
Θ(1) in adjacency matrix, Θ(deg(v) + deg(u)) in adjacency list and Θ(min{deg(v), deg(u)}) in
improved adjacency list.
(e) Input: A vertex u ∈ V . Find a neighbor v ∈ V of u and delete the edge {u, v} from the graph.
Solution:

Θ(n) in the adjacency matrix (Θ(n) for finding a neighbor and Θ(1) for the edge deletion).
Θ(1+ max deg(w)) for the adjacency list (Θ(1) for finding a neighbor and Θ( max deg(w))
w:{u,w}∈E w:{u,w}∈E
for the edge deletion).
Θ(1) for the improved adjacency list (Θ(1) for finding a neighbor and Θ(1) for the edge deletion).
(f) Input: Two vertices u, v ∈ V with u ≠ v. Insert an edge {u, v} into the graph if it does not exist
yet. Otherwise do nothing.
Solution:
Θ(1) in adjacency matrix, Θ(1+min{deg(v), deg(u)}) in adjacency list and in improved adjacency
list.
(g) Input: A vertex v ∈ V . Delete v and all incident edges from the graph.
Solution:
Θ(n2 ) in adjacency matrix, Θ(n + m) in adjacency list and Θ(n) in improved adjacency list.
For the last two queries, describe your algorithm.
Solution:
Query (f): We first check whether the edge {u, v} already exists. In the adjacency matrix this information
is directly stored in the u-v-entry. For adjacency lists we iterate over the neighbours of u and the
neighbours of v in parallel and stop either when one of the lists is traversed or when we find v among
the neighbours of u or when we find u among the neighbours of v. If we didn’t find this edge, we add
it: in the adjacency matrix we just fill two entries with ones, in the adjacency lists we add nodes to two
lists that correspond to u and v. In the improved adjacency lists, we also need to set pointers between
those two nodes, and we need to increase the degree for u and v by one.
Query (g): In the adjacency matrix we copy the complete matrix, but leave out the row and column
that correspond to v. This takes time Θ(n2 ). There is an alternative solution if we are allowed to rename
vertices: In this case we can just rename the vertex n as v, and copy the n-th row and column into the
v-th row and column. Then the (n − 1) × (n − 1) submatrix of the first n − 1 rows and columns will be
the new adjacancy matrix. Then the runtime is Θ(n). Whether it is allowed to rename vertices depends
on the context. For example, this is not possible if other programs use the same graph.
In the adjacency lists we remove v from every list of neighbours of every vertex (it takes time Θ(n+m))
and then we remove a list that corresponds to v from the array of lists (it takes time Θ(n)). In the im-
proved adjacency lists we iterate over the neighbours of v and for every neighbour u we remove v from
the list of neighbours of u (notice that for each u we can do it in Θ(1) since we have a pointer between
two occurrences of {u, v}) and decrease deg(u) by one. Then we remove the list that corresponds to v
from the array of lists (it takes time Θ(n)).
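The "improved adjacency list" can be sketched in Python with doubly linked neighbour lists, stored degrees, and twin pointers between the two occurrences of an edge (an illustrative sketch; the class and method names are our own). It demonstrates query (e), deleting some incident edge of u in Θ(1):

```python
class Node:
    """One occurrence of an edge in a neighbour list."""
    __slots__ = ('v', 'prev', 'next', 'twin')
    def __init__(self, v):
        self.v = v
        self.prev = self.next = self.twin = None

class Graph:
    """Improved adjacency lists with stored degrees and edge twin pointers."""
    def __init__(self, n):
        self.head = [None] * (n + 1)   # vertices are 1-based
        self.deg = [0] * (n + 1)

    def _push(self, u, v):
        # Prepend v to u's neighbour list in O(1).
        node = Node(v)
        node.next = self.head[u]
        if self.head[u]:
            self.head[u].prev = node
        self.head[u] = node
        self.deg[u] += 1
        return node

    def add_edge(self, u, v):
        a, b = self._push(u, v), self._push(v, u)
        a.twin, b.twin = b, a          # link the two occurrences of {u, v}

    def _unlink(self, u, node):
        if node.prev:
            node.prev.next = node.next
        else:
            self.head[u] = node.next
        if node.next:
            node.next.prev = node.prev
        self.deg[u] -= 1

    def delete_some_edge(self, u):
        """Query (e): find a neighbour v of u and delete {u, v}, all in O(1)."""
        node = self.head[u]
        if node is None:
            return None
        v = node.v
        self._unlink(u, node)
        self._unlink(v, node.twin)     # twin pointer avoids scanning v's list
        return v

g = Graph(3)
g.add_edge(1, 2)
g.add_edge(1, 3)
v = g.delete_some_edge(1)
assert v in (2, 3)
assert g.deg[1] == 1 and g.deg[v] == 0
```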

Exercise 10.5 Maze solver.


You are given a maze that is described by an n × n grid of blocked and unblocked cells (see Figure 1).
There is one start cell marked with ’S’ and one target cell marked with ’T’. Starting from the start cell
your algorithm may traverse the maze by moving from unblocked fields to adjacent unblocked fields.
The goal of this exercise is to devise an algorithm that given a maze returns the best solution (traversal
from ’S’ to ’T’) of the maze. The best solution is the one that requires the least moves between adjacent
fields.

Hint: You may assume that there always exists at least one unblocked path from ’S’ to ’T’ in a maze.

Figure 1: An example of a 7 × 7 maze in which purple fields are blocked and white fields can be traversed
(are unblocked). The start field is marked with ’S’ and the target field with a ’T’.

(a) Model the problem as a graph problem. Describe the set of vertices V and the set of edges E in
words. Reformulate the problem description as a graph problem on the resulting graph.
Solution:
V is the set of unblocked fields, and there is an edge between vi and vj if and only if vi and vj
are adjacent unblocked fields. The corresponding graph problem is to find a shortest path between
vertices ‘S’ and ‘T’ in G = (V, E).
(b) Choose a data structure to represent your maze-graphs and use an algorithm discussed in the lecture
to solve the problem.
Solution:
The data structure is adjacency list, the algorithm is BFS starting from ‘S’. Once we know all the
distances from ‘S’, we append vertices to a sequence starting from ‘T’ using the following rule: if the
last appended vertex is v, we append some neighbour u of v such that dG (‘S’, v) = dG (‘S’, u) + 1.
We stop after appending ‘S’. Then we return a reverse sequence.
Hint: If there are multiple solutions of the same quality, return any one of them.
(c) Determine the running time and memory requirements of your algorithm in terms of n in Θ nota-
tion.
Solution:
Adjacency list requires Θ(|V | + |E|) memory, where |V | is the number of vertices and |E| is the number
of edges in the graph. BFS requires Θ(|V | + |E|) time and the appending procedure also requires
Θ(|V | + |E|) time, so the total running time is Θ(|V | + |E|). Since each vertex has degree at most
4, |E| = O(|V |), so the running time and memory are Θ(|V |) which is Θ(n2 ) in the worst case.
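The whole pipeline, BFS plus path reconstruction, might be sketched in Python as follows (illustrative; the grid encoding with '#', '.', 'S', 'T' and the 3 × 3 example are our own). Here a parent pointer is stored during BFS instead of re-scanning neighbours by distance; both variants yield a shortest path:

```python
from collections import deque

def solve_maze(grid):
    """BFS shortest path in a maze given as a list of strings:
    '#' blocked, '.' free, 'S' start, 'T' target. Returns the list of cells."""
    n = len(grid)
    start = next((r, c) for r in range(n) for c in range(n) if grid[r][c] == 'S')
    target = next((r, c) for r in range(n) for c in range(n) if grid[r][c] == 'T')

    dist, parent = {start: 0}, {start: None}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < n and 0 <= nc < n and grid[nr][nc] != '#'
                    and (nr, nc) not in dist):
                dist[(nr, nc)] = dist[(r, c)] + 1
                parent[(nr, nc)] = (r, c)
                queue.append((nr, nc))

    path, cell = [], target            # walk parents back from 'T' to 'S'
    while cell is not None:
        path.append(cell)
        cell = parent[cell]
    return path[::-1]

maze = ["S.#",
        ".##",
        "..T"]
path = solve_maze(maze)
assert path[0] == (0, 0) and path[-1] == (2, 2)
assert len(path) - 1 == 4              # four moves: down, down, right, right
```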

Eidgenössische Technische Hochschule Zürich
Ecole polytechnique fédérale de Zurich
Politecnico federale di Zurigo
Federal Institute of Technology at Zurich

Department of Computer Science
5 December 2022
Markus Püschel, David Steurer
François Hublet, Goran Zuzic, Tommaso d’Orsi, Jingqiu Ding

Algorithms & Data Structures Exercise sheet 11 HS 22

The solutions for this sheet are submitted at the beginning of the exercise class on 12 December 2022.
Exercises that are marked by ∗ are “challenge exercises”. They do not count towards bonus points.
You can use results from previous parts without solving those parts.

Exercise 11.1 Shortest paths by hand.


Dijkstra’s algorithm allows us to find shortest paths in a directed graph when all edge costs are nonnegative. Here is pseudocode for that algorithm:

Algorithm 1
Input: a weighted graph, represented via c(·, ·). Specifically, for two vertices u, v the value c(u, v)
represents the cost of an edge from u to v (or ∞ if no such edge exists).
function Dijkstra(G, s)
d[s] ← 0 ▷ upper bounds on distances from s
d[v] ← ∞ for all v ≠ s
S←∅ ▷ set of vertices with known distances
while S ≠ V do
choose v ∗ ∈ V \ S with minimum upper bound d[v ∗ ]
add v ∗ to S
update upper bounds for all v ∈ V \ S:
d[v] ← min predecessor u∈S of v (d[u] + c(u, v))
(if v has no predecessors in S, this minimum is ∞)

We remark that this version of Dijkstra’s algorithm focuses on illustrating how the algorithm explores
the graph and why it correctly computes all distances from s. You can use this version of Dijkstra’s
algorithm to solve this exercise.
In order to achieve the best possible running time, it is important to use an appropriate data structure
for efficiently maintaining the upper bounds d[v] with v ∈ V \S, as you saw in the lecture on December
1. In the other exercises/sheets and in the exam you should use the running time of the efficient version
of the algorithm (and not the running time of the pseudocode described above).
Consider the following weighted directed graph:
[Figure: weighted directed graph with edges s→a (5), s→b (3), s→c (10), a→c (1), a→d (8),
b→c (5), c→d (9), c→e (3), e→b (1), e→d (2)]

a) Execute Dijkstra’s algorithm described above by hand to find a shortest path from s to each
vertex in the graph. After each step (i.e. after each choice of v ∗ ), write down:
1) the upper bounds d[u], for u ∈ V , between s and each vertex u computed so far,
2) the set M of all vertices for which the minimal distance has been correctly computed so far,
3) and the predecessor p(u) for each vertex in M .
Solution:
When we choose s: d[s] = 0, d[a] = d[b] = d[c] = d[d] = d[e] = ∞, M = {s}, there is no p(s).
When we choose b: d[s] = 0, d[a] = 5, d[b] = 3, d[c] = 10, d[d] = d[e] = ∞, M = {s, b}, there
is no p(s), p(b) = s.
When we choose a: d[s] = 0, d[a] = 5, d[b] = 3, d[c] = 8, d[d] = d[e] = ∞, M = {s, a, b}, there is
no p(s), p(a) = p(b) = s.
When we choose c: d[s] = 0, d[a] = 5, d[b] = 3, d[c] = 6, d[d] = 13, d[e] = ∞, M = {s, a, b, c},
there is no p(s), p(a) = p(b) = s, p(c) = a.
When we choose e: d[s] = 0, d[a] = 5, d[b] = 3, d[c] = 6, d[d] = 13, d[e] = 9, M = {s, a, b, c, e},
there is no p(s), p(a) = p(b) = s, p(c) = a, p(e) = c.
When we choose d: d[s] = 0, d[a] = 5, d[b] = 3, d[c] = 6, d[d] = 11, d[e] = 9, M = {s, a, b, c, d, e},
there is no p(s), p(a) = p(b) = s, p(c) = a, p(d) = e, p(e) = c.
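The hand execution can be cross-checked with the heap-based version of Dijkstra's algorithm mentioned before the exercise. Below is a minimal Python sketch; the adjacency dictionary is the edge list consistent with the hand execution above.

```python
import heapq

def dijkstra(graph, s):
    # graph: {u: [(v, w), ...]} with nonnegative weights; returns distances from s.
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                       # stale heap entry, skip it
        for v, w in graph.get(u, []):
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

graph = {
    's': [('a', 5), ('b', 3), ('c', 10)],
    'a': [('c', 1), ('d', 8)],
    'b': [('c', 5)],
    'c': [('d', 9), ('e', 3)],
    'e': [('b', 1), ('d', 2)],
}
print(dijkstra(graph, 's'))  # {'s': 0, 'a': 5, 'b': 3, 'c': 6, 'd': 11, 'e': 9}
```

The returned distances match the final values of d[·] computed by hand.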
b) Change the weight of the edge (a, c) from 1 to −1 and execute Dijkstra’s algorithm on the new
graph. Does the algorithm work correctly (are all distances computed correctly) ? In case it breaks,
where does it break?
Solution:
The algorithm works correctly.
When we choose s: d[s] = 0, d[a] = d[b] = d[c] = d[d] = d[e] = ∞.
When we choose b: d[s] = 0, d[a] = 5, d[b] = 3, d[c] = 10, d[d] = d[e] = ∞.
When we choose a: d[s] = 0, d[a] = 5, d[b] = 3, d[c] = 8, d[d] = d[e] = ∞.
When we choose c: d[s] = 0, d[a] = 5, d[b] = 3, d[c] = 4, d[d] = 13, d[e] = ∞.
When we choose e: d[s] = 0, d[a] = 5, d[b] = 3, d[c] = 4, d[d] = 13, d[e] = 7.

When we choose d: d[s] = 0, d[a] = 5, d[b] = 3, d[c] = 4, d[d] = 9, d[e] = 7.
c) Now, additionally change the weight of the edge (e, b) from 1 to −6 (so edges (a, c) and (e, b) now
have negative weights). Show that in this case the algorithm doesn’t work correctly, i.e. there exists
some u ∈ V such that d[u] is not equal to a minimal distance from s to u after the execution of the
algorithm.
Solution:
The algorithm doesn’t work correctly, for example, the distance from s to b is 1 (via the path s-a-c-
e-b), but the algorithm computes exactly the same values of d[·] as in part b), so d[b] = 3.
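The correct distances for part c) can be obtained with the Bellman–Ford algorithm, which tolerates negative edge weights as long as no negative cycle is reachable from s (here the only cycle, c→e→b→c, has weight 3 − 6 + 5 = 2 ≥ 0). A minimal sketch, with the graph's edge list and the modified weights w(a, c) = −1 and w(e, b) = −6:

```python
# Edges of the graph with the modified weights w(a, c) = -1 and w(e, b) = -6.
edges = [('s', 'a', 5), ('s', 'b', 3), ('s', 'c', 10), ('a', 'c', -1), ('a', 'd', 8),
         ('b', 'c', 5), ('c', 'd', 9), ('c', 'e', 3), ('e', 'b', -6), ('e', 'd', 2)]
vertices = ['s', 'a', 'b', 'c', 'd', 'e']

dist = {v: float('inf') for v in vertices}
dist['s'] = 0
for _ in range(len(vertices) - 1):     # |V| - 1 rounds of relaxing every edge
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            dist[v] = dist[u] + w

print(dist['b'])  # 1, via s-a-c-e-b; Dijkstra's algorithm wrongly reports 3
```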

Exercise 11.2 Depth-First Search Revisited (1 point).


In this exercise we examine the depth-first search in a graph G = (V, E), printed here for convenience.
For concreteness, you can assume that V = {1, . . . , n} and that for v ∈ V we have access to an
adjacency list adj[v].

Algorithm 2
Input: graph G, given as adj and n ≥ 1.
Global variable: marked[1 . . . n], initialized to [False, False, . . . , False].
Global variable: T , initialized to T ← 1.
Global variable: pre[1 . . . n]. ▷ Pre-order number.
Global variable: post[1 . . . n]. ▷ Post-order number.

function DFS(v)
    marked[v] ← True
    pre[v] ← T
    T ← T + 1
    for each neighbor w ∈ adj[v] do
        if not marked[w] then
            DFS(w)
    post[v] ← T
    T ← T + 1

for v ∈ {1, . . . , n} do
    if not marked[v] then
        DFS(v)

(a) Consider the graphical representation of the DFS order where a vertex v is represented as an interval
[pre(v), post(v)]. Give a short argument why in directed or undirected graphs no two such intervals
can intersect without one being fully contained in the other. Specifically, argue why the situation
depicted in the figure below cannot happen.

[Figure: two crossing intervals with pre(a) < pre(b) < post(a) < post(b)]

Solution:

Assume, for the sake of contradiction that the pre/post relations can be achieved as in the figure.
Hence, the DFS function is called on vertex a before being called on b. However, the figure stipulates
that b starts before a completes. Now, since recursive calls stack (i.e., a parent call can only finish
after all of its children’s calls finish), it must be the case that b completes before a completes. Hence,
interval b would be completely contained within interval a, which is contradicting the figure.
Grading: To get points, you need to mention (or imply) the observation that recursive calls stack.
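The nesting of intervals can also be observed experimentally. The following Python sketch mirrors Algorithm 2 and checks that every pair of intervals [pre(v), post(v)] is either disjoint or nested (the small example graph is arbitrary):

```python
def dfs_numbers(adj, n):
    # Python version of Algorithm 2: assigns pre/post numbers to vertices 1..n.
    pre, post, marked = {}, {}, [False] * (n + 1)
    counter = [1]
    def dfs(v):
        marked[v] = True
        pre[v] = counter[0]; counter[0] += 1
        for w in adj.get(v, []):
            if not marked[w]:
                dfs(w)
        post[v] = counter[0]; counter[0] += 1
    for v in range(1, n + 1):
        if not marked[v]:
            dfs(v)
    return pre, post

pre, post = dfs_numbers({1: [2, 3], 2: [3], 3: []}, 3)
# Intervals: 1 -> [1, 6], 2 -> [2, 5], 3 -> [3, 4]; every pair is nested, none cross.
for a in (1, 2, 3):
    for b in (1, 2, 3):
        if a != b:
            disjoint = post[a] < pre[b] or post[b] < pre[a]
            nested = (pre[a] < pre[b] and post[b] < post[a]) \
                or (pre[b] < pre[a] and post[a] < post[b])
            assert disjoint or nested
```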
(b) Give a short argument why undirected graphs cannot have any cross edges.
Solution:
A crossing edge is an edge between vertices a and b, where the intervals corresponding to those
vertices are disjoint (see the figure).

[Figure: two disjoint intervals with pre(a) < post(a) < pre(b) < post(b)]

Suppose this were the case. Then, in the recursive call corresponding to vertex a, we would discover
the neighbor b unmarked (its interval starts only after a's interval ends). Hence, the call to b would
necessarily be a child of the call to a, so the interval of b would be nested inside that of a,
contradicting the assumption that the intervals are disjoint. This completes the argument.
Grading: To get points, you need to recall the definition of crossing edge; and observe that the call
to b would necessarily be a child of a, which contradicts the definition.
(c) Prove that a directed graph is acyclic (i.e., a DAG) if and only if it has no back edges. This was
proven in the lecture, but the goal here is to explicitly write out the entire argument.
Hint: You need to prove both directions of the equivalence.
Hint: For the ( =⇒ ) direction, assume the opposite (there is a back edge), then simply find a cycle
containing that back edge. If needed, you can use without proof the property that if the interval of a is
contained within interval b, then there exists a simple path from b to a.
Hint: For the ( ⇐= ) direction, we need to prove the graph is a DAG (i.e., acyclic). It is sufficient to
find a topological ordering such that all directed edges originate at vertices that are before their tail
(according to the ordering). One specific order that works is the reverse post-order.
Solution:
Direction ( =⇒ ). Assume for the sake of contradiction that there is a back edge a → b. In other
words, the interval a is nested inside interval b. Hence, by DFS properties, there exists a simple
path p from b to a. Hence, p and a → b form a (directed) cycle. This contradiction completes the
( =⇒ ) direction.
Direction ( ⇐= ). It is sufficient to find a function π : V → {1, . . . , 2n} (called the topological
ordering) such that for every directed edge a → b we have π(a) > π(b) (it can be easily shown
that any graph that is consistent with π must be a DAG). Consider π := post, i.e., the post-order
and consider an edge a → b. As covered in the lecture, there are 4 possible relations between the
intervals of a and b: (1) the intervals are disjoint and a is before b, (2) the intervals are disjoint and b
is before a (3) a is nested within b, or (4) b is nested within a. Option (1) is impossible as then the call
to a would trigger a nested call to b. In option (2), it holds that post(a) = π(a) > π(b) = post(b).
Option (3) is impossible since then a → b is a back edge and we assumed those don’t exist. In
option (4), it holds that post(a) = π(a) > π(b) = post(b). Since all options satisfy π(a) > π(b),
we conclude that the π is a topological ordering, hence G is a DAG. This completes the ( ⇐= )
direction.
Grading: For direction ( =⇒ ), you need to observe that there is a path p from b to a which leads to

cycle with backedge. For direction ( ⇐= ), you need to list all 4 different cases, and argue that for
possible cases, there is a topological ordering. You also need to mention that existence of topological
ordering implies DAG.
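The reverse post-order used in the (⇐=) direction can be computed directly. A minimal Python sketch on a small example DAG, checking that every edge goes forward in the resulting ordering:

```python
def reverse_post_order(adj, n):
    # Reverse post-order of a DFS: a topological ordering whenever there are no back edges.
    marked, order = [False] * (n + 1), []
    def dfs(v):
        marked[v] = True
        for w in adj.get(v, []):
            if not marked[w]:
                dfs(w)
        order.append(v)            # append on completion: post-order
    for v in range(1, n + 1):
        if not marked[v]:
            dfs(v)
    return order[::-1]

adj = {1: [2, 3], 2: [4], 3: [4], 4: []}
top = reverse_post_order(adj, 4)
print(top)
position = {v: i for i, v in enumerate(top)}
# Every edge u -> v satisfies position[u] < position[v], as argued above.
assert all(position[u] < position[v] for u in adj for v in adj[u])
```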

Exercise 11.3 Language Hiking (2 points).


Alice loves both hiking and learning new languages. Since she moved to Switzerland, she has always
wanted to discover all four language regions of the country in a single hike – but she is not sure whether
her week of vacation will be sufficient.
You are given a graph G = (V, E) representing the towns of Switzerland. Each vertex v ∈ V corresponds
to a town, and there is an (undirected) edge {v1 , v2 } ∈ E if and only if there exists a direct road going
from town v1 to town v2 . Additionally, there is a function w : E → N such that w(e) corresponds to
the number of hours needed to hike over road e, and a function ℓ : V → {G, F, I, R} that maps each
town to the language that is spoken there¹. For simplicity, we assume that only one language is spoken
in each town.
Alice asks you to find an algorithm that returns the walking duration (in hours) of the shortest hike
that goes through at least one town speaking each of the four languages.
For example, consider the following graph, where languages appear on vertices:

[Figure: a weighted graph whose vertices are labeled with languages G, F, I, R; one path of total
weight 40 is marked in red]
The shortest path satisfying the condition is marked in red. It goes through one R vertex, one I vertex,
two G vertices and one F vertex. Your algorithm should return the cost of this path, i.e., 40.
(a) Suppose we know the order of languages encountered in the shortest hike. It first goes from an
R vertex to an I vertex, then immediately to a G vertex, and reaches an F vertex in the end, af-
ter going through zero, one or more additional G vertices. In other terms, the form of the path
is RIGF or RIG…GF. In this case, describe an algorithm which finds the shortest path satisfying
the condition, and explain its runtime complexity. Your algorithm must have complexity at most
O((|V | + |E|) log |V |).
Hint: Consider the new vertex set V′ = (V × {1, 2, 3, 4}) ∪ {vs, vd}, where vs is a ‘super source’ and
vd a ‘super destination’ vertex.
Solution:
¹G, F, I, and R stand for German, French, Italian, and Romansh respectively.
Consider the vertex set V′ above, as well as the following edge set E′ and weight function w′:

E′ = {{vs, (v, 1)} | v ∈ V, ℓ(v) = R}
∪ {{(u, 1), (v, 2)} | {u, v} ∈ E, ℓ(v) = I}
∪ {{(u, 2), (v, 3)} | {u, v} ∈ E, ℓ(v) = G}
∪ {{(u, 3), (v, 3)} | {u, v} ∈ E, ℓ(v) = G}
∪ {{(u, 3), (v, 4)} | {u, v} ∈ E, ℓ(v) = F}
∪ {{(v, 4), vd} | v ∈ V}

w′({u′, v′}) = 0 if u′ = vs or v′ = vd,
w′({u′, v′}) = w({u, v}) if u′ = (u, i) and v′ = (v, j).

For each new vertex (v, i) ∈ V′, the first component v ∈ V is a vertex in the original graph, while
i is a counter which measures the progress along the path: if i = 1, only an R town has been visited;
if i = 2, an R and an I town have been visited; if i = 3, an R, an I, and at least one (or more) G
towns have been visited; if i = 4, an R, an I, one or more G, and an F town have been visited. The
weight of this edge remains the same as before. As an arbitrary number of G towns can be visited,
we have transitions (u, 3) → (v, 3) (G to G) as well as (u, 3) → (v, 4) (G to F); since this is not the
case for R, I, and F, we have only transitions vs → (u, 1), (u, 1) → (v, 2), and (u, 4) → vd.
Moreover, a global source vertex vs is connected to all R vertices. This corresponds to the choice of
the first vertex (where Alice will start hiking). Similarly, a global destination vertex vd is connected
to all vertices with i = 4 with edges of weight 0, corresponding to the choice of the last vertex.
The length of the shortest path that follows the given pattern is exactly the length of the shortest
path between vs and vd in G′ = (V′, E′) with weights w′. Since all weights are nonnegative, we
can use Dijkstra’s algorithm to find this shortest path.
The complexity of Dijkstra’s algorithm is O((|V′| + |E′|) log(|V′|)). Here, we have

|V′| = 4 · |V| + 2 = O(|V|)
|E′| ≤ |V| + |V| + 2 · |E| = O(|V| + |E|),

yielding O((|V| + (|V| + |E|)) log(|V|)) = O((|V| + |E|) log(|V|)). Constructing the graph adds
a cost of O(|V| + |E|) and extracting the result O(1). We obtain a total runtime of O((|V| +
|E|) log(|V|)).
Grading: To get points, you need to 1. construct the correct graph, 2. argue that it corresponds to
a shortest path problem in a graph with nonnegative weights, and 3. use Dijkstra’s algorithm and
give the right complexity.
(b) Now we don’t make the assumption in (a). Describe an algorithm which finds the shortest path
satisfying the condition. Briefly explain your approach and the resulting runtime complexity. To
obtain full points, your algorithm must have complexity at most O((|V | + |E|) log |V |).
Hint: Consider the new vertex set V′ = (V × {0, 1}^4) ∪ {vs, vd}, where vs is a ‘super source’ and vd a
‘super destination’ vertex.
Solution:

Consider the vertex set V′ above, as well as the following edge set E′ and weight function w′:

E′ = {{vs, (v, (1, 0, 0, 0))} | v ∈ V, ℓ(v) = G}
∪ {{vs, (v, (0, 1, 0, 0))} | v ∈ V, ℓ(v) = F}
∪ {{vs, (v, (0, 0, 1, 0))} | v ∈ V, ℓ(v) = I}
∪ {{vs, (v, (0, 0, 0, 1))} | v ∈ V, ℓ(v) = R}
∪ {{(v, (1, 1, 1, 1)), vd} | v ∈ V}
∪ {{(u, (g, f, i, r)), (v, (1, f, i, r))} | (g, f, i, r) ∈ {0, 1}^4, {u, v} ∈ E, ℓ(v) = G}
∪ {{(u, (g, f, i, r)), (v, (g, 1, i, r))} | (g, f, i, r) ∈ {0, 1}^4, {u, v} ∈ E, ℓ(v) = F}
∪ {{(u, (g, f, i, r)), (v, (g, f, 1, r))} | (g, f, i, r) ∈ {0, 1}^4, {u, v} ∈ E, ℓ(v) = I}
∪ {{(u, (g, f, i, r)), (v, (g, f, i, 1))} | (g, f, i, r) ∈ {0, 1}^4, {u, v} ∈ E, ℓ(v) = R}

w′({u′, v′}) = 0 if u′ = vs or v′ = vd,
w′({u′, v′}) = w({u, v}) if u′ = (u, (g, f, i, r)) and v′ = (v, (g′, f′, i′, r′)).

For each new vertex (v, (g, f, i, r)) ∈ V′, the first component v ∈ V is a vertex in the original
graph, while g, f , i, and r are four Boolean variables that keep track of whether a town with
language G, F, I, or R has been visited already. Every edge {u, v} ∈ E is replaced by a set of edges
{(u, (g, f, i, r)), (v, (g′, f′, i′, r′))} ⊆ E′ where the Boolean corresponding to language ℓ(v) is set
to 1 and the other Booleans are kept unchanged. The weight of this edge remains the same as before.
Moreover, a global source vertex vs is connected to all vertices with (0, 0, 0, 0) Booleans with edges
of weight 0. This corresponds to the choice of the first vertex (where Alice will start hiking). Simi-
larly, a global destination vertex vd is connected to all vertices with (1, 1, 1, 1) Booleans with edges
of weight 0, corresponding to the choice of the last vertex.
The length of the shortest path that goes through all language regions is exactly the length of
the shortest path between vs and vd in G′ = (V′, E′) with weights w′. Since all weights are
nonnegative, we can use Dijkstra’s algorithm to find this shortest path.
The complexity of Dijkstra’s algorithm is O((|V′| + |E′|) log(|V′|)). Here, we have

|V′| = 2^4 · |V| + 2 = O(|V|)
|E′| ≤ |V| + |V| + 2^4 · |E| = O(|V| + |E|),

yielding O((|V| + (|V| + |E|)) log(|V|)) = O((|V| + |E|) log(|V|)). Constructing the graph adds
a cost of O(|V| + |E|) and extracting the result O(1). We obtain a total runtime of O((|V| +
|E|) log(|V|)).
Grading: To get points, you need to 1. construct the correct graph, 2. argue that it corresponds to
a shortest path problem in a graph with nonnegative weights, and 3. use Dijkstra’s algorithm and
give the right complexity.
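The construction from the solution can equivalently be implemented by running Dijkstra's algorithm directly on states (v, mask), where mask records the set of languages visited so far; this avoids materializing E′ explicitly. Below is a sketch on a small hypothetical graph (a path G–F–I–R), not the graph from the sheet:

```python
import heapq

BIT = {'G': 1, 'F': 2, 'I': 4, 'R': 8}
ALL = 15                                   # all four language bits set

def shortest_language_hike(adj, lang):
    # adj: {v: [(u, w), ...]} undirected; lang: {v: 'G'|'F'|'I'|'R'}.
    # State (v, mask): standing at town v, having visited the languages in mask.
    dist, heap = {}, []
    for v in adj:                          # implicit super source vs: start anywhere, cost 0
        state = (v, BIT[lang[v]])
        dist[state] = 0
        heapq.heappush(heap, (0, state))
    best = float('inf')
    while heap:
        d, (v, mask) = heapq.heappop(heap)
        if d > dist.get((v, mask), float('inf')):
            continue
        if mask == ALL:                    # implicit super destination vd
            best = min(best, d)
        for u, w in adj[v]:
            nstate = (u, mask | BIT[lang[u]])
            if d + w < dist.get(nstate, float('inf')):
                dist[nstate] = d + w
                heapq.heappush(heap, (d + w, nstate))
    return best

adj = {1: [(2, 2)], 2: [(1, 2), (3, 3)], 3: [(2, 3), (4, 4)], 4: [(3, 4)]}
lang = {1: 'G', 2: 'F', 3: 'I', 4: 'R'}
print(shortest_language_hike(adj, lang))  # 9: the whole path must be walked
```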


Departement of Computer Science 12 December 2022


Markus Püschel, David Steurer
François Hublet, Goran Zuzic, Tommaso d’Orsi, Jingqiu Ding

Algorithms & Data Structures Exercise sheet 12 HS 22

The solutions for this sheet are submitted at the beginning of the exercise class on 19 December 2022.
Exercises that are marked by ∗ are “challenge exercises”. They do not count towards bonus points.
You can use results from previous parts without solving those parts.

Exercise 12.1 MST practice.


Consider the following graph

[Figure: undirected weighted graph with edges {a, b} (8), {a, c} (6), {b, c} (7), {b, d} (10),
{b, e} (4), {c, d} (1), {c, f} (3), {d, e} (12), {d, f} (2), {e, f} (5)]
a) Compute the minimum spanning tree (MST) using Boruvka’s algorithm. For each step, provide the
set of edges that are added to the MST.
Solution:
At the first step we add edges {a, c}, {b, e}, {c, d}, {d, f}. At the second step we add {e, f}.
b) Provide the order in which Kruskal’s algorithm adds the edges to the MST.
Solution:
{c, d}, {d, f}, {b, e}, {e, f}, {a, c}.
c) Provide the order in which Prim’s algorithm (starting at vertex d) adds the edges to the MST.
Solution:
{c, d}, {d, f}, {e, f}, {b, e}, {a, c}.
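The three runs can be cross-checked with a short Kruskal implementation (union-find with path halving); the weighted edge list below is the one consistent with all three hand solutions above:

```python
def kruskal_order(vertices, edges):
    # edges: list of (w, u, v); returns the MST edges in the order Kruskal adds them.
    parent = {v: v for v in vertices}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    order = []
    for w, u, v in sorted(edges):           # process edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                        # skip edges that would close a cycle
            parent[ru] = rv
            order.append((u, v))
    return order

edges = [(8, 'a', 'b'), (6, 'a', 'c'), (7, 'b', 'c'), (10, 'b', 'd'), (4, 'b', 'e'),
         (1, 'c', 'd'), (3, 'c', 'f'), (12, 'd', 'e'), (2, 'd', 'f'), (5, 'e', 'f')]
print(kruskal_order('abcdef', edges))
# [('c', 'd'), ('d', 'f'), ('b', 'e'), ('e', 'f'), ('a', 'c')]
```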
Exercise 12.2 Maximum Spanning Trees and Trucking (2 points).
We start with a few questions about maximum spanning trees.
(a) How would you find the maximum spanning tree in a weighted graph G? Briefly explain an
algorithm with runtime O((|V | + |E|) log |V |).
Solution:
We simply take any MST algorithm (e.g., Boruvka, Prim, or Kruskal) and replace all the mins with
maxs. Specifically: in Boruvka, we will find the maximum-weight outgoing edge from each con-
nected component (“ZHK” from the lecture); in Prim, we will extract-max (instead of extract-min),
use max to update weights, and use increase-key; in Kruskal, we will sort in decreasing order. The
correctness arguments do not change (except for replacing “minimum” with “maximum”); the same
O((|V | + |E|) log |V |) bound holds for runtime.
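A minimal sketch of the Kruskal variant just described (edges sorted in decreasing order, union-find with path halving); the example graph is hypothetical:

```python
def maximum_spanning_tree(n, edges):
    # edges: list of (w, u, v) on vertices 1..n; Kruskal processing decreasing weights.
    parent = list(range(n + 1))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    mst = []
    for w, u, v in sorted(edges, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:                        # adding the edge creates no cycle
            parent[ru] = rv
            mst.append((w, u, v))
    return mst

print(maximum_spanning_tree(4, [(10, 1, 2), (5, 1, 3), (8, 2, 3), (10, 3, 4)]))
# [(10, 3, 4), (10, 1, 2), (8, 2, 3)]
```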
(b) Given a weighted graph G = (V, E) with weights w : E → R, let G≥x = (V, {e ∈ E | w(e) ≥ x})
be the subgraph where we only preserve edges of weight x or more. Prove that for every s ∈ V, t ∈
V, x ∈ R, if s and t are connected in G≥x then they will also be connected in T≥x , where T is the
maximum spanning tree of G.
Hint: Use Kruskal’s algorithm as inspiration for the proof.
Hint: If it helps, you can assume all edges have distinct weight and only prove the claim for that case.
Solution:
As argued in class, the maximum spanning tree is obtained by running Kruskal’s algorithm that
sorts the edges by decreasing weight, hence edges of G≥x will be processed strictly before all of
G<x := G\G≥x . Furthermore, Kruskal’s algorithm only removes an edge if it would create a cycle,
which does not affect connectivity. Hence, any pair s, t ∈ V that was connected in G≥x will still
be connected in the maximum spanning tree using edges of weight at least x. In other words, s and
t will be connected in T≥x , as needed.
Problem: You are starting a truck company in a graph G = (V, E) with V = {1, 2, . . . , n}. Your
headquarters are in vertex 1 and your goal is to deliver the maximum amount of cargo to a destination
t ∈ V in a single trip. Due to local laws, each road e ∈ E has a maximum amount of cargo your truck
can be loaded with while traversing e. Find the maximum amount of cargo you can deliver for each
t ∈ V with an algorithm that runs in O((|V | + |E|) log |V |) time.
Example:

[Figure: graph on vertices 1, 2, 3, 4 with edges {1, 2} of weight 10, {1, 3} of weight 5,
{2, 3} of weight 8, {3, 4} of weight 10]

Output:
Max cargo to 1 is ∞
Max cargo to 2 is 10
Max cargo to 3 is 8
Max cargo to 4 is 8

Explanation: The best path from the headquarters to 4 is 1 → 2 → 3 → 4, and the maximum cargo
the truck can carry is min(10, 8, 10) = 8.

(c) Prove that for every t ∈ V , the optimal route is to take the unique path in the maximum spanning
tree of G.
Hint: Suppose that the largest amount of cargo we can carry from 1 to t in G (i.e., the correct result)
is OP T and let ALG be the largest amount of cargo from 1 to t in the maximum spanning tree. We
need to prove two directions: OP T ≤ ALG and OP T ≥ ALG.

2
Hint: One direction holds trivially as any spanning tree is a subgraph. For the other direction, use part
(b).
Solution:
Suppose that the largest amount of cargo we can carry from 1 to t in G (i.e., the correct result) is
OP T and let ALG be the largest amount of cargo from 1 to t in the maximum spanning tree.
Direction ALG ≥ OP T . By definition of OP T , there exists a path from 1 to t where all edges
have weight w(e) ≥ OP T . In other words, 1 and t are connected via G≥OP T . By part (b), they will
also be connected in T≥OP T , where T is the maximum spanning tree of G. Hence, there is a path
in T between 1 and t where all edges have weight w(e) ≥ OP T . We conclude that ALG ≥ OP T .
Direction ALG ≤ OP T . Since any spanning tree is a subgraph of the original graph and no
solution in a subgraph can be larger than in G, we conclude that ALG ≤ OP T .
(d) Write the pseudocode of the algorithm that computes the output for all t ∈ V and runs in O((|V | +
|E|) log |V |). You can assume that you have access to a function that computes the maximum
spanning tree from G and outputs it in any standard format. Briefly explain why the runtime
bound holds.
Solution:

Algorithm 1
Input: graph G, given as n ≥ 1 and an adjacency list adj of (neighbor, weight) pairs.
Global variable: marked[1 . . . n], initialized to [False, False, . . . , False].

function DFS(u, capacity) ▷ we can reach u with a truck of capacity
    Print(”Max cargo to ”, u, ” is ”, capacity)
    marked[u] ← True
    for each neighbor (v, w) ∈ adj[u] do ▷ edge u → v has weight w
        if not marked[v] then
            DFS(v, min(capacity, w))

adj ← MaximumSpanningTree(G) ▷ we replace G with its maximum spanning tree
DFS(1, ∞)

The runtime of maximum spanning tree is O((|V | + |E|) log |V |) and the DFS runtime is O(|V | +
|E|). In total, we have a runtime of O((|V | + |E|) log |V |).
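A Python version of Algorithm 1 (with an iterative DFS instead of recursion, and the maximum spanning tree computed by the decreasing-order Kruskal from part (a)), run on the example from the problem statement:

```python
import math

def max_cargo(n, edges):
    # edges: list of (w, u, v) on vertices 1..n; headquarters at vertex 1.
    # Step 1: maximum spanning tree (Kruskal on decreasing weights, union-find).
    parent = list(range(n + 1))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    adj = {v: [] for v in range(1, n + 1)}
    for w, u, v in sorted(edges, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            adj[u].append((v, w))
            adj[v].append((u, w))
    # Step 2: DFS from the headquarters, tracking the bottleneck capacity.
    cargo = {1: math.inf}
    stack = [1]
    while stack:
        u = stack.pop()
        for v, w in adj[u]:
            if v not in cargo:
                cargo[v] = min(cargo[u], w)
                stack.append(v)
    return cargo

print(max_cargo(4, [(10, 1, 2), (5, 1, 3), (8, 2, 3), (10, 3, 4)]))
# {1: inf, 2: 10, 3: 8, 4: 8}, matching the example output
```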

Exercise 12.3 Counting Minimum Spanning Trees With Identical Edge Weights (1 point).
Let G = (V, E) be an undirected, weighted graph with weight function w.
It can be proven that, if G is connected and all its edge weights are pairwise distinct¹, then its Minimum
Spanning Tree is unique. You can use this fact without proof in the rest of this exercise.
For k ≥ 0, we say that G is k-redundant if k of G’s edge weights are non-unique, i.e.,

|{e ∈ E | ∃e′ ∈ E. e ≠ e′ ∧ w(e) = w(e′)}| = k.


¹I.e., for all e ≠ e′ ∈ E, w(e) ≠ w(e′).
In particular, if G’s edge weights are all distinct, then G is 0-redundant, and if its edge weights are all
identical, it is |E|-redundant.
(a) Given a weighted graph G = (V, E) with weight function c and e = {v, w} ∈ E, we say that we
contract e when we perform the following operations:
(i) Replace v and w by a single vertex vw in V , i.e., V′ ← (V − {v, w}) ∪ {vw}.
(ii) Replace any edge {v, x} or {w, x} by an edge {vw, x} in E, i.e.,

E′ ← E − {{v, x} | x ∈ V } − {{w, x} | x ∈ V } ∪ {{vw, x} | {v, x} ∈ E ∨ {w, x} ∈ E}.

(iii) Set the weight of the new edges to the weight of the original edges, taking the minimum of
the two weights if two edges are merged, i.e.,

c′({x, y}) = c({x, y}) if x, y ∉ {v, w}
c′({vw, x}) = c({v, x}) if {v, x} ∈ E, {w, x} ∉ E
c′({vw, x}) = c({w, x}) if {v, x} ∉ E, {w, x} ∈ E
c′({vw, x}) = min(c({v, x}), c({w, x})) if {v, x} ∈ E, {w, x} ∈ E.

For all G = (V, E) and e ∈ E, we denote by Ge the graph obtained by contracting e in G. Explain
why if T is an MST of G and e ∈ T , then Te must be an MST of Ge .
Solution:
Assume that Te is not an MST of Ge = (Ve, Ee). Then there exists a spanning tree (Ve, T′) of Ge
with total cost w(T′) < w(Te). Based on T′, we will construct a spanning tree of the original graph
G with smaller total cost.
Consider the following set of edges of the original graph G:

T″ = {e} ∪ {{x, y} | {x, y} ∈ T′ ∧ x, y ≠ vw}
∪ {{v, x} | {vw, x} ∈ T′ ∧ {v, x} ∈ E ∧ ({w, x} ∉ E ∨ c({w, x}) > c({v, x}))}
∪ {{w, x} | {vw, x} ∈ T′ ∧ {w, x} ∈ E ∧ ({v, x} ∉ E ∨ c({v, x}) > c({w, x}))}

Let us show that (V, T″) is a tree, using the following characterization: a tree is a connected graph
on n vertices with n − 1 edges. First, T″ has |T″| = |T′| + 1 = (|Ve| − 1) + 1 = |Ve| = |V| − 1
edges. Moreover, there is a path between every pair of vertices of G in T″. To show this, consider
x, y ∈ V . If {x, y} = {v, w}, then e is a path between x and y in T″. If {x, y} ≠ {v, w}, let p be a
path between x and y in T′. There are two cases:
• Either p does not go through vw, and it is also a path in T″;
• Or it contains vw, and we can replace the (at most two) edges adjacent to vw in p by their
preimages in T″. If the path p is transformed into two disjoint paths ending at v and w in the
process, then the edge e can be used to reconnect them in T″.
Therefore, (V, T″) is a tree. As it covers all vertices of G, (V, T″) is also a spanning tree of G.
Now, w(T″) = w(T′) + w(e) < w(Te) + w(e) = w(T ), contradicting the minimality of T . We
conclude that Te is an MST of Ge.
(b) Let k > 0. Show that for all k-redundant G = (V, E) and e ≠ e′ ∈ E with w(e) = w(e′), the
graph Ge is k′-redundant for some k′ ≤ k − 1.

Solution:
Let Ve, Ee be such that Ge = (Ve, Ee). Denote by we the weight function of Ge. For each a ≠ b ∈ Ee
such that we(a) = we(b), we can find a′ ≠ b′ ∈ E such that a′ and b′ are contracted to a and b
respectively, and w(a′) = w(b′). However, a′ and b′ can never be e, since e is removed from the
graph through the contraction operation. Therefore,

|{a ∈ Ee | ∃b ∈ Ee. a ≠ b ∧ we(a) = we(b)}| ≤ |{a′ ∈ E | ∃b′ ∈ E. a′ ≠ b′ ∧ w(a′) = w(b′)}| − 1,

and Ge is k′-redundant for some k′ ≤ k − 1.
(c) Show that if G is connected and k-redundant, it has at most 2^k distinct MSTs.
Hint: By induction over k, using (a) and (b).
Solution:
We prove, by induction over k ≥ 0: P (k): “Any k-redundant graph has at most 2^k distinct MSTs.”

Base case. For k = 0, this is exactly the lemma from the lecture: a graph whose edge weights are
all pairwise distinct has 2^0 = 1 MST.

Induction hypothesis. Let k ≥ 0 be such that P (k′) holds for all k′ ≤ k, i.e., any k′-redundant
graph has at most 2^k′ distinct MSTs.

Induction step. Let G = (V, E) be a (k + 1)-redundant graph. Let e be an edge whose weight
w(e) is not unique among the weights of edges in E. Let us consider the sets M1 of MSTs of G
that contain e and M2 of MSTs of G that do not contain e. Clearly, the total number of MSTs of G
is |M1| + |M2|. By (a), for any MST T ∈ M1, Te is an MST of Ge. Moreover, by (b), Ge is k′-redundant
for some k′ ≤ k, so |M1| is at most the number of MSTs of Ge, which is at most 2^k by the
induction hypothesis. Every MST T ∈ M2 is also an MST of G − {e}, which is at most k-redundant,
and therefore |M2| ≤ 2^k by the induction hypothesis. We get
|M1| + |M2| ≤ 2^k + 2^k = 2^(k+1), which proves P (k + 1).
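The bound can be sanity-checked by brute force on tiny graphs (exponential time, for illustration only): a single triangle with all weights 0 is 3-redundant and has exactly 3 MSTs, within the bound of 2^3 = 8.

```python
from itertools import combinations

def count_msts(vertices, edges):
    # Brute force: enumerate all (n-1)-edge subsets, keep spanning trees of minimum weight.
    n = len(vertices)
    best, count = float('inf'), 0
    for subset in combinations(edges, n - 1):
        parent = {v: v for v in vertices}
        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x
        acyclic = True
        for w, u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:               # subset contains a cycle: not a tree
                acyclic = False
                break
            parent[ru] = rv
        if not acyclic:
            continue                   # n-1 acyclic edges on n vertices form a spanning tree
        weight = sum(w for w, _, _ in subset)
        if weight < best:
            best, count = weight, 1
        elif weight == best:
            count += 1
    return count

# One triangle with all weights 0: 3-redundant, 3 distinct MSTs.
print(count_msts('abc', [(0, 'a', 'b'), (0, 'b', 'c'), (0, 'a', 'c')]))  # 3
```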
(d) Show that for all large enough n, there exists a graph G such that G is n-redundant and has at
least 2^(n/2) distinct MSTs.
Hint: First assume that n = 3k for some k. Consider graphs of the following form, where all unmarked
edges have weight 0. When n = 3k + 1 or n = 3k + 2, you can add one or two edges with cost k and
k + 1 at either end.

[Figure: a chain of k triangles, all triangle edges of weight 0, consecutive triangles joined by single
edges of weights 1, 2, 3, . . . , k − 1]
Solution:
For k ≥ 0, denote by Gk the graph of the above form, with k connected triangles. This graph has
3k + (k − 1) = 4k − 1 edges and redundancy 3k, since there are 3k edges with weight 0 (the triangle
edges) and all other edges have distinct weights 1, . . . , k − 1.
For any k ≥ 0, the MSTs of Gk contain all non-zero edges, while in each triangle, one can choose
independently between the following three pairs of edges:
[Figure: the three ways to pick two of the three edges of a triangle]

Hence, the 3k-redundant graph has 3^k = 3^(3k/3) = 2^((log2 3) · 3k/3) distinct MSTs. Since
(log2 3)/3 ≈ 0.53 > 1/2, this is more than 2^(3k/2) MSTs. This proves the result when n = 3k.
When n = 3k + 1 or n = 3k + 2, we can add one or two additional edges at either end of Gk to
obtain an n-redundant graph, e.g., for n = 3k + 1:

[Figure: the chain of k triangles with one additional weight-0 edge attached at the end]

The graph has 2^((log2 3)(n−1)/3) or 2^((log2 3)(n−2)/3) MSTs, which is at least 2^(n/2) as soon as
(log2 3) · (n − 2)/3 ≥ n/2, i.e., n((log2 3)/3 − 1/2) ≥ (2 log2 3)/3, i.e.,
n ≥ ((2 log2 3)/3) / ((log2 3)/3 − 1/2) = 2/(1 − 3/(2 log2 3)) ≈ 37.3. Hence, for n ≥ 38, there
exists an n-redundant graph with at least 2^(n/2) distinct MSTs.
2


Departement of Computer Science 19 December 2021


Markus Püschel, David Steurer
François Hublet, Goran Zuzic, Tommaso d’Orsi, Jingqiu Ding

Algorithms & Data Structures Homework 13 HS 22

Exercise Class (Room & TA):


Submitted by:
Peer Feedback by:
Points:

Submission: This exercise sheet is not to be turned in. The solutions will be published at the end of
the week, before Christmas.
Exercise 13.1 Shortest path with negative edge weights (part I).
Let G = (V, E, w) be a graph with edge weights w : E → Z \ {0} and wmin = min_{e∈E} w(e).
Since Dijkstra’s algorithm cannot be used when some edge weights are negative (i.e., wmin < 0),
one could come up with the idea of applying a transformation to the edge weight of every edge e ∈ E,
namely w0 (e) = w(e) − wmin + 1, such that all weights become positive, and then find a shortest path
P in G by running Dijkstra with these new edge weights w0 .
Show that this is not a good idea by providing an example graph G with a weight function w, such
that the above approach finds a path P that is not a shortest path in G (this path P can start from the
vertex of your choice). The example graph should have exactly 5 nodes and not all weights should be
negative.
Solution: Consider for example the following graph:

We have that wmin = min_{e∈E} w(e) = −1, thus we add the value 1 − (−1) = 2 to every edge weight
to obtain the following transformed graph:
A shortest s-t-path in the transformed graph is ⟨s, u, v, t⟩. However, there is a shorter path in the original
graph since the vertices ⟨u, v, w, u⟩ form a cycle with negative weight. Hence, for an arbitrary s-t-path
in the original graph, we can always find a path with smaller weight by following this cycle once more.

Exercise 13.2 Shortest path with negative edge weights (part II).
We consider the following graph:

[Figure: directed graph on vertices 1, . . . , 6 with edges 1→2 (3), 1→3 (5), 2→1 (1), 2→3 (4),
2→5 (4), 2→6 (1), 3→4 (1), 3→5 (−4), 4→5 (5), 5→2 (1), 5→4 (2), 6→5 (2)]

1. What is the length of the shortest path from vertex 1 to vertex 6 ?


Solution: The shortest path from vertex 1 to vertex 6 is (1, 3, 5, 2, 6) and has length 5−4+1+1 =
3.
2. Consider Dijkstra’s algorithm (that fails here, because the graph has negative edge weights).
Which path length from vertex 1 to vertex 6 is Dijkstra computing? State the sets S, V \ S im-
mediately before Dijkstra is making its first error and explain in words what goes wrong.
Solution: With Dijkstra’s algorithm we find the path (1, 2, 6) having length 4. The first mistake
happens already after having processed vertex 1. The sets at that point in time are S = {1} and
V \ S = {2, 3, 4, 5, 6}. To vertex 2, we know a path of length 3, to vertex 3 a path of length 5. To
the other vertices, we do not know a path so far. Hence, Dijkstra’s algorithm chooses vertex 2 to
continue, i.e., includes 2 into S, which corresponds to the assumption, that we already know the
shortest path to this vertex. This is clearly a mistake, since the path (1, 3, 5, 2) has only length 2.
3. Which efficient algorithm can be used to compute a shortest path from vertex 1 to vertex 6 in the
given graph? What is the running time of this algorithm in general, expressed in n, the number
of vertices, and m, the number of edges ?
Solution: We can use the algorithm of Bellman and Ford which runs in O(nm) time.
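A minimal Bellman–Ford sketch on this graph, using the directed edge list of the graph above:

```python
import math

# Directed edges (u, v, w) of the graph above.
edges = [(1, 2, 3), (1, 3, 5), (2, 1, 1), (2, 3, 4), (2, 5, 4), (2, 6, 1),
         (3, 4, 1), (3, 5, -4), (4, 5, 5), (5, 2, 1), (5, 4, 2), (6, 5, 2)]
n = 6

dist = [math.inf] * (n + 1)                # dist[0] unused; vertices are 1..6
dist[1] = 0
for _ in range(n - 1):                     # n - 1 rounds of relaxing every edge
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            dist[v] = dist[u] + w

print(dist[6])  # 3, via the path (1, 3, 5, 2, 6)
```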
4. On the given graph, execute the algorithm by Floyd and Warshall to find all shortest paths. Ex-
press all entries of the (6 × 6 × 7)-table as 7 tables of size 6 × 6. (It is enough to state the path
length in the entry without the predecessor vertex.) Mark the entries in the table in which one
can see that the graph does not contain a negative cycle.
Solution: Each of the following tables corresponds to a fixed value k ∈ {0, 1, 2, 3, 4, 5, 6} and
contains the lengths of all shortest paths that use only vertices in {0, . . . , k}. Since all entries
on the diagonal are non-negative, we can conclude that the graph does not contain any negative
cycle.

from \
to 1 2 3 4 5 6 from \
to 1 2 3 4 5 6
1 0 3 5 ∞ ∞ ∞ 1 0 3 5 ∞ ∞ ∞
2 1 0 4 ∞ 4 1 2 1 0 4 ∞ 4 1
3 ∞ ∞ 0 1 -4 ∞ 3 ∞ ∞ 0 1 -4 ∞
4 ∞ ∞ ∞ 0 5 ∞ 4 ∞ ∞ ∞ 0 5 ∞
5 ∞ 1 ∞ 2 0 ∞ 5 ∞ 1 ∞ 2 0 ∞
6 ∞ ∞ ∞ ∞ 2 0 6 ∞ ∞ ∞ ∞ 2 0
k=0 k=1

from \
to 1 2 3 4 5 6 from \
to 1 2 3 4 5 6
1 0 3 5 ∞ 7 4 1 0 3 5 6 1 4
2 1 0 4 ∞ 4 1 2 1 0 4 5 0 1
3 ∞ ∞ 0 1 -4 ∞ 3 ∞ ∞ 0 1 -4 ∞
4 ∞ ∞ ∞ 0 5 ∞ 4 ∞ ∞ ∞ 0 5 ∞
5 2 1 5 2 0 2 5 2 1 5 2 0 2
6 ∞ ∞ ∞ ∞ 2 0 6 ∞ ∞ ∞ ∞ 2 0
k=2 k=3

from \
to 1 2 3 4 5 6 from \
to 1 2 3 4 5 6
1 0 3 5 6 1 4 1 0 2 5 3 1 3
2 1 0 4 5 0 1 2 1 0 4 2 0 1
3 ∞ ∞ 0 1 -4 ∞ 3 -2 -3 0 -2 -4 -2
4 ∞ ∞ ∞ 0 5 ∞ 4 7 6 10 0 5 7
5 2 1 5 2 0 2 5 2 1 5 2 0 2
6 ∞ ∞ ∞ ∞ 2 0 6 4 3 7 4 2 0
k=4 k=5

from \
to 1 2 3 4 5 6
1 0 2 5 3 1 3
2 1 0 4 2 0 1
3 -2 -3 0 -2 -4 -2
4 7 6 10 0 5 7
5 2 1 5 2 0 2
6 4 3 7 4 2 0
k=6
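For reference, the tables above can be reproduced mechanically with a short Floyd–Warshall sketch; the edge list is read off the k = 0 table, and the variable names are ours:

```python
import math

def floyd_warshall(n, edges):
    # After iteration k, dist[i][j] is the length of a shortest i-j path
    # using only intermediate vertices in {1, ..., k}.
    dist = [[math.inf] * (n + 1) for _ in range(n + 1)]
    for v in range(1, n + 1):
        dist[v][v] = 0
    for (u, v, w) in edges:
        dist[u][v] = min(dist[u][v], w)
    for k in range(1, n + 1):
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    # No negative cycle iff all diagonal entries remain non-negative.
    assert all(dist[v][v] >= 0 for v in range(1, n + 1))
    return dist

# Edge list (u, v, weight) read off the table for k = 0.
EDGES = [(1, 2, 3), (1, 3, 5), (2, 1, 1), (2, 3, 4), (2, 5, 4), (2, 6, 1),
         (3, 4, 1), (3, 5, -4), (4, 5, 5), (5, 2, 1), (5, 4, 2), (6, 5, 2)]
```

The returned matrix matches the table for k = 6, e.g. the shortest path from 1 to 6 has length 3.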

Exercise 13.3 Invariant and correctness of algorithm (This exercise is from the January 2020 exam).
Given is a weighted directed acyclic graph G = (V, E, w), where V = {1, . . . , n}. The goal is to find
the length of the longest path in G.
Let’s fix some topological ordering of G and consider the array top[1, . . . , n] such that top[i] is a vertex
that is on the i-th position in the topological ordering.
Consider the following pseudocode:

Algorithm 1 Find-length-of-longest-path(G, top)

  L[1], . . . , L[n] ← 0, . . . , 0
  for i = 1, . . . , n do
    v ← top[i]
    L[v] ← max_{(u,v)∈E} (L[u] + w((u, v)))
  return max_{1≤i≤n} L[i]

Here we assume that the maximum over the empty set is 0.


Show that the pseudocode above satisfies the following loop invariant INV(k) for 1 ≤ k ≤ n: After k
iterations of the for-loop, L[top[j]] contains the length of the longest path that ends with top[j] for all
1 ≤ j ≤ k.
Specifically, prove the following 3 assertions:
i) INV(1) holds.
ii) If INV(k) holds, then INV(k + 1) holds (for all 1 ≤ k < n).
iii) INV(n) implies that the algorithm correctly computes the length of the longest path.

State the running time of the algorithm described above in Θ-notation in terms of |V | and |E|. Justify
your answer.
Solution:
Proof of i).
In the first iteration we have v = top[1]. By definition, the first vertex in topological order has no
incoming edges. Thus, L[top[1]] is assigned the maximum over the empty set, which we assume to
be 0. As a consequence, INV(1) holds: the only path ending at top[1] is the trivial path of length 0, and L[top[1]] = 0.
Proof of ii).
In the (k + 1)-th iteration we have v = top[k + 1]. By the definition of topological ordering, all
u ∈ V with (u, top[k + 1]) ∈ E are contained in {top[1], . . . , top[k]}. The length of the longest path via
u ending at v decomposes into the length of the longest path ending at u plus the weight of
the edge (u, v). Therefore, given INV(k), i.e., that L[top[j]] contains the length of the longest path ending at top[j] for all
1 ≤ j ≤ k, the maximum max_{(u,v)∈E} (L[u] + w((u, v))) computes the length of the longest path ending
at v. Consequently, INV(k + 1) holds given that INV(k) holds.
Proof of iii).
INV(n) implies that each entry L[v] contains the length of the longest path ending at v. Thus, computing the maximum max_{1≤i≤n} L[i] corresponds to computing the length of the longest path in G.

Running time:
The running time is in Θ(|E| + |V |). The loop takes time Θ(|E| + |V |) since Σ_{v∈V} deg−(v) = |E|,
and taking the maximum at the end takes time Θ(|V |).
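The pseudocode above admits a direct transcription into Python (a sketch; the edge-list input format and all names are ours):

```python
def longest_path_length(n, edges, top):
    # edges: list of (u, v, w) in a DAG on vertices 1..n;
    # top: a topological ordering of the vertices.
    incoming = [[] for _ in range(n + 1)]
    for (u, v, w) in edges:
        incoming[v].append((u, w))
    L = [0] * (n + 1)
    for v in top:
        # The maximum over the empty set is 0, as in the pseudocode.
        L[v] = max((L[u] + w for (u, w) in incoming[v]), default=0)
    return max(L[1:])
```

For example, on the DAG with edges (1, 2, 3), (1, 3, 2), (2, 4, 4), (3, 4, 10) and topological order 1, 2, 3, 4, the longest path is 1, 3, 4 of length 12.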

Exercise 13.4 Cheap flights (This exercise is from the January 2020 exam).
Suppose that there are n airports in the country Examistan. Between some of them there are direct
flights. For each airport there exists at least one direct flight from this airport to some other airport.
Totally there are m different direct flights between the airports of Examistan.
For each direct flight you know its cost. The cost of each flight is a strictly positive integer.
You can assume that each airport is represented by its number, i.e. the set of airports is {1, . . . , n}.

a) Model these airports, direct flights and their costs as a directed graph: give a precise description of
the vertices, the edges and the weights of the edges of the graph G = (V, E, w) involved (if possible,
in words and not formal).
Solution: Each airport is a vertex in the directed graph. Two vertices u, v ∈ V are connected by a
directed edge e ∈ E if there exists a direct flight from airport u to airport v. The weight
w(e) of the edge e = (u, v) is the cost of the direct flight from u to v.
Notice that the graph might not be connected, but |E| ≥ |V |, since “For each airport there exists at
least one direct flight from this airport to some other airport.”

In points b) and c) you can assume that the directed graph is represented by a data structure that allows
you to traverse the direct predecessors and direct successors of a vertex u in time O(deg− (u)) and
O(deg+ (u)) respectively, where deg− (u) is the in-degree of vertex u and deg+ (u) is the out-degree of
vertex u.

b) Suppose that you are at the airport S and you want to fill the array d of minimal traveling costs to
each airport. That is, for each airport A, d[A] is a minimal cost that you must pay to travel from S
to A.
Name the most efficient algorithm that was discussed in lectures which solves the corresponding
graph problem. If several such algorithms were described in lectures (with the same running time),
it is enough to name one of them. State the running time of this algorithm in Θ-notation in terms
of n and m.
Solution: Name of the algorithm used to solve this problem: Dijkstra’s Algorithm
Runtime: O(m + n log n) if implemented with a Fibonacci heap, O((m + n) log n) if implemented
with a binary heap.


c) Now you want to know how many optimal routes there are to airport T . In other words, if cmin is
the minimal cost from S to T then you want to compute the number of routes from S to T of cost
cmin .
Assume that the array d from b) is already filled. Provide an as efficient as possible dynamic pro-
gramming algorithm that takes as input the graph G from task a), the array d from point b) and the
airports S and T , and outputs the number of routes from S to T of minimal cost.

Address the following aspects in your solution and state the running time of your algorithm:
1) Definition of the DP table: What are the dimensions of the table DP[. . .]? What is the meaning
of each entry?
2) Computation of an entry: How can an entry be computed from the values of other entries?
Specify the base cases, i.e., the entries that do not depend on others.
3) Calculation order: In which order can entries be computed so that values needed for each entry
have been determined in previous steps?
4) Extracting the solution: How can the final solution be extracted once the table has been filled?
5) Running time: What is the running time of your algorithm? Provide it in Θ-notation in terms
of n and m, and justify your answer.

Solution:
Size of the DP table / Number of entries: We use a 1-dimensional DP table consisting of n entries.
Meaning of a table entry:
DP [i] is the number of optimal routes from S to the airport i.
Computation of an entry (initialization and recursion):
DP[S] = 1. If d[v] = ∞, then DP[v] = 0. If v ≠ S and d[v] < ∞, then

  DP[v] = Σ_{u : (u,v)∈E, d[u]+w((u,v))=d[v]} DP[u].

Order of computation: vertices are processed in order of increasing d-value; that is, if d[i] < d[j],
then i is processed before j.
Computing the result: The result is contained in DP [T ].
Running time in concise Θ-notation in terms of n and m. Justify your answer.
Hint: Note that the array d is a part of the input, so you don’t need to include the time that is required
to fill this array to the running time here.
We need Θ(n log n) time to sort the array d. To fill the DP table we need Θ(n + m), since the time
required to compute DP[v] is Θ(deg−(v) + 1), and Σ_{v∈V} Θ(deg−(v) + 1) = Θ(n + m). Hence the
running time of the algorithm described above is Θ(n log n + m).
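A compact Python sketch of this DP (the representation of G as an edge list, the 0-based dummy entry of d, and all names are our assumptions, not part of the exam solution):

```python
import math

def count_min_cost_routes(n, edges, d, S, T):
    # edges: list of (u, v, w) with strictly positive weights w;
    # d[v]: minimal cost from S to v (math.inf if unreachable), indexed 1..n.
    incoming = [[] for _ in range(n + 1)]
    for (u, v, w) in edges:
        incoming[v].append((u, w))
    DP = [0] * (n + 1)
    DP[S] = 1
    # Process vertices in order of increasing d-value; since weights are
    # positive, every predecessor on an optimal route is processed first.
    order = sorted((v for v in range(1, n + 1)
                    if v != S and d[v] < math.inf), key=lambda v: d[v])
    for v in order:
        DP[v] = sum(DP[u] for (u, w) in incoming[v] if d[u] + w == d[v])
    return DP[T]
```

For example, on the diamond with edges (1, 2, 1), (1, 3, 1), (2, 4, 1), (3, 4, 1) and S = 1, there are 2 optimal routes to T = 4.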

Exercise 13.5 Elevator (This exercise is from the January 2022 exam).
Consider the following definitions for a directed graph G = (V, E):
1. The out-degree of a vertex v ∈ V , denoted with degout (v), is the number of edges of E that start
at v, i.e., degout (v) = |{(v, w) ∈ E | w ∈ V }|.
2. The in-degree of a vertex v ∈ V , denoted with degin (v), is the number of edges that end at v, i.e.,
degin (v) = |{(u, v) ∈ E | u ∈ V }|.

3. An Eulerian walk is a sequence v1 , . . . , vk ∈ V such that k = |E| + 1 and {(vi , vi+1 ) | 1 ≤ i <
k} = E. Note that this definition implies that the (vi , vi+1 ) are pairwise different edges for 1 ≤ i < k.
In this exercise, you can use without proof the following result from the lecture:

Lemma 1. A directed graph G = (V, E) admits an Eulerian walk if, and only if, all of the following
conditions hold:
1. At most one vertex v ∈ V is such that degout (v) = degin (v) + 1;
2. At most one vertex v ∈ V is such that degin (v) = degout (v) + 1;
3. Every vertex v that satisfies neither (1) nor (2) is such that degout (v) = degin (v);
4. The undirected graph G0 obtained by ignoring the direction of edges in G is connected.
a) Write down the pseudocode of an O(|V | + |E|) time algorithm that takes as input a directed graph
G, and returns true if G has a Eulerian walk, and false otherwise. Justify its correctness and
complexity.
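One possible sketch of such an algorithm in Python (all names are ours, and the official solution is not claimed here): it checks the four conditions of Lemma 1 literally, using degree counters and a DFS on the underlying undirected graph G′. Each step is a constant number of passes over V or E, hence O(|V | + |E|) in total.

```python
def has_eulerian_walk(n, edges):
    # Vertices are 0, ..., n-1; edges is a list of directed pairs (u, v).
    deg_out = [0] * n
    deg_in = [0] * n
    adj = [[] for _ in range(n)]  # underlying undirected graph G'
    for (u, v) in edges:
        deg_out[u] += 1
        deg_in[v] += 1
        adj[u].append(v)
        adj[v].append(u)
    starts = sum(1 for v in range(n) if deg_out[v] == deg_in[v] + 1)
    ends = sum(1 for v in range(n) if deg_in[v] == deg_out[v] + 1)
    if starts > 1 or ends > 1:
        return False  # condition 1 or 2 violated
    if any(abs(deg_out[v] - deg_in[v]) > 1 for v in range(n)):
        return False  # condition 3 violated
    # Condition 4: G' is connected (DFS from vertex 0; following the lemma
    # literally, an isolated vertex makes G' disconnected).
    seen = [False] * n
    stack = [0]
    while stack:
        v = stack.pop()
        if seen[v]:
            continue
        seen[v] = True
        stack.extend(adj[v])
    return all(seen)
```

For instance, the directed path 0 → 1 → 2 admits an Eulerian walk, while the graph with edges (0, 1) and (2, 3) does not (two start vertices, and G′ disconnected).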
b) Alice is launching iFahrstuhl™, a start-up developing the next generation of elevators.
Assume a building with n floors indexed from 1 to n and an elevator which has room for a single
person. The elevator receives requests in the form of pairs (i, j) ∈ {1, . . . , n}2 of distinct floors
between which a single person is willing to travel.
Consider the scenario where m people want to use the elevator. For 1 ≤ t ≤ m, the t-th person wants
to go from floor it to floor jt . These requests are given as a finite set S = {(i1 , j1 ), . . . , (im , jm )}.
A finite set S = {(i1 , j1 ), . . . , (im , jm )} of requests is called optimal if the pairs can be ordered such
that all requests can be processed and the elevator is never empty when moving between two floors
(except maybe on its way to fetching the first person).
For example, for n = 5, the set S1 = {(2, 3), (4, 1), (3, 4)} is optimal, since it can be ordered as
{(2, 3), (3, 4), (4, 1)}, which means that the elevator can start on floor 2 to fetch person 1, go to
floor 3, drop person 1 and fetch person 3, go to floor 4, drop person 3 and fetch person 2, go to floor
1, drop person 2, and terminate there. However, the set S2 = {(2, 3), (4, 1)} is not optimal, since
there is no way a single elevator can satisfy both requests without moving empty from floor 3 to
floor 4 or floor 1 to floor 2.
Given a set of requests S, Alice’s elevators should be able to decide whether it’s optimal. Model
the problem of detecting optimal sets of requests as a graph problem and provide an algorithm to
solve it. Describe the vertex and edge set, edge weights (if needed), the graph problem you solve,
the algorithm you use, and its complexity. To obtain full points, your algorithm should run in time
O(n + |S|).
c) Alice’s startup has installed k single-person elevators in your n-floor building. Unfortunately, not
all elevators can reach all floors. Hence, for each elevator j ∈ {1, . . . , k}, you are given a set Fj ⊆
{1, . . . , n} of floors it can reach. When you arrive in front of an elevator j, say on floor f ∈ Fj ,
you can immediately call it, after which you have to wait until it reaches your floor from its current
position, moving at the constant speed of 1 time unit per floor. When the elevator arrives, you choose
the destination floor f 0 ∈ Fj , and the elevator brings you to this floor at the constant speed of 0.5
time units per floor (for security reasons, the elevator is slower when it is not empty). The time spent
moving between elevators on the same floor, calling the elevator or choosing the destination floor
is negligible, since you are very fast at interacting with elevators.

You are alone in the building at floor 1, with each elevator j being initially located on floor fj . You
would like to go to floor n. What is the minimal amount of time that you have to travel using Alice’s
elevators? If you cannot reach floor n, then output ∞.
Model the problem as a graph problem and provide an algorithm to solve it. Describe the vertex
and edge set, edge weights (if needed), the graph problem you solve, the algorithm you use, and
its complexity. To obtain full points, your algorithm should run in time O((n + K) log n), where
K = Σ_{j=1}^{k} |Fj |².

d) Continue the setting of (c). Elevator doors in your building need maintenance, but the people in
your building also need elevators. In your building, there is exactly one elevator door per elevator
and floor, which needs to be functional in order for the elevator to be used from or to this floor. Even
if a door is not functional, the elevator can still be used between all other floors where a functional
door is present. Alice wants to select as many elevator doors as possible to be maintained during
the next working day such that all floors can be reached from each other using the elevators and
the remaining functional doors (those not in maintenance).
Model the problem as a graph problem and provide an algorithm to solve it. Describe the vertex
and edge set, edge weights (if needed), the graph problem you solve, the algorithm you use, and
its complexity. To obtain full points, your algorithm should run in time O((n + K 0 ) log(n + K 0 )),
where K ′ = Σ_{j=1}^{k} |Fj |.

Hint: Consider the set of vertices

V = {v1 , . . . , vn } ∪ {w1 , . . . , wn } ∪ {elevator1 , . . . , elevatork }

and use subgraphs (“gadgets”) of the form

[Figure: gadget for elevator j, where Fj = {i1 , . . . , iq }: for each floor i ∈ Fj , the vertices vi and wi are joined by an edge of weight 1, and elevatorj is joined to the gadget’s vi and wi vertices by edges of weight 0.]
