Introduction to Recursive Programming
Manuel Rubio-Sánchez
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and
publishers have attempted to trace the copyright holders of all material reproduced in this publication
and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc.
(CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization
that provides licenses and registration for a variety of users. For organizations that have been granted
a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and
are used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
To the future generations
Contents
Preface
List of Figures
List of Tables
List of Listings
Index
Preface
INTENDED AUDIENCE
The main goal of the book is to teach students how to think and program
recursively, by analyzing a wide variety of computational problems. It is
intended mainly for undergraduate students in computer science or re-
lated technical disciplines that cover programming and algorithms (e.g.,
bioinformatics, engineering, mathematics, physics, etc.). The book could
also be useful for amateur programmers, students of massive open online
courses, or more experienced professionals who would like to refresh the
material, or learn it in a different or clearer way.
Students should have some basic programming experience in order
to understand the code in the book. The reader should be familiar with
notions introduced in a first programming course such as expressions,
variables, conditional and loop constructs, methods, parameters, or el-
ementary data structures such as arrays or lists. These concepts are
not explained in the book. Also, the code in the book follows the
procedural programming paradigm, and does not use object-oriented
programming features. Regarding Python, a basic background
can be helpful, but is not strictly necessary. Lastly, the student should
be competent in high school mathematics.
Computer science professors can also benefit from the book, not just
as a handbook with a large collection and variety of problems, but also
by adopting the methodology and diagrams described to build recursive
solutions. Furthermore, professors may employ its structure to organize
their classes. The book could be used as a required textbook in introduc-
tory (CS1/2) programming courses, and in more advanced classes on the
design and analysis of algorithms (for example, it covers topics such as
The advantages of recursion over iteration are mainly due to the use
of “multiple recursion,” where methods invoke themselves several times,
and the algorithms are based on combining several solutions to smaller
instances of the same problem. Chapter 6 introduces multiple recursion
through methods based on the eminent “divide and conquer” algorithm
design paradigm. While some examples can be used in an introductory
programming course, the chapter is especially appropriate in a more ad-
vanced class on algorithms. Alternatively, Chapter 7 contains challenging
problems, related to puzzles and fractal images, which can also be solved
through multiple recursion, but are not considered to follow the divide
and conquer approach.
Recursion is used extensively in combinatorics, which is a branch
of mathematics related to counting that has applications in advanced
analysis of algorithms. Chapter 8 proposes using recursion for solving
combinatorial counting problems, which are usually not covered in pro-
gramming texts. This unique chapter will force the reader to apply the
acquired recursive thinking skills to a different family of problems. Lastly,
although some examples are challenging, many of the solutions will have
appeared in earlier chapters. Thus, some examples can be used in an
introductory programming course.
Chapter 9 introduces “mutual recursion,” where several methods in-
voke themselves indirectly. The solutions are more sophisticated since it
is necessary to think about several problems simultaneously. Neverthe-
less, this type of recursion involves applying the same essential concepts
covered in earlier chapters.
Chapter 10 covers how recursive programs work from a low-level
point of view. It includes aspects such as tracing and debugging, the
program stack, or recursion trees. In addition, it contains a brief intro-
duction to memoization and dynamic programming, which is another
important algorithm design paradigm.
Tail-recursive algorithms can not only be transformed to iterative
versions; some are also designed by thinking iteratively. Chapter 11 ex-
amines the connection between iteration and tail recursion in detail. In
addition, it provides a brief introduction to “nested recursion,” and in-
cludes a strategy for designing simple tail-recursive functions that are
usually defined by thinking iteratively, but through a purely declarative
approach. These last two topics are curiosities regarding recursion, and
should be skipped in introductory courses.
The last chapter presents backtracking, which is another major al-
gorithm design technique that is used for searching for solutions to com-
ACKNOWLEDGEMENTS
The content of this book has been used to teach computer programming
courses at Universidad Rey Juan Carlos, in Madrid (Spain). I am grate-
ful to the students for their feedback and suggestions. I would also like
to thank Ángel Velázquez and the members of the LITE (Laboratory
of Information Technologies in Education) research group for providing
useful insights regarding the content of the book. I would also like to
express my gratitude to Luís Fernández, computer science professor at
Universidad Politécnica de Madrid, for his advice and experience related
to teaching recursion. A special thanks to Gert Lanckriet and members
of the Computer Audition Laboratory at University of California, San
Diego.
Manuel Rubio-Sánchez
July, 2017
List of Listings
12.14 Branch and bound code for solving the 0-1 knapsack problem.
12.15 Auxiliary code for the branch and bound algorithm related to the 0-1 knapsack problem.
CHAPTER 1
Basic Concepts of Recursive Programming
apparent that the individual florets resemble the entire plant. Other ex-
amples include mountain ranges, clouds, or animal skin patterns.
Recursion also appears in art. A well-known example is the Droste
effect, which consists of a picture appearing within itself. In theory the
process could be repeated indefinitely, but naturally stops in practice
when the smallest picture to be drawn is sufficiently small (for example,
if it occupies a single pixel in a digital image). A computer-generated
fractal is another type of recursive image. For instance, Sierpiński’s tri-
angle is composed of three smaller identical triangles that are subse-
quently decomposed into yet smaller ones. Assuming that the process is
infinitely repeated, each small triangle will exhibit the same structure as
the original’s. Lastly, a classical example used to illustrate the concept
of recursion is a collection of matryoshka dolls. In this craftwork each
doll has a different size and can fit inside a larger one. Note that the
recursive object is not a single hollow doll, but a full nested collection.
Thus, when thinking recursively, a collection of dolls can be described
as a single (largest) doll that contains a smaller collection of dolls.
While the recursive entities in the previous examples were clearly
tangible, recursion also appears in a wide variety of abstract concepts.
In this regard, recursion can be understood as the process of defining
concepts by using the definition itself. Many mathematical formulas and
definitions can be expressed this way. Clear explicit examples include
sequences for which the n-th term is defined through some formula or
procedure involving earlier terms. Consider the following recursive defi-
nition:
sn = sn−1 + sn−2 . (1.1)
The formula states that a term in a sequence (sn ) is simply the sum of the
two previous terms (sn−1 and sn−2 ). We can immediately observe that the
formula is recursive, since the entity it defines, s, appears on both sides
of the equation. Thus, the elements of the sequence are clearly defined
in terms of themselves. Furthermore, note that the recursive formula in
(1.1) does not describe a particular sequence, but an entire family of
sequences in which a term is the sum of the two previous ones. In order
to characterize a specific sequence we need to provide more information.
In this case, it is enough to indicate any two terms in the sequence.
Typically, the first two terms are used to define this type of sequence.
For instance, if s1 = s2 = 1 the sequence is 1, 1, 2, 3, 5, 8, 13, 21, . . .,
which corresponds to the Fibonacci numbers, defined by:

       ⎧ 1                     if n = 1 or n = 2,
F(n) = ⎨                                                (1.2)
       ⎩ F(n − 1) + F(n − 2)   if n > 2.
Throughout the book we will use this notation in order to describe func-
tions, where the definitions include two types of expressions or cases.
The base cases correspond to scenarios where the function’s output
can be obtained trivially, without requiring values of the function on
additional arguments. For Fibonacci numbers the base cases are, by def-
inition, F (1) = 1, and F (2) = 1. The recursive cases include more
complex recursive expressions that typically involve the defined function
applied to smaller input arguments. The Fibonacci function has one re-
cursive case: F (n) = F (n − 1) + F (n − 2), for n > 2. The base cases are
necessary in order to provide concrete values for the function’s terms
in the recursive cases. Lastly, a recursive definition may contain several
base and recursive cases.
Another function that can be expressed recursively is the factorial of
some nonnegative integer n:
n! = 1 × 2 × ⋯ × (n − 1) × n.
In this case, it is not immediately obvious whether the function can be
expressed recursively, since there is not an explicit factorial on the right-
hand side of the definition. However, since (n − 1)! = 1 × 2 × ⋯ × (n − 1),
we can rewrite the formula as the recursive expression n! = (n − 1)! × n.
Lastly, by convention 0! = 1, which follows from plugging in the value
n = 1 in the recursive formula. Thus, the factorial function can be defined
recursively as:
       ⎧ 1              if n = 0,
n! =   ⎨                                  (1.3)
       ⎩ (n − 1)! × n   if n > 0.
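Definition (1.3) translates almost directly into code. Below is a minimal sketch in Python (the language used for the book's listings); the function name is ours:

```python
def factorial(n):
    if n == 0:
        return 1                   # base case: 0! = 1
    return factorial(n - 1) * n    # recursive case: n! = (n - 1)! * n
```

Note how each case of the piecewise definition becomes one branch of the function.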
Similarly, consider the problem of calculating the sum of the first n
positive integers. The associated function S(n) can be obviously defined
as:
S(n) = 1 + 2 + ⋯ + (n − 1) + n. (1.4)
In this case the recursive entity is the derivative function, denoted as [⋅]′ ,
but not the functions f (x) and g(x). Observe that the formula explicitly
indicates the decomposition that takes place, where some initial function
(which is the input argument to the derivative function) is broken up
into the sum of the functions f (x) and g(x).
Data structures can also be understood as recursive entities. Fig-
ure 1.2 shows how lists and trees can be decomposed recursively. On the
one hand, a list can consist of a single element plus another list (this
is the usual definition of a list as an abstract data type), or it can be
subdivided into several lists (in this broader context a list is any collec-
tion of data elements that are arranged linearly in an ordered sequence,
as in lists, arrays, tuples, etc.). On the other hand, a tree consists of a
parent node and a set (or list) of subtrees, whose root node is a child of
the original parent node. The recursive definitions of data structures are
completed by considering empty (base) cases. For instance, a list that
contains only one element would consist of that element plus an empty
list. Lastly, observe that in these diagrams the darker boxes represent a
full recursive entity, while the lighter ones indicate smaller self-similar
instances.
Recursion can even be used to define words in dictionaries. This
might seem impossible since we are told in school that the description of
a word in a dictionary should not contain the word itself. However, many
concepts can be defined correctly this way. Consider the term “descen-
dant” of a specific ancestor. Notice that it can be defined perfectly as:
someone who is either a child of the ancestor, or a descendant of any of
the ancestor’s children. In this case, we can identify a recursive structure
where the set of descendants can be organized in order to form a (family)
tree, as shown in Figure 1.3. The darker box contains all of the descen-
dants of a common ancestor appearing at the root of the tree, while the
lighter boxes encompass the descendants of the ancestor’s children.
“given some positive integer n, calculate the sum of the first n positive
integers” is the statement of a computational problem with one input
parameter (n), and one output value defined as 1 + 2 + ⋯ + (n − 1) + n.
An instance of a problem is a specific collection of valid input values
that will allow us to compute a solution to the problem. In contrast,
an algorithm is a logical procedure that describes a step-by-step set
of computations needed in order to obtain the outputs, given the ini-
tial inputs. Thus, an algorithm determines how to solve a problem. It is
worth mentioning that computational problems can be solved by differ-
ent algorithms. The goal of this book is to explain how to design and
implement recursive algorithms and programs, where a key step involves
decomposing a computational problem.
Decomposition is an important concept in computer science and
plays a major role not only in recursive programming, but also in gen-
eral problem solving. The idea consists of breaking up complex problems
into smaller, simpler ones that are easier to express, compute, code, or
solve. Subsequently, the solutions to the subproblems can be processed
in order to obtain a solution to the original complex problem.
In the context of recursive problem solving and programming, de-
composition involves breaking up a computational problem into several
subproblems, some of which are self-similar to the original, as illustrated
in Figure 1.4. Note that obtaining the solution to a problem may re-
S(n) = 3S((n − 1)/2) + S((n + 1)/2). The final recursive function is:
       ⎧ 1                               if n = 1,
       ⎪ 3                               if n = 2,
S(n) = ⎨                                                             (1.6)
       ⎪ 3S(n/2) + S(n/2 − 1)            if n > 2 and n is even,
       ⎩ 3S((n − 1)/2) + S((n + 1)/2)    if n > 2 and n is odd.
where ∑_{i=1}^{n−2} F(i) denotes the sum F(1) + F(2) + ⋯ + F(n − 2) (see Section 3.1.4).
In this example, for some value n that determines the size of the problem,
Figure 1.6 A list a = [5, 1, 12, 4, 2, 5, 6, 3, 7] of nine elements, indexed
from 0 to 8: (a) the whole list; (b) decomposed into the first n − 1 elements
plus the last one; (c) decomposed into the first element plus the remaining
n − 1 elements; (d) decomposed into two halves.
the recursive case relies on decomposing the original problem into all of
the smaller problems of sizes 1 up to n − 2.
For the last example of problem decomposition we will use lists that
allow us to access individual elements by using numerical indices (in
many programming languages this data structure is called an “array,”
while in Python it is simply a “list”). The problem consists of adding
all of the elements in a list a of n numbers, s(a) = a[0] + a[1] + ⋯ + a[n − 1],
where a[i] is the (i + 1)-th element in the list, since the first one is not
indexed by 1, but by 0. Figure 1.6(a) shows a diagram representing a
particular instance of 9 elements.
Regarding notation, in this book we will assume that a sublist of
some list a is a collection of contiguous elements of a, unless explicitly
stated otherwise. In contrast, in a subsequence of some initial sequence
s its elements appear in the same order as in s, but they are not required
to be contiguous in s. In other words, a subsequence can be obtained
from an original sequence s by deleting some elements of s, and without
modifying the order of the remaining elements.
The problem can be decomposed by decreasing its size by a single
unit. On the one hand, the list can be broken down into the sublist
containing the first n − 1 elements (a[0 ∶ n − 1], where a[i ∶ j] denotes the
sublist from a[i] to a[j − 1], following Python’s notation) and a single
value corresponding to the last number on the list (a[n − 1]), as shown
in Figure 1.6(b). In that case, the problem can be defined recursively
through:
       ⎧ 0                            if n = 0,
s(a) = ⎨                                            (1.9)
       ⎩ s(a[0 ∶ n − 1]) + a[n − 1]   if n > 0.
In the recursive case the subproblem is naturally applied to the sublist
of size n − 1. The base case considers the trivial situation when the list
is empty, which does not require any addition. Another possible base
case can be s(a) = a[0] when n = 1. However, it would be redundant in
this decomposition, and therefore not necessary. Note that if n = 1 the
function adds a[0] and the result of applying the function on an empty
list, which is 0. Thus, it can be omitted for the sake of conciseness.
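Decomposition (1.9) can be sketched in Python using slicing; the function name is ours, and Section 1.3 presents the book's own versions:

```python
def sum_list_last(a):
    n = len(a)
    if n == 0:
        return 0                                      # base case: empty list
    # subproblem on the first n - 1 elements, plus the last element
    return sum_list_last(a[0:n - 1]) + a[n - 1]
```

As the text notes, a separate case for n = 1 is unnecessary: the call on a single-element list adds a[0] to the result for the empty list, which is 0.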
On the other hand, we can also interpret that the original list is its
first element a[0], together with the smaller list a[1 ∶ n], as illustrated
in Figure 1.6(c). In this case, the problem can be expressed recursively
through:
       ⎧ 0                     if n = 0,
s(a) = ⎨                                            (1.10)
       ⎩ a[0] + s(a[1 ∶ n])    if n > 0.
Although both decompositions are very similar, the code for each one
can be quite different depending on the programming language used.
Section 1.3 will show several ways of coding algorithms for solving the
problem according to these decompositions.
Another way to break up the problem consists of considering each
half of the list separately, as shown in Figure 1.6(d). This results in two
subproblems of roughly half the size of the original’s. The decomposition
produces the following recursive definition:
       ⎧ 0                                   if n = 0,
s(a) = ⎨ a[0]                                if n = 1,       (1.11)
       ⎩ s(a[0 ∶ n//2]) + s(a[n//2 ∶ n])     if n > 1.
Unlike the previous definitions, this decomposition requires a base case
when n = 1. Without it the function would never return a concrete value
for a nonempty list. Observe that the definition would not add or return
any element of the list. For a nonempty list the recursive case would
be applied repeatedly, but the process would never halt. This situation
is known as infinite recursion. For instance, if the list contained a
single element the recursive case would add the value associated with an
empty list (0), to the result of the same initial problem. In other words,
we would try to calculate s(a) = 0 + s(a) = 0 + s(a) = 0 + s(a) . . ., which
would be repeated indefinitely. The obvious issue in this scenario is that
the original problem s(a) is not decomposed into smaller and simpler
ones when n = 1.
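A sketch of decomposition (1.11) in Python, including the base case for n = 1 that prevents the infinite recursion just described (the function name is ours):

```python
def sum_list_halves(a):
    n = len(a)
    if n == 0:
        return 0       # base case: empty list
    if n == 1:
        return a[0]    # base case: required, or the recursion never halts
    # two subproblems of roughly half the size
    return sum_list_halves(a[0:n // 2]) + sum_list_halves(a[n // 2:n])
```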
C, Java:
1 int sum_first_naturals(int n)
2 {
3 if (n==1)
4 return 1;
5 else
6 return sum_first_naturals(n-1) + n;
7 }
Pascal:
1 function sum_first_naturals(n: integer ): integer ;
2 begin
3 if n=1 then
4 sum_first_naturals := 1
5 else
6 sum_first_naturals := sum_first_naturals(n-1) + n;
7 end;
MATLAB® :
1 function result = sum_first_naturals(n)
2 if n==1
3 result = 1;
4 else
5 result = sum_first_naturals(n-1) + n;
6 end
Scala:
1 def sum_first_naturals(n: Int): Int = {
2 if (n==1)
3 return 1
4 else
5 return sum_first_naturals(n-1) + n
6 }
Haskell:
1 sum_first_naturals 1 = 1
2 sum_first_naturals n = sum_first_naturals (n - 1) + n
Figure 1.7 Functions that compute the sum of the first n natural numbers
in several programming languages.
Listing 1.1 Python code for adding the first n natural numbers.
1 def sum_first_naturals(n):
2 if n == 1:
3 return 1 # Base case
4 else:
5 return sum_first_naturals(n - 1) + n # Recursive case
Listing 1.2 Alternative Python code for adding the first n natural num-
bers.
1 def sum_first_naturals_2(n):
2 if n == 1:
3 return 1
4 elif n == 2:
5 return 3
6 elif n % 2:
7 return (3 * sum_first_naturals_2((n - 1) // 2)
8 + sum_first_naturals_2((n + 1) // 2))
9 else:
10 return (3 * sum_first_naturals_2(n // 2)
11 + sum_first_naturals_2(n // 2 - 1))
Listing 1.2 shows the recursive code associated with (1.6). The func-
tion uses a cascaded if statement in order to differentiate between the
two base cases (lines 2–5) and the two recursive cases (lines 6–11), which
each make two recursive calls to the defined Python function.
It is also straightforward to code a function that computes the n-th
Fibonacci number by relying on the standard definition in (1.2).
Listing 1.3 Python code for computing the n-th Fibonacci number.
1 def fibonacci(n):
2 if n == 1 or n == 2:
3 return 1
4 else:
5 return fibonacci(n - 1) + fibonacci(n - 2)
Listing 1.3 shows the corresponding code, where both base cases are
considered in the Boolean expression of the if statement.
Listing 1.4 Alternative Python code for computing the n-th Fibonacci
number.
1 def fibonacci_alt(n):
2 if n == 1 or n == 2:
3 return 1
4 else:
5 aux = 1
6 for i in range(1, n - 1):
7 aux += fibonacci_alt(i)
8 return aux
Figure: ways of delimiting the data in a list: (a) a list a = [5, −1, 3, 2, 4, −3]
with an explicit length of 6; (b) the same list with a size variable equal to 6;
(c) an array a = [7, 0, 5, −1, 3, 2, 4, −3, 9, 2, ∗, . . . , ∗] in which the indices
lower = 2 and upper = 7 delimit a sublist.
1.4 INDUCTION
Induction is another concept that plays a fundamental role when design-
ing recursive code. The term has different meanings depending on the
field and topic where it is used. In the context of recursive programming
it is related to mathematical proofs by induction. The key idea is that
programmers must assume that the recursive code they are trying to
implement already works for simpler and smaller problems, even if they
have not yet written a line of code! This notion is also referred to as the
recursive “leap of faith.” This section reviews these crucial concepts.
a) Base case (basis). Verify that the formula is valid for the smallest
value of n, say n0 .
b) Inductive step. Firstly, assume the formula is true for some general
value of n. This assumption is referred to as the induction hy-
pothesis. Subsequently, by relying on the assumption, show that
if the formula holds for some value n, then it will also be true for
n + 1.
The base case is trivially true, since S(1) = 1(2)/2 = 1. For the induction
step we need to show whether
S(n + 1) = ∑_{i=1}^{n+1} i = (n + 1)(n + 2) / 2,    (1.13)
S(n + 1) = S(n) + n + 1 = n(n + 1)/2 + n + 1 = (n² + n + 2n + 2)/2 = (n + 1)(n + 2)/2,

where the second equality applies the induction hypothesis S(n) = n(n + 1)/2,
showing that (1.13) is true, which completes the proof.
Figure: computing S(100) by assuming that the solution to the smaller
problem S(99) = 4950 is available, so that S(100) = S(99) + 100 = 5050.
assume that these solutions are already available. Naturally, we will have
to process (modify, extend, combine, etc.) them somehow in order to
construct the recursive cases. Lastly, recursive algorithms are completed
by incorporating base cases that are not only correct, but that also allow
the algorithms to terminate.
Every iterative program can be converted into an equivalent recursive one, and vice versa. Choosing which one to use may depend on
several factors, such as the computational problem to solve, efficiency,
the language, or the programming paradigm. For instance, iteration is
preferred in imperative languages, while recursion is used extensively in
functional programming, which follows the declarative paradigm.
The examples shown so far in the book can be coded easily through
loops. Thus, the benefit of using recursion may not be clear yet. In
practice, the main advantage of using recursive algorithms over iterative
ones is that for many computational problems they are much simpler
to design and comprehend. A recursive algorithm could resemble more
closely the logical approach we would take to solve a problem. Thus, it
would be more intuitive, elegant, concise, and easier to understand.
In addition, recursive algorithms use the program stack implicitly
to store information, where the operations carried out on it (e.g., push
and pop) are transparent to the programmer. Therefore, they constitute
clear alternatives to iterative algorithms where it is the programmer’s
responsibility to explicitly manage a stack (or similar) data structure.
For instance, when the structure of the problem or the data resembles a
tree, recursive algorithms may be easier to code and comprehend than
iterative versions, since the latter may need to implement breadth- or
depth-first searches, which use queues and stacks, respectively.
In contrast, recursive algorithms are generally not as efficient as iter-
ative versions, and use more memory. These drawbacks are related to the
use of the program stack. In general, every call to a function, whether
it is recursive or not, allocates memory on the program stack and stores
information on it, which entails a higher computational overhead. Thus,
a recursive program can not only be slower than an iterative version, but a
large number of calls could also cause a stack overflow runtime error. Further-
more, some recursive definitions may be considerably slow. For instance,
the Fibonacci codes in Listings 1.3 and 1.4 run in exponential time, while
Fibonacci numbers can be computed in (much faster) logarithmic time.
Lastly, recursive algorithms are harder to debug (i.e., analyze a program
step by step in order to detect errors and fix them), especially if the
functions invoke themselves several times, as in Listings 1.3 and 1.4.
Finally, while in some functional programming languages loops are
not allowed, many other languages support both iteration and recursion.
Thus, it is possible to combine both programming styles in order to build
algorithms that are not only powerful, but also clear to understand (e.g.,
backtracking). Listing 1.4 shows a simple example of a recursive function
that contains a loop.
       ⎧ 1                                     if n = 1 or n = 2,
f(n) = ⎨ [f(n/2 + 1)]² − [f(n/2 − 1)]²         if n > 2 and n is even,    (1.16)
       ⎩ [f((n + 1)/2)]² + [f((n − 1)/2)]²     if n > 2 and n is odd.
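Definition (1.16) can be transcribed directly into Python. The following is a sketch (the function name is ours); for small n its outputs coincide with the Fibonacci numbers computed by Listing 1.3:

```python
def f(n):
    if n == 1 or n == 2:
        return 1                                            # base cases
    if n % 2 == 0:                                          # n > 2 and even
        return f(n // 2 + 1) ** 2 - f(n // 2 - 1) ** 2
    return f((n + 1) // 2) ** 2 + f((n - 1) // 2) ** 2      # n > 2 and odd
```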
1.7 EXERCISES
Exercise 1.1 — What does the following function calculate?
       ⎧ 1               if n = 0,
f(n) = ⎨
       ⎩ f(n − 1) × n    if n > 0.
CHAPTER 2
Methodology for Recursive Thinking
The methodology emphasizes declarative thinking due to its fourth step, since using induction
implies focusing on what an algorithm performs, rather than on how it
solves the problem. The following sections explain each step and mention
common pitfalls and misunderstandings.
In some problems the size might not be specified explicitly by the inputs, but the
programming language will allow us to retrieve the particular size through
its syntax and constructs. For example, when working with lists, their
length often determines the size of the problem, which can be recovered
through the function len.
In other problems the size may be expressed as a function of the input
parameters. For instance, consider the problem of computing the sum of
the digits of some positive integer n. Although the input parameter is
n, the size of the problem is not n, but rather the number of digits of n,
which determines the number of operations to carry out. Formally this
quantity can be expressed as ⌊log10 (n)⌋ + 1.
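The digits example can be sketched as follows; both function names are ours, and the recursive cases remove the last digit with integer division, reducing the size by one:

```python
def num_digits(n):
    # size of the problem: the number of digits of a positive integer n
    if n < 10:
        return 1
    return 1 + num_digits(n // 10)

def sum_digits(n):
    if n < 10:
        return n                             # base case: a single digit
    return sum_digits(n // 10) + n % 10      # drop the last digit, add it back
```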
The size of a problem may also depend on several input parameters.
Consider the problem of adding the elements of a sublist within a list
(see Listing 1.6), determined by “lower” and “upper” indices. Solving the
problem requires adding n−m+1 elements, where n and m are the upper
and lower indices, respectively. Thus, it needs to compute n−m additions.
The small (unit) difference between both expressions is irrelevant. It is
worth mentioning that it is not necessary to know exactly how many
operations will be carried out in order to solve a problem. Instead, it is
enough to understand when a base case is reached, and how to reduce the
size of the problem in order to decompose it into smaller subproblems.
For this last problem, the size is decreased by reducing the difference
between m and n.
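The sublist-sum problem just described can be sketched in Python; the function name is ours (Listing 1.6 itself does not appear in this excerpt), and each recursive call reduces the difference between the lower and upper indices by one:

```python
def sum_sublist(a, lower, upper):
    if lower > upper:
        return 0                                  # base case: empty range
    # size n - m decreases by one in each call
    return a[lower] + sum_sublist(a, lower + 1, upper)
```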
Moreover, the size of a problem does not necessarily need to specify
the number of operations that algorithms must perform. Consider the
problem of adding the elements of a square n × n-dimensional matrix.
Since it contains n2 elements, the sum requires n2 − 1 additions, which is
a function of n. However, in this case the size of the problem is simply n,
and not n2 . Note that smaller subproblems arise by decreasing n, not n2 .
If the matrix were n × m-dimensional, the problem size would depend on
both parameters n and m. In particular, the size of the problem would
be nm, where we could obtain subproblems of smaller size by decreasing
n, m, or both.
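A minimal sketch of the matrix example, assuming the matrix is represented as a list of row lists (the helper names are ours); the size is reduced by removing one row at a time, and each row is summed with the list decomposition seen earlier:

```python
def sum_row(r):
    if not r:
        return 0
    return r[0] + sum_row(r[1:])

def sum_matrix(m):
    # subproblems arise by decreasing the number of rows n
    if not m:
        return 0
    return sum_row(m[0]) + sum_matrix(m[1:])
```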
The size is generally a property of a problem, not of a particular
algorithm that solves it. Thus, it is not the actual number of computa-
tions carried out by a specific algorithm, since problems can be solved
by different algorithms whose runtime may vary considerably. Consider
the searching problem that consists of determining whether or not some
number appears in a sorted list of n numbers. In the worst case (when
the list does not contain the number and the result is False) it can be
Listing 2.1 uses the factorial function to illustrate this and other
basic pitfalls regarding base cases. Firstly, the factorial method
provides a perfect implementation of the mathematical function. The
factorial_redundant method is correct since it produces a valid out-
put for any nonnegative integer input argument. However, it contains an
extra base case that is unnecessary.
In addition, we should also strive for generality, which is an im-
portant desirable software feature. More general functions are those
that can operate correctly on a wider range of inputs, and are there-
fore more applicable and useful. The factorial function in (1.3) re-
ceives a nonnegative integer as its input argument, and is defined
adequately since it can be applied correctly not only to every posi-
tive integer, but also to 0. Replacing the base case 0! = 1 by 1! =
1 would lead to a less general function, since it would not be de-
fined for n = 0. The method factorial_missing_base_case imple-
ments this function. If it were called with n = 0 as its input argu-
ment it would fail to produce a valid result since it would run into
an infinite recursion. In particular, factorial_missing_base_case(0)
would call factorial_missing_base_case(-1), which would call
factorial_missing_base_case(-2), and so on. In practice the pro-
cess would halt, producing a runtime error message (e.g.,
“stack overflow,” or “maximum recursion depth exceeded”) after per-
forming a large number of function calls (see Chapter 10). Finally,
factorial_no_base_case always generates an infinite recursion since
it does not contain any base case, and can therefore never halt.
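Listing 2.1 itself is not reproduced in this excerpt; a sketch consistent with the descriptions above (the method names are those cited in the text, while the exact bodies are assumptions) might look as follows:

```python
def factorial(n):
    # Correct implementation: base case 0! = 1.
    if n == 0:
        return 1
    else:
        return factorial(n - 1) * n

def factorial_redundant(n):
    # Correct, but the case n == 1 is an unnecessary extra base case.
    if n == 0:
        return 1
    elif n == 1:
        return 1
    else:
        return factorial_redundant(n - 1) * n

def factorial_missing_base_case(n):
    # Less general: undefined for n = 0, which triggers an infinite recursion.
    if n == 1:
        return 1
    else:
        return factorial_missing_base_case(n - 1) * n
```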
For many problems, there are instances for which we can provide a
result without using recursion, even if their size is not small. For example,
consider the problem of determining whether some element appears in
a list. The size of the problem is the number of elements in the list.
A first base case occurs when the list is empty (this is the smallest
instance of the problem, of zero size), where the result is obviously False.
In addition, the algorithm can check if the element appears in some
position (e.g., the first, middle, or last) of the list, and return a true
value immediately if it does, even if the list is very large (Chapter 5
covers this type of searching problems). In both cases there is no need
to carry out a recursive call.
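A sketch of this idea in Python (the function name and the tail decomposition are illustrative assumptions, not taken from the text), checking the first position of the list:

```python
def contains(element, lst):
    if not lst:                  # base case: empty list, of zero size
        return False
    elif lst[0] == element:      # base case: element found, no recursion needed
        return True
    else:
        return contains(element, lst[1:])   # recursive call on the rest
```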
Methodology for Recursive Thinking 37
[Figure: three decompositions of the sum 1 + 2 + ⋯ + n. In (a) the
subproblem comprises the first n − 1 terms, 1 + 2 + ⋯ + (n − 1); in (b)
the subproblem is a smaller portion of the same sum; in (c) the terms
1, 2, 3, …, n are arranged as rows, with a subproblem highlighted inside
the original problem.]
[Figure: recursion diagrams for S(n). A call to S(n) invokes S(n − 1),
which in turn invokes smaller instances down to S(3), S(2), and finally
the base case S(1); chains that reach a base case are valid (✔), while
those that never do are invalid (✘). An additional diagram shows the
decomposition F(n − 1) + F(n − 2).]
A1,1 = ⎡ 3 1 2 ⎤ ,   A1,2 = ⎡ 0 3 4 9 ⎤ ,
       ⎣ 7 2 3 ⎦            ⎣ 3 1 7 5 ⎦

A2,1 = ⎡ 0 5 3 ⎤ ,   A2,2 = ⎡ 2 1 4 8 ⎤ .
       ⎣ 6 3 1 ⎦            ⎣ 5 4 9 0 ⎦
Methodology for Recursive Thinking 41
In this case, the number of rows and columns of each submatrix results
from performing the integer division of the original height and width of
the matrix by two.
Regarding efficiency, dividing the size of the problem by two, instead
of decreasing it by a single unit, may lead to faster algorithms. There-
fore, this strategy should be considered in general, and especially if a
theoretical analysis shows that it is possible to improve the computa-
tional cost of an algorithm based on reducing the size of the problem by
a unit. However, dividing the size of a problem by two may not lead to
a reasonable recursive algorithm. For instance, it is difficult to obtain a
simple recursive formula for the factorial function n! by considering the
subproblem (n/2)!.
[Figure: general diagram for deriving recursive cases. Top row: the
problem's input parameters and its solution, connected by the recursive
method. Bottom row: the subproblem's simpler inputs and simpler
solution. Decomposition links the inputs, and induction links the
solutions.]
ters for the self-similar subproblem are shown in the bottom-left box.
A recursive call to the method with those parameters would therefore
produce the results in the bottom-right box, which are obtained by ap-
plying the statement of the problem to the simpler inputs. Finally, since
we can assume that these results are available by relying on induction,
we derive recursive cases by determining how we can modify or extend
them in order to obtain or arrive at the solution to the original problem
(top-right box).
For the sum of the first n positive integers (S(n)) the diagram would
be:

Inputs              Results
  n     ──S──→   1 + 2 + ⋯ + (n − 1) + n = S(n)
                                            ↑ + n
 n − 1  ──S──→   1 + 2 + ⋯ + (n − 1) = S(n − 1)
Inputs              Results
  n     ──S──→   1 + 2 + ⋯ + (n/2) + (n/2 + 1) + ⋯ + n = S(n)

 n/2    ──S──→   1 + 2 + ⋯ + (n/2) = S(n/2)
The question is to figure out how (and if) we can obtain S(n) by
modifying or extending S(n/2). A first obvious idea leads to S(n) =
S(n/2) + (n/2 + 1) + ⋯ + n. However, although it is correct, its imple-
mentation requires either a loop, or using a different recursive function,
in order to calculate the sum of the last n/2 terms. Observe that it is
not possible to obtain (n/2 + 1) + ⋯ + n by using a recursive call to the
function under construction (S), since that sum is not an instance of the
problem. Nevertheless, it can be broken up, for example, in the following
way:
(n/2 + 1) + (n/2 + 2) + ⋯ + n
  = (n/2 + n/2 + ⋯ + n/2) + (1 + 2 + ⋯ + n/2) = (n/2)² + S(n/2).
This not only simplifies the expression, but it also contains S(n/2), which
we can use in order to obtain a much simpler recursive case:
S(n) = 2S(n/2) + (n/2)².

[Figure 2.6: the sum split into two halves of n/2 terms each, shown in
panels (a) and (b).]

The updated diagram is:

Inputs              Results
  n     ──S──→   1 + 2 + ⋯ + (n/2) + (n/2 + 1) + ⋯ + n = S(n)
                                      ↑ × 2, + (n/2)²
 n/2    ──S──→   1 + 2 + ⋯ + (n/2) = S(n/2)
Although we could use the formula to derive the recursive cases, we will
proceed by analyzing concrete instances of the problem. For example (we
can discard using the method’s name in the results column, and when
labeling the arrows, for the sake of simplicity):
Inputs                 Results
n = 5342     ─────→    14
                        ↑ + 2
n//10 = 534  ─────→    12

or

Inputs                 Results
n = 879      ─────→    24
                        ↑ + 9
n//10 = 87   ─────→    15
It is easy to see in these diagrams that we can obtain the result of the
method by adding the last (underlined) digit of the original number to
the output of the subproblem.
Listing 2.2 Code for computing the sum of the digits of a nonnegative
integer.

def add_digits(n):
    if n < 10:
        return n
    else:
        return add_digits(n // 10) + (n % 10)
Inputs                         Results
n = dm−1 ⋯ d1 d0     ─────→    dm−1 + ⋯ + d1 + d0
                                ↑ + d0
n//10 = dm−1 ⋯ d1    ─────→    dm−1 + ⋯ + d1
and the function can be coded as shown in Listing 2.2, where (n%10)
represents d0 . Note that the base case for this problem occurs when n
contains a single digit (i.e., n < 10), for which the result is obviously n.
2.5.4 Procedures
The methods seen so far correspond to functions that return values,
where the results can be defined through formulas. Therefore, they can
be used within expressions in other methods, where they would return
specific values given a set of input arguments. However, there exist meth-
ods, called “procedures” in certain programming languages (e.g., Pas-
cal), which do not return values. Instead, they may alter data structures
Firstly, the size of this problem is the number of digits of n. The base
case occurs when n contains a single digit (n < 10), where the algorithm
would simply print n. As in the previous example, the simplest decompo-
sition considers n//10, where the least significant digit is removed from
the original number. Figure 2.7 shows a possible decomposition diagram
of the problem.
In addition, the general diagram can also be used for this type of
procedure:
Inputs                Results
                      3¶
n = 2743      ─────→  4¶
                      7¶
                      2¶
               print(3) (before)
                      4¶
n//10 = 274   ─────→  7¶
                      2¶
In this example the results are no longer numerical values, but a se-
quence of instructions, which correspond to printed lines on a console.
For this problem it is possible to arrive at the solution by printing the
least significant digit of the input number, and by calling the recursive
method on the remaining digits. However, the order in which these op-
erations are performed is crucial. In particular, the least significant digit
[Figure 2.7: decomposition of the problem. (a) Output for n = 2743:
3¶, 4¶, 7¶, 2¶. (b) General output: d0¶, d1¶, …, dm−1¶.]
must be printed before the rest of the digits. The associated code is
shown in Listing 2.3.
[Diagram: the problem's input parameters and solution on the top row,
and N subproblems, each with simpler inputs and a simpler solution,
below; decomposition connects the inputs, and induction connects the
solutions.]
Figure 2.8 General diagram for thinking about recursive cases, when a
problem is decomposed into several (N ) self-similar subproblems.
[Figure: the list 5, −1, 3, 2, 4, 7, 2.]

Inputs                          Results
[5, -1, 3, 2, 4, 7, 2]  ─────→  7
                                 ↑ max(5, 7)
[5, -1, 3]              ─────→  5
[2, 4, 7, 2]            ─────→  7
the particular half. Therefore, the recursive case can simply return the
maximum value in either half. The recursive function (f ) is defined as
follows:
       ⎧ a[0]                                     if n = 1,
f(a) = ⎨                                                         (2.3)
       ⎩ max( f(a[0 : n//2]), f(a[n//2 : n]) )    if n > 1.
Listing 2.4 shows two ways of coding the function. The version that
uses the lower and upper limits is usually faster. Naturally, this problem
also allows recursive solutions based on a single subproblem whose size is
decreased by a single unit. This approach is as efficient as the divide and
conquer strategy. However, in practice it may produce runtime errors for
large lists (see Chapter 10).
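Listing 2.4 is not reproduced here; the two variants it describes can be sketched as follows (the names are assumptions): one based on slicing, and one based on lower and upper limits delimiting the interval [lower, upper):

```python
def max_list(a):
    # Version based on slicing: creates new sublists at each call.
    n = len(a)
    if n == 1:
        return a[0]
    else:
        return max(max_list(a[:n // 2]), max_list(a[n // 2:]))

def max_list_limits(a, lower, upper):
    # Version based on limits: no sublists are copied.
    if upper - lower == 1:
        return a[lower]
    else:
        middle = (lower + upper) // 2
        return max(max_list_limits(a, lower, middle),
                   max_list_limits(a, middle, upper))
```

The second version avoids copying sublists at every call, which is why it is usually faster.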
2.6 TESTING
Testing is a fundamental stage in any software development process. In
the context of this book its main purpose consists of discovering errors in
the code. Testing therefore consists of running the developed software on
different instances (i.e., inputs) of a problem in order to detect failures.
Novice programmers are strongly encouraged to test their code since the
ability to detect and correct errors (e.g., with a debugger) is a basic
programming skill. In addition, it teaches valuable lessons in order to
avoid pitfalls and code more efficiently and reliably.
Besides checking the basic correctness of the base and recursive cases,
when testing recursive code, programmers should pay special attention to
possible scenarios that lead to infinite recursions. These usually appear
due to missing base cases or by erroneous recursive cases. For example,
consider the function in Listing 2.5 whose goal consists of determining
whether some nonnegative integer n is even. Both base and recursive
cases are correct. Naturally, if a number n is even then so is n − 2, and
the function must return the same Boolean value for both integers. Nev-
ertheless, is_even_incorrect only works for even numbers. Let f (n)
represent is_even_incorrect(n). A call to f(7) produces the following
recursive calls:

f(7) → f(5) → f(3) → f(1) → f(−1) → f(−3) → ⋯

which is an infinite recursion since the process does not halt at a base
case. The fact that the function does not contain a base case that re-
turns False provides a warning regarding its correctness (not all Boolean
functions need two base cases in order to return True or False). Indeed,
the method can be fixed by adding that base case. Listing 2.6 shows a
function that works for any valid argument (n ≥ 0).
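Listings 2.5 and 2.6 are not reproduced in this excerpt; sketches consistent with the discussion (assuming the base case tests n = 0) might be:

```python
def is_even_incorrect(n):
    # Only terminates for even n: odd inputs skip over the base case
    # and recurse indefinitely (n = 1, -1, -3, ...).
    if n == 0:
        return True
    else:
        return is_even_incorrect(n - 2)

def is_even(n):
    # Fixed version: an additional base case returns False for odd n.
    if n == 0:
        return True
    elif n == 1:
        return False
    else:
        return is_even(n - 2)
```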
Another example is the function in Listing 2.7 that uses the recursive
case described in (2.1) in order to compute the sum of the first n positive
integers (S(n)). It is incomplete, and generates infinite recursions for
values of n that are not a power of two. Firstly, since Python considers
n to be a real number, n/2 is also a real number in general. Therefore, if
the argument n is an odd integer in any recursive function call then n/2
will have a fractional part, and so will the arguments of the following
recursive calls. Thus, the algorithm would not halt at the base case
n = 1 (which is an integer with no fractional part), and would continue
to make function calls with smaller and smaller arguments. For example,
let f(n) represent sum_first_naturals_3(n). A call to f(6) produces
the following recursive calls:

f(6) → f(3) → f(1.5) → f(0.75) → ⋯

never stopping at the base case. The only situation in which the algo-
rithm works is when the first argument n is a power of two, since each of
the divisions by two produces an even number, eventually reaching n = 2
and afterwards n = 1, where the function can finally return a concrete
value instead of producing another function call.
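For reference, the flawed function under discussion (Listing 2.7 is not reproduced in this excerpt; this is a sketch consistent with recursive case (2.1)) might look as follows; note the real divisions:

```python
def sum_first_naturals_3(n):
    # Only halts when the initial n is a power of two: for other inputs
    # the arguments acquire fractional parts and never equal 1.
    if n == 1:
        return 1
    else:
        return 2 * sum_first_naturals_3(n / 2) + (n / 2)**2
```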
[Figure 2.10: decomposition of the sum for odd n, with subproblem
halves of (n − 1)/2 and (n + 1)/2 terms, shown in panels (a) and (b).]

Listing 2.8 Incomplete Python code for adding the first n positive num-
bers.

def sum_first_naturals_4(n):
    if n == 1:
        return 1
    else:
        return 2 * sum_first_naturals_4(n // 2) + (n // 2)**2

The method sum_first_naturals_3 does not work properly due
to real-valued arguments. Thus, we could try replacing the real divi-
sions by integer divisions, as shown in Listing 2.8. This forces the ar-
guments to be integers, which prevents infinite recursions. Nevertheless,
the function still does not work properly for arguments that are not
powers of two. The issue with sum_first_naturals_4 is that it is in-
complete. In particular, although the recursive case is correct, it only
applies when n is even. Figure 2.10 shows how to derive the recursive
case (S(n) = 2S((n − 1)/2) + ((n + 1)/2)²) for the problem and decom-
position when n is odd. The complete function is therefore:
       ⎧ 1                                if n = 1,
S(n) = ⎨ 2S(n/2) + (n/2)²                 if n > 1 and n is even,
       ⎩ 2S((n − 1)/2) + ((n + 1)/2)²     if n > 1 and n is odd,
and the corresponding code is shown in Listing 2.9. With the new re-
cursive case, every argument to function sum_first_naturals_5 will
also be an integer (given an initial integer input). Finally, replacing the
real divisions (/) by integer divisions (//) would also lead to a correct
algorithm.
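Listing 2.9 is not reproduced in this excerpt; a sketch consistent with the complete definition above (using the integer-division variant mentioned at the end of the paragraph) is:

```python
def sum_first_naturals_5(n):
    if n == 1:
        return 1
    elif n % 2 == 0:
        return 2 * sum_first_naturals_5(n // 2) + (n // 2)**2
    else:
        return 2 * sum_first_naturals_5((n - 1) // 2) + ((n + 1) // 2)**2
```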
2.7 EXERCISES
Exercise 2.1 — Let n be some positive integer. Consider the problem
of determining the number of bits set to 1 in the binary representation
of n (i.e., n expressed in base 2). For example, for n = 2510 = 110012
(the subscript indicates the base in which a number is expressed), the
result is three bits set to 1. Indicate the size of the problem and provide
a mathematical expression for such size.
Exercise 2.2 — Consider the function that adds the first n positive
integers in (1.5). Define a more general function that is also applicable
to all nonnegative integers. In other words, modify the function consid-
ering that it can also receive n = 0 as an input argument. Finally, code
the function.
Exercise 2.3 — Use similar diagrams as in Figures 2.6 and 2.10 in order
to derive recursive definitions of the sum of the first n positive integers
(S(n)), where the recursive case will add the result of four subproblems
of (roughly) half the size as the original. Finally, define and code the full
recursive function.
2¶
7¶
4¶
3¶
Indicate the size of the problem and its base case. Draw a diagram for
a general nonnegative input integer n = dm−1 ⋯ d1 d0 , where m is the
number of digits of n, in order to illustrate the decomposition of the
problem, and how to recover the solution to the problem given the result
of a subproblem. Finally, derive the recursive case and code the method.
Runtime Analysis of Recursive Algorithms
• logb b = 1
• logb 1 = 0
• logb (xy) = logb (x) + logb (y)
• logb (x/y) = logb (x) − logb (y)
where a, b, x, and y are arbitrary real numbers, with the exceptions that:
(1) the base of a logarithm must be positive and different than 1, (2) a
logarithm is only defined for positive numbers, and (3) the denominator
in a fraction cannot be 0. For example, logb x = loga x/ loga b is only valid
for a > 0 with a ≠ 1, b > 0 with b ≠ 1, and x > 0.
Logarithms and powers of positive numbers are monotonically in-
creasing functions. Therefore, if x ≤ y then logb x ≤ logb y, and b^x ≤ b^y
(for valid values of x, y, and b).
for a valid base b for the logarithm, and for a > 0. Therefore, the following
indeterminate form appears frequently:
    lim_{n→∞} f(n)/g(n) = ∞/∞ .

It can often be resolved by applying L'Hôpital's rule:

    lim_{n→∞} f(n)/g(n) = lim_{n→∞} f′(n)/g′(n) ,        (3.3)
where f ′ (n) and g ′ (n) are the derivatives of f (n) and g(n), respectively.
Formally, L'Hôpital's rule is only valid when the limit on the right-hand
side of (3.3) exists, which is usually the case.
Thus, the result of the sum is simply the addition of terms that arise
from substituting every occurrence of i in function f (i) with integers
from m up to n. For instance:
    ∑_{i=0}^{4} k·i² = k·0² + k·1² + k·2² + k·3² + k·4² ,
    ∑_{i=m}^{n} f(i)  =  ∑_{i=m−1}^{n−1} f(i + 1)  =  ∑_{i=m}^{n} f(n − i + m).
In the second sum the limits (and parameter of f ) appear shifted. The
third sum simply adds the terms in “reverse” order (when i = m it adds
f (n), while when i = n the term added is f (m)). Finally, if the lower
limit m is greater than the upper limit n, then the sum evaluates to 0,
by convention.
Notice that f (i) is a constant (1) that does not depend on the index
variable i. Similarly,
    ∑_{i=1}^{n} k = k + k + ⋯ + k + k = kn.
                    (n times)

Moreover,

    ∑_{i=m}^{n} k·f(i) = k · ∑_{i=m}^{n} f(i),
which follows from the distributive law of multiplication over addition,
where the expression has been simplified by extracting the common fac-
tor k from all of the terms being added. Naturally, the factor k can be
the product of several terms that do not depend on the index, and may
contain the upper and lower limits, as shown in the following example:
    ∑_{i=m}^{n} a·m·n²·i³ = a·m·n² · ∑_{i=m}^{n} i³ ,
      S(n) =    1    +    2    + ⋯ + (n − 1) +    n
   +  S(n) =    n    + (n − 1) + ⋯ +    2    +    1
   ─────────────────────────────────────────────────────
     2S(n) = (n + 1) + (n + 1) + ⋯ + (n + 1) + (n + 1)
[Figure: geometric illustration. Two copies of the staircase representing
S(n) fit together into an n × (n + 1) rectangle, so 2S(n) = n(n + 1),
which implies S(n) = n(n + 1)/2.]
It follows from the result that 2S(n) = n(n + 1), since there are n terms
(each equal to n + 1) on the right-hand side of the identity. Finally,
dividing by 2 yields:
    S(n) = ∑_{i=1}^{n} i = n(n + 1)/2.
    ∑_{i=0}^{n−1} sᵢ = (n/2)·(s₀ + s_{n−1}),
which is the average between the first and last elements of the sequence,
multiplied by the number of elements in the sequence.
3.1.4.4 Differentiation
Another useful sum is:
    S = ∑_{i=1}^{n} i·rⁱ = 1r¹ + 2r² + 3r³ + ⋯ + n·rⁿ ,        (3.10)
3.1.4.5 Products
Similarly to the notation used for sums, a product of several terms of
some function f (i), evaluated at consecutive integer values from an ini-
tial index m up to a final index n, can be written as follows:
    n! = ∏_{i=1}^{n} i = 1 · 2 · ⋯ · (n − 1) · n,
In addition, the product of a sum is not the sum of the products in
general:
    ∏_{i=m}^{n} (f₁(i) + f₂(i)) ≠ ∏_{i=m}^{n} f₁(i) + ∏_{i=m}^{n} f₂(i).
where Z represents the set of all integers, and ∣ can be read as “such
that.” The following list includes several basic properties of floors and
ceilings:
• ⌊x⌋ ≤ x
• ⌈x⌉ ≥ x
• ⌊x + n⌋ = ⌊x⌋ + n
• ⌈x + n⌉ = ⌈x⌉ + n
3.1.6 Trigonometry
Consider the right triangle in Figure 3.2. The following list reviews basic
trigonometric definitions and properties:
[Figure 3.2: a right triangle with hypotenuse c and legs a and b.]
• sin(0) = 0
• cos(0) = 1
• sin(30°) = sin(π/6) = 1/2
• cos(30°) = cos(π/6) = √3/2
• sin(45°) = sin(π/4) = √2/2
• cos(45°) = cos(π/4) = √2/2
• sin(60°) = sin(π/3) = √3/2
• cos(60°) = cos(π/3) = 1/2
where sin, cos, tan, and cot denote sine, cosine, tangent, and cotan-
gent, respectively. In most programming languages, the arguments of
the trigonometric functions are in radians (one radian is equal to 180/π
degrees).
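For instance, in Python (a small illustrative snippet, not from the text), degrees must be converted to radians before calling the trigonometric functions:

```python
import math

# sin(30 degrees) = 1/2; math.sin expects its argument in radians.
value = math.sin(math.radians(30))
```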
where ai,j denotes the particular element or entry of matrix A in its i-th
row and j-th column.
If one of the dimensions is equal to one the mathematical entity is
called a vector, while if both dimensions are equal to one the object is
a scalar (i.e., a number). In this book we will use a standard notation
where matrices are represented by capital boldface letters (A), vectors
through boldface lower case letters (a), and scalars by italic lower case
letters or symbols (a).
The transpose of an n × m-dimensional matrix A is the m × n-
dimensional matrix AT whose rows are the columns of A (and therefore
its columns are the rows of A). For example, if:
    A = ⎡ 3 4 2 ⎤ ,        then        Aᵀ = ⎡ 3 1 ⎤
        ⎣ 1 8 5 ⎦                           ⎢ 4 8 ⎥ .
                                            ⎣ 2 5 ⎦
It is possible to add and multiply matrices (and vectors). The sum
A + B is a matrix whose entries are ai,j + bi,j . In other words, matrices
(and vectors) are added entrywise. Thus, A and B must share the same
dimensions. For example:
    ⎡  4 ⎤   ⎡  2 ⎤   ⎡  6 ⎤
    ⎢ −1 ⎥ + ⎢  3 ⎥ = ⎢  2 ⎥ ,
    ⎣  2 ⎦   ⎣ −7 ⎦   ⎣ −5 ⎦
is a basic vector sum.
Instead, multiplication is more complex. Let a and b be two n-
dimensional (column) vectors. Their dot product, expressed as aᵀb
(other notations include a⃗ · b⃗, or ⟨a, b⟩), is defined as the sum of their
entrywise products:

    aᵀb = ∑_{i=1}^{n} aᵢ·bᵢ .        (3.14)

It can also be written as

    aᵀb = ∣a∣ ∣b∣ cos(α),        (3.15)
where α is the angle between the vectors a and b, and ∣ ⋅ ∣ denotes the
Euclidean norm:

    ∣a∣ = √(a₁² + ⋯ + a_n²) .        (3.16)
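The dot product in (3.14) and the Euclidean norm in (3.16) can be sketched directly in Python (the helper names are illustrative):

```python
import math

def dot(a, b):
    # Sum of entrywise products; a and b must have the same length.
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    # Euclidean norm: square root of the dot product of a with itself.
    return math.sqrt(dot(a, a))
```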
[Figure 3.3: (a) the sum u + v obtained by concatenating u and v;
(b) the vector v − u, which goes from the endpoint of u to the endpoint
of v.]
For instance, multiplying Aᵀ by the previous matrix A yields:

    AᵀA = ⎡ 3 1 ⎤ ⎡ 3 4 2 ⎤   ⎡ 10 20 11 ⎤
          ⎢ 4 8 ⎥ ⎣ 1 8 5 ⎦ = ⎢ 20 80 48 ⎥ ,
          ⎣ 2 5 ⎦             ⎣ 11 48 29 ⎦

which is a symmetric matrix, since it is identical to its transpose.
Vectors can also be regarded as points. Geometrically, adding two
vectors u and v can be understood as creating a new vector whose end-
point is the result of “concatenating” u and v, as shown in Figure 3.3(a).
It therefore follows that the vector that begins at the endpoint of u and
ends at the endpoint of v is the vector v − u, as illustrated in (b).
Lastly, a 2-dimensional vector can be rotated counterclockwise by α
degrees (or radians) by multiplying it by the following “rotation”
matrix:

    R = ⎡ cos(α)  −sin(α) ⎤
        ⎣ sin(α)   cos(α) ⎦ ,        (3.17)
as shown in Figure 3.4, where u is a column vector (in most mathematical
texts vectors are expressed as column vectors).
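A minimal sketch of this rotation in Python (the function name is an assumption; the angle is given in radians):

```python
import math

def rotate(u, alpha):
    # Multiply the column vector u = (x, y) by the rotation matrix R.
    c, s = math.cos(alpha), math.sin(alpha)
    return (c * u[0] - s * u[1], s * u[0] + c * u[1])
```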
[Figure 3.4: the vector u and its counterclockwise rotation Ru by the
angle α.]
[Figure 3.5: plot of 0.5n² versus 2000n + 50000 for n up to 10000
(Y axis in units of 10⁷); the quadratic function eventually overtakes
the linear one.]
1 < log n < n < n log n < n² < n³ < 2ⁿ < n!,
[Figure 3.6: the functions 1, log n, n, n log n, n², n³, 2ⁿ, and n!
plotted for n up to 12 on a logarithmic Y axis (from 2⁰ to 2¹⁶).]
    lim_{n→∞} f(n)/g(n) = 0.
The previous orders of growth are called (from left to right) “constant,”
“logarithmic,” “linear,” “n-log-n,” “quadratic,” “cubic,” “exponential,”
and “factorial.”
Since the scale of the Y axis in Figure 3.6 is logarithmic, the differ-
ences between the orders of growth may appear to be much smaller than
they actually are (the difference between consecutive tick marks means
that an algorithm is 16 times slower/faster). Table 3.1 shows concrete
values for the functions, where the fast growth rates of the exponen-
tial or factorial functions clearly stand out. Problems that cannot be
solved by algorithms in polynomial time are typically considered to be
intractable, since it would take too long for the methods to terminate
even for problems of moderate size. In contrast, problems that can be
solved in polynomial time are regarded as tractable. However, the line
between tractable and intractable problems can be subtle. If the run-
time of an algorithm is characterized by a polynomial order with a large
degree, in practice it could take too long to obtain a solution, or for its
intermediate results to be useful.
[Figure 3.7: asymptotic bounds. (a) f(n) ∈ O(g(n)): f(x) lies below
c·g(x) from n₀ onwards. (b) f(n) ∈ Ω(g(n)): f(x) lies above c·g(x)
from n₀ onwards. (c) f(n) ∈ Θ(g(n)): f(x) lies between c₁·g(x) and
c₂·g(x) from n₀ onwards.]
from some positive value n0 until infinity (whatever happens for n < n0
is irrelevant). In order to prove that a function belongs to O(g(n)) it
is sufficient to show the existence of a pair of constants c and n0 that
will satisfy the definition, since they are not unique. For instance, if the
definition is true for some particular c and n0 , then it will also be true for
larger values of c and n0 . In this regard, it is not necessary to provide the
lowest values of c and n0 that satisfy the definition of O (this also applies
to the notations mentioned below).
Algorithms are often compared according to their efficiency in the
worst case, which corresponds to an instance of a problem, amongst
all that share the same size, for which the algorithm will require more
resources (time, storage, etc.). Since Big-O notation specifies asymptotic
upper bounds, it can be used in order to provide a guarantee that a
particular algorithm will need at most a certain amount of resources,
even in a worst-case scenario, for large inputs. For example, the running
time for the quicksort algorithm that sorts a list or array of n elements
(see Section 6.2.2) belongs to O(n2 ) in general, since it requires carrying
out on the order of n2 comparisons in the worst case. However, quicksort
can run faster (in n log n time) in the best and average cases.
In contrast, Big-Omega notation defines asymptotic lower
bounds:
Ω(g(n)) = {f (n) ∶ ∃ c > 0 and n0 > 0 / 0 ≤ c ⋅ g(n) ≤ f (n), ∀n ≥ n0 }.
If f (n) ∈ Θ(g(n)) then f (n) and g(n) will share the same order of
growth. Thus, by choosing two appropriate constants c1 and c2 , g(n)
will be both an upper and lower asymptotic bound of f (n), as shown in
Figure 3.7(c). In other words, f (n) ∈ O(g(n)) and f (n) ∈ Ω(g(n)). For
example, the merge sort algorithm for sorting a list or array of n ele-
ments always requires (in the best and worst case) on the order of n log n
comparisons. Therefore, we say its running time belongs to Θ(n log n).
    ρ(log_a g(n)) = ρ( log_b g(n) / log_b a )
                  = ρ( (1 / log_b a) · log_b g(n) ) = ρ( log_b g(n) ),

since the factor 1/log_b a is a constant.
Thus, the base of a logarithm is not specified when indicating its order
of growth.
Finally, we can also use limits to determine the order of functions, due
to the following equivalent statements that relate them to the definitions
of asymptotic bounds:
    f(n) ∈ O(g(n))  ⇔  lim_{n→∞} f(n)/g(n) < ∞    (constant or zero),

    f(n) ∈ Ω(g(n))  ⇔  lim_{n→∞} f(n)/g(n) > 0    (constant > 0, or infinity),

    f(n) ∈ Θ(g(n))  ⇔  lim_{n→∞} f(n)/g(n) = constant > 0.
[Figure 3.8: the basic operations of sum_first_naturals in the base
case, each highlighted in a copy of the code, with costs a0 (store call
information), a1 (evaluate the condition), a2 (jump to the return
statement), and a3 (return the value 1):]

def sum_first_naturals(n):
    if n == 1:
        return 1
    else:
        return sum_first_naturals(n - 1) + n
the return address). Say this requires a0 units of computing time, where
a0 is a simple constant. The next basic operation evaluates the condition,
taking a1 units of time. Since the result is True, the next operation is a
“jump” to the third line of the method, which requires a2 units of time.
Finally, the method can return the value 1 in the last step, requiring a3
units of time. In total, the method requires a = a0 + a1 + a2 + a3 units
of time for n = 1. Thus, we can define T (1) = a. The exact value of a
is irrelevant regarding the asymptotic computational complexity of the
method. What is important is that a is a constant quantity that does
not depend on n.
Alternatively, Figure 3.9 shows the operations carried out in the re-
cursive case (when n > 1). Let b = ∑_{i=0}^{5} bᵢ be the total computing time
needed to carry out the basic operations (store low-level information,
evaluate the condition, jump to the recursive case, subtract a unit from
n, add n to the output of the recursive call, and return the result), which
[Figure 3.9: the basic operations of sum_first_naturals in the
recursive case, each highlighted in a copy of the code, with costs b0
through b5, and T(n − 1) for the recursive call itself.]
The next sections will cover methods for solving common recurrence
relations.
In addition, in this introductory text we will simplify the recurrence
relations in order to ease the analysis. Consider the code in Listing 2.9.
Its associated runtime cost function can be defined as:
       ⎧ a                if n = 1,
T(n) = ⎨ T(n/2) + b       if n > 1 and n is even,        (3.19)
       ⎩ T(⌊n/2⌋) + c     if n > 1 and n is odd.
This recurrence relation is hard to analyze for two reasons. On the one
hand, it contains more than one recursive case. On the other hand, al-
though it is possible to work with the floor function, it is subject to
technicalities and requires more complex mathematics (e.g., inequali-
ties). Moreover, the extra complexity that stems from separating the
odd and even cases, and dealing with a recurrence of finer detail, is un-
necessary regarding the function’s order of growth. Since the algorithm
is based on solving subproblems of (approximately, in the odd case) half
the size, we can work with the following recurrence relation instead:
       ⎧ a              if n = 1,
T(n) = ⎨                                                 (3.20)
       ⎩ T(n/2) + b     if n > 1.
Using this function is analogous to utilizing (3.19) and assuming that the
size of the problem (n) will be a power of two, which can be arbitrarily
large.
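As a quick sanity check (a hypothetical numerical experiment, not part of the text), (3.20) can be evaluated directly for powers of two; the value grows by b each time n doubles:

```python
def T(n, a, b):
    # Direct evaluation of recurrence (3.20); n is assumed a power of two.
    return a if n == 1 else T(n // 2, a, b) + b
```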
Therefore, in this book we will cover recurrence relations with only
one recursive case, which will not involve the floor or ceiling functions.
This will allow us to determine exact nonrecursive definitions of recur-
rence relations whose order of growth can be characterized by the tight
Θ asymptotic bound.
T(n) = T(n − 1) + b = [T(n − 2) + b] + b = T(n − 2) + 2b,

where the expression inside the square brackets is the expansion of T(n −
1). The idea can be applied again expanding T(n − 2), which is T(n − 3) + b.
Thus, at a third step we obtain:

T(n) = [T(n − 3) + b] + 2b = T(n − 3) + 3b,

where the bracketed term is the expansion of T(n − 2).
Finally, for some value of i the process will reach a base case. For function
(3.18) it is defined for T(1). Therefore, the term T(n − i) will correspond
to the base case when n − i = 1, or equivalently, when i = n − 1. Substituting
this result in (3.22) allows us to eliminate the variable i from the formula,
and provide a full nonrecursive definition of T(n):

    T(n) = T(1) + (n − 1)·b = a + b·(n − 1) ∈ Θ(n),
where the general pattern for the recurrence relation at step i is:

    T(n) = T(n/2ⁱ) + i·b.        (3.23)

The base case T(1) is reached when n/2ⁱ = 1. Thus, it occurs when n = 2ⁱ,
or equivalently, when i = log₂ n. By substituting in (3.23) we obtain:

    T(n) = T(1) + b·log₂ n = a + b·log₂ n ∈ Θ(log n).
Since the order of growth is logarithmic, Listing 2.9 is faster than List-
ing 1.1, whose order is linear. This makes intuitive sense, since the former
decomposes the original problem by dividing its size by two, while the
latter is based on decrementing the size of the problem by a unit. Thus,
Listing 2.9 needs fewer recursive function calls in order to reach the base
case.
A subtle detail about recurrence relations that stem from dividing
the size of a problem by an integer constant k ≥ 2 is that they should not
contain a single base case for n = 0. From a mathematical point of view,
it would never be reached, and we would not be able to find a value for
i in the method’s third step. The pitfall is that the argument of T (n)
must be an integer. Thus, the fraction in T (n/k) actually corresponds
to an integer division. Notice that after reaching T (1) the next expan-
sion would correspond to T (0) instead of T (1/k). Therefore, for these
recurrence relations we should include an additional base case, usually
for n = 1, in order to apply the method correctly.
In the previous examples it was fairly straightforward to detect the
general recursive pattern for the i-th stage of the expansion process. For
the next recurrence relations the step is slightly more complex since it
will involve computing sums like the ones presented in Section 3.1.4.
Consider the following recurrence relation:
       ⎧ a                   if n = 0,
T(n) = ⎨                                                 (3.24)
       ⎩ T(n − 1) + bn + c   if n > 0.
T (n) = T (n − 1) + bn + c
= [T (n − 2) + b(n − 1) + c] + bn + c
= T (n − 2) + 2bn − b + 2c
= [T (n − 3) + b(n − 2) + c] + 2bn − b + 2c
= T (n − 3) + 3bn − b(1 + 2) + 3c
    T(n) = T(n − i) + i·bn − b·∑_{j=1}^{i−1} j + i·c
         = T(n − i) + i·bn − b·(i − 1)·i/2 + i·c.        (3.25)
Two common misconceptions when using sums in the pattern at the i-th
stage are: (1) using i as the index variable of the sum, and (2) choosing
n as its upper limit. It is important to note that i − 1 is the upper limit
of the sum, which implies that the index of the sum cannot be i.
Finally, the base case T(0) is reached when i = n. Therefore, substi-
tuting in (3.25) yields:

    T(n) = a + b·n² − b·(n − 1)·n/2 + c·n = (b/2)·n² + (b/2 + c)·n + a ∈ Θ(n²).
T (n) = 2T (n/2) + bn + c
= 2[2T (n/4) + bn/2 + c] + bn + c
= 4T (n/4) + 2bn + 2c + c
= 4[2T (n/8) + bn/4 + c] + 2bn + 2c + c
= 8T (n/8) + 3bn + 4c + 2c + c
Finally, we reach base case T(1) when n/2ⁱ = 1, which implies that
i = log₂ n. Therefore, substituting in (3.27) yields:

    T(n) = a·n + b·n·log₂ n + (n − 1)·c ∈ Θ(n log n).
T (n) ∈ Θ(nlogb a ).
3. If f(n) = \Omega(n^{\log_b a + \epsilon}) with \epsilon > 0, and f(n) satisfies the regularity
condition (af(n/b) \leq df(n) for some constant d < 1, and for all n
sufficiently large), then:

T(n) \in \Theta(f(n)).
constant (i.e., it does not diverge) for r < 1. Thus, in this
scenario:

T(n) = cn^{\log_b a} + dKn^k,

for some constant K. In addition, since a < b^k implies that \log_b a <
k, the highest-order term is n^k. Therefore,

T(n) \in \Theta(n^k).
implies that

T(n) \in \Theta(n^k \log n).
Finally, when a > b^k the geometric sum grows with n. Letting K = a/b^k - 1, and using the identity (a/b^k)^{\log_b n} = n^{\log_b a}/n^k:

T(n) = cn^{\log_b a} + dn^k \cdot \frac{n^{\log_b a}/n^k - 1}{K} = \left(c + \frac{d}{K}\right) n^{\log_b a} - \frac{d}{K} n^k,

and since \log_b a > k, the highest-order term is n^{\log_b a}. Therefore, T(n) \in \Theta(n^{\log_b a}).
T (n) = 2T (n/2) + 1
= 2[2T (n/4) + 1] + 1 = 4T (n/4) + 2 + 1
= 4[2T (n/8) + 1] + 2 + 1 = 8T (n/8) + 4 + 2 + 1
= 8[2T (n/16) + 1] + 4 + 2 + 1 = 16T (n/16) + 8 + 4 + 2 + 1
⋮
= 2^i T(n/2^i) + \sum_{j=0}^{i-1} 2^j = 2^i T(n/2^i) + 2^i - 1.
T (n) = n + n − 1 = 2n − 1 ∈ Θ(n),
which is a special case of (3.21), where T(n) ∈ Θ(n). Instead, the third
method decomposes the original problem into two subproblems of half the size,
where the subsolutions do not need to be processed any further. Therefore, the
corresponding recurrence relation can be identical to the one in (3.30).
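The closed form 2n − 1 can be checked against the recurrence with a short script (a sketch; the helper name T is ours, and we assume the base case T(1) = 1 used in the expansion above):

```python
def T(n):
    # Recurrence T(n) = 2*T(n/2) + 1, with base case T(1) = 1,
    # defined for n a power of two.
    return 1 if n == 1 else 2 * T(n // 2) + 1
```

For every power of two, T(n) equals 2n − 1, matching the result obtained through expansion.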
Finally, we will study the computational time complexity of List-
ings 2.2 and 2.3. The associated recurrence relation (ignoring multiplica-
tive constants) is:
T(n) = \begin{cases}
1 & \text{if } n < 10, \\
T(n/10) + 1 & \text{if } n \geq 10.
\end{cases}    (3.31)
It differs from the previous recurrences in that its base case is defined
on an interval. Nevertheless, by assuming that the input will be a power
of 10, we can use an alternative definition of the recurrence:
T(n) = \begin{cases}
1 & \text{if } n = 1, \\
T(n/10) + 1 & \text{if } n > 1.
\end{cases}    (3.32)

Expanding this recurrence:
T (n) = T (n/10) + 1
= [T(n/100) + 1] + 1 = T(n/10^2) + 2
= [T(n/1000) + 1] + 2 = T(n/10^3) + 3
= [T(n/10000) + 1] + 3 = T(n/10^4) + 4
⋮
= T(n/10^i) + i.

The base case T(1) is reached when n/10^i = 1, that is, when i = log_{10} n. Therefore, T(n) = log_{10} n + 1 \in \Theta(\log n).
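Listings 2.2 and 2.3 are not reproduced in this excerpt; a function with this cost profile, such as one that counts the digits of a nonnegative integer, might look like (a sketch, not necessarily the book's listing):

```python
def num_digits(n):
    # Base case: a single digit when n < 10. Otherwise discard the last
    # digit through integer division by 10 and add one to the result of
    # the subproblem, matching T(n) = T(n/10) + 1.
    return 1 if n < 10 else num_digits(n // 10) + 1
```

The number of recursive calls is the number of digits of n, i.e., logarithmic in n.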
where a_i and b_i are constants, and P_i(n) are polynomials (in n) of
degree d_i. Terms involving T may appear several times on the right-
hand side of the definition. Moreover, their arguments are necessarily n
minus some integer constant. Thus, these recurrence relations are also
known as "difference equations." In this book we will call these terms
involving T the "T difference" terms, to emphasize that the arguments
cannot take the form n/b, where b is a constant (if such terms appear it is
necessary to transform the recurrence first). In addition, the right-hand side
of the definition may contain several terms that consist of a polynomial
times a constant raised to the power n (i.e., an exponential).
Instead of providing a general procedure directly, the following sub-
sections will explain the method progressively, starting with simple re-
currences, and introducing new elements as they grow in complexity.
x^2 - x - 1.
The last step consists of finding values for the constants Ci by solving a
system of linear equations. Each equation is formed by choosing a small
value for n corresponding to a base case, and plugging it in (3.36). The
simplest equation is obtained for n = 0, which is the reason why we
considered the base case n = 0 in (3.35). For the second equation we can
use n = 1. This provides the following system of linear equations:
C_1 + C_2 = 0 = T(0)
\left(\frac{1+\sqrt{5}}{2}\right) C_1 + \left(\frac{1-\sqrt{5}}{2}\right) C_2 = 1 = T(1)
A(n) = \begin{cases}
0 & \text{if } n = 1, \\
1 & \text{if } n = 2, \\
A(n-1) + A(n-2) & \text{if } n \geq 3.
\end{cases}    (3.38)
On the other hand, since A(n − 1) = B(n), and A(n) = B(n + 1), we
can substitute in (1.17) in order to obtain B(n + 1) = B(n) + B(n − 1).
Furthermore, since B(2) = 0, we can define B(n) as:
B(n) = \begin{cases}
1 & \text{if } n = 1, \\
0 & \text{if } n = 2, \\
B(n-1) + B(n-2) & \text{if } n \geq 3.
\end{cases}    (3.39)
Both of these functions have the form in (3.36). Thus, the only difference
between them is the value of the constants. In particular:
A(n) = \left(\frac{1}{2} - \frac{1}{2\sqrt{5}}\right)\left(\frac{1+\sqrt{5}}{2}\right)^n + \left(\frac{1}{2} + \frac{1}{2\sqrt{5}}\right)\left(\frac{1-\sqrt{5}}{2}\right)^n,    (3.40)
and
B(n) = \left(-\frac{1}{2} + \frac{3}{2\sqrt{5}}\right)\left(\frac{1+\sqrt{5}}{2}\right)^n + \left(-\frac{1}{2} - \frac{3}{2\sqrt{5}}\right)\left(\frac{1-\sqrt{5}}{2}\right)^n,    (3.41)
= C_1 + C_2 n + C_3 n^2 + C_4 2^n,
where there are three terms associated with root r = 1. Finally, the
constants can be recovered by solving the following system of linear
equations:
C_1 + C_4 = 0 = T(0)
C_1 + C_2 + C_3 + 2C_4 = 2 = T(1)
C_1 + 2C_2 + 4C_3 + 4C_4 = 11 = T(2)
C_1 + 3C_2 + 9C_3 + 8C_4 = 28 = T(3)
The solutions are C1 = −1, C2 = −2, C3 = 3, and C4 = 1 (Listing 3.2
shows how to solve the system of linear equations, expressed as Ax = b,
with the NumPy package). Finally,
T(n) = -1 - 2n + 3n^2 + 2^n \in \Theta(2^n),
whose order of growth is exponential.
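Listing 3.2 is not reproduced in this excerpt; a sketch of how the system above can be solved with NumPy, in the same spirit:

```python
import numpy as np

# Coefficient matrix and right-hand side of the system of equations
# built from T(0) = 0, T(1) = 2, T(2) = 11, and T(3) = 28.
A = np.array([[1.0, 0, 0, 1],
              [1, 1, 1, 2],
              [1, 2, 4, 4],
              [1, 3, 9, 8]])
b = np.array([0.0, 2, 11, 28])

x = np.linalg.solve(A, b)  # the constants C1, C2, C3, and C4
```

The solution vector recovers C1 = −1, C2 = −2, C3 = 3, and C4 = 1.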
T(n) = 2T(n-1) - T(n-2) + 3^n + n3^n + 3 + n + n^2.

Moving the recursive terms to the left-hand side gives:

T(n) - 2T(n-1) + T(n-2) = 3^n + n3^n + 3 + n + n^2.
For the next step it is useful to express the terms on the right-hand side
as the product of a polynomial times an exponential. Naturally, if a term
only contains an exponential then it can be multiplied by 1, which is a
polynomial. Similarly, we can multiply polynomials by 1^n. Therefore,
the recurrence can be written as:

T(n) - 2T(n-1) + T(n-2) = 1 \cdot 3^n + n \cdot 3^n + 3 \cdot 1^n + n \cdot 1^n + n^2 \cdot 1^n,

or, grouping the terms that share the same base:

T(n) - 2T(n-1) + T(n-2) = (1 + n) \cdot 3^n + (3 + n + n^2) \cdot 1^n.
of the exponential, and 2 is the degree of the polynomial (which is 1) plus one.
Similarly, (3 + n + n^2) \cdot 1^n provides the new term (x - 1)^3. Therefore, the
characteristic polynomial is:

(x - 1)^2 (x - 3)^2 (x - 1)^3 = (x - 1)^5 (x - 3)^2,

and the solution has the following form:

T(n) = C_1 + C_2 n + C_3 n^2 + C_4 n^3 + C_5 n^4 + C_6 3^n + C_7 n3^n.
The next example illustrates the approach with the following non-
homogeneous recurrence relation:
T(n) = \begin{cases}
1 & \text{if } n = 0, \\
2T(n-1) + n + 2^n & \text{if } n > 0.
\end{cases}
Its characteristic polynomial is (x - 2)(x - 1)^2(x - 2) = (x - 1)^2(x - 2)^2, which implies:

T(n) = C_1 + C_2 n + C_3 2^n + C_4 n2^n.
Since there are four unknown constants we need the values of T evaluated
at four different inputs. Starting with the base case T(0) = 1, we can
compute T(1), T(2), and T(3) by using T(n) = 2T(n-1) + n + 2^n. In
particular, T(1) = 5, T(2) = 16, and T(3) = 43. With this information
we can build the following system of linear equations:
C_1 + C_3 = 1 = T(0)
C_1 + C_2 + 2C_3 + 2C_4 = 5 = T(1)
C_1 + 2C_2 + 4C_3 + 8C_4 = 16 = T(2)
C_1 + 3C_2 + 8C_3 + 24C_4 = 43 = T(3)
The solutions are C_1 = -2, C_2 = -1, C_3 = 3, and C_4 = 1. Therefore, the
nonrecursive expression of T(n) is:

T(n) = -2 - n + 3 \cdot 2^n + n2^n \in \Theta(n2^n).
Consider again the recurrence

T(n) = 2T(n/2) + 1,

where we can assume that the input is a power of two. Since the
argument in the T term of the right-hand side is n/2 we need to apply
the change of variable n = 2k in order to obtain:
t(k) = 2t(k − 1) + 1,
which we can solve through the general method for difference equations.
In particular, the expression can be written as t(k) - 2t(k-1) = 1 \cdot 1^k.
Therefore, the characteristic polynomial is (x−2)(x−1), and the function
will have the following form:
t(k) = C_1 + C_2 2^k.    (3.43)
Undoing the change of variable (2^k = n) yields:

T(n) = C_1 + C_2 n.    (3.44)
The last step consists of determining the constants C1 and C2 . This can
be done through either (3.43) or (3.44). For T we can use the base cases
T (1) = 1 and T (2) = 3. The analogous base cases for t are: t(0) = T (20 ) =
T (1) = 1, and t(1) = T (21 ) = T (2) = 3. Either way, the system of linear
equations is:
C_1 + C_2 = 1 = T(1) = t(0)
C_1 + 2C_2 = 3 = T(2) = t(1)

The solutions are C_1 = -1 and C_2 = 2. Therefore, T(n) = 2n - 1 \in \Theta(n),
in agreement with the result obtained earlier through expansion.
recurrence relation:
T(n) = 2T(\sqrt{n}) + \log_2 n    (3.45)

where T(2) = 1 and n = 2^{(2^k)} (note that 2^{(2^k)} \neq 4^k). This last restriction
allows the change of variable t(k) = T(2^{(2^k)}), which transforms (3.45) into the
difference equation t(k) = 2t(k-1) + 2^k, with characteristic polynomial (x-2)^2. Thus:

t(k) = C_1 2^k + C_2 k2^k.
In order to undo the change of variable we can use k = \log_2(\log_2 n), and
2^k = \log_2 n. The recurrence as a function of n is therefore:

T(n) = C_1 \log_2 n + C_2 (\log_2(\log_2 n)) \log_2 n.
Finally, we can use the base cases T (2) = 1 and T (4) = 4 in order to
find the constants. In particular, we need to solve the following system
of linear equations:
C_1 = 1 = T(2)
2C_1 + 2C_2 = 4 = T(4)
The solutions are C1 = C2 = 1. Therefore, the final nonrecursive formula
for T (n) is:
T(n) = \log_2 n + (\log_2(\log_2 n)) \log_2 n \in \Theta((\log(\log n)) \log n).    (3.47)
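Formula (3.47) can be checked numerically against recurrence (3.45) for inputs of the form 2^(2^k) (a sketch; the function names are ours):

```python
import math

def T(n):
    # Recurrence (3.45): T(n) = 2*T(sqrt(n)) + log2(n), with T(2) = 1,
    # defined for n of the form 2**(2**k). For such n the integer square
    # root is exact.
    return 1 if n == 2 else 2 * T(math.isqrt(n)) + math.log2(n)

def closed_form(n):
    # Formula (3.47): log2(n) + log2(log2(n)) * log2(n).
    return math.log2(n) + math.log2(math.log2(n)) * math.log2(n)
```

For instance, both expressions give 12 for n = 16.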
Consider now the recurrence

t(k) = 2t(k/2) + k,
which still cannot be solved through the method since it is not a differ-
ence equation. However, we can apply a new change of variable in order
to transform it into one. With the change k = 2m we obtain:
t(2^m) = 2t(2^{m-1}) + 2^m,

which, defining u(m) = t(2^m), becomes the difference equation:

u(m) = 2u(m - 1) + 2^m.

Its characteristic polynomial is (x - 2)^2, which implies:

u(m) = C_1 2^m + C_2 m2^m.
Undoing the change of variable (2^m = k, and m = \log_2 k) we finally obtain:

t(k) = C_1 k + C_2 k \log_2 k \in \Theta(k \log k).
and apply the more complex change of function u(k) = log2 t(k), which
leads to:
u(k) = k + 2u(k − 1),
which we can solve through the method. In particular, its characteristic
polynomial is (x - 2)(x - 1)^2, which implies that u(k) has the
following form:
u(k) = C_1 2^k + C_2 + C_3 k.
Undoing the changes, we first have:

t(k) = 2^{C_1 2^k + C_2 + C_3 k},

and finally:

T(n) = 2^{C_1 n + C_2 + C_3 \log_2 n}.
The last step consists of determining the constants. Using the initial
recurrence with T (1) = 1/3, we obtain T (2) = 2[T (1)]2 = 2/9, and
T (4) = 4[T (2)]2 = 4(2/9)2 = 16/81. We can use these values to build
the following system of (nonlinear) equations:
2^{C_1 + C_2} = 1/3 = T(1)
2^{2C_1 + C_2 + C_3} = 2/9 = T(2)
2^{4C_1 + C_2 + 2C_3} = 16/81 = T(4)
The solutions are C_1 = 2 - \log_2 3, C_2 = -2, and C_3 = -1. Therefore:

T(n) = 2^{(2 - \log_2 3)n - 2 - \log_2 n} = \left(\frac{4}{3}\right)^n \cdot \frac{1}{4n} \in \Theta\left(\left(\frac{4}{3}\right)^n \cdot \frac{1}{n}\right).
3.4 EXERCISES
Exercise 3.1 — Prove the following identity:
\left(\frac{a}{b^k}\right)^{\log_b n} = \frac{n^{\log_b a}}{n^k}.
Exercise 3.2 — By using limits, show that log n! ∈ Θ(n log n). Hint:
when n approaches infinity, n! can be substituted by "Stirling's approximation":

n! \sim \sqrt{2\pi n}\left(\frac{n}{e}\right)^n.
Exercise 3.3 — Show that n \log n \in O(n^{1+a}), where a > 0. Use limits,
and L'Hôpital's rule.
Exercise 3.5 — Write the sum of the first n odd integers in sigma
notation, and simplify it (the result should be a well-known polynomial).
\sum_{i=0}^{n-1} s_i = \frac{n}{2}(s_0 + s_{n-1}).
[Table: the 4-bit binary representations 0001 through 1111.]
a) T(n) = 4T(n-1) - 5T(n-2) + 2T(n-3) + n - 3 + 5n^2 \cdot 2^n
b) T(n) = T(n-1) + 3n - 3 + n^3 \cdot 3^n
c) T(n) = 5T(n-1) - 8T(n-2) + 4T(n-3) + 3 + n^2 + n2^n
where n is a power of 2.
Do the difficult things while they are easy and do the great things
while they are small. A journey of a thousand miles must begin
with a single step.
— Lao Tzu
where the base b is a real number, and the exponent n is a nonnegative
integer (in Python, powers can be obtained through the ** operator). The
following subsections examine algorithms that can compute the power
in linear and logarithmic time.
Inputs              Results
(b, n)     ──f──→  b^n
                     │ × b
(b, n−1)   ──f──→  b^{n−1}
Linear Recursion I: Basic Algorithms 107
Inputs              Results
(b, n)     ──f──→  b^n
                     │ / b
(b, n+1)   ──f──→  b^{n+1}
Inputs              Results
(b, n)     ──f──→  b^n
                     │ ( )^2
(b, n/2)   ──f──→  b^{n/2}
Inputs                  Results
(b, n)         ──f──→  b^n
                         │ ( )^2 and × b
(b, (n−1)/2)   ──f──→  b^{(n−1)/2}
When n is odd, (n-1)//2 is equivalent to n//2. The code uses the former
expression since it resembles the mathematical definition of the function more
closely.
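The listing referred to above is not reproduced in this excerpt; a sketch of a function matching the description (the name power is ours, and n is assumed to be a nonnegative integer):

```python
def power(b, n):
    # Base case: b**0 = 1.
    if n == 0:
        return 1
    # Halve the size of the problem: for even n, b**n = (b**(n//2))**2;
    # for odd n, b**n = (b**((n-1)//2))**2 * b.
    if n % 2 == 1:
        p = power(b, (n - 1) // 2)
        return p * p * b
    p = power(b, n // 2)
    return p * p
```

A single recursive call is made in each recursive case, which is what yields the logarithmic running time.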
Its runtime can be defined through:
T(n) = \begin{cases}
1 & \text{if } n = 0, \\
T(n/2) + 1 & \text{if } n > 0,
\end{cases}
where T (n) = 2 + log2 n for n > 0. Thus, T (n) ∈ Θ(log n). The superior
performance stems from dividing the size of the problem by two in the
decomposition stage. However, the function must make a single recursive
call in each recursive case. For example, the code in Listing 4.4 does
not run in logarithmic time even though the decomposition divides the
problem size by two. The issue is that it calculates the result of the
same subproblem twice by using two identical recursive calls, which is
obviously unnecessary. The runtime cost for this function is:
T(n) = \begin{cases}
1 & \text{if } n = 0, \\
2T(n/2) + 1 & \text{if } n > 0,
\end{cases}

which, as seen above, implies that T(n) ∈ Θ(n): the function runs in linear, rather than logarithmic, time.
Inputs              Results
(a, b)     ──f──→  a + b
                     │ + 1
(a − 1, b) ──f──→  a − 1 + b
Inputs                  Results
(a, b)         ──f──→  a + b
                         │ + 1 + 1
(a − 1, b − 1) ──f──→  a − 1 + b − 1
= \frac{mn(m + n + 2)}{2}.
However, we will develop a recursive solution that uses two recursive
functions (one for each sum).
Firstly, the outer sum is a function of the parameters m and n, whose
size is m. The function returns 0 in its base case when m is smaller than
the lower index (i.e., when m ≤ 0). The general diagram for the recursive
case, assuming that the size of the subproblem is m − 1, is:
Inputs            Results
(m, n)     ──f──→ \sum_{j=1}^{n}(1+j) + ⋯ + \sum_{j=1}^{n}((m−1)+j) + \sum_{j=1}^{n}(m+j)
                    │ + \sum_{j=1}^{n}(m+j)
(m−1, n)   ──f──→ \sum_{j=1}^{n}(1+j) + ⋯ + \sum_{j=1}^{n}((m−1)+j)
Inputs            Results
(n, m)     ──g──→ (m+1) + ⋯ + (m+(n−1)) + (m+n)
                    │ + (m+n)
(n−1, m)   ──g──→ (m+1) + ⋯ + (m+(n−1))
Listing 4.8 Recursive functions that compute the double sum in (4.3).
1 def inner_sum(n, m):
2 if n <= 0:
3 return 0
4 else:
5 return inner_sum(n - 1, m) + (m + n)
6
7
8 def outer_sum(m, n):
9 if m <= 0:
10 return 0
11 else:
12 return outer_sum(m - 1, n) + inner_sum(n, m)
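As a quick check, the recursive pair in Listing 4.8 can be compared with the closed form mn(m + n + 2)/2 obtained above (the functions are restated here so the snippet is self-contained):

```python
def inner_sum(n, m):
    # Computes (m + 1) + (m + 2) + ... + (m + n).
    return 0 if n <= 0 else inner_sum(n - 1, m) + (m + n)

def outer_sum(m, n):
    # Computes the double sum of (i + j) for i = 1..m and j = 1..n.
    return 0 if m <= 0 else outer_sum(m - 1, n) + inner_sum(n, m)
```

For every pair of small nonnegative m and n, outer_sum(m, n) agrees with m·n·(m + n + 2)/2.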
with 0 ≤ di < b, and dm−1 ≠ 0 (i.e., we omit writing leading zeros). There-
fore, different bases lead to distinct sequences of digits that represent the
same number. Regarding notation, the base can be specified through a
subscript, which we usually omit when it is 10. For example, 142_{10} = 142,
but 142_5 = 1 \cdot 5^2 + 4 \cdot 5^1 + 2 \cdot 5^0 = 25 + 20 + 2 = 47. In this section we will
examine algorithms for converting numbers expressed in some base to
another one.
Since we need to create the sequence of “bits,” the size of the problem
is the number of bits in the binary representation of n. Mathematically,
this quantity is ⌊log2 n⌋ + 1 (for n > 0). However, we do not need the
formula in order to design the recursive algorithm. All we require is a
clear definition of the size of the problem that will enable us to define base
cases and decompose the problem. In particular, the smallest instances
of the problem correspond to numbers that contain a single bit, which
are 0 and 1. Thus, the base case occurs when n < 2, where the output is
simply n.
For the recursive case we need to decide how we can reduce the size
of the problem. The simplest way consists of decrementing the number
of bits by a single unit, which is accomplished by performing an integer
division of n by two (this shifts the bits one place to the right, where the
least significant bit is discarded). We can start the analysis with concrete
instances. For example:
Inputs             Results
n = 18    ──f──→  10010
                    │ × 10 + 0
n//2 = 9  ──f──→  1001
and
Inputs             Results
n = 19    ──f──→  10011
                    │ × 10 + 1
n//2 = 9  ──f──→  1001
Note that the output of f (18) is ten thousand and ten, since it is ex-
pressed in base 10. The diagrams illustrate that we can obtain the de-
sired result by multiplying the output of the subproblem by 10, and
then adding 1 if n is odd. Therefore, the recursive case seems to be
f (n) = 10f (n//2) + n%2.
We can proceed more rigorously through the following general dia-
gram of the recursive thought process:
Inputs                          Results
n = (b_{m−1} ⋯ b_1 b_0)_2     ──f──→  (b_{m−1} ⋯ b_1 b_0)_{10}
                                        │ × 10 + b_0
n//2 = (b_{m−1} ⋯ b_1)_2      ──f──→  (b_{m−1} ⋯ b_1)_{10}
Listing 4.9 shows the corresponding code. Finally, the binary represen-
tation of 142 = 142_{10} is 10001110_2 = 128 + 8 + 4 + 2.
to the original, but starts at 28, which is 142//5. Thus, the decomposi-
tion consists of performing the integer division n//b, which reduces the
size of the problem by a unit. Subsequently, we must determine how
it is possible to modify the solution to the subproblem (103), in order
to obtain the original (1032). The solution consists of multiplying the
result of the subproblem times 10, and adding n%b (which is 2 in the
example). Thus, the function can be coded as shown in Listing 4.10.
4.3 STRINGS
This section analyzes two problems involving strings, which are essen-
tially sequences of characters, and constitute a fundamental data type
in many programming languages.
Inputs                        Results
s_0 s_1 ⋯ s_{n−2} s_{n−1}  ──→  s_{n−1} s_{n−2} ⋯ s_1 s_0
                                  │ + s_0
s_1 ⋯ s_{n−2} s_{n−1}      ──→  s_{n−1} s_{n−2} ⋯ s_1
Thus, the function simply has to concatenate the first character to the
result of the subproblem associated with s1 ⋯ sn−2 sn−1 (the + symbol
represents string concatenation). Together with the base case, the recur-
sive function in Python is shown in Listing 4.11.
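Listing 4.11 is not reproduced in this excerpt; a sketch of the reversal function described (the name is ours):

```python
def reverse(s):
    # Base case: the empty string (or a single character) is its own
    # reverse. Recursive case: reverse s[1:] and concatenate s[0] at the end.
    return s if len(s) <= 1 else reverse(s[1:]) + s[0]
```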
Inputs                        Results
s_0 s_1 ⋯ s_{n−2} s_{n−1}  ──f──→  \bigwedge_{i=0}^{\lfloor n/2 \rfloor - 1} (s_i = s_{n−i−1})
                                     │ ∧ (s_0 = s_{n−1})
s_1 ⋯ s_{n−2}              ──f──→  \bigwedge_{i=1}^{\lfloor n/2 \rfloor - 1} (s_i = s_{n−i−1})
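The corresponding listing is not included in this excerpt; a sketch of the palindrome-testing function suggested by the diagram (the name is ours):

```python
def is_palindrome(s):
    # Base case: strings of length 0 or 1 are palindromes.
    if len(s) <= 1:
        return True
    # Recursive case: compare the outer characters and recurse on the
    # inner substring.
    return s[0] == s[-1] and is_palindrome(s[1:-1])
```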
Inputs Results
[7, 5, 3, 8, 4] ÐÐÐÐÐÐÐÐÐ→ [3, 4, 5, 7, 8]
concatenate [3]
In order to solve the original problem, the recursive rule must concate-
nate the discarded smallest element ([3]) with the output list of the
subproblem (note that the results are sorted lists). Listing 4.13 shows
an implementation of the method that uses the function min to deter-
mine the smallest value in a list, the method index that returns the
location of an element in a list, and the + operator to concatenate lists.
An important detail regarding this algorithm is that it needs to make a
new copy of the input list (in line 5) in order to not alter it when calling
the method. In particular, without this copy the method would swap the
first and smallest elements of the list.
Another possibility for reducing the size of the problem consists of
discarding the smallest element of the list directly, by calling the remove
method. In this case, the recursive diagram would be:
Inputs Results
[7, 5, 3, 8, 4] ÐÐÐÐÐÐÐÐÐ→ [3, 4, 5, 7, 8]
concatenate [3]
The only difference with respect to the previous diagram is the order of
the elements in the input to the subproblem. This does not affect the
recursive rule, where we must also concatenate the discarded smallest
element ([3]) with the output list of the subproblem. Listing 4.14 shows
a possible implementation of the function, which relies on the remove
method. Again, the fifth line makes a copy of the input list in order to
keep it unaltered when calling the method.
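Listing 4.14 is not reproduced in this excerpt; one way to implement the approach described (the function name is ours):

```python
def select_sort(a):
    # Base case: a list with at most one element is already sorted.
    if len(a) <= 1:
        return a
    smallest = min(a)
    rest = list(a)         # copy, so the input list is not altered
    rest.remove(smallest)  # discard the first occurrence of the smallest
    # Concatenate the smallest element with the sorted sublist.
    return [smallest] + select_sort(rest)
```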
at some real value x. The sum contains powers of x that are multiplied
by the coefficients ci . A naive algorithm that computes each power in-
dependently would require on the order of n2 multiplications. Instead,
Horner’s method only needs to perform n products. Its clever idea is
based on expressing the polynomial as:
The size of the problem is clearly the degree n. Thus, the base case
occurs when n = 0, where the result is obviously c0 . In practice, c will be
a list (or a similar data structure such as an array) of n+1 elements that
represents the polynomial. Therefore, the base case is reached when the
length of c is one.
In order to apply recursion we need to detect a self-similar subprob-
lem of smaller size. The decomposition of the problem, decrementing its
size by a unit, is:
c_0 + c_1 x + c_2 x^2 + ⋯ + c_n x^n = c_0 + x \cdot \underbrace{(c_1 + c_2 x + ⋯ + c_n x^{n-1})}_{\text{subproblem}},
where the list c_{1..n} is simply c without its first element. The full function is:

p(c, x) = \begin{cases}
c_0 & \text{if } n = 1, \\
c_0 + x \cdot p(c_{1..n}, x) & \text{if } n > 1,
\end{cases}
p(c, x) = p(c_{0..n-1}, x) + c_n x^n,
\binom{n}{0}, \binom{n}{1}, ⋯, \binom{n}{n-1}, \binom{n}{n},
where the first and last elements are always 1. The problem can be
viewed from a recursive point of view by considering the definition in
3.2. Graphically, it means that a binomial coefficient is the sum of the
two immediately above it in the previous row of Pascal’s triangle. For
example, \binom{4}{3} = \binom{3}{2} + \binom{3}{3}. Therefore, it is possible to define a row of
Pascal’s triangle if we know the values of the previous row. Figure 4.3
shows the relationship between a problem (n-th row) and a subproblem
((n − 1)-th row).
The size of the problem is n. Thus, the base case corresponds to
n = 0, where the output is simply a list containing a 1 ([1]). For n = 1
we cannot apply the recursive rule in 3.2. Thus, initially it appears that
we may need an additional base case for n = 1, where the output is
[Figure 4.3: (a) the top of Pascal's triangle written with binomial coefficients, starting at \binom{0}{0}; (b) the (n−1)-th row 1, \binom{n-1}{1}, \binom{n-1}{2}, …, \binom{n-1}{n-2}, 1 above the n-th row 1, \binom{n}{1}, \binom{n}{2}, …, \binom{n}{n-1}, 1, where each interior element of the n-th row is the sum of the two elements above it.]
Listing 4.16 Function that generates the n-th row of Pascal’s triangle.
1 def pascal (n):
2 if n == 0:
3 return [1]
4 else:
5 row = [1]
6 previous_row = pascal (n - 1)
7 for i in range(len(previous_row) - 1):
8 row.append (previous_row[i] + previous_row[i + 1])
9 row.append (1)
10 return row
([1,1]). However, since all of the rows for n > 0 begin and end with a
1, these elements can be incorporated in the recursive case by default.
Thus, in this scenario a special base case for n = 1 would be unnecessary.
In particular, in the recursive case we assume that we know the
solution to the subproblem, which allows us to compute every element
[Figure: (a) a ladder circuit of n identical resistors of resistance r between terminals A and B; (b) two resistors r_1 and r_2 combined in series, yielding r_1 + r_2, and in parallel, yielding r_1 · r_2/(r_1 + r_2).]
of the solution except for the ones at the extremes (these elements can
be appended at some point during the process). Listing 4.16 shows a
possible solution to the problem. The recursive case starts by inserting
a 1 in a list (row) that will contain the result. Line 6 computes the
result of the subproblem of size n − 1, and in lines 7 and 8 the loop
adds consecutive integers of the subsolution (as shown in Figure 4.3),
appending the sum to the result. Finally, a 1 is inserted at the end,
which completes the solution. Exercise 4.14 proposes replacing the loop
by a recursive function.
[Figure 4.6: (a) decomposition of the ladder circuit by removing its first rung; (b) the subcircuit replaced by a single equivalent resistor of resistance R(r, n−1), which together with the two remaining resistors r determines R(r, n).]
When two resistors of resistance r_1 and r_2 are connected in series, the resulting
resistance is r_1 + r_2. Instead, when they are connected in parallel the new resistance
is r = (r_1 \cdot r_2)/(r_1 + r_2). Alternatively, it can be expressed as:
\frac{1}{r} = \frac{1}{r_1} + \frac{1}{r_2}.    (4.6)
These rules can be applied successively to pairs of resistors until we
obtain a circuit that contains a single resistor. However, this process is
tedious. Instead, recursion provides a succinct and elegant solution.
The problem has two input parameters: the resistance (r) and the
number of rungs (n) of the ladder. The size of the problem is clearly
n (r plays a role in the final value, but is not responsible for the run-
time of the algorithm). Let R(r, n) denote the recursive function. The
base case occurs when n = 1, where the initial ladder would only contain
a single resistor. Therefore, in that case R(r, 1) = r. For the recursive
case we need to find a subproblem within the original with exactly the
same structure. Figure 4.6(a) shows the decomposition of the problem
by decrementing its size by a unit. The circuit associated with the sub-
problem can be replaced by a single resistor of resistance R(r, n − 1),
as shown in (b), where we can assume that we know its value by using
induction. Finally, it is fairly straightforward to simplify the resulting
circuit that contains only three resistors. Firstly, the left and top resis-
tors are connected in series. Thus, they can be merged to form a resistor
with resistance R(r, n − 1) + r. Finally, this new resistor is connected in
parallel to the right resistor. By applying (4.6), R(r, n) can be defined
through:
\frac{1}{R(r, n)} = \frac{1}{r} + \frac{1}{R(r, n - 1) + r}.
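The recursive definition can be coded directly (a sketch; the name resistance is ours):

```python
def resistance(r, n):
    # Base case: a ladder with a single rung is just one resistor.
    if n == 1:
        return r
    # The sub-ladder of n-1 rungs, in series with one resistor r, is
    # connected in parallel with the remaining resistor r.
    return 1 / (1 / r + 1 / (resistance(r, n - 1) + r))
```

For example, with r = 1 the total resistance of a two-rung ladder is 2/3.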
4.5 EXERCISES
Exercise 4.1 — Listing 2.6 contains a linear-recursive Boolean func-
tion that determines whether a nonnegative integer n is even, where the
decomposition reduces the size of the problem (n) by two units. Define
and code an alternative method based on decrementing the size of the
problem by a single unit.
Exercise 4.7 — Define and code a function that computes the number
of digits of a nonnegative integer n.
Exercise 4.8 — Define and code a function that, given a decimal num-
ber n whose digits are either zero or one, returns the number whose bi-
nary representation is precisely the sequence of zeros and ones in n. For
example, if n = 10110 the function returns 22, since 10110_2 = 22.
Exercise 4.11 — Write a recursive method that uses the function de-
veloped in Exercise 4.10 in order to solve Exercise 3.8 computationally.
Print the solutions for numbers expressed with n = 1, . . . , 5 bits.
Exercise 4.14 — Replace the loop in Listing 4.16 with a recursive func-
tion. It should receive a row of Pascal’s triangle, and return a list with the
sum of its consecutive terms. For example, if the input is [1, 3, 3, 1]
the result should be [4, 6, 4].
[Figure: a list divided into a sorted part followed by an unsorted part; in the step shown, the next unsorted element (4) is inserted into its correct position within the sorted part.]
Inputs                          Results
(n = d_{m−1} ⋯ d_1 d_0, d)  ──→  (d_{m−1} = d) ∨ ⋯ ∨ (d_1 = d) ∨ (d_0 = d)
                                   │ ∨ (d_0 = d)
(n//10, d)                  ──→  (d_{m−1} = d) ∨ ⋯ ∨ (d_1 = d)

or, more compactly:

Inputs          Results
(n, d)      ──→  d ∈ n
                   │ ∨ (n%10 = d)
(n//10, d)  ──→  d ∈ n//10

However, if a base case already returns True when n%10 = d, then in the recursive case the last digit cannot match, and the diagram becomes:

Inputs                          Results
(n = d_{m−1} ⋯ d_1 d_0, d)  ──→  (d_{m−1} = d) ∨ ⋯ ∨ (d_1 = d)
                                   │ do nothing
(n//10, d)                  ──→  (d_{m−1} = d) ∨ ⋯ ∨ (d_1 = d)
This implies that the result of the subproblem is exactly the output of
the original problem, and the function can simply return the result of
the recursive call, without processing it. Together with the base cases,
this leads to a tail-recursive algorithm that can be coded as shown in
Listing 5.2.
Lastly, it is important to understand that although the chosen de-
composition divides the input by a constant (10), the size of the problem
is only reduced by a unit. The time complexity for the algorithm is log-
arithmic with respect to the input n, but it is linear with respect to the
number of digits (m) of n.
Finally, this problem has analogous counterparts that rely on data
structures such as lists, arrays, etc., since these also represent sequences
of elements. For example, it is very similar to deciding if a string contains
a character, or if an element is present in a list. Although the codes may
be different, the underlying reasoning is essentially identical.
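Listing 5.2 is not reproduced in this excerpt; a sketch of the tail-recursive function described (the name is ours):

```python
def contains_digit(n, d):
    # Base case: the last digit matches.
    if n % 10 == d:
        return True
    # Base case: a single digit remains and it does not match.
    if n < 10:
        return False
    # Tail-recursive case: discard the last digit; the result of the
    # subproblem is returned without further processing.
    return contains_digit(n // 10, d)
```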
Boolean function that solves the problem will therefore have two string
input parameters. If their lengths are different the algorithm can return
False immediately in a base case. Thus, the challenge lies in solving
the problem when they have the same length, which would constitute
its size.
Inputs                          Results
(s, t)                      ──f──→  \bigwedge_{i=0}^{n-1} (s_i = t_i)
                                      │ ∧ (s_0 = t_0)
(s_{1..n−1}, t_{1..n−1})    ──f──→  \bigwedge_{i=1}^{n-1} (s_i = t_i)
Clearly, the method has to check that the first characters are the same,
and that the remaining substrings are also identical through a recursive
call. The Boolean function is therefore:
f(s, t) = \begin{cases}
\text{false} & \text{if } \operatorname{length}(s) \neq \operatorname{length}(t), \\
\text{true} & \text{if } n = 0, \\
(s_0 = t_0) \wedge f(s_{1..n-1}, t_{1..n-1}) & \text{if } n > 0,
\end{cases}
Listing 5.4 Tail-recursive function that determines if two strings are iden-
tical.
1 def equal_strings_tail(s, t):
2 if len(s) != len(t):
3 return False
4 elif s == '':
5 return True
6 elif s[0] != t[0]:
7 return False
8 else:
9 return equal_strings_tail(s[1:], t[1:])
at the same position in the strings. Thus, we can incorporate a base case
that checks if s0 ≠ t0 . In that case, the algorithm would automatically
return False. With this base case we can be sure that s0 = t0 in the
recursive case. Therefore, the new recursive diagram would be:
Inputs                          Results
(s, t)                      ──f──→  \bigwedge_{i=1}^{n-1} (s_i = t_i)
                                      │ do nothing
(s_{1..n−1}, t_{1..n−1})    ──f──→  \bigwedge_{i=1}^{n-1} (s_i = t_i)
Again, the result of the subproblem is exactly the output of the original
problem, which leads to a tail-recursive algorithm. Finally, together with
the base cases, the function can be coded as in Listing 5.4.
Linear Recursion II: Tail Recursion 139
the element if it is found. This base case will depend on the choice of
decomposition. If we reduce the size of the problem by discarding the
last element of the list, then the base case will need to check whether
x is located at the last position. If indeed an−1 = x, then the method
can simply return n − 1. Otherwise, it will need to carry out a recursive
call in order to solve the subproblem of size n − 1, and the final result
will be exactly that of the subproblem’s, which leads to a tail-recursive
solution. Listing 5.5 shows a possible implementation in Python of the
described linear search function.
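Listing 5.5 is not reproduced in this excerpt; a sketch matching the description, where the decomposition discards the last element of the list (the name is ours):

```python
def linear_search(a, x):
    # Base case: an empty list does not contain x.
    if not a:
        return -1
    # Base case: x is located at the last position.
    if a[-1] == x:
        return len(a) - 1
    # Tail-recursive case: the indices in a[:-1] coincide with those in a,
    # so the subproblem's result is returned unchanged.
    return linear_search(a[:-1], x)
```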
Finally, it is worth noticing that with the previous decomposition
the locations of the elements in the sublist are the same as those in
the original list. Alternatively, if the decomposition omitted the first
element of the list, all of the indices in the sublist would be decremented
by a unit with respect to those in the original list. For example, if the
list were [2, 8, 5], the 8 is also located at position 1 in [2, 8], but it
appears at location 0 in [8, 5]. This leads to more complex methods
that require an additional parameter. For example, Listing 5.6 shows a
solution that adds a unit to the result of each recursive call. The method
trivially returns 0 if x = a0 , but the base case for an empty list is more
complicated. Note that if that base case is reached the algorithm will
have added n units in the n previous recursive calls. Thus, it needs to
return −n − 1, where n is the length of the initial list (not the particular
length of the input argument since it is 0 in that case) in order to return
−1, indicating that x does not appear in the list. Since n cannot be
obtained from an empty list it has to be passed as an extra parameter
in every function call in order to be recovered in the base case. The code
[Figure: decomposition used in binary search around the middle element a_m: (a) if x < a_m the subproblem is the sublist to the left of a_m; (b) if x > a_m, it is the sublist to its right.]
For example, this data structure can be used to store the information
of a birthday calendar. Figure 5.2 shows a binary tree of seven persons
that allows us to retrieve their birthdays according to their names, which
are strings that we can assume are unique (naturally, we can include last
names or tags in order to identify each person uniquely). Keys can be
strings since they can be compared, for example, according to the lexico-
graphic order (in Python we can simply use the < operator). Therefore,
the names in the binary search tree are sorted as they would appear
in a regular dictionary. For instance, consider the root node associated
with “Emma”. Observe that all of the names contained in its left subtree
would appear before “Emma” in a dictionary, and all of the names in the
right subtree would be found after “Emma”. Furthermore, this property
holds for every node of the binary search tree.
In order to implement a binary search tree, we first have to decide
how to code a binary tree. There are several possibilities (one of the most
common approaches consists of using object oriented features in order
to declare a class associated with the nodes of the tree), but in this book
we will simply use lists. In particular, every node of the tree will consist
of a list of four elements: the key, the item, the left subtree, and the
right subtree, where the subtrees are also binary search trees. Thus, the
binary search tree in Figure 5.2 would correspond to the following list:
[‘Emma’, ‘2002/08/23’,
    [‘Anna’, ‘1999/12/03’, [], []],
    [‘Paul’, ‘2000/01/13’,
        [‘Lara’, ‘1987/08/23’,
            [‘John’, ‘2006/05/08’, [], []],
            [‘Luke’, ‘1976/07/31’, [], []]],
        [‘Sara’, ‘1995/03/14’, [], []]]]          (5.1)
which is illustrated graphically in Figure 5.3. Observe that the left and
right subtrees are lists, which are empty at the leaf nodes.
Figure 5.2 Binary search tree that stores information about a birthday
calendar.
Figure 5.3 Binary search tree in Figure 5.2 and (5.1), where each node
is a list of four elements: name (string), birthday (string), left subtree
(list), and right subtree (list).
found the item, and can therefore return it. Thus, in the recursive cases
we can be sure that the root node does not contain the searched item.
In the next step our goal is to find an appropriate decomposition of
the problem that reduces its size. We have already seen that trees are
composed recursively of subtrees. Thus, we could consider searching for
the item in the two subtrees of the binary search tree. This guarantees
reducing the size of the problem by a unit, since it discards the root
node. Nevertheless, it is easy to see that in this problem we can also
avoid searching in an entire subtree. If the key k is less than the key of
the root node (kroot), then we can be sure that the item we are looking for
will not be in the right subtree, due to the binary search tree property.
Analogously, if k > kroot , the item will not appear in the left subtree.
Figure 5.4 illustrates this idea, where xroot is the item stored in the root
node. Clearly, for this particular problem there are two recursive cases.
If k < kroot the method must keep searching in the left subtree through
a recursive call, while if k > kroot it will search for the item in the right
subtree. Listing 5.9 shows an implementation of the searching algorithm,
where each node of the binary tree is coded as a list of four components,
as described in Section 5.3.
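For illustration, the search can be sketched in Python as follows. This is a sketch along the lines described above, not necessarily identical to the book's Listing 5.9; in particular, returning None when the key is absent is our assumption:

```python
def bst_search(T, k):
    # T is either [] or a list [key, item, left subtree, right subtree]
    if T == []:
        return None                  # key not present (assumed behavior)
    if k == T[0]:
        return T[1]                  # base case: item found at the root
    if k < T[0]:
        return bst_search(T[2], k)   # the item cannot be in the right subtree
    return bst_search(T[3], k)       # the item cannot be in the left subtree
```

Since only one recursive call is performed, and it is the last action of the function, the method is tail-recursive.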
Finally, the height of the binary tree determines the cost of the algo-
rithm in the worst case. If the tree is balanced (i.e., it has approximately
the same number of nodes on the left and right subtrees of nodes that
appear on the same level) it runs in Θ(log n), where n is the number of
nodes in the tree, since it would discard approximately half of the nodes
with each recursive call. However, the tree could have a linear structure,
in which case the algorithm would run in O(n).
[Figure 5.5 Partitioning the list 6 4 1 7 4 7 3 6 5 around the pivot 4:
the result 4 1 3 4 6 7 7 6 5 places the elements that are less than or
equal to the pivot before those that are greater than it.]
containing the key and the item (and two empty subtrees). However,
if the root node does have a nonempty subtree then the method must
insert the new node in that subtree, which naturally leads to a recursive
call. The reasoning regarding the right subtree is analogous. Finally,
Listing 5.10 shows a possible implementation of the procedure.
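The insertion procedure can be sketched as follows (a sketch of the idea, not necessarily the book's Listing 5.10; duplicate keys are simply ignored here, which is our assumption):

```python
def bst_insert(T, k, x):
    # T is either [] or a list [key, item, left subtree, right subtree]
    if T == []:
        # base case: turn the empty subtree into a leaf node, in place
        T.extend([k, x, [], []])
    elif k < T[0]:
        bst_insert(T[2], k, x)   # insert in the left subtree
    elif k > T[0]:
        bst_insert(T[3], k, x)   # insert in the right subtree
```

Note that the method is a procedure that modifies the tree, rather than a function that returns a new one.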
1. Choose a “pivot” element from the list. Typical choices include the
first or middle element of the list, or simply some element selected
at random.
2. Construct a new list with the same elements as the input list, or
simply permute the input list, where the first elements are smaller
than or equal to the pivot, and the last ones are greater than the
pivot.
Figure 5.5 illustrates the idea through a concrete example. The problem
is relevant since partitioning is a key step in other prominent algorithms
such as quickselect (see Section 5.5) and quicksort (see Section 6.2.2). There
are several well-known and efficient algorithms, also denoted as “parti-
tioning schemes,” which solve the problem. The most popular is Hoare’s
partitioning method, developed by Tony Hoare, which we will analyze
Inputs                         Results
([3, 6, 1, 7, 4], 5) ─────────→ [3, 1, 4]
        concatenate [3]
Clearly, the algorithm must concatenate the first element of the list to
the output of the subproblem. In contrast, if the first element is greater
than x, the diagram is:
Inputs                         Results
([9, 6, 1, 7, 4], 5) ─────────→ [1, 4]
         do nothing
In this case, the algorithm simply has to return the solution (list) to
the subproblem. Listing 5.11 shows the linear-recursive codes that solve
both problems (notice that in the recursive cases that perform the con-
catenation the function call is not the last action of the method). The
runtime cost of the methods can be characterized by:
         ⎧ 1                if n = 0,
T (n) =  ⎨
         ⎩ T (n − 1) + 1    if n > 0,
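The first of the two functions can be sketched as follows (the function name is ours; it keeps the elements that are less than or equal to x, in their original order):

```python
def smaller_than_or_equal_to(a, x):
    # returns the elements of a that are <= x, preserving their order
    if a == []:
        return []                                         # base case
    if a[0] <= x:
        # concatenate the first element to the solution of the subproblem
        return [a[0]] + smaller_than_or_equal_to(a[1:], x)
    return smaller_than_or_equal_to(a[1:], x)             # discard a[0]
```

The version that keeps the elements greater than x is analogous, with the condition reversed.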
[Figure 5.6 Steps of the partitioning method on the list 6 4 1 7 4 7 3 6 5,
with the middle element (4) as pivot:
  select pivot and indices:         6 4 1 7 4 7 3 6 5
  swap pivot and first element:     4 4 1 7 6 7 3 6 5
  advance left and right indices:   4 4 1 7 6 7 3 6 5
  swap left and right elements:     4 4 1 3 6 7 7 6 5
  advance left and right indices:   4 4 1 3 6 7 7 6 5
  swap pivot and right element:     3 4 1 4 6 7 7 6 5]
[Figure 5.7 Scenarios in the recursive case of the tail-recursive
partitioning method: (a) aleft > p and aright ≤ p, where the two elements
are swapped and both indices advance; (b) the remaining cases, where the
left index advances if aleft ≤ p, and the right index decreases if
aright > p.]
The method returns the final location of the pivot. Figure 5.6 shows a
concrete example of the partitioning method.
We will now see a tail-recursive algorithm that can substitute the
main loop in Hoare’s partitioning scheme. Given the input list a, initial
left and right indices, as well as the value of the pivot, the method
should partition the list analogously as the loop in Hoare’s method,
returning the final location of the right index. The size of the problem
is the difference between the left and right indices, since it determines
the number of increment/decrement operations to be performed on the
indices until they cross. The base case occurs precisely when they cross
(when the left index is greater than the right one), where the method
can simply return the right index.
For the recursive case we must reduce the size of the problem by
incrementing the left index and/or decrementing the right one. There
are two different scenarios, illustrated in Figure 5.7, where p denotes
the pivot. If aleft > p and aright ≤ p the method first needs to swap
the elements referenced by the indices, and afterwards can perform a
function call by advancing both indices, as shown in (a). This leads to a
first recursive case that reduces the size of the problem by two units. If
the previous condition is False, then the method will not swap elements,
and will advance at least one of the indices (thus, reducing the size of
the problem), as shown in (b). If aleft ≤ p it will increment the left index,
and if aright > p it must decrement the right one. Subsequently, the
method can invoke itself with the updated indices. Listing 5.13 shows
the corresponding code, together with a wrapper method that completes
the partitioning algorithm. Note that it is identical to Listing 5.12, where
Figure 5.8 Base case and problem decomposition used by the quickselect
algorithm: (a) base case, where ip = k − 1; (b) if ip > k − 1 the algorithm
solves the subproblem on the sublist to the left of the pivot; (c) if
ip < k − 1 it solves the subproblem on the sublist to the right of the
pivot.
the code associated with the loop has been substituted by a call to the
recursive function. Lastly, the method runs in O(n) since the runtime
of the recursive method can be characterized by the recurrence T (n) =
T (n − 1) + 1 in the worst case.
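The tail-recursive substitute for the loop can be sketched as follows (a sketch that follows the description above; the book's Listing 5.13 may differ in details):

```python
def partition_rec(a, left, right, p):
    # tail-recursive replacement for the main loop of Hoare's scheme;
    # returns the final location of the right index
    if left > right:                       # base case: indices have crossed
        return right
    if a[left] > p and a[right] <= p:
        # first recursive case: swap the elements and advance both indices
        a[left], a[right] = a[right], a[left]
        return partition_rec(a, left + 1, right - 1, p)
    # second recursive case: advance at least one of the indices
    if a[left] <= p:
        left += 1
    if a[right] > p:
        right -= 1
    return partition_rec(a, left, right, p)
```

Observe that in the second recursive case at least one of the two conditions necessarily holds, which guarantees that the size of the problem decreases.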
Let ip denote the index of the pivot. Since the function that imple-
ments Hoare’s partition scheme returns ip , we can check whether it is
equal to k − 1 (since indices start at location 0, the j-th element appears
at position j −1). If it is, then the pivot will be the k-th smallest element
in a, and the method can terminate at this base case. This scenario is
illustrated in Figure 5.8(a).
In the recursive case the algorithm reduces the size of the problem
by considering either the sublist to the left, or to the right, of the pivot.
If ip > k − 1, then the location of the searched element will be smaller
than ip . Thus, the algorithm can focus on the sublist to the left of the
pivot, as shown in Figure 5.8(b). Instead, if ip < k − 1 then the algo-
rithm will proceed by solving the subproblem related to the list to the
right of the pivot, as illustrated in (c). Finally, Listing 5.14 contains an
implementation of the tail-recursive function.
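The decomposition can be sketched with a simple (not in-place) partition, rather than the Hoare scheme used by Listing 5.14; the function name and list-based partitioning are ours:

```python
def quickselect(a, k):
    # returns the k-th smallest element of a (1 <= k <= len(a))
    pivot = a[0]
    smaller = [x for x in a[1:] if x <= pivot]
    greater = [x for x in a[1:] if x > pivot]
    ip = len(smaller)               # index of the pivot after partitioning
    if ip == k - 1:
        return pivot                # base case: the pivot is the answer
    if ip > k - 1:
        return quickselect(smaller, k)           # search left of the pivot
    return quickselect(greater, k - ip - 1)      # search right of the pivot
```

Note that when searching to the right of the pivot, the rank k must be reduced by the ip + 1 discarded elements.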
The runtime cost of the algorithm depends on where the pivot is
located after performing the decomposition. If it is always located at the
middle position of the list, the running time can be characterized by:

         ⎧ 1              if n ≤ 1,
T (n) =  ⎨
         ⎩ T (n/2) + cn   if n > 1,

which is a linear function (i.e., T (n) ∈ Θ(n)). Instead, if the pivot always
ends up at an extreme of the list the recurrence is T (n) = T (n − 1) + cn,
which is a quadratic function (i.e., T (n) ∈ Θ(n²)). This situation corre-
sponds to the worst case scenario for the algorithm.
[Figure: first steps of the bisection method on an interval [a, b]
containing a root r of f (x); in the initial setting and steps 1–3 the
interval is halved and the estimate ẑ moves closer to r.]
different, ẑ should replace b when f (a) and f (z) have opposite signs.
Otherwise, ẑ will replace a. This leads to the tail-recursive method in
Listing 5.15, which contains two recursive cases. The code uses the func-
tion f (x) = x² − 2, which allows us to find an approximation of √2, since
√2 is a root of f (x). Note that the initial interval [0, 4] contains √2. Fi-
nally, the error of the approximation will be less than or equal to 10⁻¹⁰
(the result will be accurate up to nine decimal digits to the right of the
decimal point).
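The method can be sketched as follows (a sketch of the idea, not necessarily identical to the book's Listing 5.15):

```python
def bisection(f, a, b, epsilon):
    # assumes f is continuous on [a, b], and f(a) and f(b) have opposite signs
    z = (a + b) / 2                          # midpoint estimate
    if (b - a) / 2 <= epsilon:
        return z                             # base case: accuracy reached
    if f(a) * f(z) < 0:
        return bisection(f, a, z, epsilon)   # z replaces b
    return bisection(f, z, b, epsilon)       # z replaces a
```

The two recursive cases correspond to replacing b or a by the midpoint, depending on the sign of f (a)f (ẑ).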
[Figure: accuracy of the bisection method: when the algorithm stops, the
estimate ẑ lies within ǫ of the root r of f (x), i.e., ∣ẑ − r∣ ≤ ǫ.]
trees, where the tallest tree has height H = 12. If the goal is to collect 10
units of wood, then the woodcutter should set the height of the cutting
machine to h = 8, where the total wood collected would be exactly 10
units. If the goal were to collect 7, 8, or 9 units of wood the optimal
height h would also be 8. Even though the woodcutter would obtain
more wood than initially required, h cannot be higher since cutting at
height 9 only provides 6 units of wood.
The problem is interesting from an algorithmic point of view since
it can be solved in several ways. For example, the trees can be initially
sorted in decreasing order by their height, and subsequently processed
from the highest to the lowest until obtaining the optimal height. This
[Figure: decomposition used to solve the woodcutter problem by discarding
half of the candidate heights: the height range is divided into lower,
middle, and upper parts, and two subproblems arise, depending on whether
the wood collected at the middle height is greater than w or less than w.]
be less than the required amount w. A second scenario occurs when the
wood collected at hm is less than w. In that case we can discard all of the
heights that are greater than or equal to hm , since the woodcutter would
not obtain enough wood. The associated recursive case would therefore
invoke the method by replacing the upper limit with hm − 1.
Lastly, in the previous decomposition, when the upper limit is just a
unit greater than the lower one, the middle height is equal to the lower
one. In that situation the first recursive case would not reduce the size
of the problem (the limits would not vary). Thus, the algorithm needs
an additional base case in order to work properly. When this situation
occurs the method has to return either the lower or the upper limit.
In particular, if the amount of wood associated with the upper limit is
greater than or equal to w, then the upper limit is the correct result.
Otherwise, it will be the lower one.
Listing 5.17 shows the code related to the mentioned base and recur-
sive cases. Lastly, Figure 5.13 shows the steps it takes in order to solve
the instance in Figure 5.11. Step 1 shows an initial situation where the
lower limit is zero and the upper one is H = max{ti }, for i = 1, . . . , n.
Step 2 and 3 are associated with applying the first and second recursive
cases, respectively. Finally, step 4 applies the additional base case since
the upper limit (8) is just a unit over the lower one (7). The solution is
h = 8, since for that height the woodcutter obtains the requested amount
of wood (10 units).
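The approach can be sketched as follows (the function names and the auxiliary wood-counting helper are ours; the book's Listing 5.17 may differ):

```python
def collected_wood(t, h):
    # total wood obtained by cutting every tree in t at height h
    return sum(max(height - h, 0) for height in t)

def optimal_height(t, w, lower, upper):
    if upper - lower <= 1:
        # additional base case: the limits are adjacent (or equal)
        return upper if collected_wood(t, upper) >= w else lower
    hm = (lower + upper) // 2
    if collected_wood(t, hm) >= w:
        # enough wood at hm: discard the heights below it
        return optimal_height(t, w, hm, upper)
    # not enough wood at hm: discard hm and the heights above it
    return optimal_height(t, w, lower, hm - 1)
```

Since the amount of collected wood can only decrease as the height grows, the method performs a binary search for the largest height that still yields at least w units.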
[Figure 5.13 Steps taken by the algorithm in order to solve the instance
in Figure 5.11: each of steps 1–4 shows the tree heights on a scale from
0 to 12, together with the current lower and upper limits.]
The proof of the property in the recursive case is similar to the one for
(5.3). Assume n ≥ m, and let m = az, and n = bz, where z is the product
of the common prime factors in n and m, and a ≤ b. This implies that
a and b do not share common prime factors. In addition, let b = qa + r,
where q and r are the quotient and remainder of the division b/a. The
key in this proof is that a and r cannot share common prime factors,
since otherwise they would also be prime factors of b, and this is not
possible since a and b do not share common prime factors. This implies
that z = gcd(rz, az). Thus, we can conclude that gcd(n, m) = z =
gcd(az, rz) = gcd(m, n mod m), since n mod m = rz.
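The property gcd(n, m) = gcd(m, n mod m) proved here underlies Euclid's algorithm, which can be sketched through the following standard tail-recursive function:

```python
def gcd(m, n):
    # tail-recursive Euclid's algorithm, for nonnegative integers
    if m == 0:
        return n          # base case: gcd(0, n) = n
    return gcd(n % m, m)  # recursive case, justified by the property above
```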
[Figure: counting sort example.
  step 1 — count the occurrences of each key in a = 2 2 3 2 0 1 3 2 0 0 4,
           obtaining the counts b = 3 1 4 2 1 for the keys 0, 1, 2, 3, 4;
  step 2 — rebuild the sorted list: c = 0 0 0 1 2 2 2 2 3 3 4.]
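The two steps in the figure can be sketched as follows (an iterative sketch for reference; the function name is ours):

```python
def counting_sort(a, k):
    # sorts a list of integers belonging to the small interval [0, k]
    counts = [0] * (k + 1)
    for x in a:                  # step 1: count the occurrences of each key
        counts[x] += 1
    result = []
    for v in range(k + 1):       # step 2: output each key counts[v] times
        result.extend([v] * counts[v])
    return result
```

Note that the method performs no comparisons between elements, which is why it can run in linear time.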
5.9 EXERCISES
Exercise 5.1 — Define and code Boolean linear and tail-recursive func-
tions that determine whether a nonnegative integer n contains an odd
digit.
Exercise 5.6 — Write a function that searches for the item with the
smallest key in a binary search tree T , defined as a list of four compo-
nents, as described in Section 5.3.
[Figure: one step of Newton's method: the tangent line to f at the point
(xn−1 , f (xn−1 )) crosses the x axis at xn , which is closer to the root r.]

            f (xn−1 )
xn = xn−1 − ───────── ,
            f ′ (xn−1 )
Implement linear and tail-recursive functions that receive the value a,
an initial positive estimate x0 of √a, and a certain number of steps n,
and return the final estimate ẑ = xn of √a by applying (5.8) n times.
Finally, specify their asymptotic computational cost.
CHAPTER 6
Multiple Recursion I:
Divide and Conquer
171
of half the size of the original’s (the method would invoke itself only
once). However, many authors consider that the term should only be
used when the solution relies on breaking up a problem into two or
more subproblems. This book adopts this last convention. Thus, the
methods that invoke themselves once, dividing the size of the problem
by two, have been covered in earlier chapters (mostly in Chapter 5).
The following sections describe classical recursive divide and conquer
algorithms.
Thus, the result is True when the elements of the list appear in (non-
strictly) increasing or nondecreasing order.
The size of this problem is the number of elements in the list. If the
list contains one element the result is trivially True. In addition, we can
consider that an empty list is also ordered.
The problem can be solved by a linear-recursive method that reduces
the size of the problem one unit in the decomposition stage. However,
we will present a solution that decomposes the problem by dividing the
input list in two halves, and which leads to the following partition of
(6.1):
Clearly, if a list is sorted in ascending order then both halves must also be
sorted in ascending order. Thus, the method can invoke itself twice with
the two corresponding sublists, and perform a logical AND operation
with their result. Finally, the combination of the results of the subprob-
lems needs an additional step. In particular, the last element of the first
sublist (a⌊n/2⌋−1 ) must be less than or equal to the first element of the
second sublist (a⌊n/2⌋ ). This condition also requires another AND oper-
ation in the recursive case. Listing 6.1 shows a possible implementation
of the method, which runs in O(n) time, since its cost is characterized
by:
         ⎧ 1               if n ≤ 1,
T (n) =  ⎨
         ⎩ 2T (n/2) + 1    if n > 1.
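A possible sketch of the method (along the lines of Listing 6.1, though not necessarily identical to it):

```python
def is_sorted(a):
    # divide and conquer check that a is in nondecreasing order
    n = len(a)
    if n <= 1:
        return True                        # base cases: empty or one element
    mid = n // 2
    return (is_sorted(a[:mid]) and is_sorted(a[mid:])
            and a[mid - 1] <= a[mid])      # the join point must also be ordered
```

Note that the slicing operations actually take linear time in Python, but this does not alter the O(n) bound.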
6.2 SORTING
Sorting a general list (or array, sequence, etc.) is one of the most studied
problems in computer science. It can be solved by numerous algorithms
that can be used to introduce essential concepts related to computational
complexity and runtime analysis, algorithm design paradigms, or data
structures. The sorting algorithms covered in this chapter will assume
that the input to the problem is a list a of n real numbers. The out-
put is a rearrangement (or permutation) of the elements that provides
another list a′ where a′i ≤ a′i+1 , for i = 0, . . . , n − 2 (≤ can be thought of
as a Boolean function that allows us to determine whether an element pre-
cedes another, implementing a total order binary relation). The choice of real
numbers for the type of the elements of the list implies that the algo-
rithms need to carry out comparisons through ≤. It can be shown that
any algorithm that solves this problem requires Ω(n log n) comparisons
(i.e., at least on the order of n log n decisions using ≤). Lastly, there exist
algorithms for sorting lists that do not need to compare elements, which
require Ω(n) operations. However, the elements of the list must satisfy
certain conditions in order to apply them. For example, the “counting
sort” algorithm sorts integers that belong to a small interval [0, k] (see
Exercise 5.3).

[Figure: merge sort applied to 3 8 5 3 5 0 4 9: the two halves are sorted
recursively into 3 3 5 8 and 0 4 5 9, which are then merged into
0 3 3 4 5 5 8 9.]
dividing the input list in two halves, giving rise to two different subprob-
lems of (roughly) half the size of the original’s. The following diagram
shows a concrete example of the recursive thought process associated
with the algorithm:
Inputs                                 Results
[7, 3, 4, 8, 4, 6, 4, 6] ─────────→ [3, 4, 4, 4, 6, 6, 7, 8]
            merge
where we have also assumed that the partition of the list can be obtained
in a constant number of operations. In other words, we have considered
that it is possible to obtain a[0:n//2] and a[n//2:n] in Θ(1) time. In
addition, f (n) measures the number of operations needed by the merge
method in order to combine two sorted lists of (approximately) n/2 ele-
ments. In this case, due to the master theorem (see (3.28)), if f (n) were
a linear function, the merge sort algorithm would run in the (optimal)
order of Θ(n log n). We will see shortly that indeed it is possible to solve
the merge problem in linear time. In general, when faced with a similar
problem, algorithm designers focus their efforts on developing the most
efficient combination method, which may reduce the order of growth of
the divide and conquer algorithm.
The inputs to the merging problem are two sorted lists a and b,
of lengths na and nb , respectively. The output is another sorted list of
length n = na + nb . We can interpret that the size of the problem is the
number of operations needed by the algorithm until it can return a trivial
answer. In this scenario, the size of the problem is m = min(na , nb ), since
the solution to the problem if one of the lists is empty is obviously the
other list. This constitutes the base cases. For the recursive case we can
use the following diagram with the sublists used in the previous example:
Inputs                                     Results
([3, 4, 7, 8], [4, 4, 6, 6]) ─────────→ [3, 4, 4, 4, 6, 6, 7, 8]
        concatenate [3]
The function runs in linear time with respect to m, and also n, which
implies that the merge sort algorithm runs in Θ(n log n).
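The full algorithm can be sketched as follows (a sketch of the ideas above; here the merge function is itself linear-recursive, whereas more efficient versions are typically iterative):

```python
def merge(a, b):
    # combines two sorted lists into a single sorted list
    if a == []:
        return b                           # base cases: one list is empty
    if b == []:
        return a
    if a[0] <= b[0]:
        return [a[0]] + merge(a[1:], b)    # concatenate the smaller head
    return [b[0]] + merge(a, b[1:])

def merge_sort(a):
    if len(a) <= 1:
        return a                           # base case: trivially sorted
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))
```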
[Figure: decomposition used by quicksort: partitioning 6 4 1 7 4 7 3 6 5
around the pivot 4 produces 3 4 1 4 6 7 7 6 5.]

Inputs                                    Results
[6, 4, 1, 7, 4, 7, 3, 6, 5] ─────────→ [1, 3, 4, 4, 5, 6, 6, 7, 7]
      concatenate [4] in between
           sorted sublists
The diagram clearly shows that sorting the original list simply requires
solving the two subproblems through recursive calls, and concatenating
the sorted sublists while maintaining the pivot in between them. Thus,
after solving the subproblems recursively, the combination of the respec-
tive solutions is straightforward. Lastly, an important detail regarding
this decomposition is that the pivot is removed from the list that con-
tains the elements that are smaller than or equal to it. This is necessary
in order to guarantee that the size of the subproblem is indeed smaller
than the size of the original problem.
Listing 6.4 implements a slower variant of the method that is based
on the basic partitioning schemes described in Section 5.4.1. Firstly, it
checks whether the input corresponds to the base case (which is the
same as in the merge sort method). In the recursive case a common
strategy is to consider that the pivot is the first element of the list.
With this choice the worst case occurs when the input list is already
sorted. Moreover, the algorithm would also perform poorly if the input
where T (n) ∈ Θ(n log n). Instead, the worst case occurs when the pivot
is always located at an extreme of the list. In that case the runtime cost
is determined by:
         ⎧ 1                if n ≤ 1,
T (n) =  ⎨
         ⎩ T (n − 1) + cn   if n > 1,
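The decomposition can be sketched with a simple list-based partition (not the in-place schemes used by Listing 6.4), taking the first element as pivot:

```python
def quicksort(a):
    if len(a) <= 1:
        return a                                   # base case
    pivot = a[0]
    # the pivot itself is removed from the "smaller" part, which guarantees
    # that both subproblems are strictly smaller than the original problem
    smaller = [x for x in a[1:] if x <= pivot]
    greater = [x for x in a[1:] if x > pivot]
    return quicksort(smaller) + [pivot] + quicksort(greater)
```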
Inputs                                    Results
a = [4, 4, 5, 1, 4, 2, 4, 3] ─────────→ (False, -, 0)
           false result
In this case the result of both subproblems is False, which implies that
the input list cannot contain a majority element (even though there are
exactly n/2 occurrences of the element 4 in the example, it is not enough
to produce a true result). In general, this can be shown as follows. Firstly,
the initial list is divided into a sublist b of length ⌊n/2⌋, and another
sublist c of length ⌈n/2⌉, regardless of whether n is even or odd, since
n = ⌊n/2⌋ + ⌈n/2⌉. If the result for both subproblems is False then an
element can appear at most ⌊⌊n/2⌋/2⌋ times in b, and ⌊⌈n/2⌉/2⌋ times
in c. Adding these quantities yields at most ⌊n/2⌋ ≤ n/2 occurrences,
which is not enough to produce a majority element. Now consider the
following example:
Inputs                                    Results
a = [4, 4, 5, 4, 1, 2, 4, 3] ─────────→ (False, -, 0)
Listing 6.6 Code for counting the number of times an element appears in
a list.
1 def occurrences_in_list(a, x):
2 if a == []:
3 return 0
4 else:
5 return int(a[0] == x) + occurrences_in_list(a[1:], x)
In this case the element 4 appears three times in the first sublist, and
is therefore a majority element of that sublist, since 3 > 4/2 = 2. The algorithm must
therefore count the number of occurrences of 4 in the second list (denoted
through #(c, 4)) in order to determine whether it is also a majority
element of the initial list a. This can be computed through a simple
linear-recursive function (see Listing 6.6) that receives an input list a
and an element x. If a is empty the result is obviously 0. Otherwise, the
output can consist of the method applied to the tail of a (a1..n−1 ) and
x, plus a unit only if a0 = x.
Listing 6.7 shows a possible implementation of the function that
solves the majority element problem. Lines 3–6 code the base cases.
Lines 8 and 9 decompose the input list into two halves. Line 11 invokes
the method on the first sublist, and if there exists a majority element
(line 12), then line 13 computes the number of occurrences of the el-
ement in the second sublist. If the total number of occurrences of the
element (in both sublists) is greater than n/2 (line 14), then the method
returns a tuple in line 15 with the values: True, the majority element,
and the number of times it appears in the input list. Lines 17–21 are
analogous, but switch the roles of the sublists. Finally, if the function
has not returned, then the list does not contain a majority element (line
23).
In the recursive cases the method needs to invoke itself twice with
one half of the input list, and also needs to compute the occurrences
of an element on two sublists of length n/2 (approximately). Since this
last auxiliary function runs in linear time, the time complexity of the
method can be characterized by:
         ⎧ 1                if n ≤ 1,
T (n) =  ⎨
         ⎩ 2T (n/2) + cn    if n > 1.
Therefore, the order of growth of the algorithm is Θ(n log n) (see (3.28)).
Lastly, this problem can be solved through the Boyer–Moore majority
vote algorithm in linear time.
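For reference, the Boyer–Moore majority vote algorithm can be sketched as follows (mirroring the (found, element, count) output of the divide and conquer version; the single scan only yields a candidate, so a second verification pass is required):

```python
def majority_vote(a):
    # Boyer-Moore majority vote: a single scan produces a candidate
    candidate, count = None, 0
    for x in a:
        if count == 0:
            candidate = x
        count += 1 if x == candidate else -1
    # verification pass: check that the candidate is a true majority element
    occurrences = a.count(candidate)
    if occurrences > len(a) // 2:
        return True, candidate, occurrences
    return False, None, 0
```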
where m = min(⌊bx /2⌋, ⌊by /2⌋). For example, for x = 594, and y = 69, the
decomposition is:
a = x >> m,
b = x − (a << m),
c = y >> m,
d = y − (c << m),
This initial (naive) approach can break up the original problem (a mul-
tiplication) into four smaller subproblems: ac, ad, bc, and bd, which can
be computed through four recursive calls (we can ignore the cost of
multiplications times powers of two, since these can be implemented
very efficiently as bit shifts). However, it is not more efficient than the
which requires only three simpler products: ac, bd, and (a + b)(c + d),
leading to a faster algorithm that only carries out three recursive calls.
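The resulting algorithm (Karatsuba's method) can be sketched as follows; the base-case threshold is our choice:

```python
def karatsuba(x, y):
    # multiplies nonnegative integers through three recursive products
    if x < 16 or y < 16:
        return x * y                       # base case: small operands
    m = min(x.bit_length() // 2, y.bit_length() // 2)
    a, b = x >> m, x - ((x >> m) << m)     # x = a*2^m + b
    c, d = y >> m, y - ((y >> m) << m)     # y = c*2^m + d
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    # (a + b)(c + d) - ac - bd equals ad + bc
    mid = karatsuba(a + b, c + d) - ac - bd
    return (ac << (2 * m)) + (mid << m) + bd
```

The shifts implement the multiplications by powers of two, whose cost can be ignored as mentioned above.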
AB = [ A1,1  A1,2 ] [ B1,1  B1,2 ]
     [ A2,1  A2,2 ] [ B2,1  B2,2 ]
                                                                (6.5)
   = [ A1,1 B1,1 + A1,2 B2,1    A1,1 B1,2 + A1,2 B2,2 ]
     [ A2,1 B1,1 + A2,2 B2,1    A2,1 B1,2 + A2,2 B2,2 ] .
Notice that the formula is analogous to multiplying two 2 × 2 matrices.
For example, the top-left block of the result (A1,1 B1,1 + A1,2 B2,1 ) can be
viewed as the product between the first (block) row of A and the first
(block) column of B.
The decomposition involves computing eight simpler matrix prod-
ucts. Thus, the method will invoke itself eight times in the recursive
case. The results of each product need to be added and stacked appro-
priately in order to form the output matrix. Listing 6.9 shows a possi-
ble implementation. The recursive case first defines each of the smaller
block matrices, adds the simpler products, and builds the output matrix
through the methods vstack and hstack. One of the base cases com-
putes a simple product when p = q = r = 1. In addition, the code also
considers the possibility of receiving empty input matrices, since they
appear when partitioning the matrices in the recursive case if one of the
dimensions is equal to one (obviously, a vector cannot be partitioned
into four vectors as described in (6.5)). Thus, if any of the dimensions is
0 a special base case returns an empty matrix of dimensions p × r, which
can be handled appropriately in Python.
The previous method creates 1 × 1 matrices (in a base case) and
progressively stacks them together to form the final p × r matrix. In
addition, note that the dimensions of the input matrices to the methods
are not fixed.
Another more efficient alternative consists of passing the entire ma-
trices A and B in each call, and specifying the blocks that need to be
multiplied through additional parameters, which indicate lower and upper
limits related to the dimensions
p, q, and r. The base case of matrix_mult_limits occurs when both
of the submatrices correspond to scalar numbers, say ai,j and bj,k . In
that case the method simply stores their product in row i and column
k of C. Lastly, the method is not a function, since it does not return a
matrix. Instead, it is a procedure that modifies the parameter C, where
it stores the result. Finally, the iterative method add_matrices_limits
adds the elements of submatrices passed as the first two matrix input
parameters, and stores the result in its third parameter (the submatrices
are specified through parameter limits).
since the methods invoke themselves eight times, and need to perform
four matrix additions whose cost is quadratic with respect to n. There-
fore, according to the master theorem (see (3.28)), T (n) ∈ Θ(nlog2 8 ) =
Θ(n3 ). We will now describe Strassen’s algorithm, which is a well-known
method that can reduce the time complexity to Θ(nlog2 7 ) = Θ(n2.807... ).
The method also decomposes each of the input matrices into four
block matrices as in the standard algorithm. Thus, AB = C can be
expressed as:
[ A1,1  A1,2 ] [ B1,1  B1,2 ]   [ C1,1  C1,2 ]
[ A2,1  A2,2 ] [ B2,1  B2,2 ] = [ C2,1  C2,2 ]
The key to the method is the definition of the following new matrices,
each of which involves a single matrix multiplication:
M1 = (A1,1 + A2,2 )(B1,1 + B2,2 ),
M2 = (A2,1 + A2,2 )B1,1 ,
M3 = A1,1 (B1,2 − B2,2 ),
M4 = A2,2 (B2,1 − B1,1 ), (6.7)
M5 = (A1,1 + A1,2 )B2,2 ,
M6 = (A2,1 − A1,1 )(B1,1 + B1,2 ),
M7 = (A1,2 − A2,2 )(B2,1 + B2,2 ).
Finally, these matrices can be combined as follows to form the output’s
block matrices:
C1,1 = M1 + M4 − M5 + M7 ,
C1,2 = M3 + M5 ,
(6.8)
C2,1 = M2 + M4 ,
C2,2 = M1 − M2 + M3 + M6 .
The algorithm therefore computes seven products and 18 additions (or
subtractions) in every recursive call. Thus, its runtime cost is described
through:
         ⎧ 1                      if n ≤ 1,
T (n) =  ⎨                                                      (6.9)
         ⎩ 7T (n/2) + 18 Θ(n²)    if n > 1,
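A compact sketch of Strassen's method using NumPy, assuming square matrices whose dimension n is a power of two:

```python
import numpy as np

def strassen(A, B):
    n = A.shape[0]               # assumes n x n matrices with n a power of two
    if n == 1:
        return A * B             # base case: scalar product
    m = n // 2
    A11, A12, A21, A22 = A[:m, :m], A[:m, m:], A[m:, :m], A[m:, m:]
    B11, B12, B21, B22 = B[:m, :m], B[:m, m:], B[m:, :m], B[m:, m:]
    M1 = strassen(A11 + A22, B11 + B22)      # the seven products of (6.7)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    C11 = M1 + M4 - M5 + M7                  # the combinations of (6.8)
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.vstack((np.hstack((C11, C12)), np.hstack((C21, C22))))
```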
There are only two types of trominoes: the “I” and the “L” trominoes, as illustrated
in Figure 6.3. The following problem consists of covering a square n × n
board, where n ≥ 2 is a power of two, which contains a “hole” that
cannot be covered, with L trominoes. Figure 6.4 explains the problem
graphically with an example.
The size of the problem is clearly n. The smallest instances of the
problem correspond to 2×2 boards, whose solutions are trivial. Figure 6.5
illustrates the divide and conquer decomposition used in the recursive
case. The initial board in (a) is divided into four smaller square boards
of size n/2, as shown in (b). However, only one of these smaller boards
will contain the initial hole. Therefore, the other three boards do not
auxiliary functions in Listing 6.11 to draw each one. The functions receive the
coordinates (x, y) of the bottom-left corner corresponding to the square
surrounding the tromino. The command plot([x1 , x2 ],[y1 , y2 ],’k-’)
draws a black line segment with endpoints (x1 , y1 ) and (x2 , y2 ).
Listing 6.12 shows a possible implementation of the recursive
method. The procedure needs to know which problem/subproblem it
should solve. This information is provided by the first three parameters.
The first two indicate the bottom-left coordinates (x, y) of the board,
while the third is the size of the board (n). The last two indicate the
location of the hole (in particular, (p, q) specifies the bottom-left corner
of the 1 × 1 square). In both base and recursive cases the method uses
conditions in order to determine the relative position of the hole, and
draws the appropriate tromino. Finally, the method invokes itself four
times in the recursive case, with different parameters indicating the new
subproblems, together with the new holes on three of them.
Finally, Listing 6.13 shows a fragment of code that can be used to call
the trominoes method. Line 7 creates a figure, line 8 sets its background
color to white, and ax captures the axes of the figure in line 9. After defin-
ing the size of the initial board (line 10), the hole is chosen within it at
random, and then drawn in line 13. When using the Matplotlib package
a rectangle can be formed by calling the method Rectangle. It receives
the coordinates of the bottom-left vertex, together with the width and
height, and other possible arguments. Line 14 calls the main method,
and the last lines are included to avoid scaling factors, to eliminate the
axes in the final image, and to draw it.
Figure 6.8 Base case for the skyline problem with one building
(x1 , x2 , h): the resulting skyline is formed by the tuples (x1 , h) and
(x2 , 0).
that is able to solve the problem in Θ(n log n) time. The idea, illustrated
in Figure 6.9, is similar to the approach used in the merge sort algorithm.
The decomposition step consists of dividing the input list in two smaller
lists of approximately n/2 buildings. The method then carries out two
recursive calls on those sublists that return two independent skylines.
Assuming that the skylines have been constructed correctly (by apply-
ing induction), the final, and challenging, step consists of merging the
skylines in order to produce a final one. Listing 6.14 shows the associated
divide and conquer method, whose structure is essentially identical to
that of Listing 6.2.
The skyline merging problem is a new computational problem in its
own right. While the majority of solutions in texts are iterative, we will
now examine a linear-recursive method. The inputs are the two input
lists of sorted tuples representing skylines. In addition, since a tuple in-
dicates a change in the height of a skyline, the method needs to access
the previous height before such change. Moreover, since the proposed
algorithm will process the first tuples from the lists (until one list is
empty), but progressively discard them as they are analyzed, these pre-
vious heights will not be contained in the input lists. Thus, the method
will need two additional parameters, say p1 and p2 , in order to store the
previous heights of the skylines. Naturally, both of these parameters will
[Figure 6.9 Divide and conquer approach to the skyline problem: the
original problem is divided into two subproblems of about n/2 buildings
each, their skylines (skyline 1 and skyline 2) are obtained through
recursive calls, and the combination step merges them into the final
solution.]
[Figure 6.10 Recursive case of the skyline merging method when
x1 = x2 = x: (a) the tuple (x, h2 ) is included in the merged skyline, since
h2 = max(h1 , h2 ) > max(p1 , p2 ); (b) no tuple is included, since
max(h1 , h2 ) = max(p1 , p2 ).]
be initialized to zero when calling the method within the main skyline
function (see line 9 of Listing 6.14).
The size of the problem depends on the lengths of the input skyline
lists, say n1 and n2 . We can consider that the base case occurs when one
of the lists is empty, where the method must trivially return the other
list. Listing 6.15 codes the method, where the base cases are described
in lines 2–5.
A key observation for determining an appropriate decomposition is
that the output of the merging function produces a new skyline whose
tuples are sorted in ascending order according to their x value. Thus,
the algorithm will analyze the first tuples of each list, and process the
one with a smaller x value (or both if their x values are the same). Thus,
in the recursive case we need the first tuples of the skylines (x1 , h1 ) and
(x2 , h2 ), together with their previous heights p1 and p2 .
Firstly, let us consider the situation when x1 = x2 = x. Since the
tuples mark changes in the skyline, we may need to include in the solu-
tion the one with larger height. For example, in Figure 6.10(a) the point
(x, h2 ) would be included in the final skyline. Furthermore, the recursive
call will use the tails of both input lists, discarding (x, h1 ) and (x, h2 ),
since the possible changes at x will have been processed correctly. This
is accomplished in lines 19–21. Lastly, there is a situation where a new
tuple is not included in the solution. This occurs when the largest new
height is equal to the largest previous height of the skyline (i.e., when
max(h1 , h2 ) = max(p1 , p2 )), since there would be no change of heights
at x. Figure 6.10(b) illustrates this case, where the point (x, h2 ) would
not be included in the final skyline (see lines 16 and 17).
We will now analyze possible scenarios when the x values of the first
tuples of the skylines are not equal. Without loss of generality, assume
[Figure 6.11: the merging step when x1 < x2. Top row: h1 > p2 — include tuple (x1, h1). Middle row: h1 ≤ p2 and p1 > p2 — include tuple (x1, p2). Bottom row: otherwise (p2 ≥ h1 and p2 ≥ p1), no tuple is included.]
x1 < x2 . In that case, the algorithm must decide whether to include the
tuple (x1 , h1 ), or (x1 , p2 ), or none at all, as illustrated in Figure 6.11. If
h1 > p2 then the first skyline is above the second one at location x1, and
the method must therefore include the tuple (x1, h1) as part of the merged skyline
(see lines 25–27). If h1 ≤ p2 then the algorithm must check if p1 > p2 . If
the result is True then the method includes the tuple (x1 , p2 ) (see lines
29–31). Notice that when h1 < p2 this produces a new tuple that does
not appear in the input skyline lists. Lastly, in other situations, p2 ≥ h1
and p2 ≥ p1 , which implies that the merged skyline will not change at x1 .
Finally, having processed the first tuple from the first skyline (x1, h1), the
method discards it when invoking itself in the corresponding recursive call.
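The merging procedure described above can be sketched as follows. This is a hedged reconstruction, not the book's Listing 6.15: the name merge_skylines, the default parameters, and the list-slicing style are assumptions.

```python
def merge_skylines(s1, s2, p1=0, p2=0):
    # Base cases: when one skyline is exhausted, return the other one
    if not s1:
        return s2
    if not s2:
        return s1

    (x1, h1), (x2, h2) = s1[0], s2[0]
    if x1 == x2:
        # Possible change at x1 in both skylines: keep the larger height,
        # unless the merged height does not actually change
        h = max(h1, h2)
        rest = merge_skylines(s1[1:], s2[1:], h1, h2)
        if h == max(p1, p2):
            return rest
        return [(x1, h)] + rest
    elif x1 < x2:
        # Process the first tuple of skyline 1; p1 becomes h1
        rest = merge_skylines(s1[1:], s2, h1, p2)
        if h1 > p2:
            return [(x1, h1)] + rest          # skyline 1 is on top at x1
        elif p1 > p2:
            return [(x1, p2)] + rest          # merged height drops to p2
        return rest                           # no change at x1
    else:
        # Symmetric case: process the first tuple of skyline 2
        rest = merge_skylines(s1, s2[1:], p1, h2)
        if h2 > p1:
            return [(x2, h2)] + rest
        elif p2 > p1:
            return [(x2, p1)] + rest
        return rest
```

For example, merging the skylines [(2, 6), (9, 0)] and [(4, 8), (12, 0)] yields [(2, 6), (4, 8), (12, 0)].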
6.8 EXERCISES
Exercise 6.1 — Implement a divide and conquer algorithm that deter-
mines whether a list a contains an element x.
Aᵀ = [ A₁,₁ᵀ  A₂,₁ᵀ
       A₁,₂ᵀ  A₂,₂ᵀ ].

A ⋅ B = [ A₁  A₂ ] ⋅ [ B₁
                       B₂ ] = A₁B₁ + A₂B₂.

A ⋅ B = [ A₁
          A₂ ] ⋅ [ B₁  B₂ ] = [ A₁B₁  A₁B₂
                                A₂B₁  A₂B₂ ].
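These block identities can be checked numerically with plain Python lists. The helper names below (mat_mul, mat_add, split_cols, split_rows) are illustrative, not from the book:

```python
def mat_mul(A, B):
    # Standard matrix product: C[i][j] = sum over k of A[i][k] * B[k][j]
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    # Entrywise sum of two matrices of equal dimensions
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def split_cols(A, m):
    # A = [A1 | A2], splitting after column m
    return [row[:m] for row in A], [row[m:] for row in A]

def split_rows(B, m):
    # B = [B1 ; B2], splitting after row m
    return B[:m], B[m:]
```

With A split into column blocks and B into row blocks, mat_mul(A, B) equals mat_add(mat_mul(A1, B1), mat_mul(A2, B2)), matching the second identity above.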
CHAPTER 12
Multiple Recursion III: Backtracking
354 Introduction to Recursive Programming
Figure 12.1 One solution to the four-queens puzzle.
12.1 INTRODUCTION
This section introduces fundamental concepts related to backtracking,
and provides an overview of how it works by examining the simple four-
queens puzzle. Its goal consists of placing four chess queens on a 4 × 4
chessboard so that they do not threaten each other. Since queens can
move horizontally, vertically, and diagonally, two (or more) queens can-
not appear in the same row, column, or diagonal on the board. Fig-
ure 12.1 illustrates one of the two possible solutions to the puzzle. Nat-
urally, the problem can be generalized to placing n queens on an n × n
chessboard (see Section 12.3).
[Figure shows a 4 × 4 grid numbered 0–15 in row-major order, the same grid numbered in column-major order, and a list of positions 0–6.]
Figure 12.2 Partial solutions within complete solutions that are coded as
lists or matrices.
                     [-,-,-]
              0 /             \ 1
         [0,-,-]               [1,-,-]
        0 /    \ 1            0 /    \ 1
   [0,0,-]    [0,1,-]    [1,0,-]    [1,1,-]
   0 / \ 1   0 / \ 1    0 / \ 1    0 / \ 1
(the eight leaves [0,0,0], [0,0,1], ..., [1,1,1] are the complete solutions)
Figure 12.4 Binary recursion tree of an algorithm that generates all of the
subsets of three items.
12.2.1 Subsets
This section presents two strategies for generating all of the subsets of n
(distinct) elements, which will be provided through an input list. In both
methods the recursion tree will be binary, and the solutions (subsets)
will be represented at its leaves. One method uses partial solutions of
(fixed) length n, while in the other their length varies as the procedure
carries out recursive calls.
Listing 12.1 Code for printing all of the subsets of the elements in a list.
1 def generate_subsets(i, sol, elements):
2     # Base case
3     if i == len(elements):
4         # Print complete solution
5         print_subset_binary(sol, elements)
6     else:
7         # Generate candidate elements
8         for k in range(0, 2):
9
10            # Include candidate in partial solution
11            sol[i] = k
12
13            # Expand partial solution at position i+1
14            generate_subsets(i + 1, sol, elements)
15
16            # Remove candidate from partial solution
17            sol[i] = None  # optional
18
19
20 def generate_subsets_wrapper(elements):
21     sol = [None] * len(elements)
22     generate_subsets(0, sol, elements)
23
24
25 def print_subset_binary(sol, elements):
26     no_elements = True
27     print('{', end='')
28     for i in range(0, len(sol)):
29         if sol[i] == 1:
30             if no_elements:
31                 print(elements[i], sep='', end='')
32                 no_elements = False
33             else:
34                 print(', ', elements[i], sep='', end='')
35     print('}')
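As a cross-check on Listing 12.1, each subset corresponds to the binary representation of an integer from 0 to 2ⁿ − 1, with bit i playing the role of sol[i]. A hedged iterative sketch (subsets_by_counting is an illustrative name, not a function from the book):

```python
def subsets_by_counting(elements):
    # Bit i of the counter m indicates whether elements[i] is included,
    # mirroring the assignments sol[i] = k in generate_subsets
    n = len(elements)
    return [[elements[i] for i in range(n) if (m >> i) & 1]
            for m in range(2 ** n)]
```

For a list of three items it produces the same 2³ = 8 subsets that the recursive method prints.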
[Figure: recursion tree of the alternative subset-generation algorithm, where partial solutions grow by one element per call.]
                      []
               0 /         \ 1
             [0]             [1]
           0 /  \ 1        0 /  \ 1
       [0,0]    [0,1]   [1,0]    [1,1]
       0/ \1    0/ \1   0/ \1    0/ \1
(the eight leaves [0,0,0], [0,0,1], ..., [1,1,1] are the complete solutions)
Listing 12.2 Alternative code for printing all of the subsets of the elements
in a list.
1 def generate_subsets_alt(sol, a):
2     # Base case
3     if len(sol) == len(a):
4         # Print complete solution
5         print_subset_binary(sol, a)
6     else:
7         # Generate candidate elements
8         for k in range(0, 2):
9
10            # Include candidate in partial solution
11            sol = sol + [k]
12
13            # Expand partial solution
14            generate_subsets_alt(sol, a)
15
16            # Remove candidate from partial solution
17            del sol[-1]
18
19
20 def generate_subsets_alt_wrapper(elements):
21     sol = []
22     generate_subsets_alt(sol, elements)
                       [-, -, -]
          a /             | b            \ c
      [a, -, -]       [b, -, -]       [c, -, -]
       b / \ c         a / \ c         a / \ b
 [a,b,-] [a,c,-] [b,a,-] [b,c,-] [c,a,-] [c,b,-]
    | c     | b     | c     | a     | b     | a
 [a,b,c] [a,c,b] [b,a,c] [b,c,a] [c,a,b] [c,b,a]
Figure 12.6 Recursion tree of an algorithm that generates all of the per-
mutations of three items.
12.2.2 Permutations
This section examines two similar algorithms that print all of the pos-
sible permutations of the n distinct elements in a given list. One way
to represent a permutation is through a list of indices from 0 to n − 1,
which reference the locations of the elements in the input list. For ex-
ample, given the list [a, b, c], the partial solution [1, 2, 0] would denote
the permutation [b, c, a]. However, the following algorithms will simply
use partial solutions formed by the items of the input list (and None
values). Figure 12.6 shows the structure of their recursion trees, for the
list [a, b, c]. Observe that the partial solutions have length n, but only
the first items are meaningful. In the first call to the methods the partial
solution is “empty,” where all of its elements are set to None. The pro-
cedures also receive a parameter i, initially set to zero, which specifies
how many candidate elements have been included in the partial solution.
Thus, it indicates the position in the partial solution where the methods
introduce a new candidate, and which is also equivalent to the depth of
the node related to a method call in the recursion tree.
Multiple Recursion III: Backtracking 365
Listing 12.3 Code for printing all of the permutations of the elements in
a list.
1 def generate_permutations(i, sol, elements):
2     # Base case
3     if i == len(elements):
4         print_permutation(sol)  # complete solution
5     else:
6         # Generate candidate elements
7         for k in range(0, len(elements)):
8
9             # Check candidate validity
10            if elements[k] not in sol[0:i]:
11
12                # Include candidate in partial solution
13                sol[i] = elements[k]
14
15                # Expand partial solution at position i+1
16                generate_permutations(i + 1, sol, elements)
17
18                # Remove candidate from partial solution
19                sol[i] = None  # not necessary
20
21
22 def generate_permutations_wrapper(elements):
23     sol = [None] * len(elements)
24     generate_permutations(0, sol, elements)
25
26
27 def print_permutation(sol):
28     for i in range(0, len(sol)):
29         print(sol[i], ' ', end='')
30     print()
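One way to test Listing 12.3 is to count the leaves of its recursion tree, which must equal n! for n distinct elements. A hedged sketch (count_permutations_recursive is an illustrative name, not from the book):

```python
def count_permutations_recursive(elements):
    # Counts the leaves of the recursion tree of generate_permutations
    # without printing anything; mirrors its structure exactly
    n = len(elements)
    sol = [None] * n

    def rec(i):
        if i == n:
            return 1                      # one complete permutation
        total = 0
        for k in range(n):
            if elements[k] not in sol[0:i]:
                sol[i] = elements[k]      # include candidate
                total += rec(i + 1)
                sol[i] = None             # remove candidate
        return total

    return rec(0)
```

For three items it returns 3! = 6, and for four items 4! = 24.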
The method that generates permutations is a good starting point for building the backtracking method that solves the problem.
Although the final algorithm will introduce a few modifications, its struc-
ture will be very similar to the method that generates permutations.
Consider the method generate_permutations_alt and its input
parameters. It permutes the items in the list elements, which can con-
tain arbitrary numbers, characters, or other types of data. In this case
the elements that we have to permute are the rows of the chessboard.
Since they are simply the integers from 0 to n − 1, we will be able to
write the code omitting the list elements. However, we will use the list
sol that represents a partial solution containing rows, the parameter i
that indicates the column where a new queen will be placed, and the
Boolean list available that specifies which rows (candidates) are free
to be incorporated in the partial solutions. We will change the name of
this list to free_rows, since even though there might not be a queen
in a particular row, the row might not be available for inclusion in the
partial solution if there is a conflict related to a diagonal.
In addition to those parameters, we can use two more Boolean lists
in order to indicate whether there are queens on the diagonals of the
chessboard. There are two types of diagonals: (a) principal diagonals,
which are parallel to the main diagonal that runs from the top-left to the
bottom-right corners of the board; and (b) secondary diagonals, which
are perpendicular to the principal diagonals. There are exactly 2n − 1
diagonals of each type.
Listing 12.5 Code for finding all of the solutions to the n-queens puzzle.
1 def nqueens_all_sol(i, free_rows, free_pdiags,
2                     free_sdiags, sol):
3     n = len(sol)
4
5     # Test if the partial solution is a complete solution
6     if i == n:
7         print_chessboard(sol)  # process the solution
8     else:
9
10        # Generate all possible candidates that could
11        # be introduced in the partial solution
12        for k in range(0, n):
13
14            # Check if the partial solution with the
15            # k-th candidate would be valid
16            if (free_rows[k] and free_pdiags[i - k + n - 1]
17                    and free_sdiags[i + k]):
18
19                # Introduce candidate k in the partial solution
20                sol[i] = k
21
22                # Update data structures, indicating that
23                # candidate k is in the partial solution
24                free_rows[k] = False
25                free_pdiags[i - k + n - 1] = False
26                free_sdiags[i + k] = False
27
28                # Perform a recursive call in order to include
29                # more candidates in the partial solution
30                nqueens_all_sol(i + 1, free_rows, free_pdiags,
31                                free_sdiags, sol)
32
33                # Eliminate candidate k from the partial
34                # solution, and restore the data structures,
35                # indicating that candidate k is no longer
36                # in the partial solution
37                free_rows[k] = True
38                free_pdiags[i - k + n - 1] = True
39                free_sdiags[i + k] = True
40
41
42 def nqueens_wrapper(n):
43     free_rows = [True] * n
44     free_pdiags = [True] * (2 * n - 1)
45     free_sdiags = [True] * (2 * n - 1)
46     sol = [None] * n
47     nqueens_all_sol(0, free_rows, free_pdiags, free_sdiags, sol)
After the recursive call returns, the method needs
to restore the values of the Boolean lists, as if the k-th row had not been
included in the partial solution. Finally, the algorithm modifies the lists
specifying that the k-th row, and the diagonals related to the square at
column i and row k, will be free of queens (lines 37–39).
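The structure of Listing 12.5 can be exercised with a counting variant that returns the number of solutions instead of printing them. This is a sketch with an illustrative name (count_nqueens), not code from the book:

```python
def count_nqueens(n):
    # Same pruning structure as nqueens_all_sol, but each complete
    # solution contributes 1 to a running total instead of being printed
    free_rows = [True] * n
    free_pdiags = [True] * (2 * n - 1)
    free_sdiags = [True] * (2 * n - 1)

    def rec(i):
        if i == n:
            return 1
        total = 0
        for k in range(n):
            if (free_rows[k] and free_pdiags[i - k + n - 1]
                    and free_sdiags[i + k]):
                # Place a queen at column i, row k
                free_rows[k] = False
                free_pdiags[i - k + n - 1] = False
                free_sdiags[i + k] = False
                total += rec(i + 1)
                # Backtrack: remove the queen
                free_rows[k] = True
                free_pdiags[i - k + n - 1] = True
                free_sdiags[i + k] = True
        return total

    return rec(0)
```

The known solution counts provide a quick sanity check: 2 for n = 4, 10 for n = 5, 4 for n = 6, and 92 for n = 8.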
Listing 12.6 Code for finding one solution to the n-queens puzzle.
1 def nqueens_one_sol(i, free_rows, free_pdiags,
2                     free_sdiags, sol):
3     n = len(sol)
4     sol_found = False
5
6     if i == n:
7         return True
8     else:
9         k = 0
10        while not sol_found and k < n:
11            if (free_rows[k] and free_pdiags[i - k + n - 1]
12                    and free_sdiags[i + k]):
13
14                sol[i] = k
15
16                free_rows[k] = False
17                free_pdiags[i - k + n - 1] = False
18                free_sdiags[i + k] = False
19
20                sol_found = nqueens_one_sol(i + 1, free_rows,
21                                            free_pdiags,
22                                            free_sdiags, sol)
23
24                free_rows[k] = True
25                free_pdiags[i - k + n - 1] = True
26                free_sdiags[i + k] = True
27
28            k = k + 1
29
30        return sol_found
31
32
33 def nqueens_one_sol_wrapper(n):
34     free_rows = [True] * n
35     free_pdiags = [True] * (2 * n - 1)
36     free_sdiags = [True] * (2 * n - 1)
37     sol = [None] * n
38
39     if nqueens_one_sol(0, free_rows, free_pdiags,
40                        free_sdiags, sol):
41         print_chessboard(sol)
42
43
44 def print_chessboard(sol):
45     for i in range(0, len(sol)):
46         print(sol[i], ' ', end='')
47     print()
that:

    ∑_{sᵢ ∈ T} sᵢ = x.        (12.1)
Listing 12.7 Backtracking code for solving the subset sum problem.
1 def print_subset_sum(i, sol, psum, elements, x):
2     # Base case
3     if psum == x:
4         print_subset_binary(sol, elements)
5     elif i < len(elements):
6         # Generate candidates
7         for k in range(0, 2):
8
9             # Check if recursion tree can be pruned
10            if psum + k * elements[i] <= x:
11
12                # Expand partial solution
13                sol[i] = k
14
15                # Update sum related to partial solution
16                psum = psum + k * elements[i]
17
18                # Try to expand partial solution
19                print_subset_sum(i + 1, sol, psum, elements, x)
20
21                # not necessary:
22                # psum = psum - k * elements[i]
23
24        # Make sure a 0 indicates the absence of an element
25        sol[i] = 0
26
27
28 def print_subset_sum_wrapper(elements, x):
29     sol = [0] * len(elements)
30     print_subset_sum(0, sol, 0, elements, x)
This ensures that the last items of the partial solution
are always zeros when reaching a base case (this is implemented through
the assignment in line 25).
If the partial solution does not satisfy (12.1) the method can simply
terminate if the partial solution has n elements, without carrying out any
operation. Thus, with the condition in line 5 the algorithm only continues
expanding partial solutions if i < n. In that case it uses a loop in order
to process the two possibilities of ignoring (k=0) or including (k=1) the
i-th element of S in the partial solution. Since k is a number we can
use it within the expression psum + k*elements[i] to indicate the new
sum associated with the partial solution. Notice that when k=0 it does
not change, while the method adds the i-th element of S when k=1.
[Figure 12.9: recursion tree of print_subset_sum for S = {2, 6, 3, 5} and x = 8. The numbers next to the nodes indicate psum when the method is invoked; a left branch ignores the i-th element (+0), while a right branch includes it (+2, +6, +3, or +5). The leaves {2, 6} and {3, 5} correspond to solutions, and branches are pruned when the sum would exceed 8.]
Therefore, the condition in line 10 makes sure that the new sum is less
than or equal to x before including the element, and proceeding to make
additional recursive calls. If the condition is True the method expands
the partial solution with the corresponding candidate (line 13), updates
psum (line 16), and carries out the recursive call (line 19). Afterwards,
it is not necessary to restore the value of psum (in line 22) since in the
first iteration of the loop k=0, which implies that the value of psum is
not modified. Lastly, the algorithm sets the value of the partial solution
to zero at position i when terminating, which is necessary in order to
print the partial solutions correctly in the base case (its n−i last items
must always be zero).
Finally, Figure 12.9 shows the recursion tree of the method
print_subset_sum, for S = {2, 6, 3, 5} and x = 8. The partial solutions
are constructed at each level as in the methods in Section 12.2.1. When
descending through a left branch the algorithm does not include the i-th
element of S in T , but it incorporates it when advancing down a right
branch. In this case the numbers next to the nodes indicate the sum of
the items included in the partial solutions (i.e., psum) when invoking the
method. Observe that the tree is pruned as soon as it finds a solution that
satisfies (12.1), or if it detects that the sum of the elements in T is greater
than x. Lastly, a call to print_subset_sum_wrapper([2,6,3,5],8)
produces the correct result:
{3,5}
{2,6}
Finally, observe that without the assignment in line 25, the partial so-
lution would hold a 1 in its third position when reaching the base case
associated with the subset {2, 6}, and the method would print {2,6,3}.
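The output of Listing 12.7 can be cross-checked exhaustively with itertools.combinations. The helper below (subsets_with_sum) is an illustrative sketch, not from the book:

```python
from itertools import combinations

def subsets_with_sum(elements, x):
    # Exhaustively lists every subset of elements whose items add up to x
    found = []
    for r in range(len(elements) + 1):
        for c in combinations(elements, r):
            if sum(c) == x:
                found.append(set(c))
    return found
```

For S = {2, 6, 3, 5} and x = 8 it returns the two subsets {2, 6} and {3, 5}, matching the result above.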
Enter
E W E E E E E E W E W W
E E E W W W E W W E E E
W W E E W E E E W W W E
W E E W W E W E E E E E
E E W W E E E E W E W W
E W W E E W W W W E E E
E E E E W W E E E E W E
E W E W W W W W W W W E
E W E E E E W E E E W W
E W W E W E E E W E E E
W W E E W W E W W E W E
E E E W W E E E W E W E
Exit
[Figure: a 12 × 12 maze, where 'E' denotes an empty cell and 'W' a wall; the entrance is the top-left cell and the exit is the bottom-right cell. Panels (a)–(d) show successive stages of the backtracking search.]
The search begins at the entrance cell. The algorithm
then tries to advance downwards, but hits a wall located at (2, 0). Since
a path with that cell would not be valid, the algorithm continues by
trying to advance towards the right, which it can do since the cell at
(1, 1) is empty. Eventually the path reaches the cell at (6,0), where the
method will also begin by trying to advance downwards. The path ends
up advancing to the cell at (9, 0), where it hits a dead end. In that case,
not only are there walls below and to the right, but going upwards is
not allowed because the cell on top is already part of the path. Further-
more, going towards the left would mean exiting the limits of the maze.
Since the algorithm cannot advance in any direction, it backtracks to the
cell at (8, 0), where it has not yet tried to advance towards the right,
upwards, or leftwards. However, these possibilities are also not allowed.
This also occurs when returning to the cell at (7, 0). After backtracking
to the cell at (6, 0), it explores new paths that go towards the right, after
having exhaustively explored all of the possible paths going downwards.
This process is repeated until the algorithm finds the exit cell. The so-
lution is shown in (d), where the explored cells appear with a shaded
background.
Backtracking algorithms can stop searching for solutions when they
find the first one, or they can continue computing every solution. We
will see an algorithm that halts as soon as it finds a path through the
maze. In this regard, the order in which it examines the possible paths
[Figure: candidate moves from the last cell (c, r) of the path, generated in the order: down, right, up, left.]
The order of search for this algorithm is therefore: down, right, up, and left. Observe
that adding a unit to a row implies going down (since the first row is
located at the top of the maze), while adding one to a column signifies
moving towards the right. Lastly, the wrapper method indicates that the
initial cell will form part of the solution path, and returns the (Boolean)
result of calling the recursive backtracking method. The parameter M of
find_path_maze is both the initial maze and the partial solution. The
first two parameters of the recursive function indicate the coordinates of
the last cell of the partial path M, and the algorithm will try to expand
it advancing to one of its neighboring cells. Finally, the method receives
incr, and the coordinates of the exiting cell.
The recursive function find_path_maze returns True if it finds a
path through the maze, where the solution would be stored precisely
in M. It declares the variable sol_found, initialized as False, which is
set to True at the base case if the algorithm finds a complete solution.
Otherwise, it uses a while loop to generate the four candidate cells,
but one that terminates as soon as a solution is found. The variables
new_col and new_row constitute the new candidate (lines 12 and 13).
Subsequently, the algorithm examines whether it is valid. In particular,
it has to be within the limits of the maze (lines 16 and 17), and it has
to be empty (line 18). If the cell does not violate the constraints of the
problem the method incorporates it into the path (line 21), and calls the
recursive method in lines 24–26, where the output is stored in sol_found.
If a solution is not found the cell will not belong to the path. Therefore,
it sets its value back to ‘E’, indicating that it is empty (line 30). This
is necessary since the path in the final maze is determined through the
cells marked as ‘P’. Without this condition all of the explored cells would
contain a ‘P’ character. Lastly, the method can simply return the value
of sol_found after exiting the loop (line 34).
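Since Listing 12.8 is referenced but not reproduced in this excerpt, the following hedged sketch reconstructs the described behavior. The parameter order, the wrapper, and the marker characters 'E' and 'P' follow the surrounding text, but the details are assumptions:

```python
def find_path_maze(col, row, M, incr, exit_col, exit_row):
    # Returns True if a path from (col, row) to the exit exists; the
    # path itself is stored in M through cells marked 'P'
    if col == exit_col and row == exit_row:
        return True                            # complete solution
    sol_found = False
    k = 0
    while not sol_found and k < 4:
        new_col = col + incr[k][0]
        new_row = row + incr[k][1]
        # The candidate cell must lie inside the maze and be empty
        if (0 <= new_row < len(M) and 0 <= new_col < len(M[0])
                and M[new_row][new_col] == 'E'):
            M[new_row][new_col] = 'P'          # add the cell to the path
            sol_found = find_path_maze(new_col, new_row, M, incr,
                                       exit_col, exit_row)
            if not sol_found:
                M[new_row][new_col] = 'E'      # backtrack: cell is empty again
        k = k + 1
    return sol_found


def find_path_maze_wrapper(M):
    # Search order: down, right, up, left, as (column, row) increments
    incr = [(0, 1), (1, 0), (0, -1), (-1, 0)]
    M[0][0] = 'P'                              # the initial cell is on the path
    return find_path_maze(0, 0, M, incr, len(M[0]) - 1, len(M) - 1)
```

On a small 3 × 3 maze the wrapper returns True and marks the solution path with 'P' characters.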
Finally, Listing 12.9 shows additional code that can be used in
order to execute the program and draw the maze. The method
read_maze_from_file reads a text file defining a maze and returns the
corresponding list of lists. The basic iterative procedure draw_maze uses
the Matplotlib package to depict a maze. Finally, the last lines of the
code read a maze from a file, and draw it only if there exists a path
through it, where the initial and final cells are the top-left and bottom-
right ones, respectively.
[Figure 12.13: a sudoku puzzle and its solution.]
If a cell is already filled with one of these digits the method must skip it (it cannot expand the partial
solution for that cell), and carry out a recursive call that processes the
following cell. This requires incorporating a second recursive case that
is illustrated in (b).
Listings 12.10 and 12.11 show a backtracking procedure that solves
the problem, together with several auxiliary methods. The recursive
method solve_sudoku_all_solutions assumes that a sudoku might
not be well-posed, and could potentially have several solutions (or even
0). Thus, it prints all of the valid solutions to a given sudoku grid. For
example, if the top row of the sudoku in Figure 12.13 is replaced by an
empty row, there are 10 different ways to fill the sudoku grid with digits.
The inputs to the recursive procedure are the coordinates (row and
col) of a cell where it will try to include digits in order to expand
the partial solution represented by the list of lists of digits S. Since the
procedure expands the partial solution in row-major order, starting at
cell (0, 0), it will have obtained a valid solution when row is equal to 9. In
that base case it can simply print the sudoku grid (line 4). Otherwise, the
method checks whether the current cell is not empty (i.e., if it contains
one of the initial fixed digits). If the result is True it skips the cell, and
continues by invoking the recursive method (line 14) with the coordinates
of the next cell in the row-major order (computed in line 11).
In the second recursive case it uses a loop to generate the nine pos-
sible candidates to include in the empty cell. Afterwards, in line 20 it
analyzes whether it is feasible to incorporate candidate k in the partial
solution S, at cell (row,col). If it is, the method includes the candidate in
the partial solution (line 23), and continues carrying out a recursive call
with the next cell (line 29). When the loop finishes, the method needs
to undo the changes made to the cell, leaving it empty (line 32). This is
necessary for future explorations of solutions. Note that the cell has to
be empty again when the method returns.
The code also includes: (a) a function that returns a tuple with the
coordinates of the next cell in row-major order; (b) a function that de-
termines whether the digit placed at some row and column violates the
constraints of the problem, where the variables box_row and box_col
are the top-left cells of the 3 × 3 boxes; (c) a function for reading a su-
doku grid from a text file, where each row contains the nine initial digits
of a sudoku row, separated by space bar characters; (d) a method for
printing the grid; and (e) code for reading and solving a sudoku.
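Listings 12.10 and 12.11 are not reproduced in this excerpt; the following hedged sketch reconstructs the described procedure. For easier inspection it collects solutions in a list instead of printing them, and the function names are illustrative:

```python
def solve_sudoku(row, col, S, solutions):
    # S is a 9x9 list of lists of digits, where 0 denotes an empty cell
    if row == 9:
        solutions.append([r[:] for r in S])    # complete solution: store a copy
        return
    next_row, next_col = next_cell(row, col)
    if S[row][col] != 0:
        # First recursive case: the cell holds a fixed digit, so skip it
        solve_sudoku(next_row, next_col, S, solutions)
    else:
        # Second recursive case: try the nine candidate digits
        for k in range(1, 10):
            if is_valid(k, row, col, S):
                S[row][col] = k
                solve_sudoku(next_row, next_col, S, solutions)
        S[row][col] = 0                        # leave the cell empty again


def next_cell(row, col):
    # Coordinates of the next cell in row-major order
    return (row, col + 1) if col < 8 else (row + 1, 0)


def is_valid(k, row, col, S):
    # Digit k may not repeat in the row, the column, or the 3x3 box
    if any(S[row][j] == k for j in range(9)):
        return False
    if any(S[i][col] == k for i in range(9)):
        return False
    box_row, box_col = 3 * (row // 3), 3 * (col // 3)
    return all(S[box_row + i][box_col + j] != k
               for i in range(3) for j in range(3))
```

A well-posed grid yields exactly one collected solution.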
maximize (over x)   ∑_{i=0}^{n−1} xᵢvᵢ

subject to          ∑_{i=0}^{n−1} xᵢwᵢ ≤ C,

                    xᵢ ∈ {0, 1},   i = 0, . . . , n − 1.
The idea is to prune the recursion tree whenever expanding a particular partial solution will
not be able to improve the best solution found in previous steps.
[Figure 12.15: recursion tree of the backtracking algorithm for the instance w = [3, 6, 9, 5], v = [7, 2, 10, 4], C = 15. Each node shows the remaining capacity x (top) and the partial value y (bottom); descending the right branch for object i replaces x by x − wᵢ and y by y + vᵢ, while the left branch leaves both unchanged.]
Listing 12.12 Backtracking code for solving the 0-1 knapsack problem.
1 def knapsack_0_1(i, w_left, current_v, sol,
2                  opt_sol, opt_v, w, v, C):
3     # Check base case
4     if i == len(sol):
5         # Check if better solution has been found
6         if current_v > opt_v:
7             # Update optimal value and solution
8             opt_v = current_v
9             for k in range(0, len(sol)):
10                opt_sol[k] = sol[k]
11    else:
12        # Generate candidates
13        for k in range(0, 2):
14
15            # Check if recursion tree can be pruned
16            if k * w[i] <= w_left:
17
18                # Expand partial solution
19                sol[i] = k
20
21                # Update remaining capacity and partial value
22                new_w_left = w_left - k * w[i]
23                new_current_v = current_v + k * v[i]
24
25                # Try to expand partial solution
26                opt_v = knapsack_0_1(i + 1, new_w_left,
27                                     new_current_v, sol,
28                                     opt_sol, opt_v, w, v, C)
29
30    # return value of optimal solution found so far
31    return opt_v
32
33
34 def knapsack_0_1_wrapper(w, v, C):
35     sol = [0] * len(w)
36     opt_sol = [0] * len(w)
37     total_v = knapsack_0_1(0, C, 0, sol, opt_sol, -1, w, v, C)
38     print_knapsack_solution(opt_sol, w, v, C, total_v)
The parameters include the partial solution sol and the index i related to the object that
may be introduced in the knapsack. The method also receives the lists
of weights w, values v, and the capacity C. With these parameters it is
possible to compute the remaining capacity and the accumulated sum of
values of the objects in the knapsack, in i+1 steps. However, it is more
solution. If at some method call this value is smaller than the best value
found by the algorithm in previous steps, it will not continue to expand
the partial solution, since it will not be able to obtain a better solution.
This allows us to prune the recursion tree at more nodes, which can lead
to a considerably more efficient search.
Figure 12.16 illustrates the recursion tree for the same weights, val-
ues, and knapsack capacity as in Figure 12.15. In this case, the numbers
inside the nodes indicate the partial value, and a bound on the maximum
possible value that we can obtain by expanding the associated partial
solution, as shown in (a). Initially, the partial value is 0, and the bound
is the sum of all of the values of the objects (i.e., 23 = 7 + 2 + 10 + 4), as
illustrated in (b), since at first we have to contemplate the case where
every object fits in the knapsack.
Each internal node of depth i can have two children. Descending
through the left branch implies not introducing object i in the knapsack.
Therefore, the partial value does not change, but the bound is decreased
by vi , since the object will not contribute its value to the total sum of
a complete solution. Instead, the object is introduced in the knapsack
when descending through the right branch. This implies adding vi to the
partial value, but the bound remains unaltered.
Similarly to the example in Figure 12.15, the method updates the
best solution found so far in the shaded leaves. In addition, the nodes
drawn with a dotted contour indicate method calls that are not carried
out because the sum of the weights of the objects exceeds the knapsack’s
capacity. Furthermore, the figure shows a new type of node with a lighter
dashed contour. The algorithm also discards the method calls associated
with these nodes since the value of the bound is smaller than the best
sum of values encountered previously. For instance, the best value after
reaching the fourth leaf is 14. Afterwards, consider reaching the node
with partial value 2 and bound 16. Discarding the third object implies
reducing the bound by 10. This means that the maximum sum of values
that it is possible to obtain by expanding the partial solution is 16 − 10 =
6. Since 6 < 14 (depicted underneath the node), the algorithm avoids
calling the method, pruning the recursion tree. Observe that it has the
same structure as the recursion tree in Figure 12.15, but it contains
fewer nodes since it prunes the tree on more occasions. In practice, this
enhancement can have a dramatic effect regarding efficiency.
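The optimal value for this instance can be cross-checked by exhaustive enumeration. The helper below (knapsack_brute_force) is an illustrative sketch, not from the book:

```python
from itertools import product

def knapsack_brute_force(w, v, C):
    # Examines all 2**n binary vectors x; returns the best value
    # together with the corresponding solution vector
    best_v, best_x = -1, None
    for x in product((0, 1), repeat=len(w)):
        if sum(xi * wi for xi, wi in zip(x, w)) <= C:
            value = sum(xi * vi for xi, vi in zip(x, v))
            if value > best_v:
                best_v, best_x = value, list(x)
    return best_v, best_x
```

For w = [3, 6, 9, 5], v = [7, 2, 10, 4], and C = 15 it confirms the optimal value 17 with solution [1, 0, 1, 0], in agreement with the figures.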
Listing 12.14 shows a branch and bound code for solving the problem
that is very similar to the one in Listing 12.12. The main difference be-
tween them is the new (fourth) parameter max_v that stores the bound.
[Figure 12.16: branch and bound recursion tree for w = [3, 6, 9, 5], v = [7, 2, 10, 4], C = 15. Each node shows the partial value x (top) and the bound y on the maximum possible value (bottom); a left branch replaces the bound by y − vᵢ, while a right branch replaces the partial value by x + vᵢ. Calls pruned through the bound are marked with the failed comparison (e.g., 6 < 14). The optimal solution is [1, 0, 1, 0].]
Listing 12.14 Branch and bound code for solving the 0-1 knapsack prob-
lem.
1 def knapsack_0_1_bnb(i, w_left, current_v, max_v, sol,
2                      opt_sol, opt_v, w, v, C):
3     # Check base case
4     if i == len(sol):
5         # Check if better solution has been found
6         if current_v > opt_v:
7             # Update optimal value and solution
8             opt_v = current_v
9             for k in range(0, len(sol)):
10                opt_sol[k] = sol[k]
11    else:
12        # Generate candidates
13        for k in range(0, 2):
14
15            # Check if recursion tree can be pruned
16            # according to (capacity) constraint
17            if k * w[i] <= w_left:
18
19                # Update maximum possible value
20                new_max_v = max_v - (1 - k) * v[i]
21
22                # Check if recursion tree can be pruned
23                # according to optimal value
24                if new_max_v > opt_v:
25
26                    # Expand partial solution
27                    sol[i] = k
28
29                    # Update remaining capacity
30                    # and partial value
31                    new_w_left = w_left - k * w[i]
32                    new_current_v = current_v + k * v[i]
33
34                    # Try to expand partial solution
35                    opt_v = knapsack_0_1_bnb(i + 1, new_w_left,
36                                             new_current_v,
37                                             new_max_v, sol,
38                                             opt_sol, opt_v,
39                                             w, v, C)
40
41    # return value of optimal solution found so far
42    return opt_v
Note that it is initialized to the sum of all of the values of the objects
in the wrapper method. In the recursive case, after making sure that a
partial solution with a new candidate is valid (in line 17), the method
computes the new value of the bound new_max_v in line 20. Observe
that when k = 0 the new bound is decreased by vi , while when k = 1 it
remains unaltered. Subsequently, the method uses another if statement
to check if it can prune the tree according to the new bound and the
optimal value of a solution computed in earlier calls (line 24). The rest
of the code is analogous to the backtracking algorithm. Finally, List-
ing 12.15 contains an associated wrapper method, defines an instance of
the problem (through w, v, and C), and solves it.
12.8 EXERCISES
Exercise 12.1 — There are numerous ways to build algorithms that
generate subsets of elements. The methods described in Section 12.2.1
were based on binary recursion trees. The goal of this exercise is to
implement alternative procedures that also print all of the subsets of n
items provided in a list. However, they must generate a subset at each
node of the recursion tree illustrated in Figure 12.17. Observe that there
are exactly 2n nodes, and the labels 0, 1, and 2 represent indices of the
elements of an initial input list [a, b, c]. Furthermore, instead of using
binary lists to indicate the presence of items in a subset, the partial
solutions will contain precisely these indices. For example, the list [0, 2]
                    {}
            0 /     | 1    \ 2
          {a}      {b}      {c}
        1 / \ 2     | 2
     {a,b}  {a,c}  {b,c}
       | 2
    {a,b,c}
Figure 12.18 One solution to the four-rooks puzzle.
will represent {a, c}. Therefore, partial solutions will also correspond to
complete solutions.
8 1 6
3 5 7
4 9 2
[Figure: a 3 × 3 magic square; every row, column, and diagonal adds up to 15.]
A knight's tour is a sequence of knight moves that starts at a specific square, such that it visits every square only once. Implement
a backtracking algorithm that provides one knight’s tour, and test it for
n = 5 and n = 6. Besides n, the method will receive the coordinates of
an initial square on the chessboard. Finally, it is not necessary to search
for a “closed” tour that ends at a square that is one move away from the
initial square.
(a)  set:      3 5 9 14 20 24        (b)  set:      3 5 9 14 20 24
     solution: 0 1 1  0  0  1             solution: 1 2 5
Figure 12.22 Two ways to represent a solution for the tug of war problem.
example, given the set {3, 5, 9, 14, 20, 24}, the optimal way to partition
it leads to the two subsets: {5, 9, 24}, and {3, 14, 20}. The sums of their
elements are 38 and 37, and the absolute difference between these sums
is 1.
Implement a backtracking algorithm based on generating subsets
that solves this problem. Assuming that the input set is coded as a list,
the solution would be a binary list of length n with n/2 zeros, and n/2
ones. The positions of the zeros (or ones) would indicate the locations of
the elements of a particular subset. In the example, one solution could
be the list [0, 1, 1, 0, 0, 1] (the list [1, 0, 0, 1, 1, 0] would be equivalent),
representing the subset {5, 9, 24}.
In addition, implement a more efficient strategy where the solution
is a list s of length n/2 whose elements appear in increasing order, and
correspond to the locations of the elements of a particular subset. For
example, the subset {5, 9, 24} would be represented by the list [1, 2, 5].
Figure 12.22 shows these two ways to represent a solution.
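A possible sketch of the second, index-based strategy (the function name and the dictionary used to record the best solution found so far are illustrative choices):

```python
def tug_of_war(s):
    # Backtracking over increasing lists of n/2 indices; returns the
    # index list of one optimal subset and the minimal difference.
    n, total = len(s), sum(s)
    best = {'diff': float('inf'), 'solution': None}

    def explore(partial, start, partial_sum):
        if len(partial) == n // 2:
            diff = abs(total - 2 * partial_sum)   # |sum(T) - sum(S - T)|
            if diff < best['diff']:
                best['diff'] = diff
                best['solution'] = partial[:]
            return
        for i in range(start, n):
            partial.append(i)                     # indices stay increasing
            explore(partial, i + 1, partial_sum + s[i])
            partial.pop()                         # backtrack

    explore([], 0, 0)
    return best['solution'], best['diff']

print(tug_of_war([3, 5, 9, 14, 20, 24]))   # the minimal difference is 1
```

Because each partial solution starts its loop at `start`, the procedure only builds index lists in increasing order, which avoids generating the same subset more than once.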
Finally, implement a backtracking algorithm that computes the optimal subset, and prints it if it exists (it may not
be possible to find a subset T of S whose elements add up to x).
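One hedged sketch of such an algorithm (the name `subset_sum` and the choice to return the subset in addition to printing it are assumptions of this sketch):

```python
def subset_sum(s, x):
    # Backtracking search for a subset of s whose elements add up to x.
    # Prints and returns the subset, or None if no such subset exists.
    def explore(start, target, partial):
        if target == 0:
            return partial[:]              # found: elements add up to x
        for i in range(start, len(s)):
            partial.append(s[i])
            found = explore(i + 1, target - s[i], partial)
            if found is not None:
                return found
            partial.pop()                  # backtrack
        return None

    subset = explore(0, x, [])
    if subset is not None:
        print(subset)
    else:
        print('no subset of S adds up to', x)
    return subset

subset_sum([3, 5, 9, 14, 20, 24], 38)      # prints [5, 9, 24]
```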
Further reading
MONOGRAPHS IN RECURSION
This book has provided a broad coverage of recursion, containing essen-
tial topics for designing recursive algorithms. However, the reader can
find additional aspects, examples, and implementation details in other
references. In particular, the following book:
• Jeffrey Soden Rohl. Recursion Via Pascal. Cambridge Computer Sci-
ence Texts. Cambridge University Press, 1st edition, August 1984
contains numerous examples in Pascal of recursive algorithms on data
structures (linked lists, trees, or graphs) implemented through point-
ers. In contrast, the current book avoids pointers since they are not
used explicitly in Python. The suggested reference includes backtrack-
ing algorithms for generating additional combinatorial entities such as
combinations, compositions, and partitions. Lastly, it contains a chapter
on recursion elimination, which offers low-level explanations on how to
transform recursive programs into equivalent iterative versions.
Another book that focuses entirely on recursion is:
• Eric S. Roberts. Thinking Recursively. Wiley, 1st edition, January 1986,
which contains examples in Pascal, and a more recent edition:
• Eric S. Roberts. Thinking Recursively with Java. Wiley, 1st edition,
February 2006,
where the code is in Java. The book contains a chapter on recursive data
types, and another on the implementation of recursion from a low-level
point of view.
Java programmers can also benefit from:
• Irena Pevac. Practicing Recursion in Java. CreateSpace Independent
Publishing Platform, 1st edition, April 2016,
which contains examples related to linked lists, linked trees, and graph-
ical problems.
FUNCTIONAL PROGRAMMING
Recursion is omnipresent in functional programming. Thus, program-
mers should have mastered the contents of this book in order to be
competent in this programming paradigm. Popular references include:
• Harold Abelson and Gerald J. Sussman. Structure and Interpretation
of Computer Programs. MIT Press, Cambridge, MA, USA, 2nd edition,
1996.
• Richard Bird. Introduction to Functional Programming Using Haskell.
Prentice Hall Europe, April 1998.
• Martin Odersky, Lex Spoon, and Bill Venners. Programming in Scala:
A Comprehensive Step-by-Step Guide. Artima Incorporation, USA, 1st
edition, 2008.
The book by Odersky et al. is used in a highly recommended course offered
at coursera.org.
PYTHON
It is assumed that the reader of this book has some programming expe-
rience. The following popular texts can be useful to readers interested
in learning more Python features, looking up implementation details,
or developing alternative and/or more efficient recursive variants of the
examples covered throughout the book:
• Mark Pilgrim. Dive Into Python 3. Apress, Berkeley, CA, USA, 2009.
• Mark Summerfield. Programming in Python 3: A Complete Introduc-
tion to the Python Language. Addison-Wesley Professional, 2nd edition,
2009.
• Mark Lutz. Learning Python. O’Reilly, 5th edition, 2013.