Guide To Graph Colouring Algorithms and Applications - R. M. R. Lewis
R. M. R. Lewis
Guide to Graph Colouring: Algorithms and Applications
Second Edition
Texts in Computer Science
Series Editors
David Gries, Department of Computer Science, Cornell University, Ithaca, NY,
USA
Orit Hazzan, Faculty of Education in Technology and Science, Technion—Israel
Institute of Technology, Haifa, Israel
More information about this series at http://www.springer.com/series/3191
R. M. R. Lewis
R. M. R. Lewis
Cardiff School of Mathematics
Cardiff University
Cardiff, UK
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
For Fifi, Maiwen, Aoibh, and Maccy
Gyda cariad
Preface
The first chapter of this book is kept deliberately light; it gives a brief tour of the
graph colouring problem, avoids jargon, and gives plenty of illustrated examples.
In Chap. 2, we discover that graph colouring is a type of “intractable” problem,
meaning that it usually needs to be tackled using inexact heuristic algorithms. To
reach this conclusion, we introduce the topics of problem complexity, polynomial
transformations, and NP-completeness. We also review several graph topologies
that are easy to colour optimally.
Chapter 3 of this book starts with some theory and uses various techniques to
derive bounds on the chromatic number of graphs. It then looks at five
well-established constructive heuristics for graph colouring (including the Greedy,
DSatur, and RLF algorithms) and analyses their relative performance. Source code
for these algorithms is provided.
The intention of Chap. 4 is to give the reader an overview of the different
strategies available for graph colouring, including both exact and heuristic methods.
Techniques considered include backtracking, integer programming, column gen-
eration, and various metaheuristics. No prior knowledge of these techniques is
assumed. We also describe ways in which graph colouring problems can be reduced
in size and broken into smaller parts, helping to improve algorithm performance.
Audience
Supplementary Resources
All algorithms reviewed and tested in this book are available for free download at
http://www.rhydlewis.eu/resources/gCol.zip. These implementations are written in
C++ and can be compiled on Windows, macOS, and Linux. Full instructions on
how to do this are provided in Appendix A. Readers are invited to experiment with
these algorithms as they make their way through this book.
This book also shows how graph colouring problems can be generated and
tackled using Sage and Python. Both of these programming languages are free to
download. We also show how graph colouring problems can be solved via linear
programming software, in this case using the commercial software FICO Xpress.
1 Introduction to Graph Colouring
Fig. 1.1 A small graph (a), and corresponding five-colouring (b)
Fig. 1.2 If we extract the vertices in the dotted circle, we are left with a subgraph that clearly needs
more than four colours
If we were to have only four colours available to us, as indicated in the figure, we
would be unable to properly colour this subgraph, since its five vertices must all
be assigned different colours in this instance. This allows us to conclude that the
solution in Fig. 1.1 is optimal, since no solution exists that uses fewer than five
colours.
Let us now consider four simple practical applications of graph colouring to further
illustrate the underlying concepts of the problem.
Fig. 1.3 Illustration of how proper four- and three-colourings can be constructed for the same graph
students so that no student is put in a group containing one or more of his friends,
and so that the number of groups used is minimal. How might this be done?
Consider the example given in the table in Fig. 1.3a. Here, we have a list of eight
students with names A through to H, together with information on who their friends
are. From this information we can see that student A is friends with three students
(B, C and G), student B is friends with four students (A, C, E, and F), and so on.
Note that the information in this table is “symmetric” in that if student x lists student
y as one of his friends, then student y also does the same with student x. This
sort of relationship occurs in social network systems such as Facebook, where two
people are only considered friends if both parties agree to be friends in advance. An
illustration of this example in graph form is also given in the figure.
Let us now attempt to split the eight students of this problem into groups so that
each student is put into a different group to that of his friends. A simple method to do
this might be to take the students one by one in alphabetical order and assign them
to the first group where none of their friends is currently placed. Walking through
the process, we start by taking student A and assigning him to the first group. Next,
we take student B and see that he is friends with someone in the first group (student
A), and so we put him into the second group. Taking student C next, we notice that
he is friends with someone in the first group (student A) and also the second group
(student B), meaning that he must now be assigned to a third group. At this point,
we have only considered three students, yet we have created three separate groups.
What about the next student? Looking at the information, we can see that student D
is only friends with E and F, allowing us to place him into the first group alongside
student A. Following this, student E cannot be assigned to the first or second groups
but can be assigned to the third. Continuing this process for all eight students gives
us the solution shown in Fig. 1.3b. This solution uses four groups and also involves
student F being assigned to a group by himself.
Can we do any better than this? By inspecting the graph in Fig. 1.3a, we can see that
there are three separate cases where three students are all friends with one another.
Specifically, these are students A, B, and C; students B, E, and F; and students D,
E, and F. The edges between these triplets of students form triangles in the graph.
Because of these mutual friendships, in each case these collections of three students
will need to be assigned to different groups, implying that at least three groups will
be needed in any valid solution. However, by visually inspecting the graph we can
see that there is no occurrence of four students all being friends with one another.
This hints that we may not necessarily need to use four groups in a solution.
A solution using three groups is possible in this case as Fig. 1.3c demonstrates.
This solution has been achieved using the same assignment process as before but
using a different ordering of the students, as indicated. Since we have already deduced
that at least three groups are required for this particular problem, we can conclude
that this solution is optimal—no proper solution using fewer colours exists.
The process we have used to form the solutions shown in Fig. 1.3b, c is known as
the Greedy algorithm for graph colouring. This is a fundamental part of the field of
graph colouring and will be considered further in Chap. 3. Among other things, we
will demonstrate that there will always be at least one ordering of the vertices that,
when used with the Greedy algorithm, will result in an optimal solution.
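The assignment process just described is straightforward to sketch in code. The following Python snippet is a minimal illustration of the Greedy algorithm; the six-vertex graph is hypothetical (the students' full friendship data is not reproduced in the text), chosen so that one vertex ordering needs three colours while another attains the optimal two, echoing the effect seen with the students:

```python
def greedy_colouring(adj, order):
    """Greedy algorithm: take the vertices in the given order and assign
    each one the lowest-numbered colour not used by its coloured neighbours."""
    colour = {}
    for v in order:
        used = {colour[u] for u in adj[v] if u in colour}
        c = 1
        while c in used:
            c += 1
        colour[v] = c
    return colour

# A hypothetical six-vertex graph: a1, a2, a3 are mutually non-adjacent,
# as are b1, b2, b3; vertex ai is adjacent to bj whenever i != j.
adj = {"a1": ["b2", "b3"], "a2": ["b1", "b3"], "a3": ["b1", "b2"],
       "b1": ["a2", "a3"], "b2": ["a1", "a3"], "b3": ["a1", "a2"]}

bad = greedy_colouring(adj, ["a1", "b1", "a2", "b2", "a3", "b3"])
good = greedy_colouring(adj, ["a1", "a2", "a3", "b1", "b2", "b3"])
print(max(bad.values()), max(good.values()))   # 3 2
```

Interleaving the two sides of the graph forces a third colour, while colouring all the a-vertices before the b-vertices uses only two.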
Fig. 1.4 A small timetabling problem (a), a feasible four-colouring (b), and its corresponding
timetable solution using four timeslots (c)
that only one event can take place in a room at any one time, we would also need
to ensure that three rooms are available during timeslot 1. If only two rooms are
available in each timeslot, then an extra timeslot might need to be added to the
timetable.
It should be noted that timetabling problems can often vary a great deal between
educational institutions, and can also be subject to a wide range of additional con-
straints beyond the event-clash constraint mentioned above. Many of these will be
examined further in Chap. 9.
A third example of how graph colouring can be used to model real-world problems
arises in the scheduling of timed tasks. Imagine that a taxi firm has received n journey
bookings, each of which has a start time, signifying when the taxi will leave the depot
and a finish time telling us when the taxi is expected to return. How can we assign
all of these bookings to vehicles so that the minimum number of vehicles is used?
Figure 1.5a shows an example problem where we have ten taxi bookings. For
illustrative purposes, these have been ordered from top to bottom according to their
start times. It can be seen, for example, that booking 1 overlaps with bookings 2, 3,
and 4; hence any taxi carrying out booking 1 will not be able to serve bookings 2, 3,
and 4. We can construct a graph from this information by using one vertex for each
booking and then adding edges between any vertex pair corresponding to overlapping
bookings. A three-colouring of this example graph is shown in Fig. 1.5b, and
the corresponding assignment of the bookings to three taxis (the minimum number
possible) is shown in Fig. 1.5c.
Fig. 1.5 A set of taxi journey requests over time (a), its corresponding interval graph and three-
colouring (b), and (c) the corresponding assignment of journeys to taxis
A graph constructed from time-dependent tasks such as this is usually referred to
as an interval graph. In Chap. 3, we will show that a simple, inexpensive algorithm
exists for interval graphs that will always produce a solution using the fewest
number of colours possible.
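This booking-to-taxi assignment can be sketched as follows. The code processes the bookings in order of start time and reuses the first free taxi, opening a new taxi only when every current one is busy. The booking times below are hypothetical, since the exact times of Fig. 1.5 are not reproduced here:

```python
def assign_taxis(bookings):
    """Assign (start, finish) bookings to taxis by increasing start time,
    reusing the first taxi that is free. For interval graphs this simple
    rule uses the minimum possible number of taxis (colours)."""
    finish = []        # finish time of the current booking on each taxi
    assignment = {}    # booking index -> taxi index
    for i, (s, f) in sorted(enumerate(bookings), key=lambda x: x[1][0]):
        for t in range(len(finish)):
            if finish[t] <= s:          # taxi t has returned to the depot
                finish[t] = f
                assignment[i] = t
                break
        else:                           # all taxis are out: open a new one
            finish.append(f)
            assignment[i] = len(finish) - 1
    return assignment, len(finish)

bookings = [(0, 3), (1, 4), (2, 5), (4, 7), (5, 8), (6, 9)]
assignment, num_taxis = assign_taxis(bookings)
print(num_taxis)   # 3
```

A new taxi is only opened when all existing taxis are mid-journey, so the number of taxis used equals the maximum number of simultaneously overlapping bookings, which is exactly the minimum required.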
Our final example in this section concerns the allocation of computer code variables
to registers on a computer processor. When writing code in a particular programming
language, whether it be C++, Fortran, Java, or Pascal, programmers are free to make
use of as many variables as they see fit. When it comes to compiling this code,
however, it is advantageous for the compiler to assign these variables to registers1 on
the processor since accessing and updating values in these locations are much faster
than carrying out the same operations using the computer’s RAM or cache.
Computer processors only have a limited number of registers. For example, most
RISC processors feature 64 registers: 32 for integer values and 32 for floating-point
values. However, not all variables in a computer program will be in use (or “live”)
at a particular time. We might therefore choose to assign multiple variables to the
same register if they are not judged to interfere with one another.
Figure 1.6a shows an example piece of computer code making use of five variables,
v1 , . . . , v5 . It also shows the live ranges for each variable. So, for example, variable
v2 is live only in Steps (2) and (3), whereas v3 is live from Steps (4) to (9). It can
also be seen, for example, that the live ranges of v1 and v4 do not overlap. Hence we
might use the same register for storing both of these variables at different periods
during execution.
The problem of deciding how to assign the variables to registers can be modelled
as a graph colouring problem by using one vertex for each live range and then adding
edges between any pairs of vertices corresponding to overlapping live ranges. Such a
graph is known as an interference graph, and the task is to now colour the graph using
equal or fewer colours than the number of available registers. Figure 1.6b shows that,
in this particular case, only three registers are needed: variables v1 and v4 can be
assigned to register 1, v2 and v5 to register 2, and v3 to register 3.
Note that in the example of Fig. 1.6, the resultant interference graph corresponds
to an interval graph, rather like the taxi example from the previous subsection. Such
graphs will arise in this setting when using straight-line code sequences or when using
software pipelining. In most situations, however, the flow of a program is likely to
be far more complex, involving if-else statements, loops, goto commands, and so on.
1 Registers can be considered physical parts of a processor that are used for holding small pieces of
data.
Fig. 1.6 a An example computer program together with the live ranges of each variable. Here, the
statement “vi ← . . .” denotes the assignment of some value to variable vi , whereas “. . . vi . . .”
is just some arbitrary operation using vi . Part b shows an optimal colouring of the corresponding
interference graph
In these cases, the more complicated process of liveness analysis will be needed for
determining the live ranges of each variable. This could result in interference graphs
of any arbitrary topology (see also [1]).
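The construction just described can be sketched directly. In the snippet below, the live ranges are invented for illustration (they are not the exact ranges of Fig. 1.6); the code builds the interference graph from the ranges and then colours it greedily, with colours playing the role of registers:

```python
# Hypothetical live ranges (first step, last step) for five variables.
live = {"v1": (1, 3), "v2": (2, 3), "v3": (4, 9), "v4": (5, 8), "v5": (7, 10)}

def overlap(a, b):
    """Two closed ranges interfere if neither ends before the other starts."""
    return a[0] <= b[1] and b[0] <= a[1]

# Build the interference graph: one vertex per live range, with an edge
# between any pair of overlapping live ranges.
names = list(live)
adj = {v: [u for u in names if u != v and overlap(live[v], live[u])]
       for v in names}

# Colour the graph greedily; colours correspond to registers.
reg = {}
for v in names:
    used = {reg[u] for u in adj[v] if u in reg}
    r = 1
    while r in used:
        r += 1
    reg[v] = r

print(max(reg.values()), "registers needed")   # 3 registers needed
```

Variables whose live ranges never overlap may end up sharing a register, just as v1 and v4 do in the text's example.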
We have now seen four practical applications of the graph colouring problem. But
why exactly is it concerned with “colouring” vertices? According to legend, the
graph colouring problem was first noted in 1852 by a student of University College
London, Francis Guthrie (1831–1899), who, while colouring a map of the counties of
England, noticed that only four colours were needed to ensure that all neighbouring
counties were allocated different colours.
To show how the colouring of maps relates to the colouring of vertices in a graph,
consider the example map of the historical counties of Wales given in Fig. 1.7a.
This particular map involves 16 “regions”, including 14 counties, the sea on the left
and England bordering on the right. Figure 1.7d shows that this map can indeed be
coloured using just four colours (light grey, dark grey, black, and white). But how
does the graph colouring problem itself inform this process?
As shown in Fig. 1.7b, we begin by placing a single vertex in the centre of each
region of the map. Next, edges are drawn between any pair of vertices whose regions
are seen to share a border. Thus, for example, the vertex appearing in England on the
right will have edges drawn to the seven vertices in the seven neighbouring Welsh
counties and also to the vertex appearing in the sea on the far left. If we take care
in drawing these edges, we will always be able to draw a graph from a map so that
no pair of edges need cross one another. Technically speaking, a graph that can be
drawn with no crossing edges is known as a planar graph, of which Fig. 1.7c is an
example.
Fig. 1.7 Illustration of how graphs can be used to colour the regions of a map
Figure 1.7c illustrates how we might colour this planar graph using just four
colours. The counties corresponding to these vertices can then be allocated the same
colours in the actual map of Wales, as shown in Fig. 1.7d.
We might now ask whether we always need to use exactly four colours to suc-
cessfully colour a map. In some cases, such as a map depicting a single island region
surrounded by sea, fewer than four colours will be sufficient. On the other hand, for
the map of Wales shown in Fig. 1.7, we can deduce that exactly four colours will
be needed, (a) because a solution using four colours has already been constructed
(as shown in the figure), and (b) because a solution using three or fewer colours is
impossible. The latter point is because the planar graph in Fig. 1.7c contains a set
of four vertices that each have an edge between them (indicated by asterisks in this
figure). This tells us that different colours will be needed for each of these vertices.
The fact that, as Francis Guthrie suspected, four colours turn out to be sufficient
to colour any map (or, equivalently, four colours are sufficient to colour any planar
graph) is due to the celebrated Four Colour Theorem, which was ultimately proved
in 1976 by Kenneth Appel and Wolfgang Haken of the University of Illinois—a full
124 years after it was first conjectured (see Sect. 6.1). However, it is important to
stress at this point that the Four Colour Theorem does not apply to all graphs, but
only to planar graphs. What can we say about the number of colours that are needed
for colouring graphs that are not planar? Unfortunately, as we shall see, in these cases
we do not have the luxury of a strong result like the Four Colour Theorem.
We are now in a position to define the graph colouring problem more formally. In
graph theory, a graph G is usually defined by a pair of sets, V and E. The set V gives
the names of all vertices in the graph, whereas E defines the set of edges.
Unless stated otherwise, in this book we will restrict our discussions to simple
graphs. These are undirected graphs in which loops and multiple edges between
vertices are forbidden. Consequently, each element of E is written as an unordered
pair of vertices {u, v} indicating the existence of an edge between vertices u and v
(where u ∈ V, v ∈ V, and u ≠ v). The number of vertices in G is denoted by n,
while the number of edges is given by m.
To illustrate these ideas, the graph G = (V, E) in Fig. 1.8 has a vertex set V
containing n = 10 vertices as follows:
V = {v1, v2, v3, v4, v5, v6, v7, v8, v9, v10}.
The edge set E of this graph then contains m = 21 different edges as follows:
E = {{v1, v2}, {v1, v3}, {v1, v4}, {v1, v6}, {v1, v7}, {v2, v5},
{v3, v4}, {v3, v6}, {v3, v7}, {v4, v5}, {v4, v6}, {v4, v7},
{v4, v8}, {v5, v7}, {v5, v8}, {v5, v10}, {v6, v7}, {v6, v9},
{v7, v9}, {v8, v10}, {v9, v10}}.
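These two sets translate directly into code. A minimal Python sketch storing the graph of Fig. 1.8 as a vertex set and a set of unordered pairs:

```python
# The graph of Fig. 1.8: a vertex set V and an edge set E whose elements
# are unordered pairs, here represented as frozensets.
V = {f"v{i}" for i in range(1, 11)}
E = {frozenset(e) for e in [
    ("v1", "v2"), ("v1", "v3"), ("v1", "v4"), ("v1", "v6"), ("v1", "v7"),
    ("v2", "v5"), ("v3", "v4"), ("v3", "v6"), ("v3", "v7"), ("v4", "v5"),
    ("v4", "v6"), ("v4", "v7"), ("v4", "v8"), ("v5", "v7"), ("v5", "v8"),
    ("v5", "v10"), ("v6", "v7"), ("v6", "v9"), ("v7", "v9"), ("v8", "v10"),
    ("v9", "v10")]}

n, m = len(V), len(E)   # number of vertices and edges
print(n, m)             # 10 21
```

Using frozensets for edges makes the pairs unordered, matching the definition of a simple graph: {u, v} and {v, u} denote the same edge.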
Fig. 1.8 A simple graph with n = 10 vertices and m = 21 edges, together with a corresponding
five-colouring
Given a graph G = (V, E), relationships between vertices and edges can be
described using the following terms.
Definition 1.1 If {u, v} ∈ E, then the vertices u and v are said to be adjacent,
else they are nonadjacent. Vertices u and v are also said to be the endpoints
of the edge {u, v} ∈ E. An edge {u, v} ∈ E is also said to be incident to the
vertex u (and likewise for v).
Having gone over these basic definitions, we are now in a position to formally
state the graph colouring problem.
Definition 1.2 Given a graph G = (V, E), the graph colouring problem
involves assigning each vertex v ∈ V an integer c(v) ∈ {1, 2, . . . , k} such
that:
• c(u) ≠ c(v) for all {u, v} ∈ E; and
• k is minimal.
In this interpretation, instead of using actual colours such as grey, black, and white
to colour the vertices, we use the labels 1, 2, 3, up to k. If we have a solution in which a
vertex v is assigned to, say, colour 4, then this is written as c(v) = 4. According
to the first bullet above, pairs of vertices in G that are adjacent must be assigned to
different colours. The second bullet then states that we are seeking to minimise the
number of different colours being used.
Figure 1.8 also shows a five-colouring of our example graph. Using the above
notation, this solution can be written as follows:
c(v1) = 3, c(v2) = 2, c(v3) = 4, c(v4) = 2, c(v5) = 4,
c(v6) = 5, c(v7) = 1, c(v8) = 1, c(v9) = 2, c(v10) = 3.
We now give some further definitions that help us to describe a graph colouring
solution and its properties.
For example, the five-colouring shown in Fig. 1.8 is feasible because it is both
complete (all vertices have been allocated colours) and proper (it contains no clashes).
In this case, the chromatic number of this graph χ (G) is already known to be five,
so the colouring can also be said to be optimal.
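Feasibility is straightforward to check mechanically: a solution is proper when no edge joins two vertices of the same colour, and complete when every vertex has a colour. The following sketch verifies both for the five-colouring of Fig. 1.8:

```python
# The five-colouring of Fig. 1.8, and the graph's edge list.
c = {"v1": 3, "v2": 2, "v3": 4, "v4": 2, "v5": 4,
     "v6": 5, "v7": 1, "v8": 1, "v9": 2, "v10": 3}
E = [("v1", "v2"), ("v1", "v3"), ("v1", "v4"), ("v1", "v6"), ("v1", "v7"),
     ("v2", "v5"), ("v3", "v4"), ("v3", "v6"), ("v3", "v7"), ("v4", "v5"),
     ("v4", "v6"), ("v4", "v7"), ("v4", "v8"), ("v5", "v7"), ("v5", "v8"),
     ("v5", "v10"), ("v6", "v7"), ("v6", "v9"), ("v7", "v9"), ("v8", "v10"),
     ("v9", "v10")]

# A clash is an edge whose two endpoints share a colour.
clashes = [(u, v) for u, v in E if c[u] == c[v]]
complete = set(c) == {f"v{i}" for i in range(1, 11)}
print(len(clashes), complete)   # 0 True  -> the colouring is feasible
```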
Some further useful definitions for the graph colouring problem involve colour
classes and structures known as cliques and independent sets.
Definition 1.7 A colour class is a set containing all vertices in a graph that are
assigned to a particular colour in a solution. That is, given a particular colour
i ∈ {1, . . . , k}, a colour class is defined by the set {v ∈ V : c(v) = i}.
To illustrate these definitions, two example colour classes from Fig. 1.8 are {v2,
v4, v9} and {v7, v8}. Example independent sets from this figure include {v2, v7, v8}
and {v3, v5, v9}. The largest clique in Fig. 1.8 is {v1, v3, v4, v6, v7}, though numerous
smaller cliques also exist, such as {v6, v7, v9} and {v2, v5}.
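Both structures are easy to test for. Using the edge set of Fig. 1.8, the following sketch confirms the clique and independent set mentioned above:

```python
# Edge set of the graph in Fig. 1.8, stored as unordered pairs.
E = {frozenset(e) for e in [
    ("v1", "v2"), ("v1", "v3"), ("v1", "v4"), ("v1", "v6"), ("v1", "v7"),
    ("v2", "v5"), ("v3", "v4"), ("v3", "v6"), ("v3", "v7"), ("v4", "v5"),
    ("v4", "v6"), ("v4", "v7"), ("v4", "v8"), ("v5", "v7"), ("v5", "v8"),
    ("v5", "v10"), ("v6", "v7"), ("v6", "v9"), ("v7", "v9"), ("v8", "v10"),
    ("v9", "v10")]}

def is_clique(S):
    """True if every pair of vertices in S is adjacent."""
    return all(frozenset((u, v)) in E for u in S for v in S if u < v)

def is_independent(S):
    """True if no pair of vertices in S is adjacent."""
    return not any(frozenset((u, v)) in E for u in S for v in S if u < v)

print(is_clique({"v1", "v3", "v4", "v6", "v7"}))   # True
print(is_independent({"v3", "v5", "v9"}))          # True
```

Note that a colour class in a proper colouring is always an independent set, which is why the two checks above are mirror images of one another.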
Given the above definitions, it is also useful to view graph colouring as a type of
partitioning problem, where a solution S is represented by a set of k colour classes
S = {S1, S2, . . . , Sk}.
This book focuses on the problem of colouring vertices in graphs. Four different
examples of this have already been discussed in this chapter. Sometimes the term
“graph colouring” can also be used for the task of colouring the edges of a graph
or the faces of a graph. However, as we will see in Chap. 6, edge and face colour-
ing problems can easily be transformed into equivalent vertex-colouring problems
using the concepts of line graphs and dual graphs, respectively. Consequently, unless
explicitly stated otherwise, the term “graph colouring” in this book refers exclusively
to vertex colouring.
This book is primarily concerned with algorithmic aspects of graph colouring.
It focuses particularly on the characteristics of different heuristics for this problem
and seeks answers to the following questions. Do these algorithms provide optimal
solutions to some graphs? How do they perform on different topologies where the
chromatic number is unknown? Why are some algorithms better for some types of
graphs, but worse for others? What are the run time characteristics of these algo-
rithms?
To help answer such questions, Chap. 2 of this book provides an in-depth treat-
ment of algorithm complexity and NP-completeness theory. This will help us to
understand how algorithm efficiency can be gauged, and also see why graph colouring
is such a difficult problem to solve to optimality. Chapters 3–5 then describe a number
of different algorithms for graph colouring, ranging from simple constructive heuris-
tics to complex metaheuristic- and mathematical programming-based techniques.
Another central aim of this book is to examine many of the real-world operational
research problems that can be tackled using graph colouring techniques. As we will
see, these include problems as diverse as the colouring of maps, the production of
round-robin tournaments, solving Sudoku puzzles, assigning variables to computer
registers, and checking for short circuits on circuit boards. These topics are
considered in Chap. 6. Chapters 7–9 also give in-depth treatments of the problems
of designing seating plans, scheduling fixtures for sports leagues, and timetabling
lectures at universities.
All of the algorithms discussed in this book can be downloaded from:
http://rhydlewis.eu/resources/gCol.zip
Each algorithm in this resource is described and analysed in this book. They are all
written in C++ and can be executed from the command line using common input
and output protocols. A user manual and compilation instructions are provided in
Appendix A. Readers are encouraged to make use of these algorithms on their graph
colouring instances and are invited to modify the code in any way they see fit.
As we will see, when gauging the effectiveness of a graph colouring algorithm
(or any algorithm for that matter) it is important to consider the amount of compu-
tational effort required to produce a solution of a given quality. Ideally, we should
try to steer clear of measures such as wall-clock time or CPU time because these
are largely influenced by the chosen hardware, operating systems, programming
languages and compiler options. A more rigorous approach involves examining the
number of atomic operations performed by an algorithm during execution. For clas-
sical computational problems such as searching or sorting the elements of a vector,
these are usually considered to be the constant-time operations of comparing two
elements and swapping two elements. For graph colouring algorithms, it is useful
to follow this scheme by gauging computational effort via the number of constraint
checks that are performed. Essentially, a constraint check occurs whenever an algo-
rithm requests some information about a graph, such as whether two vertices are
adjacent or not. We will define these operations presently, though it is first necessary
to describe how graphs are to be stored in computer memory.
An example adjacency list for a ten-vertex graph is shown in Fig. 1.9. The length
of an element, |Adjv|, tells us the number of vertices that are adjacent to a vertex v.
This is usually known as the degree of a vertex (see Definition 3.1). So, for example,
vertex v2 in this graph is seen to be adjacent to vertices v1 and v5 and therefore has
a degree of 2. Note that the sum of all list lengths, Σv∈V |Adjv| = 2m, since if v
appears in a vertex u's adjacency list, then u will also appear in v's adjacency list.
In algorithm implementations, adjacency lists are useful when we are interested
in identifying all vertices that are adjacent to a particular vertex v. On the other hand,
they are less useful when we want to quickly answer the question “are vertices u and
v adjacent?”, as to do so would require searching through either Adju or Adjv . For
these situations, it is preferable to use an adjacency matrix.
An example adjacency matrix is also provided in Fig. 1.9. When considering graph
colouring problems, note that edges are not directed, and graphs cannot contain loops.
Consequently, A is symmetric (Aij = Aji) and has zeros along its main diagonal
(Aii = 0).
When implemented, adjacency matrices require memory for storing n² pieces of
information, regardless of the number of edges in the graph. Consequently, they can
be quite bulky, particularly for sparse graphs.
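The two representations and the properties above can be illustrated with a small hypothetical graph (not the one in Fig. 1.9):

```python
# A hypothetical five-vertex graph given as an adjacency list (0-indexed),
# with edges 0-1, 0-2, 1-3, 2-3, and 3-4.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
n = len(adj)

# Build the equivalent n-by-n adjacency matrix.
A = [[0] * n for _ in range(n)]
for v, neighbours in adj.items():
    for u in neighbours:
        A[v][u] = 1

# The matrix answers "are u and v adjacent?" with a single lookup ...
assert A[1][3] == 1 and A[0][4] == 0
# ... and, since the graph is simple, it is symmetric with a zero diagonal.
assert all(A[i][j] == A[j][i] for i in range(n) for j in range(n))
assert all(A[i][i] == 0 for i in range(n))

# The list lengths sum to 2m, because each edge appears in two lists.
m = sum(len(ns) for ns in adj.values()) // 2
print(m)   # 5
```

By contrast, answering the same adjacency question from the lists would mean scanning Adj1 or Adj3, which is why the matrix is preferred for that query despite its n² memory cost.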
Finally, as we will see in Chap. 5, our graph colouring algorithms will often
attempt to improve solutions by changing the colours of certain vertices during a
run.
Fig. 1.9 A ten-vertex graph (a), its adjacency list (b), and its adjacency matrix (c)
In this book, constraint checks are counted according to the following conventions.
1. The task of checking whether two vertices u and v are adjacent is performed using
the adjacency matrix A. Accessing element Auv counts as one constraint check.
2. The task of going through all vertices adjacent to a vertex v involves accessing
all elements of the list Adjv . This counts as |Adjv | constraint checks.
3. Determining the degree of a vertex v involves looking up the value |Adjv |. This
counts as one constraint check.
4. Determining the number of vertices in colour class Si ∈ S that is adjacent to a
particular vertex v involves accessing element Cvi . This counts as one constraint
check.
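These conventions can be sketched as a small wrapper class that tallies constraint checks as it answers queries. This is a hypothetical helper for illustration, not part of the book's C++ suite, and the colour-class matrix C of convention 4 is omitted for brevity:

```python
class CountingGraph:
    """Answers graph queries while tallying constraint checks, following
    the counting conventions described above (illustrative sketch only)."""
    def __init__(self, adj):
        self.adj = adj                          # adjacency lists
        n = len(adj)
        self.A = [[0] * n for _ in range(n)]    # adjacency matrix
        for v, ns in adj.items():
            for u in ns:
                self.A[v][u] = 1
        self.checks = 0

    def adjacent(self, u, v):
        # Convention 1: one matrix access = one constraint check.
        self.checks += 1
        return self.A[u][v] == 1

    def neighbours(self, v):
        # Convention 2: going through Adj_v costs |Adj_v| checks.
        self.checks += len(self.adj[v])
        return self.adj[v]

    def degree(self, v):
        # Convention 3: looking up |Adj_v| costs one check.
        self.checks += 1
        return len(self.adj[v])

G = CountingGraph({0: [1, 2], 1: [0], 2: [0]})   # edges 0-1 and 0-2
G.adjacent(1, 2)    # 1 check
G.neighbours(0)     # 2 checks
G.degree(2)         # 1 check
print(G.checks)     # 4
```

Counting operations this way lets two implementations be compared independently of hardware, language, or compiler.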
While many of the algorithms featured in this book are described within the main
text, others are more conveniently defined using pseudocode. The benefit of pseu-
docode is that it enables readers to concentrate on the algorithmic process without
worrying about the syntactic details of any particular programming language. Our
pseudocode style is based on that of the well-known textbook Introduction to Algo-
rithms by Cormen et al. [2]. This particular pseudocode style makes use of all the
usual programming constructs such as while-loops, for-loops, if-else statements,
break statements, and so on, with indentation being used to indicate their scope. To
avoid confusion, different symbols are also used for assignment and equality opera-
tors. For assignment, a left arrow (←) is used. So, for example, the statement x ← 10
should be read as “x becomes equal to 10”, or “let x be equal to 10”. On the other
hand, an equals symbol is used only for equality testing; hence a statement such as
x = 10 will only evaluate to true or false (x is either equal to 10, or it is not).
All other notation used within this book is defined when the necessary concepts
arise. Throughout the text, the notation G = (V, E) is used to denote a graph G
comprising a “vertex set” V and an “edge set” E. The number of vertices and edges
in a graph is denoted by n and m, respectively. The colour of a particular vertex
v ∈ V is written c(v), while a candidate solution to a graph colouring problem is
usually defined as a partition of the vertices into k subsets S = {S1 , S2 , . . . , Sk }.
In this introductory chapter, we have defined the graph colouring problem and
provided several examples of its practical applications. In the next chapter, we will
consider the reasons why graph colouring should be considered an "intractable"
problem. This characteristic implies that we are very unlikely to be able to create
an algorithm that can find an optimal solution to an arbitrary graph in reasonable
time frames. Heuristic algorithms for graph colouring will be considered in Chap. 3
onwards.
References
1. Chaitin G (2004) Register allocation and spilling via graph coloring. SIGPLAN Not. 39(4):66–74
2. Cormen T, Leiserson C, Rivest R, Stein C (2009) Introduction to algorithms, 3rd edn. The MIT
Press
2 Problem Complexity
The previous chapter defined the graph colouring problem and gave some examples
of practical applications. A question that we should now ask is “What algorithm can
be used to solve this problem?” Here we use the word “solve” in the strong sense,
in that an algorithm solves a problem only if it can take any problem instance and
always return an optimal solution. For the graph colouring problem, this involves
taking any graph G and returning a feasible solution using exactly χ (G) colours.
Algorithms that solve a problem in this way are known as exact algorithms.
In this chapter, we will see that an exact algorithm with acceptable run times
almost certainly does not exist for the graph colouring problem. Graph colouring
can, therefore, be considered a type of “intractable” problem that will usually need
to be tackled using inexact algorithms. To reach this conclusion, this chapter begins
by first providing an overview of how algorithm time requirements are measured
(Sect. 2.1). In Sect. 2.2, we then consider an intuitive algorithm for graph colouring
that can find the optimal solution by going through every possible assignment of
colours to vertices. While this algorithm is indeed exact, we will see that it is far too
slow to be of any practical use.
In Sect. 2.3 we define the notion of problem intractability more rigorously. In
particular, we consider the contrasting classes of polynomially solvable problems
and NP-complete problems. Several examples are then introduced to help describe
the concepts of P and NP, polynomial-time reductions, and Boolean satisfiability.
Section 2.4 then uses these concepts to prove that graph colouring is NP-complete.
Despite being NP-complete in the general case, there are still certain conditions
that, when satisfied, can make graph colouring an easy problem to solve. Section 2.5
surveys some of these. Section 2.6 then summarises this chapter and makes sugges-
tions for further reading.
From an analytical point of view, it is usually enough to only consider the first of these
factors. The complexity of an algorithm will then usually be measured in terms of the
number of “basic operations” that it needs to carry out to solve the given problem.
These basic operations are constant-time actions that are deemed necessary for the
problem at hand and can include things such as memory lookups, comparisons and
swaps. For graph colouring, these basic operations are the constraint checks described
in Sect. 1.4.1.
When analysing an algorithm’s complexity, it is usual to consider its worst-case
run times. That is, we are interested in determining the maximum number of basic
operations that will need to be performed for input of a particular size. Using the
worst case might seem rather pessimistic, but is useful for the following reasons.
• It gives us an upper bound on the amount of time that the algorithm will require.
• The worst case will often occur. For example, consider the problem of searching
for a number x in an unordered list of n integers. If x is not present in the list, we
will still need to go through the entire list (and perform n comparisons) to confirm
this.
• For many problems, worst-case run times are equal to the average case and the best
case. For example, in the task of calculating the maximum value in an unordered
list of n numbers, n − 1 comparisons are required in all cases.
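To make this operation counting concrete, here is a minimal sketch (the function name is ours) that finds the maximum of a list while tallying comparisons; the count is n − 1 for every input of length n:

```python
def maximum(values):
    """Return the largest element of a non-empty list and the number of comparisons used."""
    comparisons = 0
    best = values[0]
    for v in values[1:]:
        comparisons += 1  # one basic operation per remaining element
        if v > best:
            best = v
    return best, comparisons

print(maximum([3, 1, 4, 1, 5, 9, 2, 6]))  # (9, 7)
```

Whether the list is sorted, reversed, or shuffled, the comparison count is always n − 1, so the worst, average, and best cases coincide.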
Generally, we are not interested in the exact number of operations that are per-
formed by an algorithm; instead, we are concerned with the algorithm’s order of
growth with respect to problem size. For example, if the problem size is doubled,
does the number of operations performed by the algorithm stay the same? Does it
double? Or is it something else?
The worst-case complexity of an algorithm is therefore usually expressed using
what we call “big O” notation. This is formally defined as follows.
Formally, f(n) = O(g(n)) if positive constants c and n0 exist such that f(n) ≤ c · g(n)
for all n ≥ n0. Another way of expressing this definition is to say that "f(n) tends to g(n) as n
approaches infinity". In practice, this allows us to drop any multiplicative constants
and lower-order terms from a function. In analysing algorithm complexity, we can
therefore say things like the following.
• If, for an input of size n, an algorithm needs to perform a maximum of 6n^4 + 2n^2 + 53
operations, then the complexity of this algorithm is simply O(n^4).
• If an algorithm performs a maximum of 4n^2 − 2n + 2 operations, then its complexity
is O(n^2).
• If an algorithm performs a maximum of 100n operations, then its complexity is
O(n).
[Figure: curves for the growth rates O(lg n), O(n), O(n lg n), O(n^2), O(2^n), and O(n!); x-axis: n, y-axis: run time (number of operations)]
Fig. 2.1 Comparison of various growth rates according to increases in problem size n
Let us now consider the growth rate of one very simple (though ultimately foolhardy)
algorithm for solving the graph colouring problem. The idea behind this method will
be to define a solution space of candidate solutions, where each candidate solution
specifies a different assignment of colours to vertices. Our task will then be to go
through all members of this space and return the best among them; that is, the
algorithm will return the candidate solution that is both feasible and seen to be using
the fewest colours. Such a method is indeed guaranteed to return an optimal solution
and is therefore exact. But will it be useful in practice?
If we were to follow such an approach, we would first need to decide the maximum
number of colours that any solution might need. In practice, this could be estimated
as some value between one and n, where n is the number of vertices in our graph.
For now though, we will use the maximum number of colours n, since no feasible
solution will ever require more colours than vertices.
Consider now the number of candidate solutions within this solution space. Since
there would be n choices of colour for each of the n vertices, this gives a solution space
containing a total of n^n candidate solutions, each of which will need to be evaluated. Our
algorithm will therefore feature an exponential complexity of O(n^n). This growth rate
increases rapidly with regard to n, meaning that the number of candidate solutions
to be checked will quickly become too large for even the most powerful computer
to tackle. To illustrate, a graph with n = 50 vertices would lead to over 50^50 ≈
8.8 × 10^84 different assignments: a truly astronomical number. This would make the
task of creating and checking all of these assignments, even for this modestly sized
problem, far beyond the computing power of all of the world's computers combined.
(For comparison's sake, the number of atoms in the observable universe is thought
to be in the region of 10^80.)
Rather than checking every assignment of colours to vertices, we might instead enumerate
only the partitions of the vertex set into nonempty subsets (the number of which is given
by the nth Bell number), and then find the solution amongst these that uses the smallest number of independent
sets. Unfortunately, however, the growth rate of the Bell numbers is still exponential at
O(n^n), so such a method will still be infeasible for non-trivial values of n.
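The brute-force scheme described above (trying every possible assignment of n colours to n vertices) can be sketched as follows. This is a toy illustration under our own naming, usable only for the tiniest of graphs:

```python
from itertools import product

def exhaustive_colouring(n, edges):
    """Try all n**n assignments of colours {0, ..., n-1} to the n vertices;
    return a feasible assignment using the fewest colours."""
    best = None
    for assignment in product(range(n), repeat=n):
        if all(assignment[u] != assignment[v] for u, v in edges):  # feasibility check
            if best is None or len(set(assignment)) < len(set(best)):
                best = assignment
    return best

# A 4-vertex cycle: 4**4 = 256 candidate solutions, chromatic number 2.
colouring = exhaustive_colouring(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
print(len(set(colouring)))  # 2
```

Even at n = 20 this loop would already require more than 10^26 iterations, in line with the growth rates discussed above.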
Alternatively, we might choose to limit the number of available colours to some
value k < n and then seek to partition the vertices into k colour classes. The number
of ways of partitioning n items into exactly k nonempty subsets is given by the Stirling
numbers of the second kind. These are denoted by S(n, k) and can be calculated by the
formula:

S(n, k) = (1/k!) Σ_{i=0}^{k} (−1)^i C(k, i) (k − i)^n.    (2.1)
So, for instance, the number of ways of partitioning three items into exactly two
nonempty subsets is S(3, 2) = 3, because we have three different options:
{{v1}, {v2, v3}},
{{v1, v3}, {v2}}, and
{{v1, v2}, {v3}}.
Note that summing Stirling numbers of the second kind for all values of k from 1 to
n leads to the nth Bell number:

Bn = Σ_{k=1}^{n} S(n, k).    (2.2)
We might now choose to employ an enumeration algorithm that starts by considering
k = 1 colour. At each step, the algorithm then simply needs to check all S(n, k) possible
partitions to see if any correspond to a feasible solution (a k-colouring). If such a
solution is found, the algorithm can halt immediately with the knowledge that an
optimal solution has been found. Otherwise, k can be incremented by 1 and the
process should be repeated. Ultimately such an algorithm will need to consider a
maximum of

Σ_{k=1}^{χ(G)} S(n, k)    (2.3)

candidate solutions.
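Equations (2.1) and (2.2) translate directly into code. The following sketch (function names are ours) uses Python's exact integer arithmetic, which sidesteps any rounding issues:

```python
from math import comb, factorial

def stirling2(n, k):
    """Stirling number of the second kind S(n, k), computed via Eq. (2.1)."""
    total = sum((-1) ** i * comb(k, i) * (k - i) ** n for i in range(k + 1))
    return total // factorial(k)  # the sum is always divisible by k!

def bell(n):
    """The nth Bell number, computed via Eq. (2.2)."""
    return sum(stirling2(n, k) for k in range(1, n + 1))

print(stirling2(3, 2))    # 3  (the partition example above)
print(bell(3))            # 5
print(stirling2(50, 10))  # roughly 2.6 * 10**43
```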
But is such an approach useful in any practical sense? Unfortunately not. Even
though our original solution space size of n^n has been reduced, Stirling numbers of
the second kind still exhibit an exponential growth rate of O(k^n). As an example, if
we were seeking to produce and examine all partitions of 50 items into 10 subsets,
which is again quite a modestly sized graph colouring problem, this would lead to
S(50, 10) ≈ 2.6 × 10^43 candidate solutions. Such a figure is still well beyond the reach
of any existing computer.
The discussions in the previous subsection have demonstrated that a graph colouring
algorithm based on enumerating and checking all possible candidate solutions is not
sensible because, on anything except trivial problem instances, its execution will
simply take too long. However, the exponential growth rate of the solution space
is not the sole reason why the graph colouring problem is so troublesome since
many “easy to solve” problems also feature similarly large solution spaces. As an
example, consider the computational problem of sorting a collection of integers into
ascending order. Given a set of n unique integers, there are a total of n! different
ways of arranging these, and only one of these “candidate solutions” will give us
the required answer. However, it would be foolish and unnecessary to employ an
algorithm that went about checking all of the n! possibilities, because a multitude
of polynomial-time algorithms are available for the sorting problem, including the
O(n lg n) Merge Sort and Heap Sort algorithms. In this sense, we can say that the
sorting problem is “solvable in O(n lg n) time”.
To isolate just what it is that makes a problem "intractable", it is, therefore, necessary to identify features beyond the number of possible candidate solutions. These
features will be examined in the remainder of this subsection. To aid our discussions, we will consider five additional computational problems on graphs. These are
defined as follows.
2.3.1 P and NP
For the time being, we will restrict our discussions to so-called “decision problems”.
These are computational problems that have only two outcomes, denoted by “yes”
and “no” (or, if we prefer, “true” and “false”). Although this might sound rather
Fig. 2.2 Illustration of the problems given in Definition 2.2. Part a demonstrates the minimum
distance problem, using bold edges to indicate a shortest path between vertices u and v. In this
case, the minimum distance between u and v is three (edges). Part b illustrates the maximum
clique problem. The shaded vertices indicate the largest clique in this graph. Part c demonstrates
the Hamiltonian cycle problem, using bold edges to indicate that a Hamiltonian cycle does indeed exist
in this graph. Part d illustrates the maximum independent set problem. The largest independent
set in this graph is shown by the shaded vertices. Finally, Part e gives an example of the travelling
salesman problem. Edge weights are shown in the table. The bold edges in the graph give an optimal
solution for this graph, which has an edge-weight total of nine
Definition 2.3 Consider a graph G = (V, E). The following are all decision
problems.
Definition 2.4 Consider a graph G = (V, E). The decision problem k-COL
asks: Given an integer k, can a feasible colouring of G be achieved using k or
fewer colours?
If P ≠ NP, then the differences between these two classes of problem are important.
Indeed, all decision problems in P can be solved in polynomial time, whereas those
in the set NP − P cannot. But how can we show that a problem belongs to the
set NP − P? One convenient way is to use what are known as polynomial-time
reductions.
Figure 2.3 shows two simple examples of polynomial-time reductions using prob-
lems from Definition 2.3. The first of these illustrates how any instance of HAM-
CYCLE can be converted into an instance of the TSP, therefore showing that
HAM-CYCLE ∝ TSP. Consequently, a graph G will have a Hamiltonian cycle
(and be answerable with "yes") if and only if the transformed graph G′ has a TSP solution with a total edge
weight of zero.
What does this relationship tell us? Suppose that an algorithm A can solve any
instance of the TSP. This implies that we could also use A to solve any instance of
HAM-CYCLE by simply applying the illustrated polynomial-time reduction first.
Moreover, if algorithm A was known to operate in polynomial time (implying that
TSP ∈ P), then this would also mean that HAM-CYCLE ∈ P, since a polynomial-time algorithm for HAM-CYCLE would involve simply applying the polynomial-time reduction and then executing A. (In fact, we will see in the next subsection that
neither of these problems is actually in P.)
The second example of Fig. 2.3 shows how instances of k-CLIQUE can be con-
verted into instances of k-I-SET by taking the complement of the graph. Conse-
quently, k-CLIQUE ∝ k-I-SET. This process can also be carried out in reverse
here, meaning that k-I-SET ∝ k-CLIQUE. In this case, we can therefore say that
k-CLIQUE and k-I-SET are polynomially equivalent.
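The complement construction behind this equivalence is easy to state in code. A small sketch (our own helper names, with vertices numbered 0 to n − 1):

```python
from itertools import combinations

def complement(n, edges):
    """Return the edge list of the complement of G = ({0, ..., n-1}, edges)."""
    present = {frozenset(e) for e in edges}
    return [p for p in combinations(range(n), 2) if frozenset(p) not in present]

def is_clique(vertices, edges):
    """True if every pair of the given vertices is joined by an edge."""
    present = {frozenset(e) for e in edges}
    return all(frozenset(p) in present for p in combinations(vertices, 2))

def is_independent_set(vertices, edges):
    """True if no edge joins two of the given vertices."""
    return not any(u in vertices and v in vertices for u, v in edges)

# A triangle {0, 1, 2} plus a pendant vertex 3.
g = [(0, 1), (0, 2), (1, 2), (2, 3)]
g_bar = complement(4, g)
print(is_clique({0, 1, 2}, g))               # True
print(is_independent_set({0, 1, 2}, g_bar))  # True
```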
Fig. 2.3 Two examples of polynomial-time reductions on a graph G = (V, E). Part a shows how
HAM-CYCLE can be transformed into TSP. This involves creating a new graph G′ = (V, E′) with
edge weights of w({u, v}) = 0 if {u, v} ∈ E and w({u, v}) = 1 if {u, v} ∉ E. A TSP solution for
G′ with a total edge weight of zero corresponds to a Hamiltonian cycle, as indicated on the right.
Part b shows how k-CLIQUE can be transformed into a corresponding instance of k-I-SET. This
involves constructing the complement graph Ḡ of G (that is, two vertices are adjacent in Ḡ if and
only if they are not adjacent in G). An independent set of k vertices in Ḡ (as shown on the right)
corresponds to a clique of size k in G
2.3.3 NP-Completeness
We now turn our attention towards the set of so-called NP-complete decision problems. When a problem is NP-complete it is a member of NP and so candidate solutions can be verified in polynomial time. However, there is no known way of solving
the problem in polynomial time. Instead, the time required to solve an NP-complete
problem grows exponentially with regard to problem size. In fact, Problems 2 to 5
from Definition 2.3 all turn out to be NP-complete.
A formal definition of NP-completeness can be made using polynomial-time
reductions.
Fig. 2.4 Truth tables for the AND, OR, and NOT operators
the unary NOT operator, denoted by ¬. The truth tables for these operators are shown
in Fig. 2.4.
To exemplify these ideas, consider the following small Boolean formula
x1 ∧ ¬x2 .
This formula is satisfiable because an assignment of, say, true to x1 and false to x2
leads to
T ∧ ¬F = T ∧ T = T
as required. On the other hand, the formula
x1 ∧ ¬x1
is unsatisfiable, because all possible assignments to the variables eventually evaluate
to false. Specifically,
T ∧ ¬T = T ∧ F = F
and
F ∧ ¬F = F ∧ T = F.
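For small formulas, satisfiability can be decided by simply trying every truth assignment, mirroring the reasoning above. A sketch (our own interface: a formula is any function of a tuple of Booleans):

```python
from itertools import product

def is_satisfiable(formula, num_vars):
    """Brute-force SAT check over all 2**num_vars truth assignments."""
    return any(formula(a) for a in product([False, True], repeat=num_vars))

print(is_satisfiable(lambda a: a[0] and not a[1], 2))  # True  (x1 AND NOT x2)
print(is_satisfiable(lambda a: a[0] and not a[0], 1))  # False (x1 AND NOT x1)
```

Of course, this check itself takes exponential time, so it says nothing about whether SAT belongs to P.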
SAT was the first decision problem to be proven NP-complete. This is due to
the seminal work of Cook [2], who showed how any decision problem in NP can
be converted in polynomial time to a corresponding instance of SAT such that the
instance of SAT will be satisfiable if and only if the original decision problem is
answerable with "yes". This allows us to consider SAT to be the "hardest" problem
in NP since, if it were solvable in polynomial time, we would be able to solve all
problems in NP in polynomial time.1
When studying SAT, it is often useful to state all Boolean formulas in 3-conjunctive
normal form.
It is known that any Boolean formula can be converted into 3-CNF in polynomial
time such that the original formula is satisfiable if and only if the corresponding
3-CNF formula is satisfiable. For example, the Boolean formula
(x1 ∧ x2 ∧ ¬x3 ) ∨ (¬x1 ∧ x4 ) ∨ x5
1 This result is sometimes referred to as the Cook–Levin theorem. Cook’s proof involves using a
mathematical model of computing known as a deterministic Turing machine, the details of which
are given in the original manuscript.
We are now in a position to show that k-COL—the decision variant of the graph
colouring problem—is NP-complete. We begin by showing that this is true even
when limited to just three colours. This result is then generalised to all values of
k ≥ 3.
Fig. 2.5 a A partially constructed graph for showing 3-CNF-SAT ∝ 3-COL; b The OR-gadget
graph
1. If, in a three-colouring of an OR-gadget, the input vertices (x, y, and z) are all
coloured black (false), then the output vertex must also be coloured black.
2. If, in a three-colouring of an OR-gadget, any of the input vertices are coloured
white (true), then there exists a three-colouring where the output vertex is also
coloured white.
To finish the construction of G we now add an OR-gadget graph for each clause Ci
in φ, using the appropriate vertices from {v1 , v¯1 , . . . , vn , v¯n } as the inputs. We then
Fig. 2.6 Graph constructed from the 3-CNF Boolean formula φ = (x1 ∨ ¬x2 ∨ x3) ∧ (x2 ∨ x4 ∨ ¬x5).
Part b shows a feasible three-colouring of this graph, using white for true and black for false.
This colouring gives x1 = T, x2 = T, x3 = F, x4 = F, and x5 = T and therefore gives
(T ∨ ¬T ∨ F) ∧ (T ∨ F ∨ ¬T) = (T ∨ F ∨ F) ∧ (T ∨ F ∨ F) = T ∧ T = T. Hence φ is
satisfiable as expected
connect the output vertices of these gadgets to the B and F vertices. A full example
of this construction is shown in Fig. 2.6.
We now need to show that φ is satisfiable if and only if a feasible three-colouring
of G is possible. First, suppose that φ is satisfiable. This means that in each clause
Ci = (x ∨ y ∨ z), at least one of x, y, or z must be true; consequently, at least one of
the input vertices of the corresponding OR-gadget in G should be white. According
to Property 2 above, this implies that the output vertex of this gadget can also be
white. Since this output vertex is adjacent to the F and B vertices (coloured black
and grey respectively), this three-colouring will be feasible.
Now suppose that G is three-colourable. If vi is coloured white, then xi is set to
true, else xi is set to false. This gives a legal truth assignment. Now consider the
clause Ci = (x ∨ y ∨ z). It cannot be that all input vertices to the corresponding OR-
gadget are black (false) because, according to Property 1, this would force the output
vertex to also be black. However, since this output vertex is adjacent to the B and
F vertices, this would lead to an improper colouring, contradicting the assumption
that a feasible three-colouring of G exists.
Having shown that 3-COL is NP-complete, it is now just a little more work to
generalise this result to k-COL.
To summarise these proofs, what we have done is first shown that any instance
φ of 3-CNF-SAT can be converted in polynomial time to a graph G that is three-
colourable if and only if φ is satisfiable. Consequently, we can say that 3-COL is
at least as hard as 3-CNF-SAT. We have then shown how this result can also be
extended to other values of k ≥ 3. Again, this can be interpreted as saying that
k-COL is at least as hard as 3-COL. These results imply that in the unlikely event
that we were to discover an algorithm for solving k-COL in polynomial time, this
would also allow us to solve both 3-CNF-SAT and SAT in polynomial time.
Similar proofs to that of Theorem 2.2 have also been used to show that 3-CNF-SAT ∝
HAM-CYCLE and that 3-CNF-SAT ∝ k-CLIQUE [3]. The polynomial-time reduction
methods described in Fig. 2.3 also demonstrate that HAM-CYCLE ∝ TSP and that
k-CLIQUE ∝ k-I-SET. These results allow us to conclude that Problems 2 to 4
from Definition 2.3 are all N P -complete. A summary of how these results have
been derived is shown in Fig. 2.7.
Complete graphs with n vertices, denoted by Kn, are graphs that feature an edge
between every pair of vertices, giving a set E of m = C(n, 2) = n(n − 1)/2 edges.
Because all vertices in the complete graph are mutually adjacent, each vertex
must be assigned its own individual colour. Hence the chromatic
number of a complete graph is χ(Kn) = n. Example optimal solutions for K1 to K5
are shown in Fig. 2.9.
Fig. 2.8 Optimal colourings of (from left to right) an arbitrary bipartite graph, a tree, and a star
graph
Fig. 2.9 Optimal colourings of the complete graphs (from left to right) K1, K2, K3, K4 and K5
Fig. 2.10 Optimal colourings of a cycle graphs C3, C4, C8, and C9, and b wheel graphs W4, W5,
W9, and W10
both the (n −1)th vertex (coloured grey), and the first vertex (coloured white). Hence
a third colour will be required.
Wheel graphs with n vertices, denoted by Wn , are obtained from the cycle
graph Cn−1 by adding a single extra vertex vn together with the additional edges
{v1 , vn }, {v2 , vn }, . . . , {vn−1 , vn }. Example wheel graphs are shown in Fig. 2.10b. It
is clear that similar results to cycle graphs can be stated for wheel graphs. Specifi-
cally, when n is odd, three colours will be required to colour Wn because the graph
will be composed of the even cycle Cn−1 , requiring two colours, and the additional
vertex vn which, being adjacent to all vertices in Cn−1 , will require a third colour.
Similarly, when n is even, χ (Wn ) = 4 because the graph will be composed of the
odd cycle Cn−1 , requiring three colours, together with vertex vn , which will require
a fourth colour.
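These parity arguments can be checked directly. The sketch below (our own construction, with rim vertices numbered 0 to n − 2 and the hub added last) colours a cycle by alternating two colours, adds a third for the final vertex of an odd cycle, and then gives the wheel's hub a fresh colour:

```python
def colour_cycle(n):
    """Return a proper colouring of the cycle C_n (n >= 3)."""
    colours = [i % 2 for i in range(n)]  # alternate 0, 1, 0, 1, ...
    if n % 2 == 1:
        colours[-1] = 2  # the last vertex of an odd cycle needs a third colour
    return colours

def colour_wheel(n):
    """Return a proper colouring of the wheel W_n: colour C_{n-1}, then the hub."""
    rim = colour_cycle(n - 1)
    return rim + [max(rim) + 1]  # the hub is adjacent to every rim vertex

print(len(set(colour_cycle(8))))   # 2: even cycle
print(len(set(colour_cycle(9))))   # 3: odd cycle
print(len(set(colour_wheel(9))))   # 3: odd n, even rim
print(len(set(colour_wheel(10))))  # 4: even n, odd rim
```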
From the illustrations in Fig. 2.10 we see that cycle graphs and wheel graphs
(of any size) are both particular cases of planar graphs, in that they can be drawn
on a two-dimensional plane without any of the edges crossing. This fits with the
Four Colour Theorem, which states that if a graph is planar then it can be feasibly
coloured using four or fewer colours (see Sects. 1.2 and 6.1). However, the Four
Colour Theorem does not imply that if a graph is four-colourable then it must also
be planar, as the next example illustrates.
Grid graphs can be formed by placing all vertices in a lattice formation on a two-
dimensional plane. In a sparse grid graph, each vertex is adjacent to four vertices:
the vertex above it, the vertex below it, the vertex to the right, and the vertex to the
left (see Fig. 2.11a). For a dense grid graph, a similar pattern is used, but vertices are
also adjacent to vertices on their surrounding diagonals (Fig. 2.11b).
A practical application of such graphs occurs in the arrangement of seats in exam
venues. Imagine a large examination venue where the desks have been placed in a
grid formation. In such cases, we might want to avoid instances of students copying
from each other by making sure that each student is always seated next to students
taking different exams. What is the minimum number of exams that can take place in
the venue? This problem can be posed as a graph colouring problem by representing
each desk as a vertex, with edges representing pairs of desks that are close enough
for students to copy from.
If it is assumed that students can only copy from students seated in front, behind,
to their left, or to their right, then we get the sparse grid graph shown in Fig. 2.11a.
This graph is a type of bipartite graph since it can be coloured using just two colours
according to the pattern shown. Hence a minimum of two exams can take place in
this venue at any one time.
In circumstances where students can copy from students sitting on any of the
eight desks surrounding them, we get the dense grid graph shown in Fig. 2.11b. As
illustrated, this grid can be coloured using four colours according to the pattern
shown. In this graph, each vertex, together with the vertex above, the vertex on the
Fig. 2.11 Optimal colourings of a a sparse grid graph and b a dense grid graph
right, and the vertex on the upper right diagonal, forms a clique of size four. Hence we
can conclude that a feasible colouring using fewer than four colours is not possible.
The dense grid graph also provides a simple example of a graph that is nonplanar
but is still four-colourable. Although cliques of size 4 are themselves planar, the way
in which the various cliques interlock in this example means that some edges will
always need to cross one another. This illustrates that a graph does not need to be
planar for it to be colourable with four or fewer colours.
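Both grid patterns can be generated from a vertex's row and column alone. The sketch below (the coordinate scheme is ours) produces the two colourings and verifies them on a small lattice:

```python
def sparse_grid_colour(row, col):
    """Checkerboard pattern: two colours suffice when only N/S/E/W cells conflict."""
    return (row + col) % 2

def dense_grid_colour(row, col):
    """Four-colour pattern that also separates the diagonal neighbours."""
    return (row % 2) + 2 * (col % 2)

# Verify on a 6x6 grid that no two neighbouring cells share a colour.
orthogonal = [(0, 1), (1, 0)]
with_diagonals = orthogonal + [(1, 1), (1, -1)]
for colour, deltas in [(sparse_grid_colour, orthogonal), (dense_grid_colour, with_diagonals)]:
    for r in range(6):
        for c in range(6):
            for dr, dc in deltas:
                if 0 <= r + dr < 6 and 0 <= c + dc < 6:
                    assert colour(r, c) != colour(r + dr, c + dc)
print("both patterns are proper")
```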
This chapter has looked at the issues surrounding the intractable nature of various
computational problems, including graph colouring. Much of this chapter's content
is due to the work of Cook [2], who first identified the existence of NP-complete
problems, and Karp [3], who used polynomial-time reductions to prove the NP-completeness of a whole host of combinatorial problems, including the examples
considered here.
When talking about intractable problems, it is common to hear the term "NP-hard" being used instead of NP-complete. NP-hard problems can be considered
"at least as hard" as NP-complete problems because they are not required to be in
the class NP and therefore do not have to be stated as decision problems. NP-hard
problems will often be stated as an optimisation version of a corresponding NP-complete problem. So with graph colouring for instance, rather than asking "is there
a feasible colouring of G that uses k or fewer colours?" (to which the answer is "yes"
or "no"), the problem will now be stated: "What is the minimum number of colours
k needed to feasibly colour G, and how can we assign colours to the vertices of G to
achieve this?" Note that the non-decision version of the graph colouring problem as
stated in Definition 1.2 is itself NP-hard, as are Problems 2 to 5 from Definition 2.2.
When a problem is known to be NP-complete (or NP-hard), then there will be
no algorithm that solves it in polynomial time. As we have discussed, this statement
depends on the conjecture that P ≠ NP. Though this inequality is generally thought
to be true, the issue of formally proving (or disproving) it is actually one of the
most famous unsolved problems in the mathematical sciences. Indeed, the Clay
Mathematics Institute lists it as one of its seven Millennium Prize Problems, with a
one-million-dollar prize on offer for its resolution.
References
1. Cormen T, Leiserson C, Rivest R, Stein C (2009) Introduction to algorithms, 3rd edn. The MIT
Press
2. Cook S (1971) The complexity of theorem-proving procedures. In: Proceedings of the third
annual ACM symposium on theory of computing, STOC’71, New York, NY, USA. ACM, pp
151–158. https://doi.org/10.1145/800157.805047
3. Karp R (1972) Reducibility among combinatorial problems. In: Complexity of computer
computations. Plenum, New York, pp 85–103
4. Garey M, Johnson D (1979) Computers and intractability: a guide to NP-completeness, 1st edn.
W. H. Freeman and Company, San Francisco
5. Beigel R, Eppstein D (2005) 3-colouring in time O(1.3289^n). J Algorithms 54:168–204
6. Eppstein D (2003) Small maximal independent sets and faster exact graph coloring. J Graph
Algorithms Appl 7(2):131–140
3 Bounds and Constructive Heuristics
Definition 3.1 Given the graph G = (V, E), the neighbourhood of a vertex
v, written Γ (v), is the set of vertices adjacent to v. That is, Γ (v) = {u ∈ V :
{v, u} ∈ E}. The degree of a vertex v is the cardinality of its neighbourhood
set, |Γ (v)|, and is usually written deg(v).
Definition 3.6 A cycle is a uv-path for which u = v. All other vertices in the
cycle must be distinct. A graph containing no cycles is said to be acyclic.
Definition 3.7 The density of a graph G = (V, E) is the ratio of the number
of edges to the number of pairs of vertices. For a simple graph with no loops,
this is calculated by m/C(n, 2). Graphs with low densities are often referred to as
sparse graphs; those with high densities are known as dense graphs.
To illustrate these definitions, Fig. 3.1a shows a graph G where, for example, the
neighbourhood of v1 is Γ(v1) = {v3, v5}, giving deg(v1) = 2. The density of G
is 7/15 ≈ 0.467. The subgraph G′ in Fig. 3.1b has been created via the operation
G − {v2, v4}. G′ is also the subgraph induced from G using the set of vertices
{v1, v3, v5, v6}. In this particular case both G and G′ are connected. Paths in G
from, for example, v1 to v6 include (v1, v3, v4, v5, v6) (of length 4) and (v1, v5, v6)
(of length 2). Since the latter path is also the shortest path between v1 and v6, the
distance between these vertices is also 2. Cycles also exist in both G and G′, such
as (v1, v3, v5, v1).
Recall the example from Sect. 1.1.1 where we sought to partition some students into
a minimal number of groups for a team-building exercise. The process we used to try
to achieve this is known as the Greedy algorithm for graph colouring. Pseudocode
for this very simple constructive heuristic is shown in Fig. 3.2.
As shown, Greedy operates by taking vertices one at a time according to some
predefined ordering P. Using colour labels 1, 2, 3, . . ., each vertex is then simply
assigned to the lowest colour not being used by any of its neighbours.
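In code, the Greedy heuristic amounts to only a few lines. The sketch below is our own Python rendering (not the book's listing); adjacency is stored as a dictionary of neighbour sets:

```python
def greedy(adjacency, permutation):
    """Colour the vertices in the order given by `permutation`, assigning each
    vertex the lowest colour label (1, 2, 3, ...) absent from its neighbours."""
    colour = {}
    for v in permutation:
        used = {colour[u] for u in adjacency[v] if u in colour}
        c = 1
        while c in used:
            c += 1
        colour[v] = c
    return colour

# A 4-cycle v1-v2-v3-v4-v1, coloured in natural order.
adj = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
print(greedy(adj, [1, 2, 3, 4]))  # {1: 1, 2: 2, 3: 1, 4: 2}
```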
An example run of Greedy using the vertex permutation P = v1 , v2 , . . . , vn is
shown in Fig. 3.3. In the first iteration we see that v1 is assigned to colour 1 (that is,
Fig. 3.3 Example application of the Greedy algorithm using the permutation P = v1, v2, . . . , v8.
Here, uncoloured vertices are shown in white. As a partition, the final solution is written S =
{{v1, v5, v6}, {v2, v3, v8}, {v4}, {v7}}, where each Si ∈ S is an independent set
c(v1 ) is set to 1). In the second iteration, v2 is adjacent to a vertex with colour 1, so
it is assigned to colour 2. The same thing happens in the third iteration: v3 is seen to
be adjacent to colour 1, so it is also assigned to colour 2. In the fourth iteration, v4
is seen to be adjacent to colours 1 and 2 and is therefore assigned to colour 3. The
process continues in this fashion until all vertices have been coloured. This gives the
four-colouring shown in the figure.
Fig. 3.4 Two different colourings of a bipartite graph achieved by the Greedy algorithm. The
graph in a is using n/2 colours. Part b shows an optimal two-colouring
Case 1: An independent set S′j ∈ S′ with j < i exists such that S′j ∪ {v} is also an independent
set. In this case v will be assigned to the jth colour class in S′.
Case 2: An independent set S′j ∈ S′ with j = i exists such that S′j ∪ {v} is also an independent
set.
In both cases, it is clear that v will always be assigned to a set in S′ with an index
that is less than or equal to that of its original set in S. Of course, if a situation arises
by which all items in a particular set Si are assigned according to Case 1, then at the
termination of Greedy, S′ will contain fewer colours than S.
Now assume that it is necessary to assign a vertex v ∈ Si to a set S′j with j > i. For this to
occur, it is first necessary that the proposed actions of Cases 1 and 2 (i.e., adding v to
a set S′j with j ≤ i) cause a clash. However, S′i ⊂ Si and is, therefore, an independent set. By
definition, S′i ∪ {v} ⊆ Si is also an independent set, contradicting the assumption.
To show these concepts in action, the colouring shown in Fig. 3.5a has been gener-
ated by the Greedy algorithm using the permutation P = v1 , v2 , v3 , v4 , v5 , v6 , v7 , v8 .
This gives the four-colouring S = {{v1 , v4 , v8 }, {v2 , v7 }, {v3 , v5 }, {v6 }}. This solu-
tion might then be used to form a new permutation P = v1 , v4 , v8 , v2 , v7 , v3 , v5 , v6
which can then be fed back into Greedy. However, our use of sets in defining
a solution S means that we are free to use any ordering of the colour classes
in S to form P, and indeed any ordering of the vertices within each colour
class. One alternative permutation of the vertices formed from S in this way is
P = v2 , v7 , v5 , v3 , v6 , v4 , v8 , v1 . This permutation has been used with Greedy to
give the solution shown in Fig. 3.5b, which we see is using fewer colours than the
solution from which it was formed.
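This feedback loop (run Greedy, rebuild a permutation from the resulting colour classes, then run Greedy again) is easy to experiment with. The sketch below uses our own greedy routine (not the book's listing) and shuffles both the class order and the vertices within each class; by Theorem 3.1, the number of colours used never increases:

```python
import random

def greedy(adjacency, permutation):
    """Assign each vertex the lowest colour label not used by its neighbours."""
    colour = {}
    for v in permutation:
        used = {colour[u] for u in adjacency[v] if u in colour}
        colour[v] = next(c for c in range(1, len(adjacency) + 1) if c not in used)
    return colour

def classes_to_permutation(colour, rng):
    """List the colour classes one after another, in any (shuffled) order."""
    classes = {}
    for v, c in colour.items():
        classes.setdefault(c, []).append(v)
    order = list(classes.values())
    rng.shuffle(order)    # any ordering of the colour classes is permitted...
    for cls in order:
        rng.shuffle(cls)  # ...as is any ordering of vertices within a class
    return [v for cls in order for v in cls]

rng = random.Random(1)
adj = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}, 5: {1}}
solution = greedy(adj, [1, 2, 3, 4, 5])
for _ in range(5):
    improved = greedy(adj, classes_to_permutation(solution, rng))
    assert len(set(improved.values())) <= len(set(solution.values()))
    solution = improved
print(len(set(solution.values())))  # 2
```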
permutations of the vertices which, when fed into Greedy, will result in an
optimal solution.
Proof This arises immediately from Theorem 3.1. Since S is optimal, an appropriate
permutation can be generated from S in the manner described. Moreover, because
the colour classes and vertices within each colour class can themselves be permuted,
the above formula holds.
In this section, we now review some of the upper and lower bounds that can be
stated about the chromatic number of a graph. Some of these bounds make use of
the Greedy algorithm in their proofs, helping us to further understand the behaviour
of the algorithm. While these bounds can be quite useful, or even exact for some
topologies, in other cases they are either too difficult to calculate or give us bounds that
are too loose to be of much practical use. This latter point will also be demonstrated
empirically in Sect. 3.5.
n/α(G) since, to be less than this value would require an independent set that is
larger than α(G).
These two bounds can be combined into the following:
χ(G) ≥ max{ω(G), n/α(G)}    (3.2)
The accuracy of the bounds given in Eq. (3.2) will vary on a case-by-case basis. Their
major drawback is that the tasks of calculating ω(G) and α(G) are themselves NP-hard problems, namely, the maximum clique problem and the maximum independent
set problem (see Definition 2.2). However, this does not mean that these bounds
are useless: in some practical applications, the sizes of the largest cliques and/or
independent sets might be quite obvious from the graph’s topology, or even specified
as part of the problem itself (see, for example, the sport scheduling models used in
Chap. 8). In other cases, we might also be able to approximate ω(G) and/or α(G)
using heuristics or by applying probabilistic arguments.
To illustrate how we might estimate the size of a maximum clique in probabilistic
terms, consider a graph G that is generated such that each pair of vertices is joined by
an edge with probability p. Assuming independence, the probability that a subset of
x ≤ n vertices forms a clique Kx is calculated by p^C(x,2), since there are C(x, 2) edges that
are required to be present among the x vertices. The probability that the x vertices do
not form a clique is therefore simply 1 − p^C(x,2). Since there are C(n, x) different subsets
of x vertices, the probability that at least one clique of size x occurs in
G is defined as

P(∃Kx ⊆ G) = 1 − (1 − p^C(x,2))^C(n,x)    (3.3)

for 2 ≤ x ≤ n.
In practice, we might use this formula to estimate a lower bound with a certain
confidence. For example, we might say “with greater than 99% confidence we can
say that G contains a clique of size y”, where y represents the largest x value for
which Eq. (3.3) is greater than 0.99. We might also collect similar information on the
size of the largest independent set in G by simply replacing p with (1 − p)
in the above formula. We must be careful in calculating the latter, however, because
dividing n by an underestimation of α(G) could lead to an invalid bound that exceeds
χ (G). We should also be mindful that, for larger graphs, the numbers involved in
calculating Eq. (3.3) might be very large indeed, perhaps requiring rounding and
introducing inaccuracies.
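Under the independence assumptions above, Eq. (3.3) can be evaluated directly for modest values of n. The following Python sketch (our own illustrative functions, not part of the book's code suite) returns the largest clique size that can be claimed at a given confidence level:

```python
import math

def prob_clique_exists(n: int, p: float, x: int) -> float:
    """Eq. (3.3): probability that a random graph G(n, p) contains
    at least one clique of size x."""
    q = p ** math.comb(x, 2)              # P(a fixed x-subset forms a clique)
    return 1.0 - (1.0 - q) ** math.comb(n, x)

def clique_lower_bound(n: int, p: float, confidence: float = 0.99) -> int:
    """Largest x in 2..n for which Eq. (3.3) exceeds the confidence level."""
    return max((x for x in range(2, n + 1)
                if prob_clique_exists(n, p, x) > confidence), default=0)
```

For example, with n = 50 and p = 0.5 this reports a clique of size 7 with better than 99% confidence. As noted above, floating-point underflow becomes an issue for larger graphs, where arbitrary-precision arithmetic would be needed.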
Even if we can estimate or determine values such as ω(G), we must still bear in
mind that they can give very weak lower bounds in many cases. Consider, for
example, the graph shown in Fig. 3.6, known as the Grötzsch graph. This graph is
“triangle-free” in that it contains no cliques of size 3 or above; hence ω(G) = 2.
However, as illustrated in the figure, the chromatic number of the Grötzsch graph
is four: double the lower bound determined by ω(G). In fact, the Grötzsch graph
is the smallest graph in a set of graphs known as the Mycielskians, named after
their discoverer Jan Mycielski [2]. Mycielskian graphs demonstrate the potential
inaccuracies involved in using ω(G) as a lower bound by showing that, for any
q ≥ 1, there exists a graph G with ω(G) = 2 but for which χ (G) > q. Hence we
can encounter graphs for which ω(G) gives us a lower bound of 2, but for which the
chromatic number is arbitrarily large.
An example interval graph has already been seen in Sect. 1.1.3 where we sought to
assign taxi journeys with known start and end times to a minimal number of vehicles.
This set of intervals, together with its interval graph is reproduced in Fig. 3.7. In this
case, the real line is being used to represent time.
One feature of interval graphs is that they are known to contain a “perfect elim-
ination ordering”. This is defined as an ordering of the vertices such that, for every
vertex, all of its neighbours to the left of it in the ordering form a clique.
Fig. 3.7 An example set of intervals (a), and a corresponding interval graph (b). The left-to-right
ordering of the vertices in this figure indicates a perfect elimination ordering P = v1 , v2 , . . . , v10
Proof To start, arrange the intervals of I in ascending order of start values, such
that a1 ≤ a2 ≤ · · · ≤ an . Now label the vertices v1 , v2 , . . . , vn to correspond to
this ordering. This implies that for any vertex vi , the corresponding intervals of all
neighbours to its left in the ordering must contain the value ai ; hence all pairs of vi ’s
neighbours to the left must also share an edge, thereby forming a clique.
Proof During the execution of Greedy, each vertex vi is assigned to the lowest
indexed colour not used by any of its neighbours preceding it in P. Clearly, each
vertex has fewer than ω(G) neighbours that are already coloured. Hence at least one
of the colours labelled 1 to ω(G) must be feasible for vi . This implies that χ (G) ≤
ω(G). Since it is always necessary that ω(G) ≤ χ (G), this gives χ (G) = ω(G).
The optimal three-colouring provided in Fig. 3.7 shows the result of this process
using the permutation P = v1 , v2 , v3 , v4 , v5 , v6 , v7 , v8 , v9 , v10 .
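The proof suggests an immediate colouring procedure for interval graphs: sort the intervals by start value and apply Greedy to the resulting perfect elimination ordering. Below is a minimal Python sketch (a hypothetical representation in which each interval is a (start, end) pair, with overlap taken to be strict):

```python
def colour_intervals(intervals):
    """Greedy colouring of an interval graph, taking vertices in the
    perfect elimination ordering given by ascending start values."""
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
    colour = {}
    for i in order:
        a, b = intervals[i]
        # colours of already-coloured intervals that overlap interval i
        used = {colour[j] for j in colour
                if intervals[j][0] < b and a < intervals[j][1]}
        c = 1
        while c in used:
            c += 1
        colour[i] = c
    return colour
```

In the taxi example of Sect. 1.1.3, the number of colours returned equals the minimum number of vehicles, since the argument above guarantees χ (G) = ω(G) for interval graphs.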
More generally, graphs featuring perfect elimination orderings are usually known
as chordal graphs. Interval graphs are therefore a type of chordal graph. The problem
of determining whether a graph is chordal or not can be achieved in polynomial time
by algorithms such as lexicographic breadth-first search [3]. This implies that the
problem of optimally colouring chordal graphs can be solved in polynomial time,
making it a member of the problem class P .
3.2 Bounds on the Chromatic Number 49
Upper bounds on the chromatic number can be derived by considering the number
of edges in a graph and also the degrees of individual vertices. For instance, when a
graph has a high density (that is, a high number of edges m in relation to the number
of vertices n), a larger number of colours will often be needed because a greater
proportion of the vertex pairs will need to be separated into different colour classes.
This admittedly rather weak-sounding proposal gives rise to the following theorem.
Theorem 3.6 Let G be a graph with maximal degree Δ(G) (that is, Δ(G) =
max{deg(v) : v ∈ V }). Then χ (G) ≤ Δ(G) + 1.
Proof Consider the behaviour of the Greedy algorithm. Here, the ith vertex in
the permutation P will be assigned to the lowest indexed colour class that contains
none of its neighbouring vertices. Since each vertex has at most Δ(G) neighbours,
no more than Δ(G) + 1 colours will be needed to feasibly colour all the vertices
of G.
it comes to colouring planar graphs and graphs representing circuit boards (see
Chap. 6).
Proof We know there is a vertex with a degree of at most δ in G. Call this vertex
vn . We also know that there is a vertex of degree at most δ in the subgraph G − {vn },
which we can label vn−1 . Next, we can label as vn−2 a vertex of degree at most
δ in the graph G − {vn , vn−1 }. Continue this process until all of the n vertices
have been assigned labels. Now apply the Greedy algorithm using the permutation
P = v1 , v2 , . . . , vn . At each step of the algorithm, vi will be adjacent to at most δ of
the vertices v1 , . . . , vi−1 that have already been coloured; hence no more than δ + 1
colours will be required.
Let us now examine some implications of the latter two theorems. It can be seen
that Theorem 3.6 provides tight bounds for both complete graphs, where χ (K n ) =
Δ(K n ) + 1 = n, and for odd cycles, where χ (Cn ) = Δ(Cn ) + 1 = 3. However,
such accurate bounds will not always be so forthcoming. Consider, for example,
the wheel graph comprising 100 vertices, W100 . This features a “central” vertex
of degree Δ(W100 ) = 99, meaning that Theorem 3.6 merely informs us that the
chromatic number of W100 is at most 100, even though it is actually just four!
On the other hand, for any wheel graph, it is relatively easy to show that all of its
subgraphs will contain a vertex with a degree of no more than 3 (i.e., δ = 3). For
wheel graphs where n is even, this allows Theorem 3.7 to return a tight bound since
δ + 1 = χ (Wn ) = 4.
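The labelling process used in the proof of Theorem 3.7 is easy to sketch in code. The function below (an illustration, assuming the graph is stored as a dictionary mapping each vertex to a set of neighbours) removes a minimum-degree vertex at each step and then colours the reverse of this removal order with Greedy:

```python
def degeneracy_colouring(adj):
    """Label vertices vn, vn-1, ... by repeatedly removing a vertex of
    minimum degree, then apply Greedy to the ordering P = v1, ..., vn."""
    degree = {v: len(adj[v]) for v in adj}
    remaining = set(adj)
    removal = []                                   # vn first, v1 last
    while remaining:
        v = min(remaining, key=lambda u: degree[u])
        removal.append(v)
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    colour = {}
    for v in reversed(removal):                    # Greedy on P = v1, ..., vn
        used = {colour[u] for u in adj[v] if u in colour}
        c = 1
        while c in used:
            c += 1
        colour[v] = c
    return colour
```

Applied to a wheel graph, every removal step finds a vertex of degree at most 3, so at most δ + 1 = 4 colours are used, matching the discussion above.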
To illustrate these definitions, Fig. 3.8a shows a graph G comprising one com-
ponent. Removal of the indicated cut vertex would split G into two components.
Figure 3.8b can be considered a block in that it does not contain a cut vertex (i.e., it
is 2-connected). However, it is not 3-connected, because removal of the two vertices
in the indicated vertex separating set increases the number of components from one
to two.
Having gone over the necessary terms, we are now in a position to state and prove
Brooks’ theorem. Observe that this theorem is concerned with connected graphs
only. This is not restrictive, however, because if a graph G is composed of mul-
tiple components G 1 , . . . , G l , then χ (G) = max(χ (G 1 ), . . . , χ (G l )); hence each
component can be considered separately.
52 3 Bounds and Constructive Heuristics
Theorem 3.8 (Brooks [4]) Let G be a connected graph with maximal degree
Δ(G). Suppose further that G is neither complete nor an odd cycle. Then
χ (G) ≤ Δ(G).
Proof The theorem is obviously correct for Δ(G) ≤ 2. For Δ(G) = 0 and Δ(G) =
1, the corresponding graphs will be the complete graphs K 1 and K 2 , respectively,
and are, therefore, not included in the theorem. For Δ(G) = 2 on the other hand, G
will be a path or even cycle (giving χ (G) = 2) or will be an odd cycle, meaning it
is not included in the theorem.
Assuming Δ(G) ≥ 3, let G be a counterexample with the smallest possible
number of vertices for which the theorem does not hold, i.e., χ (G) > Δ(G). We
therefore assume that all graphs with fewer vertices than G can be feasibly coloured
using Δ(G) colours.
be determined using a breadth-first search tree rooted at v). If we now apply Greedy
using this permutation, the vertices p1 = u 1 and p2 = u 2 will first both be assigned
to colour 1, because they are nonadjacent. Moreover, when we colour the vertices pi
(for 3 ≤ i < n), there will always be at least one colour j ≤ Δ(G) that is feasible
for pi . Finally, when we come to colour vertex pn = v, at most Δ(G) − 1 colours
will have been used to colour the neighbours of v (since its neighbours u 1 and u 2
have been assigned to the same colour) and so at least one of the Δ(G) colours will
be feasible for v. This shows that χ (G) ≤ Δ(G) as required.
Theorem 3.9 (Welsh and Powell [5]) Given a graph G, assume that the ver-
tices have been labelled such that deg(v1 ) ≥ deg(v2 ) ≥ · · · ≥ deg(vn ). Then
χ (G) ≤ max i=1,...,n min(deg(vi ) + 1, i).
To demonstrate this bound, consider the example graph G in Fig. 3.9 where, as
required, vertices are labelled such that deg(v1 ) ≥ deg(v2 ) ≥ · · · ≥ deg(vn ). This
gives rise to the following sequence of (deg(vi ) + 1, i) pairs:
(6, 1), (3, 2), (3, 3), (3, 4), (3, 5), (2, 6).
Taking the minimum of each pair leads to the sequence 1, 2, 3, 3, 3, 2, the maximum
of which is 3. Hence χ (G) ≤ 3. In contrast, Brooks’ bound only tells us that χ (G) ≤
5 in this case.
Because the Welsh–Powell bound takes the minimum of each (deg(vi ) + 1, i)
pair, it will always be at least as good as the upper bound χ (G) ≤ Δ(G) + 1 seen
in Theorem 3.6. Indeed, these two bounds will only be equal in situations where the
graph G features at least Δ(G)+1 vertices with the maximum degree Δ(G). In these
cases this implies the existence of a vertex vi for which i = Δ(G) + 1, meaning that
min(deg(vi ) + 1, i) = Δ(G) + 1. This occurs with cycle graphs, complete graphs,
and (more generally) d-regular graphs.
Fig. 3.9 Example graph G whose vertices are labelled such that deg(v1 ) ≥ deg(v2 ) ≥ · · · ≥
deg(vn ). The Welsh–Powell bound finds that χ(G) ≤ maxi=1,...,n min(deg(vi ) + 1, i) = 3
Fig. 3.10 The Welsh–Powell algorithm for graph colouring. The method Greedy is defined in
Fig. 3.2
The Welsh–Powell bound can also be used to define a simple constructive heuristic
for graph colouring. This operates by arranging the vertices in non-ascending order of
degree and then applying the Greedy algorithm. The number of colours in solutions
returned by this algorithm will never exceed the Welsh–Powell bound. The method
is summarised in Fig. 3.10.
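As a sketch (again assuming a dictionary-of-sets graph representation, not the book's implementation), the Welsh–Powell bound and the associated colouring can be computed together:

```python
def welsh_powell(adj):
    """Sort vertices by non-ascending degree, compute the bound
    max_i min(deg(v_i) + 1, i), and colour the ordering with Greedy."""
    order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    bound = max(min(len(adj[v]) + 1, i) for i, v in enumerate(order, start=1))
    colour = {}
    for v in order:
        used = {colour[u] for u in adj[v] if u in colour}
        c = 1
        while c in used:
            c += 1
        colour[v] = c
    return colour, bound
```

By construction, the number of colours used in the returned solution never exceeds the returned bound.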
We have now analysed the behaviour of the Greedy algorithm for graph colouring
and reviewed several bounds on the chromatic number. In the next two sections, we
consider three further heuristics for the graph colouring problem. As we will see, two
of these algorithms, namely, DSatur and RLF, are guaranteed to produce optimal
solutions for some simple graph topologies. Often, they will also produce solutions
that improve on the upper bounds mentioned above. Later, in Chaps. 4 and 5, we
will see that these algorithms, along with Greedy, can be used as building blocks
in many of the more sophisticated algorithms available for graph colouring.
Definition 3.13 Recall that c(v) denotes the colour assigned to the vertex
v, and let c(v) = null for any vertex v ∈ V not currently assigned to a
colour. Given an uncoloured vertex v, the saturation degree of v, denoted by
sat(v), is the number of different colours assigned to adjacent vertices. That
is, sat(v) = |{c(u) : u ∈ Γ (v) ∧ c(u) ≠ null}|.
Pseudocode for the DSatur algorithm is shown in Fig. 3.11. Much of the algo-
rithm is the same as the Greedy algorithm in that once a vertex has been selected,
it is assigned to the lowest colour label not assigned to any of its neighbours.
Step (2) provides the main power behind the DSatur algorithm in that it prioritises
vertices that are seen to be the most “constrained”—that is, vertices that currently
have the fewest colour options available to them. Consequently, these “more con-
strained” vertices are dealt with by the algorithm first, allowing the less constrained
vertices to be coloured later.
Figure 3.12 shows an example application of DSatur on a small graph. As shown,
initially no vertices are coloured and so all vertices have a saturation degree equal to
zero. The first vertex to be coloured is, therefore, v4 , which has the highest degree
in the uncoloured subgraph. This is assigned to colour 1, as shown in Part (1). At
this point, we now see that five vertices (v1 , v3 , v5 , v6 , v7 ) feature the maximum
saturation degree, so the next vertex to be chosen is the one among these with the
highest degree in the uncoloured subgraph. This gives two options here, v3 and v7 ;
consequently, one of these is chosen and then assigned to colour 2. The resultant
colouring is shown in Part (2). This process continues in the same way until all
vertices have been coloured.
Earlier we saw that the number of colours used in solutions produced by the
Greedy algorithm depends on the order that the vertices are fed into the proce-
dure, with solution quality potentially varying a great deal. On the other hand, the
DSatur algorithm reduces this variance by generating the vertex ordering during
a run according to its selection rules. As a result, DSatur’s performance is more
predictable. One feature of the algorithm is that if a graph is composed of multiple
components, then all vertices of a single component will be coloured before the other
vertices are considered.
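A straightforward (unoptimised) Python sketch of DSatur following these selection rules is shown below; a faster priority-queue formulation is discussed in Sect. 3.5.2. The dictionary-of-sets representation is an assumption of this sketch, not the book's implementation:

```python
def dsatur(adj):
    """DSatur: repeatedly colour the uncoloured vertex with maximum
    saturation degree, breaking ties by degree in the uncoloured subgraph."""
    colour = {}
    uncoloured = set(adj)
    while uncoloured:
        v = max(uncoloured,
                key=lambda u: (len({colour[w] for w in adj[u] if w in colour}),
                               len(adj[u] & uncoloured)))
        used = {colour[u] for u in adj[v] if u in colour}
        c = 1
        while c in used:                 # lowest feasible colour label
            c += 1
        colour[v] = c
        uncoloured.remove(v)
    return colour
```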
Fig. 3.12 Example application of the DSatur algorithm. In the tables, d gives the degree of the
corresponding vertices in the uncoloured subgraph. Uncoloured vertices are shown in white
DSatur is also exact for several elementary graph topologies. The first of these is
the bipartite graph, and to prove this claim it is first necessary to show a well-known
result on the structure of these graphs.
Proof Let G be a connected bipartite graph with vertex sets V1 and V2 . (It is enough
to consider G as being connected, as otherwise we could simply treat each component
of G separately.) Let (v1 , v2 , . . . , vl , v1 ) be a cycle in G. We can also assume that
v1 ∈ V1 , v2 ∈ V2 , v3 ∈ V1 , and so on. Hence, a vertex vi ∈ V1 if and only if i is odd.
Since vl ∈ V2 , this implies l is even. Consequently G has no odd cycles.
Now suppose that G is known to feature no odd cycles. Choose any vertex v in
the graph and let the set V1 be the set of vertices such that the shortest path from
each member of V1 to v is of odd length, and let V2 be the set of vertices where the
shortest path from each member of V2 to v is even. Observe now that there is no edge
joining vertices of the same set Vi since otherwise G would contain an odd cycle.
Hence G is bipartite.
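The construction used in this proof amounts to a breadth-first search that classifies vertices by the parity of their distance from v. A minimal sketch (function name ours; adjacency assumed to be a dictionary of sets):

```python
from collections import deque

def try_two_colour(adj):
    """Split vertices into the V1/V2 classes of the proof by BFS distance
    parity; return None if an edge joins two vertices of the same class
    (i.e., an odd cycle exists)."""
    side = {}
    for s in adj:                        # each component is treated separately
        if s in side:
            continue
        side[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in side:
                    side[u] = 1 - side[v]
                    queue.append(u)
                elif side[u] == side[v]:
                    return None          # odd cycle found: not bipartite
    return side
```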
Theorem 3.11 (Brélaz [6]) The DSatur algorithm is exact for bipartite
graphs.
To illustrate the usefulness of this result, consider the bipartite graphs shown in
Fig. 3.4 earlier. Here, many permutations of the vertices used in conjunction with the
Greedy algorithm will lead to colourings using more than two colours. Indeed, in
the worst case they may even lead to (n/2)-colourings as demonstrated in the figure.
In contrast, DSatur is guaranteed to return the optimal solution for bipartite graphs,
as it is for some further topologies.
Theorem 3.12 The DSatur algorithm is exact for cycle and wheel graphs.
Proof Note that even cycles are two-colourable and are therefore bipartite. Hence
they are dealt with by Theorem 3.11. However, it is useful to consider both even and
odd cycles in the following.
Fig. 3.13 Part (a) shows an optimal three-colouring. Part (b) shows a suboptimal four-colouring
produced by DSatur
Let Cn be an uncoloured cycle graph. Since all degrees and saturation degrees
are equal, the first vertex to be coloured, v, will be chosen arbitrarily by DSatur.
In the next (n − 2) steps, according to the behaviour of DSatur, a path of vertices
of alternating colours will be constructed that extends from v in both clockwise and
anticlockwise directions. At the end of this process, a path comprising n − 1 vertices
will have been formed, and a single vertex u will remain that is adjacent to both
terminal vertices of this path. If Cn is an even cycle, n − 1 will be odd, meaning
that the terminal vertices have the same colour. Hence u can be coloured with the
alternative colour. If Cn is an odd cycle, n − 1 will be even, meaning that the terminal
vertices will have different colours. Hence u will be assigned to a third colour.
For wheel graphs Wn a similar argument applies. Assuming n ≥ 5, DSatur will
initially colour the central vertex vn because it features the highest degree. Since vn
is adjacent to all other vertices in Wn , all remaining vertices v1 , . . . , vn−1 will now
have a saturation degree of 1. The same colouring process as for the cycle graph Cn−1
then follows.
Although, as these theorems show, DSatur is exact for certain types of graph, the
N P -hardness of the graph colouring problem implies that it will be unable to pro-
duce optimal solutions for all graphs. Figure 3.13, for example, shows a small graph
that, while actually three-colourable, will always be coloured using four colours by
DSatur. Janczewski et al. [7] have proved that this is the smallest such graph for which
this suboptimality occurs, but there are countless larger graphs where DSatur will
also fail to return an optimal solution. In other work, Spinrad and Vijayan [8] have also
identified a graph topology of O(n) vertices that, despite being three-colourable, will
be coloured using n different colours by DSatur.
We have seen that the Greedy, Welsh–Powell and DSatur heuristics take vertices
one at a time and assign them to the first colour seen to be feasible. An alternative
strategy in graph colouring is to build solutions by constructing each colour class
one at a time; that is, identify an independent set S in a graph, give these vertices the
same colour, remove them, and then repeat these actions on the remaining subgraph
until no vertices remain.
To explore such a strategy, consider the following definitions.
Definition 3.14 Given a graph G = (V, E), recall that an independent set is a
set of vertices that are mutually nonadjacent. That is, S ⊆ V is an independent
set if and only if {u, v} ∉ E for all vertex pairs u, v ∈ S.
An independent set S is maximal if and only if, for all vertices v ∈ V , v is
either in S, or is a neighbour of a vertex in S. A maximum independent set is
the largest maximal independent set in G.
The Recursive Largest First (RLF) heuristic for graph colouring was originally pro-
posed by Leighton [9]. It is a special case of the Greedy-I-Set procedure in that
it uses specific rules for selecting vertices in Step (3) of Get-Maximal-I-Set. The
intention behind these rules is to produce maximal independent sets containing many
vertices, thereby hopefully reducing the number of colours used in the final solution.
These rules are as follows.
• In each application of Get-Maximal-I-Set, when executing Step (3) for the first
time, u is selected as the member of X that has the largest number of neighbours
in X (that is, the vertex with the highest degree in the subgraph induced by X ).
• In subsequent executions of Step (3), u is selected as the member of X that has
the largest number of neighbours in Y . Ties can be broken by selecting the vertex
among these that has the minimum number of neighbouring vertices in X .
As an example, executing the RLF algorithm on the graph from Fig. 3.15 results
in vertices being selected in the following order: v4 , v2 and then v8 (colour 1); v3
and then v5 (colour 2); v6 and then v1 (colour 3); and then v7 (colour 4). This gives
the final (feasible) solution S = {{v1 , v6 }, {v2 , v4 , v8 }, {v3 , v5 }, {v7 }}.
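A compact Python sketch of RLF under these selection rules follows; X and Y play the same roles as in Get-Maximal-I-Set, and the dictionary-of-sets graph representation is an assumption of this sketch:

```python
def rlf(adj):
    """RLF: build colour classes one at a time.  X holds uncoloured vertices
    still eligible for the current class; Y holds uncoloured vertices with a
    neighbour already in the class."""
    colour, c = {}, 1
    uncoloured = set(adj)
    while uncoloured:
        X, Y = set(uncoloured), set()
        # first selection: highest degree in the subgraph induced by X
        v = max(X, key=lambda u: len(adj[u] & X))
        while v is not None:
            colour[v] = c
            uncoloured.remove(v)
            X.discard(v)
            moved = adj[v] & X            # neighbours of v leave X for Y
            X -= moved
            Y |= moved
            # later selections: most neighbours in Y, ties by fewest in X
            v = max(X, key=lambda u: (len(adj[u] & Y),
                                      -len(adj[u] & X))) if X else None
        c += 1
    return colour
```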
Like DSatur, the RLF algorithm is also exact for several fundamental graph
topologies.
Fig. 3.15 Example application of Get-Maximal-I-Set using random vertex selection. Here,
black-filled vertices are those currently assigned to the independent set S; black outlines show
the subgraph induced by the vertices of X ; and grey lines show the subgraph induced by the vertices
of Y ∪ S. At the end of this process, X will be empty and S will be a maximal independent set
point the subgraph induced by V2 will have no edges, allowing RLF to colour all
remaining vertices with the second colour.
Theorem 3.14 The RLF algorithm is exact for cycle and wheel graphs.
Proof Even cycles are two-colourable and are therefore dealt with by Theorem 3.13.
However, for convenience we consider both even and odd cycles in the fol-
lowing. Let Cn be a cycle graph with vertices V = {v1 , . . . , vn } and edges
E = {{v1 , v2 }, {v2 , v3 }, . . . , {vn−1 , vn }, {vn , v1 }}. For bookkeeping purposes, also
assume that ties in the RLF selection rules are broken by taking the vertex with the
lowest index. It is easy to see that this theorem holds without this restriction, however.
The degree of all vertices in Cn is 2, so the first vertex to be coloured will be v1 .
Consequently, its neighbouring vertices v2 and vn will be added to Y . According to
the heuristics of RLF the next vertex to be coloured will be v3 , leading to v4 being
added to Y ; then v5 , leading to v6 being added to Y ; and so on. At the end of this
process, we will have colour class S1 = {v1 , v3 , . . . , vn−1 } when n is even, and the
colour class S1 = {v1 , v3 , . . . , vn−2 } when n is odd. In the even case, this leaves an
uncoloured subgraph with vertices v2 , v4 , . . . , vn and no edges. Consequently RLF
will assign all of these vertices to the second colour. In the odd case, we will be left
with uncoloured vertices v2 , v4 , . . . , vn−1 , vn together with a single edge {vn−1 , vn }.
Following the heuristic rules of RLF, all even-indexed vertices will then be assigned
to the second colour, with vn being assigned to the third.
For wheel graphs, Wn , similar reasoning applies. Assuming n ≥ 5, the central
vertex vn will be coloured first because it has the highest degree. Since vn is adjacent
to all other vertices, no further vertices can be added to this colour, so the algorithm
will move on to the second colour. The remaining uncoloured vertices now form the
cycle graph Cn−1 , and the same colouring process as above follows.
We now present a comparison of the five heuristics considered in this
chapter, namely:
In the next subsection, we start by making some general points about good practice
when empirically comparing algorithms. In Sect. 3.5.2 we then consider each of the
five algorithms in turn and derive their complexities using big O notation. The results
of the comparison are then discussed in Sect. 3.5.3. The implementations used in these
experiments can be found in the online suite of graph colouring algorithms described
in Sect. 1.4.1 and Appendix A.1.
When a new algorithm is proposed for the graph colouring problem, the quality
of the solutions it produces will often be compared to those achieved on the same
problem instances by other methods. A development in this regard occurred in 1992
with the organisation of the Second DIMACS Implementation Challenge (http://
mat.gsia.cmu.edu/COLOR/instances.html), which resulted in a suite of differently
structured graph colouring problems being placed into the public domain. Since this
time, authors of graph colouring papers have often used this set (or a subset of it)
and have concentrated on tuning their algorithms (perhaps by altering the underlying
operators or run time parameters) to achieve the best possible results.
More generally, when testing and comparing the accuracy of two heuristic algo-
rithms, an important question is:
Are we attempting to show that Algorithm A produces better results than Algorithm B on
(a) a particular problem instance? or (b) across a whole set of problem instances?
In some cases, we might be given a difficult practical problem that we only need
to consider once, and whose efficient solution might save lots of money or other
resources. Here, it would seem sensible to concentrate on answering question (a) and
spend our time choosing the correct heuristics and parameters to achieve the best
solution possible under the given time constraints. If our chosen algorithm involves
stochastic behaviour (i.e., making random choices), multiple runs of the algorithm
might then be performed on the problem instance to gain an understanding of its
average performance on this particular case.
In most situations, however, it is more likely that when a new algorithm is
proposed, the scientific community will be more interested in question (b) being
answered—that is, we will want to understand and appreciate the performance of
the algorithm across a whole set of problem instances, allowing more general con-
clusions to be drawn.
If we choose to try and answer (b) above, it is first necessary to decide what types
of graphs (i.e., what population of problem instances) we wish to make statements
about. For instance, this might be the set of all three-colourable graphs, or it could
be the set of all graphs containing fewer than 1000 vertices. Typically, populations
like these will be very large, or perhaps unlimited in size, and so it will be necessary
to test our algorithms on randomly selected samples of these populations. Under
appropriate experimental conditions, we might then be able to use the outcomes of
Fig. 3.16 Illustration of how different binary l-tuples can represent graphs that are isomorphic
these trials to make helpful statistical statements about the population itself, such as:
“With ≥ 95% confidence, Algorithm A produces solutions with fewer colours than
Algorithm B on this particular graph type”.
In this section, to compare the performance of our five heuristics, we make use of
the following facts to define our population. Given a graph with n vertices, there are
a total of l = (n choose 2) different pairs of vertices. Any graph with n vertices can
therefore be represented by a binary l-tuple b = (b1 , b2 , . . . , bl ), where an element
bi = 1 if the corresponding pair of vertices are adjacent, and bi = 0 otherwise.
Now let B(n) denote the set of all possible binary l-tuples, where l = (n choose 2).
The size of this set is |B(n)| = 2^l, and B(n) can therefore be viewed as the set of all
possible ways of connecting the vertices of an n-vertex graph. However, we must be
careful with this interpretation, as it is not quite the same as saying that B(n) represents
the set of all graphs with n vertices (which it does not), because it fails to take into
account the principle of graph isomorphisms.
Consider the example in Fig. 3.16, where we show two different binary 6-tuples
and illustrate the graphs that they represent, called G 1 and G 2 here. Note that when we
come to colour G 1 and G 2 , their vertex labels are of little importance; indeed, without
the labels these two graphs might be considered identical. In these circumstances G 1
and G 2 are considered isomorphic as there exists a way of converting one graph into
the other by simply relabelling the vertices (in this example we can convert G 1 to
G 2 by relabelling v1 as v2 , v2 as v4 , v3 as v3 and v4 as v1 ). Because the set B(n) fails
to take these isomorphisms into account, it must therefore be interpreted as the “set
of all n-vertex graphs and their isomorphisms”, as opposed to the set of all n-vertex
graphs itself.
To generate a single member of the set B(n) at random (i.e., to select an element
of B(n) such that each element is equally likely), it is simply necessary to generate
an l-tuple b in which each element bi is set to one with a probability of one half, and
set to zero otherwise. This is the same process as producing a random graph with
edge probability p = 0.5.
Observe that the degrees of the vertices in a random graph are binomially dis-
tributed; that is, deg(v) ∼ B(n − 1, p). As a result, they are also sometimes known
as binomial graphs. Random graphs will be the focus of our algorithm comparison
in this chapter, though we will also look at other types of graphs in later chapters.
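This sampling procedure can be written directly; the following function is illustrative (our own naming), returning the graph as a dictionary of neighbour sets:

```python
import random

def random_graph(n: int, p: float = 0.5, seed=None):
    """Sample G(n, p): each of the (n choose 2) vertex pairs becomes an
    edge independently with probability p.  With p = 0.5, every member
    of B(n) is equally likely."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj
```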
Before comparing the quality of the solutions produced by our five heuristic algo-
rithms, let us first determine their complexities using big O notation.
Recall from Sect. 3.1 that the Greedy heuristic takes each vertex v ∈ V in turn
and assigns it to the colour j ∈ {1, 2, . . . , Δ(G)+1} that represents the lowest colour
label not being used by any of v’s neighbours. An efficient way of determining a
value for j is shown in Fig. 3.17. This operates using a (Δ(G) + 1)-tuple called used
whose values are all initially set to false. Steps (1) to (3) of this procedure consider
each neighbour u of v. If u is already coloured then used(c(u)) is set to true. On
completion of this loop, false entries in used therefore correspond to colours not
being used by the neighbours of v.
Steps (4) to (6) of this procedure then determine a value for j by simply identifying
the index of the first false entry in used. Note that a false entry will always exist
among the first deg(v) + 1 elements. Finally, Steps (7) to (10) set the true elements
of used back to false.
The Greedy heuristic can be implemented by first setting c(v) to null for each
v ∈ V , and setting used(i) to false for all i ∈ {1, 2, . . . , Δ(G) + 1}. This is clearly
an O(n) process. The graph can then be coloured by making n separate calls to
Get-Lowest-Feasible-Colour, one for each vertex v ∈ V . Since each of these
calls hasa complexity of O(deg(v)), the overall complexity of Greedy is therefore
O(n + v∈V deg(v)) = O(n + m).
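The used-tuple idea of Get-Lowest-Feasible-Colour can be sketched in Python as follows; the structure mirrors, but is not taken from, the book's pseudocode, and the dictionary-of-sets representation is assumed:

```python
def greedy(adj, perm):
    """Greedy using the used-tuple idea of Get-Lowest-Feasible-Colour:
    mark the colours of v's coloured neighbours, take the lowest unmarked
    colour, then reset only the marks that were set (O(n + m) overall)."""
    max_deg = max(len(adj[v]) for v in adj)
    used = [False] * (max_deg + 2)          # indices 1 .. Delta(G) + 1
    colour = {v: None for v in adj}
    for v in perm:
        touched = []
        for u in adj[v]:                    # Steps (1)-(3): mark used colours
            if colour[u] is not None:
                used[colour[u]] = True
                touched.append(colour[u])
        j = 1
        while used[j]:                      # Steps (4)-(6): first unmarked colour
            j += 1
        colour[v] = j
        for k in touched:                   # Steps (7)-(10): reset the marks
            used[k] = False
    return colour
```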
As we saw in Sect. 3.2.2.2, the Welsh–Powell heuristic operates in the same
way as Greedy except that vertices are first labelled such that deg(v1 ) ≥ deg(v2 ) ≥
· · · ≥ deg(vn ). This relabelling can be achieved in O(n lg n) time by a standard
sorting algorithm such as Merge Sort or Heap Sort. The overall complexity of
Welsh–Powell is therefore slightly more expensive than Greedy at O(n lg n +m).
We now turn our attention towards the DSatur algorithm. In his original publication,
Brélaz [6] states that the complexity of DSatur is O(n^2). This can be achieved
by performing n separate applications of an O(n) process that (a) identifies the next
vertex to colour according to DSatur’s selection rules, and then (b) colours this
vertex.
For sparse graphs, the complexity of DSatur can be significantly improved by
making use of a priority queue. During execution, this priority queue should store
all vertices that are not yet coloured, together with their saturation degree and their
degree in the subgraph induced by the uncoloured vertices. The priority queue should
also allow the selection of the next vertex to colour (according to DSatur’s selection
rules) in constant time.
A description of this approach is shown in Fig. 3.18. As shown, this procedure
uses three n-tuples:
In this pseudocode, the priority queue Q stores details about the uncoloured vertices.
This information is in a 3-tuple containing the name of the vertex v, its saturation
degree |nc(v)|, and d(v) respectively.
The contents of Q, c, nc, and d are initialised in Steps (1) to (6) of Fig. 3.18. In the
remaining steps, an uncoloured vertex u is first removed from Q and coloured using
the Get-Lowest-Feasible-Colour procedure. In Steps (12) to (16), the values of
nc and d are then adjusted for each uncoloured neighbour v of u, and the contents
of Q are updated accordingly.
The asymptotic running time of this version of DSatur now depends on the data
structures used for storing Q and each element of nc. An ideal option here is to
use a binary heap or self-balancing binary tree since this allows vertex selection in
constant time (Step (8)), with lookups, deletions, and insertions then being performed
in logarithmic time.2 Using these data structures, Steps (1) to (6) of Fig. 3.18 will take
O(n lg n) time. In the remaining steps, the neighbours of each vertex are considered once, and the corresponding entries are then updated using logarithmic-time operations, giving a run time of O(Σ_{v∈V} deg(v) · lg n) = O(m lg n). This leads to an overall complexity for DSatur of O((n + m) lg n).
We now consider the two heuristics from Sect. 3.4 which, we recall, produce
solutions by identifying and removing maximal independent sets from a graph.
An efficient version of Greedy-I-Set using random vertex selection is shown in Fig. 3.19. Here it is not necessary for the contents of the sets X and Y to be ordered, so a suitable option is to use hash tables, which allow the addition and removal of elements in constant time.3 Steps (1) to (3) of this procedure initialise the data structures and operate in O(n) time. The remaining steps then operate in O(Σ_{v∈V} deg(v)) = O(m) time. The overall complexity of Greedy-I-Set is therefore equivalent to that of Greedy, at O(n + m).
Finally, we consider the complexity of the RLF algorithm. In his original publication, Leighton [9] states this as O(n³); however, this can be improved upon. As we saw in
Sect. 3.4, RLF operates in much the same way as Greedy-I-Set; indeed, the only
2 In our C++ implementations we use the set container for these purposes. In most versions of
C++, these are stored as self-balancing binary trees.
3 In our C++ implementations we use the unordered_set container for storing X and Y .
68 3 Bounds and Constructive Heuristics
difference occurs in Step (6) of the pseudocode in Fig. 3.19, where the selection of a
vertex v ∈ X is made according to heuristic rules as opposed to at random. The use of
these rules does indeed increase the complexity of the procedure because, each time a vertex is selected, the number of neighbours in X and the number of neighbours in Y need to be recalculated for each uncoloured vertex. These calculations can be performed in O(m) time, meaning that the overall complexity of RLF is O(mn). Note
that this is the highest complexity of the five heuristics considered in this section.
Table 3.1 summarises the number of colours used4 by the solutions returned by the
five heuristics. These trials were carried out using random graphs of edge probability
p = 0.5. For each value of n, one hundred graphs were generated, and each algorithm was executed once on each graph. In applications of the Greedy algorithm, the vertex
permutation P was generated randomly. For Greedy-I-Set, vertices were selected
from X at random.
The results in Table 3.1 show that the two simplest algorithms, Greedy and
Greedy-I-Set, produce the poorest solutions overall. There are also no significant
differences between these results.5 As we move from left to right in the table we
see that the results returned by the corresponding algorithms improve. Each of these
improvements was also seen to be significant for the three values of n. We can there-
fore conclude that, out of these five constructive heuristics, RLF produces the best
4 Mean plus/minus standard deviation in number of colours, taken from runs across 100 graphs.
5 The samples collected for each algorithm and value of n were not generally found to be derived
from an underlying normal distribution according to a Shapiro–Wilk test; consequently, statistical
significance is claimed here according to the results of a nonparametric related samples Wilcoxon
Signed Rank test at the 0.1% significance level.
solutions across the set of all graphs and their isomorphisms for n = 100, 1000, and
2000.
The data in Table 3.1 also reveals that the generated lower and upper bounds seem
to be some distance from the number of colours ultimately used by the algorithms.
This indicates that the Welsh–Powell bound tends to provide a rather inaccurate upper
bound for random graphs of density 0.5. It also suggests two factors concerning
the lower bound: (a) that the probabilistic bound determined by Eq. (3.3) is quite
inaccurate and/or (b) that all five heuristic algorithms are producing solutions whose
numbers of colours are some distance from the chromatic number.
The graphs shown in Fig. 3.20 expand upon the results of Table 3.1 by considering
random graphs across a range of different values for p. Bounds are also indicated
by the shaded areas. We see that the unshaded areas of these graphs are generally
quite wide, with the algorithms’ results falling in a fairly narrow band within these.
This again indicates the inadequacy of the upper bounds considered in this chapter,
particularly for larger values of n. The Welsh–Powell bound appears to be the most
accurate for these graphs overall.
The differences in solution quality between these five algorithms are presented
more clearly in Fig. 3.21. Here, the bars in the charts show the number of colours
used in solutions produced by Greedy. The lines then indicate the percentage of this
figure used by the remaining four algorithms. We see that Welsh–Powell, DSatur
and RLF achieve percentages of less than 100% across all of the tested values for p,
indicating their superior performance across the set of random graphs, from sparse
to dense. We also see that RLF consistently produces the lowest percentages, once
again indicating its general superiority over the other methods.
We now consider the implications of the computational complexity of the five
heuristics. Figure 3.22 shows the strong correlation that exists between the number
of checks performed by each algorithm and its subsequent run time. The charts also show that greater amounts of computation are required for dense graphs, which is what we
should expect when we consider the complexities of these algorithms, as discussed
in Sect. 3.5.2.
The scales of the charts in Fig. 3.22 also give us further information. We see that
the number of checks performed by Greedy, Greedy-I-Set, and Welsh–Powell
are very similar, indicating that the additional O(n lg n) sorting operation required by
Welsh–Powell has a negligible effect. On the other hand, we see that the CPU time
of Greedy-I-Set is higher than the other two methods. This is due to the additional
overheads of using the C++ container unordered_set for storing X and Y in the
implementation. Finally, we also see that the computational requirements of RLF are
the highest of the five algorithms. This is to be expected due to its higher complexity
of O(mn).
Fig. 3.20 Number of colours in the solutions produced by the Greedy, Greedy-I-Set, Welsh–Powell, DSatur, and RLF algorithms on random graphs G n, p for various values of p, using n = 100, 1000, and 2000. Bounds are indicated by the shaded areas
3.6 Chapter Summary and Further Reading
In this chapter, we have reviewed a number of bounds for the graph colouring prob-
lem. We have also compared and contrasted five constructive heuristics. A summary
of these bounds and heuristics is shown in Table 3.2. For random graphs of different
sizes and densities (including sets of graphs and their isomorphisms), we have seen
that the RLF algorithm generally produces the solutions with the fewest colours,
though this comes at the expense of added computation time.
In the next two chapters, we will analyse techniques that seek to improve upon
the solutions produced by these constructive heuristics. We now end this chapter by
providing points of reference for further work on bounds for the chromatic number.
Reed [10] has shown that Brooks’ bound (Theorem 3.8) can be improved by one
colour when a graph G has a sufficiently large value for Δ(G) and also has no cliques
of size Δ(G). Specifically:
Theorem 3.15 (Reed [10]) There exists some value δ such that if Δ(G) ≥ δ
and ω(G) ≤ Δ(G) − 1 then χ (G) ≤ Δ(G) − 1.
In this work, a sufficient value for δ is shown to be 10¹⁴. This is obviously a very
large number, though it is suggested that “a more careful analysis could bring the
bound down to 1000”.
A further conjecture, now known as Reed’s Conjecture, is also stated in this paper.
This proposes that for any graph G,

    χ(G) ≤ ⌈(1 + Δ(G) + ω(G)) / 2⌉.    (3.7)
A good survey on these issues can be found in the work of Cranston and Rabern [11].
In Sect. 3.2.1.1 of this chapter, we saw that interval graphs (and more generally
chordal graphs) always feature chromatic numbers χ (G) equal to their clique num-
bers ω(G). Chordal graphs form part of a larger family of graphs known as perfect
graphs which, in addition to satisfying this criterion, are also known to maintain this property when any of their vertices are removed.
Defining the structures needed for a graph to be perfect has been the subject of
much research in the field of graph theory and was eventually settled by Chudnovsky
et al. [12], who proved the earlier conjecture of Berge [13], which stated that a graph
is perfect if and only if it contains no odd hole and no odd antihole. (A hole is an induced subgraph that is a cycle of length at least four; an antihole is the complement of a hole.)
See the work of Mackenzie [14] for further details.
Fig. 3.21 Mean quality of solutions achieved on random graphs G n, p by the Greedy-I-Set,
Welsh–Powell, DSatur and RLF algorithms in comparison to Greedy. All points are the mean
across 100 graphs using n = 100, 1000, and 2000 respectively
Table 3.2 Summary of the bounds and heuristic algorithms considered in this chapter

Bound | Notes
χ(G) ≥ ω(G) | Involves calculating the size of the largest clique ω(G), which is N P-hard
χ(G) ≥ n/α(G) | Involves calculating the size of the largest independent set α(G), which is N P-hard
χ(G) ≤ 1/2 + √(2m + 1/4) | Derived from Theorem 3.5
χ(G) ≤ Δ(G) + 1 | Can be observed due to the behaviour of the Greedy algorithm. Brooks' bound strengthens this to χ(G) ≤ Δ(G), providing G is not a complete graph or an odd cycle
χ(G) ≤ max_{i=1,...,n} min(deg(v_i) + 1, i) | Assumes vertices are labelled such that deg(v_1) ≥ deg(v_2) ≥ . . . ≥ deg(v_n). At least as good as the bound χ(G) ≤ Δ(G) + 1
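As a worked example of the final bound in the table, the helper below computes max_{i=1,...,n} min(deg(v_i) + 1, i) from a graph's degree sequence; the function name and interface are our own.

```cpp
#include <algorithm>
#include <vector>

// Welsh-Powell upper bound on the chromatic number. With the degrees sorted
// into non-increasing order (deg(v1) >= deg(v2) >= ...), the bound is the
// maximum over the 1-based positions i of min(deg(v_i) + 1, i).
int welsh_powell_bound(std::vector<int> degrees) {
    std::sort(degrees.rbegin(), degrees.rend());   // non-increasing order
    int bound = 0;
    for (int i = 1; i <= (int)degrees.size(); ++i)
        bound = std::max(bound, std::min(degrees[i - 1] + 1, i));
    return bound;
}
```

For the complete graph K4 (degree sequence 3, 3, 3, 3) this gives 4, matching Δ(G) + 1; for the star K1,4 (degrees 4, 1, 1, 1, 1) it gives 2, whereas Δ(G) + 1 = 5.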
Fig. 3.22 Relationship between the number of checks and the execution time of, respectively, the
Greedy, Greedy-I-Set, Welsh–Powell, DSatur and RLF algorithms. Each point in the figure
is the mean taken from 100 random graphs G 1000, p . Moving from left to right in each figure, these
probabilities are p = 0.025, 0.05, 0.075, . . . , 0.975. All experiments were performed on a 3.0 GHz
Windows 7 PC with 3.87 GB RAM
Further bounds on general graphs have also been given by Berge [16], who finds

    n² / (n² − 2m) ≤ χ(G),    (3.10)
and Hoffman [17], who has shown

    1 − λ1(G)/λn(G) ≤ χ(G),    (3.11)
where λ1 (G) and λn (G) are the biggest and smallest eigenvalues of the adjacency
matrix of G. Both of these usually give very loose lower bounds in practice, however.
Note that the five constructive methods in this section should be classed as heuristic
algorithms as opposed to approximation algorithms. Unlike heuristics, approxima-
tion algorithms are usually associated with provable bounds on the quality of solu-
tions they produce compared to the optimal. So for the graph colouring problem,
using A(G) to denote the number of colours used in a feasible solution produced
by algorithm A with graph G, a good approximation algorithm should feature an
approximation ratio A(G)/χ(G) as close to 1 as possible. To date, the best-known approximation ratio for a graph colouring algorithm is O(n(log log n)²/(log n)³), due
to Halldórsson [18]. This method operates by randomly selecting independent sets
which are then allocated a colour and removed from the graph. The process repeats
until the graph is empty. Those seeking an algorithm with a low approximation ratio
for the graph colouring problem, however, should take note of the following theorem:
Theorem 3.16 (Garey and Johnson [19]) If, for some constant r < 2 and constant d, there exists a polynomial-time graph colouring algorithm A which is guaranteed to produce A(G) ≤ r χ(G) + d, then there also exists a polynomial-time algorithm A′ which guarantees A′(G) = χ(G).
In other words, this states that we cannot hope to find an approximation algorithm
A for the graph colouring problem that, for all graphs, produces A(G) < 2χ (G)
unless P = N P .
References
1. Bollobás B (1998) Modern graph theory. Springer
2. Mycielski J (1955) Sur le coloriage des graphes. Colloq Math 3:161–162
3. Rose D, Lueker G, Tarjan R (1976) Algorithmic aspects of vertex elimination on graphs. SIAM
J Comput 5(2):266–283
4. Brooks R (1941) On colouring the nodes of a network. Math Proc Cambridge Philos Soc
37:194–197
5. Welsh D, Powell M (1967) An upper bound for the chromatic number of a graph and its
application to timetabling problems. Comput J 10(1):85–86
6. Brélaz D (1979) New methods to color the vertices of a graph. Commun ACM 22(4):251–256
7. Janczewski R, Kubale M, Manuszewski K, Piwakowski K (2001) The smallest hard-to-color
graph for algorithm DSatur. Discret Math 236:151–165
8. Spinrad J, Vijayan G (1984) Worst case analysis of a graph coloring algorithm. Discret Appl
Math 12:89–92
9. Leighton F (1979) A graph coloring algorithm for large scheduling problems. J Res Natl Bur
Stand 84(6):489–506
10. Reed B (1999) A strengthening of Brooks’ theorem. J Comb Theory Ser B 76(2):136–149
11. Cranston D, Rabern L (2014) Brooks’ theorem and beyond. J Graph Theory. https://doi.org/
10.1002/jgt.21847
12. Chudnovsky M, Robertson N, Seymour P, Thomas R (2006) The strong perfect graph theorem.
Ann Math 164(1):51–229
13. Berge C (1960) Les problémes de coloration en théorie des graphes. Publ Inst Stat Univ Paris
9:123–160
14. Mackenzie D (2002) Graph theory uncovers the roots of perfection. Science 297:38
15. Bollobás B (1988) The chromatic number of random graphs. Combinatorica 8(1):49–55
16. Berge C (1970) Graphs and hypergraphs. North-Holland
17. Hoffman A (1970) On eigenvalues and colorings of graphs. In: Graph theory and its applica-
tions, Proc. Adv. Sem., Math., Research Center, University of Wisconsin, Madison, WI, 1969.
Academic Press, New York, pp 79–91
18. Halldórsson M (1993) A still better performance guarantee for approximate graph
coloring. Inf Process Lett 45(1):19–23. https://www.sciencedirect.com/science/article/pii/
0020019093902466
19. Garey M, Johnson D (1976) The complexity of near-optimal coloring. J Assoc Comput Mach
23(1):43–49
4 Advanced Techniques for Graph Colouring
In this chapter, we review many of the algorithmic techniques that can be used for the
graph colouring problem. The intention is to give the reader an overview of the differ-
ent strategies available, including both exact and heuristic methods. As we will see,
a variety of different approaches are available, including backtracking algorithms,
integer programming, column generation, evolutionary algorithms, neighbourhood
search, and other metaheuristics. Full descriptions of these techniques are provided
as they arise in the text. We also describe ways in which graph colouring problems
can be reduced in size and/or broken up, helping to improve algorithm performance
in many cases.
Exact algorithms are those that, given sufficient time, will always determine the
optimal solution to a computational problem. As discussed in Chap. 2, one way of
exactly solving an N P -hard combinatorial problem such as graph colouring is to
exhaustively search the space of all possible candidate solutions; however, as problem
sizes grow, the running times for such brute-force methods soon become too large,
making them impractical.
Despite this, it is still possible to design exact algorithms that are significantly
faster than exhaustive search, though still not operating in polynomial time. Often
we can also choose to impose computation limits on these algorithms, allowing good
quality (though not-necessarily-optimal) solutions to be returned within reasonable
time frames. Three alternatives are considered in the following subsections, namely,
backtracking, integer programming, and column generation.
Fig. 4.1 The complete search tree that results when attempting to colour the graph G with a
maximum of three colours
Backtracking techniques will often allow us to disregard large sections of the search
tree, thereby improving execution times. This is usually referred to as pruning. To
see this, consider a parse of this tree from the root down in depth-first order (that is,
we are navigating the search tree such that the leaf nodes are considered from left to
right).
1. First, at each node x in the tree, a test can be conducted to see whether the current
partial solution can be completed to make a feasible solution. If it cannot, then
the whole of the subtree rooted at x can be pruned, and therefore ignored by the
algorithm.
An example of this occurs at the node marked by the asterisk (*) in Fig. 4.1. Here
we see that the assignment of the colour black to v2 will result in a clash with
its neighbour v1 , which is also black. As a result, there is no need to consider
the subtree rooted at x because all of its leaf nodes define infeasible solutions.
Instead, the algorithm can backtrack to the parent node and consider its other
branches.
2. Second, consider the situation where solutions are evaluated according to a cost
function that we want to minimise. Suppose further that the best feasible solution
observed so far, S′, has a cost of f(S′), but that the algorithm has not yet completed
its parse of the search tree. Now, let x be a node in the search tree such that all
complete solutions S stemming from x are known to have a cost f(S) ≥ f(S′).
In this case, the whole of the subtree rooted at x can be pruned because none of
its solutions will improve on the best solution seen so far. Instead, the algorithm
can again backtrack to the parent node.
As we know, for graph colouring the cost of a feasible solution is given by the
number of different colours it is using. This means that once a feasible k-colouring
has been achieved, there is no need to consider paths in the search tree that involve
using k or more colours. More specifically, given a node x in the search tree, if
the path from the root to x uses k or more colours, then the subtree rooted at x
can be ignored. In Fig. 4.1, for example, we see that a feasible two-colouring is
established at the eleventh leaf node from the left. From this point onwards, there
is no need to consider subtrees rooted at any node x where the root-to-x path uses
two or more colours.
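These ideas can be sketched in a few lines of C++. The code below is our own minimal illustration, not the algorithm examined in Chap. 5: Rule 1 is applied by skipping colours that clash with a coloured neighbour, and Rule 2 by abandoning any partial solution that already uses as many colours as the best feasible solution found so far. Restricting each vertex to at most one brand-new colour class also removes symmetric permutations of the colours.

```cpp
#include <algorithm>
#include <vector>

// Backtracking sketch for graph colouring with the two pruning rules above.
// Rule 1: never assign a colour that clashes with a coloured neighbour.
// Rule 2: never branch into a subtree whose colour count cannot beat the
// best feasible solution found so far.
struct Backtrack {
    const std::vector<std::vector<int>>& adj;
    std::vector<int> c;
    int best;   // colours in the best feasible solution so far (upper bound)

    explicit Backtrack(const std::vector<std::vector<int>>& a)
        : adj(a), c(a.size(), -1), best((int)a.size() + 1) {}

    void dfs(int v, int used) {
        if (used >= best) return;                       // Rule 2
        if (v == (int)adj.size()) { best = used; return; }
        // Branch on colours 0..used; colour `used` opens a new class.
        for (int col = 0; col <= used && col < best - 1; ++col) {
            bool clash = false;
            for (int u : adj[v])
                if (c[u] == col) { clash = true; break; }   // Rule 1
            if (clash) continue;
            c[v] = col;
            dfs(v + 1, std::max(used, col + 1));
            c[v] = -1;
        }
    }
};

// Returns the chromatic number of the graph given by adjacency lists.
int chromatic_number(const std::vector<std::vector<int>>& adj) {
    if (adj.empty()) return 0;
    Backtrack b(adj);
    b.dfs(0, 0);
    return b.best;
}
```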
The application of these two rules allows the backtracking algorithm to remove
many parts of the search tree, thereby improving performance. Of course, due to the
N P -hardness of graph colouring, the time requirements of this approach may still
be excessively large, and executions will often need to be terminated prematurely,
perhaps leaving the user with a suboptimal solution. On the other hand, the systematic
construction of different solutions, together with the way that the algorithm can
ignore large swathes of the solution space, means that backtracking is usually far
more efficient than brute-force enumeration.
Continuing with our example, Fig. 4.2 shows the parts of the search tree from
Fig. 4.1 that are ultimately considered by the backtracking algorithm for graph colour-
ing. As shown, each path through the tree terminates in one of three ways: by an
80 4 Advanced Techniques for Graph Colouring
Fig. 4.2 The search tree considered by the backtracking algorithm when trying to colour G using a
maximum of three colours. Nodes in the tree marked by “X” indicate applications of Rule 1. Nodes
marked by “–” indicate applications of Rule 2
• Using appropriate orderings of the vertices. In the example of Fig. 4.1, the vertices
are considered in label order; however, this is an arbitrary choice. Better perfor-
mance has been noted if vertices are sorted by decreasing degree, or by ordering
vertices such that those with the fewest available colours are coloured first (in a
similar fashion to the DSatur algorithm).
• Using better branching rules. In our current example, each branch is considered in
order from left to right, that is, the vertex is first considered for assignment to black,
then grey, then white. However, better performance might be gained by making
more informed decisions. For example, we may choose to explore branches that
involve assigning vertices to the largest colour classes first (the rationale being that
forming large colour classes leads to an overall reduction in the number of colour
classes). Similarly, we might choose to prioritise colour classes that contain large
numbers of high-degree vertices.
• Precolouring a large clique. If a clique C = {v1, . . . , vl} can be identified in the
graph, we can (a) precolour its vertices as c(vi) = i (for all vi ∈ C) and (b) prevent any branching at nodes corresponding
to these vertices in the search tree. In addition to reducing the size of the search
tree, this method also provides a lower bound on χ(G); consequently, if a feasible
l-colouring is achieved during a run, the algorithm can halt immediately and give
the user a provably optimal solution.
A backtracking algorithm using an effective combination of these schemes is
considered further in Chap. 5.
Another way of achieving an exact algorithm for graph colouring is to use a special
type of linear programming model called integer programming (IP). Linear pro-
gramming (LP) is a general methodology for achieving optimal solutions to linear
mathematical models. Such models consist of variables, linear constraints, and a
linear objective function. The variables take on numerical values, and the constraints
are used to define feasible ranges for these variables. The objective function is then
used to measure the quality of a solution and to define the particular assignment of
values to variables that is considered optimal.
In general, the variables of an LP are continuous in the sense that they are permit-
ted to be fractional. On the other hand, IP models are those in which the variables
are restricted to integer values. Though this might seem like a subtle difference, this
insistence on integer-valued variables greatly increases the number of problems that
can be modelled. Indeed, IP models can be used to solve a wide variety of combina-
torial problems, including supply chain design, resource management, timetabling,
employee rostering, and, as we will see presently, graph colouring.
Whereas algorithms such as the well-known simplex method are known to be
effective for solving LPs, there is no single preferred technique for solving integer
programs. Instead, various exact methods are available, including branch-and-bound,
cutting-plane, branch-and-price, as well as various hybrid techniques. Because of
their wide applicability, several off-the-shelf software applications have also been
developed in recent decades for solving linear and integer programming models.
These include commercial packages such as Xpress, CPLEX, and Gurobi, and free
open-source applications such as the SCIP optimisation suite and Coin-OR. Such
packages allow users to input their particular model (in terms of variables, constraints,
and an objective function) and then simply click a button, at which point the software
goes on to produce solutions using the methods just mentioned.
In this subsection, we will focus on how branch-and-bound can be used to solve
the graph colouring problem. Readers interested in finding out about other methods
for integer programming are invited to consult the textbook of Wolsey [2], which
provides a thorough overview of the subject.
For now, we will consider a very basic IP formulation of the graph colouring
problem. As usual, let G = (V, E) be a graph with n vertices and m edges. We now
define a binary matrix X of size n × n and a binary vector Y of length n that will hold the variables of this problem.
Fig. 4.3 Example branch-and-bound search tree for the IP specified in (4.1)–(4.7). In this case,
the root problem is defined by dropping the integrality constraints for X. The branching variable is
always selected as the first fractional value in X
If the LP relaxation at a node yields an integer solution, that node is fathomed, and the solution, together with its cost, should be stored. There are also two further ways in which a node can become fathomed: first, if the LP specified at this node admits no feasible solution; second, if the optimal solution to the LP has a cost that is worse than the best integer solution observed so far.
An example of how the branch-and-bound process operates with our IP formu-
lation for graph colouring is shown in Fig. 4.3. To begin, we have defined the root
problem by removing the integrality constraints for X. An optimal solution for this
relaxed problem appears at the top of the figure and features a cost of two. Since the
solution contains fractional values, we now branch on this node. This results in two
new problems, one with the additional constraint X 1,2 = 0 and one where X 1,2 = 1.
Continuing this process leads to the tree shown. Note that in this particular case, all
of the leaf nodes result in integer solutions; hence, all leaf nodes are fathomed and
there is no need for further branching. The best observed integer solution in this tree
Fig. 4.4 Screenshot of the application XPress IVE. In this case, we are using branch-and-bound
to optimally colour a small random graph G 30,0.5 using the IP given in (4.1)–(4.7). Here, “best
solution” indicates the upper bound and “best bound” indicates the lower bound. The algorithm
halts (with a certificate of optimality) when these values have become equal
corresponds to an optimal solution to the original IP. The cost of this solution also
corresponds to the chromatic number of the graph (three in this case).
During the execution of the branch-and-bound algorithm, note that two important
values are stored. The first of these is the cost of the best integer solution observed
so far. In cases where we are attempting to minimise the cost function (such as here),
this value gives an upper bound, telling us that we will never need to accept a solution
with a cost that is worse (higher) than this value. The second value is a lower bound
and is obtained by taking the best (minimum) cost across all of the unfathomed leaf
nodes in the current search tree. This tells us that there is no integer solution to the
original IP that has a cost that is better (lower) than this value. If an integer solution
is obtained whose cost equals this lower bound, this tells us that an optimal solution
has been found. In this case, the branch-and-bound algorithm can halt immediately,
providing the user with a certificate of optimality.
Figure 4.4 illustrates how branch-and-bound can refine these upper and lower
bounds during a run using a small random graph G 30,0.5 . As shown, in this particular
case, the algorithm has produced integer solutions with costs of eight, and then
seven, in just under a second. In the remainder of the run, this latter solution is not
improved upon, so the upper bound does not change further; however, the expansion
of the search tree allows the lower bound to be improved. At around 10 seconds, the
upper and lower bounds are seen to be equal, proving that an optimal solution to the
problem (using seven colours) has been obtained.
Together, these constraints specify a unique permutation of the first k columns for
each possible k-colouring. Specifically, vertex v1 must be assigned to colour 1, v2
must be assigned to either colour 1 or colour 2, and so on. (Or, in other words, the
columns are sorted according to the minimally labelled vertex in each colour class.)
Under these constraints the optimal solution to our example problem is now:
X =
    ( 1 0 0 0 0 0 0 0 )
    ( 0 1 0 0 0 0 0 0 )
    ( 0 1 0 0 0 0 0 0 )
    ( 0 0 1 0 0 0 0 0 )
    ( 1 0 0 0 0 0 0 0 )
    ( 1 0 0 0 0 0 0 0 )
    ( 1 0 0 0 0 0 0 0 )
    ( 0 0 1 0 0 0 0 0 )

Y = ( 1 1 1 0 0 0 0 0 )
as required.
Fig. 4.6 Time required to solve (to optimality) various random graphs G n, p using branch-and-
bound
subject to:

    X_ij + X_lj ≤ Y_j    ∀{v_i, v_l} ∈ E, ∀j ∈ {1, . . . , n}    (4.16)

    Σ_{j=1}^{n} X_ij = 1    ∀v_i ∈ V    (4.17)

    X_ij = 0    ∀v_i ∈ V, j ∈ {i + 1, . . . , n}    (4.18)

    X_ij ≤ Σ_{l=j−1}^{i−1} X_{l,j−1}    ∀v_i ∈ V − {v_1}, ∀j ∈ {2, . . . , i − 1}    (4.19)

    X_ij ∈ {0, 1}    ∀v_i ∈ V, ∀j ∈ {1, . . . , n}    (4.20)

    Y_j ∈ {0, 1}    ∀j ∈ {1, . . . , n}.    (4.21)
In these trials, we used the software XPress IVE (v. 8.11) for specifying and
solving this IP model. A full listing of this code is given in Appendix A.5. All trials
were conducted on a 3.2 GHz Windows machine with 8 GB of RAM using a time
limit of 1 hour.
Figure 4.6 shows the times required for solving various random graphs G n, p . For
p = 0.5, we see that branch-and-bound can solve graphs of up to 40 vertices in
very short amounts of time. Beyond this, however, the time requirements increase
drastically—indeed, no graphs with n > 50 were solved within the 1-hour time limit.
For the other values of p, similar results occur, though slightly larger values of n can
be tolerated. That said, no graphs with n ≥ 100 have been solved within the time
limit here.
Similar results to these are shown in Fig. 4.7. In the top chart (where n = 25),
all problems are solved to optimality in a matter of seconds. For the larger random
graphs G 50, p , we see that most problems are solved within the time limit, but that
difficulties arise for values of p between 0.55 and 0.7. In these cases, there exists a
gap between the upper and lower bounds, indicating that a provably optimal solution
has not been established within the 1-hour time limit. These patterns are even more
striking for G 100, p where only p = 0.05 is solved within the time limit.
An exact algorithm for graph colouring can also be created using a technique known
as column generation. The idea in column generation is to make use of integer
programming, but to also avoid considering all variables of the problem explicitly;
instead, variables are only introduced when they have the potential to improve the
objective function. This will often make problem sizes more manageable, allowing
larger problems to be tackled.
To apply column generation with graph colouring, we can make use of the mini-
mum set covering problem. This is defined as follows.
For example, given U = {1, 2, 3, 4} and S = {{1}, {1, 2}, {1, 3}, {3, 4}, {4}}, one
example covering is {{1, 2}, {1, 3}, {4}}, which contains three elements of S and all
members of U. However, a minimum covering in this case is {{1, 2}, {3, 4}}, which
contains just two elements of S.
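Claims like these can be checked by brute force for tiny instances: enumerate every sub-collection of S as a bitmask and keep the smallest one that covers U. The function below is our own illustration; its running time is exponential in |S|.

```cpp
#include <algorithm>
#include <set>
#include <vector>

// Brute-force minimum set covering. Returns the size of the smallest
// sub-collection of S whose union contains U, or -1 if no covering exists.
int min_set_cover(const std::set<int>& U,
                  const std::vector<std::set<int>>& S) {
    const int k = (int)S.size();
    int best = -1;
    for (int mask = 0; mask < (1 << k); ++mask) {
        std::set<int> covered;
        int used = 0;
        for (int i = 0; i < k; ++i)
            if (mask & (1 << i)) {
                covered.insert(S[i].begin(), S[i].end());
                ++used;
            }
        bool coversU = std::includes(covered.begin(), covered.end(),
                                     U.begin(), U.end());
        if (coversU && (best == -1 || used < best)) best = used;
    }
    return best;
}
```

On the example above (U = {1, 2, 3, 4}) this confirms a minimum covering of size two.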
The minimum set covering problem can be formulated by the following integer
program:
    min Σ_{S∈S} X_S    (4.22)

subject to:

    Σ_{S : u∈S} X_S ≥ 1    ∀u ∈ U    (4.23)

    X_S ∈ {0, 1}    ∀S ∈ S.    (4.24)
In this formulation, the variable X S = 1 if the subset S ∈ S is being used in the
covering; else X S = 0. We are therefore seeking to minimise the number of sets
being used for the covering (4.22), while covering every element u of the universe
U (4.23). The integrality constraints are given by (4.24).
Fig. 4.8 Part a shows a small graph with n = 5 vertices and m = 6 edges. As a set covering
problem, this graph colouring problem is defined by the universe U = V = {v1 , v2 , v3 , v4 , v5 }
and the set of all maximal independent sets S = {{v1 , v3 }, {v1 , v4 }, {v2 , v5 }, {v3 , v5 }}. In this case,
a minimum set covering of the universe is given by S′ = {{v1 , v3 }, {v1 , v4 }, {v2 , v5 }}, which has
a size of three. If S′ contains any duplicates (as is the case with v1 here), these can be removed
arbitrarily to form a feasible colouring. An optimal three-colouring in this example is therefore
S′ = {{v1 , v3 }, {v4 }, {v2 , v5 }}, as shown in Part b
To solve the graph colouring problem using set covering, we can take the set
S of all maximal independent sets of a graph G (see Definition 3.14). An optimal
colouring of G can then be found by simply identifying a minimum set covering S′
of the universe U = V = {v1 , v2 , . . . , vn }. An example of this process is shown in
Fig. 4.8.
Attempting to solve the graph colouring problem using a set covering formulation
brings two difficulties, however.
• First, the set covering problem is itself an N P -hard problem [5], implying that it
cannot be solved in polynomial time in general.
• Second, the number of maximal independent sets in a graph has the potential to
grow exponentially in relation to graph size, meaning that the task of constructing
a set S that contains all maximal independent sets will often be beyond our means.
To illustrate the second point, Table 4.1 shows the number of maximal independent
sets that exist in different random graphs G n, p . As n is increased, we see that the
number of maximal independent sets grows quickly. This is particularly the case for
sparser graphs where, in our case, there is often insufficient memory to complete the
computation. On the other hand, for dense graphs, the presence of so many edges in
the graphs makes the maximal independent sets quite small, meaning that they can
be enumerated quickly, even for fairly large values of n.
If we want to use set covering principles to solve the graph colouring problem, an
obvious approach is to populate the set S with all maximal independent sets of G and
then seek to solve the integer program specified in (4.22)–(4.24). As we have seen,
however, this has the potential to involve huge numbers of variables. In addition, it
is likely that most of these variables will assume values of zero, meaning that the
corresponding members of S are not needed to optimally colour the graph. Column
generation seeks to resolve these issues by using a smaller number of variables, and
then only adding further variables to the IP when it is deemed necessary.
4.1 Exact Algorithms 93
Table 4.1 Number of maximal independent sets in random graphs Gn,p. These figures were generated using the NetworkX command nx.find_cliques(H), where H is the complement of the random graph Gn,p. Further details on the NetworkX library are given in Appendix A.3. The number of seconds required to execute these operations is shown in parentheses beneath each count. Missing values indicate that the operations did not complete due to an Out of Memory error. All trials were conducted on a 3.2 GHz Windows machine with 8 GB RAM

           Edge probability p
n          0.1         0.25        0.5         0.75        0.9
50         43,815      9015        1021        205         87
           (0.26)      (0.06)      (<0.01)     (<0.01)     (<0.01)
100        –           1,314,202   14,841      1227        337
                       (9.75)      (0.11)      (<0.01)     (<0.01)
250        –           –           1,578,449   22,925      2599
                                   (13.82)     (0.16)      (<0.01)
500        –           –           –           263,957     15,903
                                               (1.95)      (0.09)
1000       –           –           –           4,144,959   99,646
                                               (37.97)     (0.60)
2000       –           –           –           –           747,296
                                                           (6.80)
In more detail, column generation operates by starting with a relatively small set
S. The contents of S might be generated at random or via specialised heuristics—at
this stage it is only necessary that S covers the universe of the set covering problem.
In the next stage, the linear relaxation of the set covering IP is solved. The dual
values from this optimal solution are then used to define a so-called pricing problem
that needs to be solved. The solution to this pricing problem can then be used to
determine if further variables need to be added to the IP. If this is the case then S is
updated, the linear relaxation is solved again, and the process is repeated. Otherwise,
an integer solution to the IP gives an optimal solution to the original problem.
The original application of column generation to graph colouring is due to Mehro-
tra and Trick [6], who identified the pricing problem as the maximum weighted
independent set problem. This is defined as follows.
Definition 4.2 Let G = (V, E) be a graph with weights w(v) for each vertex v ∈ V. The maximum weighted independent set problem involves identifying an independent set of vertices V′ ⊆ V whose weight Σ_{v∈V′} w(v) is maximal among all independent sets of G.
In this work, the authors also consider a relaxed version of the set covering IP in which the integrality constraints (4.24) are replaced by the requirement that each variable need only be nonnegative (that is, XS ≥ 0 for all S ∈ S). This gives the LP referred to below as (4.25)–(4.27). The resulting column generation procedure operates as follows.
1. Given a graph G = (V, E), let the universe U = V. Also, let S = {S1, S2, . . . , Sl} be a set whose elements are maximal independent sets of G, and for which ∪_{S∈S} S = U.
2. Solve the LP given by (4.25)–(4.27). The optimal solution to this LP gives a dual value πi for each vertex vi ∈ V.
3. For each vi ∈ V, let w(vi) = πi. Now solve the maximum weighted independent set problem to give an independent set V′. If the weight of this solution satisfies Σ_{v∈V′} w(v) > 1, then add V′ to S and return to Step 2. Otherwise proceed to Step 4.
4. If the current solution to the linear relaxation is an integer solution then stop: we have determined an optimal colouring of G. Otherwise, solve the original set covering IP (4.22)–(4.24) using S. An optimal solution to this IP corresponds to an optimal colouring of G.
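Step 3 (the pricing step) can be illustrated in isolation. The sketch below replaces Mehrotra and Trick's exact method with naive brute force, and the dual values π are hypothetical numbers rather than the output of a real LP solve:

```python
from itertools import combinations

V = {1, 2, 3, 4, 5}
E = {(1, 2), (1, 5), (2, 3), (2, 4), (3, 4), (4, 5)}
pi = {1: 0.5, 2: 0.5, 3: 0.5, 4: 0.5, 5: 0.5}  # hypothetical dual values

def max_weight_independent_set(V, E, w):
    # Brute-force pricing problem (exponential time; illustrative only)
    def independent(s):
        return all((u, v) not in E and (v, u) not in E
                   for u, v in combinations(s, 2))
    best, best_w = set(), 0.0
    for r in range(1, len(V) + 1):
        for c in combinations(sorted(V), r):
            if independent(c):
                weight = sum(w[v] for v in c)
                if weight > best_w:
                    best, best_w = set(c), weight
    return best, best_w

col, weight = max_weight_independent_set(V, E, pi)
# If weight > 1, the set `col` would be added to S as a new column;
# here every independent set has weight at most 1.0, so nothing prices out.
```

In this example the heaviest independent set has weight exactly 1, so no column would be added and the procedure would proceed to Step 4.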
Note that the maximum weighted independent set problem is a generalisation of the maximum independent set problem seen in Sect. 2.3, which is also N P-hard. Any instance of the latter can be converted into a corresponding instance of the former by simply allocating a weight of one to all vertices of the graph G. A maximum weighted independent set of this graph is then equivalent to a maximum independent set of G.
Fig. 4.9 Example run of the column generation algorithm for graph colouring. In Iteration 1, the
contents of S are maximal independent sets that, together, cover the vertex set of G. At the end
of Iteration 3, an optimal colouring {{v1 , v3 }, {v2 , v5 }, {v4 , v6 }} for G has been determined. Note
that in this particular example the variables in the optimal solutions have assumed binary values;
however, this is not enforced. Instead, each variable only needs to assume a nonnegative value, as
defined in the LP given in (4.25)–(4.27)
The first set of algorithms we consider are those that operate in the space of feasible
colourings. Approaches of this type seek to identify solutions within this space
that feature the minimum number of colours. Often these methods make use of the
Greedy algorithm to construct solutions; hence, they are concerned with identifying
good permutations of the vertices. (Recall from Theorem 3.2 that, for any graph,
a permutation of the vertices always exists that decodes into an optimal solution
through the application of Greedy.)
One early example of this type of approach was the iterated Greedy algorithm
of Culberson and Luo [8]. This rather elegant algorithm exploits the findings of
Theorem 3.1, namely, that, given a feasible colouring S, a permutation of the vertices can be formed that, when fed back into the Greedy algorithm, results in a new feasible solution S′ such that |S′| ≤ |S|. To start, DSatur is used to produce an initial feasible solution. Then, at each iteration, the current solution S = {S1, . . . , S|S|} is
taken and its colour classes are reordered to form a new permutation of the vertices.
This permutation is then used with Greedy to produce a new feasible solution before
the process is repeated indefinitely.
Culberson and Luo [8] suggest several ways in which reorderings of the colour classes can be achieved at each iteration. These include the following:
• Largest first: the colour classes are arranged in order of decreasing size;
• Reverse: the order of the colour classes in the current solution is reversed;
• Random: the colour classes are arranged in a random order.
The largest first heuristic is used in an attempt to construct large independent sets in
the graph, while the reverse heuristic encourages vertices to be mixed among different
colour classes. The random heuristic is then used to try and prevent the algorithm
from cycling among the same set of solutions, allowing new regions of the solution
space to be explored. Culberson and Luo [8] ultimately recommend selecting these
heuristics randomly at each iteration according to the ratio 5:5:3, respectively.
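The scheme above can be sketched as follows. This is a minimal reconstruction rather than Culberson and Luo's implementation: the example graph, function names, and iteration budget are my own, but the class reorderings and the 5:5:3 selection ratio follow the text.

```python
import random

def greedy(perm, adj):
    # Greedy colouring: assign each vertex in `perm` to the first
    # colour class containing none of its neighbours.
    classes = []
    for v in perm:
        for cls in classes:
            if not any(u in adj[v] for u in cls):
                cls.add(v)
                break
        else:
            classes.append({v})
    return classes

def iterated_greedy(adj, iterations=200, seed=1):
    rng = random.Random(seed)
    S = greedy(list(adj), adj)
    for _ in range(iterations):
        # Choose a reordering heuristic in the ratio 5:5:3
        h = rng.choices(("largest", "reverse", "random"), weights=(5, 5, 3))[0]
        if h == "largest":
            S.sort(key=len, reverse=True)
        elif h == "reverse":
            S.reverse()
        else:
            rng.shuffle(S)
        perm = [v for cls in S for v in cls]   # classes kept contiguous
        S = greedy(perm, adj)                  # never uses more colours
    return S

# Small assumed example: a 5-cycle, whose chromatic number is 3
adj = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {0, 3}}
S = iterated_greedy(adj)
```

Because vertices from the same colour class remain contiguous in each new permutation, the number of colours can never increase from one iteration to the next.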
Two further algorithms that operate within the feasible-only space are the evolu-
tionary algorithms (EAs) of Mumford [9] and Erben [10]. EAs are a type of meta-
heuristic inspired by biological evolution. They operate by maintaining a population
of candidate solutions that represent a sample of the solution space. During a run,
EAs then attempt to improve the quality of members within this population using
the following operators:
• Recombination. Pairs of candidate solutions (“parents”) are combined to form new candidate solutions (“offspring”), which ideally inherit good features of both parents.
• Mutation. Small random perturbations are made to individual candidate solutions.
• Evolutionary pressure. As with biological evolution, EAs usually also exhibit some bias towards keeping good candidate solutions in the population and rejecting bad ones. Hence, high-quality solutions are more likely to be used for generating new offspring, and weaker solutions are more likely to be deleted from the population.
A potential drawback of the recombination operator in Erben’s [10] EA, which builds offspring from the colour classes of two parents, is the possibility that an offspring might inherit all of its colour classes from the second parent due to the policy of deleting colour classes originating from the first parent.
The mutation operator of this EA works in a similar fashion to recombination
by deleting some randomly selected colour classes from a solution, randomly per-
muting these vertices, and then reinserting them into the solution via Greedy. The
following heuristic-based objective function is also proposed for gauging the quality of a solution S:
f1(S) = (Σ_{Si∈S} (Σ_{v∈Si} deg(v))²) / |S|. (4.29)
Here, Σ_{v∈Si} deg(v) gives the total degree of all vertices assigned to the colour class Si. The aim is to maximise f1 by making increases to the numerator (by forming large
colour classes containing high-degree vertices) and/or decreases to the denominator
(by reducing the number of colour classes). It is also suggested that this objective
function allows evolutionary pressure to be sustained in a population for longer
during a run (compared to the more obvious choice of using the number of colours
|S |), because it allows greater distinction between individuals.
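For illustration, the objective function (4.29) can be computed directly. The degree values below belong to an assumed five-vertex example; note how gathering vertices into fewer, larger classes raises the score.

```python
def f1(S, deg):
    # Objective (4.29): sum of squared class degree-totals, divided by
    # the number of colour classes. To be maximised.
    return sum(sum(deg[v] for v in cls) ** 2 for cls in S) / len(S)

# Degrees of an assumed 5-vertex graph (they sum to twice the edge count)
deg = {1: 2, 2: 3, 3: 2, 4: 3, 5: 2}
S3 = [{1, 3}, {4}, {2, 5}]        # three colour classes
S4 = [{1, 3}, {4}, {2}, {5}]      # same vertices split over four classes
print(f1(S3, deg), f1(S4, deg))   # the three-class solution scores higher
```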
The EA of Mumford [9] also seeks to construct offspring solutions by combining
the colour classes of parent solutions. In this research, two recombination operators
are suggested: the Merge Independent Sets (MIS) operator and the Permutation
One Point (POP) operator. The MIS operator starts by taking two feasible parent
solutions, S1 and S2 , and constructs two permutations. As with the iterated Greedy
algorithm, vertices within the same colour classes are put into adjacent positions in
these permutations. For example, the two solutions
S1 = {{v1 , v2 , v3 }, {v4 , v5 , v6 , v7 }, {v8 , v9 , v10 }}
S2 = {{v1 , v6 , v9 }, {v2 , v8 }, {v3 , v4 , v5 , v7 }, {v10 }}
might result in the following two vertex permutations:
P1 = (v1 , v2 , v3 : v4 , v5 , v6 , v7 : v8 , v9 , v10 )
P2 = (v1 , v6 , v9 : v2 , v8 : v3 , v4 , v5 , v7 : v10 ).
(For convenience, colons are used in these permutations to mark boundaries between
different colour classes.) In the next step of the operator, the two permutations are
merged randomly such that the boundaries between the colour classes are maintained.
For example, we might merge the above examples to get
(v1 , v6 , v9 : v1 , v2 , v3 : v4 , v5 , v6 , v7 : v2 , v8 : v3 , v4 , v5 , v7 : v8 , v9 , v10 : v10 ).
Finally, two offspring permutations are formed by using the first occurrence of each
vertex for the first offspring, and the second occurrence for the second offspring:
P′1 = (v1, v6, v9, v2, v3, v4, v5, v7, v8, v10)
P′2 = (v1, v6, v2, v3, v4, v5, v7, v8, v9, v10).
These permutations are then converted into feasible offspring solutions using the
Greedy algorithm.
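The final extraction step of the MIS operator is deterministic once the merged permutation is fixed, so it can be checked directly against the worked example above (the merge itself is performed randomly and is not reproduced here):

```python
def offspring_from_merge(merged):
    # First occurrences go to the first offspring permutation,
    # second occurrences to the second.
    seen, first, second = set(), [], []
    for v in merged:
        if v in seen:
            second.append(v)
        else:
            seen.add(v)
            first.append(v)
    return first, second

# The merged permutation from the worked example
merged = ["v1", "v6", "v9", "v1", "v2", "v3", "v4", "v5", "v6", "v7",
          "v2", "v8", "v3", "v4", "v5", "v7", "v8", "v9", "v10", "v10"]
p1, p2 = offspring_from_merge(merged)
# p1 == ['v1','v6','v9','v2','v3','v4','v5','v7','v8','v10']
# p2 == ['v1','v6','v2','v3','v4','v5','v7','v8','v9','v10']
```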
The POP operator of Mumford [9] follows a similar scheme by first forming two permutations, P1 and P2, as above. A random cut point is then chosen, and the first portion of P1 up to the cut point becomes the first portion of the first offspring permutation P′1. The remainder of P′1 is then obtained by copying the vertices absent from P′1 in the order that they occur in P2. The second offspring permutation P′2 is found in the same way, but with the roles of the parents reversed. For example, using
“|” to signify the cut point, the permutations
P1 = (v1 , v2 , v3 , v4 | v5 , v6 , v7 , v8 , v9 , v10 )
P2 = (v1 , v6 , v9 , v2 | v8 , v3 , v4 , v5 , v7 , v10 )
result in the following new permutations:
P′1 = (v1, v2, v3, v4, v6, v9, v8, v5, v7, v10) and
P′2 = (v1, v6, v9, v2, v3, v4, v5, v7, v8, v10).
As before, two offspring solutions are then formed by feeding these new permutations
into the Greedy algorithm.
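The POP operator is easily expressed in code. The sketch below reproduces the worked example with a fixed cut point of four (in the algorithm proper the cut point is chosen at random):

```python
def pop(p1, p2, cut):
    # Permutation One Point: copy p1 up to the cut, then append the
    # remaining vertices in the order they occur in p2.
    head = p1[:cut]
    return head + [v for v in p2 if v not in head]

P1 = ["v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", "v10"]
P2 = ["v1", "v6", "v9", "v2", "v8", "v3", "v4", "v5", "v7", "v10"]
c1 = pop(P1, P2, 4)   # ['v1','v2','v3','v4','v6','v9','v8','v5','v7','v10']
c2 = pop(P2, P1, 4)   # ['v1','v6','v9','v2','v3','v4','v5','v7','v8','v10']
```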
As we have seen, the recombination operators used in the EAs of both Mumford
[9] and Erben [10] attempt to provide mechanisms by which good colour classes
within a population can be propagated, thereby hopefully allowing good offspring
solutions to be formed. However, the overall performance of their algorithms as
presented in their papers does not seem as strong as that of other algorithms reported
in the literature. That said, we will see later that excellent results can occur when
evolutionary-based algorithms are hybridised with local search-based procedures.
Moving away from EAs, the technique of Lewis [11] also operates in the space
of feasible solutions. The suggested algorithm makes use of operators based on the
iterated Greedy algorithm for making large changes to a solution. These are then
combined with other specialised operators that make small alterations to a solution
while ensuring that it remains feasible at all times. These latter operators are the
so-called Kempe chain interchange and pair swap operators, defined as follows:
Definition 4.4 Let the Kempe chains Kempe(u, i, j) and Kempe(v, j, i) both
contain just one vertex each (therefore implying that u and v are nonadjacent).
A pair swap involves swapping the colours of u and v.
4.2 Inexact Heuristics and Metaheuristics 101
Fig. 4.11 a An example five-colouring; b the result of a Kempe chain interchange using
Kempe(v7 , 1, 2); c the result of a pair swap using v1 and v5
Figure 4.11 shows examples of these operators. In Fig. 4.11a, we see that
Kempe(v7 , 1, 2) = {v4 , v7 , v8 , v9 }. Interchanging the colours of these vertices gives
the colouring shown in Fig. 4.11b. For an example pair swap, observe that in Fig. 4.11a
the Kempe chains identified by Kempe(v1 , 3, 4) and Kempe(v5 , 4, 3) both contain
just one vertex each. Hence, a pair swap can be performed using v1 and v5 , as shown
in Fig. 4.11c.
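A minimal sketch of the Kempe chain interchange is given below. The graph and colouring are my own small example rather than the graph of Fig. 4.11; the chain is grown by a breadth-first search over the subgraph induced by the two colours.

```python
from collections import deque

def kempe_chain(v, i, j, adj, colour):
    # Vertices reachable from v (coloured i) through the subgraph
    # induced by colours i and j.
    assert colour[v] == i
    chain, frontier = {v}, deque([v])
    while frontier:
        u = frontier.popleft()
        for w in adj[u]:
            if w not in chain and colour[w] in (i, j):
                chain.add(w)
                frontier.append(w)
    return chain

def kempe_interchange(v, i, j, adj, colour):
    # Swap colours i and j on the chain; feasibility is preserved
    for u in kempe_chain(v, i, j, adj, colour):
        colour[u] = j if colour[u] == i else i

# Small assumed example (not the graph of Fig. 4.11)
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
colour = {0: 1, 1: 2, 2: 3, 3: 1}
kempe_interchange(0, 1, 2, adj, colour)
# the colours of vertices 0 and 1 are exchanged; the colouring stays proper
```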
The fact that applications of these operators will always preserve the feasibility
of a solution is due to the following theorem.
Proof For the Kempe chain interchange operator, consider the situation where S is proper but S′ is not. Because a Kempe chain interchange involves two colours i and j, S′ must feature a pair of adjacent vertices u and v that are assigned to the same colour i or the same colour j. Without loss of generality, assume this to be colour i. This tells us that u and v must have both been in the Kempe chain used for the interchange since they are adjacent, implying that u and v were both assigned to colour j in S. However, this is impossible since S is known to be proper.
According to the conditions given in Definition 4.4, u cannot be adjacent to any vertex coloured with j, and v cannot be adjacent to any vertex coloured with i. Hence, swapping the colours of u and v will also ensure that S′ is proper.
The method of Lewis [11] has been shown to outperform those of both Culberson
and Luo [8] and Erben [10] on a variety of different graphs. Consequently, this forms
one of the case-study algorithms discussed in Chap. 5.
Many algorithms proposed for graph colouring have been designed to explore the
space of complete improper colourings. Such methods typically start by proposing a
fixed number of colours k, with each vertex then being assigned to one of these colours
using heuristics, or possibly at random. During this assignment there may be vertices
that cannot be assigned to any of the k colours without inducing a clash; however,
these will be assigned to one of the colours anyway. (Recall that a clash occurs when
a pair of adjacent vertices are assigned to the same colour—see Definition 1.4.)
The above assignment process leaves us with a k-partition of the vertices S =
{S1 , . . . , Sk } that represents a complete, but most likely improper k-colouring. A
natural way to measure the quality of this solution is to now count the number of
clashes. This can be achieved via the following objective function:
f2(S) = Σ_{∀{u,v}∈E} g(u, v) (4.30)
where
g(u, v) = 1 if c(u) = c(v), and g(u, v) = 0 otherwise.
The aim of an algorithm that uses this solution space is to make alterations to the
k-partition so that the number of clashes is reduced to zero. If this is achieved, k
might then be reduced and the process restarted. Alternatively, if all clashes cannot
be eliminated, k can be increased.
Perhaps the first algorithm to make use of the above strategy was due to Chams
et al. [12], who made use of the simulated annealing metaheuristic. Soon after this,
Hertz and de Werra [13] proposed a similar algorithm called TabuCol based on
the tabu search metaheuristic of Glover [14]. Simulated annealing and tabu search
are types of metaheuristics based on the concept of local search. In essence, local
search algorithms make use of neighbourhood operators which are simple schemes
for changing (or disrupting) a particular candidate solution. In the examples just
cited, this operator is simply:
• Take a vertex v that is currently assigned to colour i, and assign it to a new colour j (where 1 ≤ i, j ≤ k and i ≠ j).
In the random descent method, a single neighbouring solution S′ is generated at each iteration by applying this operator to the incumbent solution S. If S′ is seen to be better than the incumbent (i.e., f(S′) < f(S)) then it is set as the incumbent for the next iteration (i.e., S ← S′); otherwise, no changes occur. The
algorithm can then be left to run indefinitely or until some user-defined stopping
criterion is met.
Although the random descent method is very intuitive, it is highly susceptible
to getting caught at local optima within the solution space. This occurs when all
neighbours of the incumbent solution feature an equal or inferior cost—that is, ∀S′ ∈ N(S), f(S′) ≥ f(S). It is obvious that if a random descent algorithm reaches such a
point in the solution space, then no further improvements (or changes to the solution)
will be possible.
The simulated annealing algorithm is a generalisation of random descent which
offers a mechanism by which this issue can be avoided. In essence, the main difference
between the two methodologies lies in the criterion used for deciding whether to
perform a move or not. As noted, for random descent this criterion is simply f(S′) < f(S). Simulated annealing uses this criterion, but it also accepts a move to a worse solution with probability exp(−δ/t), where δ = |f(S′) − f(S)| gives the proposed change in cost and t is a parameter known as the temperature. Typically, t is set
to a relatively high value at the beginning of execution. This results in nearly all
moves being accepted, meaning that the exploration method closely resembles a
random walk through the solution space. During a run, the temperature t is then
slowly reduced, meaning that the chances of accepting a worsening move become
increasingly less likely. This causes the algorithm’s behaviour to approach that of the
random descent method. This additional acceptance criterion allows the algorithm
to escape from local optima, helping SA to explore a greater span of the solution
space.
A pseudocode description of the simulated annealing algorithm is given in
Fig. 4.12. In this case, the temperature t is reduced every z iterations by multiplying
it by a cooling rate α in the range (0, 1). Many other cooling schemes are possible,
however. Since its introduction by Kirkpatrick et al. [15], simulated annealing has
become a well-known and often very successful method for combinatorial optimi-
sation problems, including applications in areas such as scheduling [16], university
timetabling [17], packing problems [18], and bridge construction [19]. Methods
based on simulated annealing were also the winners of the first two International
Timetabling Competitions held in 2003 and 2007 (see Chap. 9).
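A minimal simulated annealing sketch for the space of complete improper k-colourings is given below. The parameter values and example graph are illustrative assumptions, not tuned recommendations; the cooling scheme multiplies t by α every z iterations, as described above.

```python
import math
import random

def clashes(colour, edges):
    # Objective (4.30): the number of edges whose endpoints share a colour
    return sum(1 for u, v in edges if colour[u] == colour[v])

def anneal(vertices, edges, k, t=5.0, alpha=0.95, z=50, iters=3000, seed=0):
    rng = random.Random(seed)
    colour = {v: rng.randrange(k) for v in vertices}
    cost = clashes(colour, edges)
    best, best_cost = dict(colour), cost
    for it in range(1, iters + 1):
        v = rng.choice(vertices)
        old = colour[v]
        colour[v] = rng.choice([c for c in range(k) if c != old])
        delta = clashes(colour, edges) - cost
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            cost += delta                        # accept the move
            if cost < best_cost:
                best, best_cost = dict(colour), cost
        else:
            colour[v] = old                      # reject the move
        if it % z == 0:
            t *= alpha                           # cool every z iterations
    return best, best_cost

# Assumed example: two triangles joined by an edge (3-chromatic)
vertices = list(range(6))
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (0, 3)]
best, best_cost = anneal(vertices, edges, k=3)
```

Recomputing the full clash count at every move is wasteful; practical implementations evaluate only the change in cost caused by recolouring the single vertex.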
One potentially problematic feature of the simulated annealing metaheuristic is
that it does not maintain any memory of the solutions previously observed. Indeed,
during a run, it may visit the same solution multiple times, or could even spend signif-
icant amounts of time cycling within the same subset of solutions. In contrast to this,
the tabu search metaheuristic contains mechanisms that are intended to help avoid
cycling, therefore encouraging the algorithm to enter new regions of the solution
space.
In the same way that simulated annealing can be considered an extension of
random descent, tabu search can be seen as an extension of the steepest descent
methodology. Steepest descent acts similarly to random descent in that it starts with
an initial solution S and then repeatedly applies a neighbourhood operator to try to
make improvements. In contrast, however, at each iteration of the steepest descent
algorithm all solutions in the neighbourhood are evaluated, with the best of these
then being chosen as the next incumbent. A pseudocode description of this process
is given in Fig. 4.13.
One advantage of using steepest descent over random descent is that it is abundantly clear when a local optimum has been reached (the algorithm will not be able to identify any neighbouring solution S′ that is better than S). Tabu search extends
steepest descent by offering a mechanism for escaping these local optima. It does
this by also allowing worsening moves to be made when they are seen to be the
best available in the current neighbourhood. To avoid cycling, tabu search then also
makes use of a memory structure called a tabu list that keeps track of previously
visited solutions and bans the algorithm from returning to these for a certain period
of time. This encourages the algorithm to enter new parts of the solution space.
As we have discussed, the papers of Chams et al. [12] and Hertz and de Werra
[13] suggested some time ago that both the simulated annealing and tabu search
metaheuristics are suitable for graph colouring problems. A tabu search method
called TabuCol, in particular, has proved to be very popular, both when used in
isolation and when used as an improvement procedure as part of broader algorithmic
schemes. This algorithm is discussed further in Chap. 5.
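A TabuCol-style sketch is given below. It follows the general scheme described above (move one clashing vertex to a new colour; the reversed move is then tabu for a fixed tenure unless it would beat the best solution found so far), but the details, parameter values, and example graph are my own simplifications rather than the exact algorithm of Hertz and de Werra.

```python
import random

def tabu_colour(vertices, adj, k, tenure=7, iters=2000, seed=0):
    rng = random.Random(seed)
    colour = {v: rng.randrange(k) for v in vertices}

    def cost(col):
        return sum(1 for v in vertices for u in adj[v]
                   if u > v and col[u] == col[v])

    tabu = {}                              # (vertex, colour) -> expiry iteration
    best, best_cost = dict(colour), cost(colour)
    for it in range(iters):
        clashing = [v for v in vertices
                    if any(colour[u] == colour[v] for u in adj[v])]
        if not clashing:
            return colour, 0               # proper k-colouring found
        current = cost(colour)
        candidates = []
        for v in clashing:
            for c in range(k):
                if c == colour[v]:
                    continue
                delta = (sum(1 for u in adj[v] if colour[u] == c)
                         - sum(1 for u in adj[v] if colour[u] == colour[v]))
                # Tabu moves are permitted only if they beat the best so far
                if tabu.get((v, c), -1) < it or current + delta < best_cost:
                    candidates.append((current + delta, v, c))
        if not candidates:
            continue                       # everything tabu; let tenures expire
        new_cost, v, c = min(candidates)
        tabu[(v, colour[v])] = it + tenure # reversing this move is now tabu
        colour[v] = c
        if new_cost < best_cost:
            best, best_cost = dict(colour), new_cost
    return best, best_cost

# Assumed example: a triangle with a two-vertex tail (3-chromatic)
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
col, c = tabu_colour(list(adj), adj, k=3)
```

Note that this sketch records tabu status on (vertex, colour) pairs rather than whole solutions; storing complete solutions in the tabu list is usually too expensive in practice.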
In more recent years, many other methods for exploring the space of complete improper k-colourings have been proposed, based on a wide variety of metaheuristic frameworks.
Two of the most notable examples of such methods, particularly due to the quality of results they produce, are the hybrid evolutionary algorithm of Galinier
and Hao [23] and the ant colony optimisation algorithm of Thompson and Dowsland
[28]. Both of these algorithms make use of population-based methods combined with
the TabuCol algorithm. The idea behind this hybridisation is to use the population-
based elements of the algorithms to guide the search over the long term, gently
directing it towards favourable regions of the solution space, with the TabuCol
element then being used to identify high-quality solutions within these regions. Both
of these algorithms are considered further in Chap. 5.
A further strategy for graph colouring involves exploring the space of proper par-
tial solutions. This scheme again involves stipulating a fixed number of colours k
at the outset. In this case, however, when vertices are encountered that cannot be
feasibly assigned to a colour, they are transferred to a set of uncoloured vertices
U. A solution S is therefore defined by a set of k feasible colour classes (independent sets) {S1, . . . , Sk} together with a set of uncoloured vertices U such that (∪_{j=1,...,k} Sj) ∪ U = V. The aim is then to make changes to the colour classes
so that all vertices in U can be feasibly coloured, resulting in U = ∅. If this goal
is achieved, k can then be reduced and the algorithm repeated, as with the previous
scheme.
An effective example of this strategy is the PartialCol algorithm of Blöchliger
and Zufferey [29]. This approach uses tabu search and operates in a very similar
fashion to the TabuCol algorithm, albeit with a different neighbourhood operator.
Specifically, a move in the solution space is achieved as follows:
• Select an uncoloured vertex v ∈ U and assign it to one of the colour classes Si. Any vertices in Si that are adjacent to v are then removed from Si and placed into U.
In their work, Blöchliger and Zufferey [29] make use of the simple objective function f3 = |U| to evaluate solutions. A second objective function, f4 = Σ_{v∈U} deg(v), is also suggested but is found to only give better solutions in a small number of cases.
This algorithm is discussed in more detail in Chap. 5.
An earlier algorithm using this same scheme is due to Morgenstern and Shapiro
[30]. This uses the same neighbourhood operator and objective function in conjunc-
tion with simulated annealing. However, it also employs an additional operator that
is periodically applied to the partial solution to help reinvigorate the search process. Specifically, this mechanism shuffles vertices between colour classes in the
partial solution while not introducing any clashes. This has the effect of moving the
algorithm into different parts of the solution space while not changing its objective
function value.
High-quality results based on exploring the space of partial proper k-colourings
have also been reported by Malaguti et al. [31]. This algorithm is similar to the hybrid
evolutionary algorithm of Galinier and Hao [23] and uses an analogous recombina-
tion operator together with a local search procedure based on PartialCol. Their
approach also makes use of the objective function f 4 in an attempt to sustain evolu-
tionary pressure during execution. One feature of this work is that, during a run of
the EA, a set is maintained containing various independent sets encountered by the
algorithm. Upon termination of the EA, this set is then used in conjunction with a set
covering IP to try and make further improvements to the quality of solution returned
by the algorithm.
Interesting work has also been carried out by Hertz et al. [32], who propose a method
for operating in different solution spaces during different stages of execution. Specif-
ically, TabuCol is used to explore the space of complete improper k-colourings, and
PartialCol is used for the space of partial, proper solutions. The main idea is that
a local optimum in one solution space is not necessarily a local optimum in another.
Hence, when the search is deemed to have stagnated in one space, a procedure is
used to alter the incumbent solution so that it becomes a member of another space.
(For example, a complete improper solution formed by TabuCol is converted into
a partial proper solution by considering clashing vertices in a random order, and
moving them into the set U until no clashes remain.) The search can then be contin-
ued in this new space where further improvements might be made, with the process
being repeated as long as necessary. The authors also propose a third solution space
based on the idea of assigning orientations to edges in the graph and then trying to
minimise the length of the longest paths within the resultant directed graph (see also
the work of Gendron et al. [33]). The authors note, however, that improvements are
rarely achieved during exploration of the latter space, but that its inclusion is still
useful because it tends to make large alterations to a solution, helping to diversify
the search.
Concluding this review of different algorithms for the graph colouring problem, it is
relevant to note that many of the schemes mentioned above are also commonly used in
algorithms that tackle related problems. For example, we can observe the existence
of timetabling algorithms that use constructive heuristics with backtracking [34];
algorithms that allow additional timeslots (colours) in a timetable and then only
deal with feasible solutions [10,35–37]; methods that fix the number of timeslots in
advance and then allow constraints to be violated (i.e., clashes to occur) [38–40]; and
also algorithms that deal with partial timetables, never allowing constraint violations
to occur [17,41,42]. Similar examples can also be noted in other related problems
such as the frequency assignment problem [43,44].
This section now looks at ways in which the size of a graph colouring problem can
be reduced. In some cases, this can lead to shorter run times and/or more accurate
results.
Given a graph G = (V, E), it is sometimes possible to remove certain vertices and edges to create a smaller subgraph G′. If we then establish an optimal colouring of G′, we can reinstate the missing vertices and assign them to appropriate colours, giving an optimal colouring for G. Two options for doing this are now outlined.
1. Let v ∈ V such that Γ(v) = V − {v}. In this case, v is adjacent to all other vertices in G, implying that, in any feasible colouring, v will always assume its own unique colour. Now let G′ = G − {v} and assume we have established a feasible colouring of G′. The vertex v can now be allocated to a new colour, giving a feasible colouring of G that uses one more colour than that of G′.
2. Let u, v ∈ V such that Γ(u) ⊆ Γ(v). This implies that u and v are nonadjacent. Now let G′ = G − {u}. A feasible colouring of G can be established by taking a feasible colouring of G′ and simply assigning u to the same colour as v.
The above steps can be applied repeatedly, creating a series of smaller graphs, until
neither condition holds.
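The two reduction rules can be sketched as follows. The example graph (a triangle with a pendant vertex) is my own assumption; note that in this instance the rules eliminate the graph entirely, after which colours would be reinstated in reverse order of deletion.

```python
def reduce_graph(adj):
    # Repeatedly remove (1) vertices adjacent to all others and
    # (2) vertices u whose neighbourhood is a subset of another vertex v's.
    adj = {v: set(ns) for v, ns in adj.items()}
    removed = []

    def delete(v):
        for u in adj[v]:
            adj[u].discard(v)
        del adj[v]

    changed = True
    while changed:
        changed = False
        everything = set(adj)
        for v in list(adj):
            if adj[v] == everything - {v}:       # Rule 1: universal vertex
                removed.append(v)
                delete(v)
                changed = True
                break
        if changed:
            continue
        for u in list(adj):
            # Rule 2: Gamma(u) subset of Gamma(v); nonadjacency is automatic,
            # since if u and v were adjacent, v would lie in Gamma(u) only.
            if any(u != v and adj[u] <= adj[v] for v in adj):
                removed.append(u)
                delete(u)
                changed = True
                break
    return adj, removed

# Assumed example: triangle {0,1,2} plus a pendant vertex 3 attached to 0
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
residual, removed = reduce_graph(adj)
print(removed)  # → [0, 3, 1, 2] (the graph reduces completely)
```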
In some cases, it is also possible to split a graph into a set of different subgraphs
that can each be coloured separately. This can be done as follows.
3. Given G = (V, E), let C ⊆ V be a subset of vertices such that (a) C is a clique in G and (b) the vertices of C are a vertex separating set (see Definition 3.10). Now label as G1, . . . , Gl the components that are formed by deleting C from G, and let G′i be the subgraph induced by the vertices of Gi together with C, for all i ∈ {1, . . . , l}. Feasible colourings of the smaller subgraphs G′1, . . . , G′l can now be produced separately and then merged into a feasible colouring of G.
Fig. 4.14 Examples of graphs that can be reduced in size before colouring
To illustrate Item 2, consider Fig. 4.14a, where Γ(u) ⊆ Γ(v). Hence, u can be removed from the graph together with all its incident edges. These can be reinstated once the remaining vertices have been coloured.
To illustrate Item 3, consider Fig. 4.14b. As indicated, this graph contains a vertex separating set of size 3 that is also a clique. In this case, the two smaller subgraphs G′1 and G′2 can be coloured separately. If the vertices in the separating set are not allocated to the same colours in each subgraph (as is the case here), then a colour relabelling can be applied to make this so. The subgraphs can then be merged to form a complete feasible colouring for G. Note that, by definition, this feature includes cases where a graph G is disconnected (giving |C| = 0) or where G contains a cut vertex or bridge (|C| = 1). Also note that χ(G) = max{χ(G′1), . . . , χ(G′l)}.
In practice, it is easy to check whether vertices exist in a graph that satisfy the
conditions in Items 1 and 2 and, depending on the topology of the graph, it might be
possible to remove many vertices before applying a graph colouring algorithm. The
problem of identifying vertex separating sets is also solvable by various polynomial-
time algorithms (such as the approach of Kanevsky [45]), and it only takes the
addition of a simple checking step to determine whether these separating sets also
constitute cliques or not.
In addition to these steps, in situations where we are trying to solve the decision
variant of the graph colouring problem (given an integer k, identify whether a feasible
k-colouring exists), it is also permissible to delete all vertices with degrees of less
than k. That is, we can reduce the size of G by removing all vertices belonging to
the set {v ∈ V : deg(v) < k}. This is allowed since, obviously, vertices with fewer
than k adjacent vertices will always have a feasible colour from the set {1, . . . , k} to
which they can be assigned. Hence, colours can be allocated to these vertices once
the remaining subgraph has been coloured.
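This peeling rule can be sketched as follows (the function name and example graph are my own):

```python
def peel(adj, k):
    # For the decision problem "is G k-colourable?", vertices of degree < k
    # can be deleted; they can always be feasibly coloured afterwards,
    # in reverse order of deletion.
    adj = {v: set(ns) for v, ns in adj.items()}
    order = []                       # deletion order
    while True:
        low = [v for v in adj if len(adj[v]) < k]
        if not low:
            return adj, order
        for v in low:
            order.append(v)
            for u in adj[v]:
                if u in adj:         # neighbour may already be deleted
                    adj[u].discard(v)
            del adj[v]

# Assumed example: the star K_{1,4}; testing 2-colourability, every
# vertex eventually peels away (the leaves first, then the centre)
adj = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
residual, order = peel(adj, 2)
print(residual)  # → {}
```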
A further method for reducing the size of a graph involves the identification and
removal of independent sets. A suitable process can be summarised as follows.
Given a graph G = (V, E):
1. First let G′ = G. Now, identify an independent set I1 in G′ and remove it. Repeat
this step on G′ a further l − 1 times to form a set of l disjoint independent sets
{I1, I2, . . . , Il}. Call G′ the residual graph.
2. Next, use any graph colouring algorithm to find a feasible colouring for the residual
graph G′. Call this solution S′ = {S1, . . . , Sk}.
3. A feasible (k + l)-colouring for the original graph G is now obtained by setting
S = S′ ∪ {I1, I2, . . . , Il}.
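A minimal Python sketch of this three-step scheme, using a simple greedy routine to find each independent set and the Greedy algorithm to colour the residual graph (both are stand-ins for the more powerful methods mentioned below):

```python
def greedy_independent_set(adj, order):
    """Pick vertices in the given order, skipping neighbours of chosen ones."""
    I, blocked = set(), set()
    for v in order:
        if v not in blocked:
            I.add(v)
            blocked.add(v)
            blocked |= adj[v]
    return I

def extract_and_colour(adj, l):
    """Steps 1-3: extract l independent sets, colour the residual, merge."""
    adj = {v: set(n) for v, n in adj.items()}       # work on a copy
    sets = []
    for _ in range(l):                              # Step 1
        I = greedy_independent_set(adj, sorted(adj))
        sets.append(I)
        for v in I:                                 # remove I from the graph
            for u in adj[v]:
                adj[u].discard(v)
            del adj[v]
    colouring = {}                                  # Step 2: Greedy colouring
    for v in sorted(adj):
        used = {colouring[u] for u in adj[v] if u in colouring}
        colouring[v] = next(c for c in range(len(adj) + 1) if c not in used)
    k = 1 + max(colouring.values(), default=-1)
    for i, I in enumerate(sets):                    # Step 3: sets become new colours
        for v in I:
            colouring[v] = k + i
    return colouring
```

Each extracted independent set simply becomes one additional colour class in the final (k + l)-colouring.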
For Step 1, it is usually helpful to identify large maximal independent sets because
this will leave us with a smaller residual graph. Recall, however, that the problem of
identifying the maximum independent set in a graph is itself an N P -hard problem.
Methods for identifying large independent sets in a graph range from simple heuris-
tics such as the RLF algorithm (Sect. 3.4.1) to advanced metaheuristic algorithms
such as the tabu search approach of Wu and Hao [46].
A simple local search-based scheme for identifying an independent set of size q
in a graph G = (V, E) might operate as follows. First select a subset of q vertices
I ⊆ V and evaluate it according to the following cost function:
f5(I) = Σ_{∀u,v∈I} g(u, v),   (4.31)

where

g(u, v) = 1 if {u, v} ∈ E, and g(u, v) = 0 otherwise.
In other words, f5 is simply a count of the number of edges in the subgraph induced
by I. If f5(I) = 0 then I is an independent set of size q; otherwise, alter the contents
of I using a suitable neighbourhood operator. One such operator is to swap a vertex
u ∈ I with a vertex v ∈ (V − I). Once an independent set has been established, it can
then be removed from the graph. We might also seek to fulfil additional criteria, such
as identifying an independent set that, when removed, leaves a residual graph with
the fewest edges, thereby hopefully giving it a lower chromatic number.
This might involve making suitable changes to the neighbourhood operator and cost
function.
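This local search scheme might be sketched as follows, with f5 implemented exactly as in Eq. (4.31) and non-worsening swaps accepted:

```python
import random

def f5(I, adj):
    """Number of edges in the subgraph induced by I (cf. Eq. (4.31))."""
    return sum(1 for u in I for v in adj[u] if v in I) // 2

def find_independent_set(adj, q, max_iters=10000, seed=1):
    """Local search for an independent set of size q; returns None on failure."""
    rng = random.Random(seed)
    V = list(adj)
    I = set(rng.sample(V, q))
    cost = f5(I, adj)
    for _ in range(max_iters):
        if cost == 0:
            return I                                  # I is independent
        u = rng.choice(sorted(I))                     # swap one vertex out...
        v = rng.choice([w for w in V if w not in I])  # ...and another in
        J = (I - {u}) | {v}
        new_cost = f5(J, adj)
        if new_cost <= cost:                          # accept non-worsening moves
            I, cost = J, new_cost
    return None
```

A practical refinement, as suggested above, is to restrict the outgoing vertex u to those currently involved in an induced edge, or to alter the cost function so that the residual graph left behind has few edges.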
Note that if we choose to reduce a problem’s size by extracting independent sets,
a suitable balance will need to be struck between the time dedicated to this task and
the time spent colouring the residual graph itself. We should also be mindful of the
fact that extracting the wrong independent sets may also prevent us from being able
to identify the optimal solution to the original graph colouring problem.
Finally, recall from Chap. 2 that the problem of identifying a maximum inde-
pendent set in a graph G is equivalent to identifying a maximum clique in G’s
complement graph Ḡ = (V, Ē) (where Ē = {{u, v} : {u, v} ∉ E}). A helpful
survey on heuristic algorithms for both of these problems is provided by Pelillo [47].
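This equivalence is easy to verify computationally: a set of vertices is independent in G exactly when it induces a clique in Ḡ. A quick sketch:

```python
def complement(adj):
    """Complement graph: u and v adjacent iff they are not adjacent in G."""
    V = set(adj)
    return {v: (V - {v}) - adj[v] for v in adj}

def is_independent(S, adj):
    return all(v not in adj[u] for u in S for v in S)

def is_clique(S, adj):
    return all(u == v or v in adj[u] for u in S for v in S)
```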
In this chapter, we have examined several different algorithmic techniques for the
graph colouring problem. Initial sections considered three different ways of constructing
exact algorithms for this problem, namely backtracking, integer programming
(via branch-and-bound), and column generation. Given the N P-hard nature of the
graph colouring problem, all of these algorithms feature exponential growth rates
in the worst case; however, they still offer significant improvements over methods
based on brute-force enumeration (such as that seen in Sect. 2.2).
In the later sections of this chapter, we have also described some different heuris-
tic algorithms for graph colouring. Some of these also make use of metaheuristic
techniques like evolutionary algorithms, local search, and ant colony optimisation.
Unlike exact methods, these heuristics will rarely prove solution optimality, even
if granted excess time. However, in many cases, they are still able to provide very
high-quality solutions in comparison to exact algorithms. Evidence for this will be
presented in the next chapter, where we compare six contrasting graph colouring
algorithms.
References
1. Kubale M, Jackowski B (1985) A generalized implicit enumeration algorithm for graph color-
ing. Commun ACM 28(4):412–418
2. Wolsey L (2020) Integer programming, 2nd edn. Wiley. ISBN 978-1119606536
3. Morrison D, Jacobson S, Sauppe J, Sewell E (2016) Branch-and-bound algorithms: a survey
of recent advances in searching, branching, and pruning. Discret Optim 19:79–102
4. Méndez-Díaz I, Zabala P (2008) A cutting plane algorithm for graph coloring. Discret Appl
Math 156:159–179
5. Karp R (1972) Reducibility among combinatorial problems. In: Complexity of computer
computations. Plenum, New York, pp 85–103
6. Mehrotra A, Trick M (1996) A column generation approach for graph coloring. INFORMS J
Comput 8(4):344–354
7. Gualandi S, Malucelli F (2012) Exact solution of graph coloring problems via constraint pro-
gramming and column generation. INFORMS J Comput 24(1)
8. Culberson J, Luo F (1996) Exploring the k-colorable landscape with iterated greedy. In: Cliques,
coloring, and satisfiability: second DIMACS implementation challenge, vol 26. American
Mathematical Society, pp 245–284
9. Mumford C (2006) New order-based crossovers for the graph coloring problem. In: Parallel
problem solving from nature (PPSN) IX. LNCS, vol 4193. Springer, pp 880–889
10. Erben W (2001) A grouping genetic algorithm for graph colouring and exam timetabling. In:
Practice and theory of automated timetabling (PATAT) III. LNCS, vol 2079. Springer, pp 132–
158
11. Lewis R (2009) A general-purpose hill-climbing method for order independent minimum
grouping problems: a case study in graph colouring and bin packing. Comput Oper Res
36(7):2295–2310
12. Chams M, Hertz A, de Werra D (1987) Some experiments with simulated annealing for coloring
graphs. Eur J Oper Res 32:260–266
13. Hertz A, de Werra D (1987) Using Tabu search techniques for graph coloring. Computing
39(4):345–351
14. Glover F (1986) Future paths for integer programming and links to artificial intelligence.
Comput Oper Res 13(5):533–549
15. Kirkpatrick S, Gelatt C, Vecchi M (1983) Optimization by simulated annealing. Science
220(4598):671–680
16. Sekiner S, Kurt M (2007) A simulated annealing approach to the solution of job rotation
scheduling problems. Appl Math Comput 188(1):31–45
17. Lewis R, Thompson J (2015) Analysing the effects of solution space connectivity with an
effective metaheuristic for the course timetabling problem. Eur J Oper Res 240:637–648
18. Egeblad J, Pisinger D (2009) Heuristic approaches for the two- and three-dimensional knapsack
packing problem. Comput Oper Res 36(4):1026–1049
19. Perea C, Alcalá J, Yepes V, González-Vidosa F, Hospitaler A (2008) Design of reinforced
concrete bridge frames by heuristic optimization. Adv Eng Softw 39(8):676–688
20. Dorne R, Hao J-K (1998) A new genetic local search algorithm for graph coloring. In: Eiben
A, Back T, Schoenauer M, Schwefel H (eds) Parallel problem solving from nature (PPSN) V.
LNCS, vol 1498. Springer, pp 745–754
21. Eiben A, van der Hauw J, van Hemert J (1998) Graph coloring with adaptive evolutionary
algorithms. J Heurist 4(1):25–46
22. Fleurent C, Ferland J (1996) Genetic and hybrid algorithms for graph colouring. Ann Oper Res
63:437–461
23. Galinier P, Hao J-K (1999) Hybrid evolutionary algorithms for graph coloring. J Comb Optim
3:379–397
24. Chiarandini M, Stützle T (2002) An application of iterated local search to graph coloring. In:
Proceedings of the computational symposium on graph coloring and its generalizations, pp
112–125
25. Paquete L, Stützle T (2002) An experimental investigation of iterated local search for color-
ing graphs. In: Cagnoni S, Gottlieb J, Hart E, Middendorf M, Raidl G (eds) Applications of
evolutionary computing, proceedings of EvoWorkshops2002: EvoCOP, EvoIASP, EvoSTim.
LNCS, vol 2279. Springer, pp 121–130
26. Laguna M, Marti R (2001) A grasp for coloring sparse graphs. Comput Optim Appl 19:165–178
27. Avanthay C, Hertz A, Zufferey N (2003) A variable neighborhood search for graph coloring.
Eur J Oper Res 151:379–388
28. Thompson J, Dowsland K (2008) An improved ant colony optimisation heuristic for graph
colouring. Discret Appl Math 156:313–324
29. Blöchliger I, Zufferey N (2008) A graph coloring heuristic using partial solutions and a reactive
tabu scheme. Comput Oper Res 35:960–975
30. Morgenstern C, Shapiro H (1990) Coloration neighborhood structures for general graph col-
oring. In: Proceedings of the first annual ACM-SIAM symposium on discrete algorithms, San
Francisco, California, USA. Society for Industrial and Applied Mathematics, pp 226–235
31. Malaguti E, Monaci M, Toth P (2008) A metaheuristic approach for the vertex coloring problem.
INFORMS J Comput 20(2):302–316
32. Hertz A, Plumettaz M, Zufferey N (2008) Variable space search for graph coloring. Discret
Appl Math 156(13):2551–2560
33. Gendron B, Hertz A, St-Louis P (2007) On edge orienting methods for graph coloring. J Comb
Optim 13(2):163–178
34. Carter M, Laporte G, Lee SY (1996) Examination timetabling: algorithmic strategies and
applications. J Oper Res Soc 47:373–383
35. Burke E, Elliman D, Weare R (1995) Specialised recombinative operators for timetabling
problems. In: The artificial intelligence and simulated behaviour workshop on evolutionary
computing, vol 993. Springer, pp 75–85
36. Cote P, Wong T, Sabourin R (2005) Application of a hybrid multi-objective evolutionary algo-
rithm to the uncapacitated exam proximity problem. In: Burke E, Trick M (eds) Practice and
theory of automated timetabling (PATAT) V. LNCS, vol 3616. Springer, pp 294–312
37. Lewis R, Paechter B (2007) Finding feasible timetables using group based operators. IEEE
Trans Evol Comput 11(3):397–413
38. Carrasco M, Pato M (2001) A multiobjective genetic algorithm for the class/teacher timetabling
problem. In: Burke E, Erben W (eds) Practice and theory of automated timetabling (PATAT)
III. LNCS, vol 2079. Springer, pp 3–17
39. Colorni A, Dorigo M, Maniezzo V (1997) Metaheuristics for high-school timetabling. Comput
Optim Appl 9(3):277–298
40. Di Gaspero L, Schaerf A (2002) Multi-neighbourhood local search with application to course
timetabling. In: Burke E, De Causmaecker P (eds) Practice and theory of automated timetabling
(PATAT) IV. LNCS, vol 2740. Springer, pp 263–287
41. Burke E, Newall J (1999) A multi-stage evolutionary algorithm for the timetable problem.
IEEE Trans Evol Comput 3(1):63–74
42. Paechter B, Rankin R, Cumming A, Fogarty T (1998) Timetabling the classes of an entire
university with an evolutionary algorithm. In: Baeck T, Eiben A, Schoenauer M, Schwefel H
(eds) Parallel problem solving from nature (PPSN) V. LNCS, vol 1498. Springer, pp 865–874
43. Aardal K, van Hoesel S, Koster A, Mannino C, Sassano A (2002) Models and solution techniques
for frequency assignment problems. 4OR: Q J Belgian, French and Italian Oper Res
Soc 1(4):1–40
44. Valenzuela C (2001) A study of permutation operators for minimum span frequency assignment
using an order based representation. J Heurist 7:5–21
45. Kanevsky A (1993) Finding all minimum-size separating vertex sets in a graph. Networks
23:533–541
46. Wu Q, Hao J-K (2012) Coloring large graphs based on independent set extraction. Comput
Oper Res 39:283–290
47. Pelillo M (2009) Heuristics for maximum clique and independent set. In: Encyclopedia of
optimization, 2nd edn. Springer, pp 1508–1520
5 Algorithm Case Studies
As we mentioned in the previous chapter, TabuCol has been used as a local search
subroutine in a number of high-performing hybrid algorithms, including those of
Avanthay et al. [1], Dorne and Hao [2], Galinier and Hao [3] and Thompson and
Dowsland [4]. The specific version of TabuCol that we consider here is the so-called
“improved” variant, which was originally used by Galinier and Hao [3].
As we have seen, TabuCol operates in the space of complete improper k-
colourings using an objective function that simply counts the number of clashes (i.e.,
objective function f 2 from Eq. (4.30)). Given a candidate solution S = {S1 , . . . , Sk },
moves in the solution space are performed by selecting a vertex v ∈ Si whose assign-
ment to colour class Si is currently causing a clash, and then switching it to a new
colour class S j (where i ≠ j). Note that previous incarnations of this algorithm also
allowed nonclashing vertices to be moved between colours, though this is generally
thought to worsen performance [5].
The tabu list of this algorithm is stored using a matrix Tn×k . If at iteration l of the
algorithm, the neighbourhood operator transfers a vertex v from Si to S j , then the
element Tvi is set to l + t, where t is a positive integer that will be defined presently.
This signifies that the transfer of v back to colour class Si is tabu (i.e., disallowed)
for the next t iterations of the algorithm (or, in other words, that v cannot be moved
back to Si until at least iteration l + t). Note that this has the effect of making all
solutions containing the assignment of vertex v to Si tabu for t iterations.
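Assuming vertices and colours are indexed from zero, this tabu bookkeeping can be sketched as a simple n × k matrix of iteration counters:

```python
class TabuList:
    """Matrix-based tabu list: T[v][i] holds the first iteration at which
    moving vertex v back into colour class i is allowed again."""

    def __init__(self, n, k):
        self.T = [[0] * k for _ in range(n)]

    def make_tabu(self, v, i, current_iter, t):
        # Moving v back to colour i is tabu for the next t iterations.
        self.T[v][i] = current_iter + t

    def is_tabu(self, v, i, current_iter):
        return current_iter < self.T[v][i]
```

Here, make_tabu(v, i, l, t) records that v cannot return to colour class i until at least iteration l + t, exactly as described above.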
As is typical in applications of tabu search, in each iteration of TabuCol, the entire
set of neighbouring solutions is considered. That is, the cost of moving each clashing
vertex into all other k − 1 colour classes is evaluated. This process consumes the
majority of the algorithm’s execution time; however, it can be sped up considerably
through the use of appropriate data structures. To explain, let x denote the number of
vertices involved in a clash in the current solution S. This leads to x(k − 1) members
in the set of neighbouring solutions N(S). (Obviously, there is a strong positive
correlation between x and the objective function, so better solutions will tend to
have smaller neighbourhoods.) A naïve implementation of the TabuCol algorithm
would now set about separately performing the x(k − 1) different neighbourhood
moves and evaluating all of the resulting solutions. However, this is not necessary,
particularly because only two colour classes are affected by each neighbourhood
move.
A more efficient approach involves making use of an additional matrix Cn×k
where, given the current solution S = {S1 , . . . , Sk }, element Cv j denotes the number
of vertices in colour class S j that are adjacent to vertex v. When an initial solution
is generated, all elements in C will first need to be calculated. This can be done
using the O(nk + m) procedure Populate-C shown in Fig. 5.1. In each subsequent
iteration of TabuCol, the act of moving a vertex v from Si to S j will result in a new
solution S′ whose cost is simply:
f2(S′) = f2(S) + Cvj − Cvi.   (5.1)
Since f2(S) will already be known, this means that the cost of all neighbouring
solutions can be determined by simply reading through each row of C corresponding
to the clashing vertices in S. In a solution with x clashing vertices, this action has
a complexity of O(xk). Once a move has been selected and performed (i.e., once v
has been moved from Si to S j ), the matrix C is then updated using the O(|Γ (v)|)
Update-C procedure shown in Fig. 5.1. As shown in this pseudocode, neighbours
of v are now marked as being adjacent to one fewer vertex in colour class Si and one
additional vertex in colour class S j . The complexity of each individual iteration of
TabuCol is therefore O(xk + |Γ (v)|), which is O(nk + m) in the worst case.
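The bookkeeping just described, covering Populate-C, the delta evaluation of Eq. (5.1), and Update-C, can be sketched in Python as follows (assuming adjacency lists indexed 0 to n − 1 and zero-indexed colours):

```python
def populate_C(adj, colour, k):
    """C[v][j] = number of neighbours of v in colour class j (O(nk + m))."""
    n = len(adj)
    C = [[0] * k for _ in range(n)]
    for u in range(n):
        for v in adj[u]:
            if u < v:                         # count each edge once
                C[u][colour[v]] += 1
                C[v][colour[u]] += 1
    return C

def move_cost_delta(C, v, i, j):
    """Change in f2 when v is moved from colour i to colour j (Eq. (5.1))."""
    return C[v][j] - C[v][i]

def update_C(C, adj, v, i, j):
    """Repair C after moving v from colour i to colour j (O(|Gamma(v)|))."""
    for u in adj[v]:
        C[u][i] -= 1
        C[u][j] += 1
```

Note that v's own row of C is unaffected by its own move, so only the rows of its neighbours need repairing.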
Having evaluated all neighbouring solutions, in each iteration, TabuCol selects
and performs the non-tabu move that brings about the largest decrease (or failing
that the smallest increase) in cost. Any ties in this criterion are broken randomly.
In addition, TabuCol employs an aspiration criterion which allows tabu moves to
be performed on occasion. Specifically, tabu moves are permitted if they are seen to
improve on the best solution found so far during the run. This is particularly helpful
if a tabu move is seen to lead to a solution with zero cost, at which point the algorithm
can halt. Finally, if all moves are seen to be tabu, then a vertex v ∈ V is selected
at random and moved to a new randomly selected colour class. The tabu list is then
updated as usual.
Populate-C()
(1) Cvj ← 0 ∀v ∈ V, j ∈ {1, 2, . . . , k}
(2) forall {u, v} ∈ E do
(3)     Cu,c(v) ← Cu,c(v) + 1
(4)     Cv,c(u) ← Cv,c(u) + 1

Update-C(v, i, j)
(1) forall u ∈ Γ(v) do
(2)     Cui ← Cui − 1
(3)     Cuj ← Cuj + 1
Fig. 5.1 Procedures for populating and updating the matrix C used with TabuCol. In Populate-
C, c(v) gives the colour of vertex v in the current solution. Update-C is used when TabuCol has
moved vertex v from colour i to colour j. Γ (v) denotes the set of all vertices adjacent to vertex v
Various schemes exist for calculating the tabu tenure t, such as those described by
Blöchliger and Zufferey [6]. In our case, we choose to use the settings recommended
by the authors, and these are included in our source code for this algorithm. We are
perfectly at liberty to use other, simpler schemes for calculating t if required, however.
The third algorithm that we consider is the hybrid evolutionary algorithm (HEA) of
Galinier and Hao [3]. The HEA operates by maintaining a population of candidate
solutions that are evolved via a problem-specific recombination operator and a local
search method. Like TabuCol, the HEA operates in the space of complete improper
k-colourings using cost function f 2 (Eq. (4.30)).
The algorithm begins by creating an initial population of candidate solutions.
Each member of this population is formed using a modified version of the DSatur
algorithm for which the number of colours k is fixed at the outset. To provide diver-
sity between members, the first vertex is selected at random and assigned to the first
colour. The remaining vertices are then taken in sequence according to the maxi-
mum saturation degree (with ties being broken randomly) and assigned to the lowest
indexed colour class Si seen to be feasible (where 1 ≤ i ≤ k). When vertices are
encountered for which no feasible colour class exists, these are kept to one side and
are assigned to random colour classes at the end of this process. Upon construction of
this initial population, an attempt is then made to improve each member by applying
the local search routine, defined below.
As is typical for an evolutionary algorithm, for the remainder of the run the
algorithm evolves the population using recombination, mutation, and evolutionary
pressure. In each iteration, two parent solutions S1 and S2 are selected from the
population at random, and copies of these are used in conjunction with the recombi-
nation operator to produce one offspring solution S . This offspring is then improved
using local search and inserted into the population by replacing the weaker of its two
parents. Note that there is no bias towards selecting fitter parents for recombination;
rather evolutionary pressure only exists due to the offspring replacing their weaker
parent (regardless of whether the parent has a better cost than its offspring).
The recombination operator proposed by Galinier and Hao [3] is the so-called Greedy
partition crossover (GPX). The idea behind GPX is to construct offspring using
large colour classes inherited from the parent solutions. A demonstration of how this
is done is given in Fig. 5.3. As shown, the largest (not necessarily proper) colour
class from the parents is first selected and copied into the offspring (ties are broken
randomly). To avoid duplicate vertices occurring in the offspring at a later stage,
these copied vertices are then removed from both parents. To form the next colour,
the other (modified) parent is then considered and, again, the largest colour class
is selected and copied into the offspring, before these vertices are removed from
both parents. This process is continued by alternating between the parents until the
offspring’s k colour classes have been formed.
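A sketch of GPX under these rules, with parents represented as lists of k vertex sets and ties between equal-sized classes resolved arbitrarily:

```python
import random

def gpx(parent1, parent2, k, rng=None):
    """Greedy partition crossover: alternately copy the largest remaining
    colour class of each parent into the offspring, deleting the copied
    vertices from both parents as we go."""
    rng = rng or random.Random(0)
    p = [[set(c) for c in parent1], [set(c) for c in parent2]]
    all_vertices = set().union(*parent1)
    offspring = []
    for turn in range(k):
        parent = p[turn % 2]                 # alternate between the parents
        chosen = set(max(parent, key=len))   # largest class (ties: first found)
        offspring.append(chosen)
        for classes in p:                    # remove copied vertices from both
            for c in classes:
                c -= chosen
    missing = all_vertices - set().union(*offspring)
    for v in missing:                        # assign missing vertices randomly
        offspring[rng.randrange(k)].add(v)
    return offspring
```

By construction, before the random repair step, every offspring class is a subset of a colour class occurring in one of the parents.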
[Figure 5.3 (continued): in Step 4, the largest colour class in S1 is selected and copied
into S′, and the copied vertices are deleted from both S1 and S2; in Step 5, having
formed k colour classes, any missing vertices (here v9) are assigned to random colours
to form a complete, but not necessarily proper, offspring solution S′.]
Fig. 5.3 Example application of the Greedy partition crossover (GPX) operator of Galinier and
Hao [3], using k = 3
At this point, each colour class in the offspring will be a subset of a colour class
existing in at least one of the parents. That is:
∀S′i ∈ S′, ∃Sj ∈ (S1 ∪ S2) : S′i ⊆ Sj,   (5.3)
where S′, S1, and S2 represent the offspring, and the first and second parents, respectively.
However, some vertices may also be missing in the offspring (as is the case
with vertex v9 in Fig. 5.3). This issue is resolved by assigning the missing vertices
to random colour classes. Finally, local search is executed on the offspring before
inserting it into the population.
In this algorithm, TabuCol is used for the local search routine, executing it for a
fixed number of iterations I and using the same tabu tenure scheme as described in
Sect. 5.1. In their original paper, Galinier and Hao [3] manually tune I for different
problem instances. In our case, we choose not to follow this strategy and require a
setting for I to be determined automatically by the algorithm. We also need to be
wary that if I is set too low, then insufficient local search will be carried out on each
newly created solution, while an I that is too high will result in too much effort being
placed on local search as opposed to the global search carried out by the evolutionary
operators. Ultimately, we choose to settle on I = 16n, which roughly corresponds
to the settings used in the most successful runs reported by Galinier and Hao [3]. In
all cases reported here, we also use a population size of 10, as recommended by the
authors.
Like the HEA, the AntCol algorithm of Thompson and Dowsland [4] is another
metaheuristic-based method that combines global and local search operators, in this
case using the ant colony optimisation (ACO) metaheuristic.
ACO is an algorithmic framework that was originally inspired by how real ants
determine efficient paths between food sources and their colonies. In their natural
habitat, when no food source has been identified, ants tend to wander about rather
randomly. However, when a food source is found, the discovering ants will take some
of this back to the colony leaving a pheromone trail in their wake. When other ants
discover this pheromone, they are less likely to continue wandering at random, but
may instead follow the trail. If they go on to discover the same food source, they
will then follow the pheromone trail back to the nest, adding their own pheromone
in the process. This encourages further ants to follow the trail. In addition to this,
pheromones on a trail also tend to evaporate over time, reducing the chances of an
ant following it. The longer it takes for an ant to traverse a path, the more time the
pheromones have to evaporate; hence, shorter paths tend to see a more rapid build-
up of pheromone, making other ants more likely to follow it and deposit their own
pheromone. This positive feedback eventually leads to all ants following a single,
efficient path between the colony and food source.
As might be expected, initial applications of ACO were aimed towards problems
such as the travelling salesman problem and vehicle routing problems, where we
seek to identify efficient paths and cycles in graphs (see, for example, the work of
Dorigo et al. [7] and Rizzoli [8]). However, applications to many other problems
have also been made.
The idea behind the AntCol algorithm is to use virtual “ants” to produce indi-
vidual candidate solutions. During a run each ant produces its solution in a nonde-
terministic manner, using probabilities based on heuristics and also on the quality
of solutions produced by previous ants. In particular, if previous ants have identified
features that are seen to lead to better-than-average solutions, the current ant is more
likely to include these features in its solution, generally leading to a reduction in the
number of colours during the course of a run.
A full description of the AntCol algorithm is provided in Fig. 5.4. As shown
in the pseudocode, in each cycle of the algorithm (Steps (4) to (16)), several ants
each produce a complete, though not necessarily feasible, solution. In Step (11), the
details of each of these solutions are then added to a trail update matrix δ. At the end
of a cycle, the contents of δ are used together with an evaporation rate ρ to update
the global trail matrix t (Step (15)).
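The precise update formula forms part of the pseudocode in Fig. 5.4. Purely for illustration, one common ACO form, which we assume here, evaporates the existing trail and then adds the cycle's deposits:

```python
def update_trail(t, delta, rho):
    """Illustrative ACO trail update: evaporate by rate rho, add deposits.

    This assumes the common form t <- (1 - rho) * t + delta; the exact
    formula used by AntCol is the one given in the algorithm's pseudocode.
    """
    n = len(t)
    for u in range(n):
        for v in range(n):
            t[u][v] = (1.0 - rho) * t[u][v] + delta[u][v]
```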
As shown, at the start of each cycle, an individual ant constructs a candidate
solution S using the procedure BuildSolution. This procedure is based on the
Greedy-I-Set algorithm seen in Chap. 3 (Fig. 3.14) and operates by building up
each colour class in a solution one at a time. Recall that during the construction of
each colour class Si ∈ S , Greedy-I-Set makes use of two sets: X , which contains
the uncoloured vertices that can currently be added to Si without causing a clash;
and Y , which holds the uncoloured vertices that cannot be feasibly added to Si . The
modifications that BuildSolution employs are as follows:
• In the procedure, a maximum of k colour classes are permitted. Once these have
been constructed, any remaining vertices are left uncoloured.
Fig. 5.4 The AntCol algorithm. At termination, the best feasible solution found is Sbest , using
|Sbest | = k + 1 colours
The AntCol algorithm also makes use of a “multi-sets” operator within the
BuildSolution procedure. Since the process of constructing a colour class is prob-
abilistic in this case, the operator makes ν separate attempts to construct each colour
class. It then selects the one that results in the minimum number of edges in the graph
induced by the set of remaining uncoloured vertices Y (since such graphs will tend
to feature lower chromatic numbers).
On completion of BuildSolution, the generated solution S will be proper, but
could be partial. If the latter is true, all uncoloured vertices are assigned to random
colour classes to form a complete, improper solution, and TabuCol is run for I
iterations. Details on the solution are then written to the trail update matrix δ using
the evaluation function
F(S′) = 1/f2(S′) if f2(S′) > 0, and F(S′) = 3 otherwise,   (5.6)
as shown in Step (11) of Fig. 5.4. This means that higher quality solutions contribute
larger values to δ, encouraging their features to be included in solutions produced
by future ants.
The parameters used in our application, and recommended by Thompson and
Dowsland [4], are as follows: α = 2, β = 3, ρ = 0.75, nants = 10, I = 2n, and
ν = 5. The tabu tenure scheme of TabuCol is the same as in previous descriptions.
transfers are made if they are seen to retain feasibility. The local search procedure
continues in this fashion until I iterations have been performed.1
On completion of the local search procedure, the colour classes in T are moved
back into S to form a complete feasible solution. The independent sets in S are then
ordered according to some (possibly random) heuristic, and an updated solution is
formed by constructing a permutation of the vertices in the same manner as that
of the iterated Greedy algorithm (see Sect. 4.2.1) and then applying the Greedy
algorithm. This completes a single cycle of the HC algorithm.
The HC algorithm performs a series of cycles until a user-defined computation
limit is reached. The application of Greedy in each cycle is intended to generate large
alterations to the incumbent solution, which is then passed back to the local search
procedure for further optimisation. Note that none of the stages of this algorithm
allow the number of colour classes being used to increase, thus providing its hill-
climbing characteristics.
As with the previous algorithms, several parameters need to be set with the HC
algorithm, each that can influence its performance. The values used in our exper-
iments here were determined in preliminary tests and according to those reported
by Lewis [9]. For the local search procedure, independent sets are moved into T
by considering each Si ∈ S in turn and transferring it with probability 1/|S|. The
local search procedure is then run for I = 1000 iterations and, in each iteration, the
Kempe chain and swap neighbourhoods are called with probabilities 0.99 and 0.01,
respectively. Finally, when constructing the permutation of the vertices for passing
to the Greedy algorithm, the independent sets are ordered using the same 5:5:3 ratio
as detailed in Sect. 4.2.1.
The sixth and final algorithm considered in this chapter is the backtracking approach
of Korman [10]. Essentially, this operates in the same manner as the basic backtrack-
ing approach discussed in Sect. 4.1.1, though with the following modifications:
• Given the graph G = (V, E), initially vertices are relabelled such that deg(v1 ) ≥
deg(v2 ) ≥ · · · ≥ deg(vn ), breaking ties randomly.
• When performing a forward step, the next vertex to be coloured is chosen as
the uncoloured vertex with the smallest number of feasible colours to which it
can currently be assigned. Ties are broken using the vertex among these with the
lowest index.
1 Note that in some cases a Kempe chain will contain all vertices in both colour classes, that is,
the graph induced by Si ∪ S j will form a connected bipartite graph. Kempe chains of this type are
known as total, and interchanging their colours serves no purpose since this only results in the two
colour classes being relabelled. Consequently, total Kempe chains are ignored by the algorithm.
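The Kempe chains used by this local search procedure, together with the total-chain test described in the footnote, can be sketched using a breadth-first search restricted to the two colour classes involved:

```python
from collections import deque

def kempe_chain(adj, colour, v, i, j):
    """Vertices reachable from v through vertices coloured i or j only."""
    chain, queue = {v}, deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in chain and colour[w] in (i, j):
                chain.add(w)
                queue.append(w)
    return chain

def interchange(adj, colour, v, j):
    """Swap colours i and j along the Kempe chain containing v.

    Returns False (and makes no change) for total Kempe chains, whose
    interchange would merely relabel the two colour classes."""
    i = colour[v]
    chain = kempe_chain(adj, colour, v, i, j)
    both = [u for u in colour if colour[u] in (i, j)]
    if len(chain) == len(both):
        return False                      # total chain: interchange is pointless
    for u in chain:
        colour[u] = j if colour[u] == i else i
    return True
```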
Fig. 5.5 Example run of the backtracking algorithm of Korman [10]. Initially, k can be set to
Δ(G) + 1. As usual, the notation c(v) = j means that vertex v is assigned to colour j
An example run of this algorithm is illustrated in Fig. 5.5. Each grey node in this
search tree represents a decision (an assignment of a colour to a vertex) and grey leaf
nodes correspond to a feasible solution. For clarity, the order in which these nodes
are visited is shown by the numbers next to the corresponding links in the tree. This
corresponds to a depth-first parse of the search tree.
As seen in Steps (1) to (7) of Fig. 5.5, the algorithm starts by performing a series
of assignments. This results in the four-colouring of G shown at the bottom left. At
each node along this path, the above rules have been used to select the next vertex to
colour. The lowest feasible colour from the set {1, . . . , k} has then been assigned to
the vertex. At this point, we are now interested in producing a solution using fewer
colours, so we set k to be equal to the number of colours in this current solution
minus one and then backtrack. At Step (8), the algorithm now attempts to assign v6
to the colours 1, 2, and 3. However, all of these colours are infeasible, so no further
branches need exploring. This is signified by the black node in the search tree. At
Step (9), the algorithm then backtracks to try and identify a new colour for v5 . In this
case, colours 1 and 2 are not feasible, and colour 3 has already been tried so, again,
no further branching is necessary.
At Step (15), vertex v3 is assigned to colour 3, producing a new grey node. As
shown, the subtree rooted at this node contains a three-colouring. Once this is dis-
covered, k can be set to two. At this point, no further branching is possible, meaning
that once the algorithm has backtracked to the root of the search tree, the discovered
three-colouring is the guaranteed optimum.
Note that several parameters can also be set when applying this algorithm, some
of which might alter the performance quite drastically. These include limiting the
number of branches that can be considered at each node of the search tree and also
prohibiting branching at certain levels. In practice, it is not obvious how these settings
might be chosen a priori for individual graphs, so, in our case, we opt for the most
natural configuration, which is to simply attempt a complete exploration of the search
tree.2 This means that, given excess time, the algorithm is exact, though, of course, such run-lengths will not be possible in many cases.
5.7 Algorithm Comparison

In this section, we now compare the performance of the above six algorithms using a
variety of different graph types. As with our comparison of constructive algorithms
in Chap. 3, we begin by considering random graphs. We then go on to look at flat
graphs, scale-free graphs, and planar graphs. We will also examine sets of graphs
arising from two real-world practical problems, namely, university timetabling and
social networking.
As with our previous experiments, computational effort for these algorithms is
measured by counting the number of constraint checks (see Sect. 1.4.1). Solution
quality is measured by recording the smallest number of colours being used in any
feasible solution observed during a run. Note that because TabuCol, PartialCol,
and the HEA operate using infeasible solutions, settings for k are required which
will need to be modified during a run. In our case, initial values are determined by
executing DSatur on each instance and setting k to the number of colours used in
the resultant solution. During runs, k is then decremented by 1 each time a feasible
k-colouring is found, with the algorithms being restarted.
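This outer control loop can be sketched as follows. Here solve_k stands in for any of the k-fixed heuristics (TabuCol, PartialCol, the HEA); the DSatur-style greedy bound and the random-restart stand-in below are illustrative assumptions, not the implementations used in these experiments.

```python
import random
from itertools import count

def dsatur_bound(adj):
    """Greedy DSatur-style colouring, used only to obtain an initial k."""
    col, uncoloured = {}, set(adj)
    while uncoloured:
        # choose the vertex seeing the most distinct neighbour colours
        v = max(uncoloured, key=lambda u: (len({col[w] for w in adj[u] if w in col}),
                                           len(adj[u])))
        used = {col[w] for w in adj[v] if w in col}
        col[v] = next(c for c in count(1) if c not in used)
        uncoloured.remove(v)
    return max(col.values(), default=0), col

def random_greedy_k(adj, k, effort):
    """Toy stand-in for a k-fixed heuristic: random-order greedy restarts,
    accepted only if the result uses at most k colours."""
    verts = list(adj)
    for _ in range(effort):
        random.shuffle(verts)
        col = {}
        for v in verts:
            used = {col[w] for w in adj[v] if w in col}
            col[v] = next(c for c in count(1) if c not in used)
        if max(col.values()) <= k:
            return col
    return None                       # failed within the effort budget

def descending_k_scheme(adj, solve_k, effort=100):
    """Start k at the DSatur bound; each time a feasible k-colouring is found,
    decrement k and restart the heuristic. Returns the best k achieved."""
    k, best = dsatur_bound(adj)
    while k > 1:
        solution = solve_k(adj, k - 1, effort)
        if solution is None:
            break
        best, k = solution, k - 1
    return k, best
```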
As we saw in Definition 3.15, random graphs are generated such that each pair of vertices is made adjacent with probability p. This gives an average of n(n − 1)p/2 edges per graph. It also means that the distribution of vertex degrees is characterised by the
binomial distribution B(n − 1, p). For the following experiments, we used values
of p ranging from 0.05 (sparse) to 0.95 (dense), incrementing in steps of 0.05, with
n ∈ {250, 500, 1000}. Twenty-five instances were generated in each case.
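For reference, graphs of this kind can be generated in a few lines. The sketch below is equivalent in distribution to, e.g., NetworkX's nx.gnp_random_graph, though we make no claim that it matches the generator used for these experiments.

```python
import random
from itertools import combinations

def random_graph(n, p, seed=None):
    """Erdős–Rényi G(n, p): each of the n(n-1)/2 vertex pairs becomes an
    edge independently with probability p. Returns an adjacency dict."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj
```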
Table 5.1 shows the number of colours used in solutions produced by the six algo-
rithms for random graphs with edge probability p = 0.5 and varying numbers of
vertices. The results indicate that for the smaller graphs (n = 250), the TabuCol,
PartialCol, and HEA algorithms produce solutions with fewer colours than the
remaining algorithms.3 However, no statistical difference between these three algo-
rithms is apparent. For larger graphs, however, the HEA produces the best results,
allowing us to conclude that, for n = 500 and n = 1000, the HEA algorithm pro-
duces the best solutions across the set of all graphs and their isomorphisms under
this particular computation limit.
Considering other graph densities, the charts shown in Fig. 5.6 summarise the
mean solution quality achieved by the six algorithms on all random graphs generated.
In each figure, the bars show the number of colours used in solutions produced by
DSatur and the lines then give the proportion of this number used in the solutions
returned by each of the six algorithms. Note that all algorithms achieve a reduction
in the number of colours realised by DSatur, though in all but the smallest, sparsest
graphs, the backtracking algorithm exhibits the smallest margins of improvement.
It is clear from Fig. 5.6 that TabuCol, PartialCol, and the HEA, in particular,
produce the best results for the random graphs. For n = 250, these algorithms
produce mean results that, across the range of values for p, show no significant
difference among one another, perhaps indicating that the achieved solutions are
consistently close to being optimal. For larger graphs, however, the HEA’s solutions
are seen to be significantly better, though its rates of improvement are slightly slower
than those of TabuCol and PartialCol, as illustrated in Fig. 5.7. Similar behaviour
during runs was also witnessed with the smaller random instances.
Table 5.1 Summary of results produced at the computation limit using random graphs G(n, 0.5). Entries give the mean ± standard deviation of the number of colours, taken from runs across 25 graphs

  n     TabuCol        PartialCol     HEA            AntCol         HC              Bktr
  250   28.04 ± 0.20   28.08 ± 0.28   28.04 ± 0.33   28.56 ± 0.51   29.28 ± 0.46    34.24 ± 0.78
  500   49.08 ± 0.28   49.24 ± 0.44   47.88 ± 0.51   49.76 ± 0.44   54.52 ± 0.77    62.24 ± 0.72
  1000  88.92 ± 0.40   89.08 ± 0.28   85.48 ± 0.46   89.44 ± 0.58   101.44 ± 0.82   112.88 ± 0.97

Overall, the patterns shown in Fig. 5.6 indicate that the HEA's strategy of exploring the space of infeasible solutions using both global and local search operators is the most beneficial of those considered here. Indeed, although the HC algorithm
also uses both global and local search operators, here its insistence on preserving
feasibility implies a lower level of connectivity in its underlying solution space,
making navigation more restricted and resulting in noticeably inferior solutions.
Figure 5.6 also reveals that AntCol does not perform well with large sparse
instances, though it does become more competitive with denser instances. The rea-
sons for this are twofold. First, the degrees of vertices in sparse graphs are naturally
lower, reducing the heuristic bias provided by η and perhaps implying an over-
dominant role of τ during applications of BuildSolution (see Eq. (5.4)). Secondly,
sparse graphs also feature greater numbers of vertices per colour—thus, even if very
promising independent sets are identified by AntCol, their reconstruction by later
ants will naturally depend on a longer sequence of random trials, making them less
likely to reoccur. To back these assertions, we also repeated the trials of AntCol
using the same local search iteration limit as the HEA, namely, I = 16n. How-
ever, though this brought slight improvements for denser graphs, the results were
still observed to be significantly worse than the HEA’s, suggesting the difference in
performance indeed lies with the global search element of AntCol in these cases.
Our second set of experiments concerns flat graphs. Flat graphs are produced by
starting with an empty graph G = (V, E = ∅) and then partitioning the n vertices into q almost equal-sized independent sets (i.e., each set contains either ⌊n/q⌋ or ⌈n/q⌉ vertices). Edges are then added between pairs of vertices in different independent
sets in such a way that the variance in vertex degrees is kept to a minimum. This is
continued until a user-specified density of p is reached.
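A much-simplified sketch of this construction is given below. It keeps the two defining properties — q near-equal independent sets and inter-set edges only — but balances degrees by always extending a lowest-combined-degree candidate pair, which is cruder than the scheme used in Culberson's generator.

```python
import random

def flat_graph_sketch(n, q, p, seed=None):
    """Simplified flat graph: partition n vertices into q near-equal
    independent sets, then repeatedly add the inter-set edge whose endpoints
    currently have the smallest combined degree, until density p is reached."""
    rng = random.Random(seed)
    part = [v % q for v in range(n)]                 # near-equal partition
    adj = {v: set() for v in range(n)}
    target = int(p * n * (n - 1) / 2)                # required number of edges
    candidates = [(u, v) for u in range(n) for v in range(u + 1, n)
                  if part[u] != part[v]]             # inter-set pairs only
    rng.shuffle(candidates)                          # break ties randomly
    edges = 0
    while edges < target and candidates:
        u, v = min(candidates, key=lambda e: len(adj[e[0]]) + len(adj[e[1]]))
        candidates.remove((u, v))
        adj[u].add(v); adj[v].add(u)
        edges += 1
    return adj, part
```

By construction, the partition itself is a feasible q-colouring of the result.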
It is well known that q-coloured solutions to flat graphs are quite easy to achieve
for most values of p. This is because, for lower values for p, problems will be under-
constrained, perhaps giving χ (G) < q, and making q-coloured solutions easily
identifiable. On the other hand, high values for p can result in over-constrained
problems with prominent global optima that are also easily discovered. Hard-to-
solve q-colourable graphs are known to occur for a region of p’s at the boundary of
these extremes, commonly termed the phase transition region [11, 12]. Flat graphs,
in particular, are known to have rather pronounced phase transition regions because each colour class and vertex degree is deliberately similar, implying a lack of heuristic information for algorithms to exploit (i.e., vertices tend to "look the same").

Fig. 5.6 Mean quality of solution achieved on random graphs using n = 250, 500, and 1000 (respectively) for various edge probabilities p. Each panel plots the number of colours used by DSatur (bars) alongside each algorithm's colours as a percentage of DSatur's (lines). All points are the mean of 25 runs on 25 different instances

Fig. 5.7 Run profiles on random graphs of n = 1000 with edge probabilities p = 0.25, 0.5, and 0.75, respectively. Each panel plots the number of colours against the number of constraint checks (up to 5 × 10¹¹). Each line represents a mean of 25 runs on 25 different instances
For our experiments, flat graphs were generated using publicly available soft-
ware designed by Joseph Culberson which can be downloaded at http://webdocs.cs.
ualberta.ca/~joe/Coloring/. Graphs were produced for q ∈ {10, 50, 100} using var-
ious settings of p in and around the phase transition regions. In each case, we used
n = 500, implying approximately 50, 10, and 5 vertices per colour, respectively.
Twenty instances were generated in each case.
The relative performance of the six graph colouring algorithms on these flat graphs is shown in Fig. 5.8. Similarly to the random graphs, we see that the HEA, TabuCol,
and PartialCol generally exhibit the best performance on instances within the
phase transition regions, with the HC and backtracking algorithms proving the least
favourable. One pattern to note is that, for all three values of q, the HEA tends
to produce the best quality results on the left side of the phase transition region,
while PartialCol produces better results for a small range of p’s on the right
side. However, this difference is not due to the “Foo” tabu tenure mechanism of
PartialCol, because no significant difference was observed when we repeated
our experiments using PartialCol under TabuCol’s tabu tenure scheme. Thus,
it seems that PartialCol’s strategy of only allowing solutions to be built from
independent sets is favourable in these cases, presumably because this restriction
facilitates the formation of independent sets of size n/q—structures that will be less
abundant in denser graphs, but which also serve as the underlying building blocks in
these cases.
Another striking feature of Fig. 5.8 is the poor performance of AntCol on the
right side of the phase transition regions. This again seems to be due to the diminished
effect of heuristic value η which, in this case, occurs because of the very low variance
in vertex degrees. Furthermore, in denser graphs, fewer combinations comprising n/q
vertices will form independent sets, decreasing the chances of an ant constructing
one. This reasoning is also backed by the fact that AntCol’s poor performance
lessens with larger values of q where, due to there being fewer vertices per colour,
the reproduction of independent sets is dependent on shorter sequences of random
trials.
Our next set of experiments concerns planar graphs. These provide some contrast-
ing results to those of the previous two subsections because, in many cases, the
backtracking algorithm is often able to quickly solve these problems to optimality.
However, there are still places where difficulties are encountered.
As we saw in Sect. 1.2, planar graphs are structured so that they can be drawn
on a two-dimensional plane such that no edges cross. Because of this, they are quite
sparse; indeed, the maximum possible number of edges in a planar graph with n
vertices is just 3n − 6 (see Theorem 6.2). In Sect. 6.1, we will see that all planar
graphs are actually four-colourable. In practice, this means that planar graphs can
be optimally coloured in polynomial time using, for example, the O(n²) algorithm of Robertson et al. [13]. However, it is still interesting to see how our six case-study algorithms can perform with these problem instances.

Fig. 5.8 Mean solution quality (colours at the cut-off) achieved with flat graphs of n = 500 with q = 10, 50, and 100 (respectively) for various edge probabilities p. All points are the mean of 20 runs on 20 different instances
Example Python code for generating planar graphs is given in Appendix A.4.
This operates by first randomly placing n points into the unit square. A Delaunay
triangulation is then generated from these points to give a planar graph with approximately (but not exceeding) 3n − 6 edges. A subset of these edges is then taken to
give a planar graph with the required number of edges m, ensuring that the resultant
graph is also connected. An illustration of this process is shown in Fig. A.3. In our
trials, planar graphs were generated using n ∈ {100, 1000, 2000}. In each case, 40
different values of m between n and 3n − 6 were considered and 25 different graphs
were then generated for each (n, m) pair.
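The final step of this process — thinning the triangulation down to the required m edges without disconnecting the graph — can be sketched with a union-find structure. The Delaunay step itself (available via, e.g., scipy.spatial.Delaunay) is assumed here to have already produced the edge list.

```python
import random

def thin_to_m_edges(n, tri_edges, m, seed=None):
    """Keep m of a connected triangulation's edges without disconnecting it:
    grow a random spanning tree using a union-find, then top up with further
    randomly chosen edges (assumes n - 1 <= m <= len(tri_edges))."""
    rng = random.Random(seed)
    edges = list(tri_edges)
    rng.shuffle(edges)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    tree, spare = [], []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))             # needed for connectivity
        else:
            spare.append((u, v))            # optional extra edge
    return tree + spare[:m - len(tree)]
```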
Figure 5.9 summarises the performance of the backtracking algorithm on this set
of planar graphs by considering its success rate and computational requirements. The
success rate gives the percentage of instances in which the algorithm has navigated its
way back to the root of the search tree (and therefore produced an optimal solution)
within the computational limit. The computational requirements are then calculated
by taking the mean number of checks performed across these successful runs. For
n = 100, we see that the algorithm solves all instances to optimality using very
small amounts of computation. Factors contributing to this success are the relatively
small number of vertices and colours (contributing to smaller search trees), and the
fact that χ (G)-colourings are established quickly, allowing much of the remaining
search tree to be pruned.
For the larger values of n shown in Fig. 5.9, other patterns start to emerge, with
dips in the success rate occurring in two areas. These are somewhat reminiscent
of the phase transition regions experienced with flat graphs, seen in the previous
subsection. For the lowest values of m used in these figures, the chromatic number
of the graphs is nearly always three. Because there are fewer edges, there are also
many different three-colourings. The algorithm is therefore able to quickly identify
one of these and prune much of the remaining search tree. As m is increased from this point, the number of feasible three-colourings diminishes, causing a drop in success rates and an increase in the required computational effort. Next, at around m = 2n,
the increased number of edges means that the chromatic number is now usually four.
As before, there are now many different four-colourings, allowing the backtracking
algorithm to terminate with an optimal solution quite quickly. Then, as m is raised
further, the number of feasible four-colourings drops, once again causing a decrease
in the success rate and an increase in the required computational effort.
In Fig. 5.10, we compare the quality of solutions returned by the backtracking
algorithm to those of the other five algorithms. In these cases, the other five algorithms
have produced identical results, so just one line is used for them in the charts. As
shown, for n = 100, all six algorithms produce the same results. This indicates
that the remaining five algorithms are also producing optimal solutions. For larger
instances, the five algorithms give better solutions in the areas corresponding to the
phase transition regions of the backtracking algorithm. This is particularly so for the
densest graphs, where they always produce four-colourings, whereas backtracking
often only produces five-colourings.
Fig. 5.9 Performance of the backtracking algorithm on planar graphs of differing densities for n = 100, 1000, and 2000, respectively. Each panel plots the mean number of constraint checks and the success rate (%) against the number of edges m. All points in the figures are means taken across 25 different problem instances
Fig. 5.10 Average number of colours used in solutions returned by the backtracking algorithm and the remaining five heuristics, for n = 100, 1000, and 2000, respectively (one line per panel shows Bktr; a second shows the other five algorithms, which returned identical results). The lower shaded areas indicate lower bounds on the chromatic number. These were determined using the NetworkX command nx.graph_clique_number(G) to calculate the clique number ω(G) for each graph G. Although the algorithm used by this command has an exponential complexity, we found that these operations were able to complete quickly with these planar graphs. The upper shaded area indicates four colours, which is the maximum number of colours required by any planar graph. All points are means taken across 25 instances
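Since χ(G) ≥ ω(G), lower bounds of this kind can be computed exactly on small or sparse instances. The following is a compact sketch of the sort of algorithm behind the NetworkX command quoted above — Bron–Kerbosch with pivoting — though whether NetworkX uses precisely this variant is not claimed here.

```python
def clique_number(adj):
    """Exact maximum clique size ω(G), via Bron–Kerbosch with pivoting.
    Worst-case exponential, but typically fast on sparse (e.g. planar) graphs."""
    best = 0

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            best = max(best, len(r))        # r is a maximal clique
            return
        pivot = max(p | x, key=lambda u: len(adj[u] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(adj), set())
    return best
```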
Fig. 5.11 Computational effort (constraint checks against m) required to achieve optimal solutions for planar graphs with n = 2000 vertices, for TabuCol, PartialCol, and the HEA (mean across 25 instances for each value of m)
Finally, Fig. 5.11 shows the computational effort required to find optimal solutions
using TabuCol, PartialCol, and the HEA algorithm (these are chosen here as they
seem to require the least effort overall). Note that these algorithms do not prove the
optimality of a solution by themselves; in these cases, optimality is therefore claimed
either because χ (G) has been previously determined by the backtracking algorithm,
or because a solution using ω(G) colours has been determined. In this figure, we see
that the number of checks required by these algorithms is slightly higher within the phase transition regions. However, on the whole, these numbers are quite low, with less than 0.003% of the computational limit being required, on average, to produce an optimal solution.
We now consider another type of graph topology in which the backtracking algorithm
is observed to perform very well. Scale-free graphs are known to model many real-
world applications of networks, including the World Wide Web, citation networks
of academic papers, and flight connections between airports [14]. In essence, these
graphs are based around the idea of “preferential attachment” in that, when a vertex
is added to a graph, it is more likely to be made adjacent to existing vertices that have
high degrees. As a result, the degree distributions of these graphs follow a power
law, in which a small number of “hub” vertices feature very high degrees compared
to the remaining vertices.
Scale-free graphs can be artificially generated using the Barabási–Albert method
[15]. For these trials, we employed the NetworkX implementation of this algorithm.
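A minimal pure-Python sketch of preferential attachment is given below; it mirrors what nx.barabasi_albert_graph(n, q) does, with each new vertex linking to q distinct existing vertices chosen with probability roughly proportional to their current degrees. (NetworkX's exact sampling details differ slightly; this is illustrative only.)

```python
import random

def barabasi_albert_sketch(n, q, seed=None):
    """Scale-free graph by preferential attachment: each new vertex links to
    q distinct existing vertices, chosen with probability (roughly)
    proportional to their current degrees. Requires n > q >= 1."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    pool = list(range(q))            # vertex v appears once per incident edge
    for v in range(q, n):
        targets = set()
        while len(targets) < q:      # sample until q distinct targets found
            targets.add(rng.choice(pool))
        for u in targets:
            adj[u].add(v); adj[v].add(u)
            pool += [u, v]           # both endpoints gain attachment weight
    return adj
```

The earliest vertices accumulate attachments over the whole run, producing the high-degree "hubs" described above.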
Fig. 5.12 Degree distributions for a random graph with n = 10,000 and p = 0.0004, and a scale-free graph with n = 10,000 and q = 20. The maximum degree for the random graph is 65; for the scale-free graph the maximum is 721
Fig. 5.13 Number of colours achieved in solutions returned by the backtracking algorithm on scale-free graphs for n = 100, 1000, and 2000, respectively. The corresponding results for the Greedy, DSatur, and RLF heuristics are also shown (plotted as percentages of the backtracking algorithm's totals). All points are the mean of twenty problem instances
In our trials, scale-free graphs for n ∈ {100, 1000, 2000} were considered using
values of q ∈ {1, 2, . . . , n − 1}. Twenty graphs were then generated for each (n, q)
pair, giving 61,940 problem instances in total. Although this resultant graph set
contains a wide range of sizes and densities, ultimately the backtracking algorithm
was seen to perform very well in these cases, with just 14 of the 61,940 instances not
being solved to optimality within the computation limit (specifically, 3 for n = 1000
and 11 for n = 2000). In addition, the time to find the optimal solutions for these
61,926 instances was less than one-quarter of a second per instance on average.4
The results of the backtracking algorithm with these instances are summarised
in Fig. 5.13. As expected, χ (G) = 2 when q = 1 and q = n − 1, because the
corresponding graphs are trees. More generally, low and high values for q result
in sparser graphs contributing to lower chromatic numbers. On the other hand, the
highest chromatic numbers are seen when q is set to around three-quarters of n. For
comparative purposes, results for the Greedy, DSatur, and RLF heuristics are also shown in the figure. For the densest graphs, DSatur and RLF produce solutions of similar quality; however, for sparser graphs, their solutions are suboptimal except where q = 1.
The success of the backtracking algorithm seen here is due to the special struc-
ture of these scale-free graphs. In these cases, the high-degree “hub” vertices tend
to belong to a maximum clique C. Because of the selection rules used by the back-
tracking algorithm, the vertices of C are coloured first, giving a partial |C|-colouring.
Usually, the remaining vertices are then easily coloured using the available colours,
giving a full, feasible |C|-colouring. If this is not possible, then only minor adjust-
ments near the leaves of the search tree are usually required. At this point, the number
of permitted colours is decremented by one and the algorithm backtracks to level
|C| of the search tree; however, because of the presence of the clique C, solutions
using fewer than |C| colours cannot be achieved. This allows the algorithm to quickly
backtrack to the root of the search tree, therefore providing an optimal solution.
Our next set of problem instances concerns graphs representing real-world univer-
sity timetabling problems. As we saw in Sect. 1.1.2, timetabling problems involve
assigning a set of “events” (exams, lectures, etc.) to a fixed number of “timeslots”.
A pair of events then “conflict” when they require the same single resource, e.g.,
there may be a student or lecturer who needs to attend both events, or the events may
require the use of the same room. As a result, conflicting events should be assigned
to different timeslots. Under this constraint, a timetabling problem can be modelled
using graph colouring by considering each event as a vertex, with edges occurring
between pairs of conflicting events. Each colour then represents a timeslot, and a
feasible colouring corresponds to a complete timetable with no conflict violations.
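The modelling step described above can be sketched directly from enrolment lists. The input format below (a dict from each resource to the events it requires) is an assumption made for illustration.

```python
from itertools import combinations

def conflict_graph(enrolments):
    """Timetabling conflict graph: events are vertices, and two events are
    adjacent iff some shared resource (e.g. a student) needs them both.

    enrolments: dict mapping each resource to the set of events it requires.
    Returns an adjacency dict over events; adjacent events must receive
    different timeslots (colours)."""
    adj = {}
    for events in enrolments.values():
        for e in events:
            adj.setdefault(e, set())
        for e1, e2 in combinations(sorted(events), 2):
            adj[e1].add(e2)
            adj[e2].add(e1)
    return adj
```

For example, if one student sits both maths and physics and another sits physics and art, then maths and art may still share a timeslot, but physics must avoid both.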
Table 5.2 Details of the 13 timetabling instances of Carter et al. [16]. The degree coefficient of
variation (CV) is defined as the ratio of the degree standard deviation to the degree mean
Instance    n    Density    Degree (min; med; max)    Mean μ    CV (%)
hec-s-92 81 0.415 9; 33; 62 33.7 36.3
sta-f-83 139 0.143 7; 16; 61 19.9 67.4
yor-f-83 181 0.287 7; 51; 117 52.0 35.2
ute-s-92 184 0.084 2; 13; 58 15.5 69.1
ear-f-83 190 0.266 4; 45; 134 50.5 56.1
tre-s-92 261 0.180 0; 45; 145 47.0 59.6
lse-f-91 381 0.062 0; 16; 134 23.8 93.2
kfu-s-93 461 0.055 0; 18; 247 25.6 120.0
rye-s-93 486 0.075 0; 24; 274 36.5 111.8
car-f-92 543 0.138 0; 64; 381 74.8 75.3
uta-s-92 622 0.125 1; 65; 303 78.0 73.7
car-s-91 682 0.128 0; 77; 472 87.4 70.9
pur-s-93 2419 0.029 0; 47; 857 71.3 129.5
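The degree coefficient of variation used in Table 5.2 is simply the ratio of the standard deviation of the degree sequence to its mean; the population form of the standard deviation is assumed here.

```python
from statistics import mean, pstdev

def degree_cv(adj):
    """Degree coefficient of variation (%), as defined for Table 5.2: the
    ratio of the standard deviation of the vertex degrees to their mean."""
    degrees = [len(neighbours) for neighbours in adj.values()]
    return 100 * pstdev(degrees) / mean(degrees)
```

A regular graph has CV = 0; the large values in Table 5.2 reflect the wide degree spread typical of timetabling graphs.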
Table 5.3 Summary of algorithm performance on the 13 timetabling instances of Carter et al. [16].
All statistics are collected from 50 runs on each instance. Asterisks in the rightmost column indicate
where the backtracking algorithm was able to produce a provably optimal solution. In these cases,
the square brackets indicate the percentage of runs where this occurred, and the average percentage
of the computation limit that this took
Instance Colours at cut-off: mean (best)
TabuCol PartialCol HEA AntCol HC Bktr
hec-s-92 17.22 (17) 17.00 (17) 17.00 (17) 17.04 (17) 17.00 (17) 19.00 (19)
sta-f-83 13.35 (13) 13.00 (13) 13.00 (13) 13.13 (13) 13.00 (13) *13.00 (13)
[100%,
<0.1%]
yor-f-83 19.74 (19) 19.00 (19) 19.06 (19) 19.87 (19) 19.00 (19) 20.00 (20)
ute-s-92 10.00 (10) 10.00 (10) 10.00 (10) 11.09 (10) 10.00 (10) 10.00 (10)
ear-f-83 26.21 (24) 22.46 (22) 22.02 (22) 22.48 (22) 22.00 (22) *22.00 (22)
[100%, 0.7%]
tre-s-92 20.58 (20) 20.00 (20) 20.00 (20) 20.04 (20) 20.00 (20) 23.00 (23)
lse-f-91 19.42 (18) 17.02 (17) 17.00 (17) 17.00 (17) 17.00 (17) *17.00 (17)
[100%, 1.3%]
kfu-s-93 20.76 (19) 19.00 (19) 19.00 (19) 19.00 (19) 19.00 (19) 19.00 (19)
rye-s-93 22.40 (21) 21.06 (21) 21.04 (21) 21.55 (21) 21.00 (21) 22.00 (22)
car-f-92 39.92 (36) 32.48 (31) 28.50 (28) 30.04 (29) 27.96 (27) *27.00 (27)
[100%, 8.2%]
uta-s-92 41.65 (39) 35.66 (34) 30.80 (30) 32.89 (32) 30.27 (30) 29.00 (29)
car-s-91 39.10 (32) 30.20 (29) 29.04 (28) 29.23 (29) 29.10 (28) 28.00 (28)
pur-s-93 50.70 (47) 45.48 (42) 33.70 (33) 33.47 (33) 33.87 (33) 33.00 (33)
Total 341.05 (315) 302.36 (294) 280.16 (277) 286.84 (281) 279.20 (276) 282.00 (282)
Rank (6) (5) (2) (4) (1) (3)
Consider the instance kfu-s-93, by no means the hardest or largest in this set. It involves 5349
students sitting 461 exams, ideally fitted into 20 timeslots. The problem contains two cliques
of size 19 and huge numbers of smaller ones. There are 16 exams that clash with over 100
others.
Table 5.3 summarises the results achieved at the computation limit with the six
graph colouring algorithms. In contrast to many of our previous results, the worst
overall performance now occurs with the methods relying solely on local search,
that is, TabuCol and to a lesser extent PartialCol. Indeed, we find that these
methods are often incapable of achieving feasible solutions even using the initial
setting for k determined by DSatur.5

5 Consequently, the reported results for TabuCol and PartialCol in Table 5.3 are produced using an initial k generated by executing the Greedy algorithm with a random permutation of the vertices.

Fig. 5.14 Cost-change distributions for a random graph (n = 500, p = 0.15, CV = 10.7%, using k = 16) and timetable graph car-f-92 (n = 543, p = 0.138, CV = 75.3%, using k = 27). In all cases, samples are taken from candidate solutions with costs of 8

The cause of this poor performance seems to be due to the large degree variances seen in these graphs, particularly in comparison to the variances seen in the random, flat, and planar graphs examined earlier. The effects of
this are demonstrated in Fig. 5.14 where, compared to a random graph of a similar
size and density, the differences in cost between neighbouring solutions vary much
more widely. This suggests a more “spiky” cost landscape in which the use of local
search mechanisms in isolation is insufficient, exhibiting a susceptibility to becoming
trapped at local optima.
Table 5.3 also shows that the most consistent performance with these graphs is
achieved by the HC and HEA algorithms (no significant difference between the
two methods across the instances is apparent). This demonstrates that the issues
of using local search in isolation are alleviated by the addition of a global search-
based operator. On individual instances, the relative performances of HC and HEA
do seem to vary, however. With the problem instances car-f-92 and car-s-91, for
example, the HEA’s best observed solutions are determined within approximately
1% of the computation limit, while HC’s progress is much slower. On the other hand,
with instances such as rye-s-93, HC consistently produces the best observed results,
using less than 0.3% of the computation limit. This suggests that its operators are
somehow suited to this instance (this issue is considered further in Sect. 5.8.1).
We also observe that the backtracking algorithm is again quite competitive with
these instances. For four of the problem instances, the algorithm has managed to
find and prove the optimal solutions in all runs using a small fraction of the com-
putation limit. Also, the algorithm has produced the best average performance out
of all algorithms with the four largest problem instances. It seems in these cases
5.7 Algorithm Comparison 141
that the abundance of large cliques in the graphs together with their large degree
CVs characterise an abundance of heuristic information that can be successfully
exploited by the algorithm. Indeed, for the four largest instances, all of the solutions
reported in Table 5.3 were found in less than 2% of the computation limit, implying
that the algorithm quickly identifies the correct regions of the search tree. How-
ever, counterexamples in which the backtracking algorithm consistently produces the
worst performance can also be seen in Table 5.3, such as with the smallest instance,
hec-s-92.
Finally, we also note the sporadic performance of AntCol with these instances.
For all but the four largest problems, AntCol’s best solutions equal those of the
other algorithms; however, its averages are less favourable, particularly compared to
the HEA and HC algorithms. Consider, for example, the results of ute-s-92 in the
table. This problem is consistently solved using ten colours by all methods except
AntCol, which often requires 11 or 12 colours. We find that for instances such
as these, AntCol’s performance depends very much on the quality of solutions
produced in the first cycle of the algorithm. Due to the low vertex degrees (and
reduced influence of η that results), Equation (5.4) is predominantly influenced by
the pheromone values τ ; however, if an 11- or 12-colour solution is produced during
the first cycle, features of these suboptimal solutions are still used to update the
pheromone matrix τ, making their recurrence in later cycles more likely. The
upshot is that AntCol is rarely seen to improve upon solutions found in the initial
cycle of the algorithm with these instances.
Our final set of experiments in this chapter involves graphs representing social net-
works. Here we consider the social networks of school friends, compiled as part of
the USA-based National Longitudinal Study of Adolescent Health project [18]. The
colouring of such networks might be required when we wish to partition the students
into groups such that individuals are kept separate from their friends, e.g., for group
assignments and team-building exercises (see also Sect. 1.1.1).
To construct these networks, surveys were conducted in various schools, with
each student being asked to list all of his or her friends. In some cases, students were
only allowed to nominate friends attending the same school, while in others they
could include friends attending a “sister school” (e.g., middle-school students could
include friends in the local high school), leading to single-cluster and double-cluster
networks, respectively. In the resultant graphs, each student is represented by a vertex,
with edges signifying a claimed friendship between the associated individuals (see
Fig. 5.15). Note that in the original data, edges signifying friendships are both directed
and weighted; however, in our case, directions and weights have been removed to
form a simple graph.
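As a small illustration, the conversion from directed, weighted nomination data to a simple graph might be sketched as follows. The triple-based input format is an assumption made for illustration only:

```python
def to_simple_graph(nominations):
    """Convert directed, weighted friendship nominations, given as
    (nominator, nominee, weight) triples, into the edge set of a
    simple undirected graph: directions and weights are dropped, and
    duplicate edges and self-nominations are removed."""
    edges = set()
    for u, v, _weight in nominations:
        if u != v:                       # ignore self-nominations
            edges.add((min(u, v), max(u, v)))
    return edges

# Mutual nominations 1 <-> 2 collapse to a single undirected edge.
nominations = [(1, 2, 5), (2, 1, 3), (1, 3, 1), (3, 3, 2)]
to_simple_graph(nominations)             # → {(1, 2), (1, 3)}
```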
For these trials, we took a random sample of ten single-cluster networks and ten
double-cluster networks from the Adolescent Health dataset. Summary statistics of
142 5 Algorithm Case Studies
Fig. 5.15 Visualisation of a double-cluster social network collected in the National Longitudinal
Study of Adolescent Health project [18]
these graphs are given in Table 5.4. These figures indicate that the vertex degrees
are far lower than those of the timetabling graphs from the previous section, with the highest
degree across the whole set being just 29. Consequently, the densities of the graphs
are also much lower.
As before, each algorithm was executed 50 times on each instance. The relatively
straightforward outcomes of these trials are summarised in Table 5.5. Here, we see
that the number of colours needed for these problems ranges from five to ten, though
no obvious correlations exist to suggest any links with instance size, density, or the
presence of clusters. We also see that the HEA, HC, TabuCol, and PartialCol
methods have all produced the best observed (or optimal) solutions for all instances
in all runs. It seems, therefore, that the underlying structures and relative sparsity of
these graphs make them relatively “easy” to solve with these algorithms.
In addition, for six of the instances, the backtracking algorithm has managed to
find provably optimal solutions, though this does not occur in all runs. Indeed, when
this does happen, it seems to occur early in the process (<5% of the computation
limit), suggesting that the random elements of the algorithm can have a large effect on
the structure of the search tree in these cases. We also observe the poor performance
of AntCol, which seems to be due to the negative performance features noted in the
5.7 Algorithm Comparison 143
Table 5.4 Details of the 20 social networks used. The degree coefficient of variation (CV) is defined
as the ratio of the degree standard deviation to the degree mean
Instance n Density Degree (Min; Med; Max) Mean μ CV (%)
Single cluster
#1 380 0.021 0; 8; 23 8.1 50.5
#2 542 0.013 0; 7; 35 7.1 61.7
#3 563 0.013 0; 7; 23 7.3 55.4
#4 578 0.015 0; 8; 24 8.8 52.7
#5 626 0.013 0; 7; 30 7.8 58.7
#6 746 0.010 0; 7; 28 7.3 58.6
#7 828 0.008 0; 6; 23 6.2 59.3
#8 877 0.009 0; 7; 29 7.8 58.2
#9 1229 0.003 0; 4; 17 4.1 54.6
#10 2250 0.002 0; 4; 25 4.3 78.0
Double cluster
#11 291 0.027 0; 8; 21 7.8 54.6
#12 426 0.018 0; 7; 26 7.5 56.2
#13 457 0.016 0; 7; 23 7.4 58.8
#14 495 0.017 0; 8; 22 8.5 46.8
#15 569 0.017 0; 9; 34 9.4 50.9
#16 586 0.016 0; 9; 30 9.6 48.4
#17 689 0.010 0; 6; 22 6.8 62.0
#18 795 0.011 0; 9; 24 8.7 53.7
#19 1089 0.007 0; 8; 29 8.1 57.9
#20 1246 0.007 0; 9; 33 8.6 54.4
previous subsection, with a high-quality solution either being produced very quickly
(in the first cycle), or not at all.
The results of the above comparison reveal a complicated picture, with different algo-
rithms outperforming others on different occasions. This suggests that the underlying
structures of graphs are often critical in an algorithm’s resultant performance. In terms
of overall patterns, we offer the following observations:
• Algorithms that rely solely on local search (in this case TabuCol and Partial-
Col) often struggle with instances whose cost landscapes are “spiky”, commonly
Table 5.5 Summary of algorithm performance on the 20 social networks. All statistics are collected
from 50 runs on each instance. Asterisks in the rightmost column indicate where the backtracking
algorithm was able to produce a provably optimal solution. In these cases, the square brackets
indicate the percentage of runs where this occurred, and the average percentage of the computation
limit that this took
Instance Colours at cut-off: mean (best)
TabuCol PartialCol HEA AntCol HC Bktr
Single cluster
#1 8 (8) 8 (8) 8 (8) 8.15 (8) 8 (8) 8 (8)
#2 6 (6) 6 (6) 6 (6) 6.76 (6) 6 (6) *6 (6) [100%,
<1%]
#3 7 (7) 7 (7) 7 (7) 7.45 (7) 7 (7) 7.02 (7)
#4 8 (8) 8 (8) 8 (8) 8.75 (8) 8 (8) 8 (8)
#5 8 (8) 8 (8) 8 (8) 8.41 (8) 8 (8) 8 (8)
#6 6 (6) 6 (6) 6 (6) 6 (6) 6 (6) *6 (6) [90%,
<1%]
#7 6 (6) 6 (6) 6 (6) 6.38 (6) 6 (6) 6 (6)
#8 8 (8) 8 (8) 8 (8) 8.23 (8) 8 (8) 8 (8)
#9 6 (6) 6 (6) 6 (6) 6.10 (6) 6 (6) 6 (6)
#10 5 (5) 5 (5) 5 (5) 5 (5) 5 (5) *5.38 (5)
[52%, <1%]
Double cluster
#11 6 (6) 6 (6) 6 (6) 6.70 (6) 6 (6) 6.02 (6)
#12 5 (5) 5 (5) 5 (5) 5 (5) 5 (5) *5 (5) [96%,
4%]
#13 6 (6) 6 (6) 6 (6) 6 (6) 6 (6) *6.32 (6)
[46%, 1%]
#14 7 (7) 7 (7) 7 (7) 7.46 (7) 7 (7) *7 (7) [42%,
<1%]
#15 7 (7) 7 (7) 7 (7) 7 (7) 7 (7) *7 (7) [100%,
<1%]
#16 10 (10) 10 (10) 10 (10) 10.13 (10) 10 (10) 10 (10)
#17 7 (7) 7 (7) 7 (7) 7.28 (7) 7 (7) 7 (7)
#18 6 (6) 6 (6) 6 (6) 6 (6) 6 (6) *6.14 (6)
[86%, 1%]
#19 7 (7) 7 (7) 7 (7) 7.65 (7) 7 (7) 7.13 (7)
#20 7 (7) 7 (7) 7 (7) 7.69 (7) 7 (7) 7.02 (7)
Total 136 (136) 136 (136) 136 (136) 142.14 (136) 136 (136) 137.03 (136)
Rank (1) (1) (1) (6) (1) (5)
• The HEA operates in the space of infeasible solutions. Unlike the HC algorithm,
which only permits changes to a solution that maintain feasibility, the strategy
of allowing infeasible solutions seems to offer higher levels of connectivity
(and thus less restriction of movement) within the solution space, helping the
algorithm to navigate its way towards high-quality solutions more effectively.
• The HEA makes use of global as well as local search operators. On many occa-
sions, TabuCol performs poorly when used in isolation; however, the HEA’s
use of global search operators in conjunction with TabuCol seems to alleviate
these problems by allowing the algorithm to regularly escape from local optima.
• The HEA’s global search operators are robust. Unlike AntCol’s global search
operator, which sometimes hinders performance, the HEA’s use of recombina-
tion in conjunction with a small population of candidate solutions seems ben-
eficial across the instances. This is despite the fact that across all of our tests,
recombination was never seen to consume more than 2% of the available run
time. Note, in particular, that the GPX operator does not consider any problem-
specific information in its operations (such as the connectivity or degree of ver-
tices), yet it still seems to strike a useful balance between (a) altering the solution
sufficiently, while (b) propagating useful substructures within the population.
Before concluding this chapter, we now take a look at some of the individual elements
of the HEA and give some ideas as to how the performance of this algorithm can be
improved in some cases. These ideas concern maintaining diversity in the population,
using alternative recombination operators, and modifying the HEA’s local search
procedure. They are considered in turn in the following subsections.
Definition 5.1 Given a solution S, let PS be the set of all vertex pairs that are assigned to the same colour in S. That is, PS = {{u, v} : u, v ∈ V ∧ u ≠ v ∧ c(u) = c(v)}. The distance between two solutions S1 and S2 can then be calculated using the Jaccard distance measure on the sets PS1 and PS2. That is:

D(S1, S2) = (|PS1 ∪ PS2| − |PS1 ∩ PS2|) / |PS1 ∪ PS2|. (5.8)
This distance measure gives the proportion of vertex pairs (assigned to the same
colour) that exist in just one of the two solutions. Consequently, if the solutions S1
and S2 are identical, then PS1 ∪ PS2 = PS1 ∩ PS2 , giving D(S1 , S2 ) = 0. Conversely,
if no vertex pair is assigned to the same colour in both solutions, then PS1 ∩ PS2 = ∅,
implying that D(S1 , S2 ) = 1. An example of this calculation is shown in Fig. 5.16.
Given this distance measure, we are also able to define a population diversity
metric. This is calculated by taking the mean distance between all pairs of solutions
in the population.
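Definition 5.1 and this diversity metric translate directly into code. The following Python sketch (function names here are illustrative) reproduces the worked example of Fig. 5.16:

```python
from itertools import combinations

def same_colour_pairs(solution):
    """P_S: every vertex pair assigned the same colour in a
    partition-based solution (a list of colour classes)."""
    pairs = set()
    for colour_class in solution:
        pairs.update(combinations(sorted(colour_class), 2))
    return pairs

def distance(s1, s2):
    """The Jaccard distance of Equation (5.8)."""
    p1, p2 = same_colour_pairs(s1), same_colour_pairs(s2)
    union = p1 | p2
    if not union:                # both solutions are all singletons
        return 0.0
    return (len(union) - len(p1 & p2)) / len(union)

def diversity(population):
    """Population diversity: mean distance over all solution pairs."""
    dists = [distance(a, b) for a, b in combinations(population, 2)]
    return sum(dists) / len(dists)

# The worked example of Fig. 5.16:
s1 = [{1, 5, 6}, {2, 3, 8}, {4}, {7}]
s2 = [{1, 6}, {2, 7}, {3, 5}, {4, 8}]
distance(s1, s2)                 # → (9 - 1)/9 = 8/9
```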
5.8 Further Improvements to the HEA 147
Fig. 5.16 Demonstration of how to measure the distance between two colourings according
to Definition 5.1. Here, the left solution S1 = {{v1 , v5 , v6 }, {v2 , v3 , v8 }, {v4 }, {v7 }} and the
right solution S2 = {{v1 , v6 }, {v2 , v7 }, {v3 , v5 }, {v4 , v8 }}. This gives two sets, PS1 = {{v1 , v5 },
{v1 , v6 }, {v2 , v3 }, {v2 , v8 }, {v3 , v8 }, {v5 , v6 }} and PS2 = {{v1 , v6 }, {v2 , v7 }, {v3 , v5 }, {v4 , v8 }},
leading to D(S1 , S2 ) = (9 − 1)/9 = 8/9
When applying the HEA to the graphs considered in this chapter, we found that
satisfactory levels of diversity were maintained in most cases. However, for some
graphs such as the timetabling problem instances, we also observed that large colour
classes of low-degree vertices were often formed in the early stages of the algorithm
and that these quickly came to dominate the population, causing premature con-
vergence. Indeed, as we saw in Table 5.3, the HEA can sometimes produce inferior
results with these problems.
One method by which population diversity might be prolonged in EAs is to make
larger changes (mutations) to an offspring to increase its distance from its parents.
However, this must be used with care, particularly because changes that are too
large might significantly worsen a solution, undoing much of the work carried out
in previous iterations of the algorithm. For the HEA, one obvious way of increasing
the distance between parent and offspring is to increase the iteration limit of the local
search procedure. However, although this might allow further improvements to be
made to a solution, it could also slow the algorithm unnecessarily.
An alternative method for maintaining diversity in this case is to alter the HEA’s
recombination operator so that it works exclusively with proper colourings. As noted
in Sect. 5.3, the GPX operator considers candidate solutions in which clashes are
permitted. In practice, however, this could allow large colour classes containing
clashes to be unduly promoted in the population, when perhaps the real emphasis
should be on the promotion of large independent sets. Consequently, we might refine
the GPX operator so that it first removes all clashing vertices from each parent
before performing recombination. This implies that, before the assignment of missing
vertices to random colours, the partial offspring will always be proper. A further effect
is that a greater number of vertices will usually need to be recoloured because the
[Plot: population diversity (y-axis, 0 to 1) against iteration/number of crossovers (x-axis, 0 to 3000), comparing the standard HEA with the HEA using Kempe chain moves.]
Fig. 5.17 Population diversity using a population of size 10 with the timetabling problem instance
car-s-91, using k = 28
vertices originally removed from the parents may also be missing in the resultant
offspring. Hence, the resultant offspring will tend to be less similar to its parents.
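The clash-removal step described above might be sketched as follows. The data structures used here are illustrative assumptions, not this book's implementation:

```python
from itertools import combinations

def remove_clashing_vertices(solution, edges):
    """Remove every vertex involved in a clash from a (possibly
    improper) partition-based solution. The remaining partial solution
    is always proper; the removed vertices join the set that
    recombination must later reassign. `edges` is a set of frozensets."""
    clashing = set()
    for colour_class in solution:
        for u, v in combinations(colour_class, 2):
            if frozenset((u, v)) in edges:
                clashing.update((u, v))
    proper = [colour_class - clashing for colour_class in solution]
    return proper, clashing

edges = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4)]}
solution = [{1, 2, 4}, {3}]        # vertices 1 and 2 clash
remove_clashing_vertices(solution, edges)
# → ([{4}, {3}], {1, 2})
```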
If the above option is chosen, then before randomly reassigning missing vertices to
colours, we also have the opportunity to alter the partial proper solution using Kempe
chain interchanges. Recall from Theorem 4.1 that this operator, when applied to a
proper solution, does not introduce any clashes. Thus, it provides a mechanism by
which we can make changes to a solution without compromising its quality in any
way.
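As an illustration, a Kempe chain interchange on a proper (partial) colouring can be sketched as follows. A breadth-first search is one reasonable way to identify the chain; the representation used here is an assumption:

```python
from collections import deque

def kempe_interchange(colouring, adj, v, j):
    """Apply a Kempe chain interchange to a proper (partial) colouring.
    `colouring` maps vertex -> colour; `adj` maps vertex -> neighbours.
    Colours i = colouring[v] and j are swapped throughout the connected
    component, containing v, of the subgraph induced by the vertices
    coloured i or j. Per Theorem 4.1, no clashes are introduced."""
    i = colouring[v]
    if i == j:
        return dict(colouring)
    chain, queue = {v}, deque([v])
    while queue:                       # BFS over the {i, j}-subgraph
        u = queue.popleft()
        for w in adj[u]:
            if w not in chain and colouring.get(w) in (i, j):
                chain.add(w)
                queue.append(w)
    result = dict(colouring)
    for u in chain:
        result[u] = j if colouring[u] == i else i
    return result

# A 4-cycle 1-2-3-4 coloured properly with two colours:
adj = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
colouring = {1: 0, 2: 1, 3: 0, 4: 1}
new = kempe_interchange(colouring, adj, 1, 1)
# Here the induced subgraph is connected (a "total" Kempe chain), so
# the interchange amounts to a colour relabelling: {1:1, 2:0, 3:1, 4:0}.
```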
To illustrate the potential effects of this latter scheme, Fig. 5.17 shows the levels of
diversity that exist in the HEA’s population for the first 3000 iterations of a run using
the timetabling graph car-s-91, which has a chromatic number of 28 (see Table 5.3).
When using the original HEA, the population has converged at around 500 iterations
and, as we saw in Table 5.3, the algorithm produces solutions using more than 29
colours on average. On the other hand, by applying a series of random Kempe chain
moves (2k moves per application of recombination in this case), population
diversity is maintained. In our tests, this modification enabled the algorithm to quickly
determine optimal 28-colourings in all runs.
Using Kempe chain interchanges in this way is not always beneficial, however. For
instance, similar tests to the above were also carried out using random and flat graphs.
When using a suitably low value for k in these cases, we found that the Kempe chain
interchange operator was usually unable to alter the underlying structures of offspring
solutions because its application nearly always resulted in colour relabellings (or,
in other words, the bipartite graphs induced by each pair of colour classes in these
solutions were nearly always connected, giving total Kempe chains).
Note that within this book’s suite of graph colouring algorithms, the HEA contains
run-time options for outputting the population diversity and for applying Kempe
chain interchanges in the manner described above. (Refer to the algorithm user guide
in Appendix A.1 for further information.)
5.8.2 Recombination
Since the proposal of the GPX recombination operator by Galinier and Hao [3],
further recombination operators based on this scheme have been suggested, differing
primarily in the criteria used for deciding which colour classes to copy from parent
to offspring. Porumbel et al. [19], for example, suggest that instead of choosing
the largest available colour class at each stage of the recombination process, colour
classes with the least number of clashes should be prioritised, with class size and
information regarding the degrees of the vertices then being used to break ties. Lü
and Hao [20], on the other hand, have proposed extending the GPX operator to
allow more than two parents to play a part in producing a single offspring. In their
operator, the offspring are constructed in the same manner as the GPX, except that
at each stage the largest colour class from multiple parents is chosen to be copied
into the offspring. The intention behind this increased choice is that larger colour
classes will be identified, resulting in fewer uncoloured vertices once the k colour
classes have been constructed. To prohibit too many colours from being inherited
from one particular parent, the authors make use of a parameter q, specifying that if
the ith colour class in an offspring is copied from a particular parent, then this parent
should not be considered for a further q colours. Note, then, that GPX is simply an
application of this operator using two parents with q = 1.
Another method of recombination for the graph colouring problem involves
considering the individual assignments of vertices to colours as opposed to their par-
titions. Here, a natural way of representing a solution is to use a vector (c(v1 ), c(v2 ),
. . . , c(vn )), where c(vi ) gives the colour of vertex vi . However, it has long been
argued that this sort of approach has disadvantages, not least because it leads to a
solution space that is far larger than it needs to be (since any solution using k colours
can be represented in k! ways—refer to Sect. 2.2). Furthermore, authors such as
Falkenauer [21] and Coll et al. [22] have also argued that “traditional” recombina-
tion schemes such as 1-, 2-, and n-point crossover with this method of representation
tend to recklessly break up building blocks that we might want to be promoted in a
population.
In recognition of the perceived disadvantages of the assignment-based represen-
tation, Coll et al. [22] have proposed a procedure for relabelling the colours of
one of the parents before applying one of these “traditional” crossover operators.
Consider two (not necessarily feasible) parent solutions represented as partitions: S1 = {S1,1, . . ., S1,k} and S2 = {S2,1, . . ., S2,k}. Now, using S1 and S2, a complete bipartite graph Kk,k is formed. This bipartite graph has k vertices in each partition, and the weight between two vertices from different partitions is defined as Wi,j = |S1,i ∩ S2,j|. Given Kk,k, a maximum weighted matching can then be
Fig. 5.18 Example of the relabelling procedure proposed by Coll et al. [22]. Here, parent 2 is
relabelled using 1 → 3, 2 → 4, 3 → 1, 4 → 2, and 5 → 5
Fig. 5.19 Demonstration of the GGA recombination operator. Here, the colour classes in Parent 2
have first been labelled to maximally match those of Parent 1
determined using any suitable algorithm (such as the Hungarian algorithm [23] or
Auction algorithm [24]), and this matching can be used to relabel the colours in one
of the parents. Figure 5.18 gives an example of this procedure and shows how the
second parent can be altered so that its colour labellings maximally match those of
the first parent. In this example, we see that the colour classes {v1 , v10 }, {v3 , v5 },
and {v6 } occur in both parents and will be preserved in any offspring produced via a
traditional operator such as uniform crossover. However, this will not always be the
case and will depend very much on the best matching available in each case.
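For small k, the relabelling can be illustrated by computing the weights Wi,j and searching the k! candidate matchings exhaustively. In practice the Hungarian [23] or Auction [24] algorithm would be used instead; this brute-force version is for illustration only:

```python
from itertools import permutations

def relabel_second_parent(s1, s2):
    """Relabel the colour classes of s2 to maximally match those of s1,
    in the spirit of Coll et al. [22]. Weights W[i][j] = |s1[i] & s2[j]|;
    the maximum-weight matching is found by brute force over all
    permutations, which is only feasible for small k."""
    k = len(s1)
    W = [[len(s1[i] & s2[j]) for j in range(k)] for i in range(k)]
    best = max(permutations(range(k)),
               key=lambda perm: sum(W[i][perm[i]] for i in range(k)))
    return [set(s2[best[i]]) for i in range(k)]

s1 = [{1, 2}, {3, 4}, {5}]
s2 = [{5}, {1, 2}, {3, 4}]
relabel_second_parent(s1, s2)          # → [{1, 2}, {3, 4}, {5}]
```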
An interesting point regarding the structure of solutions and the resultant effects of
recombination has also been raised by Porumbel et al. [19]. Specifically, they propose that when solutions to graph colouring problems involve a small number of large
colour classes, good quality solutions will tend to occur through the identification of
large independent sets, perhaps suggesting that the GPX and its multi-parent variant
are naturally suited in these cases. On the other hand, if a solution involves many
small colour classes, quality seems to be determined more through the identification
of good combinations of independent sets.
To these ends, a further recombination operator for graph colouring is also pro-
posed by Lewis [25] which, unlike GPX, shows no bias towards offspring inheriting
larger colour classes, or towards offspring inheriting half of its colour classes from
each parent. An example of this operator is given in Fig. 5.19. Given two parents, the
colour classes in the second parent are first relabelled using the procedure of Coll et al.
from above. Using the partition-based representations of these solutions, a subset of
colour classes from the second parent is then selected randomly, and these replace
the corresponding colours in a copy of the first parent. Duplicate vertices are then
removed from colour classes originating from the first parent, and any uncoloured
vertices are assigned to random colour classes. Tests by Lewis [25] indicate that this
recombination operator can produce marginally better solutions than the GPX oper-
ator when colour classes are small (approximately five vertices per colour), though
worse results can occur in other cases.
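This operator might be sketched as follows. The 50% class-selection probability and other details here are illustrative assumptions; whatever subset is chosen, the result is always a valid partition of the vertices:

```python
import random

def exchange_crossover(p1, p2, rng):
    """Sketch of the operator of Lewis [25] (parent 2 is assumed to
    have been relabelled already): a random subset of p2's colour
    classes replaces the corresponding classes in a copy of p1;
    duplicates are then removed from the classes originating from p1,
    and uncoloured vertices are assigned to random classes."""
    k = len(p1)
    chosen = {i for i in range(k) if rng.random() < 0.5}
    offspring = [set(p2[i]) if i in chosen else set(p1[i]) for i in range(k)]
    copied = set().union(*(offspring[i] for i in chosen)) if chosen else set()
    for i in range(k):
        if i not in chosen:
            offspring[i] -= copied     # strip duplicated vertices
    uncoloured = set().union(*p1) - set().union(*offspring)
    for v in uncoloured:               # recolour the rest randomly
        offspring[rng.randrange(k)].add(v)
    return offspring

rng = random.Random(1)
p1 = [{1, 2}, {3, 4}, {5, 6}]
p2 = [{1, 3}, {2, 5}, {4, 6}]
child = exchange_crossover(p1, p2, rng)
# `child` is always a partition of the six vertices into three classes.
```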
Note that the recombination operators listed in this subsection are also included
as run-time options within this book’s suite of graph colouring algorithms (see
Appendix A.1).
Finally, from the analyses in this chapter, it is apparent that graph colouring algo-
rithms such as the HEA benefit greatly when used in conjunction with an appropriate
local search procedure. For algorithms operating in the space of complete improper
solutions, this is usually provided by the TabuCol algorithm. The tabu search meta-
heuristic seems very suitable for this purpose because, by extending the steepest
descent algorithm, it allows rapid improvements to be made to a solution.
To contrast this, consider the rates of improvement achieved by an analogous sim-
ulated annealing algorithm that uses the same neighbourhood operator as TabuCol
but which follows the pseudocode given in Fig. 4.12. For this algorithm, values need
to be determined for the initial temperature t, the cooling rate α, and the frequency
of temperature updates z. Figure 5.20 compares the run profile of TabuCol to this
simulated annealing algorithm on an example random graph. It can be seen that
[Plot: objective function value (y-axis, 0 to 1200) against constraint checks (x-axis, 0 to 3 × 10^10), comparing TabuCol, SA with cooling rate 0.99, and SA with cooling rate 0.999.]
Fig. 5.20 Example run profiles of TabuCol and an analogous SA algorithm using a random graph
G 1000,0.5 with k = 86 colours. Here the SA algorithm uses an initial value for t = 0.7, with
z = 500,000
TabuCol quickly reduces the objective function value, while the SA approach takes
much longer. In addition, the SA algorithm seems quite sensitive to adjustments in
its parameters, with inappropriate values potentially hindering performance. On the
other hand, it is well known that when the temperature is reduced more slowly, runs
of SA tend to produce better quality solutions [26]. Hence, with extended run times,
SA may have the potential to produce superior solutions to TabuCol in some cases.
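A skeleton of this SA scheme, with the parameters t, α, and z playing the roles described above, might look as follows. This is a generic sketch, not the exact pseudocode of Fig. 4.12:

```python
import math
import random

def simulated_annealing(initial, cost, neighbour, t, alpha, z,
                        max_iters, seed=0):
    """Generic SA skeleton: a worsening move of size delta is accepted
    with probability exp(-delta / t); the temperature t is multiplied
    by the cooling rate alpha after every z iterations."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best_cost = current_cost
    for it in range(1, max_iters + 1):
        candidate = neighbour(current, rng)
        delta = cost(candidate) - current_cost
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, current_cost + delta
            best_cost = min(best_cost, current_cost)
        if it % z == 0:
            t *= alpha                 # cool every z iterations
    return best_cost

# Toy usage: minimise |x| over the integers with a +/-1 neighbourhood.
best = simulated_annealing(
    initial=40,
    cost=abs,
    neighbour=lambda x, rng: x + rng.choice((-1, 1)),
    t=2.0, alpha=0.99, z=100, max_iters=20000)
```

The slower the cooling (α closer to 1, or larger z), the longer the algorithm spends accepting worsening moves, which matches the sensitivity to parameter settings noted above.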
This chapter has described six different high-performance algorithms for the graph
colouring problem. These algorithms have been compared and contrasted over a wide
range of problem instances including random, flat, planar, scale-free, and timetabling
graphs. Implementations of these methods can be found online (see Appendix A.1).
As with earlier chapters, this chapter’s comparison has been carried out using
a platform-independent measure of computational effort. In terms of CPU time,
Table 5.6 shows the relative run times of the algorithms using a small sample of
random graphs. Perhaps the most striking feature is that the HEA is one of the quickest to execute, a fact that further endorses the method. On the other
hand, the AntCol and the HC algorithms seem to require significantly more time,
apparently due to the computational overheads associated with their BuildSolution
and Kempe chain operators, respectively.
One of the intentions in this chapter has been to test the robustness of our six
case-study algorithms by executing them blindly on different problem instances. As
we have seen, this has involved using the same parameter values (or methods for
calculating them) across all trials. However, different settings may lead to better
results in some cases. A broader issue concerns how we might go about predicting
the performance of a particular algorithm on a previously unseen problem instance.
Accurate predictions are useful here because, given a particular graph, we could then
apply the most appropriate method from our available portfolio of algorithms.
Table 5.6 Time (in seconds) to complete runs of 5 × 10^11 constraint checks with random graphs G_n,0.5 using a 3.0 GHz Windows 7 PC with 3.87 GB RAM
n = 250 500 1000
TabuCol 1346 1622 1250
PartialCol 1435 1372 1356
HEA 1469 1400 1337
AntCol 4152 3840 4349
HC 5829 5473 5320
Bktr 6328 4794 3930
5.9 Chapter Summary and Further Reading 153
Research in this area has been carried out by Smith-Miles et al. [27]. In their work,
the authors consider 18 different graph metrics. These are then used to help predict
which graph colouring algorithm will perform best on any given problem instance.
Metrics considered by the authors include the following:
To achieve their aims, Smith-Miles et al. [27] executed this chapter’s algorithms on
a wide range of different problem instances. Machine learning methods were then
used to classify the types of graph that the different algorithms were seen to perform
well with. This information can then be used for predicting algorithm performance on
new problem instances. One observation from this work is that, of the six algorithms,
only three seem to consistently show regions of the instance space where they are
uniquely best, namely, HEA, HC, and AntCol. As we have seen, each of these
methods combines local search strategies with global operators.
Subsequent work in this area is also due to Neis and Lewis [28]. Here, the authors
again use the above algorithms but they also consider different values of the algo-
rithms’ control parameters (local search iteration limits, tabu tenures, population
sizes, and so on). Similar graph metrics to the list above are used and, after perform-
ing a multidimensional database analysis on their results, the authors propose that the
most useful metrics for predicting algorithm performance are the standard deviation
of betweenness centrality, together with the density and energy of the graph.
References
1. Avanthay C, Hertz A, Zufferey N (2003) A variable neighborhood search for graph coloring.
Eur J Oper Res 151:379–388
2. Dorne R, Hao J-K (1998) A new genetic local search algorithm for graph coloring. In: Eiben
A, Back T, Schoenauer M, Schwefel H (eds) Parallel problem solving from nature (PPSN) V.
LNCS, vol 1498. Springer, pp 745–754
3. Galinier P, Hao J-K (1999) Hybrid evolutionary algorithms for graph coloring. J Comb Optim
3:379–397
4. Thompson J, Dowsland K (2008) An improved ant colony optimisation heuristic for graph
colouring. Discret Appl Math 156:313–324
5. Galinier P, Hertz A (2006) A survey of local search algorithms for graph coloring. Comput
Oper Res 33:2547–2562
6. Blöchliger I, Zufferey N (2008) A graph coloring heuristic using partial solutions and a reactive
tabu scheme. Comput Oper Res 35:960–975
7. Dorigo M, Maniezzo V, Colorni A (1996) The ant system: optimisation by a colony of coop-
erating agents. IEEE Trans Syst Man Cybern 26(1):29–41
8. Rizzoli A, Montemanni R, Lucibello E, Gambardella L (2007) Ant colony optimization for
real-world vehicle routing problems. Swarm Intell 1(2):135–151
9. Lewis R (2009) A general-purpose hill-climbing method for order independent minimum
grouping problems: a case study in graph colouring and bin packing. Comput Oper Res
36(7):2295–2310
10. Korman S (1979) The graph-coloring problem. In: Combinatorial optimization. Wiley, New York, pp 211–235
11. Cheeseman P, Kanefsky B, Taylor W (1991) Where the really hard problems are. In: Proceed-
ings of IJCAI-91, pp 331–337
12. Turner J (1988) Almost all k-colorable graphs are easy to color. J Algorithms 9:63–82
13. Robertson N, Sanders D, Seymour P, Thomas R (1997) The four color theorem. J Comb Theory
Ser B 70:2–44
14. Barabási A, Bonabeau E (2003) Scale-free networks. Scientific American, May 2003
15. Barabási A (2016) Network science. Cambridge University Press
16. Carter M, Laporte G, Lee SY (1996) Examination timetabling: algorithmic strategies and
applications. J Oper Res Soc 47:373–383
17. Ross P, Hart E, Corne D (2003) Genetic algorithms and timetabling. In: Ghosh A, Tsutsui
K (eds) Advances in evolutionary computing: theory and applications. Natural computing.
Springer, pp 755–771
18. Moody J, White D (2003) Structural cohesion and embeddedness: a hierarchical concept of
social groups. Am Sociol Rev 68(1):103–127
19. Porumbel D, Hao J-K, Kuntz P (2010) An evolutionary approach with diversity guarantee and
well-informed grouping recombination for graph coloring. Comput Oper Res 37:1822–1832
20. Lü Z, Hao J-K (2010) A memetic algorithm for graph coloring. Eur J Oper Res 203(1):241–250
21. Falkenauer E (1998) Genetic algorithms and grouping problems. Wiley
22. Coll E, Duran G, Moscato P (1995) A discussion on some design principles for efficient
crossover operators for graph coloring problems. Anais do XXVII Simposio Brasileiro de
Pesquisa Operacional, Vitoria-Brazil
23. Munkres J (1957) Algorithms for the assignment and transportation problems. J Soc Ind Appl
Math 5(1):32–38
24. Bertsekas D (1992) Auction algorithms for network flow problems: a tutorial introduction.
Comput Optim Appl 1:7–66
25. Lewis R (2015) Graph coloring and recombination. In: Springer handbook of computational intelligence. Springer, pp 1239–1254
26. van Laarhoven P, Aarts E (1987) Simulated annealing: theory and applications. Kluwer Aca-
demic Publishers
27. Smith-Miles K, Baatar D, Wreford B, Lewis R (2014) Towards objective measures of algorithm
performance across instance space. Comput Oper Res 45:12–24
28. Neis P, Lewis R (2020) Evaluating the influence of parameter setup on the performance of
heuristics for the graph colouring problem. Int J Metaheurist 7(4):352–378
6 Applications and Extensions
We are now at a point in this book where we have seen several different algorithms
for the graph colouring problem and have noted many of their relative strengths
and weaknesses. This chapter now presents a range of problems, both theoretical and practical, to which such algorithms might be applied. These include face
colouring, edge colouring, precolouring, constructing Latin squares, solving Sudoku
puzzles, and testing for short circuits in circuit boards. Note that these problems
are either equivalent to, or represent special cases of, the general graph colouring
problem.
This chapter also considers variants of the graph colouring problem where not all
of the graph is visible to an algorithm, or where the graph’s structure is subject to
change over time. Such problems can arise when setting up wireless networks and also
in some timetabling applications. We then go on to consider problems that extend
and therefore generalise the graph colouring problem, specifically list colouring,
equitable colouring, weighted graph colouring, and chromatic polynomials. Detailed
real-world applications of graph colouring are also the subject of Chaps. 7, 8, and 9.
Note that, in contrast to the rest of this book, the first two sections of this chapter
are concerned with colouring the faces of graphs and the edges of graphs. As we will
see, these two problems can be converted into equivalent formulations of the vertex
colouring problem using the concepts of dual graphs and line graphs, respectively.
However, it is often useful for face and edge colouring problems to be considered as
separate problems; hence, we will often use the term “vertex colouring” instead of
“graph colouring” to avoid any ambiguities.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 155
R. M. R. Lewis, Guide to Graph Colouring, Texts in Computer Science,
https://doi.org/10.1007/978-3-030-81054-2_6
In the face colouring problem, we want to colour the spaces between vertices and
edges, as opposed to the vertices themselves. Face colouring is specifically concerned
with planar graphs which, as we saw in Chap. 1, are graphs that can be drawn on
a plane so that no edges cross one another. When drawn in this way, planar graphs
can be divided into faces, including one unbounded face that surrounds the graph.
Figure 6.1, for example, shows a planar graph comprising ten faces: nine bounded
faces and one unbounded face (numbered 10 in the figure). The boundary of a face
is the set of edges that surrounds it. When a face is bounded, its boundary forms a
cycle.
It is evident by inspecting Fig. 6.1 that the number of faces seems to be related
to the number of vertices and edges of the graph. In fact, this relationship can be
stated explicitly due to an elegant theorem that was first noted by Leonhard Euler
(1707–1783):
Proof The proof is via induction on the number of faces f . If f = 1, then the graph
contains no cycles and must therefore be a tree. Since the number of edges in a tree
m = n − 1, the theorem holds because n − (n − 1) + 1 = 2.
Now assume f ≥ 2, meaning that G must contain at least one cycle. Let {u, v} be an edge in one of these cycles. Since this cycle divides two faces, say F1 and F2, removing {u, v} from G to form a subgraph G′ will have the effect of joining F1 and F2, with all other faces remaining unchanged. Hence, G′ has f − 1 faces.
Let n′, m′, and f′ be the number of vertices, edges, and faces in G′. Thus, n′ = n, m′ = m − 1, and f′ = f − 1, so that by the induction hypothesis n − m + f = n′ − m′ + f′ = 2.
We see that Euler’s characteristic does indeed hold for the example graph in
Fig. 6.1 since n − m + f = 15 − 23 + 10 = 2 as expected.
When considering the face colouring problem it is necessary to restrict ourselves
to planar graphs that contain no bridges. A bridge is defined as an edge in a graph
G whose removal increases the number of components. When a graph contains a
bridge {u, v}, the unbounded face will surround the graph, but will also feature {u, v}
on its boundary twice, making it impossible to colour feasibly. Hence, planar graphs
containing bridges are not considered further in this section.
Let us now consider the maximum number of edges that a graph can feature
while retaining the property of planarity. Consider a connected planar graph G with
n vertices, m edges, and f faces. Also write f_i for the number of faces in G that
contain exactly i edges in their boundaries. Clearly ∑_i f_i = f and, assuming that
G does not contain a bridge,

∑_i i f_i = 2m    (6.1)
since every edge is on the boundary of exactly two faces. We can use this relationship
in conjunction with Euler’s characteristic to give an upper bound on the number of
edges in a planar graph. This result also involves knowledge of the girth of a graph,
defined as follows.
Definition 6.1 The girth of a graph G is the length of the shortest cycle in G.
If G is acyclic (i.e., contains no cycles), then its girth equals infinity.
Theorem 6.2 Let G be a connected planar graph with n ≥ 3 vertices, m edges, and finite girth g. Then m ≤ (g/(g − 2))(n − 2). In particular, since g ≥ 3, every such graph satisfies m ≤ 3n − 6.
Theorem 6.2 can sometimes be used to decide whether a graph is planar or not.
For example, the complete graph with five vertices K 5 cannot possibly be planar
because it has n = 5 vertices and m = 10 edges, meaning m ≤ 3n − 6 is not
satisfied. As another example, the complete bipartite graph with six vertices K3,3 =
(V1, V2, E), where E = {{u, v} : u ∈ V1, v ∈ V2} and |V1| = |V2| = 3, is also not
planar since it has m = 9 edges, n = 6 vertices, and a girth of four, meaning that
m = 9 > (4/(4 − 2))(6 − 2) = 8. Less obvious, but profoundly more useful, however, is
the amazing fact that a graph is planar if and only if it does not contain a subgraph
that is a subdivision of either K 5 or K 3,3 . This result, due to Kuratowski [1], has
been used alongside similar results to help construct several efficient (polynomial
time) algorithms for determining whether a graph is planar or not, including the Path
Addition method of Hopcroft and Tarjan [2] and the more recent Edge Addition
method of Boyer and Myrvold [3].
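These edge-count conditions are easy to check computationally. Below is a minimal Python sketch (the function name is my own) that tests the necessary condition of Theorem 6.2; note that a graph passing the test is not necessarily planar.

```python
def violates_planarity_bound(n, m, girth=3):
    """True if no planar graph can have n vertices, m edges, and this girth.

    Uses the bound m <= (g / (g - 2)) * (n - 2) derived from Euler's
    characteristic (equal to m <= 3n - 6 when g = 3). A False result
    does NOT imply planarity.
    """
    if n < 3:
        return False
    if girth == float("inf"):        # acyclic graphs are always planar
        return False
    return m > (girth / (girth - 2)) * (n - 2)

# K5: 10 > 3 * (5 - 2) = 9, so K5 cannot be planar
print(violates_planarity_bound(5, 10))          # True
# K3,3: girth 4, so 9 > (4 / 2) * (6 - 2) = 8, and K3,3 cannot be planar
print(violates_planarity_bound(6, 9, girth=4))  # True
```

Both of the examples discussed above therefore fail the bound, confirming their non-planarity without any drawing arguments.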
6.1.1 Dual Graphs, Colouring Maps, and the Four Colour Theorem
The close relationship between the problems of vertex colouring and face colouring
becomes apparent when we consider the concept of dual graphs. Given a planar
graph G, the dual of G, denoted by G ∗ , is constructed according to the following
steps. First, draw a single vertex vi∗ inside each face Fi of G. Second, for each edge
e in G, draw a line e∗ that crosses e but no other edge in G, and that links the two
vertices in G ∗ corresponding to the two faces in G that e is separating.
This procedure is demonstrated in Fig. 6.2. Here, the vertices in G are shown in
grey, and the vertices in G ∗ are shown in black. G has six faces in total: five bounded
faces and one unbounded face. The unbounded face is represented by the top vertex
of G ∗ in the example and is made adjacent to all vertices in G ∗ whose corresponding
faces in G have an edge on the exterior of the graph. Note that G∗ may also have
multiple edges between a pair of vertices, as occurs on the right-hand side of the
example graph.
It is clear from the figure that the process of forming duals is reversible, that is,
we can use the same process to form G from G ∗ . It is also clear that because G is
planar, its dual G ∗ must also be planar. We can now state relationships between the
number of vertices, faces, and edges in G and G ∗ such as the following.
Theorem 6.3 Let G be a connected planar graph with n vertices, m edges, and f faces, and let its dual G∗ have n∗ vertices, m∗ edges, and f∗ faces. Then n∗ = f, m∗ = m, and f∗ = n.
Proof It is clear that n ∗ = f due to the method by which duals are constructed.
Similarly, m ∗ = m because all edges in G ∗ intersect exactly one edge each in G (and
vice versa). The third relation follows by substituting the previous two relationships
into Euler’s characteristic applied to both G and G ∗ .
Recall from Chap. 1 that the four colour theorem (or “conjecture” as it was at the
time) was originally stated in 1852 by Francis Guthrie, who hypothesised that four
colours are sufficient for colouring the faces of any map such that neighbouring faces
have different colours. In the context of graph theory, a map can be represented by
a bridge-free planar graph G, with the faces of G representing the various regions
of the map, edges representing borders between regions, and vertices representing
points where the borders intersect. An illustration using a map of Australia is given
in Fig. 6.3.
The following theorem now reveals the close relationship between the vertex
colouring and face colouring problems.
Theorem 6.4 Let G be a connected planar graph without loops, and let G ∗
be its dual. Then the vertices of G are k-colourable if and only if the faces of
G ∗ are k-colourable.
Proof Since G is connected, planar, and without loops, its dual G ∗ is a planar graph
with no bridges. If we have a k-colouring of the vertices of G, then each face of G∗
can be assigned the same colour as its corresponding vertex in G. Because
no adjacent vertices in G have the same colour, it follows that no adjacent faces in
G ∗ have the same colour. Thus, the faces of G ∗ are k-colourable.
Now suppose that we have a k-colouring of the faces of G ∗ . Since every vertex
of G is contained in a face of G ∗ , each vertex in G can assume the colour of its
corresponding face in G ∗ . Again, since no adjacent faces in G ∗ are allocated the
same colour, this implies no adjacent vertices in G are given the same colour.
This result is important because it tells us that the faces of any map (represented as
a planar graph G ∗ with no bridges) can be k-coloured by simply determining a vertex
k-colouring of its dual graph G. The result also tells us that we can take any theorem
concerning the vertex colouring of a planar graph and then state a corresponding
theorem on the face colouring of its dual, and vice versa.
One elegant theorem that arises from this relationship demonstrates a link between
Eulerian graphs and graphs that are bipartite.
Theorem 6.5 The faces of a bridgeless planar graph G are two-colourable
if and only if G is Eulerian.
Proof Recall that a graph’s vertices are two-colourable if and only if it is bipartite.
Hence, we need to show that the dual of any planar Eulerian graph is bipartite, and
vice versa.
Let G be an Eulerian planar graph. By definition, all vertices in G are even
in degree. Since the degree of a vertex in G corresponds to the number of edges
surrounding a face in the dual G ∗ , the edges surrounding each face in G ∗ constitute
cycles of even length. Hence, according to Theorem 3.10, G ∗ is bipartite.
Conversely, let G ∗ be bipartite. This means G ∗ contains no odd cycles and, since
G is planar, all faces are surrounded by an even number of edges. Hence, all vertex
degrees in G are even, making G Eulerian.
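By Theorem 6.4, two-colouring the faces of an Eulerian map amounts to two-colouring the vertices of its dual, which is simply a bipartiteness test. The following Python sketch (an illustrative implementation, with adjacency lists assumed as input) returns such a two-colouring when one exists.

```python
from collections import deque

def two_colour(adj):
    """Return a proper 2-colouring of the graph (dict vertex -> 0/1),
    or None if the graph is not bipartite."""
    colour = {}
    for start in adj:
        if start in colour:
            continue
        colour[start] = 0                   # start a new component
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]
                    queue.append(v)
                elif colour[v] == colour[u]:
                    return None             # odd cycle found
    return colour

# an even cycle is two-colourable...
print(two_colour({0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}) is not None)  # True
# ...whereas an odd cycle is not
print(two_colour({0: [1, 2], 1: [0, 2], 2: [0, 1]}))                         # None
```

Applied to the dual of a tiling pattern such as those discussed below, a non-None result gives the assignment of the two tile colours directly.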
Practical examples of Theorem 6.5 arise in the tiling industry where we are often
interested in laying tiles of two different colours such that adjacent tiles do not have
the same colour. Example tiling patterns are shown in Fig. 6.4a, b. Close examination
of these patterns reveals that the underlying graphs are Eulerian as expected. Two-
colourings also arise when a picture is drawn using a single line that is joined at
either end, such as with the geometric drawing device “Spirograph”. Figure 6.4c
shows an example of this. We see that each time the line crosses itself, the degree of
the “vertex” existing at this intersection increases by two; hence, the vertex degrees
will always be even.
6.1 Face Colouring 161
Fig. 6.4 (a), (b) Example two-colourable tiling patterns; (c) a two-colourable pattern formed from a single closed line
Theorem 6.6 The vertices of any loop-free planar graph are six-colourable.
1 Recall that loops (i.e., edges of the form {v, v}) are disallowed in the vertex colouring problem.
With some additional reasoning we can improve this result to get the following.
Theorem 6.7 (Heawood [4]) The vertices of any loop-free planar graph are
five-colourable.
Proof Suppose, for a contradiction, that this statement is false, and let G
be a planar graph with chromatic number χ(G) = 6 and a minimal number of
vertices n. By Theorem 6.2, G must have a vertex v with deg(v) ≤ 5.
Now let G′ = G − {v}. We know that G′ can be five-coloured using, say, colours
labelled 1–5. Each of these colours must also be used to colour at least one neighbour
of v (otherwise G would also be five-colourable). We can now assume that v has
five neighbours, say u1, u2, . . . , u5, arranged in a clockwise fashion around v, with
colours c(ui) = i.
Now denote by G′(i, j) the subgraph of G′ spanned by vertices with colours i and
j. Suppose that u1 and u3 belong to separate components of G′(1, 3). Interchanging
the colours 1 and 3 in the component of G′(1, 3) containing u1 will give us another
five-colouring of G′ in which no neighbour of v has colour 1; hence, v can be assigned
colour 1, contradicting the minimality of G. Consequently, u1 and u3 must be joined
by a path in G′(1, 3) alternating between colours 1 and 3. By the same reasoning, u2
and u4 must be joined by a path in G′(2, 4) alternating between colours 2 and 4. But
since G is planar, these two paths would have to cross, which is impossible. This
contradiction completes the proof.
In the proof of Theorem 6.7, the notation G′(i, j) denotes the subgraph induced by
taking the vertices coloured with colours i and j in G′. Individual components of
G′(i, j) are Kempe chains (see Definition 4.3), which are named after the mathemati-
cian Alfred Kempe (1849–1922), who used them in a famous but incorrect proof of
the four colour theorem in 1879.
As we saw in Chap. 1, the conjecture that all maps can be coloured using at most
four colours was first pointed out in 1852 by Francis Guthrie (1831–1899) who, at the
time, was a student at University College London. Guthrie passed these observations
on to his brother Frederick who, in turn, passed them on to his mathematics tutor
Augustus De Morgan (1807–1871). De Morgan was not able to provide a conclusive
proof for this conjecture, but the problem, being both easy to state and tantalisingly
difficult to solve, captured the interest of many notable mathematicians of the era,
including William Hamilton (1805–1865), Arthur Cayley (1821–1895), and Charles
Peirce (1839–1914).
Indeed, over time the four colour conjecture was to become one of the most famous
unsolved problems in all of mathematics.
In 1879, a student of Arthur Cayley, Alfred Kempe, announced in Nature magazine
that he had proved the four colour theorem, publishing his result in the American
Journal of Mathematics [5]. In his arguments, Kempe made use of his eponymous
Kempe chains in the following way. Suppose we have a map in which all faces except
one are coloured using colours 1, 2, 3, or 4. If the uncoloured face, which we shall
call F, is not surrounded by faces featuring all four colours, then obviously we can
colour F using the missing colour. Therefore, suppose now that F is surrounded by
faces F1 , F2 , F3 , and F4 (in that order), which are coloured using colours 1, 2, 3,
and 4, respectively. There are now two cases to consider:
Case 1: There exists no chain of adjacent faces from F1 to F3 that are alternately
coloured with colours 1 and 3.
Case 2: There is a chain of adjacent faces from F1 to F3 that are alternately coloured
with colours 1 and 3.
If Case 1 holds then F1 can be switched to colour 3, and any remaining faces in the
chain can also have their colours interchanged. This operation retains the feasibility
of the solution (no adjacent faces will have the same colour) and also means that
no face adjacent to F will have colour 1. Consequently, F can be assigned to this
colour.
If Case 2 holds then there cannot exist a chain of faces from F2 to F4 using only
colours 2 and 4. This is because, for such a chain to exist, it would need to cross the
chain from F1 to F3 , which is impossible on a map. Thus, Case 1 holds for F2 and
F4 , allowing us to switch colours as with Case 1.
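Kempe's interchange operation is easy to express as code. The sketch below (Python; the function names are my own) works on vertex colourings, which by Theorem 6.4 covers the face-colouring setting via the dual graph: find the chain containing a given vertex using only two colours, then swap those colours along it.

```python
from collections import deque

def kempe_chain(adj, colour, v, a, b):
    """Vertices reachable from v through vertices coloured a or b only."""
    seen, queue = {v}, deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen and colour[w] in (a, b):
                seen.add(w)
                queue.append(w)
    return seen

def kempe_interchange(adj, colour, v, a, b):
    """Swap colours a and b on the Kempe chain containing v (in place).
    A proper colouring remains proper after the swap."""
    for u in kempe_chain(adj, colour, v, a, b):
        colour[u] = b if colour[u] == a else a

# a square with one diagonal, properly coloured with colours 1-3
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
colour = {0: 1, 1: 2, 2: 3, 3: 2}
kempe_interchange(adj, colour, 0, 1, 3)   # swap the 1-3 chain through vertex 0
print(colour)                             # {0: 3, 1: 2, 2: 1, 3: 2}
```

The swap leaves every edge properly coloured, which is exactly the property Kempe's Case 1 argument relies on.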
The arguments of Kempe were widely accepted among mathematicians of the day.
He was later elected a Fellow of the Royal Society and also went on to be knighted in
1912. The four colour conjecture was now considered to be the four colour theorem.
This all changed 11 years later when, in 1890, English mathematician Percy
Heawood (1861–1955) shocked the mathematics fraternity by publishing an example
map that exposed a flaw in Kempe’s arguments [4]. Though he failed to supply his
own proof, Heawood had shown that the four colour theorem was indeed still a
conjecture. In the same publication, Heawood did show, however, that arguments
analogous to Kempe’s could be used to prove that all maps are five-colourable, as we
saw in Theorem 6.7. In later work, Heawood also proved that if the number of edges
around each region of a map is divisible by 3 then the map can be four-coloured.
As the decades passed, the problem that had first been pointed out by Guthrie in
1852 remained unproven. Some piecemeal progress towards a solution was made
with one proof showing that four colours were sufficient for colouring maps of up to
27 faces. This was followed by proofs for up to 31 faces, and then 35 faces. However,
it would turn out that methods used by Kempe and his contemporaries in early papers
would ultimately pave the way.
To start, the focus of research turned towards proofs concerning the vertices of
loop-free planar graphs (i.e., the dual graphs of maps). In the first half of the twentieth
century, researchers also concentrated their efforts on reducing these graphs to special
cases that could be identified and classified. The idea was to produce a minimal set
of configurations that could each be tested. Initially, this set was thought to contain
nearly 9000 members, which was considered far too large for mathematicians to
study individually. This compelled some to turn towards using computers to design
specialised algorithms for testing them.
Ultimately, the first conclusive proof of the four colour theorem was produced
in 1976 by mathematicians Kenneth Appel (1932–2013) and Wolfgang Haken (b.
1928). Their proof is based on the idea that if the four colour conjecture were false,
then there would exist at least one planar graph G with the smallest possible number
of vertices such that χ (G) = 5. They then showed that G cannot exist. To do this,
they used the notions of unavoidable sets and reducibility.
1. An unavoidable set is a set of configurations such that any planar graph has at
least one member of this set as a subgraph.
2. A reducible configuration is a planar graph that cannot occur in a minimal coun-
terexample G. If a planar graph contains a reducible configuration, then it can
be reduced to a smaller planar graph. This smaller graph also has the condition
that if it can be four-coloured, then so can the original. Also, if the original graph
cannot be four-coloured then neither can the smaller graph, so the original graph
is not minimal.
Appel and Haken’s proof involved constructing an unavoidable set and therefore
proving that G cannot exist. The number of members in this set was found to be
1936, which were then checked one by one by hand and by computer [6–9]. As was
later stated in Appel’s obituary in The Economist on 4 May 2013:
Both he and Dr. Haken hugely exceeded their time allocation on the computer, which belonged
to the university administration department. …Their proof depended on both hand-checking
by family members and then brute-force computer power; the result was published in over
140 pages in the Illinois Journal of Mathematics and 400 pages of further diagrams on
microfiche. They also, in the old fashioned way, chalked the message on a blackboard in the
mathematics department: four colours suffice.
At the time, this work was controversial, with some mathematicians questioning
the legitimacy of a proof in which much of the work had been carried out by computer.
(How might we guarantee the reliability of the algorithms and hardware?) However,
despite these concerns, independent verification soon convinced the community that
the four colour theorem had indeed finally been proved. Hence, we are now able to
state:
Theorem 6.8 (The Four Colour Theorem) The vertices of any loop-free pla-
nar graph are four-colourable. Equivalently, the faces of any map are four-
colourable.
In more recent years, Robertson et al. [10] have proposed an algorithm for four-
colouring planar graphs that operates in O(n 2 ) time. They have also shown how to
construct an unavoidable set containing just 633 reducible configurations. However,
a proof along more “traditional” lines remains elusive and, to this day, the four
colour theorem remains an excellent example, along with Fermat’s last theorem, of
a problem that is very easy to state, but exceptionally difficult to solve.
Readers interested in finding out more about the fascinating history of the four
colour theorem are invited to consult the very accessible book Four Colors Suffice:
How the Map Problem Was Solved by Wilson [11].
6.2 Edge Colouring
Another way in which graphs can be coloured is to assign colours to their edges, as
opposed to their vertices or faces. This gives rise to the edge colouring problem where
we seek to colour the edges of a graph so that no pair of edges sharing an endpoint
(i.e., incident edges) have the same colour, and so that the number of colours used
is minimal.
The edge colouring problem has applications in scheduling round-robin tour-
naments and also transferring files in computer networks [12,13]. The minimum
number of colours needed to edge colour a graph G is called the chromatic index,
denoted by χ′(G). This should not be confused with the chromatic number χ(G),
which is the minimum number of colours needed to colour the vertices of a graph G.
As mentioned earlier, the edge colouring and vertex colouring problems are closely
related because we can colour the edges of a graph by simply colouring the vertices
of its corresponding line graph.
An example conversion between a graph G and its line graph L(G) is shown in
Fig. 6.7a. From this process, it is natural that the number of vertices and edges in
L(G) is related to the number of vertices and edges in G.
Theorem 6.9 Let G = (V, E) be a graph with n vertices and m edges. Then
its line graph L(G) has m vertices and

(1/2) ∑_{v∈V} deg(v)² − m

edges.
6.2 Edge Colouring 167
Fig. 6.7 Illustration of (a) how to convert a graph G into its line graph L(G), and (b) how a vertex
k-colouring of L(G) corresponds to an edge k-colouring of G
Proof Since each edge in G corresponds to a vertex in L(G) it is obvious that L(G)
has m vertices. Now let {u, v} be an edge in G. This means that {u, v} is a vertex in
L(G) with degree deg(u) + deg(v) − 2. Hence, the total number of edges in L(G) is

(1/2) ∑_{{u,v}∈E} (deg(u) + deg(v) − 2) = (1/2) ∑_{{u,v}∈E} (deg(u) + deg(v)) − m.
Note that the degree of each vertex v appears exactly deg(v) times in this sum. Hence,
we can simplify the expression to that stated in Theorem 6.9 as required.
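Theorem 6.9 can be checked directly. The following Python sketch (with an illustrative helper name) builds the edges of L(G) from an edge list of G and compares the count against the formula, using G = K4 as an example.

```python
from itertools import combinations

def line_graph_edges(edges):
    """Edges of L(G): one per pair of G-edges that share an endpoint."""
    E = [frozenset(e) for e in edges]
    return [(e1, e2) for e1, e2 in combinations(E, 2) if e1 & e2]

# G = K4: n = 4, m = 6, and deg(v) = 3 for every vertex
edges = list(combinations(range(4), 2))
m = len(edges)
deg = {v: 3 for v in range(4)}

predicted = sum(d * d for d in deg.values()) // 2 - m   # Theorem 6.9
print(len(line_graph_edges(edges)), predicted)          # 12 12
```

Here (1/2)(4 · 3²) − 6 = 18 − 6 = 12, matching the number of incident edge pairs counted directly.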
Figure 6.7b also demonstrates how a vertex k-colouring of the line graph L(G)
corresponds to an edge colouring of G. Consequently, rather like the way in which
a face colouring problem can be tackled by colouring the vertices of a graph’s dual,
any edge colouring problem stated on a graph G can be tackled by colouring the
vertices of its line graph L(G).
We now discuss some important results concerning the chromatic index of a graph.
Theorem 6.10 Let Kn be the complete graph with n > 1 vertices. Then its
chromatic index χ′(Kn) = n − 1 if n is even; otherwise χ′(Kn) = n.
Proof When n is odd, the edges of K n can be coloured using n colours by the
following process. First, draw the vertices of K n in the form of a regular n-sided
Fig. 6.8 Illustrating how optimal edge colourings can be constructed for the complete graphs (a)
K5 and (b) K6 using the circle method
polygon. Next, select an arbitrary edge on the boundary of this polygon and colour
it, together with all edges parallel to it, using colour 1. Now moving in a clockwise
direction, select the next edge on the boundary and colour it, together with its parallel
edges, with colour 2. Continue this process until all edges have been coloured.
It is easy to demonstrate that the edges of Kn are not (n − 1)-colourable by the fact
that the largest number of edges that can be assigned the same colour is (n − 1)/2;
it then follows, because the number of edges in Kn is n(n − 1)/2, that n colours are
required.
When n is even, a similar process can be followed, where a regular (n − 1)-sided
polygon is constructed, with the remaining vertex being placed in the centre. The
same method for the (n − 1) case is then followed, with edges perpendicular to the
edges currently being coloured also being assigned to the same colour. As in the
previous case, it is easily shown that no feasible edge colouring of K n exists using
fewer than n colours.
The method used in the proof of Theorem 6.10 is often referred to as the circle
(or polygon) method and was originally proposed by mathematician and Church of
England Minister Thomas Kirkman (1806–1895) [14]. An important practical use
of this method is for constructing round-robin sports leagues, where we have a set
of n teams that are required to play each other once across a sequence of rounds.
Figure 6.8 provides examples of this method for n = 5 and n = 6. Here, the vertices
can be thought of as “teams”, with edges representing “matches” between these
teams. Each colour then represents a round in the schedule. Considering Fig. 6.8a,
where n = 5, the first round involves matches between team-v2 and team-v5 and
between team-v3 and team-v4 , with team-v1 receiving a bye. The next round then
involves matches between team-v1 and team-v3 and between team-v4 and team-v5 ,
with team-v2 receiving a bye, and so on. The pattern is similar when n is even,
as shown in Fig. 6.8b, except that no team receives a bye. Applications of graph
colouring to sports scheduling problems are considered in more detail in Chap. 8.
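The circle method's rotation scheme can be sketched in a few lines of Python (illustrative only; the dummy-team trick handles odd n by awarding byes, as in Fig. 6.8a).

```python
def round_robin(n):
    """Circle-method schedule: a list of rounds, each a list of (i, j) matches.
    For odd n a dummy team is added, and its opponent receives a bye."""
    teams = list(range(n))
    if n % 2 == 1:
        teams.append(None)               # dummy team = bye
    half, rounds = len(teams) // 2, []
    for _ in range(len(teams) - 1):
        pairs = [(teams[i], teams[-1 - i]) for i in range(half)]
        rounds.append([(a, b) for a, b in pairs if a is not None and b is not None])
        teams.insert(1, teams.pop())     # rotate all teams except the first
    return rounds

schedule = round_robin(6)
print(len(schedule))                     # 5 rounds for 6 teams
```

For n = 6 this yields five rounds of three matches covering all fifteen pairings exactly once; for n = 5 it yields five rounds of two matches, each round giving one team a bye.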
6.2 Edge Colouring 169
A further result, due to König [15], concerns the chromatic index of bipartite
graphs.
Theorem 6.11 (König [15]) If G is a bipartite graph, then its chromatic index χ′(G) = Δ(G).
The previous two theorems demonstrate that the edge colouring problem is solvable
in polynomial time for both complete and bipartite graphs. We have also seen
that, for both topologies, the chromatic index χ′(G) is either Δ(G) or Δ(G) + 1.
Somewhat surprisingly, it turns out that this feature applies to any graph G, as proved
by Vizing [16].
Theorem 6.12 (Vizing [16]) For any graph G, Δ(G) ≤ χ′(G) ≤ Δ(G) + 1.
Proof When Δ(G) edges are incident to a vertex, these edges all require a different
colour. Hence, the lower bound Δ(G) ≤ χ′(G) holds.
The upper bound can be proved via induction on the number of edges. Suppose
that, using Δ(G) + 1 colours, we have coloured all edges in G except for the single
edge {u, v0 }. Since Δ(G) gives the maximal degree, at least one colour will be unused
at each of these two vertices. Now construct a series of edges, {u, v0 }, {u, v1 }, . . .,
If neither case above holds, then we consider the subgraph of grey and black edges.
The components of this subgraph will be paths and/or cycles. The vertices u, vi ,
and vk are the terminal vertices of paths; hence, they cannot all belong to the same
component. In this case, select a component containing just one of these vertices and
interchange the colours of its edges. This means that one of the cases above now
applies.
In essence, Vizing's theorem tells us that the set of all graphs can be partitioned
into two classes: “class one” graphs, for which χ′(G) = Δ(G), and “class two”
graphs, where χ′(G) = Δ(G) + 1. Holyer [17] has shown that the decision problem
of determining whether a graph belongs to class one is N P-complete. On the
other hand, several polynomially bounded algorithms are available for colouring the
edges of any graph using exactly Δ(G) + 1 colours, such as the O(nm) algorithm of
Misra and Gries [18]. The existence of such algorithms tells us that we can colour
the edges of any graph using a maximum of one extra colour beyond its chromatic
index.
We might now ask whether the existence of such tight bounds for the edge colour-
ing problem helps us to garner further information about the vertex colouring prob-
lem. It is clear that if we were given the task of vertex colouring a line graph L(G),
one approach would be to convert L(G) into its “original” graph G, and then try to
solve the corresponding edge colouring problem on G. Since χ(L(G)) = χ′(G),
then according to Vizing's theorem this would immediately tell us that we need to
use either Δ(G) or Δ(G) + 1 colours to feasibly colour the vertices of L(G). Indeed,
if G were a class two graph, then algorithms such as Misra and Gries's could be
used to quickly find the optimal edge colouring for G and therefore the optimal
vertex colouring for L(G). However, it should be remembered that this very attractive-sounding
proposal is only applicable when we wish to colour the vertices of
a line graph that therefore has an “original” graph into which it can be converted.
Unfortunately, we cannot convert all graphs into an “original” graph in this way.
6.3 Precolouring
In the precolouring problem, we are given a graph G in which some subset of the
vertices V′ ⊆ V has already been assigned colours. Our task is then to colour the
remaining vertices in the set V − V′ so that the resultant solution is feasible and uses
a minimal number of colours.
Applications of precolouring arise in register allocation problems (see Sect. 1.1.4)
where certain variables must be assigned to specific registers, perhaps due to calling
conventions or communication between modules. They also occur in areas such as
timetabling and sports scheduling where we might be given a problem instance in
which some of the events have already been assigned to particular timeslots.
Precolouring problems can easily be converted into a standard graph colouring
problem using graph contraction operations.
The following steps can now be taken. Given a precolouring problem instance
defined on a graph G, let V(i) denote the set of vertices precoloured with colour i.
Assuming there are k different colours used in the precolouring, this means that
⋃_{i=1}^{k} V(i) = V′ and V(i) ∩ V(j) = ∅, for 1 ≤ i ≠ j ≤ k. Now, for each set
V(i), merge all vertices into a single vertex using a series of contraction operations.
This has the effect of reducing the number of precoloured vertices to k. Next, add
edges between each pair of the k contracted vertices to form a clique. Finally, remove
all colours from the vertices of this graph, and apply any arbitrary graph colouring
algorithm to produce a feasible solution. A colouring of the original can then be
obtained by simply reversing the above process. An example is provided in Fig. 6.9.
Fig. 6.9 Part a shows an example precolouring problem. Part b then shows how this can be converted
into a new graph by contracting the precoloured vertices and forming a k-clique. A feasible colouring
of this graph (shown in c) can then be converted back into a solution to the original problem, as
shown in (d)
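These contraction steps can be sketched as follows (Python; the relabelling scheme and function name are my own choices): each precoloured colour class is merged into a single vertex 0, . . . , k − 1, clique edges are added between these merged vertices, and the unprecoloured vertices are relabelled after them.

```python
def contract_precolouring(n, edges, precolour):
    """Convert a precolouring instance into a plain colouring instance.

    precolour maps some vertices to colours. Returns (new_n, new_edges),
    where each colour class is contracted to one vertex, the k contracted
    vertices form a clique, and the remaining vertices keep their order.
    """
    colours = sorted(set(precolour.values()))
    k = len(colours)
    label = {}
    for v in range(n):
        if v in precolour:
            label[v] = colours.index(precolour[v])     # merged vertex
    free = [v for v in range(n) if v not in precolour]
    for i, v in enumerate(free):
        label[v] = k + i
    new_edges = {frozenset((label[u], label[v]))
                 for u, v in edges if label[u] != label[v]}
    new_edges |= {frozenset((i, j)) for i in range(k) for j in range(i + 1, k)}
    return k + len(free), sorted(tuple(sorted(e)) for e in new_edges)

# path 0-1-2-3 with vertices 0 and 3 precoloured 1, and vertex 1 precoloured 2
n2, e2 = contract_precolouring(4, [(0, 1), (1, 2), (2, 3)], {0: 1, 3: 1, 1: 2})
print(n2, e2)   # 3 [(0, 1), (0, 2), (1, 2)]
```

Any feasible colouring of the contracted graph then maps back to the original instance by giving every vertex of V(i) the colour of its merged vertex, as in Fig. 6.9.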
6.4 Latin Squares
Another prominent area of mathematics for which graph colouring is naturally suited
is the field of Latin squares. Latin squares are l × l grids that are filled with l different
symbols, each occurring exactly once per row and once per column. They were
originally considered in detail by Leonhard Euler, who filled his grids with symbols
from the Latin alphabet, though nowadays it is common to use the integers 1 through
to l to fill the grids. Example Latin squares of different sizes are shown in Fig. 6.10.
Latin squares have practical applications in several areas, including scheduling
and experimental design. For an application in scheduling, imagine that we have two
groups of l people and we want to schedule meetings between all pairs of people
belonging to different groups. Clearly l² meetings are needed here. Also, since only
l meetings can take place simultaneously, at least l timeslots are required. Latin
squares give solutions to such problems that make use of exactly l timeslots. To see
this, let us name the members of Team One as r1 , r2 , . . . , rl , which are represented
by the rows in the grid, and the members of Team Two as c1 , c2 , . . . , cl , represented
by the columns. The characters within an l ×l Latin square then represent the various
timeslots to which the meetings are assigned. For example, the Latin square shown
in Fig. 6.10a schedules meetings between r1 and c1 , r2 and c2 , and r3 and c3 into
timeslot 1; meetings between r1 and c3 , r2 and c1 , and r3 and c2 into timeslot 2; and
meetings between r1 and c2 , r2 and c3 , and r3 and c1 into timeslot 3. Any l × l Latin
square will provide a suitable meeting schedule fitting these criteria.
For an example application of Latin squares in experimental design, imagine
that we want to test the effects of l different drugs on a particular illness. Suppose
further that the trials are to take place over l weeks using l different patients, with
each patient receiving a single drug each week. An l × l Latin square can be used
to allocate treatments in this case, with rows representing patients, and columns
representing weeks. This means that over the course of the l weeks each patient
receives each of the l drugs once, and in each week all of the l drugs are tested.
Looking at the 3 × 3 Latin square from Fig. 6.10a, for example, we see that Patients
1, 2, and 3 are administered Drugs 1, 2, and 3 (respectively) in Week 1; Drugs 3, 1,
and 2 in Week 2; and Drugs 2, 3, and 1 in Week 3, as required.
Note that we can permute the rows and columns of a Latin square and still retain
the property of each character occurring exactly once per column and once per row. It
Fig. 6.11 Demonstration of the relationship between graph colouring and Latin squares. Part a
associates each grid cell with a vertex; Part b shows the corresponding graph together with a
feasible colouring; and Part c gives a valid Latin square corresponding to this colouring
is therefore common to write Latin squares in their standardised form, whereby the
rows and columns are arranged so that the top row and leftmost column of the grid
have the characters in their natural order 1, 2, . . . , l. The other l!(l − 1)! − 1 Latin
squares that can be formed by permuting the rows and columns are then considered
to be equivalent to this. The Latin square in Fig. 6.10b is in standardised form, while
the one in Fig. 6.10a is not.
It is also known that as l increases, so does the number of different standardised
l × l Latin squares. For l = 1, . . . , 4, these numbers are 1, 1, 1, and 4, respectively;
however, the growth rate is rapid: for l = 11 there are more than 5.36 × 10³³
different Latin squares. Further information on this growth rate can be found at the
Online Encyclopedia of Integer Sequences (https://oeis.org/A000315).
Figure 6.11 shows how the production of a Latin square can be expressed as a graph
colouring problem. As illustrated, the symbols used within the grid represent the
colours. Each cell of the grid is then associated with a vertex, and edges are added
between all pairs of vertices in the same row and all pairs of vertices in the same
column. This results in a graph G = (V, E) with n = l² vertices and m = l²(l − 1)
edges, for which deg(v) = 2(l − 1) ∀v ∈ V . (This graph is equivalent to the
Cartesian product of the complete graphs K l and K l .) Note that the set of vertices
in each row forms a clique of size l, as do vertices in each column. This implies that
solutions using fewer than l colours are not possible.
Note that it is simple to produce a Latin square in standardised form for any
value of l: simply use the values (1, 2, . . . , l) for the first row, (2, 3, . . . , l, 1) for
the second row, (3, 4, . . . , l, 1, 2) for the third row, and so on (see, for example,
the left Latin square in Fig. 6.10c). This tells us that Latin squares are a particular
topology for which the associated graph colouring problem can be solved
in polynomial time for any value of l, without resorting to heuristics or
approximation algorithms. Graph colouring algorithms can, however, be used for
producing Latin squares that differ from this one.
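This cyclic construction, and the graph described in connection with Fig. 6.11, are both straightforward to generate. The following sketch (illustrative code, not taken from this book) builds the standardised square and the associated graph, whose vertex and edge counts match the formulas n = l^2 and m = l^2(l − 1) given above:

```python
def cyclic_latin_square(l):
    """Standardised l x l Latin square: row i is the cyclic shift
    (i+1, i+2, ..., l, 1, ..., i)."""
    return [[(i + j) % l + 1 for j in range(l)] for i in range(l)]

def is_latin(square):
    """True iff every row and column contains each of 1..l exactly once."""
    l = len(square)
    symbols = set(range(1, l + 1))
    return (all(set(row) == symbols for row in square) and
            all({square[i][j] for i in range(l)} == symbols for j in range(l)))

def latin_square_graph(l):
    """Vertices are grid cells; edges join cells sharing a row or a column."""
    cells = [(r, c) for r in range(l) for c in range(l)]
    edges = {frozenset({u, v}) for u in cells for v in cells
             if u != v and (u[0] == v[0] or u[1] == v[1])}
    return cells, edges
```

A feasible l-colouring of this graph, with symbols playing the role of colours, is exactly a Latin square.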
Graph colouring algorithms arguably become more useful in this area when we
consider the partial Latin square problem. This is the problem of taking a partially
filled l × l grid and deciding whether or not it can be completed to form a Latin
square. This problem has been proven N P-complete by Colbourn [19].
Fig. 6.12 Part a shows a partial 3 × 3 Latin square with four filled-in cells. Part b shows the
corresponding precolouring problem. In Part c, the graph has been modified using the steps described
in Sect. 6.3 and, in Part d, a three-colouring of this graph has been established. Part e shows the
final Latin square
Figure 6.12 demonstrates how the partial Latin square problem can be tackled
using graph colouring principles. It follows the same method as the previous example
given in Fig. 6.11, except that certain vertices are now also precoloured. This means
that the same steps as those used with the precolouring problem (Sect. 6.3) can now
be followed, with an l-colouring of this graph corresponding to a completed l × l
Latin square. Of course, depending on the values of the filled-in cells in the original
problem, there could be zero, one, or multiple feasible l-colourings available.
The partial Latin square problem has become very popular in recent decades in the
form of Sudoku puzzles. In Sudoku, we are given a partially filled Latin square and
the objective is to complete the remaining cells so that each column and row contains
the characters 1, . . . , l exactly once. In addition, Sudoku grids are also divided into
l “boxes” (usually marked by bold lines) which are also required to contain the
characters 1, . . . , l exactly once; thus, Sudoku can be considered a special case of
the partial Latin square problem in which the constraint of appropriately filling out the
“boxes” must also be satisfied. An example 9×9 Sudoku puzzle and a corresponding
solution is shown in Fig. 6.13.
Because Sudoku is intended to be an enjoyable puzzle, problems posed in books
and newspapers will nearly always be logic solvable; that is, they can be completed
using logical deductions alone, without guessing. Puzzles that are not logic solvable
require random choices to be made. In general, these should be avoided because
players will have to go through the tedious process of backtracking and re-guessing
if their original guesses turn out to be wrong.
As an example of how a player might deduce the contents of cells, consider the
puzzle given in Fig. 6.13. Here, we see that the cell in the seventh row and sixth
6.4 Latin Squares and Sudoku Puzzles 175
column (shaded) must contain a 6 because all numbers 1–5 and 7–9 appear either in
the same column, the same row, or the same box as this cell. If the problem instance
is logic solvable (as indeed this one is), the filling-in of this cell will present further
clues, allowing the user to eventually complete the puzzle.
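Deductions of this "only one candidate remains" kind are easy to mechanise. Below is a hedged sketch (the grid encoding, with 0 marking an empty cell, is our own convention, and the box side is taken as the square root of l):

```python
import math

def candidates(grid, r, c):
    """Symbols that could legally fill the empty cell (r, c)."""
    l = len(grid)
    b = math.isqrt(l)                      # box side, e.g. 3 for a 9 x 9 grid
    used = set(grid[r]) | {grid[i][c] for i in range(l)}
    br, bc = (r // b) * b, (c // b) * b    # top-left corner of the cell's box
    used |= {grid[i][j] for i in range(br, br + b) for j in range(bc, bc + b)}
    return set(range(1, l + 1)) - used

def naked_singles(grid):
    """Empty cells whose value is forced, returned as {(row, col): symbol}."""
    l = len(grid)
    forced = {}
    for r in range(l):
        for c in range(l):
            if grid[r][c] == 0:
                cs = candidates(grid, r, c)
                if len(cs) == 1:
                    forced[(r, c)] = cs.pop()
    return forced
```

Repeatedly applying such deductions until the grid is full is precisely what it means for a puzzle to be logic solvable by this rule alone; harder puzzles need stronger deductions.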
Many algorithms for solving Sudoku puzzles are available online, such as those at
http://www.sudokuwiki.org and http://www.sudoku-solutions.com. Such algorithms
typically mimic the logical processes that a human might follow, with popular deduc-
tive techniques, such as the so-called X-wing and Swordfish rules, also being com-
monplace. In other areas of Sudoku research, Russell and Jarvis [20] have shown that
the number of essentially different Sudoku solutions (when symmetries such as rota-
tion, reflection, permutation, and relabelling are taken into account) is 5,472,730,538
for the popular 9 × 9 grids. McGuire et al. [21] have also shown that 9 × 9 Sudoku
puzzles must contain at least 17 filled-in cells to admit a unique solution, and that 9 × 9
puzzles with 16 or fewer filled-in cells will always admit more than one solution.
Similar results for larger grids are unknown, however. Herzberg and Murty [22] have
also shown that at least l − 1 of the l characters must be present in the filled cells of
a Sudoku puzzle for it to have a unique solution.
Although Sudoku is a special case of the partial Latin squares problem, Yato and
Seta [23] have demonstrated that the problem of deciding whether or not a Sudoku
puzzle features a valid solution is still N P -complete. Graph colouring algorithms
can therefore be useful for solving instances of Sudoku, particularly those that are
not necessarily logic solvable.
Sudoku puzzles can be transformed into a corresponding graph colouring problem
in the same fashion as partial Latin square problems (see Fig. 6.12), with additional
edges also being imposed to enforce the extra constraint concerning the “boxes” of
the grid. We now present two sets of experiments that illustrate the capabilities of
the HEA and backtracking algorithms from Chap. 5 for solving Sudoku puzzles. In
the first set of experiments, we focus on Sudoku problems that are not necessarily
logic solvable (random puzzles), while in the second set we focus on 9 × 9 grids that
are logic solvable.
Fig. 6.13 An example 9 × 9 Sudoku puzzle together with a corresponding solution
Each of these shuffle operators preserves the validity of a Sudoku solution. Finally,
some cells in the grid should be made blank by going through each cell in turn and
deleting its contents with probability 1 − p, where p is a parameter to be defined by
the user. This means that instances generated with a low value for p have a lower
proportion of filled-in cells.
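A generator of this kind can be sketched as follows. The base solution uses the standard shift pattern for a b^2 × b^2 grid, and only two validity-preserving shuffles (symbol relabelling and transposition) are shown; the full operator set used in the experiments is not reproduced here:

```python
import random

def base_solution(b=3):
    """A valid b^2 x b^2 Sudoku solution built from the usual shift pattern."""
    l = b * b
    return [[(b * (r % b) + r // b + c) % l + 1 for c in range(l)]
            for r in range(l)]

def shuffle(grid, rng):
    """Two validity-preserving shuffles: relabel the symbols, then
    transpose the grid with probability one half."""
    l = len(grid)
    relabel = dict(zip(range(1, l + 1), rng.sample(range(1, l + 1), l)))
    g = [[relabel[x] for x in row] for row in grid]
    if rng.random() < 0.5:
        g = [list(col) for col in zip(*g)]
    return g

def make_instance(p, b=3, seed=0):
    """Random Sudoku instance: shuffle a solution, then keep each cell
    with probability p (0 marks a blanked cell)."""
    rng = random.Random(seed)
    g = shuffle(base_solution(b), rng)
    return [[x if rng.random() < p else 0 for x in row] for row in g]
```

Low values of p therefore yield sparsely filled instances, as in the experiments described next.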
Figure 6.14 illustrates the performance of the HEA and backtracking algorithms
on 9 × 9, 16 × 16, and 25 × 25 Sudoku grids, respectively. In each case, 100 instances
for each value of p have been generated and, as in Chap. 5, a computation limit of
5 × 1011 constraint checks has been imposed. For each algorithm, two statistics
are displayed. The success rate (SR) indicates the percentage of runs for which the
algorithms have found a valid Sudoku solution (a feasible l-colouring) within the
computation limit. The solution time then indicates the mean number of constraint
checks that it took to achieve these solutions. Note that only successful runs are
considered in the latter statistic.
Looking at the results for 9 × 9 Sudoku puzzles first, we see that both algorithms
feature a 100% success rate across all instances with only a very small proportion
of the computation limit being required.2 For 16 × 16 puzzles, similar patterns
occur for the HEA, with all problem instances being solved, and no runs requiring
more than one second of computation time. On the other hand, the backtracking
algorithm features a dip in its success rate for values of p between 0.1 and 0.55,
with a corresponding increase in solution times. With the larger 25 × 25 puzzles, this
pattern becomes more apparent, with both algorithms featuring dips in their success
rates and subsequent increases in their solution times. However, these dips are less
pronounced with the HEA, indicating its superior performance overall.
The dips in the success rates of these algorithms are analogous to the phase tran-
sition regions we saw with the flat graphs in Sect. 5.7. When p is low, although
solution spaces will be larger, there will tend to be many optimal solutions within
2 On our equipment (3.0 GHz Windows 7 PC with 3.87 GB RAM) the longest run in the entire set
took just 0.02 s.
Fig. 6.14 Comparison of the HEA and backtracking algorithms' performance on random Sudoku
instances of size 9 × 9, 16 × 16, and 25 × 25, respectively. Each chart plots the success rate (%)
and the solution time (in constraint checks) of both algorithms against the proportion of filled
cells p. Note the different scales on the vertical axes in each case
these spaces. Consequently, an effective algorithm should be able to find one of these
within a reasonable amount of computation time, as is the case with the HEA. For
high values of p, meanwhile, although there will only be a very small number of
optimal solutions (and perhaps only one), the solution space will be much smaller.
Additionally, solutions to these highly constrained instances will also tend to reside
at prominent optima, thus also allowing easy discovery by an effective algorithm.
However, instances at the boundary of these two extremes will cause greater difficul-
ties. First, the solution spaces for these instances will still be relatively large, but they
will also tend to admit only a small number of optimal solutions. Second, because
of their moderate number of constraints, the cost landscapes will also tend to feature
more plateaus and local optima, making navigation towards a global optimum more
difficult for the algorithm.
Fig. 6.15 Comparison of the HEA and backtracking algorithms' performance on 9 × 9 Sudoku
grids with unique solutions, plotting success rates and solution times (in constraint checks)
against the number of filled cells
Another practical application of graph colouring is due to Garey et al. [24], who
suggest its use in the process of testing for (undesired) short circuits in printed
circuit boards. In their model, a circuit board is represented by a finite lattice of
evenly spaced points onto which a set of n cycle-free components has been printed.
This set P is referred to as a net pattern, with individual components p ∈ P being
called nets. Each net connects points that are intended to be electrically common. An
example net pattern comprising four components is shown in Fig. 6.16a. Note that
connections between points are only permitted in vertical or horizontal directions.
Given a net pattern P, the problem of interest is to determine whether there exists
some fault on the circuit board (due to the manufacturing process) whereby an extra
conductor path has been introduced between two nets that are not intended to be
electrically common. This is the case in Fig. 6.16b. These extra conductor paths are
known as “shorts”.
An obvious strategy to determine whether a short has occurred is to test each pair
of nets pi, pj ∈ P in turn by applying an electrical current to pi and seeing if this
current spreads to pj. However, Garey et al. [24] suggest that the number of pairwise
tests can be reduced significantly by making use of the following two observations.
First, note that many pairs do not need to be tested. Consider, for example, the net
pattern in Fig. 6.17a. Here it is unnecessary to test the pair p1 , p3 because, if there is
a short between them, then shorts must also exist between pairs p1 , p2 and p2 , p3 .
Since the objective of the problem is to determine if any shorts exist, testing either
p1 , p2 or p2 , p3 is therefore sufficient. Furthermore, if we consider the net pattern
in Fig. 6.17b, it might also be reasonable to assume that shorts cannot occur between
p1 and p3 without also causing a short involving p2 . Thus, depending on the criteria
used for deciding how and where shorts can occur, we have the opportunity to exclude
many pairs of nets from the testing procedure. If it is deemed necessary to test a pair
of nets, these are called critical pairs; otherwise, they are deemed noncritical.
The second observation is as follows. Let G = (V, E) be a graph with a set of
vertices V = {v1 , v2 , . . . , vn }, where each vertex vi ∈ V corresponds to a particular
net pi ∈ P (for 1 ≤ i ≤ n). Also, let each edge {vi , v j } ∈ E correspond to a pair
of nets pi , p j judged to be critical. Now let S = {S1 , . . . , Sk } be a partition of V
such that no pair of vertices vi , v j in any subset Sl forms a critical pair. From a graph
colouring perspective, S therefore defines a feasible k-colouring of the vertices of G.
Now suppose that, for the printed circuit board in question, external conductor paths
are provided so that all nets in any subset S ∈ S can be made electrically common
during testing. This means that there are k “supernets” that need to be tested. It can
now be seen that the printed circuit board contains no short if and only if no pair of
“supernets” is seen to be electrically common. Therefore, we only have to perform
a maximum of k(k − 1)/2 tests as opposed to our original figure of n(n − 1)/2 tests.
Naturally, it is desirable to reduce k as far as possible to minimise the number of
tests needed.
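The saving in tests can be illustrated with a small sketch (illustrative code; any feasible colouring routine could replace the simple greedy one used here). Each colour class of the critical-pair graph becomes one supernet:

```python
def greedy_colouring(n, edges):
    """Colour vertices 0..n-1; each takes the lowest colour unused by
    its already-coloured neighbours."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    colour = {}
    for v in range(n):
        taken = {colour[u] for u in adj[v] if u in colour}
        colour[v] = next(c for c in range(n + 1) if c not in taken)
    return colour

def supernet_tests(n, critical_pairs):
    """(tests needed after grouping nets into supernets,
    tests needed testing every pair of the n nets directly)."""
    colour = greedy_colouring(n, critical_pairs)
    k = len(set(colour.values()))
    return k * (k - 1) // 2, n * (n - 1) // 2
```

The fewer colours the routine uses, the fewer supernet pairs remain to test, which is why minimising k matters.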
In their paper, Garey et al. [24] propose several criteria for deciding whether a pair
of nets should be deemed critical, with associated theorems then being presented.
We now review some of these.
Proof Given a net pattern P, for each pair of nets for which a vertical line of sight
exists, draw such a line. Since each line is vertical, none can intersect. It is now
possible to contract each net into a single point, deforming the lines of sight (which
may no longer be straight lines) in such a way that they remain nonintersecting. This
structure now corresponds to the graph G = (V, E), with each vertex corresponding
to a contracted net, and each edge corresponding to the lines of sight. Since G is
planar, χ (G) ≤ 4 according to the four colour theorem (Theorem 6.8).
Proof It is first necessary to show that any graph G formed in this way has a vertex v
with deg(v) ≤ 11. Let G 1 = (V, E 1 ) and G 2 = (V, E 2 ) be subgraphs of G such that
E 1 is the set of edges formed from vertical lines and E 2 is the set of edges formed
from horizontal lines. Hence E = E 1 ∪ E 2 . By Theorem 6.13, both G 1 and G 2 are
planar. We can also assume without loss of generality that the number of vertices
n > 12. According to Theorem 6.2, the number of edges in a planar graph with n
vertices is less than or equal to 3n − 6. Thus:
m ≤ |E 1 | + |E 2 | ≤ (3n − 6) + (3n − 6) = 6n − 12.
Since each edge contributes to the degree of two distinct vertices, this gives
∑v∈V deg(v) = 2m ≤ 12n − 24.
If every vertex had a degree of at least 12, this sum would be at least 12n. Hence,
it follows that some vertex in G must have a degree of 11 or less.
Now consider any subset V′ ⊆ V. If |V′| ≤ 12, the subgraph induced by V′ trivially
contains a vertex with a degree of at most 11; otherwise, the argument above applies.
Consequently, according to Theorem 3.7, χ (G) ≤ 11 + 1 = 12.
In their paper, Garey et al. [24] conjecture that the result of Theorem 6.14 might be
improved to χ (G) ≤ 8 because, in their experiments, they were not able to produce
graphs featuring chromatic numbers higher than this. They also go on to consider
the maximum length of lines of sight and show that:
• If lines of sight can be both horizontal and vertical but are limited to a maximum
length of 1 (where one unit of length corresponds to the distance between a pair
of vertically adjacent points or a pair of horizontally adjacent points on the circuit
board), then G will be planar, giving χ (G) ≤ 4.
• If lines of sight can be both horizontal and vertical but are limited to a maximum
length of 2, then G will have a chromatic number χ (G) ≤ 8.
Finally, they also note that if arbitrarily long lines of sight travelling in any direction
are permitted (as opposed to merely horizontal or vertical) then it is possible to
form all sorts of different graphs, including complete graphs. Hence, arbitrarily high
chromatic numbers can occur.
In this section, we now consider graph colouring problems for which information
about a graph is incomplete at the beginning of execution. In the following sub-
sections, we discuss three different interpretations, specifically decentralised graph
colouring, online graph colouring, and dynamic graph colouring, and give practical
examples of each.
Fig. 6.18 Illustration of a primary collision (a), and (b) a secondary collision (dotted line) in a
wireless network
More precisely, let G = (V, E) be a graph with vertex set V and edge set E. The
set of edges due to primary collisions, E 1 , contains all pairs of devices that are close
enough to be able to receive each other’s transmissions (as with Fig. 6.18a). The set
E 2 then contains pairs of devices subject to secondary collisions, that is, {vi , v j } ∈ E 2
if and only if the distance between vi and v j in the graph G 1 = (V, E 1 ) is exactly
two (as is the case in Fig. 6.18b). If only primary collisions need to be considered
when assigning frequencies, we only need to colour the graph G 1 ; otherwise, we
will need to colour the graph G = (V, E = E 1 ∪ E 2 ). In either case, this task
is a type of decentralised graph colouring problem because each vertex (wireless
device) is responsible for choosing its colour (frequency) while being aware only of
its neighbours and their current colours.
One simple but effective algorithm for the decentralised graph colouring problem
is due to Finocchi et al. [25]. This operates as follows. Let G = (V, E) be a graph with
maximal degree Δ(G). To begin, all vertices in G are set as uncoloured. Each vertex
is also allocated a set of candidate colours, defined L v = {1, 2, . . . , deg(v)+1} ∀v ∈
V . A single iteration of the algorithm now involves the following four steps:
Fig. 6.19 Example run of the algorithm of Finocchi et al. Here, tentatively coloured vertices are
shown in white. Labels within the vertices indicate colours
An example run of this algorithm is shown in Fig. 6.19. In the first iteration, we see
that three of the five vertices are allocated final colours; the remaining two vertices
are then allocated final colours in the second iteration.
Note that in the above algorithm, each vertex v is initially assigned a set of
candidate colours L v = {1, 2, . . . , deg(v) + 1}. This means that L v always contains
sufficient options to allow each vertex v to be coloured differently from all of its
neighbours; hence, Step 3(c) will never actually be used. If, however, we desire a
solution using fewer colours, we might choose to introduce a shrinking factor s > 1,
which can be used to limit the initial set of candidate colours for each vertex v to
L v = {1, 2, . . . , ⌈(deg(v) + 1)/s⌉}. In this case, Step 3(c) might now be needed if the
original contents of L v prove insufficient. The algorithm may also need to execute
for an increased number of iterations to achieve a feasible solution (if, indeed, one
can be found).
Finocchi et al. [25] also suggest an improvement to this algorithm by replacing
Step 2 with a more powerful operator. Observe in Step 2 of the first iteration of
Fig. 6.19 that there are two vertices tentatively coloured with colour 3. Accordingly,
neither of these vertices receives a final colour at this iteration, though it is obvious
that one of them could indeed receive colour 3 as a final colour at this point. An
improvement to Step 2 therefore operates as follows. Let G(i) = (V (i), E(i)) be
the subgraph induced by all vertices tentatively coloured with colour i. We now
identify a maximal independent set for G(i) and assign all vertices in this set to a
final colour i. All other vertices in G(i) should remain uncoloured. To form this
independent set, in parallel each vertex v ∈ V (i) first generates a random number
rv ∈ [0, 1]. The tentative colour of a vertex v is then selected as its final colour if
and only if rv is less than the random numbers chosen by its neighbours in G(i).
This is equivalent to the Greedy process of randomly permuting the vertices in V (i)
and then adding each vertex v ∈ V (i) to the independent set if and only if it appears
before its neighbours in the permutation.
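This random-priority rule can be sketched as follows (our own illustrative encoding of G(i)). Each vertex of G(i) keeps its tentative colour only if its random draw beats those of all its neighbours in G(i), so the surviving vertices always form an independent set; repeating the rule on the remaining vertices would grow the set to a maximal one:

```python
import random

def priority_round(vertices, edges, seed=None):
    """Vertices of G(i) whose tentative colour becomes final in one round:
    those whose random draw is smaller than every neighbour's draw."""
    rng = random.Random(seed)
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    draw = {v: rng.random() for v in vertices}   # each vertex's random number
    return {v for v in vertices if all(draw[v] < draw[u] for u in adj[v])}
```

In a truly decentralised setting each device would draw its own number and compare it only with its neighbours' broadcasts; the sequential code above just simulates that behaviour.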
In addition to assigning frequencies in wireless networks, decentralised graph
colouring problems are known to arise in several other practical situations, including
TDMA slot assignment, wake-up scheduling, and data collection [26]. One particu-
larly noteworthy piece of research is due to Kearns et al. [27], who have examined
the decentralised colouring of graphs representing social networks. In their case,
each vertex in the graph is a human participant, and two vertices are adjacent if these
people are judged to know one another. The objective of the problem is for each
person to choose a colour for himself or herself only by using information regarding
the colours of his or her neighbours. Participants are also able to change their colour
as often as necessary until, ultimately, a feasible colouring of the entire graph is
formed. This problem has real-world implications in situations where it is desirable
to distinguish oneself from one’s neighbours, for example, selecting a mobile phone
ringtone that differs from one’s friends, or choosing to develop professional exper-
tise that differs from one’s colleagues. In their research, Kearns et al. [27] carried
out experiments on several graph topologies using segregated participants. Under a
time limit of 5 min, topologies such as cycle graphs were optimally coloured quite
quickly through the collective efforts of the participants. Other more complex graphs
modelling more realistic social network topologies were seen to present significantly
more difficulties, however.
Studies of online graph colouring have also focussed on the behaviour of the
Greedy algorithm, which, we recall, operates by assigning each vertex to the lowest
indexed colour seen to be feasible (see Sect. 3.1). Bounds noted by Gyárfás and
Lehel [29] include
χGreedy (G) ≤ ω(G) + 1 (6.3)
if G is a split graph (i.e., a graph that can be partitioned into one clique and one
independent set),
χGreedy (G) ≤ (3/2) ω(G) + 1 (6.4)
if G is the complement of a bipartite graph, and
χGreedy (G) ≤ 2ω(G) − 1 (6.5)
if G is the complement of a chordal graph.
Upper bounds on the quality of solutions produced by Greedy can also be deter-
mined by looking at the Grundy chromatic number. For a particular graph G, this is
defined as the maximum number of colours used by Greedy over all orderings of
G’s vertices. Graphs for which the Grundy chromatic number is equal to the chro-
matic number include complete graphs, empty graphs, odd cycles, and complete
k-partite graphs. In general, however, the problem of determining Grundy chromatic
numbers is N P-hard, with the best-known exact algorithm operating in O(2.443^n)
time [30,31].
Empirical work by Ouerfelli and Bouziri [32] has also suggested that instead
of following the Greedy algorithm’s strategy of assigning vertices to the lowest
indexed feasible colour, in online colouring it is often beneficial to assign vertices to
the feasible colour containing the most vertices. This is because such a heuristic will
often assist in the formation of larger independent sets, ultimately helping to reduce
the number of colours used in the final solution.
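The difference between the two rules is only the tie-breaking among feasible colour classes, as this sketch shows (illustrative code; `largest_first=True` gives the heuristic suggested by Ouerfelli and Bouziri):

```python
def online_colour(order, adj, largest_first=False):
    """Colour vertices as they arrive.  Greedy takes the lowest-indexed
    feasible class; the alternative takes the feasible class that
    currently holds the most vertices."""
    classes = []                 # classes[i] = vertices given colour i so far
    colour = {}
    for v in order:
        feasible = [i for i, cl in enumerate(classes) if not cl & adj[v]]
        if not feasible:         # no existing class will do: open a new one
            classes.append(set())
            feasible = [len(classes) - 1]
        i = (max(feasible, key=lambda j: len(classes[j]))
             if largest_first else feasible[0])
        classes[i].add(v)
        colour[v] = i
    return colour
```

Here `adj[v]` is the set of v's neighbours among the vertices seen so far; in a genuine online setting this is all the algorithm ever learns about the graph.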
A real-world application of online graph colouring is presented by Dupont et al.
[33]. Here, a military-based frequency assignment problem is considered in which
wireless communication devices are introduced one by one into a battlefield environ-
ment. From a graph colouring perspective, given a graph G = (V, E), the problem
starts with an initial colouring of the subgraph induced by the subset of vertices
{v1 , . . . , vi }. The remaining vertices vi+1 , . . . , vn are then introduced one at a time,
with the colour (frequency) of each vertex having to be determined before the next
vertex in the sequence is considered. In this application, the number of available
colours is fixed from the outset, so it is possible that a vertex v j (i < j ≤ n) might
be introduced for which no feasible colour is available. In this case, a repair operator
is used that attempts to rearrange the existing colouring so that a feasible colour is
created for v j . Because such rearrangements are considered expensive, the repair
operator also attempts to minimise the number of vertices that have their colours
changed during this process.
Dynamic graph colouring differs from decentralised and online graph colouring in
that we again possess a global knowledge of the graph we are trying to colour.
However, in this case, graphs are also permitted to change over time.
A practical application of dynamic graph colouring might occur in the timetabling
of lectures at a university (see Sect. 1.1.2 and Chap. 9). To begin, a general set
of requirements and constraints will be specified by the university and an initial
timetable will be produced. However, on viewing this draft timetable, circumstances
might dictate that some constraints need to be altered, additional lectures need to be
introduced, or other lectures need to be cancelled. This will result in a new timetabling
problem that needs to be solved, with the process continuing in this fashion until a
finalised solution is agreed upon.
The dynamic graph colouring problem is considered by Hardy [34], who examines two
general cases.
1. Edge dynamic problems. Here, the vertices of a graph are not altered, but edges
can be added and removed.
2. Vertex dynamic problems. A generalisation of the above in which vertices (and
their edges) are added and removed.
For both of these cases, a dynamic graph colouring problem can be modelled using
a sequence of graphs G = (G 1 , G 2 , . . . , G l ). A solution to each graph G i ∈ G will
then need to be produced within a limited time frame before the next graph G i+1 is
considered. An important issue in this problem is to decide how and when solutions to
the previously observed graphs, G 1 , . . . , G i , can be used to help establish solutions
for the future graphs G i+1 , . . . , G l . When changes between successive graphs are
made at random, Hardy finds that making use of a solution for G i can help establish
a solution for G i+1 , providing that the number of changes made between time steps
is fairly small. In the remaining cases, it is sufficient to produce solutions to G i+1
using no previous information.
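The reuse of an old solution can be sketched in its simplest form: keep the colouring for G i, uncolour one endpoint of every newly conflicting edge, and reinsert those vertices greedily. (This is our own minimal illustration; Hardy's schemes are considerably more refined.)

```python
def repair_colouring(colour, adj, new_edges):
    """Repair a colouring after new edges appear.  `adj` is the adjacency
    of the new graph; `colour` is a feasible colouring of the old one."""
    colour = dict(colour)
    dirty = {v for u, v in new_edges if colour[u] == colour[v]}
    for v in dirty:                       # uncolour one endpoint per conflict
        del colour[v]
    for v in dirty:                       # greedily re-colour the removed ones
        used = {colour[u] for u in adj[v] if u in colour}
        colour[v] = next(c for c in range(len(adj) + 1) if c not in used)
    return colour
```

When few edges change between time steps, only a handful of vertices are disturbed, which is consistent with Hardy's observation that reuse pays off precisely when the changes are small.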
Hardy [34] also considers situations where future changes to graphs are expected
to occur with certain probabilities. Solutions can then be sought that are robust to
these potential changes. As an example, consider the situation where two vertices
u, v are nonadjacent in G i , but are expected to become adjacent at a later time
step. In these circumstances, it might be advantageous to try and assign u and v to
different colours, even though this is not currently required. For both edge and vertex
dynamic problems, Hardy finds that it is useful to take these expected changes into
account. The scheme suggested involves using a local search routine that maintains
the feasibility of a solution with regard to the current graph G i , but that also seeks to
optimise a “robustness” measure that takes future change probabilities into account.
Further information is also documented by Hardy et al. [35].
The list colouring problem is an extension to the graph colouring problem that, as
usual, involves assigning differing colours to adjacent vertices. In this case, though,
individual vertices are also given a list of permissible colours to which they can be
assigned.
Defined more precisely, the list colouring problem takes a graph G = (V, E)
together with a set L v of permissible colours for each vertex v ∈ V . The sets
L v are usually referred to as “lists”, giving the problem its name. The task is to
now produce a feasible colouring of G with the added restriction that all vertices
should only be assigned to colours appearing in their corresponding lists (that is,
∀v ∈ V, c(v) ∈ L v ). A graph G is said to be “k-choosable” if a feasible colouring
exists for every possible assignment of lists of size k to its vertices. The “choice
number” χ L (G) then refers to the minimum k for which G is k-choosable. Note that
the chromatic number of a graph satisfies χ (G) ≤ χ L (G).
List colouring problems have obvious applications in areas such as timetabling
where, in addition to scheduling events into a minimal number of timeslots (as we
saw in Sect. 1.1.2), we might also face constraints of the form “event v can only
be assigned to timeslots x and y”, or “event u cannot be assigned to timeslot z”.
The problem is also N P -hard because it generalises the graph colouring problem.
Specifically, graph colouring problems can be easily converted into an equivalent list
colouring decision problem by simply setting L v = {1, 2, . . . , Δ(G) + 1}, ∀v ∈ V .
In practice, algorithms for the graph colouring problem can often be used for
deciding whether a graph is k-choosable. More specifically, graph colouring algo-
rithms can be used to tackle any list colouring problem for which our chosen k ≥ |L|,
where L is defined as the union of all lists: L = ∪v∈V L v .
To see this, imagine we have a list colouring problem defined on a graph G for
which k ≥ |L| is satisfied. First, we create a new graph G′ by copying the vertices
and edges of G and then adding k additional vertices, which we label u 1 , u 2 , . . . , u k .
Next, we add edges between all k(k − 1)/2 pairs of these additional vertices so that they
form a complete graph K k . This implies that any feasible colouring of G′ must use
at least k different colours. Without loss of generality, we can assume that c(u i ) = i
for 1 ≤ i ≤ k. Finally, we then go through each vertex v in G′ that came from the
original graph G and consider its colour list, adding an edge between v and u i if
colour i ∉ L v . This has the effect of disallowing v from being assigned to colour i,
as required.
Figure 6.20 demonstrates this process. In this example, k = |L| = 4, meaning that
four additional vertices u 1 , . . . , u 4 are added (larger values for k are also permitted).
The colouring produced for the extended graph G′ also uses four colours, which is
the chromatic number in this case. However, we also observe that none of the vertices
originating from G are coloured with colour 1 in this example; hence, the original
list colouring problem can in fact be solved using just three colours, as shown in Fig. 6.20c.
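This transformation is mechanical and easy to automate. A sketch, assuming the original vertices are numbered 0..n−1 and the lists are given as sets of colour labels:

```python
from itertools import combinations

def list_to_plain(n, edges, lists):
    """Convert a list colouring instance into a plain graph colouring
    instance: add one vertex u_i per colour, join them into a clique, and
    join vertex v to u_i whenever colour i is missing from v's list."""
    colours = sorted(set().union(*lists.values()))    # L = union of all lists
    u = {c: n + i for i, c in enumerate(colours)}     # one new vertex per colour
    new_edges = {frozenset(e) for e in edges}
    new_edges |= {frozenset(p) for p in combinations(u.values(), 2)}
    new_edges |= {frozenset({v, u[c]}) for v in range(n)
                  for c in colours if c not in lists[v]}
    return n + len(colours), new_edges
```

A feasible |L|-colouring of the returned graph, restricted to the original n vertices, then respects every vertex's list.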
Fig. 6.20 Illustration of how a list colouring problem (a) can be converted into an equivalent
graph colouring problem (b), whose colouring then represents a feasible solution to the original list
colouring problem (c)
Another extension to the graph colouring problem is the equitable colouring problem,
where we seek to establish a feasible colouring of a graph G such that the sizes of the
colour classes differ by at most 1. In other words, we desire a feasible k-colouring
so that exactly n mod k colour classes contain ⌈n/k⌉ vertices, and the remainder
contain exactly ⌊n/k⌋ vertices.
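Checking this balance condition for a given k-colouring amounts to comparing the largest and smallest class sizes:

```python
def is_equitable(colour, k):
    """True iff the sizes of the k colour classes differ by at most one
    (equivalently, n mod k classes of size ceil(n/k), the rest floor(n/k))."""
    sizes = [0] * k
    for c in colour.values():
        sizes[c] += 1
    return max(sizes) - min(sizes) <= 1
```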
Examples of equitable graph colouring problems occur quite naturally as exten-
sions to the general graph colouring problem. In university timetabling, for example,
it might be desirable to minimise the number of rooms required by balancing the
number of events per timeslot (see Sect. 1.1.2). Another application can be found
in the creation of table plans for large parties. Imagine, for example, that we have n
guests who are to be seated at k equal-sized tables, but that some guests are known to
dislike each other and therefore need to be assigned to different tables. In this case,
we can model the problem as a graph by using vertices for guests, with edges occur-
ring between pairs of guests who dislike each other. An extension to this application
is the subject of Chap. 7.
Let G = (V, E) be a graph with n vertices, a maximal degree Δ(G), and an
independence number α(G).
Like the graph colouring problem, the equitable graph colouring problem is known
to be N P -complete. This follows from the fact that the problem of deciding whether
190 6 Applications and Extensions
Fig. 6.21 The equitable chromatic numbers for star graphs with n = 5, 6, 7, and 8 are 3, 4, 4, and
5, respectively
Theorem 6.15 (Hajnal and Szemerédi [37]) Let G be a graph with maximal
degree Δ(G). Then χe (G) ≤ Δ(G) + 1.
This fact was initially conjectured by Erdős [38], with a formal proof being pub-
lished 6 years later by Hajnal and Szemerédi [37]. Shorter proofs of this theorem
have also been shown by Kierstead and Kostochka [39] and Kierstead et al. [40].
The latter publication also presents a polynomial-time algorithm for constructing an
equitable (Δ(G) + 1)-colouring. The method involves first removing all edges from
G and dividing the n vertices arbitrarily into Δ(G) equal-sized colour classes. In
cases where n is not a multiple of Δ(G), sufficient isolated vertices are added. The
vertices are then considered in turn and, in each iteration i, the edges incident to ver-
tex vi are added to G. If vi is seen to be adjacent to another vertex in its colour class,
it is moved to a different feasible colour class, leading to a feasible colouring with
up to Δ(G) + 1 colours. If this colouring is not equitable, then a polynomial-length
sequence of adjustments is made to re-establish the balance of the colour classes.
It is notable that Theorem 6.15 is similar to Theorem 3.6 from Chap. 3, which
states that for any graph G, the chromatic number χ(G) ≤ Δ(G) + 1. Meyer [41]
has gone one step further, conjecturing a form of Brooks' theorem (Theorem 3.8) for
equitable graph colouring: every graph G has an equitable colouring using Δ(G)
or fewer colours except for complete graphs and odd cycles. Recall, however, that
the problem of determining an equitable k-colouring for an arbitrary graph G is still
N P -complete, implying the need for approximation algorithms and heuristics in
general.
One simple approach for producing approximately equitable k-colourings is to
make a small modification to the DSatur algorithm. Recall from
Sect. 3.3 that this algorithm takes vertices one at a time and then colours them using
the lowest colour label not assigned to any of their neighbours. Here, we can change
this strategy by using k colour classes from the outset and, at each iteration, select
the feasible colour class containing the fewest vertices, breaking any ties randomly.
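A sketch of this modified strategy is given below. For simplicity, it is a plain greedy pass that takes vertices in index order rather than by DSatur's saturation-degree rule, so it only approximates the behaviour described above (names are our own):

```python
import random

def equitable_greedy(adj, k):
    """Greedily build k colour classes: each vertex joins the smallest
    feasible class (one containing none of its neighbours), with ties
    broken randomly. adj[v] lists the neighbours of vertex v."""
    n = len(adj)
    classes = [set() for _ in range(k)]
    colour = [None] * n
    for v in range(n):
        feasible = [c for c in range(k) if not (classes[c] & set(adj[v]))]
        if not feasible:
            continue                       # v is left uncoloured
        smallest = min(len(classes[c]) for c in feasible)
        c = random.choice([c for c in feasible
                           if len(classes[c]) == smallest])
        classes[c].add(v)
        colour[v] = c
    return classes, colour

# 5-cycle with k = 3: a feasible, equitable colouring is always found here.
adj = [[1, 4], [0, 2], [1, 3], [2, 4], [3, 0]]
classes, colour = equitable_greedy(adj, 3)
```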
Figure 6.22 summarises results achieved by this modification for random graphs
G_{500,p}, using p = 0.1, 0.5, and 0.9, for a range of suitable k-values. For comparison's
sake, the results of a second algorithm are also included here. This operates in the
same manner except that vertices are assigned to a randomly chosen feasible colour
in each case. The cost here is simply the difference in size between the largest and
smallest colour classes in a solution. Hence, a cost of 0 or 1 indicates an equitable
k-colouring.
Figure 6.22 demonstrates that, for these random graphs, the policy of assigning
vertices to feasible colour classes with the fewest vertices brings about more equi-
tably coloured solutions. We also see that the algorithm consistently achieves equi-
table colourings for the majority of k-values except for those close to the chromatic
number, and those where k is a divisor of n. For the former case, the low number of
available colours restricts the choice of feasible colours for each vertex, often leading
to inequitable colourings. On the other hand, when k is a divisor of n the algorithm
is seeking a solution with a cost of 0, meaning that the last vertex considered by the
algorithm must be assigned to the unique colour class containing one fewer vertex
than the remaining colour classes. If this colour turns out to be infeasible (which
is often the case), this vertex will then need to be assigned to another colour class,
resulting in a solution with a cost of 2.
Fig. 6.22 Quality of equitable solutions produced by the modified DSatur algorithms on random
graphs with n = 500 for, respectively, p = 0.1, 0.5, and 0.9. All figures are the average of 50
instances per k-value. (Each panel plots the cost against k for the "Random Feasible Colour" and
"Feasible Colour with Fewest Vertices" policies.)
It is also possible to further improve these solutions by, for example, applying
a local search algorithm with appropriate neighbourhood operators such as Kempe
chain interchanges and pair swaps. An approach along these lines for a related prob-
lem is the subject of the case study presented in Chap. 7.
6.9 Weighted Graph Colouring

Further useful extensions of the graph colouring problem can be achieved through
the addition of numeric weights to a graph. Typically, the term "weighted graph
colouring" is used in situations where the vertices of a graph are allocated weights.
However, the term is also sometimes used for problems where the edges are weighted,
and for the multicolouring problem. These are considered in turn in the following
subsections.

An example of this process is shown in Fig. 6.23. The matching M∗ can be determined
in polynomial time using methods such as the O(mn²) blossom algorithm [44].
Note that each colour class in the solution is an independent set, but that these are
limited to contain a maximum of two vertices. Indeed, in graphs where no independent
set contains more than two vertices (such as the complement of a bipartite graph),
this algorithm is guaranteed to return an optimal solution.

Fig. 6.23 An edge-weighted graph G = (V, E, w); the transformed graph G′ = (V, E′, w′) with
its maximum matching M∗; and the corresponding solution

In further work, Hassin and Monnot [45] have
shown that, for any graph, this process produces a solution whose objective function
value never exceeds twice the optimum. They also show that the same approximation
ratio applies when g(S_i) takes other forms, such as g(S_i) = min{w_v : v ∈ S_i} and
g(S_i) = (1/|S_i|) Σ_{v∈S_i} w_v. Malaguti et al. [46] have also proposed several IP-based
methods for this problem similar in spirit to those discussed in Sect. 4.1.2. In partic-
ular, they propose the use of heuristics for building up a large sample of independent
sets and then use an IP model similar to that of Sect. 4.1.3 to select a subset of these.
Local search-based methods based on Kempe chain interchanges and pair swaps also
seem to be naturally suited to this problem.
Here, if all edge weights in the graph are positive values, then any solution for which
f (S ) = 0 will correspond to a feasible k-coloured solution.
This sort of formulation is applicable in areas such as exam timetabling and social
networking. For the former, imagine that we wish to assign exams (vertices) to
timeslots (colours), but that there are insufficient timeslots to feasibly accommodate
all exams. To form a complete exam timetable, this means that some clashes will
be necessary; however, some types of clashes may be deemed less critical than
others. For example, if two clashing exams only have a small number of common
students, then we may allow them to both be assigned to the same timeslot, with
alternative arrangements then being made for the people affected. On the other hand,
if two exams contain a large number of common participants then a clash is far
less desirable. Appropriate weights added to the corresponding edges can be used to
express such preferences.
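As a sketch of this idea (the data structures here are our own illustration, not from the book), the weighted clash cost of a timetable can be computed by summing, over every pair of exams sharing a timeslot, the weight of the edge between them:

```python
def clash_cost(timeslot_of, common_students):
    """Total penalty of a timetable: for each pair of exams placed in the
    same timeslot, add the weight of their clash (here, the number of
    students taking both exams)."""
    cost = 0
    exams = list(timeslot_of)
    for i, e1 in enumerate(exams):
        for e2 in exams[i + 1:]:
            if timeslot_of[e1] == timeslot_of[e2]:
                cost += common_students.get(frozenset((e1, e2)), 0)
    return cost

# Two clashing exams share a slot; their 3 common students are penalised.
slots = {"maths": 1, "physics": 1, "art": 2}
shared = {frozenset(("maths", "physics")): 3}
print(clash_cost(slots, shared))  # 3
```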
Note that due to the nature of this problem’s requirements, algorithms that search
the space of complete improper solutions will often be naturally suitable here. In
Chap. 7, an application along these lines will be made to the problem of partitioning
members of social networks, where edge weights are used to express a level of
“liking” or “disliking” between pairs of individuals. A simulated annealing-based
approach for constructing subject options columns at schools is also described by
Lewis et al. [47].
6.9.3 Multicolouring

6.10 Chromatic Polynomials

The final topic in this chapter concerns counting the number of different colourings of
a graph G. To do this, let P(G, k) denote the number of distinct feasible colourings of
G that use k or fewer colours. As we will see presently, P(G, k) can always be
expressed as a polynomial in k:

k         0   1   2   3    4    ...
P(G, k)   0   0   0   12   72   ...
Note that the lowest k-value for which P(G, k) is non-zero gives the chromatic
number of a graph. This fact is also apparent in Fig. 6.24, where none of the presented
three-colourings uses fewer than three colours. In this sense, a graph's chromatic
polynomial contains at least as much information as its chromatic number.
Proof Let G 1 be the graph obtained by deleting an edge {u, v} from G. Also, let G 2
be the graph obtained by contracting {u, v}. This means that
P(G, k) = P(G 1 , k) − P(G 2 , k).
To see this, observe that the number of k-colourings of G 1 in which u and v have
different colours is unchanged if the edge {u, v} is added. It is therefore equal to
P(G, k). Similarly, the number of k-colourings of G 1 in which u and v have the
same colour is equal to P(G 2 , k). Hence, P(G 1 , k) = P(G, k) + P(G 2 , k).
The actions of removing and contracting edges can now be repeated on each
subsequent subgraph to form a binary tree of successively smaller graphs. The leaves
of this tree will be empty graphs. Since the number of ways of k-colouring an empty
graph with n vertices is the polynomial k^n, it follows that P(G, k) is also a polynomial.
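The deletion–contraction identity P(G, k) = P(G − e, k) − P(G/e, k) used in this proof also gives a direct, if exponential-time, way of evaluating P(G, k). A small recursive sketch (our own implementation, not code from the book):

```python
def chrom_poly(vertices, edges, k):
    """Evaluate P(G, k) via deletion-contraction:
    P(G, k) = P(G - e, k) - P(G / e, k), with the empty graph on
    n vertices as the base case (k^n colourings)."""
    vertices = set(vertices)
    edges = {frozenset(e) for e in edges}
    if not edges:
        return k ** len(vertices)
    u, v = tuple(next(iter(edges)))          # pick any edge e = {u, v}
    deleted = edges - {frozenset((u, v))}
    # Contract v into u: redirect v's edges to u and drop self-loops.
    contracted = {frozenset((u if a == v else a, u if b == v else b))
                  for a, b in (tuple(e) for e in deleted)}
    contracted = {e for e in contracted if len(e) == 2}
    return (chrom_poly(vertices, deleted, k)
            - chrom_poly(vertices - {v}, contracted, k))

# Triangle K3: P(K3, k) = k(k - 1)(k - 2)
triangle = [(1, 2), (2, 3), (1, 3)]
print([chrom_poly({1, 2, 3}, triangle, k) for k in range(5)])  # [0, 0, 0, 6, 24]
```

Parallel edges created by a contraction collapse automatically because the edges are held in a set, which is harmless here since multi-edges do not affect the chromatic polynomial.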
Figure 6.25 gives an example of the process described in the proof of Theorem 6.16.
As demonstrated, rather than continuing these actions until all leaf nodes
are empty graphs, it is sufficient for leaves to represent graphs with known
chromatic polynomials. In this example, the leaves are path graphs and complete graphs.
Fig. 6.25 Example binary tree formed using the ideas in the proof of Theorem 6.16. In each step,
the edge e is removed and contracted to form the left and right branches, respectively. (The leaves
of the tree shown have chromatic polynomials k(k − 1)^4, k(k − 1)^3, k(k − 1)^3, k(k − 1)^2,
k(k − 1)^3, k(k − 1)^2, and k(k − 1)(k − 2).)

k         0   1   2   3   4    5     6      ...
P(G, k)   0   0   0   6   96   540   1920   ...
These figures also demonstrate that the chromatic number of this graph is three.
This chapter has reviewed a wide range of computational problems related to graph
colouring. Several of our chosen topics, such as face and edge colouring, precolour-
ing, multicolouring, and solving sudoku puzzles, are either equivalent to the graph
(vertex) colouring problem or represent special cases of the problem. They can there-
fore be tackled using the algorithms seen in Chaps. 3–5.
In later sections of this chapter, we have also discussed various extensions to
the graph colouring problem. These have included decentralised and online graph
colouring, dynamic colouring (where graphs are known to evolve), list colouring,
equitable colouring, weighted graph colouring, and chromatic polynomials. Further
extensions using real-world operational research problems are considered in the next
three chapters.
References
1. Kuratowski K (1930) Sur le probleme des courbes gauches en topologie. Fundam Math 15:271–
283
2. Hopcroft J, Tarjan R (1974) Efficient planarity testing. J Assoc Comput Mach 21:549–568
3. Boyer W, Myrvold J (2004) On the cutting edge: simplified O(n) planarity by edge addition.
J Graph Algorithms Appl 8:241–273
4. Heawood P (1890) Map-colour theorems. Q J Math 24:332–338
5. Kempe A (1879) On the geographical problem of the four colours. Am J Math 2:193–200
6. Appel K, Haken W (1977) Solution of the four color map problem. Sci Am 4:108–121
7. Appel K, Haken W (1997) Every planar map is four colorable. Part I. Discharging. Ill J Math
21:429–490
8. Appel K, Haken W (1977) Every planar map is four colorable. Part II. Reducibility. Ill J Math
21:491–567
9. Appel K, Haken W (1989) Every planar map is four colorable. Contemporary Mathematics,
AMS. 978-0-8218-5103-6
10. Robertson N, Sanders D, Seymour P, Thomas R (1997) The four color theorem. J Comb Theory,
Ser B 70:2–44
11. Wilson R (2003) Four colors suffice: how the map problem was solved. Penguin Books
12. de Werra D (1988) Some models of graphs for scheduling sports competitions. Discret Appl
Math 21:47–65
13. Coffman E, Garey M, Johnson D, LaPaugh A (1985) Scheduling file transfers. SIAM J Comput
14(3):744–780
14. Kirkman T (1847) On a problem in combinations. Camb Dublin Math J 2:191–204
15. König D (1916) Gráfok és alkalmazásuk a determinánsok és a halmazok elméletére. Mat
Termtud Értesö 34:104–119
16. Vizing V (1964) On an estimate of the chromatic class of a p-graph. Diskret Analiz 3:25–30
17. Holyer I (1981) The NP-completeness of edge-coloring. SIAM J Comput 10:718–720
18. Misra J, Gries D (1992) A constructive proof of Vizing’s theorem. Inf Process Lett 41:131–133
19. Colbourn C (1984) The complexity of completing partial Latin squares. Discret Appl Math
8(1):25–30
20. Russell E, Jarvis F (2005) There are 5,472,730,538 essentially different Sudoku grids. http://www.afjarvis.staff.shef.ac.uk/sudoku/sudgroup.html, September 2005
21. McGuire G, Tugemann B, Civario G (2012) There is no 16-clue Sudoku: solving the Sudoku
minimum number of clues problem. Comput Res Repos. arXiv:1201.0749
22. Herzberg A, Murty M (2007) Sudoku squares and chromatic polynomials. Not AMS 54(6):708–
717
23. Yato T, Seta T (2003) Complexity and completeness of finding another solution and its appli-
cation to puzzles. IEICE Trans Fundam Electron Commun Comput Sci E86-A:1052–1060
24. Garey M, Johnson D, So H (1976) An application of graph coloring to printed circuit testing.
IEEE Trans Circuits Syst CAS-23:591–599
25. Finocchi I, Panconesi A, Silvestri R (2005) An experimental analysis of simple, distributed
vertex colouring algorithms. Algorithmica 41:1–23
26. Hernández H, Blum C (2014) FrogSim: distributed graph coloring in wireless ad hoc networks.
Telecommun Syst 55:211–223
27. Kearns M, Suri S, Montfort N (2006) An experimental study of the coloring problem on human
subject networks. Science 313:824–827
28. Kierstead H, Trotter W (1981) An extremal problem in recursive combinatorics. Congr Numer
33:143–153
29. Gyárfás A, Lehel J (1988) On-line and first fit colourings of graphs. J Graph Theory 12:217–227
30. Zaker M (2006) Results on the Grundy chromatic number of graphs. Discret Math
306(23):3166–3173
31. Bonnet É, Foucaud F, Kim E, Sikora F (2015) Complexity of Grundy coloring and its variants.
In: Xu D, Du D, Du D (eds) Computing and combinatorics. Springer International Publishing,
pp 109–120. ISBN 978-3-319-21398-9
32. Ouerfelli L, Bouziri H (2011) Greedy algorithms for dynamic graph coloring. In: Proceedings of
the international conference on communications, computing and control applications (CCCA),
pp 1–5. https://doi.org/10.1109/CCCA.2011.6031437
33. Dupont A, Linhares A, Artigues C, Feillet D, Michelon P, Vasquez M (2009) The dynamic
frequency assignment problem. Eur J Oper Res 195:75–88
34. Hardy B (2018) Heuristic methods for colouring dynamic random graphs. PhD thesis, Cardiff
University
35. Hardy B, Lewis R, Thompson J (2018) Tackling the edge dynamic graph colouring problem
with and without future adjacency information. J Heurist 24(3):321–343
36. Furmańczyk H (2004) Equitable coloring of graphs. In: Graph colorings. American Mathematical Society, pp 35–54
37. Hajnal A, Szemerédi E (1970) Proof of a conjecture of P. Erdős. In: Combinatorial theory and its applications. North-Holland, pp 601–623
38. Erdős P (1964) Problem 9. In: Theory of graphs and its applications. Czech Academy of Sciences, p 159
39. Kierstead H, Kostochka A (2008) A short proof of the Hajnal-Szemerédi theorem on equitable
coloring. Comb Probab Comput 17:265–270
40. Kierstead H, Kostochka A, Mydlarz M, Szemerédi E (2010) A fast algorithm for equitable
graph coloring. Combinatorica 30:217–224
41. Meyer W (1973) Equitable coloring. Amer Math Monthly 80:920–922
42. Demange M, de Werra D, Monnot J, Paschos V (2007) Time slot scheduling of compatible
jobs. J Sched 10:111–127
43. Escoffier B, Monnot J, Paschos V (2006) Weighted coloring: further complexity and approx-
imability results. Inf Process Lett 97(3):98–103
44. Kolmogorov V (2009) Blossom V: a new implementation of a minimum cost perfect matching algorithm. Math Program Comput 1(1):43–67
45. Hassin R, Monnot J (2005) The maximum saving partition problem. Oper Res Lett 33:242–248
46. Malaguti E, Monaci M, Toth P (2009) Models and heuristic algorithms for a weighted vertex
coloring problem. J Heurist 15:503–526
47. Lewis R, Anderson T, Carroll F (2020) Can school enrolment and performance be improved by maximizing students' sense of choice in elective subjects? J Learn Anal 7(1):75–87
48. Aardal K, van Hoesel S, Koster A, Mannino C, Sassano A (2002) Models and solution techniques for frequency assignment problems. 4OR: Q J Belgian, French and Italian Oper Res Soc 1(4):1–40
49. McDiarmid C, Reed B (2000) Channel assignment and weighted coloring. Networks
36(2):114–117
50. Caramia M, Dell’Olmo P (2001) Solving the minimum-weighted coloring problem. Networks
38(2):88–101
51. Mehrotra A, Trick M (2007) A branch-and-price approach for graph multi-coloring. In: Extending the horizons: advances in computing, optimization, and decision technologies. Operations research/computer science interfaces series, vol 37. Springer, pp 15–29
52. Birkhoff G (1912) A determinant formula for the number of ways of coloring a map. Ann Math
14(1/4):42–46
53. Whitney H (1932) The coloring of graphs. Ann Math 33:688–718
54. Zhang J (2018) An introduction to chromatic polynomials. https://math.mit.edu/~apost/courses/18.204_2018/Julie_Zhang_paper.pdf, May 2018
7 Designing Seating Plans
The following three chapters of this book contain detailed case studies showing how
graph colouring methods can be used to successfully tackle important real-world
problems. The first of these case studies concerns the task of designing table plans
for large parties, which, as we will see, combines elements of the N P -hard edge-
weighted graph colouring problem, the equitable graph colouring problem and the
k-partition problem. A user-friendly implementation of the algorithm proposed in
this section can also be found online at http://www.weddingseatplanner.com, which
is viewed using Adobe Flash Player.
Consider an event such as a wedding or gala dinner where, as part of the formalities,
the N guests need to be divided on to k dining tables. To ensure that guests are sat
at tables with appropriate company, it is often necessary for organisers to specify a
seating plan, taking into account the following sorts of factors:
• Guests belonging to groups, such as couples and families, should be sat at the same
tables, preferably next to each other.
• If there is any perceived animosity between different guests, these should be sat
on different tables. Similarly, if guests are known to like one another, it may be
desirable for them to be sat at the same table.
• Some guests might be required to sit at a particular table. Also, some guests might
be prohibited from sitting at other tables.
• Since tables may vary in size and shape, each table should be assigned a suitable
number of guests, and these guests should be appropriately arranged around the
table.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 203
R. M. R. Lewis, Guide to Graph Colouring, Texts in Computer Science,
https://doi.org/10.1007/978-3-030-81054-2_7
A naïve method for producing a seating plan best fitting these sorts of criteria might
be to consider all possible plans and then choose the one perceived to be the most
suitable. However, for non-trivial values of N or k, the number of possible solutions
will be prohibitively large for this to be possible. To illustrate, consider a simple
example where we have 48 guests using 6 tables, with exactly 8 guests per table.
For the first table, we need to choose 8 people from the 48, for which there are
C(48, 8) = 377,348,994 possible choices. For the next table, we then choose 8 further
people from the remaining 40, giving C(40, 8) = 76,904,685 further choices, and so
on. Assuming that N is a multiple of k (allowing equal-sized tables), the number of
possible plans is thus calculated:

∏_{i=0}^{k−2} C(N − iN/k, N/k)
    = N! / ( (N/k)! (N − N/k)! ) × (N − N/k)! / ( (N/k)! (N − 2N/k)! ) × ··· × (N − (k−2)N/k)! / ( (N/k)! (N − (k−1)N/k)! )
    = N! / ( ((N/k)!)^{k−1} (N − (k−1)N/k)! )
    = N! / ((N/k)!)^k        (7.1)
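Equation (7.1) can be sanity-checked numerically by comparing the closed form against the telescoping product of binomial coefficients (a short illustrative script, not from the book):

```python
from math import comb, factorial

def num_seating_plans(N, k):
    """Number of ways to split N guests across k distinguishable,
    equal-sized tables (N must be a multiple of k), per Eq. (7.1)."""
    assert N % k == 0
    return factorial(N) // factorial(N // k) ** k

# The telescoping product: choose N/k guests for each table in turn.
N, k = 48, 6
product, remaining = 1, N
for _ in range(k):
    product *= comb(remaining, N // k)
    remaining -= N // k

print(comb(48, 8))                           # 377348994 choices for the first table
print(num_seating_plans(48, 6) == product)   # True
```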
Fig. 7.1 a A graph G in which edges specify pairs of guests who should not be sat in adjacent
seats; b the complement graph Ḡ, together with a Hamiltonian cycle (shown in bold); and c the
corresponding seating arrangement around a circular table
In its simplest form, the problem of constructing a seating plan might be defined
using an N × N binary matrix W, where element Wi j = 1 if guests i and j are
required to be sat at different tables and Wi j = 0 otherwise. We can also assume that
Wi j = W ji . Given this input matrix, our task might then be to partition the N guests
into k subsets S = {S_1, ..., S_k}, such that the objective function

f(S) = Σ_{t=1}^{k}  Σ_{i,j ∈ S_t : i<j}  W_ij        (7.2)

is minimised.
The problem of confirming the existence of a zero-cost solution to this problem
is equivalent to the N P -complete decision variant of the graph colouring problem.
Here, the graph G = (V, E) is defined using the vertex set V = {v1 , . . . , v N } and
the edge set E = {{vi , v j } : Wi j = 1 ∧ vi , v j ∈ V }. That is, each guest corresponds
to a vertex, and two vertices vi and v j are considered to be adjacent if and only if
Wi j = 1. Colours correspond to tables, and we are now interested in colouring G
using k colours.
From an alternative perspective, consider the situation where we are again given
the binary matrix W, and are now presented with a subset S of guests that have
been assigned to a particular circular-shaped table. Here, we might be interested
in arranging the guests onto the table such that, for all pairs of guests i, j ∈ S, if
Wi j = 1 then i and j are not sat in adjacent seats. This problem can also be described
by a graph G = (V, E) for which the vertex set V = {vi : i ∈ S} and the edge set
E = {{vi , v j } : Wi j = 1 ∧ vi , v j ∈ V }. A Hamiltonian cycle of the complement
graph Ḡ defines a seating arrangement satisfying this criterion, as illustrated in the
example in Fig. 7.1. However, determining the existence of a Hamiltonian cycle in
an arbitrary graph is also an N P -complete problem.
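For small tables, such an arrangement can nevertheless be found by brute force: fix one guest and try every ordering of the rest, accepting the first in which no two adjacent guests are forbidden neighbours. This is simply an exhaustive search for a Hamiltonian cycle in the complement graph (an illustrative sketch of our own; it is exponential in the table size):

```python
from itertools import permutations

def circular_seating(W, guests):
    """Brute-force search for a seating order around one circular table in
    which no adjacent pair i, j has W[i][j] == 1; equivalent to finding a
    Hamiltonian cycle in the complement graph."""
    first = guests[0]                        # fix one guest to break symmetry
    for perm in permutations(guests[1:]):
        order = [first, *perm]
        if all(W[order[i]][order[(i + 1) % len(order)]] == 0
               for i in range(len(order))):
            return order
    return None                              # no valid arrangement exists

# Four guests; guests 0 and 1 must not sit side by side.
W = [[0] * 4 for _ in range(4)]
W[0][1] = W[1][0] = 1
print(circular_seating(W, [0, 1, 2, 3]))     # [0, 2, 1, 3]
```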
In practical situations, it might be preferable for W to be an integer or real-valued
matrix instead of binary, allowing users to place greater importance on some of
their seating preferences compared to others. Assuming that lower values for Wi j
indicate an increased preference for guests i and j to be sat together, the problem of
partitioning the groups on to k tables now becomes equivalent to the edge-weighted
graph colouring problem, while the task of arranging people on to circular tables in
the manner described above becomes equivalent to the travelling salesman problem.
Of course, both of these problems are also N P -complete since they generalise the
graph k-colouring problem and the Hamiltonian cycle problem, respectively. Also,
the problem of arranging guests around tables can become even more complicated
when tables of different shapes are used. For example, with rectangular tables, we
might also need to take into account who is sat opposite a guest in addition to their
neighbours on either side.
In this chapter, we describe a formulation of the above problem that is closely related
to graph colouring, but which also involves the additional constraint of grouping to-
gether guests who like one another while also maintaining appropriate numbers of
guests per table. This problem interpretation is used in conjunction with the com-
mercial website http://www.weddingseatplanner.com, which contains a free tool for
inputting and solving instances of the problem. The reader is invited to try out this
tool while reading this chapter.
It is stated by Nielsen [1] that users tend to leave a website in less than two minutes
if it is not understood or perceived to fulfil their needs. Consequently, the particular
problem formulation considered here is intended to strike the right balance between
being quickly accessible to users while still being useful and flexible in practice. Since
users of the website will typically have little knowledge of optimisation algorithms
and the implications of problem intractability, the algorithm is also designed to
supply the user with high-quality (though not necessarily optimal) solutions in very
short amounts of run time (typically less than 3 seconds). In particular, our approach
seeks to exploit the underlying graph-based structures of this problem, encouraging
effective navigation of the solution space via specialised neighbourhood operators.
7.2 Problem Definition

In our definition of this problem, we choose to first partition the N guests into n ≤ N
guest groups. Each guest group refers to a subset of guests who are required to sit
together (couples, families with young children, etc.) and will usually be known
beforehand by the user. In addition to making the problem smaller, specifying guest
groups in this way also means that users do not have to subsequently input preferences
between pairs of people in the same families, etc., to ensure that they are sat together
in a solution.

Having done this, the wedding seating problem (WSP) can now be formally stated
as a type of graph partitioning problem. Specifically, we are given a graph G =
(V, E) in which each vertex v ∈ V represents a guest group. The size of each
guest group is denoted by sv. The total number of guests in the problem is thus
N = Σ_{v∈V} sv.
In G, each edge {u, v} ∈ E defines the relationship between vertices u and v
according to a weighting wuv (where wuv = wvu ). If wuv > 0 this is interpreted to
mean that we would prefer the guests associated with vertices u and v to be sat on
different tables. Larger values for wuv reflect a strengthening of this requirement.
Similarly, negative values for wuv mean that we would rather u and v were assigned
to the same table.
A solution to the WSP is now defined as a partition of the vertices into k subsets
S = {S1 , . . . , Sk }. The requested number of tables k is defined by the user, with each
subset Si defining the guests assigned to a particular table.
Under this definition of the problem, the quality of a particular candidate solution
might be calculated according to various metrics. In our case, we use two objective
functions, both of which are to be minimised. The first of these is analogous to Eq. (7.2):

f_1 = Σ_{i=1}^{k}  Σ_{u,v ∈ S_i : {u,v} ∈ E}  (s_v + s_u) w_uv        (7.3)
and reflects the extent to which the rules governing who sits with whom are obeyed.
In this case, the weighting w_uv is multiplied by the total size of the two guest groups
involved, s_v + s_u. This is done so that violations involving larger numbers of people
contribute more to the cost (i.e., it is assumed that sv people have expressed a seating
preference concerning guest group u, and su people have expressed a preference
concerning guest group v).
The second objective function used in our model is intended to encourage equal
numbers of guests being assigned to each table. In practice, some weddings may have
varying sized tables, and nearly all weddings will have a special “top table” where
the bride, groom, and their associates sit. The top table and its guests can be ignored
in this particular formulation because they can easily be added to a table plan once
the other guests have been arranged. We also choose to assume that the remaining
tables are equal in size, which seems to be a very common option, particularly for
large venues. Consequently, the second objective function measures the degree to
which the number of guests per table deviates from the required number of either
⌊N/k⌋ or ⌈N/k⌉:

f_2 = Σ_{i=1}^{k}  min( |τ_i − ⌊N/k⌋| , |τ_i − ⌈N/k⌉| )        (7.4)

Here, τ_i = Σ_{v∈S_i} s_v denotes the number of guests assigned to each table i.
Obviously, if the number of guests N is a multiple of k, then Eq. (7.4) simplifies to
f_2 = Σ_{i=1}^{k} |τ_i − N/k|.
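The two objectives can be evaluated directly from a candidate partition. In the sketch below (names and data structures are our own), edge weights are stored sparsely, with each pair keyed as (min(u, v), max(u, v)):

```python
from math import ceil, floor

def wsp_objectives(S, size, w, N, k):
    """Evaluate Eqs. (7.3) and (7.4) for a partition S = [S_1, ..., S_k]
    of guest groups. size[v] gives s_v; w[(u, v)] (with u < v) gives the
    edge weight w_uv, and absent pairs are treated as weight 0."""
    f1 = sum((size[u] + size[v]) * w.get((min(u, v), max(u, v)), 0)
             for table in S
             for u in table for v in table if u < v)
    f2 = sum(min(abs(t - floor(N / k)), abs(t - ceil(N / k)))
             for t in (sum(size[v] for v in table) for table in S))
    return f1, f2

# Three groups (sizes 2, 1, 3) on two tables; groups 0 and 1 share a table
# despite a separation weight of 1, so f1 = (2 + 1) * 1 = 3; both tables
# hold exactly 3 guests, so f2 = 0.
print(wsp_objectives([[0, 1], [2]], {0: 2, 1: 1, 2: 3}, {(0, 1): 1}, 6, 2))  # (3, 0)
```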
We now show that this problem is N P -hard. We do this by showing that it generalises
two classical N P -hard problems: the k-partition problem, and the equitable graph
k-colouring problem. Let us first define the k-partition problem.
Note that the k-partition problem is also sometimes known as the load balancing
problem, the equal piles problem, or the multiprocessor scheduling problem.
Proof Let G = (V, E). If E = ∅ then f 1 (Eq. (7.3)) equals zero for all solutions.
Hence, the only goal is to ensure that the number of guests per table is equal (or as
close to equal as possible). Consequently, the problem is equivalent to the N P -hard
k-partition problem.
From another perspective, let sv = 1 ∀v ∈ V and let wuv = 1 ∀{u, v} ∈ E.
The number of guests assigned to each table i therefore equals |Si |. This special
case is equivalent to the N P -hard optimisation version of the equitable k-colouring
problem (see Sect. 6.8).
7.3 Problem Interpretation and Tabu Search Algorithm

On entering the website, the user is first asked to input (or import) the names of all
guests into an embedded interactive table. Guest groups that are to be seated together
(families, etc.) are placed on the same rows of the table, thus defining the various
values for sv. Guests to be sat at the top table are also specified. At the next step,
the user is then asked to define seating preferences between different guest groups.
Since guests to be sat at the top table have already been given, constraints only need
to be considered between the remaining guest groups (guests at the top table are
essentially ignored from this point onwards).

Figure 7.2 shows a small example of this process. Here, nine guest groups ranging
in size from 1 to 4 have been input, though one group of four has been allocated to
the top table. Consequently, only the remaining eight groups (comprising N = 20
guests) are considered. The right-hand grid then shows how the seating preferences
(values for wuv) are defined between these. On the website, this is done interactively
by clicking on the relevant cells in the grid.

Fig. 7.2 Specification of guest groups (a) and seating preferences (b)

In our case, users are limited to three
options: (1) “Definitely Apart” (e.g., John and Pat); (2) “Rather Apart” (Pat and
Ruth); and (3) “Rather Together” (John and Ken). These are allocated weights of
∞, 1, and −1, respectively, for reasons that will be made clear below. Note that it
would have been possible to allow the user to input their own arbitrary weights here;
however, while being more flexible, it was felt by the website’s interface designers
that this ran the risk of bamboozling the user while not improving the effectiveness
of the tool [2].
Once the input to the problem has been defined by the user, the overall strategy of
our algorithm is to classify the requirements of the problem as either hard (mandatory)
constraints or soft (optional) constraints. In our case, we consider just one hard
constraint, which we attempt to satisfy in Stage 1—specifically the constraint that
all pairs of guest groups required to be “Definitely Apart” are assigned to different
tables. In Stage 2, the algorithm then attempts to reduce the number of violations of
the remaining constraints via specialised neighbourhood operators that do not allow
any of the hard constraints satisfied in Stage 1 to be re-violated. The two stages of
the algorithm are now described in more detail.
7.3.1 Stage 1
In Stage 1, the algorithm operates on the subgraph G′ = (V, E′), where each vertex
v ∈ V represents a guest group, and the edge set E′ = {{u, v} ∈ E : wuv = ∞}.
In other words, the graph G′ contains only those edges from the original graph G
that define the “Definitely Apart” requirement. Using this subgraph, the problem of
assigning all guests to k tables (while not violating the “Definitely Apart” constraint)
is equivalent to finding a feasible k-colouring of G′.
In our case, an initial solution is produced using the variant of the DSatur heuristic
used with the equitable graph colouring problem in Sect. 6.8. Starting with k empty
colour classes (tables), each vertex (guest group) is taken in turn according to the
DSatur heuristic and assigned to the feasible colour class containing the fewest
vertices, breaking ties randomly. If no feasible colour exists for a vertex then it is
kept to one side and is assigned to a random colour at the end of this process, thereby
introducing violations of the hard constraint.
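The constructive step just described can be sketched as follows. This is a minimal illustration, not the book's implementation; the function name and data layout are assumptions:

```python
import random

def equitable_dsatur(n, conflicts, k):
    """Greedily assign vertices (guest groups) 0..n-1 to k colour classes
    (tables).  conflicts[v] holds the vertices that must be 'Definitely
    Apart' from v.  Vertices are taken in order of saturation degree and
    placed in the feasible class with the fewest members; vertices with no
    feasible class are set aside and coloured randomly at the end, which
    may introduce hard-constraint violations."""
    colour = [None] * n
    classes = [set() for _ in range(k)]
    uncoloured, set_aside = set(range(n)), []
    while uncoloured:
        # saturation degree = number of distinct colours among neighbours
        def sat(v):
            return len({colour[u] for u in conflicts[v]
                        if colour[u] is not None})
        v = max(uncoloured, key=lambda x: (sat(x), len(conflicts[x])))
        uncoloured.remove(v)
        feasible = [c for c in range(k)
                    if all(colour[u] != c for u in conflicts[v])]
        if feasible:
            c = min(feasible, key=lambda c: (len(classes[c]), random.random()))
            colour[v] = c
            classes[c].add(v)
        else:
            set_aside.append(v)
    for v in set_aside:
        colour[v] = random.randrange(k)
    return colour
```

When no vertex conflicts exist, the fewest-members rule alone produces perfectly balanced tables.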
If the solution produced by the above constructive process contains hard constraint
violations, an attempt is then made to eliminate them using TabuCol (see Sect. 5.1).
As we saw in Chap. 5, this algorithm can often be outperformed by other approaches
in terms of the quality of solution it produces, but it does have the advantage of
being very fast, which is an important requirement in this application. Consequently,
TabuCol is only run for a fixed number of iterations, specifically 20n.
If at the end of this process a feasible k-colouring for G′ has not been achieved, k
is incremented by 1, and Stage 1 of the algorithm is repeated. Of course, this might
occur because the user has specified a k-value for which no k-colouring exists (that
is, k < χ(G′)) or it might simply be that a solution does exist, but that the algorithm
has been unable to find it in the given computation limit. The process of incrementing
k and reapplying DSatur and TabuCol continues until all of the hard constraints
have been satisfied, resulting in a feasible colouring of G′.
7.3.2 Stage 2
Fig. 7.4 Procedure for efficiently evaluating all possible Kempe chain interchanges in a solution S
Kempe(v7 , 2, 1), Kempe(v8 , 2, 1), and Kempe(v9 , 1, 2). Of course, only one of these
combinations needs to be considered at each iteration.
To achieve these speed-ups, an additional n × k matrix K can be used where, given
a vertex v ∈ Si, each element Kvj is used to indicate the size of the Kempe chain
formed via Kempe(v, i, j). This matrix is populated in each iteration of tabu search
according to the steps shown in Fig. 7.4. As can be seen here, initially all elements
of K are set to zero. The algorithm then considers each vertex v ∈ Si in turn (for
1 ≤ i ≤ k) and, according to Step (5), only evaluates an interchange involving the
chain Kempe(v, i, j) if the same set of vertices has not previously been considered.
If a new Kempe chain is identified, the cost of performing this interchange is then
evaluated (Step (6)), and the matrix K is updated to make sure that this interchange
is not evaluated again in this iteration (Steps (7–9)).
Finally, after the evaluation of all possible Kempe chain interchange moves, the
information in K can also be used to quickly identify all possible moves achievable
via the pair swap operator. Specifically, for each v ∈ Si (for 1 ≤ i ≤ k − 1) and each
u ∈ Sj (for i + 1 ≤ j ≤ k), pair swaps will only occur where both Kvj = 1 and
Kui = 1.
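The bookkeeping described above can be sketched as follows, assuming chains are grown by breadth-first search (an illustrative reconstruction of the procedure in Fig. 7.4, not the author's code):

```python
from collections import deque

def kempe_chain(v, i, j, colour, adj):
    """Vertices reachable from v by alternating between colours i and j."""
    chain, queue = {v}, deque([v])
    while queue:
        u = queue.popleft()
        other = j if colour[u] == i else i
        for w in adj[u]:
            if colour[w] == other and w not in chain:
                chain.add(w)
                queue.append(w)
    return chain

def all_kempe_moves(colour, adj, k):
    """Enumerate each distinct Kempe chain exactly once per iteration.
    K[v][j] stores the size of the chain Kempe(v, colour[v], j) once it
    has been evaluated, so the same chain is skipped when it is reached
    again from another of its member vertices."""
    n = len(colour)
    K = [[0] * k for _ in range(n)]
    moves = []
    for v in range(n):
        i = colour[v]
        for j in range(k):
            if j == i or K[v][j] > 0:
                continue
            chain = kempe_chain(v, i, j, colour, adj)
            moves.append((v, i, j, frozenset(chain)))
            for u in chain:               # mark every member of the chain
                b = j if colour[u] == i else i
                K[u][b] = len(chain)
    return moves
```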
since violations of the hard constraints cannot occur. Although such an aggregate
function is not wholly ideal (because it involves adding together two different forms
of measurement) it is acceptable in our case because, in some sense, both metrics
relate to the number of people affected by the violations—that is, a table that is
considered to have x too many (or too few) people will garner the same penalty cost
as x violations of the “Rather Apart” constraint.
Finally, the speed of this algorithm can also be further increased by observing
that (a) the cost functions f 1 and f 2 only involve the addition of terms relating
to the quality of each separate colour class (table), and (b) neighbourhood moves
with this algorithm only affect two colours. These features imply that if a move
involving colours i and j is made in iteration l of the algorithm, then in iteration
l + 1, the cost changes involved with moves using any pair of colours from the set
({1, . . . , k}−{i, j}) will not have changed and do therefore not have to be recalculated
by the algorithm.
7.4 Algorithm Performance

In this section, we analyse the performance of our two-stage tabu search algorithm
in terms of both computational effort and the costs of its resultant solutions.
The algorithm and interface described above were implemented in ActionScript
3.0 and are executed via a web browser (an installation of Adobe Flash Player is
required). The optimisation algorithm is therefore run at the client-side. To ensure
run times are kept relatively short, and to also allow the interface to be displayed
clearly on the screen, problem size has been limited to n = 50, with guest groups of
up to 8 people, allowing a maximum of N = 400 guests.
To gain an understanding of the performance characteristics of this algorithm, a
set of maximum-sized problem instances of n = 50 guest groups (vertices) were
constructed, with the size of each group chosen uniformly at random in the range 1–8
giving N ≈ 50 × 4.5 = 225. These instances were then modified such that each pair
of vertices was joined by an ∞-weighted edge with probability p, meaning that a
proportion of approximately p guest group pairs would be required to be “Definitely
Apart”. Tests were then carried out using values of p ∈ {0.0, 0.3, 0.6, 0.9} with
numbers of tables k ∈ {3, 4, . . . , 40}.
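A generator for instances of this kind might look as follows (a sketch consistent with the description above; the function name and defaults are assumptions, not the authors' test harness):

```python
import random

def make_instance(n=50, p=0.3, seed=0):
    """Random test instance: n guest groups with sizes drawn uniformly
    from 1..8, and an infinity-weighted ('Definitely Apart') edge between
    each pair of groups independently with probability p."""
    rng = random.Random(seed)
    sizes = [rng.randint(1, 8) for _ in range(n)]
    hard_edges = {frozenset((u, v))
                  for u in range(n) for v in range(u + 1, n)
                  if rng.random() < p}
    return sizes, hard_edges
```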
Figure 7.5 shows the results of these tests with regard to the costs that were
achieved by the algorithm at termination. Note that for p ≥ 0.3, values are not
reported for the lowest k values because feasible k-colourings were not achieved
(quite possibly because they do not exist). The figure indicates that, with no hard
constraints ( p = 0.0), balanced table sizes have been achieved for all k-values up to
30. From this point onwards, however, it seems there are simply too many tables
(and too few guests per table) to spread the guest groups equally. Higher costs are
also often incurred for larger values of p because, in these cases, many guest group
combinations (including many of those required for achieving low-cost solutions)
will now contain at least one hard constraint violation, meaning that they cannot be
assigned to the same table. That said, similar solutions are achieved for p = 0.0,
0.3, and 0.6 for various values of k, suggesting that the cost of the best solutions
found is not unduly affected by the presence of moderate levels of hard constraints.
The exception to this pattern, as shown in the figure, is for the smallest achievable
values for k. Here, the larger number of guest groups per table makes it more likely
that combinations of guest groups will be deemed infeasible, reducing the number
of possible feasible solutions and making the presence of a zero-cost solution less
likely.

Fig. 7.5 Solution costs for four values of p using various k-values
Figure 7.6 now shows the effect that variations in p and k have on the neigh-
bourhood sizes encountered in Stage 2, together with the overall run times of the
algorithm. For unconstrained problem instances ( p = 0.0) all Kempe chains are of
size 1, and all pairs of vertices in different colours qualify for a pair swap. Hence, the
number of distinct moves available for each operator are n(k − 1) and approximately
(n(n − n/k))/2, respectively. However, for more constrained problems (lower k’s
and/or larger p’s), the number of neighbouring solutions is lower. This means that
a smaller number of evaluations need to take place at each iteration of tabu search,
resulting in shorter run times. The exception to this pattern is for low values of k
using p = 0.0, where the larger numbers of guests per table require more overheads
in the calculation of Kempe chains and the cost function, resulting in increased run
times.
Finally, it is also instructive to consider the proportion of Kempe chains that are
seen to be total during runs of the tabu search algorithm. Recall from Sect. 5.5
that a Kempe chain Kempe(v, i, j) is described as total when Kempe(v, i, j) =
(Si ∪ S j ): that is, the graph induced by the set of vertices Si ∪ S j forms a connected
bipartite graph. (Consider, for example, the chain Kempe(v3 , 4, 2) from Fig. 7.3.)
Interchanging the colours of vertices in a total Kempe chain serves no purpose since
this only results in the labels of the two colour classes being swapped, leading to
no changes in the objective function. Figure 7.7 shows these proportions for the four
considered problem instances. We see that total Kempe chains are more likely to
occur with higher values of p (due to the greater connectivity of the graphs), and
for lower values of k (because the vertices of the graph are more likely to belong
to one of the two colours being considered). Indeed, for p = 0.9 and k = 22, we
see that all Kempe chains considered by the algorithm are total, meaning that the
neighbourhood operator is ineffective in this case.

Fig. 7.6 Average number of neighbouring solutions per iteration of the tabu search algorithm
(top); and average run times of the algorithm (bottom) for the four problem instances using various
k-values (using a 3.0 GHz Windows 7 PC with 3.87 GB RAM)
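The totality check, and the reason such interchanges are useless, can be illustrated with a small sketch (names are assumptions):

```python
def is_total(chain, colour, i, j):
    """A Kempe chain is total when it equals S_i ∪ S_j, in which case
    interchanging its colours merely swaps the two class labels."""
    return set(chain) == {v for v, c in enumerate(colour) if c in (i, j)}

def interchange(chain, colour, i, j):
    """Swap colours i and j on the vertices of the chain."""
    swap = {i: j, j: i}
    return [swap[c] if v in chain and c in swap else c
            for v, c in enumerate(colour)]
```

Interchanging a total chain returns the same partition with the two colour labels exchanged, so the objective function is unchanged.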
7.5 Comparison to an IP Model

In this section, we now compare the results achieved by our two-stage tabu search
algorithm to those of a commercial integer programming (IP) solver. As we saw
in Sect. 4.1.2, one of the advantages of using an IP approach is that, given excess
time, we can determine with certainty the optimal solution to a problem instance
(or, indeed, whether a feasible solution exists). In contrast to our tabu search-based
method, the IP solver is, therefore, able to provide the user with a certificate of
optimality and/or infeasibility, at which point it can be halted.

Fig. 7.7 Proportion of Kempe chains seen to be total for the four problem instances using various
k-values

Of course, due to the
underlying intractability of the WSP, these certificates will not always be produced
in reasonable time, but given the relatively small problem sizes being considered in
this chapter, it is still pertinent to ask how often this is the case and to also compare
the quality of the IP solver’s solutions to our tabu search approach under similar time
limits.
The WSP can be formulated as an IP problem as follows. Recall that there are n guest
groups that we seek to partition onto k tables. Accordingly, the seating preferences
of guests can be expressed using a symmetric n × n matrix W, where:
Wij = ∞   if we require guest groups i and j to be “Definitely Apart”;
      1   if we would prefer i and j to be on different tables (“Rather Apart”);
      −1  if we would prefer i and j to be on the same table (“Rather Together”);
      0   otherwise.                                                        (7.5)
As before, we also let si define the size of each guest group i ∈ {1, . . . , n}. A solution
to the problem can then be represented by an n × n binary matrix X, where:

Xit = 1 if guest group i is assigned to table t;
      0 otherwise,                                                          (7.6)
and a binary vector Y of length n, where:

Yt = 1 if at least one guest group is assigned to table t;
     0 otherwise.                                                           (7.7)
Here, Constraints (7.8)–(7.13) stipulate the hard constraints of the problem; hence,
a solution satisfying these can be considered feasible. This IP formulation is essen-
tially the same as the graph colouring formulation seen in Sect. 4.1.2.4, except that
the required number of colours k is specified as a constraint. Equation (7.8) states
that each guest group (vertex) should be assigned to exactly one table (colour), while
(7.9) specifies that no pair of guest groups should be assigned to the same table if
they are subject to a “Definitely Apart” constraint, with Yt = 1 when at least one
guest group has been assigned to table t. Equations (7.10) and (7.11) then ensure that
a maximum of k tables are used. Finally, (7.12) and (7.13) impose the anti-symmetry
constraints.
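Although Constraints (7.8)–(7.13) themselves are not reproduced in this excerpt, the conditions described above can be checked directly; a sketch (the anti-symmetry constraints (7.12)–(7.13) are omitted since they only break symmetry and do not affect feasibility):

```python
def is_feasible(X, Y, W, k):
    """Check a candidate (X, Y) against the hard constraints described
    above: each guest group sits at exactly one table (7.8); 'Definitely
    Apart' pairs (W[i][j] = infinity) use different tables (7.9); Y[t] = 1
    whenever table t is used; and at most k tables are used."""
    n = len(X)
    inf = float("inf")
    if any(sum(row) != 1 for row in X):               # one table per group
        return False
    for t in range(n):
        if any(X[i][t] for i in range(n)) and not Y[t]:
            return False                              # Y must flag used tables
    if sum(Y) > k:                                    # at most k tables
        return False
    for i in range(n):
        for j in range(i + 1, n):
            if W[i][j] == inf and any(X[i][t] and X[j][t] for t in range(n)):
                return False                          # 'Definitely Apart'
    return True
```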
As with the tabu search algorithm, the quality of a feasible candidate solution
in the IP model is quantified using the sum of the two previously defined objective
functions ( f 1 + f 2 ). For the IP model, f 1 is rewritten as
f1 = Σ(t=1 to k) Σ(i=1 to n−1) Σ(j=i+1 to n) Xit Xjt (si + sj) Wij          (7.14)
in order to cope with the binary matrix method of solution representation; however,
it is equivalent in form to Eq. (7.3). Similarly, f2 in the IP model is defined in the
same manner as Eq. (7.4), except that τi is now calculated as τi = Σ(j=1 to n) Xji sj.
Again, this is equivalent to Eq. (7.4).
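The equivalence can be sanity-checked numerically. In the sketch below, the partition form stands in for Eq. (7.3), assuming that (7.3) sums the penalties (si + sj)Wij over pairs of guest groups sharing a table:

```python
def f1_matrix(X, W, s, k):
    """Eq. (7.14): f1 = sum over t and i < j of X_it X_jt (s_i + s_j) W_ij."""
    n = len(s)
    return sum(X[i][t] * X[j][t] * (s[i] + s[j]) * W[i][j]
               for t in range(k) for i in range(n) for j in range(i + 1, n))

def f1_partition(tables, W, s):
    """The same penalty computed from the partition representation."""
    return sum((s[i] + s[j]) * W[i][j]
               for table in tables
               for i in table for j in table if i < j)
```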
It is worth noting here that the objective function defined in Eq. (7.14) actu-
ally contains a quadratic term, making our proposed mathematical model a binary
quadratic integer program. Although modern commercial IP solvers such as Xpress
and CPLEX can cope with such formulations, the use of quadratic objective func-
tions is sometimes thought to hinder performance. One way to linearise this model
is to introduce an additional auxiliary binary variable:
Zijt = 1 if guest groups i and j are both assigned to table t;
       0 otherwise.                                                         (7.15)
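The linearisation constraints are not shown in this excerpt; the standard product linearisation (an assumption here) would add Zijt ≤ Xit, Zijt ≤ Xjt, and Zijt ≥ Xit + Xjt − 1, which pin Zijt to the product XitXjt whenever the X-values are binary:

```python
def z_interval(x_it, x_jt):
    """Interval of Z_ijt values permitted by the standard linearisation
    constraints Z <= X_it, Z <= X_jt and Z >= X_it + X_jt - 1."""
    lo = max(0, x_it + x_jt - 1)
    hi = min(x_it, x_jt)
    return lo, hi

# for binary X-values the interval collapses to the product itself,
# so the linear model agrees with the quadratic objective (7.14)
for a in (0, 1):
    for b in (0, 1):
        assert z_interval(a, b) == (a * b, a * b)
```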
7.5.1 Results
In our experiments, both IP formulations were tested using the commercial software
Xpress. We repeated the experiments of Sect. 7.4 using two time limits: 5 s,
which was approximately the longest time required by our tabu search algorithm (see
Fig. 7.6); and 600 s, to gain a broader view of the IP solver’s capabilities with these
formulations. Across the 152 combinations of p and k, under the five-second limit, the
linear model produced feasible solutions for just 11 cases compared to the quadratic
model’s 112. Similarly, the number of cases where certificates of infeasibility were
returned were 24 and 26, respectively. The underperformance of the linear model in
these cases may well be due to the much larger number of variables and constraints
involved, which seems to present difficulties under this very strict time limit. That
said, even though the models’ results became more similar under the 600 s time limit,
the costs returned by the linear model were still consistently worse than the quadratic
model’s. Consequently, only the results from the quadratic model are considered for
the remainder of this section.
The results of the trials are summarised in Fig. 7.8. The circled lines in the left of
the graphs indicate values of k where certificates of infeasibility were produced by
the IP solver under the two time limits. As might be expected, these certificates are
produced for a larger range of k-values when the longer time limit is used; however,
for p ∈ {0.3, 0.6} there remain values of k for which feasible solutions have not
been produced (by any algorithm) and where certificates of infeasibility have not
been supplied. Thus, we are none the wiser as to whether feasible solutions exist for
these particular k-values. Also, note that certificates of optimality were not provided
by the IP solver in any of the trials conducted.
Figure 7.8 shows that, under the 5 s limit, the IP approach has produced solutions
of inferior quality compared to tabu search in all cases. Also, the IP method has
failed to achieve feasible solutions in seven of the 121 cases where tabu search has
been successful. When the extended run time limit is applied, this performance gap
diminishes, but similar patterns still emerge. We see that the tabu search algorithm
has produced feasible solutions whenever the IP approach has, plus three further
cases.

Fig. 7.8 Comparison of solution costs achieved using the IP solver (using two different time limits)
and the tabu search-based approach for p = 0.0, p = 0.3, p = 0.6, and p = 0.9, respectively

Also, in the 119 cases where both algorithms have achieved feasible solutions,
tabu search has produced superior quality solutions in 94 cases, compared to the IP
method’s six. However, we must bear in mind that the IP solver has required more
than 400 times the CPU time of tabu search to achieve these particular solutions,
making it much less suitable for an online tool.
• If guest group v is not permitted to sit at table i, then an edge of weight ∞ can be
imposed between vertex v and the ith table vertex.
• If a guest group v must be assigned to table i, then edges of weight ∞ can be
imposed between vertex v and all table vertices except the ith table vertex.
1 That is, the graph G′ used in Stages 1 and 2 would comprise edge set E′ = {{u, v} ∈ E : wuv ≥ c}.
Note that if our model is to be extended in this way, we will now be associating
each subset of guest groups Si in a solution S = {S1 , . . . , Sk } with a particular table
number i. Hence, we might also permit tables of different sizes and shapes into the
model, perhaps incorporating constraints concerning these factors into the objective
function. Extensions of this nature will be considered with a different problem in the
next chapter.
References
1. Nielsen J (2004) The need for web design standards. https://www.nngroup.com/articles/the-need-for-web-design-standards/, September 2004
2. Carroll F, Lewis R (2013) The “engaged” interaction: important considerations for the HCI
design and development of a web application for solving a complex combinatorial optimization
problem. World J Comput Appl Technol 1(3):75–82
8 Designing Sports Leagues
In this chapter, our case study considers the applicability of graph colouring methods
for producing round-robin tournaments. These are particularly common in sports
competitions. As we will see, the task of producing valid round-robin tournaments
is relatively straightforward, but things become more complicated when additional
constraints are added to the problem. The initial sections of this chapter focus on the
problem of producing round-robins in general terms and examine the relationship
between this problem and graph colouring. A detailed real-world case study that
makes use of various graph colouring techniques is then presented in Sect. 8.6.
Round-robin schedules are used in many sports tournaments and leagues across
the globe, including the Six Nations Rugby Championships, various European and
South American domestic soccer leagues, and the England and Wales County Cricket
Championships. Round-robins are schedules involving t teams, where each team is
required to play all other teams exactly l times within a fixed number of rounds. The
most common types are single round-robins, where l = 1, and double round-robins,
where l = 2. In the latter, teams are typically scheduled to meet once in each other’s
home venue.
Usually, the number of teams in a round-robin schedule will be even. In cases
where t is odd, an extra “dummy team” can be introduced, and teams assigned to
play this dummy team will receive a bye in the appropriate part of the schedule.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 221
R. M. R. Lewis, Guide to Graph Colouring, Texts in Computer Science,
https://doi.org/10.1007/978-3-030-81054-2_8
Definition 8.2 Let Kt be the complete graph with t vertices, where t is even. A
one-factor of Kt is a perfect matching. A one-factorisation of Kt is a partition
of the edges into t − 1 disjoint one-factors.
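As an illustration, the circle method referred to later in this chapter yields such a one-factorisation directly; the following is an assumed sketch, not the book's code:

```python
def circle_one_factorisation(t):
    """One-factorisation of K_t by the circle method (t even): team t-1
    is fixed while teams 0..t-2 rotate; round r pairs team r with team
    t-1 and matches (r+i) mod (t-1) against (r-i) mod (t-1)."""
    assert t % 2 == 0
    rounds = []
    for r in range(t - 1):
        matching = {frozenset((r, t - 1))}
        for i in range(1, t // 2):
            matching.add(frozenset(((r + i) % (t - 1), (r - i) % (t - 1))))
        rounds.append(matching)
    return rounds
```

Each of the t − 1 rounds is a perfect matching, and every pair of teams meets exactly once.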
The task of minimising breaks has also been explored in other research. Trick [3],
Elf et al. [4] and Miyashiro and Matsui [5], for example, have examined the problem
of taking an existing single round-robin and then assigning home/away values to
each of the matches to minimise the number of breaks. Miyashiro and Matsui [6]
have also shown that the problem of deciding whether a home/away assignment
exists for a particular schedule such that the theoretical minimum of t − 2 breaks is
achieved is computable in polynomial time. The inverse of this problem—taking a
fixed home/away pattern and then assigning matches consistent with this pattern—
has also been studied by various other authors [2,7,8].
One interesting feature of the Greedy, circle, and canonical methods is that the
solutions they produce are isomorphic. For example, a canonical single round-robin
schedule for t teams can be transformed into the schedule produced by the Greedy
round-robin algorithm by simply converting the ordered pairs into unordered pairs
and then reordering the rounds. Similarly, a circle schedule can be transformed
into a Greedy schedule by relabelling the teams using the mapping t1 ← tt and
ti ← ti−1 ∀i ∈ {2, . . . , t}, with the rounds then being reordered (see also [9]).
1 As an illustration, in Fig. 8.1 we see that team 2, for instance, is scheduled to play the opponents of
team 1 from the previous round on five different occasions. This feature also exists for other teams;
thus this schedule contains rather a large amount of carryover.
The above paragraphs illustrate that the requirements of sports scheduling problems
can be complex and idiosyncratic. In the remainder of this chapter, we will examine
how graph colouring concepts can be used to help find solutions to such problems.
In the next section, we will describe how basic round-robin scheduling problems
can be represented as graph colouring problems. In Sect. 8.3 we then assess the
“difficulty” of solving such graphs using our suite of graph colouring algorithms from
Chap. 5. Following this, in Sect. 8.4 we will then discuss ways in which this model
can be extended to incorporate other types of “hard” (i.e., mandatory) constraint,
and in Sect. 8.5 we review various neighbourhood operators that can be used with
this extended model for exploring the space of feasible solutions (that is, round-
robin solutions that are compact, valid, and also obey any imposed hard constraints).
Finally, in Sect. 8.6 we consider a real-world round-robin scheduling problem from
the Welsh Rugby Union and propose two separate algorithms that make use of
our proposed algorithmic operators. The performance of these algorithms is then
analysed over several different problem instances.
Fig. 8.4 Graph for a double round-robin problem with t = 4 teams (a), an optimal colouring of
this graph (b), and the corresponding schedule (c)
available rounds. Note that for the remainder of this chapter we only consider the
task of producing compact schedules: thus, k = χ (G) unless otherwise specified.
For a single round-robin, each vertex is associated with an unordered pair {ti, tj},
denoting a match between teams ti and tj. The number of vertices n in such graphs
is thus t(t − 1)/2, with deg(v) = 2(t − 2) ∀v ∈ V. For a compact schedule, the
number of available colours k = t − 1. For double round-robins the number of
vertices n = t (t − 1), deg(v) = 4(t − 2) + 1 ∀v ∈ V , and k = 2(t − 1), since teams
will play each other twice. In this case, each vertex is associated with an ordered pair
(ti, tj) with j ≠ i, ti denoting the home-team and tj the away-team. An example graph
for a double round-robin with t = 4 teams is provided in Fig. 8.4.
Recall from Sect. 6.2 that the complete graph K t can also be used to represent
a round-robin scheduling problem by associating each vertex with a team and each
edge with a match. In such cases, the task is to find a proper edge colouring of K t ,
with all edges of a particular colour indicating the matches that occur in a particular
round. The graphs generated using our methods above are the corresponding line
graphs of these complete graphs. Of course, in practice, it is easy to switch between
these two representations. However, the main advantage of using our representation
is that it allows the exploitation of previously developed vertex-colouring techniques,
as the following sections will demonstrate.
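A sketch of this vertex-colouring representation for a single round-robin (the function name is an assumption):

```python
from itertools import combinations

def round_robin_graph(t):
    """Vertex-colouring graph for a single round-robin: one vertex per
    unordered pair of teams, with an edge whenever two matches share a
    team.  This is the line graph of the complete graph K_t."""
    V = [frozenset(p) for p in combinations(range(t), 2)]
    adj = {v: {u for u in V if u != v and u & v} for v in V}
    return V, adj
```

The construction matches the counts quoted above: n = t(t − 1)/2 vertices, each of degree 2(t − 2).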
8.3 Generating Valid Round-Robin Schedules

Having defined the basic structures of the “round-robin graphs” that we wish to
colour, in this section we investigate whether such graphs constitute difficult-to-
colour problem instances. Note that by k-colouring such graphs we are doing nothing
more than producing valid, compact round-robin schedules which, as we have men-
tioned, can be easily achieved using the circle, Greedy, and canonical algorithms.
However, there are several reasons why solving these problems from the perspective
of vertex colouring is worthwhile.
1. Because of the structured, deterministic way in which the circle, Greedy and
canonical methods operate, their range of output will only represent a very small
part of the space of all valid round-robin schedules.
2. The schedules that are produced by the circle, Greedy and canonical methods also
occupy very particular parts of the solution space. For example, Lambrechts et
al. [25] have shown that, for even numbers of teams, the circle method produces
schedules in which the amount of carryover is maximised.
3. As noted earlier, the solutions produced via the Greedy, circle and canonical meth-
ods are isomorphic. Moreover, the specific structures present in these isomorphic
schedules are often seen to have adverse effects when applying neighbourhood
search operators, as we will see in Sect. 8.5.
4. Finally, we are also able to modify the graph colouring model to incorporate
additional real-world constraints, as shown in Sect. 8.4.
By using graph colouring methods, particularly those that are stochastic in nature,
the hope is that we have a more robust and less biased mechanism for producing
round-robin schedules, allowing a larger range of structurally distinct schedules to
be sampled. This is especially useful in the application of metaheuristics, where the
production of random initial solutions is often desirable.
Figure 8.5 summarises the results of experiments using single and double round-
robins of up to t = 60 teams. Fifty runs of the backtracking and hybrid evolutionary
algorithms were executed in each case using a computation limit of 5 × 10¹¹ constraint
checks as before. The success rates in these figures give the percentage of these runs
where optimal colourings (compact valid round-robins) were produced. It is obvious
from these figures that the HEA is very successful here, featuring 100% success rates
across all instances. Indeed, no more than 0.006% of the computation limit on average
was required for any of the values of t tested. On the other hand, the backtracking
approach experiences more difficulty, with success rates dropping considerably for
larger values of t. That said, when the algorithm does produce optimal solutions it
does so quickly, indicating that solutions are either found early in the search tree or
not at all.2 The success of the HEA with these instances is also reinforced by the fact
that its solutions are very diverse, as illustrated in the figure.
2 On this point, Lewis and Thompson [26] have also found that much better results for the back-
tracking algorithm on these particular graphs can be achieved by restricting the algorithm to only
inspect one additional branch from each node of the search tree. The source code available for this
algorithm—see Appendix A.1—can easily be modified to allow this.
Fig. 8.5 Success rates of the Backtracking and HEA algorithms for finding optimal colourings
with, respectively, single round-robin graphs for t = 2, . . . , 60 (n = 1, . . . , 1770), and double
round-robin graphs (n = 2, . . . , 3540). All figures are averaged across 50 runs. The bars show the
diversity of solutions produced by the HEA across the 50 runs, calculated using Eq. (5.9)
To impose round-specific constraints we follow the method seen in Sect. 6.7 for
the list colouring problem. First, k extra vertices are added to the model, one for
each available round. Next, edges are then added between all pairs of these “round-
vertices” to form a clique of size k, ensuring that each round-vertex will be assigned
to a different colour in any feasible solution. Having introduced these extra vertices,
a variety of different round-specific constraints can then be introduced:
Figure 8.6 gives two examples of how we can impose such constraints. Any fea-
sible k-coloured solution for such graphs will constitute a valid compact round-
robin schedule that obeys the imposed round-specific constraints. As we will see in
Sect. 8.5, incorporating constraints in this fashion also allows us to apply neighbour-
hood operators stemming from the underlying graph colouring model that ensures
these extra constraints are never re-violated. Note that an alternative strategy for cop-
ing with hard constraints such as these is to allow their violation within a schedule,
but to then penalise their occurrence via a cost function. Anagnostopoulos et al. [21],
for example, use a strategy whereby the space of all valid compact round-robins is
explored, with a cost function then being used that reflects the number of hard and
soft constraint violations. Weights are then used to place a higher penalty on viola-
tions of the hard constraints, and it is hoped that by using such weights the search
will eventually move into areas of the solution space where no hard constraint viola-
tions occur. The choice of which strategy to employ will depend largely on practical
requirements.
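The clique-based construction described above can be sketched as follows (illustrative names; not the book's implementation):

```python
from itertools import combinations

def add_round_vertices(adj, k):
    """Augment a round-robin colouring graph with k mutually adjacent
    round-vertices (a k-clique), so that each round-vertex must take a
    distinct colour in any feasible solution.  Round-specific constraints
    are then edges between match-vertices and round-vertices."""
    adj = {v: set(ns) for v, ns in adj.items()}
    rounds = [("round", r) for r in range(k)]
    for u, v in combinations(rounds, 2):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj, rounds

def forbid_round(adj, match, rnd):
    """Match-unavailability: forbid 'match' from being played in 'rnd'."""
    adj[match].add(rnd)
    adj[rnd].add(match)
```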
To investigate the effects that the imposition of round-specific constraints has
on the difficulty of the underlying graph colouring problem, double round-robin
graphs were generated with varying numbers of match-unavailability constraints.
Specifically, these constraints were added by considering each match-vertex/round-
vertex pair in turn and adding edges between them with probability p. This means,
for example, that if p = 0.5, each match can only be assigned to approximately
half of the available rounds. Graphs were also generated in two ways: one where
k = χ (G) was ensured (by referring to a pre-generated valid round-robin), and one
where this matter was ignored, possibly resulting in graphs for which χ (G) > k.
Note that by adding edges in this binomially distributed manner, the expected
degree of each vertex can be calculated in the following way. Let V1 define the set
of match vertices and V2 the set of round-vertices, and let v ∈ V1 and u ∈ V2 . Then:
E(deg(v)) = 4(t − 2) + 1 + p × |V2|   ∀v ∈ V1, and
E(deg(u)) = 2(t − 1) − 1 + p × |V1|   ∀u ∈ V2.      (8.1)
The expected variance in degree across all vertices V = V1 ∪ V2 , where n = |V |, is
thus approximated as
(|V1| × E(deg(v))² + |V2| × E(deg(u))²) / n − ((|V1| × E(deg(v)) + |V2| × E(deg(u))) / n)².      (8.2)
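Equations (8.1) and (8.2) are easy to check numerically. The following sketch (the function name is my own) computes the expected degrees and the approximate degree variance for a double round-robin, using |V1| = t(t − 1) match-vertices and |V2| = k = 2(t − 1) round-vertices:

```python
def expected_degree_stats(t, p):
    """Expected degrees (Eq. 8.1) and approximate degree variance (Eq. 8.2)
    for the extended double round-robin graph with edge probability p."""
    v1 = t * (t - 1)                  # number of match-vertices
    v2 = 2 * (t - 1)                  # number of round-vertices (k)
    e_v = 4 * (t - 2) + 1 + p * v2    # E(deg(v)) for match-vertices
    e_u = 2 * (t - 1) - 1 + p * v1    # E(deg(u)) for round-vertices
    n = v1 + v2
    mean = (v1 * e_v + v2 * e_u) / n
    var = (v1 * e_v ** 2 + v2 * e_u ** 2) / n - mean ** 2
    return e_v, e_u, var
```

For t = 16 and p = 0 this gives E(deg(v)) = 57 and E(deg(u)) = 29; as p grows, E(deg(u)) increases by p|V1| per unit of p against p|V2| for the match-vertices, which is the behaviour visible in Fig. 8.7.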
The effect that p has on the overall degree coefficient of variation (CV) of these graphs
is demonstrated in Fig. 8.7. As p is increased from zero, E(deg(v)) and E(deg(u))
initially become more alike, resulting in a slight drop in the CV. However, as p is
increased further, E(deg(u)) rises more quickly than E(deg(v)), resulting in large
increases to the CV.
The consequences of these specific characteristics help to explain the performance
of our six graph colouring algorithms across a large number of instances, as shown in
Fig. 8.8. As with the results from Chap. 5, the quality of solution achieved by Tabu-
Col and PartialCol is observed to be substantially worse than that of the other
approaches when p, and therefore the degree CV, is high. In particular, TabuCol
shows very disappointing performance, providing the worst-quality results for both
graph sizes where the CV is 40%.
In contrast, some of the best performance across the instances is once again due to
the HEA. Surprisingly, AntCol also performs well here, with no significant differ-
ence being observed in the mean results of the HEA and AntCol algorithms across
the set. The improved performance of AntCol, particularly on denser graphs, seems
due to two factors: (a) the higher degrees of the vertices in the graphs, and (b) the
high variance in degrees. In AntCol’s BuildSolution procedure (Sect. 5.4) the
first factor naturally increases the influence of the heuristic value
η in Eq. (5.4), while the second allows a greater discrimination between vertices. In
these cases it seems that a favourable balance between heuristic and pheromone infor-
mation is being struck, allowing AntCol’s global operator to effectively contribute
to the search.
8.4 Extending the Graph Colouring Model 231
Fig. 8.7 Effect of varying p on the degree coefficient of variation with double round-robin graphs
of size t = 16 and 30
[Fig. 8.8: four panels plotting “Colours at cut-off” against p ∈ [0, 1] for TabuCol, PartialCol, HEA, AntCol, HC, and Bktr]
Fig. 8.8 Mean quality of solutions achieved with double round-robin graphs using (respectively):
t = 16, (n = 270, k = 30) with χ(G) = 30; t = 16 with χ(G) ≥ 30; t = 30, (n = 928, k = 58)
with χ(G) = 58; and t = 30 with χ(G) ≥ 58. All points are the average of 25 runs on 25 graphs
[Fig. 8.9: four run-profile panels plotting “Colours” against “Checks” (up to 5 × 10¹¹) for HEA, HEA*, and AntCol]
Fig. 8.9 Run profiles for double round-robins with t = 30 (n = 928) using, respectively: p = 0.8,
χ(G) = 58; p = 0.8, χ(G) ≥ 58; p = 0.9, χ(G) = 58; and p = 0.9, χ(G) ≥ 58. HEA* denotes
the HEA algorithm with a reduced local-search limit of I = 2n
Table 8.1 Description of various neighbourhood operators that preserve the validity and compactness of double round-robin schedules. “Move size” refers to the number of matches (vertices in the graph colouring model) that are affected by the application of these operators

     Description                                                                      Move size
N1   Select two teams, ti ≠ tj, and swap the rounds of vertices (ti, tj) and (tj, ti)  2
N2   Select two teams, ti ≠ tj, and swap their opponents in all rounds                 2(k − 2)
N3   Select two teams, ti ≠ tj, and swap all occurrences of ti to tj and all
     occurrences of tj to ti                                                           2k − 2
N4   Select two rounds, ri ≠ rj, and swap their contents                               t
N5   Select a match and move it to a new round. Repair the schedule using an
     ejection chain repair procedure                                                   Variable
Upon production of a valid round-robin schedule, we may now choose to apply one
or more neighbourhood operators to try and eliminate occurrences of any remaining
soft-constraint violations. Table 8.1 lists several neighbourhood operators that have
been proposed for round-robin schedules, mostly for use with the travelling tour-
nament problem [21,24,27]. Note that the information given in this table applies
to double round-robins, so the number of rounds k = 2(t − 1); however, simple
adjustments can be made for other cases.
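Several of these operators are straightforward to state in code. The sketch below is illustrative (the data layout and names are my own): a schedule is represented as a mapping from ordered fixtures (home, away) to round numbers, and N1, N3, and N4 from Table 8.1 are implemented directly.

```python
def n1_swap(schedule, ti, tj):
    """N1: swap the rounds of fixtures (ti, tj) and (tj, ti) in place."""
    schedule[(ti, tj)], schedule[(tj, ti)] = schedule[(tj, ti)], schedule[(ti, tj)]

def n3_swap(schedule, ti, tj):
    """N3: exchange the roles of teams ti and tj throughout the schedule.
    Returns a relabelled copy rather than modifying in place."""
    swap = {ti: tj, tj: ti}
    return {(swap.get(a, a), swap.get(b, b)): r for (a, b), r in schedule.items()}

def n4_swap(schedule, ri, rj):
    """N4: swap the entire contents of rounds ri and rj in place."""
    for match, r in schedule.items():
        if r == ri:
            schedule[match] = rj
        elif r == rj:
            schedule[match] = ri
```

N2 and the ejection-chain operator N5 need extra bookkeeping of each team's opponent per round and are omitted from this sketch.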
A point to note about these operators is that while they all preserve the validity
of a round-robin schedule, they will not be useful in all circumstances. For example,
applications of N1 , N2 , and N3 will not affect the amount of carryover in a schedule.
Also, while applications of N2 can change the home/away patterns of individual
teams, they cannot alter the total number of breaks in a schedule. Finally, perhaps
the most salient point from our perspective is that if extra hard constraints are being
considered, such as the round-specific constraints listed in Sect. 8.4, then the appli-
cation of such operators may lead to schedules that, while valid, are not necessarily
feasible.
Pursuing the relationship with graph colouring, a promising strategy for exploring
the space of round-robin schedules is again presented by the Kempe chain interchange
operator (see Definition 4.3). Of course, because this operator is known to preserve
the feasibility of a graph colouring solution, it is suitable for both the basic and
extended versions of our graph colouring model. On the other hand, the pair-swap
operator (Definition 4.4) is not suitable here because, for these graphs, swapping the
colours of two nonadjacent vertices is equivalent to swapping the rounds of a pair of
matches with no common team. Such moves will never maintain the feasibility of a
round-robin and therefore should not be considered.
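The Kempe chain interchange itself can be sketched as a breadth-first search over the subgraph induced by two colour classes. This is an illustrative fragment, not the book's code; the adjacency and colouring representations are my own.

```python
from collections import deque

def kempe_interchange(adj, colour, v, j2):
    """Swap colours along the Kempe chain containing v, between colour[v]
    and j2. adj maps each vertex to its set of neighbours; colour maps each
    vertex to its colour. Returns the set of vertices whose colour changed."""
    j1 = colour[v]
    if j1 == j2:
        return set()
    chain, queue = {v}, deque([v])
    while queue:                      # BFS restricted to colours j1 and j2
        u = queue.popleft()
        for w in adj[u]:
            if w not in chain and colour[w] in (j1, j2):
                chain.add(w)
                queue.append(w)
    for u in chain:                   # interchange the two colours
        colour[u] = j2 if colour[u] == j1 else j1
    return chain
```

Because the chain is a connected component of the subgraph induced by colours j1 and j2, the interchange cannot introduce a clash, so feasibility is preserved.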
Recall that the number of vertices affected by a Kempe chain interchange can
vary. For basic (non-extended) round-robin colouring problems involving t teams,
the largest possible move involves t vertices (i.e., two colour classes, with t/2 vertices
in each). In Fig. 8.10 we illustrate what we have found to be typical-shaped distributions
of the differently sized Kempe chains with single and double round-robins.
[Fig. 8.10: relative frequency of Kempe chain sizes (“Vertices moved (× t)”, from 0 to 1) for SRRs and DRRs, with separate curves for t = 10, 20, and 40]
These examples were gained by generating initial solutions with our graph colour-
ing algorithms and then performing random walks of 106 neighbourhood moves. We
see that, in the case of double round-robins, the smallest moves involve exactly two
vertices, which only occurs when a chain is formed containing the complementary
match vertices (ti, tj) and (tj, ti). In this case, the Kempe chain move is equivalent to
the operator N1 (Table 8.1) and occurs with probability 1/(k − 1) (obviously moves of
size 2 do not occur with single round-robins because a match does not have a corre-
sponding reverse fixture with which to be swapped). Meanwhile, the most probable
move in both cases is a total Kempe chain interchange (i.e., involving all vertices
of the two associated colours). Moves of this size are equivalent to a correspond-
ing move in N4 and occur when all vertices in the two colours form a connected
component. Such moves appear to be quite probable due to the relatively high edge
densities of the graphs. Importantly, however, we see that for larger values of t the
majority of moves are of sizes between these two extremes, resulting in moves that
are beyond those achievable with neighbourhood operators N1 and N4 .
We may also choose to perform random walks in this way from schedules gener-
ated by the circle, Greedy, or canonical algorithms. However, due to the structured
way in which these go about constructing a schedule, many different values of t re-
sult in single round-robins in which all applications of the Kempe chain interchange
operator are of size t. Such solutions are usually termed perfect one-factorisations
and are known to be produced by the circle, Greedy, and canonical algorithms for any
value of t for which t − 1 is a prime number [28]. Such features are undesirable as
they do not allow the Kempe chain interchange operator to produce moves beyond
what can already be achieved using N1 and N4 , limiting the number of solutions
accessible via the operator. We should note, however, that we found this prob-
lem could be circumvented in some cases by applying neighbourhood operator
N5 from Table 8.1 to the solution. It seems that, unlike the other operators detailed
in this table, N5 has the potential of breaking up the structural properties of these
solutions, allowing the Kempe chain distributions to assume their more “natural”
shapes as seen in Fig. 8.10. However, we still found cases where this situation was
not remedied.3
One of the main reasons why an analysis of move sizes is relevant here is because
of the effects that the size of a move can have on the cost of a solution at different
stages of the optimisation process. On the one hand, “large” moves can facilitate the
exploration of wide expanses of the solution space and can provide useful mecha-
nisms for escaping local optima. On the other hand, when relatively good candidate
solutions are being considered, large moves will also be disruptive, usually worsening
the quality of a solution as opposed to improving it. These effects are demonstrated
in Fig. 8.11 where we illustrate the relationship between the size of a move and the
resultant change in an arbitrary cost function. In the top chart, the Kempe chain
interchange operator has been repeatedly applied to a solution that was randomly
produced by one of our graph colouring algorithms. Note that larger moves here tend
to give rise to greater variance in cost, but that many moves lead to improvements.
In contrast, in the bottom chart the effects of the Kempe chain interchange operator
on a relatively “good” solution (which has a cost of approximately a quarter of the
previous one) are demonstrated. Here, larger moves again feature a larger variance
in cost, but we also witness a statistically significant moderate positive correlation
(r = 0.46), demonstrating that larger moves tend to be associated with larger de-
creases in solution quality. Di Gaspero and Schaerf [23] have also noted the latter
phenomenon (albeit with different neighbourhoods and a different cost function) and
have suggested a modification to their neighbourhood search algorithm whereby any
move seen to be above a specific size is automatically rejected, with no evaluation
taking place. Because such moves lead to a degradation in quality and will therefore
be rejected in the majority of cases, they find that their algorithm’s performance over
time is increased by skipping these mostly unnecessary evaluations. On the flip side,
of course, such a strategy also eliminates the possibility of “larger” moves occurring
which could diversify the search in a useful way.
3 Specifically for SRRs and values of t less than fifty, this was seen to occur with t = 12, 14, 20,
30, and 38.
Fig. 8.11 Demonstrating how Kempe chain interchanges of different sizes influence the change
in cost of a randomly generated solution (top) and a “good” solution (bottom). In both cases a
double round-robin with t = 16 teams was considered using cost function c2 defined in Sect. 8.6.1
(negative changes thus reflect an improvement)
• Hard Constraint A. Some pairs of teams in the league share a home stadium.
Therefore when one of these teams plays at home, the other team must play away.
• Hard Constraint B. Some teams in the league also share their stadia with teams
from other leagues and sports. These stadia are therefore unavailable in certain
rounds. (In practice, the other sports teams using these venues have their matches
scheduled before the Principality Premiership teams, and so unavailable rounds
are known in advance.)
• Hard Constraint C. Matches involving regional rivals (so-called “derby matches”)
need to be preassigned to two specific rounds in the league, corresponding to those
falling on the Christmas and Easter weekends.
The league administrators also specify two soft constraints. First, they express
a preference for keeping reverse fixtures (i.e., matches (ti , t j ) and (t j , ti )) at least
five rounds apart and, if possible, for reverse fixtures to appear in opposite “halves”
of the schedule (they do not consider the stricter requirement of “mirroring” to be
important, however). Second, they also express a need for all teams to have good
home/away patterns, which means avoiding breaks wherever possible.
In this section, we describe two algorithms for this scheduling problem. Both of
these use the strategy of first producing a feasible solution, followed by a period of
optimisation via neighbourhood search in which feasibility (i.e., validity, compact-
ness, and adherence to all hard constraints) is maintained. Specific details of these
methods together with a comparison are given in the next three subsections.
In both cases, initial feasible solutions are produced by encoding all of the hard
constraints using the extended graph colouring model from Sect. 8.4, with one of
our graph colouring algorithms then being applied. For Hard Constraint A, if a pair
of teams ti and t j is specified as sharing a stadium then edges are simply added
between all match-vertices corresponding to home matches of these teams. For Hard
Constraint B, if a venue is specified as unavailable in a particular round, then edges
are added between all match-vertices denoting home matches of the venue’s team(s)
and the associated round-vertex. Finally, for Hard Constraint C, edges are also added
between vertices corresponding to derby matches and all round-vertices except those
representing derby weekends.
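These three encodings can be sketched as follows. This is a hypothetical fragment: the function signature, container names, and ('M', …)/('R', …) vertex labels are my own, not taken from the case study's implementation.

```python
def encode_hard_constraints(matches, rounds, shared_pairs, unavailable,
                            derbies, derby_rounds):
    """Return the extra edges imposed by Hard Constraints A-C.
    matches: ordered (home, away) pairs; rounds: round-vertex ids;
    shared_pairs: pairs of teams sharing a stadium;
    unavailable: team -> rounds in which its stadium is unavailable;
    derbies: derby matches; derby_rounds: permitted rounds for derbies."""
    edges = set()
    # A: two teams sharing a stadium cannot both play at home in one round
    for t1, t2 in shared_pairs:
        for m1 in matches:
            for m2 in matches:
                if m1[0] == t1 and m2[0] == t2:
                    edges.add((('M', m1), ('M', m2)))
    # B: home matches of a team clash with that stadium's unavailable rounds
    for team, bad in unavailable.items():
        for m in matches:
            if m[0] == team:
                for r in bad:
                    edges.add((('M', m), ('R', r)))
    # C: derby matches clash with every round except the derby rounds
    for m in derbies:
        for r in rounds:
            if r not in derby_rounds:
                edges.add((('M', m), ('R', r)))
    return edges
```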
Details of the specific problem instance faced at the WRU are given in bold in
Table 8.2. To aid our analysis we also generated (artificially) a further nine instances
of comparable size and difficulty, details of which are also given in the table. As
it turned out, we found that it was quite straightforward to find a feasible solution
to the WRU problem using the graph colouring algorithms from Chap. 5. For our
238 8 Designing Sports Leagues
Table 8.2 Summary of the sports scheduling problem instances used. The entry in bold refers to
the real-world WRU problem. These problems can be downloaded from http://www.rhydlewis.eu/
resources/PrincipalityPremProbs.zip
# Teams t Vertices n Graph Aa Bb Cc
Density
1 12 154 0.268 0 2 {5, 5} 3
2 12 154 0.292 1 3 {6, 8, 10} 6
3 12 154 0.308 2 4 {3, 6, 8, 10} 6
4 14 208 0.236 0 2 {4, 5} 4
5 14 208 0.260 1 3 {8, 10, 10} 7
6 14 208 0.271 2 5 {3, 6, 8, 10, 10} 7
7 16 270 0.219 1 3 {4, 5, 6} 5
8 16 270 0.237 2 5 {3, 6, 8, 10, 10} 8
9 18 340 0.194 1 3 {4, 5, 6} 6
10 18 340 0.212 2 6 {4, 5, 6, 7, 10, 10} 9
a Number of pairs of teams sharing a stadium
b Number of teams sharing a stadium with teams from another league/sport. The number of match-
unavailability constraints for each of these teams is given in { }’s.
c Number of local derby pairings
experiments, we therefore ensured that all artificially generated problems also fea-
ture at least one feasible solution. However, the minimum number of soft-constraint
violations achievable in these problems is not known.
The soft constraints of this problem are captured in two cost functions, c1 and c2 .
Both of these need to be minimised.
• Spread Cost (c1 ): Here, a penalty of 1 is added each time a match (ti , t j ) and its
return fixture (t j , ti ) are scheduled in rounds r p and rq , such that |r p − rq | ≤ 5.
In addition, a penalty of 1/(½t(t − 1)) is also added each time matches (ti, tj) and
(t j , ti ) are scheduled to occur in the same half of the schedule.
• Break Cost (c2): Here, the home/away pattern of each team is analysed in turn
and penalties of b^l are incurred for each occurrence of l consecutive breaks. In
other words, if a team is required to play two home matches (or away matches) in
succession, this is considered as one break and incurs a penalty of b^1. If a team
has three consecutive home matches (or away matches), this is considered as two
consecutive breaks and results in a penalty of b^2 being added, and so on.
The term 1/(½t(t − 1)) is used as part of c1 to ensure that the total penalty due to
match pairs occurring in the same half is never greater than 1, thus placing a greater
emphasis on keeping matches and their return fixtures at least five rounds apart. The
penalty unit of b^l in cost function c2 is also used to help discourage long breaks from
occurring in the schedule. In our case, we use b = 2: thus a penalty of 2 is incurred
for single breaks, 4 for double breaks, 8 for triple breaks, and so on.
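Both cost functions are simple to compute from a schedule. In the sketch below (names and data layout are my own), a schedule maps ordered fixtures to rounds 0, …, k − 1, and each team's home/away pattern is given as a string such as "HAHHA":

```python
def spread_cost(schedule, t, k):
    """c1: 1 per fixture pair scheduled within five rounds of each other,
    plus 1/((1/2)t(t-1)) per pair falling in the same half of the schedule."""
    half_pen = 1.0 / (0.5 * t * (t - 1))
    cost = 0.0
    for (i, j), r1 in schedule.items():
        if i < j:                                # count each pair once
            r2 = schedule[(j, i)]
            if abs(r1 - r2) <= 5:
                cost += 1
            if (r1 < k // 2) == (r2 < k // 2):   # same half of the schedule
                cost += half_pen
    return cost

def break_cost(pattern, b=2):
    """c2 contribution of one team's home/away pattern: a run of l
    consecutive breaks (a repeated venue) incurs a penalty of b**l."""
    cost, run = 0, 0
    for prev, cur in zip(pattern, pattern[1:]):
        if cur == prev:
            run += 1          # another consecutive break
        else:
            if run:
                cost += b ** run
            run = 0
    if run:
        cost += b ** run
    return cost
```

With b = 2, break_cost gives 2 for a single break, 4 for a double break, and 8 for a triple break, matching the scheme above; the total c2 is the sum over all teams.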
8.6 Case Study: Welsh Premiership Rugby 239
It is notable that, because the cost functions c1 and c2 measure different charac-
teristics, use different penalty units, and feature different growth rates, they are in
some sense incommensurable. For this reason, it is appropriate to use the concept of
dominance to distinguish between solutions. This is defined as follows: a solution
S1 is said to dominate a solution S2 if and only if c1(S1) ≤ c1(S2) and c2(S1) ≤ c2(S2),
with at least one of these inequalities being strict.
In this definition, it is assumed that both cost functions are being minimised. Note
that this definition can also be extended to more than two cost functions if required.
If S1 does not dominate S2, and S2 does not dominate S1, then S1 and S2 are
said to be incomparable. The output of both algorithms is then a list L of mutually
incomparable solutions that are not dominated by any other solutions encountered
during the search.
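Dominance checks and the maintenance of L amount to a few lines of code. This is an illustrative sketch in which solutions are represented simply by their cost pairs (c1, c2):

```python
def dominates(s1, s2):
    """True if cost pair s1 dominates s2 (both objectives minimised)."""
    return s1[0] <= s2[0] and s1[1] <= s2[1] and s1 != s2

def update_front(L, s):
    """Insert cost pair s into the list L of mutually incomparable
    solutions, discarding anything that s dominates."""
    if any(dominates(x, s) for x in L) or s in L:
        return L
    return [x for x in L if not dominates(s, x)] + [s]
```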
Note that the concept of dominance is commonly used in the field of multiobjective
optimisation where, in addition to being incommensurable, cost functions are often
in conflict with one another (that is, an improvement in one cost will tend to invoke
the worsening of another). It is unclear whether the two cost functions used here are
necessarily in conflict, however.
increasing the current spread cost. This is achieved using a phase of simulated
annealing with a restricted neighbourhood operator where only matches and their
reverse fixtures (i.e., (ti , t j ) and (t j , ti )) can be swapped. Note that the latter moves
can, on occasion, violate some of the additional hard constraints of this problem, and
so in these cases, such moves are rejected automatically. Also, note that moves in this
restricted neighbourhood do not alter the spread cost of the schedule and therefore
do not undo any of the work carried out in the previous random descent stage. On
completion of Step (4), the best solution S ∗ found during this round of simulated
annealing is used to update L. Specifically, if S ∗ is seen to dominate any solutions in
L, then these solutions are removed from L and S ∗ is added to L. The entire process
is then repeated.
Our choice of random descent for reducing c1 arises simply because in initial
experiments we observed that, in isolation, the associated soft constraints seemed
quite easy to satisfy. Thus a simple descent procedure seems effective for making
quick and significant gains in quality (for all instances spread costs of less than 1,
and often 0, were nearly always achieved within our imposed cut-off point of 10,000
evaluations). In addition to this, we also noticed that only short execution times
were needed for the simulated annealing stage due to the relatively small solution
space resulting from the restricted neighbourhood operator, which meant that the
search would tend to converge quite quickly at a local optimum. In preliminary
experiments we also found that if we lengthened the simulated annealing process
by allowing the temperature variable to be reset (thus allowing the search to escape
these optima), then the very same optimum would be achieved after another period
of search, perhaps suggesting that the convergence points in these searches are the
true optima in these particular spaces.4
Finally, our use of a perturbation operator in the multi-stage algorithm is intended
to encourage diversification in the search. In this case, a balance needs to be struck
by applying enough changes to the current solution to cause the search to enter a
different part of the solution space, but not applying too many changes so that the
operator becomes nothing more than a random restart mechanism. In our case, we
chose to simply apply the Kempe chain operator five times in succession, which
proved sufficient for our purposes.
4 In all cases we used an initial temperature t = 20, a cooling rate of α = 0.99, and z = n² (refer to
the simulated annealing algorithm in Fig. 4.12). The annealing process ended when no move was
accepted for 20 successive temperatures. Such parameters were decided upon in preliminary testing
and were not seen to be critical in dictating algorithm performance.
Multiobjective Algorithm (S, w)
(1) Set reference costs x1 and x2
(2) Set initial weights using wi = ci(S)/xi for i ∈ {1, 2}
(3) Calculate weighted cost of solution, f(S) = w1 × c1(S) + w2 × c2(S)
(4) B ← f(S)
(5) while (not stopping condition) do
(6)     Form new solution S′ by applying a Kempe chain interchange to S
(7)     if (f(S′) ≤ f(S) or f(S′) ≤ B) then
(8)         S ← S′
(9)         Update list L of non-dominated solutions using S
(10)    Find i corresponding to max i∈{1,2} {ci(S)/xi}
(11)    Increase weight wi ← wi(1 + w)
Fig. 8.13 Multiobjective algorithm with variable weights [29]. In all reported experiments, a setting
of w = 10⁻⁶ was used. The input S is a feasible solution provided by a suitable graph colouring
algorithm
An obvious issue with the objective function f is that suitable values need to
be assigned to the weights. Such assignments can, of course, have large effects on
the performance of an algorithm, but they are not always easy to determine as they
depend on many factors such as the size and type of problem instance, the nature of
the individual cost functions, the user requirements, and the amount of available run
time. To deal with this issue, we adopt a multiobjective optimisation technique of
Petrovic and Bykov [29]. The strategy of this approach is to alter weights dynamically
during the search based on the quality of solutions found so far, thus directing the
search into specific regions of the solution space. This is achieved by providing two
reference costs to the algorithm, x1 and x2 . Using these values, we can then imagine
a reference point (x1 , x2 ) being plotted in a two-dimensional Cartesian space, with
a straight reference line then being drawn from the origin (0, 0) and through the
reference point (see Fig. 8.14). During the search, all solutions encountered are then
also represented as points in this Cartesian space and, at each iteration, the weights
are adjusted automatically to encourage the search to move towards the origin while
remaining close to the reference line. It is hoped that eventually solutions will be
produced that feature costs less than the original reference costs.
A pseudocode description of this approach is given in Fig. 8.13. Note that the
weight update mechanism used here (Step (11)) means that weights are gradually
increased during the run. Since, according to Step (7), changes to solutions are only
permitted if (a) they improve the cost, or (b) if the weighted cost is kept below
a constant B, this implies that worsening moves become increasingly less likely
during execution. In this respect, the search process is similar in nature to simulated
annealing.
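The scheme of Fig. 8.13 can be sketched as follows. This is an illustrative fragment, not the case study's implementation: the function and argument names are my own, and the placeholder costs and neighbour callables stand in for the real cost functions and the Kempe chain interchange move.

```python
def variable_weight_search(S, costs, neighbour, x1, x2, w=1e-6, iters=100000):
    """Variable-weight acceptance scheme in the style of Fig. 8.13 (after
    Petrovic and Bykov). costs(S) -> (c1, c2), both minimised;
    neighbour(S) returns a candidate solution."""
    c = costs(S)
    w1, w2 = c[0] / x1, c[1] / x2          # initial weights wi = ci(S)/xi

    def f(s):                               # weighted cost, reads current weights
        return w1 * costs(s)[0] + w2 * costs(s)[1]

    B = f(S)                                # fixed bound on the weighted cost
    for _ in range(iters):
        S2 = neighbour(S)
        if f(S2) <= f(S) or f(S2) <= B:     # accept improving or below-bound moves
            S = S2
        c = costs(S)
        if c[0] / x1 >= c[1] / x2:          # penalise the objective furthest
            w1 *= 1 + w                     # (relatively) from its reference cost
        else:
            w2 *= 1 + w
    return S
```

Because the weights only ever increase, worsening moves become progressively harder to accept, giving the simulated-annealing-like behaviour described above.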
Fig. 8.14 Costs of solutions encountered by the multi-stage and multiobjective approaches in one
run with problem instance #5. Solution costs were recorded every 10,000 evaluations. The dotted line
(in both the main graph and the projection) represents the reference line used by the multiobjective
approach and is drawn from the origin and through the reference point
Because the costs of the global optima are not known for these instances, we choose
to use the values given in the “best” column as approximations to them. In our case
distances are calculated as follows. First, the costs of all solutions returned by the
algorithms, in addition to the costs of the “best” solutions, are normalised to values
in [0, 1] by dividing by the maximum cost values specified in the table. Next, for
each solution, the (Euclidean) distance between the normalised best costs and all
normalised solution costs is calculated. The mean, standard deviation and median of
all distances returned for each algorithm are recorded in the table.
The results in Table 8.3 can be split into two cases. The first involves the larger
problem instances #3 to #10. Here, we see that the multiobjective approach consis-
tently produces better results than the multi-stage approach, which is reflected in the
lower means, medians, and deviations in the Distance from Best column. Note that
the best results for these instances have also come from the multiobjective approach.
We also see that the multi-stage approach has produced larger solution lists for these
problem instances, which could be useful if a user wanted to be presented with a
choice of solutions, though the solutions in these lists are of lower quality in general.
Also note that the differences between the mean and median values here reveal that
the distributions of distances with the multi-stage approach feature larger amounts
of positive skew, reflecting the fact that this method produces solutions of very low
quality on occasion. For the smaller problem instances #1 and #2 we see that these
patterns are more or less reversed, with the multi-stage approach producing
solutions with costs that are consistently closer (or equal) to the best-known
Table 8.3 Summary of results achieved in 100 runs of the multi-stage and multiobjective algo-
rithms. An asterisk (*) in the “best” column indicates that the solution with the associated costs was
found using the multi-stage approach (otherwise it was found by the multiobjective approach)
                                              Multi-stage                      Multiobjective
#    Best (c1, c2)       Max. c1   Max. c2    |L|    Dist. from Best          |L|    Dist. from Best
                                                     Mean ± SD      Med.             Mean ± SD      Med.
1    (0, 104)*           4.21      352        1.81   0.15 ± 0.22    0.04      2.57   0.27 ± 0.10    0.27
2    (0, 134)*,          11.3      394        2.81   0.06 ± 0.05    0.05      3.82   0.20 ± 0.11    0.17
     (2.15, 132)*
3    (0, 102)            8.24      560        4.32   0.24 ± 0.17    0.17      3.32   0.13 ± 0.10    0.12
4    (0, 112)            2.17      186        2.72   0.28 ± 0.09    0.26      1.00   0.14 ± 0.06    0.13
5    (0, 134)            3.29      294        3.61   0.34 ± 0.11    0.31      1.00   0.11 ± 0.04    0.11
6    (0, 148)            4.26      444        4.44   0.32 ± 0.11    0.30      1.04   0.08 ± 0.05    0.08
7    (0, 156)            2.18      816        4.51   0.19 ± 0.11    0.16      1.04   0.05 ± 0.04    0.04
8    (0, 186)            3.27      786        4.71   0.26 ± 0.12    0.22      1.53   0.08 ± 0.06    0.06
9    (0, 204)            1.26      940        4.32   0.21 ± 0.10    0.19      1.08   0.05 ± 0.06    0.04
10   (0, 234)            2.29      1398       4.63   0.20 ± 0.10    0.16      1.30   0.09 ± 0.05    0.07
costs. One feature to note in this case is the relatively large difference between the
mean and median with the multi-stage approach for instance #1, where we saw about
60% of produced solutions being very close to the best, and the remainder having
much larger distances. Finally, note that for all problem instances, the nonparametric
Mann–Whitney test indicates that the distances of each algorithm are significantly
different with significance level ≤0.01%.
In summary, the results in Table 8.3 suggest that the strategy used by the multi-
stage algorithm of employing many rounds of short intensive searches seems more
fitting for smaller, less constrained instances, but for larger instances, including the
real-world problem instance, better solutions are achieved by using the multiobjective
approach where longer, less intensive searches are performed.
In this chapter, we have shown that round-robin schedules can be successfully con-
structed using graph colouring principles, often in the presence of many additional
hard constraints. In Sect. 8.6 we exploited this link with graph colouring by propos-
ing two algorithms for a real-world sports scheduling problem. In the case of the
real-world problem instance (#5), we found that more than 98% of all solutions
generated by our multiobjective approach dominated the solution that was manually
produced by the WRU’s league administrators. On the other hand, for the multi-stage
approach, this figure was just 0.02%. We should, however, interpret these statistics
with care, firstly because the manually produced solution was actually for a slightly
different problem (the exact specifications of which we were unable to obtain from
the league organisers), and secondly because our specific cost functions were not
previously used by the league organisers for evaluating their solutions.
One further neighbourhood operator that might be used with these problems (and
indeed any graph colouring problem) is an extension to the Kempe chain interchange
operator known as the s-chain interchange operator. Let S = {S1 , . . . , Sk } be a
feasible graph colouring solution and let v be an arbitrary vertex in S coloured with
colour j1 . Furthermore, let j2 , . . . , js be a sequence of distinct colours taken from
the set {1, . . . , k} − { j1 }. An s-chain is constructed by first identifying all vertices
adjacent to v that are coloured with colour j2 . From these, adjacent vertices coloured
with j3 are then identified, and from these adjacent vertices with colour j4 , and so
on. When considering vertices with colour js , adjacent vertices with colour j1 are
sought.
As an example, using the graph from Fig. 7.3, an s-chain using s = 3, v = v2
and colours j1 = 2, j2 = 4, and j3 = 1 can be seen to contain the vertices
{v2 , v3 , v4 , v5 , v7 , v8 }. Through similar reasoning to that of Kempe chains (The-
orem 4.1), it is simple to show that we can take the vertices of an s-chain and inter-
change their colours using the mapping j1 ← j2 , j2 ← j3 , . . . , js−1 ← js , js ← j1
such that feasibility of the solution is maintained. Note also that s-chains are equiv-
alent to Kempe chains when s = 2. In our experiments with round-robin schedules,
we also tested the effects of the s-chain interchange operator for s ≥ 3; however, because of the relatively high levels of connectivity between the different colours in these graphs, we observed that over 99% of moves contained the maximum of s(t/2) vertices. In other words, almost all s-chain interchanges produced moves that are also achievable through combinations of N4 . s-chains may show more promise in other
applications, however, particularly those involving sparser graphs.
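The construction and interchange just described can be sketched in a few lines of Python. This is an illustrative sketch on a small made-up graph rather than the graph of Fig. 7.3; all names are our own:

```python
from collections import deque

def build_s_chain(adj, colour, v, seq):
    """Grow an s-chain from vertex v. `adj` maps each vertex to its set of
    neighbours, `colour` maps vertices to colours, and `seq` is the colour
    sequence (j1, ..., js) with colour[v] == j1. From vertices coloured
    seq[i], adjacent vertices coloured seq[i + 1] are absorbed (wrapping
    from js back to j1)."""
    s = len(seq)
    nxt = {seq[i]: seq[(i + 1) % s] for i in range(s)}
    chain, queue = {v}, deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in chain and colour[w] == nxt[colour[u]]:
                chain.add(w)
                queue.append(w)
    return chain

def s_chain_interchange(adj, colour, v, seq):
    """Recolour the chain via j1 <- j2, j2 <- j3, ..., js <- j1."""
    chain = build_s_chain(adj, colour, v, seq)
    newc = {seq[i]: seq[i - 1] for i in range(len(seq))}  # seq[-1] wraps to js
    for u in chain:
        colour[u] = newc[colour[u]]
    return chain
```

With s = 2 this reduces to the usual Kempe chain interchange, and for any s the interchange leaves the colouring feasible by the argument referred to above.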
References
1. Dinitz J, Garnick D, McKay B (1994) There are 526,915,620 nonisomorphic one-factorizations
of K12 . J Comb Des 2(2):273–285
2. de Werra D (1988) Some models of graphs for scheduling sports competitions. Discret Appl
Math 21:47–65
3. Trick M (2001) A schedule-then-break approach to sports timetables. In: Burke E, Erben W
(eds) Practice and theory of automated timetabling (PATAT) III. LNCS, vol 2079. Springer, pp
242–253
4. Elf M, Junger M, Rinaldi G (2003) Minimizing breaks by maximizing cuts. Oper Res Lett
31(5):343–349
5. Miyashiro R, Matsui T (2006) Semidefinite programming based approaches to the break min-
imization problem. Comput Oper Res 33(7):1975–1992
6. Miyashiro R, Matsui T (2005) A polynomial-time algorithm to find an equitable home-away
assignment. Oper Res Lett 33:235–241
7. Russell R, Leung J (1994) Devising a cost effective schedule for a baseball league. Oper Res
42(4):614–625
8. Nemhauser G, Trick M (1998) Scheduling a major college basketball conference. Oper Res
46:1–8
9. Anderson I (1991) Kirkman and GK2n . Bull Inst Combin Appl 3:111–112
10. Bartsch T, Drexl A, Kroger S (2006) Scheduling the professional soccer leagues of Austria and
Germany. Comput Oper Res 33(7):1907–1937
11. della Croce F, Oliveri D (2006) Scheduling the Italian football league: an ILP-based approach.
Comput Oper Res 33(7):1963–1974
12. Wright M (2006) Scheduling fixtures for Basketball New Zealand. Comput Oper Res
33(7):1875–1893
13. della Croce F, Tadei R, Asioli P (1999) Scheduling a round-robin tennis tournament under
courts and players unavailability constraints. Ann Oper Res 92:349–361
14. Wright M (1994) Timetabling county cricket fixtures using a form of Tabu search. J Oper Res
Soc 47(7):758–770
15. Fleurent C, Ferland J (1993) Allocating games for the NHL using integer programming. Oper
Res 41(4):649–654
16. Kendall G, Knust S, Ribeiro C, Urrutia S (2010) Scheduling in sports, an annotated bibliogra-
phy. Comput Oper Res 37(1):1–19
17. Russell K (1980) Balancing carry-over effects in round-robin tournaments. Biometrika
67(1):127–131
18. Henz M, Muller T, Theil S (2004) Global constraints for round robin tournament scheduling.
Eur J Oper Res 153:92–101
19. Easton K, Nemhauser G, Trick M (2001) The traveling tournament problem: description and
benchmarks. In: Walsh T (ed) Principles and practice of constraint programming. LNCS, vol
2239. Springer, pp 580–585
20. Easton K, Nemhauser G, Trick M (2003) Solving the traveling tournament problem: a combined
integer programming and constraint programming approach. In: Burke E, De Causmaecker P
(eds) Practice and theory of automated timetabling (PATAT) IV. LNCS, vol 2740. Springer, pp
100–109
21. Anagnostopoulos A, Michel L, van Hentenryck P, Vergados Y (2006) A simulated annealing
approach to the traveling tournament problem. J Sched 9(2):177–193
22. Lim A, Rodrigues B, Zhang X (2006) A simulated annealing and hill-climbing algorithm for
the traveling tournament problem. Eur J Oper Res 174(3):1459–1478
23. Di Gaspero L, Schaerf A (2007) A composite-neighborhood Tabu search approach to the
traveling tournament problem. J Heurist 13(2):189–207
24. Ribeiro C, Urrutia S (2007) Heuristics for the mirrored travelling tournament problem. Eur J
Oper Res 179(3):775–787
25. Lambrechts E, Ficker M, Goossens D, Spieksma F (2018) Round-robin tournaments generated
by the circle method have maximum carry-over. Math Program 172:277–302
26. Lewis R, Thompson J (2010) On the application of graph colouring techniques in round-robin
sports scheduling. Comput Oper Res 38(1):190–204
27. Di Gaspero L, Schaerf A (2006) Neighborhood portfolio approach for local search applied to
timetabling problems. J Math Model Algorithms 5(1):65–89
28. Januario T, Urrutia S, de Werra D (2016) Sports scheduling search space connectivity: a riffle
shuffle driven approach. Discret Appl Math 211:113–120
29. Petrovic S, Bykov Y (2003) A multiobjective optimisation approach for exam timetabling
based on trajectories. In: Burke E, De Causmaecker P (eds) Practice and theory of automated
timetabling (PATAT) IV. LNCS, vol 2740. Springer, pp 181–194
30. Deb K, Pratap A, Agarwal S, Meyarivan T (2000) A fast elitist multi-objective genetic algo-
rithm: NSGA-II. IEEE Trans Evol Comput 6(2):182–197
9 Designing University Timetables
In this chapter, our case study looks at how graph colouring concepts can be used in
the process of constructing high-quality timetables for universities and other types
of educational establishments. As we will see, this problem area can contain a whole
host of different constraints, which will often make problems very difficult to tackle.
That said, most timetabling problems contain an underlying graph colouring problem,
allowing us to use many of the concepts developed in previous chapters.
The first section of this chapter will look at university timetabling from a broad
perspective, discussing among other things the various constraints that might be
imposed on the problem. Section 9.2 onwards will then conduct a detailed analysis
of a well-known timetabling formulation that has been the subject of various articles
in the literature. As we will see, powerful algorithms derived from graph colouring
principles can be developed for this problem, though careful modifications also need
to be made to allow the methods to cope with the various other constraints that this
problem involves.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 247
R. M. R. Lewis, Guide to Graph Colouring, Texts in Computer Science,
https://doi.org/10.1007/978-3-030-81054-2_9
• Unary Constraints. These involve just one event, such as the constraint “event a
must not take place on a Tuesday”, or the constraint “event a must occur in timeslot
b”.
• Binary Constraints. These concern pairs of events, such as the constraint “event
a must take place before event b”, or the event clash constraint, which specifies
pairs of events that cannot be held at the same time in the timetable.
• Capacity Constraints. These are governed by room capacities. For example, “All events should be assigned to a room that has a sufficient capacity”.
• Event Spread Constraints. These concern requirements involving the “spreading-
out” or “grouping-together” of events within the timetable to ease student/teacher
workload, and/or to agree with a university’s timetabling policy.
• Agent Constraints. These are imposed to promote the requirements and/or prefer-
ences of the people who will use the timetables, such as the constraint “lecturer a
likes to teach event b on Mondays”, or “lecturer c must have three free mornings
per week”.
The field of university timetabling has seen many solution approaches proposed
over the years, including methods based on constructive heuristics, mathematical
programming, branch and bound, and metaheuristics. (See, for example, the surveys
of Carter et al. [5], Burke et al. [6], Schaerf [7], Lewis [8].) The latter survey has
suggested that metaheuristic approaches for university timetabling can be classified
into three categories as follows:
• One-stage Optimisation Algorithms. Here, the hard and soft constraints are combined into a single objective function in which violations of hard constraints are penalised more heavily than violations of soft constraints. If desired, these weights can be altered during a run.
• Two-stage Optimisation Algorithms. In this case, the hard constraints are first
satisfied to form a feasible solution. Attempts are then made to eliminate violations
of the soft constraints by navigating the space of feasible solutions. Similar schemes
to this have been used in the case studies from Chaps. 7 and 8.
• Algorithms That Allow Relaxations. Here, violations of the hard constraints are
disallowed from the outset by relaxing some features of the problem. Attempts are
then made to try and satisfy the soft constraints, while also considering the task
of eliminating these relaxations. These relaxations could include allowing certain
events to be left out of the timetable, or using additional timeslots or rooms.
The wide variety of constraints, coupled with the fact that each higher education institution will usually have its own timetabling policies, means that timetabling problem formulations have always tended to vary quite widely in the literature. While making the problem area very rich, one drawback has been the lack of opportunity for accurate comparison of algorithms. Since the early 2000s, this situation has been mitigated to a certain extent by the organisation of a series of timetabling competitions and the release of publicly available problem instances. In 2007, for example, the Second International Timetabling Competition (ITC2007) was organised by a group of timetabling researchers from different European universities, which considered the three types of timetabling problems mentioned above: exam timetabling, post enrolment-based course timetabling, and curriculum-based timetabling. The
competition operated by releasing problem instances into the public domain, with
entrants then designing algorithms to try and solve these. Entrants’ algorithms were
then compared under strict time limits according to specific evaluation criteria.1
In this chapter, we will examine the post enrolment-based course timetabling problem
used for ITC2007. This formulation models the real-world situation where students
are given a choice of lectures that they wish to attend, with the timetable then being
constructed according to these choices. The next section contains a formal definition
of this problem, with Sect. 9.3 then containing a short review of the most noteworthy
algorithms. We then go on to describe a high-performance graph colouring-based
method in Sects. 9.4 and 9.5. The final results of our algorithm are given in Sect. 9.6,
with a discussion and conclusions then being presented in Sect. 9.7.
1 http://www.cs.qub.ac.uk/itc2007/.
9.2 Problem Definition and Preprocessing
and finally a precedence matrix $P^{(5)}_{n \times n}$, where

$$P^{(5)}_{ij} = \begin{cases} 1 & \text{if event } e_i \text{ should be assigned to an earlier timeslot than event } e_j \\ -1 & \text{if event } e_i \text{ should be assigned to a later timeslot than event } e_j \\ 0 & \text{otherwise.} \end{cases}$$
For the precedence matrix above, note that two conditions are necessary for the relationships to be consistent: (a) $P^{(5)}_{ij} = 1$ if and only if $P^{(5)}_{ji} = -1$, and (b) $P^{(5)}_{ij} = 0$ if and only if $P^{(5)}_{ji} = 0$. We can also observe the transitivity of this relationship:

$$\forall e_i, e_j, e_l \in e : \; P^{(5)}_{ij} = 1 \wedge P^{(5)}_{jl} = 1 \Rightarrow P^{(5)}_{il} = 1 \tag{9.1}$$
In some of the competition problem instances, this transitivity is not fully expressed;
however, observing it enables further 1’s and −1’s to be added to P(5) during pre-
processing, allowing the relationships to be more explicitly stated.
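This closure step can be sketched as follows in an illustrative Python fragment (names are ours): implied 1’s and −1’s are added until a fixed point is reached, keeping the mirror condition $P^{(5)}_{ji} = -P^{(5)}_{ij}$ intact.

```python
def close_precedences(P):
    """Transitive closure of a precedence matrix with entries 1, -1, 0,
    where P[i][j] == 1 means event i must precede event j. Implied
    relations (Eq. (9.1)) are added in place until nothing changes."""
    n = len(P)
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for j in range(n):
                if P[i][j] != 1:
                    continue
                for l in range(n):
                    if P[j][l] == 1 and P[i][l] != 1:
                        P[i][l], P[l][i] = 1, -1  # add the implied pair
                        changed = True
    return P
```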
Given the above five matrices, we are also able to calculate two further matrices that allow fast detection of hard constraint violations. The first of these is a room suitability matrix $R_{n \times |r|}$ defined as

$$R_{ij} = \begin{cases} 1 & \text{if } \sum_{l=1}^{|s|} P^{(1)}_{li} \le c(r_j) \;\wedge\; \nexists f_l \in f : P^{(3)}_{il} = 1 \wedge P^{(2)}_{jl} = 0 \\ 0 & \text{otherwise.} \end{cases} \tag{9.2}$$
The second is then a conflicts matrix $C_{n \times n}$, defined as

$$C_{ij} = \begin{cases} 1 & \text{if } \left( \exists s_l \in s : P^{(1)}_{li} = 1 \wedge P^{(1)}_{lj} = 1 \right) \\ & \vee \left( \exists r_l \in r : R_{il} = 1 \wedge R_{jl} = 1 \wedge \sum_{l=1}^{|r|} R_{il} = 1 \wedge \sum_{l=1}^{|r|} R_{jl} = 1 \right) \\ & \vee \left( P^{(5)}_{ij} \neq 0 \right) \\ & \vee \left( \nexists t_l \in t : P^{(4)}_{il} = 1 \wedge P^{(4)}_{jl} = 1 \right) \\ 0 & \text{otherwise.} \end{cases} \tag{9.3}$$
The matrix R therefore specifies the rooms that are suitable for each event (that is,
rooms that are large enough for all attending students and that have all the required
features). The C matrix, meanwhile, is a symmetrical matrix ($C_{ij} = C_{ji}$) that specifies pairs of events that cannot be assigned to the same timeslot (i.e., those that
conflict). According to Eq. (9.3), this will be the case if two events ei and e j share
a common student, require the same individual room, are subject to a precedence
relation, or have mutually exclusive subsets of timeslots for which they are available.
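To make Eqs. (9.2) and (9.3) concrete, here is an illustrative Python sketch (function names and data layout are ours). It assumes the index conventions suggested by the equations: $P^{(1)}$ is students × events, $P^{(2)}$ rooms × features, $P^{(3)}$ events × features, and $P^{(4)}$ events × timeslots.

```python
def room_suitability(P1, P2, P3, cap):
    """R[i][j] = 1 iff room j has enough seats for event i's attendees and
    every feature the event requires (P3) is present in the room (P2)."""
    n, nrooms = len(P3), len(cap)
    R = [[0] * nrooms for _ in range(n)]
    for i in range(n):
        attendance = sum(row[i] for row in P1)
        for j in range(nrooms):
            fits = attendance <= cap[j]
            featured = all(P2[j][l] == 1
                           for l in range(len(P3[i])) if P3[i][l] == 1)
            R[i][j] = 1 if fits and featured else 0
    return R

def conflicts(P1, P4, P5, R):
    """C[i][j] = 1 iff events i and j share a student, compete for the same
    sole suitable room, are linked by a precedence relation, or have no
    common available timeslot."""
    n = len(R)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            share_student = any(row[i] == 1 and row[j] == 1 for row in P1)
            same_sole_room = (sum(R[i]) == 1 and sum(R[j]) == 1 and
                              any(a == 1 and b == 1 for a, b in zip(R[i], R[j])))
            precedence = P5[i][j] != 0
            no_common_slot = not any(a == 1 and b == 1
                                     for a, b in zip(P4[i], P4[j]))
            C[i][j] = int(share_student or same_sole_room or
                          precedence or no_common_slot)
    return C
```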
Note that the matrix C is analogous to the adjacency matrix of a graph G = (V, E)
with n vertices, highlighting the similarities between this timetabling problem and
the graph colouring problem. However, unlike the graph colouring problem, in this
case the ordering of the timeslots (colour classes) is also an important property of
a solution. Consequently, a solution is represented by an ordered set of sets $S = (S_1, \ldots, S_{k=45})$ and is subject to the satisfaction of the following hard constraints:

$$\bigcup_{i=1}^{k} S_i \subseteq e \tag{9.4}$$

$$S_i \cap S_j = \emptyset \quad (1 \le i \neq j \le k) \tag{9.5}$$

$$\forall e_j, e_l \in S_i, \; C_{jl} = 0 \quad (1 \le i \le k) \tag{9.6}$$

$$\forall e_j \in S_i, \; P^{(4)}_{ji} = 1 \quad (1 \le i \le k) \tag{9.7}$$

$$\forall e_j \in S_i, \, e_l \in S_{q<i}, \; P^{(5)}_{jl} \neq 1 \quad (1 \le i \le k) \tag{9.8}$$

$$\forall e_j \in S_i, \, e_l \in S_{q>i}, \; P^{(5)}_{jl} \neq -1 \quad (1 \le i \le k) \tag{9.9}$$

$$S_i \in M \quad (1 \le i \le k). \tag{9.10}$$
Constraints (9.4) and (9.5) state that S should partition the event set e (or a subset
of e) into an ordered set of sets, labelled S1 , . . . , Sk . Each set Si ∈ S contains the
events that are assigned to timeslot ti in the timetable. Equation (9.6) stipulates that
no pair of conflicting events should be assigned to the same set Si ∈ S (the graph
colouring constraint), while (9.7) states that each event should be assigned to a set
Si ∈ S whose corresponding timeslot ti is deemed available according to matrix
P(4) . Constraints (9.8) and (9.9) then impose the precedence requirements of the
problem.
Finally, (9.10) is concerned with ensuring that the events assigned to a set Si ∈
S can each be assigned to a suitable room from the room set r . To achieve this,
it is necessary to solve a maximum bipartite matching problem. Specifically, let
G = (Si , r, E) be a bipartite graph with vertex sets Si and r , and an edge set
E = {{e j ∈ Si , rl ∈ r } : R jl = 1}. Given G, the set Si is a member of M if and
only if there exists a maximum bipartite matching of G comprising |Si | edges. In
this case, the room constraints for this timeslot are satisfied.
Figure 9.2 shows two examples of these ideas using |Si | = 4 events and |r | = 5
rooms. In Fig. 9.2a, a matching exists—for example, event e1 can be assigned to
room r1 , e2 to r2 , e3 to r3 , and e4 to r5 . On the other hand, a matching does not exist
in Fig. 9.2b, meaning $S_i \notin M$ in this case. Matching problems on bipartite graphs can be solved in polynomial time using, for example, the $O(m\sqrt{n})$ Hopcroft–Karp algorithm.
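For illustration, the feasibility check behind constraint (9.10) can be written as a short matching routine. The sketch below uses simple augmenting paths (Kuhn's algorithm) rather than Hopcroft–Karp, which is asymptotically faster but longer to state; names and data layout are ours, with `R[e][room]` playing the role of the room suitability entries:

```python
def rooms_satisfiable(events, R):
    """Return True iff every event in `events` can be given its own suitable
    room, i.e. the bipartite graph admits a matching of size |events|."""
    match = {}  # room index -> event currently assigned to it

    def augment(e, seen):
        for room, ok in enumerate(R[e]):
            if ok and room not in seen:
                seen.add(room)
                # room is free, or its current event can be moved elsewhere
                if room not in match or augment(match[room], seen):
                    match[room] = e
                    return True
        return False

    return all(augment(e, set()) for e in events)
```

On examples of the flavour of Fig. 9.2 (four events, five rooms), the check succeeds when the suitable rooms can be distributed one per event, and fails when, say, two events are each suitable only for the same single room.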
As mentioned, in addition to finding a solution that obeys all of the hard constraints,
three soft constraints are also considered with this problem.
• SC1. Students should not be required to attend an event in the last timeslot of each
day (i.e., timeslots 9, 18, 27, 36, or 45);
• SC2. Students should not have to attend events in three or more successive timeslots
occurring in the same day; and,
• SC3. Students should not be required to attend just one event in a day.
The extent to which these constraints are violated is measured by a soft constraints
cost (SCC), which is worked out in the following way. For SC1, if a student attends
an event assigned to an end-of-day timeslot, this is counted as one penalty point.
Naturally, if x students attend this class, this counts as x penalty points. For SC2, if a
student attends three events in a row we count this as one penalty point. If a student
has four events in a row we count this as two, and so on. Note that students assigned
to events occurring in consecutive timeslots over two separate days are not counted
as violations. Finally, each time we encounter a student with a single event on a day,
we count this as one penalty point (two for 2 days with single events, etc.). The SCC
is simply the total of these three values.
More formally, the SCC can be calculated using two matrices: X|s|×45 , which
tells us the timeslots for which each student is attending an event, and Y|s|×5 , which
specifies whether or not a student is required to attend just one event in each of the
5 days.
$$X_{ij} = \begin{cases} 1 & \text{if } \exists e_l \in S_j : P^{(1)}_{il} = 1 \\ 0 & \text{otherwise,} \end{cases} \tag{9.12}$$

$$Y_{ij} = \begin{cases} 1 & \text{if } \sum_{l=1}^{9} X_{i,9(j-1)+l} = 1 \\ 0 & \text{otherwise.} \end{cases} \tag{9.13}$$
Here, the three terms summed in the outer parentheses of Eq. 9.14 define the number
of violations of SC1, SC2, and SC3 (respectively) for each student on each day of
the timetable.
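The page carrying Eq. (9.14) itself is not reproduced here, but the three counting rules can be sketched directly from the prose above. An illustrative Python version (function name ours) operating on the student–timeslot matrix X, with nine timeslots per day over five days:

```python
def soft_cost(X):
    """Soft constraint cost: for each student and day, count end-of-day
    events (SC1), consecutive-event runs (SC2: three in a row = 1 point,
    four in a row = 2, ...), and single-event days (SC3)."""
    cost = 0
    for row in X:
        for d in range(5):
            day = row[9 * d: 9 * (d + 1)]
            cost += day[8]                    # SC1: last timeslot of the day
            run = 0
            for slot in day:                  # SC2: runs never span two days
                run = run + 1 if slot else 0
                if run >= 3:
                    cost += 1
            if sum(day) == 1:                 # SC3: lone event in a day
                cost += 1
    return cost
```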
Having defined the post enrolment-based course timetabling problem, we are now
in a position to state its complexity.
Proof. Let Cn×n be our symmetric conflicts matrix as defined above, filled arbitrarily.
In addition, let the following conditions hold:
$$|r| \ge n \tag{9.15}$$

$$R_{ij} = 1 \quad \forall e_i \in e, \, r_j \in r \tag{9.16}$$

$$P^{(4)}_{ij} = 1 \quad \forall e_i \in e, \, t_j \in t \tag{9.17}$$

$$P^{(5)}_{ij} = 0 \quad \forall e_i, e_j \in e. \tag{9.18}$$
Here, there is an excess number of rooms which are suitable for all events ((9.15)
and (9.16)), there are no event availability constraints (9.17), and no precedence
constraints (9.18). In this special case we are therefore only concerned with satisfying
Constraints (9.4)–(9.6) while minimising the DTF. Determining the existence of a
feasible solution using k timeslots is therefore equivalent to the N P -complete graph
k-colouring problem.
From a different perspective, Cambazard et al. [9] have also shown that, in the
absence of all hard constraints (9.6)–(9.10) and soft constraints SC1 and SC2, the
problem of satisfying SC3 (i.e., minimising the number of occurrences of students
sitting a single event in a day) is equivalent to the N P -hard set covering problem.
From the above descriptions, we see that a timetable’s quality is described by two
values: the distance to feasibility (DTF) and the soft constraint cost (SCC). According
to the competition criteria, when comparing solutions the one with the lowest DTF is
deemed the best timetable, reflecting the increased importance of the hard constraints
over the soft constraints. However, when two or more solutions’ DTFs are equal, the
winner is deemed the solution among these that has the lowest SCC.
There are 24 benchmark instances available for this problem. These were gener-
ated so that all are known to feature at least one perfect solution (that is, a solution
with DTF = 0 and SCC = 0). For comparative purposes, a benchmark timing pro-
gram is also available on the competition website that allocates a strict time limit for
each machine that it is executed on (based on its hardware and operating system).
This allows researchers to use approximately the same amount of computational
effort when testing their implementations, allowing more accurate comparisons.
9.3 Previous Approaches to This Problem

One of the first studies into the post enrolment-based timetabling problem (in this
form) was carried out by Rossi-Doria et al. [10], who used it as a test problem for
comparing five different metaheuristics, namely evolutionary algorithms, simulated
annealing, iterated local search, ant colony optimisation, and tabu search. Two inter-
esting observations were offered in their work:
These conclusions have since proven to be quite salient, with several successful algo-
rithms following this suggested two-stage methodology. This includes the winning
entry of ITC2007 itself, due to Cambazard et al. [11], which uses tabu search together
with an intensification procedure to achieve feasibility, with simulated annealing then
being used to satisfy the soft constraints.
Since the running of ITC2007, several papers have been published that have
equalled or improved upon the results of the competition. Cambazard et al. [9]
have shown how the results of their two-stage competition entry can be improved
by relaxing Constraint (9.10) such that a timeslot ti is considered feasible whenever
|Si | < |r |. The rationale for this relaxation is that it will “increase the solution density
of the underlying search space”, though a repair operator is also needed to make sure
that the timeslots satisfy Constraint (9.10) at the end of execution. Cambazard et al. [9] have also examined constraint programming-based approaches and a large neighbourhood search (LNS) scheme, finding that their best results are obtained when using simulated annealing together with the LNS operator to reinvigorate the search from time to time.
Other successful algorithms for this problem have followed the one-stage optimisation scheme by attempting to reduce violations of hard and soft constraints
simultaneously. Ceschia et al. [12], for example, treat this problem as a single-objective optimisation problem in which the space of valid and invalid solutions is
explored. Specifically, they allow violations of Constraints (9.6), (9.8), and (9.9)
within a solution, and use the number of students affected by such violations, together with the DTF, to form an infeasibility measure. This is then multiplied by a weighting coefficient w and added to the SCC to form the objective function. Simulated annealing is then used to optimise this objective function and, surprisingly, after extensive parameter tuning w = 1 is found to provide their best results.
Nothegger et al. [13] have also attempted to optimise the DTF and SCC simultaneously, making use of ant colony optimisation to explore the space of valid solutions.
Here, the DTF and SCC measures are used to update the algorithm’s pheromone
matrices so that favourable assignments of events to rooms and timeslots will occur
with higher probability in later iterations of the algorithm. Nothegger et al. also show
that the results of their algorithm can be improved by adding a local search-based
improvement method and by parallelising the algorithm.
Jat and Yang [14] have also used a weighted sum objective function in their hybrid
evolutionary algorithm/tabu search approach, though their results do not appear as
strong as those of the previous two papers. Similarly, van den Broek and Hurkens [15]
have also used a weighted sum objective function in their deterministic algorithm
based on column generation techniques.
From the above studies, it is clear that the density and connectivity of the under-
lying solution space is an important issue in the performance of a neighbourhood
search algorithm for this problem. In particular, if connectivity is low then movements in the solution space will be more restricted, perhaps making improvements in
the objective function more difficult to achieve. From the research discussed above,
it is noticeable that some of the best approaches for this problem have attempted to
mitigate this issue by relaxing some of the hard constraints and/or by allowing events
to be kept out of the timetable. However, such methods also require mechanisms for
coping with these relaxations, such as repair operators (which may ultimately require
large alterations to be made to a solution), or by introducing terms into the objective
function (which will require appropriate weighting coefficients to be determined,
perhaps via tuning). On the other hand, a two-stage approach of the type discussed
by Rossi-Doria et al. [10] will not need these features, though because feasibility
must be maintained when the SCC is being optimised, the underlying solution space
may be more sparsely connected, perhaps making good levels of optimisation more
difficult to achieve. We will focus on the issue of connectivity in Sect. 9.5 onwards.
9.4 Algorithm Description: Stage One

Before looking at the task of eliminating soft constraint violations, it is first necessary
to produce a valid solution that minimises the DTF measure (Eq. (9.11)). Previous
strategies for this task have typically involved inserting all events into the timetable,
and then rearranging them to remove violations of the hard constraints [9,11,12,16].
Table 9.1 Heuristics used for producing an initial solution in Stage 1. Here, a “valid place” is
defined as a room/timeslot pair that an event can be assigned to without violating Constraints
(9.4)–(9.10)
Rule Description
h1 Choose the unplaced event with the smallest number of valid places in the timetable to
which it can be assigned
h2 Choose the unplaced event ei that conflicts with the most other events (i.e., that maximises $\sum_{j=1}^{n} C_{ij}$)
h3 Choose an event randomly
h4 Choose the place that is valid for the least number of other unplaced events in U
h5 Choose the valid timeslot containing the fewest events
h6 Choose a place randomly
the current cost, 0.6 × |U | + x, where x is an integer uniformly selected from the set {0, 1, . . . , 9}.
Similarly to the original PartialCol algorithm, at each iteration the entire neighbourhood of (|U | × k) moves is examined, and the move that is chosen is the one that invokes the largest decrease (or, failing that, the smallest increase) in the cost of any valid, non-tabu move. Ties are broken randomly, and tabu moves are also permitted if they are seen to improve on the best solution found so far. From time to time, there may also be no valid non-tabu moves available from a particular solution, in which case a randomly selected event is transferred from S into U , before the process is continued as above.
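This selection rule can be sketched as follows in Python. The cost evaluation and tabu test are left as callbacks, since their details depend on parts of the algorithm not shown here; all names are ours:

```python
import random

def tabu_tenure(unplaced_count, rng=random):
    """Tenure from the text: 0.6 x |U| plus x drawn uniformly from {0,...,9}."""
    return int(0.6 * unplaced_count) + rng.randrange(10)

def best_move(unplaced, k, cost_of, is_tabu, best_cost):
    """Scan the full |U| x k neighbourhood and return the (event, timeslot)
    move giving the lowest resulting cost, breaking ties randomly. Tabu
    moves are admitted only if they beat the best cost seen so far
    (aspiration). cost_of returns None for invalid moves."""
    candidates = []
    for event in unplaced:
        for slot in range(k):
            c = cost_of(event, slot)
            if c is None:
                continue
            if is_tabu(event, slot) and c >= best_cost:
                continue
            candidates.append((c, event, slot))
    if not candidates:
        return None  # caller perturbs the solution instead
    low = min(c for c, _, _ in candidates)
    return random.choice([(e, s) for c, e, s in candidates if c == low])
```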
9.4.1 Results
Table 9.2 contains the results of our PartialCol algorithm and compares them to
those reported by Cambazard et al. [9].2 We report the percentage of runs in which
each instance has been solved (i.e., where a DTF of zero has been achieved), and
the average time that this took (calculated only from the solved runs).3 We see that
the success rates for the two approaches are similar, with all except one instance
being solved in 100% of cases (instance #10 in Cambazard et al.’s case, instance
#11 in ours). However, except for instance #11, the time required by PartialCol is
considerably less, with an average reduction of 97.4% in CPU time achieved across
the 15 remaining instances.
Curiously, when using our PartialCol algorithm with instance #11, most of the
runs were solved very quickly. However, in a small number of cases, the algorithm
seemed to quickly navigate to a point at which a small number of events remained
unplaced and where no further improvements could be made, suggesting the search
was caught in a conspicuous valley in the cost landscape. To remedy this situation, we therefore added a diversification mechanism to the method that attempts to
break out of such regions. We call this our improved PartialCol algorithm and its
results are also given in Table 9.2.
In the improved PartialCol method, our diversification mechanism is used for
making relatively large changes to the incumbent solution, allowing new regions of
2 Our algorithm was implemented in C++, and all experiments were conducted on 3.0 GHz Win-
dows 7 PCs with 3.87 GB RAM. The competition benchmarking program allocated 247 s on this
equipment. The source code is available at http://www.rhydlewis.eu/resources/ttCodeResults.zip.
3 For comparative purposes, the computation times stated by Cambazard et al. [9] have been altered
in Table 9.2 to reflect the increased speed of our equipment. According to Cambazard et al. the
competition benchmark program allocated them 324 s per run. Consequently, their original run
times have been reduced by 23.8%. We should note, however, that when comparing algorithms
in this way, discrepancies in results and times can also occur due to differences in the hardware,
operating system, programming language, and compiler options used. Our use of the competition
benchmark program attempts to reduce discrepancies caused by the first two factors, but cannot
correct for differences arising due to the latter two.
Table 9.2 Comparison of results from the LS-colouring method of Cambazard et al. [9], and our PartialCol and Improved PartialCol algorithms (all figures
taken from 100 runs per instance)
Instance # 1 2 3 4 5 6 7 8 9 10 11 12
Cambazard et al. [9] % Solved 100 100 100 100 100 100 100 100 100 98 100 100
Avg. time (s) 11.60 37.10 0.37 0.43 3.58 4.32 1.84 1.11 51.73 170.24 0.40 0.64
PartialCol % Solved 100 100 100 100 100 100 100 100 100 100 98 100
Avg. time (s) 0.25 0.79 0.02 0.04 0.05 0.07 0.02 0.01 0.71 1.80 1.88 0.04
Improved PartialCol % Solved 100 100 100 100 100 100 100 100 100 100 100 100
Avg. time (s) 0.25 0.79 0.02 0.02 0.06 0.08 0.03 0.01 0.68 2.03 0.03 0.04
Instance # 13 14 15 16 17 18 19 20 21 22 23 24
Cambazard et al. [9] % Solved 100 100 100 100 – – – – – – – –
Avg. time (s) 8.86 7.97 0.80 0.55 – – – – – – – –
PartialCol % Solved 100 100 100 100 100 100 100 100 100 100 100 100
Avg. time (s) 0.08 0.11 0.01 0.01 0.00 0.02 0.74 0.01 0.07 3.77 1.33 0.17
Improved PartialCol % Solved 100 100 100 100 100 100 100 100 100 100 100 100
Avg. time (s) 0.08 0.11 0.01 0.01 0.00 0.02 0.71 0.01 0.08 3.80 1.10 0.18
the solution space to be explored. It is called when the best solution found so far has
not been improved for a set number of iterations. The mechanism operates by first randomly selecting a percentage of events in S and transferring them to the set of unplaced events U . Next, alterations are made to S by performing a random walk using neighbourhood operator N5 (to be described in Sect. 9.5). Finally, the tabu list is reset so that all potential moves are deemed non-tabu before PartialCol continues to execute as before. For the results in Table 9.2, the diversification mechanism was called after 5000 non-improving iterations and extracted 10% of all events in S.
A random walk of 100 neighbourhood moves was then performed, giving a >95%
chance of all timeslots being altered by the neighbourhood operator (several other parameter settings were also tried here, though few differences in performance were observed). We see that the improved PartialCol method has achieved feasibility in
all runs in the sample, with the average time reduction remaining at 97.4% compared
to the method of Cambazard et al. [9].
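The diversification mechanism described above can be sketched as follows (illustrative Python; the `walk_step` callback stands in for the N5 operator, which is described later in the chapter):

```python
import random

def diversify(placed, unplaced, tabu, walk_step,
              extract_frac=0.10, walk_len=100):
    """Move a random fraction of the placed events into the unplaced set,
    perform a random walk of neighbourhood moves, then reset the tabu list
    so that all moves are non-tabu again."""
    k = max(1, int(extract_frac * len(placed)))
    for event in random.sample(sorted(placed), k):
        placed.remove(event)
        unplaced.add(event)
    for _ in range(walk_len):
        walk_step()
    tabu.clear()
```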
In the second stage of this algorithm, we use simulated annealing (SA) to explore
the space of valid/feasible solutions, and attempt to minimise the number of soft
constraint violations measured by the SCC (Eq. (9.14)). This metaheuristic is applied
similarly to that described in Chap. 4: starting at an initial temperature T0 , during
execution the temperature variable is slowly reduced according to an update rule
Ti+1 = αTi , where the cooling rate α ∈ (0, 1). At each temperature Ti , a Markov chain is generated by performing n² applications of the neighbourhood operator.
Moves that are seen to violate a hard constraint are immediately rejected. Moves
that preserve feasibility but that increase the cost of the solution are accepted with
probability exp(−|δ|/Ti ) (where δ is the change in cost), while moves that reduce
or maintain the cost are always accepted. The initial temperature T0 is calculated
automatically by performing a small sample of neighbourhood moves and using the
standard deviation of the cost over these moves [18].
Because this algorithm is intended to operate according to a time limit, a value for
α is determined automatically so that the temperature is reduced as slowly as possible
between T0 and some end temperature Tend . This is achieved by allowing α to be
modified during a run according to the length of time that each Markov chain takes
to generate. Specifically, let μ∗ denote the estimated number of Markov chains that
will be completed in the remainder of the run, calculated by dividing the amount of
remaining run time by the length of time the most recent Markov chain (operating at
temperature Ti ) took to generate. On completion of the ith Markov chain, a modified
cooling rate can thus be calculated as

$$\alpha_{i+1} = (T_{\text{end}}/T_i)^{1/\mu^*} \tag{9.19}$$
262 9 Designing University Timetables
The upshot is that the cooling rate will be altered slightly during a run, allowing the
user-specified end temperature Tend to be reached at the time limit. Suitable values
for Tend , the only parameter required for this phase, are examined in Sect. 9.6.
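For illustration, the modified cooling rate of Eq. (9.19) can be computed directly from the remaining run time and the duration of the most recent Markov chain. The following sketch uses our own names and assumes both times are measured in the same units:

```cpp
#include <cmath>

// Recompute the cooling rate after the ith Markov chain (Eq. (9.19)).
// mu* -- the estimated number of chains that can still be completed --
// is the remaining run time divided by the time the last chain took.
double adaptiveCoolingRate(double Ti, double Tend,
                           double remainingTime, double lastChainTime) {
    double muStar = remainingTime / lastChainTime; // estimated chains left
    return std::pow(Tend / Ti, 1.0 / muStar);
}
```

For example, with Ti = 1, Tend = 0.1, and ten chains' worth of time remaining, the rate is 0.1^(1/10), so ten further multiplicative reductions land on Tend at the time limit.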
N1 : The first neighbourhood operator is based on those used by Lewis [17] and
Nothegger et al. [13]. Consider a valid solution S represented as a matrix Z of dimensions |r| × k, in which rows represent rooms and columns represent timeslots. Each element of Z can be blank or can be occupied by exactly one event. If Zij is blank, then room ri is vacant in timeslot tj; if Zij = el, then event el is assigned to room ri and timeslot tj. N1 operates by first randomly selecting an element Zi1j1 containing an arbitrary event el. A second element Zi2j2 is then randomly selected in a different timeslot (j1 ≠ j2). If Zi2j2 is blank, the operator attempts to transfer el from timeslot j1 into any vacant room in timeslot j2; if Zi2j2 = eq, then a swap is attempted in which el is moved into any vacant room in timeslot j2, and eq is moved into any vacant room in timeslot j1. If such changes are seen to violate any of the hard constraints, they are rejected immediately; otherwise they are kept and the new solution is evaluated according to Eq. (9.14).
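A simplified sketch of this move follows (our own code, with hard-constraint checking omitted; unlike the full operator, which searches all vacant rooms in the target timeslot, it targets one specific cell). Blank cells are represented by the assumed sentinel value BLANK = -1:

```cpp
#include <random>
#include <utility>
#include <vector>

const int BLANK = -1; // sentinel: room vacant in this timeslot

// Z[i][j] holds an event ID, or BLANK if room i is vacant in timeslot j.
// Sketch of N1: pick an occupied cell, then a cell in a different
// timeslot; swapping the two cells either transfers the event (if the
// second cell is blank) or exchanges the two events' slots. Returns
// false if the randomly chosen cells do not define a move.
bool applyN1(std::vector<std::vector<int>> &Z, std::mt19937 &rng) {
    int rooms = static_cast<int>(Z.size());
    int slots = static_cast<int>(Z[0].size());
    std::uniform_int_distribution<int> pickRoom(0, rooms - 1);
    std::uniform_int_distribution<int> pickSlot(0, slots - 1);
    int i1 = pickRoom(rng), j1 = pickSlot(rng);
    if (Z[i1][j1] == BLANK) return false;   // need an occupied first cell
    int i2 = pickRoom(rng), j2 = pickSlot(rng);
    if (j1 == j2) return false;             // must be a different timeslot
    std::swap(Z[i1][j1], Z[i2][j2]);        // transfer (if blank) or swap
    return true;
}
```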
N2 : This operates in the same manner as N1 . However, when seeking to insert
an event into a timeslot, if no vacant, suitable room is available, a maximum
matching algorithm is also executed to determine if a valid room allocation of
the events can be found. A similar operator was used by Cambazard et al. [11]
in their winning competition entry.
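The matching step can be illustrated with a standard augmenting-path bipartite matching between events and rooms. This is a generic sketch under assumed names (canUse[e] lists the rooms suitable for event e), not the book's implementation:

```cpp
#include <vector>

// Try to give event e a room, possibly displacing other events along an
// augmenting path. roomOf[r] holds the event occupying room r (-1 if free).
bool tryAssign(int e, const std::vector<std::vector<int>> &canUse,
               std::vector<int> &roomOf, std::vector<bool> &visited) {
    for (int r : canUse[e]) {
        if (visited[r]) continue;
        visited[r] = true;
        // Room r is free, or its current occupant can be moved elsewhere.
        if (roomOf[r] == -1 || tryAssign(roomOf[r], canUse, roomOf, visited)) {
            roomOf[r] = e;
            return true;
        }
    }
    return false;
}

// True iff every event in the timeslot can be given its own suitable room.
bool feasibleRoomAllocation(int numEvents, int numRooms,
                            const std::vector<std::vector<int>> &canUse) {
    std::vector<int> roomOf(numRooms, -1);
    for (int e = 0; e < numEvents; ++e) {
        std::vector<bool> visited(numRooms, false);
        if (!tryAssign(e, canUse, roomOf, visited)) return false;
    }
    return true;
}
```

A maximum matching exists exactly when this loop succeeds for every event, which is what allows N2 to accept insertions that N1 would reject for lack of a directly vacant room.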
N3 : This is an extension of N2 . Specifically, if the proposed move in N2 will result in
a violation of Constraint (9.6), then a Kempe chain interchange is attempted (see
Definition 4.3). An example of this process is shown in Fig. 9.3a. Imagine in this
case that we have chosen to swap the events e5 ∈ Si and e10 ∈ S j . However,
doing so will violate Constraint (9.6) because events e5 and e11 conflict but
would now both be assigned to timeslot S j . In this case, we therefore construct
the Kempe chain Kempe(e5 , i, j) = {e5 , e10 , e11 } which, when interchanged,
guarantees the preservation of Constraint (9.6), as shown in Fig. 9.3b. Observe
that this neighbourhood operator also includes pair swaps (see Definition 4.4)—
for example, if we were to select events e4 ∈ Si and e8 ∈ S j from Fig. 9.3a.
Note, however, that as with the previous neighbourhood operators, applications
of N3 may not preserve the satisfaction of the remaining hard constraints. Such
moves will again need to be rejected if this is the case.
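A Kempe chain of this kind can be built by a breadth-first search over the conflict graph restricted to the two timeslots involved. The following is an illustrative sketch using assumed data structures (a symmetric conflict matrix C, with C[u][v] = 1 iff events u and v conflict, and an array slotOf giving each event's current timeslot):

```cpp
#include <queue>
#include <set>
#include <vector>

// Construct Kempe(e, i, j): starting from event e, repeatedly add any
// conflicting event that lies in the *other* of the two timeslots, so
// that interchanging the chain preserves the event-clash constraint.
std::set<int> kempeChain(int e, int i, int j,
                         const std::vector<std::vector<int>> &C,
                         const std::vector<int> &slotOf) {
    std::set<int> chain = {e};
    std::queue<int> q;
    q.push(e);
    while (!q.empty()) {
        int u = q.front(); q.pop();
        int other = (slotOf[u] == i) ? j : i;  // conflicts cross timeslots
        for (int v = 0; v < static_cast<int>(C.size()); ++v) {
            if (C[u][v] == 1 && slotOf[v] == other && !chain.count(v)) {
                chain.insert(v);
                q.push(v);
            }
        }
    }
    return chain;
}
```

Interchanging the timeslots of all events in the returned set cannot create a clash within either timeslot, which is the property exploited by N3.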
N4 : This operator extends N3 by using the idea of double Kempe chains, originally
proposed by Lü and Hao [19]. In many cases, a proposed Kempe chain inter-
change will be rejected because it will violate Constraint (9.10): that is, suitable
rooms will not be available for all of the events proposed for assignment to a particular timeslot. For example, in Fig. 9.3a, the proposed Kempe interchange
involving events {e1 , e2 , e3 , e6 , e7 } is guaranteed to violate Constraint (9.10)
because it will result in too many events in timeslot S j for a feasible matching
to be possible. However, applying a second Kempe chain interchange at the
same time may result in feasibility being maintained, as illustrated in Fig. 9.3c.
In this operator, if a proposed single Kempe chain interchange is seen to violate
Constraint (9.10) only, then a random vertex from one of the two timeslots,
but from outside this chain, is randomly selected, and a second Kempe chain
is formed from it. If the proposed interchange of both Kempe chains does not
violate any of the hard constraints, then the move can be performed and the
new solution can be evaluated according to Eq. (9.14) as before.
N5 : Finally, N5 defines a multi-Kempe chain operator. This generalises N4 in that if
a proposed double Kempe chain interchange is seen to violate Constraint (9.10)
only, then triple Kempe chains, quadruple Kempe chains, and so on, can also
be investigated in the same manner. Note that when constructing these multiple
Kempe chains, a violation of any of the constraints (9.7)–(9.9) allows us to
reject the move immediately. However, if only Constraint (9.10) continues to
be violated, then eventually the considered Kempe chains will contain all events
in both timeslots, in which case the move becomes equivalent to swapping the
contents of the two timeslots. Trivially, in such a move Constraint (9.10) is
guaranteed to be satisfied.
From the above descriptions, it is clear that each successive neighbourhood operator requires more computation than its predecessor. Each operator also generalises its predecessor—that is, N1(S) ⊆ N2(S) ⊆ · · · ⊆ N5(S), ∀S ∈ S. From the perspective of the graph G = (S, E) defined above, this implies a greater connectivity of the solution space since E1 ⊆ E2 ⊆ · · · ⊆ E5 (where Ei = {{S, S′} : S′ ∈ Ni(S)} for i = 1, . . . , 5). Note, though, that the set of vertices (solutions) S remains the same under these different operators.
Finally, it is also worth mentioning that each of the above operators only ever
alters the contents of two timeslots in any particular move. In practice, this means
that we only need to consider the particular days and students affected by the move
when reevaluating the solution according to Eq. (9.14). This allows considerable
speed-up of the algorithm.
Fig. 9.3 Example moves using N3 and N4. Here, edges exist between pairs of vertices (events) el, eq
if and only if Clq = 1. Part a shows two timeslots containing two Kempe chains, {e1 , e2 , e3 , e6 , e7 }
and {e5 , e10 , e11 }. Part b shows a result of interchanging the latter chain. Part c shows the result of
interchanging both chains. Note that room allocations are determined via a matching algorithm and
can therefore change during an interchange
Fig. 9.4 Graphs depicting the connectivity of a solution space with (a) no dummy rooms, and (b) one or more dummy rooms. Vertices represent either feasible solutions or solutions using a dummy room
(i.e., S( j) ⊆ S( j+1) , ∀ j ≥ 0), with extra edges (dotted in the figure) being created
between some of the original vertices and new vertices. As depicted, this could also
allow previously disjoint components of the solution space to become connected.
Because they do not form part of the original problem, at the end of the optimisation
process all dummy rooms will need to be removed. This means that any events
assigned to these will contribute to the DTF measure. Because this is undesirable,
in our case we attempt to discourage the assignment of events to the dummy rooms
during evaluation by considering all events assigned to a dummy room as unplaced.
We then use the cost function w × DTF + SCP, where w is a weighting coefficient
that will need to be set by the user. Additionally, when employing the maximum
matching algorithm it also makes sense to ensure that the dummy room is only
used when necessary—that is, if a feasible matching can be achieved without using
dummy rooms, then this is the one that will be used.
We have now seen various neighbourhood operators for this problem and made some
observations on the connectivity of their underlying solution spaces, defined by the
graph G = (S, E). Unfortunately, however, it is very difficult to gain a complete
understanding of G’s connectivity because it is simply too large. In particular, we are
unlikely to be able to confirm whether G is connected or not, which would be useful
information if we wanted to know whether an optimal solution could be reached
from any other solution within the solution space.
One way to gain an indication of G’s connectivity is to make use of what we call
the feasibility ratio. This is defined as the proportion of proposed neighbourhood
moves that are seen to not violate any of the hard constraints (i.e., that maintain va-
lidity/feasibility). A lower feasibility ratio suggests lower connectivity in G because,
on average, more potential moves will be seen to violate a hard constraint, making
movements within the solution space more restricted. A higher feasibility ratio will
suggest a greater level of connectivity.
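Estimating this ratio is straightforward: propose a large number of random neighbourhood moves and record the proportion that preserve feasibility. A generic sketch follows (the proposeMove callback is an assumption standing in for the problem-specific move machinery):

```cpp
#include <functional>

// Estimate the feasibility ratio: the proportion of `trials` proposed
// neighbourhood moves that would not violate any hard constraint.
// `proposeMove` generates one random move and returns true iff it
// preserves feasibility.
double feasibilityRatio(int trials, const std::function<bool()> &proposeMove) {
    int feasible = 0;
    for (int t = 0; t < trials; ++t)
        if (proposeMove()) ++feasible;
    return static_cast<double>(feasible) / trials;
}
```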
Figure 9.5 displays the feasibility ratios for neighbourhood operators N1, . . . , N5 and also N5(1) for all available problem instances. These mean figures were found by performing random walks of 50,000 feasible-preserving moves from a sample of 20 feasible solutions per instance. As expected, we see that the feasibility ratios increase for each successive neighbourhood operator, though the differences between N3, N4, and N5 appear to be only marginal. We also observe quite a large range across the instances, with instance #10 appearing to exhibit the least connected solution space (with feasibility ratios ranging from just 0.0005 (N1) to 0.004 (N5(1))), and instance #17 having the highest levels of connectivity (0.04 (N1) to 0.10 (N5(1))). Standard deviations from these samples range between 0.000018 (N2, #20) and 0.000806 (N1, #23). These observations will help to explain the results in the following sections.
Fig. 9.5 Feasibility ratios for neighbourhood operators N1, . . . , N5 and also N5(1) for all 24 problem instances
We now examine the ability of each neighbourhood operator to reduce the soft
constraint cost within the time limit specified by the competition benchmarking
program (minus the time used for Stage 1). We also consider the effects of altering the
end temperature of simulated annealing Tend , which is the only run-time parameter
required in this stage. To measure performance, we compare our results to those
achieved by the five finalists of the 2007 competition using the competition’s ranking
system. This involves calculating a “ranking score” for each algorithm, which is
derived as follows.
Given x algorithms and a single problem instance, each algorithm is executed
y times, giving x y results. These results are then ranked from 1 to x y, with ties
receiving a rank equal to the average of the ranks they span. The mean of the ranks
assigned to each algorithm is then calculated, giving the respective rank scores for
the x algorithms on this instance. This process is then repeated on all instances, and
the mean of all ranking scores for each algorithm is taken as its overall ranking score.
A worked example of this process is shown in Table 9.3. We see that the best ranking
score achievable for an algorithm on a particular instance is (y + 1)/2, in which case
its y results are better than all of the other algorithms’ (as is the case with Algorithm
A on instance #3 in the table). The worst possible ranking score, (x − 1)y + (y + 1)/2, has occurred with Algorithm C on instances #1, #2, and #3 in the table.
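The tie-averaged ranking calculation can be sketched as follows. For simplicity, this illustration assumes each run is summarised by a single comparable cost value (in the competition, runs are compared first by DTF and then by SCC) and that every algorithm has the same number y of runs; the function name is our own:

```cpp
#include <algorithm>
#include <vector>

// Rank scores for one problem instance. results[a] holds the y costs of
// algorithm a. All x*y results are ranked from 1 to x*y, ties taking the
// average of the ranks they span; each algorithm's score is the mean of
// the ranks assigned to its own results.
std::vector<double> rankScores(const std::vector<std::vector<double>> &results) {
    struct Entry { double cost; int alg; };
    std::vector<Entry> all;
    for (int a = 0; a < static_cast<int>(results.size()); ++a)
        for (double c : results[a]) all.push_back({c, a});
    std::sort(all.begin(), all.end(),
              [](const Entry &p, const Entry &q) { return p.cost < q.cost; });
    std::vector<double> score(results.size(), 0.0);
    int i = 0, n = static_cast<int>(all.size());
    while (i < n) {
        int j = i;
        while (j < n && all[j].cost == all[i].cost) ++j;  // tie block [i, j)
        double avgRank = (i + 1 + j) / 2.0;               // mean of ranks i+1..j
        for (int k = i; k < j; ++k) score[all[k].alg] += avgRank;
        i = j;
    }
    for (double &s : score) s /= results[0].size();       // mean over y runs
    return score;
}
```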
A full breakdown of the results and ranking scores of the five competition finalists
can be found on the official website of ITC2007 at http://www.cs.qub.ac.uk/itc2007/.
Table 9.3 Worked example of how rank scores are calculated using, in this case, y = 2 runs
of x = 3 algorithms on three problem instances. Results of each run are given by the DTF (in
parentheses) and the SCC. Here, Algorithm A is deemed the winner and C the loser
            Results (run 1, run 2)                       Ranks                   Rank scores         Mean
            #1            #2             #3              #1      #2      #3      #1    #2    #3
Alg. A      (0)0, (0)10   (0)1, (0)5     (0)0, (0)2      1, 3    1.5, 3  1, 2    2.00  2.25  1.50    1.92
Alg. B      (0)5, (0)17   (0)1, (0)8     (0)8, (0)11     2, 4    1.5, 4  3, 4    3.00  2.75  3.50    3.08
Alg. C      (9)3, (0)19   (0)18, (0)16   (0)12, (4)0     6, 5    6, 5    5, 6    5.50  5.50  5.50    5.50
Fig. 9.6 Ranking scores achieved by the different neighbourhood operators (N1, N2, N3, N4, N5, and N5(1)) using different end temperatures Tend. The shaded area indicates the results that would have won the competition
In our case, we added results from ten runs of our algorithm to these published results,
giving x = 6 and y = 10. A summary of the resultant ranking scores achieved by
our algorithm with each neighbourhood operator over a range of different settings
for Tend is given in Fig. 9.6. The shaded area of the figure indicates those settings
where our algorithm would have won the competition (i.e., that have achieved a
lower ranking score than the other five entries).
Figure 9.6 shows a clear difference in the performance of neighbourhood opera-
tors N1 and N2 , illustrating the importance of the extra solution space connectivity
provided by the maximum matching algorithm. Similarly, the results of N3 , N4 and
N5 are better still, outperforming N1 and N2 across all of the values of Tend tested.
However, there is very little difference between the performance of N3, N4, and N5 themselves, presumably because, for these particular problem instances, the behaviour and therefore the feasibility ratios of these operators are very similar (as seen in
Fig. 9.5). Moreover, we find that the extra expense of N5 over N3 and N4 appears to
268 9 Designing University Timetables
have minimal effect, with N5 producing less than 0.5% fewer Markov chains than N3
over the course of the run on average. Of course, such similarities will not always be
the case—they merely seem to be occurring with these particular problem instances
because, in most cases, hard constraints are being broken (and the move rejected)
before the inspection of more than one Kempe chain is deemed necessary.
Figure 9.6 also indicates that using dummy rooms does not seem to improve
results across the instances. In initial experiments, we tested the use of one and two
dummy rooms along with a range of different values for the weighting coefficient
w ∈ {1, 2, 5, 10, 20, 200, ∞} (some of these values were chosen due to their use in
existing algorithms that employ weighted sum functions with this problem [12,13,
15]). Figure 9.6 reports the best of these: one dummy room with w = 2. For higher
values of w, results were found to be inferior because the additional solutions in the
solution space (shaded vertices in Fig. 9.4) would still be evaluated by the algorithm,
but nearly always rejected due to their high cost. On the other hand, using a setting
of w = 1 means that the penalty of assigning an event to a dummy room will be
equal to the penalty of assigning the event to the last timeslot of a day (soft constraint
SC1), meaning there is little distinction between the cost of infeasibility and the cost
of soft constraint violations.
Note that we might consider the use of no dummy rooms as similar to using
w ≈ ∞, in that the algorithm will be unable to accept moves that involve moving an
event into a dummy room (i.e., introducing infeasibility to the timetable). However,
the difference is that, when using dummy rooms, such moves will still be evaluated
by the algorithm before being rejected. On the other hand, without dummy rooms
these unnecessary evaluations do not take place. This saves significant amounts of
time during the course of a run.
As mentioned, a setting of w = 2, which seems the best compromise between
these extremes, still produces inferior results on average compared to when using no
dummy rooms. However, for problem instance #10 we found the opposite to be true,
with significantly better results being produced when dummy rooms are used. From
Fig. 9.5, we observe that instance #10 has the lowest feasibility ratio of all instances,
and so the extra connectivity provided by the dummy rooms seems to be aiding the
search in this case. On the other hand, the existence of a perfect solution here could mean that, while optimising the SCC, the search is also simultaneously being guided towards feasible regions of the solution space. This matter will be
discussed further in Sect. 9.7.
Finally, Fig. 9.7 illustrates, for the 24 problem instances, the relationship between
the feasibility ratio and the proportion by which the SCC is reduced by the SA
algorithm for two contrasting neighbourhood operators N1 and N5 . We see that the
points for N5 are shifted upwards and rightwards compared to N1 , illustrating the
larger feasibility ratios and higher performance of the operator. The general pattern in
the figure suggests that higher feasibility ratios allow large decreases in cost during
a run, while lower feasibility ratios can result in both large or small decreases,
depending on the instance. Thus, while there is some relationship between the two
variables, it seems that other factors also have an impact here, including the size
9.6 Experimental Results 269
Fig. 9.7 Scatter plot showing the relationship between the feasibility ratio and the reduction in cost for the 24 competition instances, using neighbourhood operators N1 and N5
and shape of the cost landscape and the amount of computation needed for each
application of the evaluation function.
In our next set of experiments, we compare the performance of our algorithm to the
best results that were reported in the literature in the five years following the 2007
competition. Table 9.4 gives a breakdown of the results achieved by our method using
Tend = 0.5 compared to the approaches of Cambazard et al. [9], Nothegger et al.
[13] and van den Broek and Hurkens [15]. Note that the latter two papers only list
results for the first 16 instances. In this table, all statistics are calculated from 100
runs on each instance except for van den Broek and Hurkens, whose algorithm is
deterministic. All results were achieved strictly within the time limits specified by the
competition benchmark program. Our experiments were performed using N3 , N4 ,
and N5 , though no significant difference was observed between the three operators’
best, mean, or worst results. Consequently, we only present the results for one of
these.4
Table 9.4 shows that, using N4 , perfect solutions have been achieved by our method
in 17 of the 24 problem instances. A comparison to the 16 results reported by Cam-
bazard et al. [9] indicates that our method’s best, mean, and worst results are also
significantly better than their corresponding results. Similarly, our best, mean, and
4 For pairwise comparisons, Related Samples Wilcoxon Signed-Rank Tests were used; else Friedman
Table 9.4 Results from the literature, taken from samples of 100 runs per instance. Figures indicate
the SCC achieved at the cut-off point defined by the competition benchmarking program. Numbers
in parentheses indicate the % of runs where feasibility was found. No parentheses indicates that
feasibility was achieved in all runs
#    Our method using N4           Cambazard(a)               van den Broek(b)   Nothegger(c)
     Best   Mean    Worst          Best   Mean   Worst        Result             Best   Mean
1 0 377.0 833 15 547 1072 1636 0 (54) 613
2 0 382.2 1934 9 403 1254 1634 0 (59) 556
3 122 181.8 240 174 254 465 355 110 680
4 18 319.4 444 249 361 666 644 53 580
5 0 7.5 60 0 26 154 525 13 92
6 0 22.8 229 0 16 133 640 0 (95) 212
7 0 5.5 11 1 8 32 0 0 4
8 0 0.6 59 0 0 0 241 0 61
9 0 514.4 1751 29 1167 1902 1889 0 (85) 202
10 0 1202.4 2215 2 (89) 1297 2637 1677 0 4
11 48 202.6 358 178 361 496 615 143 (99) 774
12 0 340.2 583 14 380 676 528 0 (86) 538
13 0 79.0 269 0 135 425 485 5 (94) 360
14 0 0.5 7 0 15 139 739 0 41
15 0 139.9 325 0 47 294 330 0 29
16 0 105.2 223 1 58 245 260 0 101
17 0 0.1 3 – – – 35 – –
18 0 2.2 57 – – – 503 – –
19 0 346.1 1222 – – – 963 – –
20 557 724.5 881 – – – 1229 – –
21 1 32.1 159 – – – 670 – –
22 4 1790.1 2280 – – – 1956 – –
23 0 514.1 1178 – – – 2368 – –
24 18 328.2 818 – – – 945 – –
a SA-colouring method [9, p. 122]
b Deterministic IP-based heuristic (one result per instance) [15, p. 451]
c Serial ACO algorithm [13, p. 334]
worst results are all seen to outperform the results of van den Broek and Hurkens
[15]. Finally, no significant difference is observed between the best and mean results
of our method compared to Nothegger et al. [13]; however, unlike our algorithm,
they have failed to achieve feasibility in several cases.
In this chapter’s final set of experiments, we look at the effects of using different
time limits with our algorithm. Until this point, experiments have been performed
according to the time limit specified by the competition benchmark program; how-
ever, it is pertinent to ask whether the less expensive neighbourhood operators are
more suitable when shorter time limits are used and whether further improvements
can be achieved when the time limit is extended.
In Fig. 9.8 we show the relative performance of operators N1 , N2 , N3 , and N5 us-
ing time limits of between 1 and 600 s, signifying very fast and very slow coolings,
respectively. (N4 is omitted here due to its close similarity with N3 and N5 ’s results.)
We see that even for very short time limits of less than five seconds, the more expen-
sive neighbourhoods consistently produce superior solutions across the instances.
We also see that when the time limit is extended beyond the benchmark and up to
600 s, the mean reduction in the soft cost rises from 89.1 to 94.6% (under N5 ), indi-
cating that superior results can also be gained with additional computing resources.
This latter observation is consistent with that of Nothegger et al. [13], who were also
able to improve the results of their algorithm, in their case via parallelisation.
Fig. 9.8 Proportion decrease in SCC using differing time limits and differing neighbourhood op-
erators. All points are taken from an average of ten runs on each problem instance (i.e., 240 runs).
Error bars represent one standard error each side of the mean
Parts (a) and (b) of Fig. 9.9 show the effect of performing a neighbourhood move (i.e., changing the incumbent
solution) with simulated annealing. In particular, we see that the connectivity of G
does not change (though the probabilities of traversing the edges may change if the
temperature parameter is subsequently updated). On the other hand, when the same
move is performed using tabu search (Part (b)), several edges in G, including {S, S },
will be made tabu for several iterations, effectively removing them from the graph
for a time dictated by the tabu tenure. The exact edges that will be made tabu depend
on the structure of the tabu list, and in typical applications, when an event ei has been
Fig. 9.9 Illustration of the effects of performing a neighbourhood move using (a) simulated annealing, and (b) tabu search
moved from timeslot S j to a new timeslot, all moves that involve moving ei back
into S j are made tabu. While the use of tabu moves helps to prevent cycling (which
may regularly occur with SA), it therefore also has the effect of further reducing the
connectivity of G. Over the course of a run, the cumulative effects of this phenomenon
may put tabu search at a disadvantage with these particular problems.
In this chapter, we have also noted that an alternative approach to a two-stage
algorithm is to use a one-stage optimisation algorithm in which the satisfaction of
both hard and soft constraints is attempted simultaneously, as with the methods of
Ceschia et al. [12] and Nothegger et al. [13]. As we have seen, despite the favourable
performance of our two-stage algorithm overall, it does seem to struggle in compar-
ison to these approaches for a small number of problem instances, particularly #10
and #22. According to Fig. 9.5, these instances exhibit the lowest feasibility ratios
with our operators, seemingly suggesting that freedom of movement in the solution
spaces is too restricted to allow adequate optimisation of the objective function.
On the other hand, it is also possible that the algorithms of Ceschia et al. [12]
and Nothegger et al. [13] are being aided by the fact that perfect solutions to the
24 competition instances are known to exist—a feature that is unlikely to occur in
real-world problem instances. For example, as mentioned in Sect. 9.3, Ceschia et
al.’s algorithm is reported to produce its best results when optimisation is performed
using an objective function in which hard and soft constraint violations are given
equal weights. However, it could be that, by moving towards solutions with low
SCCs, the search could also inadvertently be moving towards feasible regions of
the solution space, simultaneously helping to satisfy the hard constraints along the
way. This hypothesis was tested by Lewis [21], who compared the algorithm of this
chapter to Ceschia et al.’s using a different suite of timetabling problems for which
the existence of perfect solutions is not always known. These can be downloaded
from http://www.rhydlewis.eu/hardTT. The results of their tests strongly support
this hypothesis: for the 23 instances of this suite with no known perfect solution, the
two-stage algorithm outperformed Ceschia et al.’s in 21 cases (91.3%), with stark
differences in results. On the other hand, with the remaining 17 instances, Ceschia et al.'s approach produced better results in 12.5 cases (73.5%, with ties counted as half a case), suggesting that the
existence of perfect solutions indeed benefits the algorithm.
As mentioned earlier, all of the problem instances used in this chapter are avail-
able online at http://www.cs.qub.ac.uk/itc2007/. Also, a full listing of this chapter’s
results, together with the C++ source code of the two-stage algorithm is available at
http://www.rhydlewis.eu/resources/ttCodeResults.zip.
References
1. Corne D, Ross P, Fang H (1995) Evolving timetables. In: Chambers L (ed) The practical
handbook of genetic algorithms, vol 1. CRC Press, pp 219–276
2. McCollum B, Schaerf A, Paechter B, McMullan P, Lewis R, Parkes A, Di Gaspero L, Qu R,
Burke E (2010) Setting the research agenda in automated timetabling: the second international
timetabling competition. INFORMS J Comput 22(1):120–130
3. Müller T, Rudova H (2012) Real life curriculum-based timetabling. In: Kjenstad D, Riise
A, Nordlander T, McCollum B, Burke E (eds) Practice and theory of automated timetabling
(PATAT 2012), pp 57–72
4. Cooper T, Kingston J (1996) The complexity of timetable construction problems. In: Burke
E, Ross P (eds) Practice and theory of automated timetabling (PATAT) I. LNCS, vol 1153.
Springer, pp 283–295
5. Carter M, Laporte G, Lee SY (1996) Examination timetabling: algorithmic strategies and
applications. J Oper Res Soc 47:373–383
6. Burke E, Elliman D, Ford P, Weare R (1996) Examination timetabling in British universities:
a survey. In: Burke E, Ross P (eds) Practice and theory of automated timetabling (PATAT) I.
LNCS, vol 1153. Springer, pp 76–92
7. Schaerf A (1999) A survey of automated timetabling. Artif Intell Rev 13(2):87–127
8. Lewis R (2008) A survey of metaheuristic-based techniques for university timetabling prob-
lems. OR Spectr 30(1):167–190
9. Cambazard H, Hebrard E, O’Sullivan B, Papadopoulos A (2012) Local search and constraint
programming for the post enrolment-based timetabling problem. Ann Oper Res 194:111–135
10. Rossi-Doria O, Samples M, Birattari M, Chiarandini M, Knowles J, Manfrin M, Mastrolilli
M, Paquete L, Paechter B, Stützle T (2002) A comparison of the performance of different
metaheuristics on the timetabling problem. In: Burke E, De Causmaecker P (eds) Practice and
theory of automated timetabling (PATAT) IV. LNCS, vol 2740. Springer, pp 329–351
11. Cambazard H, Hebrard E, O’Sullivan B, Papadopoulos A (2008) Local search and constraint
programming for the post enrolment-based course timetabling problem. In: Burke E, Gendreau
M (eds) Practice and theory of automated timetabling (PATAT) VII
12. Ceschia S, Di Gaspero L, Schaerf A (2012) Design, engineering, and experimental analysis
of a simulated annealing approach to the post-enrolment course timetabling problem. Comput
Oper Res 39:1615–1624
13. Nothegger C, Mayer A, Chwatal A, Raidl G (2012) Solving the post enrolment course
timetabling problem by ant colony optimization. Ann Oper Res 194:325–339
14. Jat S, Yang S (2011) A hybrid genetic algorithm and Tabu search approach for post enrolment
course timetabling. J Sched 14:617–637
15. van den Broek J, Hurkens C (2012) An IP-based heuristic for the post enrolment course
timetabling problem of the ITC2007. Ann Oper Res 194:439–454
16. Chiarandini M, Birattari M, Socha K, Rossi-Doria O (2006) An effective hybrid algorithm for
university course timetabling. J Sched 9(5):403–432. ISSN 1094-6136
17. Lewis R (2012) A time-dependent metaheuristic algorithm for post enrolment-based course
timetabling. Ann Oper Res 194(1):273–289
18. van Laarhoven P, Aarts E (1987) Simulated Annealing: Theory and Applications. Kluwer
Academic Publishers
19. Lü Z, Hao J-K (2010) Adaptive Tabu search for course timetabling. Eur J Oper Res
200(1):235–244
20. Kostuch P (2005) The university course timetabling problem with a 3-phase approach. In:
Burke E, Trick M (eds) Practice and theory of automated timetabling (PATAT) V. LNCS, vol
3616. Springer, pp 109–125
21. Lewis R, Thompson J (2015) Analysing the effects of solution space connectivity with an
effective metaheuristic for the course timetabling problem. Eur J Oper Res 240:637–648
Appendix A: Computing Resources
This section contains instructions on how to compile and use the implementations of
the algorithms described in Chaps. 3 and 5 of this book. These can all be downloaded
directly from:
http://rhydlewis.eu/resources/gCol.zip
Once downloaded and unzipped, we see that the directory contains a number of
subdirectories. Specifically, these are:
All of these algorithms are programmed in C++. They have been successfully
compiled in Windows using Microsoft Visual Studio and in Linux using g++. Both
of these compilers are available for free online.
To compile and execute using Microsoft Visual Studio the following steps can be
taken:
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer
Nature Switzerland AG 2021
R. M. R. Lewis, Guide to Graph Colouring, Texts in Computer Science,
https://doi.org/10.1007/978-3-030-81054-2
1. Open Visual Studio and click File, then New, and then Project from Existing
Code.
2. In the dialogue box, select Visual C++ and click Next.
3. Select one of the subdirectories above, give the project a name and click Next.
4. Finally, select Console Application Project for the project type and then click
Finish.
The source code for the chosen algorithm can then be viewed and executed from the
Visual Studio application. Release mode should be used during compilation to allow
the programs to execute at maximum speed.
Appropriate makefiles are included for compiling the source code using g++.
Simply navigate to the correct directory at the command prompt, type make and hit
return.
A.1.1 Usage
Once generated, the executable files (one per subdirectory) can be run from the com-
mand line. If the programs are called with no arguments, usage information is output
to the screen. For example, suppose we are using the executable file HillClimber.
Running this program with no arguments from the command line gives the following
output:
Hill Climbing Algorithm for Graph Colouring

USAGE:
<InputFile>     (Required. File must be in DIMACS format)
-s <int>        (Stopping criteria expressed as number of
                 constraint checks. Can be anything up to
                 9x10^18. DEFAULT = 100,000,000.)
-I <int>        (Number of iterations of local search per cycle.
                 DEFAULT = 1000)
-r <int>        (Random seed. DEFAULT = 1)
-T <int>        (Target number of colours. Algorithm halts if this
                 is reached. DEFAULT = 1.)
-v              (Verbosity. If present, output is sent to screen.
                 If -v is repeated, more output is given.)
****
The input file specifies the graph we intend to colour. This is the only mandatory
argument. Input files should be text files in the DIMACS format. Specifically,
• Initial lines in the text file will begin with the character c. These are used for
comments but are otherwise ignored.
• After the comments, the next line in the file should start with the character p. This
is followed by the text edge, which tells us that the graph is being specified by
using a list of its edges. The number of vertices n and edges m are then given.
• Finally, a series of m lines beginning with the character e should be included. Each
of these specifies a single edge of the graph by giving its two endpoints. Note that
each edge should appear exactly once in the input file: an edge listed as e u v is
not also repeated as e v u.
It is assumed that files are well-formed and consistent—vertex labels are valid,
exactly m edges are defined, self-loops are not present, and so on. Vertices are
numbered from 1 to n.
For illustrative purposes, the following text specifies a wheel graph (comprising
n = 5 vertices and m = 8 edges) using the DIMACS format. An example input file
called graph.txt is also provided in each subdirectory.
c Example text file specifying a graph using the DIMACS format.
c
c This is a wheel graph comprising 5 vertices and 8 edges
c
p edge 5 8
e 1 2
e 1 4
e 1 5
e 2 5
e 2 3
e 3 4
e 3 5
e 4 5
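To illustrate this format, the following short Python routine (a sketch of our own, not part of the supplied C++ code) reads a DIMACS-formatted file into an adjacency list and checks that the stated number of edges is present:

```python
def read_dimacs(path):
    """Read a graph in DIMACS edge format into an adjacency list.

    Vertices are numbered 1 to n in the file; the same 1-based
    labels are used as dictionary keys here.
    """
    n = m = 0
    adj = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            if not parts or parts[0] == "c":
                continue  # skip blank lines and comments
            if parts[0] == "p":          # e.g. "p edge 5 8"
                n, m = int(parts[2]), int(parts[3])
                adj = {v: set() for v in range(1, n + 1)}
            elif parts[0] == "e":        # e.g. "e 1 2"
                u, v = int(parts[1]), int(parts[2])
                adj[u].add(v)
                adj[v].add(u)
    # Each of the m edges contributes to two adjacency sets
    assert sum(len(s) for s in adj.values()) == 2 * m, "file is not well-formed"
    return adj
```

Reading the wheel graph above, for example, gives adj[5] == {1, 2, 3, 4}, since vertex 5 is the hub.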
The remaining arguments for each of the programs are optional and are allo-
cated default values if left unspecified. Here are some example commands using the
HillClimber executable:
HillClimber graph.txt
This will execute the algorithm on the problem given in the file graph.txt, using
the default of 1000 iterations of local search per cycle and a random seed of 1. The
algorithm will halt when 100,000,000 constraint checks have been performed. No
output will be written to the screen. Another example command is
HillClimber graph.txt -r 6 -T 50 -v -s 500000000000
This run will be similar to the previous one but will use a random seed of six and will
halt either when 500,000,000,000 constraint checks have been performed, or when
a feasible solution using fifty or fewer colours has been found. The presence of -v
in this command means that output will be written to the screen. Including -v more
than once will increase the amount of output.
The arguments -r and -v are used with all of the algorithms supplied here. Simi-
larly, -T and -s are used with all algorithms except for the single-pass constructive
algorithms. Descriptions of arguments particular to just one algorithm are found
by typing the name of the program with no arguments, as described above. Inter-
pretations of the run-time parameters for the various algorithms can be found by
consulting the algorithm descriptions in this book.
A.1.2 Output
When a run of any of the programs is completed, three files are created: ceffort.txt
(computational effort), teffort.txt (time effort), and solution.txt. The first two files
specify how long (in terms of constraint checks and milliseconds, respectively) so-
lutions with certain numbers of colours took to produce during the run. For example,
we might get the following computational effort file:
40 126186
39 427143
38 835996
37 1187086
36 1714932
35 2685661
34 6849302
33 X
This file is interpreted as follows: The first feasible solution observed used 40
colours, and this took 126,186 constraint checks to achieve. A solution with 39
colours was then found after 427,143 constraint checks, and so on. To find a solution
using 34 colours, a total of 6,849,302 constraint checks was required. Once a row
with an “X” is encountered, this indicates that no further improvements were made:
that is, no solution using fewer colours than that indicated in the previous row was
achieved. Therefore, in this example, the best solution found used 34 colours. For
consistency, the “X” is always present in a file, even if a specified target has been
met.
The file teffort.txt is interpreted in the same way as ceffort.txt, with the right-
hand column giving the time (in milliseconds) as opposed to the number of constraint
checks. Both of these files are useful for analysing algorithm speed and performance.
For example, the computational effort file above can be used to generate the following
plot:
[Plot: the number of colours in the best feasible solution (y-axis, from 40 down to
34) against the number of constraint checks performed (x-axis, from 0 to 7×10^6).]
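Files in this two-column format are straightforward to process. As an illustration, here is a short Python sketch (not part of the supplied code) that parses an effort file and recovers the best colouring achieved:

```python
def read_effort_file(path):
    """Parse a ceffort.txt or teffort.txt file into (colours, effort) pairs.

    Rows are read until the terminating "X" row, which marks the point
    after which no further improvements were made.
    """
    rows = []
    with open(path) as f:
        for line in f:
            colours, effort = line.split()
            if effort == "X":
                break
            rows.append((int(colours), int(effort)))
    return rows
```

For the computational effort file above, the final pair returned is (34, 6849302): the best solution found used 34 colours.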
The file solution.txt contains the best feasible solution (i.e., the solution with the
fewest colours) that was achieved during the run. The first line of this file gives the
number of vertices n, and the remaining n lines then state the colour of each vertex,
using colour labels 0, 1, 2, . . .. For example, the following solution file
5
0
2
1
0
1
is interpreted as follows: There are five vertices; the first and fourth vertices are
assigned to colour 0, the third and fifth vertices are assigned to colour 1, and the
second vertex is assigned to colour 2. Hence, three colours are being used.
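A solution file in this format can be loaded in a similar fashion. The following Python sketch (again, an illustration of our own rather than part of the supplied code) reads the file and groups the vertices into colour classes:

```python
def read_solution(path):
    """Read a solution.txt file and return the colour of each vertex.

    The first line gives n; the next n lines give colour labels 0, 1, 2, ...
    Returns a list c where c[i] is the colour of vertex i (0-based).
    """
    with open(path) as f:
        vals = [int(line) for line in f if line.strip()]
    n, colours = vals[0], vals[1:]
    assert len(colours) == n, "file is not well-formed"
    return colours

def colour_classes(colours):
    """Group the 0-based vertex indices by colour label."""
    classes = {}
    for v, c in enumerate(colours):
        classes.setdefault(c, []).append(v)
    return classes
```

Applied to the example above, this gives the classes {0: [0, 3], 1: [2, 4], 2: [1]} (using 0-based vertex indices), confirming that three colours are used.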
Finally, on completion, a single line summarising the run is also appended to the file
resultsLog.log, with its fields separated by tabs.
In this section, we give a brief demonstration of how Sage can be used to create,
colour, and visualise graphs.
Sage is specialised software that allows the exploration of many aspects of mathe-
matics, including combinatorics, graph theory, algebra, calculus, and number theory.
It is both free to use and open source. To use Sage, commands can be typed into a
Sage notebook. Blocks of commands are then executed by hitting Shift+Enter next
to these commands, with output (if applicable) then being written back to the note-
book.
Sage contains a whole host of elementary and specialised mathematical functions
that are documented online at https://doc.sagemath.org/html/en/reference/index.
html. Of particular interest to us here is the functionality surrounding graph colouring
and graph visualisation. A full description of the graph colouring library for Sage can
be found at https://doc.sagemath.org/html/en/reference/graphs/sage/graphs/graph_
coloring.html.
The following text now shows some example commands from this library, together
with the output that Sage produces. In our case, these commands have been typed into
notebooks provided by the online tool CoCalc. This allows the editing and execution
of Sage notebooks through a web browser and can be freely accessed online (go to
https://cocalc.com/ and open a blank Sage notebook).
The following pieces of code each represent an individual block of executable
Sage commands. Textual output produced by these commands is indicated by the >>
symbol.
To begin, it is first necessary to specify the names of the libraries that we intend
to use in our Sage program. We therefore type:
from sage.graphs.graph_coloring import chromatic_number
from sage.graphs.graph_coloring import vertex_coloring
from sage.graphs.graph_coloring import number_of_n_colorings
from sage.graphs.graph_coloring import edge_coloring
This will allow us to access the various graph colouring functions used below.
We now use Sage to generate a small graph G. In this case, our graph has n = 4
vertices, m = 5 edges, and is defined by the following adjacency matrix
    A = | 0 1 1 0 |
        | 1 0 1 1 |
        | 1 1 0 1 |
        | 0 1 1 0 |
The first Sage command below defines this matrix. The next command then transfers
this information into a graph G. The final command draws G to the screen.
A = matrix([[0,1,1,0],[1,0,1,1],[1,1,0,1],[0,1,1,0]])
G = Graph(A)
G.show()
Note that, by default, Sage labels the vertices from 0 to n − 1 in this diagram as
opposed to using indices 1 to n.
We will now produce an optimal colouring of this graph. The algorithms that
Sage uses to obtain these solutions are based on integer programming techniques
(see Chap. 4). They are therefore able to produce provably optimal solutions for
small graphs. A colouring is produced via the following command (note the spelling
of “coloring” as opposed to “colouring”):
vertex_coloring(G)
>> [[2], [1], [3, 0]]
The output produced by Sage tells us that G can be optimally coloured using three
colours, with vertices 0 and 3 receiving the same colour. This solution is expressed
as a partition of the vertices, which can be used to produce a visualisation of the
colouring as follows:
S = vertex_coloring(G)
G.show(partition=S)

To check whether a two-colouring exists, we can instead call vertex_coloring(G, 2).
This returns False, which tells us that a two-colouring is not possible for this graph.
On the other hand, if we want to confirm whether G is four-colourable, we get
vertex_coloring(G, 4)
>> [[3, 0], [2], [1], []]

which tells us that one way of four-colouring G is to not use the fourth colour!
In addition to the above, commands are also available in Sage for determining the
chromatic number
chromatic_number(G)
>> 3
and for calculating the number of different k-colourings. For example, with k = 2,
we get

number_of_n_colorings(G, 2)
>> 0

confirming again that no two-colouring exists. With k = 3, on the other hand, we get

number_of_n_colorings(G, 3)
>> 6

telling us that there are six different ways of feasibly assigning three colours to G.
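For a graph this small, such counts are easily verified by exhaustive enumeration. The following Python sketch (independent of Sage) checks the figures for the graph G defined by the matrix A above:

```python
from itertools import product

# Edges of the example graph G (vertices 0..3, as given by the matrix A)
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]

def count_colourings(edges, n, k):
    """Count the proper k-colourings of an n-vertex graph by trying
    all k^n possible assignments of colours to vertices."""
    total = 0
    for c in product(range(k), repeat=n):
        # An assignment is proper if no edge joins two like-coloured vertices
        if all(c[u] != c[v] for u, v in edges):
            total += 1
    return total
```

Running count_colourings(edges, 4, 2) gives 0 and count_colourings(edges, 4, 3) gives 6, matching Sage's output.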
Sage also provides commands for calculating edge colourings of a graph (see
Sect. 6.2). For example, continuing with the graph G from above, we can use the
edge_coloring() command to get
edge_coloring(G)
>> [[(0, 1), (2, 3)], [(0, 2), (1, 3)], [(1, 2)]]

This tells us that the chromatic index of G is 3, with edges {0, 1} and {2, 3} being
assigned to one colour, {0, 2} and {1, 3} being assigned to a second, and {1, 2}
being assigned to a third.
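An equivalent approach outside of Sage is to vertex-colour the line graph L(G) (see Sect. 6.2), since a proper vertex colouring of L(G) is a proper edge colouring of G. Here is a sketch of this idea using NetworkX; note that, unlike Sage's exact method, greedy_color guarantees only a feasible (not necessarily optimal) edge colouring:

```python
import networkx as nx

# The example graph G from the adjacency matrix A
G = nx.Graph([(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)])

# The vertices of the line graph are the edges of G; two line-graph
# vertices are adjacent when the corresponding edges share an endpoint,
# so a proper vertex colouring of L(G) is a proper edge colouring of G
L = nx.line_graph(G)
edge_colouring = nx.coloring.greedy_color(L, strategy="saturation_largest_first")

# Feasibility check: edges sharing an endpoint receive different colours
for e, f in L.edges():
    assert edge_colouring[e] != edge_colouring[f]
```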
Sage also contains a collection of predefined graphs. This allows us to make use
of common graph topologies without having to manually type out their adjacency
matrices. A full list of these graphs is provided at https://doc.sagemath.org/html/
en/reference/graphs/sage/graphs/graph_generators.html. For example, here are the
commands for producing an optimal colouring of a dodecahedral graph. In this case,
we have switched off vertex labelling to make the illustration clearer:
G = graphs.DodecahedralGraph()
S = vertex_coloring(G)
G.show(partition=S, vertex_labels=False)
The following is an optimal colouring of the complete graph with ten vertices, K10:
G = graphs.CompleteGraph(10)
S = vertex_coloring(G)
G.show(partition=S, vertex_labels=False)
Finally, Sage also allows us to define random graphs Gn,p that have n vertices
and edge probability p (see Definition 3.15). Here is an example with n = 50 and
p = 0.05:

G = graphs.RandomGNP(50, 0.05)
S = vertex_coloring(G)
G.show(partition=S, vertex_labels=False)
It can be seen that this particular graph is three-colourable, although the default
layout of this graph is not very helpful because the connected component on the left
is tightly clustered. If desired, we can change this layout so that vertices are shown
in a circle:
G.show(vertex_labels=False, layout='circular', partition=S)
In this section, we show how to create and colour graphs using NetworkX. Net-
workX is a Python package used for the creation, manipulation, and study of graphs.
An open-source distribution of Python including NetworkX can be downloaded for
free online.
In this example, Lines 1–3 first import the libraries that we need. Line 5 then uses
the command nx.binomial_graph(20, 0.2) to create a random graph G
with twenty vertices and an edge probability of 0.2. Similar commands can be used
in NetworkX to generate other topologies such as cycles, scale-free graphs, d-regular
graphs, and so on. Lines 6 and 7 then draw this graph to the screen using a circular
layout of the vertices. This output is shown in Fig. A.1. Note that, by default, vertices
are labelled from zero upwards.
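Although the original listing is not reproduced here, the graph-creation step just described can be sketched as follows (the seed argument is our own addition, included for reproducibility):

```python
import networkx as nx

# Create a random graph with 20 vertices and an edge probability of 0.2
# (binomial_graph is NetworkX's name for the G(n, p) random graph model)
G = nx.binomial_graph(20, 0.2, seed=1)

# Compute a circular layout of the vertices for drawing
pos = nx.circular_layout(G)
# nx.draw_networkx(G, pos=pos)  # drawing requires matplotlib, so it is
#                               # commented out in this sketch
print(G.number_of_nodes(), "vertices and", G.number_of_edges(), "edges")
```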
Having created an example graph, the commands on Lines 9–22 output some of its
properties. The problem of identifying cut vertices (articulation points) and bridges in
a graph can be solved in O(m) time, where m is the number of edges. The commands
on Lines 9 and 12 are therefore fast to execute. On the other hand, identifying the
clique number in a graph is NP-hard; the algorithm employed by NetworkX is
exact and therefore has an exponential running time. Finally, the commands on
Lines 18 and 21 employ approximation algorithms to identify a large clique and a
large independent set in the graph.
Lines 35 and 40 show some of the available commands for feasibly colouring a
graph. The first example uses the Greedy algorithm with a random permutation of
the vertices, the second uses DSatur. The output of these commands is a Python
dictionary in which each vertex is assigned to a colour, labelled from zero upwards.
In our example, these dictionaries are used to draw the resultant colourings to the
screen via the user-defined function drawcolouring(), shown in Lines 24–32.
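For reference, these two colouring commands take the following form (a sketch; in NetworkX, the DSatur heuristic is selected via the strategy name saturation_largest_first, and the example graph and seed here are our own assumptions):

```python
import networkx as nx

# An example graph (the seed is an assumption added for reproducibility)
G = nx.binomial_graph(20, 0.2, seed=1)

# Greedy algorithm applied to a random permutation of the vertices
c1 = nx.coloring.greedy_color(G, strategy="random_sequential")

# The DSatur algorithm
c2 = nx.coloring.greedy_color(G, strategy="saturation_largest_first")

# Both outputs are dictionaries mapping each vertex to a colour label,
# with labels numbered from zero upwards
for u, v in G.edges():
    assert c1[u] != c1[v] and c2[u] != c2[v]
```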
The full output of this program is given below. Note that the application of
Greedy has resulted in a four-colouring in this case while DSatur has given a
three-colouring. These solutions are visualised in Fig. A.2.
The articulation points of G are {8}
The bridges of G are {(8, 16)}
The clique number of G is 3
A large clique in G comprises vertices {0, 18, 10}
A large independent set in G comprises vertices {0, 1, 2, 3, 6, 11, 16, 17}
Here is a 4 coloring of G: {3: 0, 16: 0, 2: 0, 19: 0, 14: 1, 15: 0, 17: 0, 9: 2,
4: 1, 8: 1, 6: 3, 0: 0, 12: 1, 11: 0, 1: 2, 13: 3, 7: 1, 10: 2, 5: 1, 18: 3}
Here is a 3 coloring of G: {5: 0, 6: 1, 10: 1, 18: 2, 0: 0, 2: 2, 14: 0, 9: 2,
17: 1, 7: 2, 12: 1, 15: 2, 1: 1, 19: 2, 4: 0, 8: 0, 13: 2, 3: 1, 16: 1, 11: 0}
Fig. A.2 Colourings of our example graph due to the Greedy (left) and DSatur (right) algorithms
In this section, we provide the Python code used for randomly generating planar
graphs. This program requires three parameters from the user: the number of ver-
tices n, the number of edges m, and a random seed (Lines 6–10). The program first
generates random coordinates for n points in the unit square (Line 13). A Delaunay
triangulation is then generated from these points and this is converted into a corre-
sponding planar graph T (Lines 14–24). The task is now to identify a random subset
of T's edges so that the resultant graph G has n vertices, m edges, and is connected.
To do this, G is first set as a minimum spanning tree of T (Line 30). Additional
edges from T are then copied to G until the required number of edges is reached
(Lines 33–41). The command nx.graph_clique_number(G) is also used in
this example to calculate the size of the largest clique in G. For planar graphs, this
command executes quickly.
An illustration of this process is shown in Fig. A.3. The complete code listing is
as follows:
1  import networkx as nx
2  import random
3  from scipy.spatial import Delaunay
4
5  # Get the input variables, check their validity, and seed the random number generator
6  n = int(input("Enter n >> "))
7  m = int(input("Enter m >> "))
8  s = int(input("Enter seed >> "))
9  assert n > 0 and m >= n - 1 and m <= 3 * n - 6, "Illegal parameter combination."
10 random.seed(s)
11
12 # Generate a list P of n points in the unit square and form a Delaunay triangulation
13 P = [(random.random(), random.random()) for i in range(n)]
14 T = Delaunay(P)
15 tri = T.simplices.copy()
16
17 # Convert the triangulation to a simple graph T
18 T = nx.Graph()
19 for v in range(n):
20     T.add_node(v)
21 for e in tri:
22     T.add_edge(e[0], e[1])
23     T.add_edge(e[0], e[2])
24     T.add_edge(e[1], e[2])
25
26 # Check the triangulation gives enough edges
27 assert T.number_of_edges() >= m, "Cannot form planar graph with this many edges."
28
29 # Build the graph G. It starts as a spanning tree of T
30 G = nx.minimum_spanning_tree(T)
31
32 # Construct a list L of edges in T that might be added to G
33 L = []
34 for e in T.edges():
35     if e not in G.edges():
36         L.append((e[0], e[1]))
37
38 # Take a random sample of L and add these edges to G
39 S = random.sample(L, m - (n - 1))
40 for e in S:
41     G.add_edge(e[0], e[1])
42
43 # Draw the graph to the screen
44 nx.draw_networkx(G, pos=P, with_labels=False, node_size=30)
45
46 # Write the graph to the file 'graph.txt'
47 f = open("graph.txt", "w+")
48 f.write("c Undirected planar graph\n")
49 f.write("c Seed = " + str(s) + "\n")
50 f.write("c Clique number = " + str(nx.graph_clique_number(G)) + "\n")
51 f.write("p edge " + str(G.number_of_nodes()) + " " + str(G.number_of_edges()) + "\n")
52 for e in G.edges():
53     f.write("e " + str(e[0] + 1) + " " + str(e[1] + 1) + "\n")
54 f.close()
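Although not part of the listing above, the planarity of a generated graph can be double-checked using NetworkX's linear-time check_planarity function (a brief sketch):

```python
import networkx as nx

def is_planar(G):
    """Return True if G is planar, using NetworkX's built-in planarity test."""
    ok, _certificate = nx.check_planarity(G)
    return ok

# Sanity checks: K4 is planar, whereas K5 is not
print(is_planar(nx.complete_graph(4)))  # True
print(is_planar(nx.complete_graph(5)))  # False
```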
Fig. A.3 Example of the planar graph generation process. Top-left shows n = 100 points placed
at random coordinates in the unit square; top-right then shows a Delaunay triangulation of these
points. Bottom-left shows a minimum spanning tree of this triangulation, while bottom-right shows
the final planar graph G, formed by adding further edges

The following code demonstrates how the graph colouring problem can be specified
using integer programming methods and then solved using off-the-shelf optimisa-
tion software. This example gives the implementation used in the experiments of
Sect. 4.1.2.4 and is coded in the Xpress-Mosel language, which comes as part of the
FICO Xpress Optimisation Suite. Comments in the code are preceded by exclamation
marks.

1   model GCOL
2
3   ! Gain access to the Xpress-Optimizer solver and timer
4   uses "mmxprs", "mmsystem"
5
6   ! Specify a hard time limit of 60 seconds
7   setparam("XPRS_MAXTIME", -60)
8
9   ! Start the timer
10  starttime := gettime
11
12  ! Define input file
13  fopen("graph.txt", F_INPUT)
14
15  ! Define integers used in the program
16  declarations
17    n, m, v1, v2: integer
18  end-declarations
19
20  ! Read the number of vertices and edges from the input file
21  read(n, m)
22  writeln("n = ", n, ", m = ", m)
23
24  ! Declare the decision variables and make them binary
25  declarations
26    X: array(1..n, 1..n) of mpvar
27    Y: array(1..n) of mpvar
28  end-declarations
29  forall(i in 1..n) do
30    forall(j in 1..n) do
31      X(i,j) is_binary
32    end-do
33    Y(i) is_binary
34  end-do
35
36  ! Read in all of the edges and define the constraints that ensure that (a) adjacent vertices are assigned to different colours, and (b) Y(i) is set to 1 only if a vertex is assigned to colour i
37  write("E = {")
38  forall(j in 1..m) do
39    read(v1, v2)
40    forall(i in 1..n) do
41      X(v1,i) + X(v2,i) <= Y(i)
42    end-do
43    write("{", v1, ",", v2, "}")
44  end-do
45  writeln("}")
46
47  ! Specify that each vertex is to be assigned to exactly one colour
48  forall(i in 1..n) do
49    sum(j in 1..n) X(i,j) = 1
50  end-do
51
52  ! Eliminate solution symmetries
53  forall(i in 1..n) do
54    forall(j in i+1..n) do
55      X(i,j) = 0
56    end-do
57  end-do
58  forall(i in 2..n) do
59    forall(j in 2..i-1) do
60      X(i,j) <= sum(l in j-1..i-1) X(l,j-1)
61    end-do
62  end-do
63
64  ! Specify the objective function
65  objfn := sum(i in 1..n) Y(i)
66
67  ! Run the model
68  writeln
69  writeln("Running model...")
70  minimise(objfn)
71  writeln("...Run ended")
72
73  ! Write the output to the screen
74  writeln
75  writeln("Total time = ", gettime - starttime, " secs")
76  writeln("Upper bound = ", getobjval, " (number of colours in best observed solution)")
77  writeln("Lower bound = ", getparam("XPRS_BESTBOUND"))
78  writeln
79
80  writeln("X = ")
81  forall(i in 1..n) do
82    forall(j in 1..n) do
83      write(getsol(X(i,j)), " ")
84    end-do
85    writeln
86  end-do
87  writeln
88  writeln("Y = ")
89  forall(j in 1..n) do
90    write(getsol(Y(j)), " ")
91  end-do
92  writeln
93  writeln
94  writeln("Solution = ")
95  forall(i in 1..n) do
96    write("c(v_", i, ") = ")
97    forall(j in 1..n) do
98      if (getsol(X(i,j)) = 1) then
99        writeln(j)
100     end-if
101   end-do
102 end-do
103
104 end-model
The above program starts by reading in a graph colouring problem from a text
file, called graph.txt in this case. The objective function and constraints of the prob-
lem are then specified, before the optimisation process itself is invoked using the
minimise(objfn) command on Line 70. In this implementation, the optimisa-
tion process is terminated once a provably optimal solution has been found, or when
a time limit has been reached (specified as sixty seconds here). The upper and lower
bounds are then written to the screen, together with the total run time. If the lower
and upper bounds are equal, then the provably optimal solution has been found. If
this is not the case, then the best-observed integer solution found within the time
limit (if, indeed, one has been found) is output. The number of colours used in this
integer solution corresponds to the upper bound.
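Stated mathematically, the model above implements the standard assignment-based formulation of the graph colouring problem from Chap. 4, where the binary variable X(i, j) indicates that vertex i receives colour j, and Y(j) indicates that colour j is used:

```latex
\begin{align*}
\text{minimise}   \quad & \sum_{j=1}^{n} Y_j \\
\text{subject to} \quad & \sum_{j=1}^{n} X_{ij} = 1
  && \forall\, i \in \{1,\dots,n\} \\
& X_{v_1 j} + X_{v_2 j} \le Y_j
  && \forall\, \{v_1, v_2\} \in E,\ \forall\, j \in \{1,\dots,n\} \\
& X_{ij},\, Y_j \in \{0,1\}
  && \forall\, i, j \in \{1,\dots,n\}
\end{align*}
```

The symmetry-breaking constraints defined on Lines 52–62 of the listing are added on top of this basic formulation.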
Here is some example input that can be read in by the above program. The first
two lines give the number of vertices and edges, n and m, respectively. The m edges
then follow, one per line. This particular example corresponds to the graph shown in
Fig. 4.5.
8
12
1 2
1 3
1 4
2 5
2 6
2 8
3 4
3 7
4 7
5 8
6 8
7 8
Running the program on this input produces output that ends as follows:

9  Lower bound = 3
10
11 X =
12 1 0 0 0 0 0 0 0
13 0 1 0 0 0 0 0 0
14 0 0 1 0 0 0 0 0
15 0 1 0 0 0 0 0 0
16 1 0 0 0 0 0 0 0
17 1 0 0 0 0 0 0 0
18 1 0 0 0 0 0 0 0
19 0 0 1 0 0 0 0 0
20
21 Y =
22 1 1 1 0 0 0 0 0
23
24 Solution =
25 c(v_1) = 1
26 c(v_2) = 2
27 c(v_3) = 3
28 c(v_4) = 2
29 c(v_5) = 1
30 c(v_6) = 1
31 c(v_7) = 1
32 c(v_8) = 3
It can be seen that the upper and lower bounds reported here are equal. A provably
optimal solution has therefore been found.
Here are some further web resources related to graph colouring. A page of resources
maintained by Joseph Culberson featuring, most notably, a collection of problem
generators and C code for the algorithms presented by Culberson and Luo [1] can
be found at:
http://webdocs.cs.ualberta.ca/~joe/Coloring/
An excellent bibliography on the graph colouring problem can also be found at:
http://www.imada.sdu.dk/~marco/gcp/
A large set of graph colouring problem instances has been collected by the Center
for Discrete Mathematics and Theoretical Computer Science (DIMACS) as part of
their DIMACS Implementation Challenge series. These can be downloaded at:
http://mat.gsia.cmu.edu/COLOR/instances.html
These problem instances have been used in a large number of graph colouring-based
papers and are written in the DIMACS graph format, a specification of which can be
found in the following (postscript) document:
http://mat.gsia.cmu.edu/COLOR/general/ccformat.ps
or at
http://prolland.free.fr/works/research/dsat/dimacs.html
The fun graph colouring game CoLoRaTiOn, which is suitable for both adults and
children, can be downloaded from:
http://vispo.com/software/
The goal in this game is to achieve a feasible colouring within a certain number
of moves. The difficulty of each puzzle depends on several factors, including its
topology, whether you can see all of the edges, the number of vertices, and the
number of available colours.
Finally, C++ code for the random Sudoku problem instance generator used in
Sect. 6.4.2 of this book can be downloaded from:
http://rhydlewis.eu/resources/sudokuGeneratorMetaheuristics.zip
http://rhydlewis.eu/resources/sudokuToGCol.zip
When compiled, this program reads in a single Sudoku problem (from a text file)
and converts it into the equivalent graph colouring problem in the DIMACS format
mentioned above.
Reference

1. Culberson J, Luo F (1996) Exploring the k-colorable landscape with iterated greedy. In:
Johnson D, Trick M (eds) Cliques, coloring, and satisfiability: second DIMACS implementation
challenge. DIMACS series in discrete mathematics and theoretical computer science, vol 26.
American Mathematical Society, pp 245–284
Appendix B: Table of Notation

The following table lists the main notation used throughout this book. Further infor-
mation can be found by consulting the index and the relevant definitions.
Notation Description
Graph properties
G = (V, E) Graph G comprising a vertex set V and an edge set E
n Number of vertices in a graph, n = |V |
m Number of edges in a graph, m = |E|
Γ (v) Set of vertices adjacent to a vertex v ∈ V (i.e., the neighbourhood of v)
deg(v) Degree (number of neighbours) of a vertex v. deg(v) = |Γ (v)|
Δ(G) Maximum degree of any vertex in the graph G
c(v) Colour label of a vertex v ∈ V
sat(v) Saturation degree of a vertex v. Used with the DSatur algorithm
Kempe(v, i, j) Set of vertices in the Kempe chain generated by v ∈ V and colours i = c(v) and j
χ(G) Chromatic number of a graph G
α(G) Independence number of a graph G
ω(G) Clique number of a graph G
L(G) Line graph of a graph G
χ′(G) Chromatic index of a graph G
χL(G) Choice number of a graph G
χe(G) Equitable chromatic number of a graph G
General mathematical notation
O(.) Big-O notation for describing the order of growth of a function
|S| Set cardinality (number of elements in the set S)
lg n Binary logarithm. lg n = log2 n
n! Factorial function. n! = n × (n − 1) × (n − 2) × . . . × 2 × 1
nPk Permutation function. Number of ways of picking an ordered set of k
elements from a set of n elements. nPk = n!/(n − k)!
C(n, k) Binomial coefficient. Number of ways of picking an unordered set of k
elements from a set of n elements. C(n, k) = n!/(k!(n − k)!)
Bibliography
1. Beineke L, Wilson J (eds) (2015) Topics in chromatic graph theory. Encyclopedia of mathe-
matics and its applications (no. 156). Cambridge University Press
2. Jensen T, Toft B (1994) Graph coloring problems, 1st edn. Wiley-Interscience
3. Lewis R, Thompson J, Mumford C, Gillard J (2012) A wide-ranging computational comparison
of high-performance graph colouring algorithms. Comput Oper Res 39(9):1933–1950
Index

A
Adjacency list, 14
Adjacency matrix, 14
Adjacent (vertices), 10
Algorithm user guide, 277–281
Ant colony optimisation, 118–121, 257
AntCol algorithm, 118–121
  empirical performance, 124–145
Appel, Kenneth, 9, 161, 165
Approximation algorithms, 75
Approximation ratio, 75
Articulation point, see Cut vertex
Aspiration criterion (tabu search), 114

B
Backtracking algorithm, 78–81, 122–124
  empirical performance, 124–145
Barabási–Albert method, 134
Bell number Bn, 21
Betweenness centrality, 153
Big O notation O
  definition, 19
  examples, 19
Binary heap, 67
Bipartite graph, 34, 43, 56, 57, 60, 149, 158, 169
Block, 51
Boolean satisfiability, see Satisfiability problem
Branch-and-bound algorithm, 81–86
  empirical performance, 88–91
Breadth-first search, 25
Bridge, 156
  identification using NetworkX, 288
Brooks' bound, see Chromatic number χ(G)

C
Canonical round-robin algorithm, 222, 234
Cayley, Arthur, 163
Checks (counting), 15
Choice number χL(G), 188
Chordal graph, 48
Chromatic index χ′(G), 166
Chromatic number χ(G)
  Berge's bound, 75
  Bollobás' bound, 74
  Brooks' bound, 50–53
  definition, 11
  Hoffman's bound, 75
  interval graphs, 47–48
  lower bounds, 45–48
  Reed's bound, 71
  Reed's conjecture, 71
  upper bounds, 49–54
  Welsh–Powell bound, 53–54
Chromatic polynomial, 196–199
Circle method, 168, 222, 234
Clash, 10
Clique
  counting in NetworkX, 92
  definition, 11
  identification using NetworkX, 288
  use with backtracking, 80
  use with branch-and-bound, 88
  use with multicolouring, 196
Clique number ω(G), see also Maximum clique problem, 45
  identification with NetworkX, 289
CoLoRaTiOn game, 295
Colour class, 11