Listing All Maximal Cliques in Large Sparse Real-World Graphs in Near-Optimal Time
DAVID EPPSTEIN, University of California, Irvine
MAARTEN LÖFFLER, Utrecht University
DARREN STRASH, University of California, Irvine
We modify an algorithm of Bron and Kerbosch [1973] for maximal clique enumeration to choose more
carefully the order in which the vertices are processed, giving us a fixed-parameter tractable algorithm with
running time $O(dn3^{d/3})$ on graphs with n vertices and degeneracy d. Our time bound matches a worst-case
bound of $(n - d)3^{d/3}$ on the number of maximal cliques when d is a multiple of 3 and n ≥ d + 3. For graphs
with degeneracy d and maximum clique size κ, the algorithm satisfies a time bound of $O(d^2 n(d/\kappa)^\kappa)$, and for
$K_h$-minor-free graphs we obtain a time bound of $n2^{O(h \log\log h)}$, matching a bound of Fomin et al. [2010] for the
number of cliques in these graphs. We implement our algorithm and provide a comparative analysis of it and
a different variation of the Bron–Kerbosch algorithm by Tomita et al. [2006] on a large corpus of real-world
graphs with low degeneracy. Our algorithm always performs comparably with Tomita et al. on moderately
sized graphs, in some cases is much faster, and due to its more space-efficient data structures is capable of
being applied to significantly larger graphs.
Categories and Subject Descriptors: F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnu-
merical Algorithms and Problems—Computations on discrete structures
General Terms: Algorithms, Design, Experimentation, Performance, Theory
Additional Key Words and Phrases: Maximal clique enumeration; Bron–Kerbosch algorithm; parameterized
complexity
ACM Reference Format:
David Eppstein, Maarten Löffler, and Darren Strash. 2013. Listing all maximal cliques in large sparse real-
world graphs in near-optimal time. ACM J. Exp. Algor. 18, 3, Article 3.1 (December 2013), 21 pages.
DOI: http://dx.doi.org/10.1145/2543629
1. INTRODUCTION
Cliques, complete subgraphs of a graph, are of great importance in many applications.
Often, it is important to find not just one large clique, but all maximal cliques, sets of
vertices that form a clique, but for which no superset is also a clique (Figure 1). Many
algorithms are now known for this problem [Akkoyunlu 1973; Bron and Kerbosch 1973;
This research was supported in part by the National Science Foundation under grants 0830403 and 1217322,
and by the Office of Naval Research under MURI grant N00014-08-1-1015.
Preliminary versions of the research reported in this article appeared as two conference papers, “Listing all
maximal cliques in sparse graphs in near-optimal time”, by Eppstein, Löffler, and Strash, at the 21st Inter-
national Symposium on Algorithms and Computation, Korea, 2010, and “Listing all maximal cliques in large
sparse real-world graphs”, by Eppstein and Strash, at the 10th International Symposium on Experimental
Algorithms, Crete, 2011.
Authors’ addresses: D. Eppstein and D. Strash, Computer Science Department, University of California,
Irvine, Irvine, CA; email: {eppstein, dstrash}@uci.edu; M. Löffler, Department of Computing and Information
Sciences, Utrecht University, the Netherlands; email: [email protected].
© 2013 ACM 1084-6654/2013/12-ART3.1 $15.00
ACM Journal of Experimental Algorithmics, Vol. 18, No. 3, Article 3.1, Publication date: December 2013.
Cazals and Karande 2008; Cheng et al. 2010; Chiba and Nishizeki 1985; Chrobak and
Eppstein 1991; Du et al. 2009; Gély et al. 2009; Gerhards and Lindenberg 1979; Harary
and Ross 1957; Johnston 1976; Lu et al. 2010; Makino and Uno 2004; Modani and Dey
2008; Mulligan and Corneil 1972; Pan and Santos 2008; Samatova et al. 2008; Schmidt
et al. 2009; Tomita et al. 2006; Zhang et al. 2005] and for the complementary problem
of finding maximal independent sets [Eppstein 2009; Johnson et al. 1988; Lawler et al.
1980; Loukakis and Tsouros 1981; Tsukiyama et al. 1977]. One of the most successful
in practice is the Bron–Kerbosch algorithm, a simple backtracking procedure that
recursively solves subproblems specified by three sets of vertices: the vertices that are
required to be included in a partial clique, the vertices that are to be excluded from the
clique, and some remaining vertices whose status still needs to be determined [Bron
and Kerbosch 1973; Cazals and Karande 2008; Johnston 1976; Koch 2001; Tomita et al.
2006]. All maximal cliques can be listed in polynomial time per clique [Lawler et al.
1980; Tsukiyama et al. 1977] or in a total time proportional to the maximum possible
number of cliques in an n-vertex graph, without additional polynomial factors [Eppstein
2003; Tomita et al. 2006]. In particular, a variant of the Bron–Kerbosch algorithm is
optimal in this sense [Cazals and Karande 2008; Tomita et al. 2006]. Unfortunately this
maximum possible number of cliques is exponential [Moon and Moser 1965], so that all
general-purpose algorithms for listing maximal cliques necessarily take exponential
time in the worst case.
a measure of sparsity that upper bounds degeneracy, is also low for social networks
[Eppstein and Spiro 2009]. In addition, our experiments in Section 4 show that protein–
protein interaction networks have low degeneracy as well. Furthermore, planar
graphs have degeneracy at most five [Lick and White 1970], and the Barabási–Albert
model of preferential attachment [Barabási and Albert 1999], frequently used to model
large scale-free social networks, produces graphs with bounded degeneracy.
1.2. Practice
Many different clique-finding algorithms have been implemented, and an algorithm
of Tomita et al. [2006], based on the much earlier algorithm of Bron and Kerbosch
[1973], has been shown through many experiments to be faster by orders of magnitude
in practice than the other algorithms that have been studied to date. An unfortunate
drawback of the algorithm of Tomita et al., however, is that both its theoretical analysis
and implementation rely on an adjacency matrix representation of the input graph. For
this reason, their algorithm has limited applicability for large sparse graphs, whose
adjacency matrix may not fit into working memory. We therefore seek to have the
best of both worlds: we would ideally like an algorithm that rivals the speed of the
Tomita et al. result, while having linear storage cost.
In Section 4, we describe an implementation of our new algorithm that has these
ideal properties. Our algorithm uses only linear space, so it is usable on very large
graphs. Its running time rivals, and in many cases is faster than, the running time of
the Tomita et al. method. And, although it is never slower than Tomita et al. by more
than a small factor, it is sometimes faster by a large factor, showing that it is more
reliably fast than the previous algorithm.
Our algorithms are fixed-parameter tractable, with running times of the form
$O(f(d)n)$ where $f(d) = d3^{d/3}$. Algorithms for listing all maximal cliques in graphs
of constant degeneracy in time O(n) were known [Chiba and Nishizeki 1985; Chrobak
and Eppstein 1991], but had not been analyzed for their dependence on degeneracy. We
compare the parameterized running time bounds of these other algorithms to our variant
of the Bron–Kerbosch algorithm, and we show that the Bron–Kerbosch algorithm
has a much smaller dependence on d.
Fig. 2. (a) A graph with degeneracy 3. The clique consisting of vertices D, E, H, and I shows that it cannot
have degeneracy smaller than 3. (b) A vertex ordering showing that the degeneracy is not larger than 3.
2. PRELIMINARIES
We work with an undirected graph G = (V, E), which we assume is stored in an
adjacency list data structure. We let n and m be the number of vertices and edges of G,
respectively. For a vertex v, we define $\Gamma(v)$ to be the set {w | (v, w) ∈ E}, which we call
the neighborhood of v, and similarly for a subset W ⊂ V we define $\Gamma(W)$ to be the set
$\bigcap_{w \in W} \Gamma(w)$, which is the common neighborhood of all vertices in W.
2.1. Degeneracy
Our algorithm is parameterized by the degeneracy of a graph, a measure of its sparsity.
The degeneracy of a graph G is the smallest value d such that every nonempty
subgraph of G contains a vertex of degree at most d [Lick and White 1970].
Figure 2(a) shows an example of a graph of degeneracy 3. Degeneracy is also known
as the k-core number [Batagelj and Zaveršnik 2003], width [Freuder 1982], and linkage
[Kirousis and Thilikos 1996] of a graph and is one less than the coloring number
[Erdős and Hajnal 1966]. In a graph of degeneracy d, the maximum clique size can
be at most d + 1, for any larger clique would form a subgraph in which all vertices have
degree higher than d.
A graph of degeneracy d has a degeneracy ordering, an ordering such that each vertex
has d or fewer neighbors that come later in the ordering (Figure 2(b)). Such an ordering
may be formed by repeatedly removing a vertex of degree d or less: by the assumption
that G is d-degenerate, at least one such vertex exists at each step. Conversely, if G
has an ordering with this property, then it is d-degenerate, because for any subgraph
H of G, the vertex of H that comes first in the ordering has d or fewer neighbors in
H. Thus, as Lick and White [1970] showed, degeneracy may equivalently be defined as
the minimum d for which a degeneracy ordering exists. A third, equivalent definition
is that d is the minimum value for which G has an orientation as a directed acyclic
graph in which all vertices have out-degree at most d [Chrobak and Eppstein 1991]:
such an orientation may be found by orienting each edge from its earlier endpoint to
its later endpoint in a degeneracy ordering, and if such an orientation is given, then a
degeneracy ordering may be found by topologically ordering the oriented graph.
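The orientation described above can be illustrated with a minimal sketch. It assumes the graph is given as a dictionary `adj` mapping each vertex to its set of neighbors and that `order` is already a degeneracy ordering; both names, and the function name `orient_by_order`, are hypothetical and chosen only for this example.

```python
def orient_by_order(adj, order):
    """Orient each edge from its earlier endpoint to its later endpoint.

    adj:   dict mapping each vertex to the set of its neighbors (assumed format)
    order: list of all vertices, assumed to be a degeneracy ordering
    Returns a dict mapping each vertex to the list of its out-neighbors.
    """
    pos = {v: i for i, v in enumerate(order)}
    return {v: [w for w in adj[v] if pos[w] > pos[v]] for v in order}


# Example: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
out = orient_by_order(adj, [3, 0, 1, 2])
max_out_degree = max(len(nbrs) for nbrs in out.values())
```

Because every edge points forward in the ordering, the result is acyclic by construction, and the largest out-degree equals the largest number of later neighbors of any vertex.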
Degeneracy is a robust measure of sparsity: it is within a constant factor of other
popular measures of sparsity including arboricity and thickness. In addition, degeneracy,
along with a degeneracy ordering, can be computed by a simple greedy strategy
of repeatedly removing a vertex with smallest degree (and its incident edges) from the
graph until it is empty [Batagelj and Zaveršnik 2003]. The degeneracy is the maximum
of the degrees of the vertices at the time they are removed from the graph, and the
degeneracy ordering is the order in which vertices are removed from the graph [Jensen
and Toft 1995]. The easy computation of degeneracy has made it a useful tool in algorithm
design and analysis [Chrobak and Eppstein 1991; Eppstein 2009].
We can implement the greedy algorithm for finding a degeneracy ordering in time
O(n + m) [Batagelj and Zaveršnik 2003]. To do so, maintain an array D, where D[i]
stores a doubly-linked list of vertices that have exactly i neighbors among the graph
vertices that have not yet been included in the greedy ordering; initially D[i] stores
a list of the vertices with degree i. To remove a vertex of minimum degree from the
graph, scan D starting at D[0] until reaching the first nonempty list D[i]. Remove any
vertex v from D[i], and move each neighbor w of v from D[j] to D[j − 1], where j is
the number of unlisted neighbors of w prior to including v in the ordering. Each vertex
removal step takes time proportional to the degree of the removed vertex, so the overall
algorithm takes linear time.
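The bucket-based procedure above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it assumes the graph is a dictionary `adj` mapping each vertex to its set of neighbors, and it uses Python sets in place of the doubly-linked lists (both give constant expected-time insertion and deletion). The function name `degeneracy_ordering` is hypothetical.

```python
def degeneracy_ordering(adj):
    """Return (d, order): the degeneracy and a degeneracy ordering.

    adj: dict mapping each vertex to the set of its neighbors (assumed format).
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    max_deg = max(degree.values(), default=0)
    # D[i] holds the not-yet-removed vertices with exactly i remaining neighbors.
    D = [set() for _ in range(max_deg + 1)]
    for v, k in degree.items():
        D[k].add(v)
    removed = set()
    order = []
    d = 0
    for _ in range(len(adj)):
        # Scan from D[0] to the first nonempty bucket.
        i = 0
        while not D[i]:
            i += 1
        v = D[i].pop()
        d = max(d, i)  # degeneracy = largest degree seen at removal time
        order.append(v)
        removed.add(v)
        # Each remaining neighbor loses one neighbor: move it down one bucket.
        for w in adj[v]:
            if w not in removed:
                j = degree[w]
                D[j].remove(w)
                D[j - 1].add(w)
                degree[w] = j - 1
    return d, order


# Example: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
d, order = degeneracy_ordering(adj)
```

In the resulting ordering, every vertex has at most `d` neighbors that appear later, which is the defining property of a degeneracy ordering.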
We can also use degeneracy to bound the total number of edges in the graph. If we
sum, for each vertex v, the number of neighbors of v that are later in the ordering,
then each edge is counted once, so the value of the sum is just m. However, for a vertex
v at position i in the degeneracy ordering, the number of later neighbors of v can be at most
min(d, n − i). Adding this quantity over all possible positions, we get the following
bound on the number of edges of a d-degenerate graph.
LEMMA 2.1 (PROPOSITION 3 OF LICK AND WHITE [1970]). A graph G = (V, E) with degeneracy
d has at most $d\left(n - \frac{d+1}{2}\right)$ edges.
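The sum behind this bound can be written out explicitly. The following is a sketch of the calculation, assuming n ≥ d and positions numbered i = 1, ..., n (the indexing convention is an assumption made for this illustration):

```latex
\begin{aligned}
m \;\le\; \sum_{i=1}^{n} \min(d,\, n-i)
  &= \sum_{i=1}^{n-d} d \;+\; \sum_{i=n-d+1}^{n} (n-i) \\
  &= d(n-d) + \frac{d(d-1)}{2} \\
  &= d\left(n - \frac{d+1}{2}\right).
\end{aligned}
```

The first n − d positions each contribute d, and the last d positions contribute d − 1, d − 2, ..., 0.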
3. THEORETICAL RESULTS
In this section, we show that, in addition to the pivoting strategy, the order in which the
vertices of G are processed by the Bron–Kerbosch algorithm is also important. We
develop a variant of the Bron–Kerbosch algorithm that chooses this ordering carefully
and correctly lists all maximal cliques in time $O(dn3^{d/3})$. Our algorithm performs the
outer level of recursion of the Bron–Kerbosch algorithm without pivoting, using a
Fig. 4. Partitioning the common neighbors of a clique R into the set P of later vertices and the set X of
remaining neighbors.
degeneracy ordering to order the sequence of recursive calls made at this level, and
then switches at inner levels of recursion to the pivoting rule of Tomita et al. [2006].
We also show that this performance is optimal in terms of the degeneracy, and that
the dependence on the degeneracy is better than other algorithms that are designed
for clique-finding in sparse graphs.
in X. By the choice of ordering, the new vertex v is earlier than all remaining vertices
in P. Therefore, after adding v, it remains the case that the partition of the common
neighbors of R into X and P is determined by the ordering of these common neighbors
with respect to the last vertex v of R.
Our algorithm (Figure 5) computes a degeneracy ordering of the given graph, and
performs the outermost recursive calls in the ordered variant of the Bron–Kerbosch
algorithm (without pivoting) for this ordering. The sets P passed to each of these
recursive calls will have at most d elements in them, leading to few recursive calls
within each of these outer calls. Below the top level of the recursion we switch from the
ordered nonpivoting version of the Bron–Kerbosch algorithm to the pivoting algorithm
(with the same choice of pivots as Tomita et al. [2006]) to further control the number
of recursive calls.
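The two-level structure described above can be sketched in a few lines. This is a set-based illustration under stated assumptions, not the data-structure-level implementation analyzed in this article: the graph is a dictionary `adj` of neighbor sets, the helper `naive_degeneracy_order` is a deliberately simple (non-linear-time) stand-in for the ordering step, and all function names are hypothetical.

```python
def naive_degeneracy_order(adj):
    """Repeatedly remove a minimum-degree vertex (simple, not linear time)."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    order = []
    while remaining:
        v = min(remaining, key=lambda w: len(remaining[w]))
        for w in remaining.pop(v):
            remaining[w].discard(v)
        order.append(v)
    return order


def bron_kerbosch_pivot(adj, P, R, X, out):
    """Inner levels: Bron-Kerbosch with the pivot rule of Tomita et al."""
    if not P and not X:
        out.append(frozenset(R))  # R is a maximal clique
        return
    # Choose the pivot u in P ∪ X that maximizes |P ∩ Γ(u)|.
    u = max(P | X, key=lambda w: len(P & adj[w]))
    for v in list(P - adj[u]):
        bron_kerbosch_pivot(adj, P & adj[v], R | {v}, X & adj[v], out)
        P.remove(v)
        X.add(v)


def bron_kerbosch_degeneracy(adj):
    """Outer level: no pivoting, vertices processed in degeneracy order."""
    order = naive_degeneracy_order(adj)
    pos = {v: i for i, v in enumerate(order)}
    out = []
    for v in order:
        P = {w for w in adj[v] if pos[w] > pos[v]}  # later neighbors
        X = {w for w in adj[v] if pos[w] < pos[v]}  # earlier neighbors
        bron_kerbosch_pivot(adj, P, {v}, X, out)
    return out


# Example: triangle 0-1-2, pendant vertex 3 attached to 2, isolated vertex 4.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}, 4: set()}
cliques = bron_kerbosch_degeneracy(adj)
```

Each maximal clique is reported exactly once, from the outer call for its earliest vertex in the ordering; the set P in that call contains only later neighbors and so has at most d elements.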
LEMMA 3.4. In a recursive call to BronKerboschPivot that is passed the graph $H_{P,X}$
as an auxiliary argument, the sequence of graphs $H_{P \cap \Gamma(v), X \cap \Gamma(v)}$ to be passed to lower-level
recursive calls can be computed in total time $O(|P|^2(|P| + |X|))$.
PROOF. It takes $O(|P| + |X|)$ time to identify the subsets $P \cap \Gamma(v)$ and $X \cap \Gamma(v)$ by
examining the neighbors of v in $H_{P,X}$. Once these sets are identified, $H_{P \cap \Gamma(v), X \cap \Gamma(v)}$ may
be constructed as a subgraph of $H_{P,X}$ in time $O(|P|(|P| + |X|))$ by testing for each edge of
$H_{P,X}$ whether its endpoints belong to these sets. There are $O(|P|)$ graphs to construct,
one for each recursive call, hence the total time bound.
LEMMA 3.5 (THEOREM 3 OF TOMITA ET AL. [2006]). Let T be a function which satisfies
the following recurrence relation:
$$T(p) \le \begin{cases} \max_k \{kT(p - k)\} + dp^2 & \text{if } p > 0 \\ e & \text{if } p = 0, \end{cases}$$
where p and k are integers, such that p ≥ k, and d, e are constants greater than zero.
Then, $T(p) \le \max_k \{kT(p - k)\} + dp^2 = O(3^{p/3})$.
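As a numerical illustration of this lemma (not a proof), the recurrence can be evaluated with equality and compared against $3^{p/3}$. The sketch below assumes the constants d = e = 1 purely for illustration; the function name `tomita_recurrence` is hypothetical.

```python
def tomita_recurrence(p_max, d=1, e=1):
    """Evaluate T(p) = max_k {k * T(p - k)} + d * p^2 with T(0) = e.

    The constants d and e follow the lemma's notation; the default values
    of 1 are an assumption made only for this illustration.
    """
    T = [e]
    for p in range(1, p_max + 1):
        best = max(k * T[p - k] for k in range(1, p + 1))
        T.append(best + d * p * p)
    return T


T = tomita_recurrence(60)
# Ratio of T(p) to 3^(p/3); the lemma says this ratio stays bounded.
ratios = [T[p] / 3 ** (p / 3) for p in range(len(T))]
```

Choosing k = 3 repeatedly already forces $T(p) \ge 3^{p/3} \cdot e$ when p is a multiple of 3, so the bound of the lemma cannot be improved by more than a constant factor.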
LEMMA 3.6. Let v be a vertex, $P_v$ be v’s later neighbors, and $X_v$ be v’s earlier neighbors.
Then BronKerboschPivot($P_v$, {v}, $X_v$) executes in time $O((d + |X_v|)3^{|P_v|/3})$, excluding the
time to report the discovered maximal cliques.
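PROOF. Define D(p, x) to be the running time of BronKerboschPivot($P_v$, {v}, $X_v$),
where $p = |P_v|$ and $x = |X_v|$. We show that $D(p, x) = O((d + x)3^{p/3})$. By the description
of BronKerboschPivot, D satisfies the following recurrence relation:
$$D(p, x) \le \begin{cases} \max_k \{kD(p - k, x)\} + c_1 p^2 (p + x) & \text{if } p > 0 \\ c_2 & \text{if } p = 0, \end{cases}$$
where $c_1$ and $c_2$ are constants. Since $p \le d$, we have $p + x \le d + x$, and therefore
$$\begin{aligned} D(p, x) &\le \max_k \{kD(p - k, x)\} + c_1 p^2 (p + x) \\ &\le (d + x)\left(\max_k \left\{\frac{kD(p - k, x)}{d + x}\right\} + c_1 p^2\right) \\ &\le (d + x)\left(\max_k \{kT(p - k)\} + c_1 p^2\right) \end{aligned}$$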
Fig. 6. The lower bound construction for d = 6, consisting of a Moon–Moser graph of size d on the right
(blue vertices) and an independent set of n − d remaining vertices that are each connected to all of the last
d vertices.
However, we will show that by a simple modification, we can in fact obtain an algorithm
that runs in time $O(d(n - d)3^{d/3})$, which matches the worst-case output size for
all values of d.
This new algorithm, which we’ll call BronKerboschDegeneracy2, is divided into two
phases. In the first phase, we run BronKerboschDegeneracy, which we allow to compute
the degeneracy ordering and we stop its execution once it has processed the first n − d
vertices in the ordering. This takes time
one implies that there are no edges between pairs of vertices in P, so each recursive call
returns immediately, and the total time for the algorithm is $O(p^2 x) = O(p(p + x)(p/k)^k)$.
For k > 1, the time bounds for the algorithm obey a recurrence of the form
$$T(x, p, k) \le \max_j \, jT(x, p - j, k - 1) + O(p^2 x).$$
To show that this result leads to nontrivial improvements compared to our $O(d3^{d/3}n)$
bound, we consider the case of minor-closed graph families, families of graphs closed
under the operations of edge contraction and edge deletion. We say that such a family F
is nontrivial if it does not include all graphs; in this case, we may let h be the minimum
number of vertices in a graph that does not belong to F. The complete graph $K_h$ is an
excluded minor for F and $F \subset F_h$ where $F_h$ is the family of graphs that do not have $K_h$
as a minor.
All graphs in $F_h$ have degeneracy $O(h\sqrt{\log h})$, and this is tight [Kostochka 1984;
Thomason 2001, 1984]. Therefore, our time bound for BronKerboschDegeneracy that
uses only the degeneracy would show that it takes time $n2^{O(h\sqrt{\log h})}$ on graphs in $F_h$.
However, if we use the observation that the maximum clique size κ on graphs in $F_h$ is
at most h, we can instead apply Theorem 3.11, giving us a tighter bound of $n2^{O(h \log\log h)}$
on the running time of BronKerboschDegeneracy. This time bound matches an upper
bound of $n2^{O(h \log\log h)}$ on the number of cliques in these graphs given by Fomin et al.
[2010], and provides an alternative proof for their bound.
3.6. Comparison with Other FPT Algorithms
Our algorithm is not the first to list the maximal cliques of sparse graphs, and algo-
rithms that run in linear time on graphs of constant degeneracy were already known.
Essentially, although their analysis was not necessarily phrased in these terms, these
previous algorithms are fixed-parameter tractable algorithms. However, the depen-
dence of these algorithms on the degeneracy parameter has not always been explicitly
described, so we now review these existing algorithms and show that our algorithm’s
dependence on d is superior.
Chiba and Nishizeki. Chiba and Nishizeki [1985] describe two algorithms for finding
cliques in sparse graphs. One of the two reports all maximal cliques using O(am) time
per clique, where a is the arboricity of the graph, and m is the number of edges in G.
The arboricity is the minimum number of edge-disjoint spanning forests into which
the graph can be decomposed [Harary 1972]. The degeneracy of a graph is closely
related to its arboricity: the two are always within a factor of two of each other. If a graph G has
arboricity a, then any subgraph of G with k vertices has at most a(k − 1) edges, average
degree at most 2a(k − 1)/k < 2a, and minimum degree at most 2a − 1. In the other direction,
if G has degeneracy d, then we may partition it into d forests by finding a degeneracy
orientation of G and, at each vertex v, assigning each of the at most d outgoing edges
to a different forest.
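The forest partition in the second direction can be sketched directly. The code below is an illustration under assumptions: `adj` is a dictionary of neighbor sets, `order` is assumed to be a degeneracy ordering, and the names `forests_from_order` and `is_forest` are hypothetical helpers introduced only for this example.

```python
def forests_from_order(adj, order):
    """Give the i-th outgoing edge of every vertex to forest number i."""
    pos = {v: i for i, v in enumerate(order)}
    forests = {}
    for v in order:
        later = sorted((w for w in adj[v] if pos[w] > pos[v]), key=pos.get)
        for i, w in enumerate(later):
            forests.setdefault(i, []).append((v, w))
    return [forests[i] for i in sorted(forests)]


def is_forest(edges):
    """Check acyclicity of an undirected edge list with union-find."""
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for u, w in edges:
        ru, rw = find(u), find(w)
        if ru == rw:
            return False  # this edge would close a cycle
        parent[ru] = rw
    return True


# Example: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
forests = forests_from_order(adj, [3, 0, 1, 2])
```

Within one forest each vertex has at most one outgoing edge, so the earliest vertex on any would-be cycle could not have both of its cycle edges in the same forest; this is why each class is acyclic.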
In terms of degeneracy, Chiba and Nishizeki’s algorithm uses $O(d^2 n)$ time per clique.
Combining this with the bound on the number of cliques derived in Section 3.3 results
in a worst-case time bound of $O(d^2 n(n - d)3^{d/3})$. For constant d, this is a quadratic time
bound, in contrast to the linear time of our algorithm.
Another algorithm of Chiba and Nishizeki [1985] lists cliques that have l vertices in
time $O(la^{l-2}m)$. It can be adapted to enumerate all maximal cliques in a graph with
degeneracy d by first enumerating all cliques of order d + 1, d, . . . down to 1, and
then removing the cliques that are not maximal. Applying their algorithm directly to a
d-degenerate graph takes time $O(ld^{l-1}n)$. Therefore, the running time to find all maximal
cliques is $\sum_{1 \le i \le d+1} O(ind^{i-1}) = O(nd^{d+1})$. Like our algorithm, this is linear when
d is constant, but with a much worse dependence on the parameter d.
Chrobak and Eppstein. Chrobak and Eppstein [1991] list triangles and 4-cliques in
graphs of bounded degeneracy by testing all sets of two or three later neighbors of
each vertex according to a degeneracy ordering. The same idea extends to maximal
cliques of size greater than four, by testing all subsets of later neighbors of each vertex.
For each vertex v, there are at most $2^d$ subsets to test; each subset may be tested for
being a clique in time $O(d^2)$, by checking whether each of its vertices has all the later
vertices in the subset among its later neighbors, giving a total time of $O(nd^2 2^d)$ to list
all the cliques in the graph. However, although this singly exponential time bound is
considerably faster than Chiba and Nishizeki, and is close to known bounds on the
number of (possibly nonmaximal) cliques in d-degenerate graphs [Wood 2007], it is
slower than our algorithm by a factor that is exponential in d. Our new algorithm
uses this same idea of searching among the later neighbors in a degeneracy order but
achieves much greater efficiency by combining it with the Bron–Kerbosch algorithm.
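The subset-testing idea attributed to Chrobak and Eppstein above can be sketched as follows. This is an illustrative reading of the description, not their published code: it assumes `adj` is a dictionary of neighbor sets and `order` is a degeneracy ordering, and it lists all cliques (maximal or not), as the text describes. The function name is hypothetical.

```python
from itertools import combinations


def cliques_by_later_neighbors(adj, order):
    """List every clique once, grouped by its earliest vertex in the ordering."""
    pos = {v: i for i, v in enumerate(order)}
    later = {v: {w for w in adj[v] if pos[w] > pos[v]} for v in adj}
    cliques = []
    for v in order:
        nbrs = sorted(later[v], key=pos.get)  # at most d later neighbors
        for size in range(len(nbrs) + 1):
            for subset in combinations(nbrs, size):
                # subset is sorted by position: each vertex must have all the
                # later vertices of the subset among its own later neighbors.
                if all(subset[j] in later[subset[i]]
                       for i in range(size)
                       for j in range(i + 1, size)):
                    cliques.append(frozenset((v,) + subset))
    return cliques


# Example: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
all_cliques = cliques_by_later_neighbors(adj, [3, 0, 1, 2])
```

On this example the output contains four single vertices, four edges, and one triangle. Filtering the list down to maximal cliques would need an additional step that is not shown here.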
Makino and Uno. Makino and Uno [2004] list all maximal cliques in graphs with
maximum degree Δ in time $O(\Delta^4)$ per clique. For graphs with degeneracy d, Δ can be
any value between d and n − 1, making a meaningful comparison with our algorithm
difficult. Therefore, for graphs with constant degeneracy, their time bound is a factor
$\Delta^4$ slower than our algorithm in the worst case, which may be a constant factor, or
much worse. A graph with maximum degree Δ can have no more than $(n - \Delta)3^{\Delta/3}$
maximal cliques, and an analysis similar to that of the previous section shows that
this algorithm has a run time of $O(\Delta^4(n - \Delta)3^{\Delta/3})$, which is fixed-parameter tractable
when parameterized on Δ. However, this bound is always worse than the bound for our
1 In our experiments, we use an array instead of a linked list to store the neighbors of a vertex.
Fig. 7. When a vertex v is added to the partial clique R, its neighbors in P and X (highlighted in this
example) are moved toward the dividing line in preparation for the next recursive call.
is added to R for a recursive call, we can intersect the neighborhood of v with P and
X in O(Δ) time, by iterating over its at most Δ neighbors and testing for membership
in P or X. Thus, preparing subsets for all recursive calls takes time O(|P|Δ). Finally,
we note that the pivot at the top level of recursion intersects Δ neighbors in P, and
therefore, we only make (n − Δ) recursive calls at the top level.
Fitting these facts into the analysis of Tomita et al. shows that the running time
of this algorithm is $O(\Delta(n - \Delta)3^{\Delta/3})$. We note that Δ may be significantly larger than
the degeneracy, so this algorithm’s theoretical time bounds are not as good as those of
Tomita et al. or ours; nevertheless, as we show in the next section, the simplicity of
this algorithm makes it competitive in practice.
Our Algorithm with No Data Structure (ELS-bare). It is possible to implement
BronKerboschDegeneracy with only the degeneracy ordering, and no extra data structuring,
only passing lists of vertices as arguments in each recursive call. The slowest
part of each recursive call, and therefore the bottleneck of the algorithm, is the step in
which the pivot vertex is selected. A simple strategy for determining each pivot would
be to loop over all the possible pivots in X ∪ P and, for each one, loop over its later
neighbors in the degeneracy ordering to determine how many of them are in P. The
same strategy can also be used to perform the neighborhood intersection required to
set up the arguments for the recursive calls. With the pivot selection and set intersection
algorithms implemented in this way, the algorithm would have running time
$O(d^2 n3^{d/3})$, a factor of d larger than the worst-case output size and a factor of d slower
than the more highly optimized version of our algorithm. However, the simplicity of
this version of the algorithm may benefit it compared to the other algorithms.
Our Algorithm with Linear-Size Data Structure (ELS-array). To obtain our claimed
$O(dn3^{d/3})$ worst-case time bound in a practical implementation, we maintain the sets
of vertices P and X in a single array, the address to which is passed between recursive
calls. Initially, the array contains the elements of X followed by the elements of P. We
keep a reverse lookup table, so that in constant time we can look up the position of
each vertex in this array. With this lookup table, we can tell whether a vertex is in P or
X in constant time, by comparing its index to the position in the array where the two
subarrays for X and P are separated from each other. When a vertex v is added to R in
preparation for a recursive call, we reorder the array. Vertices in $\Gamma(v) \cap X$ are moved to
the end of the X subarray, and vertices in $\Gamma(v) \cap P$ are moved to the beginning of the P
subarray (see Figure 7). We then make a recursive call on the subarray containing the
vertices $\Gamma(v) \cap (X \cup P)$. After the recursive call, we move v to X by swapping it to the
beginning of the P subarray and moving the boundary so that v is in the X subarray.
Of course, moving vertices between sets will affect P and X in higher recursive calls.
Therefore, within a given recursive call, we maintain a separate list of the vertices that
are moved from P to X, and we move these vertices back to P when the call ends.
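The single-array layout with a reverse lookup table can be sketched as a small class. This is a simplified illustration of the bookkeeping described above, not the released implementation: the class name `PXArray`, its field names, and its methods are all hypothetical, and a full version would also save and restore the window bounds around each recursive call.

```python
class PXArray:
    """X and P stored contiguously in one array.

    arr[x_begin:p_begin] is X and arr[p_begin:p_end] is P.
    pos is the reverse lookup table: pos[v] is the index of v in arr.
    """

    def __init__(self, X, P):
        x_list = list(X)
        self.arr = x_list + list(P)
        self.pos = {v: i for i, v in enumerate(self.arr)}
        self.x_begin = 0
        self.p_begin = len(x_list)
        self.p_end = len(self.arr)

    def _swap(self, i, j):
        a, b = self.arr[i], self.arr[j]
        self.arr[i], self.arr[j] = b, a
        self.pos[a], self.pos[b] = j, i

    def in_X(self, v):
        return self.x_begin <= self.pos[v] < self.p_begin

    def in_P(self, v):
        return self.p_begin <= self.pos[v] < self.p_end

    def neighbors_window(self, nbrs):
        """Move Γ(v) ∩ X to the end of X and Γ(v) ∩ P to the start of P.

        nbrs must contain distinct vertices. Returns (lo, hi) such that
        arr[lo:p_begin] is Γ(v) ∩ X and arr[p_begin:hi] is Γ(v) ∩ P.
        """
        lo = hi = self.p_begin
        for w in nbrs:
            i = self.pos.get(w)
            if i is None:
                continue  # w is not stored in this array
            if self.x_begin <= i < self.p_begin:
                lo -= 1
                self._swap(i, lo)
            elif self.p_begin <= i < self.p_end:
                self._swap(i, hi)
                hi += 1
        return lo, hi

    def move_to_X(self, v):
        """Move v from P to X: swap it to the start of P, shift the boundary."""
        self._swap(self.pos[v], self.p_begin)
        self.p_begin += 1


px = PXArray(X=[0, 1], P=[2, 3, 4])
lo, hi = px.neighbors_window([1, 3])  # suppose Γ(v) = {1, 3}
x_part = px.arr[lo:px.p_begin]
p_part = px.arr[px.p_begin:hi]
px.move_to_X(4)
```

Membership tests reduce to one index comparison, and each reordering step is a constant number of swaps per neighbor, which is what keeps the per-call overhead proportional to the neighborhood size.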
To perform pivots quickly, our algorithm uses a modified adjacency list representation
of a subgraph formed by the vertices in X ∪ P: a set of arrays, one for each potential
Fig. 8. For each vertex in P ∪ X, we keep an array containing neighbors in P. We update these arrays
whenever a vertex is moved from P to R, and whenever we need to intersect a neighborhood with P and X
for a recursive call.
Ubuntu 10.10, with a 2.53GHz Intel Core i5 M460 processor (with three cache levels
of 128KB, 512KB, and 3,072KB, respectively) and 2.6GB of memory. We compiled our
code with version 4.4.5 of the gcc compiler with the -O2 optimization flag. We have
released our code under the GNU GPL 3.0 license, and we freely provide the code and
data sets upon request.
Table V. Experimental Results for Moon–Moser Graphs and DIMACS Benchmark Graphs
graph           n     m      d    Δ    μ             TTT-classic  TTT-lists  ELS-bare  ELS-array
M-M-30          30    405    27   27   59,049        0.04         0.04       0.06      0.04
M-M-45          45    945    42   42   14,348,907    7.50         15.11      20.36     10.21
M-M-48          48    1080   45   45   43,046,721    22.52        48.37      63.07     30.22
M-M-51          51    1224   48   48   129,140,163   67.28        150.02     198.06    91.80
MANN_a9         45    918    40   41   590,887       0.44         0.88       0.90      0.53
brock200_2      200   9876   84   114  431,586       0.55         2.95       2.61      1.22
c-fat200-5      200   8473   83   86   7             0.01         0.01       0.01      0.01
c-fat500-10     500   46627  185  188  8             0.04         0.04       0.09      0.12
hamming6-2      64    1824   57   57   1,281,402     1.36         4.22       4.15      2.28
hamming6-4      64    704    22   22   464           < 0.01       < 0.01     < 0.01    < 0.01
johnson8-4-4    70    1855   53   53   114,690       0.13         0.35       0.40      0.24
johnson16-2-4   120   5460   91   91   2,027,025     5.97         27.05      31.04     12.17
keller4         171   9435   102  124  10,284,321    5.98         24.97      26.09     11.53
p_hat300-1      300   10933  49   132  58,176        0.07         0.29       0.25      0.15
p_hat300-2      300   21928  98   229  79,917,408    91.31        869.34     371.72    163.16
algorithm was significantly faster than that of Tomita et al. on the worm and fruitfly
networks, and matched or came close to its performance on all the other networks,
even the relatively dense yeast network. Our algorithm was consistently faster on the
networks in the Pajek datasets (Table III). Due to their large size, the algorithm of
Tomita et al. was unable to run on two of these networks; nevertheless, our algorithm
found all cliques quickly in these graphs. Finally, nearly all of the graphs in the
Stanford Large Network Dataset Collection (Table IV) were too large for the Tomita
et al. algorithm to fit into memory. For graphs which are extremely sparse, it is no surprise
that the TTT-lists algorithm was faster than our algorithm, but our algorithm was
consistently fast on each of these datasets, whereas the TTT-lists algorithm was orders
of magnitude slower than our algorithm on the large soc-wiki-Talk network.
We also ran our comparisons using the two sets of graphs that Tomita et al. used in
their experiments, the DIMACS challenge graphs (Table V) and a set of random graphs
(Table VI). Our algorithm runs about 2 to 3 times slower than that of Tomita et al. on
many of these graphs; this confirms that the algorithm is still competitive on graphs
that are not sparse, in contrast to the competitors in Tomita et al.’s paper which ran
10 to 160 times slower on these input graphs. The largest of the random graphs in the
second dataset were generated with edge probabilities that made them significantly
sparser than the rest of the set; for those graphs our algorithm outperformed that
of Tomita et al. by a factor that was as large as 17 on the sparsest of the graphs.
ACM Journal of Experimental Algorithmics, Vol. 18, No. 3, Article 3.1, Publication date: December 2013.
The TTT-lists algorithm was even faster than our algorithm in these cases, but it was
significantly slower on other data.
4.2.2. A Remark about Degeneracy and Maximum Degree. Observe that the real-world
graphs in Tables I to IV tend to have degeneracy that is significantly lower than
the maximum degree. This experimentally verifies that degeneracy is a tighter
measure of sparsity than the maximum degree, and we believe our experimental results
demonstrate the power of degeneracy as a parameter when designing algorithms for
such sparse real-world graphs.
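The gap between the two parameters can be illustrated on a toy example. The Python sketch below computes the degeneracy by repeatedly removing a vertex of minimum degree, and compares it with the maximum degree. It is a simple quadratic-time illustration, not the linear-time procedure used in practice, and the star graph and function names are hypothetical.

```python
def degeneracy(adj):
    """Degeneracy of an undirected graph given as {vertex: set_of_neighbors}."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}  # working copy
    d = 0
    while remaining:
        # Remove a vertex of minimum degree in the remaining subgraph.
        v = min(remaining, key=lambda u: len(remaining[u]))
        d = max(d, len(remaining[v]))
        for w in remaining[v]:
            remaining[w].discard(v)
        del remaining[v]
    return d


def max_degree(adj):
    """Largest number of neighbors of any vertex (0 for an empty graph)."""
    return max((len(nbrs) for nbrs in adj.values()), default=0)


# Hypothetical toy graph: a star with one hub (vertex 0) and ten leaves.
star = {0: set(range(1, 11))}
for leaf in range(1, 11):
    star[leaf] = {0}

print(degeneracy(star), max_degree(star))  # 1 10
```

A single high-degree hub raises the maximum degree to 10 while the degeneracy stays at 1, which is the kind of behavior one might expect in networks with a few very well-connected vertices.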
5. CONCLUSION
We have presented theoretical evidence for the fast performance of the Bron–Kerbosch
algorithm for finding cliques in graphs, as has been observed in practice. Our modified
algorithm is fixed-parameter tractable in terms of the degeneracy of the graph, a
parameter that is expected to be low in many real-world applications, and performs
optimally in terms of this parameter. Furthermore, our experimental results show that
our algorithm is efficient in practice for large sparse graphs. This algorithm is highly
competitive with the algorithm of Tomita et al. on sparse graphs, and within a small
constant factor on other graphs. The advantage of this algorithm is that it requires only
linear space for storing the graph and all data structures. It does not suffer from the
drawback of requiring an adjacency matrix, which may not fit into memory. Its closest
competitor in this respect, the Tomita et al. algorithm modified to use adjacency lists,
is sometimes faster by a small factor but is also sometimes slower by a large factor.
Thus, our algorithm is a fast and reliable choice for listing maximal cliques, especially
when the input graphs are large and sparse.
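For readers who want a concrete picture of the approach summarized here, the following Python sketch runs a pivoting Bron–Kerbosch recursion once per vertex, in a degeneracy ordering, with each vertex's later neighbors as candidates and its earlier neighbors as excluded vertices. It is a minimal illustration under simplifying assumptions: Python sets replace the space-efficient data structures of our implementation, the ordering is computed in quadratic rather than linear time, and all function names are hypothetical.

```python
def degeneracy_ordering(adj):
    """Order vertices by repeatedly removing one of minimum remaining degree."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    order = []
    while remaining:
        v = min(remaining, key=lambda u: len(remaining[u]))
        order.append(v)
        for w in remaining[v]:
            remaining[w].discard(v)
        del remaining[v]
    return order


def bron_kerbosch_pivot(adj, R, P, X, out):
    """Report every maximal clique that contains R, extends into P, avoids X."""
    if not P and not X:
        out.append(frozenset(R))
        return
    # Pivot: a vertex of P or X with the most neighbors in P.
    u = max(P | X, key=lambda w: len(P & adj[w]))
    for v in list(P - adj[u]):
        bron_kerbosch_pivot(adj, R | {v}, P & adj[v], X & adj[v], out)
        P = P - {v}
        X = X | {v}


def maximal_cliques(adj):
    """List all maximal cliques of {vertex: set_of_neighbors}."""
    out = []
    order = degeneracy_ordering(adj)
    position = {v: i for i, v in enumerate(order)}
    for v in order:
        later = {w for w in adj[v] if position[w] > position[v]}
        earlier = adj[v] - later
        bron_kerbosch_pivot(adj, {v}, later, earlier, out)
    return out


# Hypothetical toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(sorted(sorted(c) for c in maximal_cliques(graph)))  # [[0, 1, 2], [2, 3]]
```

Because each outer call starts with at most d candidate vertices, where d is the degeneracy, the recursion below the top level only ever works on small candidate sets; this is the property on which the fixed-parameter bound rests.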
ACKNOWLEDGMENTS
We would like to thank Patrick Prosser for finding copyediting errors in our tables, and for finding an error
with one of our data sets.
REFERENCES
Faisal N. Abu-Khzam, Michael A. Langston, Pushkar Shanbhag, and Christopher T. Symons. 2006. Scalable
parallel algorithms for FPT problems. Algorithmica 45, 3 (2006), 269–284.
E. A. Akkoyunlu. 1973. The enumeration of maximal cliques of large graphs. SIAM J. Comput. 2, 1 (1973),
1–6.
Noga Alon and Shai Gutner. 2009. Linear time algorithms for finding a dominating set of fixed size in
degenerated graphs. Algorithmica 54, 4 (2009), 544–556.
Albert-László Barabási and Réka Albert. 1999. Emergence of scaling in random networks. Science 286 (1999),
509–512.
Vladimir Batagelj and Andrej Mrvar. 2006. Pajek datasets. http://vlado.fmf.uni-lj.si/pub/networks/data/.
Vladimir Batagelj and M. Zaveršnik. 2003. An O(m) algorithm for cores decomposition of networks. Electronic
preprint. arXiv: cs/0310049v1.
Coen Bron and Joep Kerbosch. 1973. Algorithm 457: Finding all cliques of an undirected graph. Commun.
ACM 16, 9 (1973), 575–577.
Leizhen Cai, Siu Man Chan, and Siu On Chan. 2006. Random separation: A new method for solving fixed-
cardinality optimization problems. In Proceedings of the 2nd International Workshop on Parameterized
and Exact Computation (IWPEC 2006). Lecture Notes in Computer Science, Vol. 4169. Springer-Verlag,
239–250.
Frederic Cazals and Chinmay Karande. 2008. A note on the problem of reporting maximal cliques. Theoret.
Comput. Sci. 407, 1–3 (2008), 564–568.
James Cheng, Yiping Ke, Ada Wai-Chee Fu, Jeffrey Xu Yu, and Linhong Zhu. 2010. Finding maximal cliques
in massive networks by H*-graph. In Proceedings of the International Conference on Management of
Data (SIGMOD ’10). 447–458.
Norishige Chiba and Takao Nishizeki. 1985. Arboricity and subgraph listing algorithms. SIAM J. Comput.
14, 1 (1985), 210–223.
Marek Chrobak and David Eppstein. 1991. Planar orientations with low out-degree and compaction of
adjacency matrices. Theoret. Comput. Sci. 86, 2 (1991), 243–266.
Rod G. Downey and Michael R. Fellows. 1995. Fixed-parameter tractability and completeness II: On
completeness for W[1]. Theoret. Comput. Sci. 141, 1–2 (1995), 109–131.
Rod G. Downey and Michael R. Fellows. 1999. Parameterized Complexity. Springer-Verlag.
Nan Du, Bin Wu, Liutong Xu, Bai Wang, and Pei Xin. 2009. Parallel algorithm for enumerating maximal
cliques in complex network. In Mining Complex Data. Studies in Computational Intelligence Series,
Vol. 165. Springer-Verlag, 207–221.
John D. Eblen, Charles A. Phillips, Gary L. Rogers, and Michael A. Langston. 2011. The maximum clique
enumeration problem: algorithms, applications and implementations. In Proceedings of the 7th
International Symposium on Bioinformatics Research and Applications (ISBRA 2011). Lecture Notes in
Computer Science, Vol. 6674. Springer-Verlag, 306–319.
David Eppstein. 2003. Small maximal independent sets and faster exact graph coloring. J. Graph Algor.
Appl. 7, 2 (2003), 131–140.
David Eppstein. 2009. All maximal independent sets and dynamic dominance for sparse graphs. ACM Trans.
Algor. 5, 4 (2009), A38.
David Eppstein and Emma S. Spiro. 2009. The h-index of a graph and its application to dynamic subgraph
statistics. In Proceedings of the 11th Symposium on Algorithms and Data Structures (WADS 2009).
Lecture Notes in Computer Science, Vol. 5664. Springer-Verlag, 278–289.
Paul Erdős and András Hajnal. 1966. On chromatic number of graphs and set-systems. Acta Mathematica
Hungarica 17, 1–2 (1966), 61–99.
Fedor V. Fomin, Sang-il Oum, and Dimitrios M. Thilikos. 2010. Rank-width and tree-width of H-minor-free
graphs. Europ. J. Combin. 31, 7 (2010), 1617–1628.
Eugene C. Freuder. 1982. A sufficient condition for backtrack-free search. J. ACM 29, 1 (1982), 24–32.
Alain Gély, Lhouari Nourine, and Bachir Sadi. 2009. Enumeration aspects of maximal cliques and bicliques.
Disc. Appl. Math. 157, 7 (2009), 1447–1459.
L. Gerhards and W. Lindenberg. 1979. Clique detection for nondirected graphs: Two new algorithms.
Computing 21, 4 (1979), 295–322.
Gaurav Goel and Jens Gustedt. 2006. Bounded arboricity to determine the local structure of sparse graphs. In
Proceedings of WG 2006, Fedor V. Fomin (Ed.). Lecture Notes in Computer Science, Vol. 4271. Springer-
Verlag, 159–167.
Petr A. Golovach and Yngve Villanger. 2008. Parameterized complexity for domination problems on
degenerate graphs. In Proceedings of 34th International Workshop on Graph-Theoretic Concepts in Computer
Science (WG 2008). Lecture Notes in Computer Science, Vol. 5344 (2008), 195–205.
Frank Harary. 1972. Graph Theory. Addison-Wesley, Reading, MA.
Frank Harary and Ian C. Ross. 1957. A procedure for clique detection using the group matrix. Sociometry
20, 3 (1957), 205–215.
T. R. Jensen and B. Toft. 1995. Graph Coloring Problems. Wiley-Interscience, New York.
David S. Johnson and Michael A. Trick. 1996. Cliques, Coloring, and Satisfiability: Second DIMACS
Implementation Challenge, Workshop, October 11-13, 1993. American Mathematical Society, Boston, MA,
USA.
David S. Johnson, Mihalis Yannakakis, and Christos H. Papadimitriou. 1988. On generating all maximal
independent sets. Inf. Proc. Lett. 27, 3 (1988), 119–123.
H. C. Johnston. 1976. Cliques of a graph—variations on the Bron–Kerbosch algorithm. Int. J. Paral. Program.
5, 3 (1976), 209–238.
L. M. Kirousis and Dimitrios M. Thilikos. 1996. The linkage of a graph. SIAM J. Comput. 25, 3 (1996),
626–647.
Ton Kloks and Leizhen Cai. 2000. Parameterized tractability of some (efficient) Y-domination variants for
planar graphs and t-degenerate graphs. In Proceedings of the International Computer Symposium.
Ina Koch. 2001. Enumerating all connected maximal common subgraphs in two graphs. Theoret. Comput.
Sci. 250, 1–2 (2001), 1–30.
Alexandr V. Kostochka. 1984. Lower bound of the Hadwiger number of graphs by their average degree.
Combinatorica 4 (1984), 307–316.
E. L. Lawler, J. K. Lenstra, and A. H. G. Rinnooy Kan. 1980. Generating all maximal independent sets:
NP-hardness and polynomial-time algorithms. SIAM J. Comput. 9, 3 (1980), 558–565.
Jure Leskovec. 2009. Stanford Large Network Dataset Collection. http://snap.stanford.edu/data/.
Don R. Lick and Arthur T. White. 1970. k-degenerate graphs. Canad. J. Math. 22 (1970), 1082–1096.
E. Loukakis and C. Tsouros. 1981. A depth first search algorithm to generate the family of maximal
independent sets of a graph lexicographically. Computing 27, 4 (1981), 349–366.
Li Lu, Yunhong Gu, and Robert Grossman. 2010. dMaximalCliques: A distributed algorithm for enumerating
all maximal cliques and maximal clique distribution. In Proceedings of the International Conference on
Data Mining Workshops. IEEE Computer Society, 1320–1327.
Kazuhisa Makino and Takeaki Uno. 2004. New algorithms for enumerating all maximal cliques. In
Proceedings of the 9th Scandinavian Workshop on Algorithm Theory. Lecture Notes in Computer Science,
Vol. 3111. Springer-Verlag, 260–272.
Natwar Modani and Kuntal Dey. 2008. Large maximal cliques enumeration in sparse graphs. In Proceedings
of the 17th ACM Conference on Information and Knowledge Management (CIKM ’08). 1377–1378.
J. W. Moon and L. Moser. 1965. On cliques in graphs. Israel J. Math. 3, 1 (1965), 23–28.
Gordon D. Mulligan and Derek G. Corneil. 1972. Corrections to Bierstone’s algorithm for generating cliques.
J. ACM 19, 2 (1972), 244–247.
Mark E. J. Newman. 2006. Network data. http://www-personal.umich.edu/~mejn/netdata/.
Long Pan and Eunice E. Santos. 2008. An anytime-anywhere approach for maximal clique enumeration
in social network analysis. In Proceedings of the IEEE International Conference on Systems, Man and
Cybernetics. 3529–3535.
Nagiza F. Samatova, Matthew C. Schmidt, W. Hendrix, P. Breimyer, Kevin Thomas, and Byung-Hoon Park.
2008. Coupling graph perturbation theory with scalable parallel algorithms for large-scale enumeration
of maximal cliques in biological graphs. J. Phys. Conference Ser. 125, 1 (2008), 012053.
Matthew C. Schmidt, Nagiza F. Samatova, Kevin Thomas, and Byung-Hoon Park. 2009. A scalable, parallel
algorithm for maximal clique enumeration. J. Paral. Distrib. Comput. 69, 4 (2009), 417–428.
Chris Stark, Bobby-Joe Breitkreutz, Teresa Reguly, Lorrie Boucher, Ashton Breitkreutz, and Mike Tyers.
2006. BioGRID: A general repository for interaction datasets. Nucleic Acids Res. 34 (2006), D535–D539.
Andrew Thomason. 1984. An extremal function for contractions of graphs. Math. Proc. Camb. Phil. Soc. 95,
2 (1984), 261–265.
Andrew Thomason. 2001. The extremal function for complete minors. J. Combinat. Theory, Ser. B 81, 2
(2001), 318–338.
Etsuji Tomita, Akira Tanaka, and Haruhisa Takahashi. 2006. The worst-case time complexity for generating
all maximal cliques and computational experiments. Theoret. Comput. Sci. 363, 1 (2006), 28–42.
Shuji Tsukiyama, Mikio Ide, Hiromu Ariyoshi, and Isao Shirakawa. 1977. A new algorithm for generating
all the maximal independent sets. SIAM J. Comput. 6, 3 (1977), 505–517.
David R. Wood. 2007. On the maximum number of cliques in a graph. Graphs Combinat. 23, 3 (2007),
337–352.
Yun Zhang, Faisal N. Abu-Khzam, Nicole E. Baldwin, Elissa J. Chesler, Michael A. Langston, and Nagiza F.
Samatova. 2005. Genome-scale computational approaches to memory-intensive applications in systems
biology. In Proceedings of the 2005 ACM/IEEE Conference on Supercomputing.