Abstract
The well-known Cluster Vertex Deletion problem (cluster-vd) asks for a given graph G and an integer k whether it is possible to delete a set S of at most k vertices of G such that the resulting graph \(G-S\) is a cluster graph (a disjoint union of cliques). We give a complete characterization of graphs H for which cluster-vd on H-free graphs is polynomially solvable and for which it is \(\textsf{NP}\)-complete. Moreover, in the \(\textsf{NP}\)-completeness cases, cluster-vd cannot be solved in sub-exponential time in the vertex number of the H-free input graphs unless the Exponential-Time Hypothesis fails. We also consider the connected variant of cluster-vd, the Connected Cluster Vertex Deletion problem (connected cluster-vd), in which the set S has to induce a connected subgraph of G. It turns out that connected cluster-vd admits the same complexity dichotomy for H-free graphs. Our results enlarge a list of rare dichotomy theorems for well-studied problems on H-free graphs.
1 Introduction and Results
A very extensively studied version of graph modification problems asks to modify a given graph to a graph that satisfies a certain property \(\mathcal G\) by deleting a minimum number of vertices. The case \(\mathcal G\) being ‘edgeless’ is the well-known vertex cover problem, one of the classical \(\textsf{NP}\)-hard problems. If \(\mathcal G\) is a ‘cluster graph’, a graph in which every connected component is a clique, the corresponding problem is another well-known \(\textsf{NP}\)-hard problem, the cluster vertex deletion problem (cluster-vd for short). In this paper, we revisit the computational complexity of cluster-vd, formally given below.
Since being a cluster graph is a hereditary property on induced subgraphs, cluster-vd is \(\textsf{NP}\)-complete [25] and cannot be solved in \(2^{o(n+m)}\) time unless the ETH (Exponential-Time Hypothesis) fails [21], where n and m are the vertex and edge number of the input graphs, respectively. cluster-vd remains \(\textsf{NP}\)-complete even when restricted to planar graphs [32], to bipartite graphs [33], and to bipartite graphs of maximum degree 3 [14]. Most recent works on cluster-vd deal with exact, FPT and approximation algorithms [1, 2, 15, 31].
It is noticeable that there are only a few known cases where the problem can be solved efficiently: cluster-vd is polynomially solvable on block graphs, split graphs and interval graphs [3], and on graphs of bounded treewidth [29]. On the other hand, the complexity status of cluster-vd on many well-studied graph classes is still open, e.g., chordal graphs discussed in [3] and planar bipartite graphs mentioned in [4].
In this paper we initiate studying the computational complexity of cluster-vd on graphs defined by forbidding certain induced subgraphs. We remark that related approaches for other problems are quite common in the literature, e.g., for vertex cover (aka independent set) [10, 13] and coloring [11, 23], and that many popular graph classes are defined or characterized by forbidding induced subgraphs, e.g., chordal and bipartite graphs (by infinitely many forbidden subgraphs), and cographs and line graphs (by finitely many forbidden subgraphs).
All graphs considered are undirected, finite and have no multiple edges or self-loops. Let H be a given graph. A graph G is H-free if no induced subgraph in G is isomorphic to H. A path with n vertices and \(n-1\) edges is denoted by \(P_n\). The main result of the present paper is the following complexity dichotomy:
Theorem 1
Let H be a fixed graph. cluster-vd is polynomially solvable on H-free graphs if H is an induced subgraph of the 4-vertex path \(P_4\), and \(\textsf{NP}\)-complete otherwise.
Furthermore, in case H is not an induced subgraph of \(P_4\), no algorithm of runtime \(2^{o(n)}\) can solve cluster-vd on H-free n-vertex graphs, unless the ETH fails.
We also consider the connected variant of cluster-vd, which is as follows.
It is known that connected cluster-vd is \(\textsf{NP}\)-complete and cannot be solved in \(2^{o(n+m)}\) time unless the ETH fails [21]. It turns out that connected cluster-vd admits the same complexity dichotomy as for cluster-vd:
Theorem 2
Let H be a fixed graph. connected cluster-vd is polynomially solvable on H-free graphs if H is an induced subgraph of the 4-vertex path \(P_4\), and \(\textsf{NP}\)-complete otherwise.
Furthermore, in case H is not an induced subgraph of \(P_4\), no algorithm of runtime \(2^{o(n)}\) can solve connected cluster-vd on H-free n-vertex graphs, unless the ETH fails.
Theorems 1 and 2 enlarge a list of rare dichotomy theorems on H-free graphs: Korobitsin [22] proved that dominating set is solvable in polynomial time on H-free graphs if H is an induced subgraph of \(P_4+tP_1\), the union of \(P_4\) and t isolated vertices for \(t\ge 0\), and \(\textsf{NP}\)-complete otherwise. Munaro [27] proved that the same dichotomy holds for connected dominating set and for graph VC\(_{\textsc {con}}\) dimension. Král, Kratochvíl, Tuza and Woeginger [23] proved that colouring on H-free graphs is solvable in polynomial time if H is an induced subgraph of \(P_4\) or of \(P_3+P_1\) and \(\textsf{NP}\)-complete otherwise. Kamiński [20] proved that max-cut is solvable in polynomial time if H is an induced subgraph of \(P_4\) and \(\textsf{NP}\)-complete otherwise.
2 Preliminaries
For a set \(\mathcal H\) of graphs, \(\mathcal{H}\)-free graphs are those in which no induced subgraph is isomorphic to a graph in \(\mathcal H\). We denote by \(K_{1,n}\) the tree with \(n+1\ge 3\) vertices and n leaves, and by \(C_n\) the n-vertex cycle. The girth girth(G) of a graph G is the smallest length of a cycle in G; we set \(girth(G)=\infty \) if G is a forest, a graph without cycles. Thus, for any fixed integer \(g\ge 3\), \(girth(G)>g\) if and only if G is \(\{C_3,C_4, \ldots , C_g\}\)-free.
As usual, we denote by \(\overline{G}\) the complement of a graph G. The union \(G+H\) of two vertex-disjoint graphs G and H is the graph with vertex set \(V(G)\cup V(H)\) and edge set \(E(G)\cup E(H)\); we write pG for the union of p copies of G. For a subset \(S \subseteq V(G)\), let G[S] denote the subgraph of G induced by S; \(G-S\) stands for \(G[V(G)\setminus S]\). By ‘G contains an H’ we mean G contains H as an induced subgraph. Graphs in which every vertex has degree 3 are called 3-regular graphs or cubic graphs and graphs with maximum degree 3 subcubic graphs.
A graph G is a cluster graph if each of its connected components is a clique. Observe that G is a cluster graph if and only if G is \(P_3\)-free. If \(S\subseteq V(G)\) is a subset of vertices of G such that \(G-S\) is \(P_3\)-free, then S is called a cluster vertex deletion set of G. An optimal cluster vertex deletion set is one of minimum size.
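The characterization of cluster graphs as \(P_3\)-free graphs gives a direct test for cluster vertex deletion sets. The following is a minimal sketch, assuming graphs are stored as dictionaries mapping each vertex to the set of its neighbours; the function names are illustrative and not taken from the paper.

```python
from itertools import combinations


def delete(adj, S):
    """Return the induced subgraph G - S as a new adjacency dictionary."""
    return {v: nbrs - S for v, nbrs in adj.items() if v not in S}


def is_cluster_graph(adj):
    """A graph is a cluster graph iff it has no induced P_3, i.e. iff
    any two neighbours of a common vertex are themselves adjacent."""
    for nbrs in adj.values():
        for x, z in combinations(nbrs, 2):
            if z not in adj[x]:
                return False
    return True


def is_cluster_vd_set(adj, S):
    """Check whether S is a cluster vertex deletion set of the graph."""
    return is_cluster_graph(delete(adj, set(S)))


# Illustration on the 3-vertex path a - b - c.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(is_cluster_vd_set(path, {"b"}))
print(is_cluster_vd_set(path, set()))
```

The test runs in time polynomial in the size of the graph, which is all that is needed to place the problem in \(\textsf{NP}\).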
Algorithmic lower bounds in this paper are conditional, based on the Exponential Time Hypothesis (ETH) [16]. The ETH asserts that no algorithm can solve 3sat in subexponential time \(2^{o(n)}\) for n-variable 3-cnf formulas. As shown by the Sparsification Lemma in [17], the hard cases of 3sat consist of sparse formulas with \(m=O(n)\) clauses. Hence, the ETH implies that 3sat cannot be solved in time \(2^{o(n+m)}\).
Recall that an instance for nae 3sat is a 3-cnf formula \(F=C_1\wedge C_2\wedge \cdots \wedge C_m\) over n variables, in which each clause \(C_j\) consists of three distinct literals. The problem asks whether there is a truth assignment of the variables such that every clause in F has at least one true and at least one false literal. Such an assignment is called an nae assignment, i.e. a not-all-equal assignment. There is a polynomial reduction from 3sat to nae 3sat ([26, Theorem 7.3]), which transforms an instance for 3sat with n variables and m clauses to an equivalent instance for nae 3sat with \(2n+24m\) variables and 32m clauses. Thus, we obtain:
Theorem 3
([17, 26]) nae 3sat is \(\textsf{NP}\)-complete and, assuming ETH, cannot be solved in time \(2^{o(n+m)}\) on inputs with n variables and m clauses.
We will also need the following restriction of nae 3sat. For integers \(p, q\ge 2\), let (p, q)-3sat denote the problem of deciding if a 3-cnf formula in which each variable occurs at most p times positively and at most q times negatively is satisfiable. (p, q)-nae 3sat is defined analogously. A reduction from 3sat, linear in the number of clauses, due to Tovey [30] shows that (2, 2)-3sat remains \(\textsf{NP}\)-complete and, assuming ETH, cannot be solved in time \(2^{o(n)}\) for inputs with n variables. Now, the reduction due to Moret [26, Theorem 7.3] mentioned above transforms an instance for (2, 2)-3sat to an equivalent instance for (4, 4)-nae 3sat, linear in the number of variables and clauses. Hence, we obtain:
Theorem 4
([17, 26, 30]) (4, 4)-nae 3sat is \(\textsf{NP}\)-complete and, assuming ETH, cannot be solved in time \(2^{o(n)}\) on inputs with n variables.
Structure of the paper
We first address the polynomial part of Theorems 1 and 2 in the next section. Then we present two new \(\textsf{NP}\)-completeness results for cluster-vd and connected cluster-vd in Sections 4 and 5. These hardness results allow us to clear the \(\textsf{NP}\)-completeness part of Theorems 1 and 2 in Section 6. The last section concludes the paper.
3 H-free Graphs: Polynomial Cases
The polynomial part in Theorems 1 and 2 consists of six cases; see Fig. 1 for all graphs H for which cluster-vd and connected cluster-vd are polynomially solvable on H-free graphs.
Observe that H-freeness is hereditary, meaning if \(H'\) is an induced subgraph of H then \(H'\)-free graphs are H-free graphs. Thus, it suffices to prove the polynomial part only for the case where H is the 4-vertex path \(P_4\).
The proof will follow from the concept of clique-width of graphs in connection with the so-called monadic second-order logic, \(MSOL_1\) for short, an extension of first-order logic with quantification over vertex set variables. Briefly, the clique-width of a graph G, introduced in [8], is the minimum number of labels needed to construct G by:
- creating a new vertex with label i,
- taking a disjoint union of two labeled graphs,
- joining every vertex with label i to every vertex with label \(j\not = i\), and
- renaming label i to label j.
Such a construction with k labels defines an algebraic k-expression. A well-known meta-theorem by Courcelle, Makowsky and Rotics [9] states that any graph property expressible in \(MSOL_1\) is decidable in linear time for graphs with bounded clique-width, provided a k-expression of the graphs is given. It is well known that \(P_4\)-free graphs, also known as cographs, have clique-width at most 2 and a corresponding 2-expression can be constructed in linear time (see, e.g., [9]). Hence, any \(MSOL_1\) graph property is decidable in linear time when restricted to \(P_4\)-free graphs.
Now, being a cluster vertex deletion set is a \(MSOL_1\) property:
where S(x) means \(x\in S\) and E(x, y) means \(xy\in E(G)\). (The sentence says that the graph \(G-S\) is \(P_3\)-free.)
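One way to write such a sentence with the free set variable S is sketched below. It is a reconstruction from the description above, assuming only the predicates S(x) and E(x, y), and need not coincide symbol by symbol with the authors' formula.

\[
\varphi(S)\;=\;\neg \exists x\,\exists y\,\exists z\,\bigl(\lnot S(x)\wedge \lnot S(y)\wedge \lnot S(z)\wedge x\ne z\wedge E(x,y)\wedge E(y,z)\wedge \lnot E(x,z)\bigr)
\]

The condition \(x\ne z\) is needed because graphs have no self-loops, so \(\lnot E(x,x)\) always holds; \(x\ne y\) and \(y\ne z\) already follow from \(E(x,y)\) and \(E(y,z)\).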
Also, the fact that the vertex set S in a graph G induces a connected subgraph of G can be written as a \(MSOL_1\) sentence:
(The sentence says that, for any bipartition of S into two non-empty sets, there is an edge joining two vertices in different parts of the bipartition.)
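A possible form of this sentence, again a sketch reconstructed from the description above (the set variable A stands for one part of the bipartition and is an assumption of this sketch), is:

\[
\psi(S)\;=\;\forall A\,\Bigl(\bigl(\exists x\,A(x)\wedge \forall x\,(A(x)\rightarrow S(x))\wedge \exists y\,(S(y)\wedge \lnot A(y))\bigr)\rightarrow \exists u\,\exists v\,\bigl(A(u)\wedge S(v)\wedge \lnot A(v)\wedge E(u,v)\bigr)\Bigr)
\]

Here A ranges over non-empty proper subsets of S, and the conclusion asks for an edge between A and \(S\setminus A\).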
Thus, cluster-vd and connected cluster-vd can be solved in linear time on \(P_4\)-free graphs. Indeed, we have a stronger fact. The weighted optimization version of cluster-vd and connected cluster-vd, minimum cluster-vd and minimum connected cluster-vd, are \(LinEMSOL_{\tau _{1,p}}\) problems (\(LinEMSOL_{\tau _{1,p}}\) is an extension of \(MSOL_1\) which allows one to search for optimal sets of vertices with respect to some linear objective function). We refer to the paper [9] for details, in which it is shown that every \(LinEMSOL_{\tau _{1,p}}\) problem on \(P_4\)-free graphs can be solved in linear time [9, Theorem 4]. To sum up, we have:
Proposition 5
cluster-vd and connected cluster-vd can be solved in linear time on \(P_4\)-free graphs, even in the weighted optimization version.
Another approach for obtaining the above results is to use the so-called cotree of cographs. Using the cotree of a cograph G, we are able to compute an optimal (connected) cluster vertex deletion set of G in linear time in a direct and simple way. The details are given in Appendices A and B.
4 Cluster-VD and Connected Cluster-VD on Dense Graphs
In this section, we give a polynomial reduction from vertex cover to cluster-vd, showing that cluster-vd remains \(\textsf{NP}\)-complete when restricted to \(\{3P_1, 2P_2\}\)-free n-vertex graphs with minimum degree at least \(n-4\).
Recall that the vertex cover problem asks, for a given graph G and an integer k, if one can delete a vertex set S of size at most k such that \(G-S\) is edgeless. It is well known that vertex cover is \(\textsf{NP}\)-complete and, assuming ETH, cannot be solved in \(2^{o(n+m)}\) time on n-vertex m-edge graphs. This fact and a result in [18] imply that, assuming ETH, vertex cover cannot be solved in \(2^{o(n)}\) time on subcubic n-vertex graphs. There is a polynomial-time reduction from vertex cover in cubic graphs to vertex cover in subcubic planar graphs with arbitrarily large girth, which transforms an instance (G, k) of the first version to an equivalent instance \((G',k')\) for the second version, where the vertex number of \(G'\) is linear in the vertex number of G (see, e.g., [28] or [21]). Thus, we obtain:
Theorem 6
([18, 21, 28]) Let \(g\ge 3\) be a fixed integer. vertex cover is \(\textsf{NP}\)-complete even when restricted to subcubic graphs of girth \(>g\) and, assuming ETH, vertex cover cannot be solved in \(2^{o(n)}\) time in this restricted graph class.
We now describe the announced reduction. Let \(g\ge 3\) be an integer and let (G, k) be an instance for vertex cover, where G is an n-vertex subcubic graph with girth \(>g\). We may assume that
- G is not perfect. This is because vertex cover is polynomially solvable on perfect graphs (see [12]); notice that G is perfect if and only if \(\overline{G}\) is perfect and perfect graphs can be recognized in polynomial time [5], and
- \(k\le |V(G)|/2\). This fact can be easily seen as follows: given G with n vertices and an integer k, let \(G'\) be obtained from G by adding \(p=\max \{0,2k-n\}\) isolated vertices. Then \(k\le |V(G')|/2\) and \((G,k)\in \textsc {vertex cover} \) if and only if \((G',k)\in \textsc {vertex cover} \). Notice that like G, \(G'\) is subcubic, not perfect and has girth \(>g\), too.
From (G, k) we construct an equivalent instance \((G',k')\) for cluster-vd as follows: \(G'\) is obtained from two disjoint copies of \(\overline{G}\), \(G_1\) and \(G_2\), by adding all possible edges between \(V(G_1)\) and \(V(G_2)\). Set \(k'=2k\).
We argue that \((G,k)\in \textsc {vertex cover} \) if and only if \((G',k')\in \textsc {cluster-vd} \). First, let \(S\subset V(G)\) be a vertex cover, that is \(G-S\) is edgeless, with \(|S|\le k\). Let \(S_1\) and \(S_2\) be the copy of S in \(G_1\) and \(G_2\), respectively. Then, for each \(i\in \{1,2\}\), \(G_i-S_i\) is a clique in \(G_i=\overline{G}\), and with \(S'=S_1\cup S_2\), \(G'-S'\) is a clique in \(G'\) with \(|S'|=2|S|\le 2k=k'\).
Conversely, let \(S'\subseteq V(G')\) be a cluster vertex deletion set of \(G'\) with \(|S'|\le k'=2k\). Observe that, for each \(i\in \{1,2\}\), \(S'\cap V(G_i)\) is a proper nonempty subset of \(V(G_i)\): if for some i, \(S'\cap V(G_i)=\emptyset \) then \(G_i\) (hence G) would be perfect because in this case \(G_i\) would be a cluster, and if \(V(G_i)\subset S'\) then \(2k\ge |S'|>|V(G_i)|=|V(G)|\), contradicting \(k\le |V(G)|/2\). It follows from the above that \(G'-S'\) is a single clique, implying for each \(i\in \{1,2\}\), \(G_i-S_i\) is a clique in \(G_i\) where \(S_i=S'\cap V(G_i)\). Since \(|S'|\le 2k\), \(|S_1|\le k\) or \(|S_2|\le k\). Let \(|S_1|\le k\), say, and let \(S\subseteq V(G)\) be the set of the corresponding vertices in G. Then \(G-S\) is edgeless with \(|S|\le k\).
We have seen that G has a vertex cover of size at most k if and only if \(G'\) has a cluster vertex deletion set of size at most \(k'\), as claimed.
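The reduction can be checked by brute force on a small instance. The sketch below is only an illustration under stated assumptions: graphs are adjacency dictionaries, all helper names are hypothetical, and the exhaustive search is exponential, so it is meant for tiny inputs only. The sample graph is \(C_5\) plus one isolated vertex, which is subcubic, not perfect, and has \(n=6\), so that \(k\le n/2\) covers \(k\le 3\).

```python
from itertools import combinations


def delete(adj, S):
    """Return the induced subgraph G - S."""
    return {v: nbrs - S for v, nbrs in adj.items() if v not in S}


def is_edgeless(adj):
    return all(not nbrs for nbrs in adj.values())


def is_cluster_graph(adj):
    """P_3-free test: two neighbours of a common vertex must be adjacent."""
    return all(z in adj[x]
               for nbrs in adj.values()
               for x, z in combinations(nbrs, 2))


def has_deletion_set(adj, k, target):
    """Brute force: is there S with |S| <= k such that target(G - S) holds?"""
    return any(target(delete(adj, set(S)))
               for size in range(k + 1)
               for S in combinations(adj, size))


def build_reduction(adj):
    """G' = two disjoint copies of the complement of G, completely joined."""
    vertices = set(adj)
    co = {v: vertices - adj[v] - {v} for v in adj}  # complement of G
    g_prime = {}
    for i in (1, 2):
        for v in adj:
            inside = {(i, u) for u in co[v]}          # edges within copy i
            across = {(3 - i, u) for u in vertices}   # all edges to the other copy
            g_prime[(i, v)] = inside | across
    return g_prime


# C_5 on vertices 0..4 plus the isolated vertex 5.
g = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}, 5: set()}
g_prime = build_reduction(g)
for k in range(4):
    print(k,
          has_deletion_set(g, k, is_edgeless),
          has_deletion_set(g_prime, 2 * k, is_cluster_graph))
```

For each k the two printed truth values should agree, in line with the equivalence argued above.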
Note that \(G'\) has 2n vertices and minimum degree at least \(2n-4\) (as G has n vertices and maximum degree at most 3). Now, observe that, for any connected graph X, if G is X-free then \(G'\) is \(\overline{X}\)-free. Since G is \(\{C_3,C_4,\ldots ,C_g\}\)-free, we obtain with Theorem 6:
Theorem 7
For any fixed \(g\ge 3\), cluster-vd is \(\textsf{NP}\)-complete on \(\{\overline{C_3}, \overline{C_4}, \ldots , \overline{C_g}\}\)-free n-vertex graphs with minimum degree at least \(n-4\) and, assuming ETH, cannot be solved in \(2^{o(n)}\) time.
In particular, cluster-vd is \(\textsf{NP}\)-complete on \(\{3P_1, 2P_2\}\)-free graphs and, assuming ETH, cannot be solved in \(2^{o(n)}\) time.
We observe that the proof of Theorem 7 remains true for connected cluster vertex deletion sets: G has a vertex cover of size at most \(k\le |V(G)|/2\) if and only if \(G'\) has a connected cluster vertex deletion set of size at most \(k'=2k\). Thus, Theorem 7 also holds for connected cluster-vd:
Theorem 8
For any fixed \(g\ge 3\), connected cluster-vd is \(\textsf{NP}\)-complete on \(\{\overline{C_3}, \overline{C_4}, \ldots , \overline{C_g}\}\)-free n-vertex graphs with minimum degree at least \(n-4\) and, assuming ETH, cannot be solved in \(2^{o(n)}\) time.
In particular, connected cluster-vd is \(\textsf{NP}\)-complete on \(\{3P_1, 2P_2\}\)-free graphs and, assuming ETH, cannot be solved in \(2^{o(n)}\) time.
5 Cluster-VD and Connected Cluster-VD on Sparse Graphs
In [33, Lemma 1], Yannakakis gave a polynomial-time reduction from nae 3sat to cluster-vd, which transforms an instance for nae 3sat with n variables and m clauses, into an equivalent instance (G, k) for cluster-vd, where G is a bipartite graph with \(6n+12m\) vertices. Thus, by Theorem 3, cluster-vd is \(\textsf{NP}\)-complete even when restricted to bipartite graphs and, assuming ETH, cluster-vd cannot be solved in \(2^{o(n)}\) time on bipartite graphs with n vertices.
We remark that by considering (4, 4)-nae 3sat instead of nae 3sat, the bipartite graph obtained from the reduction of Yannakakis mentioned above has maximum degree at most four. Thus, by Theorem 4, we obtain:
Theorem 9
([33]) cluster-vd is \(\textsf{NP}\)-complete even when restricted to n-vertex bipartite graphs of maximum degree at most 4 and, assuming ETH, cannot be solved in \(2^{o(n)}\) time.
In [14], Hsieh, Le, Le and Peng gave another polynomial-time reduction from nae 3sat to cluster-vd, which transforms an instance for nae 3sat with n variables and m clauses, into an equivalent instance (G, k) for cluster-vd, where G is a subcubic bipartite graph with \(6nm+30m\) vertices. Recall that we may assume (by the Sparsification Lemma) that \(m=O(n)\). Thus, by Theorem 3, we obtain:
Theorem 10
([14]) cluster-vd is \(\textsf{NP}\)-complete even when restricted to subcubic n-vertex bipartite graphs and, assuming ETH, cannot be solved in time \(2^{o(\sqrt{n})}\).
In this section, we will further improve Theorems 9 and 10 by Theorems 12 and 13, respectively. We begin with the following fact.
Lemma 11
Given a graph G, let \(G'\) be obtained from G by subdividing each edge \(e=xy\) in G with three new vertices \(e_x, e_{xy}\) and \(e_y\), thus obtaining the 5-vertex path \(xe_xe_{xy}e_yy\) in \(G'\) in which all new vertices are of degree 2. Assuming G is triangle-free, G has a cluster vertex deletion set of size at most k if and only if \(G'\) has a cluster vertex deletion set of size at most \(k+m\), where m is the edge number of G.
Proof
Observe that since G is triangle-free, a cluster in G is a collection of isolated vertices and edges.
For one direction, extend a cluster vertex deletion set \(S\subseteq V(G)\) to a cluster vertex deletion set \(S'\subseteq V(G')\) of size \(|S|+m\) as follows; see also Fig. 2: initially, set \(S'=S\). Then, for each edge \(e=xy\) in G,
- if both x and y are in S or outside S, put \(e_{xy}\) into \(S'\);
- if \(x\in S\) and \(y\notin S\), put \(e_y\) into \(S'\);
- if \(x\notin S\) and \(y\in S\), put \(e_x\) into \(S'\).
To see that \(G'-S'\) is \(P_3\)-free, notice that by construction, for each edge \(e=xy\) in G, exactly one of \(e_x, e_{xy}\) and \(e_y\) is in \(S'\), and if \(e_x, e_{xy}\notin S'\) then \(x\in S\), and if \(e_x, x\notin S'\) then \(y\notin S\), hence \(e_{xy}\in S'\). Since each \(P_3\) in \(G'\) has the form \(xe_xe_{xy}\), \(e_xe_{xy}e_y\) or \(e_xxe'_x\) for some edge \(e=xy\) and \(e'=xz\), it follows from these facts and the assumption that G is triangle-free that \(G'-S'\) is \(P_3\)-free.
For the other direction, suppose that \(G'\) has a cluster vertex deletion set of size at most \(k+m\), and consider such a set \(S'\) of minimum size. Then, we may assume that, for each edge \(e=xy\) in G, \(S'\) contains exactly one of \(e_x, e_{xy}\) and \(e_y\): note that \(e_xe_{xy}e_y\) is a \(P_3\), hence \(|S'\cap \{e_x,e_{xy},e_y\}|\ge 1\), and by minimality, \(|S'\cap \{e_x,e_{xy},e_y\}|\le 2\). Now, if \(|S'\cap \{e_x,e_{xy},e_y\}| = 2\) for some edge \(e=xy\) in G, then \(S'\) can be modified to a minimum cluster vertex deletion set containing exactly one of \(e_x, e_{xy}\) and \(e_y\) as follows:
- suppose that \(e_x, e_{xy}\in S'\). Then \(x, y\not \in S'\) (if \(x\in S'\) then \(S'-e_x\) would be a cluster vertex deletion set of \(G'\), and if \(y\in S'\) then \(S'-e_{xy}\) would be a cluster vertex deletion set of \(G'\), contradicting the minimality of \(S'\)), and \(S''=S'-e_{xy}+y\) is the desired cluster vertex deletion set of minimum size;
- suppose that \(e_y, e_{xy}\in S'\). Then similar to the above case, \(x, y\not \in S'\), and \(S''=S'-e_{xy}+x\) is the desired cluster vertex deletion set of minimum size;
- suppose that \(e_x,e_y\in S'\). Then \(x,y\notin S'\) (if \(x\in S'\) or \(y\in S'\) then \(S''=S'-e_x\), respectively \(S''=S'-e_y\), would be a cluster vertex deletion set of \(G'\), contradicting the minimality of \(S'\)), and \(S''=S'-e_x+x\) is the desired cluster vertex deletion set of minimum size.
Hence, \(S=S'\cap V(G)\) has at most k vertices, and \(G-S\) is \(P_3\)-free: if there would be an induced \(P_3\) xyz in G with edges \(e=xy\) and \(e'=yz\), then, as \(|S'\cap \{e_x,e_{xy},e_y\}|=1=|S'\cap \{e'_y,e'_{yz},e'_z\}|\), one of the 3-paths \(xe_xe_{xy}\), \(e_yye'_y\) and \(e'_{yz}e'_zz\) would be outside \(S'\).
Thus, G has a cluster vertex deletion set of size at most k if and only if \(G'\) has a cluster vertex deletion set of size at most \(k+m\), as claimed. \(\square \)
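Lemma 11 implies that an optimal cluster vertex deletion set of \(G'\) has exactly m more vertices than one of G. The following sketch checks this on two small triangle-free graphs by exhaustive search. It assumes adjacency dictionaries as input; the helper names and the labels of the new vertices are hypothetical.

```python
from itertools import combinations


def delete(adj, S):
    """Return the induced subgraph G - S."""
    return {v: nbrs - S for v, nbrs in adj.items() if v not in S}


def is_cluster_graph(adj):
    """P_3-free test: two neighbours of a common vertex must be adjacent."""
    return all(z in adj[x]
               for nbrs in adj.values()
               for x, z in combinations(nbrs, 2))


def min_cluster_vd(adj):
    """Size of an optimal cluster vertex deletion set (brute force)."""
    for size in range(len(adj) + 1):
        for S in combinations(adj, size):
            if is_cluster_graph(delete(adj, set(S))):
                return size


def subdivide(adj):
    """Replace every edge xy by the 5-vertex path x, e_x, e_xy, e_y, y.
    Returns the new graph and the number m of edges of the input graph."""
    new = {v: set() for v in adj}
    edges = {frozenset((x, y)) for x in adj for y in adj[x]}
    for e in edges:
        x, y = tuple(e)
        path = [x, ("sub", x, y, "x"), ("sub", x, y, "xy"), ("sub", x, y, "y"), y]
        for a, b in zip(path, path[1:]):
            new.setdefault(a, set()).add(b)
            new.setdefault(b, set()).add(a)
    return new, len(edges)


# The 4-cycle is triangle-free; its subdivision is a 16-cycle.
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
c4_sub, m = subdivide(c4)
print(min_cluster_vd(c4), m, min_cluster_vd(c4_sub))
```

The last printed number should equal the sum of the first two, as stated in the lemma.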
We now show that, for any given tree T containing two vertices of degree 3, cluster-vd remains \(\textsf{NP}\)-complete when restricted to T-free bipartite graphs of maximum degree 4 and with arbitrarily large girth.
Theorem 12
For any given integer \(g\ge 3\) and any given tree T containing two degree-3 vertices, cluster-vd is \(\textsf{NP}\)-complete on T-free n-vertex bipartite graphs of maximum degree at most 4 and with girth \(>g\) and, assuming ETH, cannot be solved in \(2^{o(n)}\) time.
Proof
Note that cluster-vd restricted to the graph class in question is in \(\textsf{NP}\). Below we give a polynomial-time reduction from cluster-vd restricted to bipartite graphs of degree at most 4 to cluster-vd restricted to T-free bipartite graphs of degree at most 4 and with arbitrarily large girth.
First, given a bipartite graph G of maximum degree at most 4 with n vertices and m edges, let \(G'\) be obtained from G by subdividing the edges as described in Lemma 11. Note that like G, \(G'\) is bipartite and has maximum degree at most 4. By Lemma 11, G has a cluster vertex deletion set of size at most k if and only if \(G'\) has a cluster vertex deletion set of size at most \(k+m\).
Now, given \(g>0\) and a tree T with two degree-3 vertices, fix an integer \(t\ge \max \{\log _4 g, |V(T)|\}\). Then, repeating the construction in Lemma 11 t times, the final bipartite graph \(G'\) has girth \(4^t\cdot girth(G) > g\) and maximum degree at most 4, and contains no induced subgraph isomorphic to T (as the distance between two degree-3 vertices in \(G'\) is larger than |V(T)|). Thus the \(\textsf{NP}\)-hardness part of the theorem follows from the first part of Theorem 9. Note that \(G'\) has \(n+(4^t-1)m=O(n)\) vertices, hence, the second part of the theorem follows from the second part of Theorem 9. \(\square \)
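The vertex count used in the proof can be verified as follows; the symbols \(n_i\) and \(m_i\), denoting the vertex and edge numbers after i rounds of the construction in Lemma 11, are introduced here for illustration only. Each round adds three vertices per edge and replaces every edge by four edges, so

\[
m_i = 4\,m_{i-1},\qquad n_i = n_{i-1}+3\,m_{i-1},\qquad n_0=n,\; m_0=m,
\]

and therefore

\[
n_t \;=\; n+3m\sum _{i=0}^{t-1}4^{i} \;=\; n+3m\cdot \frac{4^{t}-1}{3} \;=\; n+(4^{t}-1)\,m .
\]

Since G has maximum degree at most 4, we have \(m\le 2n\), and since t is a constant depending only on g and T, this gives \(n_t=O(n)\).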
Observe that if we consider subcubic bipartite graphs and make use of Theorem 10 instead of Theorem 9 in the proof of Theorem 12, we obtain:
Theorem 13
For any given integer \(g\ge 3\) and any given tree T containing two degree-3 vertices, cluster-vd is \(\textsf{NP}\)-complete on T-free subcubic n-vertex bipartite graphs with girth \(>g\) and, assuming ETH, cannot be solved in \(2^{o(\sqrt{n})}\) time.
We now are going to show that connected cluster-vd remains \(\textsf{NP}\)-complete when restricted to bipartite graphs with arbitrarily large girth. (Notice that a reduction based on Lemma 11, similar to the reduction in Theorem 12, does not work for connected cluster-vd.) Let \(g>0\) be a given integer. From an instance (G, k) of cluster-vd, where \(G=(X\cup Y,E)\) is a bipartite graph with girth \(>g\), we construct an instance \((G(g),k')\), where G(g) is a bipartite graph of girth \(>g\), for connected cluster-vd as follows:
- We may assume that g is odd (otherwise, replace g by \(g+1\));
- Write \(X=\{x_1,x_2,\ldots ,x_r\}\), \(Y=\{y_1,y_2,\ldots ,y_s\}\), and \(n=r+s\);
- Let H(g, r, s) be the tree depicted in Fig. 3; note that H(g, r, s) has \(6r+3gr+6s+3gs=(6+3g)n\) vertices. The property of H(g, r, s) that will be used is that the set of all degree-3 vertices of H(g, r, s), that is all \(x_{ig}\), \(1\le i \le r\), and all \(y_{jg}\), \(1\le j\le s\), is both an optimal cluster vertex deletion set and the unique connected cluster vertex deletion set. The vertices \(x_{ig}\) and \(y_{jg}\) will have degree 3 in the whole graph G(g). In Fig. 3 the unique connected cluster vertex deletion set contains the \((g + 2)n\) black vertices.
Then, let G(g) be obtained from G and H(g, r, s) by adding an edge between \(x_i\) and \(x_{ig}\), \(1\le i\le r\), and between \(y_j\) and \(y_{jg}\), \(1\le j\le s\). Note that like G, G(g) is bipartite (as g is odd) and has \(n'=n+(6+3g)n=(7+3g)n\) vertices. See Fig. 4 for an example in case \(g=3\). Finally, set \(k'=k+(g+2)n\). Clearly, \((G(g),k')\) can be constructed in polynomial time from (G, k).
Now, let S be a cluster vertex deletion set of G of size at most k. Then G(g) has a connected cluster vertex deletion set \(S'\) of size \(|S|+(g+2)n\le k'\): \(S'\) is obtained from S by adding all vertices of H(g, r, s) with degree 3 in G(g) (the \((g+2)n\) black vertices in Fig. 3). Observe that \(S'\) induces a connected subgraph in G(g) since every vertex in S is adjacent to some \(x_{ig}\) or \(y_{jg}\), and all vertices of H(g, r, s) with degree 3 in G(g) induce a connected subgraph in G(g).
Conversely, let \(S'\) be a (connected or not) cluster vertex deletion set of G(g) of size at most \(k'\). Since every vertex u in H(g, r, s) with degree 3 in G(g) (the black vertices in Fig. 3) belongs to an induced \(P_3=uvw\) in H(g, r, s) with \(\deg _{G(g)}(v)=2\) and \(\deg _{G(g)}(w)=1\), we may assume that \(S'\) contains all \((g+2)n\) vertices of H(g, r, s) with degree 3 (and no other vertices of H(g, r, s)). Let S be the restriction of \(S'\) on V(G). Then S is a cluster vertex deletion set of G of size \(|S|=|S'|-(g+2)n\le k\).
Observe that the girth of G(g) is at least \(\min \{girth(G), 2g+6\}>g\) and the maximum degree of G(g) is one more than the maximum degree of G. Hence, by Theorems 12 and 13, we obtain:
Theorem 14
For any given integer \(g\ge 3\), connected cluster-vd is \(\textsf{NP}\)-complete on bipartite graphs of maximum degree at most 5 and with girth \(>g\) and, assuming ETH, cannot be solved in \(2^{o(n)}\) time.
Theorem 15
For any given integer \(g\ge 3\), connected cluster-vd is \(\textsf{NP}\)-complete on bipartite graphs of maximum degree at most 4 and with girth \(>g\) and, assuming ETH, cannot be solved in \(2^{o(\sqrt{n})}\) time.
6 H-free Graphs: \(\textsf{NP}\)-completeness Cases
In this section we give the proof of the \(\textsf{NP}\)-completeness part of Theorems 1 and 2.
Let H be a fixed graph. By Proposition 5, cluster-vd is polynomially solvable on H-free graphs whenever H is an induced subgraph of the 4-vertex path \(P_4\). The following fact is easy to see:
Observation 16
A graph is an induced subgraph of the 4-path \(P_4\) if and only if it is a \(\{3P_1, 2P_2\}\)-free forest.
Thus, it remains to consider the cases where H contains a cycle or a \(3P_1\) or a \(2P_2\) as an induced subgraph.
Now, if H contains a cycle then graphs of girth \(> g=|V(H)|\) are H-free, hence Theorems 12 and 14 imply that cluster-vd and connected cluster-vd are \(\textsf{NP}\)-complete on H-free graphs and, assuming ETH, cannot be solved in \(2^{o(n)}\) time on H-free n-vertex graphs. If H contains a \(3P_1\) or a \(2P_2\) then \(\{3P_1, 2P_2\}\)-free graphs are H-free graphs, hence Theorems 7 and 8 imply that cluster-vd and connected cluster-vd are \(\textsf{NP}\)-complete on H-free graphs and, assuming ETH, cannot be solved in \(2^{o(n)}\) time on H-free n-vertex graphs.
7 Conclusion
We have found a complete characterization of graphs H for which cluster-vd on H-free graphs is polynomially solvable and for which it is \(\textsf{NP}\)-complete (Theorem 1). The same complexity dichotomy holds also for connected cluster-vd (Theorem 2).
We remark that a complexity dichotomy for vertex cover and connected vertex cover on H-free graphs, like Theorems 1 and 2 for cluster-vd and connected cluster-vd, respectively, seems very hard to achieve. Indeed, it is a long-standing open problem whether there exists a constant t for which vertex cover or connected vertex cover is \(\textsf{NP}\)-complete on \(P_t\)-free graphs. So far it is known that such a constant t, if any, must be at least 7 for vertex cover [13], respectively, at least 6 for connected vertex cover [19].
Let \(\mathcal H\) be a set of (possibly infinitely many) graphs. A natural question generalizing the case of one forbidden induced subgraph is: what is the complexity of cluster-vd and of connected cluster-vd on \(\mathcal{H}\)-free graphs? The case \(\mathcal{H}=\{H\}\) is completely solved by Theorems 1 and 2. The case \(\mathcal{H}=\{C_\ell \mid \ell \ge 4\}\), also known as chordal graphs, addressed in [3] is still open. The next step may be the case of two-element sets \(\mathcal{H} =\{H_1,H_2\}\); in particular, \(\mathcal{H} =\{H,\overline{H}\}\). Another interesting problem is to clear the complexity of cluster-vd and connected cluster-vd on line graphs, a well-studied graph class defined by excluding nine small induced subgraphs.
References
Aprile, M., Drescher, M., Fiorini, S., Huynh, T.: A tight approximation algorithm for the cluster vertex deletion problem. Math. Program. 197(2), 1069–1091 (2023). https://doi.org/10.1007/s10107-021-01744-w
Boral, A., Cygan, M., Kociumaka, T., Pilipczuk, M.: A Fast Branching Algorithm for Cluster Vertex Deletion. Theory Comput. Syst. 58(2), 357–376 (2016). https://doi.org/10.1007/s00224-015-9631-7
Cao, Y., Ke, Y., Otachi, Y., You, J.: Vertex deletion problems on chordal graphs. Theor. Comput. Sci. 745, 75–86 (2018). https://doi.org/10.1016/j.tcs.2018.05.039
Chakraborty, D., Chandran, L.S., Padinhatteeri, S., Pillai, R.R.: Algorithms and Complexity of \(s\)-Club Cluster Vertex Deletion. In: Flocchini P., Moura L. (eds.) Combinatorial Algorithms - 32nd International Workshop, IWOCA 2021, Ottawa, ON, Canada, Proceedings. Lecture Notes in Computer Science, vol. 12757, pp. 152–164. Springer (2021). https://doi.org/10.1007/978-3-030-79987-8_11
Chudnovsky, M., Cornuéjols, G., Liu, X., Seymour, P.D., Vuskovic, K.: Recognizing Berge Graphs. Combinatorica 25(2), 143–186 (2005). https://doi.org/10.1007/s00493-005-0012-8
Corneil, D.G., Lerchs, H., Burlingham, L.S.: Complement reducible graphs. Discret. Appl. Math. 3(3), 163–174 (1981). https://doi.org/10.1016/0166-218X(81)90013-5
Corneil, D.G., Perl, Y., Stewart, L.K.: A Linear Recognition Algorithm for Cographs. SIAM J. Comput. 14(4), 926–934 (1985). https://doi.org/10.1137/0214065
Courcelle, B., Engelfriet, J., Rozenberg, G.: Handle-Rewriting Hypergraph Grammars. J. Comput. Syst. Sci. 46(2), 218–270 (1993). https://doi.org/10.1016/0022-0000(93)90004-G
Courcelle, B., Makowsky, J.A., Rotics, U.: Linear Time Solvable Optimization Problems on Graphs of Bounded Clique-Width. Theory Comput. Syst. 33(2), 125–150 (2000). https://doi.org/10.1007/s002249910009
Gartland, P., Lokshtanov, D.: Independent Set on \(P_{k}\)-Free Graphs in Quasi-Polynomial Time. In: Irani S. (ed.) 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Durham, NC, USA, pp. 613–624. IEEE (2020). https://doi.org/10.1109/FOCS46700.2020.00063
Golovach, P.A., Johnson, M., Paulusma, D., Song, J.: A Survey on the Computational Complexity of Coloring Graphs with Forbidden subgraphs. J. Graph Theory 84(4), 331–363 (2017). https://doi.org/10.1002/jgt.22028
Grötschel, M., Lovász, L., Schrijver, A.: Geometric Algorithms and Combinatorial Optimization. Springer (1988). https://doi.org/10.1007/978-3-642-97881-4
Grzesik, A., Klimosová, T., Pilipczuk, M., Pilipczuk, M.: Polynomial-time Algorithm for Maximum Weight Independent Set on \(P_{6}\)-free Graphs. ACM Trans. Algorithms 18(1), 4:1-4:57 (2022). https://doi.org/10.1145/3414473
Hsieh, S.-Y., Le, H.-O., Le, V.B., Peng, S.-L.: On the \(d\)-Claw Vertex Deletion Problem. Algorithmica (2023). https://doi.org/10.1007/s00453-023-01144-w
Hüffner, F., Komusiewicz, C., Moser, H., Niedermeier, R.: Fixed-Parameter Algorithms for Cluster Vertex Deletion. Theory Comput. Syst. 47(1), 196–217 (2010). https://doi.org/10.1007/s00224-008-9150-x
Impagliazzo, R., Paturi, R.: On the Complexity of \(k\)-SAT. J. Comput. Syst. Sci. 62(2), 367–375 (2001). https://doi.org/10.1006/jcss.2000.1727
Impagliazzo, R., Paturi, R., Zane, F.: Which Problems Have Strongly Exponential Complexity? J. Comput. Syst. Sci. 63(4), 512–530 (2001). https://doi.org/10.1006/jcss.2001.1774
Johnson, D.S., Szegedy, M.: What are the Least Tractable Instances of Max Independent Set? In: Tarjan, R.E., Warnow, T.J. (eds.) Proceedings of the Tenth Annual ACM-SIAM Symposium on Discrete Algorithms, Baltimore, Maryland, USA, pp. 927–928. ACM/SIAM (1999). http://dl.acm.org/citation.cfm?id=314500.315093
Johnson, M., Paesani, G., Paulusma, D.: Connected Vertex Cover for (\(sP_{1} + P_{5}\))-Free Graphs. Algorithmica 82(1), 20–40 (2020). https://doi.org/10.1007/s00453-019-00601-9
Kamiński, M.: Max-Cut and containment relations in graphs. Theor. Comput. Sci. 438, 89–95 (2012). https://doi.org/10.1016/j.tcs.2012.02.036
Komusiewicz, C.: Tight Running Time Lower Bounds for Vertex Deletion Problems. ACM Trans. Comput. Theory 10(2), 6:1-6:18 (2018). https://doi.org/10.1145/3186589
Korobitsin, D.V.: On the complexity of domination number determination in monogenic classes of graphs. Discrete Math. Appl. 2, 191–200 (1992). https://doi.org/10.1515/dma.1992.2.2.191
Král, D., Kratochvíl, J., Tuza, Z., Woeginger, G.J.: Complexity of Coloring Graphs without Forbidden Induced Subgraphs. In: Brandstädt A., Le, V.B. (eds.) Graph-Theoretic Concepts in Computer Science, 27th International Workshop, WG 2001, Boltenhagen, Germany, Proceedings. Lecture Notes in Computer Science, vol. 2204, pp. 254–262. Springer (2001). https://doi.org/10.1007/3-540-45477-2_23
Le, H.-O., Le, V.B.: Complexity of the Cluster Vertex Deletion Problem on \(H\)-Free Graphs. In: Szeider S., Ganian R., Silva A. (eds.) 47th International Symposium on Mathematical Foundations of Computer Science (MFCS 2022), vol. 241 of Leibniz International Proceedings in Informatics (LIPIcs), pp. 68:1–68:10, Dagstuhl, Germany (2022). Schloss Dagstuhl - Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPIcs.MFCS.2022.68
Lewis, J.M., Yannakakis, M.: The Node-Deletion Problem for Hereditary Properties is NP-complete. J. Comput. Syst. Sci. 20(2), 219–230 (1980). https://doi.org/10.1016/0022-0000(80)90060-4
Moret, B.M.E.: Theory of Computation. Addison-Wesley-Longman (1998)
Munaro, A.: Boundary classes for graph problems involving non-local properties. Theor. Comput. Sci. 692, 46–71 (2017). https://doi.org/10.1016/j.tcs.2017.06.012
Murphy, O.J.: Computing independent sets in graphs with large girth. Discret. Appl. Math. 35(2), 167–170 (1992). https://doi.org/10.1016/0166-218X(92)90041-8
Sau, I., dos Santos Souza, U.: Hitting forbidden induced subgraphs on bounded treewidth graphs. Inf. Comput. 281, 104812 (2021). https://doi.org/10.1016/j.ic.2021.104812
Tovey, C.A.: A simplified NP-complete satisfiability problem. Discret. Appl. Math. 8(1), 85–89 (1984). https://doi.org/10.1016/0166-218X(84)90081-7
Tsur, D.: Faster Parameterized Algorithm for Cluster Vertex Deletion. Theory Comput. Syst. 65(2), 323–343 (2021). https://doi.org/10.1007/s00224-020-10005-w
Yannakakis, M.: Node- and Edge-Deletion NP-Complete Problems. In: Lipton R.J., Burkhard W.A., Savitch W.J., Friedman E.P., Aho A.V. (eds.) Proceedings of the 10th Annual ACM Symposium on Theory of Computing, San Diego, California, USA, pp. 253–264. ACM (1978). https://doi.org/10.1145/800133.804355
Yannakakis, M.: Node-Deletion Problems on Bipartite Graphs. SIAM J. Comput. 10(2), 310–327 (1981). https://doi.org/10.1137/0210022
Acknowledgements
We are grateful to the reviewers for their careful reading and helpful comments. In particular, we thank one of them for her/his very meticulous reading with many valuable suggestions that significantly improved the quality of the paper.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Author information
Contributions
Both authors wrote the main manuscript text.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Parts of this paper were presented at the 47th International Symposium on Mathematical Foundations of Computer Science (MFCS 2022) [24].
Appendices
Appendix A: Computing the Cluster Vertex Deletion Number of Co-graphs using the Cotrees
Recall that \(P_4\)-free graphs are also called cographs [6]. More precisely, for vertex-disjoint graphs \(G_i=(V_i,E_i)\), \(i=1,2\), let \(G_1\oplus G_2=(V_1\cup V_2,\ E_1\cup E_2)\) be the union (or co-join) of \(G_1\) and \(G_2\), and let \(G_1\otimes G_2=(V_1\cup V_2,\ E_1\cup E_2\cup \{uv\mid u\in V_1, v\in V_2\})\) be the join of \(G_1\) and \(G_2\). With these notations, cographs are exactly those graphs that can be constructed from the one-vertex graph by applying the join and co-join operations. Thus, a cograph is the one-vertex graph, or the join of two smaller cographs, or the co-join of two smaller cographs.
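Recall that \(S\subseteq V(G)\) is a vertex cover if \(G-S\) is edgeless and is a cluster vertex deletion set if \(G-S\) is a cluster graph. Let \(\tau (G)\) and \(\varsigma (G)\) denote the vertex cover number and the cluster vertex deletion number of G, respectively:
\[\tau (G)=\min \{|S| : S \text{ is a vertex cover of } G\},\qquad \varsigma (G)=\min \{|S| : S \text{ is a cluster vertex deletion set of } G\}.\]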
We will see that \(\tau (G)\) and \(\varsigma (G)\) can be computed efficiently when restricted to cographs. The calculation is based on the following fact:
Lemma 17
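For any (not necessarily \(P_4\)-free) graphs \(G_1\) and \(G_2\), the following relations hold for the union \(G_1\oplus G_2\) and the join \(G_1\otimes G_2\):
\[\tau (G_1\oplus G_2)=\tau (G_1)+\tau (G_2) \tag{1}\]
\[\tau (G_1\otimes G_2)=\min \{\tau (G_1)+|V(G_2)|,\ \tau (G_2)+|V(G_1)|\} \tag{2}\]
\[\varsigma (G_1\oplus G_2)=\varsigma (G_1)+\varsigma (G_2) \tag{3}\]
\[\varsigma (G_1\otimes G_2)=\min \{\varsigma (G_1)+|V(G_2)|,\ \varsigma (G_2)+|V(G_1)|,\ \tau (\overline{G_1})+\tau (\overline{G_2})\} \tag{4}\]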
Proof
(2): Let \(S_i\) be a vertex cover of \(G_i\) of optimal size \(\tau (G_i)\), \(i=1,2\). Then \(S_1\cup V(G_2)\) and \(S_2\cup V(G_1)\) are vertex covers of the join \(G_1\otimes G_2\). Hence \(\tau (G_1\otimes G_2)\le \min \{\tau (G_1)+|V(G_2)|,\ \tau (G_2)+|V(G_1)|\}\).
For the other direction, let S be a vertex cover of the join \(G_1\otimes G_2\) of optimal size, and write \(S_i=S\cap V(G_i)\). Then \(S_i\) is a vertex cover of \(G_i\), and moreover, \(S_1=V(G_1)\) or \(S_2=V(G_2)\) (because \(S_i=V(G_i)\) for some i is needed to cover the edges between \(G_1\) and \(G_2\)). Hence \(\tau (G_1\otimes G_2)=|S|\ge \min \{\tau (G_1)+|V(G_2)|,\ \tau (G_2)+|V(G_1)|\}\).
(4): Let \(S_i\) be a cluster vertex deletion set of \(G_i\) of optimal size \(\varsigma (G_i)\), \(i=1,2\). Then \(S_1\cup V(G_2)\) and \(S_2\cup V(G_1)\) are cluster vertex deletion sets of the join \(G_1\otimes G_2\). Hence \(\varsigma (G_1\otimes G_2)\le \min \{\varsigma (G_1)+|V(G_2)|,\ \varsigma (G_2)+|V(G_1)|\}\). Now let \(S_i\) be a vertex cover of \(\overline{G_i}\) of optimal size \(\tau (\overline{G_i})\), \(i=1,2\). Then \(S_1\cup S_2\) is a cluster vertex deletion set of \(G_1\otimes G_2\), hence \(\varsigma (G_1\otimes G_2)\le \tau (\overline{G_1})+\tau (\overline{G_2})\).
For the other direction, let S be a cluster vertex deletion set of the join \(G_1\otimes G_2\) of optimal size, and write \(S_i=S\cap V(G_i)\). Then \(S_i\) is a cluster vertex deletion set of \(G_i\), and moreover,
-
if \(G_1-S_1\) is not a clique then \(S_2=V(G_2)\), likewise
-
if \(G_2-S_2\) is not a clique then \(S_1=V(G_1)\).
In these two cases, \(|S|\ge \min \{\varsigma (G_1)+|V(G_2)|,\ \varsigma (G_2)+|V(G_1)|\}\). In the third case where each of \(G_1-S_1\) and \(G_2-S_2\) is a clique, \(S_1\) and \(S_2\) are vertex covers of \(\overline{G_1}\) and \(\overline{G_2}\), respectively. Hence in this case, \(|S|\ge \tau (\overline{G_1})+\tau (\overline{G_2})\). \(\square \)
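As a quick sanity check of relation (4) for the join (an illustration only, not part of the proof), consider two small cographs. Here \(\otimes\) is used as a symbol for the join, \(2K_1\) denotes two isolated vertices, the path \(P_3\) is viewed as the join of \(K_1\) and \(2K_1\), and the cycle \(C_4\) as the join of two copies of \(2K_1\).

```latex
\begin{align*}
% P_3 = K_1 \otimes 2K_1, with G_1 = K_1 and G_2 = 2K_1
\varsigma(K_1) &= \varsigma(2K_1) = 0, \qquad
\tau(\overline{K_1}) = 0, \qquad
\tau(\overline{2K_1}) = \tau(K_2) = 1,\\
\varsigma(P_3) &= \min\{\,0+2,\ 0+1,\ 0+1\,\} = 1,\\[4pt]
% C_4 = 2K_1 \otimes 2K_1, with G_1 = G_2 = 2K_1
\varsigma(C_4) &= \min\{\,0+2,\ 0+2,\ 1+1\,\} = 2.
\end{align*}
```

Both values agree with a direct check: deleting any single vertex of \(P_3\) leaves a cluster graph, whereas deleting a single vertex of \(C_4\) leaves an induced \(P_3\), so two deletions are needed.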
Remark 18
For any integer \(r\ge 2\), Lemma 17 holds accordingly for the union \(G_1\oplus \cdots \oplus G_r\) and the join \(G_1\otimes \cdots \otimes G_r\) of r graphs. We also note that Lemma 17 holds for the weighted version, too.
With each cograph \(G=(V,E)\), one can associate a so-called cotree T of G as follows.
-
The leaves of T are the vertices of G;
-
Every internal node of T has a label \(\oplus\) (union) or \(\otimes\) (join), and has at least two children;
-
No two internal nodes of T with the same label are adjacent;
-
Two vertices u and v of G are (non-)adjacent if and only if the least common ancestor of u and v in T has the join label \(\otimes\) (respectively, the union label \(\oplus\)).
In particular, the cotree of an n-vertex cograph has at most \(2n-1\) nodes.
Note that, for any internal node v of T, the subtree \(T_v\) of T rooted at v is the cotree of the subgraph of G induced by the leaves of \({T}_v\). The cograph corresponding to \({T}_v\) where v has the union label \(\oplus\) is the disjoint union of the cographs corresponding to the children of v. The cograph corresponding to \({T}_v\) where v has the join label \(\otimes\) is the join of the cographs corresponding to the children of v.
In particular, the cotree of \(\overline{G}\) can be obtained from the cotree of G by changing the union label \(\oplus\) to the join label \(\otimes\) and vice versa.
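The cotree properties above can be illustrated with a small Python sketch. The encoding is an assumption made only for this illustration: a leaf is a vertex name (a string), an internal node is a tuple whose first entry is the label `"union"` or `"join"` followed by its children, and the helper names `leaves`, `edges` and `complement` are hypothetical.

```python
def leaves(t):
    """Vertices of the cograph: the leaves of the cotree t."""
    if isinstance(t, str):
        return [t]
    return [v for child in t[1:] for v in leaves(child)]


def edges(t):
    """Edge set of the cograph encoded by t.

    Two vertices are adjacent iff their least common ancestor is a join
    node, so each join node contributes all pairs of leaves that lie in
    different child subtrees.
    """
    if isinstance(t, str):
        return set()
    result = set()
    for child in t[1:]:
        result |= edges(child)
    if t[0] == "join":
        parts = [leaves(child) for child in t[1:]]
        for i in range(len(parts)):
            for j in range(i + 1, len(parts)):
                result |= {frozenset((u, v)) for u in parts[i] for v in parts[j]}
    return result


def complement(t):
    """Cotree of the complement graph: swap the two labels everywhere."""
    if isinstance(t, str):
        return t
    swapped = "join" if t[0] == "union" else "union"
    return (swapped,) + tuple(complement(child) for child in t[1:])


# The path a - b - c is the join of b with the union of a and c.
p3 = ("join", "b", ("union", "a", "c"))
print(sorted(sorted(e) for e in edges(p3)))
print(sorted(sorted(e) for e in edges(complement(p3))))
```

This quadratic-time edge enumeration is only meant to make the adjacency rule concrete; it is not the linear-time machinery of [7].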
In [7], a linear time algorithm is given for recognizing if a given graph is a cograph, and if so, constructing its cotree. Note that the cotree can immediately be transformed to an equivalent binary tree; see Fig. 5 for an example of a cograph G, the cotree of G and its binary version. For simplification, we will use the binary cotree in our algorithm below.
Now, given a cograph G together with its binary cotree T, the bottom-up Algorithm 1 below computes the cluster vertex deletion number \(\varsigma (G)\) of G, as suggested by Lemma 17. The algorithm traverses the cotree T by post-order, that is, for the current node v of T, it recursively traverses the left subtree of \(T_v\), then the right subtree of \(T_v\), and finally visits the current node v. The algorithm uses the following notations. For a node v of T,
-
if v is an internal node then \(\ell (v)\) and r(v) stand for the left child and the right child of v, respectively;
-
n(v) denotes the size of the subgraph of G induced by the leaves of \(T_v\). Thus, if v is a leaf then \(n(v)=1\) and if v is the root of T then \(n(v)=|V(G)|\);
-
\(\varsigma (v)\) denotes the cluster vertex deletion number of the subgraph of G induced by the leaves of \(T_v\). Thus, if v is a leaf then \(\varsigma (v)=0\) and if v is the root of T then \(\varsigma (v)=\varsigma (G)\);
-
\(\overline{\tau }(v)\) denotes the vertex cover number of the complement of the subgraph of G induced by the leaves of \(T_v\). Thus, if v is a leaf then \(\overline{\tau }(v)=0\) and if v is the root of T then \(\overline{\tau }(v)=\tau (\overline{G})\).
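A bottom-up computation along these lines can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pseudocode: the binary cotree is encoded as nested tuples `(label, left, right)` with label `"union"` or `"join"` and leaves given as strings, and the function name `cluster_vd_number` is hypothetical. For each node it returns the triple \((n(v), \varsigma (v), \overline{\tau }(v))\), using the relations of Lemma 17 and the fact that complementation swaps union and join.

```python
def cluster_vd_number(t):
    """Return (n, sigma, tau_bar) for the cograph encoded by binary cotree t.

    n       : number of vertices,
    sigma   : cluster vertex deletion number,
    tau_bar : vertex cover number of the complement graph.
    """
    if isinstance(t, str):  # a leaf is a single vertex
        return 1, 0, 0
    label, left, right = t
    n_l, s_l, tb_l = cluster_vd_number(left)   # post-order: left subtree,
    n_r, s_r, tb_r = cluster_vd_number(right)  # then right subtree, then v
    n = n_l + n_r
    if label == "union":
        sigma = s_l + s_r
        # The complement of a disjoint union is the join of the complements.
        tau_bar = min(tb_l + n_r, tb_r + n_l)
    else:  # label == "join"
        sigma = min(s_l + n_r, s_r + n_l, tb_l + tb_r)
        # The complement of a join is the disjoint union of the complements.
        tau_bar = tb_l + tb_r
    return n, sigma, tau_bar


# The claw: centre c joined to three pairwise non-adjacent leaves x, y, z.
claw = ("join", "c", ("union", "x", ("union", "y", "z")))
print(cluster_vd_number(claw))
```

The recursion performs a constant amount of work per cotree node. For very deep cotrees an explicit stack would be needed in place of recursion because of Python's recursion limit.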
Proposition 19
Given a \(P_4\)-free n-vertex graph G together with its cotree, Algorithm 1 correctly computes the cluster vertex deletion number \(\varsigma (G)\) of G in O(n) time.
Proof
The correctness of Algorithm 1 directly follows from Lemma 17. Since per node in the cotree a constant number of operations is performed, the algorithm runs in O(n) time. \(\square \)
We remark that Algorithm 1 can be slightly modified for computing a minimum cluster vertex deletion set. Also, since Lemma 17 holds accordingly for the weighted version, the minimum weight cluster vertex deletion number of cographs can be computed in linear time, too.
Appendix B: Computing the Connected Cluster Vertex Deletion Number of Cographs using the Cotrees
Recall that \(S\subseteq V(G)\) is a connected cluster vertex deletion set if \(G-S\) is a cluster graph and G[S] is connected. Note that G has a connected cluster vertex deletion set if and only if G has at most one connected component that contains an induced \(P_3\) (if G has at least two connected components containing an induced \(P_3\) then any cluster vertex deletion set must contain vertices in different connected components). Let \(\varsigma _c(G)\) denote the connected cluster vertex deletion number of G:
\[\varsigma _c(G)=\min \{|S| : S \text{ is a connected cluster vertex deletion set of } G\}.\]
(We set \(\varsigma _c(G)=\infty \) if G has no connected cluster vertex deletion set.)
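The existence criterion above is easy to test directly. The following Python sketch is an illustration under assumptions: the graph is given as a dictionary mapping each vertex to the set of its neighbours (simple and undirected), and the function names are hypothetical. It uses the standard fact that a connected graph contains an induced \(P_3\) if and only if it is not complete.

```python
from collections import deque


def components(adj):
    """Connected components of the graph, found by breadth-first search."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        comp, queue = [start], deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    comp.append(w)
                    queue.append(w)
        comps.append(comp)
    return comps


def has_connected_cvd_set(adj):
    """True iff at most one connected component is not a clique."""
    non_cliques = 0
    for comp in components(adj):
        k = len(comp)
        # A component on k vertices is a clique iff every vertex has degree k - 1.
        if any(len(adj[v]) != k - 1 for v in comp):
            non_cliques += 1
    return non_cliques <= 1


# Two disjoint copies of P_3: no connected cluster vertex deletion set exists.
two_p3 = {1: {2}, 2: {1, 3}, 3: {2}, 4: {5}, 5: {4, 6}, 6: {5}}
print(has_connected_cvd_set(two_p3))
```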
When computing \(\varsigma _c(G)\), we will have to consider a special case of (connected) cluster vertex deletion. A set \(S\subseteq V(G)\) is a (connected) clique deletion set if \(G-S\) is a clique (and G[S] is connected). Let \(\theta (G)\) and \(\theta _c(G)\) denote the clique vertex deletion number and the connected clique vertex deletion number of G, respectively:
\[\theta (G)=\min \{|S| : S \text{ is a clique deletion set of } G\},\qquad \theta _c(G)=\min \{|S| : S \text{ is a connected clique deletion set of } G\}.\]
(Again, we set \(\theta _c(G)=\infty \) if G has no connected clique deletion set.) Notice that \(\theta (G)=\tau (\overline{G})\), and thus \(\theta (G)\) can be computed in linear time when restricted to cographs (by Lemma 17 and Proposition 19). Notice also that \(\theta (G)\le \theta _c(G)\) and \(\varsigma (G)\le \varsigma _c(G)\). We will see in this section that \(\theta _c(G)\) and \(\varsigma _c(G)\) can be computed efficiently when restricted to cographs.
We first consider the connected clique vertex deletion number. The following fact follows immediately from the definition:
Lemma 20
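For arbitrary graphs \(G_1\) and \(G_2\), the disjoint union \(G_1\oplus G_2\) satisfies
\[\theta _c(G_1\oplus G_2)={\left\{ \begin{array}{ll} \min \{|V(G_1)|,|V(G_2)|\} &{} \text {if both } G_1 \text { and } G_2 \text { are complete},\\ |V(G_2)| &{} \text {if } G_1 \text { is complete and } G_2 \text { is connected but not complete},\\ |V(G_1)| &{} \text {if } G_2 \text { is complete and } G_1 \text { is connected but not complete},\\ \infty &{} \text {otherwise}. \end{array}\right. }\]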
The following two lemmas provide a formula for computing the connected clique vertex deletion number of the join of two graphs.
Lemma 21
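Let \(G_1\) be a complete graph and let \(G_2\) be an arbitrary graph. Then the join \(G_1\otimes G_2\) satisfies
\[\theta _c(G_1\otimes G_2)=\min \{\theta _c(G_2),\ \theta (G_2)+1\}.\]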
Proof
Let S be an optimal connected clique vertex deletion set of the join \(G_1\otimes G_2\), and write \(S_i=S\cap V(G_i)\), \(i=1,2\). Then \(S_1\) is a (connected) clique deletion set of \(G_1\) (possibly empty) and \(S_2\) is a clique deletion set of \(G_2\). Thus, \(|S_2|\ge \theta (G_2)\). Moreover, if \(G_2[S_2]\) is connected then \(|S_2|\ge \theta _c(G_2)\), and hence in this case, \(\theta _c(G_1\otimes G_2)=|S|\ge \theta _c(G_2)\). If \(G_2[S_2]\) is disconnected then \(|S_1|=1\) (due to the connectedness and the optimality of S) and \(|S|\ge 1+\theta (G_2)\). Hence, in this case, \(\theta _c(G_1\otimes G_2)\ge \theta (G_2)+1\).
For the other direction, let S be a clique vertex deletion set of \(G_2\) of optimal size \(\theta (G_2)\). If \(G_2[S]\) is connected then S is a connected clique deletion set of \(G_1\otimes G_2\), hence \(\theta _c(G_1\otimes G_2)\le \theta (G_2)=\theta _c(G_2)\). If \(G_2[S]\) is disconnected then, for any vertex \(u\in V(G_1)\), \(S\cup \{u\}\) is a connected clique deletion set of \(G_1\otimes G_2\), hence \(\theta _c(G_1\otimes G_2)\le \theta (G_2)+1\). \(\square \)
Lemma 22
Let \(G_1\) and \(G_2\) be two arbitrary non-complete graphs. Then the join \(G_1\otimes G_2\) satisfies
\[\theta _c(G_1\otimes G_2)=\theta (G_1)+\theta (G_2).\]
Proof
Let S be an optimal connected clique deletion set of the join \(G_1\otimes G_2\) and write \(S_i=S\cap V(G_i)\), \(i=1,2\). Then \(S_i\) is a clique deletion set of \(G_i\), hence \(\theta _c(G_1\otimes G_2)=|S_1|+|S_2|\ge \theta (G_1)+\theta (G_2)\).
For the other direction, let \(T_i\) be an optimal clique deletion set of \(G_i\), \(i=1,2\). By assumption, \(T_i\not =\emptyset \), hence \(T_1\cup T_2\) is a connected clique deletion set of \(G_1\otimes G_2\). Therefore, \(\theta _c(G_1\otimes G_2)\le \theta (G_1)+\theta (G_2)\). \(\square \)
We now consider the connected cluster vertex deletion number of the disjoint union and the join of two graphs. The following fact follows immediately from the definition:
Lemma 23
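For arbitrary graphs \(G_1\) and \(G_2\), the disjoint union \(G_1\oplus G_2\) satisfies
\[\varsigma _c(G_1\oplus G_2)={\left\{ \begin{array}{ll} \varsigma _c(G_2) &{} \text {if } G_1 \text { is a cluster graph},\\ \varsigma _c(G_1) &{} \text {if } G_2 \text { is a cluster graph},\\ \infty &{} \text {otherwise}. \end{array}\right. }\]
(If both \(G_1\) and \(G_2\) are cluster graphs, both cases give the value 0.)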
Lemmas 24 and 26 below provide a formula for computing the connected cluster vertex deletion number of the join of two graphs.
Lemma 24
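Let \(G_1\) be a complete graph and let \(G_2\) be an arbitrary graph. Then the join \(G_1\otimes G_2\) satisfies
\[\varsigma _c(G_1\otimes G_2)=\min \{|V(G_1)|+\varsigma (G_2),\ \theta _c(G_2),\ \theta (G_2)+1\}.\]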
Proof
Let S be a connected cluster vertex deletion set of the join \(G_1\otimes G_2\) of optimal size, and write \(S_i=S\cap V(G_i)\), \(i=1,2\). Then \(S_1\) is a (connected) clique deletion set of \(G_1\) (possibly empty) and \(S_2\) is a cluster vertex deletion set of \(G_2\). Moreover, if \(G_2-S_2\) is not a clique then \(S_1=V(G_1)\), hence \(|S|\ge |V(G_1)|+\varsigma (G_2)\). In the case where \(G_2-S_2\) is a clique, \(|S_2|\ge \theta (G_2)\). Moreover, if \(G_2[S_2]\) is connected then \(S_1=\emptyset \) (because of the optimality of S) and \(|S_2|\ge \theta _c(G_2)\); if \(G_2[S_2]\) is disconnected, \(|S_1|=1\). Hence in this case, \(|S|\ge \min \{\theta _c(G_2),\ \theta (G_2)+1\}\).
For the other direction, observe first that by definition, \(\varsigma _c(G_1\otimes G_2)\le \theta _c(G_1\otimes G_2)\), and hence by Lemma 21, \(\varsigma _c(G_1\otimes G_2)\le \min \{\theta _c(G_2),\ \theta (G_2)+1\}\). Observe next that, for any cluster vertex deletion set S of \(G_2\) of optimal size \(\varsigma (G_2)\), \(V(G_1)\cup S\) is a connected cluster vertex deletion set of \(G_1\otimes G_2\), hence \(\varsigma _c(G_1\otimes G_2)\le |V(G_1)|+\varsigma (G_2)\). \(\square \)
For two non-complete graphs, we first show:
Lemma 25
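Let \(G_1\) and \(G_2\) be two arbitrary, non-complete graphs. Then the join \(G_1\otimes G_2\) satisfies
\[\varsigma _c(G_1\otimes G_2)\ge \min \{\theta (G_1)+\theta (G_2),\ |V(G_1)|+\varsigma (G_2),\ |V(G_2)|+\varsigma (G_1)\}.\]
Furthermore, if both \(G_1\) and \(G_2\) are disconnected, then:
\[\varsigma _c(G_1\otimes G_2)\ge \min \{\theta (G_1)+\theta (G_2),\ |V(G_1)|+\max \{1,\varsigma (G_2)\},\ |V(G_2)|+\max \{1,\varsigma (G_1)\}\}.\]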
Proof
Let S be a connected cluster vertex deletion set of the join \(G_1\otimes G_2\) of optimal size, and write \(S_i=S\cap V(G_i)\), \(i=1,2\). Then \(S_i\) is a cluster vertex deletion set of \(G_i\). Note, moreover, that at least one of \(G_1-S_1\) and \(G_2-S_2\) must be a clique.
If each of \(G_1-S_1\) and \(G_2-S_2\) is a clique, \(S_1\) and \(S_2\) are clique deletion sets of \(G_1\) and \(G_2\), respectively. Hence in this case, \(|S|\ge \theta (G_1)+\theta (G_2)\). If \(G_1-S_1\) is not a clique then \(S_2=V(G_2)\), and likewise, if \(G_2-S_2\) is not a clique then \(S_1=V(G_1)\). In these two cases, \(|S|\ge \min \{|V(G_1)|+\varsigma (G_2),\ |V(G_2)|+\varsigma (G_1)\}\).
Now, suppose that both \(G_1\) and \(G_2\) are disconnected. Then, the connectivity of S implies that if \(S_1=V(G_1)\) then \(|S_2|\ge 1\), and likewise, if \(S_2=V(G_2)\) then \(|S_1|\ge 1\). Hence, \(|S|\ge \min \{\theta (G_1)+\theta (G_2),\ |V(G_1)|+\max \{1,\varsigma (G_2)\},\ |V(G_2)|+\max \{1,\varsigma (G_1)\}\}\). \(\square \)
Lemma 26
Let \(G_1\) and \(G_2\) be two arbitrary, non-complete graphs.
-
(1)
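If \(G_1\) or \(G_2\) is connected, then the join \(G_1\otimes G_2\) satisfies
\[\varsigma _c(G_1\otimes G_2)=\min \{\theta (G_1)+\theta (G_2),\ |V(G_1)|+\varsigma (G_2),\ |V(G_2)|+\varsigma (G_1)\}.\]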
-
(2)
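If both \(G_1\) and \(G_2\) are disconnected, then the join \(G_1\otimes G_2\) satisfies
\[\varsigma _c(G_1\otimes G_2)=\min \{\theta (G_1)+\theta (G_2),\ |V(G_1)|+\max \{1,\varsigma (G_2)\},\ |V(G_2)|+\max \{1,\varsigma (G_1)\}\}.\]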
Proof
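By Lemma 25, it remains to show that in both claims the left-hand side is at most the right-hand side. Observe first that, for the join \(G_1\otimes G_2\), \(\varsigma _c(G_1\otimes G_2)\le \theta _c(G_1\otimes G_2)\), and so by Lemma 22, \(\varsigma _c(G_1\otimes G_2)\le \theta (G_1)+\theta (G_2)\).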
-
(1):
Let \(G_1\) be connected, say. Observe that any cluster vertex deletion set \(S_1\) of \(G_1\) is non-empty (because \(G_1\) is connected non-complete), hence \(V(G_2)\cup S_1\) is a connected cluster vertex deletion set of the join \(G_1\otimes G_2\), and for any cluster vertex deletion set \(S_2\) of \(G_2\), \(V(G_1)\cup S_2\) is a connected cluster vertex deletion set of \(G_1\otimes G_2\) (because \(G_1\) is connected). Thus, \(\varsigma _c(G_1\otimes G_2)\le \min \{|V(G_2)|+\varsigma (G_1),\ |V(G_1)|+\varsigma (G_2)\}\).
-
(2):
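Observe that for any cluster vertex deletion set \(S_1\) of \(G_1\) of optimal size \(\varsigma (G_1)\), \(V(G_2)\cup S_1\) (if \(S_1\not =\emptyset \)) or \(V(G_2)\cup \{u\}\) (if \(S_1=\emptyset \)), where u is any vertex of \(G_1\), is a connected cluster vertex deletion set of the join \(G_1\otimes G_2\). Hence \(\varsigma _c(G_1\otimes G_2)\le |V(G_2)|+\max \{1,\varsigma (G_1)\}\). Similarly, \(\varsigma _c(G_1\otimes G_2)\le |V(G_1)|+\max \{1,\varsigma (G_2)\}\).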
\(\square \)
Now, given a cograph G together with its cotree, with Lemmas 20, 21, 22, 23, 24, and 26 we can compute the connected clique vertex deletion number and the connected cluster vertex deletion number of G in linear time. This is done in the same way as for computing the vertex cover number and the cluster vertex deletion number in Appendix A, hence we omit the details.
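Since the details are omitted, the following Python sketch shows one way such a bottom-up computation might look. It is an illustration under assumptions, not the authors' algorithm: the binary cotree is encoded as nested tuples `(label, left, right)` with label `"union"` or `"join"` and string leaves, and the names `Info` and `analyse` are hypothetical. The join cases follow the formulas proved for the join above; the disjoint-union cases are worked out directly from the definitions, using that a connected deletion set cannot meet both parts of a disjoint union.

```python
import math
from typing import NamedTuple


class Info(NamedTuple):
    n: int            # number of vertices
    complete: bool    # is the graph a clique?
    connected: bool
    theta: int        # clique deletion number (= tau of the complement)
    sigma: int        # cluster vertex deletion number
    theta_c: float    # connected clique deletion number (math.inf if none)
    sigma_c: float    # connected cluster vertex deletion number


def analyse(t) -> Info:
    if isinstance(t, str):  # a leaf is a single vertex
        return Info(1, True, True, 0, 0, 0, 0)
    label, left, right = t
    a, b = analyse(left), analyse(right)
    n = a.n + b.n
    if label == "union":
        theta = min(a.theta + b.n, b.theta + a.n)
        sigma = a.sigma + b.sigma
        # S must be exactly one (connected) part, and the other part a clique.
        theta_c = min(b.n if a.complete and b.connected else math.inf,
                      a.n if b.complete and a.connected else math.inf)
        # S lies inside one part, and the other part must be a cluster graph.
        sigma_c = min(b.sigma_c if a.sigma == 0 else math.inf,
                      a.sigma_c if b.sigma == 0 else math.inf)
        return Info(n, False, False, theta, sigma, theta_c, sigma_c)
    # label == "join"
    theta = a.theta + b.theta
    sigma = min(a.sigma + b.n, b.sigma + a.n, theta)
    if b.complete and not a.complete:
        a, b = b, a  # let a be the complete side, if there is one
    if a.complete:   # one side complete: Lemmas 21 and 24
        theta_c = min(b.theta_c, b.theta + 1)
        sigma_c = min(a.n + b.sigma, theta_c)
    else:            # both sides non-complete: Lemmas 22 and 26
        theta_c = theta
        if a.connected or b.connected:
            sigma_c = min(theta, a.n + b.sigma, b.n + a.sigma)
        else:
            sigma_c = min(theta, a.n + max(1, b.sigma), b.n + max(1, a.sigma))
    return Info(n, a.complete and b.complete, True, theta, sigma, theta_c, sigma_c)


# C_4 as the join of two pairs of non-adjacent vertices.
c4 = ("join", ("union", "a", "c"), ("union", "b", "d"))
print(analyse(c4).sigma_c)
```

Each node is handled with a constant number of operations, which matches the linear running time claimed above. The value `math.inf` plays the role of \(\infty \) for graphs without a connected deletion set.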
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Le, HO., Le, V.B. Complexity of the (Connected) Cluster Vertex Deletion Problem on H-free Graphs. Theory Comput Syst 68, 250–270 (2024). https://doi.org/10.1007/s00224-024-10161-3
DOI: https://doi.org/10.1007/s00224-024-10161-3