
Advanced Algorithms Course.

Lecture Notes. Part 3

Addendum: Weighted Hitting Set


In the Weighted Hitting Set problem we are given a set A of n elements,
each with a weight wi, and a collection of m subsets Bj ⊆ A. A hitting set
is a subset of A that intersects (hits) every Bj. We wish to determine a
hitting set with minimum total weight.
It is not hard to see that Weighted Vertex Cover is a special case of this
problem, when all Bj have size 2. Accordingly, the pricing algorithm can
be generalized to Weighted Hitting Set when all Bj have size b, where b is
any fixed integer. Then we can achieve approximation ratio b in polynomial
time. This is left as an exercise.

Disjoint Paths and Routing


Given a directed graph with m edges, and k node pairs (si, ti), we wish
to find directed paths from si to ti for as many indices i as possible, such
that the paths do not share any edges. We also call such paths edge-disjoint.
This is a fundamental problem in routing in networks. Imagine that we want to
send goods, information, etc., from source nodes to destination nodes along
available directed paths, without unreasonable congestion. In general we
cannot send everything simultaneously, but we may try to maximize the
number of served requests.
The problem is NP-complete (which we do not prove here), but we
present an algorithm with approximation ratio O(√m). The square root
is a small function, still the quality of the solution deteriorates with grow-
ing network size. This result may seem poor, but it is essentially the best
possible guarantee one can achieve in polynomial time (unless P = NP), and
still better than no guarantee at all.

As often, the idea of a greedy algorithm is simple: Short paths should
minimize the chances of conflicts with other paths, and shortest paths
can be computed efficiently. Therefore, the proposed algorithm simply chooses
a shortest path that connects some yet unconnected pair and adds it to the
solution, as long as this is possible. After every step we delete the edges of the
path just used, in order to avoid collisions with paths chosen later.
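To make the greedy routine concrete, here is a rough Python sketch. It assumes nodes are arbitrary hashable labels, uses breadth-first search for shortest (fewest-edge) paths, breaks ties arbitrarily, and is written for clarity rather than speed.

from collections import deque

def greedy_disjoint_paths(edges, pairs):
    # edges: list of directed edges (u, v); pairs: list of terminal pairs (s_i, t_i)
    remaining = set(edges)                  # edges not yet used by any chosen path

    def shortest_path(s, t):
        # Breadth-first search over the remaining edges; returns a shortest
        # s-t path as a list of nodes, or None if t is unreachable.
        adj = {}
        for (u, v) in remaining:
            adj.setdefault(u, []).append(v)
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            if u == t:
                path = []
                while u is not None:
                    path.append(u)
                    u = parent[u]
                return path[::-1]
            for v in adj.get(u, []):
                if v not in parent:
                    parent[v] = u
                    queue.append(v)
        return None

    routed = {}                             # index i -> chosen path for (s_i, t_i)
    unconnected = set(range(len(pairs)))
    while True:
        # Among all still unconnected pairs, pick one with a shortest path.
        best_i, best_path = None, None
        for i in unconnected:
            p = shortest_path(*pairs[i])
            if p is not None and (best_path is None or len(p) < len(best_path)):
                best_i, best_path = i, p
        if best_i is None:
            break                           # no unconnected pair can be routed anymore
        routed[best_i] = best_path
        unconnected.remove(best_i)
        # Delete the edges of the chosen path to avoid collisions with later paths.
        for u, v in zip(best_path, best_path[1:]):
            remaining.discard((u, v))
    return routed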
However, the idea is not as powerful as one might hope: In each step
there could exist many short paths to choose from, and we may easily miss
a good one, since we only use length as the selection criterion. But at least
we can prove the O(√m) factor, as follows. Let I* and I denote the sets
of indices i of the pairs (si, ti) connected by the optimal and the greedy
solution, respectively. Let Pi* and Pi denote the selected paths for index i.
The analysis works with a case distinction regarding the length: We call a
path with at least √m edges long, and other paths are called short. Let
Is* and Is be the sets of indices i of the pairs (si, ti) connected by the short
paths in I* and I, respectively.

Since only m edges exist, I* can have at most √m long paths. Consider
any index i where Pi* is short, but (si, ti) is not even connected in I. (This is
the worst that can happen to a pair, hence our worst-case analysis focuses
on this case.) The reason why the greedy algorithm has not chosen Pi* must
be that some edge e ∈ Pi* is in some Pj chosen earlier. We say that e
blocks Pi*. We have |Pj| ≤ |Pi*| ≤ √m, since the greedy algorithm always
picks a shortest available path. Every edge in Pj can block at most one path
of I*, because the paths of I* are edge-disjoint. Hence Pj blocks at most √m
paths of I*, and such a blocking path Pj is itself short. The number of such
particularly bad indices i is therefore bounded by |Is* \ I| ≤ |Is| · √m.
Finally some simple steps prove the claimed approximation ratio (using
|Is| ≤ |I|, and |I| ≥ 1 whenever I* is nonempty):

|I*| ≤ |I* \ Is*| + |I| + |Is* \ I| ≤ √m + |I| + |Is| · √m ≤ (2√m + 1) |I|.

An Approximation Scheme for Knapsack


So far we have seen some approximation algorithms whose approximation
ratio on an instance is fixed, either an absolute constant or depending on
the input size. But often we may be willing to spend more computation
time to get a better solution, i.e., one closer to the optimum. In other words,
we may trade time for quality. An approximation scheme is an algorithm where
the user can freely decide on some accuracy parameter ε and get a solution
within a factor 1 + ε of the optimum, at the cost of a time complexity that grows
as ε decreases. The actual choice of ε may then depend on the demands and
resources. A nice example is the following Knapsack algorithm.

In the Knapsack problem, a knapsack of capacity W is given, as well as
n items with weights wi and values vi (all integers). The problem is to find
a subset S of items with Σ_{i∈S} wi ≤ W (so that S fits in the knapsack) and
maximum value Σ_{i∈S} vi. Define v* := max_i vi.
You may already know that Knapsack is NP-complete but can be solved
by a dynamic programming algorithm. Its time bound O(nW) is polynomial
in the numerical value W, but W can be exponential in the length of the input
encoding (W is written with only about log W bits), therefore we call the
algorithm pseudopolynomial. (A truly polynomial algorithm for an NP-complete
problem cannot exist, unless P = NP.) However, for our approximation scheme
we need another dynamic programming algorithm that differs from the most
natural one, because we need a time bound in terms of values rather than
weights. (This point will become more apparent later on.) Here it comes:
Define OPT(i, V) to be the minimum (necessary) capacity of a knapsack
that contains a subset of the first i items of total value at least V. We can
compute OPT(i, V) using the OPT values for smaller arguments, as follows.
If V > Σ_{j=1}^{i-1} vj then, obviously, we must add item i to reach V. Thus we
have OPT(i, V) = wi + OPT(i-1, max(V - vi, 0)) in this case. If V ≤ Σ_{j=1}^{i-1} vj
then item i may be added or not, leading to

OPT(i, V) = min( OPT(i-1, V), wi + OPT(i-1, max(V - vi, 0)) ).

(Think for a while.) Since i ≤ n and V ≤ n·v*, the time is bounded by
O(n² v*). As usual in dynamic programming, backtracing can reconstruct
an actual solution from the OPT values.
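As an illustration, here is a Python sketch of this value-based dynamic program, including the backtracing step; the table layout and the function name are just one possible choice, not prescribed by the lecture.

def knapsack_by_value(weights, values, W):
    # Exact Knapsack via OPT(i, V) = minimum capacity needed to select a
    # subset of the first i items with total value at least V.  Time O(n^2 * v*).
    n = len(values)
    INF = float("inf")
    total = sum(values)                     # no target value V beyond this is needed
    # OPT[i][V]; OPT[0][0] = 0 and OPT[0][V] = infinity for V > 0.
    OPT = [[INF] * (total + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        OPT[i][0] = 0
    for i in range(1, n + 1):
        wi, vi = weights[i - 1], values[i - 1]
        for V in range(1, total + 1):
            with_i = wi + OPT[i - 1][max(V - vi, 0)]
            # OPT[i-1][V] is infinite whenever the first i-1 items cannot reach V,
            # so the min automatically forces item i in exactly that case.
            OPT[i][V] = min(OPT[i - 1][V], with_i)
    # Optimal value: the largest V whose required capacity still fits in W.
    best_V = max(V for V in range(total + 1) if OPT[n][V] <= W)
    # Backtracing: walk the table to recover one optimal item set.
    S, V = set(), best_V
    for i in range(n, 0, -1):
        if OPT[i][V] != OPT[i - 1][V]:      # item i was needed to reach value V
            S.add(i - 1)                    # store the 0-based item index
            V = max(V - values[i - 1], 0)
    return best_V, S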
Now the idea of the approximation scheme is: If v* is small, we can
afford an optimal solution, as the time bound is small. If v* is large, we
round the values to multiples of some number and thereby solve the given
problem only approximately. The point is that we can divide all the rounded
values by the common factor without changing the solution sets, which again
gives us a small problem. In the following we work out this idea precisely.
We do not specify what "small" and "large" mean in the above sketch;
instead, some free parameter b (an integer) controls the problem size.
First compute new values vi' as follows: Divide vi by some fixed b and
round up to the next integer: vi' = ⌈vi / b⌉. Then run the dynamic program-
ming algorithm with the new values vi' rather than vi.
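Putting the pieces together, the scheme might look as follows in Python, reusing the knapsack_by_value sketch from above; forcing b to be an integer via truncation is a simplifying assumption on top of the text.

import math

def knapsack_fptas(weights, values, W, eps):
    # Scale the values: b is (roughly) eps * v_max / n, truncated to an integer >= 1.
    n = len(values)
    vmax = max(values)
    b = max(1, int(eps * vmax / n))
    scaled = [math.ceil(v / b) for v in values]   # v_i' = ceil(v_i / b)
    # Solve the rounded instance exactly; its maximum value is about n / eps,
    # so the dynamic program above runs in O(n^3 / eps) time.
    _, S = knapsack_by_value(weights, scaled, W)
    # Report the chosen items and their value under the original v_i.
    return S, sum(values[i] for i in S)

For instance, with eps = 0.1 the returned item set is guaranteed (by the analysis below) to have at least 90% of the optimal value.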
Let us compare the solution S found by this algorithm with the optimal
solution S*. Since we have not changed the weights of the elements, S still
fits in the knapsack despite the new values. Since S is optimal for the new
values, clearly Σ_{i∈S*} vi' ≤ Σ_{i∈S} vi'. Now one can easily see:

Σ_{i∈S*} vi / b  ≤  Σ_{i∈S*} vi'  ≤  Σ_{i∈S} vi'  ≤  Σ_{i∈S} (vi / b + 1)  ≤  n + Σ_{i∈S} vi / b.

This shows

Σ_{i∈S*} vi ≤ n·b + Σ_{i∈S} vi,

in words, the optimal total value is larger than the achieved value by at most
an additional amount n·b. Depending on the maximum value v* we choose
a suitable b. By choosing b := ε·v*/n, the above inequality becomes

Σ_{i∈S*} vi ≤ ε·v* + Σ_{i∈S} vi.

Since trivially Σ_{i∈S*} vi ≥ v* (items with wi > W can never be packed and
may be discarded beforehand, so the most valuable remaining item alone is a
feasible solution), this becomes

Σ_{i∈S*} vi ≤ ε·Σ_{i∈S*} vi + Σ_{i∈S} vi,

hence

(1 − ε)·Σ_{i∈S*} vi ≤ Σ_{i∈S} vi.

In words: We achieve at least a (1 − ε) fraction of the optimal value. The time
is O(n² v*/b) = O(n³/ε). Thus we can compute a solution with at least (1 − ε)
times the optimum value in O(n³/ε) time.
For any fixed accuracy ε this time bound is polynomial in n (not only
pseudopolynomial, as for the exact dynamic programming algorithm). However,
the smaller an ε we want, the more time we have to invest. To end with a
general definition: A fully polynomial-time approximation scheme (FPTAS)
is an algorithm that takes an additional input parameter ε and computes a
solution that has at least (1 − ε) times the optimum value (for a maximization
problem), or at most (1 + ε) times the optimum value (for a minimization
problem), and runs in a time that is polynomial in n and 1/ε.
