DAA Unit-4: P-Class


P-Class:
The class P consists of those problems that are solvable in polynomial time, i.e.
these problems can be solved in time O(n^k) in the worst case, where k is a constant.
These problems are called tractable, while the others are called intractable or
superpolynomial.
Formally, an algorithm is a polynomial-time algorithm if there exists a
polynomial p(n) such that the algorithm can solve any instance of size n in
time O(p(n)).
Problems requiring Ω(n^50) time to solve are essentially intractable for large n. Most
known polynomial-time algorithms run in time O(n^k) for fairly low values of k.
The advantage of considering the class of polynomial-time algorithms is that all
reasonable deterministic single-processor models of computation can be
simulated on each other with at most a polynomial slowdown.
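As a minimal illustration (not part of the original notes), here is a sketch of a problem in P: deciding whether an undirected graph is connected takes O(V + E) time with breadth-first search. The function name and edge-list encoding are our own assumptions.

```python
from collections import deque

def is_connected(n, edges):
    """Decide connectivity of an undirected graph on vertices 0..n-1
    in O(V + E) time via BFS -- so this decision problem is in P."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == n
```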
NP-Class:
The class NP consists of those problems that are verifiable in polynomial time. NP
is the class of decision problems for which it is easy to check the correctness of a
claimed answer, with the aid of a little extra information. Hence, we aren’t asking for
a way to find a solution, but only to verify that an alleged solution really is correct.
Every problem in this class can be solved in exponential time using exhaustive
search.
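To make "verifiable in polynomial time" concrete, here is a hedged sketch (not from the original notes) of a verifier for the Hamiltonian-cycle problem: given a claimed tour as extra information, it checks correctness in polynomial time. The function name and input encoding are assumptions.

```python
def verify_hamiltonian_cycle(n, edges, tour):
    """Polynomial-time verifier: checks that `tour` is a permutation
    of the vertices 0..n-1 and that consecutive vertices (including
    the wrap-around closing edge) are joined by edges of the graph."""
    edge_set = {frozenset(e) for e in edges}
    if sorted(tour) != list(range(n)):
        return False  # not every vertex visited exactly once
    return all(frozenset((tour[i], tour[(i + 1) % n])) in edge_set
               for i in range(n))
```

Note that the verifier only checks an alleged solution; it does not find one.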
P versus NP:
Every decision problem that is solvable by a deterministic polynomial time algorithm
is also solvable by a polynomial time non-deterministic algorithm.
All problems in P can be solved with polynomial time algorithms, whereas all
problems in NP - P are intractable.
It is not known whether P = NP. However, many problems are known in NP with the
property that if they belong to P, then it can be proved that P = NP.
If P ≠ NP, there are problems in NP that are neither in P nor NP-complete.
The problem belongs to class P if it’s easy to find a solution for the problem. The
problem belongs to NP, if it’s easy to check a solution that may have been very
tedious to find.
A problem is in the class NPC if it is in NP and is as hard as any problem in NP. A
problem is NP-hard if all problems in NP are polynomial time reducible to it, even
though it may not be in NP itself.
If a polynomial time algorithm exists for any of these problems, all problems in NP
would be polynomial time solvable. These problems are called NP-complete. The
phenomenon of NP-completeness is important for both theoretical and practical
reasons.
Definition of NP-Completeness
A language B is NP-complete if it satisfies two conditions:
 B is in NP.
 Every A in NP is polynomial time reducible to B.
If a language satisfies the second property, but not necessarily the first one, the
language B is known as NP-Hard. Informally, a search problem B is NP-Hard if
there exists some NP-Complete problem A that Turing reduces to B.
Problems in NP-Hard cannot be solved in polynomial time unless P = NP. If a
problem is proved to be NPC, there is no need to waste time trying to find an
efficient exact algorithm for it. Instead, we can focus on designing approximation algorithms.
NP-Complete Problems
Following are some NP-Complete problems, for which no polynomial time algorithm
is known.
 Determining whether a graph has a Hamiltonian cycle
 Determining whether a Boolean formula is satisfiable, etc.

NP-Hard Problems
The following problems are NP-Hard
 The circuit-satisfiability problem
 Set Cover
 Vertex Cover
 Travelling Salesman Problem
In this context, we will now show that TSP is NP-Complete.
TSP is NP-Complete
The traveling salesman problem consists of a salesman and a set of cities.
The salesman has to visit each one of the cities starting from a certain one and
returning to the same city. The challenge of the problem is that the traveling
salesman wants to minimize the total length of the trip
Proof:
To prove TSP is NP-Complete, first we have to prove that TSP belongs to NP. In
the decision version of TSP, the certificate is a tour; we check that the tour contains
each vertex exactly once, then calculate the total cost of the edges of the tour and
check that the cost is within the given bound. This can be completed in polynomial
time. Thus TSP belongs to NP.
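As an illustrative sketch (not part of the original proof), the polynomial-time verification can be written as follows, assuming the instance is given as an n×n cost matrix, the certificate as a vertex order, and k as the decision bound:

```python
def verify_tsp_tour(n, cost, tour, k):
    """Polynomial-time verifier for the decision version of TSP:
    checks that `tour` visits every city exactly once and that its
    total cost (including the closing edge) is at most the bound k."""
    if sorted(tour) != list(range(n)):
        return False  # not every city visited exactly once
    total = sum(cost[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return total <= k
```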
Secondly, we have to prove that TSP is NP-hard. To prove this, one way is to show
that Hamiltonian cycle ≤p TSP (as we know that the Hamiltonian cycle problem is
NP-complete).
Assume G = (V, E) to be an instance of Hamiltonian cycle.
From it, an instance of TSP is constructed. We create the complete graph G' = (V, E'), where
E′ = {(i, j) : i, j ∈ V and i ≠ j}
Thus, the cost function is defined as follows −

t(i, j) = 0 if (i, j) ∈ E, and t(i, j) = 1 otherwise.

Now, suppose that a Hamiltonian cycle h exists in G. It is clear that the cost of each
edge in h is 0 in G' as each edge belongs to E. Therefore, h has a cost of 0 in G'.
Thus, if graph G has a Hamiltonian cycle, then graph G' has a tour of 0 cost.
Conversely, we assume that G' has a tour h' of cost at most 0. The cost of edges
in E' are 0 and 1 by definition. Hence, each edge must have a cost of 0 as the cost
of h' is 0. We therefore conclude that h' contains only edges in E.
We have thus proven that G has a Hamiltonian cycle if and only if G' has a tour of
cost at most 0. Therefore, TSP is NP-complete.
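The construction used in the reduction above can be sketched in code. This is a minimal illustration; the helper name `hamiltonian_to_tsp` is ours, and vertices are assumed to be numbered 0..n−1:

```python
def hamiltonian_to_tsp(n, edges):
    """Build the TSP instance of the reduction: a complete graph on
    the same vertices with cost 0 on original edges and cost 1 on
    the added edges. G has a Hamiltonian cycle iff this instance
    has a tour of cost at most 0."""
    edge_set = {frozenset(e) for e in edges}
    cost = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                cost[i][j] = 0 if frozenset((i, j)) in edge_set else 1
    return cost
```

The construction itself clearly runs in polynomial time, as the reduction requires.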

COOK’S THEOREM :
Stephen Cook presented four theorems in his paper “The Complexity of Theorem
Proving Procedures”. These theorems are stated below. Many unfamiliar terms are
used in this chapter, but we don’t have any scope to discuss everything in detail.
Following are the four theorems by Stephen Cook −
Theorem-1
If a set S of strings is accepted by some non-deterministic Turing machine within
polynomial time, then S is P-reducible to {DNF tautologies}.
Theorem-2
The following sets are P-reducible to each other in pairs (and hence each has the
same polynomial degree of difficulty): {tautologies}, {DNF tautologies}, D3, {sub-
graph pairs}.
Theorem-3
 For any TQ(k) of type Q, TQ(k) / k^(√(log k)) is unbounded.
 There is a TQ(k) of type Q such that TQ(k) ≤ 2^(k(log k)²).

Theorem-4
If the set S of strings is accepted by a non-deterministic machine within time T(n) =
2^n, and if TQ(k) is an honest (i.e. real-time countable) function of type Q, then there is
a constant K, so S can be recognized by a deterministic machine within
time TQ(K·8^n).
 First, he emphasized the significance of polynomial time reducibility. It means
that if we have a polynomial time reduction from one problem to another, this
ensures that any polynomial time algorithm from the second problem can be
converted into a corresponding polynomial time algorithm for the first
problem.
 Second, he focused attention on the class NP of decision problems that can
be solved in polynomial time by a non-deterministic computer. Most of the
intractable problems belong to this class, NP.
 Third, he proved that one particular problem in NP has the property that every
other problem in NP can be polynomially reduced to it. If the satisfiability
problem can be solved with a polynomial time algorithm, then every problem
in NP can also be solved in polynomial time. If any problem in NP is
intractable, then satisfiability problem must be intractable. Thus, satisfiability
problem is the hardest problem in NP.
 Fourth, Cook suggested that other problems in NP might share with the
satisfiability problem this property of being the hardest member of NP.

Approximate Algorithms
Introduction:
An approximation algorithm is a way of approaching NP-completeness for an
optimization problem. This technique does not guarantee the best solution. The goal
of an approximation algorithm is to come as close as possible to the optimum value
in a reasonable amount of time, which is at most polynomial time. Such
algorithms are called approximation algorithms or heuristic algorithms.
o For the traveling salesperson problem, the optimization problem is to find the
shortest cycle, and the approximation problem is to find a short cycle.

o For the vertex cover problem, the optimization problem is to find the vertex cover
with fewest vertices, and the approximation problem is to find the vertex cover with
few vertices.

Performance Ratios
Suppose we work on an optimization problem where every solution carries a cost. An
Approximate Algorithm returns a legal solution, but the cost of that legal solution
may not be optimal.
      For example, suppose we are considering a minimum-size vertex cover
(VC). An approximation algorithm returns a VC for us, but the size (cost) may not be
minimum.
      Another example: suppose we are considering a maximum-size independent set
(IS). An approximation algorithm returns an IS for us, but the size (cost) may not be
maximum. Let C be the cost of the solution returned by an approximation algorithm,
and C* the cost of the optimal solution.
We say the approximation algorithm has an approximation ratio P(n) for an input of size n if
max(C/C*, C*/C) ≤ P(n)
Intuitively, the approximation ratio measures how bad the approximate solution is
compared with the optimal solution. A large (small) approximation ratio means
the solution is much worse than (more or less the same as) an optimal solution.
      Observe that P(n) is always ≥ 1; if the ratio does not depend on n, we may write
P. Therefore, a 1-approximation algorithm gives an optimal solution. Some problems
have polynomial-time approximation algorithms with small constant approximation
ratios, while others have best-known polynomial-time approximation algorithms
whose approximation ratios grow with n.
Approximation Algorithms:
Overview
An approximation algorithm is a way of dealing with NP-completeness for an optimization problem.
The goal of the approximation algorithm is to come as close as possible to the optimal solution
in polynomial time.
Features of Approximation Algorithm : 
Here, we will discuss the features of the Approximation Algorithm as follows.
 An approximation algorithm guarantees to run in polynomial time, though it does not
guarantee the most effective solution.
 An approximation algorithm guarantees to seek out a high-accuracy, high-quality
solution (say, within 1% of the optimum).
 Approximation algorithms are used to get an answer near the (optimal) solution of an
optimization problem in polynomial time.
Performance Ratios for approximation algorithms :
Here, we will discuss the performance ratios of the Approximation Algorithm as follows.
Scenario-1 :
1. Suppose that we are working on an optimization problem in which each potential solution
has a cost, and we wish to find a near-optimal solution. Depending on the problem, we
may define an optimal solution as one with maximum possible cost or one with minimum
possible cost, i.e., the problem can either be a maximization or a minimization problem.
2. We say that an algorithm for a problem has an approximation ratio of P(n) if, for any input
size n, the cost C of the solution produced by the algorithm is within a factor of P(n) of
the cost C* of an optimal solution, as follows:
max(C/C*, C*/C) ≤ P(n)
Scenario-2 :
If an algorithm reaches an approximation ratio of P(n), then we call it a P(n)-approximation algorithm.

 For a maximization problem, 0 < C ≤ C*, and the ratio C*/C gives the factor by which
the cost of an optimal solution is larger than the cost of the approximate solution.
 For a minimization problem, 0 < C* ≤ C, and the ratio C/C* gives the factor by which
the cost of an approximate solution is larger than the cost of an optimal solution.
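As a tiny worked sketch (our own illustration, not from the notes), the performance ratio can be computed directly from the two costs:

```python
def approximation_ratio(c, c_star):
    """Performance ratio max(C/C*, C*/C): always >= 1, and equal
    to 1 exactly when the returned solution is optimal."""
    return max(c / c_star, c_star / c)
```

For example, a vertex cover of size 10 against an optimum of size 5 gives a ratio of 2, so the algorithm that produced it behaves as a 2-approximation on that instance.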
Some examples of Approximation algorithm :
Here, we will discuss some examples of the Approximation Algorithm as follows.
1. The Vertex Cover Problem – 
In the vertex cover problem, the optimization problem is to select a minimum number of
vertices that should cover all the edges in a graph.
 
2. Travelling Salesman Problem –
In the Travelling Salesman Problem, the optimization problem is that the salesman has to
take a route that has a minimum cost.
 
3. The Set Covering Problem – 
This is an optimization problem that models many problems that require resources to be
allocated. Here, a logarithmic approximation ratio is used.
 
4. The Subset Sum Problem – 
In the Subset Sum problem, the optimization problem is to find a subset of {x1, x2, x3, …,
xn} whose sum is as large as possible but not larger than a target value t.
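Continuing example 1 above, the classic matching-based 2-approximation for vertex cover can be sketched as follows (a minimal sketch; the function name and edge-list encoding are our own):

```python
def vertex_cover_2approx(edges):
    """Classic 2-approximation for minimum vertex cover: repeatedly
    pick an edge with both endpoints uncovered and add both endpoints.
    The picked edges form a matching, and any cover must contain at
    least one endpoint of each matched edge, so the returned cover
    has at most twice the optimal size."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover
```

This is an example of a constant-ratio polynomial-time approximation algorithm, in contrast with the logarithmic ratio mentioned for set covering.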

Randomized Algorithms (Introduction and Analysis):


What is a Randomized Algorithm?
An algorithm that uses random numbers to decide what to do next anywhere in its logic is called a
Randomized Algorithm. For example, in Randomized Quick Sort, we use a random number to pick
the next pivot (or we randomly shuffle the array). And in Karger’s algorithm, we randomly pick an
edge. 
How to analyse Randomized Algorithms?
Some randomized algorithms have deterministic time complexity. For example, this implementation
of Karger’s algorithm has time complexity is O(E). Such algorithms are called Monte Carlo
Algorithms and are easier to analyse for worst case. 
On the other hand, time complexity of other randomized algorithms (other than Las Vegas) is
dependent on value of random variable. Such Randomized algorithms are called Las Vegas
Algorithms. These algorithms are typically analysed for expected worst case. To compute expected
time taken in worst case, all possible values of the used random variable needs to be considered in
worst case and time taken by every possible value needs to be evaluated. Average of all evaluated
times is the expected worst case time complexity. Below facts are generally helpful in analysis os
such algorithms. 
Linearity of Expectation 
Expected Number of Trials until Success. 
For example, consider the randomized version of QuickSort below. 
A Central Pivot is a pivot that divides the array in such a way that each side has at least 1/4 of the elements. 
// Sorts an array arr[low..high]
randQuickSort(arr[], low, high)

1. If low >= high, then EXIT.

2. While pivot 'x' is not a Central Pivot:
   (i)   Choose uniformly at random an index from [low..high].
         Let the randomly picked index be x.
   (ii)  Count elements in arr[low..high] that are smaller
         than arr[x]. Let this count be sc.
   (iii) Count elements in arr[low..high] that are greater
         than arr[x]. Let this count be gc.
   (iv)  Let n = (high-low+1). If sc >= n/4 and
         gc >= n/4, then x is a central pivot.

3. Partition arr[low..high] around the pivot arr[x].

4. // Recur for smaller elements
   randQuickSort(arr, low, low+sc-1)

5. // Recur for greater elements
   randQuickSort(arr, high-gc+1, high)

The important thing in our analysis is that the time taken by step 2 is O(n). 
How many times does the while loop run before finding a central pivot? 
At least half of the elements (those whose rank lies between n/4 and 3n/4) are central
pivots, so the probability that the randomly chosen element is a central pivot is at least 1/2. 
Therefore, the expected number of times the while loop runs is at most 2. 
Thus, the expected time complexity of step 2 is O(n). 
What is the overall time complexity in the worst case? 
In the worst case, each partition divides the array such that one side has n/4 elements and the
other side has 3n/4 elements. The worst-case height of the recursion tree is log base 4/3 of n,
which is O(log n). 
T(n) < T(n/4) + T(3n/4) + O(n)
T(n) < 2T(3n/4) + O(n)
The solution of the above recurrence is O(n log n).
Note that the above randomized algorithm is not the best way to implement randomized Quick Sort;
the idea here is to keep the analysis simple. 
Typically, randomized Quick Sort is implemented by randomly picking a pivot (with no loop), or by
shuffling the array elements. The expected worst-case time complexity of that version is also
O(n log n), but its analysis is more involved.
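The typical loop-free variant described above can be sketched as follows (a minimal Python sketch of random-pivot QuickSort, not the notes' central-pivot pseudocode):

```python
import random

def rand_quicksort(arr):
    """Typical randomized Quick Sort: the pivot is chosen uniformly
    at random (no central-pivot loop); expected running time is
    O(n log n). Returns a new sorted list."""
    if len(arr) <= 1:
        return list(arr)
    pivot = random.choice(arr)
    smaller = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    greater = [x for x in arr if x > pivot]
    return rand_quicksort(smaller) + equal + rand_quicksort(greater)
```

Whatever random choices are made, the output is always the sorted array; only the running time is a random variable, which is what makes this a Las Vegas algorithm.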

Randomized Algorithm:

 In addition to the input, the algorithm uses a source of pseudo random numbers. During
execution, it takes random choices depending on those random numbers.
 The behavior (output) can vary if the algorithm is run multiple times on the same input.

Advantages of Randomized Algorithms :

 The algorithm is usually simple and easy to implement.
 The algorithm is fast with very high probability, and/or it produces optimum output
with very high probability.

Difficulties :

 There is a finite probability of getting an incorrect answer. However, the probability of
getting a wrong answer can be made arbitrarily small by the repeated employment of
randomness.
 Analysis of the running time or of the probability of getting a correct answer is usually
difficult.
 Getting truly random numbers is impossible. One needs to depend on pseudo-random
numbers, so the result highly depends on the quality of the random numbers.

An Important Note :
Randomized algorithms should not be confused with the probabilistic analysis of the expected
running time of a deterministic algorithm, where

 The inputs are assumed to come from a probability distribution.


 The objective is to compute the expected running time of the algorithm.
