DAA KTU Notes
3+1+0
Module 2 Master's Theorem (proof not required) – examples, Asymptotic Notations and
their properties – Application of Asymptotic Notations in Algorithm Analysis –
Common Complexity Functions.
AVL Trees – rotations, Red-Black Trees – insertion and deletion (techniques only;
algorithm not expected). B-Trees – insertion and deletion operations. Sets – Union
and Find operations on disjoint sets.
Module 3 Graphs
DFS and BFS traversals, complexity, Spanning trees- Minimum Cost Spanning Trees,
single source shortest path algorithms, Topological sorting, strongly connected
components.
Module 4
The Control Abstraction, 2-way Merge Sort, Strassen's Matrix Multiplication, Analysis.
Module 5
Module 6
Backtracking – The Control Abstraction – The N Queens Problem, 0/1 Knapsack Problem.
Text Book
References
1. Computer Algorithms – Introduction to Design and Analysis - Sara Baase & Allen Van
Gelder, Pearson Education
MODULE 1
ALGORITHM
Informal Definition:
An algorithm is any well-defined computational procedure that takes some values as input and produces some values as output; it is a step-by-step procedure for solving a given problem.
Formal Definition:
An algorithm is a finite set of instructions that, if followed, accomplishes a particular task.
Properties of an Algorithm
1. Input: zero or more quantities are externally supplied.
2. Output: at least one quantity is produced.
3. Definiteness: each instruction is clear and unambiguous.
4. Finiteness: the algorithm terminates after a finite number of steps.
5. Effectiveness: every instruction must be basic enough to be carried out in principle by a person using pencil and paper.
Development of an Algorithm (ways of specifying an algorithm)
1. Natural Language:
When this way is selected, we should ensure that each and every statement is definite.
2. Flowchart:
A graphic representation of the algorithm. This method will work well when the algorithm is small and simple.
3. Pseudo-code Method:
PSEUDO-CODE CONVENTIONS:
1. Comments begin with // and continue until the end of the line.
2. Blocks are indicated with matching braces: { and }.
3. An identifier begins with a letter. The data types of variables are not explicitly declared.
Compound data types can be formed with records. For example:
node = record
{
    datatype_1 data_1;
    ...
    datatype_n data_n;
    node *link;
}
Here link is a pointer to the record type node. Individual data items of a record
can be accessed with -> and period (.).
The following looping statements are employed: for, while and repeat-until.
While Loop:
while <condition> do
{
    <statement-1>
    ...
    <statement-n>
}
For Loop:
for variable := value1 to value2 step step do
{
    <statement-1>
    ...
    <statement-n>
}
repeat-until:
repeat
    <statement-1>
    ...
    <statement-n>
until <condition>
A conditional statement has the following forms:
If <condition> then <statement>
If <condition> then <statement-1> Else <statement-2>
Case statement:
Case
{
    : <condition-1> : <statement-1>
    ...
    : <condition-n> : <statement-n>
    : else : <statement-n+1>
}
9. Input and output are done using the instructions read & write.
Example
1. algorithm Max(A,n)
2. // A is an array of size n
3. {
4. Result := A[1];
5. for I:= 2 to n do
6. if A[I] > Result then
7. Result :=A[I];
8. return Result;
9. }
PERFORMANCE ANALYSIS:
1. Space Complexity:
The space complexity of an algorithm is the amount of memory it needs to
run to completion.
2. Time Complexity:
The time complexity of an algorithm is the amount of computer time it
needs to run to completion.
Space Complexity:
Example 1:
Algorithm abc(a,b,c)
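(The body of this algorithm did not survive extraction. A minimal sketch, assuming the classic textbook example of a straight-line computation whose space needs do not depend on the input size:)
{
    return a + b + b*c + (a + b - c)/(a + b) + 4.0;
}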
The space needed by each of these algorithms is seen to be the sum of the following
components:
1. A fixed part that is independent of the characteristics (e.g., number, size) of the inputs and
outputs. This part typically includes the instruction space (i.e., space for the code), space for simple
variables and fixed-size component variables (also called aggregates), space for constants, and
so on.
2. A variable part that consists of the space needed by component variables whose size
is dependent on the particular problem instance being solved, the space needed by referenced
variables (to the extent that it depends on instance characteristics), and the recursion stack
space.
The space requirement S(P) of any algorithm P may therefore be written as
S(P) = c + S_P(instance characteristics), where c is a constant.
Example 2:
Algorithm sum(a, n)
{
    s := 0.0;
    for i := 1 to n do
        s := s + a[i];
    return s;
}
Time Complexity:
The time T(P) taken by a program P is the sum of the compile time and the run
time (execution time).
The compile time does not depend on the instance characteristics. Also, we may assume
that a compiled program will be run several times without recompilation. This run time is
denoted by tP(instance characteristics).
For iterative statements such as for, while and repeat-until, the step count is assigned to the control part of the
statement.
The step count of a program can be determined in two ways.
1. We introduce a variable, count, into the program, with initial value 0. Statements to increment count by the appropriate amount are introduced into the
program.
This is done so that each time a statement in the original program is
executed, count is incremented by the step count of that statement.
Eg:-
Algorithm sum(a, n)
{
    s := 0.0;
    count := count + 1;      // count is global, initially 0; this is for the assignment
    for i := 1 to n do
    {
        count := count + 1;  // for the for statement
        s := s + a[i];
        count := count + 1;  // for the assignment
    }
    count := count + 1;      // for the last execution of the for statement
    count := count + 1;      // for the return
    return s;
}
If the count is zero to start with, then it will be 2n+3 on termination. So each invocation
of sum executes a total of 2n+3 steps.
2. The second method is to build a table in which we list the total number of steps contributed by each statement.
First determine the number of steps per execution (s/e) of the statement and the total number
of times (frequency) each statement is executed. By combining these two quantities, the total
contribution of each statement is obtained, and by adding the contributions of all statements, the
step count of the entire algorithm is obtained.

Statement                        s/e   Frequency   Total steps
1. Algorithm sum(a, n)            0        -            0
2. {                              0        -            0
3.     s := 0.0;                  1        1            1
4.     for i := 1 to n do         1      n + 1        n + 1
5.         s := s + a[i];         1        n            n
6.     return s;                  1        1            1
7. }                              0        -            0
Total                                                 2n + 3
ANALYSIS
The worst-case complexity of the algorithm is the function defined by the maximum
number of steps taken on any instance of size n. It represents the curve passing through the
highest point of each column.
The best-case complexity of the algorithm is the function defined by the minimum number
of steps taken on any instance of size n. It represents the curve passing through the lowest
point of each column. For example, the best case for a simple linear search on a list occurs
when the desired element is the first element of the list.
Finally, the average-case complexity of the algorithm is the function defined by the average
number of steps taken on any instance of size n.
Complexity:
Complexity refers to the rate at which the required storage or computing time grows as a function of the
problem size.
RECURRENCE RELATIONS
The recurrence relation is an equation or inequality that describes a function in terms of its
values of smaller inputs. The main tool for analyzing the time efficiency of a recurrence
algorithm is to setup a sum expressing the number executions of its basic operation and
ascertain the solution‗s order of growth.
To solve a recurrence relation means, to obtain a function defined on natural numbers that
satisfies the recurrence.
The main methods for solving recurrences are:
– Substitution method
– Master method
– Iteration method
A different way to look at the iteration method is the recursion tree. We draw out the
recursion tree with the cost of a single call in each node; the running time is the sum of the costs in all nodes.
If you are careful drawing the recursion tree and summing up the costs, the recursion tree is a
direct proof for the solution of the recurrence, just like iteration and substitution.
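As a worked illustration of the iteration (recursion-tree) method, consider the following recurrence (chosen here as an example; it is not reconstructed from the original slide):

\[
\begin{aligned}
T(n) &= 2T(n/2) + cn, \qquad T(1) = c \\
     &= 4T(n/4) + 2cn \\
     &= 8T(n/8) + 3cn \\
     &\;\;\vdots \\
     &= 2^{k}\,T(1) + k\,cn \qquad (n = 2^{k},\; k = \log_2 n) \\
     &= cn + cn\log_2 n \;=\; \Theta(n\log n).
\end{aligned}
\]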
Amortized analysis
Amortized analysis is a method of analyzing algorithms that considers the entire sequence of
operations of the program to show that the average cost per operation is small, even though a
single operation within the sequence might be expensive.
An amortized analysis guarantees the average performance of each operation in the worst
case.
Amortized analysis is not just an analysis tool, it is also a way of thinking about designing
algorithms.
Note
• Show that for all n, a sequence of n operations take worst-case time T(n) in total
• In the worst case, the average cost, or amortized cost , per operation is T(n)/n.
• The amortized cost applies to each operation, even when there are several types of
operations in the sequence.
Example
• Stack operations:
– PUSH(S, x): pushes object x onto stack S, O(1).
– POP(S): pops the top of stack S and returns it, O(1).
– MULTIPOP(S, k): pops the top k objects of S (or the whole stack if it has fewer than k objects):
      MULTIPOP(S, k)
          while not STACK-EMPTY(S) and k > 0
              do POP(S)
                 k := k - 1
• Let us consider a sequence of n PUSH, POP and MULTIPOP operations.
– The worst case cost for a single MULTIPOP in the sequence is O(n), since the stack
size is at most n.
– Thus a naive bound on the cost of the sequence is n · O(n) = O(n²). Correct, but not tight.
If any sequence of n operations on a data structure takes at most T(n) time in total, the amortized time per
operation is T(n)/n. For the stack example, T(n) = O(n), because each object can be popped at most once for each time it is pushed; so the amortized cost per operation is O(n)/n = O(1).
In aggregate analysis, all operations have the same amortized cost (total cost divided by n).
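A minimal C sketch of the stack with MULTIPOP (array-based; the names and sizes are illustrative, not from the original notes):

#include <stdio.h>

#define MAX 100

// A simple array-based stack.
typedef struct {
    int data[MAX];
    int top;            // index of the top element, -1 when empty
} Stack;

void push(Stack *s, int x) { s->data[++s->top] = x; }    // O(1)
int  pop(Stack *s)         { return s->data[s->top--]; } // O(1)

// Pops min(k, size) elements. A single call may cost O(n), but over any
// sequence of n operations each element is popped at most once per push,
// so the total cost of the whole sequence stays O(n).
void multipop(Stack *s, int k) {
    while (s->top >= 0 && k > 0) {
        pop(s);
        k--;
    }
}

int main(void) {
    Stack s = { .top = -1 };
    for (int i = 0; i < 10; i++) push(&s, i);   // 10 pushes
    multipop(&s, 4);                            // pops 4 elements
    printf("top after multipop: %d\n", s.data[s.top]);  // prints 5
    return 0;
}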
MODULE II
MASTER'S THEOREM
The master method is used for solving recurrences of the form
T(n) = a·T(n/b) + f(n), where a ≥ 1 and b > 1.
In the above recurrence the problem is divided into 'a' subproblems, each of size at most
'n/b'. The subproblems are solved recursively, each in T(n/b) time. The cost of splitting the
problem or combining the solutions of the subproblems is given by the function f(n). It should be noted
that f(n) must be asymptotically positive.
THEOREM
Let a ≥ 1 and b > 1 be constants, let f(n) be a function, and let T(n) be defined by the recurrence
T(n) = a·T(n/b) + f(n). Then T(n) can be bounded asymptotically as:
Case 1: If f(n) = O(n^(log_b a − ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
Case 2: If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) · log n).
Case 3: If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and if a·f(n/b) ≤ c·f(n) for some constant c < 1
and all sufficiently large n, then T(n) = Θ(f(n)).
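A few standard applications of the theorem, worked here as illustrations (they are not reproduced from the original notes):

\[
\begin{aligned}
T(n) &= 9T(n/3) + n: && n^{\log_3 9} = n^2,\; f(n) = n = O(n^{2-\varepsilon}) \Rightarrow T(n) = \Theta(n^2) \quad \text{(Case 1)} \\
T(n) &= T(2n/3) + 1: && n^{\log_{3/2} 1} = n^0 = 1,\; f(n) = \Theta(1) \Rightarrow T(n) = \Theta(\log n) \quad \text{(Case 2)} \\
T(n) &= 3T(n/4) + n\log n: && n^{\log_4 3} \approx n^{0.79},\; f(n) = \Omega(n^{0.79+\varepsilon}) \text{ and } 3f(n/4) \le \tfrac{3}{4}f(n) \Rightarrow T(n) = \Theta(n\log n) \quad \text{(Case 3)}
\end{aligned}
\]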
ASYMPTOTIC NOTATIONS
The purpose of the step count is to compare the time complexity of two programs that compute the same function and
also to predict the growth in run time as the instance characteristics change. Determining the exact
step count is difficult and not necessary either. Since the values are not exact quantities, we
need only comparative statements like c1·n² ≤ tP(n) ≤ c2·n². For example, consider two
programs with complexities c1·n² + c2·n and c3·n respectively. For small values of n, the
comparison depends upon the values of c1, c2 and c3. But there will also be an n beyond which the
complexity of c3·n is better than that of c1·n² + c2·n. This value of n is called the break-even point.
If this point is zero, c3·n is always faster (or at least as fast).
1. Big-O Notation
This notation gives an upper bound for a function to within a constant factor. We
write f(n) = O(g(n)) if there are positive constants n0 and c such that to the right of n0,
the value of f(n) always lies on or below c·g(n).
Big-O, commonly written as O, is an Asymptotic Notation for the worst case, or ceiling of
growth for a given function. It provides us with an asymptotic upper bound for the growth
rate of runtime of an algorithm. Say f(n) is your algorithm runtime, and g(n) is an
arbitrary time complexity you are trying to relate to your algorithm. f(n) is O(g(n)), if for
some real constants c (c > 0) and n0, f(n) <= c g(n) for every input size n (n > n0).
Linear Functions
Example 1
f(n) = 3n + 2
For n ≥ 2, 3n + 2 ≤ 3n + n = 4n, so f(n) = O(n) with c = 4 and n0 = 2.
For n ≥ 1, 3n + 2 ≤ 3n + 2n = 5n, so c = 5 and n0 = 1 also work.
Hence we can have different (c, n0) pairs satisfying the definition for a given function.
Example 2
f(n) = 10n² + 4n + 2
For n ≥ 5, 4n + 2 ≤ n², so f(n) ≤ 11n²; hence f(n) = O(n²) with c = 11 and n0 = 5.
Example 3
f(n) = 6·2^n + n²
For n ≥ 4, n² ≤ 2^n, so f(n) ≤ 7·2^n; hence f(n) = O(2^n) with c = 7 and n0 = 4.
Constant Functions
Example: f(n) = 100. For all n, f(n) ≤ 100 · 1, so f(n) = O(1); any function bounded by a constant is O(1).
2. Big-Omega Notation
For non-negative functions, f(n) and g(n), if there exists an integer n0 and a constant c > 0
such that for all integers n > n0, f(n) ≥ cg(n), then f(n) is omega of g(n). This is denoted as
"f(n) = Ω(g(n))".
f(n) is Ω(g(n)), if for some real constants c (c > 0) and n0 (n0 > 0), f(n) is >= c g(n) for
every input size n (n > n0).
This notation gives a lower bound for a function to within a constant factor. We write f(n)
= Ω(g(n)) if there are positive constants n0 and c such that to the right of n0, the value of
f(n) always lies on or above cg(n).
3. Theta (Θ) Notation
For non-negative functions f(n) and g(n), f(n) = Θ(g(n)) if there exist positive constants c1, c2 and n0 such that,
for all values of n to the right of n0, f(n) lies on or above c1·g(n) and on or below c2·g(n).
The theta notation is more precise than both the big-oh and big-omega notations: the
function f(n) = Θ(g(n)) iff g(n) is both a lower and an upper bound of f(n).
Example 1
f(n) = 3n + 2
For n ≥ 2, 3n ≤ 3n + 2 ≤ 4n, so f(n) = Θ(n) with c1 = 3, c2 = 4 and n0 = 2.
4. Little-oh (o) Notation
It defines an asymptotic upper bound that is not tight. The main difference with Big-Oh is that Big-Oh requires
f(n) ≤ c·g(n) for some constant c, whereas little-oh requires f(n) < c·g(n) for every positive constant c:
o(g(n)) = { f(n) : for any positive constant c > 0 there exists n0 > 0 such that 0 ≤ f(n) < c·g(n) for all n ≥ n0 }.
5. Little-omega (ω) Notation
It defines an asymptotic lower bound that is not tight. The main difference with Ω is that ω requires the bound to hold for every positive constant c:
ω(g(n)) = { f(n) : for any positive constant c > 0 there exists n0 > 0 such that 0 ≤ c·g(n) < f(n) for all n ≥ n0 }.
Temporal comparison is not the only issue in algorithms. There are space issues as well.
Generally, a tradeoff between time and space is noticed in algorithms. Asymptotic notation
empowers you to make that trade off. If you think of the amount of time and space your
algorithm uses as a function of your data over time or space (time and space are usually
analyzed separately), you can analyze how the time and space is handled when you introduce
more data to your program.
This is important in data structures because you want a structure that behaves efficiently as
you increase the amount of data it handles. Keep in mind, though, that algorithms that are
efficient with large amounts of data are not always simple and efficient for small amounts of
data. So if you know you are working with only a small amount of data and you have
concerns for speed and code space, a trade-off can be made for a function that does not scale
as well asymptotically but is simple and fast for small inputs.
Generally, we use asymptotic notation as a convenient way to examine what can happen in a
function in the worst case or in the best case. For example, if you want to write a function
that searches through an array of numbers and returns the smallest one:
function find-min(array a[1..n])
    let j := ∞
    for i := 1 to n:
        j := min(j, a[i])
    repeat
    return j
end
Regardless of how big or small the array is, every time we run find-min, we have to initialize
the i and j integer variables and return j at the end. Therefore, we can just think of those parts
of the function as constant and ignore them.
So, how can we use asymptotic notation to discuss the find-min function? If we search
through an array with 87 elements, then the for loop iterates 87 times, even if the very first
element we hit turns out to be the minimum. Likewise, for n elements, the for loop iterates n
times. Therefore we say the function runs in time O(n).
// This function finds the smallest and largest integers in an array and returns their sum
function find-min-plus-max(array a[1..n])
    // First, find the smallest element in the array
    let j := ∞
    for i := 1 to n:
        j := min(j, a[i])
    repeat
    let minim := j

    // Now, find the largest element
    j := -∞
    for i := 1 to n:
        j := max(j, a[i])
    repeat
    let maxim := j

    // return the sum of the two
    return minim + maxim
end
What's the running time for find-min-plus-max? There are two for loops, that each iterate n
times, so the running time is clearly O(2n). Because 2 is a constant, we throw it away and
write the running time as O(n). Why can you do this? If you recall the definition of Big-O
notation, the function whose bound you're testing can be multiplied by some constant. If
f(x) = 2x, we can see that if g(x) = x, then the Big-O condition holds. Thus O(2n) = O(n). This
rule is general for the various asymptotic notations.
Comparison of functions
Many of the relational properties of real numbers apply to asymptotic comparisons as well.
For the following, assume that f(n) and g(n) are asymptotically positive.
Transitivity:
f(n) = Θ(g(n)) and g(n) = Θ(h(n)) imply f(n) = Θ(h(n)); the same holds for O, Ω, o and ω.
Reflexivity:
f(n) = Θ(f(n)),
f(n) = O(f(n)),
f(n) = Ω(f(n)).
Symmetry:
f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)).
AVL TREES
AVL tree is a self-balancing Binary Search Tree (BST) where the difference between heights
of left and right subtrees cannot be more than one for all nodes.
The above tree is AVL because the difference between the heights of the left and right subtrees for
every node is less than or equal to 1.
The above tree is not AVL because the differences between the heights of the left and right subtrees for
nodes 8 and 18 are greater than 1.
The height of an AVL tree is always O(log n), where n is the number of nodes in the tree.
Insertion
To make sure that the given tree remains AVL after every insertion, we must augment the
standard BST insert operation to perform some re-balancing. Following are two basic
operations that can be performed to re-balance a BST without violating the BST property
(keys(left) < key(root) < keys(right)). 1) Left Rotation 2) Right Rotation
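A minimal C sketch of the two rotations (the node structure and helper names are illustrative, not from the original notes):

// AVL node: key, children and cached height.
struct Node {
    int key;
    struct Node *left, *right;
    int height;
};

static int height(struct Node *n) { return n ? n->height : 0; }
static int max(int a, int b)      { return a > b ? a : b; }

// Right-rotate the subtree rooted at y (used for the Left-Left case).
struct Node *rightRotate(struct Node *y) {
    struct Node *x  = y->left;
    struct Node *T2 = x->right;

    x->right = y;          // perform rotation
    y->left  = T2;

    y->height = max(height(y->left), height(y->right)) + 1;  // update heights
    x->height = max(height(x->left), height(x->right)) + 1;
    return x;              // new root of this subtree
}

// Left-rotate the subtree rooted at x (used for the Right-Right case).
struct Node *leftRotate(struct Node *x) {
    struct Node *y  = x->right;
    struct Node *T2 = y->left;

    y->left  = x;          // perform rotation
    x->right = T2;

    x->height = max(height(x->left), height(x->right)) + 1;  // update heights
    y->height = max(height(y->left), height(y->right)) + 1;
    return y;              // new root of this subtree
}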
After an insertion, the first unbalanced node z on the path from the newly inserted node to the root falls into one of
four cases, depending on where the new node lies: Left-Left, Left-Right, Right-Right or Right-Left. Following are the
operations to be performed in the above-mentioned 4 cases (a single or a double rotation at z). In all of the cases,
we only need to re-balance the subtree rooted with z, and the complete tree becomes balanced,
as the height of the subtree rooted with z (after the appropriate rotations) becomes the same as it was
before the insertion.
RED-BLACK TREES
A Red-Black Tree is a self-balancing Binary Search Tree (BST) where every node follows the following rules:
1) Every node has a colour, either red or black.
2) The root of the tree is always black.
3) There are no two adjacent red nodes (a red node cannot have a red parent or a red child).
4) Every path from a node to any of its descendant NULL nodes has the same number of black nodes.
Why Red-Black Trees?
Most of the BST operations (e.g., search, max, min, insert, delete.. etc) take O(h) time where
h is the height of the BST. The cost of these operations may become O(n) for a skewed
Binary tree. If we make sure that height of the tree remains O(Logn) after every insertion and
deletion, then we can guarantee an upper bound of O(Logn) for all these operations. The
height of a Red Black tree is always O(Logn) where n is the number of nodes in the tree.
From the above examples, we get some idea how Red-Black trees ensure balance. Following
is an important fact about balancing in Red-Black Trees.
Every Red Black Tree with n nodes has height <= 2Log2(n+1)
1) For a general Binary Tree, let k be the minimum number of nodes on all root-to-NULL
paths; then n ≥ 2^k − 1 (e.g., if k is 3, then n is at least 7). This expression can also be written
as k ≤ Log2(n+1).
2) From property 4 of Red-Black trees and above claim, we can say in a Red-Black Tree with
n nodes, there is a root to leaf path with at-most Log2(n+1) black nodes.
3) From property 3 of Red-Black trees, we can claim that the number of black nodes in a Red-
Black tree is at least ⌊ n/2 ⌋, where n is the total number of nodes.
From above 2 points, we can conclude the fact that Red Black Tree with n nodes has height
<= 2Log2(n+1)
So far we have introduced Red-Black trees and discussed how balance is ensured. The hard
part is to maintain balance when keys are added and removed; insertion and deletion are discussed next.
In AVL tree insertion, we used rotation as a tool to do balancing after insertion caused
imbalance. In Red-Black tree, we use two tools to do balancing.
1) Recoloring
2) Rotation
We try recoloring first; if recoloring doesn't work, then we go for rotation. Following is the
detailed algorithm. The algorithm has mainly two cases depending upon the colour of the uncle.
If the uncle is red, we do recoloring. If the uncle is black, we do rotations and/or recoloring.
1) Perform standard BST insertion and make the colour of the newly inserted node RED.
2) If x is the root, change the colour of x to BLACK (the black height of the complete tree increases by 1).
3) Do the following if the colour of x's parent is not BLACK and x is not the root:
….a) If x's uncle is RED (the grandparent must have been black, since two adjacent red nodes are not allowed):
……..(i) Change the colour of the parent and the uncle to BLACK.
……..(ii) Change the colour of the grandparent to RED.
……..(iii) Change x = x's grandparent, and repeat steps 2 and 3 for the new x.
….b) If x's uncle is BLACK, then there can be four configurations for x, x's parent (p) and
x's grandparent (g) (this is similar to the AVL tree cases): Left-Left, Left-Right, Right-Right and Right-Left,
each handled with the appropriate rotation(s) and recolouring.
Insertion Vs Deletion:
Like Insertion, recoloring and rotations are used to maintain the Red-Black properties.
In insert operation, we check color of uncle to decide the appropriate case. In delete
operation, we check color of sibling to decide the appropriate case.
The main property that violates after insertion is two consecutive reds. In delete, the main
violated property is, change of black height in subtrees as deletion of a black node may cause
reduced black height in one root to leaf path.
Deletion is a fairly complex process. To understand deletion, the notion of double black is used.
When a black node is deleted and replaced by a black child, the child is marked as double
black. The main task now becomes to convert this double black to a single black.
Deletion Steps
1) Perform standard BST delete. When we perform standard delete operation in BST, we
always end up deleting a node which is either leaf or has only one child (For an internal node,
we copy the successor and then recursively call delete for successor, successor is always a
leaf node or a node with one child). So we only need to handle cases where a node is leaf or
has one child. Let v be the node to be deleted and u be the child that replaces v (Note that u is
NULL when v is a leaf and color of NULL is considered as Black).
2) Simple Case: If either u or v is red, we mark the replaced child as black (No change in
black height). Note that both u and v cannot be red as v is parent of u and two consecutive
reds are not allowed in red-black tree.
3.1) Color u as double black. Now our task reduces to convert this double black to single
black. Note that If v is leaf, then u is NULL and color of NULL is considered as black. So the
deletion of a black leaf also causes a double black.
3.2) Do following while the current node u is double black and it is not root. Let sibling of
node be s.
….(a): If sibling s is black and at least one of sibling’s children is red, perform rotation(s).
Let the red child of s be r. This case can be divided in four subcases depending upon
positions of s and r.
…………..(i) Left Left Case (s is left child of its parent and r is left child of s or both children
of s are red). This is mirror of right right case shown in below diagram.
…………..(ii) Left Right Case (s is left child of its parent and r is right child). This is mirror
of right left case shown in below diagram.
…………..(iii) Right Right Case (s is right child of its parent and r is right child of s or both
children of s are red)
…………..(iv) Right Left Case (s is right child of its parent and r is left child of s)
…..(b): If the sibling is black and both its children are black, perform recoloring, and
recur for the parent if the parent is black.
In this case, if the parent was red, then we don't need to recur for the parent; we can simply make it
black (red + double black = single black).
…..(c): If sibling is red, perform a rotation to move old sibling up, recolor the old sibling
and parent. The new sibling is always black (See the below diagram). This mainly converts
the tree to black sibling case (by rotation) and leads to case (a) or (b). This case can be
divided in two subcases.
…………..(i) Left Case (s is the left child of its parent). This is the mirror of the right case shown
in the diagram below. We right rotate the parent p.
…………..(ii) Right Case (s is the right child of its parent). We left rotate the parent p.
3.3) If u is root, make it single black and return (Black height of complete tree reduces by 1).
B- Trees
B-Tree is a self-balancing search tree. In most of the other self-balancing search trees
(like AVL and Red Black Trees), it is assumed that everything is in main memory. To
understand the use of B-Trees, we must think of the huge amounts of data that cannot fit in main
memory. When the number of keys is high, the data is read from disk in the form of blocks.
Disk access time is very high compared to main memory access time. The main idea of using
B-Trees is to reduce the number of disk accesses. Most of the tree operations (search, insert,
delete, max, min, ..etc ) require O(h) disk accesses where h is height of the tree. B-tree is a fat
tree. Height of B-Trees is kept low by putting maximum possible keys in a B-Tree node.
Generally, a B-Tree node size is kept equal to the disk block size. Since h is low for B-Tree,
total disk accesses for most of the operations are reduced significantly compared to balanced
Binary Search Trees like AVL Tree, Red Black Tree, ..etc.
Properties of B-Tree
1) All leaves are at the same level.
2) A B-Tree is defined by a term minimum degree 't'; the value of t depends upon the disk block size.
3) Every node except the root must contain at least t-1 keys; the root may contain a minimum of 1 key.
4) All nodes (including the root) may contain at most 2t – 1 keys.
5) The number of children of a node is equal to the number of keys in it plus 1.
6) All keys of a node are sorted in increasing order; the child between two keys k1 and k2 contains all keys in the range from k1 to k2.
7) Unlike a Binary Search Tree, a B-Tree grows and shrinks from the root.
8) Like other balanced Binary Search Trees, the time complexity to search, insert and delete is O(log n).
Search
Search is similar to search in Binary Search Tree. Let the key to be searched be k. We start
from the root and recursively traverse down. For every visited non-leaf node, if the node contains the key k,
we simply return the node. Otherwise we recur down to the appropriate child (the child
which is just before the first key greater than k) of the node. If we reach a leaf node and don't find k
in the leaf node, we return NULL.
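A minimal C sketch of B-Tree search, assuming a node layout with a sorted key array, child pointers, a key count and a leaf flag (the names and the minimum degree t = 3 are illustrative, not from the original notes):

#include <stddef.h>

#define MAX_KEYS 5   /* 2t - 1 with minimum degree t = 3 */

struct BTreeNode {
    int keys[MAX_KEYS];                    // keys stored in increasing order
    struct BTreeNode *child[MAX_KEYS + 1]; // child pointers (one more than keys)
    int n;                                 // number of keys currently stored
    int leaf;                              // 1 if the node is a leaf
};

// Returns the node containing k, or NULL if k is not in the tree.
struct BTreeNode *btreeSearch(struct BTreeNode *x, int k) {
    int i = 0;
    while (i < x->n && k > x->keys[i])     // find the first key >= k
        i++;
    if (i < x->n && x->keys[i] == k)       // found in this node
        return x;
    if (x->leaf)                           // nowhere left to descend
        return NULL;
    return btreeSearch(x->child[i], k);    // recurse into the child just before
                                           // the first greater key
}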
Traverse
Traversal is also similar to Inorder traversal of Binary Tree. We start from the leftmost child,
recursively print the leftmost child, then repeat the same process for remaining children and
keys. In the end, recursively print the rightmost child.
Insert
The insert() operation is discussed next. A new key is always inserted at a leaf node. Let the
key to be inserted be k. Like in a BST, we start from the root and traverse down till we reach a leaf
node. Once we reach a leaf node, we insert the key in that leaf node. Unlike BSTs, we have a
predefined range on the number of keys that a node can contain. So before inserting a key into a
node, we make sure that the node has extra space.
How do we make sure that a node has space available for a key before the key is inserted? We use
an operation called splitChild() that is used to split a child of a node. In the split, a full child y of x is
split into two nodes y and z, and the middle key of y moves up into x. Note that the splitChild operation
moves a key up, and this is the reason B-Trees grow up, unlike BSTs which grow down.
As discussed above, to insert a new key, we go down from root to leaf. Before traversing
down to a node, we first check if the node is full. If the node is full, we split it to create space.
Following is complete algorithm.
Insertion
1) Initialize x as root.
2) While x is not leaf, do following
..a) Find the child of x that is going to be traversed next. Let the child be y.
..b) If y is not full, change x to point to y.
..c) If y is full, split it and change x to point to one of the two parts of y. If k is smaller than
mid key in y, then set x as first part of y. Else second part of y. When we split y, we move a
key from y to its parent x.
3) The loop in step 2 stops when x is leaf. x must have space for 1 extra key as we have been
splitting all nodes in advance. So simply insert k to x.
Note that the algorithm follows the Cormen book. It is actually a proactive insertion
algorithm where before going down to a node, we split it if it is full. The advantage of
splitting before is, we never traverse a node twice. If we don‘t split a node before going down
to it and split it only if new key is inserted (reactive), we may end up traversing all nodes
again from leaf to root. This happens in cases when all nodes on the path from root to leaf are
full. So when we come to the leaf node, we split it and move a key up. Moving a key up will
cause a split in parent node (because parent was already full). This cascading effect never
happens in this proactive insertion algorithm. There is a disadvantage of this proactive
insertion though, we may do unnecessary splits.
Let us understand the algorithm with an example tree of minimum degree ‗t‘ as 3 and a
sequence of integers 10, 20, 30, 40, 50, 60, 70, 80 and 90 in an initially empty B-Tree.
Initially root is NULL. Let us first insert 10.
Let us now insert 20, 30, 40 and 50. They all will be inserted in root because maximum
number of keys a node can accommodate is 2*t – 1 which is 5.
Let us now insert 60. Since root node is full, it will first split into two, then 60 will be
inserted into the appropriate child.
Let us now insert 70 and 80. These new keys will be inserted into the appropriate leaf
without any split.
Let us now insert 90. This insertion will cause a split. The middle key will go up to the
parent.
Deletion Process
Deletion from a B-tree is more complicated than insertion, because we can delete a key from
any node-not just a leaf—and when we delete a key from an internal node, we will have to
rearrange the node‘s children.
As in insertion, we must make sure the deletion doesn‘t violate the B-tree properties. Just
as we had to ensure that a node didn‘t get too big due to insertion, we must ensure that a node
doesn‘t get too small during deletion (except that the root is allowed to have fewer than the
minimum number t-1 of keys). Just as a simple insertion algorithm might have to back up if a
node on the path to where the key was to be inserted was full, a simple approach to deletion
might have to back up if a node (other than the root) along the path to where the key is to be
deleted has the minimum number of keys.
The deletion procedure deletes the key k from the subtree rooted at x. This procedure
guarantees that whenever it calls itself recursively on a node x, the number of keys in x is at
least the minimum degree t . Note that this condition requires one more key than the
minimum required by the usual B-tree conditions, so that sometimes a key may have to be
moved into a child node before recursion descends to that child. This strengthened condition
allows us to delete a key from the tree in one downward pass without having to ―back up‖
(with one exception, which we‘ll explain). You should interpret the following specification
for deletion from a B-tree with the understanding that if the root node x ever becomes an
internal node having no keys (this situation can occur in cases 2c and 3b), then we delete x,
and x‘s only child x.c1 becomes the new root of the tree, decreasing the height of the tree by
one and preserving the property that the root of the tree contains at least one key (unless the
tree is empty).
We sketch how deletion works with various cases of deleting keys from a B-tree.
1. If the key k is in node x and x is a leaf, delete the key k from x.
2. If the key k is in node x and x is an internal node, do the following.
a) If the child y that precedes k in node x has at least t keys, then find the predecessor k0 of
k in the sub-tree rooted at y. Recursively delete k0, and replace k by k0 in x. (We can find k0
and delete it in a single downward pass.)
b) If y has fewer than t keys, then, symmetrically, examine the child z that follows k in
node x. If z has at least t keys, then find the successor k0 of k in the subtree rooted at z.
Recursively delete k0, and replace k by k0 in x. (We can find k0 and delete it in a single
downward pass.)
c) Otherwise, if both y and z have only t-1 keys, merge k and all of z into y, so that x loses
both k and the pointer to z, and y now contains 2t-1 keys. Then free z and recursively delete k
from y.
3. If the key k is not present in internal node x, determine the root x.c(i) of the appropriate
subtree that must contain k, if k is in the tree at all. If x.c(i) has only t-1 keys, execute step 3a
or 3b as necessary to guarantee that we descend to a node containing at least t keys. Then
finish by recursing on the appropriate child of x.
a) If x.c(i) has only t-1 keys but has an immediate sibling with at least t keys, give x.c(i) an
extra key by moving a key from x down into x.c(i), moving a key from x.c(i) ‘s immediate
left or right sibling up into x, and moving the appropriate child pointer from the sibling into
x.c(i).
b) If x.c(i) and both of x.c(i)‘s immediate siblings have t-1 keys, merge x.c(i) with one
sibling, which involves moving a key from x down into the new merged node to become the
median key for that node.
Since most of the keys in a B-tree are in the leaves, deletion operations are most often used to
delete keys from leaves. The recursive delete procedure then acts in one downward pass
through the tree, without having to back up. When deleting a key in an internal node,
however, the procedure makes a downward pass through the tree but may have to return to
the node from which the key was deleted to replace the key with its predecessor or successor
(cases 2a and 2b).
The figures in the CLRS book illustrate the deletion process with examples.
Sets
Refer text.
MODULE III
Graphs
A graph is a data structure that consists of the following two components:
1. A finite set of vertices, also called nodes.
2. A finite set of ordered pairs of the form (u, v), called edges.
The pair is ordered because (u, v) is not the same as (v, u) in the case of a directed graph (di-graph).
The pair of the form (u, v) indicates that there is an edge from vertex u to vertex v. The edges
may contain weight/value/cost.
Graphs are used to represent many real life applications: Graphs are used to represent
networks. The networks may include paths in a city or telephone network or circuit network.
Graphs are also used in social networks like LinkedIn and Facebook. For example, in Facebook,
each person is represented with a vertex (or node). Each node is a structure and contains
information like person id, name, gender and locale.
Following is an example undirected graph with 5 vertices.
Adjacency Matrix:
An adjacency matrix is a 2D array of size V x V, where V is the number of vertices in the graph. Let the array be adj[][];
a slot adj[i][j] = 1 indicates that there is an edge from vertex i to vertex j. The adjacency matrix of an undirected graph
is always symmetric, and it can also represent a weighted graph (adj[i][j] = w means the edge from i to j has weight w).
Pros: The representation is easy to implement and follow. Removing an edge takes O(1) time.
Queries like whether there is an edge from vertex 'u' to vertex 'v' are efficient and can be
done in O(1).
Cons: Consumes more space, O(V²). Even if the graph is sparse (contains fewer
edges), it consumes the same space. Adding a vertex takes O(V²) time.
Adjacency List:
An array of linked lists is used. Size of the array is equal to number of vertices. Let the array
be array[]. An entry array[i] represents the linked list of vertices adjacent to the ith vertex.
This representation can also be used to represent a weighted graph. The weights of edges can
be stored in nodes of linked lists. Following is adjacency list representation of the above
graph.
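A minimal C sketch of the adjacency-list representation (array of singly linked lists). The edge set in main is the usual 5-vertex example; since the figure referenced above did not survive, the exact edges are an assumption:

#include <stdio.h>
#include <stdlib.h>

// One entry in a vertex's adjacency list.
struct AdjNode {
    int dest;
    struct AdjNode *next;
};

// Graph: number of vertices and an array of list heads.
struct Graph {
    int V;
    struct AdjNode **head;
};

struct Graph *createGraph(int V) {
    struct Graph *g = malloc(sizeof *g);
    g->V = V;
    g->head = calloc(V, sizeof *g->head);   // all lists start empty
    return g;
}

// Adds an undirected edge u-v by inserting at the front of both lists.
void addEdge(struct Graph *g, int u, int v) {
    struct AdjNode *a = malloc(sizeof *a);
    a->dest = v; a->next = g->head[u]; g->head[u] = a;

    struct AdjNode *b = malloc(sizeof *b);
    b->dest = u; b->next = g->head[v]; g->head[v] = b;
}

int main(void) {
    struct Graph *g = createGraph(5);
    addEdge(g, 0, 1); addEdge(g, 0, 4);
    addEdge(g, 1, 2); addEdge(g, 1, 3); addEdge(g, 1, 4);
    addEdge(g, 2, 3); addEdge(g, 3, 4);

    for (int v = 0; v < g->V; v++) {          // print each adjacency list
        printf("%d:", v);
        for (struct AdjNode *p = g->head[v]; p; p = p->next)
            printf(" %d", p->dest);
        printf("\n");
    }
    return 0;
}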
Breadth First Traversal (BFS)
Breadth First Traversal (or Search) for a graph is similar to Breadth First Traversal of a tree. The
only catch here is that, unlike trees, graphs may contain cycles, so we may come to the same node
again; to avoid processing a node more than once, we use a boolean visited array.
For example, in the following graph, we start traversal from vertex 2. When we come to
vertex 0, we look for all adjacent vertices of it. 2 is also an adjacent vertex of 0. If we don't
mark visited vertices, then 2 will be processed again and it will become a non-terminating
process. A Breadth First Traversal of the following graph is 2, 0, 3, 1.
Depth First Traversal (DFS)
Depth First Traversal (or Search) for a graph is similar to Depth First Traversal of a tree. The
only catch here is that, unlike trees, graphs may contain cycles, so we may come to the same node
again. To avoid processing a node more than once, we use a boolean visited array.
For example, in the following graph, we start traversal from vertex 2. When we come to
vertex 0, we look for all adjacent vertices of it. 2 is also an adjacent vertex of 0. If we don‘t
mark visited vertices, then 2 will be processed again and it will become a non-terminating
process. A Depth First Traversal of the following graph is 2, 0, 1, 3.
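A minimal C sketch of recursive DFS with a visited array. The adjacency matrix in main encodes the directed graph 0→1, 0→2, 1→2, 2→0, 2→3, 3→3 that is commonly used with this traversal order; treat it as an assumption about the missing figure:

#include <stdio.h>

#define V 4

void DFSUtil(int adj[V][V], int visited[V], int u) {
    visited[u] = 1;
    printf("%d ", u);                      // process the vertex
    for (int v = 0; v < V; v++)            // visit each unvisited neighbour
        if (adj[u][v] && !visited[v])
            DFSUtil(adj, visited, v);
}

int main(void) {
    int adj[V][V] = {
        {0, 1, 1, 0},
        {0, 0, 1, 0},
        {1, 0, 0, 1},
        {0, 0, 0, 1},
    };
    int visited[V] = {0};
    DFSUtil(adj, visited, 2);   // prints 2 0 1 3
    printf("\n");
    return 0;
}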
Applications of Depth First Search
1) For an unweighted graph, DFS traversal of the graph produces a minimum spanning tree
and an all-pairs shortest path tree.
2) Detecting cycle in a graph
A graph has a cycle if and only if we see a back edge during DFS. So we can run DFS for the
graph and check for back edges.
3) Path Finding
We can specialize the DFS algorithm to find a path between two given vertices u and z.
4) Topological Sorting
Topological Sorting is mainly used for scheduling jobs from the given dependencies among
jobs. In computer science, applications of this type arise in instruction scheduling, ordering of
formula cell evaluation when recomputing formula values in spreadsheets, logic synthesis,
determining the order of compilation tasks to perform in makefiles, data serialization, and
resolving symbol dependencies in linkers.
5) To test if a graph is bipartite
We can augment either BFS or DFS: when we first discover a new vertex, colour it opposite to
its parent, and for each other edge, check that it doesn't link two vertices of the same colour. The
first vertex in any connected component can be red or black.
6) Finding Strongly Connected Components of a graph
A directed graph is called strongly connected if there is a path from each vertex in the graph
to every other vertex. (A DFS-based algorithm for finding strongly connected components is given later in this module.)
7) Solving puzzles with only one solution, such as mazes. (DFS can be adapted to find all
solutions to a maze by only including nodes on the current path in the visited set.)
Applications of Breadth First Traversal
1) Shortest Path and Minimum Spanning Tree for unweighted graphs: In an unweighted
graph, the shortest path is the path with least number of edges. With Breadth First, we always
reach a vertex from given source using minimum number of edges. Also, in case of
unweighted graphs, any spanning tree is Minimum Spanning Tree and we can use either
Depth or Breadth first traversal for finding a spanning tree.
2) Peer to Peer Networks. In Peer to Peer Networks like BitTorrent, Breadth First Search is
used to find all neighbor nodes.
3) Crawlers in Search Engines: Crawlers build index using Breadth First. The idea is to
start from source page and follow all links from source and keep doing same. Depth First
Traversal can also be used for crawlers, but the advantage with Breadth First Traversal is,
depth or levels of built tree can be limited.
4) Social Networking Websites: In social networks, we can find people within a given
distance ‗k‘ from a person using Breadth First Search till ‗k‘ levels.
5) GPS Navigation systems: Breadth First Search is used to find all neighboring locations.
8) Cycle detection in undirected graph: In undirected graphs, either Breadth First Search or
Depth First Search can be used to detect cycle. In directed graph, only depth first search can
be used.
10) To test if a graph is Bipartite We can either use Breadth First or Depth First Traversal.
11) Path Finding We can either use Breadth First or Depth First Traversal to find if there is a
path between two vertices.
12) Finding all nodes within one connected component: We can either use Breadth First or
Depth First Traversal to find all nodes reachable from a given node.
Many algorithms like Prim's Minimum Spanning Tree and Dijkstra's Single Source Shortest
Path use a structure similar to Breadth First Search.
SPANNING TREES
A spanning tree is a subset of a graph G which has all the vertices covered with the minimum
possible number of edges. Hence, a spanning tree does not have cycles and it cannot be
disconnected.
By this definition, we can draw a conclusion that every connected and undirected Graph G
has at least one spanning tree. A disconnected graph does not have any spanning tree, as it
cannot be spanned to all its vertices.
We found three spanning trees of one complete graph. A complete undirected graph can
have a maximum of n^(n−2) spanning trees, where n is the number of nodes. In the above
example, 3^(3−2) = 3 spanning trees are possible.
In a weighted graph, a minimum spanning tree is a spanning tree whose weight is less than or equal to
that of every other spanning tree of the same graph. In real-world situations, this weight can be
measured as distance, congestion, traffic load or any arbitrary value assigned to the edges.
We shall learn about two most important spanning tree algorithms here −
Kruskal's Algorithm
Prim's Algorithm
Both are greedy algorithms.
Dijkstra‘s algorithm is very similar to Prim‘s algorithm for minimum spanning tree. Like
Prim‘s MST, we generate a SPT (shortest path tree) with given source as root. We maintain
two sets, one set contains vertices included in shortest path tree, other set includes vertices
not yet included in shortest path tree. At every step of the algorithm, we find a vertex which
is in the other set (set of not yet included) and has minimum distance from source.
Below are the detailed steps used in Dijkstra‘s algorithm to find the shortest path from a
single source vertex to all other vertices in the given graph.
Algorithm
1) Create a set sptSet (shortest path tree set) that keeps track of vertices included in shortest
path tree, i.e., whose minimum distance from source is calculated and finalized. Initially, this
set is empty.
2) Assign a distance value to all vertices in the input graph. Initialize all distance values as
INFINITE. Assign distance value 0 to the source vertex so that it is picked first.
3) While sptSet does not include all vertices:
a) Pick a vertex u which is not there in sptSet and has the minimum distance value.
b) Include u to sptSet.
c) Update distance value of all adjacent vertices of u. To update the distance values,
iterate through all adjacent vertices. For every adjacent vertex v, if sum of distance
value of u (from source) and weight of edge u-v, is less than the distance value of v,
then update the distance value of v.
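A minimal C sketch of Dijkstra's algorithm on an adjacency matrix. The 9-vertex matrix in main is the standard example graph assumed from the walkthrough below (0 means "no edge"):

#include <stdio.h>
#include <limits.h>

#define V 9

// Returns the vertex with minimum distance value not yet included in sptSet.
int minDistance(const int dist[], const int inSpt[]) {
    int min = INT_MAX, idx = -1;
    for (int v = 0; v < V; v++)
        if (!inSpt[v] && dist[v] < min) { min = dist[v]; idx = v; }
    return idx;
}

void dijkstra(int graph[V][V], int src) {
    int dist[V], inSpt[V];
    for (int i = 0; i < V; i++) { dist[i] = INT_MAX; inSpt[i] = 0; }
    dist[src] = 0;

    for (int count = 0; count < V - 1; count++) {
        int u = minDistance(dist, inSpt);   // closest vertex not yet in sptSet
        inSpt[u] = 1;                       // include it in sptSet
        for (int v = 0; v < V; v++)         // relax edges leaving u
            if (!inSpt[v] && graph[u][v] && dist[u] != INT_MAX &&
                dist[u] + graph[u][v] < dist[v])
                dist[v] = dist[u] + graph[u][v];
    }
    for (int i = 0; i < V; i++) printf("%d \t %d\n", i, dist[i]);
}

int main(void) {
    int graph[V][V] = {
        {0, 4, 0, 0, 0, 0, 0, 8, 0},
        {4, 0, 8, 0, 0, 0, 0, 11, 0},
        {0, 8, 0, 7, 0, 4, 0, 0, 2},
        {0, 0, 7, 0, 9, 14, 0, 0, 0},
        {0, 0, 0, 9, 0, 10, 0, 0, 0},
        {0, 0, 4, 14, 10, 0, 2, 0, 0},
        {0, 0, 0, 0, 0, 2, 0, 1, 6},
        {8, 11, 0, 0, 0, 0, 1, 0, 7},
        {0, 0, 2, 0, 0, 0, 6, 7, 0},
    };
    dijkstra(graph, 0);
    return 0;
}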
The set sptSet is initially empty and distances assigned to vertices are {0, INF, INF, INF, INF,
INF, INF, INF} where INF indicates infinite. Now pick the vertex with minimum distance
value. The vertex 0 is picked, include it in sptSet. So sptSet becomes {0}. After including 0
to sptSet, update distance values of its adjacent vertices. Adjacent vertices of 0 are 1 and 7.
The distance values of 1 and 7 are updated as 4 and 8. Following subgraph shows vertices
and their distance values, only the vertices with finite distance values are shown. The vertices
included in SPT are shown in green color.
Pick the vertex with minimum distance value and not already included in SPT (not in
sptSET). The vertex 1 is picked and added to sptSet. So sptSet now becomes {0, 1}. Update
the distance values of adjacent vertices of 1. The distance value of vertex 2 becomes 12.
Pick the vertex with minimum distance value and not already included in SPT (not in
sptSET). Vertex 7 is picked. So sptSet now becomes {0, 1, 7}. Update the distance values of
adjacent vertices of 7. The distance value of vertex 6 and 8 becomes finite (15 and 9
respectively).
Pick the vertex with minimum distance value and not already included in SPT (not in
sptSET). Vertex 6 is picked. So sptSet now becomes {0, 1, 7, 6}. Update the distance values
of adjacent vertices of 6. The distance value of vertex 5 and 8 are updated.
We repeat the above steps until sptSet includes all vertices of the given graph. Finally, we
get the following Shortest Path Tree (SPT).
Topological Sort
A topological sort is an ordering of the vertices in a directed acyclic graph, such that if there is a path
from u to v, then v appears after u in the ordering.
Types of graphs for which it is defined:
The graph should be directed: otherwise for any edge (u, v) there would be a path from
u to v and also from v to u, and hence they cannot be ordered.
The graph should be acyclic: otherwise, for any two vertices u and v on a cycle, u would have to
precede v and v would have to precede u, which is impossible.
Example :
Strongly Connected Components
We can find all strongly connected components in O(V+E) time using Kosaraju's algorithm.
1) Create an empty stack ‗S‘ and do DFS traversal of a graph. In DFS traversal, after calling
recursive DFS for adjacent vertices of a vertex, push the vertex to stack. In the above graph,
if we start DFS from vertex 0, we get vertices in stack as 1, 2, 4, 3, 0.
2) Reverse directions of all arcs to obtain the transpose graph.
3) One by one pop a vertex from S while S is not empty. Let the popped vertex be ‗v‘. Take v
as source and do DFS (call DFSUtil(v)). The DFS starting from v prints strongly connected
component of v. In the above example, we process vertices in order 0, 3, 4, 2, 1 (One by one
popped from stack).
The above algorithm is DFS based. It does DFS two times. DFS of a graph produces a single
tree if all vertices are reachable from the DFS starting point. Otherwise DFS produces a
forest. So DFS of a graph with only one SCC always produces a tree. The important point to
note is DFS may produce a tree or a forest when there are more than one SCCs depending
upon the chosen starting point. For example, in the above diagram, if we start DFS from
vertices 0 or 1 or 2, we get a tree as output. And if we start from 3 or 4, we get a forest. To
find and print all SCCs, we would want to start DFS from vertex 4 (which is a sink vertex),
then move to 3 which is sink in the remaining set (set excluding 4) and finally any of the
remaining vertices (0, 1, 2). So how do we find this sequence of picking vertices as starting
points of DFS? Unfortunately, there is no direct way for getting this sequence. However, if
we do a DFS of graph and store vertices according to their finish times, we make sure that the
finish time of a vertex that connects to other SCCs (other that its own SCC), will always be
greater than finish time of vertices in the other SCC (See this for proof). For example, in DFS
of above example graph, finish time of 0 is always greater than 3 and 4 (irrespective of the
sequence of vertices considered for DFS). And finish time of 3 is always greater than 4. DFS
doesn‘t guarantee about other vertices, for example finish times of 1 and 2 may be smaller or
greater than 3 and 4 depending upon the sequence of vertices considered for DFS. So to use
this property, we do DFS traversal of complete graph and push every finished vertex to a
stack. In stack, 3 always appears after 4, and 0 appear after both 3 and 4.
In the next step, we reverse the graph. Consider the graph of SCCs. In the reversed graph, the
edges that connect two components are reversed. So the SCC {0, 1, 2} becomes sink and the
SCC {4} becomes source. As discussed above, in stack, we always have 0 before 3 and 4. So
if we do a DFS of the reversed graph using sequence of vertices in stack, we process vertices
from sink to source (in reversed graph). That is what we wanted to achieve and that is all
needed to print SCCs one by one.
MODULE IV
DIVIDE AND CONQUER
Steps
Splits the input with size n into k distinct sub problems 1 < k ≤ n
Solve sub problems
Combine sub problem solutions to get solution of the whole problem. Often sub
problems will be of same type as main problem. Hence solutions can be expressed
as recursive algorithms
Control Abstraction
Algorithm DAndC(P)
{
    if Small(P) then
        return S(P);
    else
    {
        divide P into smaller instances P1, P2, ..., Pk, k ≥ 1;
        apply DAndC to each of these sub-problems;
        return Combine(DAndC(P1), DAndC(P2), ..., DAndC(Pk));
    }
}
Here Small(P) is a Boolean-valued function that determines whether the input size is small enough for the
answer to be computed directly (by S(P)), and Combine combines the solutions of the k sub-problems.
Merge Sort
Given a sequence of n elements a[1], ..., a[n], the idea is to split it into two sets a[1], ..., a[⌈n/2⌉] and
a[⌈n/2⌉+1], ..., a[n]. Each set is individually sorted, and the resulting sorted sequences are merged to form a
single sorted list of n elements.
Algorithm
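The MergeSort pseudocode did not survive extraction; a compact C sketch of the same 2-way merge sort (index ranges and helper names are illustrative):

#include <stdio.h>

// Merge the two sorted halves a[low..mid] and a[mid+1..high].
void merge(int a[], int low, int mid, int high) {
    int b[high - low + 1];
    int i = low, j = mid + 1, k = 0;
    while (i <= mid && j <= high)
        b[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid)  b[k++] = a[i++];
    while (j <= high) b[k++] = a[j++];
    for (k = 0, i = low; i <= high; i++, k++)
        a[i] = b[k];                           // copy merged result back
}

// Sort a[low..high] by recursively sorting the two halves and merging them.
void mergeSort(int a[], int low, int high) {
    if (low < high) {
        int mid = (low + high) / 2;
        mergeSort(a, low, mid);
        mergeSort(a, mid + 1, high);
        merge(a, low, mid, high);
    }
}

int main(void) {
    int a[] = {310, 285, 179, 652, 351, 423, 861, 254, 450, 520};
    int n = sizeof a / sizeof a[0];
    mergeSort(a, 0, n - 1);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}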
Example
Consider 10 elements 310, 285, 179, 652, 351, 423, 861, 254, 450 and 520. The recursive
call can be represented using a tree as given below
The time for merge sort is given by the recurrence
T(n) = a                    if n = 1
T(n) = 2T(n/2) + cn         if n > 1, n a power of 2
Expanding (iterating) the recurrence:
T(n) = 2T(n/2) + cn
     = 2(2T(n/4) + cn/2) + cn
     = 4T(n/4) + 2cn
     = ...
     = 2^k T(1) + kcn       (where n = 2^k)
T(n) = an + cn·log n
Hence T(n) = O(n log n). If 2^k < n ≤ 2^(k+1), then T(n) ≤ T(2^(k+1)), so T(n) = O(n log n) for all n.
Strassen's Matrix Multiplication
The product of two n x n matrices X and Y is a third n x n matrix Z = XY, with (i, j)-th entry
Z[i][j] = Σ (k = 1 to n) X[i][k] · Y[k][j]
In general XY, not the same as YX; matrix multiplication is not commutative.
The formula above implies an O (n3) algorithm for matrix multiplication: there are n2 entries
to be computed, and each takes linear time. For quite a while, this was widely believed to be
the best running time possible, and it was even proved that no algorithm which used just
additions and multiplications could do better. It was therefore a source of great excitement
when in 1969, Strassen announced a significantly more efficient algorithm, based upon
divide-and-conquer.
Matrix multiplication is particularly easy to break into sub problems, because it can be
performed blockwise. To see what this means, carve X into four n/2 x n/2 blocks, and also Y:
    X = [ A  B ]        Y = [ E  F ]
        [ C  D ]            [ G  H ]

Then their product can be expressed in terms of these blocks, exactly as if the blocks
were single elements:

    XY = [ A  B ] [ E  F ]  =  [ AE+BG   AF+BH ]
         [ C  D ] [ G  H ]     [ CE+DG   CF+DH ]

Doing the multiplication blockwise gives eight half-size multiplications plus a constant number of
block additions, i.e. T(n) = 8T(n/2) + O(n²), which comes out to O(n³), the same as for the default algorithm.
However, an improvement in the time bound is possible, and as with integer multiplication, it relies upon
algebraic tricks. It turns out that XY can be computed from just seven sub-problems.
P1 = A(F − H)
P2 = (A + B)H
P3 = (C + D)E
P4 = D(G − E)
P5 = (A + D)(E + H)
P6 = (B − D)(G + H)
P7 = (A − C)(E + F)

    XY = [ P5 + P4 − P2 + P6        P1 + P2            ]
         [ P3 + P4                  P1 + P5 − P3 − P7  ]
Strassen showed that 2x2 matrix multiplication can be accomplished in 7 multiplications and
18 additions or subtractions. This reduction is achieved with the divide-and-conquer
approach: divide the input data S into two or more disjoint subsets S1, S2, ...; solve the
subproblems recursively; combine the solutions for S1, S2, ... into a solution for S. The base
cases for the recursion are subproblems of constant size. The analysis can be done using recurrence
equations. Here, we divide the matrices into sub-matrices and recursively multiply the sub-matrices.
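With seven half-size multiplications, the running time satisfies the recurrence below (a standard master-theorem calculation, added here for completeness):

\[
T(n) = 7\,T\!\left(\tfrac{n}{2}\right) + O(n^2)
\;\Rightarrow\;
T(n) = \Theta\!\left(n^{\log_2 7}\right) \approx \Theta\!\left(n^{2.81}\right),
\]
compared with \(T(n) = 8\,T(n/2) + O(n^2) = \Theta(n^3)\) for the straightforward blockwise method.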
Let A, B be two square matrices over a ring R. We want to calculate the matrix product C = AB.
If the matrices A, B are not of type 2^n x 2^n, we fill the missing rows and columns with zeros.
We partition A, B and C into equally sized block matrices

    A = [ A11  A12 ]    B = [ B11  B12 ]    C = [ C11  C12 ]
        [ A21  A22 ]        [ B21  B22 ]        [ C21  C22 ]

with blocks of size 2^(n-1) x 2^(n-1); then

    C11 = A11·B11 + A12·B21
    C12 = A11·B12 + A12·B22
    C21 = A21·B11 + A22·B21
    C22 = A21·B12 + A22·B22
With this construction we have not reduced the number of multiplications. We still need 8
multiplications to calculate the Ci,j matrices, the same number of multiplications we need
when using standard matrix multiplication.
We define new matrices

    M1 = (A11 + A22)(B11 + B22)
    M2 = (A21 + A22)·B11
    M3 = A11·(B12 − B22)
    M4 = A22·(B21 − B11)
    M5 = (A11 + A12)·B22
    M6 = (A21 − A11)(B11 + B12)
    M7 = (A12 − A22)(B21 + B22)

which are then used to express the C(i,j) in terms of the Mk. Because of our definition of the Mk we
can eliminate one matrix multiplication and reduce the number of multiplications to 7 (one
multiplication for each Mk) and express the C(i,j) as

    C11 = M1 + M4 − M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 − M2 + M3 + M6

We iterate this division process n times until the submatrices degenerate into numbers (elements of the ring R).
Numerical analysis
The standard matrix multiplication takes approximately 2N³ arithmetic operations (additions
and multiplications); the asymptotic complexity is O(N³). (Refer Note for analysis.)

    C[i][j] = Σ (k = 1 to N) a[i][k] · b[k][j]

    Thus T(N) = Σ (i = 1 to N) Σ (j = 1 to N) Σ (k = 1 to N) c  =  c·N³  =  O(N³)
The number of additions and multiplications required in the Strassen algorithm can be
calculated as follows: let f(k) be the number of operations for a 2^k × 2^k matrix. Then by
recursive application of the Strassen algorithm, we see that f(k) = 7·f(k − 1) + l·4^k, for some
constant l that depends on the number of additions performed at each application of the
algorithm. Hence f(k) = (7 + o(1))^k, i.e., the asymptotic complexity for multiplying matrices
of size N = 2^k using the Strassen algorithm is O(7^(log2 N)) = O(N^(log2 7)) ≈ O(N^2.807).
The reduction in the number of arithmetic operations however comes at the price of a
somewhat reduced numerical stability.
Algorithm (sketch of the recursive structure)
matmul(A, B, n):
    if (n == 1)
        return the scalar product A[1][1] * B[1][1];
    else
    {
        partition A and B into four n/2 x n/2 blocks;
        compute the seven products P1, ..., P7 by recursive calls matmul(·, ·, n/2);
        combine P1, ..., P7 (using additions and subtractions) into the four blocks of the result;
    }
DYNAMIC PROGRAMMING
It is used when the solution to a problem can be viewed as the result of a sequence of
decisions. It avoid duplicate calculation in many cases by keeping a table of known results
and fills up as sub instances are solved.
It follows a bottom-up technique by which it start with smaller and hence simplest sub
instances. Combine their solutions to get answer to sub instances of bigger size until we
arrive at solution for original instance.
Dynamic programming differs from the Greedy method because the Greedy method generates only one
decision sequence, whereas dynamic programming may examine many decision sequences.
The idea of dynamic programming is thus quite simple: avoid calculating the same thing
twice, usually by keeping a table of known results that fills up as sub-instances are solved.
When a problem is solved by divide and conquer, we immediately attack the complete
instance, which we then divide into smaller and smaller sub-instances as the algorithm
progresses.
Dynamic programming, on the contrary, works bottom-up: we usually start with the smallest, and hence the simplest, sub-instances.
The essential difference between the greedy method and dynamic programming is that in
the greedy method only one decision sequence is ever generated.
RELATED NOTES:
Dynamic Programming solves each sub-problem only
once and then saves its answer in a table, thereby avoiding the work of re-computing the
answer every time.
Two main properties of a problem suggest that the given problem can be solved using
Dynamic Programming: Overlapping Sub-Problems and Optimal Sub-Structure.
Overlapping Sub-Problems
Like Divide and Conquer, Dynamic Programming combines solutions to
sub-problems. It is mainly used where the solution of one sub-problem is needed repeatedly.
The computed solutions are stored in a table, so that these don't have to be re-computed.
Hence, this technique is needed where overlapping sub-problems exist.
For example, Binary Search does not have overlapping sub-problems, whereas a recursive
program for Fibonacci numbers has many overlapping sub-problems.
Optimal Sub-Structure
A given problem has Optimal Substructure Property, if the optimal solution of the given
problem can be obtained using optimal solutions of its sub-problems.
For example, the Shortest Path problem has the following optimal substructure property −
If a node x lies in the shortest path from a source node u to destination node v, then the
shortest path from u to v is the combination of the shortest path from u to x, and the shortest
path from x to v.
The standard All Pair Shortest Path algorithms like Floyd-Warshall and Bellman-Ford are
typical examples of Dynamic Programming.
***************************************************************
Principle of Optimality
An optimal sequence of decisions has the property that whatever the initial state and
decisions are, the remaining decisions must constitute an optimal decision sequence with
regard to the state resulting from the first decision.
MatrixChain Multiplication
Given a sequence of matrices, find the most efficient way to multiply these matrices
together. The problem is not actually to perform the multiplications, but merely to decide in
which order to perform the multiplications.
We have many options to multiply a chain of matrices because matrix multiplication is
associative. In other words, no matter how we parenthesize the product, the result will be the
same. For example, if we had four matrices A, B, C, and D, we would have:
(ABC)D = (AB)(CD) = A(BCD) = ....
However, the order in which we parenthesize the product affects the number of simple
arithmetic operations needed to compute the product, or the efficiency. For example, suppose
A is a 10 × 30 matrix, B is a 30 × 5 matrix, and C is a 5 × 60 matrix. Then,
(AB)C needs (10×30×5) + (10×5×60) = 1500 + 3000 = 4500 operations, whereas
A(BC) needs (30×5×60) + (10×30×60) = 9000 + 18000 = 27000 operations.
Clearly, the first parenthesization requires far fewer operations.
Example: there are 4 matrices of dimensions 10x20, 20x30, 30x40 and 40x30. Let the input 4
matrices be A, B, C and D. The minimum number of scalar multiplications is obtained by putting the
parenthesis as ((AB)C)D: 10·20·30 + 10·30·40 + 10·40·30 = 6000 + 12000 + 12000 = 30000.
1) Optimal Substructure:
A simple solution is to place parenthesis at all possible places, calculate the cost for each
placement and return the minimum value. In a chain of matrices of size n, we can place the
first set of parenthesis in n-1 ways. For example, if the given chain is of 4 matrices. let the
chain be ABCD, then there are 3 ways to place first set of parenthesis outer side: (A)(BCD),
(AB)(CD) and (ABC)(D). So when we place a set of parenthesis, we divide the problem into
subproblems of smaller size. Therefore, the problem has optimal substructure property and
can be easily solved using recursion.
Minimum number of multiplication needed to multiply a chain of size n = Minimum of all n-
1 placements (these placements create subproblems of smaller size)
2) Overlapping Subproblems
Following is a recursive implementation that simply follows the above optimal substructure
property.
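Only the tail of the original program survives below; a minimal sketch of the naive recursive function it refers to (following the usual CLRS/GfG formulation; the parameter names are assumptions):

#include <stdio.h>
#include <limits.h>

/* Matrix A[i] has dimensions p[i-1] x p[i] for i = 1..n.
   Returns the minimum number of scalar multiplications needed
   to compute the product A[i] A[i+1] ... A[j]. */
int MatrixChainOrder(int p[], int i, int j)
{
    if (i == j)
        return 0;                       // a single matrix needs no multiplication
    int min = INT_MAX;
    // Place the outermost parenthesis at every possible position k
    // and recursively solve the two sub-chains.
    for (int k = i; k < j; k++) {
        int count = MatrixChainOrder(p, i, k)
                  + MatrixChainOrder(p, k + 1, j)
                  + p[i - 1] * p[k] * p[j];
        if (count < min)
            min = count;
    }
    return min;
}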
int main() {
    int arr[] = {1, 2, 3, 4};
    printf("Minimum number of multiplications is %d\n", MatrixChainOrder(arr, 1, 3));
    getchar();
    return 0;
}
Time complexity of the above naive recursive approach is exponential. It should be noted that
the above function computes the same subproblems again and again. See the following
recursion tree for a matrix chain of size 4. The function MatrixChainOrder (p, 3, 4) is called
two times. We can see that there are many subproblems being called more than once.
Since the same subproblems are called again, this problem has the Overlapping Subproblems property.
So the Matrix Chain Multiplication problem has both properties of a dynamic
programming problem. Like other typical Dynamic Programming (DP) problems, re-computation
of the same subproblems can be avoided by constructing a temporary table m[][] in a bottom-up manner.
Following is a C/C++ implementation of the Matrix Chain Multiplication problem using Dynamic
Programming.

#include <stdio.h>
#include <limits.h>

// Matrix A[i] has dimensions p[i-1] x p[i] for i = 1..n-1.
int MatrixChainOrder(int p[], int n)
{
    /* m[i][j] = minimum number of scalar multiplications needed to
       compute the matrix product A[i]A[i+1]...A[j] */
    int m[n][n];
    int i, j, k, L, q;

    // cost is zero when multiplying one matrix
    for (i = 1; i < n; i++)
        m[i][i] = 0;

    // L is chain length.
    for (L=2; L<n; L++)
    {
        for (i=1; i<n-L+1; i++)
        {
            j = i+L-1;
            m[i][j] = INT_MAX;
for (k=i; k<=j-1; k++)
{
// q = cost/scalar multiplications
q = m[i][k] + m[k+1][j] + p[i-1]*p[k]*p[j];
if (q < m[i][j])
m[i][j] = q;
}
}
}
    return m[1][n-1];
}
int main()
{
    int arr[] = {1, 2, 3, 4};
    int size = sizeof(arr)/sizeof(arr[0]);
    printf("Minimum number of multiplications is %d\n", MatrixChainOrder(arr, size));
    getchar();
    return 0;
}
Output:
Minimum number of multiplications is 18
Bellman–Ford Algorithm
Given a graph and a source vertex src in graph, find shortest paths from src to all vertices in
the given graph. The graph may contain negative weight edges.
We have discussed Dijkstra's algorithm for this problem. Dijkstra's algorithm is a Greedy
algorithm with time complexity O(E + V log V) when a Fibonacci heap is used. Dijkstra
doesn't work for graphs with negative weight edges; Bellman-Ford works for such graphs.
Bellman-Ford is also simpler than Dijkstra and suits distributed systems well. But the time
complexity of Bellman-Ford is O(VE), which is more than Dijkstra's.
Algorithm
Input: A graph and a source vertex src.
Output: Shortest distance to all vertices from src. If there is a negative weight cycle, then the
shortest distances are not calculated, and the negative weight cycle is reported.
1) This step initializes distances from source to all vertices as infinite and distance to source
itself as 0. Create an array dist[] of size |V| with all values as infinite except dist[src] where
src is source vertex.
2) This step calculates the shortest distances. Do the following |V|-1 times, where |V| is the number of
vertices in the given graph:
……Do the following for each edge u-v:
……….If dist[v] > dist[u] + weight of edge uv, then update dist[v] = dist[u] + weight of edge uv.
3) This step reports if there is a negative weight cycle in graph. Do following for each edge u-
v
……If dist[v] > dist[u] + weight of edge uv, then ―Graph contains negative weight cycle‖
The idea of step 3 is: step 2 guarantees shortest distances if the graph doesn't contain a negative
weight cycle. If we iterate through all edges one more time and get a shorter path for any
vertex, then there is a negative weight cycle.
How does this work? Like other Dynamic Programming problems, the algorithm calculates
shortest paths in a bottom-up manner. It first calculates the shortest distances which have at
most one edge in the path. Then, it calculates shortest paths with at most 2 edges, and so on.
After the i-th iteration of the outer loop, the shortest paths with at most i edges are calculated.
There can be at most |V| – 1 edges in any simple path; that is why the outer loop runs |V| – 1
times. The idea is, assuming that there is no negative weight cycle, if we have calculated
shortest paths with at most i edges, then an iteration over all edges guarantees to give the shortest
paths with at most (i+1) edges.
Example
Let us understand the algorithm with following example graph. The images are taken
from this source.
Let the given source vertex be 0. Initialize all distances as infinite, except the distance to
source itself. Total number of vertices in the graph is 5, so all edges must be processed 4
times.
Let all edges be processed in the following order: (B,E), (D,B), (B,D), (A,B), (A,C), (D,C),
(B,C), (E,D). We get the following distances when all edges are processed the first time. The first
row shows the initial distances. The second row shows the distances when edges (B,E), (D,B),
(B,D) and (A,B) are processed. The third row shows the distances when (A,C) is processed. The
fourth row shows when (D,C), (B,C) and (E,D) are processed.
The first iteration guarantees to give all shortest paths which are at most 1 edge long. We get
the following distances when all edges are processed a second time (the last row shows final
values).
The second iteration guarantees to give all shortest paths which are at most 2 edges long. The
algorithm processes all edges 2 more times. The distances are minimized after the second
iteration, so the third and fourth iterations don't update the distances.
Bellman-Ford makes |E| relaxations in every iteration, and there are |V| – 1 iterations.
Therefore, the worst-case running time of Bellman-Ford is O(|V|·|E|).
// A compact C implementation of Bellman-Ford using an edge list.
#include <stdio.h>
#include <limits.h>
struct Edge { int src, dest, weight; };
void BellmanFord(struct Edge e[], int E, int V, int src)
{
    int dist[V];
    for (int i = 0; i < V; i++) dist[i] = INT_MAX;
    dist[src] = 0;
    for (int i = 1; i <= V - 1; i++)              // relax every edge |V|-1 times
        for (int j = 0; j < E; j++)
            if (dist[e[j].src] != INT_MAX && dist[e[j].src] + e[j].weight < dist[e[j].dest])
                dist[e[j].dest] = dist[e[j].src] + e[j].weight;
    for (int j = 0; j < E; j++)                   // one more pass detects a negative cycle
        if (dist[e[j].src] != INT_MAX && dist[e[j].src] + e[j].weight < dist[e[j].dest])
        { printf("Graph contains negative weight cycle\n"); return; }
    printf("Vertex \t Distance from Source\n");
    for (int i = 0; i < V; i++) printf("%d \t %d\n", i, dist[i]);
}
int main()
{
    struct Edge e[] = { {0, 1, -1}, {0, 2, 4}, {1, 2, 3}, {1, 3, 2}, {1, 4, 2},
                        {3, 2, 5}, {3, 1, 1}, {4, 3, -3} };   // example graph, source vertex 0
    BellmanFord(e, 8, 5, 0);
    return 0;
}
Output:
Vertex   Distance from Source
0        0
1        -1
2        2
3        -2
4        1
Notes
1) Negative weights are found in various applications of graphs. For example, instead of
paying a cost for a path, we may get some advantage if we follow the path.
2) Bellman-Ford works better than Dijkstra's for distributed systems. Unlike Dijkstra's, where
we need to find the minimum value over all vertices, in Bellman-Ford edges are considered
one by one.
MODULE V
ANALYSIS
Divide and Conquer
1. The divide-and-conquer paradigm involves three steps at each level of the recursion: divide
the problem into subproblems, conquer the subproblems by solving them recursively, and
combine their solutions into a solution for the original problem.
2. The algorithms call themselves recursively one or more times to deal with closely related sub
problems.
3. D&C does more work on the sub-problems (the same sub problem may be solved several
times) and hence has more time consumption.
Dynamic Programming
3. DP solves the sub problems only once and then stores the result in a table.
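For illustration, here is a minimal C sketch of this idea: a table memo[] stores each Fibonacci
value the first time it is computed, so every sub problem is solved only once, whereas the plain
recursion would recompute the same values over and over.
#include <stdio.h>

long long memo[91];                     // table of already-solved sub problems

long long fib(int n)                    // assumes 0 <= n <= 90
{
    if (n <= 1) return n;
    if (memo[n] != 0) return memo[n];   // reuse a stored answer
    memo[n] = fib(n - 1) + fib(n - 2);  // solve once, store in the table
    return memo[n];
}

int main(void)
{
    printf("%lld\n", fib(50));          // prints 12586269025
    return 0;
}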
GREEDY STRATEGY
Greedy method is another important algorithm design paradigm. Before going into detail
about the method, let us see what an optimization problem is. An optimization problem is a
problem that requires either minimizing or maximizing a given objective function, subject to a
set of constraints. Solutions that satisfy the constraints are called feasible solutions, and a
feasible solution for which the optimization function has the best possible value is called an
optimal solution.
Control Abstraction
Algorithm Greedy(a, n)
// a[1:n] contains the n inputs.
{
    solution := Ø;
    for i := 1 to n do
    {
        x := Select(a);
        if Feasible(solution, x) then
            solution := Union(solution, x);
    }
    return solution;
}
Select – selects an input from a[] and removes it from the array. Feasible – checks whether the
selected input x can be added to the partial solution. Union – adds x to the partial solution.
Knapsack problem
Problem: Given n objects and a knapsack (bag) of capacity m. Each object i has a weight wi
and a profit pi. If a fraction xi, 0 ≤ xi ≤ 1, of object i is placed in the knapsack, it earns a profit
pi·xi. The objective is to fill the knapsack so that the total profit Σ pi·xi is maximized, subject to
the constraint Σ wi·xi ≤ m.
Example
Number of inputs n = 3. One feasible solution is (x1, x2, x3) = (0, 2/3, 1), which gives total
weight Σ wi·xi = 20 and total profit Σ pi·xi = 31.
Algorithm GreedyKnapsack(m, n)
// p[1:n] and w[1:n] contain the profits and weights of the n objects,
// ordered such that p[i]/w[i] ≥ p[i+1]/w[i+1]. m is the knapsack capacity
// and x[1:n] is the solution vector.
{
    for i := 1 to n do x[i] := 0.0;
    u := m;
    for i := 1 to n do
    {
        if (w[i] > u) then break;
        x[i] := 1.0;
        u := u – w[i];
    }
    if (i ≤ n) then x[i] := u/w[i];
}
Theorem 3.1
If objects are selected in order of decreasing pi/wi, then algorithm GreedyKnapsack generates
an optimal solution to the given instance of the knapsack problem.
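A minimal C sketch of this greedy strategy follows; it assumes the profit and weight arrays are
already arranged so that p[i]/w[i] ≥ p[i+1]/w[i+1], and the driver uses an assumed instance
(n = 3, capacity m = 20, profits 24, 15, 25 and weights 15, 10, 18 after ordering by p/w).
#include <stdio.h>

// Greedy fractional knapsack: p[] and w[] are assumed to be sorted by
// decreasing p[i]/w[i]; m is the capacity; x[] receives the fractions.
void greedy_knapsack(float p[], float w[], float x[], int n, float m)
{
    float u = m;                        // remaining capacity
    for (int i = 0; i < n; i++) x[i] = 0.0f;
    for (int i = 0; i < n; i++) {
        if (w[i] > u) {                 // object does not fit entirely
            x[i] = u / w[i];            // take the largest possible fraction
            break;
        }
        x[i] = 1.0f;                    // take the whole object
        u -= w[i];
    }
}

int main(void)
{
    float p[] = {24, 15, 25}, w[] = {15, 10, 18}, x[3];   // assumed example data
    greedy_knapsack(p, w, x, 3, 20);
    float profit = 0;
    for (int i = 0; i < 3; i++) profit += p[i] * x[i];
    printf("x = (%.2f, %.2f, %.2f), profit = %.2f\n", x[0], x[1], x[2], profit);
    return 0;
}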
A Minimum cost spanning tree is a subset T of edges of G such that all the vertices
remain connected when only edges in T are used, and sum of the lengths of the edges in T
is as small as possible. Hence it is then a spanning tree with weight less than or equal to
the weight of every other spanning tree.
We know that the number of edges needed to connect an undirected graph with n vertices
is n-1. If more that n-1 edges are present, then there will be a cycle. Then we can remove
an edge which is a part of the cycle without disconnecting T. This will reduce the cost.
There are two algorithms to find minimum spanning trees. They are Prim‘s algorithm and
Kruskal‘s algorithm.
1. Prim’s Algorithm
Prim's algorithm finds a minimum spanning tree for a connected weighted graph. This
means it finds a subset of the edges that forms a tree that includes every vertex, where the
total weight of all the edges in the tree is minimized. The algorithm was discovered in
1930 by mathematician Vojtěch Jarník and later independently by computer scientist
Robert C. Prim in 1957 and rediscovered by Edsger Dijkstra in 1959. Therefore it is
sometimes called the DJP algorithm, the Jarník algorithm, or the Prim-Jarník algorithm.
Steps
1. Start with the edge of minimum cost; its two end points form the initial tree.
2. At each stage, add the minimum cost edge that connects a vertex already in the tree to a
vertex not yet in the tree.
3. Repeat step 2 until the tree contains n-1 edges, i.e., all n vertices are included.
Example 3.2
Minimum spanning tree using Prim‘s algorithm can be formed as given below.
Algorithm Prim(E, cost, n, t)
// E is the set of edges in G. cost[1:n][1:n] is the cost adjacency matrix of G
// (cost[i][j] = ∞ if (i, j) ∉ E). t[1:n-1][1:2] stores the edges of the
// minimum cost spanning tree; the total cost is returned.
{
    Let (k, l) be an edge of minimum cost in E;
    mincost := cost[k][l];
    t[1][1] := k; t[1][2] := l;
    for i := 1 to n do        // initialize near[]
        if (cost[i][l] < cost[i][k]) then near[i] := l;
        else near[i] := k;
    near[k] := 0; near[l] := 0;
    for i := 2 to n-1 do      // find n-2 additional edges of t
    {
        Let j be an index such that near[j] ≠ 0 and cost[j][near[j]] is minimum;
        t[i][1] := j; t[i][2] := near[j];
        mincost := mincost + cost[j][near[j]];
        near[j] := 0;
        for k := 1 to n do    // update near[]
            if ((near[k] ≠ 0) and (cost[k][near[k]] > cost[k][j])) then
                near[k] := j;
    }
    return (mincost);
}
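For reference, a compact C sketch of the same idea on a cost adjacency matrix is shown below
(an illustrative version with an assumed 5-vertex example; 9999 stands for the absence of an
edge).
#include <stdio.h>
#define N   5                               // number of vertices in the example
#define INF 9999                            // stands for "no edge"

// Prim's algorithm on an N x N cost adjacency matrix; returns the MST cost.
int prim(int cost[N][N])
{
    int inTree[N] = {0}, near[N] = {0}, mincost = 0;
    inTree[0] = 1;                          // grow the tree from vertex 0
    for (int e = 1; e < N; e++) {           // pick N-1 edges
        int j = -1;
        for (int i = 0; i < N; i++)         // cheapest vertex outside the tree
            if (!inTree[i] && (j == -1 || cost[i][near[i]] < cost[j][near[j]]))
                j = i;
        mincost += cost[j][near[j]];        // add edge (j, near[j]) to the tree
        inTree[j] = 1;
        for (int k = 0; k < N; k++)         // update nearest-tree-vertex array
            if (!inTree[k] && cost[k][j] < cost[k][near[k]])
                near[k] = j;
    }
    return mincost;
}

int main(void)
{
    int cost[N][N] = {
        {0,   2,   INF, 6,   INF},
        {2,   0,   3,   8,   5  },
        {INF, 3,   0,   INF, 7  },
        {6,   8,   INF, 0,   9  },
        {INF, 5,   7,   9,   0  }
    };
    printf("Minimum cost = %d\n", prim(cost));   // prints 16 for this graph
    return 0;
}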
2. Kruskal’s Algorithm
Kruskal's algorithm is another algorithm that finds a minimum spanning tree for a
connected weighted graph. If the graph is not connected, then it finds a minimum
spanning forest (a minimum spanning tree for each connected component).
Kruskal's Algorithm builds the MST as a forest. Initially, each vertex is in its own tree in the
forest. Then the algorithm considers each edge in turn, ordered by increasing weight. If an edge
(u, v) connects two different trees, then (u, v) is added to the set of edges of the MST, and the
two trees connected by the edge (u, v) are merged into a single tree. On the other hand, if an
edge (u, v) connects two vertices in the same tree, then edge (u, v) is discarded. The
intermediate result may be a forest rather than a single tree, but it is completed into a spanning
tree at the end.
t = EMPTY;
while ((t has fewer than n-1 edges) && (E != EMPTY))
{
choose an edge(v, w) from E of lowest cost;
delete (v, w) from E;
if (v, w) does not create a cycle in t
add (v, w) to t;
else
discard (v, w);
}
To check whether there exists a cycle, place all vertices in the same connected component
of t into a set. Then two vertices v and w are connected in t if and only if they are in the same
set.
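A minimal C sketch of Kruskal's method with this union-find bookkeeping is given below
(illustrative, with an assumed small example graph): each vertex starts in its own set, find()
returns the representative of a vertex's set, and an edge is accepted only when its two end points
have different representatives.
#include <stdio.h>
#include <stdlib.h>

struct Edge { int u, v, cost; };

int parent[100];                                // parent[x] == x means x is a set root
int find(int x) { while (parent[x] != x) x = parent[x]; return x; }

int cmp(const void *a, const void *b)           // sort edges by increasing cost
{
    return ((struct Edge *)a)->cost - ((struct Edge *)b)->cost;
}

int kruskal(struct Edge e[], int E, int n)      // returns the MST cost
{
    int mincost = 0, picked = 0;
    for (int v = 0; v < n; v++) parent[v] = v;  // each vertex in its own tree
    qsort(e, E, sizeof(struct Edge), cmp);
    for (int i = 0; i < E && picked < n - 1; i++) {
        int j = find(e[i].u), k = find(e[i].v);
        if (j != k) {                           // different trees: no cycle is formed
            mincost += e[i].cost;
            parent[j] = k;                      // union of the two trees
            picked++;
        }                                       // same tree: the edge is discarded
    }
    if (picked != n - 1) printf("No spanning tree\n");
    return mincost;
}

int main(void)
{
    struct Edge e[] = { {0,1,2}, {0,3,6}, {1,2,3}, {1,3,8}, {1,4,5}, {2,4,7}, {3,4,9} };
    printf("Minimum cost = %d\n", kruskal(e, 7, 5));   // prints 16 for this graph
    return 0;
}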
Example
Minimum spanning tree using Kruskal‘s algorithm can be formed as given below.
Union(j, k);
}
if (i != n-1) printf("No spanning tree\n");
else return(mincost);
}
}
MODULE VI
BACK TRACKING
Backtracking finds all answer nodes, not just one. Let (x1, ..., xi) be a path from the root to a
node in the state space tree. Let T(x1, ..., xi) be the set of all possible values for xi+1 such that
(x1, ..., xi, xi+1) is also a path to a problem state. T(x1, ..., xn) = φ. Let Bi+1 be a bounding
function (criterion function) such that if Bi+1(x1, ..., xi, xi+1) is false for the path
(x1, ..., xi, xi+1) from the root to a problem state, then the path cannot be extended to reach an
answer node. The candidates for position i+1 are those values generated by T that satisfy Bi+1.
Control Abstraction
Algorithm Backtrack(k)
// On entering, the first k-1 values x[1], x[2], ..., x[k-1] of the solution
// vector x[1:n] have already been assigned.
{
    for (each x[k] such that x[k] ∈ T(x[1],…,x[k-1])) do
    {
        if (Bk(x[1],…,x[k]) ≠ 0) then
        {
            if ((x[1],…,x[k]) is a path to an answer node) then
                write (x[1:k]);
            if (k < n) then Backtrack(k+1);
        }
    }
}
∗ Backtracking is only applicable to problems which admit the concept of a partial candidate
solution and a relatively quick test of whether the partial solution can grow into a complete
solution.
∗ If a problem does not satisfy the above constraint, backtracking is not applicable.
∗ Backtracking is not very efficient for finding a given value in an unordered list.
All the solutions require a set of constraints divided into two categories:
Definition 1: Explicit constraints are rules that restrict each xi to take on values only from a
given set.
– Explicit constraints depend on the particular instance I of problem being
solved
– All tuples that satisfy the explicit constraints define a possible solution space
for I.
Definition 2: Implicit constraints are rules that determine which of the tuples in the solution
space of I satisfy the criterion function.
– Implicit constraints describe the way in which the xi's must relate to each
other.
Determine problem solution by systematically searching the solution space for the given
problem instance
N Queens Problem
Problem: Given an n×n chessboard, place n queens in non-attacking positions, i.e., no two
queens are in the same row, the same column or the same diagonal. Let us assume that queen i
is placed in row i. So the problem can be represented as an n-tuple (x1, ..., xn) where each xi
represents the column in which queen i is placed. Hence the explicit constraint is Si = {1, 2, ..., n}.
That no two xi can be the same and that no two queens can be on the same diagonal are the
implicit constraints. Since we fixed the row numbers, the solution space reduces from n^n to n!.
To check whether two queens are on the same diagonal, let the chessboard be represented by an
array a[1..n][1..n]. Every element on a diagonal that runs from upper left to lower right has the
same "row - column" value. E.g., consider the element a[4][2]. Elements a[3][1], a[5][3],
a[6][4], a[7][5] and a[8][6] have row - column value 2. Similarly, every element on a diagonal
that goes from upper right to lower left has the same "row + column" value. The elements
a[1][5], a[2][4], a[3][3] and a[5][1] have the same "row + column" value as that of element
a[4][2], which is 6.
Hence two queens placed at (i, j) and (k, l) are on the same diagonal if and only if
i – j = k – l or i + j = k + l
i.e., j – l = i – k or j – l = k – i
i.e., |j – l| = |i – k|
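In C this test is a one-line check; the helper name below is only illustrative.
#include <stdlib.h>   // for abs()

// Returns 1 if squares (i, j) and (k, l) lie on a common diagonal.
int same_diagonal(int i, int j, int k, int l)
{
    return abs(j - l) == abs(i - k);   // equivalent to i-j == k-l or i+j == k+l
}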
The n-queens problem is a generalization of the 8-queens problem. Here n queens are to be
placed on an nxn chessboard so that no two attack, that is no two queens are on the same row,
column, or diagonal. The solution space consists of all n! permutations of the n-tuple
(1,2,…,n). The tree is called permutation tree. The edges are labeled by possible values of xi.
Edges from level 1 to level 2 nodes specify the values for x1. Edges from level i to i+1 are
labeled with values of xi. The solution space is defined by all the paths from the root node to
a leaf node. For eg. If n=4, then there will be 4! =24 leaf nodes in the tree.
The 8-queens problem asks us to place eight queens on an 8×8 chessboard so that no two of
them "attack," that is, so that no two of them are on the same row, column, or diagonal.
∗ Since each queen is in a different row, define the chessboard solution to be an 8-tuple
(x1, . . . , x8), where xi is the column for the ith queen.
∗ No two xi can be the same, i.e., all the queens must be in different columns.
Terminology
Problem state: is each node in the depth-first search tree
State space: is the set of all paths from root node to other nodes
Solution states: are the problem states s for which the path from the root node to s
defines a tuple in the solution space.
–In variable tuple size formulation tree, all nodes are solution states
–In fixed tuple size formulation tree, only the leaf nodes are solution states
–Partitioned into disjoint sub-solution spaces at each internal node
Answer states: are those solution states s for which the path from root node to s defines a
tuple that is a member of the set of solutions.
Static trees: are ones for which the tree organization is independent of the problem instance
being solved
–Fixed tuple size formulation
–Tree organization is independent of the problem instance being solved
Dynamic trees: are ones for which the tree organization depends on the problem instance
being solved
After conceiving the state space tree for a problem, the problem can be solved by
systematically generating problem states, checking which of them are solution states, and
checking which solution states are answer states.
Live node: is a generated node for which all of the children have not been generated yet.
E-node: is a live node whose children are currently being generated or explored
• Bounding functions: are used to kill live nodes without generating all their children.
TREE STRUCTURE
Tree organization of the 4-queens solution space. Nodes are numbered as in depth first
search.
The n-queens problem is a generalization of the 8-queens problem. Now n queens are to be
placed on an n×n chessboard so that no two attack; that is, no two queens are on the same
row, column or diagonal. Generalising our discussion, the solution space consists of all n!
permutations of the n-tuple (1, 2, ..., n). The above figure shows a possible tree organization for
the case n = 4. A tree such as this is called a permutation tree. The edges are labeled by possible
values of xi. Edges from level 1 to level 2 nodes specify the values for x1. Thus the leftmost
subtree contains all solutions with x1 = 1 and, within it, the leftmost subtree contains all
solutions with x1 = 1 and x2 = 2, and so on. Edges from level i to level i+1 are labeled with the
values of xi. The solution space is defined by all paths from the root node to a leaf node. There
are 4! = 24 leaf nodes in the tree.
ALGORITHM
Algorithm Place(k, i)
// Returns true if a queen can be placed in the kth row and ith column;
// otherwise it returns false. x[] is a global array whose first k-1 values
// have already been set. Abs(r) returns the absolute value of r.
{
    for j := 1 to k-1 do
        if ((x[j] = i)                          // two queens in the same column
            or (Abs(x[j] – i) = Abs(j – k)))    // or on the same diagonal
        then return false;
    return true;
}
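A complete C sketch of the backtracking n-queens solver built around this test is given below
(illustrative; x[k] holds the column of the queen placed in row k, and every solution vector is
printed).
#include <stdio.h>
#include <stdlib.h>
#define MAXN 20

int x[MAXN + 1];                        // x[k] = column of the queen in row k
int n;                                  // board size

// Can a queen be placed in row k, column i, given rows 1..k-1 are placed?
int place(int k, int i)
{
    for (int j = 1; j < k; j++)
        if (x[j] == i || abs(x[j] - i) == abs(j - k))
            return 0;                   // same column or same diagonal
    return 1;
}

void nqueens(int k)                     // try all columns for the queen in row k
{
    for (int i = 1; i <= n; i++) {
        if (place(k, i)) {
            x[k] = i;
            if (k == n) {               // all n queens placed: an answer node
                for (int j = 1; j <= n; j++) printf("%d ", x[j]);
                printf("\n");
            } else {
                nqueens(k + 1);         // extend the partial solution
            }
        }
    }
}

int main(void)
{
    n = 4;
    nqueens(1);                         // prints the two solutions 2 4 1 3 and 3 1 4 2
    return 0;
}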
Earlier we discussed the Fractional Knapsack problem using the Greedy approach and showed
that the Greedy approach gives an optimal solution for Fractional Knapsack.
In 0-1 Knapsack, items cannot be broken, which means the thief should take an item as a
whole or should leave it. This is the reason it is called 0-1 Knapsack.
Hence, in case of 0-1 Knapsack, the value of xi can be either 0 or 1, where other constraints
remain the same.
0-1 Knapsack cannot be solved by the Greedy approach: although the Greedy approach may
happen to give an optimal solution in some instances, it does not ensure one.
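As a simple illustration of the 0/1 restriction, the following C sketch enumerates, by
backtracking, the two choices for every object (take it whole or leave it) and records the best
profit; the instance data in it is assumed only for the example.
#include <stdio.h>

int n = 3, m = 50;                         // number of objects and capacity (assumed example)
int w[] = {10, 20, 30};                    // weights
int p[] = {60, 100, 120};                  // profits
int best = 0;                              // best profit found so far

// Try objects k..n-1 given the weight and profit collected so far.
void knapsack(int k, int weight, int profit)
{
    if (profit > best) best = profit;
    if (k == n) return;
    if (weight + w[k] <= m)                        // take object k (x[k] = 1)
        knapsack(k + 1, weight + w[k], profit + p[k]);
    knapsack(k + 1, weight, profit);               // leave object k (x[k] = 0)
}

int main(void)
{
    knapsack(0, 0, 0);
    printf("Maximum profit = %d\n", best);         // prints 220 for this data
    return 0;
}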
Introduction
Branch and bound is another algorithm design technique. B&B, as it is often abbreviated, is
one of the most complex techniques and cannot be discussed in its entirety here, so we focus on
the so-called A* algorithm, which is the most distinctive B&B graph search algorithm.
We have already covered the most important techniques such as backtracking, the greedy
strategy, divide and conquer, and dynamic programming. In this part we will also compare
branch and bound with the previously mentioned techniques, since it is really useful to
understand the differences.
Branch and bound is an algorithm technique that is often implemented for finding the optimal
solutions in case of optimization problems; it is mainly used for combinatorial and discrete
global optimizations of problems. In a nutshell, we opt for this technique when the domain of
possible candidates is way too large and all of the other algorithms fail. This technique is
based on the en masse elimination of the candidates.
You should already be familiar with the tree structure of algorithms. Out of the techniques
that we have learned both the backtracking and divide and conquer traverse the tree in its
depth, though they take opposite routes. The greedy strategy picks a single route and forgets
about the rest. Dynamic programming approaches this in a sort of breadth-first search
variation (BFS).
Now if the decision tree of the problem that we are planning to solve has practically
unlimited depth, then, by definition, the backtracking and divide and conquer algorithms are
out. We shouldn't rely on greedy because that is problem-dependent and never promises to
find a globally optimal solution.
As our last resort we may even think about dynamic programming. The truth is that maybe
the problem can indeed be solved with dynamic programming, but the implementation
wouldn't be an efficient approach; additionally, it would be very hard to implement. You see,
if we have a complex problem where we would need lots of parameters to describe the
solutions of sub-problems, DP becomes inefficient.
Branch and bound is a systematic method for solving optimization problems. B&B is a rather
general optimization technique that applies where the greedy method and dynamic
programming fail. However, it is much slower. Indeed, it often leads to exponential time
complexities in the worst case. On the other hand, if applied carefully, it can lead to
algorithms that run reasonably fast on average. The general idea of B&B is a BFS-like search
for the optimal solution, but not all nodes get expanded (i.e., their children generated).
Rather, a carefully selected criterion determines which node to expand and when, and another
criterion tells the algorithm when an optimal solution has been found. The basic concept
underlying the branch-and-bound technique is to divide and conquer. Since the original
"large" problem is hard to solve directly, it is divided into smaller and smaller subproblems
until these subproblems can be conquered. The dividing (branching) is done by partitioning
the entire set of feasible solutions into smaller and smaller subsets. The conquering
(fathoming) is done partially by (i) giving a bound for the best solution in the subset and (ii)
discarding the subset if the bound indicates that it can't contain an optimal solution. These
three basic steps – branching, bounding, and fathoming – are illustrated in the following
example.
The problem can be represented using a graph. Let G = (V, E) be a directed graph with edge
costs cij, where cij > 0 for all <i, j> ∈ E and cij = ∞ if <i, j> ∉ E. Let |V| = n and n > 1. A tour
of the graph is a directed cycle that includes every vertex in V, and the cost of a tour is the sum
of the costs of the edges on the tour. The traveling salesman problem is to find a tour of
minimum cost.
Application
The traveling salesman problem can be correlated to many problems that we find in the day
to day life. For example, consider a production environment with many commodities
manufactured by the same set of machines. Manufacturing occurs in cycles. In each production
cycle n different commodities are produced. When machine changes from product i to
product j, a cost Cij is incurred. Since products are manufactured cyclically, for the change
from last commodity to the first a cost is incurred. The problem is to find the optimal
sequence to manufacture the products so that the production cost is minimum.
We know that the tour of the simple graph starts and ends at vertex 1. Every tour consists of an
edge <1, k> for some k ∈ V-{1} and a path from k to 1. The path from k to 1 goes through each
vertex in V-{1, k} exactly once. If the tour is optimal, the path from k to 1 must be a shortest
k-to-1 path going through all vertices in V-{1, k}. Hence the principle of optimality holds.
Let g(i, S) be the length of a shortest path starting at vertex i, going through all vertices in S,
and ending at vertex 1. Then g(1, V-{1}) is the length of an optimal salesman tour. From the
principle of optimality,
g(1, V-{1}) = min{ C1k + g(k, V-{1,k}) }, the minimum taken over 2 ≤ k ≤ n        ..... (1)
Generalizing, for i ∉ S,
g(i, S) = min{ Cij + g(j, S-{j}) }, the minimum taken over all j ∈ S               ..... (2)
Clearly, g(i, φ) = Ci1 for 1 < i ≤ n. Use eq.(2) to get g(i, S) for all S with |S| = 1. Then find
g(i, S) with |S| = 2, and so on.
Example
Consider the directed graph with the cost adjacency matrix given below.
        | 0   10  15  20 |
        | 5   0   9   10 |
        | 6   13  0   12 |
        | 8   8   9   0  |
g(2, φ) = C21 = 5
g(3, φ) = C31 = 6
g(4, φ) = C41 = 8
Using eq.(2) for |S| = 1:
g(2, {3}) = C23 + g(3, φ) = 9 + 6 = 15        g(2, {4}) = C24 + g(4, φ) = 10 + 8 = 18
g(3, {2}) = C32 + g(2, φ) = 13 + 5 = 18       g(3, {4}) = C34 + g(4, φ) = 12 + 8 = 20
g(4, {2}) = C42 + g(2, φ) = 8 + 5 = 13        g(4, {3}) = C43 + g(3, φ) = 9 + 6 = 15
For |S| = 2:
g(2, {3,4}) = min{C23 + g(3, {4}), C24 + g(4, {3})} = min{29, 25} = 25
g(3, {2,4}) = min{C32 + g(2, {4}), C34 + g(4, {2})} = min{31, 25} = 25
g(4, {2,3}) = min{C42 + g(2, {3}), C43 + g(3, {2})} = min{23, 27} = 23
Finally,
g(1, {2,3,4}) = min{C12 + g(2, {3,4}), C13 + g(3, {2,4}), C14 + g(4, {2,3})}
              = min{35, 40, 43} = 35
Let J(i, S) denote the value of j that minimizes the right-hand side of eq.(2) while computing
g(i, S). Then J(1, {2,3,4}) = 2, which means that the tour starts from 1 and goes to 2. Since
J(2, {3,4}) = 4, from 2 it goes to 4. J(4, {3}) = 3, hence the next vertex is 3. Thus the optimal
tour is 1, 2, 4, 3, 1, with length 35.
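The same computation can be written as a short C program (illustrative only) that evaluates the
recurrence bottom-up over subsets represented as bit masks; for the cost matrix above it prints
the optimal tour length 35.
#include <stdio.h>
#define N   4
#define INF 1000000

int c[N][N] = {                    // cost adjacency matrix of the example (vertex v = notes vertex v+1)
    { 0, 10, 15, 20},
    { 5,  0,  9, 10},
    { 6, 13,  0, 12},
    { 8,  8,  9,  0}
};

int g[1 << N][N];                  // g[S][i] = shortest path from i through S back to vertex 0

int main(void)
{
    for (int S = 0; S < (1 << N); S++)
        for (int i = 0; i < N; i++)
            g[S][i] = INF;
    for (int i = 1; i < N; i++) g[0][i] = c[i][0];          // g(i, empty) = Ci1
    for (int S = 1; S < (1 << N); S++) {
        if (S & 1) continue;                                 // vertex 0 is never in S
        for (int i = 1; i < N; i++) {
            if (S & (1 << i)) continue;                      // i must not be in S
            for (int j = 1; j < N; j++)                      // g(i,S) = min Cij + g(j, S-{j})
                if ((S & (1 << j)) && c[i][j] + g[S ^ (1 << j)][j] < g[S][i])
                    g[S][i] = c[i][j] + g[S ^ (1 << j)][j];
        }
    }
    int full = (1 << N) - 2;                                 // S = V - {start vertex}
    int tour = INF;
    for (int k = 1; k < N; k++)                              // g(1, V-{1})
        if (c[0][k] + g[full ^ (1 << k)][k] < tour)
            tour = c[0][k] + g[full ^ (1 << k)][k];
    printf("Length of an optimal tour = %d\n", tour);        // prints 35
    return 0;
}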