Algorithm Analysis and Design

The main issue in this course is how to select and design an efficient algorithm to solve a given problem.

Overview of algorithm
• To apply an algorithm, there must first be a problem that needs a solution
– Is there any problem for which we cannot design an algorithm?
– Program = Algorithm + Data Structure
What is an algorithm?
• An algorithm is a clearly specified set of simple
instructions to be followed to solve a problem
• Any well-defined computational procedure that takes some
value (or set of values) as an input and produces some
value (or set of values) as an output
• A sequence of computational steps that transforms the
input into the output
• A set of well-defined, finite rules used for problem solving
• A finite set of instructions that, if followed, accomplishes a
particular task
• It is a precise, systematic method for producing a specified
result
• An algorithm is a sequence of unambiguous
instructions that accept any legitimate input
for transforming into a required output in a
finite amount of time so as to solve a specific
problem.
• From the above definition, an algorithm has the following five properties: Sequence, Unambiguous, Input, Output, Finite
Properties of an algorithm: Sequence
• It is a step-by-step procedure for solving a given
problem
• Every algorithm should have a beginning (start)
and a halt (end) step
• The first step (start step) and last step (halt step)
must be clearly noted
– Between the two steps, every step should have
preceding and succeeding steps
• That is, each step must have a uniquely defined
preceding and succeeding step
Properties of an algorithm: Unambiguous
• Define rigorously the sequence of operations performed
for transforming the inputs into the outputs
• No ambiguous statements are allowed: Each step of an
algorithm must be clearly and precisely defined, having
one and only one interpretation.
• At each point in computation, one should be able to tell
exactly what will happen next
• An algorithm must specify every step and must be composed of concrete steps
• Every detail of each step must be spelled out, including how to handle errors
• This ensures that if the algorithm is performed at different times or by different systems using the same data, the output will be the same.
Properties of an algorithm: Input specified
• The inputs are the data that will be transformed
during the computation to produce the output
• An input to an algorithm specifies an instance of
the problem the algorithm solves
• Every algorithm should have a specified number (zero or more) of input values (or quantities), which are externally supplied
– We must specify the type of data and the amount of data
• Note that a correct algorithm is not one that works most of the time but one that works correctly for all legitimate inputs
Properties of an algorithm: Output specified
• The output is the data resulting from the computation
– It is the intended result
• Every algorithm should have one or a sequence of
output values
–There must be one or more result values
• A possible output for some computations is a statement
that there can be no output, i.e., no solution is possible
• The algorithm can be proved to produce the correct
output given a valid input.

Properties of an algorithm: Finiteness
• Every valid algorithm must complete or terminate
after a finite number of steps.
• If you trace out the instructions of an algorithm,
then for all cases the algorithm must terminate
after a finite number of steps
– It must eventually stop either with the right output or
with a statement that no solution is possible
• Finiteness is an issue for computer algorithms because
– Computer algorithms often repeat instructions
– If the algorithm doesn’t specify when to stop, the computer will continue to repeat the instructions forever
Why algorithm analysis?
• There are many ways to solve a given problem
– So writing a working program to solve a problem is not good enough
• The program may be inefficient and/or incorrect!
– If the program is run on a large data set, then the running time becomes an issue
• We should always analyze an algorithm before implementing it
• Example: Selection Problem
– Given a list of N numbers, determine the kth largest (or smallest) element in the list, where k ≤ N.
Example: Selection Problem

• Algorithm 1:
(1)   Read N numbers into an array
(2)   Sort the array in decreasing order by
some simple algorithm
(3)   Return the element in kth position
Example: Selection Problem…
• Algorithm 2:
(1) Read the first k elements into an array and
sort them in decreasing order
(2) Each remaining element is read one by one
– If smaller than the kth element, then it is ignored
– Otherwise, it is placed in its correct spot in the
array, bumping one element out of the array.
(3) The element in the kth position is returned as
the answer.
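The steps of Algorithm 2 can be sketched in C++ as follows. This is only an illustrative sketch: the function name `kthLargest` and the use of `std::vector` are assumptions, not part of the slides, and elements equal to the current kth element are ignored (which does not change the result).

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: returns the k-th largest element (k is 1-based, 1 <= k <= N).
// Only the k largest elements seen so far are kept, sorted in decreasing order.
int kthLargest(const std::vector<int>& values, std::size_t k) {
    assert(k >= 1 && k <= values.size());
    // Step 1: read the first k elements and sort them in decreasing order.
    std::vector<int> top(values.begin(), values.begin() + k);
    std::sort(top.begin(), top.end(), std::greater<int>());
    // Step 2: read each remaining element one by one.
    for (std::size_t i = k; i < values.size(); ++i) {
        int x = values[i];
        if (x <= top[k - 1]) continue;  // not larger than the current k-th: ignore it
        // Shift smaller elements right; the old k-th element is bumped out.
        std::size_t pos = k - 1;
        while (pos > 0 && top[pos - 1] < x) {
            top[pos] = top[pos - 1];
            --pos;
        }
        top[pos] = x;
    }
    // Step 3: the element in the k-th position is the answer.
    return top[k - 1];
}
```

For example, `kthLargest({7, 2, 9, 4, 1}, 2)` yields 7. Each of the N - k later elements may shift up to k entries, so the cost grows with both N and k, which is what the questions on the next slide probe.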
Example: Selection Problem…
• Which algorithm is better when
– N = 100 and k = 100?
– N = 100 and k = 1?
• What happens when
– N = 1,000,000 and k = 500,000?
• Which one is an efficient algorithm? Which one is a correct algorithm?
– Do better algorithms exist?
Algorithm Evaluation: Two main ways
• The efficiency of the algorithm
– determining the amount of resources necessary to execute it, such as
• the number of steps or iterations (time complexity)
• storage locations (space complexity)
– the efficiency or running time of an algorithm is stated as a function relating the input length to the time and storage space required
• The correctness of the algorithm
– We only analyze the efficiency of correct algorithms
Correct Algorithm
• A correct algorithm solves the given computational
problem
– If the algorithm is not doing what it is supposed to do, it is
worthless
• An algorithm is said to be correct if, for every input
instance, it halts with the correct output
• An incorrect algorithm
–might not halt at all on some input instances, or
–might halt with a wrong answer.
• In order to show that an algorithm is incorrect, you need
just one instance of its input for which the algorithm fails
• How do we prove the correctness of an algorithm?
– Common techniques are proof by mathematical induction and proof by contradiction
Proof by Induction
• The induction base:
– is the proof that the statement is true for initial value (e.g. n =1)
• The induction hypothesis:
– is the assumption that the statement is true for the arbitrary values 1, 2, …, n
• The induction step:
– is the proof that if the statement is true for n, it must be true for
n+1

• Example: show the correctness of the following equation for all positive integers n:
$$1 + 2 + \dots + n = \frac{n(n+1)}{2}$$
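A worked sketch of the three induction parts for the sum formula $1 + 2 + \dots + n = n(n+1)/2$ might look as follows (the layout is illustrative, not taken from the slides):

```latex
\begin{align*}
\text{Base } (n = 1):\quad & 1 = \frac{1(1+1)}{2}. \\
\text{Hypothesis}:\quad & 1 + 2 + \dots + n = \frac{n(n+1)}{2}. \\
\text{Step}:\quad & 1 + 2 + \dots + n + (n+1) = \frac{n(n+1)}{2} + (n+1) = \frac{(n+1)(n+2)}{2}.
\end{align*}
```

The last expression is the formula with n replaced by n + 1, which completes the induction step.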
Assignment
• Prove by induction that the following geometric series identity is correct (for r ≠ 1):
$$r^n + r^{n-1} + \dots + r^1 + r^0 = \frac{r^{n+1} - 1}{r - 1}$$
Example 2: Finding Maximum
• Finding the maximum element problem
–The Input is an array A storing n elements and the output is the
maximum one in A
• E.g.: Given array A=[31, 41, 26, 41, 58], max algorithm returns 58

Algorithm for finding the maximum element


Algorithm findMax(A, n)
  //Input: An array A[0..n-1].
  //Output: The maximum element in A.
  currentMax ← A[0]
  for i ← 1 to n-1 do
    if currentMax < A[i] then
      currentMax ← A[i]
    end if
  end for
  return currentMax
end algorithm
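A direct C++ rendering of findMax could look like this. It is a minimal sketch: the use of `std::vector` and the non-empty precondition check are assumptions added for illustration.

```cpp
#include <cassert>
#include <vector>

// Returns the maximum element of a non-empty array, scanning left to right.
int findMax(const std::vector<int>& a) {
    assert(!a.empty());  // assumed precondition: at least one element
    int currentMax = a[0];
    for (std::size_t i = 1; i < a.size(); ++i) {
        if (currentMax < a[i]) {
            currentMax = a[i];
        }
    }
    return currentMax;
}
```

The loop performs exactly n - 1 comparisons for an array of n elements, whatever the input order.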
ALGORITHM (radix sort)
Let A be a linear array of n elements A[1], A[2], A[3], ..., A[n]. Digit is the total number of digits in the largest element in array A.
1. Input the n elements of array A.
2. Find the total number of digits (Digit) in the largest element in the array.
3. Initialise i = 1 and repeat steps 4 and 5 while (i <= Digit), incrementing i after each pass.
4. For each of the n elements of the array:
(a) Take the digit in the ith position of the element (counting from the least significant digit) and place the element in the bucket with that number.
5. Read the element(s) of the buckets from the 0th bucket to the 9th bucket, and within each bucket from the first position to the last, to generate the new array A.
6. Display the sorted array A.
7. Exit
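The bucket procedure above can be sketched in C++ as a least-significant-digit radix sort. This sketch assumes non-negative integers and base 10; the function name `radixSort` and the choice to return a sorted copy are illustrative assumptions.

```cpp
#include <vector>

// Sketch: LSD radix sort for non-negative integers, base 10.
// Takes the array by value and returns the sorted copy.
std::vector<int> radixSort(std::vector<int> a) {
    if (a.empty()) return a;
    int largest = a[0];
    for (int x : a) {
        if (x > largest) largest = x;
    }
    // One pass per decimal digit of the largest element.
    for (long long place = 1; largest / place > 0; place *= 10) {
        std::vector<std::vector<int>> buckets(10);
        // Place each element in the bucket matching its current digit.
        for (int x : a) {
            buckets[(x / place) % 10].push_back(x);
        }
        // Read buckets 0..9 in order, first position to last, back into a.
        std::size_t k = 0;
        for (const auto& bucket : buckets) {
            for (int x : bucket) a[k++] = x;
        }
    }
    return a;
}
```

Because elements are read back from each bucket in insertion order, every pass is stable, which is what makes the digit-by-digit approach produce a fully sorted array.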
Solution
void selectionSort(int[] a) {
    for (int i = 0; i < a.length - 1; i++) {
        int min = i;
        for (int j = i + 1; j < a.length; j++) {
            if (a[j] < a[min]) {
                min = j;
            }
        }
        if (i != min) {
            int swap = a[i];
            a[i] = a[min];
            a[min] = swap;
        }
    }
}
Algorithm Analysis
• Best, average and worst case algorithm analysis
• Asymptotic analysis
Why do we study algorithms?
• Suppose computers were infinitely fast and computer memory were free. Would you have any reason to study algorithms?
– Yes, because we still want to demonstrate that the algorithm is correct: it terminates with the intended solution for all given inputs
• However, in reality:
– Computers may be fast, but they are not infinitely fast
– Memory may be cheap, but it is not free
– Computing time and memory are therefore bounded resources
Algorithm Design & Analysis Process
1. Understand the problem
2. Decide on algorithm design techniques, etc.
3. Design an algorithm
4. Prove correctness
5. Analyze efficiency
6. Code the algorithm
Analysis of Algorithm
• As you may have noted, there are often multiple
algorithms one can use to solve the same problem.
–To search a list, one can use linear search, binary search, …
–To sort a list, we may consider bubble sort, selection sort, quick sort, merge sort, heap sort, …
–You can come up with your own variants.
• Why all these algorithms? Do we need them?
• How do we choose which algorithm is the best?
–The fastest/most efficient algorithm.
–The one that uses the fewest resources.
–The clearest.
–The shortest, ...
Analysis of algorithms
• Analysis of algorithm is the analysis of resource usage
of a given algorithm
– It means predicting the resources that the algorithm requires
• The main resources are running time and memory
space
– An algorithm that solves a problem but requires a year and/or
GBs of main memory is hardly of any use.
• The objective of algorithm analysis is to measure the efficiency of an algorithm, that is,
– how quickly (and with how little memory) the algorithm executes in practice
Efficiency
• An algorithm must solve a problem with the least amount
of computational resources such as time and space.
– an algorithm should run as fast as possible using as little
memory as possible
• Two types of algorithmic efficiency evaluation
– Time efficiency - indicates how fast the algorithm runs
– Space efficiency - indicates how much memory the algorithm
needs
• What to analyze?
– To keep things simple, we will concentrate on the running time of
algorithms.
– So, efficiency considerations of algorithm usually focus on the
amount of time elapsed (called running time of an algorithm)
when processing data.
Analysis of Insertion Sort
algorithm INSERTION-SORT(A)                cost   times
1. for j ← 2 to length[A] do               c1     n
2.   key ← A[j]                            c2     n - 1
3.   i ← j - 1                             c3     n - 1
4.   while i > 0 and A[i] > key do         c4     $\sum_{j=2}^{n} t_j$
5.     A[i+1] ← A[i]                       c5     $\sum_{j=2}^{n} (t_j - 1)$
6.     i ← i - 1                           c6     $\sum_{j=2}^{n} (t_j - 1)$
7.   A[i+1] ← key                          c7     n - 1
(t_j is the number of times the while loop test in line 4 is executed for that value of j)
• The running time, T(n), of insertion sort is the sum of the running times of each statement executed, i.e.:
$$T(n) = c_1 n + c_2(n-1) + c_3(n-1) + c_4\sum_{j=2}^{n} t_j + c_5\sum_{j=2}^{n}(t_j - 1) + c_6\sum_{j=2}^{n}(t_j - 1) + c_7(n-1)$$
Best Case Analysis of Insertion Sort
• Occurs if the array is already sorted
–For each j = 2, 3, 4, …, n, we then find that A[i] ≤ key in line 4 when i has its initial value of j – 1.
–Thus t_j = 1 for j = 2, 3, …, n, and lines 5 and 6 are executed 0 times
• The best case running time is
T(n) = c1n + c2(n-1) + c3(n-1) + c4(n-1) + c7(n-1)
= (c1 + c2 + c3 + c4 + c7)n – (c2 + c3 + c4 + c7)
–This running time can be expressed as an + b for constants a and b that depend on the statement costs c_i;
• it is thus a linear function of n
Worst Case Analysis of Insertion Sort
• Occurs if the array contains values that are in reverse sorted order, that is, in decreasing order
• We must compare each element A[j] with each element in the entire sorted subarray A[1..j-1]. So, t_j = j for j = 2, 3, …, n, and
$$\sum_{j=2}^{n} t_j = \sum_{j=2}^{n} j = \frac{n(n+1)}{2} - 1, \qquad \sum_{j=2}^{n} (t_j - 1) = \sum_{j=2}^{n} (j - 1) = \frac{n(n-1)}{2}$$
• Therefore the worst case running time of INSERTION-SORT is
$$T(n) = c_1 n + c_2(n-1) + c_3(n-1) + c_4\left(\frac{n(n+1)}{2} - 1\right) + c_5\,\frac{n(n-1)}{2} + c_6\,\frac{n(n-1)}{2} + c_7(n-1)$$
$$= \left(\frac{c_4 + c_5 + c_6}{2}\right)n^2 + \left(c_1 + c_2 + c_3 + \frac{c_4}{2} - \frac{c_5}{2} - \frac{c_6}{2} + c_7\right)n - (c_2 + c_3 + c_4 + c_7)$$
– This worst case running time can be expressed as an^2 + bn + c for constants a, b, c; it is thus a quadratic function of n
Average Case Analysis of Insertion Sort
• Suppose that we randomly choose n numbers and apply insertion sort
• How long does it take to determine where in subarray A[1..j-1] to insert the element A[j]?
– On average, half the elements in A[1..j-1] are less than A[j], and half the elements are greater
– On average, therefore, we check half the subarray A[1..j-1], so t_j = j/2 and T(n) is still of the order of n^2
• The average case running time can therefore be expressed as a quadratic function, an^2 + bn + c for constants a, b, c, which has the same form as the worst case
• In summary, the running time of insertion sort is
– Best case: an + b
– Worst case: an^2 + bn + c
– Average case: an^2 + bn + c
(the constants a, b, c differ from case to case and may be negative)
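To see the best and worst cases concretely, the following sketch implements insertion sort in C++ and counts how many times an element is shifted right (line 5 of the pseudocode). The function name and the returned counter are illustrative assumptions, and the code uses 0-based indices rather than the 1-based pseudocode.

```cpp
#include <vector>

// Sorts a in non-decreasing order and returns the number of element
// shifts performed (executions of "A[i+1] <- A[i]" in the pseudocode).
long long insertionSort(std::vector<int>& a) {
    long long shifts = 0;
    for (std::size_t j = 1; j < a.size(); ++j) {
        int key = a[j];
        std::size_t i = j;
        // Move elements greater than key one position to the right.
        while (i > 0 && a[i - 1] > key) {
            a[i] = a[i - 1];
            --i;
            ++shifts;
        }
        a[i] = key;
    }
    return shifts;
}
```

On an already sorted array the counter stays at 0 (the linear best case), while on a reverse sorted array of n elements it reaches n(n-1)/2, matching the quadratic worst case derived above.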
Asymptotic Analysis
• When analyzing the running time or space usage of
programs, we usually try to estimate the time or space as a
function of the input size.
– For example, when analyzing the worst case running time of insertion sort, we say the running time is T(n) = an^2 + bn + c, for some constants a, b and c.
• The asymptotic behavior of a function f(n) refers to the
growth of f(n) as n gets large. We typically ignore small
values of n, since we are usually interested in estimating
how slow the program will be on large inputs.
– A good rule of thumb is: the slower the asymptotic growth rate,
the better the algorithm.
• By this measure, a linear algorithm, f(n) = an + c, is always asymptotically better than a quadratic one, f(n) = an^2 + c. That is because for any given positive constants (even different ones for the two functions), there is always some n beyond which the quadratic function exceeds the linear one.
– For moderate values of n, a quadratic algorithm with small constants could very well take less time than a linear one with large constants. However, the linear algorithm will always be better for sufficiently large inputs.
Asymptotic analysis

• 1 ≤ n for all n ≥ 1
• n ≤ n^2 for all n ≥ 1
• n ≤ n log2 n for all n ≥ 2
• log2 n ≤ n for all n ≥ 2
• 2^n ≤ n! for all n ≥ 4


Asymptotic notations: Big-Oh (O)
• Definition
– Let f(n) and g(n) be functions mapping nonnegative integers to
real numbers.
– A function f(n) = O(g(n)) if there is some constant c > 0 and an integer n0 ≥ 1 such that
f(n) ≤ c·g(n) for all n ≥ n0
• Big-O expresses an upper bound on the growth rate of a function, for sufficiently large values of n
– For a problem, an upper bound is given by the best algorithmic solution that has been found so far (“what is the best thing that we know we can do?”)
• In simple words, f(n) = O(g(n)) means that the growth rate of f(n) is less than or equal to that of g(n).
– The statement f(n) = O(g(n)) states only that c·g(n) is an upper bound on the value of f(n) for all n ≥ n0
Big-Oh theorems
• Theorem 1: If k is a constant, then k is O(1)
– Example: f(n) = 2100 = O(1)
• Theorem 2: If f(n) is a polynomial of degree k, then f(n) = O(n^k)
– If f(n) = a0 + a1·n + a2·n^2 + … + ak·n^k, where the ai and k are constants, then f(n) is in O(n^k)
– A polynomial’s growth rate is determined by its leading term
– Example: f(n) = 7n^4 + 3n^2 + 5n + 1000 is O(n^4)
• Theorem 3: Constant factors may be ignored
– If g(n) is in O(f(n)), then k·g(n) is O(f(n)) for any constant k > 0
– Example:
• T(n) = 7n^4 + 3n^2 + 5n + 1000 is O(n^4)
• T(n) = 28n^4 + 12n^2 + 20n + 4000 is O(n^4)
Example: Big-Oh (O)
Find a Big-Oh bound for each of the given functions:
• f(n) = 2n + 6
• f(n) = 13n^3 + 42n^2 + 2n log n
• If f(n) = 3n^2 + 4n + 1, then show that f(n) = O(n^2)
• If f(n) = 10n + 5 and g(n) = n, then show that f(n) is O(g(n))
Asymptotic notations: Big-Omega (Ω)
• Definition
–Let f(n) and g(n) be functions mapping nonnegative integers to real numbers.
–A function f(n) = Ω(g(n)) if there is some constant c > 0 and an integer n0 ≥ 1 such that
f(n) ≥ c·g(n) for all n ≥ n0
• The statement f(n) = Ω(g(n)) states only that c·g(n) is a lower bound on the value of f(n) for all n ≥ n0
–In simple terms, f(n) = Ω(g(n)) means that the growth rate of f(n) is greater than or equal to that of g(n)
Big-Omega- Example
• Show that the function T(n) = 5n^2 – 64n + 256 = Ω(n^2)
– We need to find an integer n0 ≥ 1 and a constant c > 0 such that T(n) ≥ c·n^2 for all integers n ≥ n0
– With c = 1 and n0 = 1, T(n) ≥ c·n^2 for all integers n ≥ n0, since T(n) – n^2 = 4(n – 8)^2 ≥ 0
– What if c = 2 and n0 = 16?
• Show that if f(n) = 10n^2 + 4n + 2 and g(n) = n^2, then f(n) = Ω(n^2)
• Show that 3n^2 + 5 ≠ Ω(n^3)


Asymptotic notations: Theta (Θ)
• Definition
–Let f(n) and g(n) be functions mapping nonnegative integers to real numbers.
–A function f(n) = Θ(g(n)) if there exist positive constants c1 and c2 and an integer constant n0 ≥ 1 such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0
• The Theta notation is used when the function f can be bounded both from above and below by the same function
–When we write f(n) = Θ(g(n)), we mean that f lies between c1 times the function g and c2 times the function g, except possibly when n is smaller than n0
• Another way to view the Θ-notation is that
–f(n) = Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n))
Asymptotic Tightness
• The theta notation is more precise than both the Big-O and Big-Ω notations.
– f(n) = Θ(g(n)) iff g(n) is both an upper and a lower bound on f(n) (up to constant factors).
• Big-Oh does not have to be asymptotically tight
– f(n) = ½n is O(n) with c = 1, n0 = 1, but is also in O(n^100)…
• Big-Ω isn’t tight either
– f(n) = n^5 is Ω(n) with c = 1, n0 = 1…
• Theta (Θ) is tight…
– f(n) and g(n) must be in the same growth class to meet the definition.
• Check this assertion using the claim that f(n) = 3n^3 + 2n^2 + 1 is Θ(n^3).
– Show that f(n) is O(n^3), and also that f(n) is Ω(n^3)
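One possible way to carry out this check for $f(n) = 3n^3 + 2n^2 + 1$ is sketched below; the constants chosen are just one valid option among many:

```latex
\begin{align*}
\text{Upper bound } (n \ge 1):\quad & 3n^3 + 2n^2 + 1 \le 3n^3 + 2n^3 + n^3 = 6n^3, \\
\text{Lower bound } (n \ge 1):\quad & 3n^3 + 2n^2 + 1 \ge 3n^3 .
\end{align*}
```

So the definition of Θ is met with $c_1 = 3$, $c_2 = 6$ and $n_0 = 1$, which gives both $f(n) = O(n^3)$ and $f(n) = \Omega(n^3)$.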
Example: Algorithmic efficiency
• Write an algorithm for linear search which scans through the given sequence of inputs, A = <a1, a2, …, an>, looking for a Key.
– Analyze its best case, worst case and average case time complexity
– Compute Big-O and Big-Ω
algorithm linearSearch(A, n, k)              worst        best   average
//input: sequence of inputs A[1], …, A[n] and a key k
//output: index of the key, or ‘not found’
key = k                                      1            1      1
for i = 1 to n do                            n-1+1 = n    1      n/2
  if A[i] = key then                         n-1          1      (n-1)/2
    return i                                 0            1      1
  end if
end for
return ‘not found’                           1            0      0
end algorithm                        f(n) =  2n + 1       4      n + 3/2
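For comparison, a minimal C++ sketch of the same linear search is shown below. The 0-based indexing and the return value of -1 for "not found" are assumptions made for this sketch, not conventions from the slides.

```cpp
#include <vector>

// Returns the 0-based index of the first occurrence of key in a,
// or -1 if key does not appear.
int linearSearch(const std::vector<int>& a, int key) {
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] == key) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

The best case is a match at the first position (constant work); the worst case is a missing key, where all n elements are inspected.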
Exercise: Algorithmic efficiency
• (i) Write an algorithm for linear search, which scans through the given sequence of inputs, A = <a1, a2, …, an>, looking for Key, and (ii) analyze its time complexity.
The output is either
– one or more indices i (the positions of all values with Key = A[i]), or
– the special message “Not Found” if Key does not appear in the list.
• Design an algorithm that scans through the given sequence of inputs, A = <a1, a2, …, an>, finding one or more minimum value(s).
– What is the best, worst and average case time complexity of the algorithm?
Exercise
• Given a sequence of elements, A1, A2, …, An, design recursive algorithms that
– reverse the array: An, An-1, …, A1
– sum the elements of the array
• Analyze the efficiency of your algorithms
• Write source code to implement your algorithms
Solution
1) Initialize start and end indexes as start = 0, end = n-1
2) Swap arr[start] with arr[end]
3) Recursively call reverse for the rest of the array.
Program answer (recursive version)
#include <iostream>
using namespace std;

/* Recursively reverse arr[start..end] in place */
void reverseArray(int arr[], int start, int end)
{
    if (start >= end)
        return;

    int temp = arr[start];
    arr[start] = arr[end];
    arr[end] = temp;

    // Recursive function call on the rest of the array
    reverseArray(arr, start + 1, end - 1);
}

/* Utility function to print an array */
void printArray(int arr[], int size)
{
    for (int i = 0; i < size; i++)
        cout << arr[i] << " ";

    cout << endl;
}

/* Driver function to test above functions */
int main()
{
    int arr[] = {1, 2, 3, 4, 5, 6};

    // To print original array
    printArray(arr, 6);

    // Function calling
    reverseArray(arr, 0, 5);

    cout << "Reversed array is" << endl;

    // To print the reversed array
    printArray(arr, 6);

    return 0;
}
#include <iostream>
using namespace std;

/* Iterative version: reverse arr[start..end] in place */
void reverseArray(int arr[], int start, int end)
{
    while (start < end)
    {
        int temp = arr[start];
        arr[start] = arr[end];
        arr[end] = temp;
        start++;
        end--;
    }
}

/* Utility function to print an array */
void printArray(int arr[], int size)
{
    for (int i = 0; i < size; i++)
        cout << arr[i] << " ";

    cout << endl;
}

/* Driver function to test above functions */
int main()
{
    int arr[] = {1, 2, 3, 4, 5, 6};

    int n = sizeof(arr) / sizeof(arr[0]);

    // To print original array
    printArray(arr, n);

    // Function calling
    reverseArray(arr, 0, n - 1);

    cout << "Reversed array is" << endl;

    // To print the reversed array
    printArray(arr, n);

    return 0;
}
Algorithmic Paradigms
•Techniques for Algorithms Design:
–General approaches to the construction of efficient solutions
to problems.
•Such methods are of interest because:
–They provide templates suited to solve a broad range of
diverse problems which can be precisely analyzed.
–They can be translated into common control and data
structures provided by most high-level languages.
•Although more than one technique may be applicable
to a specific problem, it is often the case that an
algorithm constructed by one approach is clearly
superior to equivalent solutions built using alternative
techniques.
–The choice of design paradigm is an important aspect of
algorithm analysis
Algorithmic Paradigms
• Some of the techniques for the Design of Algorithms
include:
– Divide and Conquer
– Greedy
– Dynamic Programming
– Backtracking
– Branch and bound, …
The divide-and-conquer strategy
Divide-and-Conquer Strategy
• This is a general algorithm design paradigm that has produced such efficient algorithms as Max-Min, Merge Sort, Binary Search, …
• This method has three distinct steps:
– Divide: If the input size is too large, divide the input into two or more sub-problems. That is, divide P → P1, …, Pk
• If the input size of the problem is small, it is solved directly
– Recur: Use divide and conquer to solve each of the sub-problems separately. That is, find the solutions S(P1), …, S(Pk)
– Conquer: Take the solutions to the sub-problems and combine (“merge”) them into a solution for the original problem. That is, merge S(P1), …, S(Pk) → S(P)
The Divide and Conquer Strategy
• Implementation: suppose we consider the divide-
and-conquer strategy when it splits the input into two
sub-problems of the same kind as the original
problem.
• If the input size of the problem is small, it is solved
directly.
• If the input size of the problem is large, apply the
strategy:
– Divide: divide the input data S into two disjoint subsets S1 and S2
– Recur: solve the sub-problems associated with S1 and S2
– Conquer: combine the solutions for S1 and S2 into a solution for S
General Algorithm
procedure DCS(P)
  if small(P) then
    return S(P)
  else
    divide P into smaller instances P1, P2, …, Pk
    apply DCS to each of these sub-problems
    return combine(DCS(P1), DCS(P2), …, DCS(Pk))
  end if
end DCS
Complexity:
T(n) = f(n)               if n is small
T(n) = a·T(n/b) + g(n)    otherwise, where
• a is the number of sub-problems we solve at each step
• b is the factor by which the input size is divided at each step, i.e. each sub-problem has size n/b
• T(n) is the time needed to solve the problem with input of size n
• g(n) is the time for dividing the problem and combining the solutions to the sub-problems
• f(n) is the time to compute the answer directly for small inputs
Divide-and-Conquer Technique
[Figure: a problem (instance) of size n is divided into subproblem 1 and subproblem 2, each of size n/2; a solution to subproblem 1 and a solution to subproblem 2 are combined into a solution to the original problem]
In general this leads to a recursive algorithm with complexity
T(n) = 2T(n/2) + g(n)
Solving Recurrence Relation
• One of the methods for solving recurrence relations is called the substitution method.
–This method repeatedly makes substitutions for each occurrence of the function T on the right-hand side until all such occurrences disappear
Example: solve the following recurrences by substitution
1. $T(n) = \begin{cases} 2 & n = 1 \\ 2T(n/2) + n & n > 1 \end{cases}$
2. $T(n) = \begin{cases} 1 & n \le 2 \\ 2T(n/2) + 1 & n > 2 \end{cases}$
Example of Recursion: SUM A[1…n]
•Problem: Write a recursive function to find the sum of the first
n integers A[1…n] and output the sum
–Example: given k = 3, we return sum = A[1] + A[2] + A[3]
given k = n, we return A[1] + A[2] + … + A[n]
–How can you define the problem in terms of a smaller
problem of the same type?
1 + 2 + … + n = [1 + 2 + … + (n -1)] + n
for n > 1, f(n) = f(n-1) + n
–How does each recursive call diminish the size of the
problem? It reduces by 1 the number of values to be
summed.
–What instance of the problem can serve as the base case?
n=1
–As the problem size diminishes, will you reach this base case? Yes, as long as n is a positive integer. Therefore the condition “n >= 1” needs to be a precondition
Example of Recursion : SUM A[1…n]
Problem: Write a recursive function to find the sum of the first n integers of an array A and output the sum
algorithm LinearSum(A, n)
  // Input: an array A[0..n-1] with n ≥ 1 elements
  // Output: The sum of the first n integers in A
  if n = 1 then
    return A[0]
  else
    return LinearSum(A, n - 1) + A[n - 1]
  end if
end algorithm
Example recursion trace for A = (4, 3, 6, 2, 5):
  call LinearSum(A, 5): return 15 + A[4] = 15 + 5 = 20
  call LinearSum(A, 4): return 13 + A[3] = 13 + 2 = 15
  call LinearSum(A, 3): return 7 + A[2] = 7 + 6 = 13
  call LinearSum(A, 2): return 4 + A[1] = 4 + 3 = 7
  call LinearSum(A, 1): return A[0] = 4
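The linear recursion can be written in C++ roughly as follows. The sketch assumes 0-based indexing (so the last of the first n elements is `a[n - 1]`) and adds a precondition check that is not in the slides.

```cpp
#include <cassert>
#include <vector>

// Returns the sum of the first n elements of a (requires 1 <= n <= a.size()).
int linearSum(const std::vector<int>& a, std::size_t n) {
    assert(n >= 1 && n <= a.size());  // assumed precondition
    if (n == 1) {
        return a[0];                  // base case
    }
    // Sum of the first n-1 elements plus the n-th element.
    return linearSum(a, n - 1) + a[n - 1];
}
```

With the trace array (4, 3, 6, 2, 5), `linearSum(a, 5)` makes five nested calls and returns 20, so both the running time and the recursion depth grow linearly with n.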
Binary Sum
• Binary recursion occurs whenever there are two recursive calls for each non-base case.
Algorithm BinarySum(A, i, n):
  //Input: An array A and integers i and n
  //Output: The sum of the n integers in A starting at index i
  if n = 1 then
    return A[i]
  end if
  return BinarySum(A, i, ⌈n/2⌉) + BinarySum(A, i + ⌈n/2⌉, ⌊n/2⌋)
end algorithm
Example recursion trace for n = 8, with each call shown as (i, n):
  (0, 8)
  (0, 4) (4, 4)
  (0, 2) (2, 2) (4, 2) (6, 2)
  (0, 1) (1, 1) (2, 1) (3, 1) (4, 1) (5, 1) (6, 1) (7, 1)
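A C++ sketch of BinarySum is given below. It assumes the first half takes the ceiling of n/2 elements and the second half takes the remainder, so that it also works when n is not a power of 2; the precondition check is an addition for illustration.

```cpp
#include <cassert>
#include <vector>

// Returns the sum of the n elements of a starting at index i
// (requires n >= 1 and i + n <= a.size()).
int binarySum(const std::vector<int>& a, std::size_t i, std::size_t n) {
    assert(n >= 1 && i + n <= a.size());  // assumed precondition
    if (n == 1) {
        return a[i];                      // base case: a single element
    }
    std::size_t half = (n + 1) / 2;       // ceil(n / 2)
    // Two recursive calls: first half, then the remaining elements.
    return binarySum(a, i, half) + binarySum(a, i + half, n - half);
}
```

The total number of calls is still proportional to n, but the recursion depth is only about log2 n, compared with depth n for the linear version.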
Binary search
• Binary Search is an algorithm to find an item in a sorted list.
–very efficient algorithm for searching in sorted array
–Limitations: must be a sorted array
• Problem: determine whether a given element K is present in
the given list or not
–Input: Let A = <a1, a2, … an> be a list of elements that are sorted in non-
decreasing order.
–Output: If K is present output its position. Otherwise output “Not Found”.

• Implementation:
–Pick the pivot item in the middle: split the list A[1], …, A[m], …, A[n] into two halves (of size about n/2) at the middle position m.
–If K = A[m], stop (successful search);
–Otherwise, until the list has shrunk to size 1, narrow the search recursively to either
the top half of the list, A[1..m-1], if K < A[m], or
the bottom half of the list, A[m+1..n], if K > A[m]
Example
• Example: Binary Search for 64 in the given list A[1:17] =
{5 8 9 13 22 30 34 37 38 41 60 63 65 82 87 90 91}
1. Looking for 64 in this list.
2. Divide the list into two: the middle position is (1+17)/2 = 9
3. Pivot = A[9] = 38. Is 64 < 38? No.
4. Recurse, looking for 64 in the part of the list > 38.
5. etc.
• Given 14 elements: A[1:14] = (-15, -6, 0, 7, 9, 23, 54, 82, 101, 112, 125, 131, 142, 151).
–Construct a binary search tree and search for (i) 151, (ii) 10
SEARCHING A NODE
Searching for a node was part of the operation performed during insertion. The algorithm to search for an element in a binary search tree is given below.
ALGORITHM
1. Input the DATA to be searched and assign the address of the root node to ROOT.
2. If (ROOT == NULL)
(a) Display “The DATA does not exist”
(b) GoTo Step 6
3. If (DATA == ROOT → Info)
(a) Display “The DATA exist in the tree”
(b) GoTo Step 6
4. If (DATA > ROOT → Info)
(a) ROOT = ROOT → RChild
(b) GoTo Step 2
5. If (DATA < ROOT → Info)
(a) ROOT = ROOT → LChild
(b) GoTo Step 2
6. Exit
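The search steps can be sketched in C++ as a simple loop. The `Node` layout and the function name `bstContains` are hypothetical, chosen to mirror the Info, LChild and RChild fields of the pseudocode; the null check comes first so that an empty tree or a missing key is handled safely.

```cpp
// Hypothetical node layout mirroring the pseudocode's Info/LChild/RChild.
struct Node {
    int info;
    Node* lchild;
    Node* rchild;
};

// Returns true if data is stored in the binary search tree rooted at root.
bool bstContains(const Node* root, int data) {
    while (root != nullptr) {
        if (data == root->info) {
            return true;              // found
        }
        // Go right for larger keys, left for smaller ones.
        root = (data > root->info) ? root->rchild : root->lchild;
    }
    return false;                     // fell off the tree: not present
}
```

Each iteration descends one level, so the cost is proportional to the height of the tree.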
Binary Search Recursive Algorithm
procedure BSearch(A, low, high, key)
// A is a sorted array. Initially low = 1, high = n
  if low = high then
    if key = A[low] then return low
    else return “Not Found”
    end if
  else
    mid = ⌊(low + high)/2⌋
    if key > A[mid] then
      return BSearch(A, mid+1, high, key)
    else
      return BSearch(A, low, mid, key)
    end if
  end if
end procedure
Binary Search Iterative Algorithm
Procedure BinarySearch(A, n, key)
  low ← 1; high ← n
  while low ≤ high do
    mid ← ⌊(low + high)/2⌋
    if key = A[mid] then
      return mid
    else if key < A[mid] then
      high ← mid - 1
    else
      low ← mid + 1
    end if
  end while
  return “Not Found”
end
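The iterative procedure translates to C++ along these lines. This is a sketch with 0-based indices, where -1 stands in for "Not Found"; both conventions are assumptions of the sketch rather than of the slides.

```cpp
#include <vector>

// Returns the 0-based index of key in the sorted array a, or -1 if absent.
int binarySearch(const std::vector<int>& a, int key) {
    int low = 0;
    int high = static_cast<int>(a.size()) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;  // middle index, avoids overflow
        if (key == a[mid]) {
            return mid;
        } else if (key < a[mid]) {
            high = mid - 1;                // continue in the left part
        } else {
            low = mid + 1;                 // continue in the right part
        }
    }
    return -1;
}
```

Each iteration halves the remaining range, so at most about log2 n + 1 iterations are needed for n elements.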
Binary Search Iterative Algorithm
• Analysis: considering the number of element comparisons, the worst-case recurrence is:
T(n) = 1            if n = 1
T(n) = T(n/2) + 1   if n > 1
• Show that T(n) = O(log2 n).


Finding the minimum & maximum
• Suppose there are n elements <a1, a2, …, an>. The problem is to find the maximum and minimum elements in the set.
Straightforward algorithm
procedure max_min(A, n, max, min)
  max = min = A[1]
  for i = 2 to n do
    if A[i] > max then
      max = A[i]
    end if
    if A[i] < min then
      min = A[i]
    end if
  end for
end procedure
• Analysis: there are 2(n-1) element comparisons in the best, worst and average cases.
• Can you suggest any improvement to the max_min algorithm?
Divide and conquer for finding min & max
Recurrence relation:
• If n = 1, both max and min are the same: max = min = a[1]
• If n = 2, the problem can be solved by making one element comparison: a[1] > a[2] or a[1] < a[2]
• If n > 2, divide <a1, a2, …, an> into two instances a[1..n/2] and a[n/2+1..n] and solve the sub-problems recursively.
Example: find the maximum and minimum of a given set of n numbers a[] = {29, 14, 15, 1, 6, 10, 32, 12}
Recursive algorithm
procedure max_min(A, low, high, max, min)
  if low = high then
    max = min = A[low]
  else if low = high – 1 then
    if A[low] < A[high] then
      max = A[high]; min = A[low]
    else
      max = A[low]; min = A[high]
    end if
  else
    mid = ⌊(low + high)/2⌋
    max_min(A, low, mid, max, min)
    max_min(A, mid+1, high, max2, min2)
    if max < max2 then max = max2
    if min > min2 then min = min2
  end if
end procedure
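A C++ sketch of the recursive max_min procedure follows. Instead of output parameters it returns a `{min, max}` pair, which is a design choice of this sketch rather than of the slides; indices are 0-based and inclusive.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Returns {min, max} of a[low..high] (inclusive, 0-based) by divide and conquer.
std::pair<int, int> minMax(const std::vector<int>& a, std::size_t low, std::size_t high) {
    assert(low <= high && high < a.size());  // assumed precondition
    if (low == high) {
        return {a[low], a[low]};             // one element: no comparison
    }
    if (high == low + 1) {                   // two elements: one comparison
        if (a[low] < a[high]) return {a[low], a[high]};
        return {a[high], a[low]};
    }
    std::size_t mid = low + (high - low) / 2;
    std::pair<int, int> left = minMax(a, low, mid);
    std::pair<int, int> right = minMax(a, mid + 1, high);
    // Combine: two more comparisons, one for min and one for max.
    return {std::min(left.first, right.first), std::max(left.second, right.second)};
}
```

For the example set {29, 14, 15, 1, 6, 10, 32, 12}, calling `minMax(a, 0, 7)` gives a minimum of 1 and a maximum of 32.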
Exercise
• Consider the searching problem. Given a sequence of n numbers, A = <a1, a2, …, an>, and a value Key, write code for linear search, which scans through the sequence, looking for Key, and returns one or more positions i such that Key = A[i].
• Let A be an array of n elements, A[0], A[1], A[2], ..., A[n-1], and let “data” be the element to be searched for. This algorithm finds the location “loc” of data in A, and sets loc = –1 if the search is unsuccessful.
1. Input an array A of n elements and the “data” to be searched for, and initialise loc = –1.
2. Initialise i = 0 and repeat step 3 while (i < n), incrementing i by one each time.
3. If (data = A[i])
(a) loc = i (b) GOTO step 4
4. If (loc >= 0)
(a) Display “data is found and searching is successful”
5. Else
(a) Display “data is not found and searching is unsuccessful”
6. Exit
printf("\nEnter the element to be searched : ");
scanf("%d", &item);   // Input the item to be searched
for (i = 0; i < n; i++)
{
    if (item == arr[i])
    {
        printf("\n%d found at position %d\n", item, i + 1);
        break;
    }
} /* End of for */
if (i == n)
    printf("\nItem %d not found in array\n", item);
Greedy Algorithms
The greedy method
• An optimization problem is one in which you
want to find, not just a solution, but the best
solution
– A “greedy algorithm” works well for many, but not all, optimization problems
• Greedy method suggests that one can devise an algorithm that works in phases:
– At each phase, take one input at a time (from the input ordered according to the selection criteria) and decide whether it can be included in an optimal solution.
Feasible vs. optimal solution
• Greedy method solves problem by making a
sequence of decisions.
• Decisions are made one by one in some order.
• Each decision is made using a greedy criterion.
• A decision, once made, is (usually) not changed later.
• Given n inputs we are required to obtain a subset that satisfies some constraints
–Any subset that satisfies the given constraints is called a feasible solution
–A feasible solution that either maximizes or minimizes a given objective function is called an optimal solution.
Greedy algorithm
To apply greedy algorithm
• Decide optimization measure (maximization of
profit or minimization of cost)
– Sort the input in increasing or decreasing order based
on the optimization measure selected for the given
problem
• Formulate a multistage solution
– Take one input at a time as per the selection criteria
• Select an input that is feasible and part of the
optimal solution
– from the ordered list pick one input at a time and
include it in the solution set if it fulfills the criteria
Greedy Choice Property
• Greedy algorithm always makes the choice
that looks best at the moment
– With the hope that a locally optimal choice will lead to a globally optimal solution
• Greedy choice property says that a globally optimal solution can be arrived at by making a locally optimal choice
– Locally optimal choice → globally optimal solution
The Problem of Making Coin Change
•Assume the coin denominations are: 25, 10, 5, and 1.
•Problem: Make a change of a given amount using
the smallest possible number of coins
•Example: make a change for x = 92.
–Mathematically this is written as
x = 25a + 10b + 5c + 1d
So that a + b + c + d is minimum & a, b, c, d ≥ 0.
•Greedy algorithm for coin changing
–Order coins in decreasing order
–Select coins one at a time (divide x by denomination)
–Solution: contains a = 3, b = 1, c = 1, d = 2.
Greedy Algorithm
procedure greedy (A, n)
        Solution ← { };    // set that will hold the solution set.
        FOR i = 1 to n DO
            x = SELECT (A)
            IF FEASIBLE (Solution, x) THEN
                Solution = UNION (Solution, x)
            end if
end FOR
        RETURN Solution
end procedure
• SELECT function: selects the most promising candidate from A[ ] and removes it from the list.
• FEASIBLE function: a Boolean valued function that determines whether x can be included into the solution vector.
• UNION function: combines x with the solution set built so far
Algorithm for Coin Change
• Make change for an amount A using the least possible number of coins.
Algorithm MAKE-CHANGE (C, n, A)
    // C ← {50, 25, 10, 5, 1}, in decreasing order; n is the number of denominations
    // A is the amount to be changed
    Sol ← {}       // Sol[i] will hold the number of coins of denomination C[i]
    rem = A
    i = 1
    WHILE rem > 0 and i ≤ n DO
        Sol[i] = rem / C[i]      // integer division
        rem = rem mod C[i]
        i = i + 1
    end while
    RETURN Sol
end algorithm
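A minimal Python sketch of the greedy change-making idea, checked against the x = 92 example from the earlier slide. The function name make_change and the dictionary return value are assumptions made for illustration.

```python
def make_change(coins, amount):
    """Greedy change-making: returns {denomination: number of coins used}."""
    solution = {}
    remaining = amount
    for coin in sorted(coins, reverse=True):      # largest denomination first
        count, remaining = divmod(remaining, coin)
        if count:
            solution[coin] = count
    if remaining:
        raise ValueError("amount cannot be formed with these coins")
    return solution


print(make_change([25, 10, 5, 1], 92))   # {25: 3, 10: 1, 5: 1, 1: 2}
```

The greedy choice is not optimal for every coin system. With the hypothetical denominations {4, 3, 1} and amount 6, this sketch returns one 4 and two 1s (three coins), although two 3s would do.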
Minimum Spanning Trees
Problem: Laying Telephone Wire
[Figure: customers to be connected by wire to a central office]
Minimize the total length of wire connecting the customers
Minimum Spanning Tree (MST)
• Assume you have an undirected graph G = (V,E) with
weights assigned to edges.
•The objective is “use smallest set of edges of the given
graph to connect everything together”. How?
•A minimum spanning tree is a least-cost subset of the edges of a graph that connects all the nodes
• MST is a sub-graph of an undirected weighted graph G, such that:
•It is a tree (i.e., it is acyclic)
•It covers all the vertices V
•It contains |V| - 1 edges
•The total cost associated with tree edges is the minimum among all possible spanning trees
Applications of MST
• Network design, road planning, etc.
How can we generate an MST?
• An MST is a least-cost subset of the edges of a graph that connects all the nodes
• A greedy method to obtain a minimum-cost spanning tree
builds this tree edge by edge.
– The next edge to include is chosen according to some
optimization criterion.
• Criteria: to choose an edge that results in a minimum
increase in the sum of the costs of the edges so far included.
• General procedure:
– Start by picking any node & adding it to the tree
– Repeatedly: Pick any least-cost edge from a node in the tree to a node not in the tree, & add the edge and new node to the tree
– Stop when all nodes have been added to the tree
• Two techniques: Prim and Kruskal algorithms
[Figure: example weighted graph]
Prim’s algorithm
•Example: find the minimum spanning tree using Prim’s algorithm
[Figure: two side-by-side drawings of the example weighted graph]
Prim’s Algorithm
procedure primMST(G, cost, n, T)
    // Pick vertex 1 to be the root of the spanning tree T
    mincost = 0
    for i = 2 to n do near(i) = 1
    near(1) = 0        // near(k) = 0 marks vertex k as already in the tree
    for i = 1 to n-1 do
        find j such that near(j) ≠ 0 and cost(j,near(j)) is min
        T(i,1) = j; T(i,2) = near(j)
        mincost = mincost + cost(j,near(j))
        near(j) = 0
        for k = 1 to n do
            if near(k) ≠ 0 and cost(k,near(k)) > cost(k,j) then
                near(k) = j
        end for
    end for
    return mincost
end procedure
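The near-array formulation of Prim can be sketched in Python as below. This is an illustration under stated assumptions: vertices are numbered from 0, the graph is given as a symmetric cost matrix with math.inf for missing edges, and the 4-vertex graph at the bottom is a hypothetical example, not the one from the slide figure.

```python
import math


def prim_mst(cost):
    """cost: n x n symmetric matrix, math.inf where there is no edge.
    Returns (tree_edges, mincost), growing the tree from vertex 0."""
    n = len(cost)
    near = [0] * n                 # near[k]: tree vertex currently closest to k
    in_tree = [False] * n
    in_tree[0] = True
    edges, mincost = [], 0
    for _ in range(n - 1):
        # choose the non-tree vertex with the cheapest edge into the tree
        j = min((k for k in range(n) if not in_tree[k]),
                key=lambda k: cost[k][near[k]])
        if math.isinf(cost[j][near[j]]):
            raise ValueError("graph is not connected")
        edges.append((near[j], j))
        mincost += cost[j][near[j]]
        in_tree[j] = True
        # j may now be the closest tree vertex for the remaining vertices
        for k in range(n):
            if not in_tree[k] and cost[k][j] < cost[k][near[k]]:
                near[k] = j
    return edges, mincost


INF = math.inf
# Hypothetical graph: edges 0-1 (1), 0-2 (4), 1-2 (2), 1-3 (6), 2-3 (3)
example = [
    [INF, 1, 4, INF],
    [1, INF, 2, 6],
    [4, 2, INF, 3],
    [INF, 6, 3, INF],
]
print(prim_mst(example))   # ([(0, 1), (1, 2), (2, 3)], 6)
```

Each of the n-1 rounds scans all vertices, so this version runs in O(n^2) time.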
Correctness of Prim’s
• If the algorithm is correct, it halts with the right answer, i.e., an optimal solution.
• An optimal solution is obtained if Prim’s algorithm adds n-1 edges (each of minimum cost) to the spanning tree without creating a cycle.
• Claim: Prim’s algorithm creates a minimum spanning tree of any connected graph.
• Proof (by contradiction). Let T be the spanning tree returned by the algorithm, and suppose there doesn’t exist any MST of G consistent with T. Consider an optimal MST O of G.
Kruskal’s algorithm
• Kruskal algorithm: Always tries the lowest-cost remaining edge
• It considers the edges of the graph in increasing order of cost.
• In this approach, the set T of edges so far selected for the
spanning tree may not be a tree at all stages in the algorithm.
But it is possible to complete T into a tree.
• Create a forest of trees from the vertices
• Repeatedly merge trees by adding “safe edges” until only
one tree remains. A “safe edge” is an edge of minimum
weight which does not create a cycle
• Example:
[Figure: weighted graph on the vertices a, b, c, d, e]
Initially there is a forest: {a}, {b}, {c}, {d}, {e}
E = {(a,d), (c,d), (d,e), (a,c), (b,e), (c,e), (b,d), (a,b)}
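Kruskal’s merging of trees in a forest can be sketched with a simple union-find structure. The edge set is the one listed in the example; the weights attached to the edges here (2, 4, 4, 5, 5, 5, 6, 9 in the listed order) are an assumption for illustration, since the figure did not survive extraction.

```python
def kruskal(vertices, edges):
    """edges: list of (weight, u, v). Returns (tree_edges, total_weight)."""
    parent = {v: v for v in vertices}      # each vertex starts as its own tree

    def find(v):
        # follow parent links to the root, shortening the path on the way
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree, total = [], 0
    for w, u, v in sorted(edges):          # increasing order of cost
        ru, rv = find(u), find(v)
        if ru != rv:                       # safe edge: joins two different trees
            parent[ru] = rv
            tree.append((u, v))
            total += w
    return tree, total


# Assumed weights for the edge list E of the example
E = [(2, "a", "d"), (4, "c", "d"), (4, "d", "e"), (5, "a", "c"),
     (5, "b", "e"), (5, "c", "e"), (6, "b", "d"), (9, "a", "b")]
print(kruskal("abcde", E))
# ([('a', 'd'), ('c', 'd'), ('d', 'e'), ('b', 'e')], 15)
```

Under these assumed weights, (a,c), (c,e), (b,d) and (a,b) are rejected because each would close a cycle.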
Dynamic programming
Divide & Conquer vs. Dynamic Programming
•Both techniques split their input into parts, find sub-solutions to the parts, and combine solutions to sub-problems.
•In divide and conquer, the sub-problems are independent: the solution to one sub-problem does not affect the solutions to other sub-problems of the same problem.
–In dynamic programming, sub-problems are not independent. Sub-problems may share sub-sub-problems
Greedy vs. Dynamic Programming
•Both are algorithm design techniques for optimization problems (minimizing or maximizing), and both build solutions from a collection of choices of individual elements.
–The greedy method computes its solution by making its
choices in a serial forward fashion, never looking back or
revising previous choices.
–Dynamic programming computes its solution
forward/backward by synthesizing them from smaller sub-
solutions, and by trying many possibilities and choices before
it arrives at the optimal set of choices.
•There is no a priori test by which one can tell if the
Greedy method will lead to an optimal solution.
–By contrast, there is a test for Dynamic Programming, called
The Principle of Optimality
The Principle of Optimality
•In DP an optimal sequence of decisions is obtained by making
explicit appeal to the principle of optimality.
•Definition: A problem is said to satisfy the Principle of
Optimality if the sub-solutions of an optimal solution of the
problem are themselves optimal solutions for their sub-
problems.
– In solving a problem, we make a sequence of decisions D1, D2,..., Dn. If this sequence is optimal, then for every k the remaining decisions Dk+1,..., Dn must also be optimal for the state that results from D1,..., Dk
•Examples: The shortest path problem satisfies the principle of
optimality.
– This is because if a, x1, x2,..., xn, b is a shortest path from node a to node b
in a graph, then the portion of xi to xj on that path is a shortest path from xi
to xj.
•DP reduces computation by
– Storing solution to a sub-problem the first time it is solved.
– Looking up the solution when sub-problem is encountered again.
– Solving sub-problems in a bottom-up or top-down fashion.
Dynamic programming (DP)
•DP is an algorithm design method that can be used
when the solution to a problem can be viewed as the
result of a sequence of decisions.
–Example: The solution to knapsack problem can be viewed
as the result of a sequence of decisions. We have to decide
the values of xi, 0 or 1. First we make a decision on x1, then
x2 and so on.
•For some problems, an optimal sequence of
decisions can be found by making the decisions one
at a time using greedy method.
•For other problems, it is not possible to make step-
wise decisions based on only local information.
–One way to solve such problems is to try all possible decision sequences. However, the time and space requirements are prohibitive.
–DP discards those decision sequences that cannot lead to an optimal solution.
Dynamic programming approaches
• To solve a problem by using dynamic programming:
–Find out the recurrence relations.
• Dynamic programming is a technique for efficiently computing
recurrences by storing partial results.
–Represent the problem by a multistage graph.
–In summary, if a problem can be described by a multistage graph, then it can be solved by dynamic programming
• Forward approach and backward approach:
–If the recurrence relations are formulated using the backward approach, then the relations are solved beginning with the last decision.
–If the recurrence relations are formulated using the forward approach, then the relations are solved starting from the beginning until we reach the final decision
Example: 0-1 knapsack problem
The shortest path in multistage graphs
• Find the shortest path in the multistage graph of the following example.
[Figure: multistage graph with weighted edges from source S through vertices A, B, C and then D, E, F to terminal T]
• The greedy method does not find the optimum in this case: it follows (S, A, D, T) with cost 1+4+18 = 23.
• The real shortest path is: (S, C, F, T) with cost 5+2+2 = 9.
The shortest path
• Given a multi-stage graph, how can I find a shortest path?
–Forward approach: Let p(i,j) denote a minimum cost path from vertex j in Vi to the terminal vertex T. Let COST(i,j) denote the cost of the path p(i,j). Then using the forward approach, we obtain:
COST(i,j) = min { c(j,k) + COST(i+1,k) } over all k in V(i+1) with (j,k) in E
–Backward approach: Let p(i,j) be a minimum cost path from vertex S to a vertex j in Vi. Let COST(i,j) be the cost of p(i,j).
COST(i,j) = min { COST(i-1,k) + c(k,j) } over all k in V(i-1) with (k,j) in E
NB. If (j, k) is not an element of E then c(j, k) = +inf.
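The idea behind the forward approach (the cost from a vertex to T is the cheapest edge cost plus the cost from the successor to T) can be sketched with memoised recursion. The edge costs below are an assumption: only the costs along (S, A, D, T) and (S, C, F, T) are stated in the slides, and the remaining ones are filled in for illustration. The names GRAPH and cost_to_t are hypothetical.

```python
import math
from functools import lru_cache

# Assumed edge costs; only S-A-D-T (1, 4, 18) and S-C-F-T (5, 2, 2) are given
GRAPH = {
    "S": {"A": 1, "B": 2, "C": 5},
    "A": {"D": 4, "E": 11},
    "B": {"D": 9, "E": 5, "F": 16},
    "C": {"F": 2},
    "D": {"T": 18},
    "E": {"T": 13},
    "F": {"T": 2},
    "T": {},
}


@lru_cache(maxsize=None)
def cost_to_t(v):
    """Return (minimum cost from v to T, one path achieving it)."""
    if v == "T":
        return 0, ("T",)
    best = (math.inf, ())
    for k, c in GRAPH[v].items():          # try every outgoing edge (v, k)
        sub_cost, sub_path = cost_to_t(k)  # stored after the first call
        if c + sub_cost < best[0]:
            best = (c + sub_cost, (v,) + sub_path)
    return best


print(cost_to_t("S"))   # (9, ('S', 'C', 'F', 'T'))
```

The cache stores each sub-problem the first time it is solved, so every vertex is evaluated once.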


Algorithm
procedure shortest_path (COST[], A[], n)
// COST(i,j) is the cost of edge (i,j); on return, A(i,j) is the cost of the shortest path from i to j
// COST(i,i) is 0.0
for i = 1 to n do
    for j = 1 to n do
        A(i, j) = COST(i, j) //copy cost into A
    end for
end for
for k = 1 to n do
    for i = 1 to n do
        for j = 1 to n do
            A(i, j) = min(A(i, j), A(i,k) + A(k,j));
        end for
    end for
end for
return A(1..n,1..n)
end shortest_path
This algorithm runs in time O(n^3)
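The triple loop above computes shortest-path costs between all pairs of vertices. A minimal Python sketch follows, assuming a cost matrix with 0 on the diagonal and math.inf for missing edges; the 3-vertex directed graph is a hypothetical example.

```python
import math


def all_pairs_shortest_paths(cost):
    """Return the matrix of shortest-path costs for every pair of vertices."""
    n = len(cost)
    a = [row[:] for row in cost]           # copy cost into a
    for k in range(n):                     # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if a[i][k] + a[k][j] < a[i][j]:
                    a[i][j] = a[i][k] + a[k][j]
    return a


INF = math.inf
# Hypothetical directed graph: 0->1 (4), 0->2 (11), 1->0 (6), 1->2 (2), 2->0 (3)
cost = [
    [0, 4, 11],
    [6, 0, 2],
    [3, INF, 0],
]
print(all_pairs_shortest_paths(cost))   # [[0, 4, 6], [5, 0, 2], [3, 7, 0]]
```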
String editing
• The problem: given two sequences of symbols, X = x1 x2 … xn and Y = y1 y2 … ym, transform X into Y using a sequence of three operations: Delete, Insert and Change. Each operation incurs a cost: D(xi) to delete xi, I(yj) to insert yj, and C(xi,yj) to change xi into yj.
• The objective of string editing is to identify a minimum cost sequence of edit operations that will transform X into Y.
Example: consider the sequences
X = {a a b a b} and Y = {b a b b}
Identify a minimum cost sequence of edit operations that transforms X into Y. Assume change costs 2 units, delete 1 unit and insert 1 unit.
(a) apply brute force approach
(b) apply dynamic programming
Dynamic programming
•The minimum cost of any edit sequence that transforms x1 x2 … xi into y1 y2 … yj (for i>0 and j>0) is the minimum of the three costs: delete, change, or insert operations.
•The following recurrence equation is used for COST(i,j).
COST(i,j) =
    0                        if i=0, j=0
    COST(i-1,0) + D(xi)      if i>0, j=0
    COST(0,j-1) + I(yj)      if i=0, j>0
    COST'(i,j)               if i>0, j>0
where COST'(i,j) = min { COST(i-1,j) + D(xi),
COST(i-1,j-1) + C(xi,yj), COST(i,j-1) + I(yj) }
and C(xi,yj) = 0 when xi = yj.

Filling the table takes O(nm) time.
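def lcs(s1, s2):
    # Longest common subsequence of s1 and s2; returns (length, subsequence).
    if not s1 or not s2:
        return 0, ""
    # matrix[i][j] holds one LCS of s1[:i+1] and s2[:j+1]
    matrix = [["" for _ in range(len(s2))] for _ in range(len(s1))]
    for i in range(len(s1)):
        for j in range(len(s2)):
            if s1[i] == s2[j]:
                if i == 0 or j == 0:
                    matrix[i][j] = s1[i]
                else:
                    matrix[i][j] = matrix[i-1][j-1] + s1[i]
            else:
                # outside the table the LCS is the empty string
                up = matrix[i-1][j] if i > 0 else ""
                left = matrix[i][j-1] if j > 0 else ""
                matrix[i][j] = max(up, left, key=len)

    cs = matrix[-1][-1]

    return len(cs), cs

str1 = input("enter first string")
str2 = input("enter second string")

print(lcs(str1, str2))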
Example
Transform the sequence X = {a a b a b} into Y = {b a b b} with a minimum cost sequence of edit operations using the dynamic programming approach. Assume that change costs 2 units, and delete and insert cost 1 unit each.

        j=0   1   2   3   4
i=0      0    1   2   3   4
i=1      1    2   1   2   3
i=2      2    3   2   3   4
i=3      3    2   3   2   3
i=4      4    3   2   3   4
i=5      5    4   3   2   3

The value 3 at (5,4) is the optimal solution.
By tracing back one can determine which operations lead to the optimal solution:
• Delete x1, Delete x2 and Insert y4, or
• Change x1 to y1 & Delete x4
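The COST(i,j) recurrence for string editing can be sketched in Python as below. The function name edit_distance and its keyword parameters are assumptions for illustration; the default costs are those of the example (delete 1, insert 1, change 2, and no cost when the symbols already match).

```python
def edit_distance(x, y, delete_cost=1, insert_cost=1, change_cost=2):
    """Return the full COST table; cost[len(x)][len(y)] is the optimum."""
    n, m = len(x), len(y)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):              # j = 0: delete all of x1..xi
        cost[i][0] = cost[i - 1][0] + delete_cost
    for j in range(1, m + 1):              # i = 0: insert all of y1..yj
        cost[0][j] = cost[0][j - 1] + insert_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            change = 0 if x[i - 1] == y[j - 1] else change_cost
            cost[i][j] = min(cost[i - 1][j] + delete_cost,
                             cost[i - 1][j - 1] + change,
                             cost[i][j - 1] + insert_cost)
    return cost


table = edit_distance("aabab", "babb")
print(table[5][4])   # 3
```

The two nested loops fill (n+1)(m+1) cells with constant work per cell.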
