UNIT 1 : FUNDAMENTALS OF ALGORITHM
Algorithms
Algorithm:-
An algorithm is a finite list of instructions, executed in sequence, to solve a computational problem.
An algorithm is a step-by-step procedure with a finite number of steps for solving a problem. You can write an algorithm in any language that is understandable to people (programmers).
In real life, an algorithm is like a recipe for cooking a dish.
Write an Algorithm in Natural Language:-
1. Start
2. Read array elements
3. Scan n elements in array A
4. Declare a sum variable
5. Assign zero value in sum variable
6. Add all array elements using for loop
7. Display the sum
8. stop
Write an Algorithm in Pseudo Code:-
1. start
2. Read array A of n elements
3. sum <-- 0
4. for i <-- 1 to n do
5.     sum <-- sum + A[i]
6. display sum
7. stop
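The same algorithm can be written directly in a programming language. Below is a minimal Python sketch of the pseudo code above (the function name sum_array is illustrative):

def sum_array(A):
    # sum <-- 0
    total = 0
    # for i <-- 1 to n do: sum <-- sum + A[i]
    for value in A:
        total = total + value
    # display sum
    return total

print(sum_array([2, 4, 6, 8]))  # prints 20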
Write an Algorithm in Flow Chart:-
Characteristics of Algorithms:-
The characteristics of any algorithm are given below.
1. Input:-An algorithm should have one or more inputs.
2. Output:-An algorithm must have at least one output.
3. Definiteness:- Every statement in an algorithm should be definite. It means every statement should be unambiguous: each statement must be clear, and there should not be more than one way to interpret a single statement in the given algorithm.
4. Finiteness:-An algorithm should have a finite number of steps(instructions) to solve the problems and
get a valid output.
5. Effectiveness:-An algorithm should be effective and produce a well-defined output for any program. Effectiveness means the algorithm uses a good method that produces the desired output with less time and less storage.
Difference between Algorithms and Programs:-
There are some differences between algorithms and programs, as given below:-
Data abstraction
Data abstraction is the programming process of creating a data type, usually a class, that hides the details of the data representation in order to make the data type easier to work with. Data abstraction involves creating a representation for data that separates the interface from the implementation, so a programmer or user only has to understand the interface (the commands to use) and not how the internal structure of the data is represented or implemented.
Stack
A Stack is a linear data structure that follows the LIFO (Last-In-First-Out) principle. A stack has only one open end, whereas a queue has two ends (front and rear). It contains only one pointer, the top pointer, which points to the topmost element of the stack. Whenever an element is added to the stack, it is added on the top of the stack, and an element can be deleted only from the top of the stack. In other words, a stack can be defined as a container in which insertion and deletion are done from one end, known as the top of the stack.
Some key points related to stack
o It is called a stack because it behaves like a real-world stack, such as a pile of books.
o A Stack is an abstract data type with a pre-defined capacity, which means that it can store only a limited number of elements.
o It is a data structure that follows a particular order for inserting and deleting elements, and that order can be LIFO or FILO.
Working of Stack
Stack works on the LIFO pattern. Suppose there are five memory blocks in the stack; therefore, the size of the stack is 5.
Suppose we want to store elements in the stack and the stack is currently empty. We take a stack of size 5 and push the elements one by one until the stack becomes full.
Once five elements have been pushed, the stack is full because its size is 5. From these pushes we can observe that the stack gets filled up from the bottom to the top.
When we perform the delete operation on the stack, there is only one way for entry and exit as the other
end is closed. It follows the LIFO pattern, which means that the value entered first will be removed last. In
the above case, the value 5 is entered first, so it will be removed only after the deletion of all the other
elements.
POP operation
o Before deleting an element from the stack, we check whether the stack is empty.
o If we try to delete an element from an empty stack, then the underflow condition occurs.
o If the stack is not empty, we first access the element which is pointed to by the top.
o Once the pop operation is performed, the top is decremented by 1, i.e., top = top - 1.
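A minimal Python sketch of these stack operations, using a list and a pre-defined capacity (the class name and method names are illustrative, not from the original text):

class Stack:
    def __init__(self, capacity):
        self.items = []           # storage for the elements
        self.capacity = capacity  # pre-defined capacity

    def push(self, value):
        # overflow check: the stack can hold only a limited number of elements
        if len(self.items) == self.capacity:
            raise OverflowError("stack overflow")
        self.items.append(value)  # the new element becomes the new top

    def pop(self):
        # underflow check: deleting from an empty stack is not allowed
        if not self.items:
            raise IndexError("stack underflow")
        return self.items.pop()   # remove and return the topmost element

s = Stack(5)
s.push(10); s.push(20); s.push(30)
print(s.pop())  # 30 - the last element pushed is the first one popped (LIFO)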
Queue
1. A queue can be defined as an ordered list which enables insert operations to be performed at one end, called REAR, and delete operations to be performed at the other end, called FRONT.
2. A queue is therefore referred to as a First In First Out (FIFO) list.
3. For example, people waiting in line for a rail ticket form a queue.
Applications of Queue
Queues perform actions on a first in first out basis, which is quite fair for ordering actions. There are various applications of queues, discussed below.
1. Queues are widely used as waiting lists for a single shared resource like a printer, disk, or CPU.
2. Queues are used in the asynchronous transfer of data (where data is not being transferred at the same rate between two processes), e.g., pipes, file IO, sockets.
3. Queues are used as buffers in most applications like MP3 media players, CD players, etc.
4. Queues are used to maintain the playlist in media players in order to add and remove songs from the playlist.
5. Queues are used in operating systems for handling interrupts.
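A small Python sketch of a queue with insertion at the REAR and deletion at the FRONT, based on collections.deque (the variable names are illustrative):

from collections import deque

queue = deque()        # empty queue
queue.append(10)       # insert (enqueue) at the REAR
queue.append(20)
queue.append(30)
print(queue.popleft()) # delete (dequeue) from the FRONT -> 10 (FIFO order)
print(queue.popleft()) # -> 20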
Complexity
Operation    Average case    Worst case
Access       θ(n)            O(n)
Search       θ(n)            O(n)
Insertion    θ(1)            O(1)
Deletion     θ(1)            O(1)
Space complexity (worst case): O(n)
Asymptotic Notations:
Asymptotic notation is a way of comparing functions that ignores constant factors and small input sizes. Three notations are used to describe the running time complexity of an algorithm:
1. Big-oh notation: Big-oh is the formal method of expressing the upper bound of an algorithm's running time. It is a measure of the longest amount of time the algorithm can take. The function f(n) = O(g(n)) [read as "f of n is big-oh of g of n"] if and only if there exist positive constants c and n0 such that
f(n) <= c * g(n) for all n >= n0
Hence, the function g(n) is an upper bound for the function f(n), as g(n) grows at least as fast as f(n).
2. Omega (Ω) notation: Omega expresses the lower bound of an algorithm's running time. The function f(n) = Ω(g(n)) if and only if there exist positive constants c and n0 such that
f(n) >= c * g(n) for all n >= n0
For Example:
f(n) = 8n^2 + 2n - 3 >= 8n^2 - 3
     = 7n^2 + (n^2 - 3) >= 7n^2 for all n >= 2
Thus, f(n) = Ω(n^2) with c = 7.
3. Theta (θ) notation: The function f(n) = θ(g(n)) [read as "f of n is theta of g of n"] if and only if there exist positive constants c1, c2 and n0 such that
c1 * g(n) <= f(n) <= c2 * g(n) for all n >= n0
The Theta notation is more precise than both the big-oh and Omega notations. The function f(n) = θ(g(n)) if g(n) is both an upper and a lower bound of f(n).
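As a quick illustration of these definitions, the small Python check below tests the Omega example numerically on a sample range: for f(n) = 8n^2 + 2n - 3, the constant c = 7 (with n0 = 2) gives a lower bound, and c = 9 gives an upper bound (the constants follow the derivation above; the helper name is illustrative):

def f(n):
    return 8 * n * n + 2 * n - 3

# lower bound: f(n) >= 7 * n^2 for all n >= 2, so f(n) = Omega(n^2)
assert all(f(n) >= 7 * n * n for n in range(2, 1000))
# upper bound: f(n) <= 9 * n^2 for all n >= 1, so f(n) = O(n^2)
assert all(f(n) <= 9 * n * n for n in range(1, 1000))
print("both bounds hold on the tested range, so f(n) = Theta(n^2)")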
What Is Time Complexity
Time complexity is defined in terms of how many basic operations it takes to run a given algorithm, as a function of the length of the input. Time complexity is not a measurement of the wall-clock time it takes to execute a particular algorithm, because factors such as the programming language, operating system, and processing power would then also have to be considered.
When an algorithm is run on a computer, it requires a certain amount of memory space. The amount of memory used by a program during its execution is represented by its space complexity. Because a program requires memory to store the input data and temporary values while running, the space complexity is the sum of the auxiliary space and the input space.
For linear search, the worst case occurs when the element to search for is not present in the array. When x is not present, the search() function compares it with all the elements of arr[] one by one. Therefore, the time complexity of the worst case of linear search is θ(n).
For the average case, we need to predict the distribution of cases. For the linear search problem, assume that all cases are uniformly distributed (including the case when x is not present). So we add the costs of all the cases and divide the sum by (n + 1).
The number of operations in the best case is constant. The best-case time complexity is therefore θ(1). Most of the time, we perform worst-case analysis to analyze algorithms. In the worst-case analysis, we guarantee an upper bound on the execution time of an algorithm, which is useful information.
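A small Python sketch that counts comparisons in a linear search makes the best-case / worst-case distinction concrete (the function name count_comparisons is illustrative):

def count_comparisons(arr, x):
    comparisons = 0
    for value in arr:
        comparisons += 1
        if value == x:
            break           # found: stop comparing
    return comparisons

arr = [7, 3, 9, 1, 5]
print(count_comparisons(arr, 7))   # 1 comparison  -> best case, Theta(1)
print(count_comparisons(arr, 42))  # 5 comparisons -> worst case, Theta(n)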
Divide and Conquer Introduction
Divide and Conquer is an algorithmic pattern. In this design approach, we take a problem on a large input, break the input into smaller pieces, solve the problem on each of the small pieces, and then merge the piecewise solutions into a global solution. This mechanism of solving the problem is called the Divide & Conquer strategy.
A Divide and Conquer algorithm solves a problem using the following three steps:
1. Divide: break the given problem into sub-problems of the same type.
2. Conquer: solve the sub-problems recursively.
3. Combine: merge the solutions of the sub-problems to obtain the solution of the original problem.
Examples: Specific computer algorithms based on the Divide & Conquer approach include merge sort, quick sort, and binary search.
Greedy Method
The greedy method is one of the strategies, like Divide and Conquer, used to solve problems. This method is used for solving optimization problems. An optimization problem is a problem that demands either a maximum or a minimum result. Let's understand this through some terms.
This technique is basically used to determine a feasible solution that may or may not be optimal. A feasible solution is a subset that satisfies the given criteria. The optimal solution is the solution which is the best and most favorable solution in the subset. If more than one solution satisfies the given criteria, all of those solutions are considered feasible, whereas the optimal solution is the best solution among all of them.
o To construct the solution in an optimal way, this algorithm creates two sets where one set contains
all the chosen items, and another set contains the rejected items.
o A Greedy algorithm makes good local choices in the hope that the solution should be either feasible
or optimal.
o Candidate set: A solution that is created from the set is known as a candidate set.
o Selection function: This function is used to choose the candidate or subset which can be added in
the solution.
o Feasibility function: A function that is used to determine whether the candidate or subset can be
used to contribute to the solution or not.
o Objective function: A function is used to assign the value to the solution or the partial solution.
o Solution function: This function is used to indicate whether a complete solution has been reached or not.
o Suppose we have to travel from a source to a destination at the minimum cost, and we have three feasible paths with costs 10, 20, and 5. Since 5 is the minimum cost path, it is the optimal solution. This is a local optimum, and in this way we find the local optimum at each stage in order to build up the global optimal solution.
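The pieces listed above can be put together into a generic greedy loop. The Python sketch below is only an illustration of that structure, applied to the path example (all names such as greedy, feasible, and select are illustrative, not from the original text):

def greedy(candidates, feasible, select, objective, is_complete):
    # Generic greedy skeleton: repeatedly pick the best feasible candidate.
    solution = []
    candidates = list(candidates)
    while candidates and not is_complete(solution):
        best = select(candidates)            # selection function: local best choice
        candidates.remove(best)
        if feasible(solution, best):         # feasibility function
            solution.append(best)            # chosen set
        # otherwise the candidate is rejected
    return solution, objective(solution)     # objective function: value of the solution

# Toy use: pick the cheapest of the three path costs 10, 20 and 5.
paths = [10, 20, 5]
chosen, cost = greedy(
    candidates=paths,
    feasible=lambda sol, c: len(sol) == 0,   # we only need one path
    select=min,                              # greedy choice: minimum cost
    objective=sum,
    is_complete=lambda sol: len(sol) == 1,
)
print(chosen, cost)   # [5] 5 -> the optimal (minimum cost) path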
UNIT 2 : SORTING
Bubble sort Algorithm
In this article, we will discuss the Bubble sort Algorithm. The working procedure of bubble sort is simplest.
This article will be very helpful and interesting to students as they might face bubble sort as a question in
their examinations. So, it is important to discuss the topic.
Bubble sort works on the repeatedly swapping of adjacent elements until they are not in the intended order.
It is called bubble sort because the movement of array elements is just like the movement of air bubbles in
the water. Bubbles in water rise up to the surface; similarly, the array elements in bubble sort move to the
end in each iteration.
Algorithm
In the algorithm given below, suppose arr is an array of n elements. The assumed swap function in the algorithm will swap the values of the given array elements.
1. begin BubbleSort(arr)
2. repeat (n - 1) times
3.    for each pair of adjacent array elements arr[i], arr[i+1]
4.       if arr[i] > arr[i+1]
5.          swap(arr[i], arr[i+1])
6.       end if
7.    end for
8. end repeat
9. return arr
10. end BubbleSort
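A runnable Python version of the pseudo code above might look like this (a sketch; the function name bubble_sort is illustrative):

def bubble_sort(arr):
    n = len(arr)
    for _ in range(n - 1):          # repeat (n - 1) passes
        for i in range(n - 1):      # compare each pair of adjacent elements
            if arr[i] > arr[i + 1]:
                arr[i], arr[i + 1] = arr[i + 1], arr[i]   # swap
    return arr

print(bubble_sort([13, 32, 26, 35, 10]))  # [10, 13, 26, 32, 35]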
Working of Bubble sort Algorithm
To understand the working of the bubble sort algorithm, let's take an unsorted array. We are taking a short array, as we know the complexity of bubble sort is O(n^2).
First Pass
Sorting will start from the initial two elements. Let's compare them to check which is greater.
Here, 32 is greater than 13 (32 > 13), so this pair is already in order. Now, compare 32 with 26.
Here, 26 is smaller than 32. So, swapping is required. After swapping, the new array will look like -
Here, 35 is greater than 32. So, there is no swapping required as they are already sorted.
Here, 10 is smaller than 35, so they are not in order and swapping is required. Now, we reach the end of the array. After the first pass, the array will be -
Second Pass
Here, 10 is smaller than 32. So, swapping is required. After swapping, the array will be -
Third Pass
Here, 10 is smaller than 26. So, swapping is required. After swapping, the array will be -
Fourth Pass
Now, let's see the time complexity of bubble sort in the best case, average case, and worst case. We will also
see the space complexity of bubble sort.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted.
The best-case time complexity of bubble sort is O(n).
o Average Case Complexity - It occurs when the array elements are in jumbled order, neither properly ascending nor properly descending. The average case time complexity of bubble sort is O(n^2).
o Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse order. That means suppose you have to sort the array elements in ascending order, but its elements are in descending order. The worst-case time complexity of bubble sort is O(n^2).
2. Space Complexity
Space Complexity: O(1)
Stable: Yes
o The space complexity of bubble sort is O(1). It is because, in bubble sort, only one extra variable is required for swapping.
o The optimized bubble sort needs one more extra variable (the swapped flag), which is still a constant amount of memory, so its space complexity is also O(1).
In the bubble sort algorithm, comparisons are made even when the array is already sorted. Because of that, the execution time increases.
To solve this, we can use an extra variable swapped. It is set to true if a swap is performed during a pass; otherwise, it remains false.
UNIT 2 : SORTING
It will be helpful because, if after an iteration there is no swapping required, the value of the variable swapped will be false. It means that the elements are already sorted, and no further iterations are required.
This method reduces the execution time and optimizes the bubble sort.
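A sketch of the optimized bubble sort in Python, using the swapped flag described above (the function name is illustrative):

def optimized_bubble_sort(arr):
    n = len(arr)
    for _ in range(n - 1):
        swapped = False                 # no swap has happened in this pass yet
        for i in range(n - 1):
            if arr[i] > arr[i + 1]:
                arr[i], arr[i + 1] = arr[i + 1], arr[i]
                swapped = True          # a swap was required
        if not swapped:                 # pass finished without swaps: already sorted
            break
    return arr

print(optimized_bubble_sort([1, 2, 3, 4, 5]))  # stops after a single pass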
Selection Sort Algorithm
In this article, we will discuss the Selection sort Algorithm. The working procedure of selection sort is also simple. This article will be very helpful and interesting to students as they might face selection sort as a question in their examinations. So, it is important to discuss the topic.
In selection sort, the smallest element is selected from the unsorted array and placed at the first position. After that, the second smallest element is selected and placed at the second position. The process continues until the array is entirely sorted.
Algorithm
1. SELECTION SORT(arr, n)
2.
3. Step 1: Repeat Steps 2 and 3 for i = 0 to n-1
4. Step 2: CALL SMALLEST(arr, i, n, pos)
5. Step 3: SWAP arr[i] with arr[pos]
6. [END OF LOOP]
7. Step 4: EXIT
8.
9. SMALLEST (arr, i, n, pos)
10. Step 1: [INITIALIZE] SET SMALL = arr[i]
11. Step 2: [INITIALIZE] SET pos = i
12. Step 3: Repeat for j = i+1 to n
13. if (SMALL > arr[j])
14. SET SMALL = arr[j]
15. SET pos = j
16. [END OF if]
17. [END OF LOOP]
18. Step 4: RETURN pos
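A Python sketch that mirrors the SELECTION SORT / SMALLEST pseudo code above (function names follow the pseudo code but are otherwise illustrative):

def smallest(arr, i):
    # find the position of the smallest element in arr[i:]
    pos = i
    for j in range(i + 1, len(arr)):
        if arr[j] < arr[pos]:
            pos = j
    return pos

def selection_sort(arr):
    for i in range(len(arr) - 1):
        pos = smallest(arr, i)                 # position of the minimum of the unsorted part
        arr[i], arr[pos] = arr[pos], arr[i]    # put it at position i
    return arr

print(selection_sort([12, 29, 25, 8]))  # [8, 12, 25, 29]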
Working of Selection sort Algorithm
To understand the working of the Selection sort algorithm, let's take an unsorted array. It will be easier to
understand the Selection sort via an example.
Now, for the first position in the sorted array, the entire array is to be scanned sequentially.
At present, 12 is stored at the first position. After searching the entire array, it is found that 8 is the smallest value.
So, swap 12 with 8. After the first iteration, 8 will appear at the first position in the sorted array.
For the second position, where 29 is stored presently, we again sequentially scan the rest of the items of the unsorted array. After scanning, we find that 12 is the second lowest element in the array, and it should appear at the second position.
Now, swap 29 with 12. After the second iteration, 12 will appear at the second position in the sorted array.
So, after two iterations, the two smallest values are placed at the beginning in a sorted way.
The same process is applied to the rest of the array elements. Now, we are showing a pictorial representation
of the entire sorting process.
Now, let's see the time complexity of selection sort in best case, average case, and in worst case. We will
also see the space complexity of the selection sort.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted. The best-case time complexity of selection sort is O(n^2).
o Average Case Complexity - It occurs when the array elements are in jumbled order, neither properly ascending nor properly descending. The average case time complexity of selection sort is O(n^2).
o Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse order. That means suppose you have to sort the array elements in ascending order, but its elements are in descending order. The worst-case time complexity of selection sort is O(n^2).
2. Space Complexity
Space Complexity: O(1)
Stable: No (in its usual implementation)
o The space complexity of selection sort is O(1). It is because, in selection sort, only one extra variable is required for swapping.
Insertion Sort Algorithm
In this article, we will discuss the Insertion sort Algorithm. The working procedure of insertion sort is also simple. This article will be very helpful and interesting to students as they might face insertion sort as a question in their examinations. So, it is important to discuss the topic.
Insertion sort works similarly to the sorting of playing cards in hand. It is assumed that the first card is already sorted in the card game, and then we select an unsorted card. If the selected unsorted card is greater than the first card, it will be placed at the right side; otherwise, it will be placed at the left side. Similarly, all unsorted cards are taken and put in their exact place.
o Simple implementation
o Efficient for small data sets
o Adaptive, i.e., it is appropriate for data sets that are already substantially sorted.
Algorithm
The simple steps of achieving the insertion sort are listed as follows -
Step 1 - If the element is the first element, assume that it is already sorted. Return 1.
Step 2 - Pick the next element, and store it separately in a key.
Step 3 - Now, compare the key with all elements in the sorted array.
Step 4 - If the element in the sorted array is smaller than the current element (the key), then move to the next element. Else, shift the greater elements in the array towards the right.
Step 5 - Insert the key at the position found.
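A minimal Python sketch of these steps (the function name insertion_sort is illustrative):

def insertion_sort(arr):
    for i in range(1, len(arr)):        # the first element is assumed sorted
        key = arr[i]                    # pick the next element and store it in key
        j = i - 1
        # shift greater elements of the sorted part one position to the right
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key                # insert the key at the position found
    return arr

print(insertion_sort([12, 31, 25, 8, 32, 17]))  # [8, 12, 17, 25, 31, 32]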
To understand the working of the insertion sort algorithm, let's take an unsorted array. It will be easier to
understand the insertion sort via an example.
Here, 31 is greater than 12. That means both elements are already in ascending order. So, for now, 12 is
stored in a sorted sub-array.
Here, 25 is smaller than 31. So, 31 is not at the correct position. Now, swap 31 with 25. Along with the swap, insertion sort will also check the key against all elements in the sorted sub-array.
For now, the sorted sub-array has only one element, i.e. 12. So, 25 is greater than 12. Hence, the sorted sub-array remains sorted after swapping.
Now, two elements in the sorted array are 12 and 25. Move forward to the next elements that are 31 and 8.
Now, the sorted array has three items that are 8, 12 and 25. Move to the next items that are 31 and 32.
Hence, they are already sorted. Now, the sorted array includes 8, 12, 25 and 31.
Now, let's see the time complexity of insertion sort in best case, average case, and in worst case. We will
also see the space complexity of insertion sort.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted. The best-case time complexity of insertion sort is O(n).
o Average Case Complexity - It occurs when the array elements are in jumbled order, neither properly ascending nor properly descending. The average case time complexity of insertion sort is O(n^2).
o Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse order. That means suppose you have to sort the array elements in ascending order, but its elements are in descending order. The worst-case time complexity of insertion sort is O(n^2).
2. Space Complexity
Space Complexity: O(1)
Stable: Yes
o The space complexity of insertion sort is O(1). It is because, in insertion sort, only one extra variable (the key) is required.
Shell Sort Algorithm
In this article, we will discuss the shell sort algorithm. Shell sort is a generalization of insertion sort, which overcomes the drawbacks of insertion sort by comparing elements separated by a gap of several positions. It is a sorting algorithm that is an extended version of insertion sort. Shell sort improves the average time complexity of insertion sort. Similar to insertion sort, it is a comparison-based and in-place sorting algorithm. Shell sort is efficient for medium-sized data sets.
One commonly used gap (interval) sequence is given by the recurrence
h = h * 3 + 1
where 'h' is the interval, with initial value 1.
Algorithm
The simple steps of achieving the shell sort are listed as follows -
Step 1 - Choose an initial gap (for example, n/2).
Step 2 - Divide the list into sub-lists of elements that are the chosen gap apart, and sort each sub-list using insertion sort.
Step 3 - Reduce the gap and repeat Step 2 until the gap becomes 1, at which point a final insertion sort pass leaves the list sorted.
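A Python sketch of shell sort using the N/2, N/4, ..., 1 interval sequence described below (the function name shell_sort is illustrative):

def shell_sort(arr):
    n = len(arr)
    gap = n // 2                    # start with the interval n/2
    while gap > 0:
        # gapped insertion sort: sort elements that are 'gap' positions apart
        for i in range(gap, n):
            key = arr[i]
            j = i
            while j >= gap and arr[j - gap] > key:
                arr[j] = arr[j - gap]
                j -= gap
            arr[j] = key
        gap //= 2                   # reduce the interval: n/2, n/4, ..., 1
    return arr

print(shell_sort([33, 31, 40, 8, 12, 17, 25, 42]))  # [8, 12, 17, 25, 31, 33, 40, 42]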
Working of Shell sort Algorithm
To understand the working of the shell sort algorithm, let's take an unsorted array. It will be easier to understand shell sort via an example.
We will use the original sequence of shell sort, i.e., N/2, N/4, ..., 1, as the intervals.
In the first loop, n is equal to 8 (the size of the array), so the elements are lying at an interval of 4 (n/2 = 4). Elements will be compared and swapped if they are not in order.
Here, in the first loop, the element at the 0th position will be compared with the element at 4th position. If
the 0th element is greater, it will be swapped with the element at 4th position. Otherwise, it remains the
same. This process will continue for the remaining elements.
At the interval of 4, the sublists are {33, 12}, {31, 17}, {40, 25}, {8, 42}.
Now, we have to compare the values in every sub-list. After comparing, we have to swap them if required in
the original array. After comparing and swapping, the updated array will look as follows -
In the second loop, elements are lying at the interval of 2 (n/4 = 2), where n = 8.
Now, we are taking the interval of 2 to sort the rest of the array. With an interval of 2, two sublists will be
generated - {12, 25, 33, 40}, and {17, 8, 31, 42}.
Now, we again have to compare the values in every sub-list. After comparing, we have to swap them if
required in the original array. After comparing and swapping, the updated array will look as follows -
In the third loop, elements are lying at the interval of 1 (n/8 = 1), where n = 8. At last, we use the interval of
value 1 to sort the rest of the array elements. In this step, shell sort uses insertion sort to sort the array
elements.
Shell sort complexity
Now, let's see the time complexity of shell sort in the best case, average case, and worst case. We will also see the space complexity of shell sort.
1. Time Complexity
With Shell's original gap sequence (n/2, n/4, ..., 1), the worst-case time complexity of shell sort is O(n^2); better gap sequences improve this.
2. Space Complexity
Space Complexity: O(1)
Stable: No
Merge Sort Algorithm
In this article, we will discuss the merge sort Algorithm. Merge sort is a sorting technique that follows the divide and conquer approach. This article will be very helpful and interesting to students as they might face merge sort as a question in their examinations. In coding or technical interviews for software engineers, sorting algorithms are widely asked. So, it is important to discuss the topic.
Merge sort is similar to the quick sort algorithm as it uses the divide and conquer approach to sort the elements. It is one of the most popular and efficient sorting algorithms. It divides the given list into two equal halves, calls itself for the two halves, and then merges the two sorted halves. We have to define the merge() function to perform the merging.
Algorithm
In the following algorithm, arr is the given array, beg is the index of the first element, and end is the index of the last element of the array.
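A Python sketch of merge sort following this description, with beg and end as the indices of the first and last elements (the names merge_sort and merge are illustrative):

def merge_sort(arr, beg, end):
    if beg < end:
        mid = (beg + end) // 2
        merge_sort(arr, beg, mid)        # sort the left half
        merge_sort(arr, mid + 1, end)    # sort the right half
        merge(arr, beg, mid, end)        # merge the two sorted halves

def merge(arr, beg, mid, end):
    left = arr[beg:mid + 1]
    right = arr[mid + 1:end + 1]
    i = j = 0
    k = beg
    # repeatedly copy the smaller front element of the two halves back into arr
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            arr[k] = left[i]; i += 1
        else:
            arr[k] = right[j]; j += 1
        k += 1
    while i < len(left):                 # copy any remaining left elements
        arr[k] = left[i]; i += 1; k += 1
    while j < len(right):                # copy any remaining right elements
        arr[k] = right[j]; j += 1; k += 1

a = [12, 31, 25, 8, 32, 17, 40, 42]
merge_sort(a, 0, len(a) - 1)
print(a)  # [8, 12, 17, 25, 31, 32, 40, 42]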
To understand the working of the merge sort algorithm, let's take an unsorted array. It will be easier to
understand the merge sort via an example.
According to the merge sort, first divide the given array into two equal halves. Merge sort keeps dividing the
list into equal parts until it cannot be further divided.
As there are eight elements in the given array, so it is divided into two arrays of size 4.
Now, again divide these two arrays into halves. As they are of size 4, so divide them into new arrays of size
2.
Now, again divide these arrays to get the atomic value that cannot be further divided.
In combining, first compare the element of each array and then combine them into another array in sorted
order.
So, first compare 12 and 31, both are in sorted positions. Then compare 25 and 8, and in the list of two
values, put 8 first followed by 25. Then compare 32 and 17, sort them and put 17 first followed by 32. After
that, compare 40 and 42, and place them sequentially.
In the next iteration of combining, we compare the arrays containing two data values each and merge them into arrays of four values in sorted order.
Now, there is a final merging of the arrays. After the final merging of the above arrays, the array will look like -
Now, let's see the time complexity of merge sort in best case, average case, and in worst case. We will also
see the space complexity of the merge sort.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted.
The best-case time complexity of merge sort is O(n*logn).
o Average Case Complexity - It occurs when the array elements are in jumbled order that is not
properly ascending and not properly descending. The average case time complexity of merge sort
is O(n*logn).
o Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse
order. That means suppose you have to sort the array elements in ascending order, but its elements
are in descending order. The worst-case time complexity of merge sort is O(n*logn).
2. Space Complexity
Space Complexity: O(n)
Stable: Yes
o The space complexity of merge sort is O(n). It is because merge sort requires an additional temporary array in which to merge the sorted halves.
Quicksort Algorithm
In this article, we will discuss the Quicksort Algorithm. The working procedure of Quicksort is also simple. This article will be very helpful and interesting to students as they might face quicksort as a question in their examinations. So, it is important to discuss the topic.
Quicksort follows the divide and conquer approach:
Divide: In Divide, first pick a pivot element. After that, partition or rearrange the array into two sub-arrays
such that each element in the left sub-array is less than or equal to the pivot element and each element in
the right sub-array is larger than the pivot element.
Quicksort picks an element as pivot, and then it partitions the given array around the picked pivot element.
In quick sort, a large array is divided into two arrays in which one holds values that are smaller than the
specified value (Pivot), and another array holds the values that are greater than the pivot.
After that, the left and right sub-arrays are also partitioned using the same approach. This continues until only a single element remains in each sub-array.
Choosing the pivot
Picking a good pivot is necessary for a fast implementation of quicksort. However, it is tricky to determine a good pivot. Some of the ways of choosing a pivot are as follows -
o The pivot can be random, i.e. select a random element of the given array as the pivot.
o The pivot can be either the rightmost element or the leftmost element of the given array.
o Select the median as the pivot element.
Algorithm
The quicksort and partition routines are sketched below.
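A hedged Python sketch of quicksort with a partition routine that takes the leftmost element as the pivot, matching the walkthrough below (function names are illustrative):

def partition(arr, low, high):
    pivot = arr[low]                 # leftmost element as pivot
    i = low + 1
    for j in range(low + 1, high + 1):
        if arr[j] < pivot:           # move elements smaller than the pivot to the left part
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    arr[low], arr[i - 1] = arr[i - 1], arr[low]   # place the pivot at its final position
    return i - 1

def quick_sort(arr, low, high):
    if low < high:
        p = partition(arr, low, high)   # pivot index after partitioning
        quick_sort(arr, low, p - 1)     # sort the left sub-array
        quick_sort(arr, p + 1, high)    # sort the right sub-array

a = [24, 9, 29, 14, 19, 27]
quick_sort(a, 0, len(a) - 1)
print(a)  # [9, 14, 19, 24, 27, 29]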
Working of Quicksort Algorithm
To understand the working of quick sort, let's take an unsorted array. It will make the concept clearer and more understandable.
In the given array, we consider the leftmost element as the pivot. So, in this case, a[left] = 24, a[right] = 27 and a[pivot] = 24.
Since the pivot is at the left, the algorithm starts from the right and moves towards the left.
Now, a[pivot] < a[right], so the algorithm moves forward one position towards the left, i.e. -
Because a[pivot] > a[right], the algorithm will swap a[pivot] with a[right], and the pivot moves to the right, as -
Now, a[left] = 19, a[right] = 24, and a[pivot] = 24. Since, pivot is at right, so algorithm starts from left and
moves to right.
Now, a[left] = 9, a[right] = 24, and a[pivot] = 24. As a[pivot] > a[left], so algorithm moves one position to right
as -
Now, a[left] = 29, a[right] = 24, and a[pivot] = 24. As a[pivot] < a[left], so, swap a[pivot] and a[left], now pivot
is at left, i.e. -
Since, pivot is at left, so algorithm starts from right, and move to left. Now, a[left] = 24, a[right] = 29, and
a[pivot] = 24. As a[pivot] < a[right], so algorithm moves one position to left, as -
Now, a[pivot] = 24, a[left] = 24, and a[right] = 14. As a[pivot] > a[right], so, swap a[pivot] and a[right], now
pivot is at right, i.e. -
Now, a[pivot] = 24, a[left] = 14, and a[right] = 24. Pivot is at right, so the algorithm starts from left and move
to right.
Now, a[pivot] = 24, a[left] = 24, and a[right] = 24. So, pivot, left and right are pointing the same element. It
represents the termination of procedure.
Element 24, which is the pivot element, is placed at its exact position.
Elements on the right side of element 24 are greater than it, and the elements on the left side of element 24 are smaller than it.
Now, in a similar manner, quick sort algorithm is separately applied to the left and right sub-arrays. After
sorting gets done, the array will be -
Quicksort complexity
Now, let's see the time complexity of quicksort in best case, average case, and in worst case. We will also
see the space complexity of quicksort.
1. Time Complexity
o Best Case Complexity - The best case occurs when the pivot splits the array into two nearly equal halves at every step. The best-case time complexity of quicksort is O(n*logn).
o Average Case Complexity - The average case time complexity of quicksort is O(n*logn).
o Worst Case Complexity - The worst case occurs when the chosen pivot is always the smallest or largest element, for example when the array is already sorted and the leftmost element is taken as the pivot. The worst-case time complexity of quicksort is O(n^2).
Though the worst-case complexity of quicksort is higher than that of other sorting algorithms such as Merge sort and Heap sort, it is still faster in practice. The worst case rarely occurs in quick sort because, by changing the choice of pivot, it can be implemented in different ways. The worst case in quicksort can be avoided by choosing the right pivot element.
2. Space Complexity
Space Complexity: O(log n) on average for the recursion stack (O(n) in the worst case)
Stable: No
Heap Sort Algorithm
In this article, we will discuss the Heapsort Algorithm. Heap sort processes the elements by creating a min-heap or max-heap using the elements of the given array. A min-heap or max-heap represents the ordering of the array in which the root element holds the minimum or maximum element of the array.
Before knowing more about heap sort, let's first see a brief description of a heap: a heap is a complete binary tree in which every parent node is ordered with respect to its children; in a max-heap each parent is greater than or equal to its children, and in a min-heap each parent is smaller than or equal to its children.
Algorithm
1. HeapSort(arr)
2. BuildMaxHeap(arr)
3. for i = length(arr) to 2
4. swap arr[1] with arr[i]
5. heap_size[arr] = heap_size[arr] - 1
6. MaxHeapify(arr,1)
7. End
BuildMaxHeap(arr)
1. BuildMaxHeap(arr)
2. heap_size(arr) = length(arr)
3. for i = length(arr)/2 to 1
4. MaxHeapify(arr,i)
5. End
MaxHeapify(arr,i)
1. MaxHeapify(arr,i)
2. L = left(i)
3. R = right(i)
4. if L <= heap_size[arr] and arr[L] > arr[i]
5. largest = L
6. else
7. largest = i
8. if R <= heap_size[arr] and arr[R] > arr[largest]
9. largest = R
10. if largest != i
11. swap arr[i] with arr[largest]
12. MaxHeapify(arr,largest)
13. End
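The same logic in runnable Python, written with 0-based indexing instead of the 1-based pseudo code above (a sketch; function names are illustrative):

def max_heapify(arr, heap_size, i):
    left, right = 2 * i + 1, 2 * i + 2        # children of node i (0-based)
    largest = i
    if left < heap_size and arr[left] > arr[largest]:
        largest = left
    if right < heap_size and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        max_heapify(arr, heap_size, largest)  # fix the affected subtree

def build_max_heap(arr):
    for i in range(len(arr) // 2 - 1, -1, -1):
        max_heapify(arr, len(arr), i)

def heap_sort(arr):
    build_max_heap(arr)
    for end in range(len(arr) - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]   # move the current maximum to the end
        max_heapify(arr, end, 0)              # restore the heap on the remaining part
    return arr

print(heap_sort([89, 81, 76, 22, 14, 9, 54, 11]))  # [9, 11, 14, 22, 54, 76, 81, 89]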
In heap sort, basically, there are two phases involved in the sorting of elements. Using the heap sort algorithm, they are as follows -
o The first step includes the creation of a heap by adjusting the elements of the array.
o After the creation of the heap, repeatedly remove the root element of the heap by shifting it to the end of the array, and then restore the heap property among the remaining elements.
Now let's see the working of heap sort in detail by using an example. To understand it more clearly, let's take
an unsorted array and try to sort it using heap sort. It will make the explanation clearer and easier.
First, we have to construct a heap from the given array and convert it into max heap.
After converting the given heap into max heap, the array elements are -
Next, we have to delete the root element (89) from the max heap. To delete this node, we have to swap it
with the last node, i.e. (11). After deleting the root element, we again have to heapify it to convert it into
max heap.
After swapping the array element 89 with 11, and converting the heap into max-heap, the elements of array
are -
In the next step, again, we have to delete the root element (81) from the max heap. To delete this node, we
have to swap it with the last node, i.e. (54). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 81 with 54 and converting the heap into max-heap, the elements of array
are -
In the next step, we have to delete the root element (76) from the max heap again. To delete this node, we
have to swap it with the last node, i.e. (9). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 76 with 9 and converting the heap into max-heap, the elements of array
are -
In the next step, again we have to delete the root element (54) from the max heap. To delete this node, we
have to swap it with the last node, i.e. (14). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 54 with 14 and converting the heap into max-heap, the elements of array
are -
In the next step, again we have to delete the root element (22) from the max heap. To delete this node, we
have to swap it with the last node, i.e. (11). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 22 with 11 and converting the heap into max-heap, the elements of array
are -
In the next step, again we have to delete the root element (14) from the max heap. To delete this node, we
have to swap it with the last node, i.e. (9). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 14 with 9 and converting the heap into max-heap, the elements of array
are -
In the next step, again we have to delete the root element (11) from the max heap. To delete this node, we
have to swap it with the last node, i.e. (9). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 11 with 9, the elements of array are -
Now, heap has only one element left. After deleting it, heap will be empty.
Now, let's see the time complexity of Heap sort in the best case, average case, and worst case. We will also
see the space complexity of Heapsort.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted.
The best-case time complexity of heap sort is O(n logn).
o Average Case Complexity - It occurs when the array elements are in jumbled order that is not
properly ascending and not properly descending. The average case time complexity of heap sort
is O(n log n).
o Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse
order. That means suppose you have to sort the array elements in ascending order, but its elements
are in descending order. The worst-case time complexity of heap sort is O(n log n).
The time complexity of heap sort is O(n logn) in all three cases (best case, average case, and worst case). The
height of a complete binary tree having n elements is logn.
2. Space Complexity
Space Complexity: O(1)
Stable: No
Counting Sort Algorithm
In this article, we will discuss the counting sort Algorithm. Counting sort is a sorting technique that is based on keys lying within a specific range. In coding or technical interviews for software engineers, sorting algorithms are widely asked. So, it is important to discuss the topic.
Algorithm
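A Python sketch of counting sort following the steps described below (the function name counting_sort and the sample array are illustrative):

def counting_sort(arr):
    max_val = max(arr)                  # step 1: find the maximum element
    count = [0] * (max_val + 1)         # step 2: count array of length max + 1, all zeros
    for value in arr:                   # step 3: count each array element
        count[value] += 1
    for i in range(1, max_val + 1):     # step 4: cumulative sum of the counts
        count[i] += count[i - 1]
    output = [0] * len(arr)
    for value in reversed(arr):         # place each element at its correct index
        count[value] -= 1               # decrease its count after placing it
        output[count[value]] = value
    return output

print(counting_sort([1, 4, 2, 4, 0, 1]))  # [0, 1, 1, 2, 4, 4]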
To understand the working of the counting sort algorithm, let's take an unsorted array. It will be easier to
understand the counting sort via an example.
1. Find the maximum element from the given array. Let max be the maximum element.
2. Now, initialize array of length max + 1 having all 0 elements. This array will be used to store the count of
the elements in the given array.
3. Now, we have to store the count of each array element at its corresponding index in the count array. The count of an element is stored as follows - suppose the array element '4' appears two times, so the count of element 4 is 2. Hence, 2 is stored at the 4th position of the count array. If any element is not present in the array, place 0, i.e. suppose element '3' is not present in the array, so 0 will be stored at the 3rd position.
Now, store the cumulative sum of count array elements. It will help to place the elements at the correct
index of the sorted array.
After placing element at its place, decrease its count by one. Before placing element 2, its count was 2, but
after placing it at its correct position, the new count for element 2 is 1.
Now, let's see the time complexity of counting sort in best case, average case, and in worst case. We will
also see the space complexity of the counting sort.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted.
The best-case time complexity of counting sort is O(n + k).
o Average Case Complexity - It occurs when the array elements are in jumbled order that is not
properly ascending and not properly descending. The average case time complexity of counting sort
is O(n + k).
o Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse
order. That means suppose you have to sort the array elements in ascending order, but its elements
are in descending order. The worst-case time complexity of counting sort is O(n + k).
In all above cases, the time complexity of counting sort is same. This is because the algorithm goes
through n+k times, regardless of how the elements are placed in the array.
Counting sort is better than comparison-based sorting techniques because there is no comparison between elements in counting sort. But when the integers are very large, counting sort is a bad choice because an array of that size has to be created.
2. Space Complexity
Space Complexity: O(max)
Stable: Yes
o The space complexity of counting sort is O(max). The larger the range of elements, the larger the space complexity.
Bucket Sort Algorithm
In this article, we will discuss the bucket sort Algorithm. The data items in the bucket sort are distributed in
the form of buckets. In coding or technical interviews for software engineers, sorting algorithms are widely
asked. So, it is important to discuss the topic.
The best- and average-case complexity of bucket sort is O(n + k), and the worst-case complexity of bucket sort is O(n^2), where n is the number of items.
bucketSort(a[], n)
1. Create 'n' empty buckets
2. Do for each array element a[i]
   2.1. Put the array element into a bucket, i.e. insert a[i] into bucket[n*a[i]]
3. Sort the elements of the individual buckets by using insertion sort.
4. At last, gather or concatenate the sorted buckets.
End bucketSort
Bucket Sort(A[])
1. Let B[0....n-1] be a new array
2. n = length[A]
3. for i = 0 to n-1
4.     make B[i] an empty list
5. for i = 1 to n
6.     do insert A[i] into list B[n*A[i]]
7. for i = 0 to n-1
8.     do sort list B[i] with insertion sort
9. Concatenate lists B[0], B[1], ........, B[n-1] together in order
End
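A Python sketch of the scatter-gather idea for values in a known range (here 0 to 25, as in the example below). The bucket[n*a[i]] step in the pseudo code appears to assume values in [0, 1), so this sketch instead maps each value to a bucket by dividing by the bucket width; all names are illustrative:

def bucket_sort(arr, num_buckets=5, max_value=25):
    width = max_value / num_buckets                 # each bucket covers a range of size 5
    buckets = [[] for _ in range(num_buckets)]      # step 1: create empty buckets
    for value in arr:                               # step 2: scatter elements into buckets
        index = min(int(value / width), num_buckets - 1)
        buckets[index].append(value)
    result = []
    for bucket in buckets:
        # step 3: sort each bucket (the text uses insertion sort; Python's built-in
        # sort is used here for brevity), then step 4: gather the sorted buckets
        result.extend(sorted(bucket))
    return result

print(bucket_sort([16, 3, 21, 7, 12, 24, 1, 18]))   # [1, 3, 7, 12, 16, 18, 21, 24]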
Scatter-gather approach
We can understand the Bucket sort algorithm via scatter-gather approach. Here, the given elements are
first scattered into buckets. After scattering, elements in each bucket are sorted using a stable sorting
algorithm. At last, the sorted elements will be gathered in order.
Let's take an unsorted array to understand the process of bucket sort. It will be easier to understand the
bucket sort via an example.
Now, create buckets covering the range 0 to 25. The bucket ranges are 0-5, 5-10, 10-15, 15-20, 20-25. Elements are inserted into the buckets according to the bucket range. Suppose the value of an item is 16, so it will be inserted into the bucket with the range 15-20. Similarly, every item of the array will be inserted accordingly.
Now, sort each bucket individually. The elements of each bucket can be sorted by using any of the stable
sorting algorithms.
Now, let's see the time complexity of bucket sort in best case, average case, and in worst case. We will also
see the space complexity of the bucket sort.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted. In
Bucket sort, best case occurs when the elements are uniformly distributed in the buckets. The
complexity will be better if the elements are already sorted in the buckets.
If we use the insertion sort to sort the bucket elements, the overall complexity will be linear, i.e., O(n
+ k), where O(n) is for making the buckets, and O(k) is for sorting the bucket elements using
algorithms with linear time complexity at best case.
The best-case time complexity of bucket sort is O(n + k).
o Average Case Complexity - It occurs when the array elements are in jumbled order, neither properly ascending nor properly descending. Bucket sort runs in linear time when the elements are uniformly distributed across the buckets. The average case time complexity of bucket sort is O(n + K).
o Worst Case Complexity - In bucket sort, the worst case occurs when the elements are of a close range in the array; because of that, they have to be placed in the same bucket. So, some buckets have more elements than others.
The complexity will get worse when the elements are in reverse order.
The worst-case time complexity of bucket sort is O(n^2).
2. Space Complexity
Space Complexity: O(n + k)
Stable: Yes
Radix Sort Algorithm
In this article, we will discuss the Radix sort Algorithm. Radix sort is a linear sorting algorithm that is used for integers. In Radix sort, digit-by-digit sorting is performed, starting from the least significant digit and moving to the most significant digit.
The process of radix sort works similarly to sorting students' names in alphabetical order. In this case, there are 26 radixes, formed from the 26 letters of the English alphabet. In the first pass, the names of students are grouped according to the ascending order of the first letter of their names. After that, in the second pass, their names are grouped according to the ascending order of the second letter of their names. The process continues until the list is sorted.
Algorithm
1. radixSort(arr)
2. max = largest element in the given array
3. d = number of digits in the largest element (or, max)
4. Now, create d buckets of size 0 - 9
5. for i -> 0 to d
6.     sort the array elements using counting sort (or any stable sort) according to the digits at the ith place
The steps used in the sorting of radix sort are listed as follows -
o First, we have to find the largest element (suppose max) from the given array. Suppose 'x' be the
number of digits in max. The 'x' is calculated because we need to go through the significant places of
all elements.
o After that, go through one by one each significant place. Here, we have to use any stable sorting
algorithm to sort the digits of each significant place.
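A Python sketch of radix sort that uses a counting sort on each digit, as the steps above describe (function names and the sample array are illustrative):

def counting_sort_by_digit(arr, place):
    # stable counting sort on the digit at the given place value (1, 10, 100, ...)
    count = [0] * 10
    output = [0] * len(arr)
    for value in arr:
        count[(value // place) % 10] += 1
    for d in range(1, 10):                   # cumulative counts
        count[d] += count[d - 1]
    for value in reversed(arr):              # traverse backwards to keep the sort stable
        digit = (value // place) % 10
        count[digit] -= 1
        output[count[digit]] = value
    return output

def radix_sort(arr):
    place = 1
    while max(arr) // place > 0:             # one pass per digit of the largest element
        arr = counting_sort_by_digit(arr, place)
        place *= 10
    return arr

print(radix_sort([181, 289, 390, 121, 145, 736, 514, 212]))
# [121, 145, 181, 212, 289, 390, 514, 736]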
Working of Radix sort Algorithm
Now let's see the working of radix sort in detail by using an example. To understand it more clearly, let's take an unsorted array and try to sort it using radix sort. It will make the explanation clearer and easier.
In the given array, the largest element is 736, which has 3 digits. So, the loop will run up to three times (i.e., to the hundreds place). That means three passes are required to sort the array.
Now, first sort the elements on the basis of unit place digits (i.e., x = 0). Here, we are using the counting sort
algorithm to sort the elements.
Pass 1:
In the first pass, the list is sorted on the basis of the digits at 0's place.
Pass 2:
In this pass, the list is sorted on the basis of the next significant digits (i.e., the digits at the 10's place).
Pass 3:
In this pass, the list is sorted on the basis of the next significant digits (i.e., the digits at the 100's place).
Now, let's see the time complexity of Radix sort in best case, average case, and worst case. We will also see
the space complexity of Radix sort.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted.
The best-case time complexity of Radix sort is Ω(n+k).
o Average Case Complexity - It occurs when the array elements are in jumbled order that is not
properly ascending and not properly descending. The average case time complexity of Radix sort
is θ(nk).
o Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse
order. That means suppose you have to sort the array elements in ascending order, but its elements
are in descending order. The worst-case time complexity of Radix sort is O(nk).
Radix sort is a non-comparative sorting algorithm that performs better than comparison-based sorting algorithms. It has linear time complexity, which is better than the O(n logn) of comparison-based algorithms.
2. Space Complexity
Space Complexity: O(n + k)
Stable: Yes
UNIT 3 : SEARCHING
Linear Search Algorithm
Working process
In the linear search algorithm:
Every element is considered a potential match for the key and checked for the same.
If any element is found equal to the key, the search is successful and the index of the element is returned.
If no element is found equal to the key, the search yields "No Match Found".
For Example:
Consider the array { 10, 50, 30, 70, 80, 20, 90, 40 } and the key 30.
Step 1:
Compare the key (30) with the first element, 10. Since they are not equal, take the next element as a potential match.
Compare the key with the next element, 50. Since they are not equal, take the next element as a potential match.
Step 2:
Now, when the key is compared with the next element, 30, the values match. So the linear search algorithm reports a success message and returns the index of the current element.
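A minimal Python sketch of this working process (the function name linear_search is illustrative):

def linear_search(arr, key):
    for index, value in enumerate(arr):
        if value == key:          # element equal to the key: search successful
            return index
    return -1                     # no element matched: "No Match Found"

arr = [10, 50, 30, 70, 80, 20, 90, 40]
print(linear_search(arr, 30))     # 2  (index of the key)
print(linear_search(arr, 99))     # -1 (not present)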
The complexity of linear search can be analyzed in three cases:
1. Best Case
2. Worst Case
3. Average Case
You will learn about each one of them in a bit more detail.
Worst Case Complexity
The element being searched for may be at the last position in the array or not present at all.
In the first case, the search succeeds in 'n' comparisons.
In the second case, the search fails after 'n' comparisons.
Thus, in the worst-case scenario, the linear search algorithm performs O(n) operations.
Advantages: No special data structure is required.
Disadvantages: Not suitable for large data sets.
Binary Search Algorithm
Linear Search and Binary Search are the two popular searching techniques. Here we will discuss the Binary Search Algorithm.
Binary search is a search technique that works efficiently on sorted lists. Hence, to search an element in some list using the binary search technique, we must ensure that the list is sorted.
Binary search follows the divide and conquer approach in which the list is divided into two halves,
and the item is compared with the middle element of the list. If the match is found then, the location
of the middle element is returned. Otherwise, we search into either of the halves depending upon
the result produced through the match.
Algorithm
1. Binary_Search(a, lower_bound, upper_bound, val) // 'a' is the given array, 'lower_bound' is the index of the first array element, 'upper_bound' is the index of the last array element, 'val' is the value to search
2. Step 1: set beg = lower_bound, end = upper_bound, pos = - 1
3. Step 2: repeat steps 3 and 4 while beg <=end
4. Step 3: set mid = (beg + end)/2
5. Step 4: if a[mid] = val
6. set pos = mid
7. print pos
8. go to step 6
9. else if a[mid] > val
10. set end = mid - 1
11. else
12. set beg = mid + 1
13. [end of if]
14. [end of loop]
15. Step 5: if pos = -1
16. print "value is not present in the array"
17. [end of if]
18. Step 6: exit
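The same algorithm in Python (iterative form; the function name binary_search and the sample array are illustrative):

def binary_search(a, val):
    beg, end = 0, len(a) - 1
    while beg <= end:
        mid = (beg + end) // 2        # mid = (beg + end)/2
        if a[mid] == val:
            return mid                # pos = mid
        elif a[mid] > val:
            end = mid - 1             # search the left half
        else:
            beg = mid + 1             # search the right half
    return -1                         # value is not present in the array

a = [10, 12, 24, 29, 39, 40, 51, 56, 69]
print(binary_search(a, 56))   # 7
print(binary_search(a, 11))   # -1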
Working of Binary search
Now, let's see the working of the Binary Search Algorithm.
To understand the working of the Binary search algorithm, let's take a sorted array. It will be easy to understand the working of Binary search with an example.
There are two methods to implement the binary search algorithm -
o Iterative method
o Recursive method
The recursive method of binary search follows the divide and conquer approach.
We have to use the below formula to calculate the mid of the array -
mid = (beg + end)/2
In the given array, beg = 0 and end = 8.
Now, the element to search is found. So algorithm will return the index of the element matched.
1. Time Complexity
o Best Case Complexity - In Binary search, the best case occurs when the element to search for is found in the first comparison, i.e., when the first middle element itself is the element to be searched. The best-case time complexity of Binary search is O(1).
o Average Case Complexity - The average case time complexity of Binary search is O(logn).
o Worst Case Complexity - In Binary search, the worst case occurs when we have to keep reducing the search space until it has only one element. The worst-case time complexity of Binary search is O(logn).
2. Space Complexity
The space complexity of the iterative binary search is O(1).
Advantages:
o It is better than a linear search algorithm since its run time complexity is O(logN).
o At each iteration, the binary search algorithm eliminates half of the list and significantly reduces the search space.
o A variant of the binary search algorithm works even when the array is rotated by some positions and still finds the target element.
Disadvantages:
o The list must be sorted before binary search can be applied.
Binary Search Tree
Before moving directly to the binary search tree, let's first see a brief description of a tree: a tree is a non-linear data structure in which nodes are connected by edges and one node is designated as the root.
A binary search tree follows some order to arrange the elements. In a Binary search tree, the value of the left node must be smaller than the parent node, and the value of the right node must be greater than the parent node. This rule is applied recursively to the left and right subtrees of the root.
In the above figure, we can observe that the root node is 40, all the nodes of the left subtree are smaller than the root node, and all the nodes of the right subtree are greater than the root node.
Similarly, we can see that the left child of the root node is greater than its own left child and smaller than its own right child. So, it also satisfies the property of a binary search tree. Therefore, we can say that the tree in the above image is a binary search tree.
Suppose if we change the value of node 35 to 55 in the above tree, check whether the tree will be
binary search tree or not.
In the above tree, the value of root node is 40, which is greater than its left child 30 but smaller than
right child of 30, i.e., 55. So, the above tree does not satisfy the property of Binary search tree.
Therefore, the above tree is not a binary search tree.
Advantages of Binary search tree
o Searching for an element in a Binary search tree is easy, as we always have a hint of which subtree has the desired element.
o As compared to arrays and linked lists, insertion and deletion operations are faster in a BST.
Suppose the data elements are - 45, 15, 79, 90, 10, 55, 12, 20, 50
o First, we have to insert 45 into the tree as the root of the tree.
o Then, read the next element; if it is smaller than the root node, insert it as the root of the left subtree,
and move to the next element.
o Otherwise, if the element is larger than the root node, then insert it as the root of the right subtree.
Now, let's see the process of creating the Binary search tree using the given data element. The
process of creating the BST is shown below -
As 15 is smaller than 45, so insert it as the root node of the left subtree.
As 79 is greater than 45, so insert it as the root node of the right subtree.
90 is greater than 45 and 79, so it will be inserted as the right subtree of 79.
10 is smaller than 45 and 15, so it will be inserted as the left subtree of 15.
55 is larger than 45 and smaller than 79, so it will be inserted as the left subtree of 79.
12 is smaller than 45 and 15 but greater than 10, so it will be inserted as the right subtree of 10.
20 is smaller than 45 but greater than 15, so it will be inserted as the right subtree of 15.
50 is greater than 45 but smaller than 79 and 55. So, it will be inserted as a left subtree of 55.
Now, the creation of binary search tree is completed. After that, let's move towards the operations
that can be performed on Binary search tree.
We can perform insert, delete and search operations on the binary search tree.
Searching in Binary search tree
Searching means to find or locate a specific element or node in a data structure. In a Binary search tree, searching for a node is easy because elements in a BST are stored in a specific order. The steps of searching for a node in a Binary search tree are listed as follows -
1. First, compare the element to be searched with the root element of the tree.
2. If root is matched with the target element, then return the node's location.
3. If it is not matched, then check whether the item is less than the root element, if it is smaller than the
root element, then move to the left subtree.
4. If it is larger than the root element, then move to the right subtree.
5. Repeat the above procedure recursively until the match is found.
6. If the element is not found or not present in the tree, then return NULL.
Now, let's understand searching in a binary search tree using an example. We are taking the binary search tree formed above. Suppose we have to find node 20 in the below tree.
Step 1: Compare 20 with the root, 45. Since 20 < 45, move to the left subtree (node 15).
Step 2: Compare 20 with 15. Since 20 > 15, move to the right subtree of 15 (node 20).
Step 3: The value 20 matches the current node, so the search is successful and the node's location is returned.
Now, let's see the algorithm to search an element in the Binary search tree.
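A hedged Python sketch of a binary search tree node with insert and search, following the steps above (class and function names are illustrative):

class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(root, value):
    if root is None:                  # empty spot found: place the new node here
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)    # smaller keys go to the left subtree
    else:
        root.right = insert(root.right, value)  # larger keys go to the right subtree
    return root

def search(root, value):
    if root is None:                  # element is not present in the tree
        return None
    if value == root.value:           # matched the current node
        return root
    if value < root.value:            # smaller: search the left subtree
        return search(root.left, value)
    return search(root.right, value)  # larger: search the right subtree

root = None
for key in [45, 15, 79, 90, 10, 55, 12, 20, 50]:
    root = insert(root, key)
print(search(root, 20) is not None)   # True
print(search(root, 100) is not None)  # False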
Deletion in Binary search tree
Now, let's understand how deletion is performed on a binary search tree. We will also see an example of deleting an element from the given tree.
Case 1: When the node to be deleted is a leaf node
It is the simplest case of deleting a node in a BST. Here, we have to replace the leaf node with NULL and simply free the allocated space.
We can see the process of deleting a leaf node from a BST in the below image. In the below image, suppose we have to delete node 90; as the node to be deleted is a leaf node, it will be replaced with NULL, and the allocated space will be freed.
Case 2: When the node to be deleted has only one child
In this case, we have to replace the target node with its child, and then delete the child node. It means that after replacing the target node with its child node, the child node will now contain the value to be deleted. So, we simply have to replace the child node with NULL and free up the allocated space.
We can see the process of deleting a node with one child from a BST in the below image. In the below image, suppose we have to delete node 79; as the node to be deleted has only one child, it will be replaced with its child 55.
So, the replaced node 79 will now be a leaf node that can be easily deleted.
Case 3: When the node to be deleted has two children
This case of deleting a node in a BST is a bit more complex than the other two cases. In such a case, the steps to be followed are: replace the node to be deleted with its inorder successor, and then delete the successor from its original position.
The inorder successor is needed when the right child of the node is not empty. We can obtain the inorder successor by finding the minimum element in the right child of the node.
We can see the process of deleting a node with two children from a BST in the below image. In the below image, suppose we have to delete node 45, which is the root node; as the node to be deleted has two children, it will be replaced with its inorder successor. Now, node 45 will be at the leaf of the tree so that it can be deleted easily.
Insertion in Binary search tree
A new key is always inserted at a leaf position, following the same comparisons as in searching. Now, let's see the process of inserting a node into a BST using an example.
Binary search tree complexity
1. Time Complexity
Operations    Best case     Average case    Worst case
Insertion     O(log n)      O(log n)        O(n)
Deletion      O(log n)      O(log n)        O(n)
Search        O(log n)      O(log n)        O(n)
2. Space Complexity
The space complexity of a binary search tree with n nodes is O(n).
AVL Tree
AVL Tree was invented by G.M. Adelson-Velsky and E.M. Landis in 1962. The tree is named AVL in honour of its inventors.
An AVL Tree can be defined as a height-balanced binary search tree in which each node is associated with a balance factor, which is calculated by subtracting the height of its right sub-tree from that of its left sub-tree.
The tree is said to be balanced if the balance factor of each node is between -1 and 1; otherwise, the tree is unbalanced and needs to be balanced.
If the balance factor of any node is 1, it means that the left sub-tree is one level higher than the right sub-tree.
If the balance factor of any node is 0, it means that the left sub-tree and right sub-tree are of equal height.
If the balance factor of any node is -1, it means that the left sub-tree is one level lower than the right sub-tree.
An AVL tree is given in the following figure. We can see that the balance factor associated with each node is between -1 and +1. Therefore, it is an example of an AVL tree.
Complexity
The average and worst-case time complexity of search, insertion, and deletion in an AVL tree is O(log n); the space complexity is O(n).
SN Operation Description
1 Insertion Insertion in AVL tree is performed in the same way as it is performed in a binary search
tree. However, it may lead to violation in the AVL tree property and therefore the tree
may need balancing. The tree can be balanced by applying rotations.
2 Deletion Deletion can also be performed in the same way as it is performed in a binary search
tree. Deletion may also disturb the balance of the tree therefore, various types of
rotations are used to rebalance the tree.
AVL Rotations
We perform a rotation in an AVL tree only when the balance factor of a node is other than -1, 0, or 1. There are basically four types of rotations: LL rotation, RR rotation, LR rotation, and RL rotation.
Here, node A is the node whose balance factor is other than -1, 0, or 1.
The first two rotations, LL and RR, are single rotations, and the next two, LR and RL, are double rotations. For a tree to become unbalanced, its minimum height must be at least 2. Let us understand each rotation.
1. RR Rotation
When the BST becomes unbalanced because a node is inserted into the right subtree of the right subtree of A, we perform an RR rotation. RR rotation is an anticlockwise rotation applied on the edge below a node having balance factor -2.
In the above example, node A has balance factor -2 because node C is inserted in the right subtree of A's right subtree. We perform the RR rotation on the edge below A.
2. LL Rotation
When the BST becomes unbalanced because a node is inserted into the left subtree of the left subtree of C, we perform an LL rotation. LL rotation is a clockwise rotation applied on the edge below a node having balance factor 2.
In the above example, node C has balance factor 2 because node A is inserted in the left subtree of C's left subtree. We perform the LL rotation on the edge below C.
3. LR Rotation
Double rotations are a bit more involved than the single rotations explained above. LR rotation = RR rotation + LL rotation, i.e., an RR rotation is first performed on the subtree and then an LL rotation is performed on the full tree. By full tree we mean the first node, on the path from the inserted node, whose balance factor is other than -1, 0, or 1.
A node B has been inserted into the right subtree of A, the left subtree of C, because of which C has become an unbalanced node with balance factor 2. This is the LR case, since the inserted node lies in the right subtree of the left subtree of C.
RR rotation is performed on the subtree first. After the RR rotation, node C is still unbalanced (balance factor 2), because the imbalance now lies in the left subtree of the left subtree of C.
Now we perform the LL (clockwise) rotation on the full tree, i.e., on node C. Node C now becomes the right subtree of node B, and A becomes the left subtree of B.
The balance factor of each node is now either -1, 0, or 1, i.e., the BST is balanced.
4. RL Rotation
As already discussed, double rotations are a bit more involved than the single rotations explained above. RL rotation = LL rotation + RR rotation, i.e., an LL rotation is first performed on the subtree and then an RR rotation is performed on the full tree. By full tree we mean the first node, on the path from the inserted node, whose balance factor is other than -1, 0, or 1.
A node B has been inserted into the left subtree of C, the right subtree of A, because of which A has become an unbalanced node with balance factor -2. This is the RL case, since the inserted node lies in the left subtree of the right subtree of A.
LL rotation is performed on the subtree first. After the LL rotation, node A is still unbalanced (balance factor -2), because the imbalance still lies in the right subtree of the right subtree of A.
Now we perform the RR (anticlockwise) rotation on the full tree, i.e., on node A. Node C has now become the right subtree of node B, and node A has become the left subtree of B.
The balance factor of each node is now either -1, 0, or 1, i.e., the BST is balanced.
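As a rough illustration of the single rotations described above, here is a small Python sketch; it assumes each node object has left, right and height attributes (these names are assumptions for the example). The double rotations LR and RL are simply compositions of these two.

def height(node):
    return node.height if node else 0

def update_height(node):
    node.height = 1 + max(height(node.left), height(node.right))

def rotate_right(z):
    # LL case: z is left-heavy, so its left child y becomes the new subtree root.
    y = z.left
    z.left = y.right
    y.right = z
    update_height(z)
    update_height(y)
    return y

def rotate_left(z):
    # RR case: z is right-heavy, so its right child y becomes the new subtree root.
    y = z.right
    z.right = y.left
    y.left = z
    update_height(z)
    update_height(y)
    return y

# LR rotation: rotate_left on z.left, then rotate_right on z.
# RL rotation: rotate_right on z.right, then rotate_left on z.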
1. Insert H, I, J
On inserting the above elements, especially in the case of H, the BST becomes unbalanced as the
Balance Factor of H is -2. Since the BST is right-skewed, we will perform RR Rotation on node H.
2. Insert B, A
On inserting the above elements, especially in the case of A, the BST becomes unbalanced as the Balance Factor of H and I is 2. We consider the first unbalanced node on the path from the last inserted node, i.e., H. Since the BST from H is left-skewed, we will perform LL Rotation on node H.
3. Insert E
On inserting E, the BST becomes unbalanced as the Balance Factor of I is 2. If we travel from E to I, we find that E is inserted in the left subtree of the right subtree of I, so we will perform LR Rotation on node I. LR = RR + LL rotation.
4. Insert C, F, D
On inserting C, F, D, the BST becomes unbalanced as the Balance Factors of B and H are -2. If we travel from D to B, we find that D is inserted in the right subtree of the left subtree of B, so we will perform RL Rotation on node I. RL = LL + RR rotation.
5. Insert G
On inserting G, the BST becomes unbalanced as the Balance Factor of H is 2. If we travel from G to H, we find that G is inserted in the left subtree of the right subtree of H, so we will perform LR Rotation on node I. LR = RR + LL rotation.
6. Insert K
On inserting K, the BST becomes unbalanced as the Balance Factor of I is -2. Since the BST is right-skewed from I to K, we will perform RR Rotation on node I.
7. Insert L
On inserting L, the tree remains balanced, as the Balance Factor of each node is now either -1, 0, or +1. Hence the tree is a balanced AVL tree.
The above tree is a binary search tree. A binary search tree is a tree in which each node on the left side has a lower value than its parent node, and each node on the right side has a higher value than its parent node. In the above tree, n1 is the root node, and n4, n6, n7 are the leaf nodes. The n7 node is the farthest node from the root node. The nodes n4 and n6 are 2 edges away from the root, and there are three edges between the root node and the n7 node. Since n7 is the farthest from the root node, the height of the above tree is 3.
Now we will see whether the above tree is balanced or not. The left subtree contains the nodes n2,
n4, n5, and n7, while the right subtree contains the nodes n3 and n6. The left subtree has two leaf
nodes, i.e., n4 and n7. There is only one edge between the node n2 and n4 and two edges between
the nodes n7 and n2; therefore, node n7 is the farthest from the root node. The height of the left
subtree is 2. The right subtree contains only one leaf node, i.e., n6, and has only one edge; therefore,
the height of the right subtree is 1. The difference between the heights of the left subtree and right
subtree is 1. Since this difference is 1, we can say that the above tree is a height-balanced tree. This process of calculating the difference between the heights should be performed for each node, such as n2, n3, n4, n5, n6 and n7. When we process each node, we find that the height difference is never more than 1, so we can say that the above tree is a balanced binary tree.
In the above tree, n6, n4, and n3 are the leaf nodes, where n6 is the farthest node from the root
node. Three edges exist between the root node and the leaf node; therefore, the height of the above
tree is 3. When we consider n1 as the root node, the left subtree contains the nodes n2, n4, n5, and n6, while the right subtree contains the node n3. In the left subtree, n2 is the root node, and n4 and n6 are leaf nodes. Among n4 and n6, n6 is the farthest node from its root node, and n6 is two edges away; therefore, the height of the left subtree is 2. The right subtree does not have any child on its left or right; therefore, the height of the right subtree is 0. Since the height of the left subtree is 2 and the height of the right subtree is 0, the difference between the heights of the left subtree and right subtree is 2.
According to the definition, the difference between the height of left sub tree and the right subtree
must not be greater than 1. In this case, the difference comes to be 2, which is greater than 1;
therefore, the above binary tree is an unbalanced binary search tree.
The above tree is a binary search tree because all the left-subtree nodes are smaller than their parent node and all the right-subtree nodes are greater than their parent node. Suppose we want to find the value 79 in the above tree. First, we compare 79 with the value of node n1; since 79 is not equal to 35 and is greater than 35, we move to node n3, i.e., 48. Since 79 is not equal to 48 and is greater than 48, we move to the right child of 48. The value of the right child of node 48 is 79, which is equal to the value to be searched. The number of hops required to search the element 79 is 2, and the maximum number of hops required to search any element is 2. The average case to search an element is O(log n).
The above tree is also a binary search tree because all the left-subtree nodes are smaller than their parent node and all the right-subtree nodes are greater than their parent node. Suppose we want to find the value 79 in this tree. First, we compare 79 with node n4, i.e., 13. Since 79 is greater than 13, we move to the right child of node 13, i.e., n2 (21). The value of node n2 is 21, which is smaller than 79, so we again move to the right of node 21. The value of the right child of node 21 is 29. Since 79 is greater than 29, we move to the right child of node 29. The value of the right child of node 29 is 35, which is smaller than 79, so we move to the right child of node 35, i.e., 48. The value 79 is greater than 48, so we move to the right child of node 48. The value of the right child of node 48 is 79, which is equal to the value to be searched. In this case, the number of hops required to search the element is 5, and the worst case is O(n).
If the number of nodes increases, the tree in diagram 1 is more efficient than the tree in diagram 2. Suppose the number of nodes in both of the above trees is 100,000. To search an element in the tree of diagram 2, the time taken is about 100,000 µs, whereas the time taken to search an element in the tree of diagram 1 is about log2(100,000) ≈ 16.6 µs. We can observe the enormous difference in time between the two trees. Therefore, we conclude that a balanced binary tree provides much faster searching than a linear (skewed) tree structure.
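The number of hops counted above is simply the number of levels descended, so a short Python sketch of the search (assuming nodes with val, left and right attributes) makes the O(log n) versus O(n) behaviour visible:

def bst_search(root, key):
    # Each hop moves one level down, so the number of comparisons is bounded
    # by the height of the tree: O(log n) when balanced, O(n) when skewed.
    hops = 0
    node = root
    while node is not None:
        if key == node.val:
            return node, hops
        node = node.left if key < node.val else node.right
        hops += 1
    return None, hops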
Introduction to Hashing
Assume we want to create a system for storing employee records keyed by phone numbers. We also want the following queries to run quickly: insert a phone number and the corresponding information, search a phone number and fetch its information, and delete a phone number and the related information.
We can consider using the following data structures to store information about various phone
numbers.
We must search in a linear fashion for arrays and linked lists, which can be costly in practise. If we
use arrays and keep the data sorted, we can use Binary Search to find a phone number in O(Logn)
time, but insert and delete operations become expensive because we must keep the data sorted.
We get moderate search, insert, and delete times with a balanced binary search tree. All of these
operations will be completed in O(Logn) time.
Another option is to use a direct access table, in which we create a large array and use phone numbers
as indexes. If the phone number is not present, the array entry is NIL; otherwise, the array entry
stores a pointer to the records corresponding to the phone number. In terms of time complexity, this
solution is the best of the bunch; we can perform all operations in O(1) time. To insert a phone
number, for example, we create a record with the phone number's details, use the phone number as
an index, and store the pointer to the newly created record in the table.
This solution has a number of practical drawbacks. The first issue is the amount of extra space required: if a phone number has n digits, we require O(m × 10^n) table space, where m is the size of a pointer to a record. Another issue is that an integer type in a programming language may not be able to hold an n-digit number.
Because of the limitations mentioned above, Direct Access Table cannot always be used. In practise,
Hashing is the solution that can be used in almost all such situations and outperforms the above data
structures such as Array, Linked List, and Balanced BST. We get O(1) search time on average (under
reasonable assumptions) and O(n) in the worst case with hashing. Let's break down what hashing is.
Open Addressing-
In open addressing,
• Unlike separate chaining, all the keys are stored inside the hash table.
• No key is stored outside the hash table.
Techniques used for open addressing are:
• Linear Probing
• Quadratic Probing
• Double Hashing
Insert Operation-
• Hash function is used to compute the hash value for the key to be inserted.
• The hash value is then used as an index to store the key in the hash table.
In case of collision, probing is performed until an empty bucket is found, and the key is then stored in that bucket.
NOTE-
• During insertion, the buckets marked as “deleted” are treated like any other empty bucket.
• During searching, the search is not terminated on encountering the bucket marked as
“deleted”.
• The search terminates only after the required key or an empty bucket is found.
1. Linear Probing-
In linear probing, when a collision occurs at some bucket, we keep probing the following buckets one by one, i.e., hash(x)+1, hash(x)+2, and so on (wrapping around the table), until an empty bucket is found.
Advantage-
• It is easy to compute.
Disadvantage-
• The main problem with linear probing is clustering: consecutive occupied buckets form long runs, which slows down insertion and searching.
Time Complexity-
• In the worst case, searching or inserting a key takes O(n) time.
This is because-
• Even if there is only one element present and all other elements are deleted,
• the "deleted" markers present in the hash table force the search to scan the entire table.
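A minimal Python sketch of open addressing with linear probing, including the "deleted" marker behaviour described in the note above; the class name and default table size are assumptions for illustration.

class LinearProbingHashTable:
    EMPTY, DELETED = object(), object()

    def __init__(self, size=7):
        self.size = size
        self.slots = [self.EMPTY] * size

    def _hash(self, key):
        return key % self.size

    def insert(self, key):
        i = self._hash(key)
        for _ in range(self.size):
            # A bucket marked "deleted" is treated like an empty bucket on insertion.
            if self.slots[i] is self.EMPTY or self.slots[i] is self.DELETED:
                self.slots[i] = key
                return i
            i = (i + 1) % self.size          # probe the next bucket linearly
        raise OverflowError("hash table is full")

    def search(self, key):
        i = self._hash(key)
        for _ in range(self.size):
            if self.slots[i] is self.EMPTY:  # search stops only at a truly empty bucket
                return None
            if self.slots[i] == key:
                return i
            i = (i + 1) % self.size
        return None

    def delete(self, key):
        i = self.search(key)
        if i is not None:
            self.slots[i] = self.DELETED     # mark as deleted, do not empty the bucket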
2. Quadratic Probing-
In quadratic probing, when a collision occurs we probe the buckets at quadratically increasing offsets from the original hash value, i.e., hash(x)+1^2, hash(x)+2^2, hash(x)+3^2, and so on, until an empty bucket is found. This reduces the clustering seen with linear probing.
3. Double Hashing-
In double hashing,
• We use another hash function hash2(x) and, in the ith iteration, probe the bucket (hash(x) + i * hash2(x)) mod table size.
• It requires more computation time, as two hash functions need to be computed.
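For comparison, a small sketch of the probe sequence used by double hashing; hash2 here is just one example of a secondary hash function, chosen so that it never returns 0.

def double_hash_probe(key, i, table_size):
    # Bucket examined in the i-th iteration: (hash1 + i * hash2) mod table_size.
    hash1 = key % table_size
    hash2 = 1 + (key % (table_size - 1))
    return (hash1 + i * hash2) % table_size

# For key 101 and a table of size 7 the probe sequence starts 3, 2, 1, 0, ...
print([double_hash_probe(101, i, 7) for i in range(4)])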
Using the hash function ‘key mod 7’, insert the following sequence of keys in the hash table-
Solution-
The given sequence of keys will be inserted in the hash table as-
Step-01:
Step-02:
Step-03:
Step-04:
Step-05:
Step-06:
Step-07:
Step-08:
• The next key to be inserted in the hash table = 101.
• Bucket of the hash table to which key 101 maps = 101 mod 7 = 3.
• Since bucket-3 is already occupied, so collision occurs.
• Separate chaining handles the collision by creating a linked list to bucket-3.
• So, key 101 will be inserted in bucket-3 of the hash table as-
Problem-
Using the hash function ‘key mod 7’, insert the following sequence of keys in the hash table-
Solution-
The given sequence of keys will be inserted in the hash table as-
Step-01:
Step-02:
Step-03:
Step-04:
Step-05:
Step-06:
Step-07:
Step-08:
Separate Chaining-
Separate Chaining is advantageous when it is required to perform all the following operations on the keys
stored in the hash table-
• Insertion Operation
• Deletion Operation
• Searching Operation
Using the hash function ‘key mod 7’, insert the following sequence of keys in the hash table-
Solution-
The given sequence of keys will be inserted in the hash table as-
Step-01:
Step-02:
Step-03:
Step-04:
Step-05:
• The next key to be inserted in the hash table = 85.
• Bucket of the hash table to which key 85 maps = 85 mod 7 = 1.
• Since bucket-1 is already occupied, so collision occurs.
• To handle the collision, linear probing technique keeps probing linearly until an empty bucket is found.
• The first empty bucket is bucket-2.
• So, key 85 will be inserted in bucket-2 of the hash table as-
Step-06:
Step-07:
Step-08:
• The next key to be inserted in the hash table = 101.
• Bucket of the hash table to which key 101 maps = 101 mod 7 = 3.
• Since bucket-3 is already occupied, so collision occurs.
• To handle the collision, linear probing technique keeps probing linearly until an empty bucket is found.
• The first empty bucket is bucket-5.
• So, key 101 will be inserted in bucket-5 of the hash table as-
Separate Chaining vs. Open Addressing
o Separate chaining: keys are stored inside the hash table as well as outside the hash table. Open addressing: all the keys are stored only inside the hash table; no key is present outside the hash table.
o Separate chaining: the number of keys to be stored can even exceed the size of the hash table. Open addressing: the number of keys to be stored can never exceed the size of the hash table.
o Separate chaining: extra space is required for the pointers that store the keys outside the hash table. Open addressing: no extra space is required.
In open addressing, the value of the load factor always lies between 0 and 1.
This is because-
• In open addressing, all the keys are stored inside the hash table.
• So, the size of the table is always greater than or equal to the number of keys stored in the table.
Unit 4 : GRAPH THEORY
Graph
A graph can be defined as a group of vertices and the edges that connect these vertices. A graph can be seen as a generalization of a tree in which the vertices (nodes) may maintain any complex relationship among them, including cycles, instead of only a parent-child relationship.
Definition
A graph G can be defined as an ordered set G(V, E) where V(G) represents the set of vertices and E(G)
represents the set of edges which are used to connect these vertices.
A Graph G(V, E) with 5 vertices (A, B, C, D, E) and six edges ((A,B), (B,C), (C,E), (E,D), (D,B), (D,A)) is shown in
the following figure.
A graph can be directed or undirected. In an undirected graph, edges are not associated with any direction. An undirected graph is shown in the above figure, since its edges are not attached to any direction. If an edge exists between vertices A and B, then the vertices can be traversed from B to A as well as from A to B.
In a directed graph, edges form an ordered pair. Edges represent a specific path from some vertex A to
another vertex B. Node A is called initial node while node B is called terminal node.
Path
A path can be defined as the sequence of nodes that are followed in order to reach some terminal node V
from the initial node U.
Closed Path
A path is called a closed path if the initial node is the same as the terminal node, i.e., if V0 = VN.
Simple Path
If all the nodes of the path are distinct, with the possible exception that V0 = VN, then such a path P is called a closed simple path.
Cycle
A cycle can be defined as the path which has no repeated edges or vertices except the first and last vertices.
Connected Graph
A connected graph is the one in which some path exists between every two vertices (u, v) in V. There are no
isolated nodes in connected graph.
Complete Graph
A complete graph is one in which every node is connected to all other nodes. A complete graph contains n(n-1)/2 edges, where n is the number of nodes in the graph.
Weighted Graph
In a weighted graph, each edge is assigned with some data such as length or weight. The weight of an edge
e can be given as w(e) which must be a positive (+) value indicating the cost of traversing the edge.
Digraph
A digraph is a directed graph in which each edge of the graph is associated with some direction and the
traversing can be done only in the specified direction.
Loop
An edge whose two end points are the same vertex is called a loop.
Adjacent Nodes
If two nodes u and v are connected via an edge e, then the nodes u and v are called as neighbours or adjacent
nodes.
Degree of the Node
The degree of a node is the number of edges connected to that node. A node with degree 0 is called an isolated node.
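To tie these terms together, here is a small Python sketch of one common graph representation, the adjacency list; the vertex names and weights are made up for illustration.

# An undirected weighted graph stored as an adjacency list (dictionary of lists).
graph = {
    'A': [('B', 1), ('D', 10)],
    'B': [('A', 1), ('C', 3)],
    'C': [('B', 3), ('D', 4)],
    'D': [('A', 10), ('C', 4)],
}

def degree(g, node):
    # The degree of a node is the number of edges incident to it.
    return len(g[node])

print(degree(graph, 'D'))   # 2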
Spanning tree
In this article, we will discuss the spanning tree and the minimum spanning tree. But before moving directly
towards the spanning tree, let's first see a brief description of the graph and its types.
Graph
A graph can be defined as a group of vertices and edges to connect these vertices. The types of graphs are
given as follows -
o Undirected graph: An undirected graph is a graph in which all the edges do not point to any particular
direction, i.e., they are not unidirectional; they are bidirectional. It can also be defined as a graph
with a set of V vertices and a set of E edges, each edge connecting two different vertices.
o Connected graph: A connected graph is a graph in which a path always exists from a vertex to any
other vertex. A graph is connected if we can reach any vertex from any other vertex by following
edges in either direction.
o Directed graph: Directed graphs are also known as digraphs. A graph is a directed graph (or digraph)
if all the edges present between any vertices or nodes of the graph are directed or have a defined
direction.
Spanning Tree
A spanning tree can be defined as a subgraph of an undirected connected graph that includes all the vertices along with the least possible number of edges. If any vertex is missed, it is not a spanning tree. A spanning tree is a subset of the graph that does not have cycles, and it also cannot be disconnected.
A spanning tree consists of (n-1) edges, where 'n' is the number of vertices (or nodes). Edges of the spanning
tree may or may not have weights assigned to them. All the possible spanning trees created from the given
graph G would have the same number of vertices, but the number of edges in the spanning tree would be
equal to the number of vertices in the given graph minus 1.
A complete undirected graph can have n^(n-2) spanning trees, where n is the number of vertices in the graph. For example, if n = 5, the number of possible spanning trees is 5^(5-2) = 5^3 = 125.
Basically, a spanning tree is used to find a minimum path to connect all nodes of the graph. Some of the
common applications of the spanning tree are listed as follows -
o Cluster Analysis
o Civil network planning
o Computer network routing protocol
Now, let's understand the spanning tree with the help of an example.
As discussed above, a spanning tree contains the same number of vertices as the graph, the number of
vertices in the above graph is 5; therefore, the spanning tree will contain 5 vertices. The edges in the
spanning tree will be equal to the number of vertices in the graph minus 1. So, there will be 4 edges in the
spanning tree.
Some of the possible spanning trees that will be created from the above graph are given as follows -
Properties of a spanning tree
o A connected graph can have more than one spanning tree, but every spanning tree has the same number of vertices and exactly (n - 1) edges.
o A spanning tree does not contain any cycle.
o A spanning tree is minimally connected: removing any one edge disconnects it.
o A spanning tree is maximally acyclic: adding any one edge creates a cycle.
So, a spanning tree is a subset of a connected graph G, and there is no spanning tree of a disconnected graph.
A minimum spanning tree can be defined as the spanning tree in which the sum of the weights of the edges is minimum. The weight of the spanning tree is the sum of the weights given to the edges of the spanning tree. In the real world, this weight can represent distance, traffic load, congestion, or any other value.
Let's understand the minimum spanning tree with the help of an example.
The sum of the edges of the above graph is 16. Now, some of the possible spanning trees created from the
above graph are -
So, the minimum spanning tree that is selected from the above spanning trees for the given weighted graph
is -
o Minimum spanning tree can be used to design water-supply networks, telecommunication networks,
and electrical grids.
o It can be used to find paths in the map.
A minimum spanning tree can be found from a weighted graph by using the algorithms given below -
o Prim's Algorithm
o Kruskal's Algorithm
Kruskal's algorithm - This algorithm is also used to find the minimum spanning tree for a connected weighted
graph. Kruskal's algorithm also follows greedy approach, which finds an optimum solution at every stage
instead of focusing on a global optimum.
A DAG for a basic block is a directed acyclic graph with the following labels on its nodes:
1. The leaves of the graph are labeled by unique identifiers, which can be variable names or constants.
2. The interior nodes of the graph are labeled by an operator symbol.
3. Nodes are also given a sequence of identifiers as labels, to store the computed value.
o DAGs are a type of data structure. It is used to implement transformations on basic blocks.
o DAG provides a good way to determine the common sub-expression.
o It gives a picture representation of how the value computed by the statement is used in subsequent
statements.
Method:
Step 1:
For case(i), create node(OP) whose right child is node(z) and left child is node(y).
For case(ii), check whether there is node(OP) with one child node(y).
Output:
For node(x) delete x from the list of identifiers. Append x to attached identifiers list for the node n found in
step 2. Finally set node(x) to n.
Example:
1. S1 := 4 * i
2. S2 := a[S1]
3. S3 := 4 * i
4. S4 := b[S3]
5. S5 := S2 * S4
6. S6 := prod + S5
7. prod := S6
8. S7 := i + 1
9. i := S7
10. if i <= 20 goto (1)
Topological Sorting
A topological sort or topological ordering of a directed graph is a linear ordering of its vertices in which u
occurs before v in the ordering for every directed edge uv from vertex u to vertex v. For example, the graph's vertices could represent jobs to be completed, and the edges could reflect requirements that one job must be completed before another.
In this case, a topological ordering is just a legitimate task sequence. A topological sort is a graph traversal
in which each node v is only visited after all of its dependencies have been visited. If the graph contains no
directed cycles, then it is a directed acyclic graph. Any DAG has at least one topological ordering, and there
exist techniques for building topological orderings in linear time for any DAG.
Topological sorting has many applications, particularly in ranking issues like the feedback arc set. Even if the
DAG includes disconnected components, topological sorting is possible.
Topological Sorting is mostly used to schedule jobs based on their dependencies. Instruction scheduling,
ordering formula cell evaluation when recomputing formula values in spreadsheets, logic synthesis,
determining the order of compilation tasks to perform in make files, data serialization, and resolving symbol
dependencies in linker are all examples of applications of this type in computer science.
o Finding cycle in a graph: Only directed acyclic graphs may be ordered topologically (DAG). It is
impossible to arrange a circular graph topologically.
o Operation System deadlock detection: A deadlock occurs when one process is waiting while another
holds the requested resource.
o Dependency resolution: Topological Sorting has been proved to be very helpful in Dependency
resolution.
o Sentence Ordering: Given a set of n documents D = {d1, d2, ..., dn}, where document di has vi sentences (vi >= 1), suppose the sentences of di appear in a random order o = [o1, ..., ovi], i.e., the set of vi sentences in random order is {So1, So2, ..., Sovi}. The task is to find the right order of the sentences o* = [o*1, ..., o*vi]. A set of constraints Ci represents the relative ordering between every pair of sentences in di, where |Ci| = vi(vi - 1)/2. For example, if a document has three sentences in the correct order s1 < s2 < s3, then we have the set of constraints {s1 < s2, s1 < s3, s2 < s3}.
The order of the sentences can be represented using a DAG. Here the sentences (Si) represent the
vertices, and the edges represent the ordering between sentences. For example, if we have a directed
edge between S1 to S2, then S1 must come before S2. Topological sort can produce an ordering of
these sentences (Sentence ordering).
o Critical Path Analysis: Critical path analysis is a project management approach used to figure out how long a project should take and how each activity depends on the others. An activity may have some preceding activities, and all of them must be completed before the new activity can begin.
o Course Schedule problem: Topological Sorting has been proved to be very helpful in solving the
Course Schedule problem.
o Other applications like manufacturing workflows, data serialization, and context-free grammar.
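A minimal Python sketch of one standard way to compute a topological ordering, Kahn's algorithm, assuming the DAG is given as a dictionary mapping each vertex to the vertices it points to:

from collections import deque

def topological_sort(graph):
    # Count the incoming edges of every vertex.
    indegree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indegree[w] += 1
    # Start with the vertices that have no dependencies.
    queue = deque(v for v in graph if indegree[v] == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    if len(order) != len(graph):
        raise ValueError("graph contains a cycle, so no topological order exists")
    return order

# Example: task A must come before B and C, and B before C.
print(topological_sort({'A': ['B', 'C'], 'B': ['C'], 'C': []}))  # ['A', 'B', 'C']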
Prim's Algorithm
In this article, we will discuss the prim's algorithm. Along with the algorithm, we will also see the complexity,
working, example, and implementation of prim's algorithm.
Before starting the main topic, we should discuss the basic and important terms such as spanning tree and
minimum spanning tree.
Minimum Spanning tree - Minimum spanning tree can be defined as the spanning tree in which the sum of
the weights of the edge is minimum. The weight of the spanning tree is the sum of the weights given to the
edges of the spanning tree.
Prim's Algorithm is a greedy algorithm that is used to find the minimum spanning tree of a graph. Prim's algorithm finds the subset of edges that includes every vertex of the graph such that the sum of the weights of the edges is minimized.
Prim's algorithm starts with a single node and, at every step, explores all the adjacent nodes and their connecting edges. The edges with the minimal weights that cause no cycles in the graph get selected.
Prim's algorithm is a greedy algorithm that starts from one vertex and continues to add edges with the smallest weight until all vertices are reached. The steps to implement Prim's algorithm are given as follows -
1. First, choose an arbitrary vertex as the starting vertex of the MST.
2. Among all the edges that connect a vertex already in the MST to a vertex not yet in the MST, add the edge with the minimum weight (and the new vertex) to the MST.
3. Repeat step 2 until all the vertices of the graph are included in the MST.
Now, let's see the working of prim's algorithm using an example. It will be easier to understand the prim's
algorithm using an example.
Suppose, a weighted graph is -
Step 1 - First, we have to choose a vertex from the above graph. Let's choose B.
Step 2 - Now, we have to choose and add the shortest edge from vertex B. There are two edges from vertex
B that are B to C with weight 10 and edge B to D with weight 4. Among the edges, the edge BD has the
minimum weight. So, add it to the MST.
Step 3 - Now, again, choose the edge with the minimum weight among all the other edges. In this case, the
edges DE and CD are such edges. Add them to MST and explore the adjacent of C, i.e., E and A. So, select the
edge DE and add it to the MST.
Step 4 - Now, select the edge CD, and add it to the MST.
Step 5 - Now, choose the edge CA. Here, we cannot select the edge CE as it would create a cycle to the graph.
So, choose the edge CA and add it to the MST.
So, the graph produced in step 5 is the minimum spanning tree of the given graph. The cost of the MST is
given below -
Algorithm
Now, let's see the time complexity of Prim's algorithm. The running time of the prim's algorithm depends
upon using the data structure for the graph and the ordering of edges. Below table shows some choices -
o Time Complexity
Data structure used for the minimum edge weight        Time Complexity
Adjacency matrix, linear searching                      O(|V|^2)
Adjacency list and binary heap                          O(|E| log |V|)
Adjacency list and Fibonacci heap                       O(|E| + |V| log |V|)
Prim's algorithm can be simply implemented using the adjacency matrix or adjacency list graph representation; adding the edge with the minimum weight then requires linearly searching an array of weights, which gives O(|V|^2) running time. It can be improved further by using a heap to find the minimum weight edge in the inner loop of the algorithm.
With a binary heap and an adjacency list, the time complexity of Prim's algorithm is O(E log V), where E is the number of edges and V is the number of vertices.
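A compact Python sketch of Prim's algorithm using the heap-based improvement mentioned above; the adjacency-list format (vertex mapped to a list of (neighbour, weight) pairs) is an assumption for the example.

import heapq

def prim_mst(graph, start):
    # graph: dict {vertex: [(neighbour, weight), ...]} for an undirected weighted graph.
    visited = {start}
    heap = [(w, start, v) for v, w in graph[start]]   # edges leaving the start vertex
    heapq.heapify(heap)
    mst, total = [], 0
    while heap and len(visited) < len(graph):
        w, u, v = heapq.heappop(heap)                  # cheapest edge leaving the tree
        if v in visited:
            continue                                   # would create a cycle, skip it
        visited.add(v)
        mst.append((u, v, w))
        total += w
        for x, wx in graph[v]:
            if x not in visited:
                heapq.heappush(heap, (wx, v, x))
    return mst, total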
Kruskal's Algorithm
In this article, we will discuss Kruskal's algorithm. Here, we will also see the complexity, working, example,
and implementation of the Kruskal's algorithm.
But before moving directly towards the algorithm, we should first understand the basic terms such as
spanning tree and minimum spanning tree.
Minimum Spanning tree - Minimum spanning tree can be defined as the spanning tree in which the sum of
the weights of the edge is minimum. The weight of the spanning tree is the sum of the weights given to the
edges of the spanning tree.
Kruskal's Algorithm is used to find the minimum spanning tree for a connected weighted graph. The main
target of the algorithm is to find the subset of edges by using which we can traverse every vertex of the
graph. It follows the greedy approach that finds an optimum solution at every stage instead of focusing on a
global optimum.
In Kruskal's algorithm, we start from the edges with the lowest weight and keep adding edges until the goal is reached. The steps to implement Kruskal's algorithm are listed as follows -
1. First, sort all the edges in ascending order of their weights.
2. Pick the edge with the lowest weight and add it to the spanning tree, provided it does not create a cycle.
3. Repeat step 2 until the spanning tree contains (V - 1) edges, where V is the number of vertices.
Now, let's see the working of Kruskal's algorithm using an example. It will be easier to understand Kruskal's
algorithm using an example.
The weight of the edges of the above graph is given in the below table -
Edge AB AC AD AE BC CD DE
Weight 1 7 10 5 3 4 2
Now, sort the edges given above in the ascending order of their weights.
Edge AB DE BC CD AE AC AD
Weight 1 2 3 4 5 7 10
Step 1 - First, add the edge AB with weight 1 to the MST, as it has the minimum weight and does not create any cycle.
Step 2 - Add the edge DE with weight 2 to the MST, as it is not creating the cycle.
Step 3 - Add the edge BC with weight 3 to the MST, as it is not creating any cycle or loop.
Step 4 - Now, pick the edge CD with weight 4 to the MST, as it is not forming the cycle.
Step 5 - After that, pick the edge AE with weight 5. Including this edge will create the cycle, so discard it.
Step 6 - Pick the edge AC with weight 7. Including this edge will create the cycle, so discard it.
Step 7 - Pick the edge AD with weight 10. Including this edge will also create the cycle, so discard it.
So, the final minimum spanning tree obtained from the given weighted graph by using Kruskal's algorithm is
-
Now, the number of edges in the above tree equals the number of vertices minus 1. So, the algorithm stops
here.
Algorithm
Step 1: Create a forest F in such a way that every vertex of the graph is a separate tree.
Step 2: Create a set E that contains all the edges of the graph.
Step 3: Repeat Steps 4 and 5 while E is NOT EMPTY and F is not spanning.
Step 4: Remove an edge from E with minimum weight.
Step 5: IF the edge obtained in Step 4 connects two different trees, then add it to the forest F (combining two trees into one tree), ELSE discard the edge.
Step 6: END
o Time Complexity
The time complexity of Kruskal's algorithm is O(E log E), which is equivalent to O(E log V), where E is the no. of edges and V is the no. of vertices.
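A small Python sketch of Kruskal's algorithm using a disjoint-set (union-find) structure to detect cycles, applied to the edge weights from the example above:

def kruskal_mst(vertices, edges):
    # edges: list of (weight, u, v); a disjoint set (union-find) detects cycles.
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path compression
            v = parent[v]
        return v

    mst, total = [], 0
    for w, u, v in sorted(edges):           # consider edges in ascending order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                        # endpoints lie in different trees: no cycle
            parent[ru] = rv
            mst.append((u, v, w))
            total += w
    return mst, total

# Edge weights from the worked example: AB=1, DE=2, BC=3, CD=4, AE=5, AC=7, AD=10.
edges = [(1, 'A', 'B'), (2, 'D', 'E'), (3, 'B', 'C'), (4, 'C', 'D'),
         (5, 'A', 'E'), (7, 'A', 'C'), (10, 'A', 'D')]
print(kruskal_mst('ABCDE', edges))   # picks AB, DE, BC, CD with total weight 10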
Now that we know some basic Graphs concepts let's dive into understanding the concept of Dijkstra's
Algorithm.
Ever wondered how Google Maps finds the shortest and fastest route between two places?
Well, the answer is Dijkstra's Algorithm. Dijkstra's Algorithm is a graph algorithm that finds the shortest path from a source vertex to all other vertices in the graph (single-source shortest path). It is a type of greedy algorithm that only works on weighted graphs with positive weights. The time complexity of Dijkstra's Algorithm is O(V^2) with the adjacency matrix representation of the graph. This time complexity can be reduced to O((V + E) log V) with an adjacency list representation of the graph, where V is the number of vertices and E is the number of edges in the graph.
Dijkstra's Algorithm was designed and published by Dr. Edsger W. Dijkstra, a Dutch Computer Scientist,
Software Engineer, Programmer, Science Essayist, and Systems Scientist.
During an Interview with Philip L. Frana for the Communications of the ACM journal in the year 2001, Dr.
Edsger W. Dijkstra revealed:
"What is the shortest way to travel from Rotterdam to Groningen, in general: from given city to given city?
It is the algorithm for the shortest path, which I designed in about twenty minutes. One morning I was
shopping in Amsterdam with my young fiancée, and tired, we sat down on the café terrace to drink a cup of
coffee and I was just thinking about whether I could do this, and I then designed the algorithm for the
shortest path. As I said, it was a twenty-minute invention. In fact, it was published in '59, three years later.
The publication is still readable, it is, in fact, quite nice. One of the reasons that it is so nice was that I designed
it without pencil and paper. I learned later that one of the advantages of designing without pencil and paper
is that you are almost forced to avoid all avoidable complexities. Eventually, that algorithm became to my
great amazement, one of the cornerstones of my fame."
Dijkstra thought about the shortest path problem while working as a programmer at the Mathematical
Centre in Amsterdam in 1956 to illustrate the capabilities of a new computer known as ARMAC. His goal was
to select both a problem and a solution (produced by the computer) that people with no computer
background could comprehend. He developed the shortest path algorithm and later executed it for ARMAC
for a vaguely shortened transportation map of 64 cities in the Netherlands (64 cities, so 6 bits would be
sufficient to encode the city number). A year later, he came across another issue from hardware engineers
operating the next computer of the institute: Minimize the amount of wire required to connect the pins on
the machine's back panel. As a solution, he re-discovered the algorithm called Prim's minimal spanning tree
algorithm and published it in the year 1959.
1. Dijkstra's Algorithm begins at the node we select (the source node), and it examines the graph to find
the shortest path between that node and all the other nodes in the graph.
2. The Algorithm keeps records of the presently acknowledged shortest distance from each node to the
source node, and it updates these values if it finds any shorter path.
3. Once the Algorithm has retrieved the shortest path between the source and another node, that node
is marked as 'visited' and included in the path.
4. The procedure continues until all the nodes in the graph have been included in the path. In this
manner, we have a path connecting the source node to all other nodes, following the shortest
possible path to reach each node.
Understanding the Working of Dijkstra's Algorithm
A graph and source vertex are requirements for Dijkstra's Algorithm. This Algorithm is established on Greedy
Approach and thus finds the locally optimal choice (local minima in this case) at each step of the Algorithm.
Each Vertex in this Algorithm will have two properties defined for it:
1. Visited Property
2. Path Property
Visited Property:
1. The 'visited' property signifies whether or not the node has been visited.
2. We are using this property so that we do not revisit any node.
3. A node is marked visited only when the shortest path has been found.
Path Property:
1. The 'path' property stores the value of the current minimum path to the node.
2. The current minimum path implies the shortest way we have reached this node till now.
3. This property is revised when any neighbor of the node is visited.
4. This property is significant because it will store the final answer for each node.
Initially, we mark all the vertices, or nodes, unvisited as they have yet to be visited. The path to all the nodes
is also set to infinity apart from the source node. Moreover, the path to the source node is set to zero (0).
We then select the source node and mark it as visited. After that, we access all the neighboring nodes of the
source node and perform relaxation on every node. Relaxation is the process of lowering the cost of reaching
a node with the help of another node.
In the process of relaxation, the path of each node is revised to the minimum value amongst the node's
current path, the sum of the path to the previous node, and the path from the previous node to the current
node.
Let us suppose that p[n] is the value of the current path for node n, p[m] is the value of the path up to the previously visited node m, and w is the weight of the edge between the current node and the previously visited one (the edge weight between n and m). Then the relaxed value is p[n] = minimum(p[n], p[m] + w).
We then mark an unvisited node with the least path as visited in every subsequent step and update its
neighbor's paths.
We repeat this procedure until all the nodes in the graph are marked visited.
Whenever we add a node to the visited set, the path to all its neighboring nodes also changes accordingly.
If any node is left unreachable (disconnected component), its path remains 'infinity'. In case the source itself
is a separate component, then the path to all other nodes remains 'infinity'.
The following is the step that we will follow to implement Dijkstra's Algorithm:
Step 1: First, we will mark the source node with a current distance of 0 and set the rest of the nodes to
INFINITY.
Step 2: We will then set the unvisited node with the smallest current distance as the current node, suppose
X.
Step 3: For each neighbor N of the current node X, we will add the current distance of X to the weight of the edge joining X and N. If this sum is smaller than the current distance of N, we set it as the new current distance of N.
Step 4: We will then mark the current node X as visited.
Step 5: We will repeat the process from 'Step 2' if there is any unvisited node left in the graph.
Let us now understand the implementation of the algorithm with the help of an example:
1. We will use the above graph as the input, with node A as the source.
2. First, we will mark all the nodes as unvisited.
3. We will set the path to 0 at node A and INFINITY for all the other nodes.
4. We will now mark source node A as visited and access its neighboring nodes.
Note: We have only accessed the neighboring nodes, not visited them.
5. We will now update the path to node B by 4 with the help of relaxation because the path to
node A is 0 and the path from node A to B is 4, and the minimum((0 + 4), INFINITY) is 4.
6. We will also update the path to node C by 5 with the help of relaxation because the path to
node A is 0 and the path from node A to C is 5, and the minimum((0 + 5), INFINITY) is 5. Both the
neighbors of node A are now relaxed; therefore, we can move ahead.
7. We will now select the next unvisited node with the least path and visit it. Hence, we will visit
node B and perform relaxation on its unvisited neighbors. After performing relaxation, the path to
node C will remain 5, whereas the path to node E will become 11, and the path to node D will
become 13.
8. We will now visit node C, the unvisited node with the least path (5), and perform relaxation on its unvisited neighbours. The path to node E will be updated to 8 (5 + 3), which is smaller than the earlier value 11, while the path to node D remains 13.
9. We will now visit node E and perform relaxation on its neighbouring nodes B, D, and F. Since only node F is unvisited, it will be relaxed. Thus, the path to node B remains 4, the path to node D remains 13, and the path to node F becomes 14 (8 + 6).
10. Now we will visit node D, and only node F could be relaxed. However, the path to node F remains unchanged, i.e., 14.
11. Since only node F is remaining, we will visit it but not perform any relaxation, as all its neighbouring nodes are already visited.
12. Once all the nodes of the graph are visited, the program will end.
1. A = 0
2. B = 4 (A -> B)
3. C = 5 (A -> C)
4. D = 4 + 9 = 13 (A -> B -> D)
5. E = 5 + 3 = 8 (A -> C -> E)
6. F = 5 + 3 + 6 = 14 (A -> C -> E -> F)
o We have to maintain a record of the path distance of every node. Therefore, we can store the path
distance of each node in an array of size n, where n is the total number of nodes.
o Moreover, we want to retrieve the shortest path along with the length of that path. To overcome
this problem, we will map each node to the node that last updated its path length.
o Once the algorithm is complete, we can backtrack the destination node to the source node to retrieve
the path.
o We can use a minimum Priority Queue to retrieve the node with the least path distance in an efficient
way.
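Putting the points above together, here is a minimal Python sketch of Dijkstra's algorithm with a minimum priority queue (heapq) and a predecessor map for backtracking the path; the adjacency-list graph format is an assumption for the example.

import heapq

def dijkstra(graph, source):
    # graph: dict {vertex: [(neighbour, weight), ...]} with non-negative weights.
    dist = {v: float('inf') for v in graph}
    prev = {v: None for v in graph}          # last node that updated each path
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)           # unvisited node with the least path
        if d > dist[u]:
            continue                          # stale queue entry, node already finalised
        for v, w in graph[u]:
            if d + w < dist[v]:               # relaxation step
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, prev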
Bellman Ford Algorithm
The Bellman-Ford algorithm is a single-source shortest path algorithm. It is used to find the shortest distance from a single vertex to all the other vertices of a weighted graph. There are various other algorithms used to find the shortest path, such as Dijkstra's algorithm. If the weighted graph contains negative weight values, Dijkstra's algorithm is not guaranteed to produce the correct answer. In contrast to Dijkstra's algorithm, the Bellman-Ford algorithm guarantees the correct answer even if the weighted graph contains negative weight values, as long as there is no negative weight cycle.
As we can observe in the above graph, some of the weights are negative. The above graph contains 6 vertices, so we will relax all the edges (6 - 1) = 5 times. The loop will iterate 5 times to get the correct answer. If the loop is iterated more than 5 times, the answer will still be the same, i.e., there will be no further change in the distances between the vertices.
Relaxing means: if d(u) + c(u, v) < d(v), then update d(v) = d(u) + c(u, v), where d(u) is the current distance of vertex u and c(u, v) is the cost of the edge (u, v).
To find the shortest path of the above graph, the first step is note down all the edges which are given below:
(A, B), (A, C), (A, D), (B, E), (C, E), (D, C), (D, F), (E, F), (C, B)
Let's consider the source vertex as 'A'; therefore, the distance value at vertex A is 0 and the distance value
at all the other vertices as infinity shown as below:
Since the graph has six vertices so it will have five iterations.
First iteration
Consider the edge (A, B). Denote vertex 'A' as 'u' and vertex 'B' as 'v'. Now use the relaxing formula:
d(u) = 0
d(v) = ∞
c(u , v) = 6
d(v) = 0 + 6 = 6
Consider the edge (A, C). Denote vertex 'A' as 'u' and vertex 'C' as 'v'. Now use the relaxing formula:
d(u) = 0
d(v) = ∞
c(u , v) = 4
d(v) = 0 + 4 = 4
Consider the edge (A, D). Denote vertex 'A' as 'u' and vertex 'D' as 'v'. Now use the relaxing formula:
d(u) = 0
d(v) = ∞
c(u , v) = 5
d(v) = 0 + 5 = 5
Consider the edge (B, E). Denote vertex 'B' as 'u' and vertex 'E' as 'v'. Now use the relaxing formula:
d(u) = 6
d(v) = ∞
c(u , v) = -1
d(v) = 6 - 1= 5
Consider the edge (C, E). Denote vertex 'C' as 'u' and vertex 'E' as 'v'. Now use the relaxing formula:
d(u) = 4
d(v) = 5
c(u , v) = 3
Since (4 + 3) is greater than 5, there would be no updation on the distance value of vertex E.
Consider the edge (D, C). Denote vertex 'D' as 'u' and vertex 'C' as 'v'. Now use the relaxing formula:
d(u) = 5
d(v) = 4
c(u , v) = -2
Since (5 - 2) is less than 4, update
d(v) = 5 - 2 = 3
Consider the edge (D, F). Denote vertex 'D' as 'u' and vertex 'F' as 'v'. Now use the relaxing formula:
d(u) = 5
d(v) = ∞
c(u , v) = -1
d(v) = 5 - 1 = 4
Consider the edge (E, F). Denote vertex 'E' as 'u' and vertex 'F' as 'v'. Now use the relaxing formula:
d(u) = 5
d(v) = ∞
c(u , v) = 3
Since (5 + 3) is greater than 4, so there would be no updation on the distance value of vertex F.
Consider the edge (C, B). Denote vertex 'C' as 'u' and vertex 'B' as 'v'. Now use the relaxing formula:
d(u) = 3
d(v) = 6
c(u , v) = -2
d(v) = 3 - 2 = 1
In the second iteration, we again check all the edges. The first edge is (A, B). Since (0 + 6) is greater than 1 so
there would be no updation in the vertex B.
The next edge is (A, C). Since (0 + 4) is greater than 3 so there would be no updation in the vertex C.
The next edge is (A, D). Since (0 + 5) equals to 5 so there would be no updation in the vertex D.
The next edge is (B, E). Since (1 - 1) equals 0, which is less than 5, update:
d(E) = 1 - 1 = 0
The next edge is (C, E). Since (3 + 3) equals to 6 which is greater than 5 so there would be no updation in the
vertex E.
The next edge is (D, C). Since (5 - 2) equals to 3 so there would be no updation in the vertex C.
The next edge is (D, F). Since (5 - 1) equals to 4 so there would be no updation in the vertex F.
The next edge is (E, F). Since (5 + 3) equals to 8 which is greater than 4 so there would be no updation in the
vertex F.
The next edge is (C, B). Since (3 - 2) equals 1, there would be no updation in the vertex B.
Third iteration
We will perform the same steps as we did in the previous iterations. We will observe that there will be no
updation in the distance of vertices.
Time Complexity
The time complexity of the Bellman-Ford algorithm is O(|E| · (|V| - 1)), i.e., O(V × E).
function bellmanFord(G, S)
  for each vertex V in G
    distance[V] <- infinite
    previous[V] <- NULL
  distance[S] <- 0

  for each vertex V in G              // relax all edges repeatedly (|V| - 1 passes suffice)
    for each edge (U, V) in G
      tempDistance <- distance[U] + edge_weight(U, V)
      if tempDistance < distance[V]
        distance[V] <- tempDistance
        previous[V] <- U

  for each edge (U, V) in G           // one extra pass detects negative cycles
    if distance[U] + edge_weight(U, V) < distance[V]
      Error: Negative Cycle Exists

  return distance[], previous[]
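A direct Python rendering of the pseudocode above, assuming the graph is given as a list of (u, v, weight) tuples:

def bellman_ford(vertices, edges, source):
    # edges: list of (u, v, weight); relax every edge |V| - 1 times.
    dist = {v: float('inf') for v in vertices}
    prev = {v: None for v in vertices}
    dist[source] = 0
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                prev[v] = u
    # One extra pass: any further improvement means a negative weight cycle exists.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("graph contains a negative weight cycle")
    return dist, prev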
o The Bellman-Ford algorithm does not produce a correct answer if the graph contains a cycle whose total edge weight is negative (a negative weight cycle). Let's understand this property through an example. Consider the below graph.
o In the above graph, we consider vertex 1 as the source vertex and give it the value 0. We give an infinity value to the other vertices, as shown below:
First iteration
Consider the edge (1, 3). Denote vertex '1' as 'u' and vertex '3' as 'v'. Now use the relaxing formula:
d(u) = 0
d(v) = ∞
c(u , v) = 5
d(v) = 0 + 5 = 5
Consider the edge (1, 2). Denote vertex '1' as 'u' and vertex '2' as 'v'. Now use the relaxing formula:
d(u) = 0
d(v) = ∞
c(u , v) = 4
d(v) = 0 + 4 = 4
Consider the edge (3, 2). Denote vertex '3' as 'u' and vertex '2' as 'v'. Now use the relaxing formula:
d(u) = 5
d(v) = 4
c(u , v) = 7
Since (5 + 7) is greater than 4, there would be no updation in the vertex 2.
Consider the edge (2, 4). Denote vertex '2' as 'u' and vertex '4' as 'v'. Now use the relaxing formula:
d(u) = 4
d(v) = ∞
c(u , v) = 7
d(v) = 4 + 7 = 11
Consider the edge (4, 3). Denote vertex '4' as 'u' and vertex '3' as 'v'. Now use the relaxing formula:
d(u) = 11
d(v) = 5
c(u , v) = -15
d(v) = 11 - 15 = -4
Second iteration
Now, again we will check all the edges. The first edge is (1, 3). Since (0 + 5) equals to 5 which is greater than
-4 so there would be no updation in the vertex 3.
The next edge is (1, 2). Since (0 + 4) equals to 4 so there would be no updation in the vertex 2.
The next edge is (3, 2). Since (-4 + 7) equals 3, which is less than 4, update:
d(2) = -4 + 7 = 3
The next edge is (2, 4). Since (3 + 7) equals 10, which is less than 11, update:
d(4) = 3 + 7 = 10
The next edge is (4, 3). Since (10 - 15) equals -5, which is less than -4, update:
d(3) = 10 - 15 = -5
Third iteration
Now again we will check all the edges. The first edge is (1, 3). Since (0 + 5) equals to 5 which is greater than
-5 so there would be no updation in the vertex 3.
The next edge is (1, 2). Since (0 + 4) equals to 4 which is greater than 3 so there would be no updation in the
vertex 2.
The next edge is (3, 2). Since (-5 + 7) equals 2, which is less than 3, update:
d(2) = -5 + 7 = 2
The next edge is (2, 4). Since (2 + 7) equals 9, which is less than 10, update:
d(4) = 2 + 7 = 9
Therefore, the value at vertex 4 is 9.
The next edge is (4, 3). Since (9 - 15) equals -6, which is less than -5, update:
d(3) = 9 - 15 = -6
Since the graph contains 4 vertices, according to the Bellman-Ford algorithm there would be only 3 iterations. If we try to perform a 4th iteration on the graph, the distances of the vertices from the given vertex should not change. If any distance still changes, it means that the graph contains a negative weight cycle and the Bellman-Ford algorithm is not providing the correct answer.
4th iteration
The first edge is (1, 3). Since (0 +5) equals to 5 which is greater than -6 so there would be no change in the
vertex 3.
The next edge is (1, 2). Since (0 + 4) is greater than 2 so there would be no updation.
The next edge is (3, 2). Since (-6 + 7) equals 1, which is less than 2, update:
d(2) = -6 + 7 = 1
In this case, the value of the vertex is updated even in the 4th iteration. So, we conclude that the Bellman-Ford algorithm does not work when the graph contains a negative weight cycle.
Floyd-Warshall Algorithm
The Floyd-Warshall algorithm is a dynamic programming algorithm used to find the shortest paths in a weighted graph, including graphs with negative edge weights (but no negative weight cycles). The algorithm works by computing the shortest path between every pair of vertices in the graph, using a matrix of intermediate vertices to keep track of the best-known route so far.
But before we get started, let us briefly understand what dynamic programming is.
Dynamic programming is a technique used in computer science and mathematics to solve complicated problems by breaking them down into smaller subproblems and solving each subproblem only once. It is an optimization technique that finds the best solution to a problem by reusing the solutions to its subproblems.
The key idea behind dynamic programming is to keep the solutions to the subproblems in memory so they can be reused later while solving larger problems. This reduces the time and space complexity of the algorithm and lets it solve much larger and more complex problems than a brute-force approach could.
There are two main approaches to dynamic programming:
1. Memoization
2. Tabulation
Memoization involves storing the outcome of every subproblem in a cache so that it can be reused later. Tabulation involves building a table of answers to subproblems in a bottom-up manner, beginning with the smallest subproblems and working up to the larger ones. Dynamic programming is used in a wide range of applications, including optimization problems, computational geometry, machine learning, and natural language processing.
Some well-known examples of problems that can be solved using dynamic programming include the Fibonacci sequence, the Knapsack problem, and the shortest path problem.
The Floyd-Warshall algorithm was developed independently by Robert Floyd and Stephen Warshall in 1962. Robert Floyd was a mathematician and computer scientist at IBM's Thomas J. Watson Research Center, while Stephen Warshall was a computer scientist at the University of California, Berkeley. The algorithm was originally developed for use in the field of operations research, where it was used to solve the all-pairs shortest path problem in directed graphs with positive or negative edge weights. The problem was of great interest in operations research, as it has many applications in transportation, communication, and logistics.
Floyd first presented the algorithm in a technical report titled "Algorithm 97: Shortest Path" in 1962. Warshall independently discovered the algorithm shortly afterwards and published it in his own technical report, "A Theorem on Boolean Matrices". The algorithm has since become a cornerstone of computer science and is widely used in many areas of research and industry. Its ability to efficiently find the shortest paths between all pairs of vertices in a graph, including graphs with negative edge weights, makes it a valuable tool for solving a wide range of optimization problems.
Working of Floyd-Warshall Algorithm:
1. Initialize a distance matrix D where D[i][j] represents the shortest distance between vertex i and vertex j.
2. Set the diagonal entries of the matrix to 0, and all other entries to infinity.
3. For every edge (u, v) in the graph, update the distance matrix to reflect the weight of the edge: D[u][v] = weight(u, v).
4. For every vertex k in the graph, consider all pairs of vertices (i, j) and check whether the path from i to j through k is shorter than the current best path. If it is, update the distance matrix: D[i][j] = min(D[i][j], D[i][k] + D[k][j]).
5. After all iterations, the matrix D will contain the shortest path distances between all pairs of vertices.
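A short Python sketch of these steps, assuming the graph is given as a dictionary of directed edge weights and the vertices are numbered 0 to n-1:

INF = float('inf')

def floyd_warshall(weights, n):
    # weights: dict {(i, j): w} of directed edge weights; vertices are 0..n-1.
    dist = [[0 if i == j else weights.get((i, j), INF) for j in range(n)]
            for i in range(n)]
    for k in range(n):                         # allow vertex k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist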
Example:
Floyd-Warshall is an algorithm used to find the shortest path between all pairs of vertices in a weighted graph. It works by maintaining a matrix of distances between each pair of vertices and updating this matrix iteratively until the shortest paths are found.
In this graph, the vertices are represented by letters (A, B, C, D), and the numbers on the edges represent the weights of those edges.
To apply the Floyd-Warshall algorithm to this graph, we start by initializing a matrix of distances between every pair of vertices. If two vertices are directly connected by an edge, their distance is the weight of that edge. If there is no direct edge between two vertices, their distance is infinite.
In the first iteration of the algorithm, we consider the possibility of using vertex 1 (A) as an intermediate vertex in paths between all pairs of vertices. If the distance from vertex 1 to vertex 2 plus the distance from vertex 2 to vertex 3 is less than the current distance from vertex 1 to vertex 3, then we update the matrix with this new distance. We do this for every possible pair of vertices.
In the second iteration, we consider the possibility of using vertex 2 (B) as an intermediate vertex in paths between all pairs of vertices. We update the matrix in the same manner as before.
In the third iteration, we consider the possibility of using vertex 3 (C) as an intermediate vertex in paths
between all pairs of vertices.
Finally, in the fourth and final iteration, we consider the possibility of using vertex 4 (D) as an intermediate
vertex in paths between all pairs of vertices.
After the fourth iteration, we have got the shortest path between every pair of vertices in the graph. For
example, the shortest path from vertex A to vertex D is 4, which is the value in the matrix at row A and
column D.
Unit 5 : Strings
String Sort:
String sorting involves arranging a collection of strings in a particular order. The most common ordering for
strings is lexicographic (dictionary) order, where strings are compared character by character. You can use established
sorting algorithms like quicksort or mergesort to achieve this. Here's a simple example in Python:
strings = ["banana", "cherry", "apple"]   # example data; any list of strings works
sorted_strings = sorted(strings)
print(sorted_strings)                     # ['apple', 'banana', 'cherry']
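If a different ordering is needed, the same built-in sort accepts a key function; the list below is illustrative data, and str.lower gives a case-insensitive order:

strings = ["Banana", "apple", "cherry"]
print(sorted(strings))                  # plain lexicographic order: uppercase letters sort before lowercase
print(sorted(strings, key=str.lower))   # case-insensitive lexicographic order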
Tries:
A trie is a tree-like data structure that is used to store a dynamic set of strings. It is particularly efficient for tasks like
prefix matching. Each node in a trie represents a character, and the path from the root to a node forms a string. Tries
are commonly used in spell checkers and IP routers.
class TrieNode:
    def __init__(self):
        self.children = {}            # maps a character to the child TrieNode
        self.is_end_of_word = False   # True if a stored word ends at this node

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end_of_word = True

    def search(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                return False
            node = node.children[char]
        return node.is_end_of_word
Example usage:
trie = Trie()
trie.insert("apple")
trie.insert("banana")
print(trie.search("apple"))   # True
print(trie.search("app"))     # False - "app" is only a prefix, not an inserted word
Substring Search:
Finding whether a pattern string occurs inside a larger text is a core string operation. In Python, the simplest check uses the in operator (the value of main_string below is illustrative, since the original text was not preserved):

main_string = "hello world"   # illustrative text
substring = "world"
if substring in main_string:
    print("Substring found!")
else:
    print("Substring not found!")

Several classical pattern-matching algorithms solve the same problem more systematically:
1. Naive (Brute-Force) Algorithm:
Algorithm:
- Slide the pattern along the text one position at a time, comparing it with the text character by character at each alignment.
- If a mismatch is found, move the pattern one position to the right and continue.
Complexity Analysis:
- Time Complexity: O((n-m+1) * m), where n is the length of the text and m is the length of the pattern.
- This algorithm is straightforward but can be inefficient for large texts or patterns.
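A minimal sketch of this brute-force scan; the function name and sample strings are illustrative:

def naive_search(text, pattern):
    n, m = len(text), len(pattern)
    positions = []
    for i in range(n - m + 1):          # try every alignment of the pattern
        if text[i:i + m] == pattern:    # character-by-character comparison
            positions.append(i)
        # on a mismatch the loop simply moves one position to the right
    return positions

print(naive_search("abracadabra", "abra"))   # [0, 7]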
2. Rabin-Karp Algorithm:
Algorithm:
- Utilizes hashing to compare the hash value of the pattern with the hash values of substrings in the text.
- Uses rolling hash to efficiently update the hash value as the window slides.
Complexity Analysis:
- Time Complexity: O(n + m) on average, where n is the length of the text and m is the length of the pattern; frequent hash collisions can degrade this toward O(n * m) in the worst case.
- Hashing can reduce the number of character comparisons, but hash collisions may affect performance.
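A sketch of Rabin-Karp with a simple polynomial rolling hash; the base, modulus, and sample strings are illustrative choices:

def rabin_karp(text, pattern, base=256, mod=10**9 + 7):
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, mod)          # weight of the window's leading character
    p_hash = t_hash = 0
    for i in range(m):                    # hash of the pattern and of the first window
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    positions = []
    for i in range(n - m + 1):
        # verify with a direct comparison to rule out hash collisions
        if p_hash == t_hash and text[i:i + m] == pattern:
            positions.append(i)
        if i < n - m:                     # roll the hash to the next window
            t_hash = ((t_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return positions

print(rabin_karp("abracadabra", "abra"))   # [0, 7]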
3. Knuth-Morris-Pratt (KMP) Algorithm:
Algorithm:
- Preprocesses the pattern to build a prefix (failure) table recording, for each position, the length of the longest proper prefix that is also a suffix.
- On a mismatch, uses the table to shift the pattern without re-examining text characters that have already matched.
Complexity Analysis:
- Time Complexity: O(n + m), where n is the length of the text and m is the length of the pattern.
- KMP avoids unnecessary character comparisons by utilizing the information from the preprocessing step.
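A sketch of KMP built around the prefix (failure) table; function names and sample strings are illustrative:

def build_prefix_table(pattern):
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k    # longest proper prefix of pattern[:i+1] that is also a suffix
    return table

def kmp_search(text, pattern):
    table = build_prefix_table(pattern)
    positions, k = [], 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]              # fall back without re-reading text characters
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):             # full match ending at position i
            positions.append(i - len(pattern) + 1)
            k = table[k - 1]
    return positions

print(kmp_search("abracadabra", "abra"))   # [0, 7]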
- The bad-character table records the rightmost position of each character in the pattern.
Complexity Analysis:
- Time Complexity: O(n + m), where n is the length of the text and m is the length of the pattern.
- This algorithm is particularly efficient for certain cases but may not be as versatile as others.
Complexity Analysis:
- Time Complexity: O(n + m), where n is the length of the text and m is the length of the pattern.
- Boyer-Moore tends to perform well in practice due to its ability to skip large portions of the text based on the
mismatched character.
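The table of rightmost character positions mentioned above drives the bad-character heuristic. Below is a minimal sketch of the simplified Horspool variant (bad-character rule only, not the full Boyer-Moore with the good-suffix rule); the function name and sample strings are illustrative:

def horspool_search(text, pattern):
    n, m = len(text), len(pattern)
    # shift table: for each character of pattern[:-1], distance from its rightmost occurrence to the end
    shift = {ch: m - idx - 1 for idx, ch in enumerate(pattern[:-1])}
    positions, i = [], 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            positions.append(i)
        # skip ahead based on the text character aligned with the pattern's last position
        i += shift.get(text[i + m - 1], m)
    return positions

print(horspool_search("abracadabra", "abra"))   # [0, 7]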
Regular Expressions:
Regular expressions (regex or regexp) are powerful tools for pattern matching and text manipulation. They provide a
concise and flexible way to search, match, and manipulate strings. Here are some key concepts:
1. Basic Syntax:
- `.`: Matches any single character.
2. Anchors:
- `^`: Matches the start of a line.
3. Character Classes:
- `[a-z]`: Matches any lowercase letter.
4. Quantifiers:
- `{n}`: Matches exactly n occurrences.
5. Escape Characters:
- `\`: Escapes a special character, allowing it to be treated as a literal.
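A short example using Python's standard re module; the pattern and sample strings are illustrative and simply combine the pieces listed above:

import re

# ^ anchors at the start of the string, [a-z]{3} matches exactly three lowercase letters,
# \. matches a literal dot, and the final . matches any single character.
pattern = r"^[a-z]{3}\.."
for line in ["abc.x", "Abc.x", "ab.x"]:
    if re.match(pattern, line):
        print(line, "matches")
    else:
        print(line, "does not match")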
Data Compression:
2. Huffman Coding:
- A variable-length encoding algorithm that assigns shorter codes to more frequent symbols and longer codes to
less frequent symbols.
- It builds a binary tree in which the leaves represent the symbols to be encoded.
- The more frequent a symbol, the closer its leaf is to the root of the tree, and hence the shorter its code.
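A compact sketch of Huffman code construction using Python's heapq; the symbol frequencies are illustrative:

import heapq

def huffman_codes(freq):
    # each heap entry: (total frequency, tie-breaker, {symbol: code built so far})
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)       # the two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}          # left branch gets bit 0
        merged.update({s: "1" + c for s, c in right.items()})   # right branch gets bit 1
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

print(huffman_codes({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}))

With these illustrative frequencies, the most frequent symbol "a" receives a one-bit code while the least frequent symbols receive four-bit codes, which is exactly the behaviour described above.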
Data compression is widely used in various applications, such as file compression (ZIP), image compression (JPEG),
and video compression (H.264). More advanced compression algorithms, like Lempel-Ziv and its variants, are
commonly used in practice.
Remember that the effectiveness of compression depends on the characteristics of the data. Some data types
compress well, while others may not show significant reduction in size. The choice of a compression algorithm
depends on factors such as the type of data and the specific requirements of the application.