UNIT 1 : FUNDAMENTALS OF ALGORITHM
Algorithms
Algorithm:-
An algorithm is a finite list of instructions, executed in sequence, to solve a computational problem.
An algorithm is a step-by-step procedure with a finite number of steps for solving a problem. You can write an algorithm in any language that is understandable to people (programmers).
In real life, an algorithm is like a recipe for cooking a dish.
Write an Algorithm in Natural Language:-
1. Start
2. Read array elements
3. Scan n elements in array A
4. Declare a sum variable
5. Assign zero value in sum variable
6. Add all array elements using for loop
7. Display the sum
8. stop
Write an Algorithm in Pseudo Code:-
1. start
2. Read array A of n elements
3. sum <-- 0
4. for i <-- 1 to n do
5.     sum <-- sum + A[i]
6. display sum
7. stop
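The same algorithm can be written directly in a programming language. Below is a minimal Python sketch of the pseudo code above (the function name sum_array is illustrative):

def sum_array(A):
    # sum <-- 0
    total = 0
    # for i <-- 1 to n do: sum <-- sum + A[i]
    for value in A:
        total = total + value
    # display sum
    return total

print(sum_array([2, 4, 6, 8]))  # prints 20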
Write an Algorithm in Flow Chart:-
Characteristics of Algorithms:-
The characteristics of any algorithm are given below.
1. Input:-An algorithm should have one or more inputs.
2. Output:-An algorithm must have at least one output.
3. Definiteness:- Every statement in an algorithm should be definite. It means every statement should be unambiguous: each statement must be clear, and there should not be more than one way to interpret a single statement in the given algorithm.
4. Finiteness:-An algorithm should have a finite number of steps(instructions) to solve the problems and
get a valid output.
5. Effectiveness:-An algorithm should be effective and produce a well-defined output for any program. Effectiveness means the algorithm uses a good method that produces the desired output with less time and less storage.
Difference between Algorithms and Programs:-
There are some differences between algorithms and programs, as given below:-
Data abstraction
Data abstraction is the programming process of creating a data type, usually a class, that hides the details of the data representation in order to make the data type easier to work with. Data abstraction involves creating a representation for data that separates the interface from the implementation, so a programmer or user only has to understand the interface (the commands to use) and not how the internal structure of the data is represented or implemented.
Stack
A Stack is a linear data structure that follows the LIFO (Last-In-First-Out) principle. A stack has only one open end, whereas a queue has two ends (front and rear). It contains only one pointer, the top pointer, which points to the topmost element of the stack. Whenever an element is added to the stack, it is added on the top of the stack, and an element can be deleted only from the top of the stack. In other words, a stack can be defined as a container in which insertion and deletion are done from one end, known as the top of the stack.
Some key points related to stack
o It is called a stack because it behaves like a real-world stack, such as a pile of books.
o A Stack is an abstract data type with a pre-defined capacity, which means that it can store only a limited number of elements.
o It is a data structure that follows a particular order for inserting and deleting elements, and that order can be LIFO or FILO.
Working of Stack
Stack works on the LIFO pattern. Suppose there are five memory blocks in the stack; therefore, the size of the stack is 5.
Suppose we want to store elements in the stack and the stack is currently empty. We take a stack of size 5 and push the elements one by one until the stack becomes full.
Once five elements have been pushed, the stack is full because its size is 5. From these pushes we can observe that the stack gets filled up from the bottom to the top.
When we perform the delete operation on the stack, there is only one way for entry and exit as the other
end is closed. It follows the LIFO pattern, which means that the value entered first will be removed last. In
the above case, the value 5 is entered first, so it will be removed only after the deletion of all the other
elements.
POP operation
o Before deleting an element from the stack, we check whether the stack is empty.
o If we try to delete an element from an empty stack, then the underflow condition occurs.
o If the stack is not empty, we first access the element which is pointed to by the top.
o Once the pop operation is performed, the top is decremented by 1, i.e., top = top - 1.
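A minimal Python sketch of these stack operations, using a list and a pre-defined capacity (the class name and method names are illustrative, not from the original text):

class Stack:
    def __init__(self, capacity):
        self.items = []           # storage for the elements
        self.capacity = capacity  # pre-defined capacity

    def push(self, value):
        # overflow check: the stack can hold only a limited number of elements
        if len(self.items) == self.capacity:
            raise OverflowError("stack overflow")
        self.items.append(value)  # the new element becomes the new top

    def pop(self):
        # underflow check: deleting from an empty stack is not allowed
        if not self.items:
            raise IndexError("stack underflow")
        return self.items.pop()   # remove and return the topmost element

s = Stack(5)
s.push(10); s.push(20); s.push(30)
print(s.pop())  # 30 - the last element pushed is the first one popped (LIFO)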
Queue
1. A queue can be defined as an ordered list which enables insert operations to be performed at one end, called REAR, and delete operations to be performed at the other end, called FRONT.
2. A queue is therefore referred to as a First In First Out (FIFO) list.
3. For example, people waiting in line for a rail ticket form a queue.
Applications of Queue
Queues perform actions on a first in first out basis, which is quite fair for ordering actions. There are various applications of queues, discussed below.
1. Queues are widely used as waiting lists for a single shared resource like a printer, disk, or CPU.
2. Queues are used in the asynchronous transfer of data (where data is not being transferred at the same rate between two processes), e.g., pipes, file IO, sockets.
3. Queues are used as buffers in most applications like MP3 media players, CD players, etc.
4. Queues are used to maintain the playlist in media players in order to add and remove songs from the playlist.
5. Queues are used in operating systems for handling interrupts.
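A small Python sketch of a queue with insertion at the REAR and deletion at the FRONT, based on collections.deque (the variable names are illustrative):

from collections import deque

queue = deque()        # empty queue
queue.append(10)       # insert (enqueue) at the REAR
queue.append(20)
queue.append(30)
print(queue.popleft()) # delete (dequeue) from the FRONT -> 10 (FIFO order)
print(queue.popleft()) # -> 20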
Complexity
Operation    Average case    Worst case
Access       θ(n)            O(n)
Search       θ(n)            O(n)
Insertion    θ(1)            O(1)
Deletion     θ(1)            O(1)
Space complexity (worst case): O(n)
Asymptotic Notations:
Asymptotic notation is a way of comparing functions that ignores constant factors and small input sizes. Three notations are used to describe the running time complexity of an algorithm:
1. Big-oh notation: Big-oh is the formal method of expressing the upper bound of an algorithm's running time. It is a measure of the longest amount of time the algorithm can take. The function f(n) = O(g(n)) [read as "f of n is big-oh of g of n"] if and only if there exist positive constants c and n0 such that
f(n) <= c * g(n) for all n >= n0
Hence, the function g(n) is an upper bound for the function f(n), as g(n) grows at least as fast as f(n).
2. Omega (Ω) notation: Omega expresses the lower bound of an algorithm's running time. The function f(n) = Ω(g(n)) if and only if there exist positive constants c and n0 such that
f(n) >= c * g(n) for all n >= n0
For Example:
f(n) = 8n^2 + 2n - 3 >= 8n^2 - 3
     = 7n^2 + (n^2 - 3) >= 7n^2 for all n >= 2
Thus, f(n) = Ω(n^2) with c = 7.
3. Theta (θ) notation: The function f(n) = θ(g(n)) [read as "f of n is theta of g of n"] if and only if there exist positive constants c1, c2 and n0 such that
c1 * g(n) <= f(n) <= c2 * g(n) for all n >= n0
The Theta notation is more precise than both the big-oh and Omega notations. The function f(n) = θ(g(n)) if g(n) is both an upper and a lower bound of f(n).
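As a quick illustration of these definitions, the small Python check below tests the Omega example numerically on a sample range: for f(n) = 8n^2 + 2n - 3, the constant c = 7 (with n0 = 2) gives a lower bound, and c = 9 gives an upper bound (the constants follow the derivation above; the helper name is illustrative):

def f(n):
    return 8 * n * n + 2 * n - 3

# lower bound: f(n) >= 7 * n^2 for all n >= 2, so f(n) = Omega(n^2)
assert all(f(n) >= 7 * n * n for n in range(2, 1000))
# upper bound: f(n) <= 9 * n^2 for all n >= 1, so f(n) = O(n^2)
assert all(f(n) <= 9 * n * n for n in range(1, 1000))
print("both bounds hold on the tested range, so f(n) = Theta(n^2)")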
What Is Time Complexity
Time complexity is defined in terms of how many basic operations it takes to run a given algorithm, as a function of the length of the input. Time complexity is not a measurement of the wall-clock time it takes to execute a particular algorithm, because factors such as the programming language, operating system, and processing power would then also have to be considered.
When an algorithm is run on a computer, it requires a certain amount of memory space. The amount of memory used by a program during its execution is represented by its space complexity. Because a program requires memory to store the input data and temporary values while running, the space complexity is the sum of the auxiliary space and the input space.
For linear search, the worst case occurs when the element to search for is not present in the array. When x is not present, the search() function compares it with all the elements of arr[] one by one. Therefore, the time complexity of the worst case of linear search is θ(n).
For the average case, we need to predict the distribution of cases. For the linear search problem, assume that all cases are uniformly distributed (including the case when x is not present). So we add the costs of all the cases and divide the sum by (n + 1).
The number of operations in the best case is constant. The best-case time complexity is therefore θ(1). Most of the time, we perform worst-case analysis to analyze algorithms. In the worst-case analysis, we guarantee an upper bound on the execution time of an algorithm, which is useful information.
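A small Python sketch that counts comparisons in a linear search makes the best-case / worst-case distinction concrete (the function name count_comparisons is illustrative):

def count_comparisons(arr, x):
    comparisons = 0
    for value in arr:
        comparisons += 1
        if value == x:
            break           # found: stop comparing
    return comparisons

arr = [7, 3, 9, 1, 5]
print(count_comparisons(arr, 7))   # 1 comparison  -> best case, Theta(1)
print(count_comparisons(arr, 42))  # 5 comparisons -> worst case, Theta(n)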
Divide and Conquer Introduction
Divide and Conquer is an algorithmic pattern. In this design approach, we take a problem on a large input, break the input into smaller pieces, solve the problem on each of the small pieces, and then merge the piecewise solutions into a global solution. This mechanism of solving the problem is called the Divide & Conquer strategy.
A Divide and Conquer algorithm solves a problem using the following three steps:
1. Divide: break the given problem into sub-problems of the same type.
2. Conquer: solve the sub-problems recursively.
3. Combine: merge the solutions of the sub-problems to obtain the solution of the original problem.
Examples: Specific computer algorithms based on the Divide & Conquer approach include merge sort, quick sort, and binary search.
Greedy Method
The greedy method is one of the strategies, like Divide and Conquer, used to solve problems. This method is used for solving optimization problems. An optimization problem is a problem that demands either a maximum or a minimum result. Let's understand this through some terms.
This technique is basically used to determine a feasible solution that may or may not be optimal. A feasible solution is a subset that satisfies the given criteria. The optimal solution is the solution which is the best and most favorable solution in the subset. If more than one solution satisfies the given criteria, all of those solutions are considered feasible, whereas the optimal solution is the best solution among all of them.
o To construct the solution in an optimal way, this algorithm creates two sets where one set contains
all the chosen items, and another set contains the rejected items.
o A Greedy algorithm makes good local choices in the hope that the solution should be either feasible
or optimal.
o Candidate set: A solution that is created from the set is known as a candidate set.
o Selection function: This function is used to choose the candidate or subset which can be added in
the solution.
o Feasibility function: A function that is used to determine whether the candidate or subset can be
used to contribute to the solution or not.
o Objective function: A function is used to assign the value to the solution or the partial solution.
o Solution function: This function is used to indicate whether a complete solution has been reached or not.
o Suppose we have to travel from a source to a destination at the minimum cost, and we have three feasible paths with costs 10, 20, and 5. Since 5 is the minimum cost path, it is the optimal solution. This is a local optimum, and in this way we find the local optimum at each stage in order to build up the global optimal solution.
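The pieces listed above can be put together into a generic greedy loop. The Python sketch below is only an illustration of that structure, applied to the path example (all names such as greedy, feasible, and select are illustrative, not from the original text):

def greedy(candidates, feasible, select, objective, is_complete):
    # Generic greedy skeleton: repeatedly pick the best feasible candidate.
    solution = []
    candidates = list(candidates)
    while candidates and not is_complete(solution):
        best = select(candidates)            # selection function: local best choice
        candidates.remove(best)
        if feasible(solution, best):         # feasibility function
            solution.append(best)            # chosen set
        # otherwise the candidate is rejected
    return solution, objective(solution)     # objective function: value of the solution

# Toy use: pick the cheapest of the three path costs 10, 20 and 5.
paths = [10, 20, 5]
chosen, cost = greedy(
    candidates=paths,
    feasible=lambda sol, c: len(sol) == 0,   # we only need one path
    select=min,                              # greedy choice: minimum cost
    objective=sum,
    is_complete=lambda sol: len(sol) == 1,
)
print(chosen, cost)   # [5] 5 -> the optimal (minimum cost) path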
UNIT 2 : SORTING
Bubble sort Algorithm
In this article, we will discuss the Bubble sort Algorithm. The working procedure of bubble sort is simplest.
This article will be very helpful and interesting to students as they might face bubble sort as a question in
their examinations. So, it is important to discuss the topic.
Bubble sort works on the repeatedly swapping of adjacent elements until they are not in the intended order.
It is called bubble sort because the movement of array elements is just like the movement of air bubbles in
the water. Bubbles in water rise up to the surface; similarly, the array elements in bubble sort move to the
end in each iteration.
Algorithm
In the algorithm given below, suppose arr is an array of n elements. The assumed swap function in the algorithm will swap the values of the given array elements.
1. begin BubbleSort(arr)
2. repeat (n - 1) times
3.    for each pair of adjacent array elements arr[i], arr[i+1]
4.       if arr[i] > arr[i+1]
5.          swap(arr[i], arr[i+1])
6.       end if
7.    end for
8. end repeat
9. return arr
10. end BubbleSort
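A runnable Python version of the pseudo code above might look like this (a sketch; the function name bubble_sort is illustrative):

def bubble_sort(arr):
    n = len(arr)
    for _ in range(n - 1):          # repeat (n - 1) passes
        for i in range(n - 1):      # compare each pair of adjacent elements
            if arr[i] > arr[i + 1]:
                arr[i], arr[i + 1] = arr[i + 1], arr[i]   # swap
    return arr

print(bubble_sort([13, 32, 26, 35, 10]))  # [10, 13, 26, 32, 35]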
Working of Bubble sort Algorithm
To understand the working of the bubble sort algorithm, let's take an unsorted array. We are taking a short array, as we know the complexity of bubble sort is O(n^2).
First Pass
Sorting will start from the initial two elements. Let's compare them to check which is greater.
Here, 32 is greater than 13 (32 > 13), so this pair is already in order. Now, compare 32 with 26.
Here, 26 is smaller than 32. So, swapping is required. After swapping, the new array will look like -
Here, 35 is greater than 32. So, there is no swapping required as they are already sorted.
Here, 10 is smaller than 35, so they are not in order and swapping is required. Now, we reach the end of the array. After the first pass, the array will be -
Second Pass
Here, 10 is smaller than 32. So, swapping is required. After swapping, the array will be -
Third Pass
Here, 10 is smaller than 26. So, swapping is required. After swapping, the array will be -
Fourth Pass
Now, let's see the time complexity of bubble sort in the best case, average case, and worst case. We will also
see the space complexity of bubble sort.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted.
The best-case time complexity of bubble sort is O(n).
o Average Case Complexity - It occurs when the array elements are in jumbled order, neither properly ascending nor properly descending. The average case time complexity of bubble sort is O(n^2).
o Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse order. That means suppose you have to sort the array elements in ascending order, but its elements are in descending order. The worst-case time complexity of bubble sort is O(n^2).
2. Space Complexity
Space Complexity: O(1)
Stable: Yes
o The space complexity of bubble sort is O(1). It is because, in bubble sort, only one extra variable is required for swapping.
o The optimized bubble sort needs one more extra variable (the swapped flag), which is still a constant amount of memory, so its space complexity is also O(1).
In the bubble sort algorithm, comparisons are made even when the array is already sorted. Because of that, the execution time increases.
To solve this, we can use an extra variable swapped. It is set to true if a swap is performed during a pass; otherwise, it remains false.
UNIT 2 : SORTING
It will be helpful because, if after an iteration there is no swapping required, the value of the variable swapped will be false. It means that the elements are already sorted, and no further iterations are required.
This method reduces the execution time and optimizes the bubble sort.
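A sketch of the optimized bubble sort in Python, using the swapped flag described above (the function name is illustrative):

def optimized_bubble_sort(arr):
    n = len(arr)
    for _ in range(n - 1):
        swapped = False                 # no swap has happened in this pass yet
        for i in range(n - 1):
            if arr[i] > arr[i + 1]:
                arr[i], arr[i + 1] = arr[i + 1], arr[i]
                swapped = True          # a swap was required
        if not swapped:                 # pass finished without swaps: already sorted
            break
    return arr

print(optimized_bubble_sort([1, 2, 3, 4, 5]))  # stops after a single pass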
Selection Sort Algorithm
In this article, we will discuss the Selection sort Algorithm. The working procedure of selection sort is also simple. This article will be very helpful and interesting to students as they might face selection sort as a question in their examinations. So, it is important to discuss the topic.
In selection sort, the smallest element is selected from the unsorted array and placed at the first position. After that, the second smallest element is selected and placed at the second position. The process continues until the array is entirely sorted.
Algorithm
1. SELECTION SORT(arr, n)
2.
3. Step 1: Repeat Steps 2 and 3 for i = 0 to n-1
4. Step 2: CALL SMALLEST(arr, i, n, pos)
5. Step 3: SWAP arr[i] with arr[pos]
6. [END OF LOOP]
7. Step 4: EXIT
8.
9. SMALLEST (arr, i, n, pos)
10. Step 1: [INITIALIZE] SET SMALL = arr[i]
11. Step 2: [INITIALIZE] SET pos = i
12. Step 3: Repeat for j = i+1 to n
13. if (SMALL > arr[j])
14. SET SMALL = arr[j]
15. SET pos = j
16. [END OF if]
17. [END OF LOOP]
18. Step 4: RETURN pos
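A Python sketch that mirrors the SELECTION SORT / SMALLEST pseudo code above (function names follow the pseudo code but are otherwise illustrative):

def smallest(arr, i):
    # find the position of the smallest element in arr[i:]
    pos = i
    for j in range(i + 1, len(arr)):
        if arr[j] < arr[pos]:
            pos = j
    return pos

def selection_sort(arr):
    for i in range(len(arr) - 1):
        pos = smallest(arr, i)                 # position of the minimum of the unsorted part
        arr[i], arr[pos] = arr[pos], arr[i]    # put it at position i
    return arr

print(selection_sort([12, 29, 25, 8]))  # [8, 12, 25, 29]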
Working of Selection sort Algorithm
To understand the working of the Selection sort algorithm, let's take an unsorted array. It will be easier to
understand the Selection sort via an example.
Now, for the first position in the sorted array, the entire array is to be scanned sequentially.
At present, 12 is stored at the first position. After searching the entire array, it is found that 8 is the smallest value.
So, swap 12 with 8. After the first iteration, 8 will appear at the first position in the sorted array.
For the second position, where 29 is stored presently, we again sequentially scan the rest of the items of the unsorted array. After scanning, we find that 12 is the second lowest element in the array, and it should appear at the second position.
Now, swap 29 with 12. After the second iteration, 12 will appear at the second position in the sorted array.
So, after two iterations, the two smallest values are placed at the beginning in a sorted way.
The same process is applied to the rest of the array elements. Now, we are showing a pictorial representation
of the entire sorting process.
Now, let's see the time complexity of selection sort in best case, average case, and in worst case. We will
also see the space complexity of the selection sort.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted. The best-case time complexity of selection sort is O(n^2).
o Average Case Complexity - It occurs when the array elements are in jumbled order, neither properly ascending nor properly descending. The average case time complexity of selection sort is O(n^2).
o Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse order. That means suppose you have to sort the array elements in ascending order, but its elements are in descending order. The worst-case time complexity of selection sort is O(n^2).
2. Space Complexity
Space Complexity: O(1)
Stable: No (in its usual implementation)
o The space complexity of selection sort is O(1). It is because, in selection sort, only one extra variable is required for swapping.
Insertion Sort Algorithm
In this article, we will discuss the Insertion sort Algorithm. The working procedure of insertion sort is also simple. This article will be very helpful and interesting to students as they might face insertion sort as a question in their examinations. So, it is important to discuss the topic.
Insertion sort works similarly to the sorting of playing cards in hand. It is assumed that the first card is already sorted in the card game, and then we select an unsorted card. If the selected unsorted card is greater than the first card, it will be placed at the right side; otherwise, it will be placed at the left side. Similarly, all unsorted cards are taken and put in their exact place.
o Simple implementation
o Efficient for small data sets
o Adaptive, i.e., it is appropriate for data sets that are already substantially sorted.
Algorithm
The simple steps of achieving the insertion sort are listed as follows -
Step 1 - If the element is the first element, assume that it is already sorted. Return 1.
Step 2 - Pick the next element, and store it separately in a key.
Step 3 - Now, compare the key with all elements in the sorted array.
Step 4 - If the element in the sorted array is smaller than the current element (the key), then move to the next element. Else, shift the greater elements in the array towards the right.
Step 5 - Insert the key at the position found.
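A minimal Python sketch of these steps (the function name insertion_sort is illustrative):

def insertion_sort(arr):
    for i in range(1, len(arr)):        # the first element is assumed sorted
        key = arr[i]                    # pick the next element and store it in key
        j = i - 1
        # shift greater elements of the sorted part one position to the right
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key                # insert the key at the position found
    return arr

print(insertion_sort([12, 31, 25, 8, 32, 17]))  # [8, 12, 17, 25, 31, 32]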
To understand the working of the insertion sort algorithm, let's take an unsorted array. It will be easier to
understand the insertion sort via an example.
Here, 31 is greater than 12. That means both elements are already in ascending order. So, for now, 12 is
stored in a sorted sub-array.
Here, 25 is smaller than 31. So, 31 is not at the correct position. Now, swap 31 with 25. Along with the swap, insertion sort will also check the key against all elements in the sorted sub-array.
For now, the sorted sub-array has only one element, i.e. 12. So, 25 is greater than 12. Hence, the sorted sub-array remains sorted after swapping.
Now, two elements in the sorted array are 12 and 25. Move forward to the next elements that are 31 and 8.
Now, the sorted array has three items that are 8, 12 and 25. Move to the next items that are 31 and 32.
Hence, they are already sorted. Now, the sorted array includes 8, 12, 25 and 31.
Now, let's see the time complexity of insertion sort in best case, average case, and in worst case. We will
also see the space complexity of insertion sort.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted. The best-case time complexity of insertion sort is O(n).
o Average Case Complexity - It occurs when the array elements are in jumbled order, neither properly ascending nor properly descending. The average case time complexity of insertion sort is O(n^2).
o Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse order. That means suppose you have to sort the array elements in ascending order, but its elements are in descending order. The worst-case time complexity of insertion sort is O(n^2).
2. Space Complexity
Space Complexity: O(1)
Stable: Yes
o The space complexity of insertion sort is O(1). It is because, in insertion sort, only one extra variable (the key) is required.
Shell Sort Algorithm
In this article, we will discuss the shell sort algorithm. Shell sort is a generalization of insertion sort, which overcomes the drawbacks of insertion sort by comparing elements separated by a gap of several positions. It is a sorting algorithm that is an extended version of insertion sort. Shell sort improves the average time complexity of insertion sort. Similar to insertion sort, it is a comparison-based and in-place sorting algorithm. Shell sort is efficient for medium-sized data sets.
One commonly used gap (interval) sequence is given by the recurrence
h = h * 3 + 1
where 'h' is the interval, with initial value 1.
Algorithm
The simple steps of achieving the shell sort are listed as follows -
Step 1 - Choose an initial gap (for example, n/2).
Step 2 - Divide the list into sub-lists of elements that are the chosen gap apart, and sort each sub-list using insertion sort.
Step 3 - Reduce the gap and repeat Step 2 until the gap becomes 1, at which point a final insertion sort pass leaves the list sorted.
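A Python sketch of shell sort using the N/2, N/4, ..., 1 interval sequence described below (the function name shell_sort is illustrative):

def shell_sort(arr):
    n = len(arr)
    gap = n // 2                    # start with the interval n/2
    while gap > 0:
        # gapped insertion sort: sort elements that are 'gap' positions apart
        for i in range(gap, n):
            key = arr[i]
            j = i
            while j >= gap and arr[j - gap] > key:
                arr[j] = arr[j - gap]
                j -= gap
            arr[j] = key
        gap //= 2                   # reduce the interval: n/2, n/4, ..., 1
    return arr

print(shell_sort([33, 31, 40, 8, 12, 17, 25, 42]))  # [8, 12, 17, 25, 31, 33, 40, 42]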
Working of Shell sort Algorithm
To understand the working of the shell sort algorithm, let's take an unsorted array. It will be easier to understand shell sort via an example.
We will use the original sequence of shell sort, i.e., N/2, N/4, ..., 1, as the intervals.
In the first loop, n is equal to 8 (the size of the array), so the elements are lying at an interval of 4 (n/2 = 4). Elements will be compared and swapped if they are not in order.
Here, in the first loop, the element at the 0th position will be compared with the element at 4th position. If
the 0th element is greater, it will be swapped with the element at 4th position. Otherwise, it remains the
same. This process will continue for the remaining elements.
At the interval of 4, the sublists are {33, 12}, {31, 17}, {40, 25}, {8, 42}.
Now, we have to compare the values in every sub-list. After comparing, we have to swap them if required in
the original array. After comparing and swapping, the updated array will look as follows -
In the second loop, elements are lying at the interval of 2 (n/4 = 2), where n = 8.
Now, we are taking the interval of 2 to sort the rest of the array. With an interval of 2, two sublists will be
generated - {12, 25, 33, 40}, and {17, 8, 31, 42}.
Now, we again have to compare the values in every sub-list. After comparing, we have to swap them if
required in the original array. After comparing and swapping, the updated array will look as follows -
In the third loop, elements are lying at the interval of 1 (n/8 = 1), where n = 8. At last, we use the interval of
value 1 to sort the rest of the array elements. In this step, shell sort uses insertion sort to sort the array
elements.
Shell sort complexity
Now, let's see the time complexity of shell sort in the best case, average case, and worst case. We will also see the space complexity of shell sort.
1. Time Complexity
With Shell's original gap sequence (n/2, n/4, ..., 1), the worst-case time complexity of shell sort is O(n^2); better gap sequences improve this.
2. Space Complexity
Space Complexity: O(1)
Stable: No
Merge Sort Algorithm
In this article, we will discuss the merge sort Algorithm. Merge sort is a sorting technique that follows the divide and conquer approach. This article will be very helpful and interesting to students as they might face merge sort as a question in their examinations. In coding or technical interviews for software engineers, sorting algorithms are widely asked. So, it is important to discuss the topic.
Merge sort is similar to the quick sort algorithm as it uses the divide and conquer approach to sort the elements. It is one of the most popular and efficient sorting algorithms. It divides the given list into two equal halves, calls itself for the two halves, and then merges the two sorted halves. We have to define the merge() function to perform the merging.
Algorithm
In the following algorithm, arr is the given array, beg is the index of the first element, and end is the index of the last element of the array.
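A Python sketch of merge sort following this description, with beg and end as the indices of the first and last elements (the names merge_sort and merge are illustrative):

def merge_sort(arr, beg, end):
    if beg < end:
        mid = (beg + end) // 2
        merge_sort(arr, beg, mid)        # sort the left half
        merge_sort(arr, mid + 1, end)    # sort the right half
        merge(arr, beg, mid, end)        # merge the two sorted halves

def merge(arr, beg, mid, end):
    left = arr[beg:mid + 1]
    right = arr[mid + 1:end + 1]
    i = j = 0
    k = beg
    # repeatedly copy the smaller front element of the two halves back into arr
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            arr[k] = left[i]; i += 1
        else:
            arr[k] = right[j]; j += 1
        k += 1
    while i < len(left):                 # copy any remaining left elements
        arr[k] = left[i]; i += 1; k += 1
    while j < len(right):                # copy any remaining right elements
        arr[k] = right[j]; j += 1; k += 1

a = [12, 31, 25, 8, 32, 17, 40, 42]
merge_sort(a, 0, len(a) - 1)
print(a)  # [8, 12, 17, 25, 31, 32, 40, 42]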
To understand the working of the merge sort algorithm, let's take an unsorted array. It will be easier to
understand the merge sort via an example.
According to the merge sort, first divide the given array into two equal halves. Merge sort keeps dividing the
list into equal parts until it cannot be further divided.
As there are eight elements in the given array, so it is divided into two arrays of size 4.
Now, again divide these two arrays into halves. As they are of size 4, so divide them into new arrays of size
2.
Now, again divide these arrays to get the atomic value that cannot be further divided.
In combining, first compare the element of each array and then combine them into another array in sorted
order.
So, first compare 12 and 31, both are in sorted positions. Then compare 25 and 8, and in the list of two
values, put 8 first followed by 25. Then compare 32 and 17, sort them and put 17 first followed by 32. After
that, compare 40 and 42, and place them sequentially.
In the next iteration of combining, we compare the arrays containing two data values each and merge them into arrays of four values in sorted order.
Now, there is a final merging of the arrays. After the final merging of the above arrays, the array will look like -
Now, let's see the time complexity of merge sort in best case, average case, and in worst case. We will also
see the space complexity of the merge sort.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted.
The best-case time complexity of merge sort is O(n*logn).
o Average Case Complexity - It occurs when the array elements are in jumbled order that is not
properly ascending and not properly descending. The average case time complexity of merge sort
is O(n*logn).
o Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse
order. That means suppose you have to sort the array elements in ascending order, but its elements
are in descending order. The worst-case time complexity of merge sort is O(n*logn).
2. Space Complexity
Space Complexity: O(n)
Stable: Yes
o The space complexity of merge sort is O(n). It is because merge sort requires an additional temporary array in which to merge the sorted halves.
Quicksort Algorithm
In this article, we will discuss the Quicksort Algorithm. The working procedure of Quicksort is also simple. This article will be very helpful and interesting to students as they might face quicksort as a question in their examinations. So, it is important to discuss the topic.
Quicksort follows the divide and conquer approach:
Divide: In Divide, first pick a pivot element. After that, partition or rearrange the array into two sub-arrays
such that each element in the left sub-array is less than or equal to the pivot element and each element in
the right sub-array is larger than the pivot element.
Quicksort picks an element as pivot, and then it partitions the given array around the picked pivot element.
In quick sort, a large array is divided into two arrays in which one holds values that are smaller than the
specified value (Pivot), and another array holds the values that are greater than the pivot.
After that, the left and right sub-arrays are also partitioned using the same approach. This continues until only a single element remains in each sub-array.
Choosing the pivot
Picking a good pivot is necessary for a fast implementation of quicksort. However, it is tricky to determine a good pivot. Some of the ways of choosing a pivot are as follows -
o The pivot can be random, i.e. select a random element of the given array as the pivot.
o The pivot can be either the rightmost element or the leftmost element of the given array.
o Select the median as the pivot element.
Algorithm
The quicksort and partition routines are sketched below.
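A hedged Python sketch of quicksort with a partition routine that takes the leftmost element as the pivot, matching the walkthrough below (function names are illustrative):

def partition(arr, low, high):
    pivot = arr[low]                 # leftmost element as pivot
    i = low + 1
    for j in range(low + 1, high + 1):
        if arr[j] < pivot:           # move elements smaller than the pivot to the left part
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    arr[low], arr[i - 1] = arr[i - 1], arr[low]   # place the pivot at its final position
    return i - 1

def quick_sort(arr, low, high):
    if low < high:
        p = partition(arr, low, high)   # pivot index after partitioning
        quick_sort(arr, low, p - 1)     # sort the left sub-array
        quick_sort(arr, p + 1, high)    # sort the right sub-array

a = [24, 9, 29, 14, 19, 27]
quick_sort(a, 0, len(a) - 1)
print(a)  # [9, 14, 19, 24, 27, 29]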
Working of Quicksort Algorithm
To understand the working of quick sort, let's take an unsorted array. It will make the concept clearer and more understandable.
In the given array, we consider the leftmost element as the pivot. So, in this case, a[left] = 24, a[right] = 27 and a[pivot] = 24.
Since the pivot is at the left, the algorithm starts from the right and moves towards the left.
Now, a[pivot] < a[right], so the algorithm moves forward one position towards the left, i.e. -
Because a[pivot] > a[right], the algorithm will swap a[pivot] with a[right], and the pivot moves to the right, as -
Now, a[left] = 19, a[right] = 24, and a[pivot] = 24. Since, pivot is at right, so algorithm starts from left and
moves to right.
Now, a[left] = 9, a[right] = 24, and a[pivot] = 24. As a[pivot] > a[left], so algorithm moves one position to right
as -
Now, a[left] = 29, a[right] = 24, and a[pivot] = 24. As a[pivot] < a[left], so, swap a[pivot] and a[left], now pivot
is at left, i.e. -
Since, pivot is at left, so algorithm starts from right, and move to left. Now, a[left] = 24, a[right] = 29, and
a[pivot] = 24. As a[pivot] < a[right], so algorithm moves one position to left, as -
Now, a[pivot] = 24, a[left] = 24, and a[right] = 14. As a[pivot] > a[right], so, swap a[pivot] and a[right], now
pivot is at right, i.e. -
Now, a[pivot] = 24, a[left] = 14, and a[right] = 24. Pivot is at right, so the algorithm starts from left and move
to right.
Now, a[pivot] = 24, a[left] = 24, and a[right] = 24. So, pivot, left and right are pointing the same element. It
represents the termination of procedure.
Element 24, which is the pivot element, is placed at its exact position.
Elements on the right side of element 24 are greater than it, and the elements on the left side of element 24 are smaller than it.
Now, in a similar manner, quick sort algorithm is separately applied to the left and right sub-arrays. After
sorting gets done, the array will be -
Quicksort complexity
Now, let's see the time complexity of quicksort in best case, average case, and in worst case. We will also
see the space complexity of quicksort.
1. Time Complexity
o Best Case Complexity - The best case occurs when the pivot splits the array into two nearly equal halves at every step. The best-case time complexity of quicksort is O(n*logn).
o Average Case Complexity - The average case time complexity of quicksort is O(n*logn).
o Worst Case Complexity - The worst case occurs when the chosen pivot is always the smallest or largest element, for example when the array is already sorted and the leftmost element is taken as the pivot. The worst-case time complexity of quicksort is O(n^2).
Though the worst-case complexity of quicksort is higher than that of other sorting algorithms such as Merge sort and Heap sort, it is still faster in practice. The worst case rarely occurs in quick sort because, by changing the choice of pivot, it can be implemented in different ways. The worst case in quicksort can be avoided by choosing the right pivot element.
2. Space Complexity
Space Complexity: O(log n) on average for the recursion stack (O(n) in the worst case)
Stable: No
Heap Sort Algorithm
In this article, we will discuss the Heapsort Algorithm. Heap sort processes the elements by creating a min-heap or max-heap using the elements of the given array. A min-heap or max-heap represents the ordering of the array in which the root element holds the minimum or maximum element of the array.
Before knowing more about heap sort, let's first see a brief description of a heap: a heap is a complete binary tree in which every parent node is ordered with respect to its children; in a max-heap each parent is greater than or equal to its children, and in a min-heap each parent is smaller than or equal to its children.
Algorithm
1. HeapSort(arr)
2. BuildMaxHeap(arr)
3. for i = length(arr) to 2
4. swap arr[1] with arr[i]
5. heap_size[arr] = heap_size[arr] - 1
6. MaxHeapify(arr,1)
7. End
BuildMaxHeap(arr)
1. BuildMaxHeap(arr)
2. heap_size(arr) = length(arr)
3. for i = length(arr)/2 to 1
4. MaxHeapify(arr,i)
5. End
MaxHeapify(arr,i)
1. MaxHeapify(arr,i)
2. L = left(i)
3. R = right(i)
4. if L <= heap_size[arr] and arr[L] > arr[i]
5. largest = L
6. else
7. largest = i
8. if R <= heap_size[arr] and arr[R] > arr[largest]
9. largest = R
10. if largest != i
11. swap arr[i] with arr[largest]
12. MaxHeapify(arr,largest)
13. End
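The same logic in runnable Python, written with 0-based indexing instead of the 1-based pseudo code above (a sketch; function names are illustrative):

def max_heapify(arr, heap_size, i):
    left, right = 2 * i + 1, 2 * i + 2        # children of node i (0-based)
    largest = i
    if left < heap_size and arr[left] > arr[largest]:
        largest = left
    if right < heap_size and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        max_heapify(arr, heap_size, largest)  # fix the affected subtree

def build_max_heap(arr):
    for i in range(len(arr) // 2 - 1, -1, -1):
        max_heapify(arr, len(arr), i)

def heap_sort(arr):
    build_max_heap(arr)
    for end in range(len(arr) - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]   # move the current maximum to the end
        max_heapify(arr, end, 0)              # restore the heap on the remaining part
    return arr

print(heap_sort([89, 81, 76, 22, 14, 9, 54, 11]))  # [9, 11, 14, 22, 54, 76, 81, 89]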
In heap sort, basically, there are two phases involved in the sorting of elements. Using the heap sort algorithm, they are as follows -
o The first step includes the creation of a heap by adjusting the elements of the array.
o After the creation of the heap, repeatedly remove the root element of the heap by shifting it to the end of the array, and then restore the heap property among the remaining elements.
Now let's see the working of heap sort in detail by using an example. To understand it more clearly, let's take
an unsorted array and try to sort it using heap sort. It will make the explanation clearer and easier.
First, we have to construct a heap from the given array and convert it into max heap.
After converting the given heap into max heap, the array elements are -
Next, we have to delete the root element (89) from the max heap. To delete this node, we have to swap it
with the last node, i.e. (11). After deleting the root element, we again have to heapify it to convert it into
max heap.
After swapping the array element 89 with 11, and converting the heap into max-heap, the elements of array
are -
In the next step, again, we have to delete the root element (81) from the max heap. To delete this node, we
have to swap it with the last node, i.e. (54). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 81 with 54 and converting the heap into max-heap, the elements of array
are -
In the next step, we have to delete the root element (76) from the max heap again. To delete this node, we
have to swap it with the last node, i.e. (9). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 76 with 9 and converting the heap into max-heap, the elements of array
are -
In the next step, again we have to delete the root element (54) from the max heap. To delete this node, we
have to swap it with the last node, i.e. (14). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 54 with 14 and converting the heap into max-heap, the elements of array
are -
In the next step, again we have to delete the root element (22) from the max heap. To delete this node, we
have to swap it with the last node, i.e. (11). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 22 with 11 and converting the heap into max-heap, the elements of array
are -
In the next step, again we have to delete the root element (14) from the max heap. To delete this node, we
have to swap it with the last node, i.e. (9). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 14 with 9 and converting the heap into max-heap, the elements of array
are -
In the next step, again we have to delete the root element (11) from the max heap. To delete this node, we
have to swap it with the last node, i.e. (9). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 11 with 9, the elements of array are -
Now, heap has only one element left. After deleting it, heap will be empty.
Now, let's see the time complexity of Heap sort in the best case, average case, and worst case. We will also
see the space complexity of Heapsort.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted.
The best-case time complexity of heap sort is O(n logn).
o Average Case Complexity - It occurs when the array elements are in jumbled order that is not
properly ascending and not properly descending. The average case time complexity of heap sort
is O(n log n).
o Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse
order. That means suppose you have to sort the array elements in ascending order, but its elements
are in descending order. The worst-case time complexity of heap sort is O(n log n).
The time complexity of heap sort is O(n logn) in all three cases (best case, average case, and worst case). The
height of a complete binary tree having n elements is logn.
2. Space Complexity
Space Complexity: O(1)
Stable: No
Counting Sort Algorithm
In this article, we will discuss the counting sort Algorithm. Counting sort is a sorting technique that is based on keys lying within a specific range. In coding or technical interviews for software engineers, sorting algorithms are widely asked. So, it is important to discuss the topic.
Algorithm
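A Python sketch of counting sort following the steps described below (the function name counting_sort and the sample array are illustrative):

def counting_sort(arr):
    max_val = max(arr)                  # step 1: find the maximum element
    count = [0] * (max_val + 1)         # step 2: count array of length max + 1, all zeros
    for value in arr:                   # step 3: count each array element
        count[value] += 1
    for i in range(1, max_val + 1):     # step 4: cumulative sum of the counts
        count[i] += count[i - 1]
    output = [0] * len(arr)
    for value in reversed(arr):         # place each element at its correct index
        count[value] -= 1               # decrease its count after placing it
        output[count[value]] = value
    return output

print(counting_sort([1, 4, 2, 4, 0, 1]))  # [0, 1, 1, 2, 4, 4]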
To understand the working of the counting sort algorithm, let's take an unsorted array. It will be easier to
understand the counting sort via an example.
1. Find the maximum element from the given array. Let max be the maximum element.
2. Now, initialize array of length max + 1 having all 0 elements. This array will be used to store the count of
the elements in the given array.
3. Now, we have to store the count of each array element at its corresponding index in the count array. The count of an element is stored as follows - suppose the array element '4' appears two times, so the count of element 4 is 2. Hence, 2 is stored at the 4th position of the count array. If any element is not present in the array, place 0, i.e. suppose element '3' is not present in the array, so 0 will be stored at the 3rd position.
Now, store the cumulative sum of count array elements. It will help to place the elements at the correct
index of the sorted array.
After placing element at its place, decrease its count by one. Before placing element 2, its count was 2, but
after placing it at its correct position, the new count for element 2 is 1.
Now, let's see the time complexity of counting sort in best case, average case, and in worst case. We will
also see the space complexity of the counting sort.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted.
The best-case time complexity of counting sort is O(n + k).
o Average Case Complexity - It occurs when the array elements are in jumbled order that is not
properly ascending and not properly descending. The average case time complexity of counting sort
is O(n + k).
o Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse
order. That means suppose you have to sort the array elements in ascending order, but its elements
are in descending order. The worst-case time complexity of counting sort is O(n + k).
In all above cases, the time complexity of counting sort is same. This is because the algorithm goes
through n+k times, regardless of how the elements are placed in the array.
Counting sort is better than comparison-based sorting techniques because there is no comparison between elements in counting sort. But when the integers are very large, counting sort is a bad choice because an array of that size has to be created.
2. Space Complexity
Space Complexity: O(max)
Stable: Yes
o The space complexity of counting sort is O(max). The larger the range of elements, the larger the space complexity.
Bucket Sort Algorithm
In this article, we will discuss the bucket sort Algorithm. The data items in the bucket sort are distributed in
the form of buckets. In coding or technical interviews for software engineers, sorting algorithms are widely
asked. So, it is important to discuss the topic.
The best- and average-case complexity of bucket sort is O(n + k), and the worst-case complexity of bucket sort is O(n^2), where n is the number of items.
bucketSort(a[], n)
1. Create 'n' empty buckets
2. Do for each array element a[i]
   2.1. Put the array element into a bucket, i.e. insert a[i] into bucket[n*a[i]]
3. Sort the elements of the individual buckets by using insertion sort.
4. At last, gather or concatenate the sorted buckets.
End bucketSort
Bucket Sort(A[])
1. Let B[0....n-1] be a new array
2. n = length[A]
3. for i = 0 to n-1
4.     make B[i] an empty list
5. for i = 1 to n
6.     do insert A[i] into list B[n*A[i]]
7. for i = 0 to n-1
8.     do sort list B[i] with insertion sort
9. Concatenate lists B[0], B[1], ........, B[n-1] together in order
End
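A Python sketch of the scatter-gather idea for values in a known range (here 0 to 25, as in the example below). The bucket[n*a[i]] step in the pseudo code appears to assume values in [0, 1), so this sketch instead maps each value to a bucket by dividing by the bucket width; all names are illustrative:

def bucket_sort(arr, num_buckets=5, max_value=25):
    width = max_value / num_buckets                 # each bucket covers a range of size 5
    buckets = [[] for _ in range(num_buckets)]      # step 1: create empty buckets
    for value in arr:                               # step 2: scatter elements into buckets
        index = min(int(value / width), num_buckets - 1)
        buckets[index].append(value)
    result = []
    for bucket in buckets:
        # step 3: sort each bucket (the text uses insertion sort; Python's built-in
        # sort is used here for brevity), then step 4: gather the sorted buckets
        result.extend(sorted(bucket))
    return result

print(bucket_sort([16, 3, 21, 7, 12, 24, 1, 18]))   # [1, 3, 7, 12, 16, 18, 21, 24]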
Scatter-gather approach
We can understand the Bucket sort algorithm via scatter-gather approach. Here, the given elements are
first scattered into buckets. After scattering, elements in each bucket are sorted using a stable sorting
algorithm. At last, the sorted elements will be gathered in order.
Let's take an unsorted array to understand the process of bucket sort. It will be easier to understand the
bucket sort via an example.
Now, create buckets covering the range 0 to 25. The bucket ranges are 0-5, 5-10, 10-15, 15-20, 20-25. Elements are inserted into the buckets according to the bucket range. Suppose the value of an item is 16, so it will be inserted into the bucket with the range 15-20. Similarly, every item of the array will be inserted accordingly.
Now, sort each bucket individually. The elements of each bucket can be sorted by using any of the stable
sorting algorithms.
Now, let's see the time complexity of bucket sort in best case, average case, and in worst case. We will also
see the space complexity of the bucket sort.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted. In
Bucket sort, best case occurs when the elements are uniformly distributed in the buckets. The
complexity will be better if the elements are already sorted in the buckets.
If we use the insertion sort to sort the bucket elements, the overall complexity will be linear, i.e., O(n
+ k), where O(n) is for making the buckets, and O(k) is for sorting the bucket elements using
algorithms with linear time complexity at best case.
The best-case time complexity of bucket sort is O(n + k).
o Average Case Complexity - It occurs when the array elements are in jumbled order, neither properly ascending nor properly descending. Bucket sort runs in linear time when the elements are uniformly distributed across the buckets. The average case time complexity of bucket sort is O(n + K).
o Worst Case Complexity - In bucket sort, the worst case occurs when the elements are of a close range in the array; because of that, they have to be placed in the same bucket. So, some buckets have more elements than others.
The complexity will get worse when the elements are in reverse order.
The worst-case time complexity of bucket sort is O(n^2).
2. Space Complexity
Space Complexity: O(n + k)
Stable: Yes
Radix Sort Algorithm
In this article, we will discuss the Radix sort Algorithm. Radix sort is a linear sorting algorithm that is used for integers. In Radix sort, digit-by-digit sorting is performed, starting from the least significant digit and moving to the most significant digit.
The process of radix sort works similarly to sorting students' names in alphabetical order. In this case, there are 26 radixes, formed from the 26 letters of the English alphabet. In the first pass, the names of students are grouped according to the ascending order of the first letter of their names. After that, in the second pass, their names are grouped according to the ascending order of the second letter of their names. The process continues until the list is sorted.
Algorithm
1. radixSort(arr)
2. max = largest element in the given array
3. d = number of digits in the largest element (or, max)
4. Now, create d buckets of size 0 - 9
5. for i -> 0 to d
6.     sort the array elements using counting sort (or any stable sort) according to the digits at the ith place
The steps used in the sorting of radix sort are listed as follows -
o First, we have to find the largest element (suppose max) from the given array. Suppose 'x' be the
number of digits in max. The 'x' is calculated because we need to go through the significant places of
all elements.
o After that, go through one by one each significant place. Here, we have to use any stable sorting
algorithm to sort the digits of each significant place.
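A Python sketch of radix sort that uses a counting sort on each digit, as the steps above describe (function names and the sample array are illustrative):

def counting_sort_by_digit(arr, place):
    # stable counting sort on the digit at the given place value (1, 10, 100, ...)
    count = [0] * 10
    output = [0] * len(arr)
    for value in arr:
        count[(value // place) % 10] += 1
    for d in range(1, 10):                   # cumulative counts
        count[d] += count[d - 1]
    for value in reversed(arr):              # traverse backwards to keep the sort stable
        digit = (value // place) % 10
        count[digit] -= 1
        output[count[digit]] = value
    return output

def radix_sort(arr):
    place = 1
    while max(arr) // place > 0:             # one pass per digit of the largest element
        arr = counting_sort_by_digit(arr, place)
        place *= 10
    return arr

print(radix_sort([181, 289, 390, 121, 145, 736, 514, 212]))
# [121, 145, 181, 212, 289, 390, 514, 736]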
Working of Radix sort Algorithm
Now let's see the working of radix sort in detail by using an example. To understand it more clearly, let's take an unsorted array and try to sort it using radix sort. It will make the explanation clearer and easier.
In the given array, the largest element is 736, which has 3 digits. So, the loop will run up to three times (i.e., to the hundreds place). That means three passes are required to sort the array.
Now, first sort the elements on the basis of unit place digits (i.e., x = 0). Here, we are using the counting sort
algorithm to sort the elements.
Pass 1:
In the first pass, the list is sorted on the basis of the digits at 0's place.
Pass 2:
In this pass, the list is sorted on the basis of the next significant digits (i.e., the digits at the 10's place).
Pass 3:
In this pass, the list is sorted on the basis of the next significant digits (i.e., the digits at the 100's place).
Now, let's see the time complexity of Radix sort in best case, average case, and worst case. We will also see
the space complexity of Radix sort.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already sorted.
The best-case time complexity of Radix sort is Ω(n+k).
o Average Case Complexity - It occurs when the array elements are in jumbled order that is not
properly ascending and not properly descending. The average case time complexity of Radix sort
is θ(nk).
o Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse
order. That means suppose you have to sort the array elements in ascending order, but its elements
are in descending order. The worst-case time complexity of Radix sort is O(nk).
Radix sort is a non-comparative sorting algorithm that performs better than comparison-based sorting algorithms. It has linear time complexity, which is better than the O(n logn) of comparison-based algorithms.
2. Space Complexity
Space Complexity: O(n + k)
Stable: Yes
UNIT 3 : SEARCHING
Linear Search Algorithm
Working process
In the linear search algorithm:
Every element is considered a potential match for the key and checked for the same.
If any element is found equal to the key, the search is successful and the index of the element is returned.
If no element is found equal to the key, the search yields "No Match Found".
For Example:
Consider the array { 10, 50, 30, 70, 80, 20, 90, 40 } and the key 30.
Step 1:
Compare the key (30) with the first element, 10. Since they are not equal, take the next element as a potential match.
Compare the key with the next element, 50. Since they are not equal, take the next element as a potential match.
Step 2:
Now, when the key is compared with the next element, 30, the values match. So the linear search algorithm reports a success message and returns the index of the current element.
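A minimal Python sketch of this working process (the function name linear_search is illustrative):

def linear_search(arr, key):
    for index, value in enumerate(arr):
        if value == key:          # element equal to the key: search successful
            return index
    return -1                     # no element matched: "No Match Found"

arr = [10, 50, 30, 70, 80, 20, 90, 40]
print(linear_search(arr, 30))     # 2  (index of the key)
print(linear_search(arr, 99))     # -1 (not present)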
The complexity of linear search can be analyzed in three cases:
1. Best Case
2. Worst Case
3. Average Case
You will learn about each one of them in a bit more detail.
Worst Case Complexity
The element being searched for may be at the last position in the array or not present at all.
In the first case, the search succeeds in 'n' comparisons.
In the second case, the search fails after 'n' comparisons.
Thus, in the worst-case scenario, the linear search algorithm performs O(n) operations.
Advantages: No special data structure is required.
Disadvantages: Not suitable for large data sets.
Binary Search Algorithm
Linear Search and Binary Search are the two popular searching techniques. Here we will discuss the Binary Search Algorithm.
Binary search is a search technique that works efficiently on sorted lists. Hence, to search an element in some list using the binary search technique, we must ensure that the list is sorted.
Binary search follows the divide and conquer approach in which the list is divided into two halves,
and the item is compared with the middle element of the list. If the match is found then, the location
of the middle element is returned. Otherwise, we search into either of the halves depending upon
the result produced through the match.
Algorithm
1. Binary_Search(a, lower_bound, upper_bound, val) // 'a' is the given array, 'lower_bound' is the index of the first array element, 'upper_bound' is the index of the last array element, 'val' is the value to search
2. Step 1: set beg = lower_bound, end = upper_bound, pos = - 1
3. Step 2: repeat steps 3 and 4 while beg <=end
4. Step 3: set mid = (beg + end)/2
5. Step 4: if a[mid] = val
6. set pos = mid
7. print pos
8. go to step 6
9. else if a[mid] > val
10. set end = mid - 1
11. else
12. set beg = mid + 1
13. [end of if]
14. [end of loop]
15. Step 5: if pos = -1
16. print "value is not present in the array"
17. [end of if]
18. Step 6: exit
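The same algorithm in Python (iterative form; the function name binary_search and the sample array are illustrative):

def binary_search(a, val):
    beg, end = 0, len(a) - 1
    while beg <= end:
        mid = (beg + end) // 2        # mid = (beg + end)/2
        if a[mid] == val:
            return mid                # pos = mid
        elif a[mid] > val:
            end = mid - 1             # search the left half
        else:
            beg = mid + 1             # search the right half
    return -1                         # value is not present in the array

a = [10, 12, 24, 29, 39, 40, 51, 56, 69]
print(binary_search(a, 56))   # 7
print(binary_search(a, 11))   # -1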
Working of Binary search
Now, let's see the working of the Binary Search Algorithm.
To understand the working of the Binary search algorithm, let's take a sorted array. It will be easy to understand the working of Binary search with an example.
There are two methods to implement the binary search algorithm -
o Iterative method
o Recursive method
The recursive method of binary search follows the divide and conquer approach.
We have to use the below formula to calculate the mid of the array -
mid = (beg + end)/2
In the given array, beg = 0 and end = 8.
Now, the element to search is found. So algorithm will return the index of the element matched.
1. Time Complexity
o Best Case Complexity - In Binary search, the best case occurs when the element to search for is found in the first comparison, i.e., when the first middle element itself is the element to be searched. The best-case time complexity of Binary search is O(1).
o Average Case Complexity - The average case time complexity of Binary search is O(logn).
o Worst Case Complexity - In Binary search, the worst case occurs when we have to keep reducing the search space until it has only one element. The worst-case time complexity of Binary search is O(logn).
2. Space Complexity
The space complexity of the iterative binary search is O(1).
Advantages:
o It is better than a linear search algorithm since its run time complexity is O(logN).
o At each iteration, the binary search algorithm eliminates half of the list and significantly reduces the search space.
o A variant of the binary search algorithm works even when the array is rotated by some positions and still finds the target element.
Disadvantages:
o The list must be sorted before binary search can be applied.
Binary Search Tree
Before moving directly to the binary search tree, let's first see a brief description of a tree: a tree is a non-linear data structure in which nodes are connected by edges and one node is designated as the root.
A binary search tree follows some order to arrange the elements. In a Binary search tree, the value of the left node must be smaller than the parent node, and the value of the right node must be greater than the parent node. This rule is applied recursively to the left and right subtrees of the root.
In the above figure, we can observe that the root node is 40, all the nodes of the left subtree are smaller than the root node, and all the nodes of the right subtree are greater than the root node.
Similarly, we can see that the left child of the root node is greater than its own left child and smaller than its own right child. So, it also satisfies the property of a binary search tree. Therefore, we can say that the tree in the above image is a binary search tree.
Suppose if we change the value of node 35 to 55 in the above tree, check whether the tree will be
binary search tree or not.
In the above tree, the value of root node is 40, which is greater than its left child 30 but smaller than
right child of 30, i.e., 55. So, the above tree does not satisfy the property of Binary search tree.
Therefore, the above tree is not a binary search tree.
Advantages of Binary search tree
o Searching for an element in a Binary search tree is easy, as we always have a hint of which subtree has the desired element.
o As compared to arrays and linked lists, insertion and deletion operations are faster in a BST.
Suppose the data elements are - 45, 15, 79, 90, 10, 55, 12, 20, 50
o First, we have to insert 45 into the tree as the root of the tree.
o Then, read the next element; if it is smaller than the root node, insert it as the root of the left subtree,
and move to the next element.
o Otherwise, if the element is larger than the root node, then insert it as the root of the right subtree.
Now, let's see the process of creating the Binary search tree using the given data element. The
process of creating the BST is shown below -
As 15 is smaller than 45, so insert it as the root node of the left subtree.
As 79 is greater than 45, so insert it as the root node of the right subtree.
90 is greater than 45 and 79, so it will be inserted as the right subtree of 79.
10 is smaller than 45 and 15, so it will be inserted as the left subtree of 15.
55 is larger than 45 and smaller than 79, so it will be inserted as the left subtree of 79.
12 is smaller than 45 and 15 but greater than 10, so it will be inserted as the right subtree of 10.
20 is smaller than 45 but greater than 15, so it will be inserted as the right subtree of 15.
50 is greater than 45 but smaller than 79 and 55. So, it will be inserted as a left subtree of 55.
Now, the creation of binary search tree is completed. After that, let's move towards the operations
that can be performed on Binary search tree.
We can perform insert, delete and search operations on the binary search tree.
Searching in Binary search tree
Searching means to find or locate a specific element or node in a data structure. In a Binary search tree, searching for a node is easy because elements in a BST are stored in a specific order. The steps of searching for a node in a Binary search tree are listed as follows -
1. First, compare the element to be searched with the root element of the tree.
2. If root is matched with the target element, then return the node's location.
3. If it is not matched, then check whether the item is less than the root element, if it is smaller than the
root element, then move to the left subtree.
4. If it is larger than the root element, then move to the right subtree.
5. Repeat the above procedure recursively until the match is found.
6. If the element is not found or not present in the tree, then return NULL.
Now, let's understand searching in a binary search tree using an example. We are taking the binary search tree formed above. Suppose we have to find node 20 in the below tree.
Step 1: Compare 20 with the root, 45. Since 20 < 45, move to the left subtree (node 15).
Step 2: Compare 20 with 15. Since 20 > 15, move to the right subtree of 15 (node 20).
Step 3: The value 20 matches the current node, so the search is successful and the node's location is returned.
Now, let's see the algorithm to search an element in the Binary search tree.
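A hedged Python sketch of a binary search tree node with insert and search, following the steps above (class and function names are illustrative):

class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(root, value):
    if root is None:                  # empty spot found: place the new node here
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)    # smaller keys go to the left subtree
    else:
        root.right = insert(root.right, value)  # larger keys go to the right subtree
    return root

def search(root, value):
    if root is None:                  # element is not present in the tree
        return None
    if value == root.value:           # matched the current node
        return root
    if value < root.value:            # smaller: search the left subtree
        return search(root.left, value)
    return search(root.right, value)  # larger: search the right subtree

root = None
for key in [45, 15, 79, 90, 10, 55, 12, 20, 50]:
    root = insert(root, key)
print(search(root, 20) is not None)   # True
print(search(root, 100) is not None)  # False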
Deletion in Binary search tree
Now, let's understand how deletion is performed on a binary search tree. We will also see an example of deleting an element from the given tree.
Case 1: When the node to be deleted is a leaf node
It is the simplest case of deleting a node in a BST. Here, we have to replace the leaf node with NULL and simply free the allocated space.
We can see the process of deleting a leaf node from a BST in the below image. In the below image, suppose we have to delete node 90; as the node to be deleted is a leaf node, it will be replaced with NULL, and the allocated space will be freed.
Case 2: When the node to be deleted has only one child
In this case, we have to replace the target node with its child, and then delete the child node. It means that after replacing the target node with its child node, the child node will now contain the value to be deleted. So, we simply have to replace the child node with NULL and free up the allocated space.
We can see the process of deleting a node with one child from a BST in the below image. In the below image, suppose we have to delete node 79; as the node to be deleted has only one child, it will be replaced with its child 55.
So, the replaced node 79 will now be a leaf node that can be easily deleted.
Case 3: When the node to be deleted has two children
This case of deleting a node in a BST is a bit more complex than the other two cases. In such a case, the steps to be followed are: replace the node to be deleted with its inorder successor, and then delete the successor from its original position.
The inorder successor is needed when the right child of the node is not empty. We can obtain the inorder successor by finding the minimum element in the right child of the node.
We can see the process of deleting a node with two children from a BST in the below image. In the below image, suppose we have to delete node 45, which is the root node; as the node to be deleted has two children, it will be replaced with its inorder successor. Now, node 45 will be at the leaf of the tree so that it can be deleted easily.
Insertion in Binary search tree
A new key is always inserted at a leaf position, following the same comparisons as in searching. Now, let's see the process of inserting a node into a BST using an example.
Binary search tree complexity
1. Time Complexity
Operations    Best case     Average case    Worst case
Insertion     O(log n)      O(log n)        O(n)
Deletion      O(log n)      O(log n)        O(n)
Search        O(log n)      O(log n)        O(n)
2. Space Complexity
The space complexity of a binary search tree with n nodes is O(n).
AVL Tree
AVL Tree was invented by G.M. Adelson-Velsky and E.M. Landis in 1962. The tree is named AVL in honour of its inventors.
An AVL Tree can be defined as a height-balanced binary search tree in which each node is associated with a balance factor, which is calculated by subtracting the height of its right sub-tree from that of its left sub-tree.
The tree is said to be balanced if the balance factor of each node is between -1 and 1; otherwise, the tree is unbalanced and needs to be balanced.
If the balance factor of any node is 1, it means that the left sub-tree is one level higher than the right sub-tree.
If the balance factor of any node is 0, it means that the left sub-tree and right sub-tree are of equal height.
If the balance factor of any node is -1, it means that the left sub-tree is one level lower than the right sub-tree.
An AVL tree is given in the following figure. We can see that the balance factor associated with each node is between -1 and +1. Therefore, it is an example of an AVL tree.
Complexity
The average and worst-case time complexity of search, insertion, and deletion in an AVL tree is O(log n); the space complexity is O(n).
SN Operation Description
1 Insertion Insertion in AVL tree is performed in the same way as it is performed in a binary search
tree. However, it may lead to violation in the AVL tree property and therefore the tree
may need balancing. The tree can be balanced by applying rotations.
2 Deletion Deletion can also be performed in the same way as it is performed in a binary search
tree. Deletion may also disturb the balance of the tree therefore, various types of
rotations are used to rebalance the tree.
AVL Rotations
We perform a rotation in an AVL tree only when the balance factor of a node is other than -1, 0, or 1. There are basically four types of rotations: LL rotation, RR rotation, LR rotation, and RL rotation.
Here, node A is the node whose balance factor is other than -1, 0, or 1.
The first two rotations, LL and RR, are single rotations, and the next two, LR and RL, are double rotations. For a tree to become unbalanced, its minimum height must be at least 2. Let us understand each rotation.
1. RR Rotation
When the BST becomes unbalanced because a node is inserted into the right subtree of the right subtree of A, we perform an RR rotation. RR rotation is an anticlockwise rotation applied on the edge below a node having balance factor -2.
In the above example, node A has balance factor -2 because node C is inserted in the right subtree of A's right subtree. We perform the RR rotation on the edge below A.
2. LL Rotation
When the BST becomes unbalanced because a node is inserted into the left subtree of the left subtree of C, we perform an LL rotation. LL rotation is a clockwise rotation applied on the edge below a node having balance factor 2.
In the above example, node C has balance factor 2 because node A is inserted in the left subtree of C's left subtree. We perform the LL rotation on the edge below C.
3. LR Rotation
Double rotations are a bit more involved than the single rotations explained above. LR rotation = RR rotation + LL rotation, i.e., an RR rotation is first performed on the subtree and then an LL rotation is performed on the full tree. By full tree we mean the first node, on the path from the inserted node, whose balance factor is other than -1, 0, or 1.
A node B has been inserted into the right subtree of A, the left subtree of C, because of which C has become an unbalanced node with balance factor 2. This is the LR case, since the inserted node lies in the right subtree of the left subtree of C.
RR rotation is performed on the subtree first. After the RR rotation, node C is still unbalanced (balance factor 2), because the imbalance now lies in the left subtree of the left subtree of C.
Now we perform the LL (clockwise) rotation on the full tree, i.e., on node C. Node C now becomes the right subtree of node B, and A becomes the left subtree of B.
The balance factor of each node is now either -1, 0, or 1, i.e., the BST is balanced.
4. RL Rotation
As already discussed, double rotations are a bit more involved than the single rotations explained above. RL rotation = LL rotation + RR rotation, i.e., an LL rotation is first performed on the subtree and then an RR rotation is performed on the full tree. By full tree we mean the first node, on the path from the inserted node, whose balance factor is other than -1, 0, or 1.
A node B has been inserted into the left subtree of C, the right subtree of A, because of which A has become an unbalanced node with balance factor -2. This is the RL case, since the inserted node lies in the left subtree of the right subtree of A.
LL rotation is performed on the subtree first. After the LL rotation, node A is still unbalanced (balance factor -2), because the imbalance still lies in the right subtree of the right subtree of A.
Now we perform the RR (anticlockwise) rotation on the full tree, i.e., on node A. Node C has now become the right subtree of node B, and node A has become the left subtree of B.
The balance factor of each node is now either -1, 0, or 1, i.e., the BST is balanced.
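As a rough illustration of the single rotations described above, here is a small Python sketch; it assumes each node object has left, right and height attributes (these names are assumptions for the example). The double rotations LR and RL are simply compositions of these two.

def height(node):
    return node.height if node else 0

def update_height(node):
    node.height = 1 + max(height(node.left), height(node.right))

def rotate_right(z):
    # LL case: z is left-heavy, so its left child y becomes the new subtree root.
    y = z.left
    z.left = y.right
    y.right = z
    update_height(z)
    update_height(y)
    return y

def rotate_left(z):
    # RR case: z is right-heavy, so its right child y becomes the new subtree root.
    y = z.right
    z.right = y.left
    y.left = z
    update_height(z)
    update_height(y)
    return y

# LR rotation: rotate_left on z.left, then rotate_right on z.
# RL rotation: rotate_right on z.right, then rotate_left on z.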
1. Insert H, I, J
On inserting the above elements, especially in the case of H, the BST becomes unbalanced as the
Balance Factor of H is -2. Since the BST is right-skewed, we will perform RR Rotation on node H.
2. Insert B, A
On inserting the above elements, especially in the case of A, the BST becomes unbalanced as the Balance Factor of H and I is 2. We consider the first unbalanced node on the path from the last inserted node, i.e., H. Since the BST from H is left-skewed, we will perform LL Rotation on node H.
3. Insert E
On inserting E, the BST becomes unbalanced as the Balance Factor of I is 2. If we travel from E to I, we find that E is inserted in the left subtree of the right subtree of I, so we will perform LR Rotation on node I. LR = RR + LL rotation.
4. Insert C, F, D
On inserting C, F, D, the BST becomes unbalanced as the Balance Factors of B and H are -2. If we travel from D to B, we find that D is inserted in the right subtree of the left subtree of B, so we will perform RL Rotation on node I. RL = LL + RR rotation.
5. Insert G
On inserting G, the BST becomes unbalanced as the Balance Factor of H is 2. If we travel from G to H, we find that G is inserted in the left subtree of the right subtree of H, so we will perform LR Rotation on node I. LR = RR + LL rotation.
6. Insert K
On inserting K, the BST becomes unbalanced as the Balance Factor of I is -2. Since the BST is right-skewed from I to K, we will perform RR Rotation on node I.
7. Insert L
On inserting L, the tree remains balanced, as the Balance Factor of each node is now either -1, 0, or +1. Hence the tree is a balanced AVL tree.
The above tree is a binary search tree. A binary search tree is a tree in which each node on the left side has a lower value than its parent node, and each node on the right side has a higher value than its parent node. In the above tree, n1 is the root node, and n4, n6, n7 are the leaf nodes. The n7 node is the farthest node from the root node. The nodes n4 and n6 are 2 edges away from the root, and there are three edges between the root node and the n7 node. Since n7 is the farthest from the root node, the height of the above tree is 3.
Now we will see whether the above tree is balanced or not. The left subtree contains the nodes n2,
n4, n5, and n7, while the right subtree contains the nodes n3 and n6. The left subtree has two leaf
nodes, i.e., n4 and n7. There is only one edge between the node n2 and n4 and two edges between
the nodes n7 and n2; therefore, node n7 is the farthest from the root node. The height of the left
subtree is 2. The right subtree contains only one leaf node, i.e., n6, and has only one edge; therefore,
the height of the right subtree is 1. The difference between the heights of the left subtree and right
subtree is 1. Since this difference is 1, we can say that the above tree is a height-balanced tree. This process of calculating the difference between the heights should be performed for each node, such as n2, n3, n4, n5, n6 and n7. When we process each node, we find that the height difference is never more than 1, so we can say that the above tree is a balanced binary tree.
In the above tree, n6, n4, and n3 are the leaf nodes, where n6 is the farthest node from the root
node. Three edges exist between the root node and the leaf node; therefore, the height of the above
tree is 3. When we consider n1 as the root node, the left subtree contains the nodes n2, n4, n5, and n6, while the right subtree contains the node n3. In the left subtree, n2 is the root node, and n4 and n6 are leaf nodes. Among n4 and n6, n6 is the farthest node from its root node, and n6 is two edges away; therefore, the height of the left subtree is 2. The right subtree does not have any child on its left or right; therefore, the height of the right subtree is 0. Since the height of the left subtree is 2 and the height of the right subtree is 0, the difference between the heights of the left subtree and right subtree is 2.
According to the definition, the difference between the height of left sub tree and the right subtree
must not be greater than 1. In this case, the difference comes to be 2, which is greater than 1;
therefore, the above binary tree is an unbalanced binary search tree.
The above tree is a binary search tree because all the left-subtree nodes are smaller than their parent node and all the right-subtree nodes are greater than their parent node. Suppose we want to find the value 79 in the above tree. First, we compare 79 with the value of node n1; since 79 is not equal to 35 and is greater than 35, we move to node n3, i.e., 48. Since 79 is not equal to 48 and is greater than 48, we move to the right child of 48. The value of the right child of node 48 is 79, which is equal to the value to be searched. The number of hops required to search the element 79 is 2, and the maximum number of hops required to search any element is 2. The average case to search an element is O(log n).
The above tree is also a binary search tree because all the left-subtree nodes are smaller than their parent node and all the right-subtree nodes are greater than their parent node. Suppose we want to find the value 79 in this tree. First, we compare 79 with node n4, i.e., 13. Since 79 is greater than 13, we move to the right child of node 13, i.e., n2 (21). The value of node n2 is 21, which is smaller than 79, so we again move to the right of node 21. The value of the right child of node 21 is 29. Since 79 is greater than 29, we move to the right child of node 29. The value of the right child of node 29 is 35, which is smaller than 79, so we move to the right child of node 35, i.e., 48. The value 79 is greater than 48, so we move to the right child of node 48. The value of the right child of node 48 is 79, which is equal to the value to be searched. In this case, the number of hops required to search the element is 5, and the worst case is O(n).
If the number of nodes increases, the tree in diagram 1 is more efficient than the tree in diagram 2. Suppose the number of nodes in both of the above trees is 100,000. To search an element in the tree of diagram 2, the time taken is about 100,000 µs, whereas the time taken to search an element in the tree of diagram 1 is about log2(100,000) ≈ 16.6 µs. We can observe the enormous difference in time between the two trees. Therefore, we conclude that a balanced binary tree provides much faster searching than a linear (skewed) tree structure.
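The number of hops counted above is simply the number of levels descended, so a short Python sketch of the search (assuming nodes with val, left and right attributes) makes the O(log n) versus O(n) behaviour visible:

def bst_search(root, key):
    # Each hop moves one level down, so the number of comparisons is bounded
    # by the height of the tree: O(log n) when balanced, O(n) when skewed.
    hops = 0
    node = root
    while node is not None:
        if key == node.val:
            return node, hops
        node = node.left if key < node.val else node.right
        hops += 1
    return None, hops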
Introduction to Hashing
Assume we want to create a system for storing employee records keyed by phone numbers. We also want the following queries to run quickly: insert a phone number and the corresponding information, search a phone number and fetch its information, and delete a phone number and the related information.
We can consider using the following data structures to store information about various phone
numbers.
We must search in a linear fashion for arrays and linked lists, which can be costly in practise. If we
use arrays and keep the data sorted, we can use Binary Search to find a phone number in O(Logn)
time, but insert and delete operations become expensive because we must keep the data sorted.
We get moderate search, insert, and delete times with a balanced binary search tree. All of these
operations will be completed in O(Logn) time.
Another option is to use a direct access table, in which we create a large array and use phone numbers
as indexes. If the phone number is not present, the array entry is NIL; otherwise, the array entry
stores a pointer to the records corresponding to the phone number. In terms of time complexity, this
solution is the best of the bunch; we can perform all operations in O(1) time. To insert a phone
number, for example, we create a record with the phone number's details, use the phone number as
an index, and store the pointer to the newly created record in the table.
This solution has a number of practical drawbacks. The first issue is the amount of extra space required: if a phone number has n digits, we require O(m × 10^n) table space, where m is the size of a pointer to a record. Another issue is that an integer type in a programming language may not be able to hold an n-digit number.
Because of the limitations mentioned above, Direct Access Table cannot always be used. In practise,
Hashing is the solution that can be used in almost all such situations and outperforms the above data
structures such as Array, Linked List, and Balanced BST. We get O(1) search time on average (under
reasonable assumptions) and O(n) in the worst case with hashing. Let's break down what hashing is.
Open Addressing-
In open addressing,
• Unlike separate chaining, all the keys are stored inside the hash table.
• No key is stored outside the hash table.
Techniques used for open addressing are:
• Linear Probing
• Quadratic Probing
• Double Hashing
Insert Operation-
• Hash function is used to compute the hash value for the key to be inserted.
• The hash value is then used as an index to store the key in the hash table.
In case of collision, probing is performed until an empty bucket is found, and the key is then stored in that bucket.
NOTE-
• During insertion, the buckets marked as “deleted” are treated like any other empty bucket.
• During searching, the search is not terminated on encountering the bucket marked as
“deleted”.
• The search terminates only after the required key or an empty bucket is found.
1. Linear Probing-
In linear probing, when a collision occurs at some bucket, we keep probing the following buckets one by one, i.e., hash(x)+1, hash(x)+2, and so on (wrapping around the table), until an empty bucket is found.
Advantage-
• It is easy to compute.
Disadvantage-
• The main problem with linear probing is clustering: consecutive occupied buckets form long runs, which slows down insertion and searching.
Time Complexity-
• In the worst case, searching or inserting a key takes O(n) time.
This is because-
• Even if there is only one element present and all other elements are deleted,
• the "deleted" markers present in the hash table force the search to scan the entire table.
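A minimal Python sketch of open addressing with linear probing, including the "deleted" marker behaviour described in the note above; the class name and default table size are assumptions for illustration.

class LinearProbingHashTable:
    EMPTY, DELETED = object(), object()

    def __init__(self, size=7):
        self.size = size
        self.slots = [self.EMPTY] * size

    def _hash(self, key):
        return key % self.size

    def insert(self, key):
        i = self._hash(key)
        for _ in range(self.size):
            # A bucket marked "deleted" is treated like an empty bucket on insertion.
            if self.slots[i] is self.EMPTY or self.slots[i] is self.DELETED:
                self.slots[i] = key
                return i
            i = (i + 1) % self.size          # probe the next bucket linearly
        raise OverflowError("hash table is full")

    def search(self, key):
        i = self._hash(key)
        for _ in range(self.size):
            if self.slots[i] is self.EMPTY:  # search stops only at a truly empty bucket
                return None
            if self.slots[i] == key:
                return i
            i = (i + 1) % self.size
        return None

    def delete(self, key):
        i = self.search(key)
        if i is not None:
            self.slots[i] = self.DELETED     # mark as deleted, do not empty the bucket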
2. Quadratic Probing-
In quadratic probing, when a collision occurs we probe the buckets at quadratically increasing offsets from the original hash value, i.e., hash(x)+1^2, hash(x)+2^2, hash(x)+3^2, and so on, until an empty bucket is found. This reduces the clustering seen with linear probing.
3. Double Hashing-
In double hashing,
• We use another hash function hash2(x) and, in the ith iteration, probe the bucket (hash(x) + i * hash2(x)) mod table size.
• It requires more computation time, as two hash functions need to be computed.
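For comparison, a small sketch of the probe sequence used by double hashing; hash2 here is just one example of a secondary hash function, chosen so that it never returns 0.

def double_hash_probe(key, i, table_size):
    # Bucket examined in the i-th iteration: (hash1 + i * hash2) mod table_size.
    hash1 = key % table_size
    hash2 = 1 + (key % (table_size - 1))
    return (hash1 + i * hash2) % table_size

# For key 101 and a table of size 7 the probe sequence starts 3, 2, 1, 0, ...
print([double_hash_probe(101, i, 7) for i in range(4)])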
Using the hash function ‘key mod 7’, insert the following sequence of keys in the hash table-
Solution-
The given sequence of keys will be inserted in the hash table as-
Step-01:
Step-02:
Step-03:
Step-04:
Step-05:
Step-06:
Step-07:
Step-08:
• The next key to be inserted in the hash table = 101.
• Bucket of the hash table to which key 101 maps = 101 mod 7 = 3.
• Since bucket-3 is already occupied, so collision occurs.
• Separate chaining handles the collision by creating a linked list to bucket-3.
• So, key 101 will be inserted in bucket-3 of the hash table as-
Problem-
Using the hash function ‘key mod 7’, insert the following sequence of keys in the hash table-
Solution-
The given sequence of keys will be inserted in the hash table as-
Step-01:
Step-02:
Step-03:
Step-04:
Step-05:
Step-06:
Step-07:
Step-08:
Separate Chaining-
Separate Chaining is advantageous when it is required to perform all the following operations on the keys
stored in the hash table-
• Insertion Operation
• Deletion Operation
• Searching Operation
Using the hash function ‘key mod 7’, insert the following sequence of keys in the hash table-
Solution-
The given sequence of keys will be inserted in the hash table as-
Step-01:
Step-02:
Step-03:
Step-04:
Step-05:
• The next key to be inserted in the hash table = 85.
• Bucket of the hash table to which key 85 maps = 85 mod 7 = 1.
• Since bucket-1 is already occupied, so collision occurs.
• To handle the collision, linear probing technique keeps probing linearly until an empty bucket is found.
• The first empty bucket is bucket-2.
• So, key 85 will be inserted in bucket-2 of the hash table as-
Step-06:
Step-07:
Step-08:
• The next key to be inserted in the hash table = 101.
• Bucket of the hash table to which key 101 maps = 101 mod 7 = 3.
• Since bucket-3 is already occupied, so collision occurs.
• To handle the collision, linear probing technique keeps probing linearly until an empty bucket is found.
• The first empty bucket is bucket-5.
• So, key 101 will be inserted in bucket-5 of the hash table as-
Separate Chaining vs. Open Addressing
o Separate chaining: keys are stored inside the hash table as well as outside the hash table. Open addressing: all the keys are stored only inside the hash table; no key is present outside the hash table.
o Separate chaining: the number of keys to be stored can even exceed the size of the hash table. Open addressing: the number of keys to be stored can never exceed the size of the hash table.
o Separate chaining: extra space is required for the pointers that store the keys outside the hash table. Open addressing: no extra space is required.
In open addressing, the value of the load factor always lies between 0 and 1.
This is because-
• In open addressing, all the keys are stored inside the hash table.
• So, the size of the table is always greater than or equal to the number of keys stored in the table.
Unit 4 : GRAPH THEORY
Graph
A graph can be defined as a group of vertices and the edges that connect these vertices. A graph can be seen as a generalization of a tree in which the vertices (nodes) may maintain any complex relationship among them, including cycles, instead of only a parent-child relationship.
Definition
A graph G can be defined as an ordered set G(V, E) where V(G) represents the set of vertices and E(G)
represents the set of edges which are used to connect these vertices.
A Graph G(V, E) with 5 vertices (A, B, C, D, E) and six edges ((A,B), (B,C), (C,E), (E,D), (D,B), (D,A)) is shown in
the following figure.
A graph can be directed or undirected. In an undirected graph, edges are not associated with any direction. An undirected graph is shown in the above figure, since its edges are not attached to any direction. If an edge exists between vertices A and B, then the vertices can be traversed from B to A as well as from A to B.
In a directed graph, edges form an ordered pair. Edges represent a specific path from some vertex A to
another vertex B. Node A is called initial node while node B is called terminal node.
Path
A path can be defined as the sequence of nodes that are followed in order to reach some terminal node V
from the initial node U.
Closed Path
A path is called a closed path if the initial node is the same as the terminal node, i.e., if V0 = VN.
Simple Path
If all the nodes of the path are distinct, with the possible exception that V0 = VN, then such a path P is called a closed simple path.
Cycle
A cycle can be defined as the path which has no repeated edges or vertices except the first and last vertices.
Connected Graph
A connected graph is the one in which some path exists between every two vertices (u, v) in V. There are no
isolated nodes in connected graph.
Complete Graph
A complete graph is one in which every node is connected to all other nodes. A complete graph contains n(n-1)/2 edges, where n is the number of nodes in the graph.
Weighted Graph
In a weighted graph, each edge is assigned with some data such as length or weight. The weight of an edge
e can be given as w(e) which must be a positive (+) value indicating the cost of traversing the edge.
Digraph
A digraph is a directed graph in which each edge of the graph is associated with some direction and the
traversing can be done only in the specified direction.
Loop
An edge whose two end points are the same vertex is called a loop.
Adjacent Nodes
If two nodes u and v are connected via an edge e, then the nodes u and v are called as neighbours or adjacent
nodes.
Degree of the Node
The degree of a node is the number of edges connected to that node. A node with degree 0 is called an isolated node.
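To tie these terms together, here is a small Python sketch of one common graph representation, the adjacency list; the vertex names and weights are made up for illustration.

# An undirected weighted graph stored as an adjacency list (dictionary of lists).
graph = {
    'A': [('B', 1), ('D', 10)],
    'B': [('A', 1), ('C', 3)],
    'C': [('B', 3), ('D', 4)],
    'D': [('A', 10), ('C', 4)],
}

def degree(g, node):
    # The degree of a node is the number of edges incident to it.
    return len(g[node])

print(degree(graph, 'D'))   # 2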
Spanning tree
In this article, we will discuss the spanning tree and the minimum spanning tree. But before moving directly
towards the spanning tree, let's first see a brief description of the graph and its types.
Graph
A graph can be defined as a group of vertices and edges to connect these vertices. The types of graphs are
given as follows -
o Undirected graph: An undirected graph is a graph in which all the edges do not point to any particular
direction, i.e., they are not unidirectional; they are bidirectional. It can also be defined as a graph
with a set of V vertices and a set of E edges, each edge connecting two different vertices.
o Connected graph: A connected graph is a graph in which a path always exists from a vertex to any
other vertex. A graph is connected if we can reach any vertex from any other vertex by following
edges in either direction.
o Directed graph: Directed graphs are also known as digraphs. A graph is a directed graph (or digraph)
if all the edges present between any vertices or nodes of the graph are directed or have a defined
direction.
Spanning Tree
A spanning tree can be defined as a subgraph of an undirected connected graph that includes all the vertices along with the least possible number of edges. If any vertex is missed, it is not a spanning tree. A spanning tree is a subset of the graph that does not have cycles, and it also cannot be disconnected.
A spanning tree consists of (n-1) edges, where 'n' is the number of vertices (or nodes). Edges of the spanning
tree may or may not have weights assigned to them. All the possible spanning trees created from the given
graph G would have the same number of vertices, but the number of edges in the spanning tree would be
equal to the number of vertices in the given graph minus 1.
A complete undirected graph can have n^(n-2) spanning trees, where n is the number of vertices in the graph. For example, if n = 5, the number of possible spanning trees is 5^(5-2) = 5^3 = 125.
Basically, a spanning tree is used to find a minimum path to connect all nodes of the graph. Some of the
common applications of the spanning tree are listed as follows -
o Cluster Analysis
o Civil network planning
o Computer network routing protocol
Now, let's understand the spanning tree with the help of an example.
As discussed above, a spanning tree contains the same number of vertices as the graph, the number of
vertices in the above graph is 5; therefore, the spanning tree will contain 5 vertices. The edges in the
spanning tree will be equal to the number of vertices in the graph minus 1. So, there will be 4 edges in the
spanning tree.
Some of the possible spanning trees that will be created from the above graph are given as follows -
Properties of a spanning tree
o A connected graph can have more than one spanning tree, but every spanning tree has the same number of vertices and exactly (n - 1) edges.
o A spanning tree does not contain any cycle.
o A spanning tree is minimally connected: removing any one edge disconnects it.
o A spanning tree is maximally acyclic: adding any one edge creates a cycle.
So, a spanning tree is a subset of a connected graph G, and there is no spanning tree of a disconnected graph.
A minimum spanning tree can be defined as the spanning tree in which the sum of the weights of the edges is minimum. The weight of the spanning tree is the sum of the weights given to the edges of the spanning tree. In the real world, this weight can represent distance, traffic load, congestion, or any other value.
Let's understand the minimum spanning tree with the help of an example.
The sum of the edges of the above graph is 16. Now, some of the possible spanning trees created from the
above graph are -
So, the minimum spanning tree that is selected from the above spanning trees for the given weighted graph
is -
o Minimum spanning tree can be used to design water-supply networks, telecommunication networks,
and electrical grids.
o It can be used to find paths in the map.
A minimum spanning tree can be found from a weighted graph by using the algorithms given below -
o Prim's Algorithm
o Kruskal's Algorithm
Kruskal's algorithm - This algorithm is also used to find the minimum spanning tree for a connected weighted
graph. Kruskal's algorithm also follows greedy approach, which finds an optimum solution at every stage
instead of focusing on a global optimum.
A DAG for a basic block is a directed acyclic graph with the following labels on its nodes:
1. The leaves of the graph are labeled by unique identifiers, which can be variable names or constants.
2. The interior nodes of the graph are labeled by an operator symbol.
3. Nodes are also given a sequence of identifiers as labels, to store the computed value.
o DAGs are a type of data structure. It is used to implement transformations on basic blocks.
o DAG provides a good way to determine the common sub-expression.
o It gives a picture representation of how the value computed by the statement is used in subsequent
statements.
Method:
Step 1:
For case(i), create node(OP) whose right child is node(z) and left child is node(y).
For case(ii), check whether there is node(OP) with one child node(y).
Output:
For node(x) delete x from the list of identifiers. Append x to attached identifiers list for the node n found in
step 2. Finally set node(x) to n.
Example:
1. S1 := 4 * i
2. S2 := a[S1]
3. S3 := 4 * i
4. S4 := b[S3]
5. S5 := S2 * S4
6. S6 := prod + S5
7. prod := S6
8. S7 := i + 1
9. i := S7
10. if i <= 20 goto (1)
Topological Sorting
A topological sort or topological ordering of a directed graph is a linear ordering of its vertices in which u
occurs before v in the ordering for every directed edge uv from vertex u to vertex v. For example, the graph's vertices could represent jobs to be completed, and the edges could reflect requirements that one job must be completed before another.
In this case, a topological ordering is just a legitimate task sequence. A topological sort is a graph traversal
in which each node v is only visited after all of its dependencies have been visited. If the graph contains no
directed cycles, then it is a directed acyclic graph. Any DAG has at least one topological ordering, and there
exist techniques for building topological orderings in linear time for any DAG.
Topological sorting has many applications, particularly in ranking issues like the feedback arc set. Even if the
DAG includes disconnected components, topological sorting is possible.
Topological Sorting is mostly used to schedule jobs based on their dependencies. Instruction scheduling,
ordering formula cell evaluation when recomputing formula values in spreadsheets, logic synthesis,
determining the order of compilation tasks to perform in make files, data serialization, and resolving symbol
dependencies in linker are all examples of applications of this type in computer science.
o Finding cycle in a graph: Only directed acyclic graphs may be ordered topologically (DAG). It is
impossible to arrange a circular graph topologically.
o Operation System deadlock detection: A deadlock occurs when one process is waiting while another
holds the requested resource.
o Dependency resolution: Topological Sorting has been proved to be very helpful in Dependency
resolution.
o Sentence Ordering: Given a set of n documents D = {d1, d2, ..., dn}, where document di has vi sentences (vi >= 1), suppose the sentences of di appear in a random order o = [o1, ..., ovi], i.e., the set of vi sentences in random order is {So1, So2, ..., Sovi}. The task is to find the right order of the sentences o* = [o*1, ..., o*vi]. A set of constraints Ci represents the relative ordering between every pair of sentences in di, where |Ci| = vi(vi - 1)/2. For example, if a document has three sentences in the correct order s1 < s2 < s3, then we have the set of constraints {s1 < s2, s1 < s3, s2 < s3}.
The order of the sentences can be represented using a DAG. Here the sentences (Si) represent the
vertices, and the edges represent the ordering between sentences. For example, if we have a directed
edge between S1 to S2, then S1 must come before S2. Topological sort can produce an ordering of
these sentences (Sentence ordering).
o Critical Path Analysis: Critical path analysis is a project management approach used to figure out how long a project should take and how each activity depends on the others. An activity may have some preceding activities, and all of them must be completed before the new activity can begin.
o Course Schedule problem: Topological Sorting has been proved to be very helpful in solving the
Course Schedule problem.
o Other applications like manufacturing workflows, data serialization, and context-free grammar.
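A minimal Python sketch of one standard way to compute a topological ordering, Kahn's algorithm, assuming the DAG is given as a dictionary mapping each vertex to the vertices it points to:

from collections import deque

def topological_sort(graph):
    # Count the incoming edges of every vertex.
    indegree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indegree[w] += 1
    # Start with the vertices that have no dependencies.
    queue = deque(v for v in graph if indegree[v] == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    if len(order) != len(graph):
        raise ValueError("graph contains a cycle, so no topological order exists")
    return order

# Example: task A must come before B and C, and B before C.
print(topological_sort({'A': ['B', 'C'], 'B': ['C'], 'C': []}))  # ['A', 'B', 'C']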
Prim's Algorithm
In this article, we will discuss the prim's algorithm. Along with the algorithm, we will also see the complexity,
working, example, and implementation of prim's algorithm.
Before starting the main topic, we should discuss the basic and important terms such as spanning tree and
minimum spanning tree.
Minimum Spanning tree - Minimum spanning tree can be defined as the spanning tree in which the sum of
the weights of the edge is minimum. The weight of the spanning tree is the sum of the weights given to the
edges of the spanning tree.
Prim's Algorithm is a greedy algorithm that is used to find the minimum spanning tree of a graph. Prim's algorithm finds the subset of edges that includes every vertex of the graph such that the sum of the weights of the edges is minimized.
Prim's algorithm starts with a single node and, at every step, explores all the adjacent nodes and their connecting edges. The edges with the minimal weights that cause no cycles in the graph get selected.
Prim's algorithm is a greedy algorithm that starts from one vertex and continues to add edges with the smallest weight until all vertices are reached. The steps to implement Prim's algorithm are given as follows -
1. First, choose an arbitrary vertex as the starting vertex of the MST.
2. Among all the edges that connect a vertex already in the MST to a vertex not yet in the MST, add the edge with the minimum weight (and the new vertex) to the MST.
3. Repeat step 2 until all the vertices of the graph are included in the MST.
Now, let's see the working of prim's algorithm using an example. It will be easier to understand the prim's
algorithm using an example.
Suppose, a weighted graph is -
Step 1 - First, we have to choose a vertex from the above graph. Let's choose B.
Step 2 - Now, we have to choose and add the shortest edge from vertex B. There are two edges from vertex
B that are B to C with weight 10 and edge B to D with weight 4. Among the edges, the edge BD has the
minimum weight. So, add it to the MST.
Step 3 - Now, again, choose the edge with the minimum weight among all the other edges. In this case, the
edges DE and CD are such edges. Add them to MST and explore the adjacent of C, i.e., E and A. So, select the
edge DE and add it to the MST.
Step 4 - Now, select the edge CD, and add it to the MST.
Step 5 - Now, choose the edge CA. Here, we cannot select the edge CE as it would create a cycle to the graph.
So, choose the edge CA and add it to the MST.
So, the graph produced in step 5 is the minimum spanning tree of the given graph. The cost of the MST is
given below -
Algorithm
Now, let's see the time complexity of Prim's algorithm. The running time of the prim's algorithm depends
upon using the data structure for the graph and the ordering of edges. Below table shows some choices -
o Time Complexity
Data structure used for the minimum edge weight        Time Complexity
Adjacency matrix, linear searching                      O(|V|^2)
Adjacency list and binary heap                          O(|E| log |V|)
Adjacency list and Fibonacci heap                       O(|E| + |V| log |V|)
Prim's algorithm can be simply implemented using the adjacency matrix or adjacency list graph representation; adding the edge with the minimum weight then requires linearly searching an array of weights, which gives O(|V|^2) running time. It can be improved further by using a heap to find the minimum weight edge in the inner loop of the algorithm.
With a binary heap and an adjacency list, the time complexity of Prim's algorithm is O(E log V), where E is the number of edges and V is the number of vertices.
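A compact Python sketch of Prim's algorithm using the heap-based improvement mentioned above; the adjacency-list format (vertex mapped to a list of (neighbour, weight) pairs) is an assumption for the example.

import heapq

def prim_mst(graph, start):
    # graph: dict {vertex: [(neighbour, weight), ...]} for an undirected weighted graph.
    visited = {start}
    heap = [(w, start, v) for v, w in graph[start]]   # edges leaving the start vertex
    heapq.heapify(heap)
    mst, total = [], 0
    while heap and len(visited) < len(graph):
        w, u, v = heapq.heappop(heap)                  # cheapest edge leaving the tree
        if v in visited:
            continue                                   # would create a cycle, skip it
        visited.add(v)
        mst.append((u, v, w))
        total += w
        for x, wx in graph[v]:
            if x not in visited:
                heapq.heappush(heap, (wx, v, x))
    return mst, total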
Kruskal's Algorithm
In this article, we will discuss Kruskal's algorithm. Here, we will also see the complexity, working, example,
and implementation of the Kruskal's algorithm.
But before moving directly towards the algorithm, we should first understand the basic terms such as
spanning tree and minimum spanning tree.
Minimum Spanning tree - Minimum spanning tree can be defined as the spanning tree in which the sum of
the weights of the edge is minimum. The weight of the spanning tree is the sum of the weights given to the
edges of the spanning tree.
Kruskal's Algorithm is used to find the minimum spanning tree for a connected weighted graph. The main
target of the algorithm is to find the subset of edges by using which we can traverse every vertex of the
graph. It follows the greedy approach that finds an optimum solution at every stage instead of focusing on a
global optimum.
In Kruskal's algorithm, we start from the edges with the lowest weight and keep adding edges until the goal is reached. The steps to implement Kruskal's algorithm are listed as follows -
1. First, sort all the edges in ascending order of their weights.
2. Pick the edge with the lowest weight and add it to the spanning tree, provided it does not create a cycle.
3. Repeat step 2 until the spanning tree contains (V - 1) edges, where V is the number of vertices.
Now, let's see the working of Kruskal's algorithm using an example. It will be easier to understand Kruskal's
algorithm using an example.
The weight of the edges of the above graph is given in the below table -
Edge AB AC AD AE BC CD DE
Weight 1 7 10 5 3 4 2
Now, sort the edges given above in the ascending order of their weights.
Edge AB DE BC CD AE AC AD
Weight 1 2 3 4 5 7 10
Step 1 - First, add the edge AB with weight 1 to the MST, as it has the minimum weight and does not create any cycle.
Step 2 - Add the edge DE with weight 2 to the MST, as it is not creating the cycle.
Step 3 - Add the edge BC with weight 3 to the MST, as it is not creating any cycle or loop.
Step 4 - Now, pick the edge CD with weight 4 to the MST, as it is not forming the cycle.
Step 5 - After that, pick the edge AE with weight 5. Including this edge will create the cycle, so discard it.
Step 6 - Pick the edge AC with weight 7. Including this edge will create the cycle, so discard it.
Step 7 - Pick the edge AD with weight 10. Including this edge will also create the cycle, so discard it.
So, the final minimum spanning tree obtained from the given weighted graph by using Kruskal's algorithm is
-
Now, the number of edges in the above tree equals the number of vertices minus 1. So, the algorithm stops
here.
Algorithm
Step 1: Create a forest F in such a way that every vertex of the graph is a separate tree.
Step 2: Create a set E that contains all the edges of the graph.
Step 3: Repeat Steps 4 and 5 while E is NOT EMPTY and F is not spanning.
Step 4: Remove an edge from E with minimum weight.
Step 5: IF the edge obtained in Step 4 connects two different trees, then add it to the forest F (combining two trees into one tree), ELSE discard the edge.
Step 6: END
o Time Complexity
The time complexity of Kruskal's algorithm is O(E log E), which is equivalent to O(E log V), where E is the no. of edges and V is the no. of vertices.
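A small Python sketch of Kruskal's algorithm using a disjoint-set (union-find) structure to detect cycles, applied to the edge weights from the example above:

def kruskal_mst(vertices, edges):
    # edges: list of (weight, u, v); a disjoint set (union-find) detects cycles.
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path compression
            v = parent[v]
        return v

    mst, total = [], 0
    for w, u, v in sorted(edges):           # consider edges in ascending order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                        # endpoints lie in different trees: no cycle
            parent[ru] = rv
            mst.append((u, v, w))
            total += w
    return mst, total

# Edge weights from the worked example: AB=1, DE=2, BC=3, CD=4, AE=5, AC=7, AD=10.
edges = [(1, 'A', 'B'), (2, 'D', 'E'), (3, 'B', 'C'), (4, 'C', 'D'),
         (5, 'A', 'E'), (7, 'A', 'C'), (10, 'A', 'D')]
print(kruskal_mst('ABCDE', edges))   # picks AB, DE, BC, CD with total weight 10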
Now that we know some basic Graphs concepts let's dive into understanding the concept of Dijkstra's
Algorithm.
Ever wondered how Google Maps finds the shortest and fastest route between two places?
Well, the answer is Dijkstra's Algorithm. Dijkstra's Algorithm is a graph algorithm that finds the shortest path from a source vertex to all other vertices in the graph (single-source shortest path). It is a type of greedy algorithm that only works on weighted graphs with positive weights. The time complexity of Dijkstra's Algorithm is O(V^2) with the adjacency matrix representation of the graph. This time complexity can be reduced to O((V + E) log V) with an adjacency list representation of the graph, where V is the number of vertices and E is the number of edges in the graph.
Dijkstra's Algorithm was designed and published by Dr. Edsger W. Dijkstra, a Dutch Computer Scientist,
Software Engineer, Programmer, Science Essayist, and Systems Scientist.
During an Interview with Philip L. Frana for the Communications of the ACM journal in the year 2001, Dr.
Edsger W. Dijkstra revealed:
"What is the shortest way to travel from Rotterdam to Groningen, in general: from given city to given city?
It is the algorithm for the shortest path, which I designed in about twenty minutes. One morning I was
shopping in Amsterdam with my young fiancée, and tired, we sat down on the café terrace to drink a cup of
coffee and I was just thinking about whether I could do this, and I then designed the algorithm for the
shortest path. As I said, it was a twenty-minute invention. In fact, it was published in '59, three years later.
The publication is still readable, it is, in fact, quite nice. One of the reasons that it is so nice was that I designed
it without pencil and paper. I learned later that one of the advantages of designing without pencil and paper
is that you are almost forced to avoid all avoidable complexities. Eventually, that algorithm became to my
great amazement, one of the cornerstones of my fame."
Dijkstra thought about the shortest path problem while working as a programmer at the Mathematical
Centre in Amsterdam in 1956 to illustrate the capabilities of a new computer known as ARMAC. His goal was
to select both a problem and a solution (produced by the computer) that people with no computer
background could comprehend. He developed the shortest path algorithm and later executed it for ARMAC
for a vaguely shortened transportation map of 64 cities in the Netherlands (64 cities, so 6 bits would be
sufficient to encode the city number). A year later, he came across another issue from hardware engineers
operating the next computer of the institute: Minimize the amount of wire required to connect the pins on
the machine's back panel. As a solution, he re-discovered the algorithm called Prim's minimal spanning tree
algorithm and published it in the year 1959.
1. Dijkstra's Algorithm begins at the node we select (the source node), and it examines the graph to find
the shortest path between that node and all the other nodes in the graph.
2. The Algorithm keeps records of the presently acknowledged shortest distance from each node to the
source node, and it updates these values if it finds any shorter path.
3. Once the Algorithm has retrieved the shortest path between the source and another node, that node
is marked as 'visited' and included in the path.
4. The procedure continues until all the nodes in the graph have been included in the path. In this
manner, we have a path connecting the source node to all other nodes, following the shortest
possible path to reach each node.
Understanding the Working of Dijkstra's Algorithm
A graph and source vertex are requirements for Dijkstra's Algorithm. This Algorithm is established on Greedy
Approach and thus finds the locally optimal choice (local minima in this case) at each step of the Algorithm.
Each Vertex in this Algorithm will have two properties defined for it:
1. Visited Property
2. Path Property
Visited Property:
1. The 'visited' property signifies whether or not the node has been visited.
2. We are using this property so that we do not revisit any node.
3. A node is marked visited only when the shortest path has been found.
Path Property:
1. The 'path' property stores the value of the current minimum path to the node.
2. The current minimum path implies the shortest way we have reached this node till now.
3. This property is revised when any neighbor of the node is visited.
4. This property is significant because it will store the final answer for each node.
Initially, we mark all the vertices, or nodes, unvisited as they have yet to be visited. The path to all the nodes
is also set to infinity apart from the source node. Moreover, the path to the source node is set to zero (0).
We then select the source node and mark it as visited. After that, we access all the neighboring nodes of the
source node and perform relaxation on every node. Relaxation is the process of lowering the cost of reaching
a node with the help of another node.
In the process of relaxation, the path of each node is revised to the minimum value amongst the node's
current path, the sum of the path to the previous node, and the path from the previous node to the current
node.
Let us suppose that p[n] is the value of the current path for node n, p[m] is the value of the path up to the previously visited node m, and w is the weight of the edge between the current node and the previously visited one (the edge weight between n and m). Then the relaxed value is p[n] = minimum(p[n], p[m] + w).
We then mark an unvisited node with the least path as visited in every subsequent step and update its
neighbor's paths.
We repeat this procedure until all the nodes in the graph are marked visited.
Whenever we add a node to the visited set, the path to all its neighboring nodes also changes accordingly.
If any node is left unreachable (disconnected component), its path remains 'infinity'. In case the source itself
is a separate component, then the path to all other nodes remains 'infinity'.
The following is the step that we will follow to implement Dijkstra's Algorithm:
Step 1: First, we will mark the source node with a current distance of 0 and set the rest of the nodes to
INFINITY.
Step 2: We will then set the unvisited node with the smallest current distance as the current node, suppose
X.
Step 3: For each neighbor N of the current node X, we will add the current distance of X to the weight of the edge joining X and N. If this sum is smaller than the current distance of N, we set it as the new current distance of N.
Step 4: We will then mark the current node X as visited.
Step 5: We will repeat the process from 'Step 2' if there is any unvisited node left in the graph.
Let us now understand the implementation of the algorithm with the help of an example:
1. We will use the above graph as the input, with node A as the source.
2. First, we will mark all the nodes as unvisited.
3. We will set the path to 0 at node A and INFINITY for all the other nodes.
4. We will now mark source node A as visited and access its neighboring nodes.
Note: We have only accessed the neighboring nodes, not visited them.
5. We will now update the path to node B by 4 with the help of relaxation because the path to
node A is 0 and the path from node A to B is 4, and the minimum((0 + 4), INFINITY) is 4.
6. We will also update the path to node C by 5 with the help of relaxation because the path to
node A is 0 and the path from node A to C is 5, and the minimum((0 + 5), INFINITY) is 5. Both the
neighbors of node A are now relaxed; therefore, we can move ahead.
7. We will now select the next unvisited node with the least path and visit it. Hence, we will visit
node B and perform relaxation on its unvisited neighbors. After performing relaxation, the path to
node C will remain 5, whereas the path to node E will become 11, and the path to node D will
become 13.
8. We will now visit node C, the unvisited node with the least path (5), and perform relaxation on its unvisited neighbours. The path to node E will be updated to 8 (5 + 3), which is smaller than the earlier value 11, while the path to node D remains 13.
9. We will now visit node E and perform relaxation on its neighbouring nodes B, D, and F. Since only node F is unvisited, it will be relaxed. Thus, the path to node B remains 4, the path to node D remains 13, and the path to node F becomes 14 (8 + 6).
10. Now we will visit node D, and only node F could be relaxed. However, the path to node F remains unchanged, i.e., 14.
11. Since only node F is remaining, we will visit it but not perform any relaxation, as all its neighbouring nodes are already visited.
12. Once all the nodes of the graph are visited, the program will end.
1. A = 0
2. B = 4 (A -> B)
3. C = 5 (A -> C)
4. D = 4 + 9 = 13 (A -> B -> D)
5. E = 5 + 3 = 8 (A -> C -> E)
6. F = 5 + 3 + 6 = 14 (A -> C -> E -> F)
o We have to maintain a record of the path distance of every node. Therefore, we can store the path
distance of each node in an array of size n, where n is the total number of nodes.
o Moreover, we want to retrieve the shortest path along with the length of that path. To overcome
this problem, we will map each node to the node that last updated its path length.
o Once the algorithm is complete, we can backtrack the destination node to the source node to retrieve
the path.
o We can use a minimum Priority Queue to retrieve the node with the least path distance in an efficient
way.
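Putting the points above together, here is a minimal Python sketch of Dijkstra's algorithm with a minimum priority queue (heapq) and a predecessor map for backtracking the path; the adjacency-list graph format is an assumption for the example.

import heapq

def dijkstra(graph, source):
    # graph: dict {vertex: [(neighbour, weight), ...]} with non-negative weights.
    dist = {v: float('inf') for v in graph}
    prev = {v: None for v in graph}          # last node that updated each path
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)           # unvisited node with the least path
        if d > dist[u]:
            continue                          # stale queue entry, node already finalised
        for v, w in graph[u]:
            if d + w < dist[v]:               # relaxation step
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, prev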
Bellman Ford Algorithm
The Bellman-Ford algorithm is a single-source shortest path algorithm. It is used to find the shortest distance from a single vertex to all the other vertices of a weighted graph. There are various other algorithms used to find the shortest path, such as Dijkstra's algorithm. If the weighted graph contains negative weight values, Dijkstra's algorithm is not guaranteed to produce the correct answer. In contrast to Dijkstra's algorithm, the Bellman-Ford algorithm guarantees the correct answer even if the weighted graph contains negative weight values, as long as there is no negative weight cycle.
As we can observe in the above graph, some of the weights are negative. The above graph contains 6 vertices, so we will relax all the edges (6 - 1) = 5 times. The loop will iterate 5 times to get the correct answer. If the loop is iterated more than 5 times, the answer will still be the same, i.e., there will be no further change in the distances between the vertices.
Relaxing means: if d(u) + c(u, v) < d(v), then update d(v) = d(u) + c(u, v), where d(u) is the current distance of vertex u and c(u, v) is the cost of the edge (u, v).
To find the shortest path of the above graph, the first step is note down all the edges which are given below:
(A, B), (A, C), (A, D), (B, E), (C, E), (D, C), (D, F), (E, F), (C, B)
Let's consider the source vertex as 'A'; therefore, the distance value at vertex A is 0 and the distance value
at all the other vertices as infinity shown as below:
Since the graph has six vertices so it will have five iterations.
First iteration
Consider the edge (A, B). Denote vertex 'A' as 'u' and vertex 'B' as 'v'. Now use the relaxing formula:
d(u) = 0
d(v) = ∞
c(u , v) = 6
d(v) = 0 + 6 = 6
Consider the edge (A, C). Denote vertex 'A' as 'u' and vertex 'C' as 'v'. Now use the relaxing formula:
d(u) = 0
d(v) = ∞
c(u , v) = 4
d(v) = 0 + 4 = 4
Consider the edge (A, D). Denote vertex 'A' as 'u' and vertex 'D' as 'v'. Now use the relaxing formula:
d(u) = 0
d(v) = ∞
c(u , v) = 5
d(v) = 0 + 5 = 5
Consider the edge (B, E). Denote vertex 'B' as 'u' and vertex 'E' as 'v'. Now use the relaxing formula:
d(u) = 6
d(v) = ∞
c(u , v) = -1
d(v) = 6 - 1= 5
Consider the edge (C, E). Denote vertex 'C' as 'u' and vertex 'E' as 'v'. Now use the relaxing formula:
d(u) = 4
d(v) = 5
c(u , v) = 3
Since (4 + 3) is greater than 5, there would be no updation on the distance value of vertex E.
Consider the edge (D, C). Denote vertex 'D' as 'u' and vertex 'C' as 'v'. Now use the relaxing formula:
d(u) = 5
d(v) = 4
c(u , v) = -2
Since (5 - 2) is less than 4, update
d(v) = 5 - 2 = 3
Consider the edge (D, F). Denote vertex 'D' as 'u' and vertex 'F' as 'v'. Now use the relaxing formula:
d(u) = 5
d(v) = ∞
c(u , v) = -1
d(v) = 5 - 1 = 4
Consider the edge (E, F). Denote vertex 'E' as 'u' and vertex 'F' as 'v'. Now use the relaxing formula:
d(u) = 5
d(v) = ∞
c(u , v) = 3
Since (5 + 3) is greater than 4, so there would be no updation on the distance value of vertex F.
Consider the edge (C, B). Denote vertex 'C' as 'u' and vertex 'B' as 'v'. Now use the relaxing formula:
d(u) = 3
d(v) = 6
c(u , v) = -2
d(v) = 3 - 2 = 1
In the second iteration, we again check all the edges. The first edge is (A, B). Since (0 + 6) is greater than 1 so
there would be no updation in the vertex B.
The next edge is (A, C). Since (0 + 4) is greater than 3 so there would be no updation in the vertex C.
The next edge is (A, D). Since (0 + 5) equals to 5 so there would be no updation in the vertex D.
The next edge is (B, E). Since (1 - 1) equals 0, which is less than 5, update:
d(E) = 1 - 1 = 0
The next edge is (C, E). Since (3 + 3) equals to 6 which is greater than 5 so there would be no updation in the
vertex E.
The next edge is (D, C). Since (5 - 2) equals to 3 so there would be no updation in the vertex C.
The next edge is (D, F). Since (5 - 1) equals to 4 so there would be no updation in the vertex F.
The next edge is (E, F). Since (5 + 3) equals to 8 which is greater than 4 so there would be no updation in the
vertex F.
The next edge is (C, B). Since (3 - 2) equals 1, there would be no updation in the vertex B.
Third iteration
We will perform the same steps as we did in the previous iterations. We will observe that there will be no
updation in the distance of vertices.
Time Complexity
The time complexity of the Bellman-Ford algorithm is O(|E| · (|V| - 1)), i.e., O(V × E).
function bellmanFord(G, S)
  for each vertex V in G
    distance[V] <- infinite
    previous[V] <- NULL
  distance[S] <- 0

  for each vertex V in G              // relax all edges repeatedly (|V| - 1 passes suffice)
    for each edge (U, V) in G
      tempDistance <- distance[U] + edge_weight(U, V)
      if tempDistance < distance[V]
        distance[V] <- tempDistance
        previous[V] <- U

  for each edge (U, V) in G           // one extra pass detects negative cycles
    if distance[U] + edge_weight(U, V) < distance[V]
      Error: Negative Cycle Exists

  return distance[], previous[]
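A direct Python rendering of the pseudocode above, assuming the graph is given as a list of (u, v, weight) tuples:

def bellman_ford(vertices, edges, source):
    # edges: list of (u, v, weight); relax every edge |V| - 1 times.
    dist = {v: float('inf') for v in vertices}
    prev = {v: None for v in vertices}
    dist[source] = 0
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                prev[v] = u
    # One extra pass: any further improvement means a negative weight cycle exists.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("graph contains a negative weight cycle")
    return dist, prev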
o The Bellman-Ford algorithm does not produce a correct answer if the graph contains a cycle whose total edge weight is negative (a negative weight cycle). Let's understand this property through an example. Consider the below graph.
o In the above graph, we consider vertex 1 as the source vertex and give it the value 0. We give an infinity value to the other vertices, as shown below:
First iteration
Consider the edge (1, 3). Denote vertex '1' as 'u' and vertex '3' as 'v'. Now use the relaxing formula:
d(u) = 0
d(v) = ∞
c(u , v) = 5
d(v) = 0 + 5 = 5
Consider the edge (1, 2). Denote vertex '1' as 'u' and vertex '2' as 'v'. Now use the relaxing formula:
d(u) = 0
d(v) = ∞
c(u , v) = 4
d(v) = 0 + 4 = 4
Consider the edge (3, 2). Denote vertex '3' as 'u' and vertex '2' as 'v'. Now use the relaxing formula:
d(u) = 5
d(v) = 4
c(u , v) = 7
Since (5 + 7) is greater than 4, there would be no updation in the vertex 2.
Consider the edge (2, 4). Denote vertex '2' as 'u' and vertex '4' as 'v'. Now use the relaxing formula:
d(u) = 4
d(v) = ∞
c(u , v) = 7
d(v) = 4 + 7 = 11
Consider the edge (4, 3). Denote vertex '4' as 'u' and vertex '3' as 'v'. Now use the relaxing formula:
d(u) = 11
d(v) = 5
c(u , v) = -15
d(v) = 11 - 15 = -4
Second iteration
Now, again we will check all the edges. The first edge is (1, 3). Since (0 + 5) equals to 5 which is greater than
-4 so there would be no updation in the vertex 3.
The next edge is (1, 2). Since (0 + 4) equals to 4 so there would be no updation in the vertex 2.
The next edge is (3, 2). Since (-4 + 7) equals 3, which is less than 4, update:
d(2) = -4 + 7 = 3
The next edge is (2, 4). Since (3 + 7) equals 10, which is less than 11, update:
d(4) = 3 + 7 = 10
The next edge is (4, 3). Since (10 - 15) equals -5, which is less than -4, update:
d(3) = 10 - 15 = -5
Third iteration
Now again we will check all the edges. The first edge is (1, 3). Since (0 + 5) equals to 5 which is greater than
-5 so there would be no updation in the vertex 3.
The next edge is (1, 2). Since (0 + 4) equals to 4 which is greater than 3 so there would be no updation in the
vertex 2.
The next edge is (3, 2). Since (-5 + 7) equals 2, which is less than 3, update:
d(2) = -5 + 7 = 2
The next edge is (2, 4). Since (2 + 7) equals 9, which is less than 10, update:
d(4) = 2 + 7 = 9
Therefore, the value at vertex 4 is 9.
The next edge is (4, 3). Since (9 - 15) equals -6, which is less than -5, update:
d(3) = 9 - 15 = -6
Since the graph contains 4 vertices, according to the Bellman-Ford algorithm there would be only 3 iterations. If we try to perform a 4th iteration on the graph, the distances of the vertices from the given vertex should not change. If any distance still changes, it means that the graph contains a negative weight cycle and the Bellman-Ford algorithm is not providing the correct answer.
4th iteration
The first edge is (1, 3). Since (0 +5) equals to 5 which is greater than -6 so there would be no change in the
vertex 3.
The next edge is (1, 2). Since (0 + 4) is greater than 2 so there would be no updation.
The next edge is (3, 2). Since (-6 + 7) equals 1, which is less than 2, update:
d(2) = -6 + 7 = 1
In this case, the value of the vertex is updated even in the 4th iteration. So, we conclude that the Bellman-Ford algorithm does not work when the graph contains a negative weight cycle.
Floyd-Warshall Algorithm
The Floyd-Warshall algorithm is a dynamic programming algorithm used to find the shortest paths in a weighted graph, including graphs with negative edge weights (but no negative weight cycles). The algorithm works by computing the shortest path between every pair of vertices in the graph, using a matrix of intermediate vertices to keep track of the best-known route so far.
But before we get started, let us briefly understand what dynamic programming is.
Dynamic programming is a technique used in computer science and mathematics to solve complicated problems by breaking them down into smaller subproblems and solving each subproblem only once. It is an optimization technique that finds the best solution to a problem by reusing the solutions to its subproblems.
The key idea behind dynamic programming is to keep the solutions to the subproblems in memory so they can be reused later while solving larger problems. This reduces the time and space complexity of the algorithm and lets it solve much larger and more complex problems than a brute-force approach could.
There are two main approaches to dynamic programming:
1. Memoization
2. Tabulation
Memoization involves storing the outcome of every subproblem in a cache so that it can be reused later. Tabulation involves building a table of answers to subproblems in a bottom-up manner, beginning with the smallest subproblems and working up to the larger ones. Dynamic programming is used in a wide range of applications, including optimization problems, computational geometry, machine learning, and natural language processing.
Some well-known examples of problems that can be solved using dynamic programming include the Fibonacci sequence, the Knapsack problem, and the shortest path problem.
The Floyd-Warshall algorithm was developed independently by Robert Floyd and Stephen Warshall in 1962. Robert Floyd was a mathematician and computer scientist at IBM's Thomas J. Watson Research Center, while Stephen Warshall was a computer scientist at the University of California, Berkeley. The algorithm was originally developed for use in the field of operations research, where it was used to solve the all-pairs shortest path problem in directed graphs with positive or negative edge weights. The problem was of great interest in operations research, as it has many applications in transportation, communication, and logistics.
Floyd first presented the algorithm in a technical report titled "Algorithm 97: Shortest Path" in 1962. Warshall independently discovered the algorithm shortly afterwards and published it in his own technical report, "A Theorem on Boolean Matrices". The algorithm has since become a cornerstone of computer science and is widely used in many areas of research and industry. Its ability to efficiently find the shortest paths between all pairs of vertices in a graph, including graphs with negative edge weights, makes it a valuable tool for solving a wide range of optimization problems.
Working of Floyd-Warshall Algorithm:
1. Initialize a distance matrix D where D[i][j] represents the shortest distance between vertex i and vertex j.
2. Set the diagonal entries of the matrix to 0, and all other entries to infinity.
3. For every edge (u, v) in the graph, update the distance matrix to reflect the weight of the edge: D[u][v] = weight(u, v).
4. For every vertex k in the graph, consider all pairs of vertices (i, j) and check whether the path from i to j through k is shorter than the current best path. If it is, update the distance matrix: D[i][j] = min(D[i][j], D[i][k] + D[k][j]).
5. After all iterations, the matrix D will contain the shortest path distances between all pairs of vertices.
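A short Python sketch of these steps, assuming the graph is given as a dictionary of directed edge weights and the vertices are numbered 0 to n-1:

INF = float('inf')

def floyd_warshall(weights, n):
    # weights: dict {(i, j): w} of directed edge weights; vertices are 0..n-1.
    dist = [[0 if i == j else weights.get((i, j), INF) for j in range(n)]
            for i in range(n)]
    for k in range(n):                         # allow vertex k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist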
Example:
Floyd-Warshall is an algorithm used to find the shortest path between all pairs of vertices in a weighted graph. It works by maintaining a matrix of distances between each pair of vertices and updating this matrix iteratively until the shortest paths are found.
In this graph, the vertices are represented by letters (A, B, C, D), and the numbers on the edges represent the weights of those edges.
To apply the Floyd-Warshall algorithm to this graph, we start by initializing a matrix of distances between every pair of vertices. If two vertices are directly connected by an edge, their distance is the weight of that edge. If there is no direct edge between two vertices, their distance is infinite.
In the first iteration of the algorithm, we consider the possibility of using vertex 1 (A) as an intermediate vertex in paths between all pairs of vertices. If the distance from vertex 1 to vertex 2 plus the distance from vertex 2 to vertex 3 is less than the current distance from vertex 1 to vertex 3, then we update the matrix with this new distance. We do this for every possible pair of vertices.
In the second iteration, we consider the possibility of using vertex 2 (B) as an intermediate vertex in paths between all pairs of vertices. We update the matrix in the same manner as before.
In the third iteration, we consider the possibility of using vertex 3 (C) as an intermediate vertex in paths
between all pairs of vertices.
Finally, in the fourth and final iteration, we consider the possibility of using vertex 4 (D) as an intermediate
vertex in paths between all pairs of vertices.
After the fourth iteration, we have got the shortest path between every pair of vertices in the graph. For
example, the shortest path from vertex A to vertex D is 4, which is the value in the matrix at row A and
column D.
Unit 5 : Strings
String Sort:
String sorting involves arranging a collection of strings in a particular order. The most common ordering for
strings is lexicographic (dictionary) order, where strings are compared character by character. You can use established
sorting algorithms like quicksort or mergesort to achieve this. Here's a simple example in Python:
strings = ["banana", "cherry", "apple"]   # example data; any list of strings works
sorted_strings = sorted(strings)
print(sorted_strings)                     # ['apple', 'banana', 'cherry']
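If a different ordering is needed, the same built-in sort accepts a key function; the list below is illustrative data, and str.lower gives a case-insensitive order:

strings = ["Banana", "apple", "cherry"]
print(sorted(strings))                  # plain lexicographic order: uppercase letters sort before lowercase
print(sorted(strings, key=str.lower))   # case-insensitive lexicographic order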
Tries:
A trie is a tree-like data structure that is used to store a dynamic set of strings. It is particularly efficient for tasks like
prefix matching. Each node in a trie represents a character, and the path from the root to a node forms a string. Tries
are commonly used in spell checkers and IP routers.
class TrieNode:
    def __init__(self):
        self.children = {}            # maps a character to the child TrieNode
        self.is_end_of_word = False   # True if a stored word ends at this node

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end_of_word = True

    def search(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                return False
            node = node.children[char]
        return node.is_end_of_word
Example usage:
trie = Trie()
trie.insert("apple")
trie.insert("banana")
print(trie.search("apple"))   # True
print(trie.search("app"))     # False - "app" is only a prefix, not an inserted word
Substring Search:
Finding whether a pattern string occurs inside a larger text is a core string operation. In Python, the simplest check uses the in operator (the value of main_string below is illustrative, since the original text was not preserved):

main_string = "hello world"   # illustrative text
substring = "world"
if substring in main_string:
    print("Substring found!")
else:
    print("Substring not found!")

Several classical pattern-matching algorithms solve the same problem more systematically:
1. Naive (Brute-Force) Algorithm:
Algorithm:
- Slide the pattern along the text one position at a time, comparing it with the text character by character at each alignment.
- If a mismatch is found, move the pattern one position to the right and continue.
Complexity Analysis:
- Time Complexity: O((n-m+1) * m), where n is the length of the text and m is the length of the pattern.
- This algorithm is straightforward but can be inefficient for large texts or patterns.
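A minimal sketch of this brute-force scan; the function name and sample strings are illustrative:

def naive_search(text, pattern):
    n, m = len(text), len(pattern)
    positions = []
    for i in range(n - m + 1):          # try every alignment of the pattern
        if text[i:i + m] == pattern:    # character-by-character comparison
            positions.append(i)
        # on a mismatch the loop simply moves one position to the right
    return positions

print(naive_search("abracadabra", "abra"))   # [0, 7]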
2. Rabin-Karp Algorithm:
Algorithm:
- Utilizes hashing to compare the hash value of the pattern with the hash values of substrings in the text.
- Uses rolling hash to efficiently update the hash value as the window slides.
Complexity Analysis:
- Time Complexity: O(n + m) on average, where n is the length of the text and m is the length of the pattern; frequent hash collisions can degrade this toward O(n * m) in the worst case.
- Hashing can reduce the number of character comparisons, but hash collisions may affect performance.
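A sketch of Rabin-Karp with a simple polynomial rolling hash; the base, modulus, and sample strings are illustrative choices:

def rabin_karp(text, pattern, base=256, mod=10**9 + 7):
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, mod)          # weight of the window's leading character
    p_hash = t_hash = 0
    for i in range(m):                    # hash of the pattern and of the first window
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    positions = []
    for i in range(n - m + 1):
        # verify with a direct comparison to rule out hash collisions
        if p_hash == t_hash and text[i:i + m] == pattern:
            positions.append(i)
        if i < n - m:                     # roll the hash to the next window
            t_hash = ((t_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return positions

print(rabin_karp("abracadabra", "abra"))   # [0, 7]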
3. Knuth-Morris-Pratt (KMP) Algorithm:
Algorithm:
- Preprocesses the pattern to build a prefix (failure) table recording, for each position, the length of the longest proper prefix that is also a suffix.
- On a mismatch, uses the table to shift the pattern without re-examining text characters that have already matched.
Complexity Analysis:
- Time Complexity: O(n + m), where n is the length of the text and m is the length of the pattern.
- KMP avoids unnecessary character comparisons by utilizing the information from the preprocessing step.
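A sketch of KMP built around the prefix (failure) table; function names and sample strings are illustrative:

def build_prefix_table(pattern):
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k    # longest proper prefix of pattern[:i+1] that is also a suffix
    return table

def kmp_search(text, pattern):
    table = build_prefix_table(pattern)
    positions, k = [], 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]              # fall back without re-reading text characters
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):             # full match ending at position i
            positions.append(i - len(pattern) + 1)
            k = table[k - 1]
    return positions

print(kmp_search("abracadabra", "abra"))   # [0, 7]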
- The bad-character table records the rightmost position of each character in the pattern.
Complexity Analysis:
- Time Complexity: O(n + m), where n is the length of the text and m is the length of the pattern.
- This algorithm is particularly efficient for certain cases but may not be as versatile as others.
Complexity Analysis:
- Time Complexity: O(n + m), where n is the length of the text and m is the length of the pattern.
- Boyer-Moore tends to perform well in practice due to its ability to skip large portions of the text based on the
mismatched character.
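The table of rightmost character positions mentioned above drives the bad-character heuristic. Below is a minimal sketch of the simplified Horspool variant (bad-character rule only, not the full Boyer-Moore with the good-suffix rule); the function name and sample strings are illustrative:

def horspool_search(text, pattern):
    n, m = len(text), len(pattern)
    # shift table: for each character of pattern[:-1], distance from its rightmost occurrence to the end
    shift = {ch: m - idx - 1 for idx, ch in enumerate(pattern[:-1])}
    positions, i = [], 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            positions.append(i)
        # skip ahead based on the text character aligned with the pattern's last position
        i += shift.get(text[i + m - 1], m)
    return positions

print(horspool_search("abracadabra", "abra"))   # [0, 7]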
Regular Expressions:
Regular expressions (regex or regexp) are powerful tools for pattern matching and text manipulation. They provide a
concise and flexible way to search, match, and manipulate strings. Here are some key concepts:
1. Basic Syntax:
- `.`: Matches any single character.
2. Anchors:
- `^`: Matches the start of a line.
3. Character Classes:
- `[a-z]`: Matches any lowercase letter.
4. Quantifiers:
- `{n}`: Matches exactly n occurrences.
5. Escape Characters:
- `\`: Escapes a special character, allowing it to be treated as a literal.
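A short example using Python's standard re module; the pattern and sample strings are illustrative and simply combine the pieces listed above:

import re

# ^ anchors at the start of the string, [a-z]{3} matches exactly three lowercase letters,
# \. matches a literal dot, and the final . matches any single character.
pattern = r"^[a-z]{3}\.."
for line in ["abc.x", "Abc.x", "ab.x"]:
    if re.match(pattern, line):
        print(line, "matches")
    else:
        print(line, "does not match")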
Data Compression:
2. Huffman Coding:
- A variable-length encoding algorithm that assigns shorter codes to more frequent symbols and longer codes to
less frequent symbols.
- It builds a binary tree in which the leaves represent the symbols to be encoded.
- The more frequent a symbol, the closer its leaf is to the root of the tree, and hence the shorter its code.
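A compact sketch of Huffman code construction using Python's heapq; the symbol frequencies are illustrative:

import heapq

def huffman_codes(freq):
    # each heap entry: (total frequency, tie-breaker, {symbol: code built so far})
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)       # the two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}          # left branch gets bit 0
        merged.update({s: "1" + c for s, c in right.items()})   # right branch gets bit 1
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

print(huffman_codes({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}))

With these illustrative frequencies, the most frequent symbol "a" receives a one-bit code while the least frequent symbols receive four-bit codes, which is exactly the behaviour described above.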
Data compression is widely used in various applications, such as file compression (ZIP), image compression (JPEG),
and video compression (H.264). More advanced compression algorithms, like Lempel-Ziv and its variants, are
commonly used in practice.
Remember that the effectiveness of compression depends on the characteristics of the data. Some data types
compress well, while others may not show significant reduction in size. The choice of a compression algorithm
depends on factors such as the type of data and the specific requirements of the application.