Module 5 DSA


Module 5

Searching and Sorting


Sorting Algorithms
• Arranging and rearranging sets of data in some specific order
• Bubble sort
• Selection sort
• Insertion Sort
• Quick Sort
• Merge sort
• Heap Sort
• Shell sort
• Radix sort
Bubble sort
• Compares two adjacent elements and swaps them if they are not in the intended order.
• Working of Bubble Sort:
• Sorting of the elements in ascending order.
• First Iteration (Compare and Swap)
• Starting from the first index, compare the first and the second elements.
• If the first element is greater than the second element, they are swapped.
• Now, compare the second and the third elements. Swap them if they are not in order.
• The above process goes on until the last element.
• Remaining Iteration
• The same process goes on for the remaining iterations.
• After each iteration, the largest element among the unsorted elements is placed at the end.
Bubble sort algorithm and steps
Bubble_sort(A, N)

• Step 1: Repeat steps 2 and 3 for i = 0 to N-2
• Step 2: Repeat for j = 0 to N-i-2
• Step 3: If A[j] > A[j+1], then
              SWAP A[j] and A[j+1]
          [End of inner loop]
          [End of outer loop]
• Step 4: Exit

(Diagram: first iteration.)
Bubble sort steps
(Diagrams: second, third, and fourth passes.)
Improved Bubble sort algorithm
Bubble_sort(A, N)
• Step 1: Repeat steps 2 to 4 for i = 0 to N-2
• Step 2: Set flag = 0
• Step 3: Repeat for j = 0 to N-i-2
• Step 4: If A[j] > A[j+1], then
              SWAP A[j] and A[j+1]
              Set flag = 1
          [End of inner loop]
          If (flag == 0)
              break
          [End of outer loop]
• Step 5: Exit
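
A minimal C rendering of the improved algorithm above; the driver in main and the sample array are illustrative:

#include <stdio.h>

/* Improved bubble sort: stops early when a pass makes no swaps. */
void bubble_sort(int A[], int n) {
    for (int i = 0; i <= n - 2; i++) {
        int flag = 0;                      /* no swaps seen yet in this pass */
        for (int j = 0; j <= n - i - 2; j++) {
            if (A[j] > A[j + 1]) {         /* adjacent pair out of order */
                int tmp = A[j];
                A[j] = A[j + 1];
                A[j + 1] = tmp;
                flag = 1;
            }
        }
        if (flag == 0)                     /* array already sorted: exit early */
            break;
    }
}

int main(void) {
    int a[] = {5, 1, 4, 2, 8};
    int n = sizeof a / sizeof a[0];
    bubble_sort(a, n);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);   /* 1 2 4 5 8 */
    printf("\n");
    return 0;
}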
Time and Space complexity
• Time Complexity of Bubble sort
• Best case scenario: The best case occurs when the array is already sorted. Time
complexity in the best case scenario is Ω(n), because the algorithm only has to traverse
the elements once (the flag then detects that no swaps occurred).
• Worst case and Average case scenario: In bubble sort, n-1 comparisons are done in the 1st
pass, n-2 in the 2nd pass, n-3 in the 3rd pass, and so on. So the total number of comparisons
will be:
Sum = (n-1) + (n-2) + (n-3) + ... + 3 + 2 + 1
Sum = n(n-1)/2
Hence, the worst-case and average-case time complexity is of the order of n², i.e. Θ(n²).
• Space Complexity of Bubble sort
• The space complexity of the algorithm is O(1), because only a single additional memory
space is required: the temporary variable used for swapping and the flag variable.
• It is an in-place sorting algorithm, which modifies the original array's elements to sort the
given array.
Insertion sort
• Algorithm that places an unsorted element at its suitable place in each iteration.
• The array to be sorted is divided into two sets: one that stores sorted values and
another that contains unsorted values.
• Step 1 − If it is the first element, it is already sorted.
• Step 2 − Pick the next element and name it key.
• Step 3 − Compare key with all elements in the sorted sub-list.
• Step 4 − Shift all the elements in the sorted sub-list that are greater than key
one position to the right.
• Step 5 − Place key at the appropriate place.
• Step 6 − Repeat until the list is sorted.

Insertion_sort(A, N)
Step 1: Repeat steps 2 to 5 for k = 1 to N-1
Step 2:     set key = A[k]
Step 3:     set j = k-1
Step 4:     repeat while (j >= 0 && key <= A[j])
                set A[j+1] = A[j]
                set j = j-1
            [end of inner loop]
Step 5:     set A[j+1] = key
        [end of loop]
Step 6: Exit
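
A minimal C sketch of the algorithm above, run on the example array used in the following slides (function and variable names are illustrative):

#include <stdio.h>

/* Insertion sort: grows a sorted prefix A[0..k-1], inserting A[k] into it. */
void insertion_sort(int A[], int n) {
    for (int k = 1; k < n; k++) {
        int key = A[k];                  /* element to insert into the sorted prefix */
        int j = k - 1;
        while (j >= 0 && A[j] > key) {   /* shift larger elements right */
            A[j + 1] = A[j];
            j--;
        }
        A[j + 1] = key;                  /* drop key into its slot */
    }
}

int main(void) {
    int a[] = {25, 17, 31, 13, 2};
    int n = sizeof a / sizeof a[0];
    insertion_sort(a, n);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);   /* 2 13 17 25 31 */
    printf("\n");
    return 0;
}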
Working of insertion sort
1. The first element in the array is assumed to be sorted. Take the
second element and store it separately in key. Compare key with the
first element. If the first element is greater than key, then key is
placed in front of the first element.
2. Now, the first two elements are sorted.
Take the third element and compare it with the elements on its left.
Place it just after the element smaller than it. If there is no
element smaller than it, then place it at the beginning of the array.
3. Similarly, place every unsorted element at its correct position.
example
• Consider the following array: 25, 17, 31, 13, 2
• First Iteration:
• Since 17 < 25, the two are swapped, and the array becomes 17, 25, 31, 13, 2.
• Second Iteration:
• Since 31 > 25, no swapping takes place.
• Also, 31 > 17, so no swapping takes place and 31 remains at its position.
example
• Third Iteration:
• Since 13 < 31, we swap the two.
• The array now becomes: 17, 25, 13, 31, 2.
• Since 13 < 25, we swap the two.
• The array becomes: 17, 13, 25, 31, 2.
• Since 13 < 17, we swap the two.
• The array now becomes: 13, 17, 25, 31, 2.
example
• Fourth Iteration: The array is now 13, 17, 25, 31, 2.
• Since 2 < 31, swap 2 and 31.
• The array becomes: 13, 17, 25, 2, 31.
• Since 2 < 25, swap 25 and 2.
• The array becomes: 13, 17, 2, 25, 31.
• Since 2 < 17, swap 2 and 17.
• The array becomes: 13, 2, 17, 25, 31.
• Since 2 < 13, swap 2 and 13.
• The array now becomes: 2, 13, 17, 25, 31, which is sorted.
Time and Space complexity
• Time Complexity of Insertion sort
• Best case scenario: The best case scenario occurs when the array is already sorted,
because the inner loop will never run. Time complexity in the best case scenario is Ω(n).
• Worst case scenario: The worst case scenario occurs when the array is sorted in reverse
order. Time complexity in this scenario is O(n²).
• Average Case scenario: The average case occurs when the elements of the input array
are in jumbled order, i.e. neither ascending nor descending. Time complexity in this
scenario is Θ(n²).
• Space Complexity of Insertion sort
• The space complexity of the algorithm is O(1), because only a single additional memory
space is required, i.e. the temporary variable used for storing the element to be inserted.
• It is an in-place sorting algorithm, which modifies the original array's elements
to sort the given array.
Selection sort

• In selection sort, the smallest value among the unsorted elements of
the array is selected in every pass and swapped into its appropriate
position in the array.
Selection sort Algorithm
1. Find the smallest element in the array and swap it with the first element of the
array, i.e. a[0].
2. The elements left for sorting are now n-1. Find the smallest element in the array
from index 1 to n-1, i.e. a[1] to a[n-1], and swap it with a[1].
3. Continue this process for all the elements in the array until we get a sorted list.

Step 1: For i = 0 to n-2 repeat steps 2 to 5
Step 2:     Set min = arr[i]
Step 3:     Set position = i
Step 4:     For j = i+1 to n-1 repeat:
                if (min > arr[j])
                    Set min = arr[j]
                    Set position = j
                [end of if]
            [end of loop]
Step 5:     Swap arr[i] with arr[position]
        [end of loop]
Step 6: END
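
A minimal C sketch of the algorithm above; it tracks only the position of the minimum (carrying min's value as well, as in the pseudocode, works the same way). The sample array is illustrative:

#include <stdio.h>

/* Selection sort: each pass selects the minimum of the unsorted
   suffix and swaps it into position i. */
void selection_sort(int arr[], int n) {
    for (int i = 0; i <= n - 2; i++) {
        int position = i;                 /* index of smallest seen so far */
        for (int j = i + 1; j <= n - 1; j++) {
            if (arr[j] < arr[position])
                position = j;
        }
        int tmp = arr[i];                 /* swap minimum into place */
        arr[i] = arr[position];
        arr[position] = tmp;
    }
}

int main(void) {
    int a[] = {64, 25, 12, 22, 11};
    int n = sizeof a / sizeof a[0];
    selection_sort(a, n);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);   /* 11 12 22 25 64 */
    printf("\n");
    return 0;
}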
Working of selection sort
Time and space complexity
1. Best Case: In this case, the data is already sorted inside the array. There will be
zero swaps, but the comparisons still occur at every step. Time complexity in the best
case scenario is Ω(n²).
2. Average case: Elements are neither in increasing order nor decreasing order; the
values in the array are randomly placed. Time complexity in this scenario is Θ(n²).
3. Worst case: In the worst case, the array is completely in decreasing (non-increasing)
order. It will have the maximum number of swaps as well as the maximum number of
comparisons. Time complexity in this scenario is Θ(n²).
• The space complexity of the selection sort algorithm is O(1).
• It is an in-place sorting algorithm, which modifies the original array's elements to
sort the given array.
Merge sort
• Based on the Divide, conquer and combine strategy.
• Works by recursively dividing the array into two halves, sorting them, and
combining them. It takes O(n log n) time even in the worst case.
• Divide: In this step, the array/list divides itself recursively into sub-
arrays until the base case(single element) is reached.
• Recursively solve: Here, the sub-arrays are sorted using recursion.
• Combine: This step makes use of the merge( ) function to combine
the sub-arrays into the final sorted array.
Merge sort(dividing)
MERGE_SORT(arr, beg, end)
Step 1: if beg < end, then
            set mid = (beg + end)/2
            call MERGE_SORT(arr, beg, mid)
            call MERGE_SORT(arr, mid + 1, end)
            call MERGE(arr, beg, mid, end)
        [end of if]
Step 2: END MERGE_SORT
Merge (combining)

MERGE(arr, beg, mid, end)
Step 1: set i = beg, j = mid+1, k = beg
Step 2: Repeat while (i <= mid and j <= end)
            If arr[i] < arr[j], then
                set temp[k] = arr[i]
                set i = i+1
            Else
                set temp[k] = arr[j]
                set j = j+1
            [End of If]
            set k = k+1
        [End of loop]
Step 3: [Copy the remaining elements of the right sub-array, if any]
        repeat while j <= end
            set temp[k] = arr[j]
            set k = k+1
            set j = j+1
        [end of loop]
        [Copy the remaining elements of the left sub-array, if any]
        repeat while i <= mid
            set temp[k] = arr[i]
            set k = k+1
            set i = i+1
        [end of loop]
Step 4: Copy the contents of temp back to arr: set index = beg
Step 5: repeat while index < k
            set arr[index] = temp[index]
            set index = index+1
        [end of loop]
Step 6: End
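
A direct C rendering of the MERGE and MERGE_SORT procedures above. MAX and the sample array are illustrative; a production version would size temp dynamically:

#include <stdio.h>

#define MAX 100   /* illustrative capacity for the scratch array */

/* Merge two sorted runs arr[beg..mid] and arr[mid+1..end]. */
void merge(int arr[], int beg, int mid, int end) {
    int temp[MAX];
    int i = beg, j = mid + 1, k = beg;
    while (i <= mid && j <= end) {          /* pick the smaller head element */
        if (arr[i] < arr[j]) temp[k++] = arr[i++];
        else                 temp[k++] = arr[j++];
    }
    while (j <= end) temp[k++] = arr[j++];  /* leftovers from the right run */
    while (i <= mid) temp[k++] = arr[i++];  /* leftovers from the left run */
    for (int idx = beg; idx < k; idx++)     /* copy the merged run back */
        arr[idx] = temp[idx];
}

void merge_sort(int arr[], int beg, int end) {
    if (beg < end) {
        int mid = (beg + end) / 2;
        merge_sort(arr, beg, mid);          /* sort left half */
        merge_sort(arr, mid + 1, end);      /* sort right half */
        merge(arr, beg, mid, end);          /* combine the halves */
    }
}

int main(void) {
    int a[] = {38, 27, 43, 3, 9, 82, 10};
    int n = sizeof a / sizeof a[0];
    merge_sort(a, 0, n - 1);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);   /* 3 9 10 27 38 43 82 */
    printf("\n");
    return 0;
}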
Merge and Mergesort
Time and Space complexity
Time Complexity
    Best:    O(n log n)
    Worst:   O(n log n)
    Average: O(n log n)
Space Complexity: O(n)

Quick sort
• Divide, conquer and combine approach
• Divide and conquer is a technique of breaking a problem down into
subproblems, then solving the subproblems, and combining the
results back together to solve the original problem.
• Divide: First pick a pivot element. After that, partition or
rearrange the array into two sub-arrays such that each element in the
left sub-array is less than or equal to the pivot element and each
element in the right sub-array is larger than the pivot element.
• Conquer: Recursively sort the two subarrays with quicksort.
• Combine: Combine the already sorted arrays.
steps
Step 1 − Make any element the pivot
Step 2 − Partition the array on the basis of the pivot
Step 3 − Apply quick sort on the left partition recursively
Step 4 − Apply quick sort on the right partition recursively

• An array is divided into subarrays by selecting a pivot element
(an element selected from the array).
• While dividing the array, the pivot element should be positioned in
such a way that elements less than the pivot are kept on the left side
and elements greater than the pivot are on the right side of the pivot.
• The left and right subarrays are also divided using the same approach.
This process continues until each subarray contains a single element.
• At this point, the elements are already sorted. Finally, the elements
are combined to form a sorted array.
Algorithm (pivot taken as the first element)

Quicksort(A, lb, ub)
{
    if (lb < ub)
    {
        location = partition(A, lb, ub)
        Quicksort(A, lb, location-1)
        Quicksort(A, location+1, ub)
    }
}

partition(A, lb, ub)
{
    int pivot = A[lb], beg = lb, end = ub;
    while (beg < end)
    {
        while (A[beg] <= pivot && beg < ub)
            beg++
        while (A[end] > pivot)
            end--
        if (beg < end)
            swap(A[beg], A[end])
    }
    swap(A[lb], A[end])    // place the pivot at its final position
    return end
}
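
A runnable C sketch of the two procedures above, with the first element as pivot (the sample array is illustrative):

#include <stdio.h>

void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Partition with A[lb] as pivot; returns the pivot's final index. */
int partition(int A[], int lb, int ub) {
    int pivot = A[lb], beg = lb, end = ub;
    while (beg < end) {
        while (beg < ub && A[beg] <= pivot)  /* skip elements <= pivot */
            beg++;
        while (A[end] > pivot)               /* skip elements > pivot */
            end--;
        if (beg < end)
            swap(&A[beg], &A[end]);
    }
    swap(&A[lb], &A[end]);                   /* put the pivot in its slot */
    return end;
}

void quicksort(int A[], int lb, int ub) {
    if (lb < ub) {
        int location = partition(A, lb, ub);
        quicksort(A, lb, location - 1);      /* left of the pivot */
        quicksort(A, location + 1, ub);      /* right of the pivot */
    }
}

int main(void) {
    int a[] = {7, 6, 10, 5, 9, 2, 1, 15, 7};
    int n = sizeof a / sizeof a[0];
    quicksort(a, 0, n - 1);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);   /* 1 2 5 6 7 7 9 10 15 */
    printf("\n");
    return 0;
}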
Best case of quick sort
Worst case of quick sort
Time and space complexity
1. Time Complexities
Worst Case Complexity [Big-O]: O(n²)
It occurs when the pivot element picked is either the greatest or the smallest
element.
Best Case Complexity [Big-Omega]: Ω(n log n)
It occurs when the chosen pivot element is the middle element of the sorted-order
array, or near to the middle element.
Average Case Complexity [Big-Theta]: Θ(n log n)
It occurs when the above conditions do not occur.
2. Space Complexity
The space complexity of quicksort is O(log n), for the recursion stack.
Heap sort
• What is a binary tree
• How an array is represented as a complete binary tree
• How to create a heap structure
• Min heap and max heap
• Inserting an element into the heap
• Deleting an element from a heap
Binary tree and Complete Binary Tree
Binary tree:
• Each node has at most two children, which are referred to
as the left child and the right child.
Complete binary tree:
• A binary tree in which all the levels are completely filled except
possibly the lowest one, which is filled from the left.
Binary Heap
• Complete binary tree; all its elements can be stored
sequentially in an array
• Root node = A[0]
• Children of A[i] = A[2i+1], A[2i+2]
• Keep track of the current size N (number of nodes)

index: 0  1  2  3  4   5   6
value: 7  5  6  2  44  23  9

With N = 5, only the first five slots (7, 5, 6, 2, 44) belong to the heap:
7 is the root, its children are 5 and 6, and the children of 5 are 2 and 44.
Max heap and min heap
Insertion in a binary heap
• Consider a heap H with n elements. Inserting a new value into the
heap is done in the following steps:
• 1) Add the new value at the bottom of H in such a way that H is still
a complete binary tree, but not necessarily a heap.
• 2) Let the new value rise to its appropriate place in H so that H now
becomes a heap as well.
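
A minimal C sketch of these two steps for a max heap stored in an array as on the earlier slide; heap_insert and the spare-capacity assumption are illustrative:

/* Insert value v into a max heap stored in A[0..*n-1].
   Assumes the array has spare capacity for one more element. */
void heap_insert(int A[], int *n, int v) {
    int i = (*n)++;          /* step 1: place v at the bottom of the heap */
    A[i] = v;
    while (i > 0) {          /* step 2: let the new value rise */
        int parent = (i - 1) / 2;
        if (A[parent] >= A[i])
            break;           /* heap property restored */
        int t = A[parent];   /* swap child with parent and continue upward */
        A[parent] = A[i];
        A[i] = t;
        i = parent;
    }
}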
Constructing a max heap (heapify)
(Diagrams: the given array, followed by steps 1-3 of building the heap.)
Deleting elements from a heap(for sorting)
• Consider a max heap H having N elements. An element is always
deleted from the root of the heap.
• Step 1: Replace the root node with the last node's value so that H is
still a complete binary tree but not necessarily a heap.
• Step 2: Sink down the new root's value so that H satisfies the heap
property.

(Diagram: swap root 50 with the last node 33, then heapify.)
heapify
1. Create a complete binary tree from the array
2. Start from the last non-leaf node, whose index is given by n/2 - 1, and work back
to the root.
3. Set the current element i as largest.
4. The index of the left child is given by 2*i+1 and that of the right child by 2*i+2.
5. If the left child is greater than the current element, set the left child's index as
largest.
6. If the right child is greater than the element at largest, set the right child's index
as largest.
7. Swap largest with the current element.
8. Repeat steps 2-7 until all subtrees are heapified.
Heap sort algorithm and Heapify procedure

Heap_Sort(arr[], n)
1) Create the initial max heap:
       repeat for i = n/2 - 1 down to 0     // e.g. i = 2, 1, 0
           heapify(arr, n, i)
2) Swap the largest element to the end and repeat the steps on the reduced heap:
       repeat for i = n-1 down to 1:
           swap(arr[0], arr[i])
           heapify(arr, i, 0)

Heapify(arr[], n, i)
    set largest = i
    set left = 2*i + 1      // left child
    set right = 2*i + 2     // right child
    // Check if the left child exists and is larger than the root
    if (left < n && arr[left] > arr[largest]):
        largest = left
    // Check if the right child exists and is larger than largest
    if (right < n && arr[right] > arr[largest]):
        largest = right
    // Change the root if the root is not the largest
    if (largest != i):
        swap(arr[i], arr[largest])
        Heapify(arr, n, largest)
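
A runnable C version of Heap_Sort and Heapify as given above (the sample array is illustrative):

#include <stdio.h>

void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Sink the value at index i down until arr[0..n-1] satisfies the
   max-heap property below i. */
void heapify(int arr[], int n, int i) {
    int largest = i;
    int left  = 2 * i + 1;
    int right = 2 * i + 2;
    if (left < n && arr[left] > arr[largest])
        largest = left;
    if (right < n && arr[right] > arr[largest])
        largest = right;
    if (largest != i) {
        swap(&arr[i], &arr[largest]);
        heapify(arr, n, largest);         /* continue sinking down */
    }
}

void heap_sort(int arr[], int n) {
    for (int i = n / 2 - 1; i >= 0; i--)  /* build the initial max heap */
        heapify(arr, n, i);
    for (int i = n - 1; i >= 1; i--) {    /* move max to the end, shrink heap */
        swap(&arr[0], &arr[i]);
        heapify(arr, i, 0);
    }
}

int main(void) {
    int a[] = {4, 10, 3, 5, 1};
    int n = sizeof a / sizeof a[0];
    heap_sort(a, n);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);   /* 1 3 4 5 10 */
    printf("\n");
    return 0;
}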
Time and space complexity
• The number of comparisons made by each heapify call is bounded by the
height of the tree. The height of a complete binary tree with n nodes is
log n, and heapify is invoked O(n) times overall; therefore, the total time
complexity is O(n log n).
• Worst Case Time Complexity: O(n*log n)
• Best Case Time Complexity: O(n*log n)
• Average Time Complexity: O(n*log n)
• Space Complexity : O(1)
Time and space complexity comparison
Shell sort
• Improved version of insertion sort
• Breaks the original list/array into a number of smaller sublists based
on the gap, each of which is sorted using an insertion sort.
• The method starts by sorting elements far apart from each other and
progressively reduces the gap between them.
• Starting with far-apart elements can move some out-of-place
elements into position faster than a simple nearest-neighbour
exchange.
Algorithm of Shell Sort
• Initialize the gap size.
• Divide the array into subarrays of equal gap size.
• Apply insertion sort on the subarrays.
• Repeat the above steps, halving the gap, until the gap size becomes 0,
resulting in a sorted array.
example
First pass, gap = n/2 = 8/2 = 4 (compare elements 4 positions apart):
5 6 3 1 9 8 4 7    compare (5, 9): no swapping
5 6 3 1 9 8 4 7    compare (6, 8): no swapping
5 6 3 1 9 8 4 7    compare (3, 4): no swapping
5 6 3 1 9 8 4 7    compare (1, 7): no swapping

Second pass, gap = n/4 = 8/4 = 2:
5 6 3 1 9 8 4 7    compare (5, 3): swapping
3 6 5 1 9 8 4 7    compare (6, 1): swapping
3 1 5 6 9 8 4 7    compare (5, 9): no swapping
3 1 5 6 9 8 4 7    compare (6, 8): no swapping
3 1 5 6 9 8 4 7    compare (9, 4): swapping
3 1 5 6 4 8 9 7    compare (5, 4): swapping
3 1 4 6 5 8 9 7    compare (8, 7): swapping
3 1 4 6 5 7 9 8

Third pass, gap = n/8 = 8/8 = 1:
3 1 4 6 5 7 9 8    compare (3, 1): swapping
1 3 4 6 5 7 9 8    compare (3, 4): no swapping
1 3 4 6 5 7 9 8    compare (4, 6): no swapping
1 3 4 6 5 7 9 8    compare (6, 5): swapping
1 3 4 5 6 7 9 8    compare (6, 7): no swapping
1 3 4 5 6 7 9 8    compare (7, 9): no swapping
1 3 4 5 6 7 9 8    compare (9, 8): swapping
1 3 4 5 6 7 8 9

Fourth pass: gap = n/16, which is less than 1, so we stop. The array is sorted.
pseudocode
shellSort(int arr[], int num)
{
    int i, j, gap;
    for (gap = num / 2; gap > 0; gap = gap / 2)
    {
        for (j = gap; j < num; j++)
        {
            for (i = j - gap; i >= 0; i = i - gap)
            {
                if (arr[i + gap] >= arr[i])
                    break;                      // gap-apart pair in order: stop
                else
                    swap(arr[i], arr[i + gap]); // exchange the out-of-order pair
            }
        }
    }
}
Time and space complexity
Time Complexity
    Best:    O(n log n)
    Worst:   O(n²)
    Average: O(n log n)
Space Complexity: O(1)



Radix sort
• Working of Radix Sort
• Step 1: Find the largest element in the array.
• Step 2: Find the number of digits in the largest element; let it be X. X
is calculated because we have to go through all the significant places
of all elements.
• Step 3: Sort the array by place value, using a stable sorting algorithm
(typically counting sort), starting from the ones place up to the maximum
place value. The groups formed for each digit are called buckets.
• Step 4: Combine all the buckets to finally get the sorted elements.
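
A C sketch of these steps, using counting sort as the stable per-place sort; the capacity constant, the assumption of non-negative keys, and the sample array are illustrative:

#include <stdio.h>

/* Stable counting sort of arr[0..n-1] by the digit at place `exp`
   (exp = 1, 10, 100, ... for ones, tens, hundreds). */
void counting_sort_by_digit(int arr[], int n, int exp) {
    int output[100];                          /* illustrative capacity */
    int count[10] = {0};
    for (int i = 0; i < n; i++)               /* count digit occurrences */
        count[(arr[i] / exp) % 10]++;
    for (int d = 1; d < 10; d++)              /* prefix sums: end positions */
        count[d] += count[d - 1];
    for (int i = n - 1; i >= 0; i--)          /* walk backwards for stability */
        output[--count[(arr[i] / exp) % 10]] = arr[i];
    for (int i = 0; i < n; i++)
        arr[i] = output[i];
}

void radix_sort(int arr[], int n) {
    int max = arr[0];                          /* step 1: largest element */
    for (int i = 1; i < n; i++)
        if (arr[i] > max) max = arr[i];
    for (int exp = 1; max / exp > 0; exp *= 10) /* steps 2-4: per digit place */
        counting_sort_by_digit(arr, n, exp);
}

int main(void) {
    int a[] = {170, 45, 75, 90, 802, 24, 2, 66};
    int n = sizeof a / sizeof a[0];
    radix_sort(a, n);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);   /* 2 24 45 66 75 90 170 802 */
    printf("\n");
    return 0;
}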
example
comparison

Sorting Algorithm   Best Case     Worst Case    Space complexity   In place
Bubble Sort         O(n)          O(n²)         O(1)               Yes
Insertion Sort      O(n)          O(n²)         O(1)               Yes
Selection Sort      O(n²)         O(n²)         O(1)               Yes
Quick Sort          O(n log n)    O(n²)         O(log n)           Yes
Merge Sort          O(n log n)    O(n log n)    O(n)               No
Heap Sort           O(n log n)    O(n log n)    O(1)               Yes
Shell Sort          O(n log n)    O(n²)         O(1)               Yes
Radix Sort          O(n·k)        O(n·k)        O(n+k)             No

Searching
• The process of identifying or finding a particular record is called
Searching
• successful or unsuccessful depending upon whether the element that
is being searched is found or not
• Linear Search or Sequential Search
• Binary Search
• Hash table and hash function
Linear search
• Basic and simple search algorithm
• We search for an element or value in a given array by traversing the array
from the start until the desired element or value is found
• It compares the element to be searched with all the elements present
in the array; when the element is matched successfully, it returns
the index of the element in the array, otherwise it returns -1
• Linear search is applied on unsorted or unordered lists, when there
are fewer elements in a list
• It has a best-case time complexity of O(1)
• It has a worst-case time complexity of O(n)
Pseudocode

LinearSearch(array, key)
    for each item in the array
        if item == key
            return its index
    return -1

Example in C, with A = [10, 5, 100, 73, 67, 34] and key = 34:

int found = 0;
for (i = 0; i < n; i++)
{
    if (a[i] == key)
    {
        printf("element found at index %d", i);
        found = 1;
        break;
    }
}
if (found == 0)
    printf("not found");
Binary search
• Fast search algorithm with run-time complexity of O(log n)
• Works on the principle of divide and conquer
• Data should be in sorted order
• Follows the divide and conquer approach: recursively divides the array into
two parts
• Worst-case time complexity is O(log n)
Algorithmic steps
Binary search(Array, left, right, key)
1) Start with the complete array of size n. Repeat steps 2 to 5 while left is
   less than or equal to right.
2) Calculate the mid value: mid = (left + right)/2
3) Case 1: If key is equal to the value at the middle index, return that index.
4) Case 2: If the key value is less than the value at the middle index, search
   the key in the left sub-array: set right = mid-1 and call
   Binary search(a, left, right, key).
5) Case 3: If the key value is greater than the value at the middle index, search
   the key in the right sub-array: set left = mid+1 and call
   Binary search(a, left, right, key).

int binary_search(int a[], int left, int right, int key)
{
    int mid;
    if (left <= right)
    {
        mid = (left + right) / 2;
        if (key == a[mid])
            return mid;                                /* found: return the index */
        else if (key < a[mid])
            return binary_search(a, left, mid - 1, key);   /* search left half */
        else
            return binary_search(a, mid + 1, right, key);  /* search right half */
    }
    return -1;                                         /* key not present */
Hash table and Hash Functions
• Hashing is a technique of mapping and searching keys/values into the
hash table by using a hash function
• Hash table is a type of data structure which is used for storing and
accessing data very quickly
• Hash function is a function which is applied on a key by which it
produces an integer, which can be used as an address of hash table
• It is done for faster access to elements
• The whole idea is to reduce the searching time to O(1)
• The efficiency of mapping depends on the efficiency of the hash
function used.
Hash function
• Hash function is a mathematical formula applied to the key which
produces an integer which can be used as an index for the key in the
hash table.
• The aim of a hash function is that elements should be distributed
randomly and uniformly across the table.
• The process of mapping the keys to appropriate locations in the hash
table is called hashing.
• A good hash function should minimize the number of collisions.
Properties of a good hash function
• Low cost : cost of computing should be small
• Determinism: The hash value is fully determined by the data being
hashed.
• Uniformity: The hash function "uniformly" distributes the data across
the entire set of possible hash values
• Collision: Number of collisions should be less while placing the data in
the hash table.
Hashing function
• Division method
• The hash function can be defined as:
      h(x) = x % m
  where x is the key value, m is the size of the hash table, and h(x) is the index of the key.
• It is good to choose m to be a prime number for uniform distribution of keys.
• Mid square method
• The key is squared and then the middle part of the result is taken as the index.
• When key = 5642, key² = 31832164, and taking the middle two digits gives h(key) = 32.
• Folding method
• In folding, the key is divided into parts, each with the same number of digits except
possibly the last. The parts are then added together, and the sum is used as the
address, ignoring the final carry. For example, for key 123456789 split into 123, 456
and 789: 123 + 456 + 789 = 1368, and ignoring the carry gives address 368.
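
A small C sketch of the three methods above; the function names, the chosen table size 97, and the 3-digit folding parts are illustrative:

#include <stdio.h>

/* Division method: index = key mod table size (m ideally prime). */
int hash_division(int key, int m) { return key % m; }

/* Mid-square method (illustrative): square the key and extract the
   two middle digits of the 8-digit square. */
int hash_midsquare(int key) {
    long long sq = (long long)key * key;   /* e.g. 5642 * 5642 = 31832164 */
    return (int)((sq / 1000) % 100);       /* middle two digits -> 32 */
}

/* Folding method (illustrative): add 3-digit parts, drop the carry. */
int hash_folding(int key) {
    int sum = 0;
    while (key > 0) {
        sum += key % 1000;                 /* take the next 3-digit part */
        key /= 1000;
    }
    return sum % 1000;                     /* keep 3 digits: ignore the carry */
}

int main(void) {
    printf("%d\n", hash_division(5642, 97));  /* 5642 mod 97 */
    printf("%d\n", hash_midsquare(5642));     /* 32 */
    printf("%d\n", hash_folding(123456789));  /* 368 */
    return 0;
}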
Collision resolution techniques
• Collisions occur when the hash function maps two different keys to the
same location.
• The methods used to solve the problem of collisions are called collision
resolution techniques.
• Open addressing
• Linear Probing
• Quadratic Probing
• Double Probing (double hashing)
• Chaining
• In chaining, the keys are not stored in the table itself, but in the info
portion of a linked list of nodes associated with each table position.
Linear probing
• Simplest approach to resolve collisions
• If a collision happens, the following hash function is used:
      h(x, i) = [h(x) + i] % m,  with h(x) = x % m,
  where m is the size of the hash table and i is the probe number, varying
  from 0 to m-1.
• Disadvantage: primary clustering
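
A minimal C sketch of insertion with linear probing; M, EMPTY, and the sample keys are illustrative, and keys are assumed non-negative:

#include <stdio.h>

#define M 10          /* illustrative table size */
#define EMPTY -1

/* Insert key using linear probing: try h(x), h(x)+1, h(x)+2, ...
   Returns the slot used, or -1 if the table is full. */
int insert_linear(int table[], int key) {
    for (int i = 0; i < M; i++) {
        int slot = (key % M + i) % M;     /* h(x, i) = [h(x) + i] % M */
        if (table[slot] == EMPTY) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;                            /* all M probes failed */
}

int main(void) {
    int table[M];
    for (int i = 0; i < M; i++) table[i] = EMPTY;
    /* 15 and 25 both hash to slot 5; 25 is pushed to slot 6. */
    printf("%d\n", insert_linear(table, 15));   /* 5 */
    printf("%d\n", insert_linear(table, 25));   /* 6 */
    return 0;
}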
Quadratic probing
• Resolves collisions while reducing the clustering of linear probing
• If a collision happens, the following hash function is used:
      h(x, i) = [h(x) + c₁·i + c₂·i²] % m,  with h(x) = x % m,
  where m is the size of the hash table, c₁ and c₂ are constants, and i is the
  probe number, varying from 0 to m-1.
• Disadvantage: secondary clustering
Double probing (double hashing)
• Resolves collisions using a second hash function to determine the probe step
• If a collision happens, the following hash function is used:
      h(x, i) = [h₁(x) + i·h₂(x)] % m,
  where m is the size of the hash table and i is the probe number, varying
  from 0 to m-1.
Chaining
• In chaining, each slot of the hash table is a linked list.
• We will insert the element into a specific linked list to store it in the
hash table.
• If there is a collision, i.e. if more than one element maps to the same
slot after calculating the hash value, then we store those elements in
the same linked list.
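
A minimal C sketch of chaining with h(x) = x % M; the names and table size are illustrative, and malloc error handling is omitted for brevity:

#include <stdio.h>
#include <stdlib.h>

#define M 10                 /* illustrative table size */

/* Each table slot heads a linked list of keys that hash to it. */
struct node {
    int key;
    struct node *next;
};

struct node *table[M];       /* all slots start as NULL (empty lists) */

void chain_insert(int key) {
    int slot = key % M;                      /* h(x) = x % M */
    struct node *n = malloc(sizeof *n);
    n->key = key;
    n->next = table[slot];                   /* prepend to the slot's list */
    table[slot] = n;
}

int chain_search(int key) {
    for (struct node *p = table[key % M]; p != NULL; p = p->next)
        if (p->key == key)
            return 1;                        /* found */
    return 0;                                /* not found */
}

int main(void) {
    chain_insert(15);        /* 15 and 25 collide at slot 5 ... */
    chain_insert(25);        /* ... and end up in the same list */
    printf("%d %d\n", chain_search(25), chain_search(35));   /* 1 0 */
    return 0;
}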
chaining
