Today's Material: - Medians & Order Statistics - Ch. 9
Selection: Problem Definition
• Given a sequence of numbers a1, a2, a3, …aN
and integer “i”, 1 <= i <= N, compute the ith
smallest element
• Can we do better than sorting the sequence and returning the ith
element, which takes O(n log n) time?
– There is a deterministic O(n) algorithm, but it is
very complicated and not very practical
– However, there is a simple randomized algorithm,
whose expected running time is O(n)
• We will only look at this randomized algorithm--next
Randomized Algorithms – An Intro
• A randomized algorithm is one that incorporates
a random number generator
• Studied extensively in recent years because many
practical algorithms make use of randomization
• There are 2 classes of randomized algorithms
– Monte Carlo Algorithms
• May produce an incorrect output, but the
probability of this happening is very small
– Las Vegas Algorithms
• Always produce the correct answer, but there is a small
probability that the algorithm takes longer than it should
– With Monte Carlo algorithms randomization affects
the result, with Las Vegas it affects the running time
A Simple Monte-Carlo Algorithm
• Problem: Given a number N, is N prime?
– Important for cryptography
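As an illustration of a Monte Carlo algorithm for this problem, here is a minimal sketch of the Miller-Rabin test. It is one well-known randomized primality test and not necessarily the one intended by this slide. The names `pow_mod` and `is_probably_prime`, the `rounds` parameter, and the restriction to 32-bit inputs are assumptions made for the example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Computes (base^exp) mod m. Assumes m < 2^32 so that the product of
   two residues always fits in 64 bits. */
static uint64_t pow_mod(uint64_t base, uint64_t exp, uint64_t m) {
    uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1) result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

/* Returns 0 if n is definitely composite, 1 if n is probably prime.
   A prime is never rejected; a composite slips through one round with
   probability at most 1/4, so the error shrinks as rounds grows. */
int is_probably_prime(uint32_t n, int rounds) {
    if (n < 2) return 0;
    if (n < 4) return 1;              /* 2 and 3 are prime */
    if (n % 2 == 0) return 0;

    /* Write n - 1 = d * 2^r with d odd. */
    uint32_t d = n - 1;
    int r = 0;
    while (d % 2 == 0) { d /= 2; r++; }

    for (int t = 0; t < rounds; t++) {
        /* Random base a in [2, n - 2]; rand() is used only for brevity. */
        uint64_t a = 2 + (uint64_t)rand() % (n - 3);
        uint64_t x = pow_mod(a, d, n);
        if (x == 1 || x == n - 1) continue;   /* a is not a witness */
        int witness = 1;
        for (int s = 1; s < r; s++) {
            x = (x * x) % n;
            if (x == n - 1) { witness = 0; break; }
        }
        if (witness) return 0;                /* n is composite */
    }
    return 1;
}
```

Here the randomization affects the result: a call such as `is_probably_prime(561, 20)` can in principle answer incorrectly, but only with a probability of at most 4^-20.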
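while (1) {
    while (A[j] > pivot) j--;            // Move j left past elements larger than the pivot
    while (A[i] < pivot && i < j) i++;   // Move i right past elements smaller than the pivot
    if (i >= j) break;                   // The two indices have met: stop
    Swap(&A[i], &A[j]);                  // Put both elements on their correct sides
    i++; j--;
} // end-while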
A Las Vegas Randomized Selection
• Observe that there are “q” elements <= pivot,
and hence the rank of the pivot is q
• If i == q then return A[q]
• If i < q then we select the ith smallest element
from the left sublist, A[1..q-1]
• If i > q then we select the (i-q)th smallest element
from the right sublist, A[q+1..N]
Randomized Selection: Pseudocode
Randomized Selection: C Code
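The original code for this slide is not available, so the following is only a sketch of how randomized selection could be written in C. It assumes 0-based array indices with a 1-based rank `i`, and it uses a single-scan (Lomuto-style) partition rather than the two-index loop shown earlier. The names `swap_int`, `random_partition`, and `randomized_select` are chosen for the example.

```c
#include <stdlib.h>

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Partitions A[lo..hi] around a randomly chosen pivot and returns the
   pivot's final index. Elements smaller than the pivot end up to its left. */
static int random_partition(int A[], int lo, int hi) {
    int p = lo + rand() % (hi - lo + 1);
    swap_int(&A[p], &A[hi]);          /* park the pivot at the end */
    int pivot = A[hi];
    int store = lo;                   /* next slot for a smaller element */
    for (int k = lo; k < hi; k++) {
        if (A[k] < pivot) { swap_int(&A[k], &A[store]); store++; }
    }
    swap_int(&A[store], &A[hi]);      /* move the pivot into place */
    return store;
}

/* Returns the i-th smallest element (1 <= i <= hi - lo + 1) of A[lo..hi].
   The array is reordered in the process. */
int randomized_select(int A[], int lo, int hi, int i) {
    while (lo < hi) {
        int q = random_partition(A, lo, hi);
        int rank = q - lo + 1;        /* rank of the pivot within A[lo..hi] */
        if (i == rank) return A[q];
        if (i < rank) {
            hi = q - 1;               /* answer lies in the left sublist */
        } else {
            i -= rank;                /* answer lies in the right sublist */
            lo = q + 1;
        }
    }
    return A[lo];
}
```

For an array `int A[] = {7, 2, 9, 4, 1, 8, 3};`, the call `randomized_select(A, 0, 6, 4)` returns 4, the fourth smallest value. The answer is always correct; only the number of partitioning rounds depends on the random pivots, which is what makes this a Las Vegas algorithm.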
Running Time - 1
• Because the algorithm is randomized, we
analyze its “expected” time complexity
– Where the expectation is taken over all possible
choices of the random pivot element
– $T(n) \le \frac{1}{n} \left[ \sum_{k=1}^{n-1} T(\max(k, n-k)) \right] + n$
– Basically, the recurrence can be simplified to:
– $T(n) \le \frac{2}{n} \left[ \sum_{k=n/2}^{n-1} T(k) \right] + n$
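As a quick numerical sanity check, the simplified recurrence T(n) <= (2/n) * [sum of T(k) for k = n/2 .. n-1] + n can be evaluated with equality to see how T(n)/n behaves. This is only a sketch: the base case T(0) = 0, the use of floor(n/2) as the lower summation index, and the function name `max_ratio` are assumptions made for the example.

```c
#include <stdlib.h>

/* Evaluates T(n) = (2/n) * sum_{k = floor(n/2)}^{n-1} T(k) + n with
   T(0) = 0 (an assumed base case) for n = 1..max_n, and returns the
   largest ratio T(n)/n encountered. Returns -1.0 if allocation fails. */
double max_ratio(int max_n) {
    double *T = malloc((size_t)(max_n + 1) * sizeof *T);
    if (T == NULL) return -1.0;
    T[0] = 0.0;
    double worst = 0.0;
    for (int n = 1; n <= max_n; n++) {
        double sum = 0.0;
        for (int k = n / 2; k <= n - 1; k++) sum += T[k];
        T[n] = 2.0 * sum / n + n;
        if (T[n] / n > worst) worst = T[n] / n;
    }
    free(T);
    return worst;
}
```

Under these assumptions the returned ratio stays at or below 4, which is consistent with a linear bound of the form T(n) <= c*n.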
Running Time - 3
• Then an induction argument is used to show
that T(n) <= c*n for some appropriately chosen
constant c
• After working through the induction proof
(see page 189 in CLRS), we arrive at the
condition
– c*(3n/4 – ½) + n < c*n
– This is satisfied for any c >= 4
– This technique of setting up an induction with an
unknown parameter, and then determining the
conditions on the parameter, is known as
“constructive induction”
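The condition above can be traced back to the recurrence with a short derivation. This is a sketch, assuming the simplified recurrence $T(n) \le \frac{2}{n} \sum_{k=n/2}^{n-1} T(k) + n$, the inductive hypothesis $T(k) \le ck$ for all $k < n$, and $n$ even to avoid floors.

```latex
\begin{align*}
T(n) &\le \frac{2}{n} \sum_{k=n/2}^{n-1} ck + n
      = \frac{2c}{n} \cdot \frac{n(3n-2)}{8} + n
      = c\left(\frac{3n}{4} - \frac{1}{2}\right) + n \\[4pt]
c\left(\frac{3n}{4} - \frac{1}{2}\right) + n < cn
  &\iff n < c\left(\frac{n}{4} + \frac{1}{2}\right)
   \iff c > \frac{4n}{n+2}
\end{align*}
```

Since $\frac{4n}{n+2} < 4$ for every $n \ge 1$, any $c \ge 4$ satisfies the condition.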
Deterministic Selection
• Once we find the median of the medians,
partition the array using the median of the
medians as the pivot
• Then run the algorithm recursively on the side
of the partition that contains the ith smallest element
Deterministic Selection
• (1) Divide the elements into roughly n/5
groups, each of size 5
• (2) Compute the median of each group (by any
method you like)
• (3) Compute the median of these n/5 group
medians
• How do you implement step (3)?
– You call deterministic selection recursively
– Since the list is of smaller size, it will eventually
terminate
– Why groups of 5?
• You need an odd number for median computation
• Groups of 3 do not work: the resulting recurrence no longer
solves to O(n). The smallest odd number greater than 3 is 5,
and any larger odd number (7, 9, ...) would work too.
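Steps (1) to (3), followed by the partition and the recursive call, can be sketched in C as follows. This is a minimal illustration and not the lecture's own code: it assumes 0-based indices with a 1-based rank `i`, sorts each group of 5 by insertion sort, collects the group medians at the front of the current range, and uses a single-scan partition. All function names are chosen for the example.

```c
static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Sorts the short range A[lo..hi] (at most 5 elements here). */
static void insertion_sort(int A[], int lo, int hi) {
    for (int k = lo + 1; k <= hi; k++) {
        int key = A[k], m = k - 1;
        while (m >= lo && A[m] > key) { A[m + 1] = A[m]; m--; }
        A[m + 1] = key;
    }
}

/* Partitions A[lo..hi] around A[pivot_index]; returns the pivot's final index. */
static int partition_at(int A[], int lo, int hi, int pivot_index) {
    swap_int(&A[pivot_index], &A[hi]);
    int pivot = A[hi], store = lo;
    for (int k = lo; k < hi; k++) {
        if (A[k] < pivot) { swap_int(&A[k], &A[store]); store++; }
    }
    swap_int(&A[store], &A[hi]);
    return store;
}

/* Returns the index of the i-th smallest element of A[lo..hi]. */
static int select_index(int A[], int lo, int hi, int i) {
    while (1) {
        if (hi - lo + 1 <= 5) {               /* small case: just sort */
            insertion_sort(A, lo, hi);
            return lo + i - 1;
        }
        /* Steps (1) and (2): sort each group of 5 and move its median
           to the front of the range, into A[lo..lo+groups-1]. */
        int groups = 0;
        for (int g = lo; g <= hi; g += 5) {
            int g_hi = (g + 4 <= hi) ? g + 4 : hi;
            insertion_sort(A, g, g_hi);
            swap_int(&A[lo + groups], &A[g + (g_hi - g) / 2]);
            groups++;
        }
        /* Step (3): recursive call to find the median of the medians. */
        int mom = select_index(A, lo, lo + groups - 1, (groups + 1) / 2);

        /* Partition around it and continue on the side holding rank i. */
        int q = partition_at(A, lo, hi, mom);
        int rank = q - lo + 1;
        if (i == rank) return q;
        if (i < rank) {
            hi = q - 1;
        } else {
            i -= rank;
            lo = q + 1;
        }
    }
}

/* Returns the i-th smallest (1 <= i <= n) element of A[0..n-1]; reorders A. */
int deterministic_select(int A[], int n, int i) {
    return A[select_index(A, 0, n - 1, i)];
}
```

For `int A[] = {12, 3, 5, 7, 4, 19, 26, 1, 8, 15, 10, 2, 6};`, the call `deterministic_select(A, 13, 7)` returns 7, the median. The result is correct for any input, but this simple partition does not treat equal keys specially, so the linear-time argument applies most directly when the values are distinct.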