Data Structures & Algorithms in Java - Robert Lafore - PPT
1
Objectives
Provide background in data structures (arrangements of data
in memory) and algorithms (methods for manipulating this
data)
Why is this important? The answer is there is far more to
programming than finding a solution that will work:
Execution time
Memory consumption
So by the end of the course, you should be better equipped not
just to develop a correct solution to a problem, but the best
solution possible.
2
Definition of a Data Structure
A data structure is an arrangement of data in a computer’s
memory (or disk).
Questions to ponder:
What are some examples of data structures you already know
about from your Java course?
3
Definition of an Algorithm
An algorithm provides a set of instructions for manipulating
data in structures.
Questions to ponder:
What’s an example of an algorithm?
4
Data Structure or Algorithm?
Linked List
Sort
Search
Stack
Vector
5
Real World Data Storage
Real-world data: data that describes physical entities external
to the computer. Can we think of some examples?
6
Real World Data Storage
Say we wanted to convert one of these systems to a computer
program. What must we consider?
Memory consumption:
Scalability:
7
Important: Data Structures can be
HUGE!!!
What are some example scenarios where we need large data
structures?
8
Programmer Tools
Do not store real-world data, but perform some function
internal to the machine.
Example - A ‘stack’ of data, where I can only insert or
remove elements from the top:
9
Real World Modeling
Effectively, ‘simulate’ a real-world situation.
For example, what could the following represent:
10
Real World Modeling
How about airline routes?
This type of structure is called an ‘undirected graph’
Example applications:
Grocery store lines
Traffic (Queues are actually used when determining timing of
traffic lights!! How? Let’s think about it)
12
Data Structure Trade-offs
A structure we have dealt with before: arrays
Requirement that is enforced:
Arrays store data sequentially in memory.
Let’s name the advantages (i.e., when is an array efficient?)
13
Overall Costs for Structures We’ll Study
Structure     Access   Search   Insert   Delete   Impl.      Memory
Array         Low      High     Med      High     Low        Low
Ord. Array    Low      Med      High     High     Med        Low
Linked List   High     High     Low      Low      Med        Med
Stack         Med      High     Med      Med      Med        Med
Queue         Med      High     Med      Med      Med        Med
Bin. Tree     Med      Low      Low      Low      High       High
R-B Tree      Med      Low      Low      Low      Very High  High
2-3-4 Tree    Med      Low      Low      Low      Very High  High
Hash Table    Med      Med      Low      High     Low        High
Heap          Med      Med      Low      Low      High       High
Graph         High     High     Med      Med      Med        Med
15
Databases
A database refers to all data that will be dealt with in a
particular situation. We can think of a database as a table of
rows and columns:
16
Database Records
A record is the unit into which a database is divided. How is a
record represented in a table? In a clothing catalogue?
18
Database Keys
Given a database (a collection of records), a common
operation is obviously searching. In particular we often want
to find a single particular record. But what exactly does this
mean? Each record contains multiple fields, i.e.:
20
Java.util Package
Includes Vector, Stack, Dictionary, and Hashtable. We won’t
cover these particular implementations but know they are
there and accessible through:
import java.util.*;
You may not use these on homeworks unless I explicitly say you
can.
Several other third-party libraries available
A central purpose of Java
21
Review of Object-Oriented
Programming
Procedural Programming Languages
Examples: C, Pascal, early BASIC
What is the main unit of abstraction?
Object-Oriented Languages:
Examples: C++, Ada, Java
What is the main unit of abstraction?
Obviously procedural languages weren’t good enough in all
cases. Let’s rediscover why.
22
Main Limitations of Procedural
Programming
1. Poor Real World Modeling. Let's discuss why.
2. Poor Organization of growing programs.
23
Idea of Objects
A programming unit which has associated:
Variables (data), and
Methods (functions to manipulate this data).
How does this address the two problems on the previous
slide?
Real World Modeling
Organization
24
Idea of Classes (Java, C++)
Objects by themselves can create lots of redundancy. Why?
class thermostat {
    private float currentTemp;
    private float desiredTemp;
}

thermostat therm1;                      // declare a reference (no object yet)
therm1 = new thermostat();              // create the object
thermostat therm2 = new thermostat();   // declare and create in one step
26
Invoking Methods of an Object
Parts of the program external to the class can access its
methods (provided they are declared public):
Dot operator:
therm2.furnace_on();
Can I access data members similarly?
therm2.currentTemp = 77;
What would I need to change to do so?
Is this change good programming practice?
How, ideally, should data members be accessed?
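One common way, sketched here with hypothetical accessor names (the thermostat class on the earlier slide doesn't define these):

class thermostat {
    private float currentTemp;
    // 'getter': read-only access to the private field
    public float getCurrentTemp() { return currentTemp; }
    // 'setter': the one controlled way to modify it (could validate first)
    public void setCurrentTemp(float t) { currentTemp = t; }
}

Now therm2.setCurrentTemp(77); works while currentTemp itself stays private.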
27
Another Example
If you have your book, look at the BankAccount class on page
18. If you don't have it, don't worry, I'll write it on the
board.
Look at the output. Let’s go over why this is generated.
28
Inheritance
Creation of one class, called the base class
Creation of another class, called the derived class
Has all features of the base, plus some additional features.
Example:
A base class Animal may have associated methods eat() and
run() and variable name, and a derived class Dog can inherit
from Animal, gaining these methods plus a new method
bark().
If name is private, does Dog have the attribute?
How do we enforce Dog to also have attribute name?
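A minimal sketch of that hierarchy (the method bodies are placeholders):

class Animal {
    protected String name = "some animal";  // protected: derived classes can use it directly
    public void eat() { System.out.println(name + " eats"); }
    public void run() { System.out.println(name + " runs"); }
}

class Dog extends Animal {                  // Dog inherits name, eat() and run()...
    public void bark() { System.out.println(name + " barks"); }  // ...and adds bark()
}

If name were private, every Dog object would still contain the field, but Dog's own methods could not access it directly; declaring it protected (as above) or providing accessors gives the derived class access.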
29
Polymorphism
Idea: Treat objects of different classes in the same way.
What’s the requirement?
30
Software Engineering
Bigger picture. How do the topics of this course relate to
software engineering?
The life cycle of a software project consists of the following:
Specification – purpose of the software, requirements, etc.
Design – components and interaction of the software
Verification – individual components and global functionality
Coding – actual writing of the software
Testing – validating proper functionality
Production – distribution to the community
Maintenance – updating the software (FYI in a poor design,
these costs can be high!!)
31
Stage 1: Specification
Here we answer the questions of what purpose the software
serves, and requirements. Very high level at this stage.
How do data structures fit in?
The specification _______________ the data structures.
Why?
32
Stage 2: Design
At this stage, we break software into components and
describe their interaction.
The data structures help _______________ the design.
Why?
33
Stage 3: Verification
Verification involves a review of all components, studying
their individual inputs and outputs and ensuring functionality
of the design.
In order to do this, we have to know the expected ________
and __________ of the data structures. Why?
34
Stage 4: Coding
Involves the actual writing of the software.
How can data structures save time at this stage? Think in
terms of reusable components.
35
Stage 5: Testing
Involves running the software package through a set of
benchmarks, which make up a test suite, and verifying proper
functionality. You can also test individual components.
How does understanding a data structure help in terms of
testing?
36
Stage 6: Production
Distributing the software to the community.
If we use reusable components from another source, what
must we do in this case? Think about how Java packages
work.
37
Stage 7: Maintenance
Updating the software while keeping the internal design
intact.
How do generic components come in handy here? How
might individual components (and thus their associated data
structures) have to scale with issues such as data size? How
about data types?
In a simple situation, think about maintenance on a software
package where every data structure was hardcoded to operate
on only integers. What happens when we extend to floats?
See why these costs can be high? How can software updates
propagate between components?
Dependencies can grow exponentially!
38
Final Review of some Java Concepts
Difference between a value and a reference:
int intVar;
BankAccount bc1;
39
Java Assignments
What must be noted about the following code snippet:
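For example (BankAccount as in the earlier slides; this particular snippet is illustrative):

BankAccount acct1 = new BankAccount(350.00);
BankAccount acct2 = acct1;   // copies the REFERENCE: both names refer to one object
int a = 5;
int b = a;                   // copies the VALUE: a and b are independent

After the second line, a deposit made through acct2 is visible through acct1, because there is only one object.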
40
Java Garbage Collection
When is the memory allocated to an object reclaimed in Java?
Code like this would leak memory in C++, but does not in Java
because of the garbage collector:
while (true) {
Integer tmp = new Integer(0);
…
}
41
Passing by Value vs. Reference
Same idea:
void method1() {
    BankAccount ba1 = new BankAccount(350.00);
    float num = 4;
    method2(ba1);   // a copy of the reference: method2 can modify the object itself
    method3(num);   // a copy of the value: method3 cannot affect num
}
void method2(BankAccount acct) { … }
void method3(float f) { … }
42
== vs. equals()
carPart cp1 = new carPart("fender");
carPart cp2 = cp1;
// What's the difference between this:
if (cp1 == cp2)
    System.out.println("Same");
// And this:
if (cp1.equals(cp2))
    System.out.println("Same");
Does “Same” print twice, once, or not at all?
43
Primitive Sizes and Value Ranges
Source: roseindia.net
44
Screen Output
System.out is an output stream which corresponds to
standard output, which is the screen:
45
Keyboard Input
Package: java.io
Read a string:
InputStreamReader isr = new InputStreamReader(System.in);
BufferedReader br = new BufferedReader(isr);
String s = br.readLine();   // readLine() throws IOException: declare or catch it
47
Data Structure #2: Arrays
88
The Array
Most commonly used data structure
Common operations
Insertion
Searching
Deletion
How do these differ for an ‘ordered array’?
How do these differ for an array which does not allow
duplicates?
89
Array Storage
An array is a collection of data of the same type
Stored linearly in memory:
90
Remember, value vs. reference…
In Java:
Data of a primitive type is a ____________.
All objects are ________________.
91
Defining a Java Array
Say, of 100 integers:
int[] intArray;
intArray = new int[100];
92
We said an array was a reference…
That means if we do this:
int[] intArray;
intArray = new int[100];
93
The Size
Size of an array cannot change once it’s been declared:
intArray = new int[100];
But, one nice thing is that arrays are objects. So you can
access its size easily:
int arrayLength = intArray.length;
94
Access
Done by using an index number in square brackets:
int temp = intArray[3]; // Gets 4th element
intArray[7] = 66; // Sets 8th element
95
Initialization
What do the elements of this array contain:
int[] intArray = new int[100];
How about this one:
BankAccount[] myAccounts = new BankAccount[100];
What happens if we attempt to access one of these values?
96
Look at a book example…
See the example on p. 41-42, where we do the following:
Insert 10 elements into an array of integers
Display them
Find item with key 66
Delete item with key 55
Display them
Ask ourselves:
How could we make the initialization shorter?
How could we save declaring nElems?
97
This did not use OOP
So our next task will be to divide it up (p. 45)
What will we want for the array class? Let’s think about the
purpose of classes. They have data, and functions to manipulate
that data.
98
The LowArray interface
Here’s what it looked like:
100
Abstraction
This illustrates the concept of abstraction
The way in which an operation is performed inside a class is
invisible
Client of HighArray performs more complex operations
through simple method invocations
Never directly accesses the private data in the array
101
The Ordered Array
An array in which the data items are arranged in ascending
order
Smallest value is at index:
Largest value is at index:
102
That’s right, searching!
We can still do a linear search, which is what we’ve seen.
Step through the elements
103
Binary Search: Idea
Ever see the Price is Right?
Guess the price on an item
If guess is too low, Bob Barker says “higher”
If guess is too high, Bob Barker says “lower”
104
Note what this can save!
Let’s take a simple case, where we search for an item in a
100-element array:
int[] arr = {1,2,3,4,5,6,…..,100}
105
Binary Search
Array has values 1-100
First search: Check element 50
50 > 33, so repeat on first half (1-49)
Second search: Check element 25
25 < 33, so repeat on second half (26-49)
Third search: Check element 37
37 > 33, so repeat on first half (26-36)
Fourth search: Check element 31
31 < 33, so repeat on second half (32-36)
Fifth search: Check element 34
34 > 33, so repeat on first half (32-33)
Sixth search: Check element 32
32 < 33, so repeat on second half (33)
Seventh search: Check element 33! Found.
So 7 comparisons. With linear search, it would've been 33.
106
Effect on Operations
We saw how binary search sped up the searching operation
Can it also speed up deletion?
107
Implementation
Let’s go through the Java implementation, on pages 56-57.
At any given time:
lowerBound holds the lower index of the range we are searching
upperBound holds the upper index of the range we are
searching
curIn holds the current index we are looking at
What if the element is not in the array? What happens?
108
Now, let’s implement the OrdArray
Data
The array itself
The number of occupied slots
Methods
Constructor
Size
Find (with binary search)
Insert (with binary search)
Delete (with binary search)
Display
109
Analysis
What have we gained by using ordered arrays?
Is searching faster or slower?
Is insertion faster or slower?
Is deletion faster or slower?
110
Ordered Array: Operation Counts
Maximum number of comparisons for an ordered array of n elements,
running binary search:
n            Comparisons
10           4
100          7
1,000        10
10,000       14
100,000      17
1,000,000    20
How does this compare with linear search, particularly for large arrays?
Whew.
111
A Deeper Analysis
How many comparisons would be required for an array of
256 elements? (2^8)
What about 512 (2^9)?
What do you think 1024 would be (2^10)?
See the pattern?
112
Computing log2n
On a calculator, if you use the “log” button, usually the base is
10. If you want to convert:
Multiply by 3.322
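For example, log2(1,000,000) = log10(1,000,000) * 3.322 = 6 * 3.322, or about 20, matching the table two slides back. In Java you can compute it directly (natural logs work the same way):

double log2n = Math.log(1000000.0) / Math.log(2.0);   // about 19.93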
113
Storing Objects
We’ve seen an example where we used arrays to store
primitive data. Now let’s look at an example which stores
objects. What’s our situation now with values and
references?
Implications?
114
Person Class
Let’s go through the Person class on page 65.
Data:
First name and last name (String objects), age (integer value)
Functions
Constructor which takes two strings and an integer
Function to display information
Function to return the last name (we’ll eventually use this for
searching)
115
Adapting our HighArray class
Rewrite the implementation on page 49
Change to operate on Persons instead of integers
Watch out for the ==!
In main() construct Person objects
116
Big-Oh Notation
Provides a metric for evaluating the efficiency of an
algorithm
Analogy: Automobiles
Subcompacts
Compacts
Midsize
etc.
117
How it’s done
It’s difficult to simply say: A is twice as fast as B
We saw with linear search vs. binary search, the comparison
can be different when you change the input size. For
example, for an array of size n:
n=16, linear search comparisons = 10, binary search
comparisons = 5
Binary search is 2x as fast
n=32, linear search comparisons = 32, binary search
comparisons = 6
Binary search is 5.3x as fast
118
Example: Insertion into Unordered
Array
Suppose we just insert at the next available position:
Position is a[nElems]
Increment nElems
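As a sketch, using the a and nElems members of the book's array classes:

public void insert(long value) {
    a[nElems] = value;   // drop it into the next free slot: no shifting
    nElems++;            // constant work regardless of n, hence O(1)
}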
119
Example: Linear search
You’ll require a loop which runs in the worst case n times
Each time, you have to:
Increment a loop counter
Compare the loop counter to n
Compare the current element to the key
Each of these operations take time independent of n, so let’s say
they consume a total time of K.
120
Example: Binary Search
We’ve already said that for an array of n elements, we need
log(n)+1 comparisons.
Each comparison takes time independent of n, call it K
121
Why this is useful
Useful to evaluate how well an algorithm scales with input
size n. For example:
O(1) scales better than…
O(log n), which scales better than…
O(n), which scales better than…
O(n log n), which scales better than…
O(n^2), etc.
122
Generally speaking…
For an input of size n and a function T(n), to compute the
Big-Oh value, you take the leading term and drop the
coefficient.
Examples – compute Big Oh values of the following
runtimes:
T(n) = 100*n^2 + n + 70000
T(n) = (n*log n) / n
T(n) = n^3 + 754,000*n^2 + 1
T(n) = (n + 2) * (log n)
123
But, these large constants must mean
something…
T(n) = n^3 + 754,000*n^2 + 1
This huge constant on the n^2 term has to have some effect,
right?
The answer is yes and no.
124
Algorithms we’ve discussed…
Linear search: O(n)
Binary search: O(log n)
Insertion, unordered array: O(1)
Insertion, ordered array: O(n)
Deletion, unordered array: O(n)
Deletion, ordered array: O(n)
125
Graph of Big O times.
See page 72.
126
Unordered/Ordered Array Tradeoffs
Unordered
Insertion is fast – O(1)
Searching is slow – O(n)
Ordered
Searching is fast – O(log n)
Insertion is slow – O(n)
Deletion is the same either way – O(n)
Memory can be wasted, or even misused. Let’s discuss.
127
What we will see…
There are structures (trees) which can insert, delete and
search in O(log n) time
Of course as you’d expect, they’re more complex
We will also learn about structures with flexible sizes
java.util has class Vector – what you should know:
Array of flexible size
Some efficiency is lost (why do you think?)
What happens when we try to go beyond the current size?
Why is this penalty very large at the beginning of array population?
128
Sorting Algorithms
129
Sorting in Databases
Many possibilities
Names in alphabetical order
Students by grade
Customers by zip code
Home sales by price
Cities by population
Countries by GNP
Stars by magnitude
130
Sorting and Searching
We saw with arrays they could work in tandem to improve
speed
What was the search method that required sorting an array?
131
Basic Sorting Algorithms
Bubble
Selection
Insertion
132
Example
Unordered:
Ordered:
134
Sort #1: The Bubble Sort
Way to envision:
Suppose you're ‘nearsighted’
You can only see two adjacent players at the same time
How would you sort them?
135
Bubble Sort:
First Pass
136
Bubble Sort: End of First Pass
139
How many operations total then?
First pass: n-1 comparisons, n-1 swaps
Second pass: n-2 comparisons, n-2 swaps
Third pass: n-3 comparisons, n-3 swaps
(n-1)th pass: 1 comparison, 1 swap
Then it’s sorted
141
Invariants
Algorithms tend to have invariants, i.e. facts which are true
all the time throughout its execution
In the case of the bubble sort, what is always true is….
142
Sort #2: Selection Sort
Purpose:
Improve the speed of the bubble sort
Number of comparisons: O(n^2)
Number of swaps: O(n)
143
What’s Involved
Make a pass through all the players
Find the shortest one
Swap that one with the player at the left of the line
At position 0
Now the leftmost is sorted
Find the shortest of the remaining (n-1) players
Swap that one with the player at position 1
And so on and so forth…
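A minimal sketch of those steps:

public static void selectionSort(long[] a) {
    for (int out = 0; out < a.length - 1; out++) {
        int min = out;
        for (int in = out + 1; in < a.length; in++)  // scan for the smallest remaining
            if (a[in] < a[min])
                min = in;
        long temp = a[out];                          // one swap per pass
        a[out] = a[min];
        a[min] = temp;
    }
}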
144
Selection Sort in
Action
145
Count Operations
First Pass, for an array of size n:
How many comparisons were made?
How many (worst case) swaps were made?
146
Let’s do an example ourselves…
Use a selection sort to sort an array of ten integers:
147
How many operations total then?
First pass: n-1 comparisons, 1 swap
Second pass: n-2 comparisons, 1 swap
Third pass: n-3 comparisons, 1 swap
(n-1)th pass: 1 comparison, 1 swap
Then it’s sorted
149
A method of an array class…
See page 93, we’ll go through it.
150
Sort #3: Insertion Sort
In most cases, the best one…
2x as fast as bubble sort
Somewhat faster than selection in MOST cases
151
Proceed..
A subarray to the left is
‘partially sorted’
Start with the first element
The player immediately to
the right is ‘marked’.
The ‘marked’ player is
inserted into the correct
place in the partially sorted
array
Remove first
Marked player ‘walks’ to
the left
Shift appropriate elements
until we hit a smaller one
152
Let’s do an example ourselves…
Use an insertion sort to sort an array of ten integers:
153
Count Operations
First Pass, for an array of size n:
How many comparisons were made?
How many swaps were made?
Were there any? What was there instead?
154
How many operations total then?
First pass: 1 comparison, 1 copy
Second pass: 2 comparisons, 2 copies
Third pass: 3 comparisons, 3 copies
(n-1)th pass: n-1 comparisons, n-1 copies
Then it’s sorted
155
Why are we claiming it’s better than
selection sort?
A swap is an expensive operation. A copy is not.
To see this, how many copies are required per swap?
Selection/bubble used swaps, insertion used copies.
156
Flow
157
Implementation
We’ll write the function (pages 99-100) together
Use the control flow on the previous slide
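For reference, a sketch in the same spirit as the book's version:

public static void insertionSort(long[] a) {
    for (int out = 1; out < a.length; out++) {   // everything left of out is partially sorted
        long marked = a[out];                    // remove the marked element
        int in = out;
        while (in > 0 && a[in - 1] > marked) {   // walk left, shifting larger items right
            a[in] = a[in - 1];                   // a copy, not a full swap
            in--;
        }
        a[in] = marked;                          // insert into the opened slot
    }
}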
158
And finally…
Encapsulate the functionality within an array class (p. 101-
102)
159
Invariant of Insertion Sort
At the end of each pass, the data items with indices smaller
than __________ are partially sorted.
160
Sorting Objects
Let’s modify our insertion sort to work on a Person class,
which has three private data members:
lastName (String)
firstName (String)
age (int)
161
Lexicographic Comparison
For Java Strings, you can lexicographically compare them
through method compareTo():
s1.compareTo(s2);
Returns an integer
If s1 comes before s2 lexicographically, returns a value < 0
If s1 is the same as s2, returns 0
If s1 comes after s2 lexicographically, returns a value > 0
162
Stable Object Sorts
Suppose you can have multiple persons with the same last
name.
Now given this ordering, sort by something else (first name)
A stable sort retains the first ordering when the second sort
executes.
163
Sort Comparison: Summary
Bubble Sort – hardly ever used
Too slow, unless data is very small
Selection Sort – slightly better
Useful if: data is quite small and swapping is time-consuming
compared to comparisons
Insertion Sort – most versatile
Best in most situations
Still for large amounts of highly unsorted data, there are better
ways – we’ll look at them
Memory requirements are not high for any of these
164
Stacks and Queues
165
New Structures
Stack
Queue
Priority Queue
What’s “new”?
Contrast with arrays
Usage
Access
Abstraction
166
Usage
Arrays are conducive for databases
Data which will be accessed and modified
Easy operations for insertion, deletion and searching
Although some of these are time consuming
167
Access
Arrays allow immediate access to any element
Takes constant time
Very fast
168
Abstraction
A bit higher than arrays. Why?
When a user indexes an array, they specify a memory address
Indirectly, because they say:
Array name -> address of the first element
Index -> offset from that address (index * size of an array element)
With stacks and queues, everything is done through methods
User has no idea what goes on behind the scenes
Also no initial size needed
BIGGEST THING
Stacks, queues and priority queues can use arrays as their underlying
structure
Or linked lists…
From the user’s perspective, they are one and the same
169
Stack
A stack only allows access to the last item inserted
To get the second-to-last, remove the last
Analogy: US Postal Service
170
Performance Implication
Note what we can already infer about stack performance!
It is critical that we are able to process mail efficiently
Otherwise what happens to the letters on the bottom?
171
Applications
Compilers
Balancing parentheses, braces, brackets
Symbol tables
Parsing arithmetic expressions
Traversing nodes of trees and graphs
Invoking methods
Pocket calculators
172
The ‘push’ operation
Pushing involves placing an element on the top of the stack
Analogy: Workday
You’re given a long-term project A (push)
A coworker interrupted for temporary help with project B
(push)
Someone in accounting stops by for a meeting on project C
(push)
Emergency call for help on project D (push)
Analogy: Workday
Finish the emergency call with project D (pop)
Finish the meeting on project C (pop)
Finish the help on project B (pop)
Complete the long-term project A (pop)
174
The ‘peek’ operation
Peek allows you to view the element on top of the stack
without removing it.
175
Stack Class
Java implementation, page 120
Let’s go through it
Note, we have to pick our internal data structure
For now we’ll stick with what we know: The Array
And analyze the main()
176
Stack class methods
Constructor:
Accepts a size, creates a new stack
Internally allocates an array of that many slots
push()
Increments top and stores a data item there
pop()
Returns the value at the top and decrements top
Note the value stays in the array! It’s just inaccessible (why?)
peek()
Return the value on top without changing the stack
isFull(), isEmpty()
Return true or false
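Condensed, the class looks roughly like this (cf. page 120):

class StackX {
    private long[] stackArray;
    private int top;                        // index of the top element
    public StackX(int size) { stackArray = new long[size]; top = -1; }
    public void push(long j)  { stackArray[++top] = j; }
    public long pop()         { return stackArray[top--]; }  // value stays in the array
    public long peek()        { return stackArray[top]; }
    public boolean isEmpty()  { return (top == -1); }
    public boolean isFull()   { return (top == stackArray.length - 1); }
}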
177
Pictorially, let's view the execution of
main()
StackX theStack =
new StackX(10);
178
Push
theStack.push(20);
179
theStack.push(40);
180
theStack.push(60);
181
theStack.push(80);
182
Pop
while (!theStack.isEmpty())
{
long value = theStack.pop();
…
183
Print
while (!theStack.isEmpty())
{
…
System.out.print(value);
System.out.print(" ");
184
Error Handling
When would it be sensible to perform error handling in
the case of the stack?
Which function would we add it to?
And how would we do it?
191
Example Application: Word Reversal
Let’s use a stack to take a string and reverse its characters
How could this work? Let’s look.
Reminder of the available operations with Strings:
If I have a string s
s.charAt(j) <- Return character with index j
s + “…” <- Append a string (or character) to s
What would we need to change about our existing stack class?
Reverser, page 125
192
Example Application: Delimiter Matching
This is done in compilers!
Parse text strings in a computer language
Sample delimiters in Java:
{, }
[, ]
(, )
193
Example Strings
c[d]
a{b[c]d}e
a{b(c]d}e
a[b{c}d]e}
a{b(c)
194
Algorithm
Read each character one at a time
If an opening delimiter, place on the stack
If a closing delimiter, pop the stack
If the stack is empty, error
Otherwise if the opening delimiter matches, continue
Otherwise, error
If the stack is not empty at the end, error
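A sketch of that algorithm, using a plain char array as the stack (a class like StackX with a char array inside works equally well):

public static boolean delimitersMatch(String s) {
    char[] stack = new char[s.length()];            // worst case: all openers
    int top = -1;
    for (int i = 0; i < s.length(); i++) {
        char ch = s.charAt(i);
        if (ch == '{' || ch == '[' || ch == '(') {
            stack[++top] = ch;                      // opening delimiter: push it
        } else if (ch == '}' || ch == ']' || ch == ')') {
            if (top < 0) return false;              // closer with an empty stack: error
            char open = stack[top--];               // pop and check the match
            if ((ch == '}' && open != '{') ||
                (ch == ']' && open != '[') ||
                (ch == ')' && open != '(')) return false;
        }
    }
    return top == -1;                               // leftover openers: error
}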
195
Example
Let’s look at a stack for a{b(c[d]e)f}
196
Example
Let’s do one that errors: a[b{c}d]e}
Together on the board
197
Java Implementation
Let’s implement the checker together
Page 129
We’ll write a function which accepts a string input
And returns true or false depending on if the string has all
delimiters matching
We can use the Stack class where the internal array held
characters
198
Stacks: Evaluation
For the tools we saw: reversing words and matching
delimiters, what about stacks made things easier?
i.e. What would have been difficult with arrays?
Why does using a stack make your program easier to
understand?
Efficiency
Push -> O(1) (Insertion is fast, but only at the top)
Pop -> O(1) (Deletion is fast, but only at the top)
Peek -> O(1) (Access is fast, but only at the top)
199
Queues
British for “line”
Somewhat like a stack
Except, first-in-first-out
Thus this is a FIFO structure.
200
Analogy:
Line at the movie theatre
Last person to line up is the last person to buy
201
Applications
Graph searching
Simulating real-world situations
People waiting in bank lines
Airplanes waiting to take off
Packets waiting to be transmitted over the internet
Hardware
Printer queue
Keyboard strokes
Guarantees the correct processing order
202
Queue Operations
insert()
Also referred to as put(), add(), or enqueue()
Inserts an element at the back of the queue
remove()
Also referred to as get(), delete(), or dequeue()
Removes an element from the front of the queue
peekRear()
Element at the back of the queue
peekFront()
Element at the front of the queue
203
Question
In terms of memory now, what about the queue do we need
to worry about?
That we did not have to worry about with the stack
Hint: Think in terms of the low-level representation
204
Insert and remove occur at opposite
ends!!!
Whereas with a stack,
they occurred at the
same end
That means that if we
remove an element we
can reuse its slot
With a queue, you
cannot do that
Unless….
205
Circular Queue
Indices ‘wraparound’
206
Java Implementation
Page 137-138 in textbook, which again uses an internal array
representation
We’ll construct that class
Then analyze the main function pictorially
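The wraparound lives in insert() and remove(); a condensed sketch along the book's lines:

class Queue {
    private long[] queArray;
    private int front = 0, rear = -1, nItems = 0;
    public Queue(int size) { queArray = new long[size]; }
    public void insert(long j) {
        if (rear == queArray.length - 1) rear = -1;   // wrap rear around to the start
        queArray[++rear] = j;
        nItems++;
    }
    public long remove() {
        long temp = queArray[front++];
        if (front == queArray.length) front = 0;      // wrap front around to the start
        nItems--;
        return temp;
    }
    public boolean isEmpty() { return (nItems == 0); }
}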
207
Queue theQueue = new Queue(5);
208
theQueue.insert(10);
209
theQueue.insert(20);
210
theQueue.insert(30);
211
theQueue.insert(40);
213
theQueue.remove();
214
theQueue.remove();
215
theQueue.remove();
216
theQueue.insert(50);
217
theQueue.insert(60);
218
theQueue.insert(70);
219
theQueue.insert(80);
220
Remove and print…
while (!theQueue.isEmpty()) {
    long n = theQueue.remove();
    System.out.print(n + " ");
}
221
Queues: Evaluation
Some implementations remove nItems
Front and rear indices alone then determine whether the queue is full
or empty, and its size
But a full queue and an empty queue can look the same (why?)
Additional overhead when determining size (why?)
Can remedy these by making the array one slot larger than the max number
of items
Efficiency
Same as stack:
Insert: O(1), only at the rear
Remove: O(1), only at the front
Access: O(1), only at the front
226
Priority Queues
Like a Queue
Has a front and a rear
Items are removed from the front
Difference
No longer FIFO
Items are ordered
We have seen ordered arrays. A priority queue is essentially
an ‘ordered queue’
227
Priority Queue Implementation
Almost NEVER use arrays. Why?
Application in Computing
Programs with higher priority, execute first
Print jobs can be ordered by priority
Nice feature: The min (or max) item can be found in O(1)
time
228
(Time Pending) Java Implementation
Page 147
Biggest difference will be the insert() function
Analysis
delete() - O(1)
insert() - O(n) (again, since arrays are used)
findMin() - O(1) if arranged in ascending order
findMax() – O(1) if arranged in descending order
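A sketch of that insert(), keeping the internal array in descending order so the minimum sits at the high end and remove() is simply return queArray[--nItems]:

public void insert(long item) {
    int j;
    for (j = nItems - 1; j >= 0; j--) {       // walk down from the small end
        if (item > queArray[j])
            queArray[j + 1] = queArray[j];    // shift smaller items toward the end
        else
            break;                            // found where item belongs
    }
    queArray[j + 1] = item;                   // drop it into the gap
    nItems++;
}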
229
Parsing Arithmetic Expressions
A task that must be performed by devices such as computers
and calculators
Parsing is another word for analyzing, that is, piece by piece
230
How it’s done…
1. Transform the arithmetic expression into postfix notation
Operators follow their two operands, i.e.
3+4 = 34+ (in postfix)
2*(3+4) = 234+* (in postfix)
May seem silly, but it makes the expression easier to evaluate
with a stack
231
Some practice
Convert the following to postfix:
3*5
3+8*4 (remember the rules of precedence!)
(3+4)*(4+6)
232
Translating infix to postfix
Think conceptually first. How do we evaluate something
like: 2*(3+4) to get 14?
Read left to right
When we’ve read far enough to evaluate two operands and an
operator - in the above case, 3+4
Evaluate them: 3+4=7
Substitute the result: 2*7 = 14
Repeat as necessary
233
Parsing in our Heads
2*(3+4)
We have to evaluate anything in parentheses before using it
Read Parsed
2 2
2* 2*
2*( 2*(
2*(3 2*(3
2*(3+ 2*(3+
2*(3+4) 2*(3+4)
2*7
14
234
Precedence
3+4*5
Note here we don’t evaluate the ‘+’ until we know what follows
the 4 (a ‘*’)
So the ‘parsing’ proceeds like this:
Read Parsed
3 3
+ 3+
4 3+4
* 3+4*
5 3+4*5
3+20
23
235
Summary
We go forward reading operands and operators
When we have enough information to apply an operator, go
backward and recall the operands, then evaluate
Sometimes we have to defer evaluation based on precedence
236
Infix to Postfix: Algorithm
Start with your infix expression, and an empty postfix string
Infix: 2*(3+4) Postfix:
Go through the infix expression character-by-character
For each operand:
Copy it to the postfix string
For each operator:
Copy it at the ‘right time’
When is this? We’ll see
237
Example: 2*(3+4)
Read   Postfix   Comment
2      2         Operand
*      2         Operator
(      2         Operator
3      23        Operand
+      23        Operator
4      234       Operand
)      234+      Saw ), copy +
       234+*     Copy remaining ops
238
Example: 3+4*5
Read   Postfix   Comment
3      3         Operand
+      3         Operator
4      34        Operand
*      34        Operator
5      345       Operand
       345*      Saw 5, copy *
       345*+     Copy remaining ops
239
Rules on copying operators
You cannot copy an operator to the postfix string if:
It is followed by a left parenthesis ‘(‘
It is followed by an operator with higher precedence (i.e., a ‘+’
followed by a ‘*’)
If neither of these are true, you can copy an operator once
you have copied both its operands
240
How can we use a stack?
Suppose we have our infix expression, empty postfix string
and empty stack S. We can have the following rules:
If we get an operand, copy it to the postfix string
If we get a ‘(‘, push it onto S
If we get a ‘)’:
Keep popping S and copying operators to the postfix string until either S
is empty or the item popped is a ‘(‘
Any other operator:
If S is empty, push it onto S
Otherwise, while S is not empty and the top of S is not a ‘(‘ or an
operator of lower precedence, pop S and copy to the postfix string
Push operator onto S
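Those rules translate into a short method; a sketch for single-digit operands and the four basic operators (prec() is a helper of our own, not from the book):

static int prec(char op) { return (op == '+' || op == '-') ? 1 : 2; }

public static String toPostfix(String infix) {
    StringBuilder out = new StringBuilder();
    char[] s = new char[infix.length()];             // the operator stack
    int top = -1;
    for (char ch : infix.toCharArray()) {
        if (Character.isDigit(ch)) out.append(ch);   // operand: copy straight out
        else if (ch == '(') s[++top] = ch;
        else if (ch == ')') {
            while (top >= 0 && s[top] != '(')
                out.append(s[top--]);                // copy operators back to the '('
            top--;                                   // discard the '(' itself
        } else {                                     // operator: + - * /
            while (top >= 0 && s[top] != '(' && prec(s[top]) >= prec(ch))
                out.append(s[top--]);                // pop equal/higher precedence first
            s[++top] = ch;
        }
    }
    while (top >= 0) out.append(s[top--]);           // copy remaining operators
    return out.toString();
}

toPostfix("2*(3+4)") returns "234+*" and toPostfix("3+4*5") returns "345*+", matching the tables above.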
242
Evaluating postfix expressions
If we go through the trouble of converting to postfix, there’s
got to be a reason, right?
Well, there is! The resulting expression is much easier to
evaluate, once again using a stack
244
Why easier?
It is clear what operators go with which operands
Order of operations is enforced – removed from our concern
No parentheses to worry about
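A sketch of the evaluation loop for single-digit operands (cf. pages 169-172): push operands; on an operator, pop two, apply it, push the result:

public static int evaluatePostfix(String postfix) {
    int[] s = new int[postfix.length()];      // the operand stack
    int top = -1;
    for (char ch : postfix.toCharArray()) {
        if (Character.isDigit(ch)) {
            s[++top] = ch - '0';              // operand: push its numeric value
        } else {
            int b = s[top--];                 // popped in reverse order:
            int a = s[top--];                 // b was pushed after a
            switch (ch) {
                case '+': s[++top] = a + b; break;
                case '-': s[++top] = a - b; break;
                case '*': s[++top] = a * b; break;
                case '/': s[++top] = a / b; break;
            }
        }
    }
    return s[top];                            // the final value left on the stack
}

For example, evaluatePostfix("234+*") returns 14.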
245
Java Implementations
Infix->Postfix, 161-165
Postfix Evaluator, 169-172
Time pending, let’s check them out
Otherwise, please read through them
246
Linked Lists
247
Recall Arrays
Advantages
Access is fast – O(1)
Insertion is fast in an unordered array O(1)
Searching is fast in an ordered array – O(log n)
Because we can apply the binary search
Disadvantages
Deletion is slow – O(n)
Searching is slow in an unordered array – O(n)
Insertion is slow in an ordered array – O(n)
248
Recall stacks and queues
Not generally used for real-world data storage
Why?
249
A versatile data structure
The linked list
Second most commonly used behind arrays
We also saw last week that for stacks and queues, you could use
arrays as an underlying data structure
You also can use linked lists!
250
Several Types
Simple
Double-ended
Sorted
Doubly-linked
Lists with iterators
251
A Link
Data in linked lists are embedded in links
Each link consists of:
The data itself
A reference to the next link in the list, which is null for the last
item
252
The Link class
It makes sense to make Link its own class, since a list can
then just be a collection of Link objects:
This is sometimes called a self-referential class. Any theories
why?
class Link {
public int iData;
public Link next; // Does this cause trouble? Why not?
}
253
References
Remember in Java, all objects are references
That means the variable 'next' in each link just holds a memory address
A 'magic number' which tells us where the object is
References are always the same size (so no problem)
254
Memory
How would this look in memory then? Let’s draw it on the
board.
255
Recall the implication!
Access for linked lists is slow compared to arrays
Arrays are like rows of houses
They are arranged sequentially
So it’s easy to just find, for example, the third house
With linked lists, you have to follow links in the chain
The next references
How do we get the third element here:
256
Links of Records
We can have a link of personnel records:
class Link{
public String name;
public String address;
public int ssn;
public Link next;
}
257
Operations
Insertion
At the beginning (fast)
In the middle (slower, although still better than arrays)
Deletion
At the beginning (fast)
In the middle (slower, although still better than arrays)
Search
Similar to arrays, worst case we have to check all elements
258
LinkedList class
Start with:
A private Link to the first element
A constructor which sets this reference to null
A method isEmpty() which returns true if the list is empty
259
insertFirst(): O(1)
Accept a new integer (p. 188)
Create a new link
Change the new link’s next reference to the current first
Change first to reference the new link
We could not execute these last two in reverse. Why?
260
deleteFirst(): O(1)
Remove the first integer in the list (p. 188)
Just reset the first reference to first.next
261
displayList() – p. 189 O(n)
Use a reference current to iterate through the elements
Print the value
Set current to current.next
Stop when current becomes null
Before setting current to current.next:
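Putting insertFirst(), deleteFirst() and displayList() together with the Link class from earlier, a condensed sketch:

class LinkedList {
    private Link first;                  // reference to the first link (null if empty)
    public boolean isEmpty() { return first == null; }
    public void insertFirst(int d) {
        Link newLink = new Link();
        newLink.iData = d;
        newLink.next = first;            // new link points at the old first...
        first = newLink;                 // ...THEN first points at the new link
    }
    public Link deleteFirst() {
        Link temp = first;
        first = first.next;              // unlink; the old first becomes garbage
        return temp;
    }
    public void displayList() {
        for (Link current = first; current != null; current = current.next)
            System.out.print(current.iData + " ");
        System.out.println();
    }
}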
262
main() function
LinkedList theList = new LinkedList();
theList.insertFirst(22);
theList.insertFirst(44);
theList.insertFirst(66);
theList.insertFirst(88);
theList.displayList();
while (!theList.isEmpty())
theList.deleteFirst();
theList.displayList();
263
find() – p. 194 O(n)
Essentially the same idea as displayList()
Linearly iterate through the elements with a reference current
Repeatedly set current to current.next
Except this time, stop when you find the item!
Before setting current to current.next:
264
delete() – p. 194 O(n)
Pass a value stored in the list, and remove it
First we have to find it, at that point it will be in current
Set the previous element’s next reference to current.next
When we find the value:
265
main() function - #2
LinkedList theList = new LinkedList();
theList.insertFirst(22);
theList.insertFirst(44);
theList.insertFirst(66);
theList.insertFirst(88);
theList.displayList();
Link f = theList.find(44);
theList.delete(66);
theList.displayList();
266
Double-Ended Lists
Just like a regular linked list, except there are now two
references kept
One to the beginning (first)
And one to the end (last)
Enables easy insertion at both ends
You still cannot delete the last element any easier. Why?
You cannot change find() to start from the end. Why?
267
insertLast() – p. 199 O(1)
What does this look like now? Let’s see:
Create the new link with the new value (next=null)
Set last.next to reference the new link
Set last to reference the new link
Might we also have to set first? When?
268
main() function - #3
LinkedList theList = new LinkedList();
theList.insertFirst(22);
theList.insertFirst(44);
theList.insertFirst(66);
theList.insertLast(11);
theList.insertLast(33);
theList.insertLast(55);
theList.displayList();
269
Double-Ended Lists
Would we also have to modify delete()?
When? Let’s do it.
270
Efficiency: Summary
Fast insertion/deletion at ends: O(1)
Searching: O(n)
Deleting a specific item: O(n)
BUT, faster than arrays
You have equal O(n) for the search
But then an array requires an O(n) shift, where a list requires
reference copies – O(1)
Insertion at a specific point can be done, with similar results
271
Memory: Summary
A linked list uses (more or less) memory than an array?
Why?
272
Abstract Data Types (ADT)
A central feature of the Java programming language
Let’s review
What is a datatype?
What do we mean by abstraction?
What is an interface?
273
Data Types
Examples of data types: int, float, double
These are called primitive data types
When we refer to a datatype:
Characteristics of the data
Operations which you can perform on that data
Object-oriented programming defines classes
Which are also datatypes. They fit the description above.
274
Abstraction
Abstract: Considered apart from detailed specifications or
implementation.
Let’s ponder the following questions:
What is an analogy of abstraction in the English language?
How does abstraction equate to datatypes and operations?
How can we describe abstraction in the context of object-
oriented programming?
What was an example of abstraction that we saw in stacks and
queues?
275
Abstract Data Types (ADTs)
Idea: Represent a Data Structure by focusing on what it does
and ignoring how it does it.
We’ve seen this already with stacks and queues
Internally, they stored data as an array
But the user didn’t know this! All they saw:
push(), pop(), peek() in the case of the stack
insert(), remove(), front(), rear() in the case of the queue
The set of functions that a client of the class can use, is called
the interface.
We can represent stacks and queues using linked lists instead
of arrays. Let’s look at how to do it.
276
Revisiting the stack….
LIFO structure
Items are inserted, removed and accessed from the top
278
When to use which?
List is clearly the better choice when you (know or do not
know?) the number of elements that the stack or queue will
hold
Analyze: what are the tradeoffs
In the case of the queue, a linked list saves us the concern of
wraparound
Keeping track of two references, front and rear
Watching if they move too far in one direction
279
Summary: ADTs
In Software Engineering
It’s always important to consider the operations you want before you
determine details, like:
Memory
Implementation of functions
For example, the operations you desire will strongly determine the
data structure you use
First item? Last item? Item in a certain position?
280
Sorted Lists
Linked list where the data is maintained in sorted order
Useful in some applications
Same applications where you’d use a sorted array
But, insertion will be faster!
And, memory will be used more efficiently
But, a tad more difficult to implement
Let’s check them out…
281
insert() p. 214 O(n)
We haven’t looked at inserting in the middle. Let’s see how
it will be done:
theList.displayList();
theList.insert(10);
theList.insert(30);
theList.insert(50);
theList.displayList();
theList.remove();
theList.displayList();
283
Sorted Linked List: Efficiency
Insertion and deletion are O(n) for the search worst case
Cannot do a binary search on a sorted linked list, like we could
with arrays! Why not?
Minimum value can be found in O(1) time
If list is double-ended, the maximum can as well (why?)
Thus, good if an application frequently accesses minimum (or
maximum) item
For which type of queue would this help us?
Also good for a sorting algorithm!
n insertions, each requiring O(n) comparisons, so still O(n^2)
However, O(n) copies as opposed to O(n^2) with insertion sort
But twice the memory (why?)
284
Limitation: Previous element
Numerous times, we found the inability to access the
previous element inconvenient
Double-ended list and deleting the last element
Could not search from both ends
285
Our new Link class
class Link
{
public int iData;
public Link previous;
public Link next;
}
286
Pictorially…
Single-ended (‘L’ references the List)
287
Reverse Traversal O(n)
Forward traversal is the same as before
Use current to reference a Link, and repeatedly set it to
current.next
Backward traversal is new
It can only be done conveniently if the list is double-ended
Now we repeatedly set current to current.previous
See page 222
288
Java Implementation, p. 226
Methods
isEmpty(), check if empty
insertFirst(), insert at beginning
insertLast(), insert at end
insertAfter(), insert in the middle
deleteFirst(), delete at beginning
deleteLast(), delete at end
deleteKey(), delete in the middle
displayForward(), forward traversal
displayBackward(), backward traversal
289
isEmpty() O(1)
Simple function
Returns true if first is null.
290
insertFirst() O(1)
Steps
Create a new link
Set its next reference to first
Set first’s previous reference to the new link
Set first (and last if empty) to reference the new link
Before
After
291
insertLast() O(1)
Steps
Create a new link
Set its previous reference to last
Set last’s next reference to the new link
Set last (and first if empty) to reference the new link
Before
After
292
insertAfter() O(n)
Steps
Find the element in the list to insert after (current)
Set current.next’s previous reference to the new link
Set link’s next reference to current.next
Set current.next to the new link
Set the link’s previous reference to current
Before
After
293
deleteFirst() O(1)
Steps
Set first.next’s previous reference to null
Remember first.next could be null!!
Set first to first.next
Before
After
294
deleteLast() O(1)
Steps
Set last.previous’ next reference to null
Remember last.previous could be null!!
Set last to last.previous
Before
After
295
deleteKey() O(n)
Steps
Find the key, call it current
Set current.previous’ next reference to current.next
Set current.next’s previous reference to current.previous
Be sure to handle the case when either is null!! This would be equivalent to
deleteFirst() or deleteLast()
Before
After
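A sketch of deleteKey() with the null checks called out (first and last are the list's end references):

public Link deleteKey(int key) {
    Link current = first;
    while (current != null && current.iData != key)
        current = current.next;                  // linear search for the key: O(n)
    if (current == null) return null;            // key not in the list
    if (current.previous == null)                // deleting the first link
        first = current.next;
    else
        current.previous.next = current.next;
    if (current.next == null)                    // deleting the last link
        last = current.previous;
    else
        current.next.previous = current.previous;
    return current;
}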
296
displayForward() O(n)
Use a reference current to iterate through the elements
Initially equal to first, print the value
Set current to current.next
Stop when current becomes null
Before setting current to current.next:
297
displayBackward() O(n)
Use a reference current to iterate through the elements
Initially equal to last, print the value
Set current to current.previous
Stop when current becomes null
Before setting current to current.previous:
298
Iterators
What have we seen?
Ability to linearly traverse a list and find an item
What have we been missing?
Control over the items we traverse
class ListIterator {
    private Link current;
    public ListIterator(Link l) { current = l; }
    public Link getCurrent() { return current; }
    public void nextLink() { current = current.next; }
}
300
main() function - #4
LinkedList theList = new LinkedList();
ListIterator iter = new ListIterator(theList.getFirst());
301
Pictorially…
I can create multiple instances of ListIterators, and have their
member current reference various points in the list
302
Bidirectional Iterators
If we have a doubly-linked list, it’s easy.
Let’s add two methods to our previous iterator class:
One to access the previous Link prevLink()
303
Encapsulation
Let’s connect the components for LinkedList and ListIterator
I want to be able to construct a LinkedList, and return a
ListIterator referencing itself, through a function
getIterator(), i.e.:
305
ListIterator: p. 237
Let’s now change the ListIterator
Make it bidirectional
Contain a reference to the list, as opposed to a single link
Methods
ListIterator(): Pass a LinkedList reference and store it
reset(): Reset iterator to the beginning of the list
atEnd(): Check if we are at the end
nextLink(): Move to the next link
getCurrent(): Return the current link
insertAfter(): Insert a link after the link referenced
insertBefore(): Insert a link before the element referenced
deleteCurrent(): Delete the currently referenced link
306
deleteCurrent(): Notes
If we delete an item, where should the iterator now point?
We’ve deleted the item it was pointed to!
We don’t want to move it to the beginning
Concept of ‘locality’
Can’t move it to the previous item
No way to reset ListIterator.previous
Could if we had a doubly linked list
Must move it to the next item
307
atEnd(): Notes
Our implementation checks if the iterator points to the last
element. Tradeoffs:
Looping becomes awkward
Iterator is always pointing at a valid link, which is good for
reliability
Must be careful with iteration
Let’s say we used a loop to display data
First reset, then display, THEN loop:
Go to the next, and display again
Need one extra, because if we simply looped till iter.atEnd()
was true, we’d hit a null reference
308
Another example
Deleting all links that contain values that are multiples of three.
How can we use the iterator to do this?
309
Recursion
310
Final Exam
In case anyone’s making travel plans now…
Wednesday, December 10
12 noon – 3 PM
Location: TBA
311
Recursion
Definition
A programming technique where a function calls itself
Very effective technique in programming
In Java:
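For example, a recursive method for the triangular numbers (1 + 2 + … + n), along the lines of the book's triangle():

int triangle(int n) {
    if (n == 1)
        return 1;                     // base case
    else
        return n + triangle(n - 1);   // recursive call on a smaller problem
}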
316
Scope
In Java, each function call
creates a new ‘scope’
Each of which declares a
new version of n, which is
visible
Suppose we call triangle(5)
317
Recursive Methods
Characteristics:
They call themselves
When they call themselves, they do so to solve a smaller
problem
A base case must exist
What happens if it doesn’t?
318
Iteration
Anything recursive can be made iterative, using a while or
for loop:
int triangle(int n) {
    int total = 0;
    while (n > 0) {
        total += n;    // add n, n-1, n-2, ...
        n--;
    }
    return total;      // the sum 1 + 2 + … + n
}
319
Efficiency of Recursion
Recursion is very often simpler and easier to read
But, often is slower. Why?
Overhead of function calls
Sometimes, you can make redundant recursive calls
We’ll see an example of this with Fibonacci
320
Mathematical Induction
This can be a convenient way to represent a recursive
problem:
tri(n) = { 1              n = 1
         { n + tri(n-1)   n > 1
321
Example: Factorials
Let’s start by representing fact(n) by mathematical induction:
322
Factorial Scope
In Java, each function call
creates a new ‘scope’
Each of which declares a
new version of n, which is
visible
Suppose we call factorial(4)
323
Fibonacci Numbers
Mathematical Induction:
fib(n) = { 1                     n = 0
         { 1                     n = 1
         { fib(n-1) + fib(n-2)   n > 1
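Translated directly into Java. Note the redundancy mentioned a few slides back: fib(n-1) recomputes everything fib(n-2) computes, so the number of calls grows exponentially:

int fib(int n) {
    if (n == 0 || n == 1)
        return 1;                      // base cases
    return fib(n - 1) + fib(n - 2);    // two recursive calls: lots of repeated work
}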
325
Binary Search
Array has values 1-100
First search: Check element 50
50 > 33, so repeat on first half (1-49)
Second search: Check element 25
25 < 33, so repeat on second half (26-49)
Third search: Check element 37
37 > 33, so repeat on first half (26-36)
Fourth search: Check element 31
31 < 33, so repeat on second half (32-36)
Fifth search: Check element 34
34 > 33, so repeat on first half (32-33)
Sixth search: Check element 32
32 < 33, so repeat on second half (33)
Seventh search: Check element 33! Found.
So 7 comparisons. With linear search, it would've been 33.
326
Our implementation before was
iterative…
public int find(long key) {
int lower = 0;
int upper = nElems-1;
int curIn;
while (true) {
curIn = (lower + upper) / 2;
if (a[curIn] == key) return curIn;
else if (lower > upper) return -1;
else {
if (a[curIn] < key) lower = curIn + 1;
else upper = curIn - 1;
}
}
}
327
But we can also do it recursively!
If we think of binary search in these terms:
Start lower at 0, and upper at n-1
Let mid = (lower + upper) / 2
If arr[mid] = key, return mid # we’re done
else if lower > upper, return -1 # not found
else if arr[mid] > key:
perform binarysearch on arr[lower…mid-1]
else if arr[mid] < key:
perform binarysearch on arr[mid+1…upper]
331
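A sketch of that recursion in Java (assuming the array a and count nElems from the earlier class; the bounds check comes first here so we never index past the subarray):

private int recFind(long key, int lower, int upper) {
    if (lower > upper) return -1;              // empty range: not found
    int mid = (lower + upper) / 2;
    if (a[mid] == key) return mid;             // we're done
    else if (a[mid] > key)
        return recFind(key, lower, mid - 1);   // search lower half
    else
        return recFind(key, mid + 1, upper);   // search upper half
}
// Initial call: recFind(key, 0, nElems - 1)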
Anagrams
Involves producing all possible combinations of the letters of
a word.
Example: Anagram the word cat:
cat
cta
atc
act
tca
tac
Six possible combinations
332
Anagrams: General
In general, for a word of n letters, there will be n!
combinations assuming all letters are distinct
We saw for cat (3 letters), there were 6 possible
If some letter(s) repeat themselves, this will reduce the
number of combinations. Example, tat only has 3:
tat
att
tta
333
Anagram Algorithm
Anagram a word with n letters:
Anagram the rightmost n-1 letters
If n=2, display the word
Rotate all n letters
Repeat these steps n times
335
rotate() and doAnagram() function
Java implementation, page 266
We will write:
A rotate() function which moves each character one slot to the
left, and the first character in the last position
A recursive anagram() function which invokes rotate().
Base case: n=1, just return
Recursive step (do n times):
anagram(n-1)
display if n=2
rotate(n)
336
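A runnable sketch along these lines (details may differ from the page-266 version):

class AnagramDemo {
    static char[] arr = "cat".toCharArray();

    static void doAnagram(int newSize) {
        if (newSize == 1) return;                  // base case: nothing to do
        for (int i = 0; i < newSize; i++) {
            doAnagram(newSize - 1);                // anagram the rightmost n-1 letters
            if (newSize == 2) display();           // innermost two letters set: show the word
            rotate(newSize);                       // rotate all newSize letters
        }
    }
    static void rotate(int newSize) {              // left-shift the last newSize letters,
        int pos = arr.length - newSize;            // first of them moves to the end
        char temp = arr[pos];
        for (int i = pos + 1; i < arr.length; i++) arr[i - 1] = arr[i];
        arr[arr.length - 1] = temp;
    }
    static void display() { System.out.println(new String(arr)); }

    public static void main(String[] args) { doAnagram(arr.length); }
}

Running it prints the six arrangements from the cat example: cat, cta, atc, act, tca, tac.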
Output Produced
337
Towers of Hanoi
An ancient puzzle consisting of disks on pegs A, B and C
Start all disks on peg A
341
Base Case?
For TOH(n,A,B,C):
Well if there’s just one
disk (n=1), move from A
to C!
Java implementation,
page 278
Note: It’s just a few lines!
342
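A sketch of those few lines (peg names passed as characters):

void doTowers(int n, char from, char inter, char to) {
    if (n == 1) {                                  // base case: one disk
        System.out.println("Disk 1: " + from + " -> " + to);
        return;
    }
    doTowers(n - 1, from, to, inter);              // move n-1 disks out of the way
    System.out.println("Disk " + n + ": " + from + " -> " + to);
    doTowers(n - 1, inter, from, to);              // move them onto the big disk
}
// doTowers(3, 'A', 'B', 'C') solves the 3-disk puzzle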
Complexity
For n disks:
(n disks) - 1st call, 2 recursive calls (n disks)
(n-1 disks) Two 2nd calls, 2 recursive calls
(n-2 disks) Four 3rd calls, 2 recursive calls
…
(1 disk) Many nth calls, base case
Let’s draw the tree
See why, this is too expensive for large numbers of disks?
Old legend: In remote India temple, monks continuously work
at solving this problem with 64 disks and 3 diamond towers
The world ends when they are finished
No worries, it will take forever anyway… :)
343
Number of Operations
Each recursive call generates two recursive calls, and a
constant number of operations (call it c)
First call: c
Two second calls, times c: 2*c
Four third calls, times c: 4*c
…
2^(n-1) nth calls, times c: 2^(n-1)*c
Total: (2^n - 1)*c -> O(2^n)
347
Merge…
Whichever one we chose, move one spot to the right in that
subarray and repeat
348
Keep going…
349
And going…
350
And going…
351
And going…
352
Get the idea?
353
Few more…
354
Put in the rest…
When we get to the end of one subarray, just insert the rest
of the other.
355
Finally…
We’re done when the temporary array is full
356
So now, we know…
If we have two sorted subarrays, we can merge them to sort
the entire array. And we can do it in O(n) time.
Just one comparison for each of the n elements
357
Pictorially…
358
So conceptually, what must we do?
mergesort(A, n): # Sort an array of size n
mergesort(first half of A, n/2)
mergesort(second half of A, n/2)
merge(first half of A, second half of A)
359
Let’s add a merge() procedure to
this class. (p. 289)
class DArray {
private long[] theArray;
private int nElems;
public DArray(int max) {
theArray = new long[max];
nElems = 0;
}
public void insert(long value) {
theArray[nElems] = value;
nElems++;
}
}
360
What merge() accepts
A workspace array of size n
The lower, middle and upper indices of theArray to merge
First half is index lower to index (middle-1)
Second half is index middle to index upper
So n=upper-lower+1
361
Now write mergesort()
Our recursive mergesort will accept:
Workspace of size n, and lower/upper indices of theArray to
sort
Initial call will pass an empty workspace, 0, and nElems-1.
362
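A sketch of the pair of methods, using theArray/nElems from the DArray class above (the book's version on pp. 289-291 differs in details):

private void recMergeSort(long[] workSpace, int lower, int upper) {
    if (lower == upper) return;                  // one element: already sorted
    int mid = (lower + upper) / 2;
    recMergeSort(workSpace, lower, mid);         // sort first half
    recMergeSort(workSpace, mid + 1, upper);     // sort second half
    merge(workSpace, lower, mid + 1, upper);     // merge the two halves
}
private void merge(long[] workSpace, int lowPtr, int highPtr, int upperBound) {
    int j = 0;                                   // workspace index
    int lowerBound = lowPtr;
    int mid = highPtr - 1;                       // end of the first half
    int n = upperBound - lowerBound + 1;         // number of items
    while (lowPtr <= mid && highPtr <= upperBound)         // take the smaller front item
        workSpace[j++] = (theArray[lowPtr] < theArray[highPtr])
                         ? theArray[lowPtr++] : theArray[highPtr++];
    while (lowPtr <= mid) workSpace[j++] = theArray[lowPtr++];          // leftovers: first half
    while (highPtr <= upperBound) workSpace[j++] = theArray[highPtr++]; // or second half
    for (j = 0; j < n; j++) theArray[lowerBound + j] = workSpace[j];    // copy back
}
// Initial call: recMergeSort(new long[nElems], 0, nElems - 1)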
Complexity
Every call makes two recursive calls, each with n/2 copies
First call: n copies, and generates:
Two recursive calls at (n/2) copies each, which generate:
Four recursive calls at (n/4) copies each
…
n recursive calls at (n/n) copies each
363
Total number of operations
n + 2(n/2) + 4(n/4) +…. + n(n/n)
= n + n + …. + n
= (log n + 1) * n
= n log n + n
O(n log n)
Best so far!!
364
Advanced Sorting
365
Radar
Midterm
Two weeks from this Friday on 3/27
In class
Closed book
366
Improved Sorting Techniques
What we have seen
Bubble, selection and insertion
Easy to implement
Slower: O(n^2)
Mergesort
Faster: O(n log n)
More memory (temporary array)
368
Recall Insertion Sort....
A subarray to the left is
‘partially sorted’
Start with the first element
The player immediately to
the right is ‘marked’.
The ‘marked’ player is
inserted into the correct
place in the partially sorted
array
Remove first
Marked player ‘walks’ to
the left
Shift appropriate elements
until we hit a smaller one
369
The problem
If a small item is very far
to the right
Like in this case ->
You must shift many
intervening large items
one space to the right
Almost N copies
Average case N/2
N items, N^2/2 copies
Better if:
Move a small item many spaces, without shifting
370
Remember
What made insertion
sort the best of the basic
sorts?
If the array is almost
sorted, O(n)
371
The “Almost Sort” step
Say we have a 10-element array:
60 30 80 90 0 20 70 10 40 50
Sort indices 0, 4, and 8:
0 30 80 90 40 20 70 10 60 50
Sort indices 1, 5, and 9:
0 20 80 90 40 30 70 10 60 50
Sort indices 2, 6:
0 20 70 90 40 30 80 10 60 50
Sort indices 3, 7
0 20 70 10 40 30 80 90 60 50
372
The “Almost Sort” step
This is called a “4-sort”:
60 30 80 90 0 20 70 10 40 50
0 30 80 90 40 20 70 10 60 50
0 20 80 90 40 30 70 10 60 50
0 20 70 90 40 30 80 10 60 50
0 20 70 10 40 30 80 90 60 50
Once we’ve done this, the array is almost sorted, and we can
run insertion sort on the whole thing
Should be about O(n) time
373
Interval Sequencing
4-sort was sufficient for a 10-element array
For larger arrays, you’ll want to do many of these to achieve
an array that is really almost sorted. For example for 1000
items:
364-sort
121-sort
40-sort
13-sort
4-sort
insertion sort
374
Knuth’s Interval Sequence
How to determine?
Knuth’s algorithm:
Start at h=1
Repeatedly apply the function h=3*h+1, until you pass the
number of items in the array:
h = 1, 4, 13, 40, 121, 364, 1093….
Thus the previous sequence for a 1000-element array
375
Why is it better?
When h is very large, you are sorting small numbers of
elements and moving them across large distances
Efficient
When h is very small, you are sorting large numbers of
elements and moving them across small distances
Becomes more like traditional insertion sort
But each successive sort, the overall array is more sorted
So we should be nearing O(n)
376
Let’s do our own example…
Sort these fifteen elements:
8 10 1 15 7 4 12 13 2 6 11 14 3 9 5
377
Java Implementation, page 322
What our function needs to do:
Initialize h properly
Have an outer loop start at outer=h and count up
Have an inner loop which sorts outer, outer-h, outer-2h, etc.
For example, if h is 4:
We must sort (8, 4, 0), and (9, 5, 1), etc.
378
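A sketch of the whole method, with Knuth's sequence computed up front (theArray/nElems assumed as before; the book version is on p. 322):

public void shellSort() {
    int h = 1;
    while (h <= nElems / 3) h = h * 3 + 1;       // 1, 4, 13, 40, ... up to n/3
    while (h > 0) {                              // h-sort, then shrink the interval
        for (int outer = h; outer < nElems; outer++) {
            long temp = theArray[outer];         // the 'marked' item
            int inner = outer;
            while (inner > h - 1 && theArray[inner - h] >= temp) {
                theArray[inner] = theArray[inner - h];   // shift larger items right
                inner -= h;
            }
            theArray[inner] = temp;              // insert the marked item
        }
        h = (h - 1) / 3;                         // previous Knuth interval
    }
}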
Other Interval Sequences
Original Shellsort:
h=h/2
Inefficient, leads to O(n^2)
Variation:
h = h / 2.2
Need extra effort to make sure we eventually hit h=1
Flamig’s Approach (yields similar to Knuth)
if (h < 5) h = 1; else h = (5*h-1) / 11;
381
Partitioning
Idea: Divide data into two groups, such that:
All items with a key value higher than a specified amount (the
pivot) are in one group
All items with a lower key value are in another
Applications:
Divide employees who live within 15 miles of the office with
those who live farther away
Divide households by income for taxation purposes
Divide computers by processor speed
384
Efficiency: Partitioning
O(n) time
left starts at 0 and moves one-by-one to the right
right starts at n-1 and moves one-by-one to the left
When left and right cross, we stop.
So we’ll hit each element just once
Example:
Unpartitioned: 42 89 63 12 94 27 78 10 50 36
Partitioned around pivot 36: 10 27 12 36 63 94 89 78 42 50
What does this imply about the pivot element after the
partition?
386
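A sketch of this scan in Java (theArray assumed as before; note this version doesn't place the pivot yet, which is what the next slides address):

public int partitionIt(int left, int right, long pivot) {
    int leftPtr = left - 1;                      // will move right
    int rightPtr = right + 1;                    // will move left
    while (true) {
        while (leftPtr < right && theArray[++leftPtr] < pivot) ;   // find item >= pivot
        while (rightPtr > left && theArray[--rightPtr] > pivot) ;  // find item <= pivot
        if (leftPtr >= rightPtr) break;          // pointers crossed: done
        swap(leftPtr, rightPtr);                 // exchange the misplaced pair
    }
    return leftPtr;                              // first index of the right group
}
private void swap(int i, int j) {
    long t = theArray[i]; theArray[i] = theArray[j]; theArray[j] = t;
}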
Placing the Pivot
Goal: Pivot must be in the leftmost position in the right
subarray
10 27 12 36 63 94 89 78 42 50
Our algorithm does not do this currently.
It currently will not touch the pivot
left increments till it finds an element < pivot
right decrements till it finds an element > pivot
So the pivot itself won’t be touched, and will stay on the right:
10 27 12 63 94 89 78 42 50 36
387
Options
We have this:
10 27 12 63 94 89 78 42 50 36
Our goal is the position of 36:
10 27 12 36 63 94 89 78 42 50
We could either:
Shift every element in the right subarray up (inefficient)
Just swap the leftmost element of the right subarray with the pivot! Better :)
We can do this because the right subarray is not in any
particular order
10 27 12 36 94 89 78 42 50 63
388
Swapping the Pivot
Just takes one more line to our Java method
Basically, a single call to swap()
Swaps A[end-1] (the pivot) with A[left] (the partition index)
389
Quicksort
The most popular sorting algorithm
For most situations, it runs in O(n log n)
Remember partitioning. It’s the key step. And it’s O(n).
391
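A sketch of the recursion, with the slide-389 pivot swap folded into partitionIt() so the pivot at the right end is placed between the two groups:

public void recQuickSort(int left, int right) {
    if (right - left <= 0) return;               // 0 or 1 elements: already sorted
    long pivot = theArray[right];                // rightmost element as the pivot
    int partition = partitionIt(left, right, pivot);
    recQuickSort(left, partition - 1);           // sort the smaller-than-pivot side
    recQuickSort(partition + 1, right);          // sort the larger-than-pivot side
}
public int partitionIt(int left, int right, long pivot) {
    int leftPtr = left - 1;
    int rightPtr = right;                        // the pivot itself sits at 'right'
    while (true) {
        while (theArray[++leftPtr] < pivot) ;              // pivot acts as a sentinel
        while (rightPtr > 0 && theArray[--rightPtr] > pivot) ;
        if (leftPtr >= rightPtr) break;
        swap(leftPtr, rightPtr);
    }
    swap(leftPtr, right);                        // the one extra line: place the pivot
    return leftPtr;
}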
Shall we try it on an array?
10 70 50 30 60 90 0 40 80 20
Let’s go step-by-step on the board
392
Best case…
We partition the array each time into two equal subarrays
Say we start with array of size n = 2i
We recurse until the base case, 1 element
Total: (i+1)*n = (log n + 1)*n -> O(n log n)
393
The VERY bad case….
If the array is inversely sorted.
Let’s see the problem:
90 80 70 60 50 40 30 20 10 0
What happens after the partition? This:
0 20 30 40 50 60 70 80 90 10
This is almost sorted, but the algorithm doesn’t know it.
It will then call itself on an array of zero size (the left
subarray) and an array of n-1 size (the right subarray).
Producing:
0 10 30 40 50 60 70 80 90 20
394
The VERY bad case…
In the worst case, we partition every time into an array of 0
elements and an array of n-1 elements
This yields O(n^2) time:
First call: Partition n elements, n operations
Second calls: Partition 0 and n-1 elements, n-1 operations
Third calls: Partition 0 and n-2 elements, n-2 operations
Draw the tree
Yielding:
Operations = n + n-1 + n-2 + … + 1 = n(n+1)/2 -> O(n^2)
395
Summary
What caused the problem was “blindly” choosing the pivot
from the right end.
In the case of a reverse sorted array, this is not a good choice
at all
396
Median-Of-Three Partitioning
Everytime you partition, choose the median value of the left,
center and right element as the pivot
Example:
44 11 55 33 77 22 00 99 101 66 88
Once you get to a very small subarray, you can just sort with
insertion sort
You can experiment a bit with ‘cutoff’ values
Knuth: n=9
399
(Time Pending) Java
Implementation
QuickSort with maximum optimization
Median-Of-Three Partitioning
Insertion Sort on arrays of size less than 9
400
Operation Count Estimates
For QuickSort
n=8: 30 comparisons, 12 swaps
n=12: 50 comparisons, 21 swaps
n=16: 72 comparisons, 32 swaps
n=64: 396 comparisons, 192 swaps
n=100: 678 comparisons, 332 swaps
n=128: 910 comparisons, 448 swaps
Assumptions
402
Assumption (for the radix sort examples that follow)
The assumption is: base 10, positive integers!
403
Example: On the Board
421 240 35 532 305 430 124
Remember: 35 has a 100s digit of zero!
404
Operations
n Elements
Copy each element once to a group, and once back again:
2n copies -> O(n)
Then you have to copy k times, where k is the maximum
number of digits in any value
So, 2*k*n copies -> O(kn)
Zero comparisons
407
Binary Trees
408
Binary Trees
A fundamental data structure
Combines advantages of arrays and linked lists
Fast search time
Fast insertion
Fast deletion
Moderately fast access time
409
Recall Ordered Arrays…
Their search time is faster, because there is some ‘ordering’
to the elements.
We can do binary search, O(log n)
Instead of linear search, O(n)
412
Traversing a Tree
Start at the root and traverse downward along its edges
Typically, edges represent some kind of relationship
We represent these by references
Just as in linked lists:
class Link {
int data;
Link next;
}
In a tree:
class Node {
int data;
Node child1;
Node child2;
…
}
413
Size of a Tree
Increases as you go down
Opposite of nature. :)
414
Binary Trees
A special type of tree
With this tree, nodes had varying numbers of children:
416
A Binary Tree
Each node thus has at most two children: a left child and a right child
What would the Java class look like?
417
Binary Trees: Terms
Path: Sequence of nodes connected by edges
Green line is a path from A to J
418
Binary Trees: Terms
Root: The node at the top of the tree
Can be only one (in this case, A)
419
Binary Trees: Terms
Parent: The node above. (B is the parent of D, A is the
parent of B, A is the grandparent of D)
420
Binary Trees: Terms
Child: A node below. (B is a child of A, C is a child of A, D
is a child of B and a grandchild of A)
421
Binary Trees: Terms
Leaf: A node with no children
In this graph: H, E, I, J, and G
422
Binary Trees: Terms
Subtree: A node's children, its children's children, etc.
The highlighted example is just one; there are many subtrees in this tree
423
Binary Trees: Terms
Visit: Access a node, and do something with its data
For example we can visit node B and check its value
424
Binary Trees: Terms
Traverse: Visit all the nodes in some specified order.
One example: A, B, D, H, E, C, F, I, J, G
425
Binary Trees: Terms
Levels: Number of generations a node is from the root
A is level 0, B and C are at level 1, D, E, F, G are level 2, etc.
426
Binary Trees: Terms
Key: The contents of a node
427
A Binary Search Tree
A binary tree, with the following characteristics:
The left child is always smaller than its parent
The right child is always larger than its parent
All nodes to the right are bigger than all nodes to the left
428
Integer Tree
Will use this class for individual nodes:
class Node {
public int data;
public Node left;
public Node right;
}
Let’s sketch the Java template for a binary search tree (page
375)
429
Example main() function
Page 275, with a slight tweak
Insert three elements: 50, 25, 75
Search for node 25
If it was found, print that we found it
If it was not found, print that we did not find it
430
Finding a node
What do we know?
For all nodes:
All elements in the left subtree are smaller
All elements in the right subtree are larger
431
Searching for a KEY
We’ll start at the root, and check its value
If the value = key, we’re done.
If the value is greater than the key, look at its left child
If the value is less than the key, look at its right child
Repeat.
432
Example
Searching for
element 57
433
Java Implementation – find()
Pages 377-378
434
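A sketch of find() in that spirit (root is assumed to be the tree class's reference to the top node):

public Node find(int key) {
    Node current = root;                          // start at the root
    while (current != null && current.data != key)
        current = (key < current.data) ? current.left    // smaller keys live left
                                       : current.right;  // larger keys live right
    return current;                               // the node, or null if not found
}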
Number of operations: Find
Typically about O(log n). Why?
436
Example
Inserting
element
45
437
Java Implementation – insert()
Page 380
438
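A sketch of insert(): walk down exactly as in find(), then attach the new node where the search runs off the tree:

public void insert(int key) {
    Node newNode = new Node();
    newNode.data = key;
    if (root == null) { root = newNode; return; }    // empty tree
    Node current = root, parent;
    while (true) {
        parent = current;
        if (key < current.data) {                    // go left
            current = current.left;
            if (current == null) { parent.left = newNode; return; }
        } else {                                     // go right
            current = current.right;
            if (current == null) { parent.right = newNode; return; }
        }
    }
}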
Traversing a Tree
Three Ways:
Inorder (most common)
Preorder
Postorder
439
Inorder Traversal
Visits each node of the tree in ascending order:
443
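A sketch of the recursion, just three lines of work: left subtree, then the node itself, then the right subtree. On a binary search tree this prints the keys in ascending order:

private void inOrder(Node localRoot) {
    if (localRoot == null) return;
    inOrder(localRoot.left);                   // 1. visit the left subtree
    System.out.print(localRoot.data + " ");    // 2. visit the node itself
    inOrder(localRoot.right);                  // 3. visit the right subtree
}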
Preorder Traversal
Prints all parents before children
Prints all left children before right children. So with this tree:
446
Postorder Traversal
Prints all children before parents
Prints all left children before right children. So with this tree:
449
Finding the Minimum
In a binary search tree, this is always the leftmost child of the
tree! Easy. Java?
Start at the root, and traverse until you have no more left children
450
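A sketch; maximum() is the mirror image, following right children instead:

public Node minimum() {
    Node current = root;
    Node last = null;
    while (current != null) {
        last = current;               // remember the last real node
        current = current.left;       // keep going left
    }
    return last;                      // leftmost node (null if the tree is empty)
}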
Finding the Maximum
In a binary search tree, this is also easy – it’s the rightmost
child in the tree
Start at the root, traverse until there are no more right children
Java?
451
Deletion
This is the challenging one
First, find the element you want to delete
Once you’ve found it, one of three cases:
1. The node has no children (easy)
2. The node has one child (decently easy)
3. The node has two children (difficult)
452
Case 1: No Children
To delete a node with no children:
Find the node
Set the appropriate child field in its parent to null
Example: Removing 7 from the tree below
453
Java Implementation
Start from page 390-391
Find the node first
As we go through, keep track of:
The parent
Whether the node is a left or right child of its parent
454
Case 2: One Child
Assign the deleted node’s
child as the child of its
parent
Essentially, ‘snip out’ the
deleted node from the
sequence
Example, deleting 71
from this tree:
455
Java Implementation
Pages 392-393
Two cases to handle. Either:
The right child is null
If the node is a left child, set its parent’s left child to the node’s left child
If the node is a right child, set its parent’s right child to the node’s left
child
The left child is null
If the node is a left child, set its parent’s left child to the node’s right child
If the node is a right child, set its parent’s right child to the node’s right
child
456
Case 3: Two Children
Here’s the tough case.
Let’s see an example of why it’s complicated…
457
Case 3: Two Children
What we need is the next highest node to replace 25.
For example, if we replaced 25 by 30, we’re set.
458
Case 3: Two Children
We call this the inorder successor of the deleted node
i.e., 30 is the inorder successor of 25. This replaces 25.
459
Inorder successor
The inorder successor is
always going to be the
smallest element in the
right subtree
In other words, the
smallest element that is
larger than the deleted
node.
460
Finding the inorder successor
Algorithm to find the inorder
successor of some node X:
First go to the right child of X
Then keep moving to left
children
Until there are no more
Then we are at the inorder
successor
461
Removing the successor
We must remove the
successor from its current
spot, and place it in the spot
of the deleted node
462
Removing the successor
If the successor is not the deleted node's right child, it's tougher
We must add two steps (1 and 2 below):
1. Set the successor's parent's left to the successor's right
2. Set the successor's right to the deleted node's right
3. Set the successor's left to the deleted node's left (as before)
4. Replace the deleted node by the successor (as before)
463
Java Implementation (Time
Pending)
getSuccessor() function, page 396
Accepts a node
First goes to its right child
Then goes to the left child
Does this until no more left children
464
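A sketch of just the traversal part (the p. 396 version also rewires links for the deletion case):

private Node getSuccessor(Node delNode) {
    Node current = delNode.right;          // step 1: go to the right child
    Node successor = delNode;
    while (current != null) {
        successor = current;               // step 2: then keep moving left
        current = current.left;
    }
    return successor;                      // smallest node in the right subtree
}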
Efficiency: Binary Search Trees
Note that:
Insertion, deletion, searching all involved visiting nodes of the
tree until we found either:
The position for insertion
The value for deletion
The value we were searching for
For any of these, we would not visit more than the number of
levels in the tree
Because for every node we visit, we check its value, and if we’re
not done, we go to one of its children
465
Efficiency: Binary Search Trees
So for a tree of n nodes, how many levels are there:
Nodes Levels
1 1
3 2
7 3
15 4
31 5
….
1,073,741,823 30
It's actually log(n) + 1!
466
So…
All three of our algorithms: insertion, deletion, and
searching take O(log n) time
We go through log n + 1 levels, each time with one
comparison.
At the point of insertion or deletion, we just manipulate a
constant number of references (say, c)
That’s independent of n
467
Compare to Arrays
Take 1 million elements and delete an element in the middle
Arrays -> Average case, 500 thousand shifts
Binary Search Trees -> 20 or fewer comparisons
Similar case when comparing with insertion into an ordered
array
468
Huffman Codes
An algorithm to ‘compress’ data
Purpose:
Apply a compression algorithm to take a large file and store it as
a smaller set of data
Apply a decompression algorithm to take the smaller
compressed data, and get the original back
469
Quick Lesson In Binary
Generally for an n-digit number in binary:
b(n-1) … b2 b1 b0 = b(n-1)*2^(n-1) + … + b2*2^2 + b1*2^1 + b0*2^0
Internal Storage
01001001 01001100 01001111 01010110 01000101
01010100 01010010 01000101 01000101 01010011
472
Underlying Motivation
Why use the same number of bits to store all characters?
For example, E is used much more often than Z
So what if we only used two bits to store E
And still used the eight to store Z
We should save space.
474
Most Used Characters
The most used characters will vary by file
Computing Huffman Codes first requires computing the
frequency of each character, for example for “SUSIE SAYS IT
IS EASY”:
CHAR COUNT
A 2
E 2
I 3
S 6
T 1
U 1
Y 2
Space 4
Linefeed 1
475
Computing Huffman Codes
Huffman Codes are varying bit lengths depending on
frequency (remember S had the highest freq at 6):
CHAR CODE
A 010
E 1111
I 110
S 10
T 0110
U 01111
Y 1110
Space 00
Linefeed 01110
476
Coding “SUSIE SAYS IT IS EASY”
CHAR CODE
A 010
E 1111
I 110
S 10
T 0110
U 01111
Y 1110
Space 00
Linefeed 01110
10 01111 10 110 1111 00 10 010 1110 10 00 110 0110 00
110 10 00 1111 010 10 1110 01110 (65 bits)
Before, it would've been 21*8 = 168 bits!
477
A Huffman Tree
Idea:
Each character appears as
a leaf in the tree
The higher the frequency
of a character, the higher
up in the tree it is
Number outside a leaf is
its frequency
Number outside a non-
leaf is the sum of all child
frequencies
478
A Huffman Tree
Decoding a message:
For each bit, go right (1)
or left (0)
Once you hit a character,
print it, go back to the
root and repeat
Example: 0100110
Start at root:
Go L(0), R(1), L(0), get A
Go back to root
Go L(0), R(1), R(1), L(0),
get T
479
Encoding
Decoding is thus easy
when you have this tree
However, we must
produce the tree
480
First step
Start from the leaves, which contain single characters and
their associated frequencies
Store these nodes in a priority queue, ordered by frequency
481
Next
Take the left two elements, and form a subtree
The two leaves are the two characters
The parent is empty, with a frequency as the sum of its two
children
Put this back in the priority queue, in the right spot
482
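A sketch of this combining loop, with java.util.PriorityQueue standing in for our own priority queue (HuffNode and its field names are mine):

class HuffNode {
    int freq;                                  // leaf: character count; internal: sum of children
    char ch;                                   // only meaningful for leaves
    HuffNode left, right;
    HuffNode(int f, char c) { freq = f; ch = c; }
}

static HuffNode buildTree(java.util.PriorityQueue<HuffNode> pq) {
    // pq was created ordered by frequency, e.g. new PriorityQueue<>((a, b) -> a.freq - b.freq),
    // and pre-loaded with one leaf per character
    while (pq.size() > 1) {
        HuffNode a = pq.poll();                // the two lowest-frequency trees
        HuffNode b = pq.poll();
        HuffNode parent = new HuffNode(a.freq + b.freq, '\0');
        parent.left = a;
        parent.right = b;
        pq.add(parent);                        // back into the queue, in the right spot
    }
    return pq.poll();                          // the single remaining tree is the Huffman tree
}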
Continue this process…
Again, adjoin the leftmost two elements (now we actually
adjoin a leaf and a subtree):
483
Keep going…
Adjoin leaves Y (2) and E (2), this forms a subtree with root
frequency of 4
484
Continue until we have one tree…
485
Continue until we have one tree…
486
Continue until we have one tree…
487
Continue until we have one tree…
488
Our final tree
Note we were able to construct this from the frequency table
489
Obtaining the Huffman Code from
the tree
Once we construct the tree,
we still need the Huffman
Code to encode the file
No way around this: we
have to start from the root
and traverse all possible
paths to leaf nodes
As we go along, keep track
of if we go left (0) or right
(1)
So A went left (0), then
right (1), then left (0)
490
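A sketch of that traversal, building the code table recursively (names are mine; every internal node of a Huffman tree has both children):

static void collectCodes(HuffNode node, String path, java.util.Map<Character, String> table) {
    if (node.left == null && node.right == null) {   // a leaf: its path is its code
        table.put(node.ch, path);
        return;
    }
    collectCodes(node.left, path + "0", table);      // left edge contributes a 0
    collectCodes(node.right, path + "1", table);     // right edge contributes a 1
}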
Code Table
When we get the Huffman Code for each character, we insert it into a Code Table, as shown to the right
Decoding a File
Read the compressed file bit-by-bit
Use the Huffman Tree to get each character
492
Red-Black Trees
493
Recall Binary Trees
What were the advantages?
494
Unbalanced binary trees
Let’s form two binary search trees
One inserting this sequence:
10 20 30 40 50 60 70 80 90 100
Another inserting this sequence:
100 90 80 70 60 50 40 30 20 10
495
Red-Black Trees
Binary search trees, with some added features
These ‘added features’ make sure that the tree is balanced
Which we’re never guaranteed with binary search trees!
Thus keeping:
Insertion
Deletion
Searching
all O(log n)
Rotations, in general:
Raise some nodes and lower others to help balance the tree
Ensure that we do not violate any characteristics of a binary
search tree
Thus all nodes to the left must still have values smaller
All nodes to the right must still have values larger
498
Rotations Involving Many Nodes
A three node rotation was easy.
Let’s look at a more complicated one.
499
Literally, this is what we must do…
For a right rotation, the
top node must have a left
child
505
Color? Really?
How would we include ‘color’ as a characteristic of a Node?
507
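One simple way (the field name is an assumption):

class Node {
    public int data;
    public Node left;
    public Node right;
    public boolean isRed = true;   // new nodes are inserted red
}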
Red-Black Rules: #2
The root of the tree MUST be black.
508
Red-Black Rules: #3
If a node is red, its children MUST be black
The converse is NOT true; black nodes can have black or red
children
509
Red-Black Rules: #4
Every path from the root to a leaf or null child must have the
same number of black nodes.
510
Summary
These are the four rules:
All nodes are either red or black.
The root must be black.
A red node can only have black children.
All paths from the root to a leaf or null child have the same number of black nodes
513
Color Flip
Now suppose we insert 15
(initially red). What rule is
broken?
516
Putting in (16)…
We initially have this
situation ->
If we make (16)
black, what is
violated?
517
Putting in (16)…
We initially have this
situation ->
If we make (15)
black, what is
violated?
518
Putting in (16)…
We initially have this
situation ->
519
Need rotations
To fix this situation:
We have to color flip, to get rid of the rule (3) violation
But we also have to rotate to fix other problems
So, we need both color flips and rotations.
520
How do we do it?
Let’s reduce it to a general case
We initially insert (16) as red
General:
Let X be a node that causes a rule violation
Let P be the parent of X
Let G be the grandparent of X (the parent of P)
521
This Example
X is (16)
P is (15) - the parent of X (16)
G is (17) – the grandparent of X (16), parent of P (15)
522
Insertion: Color Flips
To find the point of insertion, you have to start at the root
and go down the tree
If you encounter a black node with two red children:
Flip both children’s color to black
Flip the black node to red, unless it’s the root (then keep it
black)
523
Color Flip: Revisit
Again, let’s look at
inserting (15), red
We start at (13) which is
black, and see it has two
red children (8) and (17)
Flip (8) and (17) to black
Normally we’d flip (13) to
red, but it’s the root
Now go right to (17)
Go left, that’s for (15)
(17) is black, so we can just pop it in
524
Color Flip: Revisit
Again, let’s look at
inserting (16), red
Start at (13), go right
At (17), go left
At (15), go right – that’s
for (16)
525
When must we rotate…
So we’ve gone down the tree, flipped colors as necessary, and
gotten to the point of insertion for our new node, X
X is red
Call its parent P
526
Inside vs. Outside Grandchildren
X is an outside grandchild if:
X is a left child of P, and P is a left child of G, or…
X is a right child of P, and P is a right child of G
X is an inside grandchild if:
X is a right child of P, and P is a left child of G, or…
X is a left child of P, and P is a right child of G
527
Rotations Required
If X is an inside grandchild:
Flip the color of G
Flip the color of X
Rotate with P at the top, in the direction that raises X
Rotate with G at the top, in the direction that raises X
This is a perfect example, (16) is an inside grandchild of (17)
X is (16)
P is (15)
G is (17)
528
Step 1: Color Flips
Flip the color of X (16)
Flip the color of G (17)
529
Step 2: Rotate with P as the top
Rotate with P (15) as the top, in the direction that raises X
(16). In this case, it’s to the left
530
Step 3: Rotate with G as the top
Rotate with G (17) as the top, in the direction that raises X
(16). In this case, it’s to the right
532
Summary: Insertion
Start at the root, and find the point of insertion
Go right or left, just like an ordinary binary search tree
But as you descend, if you find a black node with two red children,
flip color
* IF YOU THEN HAVE A RED PARENT WITH RED CHILD, ROTATE
USING RULES BELOW *
At the point of insertion,
You’ll insert some node X as the child of P and grandchild of G
If P is black, done
If P is red, then:
If X is an outside grandchild, flip the colors of G and P and rotate with G as the
top in the direction that raises X
If X is an inside grandchild, flip the colors of G and X, and:
Rotate with P as the top in the direction that raises X
Rotate with G as the top in the direction that raises X
533
Example
Construct the red-black tree that results from inserting the
following elements:
10 20 30 40 50 60 70 80 90 100
Remember with binary search tree, this results in maximum
non-balance!
534
Another Example
Draw the red-black tree that results from inserting the
following elements:
1 6 8 11 13 15 17 22 25 27
535
Deletion: Red-Black Trees
This is very difficult to do
Remember: Deletion from a plain binary search tree is hard!
In Red-Black Trees:
You must delete just as you would from a binary search tree
PLUS, uphold the properties of Red-Black Trees
537
Efficiency: Searching
Because of the extra effort that we take with insertion and
deletion, a red black tree will always be balanced
For n nodes, no more than log(n)+1 levels
538
Efficiency: Insertion
Note: Insertion into a red-black tree will be (faster or
slower?) than a regular binary search tree
540
Final Comparison
Compare to binary search trees:
If your data is fairly random, a binary search tree will likely be
better
Of course, you’re playing the odds
But there is a penalty for inserting and deleting into a red-black tree, once
the point of insertion is found
Thus if the data would be fairly balanced in a binary search tree, it’s better
If your data is fairly sorted, a red-black tree will likely be better
May ask: why would our data ever be sorted?
What would be a structure for which a red-black tree would be good?
541
Other Balanced Trees
Just to be aware
AVL Trees
Instead of a color, each node stores the difference between the heights of
its left and right subtrees
This difference cannot be greater than 1
Similar penalties, advantages vs. binary search trees
A bit slower than red-black trees actually, so rarely used
Multiway or 2-3-4 Tree
Each node has left children and right children, with the same properties
as a binary search tree
Easier to keep balanced, but requires linear search through left children
when for example we ‘branch left’
However, if the number of left (or right) children is restricted to a
small number, not too bad
542
Hash Tables
543
Hash Tables: Overview
Provide very fast insertion and searching
Both are O(1)
Is this too good to be true?
Disadvantages
Based on arrays, so the size must be known in advance
Performance degrades when the table becomes full
No convenient way to sort data
Summary
Best structure if you have no need to visit items in order and you can predict the size of your database in advance.
544
Motivation
Let’s suppose we want to insert a key into a data structure,
where the key can fall into a range from 0 to m, where m is
very large
And we want to be able to find the key quickly
546
Hash Table
Idea
Provide easy searching and insertion by mapping keys to
positions in an array
This mapping is provided by a hash function
Takes the key as input
Produces an index as output
547
Hash Function: Example
The easiest hash function is the following:
H(key) = key % tablesize
H(key) now contains a value between 0 and tablesize-1
So if we inserted the following keys into a table of size 10: 13, 11456, 2001, 157
You probably already see potential for collisions
Patience, we'll come to it!
Index Value
0
1 2001
2
3 13
4
5
6 11456
7 157
8
9
548
What have we accomplished?
We have stored keys of an unpredictable, large range into a smaller data structure
And searching and inserting becomes easy!
Say we now insert element 207: H(207) = 7
We have a collision at position 7
Index Value
0
1 2001
2
3 13
4
5
6 11456
7 157
8
9
550
What have we learned?
If we use hash tables, we need the following:
Some way of handling collisions. We’ll study a couple ways:
Open addressing
Which has 3 kinds: linear probing, quadratic probing, and double
hashing
Separate chaining
551
Linear Probing
Presumably, you will have defined your hash table size to be 'safe'
As in, larger than the maximum amount of items you expect to
store
As a result, there should be some available cells
552
Linear Probing: Example
Again, say we insert element 207
H(207) = 207 % 10 = 7
This results in a collision with element 157
So we search linearly for the next available cell, which is at position 8
And put 207 there
Index Value
0
1 2001
2
3 13
4
5
6 11456
7 157
8 207
9
553
Linear Probing
Note: This complicates insertion and searching a bit!
For example, if we then inserted element 426, we would have to check three cells before finding a vacant one at position 9
You apply H(k), and probe!
Index Value
0
1 2001
2
3 13
4
5
6 11456
7 157
8 207
9
554
Linear Probing: Clusters
As the table to the right illustrates, linear probing also tends to result in the formation of clusters
Where long runs of cells in a row are populated
And other large stretches of cells are sparse
This becomes worse as the table fills up
Degrades performance
Index Value
0
1 2001
2
3 13
4
5
6 11456
7 157
8 207
9 426
555
Linear Probing: Clusters
Lafore: A cluster is like a 'faint scene' at a mall
Initially, the first arrivals come
Later arrivals come because they wonder why everyone was in one place
As the crowd gets bigger, more are attracted
Same thing with clusters!
Items that hash to a value in the cluster will add to its size
Index Value
0
1 2001
2
3 13
4
5
6 11456
7 157
8 207
9 426
556
Linear Probing
One option: use a bigger table
If the table has 20 slots instead of 10: H(k) = k % 20
More memory used
But, less clustering
Index Value
0
1 2001
2
…
17 157
18
19
557
Linear Probing
Linear probing is the simplest way to handle collisions, and is
thus worthy of explanation
Let’s look at the Java implementation on page 533
This assumes a class with member variables:
hashArray (the hash table)
arraySize (the size of the hash table)
Assume an empty slot contains -1
We’ll construct:
hashFunc()
find()
insert()
delete()
558
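A sketch of the first three under those assumptions (hashArray of long, arraySize its length, -1 meaning empty; assumes the table never completely fills):

private int hashFunc(long key) {
    return (int) (key % arraySize);              // simple modulo hash
}
public int find(long key) {
    int hashVal = hashFunc(key);
    while (hashArray[hashVal] != -1) {           // stop at the first empty cell
        if (hashArray[hashVal] == key) return hashVal;   // found: return the index
        hashVal = (hashVal + 1) % arraySize;     // step ahead, with wraparound
    }
    return -1;                                   // not found
}
public void insert(long key) {
    int hashVal = hashFunc(key);
    while (hashArray[hashVal] != -1)             // probe until a vacant cell
        hashVal = (hashVal + 1) % arraySize;
    hashArray[hashVal] = key;
}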
Quadratic Probing
The main problem with linear probing was its potential for
clustering
Quadratic probing attempts to address this
Instead of linearly searching for the next available cell
i.e. for hash x, search cell x+1, x+2, x+3, x+4….
Search quadratically
i.e. for hash x, search cell x+1, x+4, x+9, x+16, x+25…
Idea
On a collision, initially assume a small cluster and go to x+1
If that’s occupied, assume a larger cluster and go to x+4
If that’s occupied assume an even larger cluster, and go to x+9
559
Quadratic Probing: Example
Returning to our old example with inserting 207
H(207) = 207 % 10 = 7
This results in a collision with element 157
In this case, slot 7 is occupied but slot 7+1=8 is open, so we put it there
Index Value
0
1 2001
2
3 13
4
5
6 11456
7 157
8
9
560
Quadratic Probing
Now, if we insert 426
H(426) = 426 % 10 = 6
Which is occupied
So is 6+1=7; probing quadratically, 6+4=10 wraps around to slot 0, which is open
Index Value
0
1 2001
2
3 13
4
5
6 11456
7 157
8 207
9
561
Quadratic Probing
We have achieved a decrease in the cluster count
Clusters will tend to be smaller and more sparse
Instead of having large clusters and largely sparse areas
Thus quadratic probing got rid of what we call primary clustering.
Index Value
0 426
1 2001
2
3 13
4
5
6 11456
7 157
8 207
9
562
Quadratic Probing
Quadratic probing does, however, suffer from secondary clustering
Where, if you have several keys hashing to the same value
The first collision requires one probe
The second requires four
The third requires nine
The fourth requires sixteen
Index Value
0 426
1 2001
2
3 13
4
5
6 11456
7 157
8 207
9
563
Quadratic Probing
Secondary clustering would happen if we inserted, for example:
827, 10857, 707, 1117
Because they all hash to 7
Not as serious a problem as primary clustering
But there is a better solution that avoids both.
Index Value
0 426
1 2001
2
3 13
4
5
6 11456
7 157
8 207
9
564
Double Hashing
The problem thus far is that the probe sequences are always
the same
For example: linear probing always generates x+1, x+2, x+3...
Quadratic probing always generates x+1, x+4, x+9…
566
Double Hashing: Example
Returning to our old example with inserting 207
H(207) = 207 % 10 = 7
This results in a collision with element 157
So we hash again, to get the probe step
Suppose we choose c=5
Then:
P(207) = 5 - (207 % 5)
P(207) = 5 - 2 = 3
Index Value
0
1 2001
2
3 13
4
5
6 11456
7 157
8
9
567
Double Hashing: Example
So we insert 207 at position:
H(207) + P(207) = 7 + 3 = 10
Wrapping around, this will put 207 at position 0
Index Value
0 207
1 2001
2
3 13
4
5
6 11456
7 157
8
9
568
Double Hashing: Example
Now, let's again insert value 426
We run the initial hash:
H(426) = 426 % 10 = 6
We get a collision, so we probe:
P(426) = 5 - (426 % 5) = 5 - 1 = 4
And insert at location:
H(426) + P(426) = 10
Wrapping around, we get 0. Another collision!
Index Value
0 207
1 2001
2
3 13
4
5
6 11456
7 157
8
9
569
Double Hashing: Example
So, we probe again
P(426) = 4
So we insert at location 0+4 = 4, and this time there is no collision
Because the probe step depends on the key, clusters don't build up
Index Value
0 207
1 2001
2
3 13
4 426
5
6 11456
7 157
8
9
570
Java Implementation, page 547
Let’s try this again:
Again, we have our hash table stored in hashArray
And arraySize as the size of the hash table
Again, assume positive integers and all entries are initially -1
Let’s construct
hashFunc()
hashFunc2()
find()
insert()
delete()
571
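The new piece is hashFunc2(), which turns the key into a probe step; a sketch matching the example's c=5:

private int hashFunc2(long key) {
    return 5 - (int) (key % 5);      // yields 1..5, never 0
}
// On a collision: hashVal = (hashVal + hashFunc2(key)) % arraySize;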
Note…
What is a potential problem with choosing a hash table of size
10 and a c of 5 for the probe, as we just did?
572
Probe Sequence
The probe sequence may never find an open cell!
Because H(0) = 0, we’ll start at hash location 0
If we have a collision, P(0) = 5 so we’ll next check 0+5=5
If we have a collision there, we’ll next check 5+5=10, with
wraparound we get 0
We’ll infinitely check 0 and 5, and never find an open cell!
573
Double Hashing Requirement
The root of the problem is that the table size is not prime!
For example if the size were 11:
0, 5, 10, 4, 9, 3, 8, 2, 7, 1, 6
If there is even one open cell, the probing is guaranteed to find
it
Generally, for open addressing, double hashing is best
574
Separate Chaining
The alternative to open
addressing
Does not involve probing
to different locations in the
hash table
Rather, every location in
the hash table contains a
linked list of keys
575
Separate Chaining
Simple case, 7 element
hash table
H(k) = k % 7
So:
21, 77 each hash to
location 0
72 hashes to location 2
75, 5, 19 hash to location 5
577
Java Implementation
Let’s look at pages 555-557
Note: We will need a linked list and the hash table!
Will take a little time
578
A Good Hash Function
Has two properties:
Is computable quickly; so as not to degrade performance of
insertion and searching
Can take a range of key values and transform them into indices
such that the key values are distributed randomly across the
hash table
579
For example…
Data can be highly non-random
For example, a car-part ID:
033-400-03-94-05-0-535
580
Rule #1: Don't Use Non-Data Digits in the Code
Compress the key fields down enough until every bit counts
For example:
The category (digits 3-5, with restricted values 100, 150, 200, …, 850, counting by 50s) needs to be compressed down to run from 0 to 15
The checksum is not necessary, and should be removed. It is a
function of the rest of the code and thus redundant with respect
to the hash table
581
Rule #2: Use All of the Data
Every part of the key should contribute to the hash function
The more data portions that contribute to the key, the more likely it is that the keys will hash evenly
Avoiding collisions, which cause trouble no matter what algorithm you use
582
Rule #3: Use a Prime Number for
Modulo Base
This is a requirement for double hashing
Important for quadratic probing
Especially important if the keys may not be randomly
distributed
The more keys that share a divisor with the array size, the more
collisions
Example, non-random data which are multiples of 50
If the table size is 50, they all hash to the same spot
If the table size is 10, they all hash to the same spot
If the table size is 53, no keys divide evenly into the table size. Better!
583
Hashing Efficiency
Insertion and Searching are O(1) in the best case
This implies no collisions
If you minimize collisions, you can approach this runtime
If collisions occur:
Access times depend on resulting probe lengths
Every probe equals one more access
So every worst case insertion or search time is proportional to:
The number of required probes if you use open addressing
The number of links in the longest list if you use separate chaining
584
Efficiency: Linear Probing
Let's assume a load factor L, where L is the fraction of hash table slots which are occupied.
Average probe counts for linear probing (Knuth):
Successful: (1/2)(1 + 1/(1-L))
Unsuccessful: (1/2)(1 + 1/(1-L)^2)
At L = 2/3:
Successful: 2.0
Unsuccessful: 5.0
For separate chaining, the averages are:
Successful: 1 + (L/2)
Unsuccessful: 1 + L
588
Summary: When to use What
If the number of items that will be inserted is uncertain, use
separate chaining
Must create a LinkedList class
But performance degrades only linearly
With open addressing, major penalties
590
Heaps
591
Heaps: Motivation
Recall priority queues. What were they?
An ordered queue
Offered us O(1) removal, searching of:
The largest element if ordered from highest to lowest
The smallest element if ordered from lowest to highest
Insertion still takes O(n) time
593
Complete
A heap is a complete binary tree
In that, each row is completely filled in reading from left to right
The last row need not be
594
Array Implementation
Heaps are usually implemented with arrays
It will become clear why
595
Traversal
Note: An inorder traversal of the heap is very difficult!
Because the elements are weakly ordered
The heap condition is not as strong as the organizing principle
in the binary search tree
Thus this operation is not supported by heaps
596
Arbitrary search and deletion
Searching and deleting any element other than the maximum
is also not supported
For the same reasons, they are difficult and expensive
There are actually only two operations that a heap
supports….
597
Supported Operations
A heap only supports two operations:
Deletion of the maximum element
Insertion of a new element
These are actually the two required operations for a priority
queue!
598
Operation 1: Removing the max
We already know that the maximum element is:
At the root of the heap
At position 0 of the heap array
Generally, follow these steps:
Remove the root
Move the last node into the root
Trickle the last node down until it’s below a larger node and
above a smaller one
599
Example
Delete node 95 (the max) from the following heap:
600
Step 1
Remove the root and replace it by the last node
601
Step 2
Trickle the node down the tree, swap until it lies between
larger and smaller nodes
602
Step 2
Trickle the node down the tree, swap until it lies between
larger and smaller nodes
603
Step 2
Trickle the node down the tree, swap until it lies between
larger and smaller nodes
604
Step 2
Trickle the node down the tree, swap until it lies between
larger and smaller nodes
605
Implementation
Assuming we know the current size of the array (call it n),
removing the root and replacing it with the last node is easy
Just set heapArray[0] equal to heapArray[n-1]
And decrement n
606
Trickling Down
Once we’ve moved the last element to the root, if either or
both children are larger:
Find the bigger child and swap
607
Trickling Down
Note, given a node at index x:
Its parent is at (x-1)/2
Its children are at 2x+1 and 2x+2
608
Trickling Down
So generally, for trickling down:
Start x at 0
while (heapArray[x] < heapArray[2x+1] or
heapArray[x] < heapArray[2x+2])
largerChild = max(heapArray[2x+1], heapArray[2x+2])
swap(heapArray[x], largerChild)
if (largerChild was left child)
x = 2x+1
else
x = 2x+2
Of course, we need checks if we are at the bottom…
609
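With those checks added, a sketch of trickleDown() (heapArray of int, currentSize the number of items; it shifts nodes down instead of doing full swaps):

private void trickleDown(int index) {
    int top = heapArray[index];                    // save the node being moved
    while (index < currentSize / 2) {              // while the node has at least one child
        int leftChild = 2 * index + 1;
        int rightChild = leftChild + 1;
        int largerChild = leftChild;
        if (rightChild < currentSize && heapArray[rightChild] > heapArray[leftChild])
            largerChild = rightChild;              // pick the bigger child
        if (top >= heapArray[largerChild]) break;  // heap condition restored
        heapArray[index] = heapArray[largerChild]; // shift the child up
        index = largerChild;                       // follow the node down
    }
    heapArray[index] = top;                        // drop the node in place
}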
Java Implementation, page 592
We’ll avoid the extra Node class and just use integers
Let’s construct
A constructor, which takes a maximum heap size
A function to check if the heap is empty
A function to accept an index and perform a trickle down
A function to perform the deletion of the maximum element
610
Operation 2: Insertion
Generally, follow the following steps:
Insert the new node at the next available spot in the bottom row
If it violates the heap condition (translation: it’s bigger than its
parent)
Trickle the node upwards, until it’s smaller than the parent
611
Example
Add node 95 to the following tree:
612
Step 1
Put the new node in the next empty spot
613
Step 2
If the node is larger than the parent, swap it
614
Step 2
If the node is larger than the parent, swap it
615
Step 2
If the node is larger than the parent, swap it
616
Step 2
We’re done! We actually added a new maximum.
617
Implementation
Once again, step 1 is easy
If the current size of the heap is n
Set heapArray[n] to the new key
Increment n
618
Trickling Up
Again, given a node at index x:
Its parent is at (x-1)/2
Its children are at 2x+1 and 2x+2
619
Trickling Up
General approach will be:
while (x > 0 and heapArray[(x-1)/2] < heapArray[x])
swap(heapArray[x], heapArray[(x-1)/2])
x = (x-1)/2
620
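And a matching sketch of trickleUp(), with the at-the-root check made explicit (same shifting style as trickleDown()):

private void trickleUp(int index) {
    int bottom = heapArray[index];                 // save the new node
    int parent = (index - 1) / 2;
    while (index > 0 && heapArray[parent] < bottom) {
        heapArray[index] = heapArray[parent];      // shift the parent down
        index = parent;
        parent = (parent - 1) / 2;
    }
    heapArray[index] = bottom;                     // place the new node
}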
Java Implementation, page 592
Again, we’re avoiding the Node class and just using integers
Let’s implement
Constructor
Function to check if the heap is empty
A function which takes an index and trickles that node up
A function which performs the insertion
621
Let’s do our own example…
Begin with an initially empty heap, with a maximum size of
10. Perform the following operations, showing both the
array contents and the corresponding heap:
Insert 64
Insert 91
Insert 80
Insert 21
Insert 45
Remove the max
Insert 110
Insert 35
Remove the max
Remove the max
Insert 204
622
Efficiency
Swapping just takes O(1) time
Trickle up and trickle down each take O(log n) time
They each iteratively examine parent nodes
A heap is necessarily balanced because of its completeness
623
Why use arrays?
Actually you can use trees if you want to
This is called a tree heap
Let’s construct the Node class… will look similar to the binary
search tree (but remember the properties are different!)
Yes!
625
Heapsort Efficiency
Let’s look at each operation:
Insert all n into the heap
Remove the maximum element n times
Really? That easy?
627
New Insertion Process
The one we learned:
For each new node we insert:
Place it in the last available position O(1)
Trickle up O(log n)
Overall, O(n log n)
629
What we can note…
Trickling down
requires correct
subheaps
630
What we can note…
However, the leaf
nodes in the bottom
row, already must be
correct heaps
So we don’t have to
apply trickle down to
them
These comprise
roughly half the nodes
631 in the tree
So, summary with insertion
So with insertion, we actually can save operations
Instead of n operations of trickle up, we have n/2 operations of
trickle down
Overall
Randomly insert n elements -> n*O(1) = O(n)
Trickle down n/2 elements -> (n/2)*O(log n) = O(n log n)
So it’s still O(n log n), but you’re doing half as many trickles
Not a huge savings, but well worth it!
632
One way: Iteratively
Note that, the bottom row of nodes begins at index n/2
So, apply trickle down to nodes (n/2)-1 to 0 (the root)
Note we have to go in reverse, because trickle down works
properly only if subtrees are correct heaps
633
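In code, that's just a loop (a sketch, using the trickleDown() from before):

for (int j = currentSize / 2 - 1; j >= 0; j--)
    trickleDown(j);    // subtrees below j are already heaps, so this is safe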
Second Way: Recursion
Called ‘heapify’, and pass a
node index
Envision as:
If the index is larger than
(n/2)-1, do nothing
Otherwise:
Heapify the left subtree
Heapify the right subtree
Trickle down this node
635
Sharing Space
Note: We don’t necessarily
need two arrays for
heapsort!
n/2 trickle down
operations can be easily
done on the same array, it’s
just swapping contents of
cells
636
Sharing Space
When we remove the
maximum element, one
slot becomes open at the
end of the subarray we are
sorting
We can just insert the
maximum element there!
The result will be a sorted
array
637
(Time Pending)
Java Implementation, page 605
Summary:
Get the array size from the user -> 1
Fill with random data -> n
Turn array into a heap with n/2 applications of trickle down ->
(n/2)*(log n)
Remove items from the heap -> n log n
Write back to the end of the array -> n
639
Graphs
Graphs are a data structure which represent relationships
between entities
Vertices represent entities
Edges represent some kind of relationship
640
Example
The graph on the previous page could be used to model San
Jose freeway connections:
641
Adjacency
Two vertices are adjacent to one another if they are
connected by a single edge
For example:
I and G are adjacent
A and C are adjacent
I and F are not adjacent
642
Path
A path is a sequence of edges
644
Unconnected Graph
An unconnected graph consists of several connected
components:
646
Weighted Graphs
A graph where edges have weights, which quantifies the
relationship
For example, you may assign path distances between cities
Or airline costs
These graphs can be directed or undirected
647
Vertices: Java Implementation
We can represent a vertex as a Java class with:
Character data
A boolean data member to check if it has been visited
648
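A sketch:

class Vertex {
    public char label;           // e.g. 'A'
    public boolean wasVisited;   // marked during a search
    public Vertex(char lab) { label = lab; wasVisited = false; }
}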
Adjacency Matrix
An adjacency matrix for a graph with n nodes, is size n x n
Position (i, j) contains a 1 if there is an edge connecting node i
with node j
Zero otherwise
For example, here is a graph and its adjacency matrix:
649
Redundant?
This may seem a bit redundant:
651
Application: Searches
A fundamental operation for a graph is:
Starting from a particular vertex
Find all other vertices which can be reached by following paths
Example application
How many towns in the US can be reached by train from
Tampa?
Two approaches
Depth first search (DFS)
Breadth first search (BFS)
652
Depth First Search (DFS)
Idea
Pick a starting point
Follow a path to unvisited
vertices, as long as you can
until you hit a dead end
When you hit a dead end,
go back to a previous spot
and hit unvisited vertices
Stop when every path is a
dead end
653
Depth First Search (DFS)
Algorithm
Pick a vertex (call it A) as your starting point
Visit this vertex, and:
Push it onto a stack of visited vertices
Mark it as visited (so we don’t visit it again)
Visit any neighbor of A that hasn’t yet been visited
Repeat the process
When there are no more unvisited neighbors
Pop the vertex off the stack
Finished when the stack is empty
655
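A sketch of that algorithm over an adjacency matrix (vertexList, adjMat, and nVerts assumed as members of a Graph class):

public void dfs() {
    vertexList[0].wasVisited = true;               // visit the start vertex
    System.out.print(vertexList[0].label);
    java.util.Deque<Integer> stack = new java.util.ArrayDeque<>();
    stack.push(0);                                 // push it onto the stack
    while (!stack.isEmpty()) {
        int v = getAdjUnvisitedVertex(stack.peek());
        if (v == -1) stack.pop();                  // dead end: back up
        else {
            vertexList[v].wasVisited = true;       // visit the neighbor
            System.out.print(vertexList[v].label);
            stack.push(v);
        }
    }
}
private int getAdjUnvisitedVertex(int v) {         // any unvisited neighbor of v
    for (int j = 0; j < nVerts; j++)
        if (adjMat[v][j] == 1 && !vertexList[j].wasVisited) return j;
    return -1;
}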
Depth First Search: Complexity
Let |V| be the number of vertices in a graph
And let |E| be the number of edges
658
Example
Start from A, and execute breadth first search on this graph,
showing the contents of the queue at each step
Every step, we’ll either have a visit or a removal
659
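For reference while working the example, a BFS sketch: identical in shape to DFS, but a queue replaces the stack, so vertices are visited level by level (same assumed members):

public void bfs() {
    java.util.Queue<Integer> queue = new java.util.ArrayDeque<>();
    vertexList[0].wasVisited = true;               // visit the start vertex
    System.out.print(vertexList[0].label);
    queue.add(0);
    while (!queue.isEmpty()) {
        int v1 = queue.remove();                   // the removal step
        int v2;
        while ((v2 = getAdjUnvisitedVertex(v1)) != -1) {
            vertexList[v2].wasVisited = true;      // a visit step
            System.out.print(vertexList[v2].label);
            queue.add(v2);
        }
    }
}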
Breadth First Search: Complexity
Let |V| be the number of vertices in a graph
And let |E| be the number of edges
660
Minimum Spanning Trees (MSTs)
On that note of large numbers of edges slowing down our
precious search algorithms:
Let’s look at MSTs, which can help ameliorate this problem
It would be nice to take a graph and reduce the number of
edges to the minimum number required to span all vertices:
What's the number of edges now?
661
We’ve done it already…
Actually, if you execute DFS you’ve already computed the
MST!
Think about it: you follow a path for as long as you can, then
backtrack (visit every vertex at most once)
You just have to save edges as you go
662
Directed Graphs
A directed graph is a graph where the edges have direction,
signified by arrows:
663
Adjacency Matrix
The adjacency matrix for this graph does not contain
redundant entries
Because now each edge has a source and a sink
So entry (i, j) is only set to 1 if there is an edge going from i to j
0 otherwise
664
Topological Sort
Only works with DAGs (Directed Acyclic Graphs)
That is if the graph has a cycle, this will not work
667
Weighted Graph: Adjacency Matrix
The adjacency matrix for a weighted graph contains edge weights
Instead of 0 and 1 (INF marks a missing edge)
A B C D E F
A INF INF INF INF 0.1 0.9
B 0.3 INF 0.3 0.4 INF INF
C INF INF INF 0.6 0.4 INF
D INF INF INF INF 1 INF
E 0.55 INF INF INF INF 0.45
F INF INF INF 1 INF INF
669
Dijkstra’s Algorithm
Given a weighted graph, find the shortest path (in terms of
edge weights) between two vertices in the graph
Numerous applications
Cheapest airline fare between departure and arrival cities
Shortest driving distance in terms of mileage
670
Dijkstra’s Algorithm
Suppose in the graph below, we wanted the shortest path
from B to F
A C D E F
INF INF INF INF INF
672
Step 1
Take all edges leaving B,
and put their weights in
the table
Along with the source
vertex
A C D E F
0.3 (B) 0.3 (B) 0.4 (B) INF INF
673
Step 2
Pick the edge with
the smallest weight
and mark it as the shortest
path from B
(How do we know that?)
A C D E F
0.3* (B) 0.3 (B) 0.4 (B) INF INF
674
Step 3
Now choose one of the
edges with minimal weight
and repeat the process
(explore adj. vertices and
mark their total weight)
A C D E F
0.3* (B) 0.3 (B) 0.4 (B) INF INF
675
Step 4
In this case, we’ll look at A
Explore adjacent vertices
Enter the total weight
from B to those vertices
IF that weight is smaller than
the current entry in the table
Ignore the ones marked (*)
A C D E F
0.3* (B) 0.3 (B) 0.4 (B) 0.4 (A) 1.2 (A)
676
Step 5
Now, A is marked and
we’ve visited its neighbors
So pick the lowest entry
in the table (in this case C)
and repeat the process
677
Step 6
Visit C’s neighbors that
are unmarked
Insert their total weight
into the table, IF it’s
smaller than the current
entry
678
Step 7
Now we visit D
Which only contains
one edge to E
A C D E F
0.3* (B) 0.3* (B) 0.4* (B) 0.4 (A) 1.2 (A)
679
Step 8
Now we visit E
Has two outgoing edges
One to A (marked, ignore)
One to F, which changes
the table to
0.4 + 0.45 = 0.85
Which is smaller than the
current entry, 1.2
680
Step 9
Only one vertex left, so
we’re actually finished
Shortest path can be
obtained by starting from
the destination entry and
working backwards
F <- E <- A <- B
682
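A compact sketch of the table-filling process we just traced, using a weight matrix with Double.POSITIVE_INFINITY for missing edges (to recover the actual path, also record each vertex's predecessor whenever its table entry improves; reading those back gives F <- E <- A <- B):

static double[] dijkstra(double[][] adj, int source) {
    int n = adj.length;
    double[] dist = new double[n];                 // the table of best-known costs
    boolean[] marked = new boolean[n];             // the '*' entries
    java.util.Arrays.fill(dist, Double.POSITIVE_INFINITY);
    dist[source] = 0;
    for (int step = 0; step < n; step++) {
        int u = -1;                                // pick the cheapest unmarked vertex
        for (int v = 0; v < n; v++)
            if (!marked[v] && (u == -1 || dist[v] < dist[u])) u = v;
        if (dist[u] == Double.POSITIVE_INFINITY) break;   // the rest is unreachable
        marked[u] = true;
        for (int v = 0; v < n; v++)                // relax the edges leaving u
            if (adj[u][v] < Double.POSITIVE_INFINITY && dist[u] + adj[u][v] < dist[v])
                dist[v] = dist[u] + adj[u][v];
    }
    return dist;
}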