CSE 326 Lecture 10: B-Trees and Heaps
It's Lunch Time - What's Cookin'?
- B-Trees
  - Insert/Delete Examples and Run Time Analysis
- Summary of Search Trees
- Introduction to Heaps and Priority Queues
- Covered in Chapters 4 and 6 in the text
- All keys in the first subtree $T_1$ are less than $k_1$
- All keys in subtree $T_i$ must be between $k_{i-1}$ and $k_i$
Insert X:
- If leaf node is not full, fill in empty slot with X. E.g. Insert 5 in the tree below
- If leaf node is full, split leaf node and adjust parents up to root node. E.g. Insert 9 in the tree below
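[Figure: 2-3 tree with root key 13, internal nodes 6:11 and 17:-, and leaves (3 4 -), (6 7 8), (11 12 -), (13 14 -), (17 18 -)]
Delete X:
- May have to combine leaf nodes and adjust parents up to root node if number of data items falls below $\lceil L/2 \rceil = 2$. E.g. Delete 17 in the tree below
[Figure: 2-3 tree before deleting 17, with root key 13, internal nodes 6:11 and 17:-, and leaves (3 4 -), (6 7 8), (11 12 -), (13 14 -), (17 18 -)]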
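The leaf-level step of the insert rule (fill an empty slot, or split a full leaf) can be sketched as follows. This is a minimal illustration, not the textbook's code: the function name `insert_into_leaf`, the list representation of a leaf, and the choice to give the left half the extra item are all assumptions, and adjusting the parents up to the root is left out.

```python
from bisect import insort

L = 3  # assumed leaf capacity, matching the three-slot leaves of the 2-3 tree example


def insert_into_leaf(leaf, x, capacity=L):
    """Insert x into a sorted leaf (a Python list); hypothetical helper.

    Returns (left, right). right is None if the leaf had room.
    Otherwise the overfull leaf is split in two, and the caller
    would still have to add a key for `right` to the parent.
    """
    items = list(leaf)
    insort(items, x)                 # keep the leaf sorted
    if len(items) <= capacity:
        return items, None           # empty slot filled, no split
    mid = (len(items) + 1) // 2      # assumption: left half keeps the extra item
    return items[:mid], items[mid:]


print(insert_into_leaf([3, 4], 5))     # ([3, 4, 5], None)
print(insert_into_leaf([6, 7, 8], 9))  # ([6, 7], [8, 9])
```

Inserting 5 finds room in the leaf (3 4 -), while inserting 9 overflows the leaf (6 7 8) and forces a split.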
1. Each internal node has up to M-1 keys to search
2. Each internal node has between $\lceil M/2 \rceil$ and M children
3. Each leaf stores between $\lceil L/2 \rceil$ and L data items
Depth d of a B-Tree storing N data items is:
$\log_{\lceil M/2 \rceil}(N/L) - 1 \le d < \log_{\lceil M/2 \rceil}(N/L)$
i.e. $d = O(\log_{\lceil M/2 \rceil}(N/L)) = O(\log N)$
(Why? Hint: Draw a B-tree with minimum children at each node. Count its leaves as a function of depth)
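- Find: run time includes:
  - Searching the up to M-1 keys at each internal node on the path from the root to a leaf
- Insert and Delete: run time includes:
  - O(M) to handle splitting or combining keys in nodes
  - Total time is $O(\text{depth} \cdot M) = O((\log N / \log \lceil M/2 \rceil) \cdot M) = O((M / \log M) \cdot \log N)$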
How do we select M?
- If all N items are stored in internal memory, we want M and L to be small to minimize search time at each node/leaf
  - Typically M = 3 or 4 (e.g. M = 3 is a 2-3 tree)
- If the items are stored on disk: Disk access time dominates!
  - Choose M & L so that interior and leaf nodes fit on 1 disk block
  - To minimize number of disk accesses, minimize tree height
  - Typically M = 32 to 256, so that depth = 2 or 3; this allows very fast access to data in large databases.
- See Textbook for more numbers and examples.
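To get a feel for how the choice of M affects height, the depth bound $\log_{\lceil M/2 \rceil}(N/L)$ from the analysis can be evaluated numerically. This is only a sketch: the function name `depth_bound`, the item count of ten million, and setting L equal to M are assumptions made for illustration.

```python
import math


def depth_bound(n_items, m, l):
    """Evaluate log base ceil(M/2) of (N/L), the depth bound from the analysis."""
    base = math.ceil(m / 2)
    return math.log(n_items / l, base)


n = 10_000_000  # assumed number of data items
print(depth_bound(n, m=4, l=4))      # a bit over 21 levels with tiny nodes
print(depth_bound(n, m=256, l=256))  # a bit over 2 levels with large nodes
```

With small nodes the tree is about ten times deeper, which matters little in memory but would mean about ten times as many disk accesses.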
- B-Trees: many children per node allows shallow trees; all leaves are at the same depth, keeping the tree balanced at all times
A New Problem
- Instead of finding any item (as in a search tree), suppose we want to find only the smallest (highest priority) item quickly. Examples:
  - Operating system needs to schedule jobs according to priority
  - Doctors in ER take patients according to severity of injuries
  - Event simulation (bank customers arriving and departing, ordered according to when the event happened)
- We want an ADT that can efficiently perform FindMin, DeleteMin, and Insert
- Lists
  - If sorted: DeleteMin is O(1) but Insert is O(N)
  - If not sorted: Insert is O(1) but DeleteMin is O(N)
- Binary Search Trees (BSTs)
  - BSTs designed to be efficient for Find, not just FindMin
  - We only need FindMin/DeleteMin
- We can do better than BSTs!
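As a baseline before heaps, the sorted versus unsorted trade-off noted above can be made concrete with two tiny priority queues. This is a sketch under assumptions: the class names `SortedListPQ` and `UnsortedListPQ` are hypothetical, and the sorted version stores items in descending order so that the minimum sits at the end of a Python list, where removal is O(1).

```python
class SortedListPQ:
    """Sorted storage: Insert is O(N), DeleteMin is O(1)."""

    def __init__(self):
        self._items = []              # descending order, minimum last

    def insert(self, x):
        i = 0
        while i < len(self._items) and self._items[i] > x:
            i += 1                    # linear scan for the insert position
        self._items.insert(i, x)      # shifting the tail is also O(N)

    def delete_min(self):
        return self._items.pop()      # minimum is the last element


class UnsortedListPQ:
    """Unsorted storage: Insert is O(1), DeleteMin is O(N)."""

    def __init__(self):
        self._items = []

    def insert(self, x):
        self._items.append(x)         # no ordering maintained

    def delete_min(self):
        i = self._items.index(min(self._items))  # linear scan for the minimum
        return self._items.pop(i)


for pq in (SortedListPQ(), UnsortedListPQ()):
    for x in [5, 2, 8, 1]:
        pq.insert(x)
    print(pq.delete_min(), pq.delete_min())  # 1 2
```

Either way, one of the two operations costs O(N), which is the motivation for a structure where both are logarithmic.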
Heaps
- A binary heap is a binary tree that is:
  1. Complete: the tree is completely filled except possibly the bottom level, which is filled from left to right
  2. Satisfies the heap order property: every node is smaller than (or equal to) its children
- Therefore, the root node is always the smallest in a heap
- Since a heap is a complete binary tree, it can be stored in an array A:
  - Root node = A[1]
  - Children of A[i] = A[2i], A[2i + 1]
  - Keep track of current size N (number of nodes)
[Figure: a heap with N = 5 nodes and its array representation]
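The index arithmetic of the array representation can be sketched in a few lines. The helper names `children`, `parent`, and `is_heap` are hypothetical, the parent rule (integer division of i by 2) follows from the children rule, and the five example values are chosen here for illustration rather than taken from the slide.

```python
def children(i):
    """Indices of the children of A[i] in a 1-based heap array."""
    return 2 * i, 2 * i + 1


def parent(i):
    """Index of the parent of A[i]; follows from the children rule."""
    return i // 2


def is_heap(a, n):
    """Check the heap order property for a[1..n]; a[0] is unused."""
    return all(a[parent(i)] <= a[i] for i in range(2, n + 1))


A = [None, 2, 4, 6, 7, 5]  # assumed example values; slot 0 is unused
N = 5
print(children(2))    # (4, 5)
print(parent(5))      # 2
print(is_heap(A, N))  # True
```

No pointers are needed: completeness guarantees that positions 1 through N are all occupied.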
R. Rao, CSE 326
DeleteMin Take 1
- DeleteMin:
  - Delete (and return) value at root node
  - We now have a Hole at the root
  - Need to fill the hole with another value
  - Replace with smallest child?
    - Try replacing 2 with its smallest child, that node with its smallest child, and so on. What happens?
    - Tree is no longer complete!
    - Let's try another strategy
[Figure: the example heap after the hole has moved down to the bottom level; the tree is no longer complete]
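Why Take 1 breaks completeness can be checked with a short sketch. The function name `delete_min_take1` is hypothetical, and the example array is an assumed level-order reading of an 11-node heap with root 2; treat the values as illustrative.

```python
def delete_min_take1(a, n):
    """Naive DeleteMin on a[1..n]: slide the hole down via the smaller child.

    Returns (min_value, hole_index). The hole ends at a leaf, which
    need not be the last position n.
    """
    smallest = a[1]
    hole = 1
    while 2 * hole <= n:                      # hole still has a child
        child = 2 * hole
        if child + 1 <= n and a[child + 1] < a[child]:
            child += 1                        # right child is smaller
        a[hole] = a[child]                    # move the smaller child up
        hole = child
    a[hole] = None                            # the hole stays inside the tree
    return smallest, hole


heap = [None, 2, 3, 4, 7, 5, 8, 9, 11, 9, 6, 10]  # assumed example, N = 11
print(delete_min_take1(heap, 11))  # (2, 10)
print(heap)  # [None, 3, 5, 4, 7, 6, 8, 9, 11, 9, None, 10]
```

The hole lands at index 10 while index 11 is still occupied, so the bottom level is no longer filled from left to right.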
DeleteMin (Take 2)
- DeleteMin:
  - Delete (and return) value at root node
  - We now have a Hole at the root
  - Need to fill hole with another value
- Since the heap is smaller by one node, we can use the last item to fill the hole:
  1. Move last item to top; decrease size by 1
  2. Percolate down the top item to its correct position in the heap
[Figure: example heap with root 2, second level 3 and 4, third level 7, 5, 8, 9, and bottom level 11, 9, 6, 10]
[Figure: percolating the moved item down the example heap, one level per step]
- Percolate down:
  - Keep comparing with children A[2i] and A[2i + 1]
  - Replace with smaller child and go down one level
  - Done if both children are $\ge$ item or reached a leaf node
- What is the run time?
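The two steps of Take 2 (move the last item to the top, then percolate it down) can be sketched as below. This is one possible rendering, not the textbook's code: the name `delete_min`, the convention of returning the new size, and the example heap values are assumptions.

```python
def delete_min(a, n):
    """DeleteMin on a binary min-heap stored in a[1..n] (a[0] unused).

    Returns (min_value, new_size).
    """
    if n < 1:
        raise IndexError("delete_min from an empty heap")
    smallest = a[1]
    item = a[n]              # the last item will fill the hole
    a[n] = None
    n -= 1                   # heap shrinks by one node
    i = 1                    # current position of the hole
    while 2 * i <= n:        # stop when the hole reaches a leaf
        child = 2 * i
        if child + 1 <= n and a[child + 1] < a[child]:
            child += 1       # pick the smaller child
        if a[child] >= item:
            break            # both children are >= item: done
        a[i] = a[child]      # move the smaller child up, go down one level
        i = child
    if n >= 1:
        a[i] = item          # drop the item into its final position
    return smallest, n


heap = [None, 2, 3, 4, 7, 5, 8, 9, 11, 9, 6, 10]  # assumed example, N = 11
print(delete_min(heap, 11))  # (2, 10)
print(heap)  # [None, 3, 5, 4, 7, 6, 8, 9, 11, 9, 10, None]
```

The loop body does constant work and descends one level per iteration, so the cost is proportional to the depth of the heap.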
- At depth d, a heap can have from $N = 2^d$ nodes (one leaf at depth d) to $N = 2^{d+1} - 1$ nodes (all leaves at depth d)
- So, depth d for a heap is: $\log_2(N+1) - 1 \le d \le \log_2 N$, or $\Theta(\log N)$
- Percolate down visits at most one node per level. Therefore, run time of DeleteMin is O(log N)
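The node-count bounds on depth can be sanity-checked numerically. This is a small sketch in which the helper name `heap_depth` is hypothetical; it uses the fact that the depth of a complete binary tree with N nodes is the floor of log base 2 of N.

```python
def heap_depth(n):
    """Depth of a complete binary tree with n >= 1 nodes (root at depth 0)."""
    return n.bit_length() - 1   # equals floor(log2(n)) for positive integers


for n in (1, 5, 11, 1000):
    d = heap_depth(n)
    # A heap of depth d has between 2^d and 2^(d+1) - 1 nodes.
    assert 2 ** d <= n <= 2 ** (d + 1) - 1
    print(n, d)
# prints: 1 0, then 5 2, then 11 3, then 1000 9
```

Even a heap of a thousand items is only nine levels deep, so DeleteMin touches at most ten nodes on its way down.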
Next Class: Up close and personal with Binomial Heaps
To Do:
- Read Chapter 6
- Homework #2 (due this Friday)