Heaps and The Heapsort



Heaps and priority queues
Heap structure and position numbering
Heap structure property
Heap ordering property
Removal of top priority node
Inserting a new node into the heap
The heap sort
Source code for heap sort program

Heaps and priority queues


A heap is a data structure used to implement an efficient priority queue. The idea is to make it efficient to extract the element with the highest priority, i.e. the next item in the queue to be processed. We could use a sorted linked list, which needs O(1) time to remove the highest priority node but O(N) time to insert a node. With a tree structure both operations are O(log2 N), so insertion becomes much faster at the cost of a slightly slower removal.

Heap semantics

The use of the term heap to describe a tree ordered from bottom to top is unrelated to the use of the same term for the pool of memory available for dynamic allocation, e.g. through malloc(). It does relate to the winner of a competitive process, e.g. a football league, being "at the top of the heap".

Heap structure and position numbering 1

A heap can be visualised as a binary tree in which every layer is filled from the left. For every layer to be full, the tree would have to have a size exactly equal to 2^n - 1, i.e. a size in the series 1, 3, 7, 15, 31, 63, 127, 255 etc. So, to be practical for any particular size, a heap has every layer filled except for the bottom layer, which is filled from the left.
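The series of full-tree sizes above can be reproduced with a one-line calculation. The following is only a small sketch; the function name full_tree_size is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of nodes in a binary tree with `layers` completely filled layers,
   i.e. 2^layers - 1. The name is illustrative, not from the original notes. */
static size_t full_tree_size(unsigned layers)
{
    assert(layers < sizeof(size_t) * 8);   /* keep the shift well defined */
    return ((size_t)1 << layers) - 1;
}
```

Any size that is not in this series leaves the bottom layer only partly filled.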

Heap structure and position numbering 2

Heap structure and position numbering 3


In the above diagram, nodes are labelled by position and not by their contents. Note that the left child of each node is numbered node*2 and the right child is numbered node*2+1. The parent of a node is found using integer division (throwing away the remainder), so the parent of node i has position i/2.

Because this numbering system makes it very easy to move between nodes and their children or parents, a heap is commonly implemented as an array with element 0 unused.
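As a minimal sketch of this numbering, assuming the heap is held in an array with the root at position 1, the moves between a node and its relatives are single arithmetic operations. The helper names below are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Position helpers for a heap stored in an array whose element 0 is unused,
   so the root sits at position 1. Names are illustrative. */
static size_t left_child(size_t i)  { return 2 * i; }
static size_t right_child(size_t i) { return 2 * i + 1; }

static size_t parent(size_t i)
{
    assert(i > 1);   /* the root (position 1) has no parent */
    return i / 2;    /* integer division discards the remainder */
}
```

For example, node 5 has children at positions 10 and 11, and both of those have parent 5.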

Heap structure property

For a heap based on the above structure to be maintained, every layer must be complete except the bottom layer, which must be filled from the left and emptied from the right.

In order to insert items elsewhere and remove items from the top, localised rearrangements along single branches are made to restore this structure.

Heap ordering property 1


A data structure with the shape described above becomes useful if the data within it is organised so that the key of every node is smaller than or equal to the keys of its two children (or of its only child). A child with a key smaller than its parent's would violate this condition. When a heap is organised like this, it can be used as a priority queue, because the lowest key will always be at the top of the heap, where it is easiest to remove. This is called a min heap. The ordering property is reversed (a max heap) if the highest key should always be removed first.
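The ordering property can be checked mechanically. The sketch below assumes integer keys stored in positions 1 to size of an array, with element 0 unused; the function name is_min_heap is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if keys[1..size] satisfies the min heap ordering property.
   keys[0] is unused. Integer keys are an assumption made for this sketch. */
static bool is_min_heap(const int keys[], size_t size)
{
    assert(keys != NULL);
    for (size_t i = 2; i <= size; i++) {
        if (keys[i] < keys[i / 2])   /* child smaller than its parent */
            return false;
    }
    return true;
}
```

Comparing every node except the root with its parent covers every parent and child pair exactly once.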

Heap ordering property 2

Removal of top priority node 1

The rest of these notes assume a min heap will be used.

Removal of the top node creates a hole at the top. The hole is "bubbled" downwards by repeatedly moving the smaller of its children up into it, until the hole reaches a position where it can be filled with the rightmost node from the bottom layer without breaking the ordering. This process restores both the heap structure property and the heap ordering property.
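The removal described above might be coded roughly as follows. This is a sketch under assumptions: integer keys, a fixed capacity, and the names struct heap, HEAP_CAPACITY and heap_delete_min are all invented here and are not taken from the original program.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_CAPACITY 64              /* hypothetical fixed capacity */

struct heap {
    size_t size;                      /* number of keys currently stored */
    int keys[HEAP_CAPACITY + 1];      /* keys[0] is unused */
};

/* Remove and return the smallest key. The heap must not be empty. */
static int heap_delete_min(struct heap *h)
{
    assert(h->size > 0);
    int min = h->keys[1];
    int last = h->keys[h->size--];    /* rightmost node of the bottom layer */
    size_t hole = 1;                  /* the hole starts at the top */

    while (2 * hole <= h->size) {     /* while the hole has a child */
        size_t child = 2 * hole;
        /* choose the smaller of the two children, if there are two */
        if (child < h->size && h->keys[child + 1] < h->keys[child])
            child++;
        if (h->keys[child] >= last)
            break;                    /* last can be placed in the hole */
        h->keys[hole] = h->keys[child];   /* move the child up */
        hole = child;                     /* the hole moves down */
    }
    h->keys[hole] = last;
    return min;
}
```

The loop follows a single branch from the top, so it makes at most about log2 N moves.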

Removal of top priority node 2


Figures 6.6 to 6.11 from "Data Structures and Algorithm Analysis in C", 2e, M.A. Weiss.

Removal of top priority node 3

Removal of top priority node 4

Inserting a node into the heap 1


To insert a node into the heap, a hole is first created at the next right position available within the bottom layer. If the bottom layer is full, a new layer is started from the left. If an array is used to implement the heap, a size check must first be performed to avoid array overflow.

Values above a hole within the structure are bubbled down into the hole, so the hole "bubbles up" to the position where the hole can receive the value to be inserted while maintaining the heap ordering property.
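A possible form of the insertion, including the size check mentioned above, is sketched here. Integer keys, the fixed capacity and the names struct heap, HEAP_CAPACITY and heap_insert are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HEAP_CAPACITY 64              /* hypothetical fixed capacity */

struct heap {
    size_t size;                      /* number of keys currently stored */
    int keys[HEAP_CAPACITY + 1];      /* keys[0] is unused */
};

/* Insert key into the min heap. Returns false if the array is full. */
static bool heap_insert(struct heap *h, int key)
{
    assert(h != NULL);
    if (h->size >= HEAP_CAPACITY)
        return false;                 /* size check avoids array overflow */

    size_t hole = ++h->size;          /* next free position, bottom layer */
    while (hole > 1 && h->keys[hole / 2] > key) {
        h->keys[hole] = h->keys[hole / 2];   /* parent moves down */
        hole /= 2;                           /* hole bubbles up */
    }
    h->keys[hole] = key;
    return true;
}
```

Inserting 31, 13, 21 and 14 in that order into an empty heap leaves 13 at the top, with 14 and 21 as its children and 31 below 14.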

Inserting a node into the heap 2

Inserting a node into the heap 3

The heap sort


Using a heap to sort data involves performing N insertions followed by N delete min operations as described above. Memory usage will depend upon whether the data already exists in memory or whether the data is on disk. Allocating the array used to store the heap will be more efficient if N, the number of records, is known in advance. The array can then be allocated dynamically with exactly the required size, which is likely to be preferable to preallocating a fixed-size array.
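The whole procedure can be put together in one function. This is only a sketch and not the original heap sort program: it assumes the data is an in-memory array of integers, that N is known in advance, and the function name heap_sort is invented here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Sort data[0..n-1] into ascending order using a temporary min heap.
   Returns false if the heap array cannot be allocated. */
static bool heap_sort(int data[], size_t n)
{
    assert(data != NULL || n == 0);
    if (n == 0)
        return true;

    /* n is known in advance, so allocate exactly n + 1 elements
       (element 0 is unused). */
    int *keys = malloc((n + 1) * sizeof *keys);
    if (keys == NULL)
        return false;
    size_t size = 0;

    /* N insertions: the hole bubbles up from the bottom layer. */
    for (size_t k = 0; k < n; k++) {
        size_t hole = ++size;
        while (hole > 1 && keys[hole / 2] > data[k]) {
            keys[hole] = keys[hole / 2];
            hole /= 2;
        }
        keys[hole] = data[k];
    }

    /* N delete min operations: the hole bubbles down from the top. */
    for (size_t k = 0; k < n; k++) {
        data[k] = keys[1];
        int last = keys[size--];
        size_t hole = 1;
        while (2 * hole <= size) {
            size_t child = 2 * hole;
            if (child < size && keys[child + 1] < keys[child])
                child++;              /* smaller of the two children */
            if (keys[child] >= last)
                break;
            keys[hole] = keys[child];
            hole = child;
        }
        keys[hole] = last;
    }

    free(keys);
    return true;
}
```

Each of the 2N heap operations follows a single branch of the tree, so the sort as a whole takes O(N log2 N) time. This version uses a second array of N + 1 elements in addition to the data itself.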

Source code for heap sort program (slides 1 to 7; the listing is not reproduced in this text)
