Full Notes - Shashank R-6-134pdf
MODULE-1
Syllabus
INTRODUCTION TO DATA STRUCTURES: Data Structures, Classifications (Primitive
& Non-Primitive), Data structure Operations
Review of pointers and dynamic Memory Allocation,
ARRAYS and STRUCTURES: Arrays, Dynamic Allocated Arrays, Structures and Unions,
Polynomials, Sparse Matrices, representation of Multidimensional Arrays, Strings
STACKS: Stacks, Stacks Using Dynamic Arrays, Evaluation and conversion of Expressions
Introduction
• Data is a value or a set of values.
Example: 90, Bob
• A data item refers to a single unit of values. Data items that are divided into sub items are
called group items.
Example: The name of an employee can be divided into three sub items: first name, middle
name and last name.
• Data items that are not divided into sub items are called elementary items.
Data Structures: A data structure is a particular method of storing and organizing data
in a computer so that it can be used efficiently. Data structures are classified into
a. Primitive data structures: These can be manipulated directly by machine instructions.
Examples: integer, character, float, etc.
b. Non-primitive data structures: These cannot be manipulated directly by machine
instructions. Non-primitive data structures are further classified into linear and non-linear
data structures.
• Linear data structures: These show the relationship of adjacency between the elements of
the data structure. Examples are arrays, stacks, queues, lists, etc.
• Non-linear data structures: These do not show the relationship of adjacency between
the elements. Examples are trees and graphs.
Review of arrays
• Array is a collection of elements of the same data type
• An array is declared by appending brackets to the name of a variable.
For example, the declaration
int list[5];
creates an array of five integers. In C all array indices start at 0, so list[0], list[1], list[2], list[3] and list[4] are the names of the five array
elements, each of which contains an integer value.
Structures: A structure is a user-defined data type that can store related information
that may be of the same or different data types together.
The major difference between a structure and an array is that an array can store only information of
the same data type. A structure is therefore a collection of variables under a single name. The variables
within a structure may be of different data types, and each has a name that is used to select it from the
structure.
For example,
struct student {
char sname[10];
int age;
float average_marks;
} st;
To assign values to these fields dot operator (. ) is used as the structure member operator. We use
this operator to select a particular member of the structure.
strcpy(st.sname,"james");
st.age = 10;
st.average_marks = 35;
We can create our own structure data types by using the typedef statement. Consider an example
that creates a structure for the employee details.
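Comparing structures: Structures cannot be compared directly with the == operator, so a comparison function checks each field in turn. It returns TRUE if employee 1 and employee 2 are the same,
otherwise it returns FALSE.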
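The field-by-field comparison can be sketched as follows. This is a minimal sketch, assuming an employee structure with name, age and salary fields; the function name employeesEqual and the TRUE/FALSE macros are illustrative choices, not taken from the notes.

```c
#include <string.h>

#define FALSE 0
#define TRUE 1

/* Assumed layout of the employee record (illustrative). */
typedef struct {
    char name[10];
    int age;
    float salary;
} employee;

/* Returns TRUE if every field of e1 matches the same field of e2,
   otherwise FALSE. The name is a string, so it is compared with strcmp. */
int employeesEqual(employee e1, employee e2)
{
    if (strcmp(e1.name, e2.name) != 0)
        return FALSE;
    if (e1.age != e2.age)
        return FALSE;
    if (e1.salary != e2.salary)   /* exact comparison of the stored floats */
        return FALSE;
    return TRUE;
}
```

A call such as employeesEqual(emp1, emp2) can then be used wherever a program would otherwise try to write emp1 == emp2.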
Nested Structure: A structure can be embedded within another structure. That is, a structure
can have another structure as its member; such a structure is called a nested structure.
For example, along with our employee structure we may wish to include the date of birth of an
employee by using a nested structure
typedef struct {
int month;
int day;
int year;
} date;
typedef struct {
char name[10];
int age;
float salary;
date dob;
}employee;
Array of Structures: In the case of a student or an employee, we usually need to store the details
of more than one student or employee. When we have to store the details of a group of students,
we can declare an array of structures.
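Self-Referential Structures: A self-referential structure is one in which one or more members is a pointer to a structure of the same type.
Example: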
typedef struct list {
int data;
struct list *link;
} list;
• Each instance of the structure list will have two components, data and link. data is a single
integer, while link is a pointer to a list structure.
• The value of link is either the address in memory of an instance of list or the null pointer.
Consider these statements, which create three structures and assign values to their respective fields:
list item1, item2, item3;
item1.data = 5;
item2.data = 10;
item3.data = 15;
item1.link = item2.link = item3.link = NULL;
We can attach these structures together by replacing the null link field in item 2 with one that points
to item 3 and by replacing the null link field in item 1 with one that points to item 2.
item1.link = &item2; item2.link = &item3;
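Once the structures are attached in this way, they can be visited one after another by following the link fields until the null pointer is reached. A minimal sketch, assuming the list type declared above; the function name sumList is hypothetical and chosen only for illustration.

```c
#include <stddef.h>

typedef struct list {
    int data;
    struct list *link;
} list;

/* Follows the link fields starting at first until the null pointer
   is reached and returns the sum of the data fields visited. */
int sumList(const list *first)
{
    int sum = 0;
    for (; first != NULL; first = first->link)
        sum += first->data;
    return sum;
}
```

With item1, item2 and item3 linked as described, sumList(&item1) visits all three structures, while sumList(&item2) visits only the last two.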
Unions: A union is a user-defined data type that can store related information that may be of
different data types or same data type, but the fields of a union must share their memory
space. This means that only one field of the union is "active" at any given time.
Example 1: Suppose a program uses a number that is either an int or a float. We can define a union as
union num
{
int a;
float b;
};
union num n1;
Now we can store a value as n1.a = 5 or as n1.b = 3.14, but only one member is active at any point of time.
Initialization of pointer variables: Uninitialized variables hold unknown garbage values.
Similarly, an uninitialized pointer variable holds an arbitrary value that will be interpreted
as a memory address, and using it may lead to a runtime error.
These errors are difficult to debug and correct; therefore a pointer should always be initialized with a
valid memory address.
int a;
int *p;
// Here the variable a and the pointer variable p are of the same data type. To make p point at a we
// have to write the statement
p=&a; // now the address of a is stored in the pointer variable p, and p is said to be
// pointing at a.
If we do not want the pointer variable to point at anything we can initialize it to point at NULL
NOTE: A pointer variable can only point at a variable of the same type.
We can have more than one pointer variable pointing at the same variable. For example
int a;
int *p,*q;
p=&a;
q=&a;
Now both the pointer variables p and q point at the same variable a. There is no limit to the
number of pointer variables that can point to a variable.
Note:
➢ We need parentheses in expressions like (*p)++ because the precedence of postfix increment is
higher than the precedence of the indirection operator (*). If the parentheses are not used, the address
held in the pointer is incremented instead of the value it points to.
➢ The indirection and address operators are the inverse of each other; when combined in an
expression such as *&a they cancel each other.
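The difference between (*p)++ and *p++ can be seen in a small sketch. The two function names below are hypothetical and exist only to isolate each expression.

```c
/* (*p)++ : the parentheses force the dereference first, so the
   integer that p points to is incremented; p itself is unchanged. */
void incrementValue(int *p)
{
    (*p)++;
}

/* *p++ parses as *(p++): it yields the value currently pointed to
   and then advances the pointer; the value itself is unchanged.
   pp is the address of the caller's pointer so that the advance
   is visible to the caller. */
int readAndAdvance(int **pp)
{
    return *(*pp)++;
}
```

For an array a = {10, 20} with p pointing at a[0], incrementValue(p) changes a[0] to 11, whereas readAndAdvance(&p) returns the current a[0] and leaves p pointing at a[1].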
Memory allocation functions: In some languages the data structures are fully defined
at compile time. C can also allocate memory during execution; this feature is
known as dynamic memory allocation.
There are two ways in which we can reserve memory locations for a variable
• Static memory allocation: the declaration and definition of memory should be specified in
the source program. The number of bytes reserved cannot be changed during runtime
• Dynamic memory allocation : Data definition can be done at runtime .It uses predefined
functions to allocate and release memory for data while the program is running. To use
dynamic memory allocation the programmer must use either standard data types or must
declare derived data types
Memory usage: Four memory management functions are used with dynamic memory. malloc, calloc
and realloc are used for memory allocation. The function free is used to return memory when it is not
used.
Heap: It is the unused memory allocated to the program. When requests are made by memory
allocating functions, memory is allocated from the heap at run time.
Releasing memory (free): When memory locations allocated are no longer needed, they should be
freed by using the predefined function free.
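Syntax: void free(void *ptr);
Example: int *p;
p=(int*)malloc(sizeof(int));
free(p); /* only memory obtained from malloc, calloc or realloc may be passed to free */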
Example: To allocate a one dimensional array of integers whose capacity is n the following code can
be written.
int *ptr;
ptr=(int*)calloc(n,sizeof(int));
Reallocation of memory(realloc): The function realloc resizes the memory previously allocated by
either malloc or calloc.
Example
int *p;
p=(int*)calloc(n,sizeof(int));
p=(int*)realloc(p,s); /* where s is the new size in bytes */
The statement realloc(p,s) changes the size of the memory block pointed to by p to s bytes. The existing contents
of the block remain unchanged up to the smaller of the old and new sizes.
➢ When s > oldsize (block size increases), the additional (s – oldsize) bytes have unspecified values.
➢ When s < oldsize (block size reduces), the rightmost (oldsize – s) bytes of the old block are freed.
➢ When realloc is able to do the resizing, it returns a pointer to the start of the new block, which may be at a different address from the old block.
➢ When realloc is not able to do the resizing, the old block is unchanged and the function returns the
value NULL.
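Because realloc returns NULL on failure while leaving the old block allocated, assigning its result directly to the only pointer to that block would lose the block. A minimal sketch of a safer pattern, assuming the new element count is greater than zero; the helper name resizeIntArray is hypothetical.

```c
#include <stdlib.h>

/* Resizes the block *arr so that it can hold newCount ints
   (newCount is assumed to be greater than zero).
   Returns 1 on success and 0 on failure. On failure *arr still
   points to the old block, which remains valid and unchanged. */
int resizeIntArray(int **arr, size_t newCount)
{
    int *tmp = realloc(*arr, newCount * sizeof(int));
    if (tmp == NULL)
        return 0;      /* keep the old pointer; nothing is lost */
    *arr = tmp;        /* the block may have moved to a new address */
    return 1;
}
```

The temporary pointer is the design choice that matters here: the caller's pointer is overwritten only after the resize is known to have succeeded.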
Dangling Reference: Once a block of memory is released using the free function, there is no way to retrieve
this storage, and any remaining reference to this location is called a dangling reference.
Example 2:
int *p,*f;
p=(int*)malloc(sizeof(int));
*p=2;
f=p;
free(p);
*f=*f+2; /* Invalid: dangling reference */
The location that holds the value 2 is freed, but a reference to this location still exists through f.
Any attempt to access the location through f uses memory that has been freed, so the pointer f is a dangling reference.
Pointers can be dangerous: When pointers are used, the following points need to be taken care of
1. When a pointer is not pointing at any object, it is good practice to set it to NULL so that
no attempt is made to access a memory location that is outside the range of our program or that does
not contain a legitimate object.
One dimensional array: When we cannot determine the exact size of the array the space of the array
can be allocated at runtime.
For example consider the code given below
int i,n,*list;
printf("enter the size of the array");
scanf("%d",&n);
if (n<1)
{
fprintf(stderr,"Improper value of n \n");
exit(EXIT_FAILURE);
}
list=(int*)malloc(n*sizeof(int)); /* or list=(int*)calloc(n,sizeof(int)); */
In C we find the element x[i][j] by first accessing the pointer in x[i]. This pointer gives the address
of the zeroth element of row i of the array. Then by adding j*sizeof(int) to this pointer, the address
of the jth element of the ith row is determined
Example: to find x[1][3] we first access the pointer in x[1]; this pointer gives the address of x[1][0].
Now by adding 3*sizeof(int) to it, the address of the element x[1][3] is determined.
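A two dimensional array whose rows are reached through pointers in this way can be created at run time. The following is a minimal sketch, assuming integer elements; the function names make2dArray and free2dArray are illustrative, and error handling is reduced to returning NULL.

```c
#include <stdlib.h>

/* Allocates a rows x cols array of int as an array of row pointers.
   x[i] holds the address of the zeroth element of row i.
   Returns NULL if any allocation fails. */
int **make2dArray(int rows, int cols)
{
    int i, j;
    int **x = malloc(rows * sizeof(*x));
    if (x == NULL)
        return NULL;
    for (i = 0; i < rows; i++) {
        x[i] = calloc(cols, sizeof(**x));   /* row i, filled with zeros */
        if (x[i] == NULL) {
            for (j = 0; j < i; j++)         /* undo the rows already allocated */
                free(x[j]);
            free(x);
            return NULL;
        }
    }
    return x;
}

/* Releases every row and then the array of row pointers. */
void free2dArray(int **x, int rows)
{
    int i;
    for (i = 0; i < rows; i++)
        free(x[i]);
    free(x);
}
```

After int **x = make2dArray(3, 5); the expression x[1][3] is evaluated exactly as described: the pointer x[1] is fetched first and three elements are then added to it.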
Arrays
Linear Arrays: A Linear Array is a list of finite number (n) of homogenous data elements.
a. The elements of the array are referenced by an index set consisting of n consecutive
numbers(0. ..(n-1)).
b. The elements of the array are stored in successive memory locations
c. The number n of elements is called the length or size of the array. Length of the array can be
obtained from the index set using the formula
Length = Upper bound – Lower bound + 1
d. The elements of an array may be denoted by a[0], a[1], ..., a[n-1]. The number k in a[k] is
called a subscript or index and a[k] is called the subscripted value.
e. An array is usually implemented as a consecutive set of memory locations
Declaration: Linear arrays are declared by adding brackets to the name of a variable. The size of
the array is mentioned within the brackets. For example, int list[5]; declares an array of five integers.
In C all arrays start at index 0. Therefore, list[0], list[1], list[2], list[3], and list[4] are the names of
the five array elements, each of which contains an integer value.
Two dimensional arrays: C uses the array of array representation to represent a multidimensional
array. In this representation a 2 dimensional array is represented as a one dimensional array in which
each element is itself a one dimensional array.
• A two dimensional m X n array A is a collection of m* n data elements such that each element
is specified by a pair of integers called subscripts.
• An element with subscript i and j will be represented as A[i][j]
• Declaration: int A[3][4];
// It declares an array A that contains three elements, where each element is a one
dimensional array. Each one dimensional array has 4 integer elements.
                 Columns
          0        1        2        3
      0   A[0][0]  A[0][1]  A[0][2]  A[0][3]
Rows  1   A[1][0]  A[1][1]  A[1][2]  A[1][3]
      2   A[2][0]  A[2][1]  A[2][2]  A[2][3]
Example: Representation of the two dimensional array A[3][4] in row major order and column major
order
Row major order          Column major order
A[0][0]  Row 1           A[0][0]  Column 1
A[0][1]                  A[1][0]
A[0][2]                  A[2][0]
A[0][3]                  A[0][1]  Column 2
A[1][0]  Row 2           A[1][1]
A[1][1]                  A[2][1]
A[1][2]                  A[0][2]  Column 3
A[1][3]                  A[1][2]
A[2][0]  Row 3           A[2][2]
A[2][1]                  A[0][3]  Column 4
A[2][2]                  A[1][3]
A[2][3]                  A[2][3]
• Using the base address, the address of any element in an array A of size row X col can be
calculated using the following formulas, where w is the size of one element in bytes and array indexing starts at 0.
• Row major order
Address (A[i][j]) = Base address + w[i*col + j]
• Column major order
Address (A[i][j]) = Base address + w[i + row*j]
Example: When the compiler encounters an array declaration such as int A[3][4], it creates an array
A and allocates 12 consecutive memory locations. Each memory location is large enough to hold a
single integer.
Let α be the address of the first element A[0][0]; it is called the base address. In the calculations below we assume α = 100 and w = 2 bytes per integer.
Considering row major order: Using the base address we can calculate the addresses of the other
elements
Address of A[0][1] = 100 + 2[0*4 + 1] = 100 + 2 = 102
Address of A[0][2] = 100 + 2[0*4 + 2] = 100 + 4 = 104
Address of A[1][0] = 100 + 2[1*4 + 0] = 100 + 8 = 108
Address of A[2][3] = 100 + 2[2*4 + 3] = 100 + 22 = 122
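The two address formulas can be checked with a short sketch. The function names rowMajorOffset and colMajorOffset are hypothetical; each returns only the offset from the base address, so the base (100 in the worked example) is added by the caller.

```c
#include <stddef.h>

/* Byte offset of A[i][j] from the base address in row major order.
   cols is the number of columns and w the size of one element in bytes. */
size_t rowMajorOffset(size_t i, size_t j, size_t cols, size_t w)
{
    return w * (i * cols + j);
}

/* Byte offset of A[i][j] from the base address in column major order.
   rows is the number of rows and w the size of one element in bytes. */
size_t colMajorOffset(size_t i, size_t j, size_t rows, size_t w)
{
    return w * (i + rows * j);
}
```

For example, 100 + rowMajorOffset(2, 3, 4, 2) reproduces the address 122 computed for A[2][3]. Since C stores arrays in row major order, the same function with w = sizeof(int) gives the real distance in bytes between &A[0][0] and &A[i][j].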
In general, if an array is declared as a[upper0][upper1]...[uppern-1], the number of elements in the array is Π upperi,
where Π is the product of the upperi's. For instance, if we declare a as a[10][10][10], then we require
10·10·10 = 1000 memory cells to hold the array. There are two common ways to represent
multidimensional arrays: row major order and column major order. We consider only row major order
here. As its name implies, row major order stores multidimensional arrays by rows.
Array Operations: Operations that can be performed on any linear structure, whether it is
an array or a linked list, include the following
a. Traversal- processing each element in the list
b. Search- Finding the location of the element with a given key.
c. Insertion- Adding a new element to the list
d. Deletion- Removing an element from the list.
e. Sorting- Arranging the elements in some type of order.
f. Merging- combining two list into a single list.
Traversing Linear Arrays: Traversing an array is accessing and processing each element exactly
once. Considering the processing applied during traversal as display of elements the array can be
traversed as follows
void displayarray(int a[], int n) /* n is the number of elements in a */
{
int i;
printf("The Array Elements are:\n");
for(i=0;i<n;i++)
printf("%d\t",a[i]);
}
Insertion
• Inserting an element at the end of the array can be done provided the memory space allocated
for the array is large enough to accommodate the additional element.
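• If an element needs to be inserted in the middle, then all the elements from the specified position
to the end of the array should be moved one position down to make room for the new element while
keeping the order of the other elements.
• The following function inserts an element at the specified position
void insert(int a[], int *n, int pos, int element)
{
int i;
if (pos<0 || pos>*n) /* reconstructed check: the original listing starts at the else branch */
printf("Invalid position\n");
else
{
for(i=*n-1;i>=pos;i--)
a[i+1]=a[i]; //Make space for the new element in the given position
a[pos]=element;
(*n)++; //parentheses are needed: *n++ would advance the pointer instead
}
}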
Deletion
• If an element needs to be deleted from the middle, then all the elements from the position after it
to the end of the array should be moved one position up to fill the gap.
• The following function deletes an element at the specified position
(*n)--;
}
}
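The deletion steps described above can be sketched as one complete function. The name deleteAt, its parameter order and its return convention are assumptions made for illustration; n is passed by address so that the caller's element count is updated.

```c
/* Deletes the element at index pos from a[0..*n-1].
   Every element after pos is moved one position up, then the
   element count is reduced by one.
   Returns 1 on success and 0 if pos is out of range. */
int deleteAt(int a[], int *n, int pos)
{
    int i;
    if (pos < 0 || pos >= *n)
        return 0;
    for (i = pos; i < *n - 1; i++)
        a[i] = a[i + 1];
    (*n)--;    /* parentheses are required: *n-- would move the pointer */
    return 1;
}
```

Deleting index 1 from the array 1 2 3 4 5 with n = 5 leaves 1 3 4 5 with n = 4.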
Sorting: Sorting refers to the operation of rearranging the elements of an array in increasing or
decreasing order.
Example: Write a program to sort the elements of the array in ascending order using bubble
sort.
#include<stdio.h>
int main()
{
int a[10],i,j,temp,n;
printf("enter the size of the array : ");
scanf("%d",&n);
printf("enter the elements of the array\n");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
for(i=1;i<=n-1;i++)
for(j=0;j<n-i ;j++)
if (a[j] >a[j+1])
{
temp=a[j];
a[j]=a[j+1];
a[j+1]= temp;
}
printf("the sorted array is \n");
for(i=0;i<n;i++)
printf("%d \t",a[i]);
return(0);
}
Searching:
• Let DATA be a collection of data elements in memory and suppose a specific ITEM of
information is given.
• Searching refers to the operation of finding the Location LOC of the ITEM in DATA or
printing a message that the item does not appear here.
• The search is successful if the ITEM appear in DATA and unsuccessful otherwise.
The algorithm chosen for searching depends on the way the data is organised. The two algorithm
considered here is linear search and binary search.
LINEAR SEARCH: This program traverses the array sequentially to locate key
#include<stdio.h>
#include<stdlib.h>
int main()
{
int a[10],i,key,n;
printf("enter the size of the array : ");
scanf("%d",&n);
printf("enter the elements of the array\n");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
printf("enter the key \n");
scanf("%d",&key);
for(i=0;i<=n-1;i++)
if (a[i]== key)
{
printf("key %d found at %d",key,i+1);
exit(0);
}
printf("key not found");
return(0);
}
Complexity of Linear search: The complexity is based on the number of comparisons C(n) required
to find the key in the array.
• The best case occurs when the key is found at the first position: C(n) = O(1).
• The worst case occurs when the key is not found in the array or when it is in the
last position. Thus in the worst case the running time is proportional to n: C(n) = O(n).
• The running time of the average case uses the probabilistic notion of expectation. The number
of comparisons can be any number from 1 to n, and each occurs with probability p = 1/n. Then
C(n) = 1·(1/n) + 2·(1/n) + ... + n·(1/n)
= (1 + 2 + 3 + ... + n)·(1/n)
= [n(n+1)/2]·(1/n) = (n+1)/2
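BINARY SEARCH: This program searches an array sorted in ascending order by repeatedly halving the search interval
#include<stdio.h>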
#include<stdlib.h>
int main()
{
int a[10],i,key,mid,low,high,n;
printf("enter the size of the array : ");
scanf("%d",&n);
printf("enter the elements of the array in ascending order\n");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
printf("enter the key \n");
scanf("%d",&key);
low=0;
high=n-1;
while(low<=high)
{
mid=(low+high)/2;
if (key==a[mid])
{
printf("element %d found at %d",key,mid+1);
exit(0);
}
else
{
if (key<a[mid])
high = mid-1;
else
low=mid+1;
}
}
printf("key not found");
return(0);
}
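The same search can be written as a reusable function that returns a position instead of printing it. This is a minimal sketch; the name binarySearch and the convention of returning -1 for a missing key are assumptions, and the array is assumed to be sorted in ascending order.

```c
/* Returns the index of key in the ascending array a[0..n-1],
   or -1 if key does not appear in the array. */
int binarySearch(const int a[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   /* same midpoint, avoids overflow of low + high */
        if (key == a[mid])
            return mid;
        if (key < a[mid])
            high = mid - 1;   /* continue in the left half */
        else
            low = mid + 1;    /* continue in the right half */
    }
    return -1;
}
```

Each iteration halves the interval between low and high, so at most about log2(n) + 1 comparisons of key against a[mid] are needed.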
Polynomials: A polynomial is a sum of terms, where each term has the form ax^e, where x is the variable,
a is the coefficient, and e is the exponent.
The largest (or leading) exponent of a polynomial is called its degree. Coefficients that are zero are
not displayed.
• Standard mathematical definitions for the sum and product of polynomials are:
• Assume that we have two polynomials A(x) = Σ a_i x^i and B(x) = Σ b_i x^i,
then
A(x) + B(x) = Σ (a_i + b_i) x^i
A(x) · B(x) = Σ (a_i x^i · Σ (b_j x^j))
ADT Polynomial is objects: a set of ordered pairs of <ei, ai> where ai is Coefficients and ei is
Exponents, ei are integers >= 0
Functions:
for all poly,poly1,poly2 ∈ Polynomial,coef ∈ Coefficients, expon ∈ Exponents
Polynomial Representation:
• A polynomial can be represented as an array of structures as follows.
• Only one global array, terms, is used to store all the polynomials.
• The C declarations needed are:
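For our example, A(x) = 2x^1000 + 1 and B(x) = x^4 + 10x^3 + 3x^2 + 1, with startA = 0, finishA = 1, startB = 2, finishB = 5, and avail = 6.
        startA  finishA  startB                  finishB  avail
Coef    2       1        1       10      3       1
Exp     1000    0        4       3       2       0
Index   0       1        2       3       4       5        6
This representation handles polynomials with many zero terms well, since A(x) = 2x^1000 + 1 uses only six units of storage: one for startA, one for finishA, two for the
coefficients, and two for the exponents. However, when all the terms are nonzero, the current
representation requires about twice as much space as one that stores only the coefficients in an array indexed by exponent. This representation is therefore most useful
when the polynomial has few nonzero terms compared with its degree.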
Polynomial addition
• C function that adds two polynomials, A and B to obtain the resultant polynomial D = A + B.
The polynomial is added term by term.
• The attach function places the terms of D into the array terms, starting at position avail.
• If there is not enough space in terms to accommodate D, an error message is printed to the
standard error device and we exit the program with an error condition.
void padd(int startA,int finishA,int startB, int finishB, int *startD,int *finishD)
{
/* add A(x) and B(x) to obtain D(x) */
float coefficient;
*startD = avail;
while (startA <= finishA && startB <= finishB)
{
switch(COMPARE(terms[startA].expon, terms[startB].expon))
{
case -1: attach(terms[startB].coef,terms[startB].expon);
startB++;
break;
case 0: coefficient = terms[startA].coef + terms[startB].coef;
if (coefficient)
attach(coefficient,terms[startA].expon);
startA++;
startB++;
break;
case 1: attach(terms[startA].coef,terms[startA].expon);
startA++;
}
}
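while(startA <= finishA)
{
attach(terms[startA].coef,terms[startA].expon); /* add in remaining terms of A(x) */
startA++;
}
while(startB <= finishB)
{
attach(terms[startB].coef,terms[startB].expon); /* add in remaining terms of B(x) */
startB++;
}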
*finishD = avail-1;
}
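The padd function relies on the global array terms, the index avail, the COMPARE macro and the attach function. A minimal sketch of these supporting declarations follows, assuming a fixed limit MAX_TERMS of 100 (an illustrative value) and a structure type named polynomial with the coef and expon fields that padd uses.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_TERMS 100   /* assumed size of the shared terms array */

/* Evaluates to -1, 0 or 1 when x is less than, equal to or greater than y. */
#define COMPARE(x, y) (((x) < (y)) ? -1 : ((x) == (y)) ? 0 : 1)

typedef struct {
    float coef;
    int expon;
} polynomial;

polynomial terms[MAX_TERMS];   /* one global array holds all polynomials */
int avail = 0;                 /* index of the next free position in terms */

/* Adds a new term at position avail; exits with an error if terms is full. */
void attach(float coefficient, int exponent)
{
    if (avail >= MAX_TERMS) {
        fprintf(stderr, "Too many terms in the polynomial\n");
        exit(EXIT_FAILURE);
    }
    terms[avail].coef = coefficient;
    terms[avail++].expon = exponent;
}
```

Calling attach(2, 1000) followed by attach(1, 0) stores A(x) = 2x^1000 + 1 in terms[0] and terms[1] and leaves avail at 2, ready for the next polynomial.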
Sparse Matrices
• If a matrix contains m rows and n columns the total number of elements in such a matrix is
m*n. If m equals n the matrix is a square matrix.
A matrix in which most of the elements are zero is called a sparse matrix.
When a sparse matrix is represented as a two dimensional array, space is wasted. For example, if we
have a 1000 x 1000 matrix with only 2000 nonzero elements, the corresponding two dimensional array
requires space for 1,000,000 elements.
ADT Sparse Matrix objects: a set of triples, <row, column, value>, where row and column are
integers and form a unique combination, and value comes from the set item.
Functions:
for all a, b∈SparseMatrix, x∈item, i, j, maxCol, maxRow∈index
Example
Figure 1.4 two dimensional array and its sparse matrix stored as triples
Write a program to store a sparse matrix in triplet form and search an element specified by
the user
#include<stdio.h>
#include<stdlib.h>
int main()
{
struct sparse
{
int r;
int c;
int v;
};
struct sparse s[100];
int ele,i,j,k,n,m,key;
printf("enter the size of the array : ");
scanf("%d %d",&m,&n);
k=1;
s[0].r=m;
s[0].c=n;
printf("\n enter the elements of the array\n");
for(i=0;i<m;i++)
for(j=0;j<n;j++)
{
scanf("%d",&ele);
if(ele !=0)
{
s[k].r=i;
s[k].c=j;
s[k].v= ele;
k++;
}
s[0].v=k-1;
}
for(i=0;i<=s[0].v;i++)
printf(" %d\t %d \t %d \n ",s[i].r, s[i].c, s[i].v);
printf(" enter the key to be searched");
scanf("%d",&key);
for(i=1;i<=s[0].v;i++) /* start at 1: s[0] holds the dimensions and the count, not an element */
if (key== s[i].v)
{
printf("element found at %d row and %d column",s[i].r,s[i].c);
exit(0);
}
printf("element not found ");
return(0);
}
Transposing a Matrix: To transpose a matrix we must interchange the rows and columns.
This means that each element a[i][j] in the original matrix becomes element b[j][i] in the
transpose matrix.
The algorithm finds all the elements in column 0 and stores them in row 0 of the transpose matrix, finds
all the elements in column 1 and stores them in row 1, etc. Since the original matrix was ordered by
rows, and the columns were ordered within each row, the transpose matrix will also be arranged in
ascending order. The variable currentb holds the position in b that will contain the next transposed
term. The terms in b are generated by rows, by collecting the nonzero terms from column i of a.
The transpose b of the sparse matrix a of figure 1.4b is shown in figure 1.5
Analysis of transpose: The algorithm scans all the terms of the original matrix once for each column, so the asymptotic time complexity of the transpose algorithm is
O(columns·elements).
Fast transpose: The fastTranspose algorithm first determines the number of elements in each column of the original matrix. This gives us the
number of elements in each row of the transpose matrix. From this information, we can determine the
starting position of each row in the transpose matrix. We now can move the elements in the original
matrix one by one into their correct position in the transpose matrix. We assume that the number of
columns in the original matrix never exceeds MAX_COL.
• The first two for loops compute the values for rowTerms, the third for loop carries out the
computation of startingPos, and the last for loop places the triples into the transpose matrix.
These four loops determine the computing time of fastTranspose.
• The bodies of the loops are executed numCols, numTerms, numCols - 1, and numTerms times,
respectively. The computing time for the algorithm is O(columns + elements).
• However, transpose requires less space than fastTranspose since the latter function must
allocate space for the rowTerms and startingPos arrays.
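The four loops described above can be sketched as follows. This is an illustrative version, not necessarily the exact listing the notes refer to: the type name term, its field names and the value of MAX_COL are assumptions, and element 0 of each array is assumed to hold the number of rows, the number of columns and the number of nonzero terms.

```c
#define MAX_COL 50   /* assumed upper limit on the number of columns */

typedef struct {
    int row;
    int col;
    int value;
} term;

/* Stores the transpose of the sparse matrix a in b.
   a[0] and b[0] hold <rows, columns, number of nonzero terms>. */
void fastTranspose(term a[], term b[])
{
    int rowTerms[MAX_COL], startingPos[MAX_COL];
    int i, j;
    int numCols = a[0].col, numTerms = a[0].value;

    b[0].row = numCols;
    b[0].col = a[0].row;
    b[0].value = numTerms;
    if (numTerms > 0) {
        for (i = 0; i < numCols; i++)      /* loop 1: clear the counts */
            rowTerms[i] = 0;
        for (i = 1; i <= numTerms; i++)    /* loop 2: count terms per column of a */
            rowTerms[a[i].col]++;
        startingPos[0] = 1;
        for (i = 1; i < numCols; i++)      /* loop 3: first position of each row of b */
            startingPos[i] = startingPos[i - 1] + rowTerms[i - 1];
        for (i = 1; i <= numTerms; i++) {  /* loop 4: move each term into place */
            j = startingPos[a[i].col]++;
            b[j].row = a[i].col;
            b[j].col = a[i].row;
            b[j].value = a[i].value;
        }
    }
}
```

The loops run numCols, numTerms, numCols - 1 and numTerms times, which is where the O(columns + elements) bound comes from.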
Strings: A string is an array of characters that is terminated by the null character (\0).
For example, the declaration char s[] = "CLASS"; is stored as
C     L     A     S     S     \0
s[0]  s[1]  s[2]  s[3]  s[4]  s[5]
Using this declaration the compiler reserves just enough space to hold each character of the
word including the null character. In such cases we cannot store a string of length more than 5 in s.
String Null(m) ::= Return a string whose length is m characters long, but is initially set to
NULL. We write NULL as “”
Integer compare(s, t)::= If s equals t return 0
Else if s precedes t return -1
Else return +1
Boolean ISNull(s) ::= If (compare(s, NULL)) return FALSE
Else return TRUE
Integer Length(s) ::= If(compare(s, NULL))
Returns the number of characters in s else returns 0
String concat(s,t) ::= If(compare(t, NULL))
Return a string s whose elements are those of s followed by those of t
C provides several string functions which we access by including the header file string.h
Given below is a set of C string functions
Function                                               Description
char *strcat(char *dest, const char *src)              Appends the string pointed to by src to the end of the string pointed to by dest.
char *strncat(char *dest, const char *src, size_t n)   Appends at most n characters of the string pointed to by src to the end of the string pointed to by dest.
int strcmp(const char *str1, const char *str2)         Compares the string pointed to by str1 to the string pointed to by str2.
char *strcpy(char *dest, const char *src)              Copies the string pointed to by src to dest and returns dest.
char *strncpy(char *dest, const char *src, size_t n)   Copies up to n characters from the string pointed to by src to dest and returns dest.
size_t strlen(const char *str)                         Returns the length of the string str, not including the terminating null character.
char *strchr(const char *str, int c)                   Returns a pointer to the first occurrence of c in str. Returns NULL if not present.
char *strrchr(const char *str, int c)                  Returns a pointer to the last occurrence of c in str. Returns NULL if not present.
char *strtok(char *str, const char *delim)             Returns a token from string str. Tokens are separated by delim.
char *strstr(const char *str, const char *pat)         Returns a pointer to the start of pat in str. Returns NULL if not present.
size_t strspn(const char *str, const char *spanset)    Scans str for characters in spanset; returns the length of the initial span.
size_t strcspn(const char *str, const char *spanset)   Scans str for characters not in spanset; returns the length of the initial span.
char *strpbrk(const char *str, const char *spanset)    Scans str for characters in spanset; returns a pointer to the first occurrence of a character from spanset.
Storing Strings
Strings are stored in three types of structures
1. Fixed Length structure
2. Variable Length structure with fixed maximums
3. Linked structures
1. Fixed length Storage, record oriented: In this structure each line of text is viewed as a record
where all records have the same length or have the same number of characters
Example: Assuming our record has a maximum of 12 characters per record the strings are stored as
follows
0 D A T A
1 S T R U C T U R E S
2 A N D
3 A P P L I C A T I O N
4
5
Advantages:
• Ease of accessing data from any given record
• Ease of updating data in any given record (provided the length of the new data does not exceed
the record length)
Disadvantages
• Time is wasted reading an entire record if most of the storage consists of inessential blank
spaces
• Certain records may require more space than is available
• When a correction consists of more or fewer characters than the original text, updating requires
the entire record to be changed (this disadvantage can be resolved by using an array of pointers)
2. Variable length storage with fixed maximum: The storage of variable length strings in memory
cells with fixed lengths can be done in two ways
• Use a marker such as ($) to mark the end of the string
• List the length of the string as an additional field in the pointer array
Example :
0 5 D A T A $
1 11 S T R U C T U R E S $
2 4 A N D $
3 12 A P P L I C A T I O N $
4 0
5 0
3. Linked storage: Linked list is an ordered sequence of memory cells called nodes, where each node
stores two information the data and also stores the address of the next node in the list. Strings may be
stored in linked list as each node storing one character or a fixed number of characters and a link
containing the address of the node containing the next group of characters.
String insertion example: insert the string t into the string s at position i = 1
s = a m o b i l e \0
t = u t o \0
Initially
temp = \0
strncpy(temp,s,i)
temp = a \0
strcat(temp,t)
temp = a u t o \0
strcat(temp,(s+i))
temp = a u t o m o b i l e \0
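Consider two strings str1 and str2. Insert string str2 into str1 at position i.
#include<string.h>
#define MAX_SIZE 100
char str1[MAX_SIZE];
char str2[MAX_SIZE];
void strnins(char *s, char *t, int i)
{ /* insert string t into string s at position i; s must have room for MAX_SIZE characters */
char temp[MAX_SIZE] = ""; /* starts as the empty string */
if (i < 0 || i > (int)strlen(s) || strlen(s) + strlen(t) >= MAX_SIZE)
return; /* position out of bounds or result too long */
strncpy(temp,s,i); /* first i characters of s */
temp[i] = '\0'; /* strncpy does not add the terminator here */
strcat(temp,t);
strcat(temp,(s+i));
strcpy(s,temp);
}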
Pattern matching: Consider two strings str and pat, where pat is a pattern to be searched for in
str. The easiest way to find whether pat is in str is to use the built-in function strstr.
Since there are different methods of pattern matching, discussed below are two functions that
perform pattern matching in a more efficient way.
The simplest and least efficient method of pattern matching is sequential search. Its computing
time is O(n·m), where n is the length of pat and m is the length of str.
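int nfind(char *string, char *pat)
{ /* match the last character of pat first, then match from the beginning; pat must not be empty */
int i, j, start = 0;
int lasts = strlen(string)-1;
int lastp = strlen(pat)-1;
int endmatch = lastp;
for(i=0;endmatch<=lasts;endmatch++,start++)
{
if(string[endmatch] == pat[lastp])
{
j=0;i= start;
while(j<lastp && string[i]== pat[j])
{
i++;
j++;
}
if(j==lastp)
return start; /* successful match */
}
}
return -1;
}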
Simulation of nfind
Pattern: a a b (j = 0, lastp = 2)
String: a b a b b a a b a a (lasts = 9)
start  endmatch  string[endmatch]  Result
0      2         a                 No Match
1      3         b                 No Match
2      4         b                 No Match
3      5         a                 No Match
4      6         a                 No Match
5      7         b                 Match
Analysis of nfind algorithm: In the best and average cases the running time is linear, O(m), where m is the length of the string,
but the worst case computing time is still O(n·m).
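Failure function: In the Knuth-Morris-Pratt method, the failure value for position j of the pattern is taken here as the length of the longest proper prefix of the first j characters of pat that is also a suffix of those j characters.
Example: For the pattern pat = abcabcacab we have the failure values calculated as below
j        1  2  3  4  5  6  7  8  9  10
pat      a  b  c  a  b  c  a  c  a  b
failure  0  0  0  1  2  3  4  0  1  2
Therefore, when the failure function is not known in advance, the total
computing time is O(strlen(string)) + O(strlen(pat)).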
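A failure function and the matching loop that uses it can be sketched as below. This is one common formulation of Knuth-Morris-Pratt matching, written under these assumptions: positions are counted from 0, failure[j] stores the length of the longest proper prefix of pat[0..j] that is also a suffix of it, and the names computeFailure, pmatch and MAX_PAT are illustrative.

```c
#include <string.h>

#define MAX_PAT 100   /* assumed maximum pattern length */

/* Fills failure[0..strlen(pat)-1] with prefix lengths as described above. */
void computeFailure(const char *pat, int failure[])
{
    int n = (int)strlen(pat);
    int j, k = 0;            /* k = length of the current matched prefix */
    if (n == 0)
        return;
    failure[0] = 0;
    for (j = 1; j < n; j++) {
        while (k > 0 && pat[j] != pat[k])
            k = failure[k - 1];          /* fall back to a shorter prefix */
        if (pat[j] == pat[k])
            k++;
        failure[j] = k;
    }
}

/* Returns the index of the first occurrence of pat in string, or -1. */
int pmatch(const char *string, const char *pat)
{
    int failure[MAX_PAT];
    int lens = (int)strlen(string), lenp = (int)strlen(pat);
    int i, k = 0;            /* k = number of pattern characters matched */
    if (lenp == 0 || lenp > MAX_PAT)
        return -1;
    computeFailure(pat, failure);
    for (i = 0; i < lens; i++) {
        while (k > 0 && string[i] != pat[k])
            k = failure[k - 1];          /* reuse the part already matched */
        if (string[i] == pat[k])
            k++;
        if (k == lenp)
            return i - lenp + 1;
    }
    return -1;
}
```

The index i into string never moves backwards, which is why the matching phase is linear in strlen(string) and the preprocessing phase is linear in strlen(pat).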
Stack
Stack Definition and Examples
• Stack is an ordered list in which insertions (also called push) and deletions (also called pops )
are made at one end called the top.
• Given a stack S = (a0, ..., an-1), we say that a0 is the bottom element, an-1 is the top
element, and ai is on top of element ai-1, 0 < i < n.
• Since the last element inserted into a stack is the first element removed, a stack is also known
as a Last-In-First-Out (LIFO) list.
Implementation of stack
• The first, or bottom, element of the stack is stored in stack [0], the second in stack [1], and the
ith in stack [i-1].
• Variable top points to the top element in the stack.
• Top= -1 to denote an empty stack.
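ADT Stack is
objects: a finite ordered list with zero or more elements.
#include<stdio.h>
#include<stdlib.h>
#define MAX 10
int top= -1,stack[MAX];
void push(int item)
{ /* reconstructed: the listing calls push but its definition is missing */
if (top==MAX-1)
printf("Stack Overflow\n");
else
stack[++top]=item;
}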
int pop()
{
int itemdel;
if (top==-1)
return 0;
else
{
itemdel=stack[top--];
return itemdel;
}
}
void display()
{
int i;
if(top==-1)
printf("Stack Empty\n");
else
{
printf("Elements Are:\n");
for(i=top;i>=0;i--)
printf("%d\n",stack[i]);
}
}
int main()
{
int ch,item,num,itemdel;
while(1)
{
printf("\nEnter the Choice\n1.Push\n2.Pop\n3.Display\n4.Exit\n");
scanf("%d",&ch);
switch(ch)
{
case 1: printf("Enter item to be inserted\n");
scanf("%d",&item);
push(item);
break;
case 2: itemdel=pop();
if(itemdel)
printf("\n Deleted Item is:%d\n",itemdel);
else
printf("Stack Underflow\n");
break;
case 3: display();
break;
case 4: exit(0);
}
}
}
If we do not know the maximum size of the stack at compile time, space can be allocated for the
elements dynamically at run time and the size of the array can be increased as needed.
Creation of stack: Here the initial capacity of the stack is taken as 1. The value of the capacity can be altered
to suit the application
StackCreateS() ::=
int *stack;
stack=(int*)malloc(sizeof(int));
int capacity = 1;
int top = -1;
The function push remains the same except that MAX_STACK_SIZE is replaced with capacity
int pop()
{/* delete and return the top element from the stack */
if (top == -1)
return stackEmpty(); /* returns an error key */
return stack[top--];
}
stackFull with array doubling: The code for stackFull is changed. The new code for stackFull
attempts to increase the capacity of the array stack so that we can add an additional element to the
stack. In array doubling, the capacity of the array is doubled whenever it becomes necessary to increase
the capacity of an array.
void stackFull()
{
stack=(int*)realloc(stack, 2 * capacity * sizeof(int));
capacity = capacity * 2;
}
Analysis
• In the worst case, the realloc function needs to allocate 2*capacity*sizeof(*stack) bytes of
memory and copy capacity*sizeof(*stack) bytes of memory from the old array into the new
one.
• Under the assumptions that memory may be allocated in O(1) time and that a stack element
can be copied in O(1) time, the time required by one array doubling is O(capacity).
• Because the capacity doubles each time (1, 2, 4, ...), the number of elements copied over n
pushes is at most 1 + 2 + 4 + ... < 2n. So the total time spent in array doubling is O(n), where
n is the total number of push operations.
• Hence, even with the time spent on array doubling included, the total time of push over all n
pushes is O(n). This conclusion remains valid when the stack array is resized by any constant
factor c > 1.
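The pieces described above (creation with capacity 1, push, pop and stackFull with array doubling) can be put together as in the sketch below. It is only an illustration under some assumptions: the names createStack and the pointer-based pop interface are not from the notes, and the program simply exits if memory cannot be obtained.

```c
#include <stdio.h>
#include <stdlib.h>

/* Dynamic stack of ints; stack, top and capacity follow the notes. */
static int *stack = NULL;
static int capacity = 0;
static int top = -1;

/* Allocate the initial array with room for one element (assumed helper). */
void createStack(void)
{
    capacity = 1;
    top = -1;
    stack = malloc(capacity * sizeof *stack);
    if (stack == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
}

/* Double the capacity; called only when the array is full. */
static void stackFull(void)
{
    int *bigger = realloc(stack, 2 * (size_t)capacity * sizeof *stack);
    if (bigger == NULL) {
        perror("realloc");
        exit(EXIT_FAILURE);
    }
    stack = bigger;
    capacity *= 2;
}

void push(int item)
{
    if (top == capacity - 1)
        stackFull();
    stack[++top] = item;
}

/* Returns 1 and stores the popped value in *item, or 0 if the stack is empty. */
int pop(int *item)
{
    if (top == -1)
        return 0;
    *item = stack[top--];
    return 1;
}
```

Returning a status flag from pop, instead of the value 0, lets a stored 0 be told apart from an empty stack. After ten pushes the array has been doubled four times, from capacity 1 to 16.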
Application of stack
➢ Conversion of Expression
➢ Evaluation of expression
➢ Recursion
To convert an expression from infix to prefix or postfix we follow the rules of precedence.
• Precedence: The order in which different operators are evaluated in an expression is called
precedence.
• Associativity: The order in which operators of the same precedence are evaluated in an expression
is called associativity.
The operators are listed in order from higher precedence down to lower precedence
Operator                                  Associativity
--, ++                                    left to right
Unary operators: !, -, +, &, *, sizeof    right to left
*, /, %                                   left to right
+, -                                      left to right
The operands in the infix and the postfix expression are in the same order. With respect to operators ,
precedence of operators plays an important role in converting an infix expression to postfix expression.
We make use of the stack to insert the operators according to their precedence.
Algorithm Polish(Q,P)
Suppose Q is an arithmetic expression written in infix notation. This algorithm finds the equivalent
Postfix expression P.
char symbol,item;
push('#');
for (i=0;infix[i]!='\0';i++)
{
symbol=infix[i];
switch(symbol)
{
case '(': push(symbol);
break;
case ')':item=pop();
while(item!='(')
{
postfix[p++]=item;
item=pop();
}
break;
case '+':
case '-':
case '*':
case '/':
case '%':while(precd(s[top])>=precd(symbol))
{
item=pop();
postfix[p++]=item;
}
push(symbol);
break;
Analysis: Let n be the length of the infix string. Θ(n) time is spent extracting tokens. There are two while
loops, and the total time spent in them is Θ(n), since the number of tokens that get stacked and unstacked is
linear in n. So the complexity of the function is Θ(n).
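The conversion described above can be sketched as one complete function. This is a minimal sketch, not the original listing: it assumes single-character operands, a well-formed expression shorter than MAXLEN, and a precd function with the precedence values shown (the notes use precd but do not show its body). The default case and the final loop that empties the stack are filled in on that basis.

```c
#include <ctype.h>
#include <string.h>

#define MAXLEN 100

static char s[MAXLEN];   /* operator stack */
static int top = -1;

static void push(char c) { s[++top] = c; }
static char pop(void)    { return s[top--]; }

/* Precedence of a symbol; '#' and '(' get the lowest value so that
   no operator ever pops them off the stack. */
static int precd(char c)
{
    switch (c) {
    case '*': case '/': case '%': return 2;
    case '+': case '-':           return 1;
    default:                      return 0;
    }
}

/* Writes the postfix form of infix into postfix (must be large enough). */
void infix_to_postfix(const char *infix, char *postfix)
{
    int i, p = 0;
    char symbol, item;

    top = -1;
    push('#');                       /* bottom-of-stack marker */
    for (i = 0; infix[i] != '\0'; i++) {
        symbol = infix[i];
        switch (symbol) {
        case '(':
            push(symbol);
            break;
        case ')':
            while ((item = pop()) != '(')
                postfix[p++] = item;
            break;
        case '+': case '-': case '*': case '/': case '%':
            /* >= gives left-to-right associativity */
            while (precd(s[top]) >= precd(symbol))
                postfix[p++] = pop();
            push(symbol);
            break;
        default:
            if (!isspace((unsigned char)symbol))
                postfix[p++] = symbol;   /* operand goes straight to output */
        }
    }
    while (s[top] != '#')            /* flush the remaining operators */
        postfix[p++] = pop();
    postfix[p] = '\0';
}
```

For example, calling infix_to_postfix("a+b*c", out) leaves "abc*+" in out, and "(a+b)*c" becomes "ab+c*".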
Each operator in a postfix string refers to the previous two operands in the string. While parsing the
string, each time we read an operand we push it onto the stack, and when we read an operator, its operands
are the two topmost elements of the stack. We pop these two operands, perform the
indicated operation and push the result back onto the stack so that it can be used by the next operator.
The following function evaluates a postfix expression using a stack; a stack of float elements is
declared globally
float s[25];
int top;
➢ It does not check whether the postfix expression is valid. If we input an erroneous expression, it
returns a wrong result.
➢ We cannot enter negative numbers, as the symbol indicating negation will be misinterpreted
as a subtraction operation.
Analysis: Let n be the length of the postfix string; then the complexity of the function is Θ(n).
Algorithm PostfixEval(P)
This algorithm finds the VALUE of an arithmetic expression P written in postfix notation
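A minimal sketch of such an evaluation function is given below. It is not the original listing: it assumes that every operand is a single digit, that only the operators +, -, * and / occur, and, like the function described above, it performs no validity checks. The names compute and postfix_eval are illustrative.

```c
#include <ctype.h>

static float s[25];   /* operand stack, as declared in the notes */
static int top = -1;

static void push(float v) { s[++top] = v; }
static float pop(void)    { return s[top--]; }

/* Apply a binary operator to two operands. */
static float compute(char op, float a, float b)
{
    switch (op) {
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    case '/': return a / b;
    default:  return 0.0f;   /* unknown operator: not validated in this sketch */
    }
}

/* Evaluate a postfix string whose operands are single digits. */
float postfix_eval(const char *postfix)
{
    int i;

    top = -1;
    for (i = 0; postfix[i] != '\0'; i++) {
        char symbol = postfix[i];
        if (isdigit((unsigned char)symbol)) {
            push((float)(symbol - '0'));
        } else if (!isspace((unsigned char)symbol)) {
            float op2 = pop();   /* the right operand is on top */
            float op1 = pop();
            push(compute(symbol, op1, op2));
        }
    }
    return pop();
}
```

For the postfix string 23*4+ the function pushes 2 and 3, replaces them by 6, pushes 4 and finally returns 10.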
Recursion: Recursion is the process of defining an object in terms of a simpler case of itself.
Suppose P is a function containing either a call statement to itself (direct recursion) or a call statement
to a second function that may eventually result in a call statement back to the original function
P (indirect recursion). Then the function P is called a recursive function. A recursive function must
have the following two properties:
• There must be a certain argument, called the base value, for which the function will not call
itself.
• Each time the function refers to itself, the argument of the function must be closer to the base
value.
Factorial function: The factorial of a number n is the product of all the integers from
1 to n, i.e. 1*2*3*...*n. It is represented as n!
Example 4!=4*3*2*1=24
5!=5*4*3*2*1=120
0!=1
Observe that 5!=5*4!. In general, n!=n*(n-1)!
• The definition is recursive since the function refers to itself for all values of n>0.
• The value of n! is explicitly given as 1 when n=0, which can be taken as the base
value.
• Without recursion, the factorial can be computed by the code
int factorial(int n)
{
int i, f=1;
for(i=1;i<=n;i++)
f=f*i;
return(f);
}
This is the iterative implementation of the factorial function
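The recursive definition expands as follows. For example
factorial(5)=5*factorial(4)
factorial(4)=4*factorial(3)
factorial(3)=3*factorial(2)
factorial(2)=2*factorial(1)
factorial(1)=1*factorial(0)
factorial(0)=1
The recursive implementation is
int factorial(int n)
{
if (n==0)
return(1);
else
return(n*factorial(n-1));
}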
Fibonacci numbers in C
Example:
Fibo(4)= fibo(3)+fibo(2)
=fibo(2)+fibo(1)+fibo(2)
=fibo(1)+fibo(0)+fibo(1)+ fibo(2)
= 1+ fibo(0)+fibo(1)+ fibo(2)
=1+0 + fibo(1)+ fibo(2)
SUNIL G L, Dept. of CSE(DS), RNSIT, Bengaluru Page 41
Data Structures and Applications/BCS304 Module 1
=1+0+1+fibo(2)
=2+ fibo(1)+ fibo(0)
=2+1+fibo(0)
=2+1+0= 3
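The expansion above corresponds to the usual recursive definition fibo(0) = 0, fibo(1) = 1 and fibo(n) = fibo(n-1) + fibo(n-2) for n > 1. A minimal sketch of a function with that behaviour follows; the name fibo is taken from the example, and n is assumed to be non-negative.

```c
/* Recursive Fibonacci: two base values, then the sum of the two
   previous terms. n is assumed to be non-negative. */
int fibo(int n)
{
    if (n == 0)
        return 0;
    if (n == 1)
        return 1;
    return fibo(n - 1) + fibo(n - 2);
}
```

As the expansion shows, the same subproblem (for example fibo(2)) is evaluated more than once, so the number of calls grows very quickly with n. An iterative loop avoids this repetition.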
GCD of two numbers: The function accepts two numbers as input and returns the gcd of the two
numbers
int gcd(int m, int n)
{
if (n==0)
return m;
return(gcd(n,m%n));
}
Example:
gcd(2,0)=2
gcd(0,2)= gcd(2,0)=2
gcd(4,2)= gcd(2,0)=2
gcd(7,3)= gcd(3,1)= gcd(1,0)=1
Binary search in C
int binsearch(int *a , int key, int low, int high)
{
int mid;
if (low>high)
return(-1); /* key not found */
mid=(low+high)/2;
if (key==a[mid])
return(mid);
else
if (key>a[mid])
return(binsearch(a,key,mid+1,high));
else
return(binsearch(a,key,low,mid-1));
}
Write a program to solve the Tower of Hanoi problem using a recursive function
#include<stdio.h>
int count=0; /* number of moves; global so that tower() can update it */
void tower(int n,char source,char temp,char dest)
{
if(n==1)
{
printf("Move disc 1 from %c to %c\n",source,dest);
count++;
return;
}
tower(n-1,source,dest,temp);
printf("Move disc %d from %c to %c\n",n,source,dest);
count++;
tower(n-1,temp,source,dest);
}
int main(void)
{
int n;
printf("Enter the number of discs\n");
scanf("%d",&n);
tower(n,'A','B','C');
printf("The number of moves=%d\n",count);
return 0;
}
Note: The minimum number of moves needed to solve the Tower of Hanoi is 2^n - 1, where n is the total number of
disks
Ackermann Function
The Ackermann function is a function with two arguments, each of which can be assigned any
non-negative integer 0, 1, 2, ... This function finds its application in mathematical logic.
This function is defined as follows
a) If m = 0 then A(m,n) = n + 1
b) If m != 0 but n = 0 then A(m,n) = A(m - 1,1)
c) If m != 0 and n != 0 then A(m,n) = A(m - 1, A(m,n - 1))
Example 1:
A(1,2) = A(0,A(1,1))
= A(0,A(0,A(1,0)))
= A(0,A(0,A(0,1)))
= A(0,A(0,2))
= A(0,3)
= 4
Example 2:
A(1,3) = A(0,A(1,2))
= A(0,A(0,A(1,1)))
= A(0,A(0,A(0,A(1,0))))
= A(0,A(0,A(0,A(0,1))))
= A(0,A(0,A(0,2)))
= A(0,A(0,3))
= A(0,4)
= 5
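The three cases of the definition translate directly into a recursive C function. The sketch below assumes non-negative arguments and small values of m, since the number of recursive calls and the result grow extremely fast. The name ackermann is illustrative.

```c
/* Ackermann function, following cases (a), (b) and (c) of the definition.
   Both arguments are assumed to be non-negative. */
int ackermann(int m, int n)
{
    if (m == 0)
        return n + 1;                                /* case (a) */
    if (n == 0)
        return ackermann(m - 1, 1);                  /* case (b) */
    return ackermann(m - 1, ackermann(m, n - 1));    /* case (c) */
}
```

The calls ackermann(1, 2) and ackermann(1, 3) reproduce the two worked examples and return 4 and 5.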
Iterative                               Recursive
Implemented using looping statements    Implemented using recursive calls to functions
Executes faster                         Takes more time to execute
Memory utilization is less              Memory utilization is more
Lines of code are more                  Lines of code are fewer
Does not require a stack                Implementation requires a stack
MODULE-2
QUEUES: Queues, Circular Queues, Using Dynamic Arrays, Multiple Stacks and queues.
LINKED LISTS : Singly Linked Lists and Chains, Representing Chains in C, Linked
Stacks and Queues, Polynomials
Queues:
A queue is an ordered list in which insertions and deletions take place at different ends. The end at
which new elements are added is called the rear, and that from which old elements are deleted is called
the front. Queues are also known as First-In-First-Out (FIFO) lists.
Example (array of size 6; f = front, r = rear)
Operation    f    r   [0] [1] [2] [3] [4] [5]
Initially   -1   -1   (queue empty)
Insert 3    -1    0    3
Insert 5    -1    1    3   5
Insert 7    -1    2    3   5   7
Delete       0    2        5   7          Deleted item = 3
C implementation of queues for an integer array: A queue can be represented by using an array to
hold the elements of the queue and two variables to hold the positions of the first and last elements
of the queue.
#define size 10
int q[size];
int front=-1 ,rear=-1;
Insert operation
The insert operation first checks for queue overflow. If the queue is not full it inserts one element into
the queue at the rear.
void insert(int item)
{
if (rear==size-1)
printf("queue overflow");
else
{
rear++;
q[rear]=item;
}
}
Delete operation: Delete operation checks for queue underflow condition and if the queue is not
empty it will remove the element at the front.
int delete()
{
int itemdel;
if (front==rear)
{
printf("queue underflow");
return(0);
}
else
{
front++;
itemdel=q[front];
return(itemdel);
}
}
Display operation: The display operation will display the elements of the queue if the queue is not
empty.
void display()
{
int i;
if (front==rear)
printf("queue empty");
else
{
for(i=front+1;i<=rear;i++)
printf("%d ",q[i]);
}
}
Disadvantage of linear queue: Once rear reaches size-1, no more elements can be inserted into the
queue, even if deletions have freed positions at the front of the array or have emptied the queue
completely. The space freed by deleted elements is never reused. This is the disadvantage of the linear
queue. It is overcome by treating the array as circular, so that position 0 follows position size-1.
Example:
• In a circular queue of size 6, initially front = 0 and rear = 0, i.e. when front == rear the queue is
empty. If all 6 positions were filled, rear would wrap around to 0 again and front == rear would
also hold for a full queue. So, we could not distinguish between an empty and a full queue.
• To avoid the resulting confusion, the value of rear is incremented before we check the
condition front == rear for queue overflow, so the queue holds at most MAX_QUEUE_SIZE - 1 elements.
#define MAX_QUEUE_SIZE 6
typedef int element; /* element type assumed to be int here */
element queue[MAX_QUEUE_SIZE];
int front=0, rear=0;
Insert operation: The insert operation first checks for queue overflow. If the queue is not full it inserts
one element into the queue at the rear.
Delete operation: Delete operation checks for queue underflow condition and if the queue is not
empty it will remove the element at the front.
element deleteq()
{
element item;
if (front == rear)
return queueEmpty();
front = (front+1) % MAX_QUEUE_SIZE;
return queue[front];
}
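The insert and delete operations of the circular queue can be sketched together as follows. This is an illustration under stated assumptions rather than the original listing: the elements are ints, and both functions return a status flag (1 for success, 0 for overflow or underflow) instead of calling queueFull or queueEmpty.

```c
#define MAX_QUEUE_SIZE 6

static int queue[MAX_QUEUE_SIZE];
static int front = 0, rear = 0;   /* front == rear means the queue is empty */

/* Insert item at the rear. Returns 1 on success, 0 if the queue is full. */
int addq(int item)
{
    int next = (rear + 1) % MAX_QUEUE_SIZE;   /* advance rear circularly */
    if (next == front)
        return 0;                             /* one slot is always left unused */
    rear = next;
    queue[rear] = item;
    return 1;
}

/* Delete the front element. Returns 1 and stores it in *item,
   or returns 0 if the queue is empty. */
int deleteq(int *item)
{
    if (front == rear)
        return 0;
    front = (front + 1) % MAX_QUEUE_SIZE;
    *item = queue[front];
    return 1;
}
```

With MAX_QUEUE_SIZE equal to 6 the queue accepts five insertions and rejects the sixth. After one deletion a further insertion succeeds, reusing position 0 of the array, which a linear queue could not do.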
• To add an element to a full queue, we must first increase the size of this array using a function
such as realloc.
• As with dynamically allocated stacks, we use array doubling. However, it isn't sufficient to
simply double array size using realloc.
• Consider a full queue with seven elements in an array whose capacity is 8. To visualize array
doubling when a circular queue is used, the array is flattened out.
• After realloc, the elements that had wrapped around to the start of the array are no longer
adjacent to the end of the enlarged array, so some elements must be moved.
To get a proper circular queue configuration, the number of elements copied can be limited to capacity
- 1 by customizing the array doubling code: the elements are copied into a new array starting at
position 0, and front and rear are then set as shown below.
/* switch to newQueue */
front= 2 * capacity - 1;
rear = capacity - 2;
capacity *= 2;
free(queue);
queue = newQueue;
}
The function copy(a,b,c) copies elements from locations a through b-1 to locations beginning at c
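A complete version of this doubling step can be sketched as below. It is an illustration under stated assumptions: the elements are ints, memcpy from the standard library is used in place of the copy function, createQueue is an assumed helper, and the full test is made before rear is advanced. The assignments to front, rear and capacity are the ones from the fragment above.

```c
#include <stdlib.h>
#include <string.h>

static int *queue;
static int capacity;
static int front, rear;   /* front is one position before the first element */

/* Assumed helper: allocate an empty circular queue. */
void createQueue(int initialCapacity)
{
    capacity = initialCapacity;
    front = rear = 0;
    queue = malloc((size_t)capacity * sizeof *queue);
    if (queue == NULL)
        exit(EXIT_FAILURE);
}

/* Double the array, copying the capacity - 1 elements to positions
   0 .. capacity - 2 of the new array. */
static void queueFull(void)
{
    int *newQueue = malloc(2 * (size_t)capacity * sizeof *newQueue);
    int start = (front + 1) % capacity;   /* index of the first element */

    if (newQueue == NULL)
        exit(EXIT_FAILURE);
    if (start < 2) {
        /* no wrap-around: the elements are contiguous */
        memcpy(newQueue, queue + start, (size_t)(capacity - 1) * sizeof *queue);
    } else {
        /* wrap-around: copy the tail segment, then the head segment */
        memcpy(newQueue, queue + start, (size_t)(capacity - start) * sizeof *queue);
        memcpy(newQueue + capacity - start, queue, (size_t)(rear + 1) * sizeof *queue);
    }
    /* switch to newQueue */
    front = 2 * capacity - 1;
    rear = capacity - 2;
    capacity *= 2;
    free(queue);
    queue = newQueue;
}

void addq(int item)
{
    if ((rear + 1) % capacity == front)
        queueFull();
    rear = (rear + 1) % capacity;
    queue[rear] = item;
}

/* Returns 1 and stores the front element in *item, or 0 if empty. */
int deleteq(int *item)
{
    if (front == rear)
        return 0;
    front = (front + 1) % capacity;
    *item = queue[front];
    return 1;
}
```

Copying into a fresh array moves at most capacity - 1 elements and leaves them in queue order starting at position 0, which is why front can be set to the last index of the new array and rear to capacity - 2.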
Deques: A deque (pronounced either as deck or dequeue) is a linear list in which elements
can be added or removed at either end but not in the middle. The term deque is a contraction
of the name double-ended queue.
Representation: It is represented as a circular array deque with pointers left and right, which point to
the two ends of the queue. It is assumed that the elements extend from the left end to the right end in
the array. The term circular comes from the fact that DEQUE[0] comes after DEQUE[n-1] in the array.
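Example 1: Left = 2, Right = 4. The elements A, B, C occupy positions [2], [3], [4] of an array
with positions [0] to [6].
Example 2: Left = 5, Right = 1. The deque wraps around the end of the array: its four elements
(A, B, D, E in the figure) occupy positions [5], [6], [0], [1].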
Priority queue: A priority queue is a collection of elements in which each element has been
assigned a priority, and the order in which the elements are deleted and processed follows
these rules.
1. An element of higher priority is processed before any element of lower priority.
2. Two elements with the same priority are processed according to the order in which they were
added to the queue.
Example: Time sharing system: programs of higher priority are processed first and programs with the
same priority form a standard queue.
Representation using multiple queue: Use a separate queue for each level of priority. Each queue
will appear in its own circular array and must have its own pair of pointers, front and rear. If each
queue is allocated the same amount of space, a two dimensional array can be used instead of the linear
arrays.
Example: Consider the queue given below with the jobs and their priorities, and its representation. A
job with priority 1 is considered to have the highest priority
Job   Priority
J1    1
J2    1
J3    2
J4    4
J5    4
J6    6
Representation (one row per priority level)
Priority   Front   Rear   Queue contents
1          1       2      J1 J2
2          2       3      J3
3                         (empty)
4          4       5      J4 J5
5                         (empty)
6          6       6      J6
Delete operation
Algorithm:
1. Find the smallest k such that front[k]!=rear[k] ie Find the first non empty queue
2. Delete the process at the front of the queue
3. Exit
Insert operation
Algorithm: This algorithm adds an ITEM with priority number P to a priority queue maintained by a
two dimensional array
1. Insert ITEM as the rear element in row P-1 of the queue
2. Exit
• If there is a single stack, the starting point is top = -1 and the maximum value of top is SIZE-1
• If there are two stacks to be represented in a single array then we use stack [0] for the bottom
element of the first stack, and stack[MEMORY_SIZE - 1] for the bottom element of the second
stack. The first stack grows toward stack[MEMORY_SIZE - 1] and the second grows toward
stack[0]. With this representation, we can efficiently use all the available space.
• Representing more than two stacks within the same array poses problems since we no longer
have an obvious point for the bottom element of each stack. Assuming that we have n stacks,
we can divide the available memory into n segments. This initial division may be done in
proportion to the expected sizes of the various stacks, if this is known. Otherwise, we may
divide the memory into equal segments.
• Assume that i refers to the stack number of one of the n stacks.To establish this stack, we must
create indices for both the bottom and top positions of this stack.The convention we use is that
o bottom [i], 0 ≤ i < MAX_STACKS, points to the position immediately to the left of the
bottom element of stack i.
o top[i], 0 ≤ i < MAX_STACKS points to the top element.
o Stack i is empty if bottom[i] = top[i].
To divide the array into roughly equal segments we use the following code:
Stack i can grow from bottom[i] + 1 to bottom[i + 1] before it is full. The boundary for the last stack,
boundary[n], is set to MEMORY_SIZE - 1.
The initial configuration of the stacks is shown below, where m is the size of the memory.
element pop(int i)
{
if (top[i] == bottom[i])
return stackEmpty(i);
return stack[top[i]--];
}
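The division of the array into segments and the matching push and pop can be sketched as follows. This is an illustration under stated assumptions, not the original code: the array boundary plays the role of bottom (boundary[i] is the position immediately to the left of the bottom element of stack i), the values of MEMORY_SIZE and MAX_STACKS are arbitrary, the elements are ints, and a full segment is reported with a status flag instead of being repacked.

```c
#define MEMORY_SIZE 12
#define MAX_STACKS 3

static int stack[MEMORY_SIZE];
static int top[MAX_STACKS];
static int boundary[MAX_STACKS + 1];

/* Divide the array into MAX_STACKS roughly equal segments. */
void initStacks(void)
{
    int j;
    for (j = 0; j < MAX_STACKS; j++)
        top[j] = boundary[j] = (MEMORY_SIZE / MAX_STACKS) * j - 1;
    boundary[MAX_STACKS] = MEMORY_SIZE - 1;
}

/* Push item onto stack i. Returns 0 if the segment of stack i is full. */
int push(int i, int item)
{
    if (top[i] == boundary[i + 1])
        return 0;
    stack[++top[i]] = item;
    return 1;
}

/* Pop from stack i into *item. Returns 0 if stack i is empty. */
int pop(int i, int *item)
{
    if (top[i] == boundary[i])
        return 0;
    *item = stack[top[i]--];
    return 1;
}
```

With these values stack 0 occupies positions 0 to 3, stack 1 positions 4 to 7 and stack 2 positions 8 to 11. A fuller implementation could shift neighbouring stacks to make room when one segment fills while free space remains elsewhere in the array.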
Mazing Problem
Representation: The maze is represented as a two dimensional array in which zeros represent the open
paths and ones the barriers. A location in the maze is determined by its row number and
column number. The figure below shows a simple maze.
From an interior position we can move in eight directions. If the position is on the border then there
are fewer than eight directions. To avoid checking for border conditions the maze can be surrounded by a
border of ones. Thus an m*p maze will require an (m+2)*(p+2) array; the entrance is at position [1][1]
and the exit is at [m][p]. The possible directions to move can be predefined in an array move as shown
below, where the eight possible directions are numbered from 0 to 7. For each direction we indicate the
vertical and horizontal offset.
Offset move[8];
Table of moves: The array moves is initialized according to the table given below.
As we move through the maze, we may have the choice of several directions of movement. Since we
do not know the best choice we save our current position and arbitrarily pick a possible move. By
saving the current position we can return to it and try another path. A second two dimensional array
can be maintained to keep track of the visited locations. The maze can be implemented by making use
of a stack where the element is defined as follows.
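The offset table and the stack element can be sketched in C as below. The concrete layout is an assumption: the directions are numbered clockwise starting from north, and the names Offset, Element and next_position are illustrative rather than taken from the notes.

```c
/* Vertical and horizontal offset of one move. */
typedef struct {
    short vert;
    short horiz;
} Offset;

/* Directions 0..7, assumed order: N, NE, E, SE, S, SW, W, NW. */
static const Offset move[8] = {
    {-1,  0}, {-1,  1}, { 0,  1}, { 1,  1},
    { 1,  0}, { 1, -1}, { 0, -1}, {-1, -1}
};

/* A stack element records a position and the next direction to try from it. */
typedef struct {
    short row;
    short col;
    short dir;
} Element;

/* Position reached from (row, col) by one step in direction dir (0..7). */
Element next_position(int row, int col, int dir)
{
    Element e;
    e.row = (short)(row + move[dir].vert);
    e.col = (short)(col + move[dir].horiz);
    e.dir = 0;   /* directions from the new position are tried from 0 */
    return e;
}
```

Because the maze is surrounded by a border of ones, next_position can be applied at any interior cell without first checking whether the result falls outside the array.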
Linked List
Introduction
A linked list is a collection of zero or more nodes, where each node holds some information. Given the
address of the first node, any node in the list can be reached. Every node consists of two parts: one is
the information part and the other is the address of the next node. The link of the last node
contains a special value called NULL.
Representation of linked list: Each item in the list is called a node and contains two fields
➢ Information field - The information field holds the actual elements in the list
➢ Link field- The Link field contains the address of the next node in the list
To create a linked list of integers the node can be defined as follows using a self referential
structure.
struct Node
{
int info;
struct Node * link;
};
typedef struct Node NODE;
After the node type is defined, a new empty list is created as follows
NODE * first=NULL;
• The pointer first stores the address of the first node in the list. With this information we
will be able to access the location of all the other nodes in the list.
• To obtain a node we use the statement
first=(NODE*) malloc(sizeof(NODE));
• To place the information 5 we can use the statement first->info=5;
• As there are no other nodes in the list, the link part can be made NULL as follows
first->link=NULL;
first -> [ 5 | NULL ]
• The maintenance of linked lists in memory assumes the possibility of inserting new nodes
into the lists and hence requires some mechanism which provides unused memory space
for the new nodes. Similarly a mechanism is required which makes the deleted node
available for future use.
• Together with the linked list in memory, a special list is maintained which consist of
unused memory cells.
• This list which has its own pointer is called the list of available space or the free storage
list or the free pool. Such a list is also called AVAIL
Instead of using the malloc function directly, the following getNode() function can be used to get a new
node
NODE * getNode(void)
{
/* provide a node for use */
NODE * new;
if (avail)
{
new = avail;
avail = avail->link;
return new;
}
else
{
new=(NODE *)malloc(sizeof(NODE));
return new;
}
}
Instead of the free function the following retnode function can be used
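A minimal sketch of the pair getNode and retNode is given below, assuming that avail is a global pointer to the first node of the free storage list and that nodes have the NODE type defined earlier. The body of retNode is a reconstruction of the usual technique, not the original listing.

```c
#include <stdlib.h>

typedef struct Node {
    int info;
    struct Node *link;
} NODE;

static NODE *avail = NULL;   /* head of the list of available nodes */

/* Take a node from the available list if there is one, else allocate it. */
NODE *getNode(void)
{
    NODE *node;
    if (avail != NULL) {
        node = avail;
        avail = avail->link;
    } else {
        node = malloc(sizeof *node);
    }
    return node;
}

/* Return a node to the front of the available list instead of freeing it. */
void retNode(NODE *node)
{
    node->link = avail;
    avail = node;
}
```

A node passed to retNode is handed out again by the next call to getNode, so repeated insertions and deletions reuse the same memory without calling malloc and free each time.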
Garbage collection
• Suppose some memory space becomes reusable because a node is deleted from a list or an
entire list is deleted from a program, we can make this space to be available for future use.
One way is to immediately reinsert the space into the free storage list.
• This is done when a list is implemented by linear arrays. But this method may be too time
consuming for the operating system of the computer, so an alternative method is used.
• The operating system of a computer may periodically collect all the deleted space onto the
free storage list. This technique is called garbage collection.
temp->link=NULL;
if (first==NULL)
first=temp;
else
{
cur=first;
while(cur->link!=NULL)
cur=cur->link;
cur->link=temp;
}
}
}
}
Delete the node whose information part equals item from the linked list pointed to by first
prev=NULL;
cur=first;
while (cur!=NULL)
{
if (cur->info==item)
{
if (prev==NULL)
first=cur->link; /* the first node is being deleted */
else
prev->link=cur->link;
free(cur);
return(first);
}
else
{
prev=cur;
cur=cur->link;
}
}
printf("node with item not found");
return(first);
}
Delete the NODE present at location loc, the NODE that precedes is present at location locp. If
there is only one NODE then locp=NULL
if (first==NULL)
printf("list is empty");
else
{
cur=first;
while(cur!=NULL)
{
printf("%d \t", cur->info);
cur=cur->link;
}
}
if (first==NULL)
{
printf("list is empty");
return(0);
}
cur=first;
while(cur!=NULL)
{
count++;
cur=cur->link;
}
return(count);
}
return;
}
cur=cur->link;
}
}
printf("search unsuccessful");
}
printf("queue overflow\n");
return(first);
}
temp->info=item;
temp->link=NULL;
if (front==NULL)
{
rear=temp;
front=temp;
}
else
{
rear->link=temp;
rear=temp;
}
}
Function deletes the NODE in the front and returns the item
int del_front(NODE **front) /* the address of front is passed so that the caller's pointer is updated */
{
NODE *cur;
int itemdel;
if(*front==NULL)
{
printf("Queue underflow\n");
return 0;
}
cur=*front;
itemdel=cur->info;
*front=cur->link;
free(cur);
return(itemdel);
}
top[MAX_STACKS]
We assume that the initial condition for the stacks is: top[i] = NULL, 0≤i < MAX_STACKS
and the boundary condition is: top[i] = NULL if the ith stack is empty
Function push creates a new NODE, temp, and inserts the NODE in front of the ith stack.
Function pop returns the top element and changes top to point to the address contained in its
link field.
int pop(int i)
{/* remove top element from the ith stack */
int itemdel;
Stack * temp;
if (top[i]==NULL)
return stackEmpty();
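The push and pop operations on multiple linked stacks can be sketched in full as below. This is an illustration under stated assumptions: the node type Stack holds an int, MAX_STACKS is set to an arbitrary value, and both functions return a status flag (1 for success, 0 for failure) instead of calling stackEmpty.

```c
#include <stdlib.h>

#define MAX_STACKS 10

typedef struct stackNode {
    int info;
    struct stackNode *link;
} Stack;

/* Static storage, so every top[i] starts as NULL (all stacks empty). */
static Stack *top[MAX_STACKS];

/* Insert a new node in front of the ith stack. Returns 0 if no memory. */
int push(int i, int item)
{
    Stack *temp = malloc(sizeof *temp);
    if (temp == NULL)
        return 0;
    temp->info = item;
    temp->link = top[i];
    top[i] = temp;
    return 1;
}

/* Remove the top node of the ith stack. Returns 0 if the stack is empty. */
int pop(int i, int *item)
{
    Stack *temp = top[i];
    if (temp == NULL)
        return 0;
    *item = temp->info;
    top[i] = temp->link;   /* top now points to the next node */
    free(temp);
    return 1;
}
```

Each stack grows and shrinks independently, so no stack can overflow into another and no boundaries need to be maintained, unlike the array representation of multiple stacks.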
We assume that the initial condition for the queues is: front[i] = NULL, rear[i] = NULL, 0 ≤ i
< MAX_QUEUES
and the boundary condition is: front[i] = NULL iff the ith queue is empty
{
front[i] = temp;
rear[i] = temp;
}
else
{
rear[i]->link = temp;
rear[i]=temp;
}
}
Function deleteq deletes the item in the front of the ith queue
int deleteq(int i)
{/* delete an element from queue i */
Queue * temp;
int itemdel;
if (front[i]==NULL)
return queueEmpty();
A singly linked list in which the last NODE has a null link is called a chain. If the link field of the
last NODE points to the first NODE in the list, then such a linked list is called a circular list.
By keeping a pointer last to the last NODE instead of a pointer to the first NODE, we can insert easily
at both the front and the end of the list
if (last==NULL)
{
/* list is empty, change last to point to new entry */
last = new;
last->link = last;
}
else
{
/* list is not empty, add new entry at front */
new->link = last->link;
last->link = new;
}
}
if (last==NULL)
{
/* list is empty, change last to point to new entry */
last = new;
last->link = last;
}
else
{
/* list is not empty, add new entry at rear */
new->link = last->link;
last->link = new;
last = new;
}
}
count++;
temp = temp->link;
}
return count;
}
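A header linked list is a linked list which always contains a special NODE, called the header NODE,
at the beginning of the list. There are two types of header list.
• Grounded header list: the last NODE contains the NULL pointer.
• Circular header list: the last NODE points back to the header NODE.
Note: Unless stated otherwise, it is assumed that the linked list is a circular header list.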
temp->coef = c;
temp->expon = e;
temp->link =rear->link;
rear->link=temp;
rear=temp;
return(rear);
}
MODULE – 3
LINKED LISTS : Sparse Matrices, Doubly Linked List.
TREES: Introduction, Binary Trees, Binary Tree Traversals, Threaded Binary Trees.
• Each header NODE is in three lists: a list of rows, a list of columns, and a list of header
NODEs. The list of header NODEs also has a header NODE that has the same structure as an
entry NODE.
• The row and col value of the header NODE consist of the dimension of the matrix
Example:
Consider the sparse matrix shown below.
Since there are two different types of NODEs a union is used to create the appropriate data structure.
The necessary C declarations are as follows:
matrixPointer *next;
entryNODE entry;
} u;
};
• If we are pointing to a specific NODE, say p, then we can move only in the direction of the
links. The only way to find the NODE that precedes p is to start at the beginning of the list.
• Deleting an arbitrary NODE from a singly linked list is not easy, because easy deletion of an
arbitrary NODE requires knowing the preceding NODE.
• Can traverse only in one direction.
• Difficult to delete arbitrary NODEs.
A doubly linked list may or may not be circular. The data field of the header NODE usually contains
no information.
struct NODE
{
struct NODE * llink;
int data;
struct NODE * rlink;
};
The function dinsert() inserts a newNODE into a doubly linked list after a NODE pointed by
ptr
The function ddelete() deletes a NODE from a doubly linked list pointed by head
}
}
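The two operations on a doubly linked list can be sketched as follows. This is an illustration under stated assumptions: the list is circular with a header node, NODE is introduced here as a typedef for the structure above, and dcreate is an assumed helper that builds the empty list.

```c
#include <stdlib.h>

typedef struct Node {
    struct Node *llink;
    int data;
    struct Node *rlink;
} NODE;

/* Assumed helper: an empty circular list whose header points to itself. */
NODE *dcreate(void)
{
    NODE *head = malloc(sizeof *head);
    if (head != NULL) {
        head->llink = head->rlink = head;
        head->data = 0;   /* the header carries no information */
    }
    return head;
}

/* Insert newNode to the right of the node pointed to by ptr. */
void dinsert(NODE *ptr, NODE *newNode)
{
    newNode->llink = ptr;
    newNode->rlink = ptr->rlink;
    ptr->rlink->llink = newNode;
    ptr->rlink = newNode;
}

/* Delete the node 'deleted' from the list with header 'head'.
   Returns 0 if asked to delete the header node itself. */
int ddelete(NODE *head, NODE *deleted)
{
    if (deleted == head)
        return 0;
    deleted->llink->rlink = deleted->rlink;
    deleted->rlink->llink = deleted->llink;
    free(deleted);
    return 1;
}
```

Because every node stores its left neighbour, ddelete needs no traversal from the head to find the preceding node. This is the advantage over the singly linked list, paid for with one extra pointer per node.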
Space Efficiency: We have the overhead of storing two pointers for each element.
Polynomials
Polynomial Representation
We should be able to represent any number of different polynomials as long as memory is available.
In general, a polynomial is represented as:
A(x) = a_{m-1} x^{e_{m-1}} + ... + a_0 x^{e_0}
where the a_i are nonzero coefficients and the e_i are nonnegative integer exponents such that
e_{m-1} > e_{m-2} > ... > e_1 > e_0 ≥ 0.
We represent each term as a NODE containing coefficient and exponent fields, as well as a pointer
to the next term. Assuming that the coefficients are integers, the type declarations are:
struct polyNode {
int coef;
int expon;
struct polyNode * link;
};
typedef struct polyNode POLY;
Consider the polynomials a = 3x^14 + 2x^8 + x + 2 and b = 8x^12 - 3x^10 + 10x^5 + 3. They can be
represented as follows (each node shows coef, expon)
a -> [3, 14] -> [2, 8] -> [1, 1] -> [2, 0] -> NULL
b -> [8, 12] -> [-3, 10] -> [10, 5] -> [3, 0] -> NULL
Adding Polynomials
To add two polynomials, we examine their terms starting at the NODEs pointed to by a and b.
1. If a->expon = b->expon, we add the two coefficients a->coef + b->coef and create a new
term for the result c. Then a = a->link; b = b->link;
2. If a->expon < b->expon, then we create a duplicate term of b, attach this term to the
result, called c, and b = b->link;
3. If a->expon > b->expon, then we create a duplicate term of a, attach this term to the result,
called c, and a = a->link;
POLY *padd(POLY * a, POLY * b) /* return a polynomial which is the sum of a and b */
{
POLY *c,*tempa,*tempb,*lastc;
int sum;
c=(POLY*)malloc(sizeof(POLY)); /* c is a header node; the result starts at c->link */
c->link=NULL;
tempa=a;
tempb=b;
lastc=c;
while (tempa!=NULL && tempb!=NULL)
{
switch (COMPARE(tempa->expon, tempb->expon))
{
case -1: lastc=attach(tempb->coef, tempb->expon, lastc);
tempb=tempb->link;
break;
case 0: sum=tempa->coef + tempb->coef;
if (sum)
lastc=attach(sum, tempa->expon, lastc);
tempa=tempa->link; /* advance both lists even when the sum is zero */
tempb=tempb->link;
break;
case 1: lastc=attach(tempa->coef, tempa->expon, lastc);
tempa=tempa->link;
}
}
/* copy the remaining terms of a, then the remaining terms of b */
while (tempa!=NULL)
{
lastc=attach(tempa->coef, tempa->expon, lastc);
tempa=tempa->link;
}
while (tempb!=NULL)
{
lastc=attach(tempb->coef, tempb->expon, lastc);
tempb=tempb->link;
}
return(c);
}
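POLY *padd(POLY * a, POLY * b) /* return a polynomial which is the sum of a and
b; a, b and the result are circular lists with header nodes */
{
POLY *c,*tempa,*tempb,*lastc;
int sum;
c=(POLY*)malloc(sizeof(POLY));
c->link=c;
tempa=a->link;
tempb=b->link;
lastc=c;
while (tempa!=a && tempb!=b)
{
switch (COMPARE(tempa->expon, tempb->expon))
{
case -1: lastc=attach(tempb->coef, tempb->expon, lastc);
tempb=tempb->link;
break;
case 0: sum=tempa->coef + tempb->coef;
if (sum)
lastc=attach(sum, tempa->expon, lastc);
tempa=tempa->link;
tempb=tempb->link;
break;
case 1: lastc=attach(tempa->coef, tempa->expon, lastc);
tempa=tempa->link;
}
}
/* copy the remaining terms of a, then the remaining terms of b */
while (tempa!=a)
{
lastc=attach(tempa->coef, tempa->expon, lastc);
tempa=tempa->link;
}
while (tempb!=b)
{
lastc=attach(tempb->coef, tempb->expon, lastc);
tempb=tempb->link;
}
lastc->link=c;
return(c);
}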
Analysis: Let m and n be the number of terms in a and b respectively. If m > 0 and n > 0, the while
loop is entered. Each iteration of the loop requires O(1) time. At each iteration, either a or b moves
to the next term or both move to the next term. Since the iteration terminates when either a or b
reaches the end of the list, the number of iterations is bounded by m + n - 1.
The time for the remaining two loops is bounded by O(n + m). The first loop can iterate m times and
the second can iterate n times. So, the asymptotic computing time of this algorithm is O(n + m).
TREES
Definition of Tree:
A tree is a finite set of one or more nodes such that
• There is a special node called the root.
• The remaining nodes are partitioned into n ≥ 0 disjoint sets T1, ⋯, Tn, where each of these
sets is a tree. T1, ⋯, Tn are called the subtrees of the root.
The tree has 13 nodes and has one character as its information. A tree is always drawn with its root
at the top. Here the node A is the root.
Binary Trees
The Abstract Data type
Definition: A binary tree is a finite set of nodes that is either empty or consists of a root and two
disjoint binary trees called the left subtree and the right subtree.
Objects: a finite set of nodes either empty or consisting of a root node, left Binary_Tree, and right
Binary_Tree.
Lemma 1 [Maximum number of nodes on a level]: The maximum number of nodes on level i of a binary
tree is 2^(i-1), i ≥ 1.
Proof (by induction on i):
Induction Base: The root is the only node on level i = 1. Hence, the maximum number of
nodes on level i = 1 is 2^(i-1) = 2^0 = 1.
Induction Hypothesis: Let i be an arbitrary positive integer greater than 1. Assume that the
maximum number of nodes on level i - 1 is 2^(i-2).
Induction Step:
The maximum number of nodes on level i - 1 is 2^(i-2) by the induction hypothesis.
Since each node in a binary tree has a maximum degree of 2, the maximum number of nodes
on level i is two times the maximum number of nodes on level i - 1,
i.e. 2 * 2^(i-2) = 2^(i-1).
Hence proved.
Lemma 2: [Relation between number of leaf nodes and degree-2 nodes]: For any non empty binary
tree, T, if n0 is the number of leaf nodes and n2 the number of nodes of degree 2, then n0 = n2 + 1.
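Proof:
Let n1 be the number of nodes of degree one and n the total number of nodes. Since all
nodes in T are at most of degree two, we have
n = n0 + n1 + n2 --------- (1)
If we count the number of branches in a binary tree, we see that every node except the root has
a branch leading into it. If B is the number of branches, then n = B + 1. Every branch stems from
a node of degree one or two: a node of degree one contributes one branch and a node of degree
two contributes two branches, so B = n1 + 2n2. Hence
n = n1 + 2n2 + 1 --------- (2)
Subtracting (2) from (1) and rearranging gives n0 = n2 + 1.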
Definition
Full Binary Tree: A full binary tree of depth k is a binary tree of depth k having 2^k - 1 nodes, k ≥ 0.
Example
The nodes are numbered in a full binary tree starting with the root on level 1, continuing with the
nodes on level 2, and so on. Nodes on any level are numbered from left to right.
Complete binary tree: A binary tree is complete if the number of nodes in each level i, except
possibly the last level, is 2^(i-1), and the nodes in the last level appear as far left as possible.
Example: A complete tree T11 with 11 nodes is shown below. This is not a full binary tree.
Strictly Binary Tree: A binary tree in which every non-leaf node has non-empty left and
right subtrees. A strictly binary tree with n leaves always contains 2n - 1 nodes.
Example:
Almost Complete Binary Tree: A binary tree of depth d is an almost complete binary tree if
Example
This representation can be used for any binary tree. In most cases there will be a lot of unutilized
spaces. For complete binary tree such as
Linked Representation
Disadvantage of array representation
• The array representation is good for complete binary trees but, it wastes a lot of space for
many other binary trees.
• Insertion and deletion of nodes from the middle of a tree require the movement of potentially
many nodes to reflect the change in level number of these nodes.
• These problems can be overcome easily through the use of a linked representation.
Node Representation: each node has three fields: leftchild, data and rightchild.
With this node structure it is difficult to determine the parent of a node. If it is necessary to be able to
determine the parent of an arbitrary node, then a fourth field, parent, may be included in the class
TreeNode
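A minimal sketch of such a node in C, assuming integer data. The helper names newNode and setLeft are hypothetical and used only for illustration; the parent field is the optional fourth field described above.

```c
#include <stdlib.h>

/* Hypothetical node layout: leftchild, data, rightchild, plus the optional parent field. */
typedef struct TreeNode {
    struct TreeNode *leftchild;
    int data;
    struct TreeNode *rightchild;
    struct TreeNode *parent;   /* optional fourth field */
} TreeNode;

/* Allocate a node with no children and no parent; returns NULL if malloc fails. */
TreeNode *newNode(int data)
{
    TreeNode *n = malloc(sizeof *n);
    if (n != NULL) {
        n->data = data;
        n->leftchild = n->rightchild = n->parent = NULL;
    }
    return n;
}

/* Attach child as the left subtree of p and record p as its parent. */
void setLeft(TreeNode *p, TreeNode *child)
{
    p->leftchild = child;
    if (child != NULL)
        child->parent = p;
}
```

With the parent field maintained this way, the parent of any node is found in constant time, at the cost of one extra pointer per node.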
Inorder Traversal
• Inorder traversal moves down the tree toward the left until we can go no farther.
• Then we "visit" the node,
• move one node to the right and continue.
• If we cannot move to the right, we go back one more node and continue.
• A precise way of describing this traversal is by using recursion as follows
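A minimal recursive sketch of inorder traversal, assuming a node with integer data. Here "visiting" a node means appending its data to an output array; the Node type and the out/count parameters are assumptions made for illustration.

```c
#include <stddef.h>

typedef struct Node {
    struct Node *leftchild;
    int data;
    struct Node *rightchild;
} Node;

/* Inorder: traverse the left subtree, visit the node, traverse the right subtree.
   Visited values are appended to out[]; the updated count is returned. */
int inorder(const Node *t, int out[], int count)
{
    if (t != NULL) {
        count = inorder(t->leftchild, out, count);
        out[count++] = t->data;
        count = inorder(t->rightchild, out, count);
    }
    return count;
}
```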
Preorder Traversal
• Visit a node,
• traverse left, and continue.
• When you cannot continue, move right and begin again, or move back until you can move
right and resume.
Postorder Traversal
• traverse left, and continue.
• When you cannot continue, move right and traverse right as far as possible
• Visit the node
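The preorder and postorder descriptions above differ only in where the visit happens relative to the two recursive calls. A minimal sketch, assuming the same kind of node with integer data and recording visits in an output array (Node, out and count are illustrative names):

```c
#include <stddef.h>

typedef struct Node {
    struct Node *leftchild;
    int data;
    struct Node *rightchild;
} Node;

/* Preorder: visit the node, then the left subtree, then the right subtree. */
int preorder(const Node *t, int out[], int count)
{
    if (t != NULL) {
        out[count++] = t->data;
        count = preorder(t->leftchild, out, count);
        count = preorder(t->rightchild, out, count);
    }
    return count;
}

/* Postorder: left subtree, right subtree, then visit the node. */
int postorder(const Node *t, int out[], int count)
{
    if (t != NULL) {
        count = postorder(t->leftchild, out, count);
        count = postorder(t->rightchild, out, count);
        out[count++] = t->data;
    }
    return count;
}
```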
Example:
Expression Tree: An expression containing operands and binary operators can be represented by a
binary tree.
A node representing an operator is a non leaf. A node representing an operand is a leaf. The root of
the tree contains the operator that has to be applied to the results of evaluating the left subtree and the
right subtree.
When the binary expression trees are traversed preorder we get the preorder expression. When we
traverse the tree postorder we get the postorder expression . When we traverse it inorder we get the
inorder expression.
For Example : consider the traversals for the tree given above
Example:
Result = 3
Testing Equality
• Equivalent Binary trees have the same structure and the same information in the
corresponding nodes.
• Same structure means every branch in one tree corresponds to a branch in the second tree that
is the branching of the trees is identical.
• This function returns true if the two trees are equivalent and false otherwise.
return false;
}
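The equality test described above can be sketched recursively as follows, assuming nodes with integer data. The Node type and the function name equal are assumptions for illustration.

```c
#include <stddef.h>

typedef struct Node {
    struct Node *leftchild;
    int data;
    struct Node *rightchild;
} Node;

/* Returns 1 if the two trees have the same structure and the same data in
   corresponding nodes, and 0 otherwise. */
int equal(const Node *first, const Node *second)
{
    if (first == NULL && second == NULL)
        return 1;                       /* both empty */
    if (first == NULL || second == NULL)
        return 0;                       /* branching differs */
    return first->data == second->data
        && equal(first->leftchild, second->leftchild)
        && equal(first->rightchild, second->rightchild);
}
```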
Definition: The satisfiability problem for formulas of the propositional calculus asks if there is an
assignment of values to the variables that causes the value of the expression to be true.
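Example: Representation of the expression (x1 ∧ ¬x2) ∨ (¬x1 ∧ x3) ∨ ¬x3 as a binary tree
[Figure: binary tree for the expression; the operators ∧, ∨ and ¬ are the non-leaf nodes and x1, x2, x3 are the leaves]
Inorder Traversal of the tree is x1 ∧ ¬x2 ∨ ¬x1 ∧ x3 ∨ ¬x3; this is the infix form of the expression.
Note: The node containing ¬ has only a right branch since ¬ is a unary operator.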
To determine satisfiability, (x1,x2,x3) must take all possible combinations of true and false values, and
the formula is checked for each combination. For n variables there are 2^n possible combinations of true=t
and false=f.
Example: For n=3 the eight combinations are (t,t,t),(t,t,f),(t,f,t),(t,f,f),(f,t,t), (f,t,f), (f,f,t),(f,f,f).
Analysis: This algorithm takes O(g·2^n), i.e. exponential, time, where g is the time to substitute
values for x1, x2, ..., xn and evaluate the expression.
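The exhaustive check can be sketched as below. This is only an illustration: the formula is passed as a C function over an array of truth values, and the names Formula, countSatisfying, example and contradiction are hypothetical. The example function encodes (x1 AND NOT x2) OR (NOT x1 AND x3) OR NOT x3, which is assumed here to be the formula of the example above.

```c
/* Hypothetical formula type: takes truth values x[0..n-1] (0 = false, 1 = true). */
typedef int (*Formula)(const int x[]);

/* Try all 2^n assignments (n must be at most 31 in this sketch) and
   return how many of them make the formula true. */
int countSatisfying(int n, Formula f)
{
    int x[32];
    int count = 0;
    unsigned long combo;
    for (combo = 0; combo < (1UL << n); combo++) {
        int i;
        for (i = 0; i < n; i++)
            x[i] = (int)((combo >> i) & 1UL);   /* bit i of combo is the value of x[i] */
        if (f(x))
            count++;
    }
    return count;
}

/* (x1 AND NOT x2) OR (NOT x1 AND x3) OR NOT x3, with x[0] = x1, x[1] = x2, x[2] = x3. */
int example(const int x[])
{
    return (x[0] && !x[1]) || (!x[0] && x[2]) || !x[2];
}

/* x1 AND NOT x1 can never be true. */
int contradiction(const int x[])
{
    return x[0] && !x[0];
}
```

A formula is satisfiable exactly when countSatisfying returns a non-zero value. The loop body runs 2^n times, which is where the exponential factor in the analysis comes from.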
Postorder Evaluation function: To evaluate an expression, the tree is traversed in postorder. When
a node is visited the value of the expression represented by the left and right sub trees of a node are
computed first. So the recursive postorder traversal algorithm is modified to obtain the function that
evaluates the tree.
void postorderEval(TreeNode *node)
{
    /* evaluate both subtrees first, then the node itself */
    if (node != NULL)
    {
        postorderEval(node->leftchild);
        postorderEval(node->rightchild);
        switch (node->data)
        {
            case not: node->value = !node->rightchild->value;
                      break;
            case and: node->value = node->rightchild->value && node->leftchild->value;
                      break;
            case or:  node->value = node->rightchild->value || node->leftchild->value;
                      break;
            case true:  node->value = TRUE;  break;   /* TRUE and FALSE are the truth values 1 and 0 */
            case false: node->value = FALSE; break;
        }
    }
}
Threads
A binary tree has more NULL links than pointers. These null links can be replaced by special
pointers, called threads, to other nodes in the tree.
One way threading: If ptr → rightChild is null, replace ptr → rightChild with a pointer to the
inorder successor of ptr. ptr → leftChild remains unchanged.
Note: unless specified otherwise, threading corresponds to the inorder traversal. To distinguish
threads from ordinary pointers, threads are always drawn with broken links.
Example: consider the binary tree and the corresponding threaded tree given below
Binary tree
• In Figure two threads have been left dangling: one in the left child of H, the other in the right
child of G.
• To avoid loose threads, a header node is assumed for all threaded binary trees.
• The original tree is the left subtree of the header node.
• An empty binary tree is represented by its header node as in figure below
The variable root points to the header node of the tree, while root → leftChild points to the start of
the first node of the actual tree.
By using the threads, we can perform an inorder traversal without making use of a stack.
• For any node, ptr, in a threaded binary tree, if ptr → rightThread = TRUE, the
inorder successor of ptr is ptr → rightChild by definition of the threads.
• Otherwise we obtain the inorder successor of ptr by following a path of left-child links from
the right-child of ptr until we reach a node with leftThread = TRUE.
Finding the inorder successor of a node: The function insucc finds the inorder successor of any
node in a threaded tree without using a stack.
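ThreadNode * insucc(ThreadNode *tree)
{
    /* 'f' in a thread field means the link is an ordinary child pointer, not a thread */
    ThreadNode * temp;
    temp = tree->rightChild;
    if (tree->rightThread == 'f')
        while (temp->leftThread == 'f')
            temp = temp->leftChild;
    return temp;
}
Inorder traversal of a threaded binary tree: starting at the header node, insucc is called repeatedly
until the header node is reached again.
void tinorder(ThreadNode *tree)
{
    ThreadNode * temp = tree;
    for (;;)
    {
        temp = insucc(temp);
        if (temp == tree) break;
        printf("%c", temp->data);
    }
}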
ADT Dictionary
objects: a collection of n > 0 pairs, each pair has a key and an associated item
functions: for all d ∈ Dictionary, item ∈ Item, k ∈ Key, n ∈ integer
SUNIL G L, Dept. of CSE(DS), RNSIT, Bengaluru Page 25
Data Structures and Applications (BCS304) Module III
MODULE - 4
TREES(Cont..): Binary Search trees, Selection Trees, Forests, Representation of Disjoint
sets, Counting Binary Trees,
GRAPHS: The Graph Abstract Data Types, Elementary Graph Operations
To search for a node whose key is k, we begin at the root of the binary search tree.
• If the root is NULL, the search tree contains no nodes and the search is unsuccessful.
• Otherwise, we compare k with the key in the root. If k equals the root's key, then the search terminates
successfully.
• If k is less than the root's key, then we search the left subtree of the root.
• If k is larger than the root's key, we search the right subtree of the root.
struct node
{
    struct node *lchild;
    struct
    {
        int item; /* int stands in for the data type of the element */
        int key;
    } data;
    struct node *rchild;
};
typedef struct node TreeNode;
Searching a binary search tree: Return a pointer to the node whose key is k; if there
is no such node, return NULL. We assume that the data field of the node is of type element
and that it has two components, key and item. The search can be written recursively or
iteratively; the iterative form is:
TreeNode *itersearch(TreeNode *tree, int k)
{
    while (tree != NULL)
    {
        if (k == tree->data.key)
            return tree;
        if (k < tree->data.key)
            tree = tree->lchild;
        else
            tree = tree->rchild;
    }
    return NULL;
}
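The recursive form of the search follows the four rules listed above directly. A minimal sketch, assuming the node layout with lchild, data (item and key) and rchild; the function name search is an assumption:

```c
#include <stddef.h>

typedef struct node {
    struct node *lchild;
    struct {
        int item;
        int key;
    } data;
    struct node *rchild;
} TreeNode;

/* Recursive search: returns the node whose key is k, or NULL if there is none. */
TreeNode *search(TreeNode *tree, int k)
{
    if (tree == NULL)
        return NULL;                       /* empty subtree: unsuccessful */
    if (k == tree->data.key)
        return tree;                       /* found */
    if (k < tree->data.key)
        return search(tree->lchild, k);    /* smaller keys are on the left */
    return search(tree->rchild, k);        /* larger keys are on the right */
}
```

Each recursive call descends one level, so at most h + 1 calls are active at once for a tree of height h.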
Analysis of Search(Both iterative and recursive): If h is the height of the binary search tree, then
we can perform the search using either search in O(h). However, recursive search has an additional
stack space requirement which is O(h).
Inserting into a Binary Search Tree: A binary search tree has distinct key values. To insert a key, we first
search the tree for the key; if the search is unsuccessful, the key is inserted at the point where the search
terminated.
Example : consider the tree given below
If k is already in the tree pointed at by root, do nothing. Otherwise add a new node with data = (k, item).
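TreeNode *insert(TreeNode *root, int k, int item)
{
    TreeNode *ptr, *lastnode;
    lastnode = Modifiedsearch(root, k);
    /* k is already present: leave the tree unchanged */
    if (root != NULL && lastnode == NULL)
        return root;
    ptr = (TreeNode*)malloc(sizeof(TreeNode));
    if (ptr == NULL)
        return root;   /* allocation failed */
    ptr->data.key = k;
    ptr->data.item = item;
    ptr->lchild = ptr->rchild = NULL;
    if (root == NULL)
        return ptr;    /* the new node becomes the root */
    if (k < lastnode->data.key)
        lastnode->lchild = ptr;
    else
        lastnode->rchild = ptr;
    return root;
}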
If the element is present or if the tree is empty, the function Modifiedsearch returns NULL. If the
element is not present, it returns a pointer to the last node examined during the search.
TreeNode *Modifiedsearch(TreeNode *root, int k)
{
    TreeNode *temp, *prev;
    temp = root;
    prev = NULL;
    while (temp != NULL)
    {
        if (temp->data.key == k)
        {
            printf("element already found");
            return NULL;
        }
        prev = temp;
        if (k < temp->data.key)
            temp = temp->lchild;   /* smaller keys are in the left subtree */
        else
            temp = temp->rchild;
    }
    return prev;
}
Deletion from a binary search tree: Suppose T is a binary search tree. The function to
delete an item from T first searches the tree to find the location of the node N that holds the item
and the location of the parent of N. The deletion of N then depends on three cases:
Case 1: N has no children. Then N is deleted from T by replacing the location of N in
Parent(N) by the NULL pointer.
Deleting node
Case 2: N has exactly one child. Then N is deleted from T by replacing the location of N in
Parent(N) by the location of the only child of N.
Deleting node 75
Case 3: N has two children. Let S(N) denote the inorder successor of N (S(N) does not have a left
child). Then N is deleted from T by first deleting S(N) from T (by using case 1 or case 2) and then
replacing node N in T by the node S(N).
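TreeNode *delete_element(TreeNode *node, int key)
{
    TreeNode *temp;
    if (node == NULL)
        return node;
    if (key < node->data.key)          /* key lies in the left subtree */
        node->lchild = delete_element(node->lchild, key);
    else if (key > node->data.key)     /* key lies in the right subtree */
        node->rchild = delete_element(node->rchild, key);
    else
    {
        // node with no child or only one child
        if (node->lchild == NULL)
        {
            temp = node->rchild;
            free(node);
            return temp;
        }
        else if (node->rchild == NULL)
        {
            temp = node->lchild;
            free(node);
            return temp;
        }
        // node with two children
        else
        {
            temp = node->rchild;
            while (temp->lchild != NULL) // get the inorder successor
                temp = temp->lchild;
            node->data.item = temp->data.item;
            node->data.key = temp->data.key;
            node->rchild = delete_element(node->rchild, temp->data.key);
        }
    }
    return node;
}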
GRAPHS
Introduction
The first recorded evidence of the use of graphs dates back to 1736, when Leonhard Euler used them
to solve the classical Konigsberg bridge problem.
Definitions
Example:
Undirected Graph: In an undirected graph the pair of vertices representing an edge is unordered.
Thus the pairs (u,v) and (v,u) represent the same edge.
Example:
V(G)={a,b,c,d}
E(G)={(a,b),(a,d),(b,d),(b,c)}
Directed Graph (digraph): In a directed graph each edge is represented by a directed pair <u,v>, where v is
the head and u is the tail of the edge. Therefore <u,v> and <v,u> represent two different edges.
Example:
V(G)={a,b,d}
Self Edges/Self Loops: Edges of the form(v,v) are called self edges or self loops . It is an edge
which starts and ends at the same vertex.
Example:
Multigraph: A graph with multiple occurrences of the same edge is called a multigraph.
Example:
Complete Graph: An undirected graph with n vertices and exactly n(n-1)/2 edges is said to be a
complete graph. In a complete graph all pairs of vertices are connected by an edge.
Example : A complete graph with n=3 vertices
Adjacent Vertex
If (u,v) is an edge in E(G), then we say that the vertices u and v are adjacent and the edge (u,v) is
incident on vertices u and v.
Path: A path from vertex u to vertex v in graph G is a sequence of vertices u, i1, i2, ..., ik, v such that
(u,i1), (i1,i2), ..., (ik,v) are edges in E(G). If G' is directed, then the path consists of the edges
<u,i1>, <i1,i2>, ..., <ik,v> in E(G').
Example:
Cycle: A cycle is a simple path in which the first and the last vertices are the same; all the other
vertices are distinct.
Example :
(B,C), (C,D), (D,E), (E,A), (A,B) is a cycle
Degree of a vertex: In an undirected graph the degree of a vertex is the number of edges incident on that
vertex.
In a directed graph the in-degree of a vertex v is the number of edges for which v is the head, i.e. the
number of edges that are coming into the vertex. The out-degree is defined as the number of edges for
which v is the tail, i.e. the number of edges that are going out of the vertex.
Subgraph: A subgraph of G is a graph G' such that V(G') ⊆ V(G) and E(G') ⊆ E(G)
Example :
Graph(G) Subgraph(G’)
Connected Graph: An undirected graph G is said to be connected if for every pair of distinct
vertices u and v in V(G) there is a path from u to v in G.
Strongly connected graph: A directed graph G is said to be strongly connected if for every pair of
distinct vertices u and v in V(G), there is a directed path from u to v and from v to u.
ADT Graph
Objects: a nonempty set of vertices and a set of undirected edges, where each edge is a pair of
vertices.
Graph Create() := return an empty graph
Graph InsertVertex(graph,v) := return a graph with v inserted. v has no incident edges
Graph InsertEdge(graph,v1,v2) := return a graph with a new edge between v1 and v2
Graph DeleteVertex(graph,v) := return a graph in which v and all edges incident to it are
removed
Graph DeleteEdge(graph,v1,v2) := return a graph in which the edge (v1,v2) is removed,
leaving the incident vertices in the graph
Boolean IsEmpty(graph) := if (graph == empty graph) return TRUE else return
FALSE
List Adjacent(graph,v) := return a list of all vertices that are adjacent to v
Graph Representation
• Adjacency Matrix
• Adjacency List
• Adjacency Multilist
Adjacency Matrix: Let G=(V,E) be a graph with n vertices, n>=1. The adjacency matrix of G is a
two dimensional n*n array, say a, with the property that a[i][j]=1 if the edge (i,j)
(<i,j> for a directed graph) is in E(G), and a[i][j]=0 if there is no such edge in G.
Example: a graph on the four vertices 0, 1, 2, 3 in which every pair of vertices is joined by an edge
Adjacency Matrix
     0  1  2  3
0    0  1  1  1
1    1  0  1  1
2    1  1  0  1
3    1  1  1  0
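A minimal sketch of the adjacency matrix for an undirected graph in C. MAX_VERTICES, addEdge and degree are names assumed for illustration. Because the graph is undirected the matrix is symmetric, and the degree of vertex v is the sum of row v.

```c
#define MAX_VERTICES 10   /* assumed upper bound on the number of vertices */

/* Record the undirected edge (u,v): the matrix stays symmetric. */
void addEdge(int a[][MAX_VERTICES], int u, int v)
{
    a[u][v] = 1;
    a[v][u] = 1;
}

/* Degree of vertex v in a graph with n vertices: the sum of row v, an O(n) scan. */
int degree(int a[][MAX_VERTICES], int n, int v)
{
    int j, d = 0;
    for (j = 0; j < n; j++)
        d += a[v][j];
    return d;
}
```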
Adjacency list: In the adjacency list representation the n rows of the adjacency matrix are represented as n chains.
There is one chain for each vertex in G. The nodes in chain i represent the vertices that are adjacent
from vertex i. The data field of a chain node stores the index of an adjacent vertex.
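Example: AdjLists (each chain node has a data field and a link field; the final 0 is the null link that ends a chain)
[0] -> 1 -> 2 -> 3 -> 0
[1] -> 0 -> 2 -> 3 -> 0
[2] -> 0 -> 1 -> 3 -> 0
[3] -> 0 -> 1 -> 2 -> 0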
• For an undirected graph with n vertices and e edges. The linked adjacency lists representation
requires an array of size n and 2e chain nodes.
• The degree of any vertex in an undirected graph may be determined by counting the number
of nodes in the adjacency list.
• For a digraph the number of list nodes is only e.
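A minimal sketch of linked adjacency lists for an undirected graph. ListNode, addArc, addEdge and degree are illustrative names, and new nodes are pushed on the front of a chain for simplicity, so the order within a chain is the reverse of insertion order.

```c
#include <stdlib.h>

typedef struct ListNode {
    int vertex;              /* data field: index of an adjacent vertex */
    struct ListNode *link;   /* next node in the chain */
} ListNode;

/* Push v onto the front of the chain for u; returns 1 on success, 0 if malloc fails. */
static int addArc(ListNode *adjLists[], int u, int v)
{
    ListNode *p = malloc(sizeof *p);
    if (p == NULL)
        return 0;
    p->vertex = v;
    p->link = adjLists[u];
    adjLists[u] = p;
    return 1;
}

/* An undirected edge (u,v) needs two chain nodes: v in chain u and u in chain v. */
int addEdge(ListNode *adjLists[], int u, int v)
{
    return addArc(adjLists, u, v) && addArc(adjLists, v, u);
}

/* Degree of v: the number of nodes in its chain. */
int degree(ListNode *adjLists[], int v)
{
    int d = 0;
    const ListNode *p;
    for (p = adjLists[v]; p != NULL; p = p->link)
        d++;
    return d;
}
```

For a digraph only the first addArc call would be made for each edge <u,v>, which is why e chain nodes suffice there.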
Adjacency Multilists: For each edge there will be exactly one node, but this node will be in two
lists (i.e., the adjacency lists for each of the two vertices on which the edge is incident). A new field is necessary
to determine whether the edge has already been examined and to mark it as examined.
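adjLists: head pointers [0], [1], [2], [3], one per vertex
Node  vertex1  vertex2  link1  link2  edge
N0    0        1        N1     N3     edge(0,1)
N1    0        2        N2     N3     edge(0,2)
N2    0        3        0      N4     edge(0,3)
N3    1        2        N4     N5     edge(1,2)
N4    1        3        0      N5     edge(1,3)
N5    2        3        0      0      edge(2,3)
(link1 points to the next edge node for vertex1, link2 to the next edge node for vertex2; 0 is the null link)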
Weighted Edges: In many applications the edges of a graph have weights assigned to them. These
weights may represent the distance from one vertex to another or the cost of going from one vertex
to an adjacent vertex. The adjacency matrix and adjacency list then store the weight information as well. A graph
with weighted edges is also called a network.
Example:
Given an undirected graph G=(V,E) and a vertex v in V(G), there are two ways to find all the
vertices that are reachable from v, i.e. connected to v: depth first search and breadth first search.
A global array visited is maintained. It is initialized to FALSE; when we visit a vertex i we change
visited[i] to TRUE.
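Global Declarations
#define FALSE 0
#define TRUE 1
short int visited[MAX_VERTICES];
void dfs(int v)
{
    /* depth first search beginning at vertex v; graph[v] heads the adjacency list of v */
    struct node *w;
    visited[v] = TRUE;
    printf("%d", v);
    w = graph[v];
    while (w != NULL)
    {
        if (visited[w->vertex] == FALSE)
            dfs(w->vertex);
        w = w->link;
    }
}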
Example:
Analysis
• If we represent G by its adjacency lists, then we can determine the vertices adjacent to v by
following a chain of links. Since dfs examines each node in the adjacency lists at most once,
the time to complete the search is O(e).
• If we represent G by its adjacency matrix, then determining all vertices adjacent to v requires
O(n) time. Since we visit at most n vertices, the total time is O(n^2).
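The adjacency matrix case of the analysis can be seen in a small self-contained sketch. Instead of printing, this version records the visiting order in an array; dfsMatrix and its parameters are assumptions made for illustration and are not the dfs function given earlier.

```c
#define MAX_VERTICES 10   /* assumed upper bound on the number of vertices */

/* Depth first search from v on an adjacency matrix with n vertices.
   Visited vertices are appended to order[]; the updated count is returned. */
int dfsMatrix(int a[][MAX_VERTICES], int n, int v,
              int visited[], int order[], int count)
{
    int w;
    visited[v] = 1;
    order[count++] = v;
    for (w = 0; w < n; w++)               /* O(n) scan of row v */
        if (a[v][w] && !visited[w])
            count = dfsMatrix(a, n, w, visited, order, count);
    return count;
}
```

Each vertex is visited once and each visit scans one full row, which gives the O(n^2) bound.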
Example: For the graph given below if the search is initiated from vertex 0 then the vertices are
visited in the order vertex 3, 1, 2
struct node
{
    int vertex;
    struct node *link;
};
typedef struct node queue;
queue *front, *rear;
int visited[MAX_VERTICES];
void addq(int);
int deleteq();
void bfs(int v)
{
    /* breadth first search beginning at vertex v; graph[v] heads the adjacency list of v */
    struct node *w;
    front = rear = NULL;
    printf("%d", v);
    visited[v] = TRUE;
    addq(v);
    while (front)
    {
        v = deleteq();
        w = graph[v];
        while (w != NULL)
        {
            if (visited[w->vertex] == FALSE)
            {
                printf("%d", w->vertex);
                addq(w->vertex);
                visited[w->vertex] = TRUE;
            }
            w = w->link;
        }
    }
}
Analysis of BFS:
• Since each vertex is placed on the queue exactly once, the outer while loop is iterated at most n
times.
• For the adjacency list representation the inner loop has a total cost of O(e) over the whole search. For the adjacency
matrix representation the inner loop takes O(n) time for each vertex.
• Therefore the total time for the adjacency matrix representation is O(n^2).
MODULE - 5
Hashing
Hashing enables us to perform dictionary operations such as search, insert and delete in O(1) expected time. There
are two types of hashing
◾ Static and
◾ Dynamic
Static Hashing
◾ In static Hashing the dictionary pairs are stored in a table, ht called the hash table.
◾ The hash table is partitioned into b buckets, ht[0],…….ht[b-1]
◾ Each bucket is capable of holding s dictionary pairs.
◾ Thus a bucket is said to consist of s slots. usually s=1
◾ The address or location of a pair whose key is k is determined by hash function h which
maps keys into buckets.
◾ Thus for any key k, h(k) is an integer in range 0 through b-1
[Figure: hash table ht with b buckets, numbered 0 to b-1, each bucket having s slots, numbered 1 to s]
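Example: b = 26 buckets, s = 2 slots per bucket
n = 10 distinct identifiers, each representing a C library function
Loading factor α = n/(sb) = 10/52 = 0.19
f(x) = first character of x (a = 0, b = 1, ..., z = 25)
x:     acos, define, float, exp, char, atan, ceil, floor, clock, ctime
f(x):  0,    3,      5,     4,   2,    0,    2,    5,     2,     2
Bucket  Slot0   Slot1
0       acos    atan
1
2       char    ceil
3       define
4       exp
5       float   floor
.
.
24
25
clock and ctime also hash to bucket 2, which is already full, so inserting them causes an overflow.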
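The example can be reproduced with a short sketch. The names B, S, f and insert are chosen for illustration, and the table stores pointers to the identifier strings. insert returns the slot used, or -1 when the home bucket is full (an overflow); it makes no attempt to resolve the overflow.

```c
#include <ctype.h>
#include <stddef.h>

#define B 26   /* number of buckets */
#define S 2    /* slots per bucket */

/* f(x) = position of the first character of x in the alphabet (a = 0, ..., z = 25). */
int f(const char *x)
{
    return tolower((unsigned char)x[0]) - 'a';
}

/* Place x in the first free slot of its home bucket.
   Returns the slot index, or -1 if the bucket is full (overflow). */
int insert(const char *ht[][S], const char *x)
{
    int b = f(x), s;
    for (s = 0; s < S; s++) {
        if (ht[b][s] == NULL) {
            ht[b][s] = x;
            return s;
        }
    }
    return -1;
}
```

Inserting the ten identifiers in the order listed fills buckets 0, 2 and 5 completely, and the last two, clock and ctime, find bucket 2 full.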
Hash Functions: A hash function maps a key into a bucket in the hash table. A function H from the
set K of keys into the set L of memory addresses is called the hash function
H:K->L
Division: Choose a number m larger than the number n of keys in K. The number m is chosen to be a
prime number or a number without small divisors to reduce collisions. The function is defined as
H(k) = k mod m
Bucket addresses range from 0 to m-1 and the hash table must have m buckets.
Mid Square: In this method the square of the key is found and appropriate number of bits are used
from the middle of the square to obtain the bucket address
• F(K) = middle(K^2)
• The number of bits used to obtain the bucket address depends on the table size.
• If r bits are used, the range of values is 0 through 2^r - 1
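Folding: the key is partitioned into several parts.
▪ All parts except for the last one have the same length
▪ The parts are added together to obtain the hash address
▪ Two possibilities: shift folding and folding at the boundaries
Example (shift folding): k = 12320324111220
x1=123, x2=203, x3=241, x4=112, x5=20, address = 123+203+241+112+20 = 699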
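A minimal sketch of shift folding, assuming the key is given as a string of decimal digits and the part length is a parameter. The name shiftFold is hypothetical, and the sum would still have to be reduced to the table range in a real hash function.

```c
/* Shift folding: cut the digit string into parts of partLen digits
   (the last part may be shorter) and add the parts together. */
long shiftFold(const char *digits, int partLen)
{
    long sum = 0, part = 0;
    int inPart = 0, i;
    for (i = 0; digits[i] != '\0'; i++) {
        part = part * 10 + (digits[i] - '0');
        if (++inPart == partLen) {      /* a full part has been read */
            sum += part;
            part = 0;
            inPart = 0;
        }
    }
    return sum + part;                  /* add the shorter last part, if any */
}
```

For the key above with parts of three digits, the parts are 123, 203, 241, 112 and 20.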
Digit Analysis
▪ Useful in the case of a static file where all the keys in the table are known in advance
▪ Each key is interpreted using some radix r.
▪ The same radix is used for all the keys in the table
▪ Digits are examined with this radix
▪ Digits having the most skewed distributions are deleted.
▪ Enough digits are deleted so that the remaining digits are few enough to give an address in
the range of the hash table
◾ Converting each character to a unique integer and summing these unique integers.
◾ Shifting the integer corresponding to every other character by 8 bits and then summing it up
Synonyms: A hash function h may map several different keys into the same bucket.
Two keys k1 and k2 are synonyms with respect to h
if h(k1) = h(k2)
An overflow occurs when the home bucket for a new dictionary pair is full at the time we wish to insert
this pair.
A collision occurs when the home bucket for the new pair is not empty at the time of insertion.
◾ For good performance the table size is increased when the loading density exceeds a prescribed
threshold such as 0.75, rather than when the table is full.
◾ When the hash table is resized
▪ Hash function changes
▪ Home bucket of each key may change
Example:
Suppose the table T has 11 memory locations T[1]……T[11] and suppose the file F contains 8
records with the following hash addresses
Records  A  B  C  D   E  X   Y  Z
H(K)     4  8  2  11  4  11  5  1
Suppose these 8 records are entered into the hash table in the above order using linear probing. The hash table
will then look as shown below.
Address  1  2  3  4  5  6  7  8  9  10  11
Table T  X  C  Z  A  E  Y  -  B  -  -   D
The average number of probes for an unsuccessful search is
U = (7+6+5+4+3+2+1+2+1+1+8)/11 = 40/11 = 3.6
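The table and the probe counts in this example can be checked with a short simulation. This is a sketch under stated assumptions: records are single characters, addresses run from 1 to M as in the example, an empty location holds '\0', and the names M, insert and unsuccessfulProbes are illustrative.

```c
#define M 11   /* table locations T[1..M]; T[0] is unused */

/* Insert key with home address h (1..M) using linear probing with wrap-around.
   Returns the number of probes used, or 0 if the table is full. */
int insert(char T[], char key, int h)
{
    int i;
    for (i = 0; i < M; i++) {
        int pos = (h - 1 + i) % M + 1;
        if (T[pos] == '\0') {
            T[pos] = key;
            return i + 1;
        }
    }
    return 0;
}

/* Probes made by an unsuccessful search that starts at address h:
   locations are examined until the first empty one is found. */
int unsuccessfulProbes(const char T[], int h)
{
    int i;
    for (i = 0; i < M; i++)
        if (T[(h - 1 + i) % M + 1] == '\0')
            return i + 1;
    return M;
}
```

Without deletions, the probes needed to find a record equal the probes used when it was inserted. For the eight records these are 1, 1, 1, 1, 2, 2, 2 and 3, so the average for a successful search is S = 13/8, about 1.6, alongside U = 40/11 for an unsuccessful one.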
Example-2
0 1 2 3 4 5 6 7 8 9 10 11 12
function for do while else if
◾ Compute h(k)
◾ Examine the hash table buckets in the order ht[(h(k)+i) % b], 0 <= i <= b-1, until one of the following
happens
▪ The bucket ht[(h(k)+i) % b] contains the key k and the desired pair is found
▪ ht[(h(k)+i) % b] is empty; k is not in the table.
▪ We return to the starting bucket ht[h(k)]; the table is full and k is not in the table
Insert acos, atoi, char, define, exp, ceil, cos, float, atol, floor, ctime into a 26-bucket hash table.
We see the number of searches increasing and the keys clustering together.
Quadratic Probing
▪ Quadratic probing uses a quadratic function of i as the increment
▪ Suppose a record R with key k has the hash address H(k)=h. Then instead of searching the
locations with addresses h, h+1, h+2, ..., we search the locations with addresses h, h+1, h+4, h+9, ..., h+i^2, ...
▪ If the number m of locations in the table T is a prime number, then the above sequence will
access half of the locations in T
Double hashing
Here a second hash function H' is used for resolving a collision, as follows.
Suppose a record R with key k has the hash addresses H(k)=h and H'(k)=h' ≠ m. Then we linearly search
the locations with addresses h, h+h', h+2h', h+3h', ……… (taken modulo m).
If m is a prime number then the above sequence will access all the locations in the table T.
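The coverage claims for the quadratic and double hashing probe sequences can be checked numerically. The sketch below counts how many distinct addresses each sequence reaches, using addresses 0 to m-1 and reducing every address modulo m; MAX_M and both function names are assumptions made for illustration.

```c
#define MAX_M 64   /* assumed upper bound on the table size m */

/* Distinct addresses among (h + i*i) % m for i = 0 .. m-1 (quadratic probing). */
int quadraticCoverage(int m, int h)
{
    int seen[MAX_M] = {0};
    int i, count = 0;
    for (i = 0; i < m; i++) {
        int pos = (h + i * i) % m;
        if (!seen[pos]) {
            seen[pos] = 1;
            count++;
        }
    }
    return count;
}

/* Distinct addresses among (h + i*h2) % m for i = 0 .. m-1 (double hashing, step h2). */
int doubleCoverage(int m, int h, int h2)
{
    int seen[MAX_M] = {0};
    int i, count = 0;
    for (i = 0; i < m; i++) {
        int pos = (h + i * h2) % m;
        if (!seen[pos]) {
            seen[pos] = 1;
            count++;
        }
    }
    return count;
}
```

For the prime m = 11, quadratic probing reaches 6 of the 11 locations (about half), while double hashing with any step between 1 and 10 reaches all 11. With a non-prime size such as m = 12 and step 4, double hashing reaches only 3 locations.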
Note: One major disadvantage of any type of open addressing procedure is in the
implementation of deletion.
Suppose a record r is deleted from location T[r]. If we later reach this empty location during a search, it
does not necessarily mean that the search is unsuccessful.
Thus when deleting a record, the location should be labeled to indicate that it previously contained a
record.
Chaining
• Maintain one list per bucket
5.7.4 Rehashing: When the hash table becomes nearly full, the number of collisions increases, thereby
degrading the performance of insertion and search operations. In such cases, a better option is to create
a new hash table with size double of the original hash table.
All the entries in the original hash table will then have to be moved to the new hash table. This is done
by taking each entry, computing its new hash value, and then inserting it in the new hash table. Though
rehashing seems to be a simple process, it is quite expensive and must therefore not be done frequently.
Example:
Consider the hash table of size 5 given below. The hash function used is h(x)= x % 5.
Rehash the entries into to a new hash table using hash function—h(x)= x % 10.
Dynamic hashing
Limitation of static hashing: when the table tends to be full, overflows increase and performance
drops.
To ensure good performance, it is necessary to increase the size of a hash table whenever the
loading density exceeds a prescribed threshold.
When the loading density increases, array doubling is used to increase the size of the array to
2b+1. The change in divisor means the hash table must be rebuilt by reinserting every key from the smaller
table into the larger one. Dynamic hashing, also called extendible hashing, reduces the rebuild time.
There are two forms of dynamic hashing
▪ Dynamic hashing using directories
▪ Directory less dynamic hashing
Example: The hash function h transforms keys into 6-bit non-negative integers. h(k,t) denotes the
integer formed by the t least significant bits of h(k).
The example keys are two characters long. h transforms the letters A, B, C into the bit sequences 100, 101 and
110 respectively. Digits 0 through 7 are transformed into their 3-bit representations.
k     h(k)
A0    100 000
A1    100 001
B0    101 000
B1    101 001
C1    110 001
C2    110 010
C3    110 011
C5    110 101
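A sketch of h and h(k,t) for the two-character keys in the table. The function names h and hkt are illustrative; mapping a letter to 4 plus its offset from 'A' reproduces 100, 101, 110 for A, B, C, and applying it to other letters would be an assumption beyond the example.

```c
/* h(k): 3 bits for the letter (A, B, C -> 100, 101, 110) followed by
   3 bits for the digit 0..7. */
unsigned h(const char *key)
{
    unsigned letter = (unsigned)(key[0] - 'A') + 4u;
    unsigned digit = (unsigned)(key[1] - '0');
    return (letter << 3) | digit;
}

/* h(k,t): the integer formed by the t least significant bits of h(k). */
unsigned hkt(unsigned hk, unsigned t)
{
    return hk & ((1u << t) - 1u);
}
```

For example, h("C5") is 110 101 in binary, so hkt(h("C5"), 2) is 01 and hkt(h("C5"), 3) is 101.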
Example: The figure below shows a dynamic hash table that contains the keys A0, B0, A1, B1, C2 and C3.
Here the directory depth is 2 and the buckets have 2 slots each. For each key k,
we examine the bucket pointed to by d[h(k,t)], where t is the directory depth. Suppose we insert C5
into the hash table. Since h(C5,2)=01, we follow the pointer d[01], and this bucket is full. To resolve
the overflow, we determine the least u such that h(k,u) is not the same for all keys in the overflowed bucket. In case u is
greater than the directory depth, we increase the directory depth to this least value u. Figure below
Advantages
The figure below shows a directoryless hash table ht with r=2 and q=0. The number of active buckets is
4. The index of an active bucket identifies its chain. Each active bucket has 2 slots.
r=2, q=0
When we insert C5 into the table, chain 01 is examined and we verify that C5 is not present. Since the
active bucket for the searched chain is full, we get an overflow. An overflow is handled by activating
bucket 2^r+q and reallocating the entries in chain q; then the value of q is incremented by 1. In case q
becomes 2^r, we increment r by 1 and reset q to 0. The reallocation is done using h(k,r+1). Finally the
new pair is inserted into the chain.
r=2,q=1
Inserting C1 will again result in an overflow at 001, so bucket 5 = 101 is activated. Rehashing is done
and the table is as shown below.
r=2,q=2