Programming in C 1e
Reema Thareja, Assistant Professor,
Institute of Information Technology and Management
CHAPTER - 1
INTRODUCTION TO
PROGRAMMING
COMPUTER SOFTWARE
SYSTEM SOFTWARE
APPLICATION SOFTWARE
Computer hardware cannot think or make decisions on its own, so it cannot
analyze a given set of data and find a solution by itself. The
hardware needs software (a set of programs) to instruct it on what has to be done. A
program is a set of instructions arranged in a sequence to guide a computer
to find a solution for the given problem. The process of writing a program is called
programming.
INTRODUCTION contd.
Computer software is written by computer programmers using a
programming language.
The programmer writes a set of instructions (program) using a specific
programming language. Such instructions are known as the source code.
Another computer program called a compiler is then used on the source
code, to transform the instructions into a language that the computer can
understand. The result is an executable computer program, which is
another name for software.
Examples of computer software include:
Computer Games
Driver Software
Educational Software
Media Players and Media Development Software
Productivity Software
Operating Systems software
[Figure: layered view of a computer system, from top to bottom]
Users (User 1, User 2, ..., User N)
Application programs: for example, games, spreadsheets, word processors,
databases, web browsers
System software: for example, the operating system
Computer hardware: for example, printer, mouse, scanner, keyboard,
CPU, disk
COMPUTER BIOS
The BIOS is built into the computer and is the first code run by the computer
when it is switched on. The key role of the BIOS is to load and start the
operating system.
When the computer starts, the first function that BIOS performs is to initialize
and identify system devices such as the video display card, keyboard and
mouse, hard disk, CD/DVD drive and other hardware. In other words, the code
in the BIOS chip runs a series of tests called POST (Power On Self Test) to
ensure that the system devices are working correctly.
Tests RAM
OPERATING SYSTEM
From the user's point of view, the primary consideration is always convenience.
Users should find it easy to launch an application and work on it. For example, we
use icons, which give us a clue about which application each one launches.
An operating system ensures that the system resources (such as the CPU, memory, and I/O
devices) are utilized efficiently. For example, there may be many service
requests on a web server, and each user request needs to be serviced. Similarly,
there may be many programs residing in the main memory. The system therefore
needs to determine which programs are active and which need to wait for some
I/O operation, since the programs that need to wait can be temporarily suspended
from engaging the processor. Hence, it is important for an operating
system to have a control policy and algorithm to allocate the system resources.
UTILITY SOFTWARE
Utility software is used to analyze, configure, optimize and maintain the computer
system. Utility programs may be requested by application programs during their
execution for multiple purposes. Some of them are listed below.
Disk defragmenters
Disk checkers
Disk cleaners
Disk partitions
Backup utilities
Disk compression
File managers
System profilers
Anti-virus utilities
Cryptographic utilities
Launcher applications
Registry cleaners
Network utilities
A compiler is a special type of program that transforms source code written in a programming
language (the source language) into machine language, which consists of just two digits, 1s and 0s
(the target language). The resultant code in 1s and 0s is known as the object code. The object
code is used to create an executable program.
If the source code contains errors, the compiler will not be able to perform its intended task. Errors
that prevent the compiler from understanding a program are called syntax errors. Syntax errors are
like spelling mistakes, typing mistakes, etc. Another type of error is the logic error, which occurs
when the program compiles but does not produce the intended result. Logic errors are much harder to locate and
correct.
Interpreter: Like the compiler, the interpreter also executes instructions written in a high-level
language.
While the compiler translates instructions written in a high-level programming language directly
into machine language, the interpreter translates the instructions into an
intermediate form, which it then executes.
Usually, a compiled program executes faster than an interpreted program. However, the big
advantage of an interpreter is that it does not need to go through the compilation stage during
which machine instructions are generated. This process can be time-consuming if the program
is long. Moreover, the interpreter can immediately execute high-level programs.
Linker: Also called link editor and binder, a linker is a program that combines
object modules to form an executable program.
APPLICATION SOFTWARE
PROGRAMMING LANGUAGES
While high-level programming languages are easy for the humans to read and understand, the
computer actually understands the machine language that consists of numbers only.
In between the machine languages and high-level languages, there is another type of language
known as assembly language. Assembly languages are similar to machine languages, but they
are much easier to program in because they allow a programmer to substitute names for
numbers.
However, irrespective of the language the programmer uses, a program written in any
programming language has to be converted into machine language so that the computer can
understand it. There are two ways to do this: compile the program or interpret the program.
For example, FORTRAN is a good language for processing numerical data, but it does not lend itself
very well to organizing large programs. Pascal can be used for writing well-structured and
readable programs, but it is not as flexible as the C programming language. C++ goes one step
ahead of C by incorporating powerful object-oriented features, but it is complex and difficult to
learn.
Oxford University Press 2011. All rights reserved.
Machine language is the lowest level of programming language. It is the only language that the
computer understands. All the commands and data values are expressed using 1s and 0s.
In the 1950s, each computer had its own native language. Although there were similarities
between the machine languages, a computer could not understand programs written
in another machine language.
The main advantage of machine language is that the code can run very fast and efficiently,
since it is directly executed by the CPU.
However, on the down side, the machine language is difficult to learn and is far more difficult to
edit if errors occur. Moreover, if you want to add some instructions into memory at some
location, then all the instructions after the insertion point would have to be moved down to make
room in memory to accommodate the new instructions.
Last but not the least, code written in machine language is not portable and to transfer code to
a different computer it needs to be completely rewritten. Architectural considerations make
portability a tough issue to resolve.
Assembly languages are symbolic programming languages that use mnemonics (symbols) to
represent machine-language instructions. Since assembly language is close to the machine, it
is also called low-level language.
Basically, an assembly language statement consists of a label, an operation code, and one or
more operands.
Labels are used to identify and reference instructions in the program. The operation code
(opcode) is a mnemonic that specifies the operation that has to be performed, such as move,
add, subtract, or compare. The operand specifies the register or the location in main memory
where the data to be processed is located.
Assembly language is machine dependent. This makes code written in assembly language
less portable, as code written to be executed on one machine will not run on machines from
a different, or sometimes even the same, manufacturer.
Code written in assembly language can be very efficient in terms of execution time
and main memory usage because the language is close to the machine.
Programs written in assembly language need a translator, known as the assembler, to
convert them into machine language. This is because the computer understands only the
language of 1s and 0s. It will not understand mnemonics like ADD and SUB.
The following instructions are part of an assembly language program that illustrates the addition of two
numbers:
MOV AX,4
MOV BX,6
ADD AX,BX
The third generation was introduced to make the languages more programmer-friendly.
3GLs spurred the great increase in data processing that occurred in the 1960s and 1970s. In
these languages, the program statements are not closely related to the internal characteristics
of the computer, and they are therefore often referred to as high-level languages.
Programs were written in an English-like manner, making them more convenient to use and
giving the programmer more time to address a client's problems.
Most of the programmers preferred to use general purpose high level languages like BASIC
(Beginners' All-purpose Symbolic Instruction Code), FORTRAN, PASCAL, COBOL, C++ or
Java to write the code for their applications.
Again, a translator is needed to translate the instructions written in high level language into
computer-executable machine language. Such translators are commonly known as interpreters
and compilers.
3GLs make it easier to write and debug a program and give the programmer more time to
think about its overall logic. Programs written in such languages are portable between
machines.
4GLs are a little different from the prior generation because they are basically nonprocedural, so
the programmers define only what they want the computer to do, without supplying all the
details of how it has to be done.
Characteristics of such languages include:
4GL code enhances the productivity of programmers, as they have to type fewer lines of
code to get something done. It is said that a programmer becomes 10 times more productive
when writing code in a 4GL than in a 3GL.
A typical example of a 4GL is the query language that allows a user to request information from
a database with precisely worded English-like sentences.
Let us take an example in which a report has to be generated that displays the total number of
students enrolled in each class and in each semester. Using a 4GL, the request would look
similar to this:
The only downside of a 4GL is that it does not make efficient use of the machine's resources.
However, the benefit of writing a program quickly and easily far outweighs the extra costs of
running it.
5GLs are centered on solving problems using constraints given to the program,
rather than using an algorithm written by a programmer.
With 5GL, the programmer only needs to worry about what problems need to be
solved and what conditions need to be met, without worrying about how to
implement a routine or algorithm to solve them.
CHAPTER - 2
INTRODUCTION TO C
INTRODUCTION
Today, C has become a popular language and various software programs are written
using this language.
Many other commonly used programming languages, such as C++ and Java, are also
based on C.
Characteristics of C
Small size. C has only 32 keywords. This makes it relatively easy to learn
Unlike PASCAL it supports loose typing (as a character can be treated as an integer and vice
versa)
Stable language.
Quick language
C is a core language
C is a portable language.
C is an extensible language
USES OF C
STRUCTURE OF A C PROGRAM
main()
{
    Statement 1;
    Statement 2;
    ............
    Statement N;
}
Function1()
{
    Statement 1;
    Statement 2;
    ............
    Statement N;
}
Function2()
{
    Statement 1;
    Statement 2;
    ............
    Statement N;
}
.
.
FunctionN()
{
    Statement 1;
    Statement 2;
    ............
    Statement N;
}
Source File
Header File
Object File
Executable File
The source code file contains the source code of the program. The file extension of any C
source code file is .c. This file contains C source code that defines the main function and
maybe other functions. The main() is the starting point of execution when you successfully
compile and run the program. A C program in general may include even other source code files
(with the file extension .c).
Header Files
When working with large projects, it is often desirable to write subroutines and store them in a
different file known as a header file. The advantage of header files can be realized when
a) The programmer wants to use the same subroutines in different programs.
b) The programmer wants to change or add subroutines and have those changes reflected
in all other programs.
Conventionally, header file names end with a .h extension, and the names can use only
letters, digits, dashes, and underscores.
While some standard header files are available in C, the programmer may also create
user-defined header files.
Object files are generated by the compiler as a result of processing the source code file. Object
files contain compact binary code of the function definitions. The linker uses these object files to
produce an executable file (.exe file) by combining them. Object files have
a .o extension, although some operating systems, including Windows and MS-DOS, use a
.obj extension for the object file.
The binary executable file is generated by the linker. The linker links the various object files to
produce a binary file that can be directly executed. On the Windows operating system,
executable files have a .exe extension.
[Figure: Each source file is preprocessed and compiled into an object file. The linker combines the object files with the library files to produce the executable file.]
USING COMMENTS
It is a good programming practice to place some comments in the code to help the reader
understand the code clearly.
Comments are just a way of explaining what a program does. It is merely an internal program
documentation.
The compiler ignores the comments when forming the object file. This means that the
comments are non-executable statements.
// is used to comment a single statement. This is known as a line comment. A line comment can
be placed anywhere on the line and does not need to be explicitly ended, as the end of
the line automatically ends the comment.
/* is used to comment multiple statements. A /* is ended with */ and all statements that lie between
these characters are commented.
KEYWORDS
C has a set of 32 reserved words, known as keywords. Each keyword is a
sequence of characters that has a fixed meaning. All keywords must be written
in lowercase (small) letters.
Keywords: auto, break, case, char, const, continue, default, do, double,
else, enum, extern, float, for, goto, if, int, long, register, return, short, signed,
sizeof, static, struct, switch, typedef, union, unsigned, void, volatile, while
IDENTIFIERS
DATA TYPES IN C
DATA TYPE             SIZE IN BYTES   RANGE
char                  1               -128 to 127
unsigned char         1               0 to 255
signed char           1               -128 to 127
int                   2               -32768 to 32767
unsigned int          2               0 to 65535
signed int            2               -32768 to 32767
short int             2               -32768 to 32767
unsigned short int    2               0 to 65535
long int              4               -2147483648 to 2147483647
unsigned long int     4               0 to 4294967295
signed long int       4               -2147483648 to 2147483647
float                 4               3.4E-38 to 3.4E+38
double                8               1.7E-308 to 1.7E+308
long double           10              3.4E-4932 to 1.1E+4932
Sizes and ranges are implementation-dependent; the values shown correspond to a 16-bit compiler.
VARIABLES IN C
A variable is defined as a meaningful name given to the data storage location in computer
memory.
When using a variable, we actually refer to the address of the memory where the data is stored. C
language supports two basic kinds of variables.
Numeric variables can be used to store either integer values or floating point values.
While an integer value is a whole number without a fraction part or decimal point, a floating
point number can have a decimal point.
Numeric variables may also be associated with modifiers like short, long, signed and unsigned.
Character variables can include any letter from the alphabet or from the ASCII chart and
the digits 0 to 9, placed between single quotes.
[Figure: Variables are divided into numeric variables and character variables.]
CONSTANTS
STREAMS
A stream acts in two ways. It is the source of data as well as the destination of data.
C programs read input data from a stream and write output data to a stream. Streams are associated with a physical
device such as the monitor or with a file stored on the secondary memory.
In a text stream, the sequence of characters is divided into lines, with each line terminated
by a new-line character (\n). A binary stream, on the other hand, contains data values using
their memory representation.
Although we can do input/output from the keyboard/monitor or from any file, in this chapter
we will assume that the source of data is the keyboard and the destination of the data is the
monitor.
[Figure: Streams in a C program. Data flows from the keyboard into the program and from the program to the monitor; a stream is either a text stream or a binary stream.]
The printf function is used to display information to the user and to print the values
of variables. Its syntax can be given as
printf("control string", variable list);
The control string parameter is a C string that contains the text that has to be written to the
standard output device. Each conversion specification in the control string has the following form:
%[flags][width][.precision][length]specifier
[Tables: flags, length modifiers, and type specifiers that can appear in the control string]
The scanf() function is used to read formatted data from the keyboard. The syntax of scanf() can be given as
scanf("control string", arg1, arg2, ..., argn);
The control string specifies the type and format of the data that has to be obtained from the keyboard and
stored in the memory locations pointed to by the arguments arg1, arg2, ..., argn. The prototype of the control
string can be given as:
%[*][width][modifiers]type
* is an optional argument that suppresses assignment of the input field. That is, it indicates that data should
be read from the stream but ignored (not stored in the memory location).
width is an optional argument that specifies the maximum number of characters to be read.
modifiers is an optional argument that can be h, l or L for the data pointed to by the corresponding additional
arguments. Modifier h is used for short int or unsigned short int; l is used for long int, unsigned long int or
double values. Finally, L is used for long double values.
type is the same as the specifier in printf().
#include <stdio.h>

int main(void)
{
    int num;
    float fnum;
    char ch, str[10];
    double dnum;
    short snum;
    long int lnum;

    printf("\n Enter the values : ");
    /* %le reads a double; %9s keeps str within its 10-character buffer */
    scanf("%d %f %c %9s %le %hd %ld", &num, &fnum, &ch, str, &dnum, &snum, &lnum);
    printf("\n num = %d \n fnum = %.2f \n ch = %c \n str = %s \n dnum = %e \n snum = %hd \n lnum = %ld",
           num, fnum, ch, str, dnum, snum, lnum);
    return 0;
}
OPERATORS IN C
Arithmetic operators
Relational Operators
Equality Operators
Logical Operators
Unary Operators
Conditional Operators
Bitwise Operators
Assignment operators
Comma Operator
Sizeof Operator
ARITHMETIC OPERATORS
OPERATION     OPERATOR   SYNTAX   COMMENT          RESULT
Multiply      *          a * b    result = a * b   27
Divide        /          a / b    result = a / b
Addition      +          a + b    result = a + b   12
Subtraction   -          a - b    result = a - b
Modulus       %          a % b    result = a % b
RELATIONAL OPERATORS
Also known as a comparison operator, it is an operator that compares two values. Expressions that
contain relational operators are called relational expressions. Relational operators return true or
false value, depending on whether the conditional relationship between the two operands holds or
not.
OPERATOR   MEANING                    EXAMPLE
<          LESS THAN                  3 < 5 GIVES 1
>          GREATER THAN               7 > 9 GIVES 0
>=         GREATER THAN OR EQUAL TO   50 >= 100 GIVES 0
<=         LESS THAN OR EQUAL TO
EQUALITY OPERATORS
C language supports two kinds of equality operators to compare their operands for strict
equality or inequality. They are equal to (==) and not equal to (!=) operator.
The equality operators have lower precedence than the relational operators.
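OPERATOR   MEANING
==         RETURNS 1 IF BOTH OPERANDS ARE EQUAL, 0 OTHERWISE
!=         RETURNS 1 IF THE OPERANDS ARE NOT EQUAL, 0 OTHERWISE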
LOGICAL OPERATORS
C language supports three logical operators. They are- Logical AND (&&), Logical OR (||) and
Logical NOT (!).
As in case of arithmetic expressions, the logical expressions are evaluated from left to right.
A   B   A && B   A || B   !A
0   0   0        0        1
0   1   0        1        1
1   0   0        1        0
1   1   1        1        0
UNARY OPERATORS
Unary operators act on single operands. C language supports three unary operators. They are
unary minus, increment and decrement operators.
When an operand is preceded by a minus sign, the unary operator negates its value.
The increment operator is a unary operator that increases the value of its operand by 1. Similarly,
the decrement operator decreases the value of its operand by 1. For example,
int x = 10, y;
y = x++;
is equivalent to writing
y = x;
x = x + 1;
whereas, y = ++x;
is equivalent to writing
x = x + 1;
y = x;
CONDITIONAL OPERATOR
The conditional operator (?:) is just like an if-else statement that can be written within
expressions.
Conditional operators make the program code more compact, more readable, and safer to use,
as it is easier to check the arguments that are used for evaluation.
The conditional operator is also known as the ternary operator, as it is neither a unary nor a binary
operator; it takes three operands.
BITWISE OPERATORS
Bitwise operators perform operations at bit level. These operators include: bitwise AND, bitwise
OR, bitwise XOR and shift operators.
The bitwise AND operator (&) is a small version of the boolean AND (&&) as it performs
operation on bits instead of bytes, chars, integers, etc.
The bitwise OR operator (|) is a small version of the boolean OR (||) as it performs operation on
bits instead of bytes, chars, integers, etc.
The bitwise NOT (~), or complement, is a unary operation that performs logical negation on
each bit of the operand. By performing negation of each bit, it actually produces the ones'
complement of the given binary value.
The bitwise XOR operator (^) performs operation on individual bits of the operands. The result
of the XOR operation is shown in the table.
A   B   A ^ B
0   0   0
0   1   1
1   0   1
1   1   0
When a right shift is performed on an unsigned integer, zeros are shifted in from
the left. For example, if x is an unsigned integer with the binary value 11000101,
then x >> 2 = 00110001
ASSIGNMENT OPERATORS
The assignment operator is responsible for assigning values to the variables. While the equal sign (=) is the
fundamental assignment operator, C also supports other assignment operators that provide shorthand ways
to represent common variable assignments. They are shown in the table.
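OPERATOR   SYNTAX                     EQUIVALENT TO
/=         variable /= expression     variable = variable / expression
%=         variable %= expression     variable = variable % expression
*=         variable *= expression     variable = variable * expression
+=         variable += expression     variable = variable + expression
-=         variable -= expression     variable = variable - expression
&=         variable &= expression     variable = variable & expression
^=         variable ^= expression     variable = variable ^ expression
<<=        variable <<= amount        variable = variable << amount
>>=        variable >>= amount        variable = variable >> amount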
COMMA OPERATOR
The comma operator in C takes two operands. It works by evaluating the first and discarding its
value, and then evaluates the second and returns the value as the result of the expression.
Comma separated operands when chained together are evaluated in left-to-right sequence with
the right-most value yielding the result of the expression.
Among all the operators, the comma operator has the lowest precedence. For example,
int a=2, b=3, x=0;
x = (++a, b+=a);
Now, the value of x = 6.
SIZEOF OPERATOR
The operator returns the size of the variable, data type or expression in bytes.
'sizeof' operator is used to determine the amount of memory space that the
variable/expression/data type will take. For example,
Type conversion and type casting of variables refers to changing a variable of one data type
into another.
While type conversion is done implicitly, casting has to be done explicitly by the programmer.
We will discuss both of them here.
Type conversion is done when the expression has variables of different data types. So to
evaluate the expression, the data type is promoted from lower to higher level where the
hierarchy of data types can be given as: double, float, long, int, short and char.
For example, type conversion is automatically done when we assign an integer value to a
floating point variable:
float x;
int y = 3;
x = y;
Now, x = 3.0.
Type casting is also known as forced conversion. It is done when the value of a higher data type has to be
converted into the value of a lower data type. For example, we need to explicitly type cast a floating point variable
into an integer variable.
float salary = 10000.00;
int sal;
sal = (int) salary;
Typecasting can be done by placing the destination data type in parentheses followed by the variable name
that has to be converted.
CHAPTER - 3
DECISION CONTROL AND
LOOPING STATEMENTS
Decision control statements are used to alter the flow of a sequence of instructions.
These statements help to jump from one part of the program to another depending on whether
a particular condition is satisfied or not.
IF STATEMENT
If statement is the simplest form of decision control statements that is frequently used in
decision making. The general form of a simple if statement is shown in the figure.
First the test expression is evaluated. If the test expression is true, the statements of the if block
(statement 1 to n) are executed; otherwise these statements are skipped and execution
jumps to statement x.
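SYNTAX OF IF STATEMENT
if (test expression)
{
    statement 1;
    ..............
    statement n;
}
statement x;
[Figure: flowchart of the if statement. If the test expression is TRUE, statement block 1 is executed and then statement x; if it is FALSE, control passes directly to statement x.]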
IF ELSE STATEMENT
In the if-else construct, first the test expression is evaluated. If the expression is true, statement
block 1 is executed and statement block 2 is skipped. Otherwise, if the expression is false,
statement block 2 is executed and statement block 1 is ignored. In any case after the statement
block 1 or 2 gets executed the control will pass to statement x. Therefore, statement x is
executed in every case.
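SYNTAX OF IF-ELSE STATEMENT
if (test expression)
{
    statement_block 1;
}
else
{
    statement_block 2;
}
statement x;
[Figure: flowchart of the if-else statement. If the test expression is TRUE, statement block 1 is executed; if it is FALSE, statement block 2 is executed. In both cases control then passes to statement x.]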
IF ELSE IF STATEMENT
C language supports if else if statements to test additional conditions apart from the initial test
expression. The if-else-if construct works in the same way as a normal if statement.
SYNTAX OF IF-ELSE-IF STATEMENT
if (test expression 1)
{
    statement block 1;
}
else if (test expression 2)
{
    statement block 2;
}
...........................
else if (test expression N)
{
    statement block N;
}
else
{
    statement block X;
}
statement Y;
[Figure: flowchart of the if-else-if statement. If test expression 1 is TRUE, statement block 1 is executed; otherwise test expression 2 is evaluated and, if TRUE, statement block 2 is executed; if every test is FALSE, statement block X is executed. In all cases control then passes to statement Y.]
SWITCH CASE
A switch case statement is a multi-way decision statement. Switch statements are used:
When there is only one variable to evaluate in the expression
When many conditions are being tested for
Switch case statement advantages include:
Easy to debug, read, understand and maintain
Can execute faster than its equivalent if-else construct
switch (grade)
{
    case 'A':
        printf("\n Excellent");
        break;
    case 'B':
        printf("\n Good");
        break;
    case 'C':
        printf("\n Fair");
        break;
    default:
        printf("\n Invalid Grade");
        break;
}
ITERATIVE STATEMENTS
Iterative statements are used to repeat the execution of a list of statements, depending on the
value of an integer expression. In this section, we will discuss all these statements.
While loop
Do-while loop
For loop
WHILE LOOP
The while loop is used to repeat one or more statements while a particular condition is true.
In the while loop, the condition is tested before any of the statements in the statement block is
executed.
If the condition is true, only then the statements will be executed otherwise the control will jump
to the immediate statement outside the while loop block.
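We must constantly update the condition of the while loop.
while (condition)
{
    statement_block;
}
statement x;
[Figure: flowchart of the while loop. While the condition is TRUE, the statement block is executed and the condition expression is updated; when the condition is FALSE, control passes to the statement that follows the loop.]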
DO WHILE LOOP
The do-while loop is similar to the while loop. The only difference is that in a do-while loop, the
test condition is tested at the end of the loop.
The body of the loop gets executed at least one time (even if the condition is false).
The do-while loop continues to execute while a condition is true. There is no choice whether to
execute the loop or not: entry into the loop is automatic, and there is only a choice whether to continue
it further or not.
The major disadvantage of using a do while loop is that it always executes at least once, so
even if the user enters some invalid data, the loop will execute.
Do-while loops are widely used to print a list of options for a menu driven program.
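statement x;
do
{
    statement_block;    // condition updated
} while (condition);
statement y;
[Figure: flowchart of the do-while loop. After statement x, the statement block is executed and the condition expression is updated; while the condition is TRUE the block repeats, and when it is FALSE control passes to statement y.]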
FOR LOOP
When a for loop is used, the loop variable is initialized only once.
With every iteration of the loop, the value of the loop variable is updated and the condition is
checked. If the condition is true, the statement block of the loop is executed else, the
statements comprising the statement block of the for loop are skipped and the control jumps to
the immediate statement following the for loop body.
Updating the loop variable may include incrementing the loop variable, decrementing the loop
variable or setting it to some other value like, i +=2, where i is the loop variable.
BREAK STATEMENT
The break statement is used to terminate the execution of the nearest enclosing loop in which it
appears.
When a break statement is encountered, control passes to the statement that follows
the loop in which the break statement appears. Its syntax is quite simple: just type the keyword
break followed by a semicolon.
break;
In switch statement if the break statement is missing then every case from the matched case
label to the end of the switch, including the default, is executed.
CONTINUE STATEMENT
GOTO STATEMENT
Here label is an identifier that specifies the place where the branch is to be made. Label can be
any valid variable name that is followed by a colon (:).
Note that label can be placed anywhere in the program either before or after the goto
statement. Whenever the goto statement is encountered the control is immediately transferred
to the statements following the label.
If the label is placed after the goto statement then it is called a forward jump and in case it is
located before the goto statement, it is said to be a backward jump.
CHAPTER - 4
FUNCTIONS
INTRODUCTION
Every function in the program is supposed to perform a well defined task. Therefore, the
program code of one function is completely insulated from that of other functions.
Every function has a name which acts as an interface to the outside world in terms of how
information is transferred to it and how results generated by the function are transmitted back
from it.
In the fig, main() calls another function, func1() to perform a well defined task.
main() is known as the calling function and func1() is known as the called function.
When a function call is encountered, instead of executing the next statement in
the calling function, control jumps to the statements that are part of the called
function.
After the called function is executed, control returns to the calling function.
main()
{
..
..
func1();
..
return 0;
}
func1()
{
Statement
Block;
}
INTRODUCTION CONTD.
main() is not limited to calling only one function; it can call as many
functions as it wants and as many times as it wants. For example, a function call
placed within a for loop, while loop or do-while loop may call the same function multiple
times for as long as the condition holds true.
Nor is main() the only function that can call other functions. Any function can call any other
function. In the figure, one function calls another, and the other function in turn calls some
other function.
main()
{
..
..
func1();
..
return 0;
}
func1()
{
..
func2();
..
.
return;
}
func2()
{
..
..
func3();
..
.
return;
}
func3()
{
..
.
return;
}
Dividing the program into separate well defined functions allows each function to be written
and tested separately. This simplifies the process of getting the total program to work.
Understanding, coding and testing multiple separate functions are far easier than doing the
same for one huge function.
If a big program has to be developed without the use of any function (except main()), then there
will be countless lines in the main() .
All the libraries in C contain a set of functions that the programmers are free to use in their
programs. These functions have been prewritten and pre-tested, so the programmers use them
without worrying about their code details. This speeds up program development.
TERMINOLOGY OF FUNCTIONS
A function, f that uses another function g, is known as the calling function and g is known as the
called function.
When a called function returns some result back to the calling function, it is said to return that
result.
The calling function may or may not pass parameters to the called function. If the called
function accepts arguments, the calling function will pass parameters, else not.
main() is the function that is called by the operating system and therefore, it is supposed to
return the result of its processing to the operating system.
FUNCTION DECLARATION
Function declaration is a declaration statement that identifies a function with its name, a list of
arguments that it accepts and the type of data it returns.
The general format for declaring a function that accepts some arguments and returns some
value as result can be given as:
return_data_type function_name(data_type variable1, data_type variable2,..);
FUNCTION DEFINITION
Function definition consists of a function header that identifies the function, followed by the
body of the function containing the executable code for that function
When a function is defined, space is allocated for that function in the memory.
The syntax of a function definition can be given as:
return_data_type function_name(data_type variable1, data_type variable2,..)
{
.
statements
.
return( variable);
}
The number and the order of arguments in the function header must be the same as those given in
the function declaration statement.
Oxford University Press 2011. All rights reserved.
FUNCTION CALL
When a function is invoked, the control jumps to the called function to execute the statements
that are a part of that function.
Once the called function is executed, the program control passes back to the calling function.
The function name and the number and type of arguments in the function call must be the same as
those given in the function declaration and the function header of the function definition.
The names (and not the types) of variables in the function declaration, function call and header
of the function definition may vary.
Arguments may be passed in the form of expressions to the called function. In such a case,
arguments are first evaluated and converted to the type of formal parameter and then the body
of the function gets executed.
If the return type of the function is not void, then the value returned by the called function may
be assigned to some variable as given below.
variable_name = function_name(variable1, variable2, );
#include<stdio.h>
int sum(int a, int b); // FUNCTION DECLARATION
int main()
{
int num1, num2, total = 0;
printf("\n Enter the first number : ");
scanf("%d", &num1);
printf("\n Enter the second number : ");
scanf("%d", &num2);
total = sum(num1, num2); // FUNCTION CALL
printf("\n Total = %d", total);
return 0;
}
int sum(int a, int b) // FUNCTION HEADER
{ // FUNCTION BODY
return (a + b);
}
RETURN STATEMENT
The return statement is used to terminate the execution of a function and return control to the
calling function. When the return statement is encountered, the program execution resumes in
the calling function at the point immediately following the function call.
Programming Tip: It is an error to return a value from a function that has void as its
return type. A plain return; with no expression is allowed in such a function.
A return statement may or may not return a value to the calling function. The syntax of return
statement can be given as
return <expression> ;
For functions that have no return statement, the control automatically returns to the calling
function after the last statement of the called function is executed.
There are two ways in which arguments or parameters can be passed to the called function:
Call by value, in which the values of the variables are passed by the calling function to the
called function.
Call by reference, in which the addresses of the variables are passed by the calling function to
the called function.
CALL BY VALUE
In the Call by Value method, the called function creates new variables to store the value of the
arguments passed to it. Therefore, the called function uses a copy of the actual arguments to
perform its intended task.
If the called function is supposed to modify the value of the parameters passed to it, then the
change will be reflected only in the called function. In the calling function no change will be
made to the value of the variables.
#include<stdio.h>
void add( int n);
int main()
{
int num = 2;
printf("\n The value of num before calling the function = %d", num);
add(num);
printf("\n The value of num after calling the function = %d", num);
return 0;
}
void add(int n)
{
n = n + 10;
printf("\n The value of num in the called function = %d", n);
}
The output of this program is:
The value of num before calling the function = 2
The value of num in the called function = 12
The value of num after calling the function = 2
CALL BY REFERENCE
When the calling function passes arguments to the called function using call by value method,
the only way to return the modified value of the argument to the caller is explicitly using the
return statement. The better option when a function can modify the value of the argument is to
pass arguments using call by reference technique.
C has no reference parameters of its own (the int &n notation belongs to C++). In C, call by
reference is achieved by passing the address of the variable and declaring the corresponding
function parameter as a pointer. When this is done, any changes made by the function through
that pointer are visible to the calling program.
To indicate that an argument is passed using call by reference, an asterisk (*) is placed after
the type in the parameter list and the caller passes the address of the variable using the
ampersand sign (&). The called function dereferences the pointer, so changes made in its body
are reflected in the value of the variable in the calling program.
#include<stdio.h>
void add( int *n);
int main()
{
int num = 2;
printf("\n The value of num before calling the function = %d", num);
add(&num);
printf("\n The value of num after calling the function = %d", num);
return 0;
}
void add( int *n)
{
*n = *n + 10;
printf("\n The value of num in the called function = %d", *n);
}
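In standard C, a function changes its caller's variables by receiving their addresses. The classic case is exchanging two values, sketched below; the function name swap and its parameter names are chosen only for this illustration.

```c
/* Exchanges the two integers whose addresses are passed in. */
void swap(int *a, int *b)
{
    int temp = *a;   /* value stored at the first address */
    *a = *b;         /* overwrite the caller's first variable */
    *b = temp;       /* overwrite the caller's second variable */
}
```

A caller holding int x = 1, y = 2; would write swap(&x, &y); and afterwards find x equal to 2 and y equal to 1. Had the parameters been plain int values, only the copies inside swap would have been exchanged.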
VARIABLES SCOPE
Block scope
Function scope
File scope
Program scope
BLOCK SCOPE
A statement block is a group of statements enclosed within an opening and closing curly brackets ({ }). If a variable is declared
within a statement block then, as soon as the control exits that block, the variable will cease to exist. Such a variable also known
as a local variable is said to have a block scope.
#include <stdio.h>
int main()
{
int x = 10, i = 0;
printf("\n The value of x outside the while loop is %d", x);
while (i<3)
{
int x = i;
printf("\n The value of x inside the while loop is %d", x);
i++;
}
printf("\n The value of x outside the while loop is %d", x);
return 0;
}
Output:
The value of x outside the while loop is 10
The value of x inside the while loop is 0
The value of x inside the while loop is 1
The value of x inside the while loop is 2
The value of x outside the while loop is 10
FUNCTION SCOPE
Function scope applies only to goto label names. A label is visible throughout the function in
which it appears, so the programmer cannot use the same label name twice inside a function.
PROGRAM SCOPE
If you want that functions should be able to access some variables which are not passed to
them as arguments, then declare those variables outside any function blocks. Such
variables are commonly known as global variables. Hence, global variables are those
variables that can be accessed from any point in the program.
#include<stdio.h>
int x = 10;
void print();
int main()
{
printf("\n The value of x in the main() = %d", x);
int x = 2;
printf("\n The value of local variable x in the main() = %d", x);
print();
}
void print()
{
printf("\n The value of x in the print() = %d", x);
}
FILE SCOPE
When a global variable is accessible until the end of the file, the variable is said to have file
scope.
To allow a variable to have file scope, declare that variable with the static keyword before
specifying its data type, like this:
static int x = 10;
A global static variable can be used anywhere in the file in which it is declared, but it is not
accessible from any other file.
Such variables are useful when the programmer writes his own header files.
STORAGE CLASSES
The storage class of a variable defines the scope (visibility) and life time of variables and/or
functions declared within a C Program. In addition to this, the storage class gives the following
information about the variable or the function.
It is used to determine the part of memory where storage space will be allocated for that
variable or function (whether the variable/function will be stored in a register or in RAM)
it specifies how long the storage allocation will continue to exist for that function or variable.
It specifies the scope of the variable or function. That is, the part of the C program in which the
variable name is visible, or accessible.
It specifies whether the variable will be automatically initialized to zero or to any indeterminate
value
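STORAGE CLASS
FEATURE: Accessibility
auto: Accessible within the function or block in which it is declared
extern: Accessible within all program files that are a part of the program
register: Accessible within the function or block in which it is declared
static: Local: Accessible within the function or block in which it is declared. Global: Accessible within the file in which it is declared
FEATURE: Storage
auto: Main Memory
extern: Main Memory
register: CPU Register
static: Main Memory
FEATURE: Existence
auto: Exists when the function or block in which it is declared is entered. Ceases to exist when the control returns from the function or the block in which it was declared
extern: Exists throughout the execution of the program
register: Same as auto
static: Exists throughout the execution of the program; a local static variable retains its value between function calls
FEATURE: Default value
auto: Garbage
extern: Zero
register: Garbage
static: Zero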
// Example of the extern storage class
// FILE1.C
#include<stdio.h>
int x;
void print(void);
int main()
{
x = 10;
print();
return 0;
}
// END OF FILE1.C
// FILE2.C (compiled and linked together with FILE1.C)
#include<stdio.h>
extern int x;
void print(void)
{
printf("\n x = %d", x);
}
// END OF FILE2.C
// Example of the static storage class
#include<stdio.h>
void print(void);
int main()
{
printf("\n First call of print()");
print();
printf("\n\n Second call of print()");
print();
printf("\n\n Third call of print()");
print();
return 0;
}
void print()
{
static int x;
int y = 0;
printf("\n Static integer variable, x = %d", x);
printf("\n Integer variable, y = %d", y);
x++;
y++;
}
Output:
First call of print()
Static integer variable, x = 0
Integer variable, y = 0
Second call of print()
Static integer variable, x = 1
Integer variable, y = 0
Third call of print()
Static integer variable, x = 2
Integer variable, y = 0
RECURSIVE FUNCTIONS
A recursive function is a function that calls itself to solve a smaller version of its task until a final
call is made which does not require a call to itself.
Therefore, recursion is defining large and complex problems in terms of a smaller and more
easily solvable problem. In recursive function, complicated problem is defined in terms of
simpler problems and the simplest problem is given explicitly.
PROBLEM
5!
= 5 X 4!
= 5 X 4 X 3!
= 5 X 4 X 3 X 2!
= 5 X 4 X 3 X 2 X 1!
SOLUTION
5 X 4 X 3 X 2 X 1!
= 5 X 4 X 3 X 2 X 1
= 5 X 4 X 3 X 2
= 5 X 4 X 6
= 5 X 24
= 120
int num;
scanf("%d", &num);
printf("\n Factorial of %d = %d", num, Fact(num));
return 0;
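The fragment above calls a function named Fact. One possible definition is sketched here, assuming that n is small enough for the result to fit in an int; the body is an illustration of the recursive definition, not necessarily the exact code of the original program.

```c
/* Recursive factorial: n! = n * (n - 1)!, with 0! = 1! = 1 as the base case. */
int Fact(int n)
{
    if (n <= 1)
        return 1;             /* the simplest problem, given explicitly */
    return n * Fact(n - 1);   /* the multiplication waits until the call returns */
}
```

Each call works on a smaller version of the task (n - 1) until the base case is reached, which is exactly the pattern a recursive function follows.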
FIB(n) = 1, if n<=2
FIB(n) = FIB (n - 1) + FIB (n - 2), otherwise
[Figure: recursion tree for FIB(7). FIB(7) calls FIB(6) and FIB(5); each of these branches in turn into two smaller calls until FIB(2) and FIB(1) are reached.]
#include<stdio.h>
int Fibonacci(int);
int main()
{
int n, i;
printf("\n Enter the number of terms in the series : ");
scanf("%d", &n);
for(i=1;i<=n;i++)
printf("\n Fibonacci (%d) = %d", i, Fibonacci(i));
return 0;
}
int Fibonacci(int num)
{
if(num <= 2)
return 1;
return ( Fibonacci (num - 1) + Fibonacci(num - 2));
}
TYPES OF RECURSION
Direct
Indirect
Linear
Tree
Tail
DIRECT RECURSION
A function is said to be directly recursive if it explicitly calls itself. For example, consider the function
given below.
int Func( int n)
{
if(n==0)
return n;
return (Func(n-1));
}
INDIRECT RECURSION
A function is said to be indirectly recursive if it contains a call to another function which
ultimately calls it. Look at the functions given below. These two functions are indirectly recursive
as they both call each other.
int Func2(int x);
int Func1(int n)
{
if(n==0)
return n;
return Func2(n);
}
int Func2(int x)
{
return Func1(x-1);
}
TAIL RECURSION
A recursive function is said to be tail recursive if no operations are pending to be performed when
the recursive function returns to its caller.
That is, when the called function returns, the returned value is immediately returned from the
calling function.
Tail recursive functions are highly desirable because a compiler that optimizes tail calls can
reuse the current stack frame for the recursive call. In that case, the amount of information that
has to be stored on the system stack is independent of the number of recursive calls.
int Fact(int n)
{
return Fact1(n, 1);
}
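The function Fact above hands the work to a helper called Fact1. A plausible definition of that helper is sketched here; the parameter name res (the running product) is an assumption made for this example.

```c
/* Helper that carries the running product in res (an accumulator). */
int Fact1(int n, int res)
{
    if (n <= 1)
        return res;                 /* nothing is left to multiply */
    return Fact1(n - 1, n * res);   /* the call is the last action: no pending work */
}

/* Wrapper with the usual one-argument interface. */
int Fact(int n)
{
    return Fact1(n, 1);
}
```

Because the multiplication is done before the recursive call rather than after it, the value returned by the inner call is returned unchanged, which is what makes Fact1 tail recursive.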
Recursive functions can also be characterized depending on the way in which the recursion
grows- in a linear fashion or forming a tree structure.
In simple words, a recursive function is said to be linearly recursive when no pending operation
involves another recursive call to the function. For example, the factorial function is linearly
recursive as the pending operation involves only multiplication to be performed and does not
involve another call to Fact.
On the contrary, a recursive function is said to be tree recursive (or non-linearly recursive) if the
pending operation makes another recursive call to the function. For example, the Fibonacci
function Fib in which the pending operations recursively calls the Fib function.
Pros: Recursive solutions often tend to be shorter and simpler than non-recursive ones.
Cons: Recursion is implemented using the system stack. If the stack space on the system is limited,
recursion to a deeper level will be difficult to implement.
Using a recursive function usually takes more memory and time to execute as compared to its non-recursive counterpart.
TOWER OF HANOI
Tower of Hanoi is one of the main applications of recursion. It says, "if you can solve n-1 cases, then you can easily solve the nth case."
If there is only one ring, then simply move the ring from the source to the destination.
If there are two rings, then first move ring 1 to the spare pole
and then move ring 2 from the source to the destination. Finally,
move ring 1 from the spare pole to the destination.
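The same idea extends to any number of rings: move the top n-1 rings to the spare pole, move ring n to the destination, then move the n-1 rings from the spare pole onto it. A minimal sketch follows; the function name hanoi, the use of characters as pole names, and the returned move count are all choices made for this illustration.

```c
#include <stdio.h>

/* Moves n rings from pole src to pole dest using pole spare.
   Returns the number of single-ring moves that were made. */
int hanoi(int n, char src, char dest, char spare)
{
    int moves;

    if (n == 0)
        return 0;                                          /* nothing to move */
    moves = hanoi(n - 1, src, spare, dest);                /* clear the way */
    printf("Move ring %d from %c to %c\n", n, src, dest);  /* move the largest ring */
    moves++;
    moves += hanoi(n - 1, spare, dest, src);               /* bring the rest over */
    return moves;
}
```

Calling hanoi(2, 'A', 'C', 'B') prints three moves: ring 1 from A to B, ring 2 from A to C, and ring 1 from B to C.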
CHAPTER - 5
ARRAYS
INTRODUCTION
[Figure: the array marks, whose ten elements marks[0] (1st element) through marks[9] (10th element) are stored one after another]
int i, marks[10];
for(i=0;i<10;i++)
marks[i] = -1;
[Figure: memory representation of marks, holding values such as 67, 78, 56, 88, 90, 34 and 85, with marks[1] at address 1002, marks[2] at 1004, and so on up to marks[7] at 1014 (two bytes per element)]
Inputting Values
int i, marks[10];
for(i=0;i<10;i++)
scanf("%d", &marks[i]);
[Figure: the values 67, 78, 56, 88, 90, 34 and 85 read into successive elements of marks]
Step 1: [INITIALIZATION] SET I = N
Step 2: Repeat Steps 3 and 4 while I >= POS
Step 3: SET A[I + 1] = A[I]
Step 4: SET I = I - 1
[End of Loop]
Step 5: SET N = N + 1
Step 6: SET A[POS] = VAL
Step 7: EXIT
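The insertion steps can be sketched in C as follows. This version assumes that n is the number of elements currently stored and that positions count from 0, so the shifting starts at index n - 1; the function name insert_at is made up for the example.

```c
/* Inserts val at index pos (0-based) in a[0..n-1] and returns the new count.
   The caller must make sure the array has room for at least n + 1 elements. */
int insert_at(int a[], int n, int pos, int val)
{
    int i;

    for (i = n - 1; i >= pos; i--)
        a[i + 1] = a[i];   /* shift one place to the right, last element first */
    a[pos] = val;          /* the slot at pos is now free */
    return n + 1;
}
```

Shifting from the last element backwards matters: going forwards would overwrite each element before it had been copied.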
Calling INSERT (Data, 6, 3, 100) will lead to the following processing in the array
[Figure: Data initially holds 45, 23, 34, 12, 56, 20 in Data[0] to Data[5]. The elements from Data[5] down to Data[3] are moved one position to the right, one at a time, and 100 is then stored in Data[3], giving 45, 23, 34, 100, 12, 56, 20.]
[Figure: the array Data before deletion, holding 45, 23, 34, 12, 56, 20 in Data[0] to Data[5]]
Calling DELETE (Data, 6, 2) will lead to the following processing in the array
[Figure: Data[2] (the value 34) is removed by moving each following element one position to the left, giving 45, 23, 12, 56, 20 in Data[0] to Data[4]]
LINEAR SEARCH
LINEAR_SEARCH(A, N, VAL, POS)
Step
Step
Step
Step
[END OF IF]
[END OF LOOP]
Step 5: PRINT "Value Not Present In The Array"
Step 6: EXIT
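A minimal C sketch of a linear search is given below. It reports the result through its return value instead of printing it, and uses -1 to mean "not present"; both conventions, and the name linear_search, are assumptions made for this example.

```c
/* Returns the index of the first element equal to val, or -1 if val is absent. */
int linear_search(const int a[], int n, int val)
{
    int i;

    for (i = 0; i < n; i++)
        if (a[i] == val)
            return i;   /* found: stop at the first match */
    return -1;          /* every element was examined without a match */
}
```

The array does not need to be sorted, but in the worst case all n elements are compared.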
BINARY SEARCH
BEG = lower_bound and END = upper_bound
MID = (BEG + END) / 2
If VAL < A[MID], then VAL, if present, will be in the left segment of the array. So,
the value of END will be changed as, END = MID - 1
If VAL > A[MID], then VAL, if present, will be in the right segment of the array. So,
the value of BEG will be changed as, BEG = MID + 1
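Put together, the rules for BEG, END and MID give the loop sketched below. It assumes the array is sorted in ascending order and returns -1 when the value is absent; the name binary_search is chosen for the example. MID is computed as beg + (end - beg) / 2, which gives the same result as (BEG + END) / 2 but cannot overflow for very large indices.

```c
/* Searches the sorted array a[0..n-1] for val.
   Returns the index of val, or -1 if it is not present. */
int binary_search(const int a[], int n, int val)
{
    int beg = 0, end = n - 1;

    while (beg <= end) {
        int mid = beg + (end - beg) / 2;
        if (a[mid] == val)
            return mid;
        if (val < a[mid])
            end = mid - 1;   /* continue in the left segment */
        else
            beg = mid + 1;   /* continue in the right segment */
    }
    return -1;               /* the segment became empty */
}
```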
Passing addresses
main()
{
int arr[5] ={1, 2, 3, 4, 5};
func(&arr[3]);
}
A two dimensional array is specified using two subscripts where one subscript denotes row
and the other denotes column.
C treats a two dimensional array as an array of one dimensional arrays.
int marks[3][5];
Rows/Columns | Col 0 | Col 1 | Col 2 | Col 3 | Col 4
Row 0 | marks[0][0] | marks[0][1] | marks[0][2] | marks[0][3] | marks[0][4]
Row 1 | marks[1][0] | marks[1][1] | marks[1][2] | marks[1][3] | marks[1][4]
Row 2 | marks[2][0] | marks[2][1] | marks[2][2] | marks[2][3] | marks[2][4]
[Figure: elements of a 3 X 4 array stored in row major order]
(0,0) (0,1) (0,2) (0,3) (1,0) (1,1) (1,2) (1,3) (2,0) (2,1) (2,2) (2,3)
However, when we store the elements in a column major order, the elements of the first column
are stored before the elements of the second and third column. That is, the elements of the
array are stored column by column, where the n elements of the first column occupy the first n
locations.
[Figure: elements of a 4 X 3 array stored in column major order]
(0,0) (1,0) (2,0) (3,0) (0,1) (1,1) (2,1) (3,1) (0,2) (1,2) (2,2) (3,2)
Address(A[I][J]) = Base_Address + w{M (J - 1) + (I - 1)}, if the array elements are stored in column
major order.
And, Address(A[I][J]) = Base_Address + w{N (I - 1) + (J - 1)}, if the array elements are stored in row
major order.
Where, w is the size of each element (the number of words or bytes it occupies)
M is the number of rows
N is the number of columns
I and J are the subscripts of the array element, counted from 1
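The two formulas can be checked with a few lines of C. The sketch below counts subscripts from 0, as C does, so the "- 1" terms disappear; the function names are hypothetical. C itself stores two dimensional arrays in row major order, so the row major result can be compared with the real address of an element.

```c
#include <stddef.h>

/* Byte offset of element (i, j), counted from 0, in a row major array
   that has ncols columns and elements of w bytes each. */
size_t row_major_offset(size_t i, size_t j, size_t ncols, size_t w)
{
    return w * (ncols * i + j);
}

/* The same for column major storage of an array that has nrows rows. */
size_t col_major_offset(size_t i, size_t j, size_t nrows, size_t w)
{
    return w * (nrows * j + i);
}
```

For int a[3][5], the distance in bytes from &a[0][0] to &a[1][2] equals row_major_offset(1, 2, 5, sizeof(int)).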
A two dimensional array is initialized in the same way as a single dimensional array is initialized. For example,
#include<stdio.h>
int main()
{
int arr[2][2] = {{1, 2}, {3, 4}}; // example values
int i, j;
for(i=0;i<2;i++)
{
printf("\n");
for(j=0;j<2;j++)
printf("%d\t", arr[i][j]);
}
return 0;
}
There are three ways of passing parts of the two dimensional array to a function. First, we can pass
individual elements of the array. This is exactly same as we passed element of a one dimensional
array.
Passing a row
void func(int arr[]);
int main()
{
int arr[2][3]= { {1, 2, 3}, {4, 5, 6} };
func(arr[1]);
return 0;
}
void func(int arr[])
{
int i;
for(i=0;i<3;i++)
printf("%d", arr[i] * 10);
}
PROGRAM
ILLUSTRATING PASSING ENTIRE ARRAY TO A FUNCTION
#include<stdio.h>
Like we have one index in a single dimensional array, two indices in a two dimensional array, in
the same way we have n indices in a n-dimensional array or multi dimensional array.
I2<=M2
I3 <= M3
In <= Mn
SPARSE MATRIX
Sparse matrix is a matrix that has many elements with a value zero.
In order to efficiently utilize the memory, specialized algorithms and data structures that take
advantage of the sparse structure of the matrix should be used. Otherwise, execution will
slow down and the matrix will consume large amounts of memory.
There are two types of sparse matrices. In the first type of sparse matrix, all elements above
the main diagonal have a value zero. This type of sparse matrix is also called a (lower)
triangular matrix. In a lower triangular matrix, Ai,j = 0 where i<j.
An nXn lower triangular matrix A has at most one non-zero element in the first row, two non-zero
elements in the second row and likewise, n non-zero elements in the nth row.
[Figure: example of a lower triangular matrix]
In the second variant of a sparse matrix, elements with a non-zero value can appear only on the
diagonal or immediately above or below the diagonal. This type of matrix is also called a
tridiagonal matrix.
In a tridiagonal matrix, the non-zero elements lie on three diagonals:
the main diagonal, which contains non-zero elements for i=j. In all there will be n elements
the diagonal below the main diagonal, which contains non-zero elements for i=j+1. In all there will be n-1
elements
the diagonal above the main diagonal, which contains non-zero elements for i=j-1. In all there will be n-1
elements
[Figure: example of a tridiagonal matrix]
CHAPTER 6
STRINGS
INTRODUCTION
A string is a null-terminated character array. This means that after the last character, a null
character (\0) is stored to signify the end of the character array.
[Figure: memory representation of a string. The characters occupy str[0] to str[4] at consecutive addresses starting at 1000, and str[5] at address 1005 holds \0.]
READING STRINGS
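char str[100];
Then str can be read from the user in three ways:
using the scanf() function
using the gets() function (gets() cannot limit the length of the input and was removed from the C standard in C11; fgets() is the safer choice)
using the getchar() function repeatedly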
The string can also be read by calling the getchar() repeatedly to read a sequence of single
characters (unless a terminating character is entered) and simultaneously storing it in a
character array.
i=0;
ch = getchar();
while(ch != '*')
{
str[i] = ch;
i++;
ch = getchar();
}
str[i] = '\0';
WRITING STRINGS
The string can be displayed on screen in three ways.
The string can also be written by calling putchar() repeatedly to print a sequence of single
characters.
i=0;
while(str[i] != '\0')
{
putchar(str[i]);
i++;
}
SUPPRESSING INPUT
scanf() can be used to read a field without assigning it to any variable. This is done by
preceding that field's format code with a *. For example, given:
The time can be read as 9:05 as a pair. Here the colon would be read but not assigned to
anything.
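A hedged sketch of the time example follows. It uses sscanf(), which applies the same format codes to a string instead of the keyboard, so that the result is easy to check; the function name parse_time and its parameters are made up for the illustration.

```c
#include <stdio.h>

/* Parses a time such as "9:05" into hours and minutes.
   Returns 1 if both numbers were read, 0 otherwise. */
int parse_time(const char *text, int *hr, int *min)
{
    /* %*c reads one character (here the colon) and discards it;
       suppressed fields are not counted in the return value of sscanf() */
    return sscanf(text, "%d%*c%d", hr, min) == 2;
}
```

With keyboard input, the equivalent call would be scanf("%d%*c%d", &hr, &min);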
Using a Scanset
The ANSI standard added the new scanset feature to the C language. A scanset is used to
define a set of characters which may be read and assigned to the corresponding string. A
scanset is defined by placing the characters inside square brackets prefixed with a %
int main()
{
char str[10];
printf("\n Enter string: " );
scanf("%9[aeiou]", str ); // reads at most 9 characters, leaving room for '\0'
printf( "The string is : %s", str);
return 0;
}
The code will stop accepting characters as soon as the user enters a character that is not a
vowel.
However, if the first character in the set is a ^ (caret symbol), then scanf() will accept any
character that is not defined by the scanset. For example, if you write
scanf("%[^aeiou]", str );
then scanf() will accept characters until the user enters a vowel.
LENGTH
The number of characters in the string constitutes the length of the string.
For example, LENGTH("C PROGRAMMING IS FUN") will return 20. Note that even blank
spaces are counted as characters in the string.
LENGTH("\0") = 0 and LENGTH("") = 0 because neither string contains any character.
In memory, the ASCII code of a character is stored instead of the character itself. The ASCII codes for
A-Z range from 65 to 90 and the ASCII codes for a-z range from 97 to 122. So if we have to
convert a lower case character into upper case, then we just need to subtract 32 from the ASCII
value of the character.
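The conversion can be sketched as follows, assuming the ASCII character set. The function name to_upper is hypothetical; the standard library function toupper() from ctype.h does the same job portably.

```c
/* Converts the lower case letters in str to upper case, in place (ASCII assumed). */
void to_upper(char str[])
{
    int i;

    for (i = 0; str[i] != '\0'; i++)
        if (str[i] >= 'a' && str[i] <= 'z')
            str[i] = str[i] - 32;   /* 'a' is 97 and 'A' is 65 */
}
```

Characters outside the range a-z, such as digits, spaces and punctuation, are left unchanged.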
If S1 and S2 are two strings, then the concatenation operation produces a string which
contains the characters of S1 followed by the characters of S2.
APPENDING
Appending one string to another string involves copying the contents of the source string at the
end of the destination string. For example, if S1 and S2 are two strings, then appending S1 to
S2 means we have to add the contents of S1 to S2. so S1 is the source string and S2 is the
destination string. The appending operation would leave the source string S1 unchanged and
destination string S2 = S2+S1.
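If S1 and S2 are two strings, then comparing them will give one of these results: S1 and S2 are equal; S1 > S2, when S1 comes after S2 in dictionary order; or S1 < S2, when S1 comes before S2 in dictionary order.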
REVERSING A STRING
If S1 = "HELLO", then reverse of S1 = "OLLEH". To reverse a string we just need to swap the
first character with the last, the second character with the second last character, and so on.
ALGORITHM TO REVERSE A STRING
Step 1: [Initialize] SET I = 0, J = Length(STR) - 1
Step 2: Repeat Steps 3 and 4 while I < J
Step 3: SWAP(STR(I), STR(J))
Step 4: SET I = I + 1, J = J - 1
[END OF LOOP]
Step 5: EXIT
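In C, the reversal can be sketched as below. The index j starts at the last character (length - 1) and the loop stops when the two indices meet, so each pair is swapped exactly once. The function name reverse is chosen for this example.

```c
#include <string.h>

/* Reverses the string str in place. */
void reverse(char str[])
{
    int i = 0;
    int j = (int)strlen(str) - 1;   /* index of the last character */

    while (i < j) {
        char temp = str[i];         /* swap str[i] and str[j] */
        str[i] = str[j];
        str[j] = temp;
        i++;
        j--;
    }
}
```

For an empty string j starts at -1, so the loop body never runs and the string is left as it is.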
In order to extract a substring from the main string we need to copy the content of the string
starting from the first position to the nth position where n is the number of characters to be
extracted.
In order to extract a substring from the right side of the main string we need to first calculate
the position. For example, if S1 = "Hello World" and we have to copy 7 characters starting
from the right, then we have to actually start extracting characters from the 5th position. This
is calculated by: total number of characters - n + 1 (here, 11 - 7 + 1 = 5).
the position of the first character of the substring in the given string
INSERTION
The insertion operation inserts a string S in the main text, T at the kth position. The general
syntax of this operation is: INSERT(text, position, string). For example, INSERT("XYZXYZ", 3, "AAA")
= "XYZAAAXYZ"
INDEXING
Index operation returns the position in the string where the string pattern first occurs. For
example,
However, if the pattern does not exist in the string, the INDEX function returns 0.
DELETION
The deletion operation deletes a substring from a given text. We write it as, DELETE(text,
position, length)
REPLACEMENT
Replacement operation is used to replace the pattern P1 by another pattern P2. This is done
by writing, REPLACE(text, pattern1, pattern2)
Note in the second example there is no change as X does not appear in the text.
ARRAY OF STRINGS
Now suppose that there are 20 students in a class and we need a string that stores names of all
the 20 students. How can this be done? Here, we need a string of strings or an array of strings.
Such an array of strings would store 20 individual strings. An array of string is declared as,
char names[20][30];
Here, the first index will specify how many strings are needed and the second index specifies
the length of every individual string. So here, we allocate space for 20 names where each name
can be maximum 30 characters long.
Let us see the memory representation of an array of strings. If we have an array declared as,
char name[5][10] = {"Ram", "Mohan", "Shyam", "Hari", "Gopal"};
[Figure: memory representation of name. Each of name[0] to name[4] occupies a row of 10 characters, holding one name followed by \0.]
CHAPTER 7
POINTERS
Every computer has a primary memory. All our data and programs need to be placed in the
primary memory for execution.
The primary memory (of which RAM, Random Access Memory, is a part)
is a collection of memory locations (often known as cells), and each location has a specific
address. Each memory location is capable of storing 1 byte of data.
Generally, the computer has three areas of memory each of which is used for a specific task.
These areas of memory include- stack, heap and global memory.
Stack- A fixed size of stack is allocated by the system and is filled as needed from the bottom
to the top, one element at a time. These elements can be removed from the top to the bottom
by removing one element at a time. That is, the last element added to the stack is removed first.
Heap- Heap is a contiguous block of memory that is available for use by the program when
need arise. A fixed size heap is allocated by the system and is used by the system in a random
fashion.
When the program requests a block of memory, the dynamic allocation technique carves out a
block from the heap and assigns it to the program.
When the program has finished using that block, it returns that memory block to the heap and
the location of the memory locations in that block is added to the free list.
Global Memory- The block of code that is the main() program (along with other functions in the
program) is stored in the global memory. The memory in the global area is allocated randomly
to store the code of different functions in the program in such a way that one function is not
contiguous to another function. Besides, the function code, all global variables declared in the
program are stored in the global memory area.
Other Memory Layouts- C provides some more memory areas like- text segment, BSS and
shared library segment.
The text segment is used to store the machine instructions corresponding to the compiled
program. This is generally a read-only memory segment
Shared libraries segment contains the executable image of shared libraries that are being used
by the program.
INTRODUCTION
Every variable in C has a name and a value associated with it. When a variable is declared, a
specific block of memory within the computer is allocated to hold the value of that variable. The
size of the allocated block depends on the type of the data.
int x = 10;
For this declaration, the compiler sets aside memory to hold the value 10: 2 bytes on the 16-bit
compilers assumed in this text, and 4 bytes on most current systems.
It also sets up a symbol table in which it adds the symbol x and the relative address in memory
where those bytes were set aside.
Thus, every variable in C has a value and also a memory location (commonly known as
address) associated with it. Some texts use the terms rvalue and lvalue for the value and the
address of the variable respectively.
The rvalue appears on the right side of the assignment statement and cannot be used on the
left side of the assignment statement. Therefore, writing 10 = k; is illegal.
char *pch;
float *pfnum;
int x= 10;
int *ptr = &x;
The '*' informs the compiler that ptr is a pointer variable and the int specifies that it will store the
address of an integer variable.
The & operator retrieves the lvalue (address) of x, and copies that to the contents of the pointer
ptr.
We can "dereference" a pointer, i.e. refer to the value of the variable to which it points by using
unary '*' operator as in *ptr. That is, *ptr = 10, since 10 is value of x.
#include<stdio.h>
int main()
{
int num, *pnum;
pnum = &num;
printf("\n Enter the number : ");
scanf("%d", &num);
printf("\n The number that was entered is : %d", *pnum);
return 0;
}
OUTPUT:
Enter the number : 10
The number that was entered is : 10
We can add integers to or subtract integers from pointers, as well as subtract one pointer
from another.
We can compare pointers by using relational operators in expressions. For example, p1 > p2,
p1 == p2 and p1 != p2 are all valid in C.
When using pointers, the postfix increment (++) and decrement (--) operators have greater precedence
than the dereference operator (*). Therefore, the expression
*ptr++ is equivalent to *(ptr++). The expression yields the value that ptr currently points to and
then increases the value of ptr so that it points to the next element.
In order to increment the value of the variable whose address is stored in ptr, write (*ptr)++
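The difference between the two expressions is easy to see on a small array. The sketch below records what each expression yields; the function name increment_demo and the array values are made up for the illustration.

```c
/* Stores in out[0..2] the values produced by the three expressions below. */
void increment_demo(int out[3])
{
    int arr[3] = {10, 20, 30};
    int *ptr = arr;

    out[0] = *ptr++;     /* parsed as *(ptr++): yields 10, then ptr points to arr[1] */
    out[1] = (*ptr)++;   /* yields 20, then arr[1] becomes 21; ptr does not move */
    out[2] = *ptr;       /* ptr still points to arr[1], which now holds 21 */
}
```

After the call, out holds 10, 20 and 21.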
NULL POINTERS
A null pointer is a special pointer value that is known not to point anywhere. This means
that a NULL pointer does not point to any valid memory address.
To declare a null pointer you may use the predefined constant NULL,
int *ptr = NULL;
You can always check whether a given pointer variable stores address of some variable or
contains a null by writing,
if ( ptr == NULL)
{
Statement block;
}
Null pointers are useful when a pointer in the program points somewhere valid only some of the
time. In such situations it is better to set the pointer to NULL whenever it does not point
anywhere valid, and to test whether it is a null pointer before using it.
GENERIC POINTERS
A generic pointer is a pointer variable that has void as its data type.
It is declared by writing
void *ptr;
You need to cast a void pointer to another kind of pointer before dereferencing it.
Generic pointers are used when a pointer has to point to data of different types at different
times. For example,
#include <stdio.h>
int main()
{
    int x = 10;
    char ch = 'A';
    void *gp;
    gp = &x;
    gp = &ch;
    printf("\n Generic pointer now points to the character %c", *(char*)gp);
    return 0;
}
OUTPUT:
Generic pointer now points to the character A
Look at the code given below which illustrates the use of a pointer to a two dimensional array.
#include <stdio.h>
int main()
{
    int arr[2][2] = {{1,2},{3,4}};
    int i, j, (*parr)[2];
    parr = arr;
    for(i=0;i<2;i++)
    {
        for(j=0;j<2;j++)
            printf(" %d", (*(parr+i))[j]);
    }
    return 0;
}
OUTPUT
1 2 3 4
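#include <stdio.h>
int main()
{
    char str[] = "Hello", *pstr = str;   /* illustrative sample string */
    while(*pstr != '\0')
    { printf("%c", *pstr); pstr++; }
    return 0;
}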
In this program we declare a character pointer pstr to show the string on the screen. We then
"point" the pointer pstr at str. Then we print each character of the string in the while loop.
Instead of using the while loop, we could have straight away used the function puts(), like
puts(pstr);
The prototype of puts() is
int puts(const char *s);
Here the "const" modifier is used to assure the user that the function will
not modify the contents pointed to by the source pointer. Note that the address of the string is
passed to the function as an argument.
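Consider another program which reads a string and then scans each character to count the
number of upper and lower case characters entered.
#include <stdio.h>
int main()
{
    char str[100], *pstr;
    int upper = 0, lower = 0;
    fgets(str, sizeof(str), stdin);   /* bounded replacement for gets() */
    pstr = str;
    while(*pstr != '\0')
    {
        if(*pstr >= 'A' && *pstr <= 'Z') upper++;
        else if(*pstr >= 'a' && *pstr <= 'z') lower++;
        pstr++;
    }
    printf("\n Upper case = %d, lower case = %d", upper, lower);
    return 0;
}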
ARRAY OF POINTERS
An array of pointers can be declared as
int *ptr[10];
The above statement declares an array of 10 pointers, each of which points to an
integer variable. For example, look at the code given below.
int *ptr[10];
int p=1, q=2, r=3, s=4, t=5;
ptr[0]=&p;
ptr[1]=&q;
ptr[2]=&r;
ptr[3]=&s;
ptr[4]=&t;
Can you tell what will be the output of the following statement?
printf("\n %d", *ptr[3]);
Yes, the output will be 4 because ptr[3] stores the address of the integer variable s, and *ptr[3]
will therefore print the value of s, that is, 4.
POINTER TO FUNCTION
#include <stdio.h>
void print(int n);
int main()
{
void (*fp)(int);
fp = print;
(*fp)(10);
fp(20);
return 0;
}
void print(int value)
{
printf("\n %d", value);
}
Comparing Function Pointers
if(fp != NULL)   /* check if initialized */
{
    if(fp == print)
        printf("\n Pointer points to print");
    else
        printf("\n Pointer points to some other function");
}
else
    printf("\n Pointer not initialized!");
#include <stdio.h>
int add(int, int);
int subtract(int, int);
int operate(int (*operate_fp)(int, int), int, int);
int main()
{
    int result;
    result = operate(add, 9, 7);
    printf("\n Addition Result = %d", result);
    result = operate(subtract, 9, 7);
    printf("\n Subtraction Result = %d", result);
    return 0;
}
int add(int a, int b)
{ return (a+b); }
int subtract(int a, int b)
{ return (a-b); }
int operate(int (*operate_fp)(int, int), int a, int b)
{
    int result;
    result = (*operate_fp)(a, b);
    return result;
}
POINTERS TO POINTERS
You can use pointers that point to pointers. The pointers in turn, point to data (or even to other
pointers). To declare pointers to pointers just add an asterisk (*) for each level of reference.
For example, if we have:
int x=10;
int *px, **ppx;
px=&x;
ppx=&px;
Now if we write,
printf("\n %d", **ppx);
Then it would print 10, the value of x.
[Figure: x holds the value 10; px holds 1002, the address of x; ppx holds 2004, the address of px.]
CHAPTER 8
STRUCTURES
INTRODUCTION
A structure is basically a user-defined data type that can store related information (even of
different data types) together.
A structure is declared using the keyword struct followed by a structure name. All the variables
of the structure are declared within the structure. A structure type is defined by using the given
syntax.
struct struct-name
{
data_type var-name;
data_type var-name;
...
};
struct student
{
int r_no;
char name[20];
char course[20];
float fees;
};
The structure definition does not allocate any memory. It just gives a template that conveys to
the C compiler how the structure is laid out in memory and gives details of the member names.
Memory is allocated for the structure when we declare a variable of the structure. For example, we
can define a variable of student by writing
struct student stud1;
TYPEDEF DECLARATIONS
When we precede a struct definition with the typedef keyword, the name given at the end of the
definition becomes a new type name. typedef is used to make a construct shorter, with more
meaningful names for types already defined by C or for types that you have declared. With a
typedef declaration, the new name becomes a synonym for the type.
typedef struct student
{
    int r_no;
    char name[20];
    char course[20];
    float fees;
} student;
Now that you have preceded the structure definition with the keyword typedef, student
becomes a new data type. Therefore, you can straight away declare variables of this new
data type as you declare variables of type int, float, char, double, etc. To declare a variable of
structure student you will just write,
student stud1;
INITIALIZATION OF STRUCTURES
Initializing a structure means assigning some constants to the members of the structure.
When the initializer list is shorter than the member list, C initializes the remaining members
automatically: int and float members are set to zero, and char and string members are set to
'\0'. The same defaults apply to structure variables with static storage duration. An automatic
(local) structure variable that has no initializer at all holds indeterminate values.
The initializers are enclosed in braces and are separated by commas. Note that initializers
must match their corresponding types in the structure definition.
struct struct_name
{
    data_type member_name1;
    data_type member_name2;
    data_type member_name3;
    .......................................
} struct_var = {constant1, constant2, constant3, ...};
OR
struct struct_name
{
    data_type member_name1;
    data_type member_name2;
    data_type member_name3;
    .......................................
};
struct struct_name struct_var = {constant1, constant2, constant3, ...};
Each member of a structure can be used just like a normal variable, but its name will be
a bit longer. A structure member variable is generally accessed using a . (dot operator).
For example, to assign values to the individual data members of the structure variable stud1,
we may write,
stud1.r_no = 1;
strcpy(stud1.name, "Rahul");
strcpy(stud1.course, "BCA");
stud1.fees = 45000;
Because name and course are character arrays, they are filled with strcpy() (declared in
string.h) rather than with the assignment operator.
We can assign a structure to another structure of the same type. For example, if we have two
structure variables stud1 and stud2 of type struct student, then to copy every member of stud1
into stud2 we write
stud2 = stud1;
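#include <stdio.h>
int main()
{
    struct student
    {
        int roll_no;
        char name[80];
        float fees;
        char DOB[80];
    };
    struct student stud1;
    printf("\n Enter the roll number : ");
    scanf("%d", &stud1.roll_no);
    printf("\n Enter the name : ");
    scanf("%79s", stud1.name);
    printf("\n Enter the fees : ");
    scanf("%f", &stud1.fees);
    printf("\n Enter the DOB : ");
    scanf("%79s", stud1.DOB);
    printf("\n ROLL No. = %d, NAME = %s, FEES = %f, DOB = %s", stud1.roll_no, stud1.name, stud1.fees, stud1.DOB);
    return 0;
}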
NESTED STRUCTURES
A structure can be placed within another structure. That is, a structure may contain
another structure as its member. Such a structure that contains another structure as its
member is called a nested structure.
typedef struct
{
    char first_name[20];
    char mid_name[20];
    char last_name[20];
}NAME;
typedef struct
{
    int dd;
    int mm;
    int yy;
}DATE;
struct student
{
    int r_no;
    NAME name;
    char course[20];
    DATE DOB;
    float fees;
};
struct student stud1;
strcpy(stud1.name.first_name, "Janak");
strcpy(stud1.name.mid_name, "Raj");
strcpy(stud1.name.last_name, "Thareja");
strcpy(stud1.course, "BCA");
stud1.DOB.dd = 15;
stud1.DOB.mm = 9;
stud1.DOB.yy = 1990;
#include <stdio.h>
int main()
{
    struct DOB
    {
        int day;
        int month;
        int year;
    };
    struct student
    {
        int roll_no;
        char name[100];
        float fees;
        struct DOB date;
    };
    struct student stud1;
    printf("\n Enter the roll number : ");
    scanf("%d", &stud1.roll_no);
    printf("\n Enter the name : ");
    scanf("%99s", stud1.name);
    printf("\n Enter the fees : ");
    scanf("%f", &stud1.fees);
    printf("\n Enter the DOB : ");
    scanf("%d %d %d", &stud1.date.day, &stud1.date.month, &stud1.date.year);
    printf("\n ********STUDENTS DETAILS *******");
    printf("\n ROLL No. = %d", stud1.roll_no);
    printf("\n NAME. = %s", stud1.name);
    printf("\n FEES. = %f", stud1.fees);
    printf("\n DOB = %d - %d - %d", stud1.date.day, stud1.date.month, stud1.date.year);
    return 0;
}
ARRAYS OF STRUCTURES
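The general syntax for declaring an array of structures can be given as,
struct struct_name struct_var[index];
For example, an array of 30 students could be declared as struct student stud[30];
Now, to assign values to the ith student of the class, we will write,
stud[i].r_no = 9;
strcpy(stud[i].name, "RASHI");
strcpy(stud[i].course, "MCA");
stud[i].fees = 60000;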
#include <stdio.h>
int main()
{
    struct student
    {
        int roll_no;
        char name[80];
        float fees;
        char DOB[80];
    };
    struct student stud[50];
    int n, i;
    printf("\n Enter the number of students : ");
    scanf("%d", &n);
    for(i=0;i<n;i++)
    {
        printf("\n Enter the roll number : ");
        scanf("%d", &stud[i].roll_no);
        printf("\n Enter the name : ");
        scanf("%79s", stud[i].name);
        printf("\n Enter the fees : ");
        scanf("%f", &stud[i].fees);
        printf("\n Enter the DOB : ");
        scanf("%79s", stud[i].DOB);
    }
    for(i=0;i<n;i++)
    {
        printf("\n ********DETAILS OF %dth STUDENT*******", i+1);
        printf("\n ROLL No. = %d", stud[i].roll_no);
        printf("\n NAME. = %s", stud[i].name);
        printf("\n FEES. = %f", stud[i].fees);
        printf("\n DOB = %s", stud[i].DOB);
    }
    return 0;
}
To pass any individual member of the structure to a function, we must use the direct
selection operator to refer to the individual members for the actual parameters. The called
function does not know whether the two variables are ordinary variables or structure members.
#include <stdio.h>
typedef struct
{
    int x;
    int y;
}POINT;
void display(int, int);
int main()
{
    POINT p1 = {2, 3};
    display(p1.x, p1.y);
    return 0;
}
void display(int a, int b)
{
    printf("\n The coordinates of the point are: %d %d", a, b);
}
When a structure is passed as an argument, it is passed using the call-by-value method. That is, a
copy of each member of the structure is made. This can be inefficient, especially when the
structure is very big or the function is called frequently. In such situations, passing and
working with pointers may be more efficient.
The general syntax for passing a structure to a function and returning a structure can be given
as,
struct struct_name func_name(struct struct_name struct_var);
The code given below passes a structure to the function using call-by-value method.
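#include <stdio.h>
typedef struct
{
    int x;
    int y;
}POINT;
void display(POINT);
int main()
{
    POINT p1 = {2, 3};
    display(p1);
    return 0;
}
void display(POINT p)
{
    printf("\n The coordinates of the point are: %d %d", p.x, p.y);
}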
C allows us to create a pointer to a structure. As in other cases, a pointer to a structure is never
itself a structure, but merely a variable that holds the address of a structure. The syntax to
declare a pointer to a structure can be given as
struct struct_name
{
data_type member_name1;
data_type member_name2;
.....................................
}*ptr;
OR
struct struct_name *ptr;
For our student structure we can declare a pointer variable by writing
struct student *ptr_stud, stud;
The next step is to assign the address of stud to the pointer using the address operator (&). So
to assign the address, we will write
ptr_stud = &stud;
To access the members of the structure, one way is to write
/* get the structure, then select a member */
(*ptr_stud).roll_no;
An alternative to the above statement can be used by using pointing-to operator (->) as shown
below.
/* the roll_no in the structure ptr_stud points to */
ptr_stud->roll_no = 01;
#include <stdio.h>
#include <string.h>
struct student
{
    int r_no;
    char name[20];
    char course[20];
    float fees;
};
int main()
{
    struct student stud1, *ptr_stud1;
    ptr_stud1 = &stud1;
    ptr_stud1->r_no = 1;
    strcpy(ptr_stud1->name, "Rahul");
    strcpy(ptr_stud1->course, "BCA");
    ptr_stud1->fees = 45000;
    printf("\n DETAILS OF STUDENT");
    printf("\n ---------------------------------------------");
    printf("\n ROLL NUMBER = %d", ptr_stud1->r_no);
    printf("\n NAME = %s", ptr_stud1->name);
    printf("\n COURSE = %s", ptr_stud1->course);
    printf("\n FEES = %f", ptr_stud1->fees);
    return 0;
}
Self-referential structures are those structures that contain a reference to data of their own type.
That is, a self-referential structure, in addition to other data, contains a pointer to data of
the same type as the structure itself. For example, consider the structure node given below.
struct node
{
    int val;
    struct node *next;
};
Here the structure node contains two types of data: an integer val, and next, which is a pointer
to a node. You may wonder why we need such a structure. Self-referential structures are the
foundation of other data structures such as linked lists and trees.
UNION
Like a structure, a union is a collection of variables of different data types. The main difference
between a structure and a union is that, in a union, you can store information in only one
field at any one time.
To better understand a union, think of it as a chunk of memory that is used to store variables of
different types. When a new value is assigned to a field, the existing data is replaced with the
new data.
Thus unions are used to save memory. They are useful for applications that involve multiple
members, where values need not be assigned to all the members at any one time.
DECLARING A UNION
union union-name
{
    data_type var-name;
    data_type var-name;
    ...
};
Again, the typedef keyword can be used to simplify the declaration of union variables.
The most important thing to remember about a union is that the size of a union is at least the
size of its largest field (the compiler may add padding for alignment). This is because a
sufficient number of bytes must be reserved to store the largest field.
#include <stdio.h>
typedef struct
{
    int x, y;
} POINT1;
typedef union
{
    int x;
    int y;
} POINT2;
int main()
{
    POINT1 P1 = {2,3};
    // POINT2 P2 = {4,5}; Illegal with union
    POINT2 P2;
    P2.x = 4;
    P2.y = 5;
    printf("\n The co-ordinates of P1 are %d and %d", P1.x, P1.y);
    printf("\n The co-ordinates of P2 are %d and %d", P2.x, P2.y);
    return 0;
}
OUTPUT
The co-ordinates of P1 are 2 and 3
The co-ordinates of P2 are 5 and 5
Like structures, we can also have arrays of union variables. However, because new data
overwrites the existing data in the other fields, the program may not display the results one
might expect.
#include <stdio.h>
union POINT
{
int x, y;
};
int main()
{
int i;
union POINT points[3];
points[0].x = 2;
points[0].y = 3;
points[1].x = 4;
points[1].y = 5;
points[2].x = 6;
points[2].y = 7;
for(i=0;i<3;i++)
printf("\n Co-ordinates of Points[%d] are %d and %d", i, points[i].x, points[i].y);
return 0;
}
OUTPUT
Co-ordinates of Points[0] are 3 and 3
Co-ordinates of Points[1] are 5 and 5
Co-ordinates of Points[2] are 7 and 7
A union can be very useful when declared inside a structure. Consider an example in which you
want a field of a structure to contain a string or an integer, depending on what the user
specifies. The following code illustrates such a scenario. Note that an unnamed union member
like this one is supported from C11 onward (and as an extension by many older compilers).
#include <stdio.h>
struct student
{
    union
    {
        char name[20];
        int roll_no;
    };
    int marks;
};
int main()
{
    struct student stud;
    char choice;
    printf("\n You can enter the name or roll number of the student");
    printf("\n Do you want to enter the name? (Yes or No) : ");
    scanf(" %c", &choice);
    if(choice=='y' || choice=='Y')
    {
        printf("\n Enter the name : ");
        scanf("%19s", stud.name);
    }
    else
    {
        printf("\n Enter the roll number : ");
        scanf("%d", &stud.roll_no);
    }
    printf("\n Enter the marks : ");
    scanf("%d", &stud.marks);
    if(choice=='y' || choice=='Y')
        printf("\n Name : %s ", stud.name);
    else
        printf("\n Roll Number : %d ", stud.roll_no);
    printf("\n Marks : %d", stud.marks);
    return 0;
}
ENUMERATED DATA TYPES
The enumerated data type is a user-defined type based on the standard integer type.
An enumeration consists of a set of named integer constants. That is, in an enumerated type,
each integer value is assigned an identifier. This identifier (also known as an enumeration
constant) can be used as a symbolic name to make the program more readable.
Enumerations create new data types to contain values that are not limited to the values
fundamental data types may take. The syntax for creating an enumerated data type can be
given as below.
enum enumeration_name { identifier1, identifier2, ..., identifiern };
Consider the example given below, which creates a new type called COLORS to
store color constants.
enum COLORS {RED, BLUE, BLACK, GREEN, YELLOW, PURPLE, WHITE};
If you do not assign any value to a constant, the first one in the list (RED in our case) has the
value 0. Each of the remaining constants has a value 1 more than the previous one. So in our example,
RED = 0, BLUE = 1, BLACK = 2, GREEN = 3, YELLOW = 4, PURPLE = 5, WHITE = 6
If you want to explicitly assign values to these integer constants then you should specifically
mention those values as shown below.
enum COLORS {RED = 2, BLUE, BLACK = 5, GREEN = 7, YELLOW, PURPLE , WHITE = 15};
ENUM VARIABLES
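The syntax for declaring a variable of an enumerated data type can be given as,
enum enumeration_name variable_name;
So to create a variable of COLORS, we can write
enum COLORS bg_color;
Another way is to declare the variables together with the definition of the type:
enum COLORS {RED, BLUE, BLACK, GREEN, YELLOW, PURPLE, WHITE}bg_color, fore_color;
C also permits the typedef keyword to be used with enumerated data types. For example, if we write
typedef enum COLORS color;
then we can straight away declare variables by writing
color forecolor = RED;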
Once the enumerated variable has been declared, values can be stored in it. An
enumerated variable is intended to hold only the values declared for its type. For example, to
assign the color black to the background color, we will write,
bg_color = BLACK;
Once an enumerated variable has been assigned a value, we can store its value in another
variable of the same type as shown below.
enum COLORS bg_color, border_color;
bg_color = BLACK;
border_color = bg_color;
Enumerated types can be converted implicitly or explicitly. For example, the compiler implicitly
converts an enumerated type to an integer when required.
However, when an integer is implicitly converted to an enumerated type, a C compiler may issue a
warning, and a C++ compiler reports an error.
enum COLORS {RED, BLUE, BLACK, GREEN, YELLOW, PURPLE, WHITE};
enum COLORS c;
c = BLACK + WHITE;
Here, c is a variable of an enumerated data type. If we write c = BLACK + WHITE, then logically
the result should be 2 + 6 = 8, which is a value of type int. However, the left-hand side of the
assignment operator is of the type enum COLORS, so the compiler may flag the statement.
To remove the problem, you can do either of two things. First, declare c to be an int. Second,
cast the right-hand side explicitly:
c = (enum COLORS)(BLACK + WHITE);
C also allows comparison operators to be used on enumerated data types. For example,
if(bg_color == BLACK)
    fore_color = WHITE;
Since enumerated types are derived from the integer type, they can be used in a switch-case
statement.
CHAPTER 9
FILE HANDLING IN C
INTRODUCTION TO FILES
A file is a collection of data stored on a secondary storage device such as a hard disk.
Files are used because real-life applications involve large amounts of data, and in such
situations console-oriented I/O operations pose two major problems:
First, it becomes cumbersome and time consuming to handle huge amounts of data through
terminals.
Second, when doing I/O using a terminal, the entire data is lost when either the program is
terminated or the computer is turned off. Therefore, it becomes necessary to store data on
permanent storage (the disks) and read it whenever necessary, without destroying the data.
STREAMS IN C
In C, the standard streams are pre-connected input and output channels between a
text terminal and the program (when it begins execution). A stream is thus a logical
interface to the devices that are connected to the computer.
A stream is widely used as a logical interface to a file, where a file can refer to a disk file, the
computer screen, the keyboard, etc. Although files may differ in form and capabilities, all
streams are the same.
The three standard streams in the C language are standard input (stdin), standard output (stdout)
and standard error (stderr).
Standard input (stdin): Standard input is the stream from which the
program receives its data. The program requests transfer of data using
the read operation. However, not all programs require input. Generally,
unless redirected, input for a program is expected from the keyboard.
Standard output (stdout): Standard output is the stream to which the
program writes its output data. The program requests data transfer
using the write operation. However, not all programs generate output.
Standard error (stderr): Standard error is a separate output stream that
programs use for error messages and diagnostics.
[Figure: the KEYBOARD feeds the PROGRAM through stdin; the PROGRAM writes to the SCREEN through stdout and stderr.]
When a stream linked to a disk file is created, a buffer is automatically created and associated
with the stream. A buffer is a block of memory that is used for temporary storage of
data that has to be read from or written to a file.
Buffers are needed because disk drives are block-oriented devices: they operate
efficiently when data is read or written in blocks of a certain size. The ideal buffer
size is hardware dependent.
The buffer acts as an interface between the stream (which is character-oriented) and the disk
hardware (which is block-oriented). When the program has to write data to the stream, the data is
saved in the buffer until the buffer is full. Then the entire contents of the buffer are written to
the disk as a block.
[Figure: the PROGRAM writes to the BUFFER; data from the buffer is written to the disk file on the DISK.]
Similarly, when reading data from a disk file, the data is read as a block from the file and written into
the buffer. The program then reads data from the buffer. The creation and operation of the buffer is
handled automatically by the system. However, C provides some functions for buffer
manipulation. The data resides in the buffer until the buffer is flushed or written to a file.
TYPES OF FILES
In C, the types of files used can be broadly classified into two categories: text files and
binary files.
A text file is a stream of characters that can be sequentially processed by a computer in the forward
direction. For this reason a text file is usually opened for only one kind of operation (reading,
writing, or appending) at any given time.
Because text files process only characters, data is read from or written to them as a sequence
of characters.
In a text file, each line contains zero or more characters and ends with one or more characters
that specify the end of line. Some implementations limit the length of a line (for example, to
255 characters).
A line in a text file is not a C string, so it is not terminated by a null character. On some
systems (for example, DOS and Windows), when data is written to a text file, each newline
character is converted to a carriage return/line feed pair. Similarly, when data is read from a
text file, each carriage return/line feed pair is converted into a newline character.
Another important thing is that when a text file is used, there are actually two representations of
data: internal and external. For example, an int value will be represented as 2 or 4 bytes of memory
internally, but externally the int value will be represented as a string of characters representing
its decimal or hexadecimal value. To convert the internal representation into the external one, we
can use the printf and fprintf functions. Similarly, to convert an external representation into
the internal one, scanf and fscanf can be used.
BINARY FILES
A binary file is a file which may contain any type of data, encoded in binary form for computer
storage and processing purposes. Like a text file, a binary file is a collection of bytes. Note that
in C a byte and a character are equivalent. Therefore, a binary file is also referred to as a
character stream, with the following two essential differences.
A binary file does not require any special processing of the data, and each byte of data is
transferred to or from the disk unprocessed.
C places no structure on the file, and it may be read from, or written to, in any manner the
programmer wants.
Binary files store data in the internal representation format. Therefore, an int value will be stored
in binary form as a 2 or 4 byte value. The same format is used to store data in memory as well as in
the file. As with a text file, reading a binary file past its last byte signals end-of-file.
In a text file, an integer value 123 will be stored as a sequence of three characters: 1, 2 and 3.
Each character takes 1 byte, so storing the integer value 123 needs 3 bytes. In a binary file,
the int value 123 is stored in sizeof(int) bytes (2 or 4, depending on the machine), in binary form.
The same number of bytes is enough for a large value such as 12345678 (when int is 4 bytes), which
takes 8 bytes as text. Binary files therefore often take less space to store the same data, and
they eliminate the conversion between internal and external representations, which makes them
more efficient than text files.
USING FILES IN C
There can be a number of files on the disk. In order to access a particular file, you must
specify the name of the file that has to be used. This is accomplished by using a file pointer
variable that points to a structure FILE (defined in stdio.h). The file pointer will then be used
in all subsequent operations on the file. The syntax for declaring a file pointer is
FILE *file_pointer_name;
For example,
FILE *fp;
An error will be generated if you use the file name, rather than the file pointer, to access a file.
Opening a File
A file must be first opened before data can be read from it or written to it. In order to open a file
and associate it with a stream, the fopen() function is used. The prototype of fopen() can be
given as:
FILE *fopen(const char *file_name, const char *mode);
Using the above prototype, the file whose pathname is the string pointed to by file_name is
opened in the mode specified using the mode. If successful, fopen() returns a pointer to a FILE
structure, and if it fails, it returns NULL.
MODE         DESCRIPTION
r            Open a text file for reading. If the file does not exist, fopen() fails and returns NULL.
w            Open a text file for writing. If the file does not exist it is created; if it already exists, its contents are deleted.
a            Open a text file for appending. Data is written at the end of the file; the file is created if it does not exist.
rb           Open a binary file for reading. The b indicates binary.
wb           Open a binary file for writing.
ab           Open a binary file for appending.
r+           Open a text file for both reading and writing. The stream is positioned at the beginning of the file. The file must already exist.
w+           Open a text file for both reading and writing. The file is created if it does not exist, and truncated if it exists.
a+           Open a text file for both reading and writing. The stream is positioned at the end of the file content.
r+b / rb+    Open a binary file for both reading and writing.
w+b / wb+    Create or truncate a binary file for both reading and writing.
a+b / ab+    Open a binary file for reading and appending.
The fopen() can fail to open the specified file under certain conditions that are listed below:
Opening a file that is not ready for usage
Opening a file that is specified to be on a non-existent directory/drive
Opening a non-existent file for reading
Opening a file to which access is not permitted
FILE *fp;
fp = fopen("Student.DAT", "r");
if(fp==NULL)
{
printf("\n The file could not be opened");
exit(1);
}
OR
char filename[30];
FILE *fp;
scanf("%29s", filename);   /* read the file name, at most 29 characters */
fp = fopen(filename, "r+");
if(fp==NULL)
{
printf("\n The file could not be opened");
exit(1);
}
To close an open file, the fclose() function is used, which disconnects a file pointer from a file.
After fclose() has disconnected the file pointer from the file, the pointer can be used to
access a different file, or the same file in a different mode.
The fclose() function not only closes the file but also flushes all the buffers that are maintained
for that file.
If you do not close a file after using it, the system closes it automatically when the program
exits. However, since there is a limit on the number of files which can be open simultaneously,
the programmer should close a file when it is no longer needed. The prototype of the fclose()
function can be given as,
int fclose(FILE *fp);
Here, fp is a file pointer which points to the file that has to be closed. The function returns an
integer value which indicates whether fclose() was successful or not. Zero is returned if
the function was successful, and EOF is returned if an error occurred.
fscanf()
The fscanf() function is used to read formatted data from a stream. The syntax of fscanf() can be
given as,
int fscanf(FILE *stream, const char *format, ...);
fscanf() reads data from the stream and stores it, according to the format parameter, in the
locations pointed to by the additional arguments.
#include <stdio.h>
#include <stdlib.h>
int main()
{
    FILE *fp;
    char name[80];
    int roll_no;
    fp = fopen("Student.DAT", "r");
    if(fp==NULL)
    {
        printf("\n The file could not be opened");
        exit(1);
    }
    printf("\n Enter the name and roll number of the student : ");
    fscanf(stdin, "%79s %d", name, &roll_no); /* read from keyboard */
    printf("\n NAME : %s \t ROLL NUMBER = %d", name, roll_no);
    // READ FROM FILE- Student.DAT
    fscanf(fp, "%79s %d", name, &roll_no);
    printf("\n NAME : %s \t ROLL NUMBER = %d", name, roll_no);
    fclose(fp);
    return 0;
}
fgets()
fgets() stands for file get string. The fgets() function is used to get a string from a stream. The
syntax of fgets() can be given as:
char *fgets(char *str, int size, FILE *stream);
The fgets() function reads at most one less than the number of characters specified by size
(that is, size - 1 characters) from the given stream and stores them in the string str. fgets()
stops reading as soon as it encounters a newline character, the end of file, or an error.
If a newline character is encountered, it is retained. When the characters have been read
without any error, a '\0' character is appended to end the string.
FILE *fp;
char str[80];
fp = fopen("Student.DAT", "r");
if(fp==NULL)
{
    printf("\n The file could not be opened");
    exit(1);
}
/* read and display the file one line at a time */
while(fgets(str, 80, fp) != NULL)
    printf("\n %s", str);
fclose(fp);
fgetc()
The fgetc() function returns the next character from stream, or EOF if the end of file is reached
or if there is an error. The syntax of fgetc() can be given as
int fgetc(FILE *stream);
fgetc() returns the character read as an int, or returns EOF to indicate an error or end of file.
fgetc() reads a single character from the current position of a file (the file associated with stream).
After reading the character, the function advances the associated file position indicator (if
defined) to point to the next character. However, if the stream has already reached the end of
file, the end-of-file indicator for the stream is set.
FILE *fp;
char str[80];
int i, ch;
fp = fopen("Program.C", "r");
if(fp==NULL)
{
    printf("\n The file could not be opened");
    exit(1);
}
// Read up to 79 characters and store them in str
ch = fgetc(fp);
for( i=0; (i < 79) && (ch != EOF); i++ )
{
    str[i] = (char)ch;
    ch = fgetc(fp);
}
str[i] = '\0';
printf("\n %s", str);
fclose(fp);
fread()
The fread() function is used to read data from a file. Its syntax can be given as
size_t fread(void *str, size_t size, size_t num, FILE *stream);
The function fread() reads num number of objects (where each object is size bytes) and places
them into the array pointed to by str. The data is read from the given input stream.
Upon successful completion, fread() returns the number of objects successfully read. This number
will be less than num if a read error or end-of-file is encountered. If size or num is 0,
fread() will return 0 and the contents of str and the state of the stream remain unchanged. In
case of error, the error indicator for the stream will be set.
The fread() function advances the file position indicator for the stream by the number of bytes
read.
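FILE *fp;
char str[11];
size_t n;
fp = fopen("Letter.TXT", "r+");
if(fp==NULL)
{
    printf("\n The file could not be opened");
    exit(1);
}
n = fread(str, 1, 10, fp);   /* read up to 10 characters */
str[n] = '\0';
printf("\n First %lu characters of the file are : %s", (unsigned long)n, str);
fclose(fp);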
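fprintf()
The fprintf() function is used to write formatted output to a stream. Its prototype can be given as
int fprintf(FILE *stream, const char *format, ...);
FILE *fp;
int i;
char name[20];
float salary;
fp = fopen("Details.TXT", "w");
if(fp==NULL)
{
    printf("\n The file could not be opened");
    exit(1);
}
for(i=0;i<10;i++)
{
    puts("\n Enter your name : ");
    scanf(" %19[^\n]", name);   /* read one line, at most 19 characters */
    puts("\n Enter your salary : ");
    scanf("%f", &salary);
    fprintf(fp, " (%d) NAME : [%-10.10s] \t SALARY : %5.2f", i, name, salary);
}
fclose(fp);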
fputs()
fputs() is used to write a line to a file. The syntax of fputs() can be given as
int fputs(const char *str, FILE *stream);
fputs() writes the string pointed to by str to the stream pointed to by stream. On successful
completion, fputs() returns a non-negative value. In case of any error, fputs() returns EOF.
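#include <stdio.h>
#include <stdlib.h>
int main()
{
    FILE *fp;
    char feedback[100];
    fp = fopen("Comments.TXT", "w");
    if(fp==NULL)
    {
        printf("\n The file could not be opened");
        exit(1);
    }
    /* fgets() is a bounded replacement for gets() */
    if(fgets(feedback, sizeof(feedback), stdin) != NULL)
        fputs(feedback, fp);
    fclose(fp);
    return 0;
}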
fputc()
The fputc() function writes the byte specified by c (converted to an unsigned char) to the
output stream pointed to by stream. Its syntax can be given as
int fputc(int c, FILE *stream);
Upon successful completion, fputc() returns the value it has written. Otherwise, in case of error,
the function returns EOF and the error indicator for the stream is set.
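#include <stdio.h>
#include <stdlib.h>
int main()
{
    FILE *fp;
    char feedback[100];
    int i;
    fp = fopen("Comments.TXT", "w");
    if(fp==NULL)
    {
        printf("\n The file could not be opened");
        exit(1);
    }
    if(fgets(feedback, sizeof(feedback), stdin) == NULL)
        feedback[0] = '\0';
    /* write the string one character at a time */
    for(i=0; feedback[i] != '\0'; i++)
        fputc(feedback[i], fp);
    fclose(fp);
    return 0;
}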
fwrite()
fwrite() is used to write data to a file. The syntax of fwrite() can be given as,
size_t fwrite(const void *str, size_t size, size_t count, FILE *stream);
The fwrite() function will write, from the array pointed to by str, up to count objects of size
specified by size, to the stream pointed to by stream. It returns the number of objects
successfully written.
The file-position indicator for the stream (if defined) will be advanced by the number of bytes
successfully written. In case of error, the error indicator for the stream will be set.
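int main(void)
{
FILE *fp;
size_t count;
char str[] = "GOOD MORNING"; /* sample data, chosen only for illustration */
fp = fopen("Welcome.txt", "wb");
if(fp==NULL)
exit(1);
count = fwrite(str, 1, sizeof(str) - 1, fp); /* count = number of bytes written */
fclose(fp);
return 0;
}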
fwrite() can be used to write characters, integers, structures, etc. to a file. However, fwrite() is
normally used with files that are opened in binary mode, because it writes the bytes exactly as they
are stored in memory.
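The pairing of fwrite() and fread() can be sketched as follows. This is only an illustration: the struct record type and the function name round_trip are assumptions made for the example, and tmpfile() is used so that no file name has to be invented.

```c
#include <stdio.h>

/* Hypothetical record type used only for this illustration. */
struct record {
    int  id;
    char name[20];
};

/* Writes n records to a temporary binary file, rewinds, and reads them
   back into out. Returns the number of records read back, or 0 on failure. */
size_t round_trip(const struct record *in, struct record *out, size_t n)
{
    FILE *fp = tmpfile();                      /* temporary binary stream */
    size_t written, got;

    if (fp == NULL)
        return 0;

    written = fwrite(in, sizeof *in, n, fp);   /* returns objects written */
    if (written != n) {
        fclose(fp);
        return 0;
    }

    rewind(fp);                                /* back to the start */
    got = fread(out, sizeof *out, n, fp);      /* returns objects read */
    fclose(fp);
    return got;
}
```

Note that both functions report a count of objects, not bytes, so comparing the return value with n is the usual way to detect a short read or write.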
Oxford University Press 2011. All rights reserved.
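While reading the file in text mode, character by character, the programmer can compare the
character that has been read with EOF, which is a symbolic constant defined in stdio.h with
a negative value (usually -1). The variable that receives the character should be declared as int
so that EOF can be distinguished from valid characters.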
while(1)
{
c = fgetc(fp);
if (c==EOF)
break;
printf("%c", c);
}
The other way is to use the standard library function feof(), which is declared in stdio.h. The feof()
function returns zero (false) when the end of file has not been reached and a non-zero value (true) if the end-of-file has been reached.
while( !feof(fp) )
{
if( fgets(str, 80, fp) == NULL )
break;
printf("%s", str);
}
It is not uncommon that an error may occur while reading data from or writing data to a file. For
example, an error may arise
When trying to read a file beyond the EOF indicator
When trying to read a file that does not exist
When trying to use a file that has not been opened
When trying to use a file in an inappropriate mode, that is, writing data to a file that has been
opened for reading
When writing to a file that is write-protected
The function ferror() is used to check for errors in the stream. Its prototype can be given as
int ferror ( FILE *stream);
It returns a zero if no errors have occurred and a non-zero value if there is an error. In case of an
error, the programmer can determine which error has occurred by using the perror().
FILE *fp;
char feedback[100];
int i;
fp = fopen("Comments.TXT", "w");
printf("\n Kindly give the feedback on this book : ");
fgets(feedback, sizeof(feedback), stdin);
for(i=0; feedback[i]!='\0'; i++)
fputc(feedback[i], fp);
if(ferror(fp))
{
printf("\n Error writing in file");
exit(1);
}
fclose(fp);
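Because fgetc() returns EOF both at the end of the file and on a read error, feof() and ferror() are often used together after a read loop. The sketch below illustrates this; the function name count_chars is an assumption made for the example.

```c
#include <stdio.h>

/* Counts the characters in an open stream.
   Returns the count, or -1 if the loop stopped because of a read error. */
long count_chars(FILE *fp)
{
    long n = 0;
    int c;                    /* int, so that EOF can be told apart from data */

    while ((c = fgetc(fp)) != EOF)
        n++;

    if (ferror(fp))           /* EOF was returned because of an error */
        return -1;
    return n;                 /* otherwise feof(fp) is true: genuine end of file */
}
```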
clearerr()
The function clearerr() is used to clear the end-of-file and error indicators for the stream. Its
prototype can be given as
void clearerr(FILE *stream);
The clearerr() clears the error for the stream pointed to by stream. The function is used
because error indicators are not automatically cleared; once the error indicator for a specified
stream is set, operations on that stream continue to return an error value until clearerr, fseek,
fsetpos, or rewind is called.
FILE *fp;
char feedback[100];
int i;
fp = fopen("Comments.TXT", "w");
if(fp==NULL)
{
perror("OOPS ERROR");
printf("\n error no = %d", errno);
exit(1);
}
printf("\n Kindly give the feedback on this book : ");
fgets(feedback, sizeof(feedback), stdin);
for(i=0; feedback[i]!='\0'; i++)
{
fputc(feedback[i], fp);
if (ferror(fp))
{
clearerr(fp);
break;
}
}
fclose(fp);
perror()
perror() stands for print error. The perror() function is used to report errors in C programs.
When called, perror() displays a message on stderr describing the most recent error that
occurred during a library function call or system call. The prototype of perror() can be given as
void perror(const char *msg);
The perror() takes one argument msg which points to an optional user-defined message. This
message is printed first, followed by a colon and the implementation-defined message that
describes the most recent error.
If a call to perror() is made when no error has actually occurred, then a message such as No error
will be displayed. The most important thing to remember is that a call to perror() does nothing to deal
with the error condition.
#include<stdio.h>
#include<stdlib.h>
#include<errno.h>
main()
{
FILE *fp;
char feedback[100];
int i;
fp = fopen("Comments.TXT", "w");
if(fp==NULL)
{
perror("OOPS ERROR");
printf("\n error no = %d", errno);
exit(1);
}
printf("\n Kindly give the feedback on this book : ");
fgets(feedback, sizeof(feedback), stdin);
for(i=0; feedback[i]!='\0'; i++)
fputc(feedback[i], fp);
fclose(fp);
}
OUTPUT
OOPS ERROR : No such file or directory
errno =2
Command-line arguments are given after the name of a program in command-line operating
systems like DOS or Linux, and are passed in to the program from the operating system. To use
them, main() is declared with two parameters:
int main(int argc, char *argv[])
The integer argc specifies the number of arguments passed into the program from the
command line, including the name of the program.
The array of character pointers argv contains the list of all the arguments. argv[0] is the name
of the program, or an empty string if the name is not available. argv[1] to argv[argc - 1]
specify the command-line arguments. In the C program, every element in argv can be used
as a string.
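As a small illustration of how argc and argv are used, the sketch below prints every argument and adds up the numeric ones. The helper name sum_arguments is an assumption made for this example; a program would call it from main(int argc, char *argv[]) by passing both parameters on unchanged.

```c
#include <stdio.h>
#include <stdlib.h>

/* Prints every command-line argument and returns the sum of the
   arguments after the program name, each converted with atoi(). */
int sum_arguments(int argc, char *argv[])
{
    int i, sum = 0;

    for (i = 0; i < argc; i++)
        printf("argv[%d] = %s\n", i, argv[i]);

    for (i = 1; i < argc; i++)      /* argv[0] is the program name */
        sum += atoi(argv[i]);
    return sum;
}
```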
fseek() is used to reposition a stream. The prototype of fseek() can be given as,
int fseek(FILE *stream, long offset, int origin);
fseek() is used to set the file position pointer for the given stream. Offset is a long integer value that
gives the number of bytes to move forward or backward in the file. Offset may be positive or
negative, provided it makes sense. For example, you cannot specify a negative offset if you are
starting at the beginning of the file. The origin value should have one of the following values
(defined in stdio.h):
SEEK_SET: to perform input or output on offset bytes from start of the file
SEEK_CUR: to perform input or output on offset bytes from the current position in the file
SEEK_END: to perform input or output on offset bytes from the end of the file
SEEK_SET, SEEK_CUR and SEEK_END are defined constants, commonly with the values 0, 1 and 2
respectively.
On successful operation, fseek() returns zero and in case of failure, it returns a non-zero value.
For example, a seek on a stream that does not support positioning (such as the keyboard) fails
and returns a non-zero value. For a file opened in text mode, the offset should either be zero or a
value previously returned by ftell() and used with SEEK_SET.
fseek() can be used to move the file pointer beyond the end of a file, but not before the beginning.
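A common use of fseek() with SEEK_END is to find the size of a file. The following is a minimal sketch, assuming the stream was opened in binary mode; the function name file_size is chosen only for this example.

```c
#include <stdio.h>

/* Returns the size in bytes of a stream opened in binary mode,
   or -1L on failure. The original position is restored afterwards. */
long file_size(FILE *fp)
{
    long start, size;

    start = ftell(fp);                    /* remember where we are */
    if (start == -1L)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) != 0)     /* jump to the end */
        return -1L;
    size = ftell(fp);                     /* offset of the end = size */
    if (fseek(fp, start, SEEK_SET) != 0)  /* go back */
        return -1L;
    return size;
}
```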
Write a program to print the records in reverse order. The file must be opened in binary
mode. Use fseek()
#include<stdio.h>
#include<stdlib.h>
#include<conio.h>
int main()
{
struct employee
{
int emp_code;
char name[20];
int hra;
int da;
int ta;
};
FILE *fp;
struct employee e;
int result, i;
fp = fopen("employee.txt", "rb");
if(fp==NULL)
{
printf("\n Error opening file");
exit(1);
}
for(i=5;i>=0;i--)
{
fseek(fp, i*sizeof(e), SEEK_SET);
fread(&e, sizeof(e), 1, fp);
printf("\n EMPLOYEE CODE : %d", e.emp_code);
printf("\n Name : %s", e.name);
printf("\n HRA, TA and DA : %d %d %d", e.hra, e.ta, e.da);
}
fclose(fp);
getch();
return 0;
}
rewind()
rewind() is used to adjust the position of file pointer so that the next I/O operation will take
place at the beginning of the file. Its prototype can be given as
void rewind(FILE *stream);
fgetpos()
The fgetpos() is used to determine the current position of the stream. Its prototype can be
given as
int fgetpos(FILE *stream, fpos_t *pos);
Here, stream is the file whose current file pointer position has to be determined. pos is used to
point to the location where fgetpos() can store the position information. The pos variable is of
type fpos_t which is defined in stdio.h and is basically an object that can hold every possible
position in a FILE.
On success, fgetpos() returns zero and in case of error a non-zero value is returned. Note that
the value of pos obtained through fgetpos() can be used by the fsetpos() to return to this same
position.
fsetpos()
The fsetpos() is used to move the file position indicator of a stream to the location indicated by
the information obtained in "pos" by making a call to the fgetpos(). Its prototype is
int fsetpos(FILE *stream, const fpos_t *pos);
Here, stream points to the file whose file pointer indicator has to be re-positioned. pos points to
positioning information as returned by "fgetpos".
On success, fsetpos() returns a zero and clears the end-of-file indicator. In case of failure it
returns a non-zero value.
// The program opens a file and reads bytes at several different locations.
// Note: it treats fpos_t as an integer type, which holds only on some compilers.
#include <stdio.h>
#include <stdlib.h>
main()
{
FILE *fp;
fpos_t pos;
char feedback[21] = ""; // one extra byte keeps the string terminated
fp = fopen("comments.txt", "rb");
if(fp == NULL)
{
printf("\n Error opening file");
exit(1);
}
// Read some data and then check the position.
fread( feedback, sizeof(char), 20, fp);
if( fgetpos(fp, &pos) != 0 )
{
printf("\n Error in fgetpos()");
exit(1);
}
fread(feedback, sizeof(char), 20, fp);
printf("\n 20 bytes at byte %ld: %s", (long)pos, feedback);
// Set a new position and read more data
pos = 90;
if( fsetpos(fp, &pos ) != 0 )
{
printf("\n Error in fsetpos()");
exit(1);
}
fread( feedback, sizeof(char), 20, fp);
printf( "\n 20 bytes at byte %ld: %s", (long)pos, feedback);
fclose(fp);
}
ftell()
The ftell() function is used to know the current position of the file pointer. It is at this position that the
next I/O will be performed. The syntax of ftell(), declared in stdio.h, can be given as:
long ftell(FILE *stream);
On success, the ftell() function returns the current file position (in bytes) for stream. However, in case
of error, ftell() returns -1L.
When using ftell(), an error can occur for one of two reasons:
First, using ftell() with a device that cannot store data (for example, keyboard)
Second, when the position is larger than that can be represented in a long integer. This will usually
happen when dealing with very large files
FILE *fp;
int c;
long n;
fp=fopen("abc","w");
if(fp==NULL)
{
printf("\n Error Opening The File");
exit(1);
}
while((c=getchar())!=EOF)
putc(c,fp);
n = ftell(fp);
fclose(fp);
fp=fopen("abc","r");
if(fp==NULL)
{
printf("\n Error Opening The File");
exit(1);
}
while(ftell(fp)<n)
{
c= fgetc(fp);
printf("%c", c);
}
fclose(fp);
remove()
The remove(), as the name suggests, is used to erase a file. The prototype of remove() as
given in stdio.h can be given as,
int remove(const char *filename);
The remove() will erase the file specified by filename. On success, the function will return
zero and in case of error, it will return a non-zero value.
The rename(), as the name suggests, is used to rename a file. The prototype is:
int rename(const char *oldname, const char *newname);
Here, the oldname specifies the pathname of the file to be renamed and the newname gives
the new pathname of the file.
On success, rename() returns zero. In case of error, it will return a non-zero value and will set
the errno to indicate the error.
The tmpfile() function is used to create a temporary file. The tmpfile() opens the
corresponding stream in binary update mode (wb+). The file created with tmpfile() will
be automatically deleted when all references to the file are closed. That is, the file created
will be automatically closed and erased when the program has been completely executed.
FILE * tmpfile(void);
On success, tmpfile() will return a pointer to the stream of the file that is created. In case of
error, the function will return a null pointer and set errno to indicate the error.
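The return values of rename() and remove() can be checked as in the sketch below. The function name rename_then_remove and the file names passed to it are assumptions made for this illustration.

```c
#include <stdio.h>

/* Creates oldname, renames it to newname, then deletes it.
   Returns 0 if every step succeeded, and -1 otherwise. */
int rename_then_remove(const char *oldname, const char *newname)
{
    FILE *fp = fopen(oldname, "w");

    if (fp == NULL)
        return -1;
    fputs("temporary data\n", fp);
    fclose(fp);                            /* close before renaming */

    if (rename(oldname, newname) != 0) {   /* non-zero means failure */
        perror("rename");
        remove(oldname);
        return -1;
    }
    if (remove(newname) != 0) {
        perror("remove");
        return -1;
    }
    return 0;
}
```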
CHAPTER 10
INTRODUCTION
The preprocessor is a program that processes the source code before it passes through the
compiler. It operates under the control of preprocessor directives, which are usually placed in the
source program before main().
Before the source code is passed through the compiler, it is examined by the preprocessor for
any preprocessor directives. In case the program has some preprocessor directives,
appropriate actions are taken and the source program is then handed over to the compiler.
The preprocessor is executed before the actual compilation of program code begins. Therefore,
the preprocessor expands all the directives and takes the corresponding actions before any code
is generated by the program statements.
The program becomes portable, as preprocessor directives make it easy to compile the program in
different execution environments.
For the same reason, the program also becomes easier to use across those environments.
Types of preprocessor directives:
Unconditional: define, line, undef, include, error, pragma
Conditional: if, else, elif, ifdef, ifndef, endif
#define
To define preprocessor macros we use #define. The #define statement is also known as macro
definition or simply a macro. There are two types of macros: object-like macros and function-like
macros.
An object-like macro is a simple identifier which will be replaced by a code fragment. They are
usually used to give symbolic names to numeric constants. Object-like macros do not take any
argument. It is the same form we have been using to declare constants using the #define directive.
The general syntax of defining a macro can be given as:
#define identifier string
The preprocessor replaces every occurrence of the identifier in the source code (except inside
string literals and comments) by the string.
#define PI 3.14
Function-like macros
When a function is simulated using a macro, the macro definition replaces the function
definition.
The name of the macro serves as the header and the macro body serves as the function body.
The name of the macro will then be used to replace the function call.
References to such macros look like function calls. However, when a macro is referenced,
source code is inserted into the program during preprocessing. The parameters are replaced by the
corresponding arguments, and the text is inserted into the program stream. Therefore, macros
can be faster than functions as they avoid the overhead involved in calling a function.
The following line defines the macro MUL as having two parameters a and b and the
replacement string (a * b):
#define MUL(a,b) (a * b)
Look how the preprocessor changes the following statement provided it appears after the
macro definition.
c = MUL(a,b);
After preprocessing, the statement becomes
c = (a * b);
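Since macro arguments are substituted as text, a replacement string such as (a * b) can give surprising results when an argument is itself an expression. The sketch below contrasts it with a fully parenthesized version; the names MUL_PLAIN, MUL_SAFE, plain_result and safe_result are assumptions made for this illustration.

```c
#include <stdio.h>

#define MUL_PLAIN(a, b)  (a * b)        /* replacement string as given in the text */
#define MUL_SAFE(a, b)   ((a) * (b))    /* each parameter parenthesized */

/* MUL_PLAIN(2 + 3, 4) expands to (2 + 3 * 4), which is 14. */
int plain_result(void) { return MUL_PLAIN(2 + 3, 4); }

/* MUL_SAFE(2 + 3, 4) expands to ((2 + 3) * (4)), which is 20. */
int safe_result(void)  { return MUL_SAFE(2 + 3, 4); }
```

For this reason it is common practice to enclose every parameter of a function-like macro in parentheses.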
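#include<stdio.h>
#define PRINT(num) printf( #num " = %d", num)
main()
{
PRINT(20);
}
The # operator converts the macro argument into a string literal, so PRINT(20) expands to
printf( "20" " = %d", 20);
Finally, adjacent string literals are automatically concatenated into one string. So
the above statement will become
printf("20 = %d", 20);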
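#include<stdio.h>
#define JOIN(A,B) A##B
main()
{
int USER1 = 10, USER2 = 20;
printf("\n USER1 = %d", JOIN(USER, 1));
printf("\n USER2 = %d", JOIN(USER, 2));
}
The above program would print
USER1 = 10
USER2 = 20
Note that ## pastes tokens during preprocessing, so it cannot use the run-time value of a loop
variable, and macros are never expanded inside string literals.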
#include
An external file containing function, variables or macro definitions can be included as a part of
our program. This avoids the effort to re-write the code that is already written.
The #include directive is used to inform the preprocessor to treat the contents of a specified file
as if those contents had appeared in the source program at the point where the directive
appears.
The #include directive can be used in two forms. Both forms make the preprocessor insert the
entire contents of the specified file into the source code of our program. However, the difference
between the two is the way in which they search for the specified file.
#include <filename>
This variant is used for system header files. When we include a file using angular brackets, a
search is made for the file named filename in a standard list of system directories.
#include "filename"
This variant is used for header files of your own program. When we include a file using double
quotes, the preprocessor searches the file named filename first in the directory containing the
current file, then in the quote directories and then the same directories used for <filename>.
#undef
As the name suggests, the #undef directive undefines or removes a macro name previously
created with #define. Undefining a macro means to cancel its definition. This is done by writing
#undef followed by the macro name that has to be undefined.
#line
#include<stdio.h>
main()
{
int a=10:
printf("%d", a);
}
The above program has a compile time error because instead of a semi-colon there is a colon
that ends line int a = 10. So when you compile this program an error is generated during the
compiling process and the compiler will show an error message with references to the name of
the file where the error happened and a line number. This makes it easy to detect the
erroneous code and rectify it.
The #line directive enables the users to control the line numbers within the code files as well as
the file name that appears when an error takes place. The syntax of the #line directive
is:
#line line_number filename
Here, line_number is the new line number that will be assigned to the next code line. The line
numbers of successive lines will be increased one by one from this point on.
Filename is an optional parameter that redefines the file name that will appear in case an error
occurs. The filename must be enclosed within double quotes. If no filename is specified then
the compiler will show the original file name. For example:
#include<stdio.h>
main()
{
#line 10 "Error.C"
int a=10:
#line 20
printf("%d", a);
}
PRAGMA DIRECTIVES
The #pragma directive is used to control the actions of the compiler in a particular portion of a
program without affecting the program as a whole.
The effect of pragma will be applied from the point where it is included to the end of the
compilation unit or until another pragma changes its status.
#pragma string
Instructions that can appear in a #pragma directive are compiler-specific and include:
COPYRIGHT, COPYRIGHT_DATE, HP_SHLIB_VERSION, LOCALITY, OPTIMIZE, OPT_LEVEL, VERSIONID
CONDITIONAL DIRECTIVES
A conditional directive is used to instruct the preprocessor to select whether or not to include a
chunk of code in the final token stream passed to the compiler. The preprocessor conditional
directives can test arithmetic expressions, or whether a name is defined as a macro, etc.
A program may need to use different code depending on the machine or operating system it is
to run on.
The conditional preprocessor directive is very useful when you want to compile the same
source file into two different programs. While one program might make frequent time-consuming
consistency checks on its intermediate data, or print the values of those data for
debugging, the other program can avoid such checks.
The conditional preprocessor directives can also be used to exclude code whose
condition is always false but which needs to be kept as a sort of comment for future reference.
#ifdef
#ifdef is the simplest sort of conditional preprocessor directive and is used to check for the
existence of macro definitions. Its syntax can be given as:
#ifdef MACRO
controlled text
#endif
#ifdef MAX
int STACK[MAX];
#endif
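The effect of #ifdef can be seen in the following sketch, in which the array and the code that uses it are compiled only when MAX is defined. The value 5 and the function name stack_capacity are assumptions made for this illustration.

```c
#include <stdio.h>

#define MAX 5          /* comment this line out to drop the stack code below */

#ifdef MAX
int STACK[MAX];        /* compiled only when MAX is defined */
#endif

/* Returns the capacity of the stack, or 0 if it was compiled out. */
int stack_capacity(void)
{
#ifdef MAX
    return (int)(sizeof STACK / sizeof STACK[0]);
#else
    return 0;
#endif
}
```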
#ifndef
The #ifndef directive is the opposite of #ifdef directive. It checks whether the MACRO has not
been defined or if its definition has been removed with #undef.
#ifndef succeeds, and the controlled text is included, if the MACRO has not been defined.
Otherwise, that is when the MACRO has already been defined, the test fails and the
controlled text is skipped.
#ifndef MACRO
controlled text
#endif
The #if directive is used to control the compilation of portions of a source file. If the specified
condition (after the #if) has a nonzero value, the controlled text immediately following the #if
directive is retained in the translation unit.
#if condition
controlled text
#endif
While using #if directive, make sure that each #if directive must be matched by a closing #endif
directive. Any number of #elif directives can appear between the #if and #endif directives, but at
most one #else directive is allowed. However, the #else directive (if present) must be the last
directive before #endif.
The #endif directive ends the scope of the #if , #ifdef , #ifndef , #else , or #elif directives.
The defined operator is written as defined(MACRO) or defined MACRO. The expression evaluates
to 1 if MACRO is defined and to 0 if it is not. The defined operator helps you to check for macro
definitions in one concise line without having to use many #ifdef or #ifndef directives.
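A short sketch of the defined operator combined with #if and #elif is given below. The macro names USE_LOGGING, USE_COLOUR and MODE and the function logging_mode are hypothetical and serve only this example.

```c
#include <stdio.h>

#define USE_LOGGING                 /* hypothetical feature switches */
/* #define USE_COLOUR */

/* Two macros are tested in a single #if line using defined. */
#if defined(USE_LOGGING) && !defined(USE_COLOUR)
#define MODE "plain logging"
#elif defined(USE_LOGGING) && defined(USE_COLOUR)
#define MODE "colour logging"
#else
#define MODE "no logging"
#endif

const char *logging_mode(void) { return MODE; }
```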
#error DIRECTIVE
The #error directive is used to produce compile-time error messages. The syntax of
this directive is:
#error string
The error messages include the argument string. The #error directive is usually used
to detect programmer inconsistencies and violation of constraints during
preprocessing.
When #error directive is encountered, the compilation process terminates and the
message specified in string is printed to stderr. For example,
#ifndef SQUARE
#error MACRO not defined.
#endif
CHAPTER 11
LINKED LISTS
INTRODUCTION
A linked list in simple terms is a linear collection of data elements. These data elements are
called nodes.
Linked list is a data structure which in turn can be used to implement other data structures.
Thus, it acts as building block to implement data structures like stacks, queues and their
variations.
A linked list can be perceived as a train or a sequence of nodes in which each node contains one
or more data fields and a pointer to the next node.
[Figure: a linked list of integer nodes, with START pointing to the first node]
In the above linked list, every node contains two parts- one integer and the other a pointer to the
next node. The left part of the node which contains data may include a simple data type, an array or
a structure. The right part of the node contains a pointer to the next node (or address of the next
node in sequence). The last node will have no next node connected to it, so it will store a special
value called NULL.
INTRODUCTION contd.
Linked list contains a pointer variable, START which stores the address of the first node in the
list.
We can traverse the entire list using a single pointer variable START. START will
contain the address of the first node; the next part of the first node will in turn store the address
of its succeeding node.
Using this technique the individual nodes of the list will form a chain of nodes. If START =
NULL, this means that the linked list is empty and contains no nodes.
struct node
{
int data;
struct node *next;
};
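Using this node structure, building and traversing a list can be sketched as follows. The function names push_front and print_list are assumptions made for this example; nodes are obtained with malloc().

```c
#include <stdio.h>
#include <stdlib.h>

struct node
{
    int data;
    struct node *next;
};

/* Adds a new node holding val at the front of the list and
   returns the new START pointer (or the old one if malloc fails). */
struct node *push_front(struct node *start, int val)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return start;
    n->data = val;
    n->next = start;
    return n;
}

/* Walks the list from START until NULL, printing each value.
   Returns the number of nodes visited. */
int print_list(const struct node *start)
{
    int count = 0;
    const struct node *ptr;

    for (ptr = start; ptr != NULL; ptr = ptr->next) {
        printf("%d ", ptr->data);
        count++;
    }
    printf("\n");
    return count;
}
```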
[Figure: a linked list stored in memory, showing the START and AVAIL pointers and the DATA and NEXT columns]
... memory, we will first find any free space in the memory and then use ...
... the CPU idle or whenever the programs are falling short of memory.
The operating system scans through all the memory cells and marks
the cells that are being used by some program. Then, it
collects all those cells which are not being used and adds their
addresses to the free pool so that they can be reused by the programs.
This process is called garbage collection. The whole process of
collecting unused memory cells (garbage collection) is transparent to
the programmer.
An array is linear collection of data elements and a linked list is a linear collection of nodes. But
unlike an array, a linked list does not store its nodes in consecutive memory locations.
Another point of difference between an array and a linked list is that a linked list does not allow
random access of data. Nodes in a linked list can be accessed only in a sequential manner.
But unlike an array, insertions and deletions can be done at any point in the list in constant time, once the node at that point has been located.
Another advantage of linked list over an array is that, we can add any number of elements in
the list. This is not possible in case of an array. For example, if we declare an array as int
marks[10], then the array can store maximum ten data elements but not even a single more
than that. There is no such restriction in case of a linked list.
Thus linked lists provide an efficient way of storing related data and perform basic operations
such as insertion, deletion and updating of information at the cost of extra space required for
storing address of the next node.
A singly linked list is the simplest type of linked list in which every node contains some data and
a pointer to the next node of the same data type. By saying that the node contains a pointer to
the next node we mean that the node stores the address of the next node in sequence.
Algorithm to print the information stored in each node of the linked list
[Figure: linked list diagrams showing the START and PTR pointers]
Algorithm to insert a new node after a node that has value NUM
Step 1: IF AVAIL = NULL, then
Write OVERFLOW
Go to Step 12
[END OF IF]
Step 2: SET New_Node = AVAIL
Step 3: SET AVAIL = AVAIL->NEXT
Step 4: SET New_Node->DATA = VAL
Step 5: SET PTR = START
Step 6: SET PREPTR = PTR
Step 7: Repeat Step 8 and 9 while PREPTR->DATA != NUM
Step 8: SET PREPTR = PTR
Step 9: SET PTR = PTR->NEXT
[END OF LOOP]
Step 10: SET PREPTR->NEXT = New_Node
Step 11: SET New_Node->NEXT = PTR
Step 12: EXIT
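The idea of inserting a node after the node that holds NUM can be sketched in C as below. This is not a line-by-line transcription of the steps above: it searches with a single pointer, uses malloc() in place of the AVAIL list, and also reports the case where NUM is not found. The function name insert_after is an assumption made for this example.

```c
#include <stdlib.h>

struct node
{
    int data;
    struct node *next;
};

/* Inserts a node holding val after the first node whose data equals num.
   Returns 1 on success, 0 if num is not found or memory is exhausted. */
int insert_after(struct node *start, int num, int val)
{
    struct node *ptr = start;
    struct node *new_node;

    while (ptr != NULL && ptr->data != num)   /* search for num */
        ptr = ptr->next;
    if (ptr == NULL)
        return 0;                             /* num is not in the list */

    new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return 0;                             /* the OVERFLOW case */

    new_node->data = val;
    new_node->next = ptr->next;               /* link to the successor */
    ptr->next = new_node;                     /* link from the predecessor */
    return 1;
}
```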
[Figure: inserting a new node after a given node, showing the START, PREPTR and PTR pointers]
Algorithm to delete the node after a given node from the linked list
Step 1: IF START = NULL, then
Write UNDERFLOW
Go to Step 10
[END OF IF]
Step 2: SET PTR = START
Step 3: SET PREPTR = PTR
Step 4: Repeat Step 5 and 6 while PREPTR->DATA != NUM
Step 5: SET PREPTR = PTR
Step 6: SET PTR = PTR->NEXT
[END OF LOOP]
Step 7: SET TEMP = PTR->NEXT
Step 8: SET PREPTR->NEXT = TEMP->NEXT
Step 9: FREE TEMP
Step 10: EXIT
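A C sketch of deleting the node that follows a given node is shown below. It is written directly from the stated goal rather than transcribed from the steps above, it assumes that the nodes were allocated with malloc(), and the function name delete_after is chosen only for this example.

```c
#include <stdlib.h>

struct node
{
    int data;
    struct node *next;
};

/* Deletes the node that follows the first node whose data equals num.
   Returns 1 on success, 0 if num is not found or has no successor. */
int delete_after(struct node *start, int num)
{
    struct node *ptr = start;
    struct node *temp;

    while (ptr != NULL && ptr->data != num)   /* search for num */
        ptr = ptr->next;
    if (ptr == NULL || ptr->next == NULL)
        return 0;                             /* nothing to delete */

    temp = ptr->next;                         /* node to be removed */
    ptr->next = temp->next;                   /* bypass it */
    free(temp);                               /* return it to free memory */
    return 1;
}
```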
[Figure: deleting the node after a given node, showing the START, PREPTR and PTR pointers]
In a circular linked list, the last node contains a pointer to the first node of the list. We can have
a circular singly linked list as well as a circular doubly linked list. While traversing a circular linked
list, we can begin at any node and traverse the list in any direction, forward or backward (backward
only in a doubly linked list), until we reach the same node where we had started. Thus, a circular
linked list has no beginning and no ending.
The prev field of the first node and the next field of the last node will contain NULL. The
prev field is used to store the address of the preceding node. This enables the list to be
traversed in the backward direction as well.
A circular doubly linked list or a circular two way linked list is a more complex type of linked
list which contains a pointer to the next as well as previous node in the sequence.
The difference between a doubly linked and a circular doubly linked list is same as that
exists between a singly linked list and a circular linked list. The circular doubly linked list
does not contain NULL in the previous field of the first node and the next field of the last
node. Rather, the next field of the last node stores the address of the first node of the list,
i.e., START. Similarly, the previous field of the first node stores the address of the last node.
Since a circular doubly linked list contains three parts in its structure, it calls for more space
per node and for more expensive basic operations. However, it provides the ease to
manipulate the elements of the list as it maintains pointers to nodes in both the directions .
The main advantage of using a circular doubly linked list is that it makes searches twice as
efficient.
A header linked list is a special type of linked list which contains a header node at the
beginning of the list. So, in a header linked list START will not point to the first node of the list
but START will contain the address of the header node. There are basically two variants of a
header linked list-
Grounded header linked list which stores NULL in the next field of the last node
Circular header linked list which stores the address of the header node in the next field of the
last node. Here, the header node will denote the end of the list.
[Figure: grounded and circular header linked lists, each with START pointing to the header node]
CHAPTER 12
INTRODUCTION TO STACKS
Stack is an important data structure which stores its elements in an ordered manner. Take an
analogy of a pile of plates where one plate is placed on top of the other. A plate can be
removed from the topmost position. Hence, you can add and remove the plate only at/from one
position that is, the topmost position.
[Figure: a pile of plates; another plate will be added on top of the topmost plate]
Push Operation
The push operation is used to insert an element in to the stack. The new element is added at
the topmost position of the stack. However, before inserting the value, we must first check if
TOP=MAX-1, because if this is the case then it means the stack is full and no more insertions
can further be done. If an attempt is made to insert a value in a stack that is already full, an
OVERFLOW message is printed.
[Figure: stack before and after the push operation; TOP = 4 becomes TOP = 5]
POP OPERATION
The pop operation is used to delete the topmost element from the stack. However, before
deleting the value, we must first check if TOP=NULL, because if this is the case then it means
the stack is empty so no more deletions can further be done. If an attempt is made to delete a
value from a stack that is already empty, an UNDERFLOW message is printed.
[Figure: stack before and after the pop operation; TOP = 4 becomes TOP = 3]
PEEP OPERATION
Peep is an operation that returns the value of the topmost element of the stack without deleting
it from the stack.
However, the peep operation first checks if the stack is empty or contains some elements. For
this, a condition is checked. If TOP = NULL, then an appropriate message is printed else the
value is returned.
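The three operations can be sketched with an array as shown below. This is a minimal illustration under stated assumptions: the capacity MAX is 5, and the empty stack is marked by top = -1 (an array index) rather than by NULL.

```c
#include <stdio.h>

#define MAX 5                /* illustrative capacity */

int stack[MAX];
int top = -1;                /* -1 marks an empty stack in this sketch */

/* Returns 1 on success, 0 on OVERFLOW. */
int push(int val)
{
    if (top == MAX - 1) {
        printf("OVERFLOW\n");
        return 0;
    }
    stack[++top] = val;
    return 1;
}

/* Stores the removed value in *val. Returns 1 on success, 0 on UNDERFLOW. */
int pop(int *val)
{
    if (top == -1) {
        printf("UNDERFLOW\n");
        return 0;
    }
    *val = stack[top--];
    return 1;
}

/* Like pop(), but leaves the element on the stack. */
int peep(int *val)
{
    if (top == -1) {
        printf("STACK IS EMPTY\n");
        return 0;
    }
    *val = stack[top];
    return 1;
}
```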
Infix, Postfix and Prefix notations are three different but equivalent notations of writing algebraic
expressions.
While writing an arithmetic expression using infix notation, the operator is placed in between the
operands. For example, A+B;
Although it is easy for us to write expressions using infix notation, computers find it difficult to
parse, as the computer needs a lot of information to evaluate the expression. Information is
needed about operator precedence, associativity rules and brackets, which override these
rules. So, computers work more efficiently with expressions written using prefix and postfix
notations.
POSTFIX NOTATION
Postfix notation was given by Jan Łukasiewicz, who was a Polish logician, mathematician, and
philosopher. His aim was to develop a parenthesis-free prefix notation (also known as Polish
notation) and a postfix notation which is better known as Reverse Polish Notation or RPN.
In postfix notation, as the name suggests, the operator is placed after the operands. For
example, if an expression is written as A+B in infix notation, the same expression can be written
AB+ in postfix notation. The order of evaluation of a postfix expression is always from left to
right. Even brackets can not alter the order of evaluation.
[AB+]*C
PREFIX NOTATION
Although a prefix notation is also evaluated from left to right, the only difference between a
postfix notation and a prefix notation is that in a prefix notation, the operator is placed before
the operands. For example, if A+B is an expression in infix notation, then the corresponding
expression in prefix notation is given by +AB.
While evaluating a prefix expression, the operators are applied to the operands that are present
immediately on the right of the operator. Like postfix, prefix expressions also do not follow the
rules of operator precedence, associativity and even brackets cannot alter the order of
evaluation.
STEP 1: Convert the infix expression into its equivalent postfix expression
Exercise: Convert the following infix expression into postfix expression using the algorithm
given in figure 9.21.
A - ( B / C + (D % E * F) / G ) * H
A - ( B / C + (D % E * F) / G ) * H )
Infix Character Scanned | STACK          | Postfix Expression
A                       | (              | A
-                       | ( -            | A
(                       | ( - (          | A
B                       | ( - (          | A B
/                       | ( - ( /        | A B
C                       | ( - ( /        | A B C
+                       | ( - ( +        | A B C /
(                       | ( - ( + (      | A B C /
D                       | ( - ( + (      | A B C / D
%                       | ( - ( + ( %    | A B C / D
E                       | ( - ( + ( %    | A B C / D E
*                       | ( - ( + ( % *  | A B C / D E
F                       | ( - ( + ( % *  | A B C / D E F
)                       | ( - ( +        | A B C / D E F * %
/                       | ( - ( + /      | A B C / D E F * %
G                       | ( - ( + /      | A B C / D E F * % G
)                       | ( -            | A B C / D E F * % G / +
*                       | ( - *          | A B C / D E F * % G / +
H                       | ( - *          | A B C / D E F * % G / + H
)                       |                | A B C / D E F * % G / + H * -
Let us now take an example that makes use of this algorithm. Consider the infix
expression given as 9 - (( 3 * 4) + 8) / 4. Evaluate the expression.
The equivalent postfix expression is 9 3 4 * 8 + 4 / -
Character scanned | Stack
9                 | 9
3                 | 9, 3
4                 | 9, 3, 4
*                 | 9, 12
8                 | 9, 12, 8
+                 | 9, 20
4                 | 9, 20, 4
/                 | 9, 5
-                 | 4
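The stack-based evaluation of a postfix expression can be sketched in C as follows. The sketch assumes single-digit operands and a well-formed expression, and the function name eval_postfix is chosen only for this example.

```c
#include <ctype.h>

/* Evaluates a postfix expression made of single-digit operands and the
   operators + - * / %, for example "9 3 4 * 8 + 4 / -". Spaces are ignored.
   Assumes a well-formed expression with at most 100 pending operands. */
int eval_postfix(const char *exp)
{
    int stack[100];
    int top = -1;
    int a, b;

    for (; *exp != '\0'; exp++) {
        if (isspace((unsigned char)*exp))
            continue;
        if (isdigit((unsigned char)*exp)) {
            stack[++top] = *exp - '0';      /* operand: push its value */
            continue;
        }
        b = stack[top--];                   /* right operand is on top */
        a = stack[top--];                   /* left operand is below it */
        switch (*exp) {
        case '+': stack[++top] = a + b; break;
        case '-': stack[++top] = a - b; break;
        case '*': stack[++top] = a * b; break;
        case '/': stack[++top] = a / b; break;
        case '%': stack[++top] = a % b; break;
        }
    }
    return stack[top];                      /* the single value left is the result */
}
```

Calling eval_postfix("9 3 4 * 8 + 4 / -") returns 4, following the same sequence of stack contents as the evaluation of 9 - (( 3 * 4) + 8) / 4 shown above.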
QUEUES
Queue is an important data structure which stores its elements in an ordered manner. Take for
example the analogies given below.
People moving on an escalator. The people who got on the escalator first will be the first one to
step out of it.
People waiting for a bus. The first person standing in the line will be the first one to get into the
bus.
A queue is a FIFO (First In First Out) data structure in which
the element that was inserted first is the first one to be taken
out. The elements in a queue are added at one end called
the rear and removed from the other end called the front.
Queues can be easily represented using linear arrays. As stated earlier, every queue will have
front and rear variables that will point to the position from where deletions and insertions can be
done respectively.
[Figure: array representation of a queue]
Here, front = 0 and rear = 5. If we want to add one more value in the list say with value 45, then
rear would be incremented by 1 and the value would be stored at the position pointed by rear.
[Figure: the queue after inserting 45 at the rear]
Here, front = 0 and rear = 6. Now, if we want to delete an element from the queue, then the
value of front will be incremented. Deletions are done from only this end of the queue.
[Figure: the queue after deleting the element at the front]
Similarly, before deleting an element from the queue, we must check for the underflow
condition. An underflow condition occurs when we try to delete an element from a queue
that is already empty. If front = -1 and rear = -1, this means there is no element in the
queue.
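The array representation, together with the overflow and underflow checks, can be sketched in C as below. The structure name, the function names and the capacity MAX are assumptions made for illustration. The sketch is a simple linear queue: slots freed at the front are not reused until the queue becomes empty again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX 10   /* capacity chosen for the sketch */

struct queue {
    int items[MAX];
    int front;   /* index of the next element to delete, -1 when empty */
    int rear;    /* index of the last inserted element, -1 when empty */
};

void queue_init(struct queue *q)
{
    assert(q != NULL);
    q->front = q->rear = -1;
}

bool queue_is_empty(const struct queue *q)
{
    return q->front == -1 || q->front > q->rear;
}

/* Inserts at the rear. Returns false on overflow. */
bool enqueue(struct queue *q, int value)
{
    if (q->rear == MAX - 1)
        return false;              /* overflow: no free slot at the rear */
    if (q->front == -1)
        q->front = 0;              /* first element ever inserted */
    q->items[++q->rear] = value;
    return true;
}

/* Deletes from the front. Returns false on underflow. */
bool dequeue(struct queue *q, int *value)
{
    if (queue_is_empty(q))
        return false;              /* underflow: nothing to delete */
    *value = q->items[q->front++];
    if (q->front > q->rear)
        q->front = q->rear = -1;   /* last element removed: back to the empty state */
    return true;
}
```

A circular queue would reuse the freed slots; that variant is not shown here.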
Oxford University Press 2011. All rights reserved.
CHAPTER 13
TREES
BINARY TREES
A binary tree is a data structure which is defined as a collection of elements called nodes. Every
node contains a "left" pointer, a "right" pointer, and a data element. Every binary tree has a root
element pointed by a "root" pointer. The root element is the topmost node in the tree. If root =
NULL, then it means the tree is empty.
If the root node R is not NULL, then the two trees T1 and T2 are called the left and right
subtrees of R. If T1 is non-empty, then T1 is said to be the left successor of R. Likewise, if T2 is
non-empty, then it is called the right successor of R.
In a binary tree every node has 0, 1 or at the most 2
successors. A node that has no successors or 0
successors is called the leaf node or the terminal
node.
[Figure: binary tree with root node 1, left subtree T1 and right subtree T2]
KEY TERMS
Sibling: If N is any node in T that has left successor S1 and right successor S2, then N is called
the parent of S1 and S2. Correspondingly, S1 and S2 are called the left child and the right child
of N. Also, S1 and S2 are said to be siblings. Every node other than the root node has a
parent. In other words, all nodes that are at the same level and share the same parent are
called siblings (brothers).
Level number: Every node in the binary tree is assigned a level number. The root node is
defined to be at level 0. The left and right children of the root node have level number 1. Similarly,
every node is at one level higher than its parent. So all child nodes are defined to have level
number equal to the parent's level number + 1.
Degree: Degree of a node is equal to the number of children that a node has. The degree of a
leaf node is zero.
In-degree of a node is the number of edges arriving at that node. The root node is the only node
that has an in-degree equal to zero. Similarly, Out-degree of a node is the number of edges
leaving that node.
Leaf node: A leaf node has no children.
Copies of binary trees: Two binary trees T and T' are said to be copies if they have
similar structure and the same contents at the corresponding nodes.
[Figure: tree T and tree T']
Directed edge: A line drawn from a node N to any of its successors is called a directed edge. A
binary tree of n nodes has exactly n - 1 edges (because every node except the root node is
connected to its parent via an edge).
Depth: The depth of a node N is given as the length of the path from the root R to the node N.
The depth of the root node is zero. The height/depth of a tree is defined as the length of the
path from the root node to the deepest node in the tree.
A tree with only a root node has a height of zero. If height is instead counted as the number of
levels, so that a tree with only a root node has one level, a binary tree of height h has at least h
nodes and at most 2^h - 1 nodes. This is because every level has at least one node, and level i
(counting the root as level 0) can have at most 2^i nodes. Counted the same way, the height of a
binary tree with n nodes is at least log2(n + 1), rounded up, and at most n.
Ancestor and descendant nodes: Ancestors of a node are all the nodes along the path from the
root to that node. Similarly, descendants of a node are all the nodes along the path from that
node to the leaf node.
Binary trees are commonly used to implement binary search trees, expression trees,
tournament trees and binary heaps.
A complete binary tree is a binary tree which satisfies two properties. First, in a complete binary
tree every level, except possibly the last, is completely filled. Second, all nodes appear as far
left as possible.
In a complete binary tree Tn, there are exactly n nodes and level r of T can have at most 2^r
nodes.
The formulas to find the parent, left child and right child are as follows. If K is a parent node,
then its left child is 2 * K and its right child is 2 * K + 1. For
example, the children of node 4 are 8 (2 * 4) and 9 (2 * 4 + 1). Similarly, the parent of the node K
is floor(K/2). Given the node 4, its parent is floor(4/2) = 2. The
height of a tree Tn having exactly n nodes, counted in levels, is given as
Hn = floor(log2 n) + 1
This means that if a tree T has 10,00,000 nodes then its height is 20, because 2^19 <= 10,00,000 < 2^20.
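The index arithmetic for a complete binary tree can be checked with a few lines of C. The function names are assumptions for this sketch. Nodes are numbered from 1 in level order, as in the text, and the height is counted in levels, so a single node has height 1.

```c
#include <assert.h>

/* Nodes are numbered 1, 2, 3, ... in level order. */
int left_child(int k)  { return 2 * k; }
int right_child(int k) { return 2 * k + 1; }

/* Integer division gives floor(k / 2); the result 0 means "no parent" (k is the root). */
int parent(int k)
{
    assert(k >= 1);
    return k / 2;
}

/* Number of levels of a complete binary tree with n nodes: floor(log2 n) + 1.
   Each halving of n corresponds to moving up one level. */
int height(int n)
{
    int levels = 0;
    assert(n >= 0);
    while (n > 0) {
        levels++;
        n /= 2;
    }
    return levels;
}
```

With these definitions, left_child(4) is 8, right_child(4) is 9 and parent(4) is 2.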
[Figure: complete binary tree with nodes numbered 1 to 13]
A binary tree T is said to be an extended binary tree (or a 2-tree) if each node in the tree has
either no child or exactly two children.
In an extended binary tree, nodes that have two children are called internal nodes and nodes
that have no children are called external nodes. In the figure, internal nodes are
represented using circles and external nodes are represented using squares.
To convert a binary tree into an extended tree, every empty sub-tree is replaced by a new node.
The original nodes in the tree are the internal nodes and the new nodes added are called the
external nodes.
Binary tree
Extended binary tree
In the computer's memory, a binary tree can be maintained either using a linked representation (as
in the case of a linked list) or using a sequential representation (as in the case of single arrays).
In the linked representation of a binary tree, every node has three parts: the data element, a
pointer to the left node and a pointer to the right node. So in C, the binary tree is built with a
node type given as below.
struct node {
struct node* left;
int data;
struct node* right;
};
[Figure: linked and sequential (array) representations of a binary tree. In the sequential
representation, the array has room for at most 2^(d+1) - 1 nodes, where d is the depth of the tree.]
EXPRESSION TREES
Binary trees are widely used to store algebraic expressions. For
example, consider the algebraic expression Exp given as,
Exp = (a - b) + (c * d)
This expression can be represented using a binary tree in which the root holds +, its left
subtree holds a - b and its right subtree holds c * d.
[Figure: expression tree for Exp]
TOURNAMENT TREES
In a tournament tree (also called a selection tree), each external node represents a
player and each internal node represents the winner of the match played between the
players represented by its children nodes. These tournament trees are also called
winner trees because they are being used to record the winner at each level. We can
also have a loser tree that records the loser at each level.
Traversing a binary tree is the process of visiting each node in the tree exactly once,
in a systematic way. Unlike linear data structures in which the elements are traversed
sequentially, tree is a non-linear data structure in which the elements can be
traversed in many different ways. There are different algorithms for tree traversals.
These algorithms differ in the order in which the nodes are visited. In this section, we
will read about these algorithms.
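Pre-order algorithm
To traverse a non-empty binary tree in pre-order, the following
operations are performed recursively at each node, starting with the root node:
visit the node, traverse its left subtree, then traverse its right subtree.
For the example tree, the pre-order traversal is
A, B, D, C, E, F, G, H and I.
In-order algorithm
To traverse a non-empty binary tree in in-order, the following
operations are performed recursively at each node. The algorithm
starts with the root node of the tree and continues by
traversing the left subtree, visiting the node, and then traversing the right subtree.
For the example tree, the in-order traversal is
B, D, A, E, H, G, I, F and C.
Post-order algorithm
To traverse a non-empty binary tree in post-order, the left subtree is traversed first,
then the right subtree, and the node itself is visited last.
For the example tree, the post-order traversal is
D, B, H, I, G, F, E, C and A.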
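The three traversals differ only in where the node itself is visited, which a short C sketch makes visible. The function names and the output-buffer convention are assumptions for illustration, and the node stores a char label instead of the int used in the node type earlier so that the letters of the example can be reproduced.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *left;
    char data;               /* a letter label; the text's node type stores an int */
    struct node *right;
};

/* Each function appends the labels it visits to out, starting at position len,
   and returns the new length. out must be large enough for all nodes. */
size_t preorder(const struct node *n, char *out, size_t len)
{
    assert(out != NULL);
    if (n == NULL) return len;
    out[len++] = n->data;                   /* node first */
    len = preorder(n->left, out, len);
    return preorder(n->right, out, len);
}

size_t inorder(const struct node *n, char *out, size_t len)
{
    assert(out != NULL);
    if (n == NULL) return len;
    len = inorder(n->left, out, len);
    out[len++] = n->data;                   /* node between the subtrees */
    return inorder(n->right, out, len);
}

size_t postorder(const struct node *n, char *out, size_t len)
{
    assert(out != NULL);
    if (n == NULL) return len;
    len = postorder(n->left, out, len);
    len = postorder(n->right, out, len);
    out[len++] = n->data;                   /* node last */
    return len;
}
```

One tree consistent with all three sequences listed above has A at the root with children B and C, D as the right child of B, E as the left child of C, F as the right child of E, G as the left child of F, and H and I as the children of G.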
CHAPTER 14
GRAPHS
INTRODUCTION
A graph is an abstract data structure that is used to implement the graph concept from
mathematics. A graph is basically a collection of vertices (also called nodes) and edges that
connect these vertices. A graph is often viewed as a generalization of the tree structure, where
instead of having a purely parent-to-child relationship between tree nodes, any kind of
complex relationship between the nodes can be represented.
Graphs are widely used to model any situation where entities or things are related to each other
in pairs; for example, the following information can be represented by graphs:
Family trees in which the member nodes have an edge from parent to each of their children.
Transportation networks in which nodes are airports, intersections, ports, etc. The edges can be
airline flights, one-way roads, shipping routes, etc.
Definition
A graph G is defined as an ordered set (V, E), where V(G) represents the set of vertices and
E(G) represents the edges that connect the vertices.
The figure given shows a graph with V(G) = {A, B, C, D, E} and E(G) = {(A, B), (B, C),
(A, D), (B, D), (D, E), (C, E)}. Note that there are 5 vertices or nodes and 6 edges in the
graph.
A graph can be directed or undirected. In an undirected graph, the edges do not have any
direction associated with them. That is, if an edge is drawn between nodes A and B, then the
nodes can be traversed from A to B as well as from B to A. The above figure shows an
undirected graph because it does not give any information about the direction of the edges.
The given figure shows a directed graph. In a directed graph, edges form an ordered pair. If
there is an edge from A to B, then there is a path from A to B but not from B to A. The edge (A,
B) is said to initiate from node A (also known as the initial node) and terminate at node B (the terminal
node).
GRAPH TERMINOLOGY
Adjacent Nodes or Neighbors: For every edge, e = (u, v) that connects nodes u and v; the
nodes u and v are the end-points and are said to be the adjacent nodes or neighbors.
Degree of a node: Degree of a node u, deg(u), is the total number of edges containing the
node u. If deg(u) = 0, it means that u does not belong to any edge and such a node is known as
an isolated node.
Regular graph: Regular graph is a graph where each vertex has the same number of
neighbors. That is every node has the same degree. A regular graph with vertices of degree k is
called a k-regular graph or regular graph of degree k.
[Figure: 0-regular, 1-regular and 2-regular graphs]
Path: A path P, written as P = {v0, v1, v2, ..., vn}, of length n from a node u to v is defined as a sequence of
(n+1) nodes. Here, u = v0, v = vn and v(i-1) is adjacent to vi for i = 1, 2, 3, ..., n.
Closed path: A path P is known as a closed path if it has the same end-points, that is, if v0 = vn.
Simple path: A path P is known as a simple path if all the nodes in the path are distinct with an exception that
v0 may be equal to vn. If v0 = vn, then the path is called a closed simple path.
Cycle: A closed simple path with length 3 or more is known as a cycle. A cycle of length k is
called a k-cycle.
Connected graph: A graph in which there exists a path between any two of its nodes is called a
connected graph. That is to say that there are no isolated nodes in a connected graph. A
connected graph that does not have any cycle is called a tree.
Complete graph: A graph G is said to be complete if all its nodes are fully connected, that is,
there is an edge from each node to every other node in the graph. A complete graph has n(n-1)/2
edges, where n is the number of nodes in G.
Labeled graph or weighted graph: A graph is said to be labeled if every edge in the graph is
assigned some data. In a weighted graph, the edges of the graph are assigned some weight or
length. Weight of the edge, denoted by w(e) is a positive value which indicates the cost of
traversing the edge.
Multiple edges: Distinct edges which connect the same end-points are called multiple edges.
That is, e = (u, v) and e' = (u, v) are known as multiple edges of G.
Loop: An edge that has identical end-points is called a loop. That is, e = (u, u).
Multi- graph: A graph with multiple edges and/or a loop is called a multi-graph.
Size of the graph: The size of a graph is the total number of edges in it.
[Figure: (a) Multi-graph, (b) Tree, (c) Weighted graph]
Directed Graph
A directed graph G, also known as a digraph, is a graph in which every edge has a direction assigned
to it. An edge of a directed graph is given as an ordered pair (u, v) of nodes in G. For an edge e = (u, v):
the edge begins at u and terminates at v
u is known as the origin or initial point of e. Correspondingly, v is known as the destination or
terminal point of e
u is the predecessor of v. Correspondingly, v is the successor of u
nodes u and v are adjacent to each other.
Out-degree of a node: The out degree of a node u, written as outdeg(u), is the number of edges
that originate at u.
In-degree of a node: The in degree of a node u, written as indeg(u), is the number of edges that
terminate at u.
Degree of a node: The degree of a node, written as deg(u), is equal to the sum of the in-degree and
out-degree of that node. Therefore, deg(u) = indeg(u) + outdeg(u).
Source: A node u is known as a source if it has a positive out-degree but an in-degree = 0.
Sink: A node u is known as a sink if it has a positive in degree but a zero out-degree.
Reachability: A node v is said to be reachable from node u, if and only if there exists a (directed)
path from node u to node v.
Strongly connected directed graph: A digraph is said to be strongly connected if and only if
there exists a path between every pair of nodes in G. That is, if there is a path from node u to v, then there
must be a path from node v to u.
Unilaterally connected graph: A digraph is said to be unilaterally connected if there exists a path
between any pair of nodes u, v in G such that there is a path from u to v or a path from v to u but not both.
Parallel/Multiple edges: Distinct edges which connect the same end-points are called multiple
edges. That is, e = (u, v) and e' = (u, v) are known as multiple edges of G.
Simple directed graph: A directed graph G is said to be a simple directed graph if and only if it has
no parallel edges. However, a simple directed graph may contain cycles, with the exception that it
cannot have more than one loop at a given node.
[Figure: graph G and its transitive closure G*]
Definition:
For a directed graph G = (V, E), where V is the set of vertices and E is the set of edges, the
transitive closure of G is a graph G* = (V, E*). In G*, for every pair of vertices v, w in V there is an edge
(v, w) in E* if and only if there is a valid path from v to w in G.
Why and where is it needed?
Finding the transitive closure of a directed graph is an important problem in many computational
tasks, such as those listed below.
Transitive closure is used in the reachability analysis of transition networks representing
distributed and parallel systems.
Transitive closure computation is also used to evaluate recursive database queries
(because almost all practical recursive queries are transitive in nature).
Algorithm
The algorithm to find the transitive closure of a graph G is given in the figure below. In order to
determine the transitive closure of a graph, we define a matrix t where t^{k}_{ij} = 1 (for i, j, k = 1, 2,
3, ..., n) if there exists a path in G from vertex i to vertex j with intermediate vertices in the set
(1, 2, 3, ..., k), and 0 otherwise. That is, G* is constructed by adding an edge (i, j) into E* if and
only if t^{k}_{ij} = 1 after the final step k = n.
When k = 0:
t^{0}_{ij} = 0 if (i, j) is not in E
t^{0}_{ij} = 1 if (i, j) is in E
When k >= 1:
t^{k}_{ij} = t^{k-1}_{ij} \lor ( t^{k-1}_{ik} \land t^{k-1}_{kj} )
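The recurrence is usually implemented as Warshall's algorithm, which can update a single matrix in place. The sketch below makes some assumptions for illustration: the function name, the fixed size N and the use of int matrices holding 0 or 1. Vertices are indexed from 0 here, whereas the text numbers them from 1.

```c
#include <assert.h>

#define N 4   /* number of vertices; fixed only to keep the sketch short */

/* On return, t[i][j] is 1 when there is a path of length >= 1 from i to j
   in the graph whose adjacency matrix is adj, and 0 otherwise. */
void transitive_closure(int adj[N][N], int t[N][N])
{
    assert(adj != t);

    /* k = 0: only direct edges count */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            t[i][j] = (adj[i][j] != 0);

    /* step k allows vertex k as an additional intermediate vertex */
    for (int k = 0; k < N; k++)
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                if (t[i][k] && t[k][j])
                    t[i][j] = 1;
}
```

The three nested loops give a running time proportional to N * N * N, independent of how many edges the graph has.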
REPRESENTATION OF GRAPHS
There are two common ways of storing graphs in the computer's memory. They are:
Sequential representation by using an adjacency matrix
Linked representation by using an adjacency list that stores the neighbors of a node using a
linked list
In an adjacency matrix, the entry a_ij is defined as
a_ij = 1 if v_i is adjacent to v_j, that is, if there is an edge (v_i, v_j)
a_ij = 0 otherwise
Since an adjacency matrix contains only 0s and 1s, it is called a bit matrix or a Boolean matrix. The
entries in the matrix depend on the ordering of the nodes in G. Therefore, a change in the order of
nodes will result in a different adjacency matrix.
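Adjacency matrix of a directed graph:
     A  B  C  D  E
A    0  0  0  1  0
B    1  0  1  1  0
C    0  0  0  0  0
D    0  0  0  0  1
E    0  0  1  0  0
Adjacency matrix of a directed graph with a loop:
     A  B  C  D
A    0  1  0  1
B    0  1  1  1
C    1  0  0  1
D    0  0  1  0
Adjacency matrix of an undirected graph:
     A  B  C  D  E
A    0  1  0  1  0
B    1  0  1  1  0
C    0  1  0  0  1
D    1  1  0  0  1
E    0  0  1  1  0
Adjacency matrix of a weighted graph (each entry is the weight of the edge, 0 if there is no edge):
     A  B  C  D  E
A    0  4  0  2  0
B    0  0  0  7  0
C    0  5  0  0  0
D    0  0  0  0  3
E    0  0  1  0  0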
From the adjacency matrix A^1, we have learnt that an entry 1 in the ith row and jth column means that there
exists a path of length 1 from v_i to v_j. Now consider A^2, A^3 and A^4.
a^{2}_{ij} = \sum_{k} a_{ik} a_{kj}
The term a_{ik} a_{kj} is 1 only if a_{ik} = a_{kj} = 1, that is, if there are edges (v_i, v_k) and (v_k, v_j). This implies that there is a
path from v_i to v_j of length 2 through v_k, so the entry in the ith row and jth column of A^2 gives the
number of paths of length 2 from v_i to v_j.
Similarly, every entry in the ith row and jth column of A^3 gives the number of paths of length 3 from
node v_i to v_j.
In general terms, we can conclude that every entry in the ith row and jth column of A^n (where n is the
number of nodes in the graph) gives the number of paths of length n from node v_i to v_j.
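Counting paths by repeated matrix multiplication can be sketched in C as follows. The names mat_mul and count_paths and the fixed size N are assumptions for illustration. The sketch multiplies step by step and makes no attempt at a faster exponentiation scheme.

```c
#include <assert.h>
#include <string.h>

#define N 4   /* number of nodes; fixed only to keep the sketch short */

/* c = a x b for N x N integer matrices; c must not alias a or b. */
void mat_mul(int a[N][N], int b[N][N], int c[N][N])
{
    assert(c != a && c != b);
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            c[i][j] = 0;
            for (int k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];   /* paths i -> k -> j */
        }
    }
}

/* result = adj raised to the power len (len >= 1), so result[i][j] is the
   number of paths of length len from node i to node j. */
void count_paths(int adj[N][N], int len, int result[N][N])
{
    int tmp[N][N];

    assert(len >= 1 && result != adj);
    memcpy(result, adj, sizeof tmp);
    for (int step = 1; step < len; step++) {
        mat_mul(result, adj, tmp);
        memcpy(result, tmp, sizeof tmp);
    }
}
```

For a 4-node graph with edges A to B, A to C, B to C, B to D, C to D, D to A and D to B, calling count_paths with len = 2 reports two paths of length 2 from A to D (through B and through C) and none from D to A.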
Adjacency matrix A^1 of the example graph:
     A  B  C  D
A    0  1  1  0
B    0  0  1  1
C    0  0  0  1
D    1  1  0  0
A^2 = A^1 x A^1, therefore,
A^2 =
     A  B  C  D
A    0  0  1  2
B    1  1  0  1
C    1  1  0  0
D    0  1  2  1
A^3 = A^2 x A^1, therefore,
A^3 =
     A  B  C  D
A    2  2  0  1
B    1  2  2  1
C    0  1  2  1
D    1  1  1  3
A^4 = A^3 x A^1, therefore,
A^4 =
     A  B  C  D
A    1  3  4  2
B    1  2  3  4
C    1  1  1  3
D    3  4  2  2
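The matrix B is the sum of these powers, B = A^1 + A^2 + A^3 + A^4, and the path matrix P is
obtained from it as
P_ij = 1 if b_ij is non-zero
P_ij = 0 otherwise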
Let us now calculate matrix B and matrix P using the above discussion.
B = A^1 + A^2 + A^3 + A^4 =
     A  B  C  D
A    3  6  6  5
B    3  5  6  7
C    2  3  3  5
D    5  7  5  6
Since every entry of B is non-zero, the path matrix P is
P =
     A  B  C  D
A    1  1  1  1
B    1  1  1  1
C    1  1  1  1
D    1  1  1  1
ADJACENCY LIST
The adjacency list is another way in which graphs can be represented in the computer's memory. This
structure consists of a list of all nodes in G. Furthermore, every node is in turn linked to its own list
that contains the names of all other nodes that are adjacent to it.
The key advantages of using an adjacency list include:
It is easy to follow, and clearly shows the adjacent nodes of a particular node.
It is often used for storing graphs that have a small to moderate number of edges. That is, an
adjacency list is preferred for representing sparse graphs in the computer's memory; otherwise, an
adjacency matrix is a good choice.
Adding new nodes in G is easy and straightforward when G is represented using an adjacency list.
Adding new nodes in an adjacency matrix is a difficult task as the size of the matrix needs to be
changed and existing nodes may have to be reordered.
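A linked adjacency list can be sketched in C as an array of list heads, one per node. The structure and function names and the limit MAX_NODES are assumptions for illustration. Nodes are identified by integer indices, and each new edge is inserted at the head of its list, so neighbors appear in reverse order of insertion.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_NODES 26   /* upper limit chosen for the sketch */

struct edge {
    int to;                 /* index of the adjacent node */
    struct edge *next;      /* next entry in the same node's list */
};

struct graph {
    int n;                          /* number of nodes in use */
    struct edge *adj[MAX_NODES];    /* adj[u] heads the list of u's neighbors */
};

void graph_init(struct graph *g, int n)
{
    assert(n >= 0 && n <= MAX_NODES);
    g->n = n;
    for (int i = 0; i < n; i++)
        g->adj[i] = NULL;
}

/* Adds the directed edge from -> to. Returns 0 on success, -1 if allocation fails. */
int graph_add_edge(struct graph *g, int from, int to)
{
    assert(from >= 0 && from < g->n && to >= 0 && to < g->n);
    struct edge *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->to = to;
    e->next = g->adj[from];         /* insert at the head of the list */
    g->adj[from] = e;
    return 0;
}

/* Counts the entries in u's list, that is, the out-degree of u. */
int graph_out_degree(const struct graph *g, int u)
{
    int count = 0;
    for (const struct edge *e = g->adj[u]; e != NULL; e = e->next)
        count++;
    return count;
}

void graph_free(struct graph *g)
{
    for (int u = 0; u < g->n; u++) {
        struct edge *e = g->adj[u];
        while (e != NULL) {
            struct edge *next = e->next;
            free(e);
            e = next;
        }
        g->adj[u] = NULL;
    }
}
```

For an undirected graph, each edge would be added twice, once in each direction.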
[Figure: adjacency list of a graph]
[Table: node states used during traversal: Ready, Waiting, Processed]
Breadth-first search (BFS) is a graph search algorithm that begins at the root node and
explores all the neighboring nodes. Then for each of those nearest nodes, the algorithm
explores their unexplored neighbor nodes, and so on, until it finds the goal.
That is, we start examining the node A and then all the neighbors of A are examined. In the
next step we examine the neighbors of the neighbors of A, and so on.
Example: Consider the graph G given below. The adjacency list of G is also given. Assume that
G represents the daily flights between different cities and we want to fly from city A to I with
minimum stops. That is, find the minimum path P from A to I given that every edge has length
= 1.
Adjacency Lists
A: B, C, D
B: E
C: B, G
D: C, G
E: C, F
F: C, H
G: F, H, I
H: E, I
I: F
Initially, add A to QUEUE and add NULL (\0) to ORIG, so
FRONT = 1
REAR = 1
QUEUE = A
ORIG = \0
Dequeue a node by setting FRONT = FRONT + 1 (remove the FRONT element of QUEUE) and enqueue the
neighbors of A. Also add A as the ORIG of its neighbors, so
FRONT = 2
REAR = 4
QUEUE = A B C D
ORIG = \0 A A A
Dequeue a node by setting FRONT = FRONT + 1 and enqueue the neighbors of B. Also add B as the ORIG of its
neighbors, so
FRONT = 3
REAR = 5
QUEUE = A B C D E
ORIG = \0 A A A B
Dequeue a node by setting FRONT = FRONT + 1 and enqueue the neighbors of C. Also add C as the ORIG of its
neighbors. Note that C has two neighbors B and G. Since B has already been added to the queue and it is not in the
Ready state, we will not add B and add only G, so
FRONT = 4
REAR = 6
QUEUE = A B C D E G
ORIG = \0 A A A B C
Dequeue a node by setting FRONT = FRONT + 1 and enqueue the neighbors of D. Also add D as the ORIG of its neighbors. Note
that D has two neighbors C and G. Since both of them have already been added to the queue and they are not in the Ready state, we
will not add them again, so
FRONT = 5
REAR = 6
QUEUE = A B C D E G
ORIG = \0 A A A B C
Dequeue a node by setting FRONT = FRONT + 1 and enqueue the neighbors of E. Also add E as the ORIG of its neighbors. Note
that E has two neighbors C and F. Since C has already been added to the queue and it is not in the Ready state, we will not add C
and add only F, so
FRONT = 6
REAR = 7
QUEUE = A B C D E G F
ORIG = \0 A A A B C E
Dequeue a node by setting FRONT = FRONT + 1 and enqueue the neighbors of G. Also add G as the ORIG of its neighbors. Note
that G has three neighbors F, H and I. Since F has already been added to the queue, we will add only H and I, so
FRONT = 7
REAR = 9
QUEUE = A B C D E G F H I
ORIG = \0 A A A B C E G G
Since I is our final destination, we stop the execution of this algorithm as soon as it is encountered and added to the QUEUE.
Now backtrack from I using ORIG to find the minimum path P. Thus, we have P as A -> C -> G -> I.
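The queue-and-ORIG procedure can be sketched in C for the example graph. The function name bfs_path, the array names and the encoding of each adjacency list as a string of letters are assumptions for illustration. The sketch uses 0-based queue indices, unlike the 1-based FRONT and REAR of the trace, and assumes the letters A to I are contiguous character codes, as they are in ASCII.

```c
#include <assert.h>
#include <stddef.h>

#define NODES 9   /* vertices A..I of the example */

/* Adjacency lists of the example graph; ADJ[0] lists the neighbors of A, and so on. */
static const char *const ADJ[NODES] = {
    "BCD", "E", "BG", "CG", "CF", "CH", "FHI", "EI", "F"
};

/* Writes the minimum path from src to dst (as letters) into path and returns
   its number of nodes, or 0 when dst cannot be reached from src.
   path must have room for NODES + 1 characters. */
size_t bfs_path(char src, char dst, char *path)
{
    char queue[NODES];
    int orig[NODES];            /* orig[v] = node that enqueued v, -1 for src */
    int seen[NODES] = {0};      /* 0 = still in the ready state */
    int front = 0, rear = 0;    /* queue[front .. rear-1] is waiting */

    assert(src >= 'A' && src < 'A' + NODES);
    assert(dst >= 'A' && dst < 'A' + NODES);
    queue[rear++] = src;
    seen[src - 'A'] = 1;
    orig[src - 'A'] = -1;

    while (front < rear && !seen[dst - 'A']) {
        char u = queue[front++];                    /* dequeue */
        for (const char *p = ADJ[u - 'A']; *p != '\0'; p++) {
            if (!seen[*p - 'A']) {                  /* enqueue ready neighbors only */
                seen[*p - 'A'] = 1;
                orig[*p - 'A'] = u - 'A';
                queue[rear++] = *p;
            }
        }
    }
    if (!seen[dst - 'A'])
        return 0;

    /* Backtrack from dst through orig, then reverse into path. */
    char rev[NODES];
    size_t n = 0;
    for (int v = dst - 'A'; v != -1; v = orig[v])
        rev[n++] = (char)('A' + v);
    for (size_t i = 0; i < n; i++)
        path[i] = rev[n - 1 - i];
    path[n] = '\0';
    return n;
}
```

Calling bfs_path('A', 'I', path) fills path with "ACGI", which agrees with backtracking from I through ORIG in the trace above.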
Example: Consider the graph G given below. The adjacency list of G is also given. Suppose we want
to print all nodes that can be reached from the node H (including H itself). One alternative is to use a
depth-first search of G starting at node H. The procedure can be explained as below.
Adjacency Lists
A: B, C, D
B: E
C: B, G
D: C, G
E: C, F
F: C, H
G: F, H, I
H: E, I
I: F
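Push H on to the stack. The STACK is:
STACK: H
Pop and Print the top element of the STACK, that is, H. Push all the neighbors of H on to the stack that are in the ready state. The STACK
now becomes:
PRINT: H
STACK: E, I
Pop and Print the top element of the STACK, that is, I. Push all the neighbors of I on to the stack that are in the ready state. The STACK
now becomes:
PRINT: I
STACK: E, F
Pop and Print the top element of the STACK, that is, F. Push all the neighbors of F on to the stack that are in the ready state. (Note that F
has two neighbors, C and H, but only C will be added as H is not in the ready state.) The STACK now becomes:
PRINT: F
STACK: E, C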
Pop and Print the top element of the STACK, that is, C. Push all the neighbors of C on to the stack that are in the ready state. The STACK
now becomes:
PRINT: C
STACK: E, B, G
Pop and Print the top element of the STACK, that is, G. Push all the neighbors of G on to the stack that are in the ready state. Since there
are no neighbors of G that are in the ready state no push operation is performed. The STACK now becomes:
PRINT: G
STACK: E, B
Pop and Print the top element of the STACK, that is, B. Push all the neighbors of B on to the stack that are in the ready state. Since
there are no neighbors of B that are in the ready state no Push operation is performed. The STACK now becomes:
PRINT: B
STACK: E
Pop and Print the top element of the STACK, that is, E. Push all the neighbors of E on to the stack that are in the
ready state. Since there are no neighbors of E that are in the ready state no Push operation is performed. The
STACK now becomes empty:
PRINT: E
Since the STACK is now empty, the depth-first search of G starting at node H is complete and the nodes which
were printed are
H, I, F, C, G, B, E.
These are the nodes which are reachable from the node H.
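The stack-based procedure, including the ready, waiting and processed states, can be sketched in C for the same example graph. The function name dfs_order, the enum names and the encoding of each adjacency list as a string of letters are assumptions for illustration. Instead of printing, the sketch collects the visited nodes in a buffer.

```c
#include <assert.h>
#include <stddef.h>

#define NODES 9   /* vertices A..I of the example */

enum state { READY, WAITING, PROCESSED };

/* Adjacency lists of the example graph; ADJ[0] lists the neighbors of A, and so on. */
static const char *const ADJ[NODES] = {
    "BCD", "E", "BG", "CG", "CF", "CH", "FHI", "EI", "F"
};

/* Writes the nodes reachable from start, in the order they are popped,
   into out and returns how many there are.
   out must have room for NODES + 1 characters. */
size_t dfs_order(char start, char *out)
{
    enum state status[NODES];
    char stack[NODES];
    int top = -1;
    size_t n = 0;

    assert(start >= 'A' && start < 'A' + NODES);
    for (int i = 0; i < NODES; i++)
        status[i] = READY;

    stack[++top] = start;
    status[start - 'A'] = WAITING;
    while (top >= 0) {
        char u = stack[top--];                      /* pop and "print" */
        out[n++] = u;
        status[u - 'A'] = PROCESSED;
        for (const char *p = ADJ[u - 'A']; *p != '\0'; p++) {
            if (status[*p - 'A'] == READY) {        /* push ready neighbors only */
                stack[++top] = *p;
                status[*p - 'A'] = WAITING;
            }
        }
    }
    out[n] = '\0';
    return n;
}
```

Starting at H, the buffer ends up holding "HIFCGBE", the same order as the trace. Marking a node as waiting when it is pushed keeps any node from entering the stack twice.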