21CS63 - CG&FIP Course Material
Prepared By
RAJATH A N, Assistant Professor
ASHWINI M S, Assistant Professor
Mission of the Department
M1: Equip students with a continuous learning process to acquire hardware, software, and computing knowledge to face new challenges.
M2: Inculcate the core Computer Science and Engineering components with discipline among the students by providing a state-of-the-art, learner-centric environment.
M3: Impart essential knowledge through quality and value-based education to mould students into complete Computer Science engineers with ethical values and leadership qualities, possessing good communication skills and the ability to work effectively as team members.
M4: Provide a platform to collaborate with successful people from entrepreneurial and research domains to learn and accomplish.
PEO2: To meet the dynamic requirements of IT industries professionally and ethically, along with social responsibilities.
PEO3: To prepare Computer Science and Engineering graduates to support the nation's self-employment growth through entrepreneurial skills, including women's entrepreneurship.
PEO4: To equip graduates with sufficient research exposure to take on further career challenges internationally.
Program Outcomes
2. Problem analysis: Identify, formulate, review research literature, and analyze complex
engineering problems reaching substantiated conclusions using first principles of
mathematics, natural sciences, and engineering sciences.
5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern
engineering and IT tools including prediction and modeling to complex engineering activities
with an understanding of the limitations.
6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess
societal, health, safety, legal and cultural issues and the consequent responsibilities relevant
to the professional engineering practice.
7. Environment and sustainability: Understand the impact of the professional engineering
solutions in societal and environmental contexts, and demonstrate the knowledge of, and
need for sustainable development.
8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and
norms of the engineering practice.
9. Individual and team work: Function effectively as an individual, and as a member or leader
in diverse teams, and in multidisciplinary settings.
11. Project management and finance: Demonstrate knowledge and understanding of the
engineering and management principles and apply these to one’s own work, as a member and
leader in a team, to manage projects and in multidisciplinary environments.
12. Life-long learning: Recognize the need for, and have the preparation and ability to engage
in independent and life-long learning in the broadest context of technological change.
These are sample strategies, which teachers can use to accelerate the attainment of the various course outcomes.
1. The lecture method (L) need not be only the traditional lecture method; alternative effective teaching methods can be adopted to attain the outcomes.
2. Use of Video/Animation to explain functioning of various concepts.
3. Encourage collaborative (Group Learning) Learning in the class.
4. Ask at least three HOT (Higher Order Thinking) questions in the class, which promote critical thinking.
5. Adopt Problem Based Learning (PBL), which fosters students' analytical skills and develops design-thinking skills such as the ability to design, evaluate, generalize, and analyse information rather than simply recall it.
6. Introduce topics in manifold representations.
7. Show the different ways to solve the same problem and encourage the students to come up
with their own creative ways to solve them.
8. Discuss how every concept can be applied to the real world; when that is possible, it helps improve the students' understanding.
Module-1
Overview: Computer Graphics hardware and software and OpenGL: Computer Graphics: Video Display Devices, Raster-Scan Systems, Basics of computer graphics, Application of Computer Graphics. OpenGL:
Introduction to OpenGL, coordinate reference frames, specifying two-dimensional world coordinate
reference frames in OpenGL, OpenGL point functions, OpenGL line functions, point attributes, line
attributes, curve attributes, OpenGL point attribute functions, OpenGL line attribute functions, Line
drawing algorithms(DDA, Bresenham’s).
Digital Image Processing Operations: Basic relationships and distance metrics, Classification of Image
processing Operations.
Text book 2: Chapter 3
(Note: Computer vision and OpenCV for experimental learning or Activity Based Learning using web sources, preferred for assignments. No questions in SEE.)
Web Source: https://www.tutorialspoint.com/opencv/
Teaching-Learning Process: Chalk & board, problem-based learning; lab practice with OpenCV for basic geometric objects and basic image operations.
Module-5
Image Segmentation: Introduction, classification, detection of discontinuities, edge detection (up to Canny edge detection, included).
Text Book 2: Chapter 9: 9.1 to 9.4.4.4
LESSON PLAN
Semester & Year: 6th Sem, 3rd Year
Faculty Name: Rajath A N & Ashwini M S
Subject with Code: Computer Graphics & Fundamentals of Image Processing (21CS63)
CO 1. Construct geometric objects using Computer Graphics principles and OpenGL APIs.
CO 2. Use OpenGL APIs and related mathematics for 2D and 3D geometric Operations on the objects.
CO 3. Design GUI with the necessary techniques required to animate the created objects
CO 4. Apply OpenCV for developing Image processing applications.
CO 5. Apply Image segmentation techniques along with programming, using OpenCV, for developing
simple applications.
Module-4: Introduction to Image Processing (CO4)
- Overview, nature of IP, IP and its related fields (Sessions 1, 2)
- Digital image representation (Session 3)
- Types of images (Session 4)
- Digital Image Processing Operations: basic relationships and distance metrics (Sessions 5, 6, 7)
- Classification of image processing operations (Sessions 8, 9)
- Best Practice (Content Beyond Syllabus): lab practice with OpenCV for basic geometric objects and basic image operations (Session 10)

Module-5: Image Segmentation (CO5)
- Introduction, classification (Sessions 11, 12)
- Detection of discontinuities (Sessions 13, 14, 15)
- Edge detection (up to Canny edge detection, included) (Sessions 16, 17, 18)
- Best Practice (Content Beyond Syllabus): example programs on image segmentation applications (Sessions 19, 20)
Module 1
Computer Graphics Hardware and Software and OpenGL
1. Computer Graphics
1.1 Video Display Devices
The primary components of an electron gun in a CRT are the heated metal cathode
and a control grid.
The heat is supplied to the cathode by directing a current through a coil of wire,
called the filament, inside the cylindrical cathode structure.
This causes electrons to be “boiled off” the hot cathode surface.
Inside the CRT envelope, the free, negatively charged electrons are then
accelerated toward the phosphor coating by a high positive voltage.
Intensity of the electron beam is controlled by the voltage at the control grid.
Since the amount of light emitted by the phosphor coating depends on the number
of electrons striking the screen, the brightness of a display point is controlled by
varying the voltage on the control grid.
The focusing system in a CRT forces the electron beam to converge to a small
cross section as it strikes the phosphor and it is accomplished with either electric
or magnetic fields.
With electrostatic focusing, the electron beam is passed through a positively
charged metal cylinder so that electrons along the center line of the cylinder are in
equilibrium position.
Deflection of the electron beam can be controlled with either electric or magnetic
fields.
Magnetic deflection is commonly used, with two pairs of coils: one pair is mounted on the top and bottom of the CRT neck, and the other pair is mounted on opposite sides of the neck.
The magnetic field produced by each pair of coils results in a transverse deflection force that is perpendicular to both the direction of the magnetic field and the direction of travel of the electron beam.
Horizontal and vertical deflections are accomplished with these two pairs of coils.
What we see on the screen is the combined effect of all the electron light emissions: a glowing spot that quickly fades after all the excited phosphor electrons have returned to their ground energy level.
The frequency of the light emitted by the phosphor is proportional to the energy
difference between the excited quantum state and the ground state.
In a raster-scan display, the electron beam is swept across the screen one row at a time, from top to bottom. As it moves across each row, the beam intensity is turned on and off to create a pattern of illuminated spots.
This scanning process is called refreshing. Each complete scanning of the screen is normally called a frame.
The refreshing rate, called the frame rate, is normally 60 to 80 frames per second,
or described as 60 Hz to 80 Hz.
Picture definition is stored in a memory area called the frame buffer.
This frame buffer stores the intensity values for all the screen points. Each screen
point is called a pixel (picture element).
A property of raster-scan systems is the aspect ratio, which is defined as the number of pixel columns divided by the number of scan lines that can be displayed by the system.
2) Shadow-mask technique
It produces a wide range of colors compared to the beam-penetration technique.
This technique is generally used in raster-scan displays, including color TV sets.
In this technique the CRT has three phosphor color dots at each pixel position: one dot for red, one for green, and one for blue light. This arrangement is commonly known as a dot triangle.
Here the CRT has three electron guns, one for each color dot, and a shadow-mask grid just behind the phosphor-coated screen.
The shadow-mask grid consists of a series of holes aligned with the phosphor-dot pattern.
Three electron beams are deflected and focused as a group onto the shadow mask
and when they pass through a hole they excite a dot triangle.
In dot triangle three phosphor dots are arranged so that each electron beam can
activate only its corresponding color dot when it passes through the shadow mask.
When activated, a dot triangle appears as a single small spot on the screen whose color is the combination of the colors emitted by its three dots.
By changing the intensity of the three electron beams we can obtain different
colors in the shadow mask CRT.
2. Non-emissive displays: Non-emissive displays (or non-emitters) use optical effects to convert sunlight or light from some other source into graphics patterns.
For example: LCD (Liquid Crystal Display).
In a plasma panel, a firing voltage applied to a pair of horizontal and vertical conductors causes the gas at the intersection of the two conductors to break down into a glowing plasma of electrons and ions.
Picture definition is stored in a refresh buffer, and the firing voltages are applied to refresh the pixel positions 60 times per second.
Alternating-current methods are used to provide faster application of the firing voltages and thus brighter displays.
Separation between pixels is provided by the electric field of the conductors.
In the ON state polarized light passing through material is twisted so that it will
pass through the opposite polarizer.
In the OFF state it will reflect back towards source.
Here, the frame buffer can be anywhere in the system memory, and the video
controller accesses the frame buffer to refresh the screen.
In addition to the video controller, raster systems employ other processors as
coprocessors and accelerators to implement various graphics operations.
Two registers are used to store the coordinate values for the screen pixels.
Initially, the x register is set to 0 and the y register is set to the value for the top
scan line.
The contents of the frame buffer at this pixel position are then retrieved and used
to set the intensity of the CRT beam.
Then the x register is incremented by 1, and the process is repeated for the next
pixel on the top scan line.
This procedure continues for each pixel along the top scan line.
After the last pixel on the top scan line has been processed, the x register is reset to
0 and the y register is set to the value for the next scan line down from the top of
the screen.
The procedure is repeated for each successive scan line.
After cycling through all pixels along the bottom scan line, the video controller
resets the registers to the first pixel position on the top scan line and the refresh
process starts over.
a. Speeding up pixel processing in the video controller:
Since the screen must be refreshed at a rate of at least 60 frames per second, the simple procedure illustrated above may not be accommodated by RAM chips if the cycle time is too slow.
To speed up pixel processing, video controllers can retrieve multiple pixel values
from the refresh buffer on each pass.
When a group of pixels has been processed, the next block of pixel values is retrieved from the frame buffer.
Advantages of video controller:
A video controller can be designed to perform a number of other operations.
For various applications, the video controller can retrieve pixel values from
different memory areas on different refresh cycles.
This provides a fast mechanism for generating real-time animations.
Another video-controller task is the transformation of blocks of pixels, so that
screen areas can be enlarged, reduced, or moved from one location to another
during the refresh cycles.
In addition, the video controller often contains a lookup table, so that pixel values
in the frame buffer are used to access the lookup table. This provides a fast method
for changing screen intensity values.
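As a small illustrative sketch (not from the textbook), this look-up-table indirection can be pictured in C as follows; the names lookupTable and displayedColor are hypothetical:

typedef struct { unsigned char r, g, b; } RGB;

#define TABLE_SIZE 256
static RGB lookupTable[TABLE_SIZE];   /* table of displayable RGB intensities */

/* The frame buffer stores an index; the table supplies the color actually shown.
   Changing one table entry changes the on-screen color everywhere that index is used. */
RGB displayedColor(unsigned char frameBufferValue)
{
    return lookupTable[frameBufferValue];
}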
Finally, some systems are designed to allow the video controller to mix the
framebuffer image with an input image from a television camera or other input
device.
b) Raster-Scan Display Processor
Figure shows one way to organize the components of a raster system that contains
a separate display processor, sometimes referred to as a graphics controller or a
display coprocessor.
The purpose of the display processor is to free the CPU from the graphics chores.
The array size for character grids can vary from about 5 by 7 to 9 by 12 or more
for higher-quality displays.
A character grid is displayed by superimposing the rectangular grid pattern into the
frame buffer at a specified coordinate position.
Using outline:
For characters that are defined as outlines, the shapes are scan-converted into the
frame buffer by locating the pixel positions closest to the outline.
These functions include generating various line styles (dashed, dotted, or solid),
displaying color areas, and applying transformations to the objects in a scene.
Display processors are typically designed to interface with interactive input
devices, such as a mouse.
Methods to reduce memory requirements in display processor:
In an effort to reduce memory requirements in raster systems, methods have been
devised for organizing the frame buffer as a linked list and encoding the color
information.
One organization scheme is to store each scan line as a set of number pairs.
Encoding methods can be useful in the digital storage and transmission of picture
information
i) Run-length encoding:
The first number in each pair can be a reference to a color value, and the second
number can specify the number of adjacent pixels on the scan line that are to be
displayed in that color.
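A short C sketch of this idea is given below (illustrative only; the Run type and encodeScanLine are assumed names, not part of the course listings):

typedef struct { unsigned char color; int length; } Run;   /* one (color, run-length) pair */

/* Encode one scan line of 'width' pixels into (color, length) pairs.
   'runs' must have room for up to 'width' pairs; the pair count is returned. */
int encodeScanLine(const unsigned char *pixels, int width, Run *runs)
{
    int nRuns = 0;
    int x = 0;
    while (x < width) {
        int start = x;
        while (x < width && pixels[x] == pixels[start])
            x++;                              /* extend the run while the color repeats */
        runs[nRuns].color  = pixels[start];
        runs[nRuns].length = x - start;
        nRuns++;
    }
    return nRuns;
}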
Basics of Computer Graphics
Computer graphics is the art of drawing pictures, lines, charts, etc. using computers with the help of programming. A computer graphics image is made up of a number of pixels. A pixel is the smallest addressable graphical unit represented on the computer screen.
Computer graphics is concerned with all aspects of producing pictures or images using a
computer. The field began with the display of a few lines on a cathode-ray tube (CRT) and
now, the field generates photograph-equivalent images.
The development of computer graphics has been driven both by the needs of the user
community and by advances in hardware and software. The combination of computers,
networks, and the complex human visual system, through computer graphics, has led to new
ways of displaying information, seeing virtual worlds, and communicating with both other
people and machines.
Applications of Computer Graphics
a. Graphs and Charts
An early application of computer graphics is the display of simple data graphs, usually plotted on a character printer. Data plotting is still one of the most common graphics applications.
Graphs & charts are commonly used to summarize functional, statistical, mathematical,
engineering and economic data for research reports, managerial summaries and other
types of publications.
Typical examples of data plots are line graphs, bar charts, pie charts, surface graphs, contour plots, and other displays showing relationships between multiple parameters in two dimensions, three dimensions, or higher-dimensional spaces.
b. Computer-Aided Design
With virtual-reality systems, designers and others can move about and interact with objects in various ways. Architectural designs can be examined by taking a simulated "walk" through the rooms or around the outsides of buildings to better appreciate the overall effect of a particular design.
With a special glove, we can even “grasp” objects in a scene and turn them over or
move them from one place to another.
d. Data Visualizations
Producing graphical representations for scientific, engineering and medical data sets
and processes is another fairly new application of computer graphics, which is
generally referred to as scientific visualization. And the term business visualization
is used in connection with data sets related to commerce, industry and other
nonscientific areas.
There are many different kinds of data sets and effective visualization schemes
depend on the characteristics of the data. A collection of data can contain scalar
values, vectors or higher-order tensors.
The picture is usually painted electronically on a graphics tablet using a stylus, which
can simulate different brush strokes, brush widths and colors.
Fine artists use a variety of other computer technologies to produce images. To create
pictures the artist uses a combination of 3D modeling packages, texture mapping,
drawing programs and CAD software etc.
Commercial art also uses these "painting" techniques for generating logos and other designs, page layouts combining text and graphics, TV advertising spots, and other applications.
A common graphics method employed in many television commercials is morphing,
where one object is transformed into another.
g. Entertainment
Television production, motion pictures, and music videos routinely use computer graphics methods.
Sometimes graphics images are combined with live actors and scenes, and sometimes films are completely generated using computer rendering and animation techniques.
Some television programs also use animation techniques to combine computer generated
figures of people, animals, or cartoon characters with the actor in a scene or to transform
an actor’s face into another shape.
h. Image Processing
Each screen display area can contain a different process, showing graphical or non-
graphical information, and various methods can be used to activate a display window.
Using an interactive pointing device, such as a mouse, we can activate a display window on some systems by positioning the screen cursor within the window display area and pressing the left mouse button.
2. OpenGL
2.1 Introduction to OpenGL
Symbolic constants that are used with certain functions as parameters are all in capital letters, preceded by "GL", with components separated by underscores.
For eg:- GL_2D, GL_RGB, GL_CCW, GL_POLYGON,
GL_AMBIENT_AND_DIFFUSE.
The OpenGL functions also expect specific data types. For example, an OpenGL
function parameter might expect a value that is specified as a 32-bit integer. But
the size of an integer specification can be different on different machines.
To indicate a specific data type, OpenGL uses special built-in data-type names, such as
GLbyte, GLshort, GLint, GLfloat, GLdouble, GLboolean
Each data-type name begins with the capital letters GL, and the remainder of the
name is a standard data-type designation written in lowercase letters.
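A brief sketch of these type names in use (any OpenGL header, such as GL/glut.h, supplies the typedefs; the function typeDemo is only for illustration):

#include <GL/glut.h>

void typeDemo (void)
{
    GLint     winWidth  = 400;       /* 32-bit signed integer            */
    GLfloat   pointSize = 2.0f;      /* single-precision floating point  */
    GLdouble  angle     = 45.0;      /* double-precision floating point  */
    GLboolean visible   = GL_TRUE;   /* boolean flag                     */
    glPointSize (pointSize);         /* OpenGL functions expect these exact types */
    (void) winWidth; (void) angle; (void) visible;
}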
Related Libraries
In addition to OpenGL basic(core) library(prefixed with gl), there are a number of
associated libraries for handling special operations:-
1) OpenGL Utility(GLU):- Prefixed with “glu”. It provides routines for setting up
viewing and projection matrices, describing complex objects with line and
polygon approximations, displaying quadrics and B-splines using linear
approximations, processing the surface-rendering operations, and other complex
tasks.
Every OpenGL implementation includes the GLU library
2) Open Inventor: provides routines and predefined object shapes for interactive three-dimensional applications; it is written in C++.
3) Window-system libraries:- To create graphics we need display window. We
cannot create the display window directly with the basic OpenGL functions since
it contains only device-independent graphics functions, and window-management
operations are device-dependent. However, there are several window-system libraries that support OpenGL functions for a variety of machines.
Eg:- Apple GL(AGL), Windows-to-OpenGL(WGL), Presentation Manager to
OpenGL(PGL), GLX.
4) OpenGL Utility Toolkit(GLUT):- provides a library of functions which acts as
interface for interacting with any device specific screen-windowing system, thus
making our program device-independent. The GLUT library functions are
prefixed with “glut”.
Header Files
In all graphics programs, we will need to include the header file for the OpenGL
core library.
In windows to include OpenGL core libraries and GLU we can use the following
header files:-
#include<GL/gl.h>
#include <GL/glu.h>
The above lines can be replaced by including the GLUT header file, which ensures that gl.h and glu.h are included correctly:
#include <GL/glut.h> // Windows and Linux
#include <GLUT/glut.h> // Mac OS X
Step 1: Initialization of GLUT.
We perform the GLUT initialization with the statement glutInit (&argc, argv);
This initialization function could also process any command-line arguments, but we will not need to use these parameters for our first example programs.
Step 2: Display-window creation and title
We can state that a display window is to be created on the screen with a given
caption for the title bar. This is accomplished with the function
glutCreateWindow ("An Example OpenGL Program");
where the single argument for this function can be any character string that we
want to use for the display-window title.
Step 3: Specification of the display window
Then we need to specify what the display window is to contain.
For this, we create a picture using OpenGL functions and pass the picture
definition to the GLUT routine glutDisplayFunc, which assigns our picture to the
display window.
Example: suppose we have the OpenGL code for describing a line segment in a
procedure called lineSegment.
Then the following function call passes the line-segment description to the display
window: glutDisplayFunc (lineSegment);
Step 4: Activation of the display windows.
After execution of the following statement, all display windows that we have created, including their graphic content, are activated: glutMainLoop ( );
This function must be the last one in our program. It displays the initial graphics
and puts the program into an infinite loop that checks for input from devices such
as a mouse or keyboard.
Step 5: Setting the display-window position and size using additional GLUT functions
Although the display window that we created will be in some default location and
size, we can set these parameters using additional GLUT functions.
GLUT Function 1: glutInitWindowPosition (xTopLeft, yTopLeft); gives the integer screen position of the top-left corner of the display window.
GLUT Function 2: glutInitWindowSize (width, height); sets the initial pixel width and height of the display window.
After the display window is on the screen, we can reposition and resize it.
GLUT Function 3: glutInitDisplayMode (mode);
We can also set a number of other options for the display window, such as
buffering and a choice of color modes, with the glutInitDisplayMode function.
Arguments for this routine are assigned symbolic GLUT constants.
The values of the constants passed to this function are combined using a logical or
operation.
Actually, single buffering and RGB color mode are the default options.
But we will use the function now as a reminder that these are the options that are
set for our display.
Later, we discuss color modes in more detail, as well as other display options,
such as double buffering for animation applications and selecting parameters for
viewing three-dimensional scenes.
Using RGB color values, we set the background color for the display window to
be white, with the OpenGL function: glClearColor (1.0, 1.0, 1.0, 0.0);
The first three arguments in this function set the red, green, and blue component
colors to the value 1.0, giving us a white background color for the display window.
If, instead of 1.0, we set each of the component colors to 0.0, we would get a black
background.
The fourth parameter in the glClearColor function is called the alpha value for the
specified color. One use for the alpha value is as a “blending” parameter
When we activate the OpenGL blending operations, alpha values can be used to
determine the resulting color for two overlapping objects.
An alpha value of 0.0 indicates a totally transparent object, and an alpha value of
1.0 indicates an opaque object.
For now, we will simply set alpha to 0.0.
The GLU function gluOrtho2D defines the coordinate reference frame within the display window to be (0.0, 0.0) at the lower-left corner of the display window and (200.0, 150.0) at the upper-right window corner.
For now, we will use a world-coordinate rectangle with the same aspect ratio as
the display window, so that there is no distortion of our picture.
Finally, we need to call the appropriate OpenGL routines to create our line
segment.
The following code defines a two-dimensional, straight-line segment with integer,
Cartesian endpoint coordinates (180, 15) and (10, 145).
glBegin (GL_LINES);
glVertex2i (180, 15);
glVertex2i (10, 145);
glEnd ( );
Now we are ready to put all the pieces together:
The following OpenGL program is organized into three functions.
init: we place all initializations and related one-time parameter settings in function init. The geometric description of the picture is placed in the display function lineSegment, and the GLUT setup calls go in main.
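The bodies of init and lineSegment are not reproduced in these notes. A minimal sketch consistent with the settings discussed earlier (a white background, a gluOrtho2D frame from (0.0, 0.0) to (200.0, 150.0), and a line segment from (180, 15) to (10, 145)) is given below; the GLUT header and the choice of line color are assumptions added here so that the whole program compiles:

#include <GL/glut.h> // (use <GLUT/glut.h> on Mac OS X)

void init (void)
{
    glClearColor (1.0, 1.0, 1.0, 0.0);    // Set display-window color to white.
    glMatrixMode (GL_PROJECTION);         // Set projection parameters.
    gluOrtho2D (0.0, 200.0, 0.0, 150.0);  // World-coordinate reference frame.
}

void lineSegment (void)
{
    glClear (GL_COLOR_BUFFER_BIT);        // Clear display window.
    glColor3f (0.0, 0.0, 1.0);            // Line color (blue chosen here; any color works).
    glBegin (GL_LINES);
        glVertex2i (180, 15);             // Specify line-segment geometry.
        glVertex2i (10, 145);
    glEnd ( );
    glFlush ( );                          // Process all OpenGL routines as quickly as possible.
}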
void main (int argc, char** argv)
{
glutInit (&argc, argv); // Initialize GLUT.
glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB); // Set display mode.
glutInitWindowPosition (50, 100); // Set top-left display-window position.
glutInitWindowSize (400, 300); // Set display-window width and height.
glutCreateWindow ("An Example OpenGL Program"); // Create display window.
init ( ); // Execute initialization procedure.
glutDisplayFunc (lineSegment); // Send graphics to display window.
glutMainLoop ( ); // Display everything and wait.
}
These coordinate positions are stored in the scene description along with other info
about the objects, such as their color and their coordinate extents
Coordinate extents: Coordinate extents are the minimum and maximum x, y, and z values for each object.
A set of coordinate extents is also described as a bounding box for an object.
Ex: For a 2D figure, the coordinate extents are sometimes called its bounding rectangle.
Objects are then displayed by passing the scene description to the viewing routines
which identify visible surfaces and map the objects to the frame buffer positions
and then on the video monitor.
The scan-conversion algorithm stores info about the scene, such as color values, at
the appropriate locations in the frame buffer, and then the scene is displayed on the
output device.
Screen co-ordinates:
Locations on a video monitor are referenced in integer screen coordinates, which
correspond to the integer pixel positions in the frame buffer.
Scan-line algorithms for the graphics primitives use the coordinate descriptions to
determine the locations of pixels
Example: given the endpoint coordinates for a line segment, a display algorithm
must calculate the positions for those pixels that lie along the line path between the
endpoints.
Since a pixel position occupies a finite area of the screen, the finite size of a pixel
must be taken into account by the implementation algorithms.
For the present, we assume that each integer screen position references the centre
of a pixel area.
Once pixel positions have been identified, the color values must be stored in the frame buffer.
• A low-level procedure of the form setPixel (x, y) stores the current color setting into the frame buffer at integer position (x, y).
• To retrieve the current frame-buffer setting for a pixel location, we can use the low-level function getPixel (x, y, color). Parameter color receives an integer value corresponding to the combined RGB bit codes stored for the specified pixel at position (x, y).
• Additional screen-coordinate information is needed for 3D scenes.
To define a two-dimensional world-coordinate reference frame in OpenGL, we select the projection matrix and then specify the coordinate range:
glMatrixMode (GL_PROJECTION);
glLoadIdentity ( );
gluOrtho2D (xmin, xmax, ymin, ymax);
The display window will then be referenced by coordinates (xmin, ymin) at the
lower-left corner and by coordinates (xmax, ymax) at the upper-right corner, as
shown in Figure below
We can then designate one or more graphics primitives for display using the
coordinate reference specified in the gluOrtho2D statement.
If the coordinate extents of a primitive are within the coordinate range of the
display window, all of the primitive will be displayed.
Otherwise, only those parts of the primitive within the display-window coordinate
limits will be shown.
Also, when we set up the geometry describing a picture, all positions for the
OpenGL primitives must be given in absolute coordinates, with respect to the
reference frame defined in the gluOrtho2D function.
OpenGL Point Functions: To specify the geometry of a point, we give a coordinate position in the world reference frame, listed between a glBegin/glEnd pair whose argument is the symbolic constant GL_POINTS.
This coordinate position, along with any other geometric descriptions we may have in our scene, is then passed to the viewing routines.
Unless we specify other attribute values, OpenGL primitives are displayed with a
default size and color.
The default color for primitives is white, and the default point size is equal to the
size of a single screen pixel
Syntax:
Case 1:
glBegin (GL_POINTS);
glVertex2i (50, 100);
glVertex2i (75, 150);
glVertex2i (100, 200);
glEnd ( );
Case 2:
we could specify the coordinate values for the preceding points in arrays such as
int point1 [ ] = {50, 100};
int point2 [ ] = {75, 150};
int point3 [ ] = {100, 200};
and call the OpenGL functions for plotting the three points as
glBegin (GL_POINTS);
glVertex2iv (point1);
glVertex2iv (point2);
glVertex2iv (point3);
glEnd ( );
Case 3:
specifying two point positions in a three dimensional world reference frame. In this
case, we give the coordinates as explicit floating-point values:
glBegin (GL_POINTS);
glVertex3f (-78.05, 909.72, 14.60);
glVertex3f (261.91, -5200.67, 188.33);
glEnd ( );
OpenGL Line Functions: With the primitive constant GL_LINES, successive pairs of vertices are taken as endpoints and connected to form individual line segments.
Note that successive segments usually are disconnected because the vertices are processed on a pair-wise basis.
With the five vertices in the example below, we obtain one line segment between the first and second coordinate positions and another line segment between the third and fourth positions; because the number of specified endpoints is odd, the last coordinate position is ignored.
Case 1: Lines
glBegin (GL_LINES);
glVertex2iv (p1);
glVertex2iv (p2);
glVertex2iv (p3);
glVertex2iv (p4);
glVertex2iv (p5);
glEnd ( );
Case 2: GL_LINE_STRIP:
Successive vertices are connected using line segments. However, the final vertex is not
connected to the initial vertex.
glBegin (GL_LINE_STRIP);
glVertex2iv (p1);
glVertex2iv (p2);
glVertex2iv (p3);
glVertex2iv (p4);
glVertex2iv (p5);
glEnd ( );
Case 3: GL_LINE_LOOP:
Successive vertices are connected using line segments to form a closed path or loop i.e.,
final vertex is connected to the initial vertex.
glBegin (GL_LINE_LOOP);
glVertex2iv (p1);
glVertex2iv (p2);
glVertex2iv (p3);
glVertex2iv (p4);
glVertex2iv (p5);
glEnd ( );
In a state system, the displayed color and size of a point are determined by the current values stored in the attribute list.
Color components are set with RGB values or an index into a color table.
For a raster system: Point size is an integer multiple of the pixel size, so that a
large point is displayed as a square block of pixels
Size:
We set the size for an OpenGL point with
glPointSize (size);
and the point is then displayed as a square block of pixels.
Example program:
Attribute functions may be listed inside or outside of a glBegin/glEnd pair.
Example: the following code segment plots three points in varying colors and
sizes.
The first is a standard-size red point, the second is a double-size green point, and
the third is a triple-size blue point:
Ex:
glColor3f (1.0, 0.0, 0.0);
glBegin (GL_POINTS);
glVertex2i (50, 100);
glPointSize (2.0);
glColor3f (0.0, 1.0, 0.0);
glVertex2i (75, 150);
glPointSize (3.0);
glColor3f (0.0, 0.0, 1.0);
glVertex2i (100, 200);
glEnd ( );
But we can also display dashed lines, dotted lines, or a line with a combination of
dashes and dots.
We can vary the length of the dashes and the spacing between dashes or dots.
We set a current display style for lines with the OpenGL function:
glLineStipple (repeatFactor, pattern);
Pattern:
Parameter pattern is used to reference a 16-bit integer that describes how the line should be displayed.
A 1 bit in the pattern denotes an "on" pixel position, and a 0 bit indicates an "off" pixel position.
The pattern is applied to the pixels along the line path starting with the low-order
bits in the pattern.
The default pattern is 0xFFFF (each bit position has a value of 1),which produces a
solid line.
repeatFactor
Integer parameter repeatFactor specifies how many times each bit in the pattern is
to be repeated before the next bit in the pattern is applied.
The default repeat value is 1.
Polyline:
With a polyline, a specified line-style pattern is not restarted at the beginning of
each segment.
It is applied continuously across all the segments, starting at the first endpoint of
the polyline and ending at the final endpoint for the last segment in the series.
Example:
For line style, suppose parameter pattern is assigned the hexadecimal
representation 0x00FF and the repeat factor is 1.
This would display a dashed line with eight pixels in each dash and eight pixel
positions that are “off” (an eight-pixel space) between two dashes.
Also, since low order bits are applied first, a line begins with an eight-pixel dash
starting at the first endpoint.
This dash is followed by an eight-pixel space, then another eight-pixel dash, and
so forth, until the second endpoint position is reached.
Example Code:
typedef struct { float x, y; } wcPt2D;
wcPt2D dataPts [5];

void linePlot (wcPt2D dataPts [5])
{
    int k;
    glBegin (GL_LINE_STRIP);
    for (k = 0; k < 5; k++)
        glVertex2f (dataPts [k].x, dataPts [k].y);
    glEnd ( );
    glFlush ( );   // Process the buffered OpenGL routines.
}
/* Invoke a procedure here to draw coordinate axes. */
glEnable (GL_LINE_STIPPLE);

/* Input first set of (x, y) data values. */
glLineStipple (1, 0x1C47); // Plot a dash-dot, standard-width polyline.
linePlot (dataPts);

/* Input second set of (x, y) data values. */
glLineStipple (1, 0x00FF); // Plot a dashed, double-width polyline.
glLineWidth (2.0);
linePlot (dataPts);

/* Input third set of (x, y) data values. */
glLineStipple (1, 0x0101); // Plot a dotted, triple-width polyline.
glLineWidth (3.0);
linePlot (dataPts);

glDisable (GL_LINE_STIPPLE);
We can display curves with varying colors, widths, dot-dash patterns, and
available pen or brush options.
Methods for adapting curve-drawing algorithms to accommodate attribute
selections are similar to those for line drawing.
Raster curves of various widths can be displayed using the method of horizontal or
vertical pixel spans.
Case 1: Where the magnitude of the curve slope |m| <= 1.0, we plot vertical spans;
Case 2: when the slope magnitude |m| > 1.0, we plot horizontal spans.
Method 1: Using circle symmetry property, we generate the circle path with vertical spans
in the octant from x = 0 to x = y, and then reflect pixel positions about the line y = x to y=0.
Method 2: Another method for displaying thick curves is to fill in the area between two
Parallel curve paths, whose separation distance is equal to the desired width. We could do
this using the specified curve path as one boundary and setting up the second boundary
either inside or outside the original curve path. This approach, however, shifts the original
curve path either inward or outward, depending on which direction we choose for the
second boundary.
Method 3:The pixel masks discussed for implementing line-style options could also be
used in raster curve algorithms to generate dashed or dotted patterns
Method 4: Pen (or brush) displays of curves are generated using the same techniques
discussed for straight-line segments.
We determine values for the slope m and y intercept b with the following equations:
m = (yend - y0) / (xend - x0)    (2)
b = y0 - m * x0    (3)
Algorithms for displaying straight lines are based on the slope-intercept line equation y = m * x + b (1) and the calculations given in equations (2) and (3).
For a given x interval δx along a line, we can compute the corresponding y interval δy from equation (2) as
δy = m * δx    (4)
Similarly, we can obtain the x interval δx corresponding to a specified δy as
δx = δy / m    (5)
These equations form the basis for determining deflection voltages in analog
displays, such as vector-scan system, where arbitrarily small changes in deflection
voltage are possible.
The DDA algorithm samples the line at unit intervals in one coordinate and computes the corresponding value for the other coordinate. For lines with different slope magnitudes we consider the following cases:
Case 1: if |m| ≤ 1, x increments in unit intervals, i.e., xk+1 = xk + 1. Then
m = (yk+1 - yk) / (xk+1 - xk) = yk+1 - yk
so that
yk+1 = yk + m    (1)
where k takes integer values starting from 0 for the first point and increases by 1 until the final endpoint is reached. Since m can be any real number between 0.0 and 1.0, each calculated y value must be rounded to the nearest integer pixel position.
Case 2: if |m| > 1, y increments in unit intervals, i.e., yk+1 = yk + 1. Then
m = (yk+1 - yk) / (xk+1 - xk) gives m (xk+1 - xk) = 1, so
xk+1 = xk + (1/m)    (2)
Case 3: if m = 1, both x and y increment in unit intervals, i.e., xk+1 = xk + 1 and yk+1 = yk + 1.
Equations (1) and (2) are based on the assumption that lines are to be processed from the left endpoint to the right endpoint. If this processing is reversed, so that the starting endpoint is at the right, then either we have δx = -1 and
yk+1 = yk - m    (3)
or (when the slope magnitude is greater than 1) we have δy = -1 and xk+1 = xk - (1/m).
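Combining the cases above, a sketch of a DDA line routine is shown below. It follows the style of the Bresenham listing that comes next and assumes the same low-level setPixel routine; the inline rounding and the degenerate-case guard are additions for this sketch, not part of the course listing:

#include <stdlib.h>

void setPixel (int x, int y);   /* assumed to plot the current color at (x, y) */

/* DDA: sample at unit intervals along the coordinate of greatest change. */
void lineDDA (int x0, int y0, int xEnd, int yEnd)
{
    int dx = xEnd - x0, dy = yEnd - y0, steps, k;
    float xIncrement, yIncrement, x = (float) x0, y = (float) y0;

    if (abs (dx) > abs (dy))
        steps = abs (dx);      /* |m| <= 1: step in x */
    else
        steps = abs (dy);      /* |m| > 1: step in y  */

    if (steps == 0) {          /* degenerate case: both endpoints coincide */
        setPixel (x0, y0);
        return;
    }
    xIncrement = (float) dx / (float) steps;
    yIncrement = (float) dy / (float) steps;

    setPixel ((int) (x + 0.5), (int) (y + 0.5));
    for (k = 0; k < steps; k++) {
        x += xIncrement;
        y += yIncrement;
        setPixel ((int) (x + 0.5), (int) (y + 0.5));   /* round to nearest pixel */
    }
}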
Bresenham’s Algorithm:
It is an efficient raster line-generating algorithm that uses only incremental integer calculations.
To illustrate Bresenham’s approach, we first consider the scan-conversion process
for lines with positive slope less than 1.0.
Pixel positions along a line path are then determined by sampling at unit x
intervals. Starting from the left endpoint (x0, y0) of a given line, we step to each
successive column (x position) and plot the pixel whose scan-line y value is
closest to the line path.
1. Input the two line endpoints and store the left endpoint in (x0, y0).
2. Set the color for frame-buffer position (x0, y0); i.e., plot the first point.
3. Calculate the constants ∆x, ∆y, 2∆y, and 2∆y − 2∆x, and obtain the starting value for
the decision parameter as
p0 = 2∆y −∆x
4. At each xk along the line, starting at k = 0, perform the following test: if pk < 0, the next point to plot is (xk + 1, yk) and
pk+1 = pk + 2∆y
Otherwise, the next point to plot is (xk + 1, yk + 1) and
pk+1 = pk + 2∆y − 2∆x
5. Repeat step 4 a further ∆x − 1 times.
Note: if |m| > 1.0, the roles of x and y are interchanged: we step in unit y intervals, the starting decision parameter is
p0 = 2∆x − ∆y
and the increments become 2∆x and 2∆x − 2∆y.
Code:
#include <stdlib.h>
#include <math.h>
/* Bresenham line-drawing procedure for |m| < 1.0. */
void lineBres (int x0, int y0, int xEnd, int yEnd)
{
int dx = fabs (xEnd - x0), dy = fabs(yEnd - y0);
int p = 2 * dy - dx;
int twoDy = 2 * dy, twoDyMinusDx = 2 * (dy - dx);
int x, y;
/* Determine which endpoint to use as start position. */
if (x0 > xEnd)
{
x = xEnd;
y = yEnd;
xEnd = x0;
}
else {
x = x0;
y = y0;
}
setPixel (x, y);
while (x < xEnd) {
    x++;
    if (p < 0)
        p += twoDy;
    else {
        y++;
        p += twoDyMinusDx;
    }
    setPixel (x, y);   // Plot the next pixel along the line path.
  }
}
Module-2
2D and 3D graphics with OpenGL
2D Geometric Transformations
Operations that are applied to the geometric description of an object to change its
position, orientation, or size are called geometric transformations.
Two-Dimensional Translation
The translation distance pair (tx, ty) is called a translation vector or shift vector
In column-vector representation, the translation is written P' = P + T, where the coordinates transform as x' = x + tx and y' = y + ty.
Code:
class wcPt2D {
public:
    GLfloat x, y;
};

void translatePolygon (wcPt2D * verts, GLint nVerts, GLfloat tx, GLfloat ty)
{
    GLint k;
    for (k = 0; k < nVerts; k++) {
        verts [k].x = verts [k].x + tx;
        verts [k].y = verts [k].y + ty;
    }
    glBegin (GL_POLYGON);
    for (k = 0; k < nVerts; k++)
        glVertex2f (verts [k].x, verts [k].y);
    glEnd ( );
}
Two-Dimensional Rotation
We generate a rotation transformation of an object by specifying a rotation axis
and a rotation angle.
A two-dimensional rotation of an object is obtained by repositioning the object
along a circular path in the xy plane.
In this case, we are rotating the object about a rotation axis that is perpendicular to
the xy plane (parallel to the coordinate z axis).
Parameters for the two-dimensional rotation are the rotation angle θ and a position
(xr, yr ), called the rotation point (or pivot point), about which the object is to be
rotated
A positive value for the angle θ defines a counterclockwise rotation about the pivot
point, as in above Figure , and a negative value rotates objects in the clockwise
direction.
The angular and coordinate relationships of the original and transformed point
positions are shown in Figure
In this figure, r is the constant distance of the point from the origin, angle φ is the
original angular position of the point from the horizontal, and θ is the rotation angle.
We can express the transformed coordinates in terms of angles θ and φ as
x' = r cos(φ + θ) = x cos θ - y sin θ
y' = r sin(φ + θ) = x sin θ + y cos θ
The transformation equations for rotation of a point about any specified rotation position (xr, yr) are
x' = xr + (x - xr) cos θ - (y - yr) sin θ
y' = yr + (x - xr) sin θ + (y - yr) cos θ
Code:
class wcPt2D {
public:
GLfloat x, y;
};
void rotatePolygon (wcPt2D * verts, GLint nVerts, wcPt2D pivPt, GLdouble theta)
{
wcPt2D * vertsRot = new wcPt2D [nVerts];   // Storage for the rotated vertices.
GLint k;
for (k = 0; k < nVerts; k++) {
vertsRot [k].x = pivPt.x + (verts [k].x - pivPt.x) * cos (theta) - (verts
[k].y - pivPt.y) * sin (theta);
vertsRot [k].y = pivPt.y + (verts [k].x - pivPt.x) * sin (theta) + (verts
[k].y - pivPt.y) * cos (theta);
}
glBegin (GL_POLYGON);
for (k = 0; k < nVerts; k++)
glVertex2f (vertsRot [k].x, vertsRot [k].y);
glEnd ( );
}
Two-Dimensional Scaling
To alter the size of an object, we apply a scaling transformation.
The basic two-dimensional scaling equations are x' = x * sx and y' = y * sy, which can also be written in the matrix form P' = S * P, where S is the 2 x 2 matrix with sx and sy on the diagonal.
Specifying a value of 1 for both sx and sy leaves the size of objects unchanged.
When sx and sy are assigned the same value, a uniform scaling is produced, which
maintains relative object proportions.
Unequal values for sx and sy result in a differential scaling that is often used in
design applications.
In some systems, negative values can also be specified for the scaling parameters.
This not only resizes an object, it reflects it about one or more of the coordinate
axes.
Figure below illustrates scaling of a line by assigning the value 0.5 to both sx and sy
We can control the location of a scaled object by choosing a position, called the
fixed point, that is to remain unchanged after the scaling transformation.
Coordinates for the fixed point, (xf, yf), are often chosen at some object position, such as its centroid, but any other spatial position can be selected.
For a coordinate position (x, y), the scaled coordinates (x', y') relative to the fixed point are then calculated from the following relationships:
x' = x * sx + xf (1 - sx)
y' = y * sy + yf (1 - sy)
where the additive terms xf (1 - sx) and yf (1 - sy) are constants for all points in the object.
Code:
class wcPt2D {
public:
GLfloat x, y;
};
void scalePolygon (wcPt2D * verts, GLint nVerts, wcPt2D fixedPt, GLfloat sx,
GLfloat sy)
{
wcPt2D * vertsNew = new wcPt2D [nVerts];   // Storage for the scaled vertices.
GLint k;
for (k = 0; k < nVerts; k++) {
vertsNew [k].x = verts [k].x * sx + fixedPt.x * (1 - sx);
vertsNew [k].y = verts [k].y * sy + fixedPt.y * (1 - sy);
}
glBegin (GL_POLYGON);
for (k = 0; k < nVerts; k++)
glVertex2f (vertsNew [k].x, vertsNew [k].y);
glEnd ( );
}
Homogeneous Coordinates
Multiplicative and translational terms for a two-dimensional geometric
transformation can be combined into a single matrix if we expand the
representations to 3 × 3 matrices
We can use the third column of a transformation matrix for the translation terms,
and all transformation equations can be expressed as matrix multiplications.
We also need to expand the matrix representation for a two-dimensional coordinate
position to a three-element column matrix.
The coordinate position is transformed using the composite matrix M, rather than applying the individual transformations M1 and then M2.
By multiplying the two rotation matrices, we can verify that two successive
rotations are additive:
R(θ2) · R(θ1) = R(θ1 + θ2)
So that the final rotated coordinates of a point can be calculated with the composite
rotation matrix as
P’ = R(θ1 + θ2) · P
We can generate a two-dimensional rotation about any other pivot point (xr , yr ) by
performing the following sequence of translate-rotate-translate operations:
1. Translate the object so that the pivot-point position is moved to the coordinate
origin.
2. Rotate the object about the coordinate origin.
3. Translate the object so that the pivot point is returned to its original position.
The composite transformation matrix for this sequence is obtained with the concatenation R(xr, yr, θ) = T(xr, yr) · R(θ) · T(-xr, -yr).
In the same way, we can produce a two-dimensional scaling with respect to a selected fixed position (xf, yf) with the sequence:
1. Translate the object so that the fixed point coincides with the coordinate origin.
2. Scale the object with respect to the coordinate origin.
3. Use the inverse of the translation in step (1) to return the object to its original position.
Concatenating the matrices for these three operations produces the required scaling matrix
S(xf, yf, sx, sy) = T(xf, yf) · S(sx, sy) · T(-xf, -yf)
We can scale an object in other directions by rotating the object to align the desired
scaling directions with the coordinate axes before applying the scaling
transformation.
Suppose we want to apply scaling factors with values specified by parameters s1
and s2 in the directions shown in Figure
The composite matrix resulting from the product of these three transformations is
Property 2:
Transformation products, on the other hand, may not be commutative. The matrix product M2 · M1 is not equal to M1 · M2, in general.
This means that if we want to translate and rotate an object, we must be careful
about the order in which the composite matrix is evaluated
The four elements rsjk are the multiplicative rotation-scaling terms in the transformation, which involve only rotation angles and scaling factors.
If an object is to be scaled and rotated about its centroid coordinates (xc, yc) and then translated, the values for the elements of the composite transformation matrix are
Although the above matrix requires nine multiplications and six additions, the explicit calculations for the transformed coordinates are
x' = x * rsxx + y * rsxy + trsx,  y' = x * rsyx + y * rsyy + trsy
so we actually need to perform only four multiplications and four additions to transform each coordinate position.
Because rotation calculations require trigonometric evaluations and several multiplications for each transformed point, computational efficiency can become an important consideration in rotation transformations.
If we are rotating in small angular steps about the origin, for instance, we can set cos θ
to 1.0 and reduce transformation calculations at each step to two multiplications and
two additions for each set of coordinates to be rotated.
These rotation calculations are
x’= x − y sin θ, y’ = x sin θ + y
where the four elements rjk are the multiplicative rotation terms, and the elements trx and try are the translational terms.
A rigid-body change in coordinate position is also sometimes referred to as a rigid-motion transformation.
In addition, the above matrix has the property that its upper-left 2 × 2 submatrix is
an orthogonal matrix.
If we consider each row (or each column) of the submatrix as a vector, then the two
row vectors (rxx, rxy) and (ryx, ryy) (or the two column vectors) form an orthogonal
set of unit vectors.
Such a set of vectors is also referred to as an orthonormal vector set. Each vector
has unit length as follows
Therefore, if these unit vectors are transformed by the rotation submatrix, then the
vector (rxx, rxy) is converted to a unit vector along the x axis and the vector (ryx,
ryy) is transformed into a unit vector along the y axis of the coordinate system
For example, the following rigid-body transformation first rotates an object through
an angle θ about a pivot point (xr , yr ) and then translates the object
Here, orthogonal unit vectors in the upper-left 2×2 submatrix are (cos θ, −sin θ) and
(sin θ, cos θ).
The rotation matrix for revolving an object from position (a) to position (b) can be
constructed with the values of the unit orientation vectors u’ and v’ relative to the original
orientation.
Reflection
A transformation that produces a mirror image of an object is called a reflection.
This transformation retains x values, but “flips” the y values of coordinate positions.
The resulting orientation of an object after it has been reflected about the x axis is
shown in Figure
A reflection about the line x = 0 (the y axis) flips x coordinates while keeping y coordinates the same.
Figure below illustrates the change in position of an object that has been reflected about the line x = 0.
We flip both the x and y coordinates of a point by reflecting relative to an axis that
is perpendicular to the xy plane and that passes through the coordinate origin the
matrix representation for this reflection is
If we choose the reflection axis as the diagonal line y = x (Figure below), the
reflection matrix is
To obtain a transformation matrix for reflection about the diagonal y = -x, we could concatenate matrices for the transformation sequence:
(1) clockwise rotation by 45°,
(2) reflection about the y axis, and
(3) counterclockwise rotation by 45°.
Shear
A transformation that distorts the shape of an object such that the transformed
shape appears as if the object were composed of internal layers that had been
caused to slide over each other is called a shear.
Two common shearing transformations are those that shift coordinate x values and
those that shift y values. An x-direction shear relative to the x axis is produced with
the transformation Matrix
Any real number can be assigned to the shear parameter shx. Setting parameter shx to the value 2, for example, changes the square into a parallelogram, as shown below. Negative values for shx shift coordinate positions to the left.
A unit square (a) is converted to a parallelogram (b) using the x-direction shear with shx = 2.
A y-direction shear relative to the line x = xref is generated with the transformation
Matrix
Translating an object from screen position (a) to the destination position shown in (b) by
moving a rectangular block of pixel values. Coordinate positions Pmin and Pmax specify
the limits of the rectangular block to be moved, and P0 is the destination reference position.
For array rotations that are not multiples of 90◦, we need to do some extra
processing.
The general procedure is illustrated in Figure below.
Each destination pixel area is mapped onto the rotated array and the amount of
overlap with the rotated pixel areas is calculated.
A color for a destination pixel can then be computed by averaging the colors of the
overlapped source pixels, weighted by their percentage of area overlap.
Pixel areas in the original block are scaled, using specified values for sx and sy, and
then mapped onto a set of destination pixels.
The color of each destination pixel is then assigned according to its area of overlap
with the scaled pixel areas
A block of RGB color values in a buffer can be saved in an array with the function
glReadPixels (xmin, ymin, width, height, GL_RGB, GL_UNSIGNED_BYTE,
colorArray);
If color-table indices are stored at the pixel positions, we replace the constant GL_RGB with GL_COLOR_INDEX.
To rotate the color values, we rearrange the rows and columns of the color array, as described in the previous section. Then we put the rotated array back in the buffer with
glDrawPixels (width, height, GL_RGB, GL_UNSIGNED_BYTE, colorArray);
which writes the block at the current raster position.
We produce a scaling transformation with respect to the coordinate origin with the function
glScale* (sx, sy, sz);
The suffix code is again either f or d, and the scaling parameters can be assigned any real-number values.
Scaling in a two-dimensional system involves changes in the x and y dimensions, so a typical two-dimensional scaling operation has a z scaling factor of 1.0.
Example: glScalef (2.0, -3.0, 1.0);
We can also concatenate a specified matrix with the current matrix as follows:
glMultMatrix* (otherElements16);
Again, the suffix code is either f or d, and parameter otherElements16 is a 16-element, single-subscripted array that lists the elements of some other matrix in column-major order.
Thus, assuming that the current matrix is the modelview matrix, which we designate as M, the updated modelview matrix is computed as
M = M · M'
The glMultMatrix function can also be used to set up any transformation sequence
with individually defined matrices.
For example,
glMatrixMode (GL_MODELVIEW);
glLoadIdentity ( ); // Set current matrix to the identity.
glMultMatrixf (elemsM2); // Postmultiply identity with matrix M2.
glMultMatrixf (elemsM1); // Postmultiply M2 with matrix M1.
produces the following current modelview matrix:
M = M2 · M1
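The same postmultiplication ordering applies to the built-in routines glTranslatef and glRotatef (standard OpenGL matrix functions). As a sketch, rotation about a pivot point (xr, yr) in the xy plane can be set up as follows, remembering that the last transformation specified is the first one applied to the object:

#include <GL/glut.h>

/* Concatenate a rotation of "theta" degrees about pivot (xr, yr) onto the
   current modelview matrix (translate pivot to origin, rotate, translate back). */
void rotateAboutPivot (GLfloat xr, GLfloat yr, GLfloat theta)
{
    glMatrixMode (GL_MODELVIEW);
    glTranslatef (xr, yr, 0.0);          /* Step 3: move the pivot back.          */
    glRotatef (theta, 0.0, 0.0, 1.0);    /* Step 2: rotate about the z axis.      */
    glTranslatef (-xr, -yr, 0.0);        /* Step 1: move the pivot to the origin. */
}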
3D Geometric Transformations
Three-Dimensional Geometric Transformations
• Methods for geometric transformations in three dimensions are extended from two
dimensional methods by including considerations for the z coordinate.
• A three-dimensional position, expressed in homogeneous coordinates, is
represented as a four-element column vector
Three-Dimensional Translation
A position P = (x, y, z) in three-dimensional space is translated to a location P' = (x', y', z') by adding translation distances tx, ty, and tz to the Cartesian coordinates of P:
x' = x + tx, y' = y + ty, z' = z + tz
or, in matrix form, P' = T · P, where T is the 4 x 4 translation matrix with tx, ty, and tz in the fourth column.
CODE:
typedef GLfloat Matrix4x4 [4][4];
/* Construct the 4 x 4 identity matrix. */
void matrix4x4SetIdentity (Matrix4x4 matIdent4x4)
{
GLint row, col;
for (row = 0; row < 4; row++)
for (col = 0; col < 4 ; col++)
matIdent4x4 [row][col] = (row == col);
}
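Building on the identity-matrix helper above, a possible continuation (a sketch; the function name translate3D and its parameter list are assumptions consistent with the routines in these notes) that fills in the 4 x 4 translation matrix is:

/* Construct the 4 x 4 translation matrix for distances tx, ty, tz. */
void translate3D (GLfloat tx, GLfloat ty, GLfloat tz, Matrix4x4 matTransl3D)
{
    /* Start from the identity matrix. */
    matrix4x4SetIdentity (matTransl3D);

    /* Place the translation distances in the fourth column. */
    matTransl3D [0][3] = tx;
    matTransl3D [1][3] = ty;
    matTransl3D [2][3] = tz;
}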
Three-Dimensional Rotation
By convention, positive rotation angles produce counterclockwise rotations about a
coordinate axis.
Positive rotations about a coordinate axis are counterclockwise, when looking along
the positive half of the axis toward the origin.
The two-dimensional z-axis rotation equations extend directly to three dimensions:
x' = x cos θ - y sin θ, y' = x sin θ + y cos θ, z' = z
Transformation equations for rotations about the other two coordinate axes can be obtained with a cyclic permutation of the coordinate parameters x, y, and z:
x → y → z → x
Along the x axis: y' = y cos θ - z sin θ, z' = y sin θ + z cos θ, x' = x
Along the y axis: z' = z cos θ - x sin θ, x' = z sin θ + x cos θ, y' = y
When an object is to be rotated about an axis that is parallel to one of the coordinate axes, we can obtain the desired rotation with the following transformation sequence:
1. Translate the object so that the rotation axis coincides with the parallel coordinate axis.
2. Perform the specified rotation about that axis.
3. Translate the object so that the rotation axis is moved back to its original position.
When an object is to be rotated about an axis that is not parallel to one of the coordinate axes, we must perform some additional transformations. We can accomplish the required rotation in five steps:
1. Translate the object so that the rotation axis passes through the coordinate origin.
2. Rotate the object so that the axis of rotation coincides with one of the coordinate axes.
3. Perform the specified rotation about the selected coordinate axis.
4. Apply inverse rotations to bring the rotation axis back to its original orientation.
5. Apply the inverse translation to bring the rotation axis back to its original spatial
position.
We define a unit vector u = (a, b, c) along the specified rotation axis, where the components a, b, and c are the direction cosines for the rotation axis.
• The first step in the rotation sequence is to set up the translation matrix that
repositions the rotation axis so that it passes through the coordinate origin.
• Translation matrix is given by
• Because rotation calculations involve sine and cosine functions, we can use
standard vector operations to obtain elements of the two rotation matrices.
• A vector dot product can be used to determine the cosine term, and a vector cross
product can be used to calculate the sine term.
• Rotation of u around the x axis into the x z plane is accomplished by rotating u’ (the
projection of u in the y z plane) through angle α onto the z axis.
• If we represent the projection of u in the yz plane as the vector u’= (0, b, c), then the
cosine of the rotation angle α can be determined from the dot product of u’ and the
unit vector uz along the z axis:
cos α = (u' · uz) / (|u'| |uz|) = c / d, where d = |u'| = sqrt(b² + c²); the cross product of u' and uz similarly gives sin α = b / d.
• We have determined the values for cos α and sin α in terms of the components of
vector u, the matrix elements for rotation of this vector about the x axis and into the
xz plane
• Rotation of unit vector u” (vector u after rotation into the x z plane) about the y
axis. Positive rotation angle β aligns u” with vector uz .
• We can determine the cosine of rotation angle β from the dot product of unit vectors
u’’ and uz. Thus,
• we find that
• The specified rotation angle θ can now be applied as a rotation about the z axis as
follows:
• The transformation matrix for rotation about an arbitrary axis can then be expressed
as the composition of these seven individual transformations:
• The composite matrix for any sequence of three-dimensional rotations is of the form
• Assuming that the rotation axis is not parallel to any coordinate axis, we could form
the following set of local unit vectors
• If we express the elements of the unit local vectors for the rotation axis as
• Then the required composite matrix, which is equal to the product Ry(β) · Rx(α), is
Rotation of the point is then carried out with the quaternion operation
The second term in this ordered pair is the rotated point position p’, which is
evaluated with vector dot and cross-products as
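In standard form (a reconstruction of the expressions referred to above), with the unit quaternion q = (s, v), where s = cos(θ/2) and v = u sin(θ/2) for the unit rotation axis u, and with the point represented as the quaternion P = (0, p):
\[
P' = q\,P\,q^{-1}, \qquad
\mathbf{p}' = s^{2}\mathbf{p} + \mathbf{v}(\mathbf{p}\cdot\mathbf{v}) + 2s(\mathbf{v}\times\mathbf{p}) + \mathbf{v}\times(\mathbf{v}\times\mathbf{p})
\]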
Three-Dimensional Scaling
The matrix expression for the three-dimensional scaling transformation of a
position P = (x, y, z) is given by
where scaling parameters sx, sy, and sz are assigned any positive values.
Explicit expressions for the scaling transformation relative to the origin are
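In standard form (a reconstruction of the equations referred to above), the scaling matrix and the explicit coordinate expressions are:
\[
\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} =
\begin{bmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix},
\qquad
x' = x \cdot s_x, \quad y' = y \cdot s_y, \quad z' = z \cdot s_z
\]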
Because some graphics packages provide only a routine that scales relative to the
coordinate origin, we can always construct a scaling transformation with respect to
any selected fixed position (xf , yf , zf ) using the following transformation sequence:
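The sequence (a standard construction) is: (1) translate the fixed point to the coordinate origin, (2) apply the origin-based scaling, and (3) translate the fixed point back to its original position. The composite matrix is:
\[
T(x_f, y_f, z_f)\cdot S(s_x, s_y, s_z)\cdot T(-x_f, -y_f, -z_f) =
\begin{bmatrix}
s_x & 0 & 0 & (1-s_x)x_f \\
0 & s_y & 0 & (1-s_y)y_f \\
0 & 0 & s_z & (1-s_z)z_f \\
0 & 0 & 0 & 1
\end{bmatrix}
\]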
CODE:
class wcPt3D
{
private:
GLfloat x, y, z;
public:
/* Default Constructor:
* Initialize position as (0.0, 0.0, 0.0).
*/
wcPt3D ( ) {
x = y = z = 0.0;
}
void setCoords (GLfloat xCoord, GLfloat yCoord, GLfloat zCoord) {
x = xCoord;
y = yCoord;
z = zCoord;
}
GLfloat getx ( ) const {
return x;
}
GLfloat gety ( ) const {
return y;
}
GLfloat getz ( ) const {
return z;
}
};
typedef GLfloat Matrix4x4 [4][4];
void scale3D (GLfloat sx, GLfloat sy, GLfloat sz, wcPt3D fixedPt)
{
Matrix4x4 matScale3D;
/* Initialize the scaling matrix to the identity. */
matrix4x4SetIdentity (matScale3D);
/* Set up the composite scaling matrix relative to the fixed point,
using x' = sx * x + (1 - sx) * xf, and similarly for y and z. */
matScale3D [0][0] = sx;
matScale3D [0][3] = (1 - sx) * fixedPt.getx ( );
matScale3D [1][1] = sy;
matScale3D [1][3] = (1 - sy) * fixedPt.gety ( );
matScale3D [2][2] = sz;
matScale3D [2][3] = (1 - sz) * fixedPt.getz ( );
}
Three-Dimensional Shears
These transformations can be used to modify object shapes.
For three-dimensional we can also generate shears relative to the z axis.
A general z-axis shearing transformation relative to a selected reference position is
produced with the following matrix:
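A standard form of this z-axis shearing matrix, with shear parameters sh_zx and sh_zy and reference position z_ref, is:
\[
\begin{bmatrix}
1 & 0 & sh_{zx} & -sh_{zx}\, z_{ref} \\
0 & 1 & sh_{zy} & -sh_{zy}\, z_{ref} \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix},
\qquad
x' = x + sh_{zx}(z - z_{ref}), \quad y' = y + sh_{zy}(z - z_{ref}), \quad z' = z
\]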
A unit cube (a) is sheared relative to the origin (b) by this matrix with shzx = shzy = 1.
There are two functions available in OpenGL for processing the matrices in a stack:
glPushMatrix ( );
which copies the current matrix at the top of the active stack and stores that copy in the second stack position; and
glPopMatrix ( );
which destroys the matrix at the top of the stack, so that the second matrix in the stack becomes the current matrix.
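A minimal usage sketch (the transformation values and the drawing call are illustrative) showing how the stack preserves the current modelview matrix around a local transformation:
glMatrixMode (GL_MODELVIEW);
glPushMatrix ( );               /* Save a copy of the current modelview matrix. */
glTranslatef (20.0, 10.0, 0.0); /* Local transformation for this object only.   */
glRotatef (45.0, 0.0, 0.0, 1.0);
/* ... draw the object here ... */
glPopMatrix ( );                /* Discard the local transformation and restore the saved matrix. */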
Module-3
Interactive Input Methods and Graphical User Interfaces
These are logical functions that are defined by how they handle input or output character strings from the perspective of a C program. From the logical-device perspective, inputs appear to come from inside the application program.
The two major characteristics that describe the logical behavior of input devices are:
1. the measurements that the device returns to the user program, and
2. the time at which the device returns those measurements.
The API defines six classes of logical input devices, which are given below:
1. STRING: A string device is a logical device that provides the ASCII values of input characters to the user program. This logical device is usually implemented by means of a physical keyboard.
2. LOCATOR: A locator device provides a position in world coordinates to the user program. It is usually implemented by means of a pointing device such as a mouse or trackball.
3. PICK: A pick device returns the identifier of an object on the display to the user
program. It is usually implemented with the same physical device as the locator
but has a separate software interface to the user program. In OpenGL, we can use a
process of selection to accomplish picking.
4. CHOICE: A choice device allows the user to select one of a discrete number of
options. In OpenGL, we can use various widgets provided by the window system. A widget is a graphical interactive component provided by the window system or a toolkit; widgets include menus, scrollbars, and graphical buttons. For example, a menu with n selections acts as a choice device, allowing the user to select one of n alternatives.
5. VALUATOR: A valuator device provides analog input to the user program; on some graphical systems, there are boxes or dials to provide such values.
6. STROKE: A stroke device returns an array of locations. For example, pushing down a mouse button starts the transfer of data into a specified array, and releasing the button ends this transfer.
INPUT MODES
Input devices can provide input to an application program in terms of two entities:
1. Measure of a device is what the device returns to the user program.
2. Trigger of a device is a physical input on the device with which the user can send a signal to the computer.
For example, the measure of a mouse is the position of the cursor, whereas the trigger is a press of a mouse button.
The application program can obtain the measure and trigger in three distinct modes:
1. REQUEST MODE: In this mode, the measure of the device is not returned to the program until the device is triggered.
• For example, consider a logical device such as a locator: we can move our pointing device to the desired location and then trigger the device with its button; the trigger causes the location to be returned to the application program.
2. SAMPLE MODE: In this mode, input is immediate. As soon as the function call in the user program is executed, the measure is returned; hence no trigger is needed.
Both request and sample modes are useful only when there is a single input device from which the input is to be taken. However, in applications such as flight simulators or computer games, a variety of input devices are used, and these modes cannot be used. Thus, event mode is used.
3. EVENT MODE: In this mode, each trigger generates an event whose measure is placed in an event queue for the application program to process.
• If the queue is empty, then the application program will wait until an event occurs. If there is an event in the queue, the program can look at the first event type and then decide what to do.
Another approach is to associate a function with an event, to be executed when the event occurs; such a function is called a callback.
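For example (a minimal GLUT sketch; the handler name is illustrative), a callback is registered once, and GLUT invokes it whenever the corresponding event occurs:
void mouseHandler (int button, int state, int x, int y)
{
/* Invoked by GLUT on every mouse-button event; (x, y) is the cursor position. */
if (button == GLUT_LEFT_BUTTON && state == GLUT_DOWN)
glutPostRedisplay ( );   /* Ask GLUT to redraw the scene. */
}
/* Registered once, typically in main: */
glutMouseFunc (mouseHandler);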
MENUS
Menus are an important feature of any application program. OpenGL provides a
feature called “Pop-up-menus” using which sophisticated interactive applications
can be created.
Menu creation involves the following steps:
1. Define the actions corresponding to each entry in the menu.
2. Link the menu to a corresponding mouse button.
GLUT also supports the creation of hierarchical menus, as sketched below:
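A minimal sketch (the handler and entry names are illustrative) of a pop-up menu with one submenu, attached to the right mouse button:
void colorSubMenu (int option)
{
/* Callback invoked with the identifier of the selected submenu entry. */
}
void mainMenu (int option)
{
/* Callback for the top-level menu entries (e.g., option 1 = "Quit"). */
}
void createMenus (void)
{
int subMenu;
subMenu = glutCreateMenu (colorSubMenu);   /* Create the submenu first.           */
glutAddMenuEntry ("Red", 1);
glutAddMenuEntry ("Green", 2);
glutCreateMenu (mainMenu);                 /* Then the top-level menu.            */
glutAddSubMenu ("Color", subMenu);         /* Hierarchical (cascading) entry.     */
glutAddMenuEntry ("Quit", 1);
glutAttachMenu (GLUT_RIGHT_BUTTON);        /* Step 2: link the menu to a button.  */
}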
Computer Animation
Design of Animation Sequences
Constructing an animation sequence can be a complicated task, particularly when it involves a story line and multiple objects, each of which can move in a different way. A basic approach is to design such animation sequences using the following development stages:
• Storyboard layout
• Object definitions
• Key-frame specifications
• Generation of in-between frames
The storyboard is an outline of the action. It defines the motion sequence as a set of basic
events that are to take place. Depending on the type of animation to be produced, the
storyboard could consist of a set of rough sketches, along with a brief description of the
movements, or it could just be a list of the basic ideas for the action. Originally, the set of motion sketches was attached to a large board that was used to present an overall view of the animation project. Hence the name “storyboard.”
An object definition is given for each participant in the action. Objects can be defined in terms of basic shapes, such as polygons or spline surfaces. In addition, a description is often given of the movements that are to be performed by each character or object in the story.
A key frame is a detailed drawing of the scene at a certain time in the animation sequence.
Within each key frame, each object (or character) is positioned according to the time for that
frame. Development of the key frames is generally the responsibility of the senior animators,
and often a separate animator is assigned to each character in the animation.
In-betweens are the intermediate frames between the key frames. The total number of
frames, and hence the total number of in-betweens, needed for an animation is determined
by the display media that is to be used. Film requires 24 frames per second, and graphics terminals are refreshed at the rate of 60 or more frames per second. Typically, time intervals for the motion are set up so that there are from three to five in-betweens for each pair of key
frames. Depending on the speed specified for the motion, some key frames could be
duplicated. As an example, a 1-minute film sequence with no duplication requires a total of
1,440 frames. If five in-betweens are required for each pair of key frames, then 288 key
frames would need to be developed.
Traditional Animation Techniques
Film animators use a variety of methods for depicting and emphasizing motion
sequences.
These include object deformations, spacing between animation frames, motion anticipation and follow-through, and action focusing.
One of the most important techniques for simulating acceleration effects, particularly
for nonrigid objects, is squash and stretch.
Figure 4 shows how this technique is used to emphasize the acceleration and deceleration of a bouncing ball. As the ball accelerates, it begins to stretch.
When the ball hits the floor and stops, it is first compressed (squashed) and then
stretched again as it accelerates and bounces upwards.
Another technique used by film animators is timing, which refers to the spacing between motion frames.
A slower moving object is represented with more closely spaced frames, and a faster
moving object is displayed with fewer frames over the path of the motion.
This effect is illustrated in Figure 5, where the position changes between frames
increase as a bouncing ball moves faster.
Object movements can also be emphasized by creating preliminary actions that
indicate an anticipation of a coming motion. For example, a cartoon character might lean forward and rotate its body before starting to run; or a character might perform a
Similarly, follow-through actions can be used to emphasize a previous motion.
After throwing a ball, a character can continue the arm swing back to its body; or a
hat can fly off a character that is stopped abruptly.
An action also can be emphasized with staging, which refers to any method for focusing on an important part of a scene, such as a character hiding something.
Computer-Animation Languages
We can develop routines to design and control animation sequences within a general-
purpose programming language, such as C, C++, Lisp, or Fortran, but several
specialized animation languages have been developed.
These languages typically include a graphics editor, a key-frame generator, an in-between generator, and standard graphics routines.
The graphics editor allows an animator to design and modify object shapes, using
spline surfaces, constructive solid geometry methods, or other representation
schemes.
An important task in an animation specification is scene description. This includes
the positioning of objects and light sources, defining the photometric parameters
(light-source intensities and surface illumination properties), and setting the camera
parameters (position, orientation, and lens characteristics).
Another standard function is action specification, which involves the layout of
motion paths for the objects and camera.
We need the usual graphics routines: viewing and perspective transformations,
geometric transformations to generate object movements as a function of
accelerations or kinematic path specifications, visible-surface identification, and the
surface-rendering operations.
Key-frame systems were originally designed as a separate set of animation routines
for generating the in-betweens from the user-specified key frames. Now, these
routines are often a component in a more general animation package.
In the simplest case, each object in a scene is defined as a set of rigid bodies
connected at the joints and with a limited number of degrees of freedom.
Parameterized systems allow object motion characteristics to be specified as part of the object definitions. The adjustable parameters control such object characteristics as degrees of freedom, motion limitations, and allowable shape changes.
Scripting systems allow object specifications and animation sequences to be defined
with a user-input script. From the script, a library of various objects and motions can
be constructed.
Character Animation
Animation of simple objects is relatively straightforward.
Motion Capture
Periodic Motions
When we construct an animation with repeated motion patterns, such as a rotating object, we need to be sure that the motion is sampled frequently enough to represent the movements correctly.
In other words, the motion must be synchronized with the frame-generation rate so that we display enough frames per cycle to show the true motion. Otherwise, the animation may be displayed incorrectly.
A typical example of an undersampled periodic-motion display is the wagon wheel in a Western movie that appears to be turning in the wrong direction. If this motion is recorded on film at the standard motion-picture projection rate of 24 frames per second, and the wheel completes 3/4 of a turn every 1/24 of a second, then only one animation frame is generated per cycle. Since 3/4 of a forward turn sampled once per frame is indistinguishable from 1/4 of a backward turn, the wheel thus appears to be rotating in the opposite (counterclockwise) direction.
Module-4
Introduction to Image processing
Overview
Image processing involves the manipulation and analysis of images to enhance them or
extract useful information. It is a subset of signal processing where the input is an image,
and the output can be an image or characteristics/features associated with that image.
Image processing refers to the processing of visual information sources, such as images, for some specific task as per the application requirements. Pattern recognition deals with identifying and recognizing the objects present in an image, using the generated features together with classification or clustering. Computer vision is associated with scene understanding; most image processing algorithms produce results that can serve as the first input for computer vision algorithms. Computer graphics and image processing are closely related areas: image processing deals with raster data and bitmaps, whereas computer graphics primarily deals with vector data.
An image is defined as a two dimensional function, f(x,y), where x and y are spatial (plane)
coordinates, and the amplitude of ‘f’ at any pair of coordinates (x,y) is called the intensity or
gray level of the image at that point
The field of digital image processing refers to processing digital image by means of a digital
computer.
NOTE: A digital image is composed of a finite number of elements called picture elements, image elements, pels, or pixels.
Digital image processing methods stem from two principal application areas: (1) improvement of pictorial information for human interpretation, and (2) processing of image data for storage, transmission, and representation for autonomous machine perception.
Image acquisition: This is the first step in the digital image processing. An image is
captured by a sensor (such as digital camera) and digitized. The image that is acquired is
completely unprocessed. This step involves preprocessing such as scaling.
Image restoration: This is an area that also deals with improving the appearance of an image, but it is objective, rather than subjective, in the sense that restoration techniques tend to be based on mathematical or probabilistic models of image degradation.
Color image processing: It is an area that has been gaining in importance because of the
significant increase in the use of digital images over the internet. Color is used also for
extracting features of interest in an image. This may include color modeling and processing
in a digital domain etc.
Wavelets: These are the foundation for representing images in various degrees of resolution.
In particular used for image data compression and for pyramidal representation, in which
images are subdivided successively into smaller regions.
Compression: Deals with techniques for reducing the storage required to save an image, or the bandwidth required to transmit it. An example of an image compression standard is the jpg file extension used in the JPEG (Joint Photographic Experts Group) image compression standard.
Morphological processing: It deals with tools for extracting image components that are
useful in the representation and description of shape.
Representation and description: These follow the output of the segmentation stage, which usually is raw pixel data; this needs to be converted to a form suitable for computer processing. The first decision that must be made is whether the data should be represented as a boundary (i.e., the set of pixels separating one image region from another) or as a complete region.
Description also called feature selection, deals with extracting attributes that result in some
quantitative information of interest or are basic for differentiating one class of objects from
another.
Recognition: It is the process that assigns a label (e.g., ‘vehicle’) to an object based on its
descriptors.
The above figure shows the basic components comprising a typical general purpose system
used for digital image processing. With reference to sensing, two elements are required to
acquire digital images:
1. The physical device that is sensitive to the energy radiated by the object we wish
to image.
2. Digitizer is a device for converting the output of the physical sensing device into
digital form.
For example in a digital camera, the sensors produce an electrical output proportional to
light intensity.
Specialized image processing hardware usually consists of the digitizer plus hardware that performs other primitive operations, such as an arithmetic logic unit (ALU) that carries out operations in parallel on entire images. This type of hardware is called a front-end subsystem, and its most distinguishing characteristic is speed: this unit performs functions that require fast data throughputs that the typical main computer cannot handle.
The Computer in an image processing system is a general purpose computer and can
range from a PC to a supercomputer. In dedicated applications, sometimes custom
computers are used to achieve a required level of performance.
Software for image processing consists of specialized modules that perform specific
tasks. A well designed package also includes the capacity for the user to write code that, as a
minimum, utilizes the specialized modules.
Storage is measured in bytes, Kbytes, Mbytes, Gbytes, and Tbytes. One method of providing short-term storage is computer memory. Another is specialized buffers that store one or more images and can be accessed rapidly, usually at video rates. The latter method allows virtually instantaneous image zoom, as well as scroll (vertical shifts) and pan (horizontal shifts).
Online storage generally takes the form of magnetic disks or optical image storage. The key
factor characterizing the online storage is frequent access to the stored data. Magnetic tapes
and optical disks housed in ‘jukeboxes’ are the usual media for the archival applications.
Image displays in use today are mainly color TV monitors. Monitors are driven by the
outputs of image and graphic display cards that are an integral part of the computer system.
In some cases it is necessary to have stereo displays, and these are implemented in the form
of head gear containing two small displays embedded in goggles worn by the users.
Hardcopy devices for recording images include laser printers, film cameras, heat-sensitive devices, inkjet units, and digital units such as optical and CD-ROM disks. Film provides the highest possible resolution, but paper is the obvious medium of choice for written material.
Networking is almost a default function in any computer system in use today. Because of
the large amount of data inherent in image processing applications, the key consideration in
image transmission is bandwidth. Optical fiber and other broadband technologies are overcoming the problem of communicating with remote sites via the Internet.
The areas of application of digital image processing are so varied that some form of
organization is desirable in attempting to capture the breadth of this field. One of the
simplest ways to develop a basic understanding of the extent of image processing
applications is to categorize images according to their application.
1. Medical imaging
2. Robot vision
3. Character recognition
4. Remote Sensing.
Medical Imaging:
Gamma-Ray Imaging: Major uses of imaging based on gamma rays include nuclear
medicine and astronomical observations. In nuclear medicine, the approach is to inject a
patient with a radioactive isotope that emits gamma rays as it decays. Images are produced
from the emissions collected by gamma ray detectors.
X-ray Imaging: X-rays are among the oldest sources of EM radiation used for imaging. The
best known use of X-rays is medical diagnostics, but they also are used extensively in
industry and other areas, like astronomy. X-rays for medical and industrial imaging are
generated using an X-ray tube, which is a vacuum tube with a cathode and anode. The cathode is heated, causing free electrons to be released. These electrons flow at high speed to the positively charged anode. When the electrons strike a nucleus, energy is released in the form of X-ray radiation. The energy (penetrating power) of the X-rays is controlled by a voltage applied across the anode, and the number of X-rays is controlled by a current applied to the filament in the cathode. The intensity of the X-rays is modified by absorption as they pass through the patient, and the resulting energy falling on the film develops it, much in the same way that light develops photographic film. In digital radiography, digital images are obtained by one of two methods:
(1) by digitizing X-ray films, or (2) by having the X-rays that pass through the patient fall directly onto devices
(such as a phosphor screen) that convert X-rays to light.
Robot Vision:
Apart from the many challenges that a robot faces today, one of the biggest challenges is still to improve the vision of the robot, making the robot able to see things and identify them.
1. Hurdle detection:
Hurdle detection is one of the common tasks that is carried out through image processing, by identifying the different types of objects in the image and then calculating the distance between the robot and the hurdles.
2. Line follower robots:
Most of the robots today work by following a line and thus are called line follower robots. This helps a robot to move on its path and perform some tasks. This has also been achieved through image processing.
Character Recognition:
• Legal department
• Retail Industry
Remote Sensing:
In the field of remote sensing, an area of the earth is scanned by a satellite or from a very high altitude and then analyzed to obtain information about it. One particular application of digital image processing in the field of remote sensing is to detect infrastructure damage caused by an earthquake. The area affected by an earthquake is sometimes so wide that it is not possible to examine it with the human eye in order to estimate the damage, and even where it is possible, doing so is a very hectic and time-consuming procedure. A solution to this is found in digital image processing using remote sensing: an image of the affected area is captured from above the ground and then analyzed to detect the various types of damage done by the earthquake. The key steps included in the analysis are:
Machine Learning: Machine Learning, particularly deep learning, is essential for advanced
image processing tasks. It involves training algorithms and statistical models on large
datasets to perform tasks without explicit instructions. In image processing, machine
learning models are trained on processed image data for tasks such as image classification,
where images are sorted into categories, and image enhancement, which improves resolution
using techniques like super-resolution. Predictive modeling, another application, uses
medical images to predict disease progression. Machine learning enhances image processing
by providing powerful tools for analyzing and interpreting complex visual data.
Signal Processing: Signal Processing deals with the analysis, modification, and synthesis of
signals, including images. It provides the mathematical and computational foundation for
many image processing techniques. Applications of signal processing in image processing
include audio and speech processing, where noise reduction techniques are applied to audio
recordings, and medical imaging, where signal processing algorithms reconstruct images
from MRI and CT scans. In communications, image compression techniques reduce file
sizes for efficient transmission and storage, highlighting the integral role of signal
processing in managing and enhancing visual data.
In general, the value of the image at any coordinates (x,y) is denoted f(x,y), where x and y
are integers. The section of the real plane spanned by the coordinates of an image is called
the spatial domain, with x and y being referred to as spatial variables or spatial coordinates.
The image displays allow us to view results at a glance. Numerical arrays are used for
processing and algorithm development. In equation form, we write the representation of an
M × N numerical array as
Both sides of this equation are equivalent ways of expressing a digital image quantitatively. The right side is a matrix of real numbers. Each element of this matrix is called an image element, picture element, pixel, or pel. The digitization process requires that decisions be made regarding the values for M, N, and the number L of discrete intensity levels. Here M and N are positive integers. The number of intensity levels typically is an integer power of 2:
L = 2^k
The number of bits, b, required to store a digitized image is b = M × N × k. When M = N, this equation becomes b = N²k.
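For example, an image with M = N = 1024 and k = 8 bits per pixel (L = 2^8 = 256 intensity levels) requires b = 1024 × 1024 × 8 = 8,388,608 bits, or about 1 megabyte, of storage.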
Types of Images
1. Binary Image: Images that have only two unique values of pixel intensity- 0
(representing black) and 1 (representing white) are called binary images. Such
images are generally used to highlight a discriminating portion of a colored
image. For example, it is commonly used for image segmentation, as shown
below.
Basic relationships
Adjacency: Two pixels are adjacent if they are neighbors and share a similar
property, such as intensity. Types of adjacency include 4-adjacency, 8-adjacency,
and m-adjacency (mixed).
A pixel p at coordinates (x, y) has four horizontal and vertical neighbors whose coordinates are given by (x+1, y), (x−1, y), (x, y+1), and (x, y−1). This set of pixels is called the 4-neighbors of p and is denoted by N4(p); each of them is at unit distance from p. The four diagonal neighbors of p are (x+1, y+1), (x+1, y−1), (x−1, y+1), and (x−1, y−1). This set is denoted by ND(p); each of them is at a Euclidean distance of 1.414 from p. The points in ND(p) and N4(p) together are known as the 8-neighbors of p, denoted by N8(p). Some of the points in N4, ND, and N8 may fall outside the image when p lies on the border of the image.
N4 - 4-neighbors
ND - diagonal neighbors
N8 - 8-neighbors (N4 ∪ ND)
Neighbors of a pixel
a. 4-neighbors of a pixel p are its vertical and horizontal neighbors denoted by
N4(p)
Adjacency: Two pixels are connected if they are neighbors and their gray levels satisfy some specified criterion of similarity. For example, in a binary image, two pixels are said to be adjacent if they are 4-neighbors and have the same value (0 or 1).
Let V be the set of gray-level values used to define adjacency.
4-adjacency: Two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).
8-adjacency: Two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).
m-adjacency: Two pixels p and q with values from V are m-adjacent if
A. q is in N4(p), or
B. q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
Connectivity: Connectivity is used to determine whether pixels are adjacent in some sense. Let V be the set of gray-level values used to define connectivity; then two pixels p and q that have values from the set V are:
a. 4-connected, if q is in the set N4(p)
b. 8-connected, if q is in the set N8(p)
c. m-connected, if q is in N4(p), or if q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V
Connected Components:
If p and q are pixels of an image subset S then p is connected to q in S if there is a
path from p to q consisting entirely of pixels in S. For every pixel p in S, the set of pixels in
S that are connected to p is called a connected component of S. If S has only one connected component, then S is called a connected set.
If a region happens to be an entire image, then its boundary is defined as the set of pixels in the first and last rows and columns of the image, since an image has no neighbors beyond its border. Normally, when we refer to a region, we are referring to a subset of an image, and any pixels in the boundary of the region that happen to coincide with the border of the image are included implicitly as part of the region boundary.
Distance Metrics
1. Euclidean Distance: the straight-line distance between two pixels, calculated as d = √((x2 − x1)² + (y2 − y1)²)
2. Manhattan Distance (City Block Distance): The distance between two pixels
measured along axes at right angles, calculated as:
d = |x2 − x1| + |y2 − y1|
3. Chebyshev Distance (Chessboard Distance): the maximum of the absolute coordinate differences, calculated as d = max(|x2 − x1|, |y2 − y1|)
Given pixels p, q and z with coordinates (x, y), (s, t), (u, v) respectively, the distance function D has the following properties:
(a) D(p, q) ≥ 0, with D(p, q) = 0 if and only if p = q;
(b) D(p, q) = D(q, p);
(c) D(p, z) ≤ D(p, q) + D(q, z).
For example, the pixels with D4 (city-block) distance ≤ 2 from a pixel (x, y) form the following diamond-shaped contours of constant distance:
        2
    2   1   2
2   1   0   1   2
    2   1   2
        2
The pixels with D8 (chessboard) distance ≤ 2 from (x, y) form a square:
2   2   2   2   2
2   1   1   1   2
2   1   0   1   2
2   1   1   1   2
2   2   2   2   2
In general, the pixels with D8 distance from (x, y) less than or equal to some value r form a square centered at (x, y), as in the contours of constant distance shown above; the pixels with D8 = 1 are the 8-neighbors of (x, y).
Step 2: Selection of Features. Features to discriminate between the classes should be established using multispectral or multi-temporal characteristics, colour, texture, etc.
Step 3: Sampling of Training Data Training data should be sampled in order to determine
appropriate decision rules. Classification techniques such as supervised or unsupervised
learning will then be selected on the basis of the training data sets.
Step 4: Finding of proper decision rule Various classification techniques will be compared
with the training data, so that an appropriate decision rule is selected for subsequent
classification.
Step 5: Classification. Depending upon the decision rule, each pixel is classified into a single class. There are two methods: pixel-by-pixel classification and per-field classification with respect to segmented areas.
Step 6: Verification of Results The classified results should be checked and verified for
their accuracy and reliability.
Module-5
Image Segmentation
Image segmentation is the division of an image into regions or categories, which
correspond to different objects or parts of objects. Every pixel in an image is allocated to
one of a number of these categories. A good segmentation is typically one in which:
• pixels in the same category have similar greyscale or multivariate values and form a connected region,
• neighboring pixels which are in different categories have dissimilar values.
Segmentation is often the critical step in image analysis: the point at which we move
from considering each pixel as a unit of observation to working with objects (or parts of
objects) in the image, composed of many pixels.
Image segmentation is the key behind image understanding. Image segmentation is
considered as an important basic operation for meaningful analysis and interpretation of
image acquired.
It is a critical and essential component of an image analysis and/or pattern recognition
system, and is one of the most difficult tasks in image processing, which determines the
quality of the final segmentation.
If segmentation is done well then all other stages in image analysis are made simpler.
But, as we shall see, success is often only partial when automatic segmentation algorithms
are used. However, manual intervention can usually overcome these problems, and by this
stage the computer should already have done most of the work.
There are three general approaches to segmentation:
• Thresholding
• Edge-based methods
• Region-based methods
• In thresholding, pixels are allocated to categories according to the range of values in
which a pixel lies. Fig 4.1(a) shows boundaries which were obtained by thresholding
the muscle fibers image. Pixels with values less than 128 have been placed in one
category, and the rest have been placed in the other category. The boundaries between adjacent pixels in different categories have been superimposed in white on the original
image. It can be seen that the threshold has successfully segmented the image into the
two predominant fiber types.
For example, in Fig 4.1(a) boundaries are well placed, but others are missing. In Fig 4.1(c),
however, more boundaries are present, and they are smooth, but they are not always in
exactly the right positions.
The following three sections will consider these three approaches in more detail. Algorithms
will be considered which can either be fully automatic or require some manual intervention.
The key points of the chapter will be summarized in §4.4.
Fig. Boundaries produced by three segmentations of the muscle fibres image: (a) by thresholding, (b) connected regions after thresholding the output of Prewitt's edge filter and removing small regions, (c) result produced by the watershed algorithm on output from a variance filter with Gaussian weights (σ² = 96).
Discontinuity-based segmentation
Discontinuity-based segmentation is one of the widely used techniques for monochrome image segmentation. In the discontinuity-based approach, the partition or subdivision of an image is based on abrupt changes in the intensity level of the image. Here, we are mainly interested in the identification of isolated points, lines, and edges in an image; this is the area of edge detection algorithms. Under this approach we analyze point detection, line detection, and edge detection techniques, using operators based on first-order and second-order derivatives, such as the Prewitt, Sobel, and Roberts operators.
THRESHOLDING:
In the simplest case, a single threshold t is chosen, and a pixel (i, j) with value fij is placed in one category if fij ≤ t, and in the other category otherwise.
Note that:
Although pixels in a single thresholded category will have similar values (either in the
range 0 to t, or in the range (t + 1) to 255), they will not usually constitute a single connected
component. This is not a problem in the soil image because the object (air) is not necessarily
connected, either in the imaging plane or in three-dimensions. In other cases, thresholding
would be followed by dividing the initial categories into sub-categories of connected
regions.
More than one threshold can be used, in which case more than two categories are
produced.
Thresholds can be chosen automatically.
In §4.1.1 we will consider algorithms for choosing the threshold on the basis of the histogram of greyscale pixel values. In §4.1.2, manually and automatically selected thresholds will be considered.
HISTOGRAM-BASED THRESHOLDING
We will denote the histogram of pixel values by h0, h1,...,hN , where hk specifies the
number of pixels in an image with grey scale value k and N is the maximum pixel value
(typically 255). Ridler and Calvard (1978) and Trussell (1979) proposed a simple algorithm
for choosing a single threshold. We shall refer to it as the intermeans algorithm. First we
will describe the algorithm in words, and then mathematically.
Initially, a guess has to be made at a possible value for the threshold. From this, the
mean values of pixels in the two categories produced using this threshold are calculated. The
threshold is repositioned to lie exactly half way between the two means. Mean values are
calculated again and a new threshold is obtained, and so on until the threshold stops
changing value. Mathematically, the algorithm can be specified as follows.
1. Make an initial guess at t: for example, set it equal to the median pixel value, that is, the value for which the cumulative histogram first exceeds half the total number of pixels.
2. Calculate the mean pixel value in each category. For values less than or equal to t, this is given by
µ1 = Σ(k=0..t) k hk / Σ(k=0..t) hk ,
and similarly, for values greater than t,
µ2 = Σ(k=t+1..N) k hk / Σ(k=t+1..N) hk .
3. Re-estimate the threshold as t = [½(µ1 + µ2)], where [ ] denotes 'the integer part of' the expression between the brackets.
4. Repeat steps (2) and (3) until t stops changing value between consecutive evaluations.
Fig shows the histogram of the soil image. From an initial value of t = 28 (the median pixel
value), the algorithm changed t to 31, 32, and 33 on the first three iterations, and then t
remained unchanged. The pixel means in the two categories are 15.4 and 52.3. Fig (a) shows
the result of using this threshold. Note that this value of t is considerably higher than the
threshold value of 20 which we favored in the manual approach.
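A minimal sketch of the intermeans iteration in C (assuming an 8-bit image whose histogram h[0..255] has already been computed; the function name is illustrative):
int intermeansThreshold (const long h[256], int tInitial)
{
int t = tInitial, tOld = -1, iter;
for (iter = 0; iter < 100 && t != tOld; iter++) {
long n1 = 0, n2 = 0;
double sum1 = 0.0, sum2 = 0.0;
int k;
/* Mean grey level of each category defined by the current threshold t. */
for (k = 0; k <= t; k++)      { n1 += h[k]; sum1 += (double) k * h[k]; }
for (k = t + 1; k < 256; k++) { n2 += h[k]; sum2 += (double) k * h[k]; }
tOld = t;
if (n1 > 0 && n2 > 0)
t = (int) ((sum1 / n1 + sum2 / n2) / 2.0);   /* halfway between the two means */
}
return t;
}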
The inter means algorithm has a tendency to find a threshold which divides the
histogram in two, so that there are approximately equal numbers of pixels in the two
categories. In many applications, such as the soil image, this is not appropriate. One way to
overcome this drawback is to modify the algorithm as follows.
Here, p1 and p2 are proportions (such that p1 + p2 = 1) and φl(k) denotes the probability density of a Gaussian distribution, that is,
φl(k) = (1 / √(2πσl²)) exp{ −(k − µl)² / (2σl²) } for l = 1, 2,
where µl and σl² are the mean and variance of pixel values in category l. The best classification criterion, i.e. the one which misclassifies the least number of pixels, allocates pixels with value k to category 1 if p1φ1(k) ≥ p2φ2(k), and otherwise classifies them as category 2. After substituting for φ and taking logs, the inequality becomes a quadratic inequality in k.
The left side of the inequality is a quadratic function in k. Let A, B and C denote the
three terms in curly brackets, respectively. Then the criterion for allocating pixels with value
k to category 1 is:
a. If A = 0 (i.e. σ1² = σ2²), the criterion simplifies to one of allocating pixels with value k to category 1 if k lies below a single threshold determined by the remaining terms. (If, in addition, p1 = p2 and µ1 < µ2, the criterion becomes k ≤ ½(µ1 + µ2). Note that this is the intermeans criterion, which implicitly assumes that the two categories are of equal size.)
b. If B² < AC, then the quadratic function has no real roots, and all pixels are classified as 1 if A < 0 (i.e. σ1² > σ2²), or as 2 if A > 0.
In practice, cases (a) and (b) occur infrequently, and if µ1 < µ2 the rule simplifies to the threshold:
From an initial guess at the threshold, the proportions, means and variances of pixel values
in the two categories are calculated. The threshold is repositioned according to the above
criterion, and proportions, means and variances are recalculated. These steps are repeated
until there are no changes in values between iterations.
Mathematically:
1. Make an initial guess at a value for t.
2. Estimate p1, µ1 and σ1² from the pixels with values less than or equal to t (and p2, µ2 and σ2² from the remaining pixels).
3. Re-estimate t using the threshold criterion given above.
When applied to the soil image, the algorithm converged in 4 iterations to t = 24. Fig (b)
shows the result, which is more satisfactory than that produced by the intermeans algorithm
because it has allowed for a smaller proportion of air pixels (p1 = 0.45, as compared with p2
= 0.55). The algorithm has also taken account of the air pixels being less variable in value
than those for the soil matrix (σ1² = 30, whereas σ2² = 186). This is in accord with the left-most peak in the histogram plot (Fig ) being quite narrow.
EDGE-BASED SEGMENTATION
As we have seen, the results of threshold-based segmentation are usually less than
perfect. Often, a scientist will have to make changes to the results of automatic
segmentation. One simple way of doing this is by using a computer mouse to control a
screen cursor and draw boundary lines between regions. Fig (a) shows the boundaries
obtained by thresholding the muscle fibres image (as already displayed in Fig (a)),
superimposed on the output from Prewitt’s edge filter (§3.4.2), with the contrast stretched so
that values between 0 and 5 are displayed as shades of grey ranging from white to black and
values exceeding 5 are all displayed as black. This display can be used as an aid to
determine where extra boundaries need to be inserted to fully segment all muscle fibres. Fig
4.10(b) shows the result after manually adding 71 straight lines.
where the entries in h1,...,hK are used to keep track of which categories are equivalent,
and gij records the category label for pixel (i, j).
3. If just one of the two neighbours is an edge pixel, then (i, j) is assigned the same label as
the other one:
4. The final possibility is that neither neighbor is an edge pixel, in which case (i, j) is given
the same label as one of them:
and if the neighbors have labels which have not been marked as equivalent, i.e. hg(i−1,j) ≠ hg(i,j−1), then this needs to be done (because they are connected at pixel (i, j)). The equivalence is recorded by changing the entries in h1, ..., hK as follows:
- Set l1 = min(hg(i−1,j), hg(i,j−1)) and l2 = max(hg(i−1,j), hg(i,j−1)).
- For each value of k from 1 to K, if hk = l2 then set hk = l1.
Finally, after all the pixels have been considered, the array of labels is revised, taking into account which categories have been marked for amalgamation: gij → hg(i,j) for i, j = 1, . . . , n.
After application of the labeling algorithm, superfluous edge pixels — that is, those
which do not separate classes — can be removed: any edge-pixel which has neighbors only
of one category is assigned to that category. Fig 4.11(b) shows the result of applying the
labeling algorithm with edges as shown in Fig 4.11(a), and removing superfluous edge
pixels. The white boundaries have been superimposed on the original image.
Similarly, small segments (say less than 500 pixels in size) which do not touch the
borders of the image can be removed, leading to the previously displayed Fig 4.1(b). The
segmentation has done better than simple thresholding, but has failed to separate all fibers
because of gaps in output from Prewitt’s edge filter. Martello (1976), among others, has
proposed algorithms for bridging these gaps.
REGION-BASED SEGMENTATION
Clustering in the sense that pixels with similar values are grouped together, and
spatial in that pixels in the same category also form a single connected component.
For each of a sequence of increasing values of a threshold, all pixels with edge
strength less than this threshold which form a connected region with one of the seeds are
allocated to the corresponding fibre. When a threshold is reached for which two seeds
become connected, the pixels are used to label the boundary. A mathematical representation
of the algorithm is too complicated to be given here. Instead, we refer the reader to Vincent
and Soille (1991) for more details and an efficient algorithm. Meyer and Beucher (1990)
also consider the watershed algorithm, and added some refinements to the method.
Note that:
• The use of discs of radius 3 pixels, rather than single points, as seeds makes the watershed results less sensitive to fluctuations in Prewitt's filter output in the middle of fibres.
• The results produced by this semi-automatic segmentation algorithm are almost as
good as those shown in Fig 4.10(b), but the effort required in positioning seeds
inside muscle fibres is far less than that required to draw boundaries.
• Adams and Bischof (1994) present a similar seeded region growing algorithm, but
based directly on the image greyscale, not on the output from an edge filter.
The watershed algorithm, in its standard use, is fully automatic. Again, we will demonstrate
this by illustration. Fig shows the output produced by a variance filter (§3.4.1) with Gaussian weights (σ² = 96) applied to the muscle fibres image after histogram equalization (as shown in Fig (d)). The white seeds overlie all the local minima of the filter output, that is, pixels whose neighbors all have larger values and so are shaded lighter. Note that it is necessary to use a large value of σ² to ensure that the filter output does not have many more
local minima. The boundaries produced by the watershed algorithm have been added to Fig
An intuitive way of viewing the watershed algorithm is by considering the output from the
variance filter as an elevation map: light areas are high ridges and dark areas are valleys.
Each local minimum can be thought of as the point to which any water falling on the region
drains, and the segments are the catchments for them. Hence, the boundaries, or watersheds,
lie along tops of ridges. The previously mentioned Fig(c) shows this segmentation
superimposed on the original image.
Fig.: Manual segmentation of muscle fibres image by use of watersheds algorithm (a)
manually positioned ‘seeds’ in centers of all fibres, (b) output from Prewitt’s edge filter
together with watershed boundaries, (c) watershed boundaries superimposed on the image.
Figure : Output of variance filter with Gaussian weights (σ² = 96) applied to muscle fibres
image, together with seeds indicating all local minima and boundaries produced by
watershed algorithm.
There are very many other region-based algorithms, but most of them are quite
complicated. In this section we will consider just one more, namely an elegant split-and-
merge algorithm proposed by Horowitz and Pavlidis (1976). We will present it in a slightly
modified form to segment the log-transformed SAR image (Fig ), basing our segmentation
decisions on variances, whereas Horowitz and Pavlidis based theirs on the range of pixel
values. The algorithm operates in two stages, and requires a limit to be specified for the
maximum variance in pixel values in a region.
The first stage is the splitting one. Initially, the variance of the whole image is
calculated. If this variance exceeds the specified limit, then the image is subdivided into four
quadrants. Similarly, if the variance in any of these four quadrants exceeds the limit it is
further subdivided into four. This continues until the whole image consists of a set of
squares of varying sizes, all of which have variances below the limit. (Note that the
algorithm must be capable of achieving this because, at the finest resolution, each square consists of a single pixel and its variance is taken to be zero.)
Fig (a) shows the resulting boundaries in white, superimposed on the log-transformed
SAR image, with the variance limit set at 0.60. Note that:
The second stage of the algorithm, the merging one, involves amalgamating squares
which have a common edge, provided that by so doing the variance of the new region does
not exceed the limit. Once all amalgamations have been completed, the result is a
segmentation in which every region has a variance less than the set limit. However, although
the result of the first stage in the algorithm is unique, that from the second is not — it
depends on the order of which squares are considered.
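A minimal sketch of the splitting stage in C (the merging stage and the image bookkeeping are omitted; the Region type, the acceptSquare callback, and a square, power-of-two image side are assumptions, not part of the original algorithm description):
/* A square region: top-left corner plus side length. */
typedef struct { int x, y, size; } Region;
double regionVariance (const unsigned char *img, int width, Region r)
{
double sum = 0.0, sumSq = 0.0, mean;
int n = r.size * r.size, i, j;
for (i = 0; i < r.size; i++)
for (j = 0; j < r.size; j++) {
double v = img[(r.y + i) * width + (r.x + j)];
sum += v;
sumSq += v * v;
}
mean = sum / n;
return sumSq / n - mean * mean;   /* variance of pixel values in the square */
}
/* Recursively split a square into quadrants until every square's variance is
below the limit; a single pixel has zero variance, so the recursion terminates.
Accepted squares are reported through the acceptSquare callback. */
void splitRegion (const unsigned char *img, int width, Region r,
double varLimit, void (*acceptSquare)(Region))
{
int h;
Region q;
if (r.size == 1 || regionVariance (img, width, r) <= varLimit) {
acceptSquare (r);
return;
}
h = r.size / 2;
q.size = h;
q.x = r.x;     q.y = r.y;     splitRegion (img, width, q, varLimit, acceptSquare);
q.x = r.x + h; q.y = r.y;     splitRegion (img, width, q, varLimit, acceptSquare);
q.x = r.x;     q.y = r.y + h; splitRegion (img, width, q, varLimit, acceptSquare);
q.x = r.x + h; q.y = r.y + h; splitRegion (img, width, q, varLimit, acceptSquare);
}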
Fig (b) shows the boundaries produced by the algorithm, superimposed on the SAR
image. Dark and light fields appear to have been successfully distinguished between,
although the boundaries are rough and retain some of the artefacts of the squares in Fig (a).
Pavlidis and Liow (1990) proposed overcoming the deficiencies in the boundaries
produced by the Horowitz and Pavlidis algorithm by combining the results with those from
an edge-based segmentation. Many other ideas for region-based segmentation have been
proposed (see, for example, the review of Haralick and Shapiro, 1985), and it is still an
active area of research.
One possibility for improving segmentation results is to use an algorithm which over-
segments an image, and then apply a rule for amalgamating these regions. This requires
‘high-level’ knowledge, which falls into the domain of artificial intelligence. (All that we
have considered in this chapter may be termed ‘low-level’.) For applications of these ideas
in the area of remote sensing, see Tailor, Cross, Hogg and Mason (1986) and Ton, Sticklen
and Jain (1991). It is possible that such domain-specific knowledge could be used to
improve the automatic segmentations of the SAR and muscle fibres images, for example by
constraining boundaries to be straight in the SAR image and by looking only for convex
regions of specified size for the muscle fibres.
• The Hough transform (see, for example, Leavers, 1992) is a powerful technique for
finding straight lines, and other parametrized shapes, in images.
• Boundaries can be constrained to be smooth by employing roughness penalties such
as bending energies. The approach of varying a boundary until some such criterion is
optimized is known as the fitting of snakes (Kass, Witkin and Terzopoulos 1988).
• Models of expected shapes can be represented as templates and matched to images.
Either the templates can be rigid and the mapping can be flexible (for example, the
thin-plate spline of Bookstein, 1989), or the template itself can be flexible, as in the
approach of Amit, Grenander and Piccioni (1991).
• Images can be broken down into fundamental shapes, in a way analogous to the
decomposition of a sentence into individual words, using syntactic methods (Fu,
1974).
Detection of Discontinuities
Discontinuities in an image typically correspond to edges, which are significant changes in
intensity or color. Detecting these discontinuities is a fundamental step in many
segmentation techniques.
Point Detection
A point is the most basic type of discontinuity in a digital image. The most common
approach to finding discontinuities is to run an n × n mask over each point in the image. The
mask is as shown in figure 2.
A point is detected at the location (x, y) on which the mask is centered if the corresponding response satisfies |R| ≥ T, where R is the response of the mask at that point and T is a non-negative threshold value. This formulation measures the weighted difference between the center point and its neighbors, since the gray level of an isolated point will be very different from that of its neighbors. The result of the point-detection mask is shown in Figure 3.
Figure 3. (a) Gray-scale image with a nearly invisible isolated black point (b) Image
showing the detected point
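A minimal sketch in C of the point-detection test (the 3 × 3 mask weights shown are the commonly used Laplacian-type point detector, given here as an assumption since the original mask figure is not reproduced):
#include <stdlib.h>   /* for abs */
/* Returns 1 if an isolated point is detected at (x, y), i.e. if |R| >= T. */
int isIsolatedPoint (const unsigned char *img, int width, int x, int y, int T)
{
static const int w[3][3] = { { -1, -1, -1 },
                             { -1,  8, -1 },
                             { -1, -1, -1 } };
int r = 0, i, j;
for (i = -1; i <= 1; i++)
for (j = -1; j <= 1; j++)
r += w[i + 1][j + 1] * img[(y + i) * width + (x + j)];
return abs (r) >= T;
}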
Line Detection
Line detection is the next level of complexity in the direction of image discontinuity. For
any point in the image, a response can be calculated that will show which direction the point
of a line is most associated with. For line detection, we use the four 3 × 3 line-detector masks shown in Figure 4. If, at a certain point in the image, the magnitude of the response of mask i is greater than that of every other mask, then that point is more likely to be associated with a line in the direction of mask i.
Figure 4. Line detector masks in (a) horizontal direction (b) 45° direction (c) vertical direction (d) −45° direction. The greatest response among these masks yields the line direction associated with the given pixel. The result of the line-detection masks is shown in Figure 5.
Figure 5. (a) Original Image (b) result showing with horizontal detector (c) with 45° detector
(d) with vertical detector (e) with -45° detector
With the help of the line-detector masks, we can detect lines in a specified direction. For example, suppose we are interested in finding all the lines that are one pixel thick and oriented at −45°. For that, we take a digitized (binary) portion of a wire-bond mask for an electronic circuit. The results are shown in Figure 6.
Edge detection
Since isolated points and lines of unitary pixel thickness are infrequent in most
practical application, edge detection is the most common approach in gray level
discontinuity segmentation. An edge is a boundary between two regions having distinct
intensity levels. It is very useful for detecting discontinuities in an image where the image changes abruptly from dark to white or vice versa. The changes of intensity, and the first-order and second-order derivatives, are shown in Figure 7.
Figure 7. (a) Intensity profile (b) First-order derivatives (c) Second-order derivatives
Figure 8. (a) Original image (b) |Gx|, the component of the gradient along the x-direction (c) |Gy|, the component of the gradient along the y-direction (d) Gradient image |Gx| + |Gy|
There are several ways to calculate the image gradient:
The mask that finds horizontal edges is equivalent to the gradient in the vertical direction, and the mask that computes vertical edges is equivalent to the gradient in the horizontal direction. By passing these two masks over the intensity image, we can find the Gx and Gy components at different locations in the image, and so we can find the strength and direction of the edge at each particular location (x, y).
The Sobel operator gives an averaging effect over the image, which reduces the influence of spurious noise. It is often preferred over the Prewitt edge operator because this smoothing effect suppresses the spurious edges that are generated by noise present in the image.
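For reference (a standard reconstruction, since the document's mask figures are not reproduced), the 3 × 3 Sobel masks are
\[
G_x \text{ (responds to vertical edges)} = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix},
\qquad
G_y \text{ (responds to horizontal edges)} = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}
\]
and the gradient magnitude can be approximated by |Gx| + |Gy|, as in the caption of Figure 8 above.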
Second-order derivatives
It is positive at the darker side and negative at the white side. It is very sensitive to
noise present in an image. That’s why it is not used for edge detection. But, it is very useful
for extracting some secondary information i.e. we can find out whether the point lies on the
darker side or the white side.
Zero-crossing: It is useful to identify the exact location of an edge where there is a gradual transition of intensity from a dark to a bright region and vice versa. There are several second-order derivative operators.
Laplacian operator: applies a 3 × 3 Laplacian mask to the image.
It is not used for edge detection because it is very sensitive to noise and also leads to
double edge. But, it is very useful for extracting secondary information. To reduce the effect
of noise, first image will be smooth using the Gaussian operator and then it is operated by
Laplacian operator. These two operations together is called LoG (Laplacian of Gaussian)
operator.
LoG operator
Canny operator
It is an important method that finds edges by first isolating noise in the image, without affecting the features of the edges, then finding the edge strength and direction, and finally applying thresholding with a critical threshold value to detect the edges.
Module 1
1. Write the Basics and Application of computer graphics.
2. Explain Raster scan display with neat diagram.
3. Write short notes on the video controller and the display processor.
4. Explain Graphics workstations and viewing systems.
5. Explain Input devices, graphics network, graphics software.
6. Write a note on Introduction to OpenGL.
7. What are Coordinate reference frames?
8. What are Primitives and attributes in OpenGL?
9. Explain in detail Line drawing algorithms (DDA, Bresenham’s).
10. Explain in detail Circle generation algorithms (Bresenham’s).
Module 2
1. Write a short note on 2D geometric transformations.
2. Derive Matrix representation and homogeneous coordinates.
3. Derive Inverse transformations, 2D composite transformations.
4. Explain raster methods for geometric transformations.
5. What are OpenGL raster transformations?
6. Explain the 2D viewing pipeline.
7. Explain OpenGL 2D viewing functions.
8. Derive 3D geometric transformations.
9. Explain 3D translation, rotation, and scaling.
Module 3
1. Explain Logical Classification of Input Devices
2. Explain Input Functions for Graphical Data
3. What are interactive picture-construction techniques?
4. Explain Various Interactive Input Device Functions.
5. Explain how to create and manage Menu functions with OpenGL
6. Explain the fields involved in a graphical user interface.
7. Explain the stages involved in the design of an animation sequence.
8. Write a note on Traditional Animation Techniques
9. Write a note on General Computer Animation Functions and Computer Animation
Languages
10. Explain the steps involved in character animation.
11. Write a note on periodic motion.
12. Explain the procedures involved in OpenGL animation.
Module 4
1. Explain the following terms:
i) Adjacency
ii) Connectivity
iii) Gray level resolution
iv) Spatial resolution
2. Consider the two image subsets S1 and S2; for V = {1}, determine whether these two subsets are i) 4-adjacent, ii) 8-adjacent, or iii) m-adjacent.
3. Mention the applications of image processing.
4. Explain the importance of brightness adaption and discrimination in image processing.
5. Define 4- adjacency, 8-adjacency and m- adjacency.
6. Consider the image segment: i) let V = {0, 1}, compute the lengths of the shortest 4-, 8-, and m-paths between p and q; ii) repeat for V = {1, 2}.
Module 5
1. What is segmentation?
2. Write the applications of segmentation.
3. What are the three types of discontinuity in digital images?
4. How are the derivatives obtained in edge detection?
5. Write about linking edge points.
6. What are the two properties used for establishing the similarity of edge pixels?
7. Give the properties of the second derivative around an edge.
8. What is thresholding? Explain about global thresholding.
9. Explain about basic adaptive thresholding process used in image segmentation.
10. Explain in detail the threshold selection based on boundary characteristics.
11. Explain about region-based segmentation.
Content Beyond Syllabus
• To empower women with additional skills for their professional future career
• To enrich students with research blends in order to fulfil international challenges