21CS63 - CG&FIP Course Material
Prepared By
RAJATH A N, Assistant Professor
ASHWINI M S, Assistant Professor
Mission of the Department
M1: Equip students with a continuous learning process to acquire hardware, software, and computing knowledge to face new challenges.
M2: Inculcate the core Computer Science and Engineering components with discipline among the students by providing a state-of-the-art, learner-centric environment.
M3: Impart essential knowledge through quality and value-based education to mould students into complete Computer Science engineers with ethical values and leadership qualities, possessing good communication skills and the ability to work effectively as team members.
M4: Provide a platform to collaborate with successful people from entrepreneurial and research domains to learn and accomplish.
PEO2: To meet the dynamic requirements of IT industries professionally and ethically, along with social responsibilities.
PEO3: To prepare Computer Science and Engineering graduates to support the nation's self-employment growth through entrepreneurial skills, including women's entrepreneurship.
PEO4: To equip graduates with sufficient research exposure to take on further career challenges internationally.
Program Outcomes
2. Problem analysis: Identify, formulate, review research literature, and analyze complex
engineering problems reaching substantiated conclusions using first principles of
mathematics, natural sciences, and engineering sciences.
5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern
engineering and IT tools including prediction and modeling to complex engineering activities
with an understanding of the limitations.
6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess
societal, health, safety, legal and cultural issues and the consequent responsibilities relevant
to the professional engineering practice.
7. Environment and sustainability: Understand the impact of the professional engineering
solutions in societal and environmental contexts, and demonstrate the knowledge of, and
need for sustainable development.
8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and
norms of the engineering practice.
9. Individual and team work: Function effectively as an individual, and as a member or leader
in diverse teams, and in multidisciplinary settings.
11. Project management and finance: Demonstrate knowledge and understanding of the
engineering and management principles and apply these to one’s own work, as a member and
leader in a team, to manage projects and in multidisciplinary environments.
12. Life-long learning: Recognize the need for, and have the preparation and ability to engage
in independent and life-long learning in the broadest context of technological change.
These are sample strategies, which teachers can use to accelerate the attainment of the various course outcomes.
1. The lecture method (L) need not be only the traditional lecture method; alternative effective teaching methods can be adopted to attain the outcomes.
2. Use of Video/Animation to explain functioning of various concepts.
3. Encourage collaborative (Group Learning) Learning in the class.
4. Ask at least three HOT (Higher Order Thinking) questions in the class, which promote critical thinking.
5. Adopt Problem Based Learning (PBL), which fosters students' analytical skills and develops design-thinking skills such as the ability to design, evaluate, generalize, and analyse information rather than simply recall it.
6. Introduce topics in manifold representations.
7. Show the different ways to solve the same problem and encourage the students to come up
with their own creative ways to solve them.
8. Discuss how every concept can be applied to the real world; when that is possible, it helps improve the students' understanding.
Module-1
Overview: Computer Graphics hardware and software and OpenGL: Computer Graphics: Video Display Devices, Raster-Scan Systems, Basics of computer graphics, Application of Computer Graphics. OpenGL:
Introduction to OpenGL, coordinate reference frames, specifying two-dimensional world coordinate
reference frames in OpenGL, OpenGL point functions, OpenGL line functions, point attributes, line
attributes, curve attributes, OpenGL point attribute functions, OpenGL line attribute functions, Line
drawing algorithms(DDA, Bresenham’s).
Digital Image Processing Operations: Basic relationships and distance metrics, Classification of Image
processing Operations.
Text book 2: Chapter 3
(Note: Computer vision and OpenCV for experimental learning or Activity Based Learning using web sources, preferred for assignments. No questions in SEE.)
Web Source: https://www.tutorialspoint.com/opencv/
Teaching-Learning Process: Chalk & board, problem-based learning; lab practice with OpenCV for basic geometric objects and basic image operations.
Module-5
Image Segmentation: Introduction, classification, detection of discontinuities, edge detection (up to Canny edge detection, included).
Text Book 2: Chapter 9: 9.1 to 9.4.4.4
LESSON PLAN
Semester & Year: 6th Sem, 3rd Year
Faculty Name: Rajath A N & Ashwini M S
Subject with Code: Computer Graphics & Fundamentals of Image Processing (21CS63)
CO 1. Construct geometric objects using Computer Graphics principles and OpenGL APIs.
CO 2. Use OpenGL APIs and related mathematics for 2D and 3D geometric Operations on the objects.
CO 3. Design GUI with the necessary techniques required to animate the created objects
CO 4. Apply OpenCV for developing Image processing applications.
CO 5. Apply Image segmentation techniques along with programming, using OpenCV, for developing
simple applications.
Module-4: Introduction to Image Processing (CO4)
- Overview, nature of IP, IP and its related fields (Sessions 1, 2)
- Digital image representation (Session 3)
- Types of images (Session 4)
- Digital Image Processing Operations: basic relationships and distance metrics (Sessions 5, 6, 7)
- Classification of image processing operations (Sessions 8, 9)
- Best Practice (Content Beyond Syllabus): lab practice with OpenCV for basic geometric objects and basic image operations (Session 10)

Module-5: Image Segmentation (CO5)
- Introduction, classification (Sessions 11, 12)
- Detection of discontinuities (Sessions 13, 14, 15)
- Edge detection (up to Canny edge detection, included) (Sessions 16, 17, 18)
- Best Practice (Content Beyond Syllabus): example programs on image segmentation applications (Sessions 19, 20)
Module 1
Computer Graphics Hardware and Software and OpenGL
1. Computer Graphics
1.1 Video Display Devices
The primary components of an electron gun in a CRT are the heated metal cathode
and a control grid.
The heat is supplied to the cathode by directing a current through a coil of wire,
called the filament, inside the cylindrical cathode structure.
This causes electrons to be “boiled off” the hot cathode surface.
Inside the CRT envelope, the free, negatively charged electrons are then
accelerated toward the phosphor coating by a high positive voltage.
Intensity of the electron beam is controlled by the voltage at the control grid.
Since the amount of light emitted by the phosphor coating depends on the number
of electrons striking the screen, the brightness of a display point is controlled by
varying the voltage on the control grid.
The focusing system in a CRT forces the electron beam to converge to a small
cross section as it strikes the phosphor and it is accomplished with either electric
or magnetic fields.
With electrostatic focusing, the electron beam is passed through a positively
charged metal cylinder so that electrons along the center line of the cylinder are in
equilibrium position.
Deflection of the electron beam can be controlled with either electric or magnetic
fields.
Magnetic deflection is commonly used, with two pairs of coils: one pair is mounted on the top and bottom of the CRT neck, and the other pair is mounted on opposite sides of the neck.
The magnetic field produced by each pair of coils results in a transverse deflection force that is perpendicular to both the direction of the magnetic field and the direction of travel of the electron beam.
Horizontal and vertical deflections are accomplished with these two pairs of coils.
What we see on the screen is the combined effect of all the electron light emissions: a glowing spot that quickly fades after all the excited phosphor electrons have returned to their ground energy level.
The frequency of the light emitted by the phosphor is proportional to the energy
difference between the excited quantum state and the ground state.
In a raster-scan display, the electron beam is swept across the screen one row at a time, from top to bottom. As it moves across each row, the beam intensity is turned on and off to create a pattern of illuminated spots.
This scanning process is called refreshing. Each complete scanning of the screen is normally called a frame.
The refreshing rate, called the frame rate, is normally 60 to 80 frames per second,
or described as 60 Hz to 80 Hz.
Picture definition is stored in a memory area called the frame buffer.
This frame buffer stores the intensity values for all the screen points. Each screen
point is called a pixel (picture element).
A property of raster-scan systems is the aspect ratio, which is defined as the number of pixel columns divided by the number of scan lines that can be displayed by the system.
2) Shadow-mask technique
It produces a wide range of colors compared to the beam-penetration technique.
This technique is generally used in raster-scan displays, including color TV sets.
In this technique the CRT has three phosphor color dots at each pixel position: one dot for red, one for green, and one for blue light. This arrangement is commonly known as a dot triangle.
Here the CRT has three electron guns, one for each color dot, and a shadow-mask grid just behind the phosphor-coated screen.
The shadow-mask grid consists of a series of holes aligned with the phosphor-dot pattern.
Three electron beams are deflected and focused as a group onto the shadow mask
and when they pass through a hole they excite a dot triangle.
In dot triangle three phosphor dots are arranged so that each electron beam can
activate only its corresponding color dot when it passes through the shadow mask.
When activated, a dot triangle appears as a single small spot on the screen whose color is the combination of the colors emitted by its three dots.
By changing the intensity of the three electron beams we can obtain different
colors in the shadow mask CRT.
2. Non-emissive displays: Non-emissive displays (or non-emitters) use optical effects to convert sunlight or light from some other source into graphics patterns.
For example: LCD (Liquid Crystal Display).
In a plasma panel, a firing voltage applied to a pair of horizontal and vertical conductors causes the gas at the intersection of the two conductors to break down into a glowing plasma of electrons and ions.
Picture definition is stored in a refresh buffer, and the firing voltages are applied to refresh the pixel positions 60 times per second.
Alternating-current methods are used to provide faster application of the firing voltages and thus brighter displays.
Separation between pixels is provided by the electric field of the conductors.
In the ON state polarized light passing through material is twisted so that it will
pass through the opposite polarizer.
In the OFF state it will reflect back towards source.
Here, the frame buffer can be anywhere in the system memory, and the video
controller accesses the frame buffer to refresh the screen.
In addition to the video controller, raster systems employ other processors as
coprocessors and accelerators to implement various graphics operations.
Two registers are used to store the coordinate values for the screen pixels.
Initially, the x register is set to 0 and the y register is set to the value for the top
scan line.
The contents of the frame buffer at this pixel position are then retrieved and used
to set the intensity of the CRT beam.
Then the x register is incremented by 1, and the process is repeated for the next
pixel on the top scan line.
This procedure continues for each pixel along the top scan line.
After the last pixel on the top scan line has been processed, the x register is reset to
0 and the y register is set to the value for the next scan line down from the top of
the screen.
The procedure is repeated for each successive scan line.
After cycling through all pixels along the bottom scan line, the video controller
resets the registers to the first pixel position on the top scan line and the refresh
process starts over.
a. Speeding up pixel processing in the video controller:
Since the screen must be refreshed at a rate of at least 60 frames per second, the simple procedure illustrated above may not be accommodated by RAM chips if the cycle time is too slow.
To speed up pixel processing, video controllers can retrieve multiple pixel values
from the refresh buffer on each pass.
When a group of pixels has been processed, the next block of pixel values is retrieved from the frame buffer.
Advantages of video controller:
A video controller can be designed to perform a number of other operations.
For various applications, the video controller can retrieve pixel values from
different memory areas on different refresh cycles.
This provides a fast mechanism for generating real-time animations.
Another video-controller task is the transformation of blocks of pixels, so that
screen areas can be enlarged, reduced, or moved from one location to another
during the refresh cycles.
In addition, the video controller often contains a lookup table, so that pixel values
in the frame buffer are used to access the lookup table. This provides a fast method
for changing screen intensity values.
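As a small illustrative sketch (not from the textbook), this look-up-table indirection can be pictured in C as follows; the names lookupTable and displayedColor are hypothetical:

typedef struct { unsigned char r, g, b; } RGB;

#define TABLE_SIZE 256
static RGB lookupTable[TABLE_SIZE];   /* table of displayable RGB intensities */

/* The frame buffer stores an index; the table supplies the color actually shown.
   Changing one table entry changes the on-screen color everywhere that index is used. */
RGB displayedColor(unsigned char frameBufferValue)
{
    return lookupTable[frameBufferValue];
}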
Finally, some systems are designed to allow the video controller to mix the
framebuffer image with an input image from a television camera or other input
device.
b) Raster-Scan Display Processor
Figure shows one way to organize the components of a raster system that contains
a separate display processor, sometimes referred to as a graphics controller or a
display coprocessor.
The purpose of the display processor is to free the CPU from the graphics chores.
The array size for character grids can vary from about 5 by 7 to 9 by 12 or more
for higher-quality displays.
A character grid is displayed by superimposing the rectangular grid pattern into the
frame buffer at a specified coordinate position.
Using outline:
For characters that are defined as outlines, the shapes are scan-converted into the
frame buffer by locating the pixel positions closest to the outline.
These functions include generating various line styles (dashed, dotted, or solid),
displaying color areas, and applying transformations to the objects in a scene.
Display processors are typically designed to interface with interactive input
devices, such as a mouse.
Methods to reduce memory requirements in display processor:
In an effort to reduce memory requirements in raster systems, methods have been
devised for organizing the frame buffer as a linked list and encoding the color
information.
One organization scheme is to store each scan line as a set of number pairs.
Encoding methods can be useful in the digital storage and transmission of picture
information
i) Run-length encoding:
The first number in each pair can be a reference to a color value, and the second
number can specify the number of adjacent pixels on the scan line that are to be
displayed in that color.
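A short C sketch of this idea is given below (illustrative only; the Run type and encodeScanLine are assumed names, not part of the course listings):

typedef struct { unsigned char color; int length; } Run;   /* one (color, run-length) pair */

/* Encode one scan line of 'width' pixels into (color, length) pairs.
   'runs' must have room for up to 'width' pairs; the pair count is returned. */
int encodeScanLine(const unsigned char *pixels, int width, Run *runs)
{
    int nRuns = 0;
    int x = 0;
    while (x < width) {
        int start = x;
        while (x < width && pixels[x] == pixels[start])
            x++;                              /* extend the run while the color repeats */
        runs[nRuns].color  = pixels[start];
        runs[nRuns].length = x - start;
        nRuns++;
    }
    return nRuns;
}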
Basics of Computer Graphics
Computer graphics is the art of drawing pictures, lines, charts, etc. using computers with the help of programming. A computer graphics image is made up of a number of pixels. A pixel is the smallest addressable graphical unit represented on the computer screen.
Computer graphics is concerned with all aspects of producing pictures or images using a
computer. The field began with the display of a few lines on a cathode-ray tube (CRT) and
now, the field generates photograph-equivalent images.
The development of computer graphics has been driven both by the needs of the user
community and by advances in hardware and software. The combination of computers,
networks, and the complex human visual system, through computer graphics, has led to new
ways of displaying information, seeing virtual worlds, and communicating with both other
people and machines.
Applications of Computer Graphics
a. Graphs and Charts
An early application of computer graphics is the display of simple data graphs, usually plotted on a character printer. Data plotting is still one of the most common graphics applications.
Graphs & charts are commonly used to summarize functional, statistical, mathematical,
engineering and economic data for research reports, managerial summaries and other
types of publications.
Typical examples of data plots are line graphs, bar charts, pie charts, surface graphs, contour plots, and other displays showing relationships between multiple parameters in two dimensions, three dimensions, or higher-dimensional spaces.
b. Computer-Aided Design
With virtual-reality systems, designers and others can move about and interact with objects in various ways. Architectural designs can be examined by taking a simulated "walk" through the rooms or around the outsides of buildings to better appreciate the overall effect of a particular design.
With a special glove, we can even “grasp” objects in a scene and turn them over or
move them from one place to another.
d. Data Visualizations
Producing graphical representations for scientific, engineering and medical data sets
and processes is another fairly new application of computer graphics, which is
generally referred to as scientific visualization. And the term business visualization
is used in connection with data sets related to commerce, industry and other
nonscientific areas.
There are many different kinds of data sets and effective visualization schemes
depend on the characteristics of the data. A collection of data can contain scalar
values, vectors or higher-order tensors.
The picture is usually painted electronically on a graphics tablet using a stylus, which
can simulate different brush strokes, brush widths and colors.
Fine artists use a variety of other computer technologies to produce images. To create
pictures the artist uses a combination of 3D modeling packages, texture mapping,
drawing programs and CAD software etc.
Commercial art also uses these "painting" techniques for generating logos and other designs, page layouts combining text and graphics, TV advertising spots, and other applications.
A common graphics method employed in many television commercials is morphing,
where one object is transformed into another.
g. Entertainment
Television production, motion pictures, and music videos routinely use computer graphics methods.
Sometimes graphics images are combined with live actors and scenes, and sometimes films are completely generated using computer rendering and animation techniques.
Some television programs also use animation techniques to combine computer generated
figures of people, animals, or cartoon characters with the actor in a scene or to transform
an actor’s face into another shape.
h. Image Processing
Each screen display area can contain a different process, showing graphical or non-
graphical information, and various methods can be used to activate a display window.
Using an interactive pointing device, such as a mouse, we can activate a display window on some systems by positioning the screen cursor within the window display area and pressing the left mouse button.
2. OpenGL
2.1 Introduction to OpenGL
Symbolic constants that are used with certain functions as parameters are all in capital letters, preceded by "GL", with components separated by underscores.
For eg:- GL_2D, GL_RGB, GL_CCW, GL_POLYGON,
GL_AMBIENT_AND_DIFFUSE.
The OpenGL functions also expect specific data types. For example, an OpenGL
function parameter might expect a value that is specified as a 32-bit integer. But
the size of an integer specification can be different on different machines.
To indicate a specific data type, OpenGL uses special built-in data-type names, such as
GLbyte, GLshort, GLint, GLfloat, GLdouble, GLboolean
Each data-type name begins with the capital letters GL, and the remainder of the
name is a standard data-type designation written in lowercase letters.
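A brief sketch of these type names in use (any OpenGL header, such as GL/glut.h, supplies the typedefs; the function typeDemo is only for illustration):

#include <GL/glut.h>

void typeDemo (void)
{
    GLint     winWidth  = 400;       /* 32-bit signed integer            */
    GLfloat   pointSize = 2.0f;      /* single-precision floating point  */
    GLdouble  angle     = 45.0;      /* double-precision floating point  */
    GLboolean visible   = GL_TRUE;   /* boolean flag                     */
    glPointSize (pointSize);         /* OpenGL functions expect these exact types */
    (void) winWidth; (void) angle; (void) visible;
}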
Related Libraries
In addition to OpenGL basic(core) library(prefixed with gl), there are a number of
associated libraries for handling special operations:-
1) OpenGL Utility(GLU):- Prefixed with “glu”. It provides routines for setting up
viewing and projection matrices, describing complex objects with line and
polygon approximations, displaying quadrics and B-splines using linear
approximations, processing the surface-rendering operations, and other complex
tasks.
Every OpenGL implementation includes the GLU library
2) Open Inventor: provides routines and predefined object shapes for interactive three-dimensional applications; it is written in C++.
3) Window-system libraries:- To create graphics we need display window. We
cannot create the display window directly with the basic OpenGL functions since
it contains only device-independent graphics functions, and window-management
operations are device-dependent. However, there are several window-system libraries that support OpenGL functions for a variety of machines.
Eg:- Apple GL(AGL), Windows-to-OpenGL(WGL), Presentation Manager to
OpenGL(PGL), GLX.
4) OpenGL Utility Toolkit(GLUT):- provides a library of functions which acts as
interface for interacting with any device specific screen-windowing system, thus
making our program device-independent. The GLUT library functions are
prefixed with “glut”.
Header Files
In all graphics programs, we will need to include the header file for the OpenGL
core library.
In windows to include OpenGL core libraries and GLU we can use the following
header files:-
#include<GL/gl.h>
#include <GL/glu.h>
The above lines can be replaced by including the GLUT header file, which ensures that gl.h and glu.h are included correctly:
#include <GL/glut.h> // Windows and Linux
#include <GLUT/glut.h> // Mac OS X
Step 1: Initialization of GLUT.
We perform the GLUT initialization with the statement glutInit (&argc, argv);
This initialization function could also process any command-line arguments, but we will not need to use these parameters for our first example programs.
Step 2: Display-window creation and title
We can state that a display window is to be created on the screen with a given
caption for the title bar. This is accomplished with the function
glutCreateWindow ("An Example OpenGL Program");
where the single argument for this function can be any character string that we
want to use for the display-window title.
Step 3: Specification of the display window
Then we need to specify what the display window is to contain.
For this, we create a picture using OpenGL functions and pass the picture
definition to the GLUT routine glutDisplayFunc, which assigns our picture to the
display window.
Example: suppose we have the OpenGL code for describing a line segment in a
procedure called lineSegment.
Then the following function call passes the line-segment description to the display
window: glutDisplayFunc (lineSegment);
Step 4: Activation of the display windows.
After execution of the following statement, all display windows that we have created, including their graphic content, are activated: glutMainLoop ( );
This function must be the last one in our program. It displays the initial graphics
and puts the program into an infinite loop that checks for input from devices such
as a mouse or keyboard.
Step 5: Setting the display-window position and size using additional GLUT functions
Although the display window that we created will be in some default location and
size, we can set these parameters using additional GLUT functions.
GLUT Function 1: glutInitWindowPosition (xTopLeft, yTopLeft); gives the integer screen position of the top-left corner of the display window.
GLUT Function 2: glutInitWindowSize (width, height); sets the initial pixel width and height of the display window.
After the display window is on the screen, we can reposition and resize it.
GLUT Function 3: glutInitDisplayMode (mode);
We can also set a number of other options for the display window, such as
buffering and a choice of color modes, with the glutInitDisplayMode function.
Arguments for this routine are assigned symbolic GLUT constants.
The values of the constants passed to this function are combined using a logical or
operation.
Actually, single buffering and RGB color mode are the default options.
But we will use the function now as a reminder that these are the options that are
set for our display.
Later, we discuss color modes in more detail, as well as other display options,
such as double buffering for animation applications and selecting parameters for
viewing three-dimensional scenes.
Using RGB color values, we set the background color for the display window to
be white, with the OpenGL function: glClearColor (1.0, 1.0, 1.0, 0.0);
The first three arguments in this function set the red, green, and blue component
colors to the value 1.0, giving us a white background color for the display window.
If, instead of 1.0, we set each of the component colors to 0.0, we would get a black
background.
The fourth parameter in the glClearColor function is called the alpha value for the
specified color. One use for the alpha value is as a “blending” parameter
When we activate the OpenGL blending operations, alpha values can be used to
determine the resulting color for two overlapping objects.
An alpha value of 0.0 indicates a totally transparent object, and an alpha value of
1.0 indicates an opaque object.
For now, we will simply set alpha to 0.0.
The GLU function gluOrtho2D defines the coordinate reference frame within the display window to be (0.0, 0.0) at the lower-left corner of the display window and (200.0, 150.0) at the upper-right window corner.
For now, we will use a world-coordinate rectangle with the same aspect ratio as
the display window, so that there is no distortion of our picture.
Finally, we need to call the appropriate OpenGL routines to create our line
segment.
The following code defines a two-dimensional, straight-line segment with integer,
Cartesian endpoint coordinates (180, 15) and (10, 145).
glBegin (GL_LINES);
glVertex2i (180, 15);
glVertex2i (10, 145);
glEnd ( );
Now we are ready to put all the pieces together:
The following OpenGL program is organized into three functions.
init: we place all initializations and related one-time parameter settings in function init. The geometric description of the picture is placed in the display function lineSegment, and the GLUT setup calls go in main.
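The bodies of init and lineSegment are not reproduced in these notes. A minimal sketch consistent with the settings discussed earlier (a white background, a gluOrtho2D frame from (0.0, 0.0) to (200.0, 150.0), and a line segment from (180, 15) to (10, 145)) is given below; the GLUT header and the choice of line color are assumptions added here so that the whole program compiles:

#include <GL/glut.h> // (use <GLUT/glut.h> on Mac OS X)

void init (void)
{
    glClearColor (1.0, 1.0, 1.0, 0.0);    // Set display-window color to white.
    glMatrixMode (GL_PROJECTION);         // Set projection parameters.
    gluOrtho2D (0.0, 200.0, 0.0, 150.0);  // World-coordinate reference frame.
}

void lineSegment (void)
{
    glClear (GL_COLOR_BUFFER_BIT);        // Clear display window.
    glColor3f (0.0, 0.0, 1.0);            // Line color (blue chosen here; any color works).
    glBegin (GL_LINES);
        glVertex2i (180, 15);             // Specify line-segment geometry.
        glVertex2i (10, 145);
    glEnd ( );
    glFlush ( );                          // Process all OpenGL routines as quickly as possible.
}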
void main (int argc, char** argv)
{
glutInit (&argc, argv); // Initialize GLUT.
glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB); // Set display mode.
glutInitWindowPosition (50, 100); // Set top-left display-window position.
glutInitWindowSize (400, 300); // Set display-window width and height.
glutCreateWindow ("An Example OpenGL Program"); // Create display window.
init ( ); // Execute initialization procedure.
glutDisplayFunc (lineSegment); // Send graphics to display window.
glutMainLoop ( ); // Display everything and wait.
}
These coordinate positions are stored in the scene description along with other info
about the objects, such as their color and their coordinate extents
Coordinate extents: Coordinate extents are the minimum and maximum x, y, and z values for each object.
A set of coordinate extents is also described as a bounding box for an object.
Ex: For a 2D figure, the coordinate extents are sometimes called its bounding rectangle.
Objects are then displayed by passing the scene description to the viewing routines
which identify visible surfaces and map the objects to the frame buffer positions
and then on the video monitor.
The scan-conversion algorithm stores info about the scene, such as color values, at
the appropriate locations in the frame buffer, and then the scene is displayed on the
output device.
Screen co-ordinates:
Locations on a video monitor are referenced in integer screen coordinates, which
correspond to the integer pixel positions in the frame buffer.
Scan-line algorithms for the graphics primitives use the coordinate descriptions to
determine the locations of pixels
Example: given the endpoint coordinates for a line segment, a display algorithm
must calculate the positions for those pixels that lie along the line path between the
endpoints.
Since a pixel position occupies a finite area of the screen, the finite size of a pixel
must be taken into account by the implementation algorithms.
For the present, we assume that each integer screen position references the centre
of a pixel area.
Once pixel positions have been identified, the color values must be stored in the frame buffer.
• A low-level procedure of the form setPixel (x, y) stores the current color setting into the frame buffer at integer position (x, y).
• To retrieve the current frame-buffer setting for a pixel location, we can use the low-level function getPixel (x, y, color). Parameter color receives an integer value corresponding to the combined RGB bit codes stored for the specified pixel at position (x, y).
• Additional screen-coordinate information is needed for 3D scenes.
To define a two-dimensional world-coordinate reference frame in OpenGL, we select the projection matrix and then specify the coordinate range:
glMatrixMode (GL_PROJECTION);
glLoadIdentity ( );
gluOrtho2D (xmin, xmax, ymin, ymax);
The display window will then be referenced by coordinates (xmin, ymin) at the
lower-left corner and by coordinates (xmax, ymax) at the upper-right corner, as
shown in Figure below
We can then designate one or more graphics primitives for display using the
coordinate reference specified in the gluOrtho2D statement.
If the coordinate extents of a primitive are within the coordinate range of the
display window, all of the primitive will be displayed.
Otherwise, only those parts of the primitive within the display-window coordinate
limits will be shown.
Also, when we set up the geometry describing a picture, all positions for the
OpenGL primitives must be given in absolute coordinates, with respect to the
reference frame defined in the gluOrtho2D function.
OpenGL Point Functions: To specify the geometry of a point, we give a coordinate position in the world reference frame, listed between a glBegin/glEnd pair whose argument is the symbolic constant GL_POINTS.
This coordinate position, along with any other geometric descriptions we may have in our scene, is then passed to the viewing routines.
Unless we specify other attribute values, OpenGL primitives are displayed with a
default size and color.
The default color for primitives is white, and the default point size is equal to the
size of a single screen pixel
Syntax:
Case 1:
glBegin (GL_POINTS);
glVertex2i (50, 100);
glVertex2i (75, 150);
glVertex2i (100, 200);
glEnd ( );
Case 2:
we could specify the coordinate values for the preceding points in arrays such as
int point1 [ ] = {50, 100};
int point2 [ ] = {75, 150};
int point3 [ ] = {100, 200};
and call the OpenGL functions for plotting the three points as
glBegin (GL_POINTS);
glVertex2iv (point1);
glVertex2iv (point2);
glVertex2iv (point3);
glEnd ( );
Case 3:
specifying two point positions in a three dimensional world reference frame. In this
case, we give the coordinates as explicit floating-point values:
glBegin (GL_POINTS);
glVertex3f (-78.05, 909.72, 14.60);
glVertex3f (261.91, -5200.67, 188.33);
glEnd ( );
OpenGL Line Functions: With the primitive constant GL_LINES, successive pairs of vertices are taken as endpoints and connected to form individual line segments.
Note that successive segments usually are disconnected because the vertices are processed on a pair-wise basis.
With the five vertices in the example below, we obtain one line segment between the first and second coordinate positions and another line segment between the third and fourth positions; because the number of specified endpoints is odd, the last coordinate position is ignored.
Case 1: Lines
glBegin (GL_LINES);
glVertex2iv (p1);
glVertex2iv (p2);
glVertex2iv (p3);
glVertex2iv (p4);
glVertex2iv (p5);
glEnd ( );
Case 2: GL_LINE_STRIP:
Successive vertices are connected using line segments. However, the final vertex is not
connected to the initial vertex.
glBegin (GL_LINE_STRIP);
glVertex2iv (p1);
glVertex2iv (p2);
glVertex2iv (p3);
glVertex2iv (p4);
glVertex2iv (p5);
glEnd ( );
Case 3: GL_LINE_LOOP:
Successive vertices are connected using line segments to form a closed path or loop i.e.,
final vertex is connected to the initial vertex.
glBegin (GL_LINE_LOOP);
glVertex2iv (p1);
glVertex2iv (p2);
glVertex2iv (p3);
glVertex2iv (p4);
glVertex2iv (p5);
glEnd ( );
In a state system, the displayed color and size of a point are determined by the current values stored in the attribute list.
Color components are set with RGB values or an index into a color table.
For a raster system: Point size is an integer multiple of the pixel size, so that a
large point is displayed as a square block of pixels
Size:
We set the size for an OpenGL point with
glPointSize (size);
and the point is then displayed as a square block of pixels.
Example program:
Attribute functions may be listed inside or outside of a glBegin/glEnd pair.
Example: the following code segment plots three points in varying colors and
sizes.
The first is a standard-size red point, the second is a double-size green point, and
the third is a triple-size blue point:
Ex:
glColor3f (1.0, 0.0, 0.0);
glBegin (GL_POINTS);
glVertex2i (50, 100);
glPointSize (2.0);
glColor3f (0.0, 1.0, 0.0);
glVertex2i (75, 150);
glPointSize (3.0);
glColor3f (0.0, 0.0, 1.0);
glVertex2i (100, 200);
glEnd ( );
But we can also display dashed lines, dotted lines, or a line with a combination of
dashes and dots.
We can vary the length of the dashes and the spacing between dashes or dots.
We set a current display style for lines with the OpenGL function:
glLineStipple (repeatFactor, pattern);
Pattern:
Parameter pattern is used to reference a 16-bit integer that describes how the line should be displayed.
A 1 bit in the pattern denotes an "on" pixel position, and a 0 bit indicates an "off" pixel position.
The pattern is applied to the pixels along the line path starting with the low-order
bits in the pattern.
The default pattern is 0xFFFF (each bit position has a value of 1),which produces a
solid line.
repeatFactor
Integer parameter repeatFactor specifies how many times each bit in the pattern is
to be repeated before the next bit in the pattern is applied.
The default repeat value is 1.
Polyline:
With a polyline, a specified line-style pattern is not restarted at the beginning of
each segment.
It is applied continuously across all the segments, starting at the first endpoint of
the polyline and ending at the final endpoint for the last segment in the series.
Example:
For line style, suppose parameter pattern is assigned the hexadecimal
representation 0x00FF and the repeat factor is 1.
This would display a dashed line with eight pixels in each dash and eight pixel
positions that are “off” (an eight-pixel space) between two dashes.
Also, since low order bits are applied first, a line begins with an eight-pixel dash
starting at the first endpoint.
This dash is followed by an eight-pixel space, then another eight-pixel dash, and
so forth, until the second endpoint position is reached.
Example Code:
typedef struct { float x, y; } wcPt2D;
wcPt2D dataPts [5];

void linePlot (wcPt2D dataPts [5])
{
    int k;
    glBegin (GL_LINE_STRIP);
    for (k = 0; k < 5; k++)
        glVertex2f (dataPts [k].x, dataPts [k].y);
    glEnd ( );
    glFlush ( );   // Process the buffered OpenGL routines.
}
/* Invoke a procedure here to draw coordinate axes. */
glEnable (GL_LINE_STIPPLE);

/* Input first set of (x, y) data values. */
glLineStipple (1, 0x1C47); // Plot a dash-dot, standard-width polyline.
linePlot (dataPts);

/* Input second set of (x, y) data values. */
glLineStipple (1, 0x00FF); // Plot a dashed, double-width polyline.
glLineWidth (2.0);
linePlot (dataPts);

/* Input third set of (x, y) data values. */
glLineStipple (1, 0x0101); // Plot a dotted, triple-width polyline.
glLineWidth (3.0);
linePlot (dataPts);

glDisable (GL_LINE_STIPPLE);
We can display curves with varying colors, widths, dot-dash patterns, and
available pen or brush options.
Methods for adapting curve-drawing algorithms to accommodate attribute
selections are similar to those for line drawing.
Raster curves of various widths can be displayed using the method of horizontal or
vertical pixel spans.
Case 1: Where the magnitude of the curve slope |m| <= 1.0, we plot vertical spans;
Case 2: when the slope magnitude |m| > 1.0, we plot horizontal spans.
Method 1: Using circle symmetry property, we generate the circle path with vertical spans
in the octant from x = 0 to x = y, and then reflect pixel positions about the line y = x to y=0.
Method 2: Another method for displaying thick curves is to fill in the area between two
Parallel curve paths, whose separation distance is equal to the desired width. We could do
this using the specified curve path as one boundary and setting up the second boundary
either inside or outside the original curve path. This approach, however, shifts the original
curve path either inward or outward, depending on which direction we choose for the
second boundary.
Method 3:The pixel masks discussed for implementing line-style options could also be
used in raster curve algorithms to generate dashed or dotted patterns
Method 4: Pen (or brush) displays of curves are generated using the same techniques
discussed for straight-line segments.
We determine values for the slope m and y intercept b with the following equations:
m = (yend - y0) / (xend - x0)    (2)
b = y0 - m * x0    (3)
Algorithms for displaying straight lines are based on the slope-intercept line equation y = m * x + b (1) and the calculations given in equations (2) and (3).
For a given x interval δx along a line, we can compute the corresponding y interval δy from equation (2) as
δy = m * δx    (4)
Similarly, we can obtain the x interval δx corresponding to a specified δy as
δx = δy / m    (5)
These equations form the basis for determining deflection voltages in analog
displays, such as vector-scan system, where arbitrarily small changes in deflection
voltage are possible.
The DDA algorithm samples the line at unit intervals in one coordinate and computes the corresponding value for the other coordinate. For lines with different slope magnitudes we consider the following cases:
Case 1: if |m| ≤ 1, x increments in unit intervals, i.e., xk+1 = xk + 1. Then
m = (yk+1 - yk) / (xk+1 - xk) = yk+1 - yk
so that
yk+1 = yk + m    (1)
where k takes integer values starting from 0 for the first point and increases by 1 until the final endpoint is reached. Since m can be any real number between 0.0 and 1.0, each calculated y value must be rounded to the nearest integer pixel position.
Case 2: if |m| > 1, y increments in unit intervals, i.e., yk+1 = yk + 1. Then
m = (yk+1 - yk) / (xk+1 - xk) gives m (xk+1 - xk) = 1, so
xk+1 = xk + (1/m)    (2)
Case 3: if m = 1, both x and y increment in unit intervals, i.e., xk+1 = xk + 1 and yk+1 = yk + 1.
Equations (1) and (2) are based on the assumption that lines are to be processed from the left endpoint to the right endpoint. If this processing is reversed, so that the starting endpoint is at the right, then either we have δx = -1 and
yk+1 = yk - m    (3)
or (when the slope magnitude is greater than 1) we have δy = -1 and xk+1 = xk - (1/m).
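Combining the cases above, a sketch of a DDA line routine is shown below. It follows the style of the Bresenham listing that comes next and assumes the same low-level setPixel routine; the inline rounding and the degenerate-case guard are additions for this sketch, not part of the course listing:

#include <stdlib.h>

void setPixel (int x, int y);   /* assumed to plot the current color at (x, y) */

/* DDA: sample at unit intervals along the coordinate of greatest change. */
void lineDDA (int x0, int y0, int xEnd, int yEnd)
{
    int dx = xEnd - x0, dy = yEnd - y0, steps, k;
    float xIncrement, yIncrement, x = (float) x0, y = (float) y0;

    if (abs (dx) > abs (dy))
        steps = abs (dx);      /* |m| <= 1: step in x */
    else
        steps = abs (dy);      /* |m| > 1: step in y  */

    if (steps == 0) {          /* degenerate case: both endpoints coincide */
        setPixel (x0, y0);
        return;
    }
    xIncrement = (float) dx / (float) steps;
    yIncrement = (float) dy / (float) steps;

    setPixel ((int) (x + 0.5), (int) (y + 0.5));
    for (k = 0; k < steps; k++) {
        x += xIncrement;
        y += yIncrement;
        setPixel ((int) (x + 0.5), (int) (y + 0.5));   /* round to nearest pixel */
    }
}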
Bresenham’s Algorithm:
It is an efficient raster line-generating algorithm that uses only incremental integer calculations.
To illustrate Bresenham’s approach, we first consider the scan-conversion process
for lines with positive slope less than 1.0.
Pixel positions along a line path are then determined by sampling at unit x
intervals. Starting from the left endpoint (x0, y0) of a given line, we step to each
successive column (x position) and plot the pixel whose scan-line y value is
closest to the line path.
1. Input the two line endpoints and store the left endpoint in (x0, y0).
2. Set the color for frame-buffer position (x0, y0); i.e., plot the first point.
3. Calculate the constants ∆x, ∆y, 2∆y, and 2∆y − 2∆x, and obtain the starting value for
the decision parameter as
p0 = 2∆y −∆x
4. At each xk along the line, starting at k = 0, perform the following test: if pk < 0, the next point to plot is (xk + 1, yk) and
pk+1 = pk + 2∆y
Otherwise, the next point to plot is (xk + 1, yk + 1) and
pk+1 = pk + 2∆y − 2∆x
5. Repeat step 4 a further ∆x − 1 times.
Note: if |m| > 1.0, the roles of x and y are interchanged: we step in unit y intervals, the starting decision parameter is
p0 = 2∆x − ∆y
and the increments become 2∆x and 2∆x − 2∆y.
Code:
#include <stdlib.h>
#include <math.h>
/* Bresenham line-drawing procedure for |m| < 1.0. */
void lineBres (int x0, int y0, int xEnd, int yEnd)
{
int dx = fabs (xEnd - x0), dy = fabs(yEnd - y0);
int p = 2 * dy - dx;
int twoDy = 2 * dy, twoDyMinusDx = 2 * (dy - dx);
int x, y;
/* Determine which endpoint to use as start position. */
if (x0 > xEnd)
{
x = xEnd;
y = yEnd;
xEnd = x0;
}
else {
x = x0;
y = y0;
}
setPixel (x, y);
while (x < xEnd) {
    x++;
    if (p < 0)
        p += twoDy;
    else {
        y++;
        p += twoDyMinusDx;
    }
    setPixel (x, y);   // Plot the next pixel along the line path.
  }
}
Module-2
2D and 3D graphics with OpenGL
2D Geometric Transformations
Operations that are applied to the geometric description of an object to change its
position, orientation, or size are called geometric transformations.
Two-Dimensional Translation
The translation distance pair (tx, ty) is called a translation vector or shift vector
In column-vector representation, the translation is written P' = P + T, where the coordinates transform as x' = x + tx and y' = y + ty.
Code:
class wcPt2D {
public:
    GLfloat x, y;
};

void translatePolygon (wcPt2D * verts, GLint nVerts, GLfloat tx, GLfloat ty)
{
    GLint k;
    for (k = 0; k < nVerts; k++) {
        verts [k].x = verts [k].x + tx;
        verts [k].y = verts [k].y + ty;
    }
    glBegin (GL_POLYGON);
    for (k = 0; k < nVerts; k++)
        glVertex2f (verts [k].x, verts [k].y);
    glEnd ( );
}
Two-Dimensional Rotation
We generate a rotation transformation of an object by specifying a rotation axis
and a rotation angle.
A two-dimensional rotation of an object is obtained by repositioning the object
along a circular path in the xy plane.
In this case, we are rotating the object about a rotation axis that is perpendicular to
the xy plane (parallel to the coordinate z axis).
Parameters for the two-dimensional rotation are the rotation angle θ and a position
(xr, yr ), called the rotation point (or pivot point), about which the object is to be
rotated
A positive value for the angle θ defines a counterclockwise rotation about the pivot
point, as in above Figure , and a negative value rotates objects in the clockwise
direction.
The angular and coordinate relationships of the original and transformed point
positions are shown in Figure
In this figure, r is the constant distance of the point from the origin, angle φ is the
original angular position of the point from the horizontal, and θ is the rotation angle.
We can express the transformed coordinates in terms of angles θ and φ as
x' = r cos(φ + θ) = x cos θ - y sin θ
y' = r sin(φ + θ) = x sin θ + y cos θ
The transformation equations for rotation of a point about any specified rotation position (xr, yr) are
x' = xr + (x - xr) cos θ - (y - yr) sin θ
y' = yr + (x - xr) sin θ + (y - yr) cos θ
Code:
class wcPt2D {
public:
GLfloat x, y;
};
void rotatePolygon (wcPt2D * verts, GLint nVerts, wcPt2D pivPt, GLdouble theta)
{
wcPt2D * vertsRot = new wcPt2D [nVerts];   // Storage for the rotated vertices.
GLint k;
for (k = 0; k < nVerts; k++) {
vertsRot [k].x = pivPt.x + (verts [k].x - pivPt.x) * cos (theta) - (verts
[k].y - pivPt.y) * sin (theta);
vertsRot [k].y = pivPt.y + (verts [k].x - pivPt.x) * sin (theta) + (verts
[k].y - pivPt.y) * cos (theta);
}
glBegin (GL_POLYGON);
for (k = 0; k < nVerts; k++)
glVertex2f (vertsRot [k].x, vertsRot [k].y);
glEnd ( );
}
Two-Dimensional Scaling
To alter the size of an object, we apply a scaling transformation.
The basic two-dimensional scaling equations are x' = x * sx and y' = y * sy, which can also be written in the matrix form P' = S * P, where S is the 2 x 2 matrix with sx and sy on the diagonal.
Specifying a value of 1 for both sx and sy leaves the size of objects unchanged.
When sx and sy are assigned the same value, a uniform scaling is produced, which
maintains relative object proportions.
Unequal values for sx and sy result in a differential scaling that is often used in
design applications.
In some systems, negative values can also be specified for the scaling parameters.
This not only resizes an object, it reflects it about one or more of the coordinate
axes.
Figure below illustrates scaling of a line by assigning the value 0.5 to both sx and sy
We can control the location of a scaled object by choosing a position, called the
fixed point, that is to remain unchanged after the scaling transformation.
Coordinates for the fixed point, (xf, yf), are often chosen at some object position, such as its centroid, but any other spatial position can be selected.
For a coordinate position (x, y), the scaled coordinates (x', y') relative to the fixed point are then calculated from the following relationships:
x' = x * sx + xf (1 - sx)
y' = y * sy + yf (1 - sy)
where the additive terms xf (1 - sx) and yf (1 - sy) are constants for all points in the object.
Code:
class wcPt2D {
public:
GLfloat x, y;
};
void scalePolygon (wcPt2D * verts, GLint nVerts, wcPt2D fixedPt, GLfloat sx,
GLfloat sy)
{
wcPt2D * vertsNew = new wcPt2D [nVerts];   // Storage for the scaled vertices.
GLint k;
for (k = 0; k < nVerts; k++) {
vertsNew [k].x = verts [k].x * sx + fixedPt.x * (1 - sx);
vertsNew [k].y = verts [k].y * sy + fixedPt.y * (1 - sy);
}
glBegin (GL_POLYGON);
for (k = 0; k < nVerts; k++)
glVertex2f (vertsNew [k].x, vertsNew [k].y);
glEnd ( );
}
Homogeneous Coordinates
Multiplicative and translational terms for a two-dimensional geometric
transformation can be combined into a single matrix if we expand the
representations to 3 × 3 matrices
We can use the third column of a transformation matrix for the translation terms,
and all transformation equations can be expressed as matrix multiplications.
We also need to expand the matrix representation for a two-dimensional coordinate
position to a three-element column matrix.
The coordinate position is transformed using the composite matrix M, rather than applying the individual transformations M1 and then M2.
By multiplying the two rotation matrices, we can verify that two successive
rotations are additive:
R(θ2) · R(θ1) = R(θ1 + θ2)
So that the final rotated coordinates of a point can be calculated with the composite
rotation matrix as
P’ = R(θ1 + θ2) · P
We can generate a two-dimensional rotation about any other pivot point (xr , yr ) by
performing the following sequence of translate-rotate-translate operations:
1. Translate the object so that the pivot-point position is moved to the coordinate
origin.
2. Rotate the object about the coordinate origin.
3. Translate the object so that the pivot point is returned to its original position.
The composite transformation matrix for this sequence is obtained with the concatenation R(xr, yr, θ) = T(xr, yr) · R(θ) · T(-xr, -yr).
In the same way, we can produce a two-dimensional scaling with respect to a selected fixed position (xf, yf) with the sequence:
1. Translate the object so that the fixed point coincides with the coordinate origin.
2. Scale the object with respect to the coordinate origin.
3. Use the inverse of the translation in step (1) to return the object to its original position.
Concatenating the matrices for these three operations produces the required scaling matrix
S(xf, yf, sx, sy) = T(xf, yf) · S(sx, sy) · T(-xf, -yf)
We can scale an object in other directions by rotating the object to align the desired
scaling directions with the coordinate axes before applying the scaling
transformation.
Suppose we want to apply scaling factors with values specified by parameters s1
and s2 in the directions shown in Figure
The composite matrix resulting from the product of these three transformations is
Property 2:
Transformation products, on the other hand, may not be commutative. The matrix product M2 · M1 is not equal to M1 · M2, in general.
This means that if we want to translate and rotate an object, we must be careful
about the order in which the composite matrix is evaluated
The four elements rsjk are the multiplicative rotation-scaling terms in the transformation, which involve only rotation angles and scaling factors.
If an object is to be scaled and rotated about its centroid coordinates (xc, yc) and then translated, the values for the elements of the composite transformation matrix are
Although the above matrix requires nine multiplications and six additions, the explicit calculations for the transformed coordinates are
x' = x * rsxx + y * rsxy + trsx,  y' = x * rsyx + y * rsyy + trsy
so we actually need to perform only four multiplications and four additions to transform each coordinate position.
Because rotation calculations require trigonometric evaluations and several multiplications for each transformed point, computational efficiency can become an important consideration in rotation transformations.
If we are rotating in small angular steps about the origin, for instance, we can set cos θ
to 1.0 and reduce transformation calculations at each step to two multiplications and
two additions for each set of coordinates to be rotated.
These rotation calculations are
x’= x − y sin θ, y’ = x sin θ + y
where the four elements rjk are the multiplicative rotation terms, and the elements trx and try are the translational terms.
A rigid-body change in coordinate position is also sometimes referred to as a rigid-motion transformation.
In addition, the above matrix has the property that its upper-left 2 × 2 submatrix is
an orthogonal matrix.
If we consider each row (or each column) of the submatrix as a vector, then the two
row vectors (rxx, rxy) and (ryx, ryy) (or the two column vectors) form an orthogonal
set of unit vectors.
Such a set of vectors is also referred to as an orthonormal vector set. Each vector
has unit length as follows
Therefore, if these unit vectors are transformed by the rotation submatrix, then the
vector (rxx, rxy) is converted to a unit vector along the x axis and the vector (ryx,
ryy) is transformed into a unit vector along the y axis of the coordinate system
For example, the following rigid-body transformation first rotates an object through
an angle θ about a pivot point (xr , yr ) and then translates the object
Here, orthogonal unit vectors in the upper-left 2×2 submatrix are (cos θ, −sin θ) and
(sin θ, cos θ).
The rotation matrix for revolving an object from position (a) to position (b) can be
constructed with the values of the unit orientation vectors u’ and v’ relative to the original
orientation.
Reflection
A transformation that produces a mirror image of an object is called a reflection.
This transformation retains x values, but “flips” the y values of coordinate positions.
The resulting orientation of an object after it has been reflected about the x axis is
shown in Figure
A reflection about the line x = 0 (the y axis) flips x coordinates while keeping y coordinates the same.
Figure below illustrates the change in position of an object that has been reflected about the line x = 0.
We flip both the x and y coordinates of a point by reflecting relative to an axis that
is perpendicular to the xy plane and that passes through the coordinate origin the
matrix representation for this reflection is
If we choose the reflection axis as the diagonal line y = x (Figure below), the
reflection matrix is
To obtain a transformation matrix for reflection about the diagonal y = -x, we could concatenate matrices for the transformation sequence:
(1) clockwise rotation by 45°,
(2) reflection about the y axis, and
(3) counterclockwise rotation by 45°.
Shear
A transformation that distorts the shape of an object such that the transformed
shape appears as if the object were composed of internal layers that had been
caused to slide over each other is called a shear.
Two common shearing transformations are those that shift coordinate x values and
those that shift y values. An x-direction shear relative to the x axis is produced with
the transformation Matrix
Any real number can be assigned to the shear parameter shx. Setting parameter shx to the value 2, for example, changes the square into a parallelogram, as shown below. Negative values for shx shift coordinate positions to the left.
A unit square (a) is converted to a parallelogram (b) using the x-direction shear with shx = 2.
A y-direction shear relative to the line x = xref is generated with the transformation
Matrix
Translating an object from screen position (a) to the destination position shown in (b) by
moving a rectangular block of pixel values. Coordinate positions Pmin and Pmax specify
the limits of the rectangular block to be moved, and P0 is the destination reference position.
For array rotations that are not multiples of 90◦, we need to do some extra
processing.
The general procedure is illustrated in Figure below.
Each destination pixel area is mapped onto the rotated array and the amount of
overlap with the rotated pixel areas is calculated.
A color for a destination pixel can then be computed by averaging the colors of the
overlapped source pixels, weighted by their percentage of area overlap.
Pixel areas in the original block are scaled, using specified values for sx and sy, and
then mapped onto a set of destination pixels.
The color of each destination pixel is then assigned according to its area of overlap
with the scaled pixel areas
A block of RGB color values in a buffer can be saved in an array with the function
glReadPixels (xmin, ymin, width, height, GL_RGB, GL_UNSIGNED_BYTE,
colorArray);
If color-table indices are stored at the pixel positions, we replace the constant GL_RGB with GL_COLOR_INDEX.
To rotate the color values, we rearrange the rows and columns of the color array, as described in the previous section. Then we put the rotated array back in the buffer with
glDrawPixels (width, height, GL_RGB, GL_UNSIGNED_BYTE, colorArray);
which writes the block at the current raster position.
We produce a scaling transformation with respect to the coordinate origin with the function
glScale* (sx, sy, sz);
The suffix code is again either f or d, and the scaling parameters can be assigned any real-number values.
Scaling in a two-dimensional system involves changes in the x and y dimensions, so a typical two-dimensional scaling operation has a z scaling factor of 1.0.
Example: glScalef (2.0, -3.0, 1.0);
We can also concatenate a specified matrix with the current matrix as follows:
glMultMatrix* (otherElements16);
Again, the suffix code is either f or d, and parameter otherElements16 is a 16-element, single-subscripted array that lists the elements of some other matrix in column-major order.
Thus, assuming that the current matrix is the modelview matrix, which we designate as M, the updated modelview matrix is computed as
M = M · M'
The glMultMatrix function can also be used to set up any transformation sequence
with individually defined matrices.
For example,
glMatrixMode (GL_MODELVIEW);
glLoadIdentity ( ); // Set current matrix to the identity.
glMultMatrixf (elemsM2); // Postmultiply identity with matrix M2.
glMultMatrixf (elemsM1); // Postmultiply M2 with matrix M1.
produces the following current modelview matrix:
M = M2 · M1
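The same postmultiplication ordering applies to the built-in routines glTranslatef and glRotatef (standard OpenGL matrix functions). As a sketch, rotation about a pivot point (xr, yr) in the xy plane can be set up as follows, remembering that the last transformation specified is the first one applied to the object:

#include <GL/glut.h>

/* Concatenate a rotation of "theta" degrees about pivot (xr, yr) onto the
   current modelview matrix (translate pivot to origin, rotate, translate back). */
void rotateAboutPivot (GLfloat xr, GLfloat yr, GLfloat theta)
{
    glMatrixMode (GL_MODELVIEW);
    glTranslatef (xr, yr, 0.0);          /* Step 3: move the pivot back.          */
    glRotatef (theta, 0.0, 0.0, 1.0);    /* Step 2: rotate about the z axis.      */
    glTranslatef (-xr, -yr, 0.0);        /* Step 1: move the pivot to the origin. */
}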
3D Geometric Transformations
Three-Dimensional Geometric Transformations
• Methods for geometric transformations in three dimensions are extended from two
dimensional methods by including considerations for the z coordinate.
• A three-dimensional position, expressed in homogeneous coordinates, is
represented as a four-element column vector
Three-Dimensional Translation
A position P = (x, y, z) in three-dimensional space is translated to a location P' = (x', y', z') by adding translation distances tx, ty, and tz to the Cartesian coordinates of P:
x' = x + tx, y' = y + ty, z' = z + tz
or, in matrix form, P' = T · P, where T is the 4 x 4 translation matrix with tx, ty, and tz in the fourth column.
CODE:
typedef GLfloat Matrix4x4 [4][4];
/* Construct the 4 x 4 identity matrix. */
void matrix4x4SetIdentity (Matrix4x4 matIdent4x4)
{
GLint row, col;
for (row = 0; row < 4; row++)
for (col = 0; col < 4 ; col++)
matIdent4x4 [row][col] = (row == col);
}
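Building on the identity-matrix helper above, a possible continuation (a sketch; the function name translate3D and its parameter list are assumptions consistent with the routines in these notes) that fills in the 4 x 4 translation matrix is:

/* Construct the 4 x 4 translation matrix for distances tx, ty, tz. */
void translate3D (GLfloat tx, GLfloat ty, GLfloat tz, Matrix4x4 matTransl3D)
{
    /* Start from the identity matrix. */
    matrix4x4SetIdentity (matTransl3D);

    /* Place the translation distances in the fourth column. */
    matTransl3D [0][3] = tx;
    matTransl3D [1][3] = ty;
    matTransl3D [2][3] = tz;
}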
Three-Dimensional Rotation
By convention, positive rotation angles produce counterclockwise rotations about a
coordinate axis.
Positive rotations about a coordinate axis are counterclockwise, when looking along
the positive half of the axis toward the origin.
The two-dimensional z-axis rotation equations extend directly to three dimensions:
x' = x cos θ - y sin θ, y' = x sin θ + y cos θ, z' = z
Transformation equations for rotations about the other two coordinate axes can be obtained with a cyclic permutation of the coordinate parameters x, y, and z:
x → y → z → x
Along the x axis: y' = y cos θ - z sin θ, z' = y sin θ + z cos θ, x' = x
Along the y axis: z' = z cos θ - x sin θ, x' = z sin θ + x cos θ, y' = y
When an object is to be rotated about an axis that is parallel to one of the coordinate axes, we can obtain the desired rotation with the following transformation sequence:
1. Translate the object so that the rotation axis coincides with the parallel coordinate axis.
2. Perform the specified rotation about that axis.
3. Translate the object so that the rotation axis is moved back to its original position.
When an object is to be rotated about an axis that is not parallel to one of the coordinate axes, we must perform some additional transformations. We can accomplish the required rotation in five steps:
1. Translate the object so that the rotation axis passes through the coordinate origin.
2. Rotate the object so that the axis of rotation coincides with one of the coordinate axes.
3. Perform the specified rotation about the selected coordinate axis.
4. Apply inverse rotations to bring the rotation axis back to its original orientation.
5. Apply the inverse translation to bring the rotation axis back to its original spatial
position.
We define a unit vector u = (a, b, c) along the specified rotation axis, where the components a, b, and c are the direction cosines for the rotation axis.
• The first step in the rotation sequence is to set up the translation matrix that
repositions the rotation axis so that it passes through the coordinate origin.
• Translation matrix is given by
• Because rotation calculations involve sine and cosine functions, we can use
standard vector operations to obtain elements of the two rotation matrices.
• A vector dot product can be used to determine the cosine term, and a vector cross
product can be used to calculate the sine term.
• Rotation of u around the x axis into the x z plane is accomplished by rotating u’ (the
projection of u in the y z plane) through angle α onto the z axis.
• If we represent the projection of u in the yz plane as the vector u’= (0, b, c), then the
cosine of the rotation angle α can be determined from the dot product of u’ and the
unit vector uz along the z axis:
cos α = (u' · uz) / (|u'| |uz|) = c / d, where d = |u'| = sqrt(b² + c²); the cross product of u' and uz similarly gives sin α = b / d.
• We have determined the values for cos α and sin α in terms of the components of
vector u, the matrix elements for rotation of this vector about the x axis and into the
xz plane
• Rotation of unit vector u” (vector u after rotation into the x z plane) about the y
axis. Positive rotation angle β aligns u” with vector uz .
• We can determine the cosine of rotation angle β from the dot product of unit vectors
u’’ and uz. Thus,
• we find that
• The specified rotation angle θ can now be applied as a rotation about the z axis as
follows:
• The transformation matrix for rotation about an arbitrary axis can then be expressed
as the composition of these seven individual transformations:
• The composite matrix for any sequence of three-dimensional rotations is of the form
• Assuming that the rotation axis is not parallel to any coordinate axis, we could form
the following set of local unit vectors
• If we express the elements of the unit local vectors for the rotation axis as
• Then the required composite matrix, which is equal to the product Ry(β) · Rx(α), is
Rotation of the point is then carried out with the quaternion operation
The second term in this ordered pair is the rotated point position p’, which is
evaluated with vector dot and cross-products as
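In standard form (a reconstruction of the expressions referred to above), with the unit quaternion q = (s, v), where s = cos(θ/2) and v = u sin(θ/2) for the unit rotation axis u, and with the point represented as the quaternion P = (0, p):
\[
P' = q\,P\,q^{-1}, \qquad
\mathbf{p}' = s^{2}\mathbf{p} + \mathbf{v}(\mathbf{p}\cdot\mathbf{v}) + 2s(\mathbf{v}\times\mathbf{p}) + \mathbf{v}\times(\mathbf{v}\times\mathbf{p})
\]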
Three-Dimensional Scaling
The matrix expression for the three-dimensional scaling transformation of a
position P = (x, y, z) is given by
where scaling parameters sx, sy, and sz are assigned any positive values.
Explicit expressions for the scaling transformation relative to the origin are
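In standard form (a reconstruction of the equations referred to above), the scaling matrix and the explicit coordinate expressions are:
\[
\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} =
\begin{bmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix},
\qquad
x' = x \cdot s_x, \quad y' = y \cdot s_y, \quad z' = z \cdot s_z
\]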
Because some graphics packages provide only a routine that scales relative to the
coordinate origin, we can always construct a scaling transformation with respect to
any selected fixed position (xf , yf , zf ) using the following transformation sequence:
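The sequence (a standard construction) is: (1) translate the fixed point to the coordinate origin, (2) apply the origin-based scaling, and (3) translate the fixed point back to its original position. The composite matrix is:
\[
T(x_f, y_f, z_f)\cdot S(s_x, s_y, s_z)\cdot T(-x_f, -y_f, -z_f) =
\begin{bmatrix}
s_x & 0 & 0 & (1-s_x)x_f \\
0 & s_y & 0 & (1-s_y)y_f \\
0 & 0 & s_z & (1-s_z)z_f \\
0 & 0 & 0 & 1
\end{bmatrix}
\]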
CODE:
class wcPt3D
{
private:
GLfloat x, y, z;
public:
/* Default Constructor:
* Initialize position as (0.0, 0.0, 0.0).
*/
wcPt3D ( ) {
x = y = z = 0.0;
}
void setCoords (GLfloat xCoord, GLfloat yCoord, GLfloat zCoord) {
x = xCoord;
y = yCoord;
z = zCoord;
}
GLfloat getx ( ) const {
return x;
}
GLfloat gety ( ) const {
return y;
}
GLfloat getz ( ) const {
return z;
}
};
typedef GLfloat Matrix4x4 [4][4];
void scale3D (GLfloat sx, GLfloat sy, GLfloat sz, wcPt3D fixedPt)
{
Matrix4x4 matScale3D;
/* Initialize the scaling matrix to the identity. */
matrix4x4SetIdentity (matScale3D);
/* Set up the composite scaling matrix relative to the fixed point,
using x' = sx * x + (1 - sx) * xf, and similarly for y and z. */
matScale3D [0][0] = sx;
matScale3D [0][3] = (1 - sx) * fixedPt.getx ( );
matScale3D [1][1] = sy;
matScale3D [1][3] = (1 - sy) * fixedPt.gety ( );
matScale3D [2][2] = sz;
matScale3D [2][3] = (1 - sz) * fixedPt.getz ( );
}
Three-Dimensional Shears
These transformations can be used to modify object shapes.
For three-dimensional we can also generate shears relative to the z axis.
A general z-axis shearing transformation relative to a selected reference position is
produced with the following matrix:
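A standard form of this z-axis shearing matrix, with shear parameters sh_zx and sh_zy and reference position z_ref, is:
\[
\begin{bmatrix}
1 & 0 & sh_{zx} & -sh_{zx}\, z_{ref} \\
0 & 1 & sh_{zy} & -sh_{zy}\, z_{ref} \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix},
\qquad
x' = x + sh_{zx}(z - z_{ref}), \quad y' = y + sh_{zy}(z - z_{ref}), \quad z' = z
\]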
A unit cube (a) is sheared relative to the origin (b) by this matrix with shzx = shzy = 1.
There are two functions available in OpenGL for processing the matrices in a stack:
glPushMatrix ( );
which copies the current matrix at the top of the active stack and stores that copy in the second stack position; and
glPopMatrix ( );
which destroys the matrix at the top of the stack, so that the second matrix in the stack becomes the current matrix.
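A minimal usage sketch (the transformation values and the drawing call are illustrative) showing how the stack preserves the current modelview matrix around a local transformation:
glMatrixMode (GL_MODELVIEW);
glPushMatrix ( );               /* Save a copy of the current modelview matrix. */
glTranslatef (20.0, 10.0, 0.0); /* Local transformation for this object only.   */
glRotatef (45.0, 0.0, 0.0, 1.0);
/* ... draw the object here ... */
glPopMatrix ( );                /* Discard the local transformation and restore the saved matrix. */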
Module-3
Interactive Input Methods and Graphical User Interfaces
These are logical functions that are defined by how they handle input or output character strings from the perspective of a C program. From the logical-device perspective, inputs appear to come from inside the application program.
The two major characteristics that describe the logical behavior of input devices are:
1. the measurements that the device returns to the user program, and
2. the time at which the device returns those measurements.
The API defines six classes of logical input devices, which are given below:
1. STRING: A string device is a logical device that provides the ASCII values of input characters to the user program. This logical device is usually implemented by means of a physical keyboard.
2. LOCATOR: A locator device provides a position in world coordinates to the user program. It is usually implemented by means of a pointing device such as a mouse or trackball.
3. PICK: A pick device returns the identifier of an object on the display to the user
program. It is usually implemented with the same physical device as the locator
but has a separate software interface to the user program. In OpenGL, we can use a
process of selection to accomplish picking.
4. CHOICE: A choice device allows the user to select one of a discrete number of
options. In OpenGL, we can use various widgets provided by the window system. A widget is a graphical interactive component provided by the window system or a toolkit; widgets include menus, scrollbars, and graphical buttons. For example, a menu with n selections acts as a choice device, allowing the user to select one of n alternatives.
5. VALUATOR: A valuator device provides analog input to the user program; on some graphical systems, there are boxes or dials to provide such values.
6. STROKE: A stroke device returns an array of locations. For example, pushing down a mouse button starts the transfer of data into a specified array, and releasing the button ends this transfer.
INPUT MODES
Input devices can provide input to an application program in terms of two entities:
1. Measure of a device is what the device returns to the user program.
2. Trigger of a device is a physical input on the device with which the user can send a signal to the computer.
For example, the measure of a mouse is the position of the cursor, whereas the trigger is a press of a mouse button.
The application program can obtain the measure and trigger in three distinct modes:
1. REQUEST MODE: In this mode, the measure of the device is not returned to the program until the device is triggered.
• For example, consider a logical device such as a locator: we can move our pointing device to the desired location and then trigger the device with its button; the trigger causes the location to be returned to the application program.
2. SAMPLE MODE: In this mode, input is immediate. As soon as the function call in the user program is executed, the measure is returned; hence no trigger is needed.
Both request and sample modes are useful only when there is a single input device from which the input is to be taken. However, in applications such as flight simulators or computer games, a variety of input devices are used, and these modes cannot be used. Thus, event mode is used.
3. EVENT MODE: In this mode, each trigger generates an event whose measure is placed in an event queue for the application program to process.
• If the queue is empty, then the application program will wait until an event occurs. If there is an event in the queue, the program can look at the first event type and then decide what to do.
Another approach is to associate a function with an event, to be executed when the event occurs; such a function is called a callback.
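For example (a minimal GLUT sketch; the handler name is illustrative), a callback is registered once, and GLUT invokes it whenever the corresponding event occurs:
void mouseHandler (int button, int state, int x, int y)
{
/* Invoked by GLUT on every mouse-button event; (x, y) is the cursor position. */
if (button == GLUT_LEFT_BUTTON && state == GLUT_DOWN)
glutPostRedisplay ( );   /* Ask GLUT to redraw the scene. */
}
/* Registered once, typically in main: */
glutMouseFunc (mouseHandler);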
MENUS
Menus are an important feature of any application program. OpenGL provides a
feature called “Pop-up-menus” using which sophisticated interactive applications
can be created.
Menu creation involves the following steps:
1. Define the actions corresponding to each entry in the menu.
2. Link the menu to a corresponding mouse button.
GLUT also supports the creation of hierarchical menus, as sketched below:
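A minimal sketch (the handler and entry names are illustrative) of a pop-up menu with one submenu, attached to the right mouse button:
void colorSubMenu (int option)
{
/* Callback invoked with the identifier of the selected submenu entry. */
}
void mainMenu (int option)
{
/* Callback for the top-level menu entries (e.g., option 1 = "Quit"). */
}
void createMenus (void)
{
int subMenu;
subMenu = glutCreateMenu (colorSubMenu);   /* Create the submenu first.           */
glutAddMenuEntry ("Red", 1);
glutAddMenuEntry ("Green", 2);
glutCreateMenu (mainMenu);                 /* Then the top-level menu.            */
glutAddSubMenu ("Color", subMenu);         /* Hierarchical (cascading) entry.     */
glutAddMenuEntry ("Quit", 1);
glutAttachMenu (GLUT_RIGHT_BUTTON);        /* Step 2: link the menu to a button.  */
}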
Computer Animation
Design of Animation Sequences
Constructing an animation sequence can be a complicated task, particularly when it involves a story line and multiple objects, each of which can move in a different way. A basic approach is to design such animation sequences using the following development stages:
• Storyboard layout
• Object definitions
• Key-frame specifications
• Generation of in-between frames
The storyboard is an outline of the action. It defines the motion sequence as a set of basic
events that are to take place. Depending on the type of animation to be produced, the
storyboard could consist of a set of rough sketches, along with a brief description of the
movements, or it could just be a list of the basic ideas for the action. Originally, the set of motion sketches was attached to a large board that was used to present an overall view of the animation project. Hence the name “storyboard.”
An object definition is given for each participant in the action. Objects can be defined in terms of basic shapes, such as polygons or spline surfaces. In addition, a description is often given of the movements that are to be performed by each character or object in the story.
A key frame is a detailed drawing of the scene at a certain time in the animation sequence.
Within each key frame, each object (or character) is positioned according to the time for that
frame. Development of the key frames is generally the responsibility of the senior animators,
and often a separate animator is assigned to each character in the animation.
In-betweens are the intermediate frames between the key frames. The total number of
frames, and hence the total number of in-betweens, needed for an animation is determined
by the display media that is to be used. Film requires 24 frames per second, and graphics terminals are refreshed at the rate of 60 or more frames per second. Typically, time intervals for the motion are set up so that there are from three to five in-betweens for each pair of key
frames. Depending on the speed specified for the motion, some key frames could be
duplicated. As an example, a 1-minute film sequence with no duplication requires a total of
1,440 frames. If five in-betweens are required for each pair of key frames, then 288 key
frames would need to be developed.
Traditional Animation Techniques
Film animators use a variety of methods for depicting and emphasizing motion
sequences.
These include object deformations, spacing between animation frames, motion anticipation and follow-through, and action focusing.
One of the most important techniques for simulating acceleration effects, particularly
for nonrigid objects, is squash and stretch.
Figure 4 shows how this technique is used to emphasize the acceleration and deceleration of a bouncing ball. As the ball accelerates, it begins to stretch.
When the ball hits the floor and stops, it is first compressed (squashed) and then
stretched again as it accelerates and bounces upwards.
Another technique used by film animators is timing, which refers to the spacing between motion frames.
A slower moving object is represented with more closely spaced frames, and a faster
moving object is displayed with fewer frames over the path of the motion.
This effect is illustrated in Figure 5, where the position changes between frames
increase as a bouncing ball moves faster.
Object movements can also be emphasized by creating preliminary actions that
indicate an anticipation of a coming motion. For example, a cartoon character might lean forward and rotate its body before starting to run; or a character might perform a
Similarly, follow-through actions can be used to emphasize a previous motion.
After throwing a ball, a character can continue the arm swing back to its body; or a
hat can fly off a character that is stopped abruptly.
An action also can be emphasized with staging, which refers to any method for focusing on an important part of a scene, such as a character hiding something.
Computer-Animation Languages
We can develop routines to design and control animation sequences within a general-
purpose programming language, such as C, C++, Lisp, or Fortran, but several
specialized animation languages have been developed.
These languages typically include a graphics editor, a key-frame generator, an in-between generator, and standard graphics routines.
The graphics editor allows an animator to design and modify object shapes, using
spline surfaces, constructive solid geometry methods, or other representation
schemes.
An important task in an animation specification is scene description. This includes
the positioning of objects and light sources, defining the photometric parameters
(light-source intensities and surface illumination properties), and setting the camera
parameters (position, orientation, and lens characteristics).
Another standard function is action specification, which involves the layout of
motion paths for the objects and camera.
We need the usual graphics routines: viewing and perspective transformations,
geometric transformations to generate object movements as a function of
accelerations or kinematic path specifications, visible-surface identification, and the
surface-rendering operations.
Key-frame systems were originally designed as a separate set of animation routines
for generating the in-betweens from the user-specified key frames. Now, these
routines are often a component in a more general animation package.
In the simplest case, each object in a scene is defined as a set of rigid bodies
connected at the joints and with a limited number of degrees of freedom.
Parameterized systems allow object motion characteristics to be specified as part of the object definitions. The adjustable parameters control such object characteristics as degrees of freedom, motion limitations, and allowable shape changes.
Scripting systems allow object specifications and animation sequences to be defined
with a user-input script. From the script, a library of various objects and motions can
be constructed.
Character Animation
Animation of simple objects is relatively straightforward.
Motion Capture
Periodic Motions
When we construct an animation with repeated motion patterns, such as a rotating object, we need to be sure that the motion is sampled frequently enough to represent the movements correctly.
In other words, the motion must be synchronized with the frame-generation rate so that we display enough frames per cycle to show the true motion. Otherwise, the animation may be displayed incorrectly.
A typical example of an undersampled periodic-motion display is the wagon wheel in a Western movie that appears to be turning in the wrong direction. If this motion is recorded on film at the standard motion-picture projection rate of 24 frames per second, and the wheel completes 3/4 of a turn every 1/24 of a second, then only one animation frame is generated per cycle. Since 3/4 of a forward turn sampled once per frame is indistinguishable from 1/4 of a backward turn, the wheel thus appears to be rotating in the opposite (counterclockwise) direction.
Module-4
Introduction to Image processing
Overview
Image processing involves the manipulation and analysis of images to enhance them or
extract useful information. It is a subset of signal processing where the input is an image,
and the output can be an image or characteristics/features associated with that image.
Image processing refers to the processing of visual information sources, such as images, for some specific task as per the application requirements. Pattern recognition deals with identifying and recognizing the objects present in an image, using the generated features together with classification or clustering. Computer vision is associated with scene understanding; most image processing algorithms produce results that can serve as the first input for computer vision algorithms. Computer graphics and image processing are closely related areas: image processing deals with raster data and bitmaps, whereas computer graphics primarily deals with vector data.
An image is defined as a two dimensional function, f(x,y), where x and y are spatial (plane)
coordinates, and the amplitude of ‘f’ at any pair of coordinates (x,y) is called the intensity or
gray level of the image at that point
The field of digital image processing refers to processing digital image by means of a digital
computer.
NOTE: A digital image is composed of a finite number of elements called picture elements, image elements, pels, or pixels.
Digital image processing methods stem from two principal application areas: (1) improvement of pictorial information for human interpretation, and (2) processing of image data for storage, transmission, and representation for autonomous machine perception.
Image acquisition: This is the first step in the digital image processing. An image is
captured by a sensor (such as digital camera) and digitized. The image that is acquired is
completely unprocessed. This step involves preprocessing such as scaling.
Image restoration: This is an area that also deals with improving the appearance of an image, but it is objective, rather than subjective, in the sense that restoration techniques tend to be based on mathematical or probabilistic models of image degradation.
Color image processing: It is an area that has been gaining in importance because of the
significant increase in the use of digital images over the internet. Color is used also for
extracting features of interest in an image. This may include color modeling and processing
in a digital domain etc.
Wavelets: These are the foundation for representing images in various degrees of resolution.
In particular used for image data compression and for pyramidal representation, in which
images are subdivided successively into smaller regions.
Compression: Deals with techniques for reducing the storage required to save an image, or the bandwidth required to transmit it. An example of an image compression standard is the jpg file extension used in the JPEG (Joint Photographic Experts Group) image compression standard.
Morphological processing: It deals with tools for extracting image components that are
useful in the representation and description of shape.
Representation and description: These follow the output of the segmentation stage, which usually is raw pixel data; this needs to be converted to a form suitable for computer processing. The first decision that must be made is whether the data should be represented as a boundary (i.e., the set of pixels separating one image region from another) or as a complete region.
Description also called feature selection, deals with extracting attributes that result in some
quantitative information of interest or are basic for differentiating one class of objects from
another.
Recognition: It is the process that assigns a label (e.g., ‘vehicle’) to an object based on its
descriptors.
The above figure shows the basic components comprising a typical general purpose system
used for digital image processing. With reference to sensing, two elements are required to
acquire digital images:
1. The physical device that is sensitive to the energy radiated by the object we wish
to image.
2. Digitizer is a device for converting the output of the physical sensing device into
digital form.
For example in a digital camera, the sensors produce an electrical output proportional to
light intensity.
Specialized image processing hardware usually consists of the digitizer plus hardware that performs other primitive operations, such as an arithmetic logic unit (ALU) that carries out operations in parallel on entire images. This type of hardware is called a front-end subsystem, and its most distinguishing characteristic is speed: this unit performs functions that require fast data throughputs that the typical main computer cannot handle.
The Computer in an image processing system is a general purpose computer and can
range from a PC to a supercomputer. In dedicated applications, sometimes custom
computers are used to achieve a required level of performance.
Software for image processing consists of specialized modules that perform specific
tasks. A well designed package also includes the capacity for the user to write code that, as a
minimum, utilizes the specialized modules.
Storage is measured in bytes, Kbytes, Mbytes, Gbytes, and Tbytes. One method of providing short-term storage is computer memory. Another is specialized buffers that store one or more images and can be accessed rapidly, usually at video rates. The latter method allows virtually instantaneous image zoom, as well as scroll (vertical shifts) and pan (horizontal shifts).
Online storage generally takes the form of magnetic disks or optical image storage. The key
factor characterizing the online storage is frequent access to the stored data. Magnetic tapes
and optical disks housed in ‘jukeboxes’ are the usual media for the archival applications.
Image displays in use today are mainly color TV monitors. Monitors are driven by the
outputs of image and graphic display cards that are an integral part of the computer system.
In some cases it is necessary to have stereo displays, and these are implemented in the form
of head gear containing two small displays embedded in goggles worn by the users.
Hardcopy devices for recording images include laser printers, film cameras, heat-sensitive devices, inkjet units, and digital units such as optical and CD-ROM disks. Film provides the highest possible resolution, but paper is the obvious medium of choice for written material.
Networking is almost a default function in any computer system in use today. Because of
the large amount of data inherent in image processing applications, the key consideration in
image transmission is bandwidth. Optical fiber and other broadband technologies are overcoming the problem of communicating with remote sites via the Internet.
The areas of application of digital image processing are so varied that some form of
organization is desirable in attempting to capture the breadth of this field. One of the
simplest ways to develop a basic understanding of the extent of image processing
applications is to categorize images according to their application.
1. Medical imaging
2. Robot vision
3. Character recognition
4. Remote Sensing.
Medical Imaging:
Gamma-Ray Imaging: Major uses of imaging based on gamma rays include nuclear
medicine and astronomical observations. In nuclear medicine, the approach is to inject a
patient with a radioactive isotope that emits gamma rays as it decays. Images are produced
from the emissions collected by gamma ray detectors.
X-ray Imaging: X-rays are among the oldest sources of EM radiation used for imaging. The
best known use of X-rays is medical diagnostics, but they also are used extensively in
industry and other areas, like astronomy. X-rays for medical and industrial imaging are
generated using an X-ray tube, which is a vacuum tube with a cathode and anode. The cathode is heated, causing free electrons to be released. These electrons flow at high speed to the positively charged anode. When the electrons strike a nucleus, energy is released in the form of X-ray radiation. The energy (penetrating power) of the X-rays is controlled by a voltage applied across the anode, and the number of X-rays is controlled by a current applied to the filament in the cathode. The intensity of the X-rays is modified by absorption as they pass through the patient, and the resulting energy falling on the film develops it, much in the same way that light develops photographic film. In digital radiography, digital images are obtained by one of two methods:
(1) by digitizing X-ray films, or (2) by having the X-rays that pass through the patient fall directly onto devices
(such as a phosphor screen) that convert X-rays to light.
Robot Vision:
Apart from the many challenges that a robot faces today, one of the biggest challenges is still to improve the vision of the robot, making the robot able to see things and identify them.
1. Hurdle detection:
Hurdle detection is one of the common tasks that is carried out through image processing, by identifying the different types of objects in the image and then calculating the distance between the robot and the hurdles.
2. Line follower robots:
Most of the robots today work by following a line and thus are called line follower robots. This helps a robot to move on its path and perform some tasks. This has also been achieved through image processing.
Character Recognition:
• Legal department
• Retail Industry
Remote Sensing:
In the field of remote sensing, an area of the earth is scanned by a satellite or from a very high altitude and then analyzed to obtain information about it. One particular application of digital image processing in the field of remote sensing is to detect infrastructure damage caused by an earthquake. The area affected by an earthquake is sometimes so wide that it is not possible to examine it with the human eye in order to estimate the damage, and even where it is possible, doing so is a very hectic and time-consuming procedure. A solution to this is found in digital image processing using remote sensing: an image of the affected area is captured from above the ground and then analyzed to detect the various types of damage done by the earthquake. The key steps included in the analysis are:
Machine Learning: Machine Learning, particularly deep learning, is essential for advanced
image processing tasks. It involves training algorithms and statistical models on large
datasets to perform tasks without explicit instructions. In image processing, machine
learning models are trained on processed image data for tasks such as image classification,
where images are sorted into categories, and image enhancement, which improves resolution
using techniques like super-resolution. Predictive modeling, another application, uses
medical images to predict disease progression. Machine learning enhances image processing
by providing powerful tools for analyzing and interpreting complex visual data.
Signal Processing: Signal Processing deals with the analysis, modification, and synthesis of
signals, including images. It provides the mathematical and computational foundation for
many image processing techniques. Applications of signal processing in image processing
include audio and speech processing, where noise reduction techniques are applied to audio
recordings, and medical imaging, where signal processing algorithms reconstruct images
from MRI and CT scans. In communications, image compression techniques reduce file
sizes for efficient transmission and storage, highlighting the integral role of signal
processing in managing and enhancing visual data.
In general, the value of the image at any coordinates (x,y) is denoted f(x,y), where x and y
are integers. The section of the real plane spanned by the coordinates of an image is called
the spatial domain, with x and y being referred to as spatial variables or spatial coordinates.
The image displays allow us to view results at a glance. Numerical arrays are used for
processing and algorithm development. In equation form, we write the representation of an
M × N numerical array as
Both sides of this equation are equivalent ways of expressing a digital image quantitatively. The right side is a matrix of real numbers. Each element of this matrix is called an image element, picture element, pixel, or pel. The digitization process requires that decisions be made regarding the values for M, N, and the number L of discrete intensity levels. Here M and N are positive integers. The number of intensity levels typically is an integer power of 2:
L = 2^k
The number of bits, b, required to store a digitized image is b = M × N × k. When M = N, this equation becomes b = N²k.
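For example, an image with M = N = 1024 and k = 8 bits per pixel (L = 2^8 = 256 intensity levels) requires b = 1024 × 1024 × 8 = 8,388,608 bits, or about 1 megabyte, of storage.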
Types of Images
1. Binary Image: Images that have only two unique values of pixel intensity- 0
(representing black) and 1 (representing white) are called binary images. Such
images are generally used to highlight a discriminating portion of a colored
image. For example, it is commonly used for image segmentation, as shown
below.
Basic relationships
Adjacency: Two pixels are adjacent if they are neighbors and share a similar
property, such as intensity. Types of adjacency include 4-adjacency, 8-adjacency,
and m-adjacency (mixed).
A pixel p at coordinates (x, y) has four horizontal and vertical neighbors whose coordinates are given by (x+1, y), (x−1, y), (x, y+1), and (x, y−1). This set of pixels is called the 4-neighbors of p and is denoted by N4(p); each of them is at unit distance from p. The four diagonal neighbors of p are (x+1, y+1), (x+1, y−1), (x−1, y+1), and (x−1, y−1). This set is denoted by ND(p); each of them is at a Euclidean distance of 1.414 from p. The points in ND(p) and N4(p) together are known as the 8-neighbors of p, denoted by N8(p). Some of the points in N4, ND, and N8 may fall outside the image when p lies on the border of the image.
N4 - 4-neighbors
ND - diagonal neighbors
N8 - 8-neighbors (N4 ∪ ND)
Neighbors of a pixel
a. 4-neighbors of a pixel p are its vertical and horizontal neighbors denoted by
N4(p)
Adjacency: Two pixels are connected if they are neighbors and their gray levels satisfy some specified criterion of similarity. For example, in a binary image, two pixels are said to be adjacent if they are 4-neighbors and have the same value (0 or 1).
Let V be the set of gray-level values used to define adjacency.
4-adjacency: Two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).
8-adjacency: Two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).
m-adjacency: Two pixels p and q with values from V are m-adjacent if
A. q is in N4(p), or
B. q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
Connectivity: Connectivity is used to determine whether pixels are adjacent in some sense. Let V be the set of gray-level values used to define connectivity; then two pixels p and q that have values from the set V are:
a. 4-connected, if q is in the set N4(p)
b. 8-connected, if q is in the set N8(p)
c. m-connected, if q is in N4(p), or if q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V
Connected Components:
If p and q are pixels of an image subset S then p is connected to q in S if there is a
path from p to q consisting entirely of pixels in S. For every pixel p in S, the set of pixels in
S that are connected to p is called a connected component of S. If S has only one connected component, then S is called a connected set.
If a region happens to be an entire image, then its boundary is defined as the set of pixels in the first and last rows and columns of the image, since an image has no neighbors beyond its border. Normally, when we refer to a region, we are referring to a subset of an image, and any pixels in the boundary of the region that happen to coincide with the border of the image are included implicitly as part of the region boundary.
Distance Metrics
1. Euclidean Distance: the straight-line distance between two pixels, calculated as d = √((x2 − x1)² + (y2 − y1)²)
2. Manhattan Distance (City Block Distance): The distance between two pixels
measured along axes at right angles, calculated as:
d = |x2 − x1| + |y2 − y1|
3. Chebyshev Distance (Chessboard Distance): the maximum of the absolute coordinate differences, calculated as d = max(|x2 − x1|, |y2 − y1|)
Given pixels p, q and z with coordinates (x, y), (s, t), (u, v) respectively, the distance function D has the following properties:
(a) D(p, q) ≥ 0, with D(p, q) = 0 if and only if p = q;
(b) D(p, q) = D(q, p);
(c) D(p, z) ≤ D(p, q) + D(q, z).
For example, the pixels with D4 (city-block) distance ≤ 2 from a pixel (x, y) form the following diamond-shaped contours of constant distance:
        2
    2   1   2
2   1   0   1   2
    2   1   2
        2
The pixels with D8 (chessboard) distance ≤ 2 from (x, y) form a square:
2   2   2   2   2
2   1   1   1   2
2   1   0   1   2
2   1   1   1   2
2   2   2   2   2
In general, the pixels with D8 distance from (x, y) less than or equal to some value r form a square centered at (x, y), as in the contours of constant distance shown above; the pixels with D8 = 1 are the 8-neighbors of (x, y).
Step 2: Selection of Features. Features to discriminate between the classes should be established using multispectral or multi-temporal characteristics, colour, texture, etc.
Step 3: Sampling of Training Data Training data should be sampled in order to determine
appropriate decision rules. Classification techniques such as supervised or unsupervised
learning will then be selected on the basis of the training data sets.
Step 4: Finding of proper decision rule Various classification techniques will be compared
with the training data, so that an appropriate decision rule is selected for subsequent
classification.
Step 5: Classification. Depending upon the decision rule, each pixel is classified into a single class. There are two methods: pixel-by-pixel classification and per-field classification with respect to segmented areas.
Step 6: Verification of Results The classified results should be checked and verified for
their accuracy and reliability.
Module-5
Image Segmentation
Image segmentation is the division of an image into regions or categories, which
correspond to different objects or parts of objects. Every pixel in an image is allocated to
one of a number of these categories. A good segmentation is typically one in which:
• pixels in the same category have similar greyscale or multivariate values and form a connected region,
• neighboring pixels which are in different categories have dissimilar values.
Segmentation is often the critical step in image analysis: the point at which we move
from considering each pixel as a unit of observation to working with objects (or parts of
objects) in the image, composed of many pixels.
Image segmentation is the key behind image understanding. Image segmentation is
considered as an important basic operation for meaningful analysis and interpretation of
image acquired.
It is a critical and essential component of an image analysis and/or pattern recognition
system, and is one of the most difficult tasks in image processing, which determines the
quality of the final segmentation.
If segmentation is done well then all other stages in image analysis are made simpler.
But, as we shall see, success is often only partial when automatic segmentation algorithms
are used. However, manual intervention can usually overcome these problems, and by this
stage the computer should already have done most of the work.
There are three general approaches to segmentation:
• Thresholding
• Edge-based methods
• Region-based methods
• In thresholding, pixels are allocated to categories according to the range of values in
which a pixel lies. Fig 4.1(a) shows boundaries which were obtained by thresholding
the muscle fibers image. Pixels with values less than 128 have been placed in one
category, and the rest have been placed in the other category. The boundaries between adjacent pixels in different categories have been superimposed in white on the original
image. It can be seen that the threshold has successfully segmented the image into the
two predominant fiber types.
For example, in Fig 4.1(a) boundaries are well placed, but others are missing. In Fig 4.1(c),
however, more boundaries are present, and they are smooth, but they are not always in
exactly the right positions.
The following three sections will consider these three approaches in more detail. Algorithms
will be considered which can either be fully automatic or require some manual intervention.
The key points of the chapter will be summarized in §4.4.
Fig. Boundaries produced by three segmentations of the muscle fibres image: (a) by thresholding, (b) connected regions after thresholding the output of Prewitt's edge filter and removing small regions, (c) result produced by the watershed algorithm on output from a variance filter with Gaussian weights (σ² = 96).
Discontinuity-based segmentation
Discontinuity-based segmentation is one of the widely used techniques for monochrome image segmentation. In the discontinuity-based approach, the partition or subdivision of an image is based on abrupt changes in the intensity level of the image. Here, we are mainly interested in the identification of isolated points, lines, and edges in an image; this is the area of edge detection algorithms. Under this approach we analyze point detection, line detection, and edge detection techniques, using operators based on first-order and second-order derivatives, such as the Prewitt, Sobel, and Roberts operators.
THRESHOLDING:
In the simplest case, a single threshold t is chosen, and a pixel (i, j) with value fij is placed in one category if fij ≤ t, and in the other category otherwise.
Note that:
Although pixels in a single thresholded category will have similar values (either in the
range 0 to t, or in the range (t + 1) to 255), they will not usually constitute a single connected
component. This is not a problem in the soil image because the object (air) is not necessarily
connected, either in the imaging plane or in three-dimensions. In other cases, thresholding
would be followed by dividing the initial categories into sub-categories of connected
regions.
More than one threshold can be used, in which case more than two categories are
produced.
Thresholds can be chosen automatically.
In §4.1.1 we will consider algorithms for choosing the threshold on the basis of the histogram of greyscale pixel values. In §4.1.2, manually and automatically selected thresholds will be considered.
HISTOGRAM-BASED THRESHOLDING
We will denote the histogram of pixel values by h0, h1,...,hN , where hk specifies the
number of pixels in an image with grey scale value k and N is the maximum pixel value
(typically 255). Ridler and Calvard (1978) and Trussell (1979) proposed a simple algorithm
for choosing a single threshold. We shall refer to it as the intermeans algorithm. First we
will describe the algorithm in words, and then mathematically.
Initially, a guess has to be made at a possible value for the threshold. From this, the
mean values of pixels in the two categories produced using this threshold are calculated. The
threshold is repositioned to lie exactly half way between the two means. Mean values are
calculated again and a new threshold is obtained, and so on until the threshold stops
changing value. Mathematically, the algorithm can be specified as follows.
1. Make an initial guess at t: for example, set it equal to the median pixel value, that is, the value for which the cumulative histogram first exceeds half the total number of pixels.
2. Calculate the mean pixel value in each category. For values less than or equal to t, this is given by
µ1 = Σ(k=0..t) k hk / Σ(k=0..t) hk ,
and similarly, for values greater than t,
µ2 = Σ(k=t+1..N) k hk / Σ(k=t+1..N) hk .
3. Re-estimate the threshold as t = [½(µ1 + µ2)], where [ ] denotes 'the integer part of' the expression between the brackets.
4. Repeat steps (2) and (3) until t stops changing value between consecutive evaluations.
Fig shows the histogram of the soil image. From an initial value of t = 28 (the median pixel
value), the algorithm changed t to 31, 32, and 33 on the first three iterations, and then t
remained unchanged. The pixel means in the two categories are 15.4 and 52.3. Fig (a) shows
the result of using this threshold. Note that this value of t is considerably higher than the
threshold value of 20 which we favored in the manual approach.
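A minimal sketch of the intermeans iteration in C (assuming an 8-bit image whose histogram h[0..255] has already been computed; the function name is illustrative):
int intermeansThreshold (const long h[256], int tInitial)
{
int t = tInitial, tOld = -1, iter;
for (iter = 0; iter < 100 && t != tOld; iter++) {
long n1 = 0, n2 = 0;
double sum1 = 0.0, sum2 = 0.0;
int k;
/* Mean grey level of each category defined by the current threshold t. */
for (k = 0; k <= t; k++)      { n1 += h[k]; sum1 += (double) k * h[k]; }
for (k = t + 1; k < 256; k++) { n2 += h[k]; sum2 += (double) k * h[k]; }
tOld = t;
if (n1 > 0 && n2 > 0)
t = (int) ((sum1 / n1 + sum2 / n2) / 2.0);   /* halfway between the two means */
}
return t;
}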
The inter means algorithm has a tendency to find a threshold which divides the
histogram in two, so that there are approximately equal numbers of pixels in the two
categories. In many applications, such as the soil image, this is not appropriate. One way to
overcome this drawback is to modify the algorithm as follows.
Here, p1 and p2 are proportions (such that p1 + p2 = 1) and φl(k) denotes the probability density of a Gaussian distribution, that is,
φl(k) = (1 / √(2πσl²)) exp{ −(k − µl)² / (2σl²) } for l = 1, 2,
where µl and σl² are the mean and variance of pixel values in category l. The best classification criterion, i.e. the one which misclassifies the least number of pixels, allocates pixels with value k to category 1 if p1φ1(k) ≥ p2φ2(k), and otherwise classifies them as category 2. After substituting for φ and taking logs, the inequality becomes a quadratic inequality in k.
The left side of the inequality is a quadratic function in k. Let A, B and C denote the
three terms in curly brackets, respectively. Then the criterion for allocating pixels with value
k to category 1 is:
a. If A = 0 (i.e. σ1² = σ2²), the criterion simplifies to one of allocating pixels with value k to category 1 if k lies below a single threshold determined by the remaining terms. (If, in addition, p1 = p2 and µ1 < µ2, the criterion becomes k ≤ ½(µ1 + µ2). Note that this is the intermeans criterion, which implicitly assumes that the two categories are of equal size.)
b. If B² < AC, then the quadratic function has no real roots, and all pixels are classified as 1 if A < 0 (i.e. σ1² > σ2²), or as 2 if A > 0.
In practice, cases (a) and (b) occur infrequently, and if µ1 < µ2 the rule simplifies to the threshold:
From an initial guess at the threshold, the proportions, means and variances of pixel values
in the two categories are calculated. The threshold is repositioned according to the above
criterion, and proportions, means and variances are recalculated. These steps are repeated
until there are no changes in values between iterations.
Mathematically:
1. Make an initial guess at a value for t.
2. Estimate p1, µ1 and σ1² from the pixels with values less than or equal to t (and p2, µ2 and σ2² from the remaining pixels).
3. Re-estimate t using the threshold criterion given above.
When applied to the soil image, the algorithm converged in 4 iterations to t = 24. Fig (b)
shows the result, which is more satisfactory than that produced by the intermeans algorithm
because it has allowed for a smaller proportion of air pixels (p1 = 0.45, as compared with p2
= 0.55). The algorithm has also taken account of the air pixels being less variable in value
than those for the soil matrix (σ1² = 30, whereas σ2² = 186). This is in accord with the left-most peak in the histogram plot (Fig ) being quite narrow.
EDGE-BASED SEGMENTATION
As we have seen, the results of threshold-based segmentation are usually less than
perfect. Often, a scientist will have to make changes to the results of automatic
segmentation. One simple way of doing this is by using a computer mouse to control a
screen cursor and draw boundary lines between regions. Fig (a) shows the boundaries
obtained by thresholding the muscle fibres image (as already displayed in Fig (a)),
superimposed on the output from Prewitt’s edge filter (§3.4.2), with the contrast stretched so
that values between 0 and 5 are displayed as shades of grey ranging from white to black and
values exceeding 5 are all displayed as black. This display can be used as an aid to
determine where extra boundaries need to be inserted to fully segment all muscle fibres. Fig
4.10(b) shows the result after manually adding 71 straight lines.
where the entries in h1,...,hK are used to keep track of which categories are equivalent,
and gij records the category label for pixel (i, j).
3. If just one of the two neighbours is an edge pixel, then (i, j) is assigned the same label as
the other one:
4. The final possibility is that neither neighbor is an edge pixel, in which case (i, j) is given
the same label as one of them:
and if the neighbors have labels which have not been marked as equivalent, i.e. hg(i−1,j) ≠ hg(i,j−1), then this needs to be done (because they are connected at pixel (i, j)). The equivalence is recorded by changing the entries in h1, ..., hK as follows:
- Set l1 = min(hg(i−1,j), hg(i,j−1)) and l2 = max(hg(i−1,j), hg(i,j−1)).
- For each value of k from 1 to K, if hk = l2 then set hk = l1.
Finally, after all the pixels have been considered, the array of labels is revised, taking into account which categories have been marked for amalgamation: gij → hg(i,j) for i, j = 1, . . . , n.
After application of the labeling algorithm, superfluous edge pixels — that is, those
which do not separate classes — can be removed: any edge-pixel which has neighbors only
of one category is assigned to that category. Fig 4.11(b) shows the result of applying the
labeling algorithm with edges as shown in Fig 4.11(a), and removing superfluous edge
pixels. The white boundaries have been superimposed on the original image.
Similarly, small segments (say less than 500 pixels in size) which do not touch the
borders of the image can be removed, leading to the previously displayed Fig 4.1(b). The
segmentation has done better than simple thresholding, but has failed to separate all fibers
because of gaps in output from Prewitt’s edge filter. Martello (1976), among others, has
proposed algorithms for bridging these gaps.
REGION-BASED SEGMENTATION
Clustering in the sense that pixels with similar values are grouped together, and
spatial in that pixels in the same category also form a single connected component.
For each of a sequence of increasing values of a threshold, all pixels with edge
strength less than this threshold which form a connected region with one of the seeds are
allocated to the corresponding fibre. When a threshold is reached for which two seeds
become connected, the pixels are used to label the boundary. A mathematical representation
of the algorithm is too complicated to be given here. Instead, we refer the reader to Vincent
and Soille (1991) for more details and an efficient algorithm. Meyer and Beucher (1990)
also consider the watershed algorithm, and added some refinements to the method.
Note that:
• The use of discs of radius 3 pixels, rather than single points, as seeds makes the watershed results less sensitive to fluctuations in Prewitt's filter output in the middle of fibres.
• The results produced by this semi-automatic segmentation algorithm are almost as
good as those shown in Fig 4.10(b), but the effort required in positioning seeds
inside muscle fibres is far less than that required to draw boundaries.
• Adams and Bischof (1994) present a similar seeded region growing algorithm, but
based directly on the image greyscale, not on the output from an edge filter.
The watershed algorithm, in its standard use, is fully automatic. Again, we will demonstrate
this by illustration. Fig shows the output produced by a variance filter (§3.4.1) with Gaussian weights (σ² = 96) applied to the muscle fibres image after histogram equalization (as shown in Fig (d)). The white seeds overlie all the local minima of the filter output, that is, pixels whose neighbors all have larger values and so are shaded lighter. Note that it is necessary to use a large value of σ² to ensure that the filter output does not have many more
local minima. The boundaries produced by the watershed algorithm have been added to Fig
An intuitive way of viewing the watershed algorithm is by considering the output from the
variance filter as an elevation map: light areas are high ridges and dark areas are valleys.
Each local minimum can be thought of as the point to which any water falling on the region
drains, and the segments are the catchments for them. Hence, the boundaries, or watersheds,
lie along tops of ridges. The previously mentioned Fig(c) shows this segmentation
superimposed on the original image.
Fig.: Manual segmentation of muscle fibres image by use of watersheds algorithm (a)
manually positioned ‘seeds’ in centers of all fibres, (b) output from Prewitt’s edge filter
together with watershed boundaries, (c) watershed boundaries superimposed on the image.
Figure : Output of variance filter with Gaussian weights (σ² = 96) applied to muscle fibres
image, together with seeds indicating all local minima and boundaries produced by
watershed algorithm.
There are very many other region-based algorithms, but most of them are quite
complicated. In this section we will consider just one more, namely an elegant split-and-
merge algorithm proposed by Horowitz and Pavlidis (1976). We will present it in a slightly
modified form to segment the log-transformed SAR image (Fig ), basing our segmentation
decisions on variances, whereas Horowitz and Pavlidis based theirs on the range of pixel
values. The algorithm operates in two stages, and requires a limit to be specified for the
maximum variance in pixel values in a region.
The first stage is the splitting one. Initially, the variance of the whole image is
calculated. If this variance exceeds the specified limit, then the image is subdivided into four
quadrants. Similarly, if the variance in any of these four quadrants exceeds the limit it is
further subdivided into four. This continues until the whole image consists of a set of
squares of varying sizes, all of which have variances below the limit. (Note that the
algorithm must be capable of achieving this because, at the finest resolution, each square consists of a single pixel and its variance is taken to be zero.)
Fig (a) shows the resulting boundaries in white, superimposed on the log-transformed
SAR image, with the variance limit set at 0.60. Note that:
The second stage of the algorithm, the merging one, involves amalgamating squares
which have a common edge, provided that by so doing the variance of the new region does
not exceed the limit. Once all amalgamations have been completed, the result is a
segmentation in which every region has a variance less than the set limit. However, although
the result of the first stage in the algorithm is unique, that from the second is not — it
depends on the order of which squares are considered.
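A minimal sketch of the splitting stage in C (the merging stage and the image bookkeeping are omitted; the Region type, the acceptSquare callback, and a square, power-of-two image side are assumptions, not part of the original algorithm description):
/* A square region: top-left corner plus side length. */
typedef struct { int x, y, size; } Region;
double regionVariance (const unsigned char *img, int width, Region r)
{
double sum = 0.0, sumSq = 0.0, mean;
int n = r.size * r.size, i, j;
for (i = 0; i < r.size; i++)
for (j = 0; j < r.size; j++) {
double v = img[(r.y + i) * width + (r.x + j)];
sum += v;
sumSq += v * v;
}
mean = sum / n;
return sumSq / n - mean * mean;   /* variance of pixel values in the square */
}
/* Recursively split a square into quadrants until every square's variance is
below the limit; a single pixel has zero variance, so the recursion terminates.
Accepted squares are reported through the acceptSquare callback. */
void splitRegion (const unsigned char *img, int width, Region r,
double varLimit, void (*acceptSquare)(Region))
{
int h;
Region q;
if (r.size == 1 || regionVariance (img, width, r) <= varLimit) {
acceptSquare (r);
return;
}
h = r.size / 2;
q.size = h;
q.x = r.x;     q.y = r.y;     splitRegion (img, width, q, varLimit, acceptSquare);
q.x = r.x + h; q.y = r.y;     splitRegion (img, width, q, varLimit, acceptSquare);
q.x = r.x;     q.y = r.y + h; splitRegion (img, width, q, varLimit, acceptSquare);
q.x = r.x + h; q.y = r.y + h; splitRegion (img, width, q, varLimit, acceptSquare);
}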
Fig (b) shows the boundaries produced by the algorithm, superimposed on the SAR
image. Dark and light fields appear to have been successfully distinguished between,
although the boundaries are rough and retain some of the artefacts of the squares in Fig (a).
Pavlidis and Liow (1990) proposed overcoming the deficiencies in the boundaries
produced by the Horowitz and Pavlidis algorithm by combining the results with those from
an edge-based segmentation. Many other ideas for region-based segmentation have been
proposed (see, for example, the review of Haralick and Shapiro, 1985), and it is still an
active area of research.
One possibility for improving segmentation results is to use an algorithm which over-
segments an image, and then apply a rule for amalgamating these regions. This requires
‘high-level’ knowledge, which falls into the domain of artificial intelligence. (All that we
have considered in this chapter may be termed ‘low-level’.) For applications of these ideas
in the area of remote sensing, see Tailor, Cross, Hogg and Mason (1986) and Ton, Sticklen
and Jain (1991). It is possible that such domain-specific knowledge could be used to
improve the automatic segmentations of the SAR and muscle fibres images, for example by
constraining boundaries to be straight in the SAR image and by looking only for convex
regions of specified size for the muscle fibres.
• The Hough transform (see, for example, Leavers, 1992) is a powerful technique for
finding straight lines, and other parametrized shapes, in images.
• Boundaries can be constrained to be smooth by employing roughness penalties such
as bending energies. The approach of varying a boundary until some such criterion is
optimized is known as the fitting of snakes (Kass, Witkin and Terzopoulos 1988).
• Models of expected shapes can be represented as templates and matched to images.
Either the templates can be rigid and the mapping can be flexible (for example, the
thin-plate spline of Bookstein, 1989), or the template itself can be flexible, as in the
approach of Amit, Grenander and Piccioni (1991).
• Images can be broken down into fundamental shapes, in a way analogous to the
decomposition of a sentence into individual words, using syntactic methods (Fu,
1974).
Detection of Discontinuities
Discontinuities in an image typically correspond to edges, which are significant changes in
intensity or color. Detecting these discontinuities is a fundamental step in many
segmentation techniques.
Point Detection
A point is the most basic type of discontinuity in a digital image. The most common
approach to finding discontinuities is to run an n × n mask over each point in the image. The
mask is as shown in figure 2.
A point is detected at the location (x, y) on which the mask is centered if the corresponding response satisfies |R| ≥ T, where R is the response of the mask at that point and T is a non-negative threshold value. This formulation measures the weighted difference between the center point and its neighbors, since the gray level of an isolated point will be very different from that of its neighbors. The result of the point-detection mask is shown in Figure 3.
Figure 3. (a) Gray-scale image with a nearly invisible isolated black point (b) Image
showing the detected point
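A minimal sketch in C of the point-detection test (the 3 × 3 mask weights shown are the commonly used Laplacian-type point detector, given here as an assumption since the original mask figure is not reproduced):
#include <stdlib.h>   /* for abs */
/* Returns 1 if an isolated point is detected at (x, y), i.e. if |R| >= T. */
int isIsolatedPoint (const unsigned char *img, int width, int x, int y, int T)
{
static const int w[3][3] = { { -1, -1, -1 },
                             { -1,  8, -1 },
                             { -1, -1, -1 } };
int r = 0, i, j;
for (i = -1; i <= 1; i++)
for (j = -1; j <= 1; j++)
r += w[i + 1][j + 1] * img[(y + i) * width + (x + j)];
return abs (r) >= T;
}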
Line Detection
Line detection is the next level of complexity in the direction of image discontinuity. For
any point in the image, a response can be calculated that will show which direction the point
of a line is most associated with. For line detection, we use the four 3 × 3 line-detector masks shown in Figure 4. If, at a certain point in the image, the magnitude of the response of mask i is greater than that of every other mask, then that point is more likely to be associated with a line in the direction of mask i.
Figure 4. Line detector masks in (a) horizontal direction (b) 45° direction (c) vertical direction (d) −45° direction. The greatest response among these masks yields the line direction associated with the given pixel. The result of the line-detection masks is shown in Figure 5.
Figure 5. (a) Original Image (b) result showing with horizontal detector (c) with 45° detector
(d) with vertical detector (e) with -45° detector
With the help of the line-detector masks, we can detect lines in a specified direction. For example, suppose we are interested in finding all the lines that are one pixel thick and oriented at −45°. For that, we take a digitized (binary) portion of a wire-bond mask for an electronic circuit. The results are shown in Figure 6.
Edge detection
Since isolated points and lines of unitary pixel thickness are infrequent in most
practical application, edge detection is the most common approach in gray level
discontinuity segmentation. An edge is a boundary between two regions having distinct
intensity levels. It is very useful for detecting discontinuities in an image where the image changes abruptly from dark to white or vice versa. The changes of intensity, and the first-order and second-order derivatives, are shown in Figure 7.
Figure 7. (a) Intensity profile (b) First-order derivatives (c) Second-order derivatives
Figure 8. (a) Original image (b) |Gx|, the component of the gradient along the x-direction (c) |Gy|, the component of the gradient along the y-direction (d) Gradient image |Gx| + |Gy|
There are several ways to calculate the image gradient:
The mask that finds horizontal edges is equivalent to the gradient in the vertical direction, and the mask that computes vertical edges is equivalent to the gradient in the horizontal direction. By passing these two masks over the intensity image, we can find the Gx and Gy components at different locations in the image, and so we can find the strength and direction of the edge at each particular location (x, y).
The Sobel operator gives an averaging effect over the image, which reduces the influence of spurious noise. It is often preferred over the Prewitt edge operator because this smoothing effect suppresses the spurious edges that are generated by noise present in the image.
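For reference (a standard reconstruction, since the document's mask figures are not reproduced), the 3 × 3 Sobel masks are
\[
G_x \text{ (responds to vertical edges)} = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix},
\qquad
G_y \text{ (responds to horizontal edges)} = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}
\]
and the gradient magnitude can be approximated by |Gx| + |Gy|, as in the caption of Figure 8 above.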
Second-order derivatives
It is positive at the darker side and negative at the white side. It is very sensitive to
noise present in an image. That’s why it is not used for edge detection. But, it is very useful
for extracting some secondary information i.e. we can find out whether the point lies on the
darker side or the white side.
Zero-crossing: It is useful to identify the exact location of an edge where there is a gradual transition of intensity from a dark to a bright region and vice versa. There are several second-order derivative operators.
Laplacian operator: applies a 3 × 3 Laplacian mask to the image.
It is not used for edge detection because it is very sensitive to noise and also leads to
double edge. But, it is very useful for extracting secondary information. To reduce the effect
of noise, first image will be smooth using the Gaussian operator and then it is operated by
Laplacian operator. These two operations together is called LoG (Laplacian of Gaussian)
operator.
LoG operator
Canny operator
It is an important method that finds edges by first isolating noise in the image, without affecting the features of the edges, then finding the edge strength and direction, and finally applying thresholding with a critical threshold value to detect the edges.
Module 1
1. Write the Basics and Application of computer graphics.
2. Explain Raster scan display with neat diagram.
3. Write short notes on the video controller and the display processor.
4. Explain Graphics workstations and viewing systems.
5. Explain Input devices, graphics network, graphics software.
6. Write a note on Introduction to OpenGL.
7. What are Coordinate reference frames?
8. What are Primitives and attributes in OpenGL?
9. Explain in detail Line drawing algorithms (DDA, Bresenham’s).
10. Explain in detail Circle generation algorithms (Bresenham’s).
Module 2
1. Write a short note on 2D geometric transformations.
2. Derive Matrix representation and homogeneous coordinates.
3. Derive Inverse transformations, 2D composite transformations.
4. Explain raster methods for geometric transformations.
5. What are OpenGL raster transformations?
6. Explain the 2D viewing pipeline.
7. Explain OpenGL 2D viewing functions.
8. Derive 3D geometric transformations.
9. Explain 3D translation, rotation, and scaling.
Module 3
1. Explain Logical Classification of Input Devices
2. Explain Input Functions for Graphical Data
3. What are interactive picture-construction techniques?
4. Explain Various Interactive Input Device Functions.
5. Explain how to create and manage Menu functions with OpenGL
6. Explain the fields involved in a graphical user interface.
7. Explain the stages involved in the design of an animation sequence.
8. Write a note on Traditional Animation Techniques
9. Write a note on General Computer Animation Functions and Computer Animation
Languages
10. Explain the steps involved in character animation.
11. Write a note on periodic motion.
12. Explain the procedures involved in OpenGL animation.
Module 4
1. Explain the following terms:
i) Adjacency
ii) Connectivity
iii) Gray level resolution
iv) Spatial resolution
2. Consider the two image subsets S1 and S2; for V = {1}, determine whether these two subsets are i) 4-adjacent, ii) 8-adjacent, or iii) m-adjacent.
3. Mention the applications of image processing.
4. Explain the importance of brightness adaption and discrimination in image processing.
5. Define 4- adjacency, 8-adjacency and m- adjacency.
6. Consider the image segment: i) let V = {0, 1}, compute the lengths of the shortest 4-, 8-, and m-paths between p and q; ii) repeat for V = {1, 2}.
Module 5
1. What is segmentation?
2. Write the applications of segmentation.
3. What are the three types of discontinuity in digital images?
4. How are the derivatives obtained in edge detection?
5. Write about linking edge points.
6. What are the two properties used for establishing the similarity of edge pixels?
7. Give the properties of the second derivative around an edge.
8. What is thresholding? Explain about global thresholding.
9. Explain about basic adaptive thresholding process used in image segmentation.
10. Explain in detail the threshold selection based on boundary characteristics.
11. Explain about region-based segmentation.
Content Beyond Syllabus
• To empower women with additional skills for their professional future career
• To enrich students with research blends in order to fulfil international challenges