RMK Group 21cs905 CV Unit 1
2
Please read this disclaimer before
proceeding:
This document is confidential and intended solely for the
educational purposes of RMK Group of Educational Institutions. If
you have received this document through email in error,
please notify the system manager. This document contains
proprietary information and is intended only for the respective
group or learning community. If you are not the
addressee, you should not disseminate, distribute, or copy it
through e-mail. Please notify the sender immediately by e-mail
if you have received this document by mistake, and delete this
document from your system. If you are not the intended recipient,
you are notified that disclosing, copying, distributing, or taking any
action in reliance on the contents of this information is strictly
prohibited.
3
21CS905
COMPUTER
VISION
Department:
ARTIFICIAL INTELLIGENCE AND DATA
SCIENCE
Batch/Year: BATCH 2021-25/IV
Created by:
Dr. Seethalakshmi V, Associate Professor, ADS, RMKCET
Date: 18-06-2024
4
Table of Contents
Sl. No.  Contents                                                              Page No.
1        Contents                                                              5
2        Course Objectives                                                     6
6        CO-PO/PSO Mapping                                                     14
7        Lecture Plan (S.No., Topic, No. of Periods, Proposed Date,
         Actual Lecture Date, Pertaining CO, Taxonomy Level,
         Mode of Delivery)                                                     18
18       Mini Project                                                          75
5
Course
Objectives
6
COURSE OBJECTIVES
7
PRE
REQUISITES
8
PRE REQUISITES
9
Syllabus
10
Syllabus
21CS905 COMPUTER VISION          L T P C: 3 0 0 3
OBJECTIVES:
To understand the fundamental concepts related to Image formation and processing.
To learn feature detection, matching and detection
To become familiar with feature based alignment and motion estimation
To develop skills on 3D reconstruction
To understand image based rendering and recognition
UNIT I INTRODUCTION TO IMAGE FORMATION AND PROCESSING 15
12
Course Outcomes
CO      Description                                                  Knowledge Level
        vision.
CO2     To implement basic and some advanced image processing
        techniques in OpenCV.                                        K3

Knowledge Levels:
K1 - Knowledge
K2 - Comprehension
K3 - Application
K4 - Analysis
K5 - Synthesis
K6 - Evaluation
13
CO – PO/PSO
Mapping
14
CO – PO/PSO Mapping Matrix

CO     PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
CO1    3 2 1 1 3
CO2    3 3 2 2 3
CO3    3 3 1 1 3
CO4    3 3 1 1 3
CO5    3 3 1 1 3
CO6    2 2 1 1 3
15
UNIT –I
INTRODUCTION TO
IMAGE FORMATION
AND PROCESSING
16
Lecture
Plan
17
Lecture Plan – Unit 1– INTRODUCTION TO
IMAGE FORMATION AND PROCESSING
Sl. No.  Topic                                      No. of Periods  Proposed Date  Actual Lecture Date  CO   Taxonomy Level  Mode of Delivery
1        Computer Vision                            1               19-07-2024                          CO1  K3              Blackboard / ICT Tools
2        Geometric primitives and transformations   1               20-07-2024                          CO1  K3              Blackboard / ICT Tools
3        Photometric image formation                1               22-07-2024                          CO1  K4              Blackboard / ICT Tools
4        The digital camera - Point operators       2               23-07-2024                          CO1  K4              Blackboard / ICT Tools
8        Geometric transformations                  1               27-07-2024                          CO1  K4              Blackboard / ICT Tools
9        Global optimization                        2               30-07-2024                          CO1  K4              Blackboard / ICT Tools
18
Activity Based
Learning
19
Activity Based Learning
Sl. No.  Contents
1        Transformations
20
Lecture Notes –
Unit 1
21
UNIT-1 INTRODUCTION TO
IMAGE FORMATION AND
PROCESSING
Sl. No.  Topic                        Page No.
1        Computer Vision              23
5        Point operators              29
6        Linear filtering             32
8        Fourier transforms           37
10       Geometric transformations    41
11       Global optimization          45
22
UNIT-1 INTRODUCTION TO IMAGE
FORMATION AND PROCESSING
Computer Vision - Geometric primitives and transformations - Photometric image formation -
The digital camera - Point operators - Linear filtering - More neighborhood operators –
Fourier transforms - Pyramids and wavelets - Geometric transformations - Global
optimization.
23
Computer vision applications are diverse and found in various fields, including
healthcare (medical image analysis), autonomous vehicles, surveillance,
augmented reality, robotics, industrial automation, and more. Advances in deep
learning, especially convolutional neural networks (CNNs), have significantly
contributed to the progress and success of computer vision tasks by enabling
efficient feature learning from large datasets.
24
1. Translation: Moves an object by a certain distance along a
specified direction.
2. Rotation: Rotates an object around a specified point or axis.
3. Scaling: Changes the size of an object along different axes.
4. Shearing: Distorts the shape of an object by stretching or
compressing along one or more axes.
5. Reflection: Mirrors an object across a specified plane.
6. Affine Transformations: Combine translation, rotation, scaling, and
shearing.
7. Projective Transformations: Used for perspective transformations
in 3D graphics
Applications:
• Computer Graphics: Geometric primitives and
transformations are fundamental for rendering 2D and
3D graphics in applications such as video games, simulations, and
virtual reality.
• Computer-Aided Design (CAD): Used for designing and modeling
objects in engineering and architecture.
• Computer Vision: Geometric transformations are applied to align
and process images, correct distortions, and perform other tasks
in image analysis.
• Robotics: Essential for robot navigation, motion planning, and
spatial reasoning.
Understanding geometric primitives and transformations is crucial
for creating realistic and visually appealing computer-generated
images, as well as for solving various problems in computer vision
and robotics.
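As an illustration (not part of the prescribed material), the 2D transformations listed above can be sketched with NumPy using 3x3 homogeneous matrices, so that translation, rotation, and scaling all compose by ordinary matrix multiplication:

```python
import numpy as np

def translation(tx, ty):
    """3x3 homogeneous matrix that moves points by (tx, ty)."""
    return np.array([[1, 0, tx],
                     [0, 1, ty],
                     [0, 0, 1]], dtype=float)

def rotation(theta):
    """3x3 homogeneous matrix that rotates points by theta radians about the origin."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0],
                     [s,  c, 0],
                     [0,  0, 1]], dtype=float)

def scaling(sx, sy):
    """3x3 homogeneous matrix that scales by sx and sy along the axes."""
    return np.array([[sx, 0, 0],
                     [0, sy, 0],
                     [0,  0, 1]], dtype=float)

def apply(matrix, point):
    """Apply a 3x3 homogeneous transform to a 2D point."""
    x, y, w = matrix @ np.array([point[0], point[1], 1.0])
    return (x / w, y / w)

# Composing transforms = multiplying matrices (the rightmost applies first):
# scale (1,0) -> (2,0), rotate 90 degrees -> (0,2), translate -> (5,2).
M = translation(5, 0) @ rotation(np.pi / 2) @ scaling(2, 2)
print(apply(M, (1, 0)))
```

Affine transformations are exactly the matrices of this form; projective transformations additionally allow the bottom row to differ from (0, 0, 1), which is why the division by w in `apply` matters there.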
25
1.3. Photometric image formation:
Photometric image formation refers to the process by which light
interacts with surfaces and is captured by a camera, resulting in the
creation of a digital image. This process involves various factors
related to the properties of light, the surfaces of objects, and the
characteristics of the imaging system. Understanding photometric
image formation is crucial in computer vision, computer graphics, and
image processing.
27
1.4 The digital camera:

Image Sensor: Digital cameras use image sensors (such as CCD or CMOS) to convert light
into electrical signals. The sensor captures the image by measuring the intensity of light at each
pixel location.
Lens: The lens focuses light onto the image sensor. Zoom lenses allow users to adjust the
focal length, providing optical zoom.
Aperture: The aperture is an adjustable opening in the lens that controls the amount of light
entering the camera. It affects the depth of field and exposure.
Shutter: The shutter mechanism controls the duration of light exposure to the image sensor.
Fast shutter speeds freeze motion, while slower speeds create motion blur.
Viewfinder and LCD Screen: Digital cameras typically have an optical or electronic
viewfinder for composing shots. LCD screens on the camera back allow users to view and frame
images.
Image Processor: Digital cameras include a built-in image processor to convert raw sensor
data into a viewable image. Image processing algorithms may enhance color and sharpness and
reduce noise.
Memory Card: Digital images are stored on removable memory cards, such as SD or CF
cards. Memory cards provide a convenient and portable way to store and transfer images.
Autofocus and Exposure Systems: Autofocus systems automatically adjust the lens to
ensure a sharp image. Exposure systems determine the optimal combination of aperture,
shutter speed, and ISO sensitivity for proper exposure.
White Balance: White balance settings adjust the color temperature of the captured image to
match different lighting conditions.
Modes and Settings: Digital cameras offer various shooting modes (e.g., automatic, manual,
portrait, landscape) and settings to control image parameters.
Connectivity: USB, HDMI, or wireless connectivity allows users to transfer images to
computers, share online, or connect to other devices.
Battery: Digital cameras are powered by rechargeable batteries, providing the necessary
energy for capturing and processing images.
28
1.5 Point operators:
Point operators, also known as point processing or pixel-wise
operations, are basic image processing operations that operate on
individual pixels independently. These operations are applied to each
pixel in an image without considering the values of neighboring pixels.
Point operators typically involve mathematical operations or functions
that transform the pixel values, resulting in changes to the image's
appearance. Here are some common point operators:
Brightness Adjustment:
- Addition/Subtraction: Increase or decrease the intensity of all pixels by
adding or subtracting a constant value.
- Multiplication/Division: Scale the intensity values by multiplying or
dividing them by a constant factor.
Contrast Adjustment:
- Linear Contrast Stretching: Rescale the intensity values to cover the full
dynamic range.
- Histogram Equalization: Adjust the distribution of pixel intensities to
enhance contrast.
Gamma Correction:
- Adjust the gamma value to control the overall brightness and contrast
of an image.
29
Thresholding:
- Convert a grayscale image to binary by setting a threshold value. Pixels
with values above the threshold become white, and those below become
black.
Bit-plane Slicing:
- Decompose an image into its binary representation by considering
individual bits.
Color Mapping:
- Apply color transformations to change the color balance or convert
between color spaces (e.g., RGB to grayscale).
Inversion:
- Invert the intensity values of pixels, turning bright areas dark and vice
versa.
Image Arithmetic:
- Perform arithmetic operations between pixels of two images, such as
addition, subtraction, multiplication, or division.
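The point operators above can be sketched in a few lines of NumPy (an illustrative sketch only; OpenCV provides equivalents such as cv2.add, cv2.LUT, and cv2.threshold):

```python
import numpy as np

def adjust_brightness(img, offset):
    """Add a constant to every pixel, clipping to the valid 0-255 range."""
    return np.clip(img.astype(int) + offset, 0, 255).astype(np.uint8)

def stretch_contrast(img):
    """Linearly rescale intensities to cover the full 0-255 dynamic range."""
    lo, hi = int(img.min()), int(img.max())
    if hi == lo:
        return img.copy()
    return ((img.astype(float) - lo) * 255.0 / (hi - lo)).astype(np.uint8)

def gamma_correct(img, gamma):
    """Gamma correction: out = 255 * (in / 255) ** gamma."""
    return (255.0 * (img / 255.0) ** gamma).astype(np.uint8)

def threshold(img, t):
    """Binarize: pixels above t become white (255), the rest black (0)."""
    return np.where(img > t, 255, 0).astype(np.uint8)

def invert(img):
    """Invert intensities: bright areas become dark and vice versa."""
    return 255 - img

img = np.array([[10, 100], [150, 240]], dtype=np.uint8)
print(adjust_brightness(img, 30))   # [[40 130] [180 255]]  (240+30 is clipped)
print(threshold(img, 120))          # [[  0   0] [255 255]]
```

Because every output pixel depends only on the corresponding input pixel, all of these run in a single vectorized pass over the image.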
30
Point operators are foundational in image processing and form the basis for
more complex operations. They are often used in combination to achieve
desired enhancements or modifications to images. These operations are
computationally efficient, as they can be applied independently to each pixel,
making them suitable for real-time applications and basic image
manipulation tasks. It's important to note that while point operators are
powerful for certain tasks, more advanced image processing techniques,
such as filtering and convolution, involve considering the values of
neighboring pixels and are applied to local image regions
31
1.6 Linear filtering:
Linear filtering is a fundamental concept in image processing that involves
applying a linear operator to an image. The linear filter operates on each
pixel in the image by combining its value with the values of its
neighboring pixels according to a predefined convolution kernel or matrix.
The convolution operation is a mathematical operation that computes the
weighted sum of pixel values in the image, producing a new value for the
center pixel.
The general formula for linear filtering (convolution) is

g(i, j) = Σ_{k,l} f(i − k, j − l) · h(k, l)

where f is the input image, h is the filter kernel, g is the filtered
output image, and the sum runs over all kernel positions (k, l).
32
Edge Detection:
- Sobel filter: Emphasizes edges by computing gradients in the x and y
directions.
- Prewitt filter: Similar to Sobel but uses a different kernel for gradient
computation.
Sharpening:
- Laplacian filter: Enhances high-frequency components to highlight edges.
- High-pass filter: Emphasizes details by subtracting a blurred version of the
image.
Embossing:
- Applies an embossing effect by highlighting changes in intensity.
33
Linear filtering is a versatile technique and forms the basis for more
advanced image processing operations. The convolution operation
can be efficiently implemented using convolutional neural networks
(CNNs) in deep learning, where filters are learned during the training
process to perform tasks such as image recognition, segmentation,
and denoising. The choice of filter kernel and parameters determines
the specific effect achieved through linear filtering.
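The convolution formula above can be implemented directly; the sketch below is illustrative and unoptimized (in practice cv2.filter2D or scipy.signal.convolve2d would be used), but it shows both a box-blur kernel and a Sobel kernel in action:

```python
import numpy as np

def convolve2d(image, kernel):
    """Direct 2D convolution with zero padding; output has the input's shape."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image.astype(float), ((ph, ph), (pw, pw)))
    flipped = kernel[::-1, ::-1]          # true convolution flips the kernel
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

# 3x3 box blur: every output pixel is the mean of its neighborhood.
box = np.ones((3, 3)) / 9.0

# Sobel kernel for horizontal gradients (responds strongly to vertical edges).
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

img = np.zeros((5, 5))
img[:, 2:] = 10.0                         # vertical step edge at column 2
edges = convolve2d(img, sobel_x)
print(edges[2])                           # large magnitudes along the edge columns
```

The choice of kernel alone turns the same convolution machinery into a blur, a sharpener, or an edge detector.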
34
1.7 More neighborhood operators :
Neighborhood operators in image processing involve the consideration of
pixel values in the vicinity of a target pixel, usually within a defined
neighborhood or window. Unlike point operators that operate on
individual pixels, neighborhood operators take into account the local
structure of the image. Here are some common neighborhood operators:
Median Filter:
- Computes the median value of pixel intensities within a local neighborhood.
- Effective for removing salt-and-pepper noise while preserving edges.
Gaussian Filter:
- Applies a weighted average to pixel values using a Gaussian distribution.
- Used for blurring and smoothing, with the advantage of preserving edges.
Bilateral Filter:
- Combines spatial and intensity information to smooth images while preserving
edges.
- Uses two Gaussian distributions, one for spatial proximity and one for
intensity similarity.
Non-local Means Filter:
- Computes the weighted average of pixel values based on similarity in a larger
non-local neighborhood.
- Effective for denoising while preserving fine structures.
Anisotropic Diffusion:
- Reduces noise while preserving edges by iteratively diffusing intensity values
along edges.
- Particularly useful for images with strong edges.
Morphological Operators:
- Dilation: Expands bright regions by considering the maximum pixel value in a
neighborhood.
- Erosion: Contracts bright regions by considering the minimum pixel value in a
neighborhood.
- Used for operations like noise reduction, object segmentation, and shape
analysis.
Laplacian of Gaussian (LoG):
- Applies a Gaussian smoothing followed by the Laplacian operator.
- Useful for edge detection.
Canny Edge Detector:
- Combines Gaussian smoothing, gradient computation, non-maximum
suppression, and edge tracking by hysteresis.
- Widely used for edge detection in computer vision applications.
Homomorphic Filtering:
- Adjusts image intensity by separating the image into illumination and
reflectance components.
- Useful for enhancing images with non-uniform illumination.
36
Adaptive Histogram Equalization:
- Improves contrast by adjusting the histogram of pixel intensities
based on local neighborhoods.
- Effective for enhancing images with varying illumination.
These neighborhood operators play a crucial role in image
enhancement, denoising, edge detection, and other image processing
tasks. The choice of operator depends on the specific characteristics of
the image and the desired outcome.
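As a small illustration of a neighborhood operator, the median filter described above can be written directly in NumPy (cv2.medianBlur is the practical equivalent); note how it removes isolated salt-and-pepper outliers:

```python
import numpy as np

def median_filter(image, size=3):
    """Replace each pixel with the median of its size x size neighborhood.
    Borders are handled by replicating the edge pixels."""
    pad = size // 2
    padded = np.pad(image, pad, mode='edge')
    out = np.empty_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.median(padded[i:i + size, j:j + size])
    return out

# A flat gray image corrupted with one "salt" and one "pepper" pixel.
img = np.full((5, 5), 100, dtype=np.uint8)
img[1, 1] = 255   # salt noise
img[3, 3] = 0     # pepper noise
clean = median_filter(img)
print(clean[1, 1], clean[3, 3])  # both restored to 100
```

Unlike a mean (box) filter, the median never averages the outlier into its neighbors, which is why it preserves edges while removing impulse noise.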
37
1.8 Fourier transforms:

Texture Analysis:
- Fourier analysis is useful in characterizing and classifying textures based on
their frequency characteristics. It helps distinguish between textures with
different patterns.
Pattern Recognition:
- Fourier descriptors, which capture shape information, are used for
representing and recognizing objects in images. They provide a compact
representation of shape by capturing the dominant frequency components.
Image Compression:
- Transform-based image compression, such as JPEG compression, utilizes
Fourier transforms to transform image data into the frequency domain. This
allows for efficient quantization and coding of frequency
components.
Image Registration:
- Fourier transforms are used in image registration, aligning images or
transforming them to a common coordinate system. Cross-correlation in the
frequency domain is often employed for this purpose.
Optical Character Recognition (OCR):
- Fourier descriptors are used in OCR systems for character recognition. They
help in capturing the shape information of characters, making the recognition
process more robust.
Homomorphic Filtering:
- Homomorphic filtering, which involves transforming an image to a
logarithmic domain using Fourier transforms, is used in applications such as
document analysis and enhancement.
38
Image Reconstruction:
- Fourier transforms are involved in techniques like computed
tomography (CT) and magnetic resonance imaging (MRI), where images
are reconstructed from frequency-domain or projection measurements.
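A short NumPy sketch (illustrative only) of frequency-domain analysis: a pure cosine grating produces exactly two symmetric peaks in the magnitude spectrum, at an offset from the center equal to the grating frequency:

```python
import numpy as np

# Build a simple 64x64 image containing one horizontal cosine grating.
h, w = 64, 64
x = np.arange(w)
freq = 8  # cycles across the image width
img = np.cos(2 * np.pi * freq * x / w)[np.newaxis, :].repeat(h, axis=0)

# 2D FFT; fftshift moves the zero-frequency (DC) component to the center.
spectrum = np.fft.fftshift(np.fft.fft2(img))
magnitude = np.abs(spectrum)

# The grating appears as two symmetric peaks on the horizontal axis,
# offset from the center column (w // 2 = 32) by exactly `freq` bins.
peaks = np.argwhere(magnitude > magnitude.max() / 2)
print(peaks)  # [[32 24] [32 40]]  ->  center row, columns 32 - 8 and 32 + 8
```

This is the basic observation behind texture analysis and frequency-domain filtering: periodic structure in the spatial domain concentrates into isolated peaks in the frequency domain.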
39
1.9 Pyramids and wavelets:

Laplacian Pyramid:
- Derived from the Gaussian pyramid.
- Each level of the Laplacian pyramid is obtained by subtracting the
expanded version of the higher level Gaussian pyramid from the original
image.
- Useful for image compression and coding, where the Laplacian pyramid
represents the residual information not captured by the Gaussian pyramid.
Image pyramids are especially useful for creating multi-scale
representations of images, which can be beneficial for various
computer vision tasks.
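The Laplacian pyramid construction described above can be sketched with NumPy. For simplicity this illustration uses 2x2 averaging in place of proper Gaussian smoothing (an assumption made purely to keep the code short), which also makes the decomposition exactly invertible:

```python
import numpy as np

def downsample(img):
    """One pyramid level: 2x2 average (a crude smoothing) then subsample."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] +
            img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def upsample(img, shape):
    """Expand back to `shape` by pixel replication (nearest-neighbour)."""
    up = img.repeat(2, axis=0).repeat(2, axis=1)
    return up[:shape[0], :shape[1]]

def laplacian_pyramid(img, levels):
    """Each level stores the detail lost by one smooth-and-subsample step."""
    pyramid = []
    current = img.astype(float)
    for _ in range(levels):
        smaller = downsample(current)
        pyramid.append(current - upsample(smaller, current.shape))
        current = smaller
    pyramid.append(current)   # coarsest level: the residual base image
    return pyramid

def reconstruct(pyramid):
    """Invert the pyramid: upsample the base and add back each detail level."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = upsample(current, detail.shape) + detail
    return current

img = np.arange(64, dtype=float).reshape(8, 8)
pyr = laplacian_pyramid(img, 3)
print(np.allclose(reconstruct(pyr), img))  # True: the decomposition is lossless
```

The lossless round trip is exactly the property that makes the Laplacian pyramid useful for compression: the detail levels are mostly near zero and compress well, yet the original image is fully recoverable.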
Wavelets:
Wavelets are mathematical functions that can be used to analyze signals
and images. Wavelet transforms provide a multi-resolution analysis by
decomposing an image into approximation (low-frequency) and detail
(high-frequency) components.
Key concepts include:
Wavelet Transform: The wavelet transform decomposes an image into
different frequency components by convolving the image with wavelet
functions. The result is a set of coefficients that represent the image at
various scales and orientations.
Multi-resolution Analysis: Wavelet transforms offer a multi-resolution
analysis, allowing the representation of an image at different scales. The
approximation coefficients capture the low-frequency information, while
detail coefficients capture high-frequency information.
Haar Wavelet: The Haar wavelet is a simple wavelet function used in
basic wavelet transforms. It represents changes in intensity between
adjacent pixels.
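One level of the Haar decomposition described above can be written directly on 2x2 pixel blocks (an illustrative sketch; library implementations such as PyWavelets use a different coefficient normalization):

```python
import numpy as np

def haar2d_level(img):
    """One level of the 2D Haar transform: returns the low-frequency
    approximation (LL) and three high-frequency detail bands (LH, HL, HH)."""
    img = img.astype(float)
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 4.0   # average: coarse approximation
    lh = (a + b - c - d) / 4.0   # detail in the vertical direction
    hl = (a - b + c - d) / 4.0   # detail in the horizontal direction
    hh = (a - b - c + d) / 4.0   # diagonal detail
    return ll, lh, hl, hh

# A flat image has no detail: everything lands in the approximation band.
flat = np.full((4, 4), 8.0)
ll, lh, hl, hh = haar2d_level(flat)
print(ll)                                                 # all 8s
print(np.count_nonzero(lh) + np.count_nonzero(hl) + np.count_nonzero(hh))  # 0
```

Applying the same step recursively to the LL band yields the multi-resolution decomposition used in wavelet compression and denoising.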
40
Wavelet Compression:
- Wavelet-based image compression techniques, such as JPEG2000, utilize
wavelet transforms to efficiently represent image data in both spatial and
frequency domains.
Image Denoising:
- Wavelet-based thresholding techniques can be applied to denoise images by
thresholding the wavelet coefficients.
Edge Detection:
- Wavelet transforms can be used for edge detection by analyzing the high-
frequency components of the image.
Both pyramids and wavelets offer advantages in multi-resolution analysis, but
they differ in terms of their representation and construction. Pyramids use a
hierarchical structure of smoothed and subsampled images, while wavelets
use a transform-based approach that decomposes the image into frequency
components. The choice between pyramids and wavelets often depends on
the specific requirements of the image processing task at hand.
41
1.10 Geometric transformations:

1. Translation:
● Applications: Object movement, image registration.
2. Rotation:
● Description: Rotates an object by a specified angle about a fixed point.
● Transformation Matrix (2D):
42
4. Shearing:
● Description: Distorts the shape of an object by varying its coordinates
linearly.
● Transformation Matrix (2D):
43
● Applications: 3D rendering, simulation.

7. Projective Transformation:
● Description: Generalization of perspective transformation with
additional control points.
● Transformation Matrix (3D): More complex than the perspective
transformation matrix.
● Applications: Computer graphics, augmented reality.
44
1.11 GLOBAL OPTIMIZATION:
Global optimization is a branch of optimization that focuses on finding
the global minimum or maximum of a function over its entire feasible
domain. Unlike local optimization, which aims to find the optimal
solution within a specific region, global optimization seeks the best
possible solution across the entire search space. Global optimization
problems are often challenging due to the presence of multiple local
optima or complex, non-convex search spaces.
Here are key concepts and approaches related to global optimization:
Concepts:
Objective Function:
- The function to be minimized or maximized.
Feasible Domain:
- The set of input values (parameters) for which the objective function
is defined.
Global Minimum/Maximum:
- The lowest or highest value of the objective function over the entire
feasible domain.
Local Minimum/Maximum:
●A minimum or maximum within a specific region of the feasible
domain.
Approaches:
Grid Search: Dividing the feasible domain into a grid and evaluating the
objective function at each grid point to find the optimal solution.
Random Search: Randomly sampling points in the feasible domain and
evaluating the objective function to explore different regions.
45
Evolutionary Algorithms: Genetic algorithms, particle swarm optimization,
and other evolutionary techniques use populations of solutions and genetic
operators to iteratively evolve toward the optimal solution.
Simulated Annealing: Inspired by the annealing process in metallurgy,
simulated annealing gradually decreases the temperature to allow the algorithm
to escape local optima.
Ant Colony Optimization: Inspired by the foraging behavior of ants, this
algorithm uses pheromone trails to guide the search for the optimal solution.
Genetic Algorithms: Inspired by biological evolution, genetic algorithms use
mutation, crossover, and selection to evolve a population of potential solutions.
Particle Swarm Optimization: Simulates the social behavior of birds or fish,
where a swarm of particles moves through the search space to find the optimal
solution.
Bayesian Optimization: Utilizes probabilistic models to model the objective
function and guide the search toward promising regions.
Quasi-Newton Methods: Iterative optimization methods that use an
approximation of the Hessian matrix to find the optimal solution efficiently.
Global optimization is applied in various fields, including engineering design,
machine learning, finance, and parameter tuning in algorithmic optimization.
The choice of a specific global optimization method depends on the
characteristics of the objective function, the dimensionality of the search space,
and the available computational resources.
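As an illustration of one of these approaches, here is a minimal simulated annealing sketch on a one-dimensional objective with many local minima. The step size, cooling rate, and iteration count are arbitrary choices made for illustration, not tuned values:

```python
import math
import random

def f(x):
    """Objective with many local minima; the global minimum is at x = 0."""
    return x * x + 10 * math.sin(x) ** 2

def simulated_annealing(f, x0, temp=10.0, cooling=0.995, steps=5000, seed=0):
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    for _ in range(steps):
        candidate = x + rng.gauss(0, 1)          # random neighbour
        fc = f(candidate)
        # Always accept improvements; accept worse moves with probability
        # exp(-delta / temp), which shrinks as the temperature cools.
        if fc < fx or rng.random() < math.exp(-(fc - fx) / temp):
            x, fx = candidate, fc
        if fx < best_f:
            best_x, best_f = x, fx
        temp *= cooling                          # geometric cooling schedule
    return best_x, best_f

best_x, best_f = simulated_annealing(f, x0=8.0)
print(best_x, best_f)  # typically lands near the global minimum at x = 0
```

The high initial temperature lets the search accept uphill moves and hop between basins; as the temperature cools, the process settles into (ideally) the deepest basin it has visited.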
46
ACTIVITY:
Activity 1: Implementing and Visualizing the Fourier Transform
Objective: Understand and implement the Fourier Transform on an
image and visualize its frequency components.
Reference:
•https://Fourier Transform in Image Processing.co.uk/#!/worksheets
47
Video Links
Unit – 1
48
Video Links
Sl. No.  Topic                                         Video Link
1        Fourier Transform in Image Processing         https://www.youtube.com/watch?v=spUNpyF58BY
2        "Computer Vision Basics" by OpenCV            CAP5415 Fall 2014 - YouTube
3        Image Processing in Python                    Sockets with Python 3 - YouTube
4        Digital Image Processing                      https://www.youtube.com/playlist?list=PL6ZV-fpYwBx0aJDM7-Y3iC5Bz4rOUJpYQ
5        Histogram Equalization in Image Enhancement   https://www.youtube.com/watch?v=CmQz0Txa1bw
49
Assignments
Unit - I
50
Assignment Questions
Assignment Questions – Very Easy

Q. No.  ASSIGNMENT QUESTIONS                                     Marks  Knowledge Level  CO
1       Define computer vision and explain the                   5      K3               CO1
2       What is the purpose of histogram equalization
        in image processing?                                     5      K3               CO1
3       Perform histogram equalization on a given image
        and display the original and enhanced images
        side by side.
        Topic: Histogram Processing, Image Enhancement           5      K3               CO1
51
Assignment Questions
Assignment Questions – Medium

Q. No.  ASSIGNMENT QUESTIONS                                     Marks  Knowledge Level  CO
1       Apply the Fast Fourier Transform (FFT) to                5      K3               CO1
53
Course Outcomes:
CO1: Explain low level processing of image and transformation techniques applied
to images.
*Allotment of Marks

Correctness of the Content  Presentation  Timely Submission  Total (Marks)
15                          -             5                  20
54
Part A – Questions
& Answers
Unit – I
55
Part A - Questions & Answers
1. What is computer vision?[K2, CO1]
Low-level tasks involve basic image processing operations like noise reduction and
edge detection, while high-level tasks involve interpreting and understanding the
content of images, such as object recognition and scene understanding.
56
7. What is an affine transformation? [K3, CO1]
The Fourier Transform converts an image from the spatial domain to the
frequency domain, allowing for analysis and manipulation of its frequency
components.
57
11. What is convolution in image processing?[K2, CO1]
58
16. What is inverse filtering in image restoration?[K2, CO1]
Inverse filtering attempts to recover the original image by reversing the
degradation function, often used to counteract blurring.
17. What is a histogram in the context of image
processing?[K3, CO1]
A histogram represents the distribution of pixel intensity values in an
image, showing the frequency of each intensity level.
18. Why is histogram equalization important in medical
imaging?[K3, CO1]
Histogram equalization enhances the contrast of medical images,
making important details more visible for better diagnosis.
19. What is edge detection? [K3, CO1]
Edge detection is a low-level image processing technique to identify and
locate sharp discontinuities in intensity, often corresponding to object
boundaries.
20. Name two popular edge detection algorithms. [K3, CO1]
The Sobel and Canny edge detection algorithms are widely used for
detecting edges in images.
59
21. What is image segmentation? [K2, CO1]
Image segmentation is a mid-level task that divides an image into
meaningful regions or segments, typically to isolate objects or areas of
interest.
22. How does the k-means clustering algorithm work in image
segmentation? [K3 ,CO1]
K-means clustering partitions the image into k segments by assigning pixels
to the nearest cluster center, iteratively updating cluster centers based on
pixel values.
23. What is object recognition? [K3, CO1]
Object recognition is a high-level computer vision task that identifies and
classifies objects within an image based on learned features.
24. How does a convolutional neural network (CNN) aid in object
recognition?[K3, CO1]
CNNs automatically learn hierarchical features from images through
multiple layers of convolutions, pooling, and fully connected layers,
effectively recognizing objects.
25. What is the role of the lens in a camera?[K3, CO1]
The lens focuses light onto the image sensor, forming a clear image by
converging or diverging light rays.
60
26. What causes chromatic aberration in a camera lens? [K3,
CO1]
Chromatic aberration occurs when different wavelengths of light are refracted
by different amounts, causing color fringing and blurring.
27. Explain the concept of a homography matrix.[K3, CO1]
A homography matrix is a 3x3 transformation matrix that maps points from
one plane to another in projective space, used in applications like image
stitching.
28. What is the main application of affine transformations in image
processing? [K3, CO1]
Affine transformations are used for geometric corrections, such as scaling,
rotating, and skewing images, while preserving collinearity and parallelism.
29. What is a median filter used for in image processing? [K3, CO1]
A median filter is used to reduce noise in an image, particularly salt-and-
pepper noise, by replacing each pixel value with the median of neighboring
pixel values.
30. Describe the purpose of a Laplacian filter.[K3, CO1]
A Laplacian filter detects edges by highlighting regions of rapid intensity
change, using a second-order derivative of the image.
61
Part B – Questions
Unit – I
62
Part B Questions
Q. No.  Questions                                    K Level  CO Mapping
63
7 Describe the process of image restoration. K4 CO1
Discuss different degradation models and
restoration techniques, such as Wiener
filtering and blind deconvolution, providing
examples of their applications.
64
Supportive online
Certification
courses (NPTEL,
Swayam, Coursera,
Udemy, etc.,)
65
Supportive Online Certification
Courses
Coursera – Introduction to Computer
Vision
• Description:
This course provides an overview of computer
vision, including image processing, feature
extraction, and object recognition.
• Offered by:
Georgia Tech
https://www.coursera.org/learn/introduction-computer-vision
• NPTEL:Computer Vision
• Computer Vision - Course (nptel.ac.in)
Udemy:
• Computer Vision
66
Real time
Applications in day
to day life and to
Industry
67
Real time Applications
1. Autonomous Vehicles
Topic: Lane Detection, Obstacle Detection, Traffic Sign Recognition
Description: Computer vision systems are crucial for autonomous vehicles to
navigate safely. They detect lanes, obstacles, and traffic signs in real-time,
enabling the vehicle to make driving decisions.
2. Facial Recognition
Topic: Security and Surveillance, Access Control
Description: Real-time facial recognition systems are used in security and
surveillance to identify individuals. They are also used in access control systems
to authenticate users.
3. Medical Imaging
Topic: Real-time Diagnostics, Surgery Assistance
Description: Computer vision assists in medical imaging by providing real-time
analysis for diagnostics and during surgeries. Applications include real-time MRI
analysis and guiding surgical robots.
4. Augmented Reality (AR)
Topic: Interactive Gaming, Virtual Try-Ons
Description: AR applications use real-time computer vision to overlay digital
content onto the real world. Examples include interactive gaming (e.g.,
Pokémon GO) and virtual try-ons for glasses or clothes.
68
Content Beyond
Syllabus
69
Medical Image Analysis
Deep learning has revolutionized medical diagnostics by
enhancing the analysis of medical images such as MRIs, CT
scans, and X-rays. By leveraging sophisticated algorithms,
deep learning models can extract intricate patterns and
features from these images, aiding in precise diagnosis and
treatment planning.
One prominent application is in radiomics, where deep
learning algorithms extract quantitative features from
radiographic images. These features can include texture,
shape, and intensity characteristics that are not visible to the
human eye. By analyzing these features, clinicians can
potentially predict disease progression, treatment response,
and patient outcomes with greater accuracy.
Case studies have demonstrated the efficacy of deep learning
in various medical specialties. For instance, in oncology, AI
models can identify subtle changes in tumor characteristics
that may indicate response to therapy or recurrence. In
neurology, these models assist in the early detection of
neurological disorders by analyzing brain scans. In cardiology,
AI helps in assessing cardiac function and detecting anomalies
in heart images.
Overall, the integration of deep learning into medical image
analysis holds promise for advancing precision medicine,
improving diagnostic accuracy, and ultimately enhancing
patient care across diverse medical fields.
70
Assessment
Schedule
(Proposed Date &
Actual Date)
71
Assessment Schedule
(Proposed Date & Actual Date)
Sl. No.  ASSESSMENT                    Proposed Date  Actual Date
1        FIRST INTERNAL ASSESSMENT
2        SECOND INTERNAL ASSESSMENT
3        MODEL EXAMINATION
4        END SEMESTER EXAMINATION
72
Prescribed Text
Books & Reference
73
Prescribed Text Books &
Reference
TEXT BOOKS:
1. D. A. Forsyth, J. Ponce, "Computer Vision: A Modern
Approach", Pearson Education, 2003.
2. Richard Szeliski, "Computer Vision: Algorithms and
Applications", Springer-Verlag London Limited, 2011.
REFERENCES:
1. B. K. P. Horn, "Robot Vision", McGraw-Hill.
2. Simon J. D. Prince, "Computer Vision: Models, Learning,
and Inference", Cambridge University Press, 2012.
3. Mark Nixon and Alberto S. Aguado, "Feature Extraction &
Image Processing for Computer Vision", Third Edition,
Academic Press, 2012.
4. E. R. Davies, "Computer & Machine Vision", Fourth Edition,
Academic Press, 2012.
5. Reinhard Klette, "Concise Computer Vision: An Introduction
into Theory and Algorithms", Springer, 2014.
74
Mini Project
Suggestions
75
Mini Project Suggestions
[1] Very Hard
Develop a deep learning model (like YOLO or Faster R-CNN) for real-
[2] Hard
images.
[3] Medium
[4] Easy
76
Thank you
Disclaimer:
77