Lecture W4ab
Daud Abdullah
Computer Vision, Week 4
Feb 2023
Agenda
2.1
Primitives and Transformations
I Geometric primitives are the basic building blocks used to describe 3D shapes
I In this unit, we introduce points, lines and planes
I Furthermore, the most basic transformations are discussed
I This unit covers the topics of Chapter 2.1 of the Szeliski book
I A more exhaustive introduction can be found in the book:
Hartley and Zisserman: Multiple View Geometry in Computer Vision
2D Points
2D points can be written in inhomogeneous coordinates as
\[ \mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix} \in \mathbb{R}^2 \]
or in homogeneous coordinates as
\[ \tilde{\mathbf{x}} = \begin{pmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{w} \end{pmatrix} \in \mathbb{P}^2 \]
Remark: Homogeneous vectors that differ only by scale are considered equivalent and
define an equivalence class. ⇒ Homogeneous vectors are defined only up to scale.
An inhomogeneous vector x is converted to a homogeneous vector x̃ as follows
\[ \tilde{\mathbf{x}} = \begin{pmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{w} \end{pmatrix} = \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \bar{\mathbf{x}} \]
with augmented vector x̄. To convert in the opposite direction we divide by w̃:
\[ \bar{\mathbf{x}} = \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \frac{1}{\tilde{w}}\, \tilde{\mathbf{x}} = \frac{1}{\tilde{w}} \begin{pmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{w} \end{pmatrix} = \begin{pmatrix} \tilde{x}/\tilde{w} \\ \tilde{y}/\tilde{w} \\ 1 \end{pmatrix} \]
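To make these conversions concrete, here is a minimal NumPy sketch (an illustration added to these notes, not from the original slides):

```python
import numpy as np

def to_homogeneous(x):
    """Append w=1 to an inhomogeneous point (the augmented vector x_bar)."""
    return np.append(x, 1.0)

def to_inhomogeneous(x_tilde):
    """Divide by the last coordinate w; assumes w != 0 (i.e., not an ideal point)."""
    return x_tilde[:-1] / x_tilde[-1]

x = np.array([2.0, 3.0])
x_bar = to_homogeneous(x)                             # [2. 3. 1.]
assert np.allclose(to_inhomogeneous(5.0 * x_bar), x)  # scale does not matter
```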
2D Lines
A 2D line can be expressed in homogeneous coordinates as l̃ = (a, b, c)> , with the line given by the set {x̄ | l̃> x̄ = 0}.
We can normalize l̃ so that l̃ = (nx , ny , d)> = (n, d)> with ‖n‖2 = 1. In this case, n is
the normal vector perpendicular to the line and d is its distance to the origin.
An exception is the line at infinity l̃∞ = (0, 0, 1)> which passes through all ideal points.
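A short added sketch of this normalization (illustrative values, not from the slides):

```python
import numpy as np

def normalize_line(l):
    """Scale l = (a, b, c) so the normal (a, b) has unit length.
    Then n = l[:2] is the unit normal, d = l[2] the distance to the origin."""
    norm = np.linalg.norm(l[:2])
    assert norm > 0, "the line at infinity (0, 0, c) cannot be normalized"
    return l / norm

l = normalize_line(np.array([3.0, 4.0, 10.0]))  # line 3x + 4y + 10 = 0
n, d = l[:2], l[2]   # n = [0.6, 0.8], d = 2.0: origin lies at distance 2
```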
Cross Product
The cross product of two vectors a, b ∈ R3 can be written as a matrix-vector product using the skew-symmetric matrix [a]× :
\[ \mathbf{a} \times \mathbf{b} = [\mathbf{a}]_\times \mathbf{b} = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix} \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} = \begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix} \]
Remark: In this course, we use square brackets to distinguish matrices from vectors.
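A small numeric sketch of the skew-symmetric form (added here; the line-intersection use at the end is a standard result from Hartley and Zisserman, not spelled out in the surviving slide text):

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix [a]_x such that skew(a) @ b == np.cross(a, b)."""
    return np.array([[  0.0, -a[2],  a[1]],
                     [ a[2],   0.0, -a[0]],
                     [-a[1],  a[0],   0.0]])

a, b = np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])
assert np.allclose(skew(a) @ b, np.cross(a, b))

# Classic application: two 2D lines intersect at the homogeneous point l1 x l2.
l1 = np.array([1.0, 0.0, -1.0])   # line x = 1
l2 = np.array([0.0, 1.0, -2.0])   # line y = 2
print(np.cross(l1, l2))           # [1. 2. 1.] -> intersection point (1, 2)
```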
2D Conics
More complex algebraic objects can be represented using polynomial homogeneous
equations. For example, conic sections (arising as the intersection of a plane and a
3D cone) can be written using quadric equations:
\[ \{\bar{\mathbf{x}} \mid \bar{\mathbf{x}}^\top \mathbf{Q}\, \bar{\mathbf{x}} = 0\} \]
Useful for multi-view geometry and camera calibration, see Hartley and Zisserman.
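For instance (an added sketch, not from the slides), the unit circle x² + y² = 1 corresponds to Q = diag(1, 1, −1):

```python
import numpy as np

Q = np.diag([1.0, 1.0, -1.0])        # unit circle: x^2 + y^2 - 1 = 0

def on_conic(x_bar, Q, eps=1e-9):
    """Check whether a homogeneous point satisfies x_bar^T Q x_bar = 0."""
    return abs(x_bar @ Q @ x_bar) < eps

assert on_conic(np.array([1.0, 0.0, 1.0]), Q)      # (1, 0) lies on the circle
assert not on_conic(np.array([2.0, 0.0, 1.0]), Q)  # (2, 0) does not
```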
3D Points
3D points can be written in inhomogeneous coordinates as
\[ \mathbf{x} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} \in \mathbb{R}^3 \]
or in homogeneous coordinates as
\[ \tilde{\mathbf{x}} = \begin{pmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{z} \\ \tilde{w} \end{pmatrix} \in \mathbb{P}^3 \]
3D Quadrics
The 3D analog of a 2D conic is a quadric surface:
\[ \{\bar{\mathbf{x}} \mid \bar{\mathbf{x}}^\top \mathbf{Q}\, \bar{\mathbf{x}} = 0\} \]
Quadrics are useful in the study of multi-view geometry and also serve as useful
modeling primitives (spheres, ellipsoids, cylinders), see Hartley and Zisserman, Chapter 2 for details.
Superquadrics Revisited
Paschalidou, Ulusoy and Geiger: Superquadrics Revisited: Learning 3D Shape Parsing beyond Cuboids. CVPR, 2019.
2D Transformations
Origins of the Pinhole Camera
Camera Obscura: 4th Century BC
https://www.abelardomorell.net/camera-obscura
[Figure: physical camera model vs. mathematical camera model, each showing light rays, the focal point, the image plane, and the camera coordinate system.]
Projection Models
[Examples: Opto Engineering Telecentric Lens, Canon 800mm Telephoto Lens, Nikon AF-S Nikkor 50mm, Sony DSC-RX100 V, Samsung Galaxy S20]
I Orthographic and perspective are the two most important projections, see Szeliski Ch. 2.1.4 for others
Orthographic Projection
[Figure: orthographic projection; parallel light rays travel from the camera coordinate system past the camera center to the image plane with its image coordinate system.]
An orthographic projection simply drops the z component of the 3D point in camera coordinates:
\[ \mathbf{x}_s = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \mathbf{x}_c \]
Orthography is exact for telecentric lenses and an approximation for telephoto lenses.
After projection, the distance of the 3D point from the image can't be recovered.
Scaled Orthographic Projection
Scaled orthography first projects orthographically and then scales isotropically:
\[ \mathbf{x}_s = s \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \mathbf{x}_c \]
Remark: The unit for s is px/m or px/mm to convert metric 3D points into pixels.
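A minimal sketch of both projections (an added illustration; the scale s follows the remark above):

```python
import numpy as np

def orthographic(x_c):
    """Drop the z component of a 3D point in camera coordinates."""
    return x_c[:2]

def scaled_orthographic(x_c, s):
    """Orthographic projection followed by isotropic scaling s (px per meter)."""
    return s * x_c[:2]

x_c = np.array([0.2, 0.1, 5.0])            # meters, in camera coordinates
print(scaled_orthographic(x_c, s=1000.0))  # [200. 100.] px; depth z=5 is lost
```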
Perspective Projection
[Figure: perspective projection; a light ray passes from the camera coordinate system through the camera center along the principal axis, at focal length f, onto the image plane with its image coordinate system.]
In its simplest form, perspective projection divides by the depth zc :
\[ x_s = f\, \frac{x_c}{z_c} \qquad y_s = f\, \frac{y_c}{z_c} \]
Note that this projection is linear when using homogeneous coordinates. After the
projection it is not possible to recover the distance of the 3D point from the image.
Remark: The unit for f is px (=pixels) to convert metric 3D points into pixels.
Perspective Projection
[Figure: perspective projection without (left) and with (right) principal point offset; labels: light rays, focal point, principal point, principal axis, image plane, image coordinate system.]
I To ensure positive pixel coordinates, a principal point offset c is usually added
I This moves the image coordinate system to the corner of the image plane
The complete perspective projection model is given by:
\[ \begin{pmatrix} x_s \\ y_s \end{pmatrix} = \begin{pmatrix} f_x\, x_c/z_c + s\, y_c/z_c + c_x \\ f_y\, y_c/z_c + c_y \end{pmatrix} \quad\Leftrightarrow\quad \tilde{\mathbf{x}}_s = \begin{bmatrix} f_x & s & c_x & 0 \\ 0 & f_y & c_y & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \bar{\mathbf{x}}_c \]
Let K be the calibration matrix (intrinsics) and [R|t] the camera pose (extrinsics).
We chain both transformations to project a point in world coordinates to the image:
" #
h i h i R t h i
x̃s = K 0 x̄c = K 0 x̄ w = K R t x̄w = P x̄w
0> 1
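Putting intrinsics and extrinsics together, a compact sketch of the full projection (the values of K, R, t are illustrative, not from the slides):

```python
import numpy as np

# Intrinsics: focal lengths fx, fy, zero skew, principal point (cx, cy), in px.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                        # extrinsics: world-to-camera rotation ...
t = np.array([0.0, 0.0, 2.0])        # ... and translation

P = K @ np.hstack([R, t[:, None]])   # 3x4 projection matrix P = K [R|t]

x_w = np.array([0.5, -0.25, 3.0, 1.0])   # world point, homogeneous coordinates
x_s = P @ x_w
x_s = x_s[:2] / x_s[2]               # divide by w to obtain pixel coordinates
print(x_s)                           # [370. 215.]
```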
Lens Distortion
Images can be undistorted such that the perspective projection model applies.
More complex distortion models must be used for wide-angle lenses (e.g., fisheye).
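The distortion equations are not reproduced on these slides; as a hedged sketch, a common polynomial radial model (with made-up coefficients κ1, κ2) displaces normalized image points as a function of their radius:

```python
import numpy as np

def radial_distort(x, k1=-0.2, k2=0.05):
    """Apply x' = x (1 + k1 r^2 + k2 r^4) to normalized image coordinates.
    k1 < 0 gives barrel, k1 > 0 pincushion distortion; values here are made up."""
    r2 = np.sum(x**2)
    return x * (1.0 + k1 * r2 + k2 * r2**2)

# Undistortion has no closed form; in practice the model is inverted
# iteratively, which is what "images can be undistorted" amounts to.
print(radial_distort(np.array([0.5, 0.5])))   # [0.45625 0.45625]
```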
2.3
Photometric Image Formation
[Figure: image formation; light from a light source hits a surface with normal n̂ and travels through the optics to the sensor plane.]
I So far we have discussed how individual light rays travel through space
I We now discuss how an image is formed in terms of pixel intensities and colors
I Light is emitted by one or more light sources and reflected or refracted
(once or multiple times) at surfaces of objects (or media) in the scene
Rendering Equation
Let p ∈ R3 denote a 3D surface point, v ∈ R3 the viewing direction and s ∈ R3 the
incoming light direction. The rendering equation describes how much of the light Lin
with wavelength λ arriving at p is reflected into the viewing direction v:
\[ L_{\text{out}}(\mathbf{p}, \mathbf{v}, \lambda) = L_{\text{emit}}(\mathbf{p}, \mathbf{v}, \lambda) + \int_{\Omega} \mathrm{BRDF}(\mathbf{p}, \mathbf{s}, \mathbf{v}, \lambda) \cdot L_{\text{in}}(\mathbf{p}, \mathbf{s}, \lambda) \cdot (-\mathbf{n}^\top \mathbf{s})\, d\mathbf{s} \]
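To make the integrand concrete (an added sketch): for a Lambertian surface the BRDF is a constant albedo/π, so each incoming direction contributes its radiance weighted by the foreshortening term −n⊤s:

```python
import numpy as np

def lambertian_contribution(n, s, L_in, albedo=0.8):
    """One sample of the rendering-equation integrand for a diffuse surface.
    n: unit surface normal; s: unit incoming light direction (toward surface)."""
    brdf = albedo / np.pi              # Lambertian BRDF is constant
    cos_term = max(0.0, -n @ s)        # foreshortening -n^T s, clamped to hemisphere
    return brdf * L_in * cos_term

n = np.array([0.0, 0.0, 1.0])
s = np.array([0.0, 0.0, -1.0])         # light arriving straight down onto the surface
print(lambertian_contribution(n, s, L_in=1.0))   # 0.8 / pi ~ 0.2546
```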
Diffuse and Specular Reflection
I The amount of light reflected from a surface depends on the viewing angle
Why Camera Lenses?
I Large and very small pinholes result in image blur (averaging, diffraction)
I Small pinholes require very long shutter times (⇒ motion blur)
I http://www.pauldebevec.com/Pinhole/
[Figure: pinhole camera with a small pinhole vs. a large pinhole, each showing the pinhole and the image plane.]
Optics
[Figure: pinhole camera model vs. camera with lens; the lens takes the place of the pinhole in front of the image plane.]
I Cameras use one or multiple lenses to accumulate light on the sensor plane
I Importantly, if a 3D point is in focus, all light rays arrive at the same 2D pixel
I For many applications it suffices to model lens cameras with a pinhole model
I However, to address focus, vignetting and aberration we need to model lenses
Thin Lens Model
[Figure: thin lens with image plane and the focal points of the lens.]
\[ \frac{x_s}{x_c} = \frac{z_s - f}{f} \;\wedge\; \frac{x_s}{x_c} = \frac{z_s}{z_c} \;\Rightarrow\; \frac{z_s - f}{f} = \frac{z_s}{z_c} \;\Rightarrow\; \frac{z_s}{f} - 1 = \frac{z_s}{z_c} \;\Rightarrow\; \frac{1}{z_s} + \frac{1}{z_c} = \frac{1}{f} \]
I The thin lens model with a spherical lens is often used as an approximation
I Properties: axis-parallel rays pass through the focal point; rays through the center keep their direction
I From Snell's law we obtain f = R / (2(n − 1)) with radius R and index of refraction n
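A quick numeric check of the thin lens equation (added illustration):

```python
def in_focus_distance(z_c, f):
    """Solve 1/z_s + 1/z_c = 1/f for the sensor distance z_s at which a point
    at depth z_c is in focus. Requires z_c > f."""
    assert z_c > f, "objects closer than the focal length cannot be focused"
    return 1.0 / (1.0 / f - 1.0 / z_c)

print(in_focus_distance(z_c=2.0, f=0.05))   # 0.05128... m: slightly beyond f
# As z_c -> infinity, z_s -> f: distant scenes are focused at the focal plane.
```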
Depth of Field (DOF)
I To control the size of the circle of confusion, we change the lens aperture
I An aperture is a hole or an opening through which light travels
I The aperture limits the amount of light that can reach the image plane
I Smaller apertures lead to sharper but noisier images (fewer photons reach the sensor)
I The allowable depth variation that limits the circle of confusion c is called
depth of field and is a function of both the focus distance and the lens aperture
I Typical DSLR lenses have depth of field indicators
I The commonly displayed f-number is defined as
\[ N = \frac{f}{d} \qquad \text{(often denoted as } f/N \text{, e.g. } f/1.4 \text{)} \]
I In other words, it is the lens focal length f divided by the aperture diameter d
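As a worked example (added for illustration): a 50 mm lens at f/1.4 has an aperture diameter of d = 50/1.4 ≈ 36 mm, while stopping down to f/16 shrinks it to ≈ 3 mm, letting in less light but extending the depth of field.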
Chromatic Aberration
I The index of refraction of glass varies slightly with wavelength, so simple lenses focus different colors at slightly different distances, causing blur and color fringing
Vignetting
I Vignetting is the tendency for the brightness to fall off towards the image edge
I Composition of two effects: natural and mechanical vignetting
I Natural vignetting: foreshortening of object surface and lens aperture
I Mechanical vignetting: the shaded part of the beam never reaches the image
I Vignetting can be calibrated (i.e., undone)
2.4
Image Sensing Pipeline
Shutter
I A focal plane shutter is positioned just in front of the image sensor / film
I Most digital cameras use a combination of mechanical and electronic shutter
I The shutter speed (exposure time) controls how much light reaches the sensor
I It determines if an image appears over-/underexposed, blurred or noisy
Sensor
I CCDs move charge from pixel to pixel and convert it to voltage at the output node
I CMOS sensors convert charge to voltage inside each pixel and are the standard today
I Larger chips (full frame = 35 mm) are more photo-sensitive ⇒ less noise
https://meroli.web.cern.ch/lecture_cmos_vs_ccd_pixel_sensor.html
Color Filter Arrays
I Each pixel integrates the light spectrum L according to its spectral sensitivity S:
\[ R = \int L(\lambda)\, S_R(\lambda)\, d\lambda \]
I Various color spaces have been developed and are used in practice
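A discretized version of this integral (an added sketch with made-up spectra):

```python
import numpy as np

lam = np.linspace(400.0, 700.0, 301)               # wavelengths in nm
L = np.ones_like(lam)                              # flat, made-up light spectrum L(λ)
S_R = np.exp(-0.5 * ((lam - 600.0) / 40.0) ** 2)   # toy red sensitivity S_R(λ)

# Riemann-sum approximation of R = ∫ L(λ) S_R(λ) dλ
R = np.sum(L * S_R) * (lam[1] - lam[0])
print(R)
```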
Gamma Compression
[Figure: gamma compression Y′ = Y^(1/γ) applied before quantization and expansion Y = Y′^γ applied afterwards; without compression, quantization noise becomes visible in dark regions.]
I Humans are more sensitive to intensity differences in darker regions
I Therefore, it is beneficial to nonlinearly transform the intensities or colors
prior to discretization and to undo this transformation during loading
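A minimal sketch of the round trip (γ = 2.2 is a typical value, assumed here):

```python
import numpy as np

gamma = 2.2

def gamma_compress(Y):       # applied before quantization / storage
    return np.clip(Y, 0.0, 1.0) ** (1.0 / gamma)

def gamma_expand(Y_prime):   # applied when loading, to recover linear intensities
    return Y_prime ** gamma

Y = np.array([0.01, 0.1, 0.5])
Y_prime = gamma_compress(Y)    # dark values get many more of the available codes
assert np.allclose(gamma_expand(Y_prime), Y)
```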
Image Compression
Questions?
Acknowledgement
• Various contents in this presentation have been taken from different books,
lecture notes, and the web. They belong solely to their owners and are used here
only to clarify various educational concepts. Any copyright infringement is
not intended.