Classical Optics and Its Applications
Second Edition
http://ebooks.cambridge.org/ebook.jsf?bid=CBO9780511803796
MASUD MANSURIPUR
College of Optical Sciences
University of Arizona, Tucson
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi
Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521881692
© M. Mansuripur 2009
A catalogue record for this publication is available from the British Library
Contents
25 Diffractive optical elements
26 The Talbot effect
27 Some quirks of total internal reflection
28 Evanescent coupling
29 Internal and external conical refraction
30 Transmission of light through small elliptical apertures
31 The method of Fox and Li
32 The beam propagation method
33 Launching light into a fiber
34 The optics of semiconductor diode lasers
35 Michelson’s stellar interferometer
36 Bracewell’s interferometric telescope
37 Scanning optical microscopy
38 Zernike’s method of phase contrast
39 Polarization microscopy
40 Nomarski’s differential interference contrast microscope
41 The van Leeuwenhoek microscope
42 Projection photolithography
43 Interaction of light with subwavelength structures
44 The Ronchi test
45 The Shack–Hartmann wavefront sensor
46 Ellipsometry
47 Holography and holographic interferometry
48 Self-focusing in nonlinear optical media
49 Spatial optical solitons
50 Laser heating of multilayer stacks
Index
Preface to the second English edition
Following the publication of the first edition of this book, I wrote (or co-wrote)
nine additional columns for Optics & Photonics News (OPN), which appeared
between April 2001 and April 2007. Some of these columns were included in the
Japanese enlarged edition of the book, published in 2006; all nine columns are
now included in this second English edition. Throughout the years, I also wrote
four columns which were not submitted to OPN, because they ended up being
somewhat lengthy and perhaps too mathematical for the general readership of the
OPN; these appear here for the first time as Chapters 9, 14, 18, and 25.
The selection of topics and the exposition style of the thirteen new chapters of the
present edition follow the same principles and guidelines as did the original
thirty-seven chapters of the first edition. In each case a topic is chosen either for its intrinsic
value as a foundational contribution to the science of optics (e.g., the Sagnac effect,
second-order coherence, the Doppler shift), or because of its technological
significance (e.g., optical pulse compression, semiconductor diode lasers, diffractive
optical elements). To a large extent, the fifty chapters of the present book are
independent of each other and can be read in any desired sequence. Occasionally,
when the information in one chapter could benefit the understanding of the material
in another, cross references are provided. The presentation style is pedagogic and
informal, with mathematics used sparingly unless it is deemed essential and
unavoidable. Computer simulations are used extensively throughout the book as an
aid to explaining the concepts and to provide concrete examples of the physical
phenomena under consideration. As was the case in the first edition, the software
packages DIFFRACT and MULTILAYER, both products of MM Research,
Inc., Tucson, Arizona, were used for the numerical simulations reported in the new
Chapters 19, 20, 23, 25, 33, 34, 43, and 49. The computer simulations of Chapters 11,
30, and 43 were carried out by Armis Zakharian using his software package
Sim3D_Max, a product of Nonlinear Control Strategies, Inc., Tucson, Arizona.
The basis of Sim3D_Max is the Finite Difference Time Domain (FDTD) technique
for solving Maxwell’s equations as described, for example, in Computational
Electrodynamics by A. Taflove and S. C. Hagness, Artech House (2000).
Professor Emeritus Jumpei Tsujiuchi of the Tokyo Institute of Technology, Japan,
has painstakingly translated all of my OPN columns for the Japanese magazine O
Plus E; these articles appeared in print between 2001 and 2005. Subsequently,
Professor Tsujiuchi arranged for the collection of the translated articles to be
published in book form by the New Technology Communications Co., Ltd. The
Japanese enlarged edition of Classical Optics and its Applications has been in print
since 2006. I am grateful to Professor Tsujiuchi for his dedication and his untiring
efforts to bring these articles to the attention of the technical community in Japan.
For guidance and illuminating discussions, I am indebted to Armis Zakharian of
the Corning Corp., Jeffrey Wilde of the Capella Photonics, Inc., Seiji Yonezawa of
the Comets, Inc. (Japan), Sjoerd Stallinga of the Philips Research Laboratories
(Netherlands), and Kenji Konno of Minolta Corp. (Japan). I am also grateful to the
following colleagues from the College of Optical Sciences, University of Arizona,
for sharing their insights with me: Ewan Wright, Jerome Moloney, Brian Anderson,
Mahmoud Fallahi, Jason Jones, Jose Sasian, Nasser Peyghambarian, James Wyant,
Dennis Howe, Pavel Polynkin, and Pierre Meystre. Special thanks are due to Ewan
Wright, Armis Zakharian, and Jerome Moloney for granting permission to publish
our joint articles in this volume. (The co-authors of the corresponding chapters are
acknowledged in footnotes to each chapter.)
While working on several chapters of this book, my research has been supported
by the United States Air Force Office of Scientific Research (AFOSR) through
contracts F49620–03–1–0194, FA9550–04–1–0213, FA9550–04–1–0355 awarded
by the Joint Technology Office; I thank Dr Arje Nachman of the AFOSR for his
support of our research program over the past several years. I also would like to
thank the editor at the Cambridge University Press, Dr Simon Capelin, and his
professional staff for their superb handling of the publication of the English
editions of this book. Last but not least, I must mention with deep gratitude the
loving care and support of my wife, Annegret, during the years that this book has
been in preparation. As with previous editions, it is to her and to our children,
Kaveh and Tobias, that the second English edition of the book is dedicated.
Preface to the first edition
I started writing the Engineering column of Optics & Photonics News (OPN) in
early 1997. Since then nearly forty articles have appeared, covering a broad range
of topics in classical optical physics and engineering. My original goal was to
introduce students and practising engineers to some of the most fascinating topics
in classical optics. This I planned to achieve with minimal usage of the
mathematical language that pervades the literature of the field. I had met many
bright students and practitioners who either did not know or did not fully appreciate
some of the major concepts of classical optics such as the Talbot effect, Abbe’s
sine condition, the Goos–Hänchen effect, Hamilton’s internal and external conical
refraction, Zernike’s method of phase contrast, Michelson’s stellar interferometer,
and so on. My columns were going to have little mathematics but an abundance of
pictures and pedagogical arguments, to bring forth the essence of the physics
involved in each phenomenon. In the process, I hoped, the readers would
appreciate the beauty of the subject and, if they found it interesting, would dig
deeper by searching the cited literature.
A unique tool available to me for this purpose was the computer programs
DIFFRACT™, MULTILAYER™, and TEMPROFILE™, which I have developed
in the course of my research over the past 20 years. The first of these programs
simulates the propagation of light through optical systems consisting of discrete
elements such as lasers, lenses, mirrors, prisms, phase/amplitude masks, gratings,
polarizers, wave-plates, multilayer stacks, birefringent crystals, diffraction gratings,
and optically active materials. The other two programs simulate the optical and
thermal behavior of multilayer stacks. I have used these programs to generate graphs
and pictures to explain the various phenomena in ways that would promote a better
understanding.
The articles have been successful beyond my wildest dreams. While I had hoped
that a few readers would find something useful in this series, I have received notes,
e-mails, and verbal comments from distinguished scholars around the world who
have found the columns stimulating and helpful. Some teachers informed me that
they use the articles for their classroom teaching, and I have heard of several
readers who collect the articles for future reference. All in all, I have been
pleasantly surprised by the positive reaction of the OPN readers to these columns.
Optics & Photonics News is not an archival journal and, therefore, will not be
widely available to future students. Thus I believe that collecting the articles here
in one book, which provides for ease of cross-referencing, will be useful.
Moreover, the book contains additional explanations of topics that were
originally curtailed for lack of space in OPN; it includes corrections to errors
discovered afterwards and incorporates some comments and criticisms made by
OPN readers as well as my answers to these criticisms.
This book covers a broad range of topics: classical diffraction theory, the optics of
crystals, the peculiarities of polarized light, thin-film multilayer stacks and coatings,
geometrical optics and ray-tracing, various forms of optical microscopy,
interferometry, coherence, holography, nonlinear optics, etc. It could serve as a companion
to the principal text used in a number of academic courses in physics, engineering,
and optics; it should be useful for university teachers as a guide to selecting topics for
a graduate-level course; it should be useful also for self-study by graduate students. It
could be used fruitfully by engineers who develop optical systems such as laser
printers, scanners, cameras, displays, image-processing equipment, lasers and
laser-based systems, telescopes, optical storage and communication systems,
spectrometers, etc. I believe anyone working in the field of optics could benefit from this
book, by being exposed to some of the major concepts and ideas (developed over the
last three centuries) that shape our modern understanding of optics.
Some of the original OPN columns were written jointly with colleagues and
students; these are identified in the footnotes and the corresponding co-authors
acknowledged. I thank Ewan Wright and Rongguang Liang of the Optical Sciences
Center, Lifeng Li of Tsinghua University, Mahmoud Fallahi of Nortel Co., and
Wei-Hung Yeh of Maxoptix Co. for their collaboration as well as for giving
permission to publish our joint papers in this collection. I also would like to
acknowledge the late Peter Franken, Pierre Meystre, Yung-Chieh Hsieh, Dennis
Howe, Glenn Sincerbox, Harrison Barrett, Roland Shack, José Sasian, Michael
Descour, Arvind Marathay, Ray Chiao, James Wyant, Marc Levenson, Ronald
Gerber, James Burge, Ferry Zijp and Dror Sarid, who shared their valuable insights
with me and/or criticized the drafts of several articles prior to publication. Needless
to say, I am solely responsible for any remaining errors and inaccuracies. For their
help with graphics and word processing, I am grateful to our administrative
assistants Patricia Gransie, Nonie Veccia, Marylou Myers, and Amanda Palma.
Last but not least, I am grateful to my wife, Annegret, who has tolerated me
with love and patience over the past four years while this book was being written.
It is to her and to our children, Kaveh and Tobias, that this book is dedicated.
Introduction
The common threads that run through this book are the classical phenomena of
diffraction, interference, and polarization. Although the reader is expected to be
generally familiar with these electromagnetic phenomena, the book does cover
some of the principles of classical optics in the early chapters. The basic ideas of
diffraction and Fourier optics are introduced in chapters 1 through 4; this
introduction is followed by a detailed discussion of spatial and temporal coherence
and of partial polarization in chapters 5 through 8. These concepts are then used
throughout the book to explain phenomena that are either of technological import
or significant in their own right as natural occurrences that deserve attention.
Each chapter is concerned with a single topic (e.g., surface plasmons,
diffraction gratings, evanescent coupling, photolithography) and attempts to develop
an understanding of this subject through the use of pictures, examples, numerical
simulations, and logical argument. The reader already familiar with a particular
topic is likely to learn more about its applications, to appreciate better the physics
behind some of the formulas he or she may have previously encountered, and
perhaps even learn a thing or two about the nuances of the subject. For the reader
who is new to the field, our presentation aims to provide an introduction, an
intuitive feel for the physical and/or technological issues involved, and,
hopefully, motivation for digging deeper by consulting the cited references. For the
most part, this book avoids repeating what is already in the open literature,
aiming instead to expose concepts and ideas, ask critical questions, and provide
answers by appealing to the reader’s intuition rather than to his or her
mathematical skills.
Some of the chapters address fundamental problems that historically have
been crucial to our modern understanding of optics; conical refraction, the
Talbot effect, the principle of holography, and the Ewald–Oseen extinction
theorem are representatives of this class. Other chapters introduce devices and
phenomena of great scientific and technological importance; Fabry–Pérot
etalons, the magneto-optical Faraday and Kerr effects, and the phenomenon of
total internal reflection fall into this second category. Many of the remaining
chapters single out a tool or an instrument that not only is of immense
technological value but also has its unique principles of operation, worthy of detailed
understanding; examples include various microscopes and telescopes, lithographic
systems, ellipsometers, and so on. Occasionally a theoretical concept or a
numerical method is found that has a wide range of applications; we have devoted a
few chapters to these topics, such as the method of Fox and Li, the beam
propagation method, and the concept of reciprocity in classical optics.
The majority of the computer simulations reported in this book were
performed with the software packages DIFFRACT, MULTILAYER, and
TEMPROFILE, which I have written in the course of the past twenty years and
which are now commercially available. These programs in turn are based on
theoretical methods and numerical algorithms that are fully documented in
several of my publications.1,2,3,4,5,6 In a few chapters, I have collaborated with
Professor Lifeng Li (now at the Tsinghua University in China). Here, we have
used Professor Li’s program DELTA, also commercially available, for
calculations pertaining to diffraction gratings. The theoretical foundations of DELTA
are described in Professor Li’s publications.7
Throughout the book, black-and-white pictures will be used to display the
various properties of an optical beam; these include cross-sectional distributions
of intensity, phase, polarization, and the Poynting vector. A unified scheme for
the gray-scale encoding of real-valued functions of two variables is used in all the
chapters, and it is helpful to review these methods at the outset. In the convention
adopted the beam always propagates along the Z-axis, and its cross-sectional
plane is XY. The Cartesian XYZ coordinate system is right-handed, the polar
angles are measured from the positive Z-axis, and the azimuthal angles are
measured from the positive X-axis towards the positive Y-axis. In general, the
beam has three components of polarization along the X-, Y-, and Z-axes of the
coordinate system, that is, its electric field E has components Ex(x, y), Ey(x, y),
and Ez(x, y) at any given cross-sectional plane, say, at z = z0. Since the E-field
components are complex-valued, their complete specification requires two
distributions for each component, namely, amplitude and phase. The following
paragraphs describe in some detail the encoding scheme used for displaying
different cross-sectional properties of the beam and also provide a few examples.
The intensity distribution in the cross-sectional XY-plane for the E-field component along the X-axis is denoted
by Ix(x, y) = |Ex(x, y)|². Figure 0.1 shows plots of intensity distribution for
the three components of polarization of a Laguerre–Gaussian beam propagating
along the Z-axis. The black pixels represent locations where the intensity is at
its minimum (zero in the present case), the white pixels correspond to the
locations of maximum intensity within the corresponding frame, and the gray
pixels linearly interpolate between these minimum and maximum values. In the
case of Figure 0.1, the beam was taken to be linearly polarized at 45° to the
X-axis, leading to identical distributions for the X- and Y-components of
polarization.
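The linear gray-scale encoding described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used to produce the figures of this book; the function name `to_gray`, the 8-bit output range, and the small Gaussian test field are assumptions introduced here.

```python
import numpy as np

def to_gray(values, vmin=None, vmax=None):
    """Map a real-valued 2-D array linearly onto gray levels 0 (black) to 255 (white)."""
    values = np.asarray(values, dtype=float)
    lo = values.min() if vmin is None else vmin
    hi = values.max() if vmax is None else vmax
    if hi == lo:                              # flat frame: nothing to interpolate
        return np.zeros(values.shape, dtype=np.uint8)
    scaled = np.clip((values - lo) / (hi - lo), 0.0, 1.0)
    return np.round(255 * scaled).astype(np.uint8)

# Hypothetical test field: a Gaussian amplitude sampled on a 7 x 7 grid.
x = np.linspace(-3.0, 3.0, 7)
X, Y = np.meshgrid(x, x)
Ex = np.exp(-(X**2 + Y**2))
Ix = np.abs(Ex) ** 2                          # intensity of the X-component
gray = to_gray(Ix)                            # black at the frame minimum, white at its maximum
```

Passing explicit `vmin` and `vmax` fixes the scale, which is convenient for angular quantities whose range is known in advance.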
The much weaker Z-component is computed to ensure that the Maxwell
equations will be satisfied for the assumed distributions of the X- and
Y-polarization components. In general, one may assume arbitrary distributions for
Ex and Ey within a given cross-sectional XY-plane. To determine Ez in a
self-consistent manner, one must break up the Ex and Ey distributions into their
plane-wave constituents and proceed to determine Ez for each plane wave that
propagates along the unit vector σ = (σx, σy, σz) by requiring the inner product
of E and σ to vanish (i.e., Exσx + Eyσy + Ezσz = 0). One must then superimpose
the Z-components of all the plane waves thus obtained to arrive at the total
distribution of Ez. In Figure 0.1 the peak intensities in the three frames are in the
ratios Ix : Iy : Iz = 1.0 : 1.0 : 1.47 × 10⁻⁷.
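The procedure just described (decompose into plane waves, enforce transversality for each, then superimpose) can be sketched with a two-dimensional FFT. This is a minimal sketch under stated assumptions, not the algorithm of any particular software package: the function name `ez_from_transverse`, the grid parameters, the Gaussian test beam, and the choice to discard evanescent components are all introduced here for illustration. Lengths are in units of the wavelength, so each FFT spatial frequency equals a direction cosine of the corresponding plane wave (called `sx`, `sy`, `sz` in the code).

```python
import numpy as np

def ez_from_transverse(Ex, Ey, dx):
    """Return Ez such that each propagating plane-wave component of E is
    orthogonal to its unit propagation vector. dx is the sample spacing in wavelengths."""
    n = Ex.shape[0]
    s = np.fft.fftfreq(n, d=dx)               # direction cosines sampled by the FFT
    sx, sy = np.meshgrid(s, s, indexing="xy")
    s2 = sx**2 + sy**2
    propagating = s2 < 1.0
    sz = np.sqrt(np.where(propagating, 1.0 - s2, 1.0))
    Fx, Fy = np.fft.fft2(Ex), np.fft.fft2(Ey)
    # Transversality for each plane wave: Fx*sx + Fy*sy + Fz*sz = 0.
    Fz = np.where(propagating, -(Fx * sx + Fy * sy) / sz, 0.0)  # evanescent part dropped (assumption)
    return np.fft.ifft2(Fz)

# Hypothetical test beam: Gaussian of 1/e amplitude radius 10 wavelengths, polarized at 45 degrees.
n, dx, w = 128, 0.5, 10.0
x = (np.arange(n) - n // 2) * dx
X, Y = np.meshgrid(x, x)
Ex = np.exp(-(X**2 + Y**2) / w**2) / np.sqrt(2.0)
Ey = Ex.copy()
Ez = ez_from_transverse(Ex, Ey, dx)
ratio = np.max(np.abs(Ez) ** 2) / np.max(np.abs(Ex) ** 2)
print(f"peak Iz / peak Ix = {ratio:.2e}")
```

For a beam many wavelengths wide the plane-wave spectrum is concentrated near the Z-axis, which is why the computed Z-component comes out far weaker than the transverse components.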
[Figure 0.1: frames (a) to (c); axes x/λ and y/λ, each spanning −10⁴ to +10⁴.]
[Figure 0.2: frames (a) and (b); axes x/λ and y/λ, each spanning −5 to +5.]
Figure 0.2 (a) Intensity distribution in the focal plane of a 0.5NA lens having
1.5λ of third-order coma (Seidel aberration). The uniformly distributed incident
beam is assumed to be circularly polarized. In the focal plane, the X-, Y-, and
Z-components of the electric field vector are added together to yield the total
E-field intensity. (b) Same as (a) but on a logarithmic scale with α = 4 (see text).
Ellipse of polarization
Consider a collimated beam of light propagating along the Z-axis. In general,
the state of polarization of the beam at any given point is elliptical, as shown in
Figure 0.4. So long as the electric-field vector E may be assumed to be confined to
the XY-plane, it may be resolved into two orthogonal components, along the X- and
Y-axes say. If Ex and Ey happen to be in phase, the polarization will be linear along
some direction specified by the angle ρ. If, on the other hand, the phase difference
between Ex and Ey is 90° then the polarization will be elliptical, the major and
minor axes of the ellipse lying along the X- and Y-axes. In general, the phase
difference between Ex and Ey is somewhere between 0° and 360°, giving rise to an
ellipse whose major axis has an angle ρ with the X-axis and whose ellipticity is
given by the angle η. When the polarization is linear, η = 0°; for light that is right
circularly polarized (RCP), η = +45°, whereas for light that is left circularly
polarized (LCP), η = −45°. In general, −90° < ρ ≤ 90° and −45° ≤ η ≤ 45°.
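One way to obtain the orientation and ellipticity angles of the ellipse from the complex components Ex and Ey is through the Stokes parameters. The sketch below assumes that route (the text does not say how the angles are computed in practice), and the function name `ellipse_angles` is hypothetical. Which sign of the ellipticity corresponds to RCP depends on the time-dependence and handedness conventions, so the sign chosen for `s3` is an assumption.

```python
import numpy as np

def ellipse_angles(Ex, Ey):
    """Return (orientation, ellipticity) of the polarization ellipse in degrees.
    Orientation lies in (-90, 90]; ellipticity lies in [-45, 45]."""
    s0 = abs(Ex) ** 2 + abs(Ey) ** 2
    s1 = abs(Ex) ** 2 - abs(Ey) ** 2
    s2 = 2.0 * (Ex * np.conj(Ey)).real
    s3 = 2.0 * (np.conj(Ex) * Ey).imag        # handedness sign: an assumed convention
    orientation = 0.5 * np.degrees(np.arctan2(s2, s1))
    ellipticity = 0.5 * np.degrees(np.arcsin(np.clip(s3 / s0, -1.0, 1.0)))
    return orientation, ellipticity

# In-phase components: linear polarization at 30 degrees to the X-axis.
print(ellipse_angles(np.cos(np.radians(30.0)), np.sin(np.radians(30.0))))
# 90-degree phase difference with unequal amplitudes: ellipse aligned with the X- and Y-axes.
print(ellipse_angles(1.0, 0.5j))
# 90-degree phase difference with equal amplitudes: circular polarization.
print(ellipse_angles(1.0, 1.0j))
```

When the intensity `s0` vanishes, or when the polarization is circular (`s1` and `s2` both zero), the orientation angle is undefined and any value returned for it is arbitrary.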
Figure 0.5 shows cross-sectional plots of intensity and polarization state for a
beam with a highly non-uniform state of polarization. Frame (a) is the logarithmic
intensity pattern in the XY-plane. The polarization rotation angle ρ(x, y) is depicted
in (b), while the ellipticity η(x, y) is shown in (c). The gray-scale in Figure 0.5(b) is
a linear map of the values of ρ from −90° (black) to +90° (white). Similarly, the
plot of η in Figure 0.5(c) is linearly encoded in gray-scale, with black representing
−45° and white representing +45°.
In the plot of ρ depicted in Figure 0.5(b), there are random-looking jumps
between black and bright-white pixels. This is due to the ambiguity of the
polarization rotation angle when either the E-field intensity is zero or the ellipticity
η is ±45°. In these regions, a small numerical error could readily cause a discrete
jump between ρmin = −90° and ρmax = +90°.
Figure 0.4 The ellipse of polarization is uniquely specified by Ex and Ey, the
complex-valued electric field components along the X- and Y-axes. The major
axis of the ellipse makes an angle ρ with the X-direction, and the angle η facing
the minor axis represents the polarization ellipticity.
[Figure 0.5: frames (a) to (c); axes x/λ and y/λ, each spanning −150 to +150.]
[Figure 1.1: lens diagram with labels f = 1.1133 mm, NA = 0.75, and the second principal plane.]
we shall see below, the emergent phase pattern is quite different for a lens that
does satisfy the sine condition.
Geometric-optical concepts
The sine condition applies to a centered optical system designed for “aberration-
free” imaging of a small patch within the object plane to a corresponding patch
within the image plane (see Figure 1.3). The imaging system is intended for a
given pair of conjugate planes, so that the distance z0 between the object and the
first p.p. of the system is fixed, as is the distance z1 between the image and
the second p.p. The lens formula 1/z0 + 1/z1 = 1/f, where f is the focal length of the
system, applies here.5
Throughout this chapter, attention is confined to systems where both the object
and image are in air; extension of the results to situations where the object space
and image space have differing refractive indices (e.g., immersion-oil
microscopy) is straightforward but is not discussed.4,5
In the present context, “aberration-free” imaging means that a cone of light
emanating from any point (x0, y0) in the small patch within the object plane, when
captured by the optical system, is turned into a convergent cone that – to a first
approximation in the relevant parameters – comes to focus at (x1, y1) in the image
plane (see Figure 1.4).4,5 The point (x1, y1) is conjugate to (i.e., the Gaussian
image of) the point (x0, y0). Since the system is circularly symmetric around the
optical axis, the axial point at the center of the object plane is imaged to
the axial point at the center of the image plane. Denoting the distance between
(x0, y0) and the origin of the object plane by d0 and, similarly, the distance
between (x1, y1) and the origin of the image plane by d1, the transverse
magnification m of the system is d1/d0. It is not difficult to show that m is also
equal to z1/z0 (see Figure 1.3).
[Figure 1.2: frames (a) and (b) span −7.5 to +7.5 in y/λ; frames (c) and (d) span −3250 to +3250 in y/λ.]
Figure 1.2 (a) Logarithmic plot of intensity distribution at the focal plane of
the plano-convex lens of Figure 1.1 for a circularly polarized, collimated beam
traveling along the optical axis. (b) Same as (a) but for an obliquely incident
beam traveling at 0.076° relative to the optical axis. (c) Distribution of phase for
the oblique beam entering the lens at its flat surface. The gray-scale covers the
interval from −180° (black) to +180° (white). (d) Distribution of phase for the
oblique beam emerging from the lens at its second p.p.
Principal planes
The concept of the principal planes is rooted in paraxial ray-tracing (i.e.,
Gaussian optics), where the angles between the rays and the optical axis are so
small that the sine and the tangent of each angle can be approximated by the
value of the angle itself, sin θ ≈ tan θ ≈ θ. In the neighborhood of the optical axis,
therefore, the entire system may be represented by a 2 × 2 matrix, and the
principal planes are uniquely determined from this so-called ABCD matrix of the
system.5
[Figure 1.3: diagram of the object plane X0Y0, the first and second principal planes of the imaging system, and the image plane X1Y1.]
Figure 1.3 A small planar object in the vicinity of the optical axis in the X0Y0-plane
is imaged onto a small region of the X1Y1-plane. The principal planes of the imaging
system are also shown. The object and image planes are assumed to be in air, so that the
refractive indices of both the object space and the image space may be set to unity.
[Figure 1.4: diagram showing the object plane, the Z-axis, the image point (x1, y1), and the distances d0 and d1.]
Figure 1.4 The cone of light emanating from an off-axis object point (x0, y0) is
captured by the imaging system and brought to focus at the corresponding image
point (x1, y1). Note that beyond the paraxial regime the rays entering the first p.p. at
a given height do not necessarily emerge from the second p.p. at the same height.
The principal planes are conjugate planes with unit transverse magnification. A
ray entering the first p.p. at a certain height h will emerge from the second p.p. at
the same height, as shown in Figure 1.5(a). Thus h ≈ z0θ0 ≈ z1θ1, where θ0 and θ1
are the angles of the incident and the emergent rays with the optical axis. Note
that, within the framework of the paraxial approximation, the system’s entrance
aperture at the first p.p. is identical in size and shape to the exit aperture located at
the second p.p. (The term aperture as used here should not be confused with
pupil, which has a more specific meaning in geometrical optics. The entrance and
exit pupils also define the boundaries of the cones of light that enter and exit the
system, but the pupils are not necessarily located at the principal planes.)
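The paraxial statements above (principal planes fixed by the ABCD matrix, and unit transverse magnification between them) can be checked numerically. The following is a minimal sketch assuming a system in air, rays written as column vectors (height, angle), and matrices applied right to left; the helper names and the two-thin-lens example are hypothetical and chosen only for illustration.

```python
import numpy as np

def translation(d):
    """Free-space propagation over a distance d."""
    return np.array([[1.0, d], [0.0, 1.0]])

def thin_lens(f):
    """Thin lens of focal length f."""
    return np.array([[1.0, 0.0], [-1.0 / f, 1.0]])

def principal_planes(M):
    """For a system in air (det M = 1), return (f, p1, p2):
    p1 = position of the first p.p., measured to the right of the input plane;
    p2 = position of the second p.p., measured to the right of the output plane."""
    A, B, C, D = M.ravel()
    f = -1.0 / C
    p1 = (D - 1.0) / C
    p2 = (1.0 - A) / C
    return f, p1, p2

# Hypothetical system: lenses of focal length 100 and 50 separated by 20 (arbitrary length units).
M = thin_lens(50.0) @ translation(20.0) @ thin_lens(100.0)
f, p1, p2 = principal_planes(M)
print(f, p1, p2)

# Matrix from the first p.p. to the second p.p.: it should reduce to that of a thin lens,
# so a ray entering at height h leaves at the same height h.
P = translation(p2) @ M @ translation(-p1)
print(P)
```

The reduced matrix `P` has A = 1 and B = 0, which is the matrix form of the statement that the principal planes are conjugate with unit magnification.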
Beyond the paraxial regime, the principal planes cease to be conjugate planes.
Depending on its direction, a ray entering the first p.p. at a given height h might
emerge from different locations on the second p.p. One might confine attention to a
specific set of rays, such as those emanating from the axial point in the object plane,
in order to fix the directions of rays that enter the system. Yet there is no guarantee
that the height h of a ray on entering the first p.p. will remain the same when it
emerges from the second p.p. Of course one can impose this as a requirement on the
system, but many other possibilities exist that are equally plausible, as long as they
conform to the constraints of the paraxial regime. Abbe’s sine condition is one such
requirement placed on the heights of the entering and emerging rays.
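[Figure 1.5: ray diagrams (a) and (b) along the Z-axis, from the axial object point to the axial image point, with the entrance and exit ray angles marked.]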
Figure 1.5 (a) In the paraxial regime the height h of a ray is measured from the
optical axis in the principal planes. (b) In systems that operate beyond the
paraxial regime one may define the ray height at the point where the ray crosses
a reference sphere. When a system satisfies Abbe’s sine condition the height of a
ray thus defined remains the same upon entering and exiting the system.
the height of an emergent ray at the exit sphere differs from its height at the
second p.p.) In a sense, therefore, the sine condition requires the bending of the
principal planes into spheres to preserve the paraxial property that a ray entering
the system at a given height emerges from the system at the same height.
Whereas in the paraxial regime the angular magnification θ1/θ0 = 1/m, where
m is the transverse magnification of the system, it is the ratio (sin θ1)/(sin θ0) that
equals 1/m in a system satisfying the sine condition. This turns out to be of crucial
significance for the image-forming system, as will be shown below. To
emphasize the point, note that in the system of Figure 1.5(a), where the entering and
emerging ray heights are equal at the principal planes, the ratio (tan θ1)/(tan θ0)
equals 1/m, whereas in the system of Figure 1.5(b), which satisfies Abbe’s sine
condition, the relevant ratio is (sin θ1)/(sin θ0).
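The three angular-magnification rules compared above are easy to evaluate side by side. The sketch below is illustrative only; the function name `exit_angle`, the rule labels, and the sample values (m = 2 and an entrance angle of 40 degrees) are assumptions made for this example.

```python
import math

def exit_angle(theta0, m, rule):
    """Exit-ray angle (radians) for entrance angle theta0 and transverse magnification m."""
    if rule == "paraxial":                    # angles themselves scale by 1/m
        return theta0 / m
    if rule == "tangent":                     # equal ray heights at the principal planes
        return math.atan(math.tan(theta0) / m)
    if rule == "sine":                        # Abbe's sine condition
        return math.asin(math.sin(theta0) / m)
    raise ValueError(f"unknown rule: {rule}")

m = 2.0
for deg in (0.1, 40.0):
    theta0 = math.radians(deg)
    angles = {r: math.degrees(exit_angle(theta0, m, r)) for r in ("paraxial", "tangent", "sine")}
    print(deg, angles)
```

For small entrance angles the three rules agree, as they must in the paraxial regime; at wide angles they separate, and only one of them can be satisfied by a given system.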
Aplanatic system
A system that yields an aberration-free image of the axial object point and satisfies
Abbe’s sine condition is said to be “aplanatic”.4,5 Many imaging systems in use
today satisfy these conditions to a good approximation, if not exactly. Note that the
clear-aperture diameter of an aplanatic system as seen on the first p.p. is no longer
equal to that on the second p.p. If NA0 is the numerical aperture of the largest
cone of light emanating from the axial object point and captured by the system,
the aperture radius on the entrance sphere is z0NA0 whereas that on the first p.p. is
z0 tan[sin⁻¹(NA0)]. Similarly, in the image space the aperture radius on the exit
sphere is z1NA1 while that on the second p.p. is z1 tan[sin⁻¹(NA1)]. Abbe’s sine
condition guarantees that z0NA0 = z1NA1 but, unless the imaging system has unit
magnification, the aperture radii at the two principal planes are not equal.
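A short numerical illustration of these aperture radii follows. The conjugate distances and numerical apertures are hypothetical values (in units of the wavelength) chosen so that the sine condition holds with m = z1/z0 = 0.5; the function name `aperture_radii` is likewise an assumption of this sketch.

```python
import math

def aperture_radii(z, na):
    """Return (radius on the reference sphere, radius on the principal plane)
    for conjugate distance z and numerical aperture na."""
    on_sphere = z * na
    on_plane = z * math.tan(math.asin(na))
    return on_sphere, on_plane

# Hypothetical finite-conjugate aplanat: z0 = 8000, NA0 = 0.375, z1 = 4000, NA1 = 0.75.
sphere0, plane0 = aperture_radii(8000.0, 0.375)
sphere1, plane1 = aperture_radii(4000.0, 0.75)
print(sphere0, sphere1)   # equal radii on the entrance and exit spheres
print(plane0, plane1)     # unequal radii on the two principal planes
```

The radii on the two reference spheres coincide, while the radii projected onto the principal planes differ, the larger one belonging to the side with the higher numerical aperture.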
What is surprising about the sine condition is that a requirement imposed
solely on the cones of light corresponding to the on-axis points affects the quality
of imaging for nearby off-axis points: once the sine condition has been satisfied,
all near-axis points within the object plane will be imaged, essentially free of
aberration, to their conjugates in the image plane. Without the sine condition,
however, images of the near-axis points would be degraded by aberrations, most
prominently by coma. It is this surprising property of the sine condition that we
shall elucidate further.
aberrations of the lens, if any, to account for deviations of the emergent wavefront
from perfect sphericity.8 Thus if A1(x,y) represents the complex-amplitude
distribution at the first p.p., the distribution at the second p.p. will be written
\[ A_2(x, y) = A_1(x, y)\,\exp[\mathrm{i}(2\pi/\lambda)W(x, y)]\,\exp\!\big[-\mathrm{i}(2\pi/\lambda)\sqrt{x^2 + y^2 + f^2}\,\big]. \tag{1.1} \]
[Figure 1.6: ray diagram along the Z-axis showing the object point (x0, y0), the image point (x1, y1), the axial points (0, 0), and the unit vectors along the rays in the object and image spaces.]
Figure 1.6 The ray leaving the off-axis point (x0, y0) and arriving at (x′, y′) will
travel a slightly different distance than the ray from the axial point (0, 0) that
travels along S′ toward the same location. When (x0, y0) is sufficiently close to the
optical axis, the path-length difference between these two rays can be
approximated by the projection on S′ of the line joining (x0, y0) to the point at the origin.
The same argument applies to the conjugate rays in the image space.
To a first approximation, therefore, upon arrival at the first p.p. the cone of light
that originates at (x0, y0) will be the same as that which originated from the axial
point, albeit with a modulation by the following phase factor:
\[ \exp[\mathrm{i}(2\pi/\lambda)\Delta l] \approx \exp[-\mathrm{i}(2\pi/\lambda)(x_0 S'_x + y_0 S'_y)]. \tag{1.3} \]
Note that the phase in Eq. (1.3) is linear in (S′x, S′y) but not in (x′, y′). The same
phase factor will appear on the beam at (x, y) on the second p.p. Now, and this is
the crux of the matter, if the sine condition is satisfied then this phase factor can
be replaced by exp[−i(2π/λ)(x1Sx + y1Sy)], because the angular magnification
between (S′x, S′y) and (Sx, Sy) is exactly the reverse of the transverse magnification
m between (x0, y0) and (x1, y1). The distribution at the second p.p. now
corresponds to a spherical wavefront, converging toward (x1, y1) and having no
aberrations whatsoever. This is the essence of the sine condition, which cannot
be over-emphasized; it is the reason why there is “aberration-free” imaging of
near-axial points.
A wide-aperture aplanat
As an example, consider an ideal infinite-conjugate aplanatic lens having
z0 = ∞, NA0 = 0, z1 = f = 4000λ and NA1 = 0.75. The phase pattern of an obliquely
incident plane wave at the first p.p. of this lens is shown in Figure 1.7(a). The
beam has a linear phase over the entire entrance aperture, as expected of a plane
wave at oblique incidence. Upon emerging from the second p.p. the phase
pattern of the beam is that of Figure 1.7(b). In compliance with the sine con-
dition the exit aperture is seen to be larger than the entrance aperture,
and the phase pattern has undergone some sort of nonlinear “stretching”.
(The emergent phase pattern in Figure 1.7(b), however, is nonlinear because
it is displayed in the x, y coordinates; in the coordinates Sx, Sy it would be
perfectly linear.)
The emergent beam comes to focus at the focal plane of the lens, creating
the off-axis Airy pattern shown in Figure 1.7(c). For comparison, the on-axis
focused spot of the same lens is also shown in the figure. As expected, the
off-axis spot is free from aberrations, and the two spots are essentially
identical.
It is not difficult to design an aplanat with the characteristics of the lens in the
above example; a specific design is shown in Figure 1.8. The various parameters
of this meniscus, which consists of two conic surfaces, are listed in the figure
caption.
Note that σx and σy are proportional to sin θ, but in the present case it is tan θ that
is magnified by 1/m. A Taylor series expansion yields
\[ \tan\theta = \sin\theta \Big/ \sqrt{1 - \sin^2\theta} = \sin\theta + \tfrac{1}{2}\sin^3\theta + \tfrac{3}{8}\sin^5\theta + \cdots. \tag{1.5} \]
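The size of the higher-order terms in Eq. (1.5) can be checked numerically. The short sketch below is only an illustration: the function name and the sample values of sin θ are our own choices, with 0.75 picked to match the numerical aperture of the aplanat in the preceding example.

```python
import math

def tan_from_sine_series(s):
    """Three-term truncation of Eq. (1.5), with s = sin(theta)."""
    return s + 0.5 * s**3 + 0.375 * s**5

for s in (0.1, 0.3, 0.75):
    exact = s / math.sqrt(1.0 - s * s)   # tan(theta) written in terms of sin(theta)
    approx = tan_from_sine_series(s)
    print(f"sin = {s:4.2f}   tan = {exact:.6f}   three-term series = {approx:.6f}")
```

For small angles the two columns agree closely, whereas at sin θ = 0.75 the truncated series falls visibly short of tan θ, which shows how quickly tan θ and sin θ part company at large apertures.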
The classical theory of diffraction originated in the work of the French physicist
Augustin Jean Fresnel, in the first quarter of the nineteenth century. Fresnel’s
ideas were subsequently expanded and elaborated by, among others, William
Rowan Hamilton, Gustav Kirchhoff, George Biddell Airy, John William Strutt
(Lord Rayleigh), Ernst Abbe, and Arnold Sommerfeld, leading to a complete
understanding of light in its wave aspects.1
The Fourier-transform operation occurs naturally in any formulation of the
theory of diffraction, giving rise to a body of literature that has come to be known
as Fourier optics.2 The prominence of Fourier transforms in physical optics is
rooted in the fact that any spatial distribution of the complex amplitude of light
can be considered a superposition of plane waves.3 (Plane waves, of course, are
eigenfunctions of Maxwell’s equations for the propagation of electromagnetic
fields through homogeneous media.1,4)
Many students of Fourier optics are intimidated by the approximations
involved in deriving its basic formulas, but it turns out that the majority of these
approximations are in fact unnecessary: by starting from a plane-wave expansion
of the light amplitude distribution, rather than the traditional Huygens’
principle,1,2,4 one can readily arrive at the fundamental results of the classical
theory either directly or after applying the stationary-phase approximation.1,3
(For a detailed discussion of the stationary-phase method see the appendix to this
chapter.)
The goal of the present chapter is to show how decomposition into, and
subsequent superposition of, plane waves can lead straightforwardly to the near-
field (Fresnel) and far-field (Fraunhofer) formulas, to elucidation of the Fourier
transforming properties of a lens, and to the essence of Abbe’s theory of image
formation. Along the way, several numerical examples will demonstrate the
utility of the derived formulas.
Jean Baptiste Joseph Fourier (1768–1830) began to work on the theory of heat
around 1804 and by 1807 had completed a memoir, On the Propagation of Heat in
Solid Bodies, in which periodic functions were expressed as the sum of an infinite
series of sines and cosines. Lagrange and Laplace objected to Fourier’s expansion
on the grounds that it lacked generality and rigor. Fourier’s treatise, The Analytical
Theory of Heat, was not published until 1822. (Photo: Deutsches Museum, courtesy
of AIP Emilio Segrè Visual Archives.)
Joseph von Fraunhofer (1787–1826), German physicist who first studied the
dark lines in the spectrum of the Sun. He was the first to use diffraction gratings, and
his work set the stage for the further development of spectroscopy. (Photo: Bavarian
Academy of Sciences, courtesy of AIP Emilio Segrè Visual Archives.)
On the one hand, σz will be real-valued if σx² + σy² ≤ 1, in which case the plane
wave is said to be homogeneous or propagating. On the other hand, if σx² + σy² > 1
then σz becomes imaginary and the plane wave is called inhomogeneous or
evanescent.
In scalar diffraction theory, the state of polarization of the light is ignored and
A0 is treated as a complex constant. Furthermore, if the x, y, z coordinates are
normalized by the wavelength λ, then this parameter disappears from all subsequent
equations. Throughout this chapter, therefore, all lengths will be assumed
to be normalized by λ; a propagation distance of 1000, for example, should be
understood as a distance of 1000λ.
Because Maxwell’s equations are linear, any superposition of plane waves within
homogeneous linear media is also a solution of Maxwell’s equations. In general,
the superposition of plane waves in Eq. (2.2b) contains both propagating and
evanescent waves. At a distance z = z0 from the origin, the complex-amplitude
distribution of the light is thus given by
\[ a(x, y, z = z_0) = \iint_{-\infty}^{\infty} A(\sigma_x, \sigma_y) \exp[i2\pi(x\sigma_x + y\sigma_y + z_0\sigma_z)]\, d\sigma_x\, d\sigma_y. \tag{2.3} \]
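Equation (2.3) translates almost directly into a numerical recipe when the Fourier integrals are replaced by fast Fourier transforms. The following is a minimal sketch under that assumption, not the program used for the figures in this chapter; the function name, the grid and the Gaussian test beam are our own illustrative choices, and all lengths are in units of λ.

```python
import numpy as np

def propagate_angular_spectrum(a0, dx, z0):
    """Propagate a sampled scalar field a0(x, y) over a distance z0 using Eq. (2.3)."""
    n_y, n_x = a0.shape
    sx = np.fft.fftfreq(n_x, d=dx)            # samples of sigma_x
    sy = np.fft.fftfreq(n_y, d=dx)            # samples of sigma_y
    SX, SY = np.meshgrid(sx, sy)
    # sigma_z is real for propagating waves and imaginary for evanescent ones.
    sz = np.sqrt((1.0 - SX**2 - SY**2).astype(complex))
    A = np.fft.fft2(a0)                        # plane-wave spectrum A(sigma_x, sigma_y)
    return np.fft.ifft2(A * np.exp(2j * np.pi * z0 * sz))

# Usage: a Gaussian beam of 1/e amplitude radius w = 10, propagated by z0 = 500.
n, dx, w, z0 = 256, 0.5, 10.0, 500.0
coords = (np.arange(n) - n // 2) * dx
X, Y = np.meshgrid(coords, coords)
a_start = np.exp(-(X**2 + Y**2) / w**2)
a_end = propagate_angular_spectrum(a_start, dx, z0)
on_axis_ratio = abs(a_end[n // 2, n // 2])**2 / abs(a_start[n // 2, n // 2])**2
```

Because the FFT treats the sampled field as periodic, the grid must be wide enough to contain the beam after propagation; otherwise light leaving one edge re-enters from the opposite edge.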
Diffraction-free beams
If the propagation phase factor in Eq. (2.3) happens to be a constant then it can
be taken out of the integral, in which case, aside from a multiplicative phase
According to Eq. (2.4), any initial distribution that is confined to a circle of radius
ρ0 in the Fourier domain will not diffract while propagating along the Z-axis.5 A
particularly simple case occurs when A(σx, σy) = δ(ρ − ρ0), where δ(·) is Dirac’s
delta function and ρ = (σx² + σy²)^{1/2}. The inverse transform of this delta function is
a zeroth-order Bessel function of the first kind, namely, a(x, y, z = 0) = J0(2πρ0r),
where r = (x² + y²)^{1/2}.
Needless to say, any azimuthal variation of the amplitude and/or phase of the
above delta function around the circle of radius ρ0 in the Fourier domain yields
another non-diffracting beam. Moreover, if the radius ρ0 is less than unity then
the non-diffracting beam will be a propagating beam, whereas ρ0 > 1 corresponds
to an exponentially attenuating, non-diffracting, evanescent beam.
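The non-diffracting property can be checked by summing plane waves whose (σx, σy) lie on a ring of radius ρ0. The sketch below is an illustration only: the function name, the value ρ0 = 0.1 and the number of azimuthal samples are assumptions of ours, and the result is normalized so that the on-axis amplitude at z = 0 is unity.

```python
import numpy as np

def ring_spectrum_field(x, y, z, rho0, n_phi=720):
    """Average of plane waves with (sigma_x, sigma_y) on a circle of radius rho0."""
    phi = np.linspace(0.0, 2.0 * np.pi, n_phi, endpoint=False)
    sx, sy = rho0 * np.cos(phi), rho0 * np.sin(phi)
    sz = np.sqrt(complex(1.0 - rho0**2))     # the same sigma_z for every wave on the ring
    return np.mean(np.exp(2j * np.pi * (x * sx + y * sy + z * sz)))

rho0 = 0.1
r_first_zero = 2.404826 / (2.0 * np.pi * rho0)   # first zero of J0(2 pi rho0 r)
for z in (0.0, 1000.0):
    print(z, abs(ring_spectrum_field(0.0, 0.0, z, rho0)),
          abs(ring_spectrum_field(r_first_zero, 0.0, z, rho0)))
```

Since σz is the same for all waves on the ring, propagation multiplies the whole field by a single phase factor, and the printed magnitudes are the same at z = 0 and z = 1000.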
Figure 2.2 A collimated beam illuminates an opaque circular disk of radius r0.
At a distance z0 from the disk the intensity distribution in the XY-plane contains a
bright spot at the center of the geometrical shadow of the disk.
Figure 2.4 Incoherent imaging by means of a dark lens. The object in (a) is
illuminated by an extended quasi-monochromatic source through a 0.005NA
condenser of focal length f = 6.0 × 10^5. The source consists of 529 mutually
incoherent point sources, imaged by the condenser at a distance of Δz = 10^5
before the object. The dark lens is an opaque circular disk of radius r0 = 2500,
placed a distance of Δz = 10^6 from the object. The image in (b) is computed at a
distance of z0 = 2.0 × 10^6 behind the dark lens.
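The integral in Eq. (2.3) is then readily computed, without further approximations,
yielding
\[ a(x, y, z = z_0) \approx -\left[ i/(x^2 + y^2 + z_0^2)^{1/2} \right] \exp\!\left[ i2\pi (x^2 + y^2 + z_0^2)^{1/2} \right] \times A(\sigma_{x0}, \sigma_{y0}) \Big/ \left[ 1 + (x/z_0)^2 + (y/z_0)^2 \right]^{1/2}. \tag{2.6} \]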
This is the so-called Fraunhofer (or far-field) distribution arising from the initial
distribution a(x, y, z = 0). The far field is expressed in terms of the Fourier transform
A(σx, σy) of the initial distribution evaluated at (σx0, σy0) = (x, y)/(x² + y² + z0²)^{1/2}.
Note how the obliquity factor cos θ = 1/[1 + (x/z0)² + (y/z0)²]^{1/2} enters the above
equation (see Figure 2.5).
If the far field is observed on a spherical surface of radius z0 centered on the
object (see Figure 2.5) then the curvature phase factor becomes a constant and
(σx0, σy0) reduces to (x/z0, y/z0), yielding the following simple formula for the
far-field pattern on a spherical surface of radius z0:
\[ a(x, y, z) \approx -(i/z_0) \exp(i2\pi z_0)\, A(x/z_0, y/z_0) \cos\theta. \tag{2.7} \]
The conservation of optical power passing through any cross-section of the beam
may be verified by integrating the squared modulus of the functions appearing in
Eqs. (2.6) and (2.7) over their respective domains.
The amplitude distribution at the entrance pupil (assumed to coincide with the 1st
principal plane) is denoted by a0(x1, y1). The coordinates at the 1st and 2nd
principal planes are related as follows: (x1, y1) = (fx, fy)/(x² + y² + f²)^{1/2}. The
corresponding infinitesimal areas in the two principal planes are in the ratio
cos³θ, where cos θ = f/(x² + y² + f²)^{1/2}; the amplitude in Eq. (2.8) is therefore
scaled by cos^{3/2}θ to conserve optical power between the entrance and exit pupils.
The exponential phase factor in Eq. (2.8) is the curvature imparted by a perfect
lens to the emergent beam.
To determine, in accordance with Eq. (2.2a), the Fourier transform of the initial
distribution given by Eq. (2.8), we invoke the stationary-phase approximation.1 The
exponent of the integrand under the Fourier integral may be expanded in a Taylor
series around its stationary point,
yielding
\[ x\sigma_x + y\sigma_y + (x^2 + y^2 + f^2)^{1/2} \approx (1 - \sigma_x^2 - \sigma_y^2)^{1/2} \Big\{ f + \frac{1}{f} \Big[ \tfrac{1}{2}(1 - \sigma_x^2)(x - x_0)^2 - \sigma_x\sigma_y (x - x_0)(y - y_0) + \tfrac{1}{2}(1 - \sigma_y^2)(y - y_0)^2 \Big] \Big\}. \tag{2.9} \]
Without any other approximations, the Fourier transform of the initial distribution
is found to be
\[ A(\sigma_x, \sigma_y) \approx -i f\, a_0(-f\sigma_x, -f\sigma_y) \exp\!\left[ -i2\pi f (1 - \sigma_x^2 - \sigma_y^2)^{1/2} \right] \Big/ (1 - \sigma_x^2 - \sigma_y^2)^{1/4}. \tag{2.10} \]
For a given distribution a0(x1, y1) at the entrance pupil, Eq. (2.11) gives
the distribution at and near the focal plane of the aplanatic lens of Figure 2.7. If
the final distribution is sought in the focal plane (i.e., z0 = f) and if the factor
cos^{1/2}θ = (1 − σx² − σy²)^{1/4} is ignored (i.e., the paraxial approximation), then the
focal-plane distribution becomes simply the Fourier transform of the entrance-pupil
distribution. For an aberration-free lens having a circular aperture of
radius ra = fNA, and for a uniform incident beam, the focal-plane distribution is
thus proportional to J1(2πNAr)/r, where J1(·) is the first-order Bessel function of
the first kind and r = (x2² + y2²)^{1/2}. This is known as the Airy pattern, a plot of
which appears in Figure 2.8.
Figure 2.8 Plot of the Airy function J1(2πr)/(πr) versus the radial distance r
from the focal point. The first zero of the Airy function is at r ≈ 0.61. The inset
shows a logarithmic plot of the intensity distribution at the focal plane of a
0.5NA diffraction-limited lens. This Airy pattern, being the result of a scalar
calculation, shows circular symmetry. In practice, both unpolarized and circularly
polarized incident beams produce circularly symmetric Airy patterns.
However, for linearly polarized light the Airy pattern tends to be slightly
elongated along the direction of the incident E-field.
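The location of the first zero quoted in the caption of Figure 2.8 is easily verified. The sketch below is an illustration under our own assumptions: J1 is evaluated from its standard integral representation rather than from a library routine, and the bracketing interval [0.5, 0.7] for the bisection is chosen by inspection of the plot.

```python
import numpy as np

def bessel_j1(x, n_tau=512):
    """J1(x) as the average of cos(tau - x sin tau) over one full period of tau."""
    tau = np.linspace(0.0, 2.0 * np.pi, n_tau, endpoint=False)
    return float(np.mean(np.cos(tau - x * np.sin(tau))))

def airy_amplitude(r):
    """The function J1(2 pi r)/(pi r) of Figure 2.8; it tends to 1 as r -> 0."""
    return bessel_j1(2.0 * np.pi * r) / (np.pi * r)

def first_zero(lo=0.5, hi=0.7, tol=1e-10):
    """Bisection, assuming a single sign change of the Airy function in [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if airy_amplitude(lo) * airy_amplitude(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

r_zero = first_zero()
print(f"first zero of the Airy function at r = {r_zero:.4f}")
```

For a lens of numerical aperture NA the same zero occurs at r ≈ 0.61/NA, in units of the wavelength.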
Figure 2.9 Fourier transform lens having focal length f and aperture radius
ra = fNA. The incident plane wave makes an angle θ with the Z-axis in the XZ-plane,
that is, (σx, σy) = (sin θ, 0). The beam emerging from the lens converges to
the point (x2, y2) = (f sin θ, 0) within the focal plane. The height of a ray entering
the lens at the first principal plane is the same as that of the emergent ray
measured on a spherical surface of radius f centered at the rear focal point F.
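propagating from the exit pupil to the focus at x2 = fσx. The total phase at this
focus (relative to that at F) is thus given by
\[ \phi(x_1, \sigma_x) = 2\pi \left\{ x_1\sigma_x + \left[ (x_1 - f\sigma_x)^2 + (f^2 - x_1^2) \right]^{1/2} - f \right\} = 2\pi f \left\{ (x_1/f)\sigma_x + \left[ 1 + \sigma_x^2 - 2(x_1/f)\sigma_x \right]^{1/2} - 1 \right\}. \tag{2.12} \]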
For small values of both x1/f and σx, the above expression may be approximated as
φ(x1, σx) ≈ πfσx², which is independent of x1. The various rays of the plane wave,
having thus acquired the common phase factor exp(iπfσx²), converge to a common
focus in the vicinity of the optical axis. Further away from the axis, of course,
higher-order terms will cause aberration. Unless the lens is properly designed to
correct these aberrations, the acceptable values of NA and σx will indeed be very
small. For example, Figure 2.10 shows plots of φ(x1, σx) − πfσx² versus x1/f for
several values of σx, for a lens having NA = 0.05 and f = 25 000. Note that to keep
the maximum phase deviation at the edge of the pupil below 90° one must restrict
the aperture radius to ra ≤ 0.05f and the values of σx to the range within ±0.055.
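The 90° criterion can be explored by evaluating Eq. (2.12) across the aperture. The sketch below assumes the lens parameters quoted above (NA = 0.05, f = 25 000); the function names and the sampling of the pupil are our own choices.

```python
import numpy as np

def focus_phase(x1, sigma_x, f):
    """Total phase of Eq. (2.12) in radians; lengths are in units of the wavelength."""
    u = x1 / f
    return 2.0 * np.pi * f * (u * sigma_x
                              + np.sqrt(1.0 + sigma_x**2 - 2.0 * u * sigma_x) - 1.0)

def max_phase_deviation_deg(sigma_x, f=25000.0, na=0.05, n=2001):
    """Largest |phi - pi f sigma_x^2| over the aperture |x1| <= f * NA, in degrees."""
    x1 = np.linspace(-f * na, f * na, n)
    deviation = focus_phase(x1, sigma_x, f) - np.pi * f * sigma_x**2
    return float(np.degrees(np.max(np.abs(deviation))))

for s in (0.01, 0.03, 0.055, 0.06):
    print(f"sigma_x = {s:5.3f}   max deviation = {max_phase_deviation_deg(s):6.1f} deg")
```

With these parameters the largest deviation occurs at the edge of the pupil, stays below 90° for σx = 0.055, and exceeds it for σx = 0.06.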
We conclude that, under appropriate conditions, a plane wave entering the
lens at σx = sin θ comes to diffraction-limited focus at x2 = fσx, with a phase
Figure 2.10 Plots of φ(x1, σx) − πfσx² versus x1/f for several values of σx = sin θ
from 0.01 to 0.06 in the system of Figure 2.9. The function φ is given by
Eq. (2.12), and the specific values of the lens parameters used in the calculations
are NA = 0.05, f = 25 000.
πfσx² = πx2²/f. Because of the finite aperture of the lens, the focused spot will
be not a geometric point but an Airy pattern of diameter 1/NA. Therefore, for an
object a0(x1, y1) at the entrance pupil the focal-plane distribution is related to the
Fourier transform A0(σx, σy) of the object as follows:
\[ a(x_2, y_2) \approx \left\{ \exp\!\left[ i\pi (x_2^2 + y_2^2)/f \right] A_0(x_2/f, y_2/f) \right\} * \mathrm{Airy}(x_2, y_2). \tag{2.13} \]
Needless to say, the range of (x2, y2) in Eq. (2.13) is limited to the region
for which the lens is properly designed to focus the incident plane waves into
diffraction-limited spots. In the absence of aberrations, the angular resolution of
such a lens is solely dependent on the lens-aperture radius ra and is given by
Δσx = Δσy ≈ 0.61/ra. (Like all other spatial dimensions in this chapter, ra is
assumed to be normalized by the wavelength λ of the light.)
Similar considerations apply when the object is placed a distance z1 before the
first principal plane. In this case each plane wave leaving the object must travel a
different distance to reach the entrance pupil. By the time it reaches the entrance
pupil, a plane wave traveling along the direction (σx, σy, σz) will have acquired
a phase 2πz1σz, which, apart from the constant 2πz1, may be approximated as
−πz1(σx² + σy²). Under these
Figure 2.11 (a), (b) Intensity and phase distributions in the XY-plane for an
object and (c), (d) for its Fourier transform. The object is in the front focal plane
of a 0.05NA lens having f = 10^5, illuminated with a plane wave propagating
along the Z-axis; the Fourier transform is observed in the rear focal plane. The
intensity distribution in the Fourier-transform plane, (c), is displayed on a
logarithmic scale to enhance its weak regions. The phase plots in (b) and (d) are
encoded in gray-scale (black represents −180°, white represents +180°).
circumstances, Eq. (2.13) remains valid provided the exponent of the first term on
the right-hand side is multiplied by (1 − z1/f). In the special case where z1 = f, the
quadratic phase factor in Eq. (2.13) disappears altogether, leaving a simple
Fourier-transform relation between the distributions in the front and rear focal
planes. As an example, Figure 2.11 shows a phase/amplitude object placed in the
front focal plane of a 0.05NA lens (see frames (a) and (b)), and the corresponding
Fourier transform as observed in the rear focal plane (frames (c) and (d)).5
Figure 2.12 Diagram of a simple imaging system. The object and image
distances from the respective principal planes are z1 and z2. The height of a ray
entering the lens is measured on a spherical surface of radius z1 centered at the
axial object point. Similarly, the height of a ray exiting the system is measured
on the spherical surface of radius z2 centered on the axial image point. For any
given ray, the entering and exiting heights are equal. Only one plane wave
(leaving the object at an angle θ) is shown. The various rays of this plane wave
converge to a focus in the image space, then continue to propagate to the image
plane.
Abbe’s sine condition, the pupils are spherical caps centered at the axial object and
image points. The distance between the object and the first principal plane is z1 and
that between the second principal plane and the image is z2. The lateral magnification
of the system, therefore, is M = z2/z1.
A plane wave leaving the object at an angle θ relative to the Z-axis emerges
from the exit pupil, each one of its rays having the same height and the same
optical phase as at the entrance pupil. Confining attention to the two-dimensional
XZ-plane, and denoting the direction cosine of a ray in the object space
by σx1 = sin θ, the ray height x at the entrance pupil is found from simple
geometry to be
\[ x = x_1 (1 - \sigma_{x1}^2) + \sigma_{x1} \left[ z_1^2 - x_1^2 (1 - \sigma_{x1}^2) \right]^{1/2}. \tag{2.14} \]
A ray leaving the object at x1 intersects the image at x2 = Mx1. Obviously, the
ray fan reaching the image plane in Figure 2.12 is not a plane wave. However, it
will be seen that this bundle of rays has a phase distribution that can be expressed
as the sum of a linear term φ1 and a nearly quadratic term φ2. The linear term is
identical with that of the plane wave leaving the object, namely,
\[ \phi_1(x_2, \sigma_{x2}) = 2\pi x_2 \sigma_{x2} = 2\pi x_1 \sigma_{x1}. \tag{2.15} \]
Figure 2.13 Plots of the function φ2(x2, σx1) versus x2 for several values of
σx1 = sin θ equal to (top to bottom) 0.00, 0.25, 0.50, 0.75, 1.00. (See Figure 2.12
and Eq. (2.16); x2 is related to x1 through x2 = Mx1.) The assumed system
parameters are z1 = 10^4, z2 = 10^5. The field of view in the image plane is confined
to the region |x2| < 250.
superimposed upon each other, produce in the image plane a magnified (or
demagnified) image of the object. Thus the differences between object and image
are: (i) the image is multiplied by a nearly quadratic phase factor, exp(iφ2); (ii)
the plane waves having a large angle θ miss the lens and, therefore, do not
contribute to the image. This truncation by a circular aperture in the Fourier
domain is equivalent to convolution with an Airy function in the image plane.
The amplitude distribution in the image plane is thus given by
\[ a_{\mathrm{image}}(x_2, y_2) = \left[ \exp(i\phi_2)\, a_{\mathrm{object}}(x_2/M, y_2/M) \right] * \mathrm{Airy}(x_2, y_2). \tag{2.18} \]
Figure 2.14 Distributions of intensity (left column) and phase (right column) at
the object and image planes of a coherent imaging system. The phase plots are
encoded in gray scale: black represents −180°, white represents +180°. (a), (b)
Distributions in the plane of the object. (c), (d) Image obtained with a 10×, 0.6NA
objective lens. (e), (f) Image obtained with a 10×, 0.95NA objective lens.
the image plane of a 10×, 0.95NA lens. The higher-NA lens, capturing more of
the high-frequency Fourier components of the object, yields a superior image.
Both lenses, however, fail to reproduce the very fine features of the object.
Around each stationary point one may expand g(x, y) in a Taylor series up to the
second-order term to obtain
Replacing the expression for g(x, y) in Eq. (A2.1) with that in Eq. (A2.3), and
taking f(x, y) outside the integral, yields
\[ I \approx \sum f(x_0, y_0) \exp[i\eta g(x_0, y_0)] \times \iint_{-\infty}^{\infty} \exp\!\left\{ i(\eta/2) \left[ g_{xx}(x - x_0)^2 + 2 g_{xy}(x - x_0)(y - y_0) + g_{yy}(y - y_0)^2 \right] \right\} dx\, dy, \tag{A2.4} \]
where the summation is over all stationary points (x0, y0). Notice that the domain
of integration is now extended to the entire plane, since the contribution to the
integral from regions outside the immediate neighborhood of the stationary points
is, in any event, negligible. The double integral in Eq. (A2.4) can be readily
carried out, yielding
\[ I \approx (2\pi i/\eta) \sum \nu\, \left| g_{xx} g_{yy} - g_{xy}^2 \right|^{-1/2} \exp[i\eta g(x_0, y_0)]\, f(x_0, y_0), \tag{A2.5} \]
where the summation is again over all stationary points (x0, y0) and the coefficient
ν is given by
\[ \nu = \begin{cases} -i & \text{if } g_{xx} g_{yy} < g_{xy}^2, \\ \pm 1 & \text{if } g_{xx} g_{yy} > g_{xy}^2 \text{ and } g_{xx} \gtrless 0. \end{cases} \]
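The accuracy of Eq. (A2.5) can be gauged by comparing it with a brute-force evaluation of the integral I = ∬ f(x, y) exp[iηg(x, y)] dx dy. The sketch below does this for a hypothetical test case of our own choosing, namely f = exp[−(x² + y²)] and g = x² + 0.5xy + 1.5y², with η = 40 and a single stationary point at the origin; the function name and the integration grid are likewise assumptions.

```python
import numpy as np

def stationary_phase_estimate(f0, g0, gxx, gxy, gyy, eta):
    """Eq. (A2.5) for a single stationary point (x0, y0)."""
    det = gxx * gyy - gxy**2
    if det < 0:
        nu = -1j                               # saddle point
    else:
        nu = 1.0 if gxx > 0 else -1.0          # minimum or maximum of g
    return (2j * np.pi / eta) * nu * abs(det) ** -0.5 * np.exp(1j * eta * g0) * f0

# Hypothetical test case with one stationary point at (0, 0), where g = 0 and f = 1.
eta, gxx, gxy, gyy = 40.0, 2.0, 0.5, 3.0
x = np.linspace(-4.0, 4.0, 1201)
h = x[1] - x[0]
X, Y = np.meshgrid(x, x)
g = 0.5 * (gxx * X**2 + 2.0 * gxy * X * Y + gyy * Y**2)
brute_force = np.sum(np.exp(-(X**2 + Y**2)) * np.exp(1j * eta * g)) * h * h
estimate = stationary_phase_estimate(1.0, 0.0, gxx, gxy, gyy, eta)
relative_error = abs(estimate - brute_force) / abs(brute_force)
print(brute_force, estimate, relative_error)
```

The residual discrepancy reflects the finite value of η; it shrinks as η is increased, since f(x, y) then varies ever more slowly compared with the oscillations of exp(iηg).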
Equation (A2.5) is the final result of this appendix. If the numerical value of
is not necessary here, but it simplifies the problem by enabling the polarization
state of individual rays at the exit pupil to be determined solely on the basis of
their coordinates, without requiring detailed knowledge of the lens structure.) For
a linearly polarized incident beam, Figure 3.1 shows the bending of the E-vector
at two azimuthal positions. The ray at the top of the lens contributes both an
X- and a Z-component to the distribution in the image space, whereas the ray in
the YZ-plane contributes only an X-component. By the same token, rays inter-
mediate between those shown here will contribute to the polarization along all
three axes.
We present a simple treatment of polarization-related phenomena within the
framework of the classical theory of diffraction. This will not be a rigorous
treatment based on Maxwell’s equations; rather, it will be rooted in reasonable
physical arguments based on the bending of rays (or plane waves) by prisms. Our
approach to vector diffraction is in keeping with the spirit of diffraction theory; it
is not exact as far as Maxwell’s equations are concerned but incorporates intuitive
ideas about the propagation of electromagnetic waves.
With reference to Figure 3.2, consider a plane wave propagating along the unit
vector σ0 = (0, 0, 1), i.e., along the Z-axis, having linear polarization in the
X-direction. Let a prism be placed in the path of this beam, with orientation such
that the emerging beam would propagate in a direction specified by the unit
vector σ1 = (σx, σy, σz). Now, the incident polarization vector E0 = (1, 0, 0) may
be decomposed into two components: one, the so-called p-polarization, is in the
plane of σ0 and σ1; the other, known as the s-polarization, is perpendicular to this
plane. As the latter component (perpendicular to the σ0σ1-plane) emerges from
the prism, it will have suffered no deviation in direction. The p-component,
however, will have been reoriented such that it remains perpendicular to the
emergent direction. If it is further assumed that no losses, due to surface
reflections or otherwise, occur in this refraction process, one can use simple
geometry to determine the emerging polarization direction. A similar calculation
can be performed for an incident plane wave linearly polarized along the Y-axis.
Details of these calculations are left to the reader, but the final results are listed in
Table 3.1. Notice that the reorientation of the polarization vector described in
Table 3.1, while a consequence of the refraction of the direction of propagation,
is independent of the particular mechanism responsible for refraction. Given
an initial direction σ0 and a direction for the emerging beam σ1, one can use
Table 3.1 to identify the emergent components of polarization for an arbitrary
state of incident polarization.
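The geometrical construction described above (the s-component left alone, the p-component turned with the ray) is readily coded. The sketch below is our own illustration of that construction for σ0 = (0, 0, 1) and lossless refraction; it is not a transcription of Table 3.1, and the function name is hypothetical. For E0 along X the construction gives E1 = (1 − σx²/(1 + σz), −σxσy/(1 + σz), −σx).

```python
import numpy as np

def refract_polarization(e0, sigma1):
    """Reorient e0 (perpendicular to sigma0 = Z) when the ray is bent from Z to sigma1."""
    sigma0 = np.array([0.0, 0.0, 1.0])
    sigma1 = np.asarray(sigma1, dtype=float)
    e0 = np.asarray(e0, dtype=float)
    axis = np.cross(sigma0, sigma1)        # normal to the sigma0-sigma1 plane (s-direction)
    sin_t = np.linalg.norm(axis)
    if sin_t < 1e-15:                      # no bending: polarization is unchanged
        return e0.copy()
    s_hat = axis / sin_t
    p_hat0 = np.cross(s_hat, sigma0)       # p-direction before refraction
    p_hat1 = np.cross(s_hat, sigma1)       # p-direction after refraction
    # The s-component is untouched; the p-component follows the ray.
    return np.dot(e0, s_hat) * s_hat + np.dot(e0, p_hat0) * p_hat1

sigma1 = np.array([0.3, 0.4, np.sqrt(1.0 - 0.3**2 - 0.4**2)])
e1 = refract_polarization([1.0, 0.0, 0.0], sigma1)
print(e1, np.dot(e1, sigma1))
```

The emergent vector has unit length and is perpendicular to σ1, as required of a lossless refraction.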
In the stationary-phase approximation each ray is associated with a single
plane wave, the three polarization components of which may be treated inde-
pendently of each other. Therefore, for each of the components Ex, Ey, Ez of the
emergent beam, a single superposition integral (i.e., Fourier transform) yields the
sought-after distribution in the focal plane.
Example
The technique described in the preceding section is quite general and can be
applied to arbitrary incident distributions having arbitrary polarization states,
while taking into account various lens aberrations (including substantial amounts
of defocus). Computed results for an aberration-free, aplanatic lens having NA
= sin 75° = 0.966 and f = 3000λ are shown in Figure 3.3. The assumed geometry
in these calculations is that depicted in Figure 3.1, where the incident beam is a
Table 3.1. Polarization E1 of a refracted beam when the incident polarization
E0 is along the X- or Y-axes. The refraction (from σ0 to σ1) is lossless
[Figure 3.3: intensity plots of (a) |Ex|², (b) |Ey|², (c) |Ez|² in the focal plane; axes x/λ and y/λ, each from −3 to +3.]
uniform plane wave with linear polarization along the X-axis. Frames (a)–(c) in
Figure 3.3 are intensity plots for the X-, Y-, and Z- components of polarization in
the focal plane; their peak intensities are in the ratio 1.00 : 0.0081 : 0.192. The
corresponding gray-scale plots appear in Figure 3.4; frames (a)–(c) show the
intensity distributions and frames (d)–(f) display their logarithmic counterparts.
The observed four-fold symmetry of the Y-component and the two-fold symmetry
Figure 3.5 Contour plot representing the sum of the three intensity profiles
shown in Figure 3.3, i.e., the total E-field energy density distribution in the focal
plane of the aplanatic lens.
A Gaussian beam is perhaps the simplest possible waveform that shows many of
the effects of diffraction. Using Gaussian beams one can study diffraction in the
near field and the far field, examine beam divergence upon propagation, inves-
tigate diffraction-limited focusing through a lens, observe the Gouy phase shift,
and analyze many other interesting properties of electromagnetic waves.
Although Gaussian beams have been thoroughly analyzed in the literature,1,2 it
is worthwhile to examine them in the Fourier domain from a less well-known
perspective. The need for the paraxial approximation (inherent in all treatments of
Gaussian beams) becomes particularly clear when employing the Fourier method
of analysis. There is also the issue of separability of the x- and y- dependences of
the Gaussian beam profile (assuming propagation along the Z-axis), which is often
assumed but not properly explained in the literature. It turns out that separability is
neither necessary nor desirable and that the two-dimensional analysis of a non-
separable beam is quite straightforward. It must be emphasized that separability is
not always achievable by rotating the coordinate axes. When the real and imaginary
parts of the Gaussian exponent require different rotations to become separable, the
x- and y-dependences remain entangled, thus necessitating a two-dimensional
analysis.
Here the complex constant â0 is the amplitude at the origin of the coordinate
system and the coefficients α = (α1 + iα2), β = (β1 + iβ2), γ = (γ1 + iγ2) are fixed
The real parts of the α, β, γ parameters determine the profile of the beam’s
magnitude in the XY-plane at z = 0, while their imaginary parts determine the
beam’s phase profile. The contours of constant magnitude are ellipses oriented at
θ1 relative to X, where tan 2θ1 = 2β1/(α1 − γ1); the major and minor diameters of
these ellipses are proportional to
\[ \left[ (\alpha_1 + \gamma_1) \mp \sqrt{(\alpha_1 - \gamma_1)^2 + 4\beta_1^2}\, \right]^{-1/2}. \]
The phase contours are ellipses or hyperbolas whose axes are oriented at θ2 relative to
X, where tan 2θ2 = 2β2/(α2 − γ2). In general θ1 ≠ θ2 and therefore coordinate
rotations cannot separate the x- and y-dependences of the Gaussian beam profile.
When α2γ2 > β2² the contours of constant phase are ellipses; otherwise, they are
hyperbolas. Figure 4.1 shows two examples of amplitude and phase distributions
for Gaussian beams having different sets of the α, β, γ parameters.
When a beam travels a distance z0 in free space, its Fourier transform is multiplied
by the transfer function of propagation, exp[i2πz0(1 − σx² − σy²)^{1/2}] (see chapter 2,
Figure 4.1 Distributions of intensity (left) and phase (right) in the cross-sections
of two Gaussian beams having different α, β, γ parameters. The phase
plots are encoded in gray-scale, black representing −180° and white representing
+180°. (a), (b) α = 0.009 − 0.023i, β = 0.006 − 0.002i, γ = 0.012 − 0.016i; (c),
(d) α = 0.011 − 0.023i, β = 0.01 − 0.003i, γ = 0.016 + 0.012i.
into Â(σx, σy) of Eq. (4.2a) converts â0 to â0 exp(i2πz0), a to a + iz0, and c to
c + iz0, while keeping b unchanged. The beam’s Fourier transform thus retains its
Gaussian form and, consequently, the profile of the beam at z = z0 remains
Gaussian, albeit with different α, β, γ parameters and with a different value for â0.
It is readily verified that the new parameters of the beam at z = z0 are given by
\[ \begin{pmatrix} \alpha' & \beta' \\ \beta' & \gamma' \end{pmatrix} = \begin{pmatrix} a + iz_0 & b \\ b & c + iz_0 \end{pmatrix}^{-1}, \tag{4.3a} \]
\[ \hat{a}_0' = \hat{a}_0 \exp(i2\pi z_0) \sqrt{(\alpha'\gamma' - \beta'^2)/(\alpha\gamma - \beta^2)} = \hat{a}_0 \exp(i2\pi z_0) \Big/ \sqrt{1 - (\alpha\gamma - \beta^2) z_0^2 + i(\alpha + \gamma) z_0}. \tag{4.3b} \]
Thus the beam remains Gaussian as it propagates along Z, but its magnitude and
phase profiles change continuously. Figure 4.2 shows computed cross-sectional
profiles for a beam at several locations along the Z-axis. Note how the elliptical
Figure 4.2 Distributions of intensity (left) and phase (right) in the cross-sectional
planes of a Gaussian beam propagating along the Z-axis. The parameters
of the beam at z = 0 are α = 0.01 + 0.05i, β = 0.005 − 0.04i, γ = 0.02 − 0.12i.
The phase plots are encoded in gray-scale, black representing −180° and white
representing +180°. From top to bottom, the propagation distances along Z are 0,
5λ, 25λ, and 1000λ. In the bottom right-hand frame the far-field curvature phase
factor (corresponding to α2 = γ2 = −0.001) has been subtracted.
cross-section of the intensity profile rotates with increasing z0 and also how the
phase contours change from hyperbolas to ellipses and vice versa.
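Equations (4.3a) and (4.3b) amount to two matrix inversions and are easily evaluated. The sketch below rests on assumptions that should be stated plainly: the beam profile is taken to be â0 exp[−π(αx² + 2βxy + γy²)], the Fourier-domain parameters a, b, c are taken to form the inverse of the matrix of α, β, γ (which is what Eq. (4.3a) implies at z0 = 0), and the sample beam uses the z = 0 parameters of Figure 4.2 with signs as we read them. The function names are hypothetical.

```python
import numpy as np

def propagate_gaussian(alpha, beta, gamma, amp0, z0):
    """Apply Eqs. (4.3a) and (4.3b); z0 is in units of the wavelength."""
    profile = np.array([[alpha, beta], [beta, gamma]], dtype=complex)
    spectrum = np.linalg.inv(profile)              # a, b, c (assumed inverse relation)
    spectrum_z = spectrum + 1j * z0 * np.eye(2)    # a -> a + i z0, c -> c + i z0
    profile_z = np.linalg.inv(spectrum_z)          # alpha', beta', gamma'
    amp_z = amp0 * np.exp(2j * np.pi * z0) * np.sqrt(
        np.linalg.det(profile_z) / np.linalg.det(profile))
    return profile_z[0, 0], profile_z[0, 1], profile_z[1, 1], amp_z

def beam_power(alpha, beta, gamma, amp):
    """Integral of |a(x, y)|^2 over the XY-plane for the assumed Gaussian profile."""
    det_real = alpha.real * gamma.real - beta.real**2
    return abs(amp)**2 / (2.0 * np.sqrt(det_real))

alpha, beta, gamma = 0.01 + 0.05j, 0.005 - 0.04j, 0.02 - 0.12j
for z0 in (0.0, 5.0, 25.0, 1000.0):
    al, be, ga, amp = propagate_gaussian(alpha, beta, gamma, 1.0, z0)
    print(z0, al, be, ga, beam_power(al, be, ga, amp))
```

The last printed column, the total power in the cross-section, should be the same at every z0; this is a convenient check on any implementation of Eqs. (4.3).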
waist but if a waist exists then the α, β, γ parameters in that cross-section will be
real. A question arises as to when an arbitrary Gaussian beam (for which the
α, β, γ parameters at a given cross-section are complex) can be said to have
a waist. In other words, does a value of z0 (positive or negative) exist at which
α′, β′, γ′ are real? According to Eq. (4.3a), this requirement is met if b is real and
the imaginary parts of a and c are identical, so that iz0 will end up canceling
their imaginary parts. This is equivalent to requiring both b and a − c to be
real-valued.
Considering the relationship between a, b, c and α, β, γ in Eq. (4.2b), it is not
difficult to show that the necessary and sufficient condition for an arbitrary
Gaussian beam to have a waist is that, in the complex plane, the three vectors β,
α − γ, and αγ − β² must be parallel (or antiparallel) to each other. In other words,
these three complex numbers must lie along a straight line that goes through
the origin of the complex plane. This requirement, of course, is in addition to the
other Gaussian beam requirements, namely, α1 ≥ 0, γ1 ≥ 0, α1γ1 ≥ β1². When the α,
β, γ parameters satisfy all the above constraints, the beam will have a waist at a
specific location along the Z-axis. The waist is unique, because there is only one
value of z0 that can cancel the imaginary parts of both a and c in Eq. (4.3a).
When a waist exists, there is symmetry between the locations before and after
the waist. Let the waist be at z = 0. Then the α, β, γ parameters at this location
will be real, which means that the corresponding a, b, c are real as well. Now, any
value of z0 will make a and c complex, while −z0 will yield the conjugates of the
same a and c. Therefore, the α, β, γ parameters on opposite sides of the waist will
be complex conjugates of each other. This means that the intensity profiles on
opposite sides are identical, while the phase profiles differ by a minus sign. The
beam is always convergent before, and divergent after, the waist.
approaches −90° for sufficiently large z0. Similarly, when z0 goes from 0 to
negative values, ψ moves toward +90°. It is thus seen that, in crossing the waist,
4 Gaussian beam optics 57
the beam undergoes a 180° phase shift. This phase shift, which is particularly
rapid near the focus of a lens, was first observed experimentally by the French
physicist L. Georges Gouy in 1890.2,3,4
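The 180° swing can be checked with a few lines of arithmetic. This is a minimal sketch under stated assumptions: a circularly symmetric beam exp[−πα(x² + y²)] with real α at the waist, the propagation rule 1/α′ = 1/α + iz0 (z0 in units of λ), and an on-axis amplitude that scales as α′/α = 1/(1 + iαz0). The value of α and the function name are illustrative.

```python
import numpy as np

def on_axis_phase_deg(alpha, z0):
    """Phase, in degrees, of alpha'/alpha = 1/(1 + i*alpha*z0) for real alpha."""
    return np.degrees(np.angle(1.0 / (1.0 + 1j * alpha * z0)))

alpha = 0.01                                      # assumed waist parameter, units of 1/lambda^2
far_after = on_axis_phase_deg(alpha, 1e7)         # far beyond the waist
far_before = on_axis_phase_deg(alpha, -1e7)       # far ahead of the waist
one_range = on_axis_phase_deg(alpha, 1 / alpha)   # at z0 = 1/alpha
total_shift = far_before - far_after
```

Under these assumptions the phase is −45° at z0 = 1/α and tends to ∓90° far after and far before the waist, so `total_shift` approaches 180°.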
To demonstrate an observable effect of the Gouy phase, consider the experiment
depicted in Figure 4.3. Here an aberration-free lens is split into two identical halves,
and the upper half-lens is translated forward by Δz = 300λ. A collimated uniform
beam of light is directed at the split lens, and the distribution of intensity in the region
between the two foci, F1 and F2, is monitored. Figure 4.4 shows computed intensity
patterns in a vertical plane half-way between F1 and F2 when (a) the upper half-lens
is blocked, (b) the lower half-lens is blocked, and (c) the light is allowed to go
through both half-lenses. For locations near the Z-axis, where the optical path lengths
are nearly identical, the light amplitudes contributed by the two half-lenses are
expected to be in phase, resulting in constructive interference. However, as
Figure 4.4(c) clearly demonstrates, the vicinity of the optical axis is dark. This
destructive interference is caused by the nearly 180° Gouy phase shift between
the “before-focus” and “after-focus” beams arriving from the two half-lenses.
Figure 4.5 shows several cross-sectional plots of intensity distribution in the region
between F1 and F2, starting at F1 and moving in steps of 37.5λ to F2. The intensity at
and near the Z-axis is seen to diminish as the mid-plane between the foci is
approached from either side.
To account for propagation by a distance z0 along the Z-axis, the Fourier transform
of the initial distribution in Eq. (4.6a) is multiplied by the transfer function of free-
space propagation, which, in the paraxial approximation, is exp(i2πz0) exp(−iπz0σ²).
This means that the coefficient 1/α in the exponent of the Gaussian function on the
right-hand side of Eq. (4.7) is augmented by iz0, yielding
\[ \frac{1}{\alpha'} = \frac{1}{\alpha} + i z_0. \tag{4.8} \]
The light amplitude distribution at z = z0 is then obtained by an inverse Fourier
transform, yielding
\[ \hat{a}(x, z = z_0) = \hat{a}_0 \exp(i 2\pi z_0)\,(\alpha'/\alpha)^{(n+1)/2}\, H_n\!\left(\sqrt{\pi\alpha'}\,x\right) \exp(-\pi\alpha' x^2). \tag{4.9} \]
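Equations (4.8) and (4.9) can be checked numerically for the lowest-order case n = 0, where the Hermite polynomial is unity. The sketch below is a sanity check under assumptions, not the author's computation: lengths are in units of λ, the common factor exp(i2πz0) is omitted from both routes, and the values of α and z0 and the grid size are illustrative.

```python
import numpy as np

alpha = 0.02          # real waist parameter (assumed value)
z0 = 30.0             # propagation distance in wavelengths (assumed value)

N, width = 4096, 400.0                      # grid size and window width (wavelengths)
dx = width / N
x = (np.arange(N) - N // 2) * dx            # x = 0 sits at index N // 2
sigma = np.fft.fftfreq(N, d=dx)             # spatial frequency

# Numerical route: FFT, multiply by the paraxial transfer function, inverse FFT.
field0 = np.exp(-np.pi * alpha * x**2)
spectrum = np.fft.fft(np.fft.ifftshift(field0))
transfer = np.exp(-1j * np.pi * z0 * sigma**2)
field_num = np.fft.fftshift(np.fft.ifft(spectrum * transfer))

# Analytic route: Eq. (4.8) for alpha', then Eq. (4.9) with n = 0.
alpha_p = 1.0 / (1.0 / alpha + 1j * z0)
field_ana = np.sqrt(alpha_p / alpha) * np.exp(-np.pi * alpha_p * x**2)

max_error = np.max(np.abs(field_num - field_ana))
```

With these values the two routes agree to within rounding error, because the Gaussian is well sampled and vanishes long before the edge of the window.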
The basic elements of an imaging system are shown in Figure 5.1. The light
from a source, either coherent (e.g., a laser) or incoherent (e.g., an incandescent
lamp or an arc lamp), is collected by the illumination optics (e.g., a condenser
lens) and projected onto the object. An image is then formed by an objective
lens upon a screen, a photographic plate, a CCD camera, the retina of an eye,
etc. Assuming that the objective lens is free from aberrations, the resolution
and the contrast of the image are determined not only by the numerical aperture
of the objective lens but also by the properties of the light source and the
illumination optics.
5 Coherent and incoherent imaging 63
Figure 5.1 Schematic diagram of a simple imaging system. The light source is
projected by the illumination optics onto an object, allowing the objective lens to
form an image of this object at the image plane.
Figure 5.2 Computed intensity patterns at the plane of the object corresponding
to various types of illumination. (a) Logarithmic plot (α = 4) of the
intensity distribution obtained from a coherent source with a 0.03NA condenser
lens. (b) Logarithmic plot (α = 4) of the intensity distribution obtained from a
coherent source with a 0.25NA condenser lens. The beam is focused to a plane
located just 50λ before the plane of the object. (c) Same as (b) but showing the
intensity distribution rather than its logarithm. (d) Intensity distribution
corresponding to an incoherent light source consisting of 37 independent point sources
obtained with a 0.25NA condenser lens. Again the source is imaged to a plane
located 50λ before the plane of the object.
of 50λ before the object. The beam incident on the object is, therefore, divergent
and, although it covers the area of interest, its intensity distribution is not very
uniform. This nonuniformity may be better appreciated by considering the cor-
responding plot of intensity distribution in Figure 5.2(c). (Note the different
scales of Figures 5.2(b), (c).)
The third type of illumination to be examined is incoherent illumination. We
emphasize at the outset that our concern here is solely with spatial incoherence
and, as such, we will assume that the source is quasi-monochromatic. (Departure
from monochromaticity is a requirement for any source that is to exhibit
spatial incoherence; the bandwidth of the source can nonetheless be narrow
enough to give its light a long coherence time, making it in effect a temporally
coherent source.) To simulate an incoherent source we assumed that the quasi-
monochromatic light emerging from a fiber bundle consisting of 37 fibers is
imaged with a 0.25NA condenser lens to a plane located a distance of 50λ before
the plane of the object in Figure 5.1. Each fiber within the bundle acts as a
coherent point source whose projected intensity distribution at the object plane
will be the same as that shown in Figure 5.2(c). When these fibers are properly
arranged in space and their intensity distributions added together, we obtain the
intensity pattern displayed in Figure 5.2(d). This is a fairly uniform distribution
over its central region, which is where the objects of interest will be placed.
Although the source could have been imaged directly onto the object plane in
this case, the 50λ defocus helps to create a more uniform illumination. With this
type of illumination, in order to compute the intensity distribution at the image
plane, we treat the 37 fibers as independent point sources – each a coherent
point source in its own right. We then compute the image obtained with each
source independently, and add the intensities of the resulting 37 images together
to obtain the final image.
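The recipe of imaging each source coherently and then adding intensities can be sketched in one dimension. The model below is a deliberate simplification and not the simulation used for the figures: each point source is replaced by a tilted plane wave, the objective is an ideal low-pass pupil, lengths are in units of λ, and the grid size, source tilts, and function names are all assumptions made for illustration.

```python
import numpy as np

N, L = 1024, 96.0                        # samples and window width (wavelengths)
x = np.arange(N) * (L / N)
sigma = np.fft.fftfreq(N, d=L / N)       # spatial frequency (cycles per wavelength)

def coherent_image(obj, na, sine=0.0):
    """Image intensity for a single coherent plane wave with direction sine `sine`."""
    illum = np.exp(2j * np.pi * sine * x)            # tilted illumination
    pupil = np.abs(sigma) <= na                      # ideal aberration-free objective
    amplitude = np.fft.ifft(np.fft.fft(obj * illum) * pupil)
    return np.abs(amplitude) ** 2

def incoherent_image(obj, na, sines):
    """Average of the single-source intensities (no cross terms between sources)."""
    return np.mean([coherent_image(obj, na, s) for s in sines], axis=0)

def contrast(img):
    return (img.max() - img.min()) / (img.max() + img.min())

grating = ((x % 3.0) < 1.5).astype(float)            # period 3 wavelengths, 50% duty cycle
sines = np.array([-18, -12, -6, 0, 6, 12, 18]) / L   # seven sources inside a 0.25NA cone

c_coherent = contrast(coherent_image(grating, 0.25))
c_incoherent = contrast(incoherent_image(grating, 0.25, sines))
```

In this toy model the normally incident coherent beam yields a featureless image, since the first diffracted orders of the 3λ grating miss the 0.25NA pupil, while the sum of intensities over the tilted sources shows clear fringes.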
Figure 5.3 (a) Amplitude grating with a period of 3λ and a 50% duty
cycle, used as the object in some of the simulations. (b) Pattern of marks with
different sizes and separations on a uniform background. In some cases these
marks will be black on a transparent background, in other cases they will
be transparent marks on a black background, in yet other cases they will be
phase objects with 100% transmissivity, imparting a 180° phase shift to the
incident beam.
examine the images of this grating under both coherent and incoherent illu-
mination and draw certain conclusions about the classical treatment of this
problem.
The second type of object with which we will be concerned is a mask
imprinted with seven marks of various sizes and shapes, shown in Figure 5.3(b).
The largest mark is 10λ long, and the smallest mark is 3λ wide. These marks are
large enough to yield a reasonably clear image with both coherent and incoherent
illumination. In one case the marks will be assumed to be bright objects on a dark
background, in another case they will be dark objects on a bright background, in
yet a third case they will be 180° phase objects having the same amplitude
transmissivity as the background.
[Diagram labels: incident beam, grating, 0th order, −1st order.]
Figure 5.5 Computed intensity distribution (a) at the exit pupil of the
objective lens and (b) at the image plane, corresponding to coherent illumin-
ation with the divergent beam of Figure 5.2(c). The object is the grating of
Figure 5.3(a).
“image” of the grating will be formed. Figure 5.5 shows computed plots of
intensity distribution (a) at the exit pupil of the objective lens, where the overlap
between the zeroth-order and the first-order beams is clearly visible, and (b) at
the image plane, where an “image” of the grating is seen superimposed on a
nonuniform pattern of illumination. The reason that a coherent cone of light
produces an image of the grating whereas a collimated beam fails to do so may be
understood by studying Figure 5.6: the diffracted first-order cones are captured
by the objective lens as long as the lens’s NA is greater than λ/(2P). The MTF
cutoff frequency for this type of illumination, therefore, is f_c = 2NA/λ.
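The capture condition reduces to simple arithmetic, sketched below. The cone-illumination branch follows the condition stated above (NA > λ/(2P), f_c = 2NA/λ) and assumes that the illuminating cone is at least as wide as the objective's acceptance cone. The collimated branch uses the usual result for a normally incident coherent beam (NA > λ/P), which is an assumption here rather than a quotation from the text. Function names are illustrative.

```python
def min_objective_na(period, wavelength=1.0, cone_illumination=True):
    """Smallest objective NA that still captures a first diffracted order."""
    factor = 2.0 if cone_illumination else 1.0
    return wavelength / (factor * period)

def cutoff_frequency(na, wavelength=1.0, cone_illumination=True):
    """Highest grating frequency (1/P) that still produces an image."""
    factor = 2.0 if cone_illumination else 1.0
    return factor * na / wavelength

# Grating of Figure 5.3(a): period 3 wavelengths, so its frequency is 1/3.
na_cone = min_objective_na(3.0)                                # 1/6
na_collimated = min_objective_na(3.0, cone_illumination=False) # 1/3
fc_cone = cutoff_frequency(0.25)                               # 0.5
fc_collimated = cutoff_frequency(0.25, cone_illumination=False)
```

For a 0.25NA objective the grating frequency of 1/3 (in units of 1/λ) lies below the cone-illumination cutoff but above the collimated-beam cutoff, which matches the qualitative behavior described above.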
[Diagram labels: incident beam, condenser lens, grating, objective lens, −1st-order cone.]
Figure 5.7 Computed intensity distribution (a) at the exit pupil of the objective
lens and (b) at the image plane, corresponding to the incoherent illumination
depicted in Figure 5.2(d). The object is the grating of Figure 5.3(a).
Finally we assume that the marks on the mask of Figure 5.3(b) represent
transparent phase objects that impart a phase shift of 180° (relative to the
background) to the incident beam. Figure 5.12 shows the computed intensity
distributions at the objective’s exit pupil and at the image plane, for the case of
illumination by the collimated coherent beam of Figure 5.2(a). Figure 5.13
shows the corresponding distributions for incoherent illumination. Note how
diffraction from mark boundaries can create an “image” of the marks in a case
where no explicit phase-contrast mechanism is present.3 In the two simulations
depicted in Figures 5.10 and 5.12, the amplitude transmission functions of the
respective objects differ only by an additive constant term. Therefore, the image
in Figure 5.10(b), for instance, may be derived from that in Figure 5.12(b) by
the addition of the image of the incident beam, it being understood that the
quantities being added are the complex amplitudes, not the intensities.
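The additive-constant relation can be written out explicitly. The following is a sketch under assumptions: m(x, y) denotes a hypothetical indicator function equal to 1 inside the marks and 0 elsewhere, and 𝓛 stands for the linear operation that maps the complex amplitude leaving the object to the complex amplitude at the image plane.

```latex
% Assumed notation: m(x,y) = 1 inside the marks and 0 outside.
\begin{align*}
  t_{\text{phase}}(x,y)  &= 1 - 2m(x,y)
      && \text{(marks at } -1 \text{, background at } +1\text{)} \\
  t_{\text{dark}}(x,y)   &= 1 - m(x,y) = \tfrac12\,[\,1 + t_{\text{phase}}(x,y)\,]
      && \text{(opaque marks, clear background)} \\
  t_{\text{bright}}(x,y) &= m(x,y) = \tfrac12\,[\,1 - t_{\text{phase}}(x,y)\,]
      && \text{(clear marks, opaque background)}
\end{align*}
% Imaging is linear in complex amplitude, so for an incident amplitude a_inc:
\[
  \mathcal{L}\{t_{\text{dark}}\,a_{\text{inc}}\}
    = \tfrac12\,\mathcal{L}\{a_{\text{inc}}\}
    + \tfrac12\,\mathcal{L}\{t_{\text{phase}}\,a_{\text{inc}}\}.
\]
```

Apart from the scale factor of one-half (and a sign in the bright-mark case), each amplitude object differs from the phase object only by a constant, so its image amplitude is the phase-object image amplitude plus the image amplitude of the unobstructed incident beam. The intensities themselves do not add in this way.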
Figure 5.12 Same as Figure 5.10 but for a phase object. The assumed object in
this case is the mask of Figure 5.3(b), which has uniform transmissivity
everywhere; its marks impart a relative phase shift of 180° to the incident beam.
Figure 5.13 Same as Figure 5.11 but for a phase object. The assumed object in
this case is the mask of Figure 5.3(b), which has uniform transmissivity
everywhere; its marks impart a relative phase shift of 180° to the incident beam.
(For the logarithmic plot in (c) α = 1.4.)
†
This chapter is coauthored with Ewan M. Wright, Professor of Optical Sciences at the University of Arizona.
6 First-order temporal coherence in classical optics 75
where Aₙ and φₙ are the amplitude and phase of the spectral component whose
frequency is fₙ, and c is the speed of light in vacuum. The constant multiplier
(Δf)^{1/2} is for normalization purposes only, its significance becoming clear as the
discussion proceeds. We set the central frequency f0 = 4.74 × 10¹⁴ Hz (corresponding
to λ0 = 632.8 nm) and Δf = 4.74 × 10¹² Hz, which leads to N0 = 100. We
adopt a Gaussian shape for the distribution of the amplitudes Aₙ, as shown in
Figure 6.1(a), and let the values of n in Eq. (6.2) range from −15 to +14, for a
total of 30 discrete wavelengths in the spectrum. To a large extent these choices
are arbitrary, but the points that we seek to clarify by way of examples based on
these choices are quite general in nature.
Throughout this chapter the same amplitude coefficients {An} are assumed for
all realizations of the waveform a(z, t), but the phase angles {φₙ}, although fixed
for any particular waveform, differ for different realizations. The statistical pro-
perties of a(z, t) are thus uniquely determined by the joint probability distribution
over {φₙ}. Furthermore, we consider stationary processes for which the ensemble
average over different phase-angle realizations coincides with the time average
derived from a single realization. This restriction of randomness to spectral phase
simplifies the discussion without affecting the validity of the final results.
Since the spectrum in Figure 6.1(a) is a discrete function of frequency, the
corresponding amplitude a(z, t) considered either as a function of time at a
fixed point z, or as a function of z at a given instant of time t, will be periodic.
With z fixed, for example, the period of the function in the time domain will be
[Figure 6.1: (a) spectral amplitude A(f) versus frequency (10¹⁴ Hz); (b), (c) waveform a(t) versus time (fs).]
frequencies in between those that are already chosen. In this way, both the
spectrum and the wave packet retain their shapes but Δf becomes smaller while T
becomes larger. In the limit Δf → 0 the separation T between adjacent wave
packets approaches infinity.
Where the first-order coherence of a given waveform is concerned, the phase
distribution over its spectral range is irrelevant, even though the shape of the
waveform as a function of time is significantly affected by this phase distribution.
For example, in Figure 6.1(b) the phase φₙ is assumed to be a linear function of
frequency, whereas if φₙ is picked randomly for each fₙ then an extended function
such as that in Figure 6.2 is obtained. (The latter might, for example, be the output
of a multi-longitudinal-mode laser.) There are many possible choices for {φₙ} and
each choice yields a more or less extended function of time. Only on rare occasions
do we find a compact wave packet similar to that in Figure 6.1(b). However, all
functions obtained by different choices of {φₙ} are identical in their first-order
coherence attributes. In other words, the compact packet of Figure 6.1(b) has the
same degree of first-order coherence as the extended waveform of Figure 6.2.
The time-averaged intensity of the waveform at an arbitrary point z ¼ z0 is
readily computed from Eq. (6.2) as follows:
\[ \langle I(z = z_0) \rangle = \frac{1}{T}\int_0^T a^2(z = z_0, t)\,dt = \frac{1}{2}\sum_n A_n^2\,\Delta f. \tag{6.3} \]
Note that the right-hand side of Eq. (6.3), being the area under the square of the
spectral distribution of Figure 6.1(a), remains constant as the sampling rate
increases. Thus reducing Δf in order to increase the period T does not affect the
average intensity of the waveform.
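Equation (6.3) is easy to verify by direct synthesis. The sketch below builds the waveform at a fixed z from the 30 spectral components and compares the time-averaged a² with ½ΣAₙ²Δf for two phase choices. It makes several assumptions: frequencies are expressed in fs⁻¹, the z-dependence is absorbed into the phases, the waveform has the form Σ Aₙ(Δf)^{1/2} cos(2πfₙt − φₙ), and the width of the Gaussian envelope (8 spectral lines) is invented for illustration because the text does not state it.

```python
import numpy as np

df = 0.00474                               # spectral spacing in 1/fs (4.74e12 Hz)
orders = 100 + np.arange(-15, 15)          # N0 + n for n = -15 ... +14
fn = orders * df                           # component frequencies in 1/fs
A = np.exp(-((orders - 100) / 8.0) ** 2)   # assumed Gaussian envelope
T = 1.0 / df                               # period of the waveform, about 211 fs
t = np.arange(4096) * (T / 4096)           # one full period, endpoint excluded

def waveform(phases):
    """a(t) = sum_n A_n sqrt(df) cos(2 pi f_n t - phi_n) at a fixed point z."""
    arg = 2 * np.pi * np.outer(fn, t) - phases[:, None]
    return np.sqrt(df) * (A[:, None] * np.cos(arg)).sum(axis=0)

rng = np.random.default_rng(1)
linear_phases = 2 * np.pi * fn * (T / 2)            # compact packet centred at T/2
random_phases = rng.uniform(0, 2 * np.pi, fn.size)  # extended waveform

expected = 0.5 * np.sum(A**2) * df                  # right-hand side of Eq. (6.3)
mean_linear = np.mean(waveform(linear_phases) ** 2)
mean_random = np.mean(waveform(random_phases) ** 2)
```

Both time averages agree with `expected` to rounding error, which illustrates that the average intensity does not depend on the phase distribution.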
[Figure 6.2: extended waveform a(t) versus time (fs), plotted from 0 to 200 fs.]
[Figure residue: interferometer schematic (incident beam, beam-splitters, mirrors, movable reflector displaced by Δz, photodetectors 1 and 2), and a plot of the signal S1 versus Δz (μm).]
The first term on the right-hand side of this equation is a constant, independent
of τ, while the second term is the autocorrelation function of the waveform a(t)
and coincides with the first-order field coherence function in the case of a
stationary process. The Fourier series coefficients of this autocorrelation
function are {Aₙ²} and are independent of {φₙ}. It is thus clear that the signals
S1(τ) and S2(τ), and hence the first-order temporal coherence of the waveform,
depend only on the magnitude, and not the phase, of the spectral distribution,
as was asserted earlier.
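The phase independence of the autocorrelation can be confirmed numerically. The sketch below is illustrative and relies on assumptions: the waveform is Σ Aₙ(Δf)^{1/2} cos(2πfₙt − φₙ), frequencies are in fs⁻¹, the Gaussian envelope width is invented, and the autocorrelation is computed as a circular time average over one period using the FFT.

```python
import numpy as np

df = 0.00474                               # spectral spacing in 1/fs
orders = 100 + np.arange(-15, 15)          # 30 components
fn = orders * df
A = np.exp(-((orders - 100) / 8.0) ** 2)   # assumed Gaussian envelope
T = 1.0 / df
M = 4096
t = np.arange(M) * (T / M)                 # delays tau share this grid

def waveform(phases):
    arg = 2 * np.pi * np.outer(fn, t) - phases[:, None]
    return np.sqrt(df) * (A[:, None] * np.cos(arg)).sum(axis=0)

def autocorrelation(a):
    """Time average of a(t) a(t + tau) over one period (circular), via the FFT."""
    power = np.abs(np.fft.fft(a)) ** 2
    return np.fft.ifft(power).real / a.size

rng = np.random.default_rng(2)
c_packet = autocorrelation(waveform(np.zeros(fn.size)))
c_random = autocorrelation(waveform(rng.uniform(0, 2 * np.pi, fn.size)))

# Fourier series with coefficients proportional to A_n^2 and no phase dependence.
c_theory = 0.5 * df * (A[:, None] ** 2 * np.cos(2 * np.pi * np.outer(fn, t))).sum(axis=0)
```

The compact packet and the random-phase waveform produce the same autocorrelation, and both match the cosine series whose coefficients are Aₙ².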
Coherence length
Figure 6.5 shows the waveforms arriving in channels 1 and 2 when the wave
packet of Figure 6.1(b) is sent through the interferometer, with its movable arm
extended by Δz = cT/8 = 7.91 μm. The time delay between the packets traveling
in the two arms is therefore τ = T/4. Since this delay is longer than the duration of
each packet, the two packets upon arriving at the second BS do not overlap and,
therefore, appear separately in both channels. Obviously no interference takes
Figure 6.5 Waveforms arriving at (a) channel 1 and (b) channel 2 of the
Mach–Zehnder interferometer. The assumed incoming beam is the packet
of Figure 6.1, and the movable arm of the interferometer has been extended by
Δz = cT/8 = 7.91 μm. Because the delay is longer than the width of the packet
no interference takes place. The two packets act independently and appear in
both channels, albeit at half the original magnitude of the incoming wave. Note
that the first packet in channel 2, having been transmitted through both beam-
splitters, is flipped relative to the second packet, which has been reflected at
both beam-splitters. In contrast, each packet arriving in channel 1 has been
reflected at one and transmitted at the other beam-splitter. As a result, there is
no relative phase shift between the two packets in channel 1.
place in this case and each channel receives an equal share from each packet,
each with one-half of the original amplitude.
In the above example, where the delay τ between the two arms of the
interferometer is T/4, one can divide the frequency content of the wave packet into
four categories. The first category consists of the frequencies f = 85Δf, 89Δf,
93Δf, ..., 113Δf. All these terms are phase-shifted by 90° and, when combined
at the second BS, are equally split between channels 1 and 2. The output of
channel 1 for these frequency components is shown in Figure 6.6(a). The second
category consists of the frequencies f = 86Δf, 90Δf, 94Δf, ..., 114Δf, which
are phase-shifted by 180° and, therefore, appear exclusively in channel 2. The
third category, consisting of the frequencies f = 87Δf, 91Δf, 95Δf, ..., 111Δf,
is phase-shifted by −90° and is, once again, equally split between the two
channels; the output of channel 1 for these components is shown in Figure 6.6(b).
The fourth and last category consists of frequencies f = 88Δf, 92Δf, 96Δf, ...,
112Δf, which are not phase-shifted at all and appear in their entirety in channel 1;
these are shown in Figure 6.6(c). Now if the three sets of signals in Figure 6.6 are
added together the twin packet of Figure 6.5(a) will be obtained.
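The four categories follow from the phase 2πfτ with f = mΔf and τ = T/4, which is m × 90° modulo 360°. The sketch below reproduces the grouping and the arm extension. The channel-1 power fraction (1 + cos φ)/2 is the usual two-beam interferometer result and is an assumption here, chosen because it matches the splitting described above; the variable and function names are illustrative.

```python
import math

C_LIGHT = 299_792_458.0          # speed of light in vacuum, m/s
DF = 4.74e12                     # spectral spacing, Hz
T = 1.0 / DF                     # period of the waveform, s
dz = C_LIGHT * T / 8             # extension of the movable arm, m
tau = 2 * dz / C_LIGHT           # round-trip delay, equal to T/4

groups = {}
for m in range(85, 115):         # f = m * DF for the 30 components
    phase = (m % 4) * 90         # degrees: 360 * f * tau modulo 360
    if phase > 180:
        phase -= 360             # report 270 degrees as -90 degrees
    groups.setdefault(phase, []).append(m)

def channel1_fraction(phase_deg):
    """Assumed fraction of a component's power that reaches channel 1."""
    return 0.5 * (1.0 + math.cos(math.radians(phase_deg)))
```

The dictionary `groups` has the keys 90, 180, −90, and 0, holding 8, 8, 7, and 7 components respectively, and `dz` evaluates to about 7.91 μm.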
It is clear that the behavior of individual frequency components (or groups of
such components that acquire the same phase shift) is independent of all the
other components; this is simply a statement of the principle of superposition
for the linear system under consideration. Furthermore, the fraction of each
component appearing in a given channel is only a function of the phase delay
acquired by that component between arms 1 and 2, independent of the original
phase of that component. Remembering that the various frequency terms are
orthogonal to each other, the behavior of the overall waveform within the
interferometer must be independent of the initial phase of its individual com-
ponents. Thus we see that the analysis of the packet of Figure 6.1(b) applies
equally to the extended waveform of Figure 6.2. These different-looking
functions share the same spectrum but have differing phase distributions
over their common range of frequencies. In particular, the coherence length is
equal to the width of the wave packet obtained by setting all φₙ equal to zero.
The width of the packet, of course, is roughly equal to the inverse of its spectral
bandwidth.
In addition to the phase angles φₙ initially present in, and those acquired
during propagation of, a given wave packet, the field may accumulate further
phase shifts due to dispersive elements (such as mirrors and prisms) in its path.
These phase shifts manifest themselves as delays or distortions of the packet. It
is of some interest, therefore, to study reflection and transmission delays caused
by dispersive elements in order to evaluate their impact on interferometric
measurements.
Figure 6.6 The spectrum of the wave packet in Figure 6.1(a) can be considered
as the superposition of four groups of frequencies. One of these groups appears
exclusively in channel 2. The other three groups appear in channel 1 either fully or
partially. The waveforms shown here are those that would have appeared in
channel 1 had the other groups been absent. When these three waveforms are
added together they reconstruct the pair of wave packets shown in Figure 6.5(a).
respectively.6 (The indices vary somewhat within the wavelength range of interest,
and the corresponding dispersion is taken into account in the following calcula-
tions.) The thickness of the quartz layer is 108 nm and that of SrTiO3 is 66 nm,
each being a quarter-wave thick at λ0. The stack is grown on a substrate whose
central region has been subsequently removed. The hole thus created in the
substrate is of no consequence for our analysis of reflection, but it simplifies the
discussion in the following section concerning transmission through the stack.
Figure 6.8 shows computed plots of amplitude and phase for the reflection and
transmission coefficients of the stack in the frequency range covered by the wave
packet of Figure 6.1. Note that, within the bandwidth of interest, the phase φ_r of
the reflection coefficient is essentially a linear function of frequency with a slope
of 1.5° per THz. This slope represents a 4.2 fs delay for the packet upon reflection
from the stack. It might therefore be argued that, upon arrival at the surface, the
packet spends 4.2 fs in exploring the stack before bouncing back. Roughly
speaking, the delay may be associated with a penetration depth of 625 nm for
this stack 1044 nm thick. (For an aluminum mirror the corresponding slope is
found to be 0.03° per THz, leading to a reflection delay of 0.083 fs and an
estimated penetration depth of only 12.5 nm.)
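The quoted delays and depths follow from the phase slope by simple arithmetic, sketched below. The conversion assumes that a linear phase of slope dφ/df (in degrees per unit frequency) corresponds to a time delay of (dφ/df)/360°, and that the penetration depth is half the distance light travels in vacuum during that delay. The function names are illustrative.

```python
C_LIGHT = 299_792_458.0                      # speed of light in vacuum, m/s

def reflection_delay_fs(slope_deg_per_thz):
    """Delay implied by a linear phase slope; 1 THz^-1 = 1 ps = 1000 fs."""
    return abs(slope_deg_per_thz) / 360.0 * 1000.0

def penetration_depth_nm(delay_fs):
    """Half the vacuum path covered during the delay (in and back out)."""
    return 0.5 * C_LIGHT * delay_fs * 1e-15 * 1e9

stack_delay = reflection_delay_fs(1.5)       # multilayer stack
stack_depth = penetration_depth_nm(stack_delay)
al_delay = reflection_delay_fs(0.03)         # aluminum mirror
al_depth = penetration_depth_nm(al_delay)
transit_fs = 1044e-9 / C_LIGHT * 1e15        # vacuum transit time across 1044 nm
```

These give roughly 4.17 fs and 625 nm for the stack, and 0.083 fs and 12.5 nm for aluminum. The last line gives the time light needs to cross 1044 nm in vacuum, about 3.5 fs.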
Figure 6.8 Computed amplitude and phase of the reflection and transmission
coefficients r and t of the multilayer stack of Figure 6.7. The depicted range of
frequencies covers the entire bandwidth of the wave packet shown in Figure 6.1.
requiring 3.5 fs for the light to cover this distance at its vacuum speed c. It
appears therefore that in passing through the stack the packet has exceeded the
speed of light.7,8,9,10 Since the special theory of relativity appears to have been
violated, we take a closer look at the transmitted beam.
Note in Figure 6.8(a) that the transmitted amplitude |t| is not constant over the
range of frequencies of the wave packet but rises at both ends. This means that
the actual transmitted spectrum is somewhat broadened (see Figure 6.9(a)).
Taking into account the actual amplitude and phase of the transmission
coefficient, we find the transmitted packet to be that of Figure 6.9(b). The peak of
this packet is in fact delayed by about 2.6 fs, implying its faster-than-light
Figure 6.9 The wave packet transmitted through the stack of Figure 6.7 has a
broadened spectrum as shown in (a). This spectral broadening, together with the
linear phase shift φ_t(f) depicted in Figure 6.8(b), results in the compressed and
delayed packet shown in (b). Had the spectral broadening been ignored and only
the phase shift φ_t(f) taken into account, the transmitted packet would have
resembled that in (c).
propagation, but the entire packet is also compressed, which means that its
starting point is about 5 fs behind that of the incoming packet (compare Figure
6.9(b) with Figure 6.1(c)). This delay of the starting point ensures that special
relativity is not violated. Had we ignored the broadening of the spectrum and
only included the phase shift φ_t(f) in our transmission calculations, we would
have obtained the packet of Figure 6.9(c), which is only delayed relative to the
Figure 6.10 The signal S1 of detector 1 versus the extension Δz of the movable
end-reflector of the interferometer. The stack of Figure 6.7 is installed in the
fixed arm while the adjustable arm is extended to compensate for the
transmission delay through the stack. The incoming beam is assumed to be the wave
packet of Figure 6.1(b).
7 The van Cittert–Zernike theorem 89
the source’s inherently random radiation processes. We shall make exclusive use of
time-averaging to derive the degree of coherence of a pair of points within the field of
an extended, quasi-monochromatic, incoherent source.
At a given point in space, the (scalar) amplitude of the radiated waveform may be
written
\[ a(t) = \sum_n A_n\,(\Delta f)^{1/2} \cos(2\pi f_n t - \phi_n), \tag{7.2} \]
where Aₙ and φₙ are the amplitude and phase of the component whose frequency
is fₙ. The significance of the constant multiplier (Δf)^{1/2}, which is there for
normalization purposes only, becomes clear shortly. We set the central frequency
f0 = 5.454 × 10¹⁴ Hz (corresponding to yellow light of wavelength λ = 550 nm)
and choose Δf = 5.454 × 10¹² Hz, which leads to N0 = 100. We adopt a Gaussian
shape for the distribution of the amplitudes Aₙ, as shown in Figure 7.1(a), and let
the value of n in Eq. (7.2) range from −4 to +4, for a total of nine discrete
wavelengths in the spectrum. To a large extent these choices are arbitrary but, as
before, the points that we seek to clarify by way of examples based on these
choices are quite general in nature.
Since in the Fourier domain the spectrum in Figure 7.1(a) is a discrete function
of frequency, the corresponding amplitude a(t) must be periodic, with a period of
T = 1/Δf ≈ 183 fs. A plot of a(t) over a full period T is shown in Figure 7.1(b),
where the values of φₙ at each frequency are chosen randomly and independently
of each other. To increase the period T without changing the overall shape of the
spectrum one must increase the rate of spectral sampling in Figure 7.1(a), by
selecting additional frequencies in between those that are already chosen. In this
way the spectrum retains its shape, but Δf becomes smaller while T becomes
larger. In the limit Δf → 0 the period T of the waveform approaches infinity.
[Figure 7.1: (a) spectral amplitude versus frequency (10¹⁴ Hz); (b), (c) waveform amplitude versus time (fs).]
As far as the first-order coherence of a given waveform is concerned, the specific
phase distribution over its spectral range is irrelevant, even though the shape of the
function a(t) may be significantly affected by this phase distribution. For example,
in Figure 7.1(b) the value of φₙ at each frequency is chosen randomly, whereas if
φₙ were chosen as a linear function of frequency then a waveform such as that of
Figure 7.1(c) would have been obtained. There are many possible choices for {φₙ},
and each one yields a more or less extended function of time such as that of Figure
7.1(b). Only on rare occasions does one find a compact wave packet similar to that
of Figure 7.1(c). However, the compact packet has the same first-order coherence
properties as the extended waveform.
Intensity
The average intensity of the waveform a(t) in Eq. (7.2) is readily computed as
follows:
\[ \langle I \rangle = \frac{1}{T}\int_0^T a^2(t)\,dt = \frac{1}{2}\sum_n A_n^2\,\Delta f. \tag{7.3} \]
Note that the right-hand side of Eq. (7.3), being the area under the square of the
spectral distribution function of Figure 7.1(a), remains constant as the sampling
rate increases. Thus reducing Δf in order to increase the period T does not affect
the average intensity.
Although the average intensity does not depend on {φₙ}, the fluctuations in
intensity are most definitely affected by this phase distribution. A thermal source
(such as an incandescent lamp) tends to “assign” the values of φₙ randomly and
independently of each other, thus resulting in significant fluctuations in I(t). This
behavior may be observed by examining the typical waveform in Figure 7.1(b).
In fact, it can be shown that, for a thermal source, ⟨[I(t) − ⟨I⟩]²⟩ = ⟨I⟩². However,
it is possible to assign the phase angles in such a way as to minimize the intensity
fluctuations. In a well-stabilized single-mode laser, for instance, the locking of
the phase angles renders the root-mean-square fluctuations of intensity negligible,
that is, ⟨I²(t)⟩ = ⟨I⟩². These considerations, however, pertain to higher-order
statistics and, as far as first-order coherence is concerned, one could as well
ignore the specific phase distribution.
Thus when T → ∞ the magnitude of C_PP′(τ) for essentially all τ goes to zero,
whereas in the same limit the average intensity ⟨I⟩ given by Eq. (7.3) remains
non-zero. If the fields from P and P′ are brought together in an attempt to create
interference fringes, their combined intensity will be the sum of their individual
intensities plus the cross-correlation term C_PP′(τ). Since C_PP′(τ) → 0 for
sufficiently long T, the intensity of the sum will be the sum of individual intensities
and, therefore, no fringes will be observed.
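The vanishing of the cross-correlation can be illustrated with a small numerical experiment. The sketch below assumes that the two points emit waveforms of the form of Eq. (7.2) with the same amplitudes Aₙ but independent random phases, in which case the time-averaged product at zero delay works out to ½ΣAₙ²Δf cos(φₙ − φ′ₙ). The Gaussian envelope, the frequency units, and the function name are illustrative assumptions.

```python
import numpy as np

def cross_to_intensity_ratio(num_components, rng):
    """|C_PP'(0)| / <I> for two sources with independent random spectral phases."""
    n = np.linspace(-4.0, 4.0, num_components)   # fixed spectral range, finer sampling
    df = n[1] - n[0]                             # spacing in arbitrary frequency units
    A = np.exp(-(n / 2.0) ** 2)                  # assumed Gaussian envelope
    phi_p = rng.uniform(0, 2 * np.pi, num_components)
    phi_q = rng.uniform(0, 2 * np.pi, num_components)
    intensity = 0.5 * np.sum(A**2) * df          # Eq. (7.3)
    cross = 0.5 * np.sum(A**2 * np.cos(phi_p - phi_q)) * df
    return abs(cross) / intensity

rng = np.random.default_rng(0)
ratio_fine = cross_to_intensity_ratio(9001, rng)   # small spacing, long period T
```

Refining the sampling of the same spectrum leaves the average intensity unchanged, while the cross term becomes a sum of many random contributions whose relative size shrinks roughly as the inverse square root of the number of components.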
Interpretation
One may think of the radiation emanating from the two point sources P and P′ in
terms of two finite-duration wave packets (see Figure 7.1(c)). However, since the
wave packets do not have a random relative phase it is impossible to get their
cross-correlation to vanish. Nonetheless, we can assume that the packets are
separated in time by an interval much longer than their individual widths and also
much longer than any time delay that might occur in a system under consider-
ation. In other words, as far as first-order coherence is concerned, an extended
incoherent source emitting continuous radiation from its various points is
equivalent to an identical source that emits relatively short bursts of light sepa-
rated by long intervals. In this model of an incoherent, quasi-monochromatic,
extended source each point emits only one pulse, no two points emit overlapping
pulses, and all pulses from the various source locations have the same duration
and shape.
As an example, consider an imaging system where a quasi-monochromatic
spatially incoherent light source illuminates a sample, of which an image is
formed on a photographic plate. One may imagine the individual points of the
source as being independent coherent point sources, each creating a coherent
image of the sample on the photographic plate. Because different points
radiate at different times, there will be no interference among the various
images. The photographic plate duly records the intensity pattern produced by
each point source, automatically adding these images together as they arrive
sequentially. The final image is thus the sum of the intensity distributions of
all the coherent images produced by the various point sources.
the X′Y′-plane at z = zs. The distance zs between the source and the XY-plane
at z = 0, on which we seek to determine the degree of coherence, is large
enough that all the simplifying assumptions invoked in the previous sections
still apply. We wish to determine the first-order coherence properties of the
light that reaches the XY-plane at z ¼ 0. We select two points (x1,y1) and (x2,y2)
on this plane and assume that two pinholes are placed at these points. The light
reaching the pinholes from a point source at (x,y,zs) will have nearly the
same amplitude but different phase. The phase difference at the pinholes is
given by
D ¼ 2pD‘=k
2p 1 2
x1 þ y21 12 x22 þ y22 ½ðx1 x2 Þx þ ðy1 y2 Þyg:
kzs 2 ð7:7Þ
Consider what happens when all the point sources are active. They all act
independently, each creating its own fringe pattern at the observation screen.
All fringes thus produced will have the same period but different strengths and
will be shifted by different amounts along the ξ-axis. Because the point sources are
completely incoherent, their overlapping fringe patterns must simply be added
together. In other words, the final intensity distribution is the sum of Eq. (7.6)
over all points P. We assume α to be the same for all the point sources. The
fringe period λz0/d is also the same. Therefore, the sum of Eq. (7.6b) over all
point sources may be written as follows:
$$I(\xi) = \alpha\int_{\text{source}} I(x,y)\,dx\,dy + \alpha\,\mathrm{Re}\Big\{\int_{\text{source}} I(x,y)\exp[i\Delta\varphi(x,y)]\,dx\,dy\;\exp[i2\pi d\xi/(\lambda z_0)]\Big\}. \tag{7.8}$$
$$\hat{I}(x,y) = I(x,y)\Big/\int_{\text{source}} I(x,y)\,dx\,dy \tag{7.9b}$$
$$\gamma(x_1,y_1,x_2,y_2) = \int_{\text{source}} \hat{I}(x,y)\exp[i\Delta\varphi(x,y)]\,dx\,dy. \tag{7.9c}$$
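As a rough numerical illustration of the phase difference of Eq. (7.7) and the normalized coherence integral of Eqs. (7.9b) and (7.9c), the Python sketch below replaces the source by a set of independent point emitters, evaluates the complex degree of coherence as an intensity-weighted sum, and checks that adding the individual two-pinhole fringe patterns as intensities gives a fringe contrast equal to its modulus. The function names, the 21 × 21 grid, the pinhole positions, and the distances are all hypothetical choices made for this sketch; lengths are in units of the wavelength.

```python
import numpy as np

def delta_phi(x, y, p1, p2, wavelength, zs):
    """Paraxial phase difference of Eq. (7.7) between pinholes p1, p2 for a source point (x, y)."""
    (x1, y1), (x2, y2) = p1, p2
    bracket = (0.5 * (x1**2 + y1**2) - 0.5 * (x2**2 + y2**2)
               - ((x1 - x2) * x + (y1 - y2) * y))
    return 2 * np.pi * bracket / (wavelength * zs)

def degree_of_coherence(x, y, intensity, p1, p2, wavelength, zs):
    """Discrete stand-in for Eqs. (7.9b)-(7.9c): normalized, intensity-weighted sum of exp(i*dphi)."""
    weights = intensity / intensity.sum()
    return np.sum(weights * np.exp(1j * delta_phi(x, y, p1, p2, wavelength, zs)))

def summed_fringe_contrast(x, y, intensity, p1, p2, wavelength, zs, n_samples=4096):
    """Add the fringes of all source points as intensities; return (Imax - Imin)/(Imax + Imin)."""
    u = np.linspace(0.0, 2 * np.pi, n_samples, endpoint=False)  # fringe phase over one period
    dphi = delta_phi(x, y, p1, p2, wavelength, zs)
    total = np.sum(intensity[:, None] * (1.0 + np.cos(u[None, :] + dphi[:, None])), axis=0)
    return (total.max() - total.min()) / (total.max() + total.min())

# Hypothetical uniform source sampled by 21 x 21 independent point emitters.
grid = np.linspace(-500.0, 500.0, 21)
xs, ys = [a.ravel() for a in np.meshgrid(grid, grid)]
source = np.ones_like(xs)
p1, p2 = (400.0, 0.0), (-400.0, 0.0)
gamma = degree_of_coherence(xs, ys, source, p1, p2, wavelength=1.0, zs=1.0e6)
contrast = summed_fringe_contrast(xs, ys, source, p1, p2, wavelength=1.0, zs=1.0e6)
print(abs(gamma), contrast)  # the two numbers should agree closely
```

A single point emitter gives a modulus of exactly one (full contrast), which is a convenient sanity check on the weighting.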
A comparison of Eqs. (7.10) and (7.6) reveals that the fringe contrast produced
by the pinholes at (x1, y1) and (x2, y2) is equal to |γ| and that the phase of γ
determines the shift of these fringes from the center. The function γ is thus
described as the complex degree of spatial coherence between (x1, y1) and
(x2, y2). Substituting expression (7.7) for Δφ in Eq. (7.9c) yields
Example
Consider the uniform, quasi-monochromatic, incoherent source depicted in
Figure 7.4(a). The source’s central wavelength is λ, and its linear dimensions
are 3250λ on each side. A square array of 13 × 13 independent point sources on
a rectangular mesh (with spacing 250λ) is used to simulate this source. A pair of
pinholes in an otherwise opaque screen is located at zs = 10^7 λ from the source.
The square pinholes shown in Figure 7.4(b) are each of side-length 350λ
and separated by a distance d along the X-axis. The light from the source,
having gone through the pinholes, arrives at the observation plane located at
z0 = 10^6 λ.
Figure 7.5 shows the computed fringe patterns at the observation plane for four
different values of d. Note that with increasing d the fringe period decreases. The
fringe contrast also declines at first, going to zero when d = 3333λ. Subsequently,
however, the contrast increases as d continues to increase. Whereas in frames (a)
and (b) the central fringe is bright, in frame (d) corresponding to d = 4000λ the
central fringe becomes dark. This is equivalent to a half-period shift of the pattern
upon crossing the point of zero contrast.
Figure 7.6 shows cross-sections of the fringe patterns of Figure 7.5. The
contrast calculated from these plots can be shown to be in good agreement with
the values predicted by the van Cittert–Zernike theorem.
Figure 7.4 (a) Intensity distribution over the surface area of a uniform, quasi-
monochromatic, incoherent source. The linear dimensions of the source
are 3250λ along each side, where λ is the wavelength of its radiation. (b) A
pair of square pinholes each measuring 350λ along each side. The center-to-
center spacing d between the pinholes is an adjustable parameter of the
simulations.
Figure 7.5 Computed intensity distributions in the vicinity of the optical axis
at the observation plane of Figure 7.3 for the source and pinholes of Figure 7.4.
The distance between the source and the plane of the pinholes is zs = 10^7 λ, while
the distance between the pinholes and the observation screen is z0 = 10^6 λ. Each
frame corresponds to a different spacing d between the pinholes: (a) d = 1250λ;
(b) d = 2500λ; (c) d = 3333λ; (d) d = 4000λ.
[Figure 7.6: cross-sections of the fringe patterns of Figure 7.5, plotted as normalized intensity versus x/λ; the surviving panel labels are (a) d = 1250λ and (b) d = 2500λ.]
$$E_y(z,t) = \sum_n B_n\,(\Delta f)^{1/2}\cos[2\pi f_n(t - z/c) + \psi_n]. \tag{8.1b}$$
8 Partial polarization, Stokes parameters, and the Poincaré sphere 101
[Figure 8.1: amplitude versus frequency (10^14 Hz), plotted from 5.0 to 5.6.]
speed of light in vacuum, and the constant multiplier (Δf)^(1/2) is for normalization
purposes only, its significance becoming clear as the discussion proceeds.
As described by Eqs. (8.1), the contribution to the beam of each frequency
term fn is a fully polarized plane wave. For this plane wave, which is elliptically
polarized in general, one can determine the ellipticity and the orientation of the
ellipse of polarization in terms of An, Bn, and φn − ψn. However, the superposition
of different frequency terms, each having a different state of polarization,
results in partially polarized light.
Figure 8.2 A polychromatic beam of light propagating along the Z-axis is sent
through a variable retarder and a polarizer. The retarder’s fast and slow axes are
fixed along the X- and Y-directions, but its phase shift χ as well as the polarizer’s
orientation angle θ may be adjusted to minimize the amount of light that is
transmitted through the system. χ must be the same for all the wavelengths
contained in the incident beam.
Because all frequencies fn in Eq. (8.2) are integer multiples of Δf, namely,
fn = (N0 + n)Δf, the transmitted amplitude E(z = 0, t) is a periodic function of
time, with period T = 1/Δf. The time-averaged transmitted intensity as a function
of χ and θ is thus given by
$$I(\chi,\theta) = \frac{1}{T}\int_0^T E^2(z=0,t)\,dt = \frac{1}{2}\sum_n\big[A_n^2\cos^2\theta + B_n^2\sin^2\theta + A_nB_n\sin(2\theta)\cos(\varphi_n-\psi_n-\chi)\big]\,\Delta f. \tag{8.3}$$
The presence of Δf in the above expression allows a smooth transition from the
discrete sum to a continuous integral in the limit Δf → 0; this, of course, is the
same limit in which T → ∞.
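The closed form of Eq. (8.3) can be checked against a brute-force time average over one period. The sketch below is only illustrative: the spectral amplitudes and phases are random made-up values, the function names are hypothetical, and the retarder is modeled by adding the phase shift χ to every Y-component, which is an assumption chosen because it reproduces the cosine argument φn − ψn − χ of Eq. (8.3).

```python
import numpy as np

def intensity_closed_form(A, B, phi, psi, chi, theta, df):
    """Time-averaged transmitted intensity according to Eq. (8.3)."""
    return 0.5 * np.sum(A**2 * np.cos(theta)**2 + B**2 * np.sin(theta)**2
                        + A * B * np.sin(2 * theta) * np.cos(phi - psi - chi)) * df

def intensity_time_average(A, B, phi, psi, chi, theta, df, n0, n_t=4096):
    """Brute-force average of E^2(z=0, t) over one period T = 1/df."""
    f = (n0 + np.arange(A.size)) * df              # f_n = (N0 + n) * df
    t = np.arange(n_t) / (n_t * df)                # uniform samples covering [0, T)
    wt = 2 * np.pi * f[:, None] * t[None, :]
    ex = np.sqrt(df) * np.sum(A[:, None] * np.cos(wt + phi[:, None]), axis=0)
    # Assumed retarder model: chi is added to the phase of the Y-component.
    ey = np.sqrt(df) * np.sum(B[:, None] * np.cos(wt + psi[:, None] + chi), axis=0)
    e = ex * np.cos(theta) + ey * np.sin(theta)    # field component passed by the polarizer
    return np.mean(e**2)

rng = np.random.default_rng(0)
n = 8
A, B = rng.uniform(0.5, 1.0, n), rng.uniform(0.5, 1.0, n)
phi, psi = rng.uniform(0, 2 * np.pi, n), rng.uniform(0, 2 * np.pi, n)
closed = intensity_closed_form(A, B, phi, psi, chi=0.7, theta=0.4, df=1.0)
brute = intensity_time_average(A, B, phi, psi, chi=0.7, theta=0.4, df=1.0, n0=50)
print(closed, brute)  # the two values should agree to rounding error
```

Because every frequency is an integer multiple of Δf, the uniform sampling over one period averages all cross-frequency terms to zero exactly, which is why the two numbers agree.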
Stokes parameters
To streamline the calculation of the values of χ and θ that minimize I(χ, θ), we
follow Sir George Gabriel Stokes (1819–1903) in defining the four parameters
that now bear his name:4
$$S_0 = \tfrac{1}{2}\sum_n (A_n^2 + B_n^2)\,\Delta f, \tag{8.4a}$$
$$S_1 = \tfrac{1}{2}\sum_n (A_n^2 - B_n^2)\,\Delta f, \tag{8.4b}$$
$$S_2 = \sum_n A_nB_n\cos(\varphi_n - \psi_n)\,\Delta f, \tag{8.4c}$$
$$S_3 = \sum_n A_nB_n\sin(\varphi_n - \psi_n)\,\Delta f. \tag{8.4d}$$
To minimize the transmitted intensity in Eq. (8.3) we first set the derivative of
I(χ, θ) with respect to χ equal to zero. This yields χ0, independently of the
value of θ, as follows:
$$\chi_0 = \arctan(S_3/S_2). \tag{8.5a}$$
Substituting χ0 for χ in Eq. (8.3) and differentiating with respect to θ, we find the
optimum θ0 as
$$\theta_0 = \tfrac{1}{2}\arctan\big[(S_2/S_1)\cos\chi_0 + (S_3/S_1)\sin\chi_0\big]. \tag{8.5b}$$
The transmitted intensity thus turns out to have a minimum at (χ0, θ0) and a
maximum at (χ0, θ0 + 90°), or vice versa. These values are given by
Degree of polarization
The minimum transmitted intensity Imin in Eq. (8.6a), being that part of the beam
which cannot be extinguished with a retarder and a polarizer, represents the
depolarized content of the beam. This, of course, is only half the total amount of
depolarized light, because the same amount must also be contained in Imax. The total
amount of depolarized light, therefore, is 2Imin, while the remaining part, Imax − Imin,
is fully polarized. The degree of polarization P of the beam may thus be defined as
$$P = (I_{\max} - I_{\min})/(I_{\max} + I_{\min}) = \big[(S_1/S_0)^2 + (S_2/S_0)^2 + (S_3/S_0)^2\big]^{1/2}. \tag{8.7}$$
Using the Schwarz inequality,5 it is not difficult to show that S1² + S2² + S3² ≤ S0²;
consequently, 0 ≤ P ≤ 1. (See Note 1 at the end of the chapter.)
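The definitions of the four Stokes parameters in Eqs. (8.4) and of the degree of polarization in Eq. (8.7) translate directly into a few lines of Python. The sketch below uses hypothetical function names and random made-up spectra; it shows that a single frequency component gives P = 1, while a superposition of many components with unrelated phases gives a value between 0 and 1.

```python
import numpy as np

def stokes_parameters(A, B, phi, psi, df=1.0):
    """Stokes parameters of Eqs. (8.4) from sampled amplitudes and phases of Ex and Ey."""
    s0 = 0.5 * np.sum(A**2 + B**2) * df
    s1 = 0.5 * np.sum(A**2 - B**2) * df
    s2 = np.sum(A * B * np.cos(phi - psi)) * df
    s3 = np.sum(A * B * np.sin(phi - psi)) * df
    return s0, s1, s2, s3

def degree_of_polarization(s0, s1, s2, s3):
    """Degree of polarization P of Eq. (8.7)."""
    return np.sqrt(s1**2 + s2**2 + s3**2) / s0

rng = np.random.default_rng(1)
n = 200
A, B = rng.uniform(0.1, 1.0, n), rng.uniform(0.1, 1.0, n)
phi, psi = rng.uniform(0, 2 * np.pi, n), rng.uniform(0, 2 * np.pi, n)

p_mixed = degree_of_polarization(*stokes_parameters(A, B, phi, psi))
p_single = degree_of_polarization(*stokes_parameters(A[:1], B[:1], phi[:1], psi[:1]))
print(p_single, p_mixed)  # p_single is 1 up to rounding; p_mixed lies between 0 and 1
```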
One may question the generality of the above result because, in deriving it, the fast
and slow axes of the wave-plate were fixed along the X- and Y- axes. In other words,
one wonders if the result would have been different had the axes of the wave-plate
been allowed to rotate around the Z-axis. The result can be shown to be quite general,
however, because P of Eq. (8.7) remains invariant under a rotation of the XY-plane
around Z. The value of S0, being the total power of the beam, obviously remains the
same for arbitrary orientations of the coordinate system. Moreover, with some
elementary algebra, the quantity S1² + S2² + S3² may also be shown to be invariant
under coordinate rotation. (See Note 2 at the end of the chapter.)
In retrospect the variable retarder of Figure 8.2 could have been replaced by an
achromatic quarter-wave plate (e.g., a Fresnel rhomb) in a rotary mount. The axes
of the quarter-wave plate could then be made to coincide with the axes of the
ellipse of polarization in order to linearize that part of the beam which is fully
polarized. This is precisely what the variable retarder accomplishes in that it
adjusts the retardation χ while maintaining a fixed orientation in the XY-plane.
Figure 8.3 The Poincaré sphere is the location of all points S with coordinates
(x, y, z) = (S1, S2, S3). The radius of the sphere is PS0, and the latitude and
longitude of S specify the ellipticity η and orientation angle ρ of the polarized
component of the beam.
major axis of the ellipse and the X-axis). These parameters may be readily
expressed in terms of the Stokes parameters:
Using the above relations, the French mathematical physicist Henri Poincaré
(1854–1912) represented the state of polarization as a point S on the surface of a
sphere, as shown in Figure 8.3. In this representation the three Cartesian
coordinates of S are S1, S2, and S3. Thus, according to Eq. (8.7), the radius of the
Poincaré sphere is PS0, the power of that fraction of the beam which is fully
polarized. The latitude of S is twice the ellipticity η of the polarized component,
in accordance with Eq. (8.8a), while the longitude of S represents twice the
orientation angle ρ of the major axis of the ellipse of polarization, as prescribed
by Eq. (8.8b).
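The geometric statement above (a sphere of radius PS0 on which the latitude is twice the ellipticity and the longitude is twice the orientation angle) can be turned into a pair of conversion routines. The sketch below assumes the usual spherical-coordinate reading of that statement, namely S1 = PS0 cos(2η) cos(2ρ), S2 = PS0 cos(2η) sin(2ρ), S3 = PS0 sin(2η); the function names are hypothetical, and the routines assume a nonzero polarized component.

```python
import numpy as np

def stokes_to_ellipse(s0, s1, s2, s3):
    """Return (P, rho, eta): degree of polarization, orientation and ellipticity in radians."""
    radius = np.sqrt(s1**2 + s2**2 + s3**2)      # radius of the Poincare sphere, P*S0
    rho = 0.5 * np.arctan2(s2, s1)               # longitude = 2*rho
    eta = 0.5 * np.arcsin(s3 / radius)           # latitude  = 2*eta
    return radius / s0, rho, eta

def ellipse_to_stokes(s0, p, rho, eta):
    """Inverse mapping: Cartesian coordinates (S1, S2, S3) of the point S on the sphere."""
    radius = p * s0
    return (radius * np.cos(2 * eta) * np.cos(2 * rho),
            radius * np.cos(2 * eta) * np.sin(2 * rho),
            radius * np.sin(2 * eta))

# Round trip with made-up values: P = 0.8, rho = 0.3 rad, eta = -0.2 rad.
s1, s2, s3 = ellipse_to_stokes(1.0, 0.8, 0.3, -0.2)
p, rho, eta = stokes_to_ellipse(1.0, s1, s2, s3)
print(p, rho, eta)
```

Using the two-argument arctangent keeps the orientation angle in the correct quadrant when S1 is negative, which a plain arctangent of S2/S1 would not.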
Unpolarized light
A completely unpolarized beam of light cannot be altered by the wave-plate and
polarizer of Figure 8.2. No matter what the phase shift χ of the retarder and the
orientation θ of the polarizer may be, the output power will be one-half the input
power. For this light S0 will be the total power of the beam, but S1 = S2 = S3 = 0.
Since S1 = 0, the relation Σ An² Δf = Σ Bn² Δf implies that the power along the
X-axis equals that along the Y-axis. For natural light, where the polarization
components along the X- and Y-axes are independent of each other, the
relative phase angles φn − ψn are uniformly distributed over (0, 2π) and tend to
be a random function of n. Hence, in the limit Δf → 0, the Stokes parameters S2
and S3 approach zero as well. However, there exist other combinations of φn
and ψn that yield totally unpolarized light. For example, a superposition of
two equal-magnitude beams of frequencies f1 and f2, where one beam is right-
and the other left-circularly polarized, can be readily shown to be fully
unpolarized.
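The two-frequency example can be verified in a few lines. In the sketch below the two circular components are represented by equal X and Y amplitudes with phase differences of +90° and −90°; which of the two is called right-circular depends on the handedness convention, so that labeling is left open. The Stokes sums follow the definitions S0 = ½Σ(An² + Bn²)Δf, S1 = ½Σ(An² − Bn²)Δf, S2 = ΣAnBn cos(φn − ψn)Δf, S3 = ΣAnBn sin(φn − ψn)Δf, and the transmitted intensity follows Eq. (8.3); the variable names and the grid of settings are arbitrary.

```python
import numpy as np

A = np.array([1.0, 1.0])                           # X-amplitudes at f1 and f2
B = np.array([1.0, 1.0])                           # Y-amplitudes at f1 and f2
phase_diff = np.array([+np.pi / 2, -np.pi / 2])    # phi_n - psi_n: opposite circular senses
df = 1.0

s0 = 0.5 * np.sum(A**2 + B**2) * df
s1 = 0.5 * np.sum(A**2 - B**2) * df
s2 = np.sum(A * B * np.cos(phase_diff)) * df
s3 = np.sum(A * B * np.sin(phase_diff)) * df

def transmitted(chi, theta):
    """Time-averaged intensity behind the retarder and polarizer, Eq. (8.3)."""
    return 0.5 * np.sum(A**2 * np.cos(theta)**2 + B**2 * np.sin(theta)**2
                        + A * B * np.sin(2 * theta) * np.cos(phase_diff - chi)) * df

settings = [(c, t) for c in np.linspace(0, 2 * np.pi, 13) for t in np.linspace(0, np.pi, 13)]
values = np.array([transmitted(c, t) for c, t in settings])
print(s1, s2, s3, values.min(), values.max(), s0 / 2)
```

All three of S1, S2, S3 vanish, and the transmitted intensity equals S0/2 for every retarder and polarizer setting tried.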
[Figure 8.4: a plane wave with incident components Ep(i) and Es(i) strikes a glass slab 100 μm thick with refractive index n = 1.5, producing reflected components Ep(r), Es(r) and transmitted components Ep(t), Es(t).]
Figure 8.5 A polychromatic plane wave, having the spectrum of Figure 8.1
and a linear polarization at 45° to the plane of incidence, is reflected from a
glass slab at a 75° angle of incidence (see Figure 8.4). Shown as functions of λ:
(a) the reflected amplitudes |rp| (broken line) and |rs| (solid line); (b) the phase
angles of rp (broken line) and rs (solid line); (c) the reflected polarization state,
defined by the rotation angle ρ and ellipticity η. For the reflected beam the computed
degree of polarization is P = 0.978, the polarized component is essentially
linear (η = 0.000026°), and the polarization vector makes an angle ρ =
60.2° with the p-direction.
functions of λ are depicted in Figure 8.5(a). Multiple reflections at the two facets
of the slab interfere with each other to produce the fine structure seen in the
spectra of Figure 8.5(a). The phase angles of the reflected p- and s-components
are shown in Figure 8.5(b), and the resulting polarization rotation angle ρ and
ellipticity η appear in Figure 8.5(c). The knowledge of these quantities allows
one to compute the Stokes parameters from Eqs. (8.4), yielding S1/S0 = −0.495,
S2/S0 = 0.844, S3/S0 = 0.89 × 10⁻⁶. Thus the degree of polarization of the
reflected beam is P = 0.978, the wave-plate’s required phase shift χ0 is very
small, 0.00006°, and the polarizer’s angle for minimum transmission must be
set to θ0 = −29.8°. It is seen that the polarized content of the reflected beam is
essentially linear (η = 0.000026°) and is oriented at ρ = 60.2° relative to the
p-direction.
Similar results may be obtained for the beam transmitted through the slab.
The corresponding amplitudes and phases are shown in Figure 8.6, and the Stokes
parameters are found to be S1/S0 = 0.306, S2/S0 = 0.907, S3/S0 = 0.7 × 10⁻⁶.
Thus the degree of polarization is P = 0.957, the wave-plate’s required phase
Figure 8.6 The counterpart of Figure 8.5 for the case of transmission through the glass slab 100 μm thick. The computed degree
of polarization of the transmitted beam is P = 0.957, the polarized component is essentially linear (η = 0.000022°), and the
polarization vector makes an angle ρ = 35.7° with the p-direction.
[Figure: (a) amplitudes |tp| and |ts|; (b) phases of tp and ts (degrees); (c) rotation and ellipticity (degrees); all plotted versus λ (nm) from 546 to 554.]
Note 1
The Schwarz inequality,5 which concerns the integral of the product of two
complex functions of the real variable x, is written as follows:
$$\Big|\int f(x)\,g^*(x)\,dx\Big|^2 \le \int |f(x)|^2\,dx \int |g(x)|^2\,dx.$$
we find S2² + S3² = |S2 + iS3|² = |AB*ᵀ|² ≤ ‖A‖²‖B‖² = S0² − S1², establishing the
desired inequality.
Note 2
The 2 × 2 matrix
$$M = \begin{pmatrix} A \\ B \end{pmatrix}\begin{pmatrix} A^{*T} & B^{*T} \end{pmatrix}$$
However, under this unitary transformation both the trace and the determinant of
M remain unchanged. Therefore, the beam’s total power S0 and the power of its
polarized component (S1² + S2² + S3²)^(1/2) are rotation invariant.
Introduction
The degree of first-order temporal coherence, a function denoted by g(1)(τ),
provides information about the coherence length and the power spectral density
of a light source. However, without additional information, g(1)(τ) has no
bearing on intensity fluctuations and higher-order statistics of the emitted light.
A quasi-monochromatic laser beam and the beam of light from an incandescent
light bulb, provided that the latter is properly filtered to match the spectral line-
shape of the former, will have identical degrees of first-order coherence. Any
interferometric experiment involving the splitting and superposition of ampli-
tudes would yield identical results for the laser beam and the (properly filtered)
thermal light. Therefore, on the basis of such experiments alone, there is no way
to distinguish the two light sources. It turns out, however, that the intensity
fluctuations of laser light are fundamentally different from those of thermal
light. The two sources can, therefore, be distinguished based on their second-
order coherence properties.1,2,3
An ideal photodetector produces an electrical signal proportional to the “cycle-
averaged intensity” of the E-field (or B-field) of the light beam at the location of
the detector. Assuming that the electrical bandwidth of the detector (including all
associated circuitry) is greater than the bandwidth of the incident light wave by at
least a factor of 2, the output of the detector should accurately represent the
intensity fluctuations of the light beam as a function of time. At a given point in
space, the light beam’s degree of second-order coherence, g(2)(τ), may be defined
in terms of the autocorrelation function of the output electric signal from such a
detector. In the next section we derive a general expression for g(2)(τ), discuss the
fundamental difference between laser light and thermal light as manifested by
this degree of second-order coherence, and obtain a relationship between g(2)(τ)
and g(1)(τ) for the case of chaotic (e.g., thermal) light.
+ cos[2π(n + n′)Δf t + φn + φn′]}. (9.2)
The second term in the above expression has high frequencies, (n + n′)Δf, which can
be removed by a low-pass filter. (Such low-pass filtering is inherent to all commonly
available photo-detectors.) The low-frequency terms survive the filtering; upon
rearranging the double sum in Eq. (9.2), the filtered function a²(t) – known as the
9 Second-order coherence 115
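where
$$I_0 = \tfrac{1}{2}\sum_n A_n^2\,\Delta f, \tag{9.4b}$$
$$\hat{I}_m = |\hat{I}_m|\exp(i\psi_m) = \sum_{n=N_0-M}^{N_0+M-m} A_nA_{n+m}\exp[i(\varphi_{n+m}-\varphi_n)]\,\Delta f = \sum_{n=N_0-M}^{N_0+M-m}\hat{A}_n^*\hat{A}_{n+m}\,\Delta f. \tag{9.4c}$$
$$\hat{I}(f) = |\hat{I}(f)|\exp[i\psi(f)] = \int_0^\infty \hat{A}(f')\,\hat{A}^*(f'-f)\,df'. \tag{9.5b}$$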
Next, we exploit the fact that I(t) given by Eq. (9.4) is a periodic function of time
with period T = 1/Δf, and define the autocorrelation of the cycle-averaged
intensity distribution by averaging over one period T, as follows:
$$\langle I(t)I(t+\tau)\rangle = \frac{1}{T}\int_0^T I(t)I(t+\tau)\,dt = I_0^2 + \tfrac{1}{2}\sum_{m=1}^{2M}|\hat{I}_m|^2\cos(2\pi m\Delta f\,\tau). \tag{9.6}$$
Note that, in deriving the above expression for g(2)(τ), no assumptions were
made about the statistical properties of either a(t) or its Fourier transform
A(f) exp[iφ(f)]. In particular, there is no need for a(t) to be stationary, although
ergodicity will be helpful, as the time-averages obtained over one particular
waveform will then be representative of the entire ensemble of such waveforms.
The classical degree of second-order coherence given by Eq. (9.7) has three
fundamental properties.
(i) g(2)(τ) is an even function of τ.
(ii) g(2)(τ) ≤ g(2)(0), that is, the maximum of g(2)(τ) occurs at τ = 0. This is a
consequence of the fact that all the cosines in Eq. (9.7) reach their peak values
simultaneously at τ = 0.
(iii) The value of the function at τ = 0 is greater than or equal to unity, namely,
g(2)(0) ≥ 1, because, inevitably, Σ(|Îm|/I0)² ≥ 0.
We now study two special cases in some detail.
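The recipe described so far (sample the complex spectrum, correlate it with itself to obtain the coefficients Îm, then sum cosines) can be sketched numerically. The code below assumes that the degree of second-order coherence is the autocorrelation of Eq. (9.6) divided by I0², that is, g(2)(τ) = 1 + ½ Σ (|Îm|/I0)² cos(2πmΔfτ), which is the form implied by the three properties listed above. The function name is hypothetical, and the Gaussian spectrum with random phases is only an illustration of chaotic light.

```python
import numpy as np

def second_order_coherence(A_hat, df, tau):
    """g2(tau) from sampled complex spectral amplitudes A_hat[n] = A_n * exp(i*phi_n)."""
    n = A_hat.size
    i0 = 0.5 * np.sum(np.abs(A_hat)**2) * df                # Eq. (9.4b)
    corr = np.correlate(A_hat, A_hat, mode="full") * df     # lags -(n-1) ... (n-1)
    i_m = np.abs(corr[n:])                                  # |I_m| for m = 1 ... n-1, Eq. (9.4c)
    m = np.arange(1, n)
    tau = np.atleast_1d(tau)
    cosines = np.cos(2 * np.pi * df * np.outer(tau, m))
    return 1.0 + 0.5 * cosines @ (i_m / i0)**2

# Gaussian amplitude spectrum of FWHM w with random phases (a model of chaotic light).
rng = np.random.default_rng(0)
df, w, M = 1.0, 1000.0, 2500
n = np.arange(-M, M + 1)
amplitude = np.exp(-4 * np.log(2) * (n * df)**2 / w**2)
A_hat = amplitude * np.exp(1j * rng.uniform(0, 2 * np.pi, n.size))

i0 = 0.5 * np.sum(amplitude**2) * df
g2_zero = second_order_coherence(A_hat, df, 0.0)[0]
print(i0, g2_zero)
```

For a spectrum with many randomly phased components the value at τ = 0 should come out in the neighborhood of 2, with some scatter from one set of phases to the next; a spectrum consisting of a single component gives exactly 1 for all τ.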
Figure 9.1 (a) Gaussian amplitude spectrum |Â(f)| centered at f = f0,
having FWHM equal to w = 1000Δf; the sampling interval is conveniently
chosen as Δf = 1.0 in arbitrary units. (b) Logarithmic plot of |Îm|/I0 computed
from the Gaussian frequency spectrum depicted in (a) with randomly
assigned phase values to each frequency fn; M = 2500, I0 = 376.35. (c) Plot
of g(2)(τ) calculated from Eq. (9.7) with τ in units of 1/Δf. (d) Close-up
of g(2)(τ) showing its Gaussian central part having FWHM = 4 ln 2/(πw)
≈ 0.88/w.
phase values, we found the computed g(2)(τ) changed only in small and insignificant
ways with each choice of {φn}, provided that the chosen Δf was small
enough to properly sample the line-shape.
Note that g(2)(τ) − 1 has a Gaussian form in Figure 9.1(d) and an inverse
Lorentzian form in Figure 9.2(d). This, of course, is not a coincidence and a
general relationship can be shown to exist between the line-shape of chaotic light
and the functional form of g(2)(τ). Recalling that the degree of first-order
coherence is defined as
$$g^{(1)}(\tau) = \frac{\sum_n A_n^2\exp(i2\pi n\Delta f\,\tau)\,\Delta f}{\sum_n A_n^2\,\Delta f} \;\to\; \frac{\int_0^\infty |\hat{A}(f)|^2\exp(i2\pi f\tau)\,df}{\int_0^\infty |\hat{A}(f)|^2\,df}, \tag{9.8}$$
Figure 9.2 (a) Lorentzian amplitude spectrum |Â(f)| centered at f = f0,
having FWHM equal to w = 250Δf; the sampling interval is conveniently
chosen as Δf = 1.0 in arbitrary units. (b) Logarithmic plot of |Îm|/I0 computed
from the Lorentzian frequency spectrum depicted in (a) with randomly
assigned phase values to each frequency fn; M = 5000, I0 = 112.32. (c) Plot
of g(2)(τ) calculated from Eq. (9.7) with τ in units of 1/Δf. (d) Close-up
of g(2)(τ) showing its inverse Lorentzian central part having FWHM =
√3 ln 2/(πw) ≈ 0.38/w.
To see this, note that Îm in Eq. (9.4c) is the sum of (2M + 1 − m) complex numbers.
Therefore, |Îm|² contains the sum of the squared moduli of these numbers plus many
cross terms. The cross terms, however, all have random phases and, when a large
number of such terms are added together, they tend to cancel out. What remains,
therefore, is mainly the sum of the squared moduli, namely,
$$|\hat{I}_m|^2 \approx \sum_{n=N_0-M}^{N_0+M-m} A_n^2A_{n+m}^2\,(\Delta f)^2 \;\to\; \Big[\int_0^\infty |\hat{A}(f')|^2\,|\hat{A}(f'-f)|^2\,df'\Big]\Delta f. \tag{9.10}$$
In the limit when Δf → 0 the first term in the numerator of Eq. (9.11) approaches
zero. As for the second term, the coefficient of cos(2πmΔfτ) is the same as |Îm|²
given by Eq. (9.10). Also, with reference to Eq. (9.4b), the denominator is equal
to 4I0². It is thus clear that, in the case of chaotic light, Eq. (9.9) is a direct
consequence of Eq. (9.7).
Note that, for chaotic light, g(2)(0) = 2, irrespective of the shape of the spectral
density function (i.e., the line-shape), simply because g(1)(0) = 1 according to
Eq. (9.8). Also, if the linewidth approaches zero, g(2)(τ) becomes very broad; in
the limit of zero linewidth, therefore, g(2)(τ) = 2 for all values of τ. The chaotic
fluctuations of intensity are, therefore, intrinsic to this type of light and cannot be
removed by spectral filtering, no matter how narrow the filter’s linewidth may be.
This is the fundamental difference between the coherent light from a laser and the
chaotic light from a thermal source; whereas the classical degree of second-order
coherence for thermal light is equal to 2, that for monochromatic laser light (i.e.,
single longitudinal mode, narrowband) is always equal to unity, as shown in the
following subsection.
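The link between line-shape and second-order coherence for chaotic light can be checked numerically. The sketch below assumes that Eq. (9.9) has the familiar form g(2)(τ) = 1 + |g(1)(τ)|², which is consistent with g(2)(0) = 2 and with the width 4 ln 2/(πw) quoted for the Gaussian case; the function name is hypothetical, and the sign of the Fourier exponent is an arbitrary choice that does not affect the modulus.

```python
import numpy as np

def first_order_coherence(power, f, tau):
    """Discrete form of Eq. (9.8): normalized Fourier sum of the sampled power spectrum."""
    phase = np.exp(-2j * np.pi * np.outer(np.atleast_1d(tau), f))
    return phase @ power / power.sum()

w = 1000.0                                         # FWHM of the Gaussian amplitude spectrum
f = np.arange(-2500.0, 2501.0)                     # offsets from line center, df = 1
amplitude = np.exp(-4 * np.log(2) * f**2 / w**2)   # Gaussian |A(f)| with FWHM w

tau_half = 2 * np.log(2) / (np.pi * w)             # half of the predicted FWHM 4 ln2/(pi w)
g1 = first_order_coherence(amplitude**2, f, [0.0, tau_half])
g2_chaotic = 1.0 + np.abs(g1)**2                   # assumed form of Eq. (9.9)
print(g2_chaotic)                                  # expect values close to 2.0 and 1.5
```

The second value sits halfway between the peak value 2 and the baseline 1, confirming that the full width at half maximum of g(2)(τ) − 1 is 4 ln 2/(πw) ≈ 0.88/w for this line-shape.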
Figure 9.3 (a) Phase profile χ(t) over the time interval t = 0 to t = T = 1.0 in
units of 1/Δf. Note that the range of variation of χ is [−π : π]. (b) Plot of the
function a(t) = a0 cos[2πf0t + χ(t)] over the brief time interval [0.500, 0.525],
with a0 = 1.0, f0 = 500Δf and χ(t) as shown in (a). (c), (d) Amplitude and phase
profiles of Â(f), the Fourier transform of a(t), obtained numerically over the
entire time interval [0, 1]. The function Â(f) is truncated, with the values covering
the range f0 ± 20Δf retained. (e) Plot of |Îm|/I0, computed from the truncated
Â(f) using Eqs. (9.4b, 9.4c). (f) Computed plot of g(2)(τ) obtained with the values
of |Îm|/I0 inserted into Eq. (9.7). Aside from minor fluctuations – caused by the
truncation of Â(f) – the degree of second-order coherence is equal to 1.0 for all
values of τ.
of χ(t) guarantee a stable amplitude for the waveform over its entire duration.
Figures 9.3(c,d) display the computed amplitude and phase of Â(f), the
Fourier transform of a(t), obtained numerically over the time interval [0, T]; here
T = 1/Δf is the inverse of the frequency domain sampling interval, which is
conveniently chosen as Δf = 1.0 in arbitrary units. As usual, the spectrum is
sampled at discrete frequencies, fn = f0 + nΔf = (N0 + n)Δf, then truncated by
limiting its frequency content to the range −M ≤ n ≤ M.
In Figure 9.3, f0 = 500Δf, and the truncated Â(f) is confined to the frequency range
f0 ± 20Δf. The values of |Îm|/I0, computed from the truncated Â(f) in accordance with
Eqs. (9.4b) and (9.4c), are shown in Figure 9.3(e), while a plot of g(2)(τ), obtained
from Eq. (9.7) using these values of |Îm|/I0, appears in Figure 9.3(f). Aside from
minor fluctuations – caused by the truncation of Â(f) – note that the degree of
second-order coherence, g(2)(τ), is essentially equal to 1.0 for all values of τ.
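The same conclusion can be reached directly in the time domain. The sketch below builds a constant-amplitude, phase-modulated waveform a(t) = cos[2πf0t + χ(t)], low-pass filters a²(t) to obtain the cycle-averaged intensity, and autocorrelates the result over one period. The particular phase profile, the number of samples, and the crude FFT-based low-pass filter (standing in for the detector) are assumptions made for this illustration, not the procedure used to produce Figure 9.3.

```python
import numpy as np

n_t, f0 = 16384, 500                 # samples per period T = 1 (so df = 1); carrier harmonic
t = np.arange(n_t) / n_t
# Hypothetical smooth phase profile, periodic over T and confined to [-pi, pi].
chi = 1.5 * np.sin(2 * np.pi * 3 * t) + 1.0 * np.sin(2 * np.pi * 5 * t + 1.0)
a = np.cos(2 * np.pi * f0 * t + chi)

# Cycle-averaged intensity: remove the components of a^2 clustered around 2*f0.
spectrum = np.fft.fft(a**2)
harmonic = np.fft.fftfreq(n_t, d=1.0 / n_t)      # harmonic index of each FFT bin
spectrum[np.abs(harmonic) >= f0] = 0.0
intensity = np.fft.ifft(spectrum).real

# Circular autocorrelation over one period, normalized by the squared mean intensity.
power = np.abs(np.fft.fft(intensity))**2
g2 = np.fft.ifft(power).real / n_t / intensity.mean()**2
print(g2.min(), g2.max())            # both should be 1 to within rounding error
```

Because the phase modulation leaves the envelope untouched, the filtered intensity is constant and the normalized autocorrelation is flat, in contrast to the chaotic case where it rises to about 2 at zero delay.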
Figure 9.4 The light from two independent point sources, Sa, Sb, is detected by
the photo-detectors, D1, D2, located far away from the sources. The radiation from
both sources is narrowband and centered at the same frequency f0. The ideal, point-
like detectors are separated from each other by an adjustable distance L in the same
direction as Sa is separated from Sb. Seen from the detectors’ plane, the angular
distance between Sa and Sb is θ. Each detector produces an electrical signal
proportional to the cycle-averaged intensity of the corresponding incident light.
write the light amplitudes a1(t) and a2(t) arriving at the two detectors as follows:
$$a_1(t) = \sum_{n=N_0-M}^{N_0+M}\big\{A_n\sqrt{\Delta f}\cos(2\pi n\Delta f\,t + \varphi_n) + B_n\sqrt{\Delta f}\cos(2\pi n\Delta f\,t + \chi_n)\big\}, \tag{9.13a}$$
$$a_2(t) = \sum_{n=N_0-M}^{N_0+M}\big\{A_n\sqrt{\Delta f}\cos(2\pi n\Delta f\,t + \varphi_n) + B_n\sqrt{\Delta f}\cos[2\pi n\Delta f\,(t+\tau_b) + \chi_n]\big\}. \tag{9.13b}$$
Here the frequency component fn = nΔf arriving from Sa has the complex amplitude
Ân = An√Δf exp(iφn), while that from Sb has the amplitude B̂n = Bn√Δf exp(iχn).
The source Sa, being equidistant from the two detectors, makes equal contributions
to a1(t) and a2(t), whereas the contributions of Sb are shifted in time by
the relative delay τb.
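Following the same steps that led from Eq. (9.1) to Eq. (9.4), we now determine
the filtered (i.e., cycle-averaged) intensities I1(t) and I2(t) observed at the
detectors of Figure 9.4. We find
$$I_1(t) = I_{10} + \sum_{m=1}^{2M}|\hat{I}_{1m}|\cos(2\pi m\Delta f\,t + \psi_{1m}), \tag{9.14a}$$
$$I_2(t) = I_{20} + \sum_{m=1}^{2M}|\hat{I}_{2m}|\cos(2\pi m\Delta f\,t + \psi_{2m}). \tag{9.14b}$$
$$\hat{I}_{1m} = |\hat{I}_{1m}|\exp(i\psi_{1m}) = \sum_{n=N_0-M}^{N_0+M-m}(\hat{A}_n^* + \hat{B}_n^*)(\hat{A}_{n+m} + \hat{B}_{n+m})\,\Delta f, \tag{9.14d}$$
$$\hat{I}_{2m} = |\hat{I}_{2m}|\exp(i\psi_{2m}) = \sum_{n=N_0-M}^{N_0+M-m}\big[\hat{A}_n^* + \hat{B}_n^*\exp(-i2\pi n\Delta f\,\tau_b)\big]\big\{\hat{A}_{n+m}\cdots$$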
So far we have not made any assumptions about the nature of Sa and Sb, beyond
the fact that they are distant point sources with narrowband spectra centered at the
same frequency f0 = N0Δf. Equations (9.13)–(9.15) are, therefore, valid for any
type of light, so long as the sampled spectral amplitude and phase profiles, {(An,
φn)} of Sa and {(Bn, χn)} of Sb, are dense enough to provide proper representations
of the spectral density functions Â(f) and B̂(f). If the light beams emerging from
the independent sources Sa and Sb happen to be chaotic (e.g., thermal), then the
phase angles {φn} and {χn} will be random and uncorrelated, thus leading to
$$I_{10} \approx I_{20} \approx \tfrac{1}{2}\sum_{n=N_0-M}^{N_0+M}(A_n^2 + B_n^2)\,\Delta f \;\to\; \tfrac{1}{2}\int_0^\infty\big[|\hat{A}(f)|^2 + |\hat{B}(f)|^2\big]\,df, \tag{9.16}$$
$$\hat{I}_{1m}\hat{I}_{2m}^* \approx \sum_{n=N_0-M}^{N_0+M-m}\big\{A_n^2A_{n+m}^2 + B_n^2B_{n+m}^2\exp(-i2\pi m\Delta f\,\tau_b) + A_n^2B_{n+m}^2\exp[-i2\pi(n+m)\Delta f\,\tau_b] + B_n^2A_{n+m}^2\exp(i2\pi n\Delta f\,\tau_b)\big\}(\Delta f)^2. \tag{9.17}$$
Next we relate the various terms appearing in Eq. (9.17) to the first-order degrees
of coherence of Sa and Sb, defined as follows:
$$g_a^{(1)}(\tau) = \frac{\sum A_n^2\exp(i2\pi n\Delta f\,\tau)\,\Delta f}{\sum A_n^2\,\Delta f}, \tag{9.18a}$$
$$g_b^{(1)}(\tau) = \frac{\sum B_n^2\exp(i2\pi n\Delta f\,\tau)\,\Delta f}{\sum B_n^2\,\Delta f}. \tag{9.18b}$$
Straightforward calculations similar to those that led to Eq. (9.11) may now be
used to determine the expanded forms of |ga(1)(τ)|², |gb(1)(τ + τb)|², and ga(1)(τ)
gb*(1)(τ + τb), which turn out to contain the various terms that appear in Eq. (9.17).
One must then substitute for (Î1m/I10)(Î*2m/I20) in Eq. (9.15) from Eqs. (9.16) and
(9.17), and proceed to replace the resulting expressions with their equivalents in
Concluding remarks
Practical photodetectors may have a narrower bandwidth than is required for
producing an ideal cycle-averaged intensity I(t) in a given application. In other
words, the low-pass filtering mentioned in going from Eq. (9.2) to Eq. (9.3) could
influence the low-frequency terms that survive the filtering. The transfer function
of the detector (including all electronic circuitry) must, therefore, be included in
Eq. (9.3) and all subsequent equations. For practical determinations of intensity
fluctuations, of course, the effects of electronic filtering must be taken into
account. However, as far as the fundamental principles discussed in this chapter
are concerned, the consequences of such filtering are irrelevant, and the detection
circuit’s transfer function may safely be ignored.
Another practical concern revolves around the question of noise in photo
detection. The output signal from a photodetector is, in general, accompanied by
several types of noise, such as shot noise, thermal noise, and the noise associated
with the photo-multiplication process. Accurate measurement of intensity cor-
relations and fluctuations requires a careful analysis of all relevant sources of
noise, elimination or minimization of undesirable signals, and collection of a
sufficient number of photons to ensure the adequacy of the available signal-to-
noise ratio. In this context it must also be mentioned that, when measuring the
intensity autocorrelation ⟨I(t) I(t + τ)⟩ at a fixed point in space, it is advantageous
to use a 50/50 beam-splitter in conjunction with two identical photodetectors, as
shown in Figure 9.6. Whereas the noise or other spurious signals from a single
detector could exhibit temporal correlations, a pair of well-isolated detectors is
unlikely to suffer from such complications. The splitting of the beam, of course,
will halve the signal strength at each detector, but, according to the classical
optical theory, it should not disturb the intensity fluctuations otherwise.
Figure 9.6 The degree of second-order coherence g(2)(τ) of a beam of light
may be determined by two identical photodetectors D1 and D2, placed symmetrically
with respect to the output ports of a 50/50 beam-splitter. According to
the classical optical theory, the intensity fluctuations at the two detectors are
identical with those of the light arriving at the splitter. The use of two detectors
(instead of one) is thus dictated by the need to mitigate the temporal correlations
of the noise (or other spurious signals).
A fundamental issue raised in the wake of the Hanbury Brown–Twiss experi-
ment concerned the quantum nature of light and its role in determining the
measured intensity fluctuations and correlations of the various types of radiation. In
particular, it was pointed out that a single photon leaving the source in Figure 9.6
could be picked up by either D1 or D2, but not by both, whereas the classical theory
allowed the beam-splitter to divide the photon’s energy between the two receivers.
Attempts to answer this and many related questions eventually ushered in the
modern era of quantum optics.2,3 The results obtained in the present chapter for
classical sources of light have been found to retain their validity under a quantum
mechanical treatment.3 In the meantime, however, several types of non-classical
light have been discovered whose proper treatment requires the full machinery of
the quantum theory of radiation and detection. A striking example of quantum-
optical phenomena is anti-bunching, where the degree of second-order coherence
g(2)(0) for certain non-classical sources is known to be below unity.3,4 In fact, the
entire range of values between 0.0 and 1.0 is accessible to g(2)(0) in quantum optics.
This, of course, is an impossibility in the classical theory, where Eq. (9.7) dictates
that g(2)(0) ≥ 1.
† The coauthor of this chapter is Lifeng Li, now at the Tsinghua University in China.
10 What in the world are surface plasmons? 129
[Figure 10.1 schematic labels: glass hemisphere, angle of incidence $\theta$.]
[Figure 10.2 plots: (a) amplitude reflection coefficient versus $\theta$ (degrees); (b) Poynting vector magnitudes, including $S_s$, versus $z$ (nm).]
Figure 10.2 (a) Computed plots of amplitude reflection coefficients for the p-
and s-components of polarization versus the angle of incidence $\theta$, for the
monochromatic plane wave ($\lambda = 633$ nm) incident at the interface between
glass and a thin aluminum layer ($d = 5$ nm) shown in Figure 10.1. The dip in
$|r_p|$ at $\theta \approx 45^\circ$ is caused by the excitation of a surface
plasmon in the aluminum film. (b) Plots of the magnitude of the Poynting vector
$S$ against the depth $z$ within the aluminum layer, at $\theta = 45^\circ$. Note
that approximately 90% of the incident power of the p-polarized light enters the
aluminum film and is absorbed fairly uniformly within the film’s thickness. In
contrast, only 30% of the s-polarized light is absorbed by the film.
aside from the weak, plasmon-related feature in the vicinity of
$\theta_{\rm crit}$, the plots of $|r_p|$ and $|r_s|$ in Figure 10.4(a) already
resemble those for a very thick aluminum film (i.e., one for which
$d \gg$ skin depth). It is thus clear that the lower interface, between aluminum
and air, is responsible for the excitation of surface plasmons: increasing the
film thickness prevents the electromagnetic field from reaching the
aluminum–air interface, thus suppressing the excitation of the plasma wave. Also
note in Figure 10.4(b) that the slope of $S_s$ is greatest near the
glass–aluminum interface, and the flux of optical energy contained in the
s-polarized beam decays exponentially as it moves away from this interface
towards the aluminum–air interface. In contrast, the slope of $S_p$ is greatest
at the aluminum–air interface, indicating that most of the energy is deposited
at that site. This is yet another indication that the aluminum–air interface is
responsible for the excitation of surface plasmons in the system of Figure 10.1.
[Figure 10.3 plots: (a) amplitude reflection coefficients, including $|r_p|$, versus $\theta$ (degrees); (b) Poynting vector magnitudes, including $S_s$, versus $z$ (nm).]
Figure 10.3 Same as Figure 10.2, except for the thickness of the aluminum film,
which is now 10 nm. The resonant absorption in this case occurs at
$\theta = 42.95^\circ$, and the fraction of p-polarized light absorbed by the
aluminum layer is over 98%.
$$r_s = \frac{k_\perp/k_0 - \sqrt{\varepsilon - (k_\parallel/k_0)^2}}{k_\perp/k_0 + \sqrt{\varepsilon - (k_\parallel/k_0)^2}}. \qquad (10.2)$$
[Figure 10.4 plots: (a) $|r_p|$ and $|r_s|$ versus $\theta$ (degrees), with $\theta = 42.41^\circ$ marked; (b) $S_p$ and $S_s$ versus $z$ (nm).]
Figure 10.4 Same as Figures 10.2 and 10.3, except for the thickness of the
aluminum film, which is now 20 nm. The resonant absorption in this case occurs
at $\theta = 42.41^\circ$, and the fraction of p-polarized light absorbed within
the aluminum layer is just over 60%.
The denominator of the expression for $r_p$ in Eq. (10.1) goes to zero at
$k_\parallel/k_0 = \sqrt{\varepsilon/(1+\varepsilon)}$, indicating that $r_p$ has
a pole at this point. No such pole, however, exists for $r_s$. In the case of
aluminum, $n + i\kappa = 1.38 + 7.6i$ at $\lambda_0 = 633$ nm, yielding
$\varepsilon = -55.86 + 20.98i$; this results in a value $1.008 + 0.003i$
for the pole of $r_p$. Under ordinary circumstances, when the metal surface is
illuminated in air at an oblique angle $\theta$ we have
$k_\parallel/k_0 = \sin\theta$, which is less than unity and, therefore, far
from the pole. However, if evanescent waves are somehow created at an
air–aluminum interface, then $k_\parallel/k_0$ can exceed unity and, in the
neighborhood of the pole, the reflectivity $r_p$ at that interface will approach
infinity. This means that an evanescent p-polarized plane wave of very small
amplitude impinging at the metal surface can excite a very strong plane wave
within the metal. This plane wave, of course, is the surface plasmon, which is
capable of absorbing a good fraction of the energy from the incident beam and
converting it to heat within the metallic medium.
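The numbers quoted above for aluminum can be checked with a few lines of Python.
This is only a sketch: the variable names are illustrative, and the pole is
evaluated from the expression $\sqrt{\varepsilon/(1+\varepsilon)}$ given in the
text, using the principal branch of the complex square root.

```python
import cmath

# Complex refractive index n + i*kappa of aluminum at 633 nm (value from the text).
refractive_index = 1.38 + 7.6j

# Dielectric constant is the square of the complex refractive index.
epsilon = refractive_index ** 2

# Normalized in-plane wavenumber k_parallel/k0 at which the denominator of r_p vanishes.
pole = cmath.sqrt(epsilon / (1 + epsilon))

print(f"epsilon          = {epsilon:.2f}")
print(f"pole of r_p      = {pole:.3f}")
print(f"exceeds unity by = {pole.real - 1:.4f}")
```

The real part of the pole lies slightly above unity, which is why only
evanescent waves (with $k_\parallel/k_0 > 1$ in air) can approach it.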
In light of the above arguments it is not difficult to see that, in the system of
Figure 10.1, the creation of evanescent waves with $k_\parallel/k_0 \approx 1$ at
the aluminum–air interface is responsible for the sharp decline in $r_p$ at
angles slightly greater than the critical TIR angle. Since the expression for
$r_s$ in Eq. (10.2) does not admit a pole, no such behavior could be expected
from the s-polarized light.
[Figure residue: schematic labels (glass hemisphere, angle $\theta$, air gap, metal plate) and plots of $|r_p|$ with curve labels running from 41.0 deg. to 43.0 deg.]
incidence in the vicinity of the critical TIR angle for the glass–air interface,
that is, when $k_\parallel/k_0 \approx 1$.
[Figure residue: schematic labels (metal grating, incident beam, $E_s$, $E_p$, objective lens).]
from the cases discussed in the preceding examples, the range of parameters
over which surface plasmon excitation can be expected is very narrow and,
therefore, the angle of incidence at which surface plasmons are excited must be
sharply defined.
If one directs a focused beam onto a metal grating, as shown in Figure 10.8,
then a wide angular spectrum will be present in the beam, and some of the
Figure 10.9 Photographs showing the intensity distribution at the exit pupil
of a 0.8NA microscope objective lens, through which a collimated beam
of laser light ($\lambda = 633$ nm) is focused on a gold-coated diffraction
grating; (n, k)gold = (0.13, 3.16). The grooves of the grating are oriented
along the Y-axis, the grating period is 1.6 μm, and the grooves, which have a
trapezoidal cross-section, are 0.5 μm wide at the top and 70 nm deep. The
direction of the linear polarization of the incident beam is parallel to the
grooves in (a) and perpendicular to the grooves in (b). (From Ronald E. Gerber,
Ph.D. dissertation, Optical Sciences Center, University of Arizona, Tucson.)
rays will be strongly absorbed. A photograph of the reflected beam at the exit
pupil of the lens will show one or more dark lines corresponding to the absorption
of surface plasmons within the grating. Figure 10.9 shows a typical set of results
obtained in an experiment of this type. When the polarization is parallel to the
grooves, as is the case in Figure 10.9(a), there are no surface plasmon bands.
However, with the polarization vector perpendicular to the grooves, surface
plasmons are clearly excited, as shown in Figure 10.9(b). Results of theoretical
calculations confirming these results are shown in Figures 10.10 and 10.11. In these
calculations, Maxwell’s equations were solved for about 10 000 plane waves
impinging on the metal grating at various angles. These results were then combined
to represent the focused cone of light created by a 0.8NA objective lens.
In the case of Figure 10.10, where the incident polarization vector was parallel to
the grooves, no plasmons were observed. We did the calculations for three different
positions of the focused spot over the grooves, however, to show the so-called
baseball pattern that results from superposition of the various diffracted orders.
Frames (a), (b), and (c) correspond respectively to a beam focused on one groove
edge, on the middle of a groove, and on an opposite groove edge. The phase
differences between various diffracted orders create constructive and destructive
interference among these various orders in their regions of mutual overlap, thus
Figure 10.10 Computed plots of intensity distribution at the exit pupil of a 0.8NA
objective lens through which a uniform plane wave is focused on a diffraction
grating. The grooves are oriented at 45° relative to the X-axis. The parameters of the
grating are the same as those used in the experiment (see the caption to Figure 10.9).
The various diffraction orders are clearly visible in these so-called “baseball”
patterns. The incident linear polarization is parallel to the grooves, thus explaining
the absence of plasmon-related dark bands in these pictures. The center of the
focused spot is (a) on a groove edge, (b) in the middle of a groove, and (c) on the
opposite groove edge.
giving rise to black and white areas. When the polarization is perpendicular to the
grooves, the pattern in Figure 10.11 is obtained. For this computation the position
of the focused spot on the grating was on a groove edge similar to that shown in
Figure 10.10(c). The dark bands of Figure 10.11, predicted by this theoretical
calculation to arise from surface plasmon excitation, agree quite well with the
experimental results of Figure 10.9(b).
General Formulation
With reference to Figure 11.1, in a homogeneous medium of dielectric constant
$\varepsilon$ the propagation vector is
$\mathbf{k} = k_0(\sigma_y\hat{\mathbf{y}} + \sigma_z\hat{\mathbf{z}})$, where
$k_0 = 2\pi/\lambda_0$ and $\sigma_y^2 + \sigma_z^2 = \varepsilon$. In general,
$\sigma_z = \pm\sqrt{\varepsilon - \sigma_y^2}$, with both plus and minus signs
admissible. In each of the semi-infinite cladding media, however, only one value
of $\sigma_z$ is allowed, corresponding to the solution that approaches zero
when $z \to \pm\infty$. This is why $\sigma_{z1}$ of the upper cladding in
Figure 11.1 is chosen to have a plus sign, whereas that of the lower cladding
has a minus sign. ($\sigma_{z1}$, $\sigma_{z2}$ have positive imaginary parts.)
†
This chapter is co-authored with Armis R. Zakharian, now with Corning Corp., and Jerome V. Moloney of
the University of Arizona.
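[Figure 11.1 schematic labels: axes $y$ and $z$; field components $E_z$, $H_x$, $E_y$; slab of width $w$ (medium 2) between cladding media 1; $\mathbf{k}_1 = k_0(\sigma_y\hat{\mathbf{y}} + \sigma_{z1}\hat{\mathbf{z}})$, $\mathbf{k}_2 = k_0(\sigma_y\hat{\mathbf{y}} \pm \sigma_{z2}\hat{\mathbf{z}})$, $\mathbf{k}_3 = k_0(\sigma_y\hat{\mathbf{y}} - \sigma_{z1}\hat{\mathbf{z}})$.]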
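The E- and H-fields of each plane-wave are related through the Maxwell
equation $\nabla \times \mathbf{H} = \partial\mathbf{D}/\partial t$ (where
$\mathbf{D} = \varepsilon_0\varepsilon\mathbf{E}$) as follows:
$$H_x(y,z,t) = H_0 \exp\{i[k_0(\sigma_y y \pm \sigma_z z) - \omega t]\}, \qquad (11.1a)$$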
The corresponding E-field for each mode can be found from Eqs. (11.1).
Continuity of $H_x$ and $E_y$ at the $z = \pm\tfrac{1}{2}w$ boundaries yields
$$Z_0 H_2(\sigma_{z2}/\varepsilon_2)[\exp(ik_0\sigma_{z2}w/2) \mp \exp(-ik_0\sigma_{z2}w/2)] = Z_0 H_1(\sigma_{z1}/\varepsilon_1). \qquad (11.3b)$$
Substituting for $H_1$ from Eq. (11.3a) into Eq. (11.3b), rearranging the terms,
and expressing $\sigma_{z1}$ and $\sigma_{z2}$ in terms of $\sigma_y$, we find:
$$\frac{\varepsilon_1\sqrt{\varepsilon_2-\sigma_y^2} - \varepsilon_2\sqrt{\varepsilon_1-\sigma_y^2}}{\varepsilon_1\sqrt{\varepsilon_2-\sigma_y^2} + \varepsilon_2\sqrt{\varepsilon_1-\sigma_y^2}}\,\exp\!\left(ik_0\sqrt{\varepsilon_2-\sigma_y^2}\,w\right) = \pm 1. \qquad (11.4)$$
the waveguide depicted in Figure 11.1. Each solution $\sigma_y$ of Eq. (11.4)
corresponds to a particular mode of the waveguide; when the plus (minus) sign is
used on the right-hand side of Eq. (11.4), the solution represents an even (odd)
mode. Since we are presently interested in modes that propagate from left to
right in Figure 11.1, the imaginary part of $\sigma_y$ must be non-negative
(i.e., $\sigma_y^{(i)} \geq 0$), otherwise the mode will grow exponentially as
$y \to \infty$. Also, when computing the complex square roots in Eq. (11.4), one
must always choose the root which has a positive imaginary part.
Note that the coefficient multiplying the complex exponential on the left-hand
side of Eq. (11.4) is the Fresnel reflection coefficient $r_p$ for a p-polarized
(TM) plane-wave at the interface between media of dielectric constants
$\varepsilon_1$ and $\varepsilon_2$. The Fresnel coefficient has a singularity
(pole) at $\sigma_y = \sqrt{\varepsilon_1\varepsilon_2/(\varepsilon_1+\varepsilon_2)}$,
where its denominator vanishes. The function on the left-hand side of Eq. (11.4)
thus varies rapidly in the vicinity of this pole, where some of the solutions of
the equation are to be found. In particular, when $w \to \infty$, the complex
exponential approaches zero and the pole itself becomes a solution. This can be
seen most readily with reference to Eqs. (11.3); by allowing
$\exp(+ik_0\sigma_{z2}w/2) \to 0$ and substituting for $H_1$ from Eq. (11.3a)
into Eq. (11.3b), we find $\sigma_{z2}/\varepsilon_2 = -\sigma_{z1}/\varepsilon_1$,
namely,
$\varepsilon_1\sqrt{\varepsilon_2-\sigma_y^2} + \varepsilon_2\sqrt{\varepsilon_1-\sigma_y^2} = 0$.
$\sigma_y^{(+)}$                              $\sigma_y^{(-)}$
1.017 + i 0.22 $\times 10^{-3}$               1.041 + i 1.39 $\times 10^{-3}$
0.171 + i 7.868                               0.204 + i 13.738
0.1145 + i 7.860                              0.1722 + i 13.729
0.211 + i 20.0012                             0.2135 + i 26.3795
0.1892 + i 19.992                             0.1965 + i 26.3698
Prism-coupling
To excite SPPs on the flat surface of a metal slab, one may use the
prism-coupling scheme of Figure 11.2, commonly referred to as the Kretschmann or
Otto configuration depending on whether the metal is thin or thick.6 The
incident beam arrives at the bottom of the prism (refractive index $= n_0$) at
an angle $\theta$ slightly greater than the critical angle $\theta_c$ of total
internal reflection. Since $k_y = k_0 n_0 \sin\theta$, the waves coupled to the
metal slab will have $k_y > k_0$, a basic requirement for SPP
11 Surface plasmon polaritons on metallic surfaces 143
Table 11.2. Fundamental modes of silver slabs of differing thickness
($\lambda_0 = 650$ nm, $\varepsilon_1 = 1.0$, $\varepsilon_2 = -19.6224 + 0.443i$)
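[Figure 11.2 schematic labels: prism of refractive index $n_0$, incidence angle $\theta$, field components $E_z$, $E_y$, $H_x$, gap (medium 1), metal slab of width $w$ (medium 2); axes $y$ and $z$.]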
excitation. The incident beam, being mildly focused, has a k-space spectrum that
spans a few degrees around $\theta_c$. Most of these k-vectors are reflected at
the prism’s base; however, a narrow range of incidence angles evanescently
couples to the metal surface and proceeds to excite the plasmons.
Figure 11.3 shows computed plots of the Fresnel reflection coefficient
$r_p = |r_p|\exp(i\phi_p)$ for a p-polarized plane-wave versus the incidence
angle $\theta$ at the bottom of the prism ($n_0 = 1.5$, $\lambda_0 = 650$ nm).
In Figure 11.3(a), corresponding to the case of a thick silver slab separated by
a 1078 nm air gap, there is a single resonant absorption at
$\theta = 43.15^\circ$. Due to the narrow range of k-vectors
[Figure 11.3 plots: panels (a) and (b), each showing $|r_p|$ and the phase $\phi_p$ (degrees) versus $\theta$ (°) over the range 40° to 50°.]
Figure 11.3 Plots of the Fresnel reflection coefficient for p-polarized (TM)
light, $r_p = |r_p|\exp(i\phi_p)$, versus the incidence angle $\theta$ at the
bottom of the prism of Figure 11.2; $\lambda_0 = 650$ nm, $n_0 = 1.5$. (a) The
case of a thick silver slab separated from the prism by a 1078 nm air gap.
(b) The case of a 65 nm-thick silver slab separated by a 950 nm air gap. In each
case the gap is optimized to enhance the strength of the excited plasmon(s).
that cross the gap, we expect the footprint of the beam on the metal surface to be
much wider than the diameter of the focused spot at the prism’s base. The rapid
variation of the phase $\phi_p$ in the vicinity of the resonance implies that the
footprint on the metal surface will not be centered under the incident spot but,
rather, it will be shifted to the right. Figure 11.3(b), corresponding to a
65 nm-thick silver slab separated from the prism by a 950 nm air gap, exhibits
two resonant absorptions, representing the odd and even modes of the metallic
slab. The first resonance at $\theta_1 = 42.86^\circ$, having the smaller value
of $k_y$, excites the even mode, while the second resonance, at
$\theta_2 = 43.52^\circ$, excites the odd mode.
The coupling of a focused beam of light through a glass prism to a thick
(semi-infinite) silver slab is depicted in Figures 11.4 and 11.5. At the base of
the prism, the Gaussian beam’s full-width-at-half-maximum-amplitude (FWHM) is
4.0 μm, the central ray’s incidence angle is $\theta = 43^\circ$, and the air gap
is 1.078 μm. The expected SPP wavelength,
$\lambda_0/{\rm Re}[\sigma_{\rm spp}] = 633.22$ nm, is consistent with
$\lambda_0/(n_0 \sin\theta) = 633.6$ nm estimated from Figure 11.3(a) at the
minimum of $r_p(\theta)$. From Figure 11.4(a), the profile of $H_x(y)$ sampled at
$\Delta z = 10$ nm below the metal surface has a period of 634 nm (peak of the
function’s Fourier spectrum), in excellent agreement with the theory.
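The SPP wavelength and extinction rate quoted in this section follow directly
from the pole $\sigma_{\rm spp} = \sqrt{\varepsilon_1\varepsilon_2/(\varepsilon_1+\varepsilon_2)}$
of the Fresnel coefficient. The Python sketch below evaluates these quantities
for the silver parameters listed with Table 11.2 and compares the result with
the prism-side estimate $\lambda_0/(n_0\sin\theta)$ at the resonance angle of
Figure 11.3(a). Variable names are illustrative, and the principal branch of the
complex square root is assumed.

```python
import cmath
import math

wavelength_nm = 650.0               # vacuum wavelength (from the text)
eps1 = 1.0                          # air
eps2 = -19.6224 + 0.443j            # silver (Table 11.2 caption)
n0 = 1.5                            # prism refractive index
theta_res_deg = 43.15               # resonance angle read from Figure 11.3(a)

# Normalized propagation constant of the SPP on a thick (semi-infinite) slab.
sigma_spp = cmath.sqrt(eps1 * eps2 / (eps1 + eps2))

k0_per_um = 2 * math.pi / (wavelength_nm * 1e-3)   # vacuum wavenumber, rad/um

spp_wavelength_nm = wavelength_nm / sigma_spp.real
phase_rate = k0_per_um * sigma_spp.real            # k0 Re[sigma_spp], rad/um
decay_rate = k0_per_um * sigma_spp.imag            # k0 Im[sigma_spp], 1/um
prism_wavelength_nm = wavelength_nm / (n0 * math.sin(math.radians(theta_res_deg)))

print(f"SPP wavelength       : {spp_wavelength_nm:.2f} nm")
print(f"k0 Re[sigma_spp]     : {phase_rate:.4f} per um")
print(f"k0 Im[sigma_spp]     : {decay_rate:.4f} per um")
print(f"lambda0/(n0 sin th)  : {prism_wavelength_nm:.1f} nm")
```

The two wavelength estimates should agree to within a fraction of a nanometer,
which is the consistency check made in the text.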
The Poynting vector plots of Figure 11.5 show how a fraction of the evan-
escent field’s energy reaches the metal surface, of which fraction a certain portion
immediately returns to the prism, while the remainder turns around and propa-
gates along the metal surface in the y-direction.
[Figure 11.4 plots: panels (a)–(d), including color scales for $|E_y|$ and $|E_z|$; horizontal axis $y$ [μm], vertical axis $z$ [μm].]
Figure 11.4 Electromagnetic fields in the gap region between the prism and
the semi-infinite metal surface. (a) Instantaneous Hx. (b-d) Magnitudes of Hx, Ey,
Ez. The evanescent field at the bottom of the prism is visible in the upper left-
hand corner of each frame. The SPP is launched at the lower left-hand side. Due
to back-coupling to the prism, the SPP’s decay rate along the y-axis is nearly
twice the expected rate.
The best fit to Re[$H_x$] of Figure 11.4(a) is
$\exp(-0.01367\,y)\sin[9.9165(y + 0.245)]$. While
$k_0{\rm Re}[\sigma_{\rm spp}] = 9.9226$ is quite close to the observed value of
9.9165, the decay rate of 0.01367 is substantially greater than the SPP
extinction rate of $k_0{\rm Im}[\sigma_{\rm spp}] = 0.006$; this is caused by the
SPP’s back-coupling to the prism. We truncated the simulated prism by removing
the glass that lies directly above the excited SPP (see
[Figure 11.5 plots: (a) $S_y \times 10^{-5}$, (b) $S_z \times 10^{-5}$, (c) $|S| \times 10^{-5}$; vertical axis $z$ [μm].]
Figure 11.5 (a, b) Components $S_y$ and $S_z$ of the Poynting vector $S$ in the
gap region between the prism and the semi-infinite metal. (c) Close-up of $|S|$;
superimposed arrows show the direction of $S$.
[Figure 11.6 plots: panels (a)–(c); horizontal axis $y$ [μm], vertical axis $z$ [μm].]
Figure 11.6 (a–c) Plots of $H_x$, $S_y$, $S_z$ in the gap region between the
prism and a semi-infinite metallic medium. To eliminate the back-coupling of the
SPP to the prism, the part of the prism that lies above the launched SPP has
been removed (see the inset in Figure 11.2); the prism thus extends from
−40 μm to 0 along the y-axis. The SPP’s decay rate along y now agrees with the
theoretical prediction.
the inset in Figure 11.2); the truncated prism thus occupied only the interval
(−40 μm, 0) along the y-axis. The simulation results for the truncated prism
shown in Figure 11.6 exhibit a period of 634 nm in the $y \geq 0$ region
(obtained from the waveform’s Fourier spectrum). The best fit to Re[$H_x$],
namely, the function $\exp(-0.006\,y)\sin[9.9165(y + 0.056)]$, now yields the
expected decay rate as well.
[Figure 11.7 plots: panels (a)–(d), including color scales for $|E_y|$ and $|E_z|$; horizontal axis $y$ [μm], vertical axis $z$ [μm].]
Figure 11.7 Electromagnetic field profiles on both sides of a 65 nm-thick silver
slab illuminated through a truncated prism; the slab is centered at
$z = 0.6$ μm, while the prism’s base at $z = 1.55$ μm extends from −80 μm to 0
along the y-axis. (a) Profile of instantaneous $H_x$. (b–d) Magnitudes of $H_x$,
$E_y$, $E_z$. The evanescent field just below the prism appears in the upper
left-hand corner of each frame. Both odd and even modes of the slab are excited,
their interference causing the peaks and valleys of the field distributions.
$\Delta z = 10$ nm below the slab, yields $\lambda_{y1} = 636.9$ nm,
$\lambda_{y2} = 629.4$ nm, in excellent agreement with
$\lambda_0/(n_0 \sin\theta_{1,2}) = 637.1$ nm, 629.3 nm obtained from the minima
of $r_p(\theta)$ of Figure 11.3(b). Computed values of $\sigma_y$ for a
65 nm-thick slab in Table 11.2 yield
$\lambda_y^{(\pm)} = \lambda_0/{\rm Re}[\sigma_y^{(\pm)}] = 636.6$ nm, 629.2 nm,
and $k_0{\rm Im}[\sigma_y^{(\pm)}] = 0.0034$, 0.0098, once again in agreement
with the simulated profile of Re[$H_x(y)$] shown in Figure 11.7(a).
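The mode values quoted above can be reproduced approximately by solving
Eq. (11.4) numerically. The Python sketch below is one possible approach, not
necessarily the method used to build Table 11.2: it multiplies Eq. (11.4)
through by the Fresnel denominator (to avoid the pole) and applies a secant
iteration in the complex plane, always taking the square root with non-negative
imaginary part. The function names and starting guesses are assumptions chosen
for this example.

```python
import cmath
import math

WAVELENGTH_UM = 0.650                # vacuum wavelength (from the text)
EPS1 = 1.0                           # cladding (air)
EPS2 = -19.6224 + 0.443j             # silver (Table 11.2 caption)
WIDTH_UM = 0.065                     # slab thickness w = 65 nm
K0 = 2 * math.pi / WAVELENGTH_UM     # vacuum wavenumber, rad/um


def sqrt_pos_imag(z):
    """Complex square root, choosing the root with non-negative imaginary part."""
    root = cmath.sqrt(z)
    return -root if root.imag < 0 else root


def residual(sigma_y, sign):
    """Eq. (11.4) multiplied by its Fresnel denominator; zero at a mode."""
    sz1 = sqrt_pos_imag(EPS1 - sigma_y * sigma_y)
    sz2 = sqrt_pos_imag(EPS2 - sigma_y * sigma_y)
    numerator = EPS1 * sz2 - EPS2 * sz1
    denominator = EPS1 * sz2 + EPS2 * sz1
    return numerator * cmath.exp(1j * K0 * sz2 * WIDTH_UM) - sign * denominator


def solve_mode(sign, guess_a, guess_b, max_steps=50, tol=1e-13):
    """Secant iteration; sign = +1 selects the even mode, -1 the odd mode."""
    s0, s1 = complex(guess_a), complex(guess_b)
    f0, f1 = residual(s0, sign), residual(s1, sign)
    for _ in range(max_steps):
        if f1 == f0:                 # no further progress possible
            break
        s0, s1 = s1, s1 - f1 * (s1 - s0) / (f1 - f0)
        f0, f1 = f1, residual(s1, sign)
        if abs(s1 - s0) < tol:
            break
    return s1


sigma_even = solve_mode(+1, 1.020, 1.025)   # hypothetical starting guesses
sigma_odd = solve_mode(-1, 1.030, 1.035)
for name, sigma in (("even", sigma_even), ("odd", sigma_odd)):
    print(name, "mode: wavelength (nm) =", 1000 * WAVELENGTH_UM / sigma.real,
          " decay rate (1/um) =", K0 * sigma.imag)
```

Starting the iteration near $\sigma_{\rm spp}$ is a natural choice because, as
noted earlier, the solutions of Eq. (11.4) cluster around the pole of $r_p$.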
[Figure 11.8 plots: (a) $S_y \times 10^{-5}$, (b) $S_z \times 10^{-5}$, and two close-up panels; horizontal axis $y$ [μm], vertical axis $z$ [μm].]
Figure 11.8 (a, b) Poynting vector components $S_y$, $S_z$ around the
65 nm-thick silver slab illuminated through a truncated prism. (c, d) Close-ups
of $|S|$, showing the flow of energy in the early and late parts of the
propagating SPP.
Profiles of the Poynting vector S in the gap region between the silver slab
and the prism and also in the region immediately below the slab are shown in
Figure 11.8. The Sz plot shows a fraction of the evanescent field’s energy
reaching the metal slab, of which a certain proportion immediately returns to the
prism, while the remainder turns around and straddles the slab along the y-axis.
In general, the odd mode, being lossier than the even mode, has a shorter
propagation distance along the y-axis. The physics behind the loss mechanism may
be understood as follows. With the even mode, the field component Ez has the same
sign above and below the slab; therefore, at a given point along y, the electrical
charges at the top and bottom surfaces have opposite signs. Inside the metallic
slab, the field component $E_z$ – reduced by a factor of
$\varepsilon_2/\varepsilon_1$ relative to the $E_z$ immediately outside – helps
move the charges back and forth between the top and bottom surfaces. The slab
being thin, the transport distance is short; hence the charge velocity and the
corresponding electrical current are small. In contrast, the charges of the odd
mode have the same sign on opposite sides of the slab. Consequently, positive and
negative charges must move laterally (in the $\pm y$-directions) during each period of
oscillation. The travel distance is now on the order of the SPP wavelength, which is
typically greater than the slab thickness. Therefore, the current densities of the odd
mode are relatively large, leading to correspondingly large losses.
Concluding remarks
In this chapter we analyzed the surface modes of thin and thick metallic slabs.
Maxwell’s equations admit many solutions for electromagnetic fields that can be
considered localized at and around metallic surfaces (or, in general, confined to
the vicinity of metallo-dielectric interfaces). However, only a handful of such
solutions extend far enough beyond their point of origination to be considered
useful for practical applications. The odd and even waves that propagate along
the surfaces of metallic slabs are examples of such long-range surface plasmon
[Figure 11.9 schematic: panels (a) and (b) showing E- and H-field lines above a metal surface; labels $\lambda_0/\sigma_y$ and $V_p = c/\sigma_y$.]
Figure 11.9 (a) The SPP’s E-fields originate on positive charges and terminate
on negative ones. Continuity of $H_\parallel$ and a negative
$\varepsilon_{\rm metal}$ ensure the continuity of $D_\perp$, while
$E_\parallel$ becomes continuous when $\sigma_y = \sigma_{\rm spp}$. (b) A
physical impossibility, since the divergence-free nature of the H-field requires
$H_\parallel$ to have opposite directions above and below the surface, thus
prohibiting the continuity of $H_\parallel$ at the boundary.
Michael Faraday (1791–1867) was born in a village near London into the family of
a blacksmith. His family was too poor to keep him at school and, at the age of 13,
he took a job as an errand boy in a bookshop. A year later he was apprenticed as a
bookbinder for a term of seven years. Faraday was not only binding the books but
was also reading many of them, which excited in him a burning interest in science.
When his term of apprenticeship in the bookshop was coming to an end, he
applied for the job of assistant to Sir Humphry Davy, the celebrated chemist, whose
lectures Faraday was attending during his apprenticeship. When Davy asked the
advice of one of the governors of the Royal Institution of Great Britain about the
employment of a young bookbinder, the man said: “Let him wash bottles! If he is
any good he will accept the work; if he refuses, he is not good for anything.”
Faraday accepted, and remained with the Royal Institution for the next fifty years,
first as Davy’s assistant, then as his collaborator, and finally, after Davy’s death, as
his successor. It has been said that Faraday was Davy’s greatest discovery.
In 1823 Faraday liquefied chlorine and in 1825 he discovered the substance
known as benzene. He also did significant work in electrochemistry, discovering
12 The Faraday effect 153
the laws of electrolysis. However, his greatest work was with electricity. In 1821
Faraday built two devices to produce what he called electromagnetic rotation,
that is, a continuous circular motion from the circular magnetic force around a
wire. Ten years later, in 1831, he began his great series of experiments in which
he discovered electromagnetic induction. These experiments form the basis of
modern electromagnetic technology.
Apart from numerous publications in scientific magazines, the most remarkable
document pertaining to his studies is his Diary, which he kept continuously
from the year 1820 to the year 1862. (This was published in 1932 by the Royal
Institution in seven volumes containing a total of 3236 pages, with a few thousand
marginal drawings.) Queen Victoria rewarded Faraday’s lifetime of achievement
by granting him the use of a house at Hampton Court and a knighthood. Faraday
accepted the cottage but gracefully rejected the knighthood.1
On 13 September 1845, Faraday discovered the magneto-optical effect that
bears his name. This day’s entry in his Diary reads: “Today worked with lines
of magnetic force, passing them across different bodies (transparent in different
directions) and at the same time passing a polarized ray of light through them
and afterwards examining the ray by a Nichol’s Eyepiece or other means.” After
describing several negative results in which the ray of light was passed through
air and several other substances, Faraday wrote in the same day’s entry: “A
piece of heavy glass which was 2 inches by 1.8 inches, and 0.5 of an inch thick,
being silico borate of lead, and polished on the two shortest edges, was
experimented with. It gave no effects when the same magnetic poles or the
contrary poles were on opposite sides (as respects the course of the polarized
ray) – nor when the same poles were on the same side, either with a constant or
intermitting current – BUT, when contrary magnetic poles were on the same
side, there was an effect produced on the polarized ray, and thus magnetic force
and light were proved to have relation to each other. This fact will most likely
prove exceedingly fertile and of great value in the investigation of both con-
ditions of natural force.”
In an isotropic material (such as ordinary glass) the three diagonal elements are
identical and, in the presence of a magnetic field along the Z-axis, there is a
non-zero off-diagonal element $\varepsilon'$, which couples the x- and
y-components of the optical E-field, that is,
$$\boldsymbol{\varepsilon} = \begin{pmatrix} \varepsilon & \varepsilon' & 0 \\ -\varepsilon' & \varepsilon & 0 \\ 0 & 0 & \varepsilon \end{pmatrix}.$$
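To get a feel for the size of the effect this tensor produces, the Python sketch
below estimates the single-pass Faraday rotation for light travelling along Z.
It assumes the antisymmetric form of the tensor (off-diagonal elements
$\varepsilon'$ and $-\varepsilon'$), for which the circularly polarized
eigenmodes see effective dielectric constants $\varepsilon \pm i\varepsilon'$,
and it uses the slab parameters quoted later in this chapter
($\varepsilon = 5.5$, $\varepsilon' = 0.01i$, thickness 20 μm,
$\lambda = 550$ nm). Multiple reflections at the facets are ignored and the sign
convention of the rotation is an assumption, so only the magnitude is meaningful.

```python
import cmath
import math

eps = 5.5                    # diagonal element of the dielectric tensor
eps_prime = 0.01j            # off-diagonal (magneto-optical) element
wavelength_nm = 550.0        # vacuum wavelength
thickness_nm = 20000.0       # slab thickness (20 um)

# Refractive indices of the two circularly polarized eigenmodes for propagation
# along the magnetization: eigenvalues of the 2x2 transverse block are eps +/- i*eps'.
n_plus = cmath.sqrt(eps + 1j * eps_prime)
n_minus = cmath.sqrt(eps - 1j * eps_prime)

# The plane of polarization turns by half the phase difference accumulated
# between the two circular components over the slab thickness.
phase_difference = 2 * math.pi * thickness_nm * (n_minus - n_plus) / wavelength_nm
rotation_deg = math.degrees(phase_difference.real / 2)

print(f"n+ = {n_plus.real:.6f}, n- = {n_minus.real:.6f}")
print(f"single-pass rotation magnitude: {abs(rotation_deg):.1f} degrees")
```

With these parameters both indices are real (the medium is transparent), so a
single pass produces pure rotation without ellipticity; any ellipticity in the
computed plots of this chapter must therefore come from interference among
multiple reflections.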
[Figure residue: schematic labels for Figure 12.2, panels (a) and (b): $E_x$, $E_y$, $E_p$, B-field lines, Z-axis.]
Figure 12.2 Faraday effect in the polar geometry. (a) In going normally
through a slab of magnetic material, a linearly polarized beam of light with its
E-field along the X-axis acquires a component of polarization along Y. The lines
of B-field shown within the medium represent either an externally applied
magnetic field or the intrinsic magnetization of the medium. (b) The effect is
also observed at oblique incidence. Shown here is a p-polarized incident beam,
which acquires an s-component upon transmission through the magnetic medium. (If
the incident beam is s-polarized, the magneto-optically induced polarization is
then in the p-direction.) In general, upon reversing the B-field from the +Z to
the −Z direction the magneto-optically induced component of polarization changes
sign.
of multiple reflections at the front and rear facets of the slab these functions
vary periodically with $\lambda$. (The same interference phenomena are
responsible for the non-zero values of $\eta_F$, which would otherwise be absent
in a transparent medium.) The net Faraday rotation angle is the average value of
$\theta_F$ over the relevant range of wavelengths, but one should also recognize
that the wavelength dependence of the direction of emergent polarization
produces a certain amount of depolarization in the emergent beam. The Faraday
rotation combined with the spectral bandwidth of the light source thus causes
partial depolarization as a direct consequence of interference among the
multiple reflections.
Oblique incidence
Figure 12.4 shows the transmitted amplitudes and polarization angles versus
the angle of incidence $\theta$ in the case of the slab 20 μm thick magnetized
along the Z-axis ($\varepsilon = 5.5$, $\varepsilon' = 0.01i$) when, as shown in
Figure 12.2(b), a p-polarized plane wave at the single wavelength of
$\lambda = 550$ nm is incident on the slab. The
[Figure 12.3 plots: (a) amplitudes $|t_x|$ and $|t_y|$ versus $\lambda$ (nm); (b) rotation $\theta_F$ and ellipticity $\eta_F$ (degrees) versus $\lambda$ (nm), over 546 nm to 554 nm.]
Figure 12.3 A plane wave, linearly polarized along the X-axis, is normally
incident on a slab 20 μm thick, as shown in Figure 12.2(a). The slab
($\varepsilon = 5.5$, $\varepsilon' = 0.01i$) is magnetized along the Z-axis.
(a) Plots of $|t_x|$ and $|t_y|$, the transmitted polarization components along
the X- and Y-axes, as functions of $\lambda$. (b) Plots of polarization rotation
angle $\theta_F$ and ellipticity $\eta_F$, versus $\lambda$.
[Figure plots: (a) transmitted amplitudes, including $|t_{sp}|$, versus $\theta$ (degrees); (b) rotation and ellipticity (degrees) versus $\theta$ (degrees).]
beam inside the slab travels at 25° relative to the direction of magnetization of
the material, the maximum Faraday effect as exemplified by jtspj is the same as at
normal incidence, because the propagation distance is correspondingly adjusted.
The wavelength-averaged Faraday rotation may be lower at larger angles of
incidence, but this is just a consequence of interference; it is not caused by any
reduction in the intrinsic optical activity of the slab. If, for instance, the facets are
antireflection coated, or if the beam enters and exits through index-matched
spherical surfaces, then multiple reflections would be eliminated and the Faraday
rotation becomes independent of the incidence angle.
The above discussions were confined to the case of a p-polarized incident
beam, but the conclusions remain valid for s-polarized light as well. For
example, Figure 12.6 is the counterpart of Figure 12.4, showing the transmitted
[Figure plots: (a) transmitted amplitudes $|t_{pp}|$ and $|t_{sp}|$ versus $\lambda$ (nm); (b) rotation and ellipticity (degrees) versus $\lambda$ (nm), over 546 nm to 554 nm.]
amplitudes and polarization angles versus the angle of incidence for an
s-polarized incident beam. Note that the magneto-optically generated component
of polarization $t_{ps}$ in Figure 12.6 is identical to $t_{sp}$ in Figure 12.4.
This is an important and completely general result, indicating that the amount
of light converted from one polarization state to another is independent of the
incident polarization state.
[Figure 12.6 plots: (a) transmitted amplitudes $|t_{ss}|$ and $|t_{ps}|$ versus $\theta$ (degrees); (b) rotation and ellipticity (degrees) versus $\theta$ (degrees).]
Figure 12.6 Same as Figure 12.4, except that here the incident beam is s-polarized.
[Figure 12.7 schematic labels: dielectric mirrors, Faraday medium, two lenses, incident and emergent $E_x$; axes X, Y, Z.]
Figure 12.8 Intensity and polarization patterns in the exit pupil of the
collimating lens of Figure 12.7. (a) The intensity distribution of the emergent
X-polarized component. The bright rings indicate the regions where the
conditions of resonance are met and the light passes through the resonator.
(b) The intensity distribution of the emergent Y-polarized component. The bright
rings coincide with those in (a), indicating that the conditions of resonance
for the incident polarization are the same as those for the magneto-optically
induced polarization. (c) Polarization rotation angle $\theta_F$ of the emergent
beam encoded in gray-scale. The range of values of $\theta_F$ is −23° (black) to
+63° (white). (d) The polarization ellipticity $\eta_F$ of the emergent beam
encoded in gray-scale. The range of values of $\eta_F$ is −32° (black) to +42°
(white).
The first objective lens (NA = 0.8) focuses a linearly polarized beam of light
onto the Fabry–Pérot resonator, and the second, identical, lens collimates the
transmitted beam, thus allowing observation at the exit pupil. For a slab of
transparent magnetic material 20 μm thick sandwiched between a pair of
dielectric mirrors, Figure 12.8 shows the computed patterns of intensity and
polarization angle at the exit pupil of the collimator. This figure indicates
that the rings of maximum transmission also correspond to locations of maximum
polarization rotation. The maximum and minimum rotation angles in
Figure 12.8(c) are +63° and −23°, respectively, well in excess of the rotations
obtained from the bare slab. Also note in Figures 12.8(c), (d) the asymmetrical
nature of the polarization angles in the first and third quadrants, on the one
hand, and in the second and fourth quadrants on the other hand.
Figure 12.9 (a) The longitudinal Faraday effect is observed when the direction of
the B-field within the slab of material is parallel both to the surface of the slab
and to the plane of incidence. The rotation of polarization in this case occurs
only at oblique incidence, where, upon transmission, a p-polarized beam
acquires an s-component and vice versa. If the direction of B is reversed, the
magneto-optically induced component of polarization will change sign. (b) The
transverse effect occurs when the B-field lies in the plane of the sample,
perpendicular to the plane of incidence. The MO interaction in this case occurs only
when the incident beam is p-polarized. Even then there is no polarization
rotation; the only effect is that a change in the magnitude of the B-field causes a
slight change in the magnitude of the transmitted p-light. The transverse effect is
small and is not bipolar, meaning that reversing the direction of B does not affect
the emergent beam.
Figure 12.10 The longitudinal Faraday effect arising when a p-polarized plane
wave (λ = 550 nm) is incident at oblique angle θ on a slab 20 μm thick. The slab
(ε = 5.5, ε′ = 0.01i) is magnetized along the X-axis, as depicted in Figure 12.9(a).
(a) The transmitted amplitudes |t_pp| and |t_sp| versus θ. (b) The polarization
rotation angle θ_F and the ellipticity η_F versus θ.
[Figure 12.11: transmitted intensity T_p^(0) and the difference T_p − T_p^(0) versus
angle of incidence (degrees).]
turns out to be the same for both directions of incident polarization; that is,
t_sp = t_ps.
The transverse effect is very different from both the polar and the longitudinal
effects. With s-polarized incident light, where the optical E-field is parallel to
the direction of the B-field in the slab, there is no MO effect whatsoever,
but for the p-polarized light the medium exhibits an effective refractive index
n = [ε + (ε′²/ε)]^(1/2). Thus in the transverse case neither s- nor p-polarized beams
undergo polarization rotation, but the magnitude of the transmitted p-light shows a
weak dependence on magnetization, that is, T_p = |t_p|² becomes a function of the
strength of the B-field. The transverse effect is not bipolar, so that changing the
direction of the B-field from +Y to −Y does not alter the magnitude of T_p. For a
slab of transparent material 20 μm thick and with a fairly large MO coefficient
(ε = 5.5, ε′ = 0.1i), Figure 12.11 shows computed plots of T_p^(0) (i.e., transmission in
the absence of a B-field, when ε′ = 0) and ΔT_p = T_p − T_p^(0) versus the angle of
incidence θ. Note, in particular, that ΔT_p ≈ 0 around the Brewster angle θ_B = 66.9°,
where a vanishing surface reflectivity results in minimal interference effects.
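The two numbers quoted in this paragraph can be checked with a few lines of arithmetic. The following is a minimal Python sketch, assuming the effective-index expression n = [ε + (ε′²/ε)]^(1/2) given above and taking the Brewster angle of the air/slab interface to be arctan √ε; the function names are illustrative only and do not come from the text.

```python
import cmath
import math


def transverse_effective_index(eps, eps_prime):
    """Effective index seen by p-polarized light in the transverse geometry."""
    return cmath.sqrt(eps + eps_prime**2 / eps)


def brewster_angle_deg(eps):
    """Brewster angle (degrees) for light incident from air on a medium of permittivity eps."""
    return math.degrees(math.atan(math.sqrt(eps)))


eps = 5.5          # diagonal element of the dielectric tensor (from the text)
eps_prime = 0.1j   # off-diagonal element (from the text)

n_unmagnetized = math.sqrt(eps)
n_effective = transverse_effective_index(eps, eps_prime)
theta_b = brewster_angle_deg(eps)

print(f"n without B-field : {n_unmagnetized:.5f}")
print(f"n with B-field    : {n_effective.real:.5f}")
print(f"Brewster angle    : {theta_b:.1f} degrees")
```

Because ε′ is purely imaginary here, ε′² is real and negative, so the effective index stays real and drops only slightly below √ε. The index depends on ε′², which is why reversing B leaves T_p unchanged.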
13 The magneto-optical Kerr effect
[Figure: the polar, longitudinal, and transverse geometries, with the field
components E_p, E_s and the magnetization M indicated.]
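and the applied magnetic field (or the internal magnetization of the medium)
takes place:¹
$$\varepsilon = \begin{pmatrix} \varepsilon_{xx} & \varepsilon_{xy} & \varepsilon_{xz} \\ \varepsilon_{yx} & \varepsilon_{yy} & \varepsilon_{yz} \\ \varepsilon_{zx} & \varepsilon_{zy} & \varepsilon_{zz} \end{pmatrix}.$$
In an isotropic material the three diagonal elements are identical and, in the
presence of a magnetic field along the Z-axis, there is a non-zero off-diagonal
element ε′, which couples the x- and y-components of the optical E-field:
$$\varepsilon = \begin{pmatrix} \varepsilon & \varepsilon' & 0 \\ -\varepsilon' & \varepsilon & 0 \\ 0 & 0 & \varepsilon \end{pmatrix}.$$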
circular polarizations are reflected with different reflectivities, r₊ and r₋, say.
When r₊ and r₋ happen to have a phase difference, the reflected beam exhibits a
polarization rotation, and if the magnitudes |r₊| and |r₋| differ from each other,
then there will be some degree of ellipticity. When the medium is transparent, n±
are real and, therefore, there is no phase difference between r₊ and r₋, although
their magnitudes will be different. In this case the reflected light exhibits
polarization ellipticity only. However, in the general case of reflection from the
surface of an absorbing medium (both ε and ε′ complex), the reflected light
exhibits elliptical polarization, with the major axis of the ellipse rotated relative
to the direction of incident polarization.
For concreteness, we will confine our attention throughout this chapter
to a metallic magnetic material having ε = −8 + 27i and ε′ = 0.6 + 0.2i at
the red HeNe wavelength, λ₀ = 633 nm. This is typical of the TbFeCo amorphous
alloys used in magneto-optical disks for data storage. The discussion, however,
will be kept quite general in nature, and the conclusions drawn from specific
examples should be applicable to a wide variety of magnetic materials.
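As a rough numerical illustration of how r₊ and r₋ translate into rotation and ellipticity, the Python sketch below treats the polar geometry at normal incidence. It rests on several assumptions that are not spelled out in this section: the circular-polarization indices are taken as n± = √(ε ± iε′), the reflection coefficients as the normal-incidence Fresnel values (1 − n±)/(1 + n±), the rotation as half the phase difference between r₊ and r₋, and the ellipticity angle as arctan[(|r₊| − |r₋|)/(|r₊| + |r₋|)]. Overall signs depend on conventions not fixed here, so only magnitudes should be read from the output. The value ε = −8 + 27i is likewise an assumption about the sign of the real part.

```python
import cmath
import math


def polar_kerr_normal_incidence(eps, eps_prime):
    """Return (rotation, ellipticity) in degrees for the polar geometry at normal incidence.

    Assumes n_pm = sqrt(eps +/- i*eps_prime) and Fresnel reflection from air.
    Signs are convention-dependent; magnitudes are the meaningful output.
    """
    n_plus = cmath.sqrt(eps + 1j * eps_prime)
    n_minus = cmath.sqrt(eps - 1j * eps_prime)
    r_plus = (1 - n_plus) / (1 + n_plus)
    r_minus = (1 - n_minus) / (1 + n_minus)
    # Rotation: half the phase difference between the two circular components.
    rotation = 0.5 * cmath.phase(r_plus / r_minus)
    # Ellipticity: set by the imbalance of the two circular amplitudes.
    a, b = abs(r_plus), abs(r_minus)
    ellipticity = math.atan((a - b) / (a + b))
    return math.degrees(rotation), math.degrees(ellipticity)


# Absorbing (metallic) medium, with the values assumed above.
rot_metal, ell_metal = polar_kerr_normal_incidence(-8 + 27j, 0.6 + 0.2j)

# Transparent medium: eps real, eps_prime purely imaginary, so n_pm are real.
rot_clear, ell_clear = polar_kerr_normal_incidence(5.5, 0.01j)

print(f"metal:       rotation {rot_metal:+.3f} deg, ellipticity {ell_metal:+.3f} deg")
print(f"transparent: rotation {rot_clear:+.3f} deg, ellipticity {ell_clear:+.3f} deg")
```

For the transparent case the rotation vanishes and only ellipticity remains, in line with the statement above; for the absorbing case both are present, each a fraction of a degree.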
Figure 13.2 A linearly polarized plane wave is reflected from the polished
surface of a magnetic material having perpendicular magnetization (the polar
case); ε_xx = −8 + 27i, ε_xy = 0.6 + 0.2i. (a) Plots of |r_pp|, |r_ss|, and |r_sp| = |r_ps|
versus the angle of incidence θ. (b) The polarization rotation angle ρ and
the ellipticity η versus θ for a p-polarized incident beam. (c) Same as (b) for
an s-polarized beam.
signal gains strength, peaking at θ = 65°. Again, ρ and η depend on whether the
incident polarization is p or s (see Figures 13.3(b), (c)), but the effective MO
signal, r_sp, is independent of the incident polarization. The longitudinal MO
signal is typically weaker than its polar counterpart by almost one order of
magnitude.
Figure 13.3 Same as Figure 13.2 but here for the longitudinal Kerr effect.
Again r_sp = r_ps at all angles of incidence. The MO effect is zero at normal
incidence, reaching its peak at a fairly large angle. Note that |r_ps|, ρ, η are about
an order of magnitude smaller than their counterparts for the polar geometry
case. Both polar and longitudinal effects are bipolar, in the sense that a reversal
in the direction of M results in a π phase shift of r_ps, leading to a reversal in the
signs of both ρ and η. [In panel (a), |r_sp| is plotted after multiplication by 1000.]
When the incident beam is p-polarized, the interaction is confined to the plane of
incidence, creating an extra E-field component within the same plane. Unlike the
polar and longitudinal effects, no E-fields are generated perpendicular to the
plane of incidence. Therefore, there are no polarization rotations in the transverse
geometry. What is interesting, however, is that the reflectivity of the sample,
R_p = |r_pp|², depends on the magnitude and direction of the magnetic moment M.
In Figure 13.4(a) the reflectivity in the absence of M is denoted by R_p^(0) (i.e., ε′
is set to zero). With M pointing along +Y the reflectivity changes slightly, becoming
R_p^(+); the difference is shown as the solid curve at the bottom of Figure 13.4(a).
Similarly, when M is reversed to point along −Y, the corresponding change in
R_p is given by the broken curve. The change in R_p is thus seen to depend on the
direction of M. This behavior is rather curious and, at first sight, appears to
violate the principles of symmetry, although a careful analysis shows it to be
correct.¹ It is noteworthy that this bipolar nature of R_p critically depends on the
magnetic medium being absorptive; for transparent magnetic media (where ε
Figure 13.4 Variation of the reflectivity R_p with the magnitude and/or direction
of M in the transverse geometry. The incident beam is p-polarized in all
cases; there are no transverse effects for s-polarized light. (a) The dependence of
R_p on the angle of incidence θ; the superscript zero indicates that M = 0. When
the medium is fully magnetized in the ±Y direction, the reflectivity is denoted by
R_p^(±). (b) The variation of R_p with M at various angles of incidence. At θ = 0° the
dependence on M is quadratic, while at θ = 20°, 40°, 60° it is nearly linear.
(The off-diagonal element ε′ of the dielectric tensor is assumed to be directly
proportional to M.) [The difference curves R_p^(±) − R_p^(0) and R_p(M) − R_p^(0) are
plotted after multiplication by 100.]
Figure 13.5 A linearly polarized beam of light having its E-field parallel to
the X-axis is focused onto the flat surface of a magnetic medium through a
diffraction-limited microscope objective lens (NA = 0.95, f = 3158λ). The power
of the incident beam (its integrated intensity) is set to unity. The reflected
light's distribution at the exit pupil has a small but important contribution from
the magnetization M of the sample.
Figure 13.6 Various distributions at the exit pupil of the objective of Figure 13.5,
when M is set to zero (i.e., no Kerr effect). (a) Distribution of intensity for the
reflected E_x; the total power = 0.62. (b) Distribution of phase for E_x; min = 0°,
max = 55°. (c) Distribution of intensity for the reflected E_y; the total power = 0.011.
(d) Distribution of phase for E_y; min = 36°, max = 150°. (e) The polarization
rotation angle ρ; ρ_min = −20.5°, ρ_max = +20.5°. (f) The polarization ellipticity η;
η_min = −25.1°, η_max = +25.1°.
of the reflected light, I_x = |E_x|², depicted in Figure 13.6(a), shows slight variations
across the aperture, in agreement with the r_pp and r_ss curves of Figure 13.2(a).
Similar variations are seen in the corresponding phase plot of Figure 13.6(b). In
addition to E_x, the reflected light also contains a y-component, E_y, whose intensity
and phase plots appear in Figures 13.6(c), (d). While the total power (i.e., the
integrated intensity) of E_x is 62% of the incident power, that of E_y is only 1.1%.
The reflected E_y in adjacent quadrants of the aperture exhibits a phase shift of π,
indicating a sign reversal from one quadrant to the next. The presence of E_y in the
reflected beam gives rise to the patterns of polarization rotation and ellipticity
depicted in Figures 13.6(e), (f); note the fairly large values of ρ and η in the four
corners of the aperture (ρ_min, ρ_max = ∓20.5°; η_min, η_max = ∓25.1°).
Figure 13.8 Contribution of the magnetic moment of the sample to the E-field
distribution at the exit pupil of the objective of Figure 13.5; M is assumed to be
aligned with the X-axis. (a), (b) Intensity and phase patterns of E_x; total power
0.65 × 10⁻⁷. The top and bottom halves of the aperture have a relative phase of π.
(c), (d) Intensity and phase patterns of E_y; the total power = 0.54 × 10⁻⁶. Note
the π phase difference between the right and left halves of the aperture.
depicted in Figures 13.8(c), (d). Note that, with the exception of a 90° rotation of
coordinates, the distributions in Figures 13.9(c), (d) are identical to those in
Figures 13.8(c), (d).
Signal detection
The MO contribution to the reflected polarization state can be converted to an
electronic signal with the aid of polarization-sensitive optics and photodetectors.
For instance, to detect the polar Kerr signal shown in Figure 13.7, one can employ
the differential scheme shown in Figure 13.10. Here the reflected beam is directed
toward a quarter-wave plate, which helps to eliminate the phase shift between Ex
and Ey. The quarter-wave plate is followed by a Wollaston prism, which mixes the
MO component of polarization contained in Ey with the reflected x-component of
polarization, Ex. The two mixed beams emerging from the Wollaston are detected
by a pair of photodetectors whose difference signal ΔS conveys information about
the sample's magnetic state. A computed plot of the normalized differential signal
versus the orientation angle ψ of M is given in Figure 13.11. As M moves away
Figure 13.9 Same as Figure 13.8, but for the transverse geometry, where M is
switched between +Y and −Y directions. E_x, depicted in (a), (b), has total power
0.26 × 10⁻⁵. E_y, depicted in (c), (d), has total power 0.54 × 10⁻⁶.
[Figure 13.10: differential detection scheme, showing the incident beam, splitter,
objective, magnetic sample (magnetization M), λ/4 plate, Wollaston prism, split
detector with outputs S1 and S2, and differential amplifier with output ΔS.]
[Figure 13.11: normalized differential signal versus ψ (degrees).]
from its initial orientation at ψ = 0° toward the plane of the sample at ψ = 90°,
and continues downward until ψ = 180°, ΔS follows these changes continuously.
(We mention in passing that, as M rotates, the sum signal S1 + S2 undergoes slight
variations, but, for all practical purposes, it remains a constant.)
Similar systems may be designed to extract the longitudinal and transverse
MO signals depicted in Figures 13.8 and 13.9. However, because in these cases
the E-field contributions have different signs in opposite halves of the aperture,
any viable detection scheme must extract the signals from these half-apertures
separately, before combining them with the proper sign at the end.
[Figure: objective lens focused through a dielectric mirror and a spacer onto the
magnetic sample.]
Figure 13.13 shows the computed distributions at the exit pupil of a 0.75NA
lens. The intensity plot for Ex in Figure 13.13(a) shows absorption bands in the
angular spectrum of the incident beam. The reflected Ey in Figure 13.13(b) is
strong in certain regions of the aperture, but these contributions mostly come
from spurious light reflected from the mirror, not from the magnetic sample.
To determine the MO signal at the exit pupil, we once again compute the
reflected complex amplitudes with M up and M down, then subtract the corresponding
distributions. Figure 13.13(c) is the result of this calculation, showing
the intensity of the residual E_y contributed by the MO interaction. The peak value
of this MO signal is nearly twice that shown in Figure 13.7(a).
Quadrilayer stack
A practical method of enhancing the MO Kerr effect involves the incorporation
of a thin magnetic film in a quadrilayer stack structure. Figure 13.14 shows one
such stack, consisting of an aluminum reflector, a dielectric underlayer, a thin
magnetic film, and a dielectric overlayer. By optimizing the thicknesses of these
[Figure 13.14: quadrilayer stack under plane-wave illumination, consisting of a
dielectric layer (thickness t1), the magnetic film, a dielectric layer (thickness t2),
an aluminum layer, and the substrate.]
[Figure: (a) reflectivity R and 50|r_sp| versus t2 (nm); (b) rotation and ellipticity
(degrees) versus t2 (nm).]
The Sagnac effect pertains to the relative phase shift between two beams of light
that travel on an identical path in opposite directions within a rotating frame.1,2,3,4
Modern fiber-optic gyroscopes (Sagnac interferometers) used for navigation are
based on this effect, allowing highly accurate measurements of rotation rates
down to about 10⁻⁴ to 10⁻⁵ degrees per hour. Georges Sagnac (1869–1926) was
the first to perform a ring interferometry experiment in 1913 aimed at observing
the correlation of angular velocity and optical phase-shift. (An experiment conducted
in 1911 by Francis Harress, attempting to measure the Fresnel drag of
light propagating through rotating glass, was later recognized as actually constituting
a Sagnac experiment; Harress had ascribed the observed “unexpected
bias” to some other factor.) An ambitious ring interferometry experiment was set
up by Albert Michelson and Henry Gale in 1926 to determine whether the Earth’s
rotation has an effect on the propagation of light in its vicinity. The Michelson–
Gale interferometer with a 1.9 km perimeter was large enough to detect the
rotation of the Earth, confirming its known value of angular velocity (obtained
from astronomical observations). The Michelson–Gale ring interferometer was
not calibrated by comparison with an outside reference, an impossible task given
that the setup was fixed to the Earth.
Figure 14.1 shows the general design of a triangular Sagnac interferometer
consisting of a light source, a beam-splitter S, mirrors M1, M2, and an observation
plane, mounted on a base that rotates at a constant angular velocity Ω
around a fixed axis. The rotation axis, not necessarily perpendicular to the plane
of the interferometer, crosses that plane at C. The source and the observation
plane are mounted on the same rotating base as M1, M2, and S, although, strictly
speaking, this is not necessary (that is, either the source or the observation plane
or both may be stationary while the rest of the system rotates; this would
require synchronizing the light pulses with the rotating base, but would not
modify the behavior of the system in any significant way). Between the source
14 The Sagnac interferometer
[Figure 14.1: triangular Sagnac interferometer with source, beam-splitter S, mirrors
M1 and M2, rotation axis through C, and observation plane.]
Figure 14.2 A lens (NA = 0.005, f = 3.0 m) focuses the incoming laser beam
(λ₀ = 0.633 μm, 1/e amplitude diameter = 1.0 cm) onto the beam-splitter after
the completion of a single round-trip in either direction (red: clockwise, blue:
counter-clockwise). The 0.9 m-long arms of the interferometer form an equilateral
triangle. The observation plane is 1.0 m away from the beam-splitter S,
which consists of an 8.0 nm-thick silver film on a glass substrate (R_S = 50.5%,
T_S = 46.8%). The mirrors M1 and M2 are 0.5 μm-thick silver films on glass
substrates (R_M = 97%). Tilting one of the mirrors (say, M2) by Δθ = 0.05°
separates the focused spots at the beam-splitter by 1.57 mm (in the direction
parallel to the observation plane); each focused spot is 0.24 mm in diameter.
The beams emerging from the focused spots travel in parallel and interfere at the
observation plane; the inset shows computed fringes over a 6 × 6 mm² area.
The objective of the present chapter is to explain the physical basis of the
above features of the Sagnac interferometer without resort to the principles of
general relativity. Our goal is to provide an explanation based on geometry, the
theory of special relativity, and the classical theory of optical wave propagation,
while maintaining some level of generality.
Figure 14.3 In the clockwise path around the Sagnac loop, the light beam
propagates the distances r1, r2, r3 along the unit vectors σ1, σ2, σ3. Four
reflections (at S, M1, M2, and again at S) bring the beam from the source to the
observation plane. With respect to the rotation center C, the centers of M1, M2, S
are at R1, R2, R3, respectively. The monochromatic light source (frequency = f0)
launches the incident beam along σ0; the emergent beam (frequency = f4)
reaches the observation plane along σ4. Viewed from an inertial frame outside
the rotating system, f4 could differ from f0; from the perspective of a co-rotating
observer, however, the two frequencies are identical.
around the loop. The total phase shift must be doubled when the counter-clockwise-propagating
beam is taken into account as well. The net phase between
the two beams at the observation plane is
$$\Delta\varphi = 4k_0\,\mathbf{A}\cdot\boldsymbol{\Omega}/c. \tag{14.2}$$
Figure 14.4 shows that the counter-propagating beams in each arm of the
Sagnac loop interfere with each other, setting up a standing-wave pattern
(period = ½λ₀) that is stationary within the rotating platform. The effect of the
rotation on the standing wave is a uniform, longitudinal shift of the fringes by
A·Ω/c. The reason for the fringe shift is that, at any given point around the
loop, one beam arrives with its phase advanced, while the other (counter-propagating)
beam arrives with its phase retarded. The combined phase-shift of
Δφ = 2k₀A·Ω/c is the same as the accumulated phase in a single round trip for
either beam; the proof follows the same line of argument as that employed in
conjunction with Eq. (14.1).
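To give a feel for the magnitudes involved, the following Python sketch evaluates Eq. (14.2) for the equilateral loop of Figure 14.2 (0.9 m arms, λ₀ = 0.633 μm). The choice of the Earth's rotation rate as the test value of Ω, the optional tilt angle between the rotation axis and the loop normal, and all function names are assumptions made for illustration only.

```python
import math

C_LIGHT = 299_792_458.0       # speed of light in vacuum, m/s
EARTH_RATE = 7.2921e-5        # Earth's rotation rate, rad/s (illustrative choice of Omega)


def equilateral_area(side_m):
    """Area of an equilateral triangle with the given side length."""
    return math.sqrt(3) / 4 * side_m**2


def sagnac_phase(area_m2, omega_rad_s, wavelength_m, tilt_deg=0.0):
    """Net phase between counter-propagating beams, 4*k0*(A . Omega)/c.

    tilt_deg is the angle between the rotation axis and the loop normal,
    so that A . Omega = A * Omega * cos(tilt).
    """
    k0 = 2 * math.pi / wavelength_m
    a_dot_omega = area_m2 * omega_rad_s * math.cos(math.radians(tilt_deg))
    return 4 * k0 * a_dot_omega / C_LIGHT


area = equilateral_area(0.9)
net_phase = sagnac_phase(area, EARTH_RATE, 0.633e-6)
single_trip_phase = net_phase / 2               # 2*k0*(A . Omega)/c for one beam
fringe_shift = area * EARTH_RATE / C_LIGHT      # standing-wave shift A . Omega / c, in meters

print(f"loop area          : {area:.4f} m^2")
print(f"net phase          : {net_phase:.3e} rad")
print(f"single-trip phase  : {single_trip_phase:.3e} rad")
print(f"fringe shift       : {fringe_shift:.3e} m")
```

For a table-top loop of this size the phase produced by the Earth's rotation is only a few microradians, which helps explain why Michelson and Gale needed a perimeter of 1.9 km.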
where
$$f = f'\,[1 + (v/c)\sigma'_z]\big/\sqrt{1 - v^2/c^2}, \tag{14.4b}$$
$$\boldsymbol{\sigma} = (\sigma_x, \sigma_y, \sigma_z) = \left(\sigma'_x\sqrt{1 - v^2/c^2},\ \sigma'_y\sqrt{1 - v^2/c^2},\ \sigma'_z + v/c\right)\Big/\,[1 + (v/c)\sigma'_z]. \tag{14.4c}$$
Figure 14.4 The counter-propagating beams of the Sagnac loop interfere with
each other, setting up a standing wave pattern that remains stationary in the
rotating frame. The wavelength everywhere within the rotating platform is λ₀,
and the standing-wave fringes have a period of ½λ₀. At any given point around
the loop, the clockwise beam is delayed and the counter-clockwise beam is
advanced, yielding a combined phase-shift of Δφ = 2k₀A·Ω/c. The phase shift
results in a longitudinal translation of the fringe pattern by A·Ω/c.
Figure 14.5 In the xyz frame, the mirror M moves with constant velocity v
along the z-axis. From the perspective of an observer in the x′y′z′ frame, the
mirror is stationary, and the incident and reflected plane-waves have the same
frequency f′ and propagation directions identified with unit vectors σ1′ and σ2′. In
the xyz frame, the incident wave has frequency f1 and propagation direction σ1,
while the reflected wave's parameters are f2 and σ2.
It is not difficult to verify that |σ| = 1. From Eq. (14.4c) one can compute σ′_z as a
function of σ_z, then substitute into Eq. (14.4b) to find:
$$f = f'\sqrt{1 - v^2/c^2}\,\big/\,(1 - \boldsymbol{\sigma}\cdot\mathbf{v}/c). \tag{14.5}$$
Here σ·v/c is an alternative expression for (v/c)σ_z. Consequently, in the xyz frame,
where the incident beam has frequency f1 and propagation direction σ1, while the
reflected beam has frequency f2 and propagation direction σ2, we have
$$f_2/f_1 = (1 - \boldsymbol{\sigma}_1\cdot\mathbf{v}/c)\,/\,(1 - \boldsymbol{\sigma}_2\cdot\mathbf{v}/c). \tag{14.6}$$
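The chain of steps from Eqs. (14.4b) and (14.4c) to Eqs. (14.5) and (14.6) can be checked numerically. The Python sketch below transforms two arbitrarily chosen unit vectors from the mirror's rest frame to the xyz frame and confirms that the transformed direction has unit length, that Eq. (14.5) reproduces Eq. (14.4b), and that the frequency ratio obeys Eq. (14.6). The particular directions, the value of v/c, and the helper names are assumptions picked purely for illustration.

```python
import math


def to_xyz_frame(f_prime, sigma_prime, beta):
    """Apply Eqs. (14.4b) and (14.4c): primed (mirror rest) frame -> xyz frame.

    beta = v/c, with the mirror moving along +z in the xyz frame.
    """
    sx, sy, sz = sigma_prime
    root = math.sqrt(1 - beta**2)
    denom = 1 + beta * sz
    f = f_prime * denom / root
    sigma = (sx * root / denom, sy * root / denom, (sz + beta) / denom)
    return f, sigma


def norm(vec):
    return math.sqrt(sum(component**2 for component in vec))


def unit(vec):
    length = norm(vec)
    return tuple(component / length for component in vec)


beta = 0.3                          # deliberately large, to exercise the exact formulas
f_prime = 1.0                       # common frequency in the mirror's rest frame
sigma1_prime = unit((0.6, 0.0, -0.8))   # incident direction (arbitrary)
sigma2_prime = unit((0.0, 0.6, 0.8))    # reflected direction (arbitrary)

f1, sigma1 = to_xyz_frame(f_prime, sigma1_prime, beta)
f2, sigma2 = to_xyz_frame(f_prime, sigma2_prime, beta)

# Eq. (14.5): frequency expressed through the xyz-frame direction.
f1_from_14_5 = f_prime * math.sqrt(1 - beta**2) / (1 - beta * sigma1[2])
# Eq. (14.6): ratio of reflected to incident frequency.
ratio_from_14_6 = (1 - beta * sigma1[2]) / (1 - beta * sigma2[2])

print(f"|sigma1| = {norm(sigma1):.12f}")
print(f"f1 via (14.4b) = {f1:.12f}, via (14.5) = {f1_from_14_5:.12f}")
print(f"f2/f1 = {f2 / f1:.12f}, Eq. (14.6) gives {ratio_from_14_6:.12f}")
```

Setting sigma2_prime equal to sigma1_prime gives a ratio of exactly one, which is the situation of a beam transmitted through a parallel-plate slab without change of direction.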
Returning now to the system depicted in Figure 14.3, where, for the beam-splitter
and the mirrors, σ·v = −σ·(R × Ω) = −(σ × R)·Ω, we note that the
magnitude of σ × R is simply the perpendicular distance from C to a straight line
aligned with σ. The vector σ × R is perpendicular to the plane of the interferometer,
pointing either up or down depending on the direction of σ. Thus we may write, for
the clockwise path in Figure 14.3,
In the final analysis, therefore, the ratio of the emergent frequency f4 to the
source frequency f0 is a function of the perpendicular distance from C to the
incidence vector σ0 as well as that from C to the emergent vector σ4. We
emphasize that f4 and f0 appear to be different only to a stationary observer
outside the rotating system; a comparison of Eq. (14.7) with Eq. (14.5) clearly
indicates that, from the perspective of a co-rotating observer, the frequency at the
observation plane is the same as the source frequency.
In the counter-clockwise direction the beam is transmitted twice through the
splitter S; neither passage introduces any Doppler shifts, as the propagation
direction remains unaltered before and after transmission through a parallel-plate
slab (in other words, σ1 = σ2 in conjunction with Eq. (14.6) immediately
implies that f1 = f2). The Doppler shifts produced by reflection from the moving
mirrors M1 and M2, however, need to be taken into account. An analysis similar
to the one that led to Eq. (14.7) then reveals that, for the counter-clockwise
path, the ratio of the emergent frequency f4 to the source frequency f0 is the
same as that for the clockwise path. Therefore, at the observation plane,
the emerging clockwise and counter-clockwise beams will have the same
frequency f4.
We now derive Sagnac's fundamental equation, Eq. (14.2), from the perspective
of an observer outside the rotating platform, an observer residing in an
inertial frame in which the rotation axis is stationary. Our alternative derivation
relies on the Doppler-shifted frequencies in the three arms of the loop depicted
in Figure 14.3. When the inertial observer considers the clockwise path at a
frozen instant in time, the Doppler-shifted frequencies f1, f2, f3 in the three arms
of the loop yield the accumulated phase as follows:
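$$\begin{aligned}
\Delta\varphi_1 + \Delta\varphi_2 + \Delta\varphi_3 &= 2\pi (f_1 r_1 + f_2 r_2 + f_3 r_3)/c \\
&= (2\pi f_0/c)\,[(f_1/f_0)\, r_1 + (f_2/f_0)\, r_2 + (f_3/f_0)\, r_3] \\
&= (2\pi f_0/c)\,[1 + (\boldsymbol{\sigma}_0 \times \mathbf{R}_3)\cdot\boldsymbol{\Omega}/c]
\left\{ \frac{r_1}{1 + (\boldsymbol{\sigma}_1 \times \mathbf{R}_3)\cdot\boldsymbol{\Omega}/c}
+ \frac{r_2}{1 + (\boldsymbol{\sigma}_2 \times \mathbf{R}_1)\cdot\boldsymbol{\Omega}/c}
+ \frac{r_3}{1 + (\boldsymbol{\sigma}_3 \times \mathbf{R}_2)\cdot\boldsymbol{\Omega}/c} \right\} \\
&\approx (2\pi f_0'/c)\,\big\{ (r_1 + r_2 + r_3) - [\,r_1(\boldsymbol{\sigma}_1 \times \mathbf{R}_3) + r_2(\boldsymbol{\sigma}_2 \times \mathbf{R}_1)
+ r_3(\boldsymbol{\sigma}_3 \times \mathbf{R}_2)\,]\cdot\boldsymbol{\Omega}/c \big\} \\
&= k_0'(r_1 + r_2 + r_3) - 2k_0'\,\mathbf{A}\cdot\boldsymbol{\Omega}/c.
\end{aligned} \tag{14.8}$$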
In the above derivation we have used Eq. (14.5) to relate f0 to f0′, the frequency of
the source within the rotating system, while ignoring terms of the order (v/c)² and
higher. A similar treatment of the counter-clockwise path leads to the same result
as in Eq. (14.8), except for the minus sign between the two terms being replaced
by a plus sign. Thus the net phase-shift Δφ between the counter-propagating
beams is, once again, given by Eq. (14.2). Note that k0′ in Eq. (14.8) is the same as
k0 in Eq. (14.2), both symbols representing the light source's wave-number as
measured on the rotating platform.
Finally, we must show that the counter-propagating beams in each arm of the
Sagnac interferometer produce running fringes that travel with the velocity of the
arm itself; this would corroborate our earlier assertion that the fringes co-rotate
with the platform. For the sake of concreteness, we denote by f2⁺ and f2⁻,
respectively, the clockwise and counter-clockwise frequencies in the arm located
between the mirrors M1 and M2. (A similar analysis applies to the other arms as
well.) In analogy with Eq. (14.7), we write
$$f_2^{\pm}/f_0 = [1 + (\boldsymbol{\sigma}_0 \times \mathbf{R}_3)\cdot\boldsymbol{\Omega}/c]\,/\,[1 \pm (\boldsymbol{\sigma}_2 \times \mathbf{R}_1)\cdot\boldsymbol{\Omega}/c]. \tag{14.9}$$
Again, relating f0 to f0′ through Eq. (14.5) and ignoring terms of the order (v/c)²
and higher yields
$$f_2^{\pm} \approx [1 \mp (\boldsymbol{\sigma}_2 \times \mathbf{R}_1)\cdot\boldsymbol{\Omega}/c]\, f_0'. \tag{14.10}$$
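$$\begin{aligned}
I(x,t) &\propto a_0^2\,\{1 + \cos[(\omega_2^{+} - \omega_2^{-})\,t - (k_2^{+} + k_2^{-})\,x + (\varphi_2^{+} - \varphi_2^{-})]\} \\
&\approx a_0^2\,\{1 + \cos[4\pi (f_0'/c)(x - v_2 t) - (\varphi_2^{+} - \varphi_2^{-})]\}.
\end{aligned} \tag{14.13}$$
The running fringes thus have a period of ½λ₀ and travel with velocity v2 in the σ2
direction.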
Figure 14.6 In the xyz frame, a transparent dielectric slab of thickness L moves
with constant velocity v along the z-axis. From the perspective of an observer in
the x′y′z′ frame, the slab is stationary, and the incident, intermediate, and
transmitted plane-waves all have the same frequency f′ and (identical) propagation
directions σ′. In the xyz frame, both the incident and transmitted waves
have frequency f1 and propagation direction σ1, while the beam parameters
inside the dielectric are f2 and σ2.
The Lorentz transformation of the spatio-temporal coordinates from (x′, y′, z′, t′)
to (x, y, z, t) yields the relationship between the various plane-waves in the xyz and
x′y′z′ systems. In particular, one can readily show that
$$f' = (1 - \boldsymbol{\sigma}_1\cdot\mathbf{v}/c)\, f_1 \big/ \sqrt{1 - v^2/c^2} \approx (1 - \boldsymbol{\sigma}_1\cdot\mathbf{v}/c)\, f_1, \tag{14.14a}$$
$$\boldsymbol{\sigma}' = \left(\sigma_{x1}\sqrt{1 - v^2/c^2},\ \sigma_{y1}\sqrt{1 - v^2/c^2},\ \sigma_{z1} - v/c\right)\Big/(1 - \boldsymbol{\sigma}_1\cdot\mathbf{v}/c)
\approx (\boldsymbol{\sigma}_1 - \mathbf{v}/c)\,/\,(1 - \boldsymbol{\sigma}_1\cdot\mathbf{v}/c). \tag{14.14b}$$
the center of the exit facet (relative to the phase at the center of the entrance facet)
as follows:
$$\varphi_{1s} = 2\pi n_1 f_1 L/c. \tag{14.15}$$
With the slab traveling at a constant velocity v, the frequency f′ and the
propagation direction σ′ inside the slab (within the co-moving x′y′z′ frame)
determine the phase at the center of the exit facet located at r1′ = Lσ1 (relative to
that at the center of the entrance facet) as φ1′ = 2πn′f′Lσ1·σ′/c. The emergent
beam may thus be written as
$$a(\mathbf{r}', t') = a_0' \exp(i\varphi') \exp[\,i(2\pi f'/c)(\boldsymbol{\sigma}'\cdot\mathbf{r}' - ct')], \tag{14.16a}$$
where
$$\varphi' = 2\pi (n' - 1) f' L\, \boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}'/c \approx (2\pi f_1 L/c)(n' - 1)(1 - \boldsymbol{\sigma}_1\cdot\mathbf{v}/c). \tag{14.16b}$$
In the last line of Eq. (14.17), the near-equality of n′ and n1 has been used to
substitute n′ for (2n′ − n1); however, the same approximation cannot be applied to
the second term, because of the division by the small quantity σ1·v/c.
When a short pulse launched at the entrance facet at t′ = 0 reaches the slab's
exit facet at t′ = n_g L/c, the center of the exit facet, located at r1′ = Lσ1 in the x′y′z′
system, will have reached the point r1 in the xyz frame. Using the Lorentz
transformation, we find
$$\mathbf{r}_1 = \left[ L\sigma_{x1},\ L\sigma_{y1},\ (L\sigma_{z1} + v n_g L/c)\big/\sqrt{1 - v^2/c^2} \right] \approx L(\boldsymbol{\sigma}_1 + n_g \mathbf{v}/c). \tag{14.18}$$
Therefore, in the xyz system, the phase of the emergent beam at the center of the
exit facet will be
The difference between the above phase and that obtained in Eq. (14.15) for a
stationary slab is
$$\varphi_1 - \varphi_{1s} \approx (2\pi f_1 L/c)\, \boldsymbol{\sigma}_1\cdot\mathbf{v}/c. \tag{14.20}$$
Note that the above expression for the phase difference between a rotating
Sagnac interferometer (with co-rotating slab) and a stationary one, both computed
at the exit facet of the slab (relative to the entrance facet), is independent of the
slab's refractive index. The expression for φ1 − φ1s in Eq. (14.20) is the same as
that which would have been obtained had the beam traveled from the entrance to
the exit facet in free space (rather than in a dielectric of refractive index n).
We have thus proven that the presence of co-moving dielectric media within one
or more arms of the device will not alter the fundamental formula, Eq. (14.2), of
the Sagnac interferometer.
Figure 14.7 An active Sagnac gyroscope is a ring laser whose gain medium is
placed within one or more arms of a Sagnac interferometer. The beam-splitter S
is now re-oriented to allow the counter-propagating beams to circulate around
the loop. Fractions of the clockwise and counter-clockwise modes emerge from
the cavity along σ0 and σ4, respectively. When the platform is stationary, the two
modes have the same frequency f0; the mode frequencies, however, drift in
opposite directions when Ω ≠ 0. The two modes are brought together on
an observation plane to form an interference pattern with running fringes. A
photodetector array picks up the beat frequency Δf between the modes, which is
proportional to the platform's angular frequency Ω. The detector array is also
capable of detecting the sign of Ω by monitoring the direction of travel of the
fringes.
When the two beams emerging from the laser cavity are brought together on a
photodetector array, as in Figure 14.7, they produce a pattern of running fringes.
The beat frequency Δf then yields the magnitude of Ω, while its sign is
determined by the direction of fringe travel.
Figure 15.1 A multilayer dielectric mirror and a plane wave at oblique incidence.
In the examples used in this chapter, the substrate refractive index n is 1.5, the
odd-numbered layers have an index of 2 and are 79.125 nm thick, and the even-numbered
layers have an index of 1.5 and are 105.5 nm thick. At the design wavelength,
λ = 633 nm, these layer thicknesses correspond to one-quarter of the wavelength.
Figure 15.2 Computed plots of (a) amplitude and (b) phase of the reflection
coefficients of a dielectric mirror for p- and s-polarized beams versus the angle
of incidence. The assumed mirror is as shown in Figure 15.1, with a total of 10
layers; the medium of incidence is air.
[Figure 15.3 diagram: two substrates carrying quarter-wave stacks, with the Z-axis and the angle of incidence Θ marked.]
(In practice the uncoated facets of both substrates are given a slight wedge to
eliminate spurious reflections.) For the system of Figure 15.3 the computed plots
of reflection amplitude and phase versus θ are shown in Figure 15.4, for mirrors
with 10 dielectric layers each, and an air gap 8.229 μm wide, which is exactly
13λ. Note that, within the 0° to 30° range of angles of incidence depicted, the p-
and s-reflectivities are nearly the same. Sharp drops in the etalon’s reflectivity
occur at θ = 0°, 15.37°, 21.83°, and 26.85°; at these angles (d/λ) cos θ = 13,
12.53, 12.07, and 11.6, respectively. In other words, when the effective gap-width
is an integer multiple of a half-wavelength, the etalon becomes transparent to the
incident light. To be sure, there are slight deviations from exact half-wavelength
multiplicity here, which have to do with the θ-dependence of the phase of the
individual mirror reflectivities (see Figure 15.2 (b)), but, for our purposes, these
differences are small and may be ignored.
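The arithmetic behind these resonance angles is easy to reproduce. The short Python sketch below (function names are hypothetical) evaluates (d/λ) cos θ at the quoted angles and, for comparison, the angles at which the effective gap would be an exact multiple of a half-wavelength if the mirror phase did not depend on θ.

```python
import math

WAVELENGTH_NM = 633.0
GAP_NM = 8229.0          # 13 wavelengths at 633 nm


def effective_gap_in_wavelengths(theta_deg):
    """Return (d / lambda) * cos(theta) for the air gap."""
    return (GAP_NM / WAVELENGTH_NM) * math.cos(math.radians(theta_deg))


def ideal_angle_deg(half_wave_multiple):
    """Angle at which (d / lambda) cos(theta) equals half_wave_multiple / 2,
    ignoring any angle dependence of the mirror phase (an idealization)."""
    target = half_wave_multiple / 2.0
    return math.degrees(math.acos(target * WAVELENGTH_NM / GAP_NM))


for theta in (0.0, 15.37, 21.83, 26.85):
    print(f"theta = {theta:5.2f} deg   (d/lambda) cos(theta) = "
          f"{effective_gap_in_wavelengths(theta):.3f}")

for m in (25, 24, 23):   # 12.5, 12, and 11.5 wavelengths
    print(f"{m} half-waves: ideal angle = {ideal_angle_deg(m):.2f} deg")
```

The ideal angles come out somewhat larger than the quoted ones; the gap between the two sets is the small mirror-phase correction mentioned in the text.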
Next, we study the setup of Figure 15.5, which is designed to send a focused
beam of light onto an etalon and to analyze the resulting reflection. The setup
includes a path for a reference beam, so that Twyman–Green interferometry may
be used to reveal the reflected phase pattern. It also includes a polarizer before the
observation plane to allow selection of the polarization direction of interest.
Figure 15.6 shows computed plots of intensity at the observation plane obtained
under various conditions. Frames (a) and (b) are obtained when the reference
beam is blocked, whereas frames (c) and (d) are interferograms obtained in the
presence of the reference beam. The circular area in each frame represents the
aperture of the objective lens (NA = 0.5).
[Figure 15.4 plots: (a) amplitude and (b) phase (degrees) of the reflection coefficients versus angle of incidence (degrees), 0 to 30.]
Figure 15.4 Computed plots of (a) amplitude and (b) phase of the reflection
coefficients of a Fabry–Pérot etalon for p- and s-polarized beams versus the angle
of incidence. The assumed etalon is that shown in Figure 15.3 with 10-layer
mirrors and an 8.229 μm gap.
In (a) the polarizer is taken to transmit the same direction of polarization as that
of the incident beam. The dark rings correspond to the angles of incidence at which
the reflectivity plots of Figure 15.4(a) exhibit their minima. In frame (b) the
polarizer is turned by 90° so that only a small fraction of the light (about 3 × 10⁻⁴
of the original incident power) passes through to the observation plane. The four
corners of this distribution correspond to the four corners of the focused cone of
light, which have a mix of p- and s-polarization. Here, the rays incident on the
etalon are subject to slightly different reflectivities in their p- and s-components
(see Figure 15.4(a)), which gives rise to a small rotation of polarization from its
original direction. It is this polarization rotation in the four corners of the lens that
is responsible for the four corners of the intensity distribution in Figure 15.6(b).
Frames (c) and (d) of Figure 15.6 are obtained by unblocking the reference arm
in the system of Figure 15.5, thus allowing the interference pattern (between the
beam reflected from the etalon and that reflected from the reference mirror) to
impinge on the observation plane. The case for the parallel component of
polarization depicted in (c) shows the phase of the pattern to be more or less
uniform over the entire aperture; in particular, it shows that there are no phase
jumps between adjacent rings. The case for the perpendicular component of
Fabry–Pérot etalons in polarized light 201
[Figure 15.5 diagram: a linearly polarized incident beam, a neutral-density filter, a quarter-wave plate, a reference mirror, a microscope objective lens (NA = 0.5), and the observation plane.]
polarization depicted in (d) shows a 180° phase shift between adjacent corners.
This is caused by the fact that, in adjacent corners of the lens, the polarization
vector rotates in opposite directions.
Mirror birefringence
Next we consider the effects of birefringence in the mirrors of the Fabry–
Pérot etalon.4,5 For this analysis we assume that the mirrors have 20 layers each
(|ρ| = 0.9905, |τ| = 0.1375) and that the uppermost layer of both mirrors is
slightly birefringent. We will suppose that the uppermost layer has a nominal
index of 1.5, except along the Y-axis (see Figure 15.1) where the index is 1.505.
We also assume a normally incident plane wave and an adjustable gap-width. For
this etalon the computed transmission coefficients and the polarization state of the
transmitted light versus the gap-width are plotted in Figure 15.7. Two different
peaks are observed in transmission, one for the p-polarized, the other for the
s-polarized incident beam. (The E-field of the p-light is parallel to X, while that of
the s-light is parallel to Y.) The peak separation arises because the mirrors give a
slightly different phase upon reflection to the two components of polarization.
The gap-width, therefore, must be adjusted to compensate for this phase
difference. If the incident beam is linearly polarized at 45° (i.e., halfway between p
and s), the transmitted beam will show the rotation angle ψ and the ellipticity ξ
depicted in Figure 15.7(b). The maximum ellipticity is close to 40°, which shows
that the transmitted light at this point is nearly circularly polarized. A very small
amount of birefringence in the mirrors can, therefore, have substantial effects on
the polarization state of the transmitted (or reflected) beam.
[Figure 15.7 plots: (a) amplitude transmission coefficients |tp| and |ts|; (b) polarization rotation angle and ellipticity (degrees); both versus Gap Width (nm), 8210 to 8250.]
Figure 15.7 (a) Computed amplitude transmission coefficients and (b) the state
of transmitted polarization plotted versus the gap-width for a normally incident
beam on the Fabry–Pérot etalon of Figure 15.3. The mirrors are assumed to have
20 layers each and, for both mirrors, the uppermost layer is assumed to be
birefringent. With reference to Figure 15.1, the refractive indices of the top layer
along the X-, Y-, and Z- axes are 1.500, 1.505, and 1.500, respectively. The
normally incident beam is linearly polarized at 45° to the X-axis. In (b) the
polarization rotation angle, ψ, is also referred to the X-axis. By definition,
the polarization ellipticity ξ is the arctangent of the ratio of the minor axis of the
ellipse of polarization to its major axis. Thus ξ = 0 corresponds to linear
polarization, whereas ξ = 45° represents circular polarization.
In practice, if the mirrors are known to have the same amount of birefringence,
these problems can be avoided by rotating one of the mirrors by 90° relative to
the other. Also, it might be of some interest to note that birefringence of the top
layer poses the most serious problem for the Fabry–Pérot etalons. In our calcu-
lations, the effects diminished as we moved the birefringent layer down the stack
(closer to the substrate). By the time the birefringence is moved to layer 14 of
both mirrors, its effects are totally negligible.
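[Figure 15.8 diagram: a linearly polarized incident beam, a Faraday rotator, the field components Ep and Es, and the Z-axis.]
[Figure 15.9 plots: (a) amplitude transmission coefficients |tp| and |ts|; (b) state of transmitted polarization (degrees); both versus wavelength (nm), 631 to 635.]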
Figure 15.9 (a) Computed amplitude transmission coefficients and (b) the state
of transmitted polarization plotted versus λ for a plane wave normally incident
on the etalon of Figure 15.8. The assumed direction of incident polarization is p.
In (a) the transmission coefficient ts is defined as the ratio of the transmitted
s-component to the incident p-component. In (b) the polarization rotation angle
ψ is relative to the direction of incident polarization.
Note that the transmitted power has dropped by more than 60% and that the peak
rotation angle is reduced by about 4°. The plot in Figure 15.10(c) of the magnitude of
the Poynting vector along the propagation path shows constant values in the mul-
tilayer mirrors but a rapid decline within the Faraday medium. Of the roughly 85% of
the optical power that enters the etalon, 46% gets absorbed in the Faraday rotator and
only 39% eventually passes out of the device. The small plateaux in the Poynting
vector plot of Figure 15.10(c) are caused by the standing-wave pattern of the E-field
within the Faraday medium: the absorption rate goes through minima and maxima
following the E-field intensity variations. In our example, where the Faraday
medium is 5λ thick, there are exactly 10 such plateaux.
A simple analysis
We now present a simple derivation of the basic properties of the Fabry–Pérot
interferometer. Let us assume that the two mirrors are identical, with reflection
coefficients ρ and transmission coefficients τ. If the light is incident from the
[Figure 15.10 plots: panels (a), (b), and (c).]
Figure 15.10 (a) and (b) are the same as in Figure 15.9, except for the presence of a small absorption coefficient (k = 10⁻⁴) in the
Faraday medium. The plot in (c) shows the magnitude of the Poynting vector as a function of position along the beam’s
propagation path. The flat parts of the curve indicate that optical energy passes unattenuated through the dielectric mirrors. The
steep, staircase-like drop in the curve is caused by absorption within the Faraday medium.
$$A = \rho^{2} A \exp(i4\pi D/\lambda) + \tau', \tag{15.1}$$
which yields
$$A = \frac{\tau'}{1 - \rho^{2} \exp(i4\pi D/\lambda)}. \tag{15.2}$$
Resonance occurs when the phase of ρ² (if any) plus the phase acquired in a round
trip through the cavity, 4πD/λ, becomes a multiple of 2π, at which point the
denominator in Eq. (15.2) will be at a minimum and the field amplitude A within
the cavity at a maximum.
The light transmitted through the device will have amplitude
$$t = \tau A \exp(i2\pi D/\lambda). \tag{15.3}$$
The same equations may be used at oblique incidence, provided that the gap-
width D is multiplied by cos θ and that ρ, τ, ρ′, and τ′ represent the corresponding
quantities at the particular angle of incidence. When the medium of the cavity
happens to be absorptive, the same type of analysis may still be used to arrive at
the relevant formulas.
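As a numerical illustration of Eqs. (15.2) and (15.3), the sketch below evaluates the cavity field and the transmitted amplitude. The function names are hypothetical, and the sample values (ρ = 0.844, τ = τ′ = 0.536, a gap of 8229 nm at λ = 633 nm) are the 10-layer figures quoted in this chapter. Taking ρ to be real, so that ρ² contributes no phase, is a simplifying assumption.

```python
import cmath
import math


def cavity_field(rho, tau_in, gap, wavelength):
    """Field amplitude A inside the cavity, Eq. (15.2)."""
    round_trip = cmath.exp(1j * 4 * math.pi * gap / wavelength)
    return tau_in / (1 - rho ** 2 * round_trip)


def transmitted_amplitude(rho, tau, tau_in, gap, wavelength):
    """Transmitted amplitude t, Eq. (15.3)."""
    single_pass = cmath.exp(1j * 2 * math.pi * gap / wavelength)
    return tau * cavity_field(rho, tau_in, gap, wavelength) * single_pass


RHO, TAU, WAVELENGTH = 0.844, 0.536, 633.0

for gap in (8229.0, 8229.0 + WAVELENGTH / 4):   # on resonance, then detuned
    a = cavity_field(RHO, TAU, gap, WAVELENGTH)
    t = transmitted_amplitude(RHO, TAU, TAU, gap, WAVELENGTH)
    print(f"gap = {gap:8.2f} nm   |A| = {abs(a):.3f}   |t| = {abs(t):.3f}")
```

On resonance the field inside the cavity exceeds the incident amplitude and |t| is close to unity; a quarter-wave change of the gap drives the denominator of Eq. (15.2) to its maximum and the transmission collapses.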
We now demonstrate the application of the preceding equations to some of the
cases discussed earlier. In the case of the 10-layer stack, the mirror coefficients
were ρ = ρ′ = 0.844 and τ = τ′ = 0.536. From Eqs. (15.2)–(15.4) we find that,
at resonance, A = 1.863, t = 1 and r = 0. In the case of the 20-layer stack,
Note
In general, it is possible to eliminate from a stack non-absorbing layers whose
thicknesses are multiples of λ/2. Now, if the gap happens to be an integer multiple
of λ/2 its elimination will bring the top dielectric layers of the two mirrors into
contact. These layers, each being a quarter-wave thick, will combine into a half-
wave layer that can be subsequently eliminated, paving the way for the elimin-
ation of all the remaining layers in similar fashion. At the end, the two substrates
will come into direct contact, and the incident light will be fully transmitted, as is
expected of a well-tuned etalon.
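The elimination argument of this Note can be checked with the standard characteristic-matrix method of thin-film optics. The Python sketch below is an illustration under stated assumptions rather than the calculation used for this chapter: normal incidence, lossless layers, light entering from one substrate and leaving through the other, and the layer order inferred from Figure 15.1 (layer 1 next to the substrate). All function names are hypothetical.

```python
import cmath
import math


def layer_matrix(n, thickness, wavelength):
    """Characteristic matrix of one homogeneous layer at normal incidence."""
    delta = 2 * math.pi * n * thickness / wavelength
    c, s = cmath.cos(delta), cmath.sin(delta)
    return [[c, 1j * s / n], [1j * n * s, c]]


def multiply(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
            [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]]]


def reflectance(layers, n_in, n_out, wavelength):
    """Power reflectance of a stack; layers are listed from the incidence side."""
    m = [[1, 0], [0, 1]]
    for n, thickness in layers:
        m = multiply(m, layer_matrix(n, thickness, wavelength))
    b = m[0][0] + m[0][1] * n_out
    c = m[1][0] + m[1][1] * n_out
    r = (n_in * b - c) / (n_in * b + c)
    return abs(r) ** 2


WAVELENGTH = 633.0
MIRROR = [(2.0, 79.125), (1.5, 105.5)] * 5     # layer 1 ... layer 10


def etalon(gap_nm):
    """Substrate | mirror | air gap | reversed mirror | substrate."""
    return MIRROR + [(1.0, gap_nm)] + MIRROR[::-1]


print("tuned gap (13 wavelengths):", reflectance(etalon(8229.0), 1.5, 1.5, WAVELENGTH))
print("gap detuned by lambda/4:   ", reflectance(etalon(8229.0 + 158.25), 1.5, 1.5, WAVELENGTH))
```

For the tuned gap the product of all the layer matrices collapses to plus or minus the identity, so the two substrates behave as if they were in contact and the reflectance vanishes to within rounding error.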
When a beam of light enters a material medium, it sets in motion the resident
electrons, whether these electrons are free or bound. The electronic oscillations
in turn give rise to electromagnetic radiation which, in the case of linear media,
possesses the frequency of the exciting beam. Because Maxwell’s equations are
linear, one expects the total field at any point in space to be the sum of the
original (exciting) field and the radiation produced by all the oscillating elec-
trons. However, in practice the original beam appears to be absent within the
medium, as though it had been replaced by a different beam, one having a
shorter wavelength and propagating in a different direction. The Ewald–Oseen
theorem1,2 resolves this paradox by showing how the oscillating electrons
conspire to produce a field that exactly cancels out the original beam every-
where inside the medium. The net field is indeed the sum of the incident beam
and the radiated field of the oscillating electrons, but the latter field completely
masks the former.3,4
Although the proof of the Ewald–Oseen theorem is fairly straightforward, it
involves complicated integrations over dipolar fields in three-dimensional space,
making it a brute-force drill in calculus and devoid of physical insight.5,6 It is
possible, however, to prove the theorem using plane waves interacting with thin
slabs of material, while invoking no physics beyond Fresnel’s reflection coeffi-
cients. (These coefficients, which date back to 1823, predate Maxwell’s equa-
tions.) The thin slabs represent sheets of electric dipoles, and the use of Fresnel’s
coefficients allows one to derive exact expressions for the electromagnetic field
radiated by these dipolar sheets. The integrations involved in this approach are
one-dimensional, and the underlying procedures are intuitively appealing to
practitioners of optics. The goal of the present chapter is to outline a general
proof of the Ewald–Oseen theorem using arguments that are based primarily on
thin-film optics.
Dielectric slab
Consider the transparent slab of dielectric material of thickness d and refractive
index n, shown in Figure 16.1. A normally incident plane wave of vacuum
wavelength λ0 produces overall a reflected beam of amplitude r and a transmitted
beam of amplitude t. Both r and t are complex numbers in general, having a
magnitude and a phase angle. Using Fresnel’s coefficients at each facet of the slab
and accounting for multiple reflections, it is fairly straightforward to obtain
expressions for r and t. The reflection and transmission coefficients at the front
facet of the slab are5,7
$$\rho = (1 - n)/(1 + n), \tag{16.1}$$
$$\tau = 2/(1 + n). \tag{16.2}$$
A single path of the beam through the slab causes a phase shift ψ, where
$$\psi = 2\pi n d/\lambda_0. \tag{16.5}$$
Adding up all partial reflections at the front facet yields an expression for the
reflection coefficient r of the slab. Similarly, adding all partial transmissions at
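[Figure 16.1 diagram: a slab of thickness d and refractive index n, with the unit-amplitude incident beam, the reflected beam r, and the partial beams generated by multiple reflections inside the slab.]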
$$t = \tau\tau' \exp(i\psi) \sum_{m=0}^{\infty} [\rho' \exp(i\psi)]^{2m} = \frac{\tau\tau' \exp(i\psi)}{1 - \rho'^{2} \exp(i2\psi)}. \tag{16.7}$$
Rather than try to simplify these complicated functions of n, d and λ0, we give
numerical results in Figure 16.2 for the specific case of n = 2 and λ0 = 633 nm. The
magnitudes of r and t are shown in Figure 16.2(a), and their phase angles in Figure
16.2(b), both as functions of the thickness d of the slab. For any given value of d it is
possible to represent r and t as complex vectors (see Figure 16.3). Since the phase
difference between r and t is always 90°, these complex vectors are orthogonal to each
other. Also, the conservation of energy requires that |r|² + |t|² = 1. These observations
lead to the conclusion that the hypotenuse of the triangle in Figure 16.3 must have unit
length, that is |t − r| = 1, which is also confirmed numerically in Figure 16.2(c).
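These three properties (energy conservation, orthogonality of r and t, and unit magnitude of t − r) can be verified with a few lines of Python. The sketch below is an illustration, not the code behind Figure 16.2. It uses Eqs. (16.1), (16.2), (16.5), and (16.7), and it assumes the usual Fresnel values for the coefficients seen from inside the slab, (n − 1)/(n + 1) for reflection and 2n/(n + 1) for transmission, together with the standard multiple-reflection sum for r. The function and variable names are hypothetical.

```python
import cmath
import math


def slab_coefficients(n, d_nm, wavelength_nm=633.0):
    """Return (r, t) for a slab in free space at normal incidence."""
    rho = (1 - n) / (1 + n)            # Eq. (16.1): reflection at the front facet
    tau = 2 / (1 + n)                  # Eq. (16.2): transmission into the slab
    rho_in = (n - 1) / (n + 1)         # assumed: reflection seen from inside the slab
    tau_out = 2 * n / (n + 1)          # assumed: transmission out of the slab
    psi = 2 * math.pi * n * d_nm / wavelength_nm     # Eq. (16.5): single-pass phase
    denom = 1 - rho_in ** 2 * cmath.exp(2j * psi)
    r = rho + tau * tau_out * rho_in * cmath.exp(2j * psi) / denom
    t = tau * tau_out * cmath.exp(1j * psi) / denom  # Eq. (16.7)
    return r, t


for d in (10.0, 40.0, 79.125, 120.0):
    r, t = slab_coefficients(2.0, d)
    energy = abs(r) ** 2 + abs(t) ** 2
    overlap = (r * t.conjugate()).real   # vanishes when r and t are 90 degrees apart
    print(f"d = {d:7.3f} nm  |r| = {abs(r):.4f}  |t| = {abs(t):.4f}  "
          f"|r|^2 + |t|^2 = {energy:.6f}  Re(r t*) = {overlap:+.1e}  "
          f"|t - r| = {abs(t - r):.6f}")
```

At the quarter-wave thickness (79.125 nm for n = 2) the sketch gives |r| = 0.6 and |t| = 0.8, a convenient 3-4-5 check on the right triangle of Figure 16.3.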
Within the slab the incident beam sets the atomic dipoles in motion. These
dipoles in turn radiate plane waves in both the forward and the backward dir-
ections, as shown in Figure 16.4. When the slab is sufficiently thin, symmetry
requires forward- and backward-radiated waves to be identical, that is, they must
both have the same amplitude r. In the forward direction, however, the incident
beam continues to propagate unaltered, except for a phase-shift caused by
propagation in free-space through a distance d. Thus we must have
$$t = r + \exp(i2\pi d/\lambda_0). \tag{16.8}$$
It was pointed out earlier in conjunction with the diagram of Figure 16.3 that t − r
has unit amplitude, which is in agreement with Eq. (16.8). It is by no means
obvious, however, that the phase of t − r must approach 2πd/λ0 as d → 0. Figure
16.2(c) shows computed plots of the phase of t − r normalized by 2πd/λ0. It is
seen that in the limit d → 0 the normalized phase approaches unity as well. This
confirms that the slab radiates equally in the forward and backward directions,
and that the incident beam, having set the dipolar oscillations in motion, con-
tinues to propagate undisturbed in free space.
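The limiting behavior just described can also be sketched analytically. The expansion below assumes the standard multiple-reflection sums for the slab together with the Fresnel relations ρ′ = −ρ and ττ′ = 1 − ρ² (relations that are not reproduced in the surrounding text), and keeps only terms of first order in d/λ0:

$$r = \frac{\rho\,[1 - \exp(i2\psi)]}{1 - \rho^{2}\exp(i2\psi)} \approx i\,\frac{\pi d}{\lambda_0}\,(n^{2} - 1), \qquad t = \frac{(1 - \rho^{2})\exp(i\psi)}{1 - \rho^{2}\exp(i2\psi)} \approx 1 + i\,\frac{\pi d}{\lambda_0}\,(n^{2} + 1),$$

$$t - r \approx 1 + i\,\frac{2\pi d}{\lambda_0} \approx \exp(i2\pi d/\lambda_0).$$

To this order the n-dependent terms cancel in t − r, leaving only the free-space phase, which is the content of Eq. (16.8) for a very thin slab.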
[Figure 16.2 plots: (a) amplitudes of r and t; (b) phases φ(r) and φ(t) in degrees; (c) |t − r| and the normalized phase φ(t − r)/(2πd/λ0); all versus thickness d (nm), 0 to 150.]
Figure 16.2 Computed plots of r and t for a slab of thickness d and refractive
index n = 2, when a plane wave with λ0 = 633 nm is normally incident on the
slab. The horizontal axis covers one cycle of variations in r and t, corresponding
to a half-wave thickness of the slab.
In this limit the radiated field is slightly more than 90° ahead of the incident field,
while its amplitude is proportional to d/λ0 and also proportional to n² − 1, the
latter being the coefficient of polarizability of the dielectric material. Note that
the small phase angle of r over and above its 90° phase, i.e., the exponential
16 The Ewald–Oseen extinction theorem 213
[Figure 16.3 diagram: a slab of thickness d and index n with the beams r and t, beside a complex-plane plot (Real and Imaginary axes) of r, −r, t, and t − r.]
Figure 16.3 A dielectric slab of thickness d and refractive index n, reflecting the
unit-amplitude incident beam with coefficient r while transmitting it with
coefficient t. The complex-plane diagram on the right shows the relative orientations of
r, t and their difference t − r. For a non-absorbing slab (i.e., one with a real-valued
index n) r and t are orthogonal to each other, and t − r has unit magnitude.
[Figure 16.4 diagram: a thin slab of thickness d and index n containing an oscillating dipole, with the unit-amplitude incident beam, the phase-shifted beam exp(i2πd/λ0), the radiated beams r, and the transmitted beam t.]
Figure 16.4 Bound electrons within a very thin dielectric slab, when set in
motion by a normally incident plane wave of unit amplitude, radiate with
equal strength in both the forward and backward directions. The magnitude of
the radiated field is the reflection coefficient r of the slab. The incident beam
continues to propagate undisturbed as in free-space, acquiring a phase shift of
2πd/λ0 upon crossing the slab. The sum of the incident beam and the forward-
propagating part of the radiated beam constitutes the transmitted beam.
factor in Eq. (16.9), is essential for the conservation of energy among the
incident, reflected, and transmitted beams (see Figure 16.4).
Equation (16.9) is in fact the exact solution of Maxwell’s equations for the
radiation field of a sheet of dipole oscillators. Although derived here as an aid in
proving the extinction theorem, it is an important result in its own right. Note, for
example, that the amplitude of the radiated field is proportional to 1/λ0 even
though the field of an individual dipole radiator is known to be proportional to
1/λ0². The coherent addition of amplitudes over the sheet of dipoles has thus
modified the wavelength dependence of the radiated field.3
[Figure 16.5 diagram: labels Δz, z0, and the Z-axis.]
[Figure 16.6 plots: (a) amplitudes of r and t; (b) phases φ(r) and φ(t) in degrees; (c) |t − r| and the normalized phase φ(t − r)/(2πd/λ0); all versus thickness d (nm), 0 to 100.]
Figure 16.6 Computed plots of r and t for a slab of thickness d and complex
refractive index (n, k) = (2, 7), when a plane wave with λ0 = 633 nm is normally
incident on the slab. The horizontal axis covers the penetration depth of the material.
Figure 16.6(a) shows computed plots of r and t for a metal slab having
complex index n + ik = 2 + i7. (Compare these plots with the corresponding plots
for the dielectric slab in Figure 16.2.) It is seen that the reflectance drops sharply
while the transmittance increases as the film thickness is reduced below about
20 nm. The phase plots in Figure 16.6(b) are quite different from those of the
dielectric slab, indicating a phase difference greater than 90° between r and t. A
complex-plane diagram for this type of material is given in Figure 16.7. The
angle between r and t being greater than 90° implies that |t − r|² > |t|² + |r|²,
while the conservation of energy requires that |t|² + |r|² < 1 in the case of
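[Figure 16.7 diagram: complex-plane plot (Real and Imaginary axes) of t, −r, and t − r.]
[Figure 16.8 diagram: a slab of thickness d and index n containing an oscillating dipole, with s-polarized light (S) and the beams r and t.]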
absorbing media. The fact that |t − r| can approach unity is borne out by the
numerical results depicted in Figure 16.6(c). In the limit d → 0, not only does the
magnitude of t − r become unity but also its phase approaches 2πd/λ0. Therefore,
in the limit of small d, the transmitted beam may be expressed as the sum of the
reflected beam and the phase-shifted incident beam, the phase shift being due to
free-space propagation over the distance d. This is all that one needs in order to
prove the extinction theorem for absorbing media.
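The same limit can be probed numerically for the absorbing slab. The Python sketch below is illustrative only. It assumes that the slab formulas used for the dielectric case carry over unchanged to a complex index n + ik (with the sign convention in which a positive k produces attenuation), and the function names are hypothetical.

```python
import cmath
import math


def slab_coefficients(n, d_nm, wavelength_nm=633.0):
    """Return (r, t) for a slab of (possibly complex) index n at normal incidence."""
    rho = (1 - n) / (1 + n)                          # front-facet reflection
    psi = 2 * math.pi * n * d_nm / wavelength_nm     # complex single-pass phase
    denom = 1 - rho ** 2 * cmath.exp(2j * psi)
    r = rho * (1 - cmath.exp(2j * psi)) / denom      # summed partial reflections
    t = (1 - rho ** 2) * cmath.exp(1j * psi) / denom # summed partial transmissions
    return r, t


def normalized_phase(n, d_nm, wavelength_nm=633.0):
    """Phase of t - r divided by the free-space phase 2*pi*d/lambda0."""
    r, t = slab_coefficients(n, d_nm, wavelength_nm)
    return cmath.phase(t - r) / (2 * math.pi * d_nm / wavelength_nm)


METAL = 2 + 7j

for d in (20.0, 5.0, 1.0, 0.1):
    r, t = slab_coefficients(METAL, d)
    angle = math.degrees(cmath.phase(r / t))
    print(f"d = {d:5.1f} nm  |r|^2 + |t|^2 = {abs(r) ** 2 + abs(t) ** 2:.4f}  "
          f"angle(r, t) = {angle:7.2f} deg  |t - r| = {abs(t - r):.5f}  "
          f"normalized phase = {normalized_phase(METAL, d):.4f}")
```

As d shrinks, |r|² + |t|² stays below unity (the slab absorbs), while |t − r| and the normalized phase of t − r both tend to 1, in line with the argument above.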
s-direction of polarization and radiate with equal magnitude in the forward and
backward directions. The computed plots of rs and ts versus d for the specific case
of λ0 = 633 nm, n = 2, and θ = 50° are shown in Figure 16.9. The angle of
propagation inside the medium is obtained from Snell’s law as θ′ = 22.52°, and
the half-wave thickness of the slab is given by λ0/(2n cos θ′) = 171.3 nm. These
curves are again very similar to those of Figure 16.2, showing a 90° phase
[Figure 16.9 plots: (a) amplitudes |rs| and |ts|; (b) phases φ(rs) and φ(ts) in degrees; (c) |ts − rs| and the normalized phase φ(ts − rs)/(2πd cos θ/λ0); all versus thickness d (nm), 0 to 175.]
Figure 16.9 Computed plots of r and t for a slab of thickness d and index
n = 2, when an s-polarized plane wave with λ0 = 633 nm illuminates the slab at
θ = 50°. The horizontal axis covers one cycle of variations of r and t,
corresponding to a half-wave thickness of the slab at this particular angle of incidence.
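[Figure 16.10 diagram: a slab of thickness d and index n containing an oscillating dipole, with p-polarized light (p) and the beams r and t.]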
difference between rs and ts, unit magnitude for ts − rs, and a phase for ts − rs that
approaches 2π(d/λ0) cos θ as d → 0. The Ewald–Oseen theorem for the case of
s-polarized light at oblique incidence can therefore be proven along the same
lines as described earlier for normal incidence.
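The two numbers quoted for the oblique-incidence example (the internal propagation angle and the half-wave thickness) follow directly from Snell's law. A minimal Python sketch, with hypothetical function names:

```python
import math


def internal_angle_deg(theta_deg, n):
    """Propagation angle inside a medium of index n, from Snell's law
    (the medium of incidence is taken to be free space)."""
    return math.degrees(math.asin(math.sin(math.radians(theta_deg)) / n))


def half_wave_thickness_nm(wavelength_nm, n, theta_deg):
    """Slab thickness lambda0 / (2 n cos(theta')) for which r and t
    complete one cycle of variation at the given angle of incidence."""
    theta_in = math.radians(internal_angle_deg(theta_deg, n))
    return wavelength_nm / (2 * n * math.cos(theta_in))


print(f"internal angle      = {internal_angle_deg(50.0, 2.0):.2f} deg")
print(f"half-wave thickness = {half_wave_thickness_nm(633.0, 2.0, 50.0):.1f} nm")
```

At normal incidence the same function returns 633/4 = 158.25 nm, the half-wave thickness that sets the horizontal range of Figure 16.2.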
The case of p-polarized light, depicted in Figure 16.10, is somewhat different,
however. Here the directionality of the dipole oscillations within the slab breaks
the symmetry between the forward- and backward-radiated beams. The angle θ″
between the direction of oscillation of the dipoles and the plane of the slab may
be determined by considering multiple reflections within the slab. For very thin
slabs, it is possible to show that
Note that at Brewster’s angle, where tan θ = n, we have tan θ″ = 1/n, that is,
θ″ = θ′, where θ′ is the propagation angle within the medium as given by Snell’s
law. At angles below the Brewster angle θ″ < θ′, while above the Brewster
angle θ″ > θ′.
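The step from tan θ″ = 1/n to θ″ = θ′ rests on the fact that the refracted angle at Brewster's incidence also has tangent 1/n. A short derivation, using only Snell's law for incidence from free space:

$$\tan\theta = n \;\Rightarrow\; \sin\theta = \frac{n}{\sqrt{1 + n^{2}}}, \quad \cos\theta = \frac{1}{\sqrt{1 + n^{2}}},$$

$$\sin\theta' = \frac{\sin\theta}{n} = \frac{1}{\sqrt{1 + n^{2}}} = \cos\theta \;\Rightarrow\; \theta' = 90^{\circ} - \theta, \quad \tan\theta' = \frac{1}{n}.$$

For n = 2, for example, this gives θ ≈ 63.43° and θ′ ≈ 26.57°.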
For the case of p-polarized light of wavelength λ0 = 633 nm incident at θ = 50°
on a slab of index n = 2, plots of r and t versus the slab thickness d are shown in
Figure 16.11. Although the magnitude of tp − rp can still be shown to be unity, its
phase does not approach 2π(d/λ0) cos θ as d → 0. This is a manifestation of the
breakdown of symmetry between the forward and backward radiations. If the
magnitudes of the beams radiated in the two directions are taken into account,
however, the preceding arguments can be restored. One may readily observe from
[Figure 16.11 plots: (a) amplitudes |rp| and |tp|; (b) phases φ(rp) and φ(tp) in degrees; (c) |tp − Wrp| and the normalized phase φ(tp − Wrp)/(2πd cos θ/λ0); all versus thickness d (nm), 0 to 175.]
Figure 16.11 Computed plots of r and t for a slab of thickness d and index
n = 2, when a p-polarized plane wave with λ0 = 633 nm illuminates the slab at
θ = 50°. The horizontal axis covers one cycle of variations in r and t,
corresponding to a half-wave thickness of the slab at this particular angle of incidence.
Figure 16.10 that the ratio of the forward- and backward-propagating magnitudes
must be given by
[Figure 16.12 plot: rp/[tp − exp(i2πd cos θ/λ0)] and 1/W versus angle of incidence (degrees), 0 to 90.]
As a further test of Eq. (16.11), we show in Figure 16.12 the computed plot
versus θ of rp/[tp − exp(i2πd cos θ/λ0)] for a slab with d = 10 nm and n = 2,
illuminated by a plane wave with λ0 = 633 nm. This curve overlaps the plot of the
function 1/W(θ) exactly. Taking into account the ratio W(θ) between the forward
and backward radiated beams, one can prove the Ewald–Oseen theorem as before.
Appendix
This chapter, when originally published in Optics & Photonics News,
prompted the following criticism and reply.
“Editor:
While we are pleased that Masud Mansuripur has called attention in OPN to the
rather basic Ewald–Oseen extinction theorem, we wish to take issue with certain
parts of his article.1
“Mansuripur states that the goal of his article is ‘to outline a general proof of
the Ewald–Oseen theorem using arguments that are based primarily on thin-
film optics.’ We wish to note first that the proof he outlines, based on the
field produced by a uniform sheet of dipole oscillators and the assumed form
exp[2πinz/λ0] for the field inside the medium, is essentially the same approach
used by Fearn, James, and Milonni.2 Their proof is more general in that Fresnel
coefficients (for normal incidence) are derived rather than assumed. Indeed, the
derivation of the Fresnel coefficients assumes the extinction of the incident field
inside the dielectric medium: Mansuripur’s starting point implicitly assumes the
very theorem he is trying to prove! In this connection we note that it was not
claimed by Fearn et al. that they provided a ‘general proof ’ of the extinction
theorem. A general proof, valid for media bounded by surfaces of arbitrary shape,
is given by Born and Wolf.3
“Mansuripur cites References 2 and 3 in support of his opinion that the proof
of the extinction theorem is ‘devoid of physical insight’. While it is true that the
proofs given in these references involve ‘complicated integration over dipolar
fields in three-dimensional space,’ we do not think it is fair to say it [the proof] is
devoid of physical insight. In Reference 3, page 101, the significance of the
theorem is described in the following manner that could hardly be more phys-
ical: ‘The incident wave may . . . be regarded as extinguished at any point
within the medium by interference with the dipole field and replaced by another
wave with a different velocity (and generally also a different direction) of
propagation.’
“Finally we note that various features of the extinction theorem have been
interpreted differently by various authors: some of these differences have been
discussed by Fearn et al.2 It would be unfortunate if readers of Mansuripur’s
article were left with the impression that the theorem can somehow be based
‘primarily on thin-film optics.’
1 M. Mansuripur, The Ewald–Oseen extinction theorem, Opt. & Phot. News 9 (8),
50–55 (1998).
2 H. Fearn et al., Microscopic approach to reflection, transmission, and the Ewald–
Oseen extinction theorem, Am. J. Phys. 64, 986–995 (1996).
3 M. Born and E. Wolf, Principles of Optics, sixth edition, Cambridge University Press,
Cambridge UK 1985, section 2.4.2.
An informal survey of some colleagues and students revealed that the notion of
reciprocity in optics is not widely appreciated. One colleague even justified the
prevailing ignorance by drawing a parallel between reciprocity in optics and
complementarity in quantum mechanics: “Both are true statements which have
little, if any, practical value in their respective domains.” This chapter is an
attempt at explaining the concept of reciprocity, clarifying some associated
misconceptions, and pointing out its practical applications.
17 Reciprocity in classical linear optics 225
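[Figure 17.1 diagram: a polarizing beam-splitter, a 45° Faraday rotator, and a mirror, with the beam polarization labeled P.]
[Figure 17.2 diagram: a polarizing beam-splitter, a quarter-wave plate, and a mirror, with the beam polarization labeled P, S, RCP, and LCP.]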
Figure 17.2 The quarter-wave plate as used in this system helps to separate the
reflected beam from the incident beam. The key contribution is made by the
(conventional) mirror, which converts the incident RCP beam into LCP upon
reflection.
mirror, the purpose being to separate the reflected beam from the incident beam
efficiently, as well as to isolate the laser diode.) Although the system of Figure
17.2 behaves very much like that of Figure 17.1, no one claims that a QWP is
non-reciprocal. This seeming paradox can be resolved after a careful examination
of the concept of reciprocity, to which we now turn.
Is a polarizer reciprocal?
Consider the simple linear polarizer shown in Figure 17.3. A collimated beam of
light entering from the left-hand side emerges from the polarizer linearly polarized
along the transmission axis. The polarization state of the incident beam may be
decomposed into two linear components, one parallel and the other perpendicular to
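[Figure 17.3 diagram: a linear polarizer with its transmission axis marked and the beam polarization labeled P.]
[Figure 17.4 diagram: a plano-convex lens on the Z-axis with its focal point F.]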
the transmission axis. Assuming an ideal polarizer, the entire parallel component is
transmitted while the entire perpendicular component is absorbed within the
polarizer. If the direction of propagation of the transmitted beam is reversed, it will
pass through the polarizer without any change. Since the original state of polarization
of the incident beam is not recovered, the polarizer is a non-reciprocal element.
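The argument can be made concrete with Jones vectors. The Python sketch below models the ideal polarizer as a projection onto its transmission axis, and it assumes that the reversed beam encounters the same projection on its way back. The function name and the sample input are hypothetical.

```python
import math


def apply_polarizer(jones, axis_deg=0.0):
    """Project a Jones vector (Ex, Ey) onto the transmission axis of an
    ideal linear polarizer; the perpendicular component is absorbed."""
    ax = math.cos(math.radians(axis_deg))
    ay = math.sin(math.radians(axis_deg))
    amplitude = jones[0] * ax + jones[1] * ay
    return (amplitude * ax, amplitude * ay)


def power(jones):
    """Optical power carried by a Jones vector."""
    return abs(jones[0]) ** 2 + abs(jones[1]) ** 2


incident = (1 / math.sqrt(2), 1 / math.sqrt(2))   # linear polarization at 45 degrees
forward = apply_polarizer(incident)                # first pass through the polarizer
backward = apply_polarizer(forward)                # transmitted beam sent back through

print("incident :", incident, " power =", round(power(incident), 3))
print("forward  :", forward, " power =", round(power(forward), 3))
print("backward :", backward, " power =", round(power(backward), 3))
```

The reversed beam emerges unchanged from its second pass, still polarized along the transmission axis and still carrying only half of the original power, so the 45° input state is not recreated.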
One might argue that in one sense the polarizer is reciprocal because, irre-
spective of whether the incident beam illuminates it from the left or from the right
side, it behaves the same way. However, this turns out to be a poor way to define
reciprocity, because it cannot be generalized to cover other optical elements. For
example, consider the simple plano-convex lens shown in Figure 17.4. As will be
shown below, lenses in general are reciprocal elements. However, a collimated
beam of light shining on the convex surface of this lens comes to focus with
less spherical aberration than a beam shining on its flat surface (see Figures 17.5
and 17.6). Therefore, if reciprocity required the identity of behavior from both
sides of an element, one would end up with the undesirable result that a plano-
convex lens, for example, is non-reciprocal. To avoid this outcome we return to
our earlier definition that the beam transmitted through a reciprocal element, when
“properly” reversed, must recreate the incident beam in the reverse direction. It is
in this sense that the polarizer of Figure 17.3 is non-reciprocal.
Figure 17.5 Plots of intensity and phase corresponding to the plano-convex lens
of Figure 17.4, illuminated by a collimated and uniform beam from the convex
side. (a) Intensity distribution immediately after the beam leaves the plane facet of
the lens. (b) Residual phase distribution immediately after the beam leaves the
plane facet. The curvature of the beam has been removed from the phase distri-
bution, leaving only the residual spherical aberration balanced by a small amount
of defocus. The r.m.s. value of these residual aberrations over the entire aperture
is 0.17λ0. (c) Intensity distribution in the plane of best focus, i.e., at the circle of
least confusion. (d) Same as (c) but on a logarithmic scale and over a larger area.
Figure 17.6 Same as Figure 17.5 for the case when the beam enters from the
plane side of the lens. (a) Emergent intensity distribution at a plane tangent to the
convex facet of the lens at its vertex. Note the larger diameter of the emergent
beam compared with Figure 17.5(a). (b) Residual phase of the emergent beam
within the tangent plane to the convex surface. The r.m.s. wavefront aberration
over the entire aperture is 0.68k0. (c) Distribution of intensity in the plane of best
focus. (d) Same as (c) but on a logarithmic scale and over a larger area.
beam back towards the lens. Upon re-emerging from the lens the beam, now
collimated once again, propagates in the reverse direction of the original incident
beam. Is this sufficient proof that the lens is reciprocal? The answer is no, for the
following reasons. What if the lens has aberrations? What if the incident beam is
only illuminating one half of the lens’s aperture, as in Figure 17.7(b)? What if the
mirror is displaced from the focal plane of the lens, as in Figure 17.7(c)? In all
these examples (and many more that can be conceived) the returning beam does
not retrace the path of the incident beam. Does this mean that the lens is non-
reciprocal? Again the answer is no. The culprit in all these examples is the mirror,
which does not “properly” reverse the path of the beam.
What we need in place of the conventional mirror is a phase-conjugate mirror1
(PCM) to reverse the wavefront properly. Suppose a PCM is placed perpendicular
to the Z-axis at z = z0. If the complex-amplitude distribution incident on the PCM is
denoted by A(x, y, z0), then the reflected wavefront at the plane of the mirror will be
A*(x, y, z0), which propagates along the negative Z-axis and completely retraces the
incidence path. Substituting the ordinary mirror by a PCM in Figure 17.7 ensures
that the beam is properly reversed in each case, and proves beyond any doubt that
lenses are reciprocal.
becomes left circularly polarized (LCP) and vice versa. The result is that the
QWP in Figure 17.2 rotates the polarization of the beam by 90° in double pass,
forcing it to change its propagation direction at the PBS. If the mirror is replaced
by a PCM, the sense of circular polarization does not change upon reflection, and
the beam emerges from the QWP with the same linear polarization as it had when
it first entered the plate. The returning beam thus retraces its path, proving the
reciprocity of the QWP.
The question arises as to what happens in the system of Figure 17.1 if the
mirror is replaced by a PCM. Since the beam incident on the mirror is linearly
polarized, it remains linear whether it is reflected from an ordinary mirror or from
a PCM. Therefore, the path of the reflected light in Figure 17.1 does not change as
a result of changing the mirror, confirming our earlier conclusion that the Faraday
rotator is non-reciprocal.
as a_p exp[i(φ_p + ψ_p)] and a_s exp[i(φ_s + ψ_s)]. The second reflection from the
TIR mirror eliminates the acquired phases ψ_p and ψ_s and returns the conjugate of
the original incident beam, which is exactly what is needed. A TIR mirror,
therefore, is a reciprocal element.
A regular beam-splitter
There are many different ways of constructing a beam-splitter. For simplicity’s sake,
let us consider the specific beam-splitter shown in Figure 17.9. This flat piece of glass
of thickness d and refractive index n has no coating layers and is used at a 45° angle
of incidence. If the reflected and transmitted beams are returned by conventional
mirrors, as shown in the figure, then, in general, a certain fraction of the light returns
along the incidence path and the remainder leaves the beam-splitter along a fourth
direction. However, if the conventional mirrors in Figure 17.9 are replaced by PCMs,
the entire beam will retrace its original path.
To see this we must first examine certain properties of the glass slab that forms
the beam-splitter. Figure 17.10 shows computed plots of the reflection and trans-
mission coefficients versus the thickness d of the slab. The assumed refractive
index is n = 2, the angle of incidence is fixed at θ = 45°, and the incident beam is
a coherent and monochromatic beam from a red HeNe laser (λ0 = 633 nm).
Only the range of thicknesses corresponding to one half-wavelength is shown in
Figure 17.9 A parallel plate made of a glass slab of thickness d and refractive
index n used as a beam-splitter. The collimated and uniform incident beam is
partially reflected and partially transmitted at the slab. If conventional mirrors are
used to return the reflected and transmitted beams back to the beam-splitter, in
general a fraction of the beam will go back towards the source but the remainder will
leave the beam-splitter in a fourth direction. However, if the mirrors are replaced by
phase-conjugate mirrors, the entire beam will return along the incidence path.
Figure 17.10, since the reflection and transmission coefficients are periodic with
this period. The half-wave thickness of the slab is d = λ0/(2n cos θ′) = 169.2 nm.
Here θ′ = 20.7°, obtained from Snell’s law, is the angle between the propagation
direction within the slab and the slab’s surface normal. The reflection and trans-
mission coefficients for both p- and s-polarized light are shown in the figure.
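The numbers quoted above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic (Snell's law followed by d = λ0/(2n cos θ′)); the function name `half_wave_thickness` is an illustrative choice, and the slab is assumed to be surrounded by air.

```python
import math

def half_wave_thickness(wavelength_nm, n, theta_deg):
    """Return (internal angle in degrees, half-wave thickness in nm) for a slab in air."""
    # Snell's law: sin(theta) = n * sin(theta')
    theta_inside = math.asin(math.sin(math.radians(theta_deg)) / n)
    # One period of the slab's reflection/transmission coefficients
    d_half = wavelength_nm / (2.0 * n * math.cos(theta_inside))
    return math.degrees(theta_inside), d_half

theta_in, d_half = half_wave_thickness(633.0, 2.0, 45.0)
print(f"theta' = {theta_in:.1f} deg, d = {d_half:.1f} nm")
```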
Note in Figure 17.10 that, at any given thickness, |r|² + |t|² = 1 and φ_r − φ_t = 90°.
In fact, it may be shown that these two properties of the slab are quite general and
hold not only for all thicknesses but also for all values of the refractive index n,
angle of incidence θ, and wavelength λ0. The first identity is a trivial statement of
the principle of conservation of energy. The second, relating the phase angles of the
reflected and transmitted beams, is more subtle, but its violation also results in non-
conservation of energy, as we shall see shortly.
When the transmitted beam returns to the slab via a PCM it will have an
amplitude t*. Upon transmission (in the reverse direction) its amplitude becomes
tt*; it will then combine with the reversed reflected beam whose amplitude at this
point is rr*. The total returning amplitude is therefore rr* + tt* = |r|² + |t|² = 1.
The remainder of the beam, leaving the beam-splitter in the fourth direction, will
have a total amplitude rt* + r*t = 2|rt| cos(φ_r − φ_t), which is exactly zero because
the phase difference between r and t is 90°. Thus the beams reversed by the two
PCMs combine at the beam-splitter to yield the reverse propagating beam along
the original path, leaving no other light to go in the fourth direction.
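The two slab properties used in this argument can be checked numerically. The sketch below assumes the standard multiple-reflection (Airy) expressions for a lossless slab in air, with r and t referenced to the two faces of the slab; these formulas and the name `slab_rt` are assumptions made for illustration, not expressions taken from this chapter.

```python
import cmath
import math

def slab_rt(d_nm, wavelength_nm=633.0, n=2.0, theta_deg=45.0, pol="s"):
    """Amplitude reflection and transmission coefficients of a lossless slab in air."""
    theta = math.radians(theta_deg)
    cos_i = math.cos(theta)
    cos_t = math.sqrt(1.0 - (math.sin(theta) / n) ** 2)   # Snell's law
    if pol == "s":
        r12 = (cos_i - n * cos_t) / (cos_i + n * cos_t)   # air-to-glass Fresnel coefficient
    else:
        r12 = (n * cos_i - cos_t) / (n * cos_i + cos_t)
    delta = 2.0 * math.pi * n * d_nm * cos_t / wavelength_nm   # one-pass phase in the slab
    phase = cmath.exp(1j * delta)
    denom = 1.0 - r12 ** 2 * phase ** 2
    r = r12 * (1.0 - phase ** 2) / denom
    t = (1.0 - r12 ** 2) * phase / denom
    return r, t

for pol in ("p", "s"):
    for d in (25.0, 60.0, 110.0, 150.0):
        r, t = slab_rt(d, pol=pol)
        back = r * r.conjugate() + t * t.conjugate()     # amplitude returned to the source
        fourth = r * t.conjugate() + r.conjugate() * t   # amplitude sent in the fourth direction
        print(pol, d, round(abs(back), 12), round(abs(fourth), 12))
```

For every thickness and both polarizations the returned amplitude should print as 1 and the fourth-direction amplitude as 0, up to rounding.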
[Figure 17.10: computed amplitude (a, c) and phase in degrees (b, d) of the reflection and transmission coefficients versus slab thickness (0–175 nm), for p-polarized (a, b) and s-polarized (c, d) light.]
Although the above proof for reciprocity of the glass slab was given for plane
waves, one can show its validity in the general case of a finite-size incident beam as
well. To appreciate the effects of finite size, consider the plots of intensity distribution
in Figure 17.11, computed for a HeNe beam of diameter 2000λ0 upon
reflection from and transmission through a slab 500 μm thick of n = 2 glass. Near
the edges of the beam the various reflected (or transmitted) orders do not overlap
and, consequently, give rise to varying degrees of brightness in these regions.
Instead of considering the edges separately, however, the appropriate proof of
reciprocity for a finite-size beam involves the consideration of such beams as a
superposition of a large number of plane waves traveling in different directions
(i.e., angular spectrum decomposition). Since the reciprocity applies to each such
plane wave, it must, of necessity, also apply to their linear superposition.
now propagating along the negative direction of that same k-vector. If we replace
the E-fields by E* and the H-fields by H* everywhere, Maxwell’s equations
remain satisfied so long as the dielectric tensor of the material environment obeys
the relation ε = ε* at all points of space. This latter relation holds, for example, if
the medium is isotropic and lossless (i.e., ε is a real-valued scalar), or if the
medium is birefringent but non-absorptive (i.e., ε is a real-valued symmetric
matrix), or if the medium has optical activity of the type observed in sugar
crystals. If, however, the medium is absorptive, or if it has magneto-optical
activity such as that exhibited by a Faraday rotator, then ε ≠ ε*, in which case the
reverse-propagating beam(s) violate Maxwell’s equations and, consequently,
reciprocity breaks down.
$$ t r^{*} + r' t^{*} = 0. \tag{17.2} $$
stack applies quite generally unless one or more layers are absorptive or
magneto-optically active. In fact, the media of incidence and emergence on the
two sides of the stack do not have to be identical either. Using the method of
proof outlined above, one can readily show that the behavior of dielectric stacks
remains symmetrical even when the media above and below the stack have
arbitrary refractive indices n1 and n2, provided that proper account is made of
the difference in beam cross-section and the dependence of power on the
refractive index.
Another interesting property of multilayer stacks arises when one or more of
the layers happen to be absorptive. Since reciprocity no longer applies to this
case, it should come as no surprise that the reflectivities of the two sides of the
stack are, in general, different. What is surprising is that, even in the presence of
absorption, the transmissivity continues to be the same from both sides. This
property can be proven using standard methods of thin-film-stack calculation4
and has been verified numerically in several situations. A simple proof for the
symmetric behavior of the transmissivity under quite general conditions is given
in the following appendix.
[Figure: (a) amplitudes |r_p|, |r_s|, |t_p|, |t_s| and (b, c) phases in degrees of the p- and s-polarized reflection and transmission coefficients versus angle of incidence (0–90 degrees).]
Appendix
We prove that the Fresnel transmission coefficient t for a multilayer stack con-
sisting of metal and dielectric layers does not depend on whether the light is
incident from the top or the bottom of the stack. For stacks consisting solely of
dielectric layers this property has been proved in the present chapter, using
reciprocity. Reciprocity, however, breaks down in the presence of absorptive
layers, and one needs to resort to an alternative method of proof, such as that
outlined below.
A general stack consists of an arbitrary number of layers, each having thickness
d_j and complex refractive index (n + ik)_j, the subscript j referring to the layer
number. For an incident plane wave of wavelength λ, arriving at the top of the
stack at angle θ, the Fresnel reflection and transmission coefficients of the stack
are denoted by r and t, respectively. Similarly, when the beam is incident from
the bottom side on the stack (again at angle θ), the Fresnel coefficients are
denoted r′ and t′. Our goal is to demonstrate the equality of t and t′, even though,
in general, r and r′ may differ from each other.
Consider the hypothetical situation shown in Figure A17.1, where the stack is
split along an interfacial plane into two smaller stacks separated by an air gap d.
The upper stack, identified as stack 1, has reflection and transmission coefficients
from top and bottom denoted by r1, t1, r1′, t1′. Similarly, the corresponding
parameters of the lower stack, stack 2, are r2, t2, r2′, t2′. The transmissivity t of the
entire stack (in the presence of the air gap) can be obtained by adding an infinite
number of terms corresponding to the beams bouncing back and forth in the gap,
namely,
$$ t = t_1 t_2 e^{i\phi} \sum_{m=0}^{\infty} \left( r_1' r_2 e^{2i\phi} \right)^m = \frac{t_1 t_2 e^{i\phi}}{1 - r_1' r_2 e^{2i\phi}}. \tag{A17.1} $$
Here φ = 2πd cos θ/λ is the phase delay due to one passage of the beam through
the gap. In the limit of a vanishing gap (i.e., d → 0) we find a simple expression
for t in terms of the parameters of stacks 1 and 2:
$$ t = t_1 t_2 / (1 - r_1' r_2). \tag{A17.2} $$
The argument for the equality of t and t′ flows readily from Eq. (A17.2) and its
counterpart for incidence from the bottom of the stack,
$$ t' = t_2' t_1' / (1 - r_2 r_1'), \tag{A17.3} $$
using proof by induction as follows. It is clear that if the individual
sub-stacks are such that t1 = t1′ and t2 = t2′, then t = t′ is guaranteed. For each
sub-stack the reduction to a pair of smaller stacks can be repeated until each sub-stack is a
single layer, in which case t1 = t1′ and t2 = t2′ obviously hold. The proof is thus
complete.

[Figure A17.1: a multilayer stack split into stack 1 (r1, t1, r1′, t1′) and stack 2 (r2, t2, r2′, t2′).]
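The symmetry of the transmission coefficient can also be checked numerically. The sketch below is not the inductive proof given above; it assumes the standard characteristic-matrix method for thin-film stacks, restricted to normal incidence, with air on both sides and the exp(−iωt) convention so that absorbing layers have index n + ik. The layer values and the names `layer_matrix`, `matmul`, and `stack_rt` are hypothetical.

```python
import cmath
import math

def layer_matrix(n_layer, d_nm, wavelength_nm):
    """Characteristic matrix of one homogeneous layer at normal incidence."""
    delta = 2.0 * math.pi * n_layer * d_nm / wavelength_nm   # complex if the layer absorbs
    c, s = cmath.cos(delta), cmath.sin(delta)
    return ((c, -1j * s / n_layer), (-1j * n_layer * s, c))

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def stack_rt(layers, wavelength_nm, n_in=1.0, n_out=1.0):
    """Return (r, t) for light entering the first layer of `layers` from medium n_in."""
    m = ((1.0, 0.0), (0.0, 1.0))
    for n_j, d_j in layers:
        m = matmul(m, layer_matrix(n_j, d_j, wavelength_nm))
    b = m[0][0] + m[0][1] * n_out
    c = m[1][0] + m[1][1] * n_out
    return (n_in * b - c) / (n_in * b + c), 2.0 * n_in / (n_in * b + c)

# Hypothetical stack: dielectric / thin metal / dielectric, listed from top to bottom.
layers = [(1.5, 100.0), (0.2 + 3.4j, 20.0), (2.0, 80.0)]
r_top, t_top = stack_rt(layers, 633.0)
r_bot, t_bot = stack_rt(layers[::-1], 633.0)   # same stack illuminated from below
print("t  from top:", t_top, " from bottom:", t_bot)
print("|r| from top:", abs(r_top), " from bottom:", abs(r_bot))
```

The two transmission coefficients should agree to rounding error while the two reflection coefficients differ, in line with the statement proved in this appendix.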
18 Optical pulse compression

Figure 18.1 Amplitude A(f) of the Fourier transform of a Gaussian pulse train
described by Eq. (18.1), having f0 = 3.75 × 10¹⁴ Hz, Δf = 0.01f0, and M = 10. The
corresponding amplitude profile a(t, z = 0) is a periodic function of time, with
period T = 1/Δf ≈ 267 fs. A single period of the pulse train is shown in (b), and a
close-up appears in (c).
$$ n(f) \approx n_0 + n_1 (f - f_0) + n_2 (f - f_0)^2. \tag{18.3} $$
In the above equation, the first term on the right-hand side is a constant phase-factor
(independent of f), which can, for purposes of the present analysis, be ignored. The
second term is a quadratic phase-factor in (f − f0) = mΔf, which may be combined
with the phase φ_m of A_m in Eq. (18.2); this term is ultimately responsible for the
broadening and chirp induced on the pulse by the effects of dispersion. The last
term is a linear phase-factor that translates the (dispersed) pulse from t = 0 at z = 0
to t = (n0 + n1 f0)z0/c at z = z0. The group velocity V_g is thus found to be
$$ V_g = c/(n_0 + n_1 f_0). \tag{18.5} $$
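As a quick numerical illustration of Eq. (18.5), the sketch below evaluates the group index and group velocity using the fused-silica values quoted in this chapter for λ0 = 0.8 μm (n0 = 1.4534, with the exponent of n1 read as 10⁻¹⁷ s). The 4 mm path length is the one used in the chapter's propagation example; the variable names are illustrative.

```python
C = 2.99792458e8      # speed of light in vacuum, m/s

n0 = 1.4534           # refractive index at f0
n1 = 3.69e-17         # dn/df at f0, in seconds (exponent sign assumed negative)
f0 = 3.75e14          # center frequency, Hz

group_index = n0 + n1 * f0          # denominator of Eq. (18.5)
v_g = C / group_index               # group velocity, m/s
delay_ps = group_index * 4.0e-3 / C * 1e12   # transit time through 4 mm, in ps

print(f"group index = {group_index:.4f}")
print(f"V_g = {v_g:.4e} m/s")
print(f"delay over 4 mm = {delay_ps:.2f} ps")
```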
Figure 18.2 Plots of n0, n1, and n2 versus the optical frequency f for fused
silica in the wavelength range λ = 0.3 μm – 1.6 μm. The refractive index n0(f) is
measured and fitted to the Sellmeier equation, then the derivatives of the
equation are obtained analytically to yield the plots of n1 and n2.
Figure 18.3 The pulse depicted in Figure 18.1 after propagating a distance of
4.0 mm in fused silica (n0 = 1.4534, n1 = 3.69 × 10⁻¹⁷ s, n2 = 0.6 × 10⁻³³ s² at
λ0 = c/f0 = 0.8 μm).
at λ0 = c/f0 = 0.8 μm). Clearly it does not take much propagation for a short pulse
of the given wavelength in the given material to become significantly broadened.
(Here we have used the fact that n″ = 2n2.) The so-called group velocity dispersion
(GVD) defined by Eq. (18.6) is clearly proportional to the coefficient (n1 + n2 f0)
appearing in the quadratic phase factor in Eq. (18.4). In particular, the sign of (n1 + n2 f0)
determines whether V_g is an increasing or decreasing function of frequency.
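$$ \cdots \times \cos\left\{ 2\pi f_0 t + \pi \left[ \alpha_2/(\alpha_1^2 + \alpha_2^2) \right] t^2 - \tfrac{1}{2}\tan^{-1}(\alpha_2/\alpha_1) - \phi_0 \right\}. \tag{18.8} $$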
Note that the field amplitude a(t) has units of volt/meter, namely, those of
the electric field in the MKSA system of units. The pulse envelope is a Gaussian
function whose width is proportional to √[(α1² + α2²)/α1]. (To obtain the pulse’s
higher frequency. On the other hand, if (n1 + n2 f0) happens to be negative, the chirp
frequency will decrease with time (down-chirp). However, since the GVD is
positive in this case, the leading edge, once again, will travel faster than the trailing
edge. Either way, the pulse is seen to broaden as a result of propagation in the
dispersive medium, which is the same conclusion arrived at earlier, when we
argued that the width of the Gaussian pulse of Eq. (18.8) is an increasing function
of α2. The minimum width occurs at z0 = 0, where α2 = 0; here the pulse is said to
be transform-limited, meaning that its width, √α1, cannot be reduced any further,
owing to the finite width of its Fourier transform A(f).
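The growth of the pulse width with α2 can be put into numbers. The sketch below rests on an assumption that is not spelled out in this section: that α2 equals 2z0(n1 + n2 f0)/c, which is what the quadratic phase term of the dispersive medium implies when it is combined with a Gaussian spectrum exp[−πα1(f − f0)²]. The value of α1 is borrowed from the chapter's later example and is used here only for illustration.

```python
import math

C = 2.99792458e8   # speed of light in vacuum, m/s

def alpha2(z0, n1, n2, f0):
    """Assumed dispersion parameter: alpha2 = 2*z0*(n1 + n2*f0)/c, in s^2."""
    return 2.0 * z0 * (n1 + n2 * f0) / C

def broadening_factor(alpha1, a2):
    """Width relative to the transform limit: sqrt((a1^2 + a2^2)/a1) divided by sqrt(a1)."""
    return math.sqrt(1.0 + (a2 / alpha1) ** 2)

alpha1 = 5.0e-28                                   # s^2, illustrative value
a2 = alpha2(4.0e-3, 3.69e-17, 0.6e-33, 3.75e14)    # 4 mm of fused silica
factor = broadening_factor(alpha1, a2)
print(f"alpha2 = {a2:.3e} s^2, broadening factor = {factor:.2f}")
```

Under these assumptions a pulse of this bandwidth roughly doubles in width over 4 mm, and the factor returns to 1 when α2 = 0.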
where I_peak = n0|A0|²/(2Z0α1) is the pulse’s peak intensity (i.e., optical power per unit
cross-sectional area). Here Z0 ≈ 377 Ω is the free-space impedance in the MKSA
system of units. The pulse’s Gaussian intensity profile is approximated in Eq. (18.9)
by the quadratic function (1 − 2πt²/α1), which provides an accurate description at
and around the center of the pulse, but grossly underestimates the intensity distribution
further away, in the wings. The only justification for this approximate
treatment is that it simplifies the following analysis; more realistic calculations,
therefore, must properly account for the actual pulse’s intensity profile.
If the nonlinear refractive index of the medium happens to be proportional to
I(t), namely, n(I) = n0 + γI, which is characteristic of media with the so-called
Kerr nonlinearity, then, assuming dispersionless propagation and ignoring time-independent
terms, the phase modulation imparted to the pulse after propagating
a distance z0 will be
$$ \Delta\phi(t) \approx 2\pi\gamma I(t) z_0/\lambda_0 \approx -\left[ 4\pi^2 f_0 \gamma I_{\rm peak} z_0/(\alpha_1 c) \right] t^2 = -\pi t^2/\alpha_3, \tag{18.10} $$
Figure 18.4 The original pulse of Figure 18.1(c) after acquiring a nonlinear
phase shift, Δφ(t) = −πt²/α3, with α3 = 8.5 × 10⁻²⁹ s². Other relevant parameters
are |A0| = 2.0 × 10⁻⁶ V·s/m, α1 = 5.0 × 10⁻²⁸ s², f0 = 3.75 × 10¹⁴ Hz. The chirp
frequency is seen to be a linearly increasing function of time.
a plot of the original pulse depicted in Figure 18.1(c) after acquiring the nonlinear
phase shift given by Eq. (18.10) with the above value of a3. The quadratic nature
of the phase-shift in Eq. (18.10) is responsible for the chirped behavior of the
oscillations in Figure 18.4, where the frequency is seen to be a linearly increasing
function of time (the so-called up-chirp).
When the quadratic phase-factor exp(−iπt²/α3) is imposed on a transform-limited
Gaussian pulse, it does not change the Gaussian nature of the pulse, but
modifies the pulse parameter from α = α1 to β. Defining 1/β = (1/α1) + i/α3, the
Fourier transform of the chirped pulse becomes
$$ A(f) = A_0 \sqrt{\beta/\alpha_1}\, \exp\left[ -\pi\beta (f - f_0)^2 \right]. \tag{18.11a} $$
Writing β = β1 − iβ2, we find
$$ \beta_1 = \alpha_1 \big/ \left[ 1 + (\alpha_1/\alpha_3)^2 \right], \tag{18.11b} $$
$$ \beta_2 = \alpha_3 \big/ \left[ 1 + (\alpha_3/\alpha_1)^2 \right]. \tag{18.11c} $$
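Equations (18.11b) and (18.11c) follow from taking the reciprocal of the complex number (1/α1) + i/α3, which is easy to confirm numerically. The sketch below uses the values of α1 and α3 quoted for Figure 18.4 (with exponents read as 10⁻²⁸ and 10⁻²⁹) and also evaluates the ratio √(α1/β1), i.e., how much narrower the transform-limited pulse of the broadened spectrum could be. The function name is illustrative.

```python
import math

def chirped_pulse_parameter(alpha1, alpha3):
    """Return (beta, beta1, beta2) with 1/beta = 1/alpha1 + i/alpha3 and beta = beta1 - i*beta2."""
    beta = 1.0 / complex(1.0 / alpha1, 1.0 / alpha3)
    beta1 = alpha1 / (1.0 + (alpha1 / alpha3) ** 2)   # Eq. (18.11b)
    beta2 = alpha3 / (1.0 + (alpha3 / alpha1) ** 2)   # Eq. (18.11c)
    return beta, beta1, beta2

alpha1, alpha3 = 5.0e-28, 8.5e-29
beta, beta1, beta2 = chirped_pulse_parameter(alpha1, alpha3)
ratio = math.sqrt(alpha1 / beta1)    # equals sqrt(1 + (alpha1/alpha3)^2)

print(f"beta1 = {beta1:.3e} s^2, beta2 = {beta2:.3e} s^2")
print(f"sqrt(alpha1/beta1) = {ratio:.2f}")
```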
Reflection from the second grating does not modify the acquired phase factor given
in Eq. (18.12), but merely cancels the modulation of the wavefront, exp(i2πmx/p),
which was added at the first grating. The beam thus returns to propagating in its
original direction at an angle θ to the grating normal, but retains the phase φ(λ)
which it acquired while propagating between the two gratings.
We mention in passing that, in addition to the above phase, one must take into
account the phase-shift imparted to each wavelength upon reflection from the two
gratings. The phase shifts of the two gratings, which will be the same if the gratings
are identical, must be added to φ(λ), and their wavelength dependence must be
fully accounted for when computing the quadratic phase-factor imposed on the
emergent beam.
The Taylor series expansion of φ(λ) of Eq. (18.12) around the center frequency
f0 yields
$$ \phi(f) = \Phi_0 + \Phi_1 (f - f_0) + \Phi_2 (f - f_0)^2 + \cdots, \tag{18.13a} $$
where, denoting by θ_m0 the mth order diffraction angle corresponding to the
Figure 18.6 Plot of the function φ(f) − Φ0 − Φ1(f − f0) in the vicinity of
f0 = 3.75 × 10¹⁴ Hz. φ(f) is given by Eq. (18.12), while its first two Taylor series
coefficients, Φ0 and Φ1, are given by Eqs. (18.13b) and (18.13c). The diffraction
order under consideration is m = 1, the assumed grating period is p = 1.0 μm,
the incidence angle is θ = 60°, and the separation between the gratings is
d = 10 mm.
As a typical example, Figure 18.6 shows a plot of the phase function φ(f) of
Eq. (18.12), with the constant and linear terms of Eq. (18.13a) removed. The
horizontal axis is centered at f0 = 3.75 × 10¹⁴ Hz (λ0 = 0.8 μm), the grating period
is p = 1.0 μm, the assumed incidence angle is θ = 60°, the separation between the
gratings is d = 10 mm, and the diffraction order under consideration is m = 1. The
numerical value of Φ2, −1.8 × 10⁻²⁵ s², provides a good match to the actual curvature
of the function plotted in Figure 18.6. This quadratic phase-factor is a linear function of
the separation d of the two gratings, and also a strong function of the grating period p.
Note that the quadratic phase coefficient Φ2 of Eq. (18.13d) is always negative.
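The quoted magnitude of Φ2 can be cross-checked against the widely used closed-form expression for the second-order spectral phase of a grating pair, d²φ/dω² = −λ³d/(2πc²p² cos³θ_m), converted to the frequency variable f through Φ2 = ½(2π)² d²φ/dω². This formula is brought in from the general literature as an assumption; it is not claimed to be identical in form to Eq. (18.13d). The sign convention sin θ_m = sin θ − mλ/p is likewise an assumption, chosen so that the m = 1 order propagates for the stated parameters.

```python
import math

C = 2.99792458e8   # speed of light in vacuum, m/s

def grating_pair_phi2(f0, period, theta_deg, separation, order=1):
    """Quadratic coefficient (s^2) of phi(f) for one pass through a parallel grating pair."""
    lam = C / f0
    sin_m = math.sin(math.radians(theta_deg)) - order * lam / period   # assumed sign convention
    cos_m = math.sqrt(1.0 - sin_m ** 2)
    d2phi_domega2 = -lam ** 3 * separation / (2.0 * math.pi * C ** 2 * period ** 2 * cos_m ** 3)
    return 0.5 * (2.0 * math.pi) ** 2 * d2phi_domega2

phi2 = grating_pair_phi2(f0=3.75e14, period=1.0e-6, theta_deg=60.0, separation=10.0e-3)
print(f"Phi2 = {phi2:.2e} s^2")
```

With the parameters of Figure 18.6 this evaluates to about −1.8 × 10⁻²⁵ s², and the result scales linearly with the grating separation, consistent with the statements above.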
Losses due to diffraction orders other than the mth order used, as well as polarization
dependence of the diffraction efficiency from gratings, can be a disadvantage. Since
the various frequencies are shifted laterally upon emerging from the second grating,
to the extent that this lateral shift cannot be ignored, one must either employ a
second, identical pair of gratings, or return the beam through the same pair, in order
to compensate for this lateral spectral shift. In the end, chirp-compensation with a
grating pair works well for femtosecond and even few-picosecond-long pulses,
but increasing the pulse duration to the sub-nanosecond regime and beyond imposes
unrealistic demands on the grating period p and grating separation d, which renders
impractical this method of chirp-compensation for long pulses.
A third method of chirp-compensation is based on resonant structures, such as
Fabry–Perot etalons. Figure 18.7 is a diagram of a special resonator (the Gires–Tournois
interferometer), which is particularly useful for low-level chirp-cancellation.
For simplicity, let us assume that the front mirror has amplitude
reflectivity ρ1 = ρ exp(iψ1) and transmissivity τ1 from both sides (i.e., symmetric
mirror), and that the second mirror is 100% reflective, that is, ρ2 = exp(iψ2). The
mirrors being lossless, we have |ρ1|² + |τ1|² = 1; also, generally, the phase difference
between ρ1 and τ1 is 90°; therefore, τ1 = i√(1 − ρ²) exp(iψ1). Assuming the
The above phase can be expanded in a Taylor series around the center frequency
f0, as follows:
$$ \phi(f) = \Phi_0 + \Phi_1 (f - f_0) + \Phi_2 (f - f_0)^2 + \cdots. \tag{18.16a} $$
A typical behavior of the GT phase-function φ(f) for the special case of ρ = 0.9 is
shown in Figure 18.8. The dependence of φ on the total retardation ψ = ψ1 +
ψ2 + (4πd/λ), shown in Figure 18.8(a), reveals that φ rises rapidly from 0 to 2π in
the vicinity of resonance, which occurs at ψ = 0. Figure 18.8(b) shows the
dependence of φ′(f) = dφ/df on ψ. As can be seen from Eq. (18.16b), the maximum
value of φ′, namely, (4πd/c)(1 + ρ)/(1 − ρ), occurs on resonance, at ψ = 0.
Therefore, for the (chirped) incident pulse to experience, upon reflection, the full
range of the available phase of the GT resonator, the pulse’s spectral
width should be Δf ≈ (c/4d)(1 − ρ)/(1 + ρ). Assuming Δf ≈ 3.0 × 10¹¹ Hz
(corresponding to pulses in the few-picosecond range), a good choice for the
separation distance of the GT mirrors would be d ≈ 14 μm.
With reference to Figure 18.8(c), which is a plot of φ″(f) = d²φ/df² versus ψ, it
is clear that a slight increase of the mirror separation d (by only 3.8 nm in the
present example) will shift the center of the incident spectrum (f0 = 3.75 × 10¹⁴ Hz)
to the vicinity of the negative peak of φ″(f), where a large negative quadratic phase
factor is available for chirp cancellation.
Figure 18.9 is a plot of φ(f) in the vicinity of the center frequency f0, with the
first two terms of the Taylor series subtracted. The assumed parameters are ρ = 0.9,
ψ1 = ψ2 = 0, d = 14.005 μm, and the computed Taylor series coefficients are
Φ0 = 1.28, Φ1 = 7.2 × 10⁻¹², Φ2 = −1.9 × 10⁻²³. It is easy to verify that the
quadratic function Φ2(f − f0)² provides a fairly good match to the actual phase
depicted in Figure 18.9.
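The quoted values of Φ1 and Φ2 can be reproduced from the slope of the GT phase with respect to the retardation ψ. The sketch below assumes the standard result dφ/dψ = (1 − ρ²)/(1 + ρ² − 2ρ cos ψ), which agrees with the on-resonance maximum (1 + ρ)/(1 − ρ) stated above but is not copied from Eq. (18.16b). It also assumes c = 3.0 × 10⁸ m/s, the rounded value implied by the chapter's pairing of f0 = 3.75 × 10¹⁴ Hz with λ0 = 0.8 μm; the result is sensitive to this choice because the operating point lies only a few nanometers of mirror spacing away from resonance.

```python
import math

C = 3.0e8   # m/s, rounded so that c/f0 is exactly 0.8 um (assumption, see above)

def gt_taylor(f0, d, rho, psi1=0.0, psi2=0.0):
    """Return (Phi1, Phi2): first and second Taylor coefficients of the GT phase about f0."""
    k = 4.0 * math.pi * d / C                   # d(psi)/df
    psi = psi1 + psi2 + k * f0                  # total retardation at f0
    denom = 1.0 + rho ** 2 - 2.0 * rho * math.cos(psi)
    dphi_dpsi = (1.0 - rho ** 2) / denom
    d2phi_dpsi2 = -(1.0 - rho ** 2) * 2.0 * rho * math.sin(psi) / denom ** 2
    return k * dphi_dpsi, 0.5 * k ** 2 * d2phi_dpsi2

phi1, phi2 = gt_taylor(f0=3.75e14, d=14.005e-6, rho=0.9)
bandwidth = (C / (4.0 * 14.0e-6)) * (1.0 - 0.9) / (1.0 + 0.9)   # (c/4d)(1 - rho)/(1 + rho)

print(f"Phi1 = {phi1:.2e} s, Phi2 = {phi2:.2e} s^2")
print(f"usable bandwidth for d = 14 um: {bandwidth:.2e} Hz")
```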
In general, the magnitude of the quadratic phase available from a GT resonator is
rather small, thus limiting the applicability of this type of device to situations that
involve small compression ratios only. To see this, note in Eq. (18.11) that the
compression ratio R_c = √(α1/β1) is equal to √[1 + (α1/α3)²]; also, β2/β1 = α1/α3;
therefore, β2/β1 = √(R_c² − 1). Defining the bandwidth Δf as the FWHM
of |A(f)|², the phase of A(f) in Eq. (18.11) varies by about 0.35β2/β1 radians
between f0 and f0 ± ½Δf. In practice, the limited amount of quadratic phase
available from a GT resonator means that this device can handle only small
compression ratios (R_c ≈ 2–3).

[Figure 18.8: (a) φ(f) in radians, (b) φ′(f)/(4πd/c), and (c) φ″(f)/(4πd/c)², each plotted versus the total retardation ψ.]
Operating away from the peak of φ″(f) could help provide a slight increase
in the range of the quadratic phase at the expense of introducing third- and
higher-order phase factors into the optical spectrum. It is also possible to
Figure 18.9 Plot of the function φ(f) − Φ0 − Φ1(f − f0) in the vicinity
of f0 = 3.75 × 10¹⁴ Hz. The GT parameters are ρ = 0.9, ψ1 = ψ2 = 0,
d = 14.005 μm, while the computed Taylor series coefficients are Φ0 = 1.28,
Φ1 = 7.2 × 10⁻¹², Φ2 = −1.9 × 10⁻²³. The quadratic function Φ2(f − f0)²
provides a fairly good match to the actual function depicted here.
design the GT mirrors such that ρ, ψ1, and/or ψ2 exhibit strong dependences
on f within the relevant spectral range. Inserting a transparent dielectric slab
(or thin film layer) between the mirrors is another degree of freedom that can
be (and has been) exploited for the purpose of improving the performance
of GT compressors.9
Concluding remarks
In this chapter we have attempted to provide an explanation of the fundamental
principles of optical pulse compression. We stayed away from the advanced
topics, and steered clear of some of the technical difficulties as well as the
ingenious methods that have been developed to overcome them. In practice, one
must contend with a host of technical problems in order to reliably and efficiently
produce high-quality compressed pulses. The nonlinear medium, which imparts
the all-important phase modulation to the initial pulse, may introduce significant
dispersion of its own. This results in a distorted pulse and, often, it is the
mechanism that limits the amount of useful chirp that can be placed on the pulse.
In addition, the third- and higher-order terms introduced into the spectral phase
profile, either within the nonlinear medium or as a consequence of passage
through the chirp compensator, must be identified and corrected, perhaps by
sending the pulse through additional (high-order) compensators. Finally, the
profile of the compressed pulse must be measured to determine the degree of
compression, and to find out whether the pulse is free from distortions and other
imperfections. The interested reader may consult the vast literature of the subject
for further details.
Appendix
Slab waveguide and the effective refractive index of guided modes
Consider the slab waveguide depicted in Figure 18.10. The guiding layer has
thickness d and refractive index n_g. The substrate and the cladding layer, having
refractive indices n_s and n_c, respectively, may be assumed to be infinitely thick.
Within the guiding layer a pair of plane-waves propagate at an angle θ relative to
the surface normal; θ is greater than the critical angle of total internal reflection at
both interfaces, that is, n_g sin θ > max(n_s, n_c). The two plane-waves thus have the
following complex amplitudes:
$$ E^{\pm}(x, z) = |E_0| \exp(\pm i\phi_0) \exp\left[ i(2\pi n_g/\lambda_0)(\pm x\cos\theta + z\sin\theta) \right]. \tag{A18.1} $$
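[Figure 18.10: slab waveguide consisting of a cladding (n_c), a guiding layer (n_g), and a substrate (n_s), with evanescent fields in the cladding and the substrate; propagation is along z.]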
Here the plus sign refers to the up-going beam, the minus sign to the down-going
beam, φ0 defines the relative phase between the two plane-waves, and λ0 = c/f0 is
the vacuum wavelength. At the interface with the cladding, where x = d/2, the
down-going beam must have the same amplitude as the up-going beam, but its
phase must be incremented by the phase of the Fresnel reflection coefficient
at this interface. The Fresnel coefficient, depending on whether the beam is s- or
p-polarized, is r_p = exp(iφ_p) or r_s = exp(iφ_s), where
$$ \phi_p^{(\mathrm{clad})} = +2\tan^{-1}\left[ (n_c^2\cos\theta) \Big/ \left( n_g\sqrt{n_g^2\sin^2\theta - n_c^2} \right) \right], \tag{A18.2} $$
$$ \phi_s^{(\mathrm{clad})} = -2\tan^{-1}\left[ \sqrt{n_g^2\sin^2\theta - n_c^2} \Big/ (n_g\cos\theta) \right]. \tag{A18.3} $$
Matching the phases of the two beams at x = d/2 (to within an integer multiple of 2π) requires
$$ -\phi_0 + (2\pi n_g/\lambda_0)\left( -\tfrac{1}{2} d\cos\theta + z\sin\theta \right) - \phi_{p,s}^{(\mathrm{clad})} = \phi_0 + (2\pi n_g/\lambda_0)\left( \tfrac{1}{2} d\cos\theta + z\sin\theta \right), \tag{A18.4} $$
which leads to
$$ 2\phi_0 + 2\pi n_g (d/\lambda_0)\cos\theta + \phi_{p,s}^{(\mathrm{clad})} = 0. \tag{A18.5} $$
A similar relation must hold at the substrate interface (x = −d/2), where the
down-going beam is incident and the up-going beam is reflected. Therefore,
$$ -2\phi_0 + 2\pi n_g (d/\lambda_0)\cos\theta + \phi_{p,s}^{(\mathrm{sub})} = 0. \tag{A18.6} $$
Equations (A18.5) and (A18.6) can be satisfied simultaneously if and only if an
integer m exists such that
$$ 4\pi n_g (d/\lambda_0)\cos\theta + \phi_{p,s}^{(\mathrm{clad})} + \phi_{p,s}^{(\mathrm{sub})} = 2m\pi. \tag{A18.7} $$
If the guiding layer’s thickness d is sufficiently small, Eq. (A18.7) will have only
one solution (i.e., one acceptable value of θ) for s-light, and perhaps another
solution for p-light. The guide is then said to be single-mode. Larger values of d
lead to more solutions, which correspond to higher-order modes. (Note: θ = 90° is
always an acceptable solution; however, φ0 in this case turns out to be 0° for
p-light and 90° for s-light. Both of these solutions result in the up-going and
down-going plane-waves coming into alignment with equal but opposite amplitudes,
thereby canceling each other out. The solution corresponding to θ = 90°,
therefore, does not lead to a viable mode.) For a viable mode, denoting the
solution of Eq. (A18.7) by θ_m, and with reference to Eq. (A18.1), the E-field
amplitude within the guiding layer will be
$$ E(x, z) = 2|E_0| \cos\left[ (2\pi n_g/\lambda_0)\, x\cos\theta_m + \phi_0 \right] \exp\left[ i(2\pi n_g/\lambda_0)\, z\sin\theta_m \right]. \tag{A18.8} $$
The cross-sectional profile of the mode along the x-axis is thus determined by the
cosine function on the right-hand side of Eq. (A18.8) (and also by the evanescent
fields within the cladding and the substrate). The exponential term in Eq. (A18.8)
is the propagation phase-factor, from which one can identify an effective
refractive index n_eff = n_g sin θ_m for the given mode. Considering that, in general,
both n_g and the solution θ_m of Eq. (A18.7) are functions of the frequency f, the
dispersive properties of the waveguide are seen to arise from the frequency
dependence of n_eff.
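The mode condition of Eq. (A18.7) is transcendental in θ, so in practice it is solved numerically. The sketch below does this by bisection for s-polarized light, using the total-internal-reflection phase of Eq. (A18.3) at both interfaces. The waveguide parameters (n_g = 1.5, n_c = 1.0, n_s = 1.45, d = 1 μm, λ0 = 0.8 μm) and the function names are hypothetical choices made for illustration.

```python
import math

def tir_phase_s(n_g, n_out, theta):
    """Phase of the s-polarized Fresnel coefficient under total internal reflection."""
    root = math.sqrt((n_g * math.sin(theta)) ** 2 - n_out ** 2)
    return -2.0 * math.atan(root / (n_g * math.cos(theta)))

def mode_residual(theta, n_g, n_c, n_s, d, lam, m):
    """Left-hand side of the mode condition minus 2*m*pi; zero at a guided mode."""
    return (4.0 * math.pi * n_g * (d / lam) * math.cos(theta)
            + tir_phase_s(n_g, n_c, theta) + tir_phase_s(n_g, n_s, theta)
            - 2.0 * math.pi * m)

def solve_mode(n_g, n_c, n_s, d, lam, m=0, tol=1e-12):
    """Bisection between the critical angle and grazing incidence; None if no sign change."""
    lo = math.asin(max(n_c, n_s) / n_g) + 1e-9
    hi = math.pi / 2.0 - 1e-9
    f_lo = mode_residual(lo, n_g, n_c, n_s, d, lam, m)
    f_hi = mode_residual(hi, n_g, n_c, n_s, d, lam, m)
    if f_lo * f_hi > 0.0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = mode_residual(mid, n_g, n_c, n_s, d, lam, m)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

n_g, n_c, n_s, d, lam = 1.5, 1.0, 1.45, 1.0e-6, 0.8e-6
theta_m = solve_mode(n_g, n_c, n_s, d, lam, m=0)
n_eff = n_g * math.sin(theta_m)
print(f"theta_m = {math.degrees(theta_m):.2f} deg, n_eff = {n_eff:.4f}")
print("m = 1 solution:", solve_mode(n_g, n_c, n_s, d, lam, m=1))
```

For these illustrative parameters only the m = 0 condition has a root, so the guide is single-mode for s-light, and the effective index falls between the substrate and guiding-layer indices.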
19 The uncertainty principle in classical optics

[Figure 19.1: an aperture of diameter D, the X- and Z-axes, the angular separation Δθ, and the observation plane.]
beams expand as they propagate along Z and, although their centers drift apart,
there is the distinct possibility that they will never be completely separated.
Roughly speaking, we expect the beams to remain more or less collimated
between z = 0 and z = D²/λ, the Rayleigh range2 for a beam of diameter D and
wavelength λ. If at the Rayleigh range the distance between the beam centers is
greater than D, the beams should be separable; otherwise their drifting apart will
go hand in hand with their expansion, and the beams remain entangled as they
propagate beyond the Rayleigh range. The necessary condition for separability is
thus (D²/λ)Δθ > D, or equivalently,
$$ D\,\Delta k_x > 2\pi. \tag{19.1} $$
The lower bound 2π on the product of D and Δk_x appearing in Ineq. (19.1) is not
exact, but depends on the definition of beam diameter D and the adopted criterion
for separability, which are typically imprecise. For all practical purposes, the
number appearing on the right-hand side of Ineq. (19.1) should be on the order of
unity, say, greater than 1 but less than 10.
Invoking the quantum nature of light, if the aperture diameter D is interpreted
as a measure of the uncertainty Δx about the photon position along X, while Δk_x
is related (through the relation p = ħk) to the linear momentum uncertainty Δp_x
along the same axis, then Ineq. (19.1) is equivalent to Heisenberg’s uncertainty
relation ΔxΔp_x > h.
Figure 19.2 Plots of intensity (left) and phase (right) at the entrance aperture
of the system of Figure 19.1. Two uniform beams, one propagating with a slight
tilt toward the upper right, another with a slight tilt toward the lower left, enter a
$D = 500\lambda$ aperture. The angular separation of the beams is $\Delta\theta = 0.23^\circ$. The
individual beams are shown in the top (a, b) and the middle (c, d) rows; their
superposition appears at the bottom (e, f).
Figure 19.2 shows the intensity and phase profiles of two plane waves as well
as those of their superposition at the aperture depicted in Figure 19.1 (diameter
$D = 500\lambda$). The phase distributions in Figures 19.2(b) and 19.2(d) indicate that
one of the beams is slightly tilted towards the upper-right corner of the XY-plane,
while the other is tilted by an equal amount towards the lower-left corner.
The angular separation between these beams is $\Delta\theta = 0.23^\circ = 0.004$ radians.
The combined beam's intensity distribution in Figure 19.2(e) reveals the angular
separation of the two superimposed beams through a tell-tale fringe pattern.
When the composite beam (whose intensity and phase distributions are shown
in Figures 19.2(e, f)) is propagated along the Z-axis, one obtains at various
distances from the aperture the intensity patterns displayed in Figure 19.3. It is
seen in these pictures that the two constituent beams continue to overlap at first,
giving rise to interesting interference patterns. After a sufficient propagation
distance, however, the beams separate and go their own ways. The assumed value
of $D\,\Delta k_x$ in this example is $4\pi$, which satisfies Ineq. (19.1).
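The numbers quoted in this example can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original text: the function name `separability` and the choice of working in units of the wavelength are assumptions made here for convenience. It evaluates $D\,\Delta k_x$ with $\Delta k_x = (2\pi/\lambda)\,\Delta\theta$, together with the distance $z = D/\Delta\theta$ at which the beam centers have drifted apart by one aperture diameter.

```python
import math

def separability(d_over_lambda, dtheta_rad):
    """Return (D*dk_x, z_sep/lambda) for two beams sharing one aperture.

    d_over_lambda : aperture diameter in units of the wavelength
    dtheta_rad    : angular separation of the beams (radians, small angle)
    """
    # dk_x = (2*pi/lambda)*dtheta, hence D*dk_x = 2*pi*(D/lambda)*dtheta
    product = 2.0 * math.pi * d_over_lambda * dtheta_rad
    # Distance (in wavelengths) at which the centers are one diameter apart
    z_sep = d_over_lambda / dtheta_rad
    return product, z_sep

product, z_sep = separability(500.0, 0.004)
print(product / math.pi)   # D*dk_x in units of pi
print(z_sep)               # separation distance in units of lambda
```

For $D = 500\lambda$ and $\Delta\theta = 0.004$ rad the product comes out as $4\pi$, and the centers are one diameter apart at $z = 125 \times 10^3\lambda$, a distance that lies within the range covered by the frames of Figure 19.3.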
Figure 19.3 Two overlapping plane waves depicted in Figure 19.2 propagate
along the Z-axis. The various intensity patterns in frames (a) to (o) are obtained
at $z/(10^3\lambda) = 1, 2, 3, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 125$, and $150$,
respectively. Initially the beams strongly interfere with each other, but as
propagation proceeds, they separate and exhibit their individual identities.
[Figure 19.4 (schematic): a beam of diameter $D$ propagating along Z is focused by a
lens onto the focal plane.]
at $45^\circ$ to the X- and Y-axes, i.e., the direction along which the spots are separated
from each other. The plot in (c) corresponds to the case when both beams are
circularly polarized. Frames (d)–(f) are the logarithmic versions of those in (a)–(c),
showing their detailed structure by emphasizing the weaker regions. Since the
assumed values of $D = 500\lambda$ and $\Delta\theta = 0.004$ rad satisfy the uncertainty relation
in Ineq. (19.1), the focused spots are seen to be resolved irrespective of their
polarization state.
of $D(\Delta\theta/\lambda) = 40$ in this case amply satisfies Ineq. (19.1). Figure 19.8(a) shows
the intensity distribution of the superposed beams upon arriving at the etalon.
One of these beams propagates along the direction that makes a $45^\circ$ angle with
the etalon's surface normal, while the other deviates from this direction by
$\Delta\theta = 0.115^\circ$. The reflected intensity profile depicted in Figure 19.8(b) contains
mostly the latter beam, plus a small fraction of the former. This is due to the
imperfect transfer function of the etalon, which cannot fully transmit the angular
spectrum of the $45^\circ$ beam, nor can it fully reflect the spectrum of the $45.115^\circ$
beam. Either beam's angular spectrum has a width of $\lambda/D \approx 0.003^\circ$, which would
readily pass through a narrow rectangular transfer function, but is partially
blocked by the sharply peaked transfer functions of the etalon (see Figure 19.7(a)).
The same arguments apply to the transmitted intensity distribution shown in
Figure 19.8(c) which, although primarily composed of the $45^\circ$ incident beam, still
contains a fraction of the $45.115^\circ$ beam.
Figure 19.6 Fabry–Pérot etalon designed for operation at $\lambda = 633$ nm, $\theta = 45^\circ$.
Dielectric mirrors each contain six pairs of high/low-index layers ($n_1 = 2.0$,
$d_1 = 84.6$ nm; $n_2 = 1.5$, $d_2 = 119.6$ nm). Both mirror substrates are glass
($n_{\mathrm{sub}} = 1.5$), and the medium separating the mirrors is air ($d_{\mathrm{air}} = 55.95\ \mu$m).
The incidence angle on the etalon is in the vicinity of $\theta = 45^\circ$; within the
substrate, however, the angle of incidence on the stack is close to $\theta' = 28.1255^\circ$
($\sin\theta = n_{\mathrm{sub}} \sin\theta'$). The etalon can separate two beams of identical $\lambda$ arriving
through an aperture of diameter $D$, but differing in propagation direction,
namely, $\theta_1 = 45^\circ$, $\theta_2 = 45^\circ + \Delta\theta$. One beam is reflected by the etalon while
the other is transmitted. Only s-polarized light is considered here, although
p-polarized beams exhibit similar behavior.
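The design values quoted for the etalon and the beams can be reproduced numerically. The sketch below is offered as a plausibility check only. It assumes (the text does not say so explicitly) that the mirror layers are a quarter-wave thick at the design angle, $d = \lambda/(4\sqrt{n^2 - \sin^2\theta})$, and that the air gap satisfies the simple resonance condition $2 d_{\mathrm{air}} \cos\theta = m\lambda$, ignoring any phase shift at the mirrors. All function names are hypothetical.

```python
import math

WAVELENGTH_NM = 633.0

def quarter_wave_thickness(n, theta_deg, wavelength=WAVELENGTH_NM):
    """Layer thickness giving a quarter-wave phase delay at oblique incidence."""
    s = math.sin(math.radians(theta_deg))          # sin(theta) in air
    return wavelength / (4.0 * math.sqrt(n * n - s * s))

def refraction_angle_deg(n, theta_deg):
    """Snell's law: propagation angle inside a medium of index n."""
    return math.degrees(math.asin(math.sin(math.radians(theta_deg)) / n))

def angular_width_deg(d_over_lambda):
    """Angular-spectrum width lambda/D of a beam of diameter D, in degrees."""
    return math.degrees(1.0 / d_over_lambda)

d1 = quarter_wave_thickness(2.0, 45.0)             # high-index layer (nm)
d2 = quarter_wave_thickness(1.5, 45.0)             # low-index layer (nm)
theta_sub = refraction_angle_deg(1.5, 45.0)        # angle inside the substrate
# Interference order of the 55.95 um air gap: 2*d_air*cos(theta)/lambda
order = 2.0 * 55950.0 * math.cos(math.radians(45.0)) / WAVELENGTH_NM
width = angular_width_deg(2.0e4)                   # D = 2e4 wavelengths
product = 2.0e4 * math.radians(0.115)              # D*(dtheta/lambda)

print(d1, d2, theta_sub, order, width, product)
```

Under these assumptions the layer thicknesses, the internal angle, and the product $D(\Delta\theta/\lambda) \approx 40$ agree with the quoted values, and the air gap turns out to be very nearly 125 half-waves thick at $45^\circ$ incidence.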
To summarize the results of this and the preceding sections, there are several ways
of separating two overlapping beams of the same wavelength and differing propagation
directions. Some of these methods may be more effective than others, but none
could violate the uncertainty relation given by Ineq. (19.1). Moreover, Ineq. (19.1)
remains valid even if the beams are observed within a transparent medium of
refractive index $n > 1$. For instance, in Figure 19.1 if the region to the right of the
aperture happens to be filled with such a medium, the angular separation $\Delta\theta$ of the
beams shrinks by a factor $n$ upon entering the medium, but the length of the k-vector
increases by the same factor, thus preserving the magnitude of $\Delta k_x$. Similarly, in
Figure 19.4 if the index of the medium on the right-hand side of the lens happens to be
$n$, the focused spot diameters will be $n$ times smaller, but their center-to-center
spacing will also be reduced by the same factor, resulting once again in the
preservation of Ineq. (19.1).
[Figure 19.7: (a) amplitudes $|r_s|$, $|t_s|$ and (b) phases $\phi_{rs}$, $\phi_{ts}$ (in degrees) of the
etalon's reflection and transmission coefficients versus angle of incidence, from $44.8^\circ$ to $45.2^\circ$.]
Figure 19.8 Two overlapping beams of uniform amplitude and circular cross-section
($\lambda = 633$ nm, $D = 2 \times 10^4\lambda$) arrive at the etalon of Figure 19.6. One beam
travels at $\theta = 45^\circ$ relative to the etalon's surface normal, the other at $\theta = 45.115^\circ$.
(a) Intensity distribution of the superposed beams at the entrance aperture.
(b) Reflected intensity distribution, consisting mainly of the second beam plus a
small fraction of the first. (c) Transmitted intensity distribution, consisting
mostly of the first beam plus a small fraction of the second.
[Figure 19.9 (schematic): Mach–Zehnder interferometer with a 50/50 splitter at the output;
detectors 1 and 2 produce the signals $S_1$ and $S_2$.]
Figure 19.10 Computed detector signals $S_1$ and $S_2$ versus the input wavelength
$\lambda$ in the Mach–Zehnder interferometer of Figure 19.9. The assumed path-length
difference between the two arms of the device is $\Delta z = 1.266$ mm. In the vicinity
of $\lambda = 633$ nm the adjacent peaks of $S_1$ and $S_2$ are separated by $\Delta\lambda = 0.158$ nm, in
agreement with Eq. (19.2).
different exit channel of the device. Therefore, the separability condition for this
interferometer is $\Delta z/\lambda_1 - \Delta z/\lambda_2 = \tfrac12$ or, with $\Delta k_z = 2\pi/\lambda_1 - 2\pi/\lambda_2$,
\[ \Delta z\,\Delta k_z = \pi. \tag{19.2} \]
Figure 19.10 shows computed detector signals $S_1$, $S_2$ of the system of Figure 19.9
versus the input wavelength in the vicinity of $\lambda = 633$ nm. For the particular
path-length difference chosen in this example ($\Delta z = 1.266$ mm), it is observed that, in
compliance with Eq. (19.2), a pair of beams having $\Delta\lambda = 0.158$ nm can be readily
separated from each other.
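The quoted wavelength separation follows directly from the separability condition. As a brief illustration (the function name is hypothetical, and the approximation $\Delta\lambda \ll \lambda$ is assumed), the condition $\Delta z/\lambda_1 - \Delta z/\lambda_2 = \tfrac12$ reduces to $\Delta\lambda \approx \lambda^2/(2\Delta z)$:

```python
def wavelength_separation_nm(wavelength_nm, path_difference_mm):
    """Wavelength difference that sends two lines to opposite exit ports.

    Uses dz*(1/l1 - 1/l2) = 1/2 with d_lambda << lambda,
    i.e. d_lambda ~ lambda**2 / (2*dz).
    """
    dz_nm = path_difference_mm * 1.0e6           # mm -> nm
    return wavelength_nm ** 2 / (2.0 * dz_nm)

d_lambda = wavelength_separation_nm(633.0, 1.266)
print(round(d_lambda, 3))                        # in nm
```

With $\lambda = 633$ nm and $\Delta z = 1.266$ mm this evaluates to about 0.158 nm, the separation between adjacent peaks of $S_1$ and $S_2$ cited for Figure 19.10.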
An alternative form of the uncertainty relation may be obtained in this case
by invoking the quantum-mechanical relation between the magnitude $k$ of the
wave-vector and the photon energy $E = h\nu$, namely, $k = 2\pi/\lambda = 2\pi\nu/c = E/\hbar c$. For
two beams of wavelengths $\lambda$ and $\lambda + \Delta\lambda$, co-propagating in the Z direction,
$\Delta k_z = \Delta E/\hbar c$. Also $\Delta z = c\,\Delta t$, where $c$ is the speed of light and $\Delta t$ is the time
needed for light to travel a distance $\Delta z$ in free space. The product $\Delta z\,\Delta k_z$ is thus
proportional to $\Delta E\,\Delta t$, with $\hbar$ being the proportionality constant. One may thus
reinterpret Eq. (19.2) as a statement of the time-versus-energy uncertainty. When
the observations are made in a transparent medium of refractive index $n > 1$, the
increase of the k-vector by a factor of $n$ dictates a corresponding decrease in $\Delta z$.
This is consistent with the reduced speed of light in the medium of index $n$, which
yields the same travel time $\Delta t$ for the shorter propagation distance $\Delta z/n$. Needless
to say, $\Delta E = h\,\Delta\nu$ is independent of $n$.
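To make the time-versus-energy reading concrete, the short sketch below evaluates $\Delta E\,\Delta t$ for the Mach–Zehnder example. It is an illustration under stated assumptions: $|\Delta E| = hc\,\Delta\lambda/\lambda^2$ (valid for $\Delta\lambda \ll \lambda$), $\Delta t = \Delta z/c$, and a wavelength separation $\Delta\lambda = \lambda^2/(2\Delta z)$ taken from the separability condition. The function name is hypothetical.

```python
H = 6.62607015e-34      # Planck constant (J s)
C = 299792458.0         # speed of light in vacuum (m/s)

def energy_time_product(wavelength_m, d_lambda_m, dz_m):
    """Return dE*dt in units of h for two lines separated by d_lambda."""
    d_energy = H * C * d_lambda_m / wavelength_m ** 2   # |dE| = h*c*dl/l**2
    d_time = dz_m / C                                   # free-space travel time
    return d_energy * d_time / H

wl = 633e-9
dz = 1.266e-3
ratio = energy_time_product(wl, wl ** 2 / (2.0 * dz), dz)
print(ratio)            # dE*dt expressed as a multiple of h
```

Under these assumptions the product is $\Delta E\,\Delta t = h/2$, which is simply $\hbar$ times the value $\pi$ of $\Delta z\,\Delta k_z$.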
which yields
[Figure: amplitude reflection and transmission coefficients versus wavelength $\lambda$ (nm) over the
range 630–640 nm; one frame shows $|t_p|$, frame (b) shows $|r_s|$ and $|t_s|$.]
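Now, the emergent beam diameter is $D' = D\,|\cos\theta'/\cos\theta|$. Since the lens is
expected to resolve the two wavelengths, Ineq. (19.1) requires that $|\Delta\theta'| \ge \lambda/D'$,
which leads to $|\cos\theta'\,\Delta\theta'| \ge \lambda\cos\theta/D$, which in turn leads to
$|N/P|\,\Delta\lambda \ge \lambda\cos\theta/D$. In other words,
\[ D/\cos\theta \ge (\lambda/\Delta\lambda)\,|P/N|. \tag{19.4a} \]
From Eq. (19.3a) it is clear that $|N\lambda/P| \le 2$, that is, $|P/N| \ge \tfrac12\lambda$. Inequality (19.4a)
may thus be written as follows:
\[ D/\cos\theta \ge \tfrac12\,\lambda^2/\Delta\lambda. \tag{19.4b} \]
[Figure 19.12 (schematic): a beam of diameter $D$ propagating along Z is incident on a grating;
the diffracted beam is focused by a lens of focal length $f$.]
Inequality (19.4b) places a lower bound for resolvability not on the beam
diameter $D$, but on the illuminated length of the grating, $D/\cos\theta$, in the direction
perpendicular to the grooves.
Next we examine the propagation distance from the center of the entrance
aperture to the focal plane of the lens. With reference to Figure 19.12, the
shortest possible distance from the entrance aperture to the grating center is
$\Delta z_1 = \tfrac12 D \tan\theta$. Similarly, the shortest possible distance from the grating to
the lens center (ignoring the possibility that the lens might block the incident
beam) is $\Delta z_2 = \tfrac12 D'\,|\tan\theta'| = \tfrac12 D\,|\sin\theta'|/\cos\theta$. The smallest feasible focal length
for the lens is $f = \tfrac12 D'$, corresponding to NA = 1. Therefore, the shortest distance $\Delta z$
from the center of the entrance aperture to the focal plane of the lens is given by
\[ \Delta z = \Delta z_1 + \Delta z_2 + f = \tfrac12\,(D/\cos\theta)\,(\sin\theta + |\sin\theta'| + |\cos\theta'|). \tag{19.5a} \]
Since $\sin\theta \ge 0$, and $|\sin\theta'| + |\cos\theta'| \ge 1$ for any $\theta'$, Eq. (19.5a) yields
\[ \Delta z \ge \tfrac12\,D/\cos\theta. \tag{19.5b} \]
Combining Ineqs. (19.4b) and (19.5b), and noting that $\Delta k_z = 2\pi\,\Delta\lambda/\lambda^2$, one obtains
\[ \Delta z\,\Delta k_z \ge \tfrac12\pi. \tag{19.6} \]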
Note that the initial beam diameter $D$ in this example is not restricted at all,
whereas the propagation distance $\Delta z$ is required to be greater than a certain
minimum, $\tfrac14\lambda^2/\Delta\lambda$, to ensure resolvability of the wavelengths $\lambda$ and $\lambda + \Delta\lambda$.
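As a quick numerical illustration of this lower bound (the wavelength pair is borrowed from the Mach–Zehnder example and is not a case worked out in the text for the grating; the function name is hypothetical):

```python
def min_propagation_distance_mm(wavelength_nm, d_lambda_nm):
    """Lower bound lambda**2 / (4*d_lambda) on the propagation distance, in mm."""
    return 0.25 * wavelength_nm ** 2 / d_lambda_nm * 1.0e-6   # nm -> mm

dz_min = min_propagation_distance_mm(633.0, 0.158)
print(round(dz_min, 3))    # in mm
```

For $\lambda = 633$ nm and $\Delta\lambda = 0.158$ nm the bound is roughly 0.63 mm, one half of the path-length difference needed by the Mach–Zehnder interferometer to separate the same pair of wavelengths.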
20 Omni-directional dielectric mirrors 275
General properties
Consider a periodic multilayer stack such as that depicted in Figure 20.1. The
stack consists of an infinite number of identical blocks, each block having
reflection coefficients $r = |r| \exp(i\phi_r)$ from the top side and $r' = |r'| \exp(i\phi_{r'})$
from the bottom side, as well as transmission coefficient $t = |t| \exp(i\phi_t)$ from
either side. In general, $r$, $r'$ and $t$ are functions of the incidence angle $\theta$ and
the polarization state (p or s) of the incident light. From the reciprocal
properties of electromagnetic waves in non-absorbing media, it is known
that $t$ should be the same whether the incidence is from the top or from
the bottom side, that $|r| = |r'|$, and that $\tfrac12(\phi_r + \phi_{r'}) = \phi_t \pm 90^\circ$ (see Chapter 17,
Reciprocity in classical linear optics). Also, from conservation of energy,
$|r|^2 + |t|^2 = 1$.
As shown in Figure 20.1, one can express the reflection coefficient $r_0$ from the
top of the stack in terms of the parameters $r$, $r'$, $t$ of the individual blocks by
assuming a diminishing air-gap between the top unit and the rest of the stack.
Denoting the round-trip phase delay within this (artificial) air-gap by $\delta$, and
recognizing that the reflectivity $r_0$ of the infinite stack is the same with and
without its uppermost block, we write
\[
\begin{aligned}
r_0 &= \lim_{\delta \to 0}\,\bigl[\,r + r_0 t^2 \exp(i\delta) + r_0^2 r' t^2 \exp(2i\delta) + r_0^3 r'^2 t^2 \exp(3i\delta) + \cdots\bigr] \\
    &= [\,r - r_0 (r r' - t^2)\,]/(1 - r_0 r') \\
    &= \{\,r - r_0 \exp[i(\phi_r + \phi_{r'})]\,\}/(1 - r_0 r').
\end{aligned}
\tag{20.1}
\]
[Figure 20.1 (schematic): the uppermost block separated from the rest of the stack by a vanishing
air-gap; the successive partial reflections are $r$, $r_0 t^2 e^{i\delta}$, $r_0^2 r' t^2 e^{i2\delta}$, ...]
The above formula is a quadratic equation in $r_0$. A perfect reflector requires that
$|r_0| = 1$; Eq. (20.1) then yields the following expression for the phase $\phi_0$ of $r_0$ in
terms of $|r|$, $\phi_r$, and $\phi_{r'}$:
\[ \cos\bigl[\phi_0 - \tfrac12(\phi_r - \phi_{r'})\bigr] = \cos\bigl[\tfrac12(\phi_r + \phi_{r'})\bigr]\big/|r|. \tag{20.2} \]
Since in practice the actual value of $\phi_0$ is irrelevant, the above equation predicts
that the reflectivity $R_0 = |r_0|^2$ of the stack will be unity provided that the right-hand
side of Eq. (20.2) is confined to the interval $[-1, +1]$; in other words, the necessary
and sufficient condition for the infinite dielectric stack of Figure 20.1 to have 100%
reflectivity may be written as follows:
\[ |r| > \bigl|\cos\bigl[\tfrac12(\phi_r + \phi_{r'})\bigr]\bigr|. \tag{20.3a} \]
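The relation between Eqs. (20.1) and (20.2) can be verified numerically. The sketch below picks an arbitrary, purely hypothetical block ($|r| = 0.6$, $\phi_r = 2.0$ rad, $\phi_{r'} = 1.0$ rad), builds $t$ from energy conservation and the reciprocity relation $\tfrac12(\phi_r + \phi_{r'}) = \phi_t + 90^\circ$, solves Eq. (20.2) for $\phi_0$, and confirms that $r_0 = \exp(i\phi_0)$ satisfies the quadratic equation (20.1). The function names are assumptions of this illustration.

```python
import cmath
import math

def stack_phase(abs_r, phi_r, phi_rp):
    """Solve Eq. (20.2) for phi_0; return None when no real solution exists."""
    rhs = math.cos(0.5 * (phi_r + phi_rp)) / abs_r
    if abs(rhs) > 1.0:
        return None                      # Ineq. (20.3a) is violated
    return 0.5 * (phi_r - phi_rp) + math.acos(rhs)

def residual(r0, r, rp, t):
    """Eq. (20.1) rearranged as r0*(1 - r0*r') - [r - r0*(r*r' - t**2)]."""
    return r0 * (1.0 - r0 * rp) - (r - r0 * (r * rp - t * t))

# Hypothetical block parameters, chosen only for illustration
abs_r, phi_r, phi_rp = 0.6, 2.0, 1.0
r = abs_r * cmath.exp(1j * phi_r)
rp = abs_r * cmath.exp(1j * phi_rp)
# |t| from |r|^2 + |t|^2 = 1; phase from (phi_r + phi_r')/2 = phi_t + 90 degrees
phi_t = 0.5 * (phi_r + phi_rp) - 0.5 * math.pi
t = math.sqrt(1.0 - abs_r ** 2) * cmath.exp(1j * phi_t)

phi_0 = stack_phase(abs_r, phi_r, phi_rp)
r0 = cmath.exp(1j * phi_0)
print(abs(residual(r0, r, rp, t)))       # should vanish to rounding error
```

When $|r|$ is too small for the chosen phases (for instance $|r| = 0.05$ with the same $\phi_r$, $\phi_{r'}$), `stack_phase` returns `None`, reflecting the fact that no unit-magnitude $r_0$ exists in that case.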
Using the identity $|r|^2 + |t|^2 = 1$ and the relation among $\phi_r$, $\phi_{r'}$, $\phi_t$ mentioned
earlier, Ineq. (20.3a) may be written in either of the following alternative forms:
The three inequalities (20.3a), (20.3b), and (20.3c) are equivalent and may be
used interchangeably. As an example, consider a unit block consisting of a pair of
high-index, low-index layers, each a quarter-wave thick at the free-space wavelength
of $\lambda_0 = 633$ nm at normal incidence (i.e., $\theta = 0^\circ$). Let $n_1 = 2$, $t_1 = 79.0$ nm,
$n_2 = 1.5$, $t_2 = 105.5$ nm. Figure 20.2 shows plots of $|r|$ and $\cos[\tfrac12(\phi_r + \phi_{r'})]$
in frame (a), $|t|$ and $\cos\phi_t$ in frame (b), as functions of $\theta$ for both p- and s-polarized
incident plane-waves. Note that Ineqs. (20.3) are satisfied for p-light when
$0^\circ < \theta < 40^\circ$, and for s-light when $0^\circ < \theta < 52^\circ$.
The computed p- and s-reflectivities for a quarter-wave stack consisting of twenty
repetitions of the above bilayer are shown in Figure 20.3. As expected, $R_{p0} = |r_{p0}|^2 \approx 1$ in
the incidence range $0^\circ < \theta < 40^\circ$, and similarly $R_{s0} = |r_{s0}|^2 \approx 1$ in the range $0^\circ < \theta < 52^\circ$.
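A reflectivity curve of this kind can be computed with the standard characteristic-matrix (transfer-matrix) method for thin films. The sketch below is one possible implementation, not necessarily the procedure used to produce Figure 20.3. It assumes air on both sides of the stack (the text does not specify a substrate) and non-absorbing layers, and it should not be evaluated exactly at grazing incidence. The function names are hypothetical.

```python
import numpy as np

def layer_matrix(n, d, theta0_deg, wavelength, pol):
    """Characteristic matrix of one layer for light incident from air at theta0."""
    s = np.sin(np.radians(theta0_deg))
    cos_j = np.sqrt(1.0 - (s / n) ** 2 + 0j)            # cosine of the angle inside the layer
    delta = 2.0 * np.pi * n * d * cos_j / wavelength    # phase thickness
    eta = n * cos_j if pol == "s" else n / cos_j        # tilted optical admittance
    return np.array([[np.cos(delta), -1j * np.sin(delta) / eta],
                     [-1j * eta * np.sin(delta), np.cos(delta)]])

def stack_reflectivity(layers, periods, theta0_deg, wavelength, pol):
    """Reflectivity of `periods` repetitions of `layers`, with air on both sides."""
    cos0 = np.cos(np.radians(theta0_deg))
    eta0 = cos0 if pol == "s" else 1.0 / cos0           # admittance of air
    unit = np.eye(2, dtype=complex)
    for n, d in layers:                                 # first entry faces the incident beam
        unit = unit @ layer_matrix(n, d, theta0_deg, wavelength, pol)
    total = np.linalg.matrix_power(unit, periods)
    b, c = total @ np.array([1.0, eta0])                # exit medium assumed to be air
    r = (eta0 * b - c) / (eta0 * b + c)
    return float(abs(r) ** 2)

bilayer = [(2.0, 79.0), (1.5, 105.5)]                   # (index, thickness in nm)
for pol in ("p", "s"):
    print(pol, stack_reflectivity(bilayer, 20, 0.0, 633.0, pol))
```

At normal incidence the two polarizations coincide and the twenty-period stack reflects more than 99.9% of the light; scanning `theta0_deg` from $0^\circ$ up to (but not including) $90^\circ$ gives curves that can be compared with Figure 20.3.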
Figure 20.2 Plots of the various functions appearing in Ineqs. (20.3) for a
bilayer consisting of a pair of dielectric layers, each having a quarter-wave
thickness at the vacuum wavelength of $\lambda_0 = 633$ nm at normal incidence.
($n_1 = 2.0$, $t_1 = 79.0$ nm, $n_2 = 1.5$, $t_2 = 105.5$ nm.)
Figure 20.3 Computed reflectivity $R$ versus $\theta$ for p- and s-polarized light for
a quarter-wave stack consisting of twenty repetitions of the bilayer depicted in
Figure 20.2. $R_{p0}$ and $R_{s0}$ are 100% in those regions where Ineqs. (20.3) are
satisfied.
\[ \rho_s = \frac{\cos\theta - \sqrt{n^2 - \sin^2\theta}}{\cos\theta + \sqrt{n^2 - \sin^2\theta}}. \tag{20.4b} \]
[Figure (schematic): a dielectric slab of refractive index $n$ and thickness $d$; the partial rays
are labeled with the phase factors $e^{i\Delta}$ and $e^{i2\Delta}$.]
where $\theta'$ is the angle of the refracted ray.$^5$ The slab's reflection and transmission
coefficients, $r$ and $t$, may be obtained by summing the infinite number of rays
multiply reflected from its front and rear facets, namely,
\[ |t| = (1 - \rho^2)\big/\sqrt{\rho^4 - 2\rho^2 \cos(2\Delta) + 1}, \tag{20.8a} \]
Equations (20.7) and (20.8) readily confirm that $|r|^2 + |t|^2 = 1$, that $\phi_r = \phi_t \pm 90^\circ$,
and that a single-layer slab does not satisfy Ineq. (20.3).
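These three properties of the single slab are easy to confirm numerically. The sketch below uses the textbook Airy sums $r = \rho\,(1 - e^{2i\Delta})/(1 - \rho^2 e^{2i\Delta})$ and $t = (1 - \rho^2)\,e^{i\Delta}/(1 - \rho^2 e^{2i\Delta})$, which are assumed here to coincide with the expressions of Eqs. (20.7) and (20.8), together with the assumed phase thickness $\Delta = (2\pi d/\lambda_0)\sqrt{n^2 - \sin^2\theta}$. For a symmetric slab in air $r' = r$, so Ineq. (20.3a) reads $|r| > |\cos\phi_r|$. The sample values ($n = 1.5$, $d = 0.3\lambda_0$, $\theta = 30^\circ$) and function names are arbitrary choices.

```python
import cmath
import math

def fresnel_rho_s(n, theta_deg):
    """Fresnel reflection coefficient for s-light at an air/dielectric interface."""
    c = math.cos(math.radians(theta_deg))
    root = math.sqrt(n * n - math.sin(math.radians(theta_deg)) ** 2)
    return (c - root) / (c + root)

def slab_coefficients(rho, n, d_over_lambda, theta_deg):
    """Airy sums for a slab in air; returns (r, t, Delta)."""
    s = math.sin(math.radians(theta_deg))
    delta = 2.0 * math.pi * d_over_lambda * math.sqrt(n * n - s * s)
    denom = 1.0 - rho ** 2 * cmath.exp(2j * delta)
    r = rho * (1.0 - cmath.exp(2j * delta)) / denom
    t = (1.0 - rho ** 2) * cmath.exp(1j * delta) / denom
    return r, t, delta

rho = fresnel_rho_s(1.5, 30.0)
r, t, delta = slab_coefficients(rho, 1.5, 0.3, 30.0)

print(abs(r) ** 2 + abs(t) ** 2)                  # energy conservation
print((r / t).real)                               # vanishes when phi_r = phi_t +/- 90 deg
print(abs(r), abs(math.cos(cmath.phase(r))))      # |r| versus |cos(phi_r)|
```

In this example $|r|$ is smaller than $|\cos\phi_r|$, so the inequality fails, as the text states for any single-layer slab.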
Double layer
Next, consider the bilayer slab depicted in Figure 20.5. The top layer has index
$n_1$, thickness $d_1$, and reflection and transmission coefficients $r_1$, $t_1$ at the incidence
angle $\theta$. The corresponding parameters of the second layer are $n_2$, $d_2$, $r_2$, $t_2$. To
determine the bilayer's overall transmission coefficient $t$, we assume a small air
gap between the two layers and proceed to sum the partial transmission coefficients.
We find, in the limit of a vanishing gap,
[Figure 20.5 (schematic): bilayer slab; the top layer has index $n_1$ and thickness $d_1$, the bottom
layer has index $n_2$ and thickness $d_2$.]
Discussion
We begin by examining the behavior of $G_{p,s}(n_1, n_2, \theta)$. According to Eq. (20.11c) this
function depends only on $\rho_1$ and $\rho_2$, which, in turn, are dependent on $n_1$, $n_2$, the angle
of incidence $\theta$, and the polarization state of the beam, but not on layer thicknesses $d_1$
and $d_2$. For fixed values of $n_1$, $n_2$ the function depends only on $\theta$ and on the
polarization state. Substituting from Eqs. (20.4a) and (20.4b) into Eq. (20.11c) yields
\[ G_s(n_1, n_2, \theta) = \tfrac14 \sqrt{(n_1^2 - \sin^2\theta)/(n_2^2 - \sin^2\theta)}
   + \tfrac14 \sqrt{(n_2^2 - \sin^2\theta)/(n_1^2 - \sin^2\theta)} - \tfrac12, \tag{20.12a} \]
\[ G_p(n_1, n_2, \theta) = \tfrac14 (n_2/n_1)^2 \sqrt{(n_1^2 - \sin^2\theta)/(n_2^2 - \sin^2\theta)}
   + \tfrac14 (n_1/n_2)^2 \sqrt{(n_2^2 - \sin^2\theta)/(n_1^2 - \sin^2\theta)} - \tfrac12. \tag{20.12b} \]
$G_{p,s}(n_1, n_2, \theta)$ is plotted versus $\theta$ in Figure 20.6 for both p- and s-polarized
plane-waves for the specific values (a) $n_1 = 1.5$, $n_2 = 2.0$, and (b) $n_1 = 1.3$, $n_2 = 1.2$.
Although the selected values of $n_1$, $n_2$ are specific, the shapes of the functions are
quite general. The two functions for p- and s-light are never negative; they both
start, at $\theta = 0^\circ$ (normal incidence), at the same level; from there $G_s$ goes up
and $G_p$ goes down with increasing $\theta$. The behavior shown in Figure 20.6(a),
where $G_s$ increases while $G_p$ decreases monotonically, is typical of situations of
interest in this chapter, where the Brewster angle $\theta_B$ is inaccessible from outside
the multilayer. The behavior depicted in Figure 20.6(b), where $G_s$ increases
Figure 20.6 Plots of the functions $G_{p,s}(n_1, n_2, \theta)$ versus $\theta$ for specific values of
$n_1$, $n_2$. In (a) $n_1$ and $n_2$ are large enough to satisfy Ineq. (20.13), thus ensuring
that the Brewster angle is inaccessible from outside the stack. In (b) the Brewster
angle is reached at $\theta = 61.9^\circ$.
One can readily show that the slope of $G_s$ versus $\theta$ is always positive. $G_p$,
on the other hand, has a negative slope at $\theta = 0^\circ$, which continues to be negative
up to where $\sin^2\theta = n_1^2 n_2^2/(n_1^2 + n_2^2)$. At this point $G_p$ achieves its minimum value
of zero, then rises until grazing incidence at $\theta = 90^\circ$. The angle $\theta$ at which $G_p$ is a
minimum corresponds to the Brewster angle $\theta_B$ between two media of indices $n_1$
and $n_2$. When $\theta_B$ is accessible from the incidence medium (air in this case), it will
be impossible to achieve 100% reflectivity at this particular angle. Therefore, we
impose the following constraint on the indices of the bilayer:
In this way, $G_{p,s}(n_1, n_2, \theta)$ will always exhibit the typical behavior depicted in Figure
20.6(a), namely, both functions start at the same level, $\tfrac14(n_1/n_2) + \tfrac14(n_2/n_1) - \tfrac12$,
when $\theta = 0^\circ$; from there $G_s$ increases and $G_p$ decreases, both monotonically, with an
increasing $\theta$.
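The behavior of $G_s$ and $G_p$ described above can be explored with a short script. The sketch below encodes the two functions in terms of the ratio $\sqrt{(n_1^2 - \sin^2\theta)/(n_2^2 - \sin^2\theta)}$ and locates the external angle at which $G_p$ vanishes. Treating the Brewster angle as inaccessible whenever $n_1^2 n_2^2/(n_1^2 + n_2^2) > 1$ is an inference made here from the surrounding discussion, not a quotation of Ineq. (20.13); the function names are hypothetical.

```python
import math

def g_functions(n1, n2, theta_deg):
    """Return (G_p, G_s) for a bilayer of indices n1, n2 at external angle theta."""
    s2 = math.sin(math.radians(theta_deg)) ** 2
    ratio = math.sqrt((n1 * n1 - s2) / (n2 * n2 - s2))
    g_s = 0.25 * ratio + 0.25 / ratio - 0.5
    x = (n2 / n1) ** 2 * ratio                # p-light carries the extra index factor
    g_p = 0.25 * x + 0.25 / x - 0.5
    return g_p, g_s

def brewster_angle_deg(n1, n2):
    """External angle where G_p = 0, or None if it cannot be reached from air."""
    s2 = (n1 * n2) ** 2 / (n1 ** 2 + n2 ** 2)
    if s2 > 1.0:
        return None
    return math.degrees(math.asin(math.sqrt(s2)))

print(g_functions(1.5, 2.0, 0.0))             # common starting level at normal incidence
print(brewster_angle_deg(1.5, 2.0))           # None: inaccessible from air
print(brewster_angle_deg(1.3, 1.2))           # accessible case of Figure 20.6(b)
```

For $n_1 = 1.5$, $n_2 = 2.0$ both functions start at $\tfrac14(n_1/n_2) + \tfrac14(n_2/n_1) - \tfrac12 \approx 0.021$, while for $n_1 = 1.3$, $n_2 = 1.2$ the zero of $G_p$ falls close to the $61.9^\circ$ quoted in the caption of Figure 20.6.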
Inequality (20.11a) can be satisfied over the entire range of $\theta$ for both p- and
s-light if $\Delta_1$ and $\Delta_2$ are maintained around $\pi/2$ throughout the range $\theta = [0^\circ, 90^\circ]$.
Likewise, Ineq. (20.11b) can be satisfied if $\Delta_1$ is kept around $\pi/2$ while $\Delta_2$ is kept
around $3\pi/2$ (or vice versa). When $n_1$ and $n_2$ are far apart, $G_s$ and $G_p$ will be fairly
large, and choosing $d_1$ and $d_2$ to satisfy the requisite inequalities for all $\theta$ will not
be difficult. When $n_1$ and $n_2$ are close together, however, it is easier to maintain
$\Delta_1$ and $\Delta_2$ both around $\pi/2$ (if at all possible), rather than to keep one of them
around $3\pi/2$. This is simply because the variations with $\theta$ will be greater for that
$\Delta$ which stays near $3\pi/2$. We limit the following discussion to stacks that satisfy
Ineq. (20.11a), but emphasize that a similar class of reflectors based on Ineq.
(20.11b) is feasible as well.
Figure 20.7 (a) Plots of $\sin\Delta_1 \sin\Delta_2$ versus $\theta$ for a bilayer slab consisting
of materials with $n_1 = 1.5$ and $n_2 = 2.0$. The first layer's thickness is fixed at
$d_1 = 0.191\lambda_0$, but the second layer's assumes one of three different values
($d_2/\lambda_0 = 0.134$, $0.147$, $0.161$, as labeled on the curves).
(b) Same as (a) for the function $\cos^2[\tfrac12(\Delta_1 + \Delta_2)]$.
around normal incidence. All in all, it turns out that it is impossible to design an
omni-directional reflector with bilayers having $n_1 = 1.5$ and $n_2 = 2.0$.
Figure 20.8 shows the best that one can achieve with $n_1 = 1.5$, $n_2 = 2.0$, and
layer thicknesses $d_1 = 0.191\lambda_0$, $d_2 = 0.134\lambda_0$ (chosen to satisfy Eq. (20.14)).
Figure 20.8 (a) Plots of $G_{p,s}(n_1, n_2, \theta) \sin\Delta_1 \sin\Delta_2 - \cos^2[\tfrac12(\Delta_1 + \Delta_2)]$ versus
$\theta$ for a bilayer having $n_1 = 1.5$, $d_1 = 0.191\lambda_0$ and $n_2 = 2.0$, $d_2 = 0.134\lambda_0$.
At small angles of incidence both p- and s-light violate Ineq. (20.11a), while
at large angles only p-light is inadequate. (b) Computed plots of p- and
s-reflectivity, $R_{p0}$, $R_{s0}$, versus $\theta$ for a twenty-period stack of the above
bilayer. The regions of 100% reflectivity coincide with those that satisfy
Ineq. (20.11a).
Figure 20.9 Same as Figure 20.8 for a twenty-period stack consisting of layers
with $n_1 = 1.5$, $d_1 = 0.191\lambda_0$ and $n_2 = 2.3$, $d_2 = 0.122\lambda_0$. It is seen in (a) that Ineq.
(20.11a) holds for both polarization states throughout the entire range of incidence.
In (b) the reflectances are very nearly 100% everywhere. The slight drops in $R_{p0}$
and $R_{s0}$ are due to the fact that the assumed stack consists only of a finite number
of bilayers; the reflectivity will rise rapidly if the total number of bilayers
comprising the stack is increased. Note that the smaller the functions depicted in
(a) become, the harder it is to obtain 100% reflectivity from a finite stack.
Figure 20.8(a) shows plots of $G_{p,s}(n_1, n_2, \theta) \sin\Delta_1 \sin\Delta_2 - \cos^2[\tfrac12(\Delta_1 + \Delta_2)]$
versus $\theta$ for both p- and s-polarized light; both functions must stay above zero to
satisfy Ineq. (20.11a). Figure 20.8(b) shows computed plots of reflectivity, $R_{p0}$
and $R_{s0}$, versus $\theta$ for a twenty-period stack of this bilayer. It is seen that 100%
reflectivity is achieved in exactly those regions where the functions depicted in
Figure 20.8(a) are positive-valued.
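The function plotted in Figure 20.8(a) can be evaluated directly. The sketch below is a rough illustration under two assumptions that are not spelled out in this part of the text: the phase thickness of each layer is taken as $\Delta = (2\pi d/\lambda_0)\sqrt{n^2 - \sin^2\theta}$, and $G_{p,s}$ is taken in the form of Eqs. (20.12). The function names `g_function`, `phase_thickness` and `margin` are hypothetical; `margin` returns $G \sin\Delta_1 \sin\Delta_2 - \cos^2[\tfrac12(\Delta_1 + \Delta_2)]$, which must be positive for Ineq. (20.11a) to hold.

```python
import math

def g_function(n1, n2, theta_deg, pol):
    """G_p or G_s for a bilayer with indices n1, n2."""
    s2 = math.sin(math.radians(theta_deg)) ** 2
    x = math.sqrt((n1 * n1 - s2) / (n2 * n2 - s2))
    if pol == "p":
        x *= (n2 / n1) ** 2
    return 0.25 * x + 0.25 / x - 0.5

def phase_thickness(n, d_over_lambda0, theta_deg):
    """Assumed single-pass phase delay of a layer at external angle theta."""
    s2 = math.sin(math.radians(theta_deg)) ** 2
    return 2.0 * math.pi * d_over_lambda0 * math.sqrt(n * n - s2)

def margin(n1, d1, n2, d2, theta_deg, pol):
    """G*sin(D1)*sin(D2) - cos^2[(D1 + D2)/2]; positive when (20.11a) holds."""
    delta1 = phase_thickness(n1, d1, theta_deg)
    delta2 = phase_thickness(n2, d2, theta_deg)
    g = g_function(n1, n2, theta_deg, pol)
    return g * math.sin(delta1) * math.sin(delta2) - math.cos(0.5 * (delta1 + delta2)) ** 2

for pol in ("p", "s"):
    print(pol, margin(1.5, 0.191, 2.0, 0.134, 0.0, pol))    # bilayer of Figure 20.8
    print(pol, margin(1.5, 0.191, 2.3, 0.122, 45.0, pol))   # bilayer of Figure 20.9
```

With these assumptions the margin is negative at normal incidence for the $n_2 = 2.0$ bilayer, in line with the caption of Figure 20.8, and positive at $45^\circ$ for the $n_2 = 2.3$ bilayer of Figure 20.9.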
Mathematical description
The complex-amplitude distribution of a simple vortex of order $m$ centered at
$(x_0, y_0)$ in the cross-sectional plane of a Gaussian beam may be written as
† This chapter was coauthored with Ewan M. Wright of the College of Optical Sciences, University of Arizona.
Figure 21.1 A Gaussian beam, having 1/e radius $r_0 = 10\lambda$ at the waist, hosts an
$m = +1$ vortex at its center. (a) Intensity and (b) phase distribution at the beam
waist. (c) Interferogram with a tilted plane wave.
intensity distribution in Figure 21.1(a) has a hole at the center, and the phase
distribution in Figure 21.1(b) displays a continuous variation from 0 to $2\pi$ around
the vortex. If this vortex is made to interfere with a tilted plane wave, the
resulting fringe pattern would resemble that in Figure 21.1(c). The fork at the
center of the fringe pattern created by the splitting of a single fringe is
characteristic of all first-order vortices.
An important feature of vortices is that they maintain their identity as
they propagate through space. Figure 21.2 shows the intensity and phase
21 Linear optical vortices 291
Figure 21.2 The beam of Figure 21.1 propagates to its Rayleigh range at
$z = 314\lambda$. (a) Intensity, (b) phase. The phase singularity is now mixed with the
wavefront curvature.
Figure 21.4 From top to bottom: x-, y-, and z-components of the Poynting
vector S for the vortex of Figure 21.3. The minimum value of each function is
shown as black and its maximum value as white, the intermediate values being
covered by the gray-scale. The depicted ranges of values (in normalized units)
are: $-0.04 \le S_x \le 0.04$, $-0.04 \le S_y \le 0.04$, $0 \le S_z \le 1$. The Poynting vector here
has clockwise circulation around the optical axis.
When the vortex of Figure 21.3 propagates to the Rayleigh range at $z = 314\lambda$,
the intensity and phase patterns of Figure 21.5 are obtained. As before, the
central hole in the intensity pattern and the singularity of the phase pattern are
preserved, but the phase is now mixed with the curvature of the diverging
wavefront.
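The propagation distance $z = 314\lambda$ used here is consistent with the usual Gaussian-beam expression for the Rayleigh range, $z_R = \pi r_0^2/\lambda$, where $r_0$ is the 1/e amplitude radius at the waist. That formula is assumed in the following one-line check; it is the standard textbook result rather than something derived in this chapter, and the function name is hypothetical.

```python
import math

def rayleigh_range(r0_over_lambda):
    """Rayleigh range, in wavelengths, of a Gaussian beam of 1/e amplitude radius r0."""
    return math.pi * r0_over_lambda ** 2

z_r = rayleigh_range(10.0)
print(round(z_r))        # in units of lambda
```

For $r_0 = 10\lambda$ this gives $z_R \approx 314\lambda$, the value quoted in the captions of Figures 21.2 and 21.5.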
Figure 21.5 The beam of Figure 21.3 is propagated to its Rayleigh range at
$z = 314\lambda$. (a) Intensity, (b) phase. The phase singularity is mixed with the
wavefront curvature.
Vortex pair
The complex amplitude of a beam containing multiple vortices may be written as
the product of terms similar to those appearing in Eq. (21.1), namely,
\[ A(x, y, z = 0) = \Bigl\{ \prod_{n=1}^{N} \bigl[(x - x_n) + i\,\mathrm{sign}(m_n)\,(y - y_n)\bigr]^{|m_n|} \Bigr\}
   \exp\bigl[-(x^2 + y^2)/r_0^2\bigr]. \tag{21.2} \]
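Equation (21.2) translates almost literally into code. The sketch below builds the field of an arbitrary set of vortices and measures the net topological charge enclosed by a circle by unwrapping the phase along that circle. The helper names `vortex_field` and `winding_number`, the sampling density, and the test geometry (a pair of vortices separated by $11\lambda$ in a beam with $r_0 = 10\lambda$) are choices made for this illustration.

```python
import numpy as np

def vortex_field(x, y, vortices, r0):
    """Eq. (21.2) at z = 0; `vortices` is a list of (x_n, y_n, m_n) tuples."""
    field = np.exp(-(x ** 2 + y ** 2) / r0 ** 2).astype(complex)
    for xn, yn, m in vortices:
        field = field * ((x - xn) + 1j * np.sign(m) * (y - yn)) ** abs(m)
    return field

def winding_number(vortices, r0, center, radius, samples=721):
    """Net phase winding (in units of 2*pi) around a circle of given radius."""
    phi = np.linspace(0.0, 2.0 * np.pi, samples)
    x = center[0] + radius * np.cos(phi)
    y = center[1] + radius * np.sin(phi)
    phase = np.unwrap(np.angle(vortex_field(x, y, vortices, r0)))
    return int(round((phase[-1] - phase[0]) / (2.0 * np.pi)))

single = [(0.0, 0.0, +1)]
same_sign = [(-5.5, 0.0, +1), (5.5, 0.0, +1)]
opposite = [(-5.5, 0.0, -1), (5.5, 0.0, +1)]

print(winding_number(single, 10.0, (0.0, 0.0), 3.0))
print(winding_number(same_sign, 10.0, (0.0, 0.0), 8.0))
print(winding_number(opposite, 10.0, (0.0, 0.0), 8.0))
```

A circle around a single $m = +1$ vortex picks up one full $2\pi$ cycle of phase; a circle enclosing two vortices of the same helicity picks up two cycles, while a pair of opposite helicity yields no net winding.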
plotted on different scales. The beam expands along the propagation path, of
course, but the vortices maintain their relative shape and position while
undergoing a collective $90^\circ$ rotation around the optical axis between the waist
and the far field.$^8$
The case of two vortices of opposite helicity is shown in Figure 21.7. Here
$m = -1$ for one vortex and $m = +1$ for the other. The initial separation between
the vortices at the beam waist is $d = 11\lambda$. As the beam propagates through free
space, the vortices appear to spread out and combine with each other. Eventually,
they carve out a circular niche for themselves, but the phase discontinuity near
the beam center survives all the way to the far field.
A somewhat different behavior will be observed when two vortices of opposite
polarity are separated at the beam waist by $d < r_0$. The corresponding intensity
and phase patterns remain more or less the same as those in Figure 21.7 (which
are representative of the case $d > r_0$) but, at some distance $z$ from the waist, the
phase discontinuity near the beam center disappears.$^4$ This behavior is reminiscent
of fluid vortices of opposite chirality, which collide and annihilate when they
happen to be within each other's basin of attraction.
Figure 21.8 Densely packed vortices imprinted upon a sample's flat surface
may be observed through a coherent-light microscope. The incident Gaussian
beam has 1/e radius $r_0 = 900\lambda$. The entrance pupil of the 0.95NA objective lens,
having a radius of $3000\lambda$, allows the Gaussian beam through with negligible
truncation. The beam reflected from the sample picks up the amplitude and phase
patterns of the vortices and returns through the objective lens. The phase
structure may be extracted by interference with the original Gaussian beam
reflected from the reference mirror.
for data communication (or for information storage). Figure 21.8 is the schematic
of a coherent-light microscope that might be used to retrieve a dense
pattern of vortices recorded on a flat surface. The Gaussian beam entering the
system is narrow enough that truncation at the objective's aperture may be
considered negligible. Upon focusing through the 0.95NA lens, the FWHM
diameter of the focused spot becomes $1.33\lambda$. The focused spot is modulated
by the amplitude and phase reflectivity of the sample before returning to the
objective lens. At the beam-splitter the returning beam is diverted towards the
observation plane, where the intensity pattern may be examined directly, and
the phase pattern may be obtained by interference with a reference beam
(supplied by the mirror).
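The size of the focused spot can be estimated with a simple paraxial Gaussian-beam calculation. The sketch below is only a rough check and not the vector diffraction computation behind the quoted figure. It assumes a focal length $f$ equal to the pupil radius divided by NA, a focused 1/e amplitude radius $w = \lambda f/(\pi r_0)$, and an intensity FWHM of $w\sqrt{2\ln 2}$; the function name is hypothetical.

```python
import math

def focused_fwhm(r0_over_lambda, pupil_radius_over_lambda, na):
    """Paraxial estimate of the focused-spot intensity FWHM, in wavelengths."""
    focal_length = pupil_radius_over_lambda / na        # f = pupil radius / NA
    w = focal_length / (math.pi * r0_over_lambda)       # 1/e amplitude radius at focus
    return w * math.sqrt(2.0 * math.log(2.0))           # FWHM of exp(-2 r^2 / w^2)

print(round(focused_fwhm(900.0, 3000.0, 0.95), 2))      # in units of lambda
```

This estimate gives a FWHM of about $1.3\lambda$, close to the $1.33\lambda$ stated in the text; the small difference is to be expected from the approximations involved.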
Figure 21.9 shows the patterns of intensity and phase at the focal plane of
the objective lens immediately after the beam is reflected from the sample.
There appear here a total of seven vortices within the focused beam area, all
with the same helicity, $m = +1$. The pair in the middle, having a center-to-center
spacing of $\lambda/2$, is at the resolution limit of conventional optical microscopy.
When the reflected beam reaches the observation plane, the patterns
shown in Figure 21.10 are obtained. Note that both the intensity and the phase
Figure 21.9 (a) Intensity and (b) phase distribution imparted to the focused
Gaussian beam in Figure 21.8 immediately after reflection from the sample's
surface. There is a total of seven $m = +1$ vortices; the distance between the
closest pair, near the center, is $0.5\lambda$.
Figure 21.10 Distributions of (a) intensity and (b) phase at the observation
plane of Figure 21.8 corresponding to the seven vortices of Figure 21.9. In (a) and
(b) the reference beam is blocked. In (c) the reference beam interferes with the
beam returning from the sample, thus creating fringes. The vortices may be
identified by the forks within these fringes.
In isotropic media the rays of geometrical optics are usually obtained from the
surfaces of constant phase (i.e., wavefronts) by drawing normals to these surfaces
at various points of interest.$^1$ It is also possible to find the rays from the eikonal
equation, which is derived from Maxwell's equations in the limit when the
wavelength $\lambda$ of the light is vanishingly small.$^2$ Both methods provide a fairly
accurate picture of beam-propagation and electromagnetic-energy transport in
situations where the concepts of geometrical optics and ray-tracing are applicable.
The artifact of rays, however, breaks down near caustics and focal points
and in the vicinity of sharp boundaries, where diffraction effects and the vectorial
nature of the field can no longer be ignored.
It is possible, however, to define the rays in a rigorous manner (consistent with
Maxwell's electromagnetic theory) such that they remain meaningful even in those
regimes where the notions of geometrical optics break down. Admittedly, in such
regimes the rays are no longer useful for ray-tracing; for instance, the light rays no
longer propagate along straight lines even in free space. However, the rays continue
to be useful as they convey information about the magnitude and direction of the
energy flow, the linear momentum of the field (which is the source of radiation
pressure), and the angular momentum of the field. Such properties of light are
currently of great practical interest, for example, in developing optical tweezers,
where focused laser beams control the movements of small objects.$^{3,4,5,6}$ Similarly,
the manipulation of atoms and molecules with laser beams is presently an active
area of research that has tremendous potential for future applications.$^7$
Figure 22.1 Various distributions at the waist of a Gaussian beam having 1/e
(amplitude) radii Rx = 15λ, Ry = 10λ; the beam is linearly polarized along the
X-axis. (a) Intensity of the x-component of polarization, Ix = |Ex|². (b) Intensity
of the z-component of polarization, Iz = |Ez|². In (a) and (b) the peak intensities
are in the ratio Ix : Iz = 1.0 : 0.83 × 10⁻⁴. (c) A plot of Sz, the projection of the
Poynting vector S along the optical axis. Sz(x, y) ≥ 0 is encoded in gray-scale
(black, minimum; white, maximum). The other components of S, namely, Sx and
Sy, are exactly zero at this cross-section.
figures are in the ratios Ix : Iy : Iz = 1.0 : 0.39 × 10⁻⁸ : 0.83 × 10⁻⁴. Whereas the
beam at the waist is elongated along X, at z = 800λ it is elongated along Y; this is
a natural consequence of diffractive propagation. The phase plots in Figure 22.2,
middle column, reveal the acquired curvature of the beam, as well as a π phase
difference between the adjacent quadrants of Ey and the two halves of Ez. The
Figure 22.2 The Gaussian beam of Figure 22.1 after propagating a distance
of z = 800λ in free space. The left-hand column shows, from top to bottom,
the distributions of intensity for the x-, y-, and z-components of polarization;
the peak intensities are in the ratios Ix : Iy : Iz = 1.0 : 0.39 × 10⁻⁸ : 0.83 × 10⁻⁴.
The middle column shows the corresponding phase plots for Ex, Ey, Ez; the
gray-scale covers the range −180° (black) to +180° (white). The third column
shows the Cartesian components of the Poynting vector, Sx, Sy, Sz, in
gray-scale (black, minimum; white, maximum). Here the normalized ranges
of values are: −0.48 ≤ Sx ≤ 0.48, −0.9 ≤ Sy ≤ 0.9, 0 ≤ Sz ≤ 100. Symmetry
with respect to the optical axis ensures that the angular momentum of the
field around this axis is zero. Note that the dimensions are not the same in
the three columns.
general structure of the intensity and phase patterns depicted here may be readily
understood in terms of the symmetries of the Gaussian beam and the basic
properties of electromagnetic radiation.
Shown in the right-hand column of Figure 22.2 are, from top to bottom,
the x-, y-, and z-components of S encoded in gray-scale (black corresponds
to a minimum, white to a maximum). The normalized ranges of values
22 Geometric-optical rays, Poynting’s vector, and the field momenta 305
are: −0.48 ≤ Sx ≤ 0.48, −0.9 ≤ Sy ≤ 0.9, 0 ≤ Sz ≤ 100. So, for example, in the top
frame the bright regions indicate that Sx is directed along +X, while in the dark
regions Sx points toward −X. Similarly, Sy in the upper half of the middle frame
points along +Y, while it is directed along −Y in the lower half. In the bottom frame,
where Sz ≥ 0, the large positive values appear bright and those in the vicinity of zero
appear dark. As expected, these plots of Sx, Sy, Sz represent a divergent beam.
Figure 22.3 From top to bottom: plots of Sx, Sy, Sz at the waist of a circularly
polarized Gaussian beam having 1/e radii (Rx, Ry) = (15λ, 10λ). The normalized
ranges of values are: −0.96 ≤ Sx ≤ 0.96, −0.64 ≤ Sy ≤ 0.64, and 0 ≤ Sz ≤ 100.
The counterclockwise circulation of S around the optical axis gives rise to the
beam’s angular momentum around this axis.
limit of an infinitely large beam (i.e., a plane wave), Sx and Sy vanish. Does this
mean that a circularly polarized plane wave does not carry angular momentum?
The answer is no, because while Sx and Sy diminish with the expansion of the
beam they also spread over a larger area, yielding the same final value for the
integrated r × p over the beam’s cross-section.10 This is also in agreement with
the quantum picture of light, where a circularly polarized photon of frequency ν
carries energy hν and a unit of angular momentum h/2π.
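As a rough numerical illustration of the photon picture, the sketch below counts the photons delivered per second by a circularly polarized beam and multiplies by h/2π per photon. The 1 mW power and the 0.633 μm wavelength are hypothetical values chosen only for illustration, and the function names are not taken from the text.

```python
import math

H = 6.62607015e-34   # Planck constant in J s (exact SI value)
C = 299_792_458.0    # speed of light in vacuum in m/s (exact SI value)

def photon_energy(nu):
    """Energy h*nu of one photon of frequency nu (Hz), in joules."""
    return H * nu

def angular_momentum_flux(power_watts, wavelength_m):
    """Angular momentum delivered per second (N m) by a circularly polarized beam.

    Each photon is assumed to carry h/(2*pi), as stated in the text.
    """
    nu = C / wavelength_m
    photons_per_second = power_watts / photon_energy(nu)
    return photons_per_second * H / (2.0 * math.pi)

# Hypothetical example: 1 mW at a wavelength of 0.633 micrometers.
print(angular_momentum_flux(1e-3, 0.633e-6))
```

Because Planck's constant cancels, the result reduces to the beam power divided by the angular frequency 2πν, which is the classical statement that the angular momentum flux of a circularly polarized beam equals its power divided by its angular frequency.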
Figure 22.5 Various distributions for the focused spot of Figure 22.4(b). The
left-hand column shows, from top to bottom, plots of intensity for the x-, y-,
and z-components of polarization. The peak intensities are in the ratios Ix :
Iy : Iz = 0.49 : 1.0 : 0.06. The corresponding phase plots appear in the middle
column, where the gray-scale covers the range −180° (black) to +180° (white).
The right-hand column shows plots of Sx, Sy, Sz in gray-scale (black, minimum;
white, maximum). The normalized ranges of values are: −9.5 ≤ Sx ≤ 9.5,
−22.6 ≤ Sy ≤ 12.9, 0 ≤ Sz ≤ 100. Note that the dimensions are not the same in
the three columns.
Note the elongation of the focused spot along the X-axis, which is a consequence
of the particular polarization pattern of the incident beam. Figure 22.5, left col-
umn, shows the computed intensity distributions for the x-, y- and z- components
of polarization at the focal-plane. The corresponding phase patterns are shown in
the middle column. Of particular interest here are the focal-plane distributions of
Sx, Sy, Sz, shown in the right-hand column. There are two equal but opposite
vortices in this picture, which may be discerned by considering the combined
effects of Sx and Sy. A schematic diagram of the projection of S in the focal plane,
namely, Sx x̂ + Sy ŷ, is given in Figure 22.6. These, as well as more complex
momentum distributions, can now be routinely created in the laboratory and
Figure 22.6 A schematic diagram showing the vortex structure of the Poynting
vector for the focused spot depicted in Figures 22.4(b) and 22.5. The arrows
represent the projection of S in the focal plane, namely, Sx x̂ + Sy ŷ.
used to trap and manipulate small objects within the confines of the focal region
of a microscope.
23 Doppler shift, stellar aberration 311
convection of light by moving media are the various manifestations of the same
fundamental phenomenon: different relative perceptions of space and time for
observers in motion with respect to one another. In this chapter we derive general
formulas for all three phenomena by applying the Lorentz transformation to a
plane electromagnetic wave. Examples will be used to clarify the physics behind
the formulas.
Here c is the speed of light in free space, and the complex vector A0 denotes the
strength of the field at the origin of the coordinate system. Let us define the
coordinates of a point in space-time as p = (x, y, z, ict). The coefficients
appearing in the exponent of the plane wave in Eq. (23.1a) can then be grouped
together as σ = [sin θ cos φ, sin θ sin φ, cos θ, i], and the equation may be written
in compact form,
\[ p^{T} = L\,p'^{T}; \tag{23.2a} \]
Figure 23.1 In the Cartesian XYZ coordinate system a plane wave of frequency f
(wavelength = λ) propagates along the unit vector u. The polar and azimuthal
angles of u are denoted by θ and φ. As seen from another system, X′Y′Z′, the XYZ
system moves at a constant velocity V along the Z-axis. From the perspective of an
observer stationary in X′Y′Z′, the plane wave is Doppler shifted to a different
frequency f′, and the polar angle of its propagation direction has a different value
θ′. The azimuthal angle φ, however, remains the same in the two systems.
(Recall that c is a constant, having the same value in any frame of reference in
which it is measured.) L is a (complex) orthogonal matrix whose inverse is the same
as its transpose, i.e., LL^T equals the 4 × 4 identity matrix. We substitute for p^T in
Eq. (23.1b) from Eq. (23.2a), and evaluate σL as follows:
\[ \sigma L = \frac{1 + (V/c)\cos\theta}{\sqrt{1 - (V/c)^{2}}}\,[\sin\theta'\cos\phi,\ \sin\theta'\sin\phi,\ \cos\theta',\ i]. \tag{23.3a} \]
Here
\[ \sin\theta' = \sqrt{1 - (V/c)^{2}}\,\sin\theta \,\big/\, [1 + (V/c)\cos\theta]; \tag{23.3b} \]
It is readily verified from Eqs. (23.3b) and (23.3c) that sin²θ′ + cos²θ′ = 1,
which is needed if the above definition of θ′ is to be meaningful. We conclude
that the plane wave in XYZ remains a plane wave in X′Y′Z′, albeit with a
different frequency and a different propagation direction.
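The identity quoted above can be checked numerically with a short sketch. The sine follows Eq. (23.3b); the cosine used here is an assumption, namely cos θ′ = [cos θ + (V/c)]/[1 + (V/c) cos θ], which is the form consistent with Eq. (23.3b) and with sin²θ′ + cos²θ′ = 1. The function name is hypothetical.

```python
import math

def aberrated_angle(theta_deg, beta):
    """Polar angle theta' (degrees) in X'Y'Z' for a wave at theta in XYZ; beta = V/c."""
    th = math.radians(theta_deg)
    denom = 1.0 + beta * math.cos(th)
    sin_tp = math.sqrt(1.0 - beta**2) * math.sin(th) / denom   # Eq. (23.3b)
    cos_tp = (math.cos(th) + beta) / denom                     # assumed form of Eq. (23.3c)
    # The two expressions must describe one and the same angle.
    assert abs(sin_tp**2 + cos_tp**2 - 1.0) < 1e-12
    return math.degrees(math.atan2(sin_tp, cos_tp))

for theta in (0, 30, 60, 90, 120, 150, 180):
    print(theta, round(aberrated_angle(theta, 0.5), 2))
```

With V/c = 0.5, a wave propagating at θ = 90° in XYZ is seen at θ′ = 60° in X′Y′Z′, i.e., it is tilted toward the direction of relative motion.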
Doppler shift
Replacing σp^T in Eq. (23.1b) with σLp′^T and using Eq. (23.3a), it becomes
clear that the optical frequency f′ of the plane wave as measured in the X′Y′Z′
system is given by
\[ f' = f\,[1 + (V/c)\cos\theta] \,\big/\, \sqrt{1 - (V/c)^{2}}. \tag{23.4} \]
This is the relativistic formula for the Doppler shift,2 valid for all propagation
directions θ and all speeds V. When θ = 0°, the propagation direction and the
motion of the observer are antiparallel. In this case f′ is greater than f (i.e.,
blue-shifted) according to the following formula:
\[ f' = f\,\sqrt{[1 + (V/c)]/[1 - (V/c)]}. \tag{23.5a} \]
When θ = 180°, the propagation direction and the motion of the observer are
parallel, in which case f′ is less than f (i.e., red-shifted) as follows:
\[ f' = f\,\sqrt{[1 - (V/c)]/[1 + (V/c)]}. \tag{23.5b} \]
When θ = 90°, the observer is moving at right angles to the propagation direction.
The classical analysis does not yield any Doppler shift in this case,1 but the
relativistic formula yields
\[ f' = f \,\big/\, \sqrt{1 - (V/c)^{2}}. \tag{23.5c} \]
Substitution into Eqs. (23.3b) and (23.3c) reveals that, when the above condition
is satisfied, θ′ = 180° − θ.
Based on Eq. (23.4), Figure 23.2(a) shows plots of f′/f versus V/c for several values
of θ, while Figure 23.2(b) shows plots of f′/f versus θ for different values of V/c. At a
given velocity V, the beam is blue-shifted when θ = 0°, i.e., the observer moves
opposite to the direction of propagation, and red-shifted when θ = 180°, i.e., the
observer moves along the propagation direction. If θ is varied continuously from 0° to
180°, the frequency changes from blue-shifted to red-shifted, becoming equal to f
somewhere after θ = 90°. As V increases, the crossing point occurs at larger angles θ.
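The trends described above can be reproduced with a minimal sketch of Eq. (23.4). The crossing angle is obtained here by setting f′/f = 1 in Eq. (23.4) and solving for cos θ; that closed form is worked out for this illustration rather than quoted from the text, and the function names are hypothetical.

```python
import math

def doppler_ratio(theta_deg, beta):
    """f'/f from Eq. (23.4); beta = V/c and theta is the polar angle in XYZ."""
    return (1.0 + beta * math.cos(math.radians(theta_deg))) / math.sqrt(1.0 - beta**2)

def zero_shift_angle(beta):
    """Angle (degrees) at which f' = f, from 1 + beta*cos(theta) = sqrt(1 - beta^2)."""
    return math.degrees(math.acos((math.sqrt(1.0 - beta**2) - 1.0) / beta))

for beta in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(beta,
          round(doppler_ratio(0, beta), 4),      # blue shift at theta = 0
          round(doppler_ratio(180, beta), 4),    # red shift at theta = 180 degrees
          round(zero_shift_angle(beta), 2))      # crossing angle, beyond 90 degrees
```

The crossing angle always exceeds 90° and grows with V/c, in line with the behavior of the curves described for Figure 23.2(b).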
Stellar aberration
The direction of propagation of the beam perceived by the observer in the X 0 Y 0 Z 0
frame has polar angle h 0 , given by Eqs. (23.3b) and (23.3c), and azimuthal angle
[Figure 23.2: (a) f′/f versus V/c for θ = 0° to 180° in steps of 30°; (b) f′/f versus θ (degrees) for V/c = 0.1, 0.3, 0.5, 0.7, 0.9.]
[Figure 23.3: plots versus V/c (curves labeled 30°, 60°, 90°) and versus θ (degrees) (curves labeled by V/c from 0.1 to 0.9).]
[Figure 23.4: schematic showing the objective, the Y- and Z-axes, the grating period P, and the velocity V along Z.]
plane of a 0.6NA objective, shows the diameter of the central bright spot – the
Airy disk – to be 1.22λ/NA ≈ 2λ. Figure 23.5 shows computed patterns of
reflected intensity at the exit pupil of the objective for several positions of the
focused spot over the grating. From (a) to (i), the groove center’s distance from
the center of the focused spot is 0, 0.2λ, 0.4λ, 0.5λ, 0.6λ, 0.8λ, λ, 1.1λ, and 1.2λ,
respectively. In these simulations the grating is assumed to be stationary in its
various positions relative to the lens.
An alternative (and physically more accurate) explanation of the baseball
patterns of Figure 23.5 may be based on the Doppler shift between the 0th-order
and the ±1st-order diffracted light cones depicted in Figure 23.4. From the
viewpoint of an observer in the grating’s rest frame, the incident cone of light
moves with velocity V along the Z-axis. This cone is a superposition of a
multitude of plane waves of differing directions and frequencies. With reference
to Figure 23.6, consider a plane wave of frequency f and propagation direction
(θ, φ) in the XYZ coordinate system in which the lens is stationary. This plane wave,
when seen from the grating’s rest frame, has frequency f′ given by Eq. (23.4) and
Figure 23.5 Patterns of intensity distribution observed at the exit pupil of the
objective of Figure 23.4. From (a) to (i) the distance between the groove center and
the center of the focused spot is 0, 0.2, 0.4, 0.5, 0.6, 0.8, 1.0, 1.1, and 1.2 (in units of λ).
propagation direction (θ′, φ) given by Eq. (23.7). For the 0th-order reflected plane
wave, the frequency remains f′ but the propagation direction becomes (θ′, −φ).
Viewed from the rest frame of the lens, this reflected 0th-order beam has frequency
f and propagation direction (θ, −φ). Thus the 0th-order reflected cone – which is
simply a superposition of various reflected 0th-order plane waves – seen by the lens
is ignorant of the velocity V of the grating.
As for the +1st-order beam, in the grating’s rest frame the diffracted plane
wave has frequency f′ and propagation direction (θ′_{+1}, φ′_{+1}), where, in
accordance with Bragg’s law,
\[ \cos\theta'_{+1} = \cos\theta' + (\lambda'/P), \tag{23.8a} \]
\[ \sin\theta'_{+1}\cos\phi'_{+1} = \sin\theta'\cos\phi. \tag{23.8b} \]
Figure 23.6 In the reference frame of the lens of Figure 23.4, a plane wave of
frequency f incident at an angle θ on a moving grating gives rise to a 0th-order
diffracted beam of the same frequency and polar angle. The +1st-order diffracted
beam, however, will have frequency f_{+1} and polar angle θ_{+1}. In the
grating’s rest frame, the incident beam has frequency f′ and polar angle θ′. All
diffracted orders have the same frequency f′. The polar angle of the 0th-order
beam is θ′, while that of the +1st-order beam is θ′_{+1}.
Back in the rest frame of the lens, the diffracted +1st-order plane wave appears to
have a new frequency f_{+1}, where
\[ \Delta f = f_{+1} - f = -V \,\big/\, \left[ P\sqrt{1 - (V/c)^{2}}\, \right] \tag{23.9} \]
is independent of the incident beam’s propagation direction (θ, φ). The period P of
the grating thus appears to have been foreshortened by the Lorentz contraction factor,
and the Doppler shift Δf is proportional to the velocity V and inversely proportional
to the (contracted) grating period. Since Δf is independent of the direction of the
incident plane wave, the entire +1st-order cone will be Doppler shifted by the same
amount. This Doppler shift causes a beating at the exit pupil between the 0th-order
and the +1st-order cones in their area of overlap. The beat period, 1/|Δf|, is
independent of the groove profile as well as the NA of the lens. The same arguments apply
to the −1st-order light cone, except that the Doppler shift in this case is −Δf.
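The claim that the shift does not depend on the direction of incidence can be tested numerically by chaining the three steps of the argument: transform to the grating frame with Eq. (23.4), diffract with Eq. (23.8a), and transform back with V replaced by −V. The sketch below works in units where c = 1 and the incident frequency is f = 1, so the period P is measured in incident wavelengths. The cosine form of the aberration formula is an assumption (it is the form consistent with Eq. (23.3b)), and the values V/c = 0.3 and P = 4 are hypothetical.

```python
import math

def plus_first_order_shift(theta_deg, beta, period):
    """Return f_{+1} - f in units where c = 1 and f = 1 (period in wavelengths)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    cos_t = math.cos(math.radians(theta_deg))
    # Lens frame -> grating frame: Doppler shift of Eq. (23.4).
    f_grating = gamma * (1.0 + beta * cos_t)
    # Aberrated polar angle in the grating frame (assumed cosine form).
    cos_tg = (cos_t + beta) / (1.0 + beta * cos_t)
    # Diffraction in the grating frame, Eq. (23.8a), with lambda' = 1 / f'.
    cos_tg1 = cos_tg + (1.0 / f_grating) / period
    # Grating frame -> lens frame: the same Doppler formula with V -> -V.
    f_plus1 = f_grating * gamma * (1.0 - beta * cos_tg1)
    return f_plus1 - 1.0

beta, period = 0.3, 4.0
for theta in (60, 90, 120, 150):
    print(theta, plus_first_order_shift(theta, beta, period))
print("reference:", -beta / (period * math.sqrt(1.0 - beta**2)))
```

Under these sign conventions every angle returns the same value, equal to −(V/c)/[P√(1 − (V/c)²)] in the normalized units, so the whole +1st-order cone is shifted by one common amount.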
We mention in passing that, for the plane wave incident at (θ, φ) in the rest
frame of the lens, the propagation directions of the ±1st-order reflected plane
waves are (θ_{±1}, φ_{±1}), where
\[ \cos\theta_{\pm 1} = \frac{\cos\theta \pm \lambda \big/ \left[ P\sqrt{1 - (V/c)^{2}}\, \right]}{1 \mp (V/c)\,\lambda \big/ \left[ P\sqrt{1 - (V/c)^{2}}\, \right]} \tag{23.10} \]
and φ_{±1} = φ′_{±1} (see Eq. (23.8b)). Aside from the Lorentz contraction of the
grating period P, there is a relativistic correction to Bragg’s law of diffraction
from a moving grating. This correction term, which appears in the denominator
on the right-hand side of Eq. (23.10), is of the first order in V/c.
Figure 23.7 A Gaussian beam of wavelength λ propagates along the Z-axis. The
beam diameter is W0 at the waist and √2 W0 at the Rayleigh range, which is a
distance L0 from the waist. To an observer moving with constant velocity
along the Z-axis, the beam diameters at the waist and at the Rayleigh range
remain W0 and √2 W0, but the distance between them appears to have
shrunk by the Lorentz contraction factor.
\[ n' = \left\{ 1 + \frac{(n^{2} - 1)\,[1 - (V/c)^{2}]}{[1 + n(V/c)\cos\theta]^{2}} \right\}^{1/2}, \tag{23.12b} \]
\[ \tan\theta' = n\sqrt{1 - (V/c)^{2}}\,\sin\theta \,\big/\, [\,n\cos\theta + (V/c)\,]. \tag{23.12c} \]
It is clear that the Doppler-shifted frequency f′ and the polar angle θ′ are not only
functions of θ and V/c, as before, but they also depend on the refractive index n of
the propagation medium. Similarly, the apparent index n′ of the moving medium
depends on n, V/c, and θ in accordance with Eq. (23.12b). For water of refractive
index n = 1.33, Figure 23.8(a) shows plots of n′ versus V/c for several values of θ,
while Figure 23.8(b) shows plots of n′ versus θ for different values of V/c.
In the special case when the beam moves in the same direction as the medium,
θ = 0° and Eq. (23.12b) simplifies to
\[ n' = \frac{n + (V/c)}{|1 + n(V/c)|}. \tag{23.13} \]
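A short sketch can confirm that Eq. (23.12b) reduces to Eq. (23.13) at θ = 0° and show how the apparent index behaves at small velocities. The first-order expansion n′ ≈ n − (n² − 1)(V/c) used in the last line is worked out here from Eq. (23.13) for illustration and is not quoted from the text; the function names are hypothetical.

```python
import math

def apparent_index(n, beta, theta_deg):
    """Apparent refractive index n' of the moving medium, Eq. (23.12b); beta = V/c."""
    cos_t = math.cos(math.radians(theta_deg))
    return math.sqrt(1.0 + (n**2 - 1.0) * (1.0 - beta**2) / (1.0 + n * beta * cos_t)**2)

def apparent_index_collinear(n, beta):
    """Special case theta = 0, Eq. (23.13)."""
    return (n + beta) / abs(1.0 + n * beta)

n = 1.33  # water, as in the text
for beta in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(beta, apparent_index(n, beta, 0.0), apparent_index_collinear(n, beta))

# Small-velocity behavior: compare with the first-order expansion in V/c.
beta = 1e-4
print(apparent_index_collinear(n, beta), n - (n**2 - 1.0) * beta)
```

The two columns agree at every velocity, and at small V/c the apparent index departs from n linearly in V/c, which is the regime in which the drag of light by a moving medium is usually discussed.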
When V ≪ c, Eq. (23.13) yields Fresnel’s formula1 for the drag of light by a
moving medium,
†
This chapter’s coauthors are Lifeng Li and Wei-Hung Yeh.
Grating theories
The simplest theory of gratings treats them as corrugated structures that
modulate the amplitude and/or phase of the incident beam in proportion to the
local reflectivity and the height or depth of the surface relief features. The
modulated reflected (or transmitted) wavefront is then decomposed into its
Fourier spectrum to yield the various diffracted orders. Known as the scalar
theory of gratings, this elementary treatment yields the correct number and
direction of propagation for the diffracted orders, but it does not provide an
accurate estimate of the amplitude, phase, and polarization state of each order.
Rayleigh made a substantial contribution to the understanding of gratings by
representing the diffracted field as the superposition of a number of homo-
geneous (i.e., propagating) and inhomogeneous (i.e., evanescent) plane
waves.8 He then determined the complex amplitudes of the various plane
waves by imposing the electromagnetic boundary conditions at the grating
surface.
Although Rayleigh’s method was far superior to the scalar theory – it could
account for some of the observed anomalies and, in fact, provided exact solu-
tions to the electromagnetic field equations in certain cases of practical interest
– it failed to provide a comprehensive solution that would be applicable under
general conditions. A satisfactory analysis of the diffraction from gratings
requires a numerically stable solution to Maxwell’s equations constrained by
the relevant boundary conditions. Several such methods have been discovered
and elaborated over the past 30 years by a number of researchers from around
the world.2,9,10,11 The results presented in this chapter are based on the dif-
ferential method of Chandezon, which uses the so-called coordinate trans-
formation technique.11
Diffraction orders
Figure 24.1 shows the cross-section of a metallized grating with a trapezoidal
groove geometry. The grating period is denoted by p, the groove depth by d, and
the duty cycle, which is the ratio of the land width to the grating period, by c. In
this symmetric grating both side walls make the same angle α with the horizontal
plane. The metal layer, specified by its complex refractive index (n, k), is assumed
to be thick enough to render the grating opaque.
Referring to Figure 24.2, the plane of the grating is XY, and its surface normal
is the Z-axis. The plane of incidence is XZ, θ being the angle of incidence. When
the incident E-field is in the plane of incidence, the beam is p-polarized, and when
the E-field is along the Y-axis it is s-polarized. In an alternative nomenclature,
in Figure 24.2(a), the polarization is transverse electric (TE) when the incident
E-field is parallel to the grooves and transverse magnetic (TM) when it is
perpendicular to the grooves. Although the grating may be mounted with its
grooves in an arbitrary direction within the XY-plane, we shall consider only
two situations. In the first case, depicted in Figure 24.2(a) and referred to as
“classical mount”, the grooves are perpendicular to the plane of incidence. In
this case all diffracted orders remain in the XZ-plane, their propagation vectors
k given by
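\[ \mathbf{k}^{(m)} = (2\pi/\lambda_0)\,(\sigma_x^{(m)}\hat{x} + \sigma_z^{(m)}\hat{z}) = (2\pi/\lambda_0)\,\{[\sin\theta + (m\lambda_0/p)]\,\hat{x} + \sigma_z^{(m)}\hat{z}\}. \tag{24.1} \]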
Here λ0 is the vacuum wavelength of the light, the integer m specifies the diffraction
order, the unit vector σ = (σx, σy, σz) is along the propagation direction,
and the medium of incidence is implicitly assumed to be air. With σy = 0, it is
necessary that σx² + σz² = 1, from which σz can be determined once σx is known.
To keep σz real, σx(m) = sin θ + mλ0/p must be in the range (−1, +1), a constraint
that determines the number of propagating orders.
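The constraint on σx(m) translates directly into a small routine that lists the propagating orders in the classical mount. This is only a sketch: the function name and the search bound are choices made for this illustration, and the period is entered in units of λ0.

```python
import math

def propagating_orders_classical(theta_deg, period_over_lambda):
    """Orders m with |sin(theta) + m*lambda0/p| < 1 (classical mount, Eq. 24.1)."""
    s = math.sin(math.radians(theta_deg))
    # |m| can never exceed 2p/lambda0, so this bound safely covers every candidate.
    m_max = int(math.ceil(2 * period_over_lambda)) + 1
    return [m for m in range(-m_max, m_max + 1)
            if abs(s + m / period_over_lambda) < 1.0]

print(propagating_orders_classical(0.0, 4.0))    # normal incidence, p = 4*lambda0
print(propagating_orders_classical(40.0, 4.0))   # oblique incidence, p = 4*lambda0
```

At normal incidence on a grating with p = 4λ0 the propagating orders are m = −3 to +3; tilting the beam to θ = 40° removes the higher positive orders and admits more negative ones.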
In the second case, depicted in Figure 24.2(b) and referred to as “conical
mount”, the grooves are parallel to the plane of incidence. Here all diffracted
orders (other than the zeroth) are outside the XZ-plane and their propagation
vectors are given by
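\[ \mathbf{k}^{(m)} = (2\pi/\lambda_0)\,(\sigma_x^{(m)}\hat{x} + \sigma_y^{(m)}\hat{y} + \sigma_z^{(m)}\hat{z}) = (2\pi/\lambda_0)\,[(\sin\theta)\,\hat{x} + (m\lambda_0/p)\,\hat{y} + \sigma_z^{(m)}\hat{z}]. \tag{24.2} \]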
Again, the integer m is the diffraction order, the implicitly assumed medium of
incidence is air, and the constraint σx² + σy² + σz² = 1 specifies σz once σx and σy
are identified. The inequality σx² + σy² = sin²θ + (mλ0/p)² ≤ 1 determines the
number of propagating orders. This mounting is called conical because the
24 Diffraction gratings 327
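[Figure 24.2: (a) classical mount and (b) conical mount, showing the X- and Z-axes, the incident field components Ep and Es, and the diffracted orders (−2, −1, 0, +1 in (a); −1, 0, +1 in (b)).]
[Figure 24.3: incident beam, focusing lens, grating, diffracted orders, and collimating lens.]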
Figure 24.4 Computed plots of intensity distribution at the exit pupil of the
collimating lens in Figure 24.3, when the beam is diffracted from the grating of
Figure 24.1 (λ0 = 0.633 μm, p = 4λ0, d = λ0/8). The grooves are perpendicular to
the plane of incidence, as in Figure 24.2(a), and the incident beam is p-polarized.
The frames on the left correspond to the component of polarization parallel
to the XZ-plane (E∥), while those on the right correspond to the component along
the Y-axis (E⊥). In (a) and (b) the incidence is normal, whereas in (c) and (d)
θ = 40°. The ratio of the peak intensity in (b) to that in (a) is 0.65 × 10⁻⁵. Similarly,
the peak intensity ratio of (d) to (c) is 0.009. These results are based on full
vector-diffraction calculations.
mount). In Figures 24.5(a), (b) the incidence is normal, whereas in (c), (d) it is
oblique at θ = 30°. In both cases the incident beam is p-polarized, but the diffracted
beams contain a certain amount of s-polarization as well.12 At the exit
pupil of the lens, the ratio of the peak intensities perpendicular and parallel to the
XZ-plane is fairly small, |E⊥|² : |E∥|² being 0.97 × 10⁻⁴ at normal incidence and
0.025 at θ = 30°.
In both the above cases if the scalar theory of diffraction is used (instead of the
full vector theory), the picture that emerges will show the diffracted orders in
their correct locations but the amplitude, phase, and polarization state of the
various orders will be substantially incorrect.
Figure 24.5 Computed plots of intensity distribution at the exit pupil of the
collimating lens of Figure 24.3, when the beam is diffracted from the grating of Figure
24.1 (λ0 = 0.633 μm, p = 4λ0, d = λ0/8). The grooves are parallel to the plane of
incidence, as in Figure 24.2(b), and the incident beam is p-polarized. The frames on
the left correspond to the component of polarization parallel to the XZ-plane (E∥),
while those on the right correspond to the component along the Y-axis (E⊥). In (a)
and (b) the incidence is normal, whereas in (c) and (d) θ = 30°. The ratio of the peak
intensity in (b) to that in (a) is 0.97 × 10⁻⁴. Similarly, the peak intensity ratio of (d) to
(c) is 0.025. These results are based on full vector-diffraction calculations.
Diffraction efficiency
We denote by E the amplitude of the incident beam at angle θ and by E(m) the
amplitude of the mth-order reflected (or transmitted) beam emerging at θ(m). It is
further assumed that the incidence medium is air and, in the case of a transmission
grating, that the transparent medium into which the diffracted orders
emerge has refractive index n0. For the mth-order reflected (transmitted) beam the
diffraction efficiency ρ(m) (τ(m)) can be written as
Here the squared amplitude is the beam’s intensity, and the cosine factor keeps
track of the change in the beam’s cross-sectional area upon diffraction.
Figure 24.6 shows computed plots of diffraction efficiency versus θ for the
zeroth- and first-order beams for the grating of Figure 24.1 (λ0 = 0.633 μm,
p = 3λ0, d = λ0/8).12 In each frame there are four curves, representing the diffraction
efficiency of the corresponding order when the incident beam is either p- or
s-polarized and when the mount is either classical (ρp, ρs) or conical (ρ′p, ρ′s). The
sharp peaks and valleys appearing in these plots are caused by the excitation of
surface plasmons, which, in the case of metal gratings, exist only when the incident
beam has an E-field component perpendicular to the grooves (see Chapter 9, “What
in the world are surface plasmons?”). The arrows at the bottom of each figure point
to the angles of incidence associated with the Rayleigh anomalies; these are points
at which a particular diffraction order appears or disappears. In Figure 24.6(b), for
example, ρp and ρs terminate at θ = 41.81°, which is where the +first-order beam
becomes parallel to the surface and subsequently vanishes. In the case of ρ′p and ρ′s
(conical mount) the cutoff of both the ±first orders occurs at θ = 70.53°. When
the metallic grating has a large conductivity, the surface plasmon features and
Rayleigh anomalies are usually located pairwise, close to each other.
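The two cutoff angles quoted above follow from the grazing conditions of the classical and conical mounts, as the sketch below illustrates for p = 3λ0. The function names are hypothetical, the period is entered in units of λ0, and the classical-mount routine covers only positive orders at positive angles of incidence.

```python
import math

def rayleigh_angle_classical(m, period_over_lambda):
    """Angle (deg) at which order m > 0 grazes the surface in the classical mount.

    From Eq. (24.1): sin(theta) + m*lambda0/p = 1.
    """
    return math.degrees(math.asin(1.0 - m / period_over_lambda))

def rayleigh_angle_conical(m, period_over_lambda):
    """Angle (deg) at which orders +m and -m graze the surface in the conical mount.

    From Eq. (24.2): sin^2(theta) + (m*lambda0/p)^2 = 1.
    """
    return math.degrees(math.asin(math.sqrt(1.0 - (m / period_over_lambda)**2)))

print(round(rayleigh_angle_classical(1, 3.0), 2))  # 41.81
print(round(rayleigh_angle_conical(1, 3.0), 2))    # 70.53
```

In the conical mount the condition depends on m², which is why the +first and −first orders are cut off at the same angle of incidence.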
[Figure 24.6: diffraction efficiency versus θ (degrees, 0° to 90°) for the zeroth-order, +first-order (b), and −first-order beams, with curves for p- and s-polarized light.]
(i.e., 84% for p-light, 88% for s-light). In the opposite extreme, p → 0, the reflectivity
curves once again show a limiting behavior. Although there are no other diffracted
orders in this case, the limiting value of ρ(0) is not necessarily the same as the
specular reflectance of the flat metal layer but should be calculated from an
“effective medium” theory.
Reciprocity theorem
There exists a powerful and quite unexpected reciprocity relation between the beam
incident on a grating and any of the resulting diffracted orders. Suppose the incident
beam arrives at the grating at an angle θ and the mth diffracted order emerges at an
angle θ(m), having diffraction efficiency ρ(m) or, in the case of a transmitted order,
Figure 24.7 Computed plots of diffraction efficiency versus θ for the zeroth-order
diffracted beam upon reflection from the grating of Figure 24.1 (λ0 = 0.633 μm,
d = λ0/8). In (a) the grating period p = λ0 while in (b) p = 5λ0. The solid (broken)
arrows indicate the locations of Rayleigh anomalies for the classical (conical) mount.
τ(m). If the direction of incidence is now changed in such a way that the incident
beam is along the path of the mth-order beam (in the reverse direction, of course),
there emerges an mth diffracted order along the path of the original incident
beam (again in the reverse direction). The reciprocity theorem states that the
Figure 24.8 Computed plots of the zeroth-order efficiency versus the grating
period p for the grating of Figure 24.1 (λ0 = 0.633 μm, d = λ0/8, θ = 30°). The
solid (broken) arrows indicate the locations of Rayleigh anomalies for the
classical (conical) mount.
[Figure: zeroth-order diffraction efficiency versus θ (degrees) for p- and s-polarized light; λ0 = 0.633 μm, p = 3λ0, d = 0.2 μm.]
efficiency of this particular diffracted order will be exactly equal to ρ(m) (or τ(m)).
This theorem can be rigorously proved under general conditions.2 In Figure 24.6
the first-order efficiency curves in the classical mount, i.e., ρs(±1) and ρp(±1),
show several manifestations of the reciprocity theorem. A few more consequences
of reciprocity will be pointed out in the examples that follow.
Resolving power
Consider a grating of period p having a total of N grooves. The width of the mth-
order diffracted beam that covers the entire grating is Np cos θ(m). If this beam is
brought to diffraction-limited focus by a lens of focal length f, the focused spot
diameter D will be1
Therefore, in the focal plane of the lens, a shift of the wavelength from λ0 to
λ0 + Δλ causes a shift of the focused spot by the following amount:
The two wavelengths are just resolved when the above shift equals the spot
diameter D in Eq. (24.4), that is, when f Δθ(m) ≈ D. This leads to the following
expression for the resolving power:
\[ \lambda_0/\Delta\lambda \approx mN. \tag{24.7} \]
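The chain of reasoning that leads to Eq. (24.7) can be sketched as follows. This is a reconstruction under stated assumptions, not a quotation of Eqs. (24.4) to (24.6): the diffraction-limited spot diameter is taken to be of the order of λ0 f divided by the beam width (numerical factors of order unity are dropped), and the angular dispersion is obtained by differentiating the grating equation contained in Eq. (24.1) at a fixed angle of incidence.

```latex
\begin{aligned}
D &\approx \frac{\lambda_0 f}{N p \cos\theta^{(m)}}, \\
\sin\theta^{(m)} = \sin\theta + \frac{m\lambda_0}{p}
&\;\Rightarrow\;
\Delta\theta^{(m)} = \frac{m\,\Delta\lambda}{p\cos\theta^{(m)}}, \\
f\,\Delta\theta^{(m)} \approx D
&\;\Rightarrow\;
\frac{f\,m\,\Delta\lambda}{p\cos\theta^{(m)}} \approx \frac{\lambda_0 f}{N p \cos\theta^{(m)}}
\;\Rightarrow\;
\frac{\lambda_0}{\Delta\lambda} \approx mN .
\end{aligned}
```

The focal length, the period, and the cosine factor all cancel, so under these assumptions the resolving power depends only on the diffraction order and the number of illuminated grooves.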
Littrow mount, the nth-order beam, where n is negative, returns along the direction
of incidence. For instance, in the first-order Littrow mount, we find from Eq. (24.1)
\[ 2\sin\theta = \lambda_0/p. \tag{24.8} \]
Under this condition, if p < 1.5λ0, then the only possible diffracted orders are the
zeroth and the −first. Furthermore, if the efficiency for the zeroth order can be
reduced to zero, all the available power that is not absorbed by the grating will
return along the −first reflected order, thus maximizing the sensitivity of the
spectrometer. Gratings that direct all or most of the incident optical power into a
single diffracted order are known as blazed gratings. Although in the early days
ruled gratings having a triangular groove profile satisfied the blaze condition, a
triangular cross-section is no longer a prerequisite to the blazing property.
Gratings with triangular cross-section and a 90° apex angle are now more
appropriately referred to as “echelette” gratings.
Figure 24.10 shows a metallic prism with an inclination angle α. When a plane
wave is normally incident on the inclined facet of this prism, the specularly
[Figure 24.10 labels: incident beam; cut lengths mλ0/2, 2mλ0/2, 3mλ0/2, 4mλ0/2; groove depth d = ½ mλ0 cos α; period p = mλ0/(2 sin α).]
Figure 24.10 A normally incident beam of light is specularly reflected from the
inclined facet of a metallic prism (inclination angle α). For a given integer m,
imagine cutting the prism along the broken and dotted lines, which are parallel to
the direction of incidence and have lengths that are multiples of mλ0/2. The various
sections are then rearranged to form the echelette grating shown in the lower part of
the figure. If the grating is similarly illuminated at θ = α, the diffracted order that
retraces the incidence path in the reverse direction will be quite strong, which is why
this kind of grating has come to be known as a blazed grating.
reflected light returns along the direction of incidence. Let the lengths of the
equidistant lines drawn on the prism parallel to the direction of incidence be
integer multiples of mk0 / 2, where m is an arbitrary (but fixed) integer. If the
metal prism is cut along these lines and its segments rearranged, one obtains an
echelette grating with period p ¼ mk0/(2sin a), as shown in the lower part of the
figure. With an incidence angle h ¼ a on this grating, Littrow’s condition for
the negative mth diffracted order will be satisfied. In the geometric-optical
approximation, this grating should be equivalent to the original prism, because
the various reflected rays from its individual facets suffer phase delays in mul-
tiples of 2p only, making the grating’s reflected wavefront indistinguishable from
that of the prism. In reality, however, the electromagnetic field “feels” the groove
structure, and the actual diffraction efficiency of the beam returning along the
direction of incidence will not always be the same as the specular reflectance of
the polished metal prism, although they are usually close.
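The cut-and-rearrange construction can be checked numerically. The sketch below is only illustrative: it assumes the grating equation has the form sin θ(m) = sin θ + mλ0/p (a sign convention inferred from the Littrow angles quoted in this chapter, not taken from a displayed equation), and the function names are hypothetical.

```python
import math

def echelette_from_prism(alpha_deg, m, wavelength):
    """Period and groove depth of the echelette cut from a prism of inclination alpha."""
    alpha = math.radians(alpha_deg)
    period = m * wavelength / (2.0 * math.sin(alpha))  # p = m*lambda0 / (2 sin(alpha))
    depth = 0.5 * m * wavelength * math.cos(alpha)     # d = (1/2) m*lambda0 cos(alpha)
    return period, depth

def diffracted_sine(sin_theta, order, wavelength, period):
    """Assumed grating equation: sin(theta_m) = sin(theta) + order*lambda0/p."""
    return sin_theta + order * wavelength / period

wavelength = 0.633                                   # micrometers
period, depth = echelette_from_prism(30.0, 2, wavelength)
sin_in = math.sin(math.radians(30.0))                # illumination at theta = alpha
sin_out = diffracted_sine(sin_in, -2, wavelength, period)
print(period / wavelength, depth / wavelength, sin_out)
```

With α = 30° and m = 2 the period comes out as 2λ0, and the order −2 leaves with sin θ(−2) = −sin θ, that is, it retraces the incident path, which is the Littrow condition described above.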
Figure 24.11 shows computed efficiency curves in the classical mount for the
echelette grating of Figure 24.10 having α = 30°, p = 2λ0, and (n, k) = (2, 7) at
λ0 = 0.633 μm.¹² The horizontal axis depicts sin θ, the incidence angle θ being
positive (negative) when incidence is from the side of the large (small) facet of
the triangular grooves. The arrows at the top of each frame indicate the locations
of Rayleigh anomalies, in the neighborhood of which resonance features and
slope discontinuities are seen to occur. The zeroth-order efficiency curves for p-
and s-polarized light are shown in Figure 24.11(a). Despite the asymmetrical
groove geometry, the plots of ρp(0) and ρs(0) are perfectly symmetric around θ = 0,
which is a manifestation of the reciprocity theorem mentioned earlier. The
+first-order efficiency curves in Figure 24.11(b) show the same kind of symmetry
around θ = −14.48° (i.e., sin θ = −0.25), which is the angle of incidence for the
+first-order Littrow mount. Similarly, the −first-order curves in Figure 24.11(c)
show the reciprocity theorem at work around θ = 14.48°, the angle of incidence
for the −first-order Littrow mount. The Rayleigh anomalies at θ = ±30° (i.e.,
sin θ = ±0.5) mark the disappearance of the ±first-order beams beyond these
angles, as may be seen clearly in Figures 24.11(b) and 24.11(c).
The second-order efficiency curves are shown in Figure 24.11(d). These
curves peak at, and are symmetrical around, θ = ±30°, where the Littrow
condition for the second-order beams is satisfied. Reciprocity between the incident
beam and the second-order reflected beams is evident in the symmetrical values
of efficiency around θ = ±30°. Note in the case of the p-polarized beam incident
at θ = 30°, where the −second-order efficiency reaches 80% while that of all
other orders essentially vanishes, that the remaining 20% of the incident power
must have been absorbed by the grating. A similar consideration applies to both
ρp(+2) and ρs(+2) at θ = −30°. The third-order beams exist only at large angles
of incidence, as may be inferred from Figure 24.11(e). Again note the symmetry
of these curves (due to reciprocity) around sin θ = ±0.75; these values of θ
correspond to the Littrow mount in the third order.

[Figure 24.11, panels (a)–(d): diffraction efficiency versus sin θ in the classical
mount (λ0 = 0.633 μm, p = 2λ0, α = 30°) for p- and s-polarized light;
(a) zeroth order, (b) +first order, (c) −first order, (d) ±second orders.]
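The Littrow angles and Rayleigh anomalies quoted above follow from simple arithmetic on the grating equation. The following sketch assumes, as an inferred convention, that sin θ(m) = sin θ + mλ0/p, so that order m retraces the incident beam when sin θ = −mλ0/(2p); the helper names are hypothetical.

```python
import math

def littrow_sine(m, p_over_lambda):
    """Incidence sine at which order m retraces the incident path."""
    return -m / (2.0 * p_over_lambda)

def propagating_orders(sin_theta, p_over_lambda, index=1.0):
    """Orders m with |sin(theta) + m*lambda0/p| <= index (assumed grating equation)."""
    m_max = int(math.floor((index + 1.0) * p_over_lambda)) + 1
    return [m for m in range(-m_max, m_max + 1)
            if abs(sin_theta + m / p_over_lambda) <= index + 1e-12]

P = 2.0  # p / lambda0 for the echelette grating of Figure 24.10
for m in (1, -1, 2, -2, 3, -3):
    s = littrow_sine(m, P)
    print(m, s, round(math.degrees(math.asin(s)), 2))

# Orders present just beyond the Rayleigh anomaly at sin(theta) = 0.5:
print(propagating_orders(0.6, P))
```

Under this convention the ±first, ±second, and ±third orders reach the Littrow mount at sin θ = ∓0.25, ∓0.5, and ∓0.75, and at sin θ = 0.6 the +first order is no longer among the propagating orders while the −third order has appeared.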
For the sake of completeness we present in Figure 24.12 computed efficiency
curves in the case of conical mount for the same echelette grating as discussed
above.¹² Here the grooves are parallel to the plane of incidence, and symmetry
with respect to θ = 0 obviates the need for displaying the results for negative
values of θ. In this conical mount only the zeroth and ±first diffracted orders are
allowed; even then, the ±first-order beams disappear beyond θ = 60°. Note that,
because of the asymmetrical groove shape, the +first-order efficiency curves are
quite different from those of the −first-order. Also note that, beyond θ = 60°,
where the zeroth-order beam is the only beam reflected from the grating, the
relatively small values of ρ′p(0) and ρ′s(0) indicate substantial absorption within the
grating medium.

[Figure 24.11, panel (e): third-order diffraction efficiency versus sin θ for
p- and s-polarized light, orders +3 and −3.]
Transmission grating
Consider a grooved glass plate such as that depicted in Figure 24.13(a). When a
plane wave is incident at θ on this grating, the directions of the reflected orders
may be found from Eqs. (24.1) and (24.2), but the transmitted orders inside the
glass plate obey different equations. In the classical mount the transmitted orders
emerge at angles θ(m), where
\[ n_0 \sin\theta^{(m)} = \sin\theta + m\lambda_0/p. \tag{24.9} \]
Here n0 is the refractive index of the substrate. The number of diffracted orders in
the substrate could, therefore, be greater than the number reflected into the air.
However, when the transmitted orders attempt to exit the bottom of the substrate,
those incident at an angle higher than the critical angle for total internal reflection
will be fully reflected. The beams that do exit the substrate will emerge at
angles greater than θ(m), in accordance with Snell’s law; the coefficient n0 on the
left-hand side of Eq. (24.9) is effectively canceled. Consequently, the beams
emerging from the bottom of the substrate have exactly the same number and
(aside from being mirror images) the same directions as those reflected from the
top of the grating. Nonetheless, the transmitted diffracted orders may be
observed in their native form by using a hemispherical substrate, as shown in
Figure 24.13(b).

[Figure 24.12: diffraction efficiency versus θ (degrees) in the conical mount
(λ0 = 0.633 μm, p = 2λ0, α = 30°) for p- and s-polarized light; (a) zeroth order,
(b) +first order, (c) −first order.]
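The counting argument above can be made concrete with a few lines of code. This is a minimal sketch under stated assumptions: the transmitted orders are taken to obey n0 sin θ(m) = sin θ + mλ0/p (an assumed form for Eq. (24.9)), and the values p = λ0, n0 = 1.5, θ = 20° are illustrative choices, not data from the text.

```python
import math

def orders(sin_theta, p_over_lambda, index):
    """Orders m satisfying |sin(theta) + m*lambda0/p| < index."""
    m_lim = int((index + 1.0) * p_over_lambda) + 1
    return [m for m in range(-m_lim, m_lim + 1)
            if abs(sin_theta + m / p_over_lambda) < index]

def exit_angle_deg(sin_theta, m, p_over_lambda):
    """Angle in air below a flat substrate, or None if totally internally reflected."""
    s = sin_theta + m / p_over_lambda   # equals n0*sin(theta_m) inside the glass
    return math.degrees(math.asin(s)) if abs(s) < 1.0 else None

sin_theta = math.sin(math.radians(20.0))
in_glass = orders(sin_theta, 1.0, 1.5)   # orders that propagate inside the substrate
in_air = orders(sin_theta, 1.0, 1.0)     # orders reflected into (or exiting to) air
print(in_glass, in_air)
print([exit_angle_deg(sin_theta, m, 1.0) for m in in_glass])
```

In this example the substrate supports one more order than the air does, and that extra order is exactly the one trapped by total internal reflection at a flat exit face; a hemispherical substrate would let it escape.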
In the case of conical mount similar arguments apply, so that the mth-order
beam inside the substrate will have a propagation direction given by the unit
vector σ(m), where
Again, σz is determined from the relation σx² + σy² + σz² = 1. As above, when this
beam emerges into air from the bottom of a flat substrate, Snell’s law multiplies
σx and σy by the refractive index n0, ensuring that the emergent beams (aside from
being mirror images) have the same propagation directions as the corresponding
beams reflected from the top of the grating.
[Figure 24.13: transmission grating on a glass substrate, with the transmitted
orders −3, −2, −1, 0, and +1 indicated; (a) flat substrate, (b) hemispherical
substrate.]
Figure 24.14 shows the location of the transmitted diffracted orders from a
glass grating.¹² The assumed grating in this case is similar to that of Figure 24.1,
except that the metal layer is absent. The observation system is also similar to that
in Figure 24.3, except for the position of the collimating lens, which is moved to
the opposite side of the grating to collect the transmitted orders. The incident
beam, arriving at θ = 30° in the conical mount, is p-polarized. The pictures on
Figure 24.14 Computed plots of intensity distribution at the exit pupil of the
collimating lens of Figure 24.3, when the system is rearranged to allow
observation of transmitted orders from the grating of Figure 24.1, from which
the metal layer has been removed (λ0 = 0.633 μm, p = 4λ0, d = λ0/8). In this
case of conical mount at 30° incidence the grooves are parallel to the plane of
incidence, as in Figure 24.2(b), and the incident beam is p-polarized.
The pictures on the left correspond to the component of polarization in the
XZ-plane, while those on the right represent the polarization component along the
Y-axis. In (a) and (b) the substrate bottom is flat, as in Figure 24.13(a), whereas
in (c) and (d) it is hemispherical, as in Figure 24.13(b). The ratio of the peak
intensity in (b) to that in (a) is 0.21 × 10⁻⁴. Similarly, the peak-intensity ratio of
(d) to (c) is 0.89 × 10⁻⁴. These results are based on full vector-diffraction
calculations.
Dielectric-coated grating
Figure 24.15 is a diagram of a dielectric-coated transmission grating on a
hemispherical glass substrate. In the example that follows it is assumed
that λ0 = 0.633 μm, the grating period p = λ0, the groove depth d = λ0/8, the
side-wall inclination angle α = 60°, and the duty cycle c = 60%. The coatings
are conformal to the grating surface, both dielectric layers are 100 nm
thick, and their refractive indices are 2.1 and 1.5, as indicated. Because there
are no metallic layers in this case there will be no surface plasmon
excitations, but there is the possibility of guided-mode coupling to the dielectric
waveguide formed by the coating layers. The hemispherical substrate allows
all transmitted orders to exit and be measured in air. The bottom of the
hemisphere is antireflection coated, to avoid losses as the beams exit the
substrate.
Figure 24.16 shows computed plots of diffraction efficiency versus θ for
the grating of Figure 24.15.¹² The case of conical mount does not show
interesting phenomena, as evidenced by the featureless plots of ρ′ and τ′ for
the various orders. This is not surprising, considering that no guided modes
can be launched in the dielectric layers in this case. However, for the classical
mount ρp, ρs, τp and τs show peaks and valleys that are indicative of resonant
[Figure 24.15: coated grating illuminated by the incident beam; top layer
n2 = 1.5 (100 nm), second layer n1 = 2.1 (100 nm), substrate n0 = 1.5;
transmitted orders −2, −1, 0, and +1 are indicated.]

[Figure 24.16, panels (a)–(f): diffraction efficiency versus θ (degrees) for
p- and s-polarized light; orders 0 and −1 in panels (a) and (b), and orders 0, +1,
−1, and −2 in panels (c), (d), (e), and (f).]
values of efficiency before and after θ = 30°. Note that, unlike surface plasmon
excitations in metals, which occur in p-polarization only, the waveguide modes of
dielectric layers can be excited by both p- and s-polarized light. For the classical
mount, Figure 24.16(d) shows that the +first-order transmitted beam is cut off
beyond θ = 30°. In its place the −second-order transmitted beam shown in Figure
24.16(f) appears and shows fairly high efficiency for p-polarized light in a narrow
range of angles around θ = 33°.
It is impossible to describe in a brief survey the entire range of physical
phenomena that occur in diffraction gratings and their potential applications. We
hope, however, to have brought to the reader’s attention the richness and
complexity of the physics of gratings, and to have encouraged further exploration of
this fascinating subject.
Diffractive optical elements (DOEs), which are relatively new additions to the
toolbox of optical engineering, can function as lenses, gratings, prisms, aspherics,
and many other types of optical element. Typically formed in a film of only a few
microns thickness, a DOE may be fabricated on an arbitrarily shaped substrate.
Flexible functionality, wide range of available optical aperture, light weight, and
low manufacturing cost are among the advantages of DOEs. They can be
fabricated in a broad range of materials such as aluminum, silicon, silica, and
plastics, thus providing flexibility in selecting the base material for specific
applications. The effects of temperature change, thermal gradients, shock, and
stress in thin film optical devices, however, can cause deformation of the
substrate and ultimately alter the behavior of a DOE.¹⁻⁶
DOEs are wavelength sensitive; for instance, the focal length and aberration
characteristics of a diffractive lens can vary substantially if the wavelength of the incident
light is changed. DOEs can duplicate most of the functions provided by conventional
glass optics provided that the optical system operates over a narrow spectral bandwidth,
or the operation of the system requires chromatic dispersion. To date, DOEs have found
widespread application in beam-combiners, head-mounted displays, beam-shaping
optics, laser collimators, spectral filters, compact spectrometers, diode laser couplers,
projection displays, compact disk (CD) and digital versatile disk (DVD) players, laser
resonators, computer interconnects, solar concentrators, laser material processing, and
wavelength division multiplexers/demultiplexers.
Optimal design of advanced optical systems requires a thorough understanding
of the interaction between the light beam and the various elements located between
the light source and the detectors. In this chapter we use a combination of
polarization ray-tracing and quasi-vector diffraction modeling to analyze the behavior of
a laser beam as it propagates through various diffractive optical elements.
F(x, y) must be greater than or equal to zero across the surface since n − n1, t(x, y) and
λc are all non-negative. For later reference, the gradient of F(x, y) is written below:
\[ \nabla F(x,y) = \frac{\partial F}{\partial x}\,\hat{x} + \frac{\partial F}{\partial y}\,\hat{y} = \frac{n-n_1}{\lambda_c}\,\nabla t(x,y). \]

Figure 25.1 (a) A ray of light (vacuum wavelength = λ0) is incident at an oblique
angle (θ1, φ1) from a medium of refractive index n1 onto a substrate of index n2. The
substrate is coated with a layer of index n and variable thickness t(x, y), where n is
assumed to be large and t(x, y) very small, so that only the optical path difference,
OPD = (n − n1)t(x, y), has a finite value. (b) The variable thickness layer is
converted to a DOE by reducing the coating layer’s thickness wherever the OPD
contains an integer multiple of the construction wavelength λc. The characteristic
function of the DOE is thus the fractional part f(x, y) of the characteristic function of
the coating layer in (a), defined as F(x, y) = (n − n1)t(x, y)/λc.
A diffractive optical element (DOE) is constructed from the above coating layer by
reducing the layer’s thickness whenever F(x, y) happens to be greater than unity.
By removing from t(x, y) all integer multiples of λc/(n − n1), one obtains a coating
such as that in Figure 25.1(b), for which the integer part of F(x, y), if any, has been
eliminated in all locations. The characteristic function f(x, y) of the DOE, with
values confined to the interval [0, 1], is simply the fractional part of F(x, y).
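The construction just described amounts to taking a fractional part. The sketch below illustrates it on a hypothetical linear thickness ramp; the index values, the ramp, and the function name are assumptions chosen for illustration only.

```python
import numpy as np

def doe_from_coating(thickness, n, n1, lambda_c):
    """Return (F, f, t_doe): both characteristic functions and the reduced DOE thickness."""
    F = (n - n1) * thickness / lambda_c     # F(x, y) = (n - n1) t(x, y) / lambda_c
    f = F - np.floor(F)                     # fractional part of F
    t_doe = f * lambda_c / (n - n1)         # what remains after removing whole multiples
    return F, f, t_doe

x = np.linspace(0.0, 1.0, 11)               # mm, hypothetical sample points
t = 2.0e-3 * x                              # mm, hypothetical thickness ramp
F, f, t_doe = doe_from_coating(t, n=3.0, n1=1.0, lambda_c=0.5e-3)
print(np.round(F, 3))
print(np.round(f, 3))
```

Because F and f differ only by integers, the phase factors exp(i2πF) and exp(i2πf) coincide at the construction wavelength, which is why the thinned layer behaves like the original one at that wavelength.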
As shown in Figure 25.2, the coating layer’s F(x, y) is truncated at contours
where the function acquires integer values, so the local period (Δx, Δy) of the
DOE at a point such as (x0, y0) is the shortest line segment through (x0, y0) that
satisfies the equation
\[ \nabla F(x,y)\cdot(\Delta x\,\hat{x} + \Delta y\,\hat{y}) = (\partial F/\partial x)\Delta x + (\partial F/\partial y)\Delta y = 1. \tag{25.2} \]
In Eq. (25.2) x̂ and ŷ are unit vectors along the coordinate axes. Noting that
|∇F|² = (∂F/∂x)² + (∂F/∂y)², we find (Δx, Δy) = ∇F/|∇F|². This is the local
period of the grating at (x0, y0), which is directed along ∇F and has magnitude
1/|∇F|. In the linear approximation, a single period of the grating
begins at (x, y) = (x0, y0) − f(x0, y0)∇F/|∇F|², where f(x, y) = 0, and ends at
(x, y) = (x0, y0) + [1 − f(x0, y0)]∇F/|∇F|², where f(x, y) = 1.
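As a quick numerical illustration of the local period, consider a hypothetical lens-like characteristic function F(x, y) = a(x² + y²); the coefficient a and the sample point are assumptions made for this sketch.

```python
import numpy as np

def local_period(grad_F):
    """Local period vector (dx, dy) = grad F / |grad F|^2."""
    g = np.asarray(grad_F, dtype=float)
    return g / np.dot(g, g)

a = 50.0                                   # mm^-2, hypothetical coefficient
x0, y0 = 0.3, 0.4                          # mm, hypothetical location on the DOE
grad = np.array([2 * a * x0, 2 * a * y0])  # gradient of a*(x^2 + y^2) at (x0, y0)
period = local_period(grad)
print(period, np.linalg.norm(period))
```

The returned vector satisfies ∇F · (Δx, Δy) = 1 and has length 1/|∇F|, here 0.02 mm, so the grooves become finer in proportion to the distance from the center of this particular element.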
Figure 25.2 Diagram of a DOE showing the slicing contours where the
function F(x, y) assumes integer values. The DOE’s characteristic function f(x, y)
is the fractional part of F(x, y). Thus, while F(x, y) is continuous across the
XY-plane, f(x, y) jumps by one unit at each contour. The space between each pair
of adjacent contours contains a single groove of the DOE, where f(x, y) varies
continuously between the values of 0 and 1. At an arbitrary location (x0, y0)
in the XY-plane, the separation between adjacent contours is given by
(Δx, Δy) = ∇F/|∇F|², which is a vector of magnitude 1/|∇F| oriented orthogonal
to the contours.
Substituting for f(x, y) in Eq. (25.3b) from Eq. (25.4) and carrying out the
integration, we find
\[ C_m = \exp[i2\pi m f(x_0,y_0)]\,\exp\{i\pi[(\lambda_c/\lambda_0)-m]\}\,\mathrm{sinc}[(\lambda_c/\lambda_0)-m], \tag{25.5} \]
where sinc(x) = sin(πx)/(πx). The mth order diffraction efficiency is thus found to
have the constant amplitude |Cm| = sinc[(λc/λ0) − m] across the XY-plane for any
given λ0. When λ0 happens to be the same as the construction wavelength λc, the
first order beam will have 100% efficiency while all other orders vanish. Also, if
λc is an integer-multiple of λ0, only one order will emerge, unattenuated, from the
DOE. For all other values of λ0, the various orders m = 0, ±1, ±2, etc. will
coexist. The second term in Eq. (25.5) corresponds to a constant phase, π[(λc/λ0)
− m], which is independent of (x0, y0) and may thus be ignored in practice. The
remaining phase, 2πmf(x0, y0), varies continuously across the XY-plane with
absolutely no dependence on λ0. Since f(x0, y0) is the fractional part of F(x0, y0),
the two functions may be exchanged and the phase acquired by the mth order rays
written as 2πmF(x0, y0). In practice the lack of any discontinuous jumps in this
phase profile of the mth order beam is extremely important, since it means that
the wavefront associated with each and every diffraction order is well-behaved.
In other words, if one assembles all the mth order rays from across the DOE to
construct the mth order transmitted beam, the beam will have a continuous
wavefront.
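The dependence of the order amplitudes on the wavelength ratio is easy to tabulate. The sketch below evaluates |Cm| from Eq. (25.5) and squares it to obtain the fraction of power in each order (treating |Cm|² as the power fraction is an assumption of this sketch). The wavelengths 0.55 μm and 0.66 μm are illustrative values.

```python
import numpy as np

def order_amplitude(m, lambda_c, lambda_0):
    """|C_m| = |sinc(lambda_c/lambda_0 - m)|, with sinc(x) = sin(pi x)/(pi x)."""
    # np.sinc uses the same normalized definition as the text.
    return np.abs(np.sinc(lambda_c / lambda_0 - m))

orders = np.arange(-200, 201)
power_design = order_amplitude(orders, 0.55, 0.55) ** 2    # lambda_0 equal to lambda_c
power_detuned = order_amplitude(orders, 0.55, 0.66) ** 2   # lambda_0 different from lambda_c

for m in (-1, 0, 1, 2):
    print(m, power_design[orders == m][0], power_detuned[orders == m][0])
print(power_detuned.sum())
```

At the construction wavelength all the power sits in the first order. Away from it the first order still dominates in this example, the neighboring orders pick up the remainder, and the sum over orders stays close to unity (up to the truncation of the order range).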
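The transmitted wavefront around (x0, y0), the foot of the incident ray, can now
be written
\[
\begin{aligned}
A(x,y) &= \sum_m A'_m \exp[i(2\pi n_2/\lambda_0)(x\sigma'_{xm} + y\sigma'_{ym})] \\
       &= A_0 \exp[i(2\pi n_1/\lambda_0)(x\sigma_x + y\sigma_y)]\,\exp[i\psi(x,y)] \\
       &= \sum_m C_m A_0 \exp\{i(2\pi/\lambda_0)[(n_1\sigma_x + m\lambda_0\,\partial F/\partial x)x
          + (n_1\sigma_y + m\lambda_0\,\partial F/\partial y)y]\}.
\end{aligned} \tag{25.6}
\]
The (complex) amplitude and the direction of the mth order transmitted ray are
thus given by
\[ A'_m = C_m A_0, \tag{25.7a} \]
\[ (\sigma'_x,\sigma'_y)_m = \left(n_1\sigma_x + m\lambda_0\,\partial F/\partial x,\; n_1\sigma_y + m\lambda_0\,\partial F/\partial y\right)/n_2. \tag{25.7b} \]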
Note that the mismatch between the refractive indices n1, n, and n2 is not taken
into consideration in Eq. (25.7a) as far as reflection losses at the various
interfaces are concerned. Also ignored in this analysis are the effects of incident
polarization on the transmission coefficient Cm, which would have required a
rigorous vector diffraction treatment.
For m ≠ 0, the direction of the mth order transmitted ray, (σ′x, σ′y)m, is seen from
Eq. (25.7b) to depend on the illumination wavelength λ0 in a way that gives rise
to a substantial amount of chromatic aberration; this provides the basis for
correcting the chromatic aberrations of conventional refractive lenses by
incorporating diffractive optical elements in the so-called hybrid designs. In going from
medium 1 to medium 2 of Figure 25.1, the undiffracted 0th-order ray follows
Snell’s law since, according to Eq. (25.7b), (n2σ′x, n2σ′y) = (n1σx, n1σy). For other
diffraction orders, one must add mλ0∇F to the incident beam’s (n1σx, n1σy) in
order to obtain the transmitted beam’s (n2σ′x, n2σ′y)m.
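The ray-direction rule can be written as a short function. This is a sketch, assuming the direction of the mth order is (σ′x, σ′y)m = (n1σx + mλ0 ∂F/∂x, n1σy + mλ0 ∂F/∂y)/n2 with σ′z fixed by normalization; the gradient value (a 10 μm local period) and the refractive indices are hypothetical.

```python
import numpy as np

def transmitted_direction(sigma_xy, grad_F, m, lambda_0, n1, n2):
    """Direction cosines of the mth-order transmitted ray; None if the order is evanescent."""
    sx, sy = (n1 * np.asarray(sigma_xy, float)
              + m * lambda_0 * np.asarray(grad_F, float)) / n2
    sz_squared = 1.0 - sx**2 - sy**2
    if sz_squared < 0.0:
        return None
    return np.array([sx, sy, np.sqrt(sz_squared)])

grad_F = (100.0, 0.0)          # mm^-1, hypothetical local gradient (period 0.01 mm)
n1, n2 = 1.0, 1.5              # hypothetical incidence and substrate indices

zeroth = transmitted_direction((0.5, 0.0), grad_F, 0, 0.66e-3, n1, n2)
first_red = transmitted_direction((0.0, 0.0), grad_F, 1, 0.66e-3, n1, n2)
first_green = transmitted_direction((0.0, 0.0), grad_F, 1, 0.55e-3, n1, n2)
print(zeroth, first_red, first_green)
```

The zeroth order obeys Snell's law regardless of wavelength, whereas the first order is deflected more strongly at the longer wavelength, which is the chromatic behavior exploited in hybrid refractive/diffractive designs.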
Having exploited the localized ray picture to build the transmitted wavefront(s)
across the DOE surface, we now abandon the rays and concentrate instead on the
transmitted wavefronts (one for each diffracted order). When the incident
wavelength λ0 differs from the construction wavelength λc, the various orders
will be present in the mix in different amounts, with the magnitude of the mth
beam, |Cm|, being a function of m and the wavelength ratio λc/λ0. Although the
phase profile of each diffracted order is independent of the incident wavelength
λ0, this does not imply that a given diffracted order behaves identically in
response to different incident wavelengths. Remember that the mth order
phase profile is exp[i2πmF(x, y)], so, for simplicity’s sake, let us assume that
F(x, y) = ax + by, where a and b are arbitrary constants. This phase profile may
then be written as exp[i(2π/λ)(mλax + mλby)], where λ = λ0/n2 is the wavelength
within the medium of refractive index n2. This represents a plane wave having
direction cosines (σx, σy) = (mλa, mλb), whose propagation direction evidently
depends on λ0, even though its phase profile is independent of the incident
wavelength. The bottom line is that the rays and the wavefronts that emerge from
the above analysis paint a consistent picture, both leading to the same conclusions
concerning the diffraction efficiency and the chromatic aberrations associated
with each diffracted order of the transmitted beam.
Figure 25.3 The case of a reflective DOE differs from that of a transmissive
DOE in that the transparent substrate is now replaced with a perfect reflector.
The incident rays, after traveling through the coating layer, bouncing back at the
substrate interface, and returning through the same thickness of the coating
layer, re-emerge into the incidence medium (refractive index = n1). The DOE is
constructed from the coating layer by removing from t(x, y) all integer multiples
of ½λc/(n − n1).

Figure 25.4 A surface of revolution around the z-axis is defined by its sag h(r),
which is the distance of the surface (along z) from the plane tangent to the surface
at its vertex. The curvilinear coordinate s follows the tangent to the surface in the
rz-plane. The value of s at each point is the length of the curve measured from
some point of reference, such as the vertex at (r, z) = (0, 0). Also shown is a pair of
incident and refracted rays at the surface.
refractive index n1, but the DOE’s substrate is a perfect reflector. We assume
once again that the variable-thickness layer has a large refractive index n and a
correspondingly small thickness t(x, y). The optical path difference upon
transmission through the layer and reflection at the substrate interface is thus given by
OPD = 2(n − n1)t(x, y), which yields the characteristic function F(x, y) =
2(n − n1)t(x, y)/λc, with λc being the construction wavelength. Once again, the
DOE is constructed from the above coating layer by reducing the layer’s
thickness whenever F(x, y) exceeds unity. Note that the above factor of 2 in the
expression for the OPD, which represents the effect of the double pass through the
coating layer, does not affect any of the subsequent results, since the starting
point of our derivations is the function F(x, y), which already incorporates this
factor. The formal derivations for a reflective DOE parallel those of the
transmissive DOE in the preceding section, until we reach Eq. (25.6), at which
point the refractive index n2 of the medium into which the beam emerges (upon
transmission through the DOE) must be replaced with n1, reflecting the fact that
the incidence and emergence media are now the same. Therefore, for reflective
DOEs, the only equation that needs to be modified is Eq. (25.7b), which becomes
\[ (\sigma'_x,\sigma'_y)_m = \left(n_1\sigma_x + m\lambda_0\,\partial F/\partial x,\; n_1\sigma_y + m\lambda_0\,\partial F/\partial y\right)/n_1. \tag{25.8} \]
All the considerations discussed in the case of transmissive DOEs apply equally
to reflective elements as well.
Figure 25.5 Gaussian beam (λ0 = 0.66 μm, 1/e radius R0 = 2.5 mm, diameter
D = 4.0 mm) is focused by a 4.0 mm diameter lens (thickness = 1.7 mm,
refractive index = 1.54044, first surface: radius of curvature Rc = 11.4 mm,
conic constant κ = 0.733, aspheric coefficients A4 = 2.82 × 10⁻⁷, A6 = 3.75
× 10⁻⁸, A8 = 1.5 × 10⁻⁹; second surface: Rc = 98 mm). The incident beam,
linearly polarized along the x-axis, has the intensity profile shown on the
left-hand side. The glass plate (d1 = 0.61 mm), the cover slip (d2 = 0.5 mm), and the
substrate (d3 = 2.0 mm) all have the same refractive index n = 1.520168. The
glass plate is 1.0 mm away from the lens and 14.38 mm away from the cover slip.
The destination plane is at z = 10.0 mm (measured from the first vertex of the
lens), and is tilted by θ = 6.03°, as shown. The beam is subsequently propagated
a distance of 10.468 mm along the normal to the destination plane, which brings
the beam to its plane of best focus. (Elements labeled in the diagram: Gaussian
beam, aspheric lens, glass plate, DOE, destination plane.)
revolution, such as that in Figure 25.4, where the axis of symmetry is z, and the
sag is a given function h(r) of r. The characteristic function of such a DOE is
usually defined by a radial polynomial,
\[ F(r) = \sum_{n=0}^{N} a_n r^n. \tag{25.9} \]
Consider the local surface coordinate s shown in Figure 25.4. The value of s at
each point on the surface is the length of the curve measured from some point of
reference such as the vertex at (r, z) = (0, 0). What we need is the characteristic
function’s gradient over a short distance Δs, namely, ΔF/Δs. But
\[ \Delta s = \sqrt{(\Delta r)^2 + (\Delta h)^2} = \Delta r\,\sqrt{1 + (dh/dr)^2}. \tag{25.10} \]
Figure 25.6 Distributions of intensity (top) and phase (bottom) at the destination
plane in the system of Figure 25.5; from left to right, x-, y-, and z-components of
polarization. Note that the emergent beam is centered at x = −3.6 mm. The peak
intensities are in the ratio of Ix : Iy : Iz = 1.0 : 0.39 × 10⁻³ : 0.13. In the residual
phase profiles φx, φy, φz, where the wavefront curvature and tilt are factored out,
the color spectrum in each plot covers the range from minimum (blue) to
maximum (red); here (min : max) is (0° : 39°) for φx, (−147° : 39°) for φy, and
(−146° : 0°) for φz.

Figure 25.7 Plots of log-intensity (top), intensity (middle), and phase (bottom)
at the plane of best focus in the system of Figure 25.5. From left to right: x-, y-,
and z-components of polarization. The peak intensities are in the ratio of Ix : Iy :
Iz = 1.0 : 0.15 × 10⁻³ : 0.115. The phase profiles’ range (blue to red) is (min :
max) = (−180° : 180°).
Therefore,
\[ \partial F/\partial s = (\partial F/\partial r)\big/\sqrt{1 + (dh/dr)^2}. \tag{25.11} \]
Equation (25.11), in conjunction with the equations derived previously for flat
surfaces, is all that one needs in order to compute the various diffracted rays and
wavefronts associated with DOEs on curved substrates.
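A short numerical sketch of the curved-substrate gradient follows. It assumes the chain-rule form ∂F/∂s = (∂F/∂r)/√(1 + (dh/dr)²), which follows from Δs = Δr√(1 + (dh/dr)²). The polynomial coefficients and the paraboloidal sag h(r) = r²/(2Rc) with Rc = 40 mm are illustrative choices, and the function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def dF_ds(r, coeffs, dh_dr):
    """dF/ds for F(r) = sum a_n r^n on a surface whose sag has slope dh_dr(r)."""
    dF_dr = P.polyval(r, P.polyder(coeffs))     # derivative of the radial polynomial
    return dF_dr / np.sqrt(1.0 + dh_dr(r) ** 2)

# F(r) = r^2 - 1.25 r^4 + 0.35 r^6 + 0.1 r^8, r in millimeters (illustrative)
coeffs = [0.0, 0.0, 1.0, 0.0, -1.25, 0.0, 0.35, 0.0, 0.1]
Rc = 40.0                                        # mm, assumed vertex radius of curvature

def paraboloid_slope(r):
    return r / Rc                                # dh/dr for h(r) = r^2 / (2 Rc)

def flat_slope(r):
    return 0.0 * r                               # flat substrate: no sag

r = np.linspace(0.0, 1.5, 7)
print(dF_ds(r, coeffs, paraboloid_slope))
print(dF_ds(r, coeffs, flat_slope))
```

For a gently curved substrate such as this one the correction factor is very close to unity, so the curved and flat results differ only slightly; the difference grows with the local slope of the surface.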
Figure 25.8 A linearly polarized Gaussian beam enters a glass prism of refractive
index n = 1.65 whose rear facet is coated with a DOE. The incident beam’s intensity
profile is shown on the left-hand side. The emergent diffracted beam is the +first
order. The entrance and exit facets of the prism are antireflection-coated, and the
destination plane is a distance Δy = 10 mm below the prism’s exit facet.
The incident rays are traced through the entire system, then back-traced to the
so-called destination plane, located at z = 10 mm from the first vertex of the
lens and tilted by θ = 6.03°, as shown. At the destination plane, the
magnitude, phase, and polarization state of the rays are used to reconstruct the
wavefront. Figure 25.6 shows the reconstructed wavefront’s intensity and
phase distribution at the destination plane. The wavefront’s curvature and tilt
are factored out, otherwise the phase variations across the cross-sectional
profiles will be too great to display. Note that the y-component is nearly four
orders of magnitude weaker than the x-component, whereas the z-component’s
power content is non-negligible. The phase profiles of Figure 25.6 are quite
Figure 25.9 Plots of intensity (top), phase (middle) and phase minus curvature
(bottom) at the destination plane of the system of Figure 25.8. From left to right:
x-, y-, and z-components of polarization. The peak intensities are in the ratio of
Ix : Iy : Iz = 10⁵ : 0.4 : 1.08. The range of the phase profiles (blue to red) is (min :
max) = (−180° : 180°).

[Figure 25.10: system diagram with the incident Gaussian beam’s intensity profile;
labeled elements are the Gaussian beam, DOE, aspheric lens, cover slip, and
destination plane.]

Figure 25.11 Plots of intensity (top) and phase (bottom) at the exit pupil of
the aspheric lens in the system of Figure 25.10. From left to right: x-, y-, and
z-components of polarization. The peak intensities are in the ratio of Ix : Iy :
Iz = 1000 : 0.6 : 70. The range of the phase profiles (blue to red) is (min :
max) = (−180° : 180°).
and its construction wavelength λc is the same as λ0; hence the emergent beam is
the +first diffracted order.
The incident rays are first traced through the entire system, then back-traced to the
destination plane located at the exit pupil of the objective lens; the emergent
wavefront is subsequently reconstructed from the traced rays. Figure 25.11 shows
plots of intensity and phase at the destination plane. Shown from left to right are the
x-, y-, and z-components of polarization. The curvature of the wavefront has been
factored out, so what is displayed is the residual phase or aberrations. Note that the
y-component is nearly three orders of magnitude weaker than the x-component, but
the z-component is not so weak. The wavefront at the exit pupil is then propagated to
the focal plane and shown in Figure 25.12, where the y-component of polarization is
seen to be more than three orders of magnitude weaker than the x-component.
Figure 25.12 Intensity distribution at the focal plane of the lens in the system
of Figure 25.10; (left) x-component, (right) y-component of polarization. The
peak intensities are in the ratio Ix : Iy = 1000 : 0.27.

[Figure 25.13: system diagram with the incident Gaussian beam’s intensity profile;
labeled elements are the Gaussian beam, DOE, parabolic mirror, and destination
plane, with a 10 mm distance marked.]
linearly polarized along the x-axis). The paraboloid has radius of curvature
Rc = 40 mm, conic constant κ = −1, and aperture diameter D = 3.0 mm;
the DOE’s phase profile is given by F(r) = r² − 1.25r⁴ + 0.35r⁶ + 0.1r⁸ (r in
millimeters). Since the DOE’s construction wavelength is λc = 0.55 μm, various
diffracted orders exist, although the most intense beam, shown in Figure 25.14, is
the +first order. Figure 25.14 shows the reflected intensity and phase profiles at
the destination plane, located 10.0 mm away from the mirror’s vertex; this also
happens to be 10.0 mm before the mirror’s nominal focal plane. From left to right,
these plots represent the x-, y-, and z-components of polarization. Note that the
y-component is nearly six orders of magnitude weaker than the x-component,
whereas the z-component is only 600 times weaker.

Figure 25.14 Plots of intensity (top) and phase (bottom) at the destination
plane of the system of Figure 25.13. From left to right: x-, y-, and z-components
of polarization. The peak intensities are in the ratio of Ix : Iy : Iz = 1.0 : 0.33 ×
10⁻⁶ : 0.177 × 10⁻². The range of the phase profiles (blue to red) is (min :
max) = (−180° : 180°). For display purposes the curvature phase factor has been
taken out of the mesh.
intensity distribution will have the far field pattern of Figure 26.1(d), although its size
will scale with distance from the screen. Under no circumstances do we obtain an
intensity pattern that closely resembles the cross shape of the aperture itself.
Now consider the periodic array of cross-shaped apertures shown in Figure 26.2(a);
each aperture is identical to that in Figure 26.1(a). The center-to-center spacing
between adjacent apertures along the X- and Y-directions is $p = 60\lambda$. (For
simplicity we have assumed the periodic pattern to extend to infinity, although, for
practical purposes, a finite number of apertures in a periodic arrangement
will suffice.) When the pattern in Figure 26.2(a) is illuminated by a normally
incident, coherent beam of light, the cross shape of the apertures is abundantly
reproduced in the intensity patterns obtained at certain distances from the screen.
Figures 26.2(b)–(f) show the computed patterns of intensity distribution at
distances $z = 600\lambda$, $1200\lambda$, $1800\lambda$, $2700\lambda$, and $3600\lambda$, respectively.
(Note that all pictures in Figure 26.2 have the same scale.) When the distance from
object to image is $z = p^2/\lambda$, as is the case in Figure 26.2(f), the original pattern
of the apertures is reproduced, albeit with a half-period shift in both the X- and the
Y-direction. In Figure 26.2(d), the distance to the image is $p^2/(2\lambda)$, and not only
is the original pattern replicated but also its frequency (along both X and Y) has
doubled. In Figure 26.2(c), where the distance to the image is $z = p^2/(3\lambda)$, the
pattern is repeated with three times the original frequency along both the X- and
Y-axes. By showing the intensity distribution at other distances from the object,
Figures 26.2(b) and 26.2(e) emphasize that perfect reproduction of the shapes in
the original pattern does not occur everywhere but only at certain special planes.
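The distances quoted above can be checked with a little arithmetic. The following is a minimal sketch, with all lengths expressed in units of the wavelength and with illustrative names that do not come from the text; it expresses each plane of Figure 26.2 as an exact fraction of the distance $p^2/\lambda$.

```python
from fractions import Fraction


def talbot_fraction(z_over_lambda, p_over_lambda):
    """Return z / (p^2 / lambda) as an exact fraction (lengths in wavelengths)."""
    return Fraction(z_over_lambda, p_over_lambda ** 2)


# Values taken from the text: p = 60 wavelengths, planes (b)-(f) of Figure 26.2.
P = 60
PLANES = {"b": 600, "c": 1200, "d": 1800, "e": 2700, "f": 3600}
FRACTIONS = {label: talbot_fraction(z, P) for label, z in PLANES.items()}

if __name__ == "__main__":
    for label, frac in FRACTIONS.items():
        print(f"Figure 26.2({label}): z = {frac} * p^2/lambda")
```

Panels (c), (d), and (f) come out at 1/3, 1/2, and 1 times $p^2/\lambda$, as stated in the text, while panels (b) and (e) fall at 1/6 and 3/4.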
A hint as to why these periodic patterns are reproduced at certain intervals may be
gleaned from the following argument. A plane wave normally incident on a periodic
structure creates a discrete spectrum of plane waves propagating along the directions
$$\mathbf{k} = (k_x, k_y, k_z) = 2\pi\left(m/p,\; n/p,\; \sqrt{(1/\lambda)^2 - (m/p)^2 - (n/p)^2}\,\right). \qquad (26.1)$$
provided that $p/\lambda$ is large enough that, for all m, n values of interest, the above
Taylor-series expansion to first order suffices. The acquired phase after a
propagation distance of z will then be
Figure 26.3 (a) A mask consisting of eight concentric rings (width $= 2\lambda$,
spacing $= 6\lambda$) is illuminated by a normally incident plane wave of wavelength $\lambda$.
The computed intensity distributions shown here are at distances of (b) $z = 18\lambda$,
(c) $z = 27\lambda$, and (d) $z = 36\lambda$ from the mask. A bright spike appearing in the central
region of each image has been blocked off in order to improve the image contrast.
(Axes: $x/\lambda$ and $y/\lambda$, each from −55 to 55.)
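The plane-wave argument that follows Eq. (26.1) can be sketched as below. This is a reconstruction under the stated first-order assumption, namely $m\lambda/p \ll 1$ and $n\lambda/p \ll 1$; the symbol $\phi_{mn}$ for the acquired phase is introduced here for illustration only.

```latex
\begin{align*}
k_z &= \frac{2\pi}{\lambda}\sqrt{1 - (m\lambda/p)^2 - (n\lambda/p)^2}
     \approx \frac{2\pi}{\lambda} - \frac{\pi\lambda}{p^2}\,(m^2 + n^2), \\
\phi_{mn}(z) &= k_z z \approx \frac{2\pi z}{\lambda} - \frac{\pi\lambda z}{p^2}\,(m^2 + n^2).
\end{align*}
```

At $z = 2p^2/\lambda$ the order-dependent term is a multiple of $2\pi$ for every (m, n), so all plane waves regain their original relative phases. At $z = p^2/\lambda$ it equals $\pi(m^2 + n^2)$, which has the same parity as $m + n$; the resulting factor $(-1)^{m+n}$ is equivalent to shifting the pattern by half a period along both X and Y.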
A simple analysis
Consider the point source shown in Figure 26.5, located at $(x, y, z) = (x_0, y_0, 0)$ and
radiating a spherical wavefront into the region $z > 0$ of space. In this analysis we
assume that all spatial dimensions are normalized by the vacuum wavelength $\lambda$ of
the light; as a result, $\lambda$ will not appear explicitly in any of the following equations.
In the $z = z_0$ plane, the complex-amplitude distribution may be written
Figure 26.4 (a) A mask consisting of a spiral aperture (width $3\lambda$, spacing $9\lambda$)
is illuminated by a normally incident plane wave of wavelength $\lambda$. The computed
intensity distributions shown here are at distances of (b) $z = 40.5\lambda$,
(c) $z = 60.75\lambda$, and (d) $z = 81\lambda$ from the aperture. As in the previous figure, a
bright spike appearing in the central region of each image has been blocked off
in order to improve the image contrast. (Axes: $x/\lambda$ and $y/\lambda$, each from −60 to 60.)
In deriving the above approximate expression we have used, for the exponent, the
first term in the Taylor series expansion
$$\sqrt{1 + x^2} = 1 + \tfrac{1}{2}x^2 + \cdots \qquad (26.5)$$
Now, the first two terms on the right-hand side of Eq. (26.4) are the approximate
form of the spherical wavefront emanating from a point source at the origin of the
plane $z = 0$. The next term is a constant phase factor that depends on the position
$(x_0, y_0)$ of the point source within the XY-plane and the last term is a linear phase
factor in x and y.
Next, let us assume that a periodic mask, having periods $a_x$ and $a_y$ along the
X- and Y-axes, is placed at $z = z_0$ (see Figure 26.6). In the general case, where
the mask modulates the phase and/or the amplitude of the light beam, its
complex-amplitude transmission function may be written
$$t(x, y) = \sum_m \sum_n C_{mn} \exp[i2\pi(mx/a_x + ny/a_y)]. \qquad (26.6)$$
When the incident spherical wavefront is multiplied by t(x, y), each Fourier
component of t(x, y) will create a different spherical wavefront which, according to
Eq. (26.4), appears to originate at a different point $(x_0, y_0) = (mz_0/a_x, nz_0/a_y)$
within the XY-plane. In addition, each such point source appears to have the
following phase factor:
The net effect of the mask, therefore, is to replace the single point source with a
periodic array of point sources, as shown in Figure 26.7, where the magnitude of
each point source is $C_{mn}\exp(i\phi_{mn})$. At the observation plane, each point source
will give rise to a spherical wavefront that will obey Eq. (26.4), except that the
[Figure 26.6: diagram of a point source at (0, 0) illuminating a periodic
phase/amplitude mask, with periods $a_x$ and $a_y$, located a distance $z_0$ along the Z-axis.]
Figure 26.7 Interaction between the periodic mask and the cone of light
shown in Figure 26.6 gives rise to an array of (virtual) point sources, each
having a certain phase and amplitude depending on the structure of the mask
and its location $z_0$ along the Z-axis. To determine the light distribution at the
observation plane one may replace the mask by this “equivalent” array of point
sources. (Diagram labels: periodic array of point sources; periodic mask at $z_0$;
observation plane at $z_0 + z_1$.)
The first two factors in the above equation correspond to a spherical wavefront
with radius of curvature $z_0 + z_1$; we need not keep track of them any longer. The
last factor can be simplified if we define a magnification factor $M = (z_0 + z_1)/z_0$, in
which case it is written as
$$\exp\{i2\pi[mx/(Ma_x) + ny/(Ma_y)]\}. \qquad (26.9)$$
This is just the (m, n)th plane-wave component of the spectrum, whose periods
$a_x$, $a_y$ are magnified by a factor M. Except for this scale factor, the Fourier
basis functions have not changed in going from the plane of the mask ($z = z_0$)
to the observation plane ($z = z_0 + z_1$). The main factors in Eq. (26.8), therefore,
are the first two factors in the double sum; these can be written as follows:
$$\exp\left[i\pi\left(m^2/a_x^2 + n^2/a_y^2\right) z_0 z_1/(z_0 + z_1)\right] = \exp\left[i\pi(z_1/M)\left(m^2/a_x^2 + n^2/a_y^2\right)\right]. \qquad (26.10)$$
Let us now assume that $a_x^2$ and $a_y^2$ have a least common multiple in the following
sense:
$$a^2 = \mu a_x^2 = \nu a_y^2, \qquad (26.11)$$
where both $\mu$ and $\nu$ are integers. Then the phase factor in Eq. (26.10) may be
written
$$\exp\left[i\pi\,\frac{z_1}{Ma^2}\left(\mu m^2 + \nu n^2\right)\right]. \qquad (26.12)$$
Since $\mu m^2 + \nu n^2$ is an integer, if $z_1$ is chosen to be $2jMa^2$ with j integer, then the
phase factor in Eq. (26.12) will become unity for all values of m and n and can
therefore be ignored. Under such circumstances Eq. (26.8) will yield a magnified
image of the mask at the observation plane. This is the essence of the Talbot
effect.
By allowing $z_0$ to approach infinity, the above results can be readily extended to
the case of plane-wave illumination. The magnification factor M will become unity
in this case, but no other change will be necessary in the preceding equations.
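Because M itself depends on $z_1$, the self-imaging condition $z_1 = 2jMa^2$ is implicit. Substituting $M = (z_0 + z_1)/z_0$ and solving gives $z_1 = 2ja^2 z_0/(z_0 - 2ja^2)$. The following is a minimal sketch of that algebra, assuming all lengths are in units of the wavelength as in the analysis above; the function name and the sample values are hypothetical.

```python
import math


def talbot_plane(a, z0, j=1):
    """Solve z1 = 2*j*M*a**2 with M = (z0 + z1)/z0 for z1.

    All lengths are in units of the wavelength. Returns (z1, M).
    Within this model a solution with z1 > 0 requires z0 > 2*j*a**2.
    """
    zt = 2 * j * a ** 2          # plane-wave value of z1 (M = 1)
    if math.isinf(z0):
        return float(zt), 1.0
    if z0 <= zt:
        raise ValueError("no solution with z1 > 0: source too close to the mask")
    z1 = zt * z0 / (z0 - zt)
    return z1, (z0 + z1) / z0


if __name__ == "__main__":
    for z0 in (1.0e4, 1.0e5, math.inf):
        z1, M = talbot_plane(a=60, z0=z0)
        print(f"z0 = {z0:g}: z1 = {z1:.1f}, M = {M:.4f}")
```

As $z_0$ grows, $z_1$ tends to $2ja^2$ and M tends to unity, which is the plane-wave limit described in the text.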
Image multiplicity
The appearance of multiple images at the observation plane may be readily
explained in the special case where the periodicity is one dimensional and the
frequency of the image is twice that of the object. The explanation, nonetheless,
captures the essence of the phenomenon and can be easily extended to
periodicity in two dimensions and to higher multiplicities. Consider the periodic
function f(x) shown in Figure 26.8(a). Note that the period $a_x$ is much larger
than the width of the individual “features” of the function, so that there is plenty
of space to insert additional features. Let the Fourier-series representation of
this function be
$$f(x) = \sum_m C_m \exp(i2\pi mx/a_x). \qquad (26.13)$$
In the Fourier domain, the Fourier transform F(m) of f(x) is a “comb” function with
period $1/a_x$, where the delta function at position m is multiplied by the
corresponding Fourier coefficient $C_m$, as shown in Figure 26.8(b).
Now, let us assume that the odd coefficients of F(m) are multiplied by a
complex constant b. (This would happen in Eq. (26.12), for instance, if $\mu = 1$,
$\nu = 0$, and $z_1 = \tfrac{1}{2}Ma^2$, in which case $b = i$.) We can then separate the Fourier
coefficients of f(x) into even and odd terms, as shown in Figure 26.9. Both the
resulting comb functions in the Fourier domain will have twice the period of the
original comb function; therefore, their inverse transforms in the x-domain will
have twice the frequency. The second comb function in Figure 26.9 is also shifted
by a half-period, which means that its inverse transform must be multiplied
by $\exp(i2\pi x/a_x)$. The resulting comb functions in the x-domain are shown in
Figure 26.10. The net result is that when we add the two comb functions of
Figure 26.10 and convolve the resultant with the unit-period function $f_0(x)$, we
will find the function shown in Figure 26.11. Because the width of $f_0(x)$ is less
than half the period $a_x$, the new features added to the function will not overlap
with the old ones, yielding a function with an apparently increased frequency.
However, the periodicity is only in the amplitude of the function, since the phase
of each feature differs from the phase of its neighbors. In any event, this
description explains why the apparent periodicity of the pattern in Figure 26.2
increases at certain distances between the object and the image.
Figure 26.8 (a) A periodic function f(x) in one-dimensional space; the
individual “features” of the function are much narrower than its period $a_x$. (b) The
Fourier transform of f(x) consists of a sequence of delta functions located at
integer multiples of $1/a_x$ in the Fourier domain.
Figure 26.9 In Figure 26.8(b), when the odd components of the
Fourier-transformed function F(m) are multiplied by a constant b, the function may be
resolved into two “comb” functions, $F_{\mathrm{even}}(m)$ and $F_{\mathrm{odd}}(m)$. In these new
functions the spacing between adjacent delta functions is $2/a_x$ and, in the case of
$F_{\mathrm{odd}}(m)$, the function is shifted by a half-period.
Figure 26.11 When the sum of the two comb functions in Figure 26.10 is
convolved with the individual features $f_0(x)$ of f(x), the resulting function appears
to have twice the frequency of the original f(x). Note, however, that the “features”
of the new function are alternately multiplied by $\tfrac{1}{2}(1 + b)$ and $\tfrac{1}{2}(1 - b)$.
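The even/odd argument above can be checked numerically. The sketch below is only an illustration on a hypothetical sampled function (16 samples per period, one narrow feature); it multiplies the odd-order Fourier coefficients by the constant b and compares the result with $\tfrac{1}{2}(1 + b)f(x) + \tfrac{1}{2}(1 - b)f(x - a_x/2)$. A plain discrete Fourier transform stands in for the Fourier series.

```python
import cmath


def dft(samples):
    """Fourier-series coefficients C_m of one period sampled at N points."""
    n = len(samples)
    return [sum(s * cmath.exp(-2j * cmath.pi * m * k / n)
                for k, s in enumerate(samples)) / n
            for m in range(n)]


def idft(coeffs):
    """Inverse of dft(): rebuild the N samples of one period."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * cmath.pi * m * k / n)
                for m, c in enumerate(coeffs))
            for k in range(n)]


def scale_odd_orders(samples, b):
    """Multiply the odd-order Fourier coefficients of one period by b."""
    coeffs = dft(samples)
    # With N even, index m and the signed order m - N share the same parity.
    scaled = [c * b if m % 2 else c for m, c in enumerate(coeffs)]
    return idft(scaled)


# One period (N = 16 samples) containing a single narrow feature.
N = 16
f = [1.0 if k in (1, 2) else 0.0 for k in range(N)]
b = 1j
g = scale_odd_orders(f, b)

# Prediction: (1 + b)/2 * f(x) + (1 - b)/2 * f(x - a_x/2).
predicted = [0.5 * (1 + b) * f[k] + 0.5 * (1 - b) * f[(k - N // 2) % N]
             for k in range(N)]
```

With $b = i$ the original and the inserted features have equal intensity (one half each) but differ in phase by 90°, which is the sense in which the doubled periodicity is only in the amplitude.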
27 Some quirks of total internal reflection
Readers are undoubtedly familiar with the phenomenon of total internal reflection
(TIR), which occurs when a beam of light within a high-index medium arrives with
a sufficiently great angle of incidence at an interface with a lower-index medium.
What is generally not appreciated is the complexity of phenomena that accompany
TIR. For instance, consider the simple optical setup shown in Figure 27.1, where a
uniform beam of light is brought to focus by a positive lens, being reflected,
somewhere along the way, at the rear facet of a glass prism. Assuming a refractive
index $n = 1.65$ for the prism material, the critical angle of incidence is readily
found to be $\theta_{\mathrm{crit}} = \sin^{-1}(1/n) = 37.3^\circ$. Let the lens have numerical aperture
NA = 0.2 (i.e., f-number = 2.5). Then the range of angles of incidence on the
prism’s rear facet will be (33.5°, 56.5°). The majority of the rays thus suffer total
internal reflection and converge, as depicted in Figure 27.1, towards a common
focus in the observation plane.
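The numbers in this paragraph follow from elementary formulas, as the sketch below shows. It assumes that the axis of the focused cone meets the rear facet at 45° (an inference from the quoted range of angles, not a value stated in the text) and uses the paraxial relation f-number = 1/(2 NA); the function name is illustrative.

```python
import math


def tir_geometry(n, na, axis_incidence_deg=45.0):
    """Critical angle, f-number, and range of incidence angles (degrees).

    axis_incidence_deg is the angle at which the cone's axis meets the rear
    facet; 45 degrees is an assumption inferred from the quoted range.
    """
    theta_crit = math.degrees(math.asin(1.0 / n))
    half_cone = math.degrees(math.asin(na))   # half-angle of the focused cone
    f_number = 1.0 / (2.0 * na)
    return {
        "theta_crit": theta_crit,
        "f_number": f_number,
        "theta_min": axis_incidence_deg - half_cone,
        "theta_max": axis_incidence_deg + half_cone,
    }


RESULT = tir_geometry(n=1.65, na=0.2)

if __name__ == "__main__":
    for key, value in RESULT.items():
        print(f"{key}: {value:.2f}")
```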
Figure 27.2 shows computed plots of intensity and phase at the observation
plane, indicating that the focused spot essentially has the Airy pattern, albeit with
minor deviations from the ideal. The diameter of the first dark ring, for example,
is approximately $6\lambda$, which is close to the theoretical value of $1.22\lambda/\mathrm{NA}$ for the
Airy disk.¹ The coma-like tail appearing on the right-hand side of the focused
spot is caused by those rays that strike the prism in the neighborhood of the
critical TIR angle, $\theta_{\mathrm{crit}}$, thus introducing apodization and aberration. (Apodization
is due to a reduction of the reflectivity of the prism below the critical angle, and
aberration is caused by deviations from linearity of phase as a function of angle of
incidence.) One noteworthy feature of the focused spot of Figure 27.2 is that it is
not centered on the optical axis, but is shifted to the right by about one
wavelength. This shift is known as the Goos–Hänchen effect,²,³,⁴ and its cause will
become clear in the course of the following discussion.
Figure 27.1 Focusing of a uniform beam through a TIR prism. The incident beam
is linearly polarized along the X-axis, the numerical aperture of the lens is 0.2, and
the refractive index of the prism material is 1.65. The entrance and exit facets of the
prism are assumed to be spherical so that ray-bending by Snell’s law at these
surfaces is avoided, thus eliminating the corresponding spherical aberrations.
Figure 27.2 Plots of (a) logarithmic intensity distribution and (b) phase, at the
focal plane of the lens. The center of the bright spot is shifted to the right by about
one wavelength in consequence of the Goos–Hänchen effect. The light and dark
rings in the phase plot correspond to regions of 0° and 180° phase, respectively.
(Axes: $z/\lambda$ and $y/\lambda$, each from −10 to +10.)
For the prism of Figure 27.1 the computed amplitude and phase of Fresnel’s
reflection coefficients at the glass-to-air interface are presented in Figure 27.3.¹
The curves for both p- and s-components of polarization are shown, even though
in our example we are primarily concerned with p-polarized light. Note that
beyond the critical angle the phase of the reflected p-light has a very large slope.
To the extent that this phase may be approximated by a straight line (within the
range of incidence angles of interest) it imparts a linear phase shift to the beam
upon reflection from the prism’s rear facet. This linear phase shift is nothing other
than a wavefront tilt, which causes a displacement of the focused spot; in other
words, it gives rise to the Goos–Hänchen effect. One might phrase the same
explanation in the language of Fourier-transform theory by stating that when a
function is multiplied by a linear phase factor, its Fourier transform is displaced
by an amount proportional to the slope of that phase factor.
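The Fourier-transform statement above can be illustrated with a toy one-dimensional calculation. The sketch below does not model the prism; it only shows, for a hypothetical sampled pupil, that multiplying by a linear phase ramp displaces the peak of the transform (the "focused spot") by an amount equal to the slope of the ramp. All names and sample sizes are illustrative.

```python
import cmath


def dft_intensity(samples):
    """Intensity |F_q|^2 of the discrete Fourier transform of samples."""
    n = len(samples)
    out = []
    for q in range(n):
        total = sum(s * cmath.exp(-2j * cmath.pi * q * k / n)
                    for k, s in enumerate(samples))
        out.append(abs(total) ** 2)
    return out


def peak_index(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)


N = 64
# Uniform pupil of 16 samples (zero elsewhere), standing in for the beam.
pupil = [1.0 if 24 <= k < 40 else 0.0 for k in range(N)]

# Linear phase ramp with a slope of 3 cycles across the N-sample window.
SLOPE = 3
tilted = [p * cmath.exp(2j * cmath.pi * SLOPE * k / N)
          for k, p in enumerate(pupil)]

PEAK_FLAT = peak_index(dft_intensity(pupil))
PEAK_TILTED = peak_index(dft_intensity(tilted))
```

Here the untilted pupil peaks at index 0 and the tilted one at index `SLOPE`; doubling the slope doubles the displacement.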
Note that the largest slopes of the phase plots in Figure 27.3(b) occur
immediately after the critical angle; therefore, the greatest effects would be observed
when the incident beam’s angular spectrum is confined to the vicinity of $\theta_{\mathrm{crit}}$. In our
example, of course, the range of incident angles is fairly large (33.5° to 56.5°), and
deviations from linearity of the phase function show up as higher-order aberrations
(e.g., coma, astigmatism, spherical aberration, defocus). It is this deviation from
linearity that is mainly responsible for the aberration of the focused spot seen in
Figure 27.2.
Figure 27.3 Plots of (a) amplitude and (b) phase for the reflection coefficients of the
p- and s-components of polarization at a glass–air interface. The assumed index
of the glass is $n = 1.65$. The critical angle for TIR is $\theta_{\mathrm{crit}} = \sin^{-1}(1/n) = 37.3^\circ$,
and the Brewster angle is $\theta_B = \tan^{-1}(1/n) = 31.2^\circ$. (Horizontal axes: $\theta$ in degrees,
0 to 90. Vertical axes: (a) $|r_s|$ and $|r_p|$, 0 to 1; (b) $\phi_s$ and $\phi_p$, 0° to 180°.)
A question frequently asked about TIR concerns the balance of energy among the
incident beam, the reflected beam, and the evanescent waves that exist in the medium
beyond the prism. If all the light is reflected at the glass–air interface, then how can
there be any energy in the form of electromagnetic fields in the region immediately
beyond the interface? To answer this question one must distinguish between the
steady state of the system, which prevails once the waves have established
themselves throughout space, and the transient state, which exists in the earlier stage
immediately after the light source has been turned on. In the transient state, some of
the incident energy goes into developing the evanescent waves, which are
established early on and remain for as long as the system remains undisturbed. If one
calculated for the evanescent field the component of the Poynting vector
perpendicular to the interface, one would find that the electric and magnetic components of
this field are exactly 90° out of phase and, therefore, that the perpendicular
component of the Poynting vector is zero. In other words, no energy is carried away
from the interface by these evanescent waves. Consequently, all the incident optical
energy in the steady state is carried away by the reflected beam.
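The vanishing of the normal Poynting component can be made concrete with a short calculation. The sketch below assumes an s-polarized plane wave, for which the time-averaged energy flux normal to the interface in the air is proportional to $\mathrm{Re}(k_z)\,|E_y|^2$; the function name and the two sample angles are illustrative choices, and $k_z$ is expressed in units of the vacuum wavenumber.

```python
import cmath
import math


def normal_energy_flow(n, theta_deg):
    """Relative time-averaged Poynting component normal to a glass-air
    interface, evaluated in the air for an s-polarized plane wave whose
    transmitted field has unit amplitude at the interface.

    For E_y ~ exp[i(kx*x + kz*z)], the flux is proportional to Re(kz);
    kz is returned in units of the vacuum wavenumber 2*pi/lambda.
    """
    s = n * math.sin(math.radians(theta_deg))   # kx/k0, conserved across the interface
    kz = cmath.sqrt(complex(1.0 - s * s, 0.0))  # real if propagating, imaginary if evanescent
    return kz.real, kz


BELOW, _ = normal_energy_flow(1.65, 30.0)        # below the critical angle
ABOVE, KZ_ABOVE = normal_energy_flow(1.65, 45.0)  # above the critical angle
```

Below the critical angle $k_z$ is real and energy flows across the interface; above it $k_z$ is purely imaginary, the field decays exponentially, and the normal flux is zero.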
Next, we consider the effect of a collimating lens (identical to the original
focusing lens), placed so as to capture the radiation emanating from the focused
spot. (In the system of Figure 27.1, this lens would be placed one focal length
above the observation plane and parallel to it.) The resulting collimated beam is
depicted in Figure 27.4, which shows computed plots of intensity and phase at the
exit pupil of the collimator. Note, in particular, the strong attenuation of the left
edge of the beam (owing to a loss of rays below $\theta_{\mathrm{crit}}$), and also the near-linearity
of the phase plot in regions far from the critical angle. As the light rays approach
$\theta_{\mathrm{crit}}$ from above, the phase pattern in Figure 27.4(b) rises rather sharply and then
flattens. This is precisely what one would expect based on the behavior of $\phi_p$ in
the interval (33.5°, 56.5°), shown in Figure 27.3(b).
Figure 27.4 (a) Plot of intensity distribution at the exit pupil of the collimating
lens. The low-contrast rings are caused by diffraction effects during propagation
and by loss of the high-spatial-frequency content of the spectrum. The rays on
the left side of the beam, having been below the critical angle for TIR, have been
partially transmitted through the prism. (b) Distribution of phase at the exit pupil
of the collimating lens. The small linear slope is responsible for the
Goos–Hänchen displacement of the focused spot. The plateau on the left-hand side is
caused by the (partially reflected) rays that fall below the critical angle. The
sharp rise immediately before reaching the plateau is due to the rapidly
decreasing phase of the reflected rays just above the critical angle.
(Axes: $y/\lambda$ and $z/\lambda$; phase in degrees.)
One cannot leave the subject of TIR without at least mentioning the fascinating
phenomena associated with frustrated TIR, which occur when a second prism is
brought to the vicinity of the interface at which TIR occurs. Consider a pair of
identical glass hemispheres separated by an air gap of width D, as shown in
Figure 27.5. Displayed in Figure 27.6 are computed plots of amplitude reflection
coefficients $|r_p|$ and $|r_s|$ versus the angle of incidence $\theta$ for three different values
of D. In Figure 27.6(a), where D = 100 nm, one can see close similarities to
Figure 27.3(a), albeit with TIR completely suppressed: the Brewster angle at
$\theta_B = 31.2^\circ$ is still there, but there are no sharp transitions to 100% reflectivity. In
Figure 27.6(b), D is set to 300 nm and the curves are beginning to look more like
those in Figure 27.3(a); it appears that by increasing D one can make a rather
smooth transition to TIR. But wait! In Figure 27.6(c), where D = 400 nm, there
is a radical departure from the presumed “smooth transition”. Specifically, at
$\theta = 20.7^\circ$ both $r_p$ and $r_s$ vanish identically. What is going on here? What will
happen if the gap width D keeps increasing? These questions are not difficult to
answer but require some thought. Essentially, at a certain gap width D and at
some angle of incidence $\theta$ both $r_p$ and $r_s$ vanish. The gap width is such that, at this
angle, $D\cos\theta = \lambda/2$ exactly (with the cosine evaluated for the refracted ray inside
the air gap, which propagates at 35.7° when the angle of incidence in the glass is
20.7°). Now, whenever a non-absorbing layer’s thickness
Figure 27.5 A pair of glass hemispheres separated by an air gap may be used
to demonstrate the phenomenon of frustrated TIR. The coherent beam of light is
directed at the center of the upper hemisphere at incidence angle $\theta$. The width D
of the air gap is adjustable.
Figure 27.6 Computed amplitude reflection coefficients, $|r_p|$ and $|r_s|$, for p- and s-polarized light in the system of Figure 27.5.
The refractive index of the glass hemispheres is $n = 1.65$, the wavelength of the incident beam is $\lambda = 650$ nm, and the width D of
the air gap is (a) 100 nm, (b) 300 nm, (c) 400 nm.
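The zero of reflectivity at the 400 nm gap can be reproduced with the standard single-layer (Airy) formula applied to a glass/air/glass sandwich. The sketch below is one way of setting this up, not necessarily the method used for Figure 27.6; the function names are hypothetical, and the parameter values are those of the caption above.

```python
import cmath
import math


def gap_reflection(n, gap_nm, wavelength_nm, theta_deg):
    """Amplitude reflection coefficients (r_s, r_p) of a glass/air/glass gap.

    Uses the single-layer (Airy) formula with Fresnel coefficients at the two
    identical interfaces; theta_deg is the angle of incidence in the glass.
    """
    cos1 = math.cos(math.radians(theta_deg))
    sin2 = n * math.sin(math.radians(theta_deg))        # Snell's law, air side
    cos2 = cmath.sqrt(complex(1.0 - sin2 * sin2, 0.0))  # imaginary beyond the critical angle
    phase = cmath.exp(2j * 2.0 * math.pi * gap_nm * cos2 / wavelength_nm)
    results = []
    for r12 in ((n * cos1 - cos2) / (n * cos1 + cos2),   # s-polarization
                (cos1 - n * cos2) / (cos1 + n * cos2)):  # p-polarization
        # The second interface has r23 = -r12.
        results.append(r12 * (1.0 - phase) / (1.0 - r12 * r12 * phase))
    return tuple(results)


def half_wave_incidence_deg(n, gap_nm, wavelength_nm):
    """Incidence angle in glass at which gap * cos(angle in air) = wavelength / 2."""
    cos_air = wavelength_nm / (2.0 * gap_nm)
    return math.degrees(math.asin(math.sqrt(1.0 - cos_air ** 2) / n))


THETA_ZERO = half_wave_incidence_deg(1.65, 400.0, 650.0)
R_S, R_P = gap_reflection(1.65, 400.0, 650.0, THETA_ZERO)
```

At the half-wave condition the round-trip phase factor equals unity, so the reflections from the two faces of the gap cancel for both polarizations; for a very wide gap and an angle beyond the critical angle, the same formula returns a reflection coefficient of unit magnitude.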
Figure 27.7 (a) Computed amplitude reflection coefficients, $|r_p|$ and $|r_s|$, for
p- and s-polarized light in the system of Figure 27.5, when the flat surface of the
bottom hemisphere is coated with a thick layer of aluminum: (n, k) = (1.47, 7.8),
thickness = 200 nm. The top glass hemisphere is assumed to have refractive
index $n = 1.65$, the wavelength of the incident beam is $\lambda = 650$ nm, and the width
D of the air gap is 875 nm. This particular gap width was chosen because it
brought the minimum in $r_p$ close to zero. At other gap widths the behavior is
qualitatively the same but the minimum of reflectivity is higher. (b) Component
of the Poynting vector perpendicular to the gap, computed at $\theta = 37.4^\circ$. The
horizontal axis is the distance measured from the top of the air gap towards the
aluminized surface at the bottom. The optical energy flows unattenuated through
the air before being fully absorbed in the top 30 nm of the aluminum layer.
(Horizontal axes: (a) $\theta$ in degrees, 0 to 90; (b) Z in nm, 0 to 1000.)
28 Evanescent coupling
Figure 28.1 A beam of light is totally internally reflected from the rear facet of
a glass prism. The electromagnetic field lurking in the free space region behind
the prism is evanescent; both its electric and magnetic components decay
exponentially with distance from the interface, and the projection of its Poynting
vector perpendicular to the interface is zero. The energy stored in the evanescent
field is deposited there at the time when the light source is first turned on. In the
steady state, energy is neither added to nor removed from the evanescent field;
all the incoming optical energy is reflected at the rear facet of the prism.
phenomena of frustrated TIR and attenuated TIR, which are of relevance here,
were discussed in previous chapters (see Chapter 10, “What in the world are
surface plasmons?”, and Chapter 27, “Some quirks of total internal reflection”).
Figure 28.2 A collimated beam of light, uniform, monochromatic ($\lambda_0 = 633$ nm),
and linearly polarized along X, enters an aplanatic 0.8 NA objective lens
($f = 3750\lambda_0$). A glass hemisphere – also known as a SIL – of refractive index
$n = 2$ is placed so that its flat facet coincides with the objective’s focal plane.
The surfaces of the objective as well as the spherical surface of the SIL are
antireflection coated, but the flat facet of the SIL is bare. The light reflected
from this flat facet returns to the objective, is collimated by it, and appears at
the exit pupil.
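For the geometry of Figure 28.2 one can estimate how much of the beam undergoes TIR at the flat facet. The sketch below rests on two assumptions that are standard but not spelled out in the caption: rays cross the spherical surface of the SIL undeviated (the focus lies at the center of the sphere), and the aplanatic objective maps pupil radius to $\sin\theta$ (the sine condition). The function name is hypothetical.

```python
import math


def sil_pupil_summary(n_sil, na_objective):
    """Geometry of rays at the flat facet of a hemispherical SIL.

    Assumes rays cross the spherical surface undeviated and that the aplanatic
    objective maps pupil radius to sin(theta) (the sine condition).
    """
    theta_crit = math.degrees(math.asin(1.0 / n_sil))
    theta_max = math.degrees(math.asin(na_objective))
    # Pupil radius (as a fraction of the full aperture) where TIR sets in.
    r_crit = min(1.0, (1.0 / n_sil) / na_objective)
    tir_area_fraction = 1.0 - r_crit ** 2   # uniform illumination assumed
    return theta_crit, theta_max, r_crit, tir_area_fraction


THETA_CRIT, THETA_MAX, R_CRIT, TIR_FRACTION = sil_pupil_summary(2.0, 0.8)

if __name__ == "__main__":
    print(f"critical angle {THETA_CRIT:.1f} deg, marginal ray {THETA_MAX:.1f} deg")
    print(f"TIR annulus starts at {R_CRIT:.3f} of the pupil radius "
          f"and holds {100 * TIR_FRACTION:.0f}% of the beam")
```

Under these assumptions about 61% of the uniform beam lies in the totally reflected annulus, which is compatible with the 66% overall reflectivity quoted for Figure 28.3(a) once partial reflection in the central zone is added.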
Figure 28.3 Various distributions of the reflected light at the exit pupil of the
objective lens of Figure 28.2. (a) Plot of reflected intensity corresponding to a
66% overall reflectivity at the flat facet of the hemisphere. (b) Logarithmic plot
of the reflected intensity. (c) Intensity distribution for the x-component of
polarization, $E_x$. (d) Intensity distribution for the y-component of polarization, $E_y$.
(e) The polarization ellipticity $\eta$ encoded by gray-scale, covering a range from
−37° (black) to +37° (white). (f) The polarization rotation angle $\rho$ encoded by
gray-scale, covering a range from −90° (black) to +90° (white).
Where there was a bright ring of light at the exit pupil in Figure 28.3(a), now
there is a gradual brightening toward the margins in Figure 28.5(a), indicating the
gradual decrease in evanescent coupling with increasing angle of incidence. The
two dark spots in the vicinity of the Brewster angle are clearly visible in the
logarithmic plot of Figure 28.5(b). The ellipticity $\eta$ shown in Figure 28.5(e) varies over
the aperture in the range ±29.5°, while the polarization rotation angle $\rho$ has the
distribution shown in Figure 28.5(f).
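The weakening of the coupling toward the margin of the pupil can be related to the decay length of the evanescent field in the gap. The sketch below uses the standard expression $\lambda_0 / (2\pi\sqrt{n^2\sin^2\theta - 1})$ for the 1/e decay length of the field amplitude, with the parameters of Figure 28.4 ($n = 2$, $\lambda_0 = 633$ nm, 0.8 NA); the function name and the sample angles are illustrative.

```python
import math


def evanescent_decay_length_nm(n, wavelength_nm, theta_deg):
    """1/e decay length of the evanescent field amplitude in the air gap.

    theta_deg is the angle of incidence inside the glass of index n; the value
    is infinite at or below the critical angle, where the field propagates.
    """
    s = n * math.sin(math.radians(theta_deg))
    if s <= 1.0:
        return math.inf
    return wavelength_nm / (2.0 * math.pi * math.sqrt(s * s - 1.0))


# Parameters from Figure 28.4: n = 2, vacuum wavelength 633 nm, 0.8 NA objective.
MARGINAL_DEG = math.degrees(math.asin(0.8))
LENGTHS = {deg: evanescent_decay_length_nm(2.0, 633.0, deg)
           for deg in (31.0, 40.0, MARGINAL_DEG)}

if __name__ == "__main__":
    for deg, length in LENGTHS.items():
        print(f"theta = {deg:.1f} deg: decay length = {length:.0f} nm")
```

Just above the critical angle the decay length is several hundred nanometers, so a 100 nm gap is crossed easily; for the marginal rays it drops to roughly 80 nm, and the coupling across the same gap is correspondingly weaker.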
Figure 28.4 A collimated beam of light, uniform, monochromatic ($\lambda_0 = 633$ nm),
and linearly polarized along X, enters an aplanatic 0.8 NA objective lens ($f = 3750\lambda_0$).
Two glass hemispheres of refractive index $n = 2$, separated by an air gap, are
arranged in such a way that the focal plane of the objective coincides with the
mid-plane of the air gap. Both hemispheres are antireflection coated on their spherical
surfaces, but are left bare on their flat surfaces. The light reflected at the air gap returns
to the objective, is collimated by it, and appears at the exit pupil.
Figure 28.5 Various distributions of the reflected light at the exit pupil of the
objective lens of Figure 28.4; the gap width is fixed at 100 nm. (a) Plot of the
reflected intensity corresponding to a 43% overall reflectivity at the air gap.
(b) Logarithmic plot of the reflected intensity. (c) Intensity distribution for $E_x$.
(d) Intensity distribution for $E_y$. (e) The polarization ellipticity $\eta$ encoded by
gray-scale, covering a range from −29.5° (black) to +29.5° (white). (f) The
polarization rotation angle $\rho$ encoded by gray-scale, covering a range from −90°
(black) to +90° (white).
aluminum layer. (At 50 nm thickness, the aluminum film is opaque; the incident
light is partly absorbed and partly reflected from the film’s surface, practically
no light being transmitted through the film.) The absorbed light comes partly
from the central region of the incident beam, which is transported through the
gap by ordinary (i.e., propagating) waves, and partly from the remaining annular
region, which is transported by evanescent coupling. As before, the reflected
polarization state is quite complex, regions near the critical angle being RCP in
two quadrants and LCP in the other two quadrants. (The coarseness of the mesh
used in these calculations does not reveal the resonant absorption by
surface-plasmon excitation. This type of absorption occurs within a narrow range of
angles just above the critical angle for p-polarized light. Because the angular
range of resonant absorption is extremely narrow, however, its contribution to
the overall absorption within the aluminum film may be neglected.)
Figure 28.6 Computed plot of reflectance versus gap width in the system of
Figure 28.4, when a circular mask is placed in the incident beam’s path to block
the rays that arrive at the interface between the hemispheres at or below the
critical TIR angle. (Axes: gap width, 0 to 600 nm; reflectance, 0 to 1.)
To compute the fraction of light absorbed by the aluminum film through
evanescent coupling, we place once again a circular mask in the central region of
the incident beam, blocking all the rays that would arrive at the gap below the
critical angle. The results of calculations in this case are shown in Figure 28.8 for
a bare aluminum film (solid curve) as well as a coated film (broken curve). With
increasing gap width the reflectance of the bare film drops slightly at first, then
rises rapidly to saturate at 100%. When the aluminum film is in contact with the
SIL (i.e., zero gap width) it absorbs about 16% of the light, but by the time the
gap widens to 150 nm the absorption is down to a mere 3%. One can improve
upon this situation by applying a dielectric coating over the aluminum layer. The
broken curve in Figure 28.8 is a plot of reflectance versus gap width for an
aluminum film 50 nm thick coated with a layer of SiO 100 nm thick. It is clear
that evanescent coupling now takes place over a wider range of gaps; in
particular, between gap widths of 150 nm and 200 nm there is a plateau of about
10% absorption. (In the absence of the mask blocking the center of the beam, the
dielectric-coated aluminum film absorbs a total of 19% of the incident power at a
gap width of 100 nm.)
The above aluminum–SiO bilayer is discussed for illustration purposes only; it
does not represent the most efficient multilayering scheme for coupling the light
to the aluminum layer. One can do somewhat better by adding layers
on top of the aluminum film and/or beneath the flat facet of the SIL and by
optimizing the thickness and refractive index of each such layer.
Figure 28.7 Various distributions of the reflected light at the exit pupil of the
objective lens of Figure 28.4, when the flat facet of the second hemisphere is
coated with a layer of aluminum 50 nm thick. The gap width is fixed at 100 nm.
(a) Plot of the reflected intensity, showing a 92% overall reflectance. (If the SIL
is removed, the reflectivity drops to 90%.) (b) Logarithmic plot of the reflected
intensity. (c) Intensity distribution for $E_x$. (d) Intensity distribution for $E_y$.
(e) The polarization ellipticity $\eta$ ranging from −43.4° (black) to +43.4° (white).
(f) The polarization rotation angle $\rho$, ranging from −90° (black) to +90° (white).
Figure 28.8 Computed plots of reflectivity versus gap width for two different
samples. The solid curve shows the reflectance when an aluminum layer 50 nm
thick coats the flat facet of the second hemisphere of Figure 28.4. The broken
curve corresponds to the case where a layer of SiO 100 nm thick and with $n = 2$
coats the 50 nm thick aluminum film. A mask blocks the central region of the
beam in both cases. (Vertical axis: reflectance, 0.85 to 1.00.)
Magneto-optical disk
A major application area for evanescent coupling is the field of magneto-optical
(MO) disk data storage.²,³,⁴ Here a disk, which is typically a multilayer stack of
metallic and dielectric layers on a glass or plastic substrate, is placed under a solid
immersion lens (SIL). As the disk spins, the SIL rides on an air cushion, which
separates the two by a fixed gap width. Two of the most important questions in this
area are: (1) how much of the focused optical energy is absorbed within the optical
disk?; (2) how does the reflected MO signal depend on the air gap?
Before answering these questions, however, we must give a brief overview of
the physical mechanisms involved in MO recording and readout. The disk
consists of a thin magnetic layer sandwiched between two dielectric layers coated
atop a reflector such as an aluminum-coated substrate (see Figure 28.9). The layer
thicknesses and refractive indices shown in the figure are typical but in fact can
vary somewhat, depending on the configuration of the drive for which the disk is
intended. The disk is used in reflection, and the multilayer stack is designed to
take advantage of optical interference in order to maximize the coupling of the
laser beam to the magnetic layer.³ The aluminum reflector is an important
component of this optical interference device, but it also serves as a heat sink to
remove from the magnetic layer the thermal energy deposited there by the
focused laser beam. The dielectric layers protect the magnetic film from the
environment and, through their thickness and refractive index, provide the
necessary degrees of freedom for adjusting the optical characteristics of the stack.
Also, the dielectric layer between the magnetic film and the aluminum layer
controls the flow of heat between these two metallic layers.
[Figure 28.9: layer stack, from top to bottom: incident beam ($\lambda_0 = 633$ nm);
100 nm dielectric ($n = 2$); 25 nm magnetic film; 30 nm dielectric ($n = 2$);
25 nm aluminum, (n, k) = (1.4, 7.6); substrate.]
The optical properties of the magnetic film are fully specified by its dielectric
tensor, namely,
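\[
\boldsymbol{\varepsilon} =
\begin{pmatrix}
\varepsilon & \varepsilon' & 0 \\
-\varepsilon' & \varepsilon & 0 \\
0 & 0 & \varepsilon
\end{pmatrix}. \tag{28.1}
\]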
Differential detection
The standard method of detecting the MO signal in conventional optical disk
drives is shown in Figure 28.10.3 The beam reflected from the disk and collected
by the objective lens is sent through a Wollaston prism and thus divided between
two identical detectors. Typically the detection module is oriented at 45° with
respect to the original direction of polarization of the laser, so that Ex and Ey are
equally split between the two detectors. Whereas both the magnitude and phase of
Ex at the two detectors are identical, Ey arrives at each detector with a different
sign. The total light amplitude arriving at the detectors is thus (Ex ± Ey)/√2, the
plus sign corresponding to one detector and the minus sign corresponding to the
other. If the phase difference between Ex and Ey is denoted φx − φy then the net
differential signal may be written as

\[
\Delta S = S_1 - S_2
= \gamma \int \left( \tfrac{1}{2}\,|E_x + E_y|^2 - \tfrac{1}{2}\,|E_x - E_y|^2 \right) dx\,dy
= 2\gamma \int |E_x E_y| \cos(\phi_x - \phi_y)\, dx\,dy. \tag{28.2}
\]

In this equation, γ is the responsivity of the detectors (in volts per watt of optical
energy) and the integrals are over the individual detector areas. Note that when the
magnetization direction at the disk is reversed the sign of Ey reverses, resulting in a
sign reversal for ΔS. Also note that any phase difference between Ex and Ey reduces
the output signal by the cosine factor in the above equation. In principle, this phase
difference may be eliminated by a properly patterned phase plate placed immediately
before the Wollaston prism. In practice, however, unless φx − φy is fairly
uniform over the aperture, it is difficult to correct the effects of this relative phase.

[Figure 28.10: differential detection module. Labels: X, Y, Z axes; Wollaston prism; split detector with outputs S1 and S2; differential signal ΔS.]
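As a numerical check on the two forms of the integrand in Eq. (28.2), the following minimal sketch samples complex amplitudes Ex and Ey at a set of points and evaluates the differential signal both ways. The function name, the random sample values, and the choice γ = 1 are illustrative assumptions; none of them come from the simulations described in this chapter.

```python
import cmath
import math
import random

def differential_signal(ex, ey, gamma=1.0):
    """Evaluate the differential signal two ways (hypothetical helper).

    ex, ey: lists of complex amplitudes sampled over the detector area.
    gamma:  detector responsivity (illustrative value).
    Returns (direct, cosine_form).
    """
    # Each detector receives (Ex +/- Ey)/sqrt(2) and integrates its intensity.
    s1 = gamma * sum(abs((a + b) / math.sqrt(2)) ** 2 for a, b in zip(ex, ey))
    s2 = gamma * sum(abs((a - b) / math.sqrt(2)) ** 2 for a, b in zip(ex, ey))
    direct = s1 - s2
    # Closed form: 2*gamma * sum of |Ex Ey| cos(phi_x - phi_y).
    cosine_form = 2.0 * gamma * sum(
        abs(a * b) * math.cos(cmath.phase(a) - cmath.phase(b))
        for a, b in zip(ex, ey)
    )
    return direct, cosine_form

random.seed(0)
# Assumed sample data: a strong Ex and a weak Ey, each with arbitrary phase.
ex = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(64)]
ey = [0.01 * complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(64)]

direct, cosine_form = differential_signal(ex, ey)
# Reversing the sign of Ey (magnetization reversal) should flip the signal.
reversed_direct, _ = differential_signal(ex, [-b for b in ey])
```

The two evaluations agree to rounding error, and negating Ey negates the signal, which mirrors the sign reversal noted in the text.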
reflectance 36%, Kerr rotation angle 0.66°, and Kerr ellipticity 0.05°. Focusing
the beam by the 0.8NA objective through the SIL changes these parameters only
slightly, as long as the disk and the SIL remain in contact. However, a small air
gap between the disk and the SIL can change the disk’s performance drastically.
Figure 28.11 shows computed distributions at the exit pupil of the objective lens
for a 100 nm gap width. Figure 28.11(a) is the intensity distribution for the reflected
Ex, containing 36% of the incident power. The dark oval-shaped region in the
middle indicates an area of strong absorption by the disk. The phase of Ex, shown
in Figure 28.11(b), is non-uniform over the aperture, ranging in value from −180°
(black) to +180° (white). At the center the phase is about +100°, and drops
continuously along the X-axis to −150° at the edge.

Figure 28.11 Various distributions of the reflected light at the exit pupil of the
objective lens of Figure 28.2, when the MO stack of Figure 28.9 is placed in
front of the SIL with a 100 nm air gap. (a) Intensity distribution for Ex, containing
36% of the incident optical power. (b) Phase distribution for Ex; the gray-scale
covers the range −180° (black) to +180° (white). (c) Intensity distribution
for Ey, containing 11.5% of the incident power. (d) Phase distribution for Ey; the
phase difference between adjacent quadrants is nearly 180°. (e) The polarization
ellipticity η ranging from −45° (black) to +45° (white). (f) The polarization
rotation angle ρ ranging from −90° (black) to +90° (white).
Figure 28.11(c) is a plot of intensity distribution for the reflected Ey, which
contains 11.5% of the incident power. This y-component is due mainly to the
Fresnel reflection coefficients at the interface between the SIL and the multi-
layer stack. The fraction of Ey created by MO activity is relatively small and,
although embedded in Figure 28.11(c), is difficult to recognize at this point.
The phase of Ey depicted in Figure 28.11(d) shows the well-known π shift
between adjacent quadrants. The polarization distribution over the exit pupil
(see Figures 28.11(e), (f)) is highly non-uniform and contains all possible states
of polarization, i.e., linear, elliptical, and circular.
To observe the contribution to Ey by MO activity, we eliminate the magnetization
of the disk by setting to zero the off-diagonal element ε′ of the tensor,
then subtract the complex-amplitude distributions at the exit pupil with and
without the magnetization. In doing so the x-component cancels out exactly,
showing that there are no magnetic contributions to the reflected Ex. However, the
y-component shows a residual distribution ΔEy. Figure 28.12 shows the intensity
and phase plots for ΔEy at the exit pupil of the objective for a 100 nm gap width.
Notice that this MO contribution to Ey has circular symmetry; moreover, it is
large in the region where absorption by the disk is strong (compare the position of
the bright ring in Figure 28.12(a) with that of the dark oval-shaped region in
Figure 28.11(a)).
Figure 28.12 Plots of intensity and phase for the MO contribution to the
reflected light, ΔEy, at the exit pupil of the objective lens of Figure 28.2. The
multilayer stack of Figure 28.9 is assumed to be in front of the SIL with a 100 nm
gap. (a) Intensity distribution, containing a fraction 0.37 × 10⁻⁴ of the incident
optical power. (b) Phase distribution, ranging from −70° (black) to +246° (white).
Figure 28.13 Total reflectivity (solid line) and the integrated intensity of the MO
signal (broken line) at the exit pupil as functions of the gap width. These calculations
correspond to the system of Figure 28.2 in conjunction with the quadrilayer
MO stack of Figure 28.9, when a mask blocks the central region of the beam.
[Plot axes: horizontal, gap width (nm), 0 to 600; vertical, reflectance, 0.0 to 1.0. Curves: |Ex|² + |Ey|² and 20000 |ΔEy|².]
[Figure: plot versus gap width (nm), 0 to 600; vertical scale 0.000 to 0.020.]
and Eq. (28.2) for the definition of S1, S2.) Again we have blocked the central
region of the incident beam in order to concentrate on the effects of evanescent
coupling. With the SIL and the disk in contact, the normalized differential signal
is close to its ideal value, which is twice the tangent of the Kerr rotation angle,
namely, 2 tan 0.66° = 0.023. As the gap widens, the differential signal drops
sharply: at 100 nm gap width, for instance, the signal is down by a factor of four.
Roughly one-half of this drop may be attributed to the reduction in ΔEy and the
corresponding rise in reflectivity (see Figure 28.13). The remaining half, however,
is due to variations over the beam’s cross-section of the relative phase
φx − φy of Ex and ΔEy.
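The arithmetic behind the ideal value quoted above is easy to reproduce. The short sketch below evaluates twice the tangent of the Kerr rotation angle and then applies the factor-of-four reduction reported for a 100 nm gap; the function name is a hypothetical helper introduced only for this illustration.

```python
import math

def ideal_normalized_signal(kerr_rotation_deg):
    """Ideal normalized differential signal: 2 * tan(Kerr rotation angle)."""
    return 2.0 * math.tan(math.radians(kerr_rotation_deg))

# Kerr rotation angle of 0.66 degrees, as quoted in the text.
ideal = ideal_normalized_signal(0.66)

# The text reports a drop by a factor of four at a 100 nm gap width.
at_100nm_gap = ideal / 4.0
```

Because the angle is small, 2 tan θ is nearly 2θ in radians, so the ideal signal scales almost linearly with the Kerr rotation.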
It must be emphasized that the quadrilayer stack of Figure 28.9 is not
specifically optimized for operation with the system of Figure 28.2. By changing
the thicknesses and the refractive indices of the various layers and/or by
introducing dielectric coatings at the bottom of the SIL, it might be possible to
improve upon the aforementioned performance figures. It is highly unlikely,
however, that one can achieve significant gains in terms of the coupling efficiency
and the magnitude of the MO Kerr signal over what we have already
reported.
29 Internal and external conical refraction
The phenomenon of conical refraction was predicted by Sir William Rowan
Hamilton in 1832 and its existence was confirmed experimentally two months
later by Humphrey Lloyd.1,2 (James Clerk Maxwell was only a toddler at the
time.) The success of this experiment contributed greatly to the general acceptance
of Fresnel’s wave theory of light.
Conical refraction has been known for nearly 170 years now,1,2 and a complete
explanation based on Maxwell’s electromagnetic theory has emerged, which is
accessible through the published literature.3,4 The complexity of the physics
involved, however, is such that it prevents us from attempting to give a simple
explanation. We shall, therefore, confine our efforts to presenting a descriptive
picture of internal and external conical refraction by way of computer simulations
based on Maxwell’s equations.
Overview
To observe internal conical refraction one must obtain a slab of biaxial
birefringent crystal, such as aragonite, that has been cut with one of its optic axes
perpendicular to the polished parallel surfaces of the slab (see Figure 29.1).
When a collimated beam of light (say, from a HeNe laser) is directed at normal
incidence towards the front facet of the slab, the beam enters the crystal and
spreads out in the form of a hollow cone of light. Upon reaching the opposite
facet, the beam emerges as two concentric hollow cylinders, propagating in the
same direction as the original, incident beam.
[Figure 29.1. Labels: Incident beam, Optic axis, Emergent beams.]

External conical refraction is, in a way, the above phenomenon in reverse.
Specifically, a hollow cone of light, converging towards a point on the surface of
a biaxial crystal slab, becomes collimated along the optic-ray axis of the crystal
and continues to propagate along that axis for as long as the beam remains within
the crystal slab (see Figure 29.2). When the beam reaches the opposite facet of
the slab, it emerges as an expanding cone of light. The focused cone thus
“remains in focus” in its entire path through the crystal and diverges only after
exiting the slab.
There are certain subtle differences between internal and external conical
refraction; for instance, the optic axis of wave normals along which the beam
propagates in the former case is not the same as the optic-ray axis in the latter. This and
other differences will become clear in the course of the following discussions.
Figure 29.3 The index ellipsoid has semi-axes of length nx, ny, nz along the
principal axes X, Y, Z of the crystal. For a plane wave propagating in a given
direction, a plane through the center of the ellipsoid and perpendicular to the
wave normal will have an elliptical cross-section with the index ellipsoid. A
propagation direction for which the cross-sectional ellipse becomes a circle is
known as an optic axis. Similarly, the ray ellipsoid has semi-axes of length 1/nx,
1/ny, 1/nz along the principal axes. For a given ray direction, a plane through the
center of the ray ellipsoid and perpendicular to the ray will have an elliptical
cross-section with the ray ellipsoid. A propagation direction for which the
cross-sectional ellipse becomes a circle is known as an optic-ray axis. In general,
biaxial crystals have two optic axes and two optic-ray axes.
[Figure labels: Optic axis 1, Optic axis 2.]
yield the refractive indices associated with the two orthogonal polarizations of the
beam. If the wave-vector k happens to be in such a direction that its corresponding
cross-sectional ellipse becomes a circle, then the beam will “see” a
single refractive index, irrespective of its state of polarization. The propagation
direction corresponding to this circular cross-section is known as the optic axis.
Crystals in which the three principal indices of refraction nx, ny, nz are all different
exhibit two such optic axes and are, therefore, referred to as biaxial. A
crystal in which one index of refraction differs from the other two exhibits one
optic axis and is known as a uniaxial birefringent crystal. Conical refraction
occurs only in biaxial birefringent crystals.
Birefringent crystals also have a ray ellipsoid with semi-axis lengths 1/nx, 1/ny,
1/nz along the principal axes. The ray ellipsoid, therefore, is different from the
index ellipsoid, whose semi-axis lengths are the refractive indices themselves.
While the index ellipsoid is relevant to the discussion of internal conical
refraction, it is the ray ellipsoid that plays the central role in the case of external
conical refraction. For a ray propagating along a given direction, the plane
passing through the center of the ray ellipsoid and perpendicular to the ray will, in
general, have an elliptical cross-section with the ellipsoid. If a ray happens to be
in such a direction that its corresponding cross-sectional ellipse becomes a circle,
then the direction of that ray defines an optic-ray axis. In general, the optic-ray
axis is different from the optic axis, which is obtained in a similar fashion from
the index ellipsoid. Biaxial birefringent crystals thus possess two optic-ray axes
in addition to their two optic axes. Assuming ny < nx < nz, it is not difficult
to show that the optic-ray axis is in the YZ-plane and makes an angle θ with the
Z-axis, where

\[
\tan\theta = \sqrt{\frac{n_x^2 - n_y^2}{n_z^2 - n_x^2}}.
\]
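The formula above follows from requiring that the section of the ray ellipsoid perpendicular to the ray be a circle whose radius equals the X semi-axis, 1/nx. The sketch below evaluates the angle and checks that condition numerically. It also applies the same circular-section condition to the index ellipsoid to obtain the optic-axis angle; that second expression is derived here for comparison and is not quoted from the text. The principal indices are assumed, illustrative values, and the function names are hypothetical.

```python
import math

def optic_ray_axis_angle(nx, ny, nz):
    """Angle (radians) between the optic-ray axis and Z, for ny < nx < nz."""
    if not (ny < nx < nz):
        raise ValueError("expected ny < nx < nz")
    return math.atan(math.sqrt((nx**2 - ny**2) / (nz**2 - nx**2)))

def optic_axis_angle(nx, ny, nz):
    """Same circular-section condition applied to the index ellipsoid."""
    if not (ny < nx < nz):
        raise ValueError("expected ny < nx < nz")
    return math.atan((nz / ny) * math.sqrt((nx**2 - ny**2) / (nz**2 - nx**2)))

# Illustrative principal indices (assumed values, not taken from the text).
nx, ny, nz = 1.68, 1.53, 1.69

theta_ray = optic_ray_axis_angle(nx, ny, nz)
theta_opt = optic_axis_angle(nx, ny, nz)

# Ray ellipsoid (semi-axes 1/nx, 1/ny, 1/nz): the in-plane semi-axis r obeys
# 1/r^2 = cos^2(theta)*ny^2 + sin^2(theta)*nz^2, which must equal nx^2.
ray_check = math.cos(theta_ray)**2 * ny**2 + math.sin(theta_ray)**2 * nz**2

# Index ellipsoid (semi-axes nx, ny, nz): 1/r^2 must equal 1/nx^2.
index_check = math.cos(theta_opt)**2 / ny**2 + math.sin(theta_opt)**2 / nz**2
```

The two angles differ by the factor nz/ny in the tangent, which is one way to see that the optic-ray axis and the optic axis are close to each other but do not coincide.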
Figure 29.4 (a) The incident intensity distribution at the front facet of the
crystal slab in Figure 29.1. (b) Emergent intensity distribution at the rear facet of
the slab, corresponding to the circularly polarized incident beam shown in (a).
(c) Distribution of the angle of the emergent polarization vector to the X-axis.
The gray-scale is such that a white pixel represents a +90° angle while a black
pixel corresponds to a −90° angle. The emergent polarization is linear at any
given point on the beam’s cross-section, but its direction varies from point to
point. At the top of the rings there is an apparent 180° discontinuity in the
direction of polarization. The jaggedness of the discontinuity is caused by small
numerical errors that are inevitable when computing the state of polarization in
the dark regions around the rings.
[Plot axes: x/λ0 from −700 to 700; y/λ0 from −150 to 1250.]
circular polarization (RCP and LCP) yield the same results. Alternatively, the
incident beam may be assumed to be unpolarized for the full rings to emerge.
As we shall see later, with linearly polarized light a certain part of the rings will
be missing.
[Figure 29.5: two panels with axes x/λ0 (−700 to 700) and y/λ0.]
To gain an appreciation for the phase distribution over the beam’s cross-
section, we show in Figure 29.5 two computed interferograms corresponding to
the superposition of the beam emerging from the exit facet of the crystal and a
uniform reference beam. The beam entering the crystal is assumed to be RCP in all
cases, but the reference beam is RCP in Figure 29.5(a) and LCP in Figure 29.5(b).
We notice that in Figure 29.5(a) the outer ring has interfered constructively with
the reference beam, whereas the inner ring shows destructive interference. As a
general rule, there is a 180° phase shift between the inner and outer rings at
radially adjacent locations, irrespective of the state of incident polarization. This
phase difference aside, the two rings are identical in their polarization and phase
distributions. The interferogram of Figure 29.5(b) is more complicated than that
of Figure 29.5(a); nevertheless, it can be fully explained in terms of the states of
polarization and the distribution of phase over the rings, which we have already
described.
Figure 29.6 When the incident beam is linearly polarized, the emergent rings
of light will be incomplete. This figure shows the intensity distribution at the
rear facet of the slab in the cases where the incident E-field is (a) parallel to the
X-axis, (b) at 45° to X, and (c) parallel to the Y-axis.
[Plot axes: x/λ0 from −700 to 700; y/λ0 from −150 to 1250.]
Figure 29.6(c); this region would have had polarization along the X-axis. Unlike
the distribution of polarization over the rings, which is independent of the state
of incident polarization, the phase of the rings is very much a function of the
polarization of the incident beam. When the incident beam is linearly polarized, as
in Figure 29.6, the emergent phase (not shown) will have a constant value over
the entire area of each ring. (As before, the two rings will have a 180° phase
difference.) One may verify the above statements by considering the various
linearly polarized incident beams as superpositions of RCP and LCP beams and by
analyzing the corresponding superpositions at the exit facet of the crystal.
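The decomposition invoked above can be sketched with Jones vectors. In the minimal example below, a linearly polarized field at an arbitrary angle is projected onto right- and left-circular basis vectors; the two circular components have equal magnitude and a relative phase equal to twice the polarization angle. The sign convention chosen for RCP and LCP is an assumption (it differs between texts), and the helper names are hypothetical.

```python
import cmath
import math

SQRT2 = math.sqrt(2.0)
# Circular basis states as (Ex, Ey) Jones vectors; handedness sign is assumed.
RCP = (1 / SQRT2, -1j / SQRT2)
LCP = (1 / SQRT2, +1j / SQRT2)

def inner(u, v):
    """Complex inner product <u|v> of two Jones vectors."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def linear(angle_deg):
    """Jones vector of unit-amplitude linear polarization at angle_deg to X."""
    t = math.radians(angle_deg)
    return (math.cos(t), math.sin(t))

def circular_components(jones):
    """Return the (RCP, LCP) complex coefficients of a Jones vector."""
    return inner(RCP, jones), inner(LCP, jones)

def recombine(c_r, c_l):
    """Rebuild the Jones vector from its circular coefficients."""
    return tuple(c_r * r + c_l * l for r, l in zip(RCP, LCP))

c_r, c_l = circular_components(linear(45.0))
rebuilt = recombine(c_r, c_l)
```

Rotating the linear polarization therefore changes only the relative phase of the two circular components, which is why the response to any linear input follows from the RCP and LCP responses.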
Figure 29.7 The distributions of (a) intensity, (b) polarization ellipticity, and
(c) polarization rotation angle at the observation plane. The incident beam at the
entrance pupil of the focusing lens is assumed to be circularly polarized. The
ellipticity plot in (b) is coded in gray-scale, black corresponding to −45° (i.e.,
LCP) and white to +45° (i.e., RCP). The distribution of polarization rotation
angle depicted in (c) is also coded in gray-scale, but the black pixels in this case
represent −90° rotation from the X-axis and the white pixels represent +90°
rotation. As before, the jaggedness of the transition from black to white in the
lower part of (c) is caused by small numerical errors; since the discontinuity
represented by this transition is not a physical discontinuity, this jaggedness has
no physical significance.
[Plot axes: x/λ0 and y/λ0, each from −3750 to 3750.]
According to Figure 29.7(c), over the circumference of the rings the polarization
vector rotates from −90° at the bottom (i.e., E-field antiparallel to Y-axis) to 0° at
the top (E-field parallel to X) and back to +90° at the bottom (E-field parallel to Y).
The apparent discontinuity of polarization direction at the bottom of the rings
does not signify a physical discontinuity, as before, because the phase of the
rings (not shown here) also exhibits a 180° change during one full cycle around
the rings. The overall E-field distribution turns out to be continuous after all.
Figure 29.8 Distributions of (a) intensity, (b) polarization ellipticity, and (c)
polarization rotation angle within the pinhole at the exit facet of the crystal slab.
The incident beam at the entrance pupil of the focusing lens is assumed to be
circularly polarized. The ellipticity plot in (b) is coded in gray-scale, black
corresponding to −45° (i.e., LCP) and white to +45° (i.e., RCP). The distribution
of polarization rotation angle depicted in (c) is also coded in gray-scale,
but the black pixels in this case represent −90° rotation from the X-axis and the
white pixels represent +90° rotation.
[Plot axes: x/λ0 and y/λ0, each from −60 to 60.]
Figure 29.9 Same as Figure 29.7 except for the state of polarization of the
incident beam, which is linear along X in the present case.
[Panels (a), (b), (c); y/λ0 from −3750 to 3750.]
The apertures of classical optics simply block those parts of an incident wavefront
that fall outside the aperture, allowing everything else to go through intact.
Moreover, multiple apertures act upon an incident beam independently of each
other, polarization effects are usually negligible (i.e., scalar diffraction), and it is
not necessary to keep track of both the electric- and the magnetic-field components
of the beam.1
All of the above assumptions break down when apertures shrink to dimensions
comparable to or smaller than a wavelength.2,3 For example, transmission
through two small adjacent apertures cannot be treated by assuming that only one
aperture is open at a time, then adding the fields transmitted by the individual
apertures. (This is because the electric charge and current distributions in the
vicinity of one aperture are influenced by the radiation pattern of the other
aperture.) Polarization effects are extremely important for small apertures, as
exemplified by the case of a normally incident beam going through an elliptical
aperture in a thin metal film; whereas in the case of polarization (i.e., E-field)
parallel to the long axis of the ellipse there is negligible transmission, when the
incident polarization is rotated 90° to point along the ellipse’s minor axis, the
aperture transmits a substantial fraction of the incident light. Finally, to analyze
the interaction of light with small apertures, it is generally necessary to keep track
of both E and B components of the electromagnetic wave, as the modification of
one of these fields produces non-trivial changes in the other field’s distribution.4
This chapter presents the results of computer simulations based on the Finite
Difference Time Domain (FDTD)5 method for an elliptical aperture in a thin
metal film illuminated by a normally incident, monochromatic plane wave. Both
cases of incident polarization parallel and perpendicular to the long axis of the
† The co-authors of this chapter are Armis R. Zakharian, now at Corning Corp., and Jerome V. Moloney of the
University of Arizona.
30 Light transmission through small elliptical apertures
Maxwell’s equations
In developing an intuitive understanding of the electromagnetic field distribution
around an aperture, we rely heavily on Maxwell’s divergence equations,
∇ · D = ρ and ∇ · B = 0, where D = ε0εE, B = μ0H, and ρ is the electric charge
density.1,4 (ε0 and μ0 are the permittivity and permeability of free-space, while ε
is the relative permittivity of the local environment.) The divergence-free
nature of the magnetic field simply means that the B-field lines cannot be
interrupted; they can go around in loops or they can form unbroken infinite
lines, but they cannot originate, nor can they terminate, at specific points in
space. A similar argument applies to D-field lines, except in locations where
electric charges exist. When charges are present, lines of D originate on positive
charges and terminate on negative charges; everywhere else the D-lines can
twist and turn in space, but they cannot start or stop.
The other two of Maxwell’s equations, ∇ × H = J + ∂D/∂t and ∇ × E = −∂B/∂t,
are necessary not only for generating the E and B fields from electrical currents
(J is the local current density), but also to sustain these fields in source-free
regions of space.4 When highly conducting media (e.g., metallic bodies) are
present in a system, surface currents Is develop that support the magnetic field
H immediately outside the conducting surfaces. Aside from these electrical
currents that act as sources of the H-field, time variations of the E-field are
needed at each point of space to maintain the local B-field. In a similar vein,
aside from electric charges that act as sources and sinks for the D-field, time
variations of B are necessary to maintain the local E-field. The lines of the
current density J remain divergence-free, except in those locations where they
deposit electrical charges, that is, ∇ · J = −∂ρ/∂t.4,6
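The charge-continuity relation that closes the preceding paragraph follows directly from the two Maxwell equations that contain the sources. As a short worked step (a standard textbook derivation, written with the symbols defined above), take the divergence of the curl equation for H and use the fact that the divergence of any curl vanishes:

\[
0 = \nabla\cdot(\nabla\times\mathbf{H})
  = \nabla\cdot\mathbf{J} + \frac{\partial}{\partial t}\,(\nabla\cdot\mathbf{D})
  = \nabla\cdot\mathbf{J} + \frac{\partial\rho}{\partial t}
\quad\Longrightarrow\quad
\nabla\cdot\mathbf{J} = -\frac{\partial\rho}{\partial t}.
\]

Wherever the lines of J converge or diverge, charge must therefore accumulate or deplete at the corresponding rate.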
Inside an electrical conductor J = σE, where σ is the conductivity of the
material. Good conductors (e.g. metals) have large conductivities, which means
that the E-field must all but vanish from the interior of such bodies. When the fields
are oscillatory, any magnetic fields inside a good conductor will produce, by virtue
of the Faraday law, ∇ × E = −∂B/∂t, a local electric field. Since E-fields are not
allowed inside a conductor, time-varying magnetic fields, being intimately associated
with the electric fields, must also be absent. The interior of good conductors
thus remains free of charges, currents, and time-varying electromagnetic fields.
Charges and currents, however, can and do develop on the conductor’s surface,
where they give rise to E- and B-fields in the vicinity of the surface outside the
conductor.
The fifth equation of classical electrodynamics, the Lorentz law of force,
F = q(E + V × B), expresses the force F experienced by a particle of charge q
and velocity V.4 This equation is occasionally useful in developing a qualitative
picture of the current distribution in the vicinity of small apertures. For
example, within the skin depth of a conductor, the directions of E and B would
indicate the sense in which local surface currents are affected by the Lorentz
force acting on the charge carriers. Typically, the E-field is the dominant factor
in this regard, as evidenced by the constitutive relation J = σE. Any transverse
deflections of the current by the B-field are generally neglected, unless the Hall
conductivity of the medium is explicitly included in the constitutive relations.
Figure 30.1 (a) E-field lines of a static electric dipole p emerge from the
positive pole and disappear into the negative pole. (b) An oscillating electric
dipole emanates E-lines that reverse direction on spherical shells separated by λ/2.
The curl of the E-field creates B-field lines that surround the dipole in closed
circular loops. (c) A static magnetic dipole m is a closed loop of electrical current
whose B-field pattern is similar to the E-field of an electric dipole. (d) An
oscillating magnetic dipole behaves similarly to an electric dipole, albeit with the
roles of E and B reversed.
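[Figure 30.2. Labels: Y and Z axes; alternating E- and B-field fringes at λ/4 spacings above the conductor; surface current Is.]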
and incident plane-waves cancel each other out. In the half-space above the
conductor, interference between the incident and reflected beams creates standing-
wave fringes of the electric-field E and the magnetic field B. The B-field is
strongest at the surface of the conductor, reversing sign at intervals of Δz = λ/2,
where its adjacent peaks are located. The peaks of the E-field, also located at λ/2
intervals, are staggered relative to the B-field peaks, thus coinciding with planes of
vanishing magnetic field.
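The fringe positions described above can be reproduced with the idealized standing-wave envelopes for normal incidence on a perfect conductor at z = 0, where the tangential E-field varies as sin(2πz/λ) and the B-field as cos(2πz/λ). The sketch below tabulates both at quarter-wave steps. The perfect-conductor idealization, the unit peak amplitudes, and the overall signs are assumptions made for illustration; the function name is hypothetical.

```python
import math

def standing_wave_envelopes(z_over_lambda):
    """Normalized (E, B) spatial envelopes at height z above a perfect conductor.

    z_over_lambda: height above the surface in units of the wavelength.
    """
    kz = 2.0 * math.pi * z_over_lambda
    # E vanishes at the surface; B is maximum there.
    return math.sin(kz), math.cos(kz)

# Sample the envelopes at z = 0, lambda/4, lambda/2, 3*lambda/4.
samples = {z: standing_wave_envelopes(z) for z in (0.0, 0.25, 0.5, 0.75)}

for z, (e_env, b_env) in samples.items():
    print(f"z = {z:4.2f} lambda:  E = {e_env:+.3f}   B = {b_env:+.3f}")
```

The B envelope peaks at the surface and changes sign every half wavelength, while the E envelope peaks a quarter wavelength away, on the planes where B vanishes.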
At the upper surface of the conductor, where the E-field is zero, the B-field
is sustained by the surface current Is. (Although Is is shown antiparallel to
the standing-wave’s E-field at Δz = λ/4, in reality Is is 90° behind this E-field,
reaching maximum when the E-field directly above the surface is going through
zero on its way to the peak.) In the half-space above the conductor, in the absence
of any electrical charges and currents, the E-field is sustained by the time-variations
of the B-field (∇ × E = −∂B/∂t), and vice versa (∇ × H = ∂D/∂t).
In an imperfect conductor, where conductivity is large but finite, the E- and
B-fields penetrate slightly beneath the surface, producing a Lorentz force on the
moving charges that comprise the surface current. While the E-field provides the
current’s driving force, the magnetic component of the Lorentz force attempts to
drive the surface current further down into the conductor (radiation pressure). In
general, the surface current Is need not be in-phase with the penetrating E-field,
since, at optical frequencies, the electrical conductivity σ is a complex number.
Figure 30.3 A small elliptical aperture in the system of Figure 30.2, with its
major axis parallel to the surface current Is, distorts the current distribution by
diverting its path to avoid the hole. The B-lines immediately above the surface
bend toward and into the aperture, without breaking up. The E-field in and
around the aperture gets redistributed in a way that supports the B-field while
staying away from the long side-walls of the hole. The surface currents in the
vicinity of the aperture deposit opposite charges around the sharp corners of the
ellipse, causing the E-lines to break up at these corners.
and sinks for the E-lines in their neighborhood. Elsewhere, lack of any significant
amount of charge means that the E-lines cannot break up, but rather they must
twist and turn continuously as they adjust to the new environment created by the
presence of the hole. The E-field in and around the aperture must be distributed in
a way that would support the B-field (through the curl equations), but, because a
parallel E-field cannot exist on conducting surfaces, it must also stay away from
the interior walls of the hole. Figure 30.3 shows a possible way for the E-lines
just above the aperture to dodge the side-walls and concentrate near the center, as
they drop into the hole from above. The bundle of E-lines in the middle of the
hole (parallel to the ellipse’s long-axis) then acts as a source of circulating
magnetic fields that wrap around the long axis (∇ × H = ∂D/∂t), thus supporting
the B-field above, below, and inside the aperture.
Figure 30.4(a) shows that, in the central XZ cross-section of the aperture, the
B-lines above the aperture, without breaking up, thin down and sag toward and
into the hole. Magnetic energy thus leaves the mid-section of the strong B-fringe
above the hole and leaks into the hole and beyond. The behavior of the E-field in
the central YZ-plane is depicted in Figure 30.4(b). Here the strong fringe, which is
not immediately above the aperture but a distance of Δz = λ/4 away, is squeezed
laterally toward the hole’s center, while, at the same time, leaking some of its
energy into the aperture. Some of the E-lines originate or terminate on the
charges deposited by the surface current Is on the sharp corners of the ellipse.
(The dashed lines in Figure 30.4(b) represent the bending of the E-field out of the
YZ-plane toward charges that reside on the side-walls near these sharp corners.)
Note that the charge polarity is such that the E-lines above have the same direction
as those inside and below the aperture. It is important to recognize that
the surface current Is lags 90° behind the E-field of the first fringe. Thus, when
the E-field directly above the aperture reaches its maximum along the negative
Y-axis, Is, which has been traveling in the positive Y-direction until that moment,
has stopped and is beginning to reverse direction. This explains why the charges
reach their maximum strength when the E-field immediately above the aperture is
at a peak, and also clarifies the reasoning behind the polarity chosen for the
charges in Figure 30.4(b).
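The timing argument above can be made explicit with a short worked step. Assume, for illustration only, that the E-field of the first fringe varies as cos ωt, and write the surface current with the stated 90° lag; the amplitudes E_0 and I_0 and the choice of time origin are notational assumptions, and the sign at a given corner depends on which end of the ellipse is considered:

\[
E(t) = E_0\cos\omega t, \qquad
I_s(t) = I_0\cos\!\left(\omega t - \tfrac{\pi}{2}\right) = I_0\sin\omega t, \qquad
\frac{dq}{dt} = \pm I_s(t)
\;\Longrightarrow\;
q(t) = \mp\,\frac{I_0}{\omega}\cos\omega t .
\]

The accumulated charge thus oscillates in step with the E-field of the first fringe (up to a sign): |q| is largest when |E| peaks, which is the same instant at which the current passes through zero and reverses.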
Aside from the incident beam, which is fixed at the outset, all other radiation in
the system of Figure 30.3 is generated by the surface currents Is (and the charges
deposited by Is around the sharp corners of the aperture). The same is true of the
system of Figure 30.2, with its uniform current confined to the upper surface of
the conductor. Any differences between the radiation fields in the systems of
Figure 30.2 and Figure 30.3 must therefore arise from the difference between
the two surface current distributions. Subtracting the (uniform) surface current
of Figure 30.2 from that of Figure 30.3 yields the distribution sketched in
Figure 30.5(a).

Figure 30.4 (a) The B-field above the aperture of Figure 30.3, without breaking
up, thins down and sags into the hole. (b) The E-field, whose strong fringe is not
immediately above the aperture but a distance of Δz = λ/4 away, is squeezed
toward the center of the hole, while, at the same time, leaking some of its energy
into the aperture. The E-lines can originate or terminate on the charges deposited
by the surface current Is on the sharp corners of the ellipse. (Dashed lines represent
the bending of some of the E-field out of the YZ-plane toward charges that reside
on the side-walls near these sharp corners.) Note that the charge polarity is such
that the E-lines above have the same direction as those inside and below the
aperture.

Far from the aperture, of course, the perturbation caused by the
aperture is small and the two surface currents must cancel out. In the vicinity of
the aperture we find two loops of current circulating in opposite directions, as well
as positive and negative charges in those regions where the divergence of the local
current is non-zero. As shown in Figure 30.5(b), these circulating currents are
equivalent to a pair of oppositely oriented magnetic dipoles +m and −m (i.e., a
magnetic quadrupole, assuming their separation is much less than a wavelength);
the charges localized on the aperture’s sharp corners give rise to an oscillating
electric dipole p. Thus, adding the dipoles p and m to the system of Figure 30.2
should transform it over to the system of Figure 30.3.
Figure 30.5(c) shows that, in the vicinity of the aperture, the combined radi-
ation pattern of the electric dipole and the magnetic quadrupole consists of a
Figure 30.5 (a) Surface current distribution obtained when the (uniform)
surface current of Figure 30.2 is subtracted from that of Figure 30.3. Charges
appear in regions where the current’s divergence is non-zero. (b) The net
effect of the aperture on the uniform surface current of Figure 30.2 is the
addition of an electric dipole p and two loops of current that circulate in
opposite directions; each current loop is a magnetic dipole m. (c) In the
vicinity of the aperture, the combined radiation pattern of the electric and
magnetic dipoles consists of a circulating B-field around the major axis of the
ellipse and an E-field pattern that tends to stay away from the long side-walls
of the aperture.
circulating B-field around the major axis of the ellipse, and an E-field distribution
that tends to stay away from the long side-walls of the aperture. These fields,
when added to the E- and B-fringes of Figure 30.2, produce the field profiles of
Figures 30.3 and 30.4. The circulating magnetic field around the ellipse’s major
axis in Figure 30.5(c) is responsible for the bending of the B-lines toward and into
the hole, as sketched in Figures 30.3 and 30.4(a). Similarly, superposition of the
E-field pattern of Figure 30.5(c) with the uniform E-fringe that exists above an
apertureless mirror gives rise to the E-field pattern of Figure 30.3 in the XY-plane
immediately above the aperture.
In practice, the metallic film has a finite thickness, and the combined radiation
by the dipole p and quadrupole m of Figure 30.5(b) must vanish within the
body of the film. To this end, the magnetic dipoles may have to tilt sideways, one
to the right and the other to the left, so that everywhere inside the metal film their
E- and B-fields will be canceled by the corresponding fields of the electric dipole.
Physically, the sideways tilt of the m dipoles is a consequence of the induced
surface currents on the interior side-walls of the aperture, which currents also
help to support the B-field adjacent to these side-walls; see Figure 30.4(a).
All in all, the primary source of radiation through the aperture of Figure 30.3
seems to be the m quadrupole depicted in Figure 30.5(b); the induced dipole p
in this system is relatively weak and plays a secondary role, namely, canceling
the quadrupole’s radiation inside the metal film. In general, quadrupolar sources
are weak radiators, thus accounting for the weakness of transmission through an
elliptical aperture illuminated by a plane wave whose polarization direction
coincides with the major axis of the ellipse.
Figure 30.6 shows computed plots of Ex, Ey, Ez in the XY-plane located 20 nm
above the surface of the conductor in the system of Figure 30.3. The simulated
conductor is a 124 nm-thick film of silver (n + ik = 0.226 + i6.99 at λ = 1.0 μm)
having an 800 nm-long, 100 nm-wide elliptical aperture.7 The magnitude of each
field component is plotted in the top row of Figure 30.6, and the corresponding
phase profile appears below it. For our purposes, the main utility of the phase
distribution is to indicate the relative orientation of the various field components.
For instance, if the phase of Ey at a given location happens to be 0°, then if the
phase of Ex at that location turns out to be equal (or nearly equal) to 0°, we will
know that Ex x̂ + Ey ŷ oscillates back and forth between the first and third
quadrants of the XY-plane. However, if the phase of Ex hovers around ±180°,
then Ex x̂ + Ey ŷ oscillates between the second and fourth quadrants.
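The phase rule stated above can be sketched in a few lines. This is only an illustration: the function name, the tolerance of 45°, and the "elliptical" label for intermediate phase differences are assumptions introduced here, not part of the simulations reported in the text.

```python
def oscillation_quadrants(phase_ex_deg, phase_ey_deg, tol_deg=45.0):
    """Classify how the transverse vector Ex*x_hat + Ey*y_hat oscillates.

    Hypothetical helper: returns "first-third" when Ex and Ey are (nearly)
    in phase, "second-fourth" when they are (nearly) 180 degrees apart,
    and "elliptical" for anything in between.
    """
    # Wrap the phase difference into the interval [-180, 180).
    diff = (phase_ex_deg - phase_ey_deg + 180.0) % 360.0 - 180.0
    if abs(diff) <= tol_deg:
        return "first-third"
    if abs(abs(diff) - 180.0) <= tol_deg:
        return "second-fourth"
    return "elliptical"


print(oscillation_quadrants(0.0, 0.0))     # first-third
print(oscillation_quadrants(180.0, 0.0))   # second-fourth
```

Only the phase difference matters, which is why the phase maps are read relative to one another rather than in absolute terms.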
The E-field distribution of Figure 30.6 is consistent with the qualitative
behavior sketched in Figures 30.3, 30.4(b), and 30.5(c). The Ex component bends
the central field lines toward the middle of the aperture, and pushes the peripheral
lines further away, thus ensuring that the long side-walls repel the parallel E-field.
Figure 30.6 Computed plots of Ex, Ey, Ez in an XY-plane located a short distance
(Δz = 20 nm) above the surface of the conductor in the system of Figure 30.3.
Top row: amplitude, bottom row: phase. The silver film is 124 nm thick, the
aperture is 800 nm long and 100 nm wide, and the radiation wavelength is λ = 1 μm.
The Ey component is strengthened near the center of the aperture because the field
lines are pushed upward and squeezed laterally toward the center. Finally, the Ez
component confirms the presence of charges of opposite sign at and around the sharp
corners of the aperture (∇ · D = ρ). These pictures are consistent with the presence
of a weak electric dipole and a magnetic quadrupole in and around the elliptical
aperture.
Computed amplitude and phase plots of Ey, Ez in the central YZ-plane of the
aperture are shown in Figure 30.7. The bands of Ey above the aperture are the
standing-wave fringes created by the interference between the incident and
reflected beams. The weak nature of transmission through the aperture is evident
in the very small perturbation of the fringes, as they sag ever so slightly to fill the
top of the aperture. The profile of Ez, once again, confirms the accumulation of
electric charges around the sharp corners of the hole. Moreover, it shows that the
Figure 30.7 Computed amplitude and phase plots of Ey, Ez in the central
YZ-plane in the system of Figure 30.3. The silver film’s cross-section is indicated
with dashed lines. The standing wave fringes are only slightly perturbed by the
aperture.
charges on the top facet of the metal film, while much stronger than those on the
bottom facet, have the same sign as the charges on the bottom; in other words, the
top and bottom charges are both positive at one end of the ellipse, and both
negative at the opposite end.
Figure 30.8 shows plots of Hx, Hy, Hz in the XY-plane 20 nm above the surface
of the conductor, while amplitude and phase plots of Hx and Hz in the central
XZ-plane appear in Figure 30.9. As expected from the preceding discussion of
Figures 30.3 and 30.4, the magnetic fringe nearest the surface is seen to leak into
the aperture by bending the H-lines near the corners of the ellipse toward the
center and down into the hole.
Computed plots of Ex, Ey, Ez in the XY-plane 20 nm below the conductor are
shown in Figure 30.10, and the corresponding H-field distributions appear in
Figure 30.11. While the profiles of these fields confirm the behavior expected
from our earlier qualitative analysis, their small magnitudes testify to the weak
nature of radiation by the m quadrupole (and the accompanying p dipole)
induced by the incident beam in the vicinity of the aperture of Figure 30.3.
Figure 30.8 Computed plots of Hx, Hy, Hz in the XY-plane 20 nm above the
conductor surface in the system of Figure 30.3. Top row: amplitude; bottom row:
phase.
Figure 30.9 Amplitude and phase plots of Hx, Hz in the central XZ-plane in the
system of Figure 30.3. The silver film’s cross-section is indicated with dashed lines.
Figure 30.12 shows distributions of the magnitude |S| of the Poynting vector in
various cross-sections of the system of Figure 30.3. The superimposed arrows on
each plot show the projection of S in the corresponding plane.7 For instance, in
the XZ cross-section depicted in Figure 30.12(a), the arrows represent Sx x̂ + Sz ẑ,
whereas in the YZ cross-section of Figure 30.12(b) the arrows correspond to
the projection Sy ŷ + Sz ẑ of the Poynting vector on the YZ-plane. The plots in
Figure 30.12(c) and (d) show the distributions of |S| in the XY-planes immediately
above and below the aperture. In the absence of an aperture, S is essentially
zero everywhere, as the reflected beam cancels out the incident beam’s energy
flux. When the aperture is present, however, the fields are redistributed in such a
way as to draw the incident optical energy toward the aperture. In the present
case, the energy flows in from the periphery, fails to find a way through the
aperture, bounces back and returns toward the source in the region directly above
the aperture. In the process, several vortices are formed, where the incoming
energy makes a sharp turnaround and heads back toward the source.
Figure 30.10 Computed plots of Ex, Ey, Ez in the XY-plane 20 nm below the
bottom facet of the conductor in the system of Figure 30.3. Top row: amplitude;
bottom row: phase.
In Figure 30.12(d) the Poynting vector S = ½ Re(E × H*) at the bottom of the hole
has a magnitude |S| ≈ 2.5 × 10⁻⁶ W/m², consistent with the transmitted E-field
of 0.06 V/m and H-field of 2.3 × 10⁻⁴ A/m, considering the large phase difference
of Δφ ≈ 70° between the E- and H-fields near the bottom of the aperture; see
Figures 30.10 and 30.11. Since the incident plane-wave is assumed to have Ey = 1.0
V/m, Hx = Ey/Z0 = 2.65 × 10⁻³ A/m (free-space impedance Z0 ≈ 377 Ω), which
correspond to an incident power density of 1.32 × 10⁻³ W/m², the power transmission
efficiency η at the center of the elliptical aperture of Figure 30.3 is seen to be just
under 0.2%. We will see in the next section that when the incident polarization is
rotated 90° (to point along the minor axis of the ellipse), the transmission efficiency
through the aperture increases to η ≈ 93%, a nearly 500-fold improvement.
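The arithmetic behind these figures can be checked with a short script. It is a sketch under stated assumptions: the helper name is hypothetical, E and H are taken to be mutually orthogonal so that |S| = ½ |E||H| cos Δφ, and the field amplitudes are the rounded values quoted above, so the result agrees with the quoted |S| and η only to within that rounding.

```python
import math

Z0 = 376.73  # free-space impedance in ohms (the text rounds this to 377)


def time_averaged_poynting(e_amp, h_amp, phase_diff_deg):
    """Return 0.5*|E|*|H|*cos(phase difference), assuming E is orthogonal to H."""
    return 0.5 * e_amp * h_amp * math.cos(math.radians(phase_diff_deg))


# Incident plane wave: Ey = 1.0 V/m, Hx = Ey / Z0, with E and H in phase.
s_inc = time_averaged_poynting(1.0, 1.0 / Z0, 0.0)

# Transmitted fields quoted in the text for the bottom of the hole.
s_out = time_averaged_poynting(0.06, 2.3e-4, 70.0)

eta = s_out / s_inc
print(f"S_inc = {s_inc:.3e} W/m^2")
print(f"S_out = {s_out:.3e} W/m^2")
print(f"eta   = {100.0 * eta:.2f} %")
```

The cosine factor matters here: with a 70° phase difference only about a third of the product ½ |E||H| is carried away as net power.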
Figure 30.11 Computed plots of Hx, Hy, Hz in the XY-plane 20 nm below the
bottom facet of the conductor in the system of Figure 30.3. Top row: amplitude;
bottom row: phase.
Figure 30.12 Profiles of the magnitude |S| of the Poynting vector in various
cross-sections of the system of Figure 30.3. The superimposed arrows show
the projection of S in the corresponding plane. (a) Central XZ-plane.
(b) Central YZ-plane. (c, d) XY-planes located 20 nm above and below the
aperture.
The contribution of the magnetic dipoles is to enhance the E-field at the center of
the aperture, while weakening it in the outer regions.
Figure 30.14(d) shows that in the XY-plane directly above the aperture, the
B-field profile is shaped by competition between the electric dipole p and the
magnetic dipoles m. The electric dipole dominates near the center but, further
Figure 30.13 When the incident E-field is parallel to the minor axis of an
elliptical aperture, the surface currents Is deposit charges at and around the long
side-walls of the aperture. These oscillating charges radiate as an electric dipole
flanked by a pair of magnetic dipoles, creating circulating magnetic fields around
the ellipse’s minor axis that push the incident B-lines upward and sideways.
away, the magnetic dipoles dictate the B-field’s behavior. The dotted B-lines
near the sharp corners of the ellipse in Figure 30.14(d) show the field leaving the
XY-plane to enter/exit the hole vertically (i.e. in the Z-direction). Although not
shown in this figure, B-lines that enter the hole from above close the loop by
circling beneath the metal film and returning through the hole to reconnect with
the B-lines above the film; see Figure 30.15.
The surface charges and currents of Figure 30.14(a) create magnetic fields in
the free-space regions inside the hole as well as those above and below the metal
surface. The B-field of the electric dipole p combines with that of the magnetic
dipoles m to produce closed loops in the vicinity of the aperture, as shown in
Figure 30.15. The solid B-lines in this figure bulge above and below the metal
surface, while the dashed lines hug the conductor’s top and bottom surfaces. (The
B-field cannot penetrate into the conductor, but, as it emerges from the hole, it
bends above and below the surface in such a way as to bring the field lines close
to the metallic surface.) In all cases, the lines of B must form closed loops to
guarantee the divergence-free nature of the field. Since neither E- nor B-fields can
exist within the conductor, the fields radiated by the electric dipole p must cancel
out those of the magnetic dipoles m everywhere inside the metallic medium.
The radiation emanating from these dipoles, however, permeates the interior of
the hole as well as the free-space regions on both sides of the conductor. To get in
and out of the hole, the B-lines of Figure 30.15 appear to descend through one of
the current loops that constitutes a magnetic dipole in Figure 30.14(b), then return
through the other loop. Note the change of direction of the magnetic field at the
upper surface of the elliptical aperture: the direction of B just above the hole is
Figure 30.14 (a) Surface currents and the accompanying charge distribution
produced by the elliptical aperture of Figure 30.13. When added to the
uniform current of Figure 30.2, these currents produce the Is pattern shown in
Figure 30.13. (b) The current loops in (a) are equivalent to a pair of magnetic
dipoles, m, while the charges deposited on opposite sides of the aperture
constitute an electric dipole p. (c) In the XY-plane immediately above the
aperture, the E-field is dominated by the electric dipole p. (d) In the XY-plane
immediately above the aperture, the B-field profile is shaped by competition
between the electric dipole p and the magnetic dipoles m. Dotted B-lines
near the sharp corners of the ellipse show the B-field leaving the XY-plane to
enter/exit the hole.
opposite to that beneath the hole’s upper surface. This 180° phase shift, dictated
by the presence of the (uniform) Is on the top surface of the elliptical aperture in
Figure 30.14(a), will disappear when the fringes of Figure 30.2 are added to the
fields produced by p, m, and –m to yield the total field in and around the aperture.
The induced electric charges on the surfaces surrounding the aperture produce
an oscillating E-field in the short gap between the long side-walls as well as in the
regions immediately above and below the aperture. The time rate of change of
Figure 30.15 With reference to Figure 30.14(b), the B-field of the electric
dipole p combines with that of the magnetic dipoles m to produce closed loops
in and around the aperture. The solid B-lines bulge above and below the metal
film, while the dashed B-lines hug the conductor’s top and bottom surfaces.
this field, ∂D/∂t, which is equivalent to an electric current density J across the
gap, creates circulating magnetic fields around the short axis of the ellipse.4
These B-fields by themselves, however, are not sufficient to explain the field
profile depicted in Figure 30.15, and must be augmented by the fields produced
by the circulating currents around the ellipse’s sharp corners (i.e., the m
dipoles) to yield a complete picture. Moreover, inside the metallic medium, the
E- and B-fields of the p dipole cannot vanish without the compensating contri-
butions of the m dipoles.
Figure 30.16 shows cross-sections of the system of Figure 30.13 in YZ- and
XZ-planes. Since Is lags 90° behind the incident E-field immediately above the
aperture, the accumulating charges on and around the side-walls produce
electric fields opposite in direction to the incident E-field. The E-lines may now
start on positive charges and end on negative charges (∇ · D = ρ), as shown in
Figure 30.16(a). This change of direction of the E-field causes a 180° phase
shift in Ey from above to below the aperture. The E-fringe just above the
aperture thus becomes weaker, sharing some of its energy with the E-field
inside and below the aperture.
The XZ cross-section of the system of Figure 30.13 depicted in Figure 30.16(b)
shows how the oscillating electric dipole p and magnetic dipoles m push the
B-fringe above the aperture upward and sideways to make room for circulating
B-fields that surround the short axis of the elliptical aperture. The resulting
redistribution of the magnetic energy of the B-fringe above the hole thus makes it
Figure 30.16 Cross-sections of the system of Figure 30.13 in YZ- and XZ-planes.
(a) The charges accumulating on the aperture’s side-walls produce an E-field
opposite in direction to the incident field. The lines of E may now start on
positive charges and end on negative charges. (b) The dipoles p, m, and –m of
Figure 30.14(b) push the B-fringe above the aperture upward and sideways to make
room for circulating B-fields that surround the short axis of the elliptical aperture.
possible for some of the energy stored in this fringe to leak into the hole as well as
the space below the hole. (The B-field distribution inside the aperture and in the
region below the metal film is the same as that in Figure 30.15, since the added
fringes contribute only to the half-space above the conductor.) The divergence-free
nature of the B-lines requires their continuity, which is evident in Figure 30.16(b),
in contrast to the E-lines of Figure 30.16(a), which break up whenever they meet
electrical charges.
Figure 30.17 shows computed plots of Ex, Ey, Ez in the XY-plane 20 nm above
the surface of the conductor in the system of Figure 30.13 (top row: magnitude;
bottom row: phase). As before, the 124 nm-thick silver film used in these
simulations has n + ik = 0.226 + i6.99 at λ = 1.0 μm, and the ellipse’s diameters
along its major and minor axes are 800 nm and 100 nm, respectively.7 The strong
Figure 30.17 Computed plots of Ex, Ey, Ez in the XY-plane 20 nm above the
conductor’s surface in the system of Figure 30.13. Top row: amplitude; bottom
row: phase. The silver film is 124 nm thick, the aperture is 800 nm long and
100 nm wide, and the radiation wavelength is λ = 1 μm. The aperture boundaries
are indicated with dashed lines.
Figure 30.18 (Left) amplitude and phase of Ey in the central XZ-plane; (right)
amplitude and phase of Ey, Ez in the central YZ-plane in the system of
Figure 30.13. The silver film’s cross-section is indicated with dashed lines. The
fringes in the two panels are differently colored because the color scale for Ey in
the YZ-plane has been greatly expanded by two (barely visible) hot spots on the
sidewalls near the bottom of the hole.
phase of Hx, Hz in the central XZ cross-section, while the right panel shows the
distribution of Hx in the central YZ-plane. The magnetic field’s behavior in these
pictures is in accord with the qualitative behavior sketched in Figure 30.16(b).
Note, in particular, that the profile of Hz in Figure 30.20 resembles the
z-component of the circulating B-field in Figure 30.16(b). Note also the draining
of magnetic energy out of the B-fringe above the hole, and its redistribution not
only in the form of magnetic fields inside and below the aperture, but also in the
enhanced values of Hx directly above the conductor’s surface.
Plots of Ex, Ey, Ez in the XY-plane 20 nm below the bottom surface of the
conductor are shown in Figure 30.21, and the corresponding magnetic-field plots
appear in Figure 30.22. These pictures are in full agreement with the qualitative
diagrams of Figures 30.14–30.16.
Figure 30.23 shows distributions of the magnitude |S| of the Poynting vector in
various cross-sections of the system of Figure 30.13. The superimposed arrows
on each plot show the projection of S in the corresponding plane.7 For instance, in
Figure 30.19 Computed plots of Hx, Hy, Hz in the XY-plane 20 nm above the
surface of the conductor in the system of Figure 30.13. Top row: amplitude;
bottom row: phase. The aperture boundaries are indicated with dashed lines.
Figure 30.20 (Left) amplitude and phase of Hx, Hz in the central XZ-plane;
(right) amplitude and phase of Hx in the central YZ-plane in the system of
Figure 30.13. The silver film’s cross-section is indicated with dashed lines.
density at the center of this aperture is thus η ≈ 93%, which is nearly 500 times
greater than that obtained when the incident polarization was parallel to the ellipse’s
long axis. (η is the ratio of |S| at the aperture’s center just below the conductor
to the incident plane-wave’s optical power density, |Sinc| ≈ 1.32 × 10⁻³ W/m².)
Several factors appear to have contributed to this strong performance (compared to
the case of parallel polarization), among them, more electrical charges and stronger
surface currents (especially on the bottom facet of the conductor), and a greater
separation between the m magnetic dipoles, which tend to cancel each other out
when they are close together.
Concluding remarks
We have analyzed the transmission of light through a small elliptical aperture in a
thin silver film at λ = 1.0 μm. Both cases of incident polarization parallel and
perpendicular to the major axis of the ellipse were considered. The transmission
efficiency η was found to be low for parallel polarization and high for perpendicular
polarization.
Figure 30.21 Computed plots of Ex, Ey, Ez in the XY-plane 20 nm below the
bottom facet of the conductor in the system of Figure 30.13. Top row: amplitude;
bottom row: phase.
Figure 30.22 Computed plots of Hx, Hy, Hz in the XY-plane 20 nm below the
bottom facet of the conductor in the system of Figure 30.13. Top row: amplitude;
bottom row: phase.
and 136% for h = 248 nm), indicating propagation through the hole (along
the Z-axis) of a guided mode whose E-field is largely parallel to the ellipse’s
short axis. We also found that an infinitely long, 100 nm-wide slit exhibits
strong transmission for an incident polarization aligned with the narrow
dimension of the slit (η ≈ 69% at the center of the slit, 36 nm below a 124 nm-thick
silver film).
It thus appears that achieving a large η requires an aperture that can excite
strong oscillator(s) on the upper surface of the film, which would then induce
strong oscillations on the lower facet, thereby creating the conditions for the
passage of a substantial amount of electro-magnetic energy through the sub-
wavelength opening in the metal film. The ability of a hole (or slit) to support a
guided mode that can be excited by the incident polarization appears to be critical
for achieving large transmission, especially for thicker films. Recent reports of
various aperture designs that have significant throughputs (compared with simple
circular or square-shaped apertures)3,10 indicate that the aforementioned prin-
ciples, far from being specific to elliptical holes in thin metal films, have a broad
range of application.
Figure 30.23 Profiles of the magnitude |S| of the Poynting vector in various
cross-sections of the system of Figure 30.13. The superimposed arrows
show the projection of S in the corresponding plane. (a) Central XZ-plane.
(b) Central YZ-plane. (c, d) XY-planes located 20 nm above and below the
aperture.
Figure 31.2 Computing the 0,0 mode of the resonator shown in Figure 31.1
using the method of Fox and Li. (a) The assumed initial distribution, having
uniform amplitude and constant phase across a wide circular aperture. (b)
Computed intensity distribution at the mid-plane of the cavity, obtained after 80
iterations. (c) Same as (b) but showing the logarithm of intensity on a scale of 1
(white) to 10⁻⁵ (black). (d) Distribution of the phase in the mid-plane of the
cavity corresponding to the steady-state intensity distribution shown in (b) and
(c). In this picture a white pixel represents a +180° phase angle, a black pixel
represents a −180° phase angle, and the gray pixels represent the continuum of
values in between.
For the above simulation a plot of the power attenuation coefficient γ versus
iteration number is shown in Figure 31.3. γ is the ratio of the optical power
contained in the beam after a given iteration to the same quantity before the
iteration. It thus represents, for the particular mode under consideration, the
fraction of optical power that survives one round trip of the beam, the fractional
loss of the cavity being 1 − γ. The steady-state value of γ is also related to the
eigenvalue of the mode under consideration; the mode itself is an eigenfunction of
the cavity. In the present example, where the steady-state value of γ is 0.97, the
losses for the lowest-order mode are indeed very small.
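The way γ is computed and settles to a steady value can be illustrated with a one-dimensional sketch. Several assumptions here are not from the text: strip mirrors replace the circular ones, the cavity is taken to be symmetric and confocal (in which case one mirror-to-mirror pass, mirror curvature included, reduces to a finite Fourier transform), and the Fresnel number N = 0.5 is a hypothetical choice. Each iteration below is a single pass between identical mirrors rather than a full round trip, so the numbers are not expected to reproduce Figure 31.3.

```python
import numpy as np


def fox_li_confocal_strip(fresnel_number=0.5, n_samples=201, n_iter=80):
    """Fox-Li iteration for a symmetric confocal cavity with strip mirrors.

    Coordinates are scaled by the mirror half-width a, so x spans [-1, 1] and
    the only parameter is N = a**2 / (wavelength * mirror separation).
    Returns the per-pass power ratios gamma and the final field profile.
    """
    x = np.linspace(-1.0, 1.0, n_samples)
    dx = x[1] - x[0]
    # One mirror-to-mirror pass, sampled as a matrix (constant phase dropped).
    kernel = np.sqrt(fresnel_number) * dx * np.exp(
        -2j * np.pi * fresnel_number * np.outer(x, x))
    field = np.ones(n_samples, dtype=complex)  # uniform amplitude, flat phase
    gammas = []
    for _ in range(n_iter):
        new_field = kernel @ field
        power_before = np.sum(np.abs(field) ** 2)
        power_after = np.sum(np.abs(new_field) ** 2)
        gammas.append(power_after / power_before)
        field = new_field / np.sqrt(power_after)  # renormalize each pass
    return np.array(gammas), field


gammas, mode = fox_li_confocal_strip()
print("first pass gamma:", gammas[0])
print("steady-state gamma:", gammas[-1])
```

The field is renormalized after every pass so that only its shape evolves; the attenuation is tracked separately through the recorded power ratios.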
Figure 31.3 Evolution of the power attenuation coefficient γ during the simulation
that led to the 0,0 mode shown in Figure 31.2. The computation stabilizes
after about 80 iterations, and the steady-state value of γ is close to 0.97.
Higher-order modes
Although the method of Fox and Li is ideally suited for computation of the
lowest-order mode of the cavity, under special circumstances (and sometimes
with the aid of special tricks) it is possible to compute some of the higher-order
modes as well. As an example, consider the initial distribution shown in
Figure 31.4(a), which consists of four identical lobes, each having the same
uniform intensity distribution. Although not shown, it is also assumed that the
phase is 0° for the pair of lobes along one diagonal and 180° for the opposite pair.
The stage is thus set for excitation of the so-called 1,1 mode of the cavity.
Figures 31.4(b)–(d) show the computed 1,1 mode obtained from the initial
distribution of Figure 31.4(a) after 64 iterations. The plot of attenuation coefficient
γ versus iteration number shown in Figure 31.5 reveals that the steady-state
is reached after only about 40 iterations and that the final value of γ is 0.87. That
this value of γ is less than that for the 0,0 mode is consistent with the observation
that the 1,1 mode is more spread out and, therefore, must suffer higher truncation
losses at the apertures of the mirrors. For comparison with the steady-state dis-
tribution, two of the intermediate distributions obtained in this simulation are
shown in Figure 31.6. In this figure the intensity plots appear on the left-hand side
and the corresponding log(intensity) plots appear on the right-hand side. The
patterns in Figures 31.6(a), (b) are obtained after 6 and 17 iterations, respectively.
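A one-dimensional analogue shows why a suitably seeded start can land on a higher-order mode with a smaller γ. The sketch rests on assumptions that are not from the text: strip mirrors in a symmetric confocal cavity, a hypothetical Fresnel number of 0.5, a two-lobe odd start standing in for the four-lobe pattern, and an explicit parity projection after each pass as one possible "special trick" for keeping the lowest-order mode from creeping in through rounding errors.

```python
import numpy as np


def dominant_mode_gamma(initial_field, fresnel_number=0.5, n_iter=64, parity=None):
    """Iterate one-pass propagation; optionally enforce even (+1) or odd (-1) parity."""
    n = initial_field.size
    x = np.linspace(-1.0, 1.0, n)  # mirror coordinate in units of half-width
    dx = x[1] - x[0]
    kernel = np.sqrt(fresnel_number) * dx * np.exp(
        -2j * np.pi * fresnel_number * np.outer(x, x))
    field = initial_field.astype(complex)
    gamma = 0.0
    for _ in range(n_iter):
        new_field = kernel @ field
        if parity is not None:
            # Remove any component of the wrong symmetry before measuring power.
            new_field = 0.5 * (new_field + parity * new_field[::-1])
        gamma = np.sum(np.abs(new_field) ** 2) / np.sum(np.abs(field) ** 2)
        field = new_field / np.linalg.norm(new_field)
    return gamma, field


x = np.linspace(-1.0, 1.0, 201)
even_start = np.ones_like(x)   # uniform amplitude, flat phase
odd_start = np.sign(x)         # two lobes with phases of 0 and 180 degrees

gamma_even, field_even = dominant_mode_gamma(even_start)
gamma_odd, field_odd = dominant_mode_gamma(odd_start, parity=-1)
print("gamma (even start):", gamma_even)
print("gamma (odd start): ", gamma_odd)
```

Because the propagation kernel is symmetric in x, an odd start has no overlap with the even lowest-order mode, and the iteration settles on the lowest-loss odd mode instead.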
[Figure 31.4: panels (a)–(d); axes x/λ0 and y/λ0, each spanning –300 to 300.]
[Figure 31.5: attenuation coefficient versus number of iterations (0 to 60).]
steady-state distribution in this case was the same as the 0,0 mode of Figure 31.2;
also notice that the steady-state value of γ in Figure 31.8 is 0.97, in agreement with
our previous estimate of γ for the lowest-order mode.
To get an idea as to how the pattern in Figure 31.7 reconfigures itself to resemble
that of the 0,0 mode, we show in Figure 31.9 an intermediate state obtained from the
initial state of Figure 31.7(a) after 33 iterations. Notice that some of the lobes have
moved towards the center and have begun to merge, giving rise to a central bright spot
which, thanks to its lower losses, will eventually overtake the higher-order mode.
Figure 31.7 Computed results for a high-order mode of the confocal resonator
shown in Figure 31.1. The assumed initial distribution in the cavity’s mid-plane
has eight lobes of uniform amplitude, as shown in (a), but its phase distribution
(not shown) alternates between 0° and 180° from lobe to adjacent lobe.
(b) Intensity distribution in the cavity’s mid-plane after 16 iterations. (c) Same
as (b), but showing the logarithm of intensity on a scale of 1 (white)
to 10⁻³ (black). (d) Distribution of phase in the mid-plane of the cavity after
16 iterations, corresponding to the intensity patterns in (b) and (c). (For a
description of the gray-scale see the caption to Figure 31.2(d).)
[Figure 31.8: attenuation coefficient versus number of iterations.]
Figure 31.9 Distributions of (a) intensity, (b) log (intensity), and (c) phase at
the cavity’s mid-plane after a total of 33 iterations, starting in the initial state of
Figure 31.7(a). This is a snap-shot from an intermediate state in the simulation
whose other results are depicted in Figures 31.7 and 31.8. Note that four of the
lobes have moved towards the center and started to merge into a bright central
spot. This is the spot that will eventually become the dominant 0,0 mode.
[Figures: panels (a)–(c) with axes x/λ0 and y/λ0 from –300 to 300; plot of attenuation coefficient versus number of iterations.]
References to Chapter 31
1 A. E. Siegman, An Introduction to Lasers and Masers, McGraw-Hill, New York
(1971).
2 H. Kogelnik and T. Li, Laser beams and resonators, Proc. IEEE 54, 1312 (1966).
3 A. G. Fox and T. Li, Resonant modes in a maser interferometer, Bell Syst. Tech. J.
40, 453 (1961).
4 A. G. Fox and T. Li, Modes in a maser interferometer with curved and tilted mirrors,
Proc. IEEE 51, 80 (1964).
32
The beam propagation method†
† The coauthors of this chapter are Ewan M. Wright and Mahmoud Fallahi of the College of Optical Sciences,
University of Arizona.
Figure 32.1 The split-step technique used in the BPM. Instead of continuously
propagating the beam in an inhomogeneous environment, the method alternates
between diffracting the beam a short distance through a homogeneous medium
and then modulating its phase/amplitude through a mask. The mask imparts to
the incident beam the cumulative effect of phase shifts and amplitude attenua-
tions (or amplifications) during each propagation step.
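The alternation between diffraction and masking described in this caption can be sketched in a few lines of Python. This is a minimal one-dimensional illustration under stated assumptions, not the program used for the simulations in this chapter: the function name `bpm_step`, the slab geometry, the Gaussian input beam, and the grid parameters are all hypothetical, and the diffraction step uses a standard FFT-based angular-spectrum propagator.

```python
import numpy as np

def bpm_step(field, mask, dx, dz, wavelength):
    """One split step: diffract a distance dz in a homogeneous medium, then apply the mask."""
    fx = np.fft.fftfreq(field.size, d=dx)                  # spatial frequencies
    kz = np.sqrt(((1.0 / wavelength) ** 2 - fx ** 2).astype(complex))
    transfer = np.exp(2j * np.pi * kz * dz)                # evanescent components decay
    diffracted = np.fft.ifft(np.fft.fft(field) * transfer)
    return diffracted * mask

# Illustrative slab guide; all lengths are in units of the wavelength.
wavelength, dx, dz, n_pts = 1.0, 0.25, 2.5, 512
x = (np.arange(n_pts) - n_pts // 2) * dx
core = np.abs(x) <= 2.5                                    # assumed core width: 5 wavelengths
phase_per_step = np.deg2rad(3.0) * dz / wavelength         # assumed 3 degrees per wavelength
mask = np.exp(1j * np.where(core, phase_per_step, 0.0))    # pure phase mask (lossless)

field = np.exp(-(x / 4.0) ** 2).astype(complex)            # assumed Gaussian input beam
field /= np.sqrt(np.sum(np.abs(field) ** 2) * dx)          # unit input power
powers = []
for _ in range(100):
    field = bpm_step(field, mask, dx, dz, wavelength)
    powers.append(np.sum(np.abs(field) ** 2) * dx)
```

Because the mask in this sketch has unit magnitude everywhere, the only power removed is that carried by evanescent spectral components; a longer run would also need an absorbing border in the mask to stop radiated light from wrapping around the periodic FFT grid.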
Figure 32.2 The emergent beam from a semiconductor laser diode is captured
by a lens and focused onto the cleaved facet of a fiber. The numerical aperture of
the focused cone of light is NA = sin θ, where θ is the half-angle of the cone of
light arriving at the fiber. A small fraction of the incident beam is typically lost
by reflection from the facet, while the remaining light penetrates the fiber,
entering the core and cladding. Depending on the modal structure of the fiber and
the cross-sectional profile of the injected beam, a certain fraction of the input
optical power is coupled into the guided mode, which will then propagate along
the fiber’s axis. The remaining (uncoupled) light radiates away from the core and
is lost in the surrounding regions.
The fraction of the beam that radiates away from the core during a BPM
simulation should not be allowed to reach the mesh boundary. The reason is that
the periodic boundary condition imposed on the mesh by the fast Fourier trans-
form (FFT) algorithm used in diffraction calculations tends to return the radiation
modes into the computational region, via aliasing. In the present simulation we
solved this problem by our choice of the mask, which, in addition to a core and
cladding, contains a strongly absorbing region beyond the cladding (transmission
coefficient zero for r > 15λ). The transition from cladding to absorber is tapered
to minimize back-reflections into the core and cladding.
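A tapered absorber of this kind can be built as an amplitude-transmission map on the computational mesh. The sketch below is only an illustration: the cosine shape of the taper, the radius at which it starts (12λ), the mesh parameters, and the function name `radial_transmission` are assumptions; only the outer radius of 15λ comes from the text.

```python
import numpy as np

def radial_transmission(r, r_taper_start, r_absorber):
    """Amplitude transmission: 1 inside r_taper_start, ~0 beyond r_absorber, cosine taper between."""
    t = np.clip((r - r_taper_start) / (r_absorber - r_taper_start), 0.0, 1.0)
    return np.cos(0.5 * np.pi * t)

# Mesh in units of the wavelength (hypothetical sampling).
n_pts, dx = 256, 0.25
coords = (np.arange(n_pts) - n_pts // 2) * dx
xx, yy = np.meshgrid(coords, coords)
r = np.hypot(xx, yy)

# Assumed taper from 12 to 15 wavelengths; transmission vanishes for r > 15.
transmission = radial_transmission(r, r_taper_start=12.0, r_absorber=15.0)
```

Multiplying the phase mask of the core and cladding by `transmission` at every step removes the radiated light before it can reach the mesh boundary.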
Figure 32.4(a) is a cross-sectional plot of the stabilized light amplitude dis-
tribution in the fiber; the vertical broken lines mark the core–cladding boundary.
This guided mode is essentially trapped in the core but has evanescent tails in the
cladding. The significant penetration of the evanescent waves into the cladding is
a direct consequence of the small index-contrast Δn chosen for this particular
example.2,3
Figure 32.4(b) shows the power content of the beam versus z throughout the
simulation. (The power at any given point along the Z-axis is obtained by
integrating the beam’s intensity over the entire cross-sectional area in the
XY-plane.) The power of the beam, which is set to unity at the entrance pupil
of the lens (see Figure 32.2), drops to 0.96 immediately after the beam enters
the fiber. It then drops rapidly until z ≈ 1000λ while the beam adjusts itself to
the fiber, shedding some of its energy by radiating into the absorption region.
The slight decline of optical power for z > 1000λ is partly due to the slow decay
of the radiative modes. Nonetheless, the curve in Figure 32.4(b) continues to
exhibit a small negative slope even after a long propagation distance. This
Figure 32.4 (a) Computed amplitude of the stable mode along the X-axis
for the single-mode fiber of Figures 32.2 and 32.3 (the vertical lines denote
the boundary between core and cladding). The mode stabilizes at about
z = 1500λ and does not change afterwards. (b) Computed optical power
along the length of the fiber when the incident power at the front facet is set
to unity.
behavior is indicative of the presence of a small loss factor despite the fact that
the simulated guide is lossless. So long as the diffraction step is treated non-
paraxially, this small loss factor, which is a consequence of the discrete
approximation to an inherently continuous problem, will remain an unavoidable
feature of the BPM.3
Figure 32.5 Phase mask used in simulating a special fiber containing 19
low-index filaments within its core. The core and cladding radii are 25λ and 50λ,
respectively, and the region beyond the cladding is absorptive. The filaments
each have a diameter of 4λ and the same refractive index as the cladding. All
transition regions are tapered. The phase shift imparted to the beam by the
high-index regions of the core (relative to the low-index cladding and the
filaments) is 3° per λ.
propagation distance. The power stabilizes when the radiative modes leave the
core and cladding to disappear into the surrounding absorber; however, a small
negative slope similar to the one mentioned in connection with Figure 32.4(b)
remains even after stabilization.
Y-branch beam-splitter
Figure 32.8 is a diagram of a Y-branch channel waveguide. This structure is
typically embedded in a lower-index medium that plays the role of cladding.2 The
beam, injected on the left-hand side, establishes in the initial section a guided
mode that propagates along the Z-axis; in Figure 32.8 the length of this initial
section of the guide is z1. The waveguide then slowly opens up over a distance z2,
and the beam follows this expansion adiabatically (i.e., without significant loss of
power and without exciting higher-order modes). Once the guide has been suf-
ficiently broadened, it splits into two channels that slowly recede from each other
over a distance z3 until they are optically isolated. Afterwards the two channels
may remain parallel for a distance z4. Thus a beam injected into the initial section
of the guide will split in two, each of which may be extracted from a separate
channel.
A set of phase masks for simulating a symmetric Y-splitter is shown in
Figure 32.9(a). From top to bottom these masks represent the initial section of
Figure 32.6 Computed intensity profiles in the core region of the fiber of
Figure 32.5 obtained in a BPM simulation. The assumed split steps are of length
Δz = 2.5λ. The initial distribution is a uniform circularly symmetric beam of
diameter 40λ, which enters the fiber at z = 0. The distributions in (a) to (i)
correspond to propagation distances z/λ = 500, 1250, 1750, 2500, 3250, 3750,
4000, 4500, 5000.
the guide (5λ × 5λ square), the end of the expanded region in which the width of
the mask increases to 10λ, and the length along the split section where the
center-to-center separation of the two channels slowly increases from zero to
45λ (branching angle = 1.15°). The assumed lengths of the various sections are
z1 = z2 = 1000λ and z3 = z4 = 2000λ (see Figure 32.8). Each mask imparts a 15°
phase shift to the incident beam after a propagation step of Δz = 5λ; this
corresponds to a 3° phase shift per λ. For the initial distribution at z = 0 we chose a
uniform beam having a circular cross-section of diameter 14λ. The output of the
device at z = 6000λ is shown in Figure 32.9(b). The intensity profile in the broad
section of the guide at z = 2000λ (just before branching) is shown in Figure 32.9(c),
while a plot of the phase distribution at the same location appears in Figure 32.9(d).
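The mask phase quoted above can be related to an index contrast with a short calculation. The sketch below assumes that λ denotes the wavelength inside the cladding and that the mask simply accumulates the extra optical path of the high-index region, so that the phase per step is 360° × (Δn/n_cladding) × (Δz/λ); both function names are hypothetical.

```python
def relative_index_contrast(phase_deg_per_wavelength):
    """Delta-n / n_cladding implied by a mask phase given in degrees per wavelength."""
    return phase_deg_per_wavelength / 360.0

def mask_phase_deg(contrast, dz_in_wavelengths):
    """Phase (degrees) imparted by one mask after a step of dz wavelengths."""
    return 360.0 * contrast * dz_in_wavelengths

contrast = relative_index_contrast(3.0)             # 3 degrees per wavelength
phase_coarse_step = mask_phase_deg(contrast, 5.0)   # step of 5 wavelengths
phase_fine_step = mask_phase_deg(contrast, 2.5)     # step of 2.5 wavelengths
```

Under these assumptions a phase of 3° per λ corresponds to Δn/n_cladding = 1/120, and the two step sizes reproduce the 15° and 7.5° mask phases quoted in this chapter.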
Figure 32.7 Total power of the beam versus propagation distance along the
length of the fiber for the BPM simulation depicted in Figure 32.6. The incident
power at z = 0 is set to unity.
[Figure 32.8: Y-branch channel waveguide with section lengths z1, z2, z3, z4; the incident beam enters from the left.]
The phase plot indicates the propagation of the radiative modes away from the
core and into the absorbing region beyond the cladding.
Figure 32.10 shows the computed amplitude distributions at several cross-
sections of the Y-splitter of Figure 32.8. Evident in this picture is the evolution of
the guided mode from a narrow beam in the initial section of the guide to a wider
beam in the broadened section, and onwards to a pair of well-confined beams in
the divided channel. The power content of the entire beam is plotted versus z in
Figure 32.11, indicating the losses in various sections of the guide and confirming
the stabilization of power in the output channels once they are sufficiently
separated from each other.
Figure 32.9 (a) A set of phase masks used to simulate the Y-splitter of
Figure 32.8. From top to bottom: at the start of the guide; at the end of the
expanded region; at three locations in the split section. (b) The computed
intensity pattern at the end of the guide, z = 6000λ, showing two output beams
confined to their respective channels. (c), (d) Intensity and phase distributions in
the broad section of the guide, just before branching. In the phase plot the
remnants of the incident beam, which are not coupled into a guided mode, are
seen to be radiating away from the core.
Directional coupler
Figure 32.12 shows a channel waveguide known as a directional coupler, which
has applications such as switching in optical communication systems.4 A beam of
light injected into channel 1 propagates along that channel until it reaches a point
where channel 2 is close enough to sense the evanescent tail of the guided mode.
At this point the beam leaks into channel 2 and, after a certain distance, moves
entirely into the second channel. If the parallel section of the guide is long
enough, the back and forth coupling between the two channels may be repeated
many times. In this region of strong coupling, the lowest-order modes of the
guide are the even and odd modes depicted in Figure 32.12. Because these modes
travel at different speeds, their relative phase φ1 − φ2 varies with distance along
[Figure 32.10: amplitude versus x/λ at z = 0, z = 2000λ, and z = 6000λ.]
Figure 32.11 Power content of the beam versus z/λ in the BPM simulation of
the Y-splitter depicted in Figures 32.9 and 32.10. The arrows indicate the
beginning of the split section, the end of the split section, and the location where
the split channels stop receding from each other and become parallel.
[Figure 32.12: schematic of the directional coupler, showing channel 1, channel 2, and the even and odd modes of the coupled section.]
Figure 32.13 (a) Phase masks used in the BPM simulation of the directional
coupler of Figure 32.12. Each mask, which consists of a pair of 5λ × 5λ
square apertures, imparts a phase shift of 7.5° to the beam at the end of
each propagation step of Δz = 2.5λ. (b) Logarithmic plot of the intensity
distribution created by a 0.2NA lens at the entrance to channel 1. The
wavelength λ is that inside the cladding material, and the effect of losses
incurred upon reflection from the front facet of the waveguide is included in
this picture.
Figure 32.13(a) displays the phase masks used in our BPM simulation of the
directional coupler shown in Figure 32.12. The initial intensity distribution
produced by a 0.2NA lens at the entrance to channel 1 is shown in Figure 32.13(b).
Figure 32.14 shows several intensity plots at various cross-sections of the guide,
demonstrating the transfer of light between the two channels. Figure 32.15 is a
plot of the phase distribution at a location along the guide where the two channels
carry equal amounts of optical power; the plot indicates the existence of a 90°
phase difference between the two channels. Figure 32.16 shows the amplitude
distributions at several cross-sections of the guide. Figure 32.17 shows the power
content of each channel versus z, indicating several oscillations of the power
between the two channels.
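The oscillation of power and the 90° phase difference both follow from the beating of the even and odd modes. The sketch below is an idealized two-mode model, not the BPM simulation of this section: it assumes that light launched into channel 1 excites the even and odd modes equally, and the propagation constants `beta_even` and `beta_odd` are hypothetical values chosen only to give a convenient coupling length.

```python
import numpy as np

def channel_amplitudes(z, beta_even, beta_odd):
    """Amplitudes in channels 1 and 2 when light is launched into channel 1 at z = 0."""
    even = np.exp(1j * beta_even * z)
    odd = np.exp(1j * beta_odd * z)
    return 0.5 * (even + odd), 0.5 * (even - odd)

# Hypothetical propagation constants (radians per wavelength of propagation).
beta_even = 2.0 * np.pi * 1.0010
beta_odd = 2.0 * np.pi * 1.0000
coupling_length = np.pi / (beta_even - beta_odd)     # distance for full transfer

z = np.linspace(0.0, 4.0 * coupling_length, 401)
a1, a2 = channel_amplitudes(z, beta_even, beta_odd)
p1, p2 = np.abs(a1) ** 2, np.abs(a2) ** 2            # powers trade places, p1 + p2 = 1

# Where the two channels carry equal power, their phases differ by 90 degrees.
a1_half, a2_half = channel_amplitudes(0.5 * coupling_length, beta_even, beta_odd)
phase_difference_deg = np.degrees(np.angle(a2_half / a1_half))
```

In this model the power in channel 2 varies as sin²[(β_even − β_odd)z/2], so the transfer repeats for as long as the channels remain coupled.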
[Figure 32.16: amplitude versus x/λ at several cross-sections of the guide (curves labeled z = 750, 2250, 2500, 3250).]
Figure 32.17 Power content versus z for each channel in the directional
coupler of Figure 32.12. The incident focused laser beam loses about 4% of
its power upon entering the front facet of the guide, and another 14% while
establishing itself in channel 1. When the two channels slowly approach
each other, the total power in the guide does not change appreciably, but it
begins to couple out of channel 1 and into channel 2. By the time the
separation of the channels has reached 3λ, a fraction of the beam already
resides in channel 2. The oscillation of optical power between the
two channels continues as long as they remain close to each other. The
arrows at the top of the figure mark the locations of the intensity plots of
Figure 32.14.
Figure 32.19 Phase masks used in the BPM simulation of the 1 × 3 splitter
depicted in Figure 32.18. Each mask imparts a phase shift of 7.5° to the beam at
the end of each propagation step of Δz = 2.5λ.
33
Launching light into a fiber
A typical single-mode silica glass fiber has a mode profile that is well
approximated by a Gaussian beam. At λ = 1.55 μm, this Gaussian mode has a (1/e²
intensity) diameter of 10 μm. One method of launching light into a fiber calls
for placing the polished end of the fiber in contact with (or close proximity to) the
polished end of another, signal-carrying fiber that has a matching mode profile.
Alternatively, a coherent beam of light may be focused directly onto the polished
end of the fiber. If the focused spot is well aligned with the fiber’s core and has
the same amplitude and phase distribution as the fiber’s mode profile, then the
launched mode will carry the entire incident optical power into the fiber. In
general, however, the focused spot is neither perfectly matched to the fiber’s
mode, nor is it completely aligned with the core. Under these circumstances, only
a certain fraction of the incident optical power will be launched into the fiber. The
numerical value of this fraction, commonly referred to as the coupling efficiency,
will be denoted by η throughout this chapter.
It is well known that the strength of the launched mode may be computed by
evaluating the overlap integral between the mode profile and the (complex) light
amplitude distribution that arrives at the polished facet of the fiber.1,2,3 The
problem of computing the coupling efficiency η is thus reduced to determining
the light amplitude distribution immediately in front of the fiber. In what follows,
we will evaluate the performance and tolerances of three different lenses
designed for coupling a collimated beam of light into a single-mode fiber.
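The overlap integral mentioned above can be evaluated numerically once the field at the facet is known on a grid. The following Python sketch is a minimal illustration, assuming a Gaussian fiber mode of 1/e² intensity radius 5 μm and purely Gaussian test spots; the function names, the grid, and the test spots are hypothetical and are not the lens computations reported later in this chapter.

```python
import numpy as np

def coupling_efficiency(field, mode, dx):
    """Normalized overlap integral between an incident field and the fiber mode."""
    overlap = np.sum(field * np.conj(mode)) * dx ** 2
    power_field = np.sum(np.abs(field) ** 2) * dx ** 2
    power_mode = np.sum(np.abs(mode) ** 2) * dx ** 2
    return np.abs(overlap) ** 2 / (power_field * power_mode)

def gaussian(xx, yy, radius, x0=0.0):
    """Gaussian amplitude whose intensity falls to 1/e^2 at the given radius."""
    return np.exp(-((xx - x0) ** 2 + yy ** 2) / radius ** 2).astype(complex)

coords = np.linspace(-40.0, 40.0, 321)               # micrometers
dx = coords[1] - coords[0]
xx, yy = np.meshgrid(coords, coords)
mode = gaussian(xx, yy, radius=5.0)                  # 10 um mode diameter

eta_matched = coupling_efficiency(gaussian(xx, yy, 5.0), mode, dx)
eta_oversized = coupling_efficiency(gaussian(xx, yy, 7.0), mode, dx)
eta_offset = coupling_efficiency(gaussian(xx, yy, 5.0, x0=3.0), mode, dx)
```

For two centered Gaussians of radii w1 and w2 the result should agree with the closed form [2·w1·w2/(w1² + w2²)]², and for equal radii w with a lateral offset d it should agree with exp(−d²/w²); these provide a quick check of the numerical integration.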
refractive index profile n(r) = n0[1 – q(r/rmax)²], where n0 = 1.5901, q = 0.044 55,
rmax = 1.5 mm. The lens is permanently affixed to a single-mode fiber whose
guided mode diameter (at the 1/e² intensity point) is 10 μm.
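As a rough consistency check on these parameters, the paraxial ray equation in a parabolic index profile, r'' = −g²r with g² = 2q/rmax², predicts that an incident collimated beam comes to focus after a quarter of the ray period, at a length π/(2g). The sketch below evaluates this length; it is a paraxial estimate under that assumption only, and the variable names are illustrative.

```python
import math

# Index-profile parameters quoted in the text.
n0, q, r_max_mm = 1.5901, 0.04455, 1.5

# Comparing n0*[1 - q*(r/r_max)^2] with n0*[1 - (g*r)^2 / 2] gives g.
g = math.sqrt(2.0 * q) / r_max_mm            # gradient constant, 1/mm

# A collimated beam focuses at the exit facet of a quarter-pitch lens.
quarter_pitch_mm = math.pi / (2.0 * g)
```

The result, about 7.89 mm, agrees with the lens length L = 7.89 mm that accompanies Figure 33.9.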
A collimated Gaussian beam, having radius R0 (at the 1/e² intensity point)
and some wavefront distortion, is incident on the front facet of the lens.
Figure 33.2, top row, shows cross-sectional plots of intensity, log intensity, and
phase for this λ = 1.55 μm beam arriving at the entrance facet of the GRIN lens.
The intensity profile has R0 = 500λ, full-width-at-half-maximum-intensity
diameter DFWHM = 1.1774R0 = 0.912 mm, and full-aperture diameter D = 2.325
mm. The Poynting vector distribution (representing geometric-optical rays) in the
cross-sectional plane of the beam is computed, and its x-, y-, z-components are
shown in Figures 33.2(d)–(f).
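The factor 1.1774 relating the FWHM diameter to the 1/e² radius follows from the Gaussian intensity profile exp(−2r²/R0²), which falls to one half at r = R0·sqrt(ln 2 / 2). A short check in Python (the function name is illustrative):

```python
import math

def fwhm_from_radius(r0):
    """FWHM intensity diameter of a Gaussian beam with 1/e^2 intensity radius r0."""
    return math.sqrt(2.0 * math.log(2.0)) * r0

wavelength_um = 1.55
r0_um = 500 * wavelength_um                   # R0 = 500 wavelengths
fwhm_mm = fwhm_from_radius(r0_um) / 1000.0    # close to 0.912 mm
full_aperture_mm = 1500 * wavelength_um / 1000.0
```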
Method of computation
With reference to Figure 33.3, we describe a method of computing the (complex)
light amplitude distribution at the focal plane of the lens. From the incident beam
profile one derives a large number of rays (i.e., Poynting vectors) for subsequent
tracing through the system. Ray-tracing begins at the entrance facet of the GRIN
lens, and continues through the focal plane to the destination plane, which is in
the far field of the focused spot. Note that, after traversing the GRIN lens, the rays
emerge into a homogeneous medium of refractive index n = 1.5; this region is
Figure 33.2 Cross-sectional plots of (a) intensity, (b) log intensity, (c) phase of
a λ = 1.55 μm beam arriving at the entrance facet of the GRIN lens of Figure 33.1.
The intensity distribution is Gaussian, having DFWHM = 589λ = 0.912 mm, and
full-aperture diameter D = 1500λ = 2.325 mm. The Poynting vector distribution
S(x, y) – representing geometric-optical rays – is readily computed from the
beam profile. Frames (d)–(f) show the x-, y-, z-components of the Poynting
vector, namely, Sx(x, y), Sy(x, y), Sz(x, y). In (d) the values of Sx range from
0.18 to 0.39. Similarly, Sy in (e) ranges from 0.22 to 0.32, and Sz in (f) ranges
from 0 to 100 (black = minimum, white = maximum).
[Figure 33.3: schematic showing the incident beam, the GRIN lens, the focal plane, the homogeneous medium, and the destination plane.]
intended to simulate the medium of the fiber (ignoring the slight difference
between the core and cladding indices). At the destination plane the traced rays
are used to construct the wavefront of the emerging (divergent) beam. This
wavefront is then propagated backwards, to the focal plane of the GRIN
lens (located at its exit facet), where the focused spot’s diffraction pattern is
computed. The reason for tracing the rays all the way to the destination plane (in
the far field of the focused spot) and then back-propagating to the focal plane is
that geometric-optical ray-tracing does not yield valid results when the rays
terminate in focal (or caustic) regions.
Figure 33.4 shows the results of two different computations for the incident
beam depicted in Figure 33.2. Shown are the intensity and phase distributions at
the focal plane of the GRIN lens of Figure 33.3. The incident wavefront is
initially converted to a set of geometric-optical rays, using the association
between a ray and the local Poynting vector of the electromagnetic field. In
Figure 33.4(a, b) the incident rays are traced directly to the focal plane, and the
emergent wavefront has been reconstructed from the traced rays. In Figure 33.4
(c, d) the rays are traced from the entrance facet to the destination plane (see
Figure 33.4 Using two different methods, the intensity (left) and phase (right)
distributions at the focal plane of the GRIN lens of Figure 33.3 have been
computed for the incident beam shown in Figure 33.2. In (a) and (b) the incident
rays are traced directly to the focal plane, and the emergent wavefront is con-
structed from traced rays. In (c) and (d) the rays are traced from the entrance
facet of the lens to the destination plane, where the emergent wavefront is
constructed and subsequently back-propagated to the focal plane (i.e., rear facet)
of the GRIN lens.
Figure 33.3), at which point the emergent wavefront is constructed. This wave-
front is subsequently back-propagated to the rear facet of the lens using the far
field (Fraunhofer) diffraction formula. Since the incident beam in this particular
example is highly aberrated, the two methods of calculation yield similar results.
As a general rule, however, the incident rays should not be traced to the vicinity
of the focal plane, where, due to significant diffraction, geometric-optical
methods are inadmissible.
Figure 33.5 Plots of (a) intensity, (b) log intensity, and (c) phase of a
λ = 1.55 μm Gaussian beam arriving at the entrance facet of the GRIN lens.
The beam has FWHM diameter DFWHM = 0.64 mm, full aperture diameter
D = 2.17 mm, 2λ of linear distortion (i.e., 0.16° of tilt), and 3λ of Seidel
curvature (i.e., Rc ≈ 127 mm).
Figure 33.6 Plots of (a) intensity, (b) log intensity, and (c) phase of the
emergent beam at the destination plane, located 2.0 mm beyond the exit facet of
the GRIN lens of Figure 33.3. Since the beam is highly divergent at this point, its
curvature phase-factor (Rc = 2.046 mm) has been subtracted from the phase plot.
Figure 33.7 Plots of (a) intensity, (b) log intensity, and (c) phase of the focused
spot at the rear facet of the GRIN lens of Figure 33.3. To compute these distri-
butions, the beam displayed in Figure 33.6 has been back propagated a distance of
2.0 mm (i.e., from the destination plane to the rear facet of the GRIN lens).
Figure 33.8 Plots of intensity (left), log intensity (middle), and phase (right)
at the rear facet of the GRIN lens of Figure 33.3. Top row: incident beam
diameter DFWHM = 1.37 mm, D = 3.0 mm, no aberrations other than the
spherical aberration and 105 μm of defocus introduced by the lens itself.
Middle row: incident beam diameter DFWHM = 0.365 mm, D = 2.17 mm, no
aberrations. Bottom row: same as the middle row, except for the presence of
4λ of Seidel astigmatism (i.e., cylinder) on the incident wavefront.
under three different conditions. In the first row of Figure 33.8 the assumed
incident Gaussian beam is fairly large, having DFWHM = 1.37 mm, full aperture
D = 3.0 mm, and no wavefront aberrations. The focused spot, however, is
affected by the spherical aberration of the lens and by nearly –105 μm of
defocus, both of which are consequences of the wide aperture of the incident
beam. (The GRIN’s parabolic index profile is not optimum for
diffraction-limited focusing at large aperture, nor is the selected length of the lens
appropriate for wide-aperture applications.) The large NA of the lens is
responsible for the poor coupling efficiency into the fiber obtained in this
case (η ≈ 27%).
The second row of Figure 33.8 shows profiles of the focused spot for a smaller
incident beam, having DFWHM = 0.365 mm, D = 2.17 mm, and no aberrations. This
focused spot is well matched to the fiber’s mode profile, yielding a large coupling
efficiency (η ≈ 99%). Finally, the third row of Figure 33.8 shows the focused
spot profile computed for the same incident beam as above (DFWHM = 0.365 mm,
D = 2.17 mm) to which 4λ of Seidel astigmatism (i.e., wavefront cylinder) has been
added. Astigmatism reduces the computed coupling efficiency to η ≈ 69%.
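The beam size that matches the fiber mode can be estimated from paraxial Gaussian-beam optics. The sketch below assumes an ideal, aberration-free quarter-pitch GRIN lens, for which a collimated Gaussian beam of 1/e² radius R0 at the entrance facet produces a waist of radius w = λ0/(π·n0·g·R0) at the exit facet (a standard ABCD-matrix result, with g² = 2q/rmax²). It is an estimate under these assumptions, not the ray-tracing and diffraction computation described in this chapter.

```python
import math

wavelength_mm = 1.55e-3
n0, q, r_max_mm = 1.5901, 0.04455, 1.5        # GRIN parameters from the text
mode_radius_mm = 5.0e-3                       # 10 um mode diameter at 1/e^2

g = math.sqrt(2.0 * q) / r_max_mm             # gradient constant, 1/mm

# Input 1/e^2 radius whose focused waist equals the fiber mode radius.
r0_mm = wavelength_mm / (math.pi * n0 * g * mode_radius_mm)

# Convert to the FWHM intensity diameter used in the text.
fwhm_um = 1000.0 * math.sqrt(2.0 * math.log(2.0)) * r0_mm
```

The estimate comes out near 367 μm, close to the optimum DFWHM of 365 μm found for this lens.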
Plano-aspheric lens
Another design for a lens that launches a collimated beam into a single-mode
fiber is the plano-aspheric aplanat depicted in Figure 33.10. This lens is designed
to bring a λ = 1.55 μm beam to diffraction-limited focus at its rear facet (a plane
facet to which the fiber is attached). The lens has diameter = 3.0 mm, length
L = 5.8826 mm, and refractive index n = 1.673 286. The asphere parameters are:
Rc = 2.367 mm, K = 0.667 23, A4 = 2.911 25 × 10⁻³, A6 = 2.522 86 × 10⁻⁴, and
A8 = 2.930 78 × 10⁻⁵. Figure 33.11(a) shows the dependence of η on incident
beam diameter. Clearly, maximum efficiency is achieved with DFWHM ≈ 411 μm.
Figure 33.11(b) shows the dependence of η on beam decenter, when the incident
beam diameter is fixed at its optimum value of 411 μm. Similarly, sensitivity
to tilt for the optimum beam size is shown in Figure 33.11(c), and the
[Figure 33.9 panel annotations: GRIN lens, f = 3 mm, L = 7.89 mm, λ = 1.55 μm; DFWHM = 365 μm, full-aperture D = 2.17 mm.]
Figure 33.9 Characteristics of the GRIN lens of Figure 33.3, computed for a
λ = 1.55 μm incident Gaussian beam. (a) Dependence of the coupling efficiency
η on the FWHM diameter of the incident beam; optimum diameter is 365 μm.
(b) Dependence of η on incident beam decenter relative to the optic axis. (c)
Variation of η with incident beam tilt. (d) Effect on η of Seidel curvature (i.e.,
defocus); the horizontal axis depicts the departure of the wavefront at the edge of
the beam, where the assumed beam’s full-aperture diameter is D = 2.17 mm. In
(b), (c), and (d) the incident beam has DFWHM = 365 μm.
[Figure 33.10: plano-aspheric lens with its aspheric front surface, the collimated incident beam, and the fiber at the rear facet.]
[Figure 33.11: coupling efficiency η of the plano-aspheric lens (f = 3.0 mm) versus FWHM beam diameter, decenter, and tilt, with DFWHM = 411 μm and full-aperture D = 2.17 mm.]
Figure 33.12 (a) Plano-convex lens made of Gradium glass is used to focus
a collimated beam of light into a single-mode fiber. The front facet of the lens is
spherical, having Rc = 3.715 mm, and the focal point is 3.535 mm beyond the
rear (plane) facet of the lens. (b) The lens is fabricated by polishing into
spherical shape one end of a cylindrical rod cut from a slab of Gradium glass.
(c) Index profile of G14SF Gradium glass at λ = 1.55 μm. The refractive index
is highest at the front vertex, decreasing nonlinearly with z as one moves toward
the plane facet of the lens. When the same lens is made of homogeneous glass of
refractive index n = 1.7, the focal point shifts by about 30 μm to the right, and
the spherical aberration increases slightly.
[Figure 33.13: coupling efficiency η of the Gradium lens (f = 2.6 mm, L = 2.9 mm, λ = 1.55 μm) versus FWHM beam diameter, decenter, tilt angle, and Seidel curvature, with DFWHM = 593 μm and full-aperture D = 2.17 mm.]
plane facet of the lens. For this lens, the dependence of coupling efficiency η on
beam size as well as its sensitivity to misalignment are shown in Figure 33.13. The
optimum η is obtained for a beam diameter DFWHM ≈ 593 μm. For this beam size
the amount of decenter that results in 50% reduction of η is 420 μm, and the beam
tilt that causes a 50% drop in η is 0.045°. The dependence of η on wavefront
curvature may be seen in Figure 33.13(d).
We also computed the various performance curves for a plano-convex lens
similar to that depicted in Figure 33.12, but made of homogeneous glass (n = 1.7)
instead of the Gradium material. The focal point of this homogeneous
plano-convex was found to be 3.565 mm beyond the plane facet of the lens, namely,
30 μm greater than that of the Gradium lens. Once again, the optimum beam
size was found to be DFWHM ≈ 593 μm; the other characteristics of the lens were
also very similar to those of the Gradium lens. Apparently, the use of
Gradium glass in this particular application does not result in any substantial
improvements.
34
The optics of semiconductor diode lasers†
† This chapter is co-authored with Ewan M. Wright, Professor of Optical Sciences at the University of
Arizona.
[Figure: layer structure of the diode laser, showing the positive electrode, cladding, guiding layers, gain medium, substrate, and ground electrode.]
electrode, tapering off laterally with an increasing distance from the electrode’s
center line along Z. In gain-guided lasers, this tapering off of the gain is
responsible for lateral beam confinement. (By contrast, in index-guided lasers the
regions adjacent to the guiding stripe are selectively etched away, then replaced
by a lower-index cladding material.) In general, the gain layer is highly
absorptive in regions that are not directly underneath the electrode and, therefore,
experience weak pumping or no pumping at all. The guiding layers are essentially
transparent, except for losses due to scattering at impurities and at the interfaces.
The substrate and the cladding are also highly transparent.
Figure 34.2 shows plots of intensity and phase at the front facet of a
single-transverse-mode diode laser (λ0 = 980 nm). The assumed beam divergence angles
(full-width-at-half-maximum intensity or FWHM) are θ∥ = 7° in the plane of the
junction and θ⊥ = 35° perpendicular to the junction. In the top row of the figure,
(a), (b), where the assumed beam has no astigmatism, the phase distribution at the
laser’s front facet is uniform. In the middle row, (c), (d), the astigmatic distance
(defined as an equivalent distance in free space between horizontal and vertical
beam waists) is Δz = 10 μm, resulting in a slightly wider beam along the X-axis,
and a divergent phase front whose peak-to-valley variation (i.e., from the edge to
the center of the beam) is 120°. In the bottom row, (e), (f), the assumed
astigmatism is Δz = 25 μm. Again the beam is broader (in the horizontal
direction) than the one without astigmatism, and the phase distribution exhibits a
peak-to-valley variation of 190°.
Figure 34.2 Plots of the logarithm of intensity (left) and phase (right) at the
front facet of a λ0 = 980 nm diode laser having θ∥ = 7°, θ⊥ = 35°. The range
of variations of intensity between the maximum (red) and minimum (blue) is
Imax : Imin = 10⁴. In (a), (b) the beam has no astigmatism. In (c), (d) the
astigmatic distance (in free space) between the horizontal and vertical beam
waists is Δz = 10 μm, resulting in a wider beam along X, and a divergent
phase pattern whose variation from the center (blue) to the edge (red) is
120°. In (e), (f), where the assumed astigmatism is Δz = 25 μm, the beam
is further broadened and the phase distribution exhibits a peak-to-valley
variation of 190°.
The elliptical cross-section of the beam emerging at the front facet of the laser
is responsible (through diffraction) for θ∥ being much smaller than θ⊥. The cause
of astigmatism is the non-uniform gain profile (along the X-axis) within the active
region of the laser. As the gain is strongest near the cavity’s central axis, the
beam, while propagating in the cavity along Z, experiences a “gain focusing”
effect toward this axis – a direct consequence of stronger amplification on-axis
than in the wings.2 Consequently, a divergent phase profile automatically evolves
for countering this tendency of the beam to collapse to the center. We will have
more to say about this property in the following section.
Another interesting property of a diode laser beam is its polarization state,
which is typically linear, having the E-field parallel to the plane of the junction.
This property may be traced back to the fact that, for light polarized parallel to the
junction (i.e., E∥) the gain is somewhat greater than that for perpendicularly
polarized light (hereinafter E⊥). The guided mode associated with E⊥ is slightly
broader in the Y-direction than the mode associated with E∥. Since a broad mode
has less overlap with the gain layer than a more compact mode, it stays behind
while the compact mode surpasses the threshold and begins to lase. Moreover,
confinement of electrons and holes to a thin (quantum well) active layer makes
it easier for E∥ (relative to E⊥) to stimulate the excited electrons and holes into
surrendering their photons and returning to the ground state. In practice a
combination of both effects is responsible for promoting the selection of E∥
polarization over E⊥.
of the substrate and cladding materials is n0 = 3.3, then the guiding layers have
index n1 = 3.45 and the active medium has index n2 = 3.47.
In practice, pumping the gain medium causes a decline in its local index, so
that a more realistic phase mask would be similar to that shown in Figure 34.3(c),
where the phase at the center of the active region has dropped to 4.5°
(corresponding to n2 = 3.425). From this minimum, the phase increases in a Gaussian
fashion along the X-axis, reaching the value of 6.12° in the highly absorptive
regions of the active layer. This results in index anti-guiding along the X-axis
(due to index inhomogeneity within the active layer). The Gaussian phase profile
inside the active layer imposes on the laser beam a divergence (along X) above
and beyond that imposed by the gain profile alone.3,4 The rest of the mask is
identical to that in Figure 34.3(b). In what follows, we will first show results
of BPM simulations obtained with the amplitude–phase Mask 1, depicted in
Figures 34.3(a), (b), confirming that the gain profile alone can give rise to
astigmatism. We then show results of simulations obtained with the amplitude–
phase Mask 2, shown in Figures 34.3(a), (c), which reveal that the index
“anti-guiding” of the gain medium (caused by population inversion) can further
enhance the induced astigmatism.
Figure 34.4, top row, shows plots of (a) intensity, (b) logarithm of intensity, and
(c) phase after 600 steps of BPM using Mask 1. Since each step corresponds to a
propagation distance of 0.1λ inside a medium of refractive index n0 = 3.3, the total
propagation distance in this simulation is 18 μm. The light is seen to be well
confined to the guiding layers, with only a small fraction leaking (i.e., evanescent)
into the substrate and cladding. The light that escapes the guiding layers is
eventually lost by scattering or diffraction out of the system. In Figure 34.4(c), the
peak-to-valley phase variation along X is 160°, corresponding to a divergent beam
with a few microns of astigmatism. The bottom row in Figure 34.4 is similar to the
top row, except that it is obtained with Mask 2. The beam is somewhat broader
along the X-axis when compared to that obtained without an index anti-guiding of
the gain medium. Also, the peak-to-valley phase variation in Figure 34.4(f) is 175°
(along X), corresponding to a divergent beam with somewhat more astigmatism
than the one depicted in Figure 34.4(c).
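As a quick arithmetic check on the propagation distance quoted above, the sketch below multiplies the number of BPM steps by the step length. It assumes that the step of one tenth of a wavelength refers to the wavelength inside the index-3.3 medium, and that the 980 nm vacuum wavelength quoted for the laser in Figure 34.2 also applies to this simulation; the constant and function names are illustrative only.

```python
# Sketch: total distance covered by a run of BPM steps.
# Assumptions (not stated explicitly in the text): vacuum wavelength 980 nm,
# and a step equal to 0.1 times the wavelength inside the n0 medium.
LAMBDA0_UM = 0.980    # assumed vacuum wavelength, microns
N0 = 3.3              # index of substrate and cladding (from the text)
STEP_FRACTION = 0.1   # step length in units of the in-medium wavelength
N_STEPS = 600         # number of BPM steps (from the text)


def bpm_distance_um(n_steps, lambda0_um=LAMBDA0_UM, n0=N0, frac=STEP_FRACTION):
    """Return the total propagation distance in microns."""
    return n_steps * frac * lambda0_um / n0


total_um = bpm_distance_um(N_STEPS)
print(f"total propagation distance = {total_um:.1f} um")
```

Under these assumptions the result rounds to the 18 μm stated in the text; using the vacuum wavelength for the step instead would give a distance more than three times larger.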
Figure 34.4(g) shows plots of the power content of the beam versus propa-
gation distance z, as obtained in the above simulations. At first, the power
decreases as the initial beam adjusts itself to the guiding structure, shedding
excess light that does not match the guided mode profile. The gain medium then
takes over and raises the power content exponentially, as the confined mode
propagates along the optical axis.
Shearing interferometry
The beam of a single-transverse-mode diode laser may be captured and colli-
mated by an aberration-free lens, then analyzed using a shear-plate interfero-
meter, as shown in Figure 34.5. The shear plate creates two identical copies of
the collimated beam shifted relative to each other along the X- and/or Y-axes.
Superposition of these two copies of the same beam at the observation plane
creates an interferogram that reveals the phase structure of the (collimated) beam.
Figure 34.4 Plots of (a) intensity, (b) logarithm of intensity, and (c) phase after
600 BPM steps. The amplitude–phase Mask 1 used in these simulations is
depicted in Figures 34.3(a), (b). The light is seen to be confined to the guiding
layers, with a weak evanescent tail leaking into the substrate and cladding. In (c)
the peak-to-valley phase variation along X is 160°. The bottom row (d)–(f) is
similar to the top row (a)–(c), except that it is obtained with Mask 2 depicted in
Figures 34.3(a), (c). In (f) the peak-to-valley phase variation along X is 175°. (g)
Power content of the beam versus propagation distance in the BPM simulations.
[Figure 34.5: diode laser, collimator, shear plate, and interferogram at the observation plane.]
Figure 34.6 Left column: plots of intensity (top) and phase in a plane
located 10 mm beyond the exit pupil of the collimator of Figure 34.5. The
0.6 NA lens is one focal length (f = 4.9 mm) away from the mid-point
between the two waists of the laser beam (λ0 = 980 nm, θ∥ = 7°, θ⊥ = 35°).
Right column: intensity patterns at the viewing window of the shear plate
(Δx = 0.7 mm, Δy = 2.0 mm). From top to bottom, the assumed astigmatism
of the laser is 0, 10, 20, 30 μm.
[Figure 34.7: diode laser followed by a cylindrical lens (radial GRIN) and a plano-cylindrical lens (homogeneous glass).]
Figure 34.8 Plots of (a) intensity, (b) phase of a laser beam (λ0 = 980 nm,
θ∥ = 7°, θ⊥ = 35°, astigmatism Δz = 0) upon emerging from the lens pair of
Figure 34.7. For the particular lenses chosen in this simulation, the optical power
throughput is 80%, the r.m.s. wavefront aberration is 0.19λ, and the
peak-to-valley phase variation across the aperture is 280°.
(with a slight adjustment of the separation between its two elements) may be used
for collimation in the presence of astigmatism on the laser beam, without any
degradation of the wavefront quality.
By allowing the slow axis of the beam to propagate further before being
collimated, the cylindrical lens pair enables one to adjust the degree of ellipticity
Figure 34.9 A diode laser’s beam is collimated, then shaped by a prism pair
that expands the beam’s diameter along the X-axis. The collimated and ana-
morphically magnified beam is subsequently focused by an aberration-free lens
identical to the one used for collimation.
Figure 34.10 Distributions of the logarithm of intensity (left) and phase (right)
at several cross-sections of the system of Figure 34.9. The lenses have NA = 0.6,
f = 4.9 mm. The prisms are made of n = 1.72 glass, and have an apex angle of
69°. (a), (b) Front facet of the laser. (c), (d) Exit pupil of the collimator, just
before entering the prism pair. (e), (f) Emerging from the second prism. (g), (h)
Focal plane of the focusing lens.
row in Figure 34.10 shows the beam at the front facet of the laser. The second
row shows that before entering the prisms the beam has an elliptical cross-
section with an aspect ratio of 5.5. Emerging from the prism pair (third row)
the beam is circularized. The bottom row shows the focused spot at the focal
plane of the focusing lens; this compressed image of the bright elliptical spot at
the front facet of the laser has circular symmetry and a much reduced diameter
along the X-axis.
Figure 34.11 Same as Figure 34.10 but with the diode laser shifted 20 μm to
the left along the X-axis. Comparing (d) with (f), note that the tilt angle of the
collimated beam is reduced substantially after going through the prism pair.
Thus the image of the laser beam, in addition to being circularized, moves closer
to the optical axis at x = 0, as shown in (g), (h).
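The geometry of this off-axis case can be checked with a small-angle estimate. The sketch below assumes that the collimator converts a lateral source shift s into a beam tilt of roughly s/f, that the prism pair divides the tilt by its magnification (taken as 5.5, the aspect ratio quoted for Figure 34.10), and that the focusing lens has the same focal length f = 4.9 mm as the collimator. Names are illustrative, and aberrations are ignored.

```python
import math

F_MM = 4.9          # focal length of collimator and focusing lens (Figure 34.10)
SHIFT_UM = 20.0     # lateral shift of the laser (Figure 34.11)
MAGNIFICATION = 5.5 # assumed anamorphic magnification of the prism pair

# Tilt of the collimated beam produced by the off-axis source.
tilt_deg = math.degrees(math.atan((SHIFT_UM * 1e-3) / F_MM))

# The prism pair expands the beam along X and reduces the tilt by the same factor.
tilt_after_prisms_deg = tilt_deg / MAGNIFICATION

# Offset of the focused spot from the optical axis, in microns.
image_offset_um = F_MM * 1e3 * math.tan(math.radians(tilt_after_prisms_deg))

print(f"tilt after collimator : {tilt_deg:.2f} deg")
print(f"tilt after prism pair : {tilt_after_prisms_deg:.3f} deg")
print(f"image offset          : {image_offset_um:.1f} um")
```

In this approximation the image offset is simply the source shift divided by the anamorphic magnification.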
Figure 34.11 is similar to Figure 34.10, except for the position of the diode
laser along the X-axis, which is shifted to x = −20 μm. Collimation and
anamorphic magnification work as before, but the beam emerging from the
collimator is tilted by about 0.23° away from the Z-axis. The prism pair magnifies
the beam along X by 5.5, but it also reduces the tilt of the beam by the same
factor. The net result is that the image of the bright elliptical spot at the front
facet of the laser, in addition to being compressed in size, is brought closer to
the optical axis at x = +3.6 μm. This is an important result which may be
Figure 34.12 Plots of (a) intensity, (b) logarithm of intensity, (c) phase at
the exit pupil of a cylindrical lens pair used as collimator and anamorphic
magnifier for the beam emerging from a single-mode fiber (λ0 = 1.544 μm).
A 0.775 × 3.1 mm² elliptical aperture is placed at the exit pupil to clip the
edges of the beam. The overall transmission of the system is 95.5%, and the
ratio of the FWHM beam diameters is 4.1. The peak-to-valley phase
variation in (c) is 56°.
The essential idea behind the stellar interferometer is that of a double-slit inter-
ferometer, such as that shown in Figure 35.1. This type of instrument dates back to
1868 when Fizeau1 proposed using it to measure the diameters of the fixed stars.
Some modern textbooks2 describe the stellar interferometer in the language of
coherence theory, which tends to obscure its fundamental simplicity. This chapter
attempts to present the original concept in its simplest form while providing a
historical perspective.
Figure 35.2 Geometrical construction showing the relation between the fringe
period p, the distance d between the slits, the focal length f of the focusing lens,
and the wavelength λ of the light.
parameters (f = 6 × 10⁵λ, d = 6500λ) the fringe period found from Eq. (35.1) is
p ≈ 92.3λ, in agreement with the simulated results.
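The quoted fringe period can be reproduced in one line. The sketch below assumes that Eq. (35.1) has the usual small-angle double-slit form p ≈ λf/d, so that with all lengths expressed in units of the wavelength the period is simply the ratio f/d; the function name is illustrative.

```python
def fringe_period_in_wavelengths(f_over_lambda, d_over_lambda):
    """Fringe period p/lambda, assuming the small-angle form p = lambda * f / d."""
    return f_over_lambda / d_over_lambda


# Parameters quoted in the text: f = 6e5 wavelengths, d = 6500 wavelengths.
p_over_lambda = fringe_period_in_wavelengths(6.0e5, 6500.0)
print(f"p = {p_over_lambda:.1f} wavelengths")  # prints "p = 92.3 wavelengths"
```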
Next, assume that a second point source is placed in the focal plane of the
collimator lens, slightly displaced from the first one located at the origin. Assume
further that the two sources, although both quasi-monochromatic with
wavelength λ, are completely uncorrelated and independent, so that their radiation may
be considered to be spatially incoherent. The collimated beam arriving at the
plane of the slits from this second source will make a small angle ψ with the
optical axis. As shown in Figure 35.4, this angle causes the phase of the light
arriving at the two slits to differ by Δ where3,4
$$\Delta \approx 2\pi\psi d/\lambda. \qquad (35.2)$$
two point sources will be shifted relative to each other by an amount that will
depend on ψ. Since the point sources are uncorrelated, it is their corresponding
intensity distributions at the observation plane that will be added together. This
results in a ψ-dependence of the fringe visibility V, a quantity defined by
Michelson as
$$V = (I_{\max} - I_{\min})/(I_{\max} + I_{\min}), \qquad (35.3)$$
where Imin and Imax are the minimum and maximum values of intensity within
one fringe period.
As an example, let us assume that two point sources of equal strength, separated
by 0.5λ along the X-axis, are placed in the focal plane of the collimator lens of
Figure 35.1. The angular separation between the two sources as viewed from the
plane of the slits, therefore, is ψ = 17.2 seconds of arc.5 In the absence of the
double-slit mask, the images of the two point sources will be unresolvable at the observation
plane; see Figure 35.5(a). With the mask in place, however, the plots of intensity
distribution in Figures 35.5(b), (c) indicate that the fringe visibility will be
substantially altered from that for a single point source; the latter can be deduced from
Figure 35.3(c) to be 100%. Close inspection of the fringes, therefore, enables one to
infer the presence of a second source of light in the system. As a practical matter, one
may adjust the distance d between the two slits until the phase shift Δ of Eq. (35.2)
becomes equal to π, at which point the two fringe systems will be shifted by half a
period. Under these conditions the maxima of one set of fringes will overlap the
minima of the other, resulting in a complete “washing out” of the interference
pattern. Equation (35.2) can then be used to determine the angular separation ψ
between the two point sources from the knowledge of the slit separation d that
resulted in minimum fringe visibility. (Note that changing d will have the
undesirable effect of changing the fringe period p according to Eq. (35.1), but, as long as the
fringes remain visible to the observer, this change should be inconsequential.)
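The washing-out of the fringes can be illustrated numerically. The sketch below is an idealized model, not the simulation used for the figures: it assumes two equal-strength incoherent sources, each producing a perfect cosine fringe pattern, ignores the diffraction envelope of the individual slits, and shifts one pattern by the phase of Eq. (35.2). Function and variable names are illustrative.

```python
import math

ARCSEC = math.pi / (180 * 3600)  # one second of arc, in radians


def visibility_two_sources(psi_rad, d_over_lambda, samples=2000):
    """Fringe visibility, Eq. (35.3), for two equal incoherent point sources."""
    delta = 2 * math.pi * psi_rad * d_over_lambda  # phase shift of Eq. (35.2)
    intensities = []
    for i in range(samples):
        phi = 2 * math.pi * i / samples            # position within one fringe period
        # Intensities (not amplitudes) add, since the sources are uncorrelated.
        intensities.append((1 + math.cos(phi)) + (1 + math.cos(phi + delta)))
    i_max, i_min = max(intensities), min(intensities)
    return (i_max - i_min) / (i_max + i_min)


psi = 17.2 * ARCSEC
d_null = 0.5 / psi  # slit separation (in wavelengths) at which delta equals pi

for d in (0.0, 0.5 * d_null, d_null):
    print(f"d = {d:7.0f} wavelengths  ->  V = {visibility_two_sources(psi, d):.3f}")
```

In this simple model the visibility follows |cos(Δ/2)|, falling from 1 at zero slit separation to 0 when Δ = π, which for ψ = 17.2 seconds of arc happens at a slit separation of roughly 6000 wavelengths.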
Figure 35.5 Results of a simulation of the system of Figure 35.1 involving two
independent point sources, one centered on the optical axis (i.e., at the origin), the
other shifted by 0.5λ along the X-axis. (a) Logarithmic plot of the intensity
distribution at the observation plane in the absence of the double-slit mask. (b) Fringe
pattern at the observation plane with the double-slit mask present. (c) Cross-section
of the fringe pattern along the X-axis.
Figure 35.6 Results of a simulation of the system of Figure 35.1 involving three
independent point sources, one centered on the optical axis, the others shifted
by ±0.25λ along the X-axis. (a) Logarithmic plot of the intensity distribution at
the observation plane in the absence of the double-slit mask. (b) Fringe pattern at the
observation plane with the double-slit mask present. (c) Cross-section of the fringe
pattern along the X-axis.
appears at d = 0.5λ/ψ, whereas in the latter the first zero occurs at d = 1.22λ/ψ;
here ψ is the angle subtended by the diameter of the giant star’s disk.2,3,4
In any event, it is clear that a measurement of fringe visibility for several
different separations of the slits will provide much information about the
distribution of intensity at the source. As another example, we show in Figure 35.6 the
case of three point sources of equal intensity placed at x = −0.25λ, 0, and +0.25λ
in the system of Figure 35.1. Again the image obtained at the observation plane
without the double-slit mask does not resolve the sources of light, but the fringe
visibility obtained as a function of d carries enough information to allow one to
make a fairly accurate statement about the distribution of intensity at the source.
A historical perspective
R. W. Wood6 sums up the origins of the interferometer: “This method was
proposed by Fizeau1 in 1868 for measuring the diameters of the fixed stars. In
1874 Stefan made an attempt to carry out Fizeau’s plan, placing two slits in front
of the objective of the Marseilles telescope, the largest available at the time. The
fringes remained visible even when the slits were separated by the full diameter
of the objective. In 1890 Michelson measured the diameters of the four moons of
Jupiter, using the 36 inch telescope of the Lick observatory.7 The method can also
be used for determining the distance between the components of a double star.
“In 1920 Michelson took up the problem of the determination of stellar diam-
eters.8 Even the great 100 inch telescope of the Mount Wilson Observatory is not
large enough to allow of a sufficient separation of the slits; consequently Michelson
designed a ‘periscopic’ arrangement of four mirrors, the two outer ones, twenty feet
apart, reflecting the light to two inner ones which in turn reflected the beams down
upon the mirror of the 100 inch telescope. The mirrors were mounted on a metal
beam attached to the top of the telescope tube. The instrument was constructed in
collaboration with F. G. Pease of the Mount Wilson Observatory.”
A schematic diagram of the stellar interferometer constructed by Michelson (and
mentioned by R. W. Wood in the preceding paragraph) is shown in Figure 35.7. In
this instrument the distance ℓ between mirrors M1 and M2 was varied to effect a
change of fringe visibility; one must therefore substitute ℓ for d in Eq. (35.2) in
order to make it applicable to the new instrument. The fringe period p, however, is
still determined by the distance d between the slits, and Eq. (35.1) applies to
Michelson’s interferometer without any modifications. Thus there is the further
advantage that the fringe spacing remains constant as the separation of the movable
mirrors is varied. The interferometer was mounted on the 100 inch reflecting
telescope of the Mount Wilson Observatory in California, which was used because
of its mechanical strength. The apertures S1 and S2 were 114 cm apart, giving a
fringe spacing of about 20 μm in the focal plane. The maximum separation of the
outer mirrors was 6.1 m, so that the smallest measurable angular diameter (with
λ = 550 nm) was about 0.02 seconds of arc.3
Again quoting R. W. Wood:6 “The bright star Betelgeuse was the first
investigated. This star shows evidence of its diameter with the 100 inch telescope
if a canvas cover is placed over the instrument, provided with two holes 7 inches
in diameter and 94 inches apart, the diffraction disk of the star being crossed with
faint interference bands. If either hole is covered the bands disappear. If the
telescope is pointed at Rigel, however, the bands are clear and strong, showing
that its angular diameter is smaller than that of Betelgeuse. With the twenty-foot
interferometer the bands disappeared entirely in the case of Betelgeuse when the
mirrors were separated by a distance of 120 inches, while Rigel showed very
distinct bands. The angular diameter of Betelgeuse was computed as 0.047
seconds of arc. From the known distance of the star [determined by triangulation], its
actual diameter was calculated as 250 million miles [i.e., 300 times the diameter
of the sun] or greater than the earth’s orbit about the sun [180 million miles
across]. Its diameter has been found to vary, however, for at times the mirrors
must be separated by a distance of 14 feet before the fringes disappear. Antares
was found to be still larger, having a diameter of 400 million miles. The
minimum angular diameter measurable with the 20 foot instrument is 0.024
seconds of arc.”

Albert Abraham Michelson (1852–1931) was born in what was then Germany
(now Poland) and emigrated with his family to the United States in 1855. He
became professor of physics at the Case School of Applied Science (Cleveland,
Ohio), then at Clark University (Worcester, Massachusetts), and then at the
University of Chicago. In 1907 he became the first American to receive a Nobel
prize; the prize citation reads: “For his optical precision instruments and the
spectroscopic and metrological investigations carried out with their aid.”
(Photo: courtesy of AIP Emilio Segrè Visual Archives.)
The majority of stars are either too distant or too small for the Michelson
interferometer to measure their diameter. For example, at the distance of the
nearest star (Alpha Centauri) the sun’s disk would subtend an angle of only 0.007
seconds of arc, and to observe the first disappearance of the fringes a mirror
separation of 20 m would be necessary. The construction of such a large inter-
ferometer would be a difficult undertaking because of the requirement of rigid
mechanical connection between the collecting mirrors and the eyepiece.3 In recent
years, the method of Hanbury Brown and Twiss as well as extensions of Michelson’s
method to radio astronomy have been used for measurements of some of the smaller
astronomical objects.2,3,4
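The angular diameters and mirror separations quoted in this section can be cross-checked against the uniform-disk criterion given earlier, namely that the first zero of visibility occurs at a separation of 1.22λ/ψ. The sketch below assumes λ = 550 nm throughout (the text states this wavelength only for the 6.1 m estimate) and treats every star as a uniformly bright disk; function names are illustrative.

```python
import math

ARCSEC = math.pi / (180 * 3600)  # one second of arc, in radians
WAVELENGTH_M = 550e-9            # assumed observing wavelength


def first_null_separation_m(diameter_arcsec, wavelength_m=WAVELENGTH_M):
    """Mirror separation at which fringes from a uniform disk first vanish."""
    return 1.22 * wavelength_m / (diameter_arcsec * ARCSEC)


def smallest_diameter_arcsec(max_separation_m, wavelength_m=WAVELENGTH_M):
    """Smallest disk diameter whose first null is reachable with the instrument."""
    return 1.22 * wavelength_m / max_separation_m / ARCSEC


limit_6m = smallest_diameter_arcsec(6.1)          # limit of the 6.1 m instrument
sun_at_alpha_cen = first_null_separation_m(0.007) # sun seen from the nearest star
betelgeuse = first_null_separation_m(0.047)       # Betelgeuse

print(f"6.1 m limit            : {limit_6m:.3f} arcsec")
print(f"0.007 arcsec disk needs: {sun_at_alpha_cen:.1f} m")
print(f"0.047 arcsec disk needs: {betelgeuse:.2f} m ({betelgeuse / 0.0254:.0f} inches)")
```

With these assumptions the 6.1 m limit comes out near 0.023 seconds of arc and the 0.007 arcsec disk requires close to 20 m, in line with the figures above. The Betelgeuse estimate lands a few inches short of the 120 inch separation reported by Wood, a difference that a slightly longer effective wavelength would account for.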
There are countless suns and countless earths all rotating around
their suns in exactly the same way as the seven planets of our system.
We see only the suns because they are the largest bodies and are
luminous, but their planets remain invisible to us because they are
smaller and non-luminous. The countless worlds in the universe are no
worse and no less inhabited than our Earth.
Nulling interferometer
Figure 36.1 is a diagram of a basic Bracewell telescope intended for operation in the
infrared range of wavelengths λ ≈ 7–20 μm. The reason for working in the infrared is
that the expected brightness of the star in this region is only 10⁶ times that of the
planet, which is much better than the 10⁹ brightness ratio in the visible. Moreover,
several signature absorption lines corresponding to ozone, water vapor, methane,
and carbon dioxide reside in this band, which can be exploited in the spectroscopic
analysis of these planets to determine whether they harbor life as we know it.5
In the following discussion we confine our attention to a single infrared
wavelength of λ = 10 μm, even though the interferometric telescope can operate
over a fairly broad range of wavelengths. We assume the primaries each have an
aperture diameter Dp = 1 m and focal length fp = 2.5 m. (The angular resolution
of the individual mirrors is thus λ/Dp = 10⁻⁵ radians.) The assumed baseline
(i.e., center-to-center separation of primary mirrors) is B = 5 m.† With the
† These parameters, chosen for the sole purpose of demonstrating the basic concepts, are not representative of
the planned systems. A typical design under consideration by the TPF program, for example, has four primary
mirrors, each 2.5 m in diameter and separated by 100 m baselines. It is envisioned that these free-flying
mirrors would collect and forward the beam of light to a local combiner and controller unit (also free-flying).
The planned system will be capable of executing nulling interferometry over the broad band of λ = 7–20 μm.
The adjustment of mirror positions and their distances from each other as well as from the combiner would
allow the configuration to be optimized on the spot in accordance with the characteristics of the particular
solar system under consideration.5
delayed relative to the time of arrival of the same wavefront at the other. This
delay or phase shift, being a function of the baseline B, is independent of the
mirror diameter Dp.
Figure 36.2 A simple design for the beam-splitter (BS) in the system of
Figure 36.1, having a 50/50 reflection to transmission ratio at λ = 10 μm. The
1 mm-thick substrate has refractive index n = 2, and is antireflection coated (AR)
on the top surface with a t = 1.785 μm layer having n = 1.4. Deposited on the
substrate bottom is a six-layer stack. Numbered in increasing order starting at the
substrate interface, these layers have the following parameters: layers 1, 3, 5,
t = 1.7 μm, n = 1.5; layers 2, 4, t = 1.25 μm, n = 2.0; layer 6, t = 0.475 μm, n = 2.0.
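The thickness of the antireflection layer in this design matches what a single quarter-wave coating would require. The sketch below assumes normal incidence and a lossless single layer, which may not be the exact conditions of the design in Figure 36.2; it computes the quarter-wave thickness λ/(4n) and the residual reflectance of such a layer on the n = 2 substrate. Function names are illustrative.

```python
def quarter_wave_thickness_um(wavelength_um, n_layer):
    """Physical thickness of a quarter-wave layer, in microns."""
    return wavelength_um / (4.0 * n_layer)


def quarter_wave_reflectance(n_incident, n_layer, n_substrate):
    """Normal-incidence reflectance of a quarter-wave layer at its design wavelength."""
    num = n_incident * n_substrate - n_layer ** 2
    den = n_incident * n_substrate + n_layer ** 2
    return (num / den) ** 2


t_ar = quarter_wave_thickness_um(10.0, 1.4)       # design wavelength 10 microns
r_ar = quarter_wave_reflectance(1.0, 1.4, 2.0)    # air / coating / substrate

print(f"quarter-wave thickness = {t_ar:.3f} um")
print(f"residual reflectance   = {r_ar:.1e}")
```

The layer index of 1.4 is close to the square root of the substrate index, which is why the residual reflectance in this idealized model is of order 10⁻⁴.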
Figure 36.4 Images of star and planet, when the assumed brightness of the
planet is 1% of the star’s and their angular separation is 5.16 arcsec. (a) When
either beam is blocked the intensity distribution is essentially that of the star;
the planet is barely visible. (b) With both channels open and the phase
difference between them adjusted to 180°, the bright star is canceled out and the
planet becomes visible. The center-to-center spacing between the images of the
star and the planet is 1.25 mm, and the ratio of the peak intensity in (b) to that
in (a) is 0.038.
Figure 36.5 The Bracewell telescope, with its baseline in the XY-plane
and oriented at an angle θ from X, targeting a star along the Z-axis. The
points in the vicinity of the star are identified by their polar and azimuthal
coordinates ζ, ψ. The wavefront, arriving from an oblique direction, reaches
one mirror later than the other, producing a path-length difference of
B sin ζ cos(ψ − θ).
the star’s neighborhood may be identified either by its angular coordinates ζ, ψ or
by the Cartesian coordinates x, y of its image in the telescope. The image location
is related to the polar coordinates through the equation
$$(x, y) = M_s f_P \tan\zeta\,(\cos\psi, \sin\psi). \qquad (36.1)$$
Here fP is the focal length of the primary mirror and Ms is the magnification of the
secondary. For the light arriving at the two primaries from the direction (ζ, ψ) in
the sky the relative phase is
$$\Delta\Phi = 2\pi(B/\lambda)\sin\zeta\cos(\psi - \theta). \qquad (36.2)$$
The two arms of the telescope are adjusted in such a way that when ΔΦ = 0 the
two beams interfere destructively whereas a 180° phase shift results in
constructive interference. The corresponding light amplitude at the image plane is
thus given by
$$A = \tfrac{1}{2}A_0\,[1 - \exp(i\Delta\Phi)], \qquad (36.3)$$
$$|A|^2 = |A_0|^2 \sin^2(\tfrac{1}{2}\Delta\Phi) = |A_0|^2 \sin^2[\pi(B/\lambda)\sin\zeta\cos(\psi - \theta)]. \qquad (36.4)$$
Figure 36.6 is a gray-scale plot of |A/A0|² in the image plane of the telescope
(black and white represent 0 and 1, respectively). Each point (x, y) in this plane
corresponds to a point (ζ, ψ) in the sky in accordance with Eq. (36.1); it is also
assumed that the primary mirrors are separated by B = 5 m along the θ = 45° line.
Figure 36.6 Gray-scale plot of |A/A0|² (Eq. 36.4) in the image plane of the
telescope (black and white represent 0 and 1, respectively). The primary mirrors
are separated by B = 5 m along the θ = 45° line; λ = 10 μm.
The field of view is centered at (x, y) = (ζ, ψ) = (0, 0), which is the target star’s
location. Only a circle of radius 100λ within the field of view – corresponding to
0 ≤ ζ ≤ 20 μrad – is shown in Figure 36.6, but the same pattern could extend over
a much larger patch of sky around the targeted star.
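The fringe pattern of Eq. (36.4) and the mapping of Eq. (36.1) are easy to evaluate directly. The sketch below uses B = 5 m, a wavelength of 10 μm, and a baseline angle of 45°. The polar angle is called `zeta` purely as a label, and the product of secondary magnification and primary focal length is not taken from the text: it is inferred (an assumption) from the statement that an image-plane radius of 100 wavelengths corresponds to 20 μrad on the sky.

```python
import math

ARCSEC = math.pi / (180 * 3600)
B_M = 5.0                        # baseline
LAMBDA_M = 10e-6                 # wavelength
THETA = math.radians(45.0)       # baseline orientation
# Inferred plate scale Ms * fP: 100 wavelengths in the image <-> 20 microradians.
PLATE_SCALE_M = 100 * LAMBDA_M / 20e-6


def transmission(zeta, psi, theta=THETA, b=B_M, lam=LAMBDA_M):
    """|A/A0|^2 of Eq. (36.4) for sky direction (zeta, psi)."""
    dphi = 2 * math.pi * (b / lam) * math.sin(zeta) * math.cos(psi - theta)
    return math.sin(0.5 * dphi) ** 2


def image_xy(zeta, psi, scale=PLATE_SCALE_M):
    """Image-plane coordinates of Eq. (36.1), in meters."""
    r = scale * math.tan(zeta)
    return r * math.cos(psi), r * math.sin(psi)


on_axis = transmission(0.0, 0.0)                                   # the target star
first_bright = transmission(math.asin(LAMBDA_M / (2 * B_M)), THETA)
across_baseline = transmission(1e-5, THETA + math.pi / 2)
planet_radius_m = math.hypot(*image_xy(5.16 * ARCSEC, 0.0))

print(on_axis, first_bright, across_baseline)
print(f"planet image radius = {planet_radius_m * 1e3:.2f} mm")
```

The on-axis star sits in a null, the first bright fringe along the baseline direction lies one microradian away, and directions perpendicular to the baseline are always dark. With the inferred 50 m plate scale, a planet 5.16 arcsec from the star images 1.25 mm off center, consistent with the spacing quoted for Figure 36.4.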
Any planet (or other source of radiation) located in the bright fringes of
Figure 36.6 will produce a bright Airy pattern in the image plane at that location.
However, planets located in the dark fringes disappear from the image because
destructive interference cancels them out. If the telescope is rotated around the
Z-axis while maintaining a tight fix on the target star, h will change continuously
and the pattern of Figure 36.6 will rotate around its center. The image of a planet
within the field of view, however, will remain fixed while the fringes rotate. The
planet’s image thus waxes and wanes as the bright and dark fringes cross it one
after the other. The number of times that the planet’s image appears and disap-
pears in a single revolution of the telescope depends on the polar coordinate of
the planet; specifically, the frequency of the planet’s signal at the detector output
increases in proportion to its separation from its parent star. In this way it is
possible to modulate the signal of a given planet and, by integration over time, to
reduce noise components residing outside the specific frequency of the planet’s
signal.1
Figure 36.7 (a) Null image of a star of finite diameter (0.05 arcsec). The peak
intensity is about 230 times greater than that in Figure 36.3(b). (b) Image of the
finite-diameter star and its planet. This image should be compared with Figure 36.4
(b), which was obtained under identical conditions except for neglect of the angular
diameter of the star. The peak intensities of the planet and star in the present image
are nearly the same.
Let d1 and d2 be the optical path lengths of the two channels in air, and denote
by t1 and t2 the thicknesses of CP and BS, respectively. These plates are made of
the same material, whose refractive index within the wavelength range of interest
may be approximated by n(λ) ≈ a + bλ (a and b are material constants). Also, one
must take into account the 90° phase shift introduced by the (symmetric)
beam-splitter between the reflected and transmitted beams. The overall optical phase
difference between the two channels is thus given by
In this equation the first bracketed term can be chosen to yield a 90° phase shift by
selecting the plate thicknesses such that (t1 − t2)b = 1/4. The second bracketed term is
dependent on λ and must therefore be set to zero. Since t1 − t2 is already fixed,
elimination of the second term requires an adjustment of d1 − d2, the path-length
difference in air. In practice these adjustments are made iteratively by changing
d1 − d2 while rotating CP by small amounts until the desired null is achieved.
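The displayed equation for the phase difference did not survive in this copy. One form consistent with the description above is sketched here; it is an assumption, not a quotation, and the sign of the beam-splitter term depends on convention. Writing the optical path of channel j as d_j + n(λ) t_j, dividing by λ, and substituting n(λ) ≈ a + bλ separates a wavelength-independent part from a part that scales as 1/λ:

```latex
\Delta\Phi \;\approx\; 2\pi\,\big[(t_1 - t_2)\,b\big]
\;+\; \frac{2\pi}{\lambda}\,\big[(d_1 - d_2) + (t_1 - t_2)\,a\big]
\;+\; \frac{\pi}{2}
```

In this form the first bracket contributes 90° when (t1 − t2)b = 1/4, which together with the 90° of the beam-splitter gives the 180° needed for the null, and the second bracket vanishes at all wavelengths when d1 − d2 = −(t1 − t2)a.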
Principle of operation
The essential features of a scanning optical microscope are shown in Figure 37.1. A
laser beam is sent through an objective lens to form a focused spot on the sample.
Ideally, the objective is corrected for all aberrations, yielding a diffraction-limited
focused spot. The light reflected from the sample returns through the objective and
is redirected by the beam-splitter to a detection module. The detection module may
be designed to monitor the power, the phase, or the polarization state of the
returning beam. The electrical signal S(x, y) produced by the detector is thus rep-
resentative of the small area of the sample illuminated by the focused spot at and
around the point (x, y). The sample is moved to different locations by the XY stage
on which it is mounted; the signal S(x, y), plotted against the sample’s position,
yields an image of the sample’s surface over the desired area.
† The coauthors of this chapter are Lifeng Li and Wei-Hung Yeh.
When the sample is in air (n = 1) the numerical aperture is less than unity.
However, if the sample is embedded in a liquid or solid of refractive index n > 1,
the numerical aperture can be as large as n.
The diameter D of an aberration-free focused spot is given by diffraction
theory as1
$$D \approx \lambda_0/\mathrm{NA}, \qquad (37.2)$$
where λ0 is the vacuum wavelength of the laser beam. The above equation gives only
a rough estimate of the spot diameter, the exact value depending on how the diameter
is defined [e.g., the diameter of the first dark ring of the Airy disk, the full width at
half maximum (FWHM) of the intensity distribution, etc.], on the distribution of
light at the entrance pupil of the lens (e.g., uniform, truncated Gaussian, etc.), and on
the state of polarization of the laser beam. The proportionality constant between D
and λ0/NA is typically between 0.5 and 1.5, depending on the circumstances.
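To see where two of these proportionality constants come from, the sketch below evaluates the scalar Airy pattern of a uniformly filled circular pupil and locates its half-maximum point and first dark ring. This is a scalar, low-NA idealization (an assumption); it ignores the polarization effects mentioned above. The Bessel function is computed from its integral representation so that only the standard library is needed, and all names are illustrative.

```python
import math


def bessel_j1(x, n=2000):
    """J1(x) = (1/pi) * integral over [0, pi] of cos(tau - x sin tau), midpoint rule."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        tau = (k + 0.5) * h
        total += math.cos(tau - x * math.sin(tau))
    return total * h / math.pi


def airy_intensity(x):
    """Normalized Airy intensity, with x = 2*pi*NA*r/lambda0."""
    return 1.0 if x == 0 else (2.0 * bessel_j1(x) / x) ** 2


def bisect(f, lo, hi, tol=1e-10):
    """Root of f in [lo, hi], assuming a single sign change."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


x_half = bisect(lambda x: airy_intensity(x) - 0.5, 1.0, 2.5)
x_zero = bisect(bessel_j1, 3.0, 4.5)

# A diameter 2r corresponds to x * lambda0 / (pi * NA).
fwhm_coeff = x_half / math.pi   # FWHM = fwhm_coeff * lambda0 / NA
ring_coeff = x_zero / math.pi   # first-dark-ring diameter = ring_coeff * lambda0 / NA

print(f"FWHM            = {fwhm_coeff:.3f} * lambda0/NA")
print(f"first dark ring = {ring_coeff:.3f} * lambda0/NA")
```

Both constants fall inside the 0.5 to 1.5 range quoted above: roughly 0.51 for the FWHM and 1.22 for the first dark ring.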
Figure 37.3 shows plots of intensity distribution at the focal plane of the 0.615NA
objective shown in Figure 37.2. The incident beam is assumed to be uniform and
linearly polarized along the X-axis. The bending of the rays by the lens produces
E-field components along the Y- and Z-axes as well; the distributions of these
components, which carry only a small fraction of the total optical energy, are shown
in Figure 37.3(b), (c). The logarithmic scale of these plots enhances the rings of light
around the central bright spot; in fact these rings are typically weak and do not
contribute much to the scanning signal. The central bright spot in Figure 37.3(a)
is the most important contributor to the signal, but for accurate measurements
the effects of the entire focused spot should be taken into consideration.

Figure 37.3 Logarithmic plots of intensity distribution at the focal plane of the
0.615NA objective shown in Figure 37.2. The incident beam is uniform and has
linear polarization along the X-axis. From left to right: X-, Y-, and Z-components
of polarization at best focus. The integrated intensities of the three components
are in the ratios 1 : 0.002 : 0.113.
Depth of focus
Another important characteristic of the focused spot is its depth of focus.
Typically, for high-NA objectives the range over which the spot size can be considered
to be small is quite limited. As shown in Figure 37.2, if the sample moves
by ±δ along the Z-axis, deviations from perfect focus may be tolerable; for larger
movements, the quality of the scanning signal suffers. The order of magnitude of
the depth of focus is given by the theory of diffraction as δ/λ ~ (D/λ)², which is an
expression for the Rayleigh range² of the beam in a medium in which the wavelength
is λ. This expression may be written as
$$\delta \sim D^2/\lambda. \qquad (37.3)$$
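As a rough numerical illustration of Eqs. (37.2) and (37.3), the sketch below evaluates D ~ λ0/NA and δ ~ D²/λ with the proportionality constants set to unity. The function names, the choice of micrometers as the unit, and the example values are assumptions made for this illustration; the results are order-of-magnitude estimates only.

```python
def spot_diameter(lambda0_um, na):
    """Rough focused-spot diameter D ~ lambda0 / NA, Eq. (37.2), in micrometers."""
    return lambda0_um / na


def depth_of_focus(lambda0_um, na, n=1.0):
    """Rough depth of focus delta ~ D**2 / lambda, Eq. (37.3), in micrometers.

    lambda is the wavelength inside the medium of index n: lambda = lambda0 / n.
    """
    d = spot_diameter(lambda0_um, na)
    return d * d / (lambda0_um / n)


# Hypothetical example: a 0.4NA objective in air at lambda0 = 0.633 um.
print(spot_diameter(0.633, 0.4))    # about 1.58 um
print(depth_of_focus(0.633, 0.4))   # about 3.96 um

# Same cone angle, immersed in a medium of index n = 2 (NA doubles to 0.8):
print(depth_of_focus(0.633, 0.8, n=2.0))  # half of the value in air
```

For a fixed cone angle, immersion in a medium of index n multiplies NA by n, so in this estimate both D and δ come out smaller by a factor n.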
Figure 37.4 Logarithmic plots of total intensity distribution at and near the
focus of the 0.615NA objective shown in Figure 37.2. From top to bottom
Δz = 2 μm, 1.5 μm, 1 μm, 0.5 μm, and 0. Because of the symmetry between the
two sides of focus, the distributions for ±Δz are the same. At best focus the
spot’s FWHM is 0.57 μm along X and 0.51 μm along Y. (Horizontal axis: x, from
−2 μm to 2 μm.)
showing that for a given cone angle θ both the spot size D and the depth of focus δ
shrink by a factor n, compared with an objective designed for operation in air.
For an oil-immersion lens having sin θ = 0.615 and n = 2, Figure 37.6 shows
plots of the total intensity distribution at 1 μm defocus (top) and at best focus
Figure 37.5 An oil-immersion objective focuses the beam onto the sample
through an index-matched fluid of refractive index n. The fluid is in contact with
both the sample and the front element of the lens. The rays that emerge from the
objective do not bend on their way to the sample, thus forming a high-NA cone
of light. For a given half-angle θ of the cone, the NA of an oil-immersion
objective is superior to that of an air-incidence objective by a factor n.
Figure 37.6 Logarithmic plots of intensity distribution at and near the focus of an
oil-immersion objective. The objective consists of the 0.615NA lens of Figure 37.2
in conjunction with a hemispherical glass cap. Both the cap and the immersion oil
have index n = 2, resulting in an overall NA of 1.23. Top: Δz is 1 μm away from
the focal plane. Bottom: the position of best focus; FWHM = 0.28 μm along
X, 0.25 μm along Y. (Horizontal axis: x, from −2 μm to 2 μm.)
[Figure 37.8 plots. (a) Sum signal and (b) differential signal (S1 − S2)/(S1 + S2) versus distance (μm, from −0.75 to 0.75), computed with scalar theory for oil immersion (NA = 1.2) and air incidence (NA = 0.6).]
Figure 37.8 Scalar diffraction theory applied to the grating of Figure 37.7 yields
single-line scans in the direction perpendicular to the grooves. The scanned period
extends from the center of the land at −0.75 μm to the center of the adjacent land,
at +0.75 μm, with the groove center at 0. The broken line corresponds to a 0.6NA
air-incidence objective, while the solid line represents a 1.2NA oil-immersion
objective. The detector module consists of a split detector aligned with the
grooves, yielding signals S1 and S2. (a) Sum signal scans corresponding to the total
reflected power. (b) Differential signal scans, corresponding to the “push–pull”
method.
Figure 37.9 Computed baseball patterns at the exit pupil of the 1.2NA oil-
immersion objective during the scans depicted in Figure 37.8. From top to bottom,
the focused spot is on the land center, on the groove edge, and on the groove center.
one side of the baseball pattern have a different phase from those appearing on
the opposite side and, therefore, the asymmetry between the two halves of the
baseball pattern yields a fairly large differential signal. Since these calculations are
based on the scalar theory of diffraction, anomalous effects due to surface plasmon
excitation and dependence on the beam’s polarization state are not observed. Such
effects will show up later in our full vector diffraction calculations.
plates, and the storage layer of compact disks is protected from dust and
fingerprints by a plastic substrate 1.2 mm thick. In either case the objective lens
must be corrected for the specific thickness and refractive index of the cover plate.
As shown in Figure 37.10, a cone of light focused through a parallel plate
becomes compressed toward the optical axis, its value of sin θ shrinking by the
refractive index n of the plate. At the same time, the wavelength of the light
inside the plate also shrinks by the same factor, to give λ = λ0/n. The net effect is
that the spot diameter D does not change as a result of focusing through the cover
plate. However, Eq. (37.3) implies that the depth of focus will improve. This
would be true, of course, if one interpreted the depth of focus as the depth of the
sample interrogated by the focused beam while the sample remained at rest. But
what happens if one moves the sample in the Z-direction and determines the
distance Δz over which the image of the sample remains sharp? One finds in the
latter case that focusing through the cover plate does not improve the depth of
focus at all. In other words, the depths of focus with and without the cover plate
are exactly the same. (Keep in mind that the objective lens is corrected for each
case separately.)
The reason for the above apparent discrepancy is as follows. If one moves the
sample and the cover plate together by Δz along the positive Z-axis, the top of the
cover plate also moves away from the lens by the same distance. Consequently
the focused spot recedes from the sample’s surface by nΔz, which is greater than
the actual travel of the sample. (This analysis, which ignores residual spherical
aberrations, is quite straightforward and requires only the use of Snell’s law and
simple geometry. It also applies to the case where the sample and the cover plate
move along the negative Z-axis.) Thus, as long as the lens remains stationary
while the sample and the cover plate travel together along Z, the cover plate does
not increase, nor does it decrease, the depth of focus. Aside from protecting the
sample, focusing through the cover plate has no obvious advantages.
[Figure 37.10 diagram (caption not recovered); labels: objective, sample, cover plate (or substrate).]
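The Snell's-law geometry behind the nΔz statement can be sketched for a single ray as follows. This is a minimal illustration under stated assumptions: the plate index of 1.5, the ray angle, and all function and parameter names are hypothetical, and only a near-axis ray is traced (marginal rays give a slightly different ratio, which is the residual spherical aberration that the analysis ignores).

```python
import math


def focus_depth_in_plate(air_focus_below_top, n, theta_deg):
    """Depth below the plate's top surface at which a ray crosses the axis.

    air_focus_below_top: distance beyond the top surface at which the ray
    would have crossed the axis had the plate been absent.
    theta_deg: ray angle in air, measured from the optical axis.
    """
    theta = math.radians(theta_deg)
    theta_in = math.asin(math.sin(theta) / n)        # Snell's law
    height = air_focus_below_top * math.tan(theta)   # ray height at the top surface
    return height / math.tan(theta_in)


def focus_shift_ratio(n, theta_deg, dz=1.0, air_focus_below_top=100.0):
    """Focus shift inside the plate per unit travel dz of the plate away from the lens."""
    before = focus_depth_in_plate(air_focus_below_top, n, theta_deg)
    after = focus_depth_in_plate(air_focus_below_top - dz, n, theta_deg)
    return (before - after) / dz


print(focus_shift_ratio(1.5, 1.0))  # close to n = 1.5 for a near-axis ray
```

Since the depth of focus inside the plate and the focus shift per unit travel both scale with n in this picture, the two factors cancel, which is the conclusion reached above.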
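[Figure diagram (caption not recovered); labels: objective, SIL, sample. Accompanying plot with horizontal axis x from −2 μm to 2 μm.]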
disk surface.⁴ Under such circumstances the light must jump through the gap in
order to interact with the storage layer of the disk. This is not a serious problem
for those rays that propagate along the Z-axis, or at a small inclination with
respect to it, since they are readily transmitted through the bottom of the SIL.
However, for those rays that make a large angle with the Z-axis, the Fresnel
transmission coefficients become small; in particular, when the incidence angle
exceeds the critical angle of total internal reflection the transmissivity drops to
zero.¹ Fortunately, the phenomenon of frustrated total internal reflection allows
photons to tunnel through the gap and reach the storage layer of the disk. For this
to happen efficiently, the gap width Wg must be a small fraction of λ0. (See
Chapter 27, “Some quirks of total internal reflection”.)
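To get a feel for why the gap must be a small fraction of λ0, the sketch below computes the critical angle and the 1/e decay length of the evanescent field amplitude in the air gap, using the standard single-ray expression λ0 / (2π √(n² sin²θ − 1)) for a ray beyond the critical angle. The values n = 2 and sin θ = 0.6 echo examples used elsewhere in this chapter; the function names are illustrative, and this estimate is not a substitute for a full vector diffraction calculation.

```python
import math


def critical_angle_deg(n):
    """Critical angle of total internal reflection for a glass-air interface."""
    return math.degrees(math.asin(1.0 / n))


def evanescent_decay_length(n, sin_theta, lambda0=1.0):
    """1/e decay length of the evanescent field amplitude in the air gap.

    sin_theta is the sine of the ray angle inside the glass of index n.
    The result is in the same units as lambda0.
    """
    s = n * sin_theta  # transverse wavevector in units of 2*pi/lambda0
    if s <= 1.0:
        raise ValueError("ray is below the critical angle; no evanescent decay")
    return lambda0 / (2.0 * math.pi * math.sqrt(s * s - 1.0))


print(critical_angle_deg(2.0))            # about 30 degrees for n = 2
print(evanescent_decay_length(2.0, 0.6))  # roughly 0.24, in units of lambda0
```

For these assumed values the marginal rays decay over roughly a quarter of a vacuum wavelength, so a gap of a small fraction of λ0 is needed for them to couple efficiently.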
The effects of the air gap on the signal level can be seen in Figure 37.13,
which shows results of computer simulations based on the full vector theory of
diffraction.⁵,⁶ The grating used in these simulations is that of Figure 37.7, the
assumed refractive index of the SIL is n = 2, and the incident beam is assumed to
be linearly polarized. The direction of incident polarization is parallel to the
grooves in Figures 37.13(a), (b), and perpendicular to the grooves in Figures 37.13
(c), (d). Several line-scans across a single period of the grating are shown; the plots
in Figures 37.13(a), (c) correspond to the total returned optical power while those
in Figures 37.13(b), (d) represent the differential (or push–pull) signal.⁵ The signal
amplitude is highest when Wg = 0, but it has dropped considerably at Wg = 50 nm
and even further at Wg = 200 nm. In practice a gap-width below about λ0/10 is
usually acceptable; going beyond this value causes a sharp reduction in the
signal level.
[Figure 37.13 plots (caption not recovered). (a), (c) Sum signal (S1 + S2) and (b), (d) differential signal versus distance (μm, from −0.75 to 0.75) for a SIL with NA = 1.2 at gap widths of 0, 50 nm, and 200 nm, compared with an air-incidence objective (NA = 0.6); (a), (b) parallel polarization, (c), (d) perpendicular polarization.]
The scanning signals are sensitive to the direction of incident polarization.
In general the two polarization directions parallel and perpendicular to the
grooves are not equivalent and yield different results, as may be readily
observed in Figure 37.13. To emphasize further the significance of polarization,
Figure 37.14 shows the light intensity pattern at the exit pupil of the
objective lens for polarization directions parallel and perpendicular to the
grooves, all other things being kept equal. The two baseball patterns show clear
differences.
Figure 37.14 Computed baseball patterns at the exit pupil of a 0.6NA objective
lens when the grating of Figure 37.7 is placed beneath a SIL of refractive index
n = 2 (total NA = 1.2). The focused spot is at the center of the land and the
assumed gap width is 100 nm. In (a) the polarization vector is parallel to the
grooves, while in (b) it is perpendicular to the grooves.
Figure 37.15 Focusing through a “super SIL” of refractive index n and radius
R. In the absence of this SIL the cone of light from the objective comes to focus
at a distance of nR from the center C of the sphere. The bending of the rays at the
sphere’s surface shifts the focal point to a distance of R/n from C, without
introducing any aberrations.
The full factor n² mentioned above may not be realized in practice, because the
bending of the rays within the super SIL works only up to a point, stopping when
the marginal rays become orthogonal to the Z-axis. If the objective happens to
have a large NA to begin with, the super SIL can only increase the value of its
sin θ up to 1, at which point the remaining rays will miss the super SIL. The other
improvement by a factor of n, however, is always realized in practice because the
wavelength always shrinks by this factor.
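The limit described above can be expressed as a small calculation. The sketch below is a simplified model, not the author's computation: it assumes an ideal super SIL, treats the objective's NA in air as sin θ, caps the enlarged sin θ at 1, and then applies the factor n from the shortened wavelength. The function name is hypothetical.

```python
def super_sil_effective_na(na_objective, n):
    """Effective NA with an ideal super SIL of index n in front of an air objective.

    sin(theta) grows by a factor n but cannot exceed 1; the wavelength inside
    the SIL shrinks by n, which contributes a further factor n.
    """
    sin_theta_inside = min(n * na_objective, 1.0)
    return n * sin_theta_inside


print(super_sil_effective_na(0.4, 2.0))  # 1.6, the full gain of n**2 = 4
print(super_sil_effective_na(0.6, 2.0))  # 2.0, gain limited to about 3.3
```

In this model a 0.4NA objective with n = 2 gets the full n² improvement, whereas a 0.6NA objective is already limited by the cap on sin θ.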
To see the effect of the super SIL on the focused spot, computed light intensity
distributions at and near the focus of a 0.4NA objective are shown in Figure 37.16.
The FWHM spot diameter at best focus is 0.84 μm along X and 0.80 μm along Y,
while the depth of focus according to Eqs. (37.2) and (37.3) is around 4 μm.
When a super SIL of index n = 2 is placed in front of this objective the plots of
Figure 37.17 are obtained. Clearly the spot size has shrunk by n², but the depth of
focus is nearly the same as it was before the super SIL was introduced. Once again,
it is observed that focusing through the super SIL does not reduce the depth of
focus, as long as the sample and the super SIL move together along the Z-axis.
(This statement ignores the effects of a small amount of spherical aberration
introduced by the departure of the super SIL from its ideal location.)
Figure 37.16 Logarithmic plots of intensity distribution at and near the focus
of a 0.4NA objective lens operating at λ0 = 633 nm. Top: Δz = 5 μm defocus.
Bottom: best focus (Δz = 0). The spot’s FWHM at best focus is 0.84 μm along X
and 0.80 μm along Y. (Horizontal axis: x, from −2 μm to 2 μm.)
A catadioptric SIL
A design that combines the objective and the SIL into one catadioptric element is
shown in Figure 37.18.⁷ (A catadioptric element is one that involves both the
reflection and the refraction of light.) A collimated beam enters the concave facet
of the lens, is reflected first at a flat internal mirror, then at an aspheric internal
mirror, and is finally brought to focus at the bottom of a plateau that is in contact
(or near contact) with the surface of the sample under investigation. This particular
lens, which is also aplanatic, has a reasonably large field of view, with
NA = 1.1. Because the central portion of the incident beam is not used, the lens
effectively has an annular aperture, which makes the central spot even smaller
than the Airy disk but at the expense of increasing the brightness of the rings.
Figure 37.19 shows plots of intensity distribution at the focus of the lens for the
three components of polarization. The FWHM of the total intensity distribution is
0.3 μm along X and 0.26 μm along Y. Depth of focus is not a very useful concept
for this particular element because the incident beam is collimated.
[Figure 37.18 diagram (caption not recovered); labels: flat mirror, aspheric mirror, glass (n = 1.813).]
Figure 37.19 Logarithmic plots of the intensity distribution at the focal plane of
the catadioptric lens of Figure 37.18. The incident beam is collimated and linearly
polarized along the X-axis. From left to right are shown the X-, Y-, and
Z-components of polarization. The integrated intensities of the three components
are in the ratios 1 : 0.003 : 0.128. The effective NA-value of the lens is 1.1, but its
annular shape of aperture gives rise to a spot size slightly less than that of the Airy
disk. The enhanced rings are also caused by the annular shape of the aperture.
In the Fourier domain, the first term in the above expansion, being the constant
or d.c. term, appears at the center of the plane of spatial frequencies. In
the rear focal plane of the objective lens, therefore, the d.c. term appears as a
bright spot centered at and around the optical axis. Zernike realized that
by placing a 90° phase shift on this d.c. term (i.e., multiplying it by i), he
could bring it in phase with the second term in Eq. (38.1). In this way he
enabled beams corresponding to the two terms in the above expansion to
interfere with each other when they overlapped within the image plane of the
system. The primary function of the spatial filter, therefore, is to delay, by one
quarter of a wavelength, the central region of the beam within the rear focal
plane of the objective lens.
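The effect of the spatial filter can be illustrated with an idealized single-point model. The sketch below is an assumption-laden simplification, not the diffraction calculation used for the figures in this chapter: the object field exp(iφ) is split into a d.c. term, taken to be 1 (a background that dominates the field of view), and a diffracted term exp(iφ) − 1, and only the d.c. term is multiplied by the filter. The 36° phase step and the 50% amplitude attenuation are the values used in this chapter's examples; the function and parameter names are hypothetical, and which sign of the filter phase makes the marks bright depends on the sign convention assumed here for the object's phase.

```python
import cmath
import math


def image_intensity(phi_deg, filter_phase_deg=None, filter_amplitude=1.0):
    """Image intensity at a point of phase phi in the idealized model.

    The d.c. term (set to 1) is multiplied by the filter; the diffracted
    term exp(i*phi) - 1 passes through unchanged.
    """
    phi = math.radians(phi_deg)
    diffracted = cmath.exp(1j * phi) - 1.0
    if filter_phase_deg is None:
        dc = 1.0
    else:
        dc = filter_amplitude * cmath.exp(1j * math.radians(filter_phase_deg))
    return abs(dc + diffracted) ** 2


def contrast(phi_deg, **filter_kwargs):
    """Ratio of mark intensity to background intensity."""
    return (image_intensity(phi_deg, **filter_kwargs)
            / image_intensity(0.0, **filter_kwargs))


print(contrast(36.0))                                                # no filter: marks invisible
print(contrast(36.0, filter_phase_deg=90.0))                         # marks differ from background
print(contrast(36.0, filter_phase_deg=90.0, filter_amplitude=0.5))   # stronger contrast
print(contrast(36.0, filter_phase_deg=-90.0))                        # contrast reversed
```

In this model the unfiltered ratio is exactly 1, attenuating the d.c. term increases the ratio, and flipping the sign of the filter phase moves the marks to the other side of the background level.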
in Figure 38.2(b). The largest mark is 10λ long and the smallest mark is
3λ wide. These marks are large enough to yield a reasonably clear image with
both coherent and incoherent illumination, in conjunction with an appropriate
phase-contrast filter. All the marks impart to the incident beam a phase shift
of 36° relative to the background (corresponding to an optical path-length
difference of λ/10).
For coherent illumination of the object by the beam depicted in Figure 38.2(a),
the logarithmic plot of intensity distribution at the Fourier plane is shown in
Figure 38.2(c). The bright central spot in this figure is the d.c. term mentioned
earlier. Note that the cutoff point of this logarithmic plot is at α = 6 and, therefore,
the light diffracted by the object and spread throughout the aperture of the
objective lens is quite weak. In the absence of any phase-contrast mechanism the
computed image of the object is as shown in Figure 38.2(d). This obviously is a
very poor image, one in which the boundaries of the marks are barely perceptible.
We will see below how the action of the phase-contrast filter dramatically
improves the quality of this image.
Figure 38.3 Image of the phase object of Figure 38.2(b), obtained with
the coherent illumination of Figure 38.2(a) when a phase-contrast mask is
placed in the Fourier plane. The mask is a small disk of radius 275λ,
imparting a +90° phase shift to the central region of the beam. (a) Intensity
distribution in the image plane. (b) Same as (a) but on a logarithmic scale
(α = 1.65).
Figure 38.4 Image of the phase object of Figure 38.2(b), obtained with the
coherent illumination of Figure 38.2(a), when an amplitude mask is placed in the
Fourier plane. The mask, a small disk of radius 275λ, blocks the central region of
the beam. (a) Intensity distribution in the image plane. (b) Same as (a) but on a
logarithmic scale (α = 3).
Figure 38.5 Image of the phase object of Figure 38.2(b) obtained with the
coherent illumination of Figure 38.2(a) when a phase/amplitude mask is placed
in the Fourier plane. The mask, a small disk of radius 275λ, imparts a +90°
phase shift to the central region of the beam while attenuating its amplitude by
50%. (a) Intensity distribution in the image plane. (b) Same as (a) but on a
logarithmic scale (α = 1.65).
its amplitude to bring it in line with the magnitude of φ(x, y). Figure 38.5 shows
the image obtained with a filter that cuts the amplitude in half while shifting the
phase by 90°. The resulting contrast enhancement is quite impressive.
Finally we consider the effect of changing the phase shift from +90° to −90°.
This is shown in Figure 38.6, where the images of the marks are now brighter
than their background. A similar situation will arise, of course, if instead of
reversing the sign of the phase at the filter we reverse the phase of the marks at
the object. In practice, most phase objects contain a number of positive as well as
negative features, and their images will appear to be darker than the background
in some regions and brighter in other regions.
Figure 38.6 Same as Figure 38.5 except for the phase shift of the mask, which
is −90° in the present case. (a) Intensity distribution in the image plane.
(b) Same as (a) but on a logarithmic scale (α = 2).
Figure 38.7 Imaging of the phase object of Figure 38.2(b), obtained with an
incoherent, annular illuminator. (a) The simulated homogeneous, annular
light source consists of 36 independent, quasi-monochromatic point sources.
These point sources are arranged uniformly around the circumference of
the entrance pupil of the 0.25NA condenser lens. (b) Computed intensity
distribution at the focal plane of the condenser, which is also the location of
the object. (c) Distribution of the logarithm of intensity (α = 6) at the exit
pupil of the 0.25NA objective lens. The annular phase mask placed at this
pupil has a width of 300λ; it imparts a +90° phase shift and a 50% (amplitude)
attenuation to the beam at the outer periphery of the exit pupil.
(d) Computed intensity distribution at the image plane of the system.
exit pupil of the objective lens. Evidently, the phase-contrast filter must also be in
the form of an annular ring, covering the circumference of the objective’s exit pupil
and capable of delivering a 90° phase shift as well as a reasonable attenuation
factor to the incident beam.
The resulting image shown in Figure 38.7(d) is obviously of high quality, both
in terms of resolution and contrast.
The state of polarization of a given beam of light is modified upon reflection from
(or transmission through) an object. The resulting change in polarization state
conveys information about the structure and certain physical properties of the
illuminated region. Polarization microscopy is a variant of conventional optical
microscopy that enables one to monitor these changes over a small area of a
specimen. Such observations then allow the user to identify and analyze the
specimen’s structural and other physical features.¹,²
Traditionally, observations with a polarization microscope have been categorized
as “orthoscopic” or “conoscopic.” Orthoscopic observations involve direct
imaging of the sample itself, thus allowing one to view the indentations, striations,
variations of optical activity and birefringence, etc., over the sample’s surface.
Conoscopic observations, however, involve illuminating a crystalline surface with
a cone of light and then imaging the exit pupil of the objective lens. This mode of
observation is used in characterizing the crystal’s ellipsoid of birefringence and
identifying its optical axes.
[Figure 39.1 diagram (caption not recovered); labels: light source, lens, linear polarizer, objective, sample, analyzer or Wollaston prism, CCD camera.]
Although the source is spatially incoherent, the projected beam at the sample’s
surface is, in general, partially coherent. As for the degree of temporal coherence
of the light source, it does not play a role in polarization microscopy and is,
therefore, ignored throughout this chapter. All one needs to assume is that the
light source is quasi-monochromatic, with a bandwidth that is sufficiently narrow
to allow one to restrict attention to a single wavelength. The bandwidth must be
wide enough, however, to render the source spatially incoherent. (An extended
but purely monochromatic source is, of necessity, spatially coherent because the
radiated fields from any two locations on the source maintain their relative phase
at all times.)
Figure 39.2 Various distributions of the reflected light at the exit pupil of the
objective when a single monochromatic point source is used to illuminate the
sample. The intensity plots in (a) and (b) correspond, respectively, to the
components of polarization parallel and perpendicular to the polarizer’s transmission
axis. The polarization rotation angle ρ is depicted in (c) and the
polarization ellipticity η is shown in (d). The gray-scale of the latter plots depicts
positive values of ρ and η as bright and negative values as dark.
light through the analyzer, thereby reducing the contrast of the image. When the
problem is caused by reflections and refractions at the various surfaces of the
objective (or condenser) lens, a viable solution is to use a specialty objective that
incorporates a half-wave plate in the midst of its optical train.¹,⁶ The half-wave
plate rotates the polarization direction by 90°, allowing the four-corner rotations
before and after the plate to cancel each other out. This solution was offered by
objective-lens manufacturers in the early days, before the advent of powerful
antireflection coatings. Nowadays the various surfaces of the objective and the
condenser are antireflection coated, and the four-corners problem caused by these
surfaces is negligible.
The problem still remains, however, that Fresnel’s reflection coefficients at the
sample’s surface differ for p- and s-polarized rays, causing a polarization rotation
problem that is aggravated with increasing angle of incidence. Moreover, if the
sample is observed through a birefringent substrate, the resulting polarization
variations over the beam’s cross-section give rise to spurious light transmission
through the analyzer, which, once again, reduces the image contrast.⁵ These
problems can no longer be solved by the incorporation of a half-wave plate within
the objective lens, because they are sample dependent. The differential method of
microscopy described below solves the four-corners problem by splitting the
spurious light between two images of the sample and then eliminating it by
subtracting one image from the other.
Differential method†
A simple modification of the conventional microscope of Figure 39.1 involves
replacing the analyzer with a Wollaston prism. The Wollaston splits the image of
the sample into two and transmits both images, side by side, to the camera. With the
transmission axes of the Wollaston fixed at ±45° relative to the polarizer’s axis, the
unrotated light is split equally between the two images. When there is polarization
rotation, however, one image receives more light than the other, the sense of rotation
of the polarization determining which image gets the larger share. The two images
are then subtracted from each other (within the computer) to produce a single
differential image of the sample. The differential image is superior in many respects
to the conventional image, as will be seen in the examples that follow. The main
advantage of differential polarization microscopy is that it does not suffer from the
four-corners problem. Another advantage is that a map of reflectivity variations
across the sample can be readily constructed by adding the two images together;
normalizing the differential image by the sum image then provides a pure map of
polarization rotation at the sample.
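The arithmetic of the differential method can be sketched with Malus's law for an idealized case. The model below assumes a purely linear reflected polarization rotated by a small angle, with no ellipticity, and Wollaston axes at +45° and −45° to the polarizer axis; the function names and the reflectivity values are hypothetical choices for illustration.

```python
import math


def wollaston_images(reflectivity, rotation_deg):
    """Intensities in the two images for light rotated by rotation_deg.

    The Wollaston axes are at +45 and -45 degrees to the polarizer axis;
    each channel follows Malus's law.
    """
    rho = math.radians(rotation_deg)
    i1 = reflectivity * math.cos(rho - math.pi / 4) ** 2
    i2 = reflectivity * math.cos(rho + math.pi / 4) ** 2
    return i1, i2


def normalized_differential(i1, i2):
    """Differential image divided by the sum image."""
    return (i1 - i2) / (i1 + i2)


for r in (0.62, 0.31):  # two hypothetical reflectivities
    i1, i2 = wollaston_images(r, 0.5)
    # Both cases give sin(2 * 0.5 degrees), about 0.01745
    print(normalized_differential(i1, i2))
```

In this idealized model the normalized differential signal equals sin 2ρ for a rotation ρ, independent of the local reflectivity, which is why dividing by the sum image isolates the polarization rotation.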
The sample
In general, the polarization image of a sample is mixed with its other images, say,
those produced by reflectivity variations or optical phase variations across the
sample. To avoid such complications, we consider a smooth sample having
uniform amplitude and phase reflectivity everywhere, but one that rotates the
polarization of the incident beam as a result of optical activity. A perpendicularly
magnetized thin-film sample provides a good example in this case. By changing
the direction of magnetization (from up to down) in different locations, one can
create a pattern of magnetic domains such as that shown in Figure 39.3. Here the
smallest domain (shown at the center) is one wavelength in diameter. The black
and white regions are magnetized in opposite directions and rotate the incident
(linear) polarization by +0.5° and −0.5°, respectively.
† To the author’s best knowledge the concept of differential polarization microscopy has not been described
previously in the technical and patent literature and may therefore be novel.
[Figure 39.3 plot (caption not recovered): pattern of magnetic domains; horizontal axis x/λ0 from −6 to 6.]
The material of the sample used in the following examples is assumed to have
complex index of refraction (n, k) = (3.35, 4.03) which gives it a reflectivity of
62% at normal incidence. At oblique incidence the Fresnel reflection coefficients
for p- and s-polarized light differ from each other, thus inducing some rotation
and ellipticity into the reflected polarization state. For instance, at a 53° angle of
incidence, the linear polarization of a ray originally directed at 45° with respect to
the p-direction rotates by 7.4° and acquires 8.7° of ellipticity. This change of the
polarization state upon reflection is caused solely by the Fresnel coefficients of
the sample, independently of its optical activity.
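The numbers quoted above can be checked approximately with the standard Fresnel formulas. The sketch below is an illustrative calculation under stated assumptions: incidence from air, complex index written as N = n + ik, and a sign convention in which r_p is negated so that normal-incidence reflection counts as zero rotation. The function names are hypothetical, and the signs of the computed rotation and ellipticity depend on these conventions, so only magnitudes are compared.

```python
import cmath
import math


def fresnel_rp_rs(n_complex, theta_deg):
    """Fresnel amplitude reflection coefficients (r_p, r_s), incidence from air."""
    theta = math.radians(theta_deg)
    cos_t = math.cos(theta)
    eps = n_complex ** 2
    q = cmath.sqrt(eps - math.sin(theta) ** 2)  # N * cos(refraction angle)
    r_s = (cos_t - q) / (cos_t + q)
    r_p = (eps * cos_t - q) / (eps * cos_t + q)
    return r_p, r_s


def rotation_and_ellipticity_deg(r_p, r_s):
    """Rotation and ellipticity (degrees) of a reflected ray that was linearly
    polarized at 45 degrees to the p-direction before reflection.

    r_p is negated so that normal-incidence reflection gives zero rotation
    (a sign-convention choice made for this sketch).
    """
    e_p, e_s = -r_p, r_s  # equal incident p and s amplitudes
    s0 = abs(e_p) ** 2 + abs(e_s) ** 2
    s1 = abs(e_p) ** 2 - abs(e_s) ** 2
    cross = e_p * e_s.conjugate()
    orientation = 0.5 * math.atan2(2.0 * cross.real, s1)  # major axis from p
    ellipticity = 0.5 * math.asin(2.0 * cross.imag / s0)
    return math.degrees(orientation) - 45.0, math.degrees(ellipticity)


n_sample = complex(3.35, 4.03)  # (n, k) quoted in the text
r_p0, r_s0 = fresnel_rp_rs(n_sample, 0.0)
print(abs(r_s0) ** 2)  # normal-incidence reflectivity, about 0.62

rot, ell = rotation_and_ellipticity_deg(*fresnel_rp_rs(n_sample, 53.0))
print(abs(rot), abs(ell))  # compare with the 7.4 and 8.7 degrees quoted above
```

The rotation arises because |r_p| is smaller than |r_s| at oblique incidence, and the ellipticity arises from the phase difference between the two coefficients.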
Low-resolution imaging
Figure 39.4 shows computed images, both conventional and differential, of the
magnetic marks of Figure 39.3 obtained with a 50×, 0.4NA objective. In these
calculations the source was defocused by a distance of 35λ0 below the object
plane, and the images from a total of 361 point sources were superimposed to
simulate the (spatially incoherent) light source. For the conventional image
shown in Figure 39.4(a) the analyzer axis was set 0.5° away from the crossed
position, nearly the optimum setting for achieving maximum contrast in this case.
(The contrast may be reversed by rotating the analyzer to the opposite side of the
crossed position.) The resolution of these images is not great, as evidenced by the
near-disappearance of the small mark in the center. The contrast, however, is
quite good, and there is little difference between the conventional and differential
methods of imaging. The reason is that at 0.4NA the half-angle of the focused cone
of light is only 23.6°, which is not large enough to cause a significant four-corners
problem.
High-resolution imaging
Obtaining images with high resolution requires a high-NA objective lens. Figure 39.5
shows both conventional (a), (b) and differential (c), (d) images of the sample of
Figure 39.3 obtained with a 50×, 0.8NA objective. The images on the left show dark
domains on a bright background, while the reverse-contrast counterpart of each
image is shown to its right. In these calculations the source was defocused by a
distance of 10λ0 below the object plane, and the images from a total of 361 point
sources were superimposed to simulate the (spatially incoherent) light source.
Inspection of Figure 39.5 reveals that the resolution has improved over that of
Figure 39.4. The contrast, however, is quite poor for the conventional images in
Figures 39.5(a), (b), even though the analyzer has been set optimally at 1.5° from
the crossed position. This poor contrast is a manifestation of the four-corners
problem. In comparison, the differential images of Figures 39.5(c), (d) show
excellent contrast, which is not surprising considering that the four-corners
contributions to individual images (before subtraction) are identical and can therefore
be removed by subtraction.
To gain a better appreciation of the four-corners problem, consider the
intensity distribution at the plane of the sample, Figure 39.6, corresponding to a
single point source defocused by 10λ0. Although the incident beam entering the
objective lens is linearly polarized along the X-axis, the defocused spot, in
consequence of the bending of the rays by the lens, contains all three components
of polarization, along the X-, Y-, and Z-axes; these are shown respectively from
top to bottom in Figure 39.6. The peak intensities of the three components in
Figure 39.6 are in the ratios Ix : Iy : Iz = 1 : 0.007 : 0.185. Upon reflection from the
sample the distributions remain qualitatively the same, but the peak-intensity
ratios change to 1 : 0.017 : 0.142. Thus the relative content of the Y-component
increases upon reflection while that of the Z-component decreases. When this
distribution returns to the objective lens, it gives rise to patterns of intensity and
polarization similar to those shown in Figure 39.2. At the exit pupil the values of
the polarization rotation angle ρ range from −7.0° to +8.1°, while the polarization
ellipticity η ranges from −8.8° to +8.6°. The slight asymmetry between
positive and negative values is caused by the presence of magnetization in the
sample. In the absence of magneto-optical activity, ρ and η vary between ±7.4°
and ±8.7°, respectively.
Figure 39.6 Distribution of incident intensity at the plane of the sample corresponding
to a single point source defocused by 10λ0 through a 0.8NA objective.
The incident beam entering the lens is linearly polarized along the X-axis. Top to
bottom: intensity distributions corresponding to polarization components along the
X-, Y-, and Z-axes. (Horizontal axis: x/λ0, from −12 to 12.)
Substrate birefringence
Sometimes it is necessary to observe a sample through an intervening medium,
such as a coating layer or a substrate. If this medium happens to be birefringent,
it creates a four-corners problem of its own.⁵ As a typical example, assume that the
sample of Figure 39.3 is coated with a birefringent layer 500 nm thick whose
principal refractive indices along the coordinate axes are (nx, ny, nz) = (1.5, 1.6, 1.7).
For this sample, conventional microscopy yields the image shown in Figure 39.7(a),
Figure 39.7 Images of the sample of Figure 39.3, coated with a birefringent
layer and placed in a microscope having a 50 ·, 0.8NA objective. (a) Conventional
image, obtained with the analyzer set optimally at 5 away from extinction.
(b) Differential image. (c) Same as (b) but with the order of subtraction reversed.
Conoscopic observations
The system depicted in Figure 39.8 captures the essence of conoscopic polar-
ization microscopy. Here a coherent, monochromatic beam of light is linearly
polarized and sent through an objective lens to be focused on a birefringent
crystal. The reflected light is re-collimated by the objective and observed after
going through a crossed analyzer. For the specific example described below, the
objective’s NA-value is 0.375 and its focal length f is 20 000λ0. The sample is in
the XY-plane, the Z-axis being perpendicular to its surface. The crystal slab’s
thickness is 430λ0, its principal refractive indices are (nx, ny, nz) = (1.686, 1.682,
1.531), and its ellipsoid of birefringence is rotated around the Z-axis by 13°.
The computed intensity distribution at the observation plane of Figure 39.8
is shown in Figure 39.9(a), and the corresponding logarithmic plot appears in
Figure 39.9(b). Within the focused cone there are two rays that propagate along
the two optical axes of the crystal; these rays return without any change in their
state of polarization and are therefore blocked by the analyzer. There are also
groups of rays whose polarization vectors undergo rotation by integer multiples
of 180° in double passage through the slab. These rays are also blocked by
the analyzer, giving rise to the various dark regions in the intensity patterns of
[Figure 39.8: schematic of the conoscopic setup. Labeled elements: polarizer, beam-splitter, lens, birefringent crystal, aluminum mirror, analyzer, observation plane.]
Figure 39.9 (a) Intensity and (b) logarithmic intensity distributions at the
observation plane in the system of Figure 39.8 with a biaxially birefringent crystal.
George Nomarski invented the method of differential interference contrast for the
microscopic observation of phase objects in 1953.1,2,3 The features on a phase
object typically modulate the phase of an incident beam without significantly
affecting the beam’s amplitude. Examples include unstained biological samples
having differing refractive indices from their surroundings, and reflective (as
well as transmissive) surfaces containing digs, scratches, bumps, pits, or other
surface-relief features that are smooth enough to reflect specularly the incident
rays of light. A conventional microscope image of a phase object is usually faint,
showing at best the effects of diffraction near the corners and sharp edges but
revealing little information about the detailed structure of the sample.4
Nomarski’s method creates two slightly shifted, overlapping images of the same
surface. The two images, being temporally coherent with respect to one another,
optically interfere, producing contrast variations that contain useful information
about the phase gradients across the sample’s surface. In particular, a feature that has
a slope in the direction of the imposed shear appears with a specific level of brightness
that is distinct from other, differently sloping regions of the same sample.4,5,6
The Nomarski microscope uses a Wollaston prism in the illumination path to
produce two orthogonally polarized, slightly shifted bright spots at the sample’s
surface. Upon reflection from (or transmission through) the sample, the two
beams are collected by the objective lens, then sent through the same (or, in the
case of a transmission microscope, a similar) Wollaston prism, which recombines
the two beams by sliding them back over each other. The two beams subsequently
arrive coincidentally in the image plane of the microscope, but the two images of
the sample which they carry will be relatively displaced. A linear analyzer, placed
after the Wollaston prism in the reflected (transmitted) path, brings the polar-
ization vectors of the two images into alignment, enabling the two to interfere
with each other. A sheared interferogram of the sample’s surface is thus formed at
the image plane of the microscope.
40 Nomarski’s differential interference contrast microscope
Wollaston prism
Because Nomarski’s method of microscopy is fundamentally dependent on the
action of the Wollaston prism, a brief description of this polarizing beam-splitter is
in order. The Wollaston prism, depicted in Figure 40.1, consists of two cemented
wedges from the same uniaxial birefringent crystal (e.g., quartz or calcite). The
individual wedges are precisely cut and polished, then aligned with their optic axes
orthogonal to each other.4 In Figure 40.1 the optic axis of the upper wedge is
horizontal within the plane of the page, while that of the lower wedge is perpen-
dicular to the plane. The crystal’s ordinary and extraordinary refractive indices,
no and ne, interact with the E-field components perpendicular and parallel to the
optic axis, respectively.
The incident beam, in general, has both s- and p-components of polarization.
In going through the upper half of the Wollaston, the p-component interacts
with ne and the s-component with no, but the propagation direction remains the
same for both the p- and s-beams. In the lower half the roles of no and ne are
exchanged, with the result that the p-component is deflected to one side and the
s-component to the other (one beam enters a denser, the other a rarer medium).
The angular separation of the beams is further enhanced by Snell’s law when
they exit the prism. Emerging from the Wollaston, therefore, are two beams,
propagating in different directions and having mutually orthogonal directions of
polarization.
Figure 40.1 The Wollaston prism consists of two cemented wedges of the
same uniaxial birefringent crystal, aligned with their optic axes in different
directions. The incident beam, with its p- and s-components of polarization, is
split at the interface between the wedges. Emerging from the Wollaston are two
orthogonally polarized beams that propagate in different directions.
Figure 40.2 shows a thin bundle of rays arriving at a Wollaston prism and
splitting into two orthogonally polarized beams. The p- and s-beams go through a
microscope objective and illuminate the sample in two small, slightly displaced
patches that cover the objective’s field of view. Upon reflection from the sample
the beams return through the objective and come together again as they emerge
from the Wollaston. Note that, in a round trip through this system, the optical
path lengths of the p- and s-beams will be the same only if the Wollaston is
centered on the Z-axis. In particular, if the Wollaston is translated along the
X-axis then, during a round trip, one beam sees a longer optical path than the
other. The relative phase of the p- and s-beams, referred to as the bias phase B,
can therefore be adjusted by sliding the Wollaston along the X-axis. Note that, for
a given lateral position of the Wollaston, the bias phase B is constant for all the
ray bundles that go through the system: it is independent of their initial distance
from the Z-axis.
Figure 40.2 A bundle of rays entering a Wollaston prism is split into p- and
s-polarized beams. The beams go through a microscope objective and illuminate
the sample in two small, slightly displaced patches that cover the objective’s field
of view. Upon reflection from the sample, the beams return through the objective
and come together as they exit the Wollaston prism. The bias phase B between the
two beams may be adjusted by sliding the Wollaston in the horizontal direction.
Assuming α = 0.84° for the wedge angles and no = 1.54467, ne = 1.55379 for
the ordinary and extraordinary refractive indices of the crystal (quartz), the
angular separation of the two beams emerging from the Wollaston (in the forward
path) will be 0.0153°. For an objective lens having f = 3750λ, where λ is the
wavelength of the quasi-monochromatic light source, this angular separation
results in one λ of displacement between the two spots that illuminate the sample.
Moreover, for every lateral shift by 100λ of the Wollaston, there occurs a bias
phase B = 19.26° between the p- and s-beams in a double pass through the
system. So, for example, if the lateral shift is 1870λ then one beam will be
retarded by a full 2π relative to the other.
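The numbers quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the usual small-angle estimate 2(ne − no) tan α for the angular split of a Wollaston prism, and an optical path difference that grows linearly with lateral displacement, neither of which is spelled out in the text. The function and argument names are hypothetical, and all lengths are in units of the wavelength.

```python
import math

def wollaston_numbers(alpha_deg=0.84, n_o=1.54467, n_e=1.55379, f_in_waves=3750.0):
    """Return (split in degrees, shear in wavelengths,
    double-pass bias phase in degrees per 100-wavelength shift,
    lateral shift in wavelengths that gives a full 360-degree bias)."""
    dn = n_e - n_o
    # Assumed small-angle estimate of the angular split between p- and s-beams (radians).
    split = 2.0 * dn * math.tan(math.radians(alpha_deg))
    # Separation of the two focused spots at the sample, in wavelengths.
    shear = f_in_waves * split
    # Single-pass path difference for a lateral shift x is split * x; double pass doubles it.
    bias_per_100 = 360.0 * 2.0 * split * 100.0
    full_cycle_shift = 100.0 * 360.0 / bias_per_100
    return math.degrees(split), shear, bias_per_100, full_cycle_shift

split_deg, shear, bias, cycle = wollaston_numbers()
print(f"split = {split_deg:.4f} deg, shear = {shear:.3f} wavelengths")
print(f"bias = {bias:.2f} deg per 100 wavelengths, full cycle at {cycle:.0f} wavelengths")
```

Under these assumptions the results should land close to the values quoted in the text (about 0.0153°, one wavelength of shear, roughly 19.26° per 100 wavelengths, and a full cycle near 1870 wavelengths).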
[Figure 40.3: schematic of the Nomarski microscope. Labeled elements: light source, lens, polarizer at +45°, beam-splitter, Wollaston prism, objective, sample, analyzer at −45°, lens, observation plane.]
Examples
Figure 40.4(a) shows the distribution of phase on a uniformly reflecting surface
having several sphero-cylindrical pits with varying depths. The nose feature has a
depth of 0.5λ, and the mouth, eyes, and eyebrows are respectively 0.25λ, 0.375λ,
and 0.75λ deep. The computed image of this phase object in a conventional
optical microscope (i.e., like that in Figure 40.3 but without the polarizer, ana-
lyzer, and Wollaston) is shown in Figure 40.4(b). Note that diffraction of light
from the edges of the various features of the face creates dark borders in the
corresponding image regions, but this conventional image lacks information
about the slope and depth distribution within those features.
The computed Nomarski image of the phase object of Figure 40.4(a), obtained
with one λ of shear along the X-axis, is shown in Figure 40.5. The intensity
Figure 40.4 (a) The distribution of phase at an object’s surface and (b) the
distribution of intensity in the image of the same object, as observed in a
conventional optical microscope. In (a) the various features of the “face” have the
same reflectance but different depth, resulting in phase modulation of the
incident light. The nose, mouth, eyes, and the eyebrows are respectively 0.5λ, 0.25λ,
0.375λ, and 0.75λ deep. The image in (b) is formed by a 0.8NA, 50× objective.
The simulated light source consisted of 529 spatially incoherent point sources,
each defocused by 10λ above the sample’s surface. The observed contrast is
purely due to diffraction effects, as the phase object does not give rise to any
contrast in geometric-optical terms.
Figure 40.5 Nomarski images of the phase object in Figure 40.4(a), when the
Wollaston produces one λ of shear along the X-axis. The microscope is that
shown in Figure 40.3, having a 50×, 0.8NA objective, and the Wollaston’s
horizontal position is adjusted for B = 0°. (a) Intensity distribution in the image
plane; (b) logarithmic plot of the intensity distribution.
Nomarski image, while horizontal features (such as the mouth) are hidden. The
reverse is true when the shear is along the Y-axis, as in Figure 40.6, where horizontal
features become visible while vertical features disappear.
Figure 40.7 shows the Nomarski image of the object in Figure 40.4(a), but
with a bias phase B = 90°. The background of the image is now bright,
because the analyzer no longer blocks the light reflected from flat regions of
the sample. Moreover there is an asymmetry between regions with positive
and negative slope, as can be seen by comparing the right and left sides of the
nose feature.
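The dependence of background brightness and slope asymmetry on the bias phase can be illustrated with an idealized two-beam model. The sketch below is an assumption-laden simplification, not the diffraction calculation used for the figures: it takes two equal-amplitude, orthogonally polarized beams (polarizer at +45°), passes them through a crossed analyzer (at −45°), and ignores diffraction entirely, which gives a normalized intensity sin²[(Δφ + B)/2], where Δφ is the local phase difference between the two sheared images. The function name is hypothetical.

```python
import math

def dic_intensity(delta_phi_deg, bias_deg):
    """Idealized two-beam Nomarski intensity (normalized to the incident intensity).

    delta_phi_deg: phase difference between the two sheared images at a point,
                   which is proportional to the local slope along the shear direction.
    bias_deg:      bias phase B set by the lateral position of the Wollaston.
    """
    total = math.radians(delta_phi_deg + bias_deg)
    # Crossed analyzer: the two projected amplitudes subtract, giving sin^2(total / 2).
    return math.sin(total / 2.0) ** 2

for bias in (0.0, 90.0):
    flat = dic_intensity(0.0, bias)
    up = dic_intensity(+30.0, bias)
    down = dic_intensity(-30.0, bias)
    print(f"B = {bias:5.1f} deg: flat = {flat:.3f}, +slope = {up:.3f}, -slope = {down:.3f}")
```

In this model B = 0° gives a dark background with equal brightness for positive and negative slopes, whereas B = 90° gives a background at half the incident intensity with opposite slopes appearing brighter and darker than the background, in qualitative agreement with the behavior described for Figures 40.5 and 40.7.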
Another example of a phase object is shown in Figure 40.8(a). Here a ridge
having height λ runs along the 45° direction in the XY-plane. The two edges of
the ridge have differing slopes, the lower edge being 4λ wide while the upper
edge is 2λ wide. In the middle of the ridge there is a pit of depth λ in the shape
Figure 40.6 Same as Figure 40.5, except for the direction of shear, which is
along the Y-axis.
Figure 40.7 Nomarski image of the phase object in Figure 40.4(a), when the
Wollaston produces one λ of shear along the X-axis. The microscope is that
shown in Figure 40.3, having a 50×, 0.8NA objective, and the Wollaston’s
horizontal position is adjusted for B = 90°.
Figure 40.8 (a) Phase object and (b) its conventional microscope image. The
object consists of a ridge with a height of λ, running at 45° to the X- and Y-axes,
and a pit in the middle of the ridge whose depth is also λ. The ridge’s side-walls
have different slopes: the lower wall is 4λ wide, while the upper wall is
2λ wide. The flat-bottomed pit has the shape of a football stadium. The image
in (b) is formed through a 50×, 0.8NA microscope objective. The simulated
light source consisted of 529 spatially incoherent point sources, each defocused
by 10λ above the sample’s surface. The observed image contrast is purely due
to diffraction effects, as the phase object does not give rise to any contrast in
geometric-optical terms.
Practical considerations
The back focal plane of high-NA objectives is usually inaccessible from outside
the lens, so the Wollaston prism cannot be directly inserted at the entrance pupil.
By choosing a somewhat different orientation for the optic axes of the crystal
wedges, Nomarski modified the Wollaston prism in such a way that the p- and
s-beams appeared to be separating from each other in a plane external to the
prism.3 In this way the light source could be imaged onto the entrance pupil of the
objective through the Nomarski-modified Wollaston prism, allowing both Köhler
illumination and the separation and recombination of the p- and s-beams at the
entrance pupil.
Figure 40.9 Nomarski images of the phase object of Figure 40.8(a), when
the Wollaston produces one λ of shear along the X-axis. The microscope is
that shown in Figure 40.3, having a 50×, 0.8NA objective. The Wollaston’s
horizontal position is adjusted to yield a bias phase B between the p- and
s-polarized beams. (a) B = 0°, (b) B = 90°.
41 The van Leeuwenhoek microscope
build microscopes with clearer and brighter images than any of his contemporaries
could achieve.
Van Leeuwenhoek used his invention to confirm the discovery of capillary
systems, to describe the life cycle of ants, and to observe plant and muscle tissue,
protozoa and bacteria, and the spermatozoa of insects and humans. In 1673,
van Leeuwenhoek began writing letters to the newly formed Royal Society of
London, describing his findings – his first letter contained some observations on the
stings of bees. For the next 50 years he corresponded with the Royal Society; his
letters, written in Dutch, were translated into English or Latin and printed in the
Philosophical Transactions of the Royal Society, and often reprinted separately. His
experiments with microscope design and function made him an international
authority on microscopy, and in 1680 he was made a Fellow of the Royal Society.
It is suspected that van Leeuwenhoek produced his lenses by chipping away
the excess glass from the thickened droplet that forms on the bottom of a blown-
glass bulb. These lenses probably had a thickness of 1 mm and a radius of
curvature of 0.75 mm. They had superior magnification and resolution when
compared to other microscopes of the time. The Utrecht museum has one of van
Leeuwenhoek’s microscopes in its collection. This amazing instrument has a
magnification of about 275× with a resolution approaching one micron (in spite
of a scratch on the lens).5
Towards the end of his life van Leeuwenhoek wrote: “ . . . my work, which
I’ve done for a long time, was not pursued in order to gain the praise I now enjoy,
but chiefly from a craving after knowledge, which I notice resides in me more
than in most other men. And therewithal, whenever I found out anything
remarkable, I have thought it my duty to put down my discovery on paper, so that
all ingenious people might be informed thereof.”
When the ray height h is much smaller than the radius R of the sphere, the angles
θ and θ′ will be small, in which case the small-angle approximation yields
\[ CA \approx \frac{nR}{2(n-1)}. \tag{41.2} \]
Figure 41.1 A ray of height h traveling parallel to the optic axis is refracted by a
glass sphere of radius R and refractive index n. Upon emerging from the sphere,
the ray crosses the optic axis at point A. When h becomes very small, the point A
approaches the paraxial rear focus F′ of the lens. In (a) n < 2.0 and the emergent
ray crosses the axis outside the sphere, whereas in (b), where n > 2.0, only the
backward extension of the ray crosses the axis. (When n = 2.0, the paraxial rays
come to focus on the rear facet of the sphere.)
Thus, for example, if n = 1.5 then the paraxial focus of the lens is at a distance
CA = 1.5R from the lens center, or if n = 2 then the paraxial focus coincides with
the rear vertex of the sphere, that is, CA = R. Depending on the values of n and h,
the proper path of the ray may be that shown in Figure 41.1(a) or (b), but
equations (41.1) and (41.2) apply to both cases. The paraxial focus, of course, is
relevant only for rays with a small height h; when h increases beyond the paraxial
regime, the point A moves closer to the center C, giving rise (for a beam of wide
cross-section) to spherical aberrations.
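The drift of the crossing point A toward the center C with increasing ray height can be checked numerically. The sketch below is a hedged illustration: it assumes the exact relation CA = R sin θ / sin[2(θ − θ′)], obtained here from Snell's law and the sine rule applied to the triangle formed by C, the exit point, and A, and assumed (not confirmed by this excerpt) to be equivalent to equation (41.1). The function names are hypothetical.

```python
import math

def axis_crossing(h, R=1.0, n=1.5):
    """Distance CA from the sphere center to the axis crossing, for a ray of height h."""
    theta = math.asin(h / R)                       # angle of incidence on the sphere
    theta_p = math.asin(math.sin(theta) / n)       # angle of refraction (Snell's law)
    # The ray is deviated by (theta - theta_p) at each of the two surfaces.
    return R * math.sin(theta) / math.sin(2.0 * (theta - theta_p))

def paraxial_focus(R=1.0, n=1.5):
    """Small-angle limit, equation (41.2)."""
    return n * R / (2.0 * (n - 1.0))

for h in (0.001, 0.2, 0.4, 0.6):
    print(f"h = {h:5.3f} R -> CA = {axis_crossing(h):.4f} R "
          f"(paraxial: {paraxial_focus():.4f} R)")
```

For n = 1.5 the computed CA should approach 1.5R as h shrinks and fall below that value as h grows, which is the signature of spherical aberration described above.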
Here φ and θ are related through sin φ = R sin θ/FC. Thus a point source located
at the front focus F and radiating into a reasonably large cone will produce a real
image on the opposite side at some finite distance from C. To be sure, this image
has a certain amount of spherical aberration and, to obtain a good image, one
must limit the angular range of the cone of light accepted by the lens. This may be
achieved by closing down the aperture stop, which may be located either on
the object side or the image side of the lens. In Figure 41.2 the stop is in the
image space and may thus be referred to as the exit pupil of the lens.
Figure 41.3 shows computed distributions pertaining to the system of Figure 41.2.
The point source is located at the paraxial focus of the lens (R = 1 mm, n = 1.5,
CF = 1.5 mm), and the assumed radius of aperture is Ra = 0.55 mm. Figure 41.3(a)
shows that the emergent intensity at the exit pupil is somewhat brighter near the
rim than at the center of the aperture. Figure 41.3(b), a plot of phase
distribution at the exit pupil, shows a significant amount of spherical aberration.
(The curvature of the emergent beam has been removed from the phase plot; only
the residual aberrations are shown.) The emergent beam comes to best focus at a
distance CA = 27.36 mm behind the lens. Figure 41.3(c), a logarithmic plot of
intensity distribution in the plane of best focus, also shows the substantial rings
of light caused by spherical aberration. These clearly indicate that the image
quality of a wide-aperture system would be poor.4
[Figure 41.2: diagram of the spherical lens. Labels: F, C, A, R, Ra, φ, θ, θ′; axes Y and Z.]
When the aperture is further closed down to Ra = 0.4 mm the distributions of
Figure 41.4 are obtained. The intensity distribution at the exit pupil is now
fairly uniform, and the phase plot shows convergent behavior towards the point
of best focus at CA = 59.3 mm behind the lens. (Notice that in Figure 41.4(b),
unlike Figure 41.3(b), the curvature has not been subtracted from the phase
plot.) The best-focused spot is shown in Figure 41.4(c). In addition to a
relatively small spherical aberration, this system also has a fairly large field of
view, as may be inferred from the plots of Figure 41.5. Here a number of
identical point sources are placed in the front focal plane of the lens, and their
corresponding images are computed in the plane of best focus, at CA = 59.3 mm.
All imaged points show spherical aberration similar to that of the central spot, but
there is very little coma and astigmatism, owing to the fact that the system is
essentially monocentric.
Figure 41.5 Five point sources placed in the front focal plane of the spherical
lens shown in Figure 41.2. The exit-pupil radius Ra = 0.4 mm, and the best image
(with 45× magnification) appears in a plane 59.3 mm away from the lens
center. (a) Intensity distribution in the object plane. (b) Intensity distribution in
the image plane. All imaged points show spherical aberration, but there is very
little coma or astigmatism.
Figure 41.6 The simulated Van Leeuwenhoek microscope. The lens radius
R = 1 mm, its refractive index n = 1.5, the object is 20 μm to the right of the
paraxial focus F (i.e., 0.48 mm away from the lens), and the exit-pupil radius
Ra = 0.25 mm. The virtual image, formed 316 mm to the left of the lens center,
can be comfortably viewed when the eye is placed at or near the exit pupil.
Figure 41.7 Distributions of (a) intensity and (b) phase immediately in front of the
object. The object is trans-illuminated with a uniform, coherent, and monochromatic
plane wave (λ = 0.5 μm). The smallest feature in the lower right-hand side is 1 μm in
diameter. The phase values in (b) range from −144° (black) to +108° (white).
Figure 41.8 Distributions of (a) intensity and (b) phase at the exit pupil of the
microscope of Figure 41.6 with the coherently illuminated object of Figure 41.7.
The intensity is shown on a logarithmic scale to emphasize its weak regions. The
phase ranges from −180° (black) to +180° (white).
Figure 41.9 Distributions of intensity in the virtual image seen through the
microscope of Figure 41.6 with the object of Figure 41.7. The image in (a) is
computed for a coherent, monochromatic beam of light normally incident on
the object. The incoherent image in (b) is obtained by illuminating the object
with 225 point sources through a 0.15NA condenser lens. These virtual images
have a magnification of 200× and appear at a distance of 316 mm behind the
lens center.
Method of computation
The results presented in this chapter were obtained by a combination of ray-
tracing and diffraction calculations. The light emanating from the object was
propagated to the vicinity of the lens using far-field (Fraunhofer) diffraction
formulas. The complex-amplitude distribution at this point was converted into
a set of geometric-optical rays, using the local Poynting vector to represent
the ray. The rays were traced from the entrance pupil to the exit pupil of the
lens using standard methods of ray-tracing. At the exit pupil the ray magni-
tude and phase information was converted into a complex wavefront, and
the wavefront was propagated to the image plane using near-field (Fresnel)
diffraction formulas.
Basic principles
Figure 42.1 is a diagram of a typical projection system used in optical lithography.
A quasi-monochromatic, spatially incoherent light source (wavelength λ)
is used to illuminate the mask. Steps are usually taken to homogenize the source,
thus ensuring a highly uniform intensity distribution at the plane of the mask. The
condenser stop may be controlled to adjust the degree of coherence of the illu-
minating beam; this control of partial coherence is especially important when
PSMs are used to improve the performance of optical lithography beyond what is
achievable with the traditional BIMs.
†
The coauthor of this chapter is Rongguang Liang.
42 Projection photolithography
[Figure 42.1: schematic of the projection system. Labeled elements: light source, homogenizer, condenser stop, condenser lens, projection lens, angle θ, wafer and stage.]
The light transmitted through the mask is collected by the projection lens, which
images the mask onto the wafer, typically with a magnification M = 1/5. Thus, if
the numerical aperture of the projection lens is defined as NA0 = sin θ, its angular
aperture on the mask side will be sin θ′ = M NA0. If the condenser’s numerical
aperture NAc happens to be much less than sin θ′ then the illumination is coherent,
while if NAc ≥ sin θ′ then the illumination is essentially incoherent. In practice the
ratio σ = NAc /(M NA0) is used as a measure of the incoherence of illumination.
For example, if M = 1/5 and NA0 = 0.6, then NAc = 0.084 yields σ = 0.7, while
NAc = 0.06 yields σ = 0.5. For a given projection lens, therefore, the incoherence of
illumination is proportional to the condenser’s stop diameter.1,2,3
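As a quick check of the arithmetic in this paragraph, the ratio NAc/(M NA0) can be evaluated directly. The sketch below is only illustrative; the function names are hypothetical and the inverse helper is an added convenience rather than something taken from the text.

```python
def coherence_ratio(na_condenser, na_projection, magnification):
    """Ratio NAc / (M * NA0) used as a measure of the incoherence of illumination."""
    return na_condenser / (magnification * na_projection)

def condenser_na_for(ratio, na_projection, magnification):
    """Condenser numerical aperture needed to reach a given ratio (inverse of the above)."""
    return ratio * magnification * na_projection

# Values quoted in the text: M = 1/5, NA0 = 0.6.
for na_c in (0.084, 0.06):
    print(f"NAc = {na_c}: ratio = {coherence_ratio(na_c, 0.6, 1 / 5):.2f}")
```

Because the ratio is linear in NAc, halving the condenser stop diameter halves the ratio for a fixed projection lens.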
Over the past decade, photolithographic systems have evolved through several
generations. The wavelength of the light source has steadily decreased from 365 nm
Figure 42.2 Several mask structures and, below each structure, the corresponding
E-field patterns immediately after transmission through the mask. (a)
Conventional transmission mask. (b) Alternating-aperture phase mask with
etched substrate. (c) A chromeless phase-edge mask produces dark lines in the
image solely through destructive interference at the phase transitions. (d) A
shifter–shutter mask is similar to (c) except that each dark line is produced by a
pair of adjacent phase-edges. (e) A rim-shifter mask contains chrome lines
bracketed by 180° phase-edges. (f) An attenuated phase-shift mask; here the
shaded regions represent partially transmissive material with a 180° phase shift.
(Adapted from reference 1.)
mask is a PSM in which the upper and lower bright lines are phase-shifted by
180° relative to the central bright line. In Figures 42.4(a), (b) we compare the
intensity patterns of the images obtained at the wafer for these two types of mask.
The assumed projection system is that of Figure 42.1, with NA0 = 0.6, M = 1/5,
and σ = 0.7. Clearly the PSM is better at resolving the dark spaces between
adjacent bright lines. For direct comparison, a cross-section through these two
intensity distributions is shown in Figure 42.4(c). Increasing the coherence of the
illumination by closing down the aperture of the condenser to σ = 0.5 improves
the image contrast of the PSM but degrades that of the BIM image, as can be
readily observed in Figures 42.4(d)–(f).
Contact hole
Figure 42.7(a) shows a simple 4λ × 4λ square aperture on a dark background.
This feature has uniform phase across the aperture and, therefore, represents
the BIM for a contact hole. A corresponding PSM for the same hole is shown
in Figure 42.7(b). Here four side-rigger lines of width 0.5λ and 180° phase
shift (relative to the central aperture) are placed around the hole.4 The
computed intensity patterns of the images of these masks at the wafer appear in
Figures 42.8(a),(b), respectively. The side-rigger features are too small to be
printed, but their destructive interference with the central aperture results in a
smaller projected hole, as revealed in the cross-sectional intensity profiles at
the wafer shown in Figure 42.8(c). As before, the printed feature size can be
further optimized by adjusting the dimensions of the side-riggers as well as by
closing the condenser stop to reduce the value of σ.
Figure 42.4 Computed plots of intensity distribution at the wafer for the
mask of Figure 42.3 placed in the system of Figure 42.1 (NA0 = 0.6, M = 1/5).
(a) Image of the BIM obtained with σ = 0.7. (b) Image of the PSM obtained
with σ = 0.7. (c) Cross-sections of the intensity patterns for the BIM (broken
line) and the PSM (solid line). (d)–(f) Same as the patterns in the left-hand
column, but for σ = 0.5.
Figure 42.5 Masks designed for creating an isolated bright line at the wafer.
(a) BIM containing a 4λ-wide line on an opaque background. (b) PSM featuring
the same 4λ-wide line flanked by a pair of 0.8λ-wide side-riggers. Each
side-rigger imparts to the incident beam a 180° phase shift relative to the central line.
The separation between the central line and each side-rigger is 2λ.
Figure 42.6 Computed intensity patterns at the wafer for the masks of Figure 42.5
in the system of Figure 42.1 (NA0 = 0.6, M = 1/5, σ = 0.7). (a) Using the BIM;
(b) using the PSM; (c) the cross-sections of the intensity patterns in the images of the
BIM (broken line) and the PSM (solid line).
Figure 42.7 Mask patterns for creating a contact hole. (a) BIM containing a
4λ × 4λ square aperture on an opaque background. (b) PSM featuring the same
4λ × 4λ aperture surrounded by 0.5λ-wide side-riggers. Each side-rigger imparts
to the incident beam a 180° phase shift relative to the central aperture.
Figure 42.8 Computed intensity patterns at the wafer for the masks of Figure 42.7
in the system of Figure 42.1 (NA0 = 0.6, M = 1/5, σ = 0.7). (a) Using the BIM;
(b) using the PSM; (c) the cross-sections of the intensity patterns in the images of the
BIM (broken line) and the PSM (solid line).
impractical because they are costly and, moreover, they produce masks that are
difficult to inspect and to repair. In today’s practice, such unwanted dark lines
are erased by a second exposure through a different mask.
Concluding remarks
Incorporating the advantages of optical phase in the design, manufacture, and
testing of photomasks is still very much a research topic; many potential benefits
of the PSM remain to be realized. The type of PSM in common use today is the
attenuated PSM depicted in Figure 42.2(f), where the traditional opaque chrome
is replaced by a material that transmits 8% with a 180° phase shift. This is
useful for printing bright spaces and contact holes, and has essentially replaced
the shifter–shutter type of mask (see Figure 42.2(d)). Also, the more recent
Figure 42.10 Same as Figure 42.9 but for smaller mask features. The lines and
spaces on the mask are now 3λ wide.
Figure 42.11 Same as Figure 42.9 but for very small mask features. The lines
and spaces on the mask are now 2.4λ wide.
When a light field interacts with structures that have complex geometric features
comparable in size to the wavelength of the light, it is not permissible to invoke the
assumptions of the classical diffraction theory, which simplify the problem and allow
for approximate solutions. For such cases, direct numerical solutions of the governing
equations are sought through approximating the continuous time and space deriva-
tives by the appropriate difference operators. The Finite Difference Time Domain
(FDTD) method discretizes Maxwell’s equations by using a central difference
operator in both the time and space variables.1 The E- and B-fields are then repre-
sented by their discrete values on the spatial grid, and are advanced in time in steps of
Δt. The numerical solution thus obtained to Maxwell’s equations (in conjunction
with the relevant constitutive relations) provides a highly reliable representation of
the electromagnetic field distribution in the space-time region under consideration.
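To make the central-difference update concrete, here is a minimal one-dimensional sketch of the leapfrog scheme in vacuum. It is an illustration under stated assumptions rather than the simulation code used in this chapter: units are normalized so that c = 1 and the cell size is 1, the magnetic field is scaled by c, the Courant number S = cΔt/Δx is set to 1 (the stability limit in one dimension), and the end nodes are simply held fixed instead of being terminated by an absorbing layer. All names are hypothetical, and plain Python lists are used for clarity rather than speed.

```python
import math

def fdtd_1d(n_cells=400, n_steps=100, courant=1.0, width=10.0):
    """Advance a Gaussian Ey pulse on a staggered 1-D grid; return the final Ey samples."""
    center = n_cells // 2
    # Ey is sampled at integer nodes i; Bz is sampled half a cell away, at i + 1/2.
    ey = [math.exp(-((i - center) / width) ** 2) for i in range(n_cells)]
    bz = [0.0] * (n_cells - 1)
    for _ in range(n_steps):
        # Faraday's law: a central difference of Ey advances Bz by one time step.
        for i in range(n_cells - 1):
            bz[i] -= courant * (ey[i + 1] - ey[i])
        # Ampere's law: a central difference of Bz advances Ey by one time step.
        for i in range(1, n_cells - 1):
            ey[i] -= courant * (bz[i] - bz[i - 1])
        # The two end nodes are never updated, so they act as crude reflecting walls.
    return ey

ey = fdtd_1d()
right_peak = max(range(200, 400), key=lambda i: ey[i])
print("right-going peak at node", right_peak, "with amplitude", round(ey[right_peak], 3))
```

With the magnetic field initially zero, the pulse splits into two counter-propagating halves; after 100 steps the right-going half should sit roughly 100 cells from the starting point with about half the initial amplitude, which is a simple sanity check on the update order and signs.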
This chapter presents examples of application of the FDTD method to prob-
lems involving the interaction between a focused beam of light and certain
subwavelength structures of practical interest. A few general remarks concerning
the nature of the FDTD method appear in the next section. This is followed by a
description of the simulated system and two examples in which comparison is
possible between the FDTD method and an alternative method of calculation. We
then present simulation results for the case of a focused beam interacting with
small pits and apertures in a thin film supported by a transparent substrate.
†
The co-authors of this chapter are Armis R. Zakharian, now with Corning Corp., and Jerome V. Moloney of
the University of Arizona.
Figure 43.1 The unit cell of the FDTD mesh has dimensions Δx × Δy × Δz. The
various components of the E and B fields are assigned to different locations on
the unit cell. The staggered field components are shifted by a half-pixel in
various directions.
positions with respect to the cell center, so that every component of the electric
field is surrounded by four circulating components of the magnetic field, and vice
versa. Such a staggered mesh is motivated by the integral form of Maxwell’s curl
equations. The contour integrals of E (B) along the edges of the cell in Faraday’s
law (Ampere’s law) circulate around the corresponding magnetic (electric) field
component at the center of the cell face.
In 3-D simulations at least six field components must be stored and updated
at each grid point, which leads to considerable memory and CPU requirements
for FDTD simulations. Fortunately, the time update of any field component
involves only nearby fields located one or two cells away on the grid. This kind
of locality in the physical space translates into computer memory access
locality and allows for efficient implementation of the FDTD algorithm on
many types of shared and distributed memory parallel platforms. Low-reflection
absorbing boundary conditions that terminate the computational domain by a
Perfectly Matched Layer (PML) allow the simulation of physical problems with
open boundaries.2
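The staggered, nearest-neighbor character of the update can be illustrated with a minimal one-dimensional sketch. It is not the 3-D code used for the results in this chapter: the normalized units (c = 1, unit cell size), the Gaussian initial pulse, the grid length, and the function name `run_fdtd_1d` are all assumptions made for illustration, and the domain ends in simple reflecting walls rather than a PML.

```python
import math

def run_fdtd_1d(n_cells=400, n_steps=100, courant=1.0, sigma=10.0):
    """Leapfrog update on a 1-D staggered grid in normalized units (c = 1, dx = 1)."""
    center = n_cells // 2
    # E lives on integer nodes; H lives half a cell to the right of each E node.
    e = [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(n_cells)]
    h = [0.0] * (n_cells - 1)
    for _ in range(n_steps):
        # Each H sample is advanced using only the two E samples that flank it.
        for i in range(n_cells - 1):
            h[i] += courant * (e[i + 1] - e[i])
        # Each interior E sample is advanced using only the two H samples that flank it.
        # The end nodes are never updated, which acts as a reflecting wall (not a PML).
        for i in range(1, n_cells - 1):
            e[i] += courant * (h[i] - h[i - 1])
    return e

e_final = run_fdtd_1d()
right_half = e_final[200:]
peak_index = 200 + right_half.index(max(right_half))
print(peak_index, max(right_half))
```

Starting from a pulse in E with H set to zero, the field splits into two counter-propagating pulses of roughly half the original amplitude, each moving one cell per step when the Courant number is 1.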
Since the FDTD algorithm solves Maxwell’s equations in the time domain,
calculation for a broad range of frequencies is possible in a single simulation using
a time-pulsed excitation. Other advantages include the possibility of modeling
dispersive and nonlinear materials. An important property of the FDTD method is
that it introduces no additional dissipation into the physical problem due to
numerical discretization, and hence energy is conserved. However, the finite-difference
method does introduce a dispersion error. In the commonly used second-order-accurate
implementation of FDTD, this error diminishes with cell size h as O(h²). In practice,
therefore, to keep the numerical dispersion errors under control, a grid with about
30 points per wavelength is desired. The rather large number of points and iterations
thus required for accurate results may render solution impractical for a problem with
large spatial and/or temporal domain.

Figure 43.2 3-D computational domain for simulating the interaction between
a focused beam of light and various marks (i.e., bumps or pits) on the surface of
a multilayer data storage medium. (a) Non-uniform conformal grid; the grid-line
density is higher near the center, where the focused beam and the multilayer
stack are located. (b) Nested rectangular cells forming a non-conformal hierarchical
grid.
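The quadratic scaling of the dispersion error with cell size can be checked with a short sketch. It assumes the standard one-dimensional dispersion relation of the central-difference scheme, sin(ωΔt/2)/(cΔt) = sin(kΔx/2)/Δx, and a Courant number cΔt/Δx = 0.5; both are illustrative choices, not the 3-D settings of the simulations reported here, and the function name is hypothetical.

```python
import math

def wavenumber_error(points_per_wavelength, courant=0.5):
    """Relative error of the numerical wavenumber for the 1-D central-difference scheme."""
    x = math.pi / points_per_wavelength               # exact value of k*dx/2
    k_num_half = math.asin(math.sin(courant * x) / courant)
    return k_num_half / x - 1.0

for n in (10, 15, 30, 60):
    print(n, wavenumber_error(n))

# Halving the cell size (15 -> 30 points per wavelength) should cut the error
# by about a factor of four if the error is O(h^2).
ratio = wavenumber_error(15) / wavenumber_error(30)
print(ratio)
```

Under these assumptions the error at 30 points per wavelength is on the order of 0.1% of the wavenumber, which accumulates as a phase error over many wavelengths of propagation.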
In many cases it is desirable to retain the efficiency of the FDTD scheme on the
rectangular grids, but achieve higher resolution only in those regions of the
computational domain where it is needed. Non-uniform grids allow one to vary
the cell size in each coordinate direction, keeping the grid structured and conformal
as in Figure 43.2(a). A more efficient approach is to employ a collection of nested
rectangular cells that form a non-conformal hierarchical grid, as in Figure 43.2(b).
Each successive nested level has a higher resolution, e.g., by a factor of two, than
the previous level, allowing smaller cell sizes to be “focused” in the regions of
interest (e.g., sub-wavelength features, photonic crystal microcavity, etc.). Inside
each rectangular region the standard FDTD algorithm is applied, while at the
boundaries between the grids an update scheme and interpolation must be
employed to keep the method stable and accurate. In FDTD the time step Δt is
proportional to the cell size, and hence the smallest time step is required on the
grids with the highest resolution. Each grid can be updated with its own time-step,
the grids with cell size 2h doing half as many iterations as grids with cell size h.
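The bookkeeping implied by per-grid time steps can be sketched as follows. The sketch assumes only what the text states, namely that the time step is proportional to the cell size; the function name, the refinement factor of two, and the numerical values are illustrative.

```python
def level_schedule(coarse_cell, n_levels, coarse_steps, refinement=2):
    """Cell size and number of time steps for each nested grid level.

    Assumes the time step scales linearly with cell size, so a level refined
    by `refinement` needs that many times more steps to cover the same time.
    """
    schedule = []
    for level in range(n_levels):
        factor = refinement ** level
        schedule.append((level, coarse_cell / factor, coarse_steps * factor))
    return schedule

for level, cell, steps in level_schedule(coarse_cell=40.0, n_levels=3, coarse_steps=100):
    print(level, cell, steps)
```

Only the finest level pays for the smallest time step; the coarser levels advance with proportionally fewer iterations.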
[Figure: schematic panels labeled with the objective, incident beam, substrate, thin film(s), glass hemisphere, and metal films, with X, Y, and Z axes.]
of the normalized function is then evaluated, and all pixel values below a
certain level, say, −α, are set equal to −α. Displayed plots of log_intensity_α
thus cover the range from 10^(−α) Ipeak (blue) to Ipeak (red).
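A minimal sketch of this display scaling is given below. It assumes a base-10 logarithm and a floor of −α, which is how the quoted display range (a factor of 10^α below the peak) is read here; the function name and the sample array are hypothetical.

```python
import math

def log_intensity(intensity, alpha):
    """Normalize by the peak, take log10, and clip everything below -alpha."""
    peak = max(max(row) for row in intensity)
    clipped = []
    for row in intensity:
        out = []
        for value in row:
            # Zero intensity has no logarithm; treat it as the floor value.
            level = math.log10(value / peak) if value > 0 else -alpha
            out.append(max(level, -alpha))
        clipped.append(out)
    return clipped

example = [[1.0, 0.1], [1e-6, 0.0]]
print(log_intensity(example, alpha=4))
```

With alpha = 4 the plot spans four decades of intensity, so features 10,000 times weaker than the peak remain visible.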
When the beam is focused through a hemispherical lens of refractive index n,
as in Figure 43.3(c), the same distribution as in Figure 43.4 is found at the focal
plane, but the spatial coordinates must shrink by a factor of n to account for
the reduced wavelength (λ = λ0/n) within the medium of the hemispherical lens.
At the bottom of the hemisphere, therefore, the focused spot diameter is reduced
by a factor of n compared to that shown in Figure 43.4.
Figure 43.4 Plots of log_intensity_4 (top) and phase (bottom) at the focal plane
of the lens of Figure 43.3(a). Left to right: E-field components along the X-, Y-, and
Z-axes. At the entrance pupil the incident beam (wavelength = λ0) is Gaussian with
1/e (amplitude) radius of 4000λ0, truncated at the lens aperture (radius = 3000λ0).
The incident beam is linearly polarized along the X-axis, and its total optical power
captured by the lens is unity, that is, Px = 1.0, Py = 0.0. (The power content of the Ez
component at the focal plane is 8.3% of total power.)
In some cases the beam must be focused onto the object of interest through a
parallel plate cover glass or through the sample’s substrate, as is the case, for
instance, in Figure 43.3(e). Under such circumstances, to obtain a focused spot
free from spherical aberration, the objective lens must be designed for the specific
thickness and refractive index of the substrate. Unlike focusing through a glass
hemisphere, however, the focused spot inside a cover plate (or flat substrate, as
the case may be) has exactly the same dimensions as that obtained by focusing in
air through an objective of the same NA. The reason is that, in passing from the
air to the substrate through a flat interface, the effect of the reduced wavelength
on the focused beam is exactly canceled out by the reduced angle of the focused
cone (Snell’s law). The spot that illuminates the concave pit of Figure 43.3(e)
through the sample’s substrate, therefore, has exactly the same size as that which
directly illuminates the convex pit of Figure 43.3(d).
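The cancellation described above can be verified with a few lines of arithmetic. The sketch takes the spot size to scale as the local wavelength divided by the sine of the cone half-angle; the numerical aperture of 0.6 and the function names are hypothetical, and the hemisphere case assumes the focus lies at the center of curvature so that rays cross the curved surface at normal incidence.

```python
def spot_scale_flat(wavelength0, na_air, n_substrate):
    """Spot scale (wavelength / sin of cone half-angle) inside a flat substrate."""
    sin_inside = na_air / n_substrate              # Snell's law at the flat interface
    wavelength_inside = wavelength0 / n_substrate  # reduced wavelength in the medium
    return wavelength_inside / sin_inside

def spot_scale_hemisphere(wavelength0, na_air, n_glass):
    """Spot scale at the flat bottom of a hemisphere; the cone angle is unchanged."""
    return (wavelength0 / n_glass) / na_air

in_air = 650.0 / 0.6
print(in_air, spot_scale_flat(650.0, 0.6, 1.5), spot_scale_hemisphere(650.0, 0.6, 1.5))
```

The flat substrate returns the same scale as focusing in air, whereas the hemisphere returns a scale smaller by the factor n.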
n = 1.5. The large absorption coefficient k of the metal film ensures that the
light does not reach the substrate; most of the incident light is therefore
reflected, while a small fraction is absorbed in the metal. For the incident beam
depicted in Figure 43.4 at λ0 = 650 nm, Figure 43.5 shows computed plots of
reflected intensity (top) and phase (bottom) obtained with the FDTD method.
(The FDTD mesh size was Lx = Ly = 12λ0, and the mirror’s front facet was a
distance z = 180 nm beyond the focal plane of the lens.) The integrated intensity
of the reflected light over the XY-plane for the X- and Y-components of
polarization may be defined as follows:
\[
P_x = \iint |E_x|^2 \, dx \, dy, \qquad P_y = \iint |E_y|^2 \, dx \, dy.
\]
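On a sampled field these integrals reduce to sums over the grid. The following sketch uses a plain Riemann sum; the function name, the tiny sample array, and the grid spacing are hypothetical.

```python
def integrated_intensity(field, dx, dy):
    """Riemann-sum approximation of the double integral of |E|^2 over the XY-plane."""
    return sum(abs(value) ** 2 for row in field for value in row) * dx * dy

# Complex field samples on a 2 x 2 grid with spacing 0.1 in each direction.
ex = [[1 + 1j, 0.5], [0.0, -1j]]
px = integrated_intensity(ex, dx=0.1, dy=0.1)
print(px)
```

When the incident beam is normalized to unit power, as in Figure 43.4, the resulting Px and Py can be read directly as fractions of the incident power.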
Using the FDTD method, we found Px = 0.85, Py = 0.0016 for the mirror of
Figure 43.3(b) illuminated with the focused spot of Figure 43.4. To verify the
accuracy of the FDTD method, we simulated the same system using an alternative
method based on the superposition of plane-wave solutions to Maxwell’s equations
with matching boundary conditions at the various interfaces. The intensity and phase
distributions thus obtained were visually indistinguishable from those shown in
Figure 43.5, and the corresponding integrated intensities were found to be Px = 0.86,
Py = 0.0018. The slight differences between the two methods of computation reflect
the cumulative effect of numerical errors inherent to the FDTD algorithm.

Figure 43.5 Plots of reflected log_intensity_4 (top) and phase (bottom) from
the metallic mirror depicted in Figure 43.3(b) at λ0 = 650 nm. The panels on the
left-hand side correspond to Ex, while those on the right-hand side represent the
Ey component of the reflected field. The front facet of the mirror is located a
distance z = 180 nm beyond the focal plane.
Similar simulations were performed for the sample of Figure 43.3(b) illuminated
through a glass hemisphere of index n = 1.5. (The FDTD mesh size in this case was
Lx = Ly = 8λ0, but the mirror’s front facet remained at z = 180 nm beyond the focal
plane of the lens.) The computed values of integrated intensity were Px = 0.78,
Py = 0.0019. The corresponding quantities obtained with the alternative (and more
accurate) method of plane-wave superposition were Px = 0.80, Py = 0.0022. Once
again, comparison against a benchmark has shown the effect of small but cumulative
numerical errors on the results of FDTD calculations.
Figure 43.6 Plots of reflected log_intensity_3 (top) and phase (bottom) from the
dielectric bilayer depicted in Figure 43.3(c) at λ0 = 400 nm. The panels on the
left-hand side correspond to Ex, while those on the right-hand side represent the Ey
component of the reflected field. The front facet of the stack is at z1 = 230 nm
beyond the focal plane.
Although the alternative method employed in the above examples is faster and
more accurate than FDTD, it has the disadvantage of being restricted to geom-
etries such as those in Figures 43.3(b) and 43.3(c), where the sample consists of
one or more homogeneous layers with flat surfaces/interfaces. As soon as inho-
mogeneities or non-uniformities are introduced, the computation method based
on plane-wave superposition fails, and the FDTD method becomes an attractive
(though costly) candidate for numerical solution of Maxwell’s equations.
Figure 43.7 Same as Figure 43.6 but for the transmitted beam. The distance
from the rear of the stack to the plane where the transmitted beam is observed is
z2 = 30 nm.
In our FDTD calculations of the bilayer stack of Figure 43.3(c) the incident
focused beam had λ0 = 400 nm, the mesh size was Lx = Ly = 8.08λ0, and the
distance from the focal plane to the top of the stack was z1 = 230 nm, while that
from the bottom of the stack to the plane in which the transmitted beam is
observed was z2 = 30 nm. Figures 43.6 and 43.7 show computed plots of intensity
and phase for the reflected and transmitted fields, respectively. The corresponding
distributions obtained with the alternative method of plane-wave superposition
were visually indistinguishable from those in Figures 43.6 and 43.7. The integrated
values of reflected intensity are Px = 0.022 (0.019 with the alternative method) and
Py = 0.0026 (both methods). The corresponding quantities for the transmitted beam
are Px = 0.97 (1.01 with the alternative method) and Py = 0.01 (both methods).
Once again the FDTD method is seen to be adequate for these types of calculation,
provided that a few percentage point deviation from the exact solution (caused by
discretization and numerical errors) is deemed acceptable.
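The size of these deviations can be tabulated directly from the Px values quoted for the mirror, the mirror under the hemisphere, and the bilayer. The sketch below only restates those numbers; expressing each difference as a fraction of the unit incident power is an assumption about how "percentage point" is meant, and the labels and function name are illustrative.

```python
# (label, FDTD value, plane-wave value), all in units of the incident power
cases = [
    ("mirror, Px", 0.85, 0.86),
    ("mirror under hemisphere, Px", 0.78, 0.80),
    ("bilayer reflected, Px", 0.022, 0.019),
    ("bilayer transmitted, Px", 0.97, 1.01),
]

def absolute_deviations(rows):
    """Difference between the FDTD and plane-wave values for each case."""
    return [(label, fdtd - reference) for label, fdtd, reference in rows]

devs = absolute_deviations(cases)
for label, deviation in devs:
    print(f"{label}: {100 * deviation:+.1f} percentage points")
```

On this reading the largest discrepancy among the quoted Px values is four percentage points, for the transmitted beam of the bilayer.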
Figure 43.8 Plots of reflected log_intensity_3 (top) and phase (bottom) from
the convex pit in the sample depicted in Figure 43.3(d) at λ0 = 650 nm. The
panels on the left-hand side correspond to Ex, while those on the right-hand side
represent the Ey component of the reflected field. The pit center is 250 nm to the
left of the focused spot center.
Figure 43.9 Same as Figure 43.8 but for the sample of Figure 43.3(e). The
objective is now corrected for the thickness and refractive index of the substrate,
so the beam focused on this concave pit continues to be the diffraction-limited
spot shown in Figure 43.4.

Figure 43.11 Same as Figure 43.10 but with evanescent field components
filtered out.
plane waves. If these evanescent components are filtered out, then the remaining
field will propagate undiminished to the far field. The filtered field in the
same observation plane (i.e., at z2 = 20 nm beyond the interface between the
metal film and the substrate) is shown in Figure 43.11. The integrated intensities
of the X- and Y-components of polarization in these calculations are found to
be Px = 0.31, Py = 0.002.
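One way to picture the filtering step is as a low-pass operation in the spatial-frequency domain: plane-wave components whose transverse spatial frequency exceeds n/λ0 in a medium of index n are evanescent and are set to zero. The one-dimensional sketch below is only an illustration under that assumption; it uses a direct discrete Fourier transform, and the function name, sampling interval, and test field are hypothetical rather than taken from the simulations.

```python
import cmath
import math

def remove_evanescent(field, dx, wavelength, n_medium=1.0):
    """Zero every Fourier component whose spatial frequency exceeds n_medium / wavelength."""
    n = len(field)
    cutoff = n_medium / wavelength          # cycles per unit length
    # Forward DFT (direct O(N^2) sum; adequate for a short illustration).
    spectrum = [sum(field[m] * cmath.exp(-2j * math.pi * q * m / n) for m in range(n))
                for q in range(n)]
    for q in range(n):
        # Map the DFT index to a signed spatial frequency.
        freq = (q if q <= n // 2 else q - n) / (n * dx)
        if abs(freq) > cutoff:
            spectrum[q] = 0.0
    # Inverse DFT back to the observation plane.
    return [sum(spectrum[q] * cmath.exp(2j * math.pi * q * m / n) for q in range(n)) / n
            for m in range(n)]

# 64 samples spaced 0.05 um apart; one propagating and one evanescent ripple at 0.65 um.
n_samples, dx = 64, 0.05
slow = [math.cos(2 * math.pi * 2 * m / n_samples) for m in range(n_samples)]   # 0.625 cycles/um
fast = [math.cos(2 * math.pi * 8 * m / n_samples) for m in range(n_samples)]   # 2.5 cycles/um
filtered = remove_evanescent([s + f for s, f in zip(slow, fast)], dx, wavelength=0.65)
```

The fine ripple, whose period is shorter than the wavelength, disappears from the filtered field, while the slowly varying component is left unchanged.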
For the bowtie aperture in the thin-film sample of Figure 43.3(f), computed
plots of transmitted intensity and phase are shown in Figure 43.12. The
computed integrated intensities in this case are Px = 0.194, Py = 0.04. When the
evanescent content of the transmitted field is filtered out, the distributions
shown in Figure 43.13 are obtained. (The integrated intensity values now drop
to Px = 0.093, Py = 0.005.) Note that the bowtie shape of the aperture is no
longer discernible in the filtered transmitted beam, ostensibly because the fine
features of this aperture contribute primarily to the evanescent field.
Figure 43.13 Same as Figure 43.12 but with evanescent field components
filtered out.
When the bowtie aperture was rotated 90° in the plane of the metallic film (to
make the incident E-field perpendicular to the line that connects the sharp ends of
the triangles), the computed integrated intensities dropped to Px = 0.1, Py = 0.012
before filtering and Px = 0.047, Py = 0.0035 after filtering. The transmission
efficiency of the bowtie aperture is thus seen to drop by nearly a factor of 2.0
when the incident polarization goes from being parallel to the line that connects
the sharp ends of the triangles to being perpendicular to it.
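The factor of about two can be reproduced from the quoted integrated intensities. The sketch assumes that the transmission efficiency is the sum Px + Py, which the text does not state explicitly; the variable and function names are illustrative.

```python
def total_transmission(px, py):
    """Total transmitted power, taken here as the sum of the two polarization components."""
    return px + py

parallel = {
    "raw": total_transmission(0.194, 0.04),
    "filtered": total_transmission(0.093, 0.005),
}
perpendicular = {
    "raw": total_transmission(0.1, 0.012),
    "filtered": total_transmission(0.047, 0.0035),
}
ratios = {key: parallel[key] / perpendicular[key] for key in parallel}
print(ratios)
```

Using Px alone instead of the sum gives ratios that are also close to two, so the conclusion does not hinge on this choice.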
44 The Ronchi test

In the 1920s Vasco Ronchi developed the well-known method of testing optical
systems now named after him.1,2 The essential features of the Ronchi test may be
described by reference to Figure 44.1. A lens (or more generally, an optical
system consisting of a number of lenses and mirrors) is placed in the position
of the “object under test”. The lens is then illuminated with a beam of light,
which, for the purposes of the present chapter, will be assumed to be coherent and
quasi-monochromatic. These restrictions on the beam may be substantially
relaxed in practice.3
The lens brings the incident beam to a focus in the vicinity of a diffraction
grating, which is placed perpendicular to the optical axis, i.e., the Z-axis. The
grating, also referred to as a Ronchi ruling, may be as simple as a low-frequency
wire grid or as sophisticated as a modern short-pitched, phase/amplitude grating.
The position of the grating should be adjustable in the vicinity of focus, so that it
may be shifted back and forth along the optical axis. The grating breaks up the
incident beam into multiple diffracted orders, which will subsequently propagate
along Z and reach the lens labeled “pupil relay” in Figure 44.1.
The pupil relay may simply be the lens of the eye, which projects the exit pupil
of the object under test onto the retina of the observer. Alternatively, it may be a
conventional lens that creates a real image of the exit pupil on a screen or on a
CCD camera.
The diffracted orders from the grating will be collected by the relay lens and,
within their overlap areas, will create interference fringes characteristic of the
aberrations of the optical system under consideration. By analyzing these fringes,
one can determine the type and, with some effort, the magnitude of the aberra-
tions present at the exit pupil of the system.
The above description of the Ronchi test relies on its modern interpretation;
this is based on our current understanding of physical optics and the theory of
diffraction gratings. Historically, however, the gratings used in the early days
were quite coarse, and the results obtained with them required no more than a
simple geometric-optical theory for their interpretation. Typically, one would
place the eye at the focus of the lens and hold a grating (e.g., a wire grid) in front
of the eye, moving the grating in and out until a clear pattern became visible. At
this point the beam would be illuminating several of the wires simultaneously. By
looking through the grating and observing the shadows that the wires cast on the
exit pupil, one could determine the type of aberration present in the system. The
coarseness of the grating, of course, caused several of the diffracted orders (as we
understand them today) to overlap each other, thus resulting in reduced contrast
and smearing of the patterns near the boundaries. These problems were eventu-
ally overcome when finer gratings became available and the diffraction theory of
the Ronchi test was better understood.
Figure 44.2 Several diffracted orders in the far field of the grating of
Figure 44.1. When the grating’s period is chosen properly, each diffracted order
(i.e., emergent cone of light) will overlap only with its nearest neighbors. Except
for a lateral shift in position, the various orders are identical, carrying the
amplitude and phase distribution of the beam as it appears at the exit pupil of the
object under test.
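The phrase "chosen properly" can be made quantitative with a small sketch. It assumes that, in direction-cosine space, each diffracted order is a disk whose radius equals the numerical aperture NA of the focused cone, and that adjacent orders are displaced by λ/p for a grating of period p (the grating equation). The function names and the numerical values are hypothetical.

```python
def period_bounds(wavelength, na):
    """Range of grating periods for which each order overlaps only its nearest neighbors."""
    p_min = wavelength / (2.0 * na)   # at or below this, adjacent orders no longer overlap
    p_max = wavelength / na           # above this, next-nearest orders start to overlap
    return p_min, p_max

def overlapping_orders(wavelength, na, period):
    """Number of neighboring orders on one side that overlap a given order."""
    shear = wavelength / period       # displacement between adjacent orders
    count = 0
    while (count + 1) * shear < 2.0 * na:
        count += 1
    return count

print(period_bounds(0.633, 0.1))
print(overlapping_orders(0.633, 0.1, 5.0), overlapping_orders(0.633, 0.1, 20.0))
```

A coarse ruling (large period) places many orders on top of one another, which is the source of the reduced contrast described for the early wire grids.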
merits have been expounded in the literature.3 It is useful here to examine some
of these alternative methods and to compare the resulting patterns (interferograms
or otherwise) with those obtained with the Ronchi test.
[Figure: interferometer layout labeled with the object under test, two beam-splitters, two mirrors, the pupil relay, and the observation plane.]
spherical aberration (ρ⁴) and also in the fact that a Ronchigram, being a kind of
shearing interferogram (albeit with a large shear), is related to the derivative of the
wavefront aberration function.
Figure 44.8 In the knife-edge test a certain region in the vicinity of focus is
blocked by a knife-edge; the nature and the magnitude of the aberrations are then
inferred from the resulting patterns of intensity distribution at the observation
plane. (The knife-edge may be moved both along and perpendicular to the
optical axis.) The wire test is similar to the knife-edge test except that a fine wire
is used instead, to block certain groups of rays.
show several computed patterns of intensity distribution for the knife-edge and
wire tests, respectively.
The results of the simulated knife-edge test depicted in Figure 44.9 assume a
laser as the light source. Consequently, frames (a) and (b) of Figure 44.9 exhibit
several dark lines which, with a less coherent light source, would have been
absent. The results of the simulated wire test shown in Figure 44.10 assume an
extended light source, since the small amount of spherical aberration present in
the system under consideration would render the test useless with a wire which,
fine as it may be, will still be wider than the focused spot produced by a laser
beam. Note the similarities between the patterns of Figures 44.9 and 44.10 on the
one hand, and those of Figures 44.5(d)–(f) on the other.
45 The Shack–Hartmann wavefront sensor

Roland Shack invented the device now known as the Shack–Hartmann wavefront
sensor in the early 1970s.1,2 This sensor, which in recent years has been com-
mercialized, measures the phase distribution over the cross-section of a given
beam of light without relying on interference and, therefore, does not require a
reference beam.
The standard method of wavefront analysis is interferometry, where one brings
together on an observation plane the beam under investigation (hereinafter the
test beam) and a reference beam in order to form tell-tale fringes.3 The trouble
with interferometry is that it requires a reference beam, which is not always
readily available. Moreover, the coherence length of the light used in these
measurements must be long compared with the path-length difference between
the reference and test beams. Thus, when the available light source happens to be
broad-band, it becomes difficult (though by no means impossible) to produce
high-contrast fringes. The Shack–Hartmann instrument solves these problems by
eliminating altogether the need for the reference beam.
Figure 45.1 The Shack cube is used here to measure the surface quality of a
spherical mirror. The cube is a 50/50 beam-splitter capped by an index-matched
plano-convex lens. The light from the point source is partially reflected from this
spherical cap, producing a reference beam that comes to focus at C. The beam
that passes through the cap illuminates the test mirror, then returns and crosses
the cube and is focused at C′. The interference pattern between the test and
reference beams is viewed at the observation plane. The cube’s axis is slightly
displaced from the axis of the mirror in order to separate C from C′, which is
needed for producing straight-line fringes.
source in the beam-splitter’s half-silvered mirror. The light reflected from the
spherical cap (and focused at C) forms the reference beam. (Incidentally, this
interferometer was also invented by Roland Shack in the 1970s, and is now
known as the Shack cube.4)
Note in Figure 45.1 that the pinhole is placed directly on the face of the beam-
splitter to eliminate possible aberrations of the beam upon entering and exiting
the cube. The reflectivity of the spherical cap is about 4%, which is similar to that
of the uncoated test mirror. The equal-strength test and reference beams thus
produce a high-contrast fringe pattern. Figure 45.2(a) shows a typical phase
distribution over the cross-section of a test beam reflected from a mirror having
several waves of aberration. The computed interference pattern between this and
an equal-strength reference beam is shown in Figure 45.2(b). Needless to say, the
fringe contrast is excellent and the observed fringes may be related directly to the
wavefront aberrations. In general, the coherence length of the light source must
be long enough to ensure that, at the observation plane, the test and reference
beams remain mutually coherent. For testing small mirrors having a short focal
length (say, less than 10 cm) a single radiation line of an arc lamp may suffice,
but for larger mirrors a long-coherence-length laser is usually necessary.
In practice, the center of curvature of the test mirror is slightly displaced from C,
as shown in Figure 45.1, so that the rays bouncing off the mirror and arriving at the
exit facet of the cube would converge not to C, but to a nearby point C′. This small
lateral displacement of the test beam relative to the reference beam produces
straight-line fringes in the interferogram at the observation plane. Such fringes are
very sensitive to small aberrations of the mirror, and their deviation from linearity
can be related easily to minute surface errors. If the errors are large, however, there is
no need for straight-line fringes, and the center of the test mirror can coincide with C.
The combination of the cube beam-splitter and the spherical cap may be
considered a thick lens. This lens projects a real image of the test mirror in the
space behind the cube. The best place to observe the fringes, therefore, is at the
location of this image, where the fringes are localized on the mirror, and the
observer can readily identify areas that need further grinding and polishing.
Another advantage is that scratches and dust particles on the mirror come to focus
at its image, thus eliminating spurious fringes of the scattered light that downgrade
the quality of the interferograms obtained at other locations in the image space.
(The spherical cap, of course, must be kept clean at all times to prevent dust
particles that have collected there from producing their own spurious fringes.)
The test mirror depicted in Figure 45.1 does not have to be spherical, but may
be a mild paraboloid or hyperboloid whose center of curvature is, as before,
placed at or near the point C. The departure of the mirror’s figure from sphericity
imparts a certain amount of spherical aberration to the test beam, which may be
calculated in advance. The optician then looks for aberrations above and beyond
this expected amount of spherical aberration in order to determine the necessary
corrections.
Large telescope mirrors may also be tested with a Shack cube, but they require
the use of an additional lens system known as a null-corrector.3 A telescope’s
primary mirror is generally a large paraboloid or hyperboloid designed for
operation at “infinite conjugate”, that is, it brings the collimated beam of a distant
star to focus within the mirror’s focal plane. Testing such a large mirror with a
collimated beam is impractical, however, and its actual departure from sphericity
is too severe to be simply subtracted from the interferograms obtained in the
system of Figure 45.1. In such situations, a null-corrector is designed to cancel
the spherical aberrations of a test beam originating from a point source located at
the mirror’s center of curvature. When a properly calibrated null-corrector is
inserted between the Shack cube and the test mirror, the observed interferogram
registers only the departure of the mirror from its desired figure.
[Figure: schematic labeled with the test beam, beam expander, lenslet array, and CCD array.]

Figure 45.4 Intensity distribution at the focal plane of a 6 × 6 lenslet array, when
the incident beam is assumed to have the phase distribution of Figure 45.2(a).
Each square lenslet is 1000λ × 1000λ in size, and has focal length f = 25 000λ. The
logarithmic plot of intensity shown here reveals the fine detail of the distribution at
the CCD array. In practice, the fine detail is rather faint and only the center of each
spot is detected by the CCD.
The local slopes are then patched together to reconstruct the complete phase
distribution over the cross-section of the test beam.
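A one-dimensional sketch of this patching step is shown below. It assumes that each focused spot is displaced from its lenslet axis by the focal length times the local wavefront slope, and it integrates the slopes across one row of lenslets with the trapezoidal rule. The function name is hypothetical; the lenslet pitch (1000λ) and focal length (25 000λ) are those quoted in the caption of Figure 45.4, and the spot shifts are made-up values corresponding to a purely quadratic wavefront.

```python
def reconstruct_wavefront(spot_shifts, focal_length, pitch):
    """Integrate local slopes (spot shift / focal length) across a row of lenslets.

    Returns the wavefront height at the lenslet centers, relative to the first lenslet.
    """
    slopes = [shift / focal_length for shift in spot_shifts]
    heights = [0.0]
    for left, right in zip(slopes, slopes[1:]):
        # Trapezoidal rule between the centers of adjacent lenslets.
        heights.append(heights[-1] + 0.5 * (left + right) * pitch)
    return heights

# All lengths in units of the wavelength; six lenslets across the beam.
shifts = [-125.0, -75.0, -25.0, 25.0, 75.0, 125.0]
profile = reconstruct_wavefront(shifts, focal_length=25000.0, pitch=1000.0)
print(profile)
```

In two dimensions the x- and y-slopes must be reconciled with each other, typically by a least-squares fit, but the principle is the same.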
Historical notes
The predecessor to the Shack–Hartmann sensor was Hartmann’s screen test,
which used an array of holes in place of the lenslets.3,5,6,7 Shack realized the
advantages of using a lenslet array and set out to fabricate one, since no such
array with the characteristics he desired was available at the time. He made a
mold by using a cutting tool to carve parallel grooves in a piece of flat glass, as
shown in Figure 45.5. Two such pieces of grooved glass, oriented at right angles
to each other, were clamped to an acrylic sheet and heated in an oven to mold
convex ribs on each side of the acrylic sheet, thus forming an array of crossed
cylindrical lenses. The first such array had 50 × 50 lenslets, each with an area of
1 × 1 mm² and a focal length of 150 mm.

Figure 45.5 A piece of flat glass on which identical grooves were carved
served as a mold for the early lenslet arrays. Two such pieces were prepared
and placed face-to-face across a plastic sheet at right angles to each other.
The assembly was then heated in an oven to transfer the pattern of the mold
to the plastic sheet. The 1 mm-wide grooves had a depth of only a few
micrometers.
Before the advent of CCD detectors in the 1980s, wavefront analysis was done
by examining a photographic plate exposed to the array of focused spots. The
plate was also exposed (simultaneously and through the same array of lenslets) to
a parallel, aberration-free reference beam. The spots formed by this reference
beam marked the center of each frame, thus providing reference points for
measuring the displacement of the spots formed by the test beam. The tedious
task of exposing and developing the photographic plate, followed by painstak-
ingly measuring the positions of individual spots, was rewarding nonetheless; it
allowed astronomers to measure the aberrations of their telescopes in the field
using unfiltered star light. Even atmospheric turbulence did not pose a serious
problem for this method, since its effects were simply averaged over during the
relatively long exposure time of the photographic plate.
46 Ellipsometry

[Figure: ellipsometer schematic labeled with the lens, analyzer, and detector, the angles ρp, 45°, and θ, and the X and Z axes.]
polarizer angle ρp. At this point the reflected beam is linearly polarized, its E-field
components along X and Y being proportional to |rp| and |rs|, respectively. In the
reflected path the analyzer, whose transmission axis is also adjustable, is rotated
through an angle ρa = tan⁻¹(|rp|/|rs|) = ψr to block the light that would otherwise
reach the detector. Thus by measuring the values of ρa and ρp that null the detector’s
signal, one obtains the amplitude ratio |rp|/|rs| and the relative phase φrp − φrs of the
sample’s reflection coefficients.
Measuring the sample reflectivities Rp, Rs using a nulling ellipsometer is
straightforward; all one needs to do is monitor the detector signal S at ρa = 0° and 90°.
Calibration requires removing the sample and aligning the arms of the ellipsometer
with each other (i.e., θ = 90°), in which case the light from the source goes through
the entire system and yields a detector signal corresponding to a 100% sample
reflectivity. Optical power fluctuations could be countered by splitting off a small
fraction of the beam at the source and monitoring its variations with an auxiliary
detector. The signal from the auxiliary detector is subsequently used to normalize the
reflectivity signals.
Needless to say, the same types of measurement as discussed above, when
performed on the transmitted beam, yield the values of Tp, Ts, φtp − φts and ψt.
[Figures: (i) sample geometry showing the incident, reflected, and transmitted beams, the field components Ep and Es, the incidence angle θ, the film thickness d, and the substrate; (ii) sensitivity plots of ΔRp, ΔRs, ΔTp, ΔTs (R and T variation, %) and of Δψr, Δψt, Δ(φrp − φrs), Δ(φtp − φts) (angle variation, degrees) versus n (4.0 to 5.0), k (1.5 to 2.0), and thickness (20 to 30 nm); (iii) detector signal versus analyzer angle ρa (0° to 180°) for several polarizer angles ρp, panels (a) and (b).]
Figure 46.6 The detector signal S versus the orientation angle ρa of the
analyzer in the nulling ellipsometer of Figure 46.1 with the sample of Figure
46.2. Different curves correspond to different values of the polarizer angle
ρp. The total optical power of the unpolarized (or circularly polarized) beam
emerging from the source is unity, the detector’s conversion factor is 4,
the incidence angle is θ = 60°, and the focusing and collimating lenses have
NA = 0.025. In (a) the assumed system is perfect. In (b) there are departures from
ideal behavior, namely, the polarizer and analyzer have a 1:100 extinction ratio,
the angle of incidence deviates by 1°, and the quarter-wave plate’s retardation
is 87° while its axes are 1° away from the ideal 45° orientation.
In Figure 46.6(a) the assumed system is perfect, while in Figure 46.6(b) errors are
incorporated into the various components, namely, the assumed polarizer and
analyzer have a 1:100 extinction ratio, the angle of incidence on the sample is
θ = 61°, and the QWP’s retardation is 87° while its axes are 1° away from the
ideal 45° orientation.
The null in Figure 46.6(a) is achieved with ρp = 47° and ρa = 32.2°, yielding
φrp − φrs = 4° and ψr = 32.2°, as expected. Also the detector signals at ρa = 0° and
90° are 0.296 and 0.748, which correspond to the correct values of Rp and Rs. In
practice, even in this ideal case with perfect components the exact location of the
null may not be easy to determine. This produces a certain degree of inaccuracy,
depending on the available signal-to-noise ratio at the detector. In the case of
Figure 46.6(b), where the assumed components have substantial errors, the
minimum signal occurs at ρp = 54° and ρa = 30°, yielding φrp − φrs = 18° and
ψr = 30°. The reflectivities in this case (obtained at ρp = 9°, and ρa = 0° and 90°) are
Rp = 0.308, Rs = 0.727. If we consider the sensitivity curves in Figures 46.3–46.5,
such huge errors are clearly unacceptable.
A more realistic situation might correspond to small system errors; suppose,
for instance, that the polarizer and the analyzer have extinction ratios of 1:1000,
the angle of incidence on the sample has a 0.25 error (h ¼ 60.25 ), and
the QWP’s retardation is 90.5 while its axes are misaligned by only 0.25 .
In this case the minimum signal occurs at qp ¼ 49 and qa ¼ 31.8 , yielding
rp rs ¼ 8 and wr ¼ 31.8 . The reflectivities (obtained at qp ¼ 4 , and qa ¼ 0
and 90 ) are Rp ¼ 0.291 and Rs ¼ 0.757. It is thus clear that the nulling ellipso-
meter requires a high degree of accuracy in its components in order to achieve a
reasonable level of confidence in its estimates of sample parameters.
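The three cases quoted above can be cross-checked with a few lines of arithmetic. The sketch below rests on two assumptions that fit the quoted numbers but are not restated in this passage: that the null angles map to the ellipsometric parameters as φrp − φrs = 2(ρp − 45°) and ψr = ρa, and that tan ψr = √(Rp/Rs). The function names are illustrative only.

```python
import math

def null_to_parameters(rho_p_deg, rho_a_deg):
    """Assumed mapping from null angles to (phi_rp - phi_rs, psi_r), in degrees."""
    return 2.0 * (rho_p_deg - 45.0), rho_a_deg

def psi_from_reflectivities(Rp, Rs):
    """psi_r in degrees, assuming tan(psi_r) = |rp|/|rs| = sqrt(Rp/Rs)."""
    return math.degrees(math.atan(math.sqrt(Rp / Rs)))

# (rho_p, rho_a, Rp, Rs) as quoted in the text for the three systems
cases = {
    "ideal":        (47.0, 32.2, 0.296, 0.748),
    "large errors": (54.0, 30.0, 0.308, 0.727),
    "small errors": (49.0, 31.8, 0.291, 0.757),
}
for name, (rho_p, rho_a, Rp, Rs) in cases.items():
    dphi, psi_null = null_to_parameters(rho_p, rho_a)
    psi_refl = psi_from_reflectivities(Rp, Rs)
    print(f"{name}: dphi={dphi:.1f} deg, psi(null)={psi_null:.1f}, psi(Rp,Rs)={psi_refl:.2f}")
```

Under these assumptions the two estimates of ψr agree closely for the ideal system, whereas for the system with large component errors they differ by roughly 3°, which is one way to see that the measured parameters have become mutually inconsistent.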
[Diagram residue: lens, Wollaston prism, and photodetectors producing signals S1 and S2.]
necessarily in that order. At the same time, the normalized difference signal
(S1 − S2)/(S1 + S2) exhibits a peak-to-valley variation equal to 2 sin(φrp − φrs). The
system of Figure 46.7 does not provide an independent measure of the other ellipsometric
parameter, ψr. However, since Rp and Rs are directly measurable, ψr is redundant.
In operating the system of Figure 46.7 it is not necessary to know the
time-dependence of the retardation Δ, nor in fact does one need to know the specific
value of Δ at any point during the measurement. The maximum and minimum
values of the sum signal and of the normalized difference signal contain all the
necessary information. Unlike the nulling ellipsometer, this system does not
require any adjustment of angles around a broad minimum; therefore, there is
much less uncertainty about the measured data points.
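The statements about the extreme values can be illustrated with a small Jones-vector calculation. The model below is a hypothetical one chosen to reproduce the quoted behavior, not necessarily the exact layout of Figure 46.7: a unit-power p-polarized beam passes through a variable retarder whose axes are at 45° to the p and s directions, reflects from the sample, and is split by a Wollaston prism oriented at 45°. The values Rp = 0.296, Rs = 0.748 and φrp − φrs = 4° are taken from the ideal nulling example; all function names are illustrative.

```python
import cmath
import math

def sum_and_difference(delta_deg, Rp, Rs, dphi_deg):
    """Return (S1 + S2, (S1 - S2)/(S1 + S2)) for retardation delta_deg.

    Hypothetical model: p-polarized unit-power input, retarder axes at 45 deg
    to p and s, Wollaston prism axes at 45 deg to p and s.
    """
    half = math.radians(delta_deg) / 2.0
    rp = math.sqrt(Rp) * cmath.exp(1j * math.radians(dphi_deg))  # carries phi_rp - phi_rs
    rs = math.sqrt(Rs)
    Ep = rp * math.cos(half)        # reflected p-component
    Es = 1j * rs * math.sin(half)   # reflected s-component
    S1 = abs(Ep + Es) ** 2 / 2.0    # the two Wollaston output channels
    S2 = abs(Ep - Es) ** 2 / 2.0
    return S1 + S2, (S1 - S2) / (S1 + S2)

def extremes(Rp, Rs, dphi_deg, step_deg=0.1):
    """Scan the retardation over 0..360 deg; return (sum max, sum min, peak-to-valley)."""
    n = int(round(360.0 / step_deg))
    samples = [sum_and_difference(k * step_deg, Rp, Rs, dphi_deg) for k in range(n + 1)]
    sums = [s for s, _ in samples]
    diffs = [d for _, d in samples]
    return max(sums), min(sums), max(diffs) - min(diffs)

s_max, s_min, p2v = extremes(0.296, 0.748, 4.0)
recovered_dphi = math.degrees(math.asin(p2v / 2.0))
print(s_max, s_min, p2v, recovered_dphi)
```

In this model the sum signal swings between Rs and Rp and the peak-to-valley excursion of the normalized difference equals 2 sin(φrp − φrs), without any knowledge of the retardation at individual sample points.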
For the ideal system depicted in Figure 46.7, Figure 46.8(a) shows computed
plots of the sum signal and the normalized difference signal versus the retardation
Δ. The maximum and minimum values of the sum signal are 0.748 and 0.296,
[Figure 46.8, panels (a) and (b): sum signal S1 + S2 and normalized difference signal
(S1 − S2)/(S1 + S2) versus retardation (0°–360°).]
Figure 46.8 Computed plots of the sum and (normalized) difference signals in
the system of Figure 46.7 for the sample shown in Figure 46.2. The horizontal
axis depicts the relative phase imparted to the beam by the variable retarder. The
beam emerging from the polarizer has unit optical power, the detectors’
conversion factor is unity, the incidence angle is θ = 60°, and the focusing and
collimating lenses have NA = 0.025. (a) The assumed system is perfect. (b) Two
instances (solid lines, broken lines) of imperfect system behavior.
Dennis Gabor (1900–1979). His life-long love of physics started at the age
of 15. Fascinated by Abbe’s theory of the microscope and by Lippmann’s
method of color photography, he and his brother built up a home laboratory
and began experimenting with X-rays and radioactivity. Gabor entered the
Technische Hochschule Berlin and acquired a diploma in 1924 and an
electrical engineering doctorate in 1927. His thesis work involved the
development of high-speed cathode ray oscillographs, in the course of which
he built the first iron-shrouded magnetic electron lens. In 1927 he joined
Siemens & Halske AG, where he invented a high-pressure quartz mercury
lamp, since used in millions of street lamps. With the rise of Hitler in 1933,
Gabor left for England and obtained employment with the British firm
Thomson–Houston. At Thomson–Houston he developed a system of stereo-
scopic cinematography, and in his last year there carried out basic experi-
ments in holography. In 1949 he joined the Imperial College of Science and
Technology (London) and remained there as Professor of Applied Electron
Physics until his retirement in 1967. (Photo: courtesy of AIP Emilio Segrè
Visual Archives, W. F. Meggers Collection.)
47 Holography and holographic interferometry
Holography dates from 1947, when the Hungarian-born British scientist Dennis
Gabor (1900–1979) developed the theory of holography while working to
improve electron microscopy.1,2 Gabor coined the term “hologram” from the
Greek words holos, meaning whole, and gramma, meaning message. The 1971
Nobel prize in physics was awarded to Gabor for his invention of holography.
Further progress in the field was prevented during the following decade
because the light sources available at the time were not truly coherent. This
barrier was overcome in 1960, with the invention of the laser. In 1962 Emmett
Leith and Juris Upatnieks of the University of Michigan recognized, from their
work in side-looking radar, that holography could be used as a three-dimensional
visual medium. They improved upon Gabor’s original idea by using a laser and
an off-axis technique.3 The result was the first laser transmission hologram of
three-dimensional objects. The basic off-axis technique of Leith and Upatnieks
is still the staple of holographic methodology. These transmission holograms
produce images with clarity and realistic depth, but require laser light to view the
holographic image.
The Russian physicist Yuri Denisyuk combined holography with Lippmann’s
method of color photography. In 1962 Denisyuk’s approach produced a white-light
reflection hologram, which could be viewed in the light from an ordinary light bulb.
In 1968 Stephen Benton, then at Polaroid Corporation, invented white-light
transmission holography.4,5,6 This type of hologram can be viewed in ordinary white
light and is commonly known as the rainbow hologram. These holograms, which
are “printed” by direct stamping of the interference pattern onto plastic, can be
mass produced rather inexpensively.7
Basic principles
A setup for recording a simple transmission hologram is shown in Figure 47.1.
The coherent beam of the laser, after being expanded to cover the area of interest,
is split into an object beam and a reference beam. The object beam passes through
(or reflects from) the object before arriving at the photographic plate; the
reference beam is directed toward the photographic plate at an oblique angle θ. At the
XY-plane of the plate the complex-amplitude distribution of the object beam is
AO(x, y). The reference beam’s amplitude, AR(x, y), is proportional to
exp[i(2π/λ)(xSx + ySy)], where Sx and Sy are the direction cosines of the beam. The two
beams interfere at the plate, upon which their interference fringes are recorded.
When the plate is properly processed and developed, its amplitude transmissivity
τ(x, y) becomes proportional to the incident intensity pattern, that is,8,9,10
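The relation announced here did not survive extraction. A reconstruction consistent with the description above (transmissivity proportional to the intensity of the superposed object and reference beams) is sketched below; the proportionality constant and the original equation number are not recoverable, and τ is used for the amplitude transmissivity.

```latex
\tau(x,y) \;\propto\; \bigl|A_O(x,y) + A_R(x,y)\bigr|^2
  \;=\; |A_O|^2 + |A_R|^2 + A_O A_R^{*} + A_O^{*} A_R .
```

The two cross terms are the ones that record the phase of the object wave relative to the reference beam.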
[Figure 47.1 diagram labels: laser, beam expander, mirror, reference beam,
photographic plate, Z-axis.]
Figure 47.1 The basic optical system used for recording a simple hologram.
The laser beam is expanded to accommodate the size of the object. The beam-
splitter separates a fraction of the light to be used as a reference beam and sends
it along a path that reaches the photographic plate at an oblique angle. The rest of
the beam continues along the Z-axis, interacts with the object, and arrives at the
photographic plate while carrying the phase/amplitude information about
the object. The two beams interfere and the plate records the resulting fringes of
the interference pattern. The film is subsequently developed into a positive (or
negative) transparency and becomes a permanent record of the object wave.
To reconstruct the object wave, the developed plate is returned to its original
position and illuminated with the reference beam, as shown in Figure 47.2. The
transmitted beam’s complex amplitude may thus be written
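The equation referred to here is missing from the extracted text. Assuming, as stated earlier, that the developed plate's transmissivity is proportional to the recorded intensity |AO + AR|², multiplying by the reconstruction beam AR gives the following sketch (constant factors omitted, original equation number not recoverable):

```latex
A_R(x,y)\,\tau(x,y) \;\propto\;
  A_R \bigl( |A_O|^2 + |A_R|^2 \bigr)
  \;+\; |A_R|^2 A_O
  \;+\; A_R^{2} A_O^{*} .
```

The three groups of terms correspond, in order, to the modulated reference beam, the reconstructed object wave, and the conjugate wave.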
Note in the above equation that |AR(x, y)| is a constant, independent of x and y, and
that AR²(x, y) is a plane wave with direction cosines 2Sx and 2Sy. (When θ is small,
the propagation direction of this plane wave makes an angle 2θ with the Z-axis.)
Thus in addition to the reference beam AR(x, y) – which is modulated by the squared
modulus of the object wave – the wavefront emerging from the hologram contains
the original object wave AO(x, y), as well as its complex conjugate AO*(x, y). The
reconstructed object wave travels in its original direction (i.e., along the Z-axis, in the
case of Figure 47.2), but the conjugate wave rides on a plane wave whose deviation
angle from the Z-axis is nearly twice that of the original reference beam.
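A quick numerical check of this decomposition, as a minimal sketch: it assumes a transmissivity exactly equal to |AO + AR|² (constant factors dropped), a unit-amplitude reference plane wave tilted by 8° in the XZ-plane, and an arbitrary, hypothetical sample value for the object wave. The helper names are illustrative.

```python
import cmath
import math

def reference(x, wavelength, Sx):
    """Unit-amplitude reference plane wave sampled on the plate (y = 0)."""
    return cmath.exp(1j * 2.0 * math.pi * Sx * x / wavelength)

def transmitted_terms(A_O, A_R):
    """Three groups of terms in A_R * |A_O + A_R|^2."""
    zero_order = A_R * (abs(A_O) ** 2 + abs(A_R) ** 2)   # modulated reference beam
    object_wave = abs(A_R) ** 2 * A_O                    # reconstructed object wave
    conjugate_wave = A_R ** 2 * A_O.conjugate()          # conjugate wave on carrier A_R^2
    return zero_order, object_wave, conjugate_wave

wavelength = 1.0
Sx = math.sin(math.radians(8.0))   # direction cosine of a reference beam at 8 deg
x = 12.3                           # arbitrary point on the plate, in wavelengths
A_O = 0.4 * cmath.exp(0.7j)        # hypothetical object-wave sample
A_R = reference(x, wavelength, Sx)

direct = A_R * abs(A_O + A_R) ** 2
residual = abs(direct - sum(transmitted_terms(A_O, A_R)))

# A_R**2 is a plane wave with direction cosine 2*Sx
carrier_angle = math.degrees(math.asin(2.0 * Sx))
print(residual, carrier_angle)
```

The residual vanishes to rounding error, and for an 8° reference beam the carrier of the conjugate wave comes out slightly above 16°, in line with the "nearly twice" statement.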
[Figure 47.2 diagram labels: laser, beam-splitter, hologram, Z-axis.]
Figure 47.2 To reconstruct the recorded wavefront one places the hologram in
front of the same reference beam as used for recording. Upon transmission
through the hologram several reconstructed waves emerge. If the hologram is in
the same position as it was during recording, the virtual image of the object will
be carried by the component of the emergent beam traveling along the Z-axis.
However, if the hologram is flipped then a real image of the object emerges
along the Z-axis. (The flipping is such that the reconstruction beam becomes the
conjugate of the original reference beam with respect to the hologram.)
Behind the hologram, the reconstructed object wave yields the virtual image of
the recorded object; this image may be viewed through the lens of an eye or
photographed through the lens of a camera. The conjugate wave yields a real
image of the object, which can be visually inspected or photographed by placing
a photographic plate directly in its path. The transmitted portion of the recon-
struction beam itself does not carry any useful information and is generally
ignored.
Figure 47.4 A plane wave traveling along the Z-axis and transmitted through
the face at z = 0 arrives at the photographic plate at z = 3500λ. (a) Distribution of
the logarithm of intensity of the object wave at the plate. (b) Object wave’s
phase distribution at the plate. (c) Interference pattern (logarithm of intensity)
between the object wave and a reference plane wave traveling at θ = 8° relative
to the Z-axis. (d) Distribution of the logarithm of intensity immediately after the
hologram, when the exposed plate is developed into a positive transparency and
placed in front of the reconstruction beam.
beam are all mixed together and, therefore, difficult to identify separately. Since
these components are traveling in different directions, propagation over a short
distance is all that is required to disentangle them from each other.
The real image of the face – produced by the conjugate wave AO*(x, y) – is shifted
further off-axis, and appears in the upper right corner of Figure 47.5(a).
Holographic reconstruction produces not only the amplitude of the original
object but also its phase pattern, as is evident from Figure 47.5(b). Unlike regular
photography, which maintains a record of the intensity profile but loses all trace
of phase, the holographic process preserves both the amplitude and the phase
information, and faithfully reproduces the entire object wave upon reconstruction.
A comparison of the central regions of Figures 47.5(a), (b) with the original object
wave of Figures 47.4(a), (b) might be worthwhile here, although one should note
that the reconstructed wave in Figures 47.5(a), (b) is captured at an effective distance
of 7000λ from the original object, whereas the patterns of Figures 47.4(a), (b)
correspond to a propagation distance of only 3500λ.
To observe the virtual image, one should place an imaging lens in the central
region of the field and produce a real image from the reconstructed object
wave. (Alternatively, one could propagate the reconstructed object wave
backwards in space by 7000λ to reproduce the object wave at its point of
origination.) A one-to-one imaging lens (NA = 0.04, f = 3500λ) placed in the
central region of Figures 47.5(a), (b) will create an inverted real image of the
face at z = 7000λ behind the lens. The resulting intensity and phase patterns
are shown in Figures 47.5(c), (d). The loss of resolution due to the small size of
the hologram is visible at the edges of the various facial features, from which the
high-spatial-frequency content of the original face is obviously missing (compare
with Figure 47.3(b)).
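The one-to-one imaging geometry follows from the thin-lens equation. The sketch below assumes an ideal thin lens, lengths in units of λ, and that the quoted NA is the sine of the half-angle subtended by the lens as seen from the object at 7000λ; that last interpretation is an assumption, since the text does not define it here.

```python
import math

def image_distance(s, f):
    """Thin-lens equation 1/s + 1/s' = 1/f, solved for the image distance s'."""
    return 1.0 / (1.0 / f - 1.0 / s)

s = 7000.0    # effective object distance in wavelengths (virtual image to lens)
f = 3500.0    # focal length in wavelengths
s_img = image_distance(s, f)
magnification = -s_img / s            # negative sign: inverted image

NA = 0.04
aperture_radius = s * math.tan(math.asin(NA))   # lens radius implied by the assumed NA
print(s_img, magnification, aperture_radius)
```

With the object at twice the focal length the image also forms at twice the focal length with magnification −1, matching the inverted real image at 7000λ behind the lens; the implied lens radius is about 280λ.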
If the hologram is flipped during playback, the reconstruction beam, being a
plane wave in this example, becomes the conjugate of the original reference
beam, namely, AR*(x, y). (Alternatively, the reference beam may be conjugated
and brought in from the opposite side of the hologram.) Under such
circumstances the transmitted wave along the original direction of the object wave (i.e.,
the Z-axis in the present example) becomes the conjugated object wave, AO*(x, y),
and the reconstructed object wave moves off-axis. This situation is depicted in
Figure 47.6, where, after propagating 3500λ beyond the hologram, the various
components of the transmitted beam have separated from each other. The
intensity distribution in Figure 47.6(a) reveals at the center the real image of the
face, slightly to the lower left the directly transmitted reconstruction beam, and
close to the lower left corner the beam containing the virtual image. There is also
a weaker image of the face on the right-hand side of the real image; this “second
harmonic” of the face is created by the nonlinearity of the photographic process.
Figures 47.6(c), (d) are close-ups of the intensity and phase patterns in the real
image produced by the conjugated object wave.
Holographic interferometry
Suppose the face shown in Figure 47.3 is somehow distorted at a later time or has
undergone changes in its optical properties such that the beam transmitted
through the face has acquired a certain degree of phase modulation. To render
this phase modulation visible by converting it to intensity variations, it is
necessary to interfere the beam transmitted through the face with a reference
beam. If a collinear plane wave is chosen as reference, the resulting interferogram
will resemble that in Figure 47.7(a). Here the deformation contours appear as
black and white fringes superimposed on the face. One can also see in this figure
the fringes caused by the phase structure of the facial features, namely, the eyes,
the nose, and the mouth.
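The conversion of phase modulation into fringes can be sketched in a few lines. The example assumes unit-amplitude object and reference waves and a purely hypothetical phase profile that grows linearly to three full waves across the aperture; none of these numbers come from the figures.

```python
import cmath
import math

def interferogram_intensity(phase, object_amplitude=1.0, reference_amplitude=1.0):
    """Intensity of the object wave a*exp(i*phase) plus a collinear plane-wave reference."""
    field = object_amplitude * cmath.exp(1j * phase) + reference_amplitude
    return abs(field) ** 2

# Hypothetical phase modulation: 0 to 3 full waves (6*pi) across the aperture.
N = 601
phases = [2.0 * math.pi * 3.0 * k / (N - 1) for k in range(N)]
intensity = [interferogram_intensity(p) for p in phases]

# Count dark fringes as strict local minima of the sampled intensity.
dark_fringes = sum(
    1 for k in range(1, N - 1)
    if intensity[k] < intensity[k - 1] and intensity[k] < intensity[k + 1]
)
print(dark_fringes)
```

For equal amplitudes the intensity is 2 + 2 cos(phase), so each additional 2π of phase adds one dark fringe; contours of constant phase in the distorted object therefore show up as black and white bands.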
Figure 47.7 Two interferograms of the distorted face. In (a) the reference
beam is a plane wave, whereas in (b) the distorted face is made to interfere with
its own undistorted version.
Figure 47.8 A plane wave traveling along the Z-axis and transmitted through
the distorted face at z = 0 arrives at the photographic plate at z = 3500λ. In this
double-exposure experiment a hologram of the undistorted face has already been
recorded on the plate. (a) Logarithmic plot of the object wave’s intensity
distribution at the plate. (b) The object wave’s phase distribution at the plate. (c) Pattern
of interference between the object wave and a reference plane wave traveling
at θ = 8° relative to the Z-axis. (d) Distribution of the logarithm of intensity
immediately after the hologram, when the twice-exposed film is developed into a
positive transparency and placed in front of the reconstruction beam.
directly. This also provides a natural and very sensitive method of aligning the
hologram to the original position after it has been removed for processing.
48 Self-focusing in nonlinear optical media
† The coauthor of this chapter is Ewan M. Wright of the College of Optical Sciences,
University of Arizona.
[Figure 48.1, panels (a) and (b): axes x/λ and y/λ, each spanning −1100 to 1100.]
Figure 48.1 Plots of intensity distribution for (a) the X-component and (b) the
Z-component of polarization. These plots represent the cross-section of a Gaussian
beam having a 1/e radius 1000λ.
waist equals 1177λ, and its peak intensity Imax = 0.64I0. Here I0 is an arbitrary scale
factor used to normalize all intensity profiles throughout this chapter.
The beam cannot satisfy Maxwell’s equations unless it has a component of
polarization Ez along the Z-axis; the computed intensity profile |Ez|² for this
Z-component is shown in Figure 48.1(b). For a beam whose cross-section is
substantially larger than a wavelength, the power content of Ez is typically much
less than that of Ex. For example, in the present case the fraction of the total
optical power carried by the Z-component is only 0.25 × 10⁻⁷. We will see below
that Ez gains in strength as the beam converges towards focus.
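Both quoted numbers can be reproduced with simple estimates. The sketch assumes that the "1/e radius" r0 = 1000λ refers to the field amplitude exp(−r²/r0²), that 1177λ is the full width at half maximum of the intensity (the defining sentence is truncated in this extract), and that Ez follows from the paraxial form of the divergence condition, Ez ≈ (i/k) ∂Ex/∂x. These are assumptions, not statements from the text.

```python
import math

def intensity_fwhm(r0):
    """FWHM of |exp(-r^2/r0^2)|^2, i.e. of the intensity exp(-2 r^2/r0^2)."""
    return 2.0 * r0 * math.sqrt(math.log(2.0) / 2.0)

def ez_power_fraction(r0, wavelength=1.0):
    """Paraxial estimate of the power in Ez relative to Ex.

    With Ez ~ (i/k) dEx/dx and Ex = exp(-r^2/r0^2), the ratio of the
    integrated intensities reduces to 1/(k r0)^2.
    """
    k = 2.0 * math.pi / wavelength
    return 1.0 / (k * r0) ** 2

fwhm = intensity_fwhm(1000.0)          # in wavelengths
fraction = ez_power_fraction(1000.0)
print(fwhm, fraction)
```

Under these assumptions the width comes out near 1177λ and the power fraction near 2.5 × 10⁻⁸, i.e. 0.25 × 10⁻⁷.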
Figure 48.2 The beam of Figure 48.1 goes through a thin slab of a nonlinear
material, creating a change in the index of refraction in proportion to its
intensity. At the center, where the beam is brightest, the self-induced phase
shift is 10π. The intensity of the beam upon emerging from the slab is shown in
(a), (b), and its phase distribution in (c), (d). The plots on the left-hand side
correspond to the X-component of polarization, while those on the right-hand
side represent the Z-component.
Figure 48.2(c) shows the phase profile of the emergent wavefront for the
X-component of polarization. The gray-scale ranges from −π (black) to π (white),
and the number of rings indicates a total phase shift of 10π from the center to the
rim. The phase profile for the Z-component of polarization in Figure 48.2(d)
shows, in addition to the curvature, a π phase shift between the right and left
halves of the beam. Again this is a simple geometrical consequence of the
bending of the rays toward the optical axis.
The above example clearly demonstrates that a nonlinear medium can impart a
curvature phase factor to a beam during transmission. When the curvature is
negative the beam becomes divergent and expands upon further propagation.
Conversely, a positive curvature causes the beam to converge towards a focus.
This is the underlying physical mechanism of self-focusing in thick nonlinear
media, to which we now turn.
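The lens-like action of the thin slab can be put in numbers. The sketch assumes an intensity profile exp(−2r²/r0²) with r0 = 1000λ (the amplitude 1/e radius, an assumption), a peak self-induced phase of 10π, and a parabolic approximation of the phase near the axis; the function names are illustrative.

```python
import math

def kerr_phase(r, r0, peak_phase):
    """Self-induced phase, proportional to the intensity exp(-2 r^2 / r0^2)."""
    return peak_phase * math.exp(-2.0 * (r / r0) ** 2)

def equivalent_focal_length(r0, peak_phase, wavelength=1.0):
    """Focal length of the thin lens matching the phase curvature near the axis.

    Near r = 0 the phase is peak_phase * (1 - 2 r^2 / r0^2); equating the
    quadratic term to the lens phase k r^2 / (2 f) gives f = k r0^2 / (4 peak_phase).
    """
    k = 2.0 * math.pi / wavelength
    return k * r0 ** 2 / (4.0 * peak_phase)

peak_phase = 10.0 * math.pi
r0 = 1000.0   # assumed 1/e amplitude radius, in wavelengths

# Number of 2*pi wraps (rings) between the center and the far rim of the beam
rings = (kerr_phase(0.0, r0, peak_phase) - kerr_phase(10.0 * r0, r0, peak_phase)) / (2.0 * math.pi)
focal_length = equivalent_focal_length(r0, peak_phase)
print(rings, focal_length)
```

A 10π phase excursion corresponds to five 2π rings, and in the parabolic approximation the slab acts like a lens with a focal length of about 50 000λ. The true profile is Gaussian rather than parabolic, so this is only an order-of-magnitude guide.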
preserving the integrated intensity over the clear aperture of the beam. As before,
the distance between adjacent thin slabs is 5000λ, and the beam is propagated in
60 steps for a total distance of 300 000λ.
Figure 48.4 shows, from top to bottom, the initial half-Gaussian beam as well
as the patterns of intensity distribution within the medium after 20, 40, 50, and 60
propagation steps. Both columns show the profile of |Ex|², the intensity
distribution being on the left-hand side and its logarithm on the right-hand side.
(The logarithmic plot enhances weak features of the distribution, just like an
over-exposed photograph.)
We note several new features in this example. First, the beam comes to a
focus in the narrow dimension before it collapses in the wide dimension.
Second, the center of the beam shifts to the right as it propagates. This
self-deflection is caused by the prism-like phase factor that the nonlinear medium
imparts to the beam.4 An ideal prism imparts a phase factor that is linear in the
spatial coordinate x, namely, exp(i2πσx/λ), deflecting the beam by an angle
θ = sin⁻¹σ. One can explain the observed self-deflection in Figure 48.4 by
noting the similarity between the ideal phase factor of a prism and the phase
factor exp[iφ(x, y)] imposed on the half-Gaussian beam by the nonlinear
medium.
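The prism analogy can be made concrete by converting a local phase slope into a deflection angle. In the sketch below the phase ramp (10π spread over 1000λ) is a purely hypothetical example, not a value taken from Figure 48.4, and the helper names are illustrative.

```python
import math

def deflection_angle_deg(phase_slope, wavelength=1.0):
    """Deflection angle for a linear phase exp(i * phase_slope * x).

    Writing the phase as 2*pi*sigma*x/wavelength gives
    sigma = phase_slope * wavelength / (2*pi) and theta = asin(sigma).
    """
    sigma = phase_slope * wavelength / (2.0 * math.pi)
    return math.degrees(math.asin(sigma))

def local_phase_slope(phase, x, dx=1e-3):
    """Central-difference estimate of d(phase)/dx at x."""
    return (phase(x + dx) - phase(x - dx)) / (2.0 * dx)

# Hypothetical example: a phase ramp of 10*pi spread over 1000 wavelengths.
def ramp(x):
    return 10.0 * math.pi * x / 1000.0

slope = local_phase_slope(ramp, 0.0)
theta = deflection_angle_deg(slope)
print(theta)
```

For this ramp σ = 0.005 and the deflection is roughly 0.29°. In the nonlinear medium the slope varies across the beam, so different parts of the beam are deflected by different amounts.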
Finally, note in Figure 48.4 that the beam breaks up into multiple branches
after coming to focus. In practice the intensity at the focal point may be large
enough to damage the material. Even if damage does not occur, small material
inhomogeneities can cause substantial aberrations, distorting and breaking up the
beam in unpredictable ways. The fact that computer simulations also show this
type of breakup is due to small numerical errors incurred during computation.
Usually these numerical errors are insignificant, but when the intensity begins to
build up in the vicinity of a focal point, they cause the breakup of the beam in a
random-looking fashion.
Figure 48.3 Top to bottom: plots of intensity distribution after 20, 30, 40, 50,
60 steps of propagation through a nonlinear medium. The X-component of
polarization is on the left, and the Z-component on the right. The incident beam
is the Gaussian shown in Figure 48.1.
Beam filamentation
As mentioned above, if the beam’s power is large enough the beam breaks up
into many cells, each of which contains several critical powers and comes
independently to focus. Our final example concerns a uniform beam of diameter
2000λ with a constant intensity equal to 0.32I0 across the aperture. In this
simulation we placed 40 thin slabs of a nonlinear material at intervals of 15 000λ
along the Z-axis. Each slab imparts a phase shift of 15° at the reference intensity of
I0, which is equivalent to an incident optical power of 60Pcr. Shown in Figure 48.5
are the results of simulation after 10, 20, 30, 35, and 40 steps. At first, as a result of
diffraction during propagation, the beam breaks up into multiple rings. After 30
iterations the central region of the beam comes to a focus. Afterwards, the central
spot goes out of focus, but one of the rings breaks into multiple filaments.5 Small
Concluding remarks
Another mechanism that can couple the refractive index to the beam intensity
profile is absorption of the light followed by heating and thermal diffusion.
Variation of the refractive index in response to thermal expansion (or contraction)
of the material is a frequently observed source of nonlinear optical behavior.
Thermal effects usually produce negative values of n2, thus causing defocusing of
the beam. Heat diffusion further complicates the relation between n(x, y, z) and
I(x, y, z), by removing the local nature of their interdependence. In this chapter we
have confined our attention to the simple case of local nonlinearity with a positive
value for n2 and have shown examples of self-focusing and beam filamentation.
Similar studies can be carried out for thermally induced nonlinearities, provided
that heat diffusion is taken into consideration properly.
Kerr nonlinearity
The simplest nonlinearity capable of producing self-trapping (leading to soliton
formation in a planar waveguide) is a Kerr nonlinearity, obtained when the
refractive index of the medium has an intensity-dependent term of the form
n(x, y, z) = n0 + n2 I(x, y, z),
where I = |E|² is the electric field intensity of the optical beam. Since diffraction
tends to expand the spatial dimensions of a beam, the requisite nonlinearity must
produce self-focusing, which translates into a positive coefficient n2 for the Kerr
medium. (In contrast, temporal solitons can exist in media having either negative
or positive nonlinear indices, depending on whether the dispersion of the medium
is normal or anomalous.) The only spatial solitons that could exist in media with
negative n2 are dark solitons, which are localized depressions in a cw
background. Although both bright and dark solitons (spatial as well as temporal) have
been observed experimentally, we limit the discussion in this chapter to bright
spatial solitons in planar waveguides.
† This chapter is co-authored with Ewan M. Wright, Professor of Optical Sciences at the
University of Arizona.
[Figure 49.1 diagram label: guiding layer.]
Figure 49.1 A slab waveguide confines the beam in one spatial dimension (Y),
so that nonlinearity can act in the second dimension (X) to produce
self-confinement. The incident beam has wavelength λ0 in free space. In our simulations
the cladding glass material has refractive index n0 = 1.5, the guiding layer has
index n1 = 1.5056, and the thickness of the guiding layer is 5λ, where λ = λ0/n0
is the wavelength of the guided beam within the glass medium. The core and
cladding materials have the same nonlinear (Kerr) coefficient n2.
top frame shows the intensity profile of the injected beam upon entering the
waveguide. From top to bottom: z/λ = 0, 200, 500, 800. It is seen that the injected
beam initially expands to fill the guiding region in the Y-direction. The beam
subsequently broadens along X as it propagates in the Z-direction, but its width
along Y remains constant.
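The constant width along Y is what one would expect if the slab supports a single guided mode. The sketch below applies the standard normalized-frequency criterion for a symmetric step-index slab to the quoted parameters (n0 = 1.5, n1 = 1.5056, thickness 5λ with λ = λ0/n0); it treats the guide as linear, ignores the Kerr term, and is an added estimate rather than a calculation from the text.

```python
import math

def slab_v_number(n_core, n_clad, thickness_over_lambda0):
    """Normalized frequency V = (pi d / lambda0) * sqrt(n_core^2 - n_clad^2)."""
    na = math.sqrt(n_core ** 2 - n_clad ** 2)
    return math.pi * thickness_over_lambda0 * na

def guided_te_modes(v):
    """Number of guided TE modes of a symmetric slab: mode m exists for V > m*pi/2."""
    return int(math.floor(2.0 * v / math.pi)) + 1

n0, n1 = 1.5, 1.5056
thickness = 5.0 / n0          # 5 wavelengths in the glass, expressed in units of lambda0
V = slab_v_number(n1, n0, thickness)
modes = guided_te_modes(V)
print(V, modes)
```

With these numbers V is about 1.36, below the π/2 cutoff of the next mode, so only the fundamental mode is guided in the Y-direction.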
If a Kerr nonlinearity is introduced in the above waveguide, the broadening
along X will be countered by a self-focusing phase factor imposed on the
propagating beam’s cross-section. The Kerr medium’s refractive index
responds to the light by increasing in proportion to the local intensity, namely,
n(x, y, z) = n0,1 + n2I(x, y, z), where n2 > 0. Thus the bright central region of
the beam is phase shifted more than its tail sections, which are less bright,
resulting in a lens-like phase pattern that tends to focus the beam towards the
center. If the beam’s intensity is weak, this self-focusing effect will not be
sufficient to counter diffraction broadening. However, once the optical power
density exceeds a certain critical value, the index modulation becomes strong
enough to balance the effects of diffraction, resulting in an unchanging, stable
beam profile along the propagation path.
Figure 49.3 shows computed cross-sectional profiles of intensity along the
Z-axis for the same waveguide and the same injected beam as depicted in
Figure 49.2. The difference is that in the present case the nonlinear index n2 is no
longer zero, but chosen to yield a stable, non-diffracting guided beam. (The peak
intensity reached in this simulation raises the refractive index, locally and
instantaneously, by Δn = 0.0022.) The confined beam is a spatial soliton whose
properties can be readily evinced from Maxwell’s equations in conjunction with
the nonlinear index of the medium. Self-trapping in this one-dimensional case
(along the X-axis) is highly stable, and slight inhomogeneities of the guiding
medium or variations of the input optical power do not destabilize the trapped
beam. (Note that the second transverse dimension Y is essentially taken out of the
equations by the action of the slab waveguide.) In contrast, two-dimensional
self-focusing in a Kerr medium (i.e., in the absence of the slab waveguide) would be
highly unstable, resulting in catastrophic collapse and subsequent filamentation of
the beam.5,6 (See Chapter 48, Self-focusing in nonlinear optical media.)
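For orientation, the width of an ideal one-dimensional Kerr soliton can be estimated from the peak index change. The sketch assumes the textbook nonlinear Schrödinger model, whose bright soliton has the profile A0 sech(x/w) with w = 1/(k0 √(n0 Δn)); this model is an assumption brought in for illustration. The simulated beam is only quasi-one-dimensional, and its intensity is spread over the guided mode in Y, so the numbers below are indicative at best.

```python
import math

def soliton_width_in_lambda(n0, delta_n):
    """Width parameter w of the sech-shaped soliton, in units of lambda = lambda0/n0.

    Ideal 1D Kerr model: w = 1 / (k0 * sqrt(n0 * delta_n)), with k0 = 2*pi/lambda0.
    """
    w_over_lambda0 = 1.0 / (2.0 * math.pi * math.sqrt(n0 * delta_n))
    return n0 * w_over_lambda0

w = soliton_width_in_lambda(1.5, 0.0022)
fwhm = 2.0 * math.acosh(math.sqrt(2.0)) * w   # FWHM of the sech^2 intensity profile
print(w, fwhm)
```

For Δn = 0.0022 and n0 = 1.5 the width parameter is about 4λ and the intensity FWHM about 7λ, comfortably inside the ±15λ window of Figure 49.3.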
Figure 49.4 shows the total optical power P (as a fraction of the input power
P0) plotted versus z/λ in the linear and nonlinear waveguides whose behaviors are
depicted in Figures 49.2 and 49.3, respectively. The power P is computed at each
step of the simulation by integrating the guided beam’s intensity in the
cross-sectional plane of the waveguide. The initial steep drop in P/P0 is caused by
radiation into the cladding, at a time when the injected beam is still adjusting to
the waveguide; the guided mode is seen to stabilize after a fairly short
propagation distance. P(z) in the nonlinear guide behaves more or less the same as it
does in the linear guide, except for the steady-state value of the guided optical
power, which is somewhat greater in the presence of nonlinearity.
[Figure 49.3: cross-sectional intensity frames; axes x/λ (−15 to 15) and y/λ.]
Figure 49.3 Same as Figure 49.2, but with nonlinearity added to the
waveguide, n = n0,1 + n2I. (Note that the horizontal scale differs from that in
Figure 49.2.) The injected Gaussian beam initially expands in the Y-direction to
fill the guiding layer, but in the X-direction self-focusing combats the natural
tendency of the beam to expand by diffraction. From top to bottom, z/λ = 0, 200,
500, 800. The net result is a stable, non-diffracting beam that propagates along
the Z-axis, confined in the Y-direction by the waveguide, and in the X-direction
by the self-focusing action of the nonlinear medium.
[Figure 49.4 plot: vertical axis 0.5–1.0, horizontal axis 0–800; curves labeled
"Nonlinear waveguide" and "Linear waveguide".]
Figure 49.4 Total optical power P (as a fraction of the input power P0) plotted
versus z/λ in the linear and nonlinear waveguides whose behaviors are depicted
in Figures 49.2 and 49.3, respectively.
[Figure 49.5, two columns of frames: axes x/λ (−25 to 25) and y/λ.]
Figure 49.5 Two Gaussian beams, separated along the X-axis and having a
relative phase of 180°, are simultaneously launched into the slab waveguide of
Figure 49.1. The beams initially expand in the Y-direction to fill the width of the
guide; self-focusing then confines each Gaussian beam in the X-direction, and
the interaction between the two pushes them apart. (The peak intensity reached
in this simulation raises the refractive index, locally and instantaneously, by
Δn = 0.003.) The left column displays cross-sectional intensity profiles, while the
right column shows the corresponding phase distributions (light gray = 0°, dark
gray = 180°). From top to bottom, the frames represent propagation distances
z/λ = 0, 200, 500, 800, 1100, and 1600.
[Figure 49.6 plot: vertical axis 0.5–1.0, horizontal axis 0–1600.]
Figure 49.6 Total optical power P (as a fraction of the input power P0) plotted
versus z/λ for the pair of out-of-phase solitons depicted in Figure 49.5.
Figure 49.6 is a plot of the total optical power versus propagation distance for
the pair of out-of-phase solitons depicted in Figure 49.5. The steep initial drop is
caused by radiation into the cladding, during the time when the injected beams
are still adjusting to the waveguide. Once the solitons are established, however,
their power content remains essentially constant.
[Figure 49.7, two columns of frames: axes x/λ (−20 to 20) and y/λ.]
Figure 49.7 Two identical Gaussian beams, separated along the X-axis by 20λ
and having a constant, uniform phase in their cross-sectional plane, are
simultaneously launched into the slab waveguide of Figure 49.1. The various frames display
the patterns of intensity distribution in the waveguide’s cross section along the
Z-axis. From top to bottom, the propagation distance from the input port is z/λ = 0,
100, 400, 550, 700, 850, 1100 (left column), and z/λ = 1250, 1550, 2150, 2300,
2500, 2650, 3050 (right column). The peak intensity reached in this simulation
raises the local refractive index by Δn = 0.01. Initially, the beams expand in the
Y-direction and fill the width of the guide, while they self-focus in the X-direction.
Thus confined, the two beams move toward each other and collide, appearing for a
brief period to have fused together. Following collision, the two solitons re-appear
and move apart, but their mutual attraction brings them back together again.
[Figure 49.8 plot: vertical axis 0.5–1.0, horizontal axis 0–3200.]
Figure 49.8 Total optical power P (as a fraction of the input power P0) plotted
versus z/λ for the in-phase soliton pair whose behavior is depicted in Figure 49.7.
one-dimensional system the dance would have continued forever, but in this
quasi-one-dimensional case, it appears that the solitons get somewhat closer
together after each oscillation period.
Figure 49.8 is a plot of the total optical power versus propagation distance for
the in-phase soliton pair depicted in Figure 49.7. It is seen that, once the solitons
are established, their power content remains constant despite repeated collisions.
It is a well-known property of solitons that, upon collision, they pass through
each other unscathed. The above behavior of the in-phase soliton pair is a clear
confirmation of this property, even in a non-ideal (i.e., quasi-one-dimensional)
situation.
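The pass-through behavior and the constancy of power can be illustrated with a far simpler model than the one used in this chapter. The following is a minimal sketch, assuming the normalized one-dimensional nonlinear Schrödinger equation i ∂u/∂z + ½ ∂²u/∂x² + |u|²u = 0 solved by a symmetric split-step Fourier scheme. The grid, step size, soliton separation, and function names are illustrative choices, not the parameters of the quasi-one-dimensional simulations reported here. Because this idealized model has no cladding, it conserves power by construction and shows none of the initial radiation loss seen in Figure 49.8.

```python
import numpy as np

def propagate_nls(u0, x, dz, steps):
    """Symmetric split-step Fourier propagation of the normalized 1D equation
    i u_z + 0.5 u_xx + |u|^2 u = 0 on a periodic grid x (assumed model)."""
    dx = x[1] - x[0]
    kx = 2.0 * np.pi * np.fft.fftfreq(x.size, d=dx)
    half_diffraction = np.exp(-0.25j * kx**2 * dz)    # diffraction over dz/2
    u = np.asarray(u0, dtype=complex)
    for _ in range(steps):
        u = np.fft.ifft(half_diffraction * np.fft.fft(u))
        u = u * np.exp(1j * np.abs(u)**2 * dz)         # Kerr phase over dz
        u = np.fft.ifft(half_diffraction * np.fft.fft(u))
    return u

def power(u, x):
    """Integrated intensity on the uniform grid x."""
    return float(np.sum(np.abs(u)**2) * (x[1] - x[0]))

x = np.linspace(-40.0, 40.0, 1024, endpoint=False)

# A single sech profile is the fundamental soliton of this equation.
single = 1.0 / np.cosh(x)
single_out = propagate_nls(single, x, dz=0.01, steps=500)
peak_single = float(np.max(np.abs(single_out)))

# Two in-phase solitons launched side by side (separation chosen arbitrarily).
pair = 1.0 / np.cosh(x - 3.0) + 1.0 / np.cosh(x + 3.0)
pair_out = propagate_nls(pair, x, dz=0.01, steps=2000)
power_ratio = power(pair_out, x) / power(pair, x)

print("peak amplitude of single soliton after propagation:", peak_single)
print("output/input power of the in-phase pair:", power_ratio)
```

Plotting |u|² at intermediate distances (not done here) is one way to observe the periodic attraction and separation of the in-phase pair in this idealized setting.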
Bouncing soliton
Consider the rectangular channel waveguide depicted in Figure 49.9. The guiding
channel has length = 40λ, width = 5λ, and refractive index n1 = 1.5056, while the
index of the cladding glass is n0 = 1.50. The core and cladding materials are
assumed to have the same nonlinear (Kerr) coefficient n2. An elliptically shaped
Gaussian beam is launched with a slight sideways tilt into this waveguide. (The tilt
is simulated by imposing on the beam a linearly varying phase along the X-axis.)
As before, the injected beam initially expands to fill the channel in the Y-direction,
while simultaneously contracting along X to form a soliton (see Figure 49.10).
However, the sideways tilt of the injected beam propels the soliton towards the
right-hand side.
[Figure 49.9: channel waveguide, showing the X and Y axes and the input beam.]
In Figure 49.10, from top to bottom, the displayed cross-sectional intensity
patterns correspond to propagation distances z/λ = 0, 100, 200, 300, 500, 650, 800
(left column), and z/λ = 1050, 1100, 1350, 1550, 1800, 2100, 2400 (right column).
When the soliton encounters the channel wall on the right-hand side, it is
squeezed against the wall, then bounces back. Subsequently, it moves towards the
left wall, gets squeezed, and bounces back again. This pattern of behavior is
repeated indefinitely as the beam propagates along the Z-axis. (The peak intensity
reached in this simulation raises the local refractive index by Δn = 0.0022.)
Thus spatial solitons exhibit a particle-like behavior, retaining their identity
even after interactions with each other or with the channel walls. This property
underlies their potential utility as information-carrying bits in all-optical
switching applications.7,8
Figure 49.11 is a plot of the total optical power along the Z-axis for the
bouncing soliton depicted in Figure 49.10. Note in particular that no loss of
power occurs when the soliton encounters the side walls of the channel. This is
what one would expect based on the principle of total internal reflection.
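The appeal to total internal reflection can be made quantitative with a rough estimate. The sketch below assumes the linear indices n1 and n0 quoted above, neglects the small Kerr-induced index change, and treats the soliton as a ray inside the core. The variable names are illustrative. It computes the critical angle at the channel wall and the corresponding maximum grazing angle (measured from the wall) for which reflection is total.

```python
import math

n_core = 1.5056   # channel index n1, from the text
n_clad = 1.50     # cladding index n0, from the text

# Critical angle measured from the normal to the channel wall.
critical_angle_deg = math.degrees(math.asin(n_clad / n_core))

# Largest angle between the propagation direction and the wall itself
# for which the reflection is still total (linear, ray-optics estimate).
max_grazing_angle_deg = 90.0 - critical_angle_deg

print("critical angle (deg):", critical_angle_deg)
print("maximum grazing angle (deg):", max_grazing_angle_deg)
```

The grazing-angle limit comes out at roughly 5°, so under these assumptions a beam that drifts sideways only slowly as it propagates along Z meets the walls well within the regime of total internal reflection.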
Concluding remarks
The simulations reported in this chapter are quite stable and yield similar soli-
tonic behaviors under diverse conditions. For example, in all cases considered,
Figure 49.10 Elliptically shaped Gaussian beam launched sideways into the
channel waveguide of Figure 49.9 forms a bouncing soliton. From top to bottom,
the cross-sectional intensity patterns correspond to propagation distances z/λ = 0,
100, 200, 300, 500, 650, 800 (left column), and z/λ = 1050, 1100, 1350, 1550,
1800, 2100, 2400 (right column). The injected beam (left column, top) has a
Gaussian amplitude profile, similar to that in Figure 49.3, but it is also modulated
by a linear phase along the X-axis, which gives its motion a slight tilt toward the
right-hand side. As before, the soliton forms and propagates along the Z-axis, but
it slowly drifts to the right. Upon encountering a channel wall, the soliton is
squeezed against the wall, then bounces back.
Figure 49.11 Total optical power P (as a fraction of the input power P0)
plotted versus z/λ for the bouncing soliton whose behavior is depicted in
Figure 49.10.
the optical nonlinearity was placed uniformly in the entire waveguide, that is, the
guide and the cladding layers had the same coefficient of nonlinearity (n2). In
general, this is not necessary and one can simulate situations where, for instance,
nonlinearity is present in the guiding layer only, without causing any significant
modification of the results.
50 Laser heating of multilayer stacks
Laser beams can deliver controlled doses of optical energy to specific locations on
an object, thereby creating hot spots that can melt, anneal, ablate, or otherwise
modify the local properties of a given substance. Applications include laser cutting,
micro-machining, selective annealing, surface texturing, biological tissue treat-
ment, laser surgery, and optical recording. There are also situations, as in the case
of laser mirrors, where the temperature rise is an unavoidable consequence of the
system’s operating conditions. In all the above cases the processes of light
absorption and heat diffusion must be fully analyzed in order to optimize the
performance of the system and/or to avoid catastrophic failure.
The physics of laser heating involves the absorption of optical energy and
its conversion to heat by the sample, followed by diffusion and redistribution
of this thermal energy through the volume of the material. When the sample is
inhomogeneous (as when it consists of several layers having different optical
and thermal properties) the absorption and diffusion processes become quite
complex, giving rise to interesting temperature profiles throughout the body
of the sample. This chapter describes some of the phenomena that occur in
thin-film stacks subjected to localized irradiation. We confine our attention to
examples from the field of optical data storage but the selected examples
have many features in common with problems in other areas, and it is hoped
that the reader will find this analysis useful in understanding a variety of similar
situations.
Magneto-optical disk
The cross-section of a quadrilayer magneto-optical (MO) disk, optimized for
operation at λ = 400 nm, is shown in Figure 50.1. (GaN-based semiconductor
diode lasers operating at these blue and violet wavelengths are becoming commercially
available, and optical disk systems are expected to take advantage of
this development by switching to blue or violet lasers within the next two to three
years.) The quadrilayer of Figure 50.1 is deposited on a plastic substrate and
consists of a thin magnetic film sandwiched between two transparent dielectric
layers, capped by a thin layer of an aluminum alloy.1,2 The optical and thermal
constants of the various layers of this stack are listed in Table 50.1.
[Figure 50.1: quadrilayer MO stack beneath the electromagnet, with the radial coordinate r indicated. Layers, from top to bottom: 30 nm aluminum alloy; 30 nm dielectric (SiN); 10 nm magnetic film (TbFeCo); 45 nm dielectric (SiN); polycarbonate substrate.]
The focused laser beam arrives at the magnetic layer from the substrate side.
This quadrilayer is designed to have a reflectivity of 9%, and has a fairly large
polar MO Kerr signal (polarization ellipticity ηK = ±1.55° and Kerr rotation
angle θK = ±0.24°, where the ± signs correspond to the up and down directions
of magnetization of the storage layer). Aside from contributing to the optical
properties of the stack, the aluminum layer acts as a heat sink, and the upper
dielectric layer is thin enough to provide good thermal coupling between the two
metallic layers.1,2
Table 50.1. Optical and thermal constants of the various materials used in the
calculations

Material                             Refractive index   Dielectric tensor      Specific heat   Thermal conductivity
                                     n + ik             ε, ε′                  C               K
                                     (λ = 0.4 μm)       (λ = 0.4 μm)           (J/cm³ °C)      (J/cm s °C)
Polycarbonate (substrate)            1.6                —                      1.4             0.0025
Aluminum alloy                       0.50 + 4.85i       —                      2.4             0.75
Tb21Fe72Co7 (amorphous ferrimagnet)  2.33 + 3.45i       ε = −6.46 + 16.11i     2.9             0.10
                                                        ε′ = 0.185 − 0.233i
SiN (dielectric)                     2.2                —                      2.5             0.030
Ge2Sb2Te5 (amorphous)                2.9 + 2.5i         —                      1.3             0.002
Ge2Sb2Te5 (polycrystal)              2.0 + 3.6i         —                      1.3             0.005
ZnS–SiO2 (dielectric)                2.2                —                      2               0.006
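The refractive-index and dielectric-tensor columns of Table 50.1 are not independent. As a quick consistency check, the sketch below assumes the standard relation ε = (n + ik)² between the complex refractive index and the diagonal element of the dielectric tensor, and evaluates it for the absorbing materials in the table. The dictionary and variable names are illustrative.

```python
# Complex refractive indices n + ik at 0.4 um, copied from Table 50.1.
refractive_index = {
    "Aluminum alloy": complex(0.50, 4.85),
    "Tb21Fe72Co7": complex(2.33, 3.45),
    "Ge2Sb2Te5 (amorphous)": complex(2.9, 2.5),
    "Ge2Sb2Te5 (polycrystal)": complex(2.0, 3.6),
}

# Diagonal dielectric-tensor element, assuming eps = (n + ik)^2.
epsilon = {name: n**2 for name, n in refractive_index.items()}

for name, eps in epsilon.items():
    print(f"{name}: eps = {eps.real:.2f} {eps.imag:+.2f}i")
```

For the magnetic film this gives ε of about −6.47 + 16.08i, which agrees with the tabulated value to within the rounding of n and k. The real part is negative, as expected whenever k exceeds n.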
Figure 50.2 shows the intensity profile of the focused spot at the storage layer
of the disk. The assumed objective lens that brings the laser light to focus in this
case is free from all aberrations, is corrected for the thickness of the substrate,
and has NA = 0.8, f = 1.5 mm. The collimated Gaussian beam entering the lens
has 1/e (amplitude) radius r0 = 1.2 mm, which is the same as the radius of the
objective’s entrance pupil. The distribution of Figure 50.2(a) is displayed on a
logarithmic scale to enhance the diffraction rings caused by truncation of the
beam at the objective’s aperture. The radial profile of the spot, depicted on a
linear scale in Figure 50.2(b), reveals that the rings are quite weak, however,
and thus incapable of producing much heat at the periphery of the central
bright spot.
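The numbers quoted for the objective and the beam are mutually consistent, as a few lines of arithmetic show. The sketch below assumes a lens obeying the sine condition (pupil radius = f × NA), the scalar Airy formula for a uniformly filled pupil (first-dark-ring diameter = 1.22λ/NA), and a Gaussian amplitude profile exp(−r²/r0²). The truncated Gaussian beam and the vector nature of focusing at NA = 0.8 will modify the spot somewhat, so these are estimates only. The variable names are illustrative.

```python
import math

wavelength_um = 0.4
numerical_aperture = 0.8
focal_length_mm = 1.5

# Entrance-pupil radius of a lens satisfying the sine condition.
pupil_radius_mm = focal_length_mm * numerical_aperture

# Diameter of the first dark ring for a uniformly illuminated pupil.
airy_diameter_um = 1.22 * wavelength_um / numerical_aperture

# Fraction of the Gaussian beam's power passing through an aperture whose
# radius equals the 1/e amplitude radius: intensity ~ exp(-2 r^2 / r0^2).
transmitted_fraction = 1.0 - math.exp(-2.0)

print("pupil radius (mm):", pupil_radius_mm)
print("Airy disk diameter (um):", airy_diameter_um)
print("power fraction through aperture:", transmitted_fraction)
```

The pupil radius matches the stated r0 = 1.2 mm, and the estimated Airy disk diameter is close to 0.6 μm. Under the Gaussian assumption, about 86% of the beam's power passes through the aperture.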
Figure 50.3 is a plot of the magnitude of the Poynting vector, S, along the
Z-axis for a plane wave normally incident on the quadrilayer stack of Figure 50.1
through the substrate.2 The horizontal axis depicts the distance from the top of the
stack. Thus S is seen to be constant in the two dielectric layers (30 < z < 60 nm
and 70 < z < 115 nm), indicating no optical absorption in these regions. Most of
the absorption takes place in the magnetic film (60 < z < 70 nm); a very small
fraction of the incident energy goes to the aluminum layer (0 < z < 30 nm). The
optical energy thus deposited in the magnetic film raises the local temperature
immediately, but soon thermal diffusion takes over and carries the heat to other
regions of the stack.
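Since S is the power flowing through a unit cross-section, the power deposited in any region equals the drop in S across it. The sketch below applies this bookkeeping to the values quoted in the caption of Figure 50.3 (unit incident power, S = 0.91 on entering the stack, about 0.03 remaining beyond the magnetic film). It assumes that no light is transmitted through the aluminum layer. The function and variable names are illustrative.

```python
def power_drops(s_at_boundaries):
    """Drop in Poynting-vector magnitude between successive boundaries."""
    return [a - b for a, b in zip(s_at_boundaries[:-1], s_at_boundaries[1:])]

# S for unit incident power: incident, inside the lower dielectric,
# inside the upper dielectric, and beyond the aluminum layer (assumed zero).
s_values = [1.00, 0.91, 0.03, 0.00]

reflected, in_magnetic_film, in_aluminum = power_drops(s_values)

print("reflected fraction:", reflected)
print("absorbed in magnetic film:", in_magnetic_film)
print("absorbed in aluminum:", in_aluminum)
```

The three fractions add up to the incident power, and the drop across the magnetic film reproduces the 88% absorption quoted for that layer.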
Figure 50.2 Distribution of total E-field intensity, |Ex|² + |Ey|² + |Ez|², at the
focal plane of a 0.8NA objective. The incident Gaussian beam (λ = 0.4 μm) is
truncated by the lens aperture at its 1/e (amplitude) radius. For simplicity’s
sake, the beam is assumed to be circularly polarized, so that it would yield a
circularly symmetric spot at the focal plane. (a) Logarithmic plot of intensity,
showing an Airy disk diameter ≈ 0.6 μm and FWHM ≈ 0.3 μm. (b) Radial
intensity profile.
Figure 50.3 The magnitude S of the Poynting vector along the Z-axis, plotted
through the thickness of the quadrilayer of Figure 50.1. The incident beam
(λ = 0.4 μm) is assumed to have unit power. Upon entering the stack S = 0.91,
which indicates that 9% of the incident optical energy is reflected at the substrate
interface. Approximately 3% of the energy goes to the aluminum layer and the
remaining 88% is absorbed by the magnetic film.
Figure 50.4 The functions representing laser power versus time that are
used in the various examples: P1(t) is a 55 ns trapezoidal pulse with 5 ns rise
and fall times; P2(t) is a sequence of five identical pulses, each with a 5 ns
duration, 1 ns rise and fall times, and a center-to-center spacing of 10 ns;
P3(t) is a fairly complex pattern of three-level pulses, used in phase-change
recording.
At later times during the heating cycle (i.e., t = 30 ns and 50 ns) the patterns are
similar to that at t = 10 ns, but the temperatures are higher. Once the laser is turned
off the temperatures drop abruptly. At t = 55 ns, the magnetic film is already
cooling down, and the heat is moving to the substrate. The hottest spot at this point
is somewhere in the substrate, close to the interface with the lower dielectric layer.
At t = 60 ns, the cooling has progressed further, and the heat is rapidly spreading
through the substrate. By t = 100 ns, the temperature everywhere is essentially back
Figure 50.5 Computed temperature profiles along the Z-axis at the beam
center, r = 0, for the stack of Figure 50.1 illuminated by the focused beam of
Figure 50.2 and the pulse P1(t) of Figure 50.4. The profiles at t = 10, 30, and
50 ns represent the heating period; cooling-period profiles are shown for t = 55,
60, and 100 ns. By virtue of its strong absorption of the incident optical energy,
the magnetic film is the hottest region during heating. The high thermal
conductivity of the aluminum layer gives it a fairly uniform temperature through
its thickness. As soon as the laser is turned off, the temperatures drop rapidly,
and the peak temperature shifts to the substrate.
to the ambient temperature. Although only the z-profiles are shown here, it should
be remembered that the heat diffuses radially as well; not only does the heat move
to the substrate, but also it spreads radially throughout the entire stack.3
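A rough feel for how far heat travels during the pulse follows from the thermal constants of Table 50.1. The sketch below assumes the usual order-of-magnitude estimate ℓ ≈ √(Dt) for the diffusion length, with diffusivity D = K/C. Prefactor conventions vary, and this estimate is no substitute for the full multilayer calculation behind Figure 50.5. The function and variable names are illustrative.

```python
import math

# (specific heat C in J/cm^3 degC, thermal conductivity K in J/cm s degC),
# copied from Table 50.1.
thermal_constants = {
    "Polycarbonate": (1.4, 0.0025),
    "Aluminum alloy": (2.4, 0.75),
    "Tb21Fe72Co7": (2.9, 0.10),
    "SiN": (2.5, 0.030),
}

def diffusion_length_nm(specific_heat, conductivity, time_ns):
    """Order-of-magnitude diffusion length sqrt(D t), with D = K / C."""
    diffusivity_cm2_per_s = conductivity / specific_heat
    length_cm = math.sqrt(diffusivity_cm2_per_s * time_ns * 1e-9)
    return length_cm * 1e7   # 1 cm = 1e7 nm

lengths_50ns = {
    name: diffusion_length_nm(c, k, 50.0)
    for name, (c, k) in thermal_constants.items()
}

for name, length in lengths_50ns.items():
    print(f"{name}: ~{length:.0f} nm in 50 ns")
```

On this estimate the diffusion length in aluminum over 50 ns is of the order of a micron, far greater than the 30 nm thickness of the layer, which is consistent with its nearly uniform temperature through the thickness. In the polycarbonate substrate the corresponding length is only about 0.1 μm.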
Next we consider the profiles of temperature versus time in the magnetic layer.
Figure 50.6 shows several profiles at different distances r from the beam’s center,
starting at r = 0 and increasing in steps of Δr = 50 nm to r = 1 μm. At r = 0, the
temperature reaches its highest value at the end of the pulse, then decays quickly
and, in the span of a few nanoseconds after the laser turn-off, goes down by
almost an order of magnitude. At larger radii, the temperature is slow to rise, and
it also peaks somewhat after the laser is turned off. The reason for this behavior is
that the focused spot at these radii is rather weak, and the heat does not arrive
there directly from the laser, but by radial diffusion from the central region,
which is under intense illumination.
Figure 50.6 Computed profiles of temperature versus time in the magnetic layer
under the same conditions as in Figure 50.5. Different curves correspond to different
radial distances from the beam center (in steps of Δr = 50 nm), the largest
temperature occurring in the center at r = 0 and the lowest temperatures belonging
to r = 1 μm. As soon as the laser is turned down at t = 50 ns, temperatures near the
beam center drop sharply, but at larger radii, because of radial heat diffusion from
the center, T continues to rise for a while after the laser is turned off.
information to be recorded on the disk is fed to the EM, which switches the
magnetization of the hot spot between the “up” and “down” stable states. The
switching rate must be rapid enough to provide a high data-transfer rate into
the recording medium. This requires a compact EM capable of flying very close
to the magnetic layer, lest its inductance become too large. If the recorded marks
are to be 0.25 μm long in the direction of the track, the laser must be pulsed at
10 ns intervals, in which case Bz(t) must switch between ±Bmax with rise and fall
times of only a few nanoseconds. Such fast magnetic heads are currently at the
forefront of conventional magnetic recording technology (i.e., hard-disk drives),
but they require further development in order to be suitable for future generations
of MO drives.
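The link between mark length and pulse interval is simple kinematics: the disk travels a distance equal to its velocity times the pulse period between successive pulses. The sketch below uses the disk velocity V = 25 m/s adopted in this chapter's examples. The variable names are illustrative.

```python
disk_velocity_m_per_s = 25.0
pulse_period_ns = 10.0

# Distance travelled by the disk during one pulse period, in micrometers.
mark_length_um = disk_velocity_m_per_s * (pulse_period_ns * 1e-9) * 1e6

# Number of laser pulses (minimum-length marks) per second.
pulses_per_second = 1.0 / (pulse_period_ns * 1e-9)

print("mark length (um):", mark_length_um)
print("pulses per second:", pulses_per_second)
```

At this velocity a 10 ns period corresponds to marks 0.25 μm long, laid down at a rate of 10⁸ per second, which is why the magnetic field must switch within a few nanoseconds.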
Consider the quadrilayer disk of Figure 50.1 moving at V = 25 m/s under
the focused beam of Figure 50.2, modulated with the pulse sequence P2(t) of
Figure 50.4. These five pulses are each 5 ns wide, have 1 ns rise and fall times, and
reach a peak power of 3 mW. The assumed ambient temperature is 25 °C. Figure 50.7
shows several isotherms at the critical temperature Tcrit = 175 °C in the magnetic
film during the period 0 ≤ t ≤ 50 ns. (The maximum temperature of the magnetic
film during the same period is Tmax = 300 °C.) To a good approximation, the
magnetic dipoles of the storage layer align with the field of the EM in those regions
where T ≥ Tcrit, but the EM is unable to reorient these dipoles where T < Tcrit.4,5,6
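If the optical and thermal constants are taken to be independent of temperature, the heat-diffusion problem is linear, and the temperature rise above ambient at every point scales in proportion to the laser power. The sketch below uses this scaling, which is an assumption of the estimate rather than a result stated in the text, to find the peak power at which the hottest point of the magnetic film would just reach Tcrit. The pulse shape, focused spot, and disk velocity are assumed unchanged. The variable names are illustrative.

```python
ambient_temp_c = 25.0
peak_power_mw = 3.0
max_temp_c = 300.0        # Tmax reached with 3 mW pulses
critical_temp_c = 175.0   # Tcrit of the magnetic film

# Temperature rise at the hottest point per milliwatt of peak power,
# assuming a linear (temperature-independent) thermal response.
rise_per_mw = (max_temp_c - ambient_temp_c) / peak_power_mw

# Peak power at which the hottest point would just reach Tcrit.
threshold_power_mw = (critical_temp_c - ambient_temp_c) / rise_per_mw

print("temperature rise per mW (degC):", rise_per_mw)
print("threshold peak power (mW):", threshold_power_mw)
```

On this estimate the threshold lies a little above 1.6 mW, so the 3 mW pulses exceed it by nearly a factor of two.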
Figure 50.7 Computed isotherms in the magnetic layer of Figure 50.1, when
the multilayer is subjected to the focused spot of Figure 50.2 and the pulse
sequence P2(t) of Figure 50.4. The ambient temperature is 25 °C, and the disk
moves at V = 25 m/s along the X-axis. (In the reference frame of the disk, the
focused spot moves from left to right.) The maximum temperature during this
period is Tmax = 300 °C, reached at t = 44 ns. All depicted isotherms are at
T = 175 °C, plotted at Δt = 1 ns intervals whenever T ≥ 175 °C. The solid
(broken) curves represent the heating (cooling) phase of each pulse. Because of
the lateral heat diffusion, each pulse produces slightly larger isotherms than the
preceding one, but by the end of the fifth pulse this process reaches a steady
state. During the 10 ns period of each pulse the disk moves by Δx = 0.25 μm,
which is the minimum mark-length that can be recorded in this example.
[Figure 50.8: quadrilayer phase-change stack. Layers, from top to bottom: 50 nm aluminum alloy; 25 nm dielectric (ZnS–SiO2); 20 nm phase-change film (GeSbTe); 50 nm dielectric (ZnS–SiO2); polycarbonate substrate.]
operation at λ = 400 nm. The optical and thermal constants of this stack are listed
in Table 50.1. The Ge2Sb2Te5 material can be switched between amorphous and
(poly)crystalline states by the laser beam: melting at Tmelt ≈ 625 °C followed by
rapid quenching results in an amorphous mark, whereas annealing for a reasonable
length of time above the glass transition temperature (Tglass ≈ 150 °C) returns
the material to the crystalline state.7,8,9 The stack shown in Figure 50.8 has
reflectivities Rc = 30% and Ra = 8% for the crystalline and amorphous phases of
the PC film. Note also that the thermal constants for the two phases are somewhat
different. In the following analysis we ignore these differences by assuming the PC
layer to be crystalline at all times. Also, to simplify the calculations further, we
ignore the heats of melting and crystallization. These are reasonable approximations,
but the final results may need slight corrections if more accuracy is desired.
The laser pulse sequence applied to this sample is P3(t), shown in Figure 50.4.
Here the laser operates at three different power levels. At the highest level the
pulse is strong enough to melt the PC film. In the low-power regime, occurring
immediately after the melting pulse, the temperatures drop rapidly, causing the
quenching of the molten material into an amorphous state. The intermediate
power level is for annealing the pre-existing amorphous marks, which is required
when overwriting a previously written track. (Such tracks contain both amorphous
and crystalline regions, and it is necessary that all amorphous regions that are not
being melted be annealed into the crystalline state.)
Figure 50.9(a) shows the computed isotherms in the PC layer at T = Tmelt for a
disk speed V = 25 m/s and an ambient temperature of 25 °C. The solid and broken
isotherms, as before, represent the heating and cooling cycles, respectively. The
maximum temperature reached in this sample is Tmax = 1153 °C at t = 59 ns. The
isotherms are plotted at intervals Δt = 0.5 ns whenever T ≥ Tmelt in the PC film.
The first two molten regions are well separated from each other and from other
molten pools; these will eventually quench to form two small amorphous marks.
The cooling in these regions is rapid, and the temperatures return to the vicinity
of Tglass in about 5 ns.
The third and fourth molten pools, however, have some degree of overlap.
(In practice two or more short overlapping marks such as these are used to create
a long mark.) The heat generated by the fourth pulse flows backward and affects
the amorphous region being formed in the wake of the third pulse. In general,
heat diffusion from the tail end of any long mark can anneal the leading edge as
well as the mid-sections of the same mark, causing partial crystallization. This
problem may be better appreciated by examining the T = 325 °C isotherms of the
same system, shown in Figure 50.9(b). Here the annealing pulses (i.e., the
medium-power levels of P3(t)) appear behind the first two marks well after they
have cooled. By then the annealed region is far enough from the previously
Figure 50.9 Computed isotherms in the GeSbTe film of Figure 50.8, subjected
to the focused beam of Figure 50.2 and the pulse sequence P3(t) of Figure 50.4.
The ambient temperature is 25 °C, and the quadrilayer disk moves at V = 25 m/s
along the X-axis. The peak temperature Tmax = 1153 °C in the film is reached at
t = 59 ns. The solid (broken) isotherms correspond to the heating (cooling) phase
of each pulse. (a) Isotherms at T = 625 °C, the melting point of the PC material.
(b) Isotherms at T = 325 °C, the presumed (elevated) annealing temperature
given the short annealing time.
molten pools that there is no danger of recrystallization. In contrast, the two large
isotherms in Figure 50.9(b) corresponding to the third and fourth melting pulses
partially overlap, causing the formation of a small, undesirable crystallite in the
middle of the long amorphous mark. These are some of the issues with which
the designers of optical disk drives must grapple, in order to create robust and
reliable data storage systems.
Abbe’s sine condition 9, 10, 14, 16, 20, 33, 39, 46, annular phase mask 552
528 anti-bunching 127
Abbe’s theory of image formation 23, 38 anti-guiding 494
aberrated wavefront 619 antireflection coated 11, 158, 346, 361, 388–391, 476,
aberration 9, 16, 20, 33, 45, 64, 160, 227, 310, 313, 518, 557, 634–635
351, 355, 362, 379, 381, 452, 482, 503, 525, 578, aperture
617, 658, 680 annular aperture 32, 542
aberration-free 10, 11, 16, 18, 19, 34, 48, 57, 227, aperture stop 503, 579
229, 261, 307, 388, 447, 494, 496, 499, 526, 602, circular aperture 26–28, 34, 41, 443, 448–449, 457,
616, 630 610–611
primary spherical aberration 618, 621, 622 clear aperture 10, 16, 447, 496, 587, 659
spherical aberration 227, 380, 381, 482, 486, 534, spiral aperture 372
541, 578, 604, 617–623, 627, 658 aplanat 18–19, 483
absorbing media 216, 222 aplanatic 16, 18, 21, 33–35, 46–51, 328, 389, 391,
absorption coefficient 128, 155, 204, 206, 605, 611, 528, 539, 541
632 aplanatic sphere 528
air gap 133, 144, 198, 238, 270, 275, 280, 383, 389, aplanatic system 16
395, 536 aplanatism 33, 528
Airy disk 62, 261, 316, 379, 517, 522, 526, 542, 547, apodization 379
681 apparent position of fixed star 310
Airy function 35, 41 aragonite 405
Airy pattern 10, 19, 33–35, 37, 45, 62, 69, 70, 379, arc lamp 62, 546, 554, 588, 626
517, 518, 522, 547 aspheric mirror 543
aluminum mirror 83, 564 aspherics 351
ambient temperature 684, 686, 688, 689 aspheric surface 485, 527, 543
amorphous 168, 680, 687, 688, 689 astigmatic 59, 490–491
amorphous mark 688, 689 astigmatic distance 490, 491
amplitude mask 373, 459, 549 astigmatism 363, 381, 481–483, 490–499, 503,
amplitude spectrum 102, 117, 118 580–581, 617
amplitude transmission function 70, 373 atomic dipole 211
analyzer 555–570, 633–638 atom optics 370
anamorphic magnification 499, 501–503 attenuated total internal reflection (TIR) 133, 385–388
anamorphic magnification factor 503 attenuation coefficient 449–458
anamorphic prisms 499 autocorrelation 78, 113, 115–116, 126, 241
angular discrimination 263 autocorrelation function 78, 113, 116
angular momentum 289, 301–309 average intensity 77, 91–92
angular resolution 37, 516
angular separation 114, 121, 124, 258–265, 508, Babinet’s principle 69
517–520, 567–569 backward propagated 478
angular spectrum 135, 178, 233, 264, 381 backward propagation 53
angular spectrum decomposition 233 baseball pattern 136–137, 315–316, 531, 533,
anisotropic polarizability 655 539–540
annular light source 552 baseline 516–520
691
692 Index
beam decenter 483–487 classical mount 326–328, 332, 336, 338–341,
beam propagation method (BPM) 459, 492, 658, 665, 346–349
667 classical source of light 127
split-step BPM 459–460, 658 coefficient of nonlinearity 676
beam-splitter (BS) 79, 126, 182–187, 194, 201, 224, coherence
231–234, 268, 297, 464, 516, 525, 546, 564, 567, coherence factor 590
619, 624, 644 coherence length 74, 78, 80–81, 93, 113, 624–626
beam tilt 480–488 coherence theory 88, 505
beam waist 55–56, 290, 295, 320, 490 coherence time 64, 545, 547
beat frequency 195 first-order coherence 77, 89, 91–92, 95, 113, 117,
bending of polarization vector 46 124
Bessel beam 32 degree of coherence 88–89, 95, 586, 587
best focus 226–228, 358–362, 525–530, 535–542, mutual coherence 100, 103
580–583 partial coherence 586
bias phase 568–574 partial spatial coherence 556
biaxial birefringent crystal 405–408, 413 temporal coherence 74–79, 88, 93, 113, 555
binary intensity mask (BIM) 586, 588 temporally coherent 64, 566
binary star 121 coherent addition 213–214
birefringence 197, 201–203, 554, 563–565, 632 coherent illumination 62, 64, 66–69, 88, 367, 547–550
birefringent 110, 201–203, 235, 405–408, 413, 557, coherent and incoherent imaging 62, 88, 551, 582
563–565, 567 coherent image 68, 92, 93, 583
birefringent slab 110 coherent imaging 41, 42, 65, 68, 70, 582
birefringent substrate 557 coherent imaging system 42, 65
boundary condition 142, 149, 324–325, 447, 461, 600, coherent monochromatic light 408
606 coherent point source 64, 92
Bracewell, Ronald 61, 515, 524 coherent source 63, 64, 546, 548
Bracewell telescope 515, 519–522 colliding pulse ring laser 240
Bradley, James 310 collimated beam 5, 10, 21, 28, 65, 78, 100, 136, 201,
Bragg’s law 270, 317, 318 225, 307, 367, 382, 405, 476, 494, 506, 541, 551,
bright space 595 627, 633
Brewster’s angle 164, 218, 281–283, 381, 383, 389–390 collimator 160, 161, 351, 383, 495–503, 505–508, 526
collimated coherent illumination 67–69, 547
Calcite 110, 567 coma 5, 9–10, 16, 19, 379–381, 452, 457, 580–581,
catadioptric solid immersion lens (SIL) 541, 543 617–620
caustic 301, 479 third-order coma 5, 617, 619
cavity 177–178, 195, 197, 204, 207, 270, 447–457, primary coma 19, 452, 457, 619, 620
491–492, 602 comatic tail 452
central fringe 96, 98 comb function 375–377
channel waveguide 464–467, 673–675 compact disk 351, 534, 609
chaotic light 113, 116–119 complex amplitude distribution 16, 17, 25–28, 34, 52,
chaotic point source 124, 125 174, 228, 289, 371, 400, 448, 546, 584, 643
characteristic equation 141 complex degree of spatial coherence 96
charge coupled device 556, 627 compound microscope 9, 576
chirp cancellation 241, 248, 249, 252 compression 46, 86, 240–257, 499, 502
chirp-compensation 251 compression ratio 241, 248, 252, 253
chirped mirror 240 concave pit 604, 609, 610
chirped pulse 240, 247, 248 concentric ring pattern 370
chromatic aberration 355, 356, 527, 582 condenser 29, 30, 62–67, 546–557, 583–591, 625
chromatic dispersion 351 condenser stop 586, 587, 590, 591
chromeless 589, 592 conduction electrons 150
circle of least confusion 226–227 cone of light 11–18, 65–67, 136, 160, 200, 316, 372,
circular aperture 26–28, 34, 41, 443, 448, 449, 457, 388, 405, 415, 461, 517, 526, 530–540, 551, 560,
610, 611 579, 611, 637
circularization 499 confocal resonator 447, 451, 454, 457
circularly polarized 5, 12, 35, 107, 154, 167, conical mount 326, 327, 331–335, 340–346
202, 224, 229, 263, 305, 409, 413–416, conical refraction 404–408, 410, 413, 415
637, 681 conjugate plane 11, 14, 17
circular polarization 154–155, 168, 203–204, 208, conjugate wave 644, 645, 658, 651
230, 305, 410 conjugated object wave 649
cladding 139, 243, 255–257, 459–470, 478, 489–495, conoscopic 554, 556, 564
665–668, 671–676 conoscopic polarization microscopy 564
Index 693
conservation of energy 211, 213, 215, 232, 235, 275 differential method of Chandezon 8, 138, 325, 350,
constitutive relation(s) 420, 599 544
construction wavelength 352, 354, 357, 361, 363, 364, differential polarization microscopy 558, 564
365 differential signal 175, 177, 398, 401, 402, 533
contact hole 590, 594, 595 differentiation theorem 60
contrast 62, 88, 96, 201, 315, 371, 389, 549–561, 564, DIFFRACT ix, xi, 2, 138, 350, 544
566, 571, 573, 588, 592, 615, 624, 625 diffracted order 65, 66, 134–136, 249, 270, 272, 315,
contrast enhancement 549–551 318, 324–338, 341–345, 356, 361–365, 387,
convection of light 310, 311, 320 614–616
convex pit 604, 608, 610 diffracted ray 360
convolution 35, 41, 522 diffraction 1, 9, 16, 23, 28, 44–52, 70, 336, 344, 355,
core 243, 459–468, 476, 478, 665, 666, 673, 674 382, 445, 447, 459, 461, 526, 545, 566, 570
co-rotating dielectric 191 classical diffraction 367, 459, 599
coupling efficiency 402, 476–488 classical theory of diffraction 23, 26, 45, 47
cover plate 528, 533–535, 604 diffraction-free beam 36, 44
critical angle 129, 142, 143, 255, 343, 379–383, 393, diffraction effect 301, 382, 571, 573, 655
401, 402, 537 diffraction efficiency 250, 324, 330–349, 354, 356
critical illumination 570 diffraction-limited 10, 35, 64, 172, 307, 460, 527,
critical TIR angle 133, 134, 379, 388, 391, 393 547, 584, 658
cross-correlation function 91, 100, 103, 123, 124 diffraction-limited focus 10, 36, 52, 336, 483, 525,
crossed analyzer 564, 569, 570 527, 539
crystalline 554, 687, 688 diffraction-limited spot 37, 328, 609
current loop 426, 433, 435, 436 diffraction order 21, 137, 249, 250, 316, 325–327,
curvature 10, 21, 31–33, 55, 59, 226, 250, 291–294, 331, 355, 531, 615
302, 358–366, 374, 480–488, 496, 502, 543, diffraction theory 25, 47, 367, 526, 532, 537, 599,
577–581, 617, 624–627, 655–658, 664 615
curvature phase factor 31, 32, 55, 366, 480, 481, 557 diffraction rings 680
cutoff frequency 64–67 scalar diffraction theory 25, 532
cycle-averaged intensity 113, 115, 119, 121, 126 vector diffraction 8, 47, 51, 138, 330, 345, 355,
cylindrical lens pair 496, 498, 499, 503 533, 537, 544
diffractive lens 351
dark soliton 665 diffractive optical element (DOE) 351, 352, 353, 355,
defocus 45, 48, 64, 227, 381, 482–487, 529, 541, 554, 356, 366
559–562, 570–573, 617, 663 diffractive propagation 303, 665
degree of first-order coherence 77, 117, 124 diffuse radiation 522
degree of second-order coherence 113, 114, 116, 119, diffusion 678, 684
127 heat diffusion 663, 678, 682, 685, 686, 688
DELTA 350, 544 lateral heat diffusion 686
delta function 27, 375–377 radial diffusion 684, 685
depolarization 107, 110, 156 thermal diffusion 663, 680
depolarized 105, 107 diode laser 351, 489–502, 584, 678
depth of focus 525–536, 541, 542, 586, 588 dipolar oscillation 211
detection module 177, 397, 398, 525, 638, 639 dipole radiation pattern 420
detector 78, 86, 113, 121–127, 176, 194, 268, 351, Dirac’s delta function 27
397, 402, 517, 522, 525, 526, 531, 532, 537, 556, directional coupler 467, 469–473
627–640, 685 dispersion 83, 240, 243–246, 254, 351, 600, 664, 665
diagonal element 154, 167 dispersive 81, 240, 241, 246–249, 257, 600
diamagnetic 154, 167 dispersive element 81, 248
dielectric constant 128, 131, 139–143 dispersive optical element 248
dielectric mirror 161, 177–179, 197–199, 206, 207, divergence-free 150, 419, 410, 435
274 divergence laws 420
dielectric slab 191, 192, 213, 215, 216, 218, 220, 254, Doppler shift 184–190, 310–320
279 double exposure 651, 652
dielectric stack 82, 235–237, 268, 276, 324 double-slit mask 505, 508–511
dielectric tensor 153, 162, 166, 171, 234, 235, 396 double star 508, 511
differential detection 176, 398, 638, 639 down-chirp 246
differential detection module 398, 638, 639 duty cycle 64, 65, 325, 326, 346
differential detector 402, 639
differential image 555, 558, 561 Earth’s rotation 182
differential interference contrast 566, 569 effective index 243
differential interference contrast microscope 566, 569 effective medium theory 333
694 Index
E-field energy density 50, 51 far field 23, 30–32, 41, 52, 55, 294–298, 368,
eigenfunction 23, 60, 449 477–480, 584, 612, 616
eigenfunction of propagation in free space 60 far field (Fraunhofer) diffraction formula 480, 584
eigenvalue 449 far field pattern 31, 32, 368
electric charge 418, 419, 423, 429, 436, 439 fast axis 498
electric current 437, 489 fast Fourier transform (FFT) 26, 45, 302, 461
electric dipole 209, 216, 420, 421, 425–427, 429, femtosecond range 240
433–437 ferrimagnetic 154, 167
electric field intensity 2, 6, 205, 264, 307, 654, 665, ferromagnetic 154, 167
681 fiber 64, 182, 240, 241, 246, 248, 460–466, 475–478,
electromagnet 679, 685 482, 483, 485–489, 502, 503, 584
electromagnetic energy 292, 301 fiber bundle 64
electromagnetic field 23, 45, 88, 130, 145, 147, 149, fiber-optic gyroscope 182, 196
150, 209, 258, 325, 338, 381, 387, 388, 419, 447, field momenta 301
479, 599 field momentum density 305
electromagnetic radiation 209, 304, 324 field of view 17, 40, 41, 62, 522, 541, 568, 570, 580,
electromagnetic waves 47, 52, 133, 149, 234, 275, 584
387 filament 447, 463, 464, 502, 655, 662, 663
elegant solution of wave equation 60 filamentation 660, 663, 664, 667
ellipse of polarization 5, 7, 102, 105, 106, 203, 398, Finite Difference Time Domain (FDTD) 139, 151,
415, 632 418, 599
ellipsoid of birefringence 554, 564, 565 first-order beam 66, 67, 331, 338, 341, 616, 623
ellipsometry 632, 638 first order field coherence function 78
elliptical aperture 418–442, 503 Fizeau 505, 511, 513
ellipticity 5–7, 102–110, 155–170, 173, 179, 180, flow of heat 396
202–206, 389–399, 413–416, 498, 499, 556, 557, focal-shift phenomenon 46
559, 562, 679 f-number 379, 615, 627
emergent wavefront 10, 16, 19, 45, 364, 478, 479, focused cone 135, 136, 200, 406, 461, 517, 527, 560,
480, 657 564, 604, 611, 615, 637
energy flow pattern 139, 150 focused laser beam 301, 396, 397, 473, 609, 679
ensemble 75, 116 focused spot 10, 19, 20, 35, 37, 45, 136–138, 144,
ensemble average 75 183, 184, 261, 263, 265, 272, 297, 308, 309,
ergodic 88 315–317, 328, 336, 379, 380–382, 388, 396, 397,
ergodicity 116 415, 460, 476–483, 499–502, 525, 526, 528,
evanescent 25–27, 132, 143, 145, 147, 148, 255, 257, 533–535, 540, 561, 580, 602–605, 608–610, 623,
325, 381, 387, 388, 461, 467, 494, 495, 611–613 628, 630, 680, 684–686
evanescent beam 27 forward propagation 53
evanescent coupling 1, 387–393, 395, 398, 401, 402 four-corners problem 556–558, 560, 561, 563
evanescent wave 26, 132–134, 381, 382, 386, 387, Fourier coefficients 354
461 Fourier component 42, 242, 373
even mode 144–150, 467, 469 Fourier domain 27, 28, 41, 52, 89, 375–377, 546
Ewald–Oseen theorem 209, 214, 218, 220, 222 Fourier plane 27, 546, 548–551
extended incoherent source 92 Fourier optics 1, 23, 44, 45, 53, 73, 653
extended source 88, 91, 92 Fourier series 78, 115, 354, 375
extended waveform 75, 77, 81, 91 Fourier spectrum 144, 146, 241, 325
extended white light source 554 Fourier transform 23, 25–28, 31–38, 45, 48, 53, 60,
external conical refraction 404–407, 413 76, 90, 96, 116, 119–121, 124, 147, 242,
extinction rate 145 245–247, 258, 302, 375, 376, 380, 461, 549
extinction ratio 635, 637, 638, 640 Fourier transform lens 36
extinction theorem 1, 209, 213, 214, 216, 220–222 Fourier transform plane 38, 549
Fraunhofer (far field) distribution 31
Fabry–Pérot etalon 1, 197–204, 207, 248, 251, 263, free-space impedance 140, 246, 432
265, 270, 271 frequency domain 100, 121, 247
Fabry–Pérot interferometer 197, 205 frequency spectrum 75–77, 89, 90, 100, 117, 118,
Fabry–Pérot resonator 159–161 240, 241, 320
Faraday angle 154, 157 frequency sweep 240
Faraday effect 152–159, 162–164, 166 Fresnel’s coefficient(s) 141, 209, 210, 221, 222, 238,
longitudinal Faraday effect 162, 163 256, 559
Faraday medium 155, 156, 159, 160, 204–206, 208 Fresnel drag 182
Faraday rotation angle 155, 156, 208 Fresnel’s formula for the drag of light 321
Faraday rotator 203–205, 208, 224, 225, 230, 235 Fresnel–Kirchhoff diffraction integral 447
Fresnel number 26 grating period 21, 136, 250, 251, 318, 325–328,
Fresnel’s reflection coefficient 131, 141, 143, 167, 331, 334–346, 615, 616
209, 221, 256, 379, 389, 391, 400, 557, 559 metal grating 135, 136, 331
Fresnel’s reflection formula 128 metallic grating 324, 331, 446
Fresnel rhomb 103, 105 metallized grating 325, 326
Fresnel transmission coefficient 238, 537 ruled grating 323, 324, 337
fringes 68, 74, 88, 92–95, 183, 187, 190, 191, 195, transmission grating 330, 341, 344, 346
201, 298, 422, 427, 429, 436, 438, 440, 495, 505, two-dimensional grating 623
506, 508, 511, 513, 522, 614, 616, 617, 619, grating compressor 240, 241
624–627, 643–645, 649, 651 grazing incidence 283, 284, 288
fringe contrast 88, 96, 98, 625 groove 135–138, 272, 315–318, 324–333, 336–341,
fringe pattern 88, 93–96, 188, 261, 290, 291, 345, 346, 353, 531–533, 537–540, 629, 630
505–510, 519, 523, 618, 619, 625, 651 groove depth 315, 316, 325, 333, 346, 531
fringe periodicity 94 groove edge 136, 137, 531, 533
fringe shift 183, 187 group velocity 193, 243–245
fringe visibility 507–511 group velocity dispersion (GVD) 244, 245, 246, 248
frustrated total internal reflection (FTIR) 383, 388, guided mode 243, 255, 346, 443, 444, 461, 464, 466,
537 467, 472, 477, 491, 492, 494, 668
fused silica 154, 243, 244, 248, 584, 654 guiding layer 255–257, 489–495, 665–668, 676