Active Calculus Multivariable (2018 Ed.)


ACTIVE CALCULUS

MULTIVARIABLE
2018 Edition

Steven Schlicker
David Austin Matthew Boelkins
Active Calculus - Multivariable
Steve Schlicker
Grand Valley State University

Contributing Authors
David Austin
Grand Valley State University
Matt Boelkins
Grand Valley State University

July 25, 2018


Cover Photo: Lars Jensen

Edition: 2018
Website: https://activecalculus.org/
© 2013–2018 Steven Schlicker
Permission is granted to copy, distribute and/or modify this document under
the terms of the Creative Commons Attribution-NonCommercial-ShareAlike
4.0 International License. The work may be used for free by any party so long
as attribution is given to the author(s), the work and its derivatives are used
in the spirit of “share and share alike”; no party may sell this work or any of
its derivatives for profit. All trademarks™ are the registered® marks of their
respective owners. The graphic

that may appear in other locations in the text shows that the work is li-
censed with the Creative Commons, that the work may be used for free by
any party so long as attribution is given to the author(s), that the work and
its derivatives are used in the spirit of “share and share alike,” and that no
party may sell this work or any of its derivatives for profit, with the fol-
lowing exception: it is entirely acceptable for university bookstores to sell
bound photocopied copies of the activities workbook to students at their stan-
dard markup above the copying expense. Full details may be found by visiting
creativecommons.org/licenses/by-nc-sa/4.0/ or sending a letter to Creative
Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041,
USA.
Features of the Text

Similar to the presentation of the single-variable Active Calculus, instructors
and students alike will find several consistent features in the presentation,
including:

Motivating Questions. At the start of each section, we list motivating questions
that explain why the following material is of interest
to us. One goal of each section is to answer each of the motivating questions.

Preview Activities. Each section of the text begins with a short introduc-
tion, followed by a preview activity. This brief reading and the preview
activity are designed to foreshadow the upcoming ideas in the remainder
of the section; both the reading and preview activity are intended to be
accessible to students in advance of class, and indeed to be completed by
students before a day on which a particular section is to be considered.

Activities. Every section in the text contains several activities. These are
designed to engage students in an inquiry-based style that encourages
them to construct solutions to key examples on their own, working either
individually or in small groups.

Exercises. There are dozens of calculus texts with (collectively) tens of thou-
sands of exercises. Rather than repeat a large list of standard and rou-
tine exercises in this text, we recommend the use of WeBWorK with its
access to the National Problem Library and its many multivariable cal-
culus problems. In this text, each section begins with several anonymous
WeBWorK exercises, and follows with several challenging problems. The
WeBWorK exercises are best completed in the .html version of the text
available at https://activecalculus.org/. Almost every non-WeBWorK
problem has multiple parts, requires the student to connect several key
ideas, and expects that the student will do at least a modest amount
of writing to answer the questions and explain their findings. For
instructors interested in a more conventional source of exercises, consider
the freely available APEX Calculus text by Greg Hartman et al. at
www.apexcalculus.com.

Graphics. As much as possible, we strive to demonstrate key fundamental
ideas visually, and to encourage students to do the same. Throughout the
text, we use full-color graphics to exemplify and magnify key ideas, and
to use this graphical perspective alongside both numerical and algebraic
representations of calculus. To keep cost low, the graphics in the print-
on-demand version are in black and white. When the text itself refers to
color in images, one needs to view the .html or .pdf electronically. The
figures and the software to generate them have been created by David
Austin.


Summary of Key Ideas. Each section concludes with a summary of the key
ideas encountered in the preceding section; this summary normally re-
flects responses to the motivating questions that began the section.
Links to technological tools. Many of the ideas of multivariable calculus
are best understood dynamically, and we encourage readers to make
frequent use of technology to analyze graphs and data. Since tech-
nology changes so often, we refrain from indicating specific programs
to use in the text. However, aside from computer algebra systems like
Maple, Mathematica, or Sage, there are many free graphing tools available
for drawing three-dimensional surfaces or curves. These programs
can be used by instructors and students to assist in the investigations and
demonstrations. The use of these freely available applets is in accord with
our philosophy that no one should be required to purchase materials to
learn calculus. We are indebted to everyone who allows their expertise
to be openly shared. Below is a list of a few of the technological tools
that are available (links active at the writing of this edition). Of course,
you can find your own by searching the web.

• Wolfram Alpha, useful for graphing surfaces in 2D and 3D, and for general
calculations, at http://www.wolframalpha.com/
• Wolfram Alpha widgets, searchable site for simple to use programs
using Wolfram Alpha, at http://www.wolframalpha.com/widgets/gallery/?category=math

• GeoGebra, all purpose graphing tool with some computer algebra capa-
bilities, at https://www.geogebra.org/. Clicking on the magnifying glass
icon allows you to search a large database of GeoGebra applets.

• CalcPlot3D, good all-purpose 3D graphing tool, at
https://www.monroecc.edu/faculty/paulseeburger/calcnsf/CalcPlot3D/

• A collection of Flash Mathlets for graphing surfaces, parametric
curves in 3D, spherical coordinates and other 3D tools, at
http://www.math.uri.edu/~bkaskosz/. Requires Flash Player.
Acknowledgments

This text is an extension of the single variable Active Calculus by Matt Boelkins.
The initial drafts of this multivariable edition were written by me; editing and
revisions were made by David Austin and Matt Boelkins. David Austin is
responsible for the beautiful full-color graphics in the text. Many of our col-
leagues at GVSU have shared their ideas and resources, which undoubtedly
had a significant influence on the product. We thank them for all of their
support. Most importantly, I want to thank the students who have used
this text and offered helpful advice and suggestions.
In advance, we also thank our colleagues throughout the mathematical
community who have read or will read, edit, and use this book, and hence contribute
to its improvement through ongoing discussion. The following people have
used early drafts of this text and have generously offered suggestions that have
improved the text.
Feryâl Alayont Grand Valley State University
David Austin Grand Valley State University
Jon Barker and students St. Ignatius High School, Cleveland, OH
Matt Boelkins Grand Valley State University
Brian Drake Grand Valley State University
Brian Gleason Nevada State College
Mitch Keller Morningside College
The current .html version of the text is possible only because of the amazing
work of Rob Beezer and his development of PreTeXt. My ability to take
advantage of Rob’s work is largely due to the support of the American Institute
of Mathematics, which funded me for a weeklong workshop in Mathbook XML
in San Jose, CA, in April 2016. David Farmer also deserves credit for the
original conversion of the text from LaTeX to PreTeXt.
I take full responsibility for all errors or inconsistencies in the text, and
welcome reader and user feedback to correct them, along with other suggestions
to improve the text.
David Austin, Matt Boelkins, Steven Schlicker, Allendale, MI, August,
2018.

Active Calculus - Multivariable: our goals

Several fundamental ideas in calculus are more than 2000 years old. As a for-
mal subdiscipline of mathematics, calculus was first introduced and developed
in the late 1600s, with key independent contributions from Sir Isaac Newton
and Gottfried Wilhelm Leibniz. Mathematicians agree that the subject has
been understood rigorously since the work of Augustin Louis Cauchy and Karl
Weierstrass in the mid 1800s when the field of modern analysis was devel-
oped, in part to make sense of the infinitely small quantities on which calculus
rests. As a body of knowledge, calculus has been completely understood for at
least 150 years. The discipline is one of our great human intellectual achieve-
ments: among many spectacular ideas, calculus models how objects fall under
the forces of gravity and wind resistance, explains how to compute areas and
volumes of interesting shapes, enables us to work rigorously with infinitely
small and infinitely large quantities, and connects the varying rates at which
quantities change to the total change in the quantities themselves.
While each author of a calculus textbook certainly offers their own creative
perspective on the subject, it is hardly the case that many of the ideas an
author presents are new. Indeed, the mathematics community broadly agrees
on what the main ideas of calculus are, as well as their justification and their
importance; the core parts of nearly all calculus textbooks are very similar.
As such, it is our opinion that in the 21st century—an age where the internet
permits seamless and immediate transmission of information—no one should
be required to purchase a calculus text to read, to use for a class, or to find a
coherent collection of problems to solve. Calculus belongs to humankind, not
any individual author or publishing company. Thus, a main purpose of this
work is to present a new multivariable calculus text that is free. In addition,
instructors who are looking for a calculus text should have the opportunity to
download the source files and make modifications that they see fit; thus this
text is open-source.
In Active Calculus - Multivariable, we endeavor to actively engage students
in learning the subject through an activity-driven approach in which the vast
majority of the examples are completed by students. Where many texts present
a general theory of calculus followed by substantial collections of worked ex-
amples, we instead pose problems or situations, consider possibilities, and then
ask students to investigate and explore. Following key activities or examples,
the presentation normally includes some overall perspective and a brief syn-
opsis of general trends or properties, followed by formal statements of rules or
theorems. While we often offer plausibility arguments for such results, rarely
do we include formal proofs. It is not the intent of this text for the instructor
or author to demonstrate to students that the ideas of calculus are coherent
and true, but rather for students to encounter these ideas in a supportive,
leading manner that enables them to begin to understand for themselves why
calculus is both coherent and true.
This approach is consistent with the following goals:
• To have students engage in an active, inquiry-driven approach, where
learners strive to construct solutions and approaches to ideas on their
own, with appropriate support through questions posed, hints, and guid-
ance from the instructor and text.
• To build in students intuition for why the main ideas in multivariable
calculus are natural and true. We strive to accomplish this by using spe-
cific cases to highlight the ideas for the general situation using contexts
that are common and familiar.
• To challenge students to acquire deep, personal understanding of multi-
variable calculus through reading the text and completing preview activ-
ities on their own, through working on activities in small groups in class,
and through doing substantial exercises outside of class time.

• To strengthen students’ written and oral communication skills by having
them write about and explain aloud the key ideas of multivariable
calculus.
How to Use this Text

Because the text is free, any professor or student may use the electronic version
of the text for no charge. For reading on laptops or mobile devices, the best
electronic version to use is at https://activecalculus.org/multi/, but you can
find links to a pdf and hard copy of the text at https://activecalculus.org/.
Furthermore, because the text is open-source, any instructor may acquire the
full set of source files, which are available on GitHub.
This text may be used as a stand-alone textbook for a standard multivari-
able calculus course or as a supplement to a more traditional text. Chapter 9
introduces functions of several independent variables along with tools that will
be used to study these functions, namely vectors and vector-valued functions.
Chapter 10 studies differentiation of functions of several independent variables
in detail, addressing the typical topics including limits, partial derivatives, and
optimization, while Chapter 11 provides the standard topics of integration of
multivariable functions.

Electronic Edition Because students and instructors alike have access to the
book in electronic format, there are several advantages to the text over a
traditional print text. One is that the text may be projected on a screen
in the classroom (or even better, on a whiteboard) and the instructor may
reference ideas in the text directly, add comments or notation or features
to graphs, and indeed write right on the projected text itself. Students
can do the same when working at the board. In addition, students can
choose to print only whatever portions of the text are needed for them.
Also, the electronic versions of the text include live .html links to online
programs, so student and instructor alike may follow those links to
additional resources that lie outside the text itself. Finally, students can
have access to a copy of the text anywhere they have a computer. The
.html version is far superior to the .pdf version; this is especially true for
viewing on a smartphone.
Note. In the .pdf version, there is not an obvious visual indicator of the
live .html links, so some available information is suppressed. If you are
using the text electronically in a setting with internet access, please know
that it is assumed you are using the .html version.

Activities Workbook Each section of the text has a preview activity and
at least three in-class activities embedded in the discussion. As it is the
expectation that students will complete all of these activities, it is ideal
for them to have room to work on them adjacent to the problem state-
ments themselves. A separate workbook of activities that includes only
the individual activity prompts, along with space provided for students
to write their responses, is in development.

Community of Users Because this text is free and open-source, we hope
that as people use the text, they will contribute corrections, suggestions,
and new material. At this time, the best way to communicate such
feedback is by email to Steve Schlicker at [email protected].
Contents

Features of the Text

Acknowledgments

Active Calculus - Multivariable: our goals

How to Use this Text

9 Multivariable and Vector Functions

9.1 Functions of Several Variables and Three Dimensional Space
9.2 Vectors
9.3 The Dot Product
9.4 The Cross Product
9.5 Lines and Planes in Space
9.6 Vector-Valued Functions
9.7 Derivatives and Integrals of Vector-Valued Functions
9.8 Arc Length and Curvature

10 Derivatives of Multivariable Functions

10.1 Limits
10.2 First-Order Partial Derivatives
10.3 Second-Order Partial Derivatives
10.4 Linearization: Tangent Planes and Differentials
10.5 The Chain Rule
10.6 Directional Derivatives and the Gradient
10.7 Optimization
10.8 Constrained Optimization: Lagrange Multipliers

11 Multiple Integrals

11.1 Double Riemann Sums and Double Integrals over Rectangles
11.2 Iterated Integrals
11.3 Double Integrals over General Regions
11.4 Applications of Double Integrals
11.5 Double Integrals in Polar Coordinates
11.6 Surfaces Defined Parametrically and Surface Area
11.7 Triple Integrals
11.8 Triple Integrals in Cylindrical and Spherical Coordinates
11.9 Change of Variables

Index

Chapter 9

Multivariable and Vector Functions

9.1 Functions of Several Variables and Three Dimensional Space

Motivating Questions

• What is a function of several variables? What do we mean by the domain
of a function of several variables?

• How do we find the distance between two points in $\mathbb{R}^3$? What is the
equation of a sphere in $\mathbb{R}^3$?

• What is a trace of a function of two variables? What does a trace tell us
about a function?

• What is a level curve of a function of two variables? What does a level
curve tell us about a function?

Throughout our mathematical careers we have studied functions of a single
variable. We define a function of one variable as a rule that assigns exactly one
output to each input. We analyze these functions by looking at their graphs,
calculating limits, differentiating, integrating, and more. Functions of several
variables will be the main focus of Chapters 10 and 11, where we will analyze
these functions by looking at their graphs, calculating limits, differentiating,
integrating, and more. We will see that many of the ideas from single variable
calculus translate well to functions of several variables, but we will have
to make some adjustments as well. In this chapter we introduce functions of
several variables and then discuss some of the tools (vectors and vector-valued
functions) that will help us understand and analyze functions of several
variables.

Preview Activity 9.1.1. Suppose you invest money in an account that pays
5% interest compounded continuously. If you invest P dollars in the account,
the amount A of money in the account after t years is given by

$$A = Pe^{0.05t}.$$


The variables P and t are independent of each other, so using functional
notation we write

$$A(P, t) = Pe^{0.05t}.$$

a. Find the amount of money in the account after 7 years if you originally
invest 1000 dollars.

b. Evaluate A(5000, 8). Explain in words what this calculation represents.

c. Now consider only the situation where the amount invested is fixed at
1000 dollars. Calculate the amount of money in the account after t years
as indicated in Table 9.1.1. Round payments to the nearest penny.

Duration (in years)    2      3      4      5      6
Amount (dollars)       ____   ____   ____   ____   ____

Table 9.1.1: Amount of money in an account with an initial investment of
1000 dollars.

d. Now consider the situation where we want to know the amount of money
in the account after 10 years given various initial investments. Calculate
the amount of money in the account as indicated in Table 9.1.2. Round
payments to the nearest penny.

Interest rate          0.03   0.05   0.07   0.09   0.11
Amount (dollars)       ____   ____   ____   ____   ____

Table 9.1.2: Amount of money in an account after 10 years.

e. Describe as best you can the combinations of initial investments and time
that result in an account containing $10,000.

9.1.1 Functions of Several Variables


Up to this point we have been concerned with functions of a single variable.
What defines such a function is that every input in the domain produces a
unique output in the range. We saw similar behavior in Preview Activity 9.1.1,
where each pair (P, t) of inputs produces a unique output A(P, t). Additionally,
the two variables P and t had no real relation to each other. That is, we could
choose any value of P without considering what value t might have, and we
could select any value of t to use without regard to what value P might have.
For that reason we say that the variables t and P are independent of each other.
Thus, we call A = A(P, t) a function of the two independent variables P and
t. This is the key idea in defining a function of two independent variables.
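
As a minimal sketch of this idea, the function A from Preview Activity 9.1.1 can be written as a Python function of two independent arguments. The name `amount`, the optional `rate` argument, and the sample inputs are illustrative assumptions rather than part of the text; the point is only that each input pair (P, t) produces exactly one output.

```python
import math

def amount(P, t, rate=0.05):
    """Balance after t years on a principal P, compounded continuously.

    Implements A(P, t) = P * e^(0.05 t). The `rate` keyword is an added
    convenience (an assumption) and defaults to the 5% used in the text.
    """
    return P * math.exp(rate * t)

# P and t are chosen independently of each other (sample values only).
for P, t in [(2000, 1), (2000, 20), (350, 20)]:
    print(f"A({P}, {t}) = {amount(P, t):.2f}")
```

Changing one argument while leaving the other alone is exactly the sense in which P and t are independent.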

Definition 9.1.3. A function f of two independent variables is a rule
that assigns to each ordered pair (x, y) in some set D exactly one real number
f(x, y).

There is, of course, no reason to restrict ourselves to functions of only two
variables; we can use any number of variables we like. For example,

$$f(x, y, z) = x^2 - 2xz + \cos(y)$$



defines f as a function of the three variables x, y, and z. In general, a function
of n independent variables is a rule that assigns to an ordered n-tuple
$(x_1, x_2, \ldots, x_n)$ in some set D exactly one real number.
As with functions of a single variable, it is important to understand the set
of inputs for which the function is defined.

Definition 9.1.4. The domain of a function f is the set of all inputs at which
the function is defined.
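
One way to make this definition concrete is to test whether a given input lies in the domain. The sketch below uses a hypothetical function h(x, y) = sqrt(1 - x^2 - y^2), which is not one of the functions in this text; it is defined exactly when x^2 + y^2 <= 1, so its domain is the closed unit disk in the xy-plane.

```python
import math

def in_domain_h(x, y):
    """True when h(x, y) = sqrt(1 - x^2 - y^2) is defined at (x, y)."""
    return 1 - x**2 - y**2 >= 0

def h(x, y):
    """Evaluate the hypothetical example h, rejecting inputs outside its domain."""
    if not in_domain_h(x, y):
        raise ValueError(f"({x}, {y}) is not in the domain of h")
    return math.sqrt(1 - x**2 - y**2)

print(in_domain_h(0.5, 0.5))  # a point inside the unit disk
print(in_domain_h(1.0, 1.0))  # a point outside the unit disk
```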

Activity 9.1.2. Identify the domain of each of the following functions. Draw
a picture of each domain in the xy-plane.

a. $f(x, y) = x^2 + y^2$

b. $f(x, y) = \sqrt{x^2 + y^2}$

c. $Q(x, y) = \frac{x + y}{x^2 - y^2}$

d. $s(x, y) = \frac{1}{\sqrt{1 - xy^2}}$

9.1.2 Representing Functions of Two Variables

One of the techniques we use to study functions of one variable is to create
a table of values. We can do the same for functions of two variables, except
that our tables will have to allow us to keep track of both input variables. We
can do this with a 2-dimensional table, where we list the x-values down the
first column and the y-values across the first row. As an example, suppose we
launch a projectile, using a golf club, a cannon, or some other device, from
ground level. Under ideal conditions (ignoring wind resistance, spin, or any
other forces except the force of gravity) the horizontal distance the object will
travel depends on the initial velocity x the object is given, and the angle y at
which it is launched. If we let f represent the horizontal distance the object
travels, then f is a function of the two variables x and y, and we represent f
in functional notation by

$$f(x, y) = \frac{x^2 \sin(2y)}{g},$$

where g is the acceleration due to gravity. (Note that g is constant, 32 feet per
second squared. We will derive this equation in a later section.) To create a
table of values for f , we list the x-values down the first column and the y-values
across the first row. The value f (x, y) is then displayed in the location where
the x row intersects the y column, as shown in Table 9.1.5 (where we measure
x in feet per second and y in radians).

x\y      0.2      0.4      0.6      0.8      1.0      1.2      1.4
 25      7.6     14.0     18.2     19.5     17.8     13.2      6.5
 50     30.4     56.0     72.8     78.1     71.0     ____     26.2
 75     68.4     ____    163.8    175.7    159.8    118.7     58.9
100    121.7    224.2    291.3    312.4    284.2    211.1    104.7
125    190.1    350.3    455.1     ____    444.0    329.8    163.6
150    273.8    504.4    655.3    702.8    639.3    474.9    235.5
175    372.7    686.5    892.0    956.6    870.2    646.4     ____
200    486.8    896.7   1165.0   1249.5   1136.6    844.3    418.7
225    616.2   1134.9   1474.5   1581.4   1438.5   1068.6    530.0
250    760.6   1401.1     ____   1952.3   1776.0   1319.3    654.3

Table 9.1.5: Values of $f(x, y) = \frac{x^2 \sin(2y)}{g}$.
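
Entries of a table like Table 9.1.5 can be generated directly from the formula. The sketch below assumes g = 32 feet per second squared, as stated in the text, and uses the hypothetical helper names `distance` and `table_row`. It reproduces only the row for x = 100, which is complete in the table; the rows with gaps are left for Activity 9.1.3.

```python
import math

G = 32.0  # acceleration due to gravity in ft/s^2, as stated in the text

def distance(x, y, g=G):
    """Horizontal distance f(x, y) = x^2 sin(2y) / g.

    x is the initial velocity in feet per second, y the launch angle in radians.
    """
    return x**2 * math.sin(2 * y) / g

def table_row(x, angles):
    """Values of f(x, y) for one velocity x, rounded to the nearest tenth."""
    return [round(distance(x, y), 1) for y in angles]

angles = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4]
print(table_row(100, angles))
# matches the x = 100 row: [121.7, 224.2, 291.3, 312.4, 284.2, 211.1, 104.7]
```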

Activity 9.1.3. Complete Table 9.1.5 by filling in the missing values of the
function f . Round entries to the nearest tenth.
If f is a function of a single variable x, then we define the graph of f to be
the set of points of the form (x, f (x)), where x is in the domain of f . We then
plot these points using the coordinate axes in order to visualize the graph. We
can do a similar thing with functions of several variables. Table 9.1.5 identifies
points of the form (x, y, f (x, y)), and we define the graph of f to be the set of
these points.
Definition 9.1.6. The graph of a function f = f (x, y) is the set of points of
the form (x, y, f (x, y)), where the point (x, y) is in the domain of f .
We also often refer to the graph of a function f of two variables as the sur-
face generated by f . Points in the form (x, y, f (x, y)) are in three dimensions,
so plotting these points takes a bit more work than graphs of functions in two
dimensions. To plot these three-dimensional points, we need to set up a coor-
dinate system with three mutually perpendicular axes — the x-axis, the y-axis,
and the z-axis (called the coordinate axes). There are essentially two different
ways we could set up a 3D coordinate system, as shown in Figure 9.1.7; thus,
before we can proceed, we need to establish a convention.

Figure 9.1.7: Left: A left hand system. Right: A right hand system

The distinction between these two figures is subtle, but important. In the
coordinate system shown at left in Figure 9.1.7, imagine that you are sitting
on the positive z-axis next to the label “z.” Looking down at the x- and
y-axes, you see that the y-axis is obtained by rotating the x-axis by 90° in
the clockwise direction. Again sitting on the positive z-axis in the
coordinate system at right in Figure 9.1.7, you see that the y-axis is obtained
by rotating the x-axis by 90° in the counterclockwise direction.
We call the coordinate system at right in Figure 9.1.7 a right-hand system;
if we point the index finger of our right hand along the positive x-axis and our
middle finger along the positive y-axis, then our thumb points in the direction
of the positive z-axis. Following mathematical conventions, we choose to use
a right-hand system throughout this book.
Now that we have established a convention for a right-hand system, we can
draw a graph of the distance function defined by $f(x, y) = \frac{x^2 \sin(2y)}{g}$. Note that
the function f is continuous in both variables, so when we plot these points in
the right hand coordinate system, we can connect them all to form a surface
in 3-space. The graph of the distance function f is shown in Figure 9.1.8.

Figure 9.1.8: The distance surface.

There are many graphing tools available for drawing three-dimensional surfaces
as indicated in the Preface (see Links to technological tools in Features
of the Text). Since we will be able to visualize graphs of functions of two
independent variables, but not functions of more than two variables, we will
primarily deal with functions of two variables in this text. It is important
to note, however, that the techniques we develop apply to functions of any
number of variables.
Notation: We let $\mathbb{R}^2$ denote the set of all ordered pairs of real numbers in
the plane (two copies of the real number system) and let $\mathbb{R}^3$ represent the set
of all ordered triples of real numbers (which constitutes three-space).

9.1.3 Some Standard Equations in Three-Space


In addition to graphing functions, we will also want to understand graphs of
some simple equations in three dimensions. For example, in $\mathbb{R}^2$, the graphs of
the equations x = a and y = b, where a and b are constants, are lines parallel to
the coordinate axes. In the next activity we consider their three-dimensional
analogs.
Activity 9.1.4.
a. Consider the set of points (x, y, z) that satisfy the equation x = 2. De-
scribe this set as best you can.

b. Consider the set of points (x, y, z) that satisfy the equation y = −1.
Describe this set as best you can.
c. Consider the set of points (x, y, z) that satisfy the equation z = 0. De-
scribe this set as best you can.
Activity 9.1.4 shows that equations in which one variable is held
constant describe planes parallel to the planes determined by pairs of the
coordinate axes. When we make the constant 0, we get the coordinate planes.
The xy-plane satisfies z = 0, the xz-plane satisfies y = 0, and the yz-plane
satisfies x = 0 (see Figure 9.1.9).

Figure 9.1.9: The coordinate planes.

On a related note, we define a circle in $\mathbb{R}^2$ as the set of all points equidistant
from a fixed point. In $\mathbb{R}^3$, we call the set of all points equidistant from a fixed
point a sphere. To find the equation of a sphere, we need to understand how to
calculate the distance between two points in three-space, and we explore this
idea in the next activity.

Let $P = (x_0, y_0, z_0)$ and $Q = (x_1, y_1, z_1)$ be two points in $\mathbb{R}^3$.
These two points form opposite vertices of a rectangular box whose sides
are planes parallel to the coordinate planes as illustrated in Figure 9.1.10,
and the distance between P and Q is the length of the blue diagonal shown
in Figure 9.1.10.

Figure 9.1.10: The distance formula in $\mathbb{R}^3$.

Activity 9.1.5.
a. Consider the right triangle PRS in the base of the box whose hypotenuse
is shown as the red line in Figure 9.1.10. What are the coordinates of
the vertices of this triangle? Since this right triangle lies in a plane, we
can use the Pythagorean Theorem to find a formula for the length of the
hypotenuse of this triangle. Find such a formula, which will be in terms
of $x_0$, $y_0$, $x_1$, and $y_1$.
b. Now notice that the triangle PRQ, whose hypotenuse is the blue segment
connecting the points P and Q and one of whose legs is the hypotenuse PR of
the triangle from part (a), lies entirely in a plane, so we can again use the
Pythagorean Theorem to find the length of its hypotenuse. Explain why
the length of this hypotenuse, which is the distance between the points
P and Q, is

$$\sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2 + (z_1 - z_0)^2}.$$

The formula developed in Activity 9.1.5 is important to remember.


The distance between points.
The distance between points $P = (x_0, y_0, z_0)$ and $Q = (x_1, y_1, z_1)$ (denoted
as $|PQ|$) in $\mathbb{R}^3$ is given by the formula

$$|PQ| = \sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2 + (z_1 - z_0)^2}. \tag{9.1.1}$$
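
The distance formula translates directly into a short function. This is a minimal sketch; the name `distance_3d` and the sample points are illustrative choices, not taken from the text.

```python
import math

def distance_3d(P, Q):
    """Distance |PQ| between points P = (x0, y0, z0) and Q = (x1, y1, z1)."""
    (x0, y0, z0), (x1, y1, z1) = P, Q
    return math.sqrt((x1 - x0)**2 + (y1 - y0)**2 + (z1 - z0)**2)

# Sample points chosen for illustration: the coordinate differences are
# 1, 2, and 2, so the distance is sqrt(1 + 4 + 4) = 3.
print(distance_3d((1, 0, -1), (2, 2, 1)))  # 3.0
```

Because each difference is squared, swapping P and Q gives the same distance.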

Equation (9.1.1) can be used to derive the formula for a sphere centered at
a point $(x_0, y_0, z_0)$ with radius r. Since the distance from any point (x, y, z)
on such a sphere to the point $(x_0, y_0, z_0)$ is r, the point (x, y, z) will satisfy the
equation

$$\sqrt{(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2} = r.$$

Squaring both sides, we come to the standard equation for a sphere.
The equation of a sphere.
The equation of a sphere with center $(x_0, y_0, z_0)$ and radius r is

$$(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = r^2.$$

This makes sense if we compare this equation to its two-dimensional analogue,
the equation of a circle of radius r in the plane centered at $(x_0, y_0)$:

$$(x - x_0)^2 + (y - y_0)^2 = r^2.$$
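
To see the sphere equation in use, the sketch below checks whether a point satisfies it. The function name `on_sphere`, the tolerance `tol`, and the sample sphere (center (1, 2, 3), radius 3) are assumptions made for illustration only.

```python
import math

def on_sphere(point, center, r, tol=1e-9):
    """True when `point` satisfies (x-x0)^2 + (y-y0)^2 + (z-z0)^2 = r^2.

    The tolerance `tol` is an added assumption to absorb floating-point error.
    """
    x, y, z = point
    x0, y0, z0 = center
    lhs = (x - x0)**2 + (y - y0)**2 + (z - z0)**2
    return math.isclose(lhs, r**2, rel_tol=0.0, abs_tol=tol)

center, r = (1, 2, 3), 3                 # illustrative sphere, not from the text
print(on_sphere((2, 4, 5), center, r))   # left side is 1 + 4 + 4 = 9 = r^2
print(on_sphere((1, 2, 3), center, r))   # the center itself is not on the sphere
```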

9.1.4 Traces
When we study functions of several variables we are often interested in how
each individual variable affects the function in and of itself. In Preview Ac-
tivity 9.1.1, we saw that the amount of money in an account depends on the
interest rate and the duration of the investment. However, if we fix the interest
rate, the amount of money in the account depends only on the duration of the
investment, and if we set the duration of the investment constant, then the
amount of money in the account depends only on the interest rate. This idea
of keeping one variable constant while we allow the other to change will be an
important tool for us when studying functions of several variables.
As another example, consider again the distance function f defined by

\[ f(x, y) = \frac{x^2 \sin(2y)}{g} \]

where x is the initial velocity of an object in feet per second, y is the launch
angle in radians, and g is the acceleration due to gravity (32 feet per second
squared). If we hold the launch angle constant at y = 0.6 radians, we can
consider f a function of the initial velocity alone. In this case we have

\[ f(x) = \frac{x^2}{32} \sin(2 \cdot 0.6). \]
We can plot this curve on the surface by tracing out the points on the
surface when y = 0.6, as shown at left in Figure 9.1.11. The formula clearly
shows that f is quadratic in the x-direction. More descriptively, as we increase
the launch velocity while keeping the launch angle constant, the horizontal
distance the object travels increases in proportion to the square of the initial
velocity.
Similarly, if we fix the initial velocity at 150 feet per second, we can consider
the distance as a function of the launch angle only. In this case we have

\[ f(y) = \frac{150^2 \sin(2y)}{32}. \]
We can again plot this curve on the surface by tracing out the points on
the surface when x = 150, as shown at right in Figure 9.1.11. The formula
clearly shows that f is sinusoidal in the y-direction. More descriptively, as
we increase the launch angle while keeping the initial velocity constant, the
horizontal distance traveled by the object is proportional to the sine of twice
the launch angle.
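
The idea of fixing one variable can be sketched in a few lines of Python. The helper names `trace_y_fixed` and `trace_x_fixed` are hypothetical, introduced only for this illustration; the formula and the value g = 32 come from the text.

```python
import math

G = 32  # acceleration due to gravity in ft/s^2, as in the text

def f(x, y):
    """Horizontal distance for initial velocity x (ft/s) and launch angle y (rad)."""
    return x ** 2 * math.sin(2 * y) / G

def trace_y_fixed(c):
    """Trace with the launch angle held at y = c: a function of x alone."""
    return lambda x: f(x, c)

def trace_x_fixed(c):
    """Trace with the initial velocity held at x = c: a function of y alone."""
    return lambda y: f(c, y)

along_x = trace_y_fixed(0.6)
along_y = trace_x_fixed(150)

# Doubling the velocity along the y = 0.6 trace quadruples the distance.
print(along_x(100) / along_x(50))
# Along the x = 150 trace the distance is largest at y = pi/4, where sin(2y) = 1.
print(along_y(math.pi / 4))
```

Each trace is an ordinary single-variable function, so the tools of single-variable calculus apply to it directly.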

Figure 9.1.11: Left: The trace with y = 0.6. Right: The trace with x = 150.

The curves we define when we fix one of the independent variables in our
two-variable function are called traces.

Definition 9.1.12. A trace of a function f of two independent variables x
and y in the x direction is a curve of the form \(z = f(x, c)\), where c is a constant.
Similarly, a trace of a function f of two independent variables x and y in the
y direction is a curve of the form \(z = f(c, y)\), where c is a constant.

Understanding trends in the behavior of functions of two variables can be
challenging, as can sketching their graphs; traces help us with each of these
tasks.

Activity 9.1.6. In the following questions, we investigate the use of traces to
better understand a function through both tables and graphs.

a. Identify the y = 0.6 trace for the distance function f defined by
\(f(x, y) = \frac{x^2 \sin(2y)}{g}\) by highlighting or circling the appropriate
cells in Table 9.1.5. Write a sentence to describe the behavior of the
function along this trace.

b. Identify the x = 150 trace for the distance function by highlighting or
circling the appropriate cells in Table 9.1.5. Write a sentence to describe
the behavior of the function along this trace.

Figure 9.1.13: Coordinate axes to sketch traces.

c. For the function g defined by \(g(x, y) = x^2 + y^2 + 1\), explain the type of
function that each trace in the x direction will be (keeping y constant).
Plot the y = −4, y = −2, y = 0, y = 2, and y = 4 traces in the 3-dimensional
coordinate system provided in Figure 9.1.13.

d. For the function g defined by \(g(x, y) = x^2 + y^2 + 1\), explain the type of
function that each trace in the y direction will be (keeping x constant).
Plot the x = −4, x = −2, x = 0, x = 2, and x = 4 traces in the 3-dimensional
coordinate system in Figure 9.1.13.

e. Describe the surface generated by the function g.

9.1.5 Contour Maps and Level Curves


We have all seen topographic maps such as the one of the Porcupine Mountains
in the upper peninsula of Michigan shown in Figure 9.1.14.¹ The curves on
these maps show the regions of constant altitude. The contours also depict
changes in altitude: contours that are close together signify steep ascents or
descents, while contours that are far apart indicate only slight changes in
elevation. Thus, contour maps tell us a lot about three-dimensional surfaces.
Mathematically, if \(f(x, y)\) represents the altitude at the point \((x, y)\), then each
contour is the graph of an equation of the form \(f(x, y) = k\), for some constant
k.

¹ Map source: Michigan Department of Natural Resources, with permission of the
Michigan DNR and Bob Wild.



Figure 9.1.14: Contour map of the Porcupine Mountains.

Activity 9.1.7. On the topographical map of the Porcupine Mountains in
Figure 9.1.14,

a. identify the highest and lowest points you can find;

b. from a point of your choice, determine a path of steepest ascent that
leads to the highest point;

c. from that same initial point, determine the least steep path that leads to
the highest point.

Curves on a surface that describe points at the same height or level are
called level curves.

Definition 9.1.15. A level curve (or contour) of a function f of two
independent variables x and y is a curve of the form \(k = f(x, y)\), where k is a
constant.

Topographical maps can be used to create a three-dimensional surface from
the two-dimensional contours or level curves. For example, level curves of the
distance function defined by \(f(x, y) = \frac{x^2 \sin(2y)}{32}\) plotted in the
xy-plane are shown at left in Figure 9.1.16. If we lift these contours and plot
them at their respective heights, then we get a picture of the surface itself, as
illustrated at right in Figure 9.1.16.

Figure 9.1.16: Left: Level curves. Right: Level curves at appropriate heights.
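
A level curve of the distance function can also be generated numerically. The sketch below assumes x > 0 and a launch angle strictly between 0 and π/2, so that the equation f(x, y) = k can be solved for x; the helper name `level_curve_x` and the level k = 250 are our own choices for illustration.

```python
import math

def f(x, y):
    """Distance function from the text with g = 32 ft/s^2."""
    return x ** 2 * math.sin(2 * y) / 32

def level_curve_x(k, y):
    """Velocity x > 0 with f(x, y) = k, for a launch angle 0 < y < pi/2."""
    return math.sqrt(32 * k / math.sin(2 * y))

k = 250  # an illustrative level; any positive value works
for y in (0.3, 0.6, math.pi / 4, 1.2):
    x = level_curve_x(k, y)
    # Every sampled point (x, y) should produce the same output k.
    print(round(x, 2), round(y, 2), round(f(x, y), 6))
```

All of the sampled inputs lie on one contour in the xy-plane, which is exactly what it means for them to lead to the same output.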

The use of level curves and traces can help us construct the graph of a
function of two variables.

Activity 9.1.8.

Figure 9.1.17: Left: Level curves for \(f(x, y) = x^2 + y^2\). Right: Level curves
for \(g(x, y) = \sqrt{x^2 + y^2}\).

a. Let \(f(x, y) = x^2 + y^2\). Draw the level curves \(f(x, y) = k\) for k = 1, k = 2,
k = 3, and k = 4 on the left set of axes given in Figure 9.1.17. (You
decide on the scale of the axes.) Explain what the surface defined by f
looks like.

b. Let \(g(x, y) = \sqrt{x^2 + y^2}\). Draw the level curves \(g(x, y) = k\) for k = 1,
k = 2, k = 3, and k = 4 on the right set of axes given in Figure 9.1.17.
(You decide on the scale of the axes.) Explain what the surface defined
by g looks like.

c. Compare and contrast the graphs of f and g. How are they alike? How
are they different? Use traces for each function to help answer these
questions.

The traces and level curves of a function of two variables are curves in space.
In order to understand these traces and level curves better, we will first spend
some time learning about vectors and vector-valued functions in the next few
sections and return to our study of functions of several variables once we have
these additional mathematical tools to support their study.

9.1.6 A gallery of functions

We end this section by considering a collection of functions and illustrating
their graphs and some level curves.

Figure 9.1.18: \(z = x^2 + y^2\)

Figure 9.1.19: \(z = 4 - (x^2 + y^2)\)

Figure 9.1.20: \(z = \sqrt{x^2 + y^2}\)

Figure 9.1.21: \(z = x^2 - y^2\)

Figure 9.1.22: \(z = \sin(x) + \sin(y)\)

Figure 9.1.23: \(z = y^2 - x^3 + x\)

Figure 9.1.24: \(z = xye^{-x^2 - y^2}\)

9.1.7 Summary

• A function f of several variables is a rule that assigns a unique number to
an ordered collection of independent inputs. The domain of a function of
several variables is the set of all inputs for which the function is defined.
• In \(\mathbb{R}^3\), the distance between points \(P = (x_0, y_0, z_0)\) and
\(Q = (x_1, y_1, z_1)\) (denoted as \(|PQ|\)) is given by the formula

\[ |PQ| = \sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2 + (z_1 - z_0)^2}, \]

and thus the equation of a sphere with center \((x_0, y_0, z_0)\) and radius \(r\) is

\[ (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = r^2. \]

• A trace of a function f of two independent variables x and y is a curve
of the form \(z = f(c, y)\) or \(z = f(x, c)\), where c is a constant. A trace tells
us how the function depends on a single independent variable if we treat
the other independent variable as a constant.
• A level curve of a function f of two independent variables x and y is
a curve of the form \(k = f(x, y)\), where k is a constant. A level curve
describes the set of inputs that lead to a specific output of the function.

Exercises
1. Evaluate a function. Evaluate the function at the specified points.
\(f(x, y) = x + yx^4\), (−3, 4), (4, 5), (5, 2)
At (−3, 4):
At (4, 5):
At (5, 2):
2. Sketch a contour diagram of each function. Then, decide whether its
contours are predominantly lines, parabolas, ellipses, or hyperbolas.

(a) \(z = x^2 + 4y^2\)
(b) \(z = x^2 - 2y^2\)
(c) \(z = -5x^2\)
(d) \(z = y - 4x^2\)

3. Match the surfaces with the verbal description of the level curves by
placing the letter of the verbal description to the left of the number of the
surface.
(a) \(z = \frac{1}{x - 1}\)
(b) \(z = 2x^2 + 3y^2\)
(c) \(z = xy\)
(d) \(z = \sqrt{x^2 + y^2}\)
(e) \(z = 2x + 3y\)
(f) \(z = \sqrt{25 - x^2 - y^2}\)
(g) \(z = x^2 + y^2\)

A. two straight lines and a collection of hyperbolas


B. a collection of concentric ellipses
C. a collection of equally spaced concentric circles
D. a collection of equally spaced parallel lines
E. a collection of unequally spaced parallel lines
F. a collection of unequally spaced concentric circles
4. The domain of the function \(f(x, y) = \sqrt{x} + \sqrt{y}\) is
5. Find the equation of the sphere centered at (−8, 3, 9) with radius 5.
Normalize your equations so that the coefficient of \(x^2\) is 1.
= 0.
Give an equation which describes the intersection of this sphere with the
plane z = 10.
= 0.
6. (A) If the positive z-axis points upward, an equation for a horizontal
plane through the point (−3, −2, −2) is
.
(B) An equation for the plane perpendicular to the x-axis and passing
through the point (−3, −2, −2) is
.
(C) An equation for the plane parallel to the xz-plane and passing through
the point (−3, −2, −2) is
.
7. A car rental company charges a one-time application fee of 25 dollars, 60
dollars per day, and 14 cents per mile for its cars.
(a) Write a formula for the cost, C, of renting a car as a function of the
number of days, d, and the number of miles driven, m.
C=
(b) If C = f (d, m), then f (5, 880) =
8. (a) Describe the set of points whose distance from the y-axis equals the
distance from the xz-plane.

A cone opening along the z-axis

A cylinder opening along the y-axis

A cylinder opening along the z-axis

A cylinder opening along the x-axis

A cone opening along the x-axis

A cone opening along the y-axis

(b) Find the equation for the set of points whose distance from the y-axis
equals the distance from the xz-plane.

\(x^2 + z^2 = r^2\)

\(y^2 = x^2 + z^2\)

\(x^2 = y^2 + z^2\)

\(x^2 + y^2 = r^2\)

\(y^2 + z^2 = r^2\)

\(z^2 = x^2 + y^2\)

9. For each surface, decide whether it could be a bowl, a plate, or neither.
Consider a plate to be any fairly flat surface and a bowl to be anything that
could hold water, assuming the positive z-axis is up.

(a) \(z = 3\)

(b) \(z = -\sqrt{3 - x^2 - y^2}\)

(c) \(z = 1 - x^2 - y^2\)

(d) \(z = x^2 + y^2\)

(e) \(x + y + z = 2\)
10. Consider the concentration, C, (in mg/liter) of a drug in the blood as
a function of the amount of drug given, x, and the time since injection, t. For
0 ≤ x ≤ 5 mg and t ≥ 0 hours, we have

\[ C = f(x, t) = 20te^{-(5-x)t} \]

f (3, 2) =
Give a practical interpretation of your answer: f (3, 2) is

the amount of a 3 mg dose in the blood 2 hours after injection.

the concentration of a 2 mg dose in the blood 3 hours after injection.

the concentration of a 3 mg dose in the blood 2 hours after injection.

the change in concentration of a 2 mg dose in the blood 3 hours after
injection.

the amount of a 2 mg dose in the blood 3 hours after injection.

the change in concentration of a 3 mg dose in the blood 2 hours after
injection.

11. A manufacturer sells aardvark masks at a price of $210 per mask and
butterfly masks at a price of $490 per mask. A quantity of a aardvark masks
and b butterfly masks is sold at a total cost of $550 to the manufacturer.
(a) Express the manufacturer’s profit, P, as a function of a and b.
P (a, b) = dollars.
(b) The curves of constant profit in the ab-plane are

hyperbolas

ellipses

lines

circles

parabolas

12. Consider the concentration, C, in mg per liter (L), of a drug in the blood
as a function of x, the amount, in mg, of the drug given and t, the time in hours
since the injection. For 0 ≤ x ≤ 4 and t ≥ 0, we have \(C = f(x, t) = te^{-t(5-x)}\).
Graph the following two single variable functions on a separate page, being
sure that you can explain their significance in terms of drug concentration.
(a) f (4, t)
(b) f (x, 1)
Using your graph in (a), where is f (4, t)
a maximum? t =
a minimum? t =
Using your graph in (b), where is f (x, 1)
a maximum? x =
a minimum? x =
13. By setting one variable constant, find a plane that intersects the graph
of \(z = 3y^2 - 9x^2 + 1\) in a:
(a) Parabola opening upward: the plane =
(Give your answer by specifying the variable in the first answer blank and
a value for it in the second.)
(b) Parabola opening downward: the plane =
(Give your answer by specifying the variable in the first answer blank and
a value for it in the second.)
(c) Pair of intersecting straight lines: the plane =
(Give your answer by specifying the variable in the first answer blank and
a value for it in the second.)
14. Find the equation of each of the following geometric objects.
a. The plane parallel to the xy-plane that passes through the point (−4, 5, −12).
b. The plane parallel to the yz-plane that passes through the point (7, −2, −3).
c. The sphere centered at the point (2, 1, 3) that has the point (−1, 0, −1)
on its surface.
d. The sphere whose diameter has endpoints (−3, 1, −5) and (7, 9, −1).

15. The Ideal Gas Law, \(PV = RT\), relates the pressure (P, in pascals),
temperature (T, in Kelvin), and volume (V, in cubic meters) of 1 mole of a gas
(\(R = 8.314\ \frac{\text{J}}{\text{mol} \cdot \text{K}}\) is the universal gas constant), and describes the behavior
of gases that do not liquefy easily, such as oxygen and hydrogen. We can solve
the ideal gas law for the volume and hence treat the volume as a function of
the pressure and temperature:
\[ V(P, T) = \frac{8.314T}{P}. \]
a. Explain in detail what the trace of V with P = 1000 tells us about a key
relationship between two quantities.
b. Explain in detail what the trace of V with T = 5 tells us.
c. Explain in detail what the level curve V = 0.5 tells us.
d. Use two or three additional traces in each direction to make a rough sketch
of the surface over the domain of V where P and T are each nonnegative.
Write at least one sentence that describes the way the surface looks.
e. Based on all your work above, write a couple of sentences that describe
the effects that temperature and pressure have on volume.

16. When people buy a large ticket item like a car or a house, they often
take out a loan to make the purchase. The loan is paid back in monthly
installments until the entire amount of the loan, plus interest, is paid. The
monthly payment that the borrower has to make depends on the amount P
of money borrowed (called the principal), the duration t of the loan in years,
and the interest rate r. For example, if we borrow $18,000 to buy a car, the
monthly payment M that we need to make to pay off the loan is given by the
formula
\[ M(r, t) = \frac{1500r}{1 - \frac{1}{\left(1 + \frac{r}{12}\right)^{12t}}}. \]

a. Find the monthly payments on this loan if the interest rate is 6% and
the duration of the loan is 5 years.
b. Create a table of values that illustrates the trace of M with r fixed at
5%. Use yearly values of t from 2 to 6. Round payments to the nearest
penny. Explain in detail in words what this trace tells us about M .
c. Create a table of values that illustrates the trace of M with t fixed at 3
years. Use rates from 3% to 11% in increments of 2%. Round payments
to the nearest penny. Explain in detail what this trace tells us about M .
d. Consider the combinations of interest rates and durations of loans that
result in a monthly payment of $200. Solve the equation M (r, t) = 200
for t to write the duration of the loan in terms of the interest rate. Graph
this level curve and explain as best you can the relationship between t
and r.

17. Consider the function h defined by \(h(x, y) = 8 - \sqrt{4 - x^2 - y^2}\).

a. What is the domain of h? (Hint: describe a set of ordered pairs in the
plane by explaining their relationship relative to a key circle.)
b. The range of a function is the set of all outputs the function generates.
Given that the range of the square root function \(g(t) = \sqrt{t}\) is the set of
all nonnegative real numbers, what do you think is the range of h? Why?
c. Choose 4 different values from the range of h and plot the corresponding
level curves in the plane. What is the shape of a typical level curve?
d. Choose 5 different values of x (including at least one negative value and
zero), and sketch the corresponding traces of the function h.
e. Choose 5 different values of y (including at least one negative value and
zero), and sketch the corresponding traces of the function h.
f. Sketch an overall picture of the surface generated by h and write at
least one sentence to describe how the surface appears visually. Does the
surface remind you of a familiar physical structure in nature?

9.2 Vectors

Motivating Questions

• What is a vector?

• What does it mean for two vectors to be equal?

• How do we add two vectors together and multiply a vector by a scalar?

• How do we determine the magnitude of a vector? What is a unit vector,
and how do we find a unit vector in the direction of a given vector?

Quantities like length, speed, area, and mass are all measured by numbers
(called scalars). Other quantities, like velocity, force, and displacement, have
two attributes: magnitude and direction. These quantities are represented by
vectors and are the subject of this section. For example, we will use vectors to
calculate work done by a constant force, calculate torque, determine direction
vectors for lines and normal vectors for planes, define curvature, and determine
the direction of greatest increase on a surface. For most of these applications,
we will be interested in using vectors to measure direction and/or speed. Vectors
will be a major tool for us in determining the behavior of functions of several
variables.
If we are at a point x in the domain of a function of one variable, there
are only two directions in which we can move: in the positive or negative x-
direction. If, however, we are at a point (x, y) in the domain of a function
of two variables, there are many directions in which we can move. Thus, it
is important for us to have a means to indicate direction, and we will do so
using vectors. This notion of direction in space will be critical for us to find
direction vectors for lines, tangent lines to curves, normal vectors to planes,
and to determine direction of motion.

Preview Activity 9.2.1. Postscript is a programming language whose primary
purpose is to describe the appearance of text or graphics. A simple set
of Postscript commands that produces the triangle in the plane with vertices
(0, 0), (1, 1), and (1, −1) is the following:

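newpath
0 0 moveto     % start at the point (0, 0)
1 1 lineto     % line from (0, 0) to (1, 1)
1 -1 lineto    % line from (1, 1) to (1, -1)
0 0 lineto     % line from (1, -1) back to the origin
stroke         % draw the path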

The key idea in these commands is that we tell Postscript to start at the
point (0, 0), draw a line from the point (0, 0) to the point (1, 1) (this is
what the lineto and stroke commands do), then draw lines from (1, 1) to
(1, −1) and from (1, −1) back to the origin. Each of these commands encodes
two important pieces of information: a direction in which to move and a
distance to move. Mathematically, we can capture this information succinctly
in a vector. To do so, we record the movement in the plane as a pair
\(\langle x, y \rangle\) (this pair is a vector), where x is the horizontal
displacement and y the vertical displacement from one point to another. So,
for example, the vector from the origin to the point (1, 1) is represented by the
vector \(\langle 1, 1 \rangle\).

a. What is the vector \(\mathbf{v}_1 = \langle x, y \rangle\) that describes the displacement from the
point (1, 1) to the point (1, −1)? How can we use this vector to determine
the distance from the point (1, 1) to the point (1, −1)?

b. Suppose we want to draw the triangle with vertices A = (2, 3), B = (−3, 1),
and C = (4, −2). As a shorthand notation, we will denote the vector from
the point A to the point B as \(\overrightarrow{AB}\).
i. Determine the vectors \(\overrightarrow{AB}\), \(\overrightarrow{BC}\), and \(\overrightarrow{AC}\).
ii. What relationship do you see among the vectors \(\overrightarrow{AB}\), \(\overrightarrow{BC}\), and \(\overrightarrow{AC}\)?
Explain why this relationship should hold.

9.2.1 Representations of Vectors


Preview Activity 9.2.1 shows how we can record the magnitude and direction
of a change in position using an ordered pair of numbers \(\langle x, y \rangle\). There are
many other quantities, such as force and velocity, that possess the attributes
of magnitude and direction, and we will call each such quantity a vector.

Definition 9.2.1. A vector is a quantity that possesses the attributes of
magnitude and direction.

We can represent a vector geometrically as a directed line segment, with the
magnitude as the length of the segment and an arrowhead indicating direction,
as shown at left in Figure 9.2.2.

Figure 9.2.2: Left: A vector. Right: Representations of the same vector.

According to the definition, a vector possesses the attributes of length
(magnitude) and direction; the vector’s position, however, is not mentioned.
Consequently, we regard as equal any two vectors having the same magnitude and
direction, as shown at right in Figure 9.2.2. In other words, two vectors are
equal provided they have the same magnitude and direction.
This means that the same vector may be drawn in the plane in many
different ways. For instance, suppose that we would like to draw the vector
\(\langle 3, 4 \rangle\), which represents a horizontal change of three units and a vertical change
of four units. We may place the tail of the vector (the point from which the
vector originates) at the origin and the tip (the terminal point of the vector) at
(3, 4), as illustrated at left in Figure 9.2.3. A vector with its tail at the origin
is said to be in standard position.

Figure 9.2.3: Left: Standard position. Right: A vector between two points.

Alternatively, we may place the tail of the vector \(\langle 3, 4 \rangle\) at another point,
such as Q(1, 1). After a displacement of three units to the right and four units
up, the tip of the vector is at the point R(4, 5) (see the vector at right in
Figure 9.2.3).
In this example, the vector led to the directed line segment from Q to R,
which we denote as \(\overrightarrow{QR}\). We may also turn the situation around: given the two
points Q and R, we obtain the vector \(\langle 3, 4 \rangle\) because we move horizontally three
units and vertically four units to get from Q to R. In other words,
\(\overrightarrow{QR} = \langle 3, 4 \rangle\). In general, the vector \(\overrightarrow{QR}\) from the point
\(Q = (q_1, q_2)\) to \(R = (r_1, r_2)\) is found by taking the difference of
coordinates, so that

\[ \overrightarrow{QR} = \langle r_1 - q_1, r_2 - q_2 \rangle. \]
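
The difference-of-coordinates rule translates directly into code. In this minimal sketch the function name `vector_between` is hypothetical, and vectors are stored as plain Python tuples purely for illustration.

```python
def vector_between(q, r):
    """Components of the vector from point q to point r (any dimension)."""
    if len(q) != len(r):
        raise ValueError("points must have the same number of coordinates")
    return tuple(ri - qi for qi, ri in zip(q, r))

Q = (1, 1)
R = (4, 5)
print(vector_between(Q, R))  # (3, 4), matching the example in the text
print(vector_between(R, Q))  # (-3, -4): reversing the points reverses the vector
```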

We will use boldface letters to represent vectors, such as \(\mathbf{v} = \langle 3, 4 \rangle\), to
distinguish them from scalars. The entries of a vector are called its components;
in the vector \(\langle 3, 4 \rangle\), the x component is 3 and the y component is 4. We use
pointed brackets \(\langle\ ,\ \rangle\) and the term components to distinguish a vector from a
point ( , ) and its coordinates. There is, however, a close connection between
vectors and points. Given a point P, we will frequently consider the vector
\(\overrightarrow{OP}\) from the origin O to P. For instance, if P = (3, 4), then
\(\overrightarrow{OP} = \langle 3, 4 \rangle\) as in Figure 9.2.4. In this way, we think of a point P as
defining a vector \(\overrightarrow{OP}\) whose components agree with the coordinates of P.
The vector \(\overrightarrow{OP}\) is called the position vector of P.

Figure 9.2.4: A point defines a vector

While we often illustrate vectors in the plane since it is easier to draw pic-
tures, different situations call for the use of vectors in three or more dimensions.
For instance, a vector \(\mathbf{v}\) in n-dimensional space, \(\mathbb{R}^n\), has n components and
may be represented as
\[ \mathbf{v} = \langle v_1, v_2, v_3, \ldots, v_n \rangle. \]

The next activity will help us to become accustomed to vectors and oper-
ations on vectors in three dimensions.

Activity 9.2.2. An article by C. Kenneth Tanner of the University of Georgia
argues that, due to the concept of social distance, a secondary school classroom
for 20 students should have 1344 square feet of floor space. Suppose a classroom
is 32 feet by 42 feet by 8 feet. Set the origin O of the classroom to be its center.
In this classroom, a student is sitting on a chair whose seat is at location
A = (9, −6, −1.5), an overhead projector is located at position B = (0, 1, 7),
and the teacher is standing at point C = (−2, 20, −4), all distances measured
in feet. Determine the components of the indicated vectors and explain in
context what each represents.
a. \(\overrightarrow{OA}\) b. \(\overrightarrow{OB}\) c. \(\overrightarrow{OC}\) d. \(\overrightarrow{AB}\) e. \(\overrightarrow{AC}\) f. \(\overrightarrow{BC}\)

9.2.2 Equality of Vectors


Because location is not mentioned in the definition of a vector, any two vectors
that have the same magnitude and direction are equal. It is helpful to have
an algebraic way to determine when this occurs. That is, if we know the
components of two vectors u and v, we will want to be able to determine
algebraically when u and v are equal. There is an obvious set of conditions
that we use.
Equality of Vectors.
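Two vectors \(\mathbf{u} = \langle u_1, u_2 \rangle\) and \(\mathbf{v} = \langle v_1, v_2 \rangle\) in \(\mathbb{R}^2\) are equal if and only if
their corresponding components are equal: \(u_1 = v_1\) and \(u_2 = v_2\). More
generally, two vectors \(\mathbf{u} = \langle u_1, u_2, \ldots, u_n \rangle\) and \(\mathbf{v} = \langle v_1, v_2, \ldots, v_n \rangle\) in
\(\mathbb{R}^n\) are equal if and only if \(u_i = v_i\) for each possible value of \(i\).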

9.2.3 Operations on Vectors


Vectors are not numbers, but we can now represent them with components
that are real numbers. As such, we naturally wonder if it is possible to add
two vectors together, multiply two vectors, or combine vectors in any other
ways. In this section, we will study two operations on vectors: vector addition
and scalar multiplication. To begin, we investigate a natural way to add two
vectors together, as well as to multiply a vector by a scalar.

Activity 9.2.3. Let \(\mathbf{u} = \langle 2, 3 \rangle\), \(\mathbf{v} = \langle -1, 4 \rangle\).

a. Using the two specific vectors above, what is the natural way to define
the vector sum \(\mathbf{u} + \mathbf{v}\)?

b. In general, how do you think the vector sum \(\mathbf{a} + \mathbf{b}\) of vectors
\(\mathbf{a} = \langle a_1, a_2 \rangle\) and \(\mathbf{b} = \langle b_1, b_2 \rangle\) in \(\mathbb{R}^2\) should be defined? Write a
formal definition of a vector sum based on your intuition.

c. In general, how do you think the vector sum \(\mathbf{a} + \mathbf{b}\) of vectors
\(\mathbf{a} = \langle a_1, a_2, a_3 \rangle\) and \(\mathbf{b} = \langle b_1, b_2, b_3 \rangle\) in \(\mathbb{R}^3\) should be defined?
Write a formal definition of a vector sum based on your intuition.

d. Returning to the specific vector \(\mathbf{v} = \langle -1, 4 \rangle\) given above, what is the
natural way to define the scalar multiple \(\frac{1}{2}\mathbf{v}\)?

e. In general, how do you think a scalar multiple of a vector \(\mathbf{a} = \langle a_1, a_2 \rangle\) in
\(\mathbb{R}^2\) by a scalar c should be defined? How about for a scalar multiple of a
vector \(\mathbf{a} = \langle a_1, a_2, a_3 \rangle\) in \(\mathbb{R}^3\) by a scalar c? Write a formal definition of a
scalar multiple of a vector based on your intuition.

We can now add vectors and multiply vectors by scalars, and thus we
can add together scalar multiples of vectors. This allows us to define vector
subtraction, v − u, as the sum of v and −1 times u, so that

v − u = v + (−1)u.

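Using vector addition and scalar multiplication, we will often represent
vectors in terms of the special vectors \(\mathbf{i} = \langle 1, 0 \rangle\) and \(\mathbf{j} = \langle 0, 1 \rangle\). For instance,
we can write the vector \(\langle a, b \rangle\) in \(\mathbb{R}^2\) as

\[ \langle a, b \rangle = a\langle 1, 0 \rangle + b\langle 0, 1 \rangle = a\mathbf{i} + b\mathbf{j}, \]

which means that

\[ \langle 2, -3 \rangle = 2\mathbf{i} - 3\mathbf{j}. \]

In the context of \(\mathbb{R}^3\), we let \(\mathbf{i} = \langle 1, 0, 0 \rangle\), \(\mathbf{j} = \langle 0, 1, 0 \rangle\), and \(\mathbf{k} = \langle 0, 0, 1 \rangle\), and
we can write the vector \(\langle a, b, c \rangle\) in \(\mathbb{R}^3\) as

\[ \langle a, b, c \rangle = a\langle 1, 0, 0 \rangle + b\langle 0, 1, 0 \rangle + c\langle 0, 0, 1 \rangle = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}. \]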

The vectors i, j, and k are called the standard unit vectors (as we will learn
momentarily, unit vectors have length 1), and are important in the physical
sciences.
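
The operations in this section can be sketched in Python, assuming the componentwise definitions of the vector sum and scalar multiple that Activity 9.2.3 is designed to lead to. The function names `add`, `scale`, and `subtract` are our own, and vectors are stored as tuples for illustration only.

```python
def add(u, v):
    """Componentwise sum of two vectors with the same number of components."""
    return tuple(ui + vi for ui, vi in zip(u, v))

def scale(c, v):
    """Scalar multiple c*v, computed component by component."""
    return tuple(c * vi for vi in v)

def subtract(v, u):
    """v - u, defined as v + (-1)u."""
    return add(v, scale(-1, u))

i, j = (1, 0), (0, 1)  # standard unit vectors in R^2

# The vector <2, -3> written as 2i - 3j.
print(add(scale(2, i), scale(-3, j)))  # (2, -3)
print(subtract((7, 4), (4, 6)))        # (3, -2)
```

Defining subtraction through `add` and `scale` mirrors the way the text builds v − u from the two basic operations.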

9.2.4 Properties of Vector Operations

We know that the scalar sum 1 + 2 is equal to the scalar sum 2 + 1. This
is called the commutative property of scalar addition. Any time we define
operations on objects (like addition of vectors) we usually want to know what
kinds of properties the operations have. For example, is addition of vectors a
commutative operation? To answer this question we take two arbitrary vectors
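\(\mathbf{v}\) and \(\mathbf{u}\) and add them together and see what happens. Let \(\mathbf{v} = \langle v_1, v_2 \rangle\) and
\(\mathbf{u} = \langle u_1, u_2 \rangle\). Now we use the fact that \(v_1\), \(v_2\), \(u_1\), and \(u_2\) are scalars, and that
the addition of scalars is commutative to see that

\begin{align*}
\mathbf{v} + \mathbf{u} &= \langle v_1, v_2 \rangle + \langle u_1, u_2 \rangle \\
&= \langle v_1 + u_1, v_2 + u_2 \rangle \\
&= \langle u_1 + v_1, u_2 + v_2 \rangle \\
&= \langle u_1, u_2 \rangle + \langle v_1, v_2 \rangle \\
&= \mathbf{u} + \mathbf{v}.
\end{align*}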

So the vector sum is a commutative operation. Similar arguments can be
used to show the following properties of vector addition and scalar
multiplication.

Properties of vector operations.


Let v, u, and w be vectors in \(\mathbb{R}^n\) and let a and b be scalars. Then
1. v + u = u + v

2. (v + u) + w = v + (u + w)
3. The vector \(\mathbf{0} = \langle 0, 0, \ldots, 0 \rangle\) has the property that v + 0 = v. The
vector 0 is called the zero vector.
4. (−1)v + v = 0. The vector (−1)v = −v is called the additive
inverse of the vector v.
5. (a + b)v = av + bv
6. a(v + u) = av + au

7. (ab)v = a(bv)
8. 1v = v.

We verified the first property for vectors in \(\mathbb{R}^2\); it is straightforward to
verify that the rest of the eight properties just noted hold for all vectors in
\(\mathbb{R}^n\).
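
A numerical spot check is no substitute for the algebraic verification, but it can catch a misstated property. The sketch below assumes componentwise definitions of addition and scalar multiplication and tests the eight properties on randomly chosen vectors in three dimensions; the helper names, the seed, and the use of exact fractions are our own choices.

```python
from fractions import Fraction
import random

def add(u, v):
    """Componentwise vector sum."""
    return tuple(ui + vi for ui, vi in zip(u, v))

def scale(c, v):
    """Componentwise scalar multiple."""
    return tuple(c * vi for vi in v)

def random_vector(n):
    """A vector in R^n with exact rational components (avoids rounding error)."""
    return tuple(Fraction(random.randint(-9, 9), random.randint(1, 9)) for _ in range(n))

random.seed(0)  # fixed seed so the run is repeatable
n = 3
zero = (Fraction(0),) * n
for _ in range(100):
    u, v, w = random_vector(n), random_vector(n), random_vector(n)
    a, b = Fraction(random.randint(-9, 9)), Fraction(random.randint(-9, 9))
    assert add(v, u) == add(u, v)                                # property 1
    assert add(add(v, u), w) == add(v, add(u, w))                # property 2
    assert add(v, zero) == v                                     # property 3
    assert add(scale(-1, v), v) == zero                          # property 4
    assert scale(a + b, v) == add(scale(a, v), scale(b, v))      # property 5
    assert scale(a, add(v, u)) == add(scale(a, v), scale(a, u))  # property 6
    assert scale(a * b, v) == scale(a, scale(b, v))              # property 7
    assert scale(1, v) == v                                      # property 8
print("all eight properties held on the sampled vectors")
```

Exact fractions are used instead of floating-point numbers so that the equality tests are not disturbed by rounding.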

9.2.5 Geometric Interpretation of Vector Operations

Next, we explore a geometric interpretation of vector addition and scalar
multiplication that allows us to visualize these operations. Let \(\mathbf{u} = \langle 4, 6 \rangle\) and
\(\mathbf{v} = \langle 3, -2 \rangle\). Then \(\mathbf{w} = \mathbf{u} + \mathbf{v} = \langle 7, 4 \rangle\), as shown on the left in Figure 9.2.5.

Figure 9.2.5: A vector sum (left), summing displacements (center), the
parallelogram law (right).

If we think of these vectors as displacements in the plane, we find a
geometric way to envision vector addition. For instance, the vector u + v will
represent the displacement obtained by following the displacement u with the
displacement v. We may picture this by placing the tail of v at the tip of u,
as seen in the center of Figure 9.2.5.
Of course, vector addition is commutative so we obtain the same sum if
we place the tail of u at the tip of v. We therefore see that u + v appears as
the diagonal of the parallelogram determined by u and v, as shown at right in
Figure 9.2.5.
Vector subtraction has a similar interpretation. At left in Figure 9.2.6
we see vectors u, v, and w = u + v. If we rewrite v = w − u, we have
the arrangement shown at right in Figure 9.2.6. In other words, to form the
difference w − u, we draw a vector from the tip of u to the tip of w.


Figure 9.2.6: Left: Vector addition. Right: Vector subtraction.

In a similar way, we may geometrically represent a scalar multiple of a
vector. For instance, if v = ⟨2, 3⟩, then 2v = ⟨4, 6⟩. As shown in Figure 9.2.7,
multiplying v by 2 leaves the direction unchanged, but stretches v by a factor
of 2. Also, −2v = ⟨−4, −6⟩, which shows that multiplying by a negative scalar
gives a vector pointing in the opposite direction of v.
9.2. VECTORS 27


Figure 9.2.7: Scalar multiplication
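The computations behind Figures 9.2.5 through 9.2.7 can be reproduced with a short Python sketch. The function names `add`, `sub`, and `scale` are illustrative, and vectors are assumed to be stored as tuples; the numbers are the ones used in the discussion above.

```python
def add(u, v):
    """Tip-to-tail sum: follow displacement u, then displacement v."""
    return tuple(x + y for x, y in zip(u, v))

def sub(w, u):
    """Difference w - u: the vector from the tip of u to the tip of w."""
    return tuple(x - y for x, y in zip(w, u))

def scale(c, v):
    """Scalar multiple c*v; a negative c reverses the direction."""
    return tuple(c * x for x in v)

u, v = (4, 6), (3, -2)
w = add(u, v)
print(w)                  # (7, 4)
print(add(v, u) == w)     # True: the parallelogram law, either order works
print(sub(w, u))          # (3, -2), which recovers v

p = (2, 3)
print(scale(2, p), scale(-2, p))   # (4, 6) (-4, -6)
```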

Activity 9.2.4.


Figure 9.2.8: Left: Sketch sums. Right: Sketch multiples.

Suppose that u and v are the vectors shown in Figure 9.2.8.


a. On the axes at left in Figure 9.2.8, sketch the vectors u + v, v − u, 2u,
−2u, and −3v.
b. What is 0v?
c. On the axes at right in Figure 9.2.8, sketch the vectors −3v, −2v, −1v,
2v, and 3v.
d. Give a geometric description of the set of terminal points of the vectors
tv where t is any scalar.
e. On the set of axes at right in Figure 9.2.8, sketch the vectors u − 3v,
u − 2v, u − v, u + v, and u + 2v.
f. Give a geometric description of the set of terminal points of the vectors
u + tv where t is any scalar.

9.2.6 The Magnitude of a Vector


By definition, vectors have both direction and magnitude (or length). We now
investigate how to calculate the magnitude of a vector. Since a vector v can
be represented by a directed line segment, we can use the distance formula to
calculate the length of the segment. This length is the magnitude of the vector
v and is denoted |v|.

Activity 9.2.5.

Figure 9.2.9: Left: The vector AB. Right: An arbitrary vector, v.

a. Let A = (2, 3) and B = (4, 7), as shown at left in Figure 9.2.9. Compute
|AB|, the magnitude of the vector from A to B.

b. Let v = ⟨v1, v2⟩ be the vector in R^2 with components v1 and v2 as shown
at right in Figure 9.2.9. Use the distance formula to find a general formula
for |v|.

c. Let v = ⟨v1, v2, v3⟩ be a vector in R^3. Use the distance formula to find a
general formula for |v|.

d. Suppose that u = ⟨2, 3⟩ and v = ⟨−1, 2⟩. Find |u|, |v|, and |u + v|. Is it
true that |u + v| = |u| + |v|?

e. Under what conditions will |u + v| = |u| + |v|? (Hint: Think about how
u, v, and u + v form the sides of a triangle.)

f. With the vector u = ⟨2, 3⟩, find the lengths of 2u, 3u, and −2u,
respectively, and use proper notation to label your results.

g. If t is any scalar, how is |tu| related to |u|?

h. A unit vector is a vector whose magnitude is 1. Of the vectors i, j, and
i + j, which are unit vectors?

i. Find a unit vector v whose direction is the same as u = ⟨2, 3⟩. (Hint:
Consider the result of part (g).)

9.2.7 Summary

• A vector is an object that possesses the attributes of magnitude and
direction. Examples of vector quantities are position, velocity, acceleration,
and force.

• Two vectors are equal if they have the same direction and magnitude.
Notice that position is not considered, so a vector is independent of its
location.

• If u = ⟨u1, u2, . . . , un⟩ and v = ⟨v1, v2, . . . , vn⟩ are two vectors in R^n,
then their vector sum is the vector

u + v = ⟨u1 + v1, u2 + v2, . . . , un + vn⟩.

If u = ⟨u1, u2, . . . , un⟩ is a vector in R^n and c is a scalar, then the scalar
multiple cu is the vector

cu = ⟨cu1, cu2, . . . , cun⟩.

• The magnitude of the vector v = ⟨v1, v2, . . . , vn⟩ in R^n is the scalar

\[ |\mathbf{v}| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}. \]

A vector u is a unit vector provided that |u| = 1. If v is a nonzero vector,
then the vector

\[ \frac{\mathbf{v}}{|\mathbf{v}|} \]

is a unit vector with the same direction as v.
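The magnitude and unit-vector formulas from this summary translate directly into code. The following is a minimal Python sketch; the names `magnitude` and `unit` and the sample vector ⟨5, 12⟩ are illustrative assumptions, not taken from the text.

```python
import math

def magnitude(v):
    """|v| = sqrt(v1^2 + v2^2 + ... + vn^2) for a vector in R^n."""
    return math.sqrt(sum(x * x for x in v))

def unit(v):
    """Return v/|v|, the unit vector in the direction of a nonzero v."""
    m = magnitude(v)
    if m == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(x / m for x in v)

v = (5, 12)            # hypothetical example vector
print(magnitude(v))    # 13.0
print(unit(v))         # components 5/13 and 12/13
print(magnitude(unit(v)))   # 1.0 up to rounding error
```

Rescaling the result of `unit(v)` by a scalar t then gives a vector of length |t| along the same line as v.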

Exercises
1. For each of the following, perform the indicated computation.
(a) (−8i + 4j + 9k) − (3i − 3j − 9k) =
(b) (−6i + 3j + 6k) − 2(2i + 7j − 7k) =
2. Find a vector a that has the same direction as ⟨−6, 3, 6⟩ but has length 3.
Answer: a =
3. Let a = ⟨−2, 3, 0⟩ and b = ⟨−2, 1, −2⟩.
Show that there are scalars s and t so that sa + tb = ⟨4, −8, −2⟩.
You might want to sketch the vectors to get some intuition.
s =
t =
4. Resolve the following vectors into components:
(a) The vector v in 2-space of length 7 pointing up at an angle of π/3
measured from the positive x-axis.
v = ____ i + ____ j
(b) The vector w in 3-space of length 5 lying in the xz-plane pointing
upward at an angle of π/4 measured from the positive x-axis.
w = ____ i + ____ j + ____ k
5. Find all vectors v in 2 dimensions having |v| = 17 where the i-component
of v is 8i.
vectors:
(If you find more than one vector, enter them in a comma-separated list.)

6. Which is traveling faster, a car whose velocity vector is 27i + 32j, or a
car whose velocity vector is 40i, assuming that the units are the same for both
directions?
(The first car / the second car) is the faster car.
At what speed is the faster car traveling?
speed =
7. Let a = ⟨3, −1, −3⟩ and b = ⟨−3, 1, −2⟩.
Compute:
a + b = ( , , )
a − b = ( , , )
2a = ( , , )
3a + 4b = ( , , )
|a| =
8. Find the length of the vectors
(a) 4i − 5j + k: length =
(b) 3i + 0.4j + 2k: length =
9. For each of the following, perform the indicated operations on the vectors
a = 5j + 4k, b = −5i + 2j + k, z = 3i + j.
(a) 3a + 5b =
(b) 4a + 6b − 3z =
10. Find the value(s) of a making v = 3a i − 2j parallel to w = a^2 i + 4j.
a =
(If there is more than one value of a, enter the values as a comma-separated
list.)
11. (a) Find a unit vector pointing from the point P = (1, 1) toward the point
Q = (16, 9).
u =
(b) Find a vector of length 34 pointing in the same direction.
v =
12. A truck is traveling due north at 45 km/hr approaching a crossroad.
On a perpendicular road a police car is traveling west toward the intersection
at 65 km/hr. Both vehicles will reach the crossroad in exactly one hour. Find
the vector currently representing the displacement of the truck with respect to
the police car.
displacement d =
13. Let v = ⟨1, −2⟩, u = ⟨0, 4⟩, and w = ⟨−5, 7⟩.

a. Determine the components of the vector u − v.

b. Determine the components of the vector 2v − 3u.

c. Determine the components of the vector v + 2u − 7w.

d. Determine scalars a and b such that av + bu = w. Show all of your work
in finding a and b.

14. Let u = ⟨2, 1⟩ and v = ⟨1, 2⟩.

a. Determine the components and draw geometric representations of the
vectors 2u, (1/2)u, (−1)u, and (−3)u on the same set of axes.

b. Determine the components and draw geometric representations of the
vectors u + v, u + 2v, and u + 3v on the same set of axes.

c. Determine the components and draw geometric representations of the
vectors u − v, u − 2v, and u − 3v on the same set of axes.
d. Recall that u − v = u + (−1)v. Sketch the vectors u, v, u + v, and
u − v on the same set of axes. Use the “tip to tail” perspective for vector
addition to explain the geometric relationship between u, v, and u − v.

15. Recall that given any vector v, we can calculate its length, |v|. Also,
we say that two vectors that are scalar multiples of one another are parallel.
a. Let v = ⟨3, 4⟩ in R^2. Compute |v|, and determine the components of the
vector u = (1/|v|)v. What is the magnitude of the vector u? How does its
direction compare to v?
b. Let w = 3i − 3j in R^2. Determine a unit vector u in the same direction
as w.
c. Let v = ⟨2, 3, 5⟩ in R^3. Compute |v|, and determine the components of
the vector u = (1/|v|)v. What is the magnitude of the vector u? How does
its direction compare to v?
d. Let v be an arbitrary nonzero vector in R^3. Write a general formula for
a unit vector that is parallel to v.

16.

A force (like gravity) has both a magnitude and a direction. If two forces u
and v are applied to an object at the same point, the resultant force on the
object is the vector sum of the two forces. When a force is applied by a rope
or a cable, we call that force tension. Vectors can be used to determine
tension.

Figure 9.2.10: Forces acting on an object.

As an example, suppose a painting weighing 50 pounds is hung from a wire
attached to a hook which is not perfectly centered on the picture, as illustrated
in Figure 9.2.10. We need to know how much tension will be on the wire to
know what kind of wire to use to hang the picture. Assume the hook is on the
picture frame at point O. Let u be the vector emanating from point O to the
left and v the vector emanating from point O to the right. Assume u makes
a 30° angle with the horizontal at point O and v makes a 45° angle with the
horizontal at point O. Our goal is to determine the vectors u and v.
a. Treat point O as the origin. Use trigonometry to find the components u1
and u2 so that u = u1 i + u2 j. Since we don’t know the magnitude of u,
your components will be in terms of |u| and the cosine and sine of some
angle. Then find the components v1 and v2 so that v = v1 i + v2 j. Again,

your components will be in terms of |v| and the cosine and sine of some
angle.
b. The total force holding the picture up is given by u + v. The force acting
to pull the picture down is given by the weight of the picture. Find the
force vector w acting to pull the picture down.

c. The picture will hang in equilibrium when the force acting to hold it up
is equal to the force acting to pull it down. Equate these forces to find
the components of the vectors u and v.
9.3. THE DOT PRODUCT 33

9.3 The Dot Product

Motivating Questions

• How is the dot product of two vectors defined and what geometric infor-
mation does it tell us?

• How can we tell if two vectors in R^n are perpendicular?

• How do we find the projection of one vector onto another?

In the last section, we considered vector addition and scalar multiplication
and found that each operation had a natural geometric interpretation. In this
section, we will introduce a means of multiplying vectors.

Preview Activity 9.3.1. For two-dimensional vectors u = ⟨u1, u2⟩ and
v = ⟨v1, v2⟩, the dot product is simply the scalar obtained by

u · v = u1 v1 + u2 v2 .

a. If u = ⟨3, 4⟩ and v = ⟨−2, 1⟩, find the dot product u · v.

b. Find i · i and i · j.

c. If u = ⟨3, 4⟩, find u · u. How is this related to |u|?

d. On the axes in Figure 9.3.1, plot the vectors u = ⟨1, 3⟩ and v = ⟨−3, 1⟩.
Then, find u · v. What is the angle between these vectors?

Figure 9.3.1: For part (d) (blank axes, with x and y each running from −4 to 4)

e. On the axes in Figure 9.3.2, plot the vector u = ⟨1, 3⟩.

Figure 9.3.2: For part (e) (blank axes, with x and y each running from −4 to 4)

For each of the following vectors v, plot the vector on Figure 9.3.2 and
then compute the dot product u · v.

• v = ⟨3, 2⟩.
• v = ⟨3, 0⟩.
• v = ⟨3, −1⟩.
• v = ⟨3, −2⟩.
• v = ⟨3, −4⟩.

f. Based upon the previous part of this activity, what do you think is the
sign of the dot product in the following three cases shown in Figure 9.3.3?


Figure 9.3.3: For part (f)

9.3.1 The Dot Product


The definition of the dot product for vectors in R^2 given in Preview
Activity 9.3.1 can be extended to vectors in R^n.

Definition 9.3.4. The dot product of vectors u = ⟨u1, u2, . . . , un⟩ and
v = ⟨v1, v2, . . . , vn⟩ in R^n is the scalar

u · v = u1 v1 + u2 v2 + . . . + un vn.

(As we will see shortly, the dot product arises in physics to calculate the
work done by a vector force in a given direction. It might be more natural

to define the dot product in this context, but it is more convenient from a
mathematical perspective to define the dot product algebraically and then
view work as an application of this definition.)
For instance, we find that

⟨3, 0, 1⟩ · ⟨−2, 1, 4⟩ = 3 · (−2) + 0 · 1 + 1 · 4 = −6 + 0 + 4 = −2.

Notice that the resulting quantity is a scalar. Our work in Preview Activ-
ity 9.3.1 examined dot products of two-dimensional vectors.
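As a sketch of how Definition 9.3.4 might be coded, the Python function below sums the products of corresponding components. The name `dot` and the length check are illustrative choices; the example reproduces the computation shown above.

```python
def dot(u, v):
    """Dot product u1*v1 + u2*v2 + ... + un*vn of two vectors in R^n."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(x * y for x, y in zip(u, v))

# The worked example from the text: 3*(-2) + 0*1 + 1*4.
print(dot((3, 0, 1), (-2, 1, 4)))   # -2
```

Note that the result is a single number (a scalar), not a vector.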

Activity 9.3.2. Determine each of the following.

a. ⟨1, 2, −3⟩ · ⟨4, −2, 0⟩

b. ⟨0, 3, −2, 1⟩ · ⟨5, −6, 0, 4⟩

The dot product is a natural way to define a product of two vectors. In ad-
dition, it behaves in ways that are similar to the product of, say, real numbers.

Properties of the dot product.


Let u, v, and w be vectors in R^n. Then
1. u · v = v · u (the dot product is commutative),

2. (u + v) · w = (u · w) + (v · w), and
3. if c is a scalar, then (cu) · w = c(u · w).

Moreover, the dot product gives us valuable geometric information about
the vectors and their relative orientation. For instance, let’s consider what
happens when we dot a vector with itself:

u · u = ⟨u1, u2, . . . , un⟩ · ⟨u1, u2, . . . , un⟩ = u1^2 + u2^2 + . . . + un^2 = |u|^2.

In other words, the dot product of a vector with itself gives the square of
the length of the vector: u · u = |u|^2.
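The three properties of the dot product, together with the identity u · u = |u|^2, can be spot-checked on an example. This is a minimal Python sketch with hypothetical sample vectors; the helper names are illustrative, and a numerical check is not a substitute for the algebraic verification.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def add(u, v):
    return tuple(x + y for x, y in zip(u, v))

def scale(c, v):
    return tuple(c * x for x in v)

# Hypothetical sample vectors and scalar (integers keep the checks exact).
u, v, w = (1, 2, 2), (4, -3, 6), (-2, 0, 5)
c = 7

print(dot(u, v) == dot(v, u))                       # commutative
print(dot(add(u, v), w) == dot(u, w) + dot(v, w))   # distributes over addition
print(dot(scale(c, u), w) == c * dot(u, w))         # scalars factor out

length_u = math.sqrt(dot(u, u))   # |u| recovered from u . u
print(dot(u, u), length_u)        # 9 3.0
```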

9.3.2 The angle between vectors

The dot product can help us understand the angle between two vectors. For
instance, if we are given two vectors u and v, there are two angles that these
vectors create, as depicted at left in Figure 9.3.5. We will call θ, the smaller of
these angles, the angle between these vectors. Notice that θ lies between 0 and
π.

Figure 9.3.5: Left: The angle between u and v. Right: The triangle formed
by u, v, and u − v.

To determine this angle, we may apply the Law of Cosines to the triangle
shown at right in Figure 9.3.5.
Using the fact that the dot product of a vector with itself gives us the
square of its length, together with the properties of the dot product, we find:

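\begin{align*}
|\mathbf{u} - \mathbf{v}|^2 &= |\mathbf{u}|^2 + |\mathbf{v}|^2 - 2|\mathbf{u}||\mathbf{v}|\cos(\theta) \\
(\mathbf{u} - \mathbf{v}) \cdot (\mathbf{u} - \mathbf{v}) &= \mathbf{u} \cdot \mathbf{u} + \mathbf{v} \cdot \mathbf{v} - 2|\mathbf{u}||\mathbf{v}|\cos(\theta) \\
\mathbf{u} \cdot (\mathbf{u} - \mathbf{v}) - \mathbf{v} \cdot (\mathbf{u} - \mathbf{v}) &= \mathbf{u} \cdot \mathbf{u} + \mathbf{v} \cdot \mathbf{v} - 2|\mathbf{u}||\mathbf{v}|\cos(\theta) \\
\mathbf{u} \cdot \mathbf{u} - 2\mathbf{u} \cdot \mathbf{v} + \mathbf{v} \cdot \mathbf{v} &= \mathbf{u} \cdot \mathbf{u} + \mathbf{v} \cdot \mathbf{v} - 2|\mathbf{u}||\mathbf{v}|\cos(\theta) \\
-2\mathbf{u} \cdot \mathbf{v} &= -2|\mathbf{u}||\mathbf{v}|\cos(\theta) \\
\mathbf{u} \cdot \mathbf{v} &= |\mathbf{u}||\mathbf{v}|\cos(\theta).
\end{align*}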

To summarize, we have the important relationship

u · v = u1 v1 + u2 v2 + . . . + un vn = |u||v| cos(θ). (9.3.1)

It is sometimes useful to think of Equation (9.3.1) as giving us an expression
for the angle between two vectors:

\[ \theta = \cos^{-1}\left( \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}||\mathbf{v}|} \right). \]

The real beauty of this expression is this: the dot product is a very simple
algebraic operation to perform yet it provides us with important geometric
information — namely the angle between the vectors — that would be difficult
to determine otherwise.
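The angle formula can be turned into a small function. The sketch below is one possible Python version; the name `angle_between` is an illustrative choice, and the clamping step is a numerical safeguard added here (rounding error can push the quotient slightly outside [−1, 1]), not something the formula itself requires. The sample vectors are hypothetical.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def angle_between(u, v):
    """Angle theta in [0, pi] with cos(theta) = (u . v) / (|u||v|)."""
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    if nu == 0 or nv == 0:
        raise ValueError("the angle is undefined for the zero vector")
    c = dot(u, v) / (nu * nv)
    c = max(-1.0, min(1.0, c))   # guard against rounding error
    return math.acos(c)

theta = angle_between((1, 0), (1, 1))
print(round(math.degrees(theta), 1))   # 45.0
```

Because `math.acos` returns values between 0 and π, the result is always the smaller of the two angles formed by the vectors.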

Activity 9.3.3. Determine each of the following.

a. The length of the vector u = ⟨1, 2, −3⟩ using the dot product.

b. The angle between the vectors u = ⟨1, 2⟩ and v = ⟨4, −1⟩ to the nearest
tenth of a degree.

c. The angle between the vectors y = ⟨1, 2, −3⟩ and z = ⟨−2, 1, 1⟩ to the
nearest tenth of a degree.

d. If the angle between the vectors u and v is a right angle, what does the
expression u · v = |u||v| cos(θ) say about their dot product?

e. If the angle between the vectors u and v is acute—that is, less than
π/2—what does the expression u · v = |u||v| cos(θ) say about their dot
product?

f. If the angle between the vectors u and v is obtuse—that is, greater than
π/2—what does the expression u · v = |u||v| cos(θ) say about their dot
product?

9.3.3 The Dot Product and Orthogonality


When the angle between two vectors is a right angle, it is frequently the case
that something important is happening. In this case, we say the vectors are
orthogonal. For instance, orthogonality often plays a role in optimization prob-
lems; to determine the shortest path from a point in R3 to a given plane, we
move along a line orthogonal to the plane.
As Activity 9.3.3 indicates, the dot product provides a simple means to
determine whether two vectors are orthogonal to one another. In this case,
u · v = |u||v| cos(π/2) = 0, so we make the following important observation.

The dot product and orthogonality.


Two vectors u and v in R^n are orthogonal to each other if and only if u · v = 0.

More generally, the sign of the dot product gives us useful information
about the relative orientation of the vectors. If we remember that

cos(θ) > 0 if θ is an acute angle,
cos(θ) = 0 if θ is a right angle, and
cos(θ) < 0 if θ is an obtuse angle,

we see that for nonzero vectors u and v,

u · v > 0 if θ is an acute angle,
u · v = 0 if θ is a right angle, and
u · v < 0 if θ is an obtuse angle.

This is illustrated in Figure 9.3.6.

Figure 9.3.6: The orientation of vectors: u · v > 0 (left), u · v = 0 (center),
u · v < 0 (right).
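The sign test illustrated in Figure 9.3.6 needs no trigonometry at all. Here is a minimal Python sketch, assuming nonzero vectors with exact (integer) components; the function name `classify_angle` and the sample vectors are illustrative.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def classify_angle(u, v):
    """Classify the angle between nonzero vectors by the sign of u . v."""
    d = dot(u, v)
    if d > 0:
        return "acute"
    if d < 0:
        return "obtuse"
    return "right"

print(classify_angle((1, 0), (1, 1)))    # acute
print(classify_angle((1, 0), (0, 1)))    # right
print(classify_angle((1, 0), (-1, 1)))   # obtuse
```

With floating-point components, an exact comparison with zero is fragile, so a small tolerance would be a sensible addition.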



9.3.4 Work, Force, and Displacement


In physics, work is a measure of the energy required to apply a force to an object
through a displacement. For instance, Figure 9.3.7 shows a force F displacing
an object from point A to point B. The displacement is then represented by
the vector AB.

Figure 9.3.7: A force F displacing an object.

It turns out that the work required to displace the object is


\[ W = \mathbf{F} \cdot \overrightarrow{AB} = |\mathbf{F}|\,|\overrightarrow{AB}| \cos(\theta). \]
This means that the work depends on the force only through |F| cos(θ), the
component of the force parallel to the displacement. Consequently, if we are given two vectors
u and v, we would like to write u as a sum of two vectors, one of which is
parallel to v and one of which is orthogonal to v. We take up this task after
the next activity.
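To see the two expressions for work agree, here is a small Python sketch. The force ⟨3, 4⟩ (in pounds) and the points A = (1, 1) and B = (6, 1) (in feet) are hypothetical values chosen for illustration; they do not come from the text.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def magnitude(v):
    return math.sqrt(dot(v, v))

F = (3.0, 4.0)                       # hypothetical force, in pounds
A, B = (1.0, 1.0), (6.0, 1.0)        # hypothetical endpoints, in feet
AB = tuple(b - a for a, b in zip(A, B))   # displacement vector (5, 0)

# Work as a dot product.
work_dot = dot(F, AB)

# Work as |F||AB|cos(theta), with theta measured from AB to F.
theta = math.atan2(F[1], F[0]) - math.atan2(AB[1], AB[0])
work_trig = magnitude(F) * magnitude(AB) * math.cos(theta)

print(work_dot, round(work_trig, 9))   # 15.0 15.0 (foot-pounds)
```

Only the horizontal component of this force (3 pounds) contributes, because the displacement is horizontal.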
Activity 9.3.4. Determine the work done by a 25-pound force acting at a 30°
angle to the direction of the object’s motion, if the object is pulled 10 feet. In
addition, is more work or less work done if the angle to the direction of the
object’s motion is 60°?

9.3.5 Projections

Figure 9.3.8: Left: proj_v u. Right: proj_v u when θ > π/2.


Suppose we are given two vectors u and v as shown at left in Figure 9.3.8.
Motivated by our discussion of work, we would like to write u as a sum of two
vectors, one of which is parallel to v and one of which is orthogonal to v. That
is, we would like to write

u = proj_v u + proj_⊥v u, (9.3.2)

where proj_v u is parallel to v and proj_⊥v u is orthogonal to v. We call the
vector proj_v u the projection of u onto v. Note that, as the diagram at right
in Figure 9.3.8 illustrates, it is also possible to create a projection even if the
angle between the vectors u and v exceeds π/2.
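To find the vector proj_v u, we will dot both sides of Equation (9.3.2) with
the vector v, to find that

u · v = (proj_v u + proj_⊥v u) · v
      = (proj_v u) · v + (proj_⊥v u) · v
      = (proj_v u) · v.

Notice that (proj_⊥v u) · v = 0 since proj_⊥v u is orthogonal to v. Also,
proj_v u must be a scalar multiple of v since it is parallel to v, so we will write
proj_v u = sv. It follows that

u · v = (proj_v u) · v = sv · v,

which means (since v is nonzero, so that v · v ≠ 0) that

\[ s = \frac{\mathbf{u} \cdot \mathbf{v}}{\mathbf{v} \cdot \mathbf{v}} \]

and hence

\[ \operatorname{proj}_{\mathbf{v}} \mathbf{u} = \frac{\mathbf{u} \cdot \mathbf{v}}{\mathbf{v} \cdot \mathbf{v}}\, \mathbf{v} = \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{v}|^2}\, \mathbf{v}. \]

It is sometimes useful to write proj_v u as a scalar times a unit vector in the
direction of v. We call this scalar the component of u along v and denote it
as comp_v u. We therefore have

\[ \operatorname{proj}_{\mathbf{v}} \mathbf{u} = \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{v}|^2}\, \mathbf{v} = \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{v}|}\, \frac{\mathbf{v}}{|\mathbf{v}|} = \operatorname{comp}_{\mathbf{v}} \mathbf{u}\, \frac{\mathbf{v}}{|\mathbf{v}|}, \]

so that

\[ \operatorname{comp}_{\mathbf{v}} \mathbf{u} = \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{v}|}. \]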

The dot product and projections.


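Let u and v be vectors in R^n, with v nonzero. The component of u in the
direction of v is the scalar

\[ \operatorname{comp}_{\mathbf{v}} \mathbf{u} = \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{v}|}, \]

and the projection of u onto v is the vector

\[ \operatorname{proj}_{\mathbf{v}} \mathbf{u} = \frac{\mathbf{u} \cdot \mathbf{v}}{\mathbf{v} \cdot \mathbf{v}}\, \mathbf{v}. \]

Moreover, since
u = proj_v u + proj_⊥v u,
it follows that
proj_⊥v u = u − proj_v u.
This shows that once we have computed proj_v u, we can find proj_⊥v u simply
by calculating the difference of two known vectors.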
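The formulas for the component, the projection, and the orthogonal part can be sketched in Python as follows. The function names `comp`, `proj`, and `proj_perp` and the sample vectors u = ⟨1, 2, 3⟩ and v = ⟨1, 1, 0⟩ are illustrative assumptions; v is assumed to be nonzero.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def comp(u, v):
    """Scalar component of u along v: (u . v) / |v|."""
    return dot(u, v) / math.sqrt(dot(v, v))

def proj(u, v):
    """Vector projection of u onto v: ((u . v) / (v . v)) v."""
    s = dot(u, v) / dot(v, v)
    return tuple(s * x for x in v)

def proj_perp(u, v):
    """The part of u orthogonal to v: u minus its projection onto v."""
    return tuple(x - y for x, y in zip(u, proj(u, v)))

u, v = (1, 2, 3), (1, 1, 0)      # hypothetical example vectors
p, q = proj(u, v), proj_perp(u, v)
print(p)            # (1.5, 1.5, 0.0)
print(q)            # (-0.5, 0.5, 3.0)
print(dot(q, v))    # 0.0, confirming q is orthogonal to v
print(comp(u, v))   # 3 / sqrt(2)
```

Adding `p` and `q` componentwise returns the original vector u, which is exactly the decomposition in Equation (9.3.2).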

Activity 9.3.5. Let u = ⟨2, 6⟩.

a. Let v = ⟨4, −8⟩. Find comp_v u, proj_v u and proj_⊥v u, and draw a picture
to illustrate. Finally, express u as the sum of two vectors where one is
parallel to v and the other is perpendicular to v.

b. Now let v = ⟨−2, 4⟩. Without doing any calculations, find proj_v u.
Explain your reasoning. (Hint: Refer to the picture you drew in part (a).)

c. Find a vector w not parallel to z = ⟨3, 4⟩ such that proj_z w has length
10. Note that there are infinitely many different answers.

9.3.6 Summary

• The dot product of two vectors in R^n, u = ⟨u1, u2, . . . , un⟩ and
v = ⟨v1, v2, . . . , vn⟩, is the scalar

u · v = u1 v1 + u2 v2 + · · · + un vn.

• The dot product is related to the length of a vector since u · u = |u|^2.

• The dot product provides us with information about the angle between
the vectors since
u · v = |u| |v| cos(θ),
where θ is the angle between u and v.

• Two vectors are orthogonal if the angle between them is π/2. In terms
of the dot product, the vectors u and v are orthogonal if and only if
u · v = 0.

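• The projection of a vector u in R^n onto a nonzero vector v in R^n is the vector

\[ \operatorname{proj}_{\mathbf{v}} \mathbf{u} = \frac{\mathbf{u} \cdot \mathbf{v}}{\mathbf{v} \cdot \mathbf{v}}\, \mathbf{v}. \]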

Exercises
1. Find a · b if
a = ⟨−1, −2, 2⟩ and b = ⟨1, 0, −2⟩
a · b =
Is the angle between the vectors "acute", "obtuse" or "right"?
2. Determine if the pairs of vectors below are "parallel", "orthogonal", or
"neither".
a = ⟨0, 5, −4⟩ and b = ⟨0, 15, 75/4⟩ are
a = ⟨0, 5, −4⟩ and b = ⟨0, 15, −12⟩ are
a = ⟨0, 5, −4⟩ and b = ⟨0, 10, −7⟩ are
3. Perform the following operations on the vectors u = ⟨2, 2, 0⟩,
v = ⟨−5, 0, −2⟩, and w = ⟨−2, 4, 4⟩.
u · w =
(u · v)u =
((w · w)u) · u =
u · v + v · w =
4. Find a · b if |a| = 8, |b| = 3, and the angle between a and b is π/9 radians.
a · b =
5. What is the angle in radians between the vectors
a = (8, 9, −3) and
b = (−1, −4, 0)?
Angle:
(radians)
6. Find a · b if |a| = 10, |b| = 6, and the angle between a and b is π/6 radians.
a · b =
7. A constant force F = −8i − 6j − 6k moves an object along a straight line
from point (7, −1, −4) to point (8, 3, −10).
Find the work done if the distance is measured in meters and the magnitude
of the force is measured in newtons.
Work: Nm
8. A woman exerts a horizontal force of 3 pounds on a box as she pushes it
up a ramp that is 9 feet long and inclined at an angle of 30 degrees above the
horizontal.
Find the work done on the box.
Work: ft-lb
9. If Yoda says to Luke Skywalker, “The Force be with you,” then the dot
product of the Force and Luke should be:

positive
zero
any real number
negative

10. Find the angle between the diagonal of a cube of side length 10 and the
diagonal of one of its faces. The angle should be measured in radians.
11. Let v = ⟨−2, 5⟩ in R^2, and let y = ⟨0, 3, −2⟩ in R^3.
a. Is ⟨2, −1⟩ perpendicular to v? Why or why not?
b. Find a unit vector u in R^2 such that u is perpendicular to v. How many
such vectors are there? Justify your answers.
c. Is ⟨2, −1, −2⟩ perpendicular to y? Why or why not?
d. Find a unit vector w in R^3 such that w is perpendicular to y. How many
such vectors are there? Justify your answers.
e. Let z = ⟨2, 1, 0⟩. Find a unit vector r in R^3 such that r is perpendicular
to both y and z. How many such vectors are there? Explain your process.

12. Consider the triangle in R^3 given by P = (3, 2, −1), Q = (1, −2, 4), and
R = (4, 4, 0).
a. Find the measure of each of the three angles in the triangle, accurate to
0.01 degrees.
b. Choose two sides of the triangle, and call the vectors that form the sides
(emanating from a common point) a and b.
i. Compute proj_b a and proj_⊥b a.
ii. Explain why proj_⊥b a can be considered a height of triangle PQR.
iii. Find the area of the given triangle.

13. Let u and v be vectors in R^5 with u · v = −1, |u| = 2, |v| = 3. Use the
properties of the dot product to find each of the following.
a. u · 2v
b. v · v
c. (u + v) · v
d. (2u + 4v) · (u − 7v)
e. |u||v| cos(θ), where θ is the angle between u and v
f. θ
14. One of the properties of the dot product is that (u + v) · w = (u · w) +
(v · w). That is, the dot product distributes over vector addition on the right.
Here we investigate whether the dot product distributes over vector addition
on the left.
a. Let u = ⟨1, 2, −1⟩, v = ⟨4, −3, 6⟩, and w = ⟨4, 7, 2⟩. Calculate
u · (v + w) and (u · v) + (u · w).
What do you notice?
b. Use the properties of the dot product to show that in general
x · (y + z) = (x · y) + (x · z)
for any vectors x, y, and z in R^n.
15. When running a sprint, the racers may be aided or slowed by the wind.
The wind assistance is a measure of the wind speed that is helping push the
runners down the track. It is much easier to run a very fast race if the wind
is blowing hard in the direction of the race. So that world records aren’t
dependent on the weather conditions, times are only recorded as record times
if the wind aiding the runners is less than or equal to 2 meters per second.
Wind speed for a race is recorded by a wind gauge that is set up close to
the track. It is important to note, however, that weather is not always as
cooperative as we might like. The wind does not always blow exactly in the
direction of the track, so the gauge must account for the angle the wind makes
with the track. Suppose a 4 mile per hour wind is blowing to aid runners by
making a 38° angle with the race track. Determine if any times set during such
a race would qualify as records.
16.
Molecular geometry is the geometry determined by arrangements of atoms in
molecules. Molecular geometry includes measurements like bond angle, bond
length, and torsional angles. These attributes influence several properties of
molecules, such as reactivity, color, and polarity.

Figure 9.3.9: A methane molecule.

As an example of the molecular geometry of a molecule, consider the
methane (CH4) molecule, as illustrated in Figure 9.3.9. According to the Valence
Shell Electron Pair Repulsion (VSEPR) model, atoms bonded to a single central
atom arrange themselves so that they are as far apart as possible. This means
that the hydrogen atoms in the methane molecule arrange themselves at the
vertices of a regular tetrahedron. The bond angle for methane is the angle
determined by two consecutive hydrogen atoms and the central carbon atom. To
determine the bond angle for methane, we can place the center carbon atom
at the point (1/2, 1/2, 1/2) and the hydrogen atoms at the points (0, 0, 0), (1, 1, 0),
(1, 0, 1), and (0, 1, 1). Find the bond angle for methane to the nearest tenth of
a degree.

9.4 The Cross Product

Motivating Questions
• How and when is the cross product of two vectors defined?
• What geometric information does the cross product provide?

The last two sections have introduced some basic algebraic operations on
vectors—addition, scalar multiplication, and the dot product—with useful geo-
metric interpretations. In this section, we will meet a final algebraic operation,
the cross product, which again conveys important geometric information.
To begin, we must emphasize that the cross product is only defined for
vectors u and v in R3 . Also, remember that we use a right-hand coordinate
system, as described in Section 9.1. In particular, recall that the vectors i, j,
and k are oriented as shown below in Figure 9.4.1. Earlier, we noticed that if
we point the index finger of our right hand in the direction of i and our middle
finger in the direction of j, then our thumb points in the direction of k.

Figure 9.4.1: Basis vectors i, j, and k.

Preview Activity 9.4.1. The cross product of two vectors, u and v, will
itself be a vector denoted u × v. The direction of u × v is determined by the
right-hand rule: if we point the index finger of our right hand in the direction
of u and our middle finger in the direction of v, then our thumb points in the
direction of u × v.
a. We begin by defining the cross products using the vectors i, j, and k.
Referring to Figure 9.4.1, explain why i, j, k in that order form a right-
hand system. We then define i × j to be k — that is i × j = k.
b. Now explain why i, k, and −j in that order form a right-hand system.
We then define i × k to be −j — that is i × k = −j.
c. Continuing in this way, complete the missing entries in Table 9.4.2.

i × j = k       i × k = −j      j × k =

j × i =         k × i =         k × j =

Table 9.4.2: Table of cross products involving i, j, and k.


9.4. THE CROSS PRODUCT 45

d. Up to this point, the products you have seen, such as the product of
real numbers and the dot product of vectors, have been commutative,
meaning that the product does not depend on the order of the terms.
For instance, 2 · 5 = 5 · 2. The table above suggests, however, that
the cross product is anti-commutative: for any vectors u and v in R3 ,
u × v = −v × u. If we consider the case when u = v, this shows that
v × v = −(v × v). What does this tell us about v × v; in particular,
what vector is unchanged by scalar multiplication by −1?
e. It is not difficult to show that the cross product interacts with scalar
multiplication and vector addition as one would expect: that is

(cu) × v = c(u × v)
(u + v) × w = (u × w) + (v × w)

We can combine these properties to make cross product calculations a
bit easier. For example,

(2i + j) × k = (2i × k) + (j × k)
             = 2(i × k) + (j × k)
             = −2j + i.

Using these properties along with Table 9.4.2, find the cross product u×v
if u = 2i + 3j and v = −i + k.
f. Verify that the cross product u×v you just found in part (e) is orthogonal
to both u and v.
g. Consider the vectors u and v in the xy-plane as shown below in Fig-
ure 9.4.3.


Figure 9.4.3: Two vectors in the xy-plane

Explain why u = |u|i and v = |v| cos(θ)i + |v| sin(θ)j. Then compute the
length |u × v|.

9.4.1 Computing the cross product


As we have seen in Preview Activity 9.4.1, the cross product u × v is defined
for two vectors u and v in R3 and produces another vector in R3 . Using the
right-hand rule, we saw that

i × j = k       i × k = −j      j × k = i
j × i = −k      k × i = j       k × j = −i

If, in addition, we assume the cross product behaves like we think a prod-
uct should (e.g., the cross product distributes over vector addition), we can
compute the cross product in terms of the components of general vectors to
find a formula for the cross product. Doing so we see that

u × v = (u1 i + u2 j + u3 k) × (v1 i + v2 j + v3 k)
= u1 i × (v1 i + v2 j + v3 k) + u2 j × (v1 i + v2 j + v3 k)
+ u3 k × (v1 i + v2 j + v3 k)
= u1 v1 i × i + u1 v2 i × j + u1 v3 i × k + u2 v1 j × i + u2 v2 j × j
+ u2 v3 j × k + u3 v1 k × i + u3 v2 k × j + u3 v3 k × k
= u1 v2 k − u1 v3 j − u2 v1 k + u2 v3 i + u3 v1 j − u3 v2 i
= (u2 v3 − u3 v2 )i − (u1 v3 − u3 v1 )j + (u1 v2 − u2 v1 )k.

(Like the dot product, the cross product arises in physical applications, e.g.,
torque, but it is more convenient mathematically to begin from an algebraic
perspective.)
The previous calculations lead us to define the cross product of vectors in
R^3 as follows.

Definition 9.4.4. The cross product u × v of vectors u = u1 i + u2 j + u3 k
and v = v1 i + v2 j + v3 k in R^3 is the vector

(u2 v3 − u3 v2)i − (u1 v3 − u3 v1)j + (u1 v2 − u2 v1)k. (9.4.1)

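At first, this may look intimidating and difficult to remember. However, if
we rewrite the expression in Equation (9.4.1) using determinants, important
structure emerges. The determinant of a 2 × 2 matrix is

\[ \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc. \]

It follows that we can rewrite Equation (9.4.1) in the form

\[ \mathbf{u} \times \mathbf{v} = \begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix} \mathbf{i} - \begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix} \mathbf{j} + \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix} \mathbf{k}. \]

For those familiar with the determinant of a 3 × 3 matrix, we write the
mnemonic as

\[ \mathbf{u} \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}. \]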
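Formula (9.4.1) can be sketched in Python by computing the three 2 × 2 determinants directly. The helper names `det2` and `cross` are illustrative; the checks use the basis vectors and the example (2i + j) × k = −2j + i worked out in Preview Activity 9.4.1.

```python
def det2(a, b, c, d):
    """Determinant of the 2 x 2 matrix with rows (a, b) and (c, d)."""
    return a * d - b * c

def cross(u, v):
    """Cross product of two vectors in R^3, following formula (9.4.1)."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (det2(u2, u3, v2, v3),
            -det2(u1, u3, v1, v3),
            det2(u1, u2, v1, v2))

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(cross(i, j))            # (0, 0, 1), that is, k
print(cross(i, k))            # (0, -1, 0), that is, -j
print(cross((2, 1, 0), k))    # (1, -2, 0), that is, i - 2j
```

Note the minus sign on the middle component; dropping it is a common source of errors when the formula is applied by hand.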

Activity 9.4.2. Suppose u = ⟨0, 1, 3⟩ and v = ⟨2, −1, 0⟩. Use the formula
(9.4.1) for the following.

a. Find the cross product u × v.

b. Evaluate the dot products u · (u × v) and v · (u × v). What does this tell
you about the geometric relationship among u, v, and u × v?

c. Find the cross product v × i.


d. Multiplication of real numbers is associative, which means, for instance,
that (2 · 5) · 3 = 2 · (5 · 3). Is it true that the cross product of vectors is
associative? For instance, is it true that (u × v) × i = u × (v × i)?

e. Find the cross product u × u.

The cross product satisfies the following properties, some of which were
illustrated in Preview Activity 9.4.1 and may be easily verified from the defi-
nition (9.4.1).
Properties of the cross product.
Let u, v, and w be vectors in R3 , and let c be a scalar. Then
1. u × v = −(v × u)

2. (u + v) × w = (u × w) + (v × w)
3. (cu) × v = c(u × v) = u × (cv)
4. u × v = 0 if u and v are parallel.
5. The cross product is not associative; that is, in general

(u × v) × w ≠ u × (v × w).
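The properties above can be spot-checked numerically. The sketch below is only a check on particular vectors (chosen by us with integer entries so that every comparison is exact), not a proof; the helper names `add` and `scale` are our own.

```python
def cross(u, v):
    """Cross product of 3-vectors u and v, following Equation (9.4.1)."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)

def add(u, v):
    """Componentwise vector sum."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, u):
    """Scalar multiple c*u."""
    return tuple(c * a for a in u)

# Illustrative integer vectors and scalar.
u, v, w = (1, 2, 3), (4, 5, 6), (2, -1, 0)
i, j = (1, 0, 0), (0, 1, 0)
c = 3

anticommutative = cross(u, v) == scale(-1, cross(v, u))
distributive = cross(add(u, v), w) == add(cross(u, w), cross(v, w))
scalars = cross(scale(c, u), v) == scale(c, cross(u, v)) == cross(u, scale(c, v))
parallel_gives_zero = cross(u, scale(2, u)) == (0, 0, 0)

# A counterexample to associativity: (i x i) x j versus i x (i x j).
left = cross(cross(i, i), j)     # (0, 0, 0)
right = cross(i, cross(i, j))    # (0, -1, 0), that is, -j

print(anticommutative, distributive, scalars, parallel_gives_zero, left != right)
```

A single counterexample such as the last pair is enough to show that the cross product is not associative.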

Just as we found for the dot product, the cross product provides us with
useful geometric information. In particular, both the length and direction of
the cross product u × v encode information about the geometric relationship
between u and v.

9.4.2 The Length of u × v


We may ask whether the length |u × v| has any relationship to the lengths of
u and v. To investigate, we will compute the square of the length, |u × v|²,
and denote by θ the angle between u and v, as in Section 9.3. Doing so, we find
through some significant algebra that

\begin{align*}
|\mathbf{u} \times \mathbf{v}|^2 &= (u_2 v_3 - u_3 v_2)^2 + (u_1 v_3 - u_3 v_1)^2 + (u_1 v_2 - u_2 v_1)^2 \\
&= u_2^2 v_3^2 - 2u_2 u_3 v_2 v_3 + u_3^2 v_2^2 + u_1^2 v_3^2 - 2u_1 u_3 v_1 v_3 + u_3^2 v_1^2 \\
&\quad + u_1^2 v_2^2 - 2u_1 u_2 v_1 v_2 + u_2^2 v_1^2 \\
&= u_1^2 (v_2^2 + v_3^2) + u_2^2 (v_1^2 + v_3^2) + u_3^2 (v_1^2 + v_2^2) \\
&\quad - 2(u_1 u_2 v_1 v_2 + u_1 u_3 v_1 v_3 + u_2 u_3 v_2 v_3) \\
&= u_1^2 (v_1^2 + v_2^2 + v_3^2) + u_2^2 (v_1^2 + v_2^2 + v_3^2) + u_3^2 (v_1^2 + v_2^2 + v_3^2) \\
&\quad - \left(u_1^2 v_1^2 + u_2^2 v_2^2 + u_3^2 v_3^2 + 2(u_1 u_2 v_1 v_2 + u_1 u_3 v_1 v_3 + u_2 u_3 v_2 v_3)\right) \\
&= (u_1^2 + u_2^2 + u_3^2)(v_1^2 + v_2^2 + v_3^2) - (u_1 v_1 + u_2 v_2 + u_3 v_3)^2 \\
&= |\mathbf{u}|^2 |\mathbf{v}|^2 - (\mathbf{u} \cdot \mathbf{v})^2 \\
&= |\mathbf{u}|^2 |\mathbf{v}|^2 (1 - \cos^2(\theta)) \\
&= |\mathbf{u}|^2 |\mathbf{v}|^2 \sin^2(\theta).
\end{align*}

Therefore, we have found |u × v|² = |u|² |v|² sin²(θ). Since 0 ≤ θ ≤ π, we
know that sin(θ) ≥ 0, so taking square roots gives

|u × v| = |u||v| sin(θ). (9.4.2)
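The key step in the derivation, |u × v|² = |u|²|v|² − (u · v)², and Equation (9.4.2) itself can be checked on a concrete pair of vectors. The following is a minimal sketch under our own choice of sample vectors; the helper names are assumptions made for illustration.

```python
import math

def cross(u, v):
    """Cross product of 3-vectors u and v, following Equation (9.4.1)."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)

def dot(u, v):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Length |u| of a vector."""
    return math.sqrt(dot(u, u))

# Sample vectors chosen only for illustration.
u, v = (1, 2, 3), (4, 5, 6)
n = cross(u, v)

lhs = dot(n, n)                                  # |u x v|^2
rhs = dot(u, u) * dot(v, v) - dot(u, v) ** 2     # |u|^2 |v|^2 - (u . v)^2
print(lhs, rhs)                                  # 54 54

# Angle between u and v from the dot product, as in Section 9.3.
theta = math.acos(dot(u, v) / (norm(u) * norm(v)))
print(math.isclose(norm(n), norm(u) * norm(v) * math.sin(theta)))   # True
```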



Note that the fourth property stated above says that u × v = 0 if u and v
are parallel. This is reflected in Equation (9.4.2) since sin(θ) = 0 if u and v
are parallel, which implies that u × v = 0.
Equation (9.4.2) also has a geometric implication. Consider the parallelo-
gram formed by two vectors u and v, as shown in Figure 9.4.5.

Figure 9.4.5: The parallelogram formed by u and v

Remember that the area of a parallelogram is the product of its base and
height. As shown in the figure, we may consider the base of the parallelogram
to be |u| and the height to be |v| sin(θ). This means that the area of the
parallelogram formed by u and v is

|u||v| sin(θ) = |u × v|.

This leads to the following interesting fact.


The length of the cross product.
The length, |u × v|, of the cross product of vectors u and v is the area
of the parallelogram determined by u and v.

Note also that if u = u1 i + u2 j + 0k and v = v1 i + v2 j + 0k are vectors in
the xy-plane, then Equation (9.4.1) shows that the area of the parallelogram
determined by u and v is |u × v| = |u1 v2 − u2 v1 |, which is the absolute
value of the 2 × 2 determinant
\( \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix} \).
So the absolute value of the determinant of a 2 × 2 matrix is also the area
of a parallelogram.
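To see the area interpretation in action, here is a short sketch with a parallelogram whose area is easy to find by hand: base 3 along the x-axis and height 2. The vectors and the function name `parallelogram_area` are our own illustrative choices.

```python
import math

def cross(u, v):
    """Cross product of 3-vectors u and v, following Equation (9.4.1)."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)

def parallelogram_area(u, v):
    """Area of the parallelogram determined by u and v, computed as |u x v|."""
    n = cross(u, v)
    return math.sqrt(sum(c * c for c in n))

# Two vectors in the xy-plane: base 3 along the x-axis, height 2.
u, v = (3, 0, 0), (1, 2, 0)
area = parallelogram_area(u, v)
det_2x2 = abs(u[0] * v[1] - u[1] * v[0])    # |u1 v2 - u2 v1|
print(area, det_2x2)                        # 6.0 6
```

Both computations agree with the product of base and height.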

Activity 9.4.3.

a. Find the area of the parallelogram formed by the vectors u = ⟨1, 3, −2⟩
and v = ⟨3, 0, 1⟩.

b. Find the area of the parallelogram in R3 whose vertices are (1, 0, 1),
(0, 0, 1), (2, 1, 0), and (1, 1, 0). (Hint: It might be helpful to draw a
picture to see how the vertices are arranged so you can determine which
vectors you might use.)

9.4.3 The Direction of u × v


Now that we understand the length of u × v, we will investigate its direction.
Remember from Preview Activity 9.4.1 that cross products involving the vec-
tors i, j, and k resulted in vectors that are orthogonal to the two terms. We
will see that this holds more generally.
We begin by computing u · (u × v), and see that

u · (u × v) = u1 (u2 v3 − u3 v2 ) − u2 (u1 v3 − u3 v1 ) + u3 (u1 v2 − u2 v1 )



= u1 u2 v3 − u1 u3 v2 − u2 u1 v3 + u2 u3 v1 + u3 u1 v2 − u3 u2 v1
= 0.

To summarize, we have u · (u × v) = 0, which implies that u is orthogonal
to u × v. In the same way, we can show that v is orthogonal to u × v. The
net effect is that u × v is a vector that is perpendicular to both u and v, and
hence u × v is perpendicular to the plane determined by u and v. Moreover,
the direction of u × v is determined by applying the right-hand rule to u and
v, as we saw in Preview Activity 9.4.1. In light of our earlier work that showed
|u||v| sin(θ) = |u × v|, we may now express u × v in the following different
way.
The cross product as normal vector.
Suppose that u and v are not parallel and that n is the unit vector
perpendicular to the plane containing u and v determined by the right-
hand rule. Then
u × v = |u||v| sin(θ) n.
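One way to picture this statement computationally: dividing u × v by its length yields the unit normal n, and the length that was divided out is |u||v| sin(θ). The sketch below uses sample vectors of our own choosing (they happen to be perpendicular, so sin(θ) = 1); the name `unit_normal` is hypothetical.

```python
import math

def cross(u, v):
    """Cross product of 3-vectors u and v, following Equation (9.4.1)."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)

def dot(u, v):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(u, v))

def unit_normal(u, v):
    """Unit vector in the direction of u x v; u and v must not be parallel."""
    n = cross(u, v)
    length = math.sqrt(dot(n, n))
    if length == 0:
        raise ValueError("u and v are parallel, so no normal direction is determined")
    return tuple(c / length for c in n)

# Sample vectors chosen only for illustration; here u x v = (-6, 6, -3).
u, v = (1, 2, 2), (2, 1, -2)
n = unit_normal(u, v)

print(math.isclose(dot(n, n), 1.0))                  # n has length 1
print(math.isclose(dot(u, n), 0.0, abs_tol=1e-12))   # n is orthogonal to u
print(math.isclose(dot(v, n), 0.0, abs_tol=1e-12))   # n is orthogonal to v
```

Swapping the arguments, `unit_normal(v, u)`, gives the opposite unit normal, which reflects the right-hand rule.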

There is yet one more geometric implication we may draw from this result.
Suppose u, v, and w are vectors in R3 that are not coplanar and that form a
three-dimensional parallelepiped as shown in Figure 9.4.6.

Figure 9.4.6: The parallelepiped determined by u, v, and w

The volume of the parallelepiped is determined by multiplying A, the area of
the base, by the height h. As we have just seen, the area of the base is
|u × v|. Moreover, the height h = |w| cos(α) where α is the angle between w
and the vector n, which is orthogonal to the plane formed by u and v. Since
n is parallel to u × v, the angle between w and u × v is also α. This shows
that
|(u × v) · w| = |u × v||w| cos(α) = Ah,
and therefore
The cross product and the volume of a parallelepiped.
The volume of the parallelepiped determined by u, v, and w is |(u ×
v) · w|.

As a dot product of two vectors, the quantity (u × v) · w is a scalar and is
called the triple scalar product.
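As a small illustration of the volume formula, the sketch below computes a triple scalar product for a parallelepiped whose base has area 2 and whose height is 3. The vectors and the function name `triple_scalar` are our own assumptions for the example.

```python
def cross(u, v):
    """Cross product of 3-vectors u and v, following Equation (9.4.1)."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)

def dot(u, v):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(u, v))

def triple_scalar(u, v, w):
    """The triple scalar product (u x v) . w."""
    return dot(cross(u, v), w)

# Base spanned by u and v in the xy-plane (area 2); w rises to height 3.
u, v, w = (1, 0, 0), (1, 2, 0), (1, 1, 3)
volume = abs(triple_scalar(u, v, w))
print(volume)                              # 6

# If the third vector also lies in the xy-plane, the solid is flat.
print(triple_scalar(u, v, (2, 2, 0)))      # 0
```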
Activity 9.4.4. Suppose u = ⟨3, 5, −1⟩ and v = ⟨2, −2, 1⟩.
a. Find two unit vectors orthogonal to both u and v.
b. Find the volume of the parallelepiped formed by the vectors u, v, and
w = ⟨3, 3, 1⟩.

c. Find a vector orthogonal to the plane containing the points (0, 1, 2),
(4, 1, 0), and (−2, 2, 2).
d. Given the vectors u and v shown below in Figure 9.4.7, sketch the cross
products u × v and v × u.

Figure 9.4.7: Vectors u and v

e. Do the vectors a = ⟨1, 3, −2⟩, b = ⟨2, 1, −4⟩, and c = ⟨0, 1, 0⟩ in standard
position lie in the same plane? Use the concepts from this section to
explain.

9.4.4 Torque is measured by a cross product


We have seen that the cross product enables us to produce a vector perpen-
dicular to two given vectors, to measure the area of a parallelogram, and to
measure the volume of a parallelepiped. Besides these geometric applications,
the cross product also enables us to describe a physical quantity called torque.
Suppose that we would like to turn a bolt using a wrench as shown in
Figure 9.4.8. If a force F is applied to the wrench and r is the vector from the
position on the wrench at which the force is applied to the center of the bolt, we
define the torque, τ , to be
τ = F × r.

Figure 9.4.8: A force applied to a wrench



When a force is applied to an object, Newton’s Second Law tells us that the
force is equal to the rate of change of the object’s linear momentum. Similarly,
the torque applied to an object is equal to the rate of change of the object’s
angular momentum. In other words, torque will cause the bolt to rotate.
In many industrial applications, bolts are required to be tightened using a
specified torque. Of course, the magnitude of the torque is |τ | = |F × r| =
|F||r| sin(θ). Thus, to produce a larger torque, we can increase either |F| or
|r|, which you may know if you have ever removed lug nuts when changing a
flat tire. The ancient Greek mathematician Archimedes said: “Give me a lever
long enough and a fulcrum on which to place it, and I shall move the world.”
A modern spin on this statement is: “Allow me to make |r| large enough, and
I shall produce a torque large enough to move the world.”
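To make the dependence on |r| concrete, here is a minimal sketch with hypothetical numbers (a 10 N force pushing straight down on a wrench, applied 0.25 m from the bolt). The values are invented for illustration only; r points from where the force is applied to the center of the bolt, following the convention used above.

```python
import math

def cross(u, v):
    """Cross product of 3-vectors u and v, following Equation (9.4.1)."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)

def norm(u):
    """Length |u| of a vector."""
    return math.sqrt(sum(a * a for a in u))

# Hypothetical values: force in newtons, position vector in meters.
F = (0.0, -10.0, 0.0)
r = (-0.25, 0.0, 0.0)

tau = cross(F, r)            # torque vector, tau = F x r as defined in the text
magnitude = norm(tau)        # 2.5 newton-meters

# F and r are perpendicular here, so sin(theta) = 1 and |tau| = |F||r|.
print(math.isclose(magnitude, norm(F) * norm(r)))    # True

# Doubling the length of r doubles the torque for the same force.
r_long = (-0.5, 0.0, 0.0)
print(norm(cross(F, r_long)) / magnitude)            # 2.0
```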

9.4.5 Comparing the dot and cross products


Finally, it is worthwhile to compare and contrast the dot and cross products.
• u · v is a scalar, while u × v is a vector.
• u · v = v · u, while u × v = −(v × u).
• u · v = |u||v| cos(θ), while |u × v| = |u||v| sin(θ).
• u · v = 0 if u and v are perpendicular, while u × v = 0 if u and v are
parallel.

9.4.6 Summary

• The cross product is defined only for vectors in R3 . The cross product of
vectors u = u1 i + u2 j + u3 k and v = v1 i + v2 j + v3 k in R3 is the vector

u × v = (u2 v3 − u3 v2 )i − (u1 v3 − u3 v1 )j + (u1 v2 − u2 v1 )k.

• Geometrically, the cross product is

u × v = |u| |v| sin(θ) n,

where θ is the angle between u and v and n is a unit vector perpendicular
to both u and v as determined by the right-hand rule.
• The cross product of vectors u and v is a vector perpendicular to both
u and v.
• The magnitude |u × v| of the cross product of the vectors u and v gives the
area of the parallelogram determined by u and v. Also, the absolute value
|(u × v) · w| of the triple scalar product gives the volume of the
parallelepiped determined by u, v, and w.

Exercises
1. If a = i + j + 4k and b = i + j + 5k, compute the cross product a × b.
a × b = ___ i + ___ j + ___ k
2. Suppose v · w = 7 and ||v × w|| = 3, and the angle between v and w is θ.
Find
(a) tan θ =
(b) θ =

3. You are looking down at a map. A vector u with |u| = 8 points north
and a vector v with |v| = 2 points northeast. The cross product u × v points:
A) south
B) northwest
C) up
D) down
The magnitude |u × v| =
4. If a = i + 9j + k and b = i + 18j + k, find a unit vector with positive first
coordinate orthogonal to both a and b.
___ i + ___ j + ___ k
5. Sketch the triangle with vertices O, P = (0, 7, 4) and Q = (0, 5, 6) and
compute its area using cross products.
Area=
6. Let A = (−4, −4, 5), B = (−1, −2, 3), and P = (k, k, k). The vector from
A to B is perpendicular to the vector from A to P when k = .
7. Find two unit vectors orthogonal to a = ⟨−3, 3, −2⟩ and b = ⟨−4, −2, −3⟩.
Enter your answer so that the first non-zero coordinate of the first vector
is positive.
First Vector: ⟨ ___ , ___ , ___ ⟩
Second Vector: ⟨ ___ , ___ , ___ ⟩
8. Use the geometric definition of the cross product and the properties of
the cross product to make the following calculations.
(a) ((i + j) × i) × j =
(b) (j + k) × (j × k) =
(c) 3i × (i + j) =
(d) (k + j) × (k − j) =
9. Are the following statements true or false?

(a) If v and w are any two vectors, then ||v + w|| = ||v|| + ||w||.

(b) The value of v · (v × w) is always zero.

(c) For any scalar c and any vector v, we have ||cv|| = c||v||.

(d) (i × j) · k = i · (j × k).

10. A bicycle pedal is pushed straight downwards by a foot with a 31 Newton
force. The shaft of the pedal is 20 cm long. If the shaft is π/6 radians past
horizontal, what is the magnitude of the torque about the point where the shaft
is attached to the bicycle?
Nm
11. Let u = 2i + j and v = i + 2j be vectors in R3 .

a. Without doing any computations, find a unit vector that is orthogonal
to both u and v. What does this tell you about the formula for u × v?

b. Using the properties of the cross product and what you know about cross
products involving the fundamental vectors i and j, compute u × v.

c. Next, use the determinant version of Equation (9.4.1) to compute u × v.
Write one sentence that compares your results in (a), (b), and (c).

d. Find the area of the parallelogram determined by u and v.



12. Let x = ⟨1, 1, 1⟩ and y = ⟨0, 3, −2⟩.


a. Are x and y orthogonal? Are x and y parallel? Clearly explain how you
know, using appropriate vector products.
b. Find a unit vector that is orthogonal to both x and y.
c. Express y as the sum of two vectors: one parallel to x, the other orthog-
onal to x.
d. Determine the area of the parallelogram formed by x and y.

13. Consider the triangle in R3 formed by P (3, 2, −1), Q(1, −2, 4), and
R(4, 4, 0).
a. Find \(\overrightarrow{PQ}\) and \(\overrightarrow{PR}\).
b. Observe that the area of △PQR is half of the area of the parallelogram
formed by \(\overrightarrow{PQ}\) and \(\overrightarrow{PR}\). Hence find the area of △PQR.
c. Find a unit vector that is orthogonal to the plane that contains points
P , Q, and R.
d. Determine the measure of ∠PQR.

14. One of the properties of the cross product is that (u + v) × w =
(u × w) + (v × w). That is, the cross product distributes over vector addition
on the right. Here we investigate whether the cross product distributes over
vector addition on the left.
a. Let u = ⟨1, 2, −1⟩, v = ⟨4, −3, 6⟩, and w = ⟨4, 7, 2⟩. Calculate

u × (v + w) and (u × v) + (u × w).

What do you notice?


b. Use the properties of the cross product to show that in general

x × (y + z) = (x × y) + (x × z)

for any vectors x, y, and z in R3 .

15. Let u = ⟨u1 , u2 , u3 ⟩, v = ⟨v1 , v2 , v3 ⟩, and w = ⟨w1 , w2 , w3 ⟩ be vectors
in R3 . In this exercise we investigate properties of the triple scalar product
(u × v) · w.

a. Show that
\[ (\mathbf{u} \times \mathbf{v}) \cdot \mathbf{w} = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}. \]

b. Show that
\[ \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = - \begin{vmatrix} v_1 & v_2 & v_3 \\ u_1 & u_2 & u_3 \\ w_1 & w_2 & w_3 \end{vmatrix}. \]
Conclude that interchanging the first two rows in a 3 × 3 matrix changes the
sign of the determinant. In general (although we won’t show it here),
interchanging any two rows in a 3 × 3 matrix changes the sign of the
determinant.
c. Use the results of parts (a) and (b) to argue that

(u × v) · w = (w × u) · v = (v × w) · u.

d. Now suppose that u, v, and w do not lie in a plane when they emanate
from a common initial point.
i. Given that the parallelepiped determined by u, v, and w must have
positive volume, what can we say about (u × v) · w?
ii. Now suppose that u, v, and w all lie in the same plane. What value
must (u × v) · w have? Why?
iii. Explain how (i.) and (ii.) show that if u, v, and w all emanate from
the same initial point, then u, v, and w lie in the same plane if and
only if (u × v) · w = 0. This provides a straightforward computational
method for determining when three vectors are coplanar.

9.5 Lines and Planes in Space

Motivating Questions

• How are lines in R3 similar to and different from lines in R2 ?

• What is the role that vectors play in representing equations of lines,
particularly in R3 ?

• How can we think of a plane as a set of points determined by a point and
a vector?

• How do we find the equation of a plane through three given non-collinear
points?

In single variable calculus, we learn that a differentiable function is
locally linear. In other words, if we zoom in on the graph of a differentiable
function at a point, the graph looks like the tangent line to the function at
that point. Linear functions played important roles in single variable calcu-
lus, useful in approximating differentiable functions, in approximating roots of
functions (see Newton’s Method), and approximating solutions to first order
differential equations (see Euler’s Method). In multivariable calculus, we will
soon study curves in space; differentiable curves turn out to be locally linear
as well. In addition, as we study functions of two variables, we will see that
such a function is locally linear at a point if the surface defined by the function
looks like a plane (the tangent plane) as we zoom in on the graph.
Consequently, it is important for us to understand both lines and planes
in space, as these define the linear functions in R2 and R3 . (Recall that a
function is linear if it is a polynomial function whose terms all have degree less
than or equal to 1. For example, x defines a single variable linear function and
x + y a two variable linear function, but xy is not linear since it has degree
two, the sum of the degrees of its factors.) We begin our work by considering
some familiar ideas in R2 but from a new perspective.

Figure 9.5.1: The line through (2, −1) with slope 2/3.

Preview Activity 9.5.1. We are familiar with equations of lines in the plane
in the form y = mx + b, where m is the slope of the line and (0, b) is the
y-intercept. In this activity, we explore a more flexible way of representing
lines that we can use not only in the plane, but in higher dimensions as well.
To begin, consider the line through the point (2, −1) with slope 2/3 as shown
in Figure 9.5.1.

a. Suppose we increase x by 1 from the point (2, −1). How does the y-value
change? What is the point on the line with x-coordinate 3?

b. Suppose we decrease x by 3.25 from the point (2, −1). How does the
y-value change? What is the point on the line with x-coordinate −1.25?

c. Now, suppose we increase x by some arbitrary value 3t from the point
(2, −1). How does the y-value change? What is the point on the line
with x-coordinate 2 + 3t?

d. Observe that the slope of the line is related to any vector whose
y-component divided by the x-component is the slope of the line. For
the line in this exercise, we might use the vector ⟨3, 2⟩, which describes
the direction of the line. Explain why the terminal points of the vectors
r(t), where
r(t) = ⟨2, −1⟩ + ⟨3, 2⟩t,

trace out the graph of the line through the point (2, −1) with slope 2/3.

e. Now we extend this vector approach to R3 and consider a second example.
Let L be the line in R3 through the point (1, 0, 2) in the direction of the
vector ⟨2, −1, 4⟩. Find the coordinates of three distinct points on line L.
Explain your thinking.

f. Find a vector in the form

r(t) = ⟨x0 , y0 , z0 ⟩ + ⟨a, b, c⟩t

whose terminal points trace out the line L that is described in (e). That
is, you should be able to locate any point on the line by determining a
corresponding value of t.

9.5.1 Lines in Space


In two-dimensional space, a non-vertical line is defined to be the set of points
satisfying the equation
y = mx + b,

for some constants m and b. The value of m (the slope) tells us how the
dependent variable changes for every one unit increase in the independent
variable, while the point (0, b) is the y-intercept and anchors the line to a
location on the y-axis. Alternatively, we can think of the slope as being related
to the vector ⟨1, m⟩, which tells us the direction of the line, as shown on the
left in Figure 9.5.2. Thus, we can identify a line in space by fixing a point P
and a direction v, as shown on the right. Since we also have vectors in space
to provide direction, this same idea of a point and a direction determining a
line works in Rn for any n.
Figure 9.5.2: A vector description of a line

Definition 9.5.3. A line in space is the set of terminal points of vectors
emanating from a given point P that are parallel to a fixed vector v.

The fixed vector v in the definition is called a direction vector for the line.
As we saw in Preview Activity 9.5.1, to find an equation for a line through
point P in the direction of vector v, observe that any vector parallel to v will
have the form tv for some scalar t. So, any vector emanating from the point
P in a direction parallel to the vector v will be of the form

\( \overrightarrow{OP} + \mathbf{v}t \) (9.5.1)

for some scalar t (where O is the origin).

Figure 9.5.4: A line in 2-space.

Figure 9.5.4 presents three images of a line in two-space in which we can
identify the vector \(\overrightarrow{OP}\) and the vector tv as in Equation (9.5.1). Here,
\(\overrightarrow{OP}\) is the fixed vector shown in blue, while the direction vector v is the
vector parallel to the vector shown in green (that is, the green vector
represents tv, and the line is traced out by the terminal points of the
magenta vector). In other words, the tips (terminal points) of the magenta
vectors (the vectors of the form \(\overrightarrow{OP} + t\mathbf{v}\)) trace out the line as t changes.
In particular, the terminal points of the vectors of the form in (9.5.1) de-
fine a linear function r in space of the following form, which is valid in any
dimension.

The vector form of a line.
The vector form of a line through the point P in the direction of the
vector v is
r(t) = r0 + tv, (9.5.2)
where r0 is the position vector \(\overrightarrow{OP}\) from the origin to the point P .

Of course, it is common to represent lines in the plane using the
slope-intercept equation y = mx + b. The vector form of the line, described
above, is an alternative way to represent lines that has the following two
advantages. First, in two dimensions, we are able to represent vertical lines,
whose slope m is not defined, using a vertical direction vector, such as
v = ⟨0, 1⟩. Second, this description of lines works in any dimension even
though there is no concept of the slope of a line in more than two dimensions.
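Both advantages can be seen in a few lines of code. The sketch below is only an illustration: the function name `line_point` and the sample points and direction vectors are our own choices. The same function handles a vertical line in the plane and a line in R3.

```python
def line_point(r0, v, t):
    """Terminal point of r(t) = r0 + t*v; works in any dimension."""
    return tuple(p + t * d for p, d in zip(r0, v))

# A vertical line in the plane through (2, -1), with direction <0, 1>.
print(line_point((2, -1), (0, 1), 5))            # (2, 4)

# A line in space through (1, 1, 0), with direction <2, 0, -1>.
print(line_point((1, 1, 0), (2, 0, -1), 3))      # (7, 1, -3)

# Several values of t trace out points along the line.
print([line_point((1, 1, 0), (2, 0, -1), t) for t in (-1, 0, 1)])
```

Nothing in `line_point` refers to a slope, which is why the vertical line causes no difficulty.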

Figure 9.5.5: A line in 3-space.

Activity 9.5.2. Let P1 = (1, 2, −1) and P2 = (−2, 1, −2). Let L be the line
in R3 through P1 and P2 , and note that three snapshots of this line are shown
in Figure 9.5.5.
a. Find a direction vector for the line L.
b. Find a vector equation of L in the form r(t) = r0 + tv.
c. Consider the vector equation s(t) = ⟨−5, 0, −3⟩ + t⟨6, 2, 2⟩. What is the
direction of the line given by s(t)? Is this new line parallel to line L?
d. Do r(t) and s(t) represent the same line, L? Explain.

9.5.2 The Parametric Equations of a Line


The vector form of a line, r(t) = r0 + tv in Equation (9.5.2), describes a line
as the set of terminal points of the vectors r(t). If we write this in terms of
components letting

r(t) = ⟨x(t), y(t), z(t)⟩, r0 = ⟨x0 , y0 , z0 ⟩, and v = ⟨a, b, c⟩,

then we can equate the components on both sides of r(t) = r0 + tv to obtain
the equations

x(t) = x0 + at, y(t) = y0 + bt, and z(t) = z0 + ct,

which describe the coordinates of the points on the line. The variable t repre-
sents an arbitrary scalar and is called a parameter. In particular, we use the
following language.

The parametric equations of a line.
The parametric equations for a line through the point P = (x0 , y0 , z0 )
in the direction of the vector v = ⟨a, b, c⟩ are

x(t) = x0 + at, y(t) = y0 + bt, z(t) = z0 + ct.

Notice that there are many different parametric equations for the same line.
For example, choosing another point P on the line or another direction vector
v produces another set of parametric equations. It is sometimes useful to think
of t as a time parameter and the parametric equations as telling us where we
are on the line at each time. In this way, the parametric equations describe a
particular walk taken along the line; there are, of course, many possible ways
to walk along a line.
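The remark that one line has many parametrizations can be illustrated by solving the parametric equations for t. The following is a minimal sketch under stated assumptions: the function name `parameter_of` and the sample line are our own, the coordinates are integers (so that exact fractions can be used), and the direction vector is assumed to be nonzero.

```python
from fractions import Fraction

def parameter_of(point, r0, v):
    """Return t with r0 + t*v == point, or None if the point is off the line.

    Assumes integer coordinates and a nonzero direction vector v.
    """
    t = None
    for p, p0, d in zip(point, r0, v):
        if d == 0:
            # This coordinate never changes along the line.
            if p != p0:
                return None
        else:
            candidate = Fraction(p - p0, d)
            if t is None:
                t = candidate
            elif candidate != t:
                return None
    return t

# One parametrization: through (1, 1, 0) with direction <2, 0, -1>.
print(parameter_of((7, 1, -3), (1, 1, 0), (2, 0, -1)))    # 3
print(parameter_of((7, 1, 0), (1, 1, 0), (2, 0, -1)))     # None

# Another walk along the same line: start at (7, 1, -3), direction <-4, 0, 2>.
print(parameter_of((1, 1, 0), (7, 1, -3), (-4, 0, 2)))    # 3/2
```

The same point is reached at different "times" under the two parametrizations, which matches the interpretation of t as a time parameter.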

Activity 9.5.3. Let P1 = (1, 2, −1) and P2 = (−2, 1, −2), and let L be the
line in R3 through P1 and P2 , which is the same line as in Activity 9.5.2.

a. Find parametric equations of the line L.

b. Does the point (1, 2, 1) lie on L? If so, what value of t results in this
point?

c. Consider another line, K, whose parametric equations are

x(s) = 11 + 4s, y(s) = 1 − 3s, z(s) = 3 + 2s.

What is the direction of the line K?

d. Do the lines L and K intersect? If so, provide the point of intersection
and the t and s values, respectively, that result in the point. If not,
explain why.

9.5.3 Planes in Space

Now that we have a way of describing lines, we would like to develop a means
of describing planes in three dimensions. We studied the coordinate planes
and planes parallel to them in Section 9.1. Each of those planes had one of
the variables x, y, or z equal to a constant. We can note that any vector in
a plane with x constant is orthogonal to the vector ⟨1, 0, 0⟩, any vector in a
plane with y constant is orthogonal to the vector ⟨0, 1, 0⟩, and any vector in a
plane with z constant is orthogonal to the vector ⟨0, 0, 1⟩. This idea works in
general to define a plane.

Definition 9.5.6. A plane p in space is the set of all terminal points of vectors
emanating from a given point P0 perpendicular to a fixed vector n, as shown
in Figure 9.5.7.
Figure 9.5.7: A point P0 on a plane p with a normal vector n

The definition allows us to find the equation of a plane. Assume that
n = ⟨a, b, c⟩, P0 = (x0 , y0 , z0 ), and that P = (x, y, z) is an arbitrary point on
the plane. Since the vector \(\overrightarrow{P_0P}\) lies in the plane, it must be perpendicular to
n. This means that

\begin{align*}
0 &= \mathbf{n} \cdot \overrightarrow{P_0P} \\
&= \mathbf{n} \cdot \left(\langle x, y, z \rangle - \langle x_0, y_0, z_0 \rangle\right) \\
&= \mathbf{n} \cdot \langle x - x_0, y - y_0, z - z_0 \rangle \\
&= a(x - x_0) + b(y - y_0) + c(z - z_0).
\end{align*}

The fixed vector n perpendicular to the plane is frequently called a normal
vector to the plane. We may now summarize as follows.
Equations of a plane.

• The scalar equation of the plane with normal vector n = ⟨a, b, c⟩
containing the point P0 = (x0 , y0 , z0 ) is

a(x − x0 ) + b(y − y0 ) + c(z − z0 ) = 0. (9.5.3)

• The vector equation of the plane with normal vector n = ⟨a, b, c⟩
containing the points P0 = (x0 , y0 , z0 ) and P = (x, y, z) is

\( \mathbf{n} \cdot \overrightarrow{P_0P} = 0. \) (9.5.4)

We may take the scalar equation of a plane a little further and note that
since
a(x − x0 ) + b(y − y0 ) + c(z − z0 ) = 0,
it equivalently follows that

ax + by + cz = ax0 + by0 + cz0 .

That is, we may write an equation of a plane as ax + by + cz = d, where
d = n · ⟨x0 , y0 , z0 ⟩.
For instance, if we would like to describe the plane passing through the
point P0 = (4, −2, 1) and perpendicular to the vector n = ⟨1, 2, 1⟩, we have

⟨1, 2, 1⟩ · ⟨x, y, z⟩ = ⟨1, 2, 1⟩ · ⟨4, −2, 1⟩

or
x + 2y + z = 1.

Notice that the coefficients of x, y, and z in this description give a vector
perpendicular to the plane. For instance, if we are presented with the plane

−2x + y − 3z = 4,

we know that n = ⟨−2, 1, −3⟩ is a vector perpendicular to the plane.
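The passage from a point and a normal vector to the equation ax + by + cz = d is a single dot product, as the following sketch shows. It reproduces the example above (normal ⟨1, 2, 1⟩ through (4, −2, 1)); the function names `plane_through` and `on_plane` are hypothetical names of ours.

```python
def dot(u, v):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(u, v))

def plane_through(point, normal):
    """Return (a, b, c, d) describing the plane ax + by + cz = d."""
    a, b, c = normal
    return (a, b, c, dot(normal, point))

def on_plane(plane, point):
    """Check whether a point satisfies ax + by + cz = d (exact for integers)."""
    a, b, c, d = plane
    return dot((a, b, c), point) == d

plane = plane_through((4, -2, 1), (1, 2, 1))
print(plane)                          # (1, 2, 1, 1), that is, x + 2y + z = 1
print(on_plane(plane, (4, -2, 1)))    # True
print(on_plane(plane, (0, 0, 0)))     # False

# Reading a normal vector off the coefficients of -2x + y - 3z = 4.
print((-2, 1, -3, 4)[:3])             # (-2, 1, -3)
```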

Activity 9.5.4.

a. Write a scalar equation of the plane p1 passing through the point (0, 2, 4)
and perpendicular to the vector n = ⟨2, −1, 1⟩.

b. Is the point (2, 0, 2) on the plane p1 ?

c. Write a scalar equation of the plane p2 that is parallel to p1 and passing
through the point (3, 0, 4). (Hint: Compare normal vectors of the planes.)

d. Write a parametric description of the line l passing through the point
(2, 0, 2) and perpendicular to the plane p3 described by the equation
x + 2y − 2z = 7.

e. Find the point at which l intersects the plane p3 .

Just as two distinct points in space determine a line, three non-collinear
points in space determine a plane. Consider three points P0 , P1 , and P2 in
space, not all lying on the same line as shown in Figure 9.5.8.

Figure 9.5.8: A plane determined by three points P0 , P1 , and P2

Observe that the vectors \(\overrightarrow{P_0P_1}\) and \(\overrightarrow{P_0P_2}\) both lie in the plane p. If we form
their cross product
\( \mathbf{n} = \overrightarrow{P_0P_1} \times \overrightarrow{P_0P_2}, \)
we obtain a normal vector to the plane p. Therefore, if P is any other point on
p, it then follows that \(\overrightarrow{P_0P}\) will be perpendicular to n, and we have, as before,
the equation
\( \mathbf{n} \cdot \overrightarrow{P_0P} = 0. \) (9.5.5)
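The three-point construction can be sketched in code as follows. The points used here are our own illustrative choice (the three unit points on the coordinate axes), and the function name `plane_through_points` is hypothetical; the result is returned as a normal vector n together with d = n · P0, so that the plane is n · ⟨x, y, z⟩ = d.

```python
def cross(u, v):
    """Cross product of 3-vectors u and v, following Equation (9.4.1)."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)

def dot(u, v):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(u, v))

def sub(p, q):
    """Vector from point q to point p."""
    return tuple(a - b for a, b in zip(p, q))

def plane_through_points(p0, p1, p2):
    """Return (n, d) with n = P0P1 x P0P2 and d = n . P0."""
    n = cross(sub(p1, p0), sub(p2, p0))
    if n == (0, 0, 0):
        raise ValueError("the three points are collinear")
    return n, dot(n, p0)

# Illustrative points: the plane through (1,0,0), (0,1,0), and (0,0,1).
points = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
n, d = plane_through_points(*points)
print(n, d)                                   # (1, 1, 1) 1, that is, x + y + z = 1
print(all(dot(n, p) == d for p in points))    # True
```

Checking that all three given points satisfy the resulting equation is a quick safeguard against sign errors in the cross product.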

Activity 9.5.5. Let P0 = (1, 2, −1), P1 = (1, 0, −1), and P2 = (0, 1, 3) and let
p be the plane containing P0 , P1 , and P2 .

a. Determine the components of the vectors \(\overrightarrow{P_0P_1}\) and \(\overrightarrow{P_0P_2}\).

b. Find a normal vector n to the plane p.

c. Find a scalar equation of the plane p.



d. Consider a second plane, q, with scalar equation −3(x − 1) + 4(y + 3) +
2(z − 5) = 0. Find two different points on plane q, as well as a vector m
that is normal to q.

e. The angle between two planes is the acute angle between their respective
normal vectors. What is the angle between planes p and q?

9.5.4 Summary

• While lines in R3 do not have a slope, like lines in R2 they can be char-
acterized by a point and a direction vector. Indeed, we define a line in
space to be the set of terminal points of vectors emanating from a given
point that are parallel to a fixed vector.

• Vectors play a critical role in representing the equation of a line. In
particular, the terminal points of the vector r(t) = r0 + tv define a linear
function r in space through the terminal point of the vector r0 in the
direction of the vector v, tracing out a line in space.

• A plane in space is the set of all terminal points of vectors emanating
from a given point perpendicular to a fixed vector.

• If P1 , P2 , and P3 are non-collinear points in space, the vectors \(\overrightarrow{P_1P_2}\)
and \(\overrightarrow{P_1P_3}\) are vectors in the plane and the vector
\(\mathbf{n} = \overrightarrow{P_1P_2} \times \overrightarrow{P_1P_3}\) is a normal vector to the plane. So any point P in
the plane satisfies the equation \(\overrightarrow{PP_1} \cdot \mathbf{n} = 0\). If we let P = (x, y, z),
n = ⟨a, b, c⟩ be the normal vector, and P1 = (x0 , y0 , z0 ), we can also
represent the plane with the equation
a(x − x0 ) + b(y − y0 ) + c(z − z0 ) = 0.

Exercises
1. Rewrite the vector equation r(t) = (2 − 5t)i + (−1 − 3t)j + (−5 − 3t)k as
the corresponding parametric equations for the line.
x(t) =
y(t) =
z(t) =
2. Find the vector and parametric equations for the line through the point
P(3, -5, -1) and parallel to the vector 3i − 3j + 3k.
Vector Form: r = ⟨ ___ , ___ , −1 ⟩ + t⟨ ___ , ___ , 3 ⟩
Parametric form (parameter t, and passing through P when t = 0):
x = x(t) =
y = y(t) =
z = z(t) =
3. Consider the line which passes through the point P(4, 2, 4), and which is
parallel to the line x = 1 + 4t, y = 2 + 2t, z = 3 + 2t
Find the point of intersection of this new line with each of the coordinate
planes:
xy-plane: ( , , )
xz-plane: ( , , )
yz-plane: ( , , )
4. Find the point at which the line ⟨5, −2, −2⟩ + t⟨−3, −3, 3⟩ intersects the
plane −3x − y − 2z = 33.
( , , )

5. Find an equation of a plane containing the three points (-5, 5, 2), (0, 2,
-1), (0, 3, 1) in which the coefficient of x is -3.
= 0.
6. Find an equation for the plane containing the line in the xy-plane where
x = 2, and the line in the yz-plane where z = 3.
equation:
7. Find the angle in radians between the planes 3x + z = 1 and 3y + z = 1.
8. A store sells CDs at one price and DVDs at another price. The figure
below shows the revenue (in dollars) of the music store as a function of the
number, c, of CDs and the number, d, of DVDs that it sells. The values of the
revenue are shown on each line.

(Hint: for this problem there are many possible ways to estimate the requi-
site values; you should be able to find information from the figure that allows
you to give an answer that is essentially exact.)
(a) What is the price of a CD?
dollars
(b) What is the price of a DVD?
dollars
9. The table below gives the number of calories burned per minute for
someone roller-blading, as a function of the person’s weight in pounds and
speed in miles per hour [from the August 28,1994, issue of Parade Magazine].
calories burned per minute

weight (lb) \ speed (mph)     8      9     10     11
120                         4.2    5.8    7.4    8.9
140                         5.1    6.7    8.3    9.9
160                         6.1    7.7    9.2   10.8
180                         7.0    8.6   10.2   11.7
200                         7.9    9.5   11.1   12.6

(a) Suppose that a 160 lb person and a 180 lb person both go 10 miles, the
first at 11 mph and the second at 10 mph.
How many calories does the 160 lb person burn?
How many calories does the 180 lb person burn?
(b) We might also be interested in the number of calories each person burns
per pound of their weight.
How many calories per pound does the 160 lb person burn?
How many calories per pound does the 180 lb person burn?
10. The vector and parametric forms of a line allow us to easily describe
line segments in space.
Let P1 = (1, 2, −1) and P2 = (−2, 1, −2), and let L be the line in R3
through P1 and P2 as in Activity 9.5.2.
a. What value of the parameter t makes (x(t), y(t), z(t)) = P1 ? What value
of t makes (x(t), y(t), z(t)) = P2 ?
b. What t values describe the line segment between the points P1 and P2 ?
c. What about the line segment (along the same line) from (7, 4, 1) to
(−8, −1, −4)?
d. Now, consider a segment that lies on a different line: parameterize the
segment that connects point R = (4, −2, 7) to Q = (−11, 4, 27) in such a
way that t = 0 corresponds to point Q, while t = 2 corresponds to R.

11. This exercise explores key relationships between a pair of lines. Consider
the following two lines: one with parametric equations x(s) = 4 − 2s, y(s) =
−2 + s, z(s) = 1 + 3s, and the other being the line through (−4, 2, 17) in the
direction $\mathbf{v} = \langle -2, 1, 5 \rangle$.
a. Find a direction vector for the first line, which is given in parametric
form.
b. Find parametric equations for the second line, written in terms of the
parameter t.
c. Show that the two lines intersect at a single point by finding the values of
s and t that result in the same point. Then find the point of intersection.
d. Find the acute angle formed where the two lines intersect, noting that this
angle will be given by the acute angle between their respective direction
vectors.
e. Find an equation for the plane that contains both of the lines described
in this problem.

12. This exercise explores key relationships between a pair of planes. Con-
sider the following two planes: one with scalar equation 4x − 5y + z = −2, and
the other which passes through the points (1, 1, 1), (0, 1, −1), and (4, 2, −1).
a. Find a vector normal to the first plane.
b. Find a scalar equation for the second plane.
c. Find the angle between the planes, where the angle between them is
defined by the angle between their respective normal vectors.
d. Find a point that lies on both planes.
e. Since these two planes do not have parallel normal vectors, the planes
must intersect, and thus must intersect in a line. Observe that the line
of intersection lies in both planes, and thus the direction vector of the
line must be perpendicular to each of the respective normal vectors of
the two planes. Find a direction vector for the line of intersection for the
two planes.
f. Determine parametric equations for the line of intersection of the two
planes.

13. In this problem, we explore how we can use what we know about vectors
and projections to find the distance from a point to a plane.
Let p be the plane with equation z = −4x + 3y + 4, and let Q = (4, −1, 8).

a. Show that Q does not lie in the plane p.


b. Find a normal vector n to the plane p.
c. Find the coordinates of a point P in p.
d. Find the components of $\overrightarrow{PQ}$. Draw a picture to illustrate the objects
found so far.
e. Explain why $|\operatorname{comp}_{\mathbf{n}} \overrightarrow{PQ}|$ gives the distance from the point Q to the plane
p. Find this distance.
9.6 Vector-Valued Functions

Motivating Questions
• What is a vector-valued function? What do we mean by the graph of a
vector-valued function?
• What is a parameterization of a curve in R2 ? In R3 ? What can the
parameterization of a curve tell us?

So far, we have seen several different examples of curves in space, including
traces and contours of functions of two variables, as well as lines in 3-space.
Recall that for a line through a fixed point r0 in the direction of vector v, we
may express the line parametrically through the single vector equation
r(t) = r0 + tv.
From this perspective, the vector r(t) is a function that depends on the
parameter t, and the terminal points of this vector trace out the line in space.
Like lines, other curves in space are one-dimensional objects, and thus we
aspire to similarly express the coordinates of points on a given curve in terms
of a single variable. Vectors are a perfect vehicle for doing so — we can use
vectors based at the origin to identify points in space, and connect the terminal
points of these vectors to draw a curve in space. This approach will allow us
to draw an incredible variety of graphs in 2- and 3-space, as well as to identify
and describe curves in n-space for any n. It will also allow us to represent
traces and cross sections of surfaces in space.
Preview Activity 9.6.1. In this activity we consider how we might use vec-
tors to define a curve in space.
a. On a single set of axes in R2 , draw the vectors
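• $\langle \cos(0), \sin(0) \rangle$,
• $\left\langle \cos\left(\frac{\pi}{2}\right), \sin\left(\frac{\pi}{2}\right) \right\rangle$,
• $\langle \cos(\pi), \sin(\pi) \rangle$, and
• $\left\langle \cos\left(\frac{3\pi}{2}\right), \sin\left(\frac{3\pi}{2}\right) \right\rangle$

with their initial points at the origin.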


b. On the same set of axes, draw the vectors
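• $\left\langle \cos\left(\frac{\pi}{4}\right), \sin\left(\frac{\pi}{4}\right) \right\rangle$,
• $\left\langle \cos\left(\frac{3\pi}{4}\right), \sin\left(\frac{3\pi}{4}\right) \right\rangle$,
• $\left\langle \cos\left(\frac{5\pi}{4}\right), \sin\left(\frac{5\pi}{4}\right) \right\rangle$, and
• $\left\langle \cos\left(\frac{7\pi}{4}\right), \sin\left(\frac{7\pi}{4}\right) \right\rangle$

with their initial points at the origin.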


c. Based on the pictures from parts (a) and (b), sketch the set of terminal
points of all of the vectors of the form $\langle \cos(t), \sin(t) \rangle$, where t assumes
values from 0 to 2π. What is the resulting figure? Why?
d. Suppose we sketched the terminal points of all vectors of the form $\langle \cos(t), \sin(t) \rangle$,
where t assumes values from 0 to π. How does the resulting picture differ
from the one in part (c)? What about for t from 0 to 4π?
9.6.1 Vector-Valued Functions


Consider the curve shown in Figure 9.6.1. As in Preview Activity 9.6.1, we
can think of a point on this curve as resulting from a vector from the origin to
the point. As the point travels along the curve, the vector changes in order to
terminate at the desired point. A few still pictures of this motion are shown
in Figure 9.6.1.

Figure 9.6.1: The graph of a curve in space.

Thus, we can think of the curve as a collection of terminal points of vectors
emanating from the origin. We therefore view a point traveling along this curve
as a function of time t, and define a function r whose input is the variable t
and whose output is the vector from the origin to the point on the curve at
time t. In so doing, we have introduced a new type of function, one whose
input is a scalar and whose output is a vector.
The terminal points of the vector outputs of r then trace out the curve
in space. From this perspective, the x, y, and z coordinates of the point are
functions of time, t, say

x = x(t), y = y(t), and z = z(t),

and thus we have three coordinate functions that enable us to represent the
curve. The variable t is called a parameter and the equations x = x(t), y = y(t),
and z = z(t) are called parametric equations (or a parameterization of the
curve). The function r whose output is the vector from the origin to a point
on the curve is defined by

$\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle$.

Note that the input of r is the real-valued parameter t and the corresponding
output is the vector $\langle x(t), y(t), z(t) \rangle$. Such a function is called a vector-valued
function because each real number input generates a vector output. More
formally, we state the following definition.
Definition 9.6.2. A vector-valued function is a function whose input is a
real parameter t and whose output is a vector that depends on t. The graph of
a vector-valued function is the set of all terminal points of the output vectors
with their initial points at the origin.
Parametric equations for a curve are equations of the form

x = x(t), y = y(t), and z = z(t)

that describe the (x, y, z) coordinates of a point on a curve in R3 .


Note particularly that every set of parametric equations determines a
vector-valued function of the form

$\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle$,

and every vector-valued function defines a set of parametric equations for a
curve. Moreover, we can consider vector-valued functions and parameterizations
in $\mathbb{R}^2$, $\mathbb{R}^4$, or indeed a real space of any dimension. As a reminder, in
Section 9.5, we determined the parametric equations of a line in space using a
point and a direction vector. For a nonlinear example, the curve in Figure 9.6.1
has the parametric equations

x(t) = cos(t), y(t) = sin(t), and z(t) = cos(t) sin(t).

Represented as a vector-valued function r, the curve in Figure 9.6.1 is the
graph of
$\mathbf{r}(t) = \langle \cos(t), \sin(t), \cos(t)\sin(t) \rangle$.
Activity 9.6.2. The same curve can be represented with different parame-
terizations. Use appropriate technology to plot the curves generated by the
following vector-valued functions for values of t from 0 to 2π. Compare and
contrast the graphs — explain how they are alike and how they are different.
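a. $\mathbf{r}(t) = \langle \sin(t), \cos(t) \rangle$
b. $\mathbf{r}(t) = \langle \sin(2t), \cos(2t) \rangle$
c. $\mathbf{r}(t) = \langle \cos(t + \pi), \sin(t + \pi) \rangle$
d. $\mathbf{r}(t) = \langle \cos(t^2), \sin(t^2) \rangle$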
The examples in Activity 9.6.2 illustrate that a parameterization allows us
to look not only at the graph, but at the direction and speed at which the
graph is traversed as t changes. In the different parameterizations of the circle,
we see that we can start at different points and move around the circle in either
direction. The calculus of vector-valued functions — which we will begin to
investigate in Section 9.7 — will enable us to precisely quantify the direction,
speed, and acceleration of a particle moving along a curve in space. As such,
describing curves parametrically will allow us to not only indicate the curve
itself, but also to describe how motion occurs along the curve.
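As a rough numerical illustration of this point, the following Python sketch compares the standard parameterization $\langle \cos(t), \sin(t) \rangle$ of the unit circle with a second, hypothetical parameterization $\langle \cos(3t), -\sin(3t) \rangle$ that is not one of the functions in Activity 9.6.2. The function names and the step size h are our own choices, and the speed is only approximated by a difference quotient, so this should be read as a sketch rather than a definitive computation.

```python
import math

def r_std(t):
    """Standard parameterization <cos(t), sin(t)> of the unit circle."""
    return (math.cos(t), math.sin(t))

def r_fast(t):
    """Hypothetical second parameterization <cos(3t), -sin(3t)>."""
    return (math.cos(3 * t), -math.sin(3 * t))

def on_unit_circle(point, tol=1e-12):
    """Check that a point satisfies x^2 + y^2 = 1 up to rounding error."""
    x, y = point
    return abs(x * x + y * y - 1.0) < tol

def orientation(r, t, h=1e-6):
    """Sign of the 2D cross product r(t) x r(t+h):
    +1 means counterclockwise motion, -1 means clockwise motion."""
    x0, y0 = r(t)
    x1, y1 = r(t + h)
    return 1 if x0 * y1 - y0 * x1 > 0 else -1

def approx_speed(r, t, h=1e-6):
    """Approximate speed as |r(t+h) - r(t)| / h."""
    x0, y0 = r(t)
    x1, y1 = r(t + h)
    return math.hypot(x1 - x0, y1 - y0) / h

samples = [k * math.pi / 6 for k in range(13)]
all_on_circle = all(on_unit_circle(r(t)) for r in (r_std, r_fast) for t in samples)

print(all_on_circle)                                     # both trace the same circle
print(orientation(r_std, 1.0), orientation(r_fast, 1.0)) # opposite directions
print(approx_speed(r_std, 1.0), approx_speed(r_fast, 1.0))
```

Both functions produce points on the same circle, yet the second one moves clockwise and roughly three times as fast, which is exactly the kind of information a parameterization carries beyond the graph itself.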
Using parametric equations to define vector-valued functions in two dimen-
sions is much more versatile than just defining y as a function of x. In fact, if
y = f (x) is a function of x, then we can parameterize the graph of f by

$\mathbf{r}(t) = \langle t, f(t) \rangle$,

and thus every single-variable function may be described parametrically. In
addition, as we saw in Preview Activity 9.6.1 and Activity 9.6.2, we can use
vector-valued functions to represent curves in the plane that do not define y as
a function of x (or x as a function of y). (As a side note: vector-valued functions
make it easy to plot the inverse of a one-to-one function in two dimensions. To
see how, if y = f(x) defines a one-to-one function, then we can parameterize
this function by $\mathbf{r}(t) = \langle t, f(t) \rangle$. Since the inverse function just reverses the
roles of input and output, a parameterization for $f^{-1}$ is $\langle f(t), t \rangle$.)
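To make the side note concrete, here is a minimal Python sketch. The function f(x) = x^3 + x is a hypothetical one-to-one example chosen only for illustration, and the helper names are ours. The sketch generates points on the graph of the inverse by swapping the coordinates of points on the graph of f, without ever solving for a formula for the inverse.

```python
import math

def f(x):
    """A sample one-to-one function (hypothetical choice): f(x) = x^3 + x."""
    return x**3 + x

def graph_of_f(t):
    """Point <t, f(t)> on the graph of f."""
    return (t, f(t))

def graph_of_inverse(t):
    """Point <f(t), t> on the graph of the inverse of f."""
    return (f(t), t)

ts = [-2 + 0.5 * k for k in range(9)]
inverse_points = [graph_of_inverse(t) for t in ts]

# A point (x, y) lies on the graph of the inverse exactly when f(y) = x.
checks = [math.isclose(f(y), x) for (x, y) in inverse_points]

print(graph_of_f(1.0), graph_of_inverse(1.0))
print(all(checks))
```

Passing the list of points to any plotting tool would draw the inverse, even though writing down a formula for the inverse of x^3 + x is unpleasant.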
Activity 9.6.3. Vector-valued functions can be used to generate many inter-
esting curves. Graph each of the following using an appropriate technological
tool, and then write one sentence for each function to describe the behavior of
the resulting curve.
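a. $\mathbf{r}(t) = \langle t\cos(t), t\sin(t) \rangle$

b. $\mathbf{r}(t) = \langle \sin(t)\cos(t), t\sin(t) \rangle$

c. $\mathbf{r}(t) = \langle \sin(5t), \sin(4t) \rangle$

d. $\mathbf{r}(t) = \langle t^2\sin(t)\cos(t), 0.9t\cos(t^2), \sin(t) \rangle$ (Note that this defines a curve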
in 3-space.)

e. Experiment with different formulas for x(t) and y(t) and ranges for t
to see what other interesting curves you can generate. Share your best
results with peers.

Recall from our earlier work that the traces and level curves of a function
are themselves curves in space. Thus, we may determine parameterizations for
them. For example, if $z = f(x, y) = \cos(x^2 + y^2)$, the y = 1 trace of the function
is given by setting y = 1 and letting x be parameterized by the variable t; then,
the trace is the curve whose parameterization is $\langle t, 1, \cos(t^2 + 1) \rangle$.
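A short Python check of this example may help: the sketch below evaluates the parameterization $\langle t, 1, \cos(t^2 + 1) \rangle$ at a few sample values of t and confirms that each resulting point has y = 1 and lies on the surface z = f(x, y). The function names and the sample values of t are our own choices.

```python
import math

def f(x, y):
    """The surface z = cos(x^2 + y^2) from the example."""
    return math.cos(x**2 + y**2)

def trace_y1(t):
    """Parameterization <t, 1, cos(t^2 + 1)> of the y = 1 trace."""
    return (t, 1.0, math.cos(t**2 + 1))

ts = [-2 + 0.25 * k for k in range(17)]

# Every point must have y = 1 and a z-coordinate equal to f(x, y).
on_surface = all(
    y == 1.0 and math.isclose(z, f(x, y))
    for (x, y, z) in (trace_y1(t) for t in ts)
)

print(on_surface)
```

The same pattern (fix one variable, let the other be t, and compute z from f) applies to any trace of a surface z = f(x, y).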

Activity 9.6.4. Consider the paraboloid defined by $f(x, y) = x^2 + y^2$.

a. Find a parameterization for the x = 2 trace of f. What type of curve
does this trace describe?

b. Find a parameterization for the y = −1 trace of f. What type of curve
does this trace describe?

c. Find a parameterization for the level curve f(x, y) = 25. What type of
curve does this level curve describe?

d. How do your responses change to all three of the preceding questions if
you instead consider the function g defined by $g(x, y) = x^2 - y^2$? (Hint
for generating one of the parameterizations: $\sec^2(t) - \tan^2(t) = 1$.)

9.6.2 Summary

• A vector-valued function is a function whose input is a real parameter t
and whose output is a vector that depends on t. The graph of a
vector-valued function is the set of all terminal points of the output vectors with
their initial points at the origin.

• Every vector-valued function provides a parameterization of a curve. In
$\mathbb{R}^2$, a parameterization of a curve is a pair of equations x = x(t) and
y = y(t) that describes the coordinates of a point (x, y) on the curve in
terms of a parameter t. In $\mathbb{R}^3$, a parameterization of a curve is a set
of three equations x = x(t), y = y(t), and z = z(t) that describes the
coordinates of a point (x, y, z) on the curve in terms of a parameter t.

Exercises
1. Find the domain of the vector function

$\mathbf{r}(t) = \left\langle \ln(13t),\; t + 16,\; \frac{1}{\sqrt{17 - t}} \right\rangle$

using interval notation.

Domain:
2. Find a parametrization of the circle of radius 7 in the xy-plane, centered
at the origin, oriented clockwise. The point (7, 0) should correspond to t = 0.
Use t as the parameter for all of your answers.
x(t) =
y(t) =
3. Find a vector parametrization of the circle of radius 8 in the xy-plane,
centered at the origin, oriented clockwise so that the point (8, 0) corresponds
to t = 0 and the point (0, −8) corresponds to t = 1.
$\vec{r}(t) =$
4. Find a vector parametric equation $\vec{r}(t)$ for the line through the points
P = (5, −3, −5) and Q = (7, −8, −8) for each of the given conditions on the
parameter t.
(a) If $\vec{r}(0) = \langle 5, -3, -5 \rangle$ and $\vec{r}(8) = \langle 7, -8, -8 \rangle$, then
$\vec{r}(t) =$
(b) If $\vec{r}(3) = P$ and $\vec{r}(5) = Q$, then
$\vec{r}(t) =$
(c) If the points P and Q correspond to the parameter values t = 0 and
t = −4, respectively, then
$\vec{r}(t) =$
5. Suppose parametric equations for the line segment between (−1, 8) and
(3, −1) have the form:
x = a + bt
y = c + dt
If the parametric curve starts at (−1, 8) when t = 0 and ends at (3, −1) at
t = 1, then find a, b, c, and d.
a= ,b = ,c = ,d =
.
6. Find a parametrization of the curve $x = -5z^2$ in the xz-plane. Use t as
the parameter for all of your answers.
x(t) =
y(t) =
z(t) =
7. Find parametric equations for the quarter-ellipse from (3, 0, 7) to (0, −5, 7)
centered at (0, 0, 7) in the plane z = 7. Use the interval 0 ≤ t ≤ π/2.
x(t) =
y(t) =
z(t) =
8. Are the following statements true or false?

(a) The parametric curve $x = (3t + 4)^2$, $y = 5(3t + 4)^2 - 9$, for 0 ≤ t ≤ 3 is a
line segment.

(b) A parametrization of the graph of y = ln(x) for x > 0 is given by
$x = e^t$, y = t for −∞ < t < ∞.

(c) The line parametrized by x = 7, y = 5t, z = 6 + t is parallel to the x-axis.

9. Find a vector function that represents the curve of intersection of the
paraboloid $z = 6x^2 + 3y^2$ and the cylinder $y = 2x^2$. Use the variable t for the
parameter.
$\mathbf{r}(t) = \langle t, \underline{\hspace{2em}}, \underline{\hspace{2em}} \rangle$
10. A bicycle wheel has radius R. Let P be a point on a spoke of the wheel
at a distance d from the center of the wheel. The wheel begins to roll to the
right along the x-axis. The curve traced out by P is given by the following
parametric equations:
x = 14θ − 12 sin(θ)
y = 14 − 12 cos(θ)
What must we have for R and d?
R= d=
11. A standard parameterization for the unit circle is $\langle \cos(t), \sin(t) \rangle$, for
0 ≤ t ≤ 2π.
a. Find a vector-valued function r that describes a point traveling along
the unit circle so that at time t = 0 the point is at $\left(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right)$ and travels
clockwise along the circle as t increases.
b. Find a vector-valued function r that describes a point traveling along
the unit circle so that at time t = 0 the point is at $\left(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right)$ and travels
counter-clockwise along the circle as t increases.
c. Find a vector-valued function r that describes a point traveling along the
unit circle so that at time t = 0 the point is at $\left(-\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right)$ and travels
clockwise along the circle as t increases.
d. Find a vector-valued function r that describes a point traveling along
the unit circle so that at time t = 0 the point is at (0, 1) and makes one
complete revolution around the circle in the counter-clockwise direction
on the interval [0, π].

12. Let a and b be positive real numbers. You have probably seen the
equation $\frac{(x-h)^2}{a^2} + \frac{(y-k)^2}{b^2} = 1$ that generates an ellipse, centered at (h, k), with
a horizontal axis of length 2a and a vertical axis of length 2b.
a. Explain why the vector function r defined by $\mathbf{r}(t) = \langle a\cos(t), b\sin(t) \rangle$,
0 ≤ t ≤ 2π is one parameterization of the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$.
b. Find a parameterization of the ellipse $\frac{x^2}{4} + \frac{y^2}{16} = 1$ that is traversed
counterclockwise.
c. Find a parameterization of the ellipse $\frac{(x+3)^2}{4} + \frac{(y-2)^2}{9} = 1$.
d. Determine the x-y equation of the ellipse that is parameterized by

$\mathbf{r}(t) = \langle 3 + 4\sin(2t), 1 + 3\cos(2t) \rangle$.

13. Consider the two-variable function $z = f(x, y) = 3x^2 + 4y^2 - 2$.


a. Determine a vector-valued function r that parameterizes the curve which
is the x = 2 trace of z = f (x, y). Plot the resulting curve. Do likewise
for x = −2, −1, 0, and 1.
b. Determine a vector-valued function r that parameterizes the curve which
is the y = 2 trace of z = f (x, y). Plot the resulting curve. Do likewise
for y = −2, −1, 0, and 1.
c. Determine a vector-valued function r that parameterizes the curve which
is the z = 2 contour of z = f (x, y). Plot the resulting curve. Do likewise
for z = −2, −1, 0, and 1.
d. Use the traces and contours you’ve just investigated to create a wireframe
plot of the surface generated by z = f (x, y). In addition, write two
sentences to describe the characteristics of the surface.

14. Recall that any line in space may be represented parametrically by a
vector-valued function.

a. Find a vector-valued function r that parameterizes the line through
(−2, 1, 4) in the direction of the vector $\mathbf{v} = \langle 3, 2, -5 \rangle$.
b. Find a vector-valued function r that parameterizes the line of intersection
of the planes x + 2y − z = 4 and 3x + y − 2z = 1.
c. Determine the point of intersection of the lines given by

x = 2 + 3t, y = 1 − 2t, z = 4t,

x = 3 + 1s, y = 3 − 2s, z = 2s.


Then, find a vector-valued function r that parameterizes the line that
passes through the point of intersection you just found and is perpendic-
ular to both of the given lines.

15. For each of the following, describe the effect of the parameter s on the
parametric curve for t in the interval [0, 2π].

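a. $\mathbf{r}(t) = \langle \cos(t), \sin(t) + s \rangle$
b. $\mathbf{r}(t) = \langle \cos(t) - s, \sin(t) \rangle$
c. $\mathbf{r}(t) = \langle s\cos(t), \sin(t) \rangle$
d. $\mathbf{r}(t) = \langle s\cos(t), s\sin(t) \rangle$
e. $\mathbf{r}(t) = \langle \cos(st), \sin(st) \rangle$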


9.7 Derivatives and Integrals of Vector-Valued Functions

Motivating Questions

• What do we mean by the derivative of a vector-valued function and how
do we calculate it?

• What does the derivative of a vector-valued function measure?

• What do we mean by the integral of a vector-valued function and how
do we compute it?

• How do we describe the motion of a projectile if the only force acting on
the object is acceleration due to gravity?

A vector-valued function r determines a curve in space as the collection of
terminal points of the vectors r(t). If the curve is smooth, it is natural to ask
whether r(t) has a derivative. In the same way, our experiences with integrals
in single-variable calculus prompt us to wonder what the integral of a vector-
valued function might be and what it might tell us. We explore both of these
questions in detail in this section.
For now, let’s recall some important ideas from calculus I. Given a function
s that measures the position of an object moving along an axis, its derivative,
$s'$, is defined by
$$s'(t) = \lim_{h \to 0} \frac{s(t + h) - s(t)}{h},$$
and measures the instantaneous rate of change of s with respect to time. In
particular, for a fixed value t = a, $s'(a)$ measures the velocity of the moving
object, as well as the slope of the tangent line to the curve y = s(t) at the
point (a, s(a)).
As we work with vector-valued functions, we will strive to update these
ideas and perspectives into the context of curves in space and outputs that are
vectors.

Preview Activity 9.7.1. Let r(t) = cos(t)i + sin(2t)j describe the path trav-
eled by an object at time t.

a. Use appropriate technology to help you sketch the graph of the vector-
valued function r, and then locate and label the point on the graph when
t = π.

b. Recall that for functions of a single variable, the derivative of a sum is
the sum of the derivatives; that is, $\frac{d}{dx}[f(x) + g(x)] = f'(x) + g'(x)$. With
this idea in mind and viewing i and j as constant vectors, what do you
expect the derivative of r to be? Write a proposed formula for $\mathbf{r}'(t)$.

c. Use your result from part (b) to compute $\mathbf{r}'(\pi)$. Sketch this vector $\mathbf{r}'(\pi)$
as emanating from the point on the graph of r when t = π, and explain
what you think $\mathbf{r}'(\pi)$ tells us about the object’s motion.
9.7.1 The Derivative


In single variable calculus, we define the derivative, $f'$, of a given function f
by
$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h},$$
provided the limit exists. At a given value of a, $f'(a)$ measures the instantaneous
rate of change of f, and also tells us the slope of the tangent line to the
curve y = f(x) at the point (a, f(a)). The definition of the derivative extends
naturally to vector-valued functions and curves in space.
Definition 9.7.1. The derivative of a vector-valued function r is defined to
be
$$\mathbf{r}'(t) = \lim_{h \to 0} \frac{\mathbf{r}(t + h) - \mathbf{r}(t)}{h}$$
for those values of t at which the limit exists. We also use the notation $\frac{d\mathbf{r}}{dt}$ and
$\frac{d}{dt}[\mathbf{r}(t)]$ for $\mathbf{r}'(t)$.

Activity 9.7.2. Let’s investigate how we can interpret the derivative $\mathbf{r}'(t)$.
Let r be the vector-valued function whose graph is shown in Figure 9.7.2, and
let h be a scalar that represents a small change in time. The vector r(t) is the
blue vector in Figure 9.7.2 and r(t + h) is the green vector.

Figure 9.7.2: A single difference quotient.

a. Is the quantity $\mathbf{r}(t + h) - \mathbf{r}(t)$ a vector or a scalar? Identify this object
in Figure 9.7.2.

b. Is $\frac{\mathbf{r}(t+h) - \mathbf{r}(t)}{h}$ a vector or a scalar? Sketch a representative vector $\frac{\mathbf{r}(t+h) - \mathbf{r}(t)}{h}$
with h < 1 in Figure 9.7.2.
c. Think of r(t) as providing the position of an object moving along the
curve these vectors trace out. What do you think that the vector $\frac{\mathbf{r}(t+h) - \mathbf{r}(t)}{h}$
measures? Why? (Hint: You might think analogously about difference
quotients such as $\frac{f(x+h) - f(x)}{h}$ or $\frac{s(t+h) - s(t)}{h}$ from calculus I.)

d. Figure 9.7.3 presents three snapshots of the vectors $\frac{\mathbf{r}(t+h) - \mathbf{r}(t)}{h}$ as we let
h → 0. Write 2-3 sentences to describe key attributes of the vector
$$\lim_{h \to 0} \frac{\mathbf{r}(t + h) - \mathbf{r}(t)}{h}.$$
(Hint: Compare to limits such as $\lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$ or $\lim_{h \to 0} \frac{s(t+h) - s(t)}{h}$
from calculus I, keeping in mind that in three dimensions there is no general
concept of slope.)

Figure 9.7.3: Snapshots of several difference quotients.

As Activity 9.7.2 indicates, if r(t) determines the position of an object at
time t, then $\frac{\mathbf{r}(t+h) - \mathbf{r}(t)}{h}$ represents the average rate of change in the position of
the object over the interval [t, t + h], which is also the average velocity of the
object on this interval. Thus, the derivative

$$\mathbf{r}'(t) = \lim_{h \to 0} \frac{\mathbf{r}(t + h) - \mathbf{r}(t)}{h}$$
is the instantaneous rate of change of r(t) at time t (for those values of t for
which the limit exists), so $\mathbf{r}'(t) = \mathbf{v}(t)$ is the instantaneous velocity of the
object at time t. Furthermore, we can interpret the derivative $\mathbf{r}'(t)$ as the
direction vector of the line tangent to the graph of r at the value t.
Similarly,
$$\mathbf{v}'(t) = \mathbf{r}''(t) = \lim_{h \to 0} \frac{\mathbf{v}(t + h) - \mathbf{v}(t)}{h}$$
is the instantaneous rate of change of the velocity of the object at time t,
for those values of t for which the limit exists, and thus $\mathbf{v}'(t) = \mathbf{a}(t)$ is the
acceleration of the moving object.
Note well: Both the velocity and acceleration are vector quantities: they
have magnitude and direction. By contrast, the magnitude of the velocity
vector, $|\mathbf{v}(t)|$, which is the speed of the object at time t, is a scalar quantity.

9.7.2 Computing Derivatives


As we learned in single variable calculus, computing derivatives from the definition
is often difficult. Fortunately, properties of the limit make it straightforward
to calculate the derivative of a vector-valued function similar to how we
developed shortcut differentiation rules in calculus I. To see why, recall that
the limit of a sum is the sum of the limits, and that we can remove constant
factors from limits. Thus, as we observed in a particular example in Preview
Activity 9.7.1, if r(t) = x(t)i + y(t)j + z(t)k, it follows that

\begin{align*}
\mathbf{r}'(t) &= \lim_{h \to 0} \frac{\mathbf{r}(t + h) - \mathbf{r}(t)}{h} \\
&= \lim_{h \to 0} \frac{[x(t + h) - x(t)]\mathbf{i} + [y(t + h) - y(t)]\mathbf{j} + [z(t + h) - z(t)]\mathbf{k}}{h} \\
&= \left( \lim_{h \to 0} \frac{x(t + h) - x(t)}{h} \right)\mathbf{i} + \left( \lim_{h \to 0} \frac{y(t + h) - y(t)}{h} \right)\mathbf{j} + \left( \lim_{h \to 0} \frac{z(t + h) - z(t)}{h} \right)\mathbf{k} \\
&= x'(t)\mathbf{i} + y'(t)\mathbf{j} + z'(t)\mathbf{k}.
\end{align*}

Thus, we can calculate the derivative of a vector-valued function by simply
differentiating its components.
The derivative of a vector-valued function.
If r(t) = x(t)i + y(t)j + z(t)k, then

$$\frac{d}{dt}\mathbf{r}(t) = x'(t)\mathbf{i} + y'(t)\mathbf{j} + z'(t)\mathbf{k}$$
for those values of t at which x, y, and z are differentiable.
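One way to see this rule at work is to compare the difference quotient from Definition 9.7.1 with the componentwise derivative numerically. The Python sketch below does this for the curve $\mathbf{r}(t) = \langle \cos(t), \sin(t), \cos(t)\sin(t) \rangle$ from Figure 9.6.1; the helper names, the choice t = 1, and the step sizes are our own assumptions, and the comparison is only approximate because of rounding and the finite step h.

```python
import math

def r(t):
    """The curve from Figure 9.6.1: <cos t, sin t, cos t sin t>."""
    return (math.cos(t), math.sin(t), math.cos(t) * math.sin(t))

def r_prime(t):
    """Componentwise derivative; d/dt[cos t sin t] = cos^2 t - sin^2 t."""
    return (-math.sin(t), math.cos(t), math.cos(t) ** 2 - math.sin(t) ** 2)

def difference_quotient(func, t, h):
    """The vector (func(t + h) - func(t)) / h, computed component by component."""
    a, b = func(t), func(t + h)
    return tuple((bi - ai) / h for ai, bi in zip(a, b))

def max_error(t, h):
    """Largest componentwise gap between the difference quotient and r'(t)."""
    dq = difference_quotient(r, t, h)
    return max(abs(d - e) for d, e in zip(dq, r_prime(t)))

t0 = 1.0
errors = [max_error(t0, h) for h in (1e-1, 1e-2, 1e-3)]

for h, err in zip((1e-1, 1e-2, 1e-3), errors):
    print(h, err)
```

As h shrinks, the gap between the difference quotient and the componentwise derivative shrinks with it, which is consistent with the limit computation above.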

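Activity 9.7.3. For each of the following vector-valued functions, find $\mathbf{r}'(t)$.
a. $\mathbf{r}(t) = \langle \cos(t), t\sin(t), \ln(t) \rangle$.
b. $\mathbf{r}(t) = \left\langle t^2 + 3t, e^{-2t}, \frac{t}{t^2 + 1} \right\rangle$.
c. $\mathbf{r}(t) = \langle \tan(t), \cos(t^2), te^{-t} \rangle$.
d. $\mathbf{r}(t) = \langle t^4 + 4, \sin(3t), \cos(4t) \rangle$.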
In first-semester calculus, we developed several important differentiation
rules, including the constant multiple, product, quotient, and chain rules. For
instance, recall that we formally state the product rule as
$$\frac{d}{dx}[f(x) \cdot g(x)] = f(x) \cdot g'(x) + g(x) \cdot f'(x).$$
There are several analogous rules for vector-valued functions, including a
product rule for scalar functions and vector-valued functions. These rules,
which are easily verified, are summarized as follows.
Properties of derivatives of vector-valued functions.
Let f be a differentiable real-valued function of a real variable t and let
r and s be differentiable vector-valued functions of the real variable t.
Then
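1. $\frac{d}{dt}[\mathbf{r}(t) + \mathbf{s}(t)] = \mathbf{r}'(t) + \mathbf{s}'(t)$

2. $\frac{d}{dt}[f(t)\mathbf{r}(t)] = f(t)\mathbf{r}'(t) + f'(t)\mathbf{r}(t)$

3. $\frac{d}{dt}[\mathbf{r}(t) \cdot \mathbf{s}(t)] = \mathbf{r}'(t) \cdot \mathbf{s}(t) + \mathbf{r}(t) \cdot \mathbf{s}'(t)$

4. $\frac{d}{dt}[\mathbf{r}(t) \times \mathbf{s}(t)] = \mathbf{r}'(t) \times \mathbf{s}(t) + \mathbf{r}(t) \times \mathbf{s}'(t)$

5. $\frac{d}{dt}[\mathbf{r}(f(t))] = f'(t)\mathbf{r}'(f(t))$.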

Note well. When applying these properties, use care to interpret the quan-
tities involved as either scalars or vectors. For example, r(t) · s(t) defines a
scalar function because we have taken the dot product of two vector-valued
functions. However, r(t) × s(t) defines a vector-valued function since we have
taken the cross product of two vector-valued functions.
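To illustrate the product rule for the dot product (property 3), the following Python sketch compares a numerical derivative of the scalar function r(t) · s(t) with the right-hand side r'(t) · s(t) + r(t) · s'(t). The two vector-valued functions, the evaluation point t = 0.7, and the central-difference step are hypothetical choices made only for this illustration.

```python
import math

def r(t):
    """Hypothetical example: r(t) = <t, t^2, 1>."""
    return (t, t ** 2, 1.0)

def r_prime(t):
    return (1.0, 2 * t, 0.0)

def s(t):
    """Hypothetical example: s(t) = <cos t, sin t, t>."""
    return (math.cos(t), math.sin(t), t)

def s_prime(t):
    return (-math.sin(t), math.cos(t), 1.0)

def dot(u, v):
    """Dot product of two vectors; the result is a scalar."""
    return sum(a * b for a, b in zip(u, v))

def r_dot_s(t):
    """The scalar function r(t) . s(t)."""
    return dot(r(t), s(t))

def dot_rule(t):
    """Right-hand side of property 3: r'(t) . s(t) + r(t) . s'(t)."""
    return dot(r_prime(t), s(t)) + dot(r(t), s_prime(t))

def numeric_derivative(g, t, h=1e-5):
    """Central-difference approximation to g'(t) for a scalar function g."""
    return (g(t + h) - g(t - h)) / (2 * h)

t0 = 0.7
lhs = numeric_derivative(r_dot_s, t0)
rhs = dot_rule(t0)
print(lhs, rhs)
```

Note that both sides are scalars here, in line with the caution above about keeping track of which quantities are scalars and which are vectors.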
Activity 9.7.4. The left side of Figure 9.7.4 shows the curve described by the
vector-valued function r defined by
$$\mathbf{r}(t) = \left\langle 2t - \frac{1}{2}t^2 + 1, t - 1 \right\rangle.$$

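[Figure: left panel, the curve in the xy-plane with x and y from −4 to 4; right panel, Speed versus t with t from −4 to 4 and speed from 0 to 16.]

Figure 9.7.4: The curve $\mathbf{r}(t) = \left\langle 2t - \frac{1}{2}t^2 + 1, t - 1 \right\rangle$ and its speed.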



a. Find the object’s velocity v(t).

b. Find the object’s acceleration a(t).

c. Indicate on the left of Figure 9.7.4 the object’s position, velocity and
acceleration at the times t = 0, 2, 4. Draw the velocity and acceleration
vectors with their tails placed at the object’s position.

d. Recall that the speed is $|\mathbf{v}| = \sqrt{\mathbf{v} \cdot \mathbf{v}}$. Find the object’s speed and graph
it as a function of time t on the right of Figure 9.7.4. When is the object’s
speed the slowest? When is the speed increasing? When is it decreasing?

e. What seems to be true about the angle between v and a when the speed
is at a minimum? What is the angle between v and a when the speed is
increasing? when the speed is decreasing?

f. Since the square root is an increasing function, we see that the speed
increases precisely when v · v is increasing. Use the product rule for the
dot product to express $\frac{d}{dt}(\mathbf{v} \cdot \mathbf{v})$ in terms of the velocity v and acceleration
a. Use this to explain why the speed is increasing when v · a > 0 and
decreasing when v · a < 0. Compare this to part (d).

g. Show that the speed’s rate of change is

$$\frac{d}{dt}|\mathbf{v}(t)| = \operatorname{comp}_{\mathbf{v}} \mathbf{a}.$$
9.7.3 Tangent Lines


One of the most important ideas in first-semester calculus is that a differen-
tiable function is locally linear : that is, when viewed up close, the curve gener-
ated by a differentiable function looks very much like a line. Indeed, when we
zoom in sufficiently far on a particular point, the curve looks indistinguishable
from its tangent line.
In the same way, we expect that a smooth curve in 3-space will be locally
linear. In the following activity, we investigate how to find the tangent line to
such a curve. Recall from our work in Section 9.5 that the vector equation of
a line that passes through the point at the tip of the vector $\mathbf{L}_0 = \langle x_0, y_0, z_0 \rangle$
in the direction of the vector $\mathbf{u} = \langle a, b, c \rangle$ can be written as

$\mathbf{L}(t) = \mathbf{L}_0 + t\mathbf{u}$.

In parametric form, the line L is given by

$x(t) = x_0 + at, \quad y(t) = y_0 + bt, \quad z(t) = z_0 + ct$.

Activity 9.7.5. Let

r(t) = cos(t)i − sin(t)j + tk.

Sketch the curve using some appropriate tool.

a. Determine the coordinates of the point on the curve traced out by r(t)
when t = π.

b. Find a direction vector for the line tangent to the graph of r at the point
where t = π.

c. Find the parametric equations of the line tangent to the graph of r when
t = π.

d. Sketch a plot of the curve r(t) and its tangent line near the point where
t = π. In addition, include a sketch of $\mathbf{r}'(\pi)$. What is the important role
of $\mathbf{r}'(\pi)$ in this activity?

We see that our work in Activity 9.7.5 can be generalized. Given a differentiable
vector-valued function r, the tangent line to the curve at the input
value a is given by
$$\mathbf{L}(t) = \mathbf{r}(a) + t\mathbf{r}'(a). \tag{9.7.1}$$
Here we see that because the tangent line is determined entirely by a given
point and direction, the point is provided by the function r, evaluated at t = a,
while the direction is provided by the derivative, $\mathbf{r}'$, again evaluated at t = a.
Note how analogous the formula for L(t) is to the tangent line approximation
from single-variable calculus: in that context, for a given function y = f(x) at
a value x = a, we found that the tangent line can be expressed by the linear
function y = L(x) whose formula is

$$L(x) = f(a) + f'(a)(x - a).$$

Equation (9.7.1) for the tangent line L(t) to the vector-valued function r(t)
is nearly identical. Indeed, because there are multiple parameterizations for a
single line, it is even possible to write the parameterization as

$$\mathbf{L}(t) = \mathbf{r}(a) + (t - a)\mathbf{r}'(a). \tag{9.7.2}$$


(For example, in Equation (9.7.1), L(0) = r(a), so the line’s parameterization
“starts” at t = 0. When we write the parameterization in the form of
Equation (9.7.2), L(a) = r(a), so the line’s parameterization “starts” at t = a.)
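The two forms of the tangent line can be sketched in a few lines of Python. The example below uses the curve $\mathbf{r}(t) = \langle \cos(t), \sin(t), \cos(t)\sin(t) \rangle$ from Figure 9.6.1 rather than the curve in Activity 9.7.5; the helper names and the sample values of a and h are our own assumptions. It also checks local linearity informally by measuring how far the curve is from its tangent line a short distance from the point of tangency.

```python
import math

def r(t):
    """The curve from Figure 9.6.1: <cos t, sin t, cos t sin t>."""
    return (math.cos(t), math.sin(t), math.cos(t) * math.sin(t))

def r_prime(t):
    """Componentwise derivative of r."""
    return (-math.sin(t), math.cos(t), math.cos(t) ** 2 - math.sin(t) ** 2)

def tangent_line(a):
    """Form (9.7.1): L(t) = r(a) + t r'(a), so L(0) = r(a)."""
    p, d = r(a), r_prime(a)
    return lambda t: tuple(pi + t * di for pi, di in zip(p, d))

def tangent_line_shifted(a):
    """Form (9.7.2): L(t) = r(a) + (t - a) r'(a), so L(a) = r(a)."""
    p, d = r(a), r_prime(a)
    return lambda t: tuple(pi + (t - a) * di for pi, di in zip(p, d))

L = tangent_line(0.0)
print(L(0.0))   # the point of tangency r(0)
print(L(1.0))   # one step along the direction r'(0)

a = math.pi / 4
L_shifted = tangent_line_shifted(a)
print(L_shifted(a) == r(a))

# Local linearity: near t = 0 the curve stays very close to its tangent line.
h = 1e-3
gap = max(abs(ci - li) for ci, li in zip(r(h), L(h)))
print(gap)
```

Both functions describe the same line; they differ only in which value of t corresponds to the point of tangency.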
As we will learn more in Chapter 10, a smooth surface in 3-space is also
locally linear. That means that the surface will look like a plane, which we call
its tangent plane, as we zoom in on the graph. It is possible to use tangent
lines to traces of the surface to generate a formula for the tangent plane; see
Exercise 9.7.15 at the end of this section for more details.

9.7.4 Integrating a Vector-Valued Function


Recall from single variable calculus that an antiderivative of a function f of
the independent variable x is a function F that satisfies $F'(x) = f(x)$. We
then defined the indefinite integral $\int f(x)\,dx$ to be the general antiderivative
of f. Recall that the general antiderivative includes an added constant C in
order to indicate that the general antiderivative is in fact an entire family of
functions. We can do similar work with vector-valued functions.

Definition 9.7.5. An antiderivative of a vector-valued function r is a
vector-valued function R such that

$$\mathbf{R}'(t) = \mathbf{r}(t).$$

The indefinite integral $\int \mathbf{r}(t)\,dt$ of a vector-valued function r is the general
antiderivative of r and represents the collection of all antiderivatives of
r.

The same reasoning that allows us to differentiate a vector-valued function


componentwise applies to integrating as well. Recall that the integral of a sum
is the sum of the integrals and also that we can remove constant factors from
integrals. So, given r(t) = x(t)i + y(t)j + z(t)k, it follows that we can integrate
componentwise. Expressed more formally,
Integrating a vector-valued function.
If r(t) = x(t)i + y(t)j + z(t)k, then
\[ \int r(t)\,dt = \left( \int x(t)\,dt \right) i + \left( \int y(t)\,dt \right) j + \left( \int z(t)\,dt \right) k. \]
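As a brief worked illustration of componentwise integration (the function below is chosen only for this example and does not come from the text), each component is integrated separately and the three constants are collected into a single constant vector:

```latex
\[
\int \langle 2t,\ \cos(t),\ e^{t} \rangle \, dt
  = \left\langle \int 2t \, dt,\ \int \cos(t) \, dt,\ \int e^{t} \, dt \right\rangle
  = \langle t^{2} + C_1,\ \sin(t) + C_2,\ e^{t} + C_3 \rangle
  = \langle t^{2},\ \sin(t),\ e^{t} \rangle + C,
\]
```

where C = ⟨C_1, C_2, C_3⟩ is an arbitrary constant vector.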

In light of being able to integrate and differentiate componentwise with
vector-valued functions, we can solve many problems that are analogous to
those we encountered in single-variable calculus. For instance, recall problems
where we were given an object moving along an axis with velocity function
v and an initial position s(0). In that context, we were able to differentiate
v in order to find acceleration, and integrate v and use the initial condition
in order to find the position function s. In the following activity, we explore
similar ideas with vector-valued functions.

Activity 9.7.6. Suppose a moving object in space has its velocity given by

\[ v(t) = (-2\sin(2t))\,i + (2\cos(t))\,j + \left( 1 - \frac{1}{1+t} \right) k. \]

A graph of the position of the object for times t in [−0.5, 3] is shown in
Figure 9.7.6. Suppose further that the object is at the point (1.5, −1, 0) at
time t = 0.

a. Determine a(t), the acceleration of the object at time t.

b. Determine r(t), the position of the object at time t.

c. Compute and sketch the position, velocity, and acceleration vectors of
the object at time t = 1, using Figure 9.7.6.

d. Finally, determine the vector equation for the tangent line, L(t), that is
tangent to the position curve at t = 1.

Figure 9.7.6: The position graph for the function in Activity 9.7.6.

9.7.5 Projectile Motion


Any time that an object is launched into the air with a given velocity and
launch angle, the path the object travels is determined almost exclusively by
the force of gravity. Whether in sports such as archery or shotput, in military
applications with artillery, or in important fields like firefighting, it is important
to be able to know when and where a launched projectile will land. We can use
our knowledge of vector-valued functions in order to completely determine the
path traveled by an object that is launched from a given position at a given
angle from the horizontal with a given initial velocity.

Figure 9.7.7: Projectile motion, showing the initial velocity v(0) at angle θ
from the launch point (x_0, y_0).

Assume we fire a projectile from a launcher and the only force acting on
the fired object is the force of gravity pulling down on the object. That is, we
assume no effect due to spin, wind, or air resistance. With these assumptions,
the motion of the object will be planar, so we can also assume that the motion
occurs in two-dimensional space. Suppose we launch the object from an initial
position (x_0, y_0) at an angle θ with the positive x-axis as illustrated in
Figure 9.7.7, and that we fire the object with an initial speed of v_0 = |v(0)|,
where v(t) is the velocity vector of the object at time t. Assume g is the
positive constant magnitude of the acceleration due to gravity, which acts to
pull the fired object toward the ground (in the negative y direction). Note
particularly that there is no external force acting on the object to move it in
the x direction.
We first observe that since gravity acts only in the downward direction
and the acceleration due to gravity is constant, the acceleration vector
is ⟨0, −g⟩. That is, a(t) = ⟨0, −g⟩. We may use this fact about acceleration,
together with the initial position and initial velocity, in order to fully
determine the position r(t) of the object at time t. In Exercise 9.7.17, you can
work through the details to show that the following general formula holds.
The motion of a projectile.
If an object is launched from a point (x_0, y_0) with initial speed v_0 at
an angle θ with the horizontal, then the position of the object at time
t is given by
\[ r(t) = \left\langle v_0 \cos(\theta)\,t + x_0,\ -\frac{g}{2}t^2 + v_0 \sin(\theta)\,t + y_0 \right\rangle. \]

This assumes that the only force acting on the object is gravity, which
produces a constant downward acceleration of magnitude g.
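The boxed formula can be checked numerically. The sketch below is an illustration only: the value g = 9.8 m/s^2, the launch data, and the helper names `projectile_position` and `second_difference` are all assumptions made for this example. It confirms that the path starts at (x_0, y_0) and that its second derivative is approximately ⟨0, −g⟩.

```python
import math

G = 9.8  # assumed value of g in m/s^2; the text leaves g general

def projectile_position(t, v0, theta, x0=0.0, y0=0.0, g=G):
    """Position <x, y> at time t, following the boxed projectile formula."""
    x = v0 * math.cos(theta) * t + x0
    y = -0.5 * g * t**2 + v0 * math.sin(theta) * t + y0
    return (x, y)

def second_difference(f, t, h=1e-3):
    """Central-difference estimate of the second derivative of a vector function."""
    before, here, after = f(t - h), f(t), f(t + h)
    return tuple((p - 2 * q + s) / h**2 for p, q, s in zip(before, here, after))

def path(t):
    # Hypothetical launch: speed 20, angle pi/6, from the point (1, 2).
    return projectile_position(t, v0=20.0, theta=math.pi / 6, x0=1.0, y0=2.0)

print(path(0.0))                      # the launch point (x0, y0)
print(second_difference(path, 1.0))   # approximately (0, -g)
```

The horizontal component of the estimated acceleration is essentially zero, which reflects the assumption that no force acts in the x direction.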

9.7.6 Summary
• If r is a vector-valued function, then the derivative of r is defined by
\[ r'(t) = \lim_{h \to 0} \frac{r(t+h) - r(t)}{h} \]
for those values of t at which the limit exists, and is computed
componentwise by the formula
\[ r'(t) = x'(t)\,i + y'(t)\,j + z'(t)\,k \]
for those values of t at which x, y, and z are differentiable, where r(t) =
x(t)i + y(t)j + z(t)k.
• The derivative r'(t) of the vector-valued function r tells us the
instantaneous rate of change of r with respect to time, t, which can be
interpreted as a direction vector for the line tangent to the graph of r at the
point r(t), or also as the instantaneous velocity of an object traveling along
the graph defined by r(t) at time t.
• An antiderivative of r is a vector-valued function R such that R'(t) =
r(t). The indefinite integral ∫ r(t) dt of a vector-valued function r is the
general antiderivative of r (which is a collection of all of the
antiderivatives of r, with any two antiderivatives differing by at most a
constant vector). Moreover, if r(t) = x(t)i + y(t)j + z(t)k, then
\[ \int r(t)\,dt = \left( \int x(t)\,dt \right) i + \left( \int y(t)\,dt \right) j + \left( \int z(t)\,dt \right) k. \]

• If an object is launched from a point (x_0, y_0) with initial speed v_0 at
an angle θ with the horizontal, then the position of the object at time t
is given by
\[ r(t) = \left\langle v_0 \cos(\theta)\,t + x_0,\ -\frac{g}{2}t^2 + v_0 \sin(\theta)\,t + y_0 \right\rangle, \]
provided that the only force acting on the object is gravity, which produces
a constant downward acceleration of magnitude g.

Exercises
1. If r(t) = cos(5t)i + sin(5t)j + 6tk, compute:
A. The velocity vector v(t) = ___ i + ___ j + ___ k
B. The acceleration vector a(t) = ___ i + ___ j + ___ k
Note: the coefficients in your answers must be entered in the form of
expressions in the variable t; e.g. “5 cos(2t)”
2. Given that the acceleration vector is a(t) = (−16 cos(4t))i + (−16 sin(4t))j +
(−t)k, the initial velocity is v(0) = i + k, and the initial position vector is
r(0) = i + j + k, compute:
A. The velocity vector v(t) = ___ i + ___ j + ___ k
B. The position vector r(t) = ___ i + ___ j + ___ k
Note: the coefficients in your answers must be entered in the form of
expressions in the variable t; e.g. “5 cos(2t)”
3. Evaluate ∫_0^10 (t i + t^2 j + t^3 k) dt = ___ i + ___ j + ___ k.
4. Find parametric equations for the tangent line at the point
(cos(−5π/6), sin(−5π/6), −5π/6) on the curve x = cos t, y = sin t, z = t.
x(t) = ___
y(t) = ___
z(t) = ___
(Your line should be parametrized so that it passes through the given point
at t = 0.)
5. If r(t) = cos(−3t)i + sin(−3t)j − 2tk,
compute r'(t) = ___ i + ___ j + ___ k
and ∫ r(t) dt = ___ i + ___ j + ___ k + C,
with C a constant vector.
6. For the given position vectors r(t),
compute the (tangent) velocity vector r'(t) for the given value of t.
A) Let r(t) = (cos 2t, sin 2t).
Then r'(π/4) = ( ___ , ___ ).
B) Let r(t) = (t^2, t^3).
Then r'(1) = ( ___ , ___ ).
C) Let r(t) = e^{2t} i + e^{−t} j + t k.
Then r'(−5) = ___ i + ___ j + ___ k.
7. Suppose r(t) = cos(πt) i + sin(πt) j + 3t k represents the position of a
particle on a helix, where z is the height of the particle.
(a) What is t when the particle has height 12?
t=
(b) What is the velocity of the particle when its height is 12?
v =
(c) When the particle has height 12, it leaves the helix and moves along
the tangent line at the constant velocity found in part (b). Find a vector
parametric equation for the position of the particle (in terms of the original
parameter t) as it moves along this tangent line.
L(t) =
8. Suppose the displacement of a particle in motion at time t is given by the
parametric equations
x(t) = (4t − 1)^3, y(t) = 4, z(t) = 768t^4 − 256t^3.
(a) Find the speed of the particle when t = 3.
Speed =
(b) Find t when the particle is stationary.
t=

9. Find the derivative of the vector function
r(t) = ta × (b + tc), where
a = ⟨2, 5, −2⟩, b = ⟨1, 4, 2⟩, and c = ⟨−2, −4, 1⟩.
r'(t) = ⟨ ___ , ___ , ___ ⟩
10. Let c_1(t) = (e^{3t}, sin(2t), 2t^3) and c_2(t) = (e^{−2t}, cos(4t), −4t^3).
d/dt [c_1(t) · c_2(t)] = ___
d/dt [c_1(t) × c_2(t)] = ___ i + ___ j + ___ k
11. A gun has a muzzle speed of 100 meters per second. What angle
of elevation should be used to hit an object 160 meters away? Neglect air
resistance and use g = 9.8 m/sec^2 as the acceleration of gravity.
Answer: ___ radians
12. A child wanders slowly down a circular staircase from the top of a
tower. With x, y, z in feet and the origin at the base of the tower, her position
t minutes from the start is given by

x = 30 cos t, y = 30 sin t, z = 120 − 5t.

(a) How tall is the tower?
height = ___ ft
(b) When does the child reach the bottom?
time = ___ minutes
(c) What is her speed at time t?
speed = ___ ft/min
(d) What is her acceleration at time t?
acceleration = ___ ft/min^2
13. Compute the derivative of each of the following functions in two different
ways: (1) use the rules provided in the theorem stated just after Activity 9.7.3,
and (2) rewrite each given function so that it is stated as a single function
(either a scalar function or a vector-valued function with three components),
and differentiate component-wise. Compare your answers to ensure that they
are the same.
a. r(t) = sin(t)⟨2t, t^2, arctan(t)⟩
b. s(t) = r(2t ), where r(t) = ⟨t + 2, ln(t), 1⟩.
c. r(t) = ⟨cos(t), sin(t), t⟩ · ⟨− sin(t), cos(t), 1⟩
d. r(t) = ⟨cos(t), sin(t), t⟩ × ⟨− sin(t), cos(t), 1⟩

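14. Consider the two vector-valued functions given by
\[ r(t) = \left\langle t + 1,\ \cos\left( \frac{\pi}{2} t \right),\ \frac{1}{1+t} \right\rangle \]
and
\[ w(s) = \left\langle s^2,\ \sin\left( \frac{\pi}{2} s \right),\ s \right\rangle. \]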
a. Determine the point of intersection of the curves generated by r(t) and
w(s). To do so, you will have to find values of a and b that result in r(a)
and w(b) being the same vector.

b. Use the value of a you determined in (a) to find a vector form of the
tangent line to r(t) at the point where t = a.
c. Use the value of b you determined in (a) to find a vector form of the
tangent line to w(s) at the point where s = b.
d. Suppose that z = f (x, y) is a function that generates a surface in three-
dimensional space, and that the curves generated by r(t) and w(s) both
lie on this surface. Note particularly that the point of intersection you
found in (a) lies on this surface. In addition, observe that the two tangent
lines found in (b) and (c) both lie in the tangent plane to the surface at
the point of intersection. Use your preceding work to determine the
equation of this tangent plane.

15. In this exercise, we determine the equation of a plane tangent to the
surface defined by f(x, y) = √(x^2 + y^2) at the point (3, 4, 5).

a. Find a parameterization for the x = 3 trace of f. What is a direction
vector for the line tangent to this trace at the point (3, 4, 5)?
b. Find a parameterization for the y = 4 trace of f . What is a direction
vector for the line tangent to this trace at the point (3, 4, 5)?
c. The direction vectors in parts (a) and (b) form a plane containing the
point (3, 4, 5). What is a normal vector for this plane?
d. Use your work in parts (a), (b), and (c) to determine an equation for the
tangent plane. Then, use appropriate technology to draw the graph of
f and the plane you determined on the same set of axes. What do you
observe? (We will discuss tangent planes in more detail in Chapter 10.)

16. For each given function r, determine ∫ r(t) dt. In addition, recalling
the Fundamental Theorem of Calculus for functions of a single variable, also
evaluate ∫_0^1 r(t) dt for each given function r. Is the resulting quantity a
scalar or a vector? What does it measure?

a. r(t) = ⟨cos(t), 1/(t + 1), t e^t⟩

b. r(t) = ⟨cos(3t), sin(2t), t⟩

c. r(t) = ⟨t/(1 + t^2), t e^{t^2}, 1/(1 + t^2)⟩

17. In this exercise, we develop the formula for the position function of a
projectile that has been launched at an initial speed of |v_0| and a launch angle
of θ. Recall that a(t) = ⟨0, −g⟩ is the constant acceleration of the projectile at
any time t.
a. Find all velocity vectors for the given acceleration vector a. When
you anti-differentiate, remember that there is an arbitrary constant that
arises in each component.
b. Use the given information about initial speed and launch angle to find
v_0, the initial velocity of the projectile. You will want to write the vector
in terms of its components, which will involve sin(θ) and cos(θ).
c. Next, find the specific velocity vector function v for the projectile. That
is, combine your work in (a) and (b) in order to determine expressions
in terms of |v_0| and θ for the constants that arose when integrating.

d. Find all possible position vectors for the velocity vector v(t) you deter-
mined in (c).
e. Let r(t) denote the position vector function for the given projectile. Use
the fact that the object is fired from the position (x_0, y_0) to show it
follows that
\[ r(t) = \left\langle |v_0| \cos(\theta)\,t + x_0,\ -\frac{g}{2}t^2 + |v_0| \sin(\theta)\,t + y_0 \right\rangle. \]

18. A central force is one that acts on an object so that the force F is
parallel to the object’s position r. Since Newton’s Second Law says that an
object’s acceleration is proportional to the force exerted on it, the acceleration
a of an object moving under a central force will be parallel to its position r.
For instance, the Earth’s acceleration due to the gravitational force that the
sun exerts on the Earth is parallel to the Earth’s position vector as shown in
Figure 9.7.8.

Figure 9.7.8: A central force. (Labels in the figure: Sun, Earth, and the
Earth’s position vector r.)

a. If an object of mass m is moving under a central force, the angular
momentum vector is defined to be L = mr × v. Assuming the mass is
constant, show that the angular momentum is constant by showing that
\[ \frac{dL}{dt} = 0. \]

b. Explain why L · r = 0.
c. Explain why we may conclude that the object is constrained to lie in the
plane passing through the origin and perpendicular to L.

9.8 Arc Length and Curvature

Motivating Questions

• How can a definite integral be used to measure the length of a curve in
2- or 3-space?

• Why is arc length useful as a parameter?

• What is the curvature of a curve?

Given a space curve, there are two natural geometric questions one might
ask: how long is the curve and how much does it bend? In this section, we
answer both questions by developing techniques for measuring the length of a
space curve as well as its curvature.

Preview Activity 9.8.1. In earlier investigations, we have used integration
to calculate quantities such as area, volume, mass, and work. We are now
interested in determining the length of a space curve.
Consider the smooth curve in 3-space defined by the vector-valued function
r, where
\[ r(t) = \langle x(t), y(t), z(t) \rangle = \langle \cos(t), \sin(t), t \rangle \]
for t in the interval [0, 2π]. Pictures of the graph of r are shown in Figure 9.8.1.
We will use the integration process to calculate the length of this curve. In
this situation we partition the interval [0, 2π] into n subintervals of equal length
and let 0 = t_0 < t_1 < t_2 < · · · < t_n = 2π be the endpoints of the subintervals.
We then approximate the length of the curve on each subinterval with some
related quantity that we can compute. In this case, we approximate the length
of the curve on each subinterval with the length of the segment connecting the
endpoints. Figure 9.8.1 illustrates the process in three different instances using
increasing values of n.

Figure 9.8.1: Approximating the length of the curve with n = 3, n = 6, and
n = 9. The three panels report the estimates 8.15, 8.69, and 8.80, respectively.

a. Write a formula for the length of the line segment that connects the
endpoints of the curve on the ith subinterval [ti−1 , ti ]. (This length is
our approximation of the length of the curve on this interval.)

b. Use your formula in part (a) to write a sum that adds all of the approx-
imations to the lengths on each subinterval.

c. What do we need to do with the sum in part (b) in order to obtain the
exact value of the length of the graph of r(t) on the interval [0, 2π]?

9.8.1 Arc Length


Consider a smooth curve in 3-space that is parametrically described by the
vector-valued function r defined by r(t) = ⟨x(t), y(t), z(t)⟩. Preview
Activity 9.8.1 shows that to approximate the length of the curve defined by r(t)
as the values of t run over an interval [a, b], we partition the interval [a, b]
into n subintervals of equal length ∆t, with a = t_0 < t_1 < · · · < t_n = b
as the endpoints of the subintervals. On each subinterval, we approximate
the length of the curve by the length of the line segment connecting the
endpoints. The points on the curve corresponding to t = t_{i−1} and t = t_i are
(x(t_{i−1}), y(t_{i−1}), z(t_{i−1})) and (x(t_i), y(t_i), z(t_i)), respectively, so
the length of the line segment connecting these points is
\[ \sqrt{(x(t_i) - x(t_{i-1}))^2 + (y(t_i) - y(t_{i-1}))^2 + (z(t_i) - z(t_{i-1}))^2}. \]

Now we add all of these approximations together to obtain an approximation
to the length L of the curve:
\[ L \approx \sum_{i=1}^{n} \sqrt{(x(t_i) - x(t_{i-1}))^2 + (y(t_i) - y(t_{i-1}))^2 + (z(t_i) - z(t_{i-1}))^2}. \]
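The sum of segment lengths can be computed directly. The following is a minimal sketch for the helix of Preview Activity 9.8.1; the helper name `polygonal_length` is hypothetical. With n = 3, 6, and 9 it produces values that round to the estimates reported in Figure 9.8.1.

```python
import math

def r(t):
    """The helix r(t) = <cos t, sin t, t> from Preview Activity 9.8.1."""
    return (math.cos(t), math.sin(t), t)

def polygonal_length(curve, a, b, n):
    """Sum the lengths of the n segments joining curve(t_{i-1}) to curve(t_i)."""
    dt = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        p = curve(a + (i - 1) * dt)   # point at t_{i-1}
        q = curve(a + i * dt)         # point at t_i
        total += math.dist(p, q)      # Euclidean length of the segment
    return total

for n in (3, 6, 9):
    print(f"n = {n}: {polygonal_length(r, 0.0, 2 * math.pi, n):.2f}")
```

Because each segment is no longer than the arc it spans, these sums approach the length of the curve from below as n grows.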

We now want to take the limit of this sum as n goes to infinity, but in
its present form it might be difficult to see how. We first introduce ∆t by
multiplying by ∆t/∆t, and see that
\begin{align*}
L &\approx \sum_{i=1}^{n} \sqrt{(x(t_i) - x(t_{i-1}))^2 + (y(t_i) - y(t_{i-1}))^2 + (z(t_i) - z(t_{i-1}))^2} \\
  &= \sum_{i=1}^{n} \sqrt{(x(t_i) - x(t_{i-1}))^2 + (y(t_i) - y(t_{i-1}))^2 + (z(t_i) - z(t_{i-1}))^2}\ \frac{\Delta t}{\Delta t} \\
  &= \sum_{i=1}^{n} \sqrt{(x(t_i) - x(t_{i-1}))^2 + (y(t_i) - y(t_{i-1}))^2 + (z(t_i) - z(t_{i-1}))^2}\ \frac{\Delta t}{\sqrt{(\Delta t)^2}}.
\end{align*}

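To get the difference quotients under the radical, we use properties of the
square root function to see further that
\begin{align*}
L &\approx \sum_{i=1}^{n} \sqrt{\left[ (x(t_i) - x(t_{i-1}))^2 + (y(t_i) - y(t_{i-1}))^2 + (z(t_i) - z(t_{i-1}))^2 \right] \frac{1}{(\Delta t)^2}}\ \Delta t \\
  &= \sum_{i=1}^{n} \sqrt{\left( \frac{x(t_i) - x(t_{i-1})}{\Delta t} \right)^2 + \left( \frac{y(t_i) - y(t_{i-1})}{\Delta t} \right)^2 + \left( \frac{z(t_i) - z(t_{i-1})}{\Delta t} \right)^2}\ \Delta t.
\end{align*}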

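Recall that as n → ∞ we also have ∆t → 0. Since
\[ x'(t) = \lim_{\Delta t \to 0} \frac{x(t_i) - x(t_{i-1})}{\Delta t}, \quad
   y'(t) = \lim_{\Delta t \to 0} \frac{y(t_i) - y(t_{i-1})}{\Delta t}, \quad \text{and} \quad
   z'(t) = \lim_{\Delta t \to 0} \frac{z(t_i) - z(t_{i-1})}{\Delta t}, \]
we see that
\[ \lim_{n \to \infty} \sum_{i=1}^{n} \sqrt{\left( \frac{x(t_i) - x(t_{i-1})}{\Delta t} \right)^2 + \left( \frac{y(t_i) - y(t_{i-1})}{\Delta t} \right)^2 + \left( \frac{z(t_i) - z(t_{i-1})}{\Delta t} \right)^2}\ \Delta t \]
is equal to
\[ \int_a^b \sqrt{(x'(t))^2 + (y'(t))^2 + (z'(t))^2}\ dt. \]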

Noting further that
\[ |r'(t)| = \sqrt{(x'(t))^2 + (y'(t))^2 + (z'(t))^2}, \]
we can rewrite our arc length formula in a more succinct form as follows.

The length of a curve.
If r(t) defines a smooth curve C on an interval [a, b], then the length L
of C is given by
\[ L = \int_a^b |r'(t)|\,dt. \tag{9.8.1} \]

Note that formula (9.8.1) applies to curves in space of any dimension.
Moreover, this formula has a natural interpretation: if r(t) records the
position of a moving object, then r'(t) is the object’s velocity and |r'(t)| its
speed. Formula (9.8.1) says that we simply integrate the speed of an object
traveling over the curve to find the distance traveled by the object, which is
the same as the length of the curve, just as in one-variable calculus.

Activity 9.8.2. Here we calculate the arc length of two familiar curves.

a. Use Equation (9.8.1) to calculate the circumference of a circle of radius r.

b. Find the exact length of the spiral defined by r(t) = ⟨cos(t), sin(t), t⟩ on
the interval [0, 2π].

We can adapt the arc length formula to curves in 2-space that define y as
a function of x as the following activity shows.

Activity 9.8.3. Let y = f(x) define a smooth curve in 2-space. Parameterize
this curve and use Equation (9.8.1) to show that the length of the curve defined
by f on an interval [a, b] is
\[ \int_a^b \sqrt{1 + [f'(t)]^2}\ dt. \]

9.8.2 Parameterizing With Respect To Arc Length


In addition to helping us to find the length of space curves, the expression
for the length of a curve enables us to find a natural parametrization of space
curves in terms of arc length, as we now explain.
Shown below in Figure 9.8.2 is a portion of the parabola y = x^2/2. Of
course, this space curve may be parametrized by the vector-valued function r
defined by r(t) = ⟨t, t^2/2⟩ as shown on the left, where we see the location at
a few different times t. Notice that the points are not equally spaced on the
curve.
A more natural parameter describing the points along the space curve is
the distance traveled s as we move along the parabola starting at the origin.
For instance, the right side of Figure 9.8.2 shows the points corresponding to
various values of s. We call this an arc length parametrization.

Figure 9.8.2: The parametrization r(t) (left) and a reparametrization by arc
length (right). Points are marked at t = 0.0, 0.5, 1.0, 1.5, 2.0 on the left and
at s = 0.0, 0.5, 1.0, 1.5, 2.0, 2.5 on the right.

To see that this is a more natural parametrization, consider an interstate
highway cutting across a state. One way to parametrize the curve defined by
the highway is to drive along the highway and record our position at every time,
thus creating a function r. If we encounter an accident or road construction,
however, this parametrization might not be at all relevant to another person
driving the same highway. An arc length parametrization, however, is like using
the mile markers on the side of the road to specify our position on the highway.
If we know how far we’ve traveled along the highway, we know exactly where we
are.
If we begin with a parametrization of a space curve, we can modify it to find
an arc length parametrization, as we now describe. Suppose that the curve is
parametrized by the vector-valued function r = r(t) where t is in the interval
[a, b]. We define the parameter s through the function
\[ s = L(t) = \int_a^t \sqrt{(x'(w))^2 + (y'(w))^2 + (z'(w))^2}\ dw, \]
which measures the length along the curve from r(a) to r(t).
The Fundamental Theorem of Calculus shows us that
\[ \frac{ds}{dt} = L'(t) = \sqrt{(x'(t))^2 + (y'(t))^2 + (z'(t))^2} = |r'(t)| \tag{9.8.2} \]
and so
\[ L(t) = \int_a^t \left| \frac{d}{dw} r(w) \right| dw. \]

If we assume that r'(t) is never 0, then L'(t) > 0 for all t and s = L(t) is
always increasing. This should seem reasonable: unless we stop, the distance
traveled along the curve increases as we move along the curve.
Since s = L(t) is an increasing function, it is invertible, which means we
may view the time t as a function of the distance traveled; that is, we have
the relationship t = L^{−1}(s). We then obtain the arc length parametrization
by composing r(t) with t = L^{−1}(s) to obtain r(s). Let’s illustrate this with an
example.
Example 9.8.3. Consider a circle of radius 5 in 2-space centered at the origin.
We know that we can parameterize this circle as
\[ r(t) = \langle 5\cos(t), 5\sin(t) \rangle, \]
where t runs from 0 to 2π. We see that r'(t) = ⟨−5 sin(t), 5 cos(t)⟩, and hence
|r'(t)| = 5. It then follows that
\[ s = L(t) = \int_0^t |r'(w)|\,dw = \int_0^t 5\,dw = 5t. \]
Since s = L(t) = 5t, we may solve for t in terms of s to obtain t(s) =
L^{−1}(s) = s/5. We then find the arc length parametrization by composing
\[ r(t(s)) = r(L^{-1}(s)) = \left\langle 5\cos\left( \frac{s}{5} \right), 5\sin\left( \frac{s}{5} \right) \right\rangle. \]
More generally, for a circle of radius a centered at the origin, a similar
computation shows that
\[ \left\langle a\cos\left( \frac{s}{a} \right), a\sin\left( \frac{s}{a} \right) \right\rangle \tag{9.8.3} \]
is an arc length parametrization.
Notice that equation (9.8.2) shows that
\[ \frac{dr}{dt} = \frac{dr}{ds} \frac{ds}{dt} = \frac{dr}{ds} |r'(t)|, \]
so
\[ \left| \frac{dr}{ds} \right| = \left| \frac{1}{|r'(t)|} \frac{dr}{dt} \right| = 1, \]
which means that we move along the curve with unit speed when we
parameterize by arc length. This is clearly seen in Example 9.8.3 where
|r'(s)| = 1. It follows that the parameter s is the distance traveled along the
curve, as shown by:
\[ L(s) = \int_0^s \left| \frac{d}{dw} r(w) \right| dw = \int_0^s 1\,dw = s. \]

Activity 9.8.4. In this activity we parameterize a line in 2-space in terms of
arc length. Consider the line with parametric equations
\[ x(t) = x_0 + at \quad \text{and} \quad y(t) = y_0 + bt. \]

a. To write t in terms of s, evaluate the integral
\[ s = L(t) = \int_0^t \sqrt{(x'(w))^2 + (y'(w))^2}\ dw \]
to determine the length of the line from time 0 to time t.

b. Use the formula from (a) for s in terms of t to write t in terms of s. Then
explain why a parameterization of the line in terms of arc length is
\[ x(s) = x_0 + \frac{a}{\sqrt{a^2 + b^2}}\,s \quad \text{and} \quad y(s) = y_0 + \frac{b}{\sqrt{a^2 + b^2}}\,s. \tag{9.8.4} \]

A slightly more complicated example follows.

Example 9.8.4. Let us parameterize the curve defined by
\[ r(t) = \left\langle t^2,\ \frac{8}{3} t^{3/2},\ 4t \right\rangle \]
for t ≥ 0 in terms of arc length. To write t in terms of s we find s in terms of
t:
\begin{align*}
s(t) &= \int_0^t \sqrt{(x'(w))^2 + (y'(w))^2 + (z'(w))^2}\ dw \\
     &= \int_0^t \sqrt{(2w)^2 + (4w^{1/2})^2 + (4)^2}\ dw \\
     &= \int_0^t \sqrt{4w^2 + 16w + 16}\ dw \\
     &= 2 \int_0^t \sqrt{(w + 2)^2}\ dw \\
     &= 2 \int_0^t (w + 2)\,dw \\
     &= \left[ w^2 + 4w \right]_0^t \\
     &= t^2 + 4t.
\end{align*}

Since t ≥ 0, we can solve the equation s = t^2 + 4t (or t^2 + 4t − s = 0) for
t to obtain t = (−4 + √(16 + 4s))/2 = −2 + √(4 + s). So we can parameterize our
curve in terms of arc length by
\[ r(s) = \left\langle \left( -2 + \sqrt{4 + s} \right)^2,\ \frac{8}{3} \left( -2 + \sqrt{4 + s} \right)^{3/2},\ 4 \left( -2 + \sqrt{4 + s} \right) \right\rangle. \]
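One way to sanity-check the result of Example 9.8.4 is to confirm numerically that the new parametrization moves along the curve with unit speed. The sketch below estimates |dr/ds| with a central difference; the sample values of s and the helper names `r_of_s` and `speed` are arbitrary choices made for this illustration.

```python
import math

def r_of_s(s):
    """Arc length parametrization r(s) obtained in Example 9.8.4."""
    t = -2.0 + math.sqrt(4.0 + s)          # t = L^{-1}(s)
    return (t**2, (8.0 / 3.0) * t**1.5, 4.0 * t)

def speed(curve, s, h=1e-5):
    """Central-difference estimate of |d(curve)/ds| at the value s."""
    forward, backward = curve(s + h), curve(s - h)
    return math.dist(forward, backward) / (2 * h)

for s in (0.5, 1.0, 5.0, 12.0):
    print(s, speed(r_of_s, s))  # each estimate should be close to 1
```

A speed of 1 at every sampled point is consistent with s measuring distance traveled along the curve.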

These examples illustrate a general method. Of course, evaluating an arc
length integral and finding a formula for the inverse of a function can be
difficult,
cult, so while this process is theoretically possible, it is not always practical to
parameterize a curve in terms of arc length. However, we can guarantee that
such a parameterization exists, and this observation plays an important role
in the next section.
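When no convenient formula for the inverse is at hand, the same two steps (compute s = L(t), then invert) can be carried out numerically. The following is a rough sketch for the parabola r(t) = ⟨t, t^2/2⟩ of Figure 9.8.2, assuming a midpoint rule for the integral and bisection for the inversion; the function names, step counts, and the search bound `t_hi` are assumptions of this example rather than anything prescribed by the text.

```python
import math

def speed(t):
    """|r'(t)| for r(t) = <t, t^2/2>, since r'(t) = <1, t>."""
    return math.hypot(1.0, t)

def arc_length(t, n=2000):
    """Midpoint-rule estimate of s = L(t), the integral of speed from 0 to t."""
    dw = t / n
    return sum(speed((i + 0.5) * dw) for i in range(n)) * dw

def t_from_s(s, t_hi=10.0, tol=1e-9):
    """Invert s = L(t) by bisection on [0, t_hi]; valid because L is increasing.

    Assumes t_hi is large enough that L(t_hi) >= s.
    """
    lo, hi = 0.0, t_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if arc_length(mid) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for s in (0.5, 1.0, 1.5, 2.0, 2.5):
    t = t_from_s(s)
    print(f"s = {s}: t = {t:.4f}, point = ({t:.4f}, {t * t / 2:.4f})")
```

The printed points are equally spaced along the curve in the sense of arc length, in the manner of the right-hand panel of Figure 9.8.2.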

9.8.3 Curvature
For a smooth space curve, the curvature measures how fast the curve is bending
or changing direction at a given point. For example, we expect that a line
should have zero curvature everywhere, while a circle (which is bending the
same at every point) should have constant curvature. Circles with larger radii
should have smaller curvatures.
To measure the curvature, we first need to describe the direction of the
curve at a point. We may do this using a continuously varying tangent vector
to the curve, as shown at left in Figure 9.8.5. The direction of the curve is
then determined by the angle φ each tangent vector makes with a horizontal
vector, as shown at right in Figure 9.8.5.

Figure 9.8.5: Left: Tangent vectors to an ellipse. Right: Angles of tangent
vectors.

Informally speaking, the curvature will be the rate at which the angle φ
is changing as we move along the curve. Of course, this rate of change will
depend on how we move along the curve; if we move with a greater speed
along the curve, then φ will change more rapidly. This is why the speed limit
is sometimes lowered when we enter a curve on a highway. In other words, the
rate of change of φ will depend on the parametrization we use to describe the
space curve. To eliminate this dependence on the parametrization, we choose
to work with an arc length parametrization r(s), which means we move along
the curve with unit speed.
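Using an arc length parametrization r(s), we define the tangent vector
T(s) = r'(s), and note that |T(s)| = 1; that is, T(s) is a unit tangent vector.
We then have T(s) = ⟨cos(φ(s)), sin(φ(s))⟩, which means that
\[ \frac{dT}{ds} = \left\langle -\sin(\varphi(s)) \frac{d\varphi}{ds},\ \cos(\varphi(s)) \frac{d\varphi}{ds} \right\rangle = \frac{d\varphi}{ds} \langle -\sin(\varphi(s)), \cos(\varphi(s)) \rangle. \]

Therefore
\[ \left| \frac{dT}{ds} \right| = \left| \langle -\sin(\varphi(s)), \cos(\varphi(s)) \rangle \right| \left| \frac{d\varphi}{ds} \right| = \left| \frac{d\varphi}{ds} \right|. \]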

This observation leads us to adopt the following definition.

Definition 9.8.6. If C is a smooth space curve and s is an arc length
parameter for C, then the curvature, κ, of C is
\[ \kappa = \kappa(s) = \left| \frac{dT}{ds} \right|. \]

Note that κ is the Greek lowercase letter “kappa”.

Activity 9.8.5.

a. We should expect that the curvature of a line is 0 everywhere. To show
that our definition of curvature measures this correctly in 2-space, recall
that (9.8.4) gives us the arc length parameterization
\[ x(s) = x_0 + \frac{a}{\sqrt{a^2 + b^2}}\,s \quad \text{and} \quad y(s) = y_0 + \frac{b}{\sqrt{a^2 + b^2}}\,s \]
of a line. Use this information to explain why the curvature of a line is
0 everywhere.

b. Recall that an arc length parameterization of a circle in 2-space of radius
a centered at the origin is, from (9.8.3),
\[ r(s) = \left\langle a\cos\left( \frac{s}{a} \right), a\sin\left( \frac{s}{a} \right) \right\rangle. \]

Show that the curvature of this circle is the constant 1/a. What can you
say about the relationship between the size of the radius of a circle and
the value of its curvature? Why does this make sense?

The definition of curvature relies on our ability to parameterize curves in
terms of arc length. Since we have seen that finding an arc length
parametrization
tion can be difficult, we would like to be able to express the curvature in terms
of a more general parametrization r(t).
To begin, we need to describe the vector T, which is a vector tangent to
the curve having unit length. Of course, the velocity vector r'(t) is tangent to
the curve; we simply need to normalize its length to be one. This means that
we may take
\[ T(t) = \frac{r'(t)}{|r'(t)|}. \tag{9.8.5} \]
Then the curvature of the curve defined by r is
\begin{align*}
\kappa &= \left| \frac{dT}{ds} \right| \\
       &= \left| \frac{dT}{dt} \frac{dt}{ds} \right| \\
       &= \frac{\left| \frac{dT}{dt} \right|}{\left| \frac{ds}{dt} \right|} \\
       &= \frac{|T'(t)|}{|r'(t)|}.
\end{align*}

This last formula allows us to use any parameterization of a curve to cal-


culate its curvature. There is another useful formula, given below, whose
derivation is left for the exercises.
Formulas for curvature.
If r is a vector-valued function defining a smooth space curve C, and if
r'(t) is not zero and r''(t) exists, then the curvature κ of C satisfies
• κ = κ(t) = |T'(t)| / |r'(t)|

• κ = |r'(t) × r''(t)| / |r'(t)|^3.
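The second formula translates directly into a short computation once r'(t) and r''(t) are known. The sketch below is illustrative only: the helper names `cross`, `norm`, and `curvature` and the two sample curves (a circle of radius 5 placed in the plane z = 0, and a line) are assumptions of this example, with the derivatives entered by hand.

```python
import math

def cross(u, v):
    """Cross product of two vectors in 3-space."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    """Length of a vector."""
    return math.sqrt(sum(c * c for c in u))

def curvature(r_prime, r_double_prime, t):
    """kappa(t) = |r'(t) x r''(t)| / |r'(t)|^3 for a curve in 3-space."""
    v, a = r_prime(t), r_double_prime(t)
    return norm(cross(v, a)) / norm(v) ** 3

# Circle of radius 5 in the plane z = 0: r(t) = <5 cos t, 5 sin t, 0>.
circle_v = lambda t: (-5 * math.sin(t), 5 * math.cos(t), 0.0)
circle_a = lambda t: (-5 * math.cos(t), -5 * math.sin(t), 0.0)

# Line r(t) = <1 + 2t, 3 - t, 4t>: constant velocity, zero acceleration.
line_v = lambda t: (2.0, -1.0, 4.0)
line_a = lambda t: (0.0, 0.0, 0.0)

print(curvature(circle_v, circle_a, 0.7))  # close to 1/5
print(curvature(line_v, line_a, 0.7))      # 0.0
```

Both results match what the definition leads us to expect: a circle of radius 5 has constant curvature 1/5, and a line has curvature 0.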

Activity 9.8.6. Use one of the two formulas for κ in terms of t to help you
answer the following questions.

a. The ellipse x^2/a^2 + y^2/b^2 = 1 has parameterization

r(t) = ⟨a cos(t), b sin(t)⟩.

Find the curvature of the ellipse. Assuming 0 < b < a, at what points
is the curvature the greatest and at what points is the curvature the
smallest? Does this agree with your intuition?

b. The standard helix has parameterization r(t) = cos(t)i + sin(t)j + tk. Find
the curvature of the helix. Does the result agree with your intuition?

The curvature has another interpretation. Recall that the tangent line to a
curve at a point is the line that best approximates the curve at that point. The
curvature at a point on a curve describes the circle that best approximates the
curve at that point. Since a circle of radius a has curvature 1/a, the circle
that best approximates the curve near a point where the curvature κ is nonzero
has radius 1/κ, is tangent to the tangent line at that point, and has its
center on the concave side of the curve. This circle, called
the osculating circle of the curve at the point, is shown in Figure 9.8.7 for a
portion of a parabola.

Figure 9.8.7: The osculating circle

9.8.4 Summary

• The integration process shows that the length L of a smooth curve defined
by r(t) on an interval [a, b] is
\[ L = \int_a^b |r'(t)|\,dt. \]

• Arc length is useful as a parameter because when we parameterize with
respect to arc length, we eliminate the role of speed in our calculation of
curvature and the result is a measure that depends only on the geometry
of the curve and not on the parameterization of the curve.

• We define the curvature κ of a curve in 2- or 3-space to be the magnitude
of the rate of change of the unit tangent vector with respect to arc
length, or
\[ \kappa = \left| \frac{dT}{ds} \right|. \]

Exercises
1. Find the length of the curve

x = t − 5, y = 5 + 4t, z = 2 + 2t,

for 4 ≤ t ≤ 6.
length = ___
(Think of a second way that you could calculate this length, too, and see that
you get the same result.)
2. Consider the curve r = (e^{−3t} cos(−4t), e^{−3t} sin(−4t), e^{−3t}).
Compute the arc length function s(t) (with initial point t = 0).
3. Find the length of the given curve:

r (t) = (−2t, 4 sin t, 4 cos t)

where −2 ≤ t ≤ 5.
4. Find the curvature of y = sin(−x) at x = π/4.
5. Consider the path r(t) = (12t, 6t^2, 6 ln t) defined for t > 0.
Find the length of the curve between the points (12, 6, 0) and (24, 24, 6 ln(2)).
6. Find the curvature κ(t) of the curve r(t) = (sin t) i + (sin t) j + (4 cos t) k.
7. A factory has a machine which bends wire at a rate of 6 unit(s) of
curvature per second. How long does it take to bend a straight wire into a
circle of radius 8?
___ seconds
8. Find the unit tangent vector at the indicated point of the vector function

r(t) = e^{18t} cos t i + e^{18t} sin t j + e^{18t} k

T(π/2) = ⟨ ___ , ___ , ___ ⟩
9. Consider the vector function
r(t) = ⟨t, t^9, t^8⟩
Compute
r'(t) = ⟨ ___ , ___ , ___ ⟩
T(1) = ⟨ ___ , ___ , ___ ⟩
r''(t) = ⟨ ___ , ___ , ___ ⟩
r'(t) × r''(t) = ⟨ ___ , ___ , ___ ⟩
10. Starting from the point (−5, 2, 5), reparametrize the curve
x(t) = (−5 + 2t, 2 − 3t, 5 − t) in terms of arc length.
y(s) = ( ___ , ___ , ___ )
11. Consider the moving particle whose position at time t in seconds is given
by the vector-valued function r defined by r(t) = 5ti + 4 sin(3t)j + 4 cos(3t)k.
Use this function to answer each of the following questions.
a. Find the unit tangent vector, T(t), to the space curve traced by r(t) at
time t. Write one sentence that explains what T(t) tells us about the
particle’s motion.
b. Determine the speed of the particle moving along the space curve with
the given parameterization.
c. Find the exact distance traveled by the particle on the time interval
[0, π/3].
d. Find the average velocity of the particle on the time interval [0, π/3].
e. Determine the parameterization of the given curve with respect to arc
length.

12. Let y = f (x) define a curve in the plane. We can consider this curve as
a curve in three-space with z-coordinate 0.

a. Find a parameterization of the form r(t) = ⟨x(t), y(t), z(t)⟩ of the curve
y = f(x) in three-space.

b. Use the formula
$$\kappa = \frac{|\mathbf{r}'(t) \times \mathbf{r}''(t)|}{|\mathbf{r}'(t)|^3}$$
to show that
$$\kappa = \frac{|f''(x)|}{\left[1 + (f'(x))^2\right]^{3/2}}.$$

13. Consider the single variable function defined by $y = 4x^2 - x^3$.

a. Find a parameterization of the form r(t) = ⟨x(t), y(t)⟩ that traces the
curve $y = 4x^2 - x^3$ on the interval from x = −3 to x = 3.

b. Write a definite integral which, if evaluated, gives the exact length of
the given curve from x = −3 to x = 3. Why is the integral difficult to
evaluate exactly?

c. Determine the curvature, κ(t), of the parameterized curve. (Exercise 9.8.12
might be useful here.)

d. Use appropriate technology to approximate the absolute maximum and
minimum of κ(t) on the parameter interval for your parameterization.
Compare your results with the graph of $y = 4x^2 - x^3$. How do the
absolute maximum and absolute minimum of κ(t) align with the original
curve?

14. Consider the standard helix parameterized by r(t) = cos(t)i+sin(t)j+tk.

a. Recall that the unit tangent vector, T(t), is the vector tangent to the
curve at time t that points in the direction of motion and has length 1.
Find T(t).

b. Explain why the fact that |T(t)| = 1 implies that T and T′ are orthogonal
vectors for every value of t. (Hint: note that $\mathbf{T} \cdot \mathbf{T} = |\mathbf{T}|^2 = 1$, and
compute $\frac{d}{dt}[\mathbf{T} \cdot \mathbf{T}]$.)

c. For the given function r with unit tangent vector T(t) (from (a)), determine
$\mathbf{N}(t) = \frac{1}{|\mathbf{T}'(t)|}\mathbf{T}'(t)$.

d. What geometric properties does N(t) have? That is, how long is this
vector, and how is it situated in comparison to T(t)?

e. Let B(t) = T(t) × N(t), and compute B(t) in terms of your results in
(a) and (c).

f. What geometric properties does B(t) have? That is, how long is this
vector, and how is it situated in comparison to T(t) and N(t)?

g. Sketch a plot of the given helix, and compute and sketch T(π/2), N(π/2),
and B(π/2).

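15. In this exercise we verify the curvature formula
$$\kappa = \frac{|\mathbf{r}'(t) \times \mathbf{r}''(t)|}{|\mathbf{r}'(t)|^3}.$$

a. Explain why
$$|\mathbf{r}'(t)| = \frac{ds}{dt}.$$

b. Use the fact that $\mathbf{T}(t) = \frac{\mathbf{r}'(t)}{|\mathbf{r}'(t)|}$ and $|\mathbf{r}'(t)| = \frac{ds}{dt}$ to explain why
$$\mathbf{r}'(t) = \frac{ds}{dt}\,\mathbf{T}(t).$$

c. The Product Rule shows that
$$\mathbf{r}''(t) = \frac{d^2 s}{dt^2}\,\mathbf{T}(t) + \frac{ds}{dt}\,\mathbf{T}'(t).$$
Explain why
$$\mathbf{r}'(t) \times \mathbf{r}''(t) = \left(\frac{ds}{dt}\right)^2 \left(\mathbf{T}(t) \times \mathbf{T}'(t)\right).$$

d. In Exercise 9.8.14 we showed that |T(t)| = 1 implies that T(t) is
orthogonal to T′(t) for every value of t. Explain what this tells us about
|T(t) × T′(t)| and conclude that
$$|\mathbf{r}'(t) \times \mathbf{r}''(t)| = \left(\frac{ds}{dt}\right)^2 |\mathbf{T}'(t)|.$$

e. Finally, use the fact that $\kappa = \frac{|\mathbf{T}'(t)|}{|\mathbf{r}'(t)|}$ to verify that
$$\kappa = \frac{|\mathbf{r}'(t) \times \mathbf{r}''(t)|}{|\mathbf{r}'(t)|^3}.$$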

16. In this exercise we explore how to find the osculating circle for a given
curve. As an example, we will use the curve defined by $f(x) = x^2$. Recall that
this curve can be parameterized by $x(t) = t$ and $y(t) = t^2$.

a. Use (9.8.5) to find T(t) for our function f .

b. To find the center of the osculating circle, we will want to find a vector
that points from a point on the curve to the center of the circle. Such a
vector will be orthogonal to the tangent vector at that point. Recall that
T(s) = ⟨cos(φ(s)), sin(φ(s))⟩, where φ is the angle the tangent vector to
the curve makes with a horizontal vector. Use this fact to show that
$$\mathbf{T} \cdot \frac{d\mathbf{T}}{ds} = 0.$$
Explain why this tells us that $\frac{d\mathbf{T}}{ds}$ is orthogonal to T. Let N be the
unit vector in the direction of $\frac{d\mathbf{T}}{ds}$. The vector N is called the principal
unit normal vector and points in the direction toward which the curve
is turning. The vector N also points toward the center of the osculating
circle.

c. Find T at the point (1, 1) on the graph of f . Then find N at this same
point. How do you know you have the correct direction for N?

Let P be a point on the curve. Recall that ρ = 1/κ at point P is the radius of
the osculating circle at point P. We call ρ the radius of curvature at point
P. Let C be the center of the osculating circle to the curve at point P, and
let O be the origin. Let γ be the vector $\overrightarrow{OC}$. See Figure 9.8.8 for an
illustration using an arbitrary function f.

Figure 9.8.8: An osculating circle.

d. Which vector, in terms of ρ and N, points from the point P to the point
C? Use this vector to explain why
$$\gamma = \mathbf{r} + \rho\mathbf{N},$$
where $\mathbf{r} = \overrightarrow{OP}$.

e. Finally, use the previous work to find the center of the osculating circle
for f at the point (1, 1). Draw pictures of the curve and the osculating
circle to verify your work.
Chapter 10

Derivatives of Multivariable
Functions

10.1 Limits

Motivating Questions

• What do we mean by the limit of a function f of two variables at a point
(a, b)?
• What techniques can we use to show that a function of two variables does
not have a limit at a point (a, b)?
• What does it mean for a function f of two variables to be continuous at
a point (a, b)?

In this section, we will study limits of functions of several variables, with
a focus on limits of functions of two variables. In single variable calculus, we
studied the notion of limit, which turned out to be a critical concept that
formed the basis for the derivative and the definite integral. In this section we
will begin to understand how the concept of limit for functions of two variables
is similar to what we encountered for functions of a single variable. The limit
will again be the fundamental idea in multivariable calculus, and we will use
this notion of the limit of a function of several variables to define the important
concept of differentiability later in this chapter. We have already seen its use
in the derivatives of vector-valued functions in Section 9.7.
Let’s begin by reviewing what we mean by the limit of a function of one
variable. We say that a function f has a limit L as x approaches a provided
that we can make the values f (x) as close to L as we like by taking x sufficiently
close (but not equal) to a. We denote this behavior by writing

$$\lim_{x \to a} f(x) = L.$$

Preview Activity 10.1.1. We investigate the limits of several different
functions by working with tables and graphs.
a. Consider the function f defined by

f (x) = 3 − x.


Complete Table 10.1.1.

x −0.2 −0.1 0.0 0.1 0.2


f (x)

Table 10.1.1: Values of f (x) = 3 − x.

What does the table suggest regarding limx→0 f (x)?

b. Explain how your results in (a) are reflected in Figure 10.1.2.

Figure 10.1.2: The graph of f(x) = 3 − x.

c. Next, consider
$$g(x) = \frac{x}{|x|}.$$
Complete Table 10.1.3 with values near x = 0, the point at which g is
not defined.

x −0.1 −0.01 −0.001 0.001 0.01 0.1


g(x)

Table 10.1.3: Values of g(x) = x/|x|.

What does this suggest about limx→0 g(x)?

d. Explain how your results in (c) are reflected in Figure 10.1.4.


Figure 10.1.4: The graph of g(x) = x/|x|.

e. Now, let’s examine a function of two variables. Let
$$f(x, y) = 3 - x - 2y.$$
Complete Table 10.1.5.

x\y  | −1.0 | −0.1 | 0.0 | 0.1 | 1.0
−1.0 |      | 4.2  |     |     |
−0.1 |      |      |     |     | 1.1
0.0  |      |      |     | 2.8 |
0.1  | 4.9  |      |     |     |
1.0  |      |      | 2.0 |     |

Table 10.1.5: Values of f(x, y) = 3 − x − 2y.

What does the table suggest about lim(x,y)→(0,0) f (x, y)?

Figure 10.1.6: Left: The graph of f(x, y) = 3 − x − 2y. Right: A contour plot.

f. Explain how your results in (e) are reflected in Figure 10.1.6. Compare
this limit to the limit in part (a). How are the limits similar and how are
they different?

g. Finally, consider
$$g(x, y) = \frac{2xy}{x^2 + y^2},$$
which is not defined at (0, 0). Complete Table 10.1.7. Round to three
decimal places.

x\y  | −1.0   | −0.1  | 0.0   | 0.1   | 1.0
−1.0 |        | 0.198 |       |       |
−0.1 |        |       |       |       | −0.198
0.0  |        |       | —     | 0.000 |
0.1  | −0.198 |       |       |       |
1.0  |        |       | 0.000 |       |

Table 10.1.7: Values of $g(x, y) = \frac{2xy}{x^2 + y^2}$.

What does this suggest about lim(x,y)→(0,0) g(x, y)?

h. Explain how your results are reflected in Figure 10.1.8. Compare this
limit to the limit in part (c). How are the results similar and how are
they different?

Figure 10.1.8: Left: The graph of $g(x, y) = \frac{2xy}{x^2 + y^2}$. Right: A contour plot.

10.1.1 Limits of Functions of Two Variables


In Preview Activity 10.1.1, we recalled the notion of limit from single variable
calculus and saw that a similar concept applies to functions of two variables.
Though we will focus on functions of two variables, for the sake of discussion,
all the ideas we establish here are valid for functions of any number of variables.
In a natural followup to our work in Preview Activity 10.1.1, we now formally
define what it means for a function of two variables to have a limit at a point.

Definition 10.1.9. Given a function f = f (x, y), we say that f has limit L
as (x, y) approaches (a, b) provided that we can make f (x, y) as close to L
as we like by taking (x, y) sufficiently close (but not equal) to (a, b). We write
$$\lim_{(x,y) \to (a,b)} f(x, y) = L.$$

To investigate the limit of a single variable function, $\lim_{x \to a} f(x)$, we often
consider the behavior of f as x approaches a from the right and from the left.
Similarly, we may investigate limits of two-variable functions, $\lim_{(x,y) \to (a,b)} f(x, y)$,
by considering the behavior of f as (x, y) approaches (a, b) from various direc-
tions. This situation is more complicated because there are infinitely many
ways in which (x, y) may approach (a, b). In the next activity, we see how it is
important to consider a variety of those paths in investigating whether or not
a limit exists.
Activity 10.1.2. Consider the function f, defined by
$$f(x, y) = \frac{y}{\sqrt{x^2 + y^2}},$$
whose graph is shown below in Figure 10.1.10.

Figure 10.1.10: The graph of $f(x, y) = \frac{y}{\sqrt{x^2 + y^2}}$.

a. Is f defined at the point (0, 0)? What, if anything, does this say about
whether f has a limit at the point (0, 0)?
b. Values of f (to three decimal places) at several points close to (0, 0) are
shown in Table 10.1.11.

x\y −1.000 −0.100 0.000 0.100 1.000


−1.000 −0.707 — 0.000 — 0.707
−0.100 — −0.707 0.000 0.707 —
0.000 −1.000 −1.000 — 1.000 1.000
0.100 — −0.707 0.000 0.707 —
1.000 −0.707 — 0.000 — 0.707

Table 10.1.11: Values of a function f .

Based on these calculations, state whether f has a limit at (0, 0) and give
an argument supporting your statement. (Hint: The blank spaces in the
table are there to help you see the patterns.)

c. Now we formalize the conjecture from the previous part by considering
what happens if we restrict our attention to different paths. First, we
look at f for points in the domain along the x-axis; that is, we consider
what happens when y = 0. What is the behavior of f (x, 0) as x → 0? If
we approach (0, 0) by moving along the x-axis, what value do we find as
the limit?

d. What is the behavior of f along the line y = x when x > 0; that is, what
is the value of f (x, x) when x > 0? If we approach (0, 0) by moving along
the line y = x in the first quadrant (thus considering f(x, x) as x → 0⁺),
what value do we find as the limit?

e. In general, if lim(x,y)→(0,0) f (x, y) = L, then f (x, y) approaches L as
(x, y) approaches (0, 0), regardless of the path we take in letting (x, y) →
(0, 0). Explain what the last two parts of this activity imply about the
existence of lim(x,y)→(0,0) f (x, y).

f. Shown below in Figure 10.1.12 is a set of contour lines of the function
f. What is the behavior of f (x, y) as (x, y) approaches (0, 0) along any
straight line? How does this observation reinforce your conclusion about
the existence of lim(x,y)→(0,0) f (x, y) from the previous part of this ac-
tivity? (Hint: Use the fact that a non-vertical line has equation y = mx
for some constant m.)

Figure 10.1.12: Contour lines of $f(x, y) = \frac{y}{\sqrt{x^2 + y^2}}$.

As we have seen in Activity 10.1.2, if (x, y) approaches (a, b) along two
different paths and we find that f (x, y) has two different limits, we can conclude
that lim(x,y)→(a,b) f (x, y) does not exist. This is similar to the one-variable
example g(x) = x/|x| as shown in Figure 10.1.13; limx→0 g(x) does not exist
because we see different limits as x approaches 0 from the left and the right.

Figure 10.1.13: The graph of g(x) = x/|x|.

As a general rule, we have


Limits along different paths.
If f (x, y) has two different limits as (x, y) approaches (a, b) along two
different paths, then lim(x,y)→(a,b) f (x, y) does not exist.
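
The rule can be explored numerically. The sketch below uses an illustrative function, (x² − y²)/(x² + y²), that is an assumption for this example and does not appear in the text; the helper name `values_along_line` is likewise hypothetical. It samples the function along two lines through the origin and shows that the values settle on different numbers.

```python
def f(x, y):
    # Illustrative function (not from the text); it is undefined at (0, 0).
    return (x * x - y * y) / (x * x + y * y)

def values_along_line(m, xs):
    # Evaluate f at the points (x, m*x) on the line y = m*x.
    return [f(x, m * x) for x in xs]

xs = [10.0 ** (-k) for k in range(1, 6)]      # x approaching 0 from the right
along_x_axis = values_along_line(0.0, xs)     # path y = 0
along_diagonal = values_along_line(1.0, xs)   # path y = x

print(along_x_axis)    # values along the x-axis
print(along_diagonal)  # values along the line y = x
```

Along the x-axis every sampled value is 1, while along y = x every sampled value is 0, so this function cannot have a limit at (0, 0). A table of values like this can only suggest the behavior along each path; the algebra f(x, 0) = 1 and f(x, x) = 0 is what settles it.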

As the next activity shows, studying the limit of a two-variable function f
by considering the behavior of f along various paths can require subtle insights.

Activity 10.1.3. Let’s consider the function g defined by
$$g(x, y) = \frac{x^2 y}{x^4 + y^2}$$
and investigate the limit lim(x,y)→(0,0) g(x, y).

a. What is the behavior of g on the x-axis? That is, what is g(x, 0) and
what is the limit of g as (x, y) approaches (0, 0) along the x-axis?

b. What is the behavior of g on the y-axis? That is, what is g(0, y) and
what is the limit of g as (x, y) approaches (0, 0) along the y-axis?

c. What is the behavior of g on the line y = mx? That is, what is g(x, mx)
and what is the limit of g as (x, y) approaches (0, 0) along the line y =
mx?

d. Based on what you have seen so far, do you think lim(x,y)→(0,0) g(x, y)
exists? If so, what do you think its value is?

e. Now consider the behavior of g on the parabola $y = x^2$. What is $g(x, x^2)$
and what is the limit of g as (x, y) approaches (0, 0) along this parabola?

f. State whether the limit lim(x,y)→(0,0) g(x, y) exists or not and provide a
justification of your statement.

This activity shows that we need to be careful when studying the limit of
a two-variable function by considering its behavior along different paths. If
we find two different paths that result in two different limits, then we may
conclude that the limit does not exist. However, we can never conclude that
the limit of a function exists only by considering its behavior along different
paths.
Generally speaking, concluding that a limit lim(x,y)→(a,b) f (x, y) exists
requires a more careful argument.
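
To see why agreement along many paths is not enough, here is a small numerical sketch. The function x·y²/(x² + y⁴) is chosen for illustration and is an assumption of this example, not a function from the text. Along every sampled line y = mx the values shrink toward 0, yet along the parabola x = y² they stay near 1/2.

```python
def f(x, y):
    # Illustrative function (not from the text); undefined at (0, 0).
    return x * y**2 / (x**2 + y**4)

ts = [10.0 ** (-k) for k in range(1, 7)]   # parameter values approaching 0
slopes = [-2.0, -1.0, 0.5, 1.0, 3.0]       # a few sample lines y = m*x

# Along each line y = m*x, f(x, m*x) = m^2 x / (1 + m^4 x^2), which tends to 0.
line_values = {m: [f(x, m * x) for x in ts] for m in slopes}

# Along the parabola x = y^2, f(y^2, y) = y^4 / (2 y^4) = 1/2 for every y != 0.
parabola_values = [f(y**2, y) for y in ts]

for m in slopes:
    print(m, line_values[m][-1])   # last sampled value on each line
print(parabola_values)
```

Sampling only straight lines would wrongly suggest a limit of 0 here, which is exactly the trap the paragraph above warns about.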
Example 10.1.14. Consider the function f defined by
$$f(x, y) = \frac{x^2 y^2}{x^2 + y^2}.$$
We want to know whether lim(x,y)→(0,0) f (x, y) exists.


Note that if either x or y is 0, then f (x, y) = 0. Therefore, if f has a limit
at (0, 0), it must be 0. We will therefore argue that
$$\lim_{(x,y) \to (0,0)} f(x, y) = 0,$$
by showing that we can make f (x, y) as close to 0 as we wish by taking (x, y)
sufficiently close (but not equal) to (0, 0). In what follows, we view x and y as
being real numbers that are close, but not equal, to 0.
Since $0 \le x^2$, we have
$$y^2 \le x^2 + y^2,$$
which implies that
$$\frac{y^2}{x^2 + y^2} \le 1.$$
Multiplying both sides by $x^2$ and observing that f (x, y) ≥ 0 for all (x, y)
gives
$$0 \le f(x, y) = \frac{x^2 y^2}{x^2 + y^2} = x^2 \left( \frac{y^2}{x^2 + y^2} \right) \le x^2.$$
Thus, $0 \le f(x, y) \le x^2$. Since $x^2 \to 0$ as x → 0, we can make f (x, y) as
close to 0 as we like by taking x sufficiently close to 0 (for this example, it turns
out that we don’t even need to worry about making y close to 0). Therefore,
$$\lim_{(x,y) \to (0,0)} \frac{x^2 y^2}{x^2 + y^2} = 0.$$
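
The inequality 0 ≤ f(x, y) ≤ x² from this example can be spot-checked numerically. The sketch below is only a sanity check on randomly sampled points, not a proof; the sample size, sampling window, and the small rounding slack are assumptions made for the illustration.

```python
import random

def f(x, y):
    # The function from Example 10.1.14, defined away from (0, 0).
    return (x**2 * y**2) / (x**2 + y**2)

random.seed(0)  # fixed seed so repeated runs sample the same points
samples = []
for _ in range(1000):
    x = random.uniform(-0.1, 0.1)
    y = random.uniform(-0.1, 0.1)
    if x == 0.0 and y == 0.0:
        continue  # f is not defined at the origin
    samples.append((x, y))

# Check 0 <= f(x, y) <= x^2, allowing a tiny slack for floating-point rounding.
bound_holds = all(0.0 <= f(x, y) <= x**2 + 1e-15 for x, y in samples)

# Along the diagonal, f(t, t) = t^2 / 2, which visibly shrinks with t.
diagonal_values = [f(t, t) for t in (0.1, 0.01, 0.001)]

print(bound_holds)
print(diagonal_values)
```

The algebraic argument in the example is what establishes the limit; the code merely confirms that sampled values behave as the bound predicts.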

In spite of the fact that these two most recent examples illustrate some
of the complications that arise when studying limits of two-variable functions,
many of the properties that are familiar from our study of single variable
functions hold in precisely the same way.

Properties of Limits.
Let f = f (x, y) and g = g(x, y) be functions so that lim(x,y)→(a,b) f (x, y)
and lim(x,y)→(a,b) g(x, y) both exist. Then

1. $\lim_{(x,y) \to (a,b)} x = a$ and $\lim_{(x,y) \to (a,b)} y = b$

2. $\lim_{(x,y) \to (a,b)} c f(x, y) = c \left( \lim_{(x,y) \to (a,b)} f(x, y) \right)$ for any scalar c

3. $\lim_{(x,y) \to (a,b)} [f(x, y) \pm g(x, y)] = \lim_{(x,y) \to (a,b)} f(x, y) \pm \lim_{(x,y) \to (a,b)} g(x, y)$

4. $\lim_{(x,y) \to (a,b)} [f(x, y) g(x, y)] = \left( \lim_{(x,y) \to (a,b)} f(x, y) \right) \left( \lim_{(x,y) \to (a,b)} g(x, y) \right)$

5. $\lim_{(x,y) \to (a,b)} \frac{f(x, y)}{g(x, y)} = \frac{\lim_{(x,y) \to (a,b)} f(x, y)}{\lim_{(x,y) \to (a,b)} g(x, y)}$ if $\lim_{(x,y) \to (a,b)} g(x, y) \ne 0$.

We can use these properties and results from single variable calculus to
verify that many limits exist. For example, these properties show that the
function f defined by

$$f(x, y) = 3x^2 y^3 + 2xy^2 - 3x + 1$$

has a limit at every point (a, b) and, moreover,
$$\lim_{(x,y) \to (a,b)} f(x, y) = f(a, b).$$

The reason for this is that polynomial functions of a single variable have
limits at every point.
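
As a sketch of how the listed properties combine for this polynomial, the computation can be written out term by term. The grouping of steps is one possible arrangement, and it also uses the standard fact (assumed here, since it is not in the list above) that the limit of a constant is that constant. Each limit below is taken as (x, y) → (a, b).

```latex
\begin{align*}
\lim_{(x,y)\to(a,b)} \left( 3x^2y^3 + 2xy^2 - 3x + 1 \right)
  &= 3\left(\lim x\right)^2 \left(\lim y\right)^3
   + 2\left(\lim x\right)\left(\lim y\right)^2
   - 3\lim x + 1 \\
  &= 3a^2b^3 + 2ab^2 - 3a + 1 \\
  &= f(a, b).
\end{align*}
```

The first equality uses the sum, scalar multiple, and product properties; the second uses the property that x → a and y → b.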

10.1.2 Continuity
Recall that a function f of a single variable x is said to be continuous at x = a
provided that the following three conditions are satisfied:
1. f (a) exists,
2. limx→a f (x) exists, and
3. limx→a f (x) = f (a).
Using our understanding of limits of multivariable functions, we can define
continuity in the same way.
Definition 10.1.15. A function f = f (x, y) is continuous at the point (a, b)
provided that
1. f is defined at the point (a, b),
2. lim(x,y)→(a,b) f (x, y) exists, and
3. lim(x,y)→(a,b) f (x, y) = f (a, b).
For instance, we have seen that the function f defined by
$f(x, y) = 3x^2 y^3 + 2xy^2 - 3x + 1$ is continuous at every point. And just as
with single variable functions, continuity has certain properties that are
based on the properties of limits.

Properties of continuity.
Let f and g be functions of two variables that are continuous at the
point (a, b). Then
1. cf is continuous at (a, b) for any scalar c
2. f + g is continuous at (a, b)

3. f − g is continuous at (a, b)
4. f g is continuous at (a, b)
5. f/g is continuous at (a, b) if g(a, b) ≠ 0

Using these properties, we can apply results from single variable calculus to
decide about continuity of multivariable functions. For example, the coordinate
functions f and g defined by f (x, y) = x and g(x, y) = y are continuous
at every point. We can then use the properties of continuity listed above to
conclude that every polynomial function in x and y is continuous at every point.
For example, $g(x, y) = x^2$ and $h(x, y) = y^3$ are continuous functions, so their
product $f(x, y) = x^2 y^3$ is a continuous multivariable function.

10.1.3 Summary

• A function f = f (x, y) has a limit L at a point (a, b) provided that we
can make f (x, y) as close to L as we like by taking (x, y) sufficiently close
(but not equal) to (a, b).

• If f (x, y) has two different limits as (x, y) approaches (a, b) along two
different paths, we can conclude that lim(x,y)→(a,b) f (x, y) does not exist.

• Properties similar to those for one-variable functions allow us to conclude
that many limits exist and to evaluate them.

• A function f = f (x, y) is continuous at a point (a, b) in its domain if f
has a limit at (a, b) and
$$f(a, b) = \lim_{(x,y) \to (a,b)} f(x, y).$$

Exercises
1. Find the limits, if they exist, or type DNE for any which do not exist.
$$\lim_{(x,y) \to (0,0)} \frac{x^2}{x^2 + y^2}$$
1) Along the x-axis:
2) Along the y-axis:
3) Along the line y = mx :
4) The limit is:
2. Determining the limit of a function. In this problem we show that
the function
$$f(x, y) = \frac{4x^2 - y^2}{x^2 + y^2}$$
does not have a limit as (x, y) → (0, 0).

(a) Suppose that we consider (x, y) → (0, 0) along the curve y = 4x. Find
the limit in this case:
$\lim_{(x,4x) \to (0,0)} \frac{4x^2 - y^2}{x^2 + y^2} =$
(b) Now consider (x, y) → (0, 0) along the curve y = 5x. Find the limit in
this case:
$\lim_{(x,5x) \to (0,0)} \frac{4x^2 - y^2}{x^2 + y^2} =$
(c) Note that the results from (a) and (b) indicate that f has no limit as
(x, y) → (0, 0) (be sure you can explain why!).
To show this more generally, consider (x, y) → (0, 0) along the curve y =
mx, for arbitrary m. Find the limit in this case:
$\lim_{(x,mx) \to (0,0)} \frac{4x^2 - y^2}{x^2 + y^2} =$
(Be sure that you can explain how this result also indicates that f has no
limit as (x, y) → (0, 0).)
3. Show that the function
$$f(x, y) = \frac{x^3 y}{x^6 + y^3}$$
does not have a limit at (0, 0) by examining the following limits.
(a) Find the limit of f as (x, y) → (0, 0) along the line y = x.
$\lim_{(x,y) \to (0,0),\ y = x} f(x, y) =$

(b) Find the limit of f as (x, y) → (0, 0) along the curve $y = x^3$.
$\lim_{(x,y) \to (0,0),\ y = x^3} f(x, y) =$

(Be sure that you are able to explain why the results in (a) and (b) indicate
that f does not have a limit at (0, 0)!)
4. Find the limit, if it exists, or type N if it does not exist.
$\lim_{(x,y) \to (0,0)} \frac{3x^2}{x^2 + 2y^2} =$
5. Find the limit, if it exists, or type N if it does not exist.
$\lim_{(x,y) \to (0,0)} \frac{(x + 16y)^2}{x^2 + 16^2 y^2} =$
6. Find the limit, if it exists, or type ’DNE’ if it does not exist.
$\lim_{(x,y) \to (3,-1)} e^{\sqrt{2x^2 + 2y^2}} =$
7. Find the limit, if it exists, or type N if it does not exist.
$\lim_{(x,y,z) \to (0,0,0)} \frac{5xy + yz + 4xz}{25x^2 + y^2 + 16z^2} =$
8. Find the limit, if it exists, or type N if it does not exist.
$\lim_{(x,y,z) \to (5,4,1)} \frac{4z e^{x^2 + y^2}}{5x^2 + 4y^2 + z^2} =$

9. Find the limit (enter ’DNE’ if the limit does not exist)
Hint: rationalize the denominator.
$\lim_{(x,y) \to (0,0)} \frac{-2x^2 - 2y^2}{\sqrt{-2x^2 - 2y^2 + 1} - 1}$

10. The largest set on which the function $f(x, y) = 1/(5 - x^2 - y^2)$ is
continuous is
A. The interior of the circle $x^2 + y^2 = 5$, plus the circle
B. The exterior of the circle $x^2 + y^2 = 5$
C. All of the xy-plane except the circle $x^2 + y^2 = 5$
D. The interior of the circle $x^2 + y^2 = 5$
E. All of the xy-plane
11. Consider the function f defined by $f(x, y) = \frac{xy}{x^2 + y^2 + 1}$.

a. What is the domain of f ?
b. Evaluate the limit of f at (0, 0) along the following paths: x = 0, y = 0,
y = x, and $y = x^2$.
c. What do you conjecture is the value of lim(x,y)→(0,0) f (x, y)?
d. Is f continuous at (0, 0)? Why or why not?
e. Use appropriate technology to sketch both surface and contour plots of
f near (0, 0). Write several sentences to say how your plots affirm your
findings in (a) - (d).

12. Consider the function g defined by $g(x, y) = \frac{xy}{x^2 + y^2}$.

a. What is the domain of g?
b. Evaluate the limit of g at (0, 0) along the following paths: x = 0, y = x, and
y = 2x.
c. What can you now say about the value of lim(x,y)→(0,0) g(x, y)?
d. Is g continuous at (0, 0)? Why or why not?
e. Use appropriate technology to sketch both surface and contour plots of
g near (0, 0). Write several sentences to say how your plots affirm your
findings in (a) - (d).

13. Consider the function h defined by $h(x, y) = \frac{2x^2 y}{x^4 + y^2}$.

a. What is the domain of h?
b. Evaluate the limit of h at (0, 0) along all linear paths that contain the
origin. What does this tell us about lim(x,y)→(0,0) h(x, y)? (Hint: A non-
vertical line through the origin has the form y = mx for some constant
m.)
c. Does lim(x,y)→(0,0) h(x, y) exist? Verify your answer. Check by using
appropriate technology to sketch both surface and contour plots of h
near (0, 0). Write several sentences to say how your plots affirm your
findings about lim(x,y)→(0,0) h(x, y).

14. For each of the following prompts, provide an example of a function of
two variables with the desired properties (with justification), or explain why
such a function does not exist.
a. A function p that is defined at (0, 0), but lim(x,y)→(0,0) p(x, y) does not
exist.
b. A function q that does not have a limit at (0, 0), but that has the same
limiting value along any line y = mx as x → 0.
c. A function r that is continuous at (0, 0), but lim(x,y)→(0,0) r(x, y) does
not exist.
d. A function s such that
$$\lim_{(x,x) \to (0,0)} s(x, x) = 3 \quad \text{and} \quad \lim_{(x,2x) \to (0,0)} s(x, 2x) = 6,$$
for which lim(x,y)→(0,0) s(x, y) exists.

e. A function t that is not defined at (1, 1) but lim(x,y)→(1,1) t(x, y) does
exist.

15. Use the properties of continuity to determine the set of points at which
each of the following functions is continuous. Justify your answers.
a. The function f defined by $f(x, y) = \frac{x + 2y}{x - y}$

b. The function g defined by $g(x, y) = \frac{\sin(x)}{1 + e^y}$

c. The function h defined by
$$h(x, y) = \begin{cases} \frac{xy}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases}$$

d. The function k defined by
$$k(x, y) = \begin{cases} \frac{x^2 y^4}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases}$$

10.2 First-Order Partial Derivatives

Motivating Questions
• How are the first-order partial derivatives of a function f of the indepen-
dent variables x and y defined?
• Given a function f of the independent variables x and y, what do the
first-order partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ tell us about f ?

The derivative plays a central role in first semester calculus because it
provides important information about a function. Thinking graphically, for
instance, the derivative at a point tells us the slope of the tangent line to the
graph at that point. In addition, the derivative at a point also provides the
instantaneous rate of change of the function with respect to changes in the
independent variable.
Now that we are investigating functions of two or more variables, we can
still ask how fast the function is changing, though we have to be careful about
what we mean. Thinking graphically again, we can try to measure how steep
the graph of the function is in a particular direction. Alternatively, we may
want to know how fast a function’s output changes in response to a change
in one of the inputs. Over the next few sections, we will develop tools for
addressing issues such as these. Preview Activity 10.2.1 explores some
issues with what we will come to call partial derivatives.
Preview Activity 10.2.1. Suppose we take out an $18,000 car loan at interest
rate r and we agree to pay off the loan in t years. The monthly payment, in
dollars, is
$$M(r, t) = \frac{1500r}{1 - \left(1 + \frac{r}{12}\right)^{-12t}}.$$
a. What is the monthly payment if the interest rate is 3% so that r = 0.03,
and we pay the loan off in t = 4 years?
b. Suppose the interest rate is fixed at 3%. Express M as a function f of t
alone using r = 0.03. That is, let f (t) = M (0.03, t). Sketch the graph of
f on the left of Figure 10.2.1. Explain the meaning of the function f .

Figure 10.2.1: Left: Graph of f(t) = M(0.03, t). Right: Graph of g(r) = M(r, 4).

c. Find the instantaneous rate of change $f'(4)$ and state the units on this
quantity. What information does $f'(4)$ tell us about our car loan? What
information does $f'(4)$ tell us about the graph you sketched in (b)?

d. Express M as a function of r alone, using a fixed time of t = 4. That is,
let g(r) = M (r, 4). Sketch the graph of g on the right of Figure 10.2.1.
Explain the meaning of the function g.
e. Find the instantaneous rate of change $g'(0.03)$ and state the units on
this quantity. What information does $g'(0.03)$ tell us about our car loan?
What information does $g'(0.03)$ tell us about the graph you sketched in
(d)?

10.2.1 First-Order Partial Derivatives


In Section 9.1, we studied the behavior of a function of two or more variables
by considering the traces of the function. Recall that in one example, we
considered the function f defined by

$$f(x, y) = \frac{x^2 \sin(2y)}{32},$$
which measures the range, or horizontal distance, in feet, traveled by a projec-
tile launched with an initial speed of x feet per second at an angle y radians
to the horizontal. The graph of this function is given again on the left in
Figure 10.2.2. Moreover, if we fix the angle y = 0.6, we may view the trace
f (x, 0.6) as a function of x alone, as seen at right in Figure 10.2.2.

Figure 10.2.2: Left: The trace of $z = \frac{x^2 \sin(2y)}{32}$ with y = 0.6.

Since the trace is a one-variable function, we may consider its derivative
just as we did in the first semester of calculus. With y = 0.6, we have
$$f(x, 0.6) = \frac{\sin(1.2)}{32} x^2,$$
and therefore
$$\frac{d}{dx}[f(x, 0.6)] = \frac{\sin(1.2)}{16} x.$$
When x = 150, this gives
$$\frac{d}{dx}[f(x, 0.6)]\Big|_{x=150} = \frac{\sin(1.2)}{16} \cdot 150 \approx 8.74 \text{ feet per feet per second},$$
which gives the slope of the tangent line shown on the right of Figure 10.2.2.
Thinking of this derivative as an instantaneous rate of change implies that if
we increase the initial speed of the projectile by one foot per second, we expect
the horizontal distance traveled to increase by approximately 8.74 feet if we
hold the launch angle constant at 0.6 radians.
By holding y fixed and differentiating with respect to x, we obtain the first-
order partial derivative of f with respect to x. Denoting this partial derivative
as fx , we have seen that

$$f_x(150, 0.6) = \frac{d}{dx} f(x, 0.6)\Big|_{x=150} = \lim_{h \to 0} \frac{f(150 + h, 0.6) - f(150, 0.6)}{h}.$$
More generally, we have
$$f_x(a, b) = \lim_{h \to 0} \frac{f(a + h, b) - f(a, b)}{h},$$
provided this limit exists.
In the same way, we may obtain a trace by setting, say, x = 150 as shown
in Figure 10.2.3.

Figure 10.2.3: The trace of $z = \frac{x^2 \sin(2y)}{32}$ with x = 150.

This gives
$$f(150, y) = \frac{150^2}{32} \sin(2y),$$
and therefore
$$\frac{d}{dy}[f(150, y)] = \frac{150^2}{16} \cos(2y).$$
If we evaluate this quantity at y = 0.6, we have
$$\frac{d}{dy}[f(150, y)]\Big|_{y=0.6} = \frac{150^2}{16} \cos(1.2) \approx 509.6 \text{ feet per radian}.$$

Once again, the derivative gives the slope of the tangent line shown on the
right in Figure 10.2.3. Thinking of the derivative as an instantaneous rate of
change, we expect that the range of the projectile increases by approximately
509.6 feet for every radian we increase the launch angle y if we keep the
initial speed of the projectile constant at 150 feet per second.
By holding x fixed and differentiating with respect to y, we obtain the first-
order partial derivative of f with respect to y. As before, we denote this partial
derivative as fy and write
$$f_y(150, 0.6) = \frac{d}{dy} f(150, y)\Big|_{y=0.6} = \lim_{h \to 0} \frac{f(150, 0.6 + h) - f(150, 0.6)}{h}.$$
As with the partial derivative with respect to x, we may express this quan-
tity more generally at an arbitrary point (a, b). To recap, we have now arrived
at the formal definition of the first-order partial derivatives of a function of
two variables.
Definition 10.2.4. The first-order partial derivatives of f with respect
to x and y at a point (a, b) are, respectively,
$$f_x(a, b) = \lim_{h \to 0} \frac{f(a + h, b) - f(a, b)}{h}, \quad \text{and}$$
$$f_y(a, b) = \lim_{h \to 0} \frac{f(a, b + h) - f(a, b)}{h},$$
provided the limits exist.
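
The difference quotients in this definition can be evaluated directly for a small h. The sketch below does this for the projectile range function from this section at the point (150, 0.6); the helper names `partial_x` and `partial_y` and the choice h = 1e-5 are assumptions made for illustration, and a small fixed h only approximates the limit.

```python
import math

def f(x, y):
    # Projectile range from the text: x is speed in ft/s, y is angle in radians.
    return x**2 * math.sin(2 * y) / 32

def partial_x(func, a, b, h=1e-5):
    # Difference quotient in x with y held fixed at b.
    return (func(a + h, b) - func(a, b)) / h

def partial_y(func, a, b, h=1e-5):
    # Difference quotient in y with x held fixed at a.
    return (func(a, b + h) - func(a, b)) / h

fx = partial_x(f, 150.0, 0.6)
fy = partial_y(f, 150.0, 0.6)

# Exact values from differentiating the traces by hand, for comparison.
fx_exact = 150 * math.sin(1.2) / 16
fy_exact = 150**2 * math.cos(1.2) / 16

print(fx, fx_exact)
print(fy, fy_exact)
```

Shrinking h brings the quotients closer to the exact derivatives of the traces, until floating-point rounding in the subtraction starts to dominate.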
Activity 10.2.2. Consider the function f defined by
$$f(x, y) = \frac{xy^2}{x + 1}$$
at the point (1, 2).
a. Write the trace f (x, 2) at the fixed value y = 2. On the left side of
Figure 10.2.5, draw the graph of the trace with y = 2 around the point
where x = 1, indicating the scale and labels on the axes. Also, sketch
the tangent line at the point x = 1.

Figure 10.2.5: Traces of $f(x, y) = \frac{xy^2}{x + 1}$.

b. Find the partial derivative $f_x(1, 2)$ and relate its value to the sketch you
just made.
c. Write the trace f (1, y) at the fixed value x = 1. On the right side of
Figure 10.2.5, draw the graph of the trace with x = 1 indicating the
scale and labels on the axes. Also, sketch the tangent line at the point
y = 2.

d. Find the partial derivative fy (1, 2) and relate its value to the sketch you
just made.

As these examples show, each partial derivative at a point arises as the
derivative of a one-variable function defined by fixing one of the coordinates. In
addition, we may consider each partial derivative as defining a new function of
the point (x, y), just as the derivative $f'(x)$ defines a new function of x in
single-variable calculus. Due to the connection between one-variable derivatives and
partial derivatives, we will often use Leibniz-style notation to denote partial
derivatives by writing

\[ \frac{\partial f}{\partial x}(a, b) = f_x(a, b), \quad \text{and} \quad \frac{\partial f}{\partial y}(a, b) = f_y(a, b). \]

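To calculate the partial derivative fx , we hold y fixed and thus we treat y
as a constant. In Leibniz notation, observe that

\[ \frac{\partial}{\partial x}(x) = 1 \quad \text{and} \quad \frac{\partial}{\partial x}(y) = 0. \]

To see the contrast between how we calculate single variable derivatives and
partial derivatives, and the difference between the notations $\frac{d}{dx}[\ ]$ and $\frac{\partial}{\partial x}[\ ]$,
observe that

\[ \frac{d}{dx}[3x^2 - 2x + 3] = 3\frac{d}{dx}[x^2] - 2\frac{d}{dx}[x] + \frac{d}{dx}[3] = 3 \cdot 2x - 2, \]
and
\[ \frac{\partial}{\partial x}[x^2 y - xy + 2y] = y\frac{\partial}{\partial x}[x^2] - y\frac{\partial}{\partial x}[x] + \frac{\partial}{\partial x}[2y] = y \cdot 2x - y. \]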
Thus, computing partial derivatives is straightforward: we use the standard
rules of single variable calculus, but do so while holding one (or more) of the
variables constant.
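
One way to gain confidence in a partial derivative computed by these rules is to compare it with a symmetric difference quotient in which only one variable is changed. The Python sketch below is an illustration, not part of the original text; the function names and the sample points are hypothetical. It checks the result that the partial derivative of x^2 y − xy + 2y with respect to x is 2xy − y.

```python
def g(x, y):
    """The two-variable expression from the text: x^2*y - x*y + 2*y."""
    return x**2 * y - x * y + 2 * y

def gx_formula(x, y):
    """Partial derivative with respect to x, found by treating y as a constant."""
    return 2 * x * y - y

def central_dx(func, x, y, h=1e-5):
    """Symmetric difference quotient in x; the value of y is never changed."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

# Compare the formula with the numerical estimate at a few sample points.
for (x, y) in [(1.0, 2.0), (-3.0, 0.5), (2.5, -4.0)]:
    print((x, y), gx_formula(x, y), central_dx(g, x, y))
```

The two printed values should agree closely at each point. Agreement at a handful of points is a sanity check rather than a proof, but a disagreement would signal an algebra mistake.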

Activity 10.2.3.

a. If $f(x, y) = 3x^3 - 2x^2 y^5$, find the partial derivatives fx and fy .

b. If $f(x, y) = \frac{xy^2}{x+1}$, find the partial derivatives fx and fy .

c. If g(r, s) = rs cos(r), find the partial derivatives gr and gs .

d. Assuming $f(w, x, y) = (6w + 1)\cos(3x^2 + 4xy^3 + y)$, find the partial
derivatives fw , fx , and fy .

e. Find all possible first-order partial derivatives of $q(x, t, z) = \frac{x 2^t z^3}{1 + x^2}$.

10.2.2 Interpretations of First-Order Partial Derivatives


Recall that the derivative of a single variable function has a geometric interpre-
tation as the slope of the line tangent to the graph at a given point. Similarly,
we have seen that the partial derivatives measure the slope of a line tangent
to a trace of a function of two variables as shown in Figure 10.2.6.

Figure 10.2.6: Tangent lines to two traces of the distance function.

Now we consider the first-order partial derivatives in context. Recall that
the difference quotient $\frac{f(a+h) - f(a)}{h}$ for a function f of a single variable x at a
point where x = a tells us the average rate of change of f over the interval
[a, a + h], while the derivative $f'(a)$ tells us the instantaneous rate of change
of f at x = a. We can use these same concepts to explain the meanings of the
partial derivatives in context.
Activity 10.2.4. The speed of sound C traveling through ocean water is a
function of temperature, salinity and depth. It may be modeled by the function

\[ C = 1449.2 + 4.6T - 0.055T^2 + 0.00029T^3 + (1.34 - 0.01T)(S - 35) + 0.016D. \]

Here C is the speed of sound in meters/second, T is the temperature in
degrees Celsius, S is the salinity in grams/liter of water, and D is the depth
below the ocean surface in meters.
a. State the units in which each of the partial derivatives, CT , CS and CD ,
are expressed and explain the physical meaning of each.
b. Find the partial derivatives CT , CS and CD .
c. Evaluate each of the three partial derivatives at the point where T = 10,
S = 35 and D = 100. What does the sign of each partial derivative tell
us about the behavior of the function C at the point (10, 35, 100)?

10.2.3 Using tables and contours to estimate partial derivatives

Remember that functions of two variables are often represented as either a
table of data or a contour plot. In single variable calculus, we saw how we can
use the difference quotient to approximate derivatives if, instead of an algebraic
formula, we only know the value of the function at a few points. The same
idea applies to partial derivatives.
Activity 10.2.5. The wind chill, as frequently reported, is a measure of how
cold it feels outside when the wind is blowing. In Table 10.2.7, the wind chill w,
measured in degrees Fahrenheit, is a function of the wind speed v, measured in
miles per hour, and the ambient air temperature T , also measured in degrees
Fahrenheit. We thus view w as being of the form w = w(v, T ).

v\T  −30  −25  −20  −15  −10   −5    0    5   10   15   20
  5  −46  −40  −34  −28  −22  −16  −11   −5    1    7   13
 10  −53  −47  −41  −35  −28  −22  −16  −10   −4    3    9
 15  −58  −51  −45  −39  −32  −26  −19  −13   −7    0    6
 20  −61  −55  −48  −42  −35  −29  −22  −15   −9   −2    4
 25  −64  −58  −51  −44  −37  −31  −24  −17  −11   −4    3
 30  −67  −60  −53  −46  −39  −33  −26  −19  −12   −5    1
 35  −69  −62  −55  −48  −41  −34  −27  −21  −14   −7    0
 40  −71  −64  −57  −50  −43  −36  −29  −22  −15   −8   −1

Table 10.2.7: Wind chill as a function of wind speed and temperature.

a. Estimate the partial derivative wv (20, −10). What are the units on this
quantity and what does it mean? (Recall that we can estimate the
derivative of a single variable function f using the symmetric difference
quotient $\frac{f(x+h) - f(x-h)}{2h}$ for small values of h. A partial derivative is a
derivative of an appropriate trace.)
b. Estimate the partial derivative wT (20, −10). What are the units on this
quantity and what does it mean?
c. Use your results to estimate the wind chill w(18, −10). (Recall from single
variable calculus that for a function f of x, $f(x + h) \approx f(x) + hf'(x)$.)
d. Use your results to estimate the wind chill w(20, −12).
e. Consider how you might combine your previous results to estimate the
wind chill w(18, −12). Explain your process.
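
The symmetric difference quotient recalled in part (a) can be applied to a table mechanically. The Python sketch below is an illustration, not part of the original activity. It uses a few entries of Table 10.2.7 near the point (v, T) = (30, 0), which is deliberately not one of the points asked about above; the dictionary `w` and the helper `symmetric_quotient` are hypothetical names.

```python
# A few entries of Table 10.2.7, keyed by (wind speed v, temperature T).
w = {
    (25, 0): -24,
    (30, -5): -33, (30, 0): -26, (30, 5): -19,
    (35, 0): -27,
}

def symmetric_quotient(upper, lower, h):
    """Symmetric difference quotient (upper - lower) / (2h)."""
    return (upper - lower) / (2 * h)

# Trace with T = 0 fixed: move v by h = 5 mph on either side of v = 30.
w_v = symmetric_quotient(w[(35, 0)], w[(25, 0)], 5)
# Trace with v = 30 fixed: move T by h = 5 degrees on either side of T = 0.
w_T = symmetric_quotient(w[(30, 5)], w[(30, -5)], 5)

print(w_v)  # degrees Fahrenheit of wind chill per mile per hour of wind speed
print(w_T)  # degrees Fahrenheit of wind chill per degree Fahrenheit of air temperature
```

Each estimate uses only entries from a single row or a single column of the table, which is exactly what it means to hold the other variable fixed.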
Activity 10.2.6. Shown below in Figure 10.2.8 is a contour plot of a function
f . The values of the function on a few of the contours are indicated to the left
of the figure.

Figure 10.2.8: A contour plot of f .

a. Estimate the partial derivative fx (−2, −1). (Hint: How can you find
values of f that are of the form f (−2 + h, −1) and f (−2 − h, −1) so that you can
use a symmetric difference quotient?)
b. Estimate the partial derivative fy (−2, −1).

c. Estimate the partial derivatives fx (−1, 2) and fy (−1, 2).

d. Locate, if possible, one point (x, y) where fx (x, y) = 0.

e. Locate, if possible, one point (x, y) where fx (x, y) < 0.

f. Locate, if possible, one point (x, y) where fy (x, y) > 0.

g. Suppose you have a different function g, and you know that g(2, 2) = 4,
gx (2, 2) > 0, and gy (2, 2) > 0. Using this information, sketch a possibility
for the contour g(x, y) = 4 passing through (2, 2) on the left side of
Figure 10.2.9. Then include possible contours g(x, y) = 3 and g(x, y) = 5.

Figure 10.2.9: Plots for contours of g and h.

h. Suppose you have yet another function h, and you know that h(2, 2) =
4, hx (2, 2) < 0, and hy (2, 2) > 0. Using this information, sketch a
possible contour h(x, y) = 4 passing through (2, 2) on the right side of
Figure 10.2.9. Then include possible contours h(x, y) = 3 and h(x, y) = 5.

10.2.4 Summary

• If f = f (x, y) is a function of two variables, there are two first-order
partial derivatives of f : the partial derivative of f with respect to x,

\[ \frac{\partial f}{\partial x}(x, y) = f_x(x, y) = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}, \]

and the partial derivative of f with respect to y,

\[ \frac{\partial f}{\partial y}(x, y) = f_y(x, y) = \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h}, \]

where each partial derivative exists only at those points (x, y) for which
the limit exists.
• The partial derivative fx (a, b) tells us the instantaneous rate of change of
f with respect to x at (x, y) = (a, b) when y is fixed at b. Geometrically,
the partial derivative fx (a, b) tells us the slope of the line tangent to the
y = b trace of the function f at the point (a, b, f (a, b)).

• The partial derivative fy (a, b) tells us the instantaneous rate of change of
f with respect to y at (x, y) = (a, b) when x is fixed at a. Geometrically,
the partial derivative fy (a, b) tells us the slope of the line tangent to the
x = a trace of the function f at the point (a, b, f (a, b)).

Exercises
1. Find the first partial derivatives of
\[ f(x, y) = \frac{2x - 3y}{2x + 3y} \]
at the point (x, y) = (2, 4).
$\frac{\partial f}{\partial x}(2, 4) =$
$\frac{\partial f}{\partial y}(2, 4) =$
2. Find the first partial derivatives of f (x, y) = sin(x − y) at the point (-2,
-2).
A. fx (−2, −2) =
B. fy (−2, −2) =
3. Find the partial derivatives of the function
\[ w = \sqrt{5r^2 + 9s^2 + 2t^2} \]
$\frac{\partial w}{\partial r} =$
$\frac{\partial w}{\partial s} =$
$\frac{\partial w}{\partial t} =$
4. Suppose that f (x, y) is a smooth function and that its partial derivatives
have the values, fx (2, 2) = −5 and fy (2, 2) = −4. Given that f (2, 2) = 4,
use this information to estimate the value of f (3, 3). Note this is analogous to
finding the tangent line approximation to a function of one variable. In fancy
terms, it is the first Taylor approximation.
Estimate of (integer value) f (2, 3)
Estimate of (integer value) f (3, 2)
Estimate of (integer value) f (3, 3)
5. The gas law for a fixed mass m of an ideal gas at absolute temperature
T , pressure P , and volume V is P V = mRT , where R is the gas constant.
Find the partial derivatives
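$\frac{\partial P}{\partial V} =$
$\frac{\partial V}{\partial T} =$
$\frac{\partial T}{\partial P} =$
$\frac{\partial P}{\partial V} \frac{\partial V}{\partial T} \frac{\partial T}{\partial P} =$ (an integer)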
6. Find the first partial derivatives of f (x, y, z) = z arctan( xy ) at the point
(5, 5, 5).
A. $\frac{\partial f}{\partial x}(5, 5, 5) =$
B. $\frac{\partial f}{\partial y}(5, 5, 5) =$
C. $\frac{\partial f}{\partial z}(5, 5, 5) =$
7. Find the partial derivatives of the function

\[ f(x, y) = \int_y^x \cos(9t^2 + 1t + 5)\, dt \]

fx (x, y) =
fy (x, y) =
8. Let $f(x, y) = e^{-5x} \sin(2y)$.
(a) Using difference quotients with ∆x = 0.1 and ∆y = 0.1, we estimate
fx (−2, 3) ≈
fy (−2, 3) ≈
(b) Using difference quotients with ∆x = 0.01 and ∆y = 0.01, we find
better estimates:
fx (−2, 3) ≈
fy (−2, 3) ≈
9. Determine the sign of fx and fy at each indicated point using the contour
diagram of f shown below. (The point P is that in the first quadrant, at a
positive x and y value; Q through T are located clockwise from P , so that Q
is at a positive x value and negative y, etc.)

(a) At point Q,
fx is (positive / negative) and
fy is (positive / negative).
(b) At point R,
fx is (positive / negative) and
fy is (positive / negative).
(c) At point S,
fx is (positive / negative) and
fy is (positive / negative).
10. Your monthly car payment in dollars is P = f (P0 , t, r), where $P0 is the
amount you borrowed, t is the number of months it takes to pay off the loan,
and r percent is the interest rate.
(a) Is ∂P/∂t positive or negative? (positive / negative)
Suppose that your bank tells you that the magnitude of ∂P/∂t is 20.
What are the units of this value?
(For this problem, write out your units in full, writing dollars for $, months
for months, percent for %, etc. Note that fractional units generally have a
plural numerator and singular denominator.)

(b) Is ∂P/∂r positive or negative? (positive / negative)
Suppose that your bank tells you that the magnitude of ∂P/∂r is 15.
What are the units of this value?
(For this problem, write out your units in full, writing dollars for $, months
for months, percent for %, etc. Note that fractional units generally have a
plural numerator and singular denominator.)
For both parts of this problem, be sure you can explain what the practical
meanings of the partial derivatives are.
11. An experiment to measure the toxicity of formaldehyde yielded the data
in the table below. The values show the percent, P = f (t, c), of rats surviving
an exposure to formaldehyde at a concentration of c (in parts per million, ppm)
after t months.

         t = 14  t = 16  t = 18  t = 20  t = 22  t = 24
c = 0      100     100     100      99      97      95
c = 2      100      99      98      97      95      92
c = 6       96      95      93      90      86      80
c = 15      96      93      82      70      58      36

(a) Estimate ft (18, 2):
ft (18, 2) ≈
(b) Estimate fc (18, 2):
fc (18, 2) ≈
(Be sure that you can give the practical meaning of these two values in
terms of formaldehyde toxicity.)
12. An airport can be cleared of fog by heating the air. The amount of heat
required depends on the air temperature and the wetness of the fog. The figure
below shows the heat H(T, w) required (in calories per cubic meter of fog) as a
function of the temperature T (in degrees Celsius) and the water content w (in
grams per cubic meter of fog). Note that this figure is not a contour diagram,
but shows cross-sections of H with w fixed at 0.1, 0.2, 0.3, and 0.4.
(a) Estimate HT (10, 0.2):


HT (10, 0.2) ≈
(Be sure you can interpret this partial derivative in practical terms.)
(b) Make a table of values for H(T, w) from the figure, and use it to estimate
HT (T, w) for each of the following:
T = 20, w = 0.2 : HT (T, w) ≈
T = 30, w = 0.2 : HT (T, w) ≈
T = 20, w = 0.3 : HT (T, w) ≈
T = 30, w = 0.3 : HT (T, w) ≈
(c) Repeat (b) to find Hw (T, w) for each of the following:
T = 20, w = 0.2 : Hw (T, w) ≈
T = 30, w = 0.2 : Hw (T, w) ≈
T = 20, w = 0.3 : Hw (T, w) ≈
T = 30, w = 0.3 : Hw (T, w) ≈
(Be sure you can interpret this partial derivative in practical terms.)
13. The Heat Index, I, (measured in apparent degrees F ) is a function of
the actual temperature T outside (in degrees F) and the relative humidity H
(measured as a percentage). A portion of the table which gives values for this
function, I = I(T, H), is reproduced in Table 10.2.10.

T ↓ \ H →   70   75   80   85
90         106  109  112  115
92         112  115  119  123
94         118  122  127  132
96         125  130  135  141

Table 10.2.10: A portion of the heat index data.

a. State the limit definition of the value IT (94, 75). Then, estimate IT (94, 75),
and write one complete sentence that carefully explains the meaning of
this value, including its units.

b. State the limit definition of the value IH (94, 75). Then, estimate IH (94, 75),
and write one complete sentence that carefully explains the meaning of
this value, including its units.

c. Suppose you are given that IT (92, 80) = 3.75 and IH (92, 80) = 0.8.
Estimate the values of I(91, 80) and I(92, 78). Explain how the partial
derivatives are relevant to your thinking.

d. On a certain day, at 1 p.m. the temperature is 92 degrees and the
relative humidity is 85%. At 3 p.m., the temperature is 96 degrees and
the relative humidity 75%. What is the average rate of change of the
heat index over this time period, and what are the units on your answer?
Write a sentence to explain your thinking.

14. Let $f(x, y) = \frac{1}{2}xy^2$ represent the kinetic energy in Joules of an object
of mass x in kilograms with velocity y in meters per second. Let (a, b) be the
point (4, 5) in the domain of f .

a. Calculate fx (a, b).


b. Explain as best you can in the context of kinetic energy what the partial
derivative
\[ f_x(a, b) = \lim_{h \to 0} \frac{f(a + h, b) - f(a, b)}{h} \]
tells us about kinetic energy.

c. Calculate fy (a, b).

d. Explain as best you can in the context of kinetic energy what the partial
derivative
\[ f_y(a, b) = \lim_{h \to 0} \frac{f(a, b + h) - f(a, b)}{h} \]
tells us about kinetic energy.

e. Often we are given certain graphical information about a function instead
of a rule. We can use that information to approximate partial
derivatives. For example, suppose that we are given a contour plot of the
kinetic energy function (as in Figure 10.2.11) instead of a formula. Use
this contour plot to approximate fx (4, 5) and fy (4, 5) as best you can.
Compare to your calculations from earlier parts of this exercise.

Figure 10.2.11: The graph of $f(x, y) = \frac{1}{2}xy^2$.

15. The temperature on an unevenly heated metal plate positioned in the
first quadrant of the xy-plane is given by

\[ C(x, y) = \frac{25xy + 25}{(x - 1)^2 + (y - 1)^2 + 1}. \]

Assume that temperature is measured in degrees Celsius and that x and
y are each measured in inches. (Note: At no point in the following questions
should you expand the denominator of C(x, y).)

a. Determine $\frac{\partial C}{\partial x}\big|_{(x,y)}$ and $\frac{\partial C}{\partial y}\big|_{(x,y)}$.

b. If an ant is on the metal plate, standing at the point (2, 3), and starts
walking in the direction parallel to the positive y axis, at what rate will
the temperature the ant is experiencing change? Explain, and include
appropriate units.
c. If an ant is walking along the line y = 3 in the positive x direction,
at what instantaneous rate will the temperature the ant is experiencing
change when the ant passes the point (1, 3)?
d. Now suppose the ant is stationed at the point (6, 3) and walks in a straight
line towards the point (2, 0). Determine the average rate of change in
temperature (per unit distance traveled) the ant encounters in moving
between these two points. Explain your reasoning carefully. What are
the units on your answer?

16. Consider the function f defined by $f(x, y) = 8 - x^2 - 3y^2$.

a. Determine fx (x, y) and fy (x, y).

b. Find parametric equations in $\mathbb{R}^3$ for the tangent line to the trace f (x, 1)
at x = 2.
c. Find parametric equations in $\mathbb{R}^3$ for the tangent line to the trace f (2, y)
at y = 1.

d. State respective direction vectors for the two lines determined in (b) and
(c).
e. Determine the equation of the plane that passes through the point (2, 1, f (2, 1))
whose normal vector is orthogonal to the direction vectors of the two lines
found in (b) and (c).

f. Use a graphing utility to plot both the surface $z = 8 - x^2 - 3y^2$ and the
plane from (e) near the point (2, 1). What is the relationship between
the surface and the plane?

17. Recall from single variable calculus that, given the derivative of a sin-
gle variable function and an initial condition, we can integrate to find the
original function. We can sometimes use the same process for functions of
more than one variable. For example, suppose that a function f satisfies
$f_x(x, y) = \cos(y)e^x + 2x + y^2$, $f_y(x, y) = -\sin(y)e^x + 2xy + 3$, and f (0, 0) = 5.
a. Find all possible functions f of x and y such that $f_x(x, y) = \cos(y)e^x + 2x + y^2$.
Your function will have both x and y as independent variables
and may also contain summands that are functions of y alone.

b. Use the fact that $f_y(x, y) = -\sin(y)e^x + 2xy + 3$ to determine any unknown
non-constant summands in your result from part (a).
c. Complete the problem by determining the specific function f that satis-
fies the given conditions.
10.3 Second-Order Partial Derivatives

Motivating Questions

• Given a function f of two independent variables x and y, how are the
second-order partial derivatives of f defined?

• What do the second-order partial derivatives fxx , fyy , fxy , and fyx of a
function f tell us about the function’s behavior?

Recall that for a single-variable function f , the second derivative of f is
defined to be the derivative of the first derivative. That is, $f''(x) = \frac{d}{dx}[f'(x)]$,
which can be stated in terms of the limit definition of the derivative by writing

\[ f''(x) = \lim_{h \to 0} \frac{f'(x + h) - f'(x)}{h}. \]

In what follows, we begin exploring the four different second-order partial
derivatives of a function of two variables and seek to understand what these
various derivatives tell us about the function’s behavior.

Preview Activity 10.3.1. Once again, let’s consider the function f defined
by $f(x, y) = \frac{x^2 \sin(2y)}{32}$ that measures a projectile’s range as a function of its
initial speed x and launch angle y. The graph of this function, including traces
with x = 150 and y = 0.6, is shown in Figure 10.3.1.

Figure 10.3.1: The distance function with traces x = 150 and y = 0.6.

a. Compute the partial derivative fx . Notice that fx itself is a new function
of x and y, so we may now compute the partial derivatives of fx . Find
the partial derivative fxx = (fx )x and show that fxx (150, 0.6) ≈ 0.058.

b. Figure 10.3.2 shows the trace of f with y = 0.6 with three tangent lines
included. Explain how your result from part (a) of this preview activity
is reflected in this figure.

Figure 10.3.2: The trace with y = 0.6.

c. Determine the partial derivative fy , and then find the partial derivative
fyy = (fy )y . Evaluate fyy (150, 0.6).

Figure 10.3.3: More traces of the range function.

d. Figure 10.3.3 shows the trace f (150, y) and includes three tangent lines.
Explain how the value of fyy (150, 0.6) is reflected in this figure.

e. Because fx and fy are each functions of both x and y, they each have
two partial derivatives. Not only can we compute fxx = (fx )x , but also
fxy = (fx )y ; likewise, in addition to fyy = (fy )y , we can also compute
fyx = (fy )x . For the range function $f(x, y) = \frac{x^2 \sin(2y)}{32}$, use your earlier computations
of fx and fy to now determine fxy and fyx . Write one sentence to explain
how you calculated these “mixed” partial derivatives.

10.3.1 Second-order Partial Derivatives


A function f of two independent variables x and y has two first order partial
derivatives, fx and fy . As we saw in Preview Activity 10.3.1, each of these
first-order partial derivatives has two partial derivatives, giving a total of four
second-order partial derivatives:
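• $f_{xx} = (f_x)_x = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2}$,

• $f_{yy} = (f_y)_y = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2}$,

• $f_{xy} = (f_x)_y = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y \partial x}$,

• $f_{yx} = (f_y)_x = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x \partial y}$.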

The first two are called unmixed second-order partial derivatives while the
last two are called the mixed second-order partial derivatives.
One aspect of this notation can be a little confusing. The notation

\[ \frac{\partial^2 f}{\partial y \partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) \]

means that we first differentiate with respect to x and then with respect to y;
this can be expressed in the alternate notation fxy = (fx )y . However, to find
the second partial derivative
fyx = (fy )x
we first differentiate with respect to y and then x. This means that

\[ \frac{\partial^2 f}{\partial y \partial x} = f_{xy}, \quad \text{and} \quad \frac{\partial^2 f}{\partial x \partial y} = f_{yx}. \]

Be sure to note carefully the difference between Leibniz notation and sub-
script notation and the order in which x and y appear in each. In addition,
remember that anytime we compute a partial derivative, we hold constant the
variable(s) other than the one we are differentiating with respect to.

Activity 10.3.2. Find all second order partial derivatives of the following
functions. For each partial derivative you calculate, state explicitly which
variable is being held constant.

a. $f(x, y) = x^2 y^3$

b. f (x, y) = y cos(x)

c. $g(s, t) = st^3 + s^4$

d. How many second order partial derivatives does the function h defined
by $h(x, y, z) = 9x^9 z - xyz^9 + 9$ have? Find hxz and hzx (you do not need
to find the other second order partial derivatives).

In Preview Activity 10.3.1 and Activity 10.3.2, you may have noticed that
the mixed second-order partial derivatives are equal. This observation holds
generally and is known as Clairaut’s Theorem.
Clairaut’s Theorem.
Let f be a function of several variables for which the partial derivatives
fxy and fyx are continuous near the point (a, b). Then

fxy (a, b) = fyx (a, b).
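
Clairaut's Theorem can be checked numerically on a specific function. The Python sketch below is an illustration, not part of the original text. It assumes the hypothetical example f(x, y) = x e^{xy}, whose first-order partial derivatives are worked out by hand in the docstrings, and then differentiates f_x with respect to y and f_y with respect to x using symmetric difference quotients. The helper names `d_dx` and `d_dy` and the sample point (1, 0.5) are also assumptions made for the example.

```python
import math

def fx(x, y):
    """Hand-computed f_x for f(x, y) = x*exp(x*y): (1 + x*y)*exp(x*y)."""
    return (1 + x * y) * math.exp(x * y)

def fy(x, y):
    """Hand-computed f_y for f(x, y) = x*exp(x*y): x^2 * exp(x*y)."""
    return x**2 * math.exp(x * y)

def d_dx(func, x, y, h=1e-5):
    """Symmetric difference quotient in x, holding y fixed."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

def d_dy(func, x, y, h=1e-5):
    """Symmetric difference quotient in y, holding x fixed."""
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

a, b = 1.0, 0.5
f_xy = d_dy(fx, a, b)  # differentiate f_x with respect to y
f_yx = d_dx(fy, a, b)  # differentiate f_y with respect to x
print(f_xy, f_yx)
```

The two estimates come from different first-order partial derivatives, yet they should agree closely. Differentiating by hand gives (2x + x^2 y)e^{xy} for both mixed partials, which equals 2.5 e^{0.5} at the point (1, 0.5).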


10.3.2 Interpreting the second-order Partial Derivatives

Recall from single variable calculus that the second derivative measures the
instantaneous rate of change of the derivative. This observation is the key to
understanding the meaning of the second-order partial derivatives.

Figure 10.3.4: The tangent lines to a trace with increasing x.

Furthermore, we remember that the second derivative of a function at a
point provides us with information about the concavity of the function at that
point. Since the unmixed second-order partial derivative fxx requires us to
hold y constant and differentiate twice with respect to x, we may simply view
fxx as the second derivative of a trace of f where y is fixed. As such, fxx will
measure the concavity of this trace.
Consider, for example, $f(x, y) = \sin(x)e^{-y}$. Figure 10.3.4 shows the graph
of this function along with the trace given by y = −1.5. Also shown are three
tangent lines to this trace, with increasing x-values from left to right among
the three plots in Figure 10.3.4.
That the slope of the tangent line is decreasing as x increases is reflected,
as it is in one-variable calculus, in the fact that the trace is concave down.
Indeed, we see that $f_x(x, y) = \cos(x)e^{-y}$ and so $f_{xx}(x, y) = -\sin(x)e^{-y} < 0$
wherever $\sin(x) > 0$ (for instance, for $0 < x < \pi$), since $e^{-y} > 0$ for all
values of y, including y = −1.5.
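
The concavity of the y = −1.5 trace can also be estimated directly from function values. The Python sketch below is an illustration, not part of the original text. It approximates f_xx with a central second difference in x and compares the result with −sin(x)e^{−y}. The helper name `second_difference_x` and the sample x-values, chosen between 0 and π where sin(x) > 0, are assumptions made for the example.

```python
import math

def f(x, y):
    """The function from the text: sin(x) * e^(-y)."""
    return math.sin(x) * math.exp(-y)

def second_difference_x(f, x, y, h=1e-3):
    """Central second difference in x along the trace with y held fixed."""
    return (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2

y0 = -1.5
for x in (0.5, 1.75, 2.5):
    estimate = second_difference_x(f, x, y0)
    exact = -math.sin(x) * math.exp(-y0)
    print(x, estimate, exact)
```

At each sample point the estimate should be negative and close to the exact value, in agreement with the trace being concave down there.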
In the following activity, we further explore what second-order partial
derivatives tell us about the geometric behavior of a surface.

Activity 10.3.3. We continue to consider the function f defined by
$f(x, y) = \sin(x)e^{-y}$.

a. In Figure 10.3.5, we see the trace of $f(x, y) = \sin(x)e^{-y}$ that has x held
constant with x = 1.75. We also see three different lines that are tangent
to the trace of f in the x direction at values of y that are increasing
from left to right in the figure. Write a couple of sentences that describe
whether the slopes of the tangent lines to this curve increase or decrease as
y increases, and, after computing fyy (x, y), explain how this observation
is related to the value of fyy (1.75, y). Be sure to address the notion of
concavity in your response. (You need to be careful about the directions
in which x and y are increasing.)

Figure 10.3.5: The tangent lines to a trace with increasing y.

b. In Figure 10.3.6, we start to think about the mixed partial derivative,
fxy . Here, we first hold y constant to generate the first-order partial
derivative fx , and then we hold x constant to compute fxy . This leads
to first thinking about a trace with x being constant, followed by slopes
of tangent lines in the y-direction that slide along the original trace. You
might think of sliding your pencil down the trace with x constant in a
way that its slope indicates (fx )y in order to further animate the three
snapshots shown in the figure.

Figure 10.3.6: The trace of $z = f(x, y) = \sin(x)e^{-y}$ with x = 1.75, along
with tangent lines in the y-direction at three different points.

Based on Figure 10.3.6, is fxy (1.75, −1.5) positive or negative? Why?

c. Determine the formula for fxy (x, y), and hence evaluate fxy (1.75, −1.5).
How does this value compare with your observations in (b)?

d. We know that fxx (1.75, −1.5) measures the concavity of the y = −1.5
trace, and that fyy (1.75, −1.5) measures the concavity of the x = 1.75
trace. What do you think the quantity fxy (1.75, −1.5) measures?

e. On Figure 10.3.6, sketch the trace with y = −1.5, and sketch three tangent
lines whose slopes correspond to the value of fyx (x, −1.5) for three
different values of x, the middle of which is x = 1.75. Is fyx (1.75, −1.5)
positive or negative? Why? What does fyx (1.75, −1.5) measure?

Just as with the first-order partial derivatives, we can approximate second-order
partial derivatives in the situation where we have only partial information
about the function.

Activity 10.3.4. As we saw in Activity 10.2.5, the wind chill w(v, T ), in
degrees Fahrenheit, is a function of the wind speed, in miles per hour, and
the air temperature, in degrees Fahrenheit. Some values of the wind chill are
recorded in Table 10.3.7.

v\T  −30  −25  −20  −15  −10   −5    0    5   10   15   20
  5  −46  −40  −34  −28  −22  −16  −11   −5    1    7   13
 10  −53  −47  −41  −35  −28  −22  −16  −10   −4    3    9
 15  −58  −51  −45  −39  −32  −26  −19  −13   −7    0    6
 20  −61  −55  −48  −42  −35  −29  −22  −15   −9   −2    4
 25  −64  −58  −51  −44  −37  −31  −24  −17  −11   −4    3
 30  −67  −60  −53  −46  −39  −33  −26  −19  −12   −5    1
 35  −69  −62  −55  −48  −41  −34  −27  −21  −14   −7    0
 40  −71  −64  −57  −50  −43  −36  −29  −22  −15   −8   −1

Table 10.3.7: Wind chill as a function of wind speed and temperature.

a. Estimate the partial derivatives wT (20, −15), wT (20, −10), and wT (20, −5).
Use these results to estimate the second-order partial wT T (20, −10).

b. In a similar way, estimate the second-order partial wvv (20, −10).

c. Estimate the partial derivatives wT (20, −10), wT (25, −10), and wT (15, −10),
and use your results to estimate the partial wT v (20, −10).

d. In a similar way, estimate the partial derivative wvT (20, −10).

e. Write several sentences that explain what the values wT T (20, −10), wvv (20, −10),
and wT v (20, −10) indicate regarding the behavior of w(v, T ).
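
The two-stage estimate used in this activity (first estimate a first-order partial derivative at neighboring points, then take a difference quotient of those estimates) can be written out step by step. The Python sketch below is an illustration, not part of the original activity. It works along the row v = 30 of Table 10.3.7 at T = 0, a point deliberately different from those asked about above; the names `w_row_30` and `w_T` are hypothetical.

```python
# Entries of Table 10.3.7 along the row v = 30, keyed by temperature T.
w_row_30 = {-10: -39, -5: -33, 0: -26, 5: -19, 10: -12}

def w_T(T, h=5):
    """Symmetric difference estimate of w_T(30, T) from the table row."""
    return (w_row_30[T + h] - w_row_30[T - h]) / (2 * h)

# First-order estimates on either side of T = 0.
slope_left = w_T(-5)   # uses the entries at T = -10 and T = 0
slope_right = w_T(5)   # uses the entries at T = 0 and T = 10

# Second-order estimate: how fast the slope itself changes with T.
w_TT = (slope_right - slope_left) / (2 * 5)
print(slope_left, slope_right, w_TT)
```

Here the slopes are about 1.3 and 1.4, so the estimate of w_TT(30, 0) is about 0.01 degrees of wind chill per degree of air temperature, per degree of air temperature. Because the table entries are rounded to whole degrees, a second-order estimate this small is only a rough indication.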

Figure 10.3.8: The graph of f (x, y) = −xy.


As we have found in Activity 10.3.3 and Activity 10.3.4, we may think of
fxy as measuring the “twist” of the graph as we increase y along a particular
trace where x is held constant. In the same way, fyx measures how the graph
twists as we increase x. If we remember that Clairaut’s theorem tells us that
fxy = fyx , we see that the amount of twisting is the same in both directions.
This twisting is perhaps more easily seen in Figure 10.3.8, which shows the
graph of f (x, y) = −xy, for which fxy = −1.

10.3.3 Summary

• There are four second-order partial derivatives of a function f of two
independent variables x and y:

fxx = (fx )x , fxy = (fx )y , fyx = (fy )x , and fyy = (fy )y .

• The unmixed second-order partial derivatives, fxx and fyy , tell us about
the concavity of the traces. The mixed second-order partial derivatives,
fxy and fyx , tell us how the graph of f twists.

Exercises
1. Calculate all four second-order partial derivatives of $f(x, y) = 4x^2 y + 6xy^3$.
fxx (x, y) =
fxy (x, y) =
fyx (x, y) =
fyy (x, y) =
2. Find all the first and second order partial derivatives of f (x, y) =
5 sin(2x + y) − 8 cos(x − y).
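A. $\frac{\partial f}{\partial x} = f_x =$
B. $\frac{\partial f}{\partial y} = f_y =$
C. $\frac{\partial^2 f}{\partial x^2} = f_{xx} =$
D. $\frac{\partial^2 f}{\partial y^2} = f_{yy} =$
E. $\frac{\partial^2 f}{\partial x \partial y} = f_{yx} =$
F. $\frac{\partial^2 f}{\partial y \partial x} = f_{xy} =$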
3. Find the partial derivatives of the function

\[ f(x, y) = xye^{7y} \]

fx (x, y) =
fy (x, y) =
fxy (x, y) =
fyx (x, y) =
 
4. Calculate all four second-order partial derivatives of $f(x, y) = \sin\left(\frac{5x}{y}\right)$.
fxx (x, y) =
fxy (x, y) =
fyx (x, y) =
fyy (x, y) =
5. Given F (r, s, t) = −r 8s6 + 8t4 , compute:


Frst =
6. Calculate all four second-order partial derivatives and check that fxy =
fyx . Assume the variables are restricted to a domain on which the function is
defined.

\[ f(x, y) = e^{2xy} \]

fxx =
fyy =
fxy =
fyx =

7. Calculate all four second-order partial derivatives of $f(x, y) = (3x + 3y)e^y$.


fxx (x, y) =
fxy (x, y) =
fyx (x, y) =
fyy (x, y) =
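8. Let $f(x, y) = (-(x + y))^5$. Then

$\frac{\partial^2 f}{\partial x \partial y} =$
$\frac{\partial^3 f}{\partial x \partial y \partial x} =$
$\frac{\partial^3 f}{\partial x^2 \partial y} =$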

9. If zxy = 6y and all of the second order partial derivatives of z are
continuous, then
(a) zyx =
(b) zxyx =
(c) zxyy =

10. If z = f (x) + yg(x), what can we say about zyy ?

zyy = y

zyy = zxx

zyy = g(x)

zyy = 0

We cannot say anything

11. Shown in Figure 10.3.9 is a contour plot of a function f with the values
of f labeled on the contours. The point (2, 1) is highlighted in red.

Figure 10.3.9: A contour plot of f (x, y).

a. Estimate the partial derivatives fx (2, 1) and fy (2, 1).

b. Determine whether the second-order partial derivative fxx (2, 1) is positive
or negative, and explain your thinking.

c. Determine whether the second-order partial derivative fyy (2, 1) is positive
or negative, and explain your thinking.

d. Determine whether the second-order partial derivative fxy (2, 1) is positive
or negative, and explain your thinking.

e. Determine whether the second-order partial derivative fyx (2, 1) is positive
or negative, and explain your thinking.

f. Consider a function g of the variables x and y for which gx (2, 2) > 0 and
gxx (2, 2) < 0. Sketch possible behavior of some contours around (2, 2)
on the left axes in Figure 10.3.10.

Figure 10.3.10: Plots for contours of g and h (two blank sets of x-y axes).

g. Consider a function h of the variables x and y for which h_x(2, 2) > 0
and h_xy(2, 2) < 0. Sketch possible behavior of some contour lines around
(2, 2) on the right axes in Figure 10.3.10.

12. The Heat Index, I, (measured in apparent degrees F) is a function of
the actual temperature T outside (in degrees F) and the relative humidity H
(measured as a percentage). A portion of the table which gives values for this
function, I(T, H), is reproduced in Table 10.3.11.

T ↓\H → 70 75 80 85
90 106 109 112 115
92 112 115 119 123
94 118 122 127 132
96 125 130 135 141

Table 10.3.11: Heat index.

a. State the limit definition of the value I_TT(94, 75). Then, estimate
I_TT(94, 75), and write one complete sentence that carefully explains the
meaning of this value, including units.
b. State the limit definition of the value I_HH(94, 75). Then, estimate
I_HH(94, 75), and write one complete sentence that carefully explains the
meaning of this value, including units.
c. Finally, do likewise to estimate I_HT(94, 75), and write a sentence to
explain the meaning of the value you found.

13. The temperature on a heated metal plate positioned in the first quadrant
of the xy-plane is given by

C(x, y) = 25e^{−(x−1)^2 − (y−1)^3}.

Assume that temperature is measured in degrees Celsius and that x and y
are each measured in inches.
a. Determine Cxx (x, y) and Cyy (x, y). Do not do any additional work to
algebraically simplify your results.
b. Calculate Cxx (1.1, 1.2). Suppose that an ant is walking past the point
(1.1, 1.2) along the line y = 1.2. Write a sentence to explain the meaning
of the value of Cxx (1.1, 1.2), including units.
c. Calculate Cyy (1.1, 1.2). Suppose instead that an ant is walking past the
point (1.1, 1.2) along the line x = 1.1. Write a sentence to explain the
meaning of the value of Cyy (1.1, 1.2), including units.
d. Determine Cxy (x, y) and hence compute Cxy (1.1, 1.2). What is the mean-
ing of this value? Explain, in terms of an ant walking on the heated metal
plate.

14. Let f(x, y) = 8 − x^2 − y^2 and g(x, y) = 8 − x^2 + 4xy − y^2.

a. Determine f_x, f_y, f_xx, f_yy, f_xy, and f_yx.
b. Evaluate each of the partial derivatives in (a) at the point (0, 0).
c. What do the values in (b) suggest about the behavior of f near (0, 0)?
Plot a graph of f and compare what you see visually to what the values
suggest.
d. Determine gx , gy , gxx , gyy , gxy , and gyx .

e. Evaluate each of the partial derivatives in (d) at the point (0, 0).
f. What do the values in (e) suggest about the behavior of g near (0, 0)?
Plot a graph of g and compare what you see visually to what the values
suggest.
g. What do the functions f and g have in common at (0, 0)? What is
different? What do your observations tell you regarding the importance
of a certain second-order partial derivative?

15. Let f(x, y) = (1/2)xy^2 represent the kinetic energy in Joules of an object
of mass x in kilograms with velocity y in meters per second. Let (a, b) be the
point (4, 5) in the domain of f.
a. Calculate ∂^2 f/∂x^2 at the point (a, b). Then explain as best you can what
this second order partial derivative tells us about kinetic energy.
b. Calculate ∂^2 f/∂y^2 at the point (a, b). Then explain as best you can what
this second order partial derivative tells us about kinetic energy.
c. Calculate ∂^2 f/∂y∂x at the point (a, b). Then explain as best you can what
this second order partial derivative tells us about kinetic energy.
d. Calculate ∂^2 f/∂x∂y at the point (a, b). Then explain as best you can what
this second order partial derivative tells us about kinetic energy.
10.4 Linearization: Tangent Planes and Differentials

Motivating Questions

• What does it mean for a function of two variables to be locally linear at
a point?

• How do we find the equation of the plane tangent to a locally linear
function at a point?

• What does it mean to say that a multivariable function is differentiable?

• What is the differential of a multivariable function of two variables and
what are its uses?

One of the central concepts in single variable calculus is that the graph of
a differentiable function, when viewed on a very small scale, looks like a line.
We call this line the tangent line and measure its slope with the derivative. In
this section, we will extend this concept to functions of several variables.
Let’s see what happens when we look at the graph of a two-variable function
on a small scale. To begin, let’s consider the function f defined by

f(x, y) = 6 − x^2/2 − y^2,

whose graph is shown in Figure 10.4.1.

Figure 10.4.1: The graph of f(x, y) = 6 − x^2/2 − y^2.

We choose to study the behavior of this function near the point (x0 , y0 ) =
(1, 1). In particular, we wish to view the graph on an increasingly small scale
around this point, as shown in the two plots in Figure 10.4.2.

Figure 10.4.2: The graph of f(x, y) = 6 − x^2/2 − y^2.

Just as the graph of a differentiable single-variable function looks like a
line when viewed on a small scale, we see that the graph of this particular
two-variable function looks like a plane, as seen in Figure 10.4.3. In the
following preview activity, we explore how to find the equation of this plane.

Figure 10.4.3: The graph of f(x, y) = 6 − x^2/2 − y^2.

In what follows, we will also use the important fact¹ that a non-vertical plane
passing through (x_0, y_0, z_0) may be expressed in the form
z = z_0 + a(x − x_0) + b(y − y_0), where a and b are constants.
Preview Activity 10.4.1. Let f(x, y) = 6 − x^2/2 − y^2, and let (x_0, y_0) = (1, 1).
a. Evaluate f(x, y) = 6 − x^2/2 − y^2 and its partial derivatives at (x_0, y_0);
that is, find f(1, 1), f_x(1, 1), and f_y(1, 1).

¹ As we saw in Section 9.5, the equation of a plane passing through the point
(x_0, y_0, z_0) may be written in the form A(x − x_0) + B(y − y_0) + C(z − z_0) = 0.
If the plane is not vertical, then C ≠ 0, and we can rearrange this to write
C(z − z_0) = −A(x − x_0) − B(y − y_0) and thus
z = z_0 − (A/C)(x − x_0) − (B/C)(y − y_0)
  = z_0 + a(x − x_0) + b(y − y_0),
where a = −A/C and b = −B/C, respectively.

b. We know one point on the tangent plane; namely, the z-value of the
tangent plane agrees with the z-value on the graph of f(x, y) = 6 − x^2/2 − y^2
at the point (x_0, y_0). In other words, both the tangent plane and the
graph of the function f contain the point (x_0, y_0, z_0). Use this observation
to determine z_0 in the expression z = z_0 + a(x − x_0) + b(y − y_0).
c. Sketch the traces of f(x, y) = 6 − x^2/2 − y^2 for y = y_0 = 1 and x = x_0 = 1
below in Figure 10.4.4.

Figure 10.4.4: The traces of f(x, y) with y = y_0 = 1 and x = x_0 = 1 (blank
axes for z = f(x, 1) and z = f(1, y)).

d. Determine the equation of the tangent line of the trace that you sketched
in the previous part with y = 1 (in the x direction) at the point x0 = 1.

Figure 10.4.5: The traces of f(x, y) and the tangent plane.

e. Figure 10.4.5 shows the traces of the function and the traces of the tan-
gent plane. Explain how the tangent line of the trace of f , whose equation
you found in the last part of this activity, is related to the tangent plane.
How does this observation help you determine the constant a in the equa-
tion for the tangent plane z = z0 + a(x − x0 ) + b(y − y0 )? (Hint: How do
you think fx (x0 , y0 ) should be related to zx (x0 , y0 )?)

f. In a similar way to what you did in (d), determine the equation of the
tangent line of the trace with x = 1 at the point y0 = 1. Explain how
this tangent line is related to the tangent plane, and use this observation
to determine the constant b in the equation for the tangent plane z =
z0 + a(x − x0 ) + b(y − y0 ). (Hint: How do you think fy (x0 , y0 ) should be
related to zy (x0 , y0 )?)

g. Finally, write the equation z = z_0 + a(x − x_0) + b(y − y_0) of the tangent
plane to the graph of f(x, y) = 6 − x^2/2 − y^2 at the point (x_0, y_0) = (1, 1).

10.4.1 The tangent plane


Before stating the formula for the equation of the tangent plane at a point for
a general function f = f (x, y), we need to discuss a technical condition. As we
have noted, when we look at the graph of a single-variable function on a small
scale near a point x0 , we expect to see a line; in this case, we say that f is
locally linear near x0 since the graph looks like a linear function locally around
x0 . Of course, there are functions, such as the absolute value function given
by f (x) = |x|, that are not locally linear at every point. In single-variable
calculus, we learn that if the derivative of a function exists at a point, then the
function is guaranteed to be locally linear there.
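The contrast between a locally linear function and one with a corner can also be
checked numerically. The following is a minimal sketch in Python; the helper
name difference_quotient and the comparison function x^2 are our own choices
for illustration and do not come from the text. It computes one-sided
difference quotients at x_0 = 0 for |x| and for x^2.

```python
def difference_quotient(f, x0, h):
    """Slope of the secant line through (x0, f(x0)) and (x0 + h, f(x0 + h))."""
    return (f(x0 + h) - f(x0)) / h


def square(x):
    """An illustrative differentiable function, chosen only for comparison."""
    return x * x


for h in (0.1, 0.01, 0.001):
    # A negative step samples the function to the left of x0 = 0.
    abs_left = difference_quotient(abs, 0.0, -h)
    abs_right = difference_quotient(abs, 0.0, h)
    sq_left = difference_quotient(square, 0.0, -h)
    sq_right = difference_quotient(square, 0.0, h)
    print(f"h = {h}: |x| slopes ({abs_left}, {abs_right}), "
          f"x^2 slopes ({sq_left}, {sq_right})")
```

For |x| the left and right slopes stay at −1 and 1 no matter how small h is, so
no single line fits the graph near 0. For x^2 both slopes shrink toward 0, the
slope of the tangent line at the origin.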
In a similar way, we say that a two-variable function f is locally linear near
(x0 , y0 ) provided that the graph of f looks like a plane (its tangent plane) when
viewed on a small scale near (x0 , y0 ). How can we tell when a function of two
variables is locally linear at a point?
It might seem reasonable to expect that if f_x(a, b) and f_y(a, b) exist for
some function f at a point (a, b), then f is locally linear at (a, b). The
existence of the partial derivatives is not sufficient, however. As an example,
consider the function f defined by f(x, y) = x^{1/3} y^{1/3}. In Exercise 10.4.11
you are asked to show that f_x(0, 0) and f_y(0, 0) both exist, but that f is not
locally linear at (0, 0) (see Figure 10.4.12). So the existence of the two first
order partial derivatives at a point does not guarantee
local linearity at that point.
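The behavior of f(x, y) = x^{1/3} y^{1/3} near the origin can be probed
numerically. The sketch below is only an illustration under our own
assumptions: the helper names (cube_root, slope_along_x_axis,
slope_along_diagonal) are hypothetical, and a few sampled step sizes suggest,
but do not prove, the behavior of the limits.

```python
import math


def cube_root(t):
    """Real cube root, valid for negative inputs as well."""
    return math.copysign(abs(t) ** (1.0 / 3.0), t)


def f(x, y):
    """f(x, y) = x^(1/3) * y^(1/3), the example from the text."""
    return cube_root(x) * cube_root(y)


def slope_along_x_axis(h):
    """Difference quotient for f_x(0, 0): move only in the x-direction."""
    return (f(h, 0.0) - f(0.0, 0.0)) / h


def slope_along_diagonal(t):
    """Difference quotient along the line y = x, per unit change in x."""
    return (f(t, t) - f(0.0, 0.0)) / t


for step in (1e-1, 1e-3, 1e-6):
    print(step, slope_along_x_axis(step), slope_along_diagonal(step))
```

Along the x-axis every quotient is 0, which is consistent with f_x(0, 0) = 0.
Along the line y = x the quotients grow without settling as the step shrinks,
so the graph cannot look like a single plane near the origin.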
It would take us too far afield to provide a rigorous discussion of
differentiability of functions of more than one variable (see Exercise 10.4.15
for a little more detail), so we will be content to just state conditions that
ensure local linearity.

Differentiability.
If f is a function of the independent variables x and y and both fx
and fy exist and are continuous in an open disk containing the point
(x0 , y0 ), then f is differentiable at (x0 , y0 ).

As a consequence, whenever a function z = f(x, y) is differentiable at a
point (x_0, y_0), it follows that the function has a tangent plane at (x_0, y_0).
Viewed up close, the tangent plane and the function are then virtually indis-
tinguishable. In addition, as in Preview Activity 10.4.1, we find the following
general formula for the tangent plane.
The tangent plane.


If f (x, y) has continuous first-order partial derivatives, then the equa-
tion of the plane tangent to the graph of f at the point (x0 , y0 , f (x0 , y0 ))
is

z = f (x0 , y0 ) + fx (x0 , y0 )(x − x0 ) + fy (x0 , y0 )(y − y0 ). (10.4.1)
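As a rough numerical illustration of formula (10.4.1), the following Python
sketch estimates the two partial derivatives with central differences and
assembles the coefficients of the tangent plane. The function names and the
sample function x^2 + 3xy are our own assumptions, chosen for illustration and
not taken from the text, and the finite-difference step is an arbitrary choice.

```python
def partial_x(f, x0, y0, h=1e-5):
    """Central-difference estimate of f_x(x0, y0)."""
    return (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)


def partial_y(f, x0, y0, h=1e-5):
    """Central-difference estimate of f_y(x0, y0)."""
    return (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)


def tangent_plane(f, x0, y0):
    """Return (z0, a, b) so that the plane is z = z0 + a(x - x0) + b(y - y0)."""
    return f(x0, y0), partial_x(f, x0, y0), partial_y(f, x0, y0)


def sample(x, y):
    """Illustrative function (not from the text): x^2 + 3xy."""
    return x ** 2 + 3 * x * y


z0, a, b = tangent_plane(sample, 1.0, 2.0)
print(f"z = {z0:.4f} + {a:.4f}(x - 1) + {b:.4f}(y - 2)")
```

For this sample function the exact partial derivatives are 2x + 3y and 3x, so
the estimated coefficients should be close to those of the plane
z = 7 + 8(x − 1) + 3(y − 2).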

Important Note: As can be seen in Exercise 10.4.11, it is possible for
f_x(x_0, y_0) and f_y(x_0, y_0) to exist for a function f, so that the plane
z = f(x_0, y_0) + f_x(x_0, y_0)(x − x_0) + f_y(x_0, y_0)(y − y_0) is defined,
even though f is not locally linear at (x_0, y_0) (because the graph of f does
not look linear when we zoom in around the point (x_0, y_0)). In such a case
this plane is not tangent to the graph. Differentiability for a function of two
variables implies the existence of a tangent plane, but the existence of the two
first order partial derivatives of a function at a point does not imply
differentiability. This is quite different from what happens in single variable
calculus.
Finally, one important note about the form of the equation for the tangent
plane, z = f(x_0, y_0) + f_x(x_0, y_0)(x − x_0) + f_y(x_0, y_0)(y − y_0). Say, for
example, that we have the particular tangent plane z = 7 − 2(x − 3) + 4(y + 1).
Observe that we can immediately read from this form that f_x(3, −1) = −2
and f_y(3, −1) = 4; furthermore, f_x(3, −1) = −2 is the slope of the trace of
both f and the tangent plane in the x-direction at (3, −1). In the same way,
f_y(3, −1) = 4 is the slope of the trace of both f and the tangent plane in the
y-direction at (3, −1).

Activity 10.4.2.

a. Find the equation of the tangent plane to f(x, y) = 2 + 4x − 3y at the
point (1, 2). Simplify as much as possible. Does the result surprise you?
Explain.

b. Find the equation of the tangent plane to f(x, y) = x^2 y at the point
(1, 2).

Figure 10.4.6: The linearization of the single-variable function f(x) (two
panels, each showing y = f(x) and y = L(x) near x_0).

10.4.2 Linearization
In single variable calculus, an important use of the tangent line is to approx-
imate the value of a differentiable function. Near the point x0 , the tangent
line to the graph of f at x0 is close to the graph of f near x0 , as shown in
Figure 10.4.6.
In this single-variable setting, we let L denote the function whose graph is
the tangent line, and thus

L(x) = f(x_0) + f'(x_0)(x − x_0).

Furthermore, observe that f(x) ≈ L(x) near x_0. We call L the linearization
of f.
In the same way, the tangent plane to the graph of a differentiable function
z = f (x, y) at a point (x0 , y0 ) provides a good approximation of f (x, y) near
(x0 , y0 ). Here, we define the linearization, L, to be the two-variable function
whose graph is the tangent plane, and thus

L(x, y) = f (x0 , y0 ) + fx (x0 , y0 )(x − x0 ) + fy (x0 , y0 )(y − y0 ).

Finally, note that f (x, y) ≈ L(x, y) for points near (x0 , y0 ). This is illus-
trated in Figure 10.4.7.

Figure 10.4.7: The linearization of f(x, y) (the surface z = f(x, y) together
with the plane z = L(x, y)).
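To get a feel for how good the approximation f(x, y) ≈ L(x, y) is, we can
compare the two numerically. The sketch below uses an illustrative function of
our own choosing, f(x, y) = x e^y at the base point (1, 0), which is not an
example from the text; the partial derivatives are computed by hand and entered
as constants.

```python
import math


def f(x, y):
    """Illustrative function (not from the text): f(x, y) = x * e^y."""
    return x * math.exp(y)


# Base point and hand-computed values there:
# f(1, 0) = 1, f_x = e^y = 1 and f_y = x * e^y = 1 at (1, 0).
X0, Y0 = 1.0, 0.0
F0, FX0, FY0 = 1.0, 1.0, 1.0


def L(x, y):
    """Linearization of f at (X0, Y0)."""
    return F0 + FX0 * (x - X0) + FY0 * (y - Y0)


def error(d):
    """|f - L| at the point reached by moving d in both x and y."""
    return abs(f(X0 + d, Y0 + d) - L(X0 + d, Y0 + d))


for d in (0.1, 0.01, 0.001):
    print(f"d = {d}: error = {error(d):.2e}")
```

In this example, shrinking the displacement by a factor of 10 shrinks the error
by roughly a factor of 100, which is the sense in which the tangent plane hugs
the graph near the base point.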

Activity 10.4.3. In what follows, we find the linearization of several different
functions that are given in algebraic, tabular, or graphical form.

a. Find the linearization L(x, y) for the function g defined by

g(x, y) = x / (x^2 + y^2)

at the point (1, 2). Then use the linearization to estimate the value of
g(0.8, 2.3).

b. Table 10.4.8 provides a collection of values of the wind chill w(v, T), in
degrees Fahrenheit, as a function of wind speed, in miles per hour, and
temperature, also in degrees Fahrenheit.

v\T −30 −25 −20 −15 −10 −5 0 5 10 15 20


5 −46 −40 −34 −28 −22 −16 −11 −5 1 7 13
10 −53 −47 −41 −35 −28 −22 −16 −10 −4 3 9
15 −58 −51 −45 −39 −32 −26 −19 −13 −7 0 6
20 −61 −55 −48 −42 −35 −29 −22 −15 −9 −2 4
25 −64 −58 −51 −44 −37 −31 −24 −17 −11 −4 3
30 −67 −60 −53 −46 −39 −33 −26 −19 −12 −5 1
35 −69 −62 −55 −48 −41 −34 −27 −21 −14 −7 0
40 −71 −64 −57 −50 −43 −36 −29 −22 −15 −8 −1

Table 10.4.8: Wind chill as a function of wind speed and temperature.

Use the data to first estimate the appropriate partial derivatives, and
then find the linearization L(v, T ) at the point (20, −10). Finally, use
the linearization to estimate w(10, −10), w(20, −12), and w(18, −12).
Compare your results to what you obtained in Activity 10.2.5.

c. Figure 10.4.9 gives a contour plot of a differentiable function f .

Figure 10.4.9: A contour plot of f(x, y).

After estimating appropriate partial derivatives, determine the linearization
L(x, y) at the point (2, 1), and use it to estimate f(2.2, 1), f(2, 0.8),
and f(2.2, 0.8).

10.4.3 Differentials

As we have seen, the linearization L(x, y) enables us to estimate the value of
f(x, y) for points (x, y) near the base point (x_0, y_0). Sometimes, however, we
are more interested in the change in f as we move from the base point (x_0, y_0)
to another point (x, y).

Figure 10.4.10: The differential df approximates the change in f(x, y).

Figure 10.4.10 illustrates this situation. Suppose we are at the point (x_0, y_0),
and we know the value f(x_0, y_0) of f at (x_0, y_0). If we consider the
displacement ⟨∆x, ∆y⟩ to a new point (x, y) = (x_0 + ∆x, y_0 + ∆y), we would
like to know how much the function has changed. We denote this change by ∆f,
where

∆f = f(x, y) − f(x_0, y_0).

A simple way to estimate the change ∆f is to approximate it by df , which
represents the change in the linearization L(x, y) as we move from (x0 , y0 ) to
(x, y). This gives

∆f ≈ df = L(x, y) − f (x0 , y0 )
= [f (x0 , y0 ) + fx (x0 , y0 )(x − x0 ) + fy (x0 , y0 )(y − y0 )] − f (x0 , y0 )
= fx (x0 , y0 )∆x + fy (x0 , y0 )∆y.

For consistency, we will denote the change in the independent variables as
dx = ∆x and dy = ∆y, and thus

∆f ≈ df = fx (x0 , y0 ) dx + fy (x0 , y0 ) dy. (10.4.2)

Expressed equivalently in Leibniz notation, we have

df = (∂f/∂x) dx + (∂f/∂y) dy. (10.4.3)

We call the quantities dx, dy, and df differentials, and we think of them
as measuring small changes in the quantities x, y, and f . Equations (10.4.2)
and (10.4.3) express the relationship between these changes. Equation (10.4.3)
resembles an important idea from single-variable calculus: when y depends on
x, it follows in the notation of differentials that

dy = y' dx = (dy/dx) dx.

We will illustrate the use of differentials with an example.
Example 10.4.11. Suppose we have a machine that manufactures rectangles
of width x = 20 cm and height y = 10 cm. However, the machine isn’t perfect,
and therefore the width could be off by dx = ∆x = 0.2 cm and the height
could be off by dy = ∆y = 0.4 cm.
The area of the rectangle is

A(x, y) = xy,

so that the area of a perfectly manufactured rectangle is A(20, 10) = 200
square centimeters. Since the machine isn’t perfect, we would like to know
how much the area of a given manufactured rectangle could differ from the
perfect rectangle. We will estimate the uncertainty in the area using (10.4.2),
and find that

∆A ≈ dA = Ax (20, 10) dx + Ay (20, 10) dy.

Since Ax = y and Ay = x, we have

∆A ≈ dA = 10 dx + 20 dy = 10 · 0.2 + 20 · 0.4 = 10.

That is, we estimate that the area in our rectangles could be off by as much
as 10 square centimeters.
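The estimate in Example 10.4.11 can be compared with the exact change in area.
The short Python sketch below (variable names are our own) evaluates the
differential dA and the exact change for each combination of an oversized or
undersized width and height.

```python
def area(x, y):
    """Area of a rectangle with width x and height y."""
    return x * y


x0, y0 = 20.0, 10.0      # nominal width and height, in cm
dx, dy = 0.2, 0.4        # largest possible errors, in cm

# Differential estimate: dA = A_x dx + A_y dy with A_x = y and A_y = x.
dA = y0 * dx + x0 * dy

# Exact change in area for each combination of over- and under-sizing.
exact_changes = [
    area(x0 + sx * dx, y0 + sy * dy) - area(x0, y0)
    for sx in (-1, 1)
    for sy in (-1, 1)
]
worst_exact = max(abs(change) for change in exact_changes)

print(f"dA = {dA:.2f} cm^2, worst exact change = {worst_exact:.2f} cm^2")
```

The largest exact change is 10.08 square centimeters, which differs from the
differential estimate of 10 by the product dx · dy = 0.08, the one term that
the linear approximation leaves out.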
Activity 10.4.4. The questions in this activity explore the differential in sev-
eral different contexts.
a. Suppose that the elevation of a landscape is given by the function h,
where we additionally know that h(3, 1) = 4.35, hx (3, 1) = 0.27, and
hy (3, 1) = −0.19. Assume that x and y are measured in miles in the east-
erly and northerly directions, respectively, from some base point (0, 0).
Your GPS device says that you are currently at the point (3, 1). However,
you know that the coordinates are only accurate to within 0.2 units; that
is, dx = ∆x = 0.2 and dy = ∆y = 0.2. Estimate the uncertainty in your
elevation using differentials.
b. The pressure, volume, and temperature of an ideal gas are related by the
equation
P = P (T, V ) = 8.31T /V,
where P is measured in kilopascals, V in liters, and T in kelvin. Find
the pressure when the volume is 12 liters and the temperature is 310 K.
Use differentials to estimate the change in the pressure when the volume
increases to 12.3 liters and the temperature decreases to 305 K.
c. Refer to Table 10.4.8, the table of values of the wind chill w(v, T), in
degrees Fahrenheit, as a function of wind speed, in miles per hour, and
temperature, also in degrees Fahrenheit. Suppose your anemometer says
the wind is blowing at 25 miles per hour and your thermometer shows a
reading of −15 degrees. However, you know your thermometer is only
accurate to within 2 degrees and your anemometer is only accurate to
within 3 miles per hour. What is the wind chill based on your
measurements? Estimate the uncertainty in your measurement of the wind
chill.

10.4.4 Summary

• A function f of two independent variables is locally linear at a point
(x_0, y_0) if the graph of f looks like a plane as we zoom in on the graph
around the point (x_0, y_0). In this case, the equation of the tangent plane
is given by

z = f(x_0, y_0) + f_x(x_0, y_0)(x − x_0) + f_y(x_0, y_0)(y − y_0).

• The tangent plane L(x, y) = f (x0 , y0 )+fx (x0 , y0 )(x−x0 )+fy (x0 , y0 )(y −
y0 ), when considered as a function, is called the linearization of a differ-
entiable function f at (x0 , y0 ) and may be used to estimate values of
f (x, y); that is, f (x, y) ≈ L(x, y) for points (x, y) near (x0 , y0 ).

• A function f of two independent variables is differentiable at (x_0, y_0)
provided that both f_x and f_y exist and are continuous in an open disk
containing the point (x_0, y_0).

• The differential df of a function f = f(x, y) is related to the differentials
dx and dy by

df = f_x(x_0, y_0) dx + f_y(x_0, y_0) dy.

We can use this relationship to approximate small changes in f that
result from small changes in x and y.

Exercises
1. Find the linearization L(x, y) of the function f(x, y) = \sqrt{89 − 9x^2 − 1y^2}
at (3, −2).
L (x, y) =
Note: Your answer should be an expression in x and y; e.g. “3x - 5y + 9”
2. Find the equation of the tangent plane to the surface z = e^{3x/17} ln(3y)
at the point (3, 3, 3.731).
z=
Note: Your answer should be an expression of x and y; e.g. “5x + 2y - 3”
3. A student was asked to find the equation of the tangent plane to the
surface z = x^4 − y^5 at the point (x, y) = (2, 2). The student’s answer was
z = −16 + 4x^3(x − 2) − 5y^4(y − 2).

(a) At a glance, how do you know this is wrong? What mistakes did the
student make? Select all that apply.

The −16 should not be in the answer.

The answer is not a linear function.

The (x − 2) and (y − 2) should be x and y.

The partial derivatives were not evaluated at the point.

All of the above

(b) Find the correct equation for the tangent plane.
z=
4. (a) Check the local linearity of f(x, y) = e^y sin(x) near x = 2.5, y = −1
by filling in the following table of values of f for x = 2.4, 2.5, 2.6 and y =
−1.1, −1, −0.9. Express values of f with 4 digits after the decimal point.

x= 2.4 2.5 2.6


y = −1.1
y = −1
y = −0.9

(b) Next, fill in the table for the values x = 2.49, 2.5, 2.51 and y =
−1.01, −1, −0.99, again showing 4 digits after the decimal point.

x= 2.49 2.5 2.51


y = −1.01
y = −1
y = −0.99

Notice whether the two tables look nearly linear, and whether the second looks
more linear than the first (in particular, think about how you would decide if
they were linear, or if the one were more closely linear than the other).
(c) Give the local linearization of f(x, y) = e^y sin(x) at (2.5, −1):
Using the second of your tables:
f(x, y) ≈
Using the fact that f_x(x, y) = e^y cos(x) and f_y(x, y) = e^y sin(x):
f(x, y) ≈
5. Suppose that z is a linear function of x and y with slope -4 in the x
direction and slope 1 in the y direction.
(a) A change of −0.4 in x and 0.2 in y produces what change in z?
change in z =
(b) If z = 7 when x = 3 and y = 5, what is the value of z when x = 2.9
and y = 5.1?
z=
6. Find the differential of the function w = x^3 sin(y^2 z^4).
dw = ____ dx + ____ dy + ____ dz
7. The dimensions of a closed rectangular box are measured as 70
centimeters, 100 centimeters, and 100 centimeters, respectively, with the error
in each measurement at most 0.2 centimeters. Use differentials to estimate the
maximum error in calculating the surface area of the box.
square centimeters
8. One mole of ammonia gas is contained in a vessel which is capable of
changing its volume (a compartment sealed by a piston, for example). The
total energy U (in Joules) of the ammonia is a function of the volume V (in
cubic meters) of the container, and the temperature T (in degrees Kelvin) of
the gas. The differential dU is given by dU = 840dV + 27.32dT .
(a) How does the energy change if the volume is held constant and the
temperature is increased slightly?

it decreases slightly

it does not change

it increases slightly

(b) How does the energy change if the temperature is held constant and
the volume is decreased slightly?

it decreases slightly

it does not change

it increases slightly

(c) Find the approximate change in energy if the gas is compressed by 450
cubic centimeters and heated by 3 degrees Kelvin.
Change in energy = .
Please include units in your answer.
9. An unevenly heated metal plate has temperature T (x, y) in degrees Cel-
sius at a point (x, y). If T (2, 1) = 140, Tx (2, 1) = 15, and Ty (2, 1) = −14,
estimate the temperature at the point (2.04, 0.98).
T (2.04, 0.98) ≈ .
Please include units in your answer.
10. Let f be the function defined by f(x, y) = 2x^2 + 3y^3.

a. Find the equation of the tangent plane to f at the point (1, 1).

b. Use the linearization to approximate the values of f at the points
(1.1, 2.05) and (1.3, 2.2).

c. Compare the approximations from part (b) to the exact values of
f(1.1, 2.05) and f(1.3, 2.2). Which approximation is more accurate? Explain
why this should be expected.

11. Let f be the function defined by f(x, y) = x^{1/3} y^{1/3}, whose graph is
shown in Figure 10.4.12.

Figure 10.4.12: The surface for f(x, y) = x^{1/3} y^{1/3}.

a. Determine

lim_{h→0} [f(0 + h, 0) − f(0, 0)] / h.

What does this limit tell us about f_x(0, 0)?

b. Note that f (x, y) = f (y, x), and this symmetry implies that fx (0, 0) =
fy (0, 0). So both partial derivatives of f exist at (0, 0). A picture of the
surface defined by f near (0, 0) is shown in Figure 10.4.12. Based on this
picture, do you think f is locally linear at (0, 0)? Why?

c. Show that the curve where x = y on the surface defined by f is not
differentiable at 0. What does this tell us about the local linearity of f
at (0, 0)?

d. Is the function f defined by f(x, y) = x^2/(y^2 + 1) locally linear at
(0, 0)? Why or why not?

12. Let g be a function that is differentiable at (−2, 5) and suppose that its
tangent plane at this point is given by z = −7 + 4(x + 2) − 3(y − 5).

a. Determine the values of g(−2, 5), gx (−2, 5), and gy (−2, 5). Write one
sentence to explain your thinking.

b. Estimate the value of g(−1.8, 4.7). Clearly show your work and thinking.

c. Given changes of dx = −0.34 and dy = 0.21, estimate the corresponding
change in g that is given by its differential, dg.

d. Suppose that another function h is also differentiable at (−2, 5), but that
its tangent plane at (−2, 5) is given by 3x + 2y − 4z = 9. Determine the
values of h(−2, 5), hx (−2, 5), and hy (−2, 5), and then estimate the value
of h(−1.8, 4.7). Clearly show your work and thinking.

13. In the following questions, we determine and apply the linearization for
several different functions.

a. Find the linearization L(x, y) for the function f defined by f(x, y) =
cos(x)(2e^{2y} + e^{−2y}) at the point (x_0, y_0) = (0, 0). Hence use the
linearization to estimate the value of f(0.1, 0.2). Compare your estimate to
the actual value of f(0.1, 0.2).

b. The Heat Index, I, (measured in apparent degrees F) is a function of the
actual temperature T outside (in degrees F) and the relative humidity
H (measured as a percentage). A portion of the table which gives values
for this function, I = I(T, H), is provided in Table 10.4.13.

T ↓\H → 70 75 80 85
90 106 109 112 115
92 112 115 119 123
94 118 122 127 132
96 125 130 135 141

Table 10.4.13: Heat index.

Suppose you are given that IT (94, 75) = 3.75 and IH (94, 75) = 0.9. Use
this given information and one other value from the table to estimate
the value of I(93.1, 77) using the linearization at (94, 75). Using proper
terminology and notation, explain your work and thinking.

c. Just as we can find a local linearization for a differentiable function of
two variables, we can do so for functions of three or more variables.
By extending the concept of the local linearization from two to three
variables, find the linearization of the function h(x, y, z) = e^{2x}(y + z^2) at
the point (x_0, y_0, z_0) = (0, 1, −2). Then, use the linearization to estimate
the value of h(−0.1, 0.9, −1.8).

14. In the following questions, we investigate two different applied settings
using the differential.

a. Let f represent the vertical displacement in centimeters from the rest
position of a string (like a guitar string) as a function of the distance
x in centimeters from the fixed left end of the string and y the time
in seconds after the string has been plucked. (An interesting video of
this can be seen at https://www.youtube.com/watch?v=TKF6nFzpHBUA.)
A simple model for f could be

f(x, y) = cos(x) sin(2y).

Use the differential to approximate how much more this vibrating string
is vertically displaced from its position at (a, b) = (π/4, π/3) if we decrease
a by 0.01 cm and increase the time by 0.1 seconds. Compare to the value
of f at the point (π/4 − 0.01, π/3 + 0.1).

b. Resistors used in electrical circuits have colored bands painted on them
to indicate the amount of resistance and the possible error in the
resistance. When three resistors, whose resistances are R_1, R_2, and R_3, are
connected in parallel, the total resistance R is given by

1/R = 1/R_1 + 1/R_2 + 1/R_3.

Suppose that the resistances are R_1 = 25Ω, R_2 = 40Ω, and R_3 = 50Ω.
Find the total resistance R. If you know each of R_1, R_2, and R_3 with a
possible error of 0.5%, estimate the maximum error in your calculation
of R.

15. In this exercise we explore the concept of differentiability of a function
of two variables in more detail. We will consider the function f defined by
f(x, y) = |x| + |y|.
a. Use appropriate technology to plot the graph of f on the domain [−1, 1]×
[−1, 1]. Based on the graph, do you think that f is locally linear at (0, 0)?
Explain your reasoning.
b. Show that both fx (0, 0) and fy (0, 0) exist. If f is locally linear at (0, 0),
what must be the equation of the tangent plane L to f at (0, 0)?
c. In general, if g = g(x, y) is a function of two variables, and g is differen-
tiable at a point (x0 , y0 ), let

L(x, y) = g(x0 , y0 ) + gx (x0 , y0 )(x − x0 ) + gy (x0 , y0 )(y − y0 )

be the linearization of g at (x_0, y_0). The error in approximating g(x, y)
by L(x, y) for points near (x_0, y_0) is given by

E(x, y) = g(x, y) − L(x, y).

It might be reasonable to think that if the error term goes to 0 as (x, y)
approaches (x_0, y_0), then g is locally linear at (x_0, y_0). Assume that
f(x, y) = |x| + |y| is differentiable at (0, 0) and use what must be its
linearization at the origin that you found in (b) and demonstrate that
the limit of E(x, y) for f is 0 at (0, 0). This shows that just because an
error term goes to 0 as (x, y) approaches (x_0, y_0), we cannot conclude
that a function is locally linear at (x_0, y_0).
d. To help understand the condition we need for differentiability (local lin-
earity) at a point, let us recall the absolute value function a of a single
variable, that is a(x) = |x|. The reason that a is not differentiable at 0 is
that to the right of 0 a is the linear function y = x and to the left of 0 a
is the linear function y = −x. In other words, the slope of a is different
on the two sides of 0. The slope is the change in a divided by the change
in x, or how far the values of a are from the origin in relation to how far
the values of x are from the origin. This is a relative error instead of an
actual error. We can apply the same idea for functions of two variables
and measure how well the error term approximates a function relative to
how far the point in question is from the base point.
For a differentiable function g of two variables x and y, we define the
relative error in approximating g(x, y) with L(x, y) as

\[ \frac{E(x_0 + h,\; y_0 + k)}{\sqrt{h^2 + k^2}}, \]

where h = x − x0 and k = y − y0 . Notice that √(h² + k²) measures how
far the point (x, y) is from (x0 , y0 ). It is this relative error that we want
to go to 0 in order for our function to be differentiable at (x0 , y0 ). We
can use this idea to more formally define differentiability of a function of
two variables.

Definition 10.4.14. A function f = f (x, y) is differentiable at a
point (x0 , y0 ) if there is a linear function L = L(x, y) = f (x0 , y0 ) +
m(x − x0 ) + n(y − y0 ) such that the relative error

\[ \frac{E(x_0 + h,\; y_0 + k)}{\sqrt{h^2 + k^2}} \]

has a limit of 0 at (h, k) = (0, 0), where E(x, y) = f (x, y) − L(x, y),
h = x − x0 , and k = y − y0 .

Show that for f (x, y) = |x| + |y|, the relative error at (x, y) = (0, 0) does
not have a limit at (h, k) = (0, 0), using L(x, y) as in part (b). So in this
case the relative error does show that f is not differentiable at the origin.
As we have seen, it is often difficult to verify a limit of a function at a
point, so this definition of differentiability can be hard to use.

e. We conclude this exercise by showing that the linear function in
Definition 10.4.14 is in fact our tangent plane. So assume that a function g
is differentiable at a point (x0 , y0 ) and that L = L(x, y) = g(x0 , y0 ) +
m(x − x0 ) + n(y − y0 ) satisfies the conditions of Definition 10.4.14. Show
that m = gx (x0 , y0 ) and n = gy (x0 , y0 ). (Hint: Calculate the limits of
the relative error along h = 0 and along k = 0.)
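
To see what Definition 10.4.14 asks for in a case where the function is known to be differentiable, the following minimal sketch evaluates the relative error numerically. The function g(x, y) = x² + y², the base point (1, 2), and the helper names are assumptions chosen for illustration; they do not come from the exercise.

```python
import math

def g(x, y):
    # Hypothetical smooth example (not from the text): g(x, y) = x^2 + y^2.
    return x**2 + y**2

def L(x, y, x0=1.0, y0=2.0):
    # Linearization of g at (x0, y0), using g_x = 2*x0 and g_y = 2*y0.
    return g(x0, y0) + 2*x0*(x - x0) + 2*y0*(y - y0)

def relative_error(h, k, x0=1.0, y0=2.0):
    # E(x0 + h, y0 + k) / sqrt(h^2 + k^2), with E = g - L.
    E = g(x0 + h, y0 + k) - L(x0 + h, y0 + k, x0, y0)
    return E / math.hypot(h, k)

for s in (1e-1, 1e-2, 1e-3):
    # Approach (h, k) = (0, 0) along three different directions.
    print(s, relative_error(s, 0), relative_error(0, s), relative_error(s, -s))
```

For this particular g, the error is E = h² + k², so the relative error equals √(h² + k²) and shrinks to 0 along every direction of approach, which is the behavior the definition requires.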

10.5 The Chain Rule

Motivating Questions

• What is the Chain Rule and how do we use it to find a derivative?

• How can we use a tree diagram to guide us in applying the Chain Rule?

In single-variable calculus, we encountered situations in which some quantity
z depends on y and, in turn, y depends on x. A change in x produces a
change in y, which consequently produces a change in z. Using the language
of differentials that we saw in the previous section, these changes are naturally
related by

\[ dz = \frac{dz}{dy}\,dy \quad\text{and}\quad dy = \frac{dy}{dx}\,dx. \]

In terms of instantaneous rates of change, we then have

\[ dz = \frac{dz}{dy}\,\frac{dy}{dx}\,dx = \frac{dz}{dx}\,dx \]

and thus

\[ \frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx}. \]
This most recent equation we call the Chain Rule.
In the case of a function f of two variables where z = f (x, y), it might be
that both x and y depend on another variable t. A change in t then produces
changes in both x and y, which then cause z to change. In this section we
will see how to find the change in z that is caused by a change in t, leading us
to multivariable versions of the Chain Rule involving both regular and partial
derivatives.

Figure 10.5.1: Left: Your position in the plane. Right: The corresponding
temperature.

Preview Activity 10.5.1. Suppose you are driving around in the xy-plane
in such a way that your position r(t) at time t is given by the function

\[ \mathbf{r}(t) = \langle x(t),\, y(t) \rangle = \langle 2 - t^2,\; t^3 + 1 \rangle. \]

The path taken is shown on the left of Figure 10.5.1.


Suppose, furthermore, that the temperature at a point in the plane is given
by

\[ T(x, y) = 10 - \frac{1}{2}x^2 - \frac{1}{5}y^2, \]

and note that the surface generated by T is shown on the right of Figure 10.5.1.
Therefore, as time passes, your position (x(t), y(t)) changes, and, as your po-
sition changes, the temperature T (x, y) also changes.

a. The position function r provides a parameterization x = x(t) and y =
y(t) of the position at time t. By substituting x(t) for x and y(t) for
y in the formula for T , we can write T = T (x(t), y(t)) as a function of
t. Make these substitutions to write T as a function of t and then use
the Chain Rule from single variable calculus to find dT/dt. (Do not do any
algebra to simplify the expression, either before or after taking the
derivative.)

b. Now we want to understand how the result from part (a) can be obtained
from T as a multivariable function. Recall from the previous section that
small changes in x and y produce a change in T that is approximated by

∆T ≈ Tx ∆x + Ty ∆y.

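The Chain Rule tells us about the instantaneous rate of change of T , and
this can be found as

\[ \lim_{\Delta t \to 0} \frac{\Delta T}{\Delta t} = \lim_{\Delta t \to 0} \frac{T_x\,\Delta x + T_y\,\Delta y}{\Delta t}. \tag{10.5.1} \]

Use equation (10.5.1) to explain why the instantaneous rate of change of
T that results from a change in t is

\[ \frac{dT}{dt} = \frac{\partial T}{\partial x}\frac{dx}{dt} + \frac{\partial T}{\partial y}\frac{dy}{dt}. \tag{10.5.2} \]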

c. Using the original formulas for T , x, and y in the problem statement,
calculate all of the derivatives in Equation (10.5.2) (with Tx and Ty in
terms of x and y, and x′ and y′ in terms of t), and hence write the
right-hand side of Equation (10.5.2) in terms of x, y, and t.

d. Compare the results of parts (a) and (c). Write a couple of sentences that
identify specifically how each term in (c) relates to a corresponding term
in (a). This connection between parts (a) and (c) provides a multivariable
version of the Chain Rule.

10.5.1 The Chain Rule


As Preview Activity 10.5.1 suggests, the following version of the Chain Rule
holds in general.

The Chain Rule.

Let z = f (x, y), where f is a differentiable function of the independent
variables x and y, and let x and y each be differentiable functions of an
independent variable t. Then

\[ \frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}. \tag{10.5.3} \]

It is important to note the differences among the derivatives in (10.5.3).
Since z is a function of the two variables x and y, the derivatives in the Chain
Rule for z with respect to x and y are partial derivatives. However, since
x = x(t) and y = y(t) are functions of the single variable t, their derivatives
are the standard derivatives of functions of one variable. When we compose z
with x(t) and y(t), we then have z as a function of the single variable t, making
the derivative of z with respect to t a standard derivative from single variable
calculus as well.
To understand why this Chain Rule works in general, suppose that some
quantity z depends on x and y so that

\[ dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy. \tag{10.5.4} \]

Next, suppose that x and y each depend on another quantity t, so that

\[ dx = \frac{dx}{dt}\,dt \quad\text{and}\quad dy = \frac{dy}{dt}\,dt. \tag{10.5.5} \]

Combining Equations (10.5.4) and (10.5.5), we find that

\[ dz = \frac{\partial z}{\partial x}\frac{dx}{dt}\,dt + \frac{\partial z}{\partial y}\frac{dy}{dt}\,dt = \frac{dz}{dt}\,dt, \]

which is the Chain Rule in this particular context, as expressed in
Equation (10.5.3).
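
As a numerical sanity check on Equation (10.5.3), the sketch below compares the Chain Rule value of dz/dt with a central-difference estimate of the derivative of the composite function. The choices z = x²y + sin(y), x(t) = 3t, y(t) = t², and all function names are assumptions made only for this illustration.

```python
import math

def f(x, y):
    # Hypothetical z = f(x, y); not taken from the text.
    return x**2 * y + math.sin(y)

def x_of(t):
    return 3*t

def y_of(t):
    return t**2

def dz_dt_chain(t):
    # Right-hand side of (10.5.3): z_x * dx/dt + z_y * dy/dt.
    x, y = x_of(t), y_of(t)
    dz_dx = 2*x*y                 # partial derivative of z with respect to x
    dz_dy = x**2 + math.cos(y)    # partial derivative of z with respect to y
    dx_dt = 3.0
    dy_dt = 2*t
    return dz_dx*dx_dt + dz_dy*dy_dt

def dz_dt_numeric(t, dt=1e-6):
    # Central difference applied to the composite t -> f(x(t), y(t)).
    z = lambda s: f(x_of(s), y_of(s))
    return (z(t + dt) - z(t - dt)) / (2*dt)

print(dz_dt_chain(1.0), dz_dt_numeric(1.0))
```

The two printed values should agree to several decimal places; the partial derivatives are evaluated at the point (x(t), y(t)) on the path, while dx/dt and dy/dt are ordinary derivatives in t.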

Activity 10.5.2. In the following questions, we apply the Chain Rule in sev-
eral different contexts.

a. Suppose that we have a function z defined by z(x, y) = x² + xy³. In
addition, suppose that x and y are restricted to points that move around
the plane by following a circle of radius 2 centered at the origin that is
parameterized by

x(t) = 2 cos(t), and y(t) = 2 sin(t).

i. Use the Chain Rule to find the resulting instantaneous rate of change
dz/dt.

ii. Substitute x(t) for x and y(t) for y in the rule for z to write z in
terms of t and calculate dz/dt directly. Compare to the result of part
(i).

b. Suppose that the temperature on a metal plate is given by the function
T with
T (x, y) = 100 − (x² + 4y²),
where the temperature is measured in degrees Fahrenheit and x and y
are each measured in feet.

i. Find Tx and Ty . What are the units on these partial derivatives?

ii. Suppose an ant is walking along the x-axis at the rate of 2 feet per
minute toward the origin. When the ant is at the point (2, 0), what
is the instantaneous rate of change in the temperature dT/dt that
the ant experiences? Include units on your response.

iii. Suppose instead that the ant walks along an ellipse with x = 6 cos(t)
and y = 3 sin(t), where t is measured in minutes. Find dT/dt at t =
π/6, t = π/4, and t = π/3. What does this seem to tell you about
the path along which the ant is walking?

c. Suppose that you are walking along a surface whose elevation is given
by a function f . Furthermore, suppose that if you consider how your
location corresponds to points in the xy-plane, you know that when you
pass the point (2, 1), your velocity vector is v = ⟨−1, 2⟩. If some contours
of f are as shown in Figure 10.5.2, estimate the rate of change df /dt when
you pass through (2, 1).

Figure 10.5.2: Some contours of f .

10.5.2 Tree Diagrams

Up to this point, we have applied the Chain Rule to situations where we have
a function z of variables x and y, with both x and y depending on another
single quantity t. We may apply the Chain Rule, however, when x and y each
depend on more than one quantity, or when z is a function of more than two
variables. It can be challenging to keep track of all the dependencies among
the variables, and thus a tree diagram can be a useful tool to organize our
work. For example, suppose that z depends on x and y, and x and y both
depend on t. We may represent these relationships using the tree diagram
shown at left in Figure 10.5.3. We place the dependent variable at the top of the
tree and connect it to the variables on which it depends one level below. We
then connect each of those variables to the variable on which each depends.

Figure 10.5.3: A tree diagram illustrating dependencies.

To represent the Chain Rule, we label every edge of the diagram with the
appropriate derivative or partial derivative, as seen at right in Figure 10.5.3.
To calculate an overall derivative according to the Chain Rule, we construct
the product of the derivatives along all paths connecting the variables and then
add all of these products. For example, the diagram at right in Figure 10.5.3
illustrates the Chain Rule

\[ \frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}. \]
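
The "multiply along each path, then add over all paths" recipe can be sketched in a few lines of code. In this illustration the tree is stored as a dictionary, and the edge labels are made-up numerical values of the derivatives at some point; the function name and the data layout are assumptions, not notation from the text.

```python
def derivative_along_tree(edges, top, bottom):
    """Sum, over every path from `top` down to `bottom`, the product of edge labels.

    `edges` maps a variable to a list of (child, derivative_value) pairs.
    """
    if top == bottom:
        return 1.0
    total = 0.0
    for child, value in edges.get(top, []):
        # Product along this edge times the contribution of everything below it.
        total += value * derivative_along_tree(edges, child, bottom)
    return total

# z depends on x and y; x and y each depend on t (values are illustrative only).
edges = {
    "z": [("x", 2.0), ("y", -1.0)],   # dz/dx = 2, dz/dy = -1
    "x": [("t", 3.0)],                # dx/dt = 3
    "y": [("t", 0.5)],                # dy/dt = 0.5
}
print(derivative_along_tree(edges, "z", "t"))  # 2*3 + (-1)*0.5 = 5.5
```

The same function handles deeper trees and trees with several bottom variables: a branch that never reaches the requested variable simply contributes 0 to the sum.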

Activity 10.5.3.

a. Figure 10.5.4 shows the tree diagram we construct when (a) z depends
on w, x, and y, (b) w, x, and y each depend on u and v, and (c) u and
v depend on t.

Figure 10.5.4: Three levels of dependencies

i. Label the edges with the appropriate derivatives.


ii. Use the Chain Rule to write dz/dt.

b. Suppose that z = x² − 2xy² and that

x = r cos(θ)
y = r sin(θ).

i. Construct a tree diagram representing the dependencies of z on x
and y and x and y on r and θ.

ii. Use the tree diagram to find ∂z/∂r.

iii. Now suppose that r = 3 and θ = π/6. Find the values of x and y
that correspond to these given values of r and θ, and then use the
Chain Rule to find the value of the partial derivative ∂z/∂θ at
(r, θ) = (3, π/6).

10.5.3 Summary

• The Chain Rule is a tool for differentiating a composite of functions. In
its simplest form, it says that if f (x, y) is a function of two variables and
x(t) and y(t) depend on t, then

\[ \frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}. \]

• A tree diagram can be used to represent the dependence of variables on
other variables. By following the links in the tree diagram, we can form
chains of partial derivatives or derivatives that can be combined to give
a desired partial derivative.

Exercises
1. Use the chain rule to find dz/dt, where

z = x²y + xy², x = 2 + t³, y = −3 − t²

First the pieces:
∂z/∂x =
∂z/∂y =
dx/dt =
dy/dt =
End result (in terms of just t):
dz/dt =

2. Use the chain rule to find ∂z/∂s and ∂z/∂t, where

z = e^(xy) tan y, x = 5s + 3t, y = 5s/(2t)

First the pieces:
∂z/∂x =          ∂z/∂y =
∂x/∂s =          ∂x/∂t =
∂y/∂s =          ∂y/∂t =
And putting it all together:
∂z/∂s = (∂z/∂x)(∂x/∂s) + (∂z/∂y)(∂y/∂s) and ∂z/∂t = (∂z/∂x)(∂x/∂t) + (∂z/∂y)(∂y/∂t)

3. Suppose w = x/y + y/z, where
x = e^(5t), y = 2 + sin(t), and z = 2 + cos(3t).

A) Use the chain rule to find dw/dt as a function of x, y, z, and t. Do not
rewrite x, y, and z in terms of t, and do not rewrite e^(5t) as x.
dw/dt =

Note: You may want to use exp() for the exponential function. Your answer
should be an expression in x, y, z, and t; e.g. “3x - 4y”
B) Use part A to evaluate dw/dt when t = 0.
4. If z = (x + y) e^y and x = u² + v² and y = u² − v², find the following
partial derivatives using the chain rule. Enter your answers as functions of u
and v.
∂z/∂u =
∂z/∂v =
5. If
z = sin(x² + y²),

x = u cos(v), y = u sin(v),
find ∂z/∂u and ∂z/∂v. The variables are restricted to domains on which the
functions are defined.
∂z/∂u =
∂z/∂v =
6. Let z = g(u, v, w) and u(r, s), v(r, s), w(r, s). How many terms are there
in the expression for ∂z/∂r?
terms
7. Let W (s, t) = F (u(s, t), v(s, t)) where
u(1, 0) = −2, us (1, 0) = 5, ut (1, 0) = 4
v(1, 0) = 3, vs (1, 0) = −8, vt (1, 0) = 9
Fu (−2, 3) = 5, Fv (−2, 3) = 8
Ws (1, 0) = Wt (1, 0) =
8. The radius of a right circular cone is increasing at a rate of 4 inches per
second and its height is decreasing at a rate of 3 inches per second. At what
rate is the volume of the cone changing when the radius is 10 inches and the
height is 20 inches?
cubic inches per second
9. In a simple electric circuit, Ohm’s law states that V = IR, where V is the
voltage in volts, I is the current in amperes, and R is the resistance in ohms.
Assume that, as the battery wears out, the voltage decreases at 0.03 volts per
second and, as the resistor heats up, the resistance is increasing at 0.01 ohms
per second. When the resistance is 300 ohms and the current is 0.01 amperes,
at what rate is the current changing?
amperes per second
10. Suppose z = x² sin y, x = 3s² + 3t², y = 6st.

A. Use the chain rule to find ∂z/∂s and ∂z/∂t as functions of x, y, s and t.
∂z/∂s =
∂z/∂t =
B. Find the numerical values of ∂z/∂s and ∂z/∂t when (s, t) = (2, −3).
∂z/∂s (2, −3) =
∂z/∂t (2, −3) =
11. Find the indicated derivative. In each case, state the version of the
Chain Rule that you are using.

a. df/dt, if f (x, y) = 2x²y, x = cos(t), and y = ln(t).
b. ∂f/∂w, if f (x, y) = 2x²y, x = w + z², and y = (2z + 1)/w.
c. ∂f/∂v, if f (x, y, z) = 2x²y + z³, x = u − v + 2w, y = w·2^v − u³, and z = u² − v.

12. Let z = u² − v² and suppose that

u = e^x cos(y)
v = e^x sin(y)

a. Find the values of u and v that correspond to x = 0 and y = 2π/3.

b. Use the Chain Rule to find the general partial derivatives ∂z/∂x and
∂z/∂y, and then determine both ∂z/∂x and ∂z/∂y at (x, y) = (0, 2π/3).

13. Suppose that T = x² + y² − 2z where

x = ρ sin(φ) cos(θ)
y = ρ sin(φ) sin(θ)
z = ρ cos(φ)

a. Construct a tree diagram representing the dependencies among the
variables.
b. Apply the chain rule to find the partial derivatives ∂T/∂ρ, ∂T/∂φ,
and ∂T/∂θ.

14. Suppose that the temperature on a metal plate is given by the function
T with
T (x, y) = 100 − (x² + 4y²),
where the temperature is measured in degrees Fahrenheit and x and y are each
measured in feet. Now suppose that an ant is walking on the metal plate in
such a way that it walks in a straight line from the point (1, 4) to the point
(5, 6).
a. Find parametric equations (x(t), y(t)) for the ant’s coordinates as it walks
the line from (1, 4) to (5, 6).
b. What can you say about dx/dt and dy/dt for every value of t?
c. Determine the instantaneous rate of change in temperature with respect
to t that the ant is experiencing at the moment it is halfway from (1, 4)
to (5, 6), using your parametric equations for x and y. Include units on
your answer.

15. There are several proposed formulas to approximate the surface area of
the human body. One model¹ uses the formula

A(h, w) = 0.0072 h^0.725 w^0.425,

where A is the surface area in square meters, h is the height in centimeters,
and w is the weight in kilograms.
Since a person’s height h and weight w change over time, h and w are
functions of time t. Let us think about what is happening to a child whose
height is 60 centimeters and weight is 9 kilograms. Suppose, furthermore, that
h is increasing at an instantaneous rate of 20 centimeters per year and w is
increasing at an instantaneous rate of 5 kg per year.
Determine the instantaneous rate at which the child’s surface area is
changing at this point in time.

¹ DuBois D, DuBois DF. A formula to estimate the approximate surface area if height
and weight be known. Arch Int Med 1916;17:863-71.
16. Let z = f (x, y) = 50 − (x + 1)² − (y + 3)² and z = h(x, y) = 24 − 2x − 6y.
Suppose a person is walking on the surface z = f (x, y) in such a way that
she walks the curve which is the intersection of f and h.

a. Show that x(t) = 4 cos(t) and y(t) = 4 sin(t) is a parameterization of
the “shadow” in the xy-plane of the curve that is the intersection of the
graphs of f and h.
b. Use the parameterization from part (a) to find the instantaneous rate at
which her height is changing with respect to time at the instant t = 2π/3.

17. The voltage V (in volts) across a circuit is given by Ohm’s Law: V = IR,
where I is the current (in amps) in the circuit and R is the resistance (in ohms).
Suppose we connect two resistors with resistances R1 and R2 in parallel as
shown in Figure 10.5.5. The total resistance R in the circuit is then given by

\[ \frac{1}{R} = \frac{1}{R_1} + \frac{1}{R_2}. \]

a. Assume that the current, I, and the resistances, R1 and R2 , are
changing over time, t. Use the Chain Rule to write a formula for dV/dt.

b. Suppose that, at some particular point in time, we measure the
current to be 3 amps and that the current is increasing at 1/10 amps
per second, while resistance R1 is 2 ohms and decreasing at the
rate of 0.2 ohms per second and R2 is 1 ohm and increasing at the
rate of 0.5 ohms per second. At what rate is the voltage changing
at this point in time?

Figure 10.5.5: Resistors in parallel.

10.6 Directional Derivatives and the Gradient

Motivating Questions
• The partial derivatives of a function f tell us the rate of change of f in
the direction of the coordinate axes. How can we measure the rate of
change of f in other directions?
• What is the gradient of a function and what does it tell us?

The partial derivatives of a function tell us the instantaneous rate at which
the function changes as we hold all but one independent variable constant and
allow the remaining independent variable to change. It is natural to wonder
how we can measure the rate at which a function changes in directions other
than parallel to a coordinate axis. In what follows, we investigate this question,
and see how the rate of change in any given direction is connected to the rates
of change given by the standard partial derivatives.
Preview Activity 10.6.1. Let’s consider the function f defined by

\[ f(x, y) = 30 - x^2 - \frac{1}{2}y^2, \]

and suppose that f measures the temperature, in degrees Celsius, at a given
point in the plane, where x and y are measured in feet. Assume that the
positive x-axis points due east, while the positive y-axis points due north. A
contour plot of f is shown in Figure 10.6.1.

Figure 10.6.1: A contour plot of f (x, y) = 30 − x² − (1/2)y².

a. Suppose that a person is walking due east, and thus parallel to the x-axis.
At what instantaneous rate is the temperature changing with respect to
x at the moment the walker passes the point (2, 1)? What are the units
on this rate of change?
b. Next, determine the instantaneous rate of change of temperature with
respect to distance at the point (2, 1) if the person is instead walking due
north. Again, include units on your result.

c. Now, rather than walking due east or due north, let’s suppose that the
person is walking with velocity given by the vector v = ⟨3, 4⟩, where time
is measured in seconds. Note that the person’s speed is thus |v| = 5 feet
per second. Find parametric equations for the person’s path; that is,
parameterize the line through (2, 1) using the direction vector v = ⟨3, 4⟩.
Let x(t) denote the x-coordinate of the line, and y(t) its y-coordinate.
Make sure your parameterization places the walker at the point (2, 1)
when t = 0.
d. With the parameterization in (c), we can now view the temperature f
as not only a function of x and y, but also of time, t. Hence, use the
chain rule to determine the value of df/dt at t = 0. What are the units on your
answer? What is the practical meaning of this result?

10.6.1 Directional Derivatives


Given a function z = f (x, y), the partial derivative fx (x0 , y0 ) measures the
instantaneous rate of change of f as only the x variable changes; likewise,
fy (x0 , y0 ) measures the rate of change of f at (x0 , y0 ) as only y changes. Note
particularly that fx (x0 , y0 ) is measured in “units of f per unit of change in x,”
and that the units on fy (x0 , y0 ) are similar.
In Preview Activity 10.6.1, we saw how we could measure the rate of change
of f in a situation where both x and y were changing; in that activity, however,
we found that this rate of change was measured in “units of f per unit of time.”
In a given unit of time, we may move more than one unit of distance. In fact,
in Preview Activity 10.6.1, in each unit increase in time we move a distance of
|v| = 5 feet. To generalize the notion of partial derivatives to any direction of
our choice, we instead want to have a rate of change whose units are “units of
f per unit of distance in the given direction.”
In this light, in order to formally define the derivative in a particular direc-
tion of motion, we want to represent the change in f for a given unit change in
the direction of motion. We can represent this unit change in direction with a
unit vector, say u = ⟨u1 , u2 ⟩. If we move a distance h in the direction of u from
a fixed point (x0 , y0 ), we then arrive at the new point (x0 + u1 h, y0 + u2 h). It
now follows that the slope of the secant line to the curve on the surface through
(x0 , y0 ) in the direction of u through the points (x0 , y0 ) and (x0 + u1 h, y0 + u2 h)
is

\[ m_{\text{sec}} = \frac{f(x_0 + u_1 h,\; y_0 + u_2 h) - f(x_0, y_0)}{h}. \tag{10.6.1} \]

To get the instantaneous rate of change of f in the direction u = ⟨u1 , u2 ⟩,
we must take the limit of the quantity in Equation (10.6.1) as h → 0. Doing
so results in the formal definition of the directional derivative.
Definition 10.6.2. Let f = f (x, y) be given. The derivative of f at the
point (x, y) in the direction of the unit vector u = ⟨u1 , u2 ⟩ is denoted
Du f (x, y) and is given by

\[ D_{\mathbf{u}} f(x, y) = \lim_{h \to 0} \frac{f(x + u_1 h,\; y + u_2 h) - f(x, y)}{h} \tag{10.6.2} \]

for those values of x and y for which the limit exists.
The quantity Du f (x, y) is called a directional derivative. When we evaluate
the directional derivative Du f (x, y) at a point (x0 , y0 ), the result Du f (x0 , y0 )
tells us the instantaneous rate at which f changes at (x0 , y0 ) per unit increase
in the direction of the vector u. In addition, the quantity Du f (x0 , y0 ) tells us
the slope of the line tangent to the surface in the direction of u at the point
(x0 , y0 , f (x0 , y0 )).
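
The limit in Equation (10.6.2) can be explored numerically by evaluating the difference quotient for shrinking values of h. The sketch below does this for an assumed example, f(x, y) = x²y at the point (1, 2) in the direction of the unit vector u = ⟨3/5, 4/5⟩; the function and helper names are illustrative choices, not part of the text.

```python
def f(x, y):
    # Hypothetical example function: f(x, y) = x^2 * y.
    return x**2 * y

def difference_quotient(f, x0, y0, u, h):
    # Slope of the secant line in the direction of the unit vector u, as in (10.6.1).
    u1, u2 = u
    return (f(x0 + u1*h, y0 + u2*h) - f(x0, y0)) / h

u = (3/5, 4/5)   # a unit vector, since (3/5)^2 + (4/5)^2 = 1
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(h, difference_quotient(f, 1.0, 2.0, u, h))
```

For this example the quotients settle toward 3.2 as h shrinks, which is the value of the directional derivative at (1, 2) in the direction of u.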

10.6.2 Computing the Directional Derivative


In a similar way to how we developed shortcut rules for standard derivatives
in single variable calculus, and for partial derivatives in multivariable calculus,
we can also find a way to evaluate directional derivatives without resorting to
the limit definition found in Equation (10.6.2). We do so using a very similar
approach to our work in Preview Activity 10.6.1.
Suppose we consider the situation where we are interested in the instantaneous
rate of change of f at a point (x0 , y0 ) in the direction u = ⟨u1 , u2 ⟩, where
u is a unit vector. The variables x and y are therefore changing according to
the parameterization

x = x0 + u1 t and y = y0 + u2 t.

Observe that dx/dt = u1 and dy/dt = u2 for all values of t. Since u is a unit
vector, it follows that a point moving along this line moves one unit of distance
per one unit of time; that is, each single unit of time corresponds to movement
of a single unit of distance in that direction. This observation allows us to
use the Chain Rule to calculate the directional derivative, which measures the
instantaneous rate of change of f with respect to change in the direction u.
In particular, by the Chain Rule, it follows that

\[
\begin{aligned}
D_{\mathbf{u}} f(x_0, y_0) &= f_x(x_0, y_0)\,\frac{dx}{dt}\bigg|_{(x_0, y_0)} + f_y(x_0, y_0)\,\frac{dy}{dt}\bigg|_{(x_0, y_0)} \\
&= f_x(x_0, y_0)\,u_1 + f_y(x_0, y_0)\,u_2.
\end{aligned}
\]

This now allows us to compute the directional derivative at an arbitrary
point according to the following formula.
Calculating a directional derivative.
Given a differentiable function f = f (x, y) and a unit vector u =
⟨u1 , u2 ⟩, we may compute Du f (x, y) by

\[ D_{\mathbf{u}} f(x, y) = f_x(x, y)\,u_1 + f_y(x, y)\,u_2. \tag{10.6.3} \]

Note well: To use Equation (10.6.3), we must have a unit vector u =
⟨u1 , u2 ⟩ in the direction of motion. In the event that we have a direction
prescribed by a non-unit vector, we must first scale the vector to have length
1.
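
The scaling step is easy to forget, so here is a minimal sketch that normalizes a direction vector before applying Equation (10.6.3). The example function f(x, y) = x² + y² (with partial derivatives 2x and 2y), the point, the direction, and the helper names are all assumptions made for illustration.

```python
import math

def unit_vector(v):
    # Scale v to have length 1; the zero vector has no direction.
    v1, v2 = v
    length = math.hypot(v1, v2)
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return (v1/length, v2/length)

def directional_derivative(fx, fy, point, v):
    # Equation (10.6.3), applied after normalizing the direction v.
    x, y = point
    u1, u2 = unit_vector(v)
    return fx(x, y)*u1 + fy(x, y)*u2

# Partial derivatives of the hypothetical f(x, y) = x^2 + y^2.
fx = lambda x, y: 2*x
fy = lambda x, y: 2*y

# Direction <3, 4> has length 5, so the unit vector is <3/5, 4/5>.
print(directional_derivative(fx, fy, (1.0, 2.0), (3.0, 4.0)))   # about 4.4
```

Using the raw vector ⟨3, 4⟩ in place of the unit vector would give 3(2) + 4(4) = 22, which is five times too large: it is a rate per unit of time at speed 5 rather than a rate per unit of distance.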
Activity 10.6.2. Let f (x, y) = 3xy − x²y³.
a. Determine fx (x, y) and fy (x, y).
b. Use Equation (10.6.3) to determine Di f (x, y) and Dj f (x, y). What fa-
miliar function is Di f ? What familiar function is Dj f ? (Recall that i is
the unit vector in the positive x-direction and j is the unit vector in the
positive y-direction.)
c. Use Equation (10.6.3) to find the derivative of f in the direction of the
vector v = ⟨2, 3⟩ at the point (1, −1). Remember that a unit direction
vector is needed.

10.6.3 The Gradient


Via the Chain Rule, we have seen that for a given function f = f (x, y), its
instantaneous rate of change in the direction of a unit vector u = ⟨u1 , u2 ⟩ is
given by

\[ D_{\mathbf{u}} f(x_0, y_0) = f_x(x_0, y_0)\,u_1 + f_y(x_0, y_0)\,u_2. \tag{10.6.4} \]

Recalling that the dot product of two vectors v = ⟨v1 , v2 ⟩ and u = ⟨u1 , u2 ⟩
is computed by

\[ \mathbf{v} \cdot \mathbf{u} = v_1 u_1 + v_2 u_2, \]

we see that we may recast Equation (10.6.4) in a way that has geometric
meaning. In particular, we see that Du f (x0 , y0 ) is the dot product of the
vector ⟨fx (x0 , y0 ), fy (x0 , y0 )⟩ and the vector u.
We call this vector formed by the partial derivatives of f the gradient of f
and denote it

\[ \nabla f(x_0, y_0) = \langle f_x(x_0, y_0),\; f_y(x_0, y_0) \rangle. \]

We read ∇f as “the gradient of f ,” “grad f ” or “del f ”.¹ Notice that ∇f
varies from point to point, and also provides an alternate formulation of the
directional derivative.
The directional derivative and the gradient.
Given a differentiable function f = f (x, y) and a unit vector u =
⟨u1 , u2 ⟩, we may compute Du f (x, y) by

\[ D_{\mathbf{u}} f(x, y) = \nabla f(x, y) \cdot \mathbf{u}. \tag{10.6.5} \]

In the following activity, we investigate some of what the gradient tells us
about the behavior of a function f .
Activity 10.6.3. Let’s consider the function f defined by f (x, y) = x² − y².
Some contours for this function are shown in Figure 10.6.3.

Figure 10.6.3: Contours of f (x, y) = x² − y².

a. Find the gradient ∇f (x, y).

b. For each of the following points (x0 , y0 ), evaluate the gradient ∇f (x0 , y0 )
and sketch the gradient vector with its tail at (x0 , y0 ). Some of the vectors
are too long to fit onto the plot, but we’d like to draw them to scale; to
do so, scale each vector by a factor of 1/4.

• (x0 , y0 ) = (2, 0)
• (x0 , y0 ) = (0, 2)
• (x0 , y0 ) = (2, 2)
• (x0 , y0 ) = (2, 1)
• (x0 , y0 ) = (−3, 2)
• (x0 , y0 ) = (−2, −4)
• (x0 , y0 ) = (0, 0)

c. What do you notice about the relationship between the gradient at
(x0 , y0 ) and the contour passing through that point?

d. Does f increase or decrease in the direction of ∇f (x0 , y0 )? Provide a
justification for your response.

¹ The symbol ∇ is called nabla, which comes from a Greek word for a certain type of harp
that has a similar shape.

As a vector, ∇f (x0 , y0 ) defines a direction and a length. As we will soon
see, both of these convey important information about the behavior of f near
(x0 , y0 ).

10.6.4 The Direction of the Gradient


Remember that the dot product also conveys information about the angle
between the two vectors. If θ is the angle between ∇f (x0 , y0 ) and u (where u
is a unit vector), then we also have that

Du f (x0 , y0 ) = ∇f (x0 , y0 ) · u = |∇f (x0 , y0 )||u| cos(θ).

In particular, when θ is a right angle, as shown on the left of Figure 10.6.4,
then Du f (x0 , y0 ) = 0, because cos(θ) = 0. Since the value of the directional
derivative is 0, this means that f is unchanging in this direction, and hence
u must be tangent to the contour of f that passes through (x0 , y0 ). In other
words, ∇f (x0 , y0 ) is orthogonal to the contour through (x0 , y0 ). This shows
that the gradient vector at a given point is always perpendicular to the con-
tour passing through the point, confirming that what we saw in part (c) of
Activity 10.6.3 holds in general.

Figure 10.6.4: The sign of Du f (x0 , y0 ) is determined by θ.

Moreover, when θ is an acute angle, it follows that cos(θ) > 0, so since

Du f (x0 , y0 ) = |∇f (x0 , y0 )||u| cos(θ),

we have Du f (x0 , y0 ) > 0, as shown in the middle image in Figure 10.6.4.
This means that f is increasing in any direction where θ is acute. In a similar
way, when θ is an obtuse angle, then cos(θ) < 0 so Du f (x0 , y0 ) < 0, as seen
on the right in Figure 10.6.4. This means that f is decreasing in any direction
for which θ is obtuse.
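
The three cases (right, acute, and obtuse angles) can be checked numerically. The sketch below uses an assumed example, f(x, y) = x² + 2y² with gradient ⟨2x, 4y⟩, and builds a unit vector u that makes a chosen angle θ with the gradient at the point (1, 1); the function and variable names are illustrative only.

```python
import math

def grad_f(x, y):
    # Gradient of the hypothetical f(x, y) = x^2 + 2y^2.
    return (2*x, 4*y)

def directional_derivative(point, angle):
    # u is the unit vector that makes the given angle (in radians)
    # with the gradient at `point`; the result is grad f . u.
    gx, gy = grad_f(*point)
    base = math.atan2(gy, gx)             # direction of the gradient itself
    u = (math.cos(base + angle), math.sin(base + angle))
    return gx*u[0] + gy*u[1]

point = (1.0, 1.0)
for degrees in (0, 45, 90, 135, 180):
    d = directional_derivative(point, math.radians(degrees))
    print(degrees, round(d, 6))
```

The printed values are positive for the acute angles, essentially zero at 90 degrees (where u is tangent to the contour through the point), and negative for the obtuse angles, matching the sign of cos(θ).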
Finally, as we can see in the following activity, we may also use the gradient
to determine the directions in which the function is increasing and decreasing
most rapidly.

Activity 10.6.4. In this activity we investigate how the gradient is related
to the directions of greatest increase and decrease of a function. Let f be a
differentiable function and u a unit vector.

a. Let θ be the angle between ∇f (x0 , y0 ) and u. Use the relationship be-
tween the dot product and the angle between two vectors to explain why

Du f (x0 , y0 ) = |⟨fx (x0 , y0 ), fy (x0 , y0 )⟩| cos(θ). (10.6.6)

b. At the point (x0 , y0 ), the only quantity in Equation (10.6.6) that can
change is θ (which determines the direction u of travel). Explain why
θ = 0 makes the quantity

|⟨fx (x0 , y0 ), fy (x0 , y0 )⟩| cos(θ)

as large as possible.

c. When θ = 0, in what direction does the unit vector u point relative to
∇f (x0 , y0 )? Why? What does this tell us about the direction of greatest
increase of f at the point (x0 , y0 )?

d. In what direction, relative to ∇f (x0 , y0 ), does f decrease most rapidly
at the point (x0 , y0 )?

e. State the unit vectors u and v (in terms of ∇f (x0 , y0 )) that provide
the directions of greatest increase and decrease for the function f at
the point (x0 , y0 ). What important assumption must we make regarding
∇f (x0 , y0 ) in order for these vectors to exist?

10.6.5 The Length of the Gradient


Having established in Activity 10.6.4 that the direction in which a function
increases most rapidly at a point (x0 , y0 ) is the unit vector u in the direction
of the gradient, (that is, u = |∇f (x10 ,y0 )| ∇f (x0 , y0 ), provided that ∇f (x0 , y0 ) 6=
0), it is also natural to ask, “in the direction of greatest increase for f at
(x0 , y0 ), what is the value of the rate of increase?” In this situation, we are
asking for the value of Du f (x0 , y0 ) where u = |∇f (x10 ,y0 )| ∇f (x0 , y0 ).
Using the now familiar way to compute the directional derivative, we see
that  
1
Du f (x0 , y0 ) = ∇f (x0 , y0 ) · ∇f (x0 , y0 ) .
|∇f (x0 , y0 )|
Next, we recall two important facts about the dot product: (i) $\mathbf{w} \cdot (c\mathbf{v}) = c(\mathbf{w} \cdot \mathbf{v})$
for any scalar c, and (ii) $\mathbf{w} \cdot \mathbf{w} = |\mathbf{w}|^2$. Applying these properties to
the most recent equation involving the directional derivative, we find that
\[ D_{\mathbf{u}} f(x_0, y_0) = \frac{1}{|\nabla f(x_0, y_0)|} \left( \nabla f(x_0, y_0) \cdot \nabla f(x_0, y_0) \right) = \frac{1}{|\nabla f(x_0, y_0)|} |\nabla f(x_0, y_0)|^2. \]

Finally, since ∇f (x0 , y0 ) is a nonzero vector, its length |∇f (x0 , y0 )| is a


nonzero scalar, and thus we can simplify the preceding equation to establish
that
Du f (x0 , y0 ) = |∇f (x0 , y0 )|.
We summarize our most recent work by stating two important facts about
the gradient.
Important facts about the gradient.
Let f be a differentiable function and (x0 , y0 ) a point for which
$\nabla f(x_0, y_0) \neq \mathbf{0}$. Then ∇f (x0 , y0 ) points in the direction of greatest
increase of f at (x0 , y0 ), and the instantaneous rate of change of
f in that direction is the length of the gradient vector. That is, if
$\mathbf{u} = \frac{1}{|\nabla f(x_0, y_0)|} \nabla f(x_0, y_0)$, then u is a unit vector in the direction of
greatest increase of f at (x0 , y0 ), and $D_{\mathbf{u}} f(x_0, y_0) = |\nabla f(x_0, y_0)|$.
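
These two facts can be checked numerically. The Python sketch below is an illustration only: the function f(x, y) = x^2 y and the point (1, 2) are hypothetical choices that do not come from the text. It sweeps unit vectors u = ⟨cos θ, sin θ⟩ around the circle and records the direction in which ∇f · u is largest.

```python
import math

def grad_f(x, y):
    # Gradient of the illustrative function f(x, y) = x**2 * y (an assumed example).
    return (2 * x * y, x ** 2)

def directional_derivative(grad, theta):
    # D_u f = grad . u for the unit vector u = <cos(theta), sin(theta)>.
    return grad[0] * math.cos(theta) + grad[1] * math.sin(theta)

gx, gy = grad_f(1.0, 2.0)          # gradient at the sample point (1, 2)
grad_length = math.hypot(gx, gy)   # |grad f(1, 2)|
grad_angle = math.atan2(gy, gx)    # angle of the gradient direction

# Sweep 3600 evenly spaced directions and keep the one with the largest rate.
n = 3600
thetas = [2 * math.pi * k / n for k in range(n)]
best_theta = max(thetas, key=lambda t: directional_derivative((gx, gy), t))
best_rate = directional_derivative((gx, gy), best_theta)

print("length of gradient:", grad_length, " largest rate found:", best_rate)
print("angle of gradient: ", grad_angle, " best angle found:  ", best_theta)
```

Up to the resolution of the sweep, the best angle should agree with the angle of the gradient, and the largest rate should agree with the length of the gradient.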

Activity 10.6.5. Consider the function f defined by f (x, y) = −x + 2xy − y.

a. Find the gradient ∇f (1, 2) and sketch it on Figure 10.6.5.

Figure 10.6.5: A plot for the gradient ∇f (1, 2).

b. Sketch the unit vector $\mathbf{z} = \left\langle -\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}} \right\rangle$ on Figure 10.6.5 with its tail at
(1, 2). Now find the directional derivative $D_{\mathbf{z}} f(1, 2)$.

c. What is the slope of the graph of f in the direction z? What does the
sign of the directional derivative tell you?

d. Consider the vector $\mathbf{v} = \langle 2, -1 \rangle$ and sketch v on Figure 10.6.5 with its
tail at (1, 2). Find a unit vector w pointing in the same direction as
v. Without computing $D_{\mathbf{w}} f(1, 2)$, what do you know about the sign of
this directional derivative? Now verify your observation by computing
$D_{\mathbf{w}} f(1, 2)$.

e. In which direction (that is, for what unit vector u) is Du f (1, 2) the
greatest? What is the slope of the graph in this direction?

f. Correspondingly, in which direction is Du f (1, 2) least? What is the slope
of the graph in this direction?

g. Sketch two unit vectors u for which Du f (1, 2) = 0 and then find compo-
nent representations of these vectors.

h. Suppose you are standing at the point (3, 3). In which direction should
you move to cause f to increase as rapidly as possible? At what rate
does f increase in this direction?

10.6.6 Applications
The gradient finds many natural applications. For example, situations often
arise — for instance, constructing a road through the mountains or planning
the flow of water across a landscape — where we are interested in knowing the
direction in which a function is increasing or decreasing most rapidly.
For example, consider a two-dimensional version of how a heat-seeking missile
might work. (This application is borrowed from the United States Air Force
Academy Department of Mathematical Sciences.) Suppose that the temperature
surrounding a fighter jet can be modeled by the function T defined by

\[ T(x, y) = \frac{100}{1 + (x - 5)^2 + 4(y - 2.5)^2}, \]

where (x, y) is a point in the plane of the fighter jet and T (x, y) is measured
in degrees Celsius. Some contours and gradients ∇T are shown on the left in
Figure 10.6.6.

Figure 10.6.6: Contours and gradient for T (x, y) and the missile’s path.

A heat-seeking missile will always travel in the direction in which the tem-
perature increases most rapidly; that is, it will always travel in the direction
of the gradient ∇T . If a missile is fired from the point (2, 4), then its path will
be that shown on the right in Figure 10.6.6.
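
A path of this kind can be approximated numerically. The sketch below is one simple way to do it, under stated assumptions: it takes short steps of fixed length in the direction of ∇T and stops once a step no longer raises the temperature. The step length 0.01, the step limit, and the function names are illustrative choices, not part of the original application.

```python
import math

def T(x, y):
    # Temperature model from the text.
    return 100.0 / (1.0 + (x - 5.0) ** 2 + 4.0 * (y - 2.5) ** 2)

def grad_T(x, y):
    # Partial derivatives of T, computed by hand with the chain rule.
    q = 1.0 + (x - 5.0) ** 2 + 4.0 * (y - 2.5) ** 2
    return (-200.0 * (x - 5.0) / q ** 2, -800.0 * (y - 2.5) / q ** 2)

def follow_gradient(x, y, step=0.01, max_steps=5000):
    # Move in short steps along the unit vector in the direction of grad T.
    # Stop when the gradient vanishes or a step fails to increase T.
    path = [(x, y)]
    for _ in range(max_steps):
        gx, gy = grad_T(x, y)
        length = math.hypot(gx, gy)
        if length == 0.0:
            break
        nx, ny = x + step * gx / length, y + step * gy / length
        if T(nx, ny) <= T(x, y):
            break
        x, y = nx, ny
        path.append((x, y))
    return path

path = follow_gradient(2.0, 4.0)   # the missile is fired from (2, 4)
end_x, end_y = path[-1]
print(len(path), "points; path ends near", (end_x, end_y), "where T =", T(end_x, end_y))
```

Because each step follows the gradient only approximately, the computed path is a polygonal approximation of the curve in the figure; a smaller step length gives a closer approximation.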
In the final activity of this section, we consider several questions related to
this context of a heat-seeking missile, and foreshadow some upcoming work in
Section 10.7.

Activity 10.6.6.

a. The temperature T (x, y) has its maximum value at the fighter jet’s loca-
tion. State the fighter jet’s location and explain how Figure 10.6.6 tells
you this.

b. Determine ∇T at the fighter jet’s location and give a justification for


your response.

c. Suppose that a different function f has a local maximum value at (x0 , y0 ).


Sketch the behavior of some possible contours near this point. What is
∇f (x0 , y0 )? (Hint: What is the direction of greatest increase in f at the
local maximum?)
d. Suppose that a function g has a local minimum value at (x0 , y0 ). Sketch
the behavior of some possible contours near this point. What is ∇g(x0 , y0 )?
e. If a function g has a local minimum at (x0 , y0 ), what is the direction of
greatest increase of g at (x0 , y0 )?

10.6.7 Summary

• The directional derivative of f at the point (x, y) in the direction of the
unit vector $\mathbf{u} = \langle u_1, u_2 \rangle$ is
\[ D_{\mathbf{u}} f(x, y) = \lim_{h \to 0} \frac{f(x + u_1 h, y + u_2 h) - f(x, y)}{h} \]
for those values of x and y for which the limit exists. In addition,
Du f (x, y) measures the slope of the graph of f when we move in the
direction u. Alternatively, Du f (x0 , y0 ) measures the instantaneous rate
of change of f in the direction u at (x0 , y0 ).
• The gradient of a function f = f (x, y) at a point (x0 , y0 ) is the vector

\[ \nabla f(x_0, y_0) = \langle f_x(x_0, y_0), f_y(x_0, y_0) \rangle. \]

• The directional derivative in the direction u may be computed by

Du f (x0 , y0 ) = ∇f (x0 , y0 ) · u.

• At any point where the gradient is nonzero, the gradient is orthogonal to
the contour through that point and points in the direction in which f
increases most rapidly; moreover, the slope of f in this direction equals
the length of the gradient |∇f (x0 , y0 )|. Similarly, the opposite of the
gradient points in the direction of greatest decrease, and the rate of
change in that direction is the negative of the length of the gradient.
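
The agreement between the limit definition and the gradient formula can be seen in a small numerical experiment. In the sketch below, the function f(x, y) = xy + y^2, the point (2, 1), and the unit vector u = ⟨3/5, 4/5⟩ are hypothetical choices made only for illustration. The code compares difference quotients for shrinking h with the value of ∇f · u.

```python
def f(x, y):
    # Illustrative function (an assumption, not taken from the text).
    return x * y + y ** 2

def grad_f(x, y):
    # Partial derivatives computed by hand: f_x = y and f_y = x + 2y.
    return (y, x + 2 * y)

def difference_quotient(x, y, u, h):
    # The quotient whose limit as h -> 0 defines D_u f(x, y).
    return (f(x + u[0] * h, y + u[1] * h) - f(x, y)) / h

point = (2.0, 1.0)
u = (3 / 5, 4 / 5)   # a unit vector, since (3/5)^2 + (4/5)^2 = 1

gx, gy = grad_f(*point)
dot_value = gx * u[0] + gy * u[1]   # the gradient formula for D_u f

for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(h, difference_quotient(point[0], point[1], u, h))
print("gradient formula:", dot_value)
```

As h shrinks, the printed difference quotients should approach the value given by the gradient formula.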

Exercises
1. Consider the function $f(x, y, z) = xy + yz^2 + xz^3$.
Find the gradient of f :
⟨ , , ⟩
Find the gradient of f at the point (4, -1, -1).
⟨ , , ⟩
Find the rate of change of the function f at the point (4, -1, -1) in the
direction $\mathbf{u} = \langle 1/\sqrt{51}, -5/\sqrt{51}, 5/\sqrt{51} \rangle$.
2. If $f(x, y) = 4x^2 - 4y^2$, find the value of the directional derivative at the
point (3, −2) in the direction given by the angle θ = 2π 1 .
3. Find the directional derivative of $f(x, y, z) = 2xy + z^2$ at the point
(2, −3, −5) in the direction of the maximum rate of change of f .
fu (2, −3, −5) = Du f (2, −3, −5) =
4. The temperature at any point in the plane is given by $T(x, y) = \frac{110}{x^2 + y^2 + 2}$.
(a) What shape are the level curves of T ?

ellipses
parabolas
lines
circles
hyperbolas
none of the above

(b) At what point on the plane is it hottest?


What is the maximum temperature?
(c) Find the direction of the greatest increase in temperature at the point
(2, 3).
What is the value of this maximum rate of change, that is, the maximum
value of the directional derivative at (2, 3)?
(d) Find the direction of the greatest decrease in temperature at the point
(2, 3).
What is the value of this most negative rate of change, that is, the minimum
value of the directional derivative at (2, 3)?
5. The temperature at a point (x, y, z) is given by $T(x, y, z) = 200e^{-x^2 - y^2/4 - z^2/9}$,
where T is measured in degrees Celsius and x, y, and z in meters. There are lots
of places to make silly errors in this problem; just try to keep track of what
needs to be a unit vector.
Find the rate of change of the temperature at the point (-1, -1, -2) in the
direction toward the point (-1, 5, 5).
In which direction (unit vector) does the temperature increase the fastest
at (-1, -1, -2)?
⟨ , , ⟩
What is the maximum rate of increase of T at (-1, -1, -2)?
6. If $f(x, y, z) = 3zy^2$, then the gradient at the point (5, 4, 2) is
∇f (5, 4, 2) =
7. The concentration of salt in a fluid at (x, y, z) is given by $F(x, y, z) = 2x^2 + 2y^4 + 4x^2 z^2$
mg/cm$^3$. You are at the point (1, 1, 1).
(a) In which direction should you move if you want the concentration to
increase the fastest?
direction:
(Give your answer as a vector.)
(b) You start to move in the direction you found in part (a) at a speed of
6 cm/sec. How fast is the concentration changing?
rate of change =
8. At a certain point on a heated metal plate, the greatest rate of tempera-
ture increase, 5 degrees Celsius per meter, is toward the northeast. If an object
at this point moves directly north, at what rate is the temperature increasing?

degrees Celsius per meter


9. Suppose that you are climbing a hill whose shape is given by
$z = 987 - 0.08x^2 - 0.07y^2$, and that you are at the point (80, 50, 300).
In which direction (unit vector) should you proceed initially in order to
reach the top of the hill fastest?
⟨ , ⟩
If you climb in that direction, at what angle above the horizontal will you
be climbing initially (radian measure)?

10. Are the following statements true or false?

(a) $f_{\vec{u}}(a, b)$ is parallel to $\vec{u}$.

(b) ∇f (a, b) is a vector in 3-dimensional space.

(c) $f_{\vec{u}}(a, b) = \|\nabla f(a, b)\|$.

(d) Suppose fx (a, b) and fy (a, b) both exist. Then there is always a direction
in which the rate of change of f at (a, b) is zero.

(e) If $\vec{u}$ is a unit vector, then $f_{\vec{u}}(a, b)$ is a vector.

(f) If f (x, y) has fx (a, b) = 0 and fy (a, b) = 0 at the point (a, b), then f is
constant everywhere.

(g) If $\vec{u}$ is perpendicular to ∇f (a, b), then $f_{\vec{u}}(a, b) = \langle 0, 0 \rangle$.

(h) The gradient vector ∇f (a, b) is tangent to the contour of f at (a, b).

11. Let $E(x, y) = \frac{100}{1 + (x - 5)^2 + 4(y - 2.5)^2}$ represent the elevation on a land mass
at location (x, y). Suppose that E, x, and y are all measured in meters.

a. Find Ex (x, y) and Ey (x, y).

b. Let u be a unit vector in the direction of $\langle -4, 3 \rangle$. Determine $D_{\mathbf{u}} E(3, 4)$.
What is the practical meaning of Du E(3, 4) and what are its units?

c. Find the direction of greatest increase in E at the point (3, 4).

d. Find the instantaneous rate of change of E in the direction of greatest


decrease at the point (3, 4). Include units on your answer.

e. At the point (3, 4), find a direction w in which the instantaneous rate of
change of E is 0.

12. Find all directions in which the directional derivative of $f(x, y) = ye^{-xy}$
is 1 at the point (0, 2).
13. Find, if possible, a function f such that
\[ \nabla f = \left\langle \sin(yz),\; xz\cos(yz) + 2y,\; xy\cos(yz) + \frac{5}{z} \right\rangle. \]

If not possible, explain why.


14. Let $f(x, y) = x^2 + 3y^2$.

a. Find ∇f (x, y) and ∇f (1, 2).

b. Find the direction of greatest increase in f at the point (1, 2). Explain.
A graph of the surface defined by f is shown at left in Figure 10.6.7.
Illustrate this direction on the surface.

c. A contour diagram of f is shown at right in Figure 10.6.7. Illustrate your


calculation from (b) on this contour diagram.

Figure 10.6.7: Left: Graph of $f(x, y) = x^2 + 3y^2$. Right: Contours.

d. Find a direction w for which the derivative of f in the direction of w is


zero.

15. The properties of the gradient that we have observed for functions of
two variables also hold for functions of more variables. In this problem, we
consider a situation where there are three independent variables. Suppose
that the temperature in a region of space is described by

\[ T(x, y, z) = 100e^{-x^2 - y^2 - z^2} \]

and that you are standing at the point (1, 2, −1).

a. Find the instantaneous rate of change of the temperature in the direction
of $\mathbf{v} = \langle 0, 1, 2 \rangle$ at the point (1, 2, −1). Remember that you should first
find a unit vector in the direction of v.

b. In what direction from the point (1, 2, −1) would you move to cause the
temperature to decrease as quickly as possible?

c. How fast does the temperature decrease in this direction?

d. Find a direction in which the temperature does not change at (1, 2, −1).

16. Figure 10.6.8 shows a plot of the gradient ∇f at several points for some
function f = f (x, y).

Figure 10.6.8: The gradient ∇f .

a. Consider each of the three indicated points, and draw, as best as you
can, the contour through that point.
b. Beginning at each point, draw a curve on which f is continually decreas-
ing.

10.7 Optimization

Motivating Questions

• How can we find the points at which f (x, y) has a local maximum or
minimum?

• How can we determine whether critical points of f (x, y) are local maxima
or minima?

• How can we find the absolute maximum and minimum of f (x, y) on a


closed and bounded domain?

We learn in single-variable calculus that the derivative is a useful tool for


finding the local maxima and minima of functions, and that these ideas may
often be employed in applied settings. In particular, if a function f , such
as the one shown in Figure 10.7.1, is everywhere differentiable, we know that
the tangent line is horizontal at any point where f has a local maximum or
minimum. This, of course, means that the derivative $f'$ is zero at any such
point. Hence, one way that we seek extreme values of a given function is to
first find where the derivative of the function is zero.

Figure 10.7.1: The graph of y = f (x).

In multivariable calculus, we are often similarly interested in finding the


greatest and/or least value(s) that a function may achieve. Moreover, there
are many applied settings in which a quantity of interest depends on several
different variables. In the following preview activity, we begin to see how
some key ideas in multivariable calculus can help us answer such questions
by thinking about the geometry of the surface generated by a function of two
variables.

Preview Activity 10.7.1. Let z = f (x, y) be a differentiable function, and


suppose that at the point (x0 , y0 ), f achieves a local maximum. That is,
the value of f (x0 , y0 ) is greater than the value of f (x, y) for all (x, y) nearby
(x0 , y0 ). You might find it helpful to sketch a rough picture of a possible
function f that has this property.

a. If we consider the trace given by holding y = y0 constant, then the single-


variable function defined by f (x, y0 ) must have a local maximum at x0 .
What does this say about the value of the partial derivative fx (x0 , y0 )?

b. In the same way, the trace given by holding x = x0 constant has a local
maximum at y = y0 . What does this say about the value of the partial
derivative fy (x0 , y0 )?

c. What may we now conclude about the gradient ∇f (x0 , y0 ) at the local
maximum? How is this consistent with the statement “f increases most
rapidly in the direction ∇f (x0 , y0 )?”

d. How will the tangent plane to the surface z = f (x, y) appear at the point
(x0 , y0 , f (x0 , y0 ))?

e. By first computing the partial derivatives, find any points at which
$f(x, y) = 2x - x^2 - (y + 2)^2$ may have a local maximum.

10.7.1 Extrema and Critical Points


One of the important applications of single-variable calculus is the use of deriva-
tives to identify local extremes of functions (that is, local maxima and local
minima). Using the tools we have developed so far, we can naturally extend
the concept of local maxima and minima to several-variable functions.

Definition 10.7.2. Let f be a function of two variables x and y.

• The function f has a local maximum at a point (x0 , y0 ) provided that


f (x, y) ≤ f (x0 , y0 ) for all points (x, y) near (x0 , y0 ). In this situation we
say that f (x0 , y0 ) is a local maximum value.

• The function f has a local minimum at a point (x0 , y0 ) provided that


f (x, y) ≥ f (x0 , y0 ) for all points (x, y) near (x0 , y0 ). In this situation we
say that f (x0 , y0 ) is a local minimum value.

• An absolute maximum point is a point (x0 , y0 ) for which f (x, y) ≤ f (x0 , y0 )


for all points (x, y) in the domain of f . The value of f at an absolute
maximum point is the maximum value of f .

• An absolute minimum point is a point (x0 , y0 ) for which f (x, y) ≥ f (x0 , y0 ) for all
points (x, y) in the domain of f . The value of f at an absolute minimum
point is the minimum value of f .

We use the term extremum point to refer to any point (x0 , y0 ) at which f
has a local maximum or minimum. In addition, the function value f (x0 , y0 ) at
an extremum is called an extremal value. Figure 10.7.3 illustrates the graphs
of two functions that have an absolute maximum and minimum, respectively,
at the origin (x0 , y0 ) = (0, 0).

Figure 10.7.3: An absolute maximum and an absolute minimum

In single-variable calculus, we saw that the extrema of a continuous function


f always occur at critical points, values of x where f fails to be differentiable
or where $f'(x) = 0$. Said differently, critical points provide the locations where
extrema of a function may appear. Our work in Preview Activity 10.7.1 sug-
gests that something similar happens with two-variable functions.
Suppose that a continuous function f has an extremum at (x0 , y0 ). In this
case, the trace f (x, y0 ) has an extremum at x0 , which means that x0 is a critical
value of f (x, y0 ). Therefore, either fx (x0 , y0 ) does not exist or fx (x0 , y0 ) = 0.
Similarly, either fy (x0 , y0 ) does not exist or fy (x0 , y0 ) = 0. This implies that
the extrema of a two-variable function occur at points that satisfy the following
definition.

Definition 10.7.4. A critical point (x0 , y0 ) of a function f = f (x, y) is a


point in the domain of f at which fx (x0 , y0 ) = 0 and fy (x0 , y0 ) = 0, or such
that one of fx (x0 , y0 ) or fy (x0 , y0 ) fails to exist.

We can therefore find critical points of a function f by computing partial


derivatives and identifying any values of (x, y) for which one of the partials
doesn’t exist or for which both partial derivatives are simultaneously zero. For
the latter, note that we have to solve the system of equations

fx (x, y) = 0
fy (x, y) = 0.
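
When both partial derivatives are linear in x and y, this system is a pair of linear equations and can be solved directly. The sketch below assumes a hypothetical function g(x, y) = x^2 + xy + y^2 − 3x, which is not one of the functions in the text; its partial derivatives are g_x = 2x + y − 3 and g_y = x + 2y. The helper name solve_2x2 is also an illustrative choice.

```python
def solve_2x2(a11, a12, b1, a21, a22, b2):
    # Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by Cramer's rule.
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the system does not have a unique solution")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - b1 * a21) / det
    return x, y

def g_x(x, y):
    # Partial derivative of the assumed function g with respect to x.
    return 2 * x + y - 3

def g_y(x, y):
    # Partial derivative of the assumed function g with respect to y.
    return x + 2 * y

# g_x = 0 and g_y = 0 rewritten as 2x + y = 3 and x + 2y = 0.
crit_x, crit_y = solve_2x2(2, 1, 3, 1, 2, 0)
print("critical point:", (crit_x, crit_y))
print("check:", g_x(crit_x, crit_y), g_y(crit_x, crit_y))
```

For functions whose partial derivatives are not linear, the system usually has to be solved by substitution, factoring, or a numerical method instead.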

Activity 10.7.2. Find the critical points of each of the following functions.
Then, using appropriate technology, plot the graphs of the surfaces near each
critical point and compare the graph to your work.

a. $f(x, y) = 2 + x^2 + y^2$

b. $f(x, y) = 2 + x^2 - y^2$

c. $f(x, y) = 2x - x^2 - \frac{1}{4} y^2$

d. f (x, y) = |x| + |y|

e. f (x, y) = 2xy − 4x + 2y − 3.

10.7.2 Classifying Critical Points: The Second Derivative Test
While the extrema of a continuous function f always occur at critical points, it
is important to note that not every critical point leads to an extremum. Recall,
for instance, $f(x) = x^3$ from single-variable calculus. We know that $x_0 = 0$ is
a critical point since $f'(x_0) = 0$, but $x_0 = 0$ is neither a local maximum nor a
local minimum of f .
A similar situation may arise in a multivariable setting. Consider the function
f defined by $f(x, y) = x^2 - y^2$ whose graph and contour plot are shown in
Figure 10.7.5. Because $\nabla f = \langle 2x, -2y \rangle$, we see that the origin (x0 , y0 ) = (0, 0)
is a critical point. However, this critical point is neither a local maximum nor a
local minimum; the origin is a local minimum on the trace defined by y = 0, while
the origin is a local maximum on the trace defined by x = 0. We call such
a critical point a saddle point due to the shape of the graph near the critical
point.

Figure 10.7.5: A saddle point.

As in single-variable calculus, we would like to have some sort of test to
help us identify whether a critical point is a local maximum, local minimum,
or neither.

Activity 10.7.3. Recall that the Second Derivative Test for single-variable
functions states that if $x_0$ is a critical point of a function f so that $f'(x_0) = 0$
and if $f''(x_0)$ exists, then

• if $f''(x_0) < 0$, $x_0$ is a local maximum,

• if $f''(x_0) > 0$, $x_0$ is a local minimum, and

• if $f''(x_0) = 0$, this test yields no information.

Our goal in this activity is to understand a similar test for classifying ex-
treme values of functions of two variables. Consider the following three func-
tions:

\[ f_1(x, y) = 4 - x^2 - y^2, \quad f_2(x, y) = x^2 + y^2, \quad f_3(x, y) = x^2 - y^2. \]



You should verify that each function has a critical point at the origin (0, 0).

Figure 10.7.6: Three surfaces.

a. The graphs of these three functions are shown in Figure 10.7.6, with
$z = 4 - x^2 - y^2$ at left, $z = x^2 + y^2$ in the middle, and $z = x^2 - y^2$ at
right. Use the graphs to decide if a function has a local maximum,
local minimum, saddle point, or none of the above at the origin.
b. There is no single second derivative of a function of two variables, so we
consider a quantity that combines the second order partial derivatives.
Let $D = f_{xx} f_{yy} - f_{xy}^2$. Calculate D at the origin for each of the functions
f1 , f2 , and f3 . What difference do you notice between the values of D
when a function has a maximum or minimum value at the origin versus
when a function has a saddle point at the origin?

c. Now consider the cases where D > 0. It is in these cases that a function
has a local maximum or minimum at a point. What is necessary in these
cases is to find a condition that will distinguish between a maximum and
a minimum. In the cases where D > 0 at the origin, evaluate fxx (0, 0).
What value does fxx (0, 0) have when f has a local maximum value at
the origin? When f has a local minimum value at the origin? Explain
why. (Hint: This should look very similar to the Second Derivative Test
for functions of a single variable.) What would happen if we considered
the values of fyy (0, 0) instead?
Activity 10.7.3 provides the basic ideas for the Second Derivative Test for
functions of two variables.

The Second Derivative Test.


Suppose (x0 , y0 ) is a critical point of the function f for which
fx (x0 , y0 ) = 0 and fy (x0 , y0 ) = 0. Let D be the quantity defined by

\[ D = f_{xx}(x_0, y_0) f_{yy}(x_0, y_0) - f_{xy}(x_0, y_0)^2. \]

• If D > 0 and fxx (x0 , y0 ) < 0, then f has a local maximum at


(x0 , y0 ).
• If D > 0 and fxx (x0 , y0 ) > 0, then f has a local minimum at
(x0 , y0 ).
• If D < 0, then f has a saddle point at (x0 , y0 ).

• If D = 0, then this test yields no information about what happens


at (x0 , y0 ).
The quantity D is called the discriminant of the function f at (x0 , y0 ).
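
The test translates directly into a short decision procedure. The sketch below is one possible way to write it; the function name classify_critical_point is an illustrative choice, and the caller is assumed to have already computed the second-order partial derivatives at a critical point. The first sample call uses the saddle f(x, y) = x^2 − y^2 discussed earlier in this section; the other two use hypothetical functions that are not from the text.

```python
def classify_critical_point(fxx, fyy, fxy):
    # Apply the Second Derivative Test using the second-order partial
    # derivatives evaluated at a critical point where f_x = f_y = 0.
    d = fxx * fyy - fxy ** 2   # the discriminant D
    if d > 0 and fxx < 0:
        return "local maximum"
    if d > 0 and fxx > 0:
        return "local minimum"
    if d < 0:
        return "saddle point"
    return "no information"    # D = 0: the test is inconclusive

# f(x, y) = x^2 - y^2 at (0, 0): f_xx = 2, f_yy = -2, f_xy = 0.
saddle_result = classify_critical_point(2, -2, 0)

# Assumed example g(x, y) = x^2 + xy + y^2 - 3x at its critical point (2, -1):
# g_xx = 2, g_yy = 2, g_xy = 1.
min_result = classify_critical_point(2, 2, 1)

# Assumed example h(x, y) = x^4 + y^4 at (0, 0): every second partial is 0.
flat_result = classify_critical_point(0, 0, 0)

print(saddle_result, min_result, flat_result, sep="; ")
```

Note that when D > 0 the product f_xx f_yy must be positive, so f_xx cannot be zero and the first two branches cover every case with D > 0.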

To properly understand the origin of the Second Derivative Test, we could


introduce a “second-order directional derivative.” If this second-order direc-
tional derivative were negative in every direction, for instance, we could guar-
antee that the critical point is a local maximum. A complete justification of
the Second Derivative Test requires key ideas from linear algebra that are be-
yond the scope of this course, so instead of presenting a detailed explanation,
we will accept this test as stated. In Activity 10.7.4, we apply the test to more
complicated examples.

Activity 10.7.4. Find the critical points of the following functions and use
the Second Derivative Test to classify the critical points.

a. $f(x, y) = 3x^3 + y^2 - 9x + 4y$

b. $f(x, y) = xy + \frac{2}{x} + \frac{4}{y}$

c. $f(x, y) = x^3 + y^3 - 3xy$.

As we learned in single-variable calculus, finding extremal values of func-


tions can be particularly useful in applied settings. For instance, we can often
use calculus to determine the least expensive way to construct something or to
find the most efficient route between two locations. The same possibility holds
in settings with two or more variables.

Activity 10.7.5. While the quantity of a product demanded by consumers is


often a function of the price of the product, the demand for a product may also
depend on the price of other products. For instance, the demand for blue jeans
at Old Navy may be affected not only by the price of the jeans themselves, but
also by the price of khakis.
Suppose we have two goods whose respective prices are p1 and p2 . The
demands for these goods, q1 and q2 , depend on the prices as

q1 = 150 − 2p1 − p2 (10.7.1)


q2 = 200 − p1 − 3p2 . (10.7.2)

The seller would like to set the prices p1 and p2 in order to maximize
revenue. We will assume that the seller meets the full demand for each product.
Thus, if we let R be the revenue obtained by selling q1 items of the first good

at price p1 per item and q2 items of the second good at price p2 per item, we
have
R = p1 q1 + p2 q2 .
We can then write the revenue as a function of just the two variables p1
and p2 by using Equations (10.7.1) and (10.7.2), giving us

\[ R(p_1, p_2) = p_1(150 - 2p_1 - p_2) + p_2(200 - p_1 - 3p_2) = 150p_1 + 200p_2 - 2p_1 p_2 - 2p_1^2 - 3p_2^2. \]

A graph of R as a function of p1 and p2 is shown in Figure 10.7.7.

Figure 10.7.7: A revenue function.

a. Find all critical points of the revenue function, R. (Hint: You should
obtain a system of two equations in two unknowns which can be solved
by elimination or substitution.)

b. Apply the Second Derivative Test to determine the type of any critical
point(s).

c. Where should the seller set the prices p1 and p2 to maximize the revenue?

10.7.3 Optimization on a Restricted Domain


The Second Derivative Test helps us classify critical points of a function, but it
does not tell us if the function actually has an absolute maximum or minimum
at each such point. For single-variable functions, the Extreme Value Theorem
told us that a continuous function on a closed interval [a, b] always has both
an absolute maximum and minimum on that interval, and that these absolute
extremes must occur at either an endpoint or at a critical point. Thus, to
find the absolute maximum and minimum, we determine the critical points
in the interval and then evaluate the function at these critical points and at
the endpoints of the interval. A similar approach works for functions of two
variables.
For functions of two variables, closed and bounded regions play the role
that closed intervals did for functions of a single variable. A closed region is a
region that contains its boundary (the unit disk $x^2 + y^2 \le 1$ is closed, while its
interior $x^2 + y^2 < 1$ is not, for example), while a bounded region is one that
does not stretch to infinity in any direction. Just as for functions of a single

variable, continuous functions of several variables that are defined on closed,


bounded regions must have absolute maxima and minima in those regions.
The Extreme Value Theorem.
Let f = f (x, y) be a continuous function on a closed and bounded
region R. Then f has an absolute maximum and an absolute minimum
in R.

The absolute extremes must occur at either a critical point in the interior
of R or at a boundary point of R. We therefore must test both possibilities,
as we demonstrate in the following example.
Example 10.7.8. Suppose the temperature T at each point on the circular
plate $x^2 + y^2 \le 1$ is given by
\[ T(x, y) = 2x^2 + y^2 - y. \]
The domain $R = \{(x, y) : x^2 + y^2 \le 1\}$ is a closed and bounded region, as
shown on the left of Figure 10.7.9, so the Extreme Value Theorem assures us
that T has an absolute maximum and minimum on the plate. The graph of
T over its domain R is shown on the right of Figure 10.7.9. We will find the
hottest and coldest points on the plate.

Figure 10.7.9: Domain of the temperature $T(x, y) = 2x^2 + y^2 - y$ and its graph.

If the absolute maximum or minimum occurs inside the disk, it will be at


a critical point so we begin by looking for critical points inside the disk. To do
this, notice that critical points are given by the conditions Tx = 4x = 0 and
Ty = 2y − 1 = 0. This means that there is one critical point of the function at
the point (x0 , y0 ) = (0, 1/2), which lies inside the disk.
We now find the hottest and coldest points on the boundary of the disk,
which is the circle of radius 1. As we have seen, the points on the unit circle
can be parametrized as
x(t) = cos(t), y(t) = sin(t),
where 0 ≤ t ≤ 2π. The temperature at a point on the circle is then described
by
\[ T(x(t), y(t)) = 2\cos^2(t) + \sin^2(t) - \sin(t). \]

To find the hottest and coldest point on the boundary, we look for the
critical points of this single-variable function on the interval 0 ≤ t ≤ 2π. We
have

\[ \frac{dT}{dt} = -4\cos(t)\sin(t) + 2\cos(t)\sin(t) - \cos(t) = -2\cos(t)\sin(t) - \cos(t) = \cos(t)(-2\sin(t) - 1) = 0. \]

This shows that we have critical points when cos(t) = 0 or sin(t) = −1/2.
This occurs when t = π/2, 3π/2, 7π/6, and 11π/6. Since we have x(t) = cos(t)
and y(t) = sin(t), the corresponding points are

• $(x, y) = (0, 1)$ when $t = \frac{\pi}{2}$,
• $(x, y) = (0, -1)$ when $t = \frac{3\pi}{2}$,
• $(x, y) = \left(-\frac{\sqrt{3}}{2}, -\frac{1}{2}\right)$ when $t = \frac{7\pi}{6}$,
• $(x, y) = \left(\frac{\sqrt{3}}{2}, -\frac{1}{2}\right)$ when $t = \frac{11\pi}{6}$.

These are the critical points of T on the boundary and so this collection of
points includes the hottest and coldest points on the boundary.
We now have a list of candidates for the hottest and coldest points: the
critical point in the interior of the disk and the critical points on the boundary.
We find the hottest and coldest points by evaluating the temperature at each
of these points, and find that

• $T\left(0, \frac{1}{2}\right) = -\frac{1}{4}$,
• $T(0, 1) = 0$,
• $T(0, -1) = 2$,
• $T\left(\frac{\sqrt{3}}{2}, -\frac{1}{2}\right) = \frac{9}{4}$,
• $T\left(-\frac{\sqrt{3}}{2}, -\frac{1}{2}\right) = \frac{9}{4}$.

So the maximum value of T on the disk $x^2 + y^2 \le 1$ is $\frac{9}{4}$, which occurs at
the two points $\left(\pm\frac{\sqrt{3}}{2}, -\frac{1}{2}\right)$ on the boundary, and the minimum value of T on
the disk is $-\frac{1}{4}$, which occurs at the critical point $\left(0, \frac{1}{2}\right)$ in the interior of R.




From this example, we see that we use the following procedure for deter-
mining the absolute maximum and absolute minimum of a function on a closed
and bounded domain.

1. Find all critical points of the function in the interior of the domain.

2. Find all the critical points of the function on the boundary of the domain.
Working on the boundary of the domain reduces this part of the problem
to one or more single-variable optimization problems. Note that there
may be endpoints on portions of the boundary that need to be considered.

3. Evaluate the function at each of the points found in Steps 1 and 2.

4. The maximum value of the function is the largest value obtained in Step
3, and the minimum value of the function is the smallest value obtained
in Step 3.
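
The outcome of Example 10.7.8 can be cross-checked numerically. The sketch below is not the method of the example: instead of solving dT/dt = 0 on the boundary, it simply samples the boundary circle at many values of t and compares those temperatures with the value at the interior critical point (0, 1/2). The number of sample points is an arbitrary choice.

```python
import math

def T(x, y):
    # Temperature on the plate from Example 10.7.8.
    return 2 * x ** 2 + y ** 2 - y

# Candidate from the interior: the critical point found in the example.
candidates = [(0.0, 0.5)]

# Candidates from the boundary: sample the unit circle at n values of t.
n = 3600
for k in range(n):
    t = 2 * math.pi * k / n
    candidates.append((math.cos(t), math.sin(t)))

# Compare the temperature at every candidate point.
hottest = max(candidates, key=lambda p: T(*p))
coldest = min(candidates, key=lambda p: T(*p))
t_max, t_min = T(*hottest), T(*coldest)

print("approximate maximum", t_max, "near", hottest)
print("approximate minimum", t_min, "at", coldest)
```

Sampling only approximates the boundary extremes, so the printed maximum should be read as an estimate to compare against the exact value obtained in the example.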

Activity 10.7.6. Let $f(x, y) = x^2 - 3y^2 - 4x + 6y$ with triangular domain R


whose vertices are at (0, 0), (4, 0), and (0, 4). The domain R and a graph of f
on the domain appear in Figure 10.7.10.

Figure 10.7.10: The domain of $f(x, y) = x^2 - 3y^2 - 4x + 6y$ and its graph.

a. Find all of the critical points of f in R.

b. Parameterize the horizontal leg of the triangular domain, and find the
critical points of f on that leg. (Hint: You may need to consider end-
points.)

c. Parameterize the vertical leg of the triangular domain, and find the crit-
ical points of f on that leg. (Hint: You may need to consider endpoints.)

d. Parameterize the hypotenuse of the triangular domain, and find the crit-
ical points of f on the hypotenuse. (Hint: You may need to consider
endpoints.)

e. Find the absolute maximum and absolute minimum value of f on R.

10.7.4 Summary

• To find the extrema of a function f = f (x, y), we first find the critical
points, which are points where one of the partials of f fails to exist, or
where fx = 0 and fy = 0.

• The Second Derivative Test helps determine whether a critical point is a


local maximum, local minimum, or saddle point.

• If f is defined on a closed and bounded domain, we find the absolute


maxima and minima by finding the critical points in the interior of the
domain, finding the critical points on the boundary, and testing the value
of f at both sets of critical points.

Exercises
1. The function
\[ k(x, y) = e^{-y^2} \cos(5x) \]

has a critical point at (0, 0).


What is the value of D at this critical point? D =
What type of critical point is it? (maximum, minimum, saddle point, point
with unknown behavior)
2. Consider the function

$$f(x, y) = (2x - x^2)(12y - y^2).$$

Find and classify all critical points of the function. If there are more blanks
than critical points, leave the remaining entries blank.
$f_x$ =
$f_y$ =
$f_{xx}$ =
$f_{xy}$ =
$f_{yy}$ =
There are several critical points to be listed. List them lexicographically,
that is, in ascending order by x-coordinates, and for equal x-coordinates in
ascending order by y-coordinates (e.g., (1, 1), (2, -1), (2, 3) is a correct order).
The critical point with the smallest x-coordinate is
( , ) Classification:
(local minimum, local maximum, saddle point, cannot be determined)
The critical point with the next smallest x-coordinate is
( , ) Classification:
(local minimum, local maximum, saddle point, cannot be determined)
The critical point with the next smallest x-coordinate is
( , ) Classification:
(local minimum, local maximum, saddle point, cannot be determined)
The critical point with the next smallest x-coordinate is
( , ) Classification:
(local minimum, local maximum, saddle point, cannot be determined)
The critical point with the next smallest x-coordinate is
( , ) Classification:
(local minimum, local maximum, saddle point, cannot be determined)
3. Suppose $f(x, y) = xy - ax - by$.
(A) How many local minimum points does $f$ have in $\mathbb{R}^2$? (The answer is
an integer.)
(B) How many local maximum points does $f$ have in $\mathbb{R}^2$?
(C) How many saddle points does $f$ have in $\mathbb{R}^2$?
4. Let f (x, y) = 2/x + 3/y + 4x + 5y in the region R where x, y > 0.
Explain why f must have a global minimum at some point in R (note that
R is unbounded—how does this influence your explanation?). Then find the
global minimum.
minimum =
5. Each of the following functions has at most one critical point. Graph a
few level curves and a few gradients and, on this basis alone, decide whether
the critical point is a local maximum, a local minimum, a saddle point, or that
there is no critical point.
For $f(x, y) = e^{-2x^2 - 3y^2}$, type of critical point: (local maximum,
local minimum, saddle point, no critical point)
For $f(x, y) = e^{2x^2 - 3y^2}$, type of critical point: (local maximum,
local minimum, saddle point, no critical point)
For $f(x, y) = 2x^2 + 3y^2 + 1$, type of critical point: (local maximum,
local minimum, saddle point, no critical point)
For $f(x, y) = 2x^2 + 3y + 1$, type of critical point: (local maximum,
local minimum, saddle point, no critical point)


6. Find the absolute minimum and absolute maximum of

f (x, y) = 11 − 3x + 7y

on the closed triangular region with vertices (0, 0), (7, 0) and (7, 9).
List the maximum/minimum values as well as the point(s) at which they
occur. Ignore unneeded answer blanks.
Minimum value:
Occurs at ( , ) and ( , )
Maximum value:
Occurs at ( , ) and ( , )
7. Find the maximum and minimum values of $f(x, y) = xy$ on the ellipse
$6x^2 + y^2 = 8$.
maximum value =
minimum value =
8. Find A and B so that $f(x, y) = x^2 + Ax + y^2 + B$ has a local minimum
at the point (4, 0), with z-coordinate 25.
A=
B=
9. The contours of a function f are shown in the figure below.

For each of the points shown, indicate whether you think it is a local
maximum, local minimum, saddle point, or none of these.
(a) Point P is (a local maximum, a local minimum, a saddle point, none of these)
(b) Point Q is (a local maximum, a local minimum, a saddle point, none of these)
(c) Point R is (a local maximum, a local minimum, a saddle point, none of these)
(d) Point S is (a local maximum, a local minimum, a saddle point, none of these)
10. Consider the three points (5, 4), (4, 3), and (9, 1).
(a) Suppose that at (5, 4), we know that $f_x = f_y = 0$ and $f_{xx} = 0$, $f_{yy} > 0$,
and $f_{xy} > 0$. What can we conclude about the behavior of this function near
the point (5, 4)? ((5, 4) is a local maximum, (5, 4) is a local minimum,
(5, 4) is a saddle point, (5, 4) is none of these)
(b) Suppose that at (4, 3), we know that $f_x = f_y = 0$ and $f_{xx} < 0$, $f_{yy} = 0$,
and $f_{xy} < 0$. What can we conclude about the behavior of this function near
the point (4, 3)? ((4, 3) is a local maximum, (4, 3) is a local minimum,
(4, 3) is a saddle point, (4, 3) is none of these)
(c) Suppose that at (9, 1), we know that $f_x = f_y = 0$ and $f_{xx} < 0$, $f_{yy} = 0$,
and $f_{xy} < 0$. What can we conclude about the behavior of this function near
the point (9, 1)? ((9, 1) is a local maximum, (9, 1) is a local minimum,
(9, 1) is a saddle point, (9, 1) is none of these)
Using this information, on a separate sheet of paper sketch a possible
contour diagram for $f$.
11. Find three positive real numbers whose sum is 5 and whose product is
a maximum.
Enter the three numbers separated by commas:
12. A closed rectangular box has volume 26 cm$^3$. What are the lengths of
the edges giving the minimum surface area?
lengths =
(Give the three lengths as a comma-separated list.)
13. An open rectangular box has volume 30 cm$^3$. What are the lengths of
the edges giving the minimum surface area?
lengths =
(Give the three lengths as a comma-separated list.)
14. What is the shortest distance from the surface $xy + 6x + z^2 = 36$ to the
origin?
distance =
15. Find the volume of the largest rectangular box with edges parallel to
the axes that can be inscribed in the ellipsoid
$$\frac{x^2}{25} + \frac{y^2}{1} + \frac{z^2}{16} = 1.$$
Hint: By symmetry, you can restrict your attention to the first octant
(where $x, y, z \ge 0$) and assume the volume has the form $V = 8xyz$, where
$(x, y, z)$ is the corner of the box in the first octant. You then need only
look for points in the first octant that achieve the maximum.
Maximum volume:
16. Design a rectangular milk carton box of width w, length l, and height h
which holds 510 cm$^3$ of milk. The sides of the box cost 3 cents/cm$^2$ and the top
and bottom cost 5 cents/cm$^2$. Find the dimensions of the box that minimize
the total cost of materials used.
dimensions =
(Enter your answer as a comma separated list of lengths.)
17. Respond to each of the following prompts to solve the given optimization
problem.
a. Let f (x, y) = sin(x) + cos(y). Determine the absolute maximum and
minimum values of f . At what points do these extreme values occur?
b. For a certain differentiable function F of two variables x and y, its partial
derivatives are
$$F_x(x, y) = x^2 - y - 4 \quad \text{and} \quad F_y(x, y) = -x + y - 2.$$
Find each of the critical points of F, and classify each as a local maximum,
local minimum, or a saddle point.
c. Determine all critical points of $T(x, y) = 48 + 3xy - x^2 y - xy^2$ and classify
each as a local maximum, local minimum, or saddle point.
d. Find and classify all critical points of
$g(x, y) = \frac{x^2}{2} + 3y^3 + 9y^2 - 3xy + 9y - 9x$.
e. Find and classify all critical points of $z = f(x, y) = y e^{-x^2 - 2y^2}$.
f. Determine the absolute maximum and absolute minimum of
$f(x, y) = 2 + 2x + 2y - x^2 - y^2$ on the triangular plate in the first quadrant
bounded by the lines $x = 0$, $y = 0$, and $y = 9 - x$.
g. Determine the absolute maximum and absolute minimum of
$f(x, y) = 2 + 2x + 2y - x^2 - y^2$ over the closed disk of points (x, y) such that
$(x - 1)^2 + (y - 1)^2 \le 1$.
h. Find the point on the plane z = 6 − 3x − 2y that lies closest to the origin.

18. If a continuous function f of a single variable has two critical numbers
$c_1$ and $c_2$ at which f has relative maximum values, then f must have another
critical number $c_3$, because "it is impossible to have two mountains without
some sort of valley in between. The other critical point can be a saddle point
(a pass between the mountains) or a local minimum (a true valley)." (From
Calculus in Vector Spaces by Lawrence J. Corwin and Robert H. Szczarb.)
Consider the function f defined by $f(x, y) = 4x^2 e^y - 2x^4 - e^{4y}$. (From Ira
Rosenholz in the Problems Section of the Mathematics Magazine, Vol. 60, No.
1, February 1987.) Show that f has exactly two critical points, and that f
has relative maximum values at each of these critical points. Explain how this
function f illustrates that it really is possible to have two mountains without
some sort of valley in between. Use appropriate technology to draw the surface
defined by f to see graphically how this happens.
19. If a continuous function f of a single variable has exactly one critical
number with a relative maximum at that critical point, then the value of f
at that critical point is an absolute maximum. In this exercise we see that
the same is not always true for functions of two variables. Let
$f(x, y) = 3xe^y - x^3 - e^{3y}$ (from "The 'Only Critical Point in Town' Test" by
Ira Rosenholz and Lowell Smylie in the Mathematics Magazine, Vol. 58, No. 3,
May 1985). Show that f has exactly one critical point, has a relative maximum
value at that critical point, but that f has no absolute maximum value. Use
appropriate technology to draw the surface defined by f to see graphically how
this happens.
20. A manufacturer wants to procure rectangular boxes to ship its product.
The boxes must contain 20 cubic feet of space. To be durable enough to ensure
the safety of the product, the material for the sides of the boxes will cost $0.10
per square foot, while the material for the top and bottom will cost $0.25 per
square foot. In this activity we will help the manufacturer determine the box
of minimal cost.
a. What quantities are constant in this problem? What are the variables
in this problem? Provide appropriate variable labels. What, if any,
restrictions are there on the variables?
b. Using your variables from (a), determine a formula for the total cost C
of a box.
c. Your formula in part (b) might be in terms of three variables. If so, find
a relationship between the variables, and then use this relationship to
write C as a function of only two independent variables.
d. Find the dimensions that minimize the cost of a box. Be sure to verify
that you have a minimum cost.

21. A rectangular box with length x, width y, and height z is being built.
The box is positioned so that one corner is stationed at the origin and the box
lies in the first octant where x, y, and z are all positive. There is an added
constraint on how the box is constructed: it must fit underneath the plane
with equation x + 2y + 3z = 6. In fact, we will assume that the corner of the
box “opposite” the origin must actually lie on this plane. The basic problem is
to find the maximum volume of the box.
a. Sketch the plane x + 2y + 3z = 6, as well as a picture of a potential box.
Label everything appropriately.
b. Explain how you can use the fact that one corner of the box lies on the
plane to write the volume of the box as a function of x and y only. Do
so, and clearly show the formula you find for V (x, y).
c. Find all critical points of V . (Note that when finding the critical points,
it is essential that you factor first to make the algebra easier.)
d. Without considering the current applied nature of the function V , classify
each critical point you found above as a local maximum, local minimum,
or saddle point of V .
e. Determine the maximum volume of the box, justifying your answer com-
pletely with an appropriate discussion of the critical points of the func-
tion.
f. Now suppose that we instead stipulated that, while the vertex of the box
opposite the origin still had to lie on the plane, we were only going to
permit the sides of the box, x and y, to have values in a specified range
(given below). That is, we now want to find the maximum value of V on
the closed, bounded region
$$\frac{1}{2} \le x \le 1, \quad 1 \le y \le 2.$$
Find the maximum volume of the box under this condition, justifying
your answer fully.

22. The airlines place restrictions on luggage that can be carried onto planes.
• A carry-on bag can weigh no more than 40 lbs.
• The length plus width plus height of a bag cannot exceed 45 inches.
• The bag must fit in an overhead bin.
Let x, y, and z be the length, width, and height (in inches) of a carry-on
bag. In this problem we find the dimensions of the bag of largest volume,
V = xyz, that satisfies the second restriction. Assume that we use all 45 inches
to get a maximum volume. (Note that this bag of maximum volume might not
satisfy the third restriction.)
a. Write the volume V = V (x, y) as a function of just the two variables x
and y.
b. Explain why the domain over which V is defined is the triangular region
R with vertices (0,0), (45,0), and (0,45).
c. Find the critical points, if any, of V in the interior of the region R.


d. Find the maximum value of V on the boundary of the region R, and then
determine the dimensions of a bag with maximum volume on the entire
region R. (Note that most carry-on bags sold today measure 22 by 14
by 9 inches with a volume of 2772 cubic inches, so that the bags will fit
into the overhead bins.)

23. According to The Song of Insects by G.W. Pierce (Harvard College Press,
1948) the sound of striped ground crickets chirping, in number of chirps per
second, is related to the temperature. So the number of chirps per second
could be a predictor of temperature. The data Pierce collected is shown in
Table 10.7.11, where x is the (average) number of chirps per second and y is
the temperature in degrees Fahrenheit. A scatterplot of the data would show
that, while the relationship between x and y is not exactly linear, it looks
to have a linear pattern. It could be that the relationship is really linear but
experimental error causes the data to be slightly inaccurate. Or perhaps the
data is not linear, but only approximately linear.

   x       y
  20.0    88.6
  16.0    71.6
  19.8    93.3
  18.4    84.3
  17.1    80.6
  15.5    75.2
  14.7    69.7
  17.1    82.0
  15.4    69.4
  16.2    83.3
  15.0    79.6
  17.2    82.6
  16.0    80.6
  17.0    83.5
  14.4    76.3

Table 10.7.11: Crickets chirping.

If we want to use the data to make predictions, then we need to fit a curve
of some kind to the data. Since the cricket data appears roughly linear, we will
fit a linear function f of the form $f(x) = mx + b$ to the data. We will do this in
such a way that we minimize the sum of the squares of the distances between
the y values of the data and the corresponding y values of the line defined
by f. This type of fit is called a least squares approximation. If the data is
represented by the points $(x_1, y_1)$, $(x_2, y_2)$, . . ., $(x_n, y_n)$, then the square of the
distance between $y_i$ and $f(x_i)$ is $(f(x_i) - y_i)^2 = (mx_i + b - y_i)^2$. So our goal
is to minimize the sum of these squares, or minimize the function S defined by
$$S(m, b) = \sum_{i=1}^{n} (mx_i + b - y_i)^2.$$

a. Calculate $S_m$ and $S_b$.
b. Solve the system $S_m(m, b) = 0$ and $S_b(m, b) = 0$ to show that the critical
point satisfies
$$m = \frac{n\left(\sum_{i=1}^{n} x_i y_i\right) - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n\left(\sum_{i=1}^{n} x_i^2\right) - \left(\sum_{i=1}^{n} x_i\right)^2}$$
$$b = \frac{\left(\sum_{i=1}^{n} y_i\right)\left(\sum_{i=1}^{n} x_i^2\right) - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} x_i y_i\right)}{n\left(\sum_{i=1}^{n} x_i^2\right) - \left(\sum_{i=1}^{n} x_i\right)^2}.$$
(Hint: Don't be daunted by these expressions; the system $S_m(m, b) = 0$
and $S_b(m, b) = 0$ is a system of two linear equations in the unknowns m
and b. It might be easier to let $r = \sum_{i=1}^{n} x_i^2$, $s = \sum_{i=1}^{n} x_i$, $t = \sum_{i=1}^{n} y_i$,
and $u = \sum_{i=1}^{n} x_i y_i$ and write your equations using these constants.)
c. Use the Second Derivative Test to explain why the critical point gives
a local minimum. Can you then explain why the critical point gives an
absolute minimum?
d. Use the formula from part (b) to find the values of m and b that give the
line of best fit in the least squares sense to the cricket data. Draw your
line on the scatter plot to convince yourself that you have a well-fitting
line.
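
The formulas in part (b) translate directly into code. The sketch below uses the constants r, s, t, and u from the hint; the function name `least_squares_line` and the three sample points are hypothetical, chosen so that the points lie exactly on the line y = 2x + 1 and the result is easy to check by hand.

```python
def least_squares_line(xs, ys):
    """Return (m, b) for the least squares line f(x) = m*x + b."""
    n = len(xs)
    r = sum(x * x for x in xs)               # sum of x_i^2
    s = sum(xs)                              # sum of x_i
    t = sum(ys)                              # sum of y_i
    u = sum(x * y for x, y in zip(xs, ys))   # sum of x_i * y_i
    denom = n * r - s * s
    m = (n * u - s * t) / denom
    b = (t * r - s * u) / denom
    return m, b

# Hypothetical data lying exactly on y = 2x + 1.
m, b = least_squares_line([0, 1, 2], [1, 3, 5])
print(m, b)   # 2.0 1.0
```

The same function can be applied to the two columns of a data table by passing them as lists. The denominator is zero only when all of the x values are equal, in which case no unique line of this form exists.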
10.8. CONSTRAINED OPTIMIZATION: LAGRANGE MULTIPLIERS 191

10.8 Constrained Optimization: Lagrange Multipliers

Motivating Questions
• What geometric condition enables us to optimize a function f = f (x, y)
subject to a constraint given by g(x, y) = k, where k is a constant?
• How can we exploit this geometric condition to find the extreme values
of a function subject to a constraint?

We previously considered how to find the extreme values of functions on
both unrestricted domains and on closed, bounded domains. Other types of
optimization problems involve maximizing or minimizing a quantity subject
to an external constraint. In these cases the extreme values frequently won’t
occur at the points where the gradient is zero, but rather at other points that
satisfy an important geometric condition. These problems are often called con-
strained optimization problems and can be solved with the method of Lagrange
Multipliers, which we study in this section.
Preview Activity 10.8.1. According to U.S. postal regulations, the girth
plus the length of a parcel sent by mail may not exceed 108 inches, where
by “girth” we mean the perimeter of the smallest end. Our goal is to find
the largest possible volume of a rectangular parcel with a square end that
can be sent by mail. (We solved this applied optimization problem in single
variable Active Calculus, so it may look familiar. We take a different approach
in this section, and this approach allows us to view most applied optimization
problems from single variable calculus as constrained optimization problems,
as well as provide us tools to solve a greater variety of optimization problems.)
If we let x be the length of the side of one square end of the package and y
the length of the package, then we want to maximize the volume $f(x, y) = x^2 y$
of the box subject to the constraint that the girth (4x) plus the length (y) is
as large as possible, or 4x + y = 108. The equation 4x + y = 108 is thus an
external constraint on the variables.

Figure 10.8.1: Contours of f and the constraint equation g(x, y) = 108.
a. The constraint equation involves the function g that is given by


g(x, y) = 4x + y.
Explain why the constraint is a contour of g, and is therefore a two-
dimensional curve.
b. Figure 10.8.1 shows the graph of the constraint equation g(x, y) = 108
along with a few contours of the volume function f . Since our goal is
to find the maximum value of f subject to the constraint g(x, y) = 108,
we want to find the point on our constraint curve that intersects the
contours of f at which f has its largest value.
i. Points A and B in Figure 10.8.1 lie on a contour of f and on the
constraint equation g(x, y) = 108. Explain why neither A nor B
provides a maximum value of f that satisfies the constraint.
ii. Points C and D in Figure 10.8.1 lie on a contour of f and on the
constraint equation g(x, y) = 108. Explain why neither C nor D
provides a maximum value of f that satisfies the constraint.
iii. Based on your responses to parts i. and ii., draw the contour of f
on which you believe f will achieve a maximum value subject to the
constraint g(x, y) = 108. Explain why you drew the contour you
did.
c. Recall that g(x, y) = 108 is a contour of the function g, and that the
gradient of a function is always orthogonal to its contours. With this in
mind, how should ∇f and ∇g be related at the optimal point? Explain.

10.8.1 Constrained Optimization and Lagrange Multipliers
In Preview Activity 10.8.1, we considered an optimization problem where there
is an external constraint on the variables, namely that the girth plus the length
of the package cannot exceed 108 inches. We saw that we can create a function
g from the constraint, specifically g(x, y) = 4x + y. The constraint equation is
then just a contour of g, g(x, y) = c, where c is a constant (in our case 108).
Figure 10.8.2 illustrates that the volume function f is maximized, subject to
the constraint g(x, y) = c, when the graph of g(x, y) = c is tangent to a contour
of f . Moreover, the value of f on this contour is the sought maximum value.

Figure 10.8.2: Contours of f and the constraint contour.
To find this point where the graph of the constraint is tangent to a contour
of f , recall that ∇f is perpendicular to the contours of f and ∇g is perpen-
dicular to the contour of g. At such a point, the vectors ∇g and ∇f are
parallel, and thus we need to determine the points where this occurs. Recall
that two vectors are parallel if one is a nonzero scalar multiple of the other, so
we therefore look for values of a parameter λ that make

∇f = λ∇g. (10.8.1)

The constant λ is called a Lagrange multiplier.


To find the values of λ that satisfy (10.8.1) for the volume function in
Preview Activity 10.8.1, we calculate both ∇f and ∇g. Observe that

$\nabla f = 2xy\,\mathbf{i} + x^2\,\mathbf{j}$ and $\nabla g = 4\mathbf{i} + \mathbf{j}$,

and thus we need a value of λ so that

$2xy\,\mathbf{i} + x^2\,\mathbf{j} = \lambda(4\mathbf{i} + \mathbf{j})$.

Equating components in the most recent equation and incorporating the
original constraint, we have three equations

$2xy = \lambda(4)$ (10.8.2)
$x^2 = \lambda(1)$ (10.8.3)
$4x + y = 108$ (10.8.4)

in the three unknowns x, y, and λ. First, note that if λ = 0, then Equation
(10.8.3) shows that x = 0. From this, Equation (10.8.4) tells us that y = 108.
So the point (0, 108) is a point we need to consider. Next, provided that λ ≠ 0
(from which it follows that x ≠ 0 by Equation (10.8.3)), we may divide both
sides of Equation (10.8.2) by the corresponding sides of (10.8.3) to eliminate
λ, and thus find that
$$\frac{2y}{x} = 4, \quad \text{so} \quad y = 2x.$$

Substituting into Equation (10.8.4) gives us

4x + 2x = 108

or
x = 18.
Thus we have $y = 2x = 36$ and $\lambda = x^2 = 324$ as another point to consider.
So the points at which the gradients of f and g are parallel, and thus at which
f may have a maximum or minimum subject to the constraint, are (0, 108) and
(18, 36). By evaluating the function f at these points, we see that we maximize
the volume when the length of the square end of the box is 18 inches and the
length is 36 inches, for a maximum volume of f (18, 36) = 11664 cubic inches.
Since f (0, 108) = 0, we obtain a minimum value at this point.
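
The computation above can be checked numerically. The following sketch verifies that ∇f = λ∇g and the constraint both hold at (18, 36), and then compares against a brute-force scan along the constraint line. The scan range 0 ≤ x ≤ 27 (where y = 108 - 4x stays nonnegative) and the step size of 0.01 are assumptions made for the illustration.

```python
def f(x, y):
    """Volume of the parcel with square end of side x and length y."""
    return x**2 * y

def grad_f(x, y):
    return (2 * x * y, x**2)

GRAD_G = (4, 1)  # gradient of g(x, y) = 4x + y

# Check grad f = lambda * grad g and the constraint at the point (18, 36).
x, y = 18, 36
lam = grad_f(x, y)[1] / GRAD_G[1]        # from x^2 = lambda(1)
parallel = grad_f(x, y) == (lam * GRAD_G[0], lam * GRAD_G[1])
on_constraint = 4 * x + y == 108

# Independent check: scan the constraint y = 108 - 4x for 0 <= x <= 27.
xs = [k / 100 for k in range(2701)]
best_x = max(xs, key=lambda t: f(t, 108 - 4 * t))

print(lam, parallel, on_constraint)   # 324.0 True True
print(best_x, f(x, y))                # 18.0 11664
```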
We summarize the process of Lagrange multipliers as follows.
The method of Lagrange multipliers.


The general technique for optimizing a function f = f (x, y) subject to a
constraint g(x, y) = c is to solve the system ∇f = λ∇g and g(x, y) = c
for x, y, and λ. We then evaluate the function f at each point (x, y)
that results from a solution to the system in order to find the optimum
values of f subject to the constraint.

Activity 10.8.2. A cylindrical soda can holds about 355 cc of liquid. In this
activity, we want to find the dimensions of such a can that will minimize the
surface area. For the sake of simplicity, assume the can is a perfect cylinder.
a. What are the variables in this problem? Based on the context, what
restriction(s), if any, are there on these variables?
b. What quantity do we want to optimize in this problem? What equation
describes the constraint? (You need to decide which of these functions
plays the role of f and which plays the role of g in our discussion of
Lagrange multipliers.)
c. Find λ and the values of your variables that satisfy Equation (10.8.1) in
the context of this problem.
d. Determine the dimensions of the pop can that give the desired solution
to this constrained optimization problem.
The method of Lagrange multipliers also works for functions of more than
two variables.
Activity 10.8.3. Use the method of Lagrange multipliers to find the dimen-
sions of the least expensive packing crate with a volume of 240 cubic feet when
the material for the top costs $2 per square foot, the bottom is $3 per square
foot and the sides are $1.50 per square foot.
To see why the method extends to functions of three variables, note that
if we have a function f = f (x, y, z) that we want to optimize
subject to a constraint g(x, y, z) = k, the optimal point (x, y, z) lies on the
level surface S defined by the constraint g(x, y, z) = k. As we did in Preview
Activity 10.8.1, we can argue that the optimal value occurs at the level surface
f (x, y, z) = c that is tangent to S. Thus, the gradients of f and g are parallel
at this optimal point. So, just as in the two variable case, we can optimize
f = f (x, y, z) subject to the constraint g(x, y, z) = k by finding all points
(x, y, z) that satisfy ∇f = λ∇g and g(x, y, z) = k.
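
A small three-variable illustration follows. The problem is hypothetical and not taken from the text: maximize f(x, y, z) = xyz subject to x + y + z = 3 with x, y, z > 0. Solving yz = λ, xz = λ, xy = λ together with the constraint by hand gives x = y = z = 1 and λ = 1; the sketch checks the gradient condition there and compares with a coarse grid search on the constraint plane.

```python
def f(x, y, z):
    return x * y * z

def grad_f(x, y, z):
    return (y * z, x * z, x * y)

GRAD_G = (1, 1, 1)   # gradient of g(x, y, z) = x + y + z

# Candidate found by hand from grad f = lambda * grad g and x + y + z = 3.
point = (1.0, 1.0, 1.0)
lam = 1.0
condition_holds = all(
    abs(df - lam * dg) < 1e-12 for df, dg in zip(grad_f(*point), GRAD_G)
)

# Brute-force comparison on a grid of points with x, y, z > 0 and x + y + z = 3.
n = 60
best = max(
    ((3 * i / n, 3 * j / n, 3 - 3 * i / n - 3 * j / n)
     for i in range(1, n) for j in range(1, n - i)),
    key=lambda p: f(*p),
)
print(condition_holds)   # True
print(best, f(*best))    # (1.0, 1.0, 1.0) 1.0
```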

10.8.2 Summary

• The extrema of a function f = f (x, y) subject to a constraint g(x, y) = c
occur at points for which the contour of f is tangent to the curve that
represents the constraint equation. This occurs when

∇f = λ∇g.

• We use the condition ∇f = λ∇g to generate a system of equations,
together with the constraint g(x, y) = c, that may be solved for x, y, and
λ. Once we have all the solutions, we evaluate f at each of the (x, y)
points to determine the extrema.
Exercises
1. Use Lagrange multipliers to find the maximum and minimum values of
$f(x, y) = 3x - 2y$ subject to the constraint $x^2 + y^2 = 13$, if such values exist.
maximum =
minimum =
(For either value, enter DNE if there is no such value.)
2. Use Lagrange multipliers to find the maximum and minimum values of
$f(x, y) = x^2 y + 3y^2 - y$, subject to the constraint $x^2 + y^2 \le 38.3333333333333$.
maximum =
minimum =
(For either value, enter DNE if there is no such value.)
3. Find the absolute maximum and minimum of the function
$f(x, y) = x^2 + y^2$ subject to the constraint $x^4 + y^4 = 10000$.
As usual, ignore unneeded answer blanks, and list points in lexicographic
order.
Absolute minimum value:
attained at ( , ), ( , ),
( , ), ( , ).
Absolute maximum value:
attained at ( , ), ( , ),
( , ), ( , ).
4. Find the absolute maximum and minimum of the function
$f(x, y) = x^2 - y^2$ subject to the constraint $x^2 + y^2 = 1$.
As usual, ignore unneeded answer blanks, and list points in lexicographic
order.
Absolute minimum value:
attained at ( , ) and ( , ).
Absolute maximum value:
attained at ( , ) and ( , ).
5. Find the minimum distance from the point (1, 1, 15) to the paraboloid
given by the equation $z = x^2 + y^2$.
Minimum distance =
Note: If you need to find roots of a polynomial of degree ≥ 3, you may
want to use a calculator or computer to do so numerically. Also be sure that
you can give a geometric justification for your answer.
6. For each value of λ the function $h(x, y) = x^2 + y^2 - \lambda(2x + 6y - 16)$ has
a minimum value m(λ).
(a) Find m(λ)
m(λ) =
(Use the letter L for λ in your expression.)
(b) For which value of λ is m(λ) the largest, and what is that maximum
value?
λ=
maximum m(λ) =
(c) Find the minimum value of $f(x, y) = x^2 + y^2$ subject to the constraint
$2x + 6y = 16$ using the method of Lagrange multipliers and evaluate λ.
minimum f =
λ=
(How are these results related to your result in part (b)?)
7. The plane $x + y + 2z = 8$ intersects the paraboloid $z = x^2 + y^2$ in an
ellipse. Find the points on this ellipse that are nearest to and farthest from
the origin.
Point farthest away occurs at
( , , ).
Point nearest occurs at
( , , ).
8. Find the maximum and minimum values of the function
$f(x, y, z) = x^2 y^2 z^2$ subject to the constraint $x^2 + y^2 + z^2 = 100$.
Maximum value is , occurring at
points (positive integer or "infinitely many").
Minimum value is , occurring at
points (positive integer or "infinitely many").
9. Find the maximum and minimum values of the function
$f(x, y, z, t) = x + y + z + t$ subject to the constraint $x^2 + y^2 + z^2 + t^2 = 144$.
Maximum value is , occurring at
points (positive integer or "infinitely many").
Minimum value is , occurring at
points (positive integer or "infinitely many").
10. Find the maximum and minimum volumes of a rectangular box whose
surface area equals 8000 square cm and whose edge length (sum of lengths of
all edges) is 480 cm.
Hint: It can be deduced that the box is not a cube, so if x, y, and z are
the lengths of the sides, you may want to let x represent a side with x ≠ y and
x ≠ z.
Maximum value is ,
occurring at ( , , ).
Minimum value is ,
occurring at ( , , ).
11. (a) If $\sum_{i=1}^{3} x_i = 5$, find the values of $x_1, x_2, x_3$ making $\sum_{i=1}^{3} x_i^2$
minimum.
$x_1, x_2, x_3$ =
(Give your values as a comma-separated list.)
(b) Generalize the result of part (a) to find the minimum value of $\sum_{i=1}^{n} x_i^2$
subject to $\sum_{i=1}^{n} x_i = 5$.
minimum value =
12. The Cobb-Douglas production function is used in economics to model
production levels based on labor and equipment. Suppose we have a specific
Cobb-Douglas function of the form

$f(x, y) = 50x^{0.4} y^{0.6}$,

where x is the dollar amount spent on labor and y the dollar amount spent on
equipment. Use the method of Lagrange multipliers to determine how much
should be spent on labor and how much on equipment to maximize productivity
if we have a total of $1.5 million to invest in labor and equipment.
13. Use the method of Lagrange multipliers to find the point on the line
x − 2y = 5 that is closest to the point (1, 3). To do so, respond to the following
prompts.

a. Write the function f = f (x, y) that measures the square of the distance
from (x, y) to (1, 3). (The extrema of this function are the same as the
extrema of the distance function, but f (x, y) is simpler to work with.)
b. What is the constraint g(x, y) = c?

c. Write the equations resulting from ∇f = λ∇g and the constraint. Find
all the points (x, y) satisfying these equations.

d. Test all the points you found to determine the extrema.

14. Apply the Method of Lagrange Multipliers to solve each of the following
constrained optimization problems.

a. Determine the absolute maximum and absolute minimum values of
$f(x, y) = (x - 1)^2 + (y - 2)^2$ subject to the constraint that $x^2 + y^2 = 16$.

b. Determine the points on the sphere $x^2 + y^2 + z^2 = 4$ that are closest
to and farthest from the point (3, 1, −1). (As in the preceding exercise,
you may find it simpler to work with the square of the distance formula,
rather than the distance formula itself.)

c. Find the absolute maximum and minimum of $f(x, y, z) = x^2 + y^2 + z^2$
subject to the constraint that $(x - 3)^2 + (y + 2)^2 + (z - 5)^2 \le 16$. (Hint:
here the constraint is a closed, bounded region. Use the boundary of that
region for applying Lagrange Multipliers, but don't forget to also test any
critical values of the function that lie in the interior of the region.)

15. In this exercise we consider how to apply the Method of Lagrange
Multipliers to optimize functions of three variables subject to two constraints.
Suppose we want to optimize $f = f(x, y, z)$ subject to the constraints $g(x, y, z) = c$
and $h(x, y, z) = k$. Also suppose that the two level surfaces $g(x, y, z) = c$ and
$h(x, y, z) = k$ intersect at a curve C. The optimum point $P = (x_0, y_0, z_0)$ will
then lie on C.

a. Assume that C can be represented parametrically by a vector-valued
function $\mathbf{r} = \mathbf{r}(t)$. Let $\overrightarrow{OP} = \mathbf{r}(t_0)$. Use the Chain Rule applied to
$f(\mathbf{r}(t))$, $g(\mathbf{r}(t))$, and $h(\mathbf{r}(t))$ to explain why

$\nabla f(x_0, y_0, z_0) \cdot \mathbf{r}'(t_0) = 0$,
$\nabla g(x_0, y_0, z_0) \cdot \mathbf{r}'(t_0) = 0$, and
$\nabla h(x_0, y_0, z_0) \cdot \mathbf{r}'(t_0) = 0$.

Explain how this shows that $\nabla f(x_0, y_0, z_0)$, $\nabla g(x_0, y_0, z_0)$, and $\nabla h(x_0, y_0, z_0)$
are all orthogonal to C at P. This shows that $\nabla f(x_0, y_0, z_0)$, $\nabla g(x_0, y_0, z_0)$,
and $\nabla h(x_0, y_0, z_0)$ all lie in the same plane.

b. Assuming that $\nabla g(x_0, y_0, z_0)$ and $\nabla h(x_0, y_0, z_0)$ are nonzero and not
parallel, explain why every point in the plane determined by $\nabla g(x_0, y_0, z_0)$
and $\nabla h(x_0, y_0, z_0)$ has the form $s\nabla g(x_0, y_0, z_0) + t\nabla h(x_0, y_0, z_0)$ for some
scalars s and t.

c. Parts (a) and (b) show that there must exist scalars λ and µ such that

$\nabla f(x_0, y_0, z_0) = \lambda \nabla g(x_0, y_0, z_0) + \mu \nabla h(x_0, y_0, z_0)$.

So to optimize $f = f(x, y, z)$ subject to the constraints $g(x, y, z) = c$ and
$h(x, y, z) = k$ we must solve the system of equations

$\nabla f(x, y, z) = \lambda \nabla g(x, y, z) + \mu \nabla h(x, y, z)$,
$g(x, y, z) = c$, and
$h(x, y, z) = k$

for x, y, z, λ, and µ.
Use this idea to find the maximum and minimum values of $f(x, y, z) = x + 2y$
subject to the constraints $y^2 + z^2 = 8$ and $x + y + z = 10$.

16. There is a useful interpretation of the Lagrange multiplier λ. Assume
that we want to optimize a function f with constraint g(x, y) = c. Recall
that an optimal solution occurs at a point (x0 , y0 ) where ∇f = λ∇g. As the
constraint changes, so does the point at which the optimal solution occurs. So
we can think of the optimal point as a function of the parameter c, that is
x0 = x0 (c) and y0 = y0 (c). The optimal value of f subject to the constraint
can then be considered as a function of c defined by f (x0 (c), y0 (c)). The Chain
Rule shows that
\[ \frac{df}{dc} = \frac{\partial f}{\partial x_0}\frac{dx_0}{dc} + \frac{\partial f}{\partial y_0}\frac{dy_0}{dc}. \]
a. Use the fact that ∇f = λ∇g at (x0 , y0 ) to explain why

\[ \frac{df}{dc} = \lambda \frac{dg}{dc}. \]

b. Use the fact that g(x, y) = c to show that

\[ \frac{df}{dc} = \lambda. \]
Conclude that λ tells us the rate of change of the function f as the
parameter c increases (or by approximately how much the optimal value
of the function f will change if we increase the value of c by 1 unit).

c. Suppose that λ = 324 at the point where the package described in Pre-
view Activity 10.8.1 has its maximum volume. Explain in context what
the value 324 tells us about the package.
d. Suppose that the maximum value of a function f = f (x, y) subject to
a constraint g(x, y) = 100 is 236. When using the method of Lagrange
multipliers and solving ∇f = λ∇g, we obtain a value of λ = 15 at this
maximum. Find an approximation to the maximum value of f subject
to the constraint g(x, y) = 98.
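
The interpretation of λ as a rate of change can be checked by hand on a small example. The sketch below is only an illustration: the function $f(x, y) = xy$ and the constraint $g(x, y) = x + y = c$ (with x, y > 0) are assumptions chosen for this note and are not taken from the exercise.

```latex
\begin{align*}
\nabla f = \lambda \nabla g
  &\implies \langle y, x \rangle = \lambda \langle 1, 1 \rangle
   \implies x = y = \lambda, \\
x + y = c
  &\implies x_0(c) = y_0(c) = \frac{c}{2}, \qquad \lambda = \frac{c}{2}, \\
f(x_0(c), y_0(c)) = \frac{c^2}{4}
  &\implies \frac{df}{dc} = \frac{c}{2} = \lambda .
\end{align*}
```

For instance, at c = 10 the optimal value is 25 with λ = 5, so increasing c by one unit should raise the optimal value by about 5; the actual optimal value at c = 11 is 30.25.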
Chapter 11

Multiple Integrals

11.1 Double Riemann Sums and Double Integrals over Rectangles

Motivating Questions

• What is a double Riemann sum?

• How is the double integral of a continuous function f = f (x, y) defined?

• What are two things the double integral of a function can tell us?

In single-variable calculus, recall that we approximated the area under the
graph of a positive function f on an interval [a, b] by adding areas of rectangles
whose heights are determined by the curve. The general process involved
subdividing the interval [a, b] into smaller subintervals, constructing rectangles
on each of these smaller intervals to approximate the region under the curve on
that subinterval, then summing the areas of these rectangles to approximate
the area under the curve. We will extend this process in this section to its
three-dimensional analogs, double Riemann sums and double integrals over
rectangles.

Preview Activity 11.1.1. In this activity we introduce the concept of a
double Riemann sum.

a. Review the concept of the Riemann sum from single-variable calculus.
Then, explain how we define the definite integral $\int_a^b f(x)\,dx$ of a continuous
function of a single variable x on an interval [a, b]. Include a sketch
of a continuous function on an interval [a, b] with appropriate labeling in
order to illustrate your definition.

b. In our upcoming study of integral calculus for multivariable functions,
we will first extend the idea of the single-variable definite integral to
functions of two variables over rectangular domains. To do so, we will
need to understand how to partition a rectangle into subrectangles. Let
R be the rectangular domain R = {(x, y) : 0 ≤ x ≤ 6, 2 ≤ y ≤ 4} (we can
also represent this domain with the notation [0, 6] × [2, 4]), as pictured
in Figure 11.1.1.


Figure 11.1.1: Rectangular domain R with subrectangles.

To form a partition of the full rectangular region, R, we will partition
both intervals [0, 6] and [2, 4]; in particular, we choose to partition the
interval [0, 6] into three uniformly sized subintervals and the interval
[2, 4] into two evenly sized subintervals as shown in Figure 11.1.1. In
the following questions, we discuss how to identify the endpoints of each
subinterval and the resulting subrectangles.

i. Let $0 = x_0 < x_1 < x_2 < x_3 = 6$ be the endpoints of the subintervals
of [0, 6] after partitioning. What is the length ∆x of each subinterval
$[x_{i-1}, x_i]$ for i from 1 to 3?
ii. Explicitly identify x0 , x1 , x2 , and x3 . On Figure 11.1.1 or your own
version of the diagram, label these endpoints.
iii. Let 2 = y0 < y1 < y2 = 4 be the endpoints of the subintervals of
[2, 4] after partitioning. What is the length ∆y of each subinterval
[yj−1 , yj ] for j from 1 to 2? Identify y0 , y1 , and y2 and label these
endpoints on Figure 11.1.1.
iv. Let Rij denote the subrectangle [xi−1 , xi ]×[yj−1 , yj ]. Appropriately
label each subrectangle in your drawing of Figure 11.1.1. How does
the total number of subrectangles depend on the partitions of the
intervals [0, 6] and [2, 4]?
v. What is the area ∆A of each subrectangle?

11.1.1 Double Riemann Sums over Rectangles


For the definite integral in single-variable calculus, we considered a continuous
function over a closed, bounded interval [a, b]. In multivariable calculus, we
will eventually develop the idea of a definite integral over a closed, bounded
region (such as the interior of a circle). We begin with a simpler situation by
thinking only about rectangular domains, and will address more complicated
domains in Section 11.3.
Let f = f (x, y) be a continuous function defined on a rectangular domain
R = {(x, y) : a ≤ x ≤ b, c ≤ y ≤ d}. As we saw in Preview Activity 11.1.1, the
domain is a rectangle R and we want to partition R into subrectangles. We do
this by partitioning each of the intervals [a, b] and [c, d] into subintervals and
using those subintervals to create a partition of R into subrectangles. In the
first activity, we address the quantities and notations we will use in order to
define double Riemann sums and double integrals.
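
The partitioning step described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the text: the helper name `partition_rectangle` and the sample rectangle [0, 1] × [0, 2] with m = 4 and n = 2 are assumptions made for this sketch.

```python
def partition_rectangle(a, b, c, d, m, n):
    """Split [a, b] x [c, d] into m*n equal subrectangles.

    Returns (xs, ys, rects), where rects[(i, j)] holds the opposite
    corners (x_{i-1}, y_{j-1}) and (x_i, y_j) of subrectangle R_ij.
    """
    dx = (b - a) / m  # length of each x-subinterval
    dy = (d - c) / n  # length of each y-subinterval
    xs = [a + i * dx for i in range(m + 1)]  # endpoints x_0, ..., x_m
    ys = [c + j * dy for j in range(n + 1)]  # endpoints y_0, ..., y_n
    rects = {
        (i, j): ((xs[i - 1], ys[j - 1]), (xs[i], ys[j]))
        for i in range(1, m + 1)
        for j in range(1, n + 1)
    }
    return xs, ys, rects


# Hypothetical example: [0, 1] x [0, 2] with m = 4 and n = 2.
xs, ys, rects = partition_rectangle(0, 1, 0, 2, 4, 2)
print(xs)             # endpoints of the x-subintervals
print(ys)             # endpoints of the y-subintervals
print(len(rects))     # number of subrectangles, m * n
print(rects[(1, 1)])  # opposite corners of R_11
```

Indexing the subrectangles by the pair (i, j) mirrors the notation $R_{ij}$ used in the activities that follow.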

Activity 11.1.2. Let $f(x, y) = 100 - x^2 - y^2$ be defined on the rectangular
domain R = [a, b] × [c, d]. Partition the interval [a, b] into four uniformly
sized subintervals and the interval [c, d] into three evenly sized subintervals as
shown in Figure 11.1.2. As we did in Preview Activity 11.1.1, we will need
a method for identifying the endpoints of each subinterval and the resulting
subrectangles.


Figure 11.1.2: Rectangular domain with subrectangles.

a. Let $a = x_0 < x_1 < x_2 < x_3 < x_4 = b$ be the endpoints of the subintervals
of [a, b] after partitioning. Label these endpoints in Figure 11.1.2.
b. What is the length ∆x of each subinterval [xi−1 , xi ]? Your answer should
be in terms of a and b.
c. Let c = y0 < y1 < y2 < y3 = d be the endpoints of the subintervals of
[c, d] after partitioning. Label these endpoints in Figure 11.1.2.
d. What is the length ∆y of each subinterval [yj−1 , yj ]? Your answer should
be in terms of c and d.
e. The partitions of the intervals [a, b] and [c, d] partition the rectangle R
into subrectangles. How many subrectangles are there?
f. Let Rij denote the subrectangle [xi−1 , xi ] × [yj−1 , yj ]. Label each sub-
rectangle in Figure 11.1.2.
g. What is the area ∆A of each subrectangle?
h. Now let [a, b] = [0, 8] and [c, d] = [2, 6]. Let $(x_{11}^*, y_{11}^*)$ be the point in the
upper right corner of the subrectangle $R_{11}$. Identify and correctly label
this point in Figure 11.1.2. Calculate the product
\[ f(x_{11}^*, y_{11}^*)\,\Delta A. \]
Explain, geometrically, what this product represents.
i. For each i and j, choose a point $(x_{ij}^*, y_{ij}^*)$ in the subrectangle $R_{ij}$. Identify
and correctly label these points in Figure 11.1.2. Explain what the
product
\[ f(x_{ij}^*, y_{ij}^*)\,\Delta A \]
represents.
j. If we were to add all the values $f(x_{ij}^*, y_{ij}^*)\,\Delta A$ for each i and j, what does
the resulting number approximate about the surface defined by f on the
domain R? (You don’t actually need to add these values.)
k. Write a double sum using summation notation that expresses the arbi-
trary sum from part (j).

11.1.2 Double Riemann Sums and Double Integrals


Now we use the process from the most recent activity to formally define double
Riemann sums and double integrals.

Definition 11.1.3. Let f be a continuous function on a rectangle R = {(x, y) :
a ≤ x ≤ b, c ≤ y ≤ d}. A double Riemann sum for f over R is created as
follows.

• Partition the interval [a, b] into m subintervals of equal length $\Delta x = \frac{b-a}{m}$.
Let $x_0, x_1, \ldots, x_m$ be the endpoints of these subintervals, where
$a = x_0 < x_1 < x_2 < \cdots < x_m = b$.

• Partition the interval [c, d] into n subintervals of equal length $\Delta y = \frac{d-c}{n}$.
Let $y_0, y_1, \ldots, y_n$ be the endpoints of these subintervals, where
$c = y_0 < y_1 < y_2 < \cdots < y_n = d$.

• These two partitions create a partition of the rectangle R into mn
subrectangles $R_{ij}$ with opposite vertices $(x_{i-1}, y_{j-1})$ and $(x_i, y_j)$ for i
between 1 and m and j between 1 and n. These rectangles all have equal
area $\Delta A = \Delta x \cdot \Delta y$.

• Choose a point $(x_{ij}^*, y_{ij}^*)$ in each rectangle $R_{ij}$. Then, a double Riemann
sum for f over R is given by

\[ \sum_{j=1}^{n} \sum_{i=1}^{m} f(x_{ij}^*, y_{ij}^*) \cdot \Delta A. \]
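
The four steps of the definition translate directly into a short computation. The following Python sketch is one possible rendering under stated assumptions: the function name `double_riemann_sum`, the `point` keyword, and the test function f(x, y) = xy on [0, 1] × [0, 1] are all hypothetical choices made for illustration.

```python
def double_riemann_sum(f, a, b, c, d, m, n, point="midpoint"):
    """Double Riemann sum of f over [a, b] x [c, d] with m*n subrectangles."""
    dx = (b - a) / m
    dy = (d - c) / n
    dA = dx * dy  # every subrectangle has the same area
    # Offsets (as fractions of dx and dy) that place the sample point
    # (x*_ij, y*_ij) inside the subrectangle R_ij.
    offsets = {
        "lower_left": (0.0, 0.0),
        "upper_right": (1.0, 1.0),
        "midpoint": (0.5, 0.5),
    }
    sx, sy = offsets[point]
    total = 0.0
    for j in range(1, n + 1):
        for i in range(1, m + 1):
            x_star = a + (i - 1 + sx) * dx
            y_star = c + (j - 1 + sy) * dy
            total += f(x_star, y_star) * dA
    return total


def f(x, y):
    return x * y  # illustrative function, not from the text


lower = double_riemann_sum(f, 0, 1, 0, 1, 4, 4, point="lower_left")
mid = double_riemann_sum(f, 0, 1, 0, 1, 4, 4, point="midpoint")
upper = double_riemann_sum(f, 0, 1, 0, 1, 4, 4, point="upper_right")
print(lower, mid, upper)
```

Because this f increases in both x and y on the unit square, the lower-left sum falls below the midpoint sum and the upper-right sum falls above it.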

If f (x, y) ≥ 0 on the rectangle R, we may ask to find the volume of the
solid bounded above by f over R, as illustrated on the left of Figure 11.1.4.
This volume is approximated by a Riemann sum, which sums the volumes of
the rectangular boxes shown on the right of Figure 11.1.4.


Figure 11.1.4: The volume under a graph approximated by a Riemann Sum.

As we let the number of subrectangles increase without bound (in other
words, as both m and n in a double Riemann sum go to infinity), as illustrated
in Figure 11.1.5, the sum of the volumes of the rectangular boxes approaches
the volume of the solid bounded above by f over R. The value of this limit,
provided it exists, is the double integral.

Figure 11.1.5: Finding better approximations by using smaller subrectangles.

Definition 11.1.6. Let R be a rectangular region in the xy-plane and f a
continuous function over R. With terms defined as in a double Riemann sum,
the double integral of f over R is

\[ \iint_R f(x, y)\,dA = \lim_{m,n\to\infty} \sum_{j=1}^{n} \sum_{i=1}^{m} f(x_{ij}^*, y_{ij}^*) \cdot \Delta A. \]
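
To see the limit in this definition at work numerically, the sketch below computes upper-right Riemann sums on finer and finer partitions. Everything in it is an assumption made for illustration: the helper `upper_right_sum`, the function $f(x, y) = x^2 + y$, and the rectangle [0, 1] × [0, 2], for which the double integral works out to 8/3.

```python
def upper_right_sum(f, a, b, c, d, m, n):
    """Riemann sum using the upper right corner of each subrectangle."""
    dx, dy = (b - a) / m, (d - c) / n
    return sum(
        f(a + i * dx, c + j * dy) * dx * dy
        for j in range(1, n + 1)
        for i in range(1, m + 1)
    )


def f(x, y):
    return x**2 + y  # illustrative function, not from the text


exact = 8 / 3  # value of the double integral of f over [0, 1] x [0, 2]
errors = []
for k in (2, 8, 32, 128):
    s = upper_right_sum(f, 0, 1, 0, 2, k, k)
    errors.append(abs(s - exact))
    print(k, s, abs(s - exact))
```

The printed errors shrink as m = n = k grows, which is the behavior the limit definition describes.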

Some textbooks use the notation $\int_R f(x, y)\,dA$ for a double integral. You
will see this in some of the WeBWorK problems.

11.1.3 Interpretation of Double Riemann Sums and Double Integrals

At the moment, there are two ways we can interpret the value of the double
integral.

• Suppose that f (x, y) assumes both positive and negative values on the
rectangle R, as shown on the left of Figure 11.1.7. When constructing
a Riemann sum, for each i and j, the product $f(x_{ij}^*, y_{ij}^*) \cdot \Delta A$ can be
interpreted as a “signed” volume of a box with base area ∆A and “signed”
height $f(x_{ij}^*, y_{ij}^*)$. Since f can have negative values, this “height” could
be negative. The sum

\[ \sum_{j=1}^{n} \sum_{i=1}^{m} f(x_{ij}^*, y_{ij}^*) \cdot \Delta A \]

can then be interpreted as a sum of “signed” volumes of boxes, with a
negative sign attached to those boxes whose heights are below the xy-
plane.

Figure 11.1.7: The integral measures signed volume.

We can then realize the double integral $\iint_R f(x, y)\,dA$ as a difference in
volumes: $\iint_R f(x, y)\,dA$ tells us the volume of the solids the graph of f
bounds above the xy-plane over the rectangle R minus the volume of the
solids the graph of f bounds below the xy-plane under the rectangle R.
This is shown on the right of Figure 11.1.7.
• The average of the finitely many mn values $f(x_{ij}^*, y_{ij}^*)$ that we take in a
double Riemann sum is given by

\[ \mathrm{Avg}_{mn} = \frac{1}{mn} \sum_{j=1}^{n} \sum_{i=1}^{m} f(x_{ij}^*, y_{ij}^*). \]

If we take the limit as m and n go to infinity, we obtain what we define
as the average value of f over the region R, which is connected to the
value of the double integral. First, to view $\mathrm{Avg}_{mn}$ as a double Riemann
sum, note that

\[ \Delta x = \frac{b-a}{m} \quad \text{and} \quad \Delta y = \frac{d-c}{n}. \]

Thus,

\[ \frac{1}{mn} = \frac{\Delta x \cdot \Delta y}{(b-a)(d-c)} = \frac{\Delta A}{A(R)}, \]

where A(R) denotes the area of the rectangle R. Then, the average value
of the function f over R, $f_{\mathrm{AVG}(R)}$, is given by

\begin{align*}
f_{\mathrm{AVG}(R)} &= \lim_{m,n\to\infty} \frac{1}{mn} \sum_{j=1}^{n} \sum_{i=1}^{m} f(x_{ij}^*, y_{ij}^*) \\
&= \lim_{m,n\to\infty} \frac{1}{A(R)} \sum_{j=1}^{n} \sum_{i=1}^{m} f(x_{ij}^*, y_{ij}^*) \cdot \Delta A \\
&= \frac{1}{A(R)} \iint_R f(x, y)\,dA.
\end{align*}

Therefore, the double integral of f over R divided by the area of R gives
us the average value of the function f on R. Finally, if f (x, y) ≥ 0 on
R, we can interpret this average value of f on R as the height of the
box with base R that has the same volume as the solid bounded above by
the surface defined by f over R.
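
Both interpretations can be checked on a small numerical example. The sketch below is illustrative only: the helper `midpoint_samples`, the function f(x, y) = x + y, and the rectangle [−2, 2] × [0, 1] (on which f takes both signs) are assumptions made for this note.

```python
def midpoint_samples(f, a, b, c, d, m, n):
    """Values of f at the midpoints of the m*n subrectangles, plus dA."""
    dx, dy = (b - a) / m, (d - c) / n
    values = [
        f(a + (i - 0.5) * dx, c + (j - 0.5) * dy)
        for j in range(1, n + 1)
        for i in range(1, m + 1)
    ]
    return values, dx * dy


def f(x, y):
    return x + y  # illustrative function that changes sign on R


a, b, c, d = -2, 2, 0, 1
values, dA = midpoint_samples(f, a, b, c, d, 40, 10)

# Interpretation 1: signed volume = (volume above) - (volume below).
riemann_sum = sum(v * dA for v in values)
above = sum(v * dA for v in values if v > 0)
below = -sum(v * dA for v in values if v < 0)

# Interpretation 2: the mean of the samples equals the sum divided by A(R).
area = (b - a) * (d - c)
avg_from_samples = sum(values) / len(values)
avg_from_sum = riemann_sum / area

print(riemann_sum, above - below)
print(avg_from_samples, avg_from_sum)
```

The two printed pairs agree up to rounding, reflecting the identity 1/(mn) = ∆A/A(R) used in the derivation above.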

Activity 11.1.3. Let f (x, y) = x + 2y and let R = [0, 2] × [1, 3].


a. Draw a picture of R. Partition [0, 2] into 2 subintervals of equal length
and the interval [1, 3] into two subintervals of equal length. Draw these
partitions on your picture of R and label the resulting subrectangles using
the labeling scheme we established in the definition of a double Riemann
sum.
b. For each i and j, let $(x_{ij}^*, y_{ij}^*)$ be the midpoint of the rectangle $R_{ij}$.
Identify the coordinates of each $(x_{ij}^*, y_{ij}^*)$. Draw these points on your
picture of R.
c. Calculate the Riemann sum

\[ \sum_{j=1}^{n} \sum_{i=1}^{m} f(x_{ij}^*, y_{ij}^*) \cdot \Delta A \]

using the partitions we have described. If we let $(x_{ij}^*, y_{ij}^*)$ be the midpoint
of the rectangle $R_{ij}$ for each i and j, then the resulting Riemann sum is
called a midpoint sum.
d. Give two interpretations for the meaning of the sum you just calculated.
Activity 11.1.4. Let $f(x, y) = \sqrt{4 - y^2}$ on the rectangular domain R =
[1, 7] × [−2, 2]. Partition [1, 7] into 3 equal length subintervals and [−2, 2] into
2 equal length subintervals. A table of values of f at some points in R is given
in Table 11.1.8, and a graph of f with the indicated partitions is shown in
Figure 11.1.9.

x\y    −2    −1     0     1     2
1       0    √3     2    √3     0
2       0    √3     2    √3     0
3       0    √3     2    √3     0
4       0    √3     2    √3     0
5       0    √3     2    √3     0
6       0    √3     2    √3     0
7       0    √3     2    √3     0

Table 11.1.8: Table of values of $f(x, y) = \sqrt{4 - y^2}$.

Figure 11.1.9: Graph of $f(x, y) = \sqrt{4 - y^2}$ on R.

a. Sketch the region R in the plane using the values in Table 11.1.8 as the
partitions.
b. Calculate the double Riemann sum using the given partition of R and
the values of f in the upper right corner of each subrectangle.
c. Use geometry to calculate the exact value of $\iint_R f(x, y)\,dA$ and compare
it to your approximation. Describe one way we could obtain a better
approximation using the given data.

We conclude this section with a list of properties of double integrals. Since
similar properties are satisfied by single-variable integrals and the arguments
for double integrals are essentially the same, we omit their justification.
Properties of Double Integrals.
Let f and g be continuous functions on a rectangle R = {(x, y) : a ≤
x ≤ b, c ≤ y ≤ d}, and let k be a constant. Then
1. $\iint_R (f(x, y) + g(x, y))\,dA = \iint_R f(x, y)\,dA + \iint_R g(x, y)\,dA$.

2. $\iint_R k f(x, y)\,dA = k \iint_R f(x, y)\,dA$.

3. If f (x, y) ≥ g(x, y) on R, then $\iint_R f(x, y)\,dA \ge \iint_R g(x, y)\,dA$.
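
One consequence of the third property is worth writing out, since it gives quick bounds on a double integral without any subdivision. In the sketch below, the constants L and U are assumed lower and upper bounds for f on R (names introduced here for illustration). It also uses the fact that every Riemann sum of a constant function k equals k · A(R), so that $\iint_R k\,dA = k \cdot A(R)$.

```latex
\begin{align*}
L \le f(x, y) \le U \text{ on } R
  &\implies \iint_R L \, dA \le \iint_R f(x, y)\, dA \le \iint_R U \, dA \\
  &\implies L \cdot A(R) \le \iint_R f(x, y)\, dA \le U \cdot A(R).
\end{align*}
```

Dividing through by A(R) shows that the average value of f on R lies between L and U.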

11.1.4 Summary

• Let f be a continuous function on a rectangle R = {(x, y) : a ≤ x ≤ b, c ≤
y ≤ d}. The double Riemann sum for f over R is created as follows.

◦ Partition the interval [a, b] into m subintervals of equal length
$\Delta x = \frac{b-a}{m}$. Let $x_0, x_1, \ldots, x_m$ be the endpoints of these subintervals,
where $a = x_0 < x_1 < x_2 < \cdots < x_m = b$.

◦ Partition the interval [c, d] into n subintervals of equal length
$\Delta y = \frac{d-c}{n}$. Let $y_0, y_1, \ldots, y_n$ be the endpoints of these subintervals, where
$c = y_0 < y_1 < y_2 < \cdots < y_n = d$.

◦ These two partitions create a partition of the rectangle R into mn
subrectangles $R_{ij}$ with opposite vertices $(x_{i-1}, y_{j-1})$ and $(x_i, y_j)$ for
i between 1 and m and j between 1 and n. These rectangles all have
equal area $\Delta A = \Delta x \cdot \Delta y$.

◦ Choose a point $(x_{ij}^*, y_{ij}^*)$ in each rectangle $R_{ij}$. Then a double Riemann
sum for f over R is given by

\[ \sum_{j=1}^{n} \sum_{i=1}^{m} f(x_{ij}^*, y_{ij}^*) \cdot \Delta A. \]

• With terms defined as in the Double Riemann Sum, the double integral
of f over R is

\[ \iint_R f(x, y)\,dA = \lim_{m,n\to\infty} \sum_{j=1}^{n} \sum_{i=1}^{m} f(x_{ij}^*, y_{ij}^*) \cdot \Delta A. \]

• Two interpretations of the double integral $\iint_R f(x, y)\,dA$ are:

◦ The volume of the solids the graph of f bounds above the xy-plane
over the rectangle R minus the volume of the solids the graph of f
bounds below the xy-plane under the rectangle R;
◦ Dividing the double integral of f over R by the area of R gives us
the average value of the function f on R. If f (x, y) ≥ 0 on R, we
can interpret this average value of f on R as the height of the box
with base R that has the same volume as the solid bounded above by
the surface defined by f over R.

Exercises

1. Suppose $f(x, y) = 25 - x^2 - y^2$ and R is the rectangle with vertices (0,0),
(6,0), (6,4), (0,4). In each part, estimate $\iint_R f(x, y)\,dA$ using Riemann sums.
R
For underestimates or overestimates, consistently use either the lower left-hand
corner or the upper right-hand corner of each rectangle in a subdivision, as
appropriate.
(a) Without subdividing R,
Underestimate =
Overestimate =
(b) By partitioning R into four equal-sized rectangles.
Underestimate =
Overestimate =

2. Consider the solid that lies above the square (in the xy-plane) R =
[0, 1] × [0, 1], and below the elliptic paraboloid $z = 25 - x^2 + xy - y^2$.
Estimate the volume by dividing R into 9 equal squares and choosing the
sample points to lie in the midpoints of each square.

3. Let R be the rectangle with vertices (0, 0), (2, 0), (2, 2), and (0, 2) and let
$f(x, y) = \sqrt{3xy}$.
(a) Find reasonable upper and lower bounds for $\int_R f\,dA$ without subdividing R.
upper bound =
lower bound =
(b) Estimate $\int_R f\,dA$ three ways: by partitioning R into four subrectangles
and evaluating f at its maximum and minimum values on each subrectangle,
and then by considering the average of these (over and under) estimates.
overestimate: $\int_R f\,dA \approx$
underestimate: $\int_R f\,dA \approx$
average: $\int_R f\,dA \approx$


4. Using Riemann sums with four subdivisions in each direction, find upper
and lower bounds for the volume under the graph of f (x, y) = 1 + 2xy above
the rectangle R with 0 ≤ x ≤ 2, 0 ≤ y ≤ 4.
upper bound =
lower bound =

5. Consider the solid that lies above the square (in the xy-plane) R =
[0, 2] × [0, 2],
and below the elliptic paraboloid $z = 49 - x^2 - y^2$.
(A) Estimate the volume by dividing R into 4 equal squares and choosing
the sample points to lie in the lower left hand corners.
(B) Estimate the volume by dividing R into 4 equal squares and choosing
the sample points to lie in the upper right hand corners.
(C) What is the average of the two answers from (A) and (B)?

6. The figure below shows contours of g(x, y) on the region R, with 5 ≤ x ≤
11 and 6 ≤ y ≤ 12.

Using ∆x = ∆y = 2, find an overestimate and an underestimate for
$\int_R g(x, y)\,dA$.
Overestimate =
Underestimate =
7. The figure below shows the distribution of temperature, in degrees C, in
a 5 meter by 5 meter heated room.

Using Riemann sums, estimate the average temperature in the room.


average temperature =
8. Values of f (x, y) are given in the table below. Let R be the rectangle
1 ≤ x ≤ 1.6, 2 ≤ y ≤ 3.2. Find a Riemann sum which is a reasonable estimate
for $\int_R f(x, y)\,dA$ with ∆x = 0.2 and ∆y = 0.4. Note that the values given in
the table correspond to midpoints.

y\x    1.1    1.3    1.5
2.2      6     −4      6
2.6      0      2      8
3.0     −5      2      5

$\int_R f(x, y)\,dA \approx$
9. Values of f (x, y) are shown in the table below.

           x = 3    x = 3.1    x = 3.2
y = 5        7         8         10
y = 5.2      8         9         12
y = 5.4      9        10          6

Let R be the rectangle 3 ≤ x ≤ 3.2, 5 ≤ y ≤ 5.4. Find the values of Riemann
sums which are reasonable over- and under-estimates for $\int_R f(x, y)\,dA$
with ∆x = 0.1 and ∆y = 0.2.
over-estimate:
under-estimate:
10. The temperature at any point on a metal plate in the xy plane is given
by $T(x, y) = 100 - 4x^2 - y^2$, where x and y are measured in inches and T in
degrees Celsius. Consider the portion of the plate that lies on the rectangular
region R = [1, 5] × [3, 6].

a. Estimate the value of $\iint_R T(x, y)\,dA$ by using a double Riemann sum
with two subintervals in each direction and choosing $(x_i^*, y_j^*)$ to be the
point that lies in the upper right corner of each subrectangle.

b. Determine the area of the rectangle R.

c. Estimate the average temperature, TAVG(R) , over the region R.

d. Do you think your estimate in (c) is an over- or under-estimate of the
true temperature? Why?

11. Let f be a function of independent variables x and y that is increasing
in both the positive x and y directions on a rectangular domain R. For each of
the following situations, determine if the double Riemann sum of f over R is
an overestimate or underestimate of the double integral $\iint_R f(x, y)\,dA$, or if it
is impossible to determine definitively. Provide justification for your responses.

a. The double Riemann sum of f over R where f is evaluated at the lower
left point of each subrectangle.

b. The double Riemann sum of f over R where f is evaluated at the upper
right point of each subrectangle.

c. The double Riemann sum of f over R where f is evaluated at the midpoint
of each subrectangle.

d. The double Riemann sum of f over R where f is evaluated at the lower
right point of each subrectangle.

12. The wind chill, as frequently reported, is a measure of how cold it
feels outside when the wind is blowing. In Table 11.1.10, the wind chill w =
w(v, T ), measured in degrees Fahrenheit, is a function of the wind speed v,
measured in miles per hour, and the ambient air temperature T , also measured
in degrees Fahrenheit. Approximate the average wind chill on the rectangle
[5, 35] × [−20, 20] using 3 subintervals in the v direction, 4 subintervals in the
T direction, and the point in the lower left corner in each subrectangle.

v\T −20 −15 −10 −5 0 5 10 15 20


5 −34 −28 −22 −16 −11 −5 1 7 13
10 −41 −35 −28 −22 −16 −10 −4 3 9
15 −45 −39 −32 −26 −19 −13 −7 0 6
20 −48 −42 −35 −29 −22 −15 −9 −2 4
25 −51 −44 −37 −31 −24 −17 −11 −4 3
30 −53 −46 −39 −33 −26 −19 −12 −5 1
35 −55 −48 −41 −34 −27 −21 −14 −7 0

Table 11.1.10: Wind chill as a function of wind speed and temperature.

13. Consider the box with a sloped top that is given by the following de-
scription: the base is the rectangle R = [0, 4] × [0, 3], while the top is given by
the plane z = p(x, y) = 20 − 2x − 3y.
a. Estimate the value of $\iint_R p(x, y)\,dA$ by using a double Riemann sum
with four subintervals in the x direction and three subintervals in the y
direction, and choosing $(x_i^*, y_j^*)$ to be the point that is the midpoint of
each subrectangle.
b. What important quantity does your double Riemann sum in (a) estimate?
c. Suppose it can be determined that $\iint_R p(x, y)\,dA = 138$. What is the
exact average value of p over R?


d. If you wanted to build a rectangular box (with the same base) that has
the same volume as the box with the sloped top described here, how tall
would the rectangular box have to be?

11.2 Iterated Integrals

Motivating Questions
• How do we evaluate a double integral over a rectangle as an iterated
integral, and why does this process work?

Recall that we defined the double integral of a continuous function f =
f (x, y) over a rectangle R = [a, b] × [c, d] as

\[ \iint_R f(x, y)\,dA = \lim_{m,n\to\infty} \sum_{j=1}^{n} \sum_{i=1}^{m} f(x_{ij}^*, y_{ij}^*) \cdot \Delta A, \]

where the different variables and notation are as described in Section 11.1.
Thus $\iint_R f(x, y)\,dA$ is a limit of double Riemann sums, but while this definition
tells us exactly what a double integral is, it is not very helpful for determining
the value of a double integral. Fortunately, there is a way to view a double
integral as an iterated integral, which will make computations feasible in many
cases.
The viewpoint of an iterated integral is closely connected to an important
idea from single-variable calculus. When we studied solids of revolution, such
as the one shown in Figure 11.2.1, we saw that in some circumstances we could
slice the solid perpendicular to an axis and have each slice be approximately
a circular disk. From there, we were able to find the volume of each disk, and
then use an integral to add the volumes of the slices. In what follows, we are
able to use single integrals to generalize this approach to handle even more
general geometric shapes.

Figure 11.2.1: A solid of revolution.

Preview Activity 11.2.1. Let $f(x, y) = 25 - x^2 - y^2$ on the rectangular
domain R = [−3, 3] × [−4, 4].
As with partial derivatives, we may treat one of the variables in f as con-
stant and think of the resulting function as a function of a single variable. Now
we investigate what happens if we integrate instead of differentiate.
a. Choose a fixed value of x in the interior of [−3, 3]. Let

\[ A(x) = \int_{-4}^{4} f(x, y)\,dy. \]

What is the geometric meaning of the value of A(x) relative to the surface
defined by f? (Hint: Think about the trace determined by the fixed
value of x, and consider how A(x) is related to the image at left in
Figure 11.2.2.)


Figure 11.2.2: Left: A cross section with fixed x. Right: A cross section
with fixed x and ∆x.

b. For a fixed value of x, say $x_i^*$, what is the geometric meaning of $A(x_i^*)\,\Delta x$?
(Hint: Consider how $A(x_i^*)\,\Delta x$ is related to the image at right in
Figure 11.2.2.)

c. Since f is continuous on R, we can define the function A = A(x) at every
value of x in [−3, 3]. Now think about subdividing the x-interval [−3, 3]
into m subintervals, and choosing a value $x_i^*$ in each of those subintervals.
What will be the meaning of the sum $\sum_{i=1}^{m} A(x_i^*)\,\Delta x$?

d. Explain why $\int_{-3}^{3} A(x)\,dx$ will determine the exact value of the volume
under the surface z = f (x, y) over the rectangle R.

11.2.1 Iterated Integrals


The ideas that we explored in Preview Activity 11.2.1 work more generally and
lead to the idea of an iterated integral. Let f be a continuous function on a
rectangular domain R = [a, b] × [c, d], and let
\[ A(x) = \int_c^d f(x, y)\,dy. \]

The function A = A(x) determines the value of the cross sectional area (by
area we mean “signed” area) in the y direction for the fixed value of x of the
solid bounded between the surface defined by f and the xy-plane.
The value of this cross sectional area is determined by the input x in A.
Since A is a function of x, it follows that we can integrate A with respect to
x. In doing so, we use a partition of [a, b] and make an approximation to the
integral given by
\[ \int_a^b A(x)\,dx \approx \sum_{i=1}^{m} A(x_i^*)\,\Delta x, \]

Figure 11.2.3: Summing volumes of cross section slices.

where $x_i^*$ is any number in the subinterval $[x_{i-1}, x_i]$. Each term $A(x_i^*)\,\Delta x$ in
the sum approximates the signed volume of a cross-sectional slice of the solid
in the y direction with a fixed width of ∆x, as illustrated in Figure 11.2.3.
We add the signed volumes of these slices as shown in the frames in
Figure 11.2.3 to obtain an approximation of the total signed volume.
As we let the number of subintervals in the x direction approach infinity,
we can see that the Riemann sum $\sum_{i=1}^{m} A(x_i^*)\,\Delta x$ approaches a limit and that
limit is the sum of signed volumes bounded by the function f on R. Therefore,
since A(x) is itself determined by an integral, we have

\[ \iint_R f(x, y)\,dA = \lim_{m\to\infty} \sum_{i=1}^{m} A(x_i^*)\,\Delta x = \int_a^b A(x)\,dx = \int_a^b \left( \int_c^d f(x, y)\,dy \right) dx. \]

Hence, we can compute the double integral of f over R by first integrating
f with respect to y on [c, d], then integrating the resulting function of x with
respect to x on [a, b]. The nested integral

\[ \int_a^b \left( \int_c^d f(x, y)\,dy \right) dx = \int_a^b \int_c^d f(x, y)\,dy\,dx \]

is called an iterated integral, and we see that each double integral may be
represented by two single integrals.
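
The nesting of the two single integrals can be mirrored in code by first building the cross-sectional area function A(x) and then integrating it. The sketch below is a rough numerical illustration under stated assumptions: the helpers `midpoint_integral` and `iterated_integral`, the step count, and the function $f(x, y) = 6xy^2$ on [0, 2] × [0, 1] are all hypothetical choices, and the midpoint rule stands in for exact integration.

```python
def midpoint_integral(g, lo, hi, steps=200):
    """Approximate the single integral of g over [lo, hi] by the midpoint rule."""
    h = (hi - lo) / steps
    return sum(g(lo + (k + 0.5) * h) for k in range(steps)) * h


def iterated_integral(f, a, b, c, d, steps=200):
    """Approximate the integral over [a, b] of A(x), where A(x) integrates f in y."""
    def A(x):
        # Inner integral: x is held fixed and we integrate in y over [c, d].
        return midpoint_integral(lambda y: f(x, y), c, d, steps)

    # Outer integral: integrate the cross-sectional area A(x) over [a, b].
    return midpoint_integral(A, a, b, steps)


def f(x, y):
    return 6 * x * y**2  # illustrative function, not from the text


value = iterated_integral(f, 0, 2, 0, 1)
print(value)  # close to 4, the exact value of the iterated integral
```

Here the inner integral evaluates to A(x) = 2x, so the outer integral of A over [0, 2] is 4; the numerical result differs from 4 only by the small error of the midpoint rule.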
We made a choice to integrate first with respect to y. The same argu-
ment shows that we can also find the double integral as an iterated integral
integrating with respect to x first, or
\[ \iint_R f(x, y)\,dA = \int_c^d \left( \int_a^b f(x, y)\,dx \right) dy = \int_c^d \int_a^b f(x, y)\,dx\,dy. \]

The fact that integrating in either order results in the same value is known
as Fubini’s Theorem.
Fubini’s Theorem.
If f = f (x, y) is a continuous function on a rectangle R = [a, b] × [c, d],
then
\[ \iint_R f(x, y)\,dA = \int_c^d \int_a^b f(x, y)\,dx\,dy = \int_a^b \int_c^d f(x, y)\,dy\,dx. \]

Fubini’s theorem enables us to evaluate iterated integrals without resorting
to the limit definition. Instead, working with one integral at a time, we can
use the Fundamental Theorem of Calculus from single-variable calculus to find
the exact value of each integral, starting with the inner integral.
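
As a brief worked illustration of this process, consider the function $f(x, y) = 6xy^2$ on R = [0, 2] × [0, 1]. This function and rectangle are assumptions chosen for this sketch and do not come from the activities. Integrating in each order gives the same value, as Fubini's Theorem predicts.

```latex
\begin{align*}
\int_0^2 \left( \int_0^1 6xy^2 \, dy \right) dx
  &= \int_0^2 \Big[ 2xy^3 \Big]_{y=0}^{y=1} dx
   = \int_0^2 2x \, dx
   = \Big[ x^2 \Big]_0^2 = 4, \\
\int_0^1 \left( \int_0^2 6xy^2 \, dx \right) dy
  &= \int_0^1 \Big[ 3x^2y^2 \Big]_{x=0}^{x=2} dy
   = \int_0^1 12y^2 \, dy
   = \Big[ 4y^3 \Big]_0^1 = 4.
\end{align*}
```

In each line the inner integral treats the other variable as a constant, exactly as with partial derivatives.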

Activity 11.2.2. Let $f(x, y) = 25 - x^2 - y^2$ on the rectangular domain R =
[−3, 3] × [−4, 4].

a. Viewing x as a fixed constant, use the Fundamental Theorem of Calculus
to evaluate the integral

\[ A(x) = \int_{-4}^{4} f(x, y)\,dy. \]

Note that you will be integrating with respect to y, and holding x con-
stant. Your result should be a function of x only.

b. Next, use your result from (a) along with the Fundamental Theorem of
Calculus to determine the value of $\int_{-3}^{3} A(x)\,dx$.

c. What is the value of $\iint_R f(x, y)\,dA$? What are two different ways we
may interpret the meaning of this value?

Activity 11.2.3. Let $f(x, y) = x + y^2$ on the rectangle R = [0, 2] × [0, 3].

a. Evaluate $\iint_R f(x, y)\,dA$ using an iterated integral. Choose an order for
integration by deciding whether you want to integrate first with respect
to x or y.

b. Evaluate $\iint_R f(x, y)\,dA$ using the iterated integral whose order of
integration is the opposite of the order you chose in (a).

11.2.2 Summary

• We can evaluate the double integral $\iint_R f(x, y)\,dA$ over a rectangle R =
[a, b] × [c, d] as an iterated integral in one of two ways:

◦ $\int_a^b \left( \int_c^d f(x, y)\,dy \right) dx$, or
◦ $\int_c^d \left( \int_a^b f(x, y)\,dx \right) dy$.

This process works because each inner integral represents a cross-sectional
(signed) area and the outer integral then sums all of the cross-sectional
(signed) areas. Fubini’s Theorem guarantees that the resulting value is
the same, regardless of the order in which we integrate.

Exercises
1. Evaluate the iterated integral $\int_0^4 \int_0^2 4x^2 y^3\,dx\,dy$.
2. Evaluate the iterated integral $\int_1^2 \int_1^2 (4x + y)^{-2}\,dy\,dx$.
3. Find $\int_0^5 \int_4^6 (x + \ln y)\,dy\,dx$.
4. Find $\int_1^4 \int_4^{10} xye^{x+y}\,dy\,dx$.
5. Calculate the double integral $\iint_R (4x + 2y + 8)\,dA$ where R is the region:
0 ≤ x ≤ 1, 0 ≤ y ≤ 2.
6. Calculate the double integral $\iint_R x\cos(x + y)\,dA$ where R is the region:
$0 \le x \le \frac{\pi}{3}$, $0 \le y \le \frac{\pi}{2}$.

7. Consider the solid that lies above the square (in the xy-plane) R =
[0, 2] × [0, 2],
and below the elliptic paraboloid $z = 64 - x^2 - 2y^2$.
(A) Estimate the volume by dividing R into 4 equal squares and choosing
the sample points to lie in the lower left hand corners.
(B) Estimate the volume by dividing R into 4 equal squares and choosing
the sample points to lie in the upper right hand corners.
(C) What is the average of the two answers from (A) and (B)?
(D) Using iterated integrals, compute the exact value of the volume.
8. If $\int_2^6 f(x)\,dx = -2$ and $\int_0^4 g(x)\,dx = -2$, what is the value of $\iint_D f(x)g(y)\,dA$
where D is the rectangle: 2 ≤ x ≤ 6, 0 ≤ y ≤ 4?
9. Find the average value of $f(x, y) = 5x^4 y^5$ over the rectangle R with
vertices (−3, 0), (−3, 4), (3, 0), (3, 4).
Average value =

10. Find the average value of f (x, y) = 8ey x + ey over the rectangle
R = [0, 8] × [0, 3].
Average value =
11. Evaluate each of the following double or iterated integrals exactly.
a. $\int_1^3 \left( \int_2^5 xy\,dy \right) dx$
b. $\int_0^{\pi/4} \left( \int_0^{\pi/3} \sin(x)\cos(y)\,dx \right) dy$
c. $\int_0^1 \left( \int_0^1 e^{-2x-3y}\,dy \right) dx$
d. $\iint_R \sqrt{2x + 5y}\,dA$, where R = [0, 2] × [0, 3].

12. The temperature at any point on a metal plate in the xy plane is given
by $T(x, y) = 100 - 4x^2 - y^2$, where x and y are measured in inches and T in
degrees Celsius. Consider the portion of the plate that lies on the rectangular
region R = [1, 5] × [3, 6].
a. Write an iterated integral whose value represents the volume under the
surface T over the rectangle R.
b. Evaluate the iterated integral you determined in (a).
c. Find the area of the rectangle, R.
d. Determine the exact average temperature, $T_{AVG(R)}$, over the region R.

13. Consider the box with a sloped top that is given by the following
description: the base is the rectangle R = [1, 4] × [2, 5], while the top is
given by the plane z = p(x, y) = 30 − x − 2y.
a. Write an iterated integral whose value represents the volume under p
over the rectangle R.
b. Evaluate the iterated integral you determined in (a).
c. What is the exact average value of p over R?
d. If you wanted to build a rectangular box (with an identical base) that
has the same volume as the box with the sloped top described here, how
tall would the rectangular box have to be?

11.3 Double Integrals over General Regions

Motivating Questions
• How do we define a double integral over a non-rectangular region?
• What general form does an iterated integral over a non-rectangular region
have?

Recall that we defined the double integral of a continuous function
f = f(x, y) over a rectangle R = [a, b] × [c, d] as
\[ \iint_R f(x, y) \, dA = \lim_{m,n \to \infty} \sum_{j=1}^{n} \sum_{i=1}^{m} f(x_{ij}^*, y_{ij}^*) \cdot \Delta A, \]
where the notation is as described in Section 11.1. Furthermore, we have seen
that we can evaluate a double integral $\iint_R f(x, y) \, dA$ over R as an iterated
integral of either of the forms
\[ \int_a^b \int_c^d f(x, y) \, dy \, dx \quad \text{or} \quad \int_c^d \int_a^b f(x, y) \, dx \, dy. \]

It is natural to wonder how we might define and evaluate a double integral
over a non-rectangular region; we explore one such example in the following
preview activity.
Preview Activity 11.3.1. A tetrahedron is a three-dimensional figure with
four faces, each of which is a triangle. A picture of the tetrahedron T with
vertices (0, 0, 0), (1, 0, 0), (0, 1, 0), and (0, 0, 1) is shown at left in Figure 11.3.1.
If we place one vertex at the origin and let vectors a, b, and c be determined
by the edges of the tetrahedron that have one end at the origin, then a formula
that tells us the volume V of the tetrahedron is
\[ V = \frac{1}{6} \left| a \cdot (b \times c) \right|. \tag{11.3.1} \]

Figure 11.3.1: Left: The tetrahedron T. Right: Projecting T onto the xy-plane.

a. Use the formula (11.3.1) to find the volume of the tetrahedron T .

b. Instead of memorizing or looking up the formula for the volume of a
tetrahedron, we can use a double integral to calculate the volume of the
tetrahedron T. To see how, notice that the top face of the tetrahedron
T is the plane whose equation is

z = 1 − (x + y).

Provided that we can use an iterated integral on a non-rectangular region,
the volume of the tetrahedron will be given by an iterated integral of the
form
\[ \int_{x=?}^{x=?} \int_{y=?}^{y=?} 1 - (x + y) \, dy \, dx. \]

The issue that is new here is how we find the limits on the integrals;
note that the outer integral’s limits are in x, while the inner ones are in
y, since we have chosen dA = dy dx. To see the domain over which we
need to integrate, think of standing way above the tetrahedron looking
straight down on it, which means we are projecting the entire tetrahedron
onto the xy-plane. The resulting domain is the triangular region shown
at right in Figure 11.3.1. Explain why we can represent the triangular
region with the inequalities

0 ≤ y ≤ 1 − x and 0 ≤ x ≤ 1.

(Hint: Consider the cross sectional slice shown at right in Figure 11.3.1.)

c. Explain why it makes sense to now write the volume integral in the form
\[ \int_{x=?}^{x=?} \int_{y=?}^{y=?} 1 - (x + y) \, dy \, dx = \int_{x=0}^{x=1} \int_{y=0}^{y=1-x} 1 - (x + y) \, dy \, dx. \]

d. Use the Fundamental Theorem of Calculus to evaluate the iterated integral
\[ \int_{x=0}^{x=1} \int_{y=0}^{y=1-x} 1 - (x + y) \, dy \, dx \]

and compare to your result from part (a). (As with iterated integrals
over rectangular regions, start with the inner integral.)
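Formula (11.3.1) can be tried out on other tetrahedra as well. The following is a minimal Python sketch, assuming a hypothetical tetrahedron (not one from the text) with vertices (0, 0, 0), (2, 0, 0), (0, 3, 0), and (0, 0, 4); the helper names `cross`, `dot`, and `tetrahedron_volume` are illustrative choices only.

```python
from fractions import Fraction


def cross(u, v):
    """Cross product u x v of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def tetrahedron_volume(a, b, c):
    """Volume (1/6)|a . (b x c)| for edge vectors a, b, c based at the origin."""
    return Fraction(abs(dot(a, cross(b, c))), 6)


# Hypothetical edge vectors with integer entries, chosen only for illustration.
a, b, c = (2, 0, 0), (0, 3, 0), (0, 0, 4)
volume = tetrahedron_volume(a, b, c)
print(volume)
```

For this particular tetrahedron the result can be cross-checked against the pyramid formula (1/3)(area of base)(height) = (1/3)(3)(4) = 4.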

11.3.1 Double Integrals over General Regions


So far, we have learned that a double integral over a rectangular region may
be interpreted in one of two ways:

• $\iint_R f(x, y) \, dA$ tells us the volume of the solids the graph of f bounds
above the xy-plane over the rectangle R minus the volume of the solids
the graph of f bounds below the xy-plane under the rectangle R;

• $\frac{1}{A(R)} \iint_R f(x, y) \, dA$, where A(R) is the area of R, tells us the average
value of the function f on R. If f(x, y) ≥ 0 on R, we can interpret this
average value of f on R as the height of the box with base R that has
the same volume as the solid under the surface defined by f over R.

As we saw in Preview Activity 11.3.1, a function f = f(x, y) may be considered
over regions other than rectangular ones, and thus we want to understand
how to set up and evaluate double integrals over non-rectangular regions. Note
that if we can, then the two interpretations of the double integral noted above
will naturally extend to solid regions with non-rectangular bases.
So, suppose f is a continuous function on a closed, bounded domain D. For
example, consider D as the circular domain shown at left in Figure 11.3.2.

Figure 11.3.2: Left: A non-rectangular domain. Right: Enclosing this domain
in a rectangle.

We can enclose D in a rectangular domain R as shown at right in
Figure 11.3.2 and extend the function f to be defined over R in order to be able
to use the definition of the double integral over a rectangle. We extend f in
such a way that its values at the points in R that are not in D contribute 0 to
the value of the integral. In other words, define a function F = F(x, y) on R
as
\[ F(x, y) = \begin{cases} f(x, y), & \text{if } (x, y) \in D, \\ 0, & \text{if } (x, y) \notin D. \end{cases} \]
We then say that the double integral of f over D is the same as the double
integral of F over R, and thus
\[ \iint_D f(x, y) \, dA = \iint_R F(x, y) \, dA. \]
In practice, we just ignore everything that is in R but not in D, since these
regions contribute 0 to the value of the integral.
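The extension-by-zero idea can be sketched numerically. The code below is only an illustration under stated assumptions: D is taken to be the disk of radius 2 centered at the origin (the text does not state the radius), R is the enclosing square [-2, 2] × [-2, 2], and the integrand f(x, y) = 4 − x² − y² is a hypothetical choice. The function names are made up for this sketch, and a midpoint Riemann sum stands in for the limit in the definition.

```python
import math


def extend_by_zero(f, in_D):
    """Return F with F = f on D and F = 0 at points outside D."""
    def F(x, y):
        return f(x, y) if in_D(x, y) else 0.0
    return F


def midpoint_double_sum(F, a, b, c, d, m, n):
    """Double Riemann sum of F over [a, b] x [c, d] using midpoints of an m-by-n grid."""
    dx = (b - a) / m
    dy = (d - c) / n
    total = 0.0
    for i in range(m):
        x = a + (i + 0.5) * dx
        for j in range(n):
            y = c + (j + 0.5) * dy
            total += F(x, y)
    return total * dx * dy


# Assumed example: D is the disk x^2 + y^2 <= 4 and f(x, y) = 4 - x^2 - y^2.
f = lambda x, y: 4.0 - x * x - y * y
in_D = lambda x, y: x * x + y * y <= 4.0
F = extend_by_zero(f, in_D)

approx = midpoint_double_sum(F, -2.0, 2.0, -2.0, 2.0, 400, 400)
print(approx, 8 * math.pi)
```

For this choice of f the exact value of the integral over D is 8π (which can be confirmed with the polar-coordinate methods of Section 11.5), so the printed numbers should be close. Cells of R lying outside D add nothing to the sum, which is the sense in which those regions can be ignored.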
Just as with double integrals over rectangles, a double integral over a
domain D can be evaluated as an iterated integral. If the region D can be
described by the inequalities $g_1(x) \le y \le g_2(x)$ and a ≤ x ≤ b, where $g_1 = g_1(x)$
and $g_2 = g_2(x)$ are functions of only x, then
\[ \iint_D f(x, y) \, dA = \int_{x=a}^{x=b} \int_{y=g_1(x)}^{y=g_2(x)} f(x, y) \, dy \, dx. \]
Alternatively, if the region D is described by the inequalities
$h_1(y) \le x \le h_2(y)$ and c ≤ y ≤ d, where $h_1 = h_1(y)$ and $h_2 = h_2(y)$ are
functions of only y, we have
\[ \iint_D f(x, y) \, dA = \int_{y=c}^{y=d} \int_{x=h_1(y)}^{x=h_2(y)} f(x, y) \, dx \, dy. \]

The structure of an iterated integral is of particular note:


In an iterated double integral:
• the limits on the outer integral must be constants;
• the limits on the inner integral must be constants or in terms of only the
remaining variable — that is, if the inner integral is with respect to y,
then its limits may only involve x and constants, and vice versa.
We next consider a detailed example.
Example 11.3.3. Let $f(x, y) = x^2 y$ be defined on the triangle D with vertices
(0, 0), (2, 0), and (2, 3) as shown at left in Figure 11.3.4.

Figure 11.3.4: A triangular domain and slices in the y and x directions.

To evaluate $\iint_D f(x, y) \, dA$, we must first describe the region D in terms of
the variables x and y. We take two approaches.


Approach 1: Integrate first with respect to y. In this case we choose to
evaluate the double integral as an iterated integral in the form
\[ \iint_D x^2 y \, dA = \int_{x=a}^{x=b} \int_{y=g_1(x)}^{y=g_2(x)} x^2 y \, dy \, dx, \]
and therefore we need to describe D in terms of inequalities

$g_1(x) \le y \le g_2(x)$ and a ≤ x ≤ b.

Since we are integrating with respect to y first, the iterated integral has
the form
\[ \iint_D x^2 y \, dA = \int_{x=a}^{x=b} A(x) \, dx, \]

where A(x) is a cross sectional area in the y direction. So we are slicing
the domain perpendicular to the x-axis and want to understand what a
cross sectional area of the overall solid will look like. Several slices of the
domain are shown in the middle image in Figure 11.3.4. On a slice with
fixed x value, the y values are bounded below by 0 and above by the y
coordinate on the hypotenuse of the right triangle. Thus, $g_1(x) = 0$; to
find $y = g_2(x)$, we need to write the hypotenuse as a function of x. The
hypotenuse connects the points (0,0) and (2,3) and hence has equation
$y = \frac{3}{2} x$. This gives the upper bound on y as $g_2(x) = \frac{3}{2} x$. The leftmost
vertical cross section is at x = 0 and the rightmost one is at x = 2, so we
have a = 0 and b = 2. Therefore,
\[ \iint_D x^2 y \, dA = \int_{x=0}^{x=2} \int_{y=0}^{y=\frac{3}{2}x} x^2 y \, dy \, dx. \]

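We evaluate the iterated integral by applying the Fundamental Theorem
of Calculus first to the inner integral, and then to the outer one, and find
that
\begin{align*}
\int_{x=0}^{x=2} \int_{y=0}^{y=\frac{3}{2}x} x^2 y \, dy \, dx
  &= \int_{x=0}^{x=2} \left[ x^2 \cdot \frac{y^2}{2} \right]_{y=0}^{y=\frac{3}{2}x} dx \\
  &= \int_{x=0}^{x=2} \frac{9}{8} x^4 \, dx \\
  &= \left. \frac{9}{8} \cdot \frac{x^5}{5} \right|_{x=0}^{x=2} \\
  &= \left( \frac{9}{8} \right) \left( \frac{32}{5} \right) \\
  &= \frac{36}{5}.
\end{align*}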
Approach 2: Integrate first with respect to x. In this case, we choose
to evaluate the double integral as an iterated integral in the form
\[ \iint_D x^2 y \, dA = \int_{y=c}^{y=d} \int_{x=h_1(y)}^{x=h_2(y)} x^2 y \, dx \, dy \]
and thus need to describe D in terms of inequalities

$h_1(y) \le x \le h_2(y)$ and c ≤ y ≤ d.

Since we are integrating with respect to x first, the iterated integral has
the form
\[ \iint_D x^2 y \, dA = \int_{c}^{d} A(y) \, dy, \]
where A(y) is a cross sectional area of the solid in the x direction. Several
slices of the domain, each perpendicular to the y-axis, are shown at right
in Figure 11.3.4. On a slice with fixed y value, the x values are bounded
below by the x coordinate on the hypotenuse of the right triangle and
above by 2. So $h_2(y) = 2$; to find $h_1(y)$, we need to write the hypotenuse
as a function of y. Solving the earlier equation we have for the hypotenuse
($y = \frac{3}{2} x$) for x gives us $x = \frac{2}{3} y$. This makes $h_1(y) = \frac{2}{3} y$. The lowest
horizontal cross section is at y = 0 and the uppermost one is at y = 3,
so we have c = 0 and d = 3. Therefore,
\[ \iint_D x^2 y \, dA = \int_{y=0}^{y=3} \int_{x=\frac{2}{3}y}^{x=2} x^2 y \, dx \, dy. \]

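We evaluate the resulting iterated integral as before by twice applying
the Fundamental Theorem of Calculus, and find that
\begin{align*}
\int_{y=0}^{y=3} \int_{x=\frac{2}{3}y}^{x=2} x^2 y \, dx \, dy
  &= \int_{y=0}^{y=3} \left[ \frac{x^3}{3} \right]_{x=\frac{2}{3}y}^{x=2} y \, dy \\
  &= \int_{y=0}^{y=3} \left( \frac{8}{3} y - \frac{8}{81} y^4 \right) dy \\
  &= \left[ \frac{8}{3} \cdot \frac{y^2}{2} - \frac{8}{81} \cdot \frac{y^5}{5} \right]_{y=0}^{y=3} \\
  &= \left( \frac{8}{3} \right) \left( \frac{9}{2} \right) - \left( \frac{8}{81} \right) \left( \frac{243}{5} \right) \\
  &= 12 - \frac{24}{5} \\
  &= \frac{36}{5}.
\end{align*}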

We see, of course, that in the situation where D can be described in two
different ways, the order in which we choose to set up and evaluate the double
integral doesn’t matter, and the same value results in either case.

The meaning of a double integral over a non-rectangular region, D, parallels
the meaning over a rectangular region. In particular,

• $\iint_D f(x, y) \, dA$ tells us the volume of the solids the graph of f bounds
above the xy-plane over the closed, bounded region D minus the volume
of the solids the graph of f bounds below the xy-plane under the region
D;

• $\frac{1}{A(D)} \iint_D f(x, y) \, dA$, where A(D) is the area of D, tells us the average
value of the function f on D. If f(x, y) ≥ 0 on D, we can interpret
this average value of f on D as the height of the solid with base D and
constant cross-section D that has the same volume as the solid under
the surface defined by f over D.
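Both orders of integration in Example 11.3.3, along with the average-value interpretation above, can be checked numerically. The sketch below is not the method used in the text (which applies the Fundamental Theorem of Calculus); it swaps in composite Simpson's rule for each single integral. The function names `simpson` and `iterated_integral` are hypothetical.

```python
def simpson(g, a, b, n=200):
    """Composite Simpson's rule for g on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def iterated_integral(f, outer_lo, outer_hi, inner_lo, inner_hi, n=200):
    """Integrate f(outer, inner): constant outer limits, inner limits given as functions of the outer variable."""
    def cross_section(u):
        return simpson(lambda v: f(u, v), inner_lo(u), inner_hi(u), n)
    return simpson(cross_section, outer_lo, outer_hi, n)


# Order dy dx: outer variable x in [0, 2], inner variable y in [0, (3/2)x].
dy_dx = iterated_integral(lambda x, y: x**2 * y, 0.0, 2.0,
                          lambda x: 0.0, lambda x: 1.5 * x)

# Order dx dy: outer variable y in [0, 3], inner variable x in [(2/3)y, 2].
dx_dy = iterated_integral(lambda y, x: x**2 * y, 0.0, 3.0,
                          lambda y: 2.0 * y / 3.0, lambda y: 2.0)

# Area of D (integrand 1) and the resulting average value of f on D.
area = iterated_integral(lambda x, y: 1.0, 0.0, 2.0,
                         lambda x: 0.0, lambda x: 1.5 * x)
average = dy_dx / area

print(dy_dx, dx_dy, area, average)
```

Both orders should print values close to 36/5 = 7.2. Since the triangle has area 3, the average value of f on D is (36/5)/3 = 12/5. Note how the signature mirrors the structural rule for iterated integrals: the outer limits are plain numbers, while the inner limits are functions of the outer variable only.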

Activity 11.3.2. Consider the double integral $\iint_D (4 - x - 2y) \, dA$, where D
is the triangular region with vertices (0,0), (4,0), and (0,2).

a. Write the given integral as an iterated integral of the form
$\iint_D (4 - x - 2y) \, dy \, dx$. Draw a labeled picture of D with relevant cross sections.

b. Write the given integral as an iterated integral of the form
$\iint_D (4 - x - 2y) \, dx \, dy$. Draw a labeled picture of D with relevant cross sections.

c. Evaluate the two iterated integrals from (a) and (b), and verify that they
produce the same value. Give at least one interpretation of the meaning
of your result.
Activity 11.3.3. Consider the iterated integral
$\int_{x=3}^{x=5} \int_{y=-x}^{y=x^2} (4x + 10y) \, dy \, dx$.

a. Sketch the region of integration, D, for which
\[ \iint_D (4x + 10y) \, dA = \int_{x=3}^{x=5} \int_{y=-x}^{y=x^2} (4x + 10y) \, dy \, dx. \]

b. Determine the equivalent iterated integral that results from integrating
in the opposite order (dx dy, instead of dy dx). That is, determine the
limits of integration for which
\[ \iint_D (4x + 10y) \, dA = \int_{y=?}^{y=?} \int_{x=?}^{x=?} (4x + 10y) \, dx \, dy. \]

c. Evaluate one of the two iterated integrals above. Explain what the value
you obtained tells you.

d. Set up and evaluate a single definite integral to determine the exact area
of D, A(D).

e. Determine the exact average value of f (x, y) = 4x + 10y over D.


Activity 11.3.4. Consider the iterated integral
$\int_{x=0}^{x=4} \int_{y=x/2}^{y=2} e^{y^2} \, dy \, dx$.

a. Explain why we cannot find a simple antiderivative for $e^{y^2}$ with respect to
y, and thus are unable to evaluate $\int_{x=0}^{x=4} \int_{y=x/2}^{y=2} e^{y^2} \, dy \, dx$ in the indicated
order using the Fundamental Theorem of Calculus.

b. Given that $\iint_D e^{y^2} \, dA = \int_{x=0}^{x=4} \int_{y=x/2}^{y=2} e^{y^2} \, dy \, dx$, sketch the region of
integration, D.

c. Rewrite the given iterated integral in the opposite order, using dA =
dx dy. (Hint: You may need more than one integral.)

d. Use the Fundamental Theorem of Calculus to evaluate the iterated
integral you developed in (c). Write one sentence to explain the meaning of
the value you found.

e. What is the important lesson this activity offers regarding the order in
which we set up an iterated integral?

11.3.2 Summary

• For a double integral $\iint_D f(x, y) \, dA$ over a non-rectangular region D, we
enclose D in a rectangle R and then extend the integrand f to a function
F so that F(x, y) = 0 at all points in R outside of D and F(x, y) =
f(x, y) for all points in D. We then define $\iint_D f(x, y) \, dA$ to be equal to
$\iint_R F(x, y) \, dA$.

• In an iterated double integral, the limits on the outer integral must be
constants while the limits on the inner integral must be constants or in
terms of only the remaining variable. In other words, an iterated double
integral has one of the following forms (which result in the same value):
\[ \int_{x=a}^{x=b} \int_{y=g_1(x)}^{y=g_2(x)} f(x, y) \, dy \, dx, \]
where $g_1 = g_1(x)$ and $g_2 = g_2(x)$ are functions of x only and the region
D is described by the inequalities $g_1(x) \le y \le g_2(x)$ and a ≤ x ≤ b, or
\[ \int_{y=c}^{y=d} \int_{x=h_1(y)}^{x=h_2(y)} f(x, y) \, dx \, dy, \]
where $h_1 = h_1(y)$ and $h_2 = h_2(y)$ are functions of y only and the region
D is described by the inequalities $h_1(y) \le x \le h_2(y)$ and c ≤ y ≤ d.

Exercises
1. Evaluate the double integral $I = \iint_D xy \, dA$, where D is the triangular
region with vertices (0, 0), (1, 0), (0, 6).
2. Evaluate the double integral $I = \iint_D xy \, dA$, where D is the triangular
region with vertices (0, 0), (1, 0), (0, 4).
3. Evaluate the integral by reversing the order of integration.
$\int_0^1 \int_{4y}^{4} e^{x^2} \, dx \, dy =$
4. Decide, without calculation, whether each of the integrals below is positive,
negative, or zero. Let D be the region inside the unit circle centered at the
origin. Let T, B, R, and L denote the regions enclosed by the top half, the
bottom half, the right half, and the left half of the unit circle, respectively.

(a) $\iint_B (y^3 + y^5) \, dA$
(b) $\iint_T (y^3 + y^5) \, dA$
(c) $\iint_D (y^3 + y^5) \, dA$
(d) $\iint_R (y^3 + y^5) \, dA$
(e) $\iint_L (y^3 + y^5) \, dA$

5. The region W lies below the surface $f(x, y) = 4e^{-(x-3)^2 - y^2}$ and above
the disk $x^2 + y^2 \le 36$ in the xy-plane.
(a) Think about what the contours of f look like. You may want to use
f(x, y) = 1 as an example. Sketch a rough contour diagram on a separate sheet
of paper.
(b) Write an integral giving the area of the cross-section of W in the plane
x = 3.
Area = $\int_a^b$ ____ d____,
where a = ____ and b = ____
(c) Use your work from (b) to write an iterated double integral giving the
volume of W, using the work from (b) to inform the construction of the inside
integral.
Volume = $\int_a^b \int_c^d$ ____ d____ d____,
where a = ____, b = ____, c = ____, and d = ____
6. Set up a double integral in rectangular coordinates for calculating the
volume of the solid under the graph of the function $f(x, y) = 29 - x^2 - y^2$ and
above the plane z = 4.
Instructions: Please enter the integrand in the first answer box. Depending
on the order of integration you choose, enter dx and dy in either order into
the second and third answer boxes with only one dx or dy in each box. Then,
enter the limits of integration.
$\int_A^B \int_C^D$ ____ ____ ____
A =
B =
C =
D =
7. Find the volume of the solid bounded by the planes x = 0, y = 0, z = 0,
and x + y + z = 7.
8. Consider the integral $\int_0^7 \int_0^{\sqrt{49 - y}} f(x, y) \, dx \, dy$. If we change the order of
integration we obtain the sum of two integrals:
\[ \int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y) \, dy \, dx + \int_c^d \int_{g_3(x)}^{g_4(x)} f(x, y) \, dy \, dx \]
a =          b =
$g_1(x)$ =     $g_2(x)$ =
c =          d =
$g_3(x)$ =     $g_4(x)$ =
9. A pile of earth standing on flat ground has height 36 meters. The ground
is the xy-plane. The origin is directly below the top of the pile and the z-axis
is upward. The cross-section at height z is given by $x^2 + y^2 = 36 - z$ for
0 ≤ z ≤ 36, with x, y, and z in meters.
(a) What equation gives the edge of the base of the pile?
$x^2 + y^2 = 36$
x + y = 36
x + y = 6
$x^2 + y^2 = 6$
None of the above
(b) What is the area of the base of the pile?
(c) What equation gives the cross-section of the pile with the plane z = 3?

$x^2 + y^2 = 33$
$x^2 + y^2 = 3$
$x^2 + y^2 = 33$
$x^2 + y^2 = 9$
None of the above
(d) What is the area of the cross-section z = 3 of the pile?
(e) What is A(z), the area of a horizontal cross-section at height z?
A(z) =
square meters
(f) Use your answer in part (e) to find the volume of the pile.
Volume =
cubic meters
10. Match the following integrals with the verbal descriptions of the solids
whose volumes they give. Put the letter of the verbal description to the left of
the corresponding integral.

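(a) $\int_0^2 \int_{-2}^{2} \sqrt{4 - y^2} \, dy \, dx$

(b) $\int_0^1 \int_{y^2}^{\sqrt{y}} 4x^2 + 3y^2 \, dx \, dy$

(c) $\int_{-1}^{1} \int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} 1 - x^2 - y^2 \, dy \, dx$

(d) $\int_{-2}^{2} \int_{4}^{4 + \sqrt{4 - x^2}} 4x + 3y \, dy \, dx$

(e) $\int_0^{1/\sqrt{3}} \int_0^{\frac{1}{2}\sqrt{1 - 3y^2}} \sqrt{1 - 4x^2 - 3y^2} \, dx \, dy$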

A. Solid under a plane and over one half of a circular disk.

B. One eighth of an ellipsoid.

C. Solid under an elliptic paraboloid and over a planar region bounded by


two parabolas.

D. One half of a cylindrical rod.

E. Solid bounded by a circular paraboloid and a plane.

11. For each of the following iterated integrals,

• sketch the region of integration,

• write an equivalent iterated integral expression in the opposite order of


integration,

• choose one of the two orders and evaluate the integral.

a. $\int_{x=0}^{x=1} \int_{y=x^2}^{y=x} xy \, dy \, dx$

b. $\int_{y=0}^{y=2} \int_{x=-\sqrt{4 - y^2}}^{x=0} xy \, dx \, dy$

c. $\int_{x=0}^{x=1} \int_{y=x^4}^{y=x^{1/4}} x + y \, dy \, dx$

d. $\int_{y=0}^{y=2} \int_{x=y/2}^{x=2y} x + y \, dx \, dy$

12. The temperature at any point on a metal plate in the xy-plane is given
by $T(x, y) = 100 - 4x^2 - y^2$, where x and y are measured in inches and T in
degrees Celsius. Consider the portion of the plate that lies on the finite
region D between the parabolas $x = y^2$ and $x = 3 - 2y^2$.

a. Construct a labeled sketch of the region D.

b. Set up an iterated integral whose value is $\iint_D T(x, y) \, dA$, using dA =
dx dy. (Hint: It is possible that more than one integral is needed.)

c. Set up an iterated integral whose value is $\iint_D T(x, y) \, dA$, using dA =
dy dx. (Hint: It is possible that more than one integral is needed.)



d. Use the Fundamental Theorem of Calculus to evaluate the integrals you
determined in (b) and (c).
e. Determine the exact average temperature, $T_{AVG(D)}$, over the region D.

13. Consider the solid that is given by the following description: the base is
the given region D, while the top is given by the surface z = p(x, y). In each
setting below, set up, but do not evaluate, an iterated integral whose value is
the exact volume of the solid. Include a labeled sketch of D in each case.
a. D is the interior of the quarter circle of radius 2, centered at the origin,
that lies in the second quadrant of the plane; $p(x, y) = 16 - x^2 - y^2$.
b. D is the finite region between the line y = x + 1 and the parabola $y = x^2$;
p(x, y) = 10 − x − 2y.
c. D is the triangular region with vertices (1, 1), (2, 2), and (2, 3);
$p(x, y) = e^{-xy}$.

d. D is the region bounded by the y-axis, y = 4 and x = y;
$p(x, y) = \sqrt{1 + x^2 + y^2}$.

14. Consider the iterated integral $I = \int_{x=0}^{x=4} \int_{y=\sqrt{x}}^{y=2} \cos(y^3) \, dy \, dx$.

a. Sketch the region of integration.


b. Write an equivalent iterated integral with the order of integration re-
versed.
c. Choose one of the two orders of integration and evaluate the iterated
integral you chose by hand. Explain the reasoning behind your choice.

d. Determine the exact average value of $\cos(y^3)$ over the region D that is
determined by the iterated integral I.

11.4 Applications of Double Integrals

Motivating Questions
• If we have a mass density function for a lamina (thin plate), how does a
double integral determine the mass of the lamina?
• How may a double integral be used to find the area between two curves?
• Given a mass density function on a lamina, how can we find the lamina’s
center of mass?
• What is a joint probability density function? How do we determine the
probability of an event if we know a probability density function?

So far, we have interpreted the double integral of a function f over a
domain D in two different ways. First, $\iint_D f(x, y) \, dA$ tells us a difference of
volumes: the volume the surface defined by f bounds above the xy-plane on
D minus the volume the surface bounds below the xy-plane on D. In addition,
$\frac{1}{A(D)} \iint_D f(x, y) \, dA$ determines the average value of f on D. In this section, we
investigate several other applications of double integrals, using the integration
process as seen in Preview Activity 11.4.1: we partition into small regions,
approximate the desired quantity on each small region, then use the integral
to sum these values exactly in the limit.
The following preview activity explores how a double integral can be used to
determine the mass of a thin plate with a mass density distribution. Recall
that in single-variable calculus, we considered a similar problem and computed
the mass of a one-dimensional rod with a mass-density distribution. There, as
here, the key idea is that if density is constant, mass is the product of density
and the size of the object (length for a rod, area for a thin plate).
Preview Activity 11.4.1. Suppose that we have a flat, thin object (called a
lamina) whose density varies across the object. We can think of the density
on a lamina as a measure of mass per unit area. As an example, consider a
circular plate D of radius 1 cm centered at the origin whose density δ varies
depending on the distance from its center so that the density in grams per
square centimeter at point (x, y) is
\[ \delta(x, y) = 10 - 2(x^2 + y^2). \]
a. Suppose that we partition the plate into subrectangles $R_{ij}$, where 1 ≤
i ≤ m and 1 ≤ j ≤ n, of equal area ∆A, and select a point $(x_{ij}^*, y_{ij}^*)$ in
$R_{ij}$ for each i and j. What is the meaning of the quantity $\delta(x_{ij}^*, y_{ij}^*) \Delta A$?

b. State a double Riemann sum that provides an approximation of the mass
of the plate.
c. Explain why the double integral
\[ \iint_D \delta(x, y) \, dA \]
tells us the exact mass of the plate.


d. Determine an iterated integral which, if evaluated, would give the exact
mass of the plate. Do not actually evaluate the integral. (This integral is
considerably easier to evaluate in polar coordinates, which we will learn
more about in Section 11.5.)

11.4.1 Mass
Density is a measure of some quantity per unit area or volume. For example,
we can measure the human population density of some region as the number
of humans in that region divided by the area of that region. In physics, the
mass density of an object is the mass of the object per unit area or volume.
As suggested by Preview Activity 11.4.1, the following holds in general.
The mass of a lamina.
If δ(x, y) describes the density of a lamina defined by a planar region
D, then the mass of D is given by the double integral $\iint_D \delta(x, y) \, dA$.

Activity 11.4.2. Let D be a half-disk lamina of radius 3 in quadrants IV and
I, centered at the origin as shown in Figure 11.4.1. Assume the density at point
(x, y) is given by δ(x, y) = x + y. Find the exact mass of the lamina.

Figure 11.4.1: A half disk lamina.

11.4.2 Area
If we consider the situation where the mass-density distribution is constant,
we can also see how a double integral may be used to determine the area of
a region. Assuming that δ(x, y) = 1 over a closed bounded region D, where
the units of δ are “mass per unit of area,” it follows that $\iint_D 1 \, dA$ is the mass
of the lamina. But since the density is the constant 1, the numerical value of
the integral is simply the area.
As the following activity demonstrates, we can also see this fact by consid-
ering a three-dimensional solid whose height is always 1.
Activity 11.4.3. Suppose we want to find the area of the bounded region D
between the curves

$y = 1 - x^2$ and y = x − 1.

A picture of this region is shown in Figure 11.4.2.

a. The volume of a solid with constant height is given by the area of the
base times the height. Hence, we may interpret the area of the region D
as the volume of a solid with base D and of uniform height 1. That is, the
area of D is given by $\iint_D 1 \, dA$. Write an iterated integral whose value
is $\iint_D 1 \, dA$. (Hint: Which order of integration might be more efficient?
Why?)

Figure 11.4.2: The graphs of $y = 1 - x^2$ and y = x − 1.

b. Evaluate the iterated integral from (a). What does the result tell you?

We now formally state the conclusion from our earlier discussion and
Activity 11.4.3.
The double integral and area.
Given a closed, bounded region D in the plane, the area of D, denoted
A(D), is given by the double integral
\[ A(D) = \iint_D 1 \, dA. \]
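As a small illustration of this area formula, the sketch below computes the area of a hypothetical region (not the one in Activity 11.4.3): the region between y = x² and y = x for 0 ≤ x ≤ 1. The inner integral of 1 with respect to y is simply the length of the vertical slice, and the remaining single integral is handled with one application of Simpson's rule, which is exact for polynomials of degree at most 3. Exact rational arithmetic is an implementation choice for this sketch, and the function names are illustrative.

```python
from fractions import Fraction


def simpson_exact(g, a, b):
    """Simpson's rule on [a, b] with two subintervals; exact for polynomials of degree <= 3."""
    mid = (a + b) / 2
    return (b - a) / 6 * (g(a) + 4 * g(mid) + g(b))


def area_between(g1, g2, a, b):
    """Area of {a <= x <= b, g1(x) <= y <= g2(x)} as the iterated integral of 1 dy dx."""
    # The inner integral of 1 with respect to y is the slice length g2(x) - g1(x).
    return simpson_exact(lambda x: g2(x) - g1(x), a, b)


# Assumed region: between y = x^2 (below) and y = x (above) for 0 <= x <= 1.
area = area_between(lambda x: x**2, lambda x: x, Fraction(0), Fraction(1))
print(area)
```

The result, 1/6, agrees with the single-variable computation of the integral of x − x² from 0 to 1, which is the connection between the double integral of 1 and the familiar area between two curves.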

11.4.3 Center of Mass


The center of mass of an object is a point at which the object will balance
perfectly. For example, the center of mass of a circular disk of uniform density
is located at its center. For any object, if we throw it through the air, it will
spin around its center of mass and behave as if all the mass is located at the
center of mass.
In order to understand the role that integrals play in determining the center
of mass of an object with a nonuniform mass distribution, we start by finding
the center of mass of a collection of N distinct point-masses in the plane.
Let $m_1$, $m_2$, . . ., $m_N$ be N masses located in the plane. Think of these
masses as connected by rigid rods of negligible weight from some central point
$(\bar{x}, \bar{y})$. A picture with four masses is shown in Figure 11.4.3. Now imagine
balancing this system by placing it on a thin pole at the point $(\bar{x}, \bar{y})$ perpendicular
to the plane containing the masses. Unless the masses are perfectly balanced,
the system will fall off the pole. The point $(\bar{x}, \bar{y})$ at which the system will
balance perfectly is called the center of mass of the system. Our goal is to
determine the center of mass of a system of discrete masses, then extend this
to a continuous lamina.
Figure 11.4.3: A center of mass $(\bar{x}, \bar{y})$ of four masses.

Each mass exerts a force (called a moment) around the lines $x = \bar{x}$ and
$y = \bar{y}$ that causes the system to tilt in the direction of the mass. These moments
are dependent on the mass and the distance from the given line. Let $(x_1, y_1)$
be the location of mass $m_1$, $(x_2, y_2)$ the location of mass $m_2$, etc. In order to
balance perfectly, the moments in the x direction and in the y direction must
be in equilibrium. We determine these moments and solve the resulting system
to find the equilibrium point $(\bar{x}, \bar{y})$ at the center of mass.
The force that mass $m_1$ exerts to tilt the system from the line $y = \bar{y}$ is
\[ m_1 g (\bar{y} - y_1), \]
where g is the acceleration due to gravity. Similarly, the force mass $m_2$ exerts to
tilt the system from the line $y = \bar{y}$ is
\[ m_2 g (\bar{y} - y_2). \]
In general, the force that mass $m_k$ exerts to tilt the system from the line
$y = \bar{y}$ is
\[ m_k g (\bar{y} - y_k). \]
For the system to balance, we need the forces to sum to 0, so that
\[ \sum_{k=1}^{N} m_k g (\bar{y} - y_k) = 0. \]
Solving for $\bar{y}$, we find that
\[ \bar{y} = \frac{\sum_{k=1}^{N} m_k y_k}{\sum_{k=1}^{N} m_k}. \]
A similar argument shows that
\[ \bar{x} = \frac{\sum_{k=1}^{N} m_k x_k}{\sum_{k=1}^{N} m_k}. \]
The value $M_x = \sum_{k=1}^{N} m_k y_k$ is called the total moment with respect
to the x-axis; $M_y = \sum_{k=1}^{N} m_k x_k$ is the total moment with respect to the
y-axis. Hence, the respective quotients of the moments to the total mass, M,
determine the center of mass of a point-mass system:
\[ (\bar{x}, \bar{y}) = \left( \frac{M_y}{M}, \frac{M_x}{M} \right). \]
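The point-mass formulas above translate directly into a few lines of code. The sketch below uses three hypothetical masses (values and locations chosen only for illustration) and also checks the balance condition: the signed moments about the lines through the computed center of mass should sum to zero. The common factor g cancels from the balance equation, so it is omitted.

```python
def center_of_mass(masses, points):
    """Return (x_bar, y_bar) = (M_y / M, M_x / M) for point masses in the plane."""
    M = sum(masses)                                          # total mass
    M_y = sum(m * x for m, (x, y) in zip(masses, points))    # moment about the y-axis
    M_x = sum(m * y for m, (x, y) in zip(masses, points))    # moment about the x-axis
    return M_y / M, M_x / M


# Hypothetical system: masses 1, 2, 3 at (0, 0), (1, 0), (0, 1).
masses = [1.0, 2.0, 3.0]
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
x_bar, y_bar = center_of_mass(masses, points)

# Balance check: sum of m_k (x_bar - x_k) and of m_k (y_bar - y_k) should vanish.
tilt_x = sum(m * (x_bar - x) for m, (x, y) in zip(masses, points))
tilt_y = sum(m * (y_bar - y) for m, (x, y) in zip(masses, points))

print(x_bar, y_bar, tilt_x, tilt_y)
```

Here the total mass is 6, with M_y = 2 and M_x = 3, so the center of mass is (1/3, 1/2). It sits closer to the heaviest mass at (0, 1) than the plain average of the three locations would.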

Now, suppose that rather than a point-mass system, we have a continuous
lamina with a variable mass-density δ(x, y). We may estimate its center of
mass by partitioning the lamina into mn subrectangles of equal area ∆A, and
treating the resulting partitioned lamina as a point-mass system. In particular,
we select a point $(x_{ij}^*, y_{ij}^*)$ in the ijth subrectangle, and observe that the quantity
\[ \delta(x_{ij}^*, y_{ij}^*) \Delta A \]
is density times area, so $\delta(x_{ij}^*, y_{ij}^*) \Delta A$ approximates the mass of the small
portion of the lamina determined by the subrectangle $R_{ij}$.
We now treat $\delta(x_{ij}^*, y_{ij}^*) \Delta A$ as a point mass at the point $(x_{ij}^*, y_{ij}^*)$. The
coordinates $(\bar{x}, \bar{y})$ of the center of mass of these mn point masses are thus
given by
\[ \bar{x} = \frac{\sum_{j=1}^{n} \sum_{i=1}^{m} x_{ij}^* \, \delta(x_{ij}^*, y_{ij}^*) \Delta A}{\sum_{j=1}^{n} \sum_{i=1}^{m} \delta(x_{ij}^*, y_{ij}^*) \Delta A} \quad \text{and} \quad \bar{y} = \frac{\sum_{j=1}^{n} \sum_{i=1}^{m} y_{ij}^* \, \delta(x_{ij}^*, y_{ij}^*) \Delta A}{\sum_{j=1}^{n} \sum_{i=1}^{m} \delta(x_{ij}^*, y_{ij}^*) \Delta A}. \]
If we take the limit as m and n go to infinity, we obtain the exact center
of mass $(\bar{x}, \bar{y})$ of the continuous lamina.
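The center of mass of a lamina.
The coordinates $(\bar{x}, \bar{y})$ of the center of mass of a lamina D with density
δ = δ(x, y) are given by
\[ \bar{x} = \frac{\iint_D x \, \delta(x, y) \, dA}{\iint_D \delta(x, y) \, dA} \quad \text{and} \quad \bar{y} = \frac{\iint_D y \, \delta(x, y) \, dA}{\iint_D \delta(x, y) \, dA}. \]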

The center of mass of a lamina can then be thought of as a weighted average
of all of the points in the lamina with the weights given by the density at each
point. The centroid of a lamina is just the average of all of the points in
the lamina, or the center of mass if the density at each point is 1.
The numerators of $\bar{x}$ and $\bar{y}$ are called the respective moments of the lamina
about the coordinate axes. Thus, the moment of a lamina D with density
δ = δ(x, y) about the y-axis is
\[ M_y = \iint_D x \, \delta(x, y) \, dA \]
and the moment of D about the x-axis is
\[ M_x = \iint_D y \, \delta(x, y) \, dA. \]
If M is the mass of the lamina, it follows that the center of mass is
\[ (\bar{x}, \bar{y}) = \left( \frac{M_y}{M}, \frac{M_x}{M} \right). \]
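The passage from point masses to a lamina can be sketched by carrying out the partition literally: each subrectangle is replaced by a point mass δ(x*, y*)ΔA at its midpoint, and the point-mass formulas are applied. The lamina below is a hypothetical example, the rectangle [0, 2] × [0, 1] with density δ(x, y) = 1 + x, chosen because its exact mass (4) and center of mass (7/6, 1/2) are easy to find by hand. The function name is illustrative.

```python
def lamina_center_of_mass(delta, a, b, c, d, m, n):
    """Approximate the mass and center of mass of a lamina on [a, b] x [c, d].

    Each of the m*n subrectangles is treated as a point mass
    delta(x*, y*) * dA located at its midpoint (x*, y*).
    """
    dx = (b - a) / m
    dy = (d - c) / n
    dA = dx * dy
    M = M_y = M_x = 0.0
    for i in range(m):
        x = a + (i + 0.5) * dx
        for j in range(n):
            y = c + (j + 0.5) * dy
            mass = delta(x, y) * dA   # point mass standing in for R_ij
            M += mass
            M_y += x * mass           # contribution to the moment about the y-axis
            M_x += y * mass           # contribution to the moment about the x-axis
    return M, (M_y / M, M_x / M)


# Assumed lamina: [0, 2] x [0, 1] with density 1 + x (heavier toward the right).
M, (x_bar, y_bar) = lamina_center_of_mass(lambda x, y: 1.0 + x,
                                          0.0, 2.0, 0.0, 1.0, 200, 100)
print(M, x_bar, y_bar)
```

Because the density increases with x, the center of mass lies to the right of the centroid (1, 1/2) of the rectangle, while symmetry in y keeps the second coordinate at 1/2.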

Activity 11.4.4. In this activity we determine integrals that represent the
center of mass of a lamina D described by the triangular region bounded by
the x-axis and the lines x = 1 and y = 2x in the first quadrant if the density
at point (x, y) is δ(x, y) = 6x + 6y + 6. A picture of the lamina is shown in
Figure 11.4.4.

Figure 11.4.4: The lamina bounded by the x-axis and the lines x = 1 and
y = 2x in the first quadrant.

a. Set up an iterated integral that represents the mass of the lamina.

b. Assume the mass of the lamina is 14. Set up two iterated integrals that
represent the coordinates of the center of mass of the lamina.

11.4.4 Probability
Calculating probabilities is a very important application of integration in the
physical, social, and life sciences. To understand the basics, consider the game
of darts in which a player throws a dart at a board and tries to hit a particular
target. Let us suppose that a dart board is in the form of a disk D with radius
10 inches. If we assume that a player throws a dart at random, and is not
aiming at any particular point, then it is equally probable that the dart will
strike any single point on the board. For instance, the probability that the dart
will strike a particular 1 square inch region is 1/(100π), or the ratio of the area
of the desired target to the total area of D (assuming that the dart thrower
always hits the board itself at some point). Similarly, the probability that the
dart strikes a point in the disk D₃ of radius 3 inches is given by the area of D₃
divided by the area of D. In other words, the probability that the dart strikes
the disk D₃ is
\[
\frac{9\pi}{100\pi} = \iint_{D_3} \frac{1}{100\pi}\,dA.
\]
The integrand, 1/(100π), may be thought of as a distribution function, describing
how the dart strikes are distributed across the board. In this case the
distribution function is constant since we are assuming a uniform distribution,
but we can easily envision situations where the distribution function varies. For
example, if the player is fairly good and is aiming for the bull's-eye (the center
of D), then the distribution function f could be skewed toward the center, say
\[
f(x, y) = K e^{-(x^2 + y^2)}
\]
for some positive constant K. If we assume that the player is consistent enough
so that the dart always strikes the board, then the probability that the dart
strikes the board somewhere is 1, and the distribution function f will have to
satisfy¹
\[
\iint_D f(x, y)\,dA = 1.
\]

For such a function f, the probability that the dart strikes in the disk D₁
of radius 1 would be
\[
\iint_{D_1} f(x, y)\,dA.
\]
Indeed, the probability that the dart strikes in any region R that lies within
D is given by
\[
\iint_R f(x, y)\,dA.
\]
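The normalization condition on the dart board can be explored numerically. The sketch below assumes the skewed distribution Ke^{−(x² + y²)} on the board of radius 10 described above; the helper name `integrate_over_disk` and the grid sizes are illustrative choices, not part of the text. It estimates K with a midpoint Riemann sum over the bounding square (skipping cells whose midpoint lies outside the disk), compares it with the closed form 1/(π(1 − e^{−100})), and then estimates the probability of landing in the disk of radius 1.

```python
import math


def integrate_over_disk(f, radius, n=800):
    """Midpoint Riemann sum of f over the disk x^2 + y^2 <= radius^2,
    using an n-by-n grid on the bounding square."""
    h = 2.0 * radius / n
    total = 0.0
    for i in range(n):
        x = -radius + (i + 0.5) * h
        for j in range(n):
            y = -radius + (j + 0.5) * h
            if x * x + y * y <= radius * radius:   # keep cells inside the disk
                total += f(x, y) * h * h
    return total


def bell(x, y):
    """Unnormalized distribution e^{-(x^2 + y^2)}."""
    return math.exp(-(x * x + y * y))


# K is chosen so that the total probability over the board is 1.
K_numeric = 1.0 / integrate_over_disk(bell, 10.0)
K_exact = 1.0 / (math.pi * (1.0 - math.exp(-100.0)))

# Probability that the dart lands within 1 inch of the center.
p_inner = K_numeric * integrate_over_disk(bell, 1.0, n=400)
print(K_numeric, K_exact, p_inner)
```

The two values of K should agree closely, and the probability for the inner disk should come out near 1 − e^{−1}, a little under two thirds. The grid only approximates the circular boundary, so the last digits of `p_inner` depend on the grid size.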

The preceding discussion highlights the general idea behind calculating
probabilities. We assume we have a joint probability density function f, a
function of two independent variables x and y defined on a domain D that
satisfies the conditions

• f(x, y) ≥ 0 for all x and y in D,

• the probability that x is between some values a and b while y is between
some values c and d is given by
\[
\int_a^b \int_c^d f(x, y)\,dy\,dx,
\]

• the probability that the point (x, y) is in D is 1, that is
\[
\iint_D f(x, y)\,dA = 1. \tag{11.4.1}
\]

Note that it is possible that D could be an infinite region and the limits
on the integral in Equation (11.4.1) could be infinite. When we have such a
probability density function f = f(x, y), the probability that the point (x, y)
is in some region R contained in the domain D (the notation we use here is
“P((x, y) ∈ R)”) is determined by
\[
P((x, y) \in R) = \iint_R f(x, y)\,dA.
\]
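As a small illustration of these conditions, the sketch below uses a hypothetical joint density f(x, y) = x + y on the unit square (an example chosen here, not one from the text). It checks that the total probability is 1 and then computes the probability that (x, y) lies in the corner rectangle 0 ≤ x ≤ 1/2, 0 ≤ y ≤ 1/2. The helper `rect_integral` is an assumed name for a plain midpoint approximation of an iterated integral.

```python
def rect_integral(f, a, b, c, d, n=200):
    """Midpoint approximation of the iterated integral of f(x, y)
    for a <= x <= b and c <= y <= d."""
    dx = (b - a) / n
    dy = (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        for j in range(n):
            y = c + (j + 0.5) * dy
            total += f(x, y)
    return total * dx * dy


def density(x, y):
    """Hypothetical joint density on D = [0, 1] x [0, 1]."""
    return x + y


total_probability = rect_integral(density, 0.0, 1.0, 0.0, 1.0)   # condition (11.4.1)
p_corner = rect_integral(density, 0.0, 0.5, 0.0, 0.5)            # P((x, y) in R)
print(total_probability, p_corner)
```

Working the same integrals by hand gives 1 for the total and 1/8 for the corner rectangle, so the printed values should match those up to rounding.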

Activity 11.4.5. A firm manufactures smoke detectors. Two components for
the detectors come from different suppliers, one in Michigan and one in Ohio.
The company studies these components for their reliability and their data
suggest that if x is the life span (in years) of a randomly chosen component
from the Michigan supplier and y the life span (in years) of a randomly chosen
component from the Ohio supplier, then the joint probability density function
f might be given by
\[
f(x, y) = e^{-x} e^{-y}.
\]

a. Theoretically, the components might last forever, so the domain D of the
function f is the set of all (x, y) such that x ≥ 0 and y ≥ 0. To show
that f is a probability density function on D we need to demonstrate
that
\[
\iint_D f(x, y)\,dA = 1,
\]
or that
\[
\int_0^\infty \int_0^\infty f(x, y)\,dy\,dx = 1.
\]
Use your knowledge of improper integrals to verify that f is indeed a
probability density function.

¹ This makes K = 1/(π(1 − e^{−100})), which you can check.

b. Assume that the smoke detector fails only if both of the supplied compo-
nents fail. To determine the probability that a randomly selected detector
will fail within one year, we will need to determine the probability that
the life span of each component is between 0 and 1 years. Set up an
appropriate iterated integral, and evaluate the integral to determine the
probability.

c. What is the probability that a randomly chosen smoke detector will fail
between years 3 and 7?

d. Suppose that the manufacturer determines that one of the components
is more likely to fail than the other, and hence conjectures that the
probability density function is instead f(x, y) = K e^{−x} e^{−2y}. What is the
value of K?

11.4.5 Summary

• The mass of a lamina D with a mass density function δ = δ(x, y) is
∬_D δ(x, y) dA.

• The area of a region D in the plane has the same numerical value as the
volume of a solid of uniform height 1 and base D, so the area of D is
given by ∬_D 1 dA.

• The center of mass, (x̄, ȳ), of a continuous lamina with a variable density
δ(x, y) is given by
\[
\bar{x} = \frac{\iint_D x\,\delta(x, y)\,dA}{\iint_D \delta(x, y)\,dA}
\quad\text{and}\quad
\bar{y} = \frac{\iint_D y\,\delta(x, y)\,dA}{\iint_D \delta(x, y)\,dA}.
\]

• Given a joint probability density function f of two independent
variables x and y defined on a domain D, if R is some subregion of
D, then the probability that (x, y) is in R is given by
\[
\iint_R f(x, y)\,dA.
\]

Exercises
1. The masses mi are located at the points Pi . Find the center of mass of
the system.
m1 = 1, m2 = 5, m3 = 9.
P1 = (−4, 6), P2 = (−6, 7), P3 = (−2, −4).
x̄=
ȳ=

2. Find the centroid (x̄, ȳ) of the triangle with vertices at (0, 0), (1, 0), and
(0, 3).
x̄=
ȳ=
3. Find the mass of the rectangular region 0 ≤ x ≤ 2, 0 ≤ y ≤ 4 with
density function ρ (x, y) = 4 − y.
4. Find the mass of the triangular region with vertices (0, 0), (3, 0), and (0,
1), with density function ρ(x, y) = x² + y².
5. A lamina occupies the region inside the circle x² + y² = 10y but outside
the circle x² + y² = 25. The density at each point is inversely proportional to
its distance from the origin.
Where is the center of mass?
( , )
6. A sprinkler distributes water in a circular pattern, supplying water to a
depth of e^{−r} feet per hour at a distance of r feet from the sprinkler.
A. What is the total amount of water supplied per hour inside of a circle
of radius 7?
ft³ per hour
B. What is the total amount of water that goes through the sprinkler per
hour?
ft³ per hour
7. Let p be the joint density function such that p(x, y) = (1/36)xy in R, the
rectangle 0 ≤ x ≤ 6, 0 ≤ y ≤ 2, and p(x, y) = 0 outside R. Find the fraction
of the population satisfying the constraint x + y ≤ 8.
fraction =
8. A lamp has two bulbs, each of a type with an average lifetime of 12
hours. The probability density function for the lifetime of a bulb is
f(t) = (1/12) e^{−t/12}, t ≥ 0.
What is the probability that both of the bulbs will fail within 2 hours?
9. For the following two functions p(x, y), check whether p is a joint density
function. Assume p(x, y) = 0 outside the region R.
(a) p(x, y) = 1, where R is 1 ≤ x ≤ 1.5, −1 ≤ y ≤ −0.5.
p(x, y) (□ is a joint density function □ is not a joint density function)
(b) p(x, y) = 3, where R is 1 ≤ x ≤ 2, 2 ≤ y ≤ 5.
p(x, y) (□ is a joint density function □ is not a joint density function)
Then, for the region R given by 0 ≤ x ≤ 2, 0 ≤ y ≤ 3, what constant
function p(x, y) is a joint density function?
p(x, y) =
10. Let x and y have joint density function
\[
p(x, y) =
\begin{cases}
\frac{2}{3}(x + 2y) & \text{for } 0 \le x \le 1,\ 0 \le y \le 1, \\
0 & \text{otherwise.}
\end{cases}
\]
Find the probability that
(a) x > 1/2:
probability =
(b) x < 1/2 + y:
probability =
11. A triangular plate is bounded by the graphs of the equations y = 2x,
y = 4x, and y = 4. The plate’s density at (x, y) is given by δ(x, y) = 4xy² + 1,
measured in grams per square centimeter (and x and y are measured in
centimeters).

a. Set up an iterated integral whose value is the mass of the plate. Include
a labeled sketch of the region of integration. Why did you choose the
order of integration you did?
b. Determine the mass of the plate.

c. Determine the exact center of mass of the plate. Draw and label the
point you find on your sketch from (a).
d. What is the average density of the plate? Include units on your answer.

12. Let D be a half-disk lamina of radius 3 in quadrants IV and I, centered
at the origin as in Activity 11.4.2. Assume the density at point (x, y) is equal
to x.

a. Before doing any calculations, what do you expect the y-coordinate of
the center of mass to be? Why?
b. Set up iterated integral expressions which, if evaluated, will determine
the exact center of mass of the lamina.

c. Use appropriate technology to evaluate the integrals to find the center of
mass numerically.

13. Let x denote the time (in minutes) that a person spends waiting in a
checkout line at a grocery store and y the time (in minutes) that it takes to
check out. Suppose the joint probability density for x and y is
\[
f(x, y) = \frac{1}{8} e^{-x/4 - y/2}.
\]
a. What is the exact probability that a person spends between 0 and 5 minutes
waiting in line, and then between 0 and 5 minutes checking out?
b. Set up, but do not evaluate, an iterated integral whose value determines
the exact probability that a person spends at most 10 minutes total both
waiting in line and checking out at this grocery store.

c. Set up, but do not evaluate, an iterated integral expression whose value
determines the exact probability that a person spends at least 10 minutes
total both waiting in line and checking out, but not more than 20 minutes.

11.5 Double Integrals in Polar Coordinates

Motivating Questions

• What are the polar coordinates of a point in two-space?

• How do we convert between polar coordinates and rectangular
coordinates?

• What is the area element in polar coordinates?

• How do we convert a double integral in rectangular coordinates to a
double integral in polar coordinates?

While we have naturally defined double integrals in the rectangular
coordinate system, starting with domains that are rectangular regions, there are
many of these integrals that are difficult, if not impossible, to evaluate. For
example, consider the domain D that is the unit disk and f(x, y) = e^{−x² − y²}.
To integrate f over D, we would use the iterated integral
\[
\iint_D f(x, y)\,dA = \int_{x=-1}^{x=1} \int_{y=-\sqrt{1-x^2}}^{y=\sqrt{1-x^2}} e^{-x^2-y^2}\,dy\,dx.
\]
For this particular integral, regardless of the order of integration, we are
unable to find an antiderivative of the integrand; in addition, even if we were
able to find an antiderivative, the inner limits of integration involve relatively
complicated functions.
It is useful, therefore, to be able to translate to other coordinate systems
where the limits of integration and the evaluation of the integrals involved are
simpler. In this section we provide a quick discussion of one such system, polar
coordinates, and then introduce and investigate their ramifications for double
integrals. The rectangular coordinate system allows us to consider domains
and graphs relative to a rectangular grid. The polar coordinate system is an
alternate coordinate system that allows us to consider domains less suited to
rectangular coordinates, such as circles.

Preview Activity 11.5.1. The coordinates of a point determine its location.
In particular, the rectangular coordinates of a point P are given by an ordered
pair (x, y), where x is the (signed) distance from the y-axis to P and y is the
(signed) distance from the x-axis to P. In polar coordinates, we locate the
point by considering the distance the point lies from the origin, O = (0, 0),
and the angle the line segment from the origin to P forms with the positive
x-axis.

a. Determine the rectangular coordinates of the following points:

(a) The point P that lies 1 unit from the origin on the positive x-axis.
(b) The point Q that lies 2 units from the origin and such that OQ
makes an angle of π/2 with the positive x-axis.
(c) The point R that lies 3 units from the origin such that OR makes
an angle of 2π/3 with the positive x-axis.

b. Part (a) indicates that the two pieces of information completely deter-
mine the location of a point: either the traditional (x, y) coordinates, or
alternately, the distance r from the point to the origin along with the
angle θ that the line through the origin and the point makes with the
positive x-axis. We write “(r, θ)” to denote the point’s location in its
polar coordinate representation. Find polar coordinates for the points
with the given rectangular coordinates.
i. (0, −1) ii. (−2, 0) iii. (−1, 1)
c. For each of the following points whose coordinates are given in polar
form, determine the rectangular coordinates of the point.
i. (5, π/4) ii. (2, 5π/6) iii. (√3, 5π/3)

11.5.1 Polar Coordinates


The rectangular coordinate system is best suited for graphs and regions that
are naturally considered over a rectangular grid. The polar coordinate system
is an alternative that offers good options for functions and domains that have
more circular characteristics. A point P in rectangular coordinates that is
described by an ordered pair (x, y), where x is the displacement from P to
the y-axis and y is the displacement from P to the x-axis, as seen in Preview
Activity 11.5.1, can also be described with polar coordinates (r, θ), where r is
the distance from P to the origin and θ is the angle formed by the line segment
OP and the positive x-axis, as shown at left in Figure 11.5.1.


Figure 11.5.1: The polar coordinates of a point and the polar coordinate
grid.

Trigonometry and the Pythagorean Theorem allow for straightforward
conversion from rectangular to polar, and vice versa.

Converting between rectangular and polar coordinates.

• If we are given the rectangular coordinates (x, y) of a point P,
then the polar coordinates (r, θ) of P satisfy
\[
r = \sqrt{x^2 + y^2} \quad\text{and}\quad \tan(\theta) = \frac{y}{x}, \text{ assuming } x \neq 0.
\]

• If we are given the polar coordinates (r, θ) of a point P, then the
rectangular coordinates (x, y) of P satisfy

x = r cos(θ) and y = r sin(θ).

Note: The angle θ in the polar coordinates of a point is not unique. We
could replace θ with θ + 2π and still be at the same terminal point. In addition,
the value of tan(θ) does not uniquely determine the quadrant in which θ lies, so
we have to determine the value of θ from the location of the point. In other
words, more care must be taken when using polar coordinates than rectangular
coordinates.
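The quadrant issue in the note can be seen concretely in code. The following sketch uses Python's standard `math.hypot` and `math.atan2`; the function names `to_polar` and `to_rectangular` are chosen here for illustration. Using `atan2(y, x)` rather than `atan(y / x)` takes the signs of both coordinates into account, which is one common way to pick the angle from the location of the point. Choosing 0 ≤ θ < 2π is a convention assumed for this example.

```python
import math


def to_polar(x, y):
    """Return (r, theta) with r >= 0 and 0 <= theta < 2*pi."""
    r = math.hypot(x, y)          # sqrt(x^2 + y^2)
    theta = math.atan2(y, x)      # quadrant-aware angle in (-pi, pi]
    if theta < 0:
        theta += 2 * math.pi      # shift into [0, 2*pi)
    return r, theta


def to_rectangular(r, theta):
    """Return (x, y) = (r cos(theta), r sin(theta))."""
    return r * math.cos(theta), r * math.sin(theta)


# (1, 1) and (-1, -1) both have tan(theta) = 1 but lie in different quadrants.
print(to_polar(1, 1))
print(to_polar(-1, -1))
print(to_rectangular(*to_polar(-1, -1)))
```

The two points share the same value of y/x, yet the angles returned are π/4 and 5π/4. Converting the second result back recovers (−1, −1) up to rounding.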
We can draw graphs of curves in polar coordinates in a similar way to how
we do in rectangular coordinates. However, when plotting in polar coordinates,
we use a grid that considers changes in angles and changes in distance from
the origin. In particular, the angles θ and distances r partition the plane into
small wedges as shown at right in Figure 11.5.1.
Activity 11.5.2. Most polar graphing devices can plot curves in polar coor-
dinates of the form r = f (θ). Use such a device to complete this activity.
a. Before plotting the polar curve r = 1 (where θ can have any value), think
about what shape it should have, in light of how r is connected to x and
y. Then use appropriate technology to draw the graph and test your
intuition.
b. The equation θ = 1 does not define r as a function of θ, so we can’t graph
this equation on many polar plotters. What do you think the graph of
the polar curve θ = 1 looks like? Why?
c. Before plotting the polar curve r = θ, what do you think the graph looks
like? Why? Use technology to plot the curve and compare your intuition.
d. What does the region defined by 1 ≤ r ≤ 3 (where θ can have any value)
look like? (Hint: Compare to your response from part (a).)
e. What does the region defined by 1 ≤ r ≤ 3 and π/4 ≤ θ ≤ π/2 look like?
f. Consider the curve r = sin(θ). For some values of θ we will have r < 0.
In these situations, we plot the point (r, θ) as (|r|, θ + π) (in other words,
when r < 0, we reflect the point through the origin). With that in mind,
what do you think the graph of r = sin(θ) looks like? Plot this curve
using technology and compare to your intuition.

11.5.2 Integrating in Polar Coordinates


Consider the double integral
\[
\iint_D e^{x^2 + y^2}\,dA,
\]
where D is the unit disk. While we cannot directly evaluate this integral in
rectangular coordinates, a change to polar coordinates will convert it to one
we can easily evaluate.
We have seen how to evaluate a double integral ∬_D f(x, y) dA as an
iterated integral of the form
\[
\int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx
\]
in rectangular coordinates, because we know that dA = dy dx in rectangular
coordinates. To make the change to polar coordinates, we not only need to
represent the variables x and y in polar coordinates, but we also must
understand how to write the area element, dA, in polar coordinates. That is, we
must determine how the area element dA can be written in terms of dr and dθ
in the context of polar coordinates. We address this question in the following
activity.


Figure 11.5.2: Left: A polar rectangle. Right: An annulus.

Activity 11.5.3. Consider a polar rectangle R, with r between r_i and r_{i+1}
and θ between θ_j and θ_{j+1} as shown at left in Figure 11.5.2. Let
∆r = r_{i+1} − r_i and ∆θ = θ_{j+1} − θ_j. Let ∆A be the area of this region.

a. Explain why the area ∆A in polar coordinates is not ∆r ∆θ.

b. Now find ∆A by the following steps:

i. Find the area of the annulus (the washer-like region) between r_i and
r_{i+1}, as shown at right in Figure 11.5.2. This area will be in terms
of r_i and r_{i+1}.
ii. Observe that the region R is only a portion of the annulus, so the
area ∆A of R is only a fraction of the area of the annulus. For
instance, if θ_{j+1} − θ_j were π/4, then the resulting wedge would be
\[
\frac{\pi/4}{2\pi} = \frac{1}{8}
\]
of the entire annulus. In this more general context, using the wedge
between the two noted angles, what fraction of the area of the
annulus is the area ∆A?
iii. Write an expression for ∆A in terms of r_i, r_{i+1}, θ_j, and θ_{j+1}.
iv. Finally, write the area ∆A in terms of r_i, r_{i+1}, ∆r, and ∆θ, where
each quantity appears only once in the expression. (Hint: Think
about how to factor a difference of squares.)
c. As we take the limit as ∆r and ∆θ go to 0, ∆r becomes dr, ∆θ becomes
dθ, and ∆A becomes dA, the area element. Using your work in (iv),
write dA in terms of r, dr, and dθ.
From the result of Activity 11.5.3, we see that when we convert an integral from
rectangular coordinates to polar coordinates, we must not only express x and
y in terms of r and θ, but we also have to change the area element
to dA = r dr dθ in polar coordinates. As we saw in Activity 11.5.3, the
additional factor of r in the polar area element is due to the fact that
in polar coordinates, the cross sectional area element increases as r increases,
while the cross sectional area element in rectangular coordinates is constant.
So, given a double integral ∬_D f(x, y) dA in rectangular coordinates, to write a
corresponding iterated integral in polar coordinates, we replace x with r cos(θ),
y with r sin(θ) and dA with r dr dθ. Of course, we need to describe the region
D in polar coordinates as well. To summarize:
Double integrals in polar coordinates.
The double integral ∬_D f(x, y) dA in rectangular coordinates
can be converted to a double integral in polar coordinates as
\[
\iint_D f(r\cos(\theta), r\sin(\theta))\,r\,dr\,d\theta.
\]
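One way to see where the factor of r comes from is to follow the steps outlined in Activity 11.5.3. The computation below is a sketch of that reasoning, using the notation r_i, r_{i+1}, ∆r, and ∆θ from the activity: the polar rectangle is the fraction ∆θ/(2π) of the annulus between the two radii.

```latex
\begin{align*}
\Delta A &= \frac{\Delta\theta}{2\pi}\left(\pi r_{i+1}^2 - \pi r_i^2\right) \\
         &= \frac{1}{2}\left(r_{i+1} + r_i\right)\left(r_{i+1} - r_i\right)\Delta\theta \\
         &= \frac{r_i + r_{i+1}}{2}\,\Delta r\,\Delta\theta .
\end{align*}
```

As ∆r and ∆θ go to 0, the average radius (r_i + r_{i+1})/2 approaches r, which is consistent with the area element dA = r dr dθ.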

Example 11.5.3. Let f(x, y) = e^{x² + y²} on the disk D = {(x, y) : x² + y² ≤ 1}.
We will evaluate ∬_D f(x, y) dA.
In rectangular coordinates the double integral ∬_D f(x, y) dA can be written
as the iterated integral
\[
\iint_D f(x, y)\,dA = \int_{x=-1}^{x=1} \int_{y=-\sqrt{1-x^2}}^{y=\sqrt{1-x^2}} e^{x^2+y^2}\,dy\,dx.
\]
We cannot evaluate this iterated integral, because e^{x² + y²} does not have
an elementary antiderivative with respect to either x or y. However, since
r² = x² + y² and the region D is circular, it is natural to wonder whether
converting to polar coordinates will allow us to evaluate the new integral. To
do so, we replace x with r cos(θ), y with r sin(θ), and dy dx with r dr dθ to
obtain
\[
\iint_D f(x, y)\,dA = \iint_D e^{r^2}\,r\,dr\,d\theta.
\]
The disk D is described in polar coordinates by the constraints 0 ≤ r ≤ 1
and 0 ≤ θ ≤ 2π. Therefore, it follows that
\[
\iint_D e^{r^2}\,r\,dr\,d\theta = \int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=1} e^{r^2}\,r\,dr\,d\theta.
\]
We can evaluate the resulting iterated polar integral as follows:
\begin{align*}
\int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=1} e^{r^2}\,r\,dr\,d\theta
  &= \int_{\theta=0}^{\theta=2\pi} \left( \frac{1}{2} e^{r^2} \bigg|_{r=0}^{r=1} \right) d\theta \\
  &= \frac{1}{2} \int_{\theta=0}^{\theta=2\pi} (e - 1)\,d\theta \\
  &= \frac{1}{2} (e - 1) \big[\theta\big]_{\theta=0}^{\theta=2\pi} \\
  &= \pi(e - 1).
\end{align*}
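The value π(e − 1) can be sanity-checked numerically. The sketch below is one possible check, not part of the original example: `polar_integral` is a name assumed here for a midpoint Riemann sum over a polar rectangle, and it includes the factor r from the area element dA = r dr dθ explicitly.

```python
import math


def polar_integral(f, r0, r1, t0, t1, n_r=400, n_t=400):
    """Midpoint approximation of the integral of f(x, y) over the polar
    rectangle r0 <= r <= r1, t0 <= theta <= t1, using dA = r dr dtheta."""
    dr = (r1 - r0) / n_r
    dt = (t1 - t0) / n_t
    total = 0.0
    for i in range(n_r):
        r = r0 + (i + 0.5) * dr
        for j in range(n_t):
            t = t0 + (j + 0.5) * dt
            x, y = r * math.cos(t), r * math.sin(t)
            total += f(x, y) * r      # extra factor r from the area element
    return total * dr * dt


# Integrate e^{x^2 + y^2} over the unit disk: 0 <= r <= 1, 0 <= theta <= 2*pi.
approx = polar_integral(lambda x, y: math.exp(x * x + y * y),
                        0.0, 1.0, 0.0, 2.0 * math.pi)
exact = math.pi * (math.e - 1.0)
print(approx, exact)
```

The two printed numbers should agree to several decimal places. Dropping the factor `r` in the loop gives a noticeably different value, which is a quick way to see that the area element matters.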
While there is no firm rule for when polar coordinates can or should be
used, they are a natural alternative anytime the domain of integration may be
expressed simply in polar form, and/or when the integrand involves expressions
such as √(x² + y²).
Activity 11.5.4. Let f(x, y) = x + y and D = {(x, y) : x² + y² ≤ 4}.
a. Sketch the region D and then write the double integral of f over D as
an iterated integral in rectangular coordinates.
b. Write the double integral of f over D as an iterated integral in polar
coordinates.
c. Evaluate one of the iterated integrals. Why is the final value you found
not surprising?
Activity 11.5.5. Consider the circle given by x² + (y − 1)² = 1 as shown in
Figure 11.5.4.

Figure 11.5.4: The graphs of y = x and x² + (y − 1)² = 1, for use in
Activity 11.5.5.

a. Determine a polar curve in the form r = f(θ) that traces out the circle
x² + (y − 1)² = 1. (Hint: Recall that a circle centered at the origin of
radius r can be described by the equations x = r cos(θ) and y = r sin(θ).)
b. Find the exact average value of g(x, y) = √(x² + y²) over the interior of
the circle x² + (y − 1)² = 1.
c. Find the volume under the surface h(x, y) = x over the region D, where
D is the region bounded above by the line y = x and below by the circle
(this is the shaded region in Figure 11.5.4).
d. Explain why in both (b) and (c) it is advantageous to use polar
coordinates.

11.5.3 Summary

• The polar representation of a point P is the ordered pair (r, θ) where r
is the distance from the origin to P and θ is the angle the ray through
the origin and P makes with the positive x-axis.
• The polar coordinates r and θ of a point (x, y) in rectangular coordinates
satisfy
\[
r = \sqrt{x^2 + y^2} \quad\text{and}\quad \tan(\theta) = \frac{y}{x};
\]
the rectangular coordinates x and y of a point (r, θ) in polar coordinates
satisfy
x = r cos(θ) and y = r sin(θ).

• The area element dA in polar coordinates is determined by the area of a
slice of an annulus and is given by

dA = r dr dθ.

• To convert the double integral ∬_D f(x, y) dA to an iterated integral in
polar coordinates, we substitute r cos(θ) for x, r sin(θ) for y, and r dr dθ
for dA to obtain the iterated integral
\[
\iint_D f(r\cos(\theta), r\sin(\theta))\,r\,dr\,d\theta.
\]

Exercises
1. For each set of Polar coordinates, match the equivalent Cartesian coor-
dinates.

2. (a) The Cartesian coordinates of a point are (−1, − 3).
(i) Find polar coordinates (r, θ) of the point, where r > 0 and 0 ≤ θ < 2π.
r=
θ=
(ii) Find polar coordinates (r, θ) of the point, where r < 0 and 0 ≤ θ < 2π.
r=
θ=
(b) The Cartesian coordinates of a point are (−2, 3).
(i) Find polar coordinates (r, θ) of the point, where r > 0 and 0 ≤ θ < 2π.
r=
θ=
(ii) Find polar coordinates (r, θ) of the point, where r < 0 and 0 ≤ θ < 2π.
r=
θ=
3. (a) You are given the point (1, π/2) in polar coordinates.
(i) Find another pair of polar coordinates for this point such that r > 0
and 2π ≤ θ < 4π.
r=
θ=
(ii) Find another pair of polar coordinates for this point such that r < 0
and 0 ≤ θ < 2π.
r=
θ=
(b) You are given the point (−2, π/4) in polar coordinates.
(i) Find another pair of polar coordinates for this point such that r > 0
and 2π ≤ θ < 4π.
r=
θ=
(ii) Find another pair of polar coordinates for this point such that r < 0
and −2π ≤ θ < 0.
r=
θ=
(c) You are given the point (3, 2) in polar coordinates.
(i) Find another pair of polar coordinates for this point such that r > 0
and 2π ≤ θ < 4π.
r=
θ=
(ii) Find another pair of polar coordinates for this point such that r < 0
and 0 ≤ θ < 2π.
r=
θ=
4. Decide if the points given in polar coordinates are the same. If they are
the same, enter T . If they are different, enter F .
a.) (6, π/3), (−6, −π/3)
b.) (2, 75π/4), (2, −75π/4)
c.) (0, 6π), (0, 4 )

d.) (1, 61π


4 ), (−1, 4 )
π

e.) (2, 68π/3), (−2, −π/3)
f.) (6, 11π), (−6, 11π)


5. A curve with polar equation
\[
r = \frac{37}{7\sin\theta + 43\cos\theta}
\]
represents a line. Write this line in the given Cartesian form.
y =
Note: Your answer should be a function of x.
6. Find a polar equation of the form r = f(θ) for the curve represented by
the Cartesian equation x = −y².
Note: Since θ is not a symbol on your keyboard, use t in place of θ in your
answer.
r=
7. By changing to polar coordinates, evaluate the integral
\[
\iint_D (x^2 + y^2)^{7/2}\,dx\,dy
\]
where D is the disk x² + y² ≤ 49.
Answer =
8. Convert the integral

Z 2 Z x
dy dx
0 −x

to polar coordinates and evaluate it (use t for θ):


With a = ,b= ,c=

and d√= ,
R 2Rx RbRd
0 −x
dy dx = a c
dr dt
Rb
= a
dt
b

=

= .a
9. For each of the following, set up the integral of an arbitrary function
f(x, y) over the region in whichever of rectangular or polar coordinates is most
appropriate. (Use t for θ in your expressions.)
(a) The region

With a = , b = ,
c = , and d = ,
integral = ∫_a^b ∫_c^d ____ d__ d__
(b) The region

With a = , b = ,
c = , and d = ,
integral = ∫_a^b ∫_c^d ____ d__ d__
10. A Cartesian equation for the polar equation r = 3 can be written as:
x² + y² =
11. Using polar coordinates, evaluate the integral which gives the area which
lies in the first quadrant between the circles x² + y² = 64 and x² − 8x + y² = 0.
12. (a) Graph r = 1/(5 cos θ) for −π/2 ≤ θ ≤ π/2 and r = 1. Then write
an iterated integral in polar coordinates representing the area inside the curve
r = 1 and to the right of r = 1/(5 cos θ). (Use t for θ in your work.)
With a = , b = ,
c = , and d = ,
area = ∫_a^b ∫_c^d ____ d__ d__
(b) Evaluate your integral to find the area.
area =
13. Using polar coordinates, evaluate the integral
\[
\iint_R \sin(x^2 + y^2)\,dA
\]
where R is the region 9 ≤ x² + y² ≤ 81.
14. Sketch the region of integration for the following integral.
\[
\int_0^{\pi/4} \int_0^{5/\cos(\theta)} f(r, \theta)\,r\,dr\,d\theta
\]
The region of integration is bounded by

y = 0, x = √(25 − y²), and y = 5

y = 0, y = 25 − x², and x = 5

y = 0, y = x, and y = 5

y = 0, y = x, and x = 5

None of the above

15. Use polar coordinates to find the volume of a sphere of radius 8.
16. Consider the solid under the graph of z = e^{−x² − y²} above the disk
x² + y² ≤ a², where a > 0.
(a) Set up the integral to find the volume of the solid.
Instructions: Please enter the integrand in the first answer box, typing
theta for θ. Depending on the order of integration you choose, enter dr and
dtheta in either order into the second and third answer boxes with only one dr
or dtheta in each box. Then, enter the limits of integration.
∫_A^B ∫_C^D ____ ____ ____
A =
B =
C =
D =
(b) Evaluate the integral and find the volume. Your answer will be in terms
of a.
Volume V =
(c) What does the volume approach as a → ∞?
lim_{a→∞} V =

17. Consider the iterated integral
\[
I = \int_{-3}^{0} \int_{-\sqrt{9-y^2}}^{0} \frac{y}{x^2 + y^2 + 1}\,dx\,dy.
\]

a. Sketch (and label) the region of integration.

b. Convert the given iterated integral to one in polar coordinates.

c. Evaluate the iterated integral in (b).

d. State one possible interpretation of the value you found in (c).

18. Let D be the region that lies inside the unit circle in the plane.

a. Set up and evaluate an iterated integral in polar coordinates whose value
is the area of D.
b. Determine the exact average value of f (x, y) = y over the upper half of
D.

c. Find the exact center of mass of the lamina over the portion of D that
lies in the first quadrant and has its mass density distribution given by
δ(x, y) = 1. (Before making any calculations, where do you expect the
center of mass to lie? Why?)
d. Find the exact volume of the solid that lies under the surface
z = 8 − x² − y² and over the unit disk, D.

19. For each of the following iterated integrals,


• sketch and label the region of integration,
• convert the integral to the other coordinate system (if given in polar, to
rectangular; if given in rectangular, to polar), and

• choose one of the two iterated integrals to evaluate exactly.

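a. \( \int_{\pi}^{3\pi/2} \int_0^3 r^3\,dr\,d\theta \)

b. \( \int_0^2 \int_{-\sqrt{1-(x-1)^2}}^{\sqrt{1-(x-1)^2}} \sqrt{x^2 + y^2}\,dy\,dx \)

c. \( \int_0^{\pi/2} \int_0^{\sin(\theta)} r\sqrt{1 - r^2}\,dr\,d\theta \)

d. \( \int_0^{\sqrt{2}/2} \int_y^{\sqrt{1-y^2}} \cos(x^2 + y^2)\,dx\,dy \)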

11.6 Surfaces Defined Parametrically and Surface Area

Motivating Questions

• What is a parameterization of a surface?


• How do we find the surface area of a parametrically defined surface?

We have now studied at length how curves in space can be defined
parametrically by functions of the form r(t) = ⟨x(t), y(t), z(t)⟩, and surfaces can be
represented by functions z = f(x, y). In what follows, we will see how we can
also define surfaces parametrically. A one-dimensional curve in space results
from a vector function that relies upon one parameter, so a two-dimensional
surface naturally involves the use of two parameters. If x = x(s, t), y = y(s, t),
and z = z(s, t) are functions of independent parameters s and t, then the
terminal points of all vectors of the form

r(s, t) = x(s, t)i + y(s, t)j + z(s, t)k

form a surface in space. The equations x = x(s, t), y = y(s, t), and z = z(s, t)
are the parametric equations for the surface, or a parametrization of the surface.
In Preview Activity 11.6.1 we investigate how to parameterize a cylinder and
a cone.
Preview Activity 11.6.1. Recall the standard parameterization of the unit
circle that is given by

x(t) = cos(t) and y(t) = sin(t),

where 0 ≤ t ≤ 2π.
a. Determine a parameterization of the circle of radius 1 in R³ that has its
center at (0, 0, 1) and lies in the plane z = 1.
b. Determine a parameterization of the circle of radius 1 in 3-space that has
its center at (0, 0, −1) and lies in the plane z = −1.
c. Determine a parameterization of the circle of radius 1 in 3-space that has
its center at (0, 0, 5) and lies in the plane z = 5.
d. Taking into account your responses in (a), (b), and (c), describe the
graph that results from the set of parametric equations

x(s, t) = cos(t), y(s, t) = sin(t), and z(s, t) = s,

where 0 ≤ t ≤ 2π and −5 ≤ s ≤ 5. Explain your thinking.


e. Just as a cylinder can be viewed as a “stack” of circles of constant radius,
a cone can be viewed as a stack of circles with varying radius. Modify the
parametrizations of the circles above in order to construct the
parameterization of a cone whose vertex lies at the origin, whose base radius
is 4, and whose height is 3, where the base of the cone lies in the plane
z = 3. Use appropriate technology to plot the parametric equations you
develop. (Hint: The cross sections parallel to the xy-plane are circles,
with the radii varying linearly as z increases.)

11.6.1 Parametric Surfaces

In a single-variable setting, any function may have its graph expressed
parametrically. For instance, given y = g(x), by considering the parameterization
⟨t, g(t)⟩ (where t belongs to the domain of g), we generate the same curve.
What is more important is that certain curves that are not functions may
be represented parametrically; for instance, the circle (which cannot be
represented by a single function) can be parameterized by ⟨cos(t), sin(t)⟩, where
0 ≤ t ≤ 2π.
In the same way, in a two-variable setting, the surface z = f(x, y) may be
expressed parametrically by considering

⟨x(s, t), y(s, t), z(s, t)⟩ = ⟨s, t, f(s, t)⟩,

where (s, t) varies over the entire domain of f . Therefore, any familiar surface
that we have studied so far can be generated as a parametric surface. But what
is more powerful is that there are surfaces that cannot be generated by a single
function z = f (x, y) (such as the unit sphere), but that can be represented
parametrically. We now consider an important example.

Example 11.6.1. Consider the torus (or doughnut) shown in Figure 11.6.2.

Figure 11.6.2: A torus.

To find a parametrization of this torus, we recall our work in Preview
Activity 11.6.1. There, we saw that a circle of radius r that has its center at
the point (0, 0, z0 ) and is contained in the horizontal plane z = z0 , as shown in
Figure 11.6.3, can be parametrized using the vector-valued function r defined
by

r(t) = r cos(t)i + r sin(t)j + z0 k

where 0 ≤ t ≤ 2π.
250 CHAPTER 11. MULTIPLE INTEGRALS

Figure 11.6.3: A circle in a horizontal plane centered at (0, 0, z0 ).

To obtain the torus in Figure 11.6.2, we begin with a circle of radius a
in the xz-plane centered at (b, 0), as shown on the left of Figure 11.6.4. We
may parametrize the points on this circle, using the parameter s, by using the
equations
x(s) = b + a cos(s) and z(s) = a sin(s),
where 0 ≤ s ≤ 2π.

Figure 11.6.4: Revolving a circle to obtain a torus.

Let’s focus our attention on one point on this circle, such as the indicated
point, which has coordinates (x(s), 0, z(s)) for a fixed value of the parameter
s. When this point is revolved about the z-axis, we obtain a circle contained
in a horizontal plane centered at (0, 0, z(s)) and having radius x(s), as shown
on the right of Figure 11.6.4. If we let t be the new parameter that generates
the circle for the rotation about the z-axis, this circle may be parametrized by

r(s, t) = x(s) cos(t)i + x(s) sin(t)j + z(s)k.

Now using our earlier parametric equations for x(s) and z(s) for the original
smaller circle, we have an overall parameterization of the torus given by

r(s, t) = (b + a cos(s)) cos(t)i + (b + a cos(s)) sin(t)j + a sin(s)k.

To trace out the entire torus, we require that the parameters vary through
the values 0 ≤ s ≤ 2π and 0 ≤ t ≤ 2π.
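
As a quick sanity check on this parameterization, the following sketch samples points r(s, t) and confirms that each one satisfies the implicit equation (√(x² + y²) − b)² + z² = a², which says that the point lies at distance a from the circle of radius b in the xy-plane. The values a = 1 and b = 3, the grid size, and the function names are illustrative assumptions, not part of the example.

```python
import math

def torus_point(s, t, a=1.0, b=3.0):
    """Point r(s, t) on the torus; a and b are illustrative values with b > a."""
    w = b + a * math.cos(s)          # x(s): distance of the point from the z-axis
    return (w * math.cos(t), w * math.sin(t), a * math.sin(s))

def torus_residual(point, a=1.0, b=3.0):
    """Amount by which a point fails (sqrt(x^2 + y^2) - b)^2 + z^2 = a^2."""
    x, y, z = point
    return (math.hypot(x, y) - b) ** 2 + z ** 2 - a ** 2

# Sample a grid of parameter values in [0, 2*pi] x [0, 2*pi].
n = 12
samples = [torus_point(2 * math.pi * i / n, 2 * math.pi * j / n)
           for i in range(n + 1) for j in range(n + 1)]
max_residual = max(abs(torus_residual(p)) for p in samples)
print(max_residual)   # should be at the level of rounding error
```

The same pair of functions can be reused to generate points for plotting with whatever graphing technology is at hand.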

Activity 11.6.2. In this activity, we seek a parametrization of the sphere of
radius R centered at the origin, as shown on the left in Figure 11.6.5. Notice
that this sphere may be obtained by revolving a half-circle contained in the
xz-plane about the z-axis, as shown on the right.

Figure 11.6.5: A sphere obtained by revolving a half-circle.

a. Begin by writing a parametrization of this half-circle using the parameter s:
x(s) = . . . and z(s) = . . . .
Be sure to state the domain of the parameter s.

b. By revolving the points on this half-circle about the z-axis, obtain a
parametrization r(s, t) of the points on the sphere of radius R. Be sure
to include the domain of both parameters s and t. (Hint: What is the
radius of the circle obtained when revolving a point on the half-circle
around the z-axis?)

c. Draw the surface defined by your parameterization with appropriate
technology.

11.6.2 The Surface Area of Parametrically Defined Surfaces

Recall that a differentiable function is locally linear; that is, if we zoom in
on the surface around a point, the surface looks like its tangent plane. We
now exploit this idea in order to determine the surface area generated by a
parametrization ⟨x(s, t), y(s, t), z(s, t)⟩. The basic idea is a familiar one: we
will subdivide the surface into small pieces, in the approximate shape of small
parallelograms, and thus estimate the entire surface area by adding the
areas of these approximating parallelograms. Ultimately, we use an integral to
sum these approximations and determine the exact surface area.
Let
r(s, t) = x(s, t)i + y(s, t)j + z(s, t)k

define a surface over a rectangular domain a ≤ s ≤ b and c ≤ t ≤ d. Since r is
a function of two variables, s and t, it is natural to consider the two partial
derivatives of the vector-valued function r, which we define by

\begin{align*}
\mathbf{r}_s(s, t) &= x_s(s, t)\mathbf{i} + y_s(s, t)\mathbf{j} + z_s(s, t)\mathbf{k}, \\
\mathbf{r}_t(s, t) &= x_t(s, t)\mathbf{i} + y_t(s, t)\mathbf{j} + z_t(s, t)\mathbf{k}.
\end{align*}

In the usual way, we slice the domain into small rectangles. In particular,
we partition the interval [a, b] into m subintervals of equal length
$\Delta s = \frac{b-a}{m}$ and let $s_0, s_1, \ldots, s_m$ be the endpoints of these
subintervals, where $a = s_0 < s_1 < s_2 < \cdots < s_m = b$. Also partition the
interval [c, d] into n subintervals of equal length $\Delta t = \frac{d-c}{n}$ and let
$t_0, t_1, \ldots, t_n$ be the endpoints of these subintervals, where
$c = t_0 < t_1 < t_2 < \cdots < t_n = d$. These two partitions create a partition
of the rectangle R = [a, b] × [c, d] in st-coordinates into mn sub-rectangles $R_{ij}$
with opposite vertices $(s_{i-1}, t_{j-1})$ and $(s_i, t_j)$ for i between 1 and m and j
between 1 and n. These rectangles all have equal area $\Delta A = \Delta s \cdot \Delta t$.
Now we want to think about the small piece of area on the surface itself
that lies above one of these small rectangles in the domain. Observe that if we
increase s by a small amount $\Delta s$ from the point $(s_{i-1}, t_{j-1})$ in the domain, then
r changes by approximately $\mathbf{r}_s(s_{i-1}, t_{j-1})\Delta s$. Similarly, if we increase t by a
small amount $\Delta t$ from the point $(s_{i-1}, t_{j-1})$, then r changes by approximately
$\mathbf{r}_t(s_{i-1}, t_{j-1})\Delta t$. So we can approximate the surface defined by r on the
st-rectangle $[s_{i-1}, s_i] \times [t_{j-1}, t_j]$ with the parallelogram determined by the vectors
$\mathbf{r}_s(s_{i-1}, t_{j-1})\Delta s$ and $\mathbf{r}_t(s_{i-1}, t_{j-1})\Delta t$, as seen in Figure 11.6.6.

Figure 11.6.6: Approximating surface area with a parallelogram (sides $\mathbf{r}_s \Delta s$ and $\mathbf{r}_t \Delta t$).

Say that the small parallelogram has area $S_{ij}$. If we can find its area, then
all that remains is to sum the areas of all of the generated parallelograms and
take a limit. Recall from our earlier work in the course that given two vectors
u and v, the area of the parallelogram spanned by u and v is given by the
magnitude of their cross product, |u × v|. In the present context, it follows that
the area, $S_{ij}$, of the parallelogram determined by the vectors $\mathbf{r}_s(s_{i-1}, t_{j-1})\Delta s$
and $\mathbf{r}_t(s_{i-1}, t_{j-1})\Delta t$ is

\begin{align*}
S_{ij} &= |(\mathbf{r}_s(s_{i-1}, t_{j-1})\Delta s) \times (\mathbf{r}_t(s_{i-1}, t_{j-1})\Delta t)| \\
       &= |\mathbf{r}_s(s_{i-1}, t_{j-1}) \times \mathbf{r}_t(s_{i-1}, t_{j-1})| \, \Delta s \, \Delta t, \tag{11.6.1}
\end{align*}

where the latter equality holds from standard properties of the cross product
and length.
We sum the surface area approximations from Equation (11.6.1) over all

sub-rectangles to obtain an estimate for the total surface area, S, given by

\[ S \approx \sum_{i=1}^{m} \sum_{j=1}^{n} |\mathbf{r}_s(s_{i-1}, t_{j-1}) \times \mathbf{r}_t(s_{i-1}, t_{j-1})| \, \Delta s \, \Delta t. \]
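
The approximating sum above can be computed directly. The sketch below is one possible implementation, not a prescribed method: it samples the partial derivatives at the corner (s_{i−1}, t_{j−1}) of each sub-rectangle, exactly as in the sum, and adds the parallelogram areas. The function names are hypothetical, and the surface used is the torus of Example 11.6.1 with the illustrative values a = 1 and b = 3, whose partial derivatives were computed by hand. The second printed value is the known closed-form area 4π²ab of a torus, given for comparison.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    """Length of a 3-vector."""
    return math.sqrt(sum(c * c for c in u))

def surface_area_sum(r_s, r_t, s_range, t_range, m, n):
    """Sum |r_s x r_t| ds dt over an m-by-n grid, sampling at (s_{i-1}, t_{j-1})."""
    (a, b), (c, d) = s_range, t_range
    ds, dt = (b - a) / m, (d - c) / n
    total = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s, t = a + (i - 1) * ds, c + (j - 1) * dt
            total += norm(cross(r_s(s, t), r_t(s, t))) * ds * dt
    return total

# Torus from Example 11.6.1 with illustrative radii (assumed values).
A, B = 1.0, 3.0

def r_s(s, t):
    return (-A * math.sin(s) * math.cos(t),
            -A * math.sin(s) * math.sin(t),
            A * math.cos(s))

def r_t(s, t):
    w = B + A * math.cos(s)
    return (-w * math.sin(t), w * math.cos(t), 0.0)

approx = surface_area_sum(r_s, r_t, (0, 2 * math.pi), (0, 2 * math.pi), 60, 60)
print(approx, 4 * math.pi ** 2 * A * B)
```

Passing the partial derivatives as functions keeps the summation routine independent of any particular surface; a different parameterization only requires new `r_s` and `r_t`.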

Taking the limit as m, n → ∞ shows that the surface area of the surface
defined by r over the domain D is given as follows.
Surface area.
Let r(s, t) = ⟨x(s, t), y(s, t), z(s, t)⟩ be a parameterization of a smooth
surface over a domain D. The area of the surface defined by r on D is
given by
\[ S = \iint_D |\mathbf{r}_s \times \mathbf{r}_t| \, dA. \tag{11.6.2} \]
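
As an illustration of Equation (11.6.2), here is a worked sketch (not part of the original example) for the torus of Example 11.6.1. It assumes b > a > 0, so that b + a cos(s) is positive and the absolute value can be dropped.

```latex
\begin{align*}
\mathbf{r}_s &= \langle -a\sin(s)\cos(t),\ -a\sin(s)\sin(t),\ a\cos(s) \rangle, \\
\mathbf{r}_t &= \langle -(b + a\cos(s))\sin(t),\ (b + a\cos(s))\cos(t),\ 0 \rangle, \\
\mathbf{r}_s \times \mathbf{r}_t &= -a(b + a\cos(s))\,\langle \cos(s)\cos(t),\ \cos(s)\sin(t),\ \sin(s) \rangle, \\
|\mathbf{r}_s \times \mathbf{r}_t| &= a(b + a\cos(s)), \\
S &= \int_0^{2\pi}\!\!\int_0^{2\pi} a(b + a\cos(s)) \, ds \, dt = 2\pi \cdot 2\pi a b = 4\pi^2 ab.
\end{align*}
```

The vector in the third line is a unit vector, which is why the magnitude reduces to the scalar factor in front of it.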

Activity 11.6.3. Consider the cylinder with radius a and height h defined
parametrically by
r(s, t) = a cos(s)i + a sin(s)j + tk
for 0 ≤ s ≤ 2π and 0 ≤ t ≤ h, as shown in Figure 11.6.7.

Figure 11.6.7: A cylinder.

a. Set up an iterated integral to determine the surface area of this cylinder.

b. Evaluate the iterated integral.

c. Recall that one way to think about the surface area of a cylinder is to
cut the cylinder horizontally and find the perimeter of the resulting
cross-sectional circle, then multiply by the height. Calculate the surface area
of the given cylinder using this alternate approach, and compare with your
work in (b).

As we noted earlier, we can take any surface z = f(x, y) and generate a
corresponding parameterization for the surface by writing ⟨s, t, f(s, t)⟩. Hence,
we can use our recent work with parametrically defined surfaces to find the
surface area that is generated by a function f = f (x, y) over a given domain.

Activity 11.6.4. Let z = f (x, y) define a smooth surface, and consider the
corresponding parameterization r(s, t) = ⟨s, t, f(s, t)⟩.

a. Let D be a region in the domain of f. Using Equation (11.6.2), show
that the area, S, of the surface defined by the graph of f over D is
\[ S = \iint_D \sqrt{(f_x(x, y))^2 + (f_y(x, y))^2 + 1} \, dA. \]

b. Use the formula developed in (a) to calculate the area of the surface
defined by $f(x, y) = \sqrt{4 - x^2}$ over the rectangle D = [−2, 2] × [0, 3].

c. Observe that the surface described in (b) is half of a circular
cylinder. Use the standard formula for the surface area of a cylinder to
calculate the surface area in a different way, and compare with your result
from (b).

11.6.3 Summary

• A parameterization of a curve describes the coordinates of a point on
the curve in terms of a single parameter t, while a parameterization of a
surface describes the coordinates of points on the surface in terms of two
independent parameters.

• If r(s, t) = ⟨x(s, t), y(s, t), z(s, t)⟩ describes a smooth surface in 3-space
on a domain D, then the area, S, of that surface is given by
\[ S = \iint_D |\mathbf{r}_s \times \mathbf{r}_t| \, dA. \]

Exercises
1. Consider the cone shown below.

If the height of the cone is 5 and the base radius is 2, write a parameteri-
zation of the cone in terms of r = s and θ = t.
x(s, t) = ,
y(s, t) = , and
z(s, t) = , with
≤s≤ and
≤t≤ .

2. Parameterize the plane through the point (−4, −5, −4) with the normal
vector ⟨3, 4, −3⟩.
r(s, t) =
(Use s and t for the parameters in your parameterization, and enter your
vector as a single vector, with angle brackets: e.g., as <1 + s + t, s - t, 3 - t>.)

3. Parameterize a vase formed by rotating the curve z = 6 x − 1, 1 ≤ x ≤ 5,
around the z-axis. Use s and t for your parameters.
x(s, t) = ,
y(s, t) = , and
z(s, t) = , with
≤s≤ and
≤t≤

4. Find parametric equations for the sphere centered at the origin and with
radius 5. Use the parameters s and t in your answer.
x(s, t) = ,
y(s, t) = , and
z(s, t) = , where
≤s≤ and
≤t≤ .

5. Find the surface area of that part of the plane 6x + 2y + z = 9 that lies
inside the elliptic cylinder $\frac{x^2}{25} + \frac{y^2}{100} = 1$.
Surface Area =

6. Find the surface area of the part of the circular paraboloid $z = x^2 + y^2$
that lies inside the cylinder $x^2 + y^2 = 9$.

7. Find the surface area of the part of the plane 4x + 2y + z = 2 that lies
inside the cylinder $x^2 + y^2 = 4$.

8. Write down the iterated integral which expresses the surface area of
$z = y^7 \cos^8 x$ over the triangle with vertices (−1, 1), (1, 1), (0, 2):

\[ \int_a^b \int_{f(y)}^{g(y)} \sqrt{h(x, y)} \, dx \, dy \]

a=
b=
f (y) =
g(y) =
h(x, y) =

9. A decorative oak post is 30 inches long and is turned on a lathe so that
its profile is sinusoidal as shown in the figure below.

In this figure, r0 = 6 inches and a0 = 10 inches.


(a) Describe the surface of the post parametrically using cylindrical coor-
dinates and the parameters s and t.
x(s, t) = ,
y(s, t) = , and
z(s, t) = , where
≤s≤ and
≤t≤ .
(b) Find the volume of the post.
volume =
(Include units.)
10. Consider the ellipsoid given by the equation

\[ \frac{x^2}{16} + \frac{y^2}{25} + \frac{z^2}{9} = 1. \]

In Activity 11.6.2, we found that a parameterization of the sphere S of
radius R centered at the origin is

x(s, t) = R cos(s) cos(t), y(s, t) = R cos(s) sin(t), and z(s, t) = R sin(s)

for −π/2 ≤ s ≤ π/2 and 0 ≤ t ≤ 2π.
a. Let (x, y, z) be a point on the ellipsoid and let X = x/4, Y = y/5, and Z = z/3.
Show that (X, Y, Z) lies on the sphere S with R = 1. Hence, find a parameterization
of S in terms of X, Y, and Z as functions of s and t.
b. Use the result of part (a) to find a parameterization of the ellipsoid in terms
of x, y, and z as functions of s and t. Check your parametrization by
substituting x, y, and z into the equation of the ellipsoid. Then check
your work by plotting the surface defined by your parameterization.

11. In this exercise, we explore how to use a parametrization and iterated
integral to determine the surface area of a sphere.

a. Set up an iterated integral whose value is the portion of the surface area of
a sphere of radius R that lies in the first octant (see the parameterization
you developed in Activity 11.6.2).
b. Then, evaluate the integral to calculate the surface area of this portion
of the sphere.

c. By what constant must you multiply the value determined in (b) in order
to find the total surface area of the entire sphere?
d. Finally, compare your result to the standard formula for the surface area
of a sphere.

12. Consider the plane generated by z = f(x, y) = 24 − 2x − 3y over the
region D = [0, 2] × [0, 3].
a. Sketch a picture of the overall solid generated by the plane over the given
domain.
b. Determine a parameterization r(s, t) for the plane over the domain D.

c. Use Equation (11.6.2) to determine the surface area generated by f over
the domain D.
d. Observe that the vector u = ⟨2, 0, −4⟩ points from (0, 0, 24) to (2, 0, 20)
along one side of the surface generated by the plane f over D. Find
the vector v such that u and v together span the parallelogram that
represents the surface defined by f over D, and hence compute |u × v|.
What do you observe about the value you find?

13. A cone with base radius a and height h can be realized as the surface
defined by $z = \frac{h}{a}\sqrt{x^2 + y^2}$, where a and h are positive.

a. Find a parameterization of the cone described by $z = \frac{h}{a}\sqrt{x^2 + y^2}$. (Hint:
Compare to the parameterization of a cylinder as seen in Activity 11.6.3.)

b. Set up an iterated integral to determine the surface area of this cone.

c. Evaluate the iterated integral to find a formula for the lateral surface
area of a cone of height h and base radius a.

11.7 Triple Integrals

Motivating Questions

• How are a triple Riemann sum and the corresponding triple integral of a
continuous function f = f (x, y, z) defined?
• What are two things the triple integral of a function can tell us?

We have now learned that we define the double integral of a continuous
function f = f(x, y) over a rectangle R = [a, b] × [c, d] as a limit of a double
Riemann sum, and that these ideas parallel the single-variable integral of a
function g = g(x) on an interval [a, b]. Moreover, this double integral has
natural interpretations and applications, and can even be considered over
non-rectangular regions, D. For instance, given a continuous function f over a
region D, the average value of f, $f_{\mathrm{AVG}(D)}$, is given by

\[ f_{\mathrm{AVG}(D)} = \frac{1}{A(D)} \iint_D f(x, y) \, dA, \]

where A(D) is the area of D. Likewise, if δ(x, y) describes a mass density
function on a lamina over D, the mass, M, of the lamina is given by

\[ M = \iint_D \delta(x, y) \, dA. \]

It is natural to wonder if it is possible to extend these ideas of double
Riemann sums and double integrals for functions of two variables to triple
Riemann sums and then triple integrals for functions of three variables. We
begin investigating in Preview Activity 11.7.1.
Preview Activity 11.7.1. Consider a solid piece of granite in the shape of
a box B = {(x, y, z) : 0 ≤ x ≤ 4, 0 ≤ y ≤ 6, 0 ≤ z ≤ 8}, whose density varies
from point to point. Let δ(x, y, z) represent the mass density of the piece of
granite at point (x, y, z) in kilograms per cubic meter (so we are measuring x,
y, and z in meters). Our goal is to find the mass of this solid.

Figure 11.7.1: A partitioned three-dimensional domain.

Recall that if the density was constant, we could find the mass by multi-
plying the density and volume; since the density varies from point to point, we
will use the approach we did with two-variable lamina problems, and slice the
solid into small pieces on which the density is roughly constant.
Partition the interval [0, 4] into 2 subintervals of equal length, the interval
[0, 6] into 3 subintervals of equal length, and the interval [0, 8] into 2 subin-
tervals of equal length. This partitions the box B into sub-boxes as shown in
Figure 11.7.1.

a. Let $0 = x_0 < x_1 < x_2 = 4$ be the endpoints of the subintervals of
[0, 4] after partitioning. Draw a picture of Figure 11.7.1 and label these
endpoints on your drawing. Do likewise with $0 = y_0 < y_1 < y_2 < y_3 = 6$
and $0 = z_0 < z_1 < z_2 = 8$. What is the length $\Delta x$ of each subinterval
$[x_{i-1}, x_i]$ for i from 1 to 2? What are the lengths $\Delta y$ and $\Delta z$?

b. The partitions of the intervals [0, 4], [0, 6] and [0, 8] partition the box B
into sub-boxes. How many sub-boxes are there? What is the volume ∆V of
each sub-box?

c. Let $B_{ijk}$ denote the sub-box $[x_{i-1}, x_i] \times [y_{j-1}, y_j] \times [z_{k-1}, z_k]$. Say that we
choose a point $(x^*_{ijk}, y^*_{ijk}, z^*_{ijk})$ in the i, j, kth sub-box for each possible
combination of i, j, k. What is the meaning of $\delta(x^*_{ijk}, y^*_{ijk}, z^*_{ijk})$? What
physical quantity will $\delta(x^*_{ijk}, y^*_{ijk}, z^*_{ijk})\Delta V$ approximate?

d. What final step(s) would it take to determine the exact mass of the piece
of granite?

11.7.1 Triple Riemann Sums and Triple Integrals


Through the application of a mass density distribution over a three-dimensional
solid, Preview Activity 11.7.1 suggests that the generalization from double
Riemann sums of functions of two variables to triple Riemann sums of functions
of three variables is natural. In the same way, so is the generalization from
double integrals to triple integrals. By simply adding a z-coordinate to our
earlier work, we can define both a triple Riemann sum and the corresponding
triple integral.

Definition 11.7.2. Let f = f(x, y, z) be a continuous function on a box
B = [a, b] × [c, d] × [r, s]. The triple Riemann sum of f over B is created
as follows.

• Partition the interval [a, b] into m subintervals of equal length
$\Delta x = \frac{b-a}{m}$. Let $x_0, x_1, \ldots, x_m$ be the endpoints of these subintervals, where
$a = x_0 < x_1 < x_2 < \cdots < x_m = b$. Do likewise with the interval [c, d]
using n subintervals of equal length $\Delta y = \frac{d-c}{n}$ to generate
$c = y_0 < y_1 < y_2 < \cdots < y_n = d$, and with the interval [r, s] using ℓ subintervals of
equal length $\Delta z = \frac{s-r}{\ell}$ to have $r = z_0 < z_1 < z_2 < \cdots < z_\ell = s$.

• Let $B_{ijk}$ be the sub-box of B with opposite vertices $(x_{i-1}, y_{j-1}, z_{k-1})$
and $(x_i, y_j, z_k)$ for i between 1 and m, j between 1 and n, and k between
1 and ℓ. The volume of each $B_{ijk}$ is $\Delta V = \Delta x \cdot \Delta y \cdot \Delta z$.

• Let $(x^*_{ijk}, y^*_{ijk}, z^*_{ijk})$ be a point in box $B_{ijk}$ for each i, j, and k. The
resulting triple Riemann sum for f on B is
\[ \sum_{i=1}^{m} \sum_{j=1}^{n} \sum_{k=1}^{\ell} f(x^*_{ijk}, y^*_{ijk}, z^*_{ijk}) \cdot \Delta V. \]
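
The three steps of this definition translate almost line for line into code. The sketch below is a minimal illustration that chooses the midpoint of each sub-box as the sample point (the definition allows any point in the sub-box). It is applied to the granite box of Preview Activity 11.7.1 with the same 2 × 3 × 2 partition; the density function 2700 + 10z is a made-up example, since the activity does not specify one, and the function names are hypothetical.

```python
def triple_riemann_sum(f, box, m, n, l):
    """Midpoint triple Riemann sum of f over box = ((a, b), (c, d), (r, s))."""
    (a, b), (c, d), (r, s) = box
    dx, dy, dz = (b - a) / m, (d - c) / n, (s - r) / l
    dV = dx * dy * dz                      # volume of each sub-box
    total = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            for k in range(1, l + 1):
                # Sample point: the midpoint of sub-box B_ijk (an assumed choice).
                x = a + (i - 0.5) * dx
                y = c + (j - 0.5) * dy
                z = r + (k - 0.5) * dz
                total += f(x, y, z) * dV
    return total

def density(x, y, z):
    """Hypothetical mass density in kg per cubic meter (not from the text)."""
    return 2700.0 + 10.0 * z

granite_box = ((0, 4), (0, 6), (0, 8))
mass_estimate = triple_riemann_sum(density, granite_box, 2, 3, 2)
print(mass_estimate)
```

Because this particular density is linear, the midpoint sum already agrees with the exact mass; for a density that is not linear, increasing m, n, and l would be needed to improve the estimate.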

If f(x, y, z) represents the mass density of the box B, then, as we saw in
Preview Activity 11.7.1, the triple Riemann sum approximates the total mass
of the box B. In order to find the exact mass of the box, we need to let the
number of sub-boxes increase without bound (in other words, let m, n, and ℓ
go to infinity); in this case, the finite sum of the mass approximations becomes
the actual mass of the solid B. More generally, we have the following definition
of the triple integral.
Definition 11.7.3. With notation defined as in the triple Riemann
sum, the triple integral of f over B is
\[ \iiint_B f(x, y, z) \, dV = \lim_{m, n, \ell \to \infty} \sum_{i=1}^{m} \sum_{j=1}^{n} \sum_{k=1}^{\ell} f(x^*_{ijk}, y^*_{ijk}, z^*_{ijk}) \cdot \Delta V. \]

As we noted earlier, if f(x, y, z) represents the density of the solid B at
each point (x, y, z), then
\[ M = \iiint_B f(x, y, z) \, dV \]

is the mass of B. Even more importantly, for any continuous function f over
the solid B, we can use a triple integral to determine the average value of
f over B, fAVG(B) . We note this generalization of our work with functions
of two variables along with several others in the following important boxed
information. Note that each of these quantities may actually be considered
over a general domain S in R3 , not simply a box, B.
• The triple integral
\[ V(S) = \iiint_S 1 \, dV \]
represents the volume of the solid S.
• The average value of the function f = f(x, y, z) over a solid domain S is
given by
\[ f_{\mathrm{AVG}(S)} = \frac{1}{V(S)} \iiint_S f(x, y, z) \, dV, \]
where V(S) is the volume of the solid S.
• The center of mass of the solid S with density δ = δ(x, y, z) is $(\bar{x}, \bar{y}, \bar{z})$,
where
\[ \bar{x} = \frac{\iiint_S x \, \delta(x, y, z) \, dV}{M}, \quad
   \bar{y} = \frac{\iiint_S y \, \delta(x, y, z) \, dV}{M}, \quad
   \bar{z} = \frac{\iiint_S z \, \delta(x, y, z) \, dV}{M}, \]
and $M = \iiint_S \delta(x, y, z) \, dV$ is the mass of the solid S.
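
To see how these formulas fit together, the following sketch approximates the mass and center of mass of a solid using midpoint Riemann sums. Everything specific here is an assumption made for illustration: the solid is the unit cube [0, 1] × [0, 1] × [0, 1], the density is δ(x, y, z) = 1 + z, and the function names and grid size are arbitrary choices.

```python
def midpoint_triple_sum(f, box, n):
    """Midpoint Riemann sum of f over box = ((a, b), (c, d), (r, s)), n cells per side."""
    (a, b), (c, d), (r, s) = box
    dx, dy, dz = (b - a) / n, (d - c) / n, (s - r) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        for j in range(n):
            y = c + (j + 0.5) * dy
            for k in range(n):
                z = r + (k + 0.5) * dz
                total += f(x, y, z)
    return total * dx * dy * dz

def center_of_mass(density, box, n=20):
    """Return (mass, (x_bar, y_bar, z_bar)) using the boxed formulas above."""
    mass = midpoint_triple_sum(density, box, n)
    x_bar = midpoint_triple_sum(lambda x, y, z: x * density(x, y, z), box, n) / mass
    y_bar = midpoint_triple_sum(lambda x, y, z: y * density(x, y, z), box, n) / mass
    z_bar = midpoint_triple_sum(lambda x, y, z: z * density(x, y, z), box, n) / mass
    return mass, (x_bar, y_bar, z_bar)

unit_cube = ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0))
mass, com = center_of_mass(lambda x, y, z: 1.0 + z, unit_cube)
print(mass, com)
```

Since the assumed density increases with z, the computed center of mass should sit above the geometric center of the cube in the z-direction while remaining centered in x and y; working the integrals by hand gives z̄ = 5/9.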

In the Cartesian coordinate system, the volume element dV is dz dy dx,
and, as a consequence, a triple integral of a function f over a box B = [a, b] ×
[c, d] × [r, s] in Cartesian coordinates can be evaluated as an iterated integral
of the form
\[ \iiint_B f(x, y, z) \, dV = \int_a^b \int_c^d \int_r^s f(x, y, z) \, dz \, dy \, dx. \]
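
For instance, here is a short worked evaluation over a box. The function f(x, y, z) = x + yz and the box [0, 1] × [0, 2] × [0, 3] are chosen only for illustration and do not come from the text; the integrals are evaluated from the inside out.

```latex
\begin{align*}
\int_0^1 \int_0^2 \int_0^3 (x + yz) \, dz \, dy \, dx
  &= \int_0^1 \int_0^2 \left[ xz + \tfrac{1}{2} y z^2 \right]_{z=0}^{z=3} dy \, dx \\
  &= \int_0^1 \int_0^2 \left( 3x + \tfrac{9}{2} y \right) dy \, dx \\
  &= \int_0^1 \left[ 3xy + \tfrac{9}{4} y^2 \right]_{y=0}^{y=2} dx \\
  &= \int_0^1 (6x + 9) \, dx = 12.
\end{align*}
```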

If we want to evaluate a triple integral as an iterated integral over a solid
S that is not a box, then we need to describe the solid in terms of variable
limits.
Activity 11.7.2.
a. Set up and evaluate the triple integral of f (x, y, z) = x − y + 2z over the
box B = [−2, 3] × [1, 4] × [0, 2].

b. Let S be the solid cone bounded by $z = \sqrt{x^2 + y^2}$ and z = 3. A picture
of S is shown at right in Figure 11.7.4. Our goal in what follows is to set
up an iterated integral of the form
\[ \int_{x=?}^{x=?} \int_{y=?}^{y=?} \int_{z=?}^{z=?} \delta(x, y, z) \, dz \, dy \, dx \tag{11.7.1} \]
to represent the mass of S in the setting where δ(x, y, z) tells us the
density of S at the point (x, y, z). Our particular task is to find the
limits on each of the three integrals.

Figure 11.7.4: Left: The cone. Right: Its projection.

i. If we think about slicing up the solid, we can consider slicing the
domain of the solid’s projection onto the xy-plane (just as we would
slice a two-dimensional region in $\mathbb{R}^2$), and then slice in the z-direction
as well. The projection of the solid onto the xy-plane is shown at
left in Figure 11.7.4. If we decide to first slice the domain of the
solid’s projection perpendicular to the x-axis, over what range of
constant x-values would we have to slice?
ii. If we continue with slicing the domain, what are the limits on y on
a typical slice? How do these depend on x? What, therefore, are
the limits on the middle integral?
iii. Finally, now that we have thought about slicing up the two-dimensional
domain that is the projection of the cone, what are the limits on
z in the innermost integral? Note that over any point (x, y) in the
plane, a vertical slice in the z direction will involve a range of values
from the cone itself to its flat top. In particular, observe that at
least one of these limits is not constant but depends on x and y.

iv. In conclusion, write an iterated integral of the form (11.7.1) that
represents the mass of the cone S.

Note well: When setting up iterated integrals, the limits on a given variable
can depend only on the variables that remain to be integrated (those of the
outer integrals). In addition, there are multiple different ways we can choose
to set up such an integral. For example, two possibilities for iterated integrals
that represent a triple integral $\iiint_S f(x, y, z) \, dV$ over a solid S are

• $\int_a^b \int_{g_1(x)}^{g_2(x)} \int_{h_1(x,y)}^{h_2(x,y)} f(x, y, z) \, dz \, dy \, dx$
• $\int_r^s \int_{p_1(z)}^{p_2(z)} \int_{q_1(x,z)}^{q_2(x,z)} f(x, y, z) \, dy \, dx \, dz$

where $g_1$, $g_2$, $h_1$, $h_2$, $p_1$, $p_2$, $q_1$, and $q_2$ are functions of the indicated
variables. There are four other options beyond the two stated here, since the
variables x, y, and z can (theoretically) be arranged in any order. Of course, in
many circumstances, an insightful choice of variable order will make it easier
to set up an iterated integral, just as was the case when we worked with double
integrals.
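
As a small illustration of these two forms (an example chosen here, not taken from the text), consider the volume of the solid in the first octant bounded by the coordinate planes and the plane x + y + z = 1. Both orders describe the same solid and give the same value.

```latex
\begin{align*}
\int_0^1 \int_0^{1-x} \int_0^{1-x-y} 1 \, dz \, dy \, dx
  &= \int_0^1 \int_0^{1-x} (1 - x - y) \, dy \, dx
   = \int_0^1 \tfrac{1}{2}(1 - x)^2 \, dx = \tfrac{1}{6}, \\
\int_0^1 \int_0^{1-z} \int_0^{1-x-z} 1 \, dy \, dx \, dz
  &= \int_0^1 \int_0^{1-z} (1 - x - z) \, dx \, dz
   = \int_0^1 \tfrac{1}{2}(1 - z)^2 \, dz = \tfrac{1}{6}.
\end{align*}
```

In each case the innermost limits involve both outer variables, the middle limits involve only the outermost variable, and the outermost limits are constants.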

Example 11.7.5. Find the mass of the tetrahedron in the first octant bounded
by the coordinate planes and the plane x + 2y + 3z = 6 if the density at point
(x, y, z) is given by δ(x, y, z) = x + y + z. A picture of the solid tetrahedron is
shown at left in Figure 11.7.6.

Figure 11.7.6: Left: The tetrahedron. Right: Its projection.

We find the mass, M, of the tetrahedron by the triple integral
\[ M = \iiint_S \delta(x, y, z) \, dV, \]

where S is the solid tetrahedron described above. In this example, we choose
to integrate with respect to z first for the innermost integral. The top of the
tetrahedron is given by the equation

x + 2y + 3z = 6;

solving for z then yields
\[ z = \frac{1}{3}(6 - x - 2y). \]
The bottom of the tetrahedron is the xy-plane, so the limits on z in the
iterated integral will be $0 \le z \le \frac{1}{3}(6 - x - 2y)$.
To find the bounds on x and y we project the tetrahedron onto the
xy-plane; this corresponds to setting z = 0 in the equation $z = \frac{1}{3}(6 - x - 2y)$. The
resulting relation between x and y is

x + 2y = 6.

The right image in Figure 11.7.6 shows the projection of the tetrahedron
onto the xy-plane.
If we choose to integrate with respect to y for the middle integral in the
iterated integral, then the lower limit on y is the x-axis and the upper limit
is the hypotenuse of the triangle. Note that the hypotenuse joins the points
(6, 0) and (0, 3) and so has equation $y = 3 - \frac{1}{2}x$. Thus, the bounds on y are
$0 \le y \le 3 - \frac{1}{2}x$. Finally, the x values run from 0 to 6, so the iterated integral
that gives the mass of the tetrahedron is
\[ M = \int_0^6 \int_0^{3-(1/2)x} \int_0^{(1/3)(6-x-2y)} (x + y + z) \, dz \, dy \, dx. \tag{11.7.2} \]

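Evaluating the triple integral gives us

\begin{align*}
M &= \int_0^6 \int_0^{3-(1/2)x} \int_0^{(1/3)(6-x-2y)} (x + y + z) \, dz \, dy \, dx \\
  &= \int_0^6 \int_0^{3-(1/2)x} \left[ xz + yz + \frac{z^2}{2} \right]_0^{(1/3)(6-x-2y)} dy \, dx \\
  &= \int_0^6 \int_0^{3-(1/2)x} \left( \frac{4}{3}x - \frac{5}{18}x^2 - \frac{7}{9}xy + \frac{2}{3}y - \frac{4}{9}y^2 + 2 \right) dy \, dx \\
  &= \int_0^6 \left[ \frac{4}{3}xy - \frac{5}{18}x^2 y - \frac{7}{18}xy^2 + \frac{1}{3}y^2 - \frac{4}{27}y^3 + 2y \right]_0^{3-(1/2)x} dx \\
  &= \int_0^6 \left( 5 + \frac{1}{2}x - \frac{7}{12}x^2 + \frac{13}{216}x^3 \right) dx \\
  &= \left[ 5x + \frac{1}{4}x^2 - \frac{7}{36}x^3 + \frac{13}{864}x^4 \right]_0^6 \\
  &= \frac{33}{2}.
\end{align*}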
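
A numerical check of this value is straightforward. The sketch below approximates the iterated integral in Equation (11.7.2) with nested midpoint sums, where the limits on y are a function of x and the limits on z are a function of x and y, mirroring the structure of the iterated integral. The function name, the way the limits are passed, and the choice n = 40 are assumptions made for illustration.

```python
def iterated_midpoint(f, x_lim, y_lim, z_lim, n):
    """Approximate the iterated integral of f dz dy dx with n midpoint samples
    per variable; y_lim(x) and z_lim(x, y) return (lower, upper) limits."""
    x0, x1 = x_lim
    dx = (x1 - x0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * dx
        y0, y1 = y_lim(x)
        dy = (y1 - y0) / n
        for j in range(n):
            y = y0 + (j + 0.5) * dy
            z0, z1 = z_lim(x, y)
            dz = (z1 - z0) / n
            for k in range(n):
                z = z0 + (k + 0.5) * dz
                total += f(x, y, z) * dx * dy * dz
    return total

# Tetrahedron of Example 11.7.5 with density x + y + z.
mass = iterated_midpoint(
    lambda x, y, z: x + y + z,
    (0.0, 6.0),
    lambda x: (0.0, 3.0 - x / 2.0),
    lambda x, y: (0.0, (6.0 - x - 2.0 * y) / 3.0),
    n=40,
)
print(mass)   # should be close to 33/2 = 16.5
```

Swapping in different limit functions gives a way to test the alternative orders of integration against the same value.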
Setting up limits on iterated integrals can require considerable geometric
intuition. It is important to not only create carefully labeled figures, but also
to think about how we wish to slice the solid. Further, note that when we
say “we will integrate first with respect to x,” by “first” we are referring to the
innermost integral in the iterated integral. The next activity explores several
different ways we might set up the integral in the preceding example.

Activity 11.7.3. There are several other ways we could have set up the inte-
gral to give the mass of the tetrahedron in Example 11.7.5.

a. How many different iterated integrals could be set up that are equal to
the integral in Equation (11.7.2)?

b. Set up an iterated integral, integrating first with respect to z, then x,
then y that is equivalent to the integral in Equation (11.7.2). Before
you write down the integral, think about Figure 11.7.6, and draw an
appropriate two-dimensional image of an important projection.
c. Set up an iterated integral, integrating first with respect to y, then z,
then x that is equivalent to the integral in Equation (11.7.2). As in (b),
think carefully about the geometry first.
d. Set up an iterated integral, integrating first with respect to x, then y,
then z that is equivalent to the integral in Equation (11.7.2).
Now that we have begun to understand how to set up iterated triple in-
tegrals, we can apply them to determine important quantities, such as those
found in the next activity.
Activity 11.7.4. A solid S is bounded below by the square z = 0, −1 ≤ x ≤ 1,
−1 ≤ y ≤ 1 and above by the surface $z = 2 - x^2 - y^2$. A picture of the solid
is shown in Figure 11.7.7.

Figure 11.7.7: The solid bounded by the surface $z = 2 - x^2 - y^2$.

a. First, set up an iterated double integral to find the volume of the solid S,
viewing S as the solid under a surface. Then set up an iterated
triple integral that gives the volume of the solid S. You do not need to
evaluate either integral. Compare the two approaches.
b. Set up (but do not evaluate) iterated integral expressions that will tell us
the center of mass of S, if the density at point (x, y, z) is $\delta(x, y, z) = x^2 + 1$.
c. Set up (but do not evaluate) an iterated integral to find the average
density on S using the density function from part (b).
d. Use technology appropriately to evaluate the iterated integrals you de-
termined in (a), (b), and (c); does the location you determined for the
center of mass make sense?

11.7.2 Summary
• Let f = f(x, y, z) be a continuous function on a box B = [a, b] × [c, d] ×
[r, s]. The triple integral of f over B is defined as
\[ \iiint_B f(x, y, z) \, dV = \lim_{\Delta V \to 0} \sum_{i=1}^{m} \sum_{j=1}^{n} \sum_{k=1}^{\ell} f(x^*_{ijk}, y^*_{ijk}, z^*_{ijk}) \cdot \Delta V, \]

where the triple Riemann sum is defined in the usual way. The definition
of the triple integral naturally extends to non-rectangular solid regions
S.

• The triple integral $\iiint_S f(x, y, z) \, dV$ can tell us

◦ the volume of the solid S if f(x, y, z) = 1,
◦ the mass of the solid S if f represents the density of S at the point
(x, y, z).

Moreover,
\[ f_{\mathrm{AVG}(S)} = \frac{1}{V(S)} \iiint_S f(x, y, z) \, dV \]
is the average value of f over S.

Exercises
1. Find the triple integral of the function $f(x, y, z) = x^3 \cos(y + z)$ over the
cube 2 ≤ x ≤ 3, 0 ≤ y ≤ π, 0 ≤ z ≤ π.
2. Evaluate the triple integral
\[ \iiint_E xyz \, dV \]
where E is the solid: 0 ≤ z ≤ 2, 0 ≤ y ≤ z, 0 ≤ x ≤ y.


3. Find the mass of the rectangular prism 0 ≤ x ≤ 1, 0 ≤ y ≤ 3, 0 ≤ z ≤ 2,
with density function ρ (x, y, z) = x.
4. Find the average value of the function $f(x, y, z) = y e^{-xy}$ over the
rectangular prism 0 ≤ x ≤ 2, 0 ≤ y ≤ 2, 0 ≤ z ≤ 4.
5. Find the volume of the solid bounded by the planes x = 0, y = 0, z = 0,
and x + y + z = 5.
6. Find the mass of the solid bounded by the xy-plane, yz-plane, xz-plane,
and the plane (x/3) + (y/3) + (z/9) = 1, if the density of the solid is given by
δ(x, y, z) = x + 3y.
mass =
7. The moment of inertia of a solid body about an axis in 3-space relates the
angular acceleration about this axis to torque (force twisting the body). The
moments of inertia about the coordinate axes of a body of constant density
and mass m occupying a region W of volume V are defined to be

\[ I_x = \frac{m}{V} \int_W (y^2 + z^2) \, dV, \qquad
   I_y = \frac{m}{V} \int_W (x^2 + z^2) \, dV, \qquad
   I_z = \frac{m}{V} \int_W (x^2 + y^2) \, dV. \]

Use these definitions to find the moment of inertia about the z-axis of the
rectangular solid of mass 60 given by 0 ≤ x ≤ 4, 0 ≤ y ≤ 1, 0 ≤ z ≤ 5.
Ix =
Iy =
Iz =
8. Express the integral $\iiint_E f(x, y, z) \, dV$ as an iterated integral in six
different ways, where E is the solid bounded by z = 0, x = 0, z = y − 7x and
y = 28.
1. $\int_a^b \int_{g_1(x)}^{g_2(x)} \int_{h_1(x,y)}^{h_2(x,y)} f(x, y, z) \, dz \, dy \, dx$
a =            b =
g1(x) =        g2(x) =
h1(x, y) =     h2(x, y) =
2. $\int_a^b \int_{g_1(y)}^{g_2(y)} \int_{h_1(x,y)}^{h_2(x,y)} f(x, y, z) \, dz \, dx \, dy$
a =            b =
g1(y) =        g2(y) =
h1(x, y) =     h2(x, y) =
3. $\int_a^b \int_{g_1(z)}^{g_2(z)} \int_{h_1(y,z)}^{h_2(y,z)} f(x, y, z) \, dx \, dy \, dz$
a =            b =
g1(z) =        g2(z) =
h1(y, z) =     h2(y, z) =
4. $\int_a^b \int_{g_1(y)}^{g_2(y)} \int_{h_1(y,z)}^{h_2(y,z)} f(x, y, z) \, dx \, dz \, dy$
a =            b =
g1(y) =        g2(y) =
h1(y, z) =     h2(y, z) =
5. $\int_a^b \int_{g_1(x)}^{g_2(x)} \int_{h_1(x,z)}^{h_2(x,z)} f(x, y, z) \, dy \, dz \, dx$
a =            b =
g1(x) =        g2(x) =
h1(x, z) =     h2(x, z) =
6. $\int_a^b \int_{g_1(z)}^{g_2(z)} \int_{h_1(x,z)}^{h_2(x,z)} f(x, y, z) \, dy \, dx \, dz$
a =            b =
g1(z) =        g2(z) =
h1(x, z) =     h2(x, z) =
9. Calculate the volume under the elliptic paraboloid $z = 4x^2 + 7y^2$ and
over the rectangle R = [−4, 4] × [−2, 2].
10. The motion of a solid object can be analyzed by thinking of the mass
as concentrated at a single point, the center of mass. If the object has density
ρ(x, y, z) at the point (x, y, z) and occupies a region W, then the coordinates
$(\bar{x}, \bar{y}, \bar{z})$ of the center of mass are given by

\[ \bar{x} = \frac{1}{m} \int_W x \rho \, dV, \qquad
   \bar{y} = \frac{1}{m} \int_W y \rho \, dV, \qquad
   \bar{z} = \frac{1}{m} \int_W z \rho \, dV. \]

Assume x, y, z are in cm. Let C be a solid cone with both height and
radius 1 and contained between the surfaces $z = \sqrt{x^2 + y^2}$ and z = 1. If C
has constant mass density of 1 g/cm³, find the z-coordinate of C’s center of
mass.
z=
(Include units.)
11. Without calculation, decide if each of the integrals below is positive,
negative, or zero. Let W be the solid bounded by $z = \sqrt{x^2 + y^2}$ and z = 2.

(a) $\iiint_W (z - 2) \, dV$
(b) $\iiint_W \left( z - \sqrt{x^2 + y^2} \right) dV$
(c) $\iiint_W e^{-xyz} \, dV$

12. Set up a triple integral to find the mass of the solid tetrahedron bounded
by the xy-plane, the yz-plane, the xz-plane, and the plane x/3 + y/2 + z/6 = 1,
if the density function is given by δ(x, y, z) = x + y. Write an iterated integral
in the form below to find the mass of the solid.
\[ \iiint_R f(x, y, z) \, dV = \int_A^B \int_C^D \int_E^F \underline{\qquad\qquad} \, dz \, dy \, dx \]
with limits of integration
A=
B=
C=
D=
E=
F=
13. Consider the solid S that is bounded by the parabolic cylinder $y = x^2$
and the planes z = 0 and z = 1 − y as shown in Figure 11.7.8.

Figure 11.7.8: The solid bounded by $y = x^2$ and the planes z = 0 and z = 1 − y.

Assume the density of S is given by δ(x, y, z) = z.

a. Set up (but do not evaluate) an iterated integral that represents the mass
of S. Integrate first with respect to z, then y, then x. A picture of the
projection of S onto the xy-plane is shown at left in Figure 11.7.9.

b. Set up (but do not evaluate) an iterated integral that represents the mass
of S. In this case, integrate first with respect to y, then z, then x. A
picture of the projection of S onto the xz-plane is shown at center in
Figure 11.7.9.

c. Set up (but do not evaluate) an iterated integral that represents the mass
of S. For this integral, integrate first with respect to x, then y, then z.
A picture of the projection of S onto the yz-plane is shown at right in
Figure 11.7.9.

d. Which of these three orders of integration is the most natural to you? Why?

Figure 11.7.9: Projections of S onto the xy, xz, and yz-planes.

14. This problem asks you to investigate the average value of some different
quantities.

a. Set up, but do not evaluate, an iterated integral expression whose value
is the average sum of all real numbers x, y, and z that have the following
property: y is between 0 and 2, x is greater than or equal to 0 but cannot
exceed 2y, and z is greater than or equal to 0 but cannot exceed x + y.

b. Set up, but do not evaluate, an integral expression whose value represents
the average value of f (x, y, z) = x + y + z over the solid region in the first
octant bounded by the surface $z = 4 - x - y^2$ and the coordinate planes
x = 0, y = 0, z = 0.

c. How are the quantities in (a) and (b) similar? How are they different?

15. Consider the solid that lies between the paraboloids $z = g(x, y) = x^2 + y^2$
and $z = f(x, y) = 8 - 3x^2 - 3y^2$.

a. By eliminating the variable z, determine the curve of intersection between
the two paraboloids, and sketch this curve in the xy-plane.

b. Set up, but do not evaluate, an iterated integral expression whose value
determines the mass of the solid, integrating first with respect to z,
then y, then x. Assume the solid’s density is given by
$\delta(x, y, z) = \frac{1}{x^2 + y^2 + z^2 + 1}$.

c. Set up, but do not evaluate, iterated integral expressions whose values
determine the mass of the solid using all possible remaining orders of
integration. Use $\delta(x, y, z) = \frac{1}{x^2 + y^2 + z^2 + 1}$ as the density of the solid.

d. Set up, but do not evaluate, iterated integral expressions whose values
determine the center of mass of the solid. Again, assume the solid’s
density is given by $\delta(x, y, z) = \frac{1}{x^2 + y^2 + z^2 + 1}$.

e. Which coordinates of the center of mass can you determine without eval-
uating any integral expression? Why?

16. In each of the following problems, your task is to



• sketch, by hand, the region over which you integrate


• set up iterated integral expressions which, when evaluated, will determine
the value sought
• use appropriate technology to evaluate each iterated integral expression
you develop
Note well: in some problems you may be able to use a double rather than
a triple integral, and polar coordinates may be helpful in some cases.
a. Consider the solid created by the region enclosed by the circular paraboloid
$z = 4 - x^2 - y^2$ over the region R in the xy-plane enclosed by y = −x
and the circle $x^2 + y^2 = 4$ in the first, second, and fourth quadrants.
Determine the solid’s volume.
b. Consider the solid region that lies beneath the circular paraboloid
$z = 9 - x^2 - y^2$ over the triangular region between y = x, y = 2x, and
y = 1. Assuming that the solid has its density at point (x, y, z) given
by δ(x, y, z) = xyz + 1, measured in grams per cubic cm, determine the
center of mass of the solid.
c. In a certain room in a house, the walls can be thought of as being formed
by the lines y = 0, y = 12 + x/4, x = 0, and x = 12, where length is
measured in feet. In addition, the ceiling of the room is vaulted and is
determined by the plane z = 16 − x/6 − y/3. A heater is stationed in the
corner of the room at (0, 0, 0) and causes the temperature in the room
at a particular time to be given by
\[
T(x, y, z) = \frac{80}{1 + \frac{x^2}{1000} + \frac{y^2}{1000} + \frac{z^2}{1000}}.
\]

What is the average temperature in the room?


d. Consider the solid enclosed by the cylinder $x^2 + y^2 = 9$ and the planes
y + z = 5 and z = 1. Assuming that the solid’s density is given by
$\delta(x, y, z) = \sqrt{x^2 + y^2}$, find the mass and center of mass of the solid.

11.8 Triple Integrals in Cylindrical and Spherical Coordinates

Motivating Questions
• What are the cylindrical coordinates of a point, and how are they related
to Cartesian coordinates?
• What is the volume element in cylindrical coordinates? How does this
inform us about evaluating a triple integral as an iterated integral in
cylindrical coordinates?
• What are the spherical coordinates of a point, and how are they related
to Cartesian coordinates?
• What is the volume element in spherical coordinates? How does this
inform us about evaluating a triple integral as an iterated integral in
spherical coordinates?

We have encountered two different coordinate systems in $\mathbb{R}^2$, the
rectangular and polar coordinate systems, and seen how in certain situations
polar coordinates form a convenient alternative. In a similar way, there
turn out to be two additional natural coordinate systems in $\mathbb{R}^3$. Given that
we are already familiar with the Cartesian coordinate system for $\mathbb{R}^3$, we next
investigate the cylindrical and spherical coordinate systems (each of which
builds upon polar coordinates in $\mathbb{R}^2$). In what follows, we will see how to
convert among the different coordinate systems, how to evaluate triple integrals
using them, and some situations in which these other coordinate systems prove
advantageous.
Preview Activity 11.8.1. In the following questions, we investigate the two
new coordinate systems that are the subject of this section: cylindrical and
spherical coordinates. Our goal is to consider some examples of how to convert
from rectangular coordinates to each of these systems, and vice versa. Triangles
and trigonometry prove to be particularly important.

Figure 11.8.1: The cylindrical (left) and spherical (right) coordinates of a point.

a. The cylindrical coordinates of a point in $\mathbb{R}^3$ are given by (r, θ, z) where
r and θ are the polar coordinates of the point (x, y) and z is the same z
coordinate as in Cartesian coordinates. An illustration is given at left in
Figure 11.8.1.

i. Find cylindrical coordinates for the point whose Cartesian coordinates
are $(-1, \sqrt{3}, 3)$. Draw a labeled picture illustrating all of the
coordinates.
ii. Find the Cartesian coordinates of the point whose cylindrical
coordinates are $\left(2, \frac{5\pi}{4}, 1\right)$. Draw a labeled picture
illustrating all of the coordinates.

b. The spherical coordinates of a point in $\mathbb{R}^3$ are ρ (rho), θ, and φ (phi),
where ρ is the distance from the point to the origin, θ has the same
interpretation it does in polar coordinates, and φ is the angle between the
positive z axis and the vector from the origin to the point, as illustrated
at right in Figure 11.8.1. You should convince yourself that any point in
$\mathbb{R}^3$ can be represented in spherical coordinates with ρ ≥ 0, 0 ≤ θ < 2π,
and 0 ≤ φ ≤ π.
For the following questions, consider the point P whose Cartesian
coordinates are $(-2, 2, \sqrt{8})$.

i. What is the distance from P to the origin? Your result is the value
of ρ in the spherical coordinates of P .
ii. Determine the point that is the projection of P onto the xy-plane.
Then, use this projection to find the value of θ in the polar coor-
dinates of the projection of P that lies in the plane. Your result is
also the value of θ for the spherical coordinates of the point.
iii. Based on the illustration in Figure 11.8.1, how is the angle φ de-
termined by ρ and the z coordinate of P ? Use a well-chosen right
triangle to find the value of φ, which is the final component in the
spherical coordinates of P . Draw a carefully labeled picture that
clearly illustrates the values of ρ, θ, and φ in this example, along
with the original rectangular coordinates of P .
iv. Based on your responses to i., ii., and iii., if we are given the Carte-
sian coordinates (x, y, z) of a point Q, how are the values of ρ, θ,
and φ in the spherical coordinates of Q determined by x, y, and z?

11.8.1 Cylindrical Coordinates


As we stated in Preview Activity 11.8.1, the cylindrical coordinates of a point
are (r, θ, z), where r and θ are the polar coordinates of the point (x, y), and z
is the same z coordinate as in Cartesian coordinates. The general situation is
illustrated at left in Figure 11.8.1.
Since we already know how to convert between rectangular and polar
coordinates in the plane, and the z coordinate is identical in both Cartesian and
cylindrical coordinates, the conversion equations between the two systems in
$\mathbb{R}^3$ are essentially those we found for polar coordinates.

Converting between Cartesian and cylindrical coordinates.

• If we are given the Cartesian coordinates (x, y, z) of a point P,
then the cylindrical coordinates (r, θ, z) of P satisfy

$r^2 = x^2 + y^2$, $\tan(\theta) = \frac{y}{x}$, and z = z,

assuming x ≠ 0.

• If we are given the cylindrical coordinates (r, θ, z) of a point P,
then the Cartesian coordinates (x, y, z) of P satisfy

x = r cos(θ), y = r sin(θ), and z = z.
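
The conversion equations in the box can be sketched in a few lines of Python. This is only an illustration, not part of the text: the function names are hypothetical, the sample points are chosen arbitrarily, and `math.atan2` is used in place of tan(θ) = y/x as a convenience, since it selects the correct quadrant and also covers the case x = 0.

```python
import math

def cartesian_to_cylindrical(x, y, z):
    """Return (r, theta, z) with r >= 0 and theta reduced to [0, 2*pi)."""
    r = math.hypot(x, y)                       # r^2 = x^2 + y^2
    # atan2 resolves the quadrant that tan(theta) = y/x alone cannot,
    # and it is defined even when x == 0.
    theta = math.atan2(y, x) % (2 * math.pi)
    return r, theta, z

def cylindrical_to_cartesian(r, theta, z):
    """Return (x, y, z) using x = r cos(theta), y = r sin(theta)."""
    return r * math.cos(theta), r * math.sin(theta), z

# Illustrative point (an assumption, not taken from the text).
r, theta, z = cartesian_to_cylindrical(1.0, 1.0, 2.0)
x_back, y_back, z_back = cylindrical_to_cartesian(r, theta, z)

# A point on the negative y-axis, where y/x is undefined.
r2, theta2, _ = cartesian_to_cylindrical(0.0, -3.0, 0.0)
```

Converting to cylindrical coordinates and back should return the starting point up to rounding, which gives a quick way to check hand computations.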

Just as with rectangular coordinates, where we usually write z as a function
of x and y to plot the resulting surface, in cylindrical coordinates, we often
express z as a function of r and θ. In the following activity, we explore several
basic equations in cylindrical coordinates and the corresponding surface each
generates.

Activity 11.8.2. In this activity, we graph some surfaces using cylindrical co-
ordinates. To improve your intuition and test your understanding, you should
first think about what each graph should look like before you plot it using
appropriate technology.

a. What familiar surface is described by the points in cylindrical coordinates
with r = 2, 0 ≤ θ ≤ 2π, and 0 ≤ z ≤ 2? How does this example suggest
why we call these coordinates cylindrical coordinates? How does your
answer change if we restrict θ to 0 ≤ θ ≤ π?

b. What familiar surface is described by the points in cylindrical coordinates
with θ = 2, 0 ≤ r ≤ 2, and 0 ≤ z ≤ 2?

c. What familiar surface is described by the points in cylindrical coordinates
with z = 2, 0 ≤ θ ≤ 2π, and 0 ≤ r ≤ 2?

d. Plot the graph of the cylindrical equation z = r, where 0 ≤ θ ≤ 2π and
0 ≤ r ≤ 2. What familiar surface results?

e. Plot the graph of the cylindrical equation z = θ for 0 ≤ θ ≤ 4π. What
does this surface look like?

As the name and Activity 11.8.2 suggest, cylindrical coordinates are useful
for describing surfaces that are cylindrical in nature.

11.8.2 Triple Integrals in Cylindrical Coordinates


To evaluate a triple integral $\iiint_S f(x, y, z) \, dV$ as an iterated integral in
Cartesian coordinates, we use the fact that the volume element dV is equal to
dz dy dx (which corresponds to the volume of a small box). To evaluate a triple
integral in cylindrical coordinates, we similarly must understand the volume
element dV in cylindrical coordinates.

Activity 11.8.3. A picture of a cylindrical box,
$B = \{(r, \theta, z) : r_1 \le r \le r_2,\ \theta_1 \le \theta \le \theta_2,\ z_1 \le z \le z_2\}$,
is shown in Figure 11.8.2. Let $\Delta r = r_2 - r_1$, $\Delta\theta = \theta_2 - \theta_1$,
and $\Delta z = z_2 - z_1$. We want to determine the volume ΔV of B
in terms of Δr, Δθ, Δz, r, θ, and z.

Figure 11.8.2: A cylindrical box.

a. Appropriately label ∆r, ∆θ, and ∆z in Figure 11.8.2.


b. Let ∆A be the area of the projection of the box, B, onto the xy-plane,
which is shaded blue in Figure 11.8.2. Recall that we previously deter-
mined the area ∆A in polar coordinates in terms of r, ∆r, and ∆θ. In
light of the fact that we know ∆A and that z is the standard z coordi-
nate from Cartesian coordinates, what is the volume ∆V in cylindrical
coordinates?
Activity 11.8.3 demonstrates that the volume element dV in cylindrical
coordinates is given by dV = r dz dr dθ, and hence the following rule holds in
general.
Triple integrals in cylindrical coordinates.
Given a continuous function f = f(x, y, z) over a region S in $\mathbb{R}^3$,
\[
\iiint_S f(x, y, z) \, dV = \iiint_S f(r\cos(\theta), r\sin(\theta), z) \, r \, dz \, dr \, d\theta.
\]

The latter expression is an iterated integral in cylindrical coordinates.
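
The factor of r in this rule can be traced back to the volume of a cylindrical box. As a short sketch, assuming the polar-rectangle area $\Delta A = \frac{r_1 + r_2}{2}\,\Delta r\,\Delta\theta$ from our earlier work with polar coordinates, the volume of a box with base area ΔA and height Δz is

```latex
\[
\Delta V \;=\; \Delta A \,\Delta z
\;=\; \frac{r_1 + r_2}{2}\,\Delta r\,\Delta\theta\,\Delta z
\qquad\longrightarrow\qquad
dV \;=\; r \, dz \, dr \, d\theta
\quad\text{as } \Delta r,\ \Delta\theta,\ \Delta z \to 0 .
\]
```

Here the average radius $\frac{r_1 + r_2}{2}$ tends to r as the box shrinks, which is where the extra factor of r in the integrand comes from.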

Of course, to complete the task of writing an iterated integral in cylindrical
coordinates, we need to determine the limits on the three integrals: θ, r, and z.
In the following activity, we explore how to do this in several situations where
cylindrical coordinates are natural and advantageous.
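
Before working by hand, it can help to see the rule applied numerically. The sketch below is an illustration only: the helper `cylindrical_integral` is hypothetical, it handles only constant limits (a simplifying assumption), and it uses a plain midpoint rule. It estimates the integral of $f(x, y, z) = x^2 + y^2$ over the cylinder 0 ≤ r ≤ 2, 0 ≤ z ≤ 3, an example chosen for this sketch, whose exact value is $\int_0^{2\pi}\int_0^2\int_0^3 r^3 \, dz \, dr \, d\theta = 24\pi$.

```python
import math

def cylindrical_integral(f, theta_range, r_range, z_range, n=40):
    """Midpoint-rule estimate of the triple integral of f(x, y, z) over a
    cylindrical box with constant limits, using dV = r dz dr dtheta."""
    (t0, t1), (r0, r1), (z0, z1) = theta_range, r_range, z_range
    dt, dr, dz = (t1 - t0) / n, (r1 - r0) / n, (z1 - z0) / n
    total = 0.0
    for i in range(n):
        theta = t0 + (i + 0.5) * dt
        for j in range(n):
            r = r0 + (j + 0.5) * dr
            # Convert the sample point to Cartesian coordinates for f.
            x, y = r * math.cos(theta), r * math.sin(theta)
            for k in range(n):
                z = z0 + (k + 0.5) * dz
                # The extra factor r is the cylindrical volume element.
                total += f(x, y, z) * r * dz * dr * dt
    return total

estimate = cylindrical_integral(lambda x, y, z: x**2 + y**2,
                                (0.0, 2 * math.pi), (0.0, 2.0), (0.0, 3.0))
exact = 24 * math.pi
```

Dropping the factor `r` from the sum gives a noticeably different (and wrong) value, which is a useful experiment for seeing why the volume element matters.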
Activity 11.8.4. In this activity we work with triple integrals in cylindrical
coordinates.
a. Let S be the solid bounded above by the graph of $z = x^2 + y^2$ and below
by z = 0 on the unit disk in the xy-plane.
i. The projection of the solid S onto the xy-plane is a disk. Describe
this disk using polar coordinates.
ii. Now describe the surfaces bounding the solid S using cylindrical
coordinates.

iii. Determine an iterated triple integral expression in cylindrical
coordinates that gives the volume of S. You do not need to evaluate
this integral.

b. Suppose the density of the cone defined by r = 1 − z, with z ≥ 0, is given
by δ(r, θ, z) = z. A picture of the cone is shown at left in Figure 11.8.3,
and the projection of the cone onto the xy-plane is given at right in
Figure 11.8.3. Set up an iterated integral in cylindrical coordinates that
gives the mass of the cone. You do not need to evaluate this integral.

Figure 11.8.3: The cylindrical cone r = 1 − z and its projection onto the
xy-plane.

Figure 11.8.4: A solid bounded by the cones $z = \sqrt{x^2 + y^2}$ and
$z = 4 - \sqrt{x^2 + y^2}$.

c. Determine an iterated integral expression in cylindrical coordinates whose
value is the volume of the solid bounded below by the cone $z = \sqrt{x^2 + y^2}$
and above by the cone $z = 4 - \sqrt{x^2 + y^2}$. A picture is shown in
Figure 11.8.4. You do not need to evaluate this integral.

11.8.3 Spherical Coordinates


As we saw in Preview Activity 11.8.1, the spherical coordinates of a point in
3-space have the form (ρ, θ, φ), where ρ is the distance from the point to the
origin, θ has the same meaning as in polar coordinates, and φ is the angle
between the positive z axis and the vector from the origin to the point. The
overall situation is illustrated at right in Figure 11.8.1.

Figure 11.8.5: Converting from spherical to Cartesian coordinates.

The example in Preview Activity 11.8.1 and Figure 11.8.5 suggest how to
convert between Cartesian and spherical coordinates.
Converting between Cartesian and spherical coordinates.

• If we are given the Cartesian coordinates (x, y, z) of a point P,
then the spherical coordinates (ρ, θ, φ) of P satisfy

$\rho = \sqrt{x^2 + y^2 + z^2}$, $\tan(\theta) = \frac{y}{x}$, and $\cos(\phi) = \frac{z}{\rho}$,

where in the latter two equations, we require x ≠ 0 and ρ ≠ 0.
• If we are given the spherical coordinates (ρ, θ, φ) of a point P,
then the Cartesian coordinates (x, y, z) of P satisfy

x = ρ sin(φ) cos(θ), y = ρ sin(φ) sin(θ), and z = ρ cos(φ).
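
As a rough illustration of these formulas, the following Python sketch converts in both directions. The function names and the sample point are assumptions made for this example. The sketch uses `math.atan2` instead of tan(θ) = y/x so that the quadrant of θ is resolved automatically, and it sets θ = φ = 0 at the origin purely as a convention, since the angles are not determined there.

```python
import math

def cartesian_to_spherical(x, y, z):
    """Return (rho, theta, phi) with rho >= 0, 0 <= theta < 2*pi, 0 <= phi <= pi."""
    rho = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x) % (2 * math.pi)
    if rho == 0.0:
        return 0.0, 0.0, 0.0   # convention only: angles are undefined at the origin
    # Clamp guards against tiny rounding errors pushing z/rho outside [-1, 1].
    phi = math.acos(max(-1.0, min(1.0, z / rho)))
    return rho, theta, phi

def spherical_to_cartesian(rho, theta, phi):
    """Return (x, y, z) from the three conversion equations in the box."""
    x = rho * math.sin(phi) * math.cos(theta)
    y = rho * math.sin(phi) * math.sin(theta)
    z = rho * math.cos(phi)
    return x, y, z

# Illustrative point (not from the text): (1, 1, sqrt(2)) has rho = 2.
rho, theta, phi = cartesian_to_spherical(1.0, 1.0, math.sqrt(2.0))
x_back, y_back, z_back = spherical_to_cartesian(rho, theta, phi)
```

For this sample point both angles come out to π/4, and converting back recovers the original Cartesian coordinates up to rounding.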

When it comes to thinking about particular surfaces in spherical coordinates,
similar to our work with cylindrical and Cartesian coordinates, we
usually write ρ as a function of θ and φ. This is a natural analog to polar
coordinates, where we often think of our distance from the origin in the plane as
being a function of θ; in spherical coordinates, we likewise view distance
from the origin as a function of two key angles.
In the following activity, we explore several basic equations in spherical
coordinates and the surfaces they generate.

Activity 11.8.5. In this activity, we graph some surfaces using spherical co-
ordinates. To improve your intuition and test your understanding, you should
first think about what each graph should look like before you plot it using
appropriate technology.

a. What familiar surface is described by the points in spherical coordinates
with ρ = 1, 0 ≤ θ ≤ 2π, and 0 ≤ φ ≤ π? How does this particular
example demonstrate the reason for the name of this coordinate system?
What if we restrict φ to $0 \le \phi \le \frac{\pi}{2}$?

b. What familiar surface is described by the points in spherical coordinates
with $\phi = \frac{\pi}{3}$, 0 ≤ ρ ≤ 1, and 0 ≤ θ ≤ 2π?

c. What familiar shape is described by the points in spherical coordinates
with $\theta = \frac{\pi}{6}$, 0 ≤ ρ ≤ 1, and 0 ≤ φ ≤ π?

d. Plot the graph of ρ = θ, for 0 ≤ φ ≤ π and 0 ≤ θ ≤ 2π. How does the
resulting surface appear?

As the name and Activity 11.8.5 indicate, spherical coordinates are partic-
ularly useful for describing surfaces that are spherical in nature; they are also
convenient for working with certain conical surfaces.

11.8.4 Triple Integrals in Spherical Coordinates

As with rectangular and cylindrical coordinates, a triple integral
$\iiint_S f(x, y, z) \, dV$ in spherical coordinates can be evaluated as an
iterated integral once we understand the volume element dV.

Activity 11.8.6. To find the volume element dV in spherical coordinates, we
need to understand how to determine the volume of a spherical box of the form
$\rho_1 \le \rho \le \rho_2$ (with $\Delta\rho = \rho_2 - \rho_1$), $\phi_1 \le \phi \le \phi_2$
(with $\Delta\phi = \phi_2 - \phi_1$), and $\theta_1 \le \theta \le \theta_2$
(with $\Delta\theta = \theta_2 - \theta_1$). An illustration of such a box is given at left in
Figure 11.8.6. This spherical box is a bit more complicated than the cylindrical
box we encountered earlier. In this situation, it is easier to approximate the
volume ΔV than to compute it directly. Here we can approximate the volume
ΔV of this spherical box with the volume of a Cartesian box whose sides have
the lengths of the sides of this spherical box. In other words,

\[
\Delta V \approx |PS| \, |\overset{\frown}{PR}| \, |\overset{\frown}{PQ}|,
\]

where $|\overset{\frown}{PR}|$ denotes the length of the circular arc from P to R.

Figure 11.8.6: Left: A spherical box. Right: A spherical volume element.

a. What is the length |PS| in terms of ρ?
b. What is the length of the arc $\overset{\frown}{PR}$? (Hint: The arc
$\overset{\frown}{PR}$ is an arc of a circle of radius $\rho_2$, and arc length
along a circle is the product of the angle measure (in radians) and the
circle’s radius.)
c. What is the length of the arc $\overset{\frown}{PQ}$? (Hint: The arc
$\overset{\frown}{PQ}$ lies on a horizontal circle as illustrated at right in
Figure 11.8.6. What is the radius of this circle?)
d. Use your work in (a), (b), and (c) to determine an approximation for ∆V
in spherical coordinates.
Letting Δρ, Δφ, and Δθ go to 0, it follows from the final result in
Activity 11.8.6 that $dV = \rho^2 \sin(\phi) \, d\rho \, d\phi \, d\theta$ in spherical
coordinates, which allows us to state the following general rule.
Triple integrals in spherical coordinates.
Given a continuous function f = f(x, y, z) over a region S in $\mathbb{R}^3$, the
triple integral $\iiint_S f(x, y, z) \, dV$ is converted to the integral
\[
\iiint_S f(\rho\sin(\phi)\cos(\theta), \rho\sin(\phi)\sin(\theta), \rho\cos(\phi)) \, \rho^2 \sin(\phi) \, d\rho \, d\phi \, d\theta
\]
in spherical coordinates.
The latter expression is an iterated integral in spherical coordinates.
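
One way to remember the factor $\rho^2 \sin(\phi)$ is to multiply the three approximate side lengths of a small spherical box. The following is a sketch of that reasoning, writing ρ and φ for representative values inside the box (a simplification of the labeled box in Figure 11.8.6):

```latex
\[
\Delta V \;\approx\;
\underbrace{\Delta\rho}_{\text{radial side}}
\cdot
\underbrace{\rho\,\Delta\phi}_{\text{arc of a circle of radius } \rho}
\cdot
\underbrace{\rho\sin(\phi)\,\Delta\theta}_{\text{arc of a horizontal circle of radius } \rho\sin(\phi)}
\;=\; \rho^2 \sin(\phi)\,\Delta\rho\,\Delta\phi\,\Delta\theta .
\]
```

Each arc length is the product of a radius and an angle in radians, so the two angular sides contribute the factors ρ and ρ sin(φ).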

Finally, in order to actually evaluate an iterated integral in spherical
coordinates, we must of course determine the limits of integration in θ, φ, and ρ.
The process is similar to our earlier work in the other two coordinate systems.
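
To see the spherical rule in action, here is a small numerical sketch. It is illustrative only: the helper `spherical_integral` is hypothetical, accepts only constant limits, and uses a basic midpoint rule. The example, chosen for this sketch, integrates f(x, y, z) = z over the upper half of the unit ball (0 ≤ ρ ≤ 1, 0 ≤ φ ≤ π/2, 0 ≤ θ ≤ 2π), where the exact value is $\int_0^{2\pi}\int_0^{\pi/2}\int_0^1 \rho^3 \sin(\phi)\cos(\phi) \, d\rho \, d\phi \, d\theta = \frac{\pi}{4}$.

```python
import math

def spherical_integral(f, rho_range, phi_range, theta_range, n=40):
    """Midpoint-rule estimate of the triple integral of f(x, y, z) over a
    spherical box with constant limits, using
    dV = rho^2 sin(phi) d rho d phi d theta."""
    (p0, p1), (a0, a1), (t0, t1) = rho_range, phi_range, theta_range
    d_rho, d_phi, d_theta = (p1 - p0) / n, (a1 - a0) / n, (t1 - t0) / n
    total = 0.0
    for i in range(n):
        theta = t0 + (i + 0.5) * d_theta
        for j in range(n):
            phi = a0 + (j + 0.5) * d_phi
            for k in range(n):
                rho = p0 + (k + 0.5) * d_rho
                # Convert the sample point to Cartesian coordinates for f.
                x = rho * math.sin(phi) * math.cos(theta)
                y = rho * math.sin(phi) * math.sin(theta)
                z = rho * math.cos(phi)
                # rho^2 sin(phi) is the spherical volume element.
                total += f(x, y, z) * rho**2 * math.sin(phi) * d_rho * d_phi * d_theta
    return total

estimate = spherical_integral(lambda x, y, z: z,
                              (0.0, 1.0), (0.0, math.pi / 2), (0.0, 2 * math.pi))
exact = math.pi / 4
```

Because the limits are constants, this helper only covers "spherical boxes"; regions whose limits depend on the other variables need the limits evaluated inside the loops.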
Activity 11.8.7. We can use spherical coordinates to help us more easily
understand some natural geometric objects.
a. Recall that the sphere of radius a has spherical equation ρ = a. Set up
and evaluate an iterated integral in spherical coordinates to determine
the volume of a sphere of radius a.

b. Set up, but do not evaluate, an iterated integral expression in spherical
coordinates whose value is the mass of the solid obtained by removing the
cone $\phi = \frac{\pi}{4}$ from the sphere ρ = 2 if the density δ at the point (x, y, z)
is $\delta(x, y, z) = \sqrt{x^2 + y^2 + z^2}$. An illustration of the solid is shown in
Figure 11.8.7.

Figure 11.8.7: The solid cut from the sphere ρ = 2 by the cone $\phi = \frac{\pi}{4}$.

11.8.5 Summary

• The cylindrical coordinates of a point P are (r, θ, z) where r is the
distance from the origin to the projection of P onto the xy-plane, θ is the
angle that the projection of P onto the xy-plane makes with the positive
x-axis, and z is the vertical distance from P to the projection of P onto
the xy-plane. When P has rectangular coordinates (x, y, z), it follows
that its cylindrical coordinates are given by
\[
r^2 = x^2 + y^2, \quad \tan(\theta) = \frac{y}{x}, \quad z = z.
\]
When P has given cylindrical coordinates (r, θ, z), its rectangular
coordinates are
\[
x = r\cos(\theta), \quad y = r\sin(\theta), \quad z = z.
\]

• The volume element dV in cylindrical coordinates is dV = r dz dr dθ.
Hence, a triple integral $\iiint_S f(x, y, z) \, dV$ can be evaluated as the iterated
integral
\[
\iiint_S f(r\cos(\theta), r\sin(\theta), z) \, r \, dz \, dr \, d\theta.
\]

• The spherical coordinates of a point P in 3-space are ρ (rho), θ, and
φ (phi), where ρ is the distance from P to the origin, θ is the angle
that the projection of P onto the xy-plane makes with the positive x-axis,
and φ is the angle between the positive z axis and the vector from the
origin to P. When P has Cartesian coordinates (x, y, z), the spherical
coordinates are given by
\[
\rho^2 = x^2 + y^2 + z^2, \quad \tan(\theta) = \frac{y}{x}, \quad \cos(\phi) = \frac{z}{\rho}.
\]
Given the point P in spherical coordinates (ρ, θ, φ), its rectangular
coordinates are
\[
x = \rho\sin(\phi)\cos(\theta), \quad y = \rho\sin(\phi)\sin(\theta), \quad z = \rho\cos(\phi).
\]

• The volume element dV in spherical coordinates is
$dV = \rho^2 \sin(\phi) \, d\rho \, d\phi \, d\theta$. Thus, a triple integral
$\iiint_S f(x, y, z) \, dV$ can be evaluated as the iterated integral
\[
\iiint_S f(\rho\sin(\phi)\cos(\theta), \rho\sin(\phi)\sin(\theta), \rho\cos(\phi)) \, \rho^2 \sin(\phi) \, d\rho \, d\phi \, d\theta.
\]

Exercises
1. What are the rectangular coordinates of the point whose cylindrical
coordinates are $\left(r = 5,\ \theta = \frac{3\pi}{3},\ z = -8\right)$?
x=
y=
z=
2. What are the rectangular coordinates of the point whose spherical
coordinates are $\left(3, \frac{1}{3}\pi, \frac{3}{2}\pi\right)$?
x=
y=
z=
3. What are the cylindrical coordinates of the point whose spherical
coordinates are $\left(3, -1, \frac{2\pi}{6}\right)$?
r=
θ=
z=
4. Find an equation for the paraboloid $z = x^2 + y^2$ in spherical coordinates.
(Enter rho, phi and theta for ρ, φ and θ, respectively.)
equation:
5. Match the given equation with the verbal description of the surface:

A. Cone

B. Half plane

C. Elliptic or Circular Paraboloid

D. Plane

E. Circular Cylinder

F. Sphere

(a) r = 2 cos(θ)

(b) $\phi = \frac{\pi}{3}$

(c) $z = r^2$

(d) ρ cos(φ) = 4

(e) ρ = 2 cos(φ)

(f) ρ = 4

(g) $\theta = \frac{\pi}{3}$

(h) $r^2 + z^2 = 16$

(i) r = 4

6. Match the integrals with the type of coordinates which make them the
easiest to do. Put the letter of the coordinate system to the left of the number
of the integral.

(a) $\iiint_E z^2 \, dV$ where E is: −2 ≤ z ≤ 2, $1 \le x^2 + y^2 \le 2$

(b) $\iiint_E dV$ where E is: $x^2 + y^2 + z^2 \le 4$, x ≥ 0, y ≥ 0, z ≥ 0

(c) $\int_0^1 \int_0^{y^2} \frac{1}{x} \, dx \, dy$

(d) $\iiint_E z \, dV$ where E is: 1 ≤ x ≤ 2, 3 ≤ y ≤ 4, 5 ≤ z ≤ 6

(e) $\iint_D \frac{1}{x^2 + y^2} \, dA$ where D is: $x^2 + y^2 \le 4$

A. cartesian coordinates

B. spherical coordinates

C. cylindrical coordinates

D. polar coordinates

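7. Evaluate the integral.
\[
\int_0^3 \int_{-\sqrt{9 - x^2}}^{\sqrt{9 - x^2}} \int_{-\sqrt{9 - x^2 - z^2}}^{\sqrt{9 - x^2 - z^2}}
\frac{1}{(x^2 + y^2 + z^2)^{1/2}} \, dy \, dz \, dx = \underline{\qquad\qquad}
\]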
8. Use cylindrical coordinates to evaluate the triple integral
$\iiint_E \sqrt{x^2 + y^2} \, dV$, where E is the solid bounded by the circular
paraboloid $z = 9 - 1\,(x^2 + y^2)$ and the xy-plane.


9. Use spherical coordinates to evaluate the triple integral
$\iiint_E (x^2 + y^2 + z^2) \, dV$, where E is the ball: $x^2 + y^2 + z^2 \le 9$.
10. Find the volume of the solid enclosed by the paraboloids $z = 4(x^2 + y^2)$
and $z = 32 - 4(x^2 + y^2)$.

11. Find the volume of the ellipsoid $x^2 + y^2 + 7z^2 = 64$.


12. The density, δ, of the cylinder $x^2 + y^2 \le 9$, 0 ≤ z ≤ 3 varies with the
distance, r, from the z-axis:

$\delta = 1 + r$ g/cm$^3$.

Find the mass of the cylinder, assuming x, y, z are in cm.


mass =
(Include units.)
13. Suppose $f(x, y, z) = \frac{1}{\sqrt{x^2 + y^2 + z^2}}$ and W is the bottom half of a
sphere of radius 5. Enter ρ as rho, φ as phi, and θ as theta.
(a) As an iterated integral,
\[
\iiint_W f \, dV = \int_A^B \int_C^D \int_E^F \underline{\qquad\qquad} \, d\rho \, d\phi \, d\theta
\]
with limits of integration
A=
B=
C=
D=
E=
F=
(b) Evaluate the integral.
14. In each of the following questions, set up an iterated integral expression
whose value determines the desired result. Then, evaluate the integral first by
hand, and then using appropriate technology.

a. Find the volume of the “cap” cut from the solid sphere $x^2 + y^2 + z^2 = 4$
by the plane z = 1, as well as the z-coordinate of its centroid.

b. Find the x-coordinate of the center of mass of the portion of the unit
sphere that lies in the first octant (i.e., where x, y, and z are all
nonnegative). Assume that the density of the solid is given by
$\delta(x, y, z) = \frac{1}{1 + x^2 + y^2 + z^2}$.

c. Find the volume of the solid bounded below by the xy-plane, on the sides
by the sphere ρ = 2, and above by the cone φ = π/3.

d. Find the z coordinate of the center of mass of the region that is bounded
above by the surface $z = \sqrt{\sqrt{x^2 + y^2}}$, on the sides by the cylinder
$x^2 + y^2 = 4$, and below by the xy-plane. Assume that the density of the solid
is uniform and constant.

e. Find the volume of the solid that lies outside the sphere $x^2 + y^2 + z^2 = 1$
and inside the sphere $x^2 + y^2 + z^2 = 2z$.

15. For each of the following questions,

• sketch the region of integration,

• change the coordinate system in which the iterated integral is written to
one of the remaining two,

• evaluate the iterated integral you deem easiest to evaluate by hand.

a. $\int_0^1 \int_0^{\sqrt{1 - x^2}} \int_{\sqrt{x^2 + y^2}}^{\sqrt{2 - x^2 - y^2}} xy \, dz \, dy \, dx$

b. $\int_0^{\pi/2} \int_0^{\pi} \int_0^1 \rho^2 \sin(\phi) \, d\rho \, d\phi \, d\theta$

c. $\int_0^{2\pi} \int_0^1 \int_r^1 r^2 \cos(\theta) \, dz \, dr \, d\theta$

16. Consider the solid region S bounded above by the paraboloid
$z = 16 - x^2 - y^2$ and below by the paraboloid $z = 3x^2 + 3y^2$.

a. Describe parametrically the curve in $\mathbb{R}^3$ in which these two surfaces
intersect.
b. In terms of x and y, write an equation to describe the projection of the
curve onto the xy-plane.

c. What coordinate system do you think is most natural for an iterated
integral that gives the volume of the solid?
d. Set up, but do not evaluate, an iterated integral expression whose value
is the average z-value of points in the solid region S.
e. Use technology to plot the two surfaces and evaluate the integral in (d).
Write at least one sentence to discuss how your computations align with
your intuition about where the average z-value of the solid should fall.

11.9 Change of Variables

Motivating Questions
• What is a change of variables?
• What is the Jacobian, and how is it related to a change of variables?

In single variable calculus, we encountered the idea of a change of variable
in a definite integral through the method of substitution. For example, given
the definite integral
\[
\int_0^2 2x(x^2 + 1)^3 \, dx,
\]
we naturally consider the change of variable $u = x^2 + 1$. From this substitution,
it follows that du = 2x dx, and since x = 0 implies u = 1 and x = 2 implies
u = 5, we have transformed the original integral in x into a new integral in u.
In particular,
\[
\int_0^2 2x(x^2 + 1)^3 \, dx = \int_1^5 u^3 \, du.
\]
The latter integral, of course, is far easier to evaluate.
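
A quick numerical check can confirm that the substitution preserves the value of the integral. The sketch below is illustrative only: the helper `midpoint` is a hypothetical name for a basic midpoint-rule approximation. It estimates both integrals, and each should be close to the exact value $\frac{5^4 - 1^4}{4} = 156$.

```python
def midpoint(f, a, b, n=1000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Original integral in x, and the transformed integral in u = x^2 + 1.
in_x = midpoint(lambda x: 2 * x * (x**2 + 1)**3, 0.0, 2.0)
in_u = midpoint(lambda u: u**3, 1.0, 5.0)
exact = (5**4 - 1**4) / 4   # antiderivative u^4 / 4 evaluated from 1 to 5
```

Note that both the integrand and the limits change under the substitution; changing only one of them would produce a different number.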
Through our work with polar, cylindrical, and spherical coordinates, we
have already implicitly seen some of the issues that arise in using a change
of variables with two or three variables present. In what follows, we seek
to understand the general ideas behind any change of variables in a multiple
integral.
Preview Activity 11.9.1. Consider the double integral
\[
I = \iint_D (x^2 + y^2) \, dA, \qquad (11.9.1)
\]
where D is the upper half of the unit disk.


a. i. Write the double integral I given in Equation (11.9.1) as an iterated
integral in rectangular coordinates.
ii. Write the double integral I given in Equation (11.9.1) as an iterated
integral in polar coordinates.
b. When we write the double integral (11.9.1) as an iterated integral in
polar coordinates we make a change of variables, namely
x = r cos(θ) and y = r sin(θ). (11.9.2)

We also then have to change dA to r dr dθ. This process also identifies a
“polar rectangle” $[r_1, r_2] \times [\theta_1, \theta_2]$ with the original Cartesian rectangle,
under the transformation¹ in Equation (11.9.2). The vertices of the polar
rectangle are transformed into the vertices of a closed and bounded region
in rectangular coordinates.
To work with a numerical example, let’s now consider the polar rectangle
P given by $[1, 2] \times [\frac{\pi}{6}, \frac{\pi}{4}]$, so that $r_1 = 1$, $r_2 = 2$,
$\theta_1 = \frac{\pi}{6}$, and $\theta_2 = \frac{\pi}{4}$.
¹ A transformation is another name for function: here, the equations x = r cos(θ) and
y = r sin(θ) define a function T by T(r, θ) = (r cos(θ), r sin(θ)) so that T is a function
(transformation) from $\mathbb{R}^2$ to $\mathbb{R}^2$. We view this transformation as mapping a version of the
xy-plane where the axes are viewed as representing r and θ (the rθ-plane) to the familiar
xy-plane.

i. Use the transformation determined by the equations in (11.9.2) to
find the rectangular vertices that correspond to the polar vertices in
the polar rectangle P . In other words, by substituting appropriate
values of r and θ into the two equations in (11.9.2), find the val-
ues of the corresponding x and y coordinates for the vertices of the
polar rectangle P . Label the point that corresponds to the polar
vertex (r1 , θ1 ) as (x1 , y1 ), the point corresponding to the polar ver-
tex (r2 , θ1 ) as (x2 , y2 ), the point corresponding to the polar vertex
(r1 , θ2 ) as (x3 , y3 ), and the point corresponding to the polar vertex
(r2 , θ2 ) as (x4 , y4 ).
ii. Draw a picture of the figure in rectangular coordinates that has
the points (x1 , y1 ), (x2 , y2 ), (x3 , y3 ), and (x4 , y4 ) as vertices. (Note
carefully that because of the trigonometric functions in the transfor-
mation, this region will not look like a Cartesian rectangle.) What
is the area of this region in rectangular coordinates? How does this
area compare to the area of the original polar rectangle?

11.9.1 Change of Variables in Polar Coordinates


The general idea behind a change of variables is suggested by Preview
Activity 11.9.1. There, we saw that in a change of variables from rectangular
coordinates to polar coordinates, a polar rectangle $[r_1, r_2] \times [\theta_1, \theta_2]$ gets
mapped to a region in the xy-plane under the transformation

x = r cos(θ) and y = r sin(θ).

The vertices of the polar rectangle P are transformed into the vertices of
a closed and bounded region P′ in rectangular coordinates. If we view the
standard coordinate system as having the horizontal axis represent r and the
vertical axis represent θ, then the polar rectangle P appears to us at left in
Figure 11.9.1. The image P′ of the polar rectangle P under the transformation
given by (11.9.2) is shown at right in Figure 11.9.1. We thus see that there is a
correspondence between a simple region (a traditional, right-angled rectangle)
and a more complicated region (a fraction of an annulus) under the function
T given by T(r, θ) = (r cos(θ), r sin(θ)).

Figure 11.9.1: A rectangle P and its image P′.
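
The correspondence between a polar rectangle and its image can be explored with a short Python sketch. The names `T` and `image_of_vertices` and the sample rectangle [1, 3] × [0, π/2] are assumptions made for this illustration; they are not the rectangle used in the preview activity.

```python
import math

def T(r, theta):
    """The polar transformation T(r, theta) = (r cos(theta), r sin(theta))."""
    return r * math.cos(theta), r * math.sin(theta)

def image_of_vertices(r1, r2, theta1, theta2):
    """Apply T to the four corners of the polar rectangle
    [r1, r2] x [theta1, theta2], in the order
    (r1, theta1), (r2, theta1), (r1, theta2), (r2, theta2)."""
    corners = [(r1, theta1), (r2, theta1), (r1, theta2), (r2, theta2)]
    return [T(r, theta) for r, theta in corners]

# Hypothetical polar rectangle [1, 3] x [0, pi/2].
vertices = image_of_vertices(1.0, 3.0, 0.0, math.pi / 2)
```

For this sample rectangle, two image vertices land on the positive x-axis and two on the positive y-axis, so the image is a quarter of an annulus rather than a rectangle.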

Furthermore, as Preview Activity 11.9.1 suggests, it follows generally that
for an original polar rectangle P = [r1 , r2 ] × [θ1 , θ2 ], the area of the transformed
rectangle P′ is given by $\frac{r_1 + r_2}{2} \, \Delta r \, \Delta\theta$. Therefore, as ∆r and ∆θ go to 0 this
area becomes the familiar area element dA = r dr dθ in polar coordinates.
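The area claim above can be checked numerically. The following Python sketch compares the exact area of a polar rectangle (a difference of two circular sectors) with the expression (r1 + r2)/2 · ∆r ∆θ, and then shrinks the rectangle to watch the ratio of area to ∆r ∆θ approach r. The function names and sample values are illustrative choices, not part of the text.

```python
import math

def sector_area(r1, r2, theta1, theta2):
    """Exact area of the polar rectangle [r1, r2] x [theta1, theta2],
    computed as a difference of two circular sectors."""
    return 0.5 * (theta2 - theta1) * (r2**2 - r1**2)

def average_radius_area(r1, r2, theta1, theta2):
    """The expression from the text: (r1 + r2)/2 * delta_r * delta_theta."""
    return 0.5 * (r1 + r2) * (r2 - r1) * (theta2 - theta1)

# Sample polar rectangle (illustrative values).
r1, r2, theta1, theta2 = 1.0, 2.0, math.pi / 6, math.pi / 3
exact = sector_area(r1, r2, theta1, theta2)
formula = average_radius_area(r1, r2, theta1, theta2)
print(exact, formula)  # the two values agree up to rounding

# Shrink the rectangle with corner at r = 1.5:
# area / (delta_r * delta_theta) tends to r.
r = 1.5
for h in (0.1, 0.01, 0.001):
    ratio = sector_area(r, r + h, 0.0, h) / (h * h)
    print(h, ratio)
```

The agreement is exact because r2² − r1² factors as (r2 + r1)(r2 − r1); the shrinking ratios illustrate why the factor r appears in dA = r dr dθ.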


When we proceed to working with other transformations for different changes
in coordinates, we have to understand how the transformation affects area so
that we may use the correct area element in the new system of variables.

11.9.2 General Change of Coordinates


We first focus on double integrals. As with single integrals, we may be able to
simplify a double integral of the form
\[ \iint_D f(x, y) \, dA \]

by making a change of variables (that is, a substitution) of the form

x = x(s, t) and y = y(s, t)

where x and y are functions of new variables s and t. This transformation
introduces a correspondence between a problem in the xy-plane and one in
the st-plane. The equations x = x(s, t) and y = y(s, t) convert s and t to x
and y; we call these formulas the change of variable formulas. To complete the
change to the new s, t variables, we need to understand the area element, dA,
in this new system. The following activity helps to illustrate the idea.

Activity 11.9.2. Consider the change of variables



x = s + 2t and y = 2s + √t.

Let’s see what happens to the rectangle T = [0, 1] × [1, 4] in the st-plane
under this change of variable.

a. Draw a labeled picture of T in the st-plane.

b. Find the image of the st-vertex (0, 1) in the xy-plane. Likewise, find the
respective images of the other three vertices of the rectangle T : (0, 4),
(1, 1), and (1, 4).

c. In the xy-plane, draw a labeled picture of the image, T′, of the original
st-rectangle T . What appears to be the shape of the image, T′?

d. To transform an integral with a change of variables, we need to determine
the area element dA for the image of the transformed rectangle. Note that
T′ is not exactly a parallelogram since the equations that define the
transformation are not linear. But we can approximate the area of T′
with the area of a parallelogram. How would we find the area of a
parallelogram that approximates the area of the xy-figure T′? (Hint:
Remember what the cross product of two vectors tells us.)

Activity 11.9.2 presents the general idea of how a change of variables works.
We partition a rectangular domain in the st system into subrectangles. Let
T = [a, a + ∆s] × [b, b + ∆t] be one of these subrectangles. Then we transform this
into a region T′ in the standard xy Cartesian coordinate system. The region T′
is called the image of T ; the region T is the pre-image of T′. Although the sides
of this xy region T′ aren’t necessarily straight (linear), we will approximate the
element of area dA for this region with the area of the parallelogram whose sides
are given by the vectors v and w, where v is the vector from (x(a, b), y(a, b))
to (x(a + ∆s, b), y(a + ∆s, b)), and w is the vector from (x(a, b), y(a, b)) to
(x(a, b + ∆t), y(a, b + ∆t)).
An example of an image T′ in the xy-plane that results from a transformation
of a rectangle T in the st-plane is shown in Figure 11.9.2.

Figure 11.9.2: Approximating an area of an image resulting from a transformation. (Left: the rectangle T in the st-plane. Right: its image T′ in the xy-plane, with the vectors v and w along two of its edges.)

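The components of the vector v are

\[ \mathbf{v} = \langle x(a + \Delta s, b) - x(a, b), \; y(a + \Delta s, b) - y(a, b), \; 0 \rangle \]

and similarly those for w are

\[ \mathbf{w} = \langle x(a, b + \Delta t) - x(a, b), \; y(a, b + \Delta t) - y(a, b), \; 0 \rangle . \]

Slightly rewriting v and w, we have

\[ \mathbf{v} = \left\langle \frac{x(a + \Delta s, b) - x(a, b)}{\Delta s}, \; \frac{y(a + \Delta s, b) - y(a, b)}{\Delta s}, \; 0 \right\rangle \Delta s, \quad \text{and} \]

\[ \mathbf{w} = \left\langle \frac{x(a, b + \Delta t) - x(a, b)}{\Delta t}, \; \frac{y(a, b + \Delta t) - y(a, b)}{\Delta t}, \; 0 \right\rangle \Delta t. \]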
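For small ∆s and ∆t, the definition of the partial derivative tells us that

\[ \mathbf{v} \approx \left\langle \frac{\partial x}{\partial s}(a, b), \; \frac{\partial y}{\partial s}(a, b), \; 0 \right\rangle \Delta s \quad \text{and} \quad \mathbf{w} \approx \left\langle \frac{\partial x}{\partial t}(a, b), \; \frac{\partial y}{\partial t}(a, b), \; 0 \right\rangle \Delta t. \]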
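Recall that the area of the parallelogram with sides v and w is the length
of the cross product of the two vectors, |v × w|. From this, we observe that

\begin{align*}
\mathbf{v} \times \mathbf{w} &\approx \left\langle \frac{\partial x}{\partial s}(a, b), \; \frac{\partial y}{\partial s}(a, b), \; 0 \right\rangle \Delta s \times \left\langle \frac{\partial x}{\partial t}(a, b), \; \frac{\partial y}{\partial t}(a, b), \; 0 \right\rangle \Delta t \\
&= \left\langle 0, \; 0, \; \frac{\partial x}{\partial s}(a, b) \frac{\partial y}{\partial t}(a, b) - \frac{\partial x}{\partial t}(a, b) \frac{\partial y}{\partial s}(a, b) \right\rangle \Delta s \, \Delta t.
\end{align*}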
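Finally, by computing the magnitude of the cross product, we see that

\begin{align*}
|\mathbf{v} \times \mathbf{w}| &\approx \left| \left\langle 0, \; 0, \; \frac{\partial x}{\partial s}(a, b) \frac{\partial y}{\partial t}(a, b) - \frac{\partial x}{\partial t}(a, b) \frac{\partial y}{\partial s}(a, b) \right\rangle \Delta s \, \Delta t \right| \\
&= \left| \frac{\partial x}{\partial s}(a, b) \frac{\partial y}{\partial t}(a, b) - \frac{\partial x}{\partial t}(a, b) \frac{\partial y}{\partial s}(a, b) \right| \Delta s \, \Delta t.
\end{align*}

Therefore, as the number of subdivisions increases without bound in each
direction, ∆s and ∆t both go to zero, and we have

\[ dA = \left| \frac{\partial x}{\partial s} \frac{\partial y}{\partial t} - \frac{\partial x}{\partial t} \frac{\partial y}{\partial s} \right| ds \, dt. \tag{11.9.3} \]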

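Equation (11.9.3) hence determines the general change of variable formula
in a double integral, and we can now say that

\[ \iint_{T'} f(x, y) \, dy \, dx = \iint_{T} f(x(s, t), y(s, t)) \left| \frac{\partial x}{\partial s} \frac{\partial y}{\partial t} - \frac{\partial x}{\partial t} \frac{\partial y}{\partial s} \right| ds \, dt. \]

The quantity

\[ \frac{\partial x}{\partial s} \frac{\partial y}{\partial t} - \frac{\partial x}{\partial t} \frac{\partial y}{\partial s} \]

is called the Jacobian, and we denote the Jacobian using the shorthand notation

\[ \frac{\partial(x, y)}{\partial(s, t)} = \frac{\partial x}{\partial s} \frac{\partial y}{\partial t} - \frac{\partial x}{\partial t} \frac{\partial y}{\partial s}. \]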
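Recall from Section 9.4 that we can also write this Jacobian as the determinant
of the 2 × 2 matrix

\[ \begin{bmatrix} \dfrac{\partial x}{\partial s} & \dfrac{\partial x}{\partial t} \\[1ex] \dfrac{\partial y}{\partial s} & \dfrac{\partial y}{\partial t} \end{bmatrix}. \]

Note that, as discussed in Section 9.4, the absolute value of this determinant
is the area of the parallelogram determined by the columns of the matrix.
Since the vectors v and w are approximately these columns scaled by ∆s and ∆t,
respectively, the area element dA in xy-coordinates is also
represented by the area element $\left| \frac{\partial(x, y)}{\partial(s, t)} \right| ds \, dt$ in st-coordinates, and
$\left| \frac{\partial(x, y)}{\partial(s, t)} \right|$ is the factor by which the transformation magnifies area.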
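The parallelogram approximation can be tried numerically. The sketch below uses a hypothetical nonlinear change of variables, x = s² − t² and y = 2st (chosen only for illustration; it does not appear in the text), builds the edge vectors v and w at a base point (a, b), and compares |v × w| / (∆s ∆t) with the absolute value of the Jacobian computed by hand.

```python
def transform(s, t):
    """A hypothetical nonlinear change of variables (not from the text):
    x = s^2 - t^2, y = 2st."""
    return s * s - t * t, 2.0 * s * t

def jacobian_exact(s, t):
    """(dx/ds)(dy/dt) - (dx/dt)(dy/ds) for the map above, worked by hand."""
    return (2.0 * s) * (2.0 * s) - (-2.0 * t) * (2.0 * t)

def parallelogram_area(a, b, ds, dt):
    """Area |v x w| of the parallelogram spanned by the edge vectors
    v (a step of ds in s) and w (a step of dt in t) at the image of (a, b)."""
    x0, y0 = transform(a, b)
    x1, y1 = transform(a + ds, b)
    x2, y2 = transform(a, b + dt)
    v = (x1 - x0, y1 - y0)
    w = (x2 - x0, y2 - y0)
    # For vectors in the plane, only the third component of v x w is nonzero.
    return abs(v[0] * w[1] - v[1] * w[0])

a, b = 1.0, 2.0
for h in (0.1, 0.01, 0.001):
    scale = parallelogram_area(a, b, h, h) / (h * h)
    print(h, scale, abs(jacobian_exact(a, b)))
```

As the step h shrinks, the printed scale factor settles toward the Jacobian value at (1, 2), which is the sense in which the Jacobian measures local area magnification.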
To summarize, the preceding change of variable formula that we have de-
rived now follows.
Change of Variables in a Double Integral.
Suppose a change of variables x = x(s, t) and y = y(s, t) transforms a
closed and bounded region R in the st-plane into a closed and bounded
region R′ in the xy-plane. Under modest conditions (that are studied
in advanced calculus), it follows that

\[ \iint_{R'} f(x, y) \, dA = \iint_{R} f(x(s, t), y(s, t)) \left| \frac{\partial(x, y)}{\partial(s, t)} \right| ds \, dt. \]
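A small numerical experiment can make the formula concrete. The sketch below assumes a hypothetical linear change of variables, x = 2s + t and y = s + 3t, on R = [0, 1] × [0, 1], with the illustrative integrand f(x, y) = x + y (none of these appear in the text). It evaluates the st-integral with a midpoint Riemann sum and compares it with an independent value of the xy-integral, using the fact that the integral of a linear function over a parallelogram equals the area times the function's value at the centroid.

```python
def x_of(s, t):
    return 2.0 * s + t      # hypothetical linear change of variables

def y_of(s, t):
    return s + 3.0 * t

# (dx/ds)(dy/dt) - (dx/dt)(dy/ds) for the map above.
JACOBIAN = 2.0 * 3.0 - 1.0 * 1.0

def f(x, y):
    return x + y            # illustrative integrand

def st_integral(n=200):
    """Midpoint Riemann sum of f(x(s,t), y(s,t)) |J| over R = [0,1] x [0,1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        for j in range(n):
            t = (j + 0.5) * h
            total += f(x_of(s, t), y_of(s, t)) * abs(JACOBIAN) * h * h
    return total

def xy_integral_geometric():
    """Independent value of the xy-integral over the image parallelogram R':
    area(R') times f at the centroid (valid because f is linear)."""
    corners = [(x_of(s, t), y_of(s, t))
               for s, t in ((0, 0), (1, 0), (1, 1), (0, 1))]
    # Shoelace formula for the area of the polygon with these vertices.
    twice_area = 0.0
    for k in range(4):
        xa, ya = corners[k]
        xb, yb = corners[(k + 1) % 4]
        twice_area += xa * yb - xb * ya
    area = abs(twice_area) / 2.0
    cx = sum(p[0] for p in corners) / 4.0
    cy = sum(p[1] for p in corners) / 4.0
    return area * f(cx, cy)

print(st_integral(), xy_integral_geometric())
```

Dropping the factor `abs(JACOBIAN)` from the Riemann sum makes the two values disagree by exactly that factor, which is a quick way to see why the Jacobian cannot be omitted.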

Activity 11.9.3. Find the Jacobian when changing from rectangular to polar
coordinates. That is, for the transformation given by x = r cos(θ), y = r sin(θ),
determine a simplified expression for the quantity
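\[ \frac{\partial x}{\partial r} \frac{\partial y}{\partial \theta} - \frac{\partial x}{\partial \theta} \frac{\partial y}{\partial r}. \]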
What do you observe about your result? How is this connected to our
earlier work with double integrals in polar coordinates?
Activity 11.9.4. Let D′ be the region in the xy-plane bounded by the lines
y = 0, x = 0, and x + y = 1. We will evaluate the double integral

\[ \iint_{D'} \sqrt{x + y} \, (x - y)^2 \, dA \tag{11.9.4} \]

with a change of variables.

a. Sketch the region D′ in the xy-plane.


b. We would like to make a substitution that makes the integrand easier to
antidifferentiate. Let s = x + y and t = x − y. Explain why this should
make antidifferentiation easier by making the corresponding substitutions
and writing the new integrand in terms of s and t.

c. Solve the equations s = x + y and t = x − y for x and y. (Doing so
determines the standard form of the transformation, since we will have
x as a function of s and t, and y as a function of s and t.)

d. To actually execute this change of variables, we need to know the st-
region D that corresponds to the xy-region D′.
i. What st equation corresponds to the xy equation x + y = 1?
ii. What st equation corresponds to the xy equation x = 0?
iii. What st equation corresponds to the xy equation y = 0?
iv. Sketch the st region D that corresponds to the xy domain D′.

e. Make the change of variables indicated by s = x + y and t = x − y in
the double integral (11.9.4) and set up an iterated integral in st variables
whose value is the original given double integral. Finally, evaluate the
iterated integral.

11.9.3 Change of Variables in a Triple Integral


The argument for the change of variable formula for triple integrals is
complicated, and we will not go into the details. The general process, though, is
the same as the two-dimensional case. Given a solid S′ in the xyz-coordinate
system in R3 , a change of variables transformation x = x(s, t, u), y = y(s, t, u),
and z = z(s, t, u) identifies S′ with a region S in stu-coordinates. Any
function f = f (x, y, z) defined on S′ can be considered as a function f =
f (x(s, t, u), y(s, t, u), z(s, t, u)) in stu-coordinates defined on S. The volume
element dV in xyz-coordinates corresponds to a scaled volume element in
stu-coordinates, where the scale factor is given by the absolute value of the
Jacobian, $\frac{\partial(x, y, z)}{\partial(s, t, u)}$, which is the determinant of the 3 × 3 matrix

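\[ \begin{bmatrix} \dfrac{\partial x}{\partial s} & \dfrac{\partial x}{\partial t} & \dfrac{\partial x}{\partial u} \\[1ex] \dfrac{\partial y}{\partial s} & \dfrac{\partial y}{\partial t} & \dfrac{\partial y}{\partial u} \\[1ex] \dfrac{\partial z}{\partial s} & \dfrac{\partial z}{\partial t} & \dfrac{\partial z}{\partial u} \end{bmatrix}. \]

(Recall that this determinant was introduced in Section 9.4.) That is, $\frac{\partial(x, y, z)}{\partial(s, t, u)}$
is given by

\[ \frac{\partial x}{\partial s} \left( \frac{\partial y}{\partial t} \frac{\partial z}{\partial u} - \frac{\partial y}{\partial u} \frac{\partial z}{\partial t} \right) - \frac{\partial x}{\partial t} \left( \frac{\partial y}{\partial s} \frac{\partial z}{\partial u} - \frac{\partial y}{\partial u} \frac{\partial z}{\partial s} \right) + \frac{\partial x}{\partial u} \left( \frac{\partial y}{\partial s} \frac{\partial z}{\partial t} - \frac{\partial y}{\partial t} \frac{\partial z}{\partial s} \right). \]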
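The expansion above translates directly into a few lines of code. The sketch below evaluates the determinant along the first row, term by term, for a hypothetical linear transformation x = s + 2t, y = t + 3u, z = 4s + u (an illustrative choice, not one from the text), whose partial derivatives are constants.

```python
def jacobian_3x3(m):
    """Determinant of a 3 x 3 matrix of partial derivatives, expanded along
    the first row. Rows are (x, y, z); columns are derivatives in (s, t, u)."""
    (xs, xt, xu), (ys, yt, yu), (zs, zt, zu) = m
    return (xs * (yt * zu - yu * zt)
            - xt * (ys * zu - yu * zs)
            + xu * (ys * zt - yt * zs))

# Hypothetical linear map: x = s + 2t, y = t + 3u, z = 4s + u.
partials = [
    [1.0, 2.0, 0.0],   # dx/ds, dx/dt, dx/du
    [0.0, 1.0, 3.0],   # dy/ds, dy/dt, dy/du
    [4.0, 0.0, 1.0],   # dz/ds, dz/dt, dz/du
]
print(jacobian_3x3(partials))  # 25.0
```

For this map, a unit cube in stu-coordinates corresponds to a solid of volume 25 in xyz-coordinates, since the absolute value of the Jacobian is the volume scale factor.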

To summarize,
Change of Variables in a Triple Integral.
Suppose a change of variables x = x(s, t, u), y = y(s, t, u), and z =
z(s, t, u) transforms a closed and bounded region S in stu-coordinates
into a closed and bounded region S′ in xyz-coordinates. Under modest
conditions (that are studied in advanced calculus), the triple integral
$\iiint_{S'} f(x, y, z) \, dV$ is equal to

\[ \iiint_{S} f(x(s, t, u), y(s, t, u), z(s, t, u)) \left| \frac{\partial(x, y, z)}{\partial(s, t, u)} \right| ds \, dt \, du. \]

Activity 11.9.5. Find the Jacobian when changing from Cartesian to cylindrical
coordinates. That is, for the transformation given by x = r cos(θ),
y = r sin(θ), and z = z, determine a simplified expression for the quantity

\[ \frac{\partial(x, y, z)}{\partial(r, \theta, z)}. \]

What do you observe about your result? How is this connected to our
earlier work with triple integrals in cylindrical coordinates?

Activity 11.9.6. Consider the solid S′ defined by the inequalities 0 ≤ x ≤ 2,
x/2 ≤ y ≤ x/2 + 1, and 0 ≤ z ≤ 6. Consider the transformation defined by s = x/2,
t = (x − 2y)/2, and u = z/3. Let f (x, y, z) = x − 2y + z.

a. The transformation turns the solid S′ in xyz-coordinates into a box S in
stu-coordinates. Apply the transformation to the boundaries of the solid
S′ to find stu-coordinate descriptions of the box S.

b. Find the Jacobian $\frac{\partial(x, y, z)}{\partial(s, t, u)}$.

c. Use the transformation to perform a change of variables and evaluate
$\iiint_{S'} f(x, y, z) \, dV$ by evaluating

\[ \iiint_{S} f(x(s, t, u), y(s, t, u), z(s, t, u)) \left| \frac{\partial(x, y, z)}{\partial(s, t, u)} \right| ds \, dt \, du. \]

11.9.4 Summary

• If an integral is described in terms of one set of variables, we may write
that set of variables in terms of another set of the same number of
variables. If the new variables are chosen appropriately, the transformed
integral may be easier to evaluate.

• The Jacobian is a scalar function that relates the area or volume element
in one coordinate system to the corresponding element in a new system
determined by a change of variables.

Exercises

1. Find the absolute value of the Jacobian, $\left| \frac{\partial(x, y)}{\partial(s, t)} \right|$, for the change of
variables given by x = 7s + 5t, y = 8s + 6t.
$\left| \frac{\partial(x, y)}{\partial(s, t)} \right| =$

2. Find the Jacobian $\frac{\partial(x, y, z)}{\partial(s, t, u)}$, where x = 4s − 3t − 2u, y = −(2s + t + 4u),
z = 4t − 4s + 4u.
$\frac{\partial(x, y, z)}{\partial(s, t, u)} =$

3. Consider the transformation T : $x = \frac{14}{50} u - \frac{48}{50} v$, $y = \frac{48}{50} u + \frac{14}{50} v$.
A. Compute the Jacobian:
$\frac{\partial(x, y)}{\partial(u, v)} =$
B. The transformation is linear, which implies that it transforms lines into
lines. Thus, it transforms the square S : −50 ≤ u ≤ 50, −50 ≤ v ≤ 50 into a
square T (S) with vertices:
T(50, 50) = ( , )
T(-50, 50) = ( , )
T(-50, -50) = ( , )
T(50, -50) = ( , )
C. Use the transformation T to evaluate the integral $\iint_{T(S)} x^2 + y^2 \, dA$.

4. Use the change of variables s = y, t = y − x² to evaluate $\iint_R x \, dx \, dy$
over the region R in the first quadrant bounded by y = 0, y = 16, y = x², and
y = x² − 2.
$\iint_R x \, dx \, dy =$
5. Use the change of variables s = x + y, t = y to find the area of the ellipse
x² + 2xy + 2y² ≤ 1.
area =
6. Use the change of variables s = xy, t = xy² to compute $\int_R xy^2 \, dA$, where
R is the region bounded by xy = 2, xy = 7, xy² = 2, xy² = 7.
$\int_R xy^2 \, dA =$
7. Find positive numbers a and b so that the change of variables s = ax, t =
by transforms the integral $\iint_R dx \, dy$ into

\[ \iint_T \left| \frac{\partial(x, y)}{\partial(s, t)} \right| ds \, dt \]

for the region R, the rectangle 0 ≤ x ≤ 40, 0 ≤ y ≤ 45 and the region T , the
square 0 ≤ s, t ≤ 1.
a=
b=
What is $\left| \frac{\partial(x, y)}{\partial(s, t)} \right|$ in this case?
$\left| \frac{\partial(x, y)}{\partial(s, t)} \right| =$
8. Find a number a so that the change of variables s = x + ay, t = y
transforms the integral $\iint_R dx \, dy$ over the parallelogram R in the xy-plane
with vertices (0, 0), (22, 0), (−24, 14), (−2, 14) into an integral

\[ \iint_T \left| \frac{\partial(x, y)}{\partial(s, t)} \right| ds \, dt \]

over a rectangle T in the st-plane.
a=
What is $\left| \frac{\partial(x, y)}{\partial(s, t)} \right|$ in this case?
$\left| \frac{\partial(x, y)}{\partial(s, t)} \right| =$
9. In this problem we use the change of variables x = 4s + t, y = s − t to
compute the integral $\int_R (x + y) \, dA$, where R is the parallelogram with vertices
(x, y) = (0, 0), (8, 2), (10, 0), and (2, −2).
First find the magnitude of the Jacobian, $\left| \frac{\partial(x, y)}{\partial(s, t)} \right| =$ .
Then, with a = , b = ,
c = , and d = ,

\[ \int_R (x + y) \, dA = \int_a^b \int_c^d \left( \underline{\qquad} \, s + \underline{\qquad} \, t + \underline{\qquad} \right) dt \, ds = \underline{\qquad} \]

10. Let D′ be the region in the xy-plane that is the parallelogram with
vertices (3, 3), (4, 5), (5, 4), and (6, 6).
a. Sketch and label the region D′ in the xy-plane.
b. Consider the integral $\iint_{D'} (x + y) \, dA$. Explain why this integral would
be difficult to set up as an iterated integral.
c. Let a change of variables be given by x = 2u + v, y = u + 2v. Using
substitution or elimination, solve this system of equations for u and v in
terms of x and y.
d. Use your work in (c) to find the pre-image, D, which lies in the uv-plane,
of the originally given region D′, which lies in the xy-plane. For instance,
what uv point corresponds to (3, 3) in the xy-plane?
e. Use the change of variables in (c) and your other work to write a new
iterated integral in u and v that is equivalent to the original xy integral
$\iint_{D'} (x + y) \, dA$.
f. Finally, evaluate the uv integral, and write a sentence to explain why the
change of variables made the integration easier.
11. Consider the change of variables

\[ x(\rho, \theta, \phi) = \rho \sin(\phi) \cos(\theta), \quad y(\rho, \theta, \phi) = \rho \sin(\phi) \sin(\theta), \quad z(\rho, \theta, \phi) = \rho \cos(\phi), \]

which is the transformation from spherical coordinates to rectangular
coordinates. Determine the Jacobian of the transformation. How is the result
connected to our earlier work with iterated integrals in spherical coordinates?
12. In this problem, our goal is to find the volume of the ellipsoid
$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$.
a. Set up an iterated integral in rectangular coordinates whose value is the
volume of the ellipsoid. Do so by using symmetry and taking 8 times
the volume of the ellipsoid in the first octant where x, y, and z are all
nonnegative.
b. Explain why it makes sense to use the substitution x = as, y = bt, and
z = cu in order to make the region of integration simpler.
c. Compute the Jacobian of the transformation given in (b).
d. Execute the given change of variables and set up the corresponding new
iterated integral in s, t, and u.
e. Explain why this new integral is better, but is still difficult to evaluate.
What additional change of variables would make the resulting integral
easier to evaluate?
f. Convert the integral from (d) to a new integral in spherical coordinates.
g. Finally, evaluate the iterated integral in (f) and hence determine the
volume of the ellipsoid.