Proof of The Second-Derivative Test

18.024 SPRING OF 2008

SD. SECOND-DERIVATIVE TEST FOR EXTREMA
OF FUNCTIONS OF TWO VARIABLES
Proof of the second-derivative test. Our goal is to derive the second-derivative test, which
determines the nature of a critical point of a function of two variables, that is, whether a critical
point is a local minimum, a local maximum, a saddle point, or none of these. In general, for a
function of $n$ variables, the nature of a critical point is determined by the algebraic sign of a
certain quadratic form, which in turn is determined by the eigenvalues of the Hessian matrix
[Apo, Section 9.11]. This approach, however, relies on results about eigenvalues, which may take
several lectures to develop fully. Here we focus on the simpler setting $n = 2$ and derive a test
that uses the algebraic signs of the second derivatives of the function.
The statement of the test is in [Apo, Theorem 9.7].
Theorem 1 (The second-derivative test). Let the scalar field $f(x_1, x_2)$ have continuous second
derivatives in an open ball containing $a = (a_1, a_2)$. Suppose that $D_1 f(a) = D_2 f(a) = 0$. Let
$A = D_{11} f(a)$, $B = D_{12} f(a) = D_{21} f(a)$ and $C = D_{22} f(a)$. Let
$$\Delta = \det \begin{pmatrix} A & B \\ B & C \end{pmatrix} = AC - B^2.$$
Then, we have
(a) If $\Delta < 0$, then $a$ is a saddle point.
(b) If $\Delta > 0$ and $A > 0$, then $f(a)$ is a local minimum.
(c) If $\Delta > 0$ and $A < 0$, then $f(a)$ is a local maximum.
(d) If $\Delta = 0$, then the test is inconclusive.
The proof uses the second-order Taylor formula, which we state for general scalar fields.
Theorem 2 (Second-order Taylor formula). Let $f$ be a scalar field with continuous second-order
partial derivatives $D_{ij} f$ in an open ball $B(a)$. Then, for all $y \in \mathbb{R}^n$ such that
$a + y \in B(a)$ we have
$$f(a + y) - f(a) = \nabla f(a) \cdot y + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} D_{ij} f(a + \theta y)\, y_i y_j,$$
where $y = (y_1, \ldots, y_n)$ and $0 < \theta < 1$.

The statement and its proof are in [Apo, Section 9.10], and hence they are omitted. The coefficients
of the quadratic form in the above expansion are the second-order partial derivatives. The $n \times n$
matrix of second-order derivatives $D_{ij} f(a)$ is often called the Hessian matrix.
In our present setting, the above Taylor expansion leads to
$$f(a + ty) = f(a) + t\,(D_1 f(a) h + D_2 f(a) k) + \frac{t^2}{2}\,(D_{11} f(a^*) h^2 + 2 D_{12} f(a^*) hk + D_{22} f(a^*) k^2), \tag{*}$$
where $y = (h, k)$ and $a^* = a + \theta y$ for some $0 < \theta < t$. Therefore,
$$\frac{2}{t^2}\,(f(a + ty) - f(a)) = (Ah^2 + 2Bhk + Ck^2) + ((D_{11} f(a^*) - A) h^2 + 2 (D_{12} f(a^*) - B) hk + (D_{22} f(a^*) - C) k^2) \tag{**}$$
for $t$ small enough. This uses the fact that $D_1 f(a) = D_2 f(a) = 0$. Roughly speaking, it states that
$$\frac{2}{t^2}\,(f(a + ty) - f(a)) \sim Ah^2 + 2Bhk + Ck^2,$$
as the last three terms in (**) are small when $a^*$ is close to $a$, because the second-order partial
derivatives are continuous.
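The approximation above can be observed numerically. The following is a minimal sketch, not part of the proof: the sample function $f(x, y) = x \sin y$, the direction $y = (\cos 0.7, \sin 0.7)$, and the helper names `scaled_increment` and `Q` are all assumptions chosen for illustration. At the origin this sample function has $A = 0$, $B = 1$, $C = 0$.

```python
import math

def f(x, y):
    # Sample function (an assumption for illustration): f(x, y) = x * sin(y).
    # The origin is a critical point with A = 0, B = 1, C = 0.
    return x * math.sin(y)

def scaled_increment(f, a, y, t):
    """Return (2 / t^2) * (f(a + t*y) - f(a)), the left side of (**)."""
    ax, ay = a
    h, k = y
    return 2.0 * (f(ax + t * h, ay + t * k) - f(ax, ay)) / t**2

def Q(A, B, C, h, k):
    """Quadratic form A*h^2 + 2*B*h*k + C*k^2."""
    return A * h * h + 2 * B * h * k + C * k * k

a = (0.0, 0.0)
y = (math.cos(0.7), math.sin(0.7))   # an arbitrary unit vector
target = Q(0.0, 1.0, 0.0, *y)

# As t shrinks, the scaled increment should approach the quadratic form.
for t in (1e-1, 1e-2, 1e-3):
    print(t, scaled_increment(f, a, y, t), target)
```

The printed values in the second column should move toward the third column as $t$ decreases, which is what (**) predicts when the second derivatives are continuous.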
We are concerned with the sign of the left side of (**) when $t$ is small. Consider the quadratic
function
$$Q(h, k) = Ah^2 + 2Bhk + Ck^2,$$
where $(h, k)$ is a unit vector in $\mathbb{R}^2$.
Case 1: If $\Delta = AC - B^2 < 0$, then we claim that $Q$ takes both positive and negative values.
Indeed, if $A \neq 0$ then the quadratic equation $Ax^2 + 2Bx + C = 0$ has two distinct real roots, and
the graph of $y = Ax^2 + 2Bx + C$ is a parabola which crosses the $x$-axis at two distinct
points; otherwise, if $A = 0$, then $B \neq 0$ must hold, and $y = Ax^2 + 2Bx + C$ is a straight
line with non-zero slope. In either case, $Ax^2 + 2Bx + C$ takes both positive and negative values.
Since $Q(h, k) = k^2 (Ax^2 + 2Bx + C)$ with $x = h/k$ whenever $k \neq 0$, so does $Q$.
This proves the claim.
Case 2: If $\Delta = AC - B^2 > 0$, then we claim that $Q$ takes only one sign. Since $(h, k)$ is a
unit vector, let $(h, k) = (\cos t, \sin t)$, where $t \in [0, 2\pi]$. Then $Q(t) = Q(\cos t, \sin t)$ is a
continuous function of $t$. We will show that $Q(t)$ takes only one sign. First, $Q(0) = A \neq 0$ has
a definite sign. Suppose now that there is a $t_0 \in [0, 2\pi]$ such that $Q(t_0) = 0$. That is,
$Q(\cos t_0, \sin t_0) = Q(h_0, k_0) = 0$ for some $(h_0, k_0)$. If $k_0 \neq 0$, this means that $h_0/k_0$ is a
real root of $Ax^2 + 2Bx + C = 0$. But since $B^2 - AC < 0$, the quadratic equation can only have
complex roots, and we are led to a contradiction. Similarly, if $k_0 = 0$ (and $h_0 \neq 0$), we consider
the quadratic equation $Cx^2 + 2Bx + A = 0$, and, by the same argument, we get a contradiction.
Hence $Q(t)$ never vanishes, and by continuity it keeps the sign of $Q(0) = A$. This proves the claim.
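The two cases can be illustrated by sampling $Q$ on the unit circle. This is a rough numerical sketch under stated assumptions, not a proof: the helper name `q_range_on_unit_circle`, the grid size, and the two coefficient triples are chosen only for illustration.

```python
import math

def q_range_on_unit_circle(A, B, C, samples=3600):
    """Return (min, max) of Q(cos s, sin s) over a uniform grid of angles s."""
    values = []
    for i in range(samples):
        s = 2.0 * math.pi * i / samples
        h, k = math.cos(s), math.sin(s)
        values.append(A * h * h + 2 * B * h * k + C * k * k)
    return min(values), max(values)

# Case 1 illustration: A*C - B^2 = 2*2 - 3^2 = -5 < 0, so Q changes sign.
lo1, hi1 = q_range_on_unit_circle(2, 3, 2)

# Case 2 illustration: A*C - B^2 = 6*6 - (-5)^2 = 11 > 0, so Q keeps the sign of A.
lo2, hi2 = q_range_on_unit_circle(6, -5, 6)

print(lo1, hi1)
print(lo2, hi2)
```

In the first triple the sampled minimum is negative and the maximum positive; in the second both are positive, matching the sign of $A = 6$.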
Our next step is to study implications of the previous results for the nature of the critical points.
First we prove part (a) of the theorem. Assume that $B^2 - AC > 0$. Let $y_0$ be a unit vector for
which $Q(y_0) > 0$. Examining the formula (**), we notice that the expression
$(2/t^2)(f(a + ty_0) - f(a))$ approaches $Q(y_0)$ as $t \to 0$. Let $x = a + ty_0$ and let $t \to 0$. Then $x$
approaches $a$ along the straight line through $a$ in the direction $y_0$, and the expression
$f(x) - f(a)$ approaches zero through positive values. On the other hand, if $y_1$ is a unit vector
such that $Q(y_1) < 0$, then by the same argument, the expression $f(x) - f(a)$ approaches zero
through negative values as $x$ approaches $a$ along the straight line through $a$ in the direction
$y_1$. Therefore, $a$ is a saddle point of $f$.
Next, we prove parts (b) and (c) of the theorem. By Case 2, we know that $|Q(y)| > 0$
for all unit vectors $y$. Since $|Q|$ is continuous on the unit circle, it has a positive minimum,
say $m$, as $y$ ranges over all unit vectors. Now choose $\delta > 0$ small enough that
$$|D_{11} f(a^*) - A|,\ |D_{12} f(a^*) - B|,\ |D_{22} f(a^*) - C| < m/3$$
whenever $|a^* - a| < \delta$. This uses continuity of the second-order partial derivatives of $f$. If
$0 < t < \delta$, then $a^*$ is on the line segment from $a$ to $a + \delta y$. Since $y$ is a unit vector,
$h^2 + 2|hk| + k^2 \le 2$, so the last three terms in (**) have total absolute value less than $m$.
Hence the right side of (**) has the same sign as $Q(y)$, that is, the same sign as $A$, whenever
$0 < t < \delta$. If $A > 0$ this means that $f(x) - f(a) > 0$ whenever $0 < |x - a| < \delta$,
which says that $f$ has a local minimum at $a$. Similarly, $A < 0$ implies that $f$ has a local maximum
at $a$.
The proof of (d) of the theorem is by an example, which is left as a pset problem.
Example 3. (1) Let $f(x, y) = x^2 + 3xy + y^2$. The origin is a critical point of $f$ since
$\nabla f(x, y) = (2x + 3y, 3x + 2y)$. Here, $A = 2$, $B = 3$, $C = 2$. Since $\Delta = -5$, which is
negative, the origin is a saddle point of $f$.
(2) Let $f(x, y) = 3x^2 - 5xy + 3y^2$. Again, the origin is a critical point since
$\nabla f(x, y) = (6x - 5y, -5x + 6y)$. Here, $A = 6$, $B = -5$, $C = 6$. Since $\Delta = 11$, which is
positive, and since $A > 0$, the origin is a local minimum point.
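The decision rule of Theorem 1 is mechanical once $A$, $B$, $C$ are known, so the two computations above can be checked with a few lines of code. This is a minimal sketch; the function name `classify_critical_point` and its return strings are assumptions made for illustration.

```python
def classify_critical_point(A, B, C):
    """Apply Theorem 1 to A = D11 f(a), B = D12 f(a), C = D22 f(a)."""
    delta = A * C - B * B
    if delta < 0:
        return "saddle point"
    if delta > 0:
        # When delta > 0 we have A*C > B^2 >= 0, so A cannot be zero here.
        return "local minimum" if A > 0 else "local maximum"
    return "inconclusive"

print(classify_critical_point(2, 3, 2))    # f = x^2 + 3xy + y^2, prints: saddle point
print(classify_critical_point(6, -5, 6))   # f = 3x^2 - 5xy + 3y^2, prints: local minimum
```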
Example 4. Find the critical points of $f(x, y) = x \sin y$ and determine whether they are local
minima, local maxima or saddle points.
Solution. It is straightforward that
$$\nabla f(x, y) = (\sin y, x \cos y).$$
If $\sin y = 0$ then $y = n\pi$, where $n$ is an integer. Since $\cos y \neq 0$ at these points, $x \cos y = 0$
implies that $x = 0$. Thus, all critical points of $f$ are of the form $(0, n\pi)$, where $n$ is an integer.
Next, since
$$A = 0, \quad B = \cos y = \pm 1, \quad C = -x \sin y = 0$$
at critical points, the discriminant is calculated as $\Delta = AC - B^2 = -1 < 0$. Therefore, all
critical points are saddle points.
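The conclusion of Example 4 can be sanity-checked with finite differences. This is a rough sketch under stated assumptions: the helper names `gradient` and `discriminant` and the step sizes are chosen for illustration, and central differences only approximate the derivatives.

```python
import math

def f(x, y):
    return x * math.sin(y)

def gradient(f, x, y, eps=1e-6):
    """Central-difference approximation of (D1 f, D2 f) at (x, y)."""
    fx = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
    fy = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)
    return fx, fy

def discriminant(f, x, y, eps=1e-4):
    """Approximate A*C - B^2 from second-order central differences."""
    A = (f(x + eps, y) - 2 * f(x, y) + f(x - eps, y)) / eps**2
    C = (f(x, y + eps) - 2 * f(x, y) + f(x, y - eps)) / eps**2
    B = (f(x + eps, y + eps) - f(x + eps, y - eps)
         - f(x - eps, y + eps) + f(x - eps, y - eps)) / (4 * eps**2)
    return A * C - B * B

# Check a few of the critical points (0, n*pi).
results = []
for n in range(-2, 3):
    x0, y0 = 0.0, n * math.pi
    gx, gy = gradient(f, x0, y0)
    results.append((n, gx, gy, discriminant(f, x0, y0)))

for row in results:
    print(row)
```

Each row should show a gradient close to zero and a discriminant close to $-1$, consistent with every critical point being a saddle point.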
Exercise. Find the maxima, minima and saddle points of $z = (x^2 - y^2)\, e^{(-x^2 - y^2)/2}$.
Exercise. Let $z = (x^2 + y^2) \cos(x + 2y)$. Show that $(0, 0)$ is a critical point. Is it a local
minimum or a local maximum?
Examples for use of Lagrange’s multiplier method. Assume that among all rectangular boxes
with fixed surface area of 10 square meters there is a box of largest possible volume. Find its
dimensions.
Solution. Let the lengths of the sides of the box be $x, y, z > 0$, respectively. The volume is
$f(x, y, z) = xyz$. The constraint is $2(xy + yz + xz) = 10$. The Lagrange multiplier conditions are
$$yz = \lambda(y + z); \quad xz = \lambda(x + z); \quad xy = \lambda(x + y)$$
and
$$xy + yz + xz = 5.$$
First of all, $x \neq 0$; for $x = 0$ implies $yz = 5$ and $\lambda z = 0$, so that $\lambda = 0$, and this leads to
a contradiction since then $yz = 0$. Similarly, $y \neq 0$, $z \neq 0$, and $x + y \neq 0$, and so on. Thus, we get
$$\lambda = \frac{yz}{y + z} = \frac{xz}{x + z} = \frac{xy}{x + y},$$
whence $x = y = z$. Substituting these values into the constraint equation, we obtain
$x = y = z = \sqrt{5/3}$. This (cubical) shape must therefore maximize the volume, assuming there is
a box of maximum volume.
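As a numerical sanity check of the candidate found above, one can solve the constraint for $z$ and compare the cube's volume with the volumes of randomly sampled boxes of the same surface area. This is a sketch under stated assumptions: the helper name `volume_on_constraint`, the sampling range, and the fixed seed are chosen for illustration, and random sampling is evidence rather than a proof.

```python
import math
import random

def volume_on_constraint(x, y):
    """Volume x*y*z with z solved from x*y + y*z + x*z = 5 (requires x*y < 5)."""
    z = (5.0 - x * y) / (x + y)
    return x * y * z

s = math.sqrt(5.0 / 3.0)      # side length of the candidate cube
cube_volume = s ** 3

# The candidate satisfies the Lagrange conditions with lambda = s / 2.
lam = s / 2.0
lagrange_residual = abs(s * s - lam * (s + s))

random.seed(0)                # fixed seed so the run is reproducible
best_sampled = 0.0
for _ in range(10000):
    x = random.uniform(0.01, 5.0)
    y = random.uniform(0.01, 5.0)
    if x * y < 5.0:           # keeps z > 0
        best_sampled = max(best_sampled, volume_on_constraint(x, y))

print(cube_volume, best_sampled, lagrange_residual)
```

No sampled box should exceed the cube's volume, and the best sampled volume should come close to it from below.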
Now we make a couple of remarks.

1. The solution in the example does not demonstrate that the cube is the rectangular box of
largest volume with a given fixed surface area; it proves that the cube is the only possible
candidate for a maximum. The distinction between showing that there is only one possible solution
to a problem and that, in fact, a solution exists is quite subtle.
Indeed, Queen Dido (ca. 900 B.C.) realized that among all planar regions with fixed perimeter
the disc is the region of maximum area. It is not terribly hard to prove this fact assuming that
there does exist a region of maximum area. However, proving that such a region of maximum
area exists is quite another (and difficult) matter. A complete proof was not given until the second
half of the nineteenth century, by the German mathematician Weierstrass (1815–1897).
2. The problem in showing that $f(x, y, z) = xyz$ has a maximum lies in the fact that $f$ is a
continuous function which is defined on the unbounded surface $xy + yz + zx = 5$ and not on
a bounded set which includes its boundary. The way to show the existence of a maximum of
$f(x, y, z) = xyz$ subject to $xy + yz + zx = 5$ is to show that if either $x$, $y$ or $z$ tends to
infinity, then $f(x, y, z) \to 0$. We may then conclude that $f$ attains its maximum on the surface
$xy + yz + zx = 5$. Indeed, multiplying the equation of the surface by $z$ we obtain the equation
$xyz + xz^2 + yz^2 = 5z$, and $5z \to 0$ as $x \to \infty$ because $xz < 5$ forces $z < 5/x$. Since
$x, y, z > 0$ it follows that $xyz = f(x, y, z) \to 0$. Similarly, $f(x, y, z) \to 0$ if either $y$
or $z$ tends to infinity. Therefore, a box of maximum volume must exist.
3. There are also second derivative tests available for constrained extrema. Please consult
[MaTr].
Exercise. Show that the volume of the largest rectangular parallelepiped that can be inscribed in
the ellipsoid
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$$
is $8abc/(3\sqrt{3})$.
REFERENCES
[Apo] T. Apostol, Calculus, vol. II, Second edition, Wiley, 1967.
[MaTr] J. Marsden and A. Tromba, Vector Calculus, Third edition, W.H. Freeman, N.Y., 1988.
© 2008 by Vera Mikyoung Hur
E-mail address: [email protected]
