Proof of The Second-Derivative Test
Proof of the second-derivative test. Our goal is to derive the second-derivative test, which
determines the nature of a critical point of a function of two variables, that is, whether the
critical point is a local minimum, a local maximum, a saddle point, or none of these. In general,
for a function of n variables, the nature of a critical point is determined by the algebraic sign of
a certain quadratic form, which in turn is determined by the eigenvalues of the Hessian matrix
[Apo, Section 9.11]. This approach, however, relies on results on eigenvalues, which may take
several lectures to fully develop. Here we focus on the simpler setting n = 2 and derive a test
using the algebraic signs of the second derivatives of the function.
The statement of the test is in [Apo, Theorem 9.7].
Theorem 1 (The second-derivative test). Let the scalar field $f(x_1, x_2)$ have continuous second
derivatives in an open ball containing $a = (a_1, a_2)$. Suppose that $D_1 f(a) = D_2 f(a) = 0$. Let
$A = D_{11} f(a)$, $B = D_{12} f(a) = D_{21} f(a)$ and $C = D_{22} f(a)$. Let
\[ \Delta = \det \begin{pmatrix} A & B \\ B & C \end{pmatrix} = AC - B^2. \]
Then, we have
(a) If $\Delta < 0$, then $a$ is a saddle point.
(b) If $\Delta > 0$ and $A > 0$, then $f(a)$ is a local minimum.
(c) If $\Delta > 0$ and $A < 0$, then $f(a)$ is a local maximum.
(d) If $\Delta = 0$, then the test is inconclusive.
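The case distinction in the theorem can be written as a small decision procedure. The following is only a minimal sketch: the function name `classify_critical_point` is hypothetical, and it assumes that the three second derivatives at the critical point have already been computed. The sample inputs are the values of A, B, C that appear in Example 3.

```python
def classify_critical_point(A, B, C):
    """Apply the second-derivative test to A = D11 f(a), B = D12 f(a), C = D22 f(a).

    Hypothetical helper for illustration; it assumes a is already known
    to be a critical point of f.
    """
    delta = A * C - B * B  # determinant of the matrix [[A, B], [B, C]]
    if delta < 0:
        return "saddle point"
    if delta > 0:
        # delta > 0 forces A != 0, so exactly one of the two cases applies.
        return "local minimum" if A > 0 else "local maximum"
    return "inconclusive"


print(classify_critical_point(2, 3, 2))    # delta = -5
print(classify_critical_point(6, -5, 6))   # delta = 11 and A > 0
```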
The proof uses the second-order Taylor formula, which we state for general scalar fields.
Theorem 2 (Second-order Taylor formula). Let $f$ be a scalar field with continuous second-order partial
derivatives $D_{ij} f$ in an open ball $B(a)$. Then, for all $y \in \mathbb{R}^n$ such that $a + y \in B(a)$ we have
\[ f(a + y) - f(a) = \nabla f(a) \cdot y + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} D_{ij} f(a + \theta y)\, y_i y_j, \]
where $0 < \theta < 1$.
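As a sketch of how this formula enters the two-variable argument, assume that $a$ is a critical point, so that $\nabla f(a) = 0$, and apply the formula with $ty$ in place of $y$, where $y = (y_1, y_2)$ is a unit vector and $t \neq 0$ is small. The rescaling by $t$ is an assumption made here to match the expression $f(a + ty) - f(a)$ used in the proof; note that $\theta$ depends on $t$ and $y$.

```latex
\[
f(a + t y) - f(a)
  = \frac{t^2}{2} \sum_{i=1}^{2} \sum_{j=1}^{2} D_{ij} f(a + \theta t y)\, y_i y_j ,
  \qquad 0 < \theta < 1 ,
\]
\[
\frac{2}{t^2} \bigl( f(a + t y) - f(a) \bigr)
  \longrightarrow \sum_{i=1}^{2} \sum_{j=1}^{2} D_{ij} f(a)\, y_i y_j
  = A y_1^2 + 2 B y_1 y_2 + C y_2^2
  \quad \text{as } t \to 0 .
\]
```

The limit uses the continuity of the second-order partial derivatives at $a$.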
We are concerned with the sign of the left side of (**) when $t$ is small. Consider the quadratic
function
\[ Q(h, k) = A h^2 + 2 B h k + C k^2, \]
where $(h, k)$ is a unit vector in $\mathbb{R}^2$.
Case 1: If $\Delta = AC - B^2 < 0$, then we claim that $Q$ takes both positive and negative values.
Indeed, if $A \neq 0$ then the quadratic equation $Ax^2 + 2Bx + C = 0$ has two distinct real roots, and
the graph of $y = Ax^2 + 2Bx + C$ is a parabola which crosses the $x$-axis at two distinct
points; otherwise, if $A = 0$, then $B \neq 0$ must hold, and $y = Ax^2 + 2Bx + C$ is a straight
line with non-zero slope. In either case, $Ax^2 + 2Bx + C$ takes both positive and negative values.
Since $Q(h, k) = k^2 (Ax^2 + 2Bx + C)$ with $x = h/k$ whenever $k \neq 0$, and every real $x$ arises as
$h/k$ for some unit vector $(h, k)$, the same is true of $Q$. This proves the claim.
Case 2: If $\Delta = AC - B^2 > 0$, then we claim that $Q$ takes only one sign. Since $(h, k)$ is a unit
vector, let $(h, k) = (\cos t, \sin t)$, where $t \in [0, 2\pi]$. Then $Q(t) = Q(\cos t, \sin t)$ is a
continuous function of $t$. First, $Q(0) = A \neq 0$ has a definite sign; note that $A \neq 0$ because
$AC > B^2 \geq 0$. By the intermediate value theorem, it therefore suffices to show that $Q(t)$ never
vanishes. Suppose that there is a $t_0 \in [0, 2\pi]$ such that $Q(t_0) = 0$. That is,
$Q(\cos t_0, \sin t_0) = Q(h_0, k_0) = 0$, where $(h_0, k_0) = (\cos t_0, \sin t_0)$. If $k_0 \neq 0$, this
means that $h_0/k_0$ is a real root of $Ax^2 + 2Bx + C = 0$. But, since $B^2 - AC < 0$, the
quadratic equation can only have complex roots, and we are led to a contradiction. Similarly, if
$k_0 = 0$ (and $h_0 \neq 0$), we consider the quadratic equation $Cx^2 + 2Bx + A = 0$, of which $k_0/h_0$
is a real root, and, by the same argument, we get a contradiction. This proves the claim.
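The two cases can be checked numerically by sampling $Q(\cos t, \sin t)$ around the unit circle. This is only an illustrative sketch, not part of the proof: the helper name `range_on_unit_circle` and the number of sample points are assumptions, and the coefficients are those of Example 3.

```python
import math


def range_on_unit_circle(A, B, C, samples=3600):
    """Return (min, max) of Q(cos t, sin t) over equally spaced t in [0, 2*pi)."""
    values = []
    for i in range(samples):
        t = 2 * math.pi * i / samples
        h, k = math.cos(t), math.sin(t)
        values.append(A * h * h + 2 * B * h * k + C * k * k)
    return min(values), max(values)


# Case 1: AC - B^2 = -5 < 0, so Q should take both signs.
lo, hi = range_on_unit_circle(2, 3, 2)
print(lo < 0 < hi)

# Case 2: AC - B^2 = 11 > 0, so Q should keep the sign of A = 6.
lo, hi = range_on_unit_circle(6, -5, 6)
print(lo > 0)
```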
Our next step is to study implications of the previous results for the nature of the critical points.
First we prove part (a) of the theorem. Assume that $B^2 - AC > 0$, that is, $\Delta < 0$. By Case 1,
there is a unit vector $y_0$ for which $Q(y_0) > 0$. Examining the formula (**), we notice that the
expression $(2/t^2)(f(a + ty_0) - f(a))$ approaches $Q(y_0)$ as $t \to 0$. Let $x = a + ty_0$ and let
$t \to 0$. Then $x$ approaches $a$ along the straight line from $a$ to $a + ty_0$, and the expression
$f(x) - f(a)$ approaches zero through positive values. On the other hand, if $y_1$ is a unit vector
such that $Q(y_1) < 0$, then by the same argument, the expression $f(x) - f(a)$ approaches zero
through negative values as $x$ approaches $a$ along the straight line from $a$ to $a + ty_1$.
Therefore, $a$ is a saddle point of $f$.
Next, we prove parts (b) and (c) of the theorem. Recall from Case 2 that $|Q(y)| > 0$
for all unit vectors $y$. Since $|Q|$ is continuous on the unit circle, which is closed and bounded,
$|Q(y)|$ has a positive minimum, say $m$, as $y$ ranges over all unit vectors.
Now choose $\delta > 0$ small enough that
\[ |D_{11} f(a^*) - A|, \quad |D_{12} f(a^*) - B|, \quad |D_{22} f(a^*) - C| < m/3 \]
whenever $|a^* - a| < \delta$. This uses continuity of the second-order partial derivatives of $f$.
If $0 < t < \delta$, then $a^*$ is on the line from $a$ to $a + \delta y$. Since $y$ is a unit vector,
these bounds imply that the right side of (*) has the same sign as $Q(y)$, and hence the same sign as
$A = Q(1, 0)$, whenever $0 < t < \delta$. If $A > 0$ this means that $f(x) - f(a) > 0$ whenever
$0 < |x - a| < \delta$, which says that $f$ has a local minimum at $a$. Similarly, $A < 0$ implies that
$f$ has a local maximum at $a$.
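The role of the bound $m/3$ can be seen from a short estimate. This is a sketch under notation introduced here, not taken from the text above: write $y = (h, k)$ with $h^2 + k^2 = 1$, and let $A^*, B^*, C^*$ denote $D_{11} f(a^*), D_{12} f(a^*), D_{22} f(a^*)$.

```latex
\[
\bigl| (A^* h^2 + 2 B^* h k + C^* k^2) - Q(h, k) \bigr|
  \le |A^* - A|\, h^2 + 2\, |B^* - B|\, |hk| + |C^* - C|\, k^2
  < \frac{m}{3} \bigl( |h| + |k| \bigr)^2
  \le \frac{2m}{3}
  < m \le |Q(h, k)| .
\]
```

Here $(|h| + |k|)^2 \le 2(h^2 + k^2) = 2$. Since the perturbed quadratic form differs from $Q(h, k)$ by less than $|Q(h, k)|$, it has the same sign as $Q(h, k)$.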
The proof of (d) of the theorem is by an example, which is left as a pset problem.
Example 3. (1) Let $f(x, y) = x^2 + 3xy + y^2$. The origin is a critical point of $f$ since
$\nabla f(x, y) = (2x + 3y, 3x + 2y)$. Here, $A = 2$, $B = 3$, $C = 2$. Since $\Delta = -5$, which is
negative, the origin is a saddle point of $f$.
(2) Let $f(x, y) = 3x^2 - 5xy + 3y^2$. Again, the origin is a critical point since
$\nabla f(x, y) = (6x - 5y, -5x + 6y)$. Here, $A = 6$, $B = -5$, $C = 6$. Since $\Delta = 11$, which is
positive, and since $A > 0$, the origin is a local minimum point.
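The values of A, B, C in Example 3 can be cross-checked numerically with central finite differences. This is a rough sketch only: the helper name `second_derivatives` and the step size `h` are assumptions chosen for illustration, and the results are approximations up to rounding error.

```python
def second_derivatives(f, x, y, h=1e-3):
    """Approximate A = D11 f, B = D12 f, C = D22 f at (x, y) by central differences."""
    A = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    C = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    B = (f(x + h, y + h) - f(x + h, y - h)
         - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return A, B, C


def f1(x, y):
    return x**2 + 3 * x * y + y**2


def f2(x, y):
    return 3 * x**2 - 5 * x * y + 3 * y**2


for f in (f1, f2):
    A, B, C = second_derivatives(f, 0.0, 0.0)
    # Print the approximate A, B, C and the discriminant AC - B^2.
    print(A, B, C, A * C - B * B)
```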
Example 4. Find the critical points of $f(x, y) = x \sin y$ and determine whether they are local
minima, local maxima or saddle points.
Solution. It is straightforward that
\[ \nabla f(x, y) = (\sin y, x \cos y). \]
If $\sin y = 0$ then $y = n\pi$, where $n$ is an integer. Since $\cos y \neq 0$ at these points,
$x \cos y = 0$ implies that $x = 0$. Thus, all critical points of $f$ are of the form $(0, n\pi)$, where
$n$ is an integer. Next, since
\[ A = 0, \quad B = \cos y = \pm 1, \quad C = -x \sin y = 0 \]
at critical points, the discriminant is $\Delta = AC - B^2 = -1 < 0$. Therefore, all critical points
are saddle points.
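As a quick numerical check of this example, the gradient and the discriminant can be evaluated at a few of the points $(0, n\pi)$. This is a sketch using the derivative formulas derived above; the function names `gradient` and `discriminant` are hypothetical.

```python
import math


def gradient(x, y):
    """Gradient of f(x, y) = x * sin(y)."""
    return (math.sin(y), x * math.cos(y))


def discriminant(x, y):
    """AC - B^2 for f(x, y) = x * sin(y)."""
    A = 0.0                 # D11 f
    B = math.cos(y)         # D12 f
    C = -x * math.sin(y)    # D22 f
    return A * C - B * B


for n in range(-2, 3):
    x, y = 0.0, n * math.pi
    gx, gy = gradient(x, y)
    # The gradient vanishes up to rounding, and the discriminant is close to -1.
    print(n, abs(gx) < 1e-12, abs(gy) < 1e-12, discriminant(x, y))
```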
Exercise. Find the maxima, minima and saddle points of $z = (x^2 - y^2) e^{(-x^2 - y^2)/2}$.
Exercise. Let $z = (x^2 + y^2) \cos(x + 2y)$. Show that $(0, 0)$ is a critical point. Is it a local
minimum or a local maximum?
Example of the use of Lagrange's multiplier method. Assume that among all rectangular boxes
with a fixed surface area of 10 square meters there is a box of largest possible volume. Find its
dimensions.
Solution. Let the lengths of the sides of the box be $x, y, z > 0$, respectively. The volume is
$f(x, y, z) = xyz$. The constraint is $2(xy + yz + xz) = 10$. The Lagrange multiplier conditions are
\[ yz = \lambda(y + z); \quad xz = \lambda(x + z); \quad xy = \lambda(x + y) \]
and
\[ xy + yz + xz = 5. \]
First of all, $x \neq 0$; for $x = 0$ implies $yz = 5$ and $\lambda z = 0$, so that $\lambda = 0$, and this
leads to a contradiction since then $yz = \lambda(y + z) = 0$. Similarly, $y \neq 0$, $z \neq 0$, and
$x + y \neq 0$ and so on. Thus, we get
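\[ \lambda = \frac{yz}{y + z} = \frac{xz}{x + z} = \frac{xy}{x + y}, \]
whence $x = y = z$ (cross-multiplying the first equality gives $z^2 (y - x) = 0$, and similarly for
the others). Substituting these values into the constraint equation, we obtain
$x = y = z = \sqrt{5/3}$. This (cubical) shape must therefore maximize the volume, assuming there is
a box of maximum volume.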
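The conclusion can be sanity-checked numerically by comparing the volume of the cube with the volumes of randomly chosen boxes of the same surface area. This is only a sketch, not a proof of maximality: the helper name `volume_on_constraint`, the sampling range, the seed and the number of trials are arbitrary assumptions.

```python
import math
import random

HALF_AREA = 5.0  # the constraint xy + yz + xz = 5, i.e. surface area 10


def volume_on_constraint(x, y):
    """Solve xy + yz + xz = 5 for z and return the volume xyz (requires xy < 5)."""
    z = (HALF_AREA - x * y) / (x + y)
    return x * y * z


side = math.sqrt(5.0 / 3.0)   # candidate from the Lagrange conditions
best = side ** 3              # volume of the cube

random.seed(0)
trials = []
while len(trials) < 10000:
    x, y = random.uniform(0.01, 5.0), random.uniform(0.01, 5.0)
    if x * y < HALF_AREA:     # keeps z > 0
        trials.append(volume_on_constraint(x, y))

# No sampled box should beat the cube (up to rounding).
print(best, max(trials) <= best + 1e-12)
```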
REFERENCES
[Apo] T. Apostol, Calculus, vol. II, Second edition, Wiley, 1967.
[MaTr] J. Marsden and A. Tromba, Vector Calculus, Third edition, W.H. Freeman, N.Y., 1988.
© 2008 by Vera Mikyoung Hur