Intro Real Anal
Lee Larson
University of Louisville
July 18, 2016
About This Document
Contents
Basic Ideas
In the end, all mathematics can be boiled down to logic and set theory. Be-
cause of this, any careful presentation of fundamental mathematical ideas is
inevitably couched in the language of logic and sets. This chapter defines enough
of that language to allow the presentation of basic real analysis. Much of it will be
familiar to you, but look at it anyway to make sure you understand the notation.
1. Sets
Set theory is a large and complicated subject in its own right. There is no time
in this course to touch on any but the simplest parts of it. Instead, we’ll just look
at a few topics from what is often called “naive set theory.”
We begin with a few definitions.
A set is a collection of objects called elements. Usually, sets are denoted by the
capital letters A, B, · · · , Z . A set can consist of any type and number of elements.
Even other sets can be elements of a set. The sets dealt with here usually have
real numbers as their elements.
If a is an element of the set A, we write a ∈ A. If a is not an element of the set
A, we write a ∉ A.
If all the elements of A are also elements of B , then A is a subset of B . In this
case, we write A ⊂ B or B ⊃ A. In particular, notice that whenever A is a set, then
A ⊂ A.
Two sets A and B are equal, if they have the same elements. In this case we
write A = B . It is easy to see that A = B iff A ⊂ B and B ⊂ A. Establishing that both
of these containments are true is the most common way to show that two sets
are equal.
If A ⊂ B and A ≠ B, then A is a proper subset of B. In cases when this is
important, it is written A ⊊ B instead of just A ⊂ B.
There are several ways to describe a set.
A set can be described in words such as “P is the set of all presidents of the
United States.” This is cumbersome for complicated sets.
All the elements of the set could be listed in curly braces as S = {2, 0, a}. If the
set has many elements, this is impractical, or impossible.
More common in mathematics is set builder notation. Some examples are
and
A = {n : n is a prime number} = {2, 3, 5, 7, 11, · · · }.
In general, the set builder notation defines a set in the form
{formula for a typical element : objects to plug into the formula}.
A more complicated example is the set of perfect squares:
S = {n^2 : n is an integer} = {0, 1, 4, 9, · · · }.
The existence of several sets will be assumed. The simplest of these is the
empty set, which is the set with no elements. It is denoted as ∅. The set of
natural numbers is N = {1, 2, 3, · · · }, consisting of the positive integers. The set
Z = {· · · , −2, −1, 0, 1, 2, · · · } is the set of all integers. The set
ω = {n ∈ Z : n ≥ 0} = {0, 1, 2, · · · } is the set of nonnegative integers.
Clearly, ∅ ⊂ A, for any set A and
∅ ⊂ N ⊂ ω ⊂ Z.
DEFINITION 1.1. Given any set A, the power set of A, written P(A), is the set
consisting of all subsets of A; i.e.,
P(A) = {B : B ⊂ A}.
For example, P({a, b}) = {∅, {a}, {b}, {a, b}}. Also, for any set A, it is always
true that ∅ ∈ P(A) and A ∈ P(A). If a ∈ A, it is rarely true that a ∈ P(A), but it is
always true that {a} ∈ P(A). Make sure you understand why!
An amusing example is P(∅) = {∅}. (Don’t confuse ∅ with {∅}! The former is
empty and the latter has one element.) Now, consider
P(∅) = {∅}
P(P(∅)) = {∅, {∅}}
P(P(P(∅))) = {∅, {∅}, {{∅}}, {∅, {∅}}}
After continuing this n times, for some n ∈ N, the resulting set,
P (P (· · · P (;) · · · )),
is very large. In fact, since a set with k elements has 2^k elements in its power set,
there are 2^(2^(2^2)) = 2^16 = 65,536 elements after only five iterations of the example. Beyond
this, the numbers are too large to print. Number sequences such as this one are
sometimes called tetrations.
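The growth of the iterated power sets can be checked directly for the first few iterations. The following is a minimal Python sketch, not part of the original text; the helper names `power_set` and `iterated_power_set_sizes` are hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def power_set(s):
    """Return the power set of a frozenset as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )


def iterated_power_set_sizes(iterations):
    """Sizes of P(∅), P(P(∅)), ... for the given number of iterations."""
    current = frozenset()  # the empty set
    sizes = []
    for _ in range(iterations):
        current = power_set(current)
        sizes.append(len(current))
    return sizes


print(iterated_power_set_sizes(5))  # [1, 2, 4, 16, 65536]
```

A sixth iteration would need 2^65536 elements, which is far beyond what can be stored, so the sketch stops at five.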
2. Algebra of Sets
Let A and B be sets. There are four common binary operations used on sets. [1]
1 In the following, some logical notation is used. The symbol ∨ is the logical nonexclusive “or.”
The symbol ∧ is the logical “and.” Their truth tables are as follows:
  ∧ | T  F          ∨ | T  F
  T | T  F          T | T  T
  F | F  F          F | T  F
[Figure: four Venn diagrams of the sets A and B, including panels labeled A \ B and A∆B.]
FIGURE 1.1. These are Venn diagrams showing the four standard
binary operations on sets. In this figure, the set which results from the
operation is shaded.
The union of A and B is the set containing all the elements in either A or B :
A ∪ B = {x : x ∈ A ∨ x ∈ B }.
The intersection of A and B is the set containing the elements contained in
both A and B :
A ∩ B = {x : x ∈ A ∧ x ∈ B }.
The symmetric difference of A and B is the set of elements in one of the sets,
but not the other:
A∆B = (A ∪ B ) \ (A ∩ B ).
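These operations correspond directly to operators on Python's built-in `set` type. The sketch below uses two small example sets chosen only for illustration and checks the identity A∆B = (A ∪ B) \ (A ∩ B) on them.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}

union = A | B                  # A ∪ B
intersection = A & B           # A ∩ B
difference = A - B             # A \ B
symmetric_difference = A ^ B   # A ∆ B

# The symmetric difference agrees with (A ∪ B) \ (A ∩ B) for these sets.
assert symmetric_difference == (A | B) - (A & B)

print(union, intersection, difference, symmetric_difference)
```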
3. Indexed Sets
We often have occasion to work with large collections of sets. For example,
we could have a sequence of sets A_1, A_2, A_3, · · · , where there is a set A_n associated
with each n ∈ N. In general, let Λ be a set and suppose for each λ ∈ Λ there is a
set A_λ. The set {A_λ : λ ∈ Λ} is called a collection of sets indexed by Λ. In this case,
Λ is called the indexing set for the collection.
3 The logical symbol ⇐⇒ is the same as “if, and only if.” If A and B are any two statements,
then A ⇐⇒ B is the same as saying A implies B and B implies A. It is also common to use iff in
this way.
4 Augustus De Morgan (1806–1871)
Two of the basic binary operations can be extended to work with indexed col-
lections. In particular, using the indexed collection from the previous paragraph,
we define
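\[ \bigcup_{\lambda\in\Lambda} A_\lambda = \{x : x \in A_\lambda \text{ for some } \lambda \in \Lambda\} \]
and
\[ \bigcap_{\lambda\in\Lambda} A_\lambda = \{x : x \in A_\lambda \text{ for all } \lambda \in \Lambda\}. \]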
De Morgan’s Laws can be generalized to indexed collections.
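THEOREM 1.4. If {B_λ : λ ∈ Λ} is an indexed collection of sets and A is a set, then
\[ A \setminus \bigcup_{\lambda\in\Lambda} B_\lambda = \bigcap_{\lambda\in\Lambda} (A \setminus B_\lambda) \]
and
\[ A \setminus \bigcap_{\lambda\in\Lambda} B_\lambda = \bigcup_{\lambda\in\Lambda} (A \setminus B_\lambda). \]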
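A finite example can make the generalized De Morgan laws concrete. The sketch below is only a check on one small family, not a proof; the set `A` and the indexed family `family` (with indexing set {1, 2, 3}) are assumptions made up for this illustration.

```python
from functools import reduce

A = set(range(10))
# A hypothetical indexed collection {B_λ : λ ∈ Λ} with Λ = {1, 2, 3}.
family = {1: {0, 1, 2}, 2: {2, 3, 4}, 3: {2, 4, 6, 8}}

big_union = set().union(*family.values())
big_intersection = reduce(set.intersection, family.values())

# First law: A \ (union of B_λ) equals the intersection of the A \ B_λ.
lhs1 = A - big_union
rhs1 = reduce(set.intersection, [A - B for B in family.values()])

# Second law: A \ (intersection of B_λ) equals the union of the A \ B_λ.
lhs2 = A - big_intersection
rhs2 = set().union(*[A - B for B in family.values()])

assert lhs1 == rhs1
assert lhs2 == rhs2
print(lhs1, lhs2)
```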
Of course, the common Cartesian plane from your analytic geometry course
is nothing more than a generalization of this idea of listing the elements of a
Cartesian product as a table.
The definition of Cartesian product can be extended to the case of more than
two sets. If A_1, A_2, · · · , A_n are sets, then
A_1 × A_2 × · · · × A_n = {(a_1, a_2, · · · , a_n) : a_k ∈ A_k for 1 ≤ k ≤ n}
is a set of n-tuples. This is often written as
\[ \prod_{k=1}^{n} A_k = A_1 \times A_2 \times \cdots \times A_n. \]
4.2. Relations.
DEFINITION 1.6. If A and B are sets, then any R ⊂ A × B is a relation from A
to B. If (a, b) ∈ R, we write aRb.
In this case,
dom (R) = {a : (a, b) ∈ R}
is the domain of R and
ran (R) = {b : (a, b) ∈ R}
is the range of R.
In the special case when R ⊂ A × A, for some set A, there is some additional
terminology.
R is symmetric, if aRb ⇐⇒ bRa.
R is reflexive, if aRa whenever a ∈ dom (R).
R is transitive, if aRb ∧ bRc =⇒ aRc.
R is an equivalence relation on A, if it is symmetric, reflexive and transitive.
EXAMPLE 1.3. Let R be the relation on Z × Z defined by aRb ⇐⇒ a ≤ b. Then
R is reflexive and transitive, but not symmetric.
EXAMPLE 1.4. Let R be the relation on Z × Z defined by aRb ⇐⇒ a < b. Then
R is transitive, but neither reflexive nor symmetric.
EXAMPLE 1.5. Let R be the relation on Z × Z defined by aRb ⇐⇒ a^2 = b^2. In
this case, R is an equivalence relation. It is evident that aRb iff b = a or b = −a.
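The three examples above can be checked mechanically on a finite piece of Z. The following is a rough sketch under that restriction: it tests the relations only on the sample {−3, …, 3}, which illustrates the definitions but proves nothing about all of Z. The helper names are hypothetical, and reflexivity is tested against the whole sample.

```python
SAMPLE = range(-3, 4)  # a finite piece of Z, used only for illustration


def relation(pred):
    """Build the relation {(a, b) : pred(a, b)} restricted to SAMPLE."""
    return {(a, b) for a in SAMPLE for b in SAMPLE if pred(a, b)}


def is_reflexive(R):
    return all((a, a) in R for a in SAMPLE)


def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)


def is_transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)


def properties(R):
    """Return (reflexive, symmetric, transitive) for the relation R."""
    return (is_reflexive(R), is_symmetric(R), is_transitive(R))


results = {
    "a <= b": properties(relation(lambda a, b: a <= b)),
    "a < b": properties(relation(lambda a, b: a < b)),
    "a^2 = b^2": properties(relation(lambda a, b: a * a == b * b)),
}
for name, props in results.items():
    print(name, props)
```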
4.3. Functions.
DEFINITION 1.7. A relation R ⊂ A × B is a function if
aRb_1 ∧ aRb_2 =⇒ b_1 = b_2.
If f ⊂ A × B is a function and dom (f) = A, then we usually write f : A → B
and use the usual notation f(a) = b instead of a f b.
If f : A → B is a function, the usual intuitive interpretation is to regard f
as a rule that associates each element of A with a unique element of B. It’s not
necessarily the case that each element of B is associated with something from A;
i.e., B may not be ran (f).
EXAMPLE 1.6. Define f : N → Z by f(n) = n^2 and g : Z → Z by g(n) = n^2. In
this case ran (f) = {n^2 : n ∈ N} and ran (g) = ran (f) ∪ {0}. Notice that even though
f and g use the same formula, they are actually different functions.
A ∼ B ⇐⇒ there is a bijection f : A → B
is an equivalence relation.
[Figure: diagram of the sets A_1, A_2, A_3, A_4, A_5, · · · and B_1, B_2, B_3, B_4, B_5, · · · , together with f(A) and B.]
FIGURE 1.4. Here are the first few steps from the construction used
in the proof of Theorem 1.16.
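\[ x = \underbrace{g \circ f \circ g \circ f \circ \cdots \circ f \circ g}_{k-1\ f\text{'s and } k\ g\text{'s}}(x_1). \]
This implies
\[ h(x) = g^{-1}(x) = \underbrace{f \circ g \circ f \circ \cdots \circ f \circ g}_{k-1\ f\text{'s and } k-1\ g\text{'s}}(x_1) = f(y) \]
so that
\[ y = \underbrace{g \circ f \circ g \circ f \circ \cdots \circ f \circ g}_{k-2\ f\text{'s and } k-1\ g\text{'s}}(x_1) \in A_{k-1} \subset \tilde{A}. \]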
5. Cardinality
There is a way to use sets and functions to formalize and generalize how we
count. For example, suppose we want to count how many elements are in the set
{a, b, c}. The natural way to do this is to point at each element in succession and
say “one, two, three.” What we’re doing is defining a bijective function between
{a, b, c} and the set {1, 2, 3}. This idea can be generalized.
The cardinalities defined in Definition 1.18 are called the finite cardinal
numbers. They correspond to the everyday counting numbers we usually use.
The idea can be generalized still further.
6 The symbol ℵ is the Hebrew letter “aleph” and ℵ_0 is usually pronounced “aleph nought.”
A logical question is whether all sets either have finite cardinality, or are
countably infinite. That this is not so is seen by letting S = N in the following
theorem.
PROOF. Noting that 0 = card (∅) < 1 = card (P(∅)), the theorem is true when
S is empty.
Suppose S ≠ ∅. Since {a} ∈ P(S) for all a ∈ S, it follows that card (S) ≤
card (P(S)). Therefore, it suffices to prove there is no surjective function f :
S → P(S).
To see this, assume there is such a function f and let T = {x ∈ S : x ∉ f (x)}.
Since f is surjective, there is a t ∈ S such that f (t ) = T . Either t ∈ T or t ∉ T .
If t ∈ T = f (t ), then the definition of T implies t ∉ T , a contradiction. On
the other hand, if t ∉ T = f (t ), then the definition of T implies t ∈ T , another
contradiction. These contradictions lead to the conclusion that no such function
f can exist.
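For a small finite set, the argument can be verified exhaustively. The sketch below is an illustration only, with the three-element set `S` chosen arbitrarily: it enumerates every function f : S → P(S) and confirms that the set T = {x ∈ S : x ∉ f(x)} is never in the range of f. The helper names are hypothetical.

```python
from itertools import combinations, product


def power_set(elements):
    """Return all subsets of the given list as a list of frozensets."""
    return [
        frozenset(c)
        for r in range(len(elements) + 1)
        for c in combinations(elements, r)
    ]


def diagonal_set(f, S):
    """The set T = {x in S : x not in f(x)} from the proof."""
    return frozenset(x for x in S if x not in f[x])


def every_function_misses_its_diagonal_set(S):
    """Check, for every f : S -> P(S), that T is not a value of f."""
    subsets = power_set(S)
    for images in product(subsets, repeat=len(S)):
        f = dict(zip(S, images))
        if diagonal_set(f, S) in images:
            return False
    return True


S = [0, 1, 2]  # 8 subsets, so 8**3 = 512 functions to check
print(every_function_misses_its_diagonal_set(S))
```

The exhaustive search only works for tiny sets; the point of the proof is that the same set T defeats any candidate surjection, finite or not.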
The proofs of these theorems are extremely difficult and entire broad areas of
mathematics were invented just to make their proofs possible. Even today, there
are some deep philosophical questions swirling around them. A more technical
introduction to many of these ideas is contained in the book by Ciesielski [8]. A
nontechnical and very readable history of the efforts by mathematicians to un-
derstand the continuum hypothesis is the book by Aczel [1]. A readable account
of Cantor’s work is in an article by Dauben [9].
6. Exercises
1.1. If a set S has n elements for n ∈ ω, then how many elements are in P (S)?
1.13. Given two sets A and B, it is common to let A^B denote the set of all
functions f : B → A. Prove that for any set A, card (2^A) = card (P(A)). This is why
many authors use 2^A as their notation for P(A).
1.14. Let S be a set. Prove the following two statements are equivalent:
(a) S is infinite; and,
(b) there is a proper subset T of S and a bijection f : S → T .
This statement is often used as the definition of when a set is infinite.
1.17. If f : [0, ∞) → (0, ∞) and g : (0, ∞) → [0, ∞) are given by f(x) = x + 1 and
g(x) = x, then the proof of the Schröder-Bernstein theorem yields what bijection
h : [0, ∞) → (0, ∞)?
1.21. If A and B are sets such that card (A) = card (B ) = ℵ0 , then card (A ∪ B ) = ℵ0 .
1.22. Using the notation from the proof of the Schröder-Bernstein Theorem, let
A = [0, ∞), B = (0, ∞), f (x) = x + 1 and g (x) = x. Determine h(x).
1.23. Using the notation from the proof of the Schröder-Bernstein Theorem, let
A = N, B = Z, f(n) = n and
\[ g(n) = \begin{cases} 1 - 3n, & n \le 0 \\ 3n - 1, & n > 0 \end{cases}. \]
Calculate h(6) and h(7).
1.26. If A and B are sets, the set of all functions f : A → B is often denoted by B^A.
If S is a set, prove that card (2^S) = card (P(S)).
This chapter concerns what can be thought of as the rules of the game: the
axioms of the real numbers. These axioms imply all the properties of the real
numbers and, in a sense, any set satisfying them is uniquely determined to be
the real numbers.
The axioms are presented here as rules without very much justification. Other
approaches can be used. For example, a common approach is to begin with the
Peano axioms — the axioms of the natural numbers — and build up to the real
numbers through several “completions” of the natural numbers. It’s also possible
to begin with the axioms of set theory to build up the Peano axioms as theorems
and then use those to prove our axioms as further theorems. No matter how it’s
done, there are always some axioms at the base of the structure and the rules for
the real numbers are the same, whether they’re axioms or theorems.
We choose to start at the top because the other approaches quickly turn into
a long and tedious labyrinth of technical exercises without much connection to
analysis.
2-2 CHAPTER 2. THE REAL NUMBERS
Although these axioms seem to contain most properties of the real numbers
we normally use, they don’t characterize the real numbers; they just give the rules
for arithmetic. There are other fields besides the real numbers and studying them
is a large part of most abstract algebra courses.
EXAMPLE 2.1. From elementary algebra we know that the rational numbers
Q = {p/q : p ∈ Z ∧ q ∈ N}
form a field. It is shown in Theorem 2.13 that √2 ∉ Q, so Q doesn’t contain all the
real numbers.
The following theorems, containing just a few useful properties of fields, are
presented mostly as examples showing how the axioms are used. More complete
developments can be found in any beginning abstract algebra text.
e_1 = e_1 × e_2 = e_2,
so the multiplicative identity is unique. The proof for the additive identity is
essentially the same.
THEOREM 2.2. Let F be a field. If a, b ∈ F with b ≠ 0, then −a and b^{−1} are
unique.
b_1 = b_1 × 1 = b_1 × (b × b_2) = (b_1 × b) × b_2 = 1 × b_2 = b_2.
This shows the multiplicative inverse is unique. The proof is essentially the same
for the additive inverse.
There are many other properties of fields which could be proved here, but
they correspond to the usual properties of the real numbers learned in beginning
algebra, so we omit them. Some of them are in the exercises at the end of this
chapter.
From now on, the standard notations for algebra will usually be used; e.g.,
we will allow ab instead of a × b and a/b instead of a × b^{−1}. The reader may also
use the standard facts she learned from elementary algebra.
The most important properties of the absolute value function are contained
in the following theorem.
PROOF. (a) The fact that |x| ≥ 0 for all x ∈ F follows from Axiom 7(b).
Since 0 = −0, the second part is clear.
(b) If x ≥ 0, then −x ≤ 0 so that |−x| = −(−x) = x = |x|. If x < 0, then −x > 0
and |x| = −x = |−x|.
(c) If x ≥ 0, then −|x| = −x ≤ x = |x|. If x < 0, then −|x| = −(−x) = x < −x =
|x|.
(d) This is left as Exercise 2.4.
(e) Add the two sets of inequalities −|x| ≤ x ≤ |x| and −|y| ≤ y ≤ |y| to see
−(|x| + |y|) ≤ x + y ≤ |x| + |y|. Now apply (d).
From studying analytic geometry and calculus, we are used to thinking of
|x − y| as the distance between the numbers x and y. This notion of a distance
between two points of a set can be generalized.
A metric is a function which defines the distance between any two points of a
set.
There is no requirement in the definition that the upper and lower bounds
for a set are elements of the set. They can be elements of the set, but typically are
not. For example, if N = (−∞, 0), then [0, ∞) is the set of all upper bounds for N ,
but none of them is in N . On the other hand, if T = (−∞, 0], then [0, ∞) is again
the set of all upper bounds for T , but in this case 0 is an upper bound which is
also an element of T .
A set need not have upper or lower bounds. For example N = (−∞, 0) has no
lower bounds, while P = (0, ∞) has no upper bounds. The set of integers, Z, has neither
upper nor lower bounds. If S has no upper bound, it is unbounded above and, if
it has no lower bound, then it is unbounded below. In either case, it is usually just
said to be unbounded.
If M is an upper bound for the set S, then every x ≥ M is also an upper bound
for S. Considering some simple examples should lead you to suspect that among
the upper bounds for a set, there is one that is best in the sense that everything
greater is an upper bound and everything less is not an upper bound. This is the
basic idea of completeness.
PROOF. Suppose u_1 and u_2 are both least upper bounds for A. Since u_1
and u_2 are both upper bounds for A, two applications of Definition 2.15 show
u_1 ≤ u_2 ≤ u_1 =⇒ u_1 = u_2. The proof of the other case is similar.
THEOREM 2.18. Let A ⊂ F and α ∈ F. α = lub A iff (α, ∞) ∩ A = ∅ and for all
ε > 0, (α − ε, α] ∩ A ≠ ∅. Similarly, α = glb A iff (−∞, α) ∩ A = ∅ and for all ε > 0,
[α, α + ε) ∩ A ≠ ∅.
PROOF. We will prove the first statement, concerning the least upper bound.
The second statement, concerning the greatest lower bound, follows similarly.
(⇒) If x ∈ (α, ∞) ∩ A, then α cannot be an upper bound of A, which is a
contradiction. If there is an ε > 0 such that (α − ε, α] ∩ A = ∅, then from above, we
conclude
∅ = ((α − ε, α] ∩ A) ∪ ((α, ∞) ∩ A) = (α − ε, ∞) ∩ A.
So, α − ε/2 is an upper bound for A which is less than α = lub A. This contradiction
shows (α − ε, α] ∩ A ≠ ∅.
(⇐) The assumption that (α, ∞) ∩ A = ∅ implies α ≥ lub A. On the other hand,
suppose lub A < α. By assumption, with ε = α − lub A, there is an x ∈ (lub A, α] ∩ A. This is clearly a
contradiction, since lub A < x ∈ A. Therefore, α = lub A.
3 Some people prefer the notation sup A and inf A instead of lub A and glb A, respectively. They
stand for the supremum and infimum of A.
An eagle-eyed reader may wonder why the intervals in Theorem 2.18 are
(α − ε, α] and [α, α + ε) instead of (α − ε, α) and (α, α + ε). Just consider the case
A = {α} to see that the theorem fails when the intervals are open. When lub A ∉ A
or glb A ∉ A, the intervals can be open, as shown in the following corollary.
COROLLARY 2.19. If A is bounded above and α = lub A ∉ A, then for all ε > 0,
(α − ε, α) ∩ A is an infinite set. Similarly, if A is bounded below and β = glb A ∉ A,
then for all ε > 0, (β, β + ε) ∩ A is an infinite set.
This is the final axiom. Any field F satisfying all eight axioms is called a
complete ordered field. We assume the existence of a complete ordered field, R,
called the real numbers.
In naive set theory it can be shown that if F_1 and F_2 are both complete ordered
fields, then they are the same, in the following sense. There exists a unique
bijective function i : F_1 → F_2 such that i(a + b) = i(a) + i(b), i(ab) = i(a)i(b) and
a < b ⇐⇒ i(a) < i(b). Such a function i is called an order isomorphism. The
existence of such an order isomorphism shows that R is essentially unique. More
reading on this topic can be done in some advanced texts [11, 12].
Every statement about upper bounds has a dual statement about lower
bounds. A proof of the following dual to Axiom 8 is left as an exercise.
PROOF. If the theorem is false, then a is an upper bound for N. Let β = lub N.
According to Theorem 2.18 there is an m ∈ N such that m > β − 1. But, this is a
contradiction because β = lub N < m + 1 ∈ N.
Some other variations on this theme are in the following corollaries.
COROLLARY 2.22. Let a, b ∈ R with a > 0.
(a) There is an n ∈ N such that an > b.
(b) There is an n ∈ N such that 0 < 1/n < a.
(c) There is an n ∈ N such that n − 1 ≤ a < n.
PROOF. (a) Use Theorem 2.21 to find n ∈ N where n > b/a.
(b) Let b = 1 in part (a).
(c) Theorem 2.21 guarantees that S = {n ∈ N : n > a} ≠ ∅. If n is the least
element of this set, then n − 1 ∉ S and n − 1 ≤ a < n.
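Parts (a) and (b) can be illustrated with exact rational arithmetic. The sketch below is an illustration rather than anything from the text: the function name `archimedean_n` is hypothetical, and it picks n = ⌊b/a⌋ + 1, which is one valid choice among infinitely many.

```python
import math
from fractions import Fraction


def archimedean_n(a, b):
    """Return a natural number n with a*n > b, assuming a > 0."""
    a, b = Fraction(a), Fraction(b)
    if a <= 0:
        raise ValueError("a must be positive")
    # floor(b/a) + 1 is strictly greater than b/a; max(...) keeps n in N.
    return max(1, math.floor(b / a) + 1)


a, b = Fraction(1, 3), Fraction(5)
n = archimedean_n(a, b)
assert a * n > b  # part (a)

m = archimedean_n(a, 1)  # part (b): take b = 1
assert 0 < Fraction(1, m) < a

print(n, m)
```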
COROLLARY 2.23. If I is any interval from R, then I ∩ Q ≠ ∅ and I ∩ Q^c ≠ ∅.
PROOF. Left as an exercise.
A subset of R which intersects every interval is said to be dense in R. Corollary
2.23 shows both the rational and irrational numbers are dense.
4. Comparisons of Q and R
All of the above still does not establish that Q is different from R. In Theorem
2.13, it was shown that the equation x^2 = 2 has no solution in Q. The following
theorem shows x^2 = 2 does have solutions in R. Since a copy of Q is embedded in
R, it follows, in a sense, that R is bigger than Q.
THEOREM 2.24. There is a positive α ∈ R such that α^2 = 2.
PROOF. Let S = {x > 0 : x^2 < 2}. Then 1 ∈ S, so S ≠ ∅. If x ≥ 2, then Theorem
2.5(c) implies x^2 ≥ 4 > 2, so S is bounded above. Let α = lub S. It will be shown
that α^2 = 2.
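Suppose first that α^2 < 2. This assumption implies (2 − α^2)/(2α + 1) > 0.
According to Corollary 2.22, there is an n ∈ N large enough so that
\[ 0 < \frac{1}{n} < \frac{2-\alpha^2}{2\alpha+1} \implies 0 < \frac{2\alpha+1}{n} < 2-\alpha^2. \]
Therefore,
\[ \left(\alpha+\frac{1}{n}\right)^2 = \alpha^2 + \frac{2\alpha}{n} + \frac{1}{n^2} = \alpha^2 + \frac{1}{n}\left(2\alpha+\frac{1}{n}\right) \le \alpha^2 + \frac{2\alpha+1}{n} < \alpha^2 + (2-\alpha^2) = 2, \]
so α + 1/n ∈ S, which contradicts the fact that α = lub S. Therefore, α^2 ≥ 2.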
Next, assume α^2 > 2. In this case, choose n ∈ N so that
\[ 0 < \frac{1}{n} < \frac{\alpha^2-2}{2\alpha} \implies 0 < \frac{2\alpha}{n} < \alpha^2-2. \]
July 18, 2016 http://math.louisville.edu/∼lee/ira
FIGURE 2.1. The proof of Theorem 2.26 is called the “diagonal argument”
because it constructs a new number z by working down the
main diagonal of the array shown above, making sure z(n) ≠ α_n(n) for
each n ∈ N.
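Then
\[ \left(\alpha-\frac{1}{n}\right)^2 = \alpha^2 - \frac{2\alpha}{n} + \frac{1}{n^2} > \alpha^2 - \frac{2\alpha}{n} > \alpha^2 - (\alpha^2-2) = 2, \]
so α − 1/n is an upper bound for S that is less than α, which again contradicts that α = lub S.
Therefore, α^2 = 2.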
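The least upper bound of S = {x > 0 : x^2 < 2} can be trapped numerically between elements of S and upper bounds of S. The sketch below uses bisection with exact rationals, a technique that is not part of the proof above and is added here only as an illustration; the function name `bracket_lub` is hypothetical.

```python
from fractions import Fraction


def bracket_lub(steps):
    """Return (lo, hi) with lo in S, hi an upper bound of S, hi - lo = 2**-steps."""
    lo, hi = Fraction(1), Fraction(2)  # 1 is in S and 2 is an upper bound
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid < 2:
            lo = mid   # mid is in S
        else:
            hi = mid   # mid is an upper bound for S
    return lo, hi


lo, hi = bracket_lub(20)
assert lo * lo < 2 <= hi * hi
print(float(lo), float(hi))
```

Every `lo` and `hi` produced this way is rational, so neither ever squares to exactly 2; the number α itself exists only because of the Completeness Axiom.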
Theorem 2.13 leads to the obvious question of how much bigger R is than
Q. First, note that since N ⊂ Q, it is clear that card (Q) ≥ ℵ0 . On the other hand,
every q ∈ Q has a unique reduced fractional representation q = m(q)/n(q) with
m(q) ∈ Z and n(q) ∈ N. This gives an injective function f : Q → Z × N defined by
f (q) = (m(q), n(q)), and we conclude card (Q) ≤ card (Z × N) = ℵ0 . The following
theorem ensues.
THEOREM 2.25. card (Q) = ℵ0 .
In 1874, Georg Cantor first showed that R is not countable. The following
proof is his famous diagonal argument from 1891.
THEOREM 2.26. card (R) > ℵ0
PROOF. It suffices to prove that card ([0, 1]) > ℵ0 . If this is not true, then there
is a bijection α : N → [0, 1]; i.e.,
\[ [0, 1] = \{\alpha_n : n \in \mathbb{N}\}. \tag{5} \]
Each x ∈ [0, 1] can be written in the decimal form x = ∑_{n=1}^∞ x(n)/10^n where
x(n) ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} for each n ∈ N. This decimal representation is not
necessarily unique. For example,
\[ \frac{1}{2} = \frac{5}{10} = \frac{4}{10} + \sum_{n=2}^{\infty} \frac{9}{10^n}. \]
In such a case, there is a choice of x(n) so it is constantly 9 or constantly 0 from
some N onward. When given a choice, we will always opt to end the number with
a string of nines. With this convention, the decimal representation of x is unique.
Define z ∈ [0, 1] by choosing z(n) ∈ {d ∈ ω : d ≤ 8} such that z(n) ≠ α_n(n).
Let z = ∑_{n=1}^∞ z(n)/10^n. Since z ∈ [0, 1], there is an n ∈ N such that z = α_n. But,
this is impossible because z(n) differs from α_n in the nth decimal place. This
contradiction shows card ([0, 1]) > ℵ0 .
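The diagonal construction is easy to carry out on a finite list of digit strings. The following sketch is only an illustration: it works with finitely many rows of finitely many digits, and the rule "use 5, or 4 when the diagonal digit is already 5" is an arbitrary choice that keeps every digit of z at most 8. The function name `diagonal_digits` is hypothetical.

```python
def diagonal_digits(rows):
    """Return digits z with z[n] != rows[n][n] and every digit at most 8.

    rows[n] is a string holding the first decimal digits of the nth number.
    """
    return "".join(
        "4" if row[n] == "5" else "5" for n, row in enumerate(rows)
    )


rows = ["5000", "7182", "4152", "5772"]  # made-up digit strings
z = diagonal_digits(rows)

# z differs from the nth row in the nth decimal place.
assert all(z[n] != rows[n][n] for n in range(len(rows)))
print(z)  # 4545
```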
Around the turn of the twentieth century these then-new ideas about infinite
sets were very controversial in mathematics. This is because some of these ideas
are very unintuitive. For example, the rational numbers are a countable set
and the irrational numbers are uncountable, yet between every two rational
numbers are an uncountable number of irrational numbers and between every
two irrational numbers there are a countably infinite number of rational numbers.
It would seem there are either too few or too many gaps in the sets to make this
possible. Such a seemingly paradoxical situation flies in the face of our intuition,
which was developed with finite sets in mind.
This brings us back to the discussion of cardinalities and the Continuum Hy-
pothesis at the end of Section 5. Most of the time, people working in real analysis
assume the Continuum Hypothesis is true. With this assumption and Theo-
rem 2.26 it follows that whenever A ⊂ R, then either card (A) ≤ ℵ0 or card (A) =
card (R) = card (P (N)).4 Since P (N) has many more elements than N, any count-
able subset of R is considered to be a small set, in the sense of cardinality, even if
it is infinite. This works against the intuition of many beginning students who are
not used to thinking of Q, or any other infinite set as being small. But it turns out
to be quite useful because the fact that the union of a countably infinite number
of countable sets is still countable can be exploited in many ways.5
In later chapters, other useful small versus large dichotomies will be found.
5. Exercises
2.11. If F is an ordered field and a ∈ F such that 0 ≤ a < ε for every ε > 0, then
a = 0.
2.14. If p is a prime number and ε > 0, then there are x, y ∈ Q such that x^2 < p <
y^2 < x^2 + ε.
2.20. Let F be an ordered field. (a) Prove F has no upper or lower bounds.
(b) Every element of F is both an upper and lower bound for ∅.
2.23. If A ⊂ R and B = {x : x is an upper bound for A}, then lub (A) = glb (B ).
Sequences
We begin our study of analysis with sequences. There are several reasons
for starting here. First, sequences are the simplest way to introduce limits, the
central idea of calculus. Second, sequences are a direct route to the topology of
the real numbers. The combination of limits and topology provides the tools to
finally prove the theorems you’ve already used in your calculus course.
1. Basic Properties
DEFINITION 3.1. A sequence is a function a : N → R.
EXAMPLE 3.1. Let the sequence a_n = 1 − 1/n. The first three elements are
a_1 = 0, a_2 = 1/2, a_3 = 2/3, etc.
EXAMPLE 3.3. Let the sequence c_n = 100 − 5n so c_1 = 95, c_2 = 90, c_3 = 85, etc.
EXAMPLE 3.6. Some sequences are not defined by an explicit formula, but are
defined recursively. This is an inductive method of definition in which successive
terms of the sequence are defined by using other terms of the sequence. The most
famous of these is the Fibonacci sequence. To define the Fibonacci sequence,
f_n, let f_1 = 0, f_2 = 1 and for n > 2, let f_n = f_{n−2} + f_{n−1}. The first few terms are
0, 1, 1, 2, 3, 5, 8, . . . . There actually is a simple formula that directly gives f_n, but we
leave its derivation as Exercise 3.6.
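The recursive definition translates directly into a short loop. This is a minimal sketch using the indexing of Example 3.6 (f_1 = 0, f_2 = 1); the function name `fibonacci` is chosen here for illustration.

```python
def fibonacci(n):
    """Return f_n, where f_1 = 0, f_2 = 1 and f_n = f_{n-2} + f_{n-1}."""
    if n < 1:
        raise ValueError("n must be a natural number")
    prev, curr = 0, 1  # f_1 and f_2
    for _ in range(n - 1):
        prev, curr = curr, prev + curr
    return prev


print([fibonacci(n) for n in range(1, 8)])  # [0, 1, 1, 2, 3, 5, 8]
```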
EXAMPLE 3.7. These simple definitions can lead to complex problems. One
famous case is a hailstone sequence. Let h_1 be any natural number. For n > 1,
recursively define
\[ h_n = \begin{cases} 3h_{n-1} + 1, & \text{if } h_{n-1} \text{ is odd} \\ h_{n-1}/2, & \text{if } h_{n-1} \text{ is even} \end{cases}. \]
Lothar Collatz conjectured in 1937 that any hailstone sequence eventually settles
down to repeating the pattern 1, 4, 2, 1, 4, 2, · · · . Many people have tried to prove
this and all have failed.
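A hailstone sequence is simple to generate, which makes the conjecture easy to test for particular starting values. The sketch below stops when it first reaches 1; the `max_terms` cap is an assumption added here as a safeguard, since nothing guarantees that 1 is ever reached.

```python
def hailstone(h1, max_terms=1000):
    """Return the hailstone sequence starting at h1, stopping at the first 1."""
    terms = [h1]
    while terms[-1] != 1 and len(terms) < max_terms:
        h = terms[-1]
        terms.append(3 * h + 1 if h % 2 else h // 2)
    return terms


print(hailstone(6))  # [6, 3, 10, 5, 16, 8, 4, 2, 1]
```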
It’s often inconvenient for the domain of a sequence to be N, as required by
Definition 3.1. For example, the sequence beginning 1, 2, 4, 8, . . . can be written
2^0, 2^1, 2^2, 2^3, . . . . Written this way, it’s natural to let the sequence function be 2^n
with domain ω. As long as there is a simple substitution to write the sequence
function in the form of Definition 3.1, there’s no reason to adhere to the letter
of the law. In general, the domain of a sequence can be any set of the form
{n ∈ Z : n ≥ N } for some N ∈ Z.
DEFINITION 3.2. A sequence a_n is bounded if {a_n : n ∈ N} is a bounded set.
This definition is extended in the obvious way to bounded above and bounded
below.
The sequence of Example 3.1 is bounded, but the sequence of Example 3.2 is
not, although it is bounded below.
DEFINITION 3.3. A sequence a_n converges to L ∈ R if for all ε > 0 there exists
an N ∈ N such that whenever n ≥ N, then |a_n − L| < ε. If a sequence does not
converge, then it is said to diverge.
When a_n converges to L, we write lim_{n→∞} a_n = L, or often, more simply,
a_n → L.
EXAMPLE 3.8. Let a_n = 1 − 1/n be as in Example 3.1. We claim a_n → 1. To see
this, let ε > 0 and choose N ∈ N such that 1/N < ε. Then, if n ≥ N,
|a_n − 1| = |(1 − 1/n) − 1| = 1/n ≤ 1/N < ε,
so a_n → 1.
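The choice of N in this example can be computed explicitly for any given ε. The sketch below is an illustration with exact rationals; the function names `a` and `witness_N` are hypothetical, and the check over 1000 consecutive indices is only a spot check, not a proof.

```python
import math
from fractions import Fraction


def a(n):
    """The sequence a_n = 1 - 1/n."""
    return 1 - Fraction(1, n)


def witness_N(eps):
    """Return the smallest N in N with 1/N < eps, assuming eps > 0."""
    return math.floor(1 / Fraction(eps)) + 1


eps = Fraction(1, 100)
N = witness_N(eps)
assert Fraction(1, N) < eps
# Spot check: |a_n - 1| < eps for many n >= N.
assert all(abs(a(n) - 1) < eps for n in range(N, N + 1000))
print(N)  # 101
```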
EXAMPLE 3.9. The sequence b_n = 2^n of Example 3.2 diverges. To see this,
suppose not. Then there is an L ∈ R such that b_n → L. If ε = 1, there must be an
N ∈ N such that |b_n − L| < ε whenever n ≥ N. Choose n ≥ N. |L − 2^n| < 1 implies
L < 2^n + 1. But, then
b_{n+1} − L = 2^{n+1} − L > 2^{n+1} − (2^n + 1) = 2^n − 1 ≥ 1 = ε.
This violates the condition on N. We conclude that for every L ∈ R there exists
an ε > 0 such that for no N ∈ N is it true that whenever n ≥ N, then |b_n − L| < ε.
Therefore, b_n diverges.
DEFINITION 3.4. A sequence a_n diverges to ∞ if for every B > 0 there is an
N ∈ N such that n ≥ N implies a_n > B. The sequence a_n is said to diverge to −∞
if −a_n diverges to ∞.
Therefore a_n + b_n → A + B.
(b) Let ε > 0 and α > 0 be an upper bound for |a_n|. Choose N_1, N_2 ∈ N such
that n ≥ N_1 =⇒ |a_n − A| < ε/(2(|B| + 1)) and n ≥ N_2 =⇒ |b_n − B| < ε/(2α).
If n ≥ N = max{N_1, N_2}, then
\[ \begin{aligned}
|a_n b_n - AB| &= |a_n b_n - a_n B + a_n B - AB| \\
&\le |a_n b_n - a_n B| + |a_n B - AB| \\
&= |a_n||b_n - B| + |B||a_n - A| \\
&< \alpha \frac{\varepsilon}{2\alpha} + |B| \frac{\varepsilon}{2(|B|+1)} \\
&< \varepsilon/2 + \varepsilon/2 = \varepsilon.
\end{aligned} \]
(c) First, notice that it suffices to show that 1/b_n → 1/B, because part (b) of
this theorem can be used to achieve the full result.
Let ε > 0. Choose N ∈ N so that the following two conditions are
satisfied: n ≥ N =⇒ |b_n| > |B|/2 and |b_n − B| < B^2 ε/2. Then, when
n ≥ N,
\[ \left|\frac{1}{b_n} - \frac{1}{B}\right| = \left|\frac{B - b_n}{b_n B}\right| < \left|\frac{B^2\varepsilon/2}{(B/2)B}\right| = \varepsilon. \]
Therefore 1/b_n → 1/B.
If you’re not careful, you can easily read too much into the previous theorem
and try to use its converse. Consider the sequences a_n = (−1)^n and b_n = −a_n.
Their sum, a_n + b_n = 0, product a_n b_n = −1 and quotient a_n/b_n = −1 all converge,
but the original sequences diverge.
It is often easier to prove that a sequence converges by comparing it with a
known sequence than it is to analyze it directly. For example, a sequence such as
a_n = sin^2(n)/n^3 can easily be seen to converge to 0 because it is dominated by 1/n^3.
The following theorem makes this idea more precise. It’s called the Sandwich
Theorem here, but is also called the Squeeze, Pinching, Pliers or Comparison
Theorem in different texts.
2. Monotone Sequences
One of the problems with using the definition of convergence to prove a
given sequence converges is the limit of the sequence must be known in order
to verify the sequence converges. This gives rise in the best cases to a “chicken
and egg” problem of somehow determining the limit before you even know the
sequence converges. In the worst case, there is no nice representation of the
limit to use, so you don’t even have a “target” to shoot at. The next few sections
are ultimately concerned with removing this deficiency from Definition 3.3, but
some interesting side-issues are explored along the way.
Not surprisingly, we begin with the simplest case.
The key idea of this proof is the existence of the least upper bound of the
sequence when viewed as a set. This means the Completeness Axiom implies
Theorem 3.11. In fact, it isn’t hard to prove Theorem 3.11 also implies the Com-
pleteness Axiom, showing they are equivalent statements. Because of this, Theo-
rem 3.11 is often used as the Completeness Axiom on R instead of the least upper
bound property we used in Axiom 8.
EXAMPLE 3.11. The sequence e_n = (1 + 1/n)^n converges.
and
\[ e_{n+1} = \sum_{k=0}^{n+1} \binom{n+1}{k} \frac{1}{(n+1)^k}. \tag{10} \]
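\[ \begin{aligned}
\binom{n}{k} \frac{1}{n^k} &= \frac{n(n-1)(n-2)\cdots(n-(k-1))}{k!\,n^k} \\
&= \frac{1}{k!}\,\frac{n-1}{n}\,\frac{n-2}{n}\cdots\frac{n-k+1}{n} \\
&= \frac{1}{k!}\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\cdots\left(1-\frac{k-1}{n}\right) \\
&< \frac{1}{k!}\left(1-\frac{1}{n+1}\right)\left(1-\frac{2}{n+1}\right)\cdots\left(1-\frac{k-1}{n+1}\right) \\
&= \frac{1}{k!}\left(\frac{n}{n+1}\right)\left(\frac{n-1}{n+1}\right)\cdots\left(\frac{n+1-(k-1)}{n+1}\right) \\
&= \frac{(n+1)n(n-1)(n-2)\cdots(n+1-(k-1))}{k!\,(n+1)^k} \\
&= \binom{n+1}{k} \frac{1}{(n+1)^k},
\end{aligned} \]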
which is the kth term of (10). Since (10) also has one more positive term in the
sum, it follows that e_n < e_{n+1}, and the sequence e_n is increasing.
Noting that 1/k! ≤ 1/2^{k−1} for k ∈ N, we can bound the kth term of (9).
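\[ \begin{aligned}
\binom{n}{k} \frac{1}{n^k} &= \frac{n!}{k!(n-k)!}\,\frac{1}{n^k} \\
&= \frac{n-1}{n}\,\frac{n-2}{n}\cdots\frac{n-k+1}{n}\,\frac{1}{k!} \\
&< \frac{1}{k!} \\
&\le \frac{1}{2^{k-1}}.
\end{aligned} \]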
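Substituting this into (9) yields
\[ \begin{aligned}
e_n &= \sum_{k=0}^{n} \binom{n}{k} \frac{1}{n^k} \\
&< 1 + 1 + \frac{1}{2} + \frac{1}{4} + \cdots + \frac{1}{2^{n-1}} \\
&= 1 + \frac{1 - \frac{1}{2^n}}{1 - \frac{1}{2}} < 3,
\end{aligned} \]
so e_n is bounded.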
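Both conclusions, that e_n is increasing and that it is bounded above by 3, can be spot-checked with exact rational arithmetic. The sketch below tests only the first fifty terms, so it illustrates the claims without proving them; the function name `e` is chosen here for convenience.

```python
from fractions import Fraction


def e(n):
    """The nth term e_n = (1 + 1/n)^n as an exact rational number."""
    return (1 + Fraction(1, n)) ** n


values = [e(n) for n in range(1, 51)]

# Strictly increasing and bounded above by 3 on this range.
assert all(x < y for x, y in zip(values, values[1:]))
assert all(v < 3 for v in values)

print(float(values[0]), float(values[-1]))
```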
In case a_n is bounded, both lim inf a_n and lim sup a_n are accumulation points of
a_n and a_n converges iff lim inf a_n = lim_{n→∞} a_n = lim sup a_n.
The following theorem has been proved.
PROOF. Since the intervals are nested, it’s clear that $a_n$ is an increasing
sequence bounded above by $b_1$ and $b_n$ is a decreasing sequence bounded below by
$a_1$. Applying Theorem 3.11 twice, we find there are $\alpha, \beta \in \mathbb{R}$ such that $a_n \to \alpha$ and
$b_n \to \beta$.
We claim $\alpha = \beta$. To see this, let $\varepsilon > 0$ and use the “shrinking” condition on
the intervals to pick $N \in \mathbb{N}$ so that $b_N - a_N < \varepsilon$. The nestedness of the intervals
implies $a_N \le a_n < b_n \le b_N$ for all $n \ge N$. Therefore
$$a_N \le \operatorname{lub}\{a_n : n \ge N\} = \alpha \le b_N \quad\text{and}\quad a_N \le \operatorname{glb}\{b_n : n \ge N\} = \beta \le b_N.$$
This shows $|\alpha - \beta| \le |b_N - a_N| < \varepsilon$. Since $\varepsilon > 0$ was chosen arbitrarily, we conclude
$\alpha = \beta$.
Let $x = \alpha = \beta$. It remains to show that $\bigcap_{n\in\mathbb{N}} I_n = \{x\}$.
6. Cauchy Sequences
Often the biggest problem with showing that a sequence converges using
the techniques we have seen so far is we must know ahead of time to what it
converges. This is the “chicken and egg” problem mentioned above. An escape
from this dilemma is provided by Cauchy sequences.
This definition is a bit more subtle than it might at first appear. It sort of says
that all the terms of the sequence are close together from some point onward.
The emphasis is on all the terms from some point onward. To stress this, first
consider a negative example.
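EXAMPLE 3.18. Suppose $a_n = \sum_{k=1}^{n} 1/k$ for $n \in \mathbb{N}$. There’s a trick for showing
the sequence $a_n$ diverges. First, note that $a_n$ is strictly increasing. For any $n \in \mathbb{N}$,
consider
$$\begin{aligned}
a_{2^n-1} = \sum_{k=1}^{2^n-1}\frac{1}{k} &= \sum_{j=0}^{n-1}\sum_{k=0}^{2^j-1}\frac{1}{2^j+k} \\
&> \sum_{j=0}^{n-1}\sum_{k=0}^{2^j-1}\frac{1}{2^{j+1}} = \sum_{j=0}^{n-1}\frac{1}{2} = \frac{n}{2} \to \infty.
\end{aligned}$$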
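The grouping estimate $a_{2^n-1} > n/2$ can be checked for small $n$ with exact arithmetic. This is only an illustrative sketch; the helper name `harmonic` and the range of $n$ tested are assumptions made for the example.

```python
from fractions import Fraction

def harmonic(n: int) -> Fraction:
    """Return the partial sum a_n = 1 + 1/2 + ... + 1/n exactly."""
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))

# Compare a_{2^n - 1} with the lower bound n/2 for n = 1, ..., 10.
checks = [(n, harmonic(2**n - 1), Fraction(n, 2)) for n in range(1, 11)]
for n, value, bound in checks:
    print(n, float(value), float(bound), value > bound)
```

The partial sums grow without bound, but very slowly: roughly doubling the number of terms is needed to add 1/2 to the lower bound.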
This shows the claim is true in the case n + 1. Therefore, by induction, the claim
is true for all n ∈ N.
To show $x_n$ is a Cauchy sequence, let $\varepsilon > 0$. Since $c^n \to 0$, we can choose
$N \in \mathbb{N}$ so that
$$\frac{c^{N-1}}{1-c}\,|x_1 - x_2| < \varepsilon. \tag{14}$$
Let $n > m \ge N$. Then
$$\begin{aligned}
|x_n - x_m| &= |x_n - x_{n-1} + x_{n-1} - x_{n-2} + x_{n-2} - \cdots - x_{m+1} + x_{m+1} - x_m| \\
&\le |x_n - x_{n-1}| + |x_{n-1} - x_{n-2}| + \cdots + |x_{m+1} - x_m| \\
&\le \frac{c^{N-1}}{1-c}\,|x_1 - x_2| \\
&< \frac{\varepsilon}{|x_1 - x_2|}\,|x_1 - x_2| \\
&= \varepsilon.
\end{aligned}$$
This shows x n is a Cauchy sequence and must converge by Theorem 3.22.
EXAMPLE 3.19. Let $-1 < r < 1$ and define the sequence $s_n = \sum_{k=0}^{n} r^k$. (You no
doubt recognize this as the geometric series from your calculus course.) If $r = 0$,
the convergence of $s_n$ is trivial. So, suppose $r \ne 0$. In this case,
$$\frac{|s_{n+1} - s_n|}{|s_n - s_{n-1}|} = \left|\frac{r^{n+1}}{r^n}\right| = |r| < 1$$
and $s_n$ is contractive. Theorem 3.24 implies $s_n$ converges.
EXAMPLE 3.20. Suppose $f(x) = 2 + 1/x$, $a_1 = 2$ and $a_{n+1} = f(a_n)$ for $n \in \mathbb{N}$. It
is evident that $a_n \ge 2$ for all $n$. Some algebra gives
$$\left|\frac{a_{n+1} - a_n}{a_n - a_{n-1}}\right| = \left|\frac{f(f(a_{n-1})) - f(a_{n-1})}{f(a_{n-1}) - a_{n-1}}\right| = \frac{1}{1 + 2a_{n-1}} \le \frac{1}{5}.$$
This shows $a_n$ is a contractive sequence and, according to Theorem 3.24, $a_n \to L$
for some $L \ge 2$. Since $a_{n+1} = 2 + 1/a_n$, taking the limit as $n \to \infty$ of both sides
gives $L = 2 + 1/L$. A bit more algebra shows $L = 1 + \sqrt{2}$.
L is called a fixed point of the function f ; i.e. f (L) = L. Many approximation
techniques for solving equations involve such iterative techniques depending
upon contraction to find fixed points.
The calculations in the proof of Theorem 3.24 give the means to approximate
the fixed point to within an allowable error. Looking at line (15), notice
$$|x_n - x_m| < \frac{c^{m-1}}{1-c}\,|x_1 - x_2|.$$
Let $n \to \infty$ in this inequality to arrive at the error estimate
$$|L - x_m| \le \frac{c^{m-1}}{1-c}\,|x_1 - x_2|. \tag{16}$$
In Example 3.20, $a_1 = 2$, $a_2 = 5/2$ and $c \le 1/5$. Suppose we want to approximate
$L$ to 5 decimal places of accuracy. This means we need $|a_n - L| < 5 \times 10^{-6}$.
Using (16) with $m = 9$ shows
$$\frac{c^{m-1}}{1-c}\,|a_1 - a_2| \le 1.6 \times 10^{-6}.$$
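The error estimate (16) can be compared with the actual error of the iteration in Example 3.20. The sketch below uses floating-point arithmetic, so it illustrates the estimate rather than proving it; the helper names `iterate` and `error_bound` are made up for this example.

```python
import math

def f(x: float) -> float:
    """The function from Example 3.20."""
    return 2 + 1 / x

def iterate(a1: float, m: int) -> float:
    """Return a_m, where a_1 = a1 and a_{n+1} = f(a_n)."""
    a = a1
    for _ in range(m - 1):
        a = f(a)
    return a

def error_bound(a1: float, a2: float, c: float, m: int) -> float:
    """Right-hand side of estimate (16): c^(m-1) / (1 - c) * |a1 - a2|."""
    return c ** (m - 1) / (1 - c) * abs(a1 - a2)

L = 1 + math.sqrt(2)                     # the fixed point found in Example 3.20
a9 = iterate(2.0, 9)                     # the ninth term of the sequence
bound = error_bound(2.0, 2.5, 1 / 5, 9)  # the bound from (16) with m = 9

print(a9, abs(a9 - L), bound)
```

The actual error is smaller than the bound, which is expected: (16) uses the worst-case contraction constant $c = 1/5$, while the observed ratios $1/(1 + 2a_{n-1})$ are somewhat smaller.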
July 18, 2016 http://math.louisville.edu/∼lee/ira
7. Exercises
3.1. Let the sequence $a_n = \dfrac{6n-1}{3n+2}$. Use the definition of convergence for a
sequence to show $a_n$ converges.
3.11. Find a sequence a n such that given x ∈ [0, 1], there is a subsequence b n of
a n such that b n → x.
3.19. Let a n and b n be sequences. Prove that both sequences a n and b n converge
iff both a n + b n and a n − b n converge.
3.20. Let a n be a bounded sequence. Prove that given any ε > 0, there is an
interval I with length ε such that {n : a n ∈ I } is infinite. Is it necessary that a n be
bounded?
3.23. If a n is a Cauchy sequence whose terms are integers, what can you say
about the sequence?
3.24. Show $a_n = \sum_{k=0}^{n} 1/k!$ is a Cauchy sequence.
3.27. If 0 < α < 1 and s n is a sequence satisfying |s n+1 | < α|s n |, then s n → 0.
3.35. If $a_n$ is a sequence of positive numbers, then $1/\liminf a_n = \limsup 1/a_n$.
3.38. The equation $x^3 - 4x + 2 = 0$ has one real root lying between 0 and 1. Find
a sequence of rational numbers converging to this root. Use this sequence to
approximate the root to five decimal places.
Series
1. What is a Series?
The idea behind adding up an infinite collection of numbers is a reduction to
the well-understood idea of a sequence. This is a typical approach in mathemat-
ics: reduce a question to a previously solved problem.
DEFINITION 4.1. Given a sequence $a_n$, the series having $a_n$ as its terms is the
new sequence
$$s_n = \sum_{k=1}^{n} a_k = a_1 + a_2 + \cdots + a_n.$$
The numbers $s_n$ are called the partial sums of the series. If $s_n \to S \in \mathbb{R}$, then the
series converges to $S$. This is normally written as
$$\sum_{k=1}^{\infty} a_k = S.$$
called a geometric series. The number r is called the ratio of the series.
Suppose $a_n = r^{n-1}$ for $r \ne 1$. Then,
$$s_1 = 1,\quad s_2 = 1 + r,\quad s_3 = 1 + r + r^2,\ \ldots$$
[Figure 4.1: steps toward the wall, marked at distances 2, 1, 1/2 and 0 from the wall.]
In some cases, the geometric series has an intuitively plausible limit. If you
start two meters away from a wall and keep stepping halfway to the wall, no
number of steps will get you to the wall, but a large number of steps will get you
as close to the wall as you want. (See Figure 4.1.) So, the total distance stepped
has limiting value 2. The total distance after n steps is the nth partial sum of a
geometric series with ratio r = 1/2 and c = 1.
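The stepping picture can be simulated directly. In this small sketch, each step covers half of the remaining distance, and the total distance stepped is tracked exactly; the function name `distance_stepped` is an illustrative choice.

```python
from fractions import Fraction

def distance_stepped(n: int) -> Fraction:
    """Total distance covered after n steps, starting 2 meters from the wall."""
    remaining = Fraction(2)   # current distance to the wall
    total = Fraction(0)       # distance covered so far
    for _ in range(n):
        step = remaining / 2  # step halfway to the wall
        total += step
        remaining -= step
    return total

for n in (1, 2, 3, 10, 20):
    print(n, distance_stepped(n), float(2 - distance_stepped(n)))
```

After $n$ steps the remaining gap is $2/2^n$, so the total never reaches 2 but gets as close to it as desired.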
for each of the two series. By assumption, there are numbers A and B where
A n → A and B n → B .
(a) $\sum_{k=1}^{n} c\,a_k = c\sum_{k=1}^{n} a_k = cA_n \to cA$.
Many have made the mistake of reading too much into Corollary 4.3. It can
only be used to show divergence. When the terms of a series do tend to zero, that
does not guarantee convergence. Example 4.3 shows Theorem 4.2(c) is necessary,
but not sufficient, for convergence.
Another useful observation is that the partial sums of a convergent series are
a Cauchy sequence. The Cauchy criterion for sequences can be rephrased for
series as the following theorem, the proof of which is Exercise 4.4.
THEOREM 4.4 (Cauchy Criterion). Let $\sum a_n$ be a series. The following statements
are equivalent.
(a) $\sum a_n$ converges.
(b) For every $\varepsilon > 0$ there is an $N \in \mathbb{N}$ such that whenever $n \ge m \ge N$, then
$$\left|\sum_{i=m}^{n} a_i\right| < \varepsilon.$$
2. Positive Series
Most of the time, it is very hard or impossible to determine the exact limit of
a convergent series. We must satisfy ourselves with determining whether a series
converges, and then approximating its sum. For this reason, the study of series
usually involves learning a collection of theorems that might answer whether a
given series converges, but don’t tell us to what it converges. These theorems are
usually called the convergence tests. The reader probably remembers a battery
of such tests from her calculus course. There is a myriad of such tests, and the
standard ones are presented in the next few sections, along with a few of those
less widely used.
Since convergence of a series is determined by convergence of the sequence
of its partial sums, the easiest series to study are those with well-behaved partial
sums. Series with monotone sequences of partial sums are certainly the simplest
such series.
DEFINITION 4.5. The series $\sum a_n$ is a positive series, if $a_n \ge 0$ for all $n$.
2.1. The Most Common Convergence Tests. All beginning calculus courses
contain several simple tests to determine whether positive series converge. Most
of them are presented below.
2.1.1. Comparison Tests. The most basic convergence tests are the compari-
son tests. In these tests, the behavior of one series is inferred from that of another
series. Although they’re easy to use, there is one often fatal catch: in order to use
a comparison test, you must have a known series to which you can compare the
mystery series. For this reason, a wise mathematician collects example series
for her toolbox. The more samples in the toolbox, the more powerful are the
comparison tests.
THEOREM 4.6 (Comparison Test). Suppose $\sum a_n$ and $\sum b_n$ are positive series
with $a_n \le b_n$ for all $n$.
(a) If $\sum b_n$ converges, then so does $\sum a_n$.
(b) If $\sum a_n$ diverges, then so does $\sum b_n$.
$$a_1 + \underbrace{a_2 + a_3}_{\le 2a_2} + \underbrace{a_4 + a_5 + a_6 + a_7}_{\le 4a_4} + \underbrace{a_8 + a_9 + \cdots + a_{15}}_{\le 8a_8} + a_{16} + \cdots$$
$$a_1 + \underbrace{a_2}_{\ge a_2} + \underbrace{a_3 + a_4}_{\ge 2a_4} + \underbrace{a_5 + a_6 + a_7 + a_8}_{\ge 4a_8} + \underbrace{a_9 + \cdots + a_{15} + a_{16}}_{\ge 8a_{16}} + \cdots$$
FIGURE 4.2. This diagram shows the groupings used in inequality (25).
PROOF. Let $A_n$ and $B_n$ be the partial sums of $\sum a_n$ and $\sum b_n$, respectively. It
follows from the assumptions that $A_n$ and $B_n$ are increasing and for all $n \in \mathbb{N}$,
$$A_n \le B_n. \tag{24}$$
If $\sum b_n = B$, then (24) implies $B$ is an upper bound for $A_n$, and $\sum a_n$ converges.
On the other hand, if $\sum a_n$ diverges, $A_n \to \infty$ and the Sandwich Theorem
3.9(b) shows $B_n \to \infty$.
EXAMPLE 4.5. Example 4.3 shows that $\sum 1/n$ diverges. If $p \le 1$, then $1/n^p \ge$
$$\frac{\sin^2 n}{2^n} \le \frac{1}{2^n}$$
for all $n$ and the geometric series $\sum 1/2^n = 1$.
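The comparison $\sin^2 n / 2^n \le 1/2^n$ can be seen at the level of partial sums. The sketch below is a floating-point illustration only; the function names and the cutoff of 40 terms are assumptions made for the example.

```python
import math

def partial_sum_a(N: int) -> float:
    """Partial sum of sin^2(n) / 2^n for n = 1, ..., N."""
    return sum(math.sin(n) ** 2 / 2**n for n in range(1, N + 1))

def partial_sum_b(N: int) -> float:
    """Partial sum of the geometric series 1 / 2^n for n = 1, ..., N."""
    return sum(1 / 2**n for n in range(1, N + 1))

# The smaller series has increasing partial sums bounded by those of the larger.
for N in (1, 5, 10, 40):
    print(N, partial_sum_a(N), partial_sum_b(N))
```

Because the partial sums of the smaller series are increasing and bounded above by 1, they converge, which is exactly the mechanism in the proof of the Comparison Test.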
1 The series $\sum 2^n a_{2^n}$ is sometimes called the condensed series associated with $\sum a_n$.
EXAMPLE 4.7 (p-series). For fixed $p \in \mathbb{R}$, the series $\sum 1/n^p$ is called a p-series.
EXAMPLE 4.8. To test the series $\sum \dfrac{1}{2^n - n}$ for convergence, let
$$a_n = \frac{1}{2^n - n} \quad\text{and}\quad b_n = \frac{1}{2^n}.$$
Then
$$\lim_{n\to\infty}\frac{a_n}{b_n} = \lim_{n\to\infty}\frac{1/(2^n - n)}{1/2^n} = \lim_{n\to\infty}\frac{2^n}{2^n - n} = \lim_{n\to\infty}\frac{1}{1 - n/2^n} = 1 \in (0, \infty).$$
Since $\sum 1/2^n = 1$, the original series converges by the Limit Comparison Test.
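The limit in Example 4.8 can be examined numerically. This sketch computes the ratio $a_n/b_n = 2^n/(2^n - n)$ exactly and accumulates partial sums of $\sum 1/(2^n - n)$; the function names are illustrative, and the observation that the ratio never exceeds 2 is a check made here, not a claim from the text.

```python
from fractions import Fraction

def ratio(n: int) -> Fraction:
    """a_n / b_n for a_n = 1/(2^n - n) and b_n = 1/2^n."""
    return Fraction(2**n, 2**n - n)

def partial_sum(N: int) -> Fraction:
    """Partial sum of 1/(2^n - n) for n = 1, ..., N."""
    return sum((Fraction(1, 2**n - n) for n in range(1, N + 1)), Fraction(0))

# The ratio approaches 1 as n grows.
for n in (1, 2, 5, 10, 50):
    print(n, float(ratio(n)))

# Since a_n <= 2 b_n and the b_n sum to 1, the partial sums stay below 2.
print(float(partial_sum(60)))
```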
PROOF. First, suppose $\rho < 1$ and $r \in (\rho, 1)$. There is an $N \in \mathbb{N}$ so that $a_n^{1/n} < r$
for all $n \ge N$. This is the same as $a_n < r^n$ for all $n \ge N$. Using this, it follows that
when $n \ge N$,
$$\sum_{k=1}^{n} a_k = \sum_{k=1}^{N-1} a_k + \sum_{k=N}^{n} a_k < \sum_{k=1}^{N-1} a_k + \sum_{k=N}^{n} r^k < \sum_{k=1}^{N-1} a_k + \frac{r^N}{1-r}.$$
This shows the partial sums of $\sum a_n$ are bounded. Therefore, it must converge.
If $\rho > 1$, there is an increasing sequence of integers $k_n \to \infty$ such that $a_{k_n}^{1/k_n} > 1$
for all $n \in \mathbb{N}$. This shows $a_{k_n} > 1$ for all $n \in \mathbb{N}$. By Theorem 4.3, $\sum a_n$ diverges.
EXAMPLE 4.9. For any $x \in \mathbb{R}$, the series $\sum |x^n|/n!$ converges. To see this, note
EXAMPLE 4.10. Consider the p-series $\sum 1/n$ and $\sum 1/n^2$. The first diverges
and the second converges. Since $n^{1/n} \to 1$ and $n^{2/n} \to 1$, it can be seen that when
$\rho = 1$, the Root Test is inconclusive.
THEOREM 4.11 (Ratio Test). Suppose $\sum a_n$ is a positive series. Let
$$r = \liminf \frac{a_{n+1}}{a_n} \le \limsup \frac{a_{n+1}}{a_n} = R.$$
If $R < 1$, then $\sum a_n$ converges. If $r > 1$, then $\sum a_n$ diverges.
PROOF. First, suppose $R < 1$ and $\rho \in (R, 1)$. There exists $N \in \mathbb{N}$ such that
$a_{n+1}/a_n < \rho$ whenever $n \ge N$. This implies $a_{n+1} < \rho a_n$ whenever $n \ge N$. From
this it’s easy to prove by induction that $a_{N+m} < \rho^m a_N$ whenever $m \in \mathbb{N}$. It follows
In calculus books, the ratio test usually takes the following simpler form.
COROLLARY 4.12 (Ratio Test). Suppose $\sum a_n$ is a positive series. Let
$$r = \lim_{n\to\infty} \frac{a_{n+1}}{a_n}.$$
If $r < 1$, then $\sum a_n$ converges. If $r > 1$, then $\sum a_n$ diverges.
From a practical viewpoint, the ratio test is often easier to apply than the root
test. But, the root test is actually the stronger of the two in the sense that there
are series for which the ratio test fails, but the root test succeeds. (See Exercise
4.10, for example.) This happens because
$$\liminf \frac{a_{n+1}}{a_n} \le \liminf a_n^{1/n} \le \limsup a_n^{1/n} \le \limsup \frac{a_{n+1}}{a_n}. \tag{31}$$
To see this, note the middle inequality is always true. To prove the right-hand
inequality, choose $r > \limsup a_{n+1}/a_n$. It suffices to show $\limsup a_n^{1/n} \le r$. As in
the proof of the ratio test, $a_{n+k} < r^k a_n$. This implies
$$a_{n+k} < r^{n+k}\,\frac{a_n}{r^n},$$
which leads to
$$a_{n+k}^{1/(n+k)} < r\left(\frac{a_n}{r^n}\right)^{1/(n+k)}.$$
Finally,
$$\limsup a_n^{1/n} = \limsup_{k\to\infty} a_{n+k}^{1/(n+k)} \le \limsup_{k\to\infty} r\left(\frac{a_n}{r^n}\right)^{1/(n+k)} = r.$$
The left-hand inequality is proved similarly.
Most times the simple tests of the preceding section suffice. However, more
difficult series require more delicate tests. There are dozens of other, more
specialized, convergence tests. Several of them are consequences of the following
theorem.
THEOREM 4.13 (Kummer’s Test). Suppose $\sum a_n$ is a positive series, $p_n$ is a
sequence of positive numbers and
$$\alpha = \liminf\left(p_n\frac{a_n}{a_{n+1}} - p_{n+1}\right) \le \limsup\left(p_n\frac{a_n}{a_{n+1}} - p_{n+1}\right) = \beta. \tag{32}$$
If $\alpha > 0$, then $\sum a_n$ converges. If $\sum 1/p_n$ diverges and $\beta < 0$, then $\sum a_n$ diverges.
PROOF. Let $s_n = \sum_{k=1}^{n} a_k$, suppose $\alpha > 0$ and choose $r \in (0, \alpha)$. There must be
$$p_n\frac{a_n}{a_{n+1}} - p_{n+1} < 0, \quad \forall n \ge N.$$
This implies
$$p_n a_n < p_{n+1} a_{n+1}, \quad \forall n \ge N.$$
Therefore, $p_n a_n > p_N a_N$ whenever $n > N$ and
$$a_n > p_N a_N \frac{1}{p_n}, \quad \forall n \ge N.$$
Because $N$ is fixed and $\sum 1/p_n$ diverges, the Comparison Test shows $\sum a_n$ diverges.
Kummer’s test is powerful. In fact, it can be shown that, given any positive
series, a judicious choice of the sequence p n can always be made to determine
whether it converges. (See Exercise 4.17, [19] and [18].) But, as stated, Kummer’s
test is not very useful because choosing p n for a given series is often difficult.
Experience has led to some standard choices that work with large classes of series.
For example, Exercise 4.9 asks you to prove the choice p n = 1 for all n reduces
Kummer’s test to the standard ratio test. Other useful choices are shown in the
following theorems.
THEOREM 4.14 (Raabe’s Test). Let $\sum a_n$ be a positive series such that $a_n > 0$ for
all $n$. Define
$$\alpha = \limsup_{n\to\infty} n\left(\frac{a_n}{a_{n+1}} - 1\right) \ge \liminf_{n\to\infty} n\left(\frac{a_n}{a_{n+1}} - 1\right) = \beta.$$
If $\beta > 1$, then $\sum a_n$ converges. If $\alpha < 1$, then $\sum a_n$ diverges.
2See §5.2.
series. Since the harmonic series diverges, we see the alternating harmonic series
is not absolutely convergent.
On the other hand, if $s_n = \sum_{k=1}^{n} (-1)^{k+1}/k$, then
$$s_{2n} = \sum_{k=1}^{n}\left(\frac{1}{2k-1} - \frac{1}{2k}\right) = \sum_{k=1}^{n}\frac{1}{2k(2k-1)}$$
is a positive series that converges. Since $|s_{2n} - s_{2n-1}| = 1/2n \to 0$, it’s clear
that $s_{2n-1}$ must also converge to the same limit. Therefore, $s_n$ converges and
$\sum (-1)^{n+1}/n$ is conditionally convergent. (See Figure 4.3.)
FIGURE 4.3. This plot shows the first 35 partial sums of the alternating
harmonic series. It converges to ln 2 ≈ 0.6931, which is the level of the
dashed line. Notice how the odd partial sums decrease to ln 2 and the
even partial sums increase to ln 2.
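The behavior described in the caption can be reproduced with a few lines of code. The following sketch computes the first 35 partial sums in floating point and separates the odd-indexed from the even-indexed ones; the function name `partial_sums` is an illustrative choice.

```python
import math

def partial_sums(n: int) -> list[float]:
    """Return [s_1, ..., s_n] for the alternating harmonic series."""
    sums = []
    s = 0.0
    for k in range(1, n + 1):
        s += (-1) ** (k + 1) / k
        sums.append(s)
    return sums

sums = partial_sums(35)
odd = sums[0::2]    # s_1, s_3, s_5, ...
even = sums[1::2]   # s_2, s_4, s_6, ...

print(odd[:3], even[:3], math.log(2))
```

The odd partial sums lie above ln 2 and the even ones below it, so each partial sum overshoots the limit in alternating directions.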
(b) b n ≥ b n+1 , ∀n ∈ N
(c) b n → 0
Then $\sum a_n b_n$ converges.
$$\sum_{k=1}^{n} a_k b_k = b_{n+1}\sum_{k=1}^{n} a_k - \sum_{k=1}^{n}(b_{k+1} - b_k)\sum_{\ell=1}^{k} a_\ell$$
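PROOF. Let $s_n = \sum_{k=1}^{n} a_k$ and $s_0 = 0$. Then
$$\begin{aligned}
\sum_{k=1}^{n} a_k b_k &= \sum_{k=1}^{n}(s_k - s_{k-1})b_k \\
&= \sum_{k=1}^{n} s_k b_k - \sum_{k=1}^{n} s_{k-1} b_k \\
&= \sum_{k=1}^{n} s_k b_k - \left(\sum_{k=1}^{n} s_k b_{k+1} - s_n b_{n+1}\right) \\
&= b_{n+1}\sum_{k=1}^{n} a_k - \sum_{k=1}^{n}(b_{k+1} - b_k)\sum_{\ell=1}^{k} a_\ell
\end{aligned}$$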
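The summation-by-parts identity is purely algebraic, so it can be checked on arbitrary finite lists. In the sketch below, the list `a` holds $a_1, \ldots, a_n$ and the list `b` holds $b_1, \ldots, b_{n+1}$ (Python lists are indexed from 0); the function names and the sample data are made up for illustration.

```python
def lhs(a: list[int], b: list[int]) -> int:
    """Left side: sum of a_k * b_k for k = 1, ..., n."""
    return sum(a[k] * b[k] for k in range(len(a)))

def rhs(a: list[int], b: list[int]) -> int:
    """Right side: b_{n+1} * s_n - sum of (b_{k+1} - b_k) * s_k."""
    n = len(a)
    # sum(a[:k + 1]) is the partial sum s_{k+1} in the 1-based notation.
    correction = sum((b[k + 1] - b[k]) * sum(a[: k + 1]) for k in range(n))
    return b[n] * sum(a) - correction

a = [3, -1, 4, 1, -5]       # a_1, ..., a_5
b = [2, 7, 1, 8, 2, 8]      # b_1, ..., b_6 (one more entry than a)
print(lhs(a, b), rhs(a, b))
```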
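PROOF. To prove the theorem, suppose $\left|\sum_{k=1}^{n} a_k\right| < M$ for all $n \in \mathbb{N}$. Let $\varepsilon > 0$
and choose $N \in \mathbb{N}$ such that $b_N < \varepsilon/(2M)$. If $N \le m < n$, use Lemma 4.19 to write
$$\begin{aligned}
\left|\sum_{\ell=m}^{n} a_\ell b_\ell\right| &= \left|\sum_{\ell=1}^{n} a_\ell b_\ell - \sum_{\ell=1}^{m-1} a_\ell b_\ell\right| \\
&= \left|\, b_{n+1}\sum_{\ell=1}^{n} a_\ell - \sum_{\ell=1}^{n}(b_{\ell+1} - b_\ell)\sum_{k=1}^{\ell} a_k - \left(b_m\sum_{\ell=1}^{m-1} a_\ell - \sum_{\ell=1}^{m-1}(b_{\ell+1} - b_\ell)\sum_{k=1}^{\ell} a_k\right)\right| \\
&\le (b_{n+1} + b_m)M + M\sum_{\ell=m}^{n}(b_\ell - b_{\ell+1}) \\
&= (b_{n+1} + b_m)M + M(b_m - b_{n+1}) \\
&= 2Mb_m \\
&\le 2Mb_N < 2M\,\frac{\varepsilon}{2M} = \varepsilon.
\end{aligned}$$
This shows $\sum_{\ell=1}^{n} a_\ell b_\ell$ satisfies Theorem 4.4, and therefore converges.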
There’s one special case of this theorem that’s most often seen in calculus
texts.
$c_{n+1}$.
PROOF. Let $a_n = (-1)^{n+1}$ and $b_n = c_n$ in Theorem 4.18 to see the series converges
to some number $s$. For $n \in \mathbb{N}$, let $s_n = \sum_{k=0}^{n} (-1)^{k+1} c_k$ and $s_0 = 0$. Since
E XAMPLE 4.13. Corollary 4.20 provides another way to prove the alternating
harmonic series in Example 4.12 converges. Figures 4.3 and 4.4 show how the
partial sums bounce up and down across the sum of the series.
4. Rearrangements of Series
We want to use our standard intuition about adding lists of numbers when
working with series. But, this intuition has been formed by working with finite
sums and does not always work with series.
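EXAMPLE 4.14. Suppose $\sum (-1)^{n+1}/n = \gamma$ so that $\sum (-1)^{n+1}\,2/n = 2\gamma$. It’s easy
$$\begin{aligned}
2\gamma &= \sum (-1)^{n+1}\frac{2}{n} \\
&= 2 - 1 + \frac{2}{3} - \frac{1}{2} + \frac{2}{5} - \frac{1}{3} + \cdots
\end{aligned}$$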
PROOF. (Theorem 4.23) Let $b_n$ and $c_n$ be as in Lemma 4.24 and define the
subsequence $a_n^+$ of $b_n$ by removing those terms for which $b_n = 0$ and $a_n \ne 0$.
Define the subsequence $a_n^-$ of $c_n$ by removing those terms for which $c_n = 0$. The
series $\sum_{n=1}^{\infty} a_n^+$ and $\sum_{n=1}^{\infty} a_n^-$ are still divergent because only terms equal to zero
have been removed from $b_n$ and $c_n$.
Now, let $c \in \mathbb{R}$ and $m_0 = n_0 = 0$. According to Lemma 4.24, we can define the
natural numbers
$$m_1 = \min\Big\{n : \sum_{k=1}^{n} a_k^+ > c\Big\} \quad\text{and}\quad n_1 = \min\Big\{n : \sum_{k=1}^{m_1} a_k^+ + \sum_{\ell=1}^{n} a_\ell^- < c\Big\}.$$
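If $m_p$ and $n_p$ have been chosen for some $p \in \mathbb{N}$, then define
$$m_{p+1} = \min\Bigg\{n : \sum_{k=0}^{p}\Bigg(\sum_{\ell=m_k+1}^{m_{k+1}} a_\ell^+ - \sum_{\ell=n_k+1}^{n_{k+1}} a_\ell^-\Bigg) + \sum_{\ell=m_p+1}^{n} a_\ell^+ > c\Bigg\}$$
and
$$n_{p+1} = \min\Bigg\{n : \sum_{k=0}^{p}\Bigg(\sum_{\ell=m_k+1}^{m_{k+1}} a_\ell^+ - \sum_{\ell=n_k+1}^{n_{k+1}} a_\ell^-\Bigg) + \sum_{\ell=m_p+1}^{m_{p+1}} a_\ell^+ - \sum_{\ell=n_p+1}^{n} a_\ell^- < c\Bigg\}.$$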
and
$$0 < c - \sum_{k=0}^{p}\Bigg(\sum_{\ell=m_k+1}^{m_{k+1}} a_\ell^+ - \sum_{\ell=n_k+1}^{n_{k+1}} a_\ell^-\Bigg) \le a_{n_p}^-.$$
Since both $a_{m_p}^+ \to 0$ and $a_{n_p}^- \to 0$, the result follows from the Squeeze Theorem.
The argument when c is infinite is left as Exercise 4.31.
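The idea behind the proof can be tried on the alternating harmonic series. The sketch below is a simplified, term-by-term variant of the construction rather than the block construction with the indices $m_p$ and $n_p$: it adds the next unused positive term $1/(2j-1)$ while the running total is at most the target, and otherwise subtracts the next unused term $1/(2j)$. The function name, the targets and the number of terms are illustrative assumptions.

```python
def rearranged_partial_sum(c: float, num_terms: int) -> float:
    """Partial sum of a rearrangement of 1 - 1/2 + 1/3 - ... aimed at c."""
    total = 0.0
    next_odd = 1    # denominator of the next unused positive term
    next_even = 2   # denominator of the next unused negative term
    for _ in range(num_terms):
        if total <= c:
            total += 1.0 / next_odd    # climb toward c with a positive term
            next_odd += 2
        else:
            total -= 1.0 / next_even   # drop back toward c with a negative term
            next_even += 2
    return total

for target in (1.5, 0.0, -1.0):
    print(target, rearranged_partial_sum(target, 100_000))
```

Every term of the original series is eventually used exactly once, and the overshoot at each crossing is at most the size of the term just used, which tends to zero.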
5. Exercises
4.3. The series $\sum_{n=1}^{\infty}(a_n - a_{n+1})$ converges iff the sequence $a_n$ converges.
4.4. Prove or give a counter example: If $\sum |a_n|$ converges, then $na_n \to 0$.
4.6. If $\sum_{n=1}^{\infty} a_n$ converges and $b_n$ is a bounded monotonic sequence, then
$\sum_{n=1}^{\infty} a_n b_n$ converges.
4.7. Let $x_n$ be a sequence with range $\{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$. Prove that $\sum_{n=1}^{\infty} x_n 10^{-n}$
converges.
4.9. Prove the ratio test by setting p n = 1 for all n in Kummer’s test.
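4.12. Does
$$\frac{1}{3} + \frac{1\times 2}{3\times 5} + \frac{1\times 2\times 3}{3\times 5\times 7} + \frac{1\times 2\times 3\times 4}{3\times 5\times 7\times 9} + \cdots$$
converge?
(c) $\sum_{n=1}^{\infty} a_n b_n$ diverges.
4.18. Prove that $\sum_{n=0}^{\infty} \dfrac{x^n}{n!}$ converges for all $x \in \mathbb{R}$.
4.19. Find all values of $x$ for which $\sum_{k=0}^{\infty} k^2(x+3)^k$ converges.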
converge?
4.21. For what values of $x$ does $\sum_{n=1}^{\infty} \dfrac{(x+3)^n}{n4^n}$ converge absolutely, converge
conditionally or diverge?
4.22. For what values of $x$ does $\sum_{n=1}^{\infty} \dfrac{n+6}{n^2(x-1)^n}$ converge absolutely, converge
conditionally or diverge?
4.23. For what positive values of $\alpha$ does $\sum_{n=1}^{\infty} \alpha^n n^\alpha$ converge?
4.24. Prove that $\sum \cos\dfrac{n\pi}{3}\,\sin\dfrac{\pi}{n}$ converges.
4.25. For a series $\sum_{k=1}^{\infty} a_k$ with partial sums $s_n$, define
$$\sigma_n = \frac{1}{n}\sum_{k=1}^{n} s_k.$$
4.27. If $\sum_{n=1}^{\infty} a_n$ is a convergent positive series, then so is $\sum_{n=1}^{\infty} a_n^2$. Give an
example to show the converse is not true.
4.28. Prove or give a counter example: If $\sum_{n=1}^{\infty} a_n$ converges, then $\sum_{n=1}^{\infty} a_n^2$
converges.
4.29. For what positive values of $\alpha$ does $\sum_{n=1}^{\infty} \alpha^n n^\alpha$ converge?
4.30. If $a_n \ge 0$ for all $n \in \mathbb{N}$ and there is a $p > 1$ such that $\lim_{n\to\infty} n^p a_n$ exists and
is finite, then $\sum_{n=1}^{\infty} a_n$ converges. Is this true for $p = 1$?
The Topology of R
The idea is that about every point of an open set, there is some room inside
the set on both sides of the point. It is easy to see that any open interval (a, b) is
an open set because if a < x < b and ε = min{x −a, b −x}, then (x −ε, x +ε) ⊂ (a, b).
Similarly, it’s not difficult to show R is an open set.
On the other hand, any closed interval $[a, b]$ is a closed set. To see this, it
must be shown its complement is open. Let $x \in [a, b]^c$ and $\varepsilon = \min\{|x - a|, |x - b|\}$.
Then $(x - \varepsilon, x + \varepsilon) \cap [a, b] = \emptyset$, so $(x - \varepsilon, x + \varepsilon) \subset [a, b]^c$. Therefore, $[a, b]^c$ is open,
and its complement, namely $[a, b]$, is closed.
A singleton set $\{a\}$ is closed. To see this, suppose $x \ne a$ and $\varepsilon = |x - a|$. Then
$a \notin (x - \varepsilon, x + \varepsilon)$, and $\{a\}^c$ must be open. The definition of a closed set implies $\{a\}$
is closed.
Open and closed sets can get much more complicated than the intervals
examined above. For example, similar arguments show $\mathbb{Z}$ is a closed set and $\mathbb{Z}^c$ is
open. Both have an infinite number of disjoint pieces.
A common mistake is to assume all sets are either open or closed. Most sets
are neither open nor closed. For example, if $S = [a, b)$ for some numbers $a < b$,
then no matter the size of $\varepsilon > 0$, neither $(a - \varepsilon, a + \varepsilon)$ nor $(b - \varepsilon, b + \varepsilon)$ is contained
in $S$ or $S^c$.
open.
(c) Both $\emptyset$ and $\mathbb{R}$ are open.
closed.
(c) Both $\emptyset$ and $\mathbb{R}$ are closed.
Surprisingly, $\emptyset$ and $\mathbb{R}$ are both open and closed. They are the only subsets of
$\mathbb{R}$ with this dual personality.
1.1. Topological Spaces. The preceding theorem provides the starting point
for a fundamental area of mathematics called topology. The properties of the
open sets of R motivate the following definition.
1This use of the term limit point is not universal. Some authors use the term accumulation
point. Others use condensation point, although this is more often used for those cases when every
neighborhood of x 0 intersects S in an uncountable set.
Notice that limit points of $S$ need not be elements of $S$, but isolated points
of $S$ must be elements of $S$. In a sense, limit points and isolated points are at
opposite extremes. The definitions can be restated as follows:
$x_0$ is a limit point of $S$ iff $\forall \varepsilon > 0\ \big(S \cap (x_0 - \varepsilon, x_0 + \varepsilon) \setminus \{x_0\} \ne \emptyset\big)$
$x_0 \in S$ is an isolated point of $S$ iff $\exists \varepsilon > 0\ \big(S \cap (x_0 - \varepsilon, x_0 + \varepsilon) \setminus \{x_0\} = \emptyset\big)$
EXAMPLE 5.1. If $S = (0, 1]$, then $S' = [0, 1]$ and $S$ has no isolated points.
EXAMPLE 5.2. If $T = \{1/n : n \in \mathbb{Z} \setminus \{0\}\}$, then $T' = \{0\}$ and all points of $T$ are
isolated points of $T$.
PROOF. For the purposes of this proof, if $I = [a, b]$ is a closed interval, let
$I^L = [a, (a + b)/2]$ be the closed left half of $I$ and $I^R = [(a + b)/2, b]$ be the closed
right half of $I$.
Suppose $S$ is a bounded and infinite set. The assumption that $S$ is bounded
implies the existence of an interval $I_1 = [-B, B]$ containing $S$. Since $S$ is infinite,
at least one of the two sets $I_1^L \cap S$ or $I_1^R \cap S$ is infinite. Let $I_2$ be either $I_1^L$ or $I_1^R$ such
that $I_2 \cap S$ is infinite.
If $I_n$ is such that $I_n \cap S$ is infinite, let $I_{n+1}$ be either $I_n^L$ or $I_n^R$, where $I_{n+1} \cap S$ is
infinite.
In this way, a nested sequence of intervals, $I_n$ for $n \in \mathbb{N}$, is defined such that
$I_n \cap S$ is infinite for all $n \in \mathbb{N}$ and the length of $I_n$ is $B/2^{n-2} \to 0$. According to the
Nested Interval Theorem, there is an $x_0 \in \mathbb{R}$ such that $\bigcap_{n\in\mathbb{N}} I_n = \{x_0\}$.
2.2. Connected Sets. One place where the relative topologies are useful is
in relation to the following definition.
EXAMPLE 5.6. If $S = [-1, 0) \cup (0, 1]$, then $U = (-2, 0)$ and $V = (0, 2)$ are open
sets such that $U \cap V = \emptyset$, $U \cap S \ne \emptyset$, $V \cap S \ne \emptyset$ and $S \subset U \cup V$. This shows $S$ is
disconnected.
EXAMPLE 5.7. The sets $U = (-\infty, \sqrt{2})$ and $V = (\sqrt{2}, \infty)$ are open sets such
that $U \cap V = \emptyset$, $U \cap \mathbb{Q} \ne \emptyset$, $V \cap \mathbb{Q} \ne \emptyset$ and $\mathbb{Q} \subset U \cup V = \mathbb{R} \setminus \{\sqrt{2}\}$. This shows $\mathbb{Q}$ is
disconnected. In fact, the only connected subsets of $\mathbb{Q}$ are single points. Sets with
this property are often called totally disconnected.
If w ∈ U , then u < w < v and there is ε ∈ (0, v − w) such that (w −ε, w +ε) ⊂ U
and [u, w + ε) ⊂ S because w + ε < v. This clearly contradicts the definition of w,
so w ∉ U .
If w ∈ V , then there is an ε > 0 such that (w − ε, w] ⊂ V . In particular, this
shows w = lub A ≤ w − ε < w. This contradiction forces the conclusion that
w ∉V.
Now, putting all this together, we see w ∈ S ⊂ U ∪ V and w ∉ U ∪ V . This is a
clear contradiction, so we’re forced to conclude there is no separation of S.
and cannot cover the unbounded set K . This shows K cannot be compact, and
every compact set must be bounded.
Suppose $K$ is not closed. According to Theorem 5.9, there is a limit point $x$ of
$K$ such that $x \notin K$. Define $\mathcal{O} = \{[x - 1/n, x + 1/n]^c : n \in \mathbb{N}\}$. Then $\mathcal{O}$ is a collection
of open sets and $K \subset \bigcup_{G\in\mathcal{O}} G = \mathbb{R} \setminus \{x\}$. Let $\mathcal{O}' = \{[x - 1/n_i, x + 1/n_i]^c : 1 \le i \le N\}$
from O .
The following theorem shows that a nowhere dense set is small in the sense
mentioned above because it fails to be dense in any part of B .
The nowhere dense sets are topologically small in the following sense.
Theorem 5.29 is called the Baire category theorem because of the terminology
introduced by René-Louis Baire in 1899.³ He said a set was of the first category, if
it could be written as a countable union of nowhere dense sets. An easy example
of such a set is any countable set, which is a countable union of singletons. All
other sets are of the second category.⁴ Theorem 5.29 can be stated as “Any open
interval is of the second category.” Or, more generally, as “Any nonempty open
set is of the second category.”
3 René-Louis Baire (1874-1932) was a French mathematician. He proved the Baire category
theorem in his 1899 doctoral dissertation.
A set is called a Gδ set, if it is the countable intersection of open sets. It is
called an Fσ set, if it is the countable union of closed sets. DeMorgan’s laws show
that the complement of an Fσ set is a Gδ set and vice versa. It’s evident that any
countable subset of R is an Fσ set, so Q is an Fσ set.
On the other hand, suppose $\mathbb{Q}$ is a $G_\delta$ set. Then there is a sequence of open
sets $G_n$ such that $\mathbb{Q} = \bigcap_{n\in\mathbb{N}} G_n$. Since $\mathbb{Q}$ is dense, each $G_n$ must be dense and
showing R is a first category set and violating the Baire category theorem. There-
fore, Q is not a Gδ set.
Essentially the same argument shows any countable subset of R is a first
category set. The following protracted example shows there are uncountable sets
of the first category.
$$\sum_{n=0}^{\infty}\frac{2^n}{3^{n+1}} = \frac{1}{3}\sum_{n=0}^{\infty}\left(\frac{2}{3}\right)^n = 1.$$
4 Baire did not define any categories other than these two. Some authors call first category
sets meager sets, so as not to make students fruitlessly wait for definitions of third, fourth and fifth
category sets.
5 Cantor’s original work [6] is reprinted with an English translation in Edgar’s Classics on
Fractals [10]. Cantor only mentions his eponymous set in passing and it had actually been presented
earlier by others.
FIGURE 5.1. Shown here are the first few steps in the construction of
the Cantor Middle-Thirds set.
pairwise disjoint closed intervals each of length $3^{-n}$. Their total length is $(2/3)^n$.
Given $\varepsilon > 0$, choose $n \in \mathbb{N}$ so $(2/3)^n < \varepsilon/2$. Each of the closed intervals comprising
$C_n$ can be placed inside a slightly longer open interval so the sums of the lengths
of the $2^n$ open intervals is less than $\varepsilon$.
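The interval counts and lengths used in this argument can be generated directly. The sketch below assumes the usual middle-thirds construction starting from $[0, 1]$, with the $n$th stage obtained by removing the open middle third of each interval of the previous stage; the function names are illustrative.

```python
from fractions import Fraction

Interval = tuple[Fraction, Fraction]

def remove_middle_thirds(intervals: list[Interval]) -> list[Interval]:
    """Replace each closed interval [a, b] by its closed left and right thirds."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))
        result.append((b - third, b))
    return result

def cantor_stage(n: int) -> list[Interval]:
    """Return the closed intervals making up the nth stage, starting from [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = remove_middle_thirds(intervals)
    return intervals

stage = cantor_stage(5)
total_length = sum(b - a for a, b in stage)
print(len(stage), stage[0], total_length)
```

At stage $n$ there are $2^n$ intervals of length $3^{-n}$ each, so the total length $(2/3)^n$ can be made smaller than any prescribed $\varepsilon/2$.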
5. Exercises
5.1. If G is an open set and F is a closed set, then G \ F is open and F \G is closed.
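5.7. A set $S \subset \mathbb{R}$ is open iff $\partial S \cap S = \emptyset$. ($\partial S$ is the set of boundary points of $S$.)
5.8. Find a sequence of open sets $G_n$ such that $\bigcap_{n\in\mathbb{N}} G_n$ is neither open nor
closed.
5.9. An open set $G$ is called regular if $G = (\overline{G})^\circ$. Find an open set that is not
regular.
5.10. Let $\mathcal{R} = \{(x, \infty) : x \in \mathbb{R}\}$ and $\mathcal{T} = \mathcal{R} \cup \{\mathbb{R}, \emptyset\}$. Prove that $(\mathbb{R}, \mathcal{T})$ is a topological
space. This is called the right ray topology on $\mathbb{R}$.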
5.16. Prove that the set of accumulation points of any sequence is closed.
5.17. Prove any closed set is the set of accumulation points for some sequence.
5.21. Prove the union of a finite number of compact sets is compact. Give an
example to show this need not be true for the union of an infinite number of
compact sets.
5.22. (a) Give an example of a set S such that S is disconnected, but S ∪ {1} is
connected. (b) Prove that 1 must be a limit point of S.
5.23. If $K$ is compact and $V$ is open with $K \subset V$, then there is an open set $U$ such
that $K \subset U \subset \overline{U} \subset V$.
5.27. Let $f : [a, b] \to \mathbb{R}$ be a function such that for every $x \in [a, b]$ there is a $\delta_x > 0$
such that $f$ is bounded on $(x - \delta_x, x + \delta_x)$. Prove $f$ is bounded.
Limits of Functions
1. Basic Definitions
DEFINITION 6.1. Let $D \subset \mathbb{R}$, $x_0$ be a limit point of $D$ and $f : D \to \mathbb{R}$. The
limit of $f(x)$ at $x_0$ is $L$, if for each $\varepsilon > 0$ there is a $\delta > 0$ such that when $x \in D$ with
$0 < |x - x_0| < \delta$, then $|f(x) - L| < \varepsilon$. When this is the case, we write $\lim_{x\to x_0} f(x) = L$.
FIGURE 6.1. This figure shows a way to think about the limit. The
graph of $f$ must not leave the top or bottom of the box $(x_0 - \delta, x_0 + \delta) \times
(L - \varepsilon, L + \varepsilon)$, except possibly the point $(x_0, f(x_0))$.
A useful way of rewording the definition is to say that $\lim_{x\to x_0} f(x) = L$ iff
for every $\varepsilon > 0$ there is a $\delta > 0$ such that $x \in (x_0 - \delta, x_0 + \delta) \cap D \setminus \{x_0\}$ implies
$f(x) \in (L - \varepsilon, L + \varepsilon)$. This can also be succinctly stated as
$$\forall \varepsilon > 0\ \exists \delta > 0\ \big( f((x_0 - \delta, x_0 + \delta) \cap D \setminus \{x_0\}) \subset (L - \varepsilon, L + \varepsilon) \big).$$
FIGURE 6.2. The function from Example 6.3. Note that the graph is a
line with one “hole” in it.
EXAMPLE 6.5. If $f(x) = |x|/x$ for $x \ne 0$, then $\lim_{x\to 0} f(x)$ does not exist. (See
Figure 6.3.) To see this, suppose $\lim_{x\to 0} f(x) = L$, $\varepsilon = 1$ and $\delta > 0$. If $L \ge 0$ and
$-\delta < x < 0$, then $f(x) = -1 \le L - \varepsilon$. If $L < 0$ and $0 < x < \delta$, then $f(x) = 1 > L + \varepsilon$.
These inequalities show for any $L$ and every $\delta > 0$, there is an $x$ with $0 < |x| < \delta$
such that $|f(x) - L| \ge \varepsilon$.
FIGURE 6.4. This is the function from Example 6.6. The graph shown
here is on the interval [0.03, 1]. There are an infinite number of oscillations
from −1 to 1 on any open interval containing the origin.
FIGURE 6.5. This is the function from Example 6.7. The bounding
lines $y = x$ and $y = -x$ are also shown. There are an infinite number of
oscillations between $-x$ and $x$ on any open interval containing the
origin.
In the same manner as Example 6.8, it can be shown for every rational function
$f(x)$, that $\lim_{x\to x_0} f(x) = f(x_0)$ whenever $f(x_0)$ exists.
2. Unilateral Limits
DEFINITION 6.5. Let $f : D \to \mathbb{R}$ and $x_0$ be a limit point of $(-\infty, x_0) \cap D$. $f$ has $L$
as its left-hand limit at $x_0$ if for all $\varepsilon > 0$ there is a $\delta > 0$ such that $f((x_0 - \delta, x_0) \cap D) \subset
(L - \varepsilon, L + \varepsilon)$. In this case, we write $\lim_{x\uparrow x_0} f(x) = L$.
Let $f : D \to \mathbb{R}$ and $x_0$ be a limit point of $D \cap (x_0, \infty)$. $f$ has $L$ as its right-hand
limit at $x_0$ if for all $\varepsilon > 0$ there is a $\delta > 0$ such that $f(D \cap (x_0, x_0 + \delta)) \subset (L - \varepsilon, L + \varepsilon)$.
In this case, we write $\lim_{x\downarrow x_0} f(x) = L$.²
These are called the unilateral or one-sided limits of f at x 0 . When they are
different, the graph of f is often said to have a “jump” at x 0 , as in the following
example.
EXAMPLE 6.9. As in Example 6.5, let $f(x) = |x|/x$. Then $\lim_{x\downarrow 0} f(x) = 1$ and
$\lim_{x\uparrow 0} f(x) = -1$. (See Figure 6.3.)
In parallel with Theorem 6.2, the one-sided limits can also be reformulated
in terms of sequences.
The proof of Theorem 6.6 is similar to that of Theorem 6.2 and is left to the
reader.
3. Continuity
D EFINITION 6.9. Let f : D → R and x 0 ∈ D. f is continuous at x 0 if for every ε >
0 there exists a δ > 0 such that when x ∈ D with |x − x 0 | < δ, then | f (x)− f (x 0 )| < ε.
The set of all points at which f is continuous is denoted C ( f ).
Several useful ways of rephrasing this are contained in the following theorem.
They are analogous to the similar statements made about limits. Proofs are left to
the reader.
T HEOREM 6.10. Let f : D → R and x 0 ∈ D. The following statements are
equivalent.
(a) x 0 ∈ C ( f ),
(b) For all ε > 0 there is a δ > 0 such that
x ∈ (x 0 − δ, x 0 + δ) ∩ D ⇒ f (x) ∈ ( f (x 0 ) − ε, f (x 0 ) + ε),
(c) For all ε > 0 there is a δ > 0 such that
f ((x 0 − δ, x 0 + δ) ∩ D) ⊂ ( f (x 0 ) − ε, f (x 0 ) + ε).
EXAMPLE 6.10. Define
$$f(x) = \begin{cases} \dfrac{2x^2 - 8}{x - 2}, & x \ne 2 \\ 8, & x = 2 \end{cases}.$$
It follows from Example 6.3 that $2 \in C(f)$.
There is a subtle difference between the treatment of the domain of the
function in the definitions of limit and continuity. In the definition of limit, the
“target point,” x 0 is required to be a limit point of the domain, but not actually
be an element of the domain. In the definition of continuity, x 0 must be in
the domain of the function, but does not have to be a limit point. To see a
consequence of this difference, consider the following example.
E XAMPLE 6.11. If f : Z → R is an arbitrary function, then C ( f ) = Z. To see this,
let n 0 ∈ Z, ε > 0 and δ = 1. If x ∈ Z with |x − n 0 | < δ, then x = n 0 . It follows that
| f (x) − f (n 0 )| = 0 < ε, so f is continuous at n 0 .
2 Calculus books often use the notation $\lim_{x\uparrow x_0} f(x) = \lim_{x\to x_0-} f(x)$ and $\lim_{x\downarrow x_0} f(x) =
\lim_{x\to x_0+} f(x)$.
On the other hand, suppose lim_{x→x_0} f(x) = f(x_0) and ε > 0. Choose δ according
to the definition of limit. When x ∈ (x_0 − δ, x_0 + δ) ∩ D \ {x_0}, then f(x) ∈
(f(x_0) − ε, f(x_0) + ε). When x = x_0, it is immediate that |f(x) − f(x_0)| =
|f(x_0) − f(x_0)| = 0 < ε. Therefore, when x ∈ (x_0 − δ, x_0 + δ) ∩ D, then f(x) ∈
(f(x_0) − ε, f(x_0) + ε), and x_0 ∈ C(f), as desired.
E XAMPLE 6.12. If f (x) = c, for some c ∈ R, then Example 6.1 and Theorem
6.11 show that f is continuous at every point.
E XAMPLE 6.13. If f (x) = x, then Example 6.2 and Theorem 6.11 show that f
is continuous at every point.
P ROOF. Combining Theorem 6.11 with Theorem 6.2 shows this to be true.
EXAMPLE 6.15 (Salt and Pepper Function). Since Q is a countable set, it can
be written as a sequence, Q = {q_n : n ∈ N}. Define
\[ f(x) = \begin{cases} 1/n, & x = q_n, \\ 0, & x \in \mathbb{Q}^c. \end{cases} \]
If x ∈ Q, then x = q_n for some n and f(x) = 1/n > 0. There is a sequence x_n
from Q^c such that x_n → x and f(x_n) = 0 ↛ f(x) = 1/n. Therefore C(f) ∩ Q = ∅.
On the other hand, let x ∈ Q^c and ε > 0. Choose N ∈ N large enough so that
1/N < ε and let δ = min{|x − q_n| : 1 ≤ n ≤ N}. If |x − y| < δ, there are two cases to
consider. If y ∈ Q^c, then |f(y) − f(x)| = |0 − 0| = 0 < ε. If y ∈ Q, then the choice of
δ guarantees y = q_n for some n > N. In this case, |f(y) − f(x)| = f(y) = f(q_n) =
1/n < 1/N < ε. Therefore, x ∈ C(f).
This shows that C(f) = Q^c.
It is a consequence of the Baire category theorem that there is no function f
such that C ( f ) = Q. Proving this would take us too far afield.
The following theorem is an almost immediate consequence of Theorem 6.4.
THEOREM 6.13. Let f : D → R and g : D → R. If x_0 ∈ C(f) ∩ C(g), then
(a) x_0 ∈ C(f + g),
(b) x_0 ∈ C(αf), ∀α ∈ R,
(c) x_0 ∈ C(fg), and
(d) x_0 ∈ C(f/g) when g(x_0) ≠ 0.
4. Unilateral Continuity
DEFINITION 6.16. Let f : D → R and x_0 ∈ D. f is left-continuous at x_0 if for
every ε > 0 there is a δ > 0 such that f((x_0 − δ, x_0] ∩ D) ⊂ (f(x_0) − ε, f(x_0) + ε).
Let f : D → R and x_0 ∈ D. f is right-continuous at x_0 if for every ε > 0 there is
a δ > 0 such that f([x_0, x_0 + δ) ∩ D) ⊂ (f(x_0) − ε, f(x_0) + ε).
This implies that for any two x 0 , y 0 ∈ (a, b)\C ( f ), there are disjoint open intervals,
I x0 = (limx↑x0 f (x), limx↓x0 f (x)) and I y 0 = (limx↑y 0 f (x), limx↓y 0 f (x)). For each
FIGURE 6.7. The function f(x) = (x² − 4)/(x − 2) from Example 6.19. Note that the graph is
a line with one “hole” in it. Plugging up the hole removes the discontinuity.
5. Continuous Functions
Up until now, continuity has been considered as a property of a function at a
point. There is much that can be said about functions continuous everywhere.
6. Uniform Continuity
Most of the ideas contained in this section will not be needed until we begin
developing the properties of the integral in Chapter 8.
DEFINITION 6.28. A function f : D → R is uniformly continuous if for all ε > 0
there is a δ > 0 such that when x, y ∈ D with |x − y| < δ, then |f(x) − f(y)| < ε.
The idea here is that in the ordinary definition of continuity, the δ in the
definition depends on both ε and the x at which continuity is being tested; i.e., δ
is really a function of both ε and x. With uniform continuity, δ only depends on
ε; i.e., δ is only a function of ε, and the same δ works across the whole domain.
T HEOREM 6.29. If f : D → R is uniformly continuous, then it is continuous.
EXAMPLE 6.22. Let f(x) = 1/x on D = (0, 1) and ε > 0. It’s clear that f is
continuous on D. Let δ > 0 and choose m, n ∈ N such that m > 1/δ and n − m > ε.
If x = 1/m and y = 1/n, then 0 < y < x < δ and f(y) − f(x) = n − m > ε. Therefore,
f is not uniformly continuous.
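The choice of m and n in Example 6.22 is entirely constructive, and it can be sketched in code. The helper name `witnesses` and the use of exact fractions are assumptions made for illustration; the extra requirement m ≥ 2 is added only so that x = 1/m lies strictly inside (0, 1).

```python
from fractions import Fraction
import math

def witnesses(delta, eps):
    """Return (x, y) in (0, 1) with |x - y| < delta but 1/y - 1/x > eps.

    Follows Example 6.22: pick m > 1/delta and n with n - m > eps,
    then set x = 1/m and y = 1/n.
    """
    m = max(math.floor(1 / delta) + 1, 2)   # m > 1/delta; m >= 2 keeps x inside (0, 1)
    n = m + math.floor(eps) + 1             # guarantees n - m > eps
    return Fraction(1, m), Fraction(1, n)

for delta in (Fraction(1, 2), Fraction(1, 100), Fraction(1, 10**6)):
    x, y = witnesses(delta, eps=1)
    # The points are closer than delta, yet f(y) - f(x) = n - m exceeds eps.
    print(delta, x, y, abs(x - y) < delta, 1 / y - 1 / x > 1)
```

No matter how small δ is made, the pair returned is closer together than δ while the values of f differ by more than ε, so no single δ can serve the whole domain.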
\[ \varepsilon \le |f(x_{n_k}) - f(y_{n_k})| = |f(x_{n_k}) - f(x_0) + f(x_0) - f(y_{n_k})| \le |f(x_{n_k}) - f(x_0)| + |f(x_0) - f(y_{n_k})| < \varepsilon/2 + \varepsilon/2 = \varepsilon, \]
which is a contradiction.
Therefore, f must be uniformly continuous.
7. Exercises
6.2. Give examples of functions f and g such that neither function has a limit at
a, but f + g does. Do the same for f g .
6.5. If lim_{x→a} f(x) = L > 0, then there is a δ > 0 such that f(x) > 0 when
0 < |x − a| < δ.
6.10. If f : R → R is monotone, then there is a countable set D such that the values
of f can be altered on D in such a way that the altered function is left-continuous
at every point of R.
6.12. If f : R → R and there is an α > 0 such that | f (x) − f (y)| ≤ α|x − y| for all
x, y ∈ R, then show that f is continuous.
6.16. Let f and g be two functions which are continuous on a set D ⊂ R. Prove
or give a counter example: {x ∈ D : f (x) > g (x)} is relatively open in D.
6.17. If f , g : R → R are functions such that f (x) = g (x) for all x ∈ Q and C ( f ) =
C (g ) = R, then f = g .
6.19. Find an example to show the conclusion of Problem 6.18 fails if I = (a, b).
6.20. If f and g are both continuous on [a, b], then {x : f (x) ≤ g (x)} is compact.
6.22. Suppose f : R → R is a function such that every interval has points at which
f is negative and points at which f is positive. Prove that every interval has points
where f is not continuous.
6.23. If f : [a, b] → R has a limit at every point, then f is bounded. Is this true for
f : (a, b) → R?
Differentiation
F IGURE 7.1. These graphs illustrate that the two standard ways of
writing the difference quotient are equivalent.
Theorem 7.2 and Example 7.3 show that differentiability is a strictly stronger
condition than continuity. For a long time most mathematicians believed that
every continuous function must certainly be differentiable at some point. In
the nineteenth century, several researchers, most notably Bolzano and Weier-
strass, presented examples of functions continuous everywhere and differentiable
nowhere.2 It has since been proved that, in a technical sense, the “typical” contin-
uous function is nowhere differentiable [4]. So, contrary to the impression left by
many beginning calculus courses, differentiability is the exception rather than
the rule, even for continuous functions.
2. Differentiation Rules
Following are the standard rules for differentiation learned in every beginning
calculus course.
2Bolzano presented his example in 1834, but it was little noticed. The 1872 example of
Weierstrass is more well-known [2]. A translation of Weierstrass’ original paper [20] is presented
by Edgar [10]. Weierstrass’ example is not very transparent because it depends on trigonometric
series. Many more elementary constructions have since been made. One such will be presented in
Example 9.9.
Combining Examples 7.1 and 7.2 with Theorem 7.3, the following theorem is
easy to prove.
C OROLLARY 7.4. A rational function is differentiable at every point of its do-
main.
THEOREM 7.5 (Chain Rule). If f and g are functions such that x_0 ∈ D(f) and
f(x_0) ∈ D(g), then x_0 ∈ D(g ∘ f) and (g ∘ f)′(x_0) = (g′ ∘ f)(x_0) f′(x_0).
PROOF. Let y_0 = f(x_0). By assumption, there is an open interval J containing
f(x_0) such that g is defined on J. Since J is open and x_0 ∈ C(f), there is an open
interval I containing x_0 such that f(I) ⊂ J.
Define h : J → R by
\[ h(y) = \begin{cases} \dfrac{g(y) - g(y_0)}{y - y_0} - g'(y_0), & y \ne y_0 \\ 0, & y = y_0. \end{cases} \]
1/f′(x_0).
PROOF. Let y_0 = f(x_0) and suppose y_n is any sequence in f([a, b]) \ {y_0} converging
to y_0 and x_n = f^{−1}(y_n). By Theorem 6.24, f^{−1} is continuous, so
\[ x_0 = f^{-1}(y_0) = \lim_{n\to\infty} f^{-1}(y_n) = \lim_{n\to\infty} x_n. \]
Therefore,
\[ \lim_{n\to\infty} \frac{f^{-1}(y_n) - f^{-1}(y_0)}{y_n - y_0} = \lim_{n\to\infty} \frac{x_n - x_0}{f(x_n) - f(x_0)} = \frac{1}{f'(x_0)}. \]
EXAMPLE 7.4. It follows easily from Theorem 7.3 that f(x) = x³ is differentiable
everywhere with f′(x) = 3x². Define g(x) = ∛x. Then g(x) = f^{−1}(x).
Suppose g(y_0) = x_0 for some y_0 ≠ 0, so that f′(x_0) ≠ 0. According to Theorem 7.6,
\[ g'(y_0) = \frac{1}{f'(x_0)} = \frac{1}{3x_0^2} = \frac{1}{3(g(y_0))^2} = \frac{1}{3(\sqrt[3]{y_0})^2} = \frac{1}{3y_0^{2/3}}. \]
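As a sanity check on the formula of Example 7.4, the value 1/(3y_0^{2/3}) can be compared with a symmetric difference quotient of the cube root. This is a numerical sketch only; the helper names and the step size h are illustrative assumptions, and the point y_0 = 0 is avoided because f′(0) = 0.

```python
def g(y):
    """Real cube root, defined for negative y as well."""
    return y ** (1 / 3) if y >= 0 else -((-y) ** (1 / 3))

def g_prime_formula(y0):
    """Derivative from Example 7.4: 1 / (3 * y0^(2/3)), valid for y0 != 0."""
    return 1 / (3 * g(y0) ** 2)

def central_difference(func, y0, h=1e-6):
    """Symmetric difference quotient approximating func'(y0)."""
    return (func(y0 + h) - func(y0 - h)) / (2 * h)

for y0 in (8.0, 1.0, -27.0):
    print(y0, g_prime_formula(y0), central_difference(g, y0))
```

The two columns agree to many digits at each sample point, as the inverse function theorem predicts.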
In the same manner as Example 7.4, the following corollary can be proved.
COROLLARY 7.7. Suppose q ∈ Q, f(x) = x^q and D is the domain of f. Then
f′(x) = q x^{q−1} on the set
\[ \begin{cases} D, & \text{when } q \ge 1 \\ D \setminus \{0\}, & \text{when } q < 1. \end{cases} \]
Theorem 7.9 is, of course, the basis for much of a beginning calculus course.
If f : [a, b] → R, then the extreme values of f occur at points of the set
C = {x ∈ (a, b) : f′(x) = 0} ∪ {x ∈ [a, b] : f′(x) does not exist}.
The elements of C are often called the critical points or critical numbers of f on
[a, b]. To find the maximum and minimum values of f on [a, b], it suffices to
find its maximum and minimum on the smaller set C , which is usually finite in
elementary calculus courses.
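The search over critical points can be sketched for a concrete function. Everything specific below is an illustrative assumption: the polynomial f(x) = x³ − 3x, the interval [−2, 3], and the hand-computed zeros of f′. The endpoints are included as candidates on the reading that the two-sided derivative does not exist there, which places them in the second set defining C.

```python
def f(x):
    return x**3 - 3 * x

a, b = -2.0, 3.0

# f'(x) = 3x^2 - 3 vanishes at x = -1 and x = 1, both inside (a, b).
interior_zeros = [-1.0, 1.0]

# Candidate set C: endpoints plus interior zeros of f'.
candidates = [a, b] + interior_zeros

values = {x: f(x) for x in candidates}
max_point = max(values, key=values.get)
min_value = min(values.values())

print("candidates and values:", values)
print("maximum", values[max_point], "at x =", max_point)
print("minimum", min_value)
```

Only four function evaluations are needed, in contrast to examining f on all of [a, b].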
4. Differentiable Functions
Differentiation becomes most useful when a function has a derivative at each
point of an interval.
Rolle’s Theorem is just a stepping-stone on the path to the Mean Value Theo-
rem. Two versions of the Mean Value Theorem follow. The first is a version more
general than the one given in most calculus courses. The second is the usual
version.4
PROOF. Let
\[ h(x) = (g(b) - g(a))(f(a) - f(x)) + (g(x) - g(a))(f(b) - f(a)). \]
Because of the assumptions on f and g, h is continuous on [a, b] and differentiable
on (a, b) with h(a) = h(b) = 0. Theorem 7.11 yields a c ∈ (a, b) such that
h′(c) = 0. Then
\[ 0 = h'(c) = -(g(b) - g(a)) f'(c) + g'(c)(f(b) - f(a)) \implies g'(c)(f(b) - f(a)) = f'(c)(g(b) - g(a)). \]
PROOF. Only the first assertion is proved because the proof of the second is
pretty much the same with all the inequalities reversed.
(⇒) If x, y ∈ (a, b) with x ≠ y, then the assumption that f is increasing gives
\[ \frac{f(y) - f(x)}{y - x} \ge 0 \implies f'(x) = \lim_{y \to x} \frac{f(y) - f(x)}{y - x} \ge 0. \]
(⇐) Let x, y ∈ (a, b) with x < y. According to Theorem 7.13, there is a c ∈ (x, y)
such that f(y) − f(x) = f′(c)(y − x) ≥ 0. This shows f(x) ≤ f(y), so f is increasing
on (a, b).
COROLLARY 7.16. Let f : (a, b) → R be a differentiable function. f is constant
iff f′(x) = 0 for all x ∈ (a, b).
It follows from Theorem 7.2 that every differentiable function is continuous.
But, it’s not true that a derivative need be continuous.
EXAMPLE 7.5. Let
\[ f(x) = \begin{cases} x^2 \sin\frac{1}{x}, & x \ne 0 \\ 0, & x = 0. \end{cases} \]
We claim f is differentiable everywhere, but f′ is not continuous.
To see this, first note that when x ≠ 0, the standard differentiation formulas
give that f′(x) = 2x sin(1/x) − cos(1/x). To calculate f′(0), choose any h ≠ 0. Then
\[ \left| \frac{f(h)}{h} \right| = \left| \frac{h^2 \sin(1/h)}{h} \right| \le \left| \frac{h^2}{h} \right| = |h| \]
and it easily follows from the definition of the derivative and the Squeeze Theorem
(Theorem 6.3) that f′(0) = 0.
Let x_n = 1/(2πn) for n ∈ N. Then x_n → 0 and
\[ f'(x_n) = 2x_n \sin(1/x_n) - \cos(1/x_n) = -1 \]
for all n. Therefore, f′(x_n) → −1 ≠ 0 = f′(0), and f′ is not continuous at 0.
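Both halves of Example 7.5 can be observed numerically. The sketch below is illustrative only (the function names and sample values are assumptions): it prints difference quotients at 0, which are bounded by |h|, and then evaluates the derivative formula along x_n = 1/(2πn), where it stays at −1 up to rounding error.

```python
import math

def f(x):
    return x * x * math.sin(1 / x) if x != 0 else 0.0

def f_prime(x):
    """Derivative from Example 7.5; the value at 0 comes from the squeeze argument."""
    return 2 * x * math.sin(1 / x) - math.cos(1 / x) if x != 0 else 0.0

# Difference quotients at 0 are bounded by |h|, so they tend to f'(0) = 0.
for h in (1e-1, 1e-3, 1e-5):
    print(f"h = {h:.0e}, |f(h)/h| = {abs(f(h) / h):.2e}")

# Along x_n = 1/(2*pi*n) the derivative stays near -1, far from f'(0) = 0.
for n in (1, 10, 100, 1000):
    x_n = 1 / (2 * math.pi * n)
    print(f"n = {n}, x_n = {x_n:.3e}, f'(x_n) = {f_prime(x_n):.12f}")
```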
But, derivatives do share one useful property with continuous functions; they
satisfy an intermediate value property. Compare the following theorem with
Corollary 6.26.
THEOREM 7.17 (Darboux’s Theorem). If f is differentiable on an open set
containing [a, b] and γ is between f′(a) and f′(b), then there is a c ∈ [a, b] such
that f′(c) = γ.
PROOF. If f′(a) = f′(b), then c = a satisfies the theorem. So, we may as well
assume f′(a) ≠ f′(b). There is no generality lost in assuming f′(a) < f′(b), for,
otherwise, we just replace f with g = −f.
and
\[ \frac{h(b) - h(b - \varepsilon)}{\varepsilon} > 0 \implies h(b - \varepsilon) < h(b). \]
(See Figure 7.3.) In light of these two inequalities and Theorem 6.23, there must
be a c ∈ (a, b) such that h(c) = glb{h(x) : x ∈ [a, b]}. Now Theorem 7.9 gives 0 =
h′(c) = f′(c) − γ, and the theorem follows.
Here’s an example showing a possible use of Theorem 7.17.
EXAMPLE 7.6. Let
\[ f(x) = \begin{cases} 0, & x \ne 0 \\ 1, & x = 0. \end{cases} \]
Theorem 7.17 implies f is not a derivative.
A more striking example is the following.
EXAMPLE 7.7. Define
\[ f(x) = \begin{cases} \sin\frac{1}{x}, & x \ne 0 \\ 1, & x = 0 \end{cases} \quad\text{and}\quad g(x) = \begin{cases} \sin\frac{1}{x}, & x \ne 0 \\ -1, & x = 0. \end{cases} \]
Since
\[ f(x) - g(x) = \begin{cases} 0, & x \ne 0 \\ 2, & x = 0 \end{cases} \]
does not have the intermediate value property, at least one of f or g is not a
derivative. (Actually, neither is a derivative because f(x) = −g(−x).)
is the fact that in many cases it is possible to determine how large n must be to
achieve a desired accuracy in the approximation of f ; i. e., the error term is the
important part.
THEOREM 7.18 (Taylor’s Theorem). If f is a function such that f, f′, . . . , f^{(n)}
are continuous on [a, b] and f^{(n+1)} exists on (a, b), then there is a c ∈ (a, b) such
that
\[ f(b) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(b - a)^k + \frac{f^{(n+1)}(c)}{(n+1)!}(b - a)^{n+1}. \]
PROOF. Let the constant α be defined by
\[ f(b) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(b - a)^k + \frac{\alpha}{(n+1)!}(b - a)^{n+1} \tag{59} \]
and define
\[ F(x) = f(b) - \left( \sum_{k=0}^{n} \frac{f^{(k)}(x)}{k!}(b - x)^k + \frac{\alpha}{(n+1)!}(b - x)^{n+1} \right). \]
From (59) we see that F(a) = 0. Direct substitution in the definition of F shows
that F(b) = 0. From the assumptions in the statement of the theorem, it is easy to
see that F is continuous on [a, b] and differentiable on (a, b). An application of
Rolle’s Theorem yields a c ∈ (a, b) such that
\[ 0 = F'(c) = -\left( \frac{f^{(n+1)}(c)}{n!}(b - c)^n - \frac{\alpha}{n!}(b - c)^n \right) \implies \alpha = f^{(n+1)}(c), \]
as desired.
Now, suppose f is defined on an open interval I with a, x ∈ I. If f is n + 1
times differentiable on I, then Theorem 7.18 implies there is a c between a and x
such that
\[ f(x) = p_n(x) + R_f(n, x, a), \]
where
\[ R_f(n, x, a) = \frac{f^{(n+1)}(c)}{(n+1)!}(x - a)^{n+1} \]
is the error in the approximation.6
EXAMPLE 7.8. Let f(x) = cos x. Suppose we want to approximate f(2) to 5
decimal places of accuracy. Since it’s an easy point to work with, we’ll choose
a = 0. Then, for some c ∈ (0, 2),
\[ |R_f(n, 2, 0)| = \frac{|f^{(n+1)}(c)|}{(n+1)!} 2^{n+1} \le \frac{2^{n+1}}{(n+1)!}. \tag{60} \]
A bit of experimentation with a calculator shows that n = 12 is the smallest n such
that the right-hand side of (60) is less than 5 × 10^{−6}. After doing some arithmetic,
it follows that
\[ p_{12}(2) = 1 - \frac{2^2}{2!} + \frac{2^4}{4!} - \frac{2^6}{6!} + \frac{2^8}{8!} - \frac{2^{10}}{10!} + \frac{2^{12}}{12!} = -\frac{27809}{66825} \approx -0.41614 \]
is a 5 decimal place approximation to cos(2).
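The “experimentation with a calculator” in Example 7.8 is easy to reproduce. The sketch below, with hypothetical helper names, uses exact rational arithmetic to search for the smallest n satisfying the bound in (60) and then evaluates the Taylor polynomial at 2; it relies only on the fact that every derivative of cosine is bounded by 1.

```python
from fractions import Fraction
from math import cos, factorial

def remainder_bound(n, x=2):
    """Right-hand side of (60): x^(n+1) / (n+1)!, using |f^(n+1)(c)| <= 1 for cosine."""
    return Fraction(x) ** (n + 1) / factorial(n + 1)

def taylor_cos(n, x=2):
    """Taylor polynomial of cos centered at 0, of degree at most n, evaluated exactly at x."""
    x = Fraction(x)
    return sum((-1) ** j * x ** (2 * j) / factorial(2 * j) for j in range(n // 2 + 1))

tolerance = Fraction(5, 10**6)

# Search upward for the first n whose remainder bound drops below the tolerance.
n = 0
while remainder_bound(n) >= tolerance:
    n += 1

p = taylor_cos(n)
print("smallest n:", n)
print("p_n(2) =", p, "which is approximately", float(p))
print("actual error:", abs(float(p) - cos(2)))
```

The search stops at n = 12 and the exact value of the polynomial is −27809/66825, in agreement with the example; the actual error is well inside the guaranteed bound.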
6There are several different formulas for the error. The one given here is sometimes called the
Lagrange form of the remainder. In Example 8.4 a form of the remainder using integration instead
of differentiation is derived.
FIGURE 7.4. Here are several of the Taylor polynomials for the function
cos(x), centered at a = 0, graphed along with cos(x). (Curves shown: n = 2, 4, 6, 8, 10 and 20.)
But, things don’t always work out the way we might like. Consider the follow-
ing example.
EXAMPLE 7.9. Suppose
\[ f(x) = \begin{cases} e^{-1/x^2}, & x \ne 0 \\ 0, & x = 0. \end{cases} \]
Figure 7.5 below has a graph of this function. In Example 7.11 below it is shown
that f is differentiable to all orders everywhere and f^{(n)}(0) = 0 for all n ≥ 0. With
this function the Taylor polynomial centered at 0 gives a useless approximation.
5.2. L’Hôpital’s Rules and Indeterminate Forms. According to Theorem
6.4,
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{\lim_{x \to a} f(x)}{\lim_{x \to a} g(x)} \]
whenever lim_{x→a} f(x) and lim_{x→a} g(x) both exist and lim_{x→a} g(x) ≠ 0. But, it
is easy to find examples where both lim_{x→a} f(x) = 0 and lim_{x→a} g(x) = 0 and
lim_{x→a} f(x)/g(x) exists, as well as similar examples where lim_{x→a} f(x)/g(x) fails
to exist. Because of this, such a limit problem is said to be in the indeterminate
form 0/0. The following theorem allows us to determine many such limits.
THEOREM 7.19 (Easy L’Hôpital’s Rule). Suppose f and g are each continuous
on [a, b], differentiable on (a, b) and f(b) = g(b) = 0. If g′(x) ≠ 0 on (a, b) and
lim_{x↑b} f′(x)/g′(x) = L, where L could be infinite, then lim_{x↑b} f(x)/g(x) = L.
PROOF. Let x ∈ [a, b), so f and g are continuous on [x, b] and differentiable
on (x, b). Cauchy’s Mean Value Theorem, Theorem 7.12, implies there is a c(x) ∈
(x, b) such that
\[ f'(c(x)) g(x) = g'(c(x)) f(x) \implies \frac{f(x)}{g(x)} = \frac{f'(c(x))}{g'(c(x))}. \]
Since x < c(x) < b, it follows that lim_{x↑b} c(x) = b. This shows that
\[ L = \lim_{x \uparrow b} \frac{f'(x)}{g'(x)} = \lim_{x \uparrow b} \frac{f'(c(x))}{g'(c(x))} = \lim_{x \uparrow b} \frac{f(x)}{g(x)}. \]
Several things should be noted about this proof. First, there is nothing special
about the left-hand limit used in the statement of the theorem. It could just as
easily be written in terms of the right-hand limit. Second, if lim_{x→a} f(x)/g(x) is
not of the indeterminate form 0/0, then applying L’Hôpital’s rule will usually give
a wrong answer. To see this, consider
\[ \lim_{x \to 0} \frac{x}{x + 1} = 0 \ne 1 = \lim_{x \to 0} \frac{1}{1}. \]
Another case where the indeterminate form 0/0 occurs is in the limit at
infinity. That L’Hôpital’s rule works in this case can easily be deduced from
Theorem 7.19.
COROLLARY 7.20. Suppose f and g are differentiable on (a, ∞) and
\[ \lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = 0. \]
If g′(x) ≠ 0 on (a, ∞) and lim_{x→∞} f′(x)/g′(x) = L, where L could be infinite, then
lim_{x→∞} f(x)/g(x) = L.
PROOF. There is no generality lost by assuming a > 0. Let
\[ F(x) = \begin{cases} f(1/x), & x \in (0, 1/a] \\ 0, & x = 0 \end{cases} \quad\text{and}\quad G(x) = \begin{cases} g(1/x), & x \in (0, 1/a] \\ 0, & x = 0. \end{cases} \]
Then
\[ \lim_{x \downarrow 0} F(x) = \lim_{x \to \infty} f(x) = 0 = \lim_{x \to \infty} g(x) = \lim_{x \downarrow 0} G(x), \]
so both F and G are continuous at 0. It follows that both F and G are continuous
on [0, 1/a] and differentiable on (0, 1/a) with G′(x) = −g′(1/x)/x² ≠ 0 on (0, 1/a)
and lim_{x↓0} F′(x)/G′(x) = lim_{x→∞} f′(x)/g′(x) = L. The rest follows from Theorem
7.19.
The other standard indeterminate form arises when
\[ \lim_{x \to \infty} f(x) = \infty = \lim_{x \to \infty} g(x). \]
This is called an ∞/∞ indeterminate form. It is often handled by the following
theorem.
THEOREM 7.21 (Hard L’Hôpital’s Rule). Suppose that f and g are differentiable
on (a, ∞) and g′(x) ≠ 0 on (a, ∞). If
\[ \lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = \infty \quad\text{and}\quad \lim_{x \to \infty} \frac{f'(x)}{g'(x)} = L \in \mathbb{R} \cup \{-\infty, \infty\}, \]
then
\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = L. \]
PROOF. First, suppose L ∈ R and let ε > 0. Choose a_1 > a large enough so that
\[ \left| \frac{f'(x)}{g'(x)} - L \right| < \varepsilon, \quad \forall x > a_1. \]
Since lim_{x→∞} f(x) = ∞ = lim_{x→∞} g(x), we can assume there is an a_2 > a_1 such
that both f(x) > 0 and g(x) > 0 when x > a_2. Finally, choose a_3 > a_2 such that
whenever x > a_3, then f(x) > f(a_2) and g(x) > g(a_2).
Let x > a_3 and apply Cauchy’s Mean Value Theorem, Theorem 7.12, to f and
g on [a_2, x] to find a c(x) ∈ (a_2, x) such that
\[ \frac{f'(c(x))}{g'(c(x))} = \frac{f(x) - f(a_2)}{g(x) - g(a_2)} = \frac{f(x)\left(1 - \frac{f(a_2)}{f(x)}\right)}{g(x)\left(1 - \frac{g(a_2)}{g(x)}\right)}. \tag{61} \]
If
\[ h(x) = \frac{1 - \frac{g(a_2)}{g(x)}}{1 - \frac{f(a_2)}{f(x)}}, \]
then (61) implies
\[ \frac{f(x)}{g(x)} = \frac{f'(c(x))}{g'(c(x))} h(x). \]
Since lim_{x→∞} h(x) = 1, there is an a_4 > a_3 such that whenever x > a_4, then
|h(x) − 1| < ε. If x > a_4, then
\[ \begin{aligned} \left| \frac{f(x)}{g(x)} - L \right| &= \left| \frac{f'(c(x))}{g'(c(x))} h(x) - L \right| \\ &= \left| \frac{f'(c(x))}{g'(c(x))} h(x) - L h(x) + L h(x) - L \right| \\ &\le \left| \frac{f'(c(x))}{g'(c(x))} - L \right| |h(x)| + |L| |h(x) - 1| \\ &< \varepsilon(1 + \varepsilon) + |L| \varepsilon = (1 + |L| + \varepsilon)\varepsilon \end{aligned} \]
can be made arbitrarily small through a proper choice of ε. Therefore
\[ \lim_{x \to \infty} f(x)/g(x) = L. \]
The case when L = ∞ is done similarly by first choosing a B > 0 and adjusting
(61) so that f′(x)/g′(x) > B when x > a_1. A similar adjustment is necessary when
L = −∞.
There is a companion corollary to Theorem 7.21 which is proved in the same
way as Corollary 7.20.
COROLLARY 7.22. Suppose that f and g are continuous on [a, b] and differentiable
on (a, b) with g′(x) ≠ 0 on (a, b). If
\[ \lim_{x \downarrow a} f(x) = \lim_{x \downarrow a} g(x) = \infty \quad\text{and}\quad \lim_{x \downarrow a} \frac{f'(x)}{g'(x)} = L \in \mathbb{R} \cup \{-\infty, \infty\}, \]
then
\[ \lim_{x \downarrow a} \frac{f(x)}{g(x)} = L. \]
FIGURE 7.5. This is a plot of f(x) = exp(−1/x²). Notice how the graph
flattens out near the origin.
EXAMPLE 7.10. If α > 0, then lim_{x→∞} ln x/x^α is of the indeterminate form
∞/∞. Taking derivatives of the numerator and denominator yields
\[ \lim_{x \to \infty} \frac{1/x}{\alpha x^{\alpha - 1}} = \lim_{x \to \infty} \frac{1}{\alpha x^{\alpha}} = 0. \]
Theorem 7.21 now implies lim_{x→∞} ln x/x^α = 0, and therefore ln x increases more
slowly than any positive power of x.
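The conclusion of Example 7.10 can be illustrated with a few sample values. This is a numerical sketch only; the exponents and sample points below are arbitrary illustrative choices, and small α requires very large x before the decay becomes visible.

```python
import math

def ratio(x, alpha):
    """ln(x) / x**alpha, the quotient from Example 7.10."""
    return math.log(x) / x**alpha

for alpha in (1.0, 0.5, 0.1):
    # Sample at x = 10^20, 10^40, 10^80, 10^160.
    samples = [ratio(10.0**k, alpha) for k in (20, 40, 80, 160)]
    print(alpha, ["%.3e" % s for s in samples])
```

For each α the samples decrease toward 0, although the decrease is much slower for α = 0.1 than for α = 1.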
EXAMPLE 7.11. Let f be as in Example 7.9. (See Figure 7.5.) It is clear f^{(n)}(x)
exists whenever n ∈ ω and x ≠ 0. We claim f^{(n)}(0) = 0. To see this, we first prove
that
\[ \lim_{x \to 0} \frac{e^{-1/x^2}}{x^n} = 0, \quad \forall n \in \mathbb{Z}. \tag{62} \]
When n ≤ 0, (62) is obvious. So, suppose (62) is true whenever m ≤ n for
some n ∈ ω. Making the substitution u = 1/x, we see
\[ \lim_{x \downarrow 0} \frac{e^{-1/x^2}}{x^{n+1}} = \lim_{u \to \infty} \frac{u^{n+1}}{e^{u^2}}. \tag{63} \]
Since
\[ \lim_{u \to \infty} \frac{(n+1)u^n}{2u e^{u^2}} = \lim_{u \to \infty} \frac{(n+1)u^{n-1}}{2 e^{u^2}} = \frac{n+1}{2} \lim_{x \downarrow 0} \frac{e^{-1/x^2}}{x^{n-1}} = 0 \]
by the inductive hypothesis, Theorem 7.21 gives (63) in the case of the right-hand
limit. The left-hand limit is handled similarly. Finally, (62) follows by induction.
When x ≠ 0, a bit of experimentation can convince the reader that f^{(n)}(x)
is of the form p_n(1/x)e^{−1/x²}, where p_n is a polynomial. Induction and repeated
applications of (62) establish that f^{(n)}(0) = 0 for n ∈ ω.
6. Exercises
7.1. If
\[ f(x) = \begin{cases} x^2, & x \in \mathbb{Q} \\ 0, & \text{otherwise}, \end{cases} \]
then show D(f) = {0} and find f′(0).
7.4. Let G be an open set and f ∈ D(G). If there is an a ∈ G such that lim_{x→a} f′(x)
exists, then lim_{x→a} f′(x) = f′(a).
7.7. If
\[ f_1(x) = \begin{cases} 1/2, & x = 0 \\ \sin(1/x), & x \ne 0 \end{cases} \]
and
\[ f_2(x) = \begin{cases} 1/2, & x = 0 \\ \sin(-1/x), & x \ne 0, \end{cases} \]
then at least one of f_1 and f_2 is not in ∆.
7.9. Suppose f is differentiable everywhere and f(x + y) = f(x) f(y) for all x, y ∈ R.
Show that f′(x) = f′(0) f(x) and determine the value of f′(0).
7.11. Use the definition of the derivative to find \frac{d}{dx}\sqrt{x}.
7.12. Let f be continuous on [0, ∞) and differentiable on (0, ∞). If f(0) = 0 and
|f′(x)| < |f(x)| for all x > 0, then f(x) = 0 for all x ≥ 0.
7.15. Let f be continuous on [a, b] and differentiable on (a, b). If f(a) = α and
|f′(x)| < β for all x ∈ (a, b), then calculate a bound for f(b).
7.17. Let G be an open set and f ∈ D(G). If there is an a ∈ G such that lim_{x→a} f′(x)
exists, then lim_{x→a} f′(x) = f′(a).
7.18. Prove or give a counter example: If f ∈ D((a, b)) such that f 0 is bounded,
then there is an F ∈ C ([a, b]) such that f = F on (a, b).
7.20. Suppose that I is an open interval and that f″(x) ≥ 0 for all x ∈ I. If a ∈ I,
then show that the part of the graph of f on I is never below the tangent line to
the graph at (a, f(a)).
\[ \lim_{h \to 0} \frac{f(x - h) - 2f(x) + f(x + h)}{h^2} = f''(x). \]
(b) Find a function f where this limit exists, but f″(x) does not exist.
7A function g is even if g (−x) = g (x) for every x and it is odd if g (−x) = −g (x) for every x. The
terms are even and odd because this is how g (x) = x n behaves when n is an even or odd integer,
respectively.
Integration
Contrary to the impression given by most calculus courses, there are many
ways to define integration. The one given here is called the Riemann integral or
the Riemann-Darboux integral, and it is the one most commonly presented to
calculus students.
1. Partitions
A partition of the interval [a, b] is a finite set P ⊂ [a, b] such that {a, b} ⊂ P . The
set of all partitions of [a, b] is denoted part ([a, b]). Basically, a partition should
be thought of as a way to divide an interval into a finite number of subintervals
by choosing some points where it is divided.
If P ∈ part([a, b]), then the elements of P can be ordered in a list as a = x_0 <
x_1 < · · · < x_n = b. The adjacent points of this partition determine n compact
intervals of the form I_k^P = [x_{k−1}, x_k], 1 ≤ k ≤ n. If the partition is clear from the
context, we write I_k instead of I_k^P. It’s clear that these intervals only intersect at
their common endpoints and there is no requirement they have the same length.
Since it’s inconvenient to always list each part of a partition, we’ll use the
partition of the previous paragraph as the generic partition. Unless it’s necessary
within the context to specify some other form for a partition, assume any partition
is the generic partition. (See Figure 1.)
If I is any interval, its length is written |I |. Using the notation of the previous
paragraph, it follows that
\[ \sum_{k=1}^{n} |I_k| = \sum_{k=1}^{n} (x_k - x_{k-1}) = x_n - x_0 = b - a. \]
The norm of a partition P is
\[ \|P\| = \max\{|I_k^P| : 1 \le k \le n\}. \]
In other words, the norm of P is just the length of the longest subinterval determined
by P. If |I_k| = ‖P‖ for every I_k, then P is called a regular partition.
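The bookkeeping for partitions translates directly into code. In the sketch below a partition is represented simply as a list of points; the function names are illustrative assumptions, not notation from the text.

```python
def subinterval_lengths(points):
    """Lengths |I_k| of the subintervals determined by a partition given as a set of points."""
    xs = sorted(set(points))
    return [right - left for left, right in zip(xs, xs[1:])]

def norm(points):
    """Norm of the partition: the length of its longest subinterval."""
    return max(subinterval_lengths(points))

def regular_partition(a, b, n):
    """Regular partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * k / n for k in range(n + 1)]

P = [0, 0.1, 0.5, 0.75, 1]
print(subinterval_lengths(P), sum(subinterval_lengths(P)), norm(P))
print(norm(regular_partition(0, 1, 4)))
```

Up to floating-point rounding, the lengths add up to b − a = 1 and the norm of P is 0.4, the length of the subinterval [0.1, 0.5].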
[Figure: a partition of [a, b] with points x_0 = a, x_1, . . . , x_5 = b and subintervals I_1, . . . , I_5.]
2. Riemann Sums
Let f : [a, b] → R and P ∈ part([a, b]). Choose x_k^* ∈ I_k for each k. The set
{x_k^* : 1 ≤ k ≤ n} is called a selection from P. The expression
\[ R(f, P, x_k^*) = \sum_{k=1}^{n} f(x_k^*) |I_k| \]
is the Riemann sum for f with respect to the partition P and selection x_k^*. The
Riemann sum is the usual first step toward integration in a calculus course and
can be visualized as the sum of the areas of rectangles with height f(x_k^*) and
width |I_k|, as long as the rectangles are allowed to have negative area when
f(x_k^*) < 0. (See Figure 8.2.)
Notice that given a particular function f and partition P, there can be
uncountably many different Riemann sums, depending on the selection x_k^*.
This sometimes makes working with Riemann sums quite complicated.
EXAMPLE 8.1. Suppose f : [a, b] → R is the constant function f(x) = c. If
P ∈ part([a, b]) and {x_k^* : 1 ≤ k ≤ n} is any selection from P, then
\[ R(f, P, x_k^*) = \sum_{k=1}^{n} f(x_k^*) |I_k| = c \sum_{k=1}^{n} |I_k| = c(b - a). \]
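A Riemann sum is straightforward to compute once a partition and a selection are fixed. The following sketch uses hypothetical names (`riemann_sum`, `P`, and the three selections) and checks Example 8.1 for the constant c = 2.5 on [1, 4], where every selection must give c(b − a) = 7.5.

```python
def riemann_sum(f, points, selection):
    """Riemann sum of f for the partition `points` and one selected point per subinterval."""
    xs = sorted(points)
    total = 0.0
    for left, right, x_star in zip(xs, xs[1:], selection):
        assert left <= x_star <= right, "each selected point must lie in its subinterval"
        total += f(x_star) * (right - left)
    return total

P = [1, 1.5, 2.25, 3, 4]
left_selection = P[:-1]                                  # left endpoints
right_selection = P[1:]                                  # right endpoints
midpoints = [(u + v) / 2 for u, v in zip(P, P[1:])]      # midpoints

c = 2.5
constant = lambda x: c
for sel in (left_selection, midpoints, right_selection):
    print(riemann_sum(constant, P, sel))   # each equals c * (b - a) = 7.5
```

For a non-constant function the three selections generally give three different sums, which is the source of the complications mentioned above.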
[Figure: the graph of y = f(x) over [a, b] with partition points x_0 = a, x_1, x_2, x_3, x_4 = b and selection points x_1^*, . . . , x_4^*.]
EXAMPLE 8.2. Suppose f(x) = x on [a, b]. Choose any P ∈ part([a, b]) where
‖P‖ < 2(b − a)/n. (Convince yourself this is always possible.1) Make two specific
selections l_k^* = x_{k−1} and r_k^* = x_k. If x_k^* is any other selection from P, then l_k^* ≤
x_k^* ≤ r_k^* and the fact that f is increasing on [a, b] gives
\[ R(f, P, l_k^*) \le R(f, P, x_k^*) \le R(f, P, r_k^*). \]
\[ |R(f) - R(f, P, x_k^*)| < \varepsilon \]
PROOF. Suppose R_1(f) and R_2(f) both satisfy the definition and ε > 0. For
i = 1, 2 choose δ_i > 0 so that whenever ‖P‖ < δ_i, then
\[ |R_i(f) - R(f, P, x_k^*)| < \varepsilon/2, \]
\[ |R_1(f) - R_2(f)| \le |R_1(f) - R(f, P, x_k^*)| + |R_2(f) - R(f, P, x_k^*)| < \varepsilon \]
and it follows R_1(f) = R_2(f).
3. Darboux Integration
As mentioned above, a difficulty with handling Riemann sums is that there are
so many different ways to choose partitions and selections, uncountably many, that
working with them is unwieldy. One way to resolve this problem was shown
in Example 8.2, where it was easy to find largest and smallest Riemann sums
associated with each partition. However, that’s not always a straightforward
calculation, so to use that idea, a little more care must be taken.
D EFINITION 8.4. Let f : [a, b] → R be bounded and P ∈ part ([a, b]). For each
I k determined by P , let
THEOREM 8.5. If f : [a, b] → R is bounded and P, Q ∈ part([a, b]) with P ≪ Q,
then
\[ \underline{D}(f, P) \le \underline{D}(f, Q) \le \overline{D}(f, Q) \le \overline{D}(f, P). \]
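Then
\[ m_{k_0} \le m_l \le M_l \le M_{k_0} \quad\text{and}\quad m_{k_0} \le m_r \le M_r \le M_{k_0} \]
so that
\[ \begin{aligned} m_{k_0} |I_{k_0}| &= m_{k_0} \left( |[x_{k_0 - 1}, x]| + |[x, x_{k_0}]| \right) \\ &\le m_l |[x_{k_0 - 1}, x]| + m_r |[x, x_{k_0}]| \\ &\le M_l |[x_{k_0 - 1}, x]| + M_r |[x, x_{k_0}]| \\ &\le M_{k_0} |[x_{k_0 - 1}, x]| + M_{k_0} |[x, x_{k_0}]| \\ &= M_{k_0} |I_{k_0}|. \end{aligned} \]
This implies
\[ \begin{aligned} \underline{D}(f, P) &= \sum_{k=1}^{n} m_k |I_k| \\ &= \sum_{k \ne k_0} m_k |I_k| + m_{k_0} |I_{k_0}| \\ &\le \sum_{k \ne k_0} m_k |I_k| + m_l |[x_{k_0 - 1}, x]| + m_r |[x, x_{k_0}]| \\ &= \underline{D}(f, Q) \\ &\le \overline{D}(f, Q) \\ &= \sum_{k \ne k_0} M_k |I_k| + M_l |[x_{k_0 - 1}, x]| + M_r |[x, x_{k_0}]| \\ &\le \sum_{k=1}^{n} M_k |I_k| \\ &= \overline{D}(f, P). \end{aligned} \]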
The argument given above shows that the theorem holds if Q has one more point
than P . Using induction, this same technique also shows the theorem holds when
Q has an arbitrarily larger number of points than P .
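The monotone behavior of Darboux sums under refinement can be watched in a small computation. The sketch below rests on a simplifying assumption that is not in the text: the function is increasing, so the glb and lub on each subinterval are attained at its left and right endpoints. For a general bounded function these values cannot be found by sampling endpoints.

```python
def darboux_sums(f, points):
    """Lower and upper Darboux sums of an INCREASING function f.

    For increasing f, the glb and lub on [x_{k-1}, x_k] are attained at the
    left and right endpoints; this shortcut is an assumption of the sketch
    and fails for general bounded functions.
    """
    xs = sorted(set(points))
    lower = sum(f(l) * (r - l) for l, r in zip(xs, xs[1:]))
    upper = sum(f(r) * (r - l) for l, r in zip(xs, xs[1:]))
    return lower, upper

square = lambda x: x * x            # increasing on [0, 1]
P = [0, 0.5, 1]
Q = [0, 0.25, 0.5, 0.75, 1]         # Q contains P, so Q refines P

lower_P, upper_P = darboux_sums(square, P)
lower_Q, upper_Q = darboux_sums(square, Q)
print(lower_P, lower_Q, upper_Q, upper_P)
```

The four printed numbers appear in increasing order, matching the chain of inequalities in Theorem 8.5, and both pairs bracket the value 1/3 that calculus assigns to the integral of x² over [0, 1].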
The main lesson to be learned from Theorem 8.5 is that refining a partition
causes the lower Darboux sum to increase and the upper Darboux sum to
decrease. Moreover, if P, Q ∈ part([a, b]) and f : [a, b] → [−B, B], then,
\[ -B(b - a) \le \underline{D}(f, P) \le \underline{D}(f, P \cup Q) \le \overline{D}(f, P \cup Q) \le \overline{D}(f, Q) \le B(b - a). \]
Therefore every Darboux lower sum is less than or equal to every Darboux upper
sum. Consider the following definition with this in mind.
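DEFINITION 8.6. The upper and lower Darboux integrals of a bounded function
f : [a, b] → R are
\[ \overline{D}(f) = \operatorname{glb}\{\overline{D}(f, P) : P \in \operatorname{part}([a, b])\} \]
and
\[ \underline{D}(f) = \operatorname{lub}\{\underline{D}(f, P) : P \in \operatorname{part}([a, b])\}, \]
respectively.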
Which functions are Darboux integrable? The following corollary gives a first
approximation to an answer.
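\[ \begin{aligned} \overline{D}(f) - \underline{D}(f) &\le \overline{D}(f, P) - \underline{D}(f, P) \\ &= \sum_{i=1}^{n} f(x_i^*) |I_i| - \sum_{i=1}^{n} f(y_i^*) |I_i| \\ &= \sum_{i=1}^{n} (f(x_i^*) - f(y_i^*)) |I_i| \\ &< \frac{\varepsilon}{b - a} \sum_{i=1}^{n} |I_i| \\ &= \varepsilon \end{aligned} \]
and the corollary follows.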
This corollary should not be construed to imply that only continuous func-
tions are Darboux integrable. In fact, the set of integrable functions is much more
extensive than only the continuous functions. Consider the following example.
EXAMPLE 8.3. Let f be the salt and pepper function of Example 6.15. It was
shown that C(f) = Q^c. We claim that f is Darboux integrable over any compact
interval [a, b].
To see this, let ε > 0 and N ∈ N so that 1/N < ε/(2(b − a)). Let
\[ \{q_{k_i} : 1 \le i \le m\} = \{q_k : 1 \le k \le N\} \cap [a, b] \]
and choose P ∈ part([a, b]) such that ‖P‖ < ε/(2m). Then
\[ \begin{aligned} \overline{D}(f, P) &= \sum_{\ell=1}^{n} \operatorname{lub}\{f(x) : x \in I_\ell\} |I_\ell| \\ &= \sum_{q_{k_i} \notin I_\ell} \operatorname{lub}\{f(x) : x \in I_\ell\} |I_\ell| + \sum_{q_{k_i} \in I_\ell} \operatorname{lub}\{f(x) : x \in I_\ell\} |I_\ell| \\ &\le \frac{1}{N}(b - a) + m \|P\| \\ &< \frac{\varepsilon}{2(b - a)}(b - a) + m \frac{\varepsilon}{2m} \\ &= \varepsilon. \end{aligned} \]
4. The Integral
There are now two different definitions for the integral. It would be embarrassing
if they gave different answers. The following theorem shows they’re really
different sides of the same coin.2
\[ |R(f) - R(f, P, x_k^*)| < \varepsilon/4 \]
2Theorem 8.9 shows that the two integrals presented here are the same. But, there are many
other integrals, and not all of them are equivalent. For example, the well-known Lebesgue integral
includes all Riemann integrable functions, but not all Lebesgue integrable functions are Riemann
integrable. The Denjoy integral is another extension of the Riemann integral which is not the same
as the Lebesgue integral. For more discussion of this, see [11].
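Then
\[ \begin{aligned} \overline{D}(f, P) - R(f, P, \overline{x}_k) &= \sum_{k=1}^{n} M_k |I_k| - \sum_{k=1}^{n} f(\overline{x}_k) |I_k| \\ &= \sum_{k=1}^{n} (M_k - f(\overline{x}_k)) |I_k| \\ &< \frac{\varepsilon}{4(b - a)}(b - a) = \frac{\varepsilon}{4}. \end{aligned} \]
In the same way,
\[ R(f, P, \underline{x}_k) - \underline{D}(f, P) < \varepsilon/4. \]
Therefore,
\[ \begin{aligned} \overline{D}(f) - \underline{D}(f) &= \operatorname{glb}\{\overline{D}(f, Q) : Q \in \operatorname{part}([a, b])\} - \operatorname{lub}\{\underline{D}(f, Q) : Q \in \operatorname{part}([a, b])\} \\ &\le \overline{D}(f, P) - \underline{D}(f, P) \\ &< \left( R(f, P, \overline{x}_k) + \frac{\varepsilon}{4} \right) - \left( R(f, P, \underline{x}_k) - \frac{\varepsilon}{4} \right) \\ &\le \left| R(f, P, \overline{x}_k) - R(f, P, \underline{x}_k) \right| + \frac{\varepsilon}{2} \\ &< |R(f, P, \overline{x}_k) - R(f)| + |R(f) - R(f, P, \underline{x}_k)| + \frac{\varepsilon}{2} \\ &< \varepsilon \end{aligned} \]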
there is a P_1 ∈ part([a, b]), with points a = p_0 < · · · < p_m = b, such that
\[ \overline{D}(f, P_1) - \underline{D}(f, P_1) < \frac{\varepsilon}{2}. \]
Set δ = ε/(8mB). Choose P ∈ part([a, b]) with ‖P‖ < δ and let P_2 = P ∪ P_1. Since
P_1 ≪ P_2, according to Theorem 8.5,
\[ \overline{D}(f, P_2) - \underline{D}(f, P_2) < \frac{\varepsilon}{2}. \]
Thinking of P as the generic partition, the interiors of its intervals (x_{i−1}, x_i)
may or may not contain points of P_1. For 1 ≤ i ≤ n, let
\[ Q_i = \{x_{i-1}, x_i\} \cup (P_1 \cap (x_{i-1}, x_i)) \in \operatorname{part}(I_i). \]
If P_1 ∩ (x_{i−1}, x_i) = ∅, then \overline{D}(f, P) and \overline{D}(f, P_2) have the term M_i|I_i| in common
because Q_i = {x_{i−1}, x_i}.
Otherwise, P_1 ∩ (x_{i−1}, x_i) ≠ ∅ and
\[ \overline{D}(f, Q_i) \ge -B \|P_2\| \ge -B \|P\| > -B\delta. \]
Since P_1 has m − 1 points in (a, b), there are at most m − 1 of the Q_i not contained
in P.
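\[ \begin{aligned} \overline{D}(f, P) - \underline{D}(f, P) &= \left( \overline{D}(f, P) - \overline{D}(f, P_2) \right) + \left( \overline{D}(f, P_2) - \underline{D}(f, P_2) \right) + \left( \underline{D}(f, P_2) - \underline{D}(f, P) \right) \\ &< \frac{\varepsilon}{4} + \frac{\varepsilon}{2} + \frac{\varepsilon}{4} = \varepsilon \end{aligned} \]
This shows that, given ε > 0, there is a δ > 0 so that ‖P‖ < δ implies
\[ \overline{D}(f, P) - \underline{D}(f, P) < \varepsilon. \]
Since
\[ \underline{D}(f, P) \le D(f) \le \overline{D}(f, P) \quad\text{and}\quad \underline{D}(f, P) \le R(f, P, x_i^*) \le \overline{D}(f, P) \]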
When proving statements about the integral, it’s convenient to switch back
and forth between the Riemann and Darboux formulations. Given f : [a, b] → R
the following three facts summarize much of what we know.
(1) ∫_a^b f exists iff for all ε > 0 there is a δ > 0 and an α ∈ R such that whenever
P ∈ part([a, b]) with ‖P‖ < δ and x_i^* is a selection from P, then |R(f, P, x_i^*) − α| < ε.
In this case ∫_a^b f = α.
(2) ∫_a^b f exists iff ∀ε > 0 ∃P ∈ part([a, b]) ( \overline{D}(f, P) − \underline{D}(f, P) < ε ).
(a) ∫_a^b f exists.
(b) Given ε > 0 there exists P ∈ part([a, b]) such that if P ≪ Q_1 and P ≪
Q_2, then
\[ \left| R(f, Q_1, x_k^*) - R(f, Q_2, y_k^*) \right| < \varepsilon \tag{70} \]
for any selections from Q_1 and Q_2.
PROOF. ($\Longrightarrow$) Assume $\int_a^b f$ exists. According to Definition 8.1, there is a $\delta > 0$ such that whenever $P \in \operatorname{part}([a,b])$ with $\|P\| < \delta$, then $\left|\int_a^b f - R(f,P,x_i^*)\right| < \varepsilon/2$
\[
\begin{aligned}
\overline{D}(f,P) - \underline{D}(f,P)
&= \overline{D}(f,P) - R(f,P,x_k^*) + R(f,P,x_k^*) - R(f,P,y_k^*) + R(f,P,y_k^*) - \underline{D}(f,P) \\
&= \sum_{k=1}^n (M_k - f(x_k^*))|I_k| + R(f,P,x_k^*) - R(f,P,y_k^*) + \sum_{k=1}^n (f(y_k^*) - m_k)|I_k| \\
&\le \sum_{k=1}^n \left|M_k - f(x_k^*)\right||I_k| + \left|R(f,P,x_k^*) - R(f,P,y_k^*)\right| + \left|\sum_{k=1}^n (f(y_k^*) - m_k)|I_k|\right| \\
&< \sum_{k=1}^n \frac{\varepsilon}{4n|I_k|}|I_k| + \left|R(f,P,x_k^*) - R(f,P,y_k^*)\right| + \sum_{k=1}^n \frac{\varepsilon}{4n|I_k|}|I_k| \\
&< \frac{\varepsilon}{4} + \frac{\varepsilon}{2} + \frac{\varepsilon}{4} = \varepsilon
\end{aligned}
\]
Corollary 8.7 implies $D(f)$ exists and Theorem 8.9 finishes the proof.
COROLLARY 8.11. If $\int_a^b f$ exists and $[c,d] \subset [a,b]$, then $\int_c^d f$ exists.
PROOF. Let $P_0 = \{a,b,c,d\} \in \operatorname{part}([a,b])$ and $\varepsilon > 0$. Using Theorem 8.10, choose a partition $P_\varepsilon$ such that $P_0 \ll P_\varepsilon$ and whenever $P_\varepsilon \ll P$ and $P_\varepsilon \ll P'$, then
\[ \left|R(f,P,x_k^*) - R(f,P',y_k^*)\right| < \varepsilon. \]
\[ \left|R(f,Q_1,x_k^*) - R(f,Q_2,y_k^*)\right| = \left|R(f,P_\varepsilon^1 \cup Q_1 \cup P_\varepsilon^3,x_k^*) - R(f,P_\varepsilon^1 \cup Q_2 \cup P_\varepsilon^3,y_k^*)\right| < \varepsilon \]
for any selections. An application of Theorem 8.10 shows $\int_c^d f$ exists.
and define
Then
PROOF. (a) Since all the Riemann sums are nonnegative, this follows at once.
(b) It is always true that $|f| + f \ge 0$ and $|f| - f \ge 0$, so by (a), $\int_a^b (|f| + f) \ge 0$ and $\int_a^b (|f| - f) \ge 0$. Rearranging these shows $-\int_a^b f \le \int_a^b |f|$ and $\int_a^b f \le \int_a^b |f|$. Therefore, $\left|\int_a^b f\right| \le \int_a^b |f|$, which is (b).
(c) By Corollary 8.11, all the integrals exist. Let $\varepsilon > 0$ and choose $P_l \in \operatorname{part}([a,c])$ and $P_r \in \operatorname{part}([c,b])$ such that whenever $P_l \ll Q_l$ and $P_r \ll Q_r$, then
\[ \left|R(f,Q_l,x_k^*) - \int_a^c f\right| < \frac{\varepsilon}{2} \quad\text{and}\quad \left|R(f,Q_r,y_k^*) - \int_c^b f\right| < \frac{\varepsilon}{2}. \]
If $P = P_l \cup P_r$ and $Q = Q_l \cup Q_r$, then $P, Q \in \operatorname{part}([a,b])$ and $P \ll Q$. The triangle inequality gives
\[ \left|R(f,Q,x_k^*) - \int_a^c f - \int_c^b f\right| < \varepsilon. \]
There's some notational trickery that can be played here. If $\int_a^b f$ exists, then we define $\int_b^a f = -\int_a^b f$. With this convention, it can be shown
\[ \int_a^b f = \int_a^c f + \int_c^b f \tag{73} \]
no matter the order of $a$, $b$ and $c$, as long as at least two of the integrals exist. (See Problem 8.4.)
PROOF. Let $\varepsilon > 0$. According to (a) and Definition 8.1, $P \in \operatorname{part}([a,b])$ can be chosen such that
\[ \left|R(f,P,x_k^*) - \int_a^b f\right| < \varepsilon \]
for every selection from $P$. On each interval $[x_{k-1}, x_k]$ determined by $P$, the function $F$ satisfies the conditions of the Mean Value Theorem. (See Corollary 7.13.) Therefore, for each $k$, there is a $c_k \in (x_{k-1}, x_k)$ such that
\[ F(x_k) - F(x_{k-1}) = F'(c_k)(x_k - x_{k-1}) = f(c_k)|I_k|. \]
So,
\[
\begin{aligned}
\left|\int_a^b f - (F(b) - F(a))\right|
&= \left|\int_a^b f - \sum_{k=1}^n (F(x_k) - F(x_{k-1}))\right| \\
&= \left|\int_a^b f - \sum_{k=1}^n f(c_k)|I_k|\right| \\
&= \left|\int_a^b f - R(f,P,c_k)\right| \\
&< \varepsilon
\end{aligned}
\]
and the theorem follows.
\[ \frac{d}{dt} R_f(n,x,t) = -\frac{(x-t)^n}{n!} f^{(n+1)}(t). \]
COROLLARY 8.15 (Integration by Parts). If $f, g \in C([a,b]) \cap D((a,b))$ and both $f'g$ and $fg'$ are integrable on $[a,b]$, then
\[ \int_a^b fg' + \int_a^b f'g = f(b)g(b) - f(a)g(a). \]
Suppose $\int_a^b f$ exists. By Corollary 8.11, $f$ is integrable on every interval $[a,x]$, for $x \in [a,b]$. This allows us to define a function $F : [a,b] \to \mathbb{R}$ as $F(x) = \int_a^x f$, called the indefinite integral of $f$ on $[a,b]$.
PROOF. To show $F \in C([a,b])$, let $x_0 \in [a,b]$ and $\varepsilon > 0$. Since $\int_a^b f$ exists, there is an $M > \operatorname{lub}\{|f(x)| : a \le x \le b\}$. Choose $0 < \delta < \varepsilon/M$ and $x \in (x_0 - \delta, x_0 + \delta) \cap [a,b]$. Then
\[ |F(x) - F(x_0)| = \left|\int_{x_0}^x f\right| \le M|x - x_0| < M\delta < \varepsilon \]
and $x_0 \in C(F)$.
The right picture makes Theorem 8.16 almost obvious. Consider Figure 8.3.
Suppose x ∈ C ( f ) and ε > 0. There is a δ > 0 such that
Let
\[ m = \operatorname{glb}\{f(y) : |x - y| < \delta\} \le \operatorname{lub}\{f(y) : |x - y| < \delta\} = M. \]
Apparently $M - m < \varepsilon$ and for $0 < h < \delta$,
\[ mh \le \int_x^{x+h} f \le Mh \implies m \le \frac{F(x+h) - F(x)}{h} \le M. \]
EXAMPLE 8.5. The usual definition of the natural logarithm function depends on the Fundamental Theorem of Calculus. Recall for $x > 0$,
\[ \ln(x) = \int_1^x \frac{1}{t}\,dt. \]
It's easy to read too much into the Fundamental Theorem of Calculus. We are tempted to start thinking of integration and differentiation as opposites of each other. But this is far from the truth. Integration and antidifferentiation are different operations that sometimes happen to be tied together by the Fundamental Theorem of Calculus. Consider the following examples.
It's easy to prove that $f$ is integrable over any compact interval, and that $F(x) = \int_{-1}^x f = |x| - 1$ is an indefinite integral of $f$. But, $F$ is not differentiable at $x = 0$ and $f$ is not a derivative, according to Theorem 7.17.
EXAMPLE 8.8. Let $f$ be the salt and pepper function of Example 6.15. It was shown in Example 8.3 that $\int_a^b f = 0$ on any interval $[a,b]$. If $F(x) = \int_0^x f$, then $F(x) = 0$ for all $x$ and $F' = f$ only on $C(f) = \mathbb{Q}^c$.
8. Change of Variables
Integration by substitution works side-by-side with the Fundamental The-
orem of Calculus in the integration section of any calculus course. Most of the
time calculus books require all functions in sight to be continuous. In that case, a
substitution theorem is an easy consequence of the Fundamental Theorem and
the Chain Rule. (See Exercise 8.13.) More general statements are true, but they
are harder to prove.
T HEOREM 8.17. If f and g are functions such that
(a) g is strictly monotone on [a, b],
(b) g is continuous on [a, b],
(c) g is differentiable on (a, b), and
(d) both $\int_{g(a)}^{g(b)} f$ and $\int_a^b (f \circ g)g'$ exist,
then
\[ \int_{g(a)}^{g(b)} f = \int_a^b (f \circ g)g'. \tag{74} \]
P ROOF. Suppositions (a) and (b) show g is a bijection from [a, b] to an interval
[c, d ]. The correspondence between the endpoints depends on whether g is
increasing or decreasing.
Let ε > 0.
From (d) and Definition 8.1, there is a $\delta_1 > 0$ such that whenever $P \in \operatorname{part}([a,b])$ with $\|P\| < \delta_1$, then
\[ \left|R((f \circ g)g', P, x_i^*) - \int_a^b (f \circ g)g'\right| < \frac{\varepsilon}{2} \tag{75} \]
for any selection from $P$. Choose $P_1 \in \operatorname{part}([a,b])$ such that $\|P_1\| < \delta_1$.
Using the same argument, there is a $\delta_2 > 0$ such that whenever $Q \in \operatorname{part}([c,d])$ with $\|Q\| < \delta_2$, then
\[ \left|R(f,Q,x_i^*) - \int_c^d f\right| < \frac{\varepsilon}{2} \tag{76} \]
for any selection from $Q$. As above, choose $Q_1 \in \operatorname{part}([c,d])$ such that $\|Q_1\| < \delta_2$.
Setting $P_2 = P_1 \cup \{g^{-1}(x) : x \in Q_1\}$ and $Q_2 = Q_1 \cup \{g(x) : x \in P_1\}$, it is apparent that $P_1 \ll P_2$, $Q_1 \ll Q_2$, $\|P_2\| \le \|P_1\| < \delta_1$, $\|Q_2\| \le \|Q_1\| < \delta_2$ and $Q_2 = \{g(x) : x \in P_2\}$. From (75) and (76), it follows that
\[ \left|\int_a^b (f \circ g)g' - R((f \circ g)g', P_2, x_i^*)\right| < \frac{\varepsilon}{2} \quad\text{and}\quad \left|\int_c^d f - R(f,Q_2,y_i^*)\right| < \frac{\varepsilon}{2} \tag{77} \]
for any selections from $P_2$ and $Q_2$.
Label the points of $P_2$ as $a = x_0 < x_1 < \cdots < x_n = b$ and those of $Q_2$ as $c = y_0 < y_1 < \cdots < y_n = d$. From (b), (c) and the Mean Value Theorem, for each $i$, choose $c_i \in (x_{i-1}, x_i)$ such that
\[ g(x_i) - g(x_{i-1}) = g'(c_i)(x_i - x_{i-1}). \tag{78} \]
\[
\begin{aligned}
\left|\int_{g(a)}^{g(b)} f - \int_a^b (f \circ g)g'\right|
&= \left|\int_{g(a)}^{g(b)} f - R(f,Q_2,g(c_i)) + R(f,Q_2,g(c_i)) - \int_a^b (f \circ g)g'\right| \\
&\le \left|\int_{g(a)}^{g(b)} f - R(f,Q_2,g(c_i))\right| + \left|R(f,Q_2,g(c_i)) - \int_a^b (f \circ g)g'\right|
\end{aligned}
\]
Use the triangle inequality and (77). Expand the second Riemann sum.
\[ < \frac{\varepsilon}{2} + \left|\sum_{i=1}^n f(g(c_i))\left(g(x_i) - g(x_{i-1})\right) - \int_a^b (f \circ g)g'\right| \]
Apply the Mean Value Theorem, as in (78), and then use (77).
\[
\begin{aligned}
&= \frac{\varepsilon}{2} + \left|\sum_{i=1}^n f(g(c_i))g'(c_i)(x_i - x_{i-1}) - \int_a^b (f \circ g)g'\right| \\
&= \frac{\varepsilon}{2} + \left|R((f \circ g)g', P_2, c_i) - \int_a^b (f \circ g)g'\right| \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} \\
&= \varepsilon
\end{aligned}
\]
where $c_k \in (x_{k-1}, x_k)$ is as above. The rest of the proof is much like the case when $g$ is increasing.
\[
\begin{aligned}
\left|\int_{g(a)}^{g(b)} f - \int_a^b (f \circ g)g'\right|
&= \left|-\int_{g(b)}^{g(a)} f + R(f,Q_2,g(c_{n-k+1})) - R(f,Q_2,g(c_{n-k+1})) - \int_a^b (f \circ g)g'\right| \\
&\le \left|-\int_{g(b)}^{g(a)} f + R(f,Q_2,g(c_{n-k+1}))\right| + \left|-R(f,Q_2,g(c_{n-k+1})) - \int_a^b (f \circ g)g'\right|
\end{aligned}
\]
Use (77), expand the second Riemann sum and apply (79).
\[
\begin{aligned}
&< \frac{\varepsilon}{2} + \left|-\sum_{k=1}^n f(g(c_{n-k+1}))(y_k - y_{k-1}) - \int_a^b (f \circ g)g'\right| \\
&= \frac{\varepsilon}{2} + \left|\sum_{k=1}^n f(g(c_{n-k+1}))g'(c_{n-k+1})(x_{n-k+1} - x_{n-k}) - \int_a^b (f \circ g)g'\right|
\end{aligned}
\]
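On the other hand, it can also be done with a decreasing function. If $g(x) = \cos x$ and $[a,b] = [0,\pi]$, then
\[
\begin{aligned}
\int_{-1}^{1} \sqrt{1 - x^2}\,dx &= \int_{\cos \pi}^{\cos 0} \sqrt{1 - x^2}\,dx \\
&= -\int_{\cos 0}^{\cos \pi} \sqrt{1 - x^2}\,dx \\
&= -\int_0^\pi \sqrt{1 - \cos^2 x}\,(-\sin x)\,dx \\
&= \int_0^\pi \sqrt{1 - \cos^2 x}\,\sin x\,dx \\
&= \int_0^\pi \sin^2 x\,dx \\
&= \frac{\pi}{2}
\end{aligned}
\]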
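The change of variables $x = \cos t$ can be sanity-checked numerically. The following is only a sketch: the helper `midpoint_sum` and the number of subintervals are choices made here for illustration, and a midpoint Riemann sum stands in for the integral of Definition 8.1.

```python
import math

def midpoint_sum(func, a, b, n=20000):
    """Midpoint Riemann sum of func over [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

# Left side: the area under the upper unit semicircle.
lhs = midpoint_sum(lambda x: math.sqrt(1.0 - x * x), -1.0, 1.0)

# Right side after substituting x = cos(t): the integrand is sin^2(t) on [0, pi].
rhs = midpoint_sum(lambda t: math.sin(t) ** 2, 0.0, math.pi)

print(lhs, rhs, math.pi / 2)
```

Both sums should land close to $\pi/2$. The left one converges more slowly because $\sqrt{1 - x^2}$ has unbounded slope at the endpoints.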
(c) $\int_a^b f$ and $\int_a^b fg$ both exist.
There is a $c \in [a,b]$ such that
\[ \int_a^b fg = m\int_a^c g + M\int_c^b g. \]
10. Exercises
8.2. Let
\[ f(x) = \begin{cases} 1, & x \in \mathbb{Q} \\ 0, & x \notin \mathbb{Q} \end{cases}. \]
(a) Use Definition 8.1 to show $f$ is not integrable on any interval.
(b) Use Definition 8.6 to show $f$ is not integrable on any interval.
8.3. Calculate $\int_2^5 x^2$ using the definition of integration.
8.6. If $f : [a,b] \to [0,\infty)$ is continuous and $D(f) = 0$, then $f(x) = 0$ for all $x \in [a,b]$.
8.7. If $\int_a^b f$ exists, then $\lim_{x \downarrow a} \int_x^b f = \int_a^b f$.
8.8. If $f$ is monotone on $[a,b]$, then $\int_a^b f$ exists.
(Hint: Expand $\int_a^b (xf + g)^2$ as a quadratic with variable $x$.)³
8.11. If $f(x) = \int_1^x \frac{dt}{t}$ for $x > 0$, then $f(xy) = f(x) + f(y)$ for $x, y > 0$.
8.13. In the statement of Theorem 8.17, make the additional assumptions that f
and g 0 are both continuous. Use the Fundamental Theorem of Calculus to give
an easier proof.
3This is variously called the Cauchy inequality, Cauchy-Schwarz inequality, or the Cauchy-
Schwarz-Bunyakowsky inequality. Rearranging the last one, some people now call it the CBS
inequality.
Sequences of Functions
1. Pointwise Convergence
We have accumulated much experience working with sequences of numbers.
The next level of complexity is sequences of functions. This chapter explores
several ways that sequences of functions can converge to another function. The
basic starting point is contained in the following definitions.
DEFINITION 9.1. Suppose $S \subset \mathbb{R}$ and for each $n \in \mathbb{N}$ there is a function $f_n : S \to \mathbb{R}$. The collection $\{f_n : n \in \mathbb{N}\}$ is a sequence of functions defined on $S$.
For each fixed $x \in S$, $f_n(x)$ is a sequence of numbers, and it makes sense to ask whether this sequence converges. If $f_n(x)$ converges for each $x \in S$, a new function $f : S \to \mathbb{R}$ is defined by
\[ f(x) = \lim_{n \to \infty} f_n(x). \]
The function $f$ is called the pointwise limit of the sequence $f_n$, or, equivalently, it is said $f_n$ converges pointwise to $f$. This is abbreviated $f_n \xrightarrow{S} f$, or simply $f_n \to f$, if the domain is clear from the context.
EXAMPLE 9.1. Let
\[ f_n(x) = \begin{cases} 0, & x < 0 \\ x^n, & 0 \le x < 1 \\ 1, & x \ge 1 \end{cases}. \]
Then $f_n \to f$ where
\[ f(x) = \begin{cases} 0, & x < 1 \\ 1, & x \ge 1 \end{cases}. \]
(See Figure 9.1.) This example shows that a pointwise limit of continuous functions need not be continuous.
FIGURE 9.1. The first ten functions from the sequence of Example 9.1.
FIGURE 9.2. The first four functions from the sequence of Example 9.2.
implies f n → 0. This example shows that functions can remain bounded away
from 0 and still converge pointwise to 0.
To figure out what this looks like, it might help to look at Figure 9.3.
The graph of $f_n$ is a piecewise linear function supported on $[1/2^{n+1}, 1/2^n]$ and the area under the isosceles triangle of the graph over this interval is 1. Therefore, $\int_0^1 f_n = 1$ for all $n$.
If $x > 0$, then whenever $x > 1/2^n$, we have $f_n(x) = 0$. From this it follows that $f_n \to 0$.
The lesson to be learned from this example is that it may not be true that $\lim_{n \to \infty} \int_0^1 f_n = \int_0^1 \lim_{n \to \infty} f_n$.
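To see both effects at once, here is a small numerical sketch. The explicit formula in `tent` is an assumption reconstructed from the description (an isosceles triangle of area 1 over $[1/2^{n+1}, 1/2^n]$, which forces a peak height of $2^{n+2}$); the helper names are illustrative only.

```python
def tent(n, x):
    """Assumed form of f_n: an isosceles triangle on [2**-(n+1), 2**-n] with area 1."""
    left, right = 2.0 ** -(n + 1), 2.0 ** -n
    if x <= left or x >= right:
        return 0.0
    mid = (left + right) / 2.0
    height = 2.0 ** (n + 2)  # (1/2) * base * height = 1 with base 2**-(n+1)
    return height * (1.0 - abs(x - mid) / (mid - left))

def integral_01(n, pieces=1 << 14):
    """Midpoint sum of tent(n, .) over [0, 1]."""
    h = 1.0 / pieces
    return h * sum(tent(n, (k + 0.5) * h) for k in range(pieces))

integrals = [integral_01(n) for n in range(1, 7)]   # each should be 1
values_at_fixed_x = [tent(n, 0.3) for n in range(1, 20)]  # eventually all 0
print(integrals, values_at_fixed_x[:4])
```

The integrals stay at 1 while the values at any fixed $x > 0$ are eventually 0, so the limit of the integrals is 1 but the integral of the limit is 0.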
FIGURE 9.3. The first four functions $f_n \to 0$ from the sequence of Example 9.3.
FIGURE 9.4. The first ten functions of the sequence $f_n \to |x|$ from Example 9.4.
(See Figure 9.4.) The parabolic section in the center was chosen so $f_n(\pm 1/n) = 1/n$ and $f_n'(\pm 1/n) = \pm 1$. This splices the sections together at $(\pm 1/n, 1/n)$ so $f_n$ is differentiable everywhere. It's clear $f_n \to |x|$, which is not differentiable at 0.
This example shows that the limit of differentiable functions need not be
differentiable.
The examples given above show that continuity, integrability and differen-
tiability are not preserved in the pointwise limit of a sequence of functions. To
have any hope of preserving these properties, a stronger form of convergence is
needed.
2. Uniform Convergence
DEFINITION 9.2. The sequence $f_n : S \to \mathbb{R}$ converges uniformly to $f : S \to \mathbb{R}$ on $S$, if for each $\varepsilon > 0$ there is an $N \in \mathbb{N}$ so that whenever $n \ge N$ and $x \in S$, then $|f_n(x) - f(x)| < \varepsilon$.
In this case, we write $f_n \overset{S}{\rightrightarrows} f$, or simply $f_n \rightrightarrows f$, if the set $S$ is clear from the context.
The first three examples given above show the converse to Theorem 9.3
is false. There is, however, one interesting and useful case in which a partial
converse is true.
DEFINITION 9.4. If $f_n \xrightarrow{S} f$ and $f_n(x) \uparrow f(x)$ for all $x \in S$, then $f_n$ increases to $f$ on $S$. If $f_n \xrightarrow{S} f$ and $f_n(x) \downarrow f(x)$ for all $x \in S$, then $f_n$ decreases to $f$ on $S$. In either case, $f_n$ is said to converge to $f$ monotonically.
The functions of Example 9.4 decrease to $|x|$. Notice that in this case, the convergence also happens to be uniform. The following theorem shows Example 9.4 to be an instance of a more general phenomenon.
THEOREM 9.5 (Dini's Theorem). If
(a) $S$ is compact,
(b) $f_n \xrightarrow{S} f$ monotonically,
(c) $f_n \in C(S)$ for all $n \in \mathbb{N}$, and
(d) $f \in C(S)$,
then $f_n \rightrightarrows f$.
PROOF. There is no loss of generality in assuming $f_n \downarrow f$, for otherwise we consider $-f_n$ and $-f$. With this assumption, if $g_n = f_n - f$, then $g_n$ is a sequence of continuous functions decreasing to 0. It suffices to show $g_n \rightrightarrows 0$.
To do so, let $\varepsilon > 0$. Using continuity and pointwise convergence, for each $x \in S$ find an open set $G_x$ containing $x$ and an $N_x \in \mathbb{N}$ such that $g_{N_x}(y) < \varepsilon$ for all $y \in G_x$. Notice that the monotonicity condition guarantees $g_n(y) < \varepsilon$ for every $y \in G_x$ and $n \ge N_x$.
The collection $\{G_x : x \in S\}$ is an open cover for $S$, so it must contain a finite subcover $\{G_{x_i} : 1 \le i \le n\}$. Let $N = \max\{N_{x_i} : 1 \le i \le n\}$ and choose $m \ge N$. If $x \in S$, then $x \in G_{x_i}$ for some $i$, and $0 \le g_m(x) \le g_N(x) \le g_{N_{x_i}}(x) < \varepsilon$. It follows that $g_n \rightrightarrows 0$.
EXAMPLE 9.5. Let $f_n(x) = x^n$ for $n \in \mathbb{N}$, then $f_n$ decreases to 0 on $[0,1)$. If $0 < a < 1$, Dini's Theorem shows $f_n \rightrightarrows 0$ on the compact interval $[0,a]$. On the whole interval $[0,1)$, $f_n(x) > 1/2$ when $2^{-1/n} < x < 1$, so $f_n$ is not uniformly convergent. (Why doesn't this violate Dini's Theorem?)
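The contrast between $[0,a]$ and $[0,1)$ can be made concrete with a few numbers. This is a sketch only: the value $a = 0.9$ and the helper `witness` are choices made here. Since $x^n$ is increasing on $[0,1)$, its supremum over $[0,a]$ is simply $a^n$.

```python
a = 0.9
exponents = (1, 10, 100, 1000)

# The supremum of x**n over [0, a] is a**n, which tends to 0.
sup_on_0a = [a ** n for n in exponents]

def witness(n):
    """A point of (2**(-1/n), 1) at which x**n still exceeds 1/2."""
    return 2.0 ** (-1.0 / (2 * n))

# For every n there is a point of [0, 1) where f_n is about 0.707.
values_at_witness = [witness(n) ** n for n in exponents]
print(sup_on_0a, values_at_witness)
```

No matter how large $n$ is, the witness point keeps $f_n$ above $1/2$, which is exactly the failure of uniform convergence on $[0,1)$.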
THEOREM 9.7. Let $f_n \in B(S)$. There is a function $f \in B(S)$ such that $f_n \rightrightarrows f$ iff $f_n$ is a Cauchy sequence in $B(S)$.
1Definition 2.11
4. Series of Functions
The definitions of pointwise and uniform convergence are extended in the natural way to series of functions. If $\sum_{k=1}^\infty f_k$ is a series of functions defined on a set $S$, then the series converges pointwise or uniformly, depending on whether the sequence of partial sums, $s_n = \sum_{k=1}^n f_k$, converges pointwise or uniformly, respectively.
\[ \|s_n - s_m\| = \left\|\sum_{k=m+1}^n f_k\right\| \le \sum_{k=m+1}^n \|f_k\| \le \sum_{k=m}^n M_k < \varepsilon. \]
This shows $s_n$ is a Cauchy sequence and must converge according to Theorem 9.7.
EXAMPLE 9.8. Let $a > 0$ and $M_n = a^n/n!$. Since
\[ \lim_{n \to \infty} \frac{M_{n+1}}{M_n} = \lim_{n \to \infty} \frac{a}{n+1} = 0, \]
the Ratio Test shows $\sum_{n=0}^\infty M_n$ converges. When $x \in [-a,a]$,
\[ \left|\frac{x^n}{n!}\right| \le \frac{a^n}{n!}. \]
The Weierstrass M-Test now implies $\sum_{n=0}^\infty x^n/n!$ converges absolutely and uniformly on $[-a,a]$ for any $a > 0$. (See Exercise 9.4.)
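The uniform bound given by the M-Test can be observed numerically. The sketch below assumes the sum of the series is the usual exponential function (compare Exercise 9.15) and uses Python's `math.exp` as the reference; the grid, the values $a = 3$ and $n = 20$, and the truncation of the tail at 60 terms are choices made here.

```python
import math

def partial_sum(x, n):
    """s_n(x) = sum_{k=0}^{n} x**k / k!."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def tail_bound(a, n, extra=60):
    """Approximates sum_{k>n} a**k / k!, the M-test bound on sup |exp - s_n| over [-a, a]."""
    return sum(a ** k / math.factorial(k) for k in range(n + 1, n + 1 + extra))

a, n = 3.0, 20
grid = [-a + 2 * a * j / 200 for j in range(201)]
worst = max(abs(math.exp(x) - partial_sum(x, n)) for x in grid)
print(worst, tail_bound(a, n))
```

The worst error over the grid should not exceed the tail $\sum_{k>n} M_k$, and the same bound works for every $x \in [-a,a]$ at once, which is what uniform convergence means here.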
P ROOF. Parts (a) and (b) follow easily from the definition of k n .
To prove (c) first note that
\[ 1 = \int_{-1}^{1} k_n \ge \int_{-1/\sqrt{n}}^{1/\sqrt{n}} c_n(1 - t^2)^n\,dt \ge c_n \frac{2}{\sqrt{n}}\left(1 - \frac{1}{n}\right)^n. \]
Since $\left(1 - \frac{1}{n}\right)^n \uparrow \frac{1}{e}$, it follows there is an $\alpha > 0$ such that $c_n < \alpha\sqrt{n}$.² Letting $\delta \in (0,1)$ and $\delta \le t \le 1$,
\[ k_n(t) \le k_n(\delta) \le \alpha\sqrt{n}(1 - \delta^2)^n \to 0 \]
by L'Hospital's Rule. Since $k_n$ is an even function, this establishes (c).
PROOF. There is no generality lost in assuming $[a,b] = [0,1]$, for otherwise we consider the linear change of variables $g(x) = f((b-a)x + a)$. Similarly, we can assume $f(0) = f(1) = 0$, for otherwise we consider $g(x) = f(x) - \left((f(1) - f(0))x + f(0)\right)$, which is a polynomial added to $f$. We can further assume $f(x) = 0$ when $x \notin [0,1]$.
Set
\[ p_n(x) = \int_{-1}^{1} f(x+t)k_n(t)\,dt. \tag{82} \]
3 Given two functions $f$ and $g$ defined on $\mathbb{R}$, the convolution of $f$ and $g$ is the integral
\[ f \star g(x) = \int_{-\infty}^{\infty} f(t)g(x-t)\,dt. \]
The term convolution kernel is used because such kernels typically replace $g$ in the convolution given above, as can be seen in the proof of the Weierstrass approximation theorem.
4 It was investigated by the German mathematician Edmund Landau (1877–1938).
The theorems of this section can also be used to construct some striking
examples of functions with unwelcome behavior. Following is perhaps the most
famous.
Notice that s 0 is continuous and periodic with period 2 and maximum value 1.
Compress it both vertically and horizontally:
\[ s_n(x) = \left(\frac{3}{4}\right)^n s_0(4^n x), \quad n \in \mathbb{N}. \]
Each $s_n$ is continuous and periodic with period $p_n = 2/4^n$ and $\|s_n\| = (3/4)^n$.
Since $\|s_n\| = (3/4)^n$, the Weierstrass M-test implies the series defining $f$ is uniformly convergent and Corollary 9.12 shows $f$ is continuous on $\mathbb{R}$. We will show $f$ is differentiable nowhere.
Let $x \in \mathbb{R}$, $m \in \mathbb{N}$ and $h_m = 1/(2 \cdot 4^m)$.
If $n > m$, then $h_m/p_n = 4^{n-m-1} \in \mathbb{N}$, so $s_n(x \pm h_m) - s_n(x) = 0$ and
\[ \frac{f(x \pm h_m) - f(x)}{\pm h_m} = \sum_{k=0}^m \frac{s_k(x \pm h_m) - s_k(x)}{\pm h_m}. \tag{86} \]
\[ \le \frac{3^m - 1}{3 - 1} < \frac{3^m}{2}. \tag{87} \]
Since $s_m$ is linear on intervals of length $4^{-m} = 2 \cdot h_m$ with slope $\pm 3^m$ on those linear segments, at least one of the following is true:
\[ \left|\frac{s_m(x + h_m) - s_m(x)}{h_m}\right| = 3^m \quad\text{or}\quad \left|\frac{s_m(x - h_m) - s_m(x)}{-h_m}\right| = 3^m. \tag{88} \]
Suppose the first of these is true. The argument is essentially the same in the second case.
\[ > 3^m - \frac{3^m}{2} = \frac{3^m}{2}. \]
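The growth of the difference quotients can be watched numerically. The sketch below is an illustration under an assumption: the definition of $s_0$ is not repeated here, so `s0` is taken to be the sawtooth giving the distance from $x$ to the nearest even integer, which is continuous with period 2 and maximum value 1. Only the terms with $k \le m$ are summed, since the others cancel for the step $h_m$.

```python
def s0(x):
    """Assumed sawtooth: distance from x to the nearest even integer."""
    return abs((x + 1.0) % 2.0 - 1.0)

def s(n, x):
    """s_n(x) = (3/4)**n * s0(4**n * x)."""
    return 0.75 ** n * s0(4.0 ** n * x)

def quotient(x, m, sign):
    """Difference quotient of f at x with step sign * h_m, using the terms k <= m."""
    h = sign / (2.0 * 4.0 ** m)
    return sum(s(k, x + h) - s(k, x) for k in range(m + 1)) / h

x = 0.3
best = [max(abs(quotient(x, m, 1)), abs(quotient(x, m, -1))) for m in range(1, 9)]
print(best)
```

For each $m$, the larger of the two one-sided quotients should exceed $3^m/2$, so the quotients are unbounded as $h_m \to 0$ and no derivative can exist at $x$.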
This is often referred to as “passing the limit through the integral.” At some
point in her career, any student of advanced analysis or probability theory will
be tempted to just blithely pass the limit through. But functions such as those of
Example 9.3 show that some care is needed. A common criterion for doing so is
uniform convergence.
THEOREM 9.16. If $f_n : [a,b] \to \mathbb{R}$ such that $\int_a^b f_n$ exists for each $n$ and $f_n \rightrightarrows f$ on $[a,b]$, then
\[ \int_a^b f = \lim_{n \to \infty} \int_a^b f_n. \]
P ROOF. Some care must be taken in this proof, because there are actually
two things to prove. Before the equality can be shown, it must be proved that f is
integrable.
To show that $f$ is integrable, let $\varepsilon > 0$ and $N \in \mathbb{N}$ such that $\|f - f_N\| < \varepsilon/(3(b-a))$. If $P \in \operatorname{part}([a,b])$, then
\[
\begin{aligned}
\left|R(f,P,x_k^*) - R(f_N,P,x_k^*)\right|
&= \left|\sum_{k=1}^n f(x_k^*)|I_k| - \sum_{k=1}^n f_N(x_k^*)|I_k|\right| \\
&= \left|\sum_{k=1}^n (f(x_k^*) - f_N(x_k^*))|I_k|\right| \\
&\le \sum_{k=1}^n |f(x_k^*) - f_N(x_k^*)||I_k| \\
&< \frac{\varepsilon}{3(b-a)} \sum_{k=1}^n |I_k| \\
&= \frac{\varepsilon}{3}
\end{aligned} \tag{89}
\]
According to Theorem 8.10, there is a $P \in \operatorname{part}([a,b])$ such that whenever $P \ll Q_1$ and $P \ll Q_2$, then
\[ \left|R(f_N,Q_1,x_k^*) - R(f_N,Q_2,y_k^*)\right| < \frac{\varepsilon}{3}. \tag{90} \]
Combining (89) and (90) yields
\[
\begin{aligned}
\left|R(f,Q_1,x_k^*) - R(f,Q_2,y_k^*)\right|
&= \left|R(f,Q_1,x_k^*) - R(f_N,Q_1,x_k^*) + R(f_N,Q_1,x_k^*) - R(f_N,Q_2,y_k^*) + R(f_N,Q_2,y_k^*) - R(f,Q_2,y_k^*)\right| \\
&\le \left|R(f,Q_1,x_k^*) - R(f_N,Q_1,x_k^*)\right| + \left|R(f_N,Q_1,x_k^*) - R(f_N,Q_2,y_k^*)\right| + \left|R(f_N,Q_2,y_k^*) - R(f,Q_2,y_k^*)\right| \\
&< \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon
\end{aligned}
\]
Another application of Theorem 8.10 shows that $f$ is integrable.
Finally, when $n \ge N$,
\[ \left|\int_a^b f - \int_a^b f_n\right| = \left|\int_a^b (f - f_n)\right| < \int_a^b \frac{\varepsilon}{3(b-a)} = \frac{\varepsilon}{3} < \varepsilon \]
shows that $\int_a^b f_n \to \int_a^b f$.
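A quick experiment illustrates the estimate $\left|\int_a^b f - \int_a^b f_n\right| \le (b-a)\|f - f_n\|$ behind this proof. The sequence $f_n(x) = \sqrt{x^2 + 1/n^2}$ on $[-1,1]$ is an example chosen here, not one from the text; it converges uniformly to $|x|$ because $0 \le f_n(x) - |x| \le 1/n$. The helper names and grid sizes are likewise illustrative.

```python
import math

def f_n(n, x):
    """Illustrative sequence sqrt(x**2 + 1/n**2), converging uniformly to |x|."""
    return math.sqrt(x * x + 1.0 / (n * n))

def midpoint_integral(func, a, b, pieces=4000):
    """Midpoint Riemann sum of func over [a, b]."""
    h = (b - a) / pieces
    return h * sum(func(a + (k + 0.5) * h) for k in range(pieces))

results = []
for n in (1, 10, 100):
    # Sup-norm distance to |x|, sampled on a grid that contains x = 0.
    sup_gap = max(abs(f_n(n, j / 500 - 1) - abs(j / 500 - 1)) for j in range(1001))
    # Distance between the integral of f_n and the integral of |x|, which is 1.
    integral_gap = abs(midpoint_integral(lambda x: f_n(n, x), -1.0, 1.0) - 1.0)
    results.append((n, sup_gap, integral_gap))
print(results)
```

In each row the integral gap stays below $(b-a) = 2$ times the sup-norm gap, and both shrink as $n$ grows.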
COROLLARY 9.17. If $\sum_{n=1}^\infty f_n$ is a series of integrable functions converging uniformly on $[a,b]$, then
\[ \int_a^b \sum_{n=1}^\infty f_n = \sum_{n=1}^\infty \int_a^b f_n. \]
EXAMPLE 9.10. It was shown in Example 4.2 that the geometric series
\[ \sum_{n=0}^\infty t^n = \frac{1}{1-t}, \quad -1 < t < 1. \]
In Exercise 9.3, you are asked to prove this convergence is uniform on any compact subset of $(-1,1)$. Substituting $-t$ for $t$ in the above formula, it follows that
\[ \sum_{n=0}^\infty (-t)^n \rightrightarrows \frac{1}{1+t} \]
on $[0,x]$, when $0 < x < 1$. Corollary 9.17 implies
\[ \ln(1+x) = \int_0^x \frac{dt}{1+t} = \sum_{n=0}^\infty \int_0^x (-t)^n\,dt = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots. \]
\[ \ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots \]
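Integrating $(-t)^n$ from 0 to $x$ gives $(-1)^n x^{n+1}/(n+1)$, so the $k$th term of the series for $\ln(1+x)$ is $(-1)^{k+1}x^k/k$. A short check against Python's `math.log1p` follows; the function name `log1p_series` and the choice $x = 0.5$ with 40 terms are illustrative.

```python
import math

def log1p_series(x, terms):
    """Partial sum x - x**2/2 + x**3/3 - ... with the given number of terms."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, terms + 1))

x = 0.5
approx = log1p_series(x, 40)
print(approx, math.log1p(x))
```

Because the series alternates with decreasing terms for $0 < x < 1$, the error after 40 terms is at most $x^{41}/41$.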
If $x \in [a,b]$ and $m, n \ge N$, then the Mean Value Theorem and the assumption that $F_m(a) = F_n(a) = 0$ yield a $c \in [a,b]$ such that
\[
\begin{aligned}
|F_m(x) - F_n(x)| &= |(F_m(x) - F_n(x)) - (F_m(a) - F_n(a))| \\
&= |f_m(c) - f_n(c)|\,|x - a| \le \|f_m - f_n\|(b-a) < \varepsilon.
\end{aligned} \tag{91}
\]
This shows $F_n$ is a Cauchy sequence in $C([a,b])$ and there is an $F \in C([a,b])$ with $F_n \rightrightarrows F$.
It suffices to show $F' = f$. To do this, several estimates are established.
Let $M \in \mathbb{N}$ so that
\[ m, n \ge M \implies \|f_m - f_n\| < \frac{\varepsilon}{3}. \]
Notice this implies
\[ \|f - f_n\| \le \frac{\varepsilon}{3}, \quad \forall n \ge M. \tag{92} \]
For such $m, n \ge M$ and $x, y \in [a,b]$ with $x \ne y$, another application of the Mean Value Theorem gives
\[
\begin{aligned}
\left|\frac{F_n(x) - F_n(y)}{x - y} - \frac{F_m(x) - F_m(y)}{x - y}\right|
&= \frac{1}{|x-y|}\left|(F_n(x) - F_m(x)) - (F_n(y) - F_m(y))\right| \\
&= \frac{1}{|x-y|}\left|f_n(c) - f_m(c)\right||x - y| \le \|f_n - f_m\| < \frac{\varepsilon}{3}.
\end{aligned}
\]
Letting $m \to \infty$, it follows that
\[ \left|\frac{F_n(x) - F_n(y)}{x - y} - \frac{F(x) - F(y)}{x - y}\right| \le \frac{\varepsilon}{3}, \quad \forall n \ge M. \tag{93} \]
Fix $n \ge M$ and $x \in [a,b]$. Since $F_n'(x) = f_n(x)$, there is a $\delta > 0$ so that
\[ \left|\frac{F_n(x) - F_n(y)}{x - y} - f_n(x)\right| < \frac{\varepsilon}{3}, \quad \forall y \in (x - \delta, x + \delta) \setminus \{x\}. \tag{94} \]
Finally, using (93), (94) and (92), we see
\[
\begin{aligned}
\left|\frac{F(x) - F(y)}{x - y} - f(x)\right|
&= \left|\frac{F(x) - F(y)}{x - y} - \frac{F_n(x) - F_n(y)}{x - y} + \frac{F_n(x) - F_n(y)}{x - y} - f_n(x) + f_n(x) - f(x)\right| \\
&\le \left|\frac{F(x) - F(y)}{x - y} - \frac{F_n(x) - F_n(y)}{x - y}\right| + \left|\frac{F_n(x) - F_n(y)}{x - y} - f_n(x)\right| + \left|f_n(x) - f(x)\right| \\
&< \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.
\end{aligned}
\]
8. Power Series
8.1. The Radius and Interval of Convergence. One place where uniform
convergence plays a key role is with power series. Recall the definition.
DEFINITION 9.22. A power series is a function of the form
\[ f(x) = \sum_{n=0}^\infty a_n(x - c)^n. \tag{96} \]
Members of the sequence $a_n$ are the coefficients of the series. The domain of $f$ is the set of all $x$ at which the series converges. The constant $c$ is called the center of the series.
To determine the domain of (96), let $x \in \mathbb{R} \setminus \{c\}$ and use the root test to see the series converges when
\[ \limsup |a_n(x - c)^n|^{1/n} = |x - c| \limsup |a_n|^{1/n} < 1 \]
and diverges when
\[ |x - c| \limsup |a_n|^{1/n} > 1. \]
If $r \limsup |a_n|^{1/n} \le 1$ for some $r \ge 0$, then these inequalities imply (96) is absolutely convergent when $|x - c| < r$. In other words, if
\[ R = \operatorname{lub}\{r : r \limsup |a_n|^{1/n} < 1\}, \tag{97} \]
then the domain of (96) is an interval of radius $R$ centered at $c$. The root test gives no information about convergence when $|x - c| = R$. This $R$ is called the radius of convergence of the power series. Assuming $R > 0$, the open interval centered at $c$ with radius $R$ is called the interval of convergence. It may be different from the domain of the series because the series may converge at one endpoint or both endpoints of the interval of convergence.
The ratio test can also be used to determine the radius of convergence, but, as shown in (31), it will not work as often as the root test. When it does,
\[ R = \operatorname{lub}\left\{r : r \lim_{n \to \infty} \left|\frac{a_{n+1}}{a_n}\right| < 1\right\}. \tag{98} \]
This is usually easier to compute than (97), and both will give the same value for $R$.
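Formula (97) suggests a rough numerical estimate of $R$: evaluate $1/|a_n|^{1/n}$ for large $n$. This is only a heuristic sketch, since a single term is not a lim sup; it is reasonable here because the illustrative coefficients $a_n = 2^n/n$ have roots that converge. The helper names are made up for this example, and logarithms are used to avoid overflow.

```python
import math

def radius_estimate(log_abs_coeff, n):
    """Estimate R by 1 / |a_n|**(1/n), given a function returning log|a_n|."""
    return math.exp(-log_abs_coeff(n) / n)

def log_a(n):
    """log of a_n = 2**n / n."""
    return n * math.log(2.0) - math.log(n)

estimates = [radius_estimate(log_a, n) for n in (10, 100, 1000, 10000)]
print(estimates)
```

The estimates decrease toward $1/2$, the radius of convergence of $\sum 2^n x^n/n$. The convergence is slow because of the factor $n^{1/n}$.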
EXAMPLE 9.12. Calling to mind Example 4.2, it is apparent the geometric power series $\sum_{n=0}^\infty x^n$ has center 0, radius of convergence 1 and domain $(-1,1)$.
EXAMPLE 9.13. For the power series $\sum_{n=1}^\infty 2^n(x+2)^n/n$, we compute
\[ \limsup \left(\frac{2^n}{n}\right)^{1/n} = 2 \implies R = \frac{1}{2}. \]
The interval of convergence is therefore $(-5/2, -3/2)$. At $x = -3/2$ the series is the harmonic series, which diverges, and at $x = -5/2$ it is the alternating harmonic series, which converges, so the domain is $[-5/2, -3/2)$.
EXAMPLE 9.14. The power series $\sum_{n=1}^\infty x^n/n$ has interval of convergence $(-1,1)$ and domain $[-1,1)$. Notice it is not absolutely convergent when $x = -1$.
EXAMPLE 9.15. The power series $\sum_{n=1}^\infty x^n/n^2$ has interval of convergence $(-1,1)$, domain $[-1,1]$ and is absolutely convergent on its whole domain.
T HEOREM 9.23. Let the power series be as in (96) and R be given by either (97)
or (98).
(a) If R = 0, then the domain of the series is {c}.
(b) If R > 0 the series converges absolutely at x when |c − x| < R and
diverges at x when |c−x| > R. In the case when R = ∞, the series converges
everywhere.
(c) If R ∈ (0, ∞), then the series may converge at none, one or both of c − R
and c + R.
PROOF. There is no generality lost in assuming the series has the form of (96) with $c = 0$. Let the radius of convergence be $R > 0$ and $K$ be a compact subset of $(-R,R)$ with $\alpha = \operatorname{lub}\{|x| : x \in K\}$. Choose $r \in (\alpha, R)$. If $x \in K$, then $|a_n x^n| \le |a_n r^n|$ for $n \in \mathbb{N}$. Since $\sum_{n=0}^\infty |a_n r^n|$ converges, the Weierstrass M-test shows $\sum_{n=0}^\infty a_n x^n$ is absolutely and uniformly convergent on $K$.
The following two corollaries are immediate consequences of Corollary 9.12
and Theorem 9.16, respectively.
The latter series satisfies the Alternating Series Test. Since $\pi^{15}/(15 \times 15!) \approx 1.5 \times 10^{-6}$, Corollary 4.20 shows
\[ \int_0^\pi f(x)\,dx \approx \sum_{n=0}^{6} \frac{(-1)^n}{(2n+1)(2n+1)!}\pi^{2n+1} \approx 1.85194 \]
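The arithmetic in this estimate is easy to reproduce. The sketch below evaluates the seven terms $n = 0, \dots, 6$ of the alternating series and the size of the first omitted term; the function name `term` is a label chosen here.

```python
import math

def term(n):
    """n-th term of the series: (-1)**n * pi**(2n+1) / ((2n+1) * (2n+1)!)."""
    k = 2 * n + 1
    return (-1) ** n * math.pi ** k / (k * math.factorial(k))

partial = sum(term(n) for n in range(7))   # n = 0, ..., 6
first_omitted = abs(term(7))               # pi**15 / (15 * 15!)
print(partial, first_omitted)
```

The first omitted term bounds the error of the partial sum, which is the content of the alternating series estimate being applied.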
\[ \limsup (na_n)^{1/n} = \limsup n^{1/n}a_n^{1/n} = \limsup a_n^{1/n}. \tag{100} \]
Now, suppose the power series $\sum_{n=0}^\infty a_n x^n$ has a nontrivial interval of convergence, $I$. Formally differentiating the power series term-by-term gives a new power series $\sum_{n=1}^\infty na_n x^{n-1}$. According to (100) and Theorem 9.23, the term-by-term differentiated series has the same interval of convergence as the original. Its partial sums are the derivatives of the partial sums of the original series and Theorem 9.24 guarantees they converge uniformly on any compact subset of $I$. Corollary 9.21 shows
\[ \frac{d}{dx}\sum_{n=0}^\infty a_n x^n = \sum_{n=0}^\infty \frac{d}{dx} a_n x^n = \sum_{n=1}^\infty na_n x^{n-1}, \quad \forall x \in I. \]
This process can be continued inductively to obtain the same results for all higher order derivatives. We have proved the following theorem.
THEOREM 9.27. If $f(x) = \sum_{n=0}^\infty a_n(x - c)^n$ is a power series with nontrivial interval of convergence, $I$, then $f$ is differentiable to all orders on $I$ with
\[ f^{(m)}(x) = \sum_{n=m}^\infty \frac{n!}{(n-m)!} a_n(x - c)^{n-m}. \tag{101} \]
8.3. Taylor Series. Suppose $f(x) = \sum_{n=0}^\infty a_n x^n$ has $I = (-R,R)$ as its interval of convergence for some $R > 0$. According to Theorem 9.27,
\[ f^{(m)}(0) = \frac{m!}{(m-m)!}a_m \implies a_m = \frac{f^{(m)}(0)}{m!}, \quad \forall m \in \omega. \]
Therefore,
\[ f(x) = \sum_{n=0}^\infty \frac{f^{(n)}(0)}{n!}x^n, \quad \forall x \in I. \]
The series (102) is called the Taylor series5 for f centered at c. The Taylor
series can be formally defined for any function that has derivatives of all orders at
c, but, as Example 7.9 shows, there is no guarantee it will converge to the function
anywhere except at c. Taylor’s Theorem 7.18 can be used to examine the question
of pointwise convergence. If f can be represented by a power series on an open
interval I , then f is said to be analytic on I .
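Set $s = f(1)$, $s_{-1} = 0$ and $s_n = \sum_{k=0}^n a_k$ for $n \in \omega$. For $|x| < 1$,
\[
\begin{aligned}
\sum_{k=0}^n a_k x^k &= \sum_{k=0}^n (s_k - s_{k-1})x^k \\
&= \sum_{k=0}^n s_k x^k - \sum_{k=1}^n s_{k-1}x^k \\
&= s_n x^n + \sum_{k=0}^{n-1} s_k x^k - x\sum_{k=0}^{n-1} s_k x^k \\
&= s_n x^n + (1 - x)\sum_{k=0}^{n-1} s_k x^k
\end{aligned}
\]
Let $\varepsilon > 0$. Choose $N \in \mathbb{N}$ such that whenever $n \ge N$, then $|s_n - s| < \varepsilon/2$. Choose $\delta \in (0,1)$ so
\[ \delta \sum_{k=0}^N |s_k - s| < \varepsilon/2. \]
Suppose $x$ is such that $1 - \delta < x < 1$. With these choices, (104) becomes
\[
\begin{aligned}
|f(x) - s| &\le \left|(1 - x)\sum_{k=0}^N (s_k - s)x^k\right| + \left|(1 - x)\sum_{k=N+1}^\infty (s_k - s)x^k\right| \\
&< \delta \sum_{k=0}^N |s_k - s| + \frac{\varepsilon}{2}\left|(1 - x)\sum_{k=N+1}^\infty x^k\right| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon
\end{aligned}
\]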
has $(-1,1)$ as its interval of convergence. If $0 \le |x| < 1$, then Corollary 9.17 justifies
\[ \arctan(x) = \int_0^x \frac{dt}{1 + t^2} = \int_0^x \sum_{n=0}^\infty (-1)^n t^{2n}\,dt = \sum_{n=0}^\infty \frac{(-1)^n}{2n+1}x^{2n+1}. \]
This series for the arctangent converges by the alternating series test when $x = 1$, so Theorem 9.29 implies
\[ \sum_{n=0}^\infty \frac{(-1)^n}{2n+1} = \lim_{x \uparrow 1} \arctan(x) = \arctan(1) = \frac{\pi}{4}. \tag{105} \]
A bit of rearranging gives the formula
\[ \pi = 4\left(1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots\right), \]
which is known as Gregory's series for $\pi$.
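Gregory's series converges, but slowly. A small sketch makes the rate visible; the function name `gregory` and the sample sizes are choices made here. By the alternating series estimate, the error after $N$ terms is at most the first omitted term, $4/(2N+1)$.

```python
import math

def gregory(n_terms):
    """4 * (1 - 1/3 + 1/5 - ...) using n_terms terms."""
    return 4.0 * sum((-1) ** n / (2 * n + 1) for n in range(n_terms))

errors = {n: abs(math.pi - gregory(n)) for n in (10, 100, 1000)}
print(errors)
```

Each tenfold increase in the number of terms buys roughly one more correct digit, so the series is of more theoretical than computational interest.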
Finally, Abel's theorem opens up an interesting idea for the summation of series. Suppose $\sum_{n=0}^\infty a_n$ is a series. The Abel sum of this series is
\[ A\sum_{n=0}^\infty a_n = \lim_{x \uparrow 1} \sum_{n=0}^\infty a_n x^n. \]
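Let $\varepsilon > 0$. According to (a) and Exercise 3.3.21, there is an $N \in \mathbb{N}$ such that
\[ n \ge N \implies n|a_n| < \frac{\varepsilon}{2} \quad\text{and}\quad \frac{1}{n}\sum_{k=0}^n k|a_k| < \frac{\varepsilon}{2}. \tag{107} \]
Let $n \ge N$ and $1 - 1/n < x < 1$. Using the right term in (107),
\[ (1 - x)\sum_{k=0}^n k|a_k| < \frac{1}{n}\sum_{k=0}^n k|a_k| < \frac{\varepsilon}{2}. \tag{108} \]
Using the left term in (107) gives
\[
\begin{aligned}
\sum_{k=n+1}^\infty |a_k|x^k &< \sum_{k=n+1}^\infty \frac{\varepsilon}{2k}x^k \\
&< \frac{\varepsilon}{2n}\frac{x^{n+1}}{1 - x} \\
&< \frac{\varepsilon}{2}.
\end{aligned} \tag{109}
\]
Combining (108) and (109) with (106) shows
\[ \left|s_n - \sum_{k=0}^\infty a_k x^k\right| < \varepsilon. \]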
9. Exercises
9.1. If $f_n(x) = nx(1 - x)^n$ for $0 \le x \le 1$, then show $f_n$ converges pointwise, but not uniformly on $[0,1]$.
9.2. Show $\sin^n x$ converges uniformly on $[0,a]$ for all $a \in (0, \pi/2)$. Does $\sin^n x$ converge uniformly on $[0, \pi/2)$?
9.3. Show that $\sum x^n$ converges uniformly on $[-r,r]$ when $0 < r < 1$, but not on $(-1,1)$.
9.4. Prove $\sum_{n=0}^\infty x^n/n!$ does not converge uniformly on $\mathbb{R}$.
$\sum_{n=0}^\infty e^{-nx}$ is uniformly convergent on any set of the form $[a, \infty)$ with $a > 0$.
9.14. Prove $\pi = 2\sqrt{3}\sum_{n=0}^\infty \frac{(-1)^n}{3^n(2n+1)}$. (This is the Madhava-Leibniz series which was used in the fourteenth century to compute $\pi$ to 11 decimal places.)
9.15. If $\exp(x) = \sum_{n=0}^\infty \frac{x^n}{n!}$, then $\frac{d}{dx}\exp(x) = \exp(x)$ for all $x \in \mathbb{R}$.
9.16. Is $\sum_{n=2}^\infty \frac{1}{n \ln n}$ Abel convergent?
Fourier Series
1. Trigonometric Polynomials
DEFINITION 10.1. A function of the form
\[ p(x) = \sum_{k=0}^n \alpha_k \cos kx + \beta_k \sin kx \tag{112} \]
is called a trigonometric polynomial. The largest value of $k$ such that $|\alpha_k| + |\beta_k| \ne 0$ is the degree of the polynomial. Denote by $\mathcal{T}$ the set of all trigonometric polynomials.
Evidently, all functions in T are 2π-periodic and T is closed under addition
and multiplication by real numbers. Indeed, it is a real vector space, in the sense
of linear algebra and the set {sin nx : n ∈ N} ∪ {cos nx : n ∈ ω} is a basis for T .
The following theorem can be proved using integration by parts or trigono-
metric identities.
THEOREM 10.2. If $m, n \in \mathbb{Z}$, then
\[ \int_{-\pi}^{\pi} \sin mx \cos nx\,dx = 0, \tag{113} \]
\[ \int_{-\pi}^{\pi} \sin mx \sin nx\,dx = \begin{cases} 0, & m \ne n \\ 0, & m = 0 \text{ or } n = 0 \\ \pi, & m = n \ne 0 \end{cases} \tag{114} \]
and
\[ \int_{-\pi}^{\pi} \cos mx \cos nx\,dx = \begin{cases} 0, & m \ne n \\ 2\pi, & m = n = 0 \\ \pi, & m = n \ne 0 \end{cases}. \tag{115} \]
1 Fourier's methods can be seen in most books on partial differential equations, such as [3]. For example, see solutions of the heat and wave equations using the method of separation of variables.
2Many people, including me, would argue that the study of Fourier series has been the
most important area of mathematical research over the past two centuries. Huge mathematical
disciplines, including set theory, measure theory and harmonic analysis trace their lineage back to
basic questions about Fourier series. Even after centuries of study, research in this area continues
unabated.
FIGURE 10.1. This shows $f(x) = |x|$, $s_1(x)$ and $s_3(x)$, where $s_n(x)$ is the $n$th partial sum of the Fourier series for $f$.
EXAMPLE 10.1. Let $f(x) = |x|$. Since $f$ is an even function and $\sin nx$ is odd,
\[ b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} |x| \sin nx\,dx = 0 \]
for all $n \in \mathbb{N}$. On the other hand,
\[ a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} |x| \cos nx\,dx = \begin{cases} \pi, & n = 0 \\ \dfrac{2(\cos n\pi - 1)}{n^2\pi}, & n \in \mathbb{N} \end{cases}. \]
Therefore,
\[ |x| \sim \frac{\pi}{2} - \frac{4\cos x}{\pi} - \frac{4\cos 3x}{9\pi} - \frac{4\cos 5x}{25\pi} - \frac{4\cos 7x}{49\pi} - \frac{4\cos 9x}{81\pi} - \cdots \]
(See Figure 10.1.)
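The partial sums of this series are easy to evaluate, which gives a feel for how well they track $|x|$. In the sketch below the function name `abs_partial_sum`, the sample points and the cutoff $n = 199$ are choices made here. Only odd $k$ contribute, since $\cos k\pi - 1 = 0$ for even $k \ge 2$.

```python
import math

def abs_partial_sum(x, n):
    """s_n(x) for f(x) = |x|: pi/2 minus the odd-k cosine terms with k <= n."""
    total = math.pi / 2.0
    for k in range(1, n + 1, 2):  # a_k = 0 for even k >= 2
        total -= 4.0 * math.cos(k * x) / (math.pi * k * k)
    return total

points = [-3.0, -1.0, 0.0, 0.5, 2.0]
gaps = [abs(abs_partial_sum(x, 199) - abs(x)) for x in points]
print(gaps)
```

The coefficients decay like $1/k^2$, so the gap is bounded by the tail $\sum_{k > 199,\ k \text{ odd}} 4/(\pi k^2)$, which is a little over $0.003$, at every point at once.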
There are at least two fundamental questions arising from (118): Does the
Fourier series of f converge to f ? Can f be recovered from its Fourier series,
even if the Fourier series does not converge to f ? These are often called the
convergence and representation questions, respectively. The next few sections
will give some partial answers.
PROOF. Since the two limits have similar proofs, only the first will be proved.
Let ε > 0 and P be a generic partition of [a, b] satisfying
$$0 < \int_a^b f - \underline{\mathcal{D}}(f, P) < \frac{\varepsilon}{2}.$$
For $m_i = \operatorname{glb}\{f(x) : x_{i-1} < x < x_i\}$, define a function g on [a, b] by $g(x) = m_i$ when
$x_{i-1} \le x < x_i$ and $g(b) = m_n$. Note that $\int_a^b g = \underline{\mathcal{D}}(f, P)$, so
$$0 < \int_a^b (f - g) < \frac{\varepsilon}{2}. \tag{119}$$
Choose
$$\alpha > \frac{4}{\varepsilon}\sum_{i=1}^{n} |m_i|. \tag{120}$$
Since f ≥ g,
$$\begin{aligned}
\left|\int_a^b f(t)\cos\alpha t\,dt\right|
&= \left|\int_a^b (f(t) - g(t))\cos\alpha t\,dt + \int_a^b g(t)\cos\alpha t\,dt\right| \\
&\le \left|\int_a^b (f(t) - g(t))\cos\alpha t\,dt\right| + \left|\int_a^b g(t)\cos\alpha t\,dt\right| \\
&\le \int_a^b (f - g) + \left|\frac{1}{\alpha}\sum_{i=1}^{n} m_i\left(\sin(\alpha x_i) - \sin(\alpha x_{i-1})\right)\right| \\
&\le \int_a^b (f - g) + \frac{2}{\alpha}\sum_{i=1}^{n} |m_i|
\end{aligned}$$
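The chain of inequalities above bounds $\left|\int_a^b f(t)\cos\alpha t\,dt\right|$ in terms of quantities that shrink as α grows. A small numerical sketch of that decay follows, under the assumption that f(t) = t on [0, 1]; this f is chosen here only because its integral has a simple closed form, and the function names are hypothetical.

```python
import math

def oscillatory_integral(alpha, steps=100000):
    """Midpoint-rule approximation of the integral of t*cos(alpha*t) over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t * math.cos(alpha * t)
    return total * h

def exact(alpha):
    """Closed form of the same integral, obtained by integrating by parts."""
    return math.sin(alpha) / alpha + (math.cos(alpha) - 1) / alpha ** 2

# The values shrink in magnitude roughly like 1/alpha.
for alpha in (1, 10, 100, 1000):
    print(alpha, oscillatory_integral(alpha), exact(alpha))
```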
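$$\begin{aligned}
s_n(x) &= \frac{a_0}{2} + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx) \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,dt + \sum_{k=1}^{n} \frac{1}{\pi}\int_{-\pi}^{\pi} \left(f(t)\cos kt \cos kx + f(t)\sin kt \sin kx\right) dt \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\left(1 + \sum_{k=1}^{n} 2(\cos kt \cos kx + \sin kt \sin kx)\right) dt \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\left(1 + \sum_{k=1}^{n} 2\cos k(x - t)\right) dt \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x - s)\left(1 + \sum_{k=1}^{n} 2\cos ks\right) ds
\end{aligned} \tag{121}$$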
is called the Dirichlet kernel. Its properties will prove useful for determining the
pointwise convergence of Fourier series.
(d) $\frac{1}{2\pi}\int_{-\pi}^{\pi} D_n(s)\,ds = 1$ for each n ∈ N.
(e) $D_n(s) = \dfrac{\sin(n + 1/2)s}{\sin s/2}$ for each n ∈ N and s/2 not an integer multiple
of π.
Use the facts that the cosine is even and the sine is odd.
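$$\begin{aligned}
&= \sum_{k=-n}^{n} \cos ks + \frac{\cos\frac{s}{2}}{\sin\frac{s}{2}} \sum_{k=-n}^{n} \sin ks \\
&= \frac{1}{\sin\frac{s}{2}} \sum_{k=-n}^{n} \left(\sin\frac{s}{2}\cos ks + \cos\frac{s}{2}\sin ks\right) \\
&= \frac{1}{\sin\frac{s}{2}} \sum_{k=-n}^{n} \sin\left(k + \frac{1}{2}\right)s \\
&= \frac{\sin\left(n + \frac{1}{2}\right)s}{\sin\frac{s}{2}}
\end{aligned}$$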
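Properties (d) and (e) can be spot-checked numerically. The sketch below assumes that $D_n(s)$ is the cosine sum $1 + 2\sum_{k=1}^{n}\cos ks$, which is the factor appearing in (121); the function names are hypothetical.

```python
import math

def dirichlet_sum(n, s):
    """D_n(s) written as the cosine sum 1 + 2*sum_{k=1}^n cos(ks)."""
    return 1 + 2 * sum(math.cos(k * s) for k in range(1, n + 1))

def dirichlet_closed(n, s):
    """Closed form sin((n + 1/2)s) / sin(s/2); requires s/2 not a multiple of pi."""
    return math.sin((n + 0.5) * s) / math.sin(s / 2)

def dirichlet_mean(n, steps=4000):
    """(1/(2*pi)) times the integral of D_n over [-pi, pi], by the midpoint rule."""
    h = 2 * math.pi / steps
    total = sum(dirichlet_sum(n, -math.pi + (i + 0.5) * h) for i in range(steps))
    return total * h / (2 * math.pi)

# The two forms of D_n(s) should agree, and the mean value should be 1.
for n in (1, 4, 9):
    for s in (0.3, 1.7, -2.5):
        print(n, s, dirichlet_sum(n, s), dirichlet_closed(n, s))
print(dirichlet_mean(5))
```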
According to (121),
$$s_n(f, x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x - t)\,D_n(t)\,dt.$$
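As a consistency check on this convolution formula, the partial sum can be computed in two ways: from the Fourier coefficients, and from the integral against the Dirichlet kernel. The sketch below uses f(x) = |x| and the coefficients from Example 10.1; the function names are hypothetical, and $D_n$ is taken to be the cosine sum appearing in (121).

```python
import math

def f(x):
    """2*pi-periodic extension of |x| from [-pi, pi)."""
    y = (x + math.pi) % (2 * math.pi) - math.pi
    return abs(y)

def dirichlet(n, t):
    """D_n(t) as the cosine sum 1 + 2*sum_{k=1}^n cos(kt)."""
    return 1 + 2 * sum(math.cos(k * t) for k in range(1, n + 1))

def s_from_coefficients(x, n):
    """s_n(x) built from a_0/2 = pi/2 and the a_k of Example 10.1."""
    total = math.pi / 2
    for k in range(1, n + 1):
        a_k = 2 * (math.cos(k * math.pi) - 1) / (k * k * math.pi)
        total += a_k * math.cos(k * x)
    return total

def s_from_convolution(x, n, steps=20000):
    """(1/(2*pi)) * integral of f(x - t) D_n(t) over [-pi, pi], midpoint rule."""
    h = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        t = -math.pi + (i + 0.5) * h
        total += f(x - t) * dirichlet(n, t)
    return total * h / (2 * math.pi)

print(s_from_coefficients(1.0, 5), s_from_convolution(1.0, 5))
```

The two printed numbers should agree up to quadrature error.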
This is similar to a situation we’ve seen before within the proof of the Weierstrass
approximation theorem, Theorem 9.13. The integral given above is a
convolution integral similar to that used in the proof of Theorem 9.13, although
the Dirichlet kernel isn’t a convolution kernel in the sense of Lemma 9.14 because
it doesn’t satisfy conditions (a) and (c) of that lemma. (See Figure 10.3.)
then
$$\frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx) = s.$$
FIGURE 10.4. This plot shows the function of Example 10.2 and
$s_8(x)$ for that function.
so
$$\sum_{n=1}^{\infty} \frac{2}{n}(-1)^{n+1} \sin nx = x \quad \text{for } -\pi < x < \pi. \tag{123}$$
In particular, when x = π/2, (123) gives another way to derive (105). When x = π,
the series converges to 0, which is the middle of the “jump” for f .
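The behavior of (123) at these points can be observed by summing finitely many terms. This is a minimal sketch; the name `partial_sum` is hypothetical, and the number of terms is an arbitrary choice, since the series converges slowly.

```python
import math

def partial_sum(x, terms):
    """Sum of the first `terms` terms of the series in (123)."""
    return sum(2 / n * (-1) ** (n + 1) * math.sin(n * x) for n in range(1, terms + 1))

# At x = pi/2 the sum reduces to 2(1 - 1/3 + 1/5 - ...), which approaches pi/2.
print(partial_sum(math.pi / 2, 20001), math.pi / 2)

# At x = pi every term is sin(n*pi) = 0 up to rounding, the middle of the jump.
print(partial_sum(math.pi, 500))
```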
If both of the integrals on the right are finite, then the integral on the left is also
finite. This amounts to a proof of the following corollary.
5. Gibbs Phenomenon
For x ∈ [−π, π) define
$$f(x) = \begin{cases} \dfrac{|x|}{x}, & 0 < |x| < \pi \\ 0, & x = 0, \pi \end{cases} \tag{124}$$
and extend f 2π-periodically to all of R. This function is often called a square
wave. A straightforward calculation gives
$$f \sim \frac{4}{\pi}\sum_{k=1}^{\infty} \frac{\sin(2k - 1)x}{2k - 1}.$$
Corollary 10.7 shows $s_n(x) \to f(x)$ everywhere. This convergence cannot be uniform
because all the partial sums are continuous and f is discontinuous at every
integer multiple of π. A plot of $s_{19}(x)$ is shown in Figure 10.5. Notice the higher
peaks in the oscillation of $s_n(x)$ just before and after the jump discontinuities
of f. This behavior is not unique to f, as it can also be seen in Figure 10.4. If a
function is discontinuous at some point, the partial sums of its Fourier series
will always have such higher peaks near that point. This behavior is called Gibbs
phenomenon.³
Instead of doing a general analysis of Gibbs phenomenon, we’ll only analyze
the simple case shown in the square wave f . It’s basically a calculus exercise.
3 It is named after the American mathematical physicist, J. W. Gibbs, who pointed it out in 1899.
He was not the first to notice the phenomenon, as the British mathematician Henry Wilbraham
had published a little-noticed paper on it in 1848. Gibbs’ interest in the phenomenon was sparked
by investigations of the American experimental physicist A. A. Michelson, who wrote a letter to
Nature disputing the possibility that a Fourier series could converge to a discontinuous function.
The ensuing imbroglio is recounted in a marvelous book by Paul J. Nahin [16].
The last sum is a midpoint Riemann sum for the function $\frac{\sin x}{x}$ on the interval
[0, π] using a regular partition of n subintervals. Example 9.16 shows
$$\frac{2}{\pi}\int_0^{\pi} \frac{\sin x}{x}\,dx \approx 1.17898.$$
Since f (0+) − f (0−) = 2, this is an overshoot of a bit less than 9%. There is a
similar undershoot on the other side. It turns out this is typical behavior at points
of discontinuity [21].
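The overshoot can be reproduced numerically for the square wave (124). The sketch below evaluates the sum of the first n terms of its series at x = π/(2n), where the sum becomes the midpoint Riemann sum described above, and compares it with a direct quadrature of the limiting integral. The function names and step counts are assumptions made for illustration.

```python
import math

def square_wave_sum(x, n):
    """(4/pi) * sum_{k=1}^n sin((2k-1)x)/(2k-1), a partial sum of the series for (124)."""
    return 4 / math.pi * sum(math.sin((2 * k - 1) * x) / (2 * k - 1) for k in range(1, n + 1))

def sinc_integral(steps=100000):
    """Midpoint-rule value of (2/pi) times the integral of sin(x)/x over [0, pi]."""
    h = math.pi / steps
    total = sum(math.sin((i + 0.5) * h) / ((i + 0.5) * h) for i in range(steps))
    return 2 / math.pi * h * total

limit = sinc_integral()
for n in (10, 100, 1000):
    # Height of the partial sum at x = pi/(2n), just to the right of the jump at 0.
    print(n, square_wave_sum(math.pi / (2 * n), n))
print("limit:", limit)
print("overshoot as a fraction of the jump of size 2:", (limit - 1) / 2)
```

The peak heights stay near 1.179 as n grows instead of settling down to 1, which is the overshoot of a bit less than 9% of the jump.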
We’ll not have much use for the conjugate Dirichlet kernel, except as a convenient
way to refer to sums of the form (125).
Lemma 10.8 immediately gives the following bound.
$$\left|\tilde{D}_n(t)\right| \le \frac{1}{\left|\sin\frac{t}{2}\right|}.$$
6.2. A Sawtooth Wave. If the function f(x) = (π − x)/2 on [0, 2π) is extended
2π-periodically to R, then the graph of the resulting function is often referred to
as a “sawtooth wave”. It has a particularly nice Fourier series:
$$\frac{\pi - x}{2} \sim \sum_{k=1}^{\infty} \frac{\sin kx}{k}.$$
According to Corollary 10.7,
$$\sum_{k=1}^{\infty} \frac{\sin kx}{k} = \begin{cases} 0, & x = 2n\pi,\ n \in \mathbb{Z} \\ f(x), & \text{otherwise} \end{cases}.$$
We’re interested in various partial sums of this series.
PROOF.
$$\left|\sum_{k=m}^{n} \frac{\sin kt}{k}\right| = \left|\sum_{k=m}^{n} \left(\tilde{D}_k(t) - \tilde{D}_{k-1}(t)\right)\frac{1}{k}\right|$$
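PROOF.
$$\begin{aligned}
\left|\sum_{k=1}^{n} \frac{\sin kt}{k}\right|
&= \left|\sum_{1 \le k \le \frac{1}{t}} \frac{\sin kt}{k} + \sum_{\frac{1}{t} < k \le n} \frac{\sin kt}{k}\right| \\
&\le \left|\sum_{1 \le k \le \frac{1}{t}} \frac{\sin kt}{k}\right| + \left|\sum_{\frac{1}{t} < k \le n} \frac{\sin kt}{k}\right| \\
&\le \sum_{1 \le k \le \frac{1}{t}} \frac{kt}{k} + \frac{1}{\frac{1}{t}\sin\frac{t}{2}} \\
&\le \sum_{1 \le k \le \frac{1}{t}} t + \frac{1}{\frac{1}{t}\cdot\frac{2}{\pi}\cdot\frac{t}{2}} \\
&\le 1 + \pi
\end{aligned}$$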
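The uniform bound 1 + π on these partial sums can be observed numerically. The following sketch scans a grid of t in (0, π) and all n up to a cutoff; the grid size, the cutoff and the function name are arbitrary choices made for illustration, so the scan is evidence for the bound and not a proof of it.

```python
import math

def max_partial_sum(t, n_max):
    """Largest |sum_{k=1}^n sin(kt)/k| over 1 <= n <= n_max."""
    total = 0.0
    worst = 0.0
    for k in range(1, n_max + 1):
        total += math.sin(k * t) / k
        worst = max(worst, abs(total))
    return worst

# Scan t = pi*j/400 for j = 1, ..., 399 and n up to 500.
grid = [math.pi * j / 400 for j in range(1, 400)]
worst = max(max_partial_sum(t, 500) for t in grid)
print(worst, 1 + math.pi)
```

On this grid the largest value found stays well below 1 + π, so the bound in the proof is comfortable and not sharp.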
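$$\begin{aligned}
&= \sum_{n=1}^{\infty} \frac{1}{n^2}\,\frac{1}{2\pi}\int_{-\pi}^{\pi} f_{2^{n^3}}(t)\,D_{2^{m^3}}(t)\,dt \\
&= \sum_{n=1}^{\infty} \frac{1}{n^2}\, s_{2^{m^3}}\left(f_{2^{n^3}}, 0\right) \\
&= \sum_{n=m}^{\infty} \frac{1}{n^2}\, s_{2^{m^3}}\left(f_{2^{n^3}}, 0\right) && \text{Use (126).} \\
&> \frac{1}{m^2}\, s_{2^{m^3}}\left(f_{2^{m^3}}, 0\right) \\
&= \frac{1}{m^2} \sum_{k=1}^{2^{m^3}} \frac{1}{k} \\
&> \frac{1}{m^2} \ln 2^{m^3} \\
&= m \ln 2
\end{aligned}$$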
This implies $\limsup s_n(F, 0) \ge \lim_{m\to\infty} m \ln 2 = \infty$, so $s_n(F, 0)$ does not converge.
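$$\begin{aligned}
\sigma_n(x) &= \frac{1}{n+1}\sum_{k=0}^{n} s_k(x) \\
&= \frac{1}{n+1}\sum_{k=0}^{n} \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x - t)\,D_k(t)\,dt \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x - t)\,\frac{1}{n+1}\sum_{k=0}^{n} D_k(t)\,dt && (*) \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x - t)\,\frac{1}{n+1}\sum_{k=0}^{n} \frac{\sin(k + 1/2)t}{\sin t/2}\,dt \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x - t)\,\frac{1}{(n+1)\sin^2 t/2}\sum_{k=0}^{n} \sin t/2\,\sin(k + 1/2)t\,dt \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x - t)\,\frac{1/2}{(n+1)\sin^2 t/2}\sum_{k=0}^{n} \left(\cos kt - \cos(k+1)t\right) dt \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x - t)\,\frac{1/2}{(n+1)\sin^2 t/2}\left(1 - \cos(n+1)t\right) dt
\end{aligned}$$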
$$\sigma_n(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x - t)\,K_n(t)\,dt.$$
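The telescoping computation above suggests that the average of $D_0, \dots, D_n$ equals $\frac{1}{n+1}\left(\frac{\sin\frac{n+1}{2}t}{\sin\frac{t}{2}}\right)^2$, since $1 - \cos(n+1)t = 2\sin^2\frac{n+1}{2}t$. The sketch below compares the two expressions numerically and checks that the kernel has mean value 1. The function names are hypothetical, and $D_k$ is again taken to be the cosine sum from (121).

```python
import math

def dirichlet(k, t):
    """D_k(t) as the cosine sum 1 + 2*sum_{j=1}^k cos(jt)."""
    return 1 + 2 * sum(math.cos(j * t) for j in range(1, k + 1))

def fejer_average(n, t):
    """K_n(t) computed as the average of D_0(t), ..., D_n(t)."""
    return sum(dirichlet(k, t) for k in range(n + 1)) / (n + 1)

def fejer_closed(n, t):
    """(1/(n+1)) * (sin((n+1)t/2) / sin(t/2))**2, for t not a multiple of 2*pi."""
    return (math.sin((n + 1) * t / 2) / math.sin(t / 2)) ** 2 / (n + 1)

def fejer_mean(n, steps=2000):
    """(1/(2*pi)) times the integral of K_n over [-pi, pi], by the midpoint rule."""
    h = 2 * math.pi / steps
    total = sum(fejer_average(n, -math.pi + (i + 0.5) * h) for i in range(steps))
    return total * h / (2 * math.pi)

for n in (0, 3, 8):
    for t in (0.4, -1.3, 2.9):
        print(n, t, fejer_average(n, t), fejer_closed(n, t))
print(fejer_mean(6))
```

Unlike the Dirichlet kernel, the closed form is a square, so the values are never negative.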
PROOF. Theorem 10.5 and (129) imply (a), (b) and (d). Equation (128) implies
(c).
Let δ be as in (e). In light of (a), it suffices to prove (e) for the interval [δ, π].
Noting that sin t/2 is increasing on [δ, π], it follows that for δ ≤ t ≤ π,
$$\begin{aligned}
K_n(t) &= \frac{1}{n+1}\left(\frac{\sin\frac{n+1}{2}t}{\sin\frac{t}{2}}\right)^2 \\
&\le \frac{1}{n+1}\left(\frac{1}{\sin\frac{t}{2}}\right)^2 \\
&\le \frac{1}{n+1}\,\frac{1}{\sin^2\frac{\delta}{2}} \to 0
\end{aligned}$$
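$$\frac{1}{2\pi}\int_{-\pi}^{-\delta} K_n(t)\,dt < \frac{\varepsilon}{3M} \quad \text{and} \quad \frac{1}{2\pi}\int_{\delta}^{\pi} K_n(t)\,dt < \frac{\varepsilon}{3M}.$$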
5 Compare this theorem with Lemma 9.14.
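We start calculating.
$$\begin{aligned}
|\sigma_n(x) - f(x)|
&= \left|\frac{1}{2\pi}\int_{-\pi}^{\pi} f(x - t)K_n(t)\,dt - \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)K_n(t)\,dt\right| \\
&= \frac{1}{2\pi}\left|\int_{-\pi}^{\pi} (f(x - t) - f(x))K_n(t)\,dt\right| \\
&= \frac{1}{2\pi}\left|\int_{-\pi}^{-\delta} (f(x - t) - f(x))K_n(t)\,dt + \int_{-\delta}^{\delta} (f(x - t) - f(x))K_n(t)\,dt \right. \\
&\qquad\quad \left. {} + \int_{\delta}^{\pi} (f(x - t) - f(x))K_n(t)\,dt\right| \\
&\le \left|\frac{1}{2\pi}\int_{-\pi}^{-\delta} (f(x - t) - f(x))K_n(t)\,dt\right| + \left|\frac{1}{2\pi}\int_{-\delta}^{\delta} (f(x - t) - f(x))K_n(t)\,dt\right| \\
&\qquad {} + \left|\frac{1}{2\pi}\int_{\delta}^{\pi} (f(x - t) - f(x))K_n(t)\,dt\right| \\
&< \frac{M}{2\pi}\int_{-\pi}^{-\delta} K_n(t)\,dt + \frac{1}{2\pi}\int_{-\delta}^{\delta} |f(x - t) - f(x)|K_n(t)\,dt + \frac{M}{2\pi}\int_{\delta}^{\pi} K_n(t)\,dt \\
&< M\frac{\varepsilon}{3M} + \frac{\varepsilon}{3}\,\frac{1}{2\pi}\int_{-\delta}^{\delta} K_n(t)\,dt + M\frac{\varepsilon}{3M} < \varepsilon
\end{aligned}$$
This shows $\sigma_n(x) \to f(x)$.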
EXAMPLE 10.4. As in Example 10.2, let f(x) = x for −π < x ≤ π and extend f to
be periodic on R with period 2π. Figure 10.9 shows the difference between the Fejér
and classical methods of summation. Notice that the Fejér sums remain much
more smoothly affixed to the function and do not show Gibbs phenomenon.
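The contrast described in Example 10.4 can also be seen in numbers. The sketch below uses the sine coefficients $b_k = \frac{2}{k}(-1)^{k+1}$ from (123) and computes both the classical partial sum $s_n(x)$ and the Fejér mean $\sigma_n(x) = \frac{1}{n+1}\sum_{k=0}^{n} s_k(x)$ on a grid. The grid, the choice n = 50 and the function name are assumptions made for illustration.

```python
import math

def classical_and_fejer(x, n):
    """Return (s_n(x), sigma_n(x)) for the 2*pi-periodic extension of f(x) = x."""
    partial = 0.0      # s_0(x) = 0, since a_0 = 0 for this odd function
    running = partial  # accumulates s_0(x) + s_1(x) + ... + s_k(x)
    for k in range(1, n + 1):
        partial += 2 / k * (-1) ** (k + 1) * math.sin(k * x)
        running += partial
    return partial, running / (n + 1)

n = 50
grid = [math.pi * j / 2000 for j in range(2001)]
values = [classical_and_fejer(x, n) for x in grid]
max_classical = max(s for s, _ in values)
max_fejer = max(sigma for _, sigma in values)

# f never exceeds pi; the classical sums overshoot it near x = pi, the Fejer means do not.
print(max_classical, max_fejer, math.pi)
```

The Fejér means cannot exceed the largest value of |f| because the Fejér kernel is nonnegative with mean value 1, which is the property the Dirichlet kernel lacks.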
8. Exercises
10.4. If f (x) = sgn(x) on [−π, π), then find the Fourier series for f .
10.5. Is $\sum_{n=1}^{\infty} \sin nx$ the Fourier series of some function?
10.7. Use Exercise 10.6 to prove $\dfrac{\pi^2}{6} = \sum_{n=1}^{\infty} \dfrac{1}{n^2}$.
10.8. Suppose f is integrable on [−π, π]. If f is even, then the Fourier series for f
has no sine terms. If f is odd, then the Fourier series for f has no cosine terms.
10.10. Prove
$$\sum_{k=1}^{n} \cos(2k - 1)t = \frac{\sin 2nt}{2\sin t}$$
for t ≠ kπ, k ∈ Z and n ∈ N. (Hint: $2\sum_{k=1}^{n} \cos(2k - 1)t = D_{2n}(t) - D_n(2t)$.)
10.11. The function g (t ) = t / sin(t /2) is undefined whenever t = 2nπ for some
n ∈ Z. Show that it can be redefined on the set {2nπ : n ∈ Z} to be periodic and
uniformly continuous on R.
10.13. Prove the Weierstrass approximation theorem using Fourier series and
Taylor’s theorem.
[1] A.D. Aczel. The mystery of the Aleph: mathematics, the Kabbalah, and the search for infinity.
Washington Square Press, 2001.
[2] Carl B. Boyer. The History of the Calculus and Its Conceptual Development. Dover Publications,
Inc., 1959.
[3] James Ward Brown and Ruel V. Churchill. Fourier Series and Boundary Value Problems.
McGraw-Hill, New York, 8th edition, 2012.
[4] Andrew Bruckner. Differentiation of Real Functions, volume 5 of CRM Monograph Series.
American Mathematical Society, 1994.
[5] Georg Cantor. Über eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen. Journal
für die Reine und Angewandte Mathematik, 77:258–262, 1874.
[6] Georg Cantor. De la puissance des ensembles parfaits de points. Acta Math., 4:381–392, 1884.
[7] Lennart Carleson. On convergence and growth of partial sums of Fourier series. Acta Math.,
116:135–157, 1966.
[8] Krzysztof Ciesielski. Set Theory for the Working Mathematician. Number 39 in London
Mathematical Society Student Texts. Cambridge University Press, 1998.
[9] Joseph W. Dauben. Georg Cantor and the Origins of Transfinite Set Theory. Sci. Am., 248(6):122–
131, June 1983.
[10] Gerald A. Edgar, editor. Classics on Fractals. Addison-Wesley, 1993.
[11] James Foran. Fundamentals of Real Analysis. Marcel-Dekker, 1991.
[12] Nathan Jacobson. Basic Algebra I. W. H. Freeman and Company, 1974.
[13] Yitzhak Katznelson. An Introduction to Harmonic Analysis. Cambridge University Press,
Cambridge, UK, 3rd edition, 2004.
[14] Charles C. Mumma. n! and The Root Test. Amer. Math. Monthly, 93(7):561, August-September
1986.
[15] James R. Munkres. Topology; a first course. Prentice-Hall, Englewood Cliffs, N.J., 1975.
[16] Paul J. Nahin. Dr. Euler’s Fabulous Formula: Cures Many Mathematical Ills. Princeton
University Press, Princeton, NJ, 2006.
[17] Paul du Bois-Reymond. Ueber die Fourierschen Reihen. Nachr. Kön. Ges. Wiss. Göttingen,
21:571–582, 1873.
[18] Hans Samelson. More on Kummer’s test. The American Mathematical Monthly, 102(9):817–818,
1995.
[19] Jingcheng Tong. Kummer’s test gives characterizations for convergence or divergence of all
positive series. The American Mathematical Monthly, 101(5):450–452, 1994.
[20] Karl Weierstrass. Über continuirliche Functionen eines reellen Arguments, die für keinen Werth
des Letzteren einen bestimmten Differentialquotienten besitzen, volume 2 of Mathematische
Werke von Karl Weierstrass, pages 71–74. Mayer & Müller, Berlin, 1895.
[21] Antoni Zygmund. Trigonometric Series. Cambridge University Press, Cambridge, UK, 3rd
edition, 2002.