A Brief Introduction To Measure Theory and Integration. Bass, Richard F. 1998


A Brief Introduction to

Measure Theory and Integration


Richard F. Bass
Department of Mathematics
University of Connecticut
September 18, 1998

These notes are © 1998 by Richard Bass. They may be used for personal use or class use, but not for commercial purposes.
1. Measures.
Let $X$ be a set. We will use the notation $A^c = \{x \in X : x \notin A\}$ and $A - B = A \cap B^c$.
Definition. An algebra or a field is a collection $\mathcal{A}$ of subsets of $X$ such that
(a) $\emptyset, X \in \mathcal{A}$;
(b) if $A \in \mathcal{A}$, then $A^c \in \mathcal{A}$;
(c) if $A_1, \ldots, A_n \in \mathcal{A}$, then $\cup_{i=1}^n A_i$ and $\cap_{i=1}^n A_i$ are in $\mathcal{A}$.
$\mathcal{A}$ is a $\sigma$-algebra or $\sigma$-field if in addition
(d) if $A_1, A_2, \ldots$ are in $\mathcal{A}$, then $\cup_{i=1}^\infty A_i$ and $\cap_{i=1}^\infty A_i$ are in $\mathcal{A}$.
In (d) we allow countable unions and intersections only; we do not allow uncountable unions and intersections.
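Example. Let $X = \mathbb{R}$ and $\mathcal{A}$ be the collection of all subsets of $\mathbb{R}$.
Example. Let $X = \mathbb{R}$ and let $\mathcal{A} = \{A \subset \mathbb{R} : A \text{ is countable or } A^c \text{ is countable}\}$.
Definition. A measure on $(X, \mathcal{A})$ is a function $\mu : \mathcal{A} \to [0, \infty]$ such that
(a) $\mu(A) \geq 0$ for all $A \in \mathcal{A}$;
(b) $\mu(\emptyset) = 0$;
(c) if $A_i \in \mathcal{A}$ are disjoint, then
$$\mu(\cup_{i=1}^\infty A_i) = \sum_{i=1}^\infty \mu(A_i).$$
Example. $X$ is any set, $\mathcal{A}$ is the collection of all subsets, and $\mu(A)$ is the number of elements in $A$.
Example. $X = \mathbb{R}$, $\mathcal{A}$ the collection of all subsets, $x_1, x_2, \ldots \in \mathbb{R}$, $a_1, a_2, \ldots > 0$, and $\mu(A) = \sum_{\{i : x_i \in A\}} a_i$.
Example. $\delta_x(A) = 1$ if $x \in A$ and 0 otherwise. This measure is called point mass at $x$.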
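The three examples above can be tried out on a small finite set. The following is a minimal sketch, not part of the original notes: the helper names and the integer weights are hypothetical, and on a finite set countable additivity reduces to the finite additivity that the code checks.

```python
from itertools import chain, combinations

def counting_measure(A):
    """Number of elements of the finite set A."""
    return len(A)

def point_mass(x):
    """Return the set function A -> 1 if x is in A, else 0."""
    return lambda A: 1 if x in A else 0

def weighted_measure(weights):
    """weights maps points x_i to masses a_i > 0; mu(A) sums a_i over x_i in A."""
    return lambda A: sum(a for x, a in weights.items() if x in A)

def is_finitely_additive(mu, X):
    """Check mu(A | B) == mu(A) + mu(B) for all disjoint subsets A, B of X."""
    elems = list(X)
    subsets = [frozenset(c) for c in chain.from_iterable(
        combinations(elems, r) for r in range(len(elems) + 1))]
    return all(mu(A | B) == mu(A) + mu(B)
               for A in subsets for B in subsets if not (A & B))

X = {1, 2, 3, 4}
mu = weighted_measure({1: 2, 3: 5})   # hypothetical integer weights
```

A set function such as `lambda A: len(A) ** 2` fails the same check, which shows that additivity is a genuine restriction.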
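Proposition 1.1. The following hold:
(a) If $A, B \in \mathcal{A}$ with $A \subset B$, then $\mu(A) \leq \mu(B)$.
(b) If $A_i \in \mathcal{A}$ and $A = \cup_{i=1}^\infty A_i$, then $\mu(A) \leq \sum_{i=1}^\infty \mu(A_i)$.
(c) If $A_i \in \mathcal{A}$, $A_1 \subset A_2 \subset \cdots$, and $A = \cup_{i=1}^\infty A_i$, then $\mu(A) = \lim_{n \to \infty} \mu(A_n)$.
(d) If $A_i \in \mathcal{A}$, $A_1 \supset A_2 \supset \cdots$, $\mu(A_1) < \infty$, and $A = \cap_{i=1}^\infty A_i$, then we have $\mu(A) = \lim_{n \to \infty} \mu(A_n)$.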
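Proof. (a) Let $A_1 = A$, $A_2 = B - A$, and $A_3 = A_4 = \cdots = \emptyset$. Now use part (c) of the definition of measure.
(b) Let $B_1 = A_1$, $B_2 = A_2 - B_1$, $B_3 = A_3 - (B_1 \cup B_2)$, and so on. The $B_i$ are disjoint and $\cup_{i=1}^\infty B_i = \cup_{i=1}^\infty A_i$. So $\mu(A) = \sum_{i=1}^\infty \mu(B_i) \leq \sum_{i=1}^\infty \mu(A_i)$.
(c) Define the $B_i$ as in (b). Since $\cup_{i=1}^n B_i = \cup_{i=1}^n A_i$, then
$$\mu(A) = \mu(\cup_{i=1}^\infty A_i) = \mu(\cup_{i=1}^\infty B_i) = \sum_{i=1}^\infty \mu(B_i) = \lim_{n \to \infty} \sum_{i=1}^n \mu(B_i) = \lim_{n \to \infty} \mu(\cup_{i=1}^n B_i) = \lim_{n \to \infty} \mu(\cup_{i=1}^n A_i).$$
(d) Apply (c) to the sets $A_1 - A_i$, $i = 1, 2, \ldots$.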
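The construction of the disjoint sets $B_i$ used in parts (b) and (c) can be sketched on finite sets. This is an illustration only: the function name `disjointify` and the sample sets are hypothetical, and counting measure stands in for a general $\mu$.

```python
def disjointify(sets):
    """Given A_1, A_2, ..., return B_i = A_i minus (B_1 u ... u B_{i-1})."""
    seen = set()      # union of the B_j built so far (equal to the union of the A_j)
    result = []
    for A in sets:
        B = set(A) - seen
        result.append(B)
        seen |= B
    return result

A = [{1, 2}, {2, 3}, {1, 3, 4}, {5}]   # hypothetical overlapping sets
B = disjointify(A)
```

With counting measure, the sizes of the $B_i$ add up to the size of the union, while the sizes of the $A_i$ can only overshoot it, which is the inequality in (b).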

Definition. A probability or probability measure is a measure such that $\mu(X) = 1$. In this case we usually write $(\Omega, \mathcal{F}, \mathbb{P})$ instead of $(X, \mathcal{A}, \mu)$.
2. Construction of Lebesgue measure.
Define $m((a, b)) = b - a$. If $G$ is an open set and $G \subset \mathbb{R}$, then $G = \cup_{i=1}^\infty (a_i, b_i)$ with the intervals disjoint. Define $m(G) = \sum_{i=1}^\infty (b_i - a_i)$. If $A \subset \mathbb{R}$, define
$$m^*(A) = \inf\{m(G) : G \text{ open}, A \subset G\}.$$
We will show the following.
(1) $m^*$ is not a measure on the collection of all subsets of $\mathbb{R}$.
(2) $m^*$ is a measure on the $\sigma$-algebra consisting of what are known as $m^*$-measurable sets.
(3) Let $\mathcal{A}_0$ be the algebra (not $\sigma$-algebra) consisting of all finite unions of sets of the form $(a_i, b_i]$. If $\mathcal{A}$ is the smallest $\sigma$-algebra containing $\mathcal{A}_0$, then $m^*$ is a measure on $(\mathbb{R}, \mathcal{A})$.
We will prove these three facts (and a bit more) in a moment, but let's first make some remarks about the consequences of (1)-(3).
If you take any collection of $\sigma$-algebras and take their intersection, it is easy to see that this will again be a $\sigma$-algebra. The smallest $\sigma$-algebra containing $\mathcal{A}_0$ will be the intersection of all $\sigma$-algebras containing $\mathcal{A}_0$.
Since $(a, b]$ is in $\mathcal{A}_0$ for all $a$ and $b$, then $(a, b) = \cup_{i=i_0}^\infty (a, b - 1/i] \in \mathcal{A}$, where we choose $i_0$ so that $1/i_0 < b - a$. Then sets of the form $\cup_{i=1}^\infty (a_i, b_i)$ will be in $\mathcal{A}$, hence all open sets. Therefore all closed sets are in $\mathcal{A}$ as well.
The smallest $\sigma$-algebra containing the open sets is called the Borel $\sigma$-algebra. It is often written $\mathcal{B}$.
A set $N$ is a null set if $m^*(N) = 0$. Let $\mathcal{L}$ be the smallest $\sigma$-algebra containing $\mathcal{B}$ and all the null sets. $\mathcal{L}$ is called the Lebesgue $\sigma$-algebra, and sets in $\mathcal{L}$ are called Lebesgue measurable.
As part of our proofs of (2) and (3) we will show that $m^*$ is a measure on $\mathcal{L}$. Lebesgue measure is the measure $m^*$ on $\mathcal{L}$. (1) shows that $\mathcal{L}$ is strictly smaller than the collection of all subsets of $\mathbb{R}$.
Proof of (1). Define $x \sim y$ if $x - y$ is rational. This is an equivalence relation on $[0, 1]$. For each equivalence class, pick an element out of that class (by the axiom of choice). Call the collection of such points $A$. Given a set $B$, define $B + x = \{y + x : y \in B\}$. Note $m^*(A + q) = m^*(A)$ since this translation invariance holds for intervals, hence for open sets, hence for all sets. Moreover, the sets $A + q$ are disjoint for different rationals $q$.
Now
$$[0, 1] \subset \cup_{q \in [-2, 2]} (A + q),$$
where the union is only over rational $q$, so $1 \leq \sum_{q \in [-2, 2]} m^*(A + q)$, and therefore $m^*(A) > 0$. But
$$\cup_{q \in [-2, 2]} (A + q) \subset [-6, 6],$$
where again the union is only over rational $q$, so if $m^*$ were a measure we would have $12 \geq \sum_{q \in [-2, 2]} m^*(A + q)$, which implies $m^*(A) = 0$, a contradiction.




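Proposition 2.1. The following hold:
(a) $m^*(\emptyset) = 0$;
(b) if $A \subset B$, then $m^*(A) \leq m^*(B)$;
(c) $m^*(\cup_{i=1}^\infty A_i) \leq \sum_{i=1}^\infty m^*(A_i)$.
Proof. (a) and (b) are obvious. To prove (c), let $\varepsilon > 0$. For each $i$ there exist intervals $I_{i1}, I_{i2}, \ldots$ such that $A_i \subset \cup_{j=1}^\infty I_{ij}$ and $\sum_j m(I_{ij}) \leq m^*(A_i) + \varepsilon/2^i$. Then $\cup_{i=1}^\infty A_i \subset \cup_{i,j} I_{ij}$ and
$$\sum_{i,j} m(I_{ij}) \leq \sum_{i=1}^\infty m^*(A_i) + \sum_{i=1}^\infty \varepsilon/2^i = \sum_{i=1}^\infty m^*(A_i) + \varepsilon.$$
Since $\varepsilon$ is arbitrary, $m^*(\cup_{i=1}^\infty A_i) \leq \sum_{i=1}^\infty m^*(A_i)$.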
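The $\varepsilon/2^i$ device in the proof of (c) can be seen at work on a countable set. The sketch below is an illustration that is not in the original notes: it covers the first 50 rationals of $[0, 1]$ (in a hypothetical enumeration by denominator) with open intervals of lengths $\varepsilon/2^i$ and computes the total length exactly with rational arithmetic.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` distinct rationals p/q in [0, 1], enumerated by denominator."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def cover(points, eps):
    """Cover the i-th point (i = 1, 2, ...) by an open interval of length eps / 2**i."""
    return [(x - eps / 2 ** (i + 1), x + eps / 2 ** (i + 1))
            for i, x in enumerate(points, start=1)]

eps = Fraction(1, 100)
points = rationals_in_unit_interval(50)
intervals = cover(points, eps)
total_length = sum(b - a for a, b in intervals)   # equals eps * (1 - 2**-50)
```

However many points are covered this way, the total length stays below $\varepsilon$, which is why every countable set has outer measure 0.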

A function on the collection of all subsets satisfying (a), (b), and (c) is called an outer measure.
Definition. Let $m^*$ be an outer measure. A set $A \subset X$ is $m^*$-measurable if
$$m^*(E) = m^*(E \cap A) + m^*(E \cap A^c) \qquad (2.1)$$
for all $E \subset X$.
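Theorem 2.2. If $m^*$ is an outer measure on $X$, then the collection $\mathcal{A}$ of $m^*$-measurable sets is a $\sigma$-algebra and the restriction of $m^*$ to $\mathcal{A}$ is a measure. Moreover, $\mathcal{A}$ contains all the null sets.
Proof. By Proposition 2.1(c),
$$m^*(E) \leq m^*(E \cap A) + m^*(E \cap A^c)$$
for all $E \subset X$. So to check (2.1) it is enough to show $m^*(E) \geq m^*(E \cap A) + m^*(E \cap A^c)$. This will be trivial in the case $m^*(E) = \infty$.
If $A \in \mathcal{A}$, then $A^c \in \mathcal{A}$ by symmetry and the definition of $\mathcal{A}$. Suppose $A, B \in \mathcal{A}$ and $E \subset X$. Then
$$m^*(E) = m^*(E \cap A) + m^*(E \cap A^c) = (m^*(E \cap A \cap B) + m^*(E \cap A \cap B^c)) + (m^*(E \cap A^c \cap B) + m^*(E \cap A^c \cap B^c)).$$
The first three terms on the right have a sum greater than or equal to $m^*(E \cap (A \cup B))$ because $A \cup B \subset (A \cap B) \cup (A \cap B^c) \cup (A^c \cap B)$. Therefore
$$m^*(E) \geq m^*(E \cap (A \cup B)) + m^*(E \cap (A \cup B)^c),$$
which shows $A \cup B \in \mathcal{A}$. Therefore $\mathcal{A}$ is an algebra.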
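Let $A_i$ be disjoint sets in $\mathcal{A}$, let $B_n = \cup_{i=1}^n A_i$, and $B = \cup_{i=1}^\infty A_i$. If $E \subset X$,
$$m^*(E \cap B_n) = m^*(E \cap B_n \cap A_n) + m^*(E \cap B_n \cap A_n^c) = m^*(E \cap A_n) + m^*(E \cap B_{n-1}).$$
Repeating for $m^*(E \cap B_{n-1})$, we obtain
$$m^*(E \cap B_n) = \sum_{i=1}^n m^*(E \cap A_i).$$
So
$$m^*(E) = m^*(E \cap B_n) + m^*(E \cap B_n^c) \geq \sum_{i=1}^n m^*(E \cap A_i) + m^*(E \cap B^c).$$
Let $n \to \infty$. Then
$$m^*(E) \geq \sum_{i=1}^\infty m^*(E \cap A_i) + m^*(E \cap B^c) \geq m^*(\cup_{i=1}^\infty (E \cap A_i)) + m^*(E \cap B^c) = m^*(E \cap B) + m^*(E \cap B^c) \geq m^*(E).$$
This shows $B \in \mathcal{A}$.
If we set $E = B$ in this last equation, we obtain
$$m^*(B) = \sum_{i=1}^\infty m^*(A_i),$$
or $m^*$ is countably additive on $\mathcal{A}$.
If $m^*(A) = 0$ and $E \subset X$, then
$$m^*(E \cap A) + m^*(E \cap A^c) = m^*(E \cap A^c) \leq m^*(E),$$
which shows $\mathcal{A}$ contains all null sets.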

None of this is useful if $\mathcal{A}$ does not contain the intervals. There are two main steps in showing this. Let $\mathcal{A}_0$ be the algebra consisting of all finite unions of intervals of the form $(a, b]$. The first step is
Proposition 2.3. If $A_i \in \mathcal{A}_0$ are disjoint and $\cup_{i=1}^\infty A_i \in \mathcal{A}_0$, then we have $m(\cup_{i=1}^\infty A_i) = \sum_{i=1}^\infty m(A_i)$.
Proof. Since $\cup_{i=1}^\infty A_i$ is a finite union of intervals $(a_k, b_k]$, we may look at $A_i \cap (a_k, b_k]$ for each $k$. So we may assume that $A = \cup_{i=1}^\infty A_i = (a, b]$.
First,
$$m(A) = m(\cup_{i=1}^n A_i) + m(A - \cup_{i=1}^n A_i) \geq m(\cup_{i=1}^n A_i) = \sum_{i=1}^n m(A_i).$$
Letting $n \to \infty$,
$$m(A) \geq \sum_{i=1}^\infty m(A_i).$$

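Let us assume $a$ and $b$ are finite, the other case being similar. By linearity, we may assume $A_i = (a_i, b_i]$. Let $\varepsilon > 0$. The collection $\{(a_i, b_i + \varepsilon/2^i)\}$ covers $[a + \varepsilon, b]$, and so there exists a finite subcover. Discarding any interval contained in another one, and relabeling, we may assume $a_1 < a_2 < \cdots < a_N$ and $b_i + \varepsilon/2^i \in (a_{i+1}, b_{i+1} + \varepsilon/2^{i+1})$. Then
$$m(A) = b - a = b - (a + \varepsilon) + \varepsilon \leq \sum_{i=1}^N (b_i + \varepsilon/2^i - a_i) + \varepsilon \leq \sum_{i=1}^\infty m(A_i) + 2\varepsilon.$$
Since $\varepsilon$ is arbitrary, $m(A) \leq \sum_{i=1}^\infty m(A_i)$.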

The second step is the Carathéodory extension theorem. We say that a measure $m$ is $\sigma$-finite if there exist $E_1, E_2, \ldots$, such that $m(E_i) < \infty$ for all $i$ and $X \subset \cup_{i=1}^\infty E_i$.
Theorem 2.4. Suppose $\mathcal{A}_0$ is an algebra and $m$ restricted to $\mathcal{A}_0$ is a measure. Define
$$m^*(E) = \inf\Big\{\sum_{i=1}^\infty m(A_i) : A_i \in \mathcal{A}_0, E \subset \cup_{i=1}^\infty A_i\Big\}.$$
Then
(a) $m^*(A) = m(A)$ if $A \in \mathcal{A}_0$;
(b) every set in $\mathcal{A}_0$ is $m^*$-measurable;
(c) if $m$ is $\sigma$-finite, then there is a unique extension to the smallest $\sigma$-field containing $\mathcal{A}_0$.
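Proof. We start with (a). Suppose $E \in \mathcal{A}_0$. We know $m^*(E) \leq m(E)$ since we can take $A_1 = E$ and $A_2, A_3, \ldots$ empty in the definition of $m^*$. If $E \subset \cup_{i=1}^\infty A_i$ with $A_i \in \mathcal{A}_0$, let $B_n = E \cap (A_n - \cup_{i=1}^{n-1} A_i)$. Then the $B_n$ are disjoint, they are each in $\mathcal{A}_0$, and their union is $E$. Therefore
$$m(E) = \sum_{i=1}^\infty m(B_i) \leq \sum_{i=1}^\infty m(A_i).$$
Thus $m(E) \leq m^*(E)$.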


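Next we look at (b). Suppose $A \in \mathcal{A}_0$. Let $\varepsilon > 0$ and let $E \subset X$. Pick $B_i \in \mathcal{A}_0$ such that $E \subset \cup_{i=1}^\infty B_i$ and $\sum_i m(B_i) \leq m^*(E) + \varepsilon$. Then
$$m^*(E) + \varepsilon \geq \sum_{i=1}^\infty m(B_i) = \sum_{i=1}^\infty m(B_i \cap A) + \sum_{i=1}^\infty m(B_i \cap A^c) \geq m^*(E \cap A) + m^*(E \cap A^c).$$
Since $\varepsilon$ is arbitrary, $m^*(E) \geq m^*(E \cap A) + m^*(E \cap A^c)$. So $A$ is $m^*$-measurable.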
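Finally, suppose we have two extensions to the smallest $\sigma$-field containing $\mathcal{A}_0$; let the other extension be called $n$. We will show that if $E$ is in this smallest $\sigma$-field, then $m^*(E) = n(E)$.
Since $E$ must be $m^*$-measurable, $m^*(E) = \inf\{\sum_{i=1}^\infty m(A_i) : E \subset \cup_{i=1}^\infty A_i, A_i \in \mathcal{A}_0\}$. But $m = n$ on $\mathcal{A}_0$, so $\sum_i m(A_i) = \sum_i n(A_i)$. Therefore $n(E) \leq \sum_i n(A_i)$, which implies $n(E) \leq m^*(E)$.
Let $\varepsilon > 0$ and choose $A_i \in \mathcal{A}_0$ such that $m^*(E) + \varepsilon \geq \sum_i m(A_i)$ and $E \subset \cup_i A_i$. Let $A = \cup_i A_i$ and $B_k = \cup_{i=1}^k A_i$. Observe $m^*(E) + \varepsilon \geq m^*(A)$, hence $m^*(A - E) \leq \varepsilon$. We have
$$m^*(A) = \lim_{k \to \infty} m^*(B_k) = \lim_{k \to \infty} n(B_k) = n(A).$$
Then
$$m^*(E) \leq m^*(A) = n(A) = n(E) + n(A - E) \leq n(E) + m^*(A - E) \leq n(E) + \varepsilon.$$
Since $\varepsilon$ is arbitrary, this completes the proof.

We now drop the $*$ from $m^*$ and call $m$ Lebesgue measure.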


3. Lebesgue-Stieltjes measures. Let $\alpha : \mathbb{R} \to \mathbb{R}$ be nondecreasing and right continuous (i.e., $\alpha(x+) = \alpha(x)$ for all $x$). Suppose we define $m_\alpha((a, b)) = \alpha(b) - \alpha(a)$, define $m_\alpha(\cup_{i=1}^\infty (a_i, b_i)) = \sum_i (\alpha(b_i) - \alpha(a_i))$ when the intervals $(a_i, b_i)$ are disjoint, and define $m_\alpha^*(A) = \inf\{m_\alpha(G) : A \subset G, G \text{ open}\}$. Very much as in the previous section we can show that $m_\alpha^*$ is a measure on the Borel $\sigma$-algebra. The only differences in the proof are that where we had $a + \varepsilon$, we replace this by $a'$, where $a'$ is chosen so that $a' > a$ and $\alpha(a') \leq \alpha(a) + \varepsilon$, and we replace $b_i + \varepsilon/2^i$ by $b_i'$, where $b_i'$ is chosen so that $b_i' > b_i$ and $\alpha(b_i') \leq \alpha(b_i) + \varepsilon/2^i$. These choices are possible because $\alpha$ is right continuous.
Lebesgue measure is the special case of $m_\alpha$ when $\alpha(x) = x$.
Given a measure $\mu$ on $\mathbb{R}$ such that $\mu(K) < \infty$ whenever $K$ is compact, define $\alpha(x) = \mu((0, x])$ if $x \geq 0$ and $\alpha(x) = -\mu((x, 0])$ if $x < 0$. Then $\alpha$ is nondecreasing, right continuous, and it is not hard to see that $\mu = m_\alpha$.
4. Measurable functions. Suppose we have a set $X$ together with a $\sigma$-algebra $\mathcal{A}$.
Definition. $f : X \to \mathbb{R}$ is measurable if $\{x : f(x) > a\} \in \mathcal{A}$ for all $a \in \mathbb{R}$.
Proposition 4.1. The following are equivalent.
(a) $\{x : f(x) > a\} \in \mathcal{A}$ for all $a$;
(b) $\{x : f(x) \leq a\} \in \mathcal{A}$ for all $a$;
(c) $\{x : f(x) < a\} \in \mathcal{A}$ for all $a$;
(d) $\{x : f(x) \geq a\} \in \mathcal{A}$ for all $a$.
Proof. The equivalence of (a) and (b) and of (c) and (d) follow from taking complements. The remaining equivalences follow from the equations
$$\{x : f(x) \geq a\} = \cap_{n=1}^\infty \{x : f(x) > a - 1/n\},$$
$$\{x : f(x) > a\} = \cup_{n=1}^\infty \{x : f(x) \geq a + 1/n\}.$$

Proposition 4.2. If $X$ is a metric space, $\mathcal{A}$ contains all the open sets, and $f$ is continuous, then $f$ is measurable.
Proof. $\{x : f(x) > a\} = f^{-1}((a, \infty))$ is open.

Proposition 4.3. If $f$ and $g$ are measurable, so are $f + g$, $cf$, $fg$, $\max(f, g)$, and $\min(f, g)$.
Proof. If $f(x) + g(x) < \alpha$, then $f(x) < \alpha - g(x)$, and there exists a rational $r$ such that $f(x) < r < \alpha - g(x)$. So
$$\{x : f(x) + g(x) < \alpha\} = \bigcup_{r \text{ rational}} (\{x : f(x) < r\} \cap \{x : g(x) < \alpha - r\}).$$
$f^2$ is measurable since $\{x : f(x)^2 > a\} = \{x : f(x) > \sqrt{a}\} \cup \{x : f(x) < -\sqrt{a}\}$. The measurability of $fg$ follows since $fg = \frac{1}{2}[(f + g)^2 - f^2 - g^2]$.
$\{x : \max(f(x), g(x)) > a\} = \{x : f(x) > a\} \cup \{x : g(x) > a\}$.

Proposition 4.4. If $f_i$ is measurable for each $i$, then so is $\sup_i f_i$, $\inf_i f_i$, $\limsup_{i \to \infty} f_i$, and $\liminf_{i \to \infty} f_i$.
Proof. The result will follow for lim sup and lim inf once we have the result for the sup and inf by using the definitions. We have $\{x : \sup_i f_i > a\} = \cup_{i=1}^\infty \{x : f_i(x) > a\}$, and the proof for $\inf f_i$ is similar.
Definition. We say $f = g$ almost everywhere, written $f = g$ a.e., if $\{x : f(x) \neq g(x)\}$ has measure zero. Similarly, we say $f_i \to f$ a.e., if the set of $x$ where this fails has measure zero.
5. Integration. In this section we introduce the Lebesgue integral.
Definition. If $E \subset X$, define the characteristic function of $E$ by
$$\chi_E(x) = \begin{cases} 1 & x \in E; \\ 0 & x \notin E. \end{cases}$$
A simple function $s$ is one of the form
$$s(x) = \sum_{i=1}^n a_i \chi_{E_i}(x)$$
for reals $a_i$ and sets $E_i$.


Proposition 5.1. Suppose $f \geq 0$ is measurable. Then there exists a sequence of nonnegative measurable simple functions increasing to $f$.
Proof. Let $E_{ni} = \{x : (i - 1)/2^n \leq f(x) < i/2^n\}$ and $F_n = \{x : f(x) \geq n\}$ for $n = 1, 2, \ldots$, and $i = 1, 2, \ldots, n2^n$. Then define
$$s_n = \sum_{i=1}^{n2^n} \frac{i - 1}{2^n} \chi_{E_{ni}} + n \chi_{F_n}.$$
It is easy to see that $s_n$ has the desired properties.
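The value of $s_n$ at a point depends only on the value of $f$ there, so the construction can be sketched pointwise. The code below is a minimal illustration under stated assumptions: the function name `s_n`, the example function $f(x) = x^2$, and the evaluation point are hypothetical, and floating point stands in for real numbers.

```python
import math

def s_n(f_value, n):
    """Value of the n-th dyadic simple function where f equals f_value >= 0."""
    if f_value >= n:
        return float(n)                              # the point lies in F_n
    # Otherwise the point lies in E_{ni} with (i - 1)/2^n <= f < i/2^n,
    # so i - 1 = floor(f * 2^n).
    return math.floor(f_value * 2 ** n) / 2 ** n

f = lambda x: x * x        # hypothetical example function
x = 1.3
approximations = [s_n(f(x), n) for n in range(1, 11)]
```

Once $n$ exceeds $f(x)$, the value $s_n(x)$ is within $2^{-n}$ of $f(x)$ from below, and refining the dyadic grid never lowers it.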
Definition. If $s = \sum_{i=1}^n a_i \chi_{E_i}$ is a nonnegative measurable simple function, define the Lebesgue integral of $s$ to be
$$\int s \, d\mu = \sum_{i=1}^n a_i \mu(E_i). \qquad (5.1)$$
If $f \geq 0$ is a measurable function, define
$$\int f \, d\mu = \sup\Big\{\int s \, d\mu : 0 \leq s \leq f, s \text{ simple}\Big\}. \qquad (5.2)$$
If $f$ is measurable and at least one of the integrals $\int f^+ \, d\mu$, $\int f^- \, d\mu$ is finite, where $f^+ = \max(f, 0)$ and $f^- = -\min(f, 0)$, define
$$\int f \, d\mu = \int f^+ \, d\mu - \int f^- \, d\mu. \qquad (5.3)$$
A few remarks are in order. A function $s$ might be written as a simple function in more than one way. For example $\chi_{A \cup B} = \chi_A + \chi_B$ if $A$ and $B$ are disjoint. It is clear that the definition of $\int s \, d\mu$ is unaffected by how $s$ is written. Secondly, if $s$ is a simple function, one has to think a moment to verify that the definition of $\int s \, d\mu$ by means of (5.1) agrees with its definition by means of (5.2).
Definition. If $\int |f| \, d\mu < \infty$, we say $f$ is integrable.
The proof of the next proposition follows from the definitions.
Proposition 5.2. (a) If $f$ is measurable, $a \leq f(x) \leq b$ for all $x$, and $\mu(X) < \infty$, then $a\mu(X) \leq \int f \, d\mu \leq b\mu(X)$;
(b) If $f(x) \leq g(x)$ for all $x$ and $f$ and $g$ are measurable and integrable, then $\int f \, d\mu \leq \int g \, d\mu$.
(c) If $f$ is integrable, then $\int cf \, d\mu = c \int f \, d\mu$ for all real $c$.
(d) If $\mu(A) = 0$ and $f$ is measurable, then $\int f \chi_A \, d\mu = 0$.
The integral $\int f \chi_A \, d\mu$ is often written $\int_A f \, d\mu$. Other notation for the integral is to omit the $\mu$ if it is clear which measure is being used, to write $\int f(x) \, \mu(dx)$, or to write $\int f(x) \, d\mu(x)$.
Proposition 5.3. If $f$ is integrable,
$$\Big|\int f\Big| \leq \int |f|.$$
Proof. $f \leq |f|$, so $\int f \leq \int |f|$. Also $-f \leq |f|$, so $-\int f \leq \int |f|$. Now combine these two facts.

One of the most important results concerning Lebesgue integration is the monotone convergence theorem.
Theorem 5.4. Suppose $f_n$ is a sequence of nonnegative measurable functions with $f_1(x) \leq f_2(x) \leq \cdots$ for all $x$ and with $\lim_{n \to \infty} f_n(x) = f(x)$ for all $x$. Then $\int f_n \, d\mu \to \int f \, d\mu$.
Proof. By Proposition 5.2(b), $\int f_n$ is an increasing sequence of real numbers. Let $L$ be the limit. Since $f_n \leq f$ for all $n$, then $L \leq \int f$. We must show $L \geq \int f$.
Let $s = \sum_{i=1}^m a_i \chi_{E_i}$ be any nonnegative simple function less than $f$ and let $c \in (0, 1)$. Let $A_n = \{x : f_n(x) \geq cs(x)\}$. Since the $f_n(x)$ increases to $f(x)$ for each $x$ and $c < 1$, then $A_1 \subset A_2 \subset \cdots$, and the union of the $A_n$ is all of $X$. For each $n$,
$$\int f_n \geq \int_{A_n} f_n \geq c \int_{A_n} s = c \int_{A_n} \sum_{i=1}^m a_i \chi_{E_i} = c \sum_{i=1}^m a_i \mu(E_i \cap A_n).$$
If we let $n \to \infty$, by Proposition 1.1(c), the right hand side converges to
$$c \sum_{i=1}^m a_i \mu(E_i) = c \int s.$$
Therefore $L \geq c \int s$. Since $c$ is arbitrary in the interval $(0, 1)$, then $L \geq \int s$. Taking the supremum over all simple $s \leq f$, we obtain $L \geq \int f$.

Once we have the monotone convergence theorem, we can prove that the Lebesgue integral is linear.
Theorem 5.5. If $f_1$ and $f_2$ are integrable, then
$$\int (f_1 + f_2) = \int f_1 + \int f_2.$$
Proof. First suppose $f_1$ and $f_2$ are nonnegative and simple. Then it is clear from the definition that the theorem holds in this case. Next suppose $f_1$ and $f_2$ are nonnegative. Take $s_n$ simple and increasing to $f_1$ and $t_n$ simple and increasing to $f_2$. Then $s_n + t_n$ increases to $f_1 + f_2$, so the result follows from the monotone convergence theorem and the result for simple functions. Finally in the general case, write $f_1 = f_1^+ - f_1^-$ and similarly for $f_2$, and use the definitions and the result for nonnegative functions.

Suppose $f_n$ are nonnegative measurable functions. We will frequently need the observation
$$\int \sum_{n=1}^\infty f_n = \int \lim_{N \to \infty} \sum_{n=1}^N f_n = \lim_{N \to \infty} \int \sum_{n=1}^N f_n = \lim_{N \to \infty} \sum_{n=1}^N \int f_n = \sum_{n=1}^\infty \int f_n. \qquad (5.4)$$
We used here the monotone convergence theorem and the linearity of the integral.
The next theorem is known as Fatou's lemma.
Theorem 5.6. Suppose the $f_n$ are nonnegative and measurable. Then
$$\int \liminf_{n \to \infty} f_n \leq \liminf_{n \to \infty} \int f_n.$$
Proof. Let $g_n = \inf_{i \geq n} f_i$. Then $g_n$ are nonnegative and $g_n$ increases to $\liminf f_n$. Clearly $g_n \leq f_i$ for each $i \geq n$, so $\int g_n \leq \int f_i$. Therefore
$$\int g_n \leq \inf_{i \geq n} \int f_i.$$
If we take the supremum over $n$, on the left hand side we obtain $\int \liminf f_n$ by the monotone convergence theorem, while on the right hand side we obtain $\liminf_n \int f_n$.




A second very important theorem is the dominated convergence theorem.
Theorem 5.7. Suppose $f_n$ are measurable functions and $f_n(x) \to f(x)$. Suppose there exists an integrable function $g$ such that $|f_n(x)| \leq g(x)$ for all $x$. Then $\int f_n \, d\mu \to \int f \, d\mu$.
Proof. Since $f_n + g \geq 0$, by Fatou's lemma,
$$\int (f + g) \leq \liminf \int (f_n + g).$$
Since $g$ is integrable,
$$\int f \leq \liminf \int f_n.$$
Similarly, $g - f_n \geq 0$, so
$$\int (g - f) \leq \liminf \int (g - f_n),$$
and hence
$$-\int f \leq \liminf \int (-f_n) = -\limsup \int f_n.$$
Therefore
$$\int f \geq \limsup \int f_n,$$
which with the above proves the theorem.


Example. Suppose $f_n = n\chi_{(0, 1/n)}$. Then $f_n \geq 0$, $f_n \to 0$ for each $x$, but $\int f_n = 1$ does not converge to $\int 0 = 0$. The trouble here is that the $f_n$ do not increase for each $x$, nor is there an integrable function $g$ that dominates all the $f_n$ simultaneously.
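The example can be checked with exact arithmetic. This is a small sketch, not part of the original notes: the function names are hypothetical, and the integral is computed from the formula height times length rather than by numerical quadrature.

```python
from fractions import Fraction

def f_n(n, x):
    """n times the indicator of the open interval (0, 1/n)."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_f_n(n):
    """Exact integral: height n times the length 1/n of (0, 1/n)."""
    return n * Fraction(1, n)

x = Fraction(1, 10)
indices = (1, 5, 10, 11, 100)
pointwise = [f_n(n, x) for n in indices]      # eventually 0 at the fixed point x
integrals = [integral_f_n(n) for n in indices]  # always 1
```

At the fixed point $x = 1/10$ the values vanish as soon as $1/n \leq x$, while every integral equals 1. Any dominating $g$ would have to be at least $\sup_n f_n$, which equals $n$ on $[1/(n+1), 1/n)$ and so has integral $\sum_n 1/(n+1) = \infty$.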
If in the monotone convergence theorem or dominated convergence theorem we have only $f_n(x) \to f(x)$ almost everywhere, the conclusion still holds. For if $A = \{x : f_n(x) \to f(x)\}$, then $f_n \chi_A \to f \chi_A$ for each $x$. And since $A^c$ has measure 0, we see from Proposition 5.2(d) that $\int f \chi_A = \int f$, and similarly with $f$ replaced by $f_n$.
Later on we will need the following two propositions.
Proposition 5.8. Suppose $f$ is measurable and for every measurable set $A$ we have $\int_A f \, d\mu = 0$. Then $f = 0$ almost everywhere.
Proof. Let $A = \{x : f(x) > \varepsilon\}$. Then
$$0 = \int_A f \geq \int_A \varepsilon = \varepsilon \mu(A)$$
since $f \chi_A \geq \varepsilon \chi_A$. Hence $\mu(A) = 0$. We use this argument for $\varepsilon = 1/n$ and $n = 1, 2, \ldots$, so $\mu\{x : f(x) > 0\} = 0$. Similarly $\mu\{x : f(x) < 0\} = 0$.

Proposition 5.9. Suppose $f$ is measurable and nonnegative and $\int f \, d\mu = 0$. Then $f = 0$ almost everywhere.
Proof. If $f$ is not almost everywhere equal to 0, there exists an $n$ such that $\mu(A_n) > 0$ where $A_n = \{x : f(x) > 1/n\}$. But then since $f$ is nonnegative,
$$\int f \geq \int_{A_n} f \geq \frac{1}{n} \mu(A_n),$$
a contradiction.

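6. Product measures. If $A_1 \subset A_2 \subset \cdots$ and $A = \cup_{i=1}^\infty A_i$, we write $A_i \uparrow A$. If $A_1 \supset A_2 \supset \cdots$ and $A = \cap_{i=1}^\infty A_i$, we write $A_i \downarrow A$.
Definition. $\mathcal{M}$ is a monotone class if $\mathcal{M}$ is a collection of subsets of $X$ such that
(a) if $A_i \uparrow A$ and each $A_i \in \mathcal{M}$, then $A \in \mathcal{M}$;
(b) if $A_i \downarrow A$ and each $A_i \in \mathcal{M}$, then $A \in \mathcal{M}$.
The intersection of monotone classes is a monotone class, and the intersection of all monotone classes containing a given collection of sets is the smallest monotone class containing that collection.
The next theorem, the monotone class lemma, is rather technical, but very useful.
Theorem 6.1. Suppose $\mathcal{A}_0$ is an algebra, $\mathcal{A}$ is the smallest $\sigma$-algebra containing $\mathcal{A}_0$, and $\mathcal{M}$ is the smallest monotone class containing $\mathcal{A}_0$. Then $\mathcal{M} = \mathcal{A}$.
Proof. A $\sigma$-algebra is clearly a monotone class, so $\mathcal{A} \supset \mathcal{M}$. We must show $\mathcal{M} \supset \mathcal{A}$.
Let $\mathcal{N}_1 = \{A \in \mathcal{M} : A^c \in \mathcal{M}\}$. Note $\mathcal{N}_1$ is contained in $\mathcal{M}$, contains $\mathcal{A}_0$, and is a monotone class. So $\mathcal{N}_1 = \mathcal{M}$, and therefore $\mathcal{M}$ is closed under the operation of taking complements.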
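Let $\mathcal{N}_2 = \{A \in \mathcal{M} : A \cap B \in \mathcal{M} \text{ for all } B \in \mathcal{A}_0\}$. $\mathcal{N}_2$ is contained in $\mathcal{M}$; $\mathcal{N}_2$ contains $\mathcal{A}_0$ because $\mathcal{A}_0$ is an algebra; $\mathcal{N}_2$ is a monotone class because $(\cup_{i=1}^\infty A_i) \cap B = \cup_{i=1}^\infty (A_i \cap B)$, and similarly for intersections. Therefore $\mathcal{N}_2 = \mathcal{M}$; in other words, if $B \in \mathcal{A}_0$ and $A \in \mathcal{M}$, then $A \cap B \in \mathcal{M}$.
Let $\mathcal{N}_3 = \{A \in \mathcal{M} : A \cap B \in \mathcal{M} \text{ for all } B \in \mathcal{M}\}$. As in the preceding paragraph, $\mathcal{N}_3$ is a monotone class contained in $\mathcal{M}$. By the last sentence of the preceding paragraph, $\mathcal{N}_3$ contains $\mathcal{A}_0$. Hence $\mathcal{N}_3 = \mathcal{M}$.
We thus have that $\mathcal{M}$ is a monotone class closed under the operations of taking complements and taking intersections. This shows $\mathcal{M}$ is a $\sigma$-algebra, and so $\mathcal{M} \supset \mathcal{A}$.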

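Suppose $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ are two measure spaces, i.e., $\mathcal{A}$ and $\mathcal{B}$ are $\sigma$-algebras on $X$ and $Y$, resp., and $\mu$ and $\nu$ are measures on $\mathcal{A}$ and $\mathcal{B}$, resp. A rectangle is a set of the form $A \times B$, where $A \in \mathcal{A}$ and $B \in \mathcal{B}$. Define a set function $\mu \times \nu$ on rectangles by
$$\mu \times \nu(A \times B) = \mu(A)\nu(B).$$
Lemma 6.2. Suppose $A \times B = \cup_{i=1}^\infty A_i \times B_i$, where $A, A_i \in \mathcal{A}$ and $B, B_i \in \mathcal{B}$, and the rectangles $A_i \times B_i$ are disjoint. Then
$$\mu \times \nu(A \times B) = \sum_{i=1}^\infty \mu \times \nu(A_i \times B_i).$$
Proof. We have
$$\chi_{A \times B}(x, y) = \sum_{i=1}^\infty \chi_{A_i \times B_i}(x, y),$$
and so
$$\chi_A(x)\chi_B(y) = \sum_{i=1}^\infty \chi_{A_i}(x)\chi_{B_i}(y).$$
Holding $x$ fixed and integrating over $y$ with respect to $\nu$, we have, using (5.4),
$$\chi_A(x)\nu(B) = \sum_{i=1}^\infty \chi_{A_i}(x)\nu(B_i).$$
Now use (5.4) again and integrate over $x$ with respect to $\mu$ to obtain the result.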

Let $\mathcal{C}_0 = \{\text{finite unions of rectangles}\}$. It is clear that $\mathcal{C}_0$ is an algebra. By Lemma 6.2 and linearity, we see that $\mu \times \nu$ is a measure on $\mathcal{C}_0$. Let $\mathcal{A} \times \mathcal{B}$ be the smallest $\sigma$-algebra containing $\mathcal{C}_0$; this is called the product $\sigma$-algebra. By the Carathéodory extension theorem, $\mu \times \nu$ can be extended to a measure on $\mathcal{A} \times \mathcal{B}$.
We will need the following observation. Suppose a measure $\mu$ is $\sigma$-finite. So there exist $E_i$ which have finite measure and whose union is $X$. If we let $F_n = \cup_{i=1}^n E_i$, then $F_i \uparrow X$ and $\mu(F_n)$ is finite for each $n$.
If $\mu$ and $\nu$ are both $\sigma$-finite, say with $F_i \uparrow X$ and $G_i \uparrow Y$, then $\mu \times \nu$ will be $\sigma$-finite, using the sets $F_i \times G_i$.
The main result of this section is Fubini's theorem, which allows one to interchange the order of integration.
Theorem 6.3. Suppose $f : X \times Y \to \mathbb{R}$ is measurable with respect to $\mathcal{A} \times \mathcal{B}$. If $f$ is nonnegative or $\int |f(x, y)| \, d(\mu \times \nu)(x, y) < \infty$, then
(a) the function $g(x) = \int f(x, y) \, \nu(dy)$ is measurable with respect to $\mathcal{A}$;
(b) the function $h(y) = \int f(x, y) \, \mu(dx)$ is measurable with respect to $\mathcal{B}$;
(c) we have
$$\int f(x, y) \, d(\mu \times \nu)(x, y) = \int \Big(\int f(x, y) \, d\mu(x)\Big) d\nu(y) = \int \Big(\int f(x, y) \, d\nu(y)\Big) \mu(dx).$$
Proof. First suppose $\mu$ and $\nu$ are finite measures. If $f$ is the characteristic function of a rectangle, then (a)-(c) are obvious. By linearity, (a)-(c) hold if $f$ is the characteristic function of a set in $\mathcal{C}_0$, the set of finite unions of rectangles.
Let $\mathcal{M}$ be the collection of sets $C$ such that (a)-(c) hold for $\chi_C$. If $C_i \uparrow C$ and $C_i \in \mathcal{M}$, then (c) holds for $\chi_C$ by monotone convergence. If $C_i \downarrow C$, then (c) holds for $\chi_C$ by dominated convergence. (a) and (b) are easy. So $\mathcal{M}$ is a monotone class containing $\mathcal{C}_0$, so $\mathcal{M} = \mathcal{A} \times \mathcal{B}$ by the monotone class lemma.
If $\mu$ and $\nu$ are $\sigma$-finite, applying monotone convergence to $C \cap (F_n \times G_n)$ for suitable $F_n$ and $G_n$, we see that (a)-(c) hold for the characteristic functions of sets in $\mathcal{A} \times \mathcal{B}$ in this case as well.
By linearity, (a)-(c) hold for nonnegative simple functions. By monotone convergence, (a)-(c) hold for nonnegative functions. In the case $\int |f| < \infty$, writing $f = f^+ - f^-$ and using linearity proves (a)-(c) for this case, too.
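On finite sets every integral is a finite sum, so part (c) of Fubini's theorem reduces to exchanging two sums. The sketch below illustrates this under stated assumptions: the point masses `mu`, `nu` and the function `f` are hypothetical, and exact rational arithmetic is used so that the three values can be compared with equality.

```python
from fractions import Fraction

mu = {"a": Fraction(1, 2), "b": Fraction(3, 2)}            # hypothetical masses on X
nu = {0: Fraction(2), 1: Fraction(1, 3), 2: Fraction(1)}   # hypothetical masses on Y

def f(x, y):
    """A hypothetical nonnegative function on X x Y."""
    return (y + 1) ** 2 if x == "a" else y

# Integral against the product measure, whose mass at (x, y) is mu(x) * nu(y).
double = sum(f(x, y) * mu[x] * nu[y] for x in mu for y in nu)

# Iterated integrals in both orders.
x_then_y = sum(nu[y] * sum(f(x, y) * mu[x] for x in mu) for y in nu)
y_then_x = sum(mu[x] * sum(f(x, y) * nu[y] for y in nu) for x in mu)
```

All three quantities agree here. The content of the theorem is that the same exchange remains valid for $\sigma$-finite measures once $f$ is nonnegative or integrable.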


7. The Radon-Nikodym theorem. Suppose $f$ is nonnegative, measurable, and integrable with respect to $\mu$. If we define $\nu$ by
$$\nu(A) = \int_A f \, d\mu,$$
then $\nu$ is a measure. The only part that needs thought is the countable additivity, and this follows from (5.4) applied to the functions $f \chi_{A_i}$. Moreover, $\nu(A)$ is zero whenever $\mu(A)$ is.
Definition. A measure $\nu$ is called absolutely continuous with respect to a measure $\mu$ if $\nu(A) = 0$ whenever $\mu(A) = 0$.
Definition. A function $\mu : \mathcal{A} \to (-\infty, \infty]$ is called a signed measure if $\mu(\emptyset) = 0$ and $\mu(\cup_{i=1}^\infty A_i) = \sum_{i=1}^\infty \mu(A_i)$ whenever the $A_i$ are disjoint and all the $A_i$ are in $\mathcal{A}$.
Definition. Let $\mu$ be a signed measure. A set $A \in \mathcal{A}$ is called a positive set for $\mu$ if $\mu(B) \geq 0$ whenever $B \subset A$ and $B \in \mathcal{A}$. We define a negative set similarly.
Proposition 7.1. Let $\mu$ be a signed measure and let $M > 0$ such that $\mu(A) \geq -M$ for all $A \in \mathcal{A}$. If $\mu(F) < 0$, then there exists a subset $E$ of $F$ that is a negative set with $\mu(E) < 0$.
Proof. Suppose $\mu(F) < 0$. Let $F_1 = F$ and let $a_1 = \sup\{\mu(A) : A \subset F_1\}$. Since $\mu(F_1 - A) = \mu(F_1) - \mu(A)$ if $A \subset F_1$, we see that $a_1$ is finite. Let $B_1$ be a subset of $F_1$ such that $\mu(B_1) \geq a_1/2$. Let $F_2 = F_1 - B_1$, let $a_2 = \sup\{\mu(A) : A \subset F_2\}$, and choose $B_2$ a subset of $F_2$ such that $\mu(B_2) \geq a_2/2$. Let $F_3 = F_2 - B_2$ and continue.
One possibility is that this procedure stops after finitely many steps. This happens only if for some $i$ every subset of $F_i$ has nonpositive mass. In this case $E = F_i$ is the desired negative set.
The other possibility is if this procedure continues indefinitely. In this case, let $E = \cap_{i=1}^\infty F_i$. Note $E = F - (\cup_{i=1}^\infty B_i)$, and the $B_i$ are disjoint. So
$$\mu(E) = \mu(F) - \sum_{i=1}^\infty \mu(B_i),$$
and $\mu(E) \leq \mu(F) < 0$. Also
$$\sum_{i=1}^\infty \mu(B_i) = \mu(F) - \mu(E) \leq M.$$
This implies the series converges, so $\mu(B_i) \to 0$. Since $\mu(B_i) \geq a_i/2$, then $a_i \to 0$. Suppose $E$ is not a negative set. Then there exists $A \subset E$ with $\mu(A) > 0$. Choose $n$ such that $a_n < \mu(A)$. But $A$ is a subset of $F_n$, so $a_n \geq \mu(A)$, a contradiction. Therefore $E$ is a negative set.

Proposition 7.2. Let $\mu$ be a signed measure and $M > 0$ such that $\mu(A) \geq -M$ for all $A \in \mathcal{A}$. There exist sets $E$ and $F$ that are disjoint whose union is $X$ and such that $E$ is a negative set and $F$ is a positive set.
Proof. Let $L = \inf\{\mu(A) : A \text{ is a negative set}\}$. Choose negative sets $A_n$ such that $\mu(A_n) \to L$. Let $E = \cup_{n=1}^\infty A_n$. Let $B_n = A_n - (B_1 \cup \cdots \cup B_{n-1})$ for each $n$. Since $A_n$ is a negative set, so is each $B_n$. Also, the $B_n$ are disjoint. If $C \subset E$, then
$$\mu(C) = \lim_{n \to \infty} \mu(C \cap (\cup_{i=1}^n B_i)) = \lim_{n \to \infty} \sum_{i=1}^n \mu(C \cap B_i) \leq 0.$$
So $E$ is a negative set.
Since $E$ is negative,
$$\mu(E) = \mu(A_n) + \mu(E - A_n) \leq \mu(A_n).$$
Letting $n \to \infty$, we obtain $\mu(E) = L$.
Let $F = E^c$. If $F$ were not a positive set, there would exist $B \subset F$ with $\mu(B) < 0$. By Proposition 7.1 there exists a negative set $C$ contained in $B$ with $\mu(C) < 0$. But then $E \cup C$ would be a negative set with $\mu(E \cup C) < \mu(E) = L$, a contradiction.
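For a signed measure on a finite set the decomposition of Proposition 7.2 can be read off directly: the negative set consists of the points of negative mass. The following is a minimal sketch with hypothetical names and masses, and it bypasses the limiting construction of the proof, which is unnecessary in the finite case.

```python
def hahn_decomposition(mass):
    """mass maps each point of a finite set X to its signed mass.
    Returns (E, F) with E a negative set, F a positive set, E and F disjoint, E u F = X."""
    E = {x for x, m in mass.items() if m < 0}
    F = set(mass) - E
    return E, F

def signed_measure(mass, A):
    """mu(A) as the sum of the point masses in A."""
    return sum(mass[x] for x in A)

mass = {"p": 3, "q": -2, "r": 0, "s": -5, "t": 1}   # hypothetical signed masses
E, F = hahn_decomposition(mass)
```

Here $\mu(E) = -7$ is the smallest value $\mu$ takes on any subset, matching the role of $L$ in the proof. The decomposition is not unique: the point of mass 0 could be moved to $E$.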

We now are ready for the Radon-Nikodym theorem.
Theorem 7.3. Suppose $\mu$ is a $\sigma$-finite measure and $\nu$ is a finite measure such that $\nu$ is absolutely continuous with respect to $\mu$. There exists a $\mu$-integrable nonnegative function $f$ such that $\nu(A) = \int_A f \, d\mu$ for all $A \in \mathcal{A}$. Moreover, if $g$ is another such function, then $f = g$ almost everywhere.
Proof. Let us first prove the uniqueness assertion. For every set $A$ we have
$$\int_A (f - g) \, d\mu = \nu(A) - \nu(A) = 0.$$
By Proposition 5.8 we have $f - g = 0$ a.e.
Since $\mu$ is $\sigma$-finite, there exist $F_i \uparrow X$ such that $\mu(F_i) < \infty$ for each $i$. Let $\mu_i$ be the restriction of $\mu$ to $F_i$, that is, $\mu_i(A) = \mu(A \cap F_i)$. Define $\nu_i$, the restriction of $\nu$ to $F_i$, similarly. If $f_i$ is a function such that $\nu_i(A) = \int_A f_i \, d\mu_i$ for all $A$, the argument of the first paragraph shows that $f_i = f_j$ on $F_i$ if $i \leq j$. If we define $f$ by $f(x) = f_i(x)$ if $x \in F_i$, we see that $f$ will be the desired function. So it suffices to restrict attention to the case where $\mu$ is finite.
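Let
$$\mathcal{F} = \Big\{g : 0 \leq g, \int_A g \, d\mu \leq \nu(A) \text{ for all } A \in \mathcal{A}\Big\}.$$
$\mathcal{F}$ is not empty because $0 \in \mathcal{F}$. Let $L = \sup\{\int g \, d\mu : g \in \mathcal{F}\}$, and let $g_n$ be a sequence in $\mathcal{F}$ such that $\int g_n \, d\mu \to L$. Let $h_n = \max(g_1, \ldots, g_n)$.
If $g_1$ and $g_2$ are in $\mathcal{F}$, then $h_2 = \max(g_1, g_2)$ is also in $\mathcal{F}$. To see this,
$$\int_A h_2 \, d\mu = \int_{A \cap \{x : g_1(x) \geq g_2(x)\}} h_2 \, d\mu + \int_{A \cap \{x : g_1(x) < g_2(x)\}} h_2 \, d\mu$$
$$= \int_{A \cap \{x : g_1(x) \geq g_2(x)\}} g_1 \, d\mu + \int_{A \cap \{x : g_1(x) < g_2(x)\}} g_2 \, d\mu$$
$$\leq \nu(A \cap \{x : g_1(x) \geq g_2(x)\}) + \nu(A \cap \{x : g_1(x) < g_2(x)\}) = \nu(A).$$
By an induction argument, $h_n$ is in $\mathcal{F}$.
The $h_n$ increase, say to $f$. By the monotone convergence theorem, $\int f \, d\mu = L$ and
$$\int_A f \, d\mu \leq \nu(A) \qquad (7.1)$$
for all $A$.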
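Let $A$ be a set where there is strict inequality in (7.1); let $\varepsilon$ be chosen sufficiently small so that if $\rho$ is defined by
$$\rho(B) = \nu(B) - \int_B f \, d\mu - \varepsilon\mu(B),$$
then $\rho(A) > 0$. $\rho$ is a signed measure; let $F$ be the positive set as constructed in Proposition 7.2. In particular, $\rho(F) > 0$. So for every $B$
$$\int_{B \cap F} f \, d\mu + \varepsilon\mu(B \cap F) \leq \nu(B \cap F).$$
We then have, using (7.1), that
$$\int_B (f + \varepsilon\chi_F) \, d\mu = \int_B f \, d\mu + \varepsilon\mu(B \cap F)$$
$$= \int_{B \cap F^c} f \, d\mu + \int_{B \cap F} f \, d\mu + \varepsilon\mu(B \cap F)$$
$$\leq \nu(B \cap F^c) + \nu(B \cap F) = \nu(B).$$
This says that $f + \varepsilon\chi_F \in \mathcal{F}$. However,
$$L \geq \int (f + \varepsilon\chi_F) \, d\mu = \int f \, d\mu + \varepsilon\mu(F) = L + \varepsilon\mu(F),$$
which implies $\mu(F) = 0$. But then $\nu(F) = 0$, and hence $\rho(F) = 0$, contradicting the fact that $F$ is a positive set for $\rho$ with $\rho(F) > 0$.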
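On a finite set the Radon-Nikodym derivative can be written down explicitly as a ratio of point masses. The sketch below is an illustration that is not in the original notes: the masses are hypothetical, and the value of $f$ on $\mu$-null points is set to 0 by convention, which is harmless because $f$ is only determined almost everywhere.

```python
from fractions import Fraction
from itertools import chain, combinations

mu = {1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(2)}   # hypothetical measure
nu = {1: Fraction(3, 4), 2: Fraction(0), 3: Fraction(1)}   # vanishes wherever mu does

def radon_nikodym(nu, mu):
    """Density f with nu(A) = sum over A of f * mu; f is set to 0 on mu-null points."""
    return {x: (nu[x] / mu[x] if mu[x] != 0 else Fraction(0)) for x in mu}

def integrate(f, mu, A):
    """Integral of f over A with respect to the discrete measure mu."""
    return sum(f[x] * mu[x] for x in A)

f = radon_nikodym(nu, mu)
all_subsets = list(chain.from_iterable(combinations(mu, r) for r in range(len(mu) + 1)))
```

Absolute continuity is what makes the division safe: if $\nu$ charged a point where $\mu$ vanishes, no density could reproduce $\nu$ there.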


8. Differentiation of real-valued functions.
Let $E \subset \mathbb{R}$ be a measurable set and let $\mathcal{O}$ be a collection of intervals. We say $\mathcal{O}$ is a Vitali cover of $E$ if for each $x \in E$ and each $\varepsilon > 0$ there exists an interval $G \in \mathcal{O}$ containing $x$ whose length is less than $\varepsilon$. $m$ will denote Lebesgue measure.
Lemma 8.1. Let $E$ have finite measure and let $\mathcal{O}$ be a Vitali cover of $E$. Given $\varepsilon > 0$ there exists a finite subcollection of disjoint intervals $I_1, \ldots, I_n$ such that $m(E - \cup_{i=1}^n I_i) < \varepsilon$.
Proof. We may replace each interval in $\mathcal{O}$ by a closed one, since the set of endpoints of a finite subcollection will have measure 0.
Let $O$ be an open set of finite measure containing $E$. Since $\mathcal{O}$ is a Vitali cover, we may suppose without loss of generality that each set of $\mathcal{O}$ is contained in $O$. Let $a_1 = \sup\{m(I) : I \in \mathcal{O}\}$. Let $I_1$ be an element of $\mathcal{O}$ with $m(I_1) \geq a_1/2$. Let $a_2 = \sup\{m(I) : I \in \mathcal{O}, I \text{ disjoint from } I_1\}$, and choose $I_2 \in \mathcal{O}$ disjoint from $I_1$ such that $m(I_2) \geq a_2/2$. Continue in this way, choosing $I_{n+1}$ disjoint from $I_1, \ldots, I_n$ and in $\mathcal{O}$ with length at least one half as large as any other such interval in $\mathcal{O}$ that is disjoint from $I_1, \ldots, I_n$.
If the process stops at some finite stage, we are done. If not, we generate a sequence of disjoint intervals $I_1, I_2, \ldots$ Since they are disjoint and all contained in $O$, then $\sum_{i=1}^\infty m(I_i) \leq m(O) < \infty$. So there exists $N$ such that $\sum_{i=N+1}^\infty m(I_i) < \varepsilon/5$.
Let $R = E - \cup_{i=1}^N I_i$; we will show $m(R) < \varepsilon$. Let $J_n$ be the interval with the same center as $I_n$ but five times the length. Let $x \in R$. There exists an interval $I \in \mathcal{O}$ containing $x$ with $I$ disjoint from $I_1, \ldots, I_N$. Since $\sum m(I_n) < \infty$, then $\sum a_n \leq 2 \sum m(I_n) < \infty$, and $a_n \to 0$. So $I$ must either be one of the $I_n$ for some $n > N$ or at least intersect it, for otherwise we would have chosen $I$ at some stage. Let $n$ be the smallest integer such that $I$ intersects $I_n$; note $n > N$. Since $I$ is disjoint from $I_1, \ldots, I_{n-1}$, we have $m(I) \leq a_n \leq 2m(I_n)$. Since $x$ is in $I$ and $I$ intersects $I_n$, the distance from $x$ to the midpoint of $I_n$ is at most $m(I) + m(I_n)/2 \leq (5/2)m(I_n)$. Therefore $x \in J_n$.
Then $R \subset \cup_{i=N+1}^\infty J_i$, so $m(R) \leq \sum_{i=N+1}^\infty m(J_i) = 5 \sum_{i=N+1}^\infty m(I_i) < \varepsilon$.
Given a function $f$, we define the derivates of $f$ at $x$ by
$$D^+ f(x) = \limsup_{h \to 0+} \frac{f(x + h) - f(x)}{h}, \qquad D^- f(x) = \limsup_{h \to 0+} \frac{f(x) - f(x - h)}{h},$$
$$D_+ f(x) = \liminf_{h \to 0+} \frac{f(x + h) - f(x)}{h}, \qquad D_- f(x) = \liminf_{h \to 0+} \frac{f(x) - f(x - h)}{h}.$$
If all the derivates are equal, we say that $f$ is differentiable at $x$ and define $f'(x)$ to be the common value.
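The difference quotients behind the four derivates can be tabulated for a concrete function. The sketch below uses $f(x) = |x|$ at $x = 0$, an example that is not in the original notes, with a hypothetical helper name. In general a finite sample of step sizes only suggests the values of a lim sup or lim inf, but here the quotients do not depend on $h$, so the derivates can be read off exactly.

```python
def difference_quotients(f, x, hs):
    """Right quotients (f(x+h) - f(x))/h and left quotients (f(x) - f(x-h))/h for each h > 0."""
    right = [(f(x + h) - f(x)) / h for h in hs]
    left = [(f(x) - f(x - h)) / h for h in hs]
    return right, left

hs = [2.0 ** -k for k in range(1, 30)]   # step sizes shrinking to 0
right, left = difference_quotients(abs, 0.0, hs)
```

Every right quotient equals 1 and every left quotient equals -1, so both right derivates are 1 and both left derivates are -1. The four derivates are not all equal, and $|x|$ is not differentiable at 0.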
Theorem 8.2. Suppose $f$ is nondecreasing on $[a, b]$. Then $f$ is differentiable almost everywhere, $f'$ is measurable, and $\int_a^b f'(x) \, dx \leq f(b) - f(a)$.
Proof. We will show that the set where any two derivates are unequal has measure zero. We consider the set $E$ where $D^+ f(x) > D_- f(x)$, the other sets being similar. Let $E_{u,v} = \{x : D^+ f(x) > u > v > D_- f(x)\}$. If we show $m(E_{u,v}) = 0$, then taking the union over all pairs of rationals $u > v$ shows $m(E) = 0$.
Let $s = m(E_{u,v})$, let $\varepsilon > 0$, and choose an open set $O$ such that $E_{u,v} \subset O$ and $m(O) < s + \varepsilon$. For each $x \in E_{u,v}$ there exists an arbitrarily small interval $[x - h, x]$ contained in $O$ such that $f(x) - f(x - h) < vh$. Use Lemma 8.1 to choose $I_1, \ldots, I_N$ which are disjoint and whose interiors cover a subset $A$ of $E_{u,v}$ of measure greater than $s - \varepsilon$. Suppose $I_n = [x_n - h_n, x_n]$. Summing over these intervals,
$$\sum_{n=1}^N [f(x_n) - f(x_n - h_n)] < v \sum_{n=1}^N h_n < vm(O) < v(s + \varepsilon).$$

Each point $y \in A$ is the left endpoint of an arbitrarily small interval $(y, y + k)$ that is contained in some $I_n$ and for which $f(y + k) - f(y) > uk$. Using Lemma 8.1 again, we pick out a finite collection $J_1, \ldots, J_M$ whose union contains a subset of $A$ of measure larger than $s - 2\varepsilon$. Summing over these intervals yields
$$\sum_{i=1}^M [f(y_i + k_i) - f(y_i)] > u \sum_{i=1}^M k_i > u(s - 2\varepsilon).$$
Each interval $J_i$ is contained in some interval $I_n$, and if we sum over those $i$ for which $J_i \subset I_n$ we find
$$\sum [f(y_i + k_i) - f(y_i)] \leq f(x_n) - f(x_n - h_n),$$
since $f$ is increasing. Thus
$$\sum_{n=1}^N [f(x_n) - f(x_n - h_n)] \geq \sum_{i=1}^M [f(y_i + k_i) - f(y_i)],$$
and so $v(s + \varepsilon) > u(s - 2\varepsilon)$. This is true for each $\varepsilon$, so $vs \geq us$. Since $u > v$, this implies $s = 0$.
This shows that
\[ g(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \]
is defined almost everywhere and that $f$ is differentiable wherever $g$ is finite. Define $f(x) = f(b)$ if $x \ge b$. Let $g_n(x) = n[f(x + 1/n) - f(x)]$. Then $g_n(x) \to g(x)$ for almost all $x$, and so $g$ is measurable. Since $f$ is increasing, $g_n \ge 0$. By Fatou's lemma
\[ \int_a^b g \le \liminf_n \int_a^b g_n = \liminf_n\, n \int_a^b [f(x + 1/n) - f(x)]\,dx \]
\[ = \liminf_n \Big[ n \int_b^{b+1/n} f - n \int_a^{a+1/n} f \Big] = \liminf_n \Big[ f(b) - n \int_a^{a+1/n} f \Big] \le f(b) - f(a). \]
This shows that $g$ is integrable and hence finite almost everywhere.

A function $f$ is of bounded variation if $\sup\{\sum_{i=1}^k |f(x_i) - f(x_{i-1})|\}$ is finite, where the supremum is over all partitions $a = x_0 < x_1 < \cdots < x_k = b$ of $[a, b]$.
Lemma 8.3. If f is of bounded variation on [a, b], then f can be written as the difference of two nondecreasing functions on [a, b].
Proof. Define
\[ P(y) = \sup\Big\{ \sum_{i=1}^k [f(x_i) - f(x_{i-1})]^+ \Big\}, \qquad N(y) = \sup\Big\{ \sum_{i=1}^k [f(x_i) - f(x_{i-1})]^- \Big\}, \]
where the supremum is over all partitions $a = x_0 < x_1 < \cdots < x_k = y$ for $y \in [a, b]$. Since
\[ \sum_{i=1}^k [f(x_i) - f(x_{i-1})]^+ = \sum_{i=1}^k [f(x_i) - f(x_{i-1})]^- + f(y) - f(a), \]
taking the supremum over all partitions of $[a, y]$ yields
\[ P(y) = N(y) + f(y) - f(a). \]
Clearly $P$ and $N$ are nondecreasing in $y$, and the result follows by solving for $f(y)$.
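The identity used in this proof can be checked on a single partition. The sketch below is an illustration only, with a hypothetical helper `variation_parts` and made-up sample values standing in for $f(x_0), \ldots, f(x_k)$; it computes the positive and negative variation sums over one fixed partition rather than the suprema $P(y)$ and $N(y)$.

```python
def variation_parts(values):
    """Positive and negative variation sums of sampled values over one fixed partition."""
    increments = [b - a for a, b in zip(values, values[1:])]
    pos = sum(d for d in increments if d > 0)    # sum of [f(x_i) - f(x_{i-1})]^+
    neg = sum(-d for d in increments if d < 0)   # sum of [f(x_i) - f(x_{i-1})]^-
    return pos, neg

# Hypothetical values f(x_0), ..., f(x_k) on some partition of [a, y].
samples = [0.0, 2.0, 1.0, 4.0, 3.5]
pos, neg = variation_parts(samples)
net_change = samples[-1] - samples[0]   # f(y) - f(a)
```

Here `pos - neg` equals `net_change`, which is the telescoping identity before the supremum is taken.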

Define the indefinite integral of an integrable function $f$ by
\[ F(x) = \int_a^x f(t)\,dt. \]

Lemma 8.4. If f is integrable, then F is continuous and of bounded variation.


Proof. The continuity follows from the dominated convergence theorem. The bounded variation follows from
\[ \sum_{i=1}^k |F(x_i) - F(x_{i-1})| = \sum_{i=1}^k \Big| \int_{x_{i-1}}^{x_i} f(t)\,dt \Big| \le \sum_{i=1}^k \int_{x_{i-1}}^{x_i} |f(t)|\,dt \le \int_a^b |f(t)|\,dt \]
for all partitions.

Lemma 8.5. If f is integrable and F (x) = 0 for all x, then f = 0 a.e.


Proof. For any interval, $\int_c^d f = \int_a^d f - \int_a^c f = 0$. By dominated convergence and the fact that any open set is the countable union of disjoint open intervals, $\int_O f = 0$ for any open set $O$.
If $E$ is any measurable set, take $O_n$ open such that $\chi_{O_n}$ decreases to $\chi_E$ a.e. By dominated convergence,
\[ \int_E f = \int f \chi_E = \lim \int f \chi_{O_n} = \lim \int_{O_n} f = 0. \]
This with Proposition 5.8 implies $f$ is zero a.e.

Proposition 8.6. If $f$ is bounded and measurable, then $F'(x) = f(x)$ for almost every $x$.
Proof. By Lemma 8.4, $F$ is of bounded variation, and so $F'$ exists a.e. Let $K$ be a bound for $|f|$. If
\[ f_n(x) = \frac{F(x + 1/n) - F(x)}{1/n}, \]
then
\[ f_n(x) = n \int_x^{x+1/n} f(t)\,dt, \]
so $|f_n|$ is also bounded by $K$. Since $f_n \to F'$ a.e., then by dominated convergence,
\[ \int_a^c F'(x)\,dx = \lim \int_a^c f_n(x)\,dx = \lim n \int_a^c [F(x + 1/n) - F(x)]\,dx \]
\[ = \lim \Big[ n \int_c^{c+1/n} F(x)\,dx - n \int_a^{a+1/n} F(x)\,dx \Big] = F(c) - F(a) = \int_a^c f(x)\,dx, \]
using the fact that $F$ is continuous. So $\int_a^c [F'(x) - f(x)]\,dx = 0$ for all $c$, which implies $F' = f$ a.e. by Lemma 8.5.




Theorem 8.7. If $f$ is integrable, then $F' = f$ almost everywhere.

Proof. Without loss of generality we may assume $f \ge 0$. Let $f_n(x) = f(x)$ if $f(x) \le n$ and let $f_n(x) = n$ if $f(x) > n$. Then $f - f_n \ge 0$. If $G_n(x) = \int_a^x [f - f_n]$, then $G_n$ is nondecreasing, and hence has a derivative almost everywhere. By Proposition 8.6, we know the derivative of $\int_a^x f_n$ is equal to $f_n$ almost everywhere. Therefore
\[ F'(x) = G_n'(x) + \Big[ \int_a^x f_n \Big]' \ge f_n(x) \]
a.e. Since $n$ is arbitrary, $F' \ge f$ a.e. So $\int_a^b F' \ge \int_a^b f = F(b) - F(a)$. On the other hand, by Theorem 8.2, $\int_a^b F'(x)\,dx \le F(b) - F(a) = \int_a^b f$. We conclude that $\int_a^b [F' - f] = 0$; since $F' - f \ge 0$, this tells us that $F' = f$ a.e.

A function $f$ is absolutely continuous on $[a, b]$ if given $\varepsilon$ there exists $\delta$ such that $\sum_{i=1}^k |f(x_i') - f(x_i)| < \varepsilon$ whenever $\{(x_i, x_i')\}$ is a finite collection of nonoverlapping intervals with $\sum_{i=1}^k |x_i' - x_i| < \delta$.

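Lemma 8.8. If $F(x) = \int_a^x f(t)\,dt$ for $f$ integrable on $[a, b]$, then $F$ is absolutely continuous.

Proof. Let $\varepsilon > 0$. Choose a simple function $s$ such that $\int_a^b |f - s| < \varepsilon/2$. Let $K$ be a bound for $|s|$ and let $\delta = \varepsilon/2K$. If $\{(x_i, x_i')\}$ is a collection of nonoverlapping intervals, the sum of whose lengths is less than $\delta$, then set $A = \cup_{i=1}^k (x_i, x_i')$ and note $\int_A |f - s| < \varepsilon/2$ and $\int_A |s| < K\delta = \varepsilon/2$. Hence $\sum_{i=1}^k |F(x_i') - F(x_i)| \le \int_A |f| \le \int_A |f - s| + \int_A |s| < \varepsilon$.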

Lemma 8.9. If f is absolutely continuous, then it is of bounded variation.
Proof. Let $\delta$ correspond to $\varepsilon = 1$ in the definition of absolute continuity. Given a partition, add points if necessary so that each subinterval has length at most $\delta$. We can then group the subintervals into at most $K$ collections, each of total length less than $\delta$, where $K$ is an integer larger than $(1 + b - a)/\delta$. So the total variation is then less than $K$.

Lemma 8.10. If $f$ is absolutely continuous on $[a, b]$ and $f'(x) = 0$ a.e., then $f$ is constant.
Proof. Let $c \in [a, b]$, let $E = \{x \in [a, c] : f'(x) = 0\}$, and let $\varepsilon > 0$. For each point $x \in E$ there exist arbitrarily small intervals $[x, x + h] \subset [a, c]$ such that $|f(x + h) - f(x)| < \varepsilon h$. By Lemma 8.1 we can find a finite collection of such intervals that cover all of $E$ except for a set of measure less than $\delta$, where $\delta$ is the $\delta$ in the definition of absolute continuity. If the intervals are $[x_i, y_i]$ with $x_i < y_i \le x_{i+1}$, then $\sum |f(x_{i+1}) - f(y_i)| < \varepsilon$ by the definition of absolute continuity, while $\sum |f(y_i) - f(x_i)| < \varepsilon \sum (y_i - x_i) \le \varepsilon(c - a)$. So adding these two inequalities together,
\[ |f(c) - f(a)| = \Big| \sum [f(x_{i+1}) - f(y_i)] + \sum [f(y_i) - f(x_i)] \Big| \le \varepsilon + \varepsilon(c - a). \]
Since $\varepsilon$ is arbitrary, then $f(c) = f(a)$, which implies that $f$ is constant.

Theorem 8.11. F is an indefinite integral if and only if it is absolutely continuous.


Proof. One direction was Lemma 8.8. Suppose $F$ is absolutely continuous on $[a, b]$. Then $F$ is of bounded variation, $F = F_1 - F_2$ where $F_1$ and $F_2$ are nondecreasing, and $F'$ exists a.e. Since $|F'(x)| \le F_1'(x) + F_2'(x)$, then $\int |F'(x)|\,dx \le F_1(b) + F_2(b) - F_1(a) - F_2(a)$, and so $F'$ is integrable. If $G(x) = \int_a^x F'(t)\,dt$, then $G$ is absolutely continuous by Lemma 8.8, so $F - G$ is absolutely continuous. Then $(F - G)' = 0$ a.e. by Theorem 8.7, and therefore $F - G$ is constant by Lemma 8.10. Thus $F(x) = \int_a^x F'(t)\,dt + F(a)$.


9. Lp spaces.
For $1 \le p < \infty$, define the $L^p$ norm of $f$ by
\[ \|f\|_p = \Big( \int |f(x)|^p\,d\mu \Big)^{1/p}. \]
For $p = \infty$, define the $L^\infty$ norm of $f$ by
\[ \|f\|_\infty = \inf\{M : \mu(\{x : |f(x)| \ge M\}) = 0\}. \]
For $1 \le p \le \infty$ the space $L^p$ is the set $\{f : \|f\|_p < \infty\}$.
The $L^\infty$ norm of a function $f$ is the supremum of $|f|$ provided we disregard sets of measure 0.
It is clear that $\|f\|_p = 0$ if and only if $f = 0$ a.e.

Proposition 9.1. (Hölder's inequality) If $1 < p, q < \infty$ and $p^{-1} + q^{-1} = 1$, then
\[ \int f(x) g(x)\,d\mu \le \|f\|_p \|g\|_q. \]
This also holds if $p = \infty$ and $q = 1$.
Proof. If $M = \|f\|_\infty$, then $\int fg \le M \int |g|$ and the case $p = \infty$ and $q = 1$ follows. So let us assume $1 < p, q < \infty$. If $\|f\|_p = 0$, then $f = 0$ a.e. and $\int fg = 0$, so the result is clear if $\|f\|_p = 0$ and similarly if $\|g\|_q = 0$. Let $F(x) = |f(x)|/\|f\|_p$ and $G(x) = |g(x)|/\|g\|_q$. Note $\|F\|_p = 1$ and $\|G\|_q = 1$, and it suffices to show that $\int FG \le 1$.
The second derivative of the function $e^x$ is again $e^x$, which is positive, and so $e^x$ is convex. Therefore if $0 \le \lambda \le 1$, we have
\[ e^{\lambda a + (1 - \lambda) b} \le \lambda e^a + (1 - \lambda) e^b. \]
If $F(x), G(x) \ne 0$, let $a = p \log F(x)$, $b = q \log G(x)$, $\lambda = 1/p$, and $1 - \lambda = 1/q$. We then obtain
\[ F(x) G(x) \le \frac{F(x)^p}{p} + \frac{G(x)^q}{q}. \]
Clearly this inequality also holds if $F(x) = 0$ or $G(x) = 0$. Integrating,
\[ \int FG \le \frac{\|F\|_p^p}{p} + \frac{\|G\|_q^q}{q} = \frac{1}{p} + \frac{1}{q} = 1. \]
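As a sanity check, Hölder's inequality can be tested numerically in the simplest setting. The sketch below assumes counting measure on a finite set, so that integrals become finite sums; the helper names `p_norm` and `holder_sides` and the sample vectors are hypothetical and not part of the notes.

```python
def p_norm(xs, p):
    """L^p norm with respect to counting measure on a finite index set."""
    return sum(abs(x) ** p for x in xs) ** (1.0 / p)

def holder_sides(f, g, p):
    """Return (sum |f g|, ||f||_p * ||g||_q) with q the conjugate exponent of p > 1."""
    q = p / (p - 1.0)   # solves 1/p + 1/q = 1
    lhs = sum(abs(a * b) for a, b in zip(f, g))
    rhs = p_norm(f, p) * p_norm(g, q)
    return lhs, rhs

# Hypothetical sample data.
f = [1.0, -2.0, 3.0]
g = [0.5, 4.0, -1.0]
lhs, rhs = holder_sides(f, g, 3.0)
```

The left side here is the sum of $|f g|$, which dominates the sum of $f g$, so the check is slightly stronger than the statement of Proposition 9.1.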


One application of Hölder's inequality is to prove Minkowski's inequality, which is simply the triangle inequality for $L^p$.
Proposition 9.2. (Minkowski's inequality) If $1 \le p \le \infty$, then
\[ \|f + g\|_p \le \|f\|_p + \|g\|_p. \]
Proof. Since $|(f + g)(x)| \le |f(x)| + |g(x)|$, integrating gives the case when $p = 1$. The case $p = \infty$ is also easy. So let us suppose $1 < p < \infty$. If $\|f\|_p$ or $\|g\|_p$ is infinite, the result is obvious, so we may assume both are finite. The inequality $(a + b)^p \le 2^p a^p + 2^p b^p$ with $a = |f(x)|$ and $b = |g(x)|$ yields, after an integration,
\[ \int |(f + g)(x)|^p\,d\mu \le 2^p \int |f(x)|^p\,d\mu + 2^p \int |g(x)|^p\,d\mu. \]
So we have $\|f + g\|_p < \infty$. Clearly we may assume $\|f + g\|_p > 0$.
Now write
\[ |f + g|^p \le |f|\,|f + g|^{p-1} + |g|\,|f + g|^{p-1} \]
and apply Hölder's inequality with $q = (1 - p^{-1})^{-1}$. We obtain
\[ \int |f + g|^p \le \|f\|_p \Big( \int |f + g|^{(p-1)q} \Big)^{1/q} + \|g\|_p \Big( \int |f + g|^{(p-1)q} \Big)^{1/q}. \]
Since $p^{-1} + q^{-1} = 1$, then $(p - 1)q = p$, so we have
\[ \|f + g\|_p^p \le \big( \|f\|_p + \|g\|_p \big) \|f + g\|_p^{p/q}. \]
Dividing both sides by $\|f + g\|_p^{p/q}$ and using the fact that $p - (p/q) = 1$ gives us our result.
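The role of the hypothesis $p \ge 1$ can be seen in a small computation. The sketch below again assumes counting measure on a finite set, with a hypothetical helper `p_norm`. It checks the triangle inequality for $p = 3$ and shows that the same expression with exponent $1/2$ fails it, which is a standard observation rather than a claim made in the notes.

```python
def p_norm(xs, p):
    """(sum |x|^p)^(1/p); a norm for p >= 1, only a quasi-norm for 0 < p < 1."""
    return sum(abs(x) ** p for x in xs) ** (1.0 / p)

def triangle_sides(f, g, p):
    """Return (||f + g||_p, ||f||_p + ||g||_p) for finite sequences f and g."""
    total = [a + b for a, b in zip(f, g)]
    return p_norm(total, p), p_norm(f, p) + p_norm(g, p)

# Hypothetical sample data.
ok_lhs, ok_rhs = triangle_sides([1.0, -2.0, 3.0], [0.5, 4.0, -1.0], 3.0)

# With exponent 1/2 the "norm" of (1, 1) is (1 + 1)^2 = 4, while each unit
# vector has "norm" 1, so the triangle inequality fails.
bad_lhs, bad_rhs = triangle_sides([1.0, 0.0], [0.0, 1.0], 0.5)
```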

Minkowski's inequality says that $L^p$ is a normed linear space, provided we identify functions that are equal a.e. The next proposition says that $L^p$ is complete. This is often phrased as saying that $L^p$ is a Banach space, i.e., a complete normed linear space.
Before proving this we need two easy preliminary results. The first is sometimes called Chebyshev's inequality.

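Lemma 9.3. If $1 \le p < \infty$ and $a > 0$,
\[ \mu(\{x : |f(x)| \ge a\}) \le \frac{\|f\|_p^p}{a^p}. \]

Proof. If $A = \{x : |f(x)| \ge a\}$, then
\[ \mu(A) \le \int_A \frac{|f(x)|^p}{a^p}\,d\mu \le \frac{1}{a^p} \int |f|^p\,d\mu. \]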
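Chebyshev's inequality is easy to test in a discrete setting. The sketch below assumes counting measure on a finite set, so that $\mu(A)$ is the number of indices in $A$; the helper name `chebyshev_sides` and the sample values are hypothetical.

```python
def chebyshev_sides(f, a, p):
    """Return (mu({|f| >= a}), ||f||_p^p / a^p) for counting measure, a > 0."""
    measure = sum(1 for x in f if abs(x) >= a)       # size of the level set A
    bound = sum(abs(x) ** p for x in f) / a ** p     # ||f||_p^p / a^p
    return measure, bound

# Hypothetical sample data.
values = [0.1, -3.0, 2.0, 0.5, -1.0]
measure, bound = chebyshev_sides(values, 1.0, 2.0)
```

Three of the five values have absolute value at least 1, and the bound is the sum of squares, so the inequality holds with room to spare in this example.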


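The next lemma is sometimes called the Borel-Cantelli lemma.

Lemma 9.4. If $\sum_j \mu(A_j) < \infty$, then
\[ \mu\big( \cap_{j=1}^\infty \cup_{m=j}^\infty A_m \big) = 0. \]
Proof.
\[ \mu\big( \cap_{j=1}^\infty \cup_{m=j}^\infty A_m \big) = \lim_{j \to \infty} \mu\big( \cup_{m=j}^\infty A_m \big) \le \lim_{j \to \infty} \sum_{m=j}^\infty \mu(A_m) = 0. \]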


Proposition 9.5. If $1 \le p \le \infty$, then $L^p$ is complete.
Proof. We do only the case $p < \infty$; the case $p = \infty$ is easy. Suppose $f_n$ is a Cauchy sequence in $L^p$. Given $\varepsilon = 2^{-(j+1)}$, there exists $n_j$ such that if $n, m \ge n_j$, then $\|f_n - f_m\|_p \le 2^{-(j+1)}$. Without loss of generality we may assume $n_j \ge n_{j-1}$ for each $j$.
Set $n_0 = 0$ and define $f_0 \equiv 0$. If $A_j = \{x : |f_{n_j}(x) - f_{n_{j-1}}(x)| > 2^{-j/2}\}$, then from Lemma 9.3, $\mu(A_j) \le 2^{-jp/2}$. By Lemma 9.4, $\mu(\cap_{j=1}^\infty \cup_{m=j}^\infty A_m) = 0$. So except for a set of measure 0, for each $x$ there is a last $j$ for which $x \in \cup_{m=j}^\infty A_m$, hence a last $j$ for which $x \in A_j$. So for each $x$ (except for the null set) there is a $j_0$ (depending on $x$) such that if $j \ge j_0$, then $|f_{n_j}(x) - f_{n_{j-1}}(x)| \le 2^{-j/2}$.
Set
\[ g_j(x) = \sum_{m=1}^j |f_{n_m}(x) - f_{n_{m-1}}(x)|. \]
$g_j(x)$ increases for each $x$, and the limit is finite for almost every $x$ by the preceding paragraph. Let us call the limit $g(x)$. We have
\[ \|g_j\|_p \le \sum_{m=1}^j 2^{-m} + \|f_{n_1}\|_p \le 2 + \|f_{n_1}\|_p \]
by Minkowski's inequality, and so by Fatou's lemma, $\|g\|_p \le 2 + \|f_{n_1}\|_p < \infty$. We have
\[ f_{n_j}(x) = \sum_{m=1}^j \big( f_{n_m}(x) - f_{n_{m-1}}(x) \big). \]
Suppose $x$ is not in the null set where $g(x)$ is infinite. Since $|f_{n_j}(x) - f_{n_k}(x)| \le |g_j(x) - g_k(x)| \to 0$ as $j, k \to \infty$, then $f_{n_j}(x)$ is a Cauchy sequence (in $\mathbb{R}$), and hence converges, say to $f(x)$. We have $\|f - f_{n_j}\|_p = \lim_{m \to \infty} \|f_{n_m} - f_{n_j}\|_p$; this follows by dominated convergence with the function $g$ defined above as the dominating function.
Since the right hand side is at most $2^{-(j+1)}$, we have thus shown that $\|f - f_{n_j}\|_p \to 0$. Given $\varepsilon = 2^{-(j+1)}$, if $m \ge n_j$, then $\|f - f_m\|_p \le \|f - f_{n_j}\|_p + \|f_m - f_{n_j}\|_p$. This shows that $f_m$ converges to $f$ in $L^p$ norm.

The following is very useful.

Proposition 9.6. For $1 < p < \infty$ and $p^{-1} + q^{-1} = 1$,
\[ \|f\|_p = \sup\Big\{ \int fg : \|g\|_q \le 1 \Big\}. \qquad (9.1) \]
When $p = 1$ (9.1) holds if we take $q = \infty$, and if $p = \infty$ (9.1) holds if we take $q = 1$.


Proof. The right hand side of (9.1) is less than or equal to the left hand side by Hölder's inequality. So we need only show that the right hand side is greater than or equal to the left hand side.
First suppose $p = 1$. Take $g(x) = \operatorname{sgn} f(x)$, where $\operatorname{sgn} a$ is 1 if $a > 0$, is 0 if $a = 0$, and is $-1$ if $a < 0$. Then $g$ is bounded by 1 and $fg = |f|$. This takes care of the case $p = 1$.
Next suppose $p = \infty$. Since $\mu$ is $\sigma$-finite, there exist sets $F_n$ increasing up to $X$ such that $\mu(F_n) < \infty$ for each $n$. If $M = \|f\|_\infty$, let $a$ be any finite real less than $M$. By the definition of $L^\infty$ norm, the measure of $A = \{x \in F_n : |f(x)| > a\}$ must be positive if $n$ is sufficiently large. Let $g(x) = (\operatorname{sgn} f(x)) \chi_A(x)/\mu(A)$. Then the $L^1$ norm of $g$ is 1 and $\int fg = \int_A |f|/\mu(A) \ge a$. Since $a$ is arbitrary, the supremum on the right hand side must be $M$.
Now suppose $1 < p < \infty$. We may suppose $\|f\|_p > 0$. Let $q_n$ be a sequence of nonnegative simple functions increasing to $f^+$, $r_n$ a sequence of nonnegative simple functions increasing to $f^-$, and $s_n(x) = (q_n(x) - r_n(x)) \chi_{F_n}(x)$. Then $s_n(x) \to f(x)$ for each $x$, $|s_n(x)| \le |f(x)|$ for each $x$, $s_n$ is a simple function, and $\|s_n\|_p < \infty$ for each $n$. If $f \in L^p$, then $\|s_n\|_p \to \|f\|_p$ by dominated convergence. If $\int |f|^p = \infty$, then $\int |s_n|^p \to \infty$ by monotone convergence. For $n$ sufficiently large, $\|s_n\|_p > 0$.
Let
\[ g_n(x) = (\operatorname{sgn} f(x)) \frac{|s_n(x)|^{p-1}}{\|s_n\|_p^{p/q}}. \]
Since $(p - 1)q = p$, then
\[ \|g_n\|_q = \frac{\big( \int |s_n|^{(p-1)q} \big)^{1/q}}{\|s_n\|_p^{p/q}} = \frac{\|s_n\|_p^{p/q}}{\|s_n\|_p^{p/q}} = 1. \]
On the other hand, since $|f| \ge |s_n|$,
\[ \int f g_n = \frac{\int |f|\,|s_n|^{p-1}}{\|s_n\|_p^{p/q}} \ge \frac{\int |s_n|^p}{\|s_n\|_p^{p/q}} = \|s_n\|_p^{p - (p/q)}. \]
Since $p - (p/q) = 1$, then $\int f g_n \ge \|s_n\|_p$, which tends to $\|f\|_p$.
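The extremal function used in this proof can be written down explicitly in a finite setting. The sketch below assumes counting measure on a finite set and a nonzero $f$, in which case one may take $s_n = f$; the helper names `p_norm` and `extremal_g` are hypothetical. It builds $g = (\operatorname{sgn} f)\,|f|^{p-1}/\|f\|_p^{p/q}$ and checks that $\|g\|_q = 1$ and $\sum fg = \|f\|_p$.

```python
def p_norm(xs, p):
    """L^p norm with respect to counting measure on a finite index set."""
    return sum(abs(x) ** p for x in xs) ** (1.0 / p)

def extremal_g(f, p):
    """Return (g, q) with g = sgn(f) |f|^(p-1) / ||f||_p^(p/q); assumes f is not all zero."""
    q = p / (p - 1.0)                 # conjugate exponent
    scale = p_norm(f, p) ** (p / q)   # normalizing constant
    sgn = lambda x: (x > 0) - (x < 0)
    return [sgn(x) * abs(x) ** (p - 1) / scale for x in f], q

# Hypothetical sample data.
f = [1.0, -2.0, 3.0]
g, q = extremal_g(f, 3.0)
pairing = sum(a * b for a, b in zip(f, g))   # plays the role of the integral of f g
```

In this finite case the supremum in (9.1) is attained by `g`, so no limiting argument is needed.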

The above proof also establishes


Corollary 9.7. For $1 < p < \infty$ and $p^{-1} + q^{-1} = 1$,
\[ \|f\|_p = \sup\Big\{ \int fg : \|g\|_q \le 1,\ g \text{ simple} \Big\}. \]
The space $L^p$ is a normed linear space. We can thus talk about its dual, namely, the set of bounded linear functionals on $L^p$. The dual of a space $Y$ is denoted $Y^*$. If $H$ is a bounded linear functional on $L^p$, we define the norm of $H$ to be $\|H\| = \sup\{H(f) : \|f\|_p \le 1\}$.
Theorem 9.8. If $1 < p < \infty$ and $p^{-1} + q^{-1} = 1$, then $(L^p)^* = L^q$.
Proof. If $g \in L^q$, then setting $H(f) = \int fg$ for $f \in L^p$ yields a bounded linear functional; the boundedness follows from Hölder's inequality. Moreover, from Hölder's inequality and Proposition 9.6 we see that $\|H\| = \|g\|_q$.

Now suppose we are given a bounded linear functional $H$ on $L^p$ and we must show there exists $g \in L^q$ such that $H(f) = \int fg$. First suppose $\mu(X) < \infty$. Define $\nu(A) = H(\chi_A)$. If $A$ and $B$ are disjoint, then
\[ \nu(A \cup B) = H(\chi_{A \cup B}) = H(\chi_A + \chi_B) = H(\chi_A) + H(\chi_B) = \nu(A) + \nu(B). \]
To show $\nu$ is countably additive, it suffices to show that if $A_n \uparrow A$, then $\nu(A_n) \to \nu(A)$. But if $A_n \uparrow A$, then $\chi_{A_n} \to \chi_A$ in $L^p$, and so $\nu(A_n) = H(\chi_{A_n}) \to H(\chi_A) = \nu(A)$; we use here the fact that $\mu(X) < \infty$. Therefore $\nu$ is a countably additive signed measure. Moreover, if $\mu(A) = 0$, then $\chi_A = 0$ a.e., hence $\nu(A) = H(\chi_A) = 0$. By writing $\nu = \nu^+ - \nu^-$ and using the Radon-Nikodym theorem for both the positive and negative parts, we see there exists an integrable $g$ such that $\nu(A) = \int_A g$ for all sets $A$. If $s = \sum a_i \chi_{A_i}$ is a simple function, by linearity we have
\[ H(s) = \sum a_i H(\chi_{A_i}) = \sum a_i \nu(A_i) = \sum a_i \int g \chi_{A_i} = \int gs. \]
By Corollary 9.7,
\[ \|g\|_q = \sup\Big\{ \int gs : \|s\|_p \le 1,\ s \text{ simple} \Big\} \le \sup\{H(s) : \|s\|_p \le 1\} \le \|H\|. \]
If $s_n$ are simple functions tending to $f$ in $L^p$, then $H(s_n) \to H(f)$, while by Hölder's inequality $\int s_n g \to \int fg$. We thus have $H(f) = \int fg$ for all $f \in L^p$, and $\|g\|_q \le \|H\|$. By Hölder's inequality, $\|H\| \le \|g\|_q$.
In the case where $\mu$ is $\sigma$-finite, but not finite, let $F_n \uparrow X$ be such that $\mu(F_n) < \infty$ for each $n$. Define functionals $H_n$ by $H_n(f) = H(f \chi_{F_n})$. Clearly each $H_n$ is a bounded linear functional on $L^p$. Applying the above argument, we see there exist $g_n$ such that $H_n(f) = \int f g_n$ and $\|g_n\|_q = \|H_n\| \le \|H\|$. It is easy to see that $g_n(x)$ is 0 if $x \notin F_n$. Moreover, by the uniqueness part of the Radon-Nikodym theorem, if $n > m$, then $g_n = g_m$ on $F_m$. Define $g$ by setting $g(x) = g_n(x)$ if $x \in F_n$. Then $g$ is well defined. By Fatou's lemma, $g$ is in $L^q$ with a norm bounded by $\|H\|$. Since $f \chi_{F_n} \to f$ in $L^p$ by dominated convergence, then $H_n(f) = H(f \chi_{F_n}) \to H(f)$, since $H$ is a bounded linear functional on $L^p$. On the other hand $H_n(f) = \int_{F_n} f g_n = \int_{F_n} fg \to \int fg$ by dominated convergence. So $H(f) = \int fg$. Again by Hölder's inequality $\|H\| \le \|g\|_q$.


References.
1. G.B. Folland, Real analysis: modern techniques and their applications, New York, Wiley, 1984.
2. H.L. Royden, Real analysis, New York, Macmillan, 1963.
3. W. Rudin, Real and complex analysis, New York, McGraw-Hill, 1966.
