Lecture Notes


B4.3 Distribution Theory and Fourier Analysis: An Introduction∗

Jan Kristensen

Michaelmas Term 2018

1 Lecture 1

1.1 Why Distributions?

There are many reasons to study distributions, but most of them are only really appreciated
after the fact. Many physical quantities are naturally not defined pointwise. For instance,
being able to measure temperature at a given point in space and time is an idealization – see
the discussion in R.S. Strichartz’s A Guide to Distribution Theory and Fourier Transforms, §1.
Similarly, in the theory of Lebesgue integration discussed in the Part A Integration course
you encountered $L^p$ functions, and these too are not really defined uniquely everywhere, but
only almost everywhere. Strictly speaking they are not even functions, but equivalence
classes of functions under the equivalence relation of being equal almost everywhere. Nonetheless, for
$f \in L^p(\mathbb{R}^n)$ and each measurable subset $A \subset \mathbb{R}^n$ the integral
\[ \int_A f(x)\,dx \]
is well-defined and does not depend on the representative used to calculate the integral. Note
that if we know that $f$ is continuous, then the integrals $\int_A f(x)\,dx$ for measurable subsets $A$
of $\mathbb{R}^n$ with, say, $\mathcal{L}^n(A) < \infty$, determine $f(x)$ uniquely for all $x \in \mathbb{R}^n$. Specifically, we have
\[ \frac{1}{\mathcal{L}^n(B_r(x_0))} \int_{B_r(x_0)} f(x)\,dx \to f(x_0) \]
as $r \to 0^+$ for all $x_0 \in \mathbb{R}^n$. For a general $L^p$ function $f$, knowing the values of
\[ \langle f, 1_A \rangle := \int_A f(x)\,dx \]
for all measurable $A \subset \mathbb{R}^n$ with $\mathcal{L}^n(A) < \infty$ determines $f(x)$ uniquely almost everywhere (and
so uniquely as an $L^p$ function). Here
\[ 1_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A \end{cases} \]
acts as a test function, or measurement of $f$. It turns out that taking very nice test functions
here is a good idea that allows us to extend aspects of differential calculus to $L^p$ functions and
beyond. This leads to the theory of distributions. But why should we bother?

∗ Notes typed by Grigalius Taujanskas. Thanks to Edward Hart for helpful comments.

1.1.1 One Dimensional Wave Equation

The equation
\[ \frac{\partial^2 u}{\partial t^2}(x, t) = k^2 \frac{\partial^2 u}{\partial x^2}(x, t) \tag{1} \]
can be used to model a vibrating string. A function given by
\[ u(x, t) = f(x - kt), \]
where f is a function of one variable, represents a travelling wave with shape f (x) moving to
the right with velocity k. When f is twice differentiable, one can check that u is a solution
to (1). However, there is no physical reason for the shape of the travelling wave to be twice
differentiable. For instance, the triangular profile

[Figure: a triangular wave profile with an arrow pointing to the right]

moving with speed k to the right is perfectly fine! We do not want to throw away physically
meaningful solutions because of technicalities. Looking at the example above, one could think
that if we accepted as solutions to differential equations any function that satisfies the differ-
ential equation except for some points (finitely many, say), where it fails to be differentiable,
then all would be fine. But this is too simplistic and does not work, as the next example shows.
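
For a twice differentiable profile $f$, the check that $u(x,t) = f(x - kt)$ solves (1) is a direct application of the chain rule. Spelled out (this computation is added for illustration only):

```latex
\begin{align*}
\frac{\partial u}{\partial t}(x,t) &= -k\, f'(x - kt), &
\frac{\partial^2 u}{\partial t^2}(x,t) &= k^2 f''(x - kt), \\
\frac{\partial u}{\partial x}(x,t) &= f'(x - kt), &
\frac{\partial^2 u}{\partial x^2}(x,t) &= f''(x - kt),
\end{align*}
% hence u_tt = k^2 u_xx wherever f'' exists
```

For the triangular profile, $f''$ fails to exist at the corners, which is exactly where this classical computation breaks down.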

1.1.2 Laplace’s Equation

On $\mathbb{R}^n$ Laplace's equation is
\[ \Delta u := \frac{\partial^2 u}{\partial x_1^2} + \cdots + \frac{\partial^2 u}{\partial x_n^2} = 0. \tag{2} \]
For $n = 2$ or $n = 3$ a solution to the above equation has the physical interpretation of an
electric potential in a region with no external charges. From physical experience we know that
such potentials should be smooth. However, as you may have seen last year,
\[ u = G_2(x_1, x_2) := \log\left(x_1^2 + x_2^2\right) \]
and
\[ u = G_3(x_1, x_2, x_3) := \left(x_1^2 + x_2^2 + x_3^2\right)^{-\frac{1}{2}} \]
are solutions in $\mathbb{R}^2 \setminus \{(0, 0)\}$ and in $\mathbb{R}^3 \setminus \{(0, 0, 0)\}$ respectively. Clearly neither can be extended
to the origin in a smooth manner, and so these should not be considered as solutions on the
full space.
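
The claim that $G_2$ and $G_3$ solve Laplace's equation away from the origin can be sanity-checked numerically. The following is a minimal sketch, not part of the original notes: the helper `laplacian`, the step size and the sample points are illustrative choices, and a central-difference quotient only approximates the Laplacian.

```python
import math

def G2(x1, x2):
    return math.log(x1 * x1 + x2 * x2)

def G3(x1, x2, x3):
    return (x1 * x1 + x2 * x2 + x3 * x3) ** -0.5

def laplacian(u, point, h=1e-3):
    """Second-order central-difference approximation of the Laplacian of u at point."""
    total = 0.0
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        total += (u(*forward) - 2.0 * u(*point) + u(*backward)) / (h * h)
    return total

# Sample points away from the origin (hypothetical choices)
lap_G2 = laplacian(G2, (1.0, 2.0))
lap_G3 = laplacian(G3, (1.0, 2.0, -1.0))
print(lap_G2, lap_G3)  # both should be close to zero
```

Such a pointwise check says nothing about what happens at the origin, which is precisely the information the distributional Laplacian captures.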
Distribution theory allows us, among many other things, to distinguish between the case of
the one dimensional wave equation (1) and Laplace's equation (2). Indeed, the travelling wave
$u(x, t) = f(x - kt)$ satisfies the one dimensional wave equation in the sense of distributions
for any continuous profile $f$, while
\[ \Delta G_2 = c_2 \delta_0 \]
and
\[ \Delta G_3 = c_3 \delta_0 \]
as distributions, where $c_2$, $c_3$ are constants and $\delta_0$ is Dirac's delta function, a distribution.
We must spend some time developing the notion of a test function before we can define
the notion of a distribution. It is worth mentioning that there is not just one class of test
functions. In this course we will define two classes of test functions, the compactly supported
ones and the Schwartz ones. But there are in fact many others, and, depending on the context,
they can sometimes be very useful. To each class of test functions there corresponds a class of
distributions. The principle to keep in mind here is that the nicer (smoother) the test functions
are, the wilder (rougher) the corresponding distributions are allowed to be.

1.2 Test Functions I

Let Ω be a non-empty open subset of Rn . We denote by

C(Ω) ..= {u : Ω → R : u is continuous}

and sometimes say that such functions are C 0 functions, and write C 0 (Ω) = C(Ω). We will
use the same notation for complex-valued functions, which will be clear from context or will
be stated explicitly. Similarly, for k ∈ N we define

C k (Ω) ..= {u : Ω → R : u is k times continuously differentiable}.

That is, u ∈ C k (Ω) (or u is a C k function) if and only if u and all its partial derivatives up to
and including order k are continuous on Ω.
Example 1.1. The function $u$ is $C^1$ if and only if $u, \frac{\partial u}{\partial x_1}, \dots, \frac{\partial u}{\partial x_n}$ are all continuous on $\Omega$. Note
in particular that we require, for example, $\frac{\partial u}{\partial x_1}$ to be jointly continuous in $x = (x_1, x_2, \dots, x_n)$.

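Lemma 1.2. If $u : \Omega \to \mathbb{R}$ is $C^2$, then
\[ \frac{\partial^2 u}{\partial x_j \partial x_k} = \frac{\partial^2 u}{\partial x_k \partial x_j} \]
on $\Omega$. Hence the order in which we take (two) partial derivatives is unimportant for $C^2$ functions.

Proof. Let $\{e_j\}_{j=1}^n$ be the standard basis for $\mathbb{R}^n$ and denote by
\[ \triangle_h u(x) := u(x + h) - u(x) \]
for $x, x + h \in \Omega$. Observe that $\triangle_{s e_j} \triangle_{r e_k} u = \triangle_{r e_k} \triangle_{s e_j} u$ for $s, r \in \mathbb{R}$. Because $u$ is $C^2$, applying
the Fundamental Theorem of Calculus twice we have, for $s, r \neq 0$ small,
\[ \frac{1}{sr} \triangle_{s e_j} \triangle_{r e_k} u(x) = \int_0^1 \int_0^1 \frac{\partial^2 u}{\partial x_j \partial x_k}(x + \sigma s e_j + \rho r e_k)\, d\sigma\, d\rho, \]
and hence
\[ \frac{1}{sr} \triangle_{s e_j} \triangle_{r e_k} u(x) - \frac{\partial^2 u}{\partial x_j \partial x_k}(x) \to 0 \]
as $(r, s) \to (0, 0)$. Interchanging the roles of $j$ and $k$ and using the symmetry of the double difference gives the claim.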

We can extend this result to C k functions for k > 2 by induction, and so for such functions
we do not have to worry about the order in which we partially differentiate. When there are
many independent variables we shall often rely on multi-index notation.

1.2.1 Multi-index Notation

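A multi-index $\alpha$ is an $n$-tuple of non-negative integers, $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}_0^n$. The length (or
order) of $\alpha$ is
\[ |\alpha| := \alpha_1 + \cdots + \alpha_n. \]
If $\alpha, \beta \in \mathbb{N}_0^n$, then also $\alpha + \beta \in \mathbb{N}_0^n$. When $u \in C^k(\Omega)$ and $|\alpha| \leq k$, we write
\[ D^\alpha u(x) := \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}(x) \]
and by convention set $D^0 u(x) := u(x)$.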


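Example 1.3. For $\alpha = (1, 2)$, $\beta = (0, 2)$ and $u \in C^3(\Omega)$, where $\Omega \subset \mathbb{R}^2$,
\[ D^\alpha u = \frac{\partial^3 u}{\partial x_1 \partial x_2^2}, \qquad D^\beta u = \frac{\partial^2 u}{\partial x_2^2}. \]
When $u \in C^5(\Omega)$,
\[ D^{\alpha + \beta} u = \frac{\partial^5 u}{\partial x_1 \partial x_2^4}. \]
Note that by Lemma 1.2, $D^{\alpha+\beta} u = D^\alpha(D^\beta u) = D^\beta(D^\alpha u) = D^{\beta+\alpha} u$. In a sense, Lemma 1.2
justifies using multi-index notation for partial derivatives.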
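
The bookkeeping in Example 1.3 can be mirrored with plain tuples. This is only an illustrative sketch; the helper names `order` and `add` are hypothetical and not notation from the notes.

```python
def order(alpha):
    """Length |alpha| = alpha_1 + ... + alpha_n of a multi-index."""
    return sum(alpha)

def add(alpha, beta):
    """Componentwise sum alpha + beta of two multi-indices of equal dimension."""
    return tuple(a + b for a, b in zip(alpha, beta))

alpha, beta = (1, 2), (0, 2)
total = add(alpha, beta)
# |alpha|, |beta|, alpha + beta and |alpha + beta| for Example 1.3
print(order(alpha), order(beta), total, order(total))
```

The componentwise sum corresponds to composing the derivatives, and the order tells us how many derivatives of $u$ must exist, here five.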

Note that $(C^k(\Omega))_k$ is a descending sequence of sets, $C^{k+1}(\Omega) \subsetneq C^k(\Omega)$. We define
\[ C^\infty(\Omega) := \bigcap_{k=0}^{\infty} C^k(\Omega), \]

the class of infinitely differentiable functions on Ω. Under the natural pointwise definitions of
addition, multiplication by scalars, and multiplication, these classes form commutative rings
with unity and vector spaces (over R or C).

1.2.2 Support of a Continuous Function

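For $u \in C(\Omega)$ we define the support of $u$ as
\[ \operatorname{supp}(u) := \Omega \cap \overline{\{x \in \Omega : u(x) \neq 0\}}, \]
that is, the closure of the set $\{u \neq 0\}$ relative to $\Omega$. As such, $\operatorname{supp}(u)$ is closed in $\Omega$, but need
not be closed in $\mathbb{R}^n$.

Example 1.4. Define $u_1 : \mathbb{R} \to \mathbb{R}$ by
\[ u_1(x) = \begin{cases} 1 - |x|, & |x| < 1, \\ 0, & |x| \geq 1. \end{cases} \]
Then $\operatorname{supp}(u_1) = [-1, 1]$.

If instead we consider the restriction of $u_1$ to $\Omega = (-1, 1)$, that is $u_2(x) = 1 - |x|$, $x \in (-1, 1)$,
then $\operatorname{supp}(u_2) = (-1, 1)$.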

One sees that the support of a function u depends on the ambient set Ω, and we could
instead write suppΩ (u). However, for our purposes it will suffice to write supp(u), where Ω
will be understood from context.
We shall be particularly interested in having compact support. Recall that a set K is
compact if and only if any open cover of K admits a finite subcover. Also recall that in Rn ,
the Heine–Borel theorem tells us that K is compact if and only if K is closed and bounded.
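Note that $K \subset \Omega$ is compact in $\Omega$ if and only if $K$ is compact in $\mathbb{R}^n$. Also, when $K \subset \Omega$ is
compact, then $\operatorname{dist}(K, \partial\Omega) := \inf_{x \in K,\, y \in \partial\Omega} |x - y| > 0$.

Definition 1.5. Let $\Omega$ be a non-empty open subset of $\mathbb{R}^n$. Then
\[ \mathcal{D}(\Omega) := \{u \in C^\infty(\Omega) : \operatorname{supp}(u) \text{ is compact}\} \]
is the class of smooth compactly supported test functions.

Remark 1.6. Note that for $u \in \mathcal{D}(\Omega)$, $u$ not identically zero, we have $\operatorname{dist}(\operatorname{supp}(u), \partial\Omega) > 0$.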

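We also write
\[ C_c^k(\Omega) := \{u \in C^k(\Omega) : \operatorname{supp}(u) \text{ is compact}\} \]
for $k \in \mathbb{N}_0 \cup \{\infty\}$. So in fact $\mathcal{D}(\Omega) = C_c^\infty(\Omega)$. As before, we can define ring operations in
the standard way, making $C_c^k(\Omega)$ and $\mathcal{D}(\Omega)$ into commutative rings (without unity) and vector
spaces (over $\mathbb{R}$ or $\mathbb{C}$).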
We have defined a test function to be any smooth and compactly supported function, but
so far we have seen no example. Actually, are there any smooth compactly supported test
functions other than the trivial one φ ≡ 0?

2 Lecture 2

2.1 Bump Functions

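Lemma 2.1. The function
\[ \phi(x) = \begin{cases} \exp\left(\dfrac{1}{x^2 - 1}\right), & |x| < 1, \\ 0, & |x| \geq 1 \end{cases} \]
is in $C^\infty(\mathbb{R})$ with $\operatorname{supp}(\phi) = [-1, 1]$. In particular, $\phi \in \mathcal{D}(\mathbb{R})$.

Sketch of proof. One shows by induction on $k$ that
\[ \phi^{(k)}(x) = \frac{p_k(x)}{(x^2 - 1)^{2k}} \exp\left(\frac{1}{x^2 - 1}\right) \]
for $|x| < 1$, where $p_k(x)$ is a polynomial. One deduces that $\phi \in C^k(\mathbb{R})$ for every $k$. The rest is
left as an exercise.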
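
To get a feeling for how flat $\phi$ is near $\pm 1$, here is a small numerical sketch of the bump function and its first derivative (the case $k = 1$ of the formula above, with $p_1(x) = -2x$). The function names `phi` and `dphi` are illustrative choices.

```python
import math

def phi(x):
    """Bump function of Lemma 2.1: exp(1/(x^2 - 1)) inside (-1, 1), zero outside."""
    if abs(x) < 1.0:
        return math.exp(1.0 / (x * x - 1.0))
    return 0.0

def dphi(x):
    """First derivative for |x| < 1: -2x / (x^2 - 1)^2 * phi(x); zero outside."""
    if abs(x) < 1.0:
        return -2.0 * x / (x * x - 1.0) ** 2 * phi(x)
    return 0.0

# Compare the closed-form derivative with a central difference quotient at x = 0.5
h = 1e-6
quotient = (phi(0.5 + h) - phi(0.5 - h)) / (2.0 * h)
print(phi(0.0), phi(0.999), phi(1.0), dphi(0.5), quotient)
```

The value at $x = 0.999$ is positive but astronomically small, which reflects that every derivative of $\phi$ tends to zero at the endpoints of the support.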

Example 2.2. Let $B_r(x_0) := \{x \in \mathbb{R}^n : |x - x_0| < r\}$ and put
\[ \varphi(x) := \phi\left(\frac{|x - x_0|^2}{r^2}\right), \quad x \in \mathbb{R}^n. \]
By the Chain Rule, we see that $\varphi \in C^\infty(\mathbb{R}^n)$. Clearly, $\operatorname{supp}(\varphi) = \overline{B_r(x_0)}$, and so $\varphi \in \mathcal{D}(\mathbb{R}^n)$.

We can do much more using $\phi$ as a building block. We shall use the operation of convolution;
recall that when $f, g \in L^1(\mathbb{R}^n)$, then
\[ (f * g)(x) := \int_{\mathbb{R}^n} f(x - y) g(y)\, dy \]
is well-defined for almost every $x \in \mathbb{R}^n$, and $f * g \in L^1(\mathbb{R}^n)$. Furthermore, $f * g = g * f$ almost
everywhere.
2.1.1 The Standard Mollifier in Rn

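Put
\[ c_n := \int_{\mathbb{R}^n} \phi\left(|x|^2\right) dx = \omega_{n-1} \int_0^\infty \phi\left(r^2\right) r^{n-1}\, dr > 0, \]
where $\omega_{n-1}$ is the surface area of $\mathbb{S}^{n-1}$. It can be shown that
\[ \omega_{n-1} = n \mathcal{L}^n(B_1(0)) = \frac{2\pi^{\frac{n}{2}}}{\Gamma\left(\frac{n}{2}\right)}. \]
We only need to know that $0 < \omega_{n-1} < \infty$; the actual value is not important here. Put
\[ \rho(x) := \frac{1}{c_n} \phi\left(|x|^2\right), \quad x \in \mathbb{R}^n. \]
Then $\rho \in \mathcal{D}(\mathbb{R}^n)$, $\rho \geq 0$, $\operatorname{supp}(\rho) = \overline{B_1(0)}$ and
\[ \int_{\mathbb{R}^n} \rho(x)\, dx = 1. \]
For each $\varepsilon > 0$ we put
\[ \rho_\varepsilon(x) := \varepsilon^{-n} \rho\left(\frac{x}{\varepsilon}\right), \quad x \in \mathbb{R}^n. \]
Then $\rho_\varepsilon \in \mathcal{D}(\mathbb{R}^n)$, $\rho_\varepsilon \geq 0$, $\operatorname{supp}(\rho_\varepsilon) = \overline{B_\varepsilon(0)}$ and
\[ \int_{\mathbb{R}^n} \rho_\varepsilon(x)\, dx = 1. \]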

Definition 2.3. We call the family of functions (ρε )ε>0 the standard mollifier on Rn .
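
In one dimension the construction can be reproduced numerically. The sketch below approximates $c_1$ with a midpoint rule and checks that each $\rho_\varepsilon$ has unit mass; the helper `midpoint`, the number of nodes and the chosen values of $\varepsilon$ are assumptions made for illustration only.

```python
import math

def phi(t):
    return math.exp(1.0 / (t * t - 1.0)) if abs(t) < 1.0 else 0.0

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# n = 1: c_1 is the integral of phi(|x|^2), which vanishes outside (-1, 1)
c1 = midpoint(lambda x: phi(x * x), -1.0, 1.0)

def rho(x):
    return phi(x * x) / c1

def rho_eps(x, eps):
    # eps^{-n} rho(x / eps) with n = 1
    return rho(x / eps) / eps

masses = {eps: midpoint(lambda x, e=eps: rho_eps(x, e), -eps, eps)
          for eps in (1.0, 0.1, 0.01)}
print(c1, masses)
```

The factor $\varepsilon^{-n}$ is exactly what compensates for the shrinking support, so the mass stays equal to one while the peak grows like $\varepsilon^{-n}$.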
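Proposition 2.4. Let $1 \leq p < \infty$ and $u \in L^p(\Omega)$. Define $u$ to be zero outside $\Omega$. Then

(i) $\rho_\varepsilon * u \in C^\infty(\Omega)$,

(ii) $\|\rho_\varepsilon * u\|_p \leq \|u\|_p$, and

(iii) $\|\rho_\varepsilon * u - u\|_p \to 0$ as $\varepsilon \to 0^+$.

We require the following auxiliary result for the proof.

Lemma 2.5. Let $1 \leq p \leq \infty$, $\varphi \in \mathcal{D}(\Omega)$, and $u \in L^p(\Omega)$. Define $u$ to be zero outside of $\Omega$.
Then $\varphi * u \in C^1(\Omega)$ and for each $1 \leq j \leq n$,
\[ \frac{\partial(\varphi * u)}{\partial x_j} = \left(\frac{\partial \varphi}{\partial x_j}\right) * u. \]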

Proof. This is a simple application of the Dominated Convergence Theorem. We omit the
details.

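Proof of Proposition 2.4. Part (i) follows by applying Lemma 2.5 inductively. For part (ii),
we use Hölder's inequality. Let
\[ \frac{1}{p} + \frac{1}{q} = 1 \]
and write for each $x$ and almost every $y$,
\[ |\rho_\varepsilon(x - y) u(y)| = \rho_\varepsilon(x - y)^{\frac{1}{q}} \rho_\varepsilon(x - y)^{\frac{1}{p}} |u(y)|. \]
Integrating over $y \in \mathbb{R}^n$ and using Hölder's inequality,
\begin{align*}
\int_{\mathbb{R}^n} |\rho_\varepsilon(x - y) u(y)|\, dy
&\leq \left(\int_{\mathbb{R}^n} \rho_\varepsilon(x - y)\, dy\right)^{\frac{1}{q}} \left(\int_{\mathbb{R}^n} \rho_\varepsilon(x - y) |u(y)|^p\, dy\right)^{\frac{1}{p}} \\
&= \left(\int_{\mathbb{R}^n} \rho_\varepsilon(x - y) |u(y)|^p\, dy\right)^{\frac{1}{p}}.
\end{align*}
Integrating over $x \in \Omega$,
\begin{align*}
\int_\Omega |(\rho_\varepsilon * u)(x)|^p\, dx
&\leq \int_\Omega \int_{\mathbb{R}^n} \rho_\varepsilon(x - y) |u(y)|^p\, dy\, dx \\
&\overset{\dagger}{=} \int_{\mathbb{R}^n} |u(y)|^p \int_\Omega \rho_\varepsilon(x - y)\, dx\, dy \\
&\leq \int_{\mathbb{R}^n} |u(y)|^p \int_{\mathbb{R}^n} \rho_\varepsilon(x - y)\, dx\, dy = \|u\|_p^p,
\end{align*}
where in $\dagger$ we used Fubini–Tonelli. For (iii), let $\tau > 0$ and take $v \in C_c(\Omega)$ such that $\|u - v\|_p \leq \tau$
(that this is possible follows from the way we defined the integral in Part A Integration; you
might want to write out the details of this). By uniform continuity of $v$, we can find an $\varepsilon_0 > 0$
such that
\[ \|\rho_\varepsilon * v - v\|_\infty < \tau \]
for $0 < \varepsilon \leq \varepsilon_0$. Using Minkowski's inequality, we have for $0 < \varepsilon \leq \varepsilon_0$
\begin{align*}
\|\rho_\varepsilon * u - u\|_p
&\leq \|\rho_\varepsilon * (u - v)\|_p + \|\rho_\varepsilon * v - v\|_p + \|v - u\|_p \\
&\overset{\text{(ii)}}{\leq} 2\|v - u\|_p + \|\rho_\varepsilon * v - v\|_p \\
&\leq 2\tau + \|\rho_\varepsilon * v - v\|_\infty\, \mathcal{L}^n\big(B_\varepsilon(\operatorname{supp}(v))\big)^{\frac{1}{p}} \\
&< \left(2 + \mathcal{L}^n\big(B_\varepsilon(\operatorname{supp}(v))\big)^{\frac{1}{p}}\right) \tau.
\end{align*}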

We are now ready to prove two useful technical results.
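
Before moving on, part (iii) of Proposition 2.4 can be observed numerically for $p = 1$ and a discontinuous $u$. The sketch below is an illustration under stated assumptions: $u$ is the indicator of $[-\tfrac12, \tfrac12]$, the convolution is replaced by a Riemann sum on a grid of step `h`, and the discrete kernel is normalised to unit mass instead of using the exact constant $c_1$.

```python
import math

def phi(t):
    return math.exp(1.0 / (t * t - 1.0)) if abs(t) < 1.0 else 0.0

def u(x):
    # Indicator of [-1/2, 1/2]: in every L^p space, but discontinuous
    return 1.0 if abs(x) <= 0.5 else 0.0

def l1_error(eps, h=0.005, half_width=1.5):
    """Riemann-sum approximation of || rho_eps * u - u ||_1 on [-half_width, half_width]."""
    m = int(round(eps / h))
    offsets = range(-m, m + 1)
    weights = [phi((k * h / eps) ** 2) for k in offsets]
    total = sum(weights)
    weights = [w / total for w in weights]  # discrete kernel with unit mass
    n = int(round(half_width / h))
    err = 0.0
    for i in range(-n, n + 1):
        x = i * h
        smoothed = sum(w * u(x - k * h) for w, k in zip(weights, offsets))
        err += abs(smoothed - u(x)) * h
    return err

errors = [l1_error(eps) for eps in (0.4, 0.2, 0.1)]
print(errors)  # the errors should shrink as eps decreases
```

For this particular $u$ the error is concentrated in an $\varepsilon$-neighbourhood of the two jumps, so one expects it to scale roughly linearly in $\varepsilon$.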

2.1.2 Cut-off Functions and Partitions of Unity

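Theorem 2.6. Let $K$ be a compact subset of $\Omega$. There exists $\phi \in \mathcal{D}(\Omega)$ such that $0 \leq \phi \leq 1$
and $\phi \equiv 1$ on $K$. We refer to $\phi$ as a cut-off function between $K$ and $\mathbb{R}^n \setminus \Omega$.

Proof. Put $d := \operatorname{dist}(K, \partial\Omega) > 0$ and fix $\delta \in \left(0, \frac{d}{4}\right]$. Put $\tilde{K} = \overline{B_{2\delta}(K)}$. Recall that by definition,
\[ \tilde{K} = \{x \in \mathbb{R}^n : \operatorname{dist}(x, K) \leq 2\delta\}. \]
Let $(\rho_\varepsilon)_{\varepsilon > 0}$ be the standard mollifier and put $\phi := \rho_\delta * 1_{\tilde{K}}$. Then $\phi \in C^\infty(\mathbb{R}^n)$, $\operatorname{supp}(\phi) \subset \overline{B_\delta(\tilde{K})} = \overline{B_{3\delta}(K)}$, and since $\delta \leq \frac{d}{4}$, we have $\operatorname{supp}(\phi) \subset \Omega$ and hence $\phi \in \mathcal{D}(\Omega)$. Next, $0 \leq \phi \leq 1$,
and for $x \in K$ we have $\overline{B_\delta(x)} \subset \tilde{K}$, so
\[ \phi(x) = \int_{\mathbb{R}^n} \rho_\delta(x - y) 1_{\tilde{K}}(y)\, dy = \int_{\mathbb{R}^n} \rho_\delta(x - y)\, dy = 1. \]
Note that $y \mapsto \rho_\delta(x - y)$ is supported in $\overline{B_\delta(x)}$.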

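Remark 2.7. For a multi-index $\alpha$ we have, writing $(D^\alpha \rho)_\delta(x) := \delta^{-n} (D^\alpha \rho)\left(\frac{x}{\delta}\right)$ so that $D^\alpha \rho_\delta = \delta^{-|\alpha|} (D^\alpha \rho)_\delta$,
\begin{align*}
|D^\alpha \phi(x)| &= \delta^{-|\alpha|} \left| \int_{\mathbb{R}^n} (D^\alpha \rho)_\delta(x - y) 1_{\tilde{K}}(y)\, dy \right| \\
&\leq \delta^{-|\alpha|} \int_{\mathbb{R}^n} |(D^\alpha \rho)_\delta(x - y)|\, dy \\
&= \delta^{-|\alpha|} \|D^\alpha \rho\|_{L^1},
\end{align*}
hence, taking $\delta = \frac{d}{4}$,
\[ |D^\alpha \phi| \leq c_\alpha d^{-|\alpha|}, \]
where $c_\alpha = 4^{|\alpha|} \|D^\alpha \rho\|_{L^1}$, a constant independent of $d$.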
Theorem 2.8. Let $\Omega = \bigcup_{j=1}^m \Omega_j$, where $\Omega_1, \dots, \Omega_m$ are open, non-empty, potentially overlapping sets. For $K \subset \Omega$ compact there exist $\phi_1, \dots, \phi_m \in \mathcal{D}(\Omega)$ satisfying $\operatorname{supp}(\phi_j) \subset \Omega_j$,
$0 \leq \phi_j \leq 1$, and
\[ \sum_{j=1}^m \phi_j = 1 \]
on $K$. We refer to $\phi_1, \dots, \phi_m$ as a partition of unity on $K$ subordinate to the cover $\Omega_1, \dots, \Omega_m$.

Proof. (Not examinable) Let $x \in K \cap \Omega_j$. Because $\Omega_j$ is open, we can find $r_j(x) > 0$ such that
$\overline{B_{r_j(x)}(x)} \subset \Omega_j$. The set
\[ \left\{ B_{r_j(x)}(x) : x \in K,\ 1 \leq j \leq m \right\} \]
is an open cover of $K$, so by compactness it admits a finite subcover, say
\[ \left\{ B_s := B_{r_{j_s}(x_s)}(x_s),\ 1 \leq s \leq N \right\}. \]
Put $J_j := \{s : j_s = j\}$, so that
\[ \bigcup_{s \in J_j} B_s \subset \Omega_j. \]
Now $K_j = K \cap \overline{\bigcup_{s \in J_j} B_s}$ is compact, $K_j \subset \Omega_j$ and $K = \bigcup_{j=1}^m K_j$. We now apply Theorem 2.6
to each $K_j$, $\Omega_j$ to find corresponding cut-off functions $\psi_j \in \mathcal{D}(\Omega_j)$ satisfying $0 \leq \psi_j \leq 1$ and
$\psi_j \equiv 1$ on $K_j$. We extend $\psi_j$ to $\Omega \setminus \Omega_j$ by zero and, denoting this extension again by $\psi_j$,
have $\psi_j \in \mathcal{D}(\Omega)$. Now define $\phi_1 := \psi_1$, $\phi_2 := \psi_2 (1 - \psi_1)$, \dots, $\phi_m := \psi_m \prod_{j=1}^{m-1} (1 - \psi_j)$. By
repeated use of the Leibniz rule we see that $\phi_1, \dots, \phi_m \in C^\infty(\Omega)$. Clearly $\operatorname{supp}(\phi_j) \subset \Omega_j$, and
$0 \leq \phi_j \leq 1$. Finally, on $K$ we have
\[ \sum_{j=1}^m \phi_j - 1 = -\prod_{j=1}^m (1 - \psi_j) = 0. \]

The proof of Theorem 2.8 is not examinable, but the result is.

3 Lecture 3

3.1 Convergence of Sequences in D(Ω)

Before defining distributions corresponding to smooth compactly supported test functions we
must first discuss a notion of convergence in $\mathcal{D}(\Omega)$.

Definition 3.1. Let $\{\phi_j\}_j$ be a sequence in $\mathcal{D}(\Omega)$ and $\phi \in \mathcal{D}(\Omega)$. We say
\[ \phi_j \to \phi \text{ in } \mathcal{D}(\Omega) \]
if and only if there exists a compact set $K \subset \Omega$ such that $\operatorname{supp}(\phi), \operatorname{supp}(\phi_j) \subset K$ for all $j$, and
for each multi-index $\alpha$
\[ \sup_K |D^\alpha(\phi_j - \phi)| \to 0. \]
In words, $\phi_j \to \phi$ in $\mathcal{D}(\Omega)$ if and only if all supports are contained in a fixed compact set in
$\Omega$, and we have uniform convergence of $\phi_j - \phi$ together with all partial derivatives to the zero
function.

Remark 3.2. Convergence in $\mathcal{D}(\Omega)$ is a strong requirement. The requirement that all supports
be contained in a fixed compact set ensures, for example, that for a fixed non-zero $\phi \in \mathcal{D}(\mathbb{R})$ the
translates $\phi(x - j)$ do not converge to zero in $\mathcal{D}(\mathbb{R})$ as $j \to \infty$.
Remark 3.3. It is possible to define a topology T on D(Ω) in such a way that φj → φ in D(Ω)
corresponds to φj → φ in the topological space (D(Ω), T ). We shall not pursue this point of
view in these notes, however, apart from noting that it can be shown that the topology T is
not metrizable.

3.2 Distributions Corresponding to D(Ω)

Definition 3.4. A functional u : D(Ω) → R (or C) is a distribution on Ω if and only if

(i) u is linear,
u(φ + tψ) = u(φ) + tu(ψ)
for φ, ψ ∈ D(Ω), t ∈ R (or C), and

(ii) u is continuous in the sense that u(φj ) → u(φ) whenever φj → φ in D(Ω).

The set of all distributions on Ω is denoted by D′ (Ω).

Remark 3.5. Firstly, because of linearity, the continuity condition (ii) holds if and only if it
holds at φ = 0. Indeed, if u(φj ) → 0 whenever φj → 0 in D(Ω) and ψj → ψ in D(Ω), then we
take φj = ψj − ψ and note that φj → 0 in D(Ω). Then by assumption, u(φj ) → 0. But u is
linear, so u(φj ) = u(ψj ) − u(ψ) and so u(ψj ) → u(ψ).
Secondly, when u : D(Ω) → R is linear (and defined everywhere on D(Ω)), then chances
are that u is continuous in the sense defined above and thus is a distribution on Ω. Indeed,
the only counterexamples I know are constructed by use of the Axiom of Choice in the form
of existence of a Hamel basis for D(Ω).

Notation. When u ∈ D′ (Ω) and φ ∈ D(Ω), we often write hu, φi instead of u(φ).
Example 3.6. If $f \in L^p(\Omega)$, $p \in [1, \infty]$, then
\[ \langle T_f, \phi \rangle = \int_\Omega f(x) \phi(x)\, dx, \quad \phi \in \mathcal{D}(\Omega), \]
defines a distribution on Ω. Linearity follows from linearity of the integral, and continuity
follows from the Dominated Convergence Theorem. Note that since each φ ∈ D(Ω) has compact
support in Ω and since we defined convergence in D(Ω) by requiring all supports to be in a
fixed compact set in Ω, the above distribution Tf would also be well-defined if f was only
locally in Lp .

3.3 Local Lebesgue Spaces

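Definition 3.7. For $p \in [1, \infty]$ we write $f \in L^p_{\mathrm{loc}}(\Omega)$ and say $f$ is locally $L^p$ on $\Omega$ if and only
if for each compact set $K \subset \Omega$ we have $f|_K \in L^p(K)$. Specifically, we require $\int_K |f|^p\, dx < \infty$
when $p < \infty$ and $\operatorname{ess\,sup}_K |f| < \infty$ when $p = \infty$.

Example 3.8. The function $x^{-1} \notin L^1(0, \infty)$, but $x^{-1} \in L^1_{\mathrm{loc}}(0, \infty)$ and, in fact, $x^{-1} \in L^p_{\mathrm{loc}}(0, \infty)$
for all $p \in [1, \infty]$. Note that $\Omega$ determines what local means. For example, $x^{-1} \in L^1_{\mathrm{loc}}(0, \infty)$,
but $x^{-1} \notin L^1_{\mathrm{loc}}(-1, 1)$.

Example 3.9. Summarizing a previous discussion, each $f \in L^p_{\mathrm{loc}}(\Omega)$, $p \in [1, \infty]$, gives rise to a
distribution on $\Omega$ via
\[ \langle T_f, \phi \rangle = \int_\Omega f(x) \phi(x)\, dx \]
for each $\phi \in \mathcal{D}(\Omega)$.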
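Example 3.10 (Dirac's delta function at $x_0 \in \Omega$). The map
\[ \phi \mapsto \langle \delta_{x_0}, \phi \rangle := \phi(x_0) \]
for $\phi \in \mathcal{D}(\Omega)$ is clearly a distribution on $\Omega$. Furthermore, so is $\phi \mapsto (D^\alpha \phi)(x_0)$ for any
multi-index $\alpha$.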

While the continuity condition (ii) in Definition 3.4 often is not an issue, it is nonetheless
useful to reformulate it using linearity as follows.

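Theorem 3.11. A linear functional $u : \mathcal{D}(\Omega) \to \mathbb{R}$ (or $\mathbb{C}$) is a distribution if and only if for
every compact set $K \subset \Omega$ there exist constants $c = c(K) > 0$ and $m = m(K) \in \mathbb{N}_0$ such that
\[ |\langle u, \phi \rangle| \leq c \sum_{|\alpha| \leq m} \sup_K |D^\alpha \phi| \tag{3} \]
for all $\phi \in \mathcal{D}(K) := \{\phi \in \mathcal{D}(\Omega) : \operatorname{supp}(\phi) \subset K\}$.

Proof. Suppose first that the bounds (3) hold. If $\phi_j \to 0$ in $\mathcal{D}(\Omega)$, then for some compact set $K \subset \Omega$ we have $\phi_j \in \mathcal{D}(K)$ for all $j$.
Then by assumption we can find $c = c(K) > 0$ and $m = m(K) \in \mathbb{N}_0$ such that (3) holds. But
then
\[ |\langle u, \phi_j \rangle| \leq c \sum_{|\alpha| \leq m} \sup_K |D^\alpha \phi_j| \to 0. \]
For the converse, we argue by contradiction. Assume there exist $u \in \mathcal{D}'(\Omega)$ and a compact
set $K \subset \Omega$ such that (3) is violated for all choices of $c$ and $m$. In particular, for $c = m = j$ we
can find $\phi_j \in \mathcal{D}(K)$ with
\[ |\langle u, \phi_j \rangle| > j \sum_{|\alpha| \leq j} \sup_K |D^\alpha \phi_j|. \]
Put $\lambda_j = \langle u, \phi_j \rangle$. Then $|\lambda_j| > 0$, $\psi_j := \frac{\phi_j}{\lambda_j} \in \mathcal{D}(K)$, $\langle u, \psi_j \rangle = 1$, and
\[ 1 > j \sum_{|\alpha| \leq j} \sup_K |D^\alpha \psi_j|. \]
Thus $|D^\alpha \psi_j| < j^{-1}$ on $\Omega$ for $j \geq |\alpha|$, and in particular $\psi_j \to 0$ in $\mathcal{D}(\Omega)$. But $\langle u, \psi_j \rangle = 1$, which
does not converge to zero.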
4 Lecture 4

4.1 The Fundamental Lemma of the Calculus of Variations

When $f \in L^p_{\mathrm{loc}}(\Omega)$, then
\[ \langle T_f, \phi \rangle = \int_\Omega f(x) \phi(x)\, dx, \]

φ ∈ D(Ω), defines a distribution on Ω. It is natural to ask if the distribution Tf determines
f , that is, if for f, g ∈ Lploc (Ω) we have Tf = Tg , must it be the case that f = g almost
everywhere? The answer is affirmative and relies on the following.

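Lemma 4.1 (The Fundamental Lemma of the Calculus of Variations). If $f \in L^1_{\mathrm{loc}}(\Omega)$ and
\[ \int_\Omega f(x) \phi(x)\, dx = 0 \]
for all $\phi \in \mathcal{D}(\Omega)$, then $f = 0$ almost everywhere.

Proof. Let $O$ be a non-empty open subset of $\Omega$ such that $\overline{O}$ is compact and $\overline{O} \subset \Omega$. In this
case we write $O \Subset \Omega$.
Put $g = f 1_O$ and extend $g$ to $\mathbb{R}^n \setminus \Omega$ by zero. Because $\overline{O} \subset \Omega$ is compact, we have $g \in L^1(\mathbb{R}^n)$. For the standard mollifier $(\rho_\varepsilon)_{\varepsilon > 0}$ we know, by Proposition 2.4, that $\|\rho_\varepsilon * g - g\|_1 \to 0$
as $\varepsilon \to 0^+$. Now note that for $x \in \Omega$,
\begin{align*}
(\rho_\varepsilon * g)(x) &= \int_{\mathbb{R}^n} \rho_\varepsilon(x - y) g(y)\, dy \\
&= \int_O \rho_\varepsilon(x - y) f(y)\, dy.
\end{align*}
If we take $x \in O$ and $\varepsilon \in (0, \operatorname{dist}(x, \partial O))$, then, denoting $\phi^x(y) := \rho_\varepsilon(x - y)$ for $y \in \Omega$, we have
$\phi^x \in C^\infty(\Omega)$ and $\operatorname{supp}(\phi^x) = \overline{B_\varepsilon(x)} \subset O \subset \Omega$, so $\phi^x \in \mathcal{D}(\Omega)$. By assumption,
\[ 0 = \int_\Omega f(y) \phi^x(y)\, dy = \int_O f(y) \rho_\varepsilon(x - y)\, dy = (\rho_\varepsilon * g)(x). \]
It follows that $(\rho_\varepsilon * g)(x) \to 0$ as $\varepsilon \to 0^+$ pointwise in $x \in O$. From Fatou's Lemma, we
therefore get
\begin{align*}
\int_O |g|\, dx &\leq \liminf_{\varepsilon \to 0^+} \int_O |\rho_\varepsilon * g - g|\, dx \\
&\leq \lim_{\varepsilon \to 0^+} \int_\Omega |\rho_\varepsilon * g - g|\, dx = 0.
\end{align*}
Thus $f = g = 0$ almost everywhere in $O$ and since $O \Subset \Omega$ was arbitrary, we conclude that
$f = 0$ almost everywhere.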
Notation. When f ∈ Lploc (Ω), we shall also use f to denote the distribution Tf . This is of
course an abuse of notation, but it is convenient and should not cause too much trouble. We
shall often refer to distributions that correspond to an Lploc (Ω) function as a regular distribution
on Ω.

Definition 4.2. Let $u \in \mathcal{D}'(\Omega)$. If there exists an $m \in \mathbb{N}_0$ with the property that for all
compact subsets $K \subset \Omega$ there exists a constant $c = c_K > 0$ such that
\[ |\langle u, \phi \rangle| \leq c \sum_{|\alpha| \leq m} \sup_K |D^\alpha \phi| \]
for all $\phi \in \mathcal{D}(K)$, then we say $u$ has order at most $m$.

We say $u$ has order $m$ if and only if $u$ has order at most $m$, but not order at most $m - 1$.
In particular, $u$ has order 0 if and only if it has order at most 0.
We say $u$ has order infinity if and only if $u$ does not have order at most $m$ for any $m \in \mathbb{N}_0$.

Note that by Theorem 3.11, any distribution has locally finite order.
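Example 4.3. Let $f \in L^1_{\mathrm{loc}}(\Omega)$. Then the corresponding distribution has order 0. Indeed, if
$K \subset \Omega$ is compact and $\varphi \in \mathcal{D}(K)$, then
\begin{align*}
|\langle f, \varphi \rangle| &= \left| \int_\Omega f(x) \varphi(x)\, dx \right| \\
&\leq \int_K |f| |\varphi|\, dx \\
&\leq \sup_K |\varphi| \int_K |f|\, dx,
\end{align*}
and so $|\langle f, \varphi \rangle| \leq c \sup_K |\varphi|$ with $c = \int_K |f|\, dx$.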


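Example 4.4. Let $x_0 \in \Omega$ and $\alpha \in \mathbb{N}_0^n$. Define
\[ \langle T, \varphi \rangle := (D^\alpha \varphi)(x_0) \]
for $\varphi \in \mathcal{D}(\Omega)$. Then $T \in \mathcal{D}'(\Omega)$, as $T$ is clearly linear and for compact $K \subset \Omega$ and $\varphi \in \mathcal{D}(K)$,
\[ |\langle T, \varphi \rangle| = |(D^\alpha \varphi)(x_0)| \leq \sup_K |D^\alpha \varphi|. \]
This also shows that $T$ has order at most $|\alpha|$. If $\alpha = 0$, so that $T = \delta_{x_0}$, we see that $T$ has order
0. Assume $|\alpha| > 0$. We shall prove that $T$ has order $|\alpha|$. Suppose, for contradiction, that $T$
has order at most $|\alpha| - 1$. Take $r \in (0, \operatorname{dist}(x_0, \partial\Omega))$ and put $K = \overline{B_r(x_0)}$. Then $K \subset \Omega$ is
compact. By assumption, we can then find $c = c_K > 0$ such that
\[ |\langle T, \varphi \rangle| = |(D^\alpha \varphi)(x_0)| \leq c \sum_{|\beta| \leq |\alpha| - 1} \sup_K |D^\beta \varphi| \tag{4} \]
for all $\varphi \in \mathcal{D}(K)$. Take $\psi \in \mathcal{D}(B_1(0))$ with $\psi(0) = 1$ and define, for $\varepsilon \in (0, r)$,
\[ \varphi(x) := \frac{(x - x_0)^\alpha}{\alpha!} \psi\left(\frac{x - x_0}{\varepsilon}\right) \]
for $x \in \Omega$. Note that $\varphi$ is $C^\infty$ and $\operatorname{supp}(\varphi) \subset \overline{B_\varepsilon(x_0)} \subset K$, so that $\varphi \in \mathcal{D}(K)$. Also,
\[ D^\beta \left( \frac{(x - x_0)^\alpha}{\alpha!} \right) \bigg|_{x = x_0} = \begin{cases} 1 & \text{if } \beta = \alpha, \\ 0 & \text{if } \beta \neq \alpha, \end{cases} \]
so that $D^\alpha \varphi(x_0) = 1$. If $\beta \in \mathbb{N}_0^n$ and $|\beta| < |\alpha|$, then for $x \in B_\varepsilon(x_0)$ we get from the
generalized Leibniz rule (see below)
\begin{align*}
|D^\beta \varphi(x)| &\leq \sum_{\gamma \leq \beta} \binom{\beta}{\gamma} \left| D_x^\gamma \frac{(x - x_0)^\alpha}{\alpha!} \right| \left| D_x^{\beta - \gamma} \psi\left(\frac{x - x_0}{\varepsilon}\right) \right| \\
&\leq \sum_{\gamma \leq \beta} \binom{\beta}{\gamma} c_\gamma |x - x_0|^{|\alpha| - |\gamma|} \sup |D^{\beta - \gamma} \psi|\, \varepsilon^{|\gamma| - |\beta|} \\
&\leq \sum_{\gamma \leq \beta} \binom{\beta}{\gamma} c_\gamma \sup |D^{\beta - \gamma} \psi|\, \varepsilon^{|\alpha| - |\gamma| + |\gamma| - |\beta|} \\
&= c_{\psi, \beta}\, \varepsilon^{|\alpha| - |\beta|},
\end{align*}
where
\[ c_{\psi, \beta} := \sum_{\gamma \leq \beta} \binom{\beta}{\gamma} c_\gamma \sup |D^{\beta - \gamma} \psi|. \]
When $\varepsilon < 1$ we have, since $|\alpha| - |\beta| \geq 1$, that $\varepsilon^{|\alpha| - |\beta|} \leq \varepsilon$, and hence in combination with (4)
we get
\[ 1 \leq c \sum_{|\beta| \leq |\alpha| - 1} c_{\psi, \beta}\, \varepsilon^{|\alpha| - |\beta|} \leq c \sum_{|\beta| \leq |\alpha| - 1} c_{\psi, \beta}\, \varepsilon =: \tilde{c} \varepsilon. \]
Here $\tilde{c} > 0$ is a constant (in particular, independent of $\varepsilon$) whose value is unimportant. We
have shown that $1 \leq \tilde{c} \varepsilon$ holds for all $\varepsilon \in (0, \min(1, r))$. This is a contradiction for $\varepsilon < \tilde{c}^{-1}$.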

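Theorem 4.5 (Generalized Leibniz Rule). Let $f, g \in C^k(\Omega)$. Then $fg \in C^k(\Omega)$ and for
$\alpha \in \mathbb{N}_0^n$, $|\alpha| \leq k$, we have
\[ D^\alpha(fg) = \sum_{\beta \leq \alpha} \binom{\alpha}{\beta} D^\beta f\, D^{\alpha - \beta} g, \]
where $\beta \leq \alpha$ means $\beta_i \leq \alpha_i$ for all $i = 1, \dots, n$,
\[ \binom{\alpha}{\beta} := \frac{\alpha!}{\beta! (\alpha - \beta)!}, \]
and $\alpha! = \alpha_1! \alpha_2! \cdots \alpha_n!$.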
Proof. This can be proven by induction on |α|. We omit the details.
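
The multi-index binomial coefficients are easy to tabulate, and one quick consistency check of the rule is to take $f = g = e^{x_1 + \cdots + x_n}$: every derivative of $f$ and $g$ reproduces the function, while $D^\alpha(fg) = 2^{|\alpha|} fg$, so the coefficients must sum to $2^{|\alpha|}$. The sketch below (with hypothetical helper names `multi_binomial` and `indices_below`) verifies this for $\alpha = (1, 2)$.

```python
import itertools
import math

def multi_binomial(alpha, beta):
    """alpha! / (beta! (alpha - beta)!), a product of ordinary binomial coefficients."""
    return math.prod(math.comb(a, b) for a, b in zip(alpha, beta))

def indices_below(alpha):
    """All multi-indices beta with beta_i <= alpha_i for every i."""
    return itertools.product(*(range(a + 1) for a in alpha))

alpha = (1, 2)
terms = {beta: multi_binomial(alpha, beta) for beta in indices_below(alpha)}
coefficient_sum = sum(terms.values())
print(terms, coefficient_sum)
```

The factorisation into one-dimensional binomial coefficients is also the reason the induction on $|\alpha|$ reduces to the familiar one-variable Leibniz rule.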

A generalization of the above example is as follows. Let $x_j$, where $j \in J$ is countable or
finite, be distinct points in $\Omega$ so that the set $\{x_j \mid j \in J\}$ has no limit points in $\Omega$ (that is, any
limit points are on $\partial\Omega$). For any set of multi-indices $\alpha_j \in \mathbb{N}_0^n$ put
\[ \langle T, \varphi \rangle := \sum_{j \in J} (D^{\alpha_j} \varphi)(x_j) \]
for $\varphi \in \mathcal{D}(\Omega)$. Then $T \in \mathcal{D}'(\Omega)$ and the order of $T$ is $\sup_{j \in J} |\alpha_j|$.

5 Lecture 5

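Definition 5.1. Let $\{u_j\}_j$ be a sequence in $\mathcal{D}'(\Omega)$ and let $u \in \mathcal{D}'(\Omega)$. We say $u_j$ converges to
$u$ in the sense of distributions on $\Omega$ and write
\[ u_j \to u \text{ in } \mathcal{D}'(\Omega) \]
if and only if
\[ \langle u_j, \varphi \rangle \to \langle u, \varphi \rangle \]
for each $\varphi \in \mathcal{D}(\Omega)$.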

Remark 5.2. As with convergence in D(Ω), one can define a topology T on D′ (Ω) so that
uj → u in D′ (Ω) corresponds to uj → u in the topological space (D′ (Ω), T ). We will not need
it and shall not pursue this here. We note, however, that the topology T is not metrizable.
Convergence in D′ (Ω) is very weak.
Example 5.3. Let $p \in [1, \infty]$ and $f_j, f \in L^p(\Omega)$. If $f_j \to f$ in $L^p(\Omega)$, then $f_j \to f$ in $\mathcal{D}'(\Omega)$.
This can be proven using the Dominated Convergence Theorem. The converse, however, is
false:

(i) Let $f_j(x) = \sin(jx)$, $x \in (0, 1)$. Then $f_j \to 0$ in $\mathcal{D}'(0, 1)$, but $f_j \not\to 0$ in $L^p(0, 1)$ for any
$p \in [1, \infty]$.

(ii) Let $g_j(x) = g(jx)$, $x \in (0, 1)$, where $g$ is $T$-periodic and on $(0, T]$ is given by $g = -117 \cdot 1_{(0, \frac{T}{2}]} + 117 \cdot 1_{(\frac{T}{2}, T]}$. On Sheet 2 you will be asked to prove that $g_j \to 0$ in $\mathcal{D}'(0, 1)$,
but $\|g_j\|_1 = 117 \not\to 0$.
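
Part (i) can be observed numerically: the pairings of $\sin(jx)$ with a fixed test function decay, while the $L^1$ norms do not. The following sketch makes several illustrative assumptions: the test function `varphi` is a rescaled copy of the bump from Lemma 2.1 supported in $[0, 1]$, integrals are approximated by a midpoint rule, and only $j = 1, 10, 100$ are sampled.

```python
import math

def bump(t):
    return math.exp(1.0 / (t * t - 1.0)) if abs(t) < 1.0 else 0.0

def varphi(x):
    # Smooth, supported in [0, 1]: the bump of Lemma 2.1 rescaled to the unit interval
    return bump(2.0 * x - 1.0)

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def pairing(j):
    """Approximation of <f_j, varphi> with f_j(x) = sin(jx) on (0, 1)."""
    return midpoint(lambda x: math.sin(j * x) * varphi(x), 0.0, 1.0)

def l1_norm(j):
    """Approximation of the L^1(0, 1) norm of f_j."""
    return midpoint(lambda x: abs(math.sin(j * x)), 0.0, 1.0)

pairings = [pairing(j) for j in (1, 10, 100)]
norms = [l1_norm(j) for j in (1, 10, 100)]
print(pairings, norms)
```

The decay of the pairings comes from cancellation between neighbouring oscillations, which the $L^1$ norm, built from $|f_j|$, cannot see.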

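Example 5.4. Let $v \in C_c(\mathbb{R}^n)$ and for $x_0 \in \Omega$ and $\varepsilon > 0$ put
\[ v_\varepsilon(x) := \varepsilon^{-n} v\left(\frac{x - x_0}{\varepsilon}\right), \]
$x \in \Omega$. Then $v_\varepsilon \in \mathcal{D}'(\Omega)$ and $v_\varepsilon \to \left(\int_{\mathbb{R}^n} v\, dx\right) \delta_{x_0}$ in $\mathcal{D}'(\Omega)$ as $\varepsilon \to 0^+$. Indeed, it is clear that
$v_\varepsilon \in \mathcal{D}'(\Omega)$ for all $\varepsilon > 0$, and for $\varphi \in \mathcal{D}(\Omega)$ we have
\begin{align*}
\langle v_\varepsilon, \varphi \rangle &= \int_\Omega \varepsilon^{-n} v\left(\frac{x - x_0}{\varepsilon}\right) \varphi(x)\, dx \\
&= \int_{\mathbb{R}^n} v(y) \varphi(x_0 + \varepsilon y)\, dy \\
&\xrightarrow[\varepsilon \to 0^+]{} \int_{\mathbb{R}^n} v(y) \varphi(x_0)\, dy = \int_{\mathbb{R}^n} v\, dy\, \langle \delta_{x_0}, \varphi \rangle,
\end{align*}
where in the second line we made the change of variables $y = \varepsilon^{-1}(x - x_0)$. In particular, note
that if $(\rho_\varepsilon)_{\varepsilon > 0}$ is the standard mollifier, then $\rho_\varepsilon \to \delta_0$ in $\mathcal{D}'(\mathbb{R}^n)$ as $\varepsilon \to 0^+$.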
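
The convergence $\rho_\varepsilon \to \delta_0$ can be watched numerically in one dimension. In this sketch the test function `varphi` (a shifted and stretched bump, deliberately not symmetric about 0) and the quadrature are assumptions made for illustration; the kernel is normalised on the quadrature nodes rather than with the exact constant $c_1$.

```python
import math

def bump(t):
    return math.exp(1.0 / (t * t - 1.0)) if abs(t) < 1.0 else 0.0

def varphi(x):
    # Test function in D(R), supported in [-1.7, 2.3]
    return bump((x - 0.3) / 2.0)

def pair_with_mollifier(eps, n=4001):
    """Approximate <rho_eps, varphi> with a normalised midpoint rule on (-eps, eps)."""
    h = 2.0 * eps / n
    nodes = [-eps + (i + 0.5) * h for i in range(n)]
    weights = [bump((x / eps) ** 2) for x in nodes]  # proportional to rho_eps at the nodes
    total = sum(weights)
    return sum(w * varphi(x) for w, x in zip(weights, nodes)) / total

target = varphi(0.0)  # <delta_0, varphi>
values = [pair_with_mollifier(eps) for eps in (0.5, 0.1, 0.02)]
gaps = [abs(v - target) for v in values]
print(target, values, gaps)
```

Because $\rho$ is even, the first-order term in the Taylor expansion of `varphi` cancels, so the gap is expected to shrink roughly like $\varepsilon^2$.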

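Note that for $x \in \mathbb{R}^n$, $\phi^x(y) := \rho_\varepsilon(x - y)$, $y \in \mathbb{R}^n$, is $C^\infty$ and has support $\operatorname{supp}(\phi^x) = \overline{B_\varepsilon(x)}$,
so in particular $\phi^x \in \mathcal{D}(\mathbb{R}^n)$. Now recall that for $f \in L^p(\mathbb{R}^n)$ we defined
\begin{align*}
(\rho_\varepsilon * f)(x) &= \int_{\mathbb{R}^n} \rho_\varepsilon(x - y) f(y)\, dy \\
&= \int_{\mathbb{R}^n} \phi^x(y) f(y)\, dy = \langle f, \phi^x \rangle.
\end{align*}
This in fact all makes perfect sense for any $f \in L^p_{\mathrm{loc}}(\mathbb{R}^n)$, but we can go much further.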

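Definition 5.5 (Mollification of distributions). Let $u \in \mathcal{D}'(\mathbb{R}^n)$ and $(\rho_\varepsilon)_{\varepsilon > 0}$ be the standard
mollifier. Then $\rho_\varepsilon * u \in \mathcal{D}'(\mathbb{R}^n)$ is defined by
\[ \langle \rho_\varepsilon * u, \varphi \rangle = \langle u, \tilde{\rho}_\varepsilon * \varphi \rangle, \]
$\varphi \in \mathcal{D}(\mathbb{R}^n)$, where $\tilde{\rho}_\varepsilon(x) = \rho_\varepsilon(-x)$, which is equal to $\rho_\varepsilon(x)$ in the case of the standard mollifier
since it is an even function.

Remark 5.6. $\rho_\varepsilon * \varphi \in \mathcal{D}(\mathbb{R}^n)$ since clearly $\rho_\varepsilon * \varphi$ is $C^\infty$ and $\operatorname{supp}(\rho_\varepsilon * \varphi) \subset B_{2\varepsilon}(\operatorname{supp}(\varphi))$.

From Lemma 2.5, we get
\[ D^\alpha(\rho_\varepsilon * \varphi) = \rho_\varepsilon * D^\alpha \varphi \]
by applying it $|\alpha|$ times, and it is easy to check that
\[ \sup |D^\alpha(\rho_\varepsilon * \varphi)| \leq \sup |D^\alpha \varphi|, \]
so $\rho_\varepsilon * u$ will satisfy the same bounds as $u$ (uniformly in $\varepsilon > 0$). But for each fixed $\varepsilon > 0$, $\rho_\varepsilon * u$
is much better, as the following theorem shows.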

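Theorem 5.7. If $u \in \mathcal{D}'(\mathbb{R}^n)$, then $\rho_\varepsilon * u \in C^\infty(\mathbb{R}^n)$ and $\rho_\varepsilon * u \to u$ in $\mathcal{D}'(\mathbb{R}^n)$.

Sketch of proof. For $\varepsilon > 0$ fixed we consider
\[ \phi^x(y) := \rho_\varepsilon(x - y) \]
for $y \in \mathbb{R}^n$. Since $\phi^x \to \phi^{x_0}$ in $\mathcal{D}(\mathbb{R}^n)$ as $x \to x_0$, we have by the continuity condition
\[ \langle u, \phi^x \rangle \to \langle u, \phi^{x_0} \rangle \]
as $x \to x_0$. Thus $\rho_\varepsilon * u \in C^0(\mathbb{R}^n)$. Next, for a direction $1 \leq j \leq n$ and an increment $h \neq 0$
consider
\[ \frac{1}{h} \big( (\rho_\varepsilon * u)(x + h e_j) - (\rho_\varepsilon * u)(x) \big) = \left\langle u, \frac{1}{h} \left( \phi^{x + h e_j} - \phi^x \right) \right\rangle. \]
We can easily check that
\[ \frac{1}{h} \left( \phi^{x + h e_j} - \phi^x \right) \xrightarrow[h \to 0]{} \psi_j^x \]
in $\mathcal{D}(\mathbb{R}^n)$, where $\psi_j^x(y) = (D_j \rho_\varepsilon)(x - y)$, $y \in \mathbb{R}^n$. Hence $\rho_\varepsilon * u$ admits partial derivatives
$D_j(\rho_\varepsilon * u)$, and we see that, as above, they are continuous. Thus $\rho_\varepsilon * u$ is $C^1$, and an induction
argument along these lines gives that $\rho_\varepsilon * u$ is $C^\infty$. For the approximation, let $\varphi \in \mathcal{D}(\mathbb{R}^n)$. By
definition,
\[ \langle \rho_\varepsilon * u, \varphi \rangle = \langle u, \rho_\varepsilon * \varphi \rangle, \]
and it is easy to check that $\rho_\varepsilon * \varphi \to \varphi$ in $\mathcal{D}(\mathbb{R}^n)$ as $\varepsilon \to 0^+$; consequently $\rho_\varepsilon * u \to u$ in $\mathcal{D}'(\mathbb{R}^n)$ as $\varepsilon \to 0^+$.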

We can refine the result even further.


Theorem 5.8. If u ∈ D′ (Ω), then there exists a sequence uj in D(Ω) such that uj → u in
D′ (Ω).

Therefore any distribution on Ω can be approximated in the sense of distributions by smooth


compactly supported test functions.

Strategy. Prove results for test functions and then try to extend them to distributions by the
above approximation result. We shall not prove Theorem 5.8 here, but we shall return to it
later.
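How should we differentiate $u \in \mathcal{D}'(\Omega)$? Take $u_j \in \mathcal{D}(\Omega)$ such that $u_j \to u$ in $\mathcal{D}'(\Omega)$. Then, writing $x'$ for the variables other than $x_k$,
\begin{align*}
\langle D_k u_j, \varphi \rangle &= \int_\Omega (D_k u_j) \varphi\, dx \\
&= \int_{\mathbb{R}^n} (D_k u_j) \varphi\, dx \\
&\overset{\text{Fubini}}{=} \int_{\mathbb{R}^{n-1}} \int_{\mathbb{R}} (D_k u_j) \varphi\, dx_k\, dx' \\
&\overset{\text{parts}}{=} -\int_{\mathbb{R}^{n-1}} \int_{\mathbb{R}} u_j D_k \varphi\, dx_k\, dx' \\
&\overset{\text{Fubini}}{=} \langle u_j, -D_k \varphi \rangle \xrightarrow[j \to \infty]{} \langle u, -D_k \varphi \rangle,
\end{align*}
$\varphi \in \mathcal{D}(\Omega)$. Hence $\langle D_k u, \varphi \rangle := \langle u, -D_k \varphi \rangle$ seems reasonable.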
6 Lecture 6

We outline a principle that often allows us to extend well-known operations on test functions
to corresponding operations on distributions. Let T be an operation on test functions, that
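is $T : \mathcal{D}(\Omega) \to \mathcal{D}(\Omega)$ is a linear map. Suppose there exists a linear map $S : \mathcal{D}(\Omega) \to \mathcal{D}(\Omega)$
satisfying
\[ \int_\Omega T(\varphi) \psi\, dx = \int_\Omega \varphi S(\psi)\, dx \]
for all $\varphi, \psi \in \mathcal{D}(\Omega)$. We call this an adjoint identity. If $S$ is continuous in the sense that
$S(\psi_j) \to S(\psi)$ in $\mathcal{D}(\Omega)$ whenever $\psi_j \to \psi$ in $\mathcal{D}(\Omega)$, then we can extend $T$ to distributions $u$ by
the rule
\[ \langle \bar{T}(u), \psi \rangle := \langle u, S(\psi) \rangle, \]
$\psi \in \mathcal{D}(\Omega)$. Because $S$ is linear and continuous, it follows that $\bar{T}(u) \in \mathcal{D}'(\Omega)$, and in fact
$\bar{T} : \mathcal{D}'(\Omega) \to \mathcal{D}'(\Omega)$ is linear and continuous,
\[ \bar{T}(u + \lambda v) = \bar{T}(u) + \lambda \bar{T}(v) \]
for $u, v \in \mathcal{D}'(\Omega)$, $\lambda \in \mathbb{R}$ (or $\mathbb{C}$), since for $\psi \in \mathcal{D}(\Omega)$
\[ \langle \bar{T}(u + \lambda v), \psi \rangle = \langle u + \lambda v, S(\psi) \rangle = \langle u, S(\psi) \rangle + \lambda \langle v, S(\psi) \rangle = \langle \bar{T}(u), \psi \rangle + \lambda \langle \bar{T}(v), \psi \rangle, \]
and if $u_j \to u$ in $\mathcal{D}'(\Omega)$, then for $\psi \in \mathcal{D}(\Omega)$
\[ \langle \bar{T}(u_j), \psi \rangle = \langle u_j, S(\psi) \rangle \to \langle u, S(\psi) \rangle = \langle \bar{T}(u), \psi \rangle, \]
hence $\bar{T}(u_j) \to \bar{T}(u)$.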
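Example 6.1. 1. (Differentiation). $T = \frac{d}{dx} = D$ on $\mathcal{D}(\mathbb{R})$. For $\varphi, \psi \in \mathcal{D}(\mathbb{R})$ we have by
integration by parts
\[ \int_{\mathbb{R}} \varphi' \psi\, dx = [\varphi \psi]_{-\infty}^{+\infty} - \int_{-\infty}^{\infty} \varphi \psi'\, dx = \int_{\mathbb{R}} \varphi (-\psi')\, dx, \]
hence we have an adjoint identity with $S = -D$. Clearly, $S : \mathcal{D}(\mathbb{R}) \to \mathcal{D}(\mathbb{R})$ is linear and
continuous, so we may extend to distributions $u \in \mathcal{D}'(\mathbb{R})$ by
\[ \langle \bar{D} u, \psi \rangle = \langle u, -D\psi \rangle, \]
$\psi \in \mathcal{D}(\mathbb{R})$. To check consistency, suppose $u \in C^1(\mathbb{R})$ and consider also $u$ as an element
of $\mathcal{D}'(\mathbb{R})$. We would like to know the relation between the distributional derivative $\bar{D} u$
defined above and the usual derivative $Du$. We have
\[ \langle \bar{D} u, \psi \rangle = \langle u, -D\psi \rangle = \int_{-\infty}^{\infty} u (-D\psi)\, dx = -[u \psi]_{-\infty}^{+\infty} + \int_{-\infty}^{\infty} \psi\, Du\, dx = \langle Du, \psi \rangle \]
for all $\psi \in \mathcal{D}(\mathbb{R})$, and so $\bar{D} u = Du$. In the following we shall therefore not distinguish
between the distributional and the classical derivatives and simply denote both by $Du$
or $\frac{du}{dx}$ when they exist.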
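2. (Multiplication by smooth functions). For $f \in C^\infty(\mathbb{R})$ define $T(\varphi) := f\varphi$ for $\varphi \in \mathcal{D}(\mathbb{R})$.
Clearly $T : \mathcal{D}(\mathbb{R}) \to \mathcal{D}(\mathbb{R})$ is linear and $S = T$ yields an adjoint identity:
\[ \int_{\mathbb{R}} f \varphi \psi\, dx = \int_{\mathbb{R}} \varphi f \psi\, dx \]
for $\varphi, \psi \in \mathcal{D}(\mathbb{R})$. It is clear that $S : \mathcal{D}(\mathbb{R}) \to \mathcal{D}(\mathbb{R})$ is linear and continuous (checked by
Leibniz), so we may extend $T$ to distributions by the rule
\[ \langle f u, \psi \rangle := \langle u, f\psi \rangle \]
for $u \in \mathcal{D}'(\mathbb{R})$, $\psi \in \mathcal{D}(\mathbb{R})$. Clearly we have consistency here: when $u \in L^1_{\mathrm{loc}}(\mathbb{R})$, then
$fu \in L^1_{\mathrm{loc}}(\mathbb{R})$ and $fu$ can be identified with the above distribution.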
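3. Many other useful operations admit extensions to distributions.
Translation. $T = \tau_h$ defined by $\tau_h \varphi(x) = \varphi(x + h)$ yields an adjoint identity with $S = \tau_{-h}$.
Thus for $u \in \mathcal{D}'(\mathbb{R})$, $\tau_h u \in \mathcal{D}'(\mathbb{R})$ is defined by the rule
\[ \langle \tau_h u, \psi \rangle := \langle u, \tau_{-h} \psi \rangle \]
for $\psi \in \mathcal{D}(\mathbb{R})$.
Dilation. $T = d_r$ defined by $d_r \varphi(x) = \varphi(rx)$, $r > 0$, yields the adjoint identity with
$S = \frac{1}{r} d_{\frac{1}{r}}$. Thus for $u \in \mathcal{D}'(\mathbb{R})$, $d_r u \in \mathcal{D}'(\mathbb{R})$ is defined by the rule
\[ \langle d_r u, \psi \rangle := \left\langle u, \frac{1}{r} d_{\frac{1}{r}} \psi \right\rangle \]
for $\psi \in \mathcal{D}(\mathbb{R})$.
Reflection through the origin. $(T\varphi)(x) = \tilde{\varphi}(x) = \varphi(-x)$ admits the adjoint identity with
$S = T$. Thus for $u \in \mathcal{D}'(\mathbb{R})$, $\tilde{u} \in \mathcal{D}'(\mathbb{R})$ is defined by the rule
\[ \langle \tilde{u}, \psi \rangle := \langle u, \tilde{\psi} \rangle \]
for $\psi \in \mathcal{D}(\mathbb{R})$.
Convolution with a test function. For $v \in \mathcal{D}(\mathbb{R})$, $T\varphi = v * \varphi$ admits an adjoint identity
with $S\psi = \tilde{v} * \psi$. Indeed, by Fubini,
\begin{align*}
\int_{-\infty}^{\infty} (v * \varphi)(x) \psi(x)\, dx &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} v(x - y) \varphi(y)\, dy\, \psi(x)\, dx \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} v(x - y) \psi(x)\, dx\, \varphi(y)\, dy \\
&= \int_{-\infty}^{\infty} (\tilde{v} * \psi)(y) \varphi(y)\, dy.
\end{align*}
Thus for $u \in \mathcal{D}'(\mathbb{R})$, $v * u \in \mathcal{D}'(\mathbb{R})$ is defined by the rule
\[ \langle v * u, \psi \rangle := \langle u, \tilde{v} * \psi \rangle \]
for $\psi \in \mathcal{D}(\mathbb{R})$. On Sheet 2 you will be asked to prove that $v * u \in C^\infty(\mathbb{R})$.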
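
The adjoint identities for translation and dilation are just changes of variables, and they can be checked numerically on concrete test functions. The sketch below is illustrative only: `varphi` and `psi` are bumps chosen for convenience, the values of `h` and `r` are arbitrary, and the integrals are approximated by a midpoint rule on an interval containing all supports.

```python
import math

def bump(t):
    return math.exp(1.0 / (t * t - 1.0)) if abs(t) < 1.0 else 0.0

def varphi(x):
    return bump(x)          # supported in [-1, 1]

def psi(x):
    return bump(x - 0.5)    # supported in [-0.5, 1.5]

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))

h, r = 0.3, 2.5

# Translation: integral of (tau_h varphi) psi equals integral of varphi (tau_{-h} psi)
lhs_translation = midpoint(lambda x: varphi(x + h) * psi(x), -4.0, 4.0)
rhs_translation = midpoint(lambda x: varphi(x) * psi(x - h), -4.0, 4.0)

# Dilation: integral of (d_r varphi) psi equals integral of varphi (1/r) d_{1/r} psi
lhs_dilation = midpoint(lambda x: varphi(r * x) * psi(x), -4.0, 4.0)
rhs_dilation = midpoint(lambda x: varphi(x) * psi(x / r) / r, -4.0, 4.0)

print(lhs_translation, rhs_translation, lhs_dilation, rhs_dilation)
```

The factor $\frac{1}{r}$ in the dilation adjoint is the Jacobian of the substitution $y = rx$; dropping it would make the two dilation integrals disagree.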

Definition 6.2. Let $\Omega$ be a non-empty open subset of $\mathbb{R}^n$. Let $u \in \mathcal{D}'(\Omega)$ and $j \in \{1, \dots, n\}$.
The $j$-th partial derivative of $u$, $D_j u$ or $\frac{\partial u}{\partial x_j}$, in the sense of distributions is defined by the rule

hDj u, ϕi ..= hu, −Dj ϕi

for ϕ ∈ D(Ω).

Note that Dj fits into the adjoint identity scheme with T = Dj and S = −Dj , and so
is well-defined. Also note that Dj is continuous in the sense that if uk → u in D′ (Ω), then
Dj uk → Dj u in D′ (Ω). As in the one-dimensional case, when u ∈ C 1 (Ω) the distributional and
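classical partial derivatives D_1 u, . . . , D_n u coincide. Moreover, note that since for ϕ ∈ D(Ω) we
have
\[
\frac{\partial^2\varphi}{\partial x_j\,\partial x_k} = \frac{\partial^2\varphi}{\partial x_k\,\partial x_j},
\]
we also have D_j D_k u = D_k D_j u for u ∈ D′(Ω). We can therefore use multi-index notation for
distributional derivatives. For u ∈ D′(Ω) and α ∈ N_0^n we have
\[
\langle D^\alpha u, \varphi\rangle = (-1)^{|\alpha|}\langle u, D^\alpha\varphi\rangle
\]
for ϕ ∈ D(Ω), where we recall that α = (α_1, . . . , α_n), |α| = α_1 + · · · + α_n, and
\[
D^\alpha\varphi = \frac{\partial^{|\alpha|}\varphi}{\partial x_1^{\alpha_1}\cdots\partial x_n^{\alpha_n}}.
\]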

Definition 6.3. Let u ∈ D′(Ω) be a distribution and f ∈ C^∞(Ω) be a smooth function. Then the
product fu in the sense of distributions is defined by the rule
\[
\langle fu, \varphi\rangle := \langle u, f\varphi\rangle
\]
for ϕ ∈ D(Ω).

This definition also fits into the adjoint identity scheme, with T = S = multiplication by f, and so is
well-defined. It is clearly consistent, as in the one-dimensional case.
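Example 6.4. The Heaviside function is the function
\[
H(x) = \begin{cases} 0 & x < 0, \\ 1 & x \ge 0. \end{cases}
\]
Note that the value of H(x) at x = 0 is not particularly important and is sometimes taken to
be 0 instead (or in some other contexts even 1/2). Clearly H ∈ L^1_loc(R), so H ∈ D′(R) and we
have H′ = δ_0. Indeed, for ϕ ∈ D(R),
\begin{align*}
\langle H', \varphi\rangle &= \langle H, -\varphi'\rangle \\
&= \int_{-\infty}^{\infty} H(x)\,(-\varphi'(x))\,dx \\
&= -\int_{0}^{\infty} \varphi'(x)\,dx \\
&\overset{\text{FTC}}{=} -\bigl[\varphi(x)\bigr]_{x=0}^{x=\infty} \\
&= \varphi(0) = \langle\delta_0, \varphi\rangle.
\end{align*}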
Note also that for m ∈ N
\[
\Bigl\langle \frac{d^m}{dx^m}\delta_0, \varphi\Bigr\rangle = \langle \delta_0, (-1)^m\varphi^{(m)}\rangle = (-1)^m\varphi^{(m)}(0).
\]
A slight extension of the above formula for H′ is obtained by differentiation of a piecewise C^1
function
\[
h(x) = \begin{cases} f(x) & x < 0, \\ g(x) & x > 0, \end{cases}
\]
where f, g ∈ C^1(R). This will be addressed on Sheet 2.
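Example 6.5. We can define △_h = (τ_h − 1)/h for h ≠ 0 on distributions u ∈ D′(R) by the adjoint
identity scheme: for ϕ ∈ D(R) put
\[
\langle \triangle_h u, \varphi\rangle = \Bigl\langle u, \frac{\tau_{-h}-1}{h}\,\varphi\Bigr\rangle,
\qquad\text{where}\qquad
\Bigl(\frac{\tau_{-h}-1}{h}\,\varphi\Bigr)(x) = \frac{\varphi(x-h)-\varphi(x)}{h}.
\]
If u ∈ C^1(R), then clearly
\[
\triangle_h u(x) = \frac{u(x+h)-u(x)}{h} \longrightarrow u'(x) \quad\text{as } h \to 0
\]
locally uniformly in x. What happens when u ∈ D′(R)? One may check that
\[
\frac{\tau_{-h}-1}{h}\,\varphi \longrightarrow -\varphi' \quad\text{as } h \to 0
\]
in D(R) and therefore △_h u → u′ in D′(R) as h → 0.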
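The convergence △_h u → u′ can be observed numerically for u = H, where u′ = δ_0 by Example 6.4. The following is only an illustrative sketch: the helper names (bump, pair, difference_quotient_pairing), the particular test function and the midpoint rule are choices made here and are not part of the notes.

```python
import math

def bump(x):
    """Test function exp(-1/(1 - x^2)) on (-1, 1), extended by zero."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def heaviside(x):
    """Heaviside function with H(0) = 1."""
    return 1.0 if x >= 0.0 else 0.0

def pair(u, phi, a=-2.0, b=2.0, n=40_000):
    """Midpoint-rule approximation of <u, phi> = integral of u * phi over [a, b]."""
    dx = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * dx
        total += u(x) * phi(x)
    return total * dx

def difference_quotient_pairing(u, phi, h):
    """<Delta_h u, phi> := <u, (tau_{-h} phi - phi) / h>, as in Example 6.5."""
    return pair(u, lambda x: (phi(x - h) - phi(x)) / h)

# As h -> 0 the pairing should approach <delta_0, bump> = bump(0).
for h in (0.5, 0.1, 0.01):
    print(h, difference_quotient_pairing(heaviside, bump, h), bump(0.0))
```

For u = H the pairing equals the average of ϕ over [−h, 0], which explains why the printed values approach ϕ(0).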

Theorem 6.6 (Leibniz Rule). If u ∈ D′(Ω), f ∈ C^∞(Ω), and j ∈ {1, . . . , n}, then
\[
D_j(fu) = (D_j f)u + f D_j u
\]
in D′(Ω). In fact, the Generalized Leibniz Rule also holds for distributions: for a multi-index
α ∈ N_0^n,
\[
D^\alpha(fu) = \sum_{\beta\le\alpha}\binom{\alpha}{\beta}\, D^\beta f\, D^{\alpha-\beta}u.
\]

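Proof. We only prove the basic case; the general case can be proved by induction, or simply
by using the formula for test functions. First note that D_j(fu), (D_j f)u + f D_j u ∈ D′(Ω) and
that for ϕ ∈ D(Ω):
\[
\langle D_j(fu), \varphi\rangle = \langle fu, -D_j\varphi\rangle = \langle u, -f D_j\varphi\rangle,
\]
\begin{align*}
\langle (D_j f)u + f D_j u, \varphi\rangle &= \langle (D_j f)u, \varphi\rangle + \langle f D_j u, \varphi\rangle \\
&= \langle u, (D_j f)\varphi\rangle + \langle D_j u, f\varphi\rangle \\
&= \langle u, (D_j f)\varphi\rangle + \langle u, -D_j(f\varphi)\rangle \\
&= \langle u, (D_j f)\varphi - D_j(f\varphi)\rangle \\
&= \langle u, -f D_j\varphi\rangle,
\end{align*}
and we are done.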

7 Lecture 7

Theorem 7.1. Let Ω be a nonempty connected open subset of Rn . If u ∈ D′ (Ω) and

Dj u = 0

for j = 1, 2, . . . , n, then u is constant in the sense that there exists c ∈ R (or C) such that
\[
\langle u, \varphi\rangle = c\int_{\Omega} \varphi(x)\,dx
\]

for ϕ ∈ D(Ω).

Proof. We only give details for the case n = 1 and Ω = R. The general case can be done along
similar lines (not examinable). Suppose u ∈ D′ (R) satisfies u′ = 0, that is

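\[
0 = \langle u', \varphi\rangle = -\langle u, \varphi'\rangle
\]
for all ϕ ∈ D(R). Fix ρ ∈ D(R) with ∫_R ρ dx = 1 (the standard mollifier kernel on R will do).
For ϕ ∈ D(R) we put
\[
c_\varphi := \int_{\mathbb{R}} \varphi(x)\,dx
\]
and
\[
\psi(x) := \int_{-\infty}^{x} \bigl(\varphi(t) - c_\varphi\,\rho(t)\bigr)\,dt
\]
for x ∈ R. Take a, b ∈ R with a < b and ϕ(t) = 0 = ρ(t) for t ≤ a and t ≥ b. Then ψ(x) = 0
for x ≤ a, and for x ≥ b,
\[
\psi(x) = \int_{-\infty}^{x} \bigl(\varphi(t) - c_\varphi\,\rho(t)\bigr)\,dt = \int_{-\infty}^{\infty} \bigl(\varphi(t) - c_\varphi\,\rho(t)\bigr)\,dt = c_\varphi - c_\varphi\cdot 1 = 0.
\]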

By the FTC, ψ is C 1 with ψ ′ (x) = ϕ(x)−cϕ ρ(x) and hence ψ is C ∞ . Since also supp(ψ) ⊂ [a, b],
ψ ∈ D(R). Now

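\begin{align*}
\langle u, \varphi\rangle &= \langle u, \psi' + c_\varphi\,\rho\rangle \\
&= \langle u, \psi'\rangle + c_\varphi\langle u, \rho\rangle \\
&= \langle -u', \psi\rangle + c_\varphi\langle u, \rho\rangle \\
&= c_\varphi\langle u, \rho\rangle,
\end{align*}
so
\[
\langle u, \varphi\rangle = \langle u, \rho\rangle\int_{\mathbb{R}}\varphi(x)\,dx = c\int_{\mathbb{R}}\varphi(x)\,dx,
\]
where we denoted c = ⟨u, ρ⟩.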

Remark 7.2. Recall that any u ∈ L^1_loc(Ω) is uniquely determined by the corresponding distribution
\[
\langle u, \varphi\rangle = \int_{\Omega} u\varphi\,dx,
\]
ϕ ∈ D(Ω), and we do not distinguish between u as an L^1_loc function and u as a distribution in
our notation. In particular note that C^k(Ω) ⊂ L^1_loc(Ω), and that a distribution u ∈ D′(Ω) is a
C^k function precisely when
\[
\langle u, \varphi\rangle = \int_{\Omega} f(x)\varphi(x)\,dx
\]
for some f ∈ C^k(Ω). Now following the above convention we write u = f. However, keep in
mind that for u ∈ L^1_loc(Ω) and f ∈ C^k(Ω) we have from
\[
\int_{\Omega} u(x)\varphi(x)\,dx = \int_{\Omega} f(x)\varphi(x)\,dx \quad\text{for all } \varphi \in \mathcal{D}(\Omega)
\]
only that u = f a.e. in Ω.


Theorem 7.3. Let Ω be a nonempty open subset of Rn and u ∈ D′ (Ω). If Dj u ∈ C(Ω) for
j = 1, 2, . . . , n, then u ∈ C 1 (Ω).

Proof. We only consider the case n = 1 and Ω = R. Let (ρ_ε)_{ε>0} be the standard mollifier and
put u_ε = ρ_ε ∗ u. Then by Theorem 5.7 we have u_ε ∈ C^∞(R) and u_ε → u in D′(R) as ε → 0. By the
FTC,
\[
u_\varepsilon(x) - u_\varepsilon(y) = \int_y^x u_\varepsilon'(t)\,dt
\]
for all x, y ∈ R. By considering difference quotients as in Example 6.5, we find that u′_ε = ρ_ε ∗ u′.
Here the distributional derivative u′ is a continuous function, so u′_ε → u′ locally uniformly on
R. Now u_ε(x) = u_ε(y) + ∫_y^x u′_ε(t) dt, and multiplying by ρ(y) and then integrating over y ∈ R
gives (recall that ∫_R ρ dy = 1)
\[
u_\varepsilon(x) = \int_{-\infty}^{\infty} u_\varepsilon(y)\rho(y)\,dy + \int_{-\infty}^{\infty} \rho(y)\int_y^x u_\varepsilon'(t)\,dt\,dy.
\]

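Multiply by ϕ(x), where ϕ ∈ D(R), and integrate over x ∈ R:
\begin{align*}
\langle u_\varepsilon, \varphi\rangle &= \int_{-\infty}^{\infty} u_\varepsilon(x)\varphi(x)\,dx \\
&= \int_{-\infty}^{\infty} u_\varepsilon(y)\rho(y)\,dy \int_{-\infty}^{\infty} \varphi(x)\,dx + \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \rho(y)\int_y^x u_\varepsilon'(t)\,dt\,dy\,\varphi(x)\,dx.
\end{align*}
Taking ε → 0 we get
\begin{align*}
\langle u, \varphi\rangle &= \langle u, \rho\rangle\int_{-\infty}^{\infty} \varphi(x)\,dx + \int_{-\infty}^{\infty}\Bigl(\int_{-\infty}^{\infty} \rho(y)\int_y^x u'(t)\,dt\,dy\Bigr)\varphi(x)\,dx \\
&= \int_{-\infty}^{\infty}\Bigl(\langle u, \rho\rangle + \int_{-\infty}^{\infty} \rho(y)\int_y^x u'(t)\,dt\,dy\Bigr)\varphi(x)\,dx.
\end{align*}
Notice that the first term in parentheses is a constant, while the second is a C^1 function in x
(by the FTC). Therefore u ∈ C^1(R).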

What happens if u ∈ D′(Ω) and D_j u ∈ L^p(Ω) for j = 1, 2, . . . , n? We shall return to this
by the end of the course. Meanwhile we record a definition that is related to this question (and
its answer):
Definition 7.4. Let Ω be a nonempty open subset of R^n, m ∈ N, and p ∈ [1, ∞]. Then any
u ∈ L^p(Ω) such that D^α u ∈ L^p(Ω) for all |α| ≤ m is called a W^{m,p} Sobolev function and the
set of all such functions is denoted W^{m,p}(Ω). Thus
\[
W^{m,p}(\Omega) := \{u \in L^p(\Omega) : D^\alpha u \in L^p(\Omega) \text{ for all } |\alpha| \le m\}.
\]
It is not difficult to check that W^{m,p}(Ω) is a vector space under the usual definitions of
addition and scalar multiplication. It is called a Sobolev space, and is equipped with the norm
\[
\|u\|_{W^{m,p}} := \begin{cases}
\Bigl(\sum_{|\alpha|\le m} \|D^\alpha u\|_p^p\Bigr)^{1/p} & \text{if } p \in [1, \infty), \\[1ex]
\max_{|\alpha|\le m} \|D^\alpha u\|_\infty & \text{if } p = \infty.
\end{cases}
\]
This is a norm in the same sense that ‖·‖_p is a norm on L^p(Ω): one identifies functions that
agree a.e.

8 Lecture 8

Definition 8.1. Let f ∈ L^1(R^n). Then the Fourier transform of f is
\[
\mathcal{F}(f)(\xi) = \hat{f}(\xi) := \int_{\mathbb{R}^n} f(x)\,e^{-ix\cdot\xi}\,dx, \qquad \xi \in \mathbb{R}^n.
\]

Remark 8.2. Note that this is well-defined for each ξ ∈ R^n since
\[
|f(x)\,e^{-ix\cdot\xi}| = |f(x)|
\]
and f ∈ L^1(R^n). Here x · ξ = x_1ξ_1 + · · · + x_nξ_n is the usual dot product in R^n. Observe that

• |f̂(ξ)| ≤ ‖f‖_1 for all ξ ∈ R^n, and if f ≥ 0, then f̂(0) = ‖f‖_1,

• f̂ ∈ C(R^n) (by the DCT),

• f̂(ξ) → 0 as |ξ| → ∞ (by the Riemann–Lebesgue lemma which we shall prove later in
the course).

The precise range {f̂ : f ∈ L^1(R^n)} is not so easy to describe in terms not involving the Fourier
transform; however, it is strictly smaller than
\[
C_0(\mathbb{R}^n) := \Bigl\{ g \in C(\mathbb{R}^n) : \lim_{|x|\to\infty} g(x) = 0 \Bigr\}.
\]

One reason that we are interested in the Fourier transform here is its ability to transform
partial derivatives to an algebraic operation.

Lemma 8.3 (Differentiation Rule). Let f ∈ L^1(R^n) and assume D_j f ∈ L^1(R^n) for some
j ∈ {1, . . . , n}. Then
\[
\widehat{D_j f}(\xi) = i\xi_j\,\hat{f}(\xi).
\]

Note that here Dj f is the distributional derivative of f .

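Proof. Let φ ∈ D(R^n) be such that φ(x) = 1 for |x| ≤ 1 (exercise: think about how to construct
such a cut-off function). We then calculate
\begin{align*}
\widehat{D_j f}(\xi) &= \int_{\mathbb{R}^n} D_j f(x)\,e^{-ix\cdot\xi}\,dx \\
&\overset{\text{DCT}}{=} \lim_{r\to\infty}\int_{\mathbb{R}^n} D_j f(x)\,e^{-ix\cdot\xi}\,\phi\Bigl(\frac{x}{r}\Bigr)\,dx \\
&= \lim_{r\to\infty}\Bigl\langle D_j f,\; e^{-i(\cdot)\cdot\xi}\,\phi\Bigl(\frac{\cdot}{r}\Bigr)\Bigr\rangle \\
&= \lim_{r\to\infty}\Bigl\langle f,\; i\xi_j\,e^{-i(\cdot)\cdot\xi}\,\phi\Bigl(\frac{\cdot}{r}\Bigr) - e^{-i(\cdot)\cdot\xi}\,(D_j\phi)\Bigl(\frac{\cdot}{r}\Bigr)\frac{1}{r}\Bigr\rangle \\
&= \lim_{r\to\infty}\int_{\mathbb{R}^n} f(x)\Bigl( i\xi_j\,e^{-ix\cdot\xi}\,\phi\Bigl(\frac{x}{r}\Bigr) - e^{-ix\cdot\xi}\,(D_j\phi)\Bigl(\frac{x}{r}\Bigr)\frac{1}{r}\Bigr)\,dx \\
&\overset{\text{DCT}}{=} \int_{\mathbb{R}^n} f(x)\,i\xi_j\,e^{-ix\cdot\xi}\,dx = i\xi_j\,\hat{f}(\xi).
\end{align*}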

Example 8.4. Let f = 1_{(−1,1)}. Clearly f ∈ L^1(R), and
\[
\hat{f}(\xi) = \begin{cases} \dfrac{2\sin\xi}{\xi} & \text{for } \xi \ne 0, \\[1ex] 2 & \text{for } \xi = 0. \end{cases}
\]
Note that f̂ ∉ L^1(R).
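For completeness, the formula in Example 8.4 can be checked directly from Definition 8.1. The following is a routine computation written out here for ξ ≠ 0; it is not taken from the notes:
\[
\hat{f}(\xi) = \int_{-1}^{1} e^{-ix\xi}\,dx = \Bigl[\frac{e^{-ix\xi}}{-i\xi}\Bigr]_{x=-1}^{x=1} = \frac{e^{i\xi}-e^{-i\xi}}{i\xi} = \frac{2\sin\xi}{\xi},
\]
while f̂(0) = ∫_{−1}^{1} dx = 2. One way to see the failure of integrability is the estimate
\[
\int_{k\pi}^{(k+1)\pi} \frac{|\sin\xi|}{\xi}\,d\xi \ge \frac{1}{(k+1)\pi}\int_{k\pi}^{(k+1)\pi} |\sin\xi|\,d\xi = \frac{2}{(k+1)\pi}, \qquad k \in \mathbb{N},
\]
whose sum over k diverges.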
Example 8.5. Let ρ ∈ D(R) be the standard mollifier kernel on R (that is, ρ is an even function
satisfying 0 ≤ ρ ≤ 1, supp(ρ) = [−1, 1], and ∫_R ρ dx = 1). Then
\[
\hat{\rho}(\xi) = 2\int_0^1 \rho(x)\cos(x\xi)\,dx.
\]
It is not hard to check that ρ̂ ∈ C^∞(R), but supp(ρ̂) is not compact. Therefore ρ̂ ∉ D(R).
We shall return to this point when discussing the uncertainty principle later in the course.
However,
\[
\hat{\rho}(\xi) \to 0
\]
as |ξ| → ∞, and in fact, for any k, m ∈ N_0 we have
\[
|\xi|^k\,D^m\hat{\rho}(\xi) \longrightarrow 0
\]
as |ξ| → ∞.

We would like to extend the Fourier transform to distributions, and to that end we seek an
adjoint identity. The above example with ρ̂ ∉ D(R) shows that we will have to define a new
class of test functions. We shall return to that shortly, but first we need a few lemmas.
Lemma 8.6 (The Product Rule). Let f, g ∈ L^1(R^n). Then
\[
\int_{\mathbb{R}^n} f(x)\,\hat{g}(x)\,dx = \int_{\mathbb{R}^n} \hat{f}(x)\,g(x)\,dx.
\]

Note that both sides are well-defined since the Fourier transform of an L1 function is bounded
and continuous. We thus have an adjoint identity with S = F = T , but there is an issue with
the domain that we shall address later.

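Proof. This is an easy application of Fubini:
\begin{align*}
\int_{\mathbb{R}^n} f(x)\,\hat{g}(x)\,dx &= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} f(x)\,g(y)\,e^{-ix\cdot y}\,dy\,dx \\
&\overset{\text{Fubini}}{=} \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} f(x)\,g(y)\,e^{-ix\cdot y}\,dx\,dy \\
&= \int_{\mathbb{R}^n} \hat{f}(y)\,g(y)\,dy.
\end{align*}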

Before addressing the issues with the domain and the appropriate class of test functions,
let us investigate the properties of the Fourier transform on L1 functions a little more. It will
turn out to be a useful source of insight.

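Lemma 8.7 (Translation Rules). Let f ∈ L^1(R^n). Then
\[
\mathcal{F}(\tau_h f)(\xi) = e^{i\xi\cdot h}\,\hat{f}(\xi)
\]
and
\[
\mathcal{F}\bigl(e^{-ix\cdot h} f(x)\bigr)(\xi) = \tau_h\hat{f}(\xi)
\]
for any h ∈ R^n.

Proof. We simply calculate, substituting y = x + h,
\[
\mathcal{F}(\tau_h f)(\xi) = \int_{\mathbb{R}^n} f(x+h)\,e^{-ix\cdot\xi}\,dx = \int_{\mathbb{R}^n} f(y)\,e^{-i(y-h)\cdot\xi}\,dy = e^{ih\cdot\xi}\,\hat{f}(\xi),
\]
and
\[
\int_{\mathbb{R}^n} e^{-ix\cdot h} f(x)\,e^{-ix\cdot\xi}\,dx = \int_{\mathbb{R}^n} f(x)\,e^{-ix\cdot(\xi+h)}\,dx = \hat{f}(\xi+h) = \tau_h\hat{f}(\xi).
\]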
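The first translation rule can be checked numerically in dimension n = 1. The sketch below is illustrative only: the function names (fourier, gauss, translate), the truncation to [−12, 12] and the midpoint rule are assumptions made here, not part of the notes.

```python
import cmath
import math

def fourier(f, xi, a=-12.0, b=12.0, n=24_000):
    """Midpoint-rule approximation of the integral of f(x) e^{-i x xi} over [a, b]."""
    dx = (b - a) / n
    total = 0j
    for k in range(n):
        x = a + (k + 0.5) * dx
        total += f(x) * cmath.exp(-1j * x * xi)
    return total * dx

def gauss(x):
    """A rapidly decaying sample function, f(x) = exp(-x^2)."""
    return math.exp(-x * x)

def translate(f, h):
    """(tau_h f)(x) = f(x + h), the sign convention used in these notes."""
    return lambda x: f(x + h)

xi, h = 1.3, 0.7
lhs = fourier(translate(gauss, h), xi)             # F(tau_h f)(xi)
rhs = cmath.exp(1j * xi * h) * fourier(gauss, xi)  # e^{i xi h} f^(xi)
print(abs(lhs - rhs))
```

The printed difference should be at the level of rounding error, since the integrand is negligible outside the truncated interval.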

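Lemma 8.8 (Dilation Rules). Let f ∈ L^1(R^n) and denote
\[
(d_r f)(x) = f(rx)
\]
for r > 0. Then
\[
\mathcal{F}(d_r f)(\xi) = r^{-n}\hat{f}(r^{-1}\xi) = r^{-n}(d_{1/r}\hat{f})(\xi)
\]
and
\[
(d_r\hat{f})(\xi) = \mathcal{F}\bigl(r^{-n} d_{1/r} f\bigr)(\xi).
\]

Proof. The proof is a simple calculation as in the previous lemma. Substituting y = rx, dy = r^n dx,
\begin{align*}
\mathcal{F}(d_r f)(\xi) &= \int_{\mathbb{R}^n} f(rx)\,e^{-ix\cdot\xi}\,dx \\
&= \int_{\mathbb{R}^n} f(y)\,e^{-i\frac{y}{r}\cdot\xi}\,r^{-n}\,dy \\
&= r^{-n}\hat{f}\Bigl(\frac{\xi}{r}\Bigr) \\
&= r^{-n}(d_{1/r}\hat{f})(\xi),
\end{align*}
and, with the same substitution,
\begin{align*}
(d_r\hat{f})(\xi) &= \hat{f}(r\xi) \\
&= \int_{\mathbb{R}^n} f(x)\,e^{-ix\cdot r\xi}\,dx \\
&= \int_{\mathbb{R}^n} f\Bigl(\frac{y}{r}\Bigr)\,e^{-i\frac{y}{r}\cdot r\xi}\,r^{-n}\,dy \\
&= r^{-n}\int_{\mathbb{R}^n} (d_{1/r} f)(y)\,e^{-iy\cdot\xi}\,dy \\
&= r^{-n}\,\widehat{(d_{1/r} f)}(\xi).
\end{align*}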

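Lemma 8.9 (Convolution Rule). Let f, g ∈ L^1(R^n). Then f ∗ g ∈ L^1(R^n) and
\[
\mathcal{F}(f*g)(\xi) = \hat{f}(\xi)\,\hat{g}(\xi).
\]

Proof. By Fubini, and substituting z = x − y, dz = dx in the inner integral,
\begin{align*}
\mathcal{F}(f*g)(\xi) &= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} f(x-y)\,g(y)\,dy\;e^{-ix\cdot\xi}\,dx \\
&= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} f(x-y)\,e^{-i(x-y)\cdot\xi}\,dx\;g(y)\,e^{-iy\cdot\xi}\,dy \\
&= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} f(z)\,e^{-iz\cdot\xi}\,dz\;g(y)\,e^{-iy\cdot\xi}\,dy \\
&= \hat{f}(\xi)\,\hat{g}(\xi).
\end{align*}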

Lemma 8.10 (Reverse Differentiation Rule). Let f ∈ L^1(R^n) and assume x_j f(x) ∈ L^1(R^n) for
some j ∈ {1, . . . , n}. Then the distributional partial derivative D_j f̂ is a continuous function, given by
\[
(D_j\hat{f})(\xi) = \mathcal{F}\bigl(-ix_j f(x)\bigr)(\xi).
\]
In fact, D_j f̂ exists classically.

Proof. Let us start with the latter statement. Fix ξ ∈ R^n and h ∈ R \ {0} and consider the
following difference quotient. We have
\begin{align*}
\triangle_{he_j}\hat{f}(\xi) &:= \frac{\hat{f}(\xi+he_j)-\hat{f}(\xi)}{h} \\
&= \int_{\mathbb{R}^n} f(x)\,\triangle_{he_j}e^{-ix\cdot(\cdot)}(\xi)\,dx \\
&\overset{\text{DCT}}{\longrightarrow} \int_{\mathbb{R}^n} -ix_j f(x)\,e^{-ix\cdot\xi}\,dx \quad\text{as } h \to 0 \\
&= \mathcal{F}\bigl(-ix_j f(x)\bigr)(\xi),
\end{align*}
so the partial derivative D_j f̂ exists classically at ξ. Moreover, since the map ξ ↦
F(−ix_j f(x))(ξ) is continuous, so is D_j f̂. This is also the distributional derivative since as in
Example 6.5, we have △_{he_j} f̂ → D_j f̂ in D′(R^n) as h → 0. More precisely, one has that
\[
\langle \triangle_{he_j}\hat{f}, \varphi\rangle \longrightarrow \langle D_j\hat{f}, \varphi\rangle \quad\text{as } h \to 0
\]
for every ϕ ∈ D(R^n). But the difference quotient △_{he_j} f̂(ξ) converges locally uniformly in
ξ to the classical derivative too, so D_j f̂ can be understood in either sense. If unconvinced,
the reader is invited to write out the precise details of the last part of the argument as an
exercise.

9 Lecture 9

We start with the observation that the differentiation rules for the Fourier transform on L^1
admit generalizations to higher order derivatives. We formalize this using the following
notation. Recall that for a multi-index α = (α_1, . . . , α_n) ∈ N_0^n, x = (x_1, . . . , x_n) ∈ R^n and
D = (D_1, . . . , D_n), we write
\[
x^\alpha := x_1^{\alpha_1}\cdots x_n^{\alpha_n}
\]
and
\[
D^\alpha := \frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1}\cdots\partial x_n^{\alpha_n}}.
\]
We can use this notation to write out polynomials in n variables: if p(x) is a polynomial of
degree at most k, then
\[
p(x) = \sum_{|\alpha|\le k} c_\alpha x^\alpha,
\]
where c_α ∈ R (or C) and we sum over all multi-indices α ∈ N_0^n of length |α| ≤ k. Corresponding to
the polynomial p(x) is a linear partial differential operator
\[
p(D) := \sum_{|\alpha|\le k} c_\alpha D^\alpha.
\]
If c_α ≠ 0 for some α ∈ N_0^n with |α| = k, then we say p(D) has order k. Sometimes we also
write p(iD) or p(−iD), the notation being self-explanatory:
\[
p(iD) = \sum_{|\alpha|\le k} c_\alpha (iD)^\alpha = \sum_{|\alpha|\le k} c_\alpha\, i^{|\alpha|} D^\alpha,
\]
and so on.

Corollary 9.1 (Generalized Differentiation Rules). Let p(x) ∈ C[x] be a polynomial in n
variables.

1. If f ∈ L^1(R^n) and p(D)f ∈ L^1(R^n), then
\[
\widehat{p(D)f}(\xi) = p(i\xi)\,\hat{f}(\xi).
\]

2. If f ∈ L^1(R^n) and pf ∈ L^1(R^n), then
\[
(p(iD)\hat{f})(\xi) = \mathcal{F}(pf)(\xi).
\]

Note that in 1. p(D)f is understood distributionally.

We are now ready to address the domain issues in the adjoint identity for the Fourier
transform (Lemma 8.6).

Definition 9.2. A function f : R^n → R (or into C) is said to be rapidly decreasing if and only
if for every m ∈ N there exist r_m, c_m > 0 such that
\[
|f(x)| \le c_m |x|^{-m}
\]
for all |x| ≥ r_m.

Remark 9.3. A continuous function f is rapidly decreasing if and only if for any polynomial
p(x) the function x ↦ p(x)f(x) is bounded on R^n:
\[
\sup_{x\in\mathbb{R}^n} |p(x)f(x)| < \infty.
\]
Clearly this bound will depend on the polynomial p(x). Exercise: prove this.
Example 9.4.

• 1/(1 + x^{2m}) is not rapidly decreasing for any m ∈ N,

• e^{−x^2} is rapidly decreasing,

• e^{−|x|} is rapidly decreasing.
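The claims in Example 9.4 can be checked against Definition 9.2 as follows. This is a short sketch in dimension one; the exponent k below is an auxiliary index introduced here. For f(x) = 1/(1 + x^{2m}) and |x| ≥ 1,
\[
|x|^{2m+1} f(x) = \frac{|x|^{2m+1}}{1+x^{2m}} \ge \frac{|x|^{2m+1}}{2x^{2m}} = \frac{|x|}{2} \longrightarrow \infty \quad\text{as } |x| \to \infty,
\]
so no bound of the form |f(x)| ≤ c|x|^{−(2m+1)} can hold for all large |x|. On the other hand, the elementary inequality e^s ≥ s^k/k! for s ≥ 0 and k ∈ N gives, for x ≠ 0,
\[
e^{-|x|} \le k!\,|x|^{-k} \qquad\text{and}\qquad e^{-x^2} \le k!\,|x|^{-2k},
\]
which is the required decay for every order.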

Definition 9.5 (The Schwartz Space S(Rn )). We say that ϕ is a Schwartz test function on Rn
and write ϕ ∈ S(Rn ) if and only if ϕ ∈ C ∞ (Rn ) and for all α ∈ Nn0 , Dα ϕ is rapidly decreasing.

Example 9.6.

• e^{−|x|^2} ∈ S(R^n) \ D(R^n),

• e^{−|x|} ∉ S(R^n) because it is not differentiable at zero,

• ρ̂ ∈ S(R^n) \ D(R^n), where ρ is the standard mollifier kernel on R^n.

The following lemma collects a few of the properties of the class of Schwartz test functions.

Lemma 9.7.

(i) S(R^n) is a vector space (over R or C),

(ii) If p(x) is a polynomial and ϕ ∈ S(R^n), then pϕ ∈ S(R^n),

(iii) If p(x) is a polynomial and ϕ ∈ S(R^n), then p(D)ϕ ∈ S(R^n),

(iv) D(R^n) ⊊ S(R^n) ⊊ L^1(R^n).

Proof.

(i) If ϕ, ψ ∈ S(R^n) and λ ∈ R (or C), then certainly ϕ + λψ ∈ C^∞(R^n) because C^∞(R^n) is
a vector space. Now for a polynomial p on R^n we have
\[
\sup_x |p(x)(\varphi+\lambda\psi)(x)| \le \sup_x |p(x)\varphi(x)| + |\lambda|\sup_x |p(x)\psi(x)| < \infty.
\]
Thus ϕ + λψ is rapidly decreasing. The derivatives D^α(ϕ + λψ) are similarly seen to be
rapidly decreasing for all α ∈ N_0^n. Consequently ϕ + λψ ∈ S(R^n).
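(ii) Clearly pϕ ∈ C^∞(R^n). Fix α ∈ N_0^n and a polynomial q on R^n. By the Leibniz Rule
\[
D^\alpha(p\varphi) = \sum_{\beta\le\alpha}\binom{\alpha}{\beta} D^\beta p\, D^{\alpha-\beta}\varphi,
\]
and so
\begin{align*}
\sup_x |q\,D^\alpha(p\varphi)| &\le \sup_x \sum_{\beta\le\alpha}\binom{\alpha}{\beta} |q\,D^\beta p\, D^{\alpha-\beta}\varphi| \\
&\le \sum_{\beta\le\alpha}\binom{\alpha}{\beta} \sup_x |q\,D^\beta p\, D^{\alpha-\beta}\varphi| \\
&< \infty
\end{align*}
since q D^β p is a polynomial on R^n.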


(iii) For each α ∈ N_0^n we have D^α ϕ ∈ S(R^n), and so p(D)ϕ ∈ S(R^n) follows from (i).
(iv) We have already seen that D(R^n) ⊊ S(R^n). Let ϕ ∈ S(R^n). Then in particular ϕ is
continuous and rapidly decreasing, so for m = n + 2 we may find r = r_m and c = c_m > 0
such that
\[
|\varphi(x)| \le c|x|^{-n-2}
\]
for |x| ≥ r. By continuity, M := sup_{|x|≤r} |ϕ(x)| < ∞, and so, using polar coordinates
(ω_{n−1} denotes the surface area of the unit sphere in R^n),
\[
\int_{\mathbb{R}^n} |\varphi(x)|\,dx \le M\,\mathcal{L}^n(B_r(0)) + c\,\omega_{n-1}\int_r^\infty \frac{dt}{t^3} < \infty.
\]

10 Lecture 10

Theorem 10.1. F : S(Rn ) → S(Rn ) is a linear map.

Proof. By part (iv) of Lemma 9.7, S(Rn ) ⊂ L1 (Rn ) and so ϕ̂ is well-defined for ϕ ∈ S(Rn ).
We thus only need to check that ϕ̂ ∈ S(Rn ) whenever ϕ ∈ S(Rn ).
Recall that if ϕ ∈ S(R^n) ⊂ L^1(R^n), then ϕ̂ is continuous and
\[
|\hat{\varphi}(\xi)| = \Bigl|\int_{\mathbb{R}^n} \varphi(x)\,e^{-ix\cdot\xi}\,dx\Bigr| \le \|\varphi\|_1 \tag{5}
\]
for all ξ. Fix ϕ ∈ S(R^n).

Step 1: ϕ̂ is rapidly decreasing.
Since ϕ̂ is continuous it suffices to show that pϕ̂ is bounded whenever p is a polynomial in
n variables. So let p be such a polynomial, and recall from parts (iii) and (iv) of Lemma 9.7
that
\[
p(-iD)\varphi \in \mathcal{S}(\mathbb{R}^n) \subset L^1(\mathbb{R}^n).
\]
Hence by Corollary 9.1
\[
p(\xi)\,\hat{\varphi}(\xi) = \widehat{p(-iD)\varphi}\,(\xi),
\]
and so sup_ξ |p(ξ)ϕ̂(ξ)| < ∞ by (5) applied to p(−iD)ϕ.
Step 2: Let α ∈ N_0^n. Then D^α ϕ̂ is rapidly decreasing.
Again, D^α ϕ̂ is continuous so we only need to show that sup_ξ |p(ξ)D^α ϕ̂(ξ)| < ∞ whenever
p is a polynomial in n variables. By part (ii) of Lemma 9.7, ψ(x) := (−ix)^α ϕ(x) ∈ S(R^n), and
so by parts (iii) and (iv)
\[
p(-iD)\psi \in \mathcal{S}(\mathbb{R}^n) \subset L^1(\mathbb{R}^n).
\]
Then Corollary 9.1 yields
\[
p(\xi)\,D^\alpha\hat{\varphi}(\xi) = p(\xi)\,\hat{\psi}(\xi) = \widehat{p(-iD)\psi}\,(\xi),
\]
which is bounded by (5).

Remark 10.2. We record the following principle that is implicit in the above proof.

(a) Let m ∈ N_0. If f ∈ W^{m,1}(R^n), then sup_ξ (1 + |ξ|)^m |f̂(ξ)| < ∞,

(b) Let m ∈ N, m ≥ n + 1. If f ∈ L^1(R^n) and sup_x (1 + |x|)^m |f(x)| < ∞, then f̂ ∈ C^{m−n−1}(R^n)
and sup_ξ |D^α f̂(ξ)| < ∞ for |α| ≤ m − n − 1.

There is clearly a gap of n+1 derivatives between (a) and (b), but in the proof of Theorem 10.1
this did not matter because the definition of a Schwartz function involves C ∞ smoothness and
rapid decrease.

We could now proceed to define the Fourier transform for certain distributions using the
adjoint identity scheme: the product rule holds in particular for ϕ, ψ ∈ S(R^n), namely
\[
\int_{\mathbb{R}^n} \hat{\varphi}(x)\,\psi(x)\,dx = \int_{\mathbb{R}^n} \varphi(x)\,\hat{\psi}(x)\,dx.
\]
However, we shall first show that the Fourier transform is bijective on the Schwartz space.
Theorem 10.3 (Fourier Inversion Theorem on S(R^n)). The Fourier transform F : S(R^n) →
S(R^n) is bijective with inverse given by
\[
(\mathcal{F}^{-1}\psi)(x) = (2\pi)^{-n}\int_{\mathbb{R}^n} \psi(\xi)\,e^{ix\cdot\xi}\,d\xi.
\]

Proof. Let ϕ ∈ S(R^n). Can we recover ϕ from ϕ̂? Now ϕ̂ ∈ S(R^n) ⊂ L^1(R^n) so we may
consider
\begin{align*}
\mathcal{F}^2(\varphi)(x) &= \int_{\mathbb{R}^n} \hat{\varphi}(\xi)\,e^{-ix\cdot\xi}\,d\xi \\
&= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \varphi(y)\,e^{-i(x+y)\cdot\xi}\,dy\,d\xi.
\end{align*}
Observe that |ϕ(y)e^{−i(x+y)·ξ}| = |ϕ(y)| is not in general integrable over (y, ξ) ∈ R^n × R^n, so we
cannot use Fubini to swap the order of integration. Instead we use a trick (in reality it is a tool
from the theory of ‘summability methods’). This is based on the following.
Lemma 10.4 (Auxiliary Lemma). For t > 0 let G_t(x) := e^{−t|x|^2}, x ∈ R^n. Then
\[
\hat{G}_t(\xi) = \Bigl(\frac{\pi}{t}\Bigr)^{n/2} e^{-\frac{|\xi|^2}{4t}}, \qquad \xi \in \mathbb{R}^n,
\]
and the family (Ĝ_t)_{t>0} satisfies

(1) ∫_{R^n} Ĝ_t(ξ) dξ = (2π)^n for t > 0,

(2) Ĝ_t(ξ) ≥ 0 for ξ ∈ R^n, t > 0, and

(3) lim_{t→0+} Ĝ_t(ξ) = 0 uniformly in |ξ| ≥ ε for each fixed ε > 0.

As a consequence of points (1)–(3), the family
\[
\bigl((2\pi)^{-n}\hat{G}_t\bigr)_{t>0} = \Bigl(\frac{1}{(4\pi t)^{n/2}}\,e^{-\frac{|\xi|^2}{4t}}\Bigr)_{t>0}
\]
is an approximate identity, and for f ∈ S(R^n) we have that
\[
(\hat{G}_t * f)(x) \longrightarrow (2\pi)^n f(x) \quad\text{as } t \to 0+
\]
uniformly in x ∈ R^n. You might recognize the function (2π)^{−n} Ĝ_t as the heat kernel from the
Part A Differential Equations course. We postpone the proof of the Auxiliary Lemma for now.
Note that, by the DCT,
\[
\mathcal{F}^2(\varphi)(x) = \lim_{t\to 0+}\int_{\mathbb{R}^n} \hat{\varphi}(\xi)\,e^{-ix\cdot\xi}\,G_t(\xi)\,d\xi.
\]

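Now for t > 0 the function ξ ↦ e^{−ix·ξ−t|ξ|^2} is in L^1(R^n), and we have
\begin{align*}
\int_{\mathbb{R}^n} \hat{\varphi}(\xi)\,e^{-ix\cdot\xi}\,G_t(\xi)\,d\xi
&= \int_{\mathbb{R}^n} \hat{\varphi}(\xi)\,e^{-ix\cdot\xi-t|\xi|^2}\,d\xi \\
&= \int_{\mathbb{R}^n} \varphi(\xi)\,\mathcal{F}_{y\to\xi}\bigl(e^{-ix\cdot y-t|y|^2}\bigr)\,d\xi && \text{(product rule)} \\
&= \int_{\mathbb{R}^n} \varphi(\xi)\,\hat{G}_t(\xi+x)\,d\xi && \text{(translation rule)} \\
&= \int_{\mathbb{R}^n} \varphi(-\eta)\,\hat{G}_t(x-\eta)\,d\eta && (\eta = -\xi) \\
&= (\hat{G}_t * \tilde{\varphi})(x).
\end{align*}
From the Auxiliary Lemma it follows that
\[
(\hat{G}_t * \tilde{\varphi})(x) \longrightarrow (2\pi)^n\tilde{\varphi}(x) = (2\pi)^n\varphi(-x) \quad\text{as } t \to 0+
\]
uniformly in x ∈ R^n. Collecting the pieces we conclude that
\[
\mathcal{F}^2(\varphi)(x) = (2\pi)^n\varphi(-x),
\]
or equivalently that
\[
\varphi(x) = (2\pi)^{-n}\int_{\mathbb{R}^n} \hat{\varphi}(\xi)\,e^{ix\cdot\xi}\,d\xi,
\]
x ∈ R^n. It follows easily that F is bijective. It thus remains to prove the Auxiliary Lemma.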
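As a sanity check of the identity F²(ϕ)(x) = (2π)^n ϕ(−x), take n = 1 and ϕ = G_1. The following computation is only an illustration added here; it uses nothing beyond the formula for Ĝ_t stated in Lemma 10.4, whose proof is still to come:
\[
\hat{G}_1(\xi) = \sqrt{\pi}\,e^{-\xi^2/4} = \sqrt{\pi}\,G_{1/4}(\xi),
\qquad
\mathcal{F}^2(G_1)(x) = \sqrt{\pi}\,\hat{G}_{1/4}(x) = \sqrt{\pi}\cdot\sqrt{4\pi}\,e^{-x^2} = 2\pi\,G_1(-x).
\]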

11 Lecture 11

Proof of the Auxiliary Lemma. We express (1)-(3) by saying that ((2π)^{−n} Ĝ_t)_{t>0} is an
approximate identity and for f ∈ S(R^n) we have
\[
(\hat{G}_t * f)(x) \longrightarrow (2\pi)^n f(x) \quad\text{as } t \to 0+
\]
uniformly in x.
Remark 11.1. In addition to (3) we record that also

(3’) ∫_{|ξ|≥ε} Ĝ_t(ξ) dξ → 0 as t → 0+ for each fixed ε > 0.

Note that G_t = d_{√t} G_1, so from the dilation rule Ĝ_t = t^{−n/2} d_{t^{−1/2}} Ĝ_1. We can therefore focus
on the computation for t = 1. Put G := G_1, and note further that
\[
G(x) = e^{-|x|^2} = \prod_{j=1}^{n} e^{-x_j^2},
\]
hence by Fubini
\[
\hat{G}(\xi) = \prod_{j=1}^{n} \mathcal{F}_{x_j\to\xi_j}\bigl(e^{-x_j^2}\bigr)(\xi_j).
\]
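We may therefore further assume that n = 1 and G(x) = e^{−x^2}, x ∈ R. In this case we compute,
by completing the square,
\[
\hat{G}(\xi) = \int_{-\infty}^{\infty} e^{-x^2-ix\xi}\,dx = e^{-\frac{1}{4}\xi^2}\int_{-\infty}^{\infty} e^{-(x+\frac{i}{2}\xi)^2}\,dx.
\]
Put
\[
F(\xi) = \int_{-\infty}^{\infty} e^{-(x+\frac{i}{2}\xi)^2}\,dx
\]
for ξ ∈ R, and observe that by the DCT F is C^1 with
\begin{align*}
F'(\xi) &= \int_{-\infty}^{\infty} e^{-(x+\frac{i}{2}\xi)^2}\Bigl(-2\Bigl(x+\frac{i}{2}\xi\Bigr)\frac{i}{2}\Bigr)\,dx \\
&= \frac{i}{2}\int_{-\infty}^{\infty} \frac{d}{dx}\Bigl(e^{-(x+\frac{i}{2}\xi)^2}\Bigr)\,dx \\
&\overset{\text{FTC}}{=} \frac{i}{2}\Bigl[e^{-(x+\frac{i}{2}\xi)^2}\Bigr]_{-\infty}^{\infty} = 0.
\end{align*}
F is therefore constant: F(ξ) = F(0) = ∫_{−∞}^{∞} e^{−x^2} dx = √π. Thus
\[
\hat{G}(\xi) = \sqrt{\pi}\,e^{-\frac{\xi^2}{4}},
\]
as required.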
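The closed form for Ĝ_t in dimension one, and property (1), can be compared with a direct numerical integration. This is an illustrative sketch only: the helper names (midpoint, gauss_hat_numeric, gauss_hat_formula), the truncation intervals and the quadrature rule are choices made here, not part of the notes.

```python
import math

def midpoint(g, a, b, n):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    dx = (b - a) / n
    return sum(g(a + (k + 0.5) * dx) for k in range(n)) * dx

def gauss_hat_numeric(xi, t=1.0):
    """Numerical Fourier transform of G_t(x) = exp(-t x^2) for n = 1.

    Only the cosine part is integrated: the sine part vanishes because G_t is even.
    """
    half_width = 12.0 / math.sqrt(t)
    return midpoint(lambda x: math.exp(-t * x * x) * math.cos(x * xi),
                    -half_width, half_width, 20_000)

def gauss_hat_formula(xi, t=1.0):
    """Closed form (pi/t)^{1/2} exp(-xi^2 / (4t)) from Lemma 10.4 with n = 1."""
    return math.sqrt(math.pi / t) * math.exp(-xi * xi / (4.0 * t))

for xi in (0.0, 1.0, 2.5):
    print(xi, gauss_hat_numeric(xi), gauss_hat_formula(xi))

# Property (1) with n = 1: the integral of the transform over R should equal 2*pi.
t = 0.5
mass = midpoint(lambda xi: gauss_hat_formula(xi, t), -40.0, 40.0, 40_000)
print(mass, 2.0 * math.pi)
```

The two columns printed in the loop should agree to many digits, and the last line compares the total mass with 2π.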
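We return to the general case and the proof of (1)-(3). Point (1) follows from the above
calculation upon performing a substitution, and (2) is clear. For (3) we note that when |ξ| ≥ δ,
\[
|\hat{G}_t(\xi)| = \hat{G}_t(\xi) \le \Bigl(\frac{\pi}{t}\Bigr)^{n/2} e^{-\frac{\delta^2}{4t}}.
\]
For s > 0 one has e^s > s^n/n!, so e^{−s} < n!/s^n, and so
\[
\Bigl(\frac{\pi}{t}\Bigr)^{n/2} e^{-\frac{\delta^2}{4t}} < \frac{\pi^{n/2}\,n!\,4^n}{\delta^{2n}}\,t^{n/2} \longrightarrow 0 \quad\text{as } t \to 0+,
\]
from which (3) follows. Point (3’) is a minor variation: substituting η = ξ/(2√t), dη = (2√t)^{−n} dξ,
\[
\int_{|\xi|>\delta} \hat{G}_t(\xi)\,d\xi = 2^n\pi^{n/2}\int_{|\eta|>(2\sqrt{t})^{-1}\delta} e^{-|\eta|^2}\,d\eta \longrightarrow 0 \quad\text{as } t \to 0+.
\]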
Finally, for f ∈ S(R^n) we have that f is in particular uniformly continuous, so given ε > 0 we
can find δ > 0 such that |f(y) − f(x)| < ε whenever |x − y| ≤ δ. Write K_t := (2π)^{−n} Ĝ_t, so that
K_t ≥ 0 and ∫_{R^n} K_t(ξ) dξ = 1 by (1) and (2). As f is also bounded we get
\begin{align*}
|(K_t * f)(x) - f(x)| &\overset{(1)}{\le} \int_{\mathbb{R}^n} K_t(x-y)\,|f(y)-f(x)|\,dy \\
&\le \int_{|x-y|>\delta} K_t(x-y)\,dy\cdot 2\|f\|_\infty + \int_{B_\delta(x)} K_t(x-y)\,dy\cdot\varepsilon,
\end{align*}
and hence from (1) and (3’)
\[
\limsup_{t\to 0+} |(K_t * f)(x) - f(x)| \le \varepsilon.
\]
The choices of δ and t above only depend on ε, so convergence is uniform in x ∈ R^n, and
multiplying by (2π)^n gives Ĝ_t ∗ f → (2π)^n f uniformly. One
can show that the conclusion can also be obtained only from (1)-(3); the use of (3’) is not
necessary.

Remark 11.2. When (K_t)_{t>0} is an approximate identity, it is not difficult to show that
\[
\|K_t * f - f\|_1 \longrightarrow 0 \quad\text{as } t \to 0+
\]
whenever f ∈ L^1(R^n).

11.1 Recap on the Fourier Transform

For f ∈ L^1(R^n) we defined
\[
\mathcal{F}(f)(\xi) = \hat{f}(\xi) := \int_{\mathbb{R}^n} f(x)\,e^{-ix\cdot\xi}\,dx,
\]
for ξ ∈ R^n. Here F maps L^1(R^n) into C_0(R^n), but one can show that it is not onto. The
situation is better when we consider the Fourier transform on S(R^n); then
\[
\mathcal{F} : \mathcal{S}(\mathbb{R}^n) \longrightarrow \mathcal{S}(\mathbb{R}^n)
\]
is bijective with
\[
\mathcal{F}^{-1}(g)(x) = \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n} g(\xi)\,e^{ix\cdot\xi}\,d\xi,
\]
x ∈ R^n. Note that we may write this as F^{−1} = (2π)^{−n} F̃, where the order in which we perform
the operations g ↦ F(g) and g ↦ g̃ is unimportant.
We return to the task of defining the Fourier transform on distributions using the adjoint
identity:
\[
\int_{\mathbb{R}^n} \varphi(x)\,\hat{\psi}(x)\,dx = \int_{\mathbb{R}^n} \hat{\varphi}(x)\,\psi(x)\,dx
\]

for ϕ, ψ ∈ S(R^n). Observe that the distribution should be defined on S(R^n) rather than on
D(R^n), and since D(R^n) ⊊ S(R^n), it is likely that we will have to exclude some distributions.
As with D′(R^n) we start with a notion of convergence on S(R^n). In connection with this we
recall the following characterization of S(R^n):
\[
\varphi \in \mathcal{S}(\mathbb{R}^n) \iff \Bigl(\varphi \in C^\infty(\mathbb{R}^n) \text{ and } \forall\alpha,\beta\in\mathbb{N}_0^n:\; S_{\alpha,\beta}(\varphi) := \sup_{x\in\mathbb{R}^n} |x^\alpha D^\beta\varphi(x)| < \infty\Bigr).
\]
We may also replace the condition for S_{α,β}(ϕ) by the following apparently stronger, but in fact
equivalent, condition
\[
S_{p,q}(\varphi) := \sup_{x\in\mathbb{R}^n} |p(x)\,(q(D)\varphi)(x)| < \infty
\]
for all polynomials p and q on R^n. For k, l ∈ N_0 put
\[
\bar{S}_{k,l}(\varphi) := \max_{|\alpha|\le k,\ |\beta|\le l} S_{\alpha,\beta}(\varphi).
\]
Note that all three of S_{α,β}, S_{p,q}, and S̄_{k,l} are semi-norms. Observe that
\[
S_{p,q}(\varphi) \le c\,\bar{S}_{\deg p,\,\deg q}(\varphi)
\]
for all ϕ ∈ S(R^n), where c is a constant that only depends on the polynomials p and q and
whose precise value is not important here.
Lemma 11.3. Let p be a polynomial on R^n. Then there exists a constant c = c(p) such that
for all α, β ∈ N_0^n and ϕ ∈ S(R^n),
\[
S_{\alpha,\beta}(p\varphi) \le c\,\bar{S}_{|\alpha|+\deg p,\,|\beta|}(\varphi),
\]
\[
S_{\alpha,\beta}(p(D)\varphi) \le c\,\bar{S}_{|\alpha|,\,|\beta|+\deg p}(\varphi).
\]
The proof is a simple but somewhat tedious application of the Leibniz rule and we omit the
details here.
Definition 11.4. Let ϕ_j, ϕ ∈ S(R^n). Then we say ϕ_j converges to ϕ in the sense of the
Schwartz test functions, and write ϕ_j → ϕ in S(R^n), if and only if
\[
S_{\alpha,\beta}(\varphi-\varphi_j) \longrightarrow 0
\]
as j → ∞ for all α, β ∈ N_0^n. This can also be stated in terms of S_{p,q} or S̄_{k,l}.
Remark 11.5 (A metric on S(R^n)). Define for ϕ, ψ ∈ S(R^n)
\[
d(\varphi,\psi) := \sum_{k,l\in\mathbb{N}_0} 2^{-k-l}\,\frac{\bar{S}_{k,l}(\varphi-\psi)}{1+\bar{S}_{k,l}(\varphi-\psi)}.
\]
Then d is a metric on S(R^n), and we have ϕ_j → ϕ in S(R^n) if and only if d(ϕ_j, ϕ) → 0. Note
that d is translation invariant,
\[
d(\varphi+\eta,\psi+\eta) = d(\varphi,\psi),
\]
and it can be shown that (S(R^n), d) is complete (such a space is called a Fréchet space).
We may compare this to the notion of convergence in D(R^n). If ϕ_j, ϕ ∈ D(R^n) and ϕ_j → ϕ
in D(R^n), then ϕ_j, ϕ ∈ S(R^n) and ϕ_j → ϕ in S(R^n). The converse, however, is
clearly false.
Lemma 11.6. Let p be a polynomial on Rn . Then the maps ϕ 7→ pϕ, ϕ 7→ p(D)ϕ, and ϕ 7→ ϕ̂
are continuous with respect to the convergence on S(Rn ).

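Proof. The first two follow immediately from Lemma 11.3. For the Fourier transform, note
that for all α, β ∈ N_0^n we have ξ^α D^β ϕ̂(ξ) = (−i)^{|α|+|β|} F_{x→ξ}(D^α(x^β ϕ(x))) by Corollary 9.1, so
\begin{align*}
S_{\alpha,\beta}(\hat{\varphi}) &= \sup_\xi \bigl|\mathcal{F}_{x\to\xi}\bigl(D^\alpha(x^\beta\varphi(x))\bigr)\bigr| \\
&\le \int_{\mathbb{R}^n} |D^\alpha(x^\beta\varphi(x))|\,dx \\
&= \int_{\mathbb{R}^n} \frac{1}{1+|x|^{2n}}\,(1+|x|^{2n})\,|D^\alpha(x^\beta\varphi(x))|\,dx \\
&\le \underbrace{\int_{\mathbb{R}^n} \frac{dx}{1+|x|^{2n}}}_{=:c}\;\sup_x\,(1+|x|^{2n})\,|D^\alpha(x^\beta\varphi(x))|.
\end{align*}
By convexity of the map t ↦ t^n, we have
\[
|x|^{2n} = (x_1^2+\cdots+x_n^2)^n \le n^{n-1}(x_1^{2n}+\cdots+x_n^{2n})
\]
and thus
\begin{align*}
\sup_x\,(1+|x|^{2n})\,|D^\alpha(x^\beta\varphi(x))| &\le \sup_x \Bigl(1+n^{n-1}\sum_{j=1}^n x_j^{2n}\Bigr)|D^\alpha(x^\beta\varphi)| \\
&\le \sup_x |D^\alpha(x^\beta\varphi)| + n^{n-1}\sum_{j=1}^n \sup_x |x_j^{2n}D^\alpha(x^\beta\varphi)| \\
&\le c_\beta\,\bar{S}_{|\beta|,|\alpha|}(\varphi) + n^n c_\beta\,\bar{S}_{2n+|\beta|,|\alpha|}(\varphi) && \text{(Lemma 11.3)} \\
&\le C\,\bar{S}_{2n+|\beta|,|\alpha|}(\varphi),
\end{align*}
where C = (1 + n^n)c_β. We have thus shown that
\[
S_{\alpha,\beta}(\hat{\varphi}) \le cC\,\bar{S}_{2n+|\beta|,|\alpha|}(\varphi),
\]
and hence F is continuous.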

Definition 11.7 (Tempered Distributions). A functional u : S(R^n) → R (or into C) is a
tempered distribution if and only if u is linear and u is continuous on S(R^n) in the sense that
\[
\langle u, \varphi_j\rangle \longrightarrow \langle u, \varphi\rangle
\]
whenever ϕ_j → ϕ in S(R^n). The set of all tempered distributions is denoted by S′(R^n). It is
clearly a vector space with the usual definitions of vector space operations.

Remark 11.8. Since D(R^n) ⊂ S(R^n) and D-convergence implies S-convergence, it follows
that S′(R^n) ⊂ D′(R^n). The space D′(R^n) is genuinely larger than S′(R^n), since for example
u = e^{x^2} ∈ L^1_loc(R) ⊂ D′(R) by the rule
\[
\langle u, \varphi\rangle = \int_{-\infty}^{\infty} e^{x^2}\varphi(x)\,dx.
\]
However, u ∉ S′(R) as for instance ϕ = e^{−x^2} ∈ S(R) but uϕ = 1_R ∉ L^1(R).

Definition 11.9 (Convergence of Tempered Distributions). For a sequence (uj ) in S ′ (Rn ) and
u ∈ S ′ (Rn ) we write
uj −→ u in S ′ (Rn )
if and only if huj , ϕi → hu, ϕi for each fixed ϕ ∈ S(Rn ).

Remark 11.10. This is stronger than convergence in D′ (Rn ).

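Using adjoint identities and Lemma 11.6 we get the following.

Definition 11.11. For α ∈ N_0^n, a polynomial p ∈ C[x] and u ∈ S′(R^n) we define the tempered
distributions D^α u, pu, and û by the rules
\begin{align*}
\langle D^\alpha u, \varphi\rangle &:= (-1)^{|\alpha|}\langle u, D^\alpha\varphi\rangle, \\
\langle pu, \varphi\rangle &:= \langle u, p\varphi\rangle, \\
\langle \hat{u}, \varphi\rangle &:= \langle u, \hat{\varphi}\rangle
\end{align*}
for ϕ ∈ S(R^n).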
We also define the translation τh u, dilation dr u, and ũ as on D′ (Rn ). All of the above operations
are linear and continuous in the sense of S ′ (Rn ), and the rules for the Fourier Transform on
S(Rn ) also hold on S ′ (Rn ).

Theorem 11.12 (Fourier Inversion Formula on S ′ (Rn )). The map F : S ′ (Rn ) → S ′ (Rn ) is a
linear bijection with inverse F −1 = (2π)−n F̃.

Proof. By chasing definitions and using the Fourier Inversion Formula on S(Rn ), for u ∈ S ′ (Rn )
and ϕ ∈ S(Rn ) we have

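\[
\langle (2\pi)^{-n}\tilde{\mathcal{F}}\mathcal{F}u, \varphi\rangle = \langle u, (2\pi)^{-n}\mathcal{F}\tilde{\mathcal{F}}\varphi\rangle = \langle u, \varphi\rangle = \langle u, (2\pi)^{-n}\tilde{\mathcal{F}}\mathcal{F}\varphi\rangle = \langle (2\pi)^{-n}\mathcal{F}\tilde{\mathcal{F}}u, \varphi\rangle.
\]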

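Example 11.13. Let u ∈ L^p(R^n), p ∈ [1, ∞]. Then u ∈ S′(R^n) by the rule
\[
\langle u, \varphi\rangle = \int_{\mathbb{R}^n} u\varphi\,dx
\]
for ϕ ∈ S(R^n). We show that it is well-defined and continuous by noting that
\[
\mathcal{S}(\mathbb{R}^n) \subset (L^1\cap L^\infty)(\mathbb{R}^n) \subset L^q(\mathbb{R}^n)
\]
for 1/p + 1/q = 1, so by Hölder’s inequality
\[
\int_{\mathbb{R}^n} |u\varphi|\,dx \le \|u\|_p\,\|\varphi\|_q.
\]
If 1 ≤ q < ∞, then
\begin{align*}
\|\varphi\|_q^q &= \int_{\mathbb{R}^n} \frac{1}{1+|x|^{2nq}}\,(1+|x|^{2nq})\,|\varphi|^q\,dx \\
&\le \int_{\mathbb{R}^n} \frac{dx}{1+|x|^{2nq}}\;\sup_x\,(1+|x|^{2nq})\,|\varphi(x)|^q \\
&\le \int_{\mathbb{R}^n} \frac{dx}{1+|x|^{2nq}}\,\Bigl((\sup_x |\varphi|)^q + (\sup_x |x|^{2n}|\varphi(x)|)^q\Bigr) \\
&= \int_{\mathbb{R}^n} \frac{dx}{1+|x|^{2nq}}\,\Bigl(S_{0,0}(\varphi)^q + S_{|x|^{2n},0}(\varphi)^q\Bigr) < \infty.
\end{align*}
If q = ∞, then ‖ϕ‖_∞ = S_{0,0}(ϕ). It follows that ⟨u, ϕ⟩ is well-defined and continuous: if ϕ_j → ϕ
in S(R^n), then S_{0,0}(ϕ_j − ϕ) → 0, S_{|x|^{2n},0}(ϕ_j − ϕ) → 0 and hence ⟨u, ϕ_j − ϕ⟩ → 0. This implies
continuity by linearity of u.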

The example with u = e^{x^2} shows that we cannot expect to make sense of general L^p_loc
functions as tempered distributions: to be a tempered distribution, a function cannot grow
too quickly at infinity. This is admittedly quite vague, but it has to be. Indeed, consider
u = cos(e^x). This is bounded, so u ∈ S′(R) by the above. Thus also u′ ∈ S′(R), but it is easy
to see that u′ = −sin(e^x)e^x. So is exponential growth allowed or not?
Finally, let us record that the Dirac delta function is a tempered distribution:
\[
\langle \delta_{x_0}, \varphi\rangle = \varphi(x_0)
\]
for ϕ ∈ S(R^n). Note that δ_{x_0} = τ_{−x_0} δ_0 and δ̂_0 = 1.
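The identity δ̂_0 = 1 follows by unwinding the definition ⟨û, ϕ⟩ := ⟨u, ϕ̂⟩ of the Fourier transform on S′(R^n). The one-line check below is added here for convenience and is not part of the notes: for ϕ ∈ S(R^n),
\[
\langle \hat{\delta}_0, \varphi\rangle = \langle \delta_0, \hat{\varphi}\rangle = \hat{\varphi}(0) = \int_{\mathbb{R}^n} \varphi(x)\,dx = \langle 1, \varphi\rangle.
\]
In the same way, ⟨δ̂_{x_0}, ϕ⟩ = ϕ̂(x_0) = ∫_{R^n} e^{−ix_0·x} ϕ(x) dx, so δ̂_{x_0} is the bounded function ξ ↦ e^{−ix_0·ξ}.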


Proposition 11.14. Let u : S(R^n) → R (or into C) be linear. Then u ∈ S′(R^n) if and only if
there exist constants c > 0, k, l ∈ N_0 such that
\[
|\langle u, \varphi\rangle| \le c\,\bar{S}_{k,l}(\varphi) \tag{6}
\]
holds for all ϕ ∈ S(R^n). Recall that we denoted S̄_{k,l}(ϕ) = max_{|α|≤k, |β|≤l} S_{α,β}(ϕ) and
S_{α,β}(ϕ) = sup_x |x^α D^β ϕ(x)|.

Remark 11.15. Tempered distributions have finite order.

Proof. The ‘if’ part is clear. To prove the ‘only if’ statement, assume u is S-continuous but
that (6) fails for every choice c = k = l = j ∈ N: there exist ϕ_j ∈ S(R^n) such that
\[
|\langle u, \varphi_j\rangle| > j\,\bar{S}_{j,j}(\varphi_j).
\]
Clearly ϕ_j ≠ 0, so S̄_{j,j}(ϕ_j) > 0 and we may define
\[
\psi_j = \frac{\varphi_j}{j\,\bar{S}_{j,j}(\varphi_j)} \in \mathcal{S}(\mathbb{R}^n).
\]
For fixed α, β ∈ N_0^n and all j ≥ max(|α|, |β|) we have S_{α,β}(ψ_j) ≤ j^{−1} → 0, hence ψ_j → 0 in S(R^n) and so
⟨u, ψ_j⟩ → 0. But this contradicts |⟨u, ψ_j⟩| > 1.

12 Lecture 12

Definition 12.1. A function f : R^n → R (or into C) is of polynomial growth if and only if
there exist constants c > 0, m ∈ N_0 such that
\[
|f(x)| \le c(1+|x|)^m
\]
for all x ∈ R^n.
Remark 12.2. Obviously f is of polynomial growth if and only if there exists a polynomial p
on R^n such that |f(x)| ≤ |p(x)| for all x.
Lemma 12.3. If u ∈ L^∞_loc(R^n) is of polynomial growth, then
\[
\langle u, \varphi\rangle = \int_{\mathbb{R}^n} \varphi(x)\,u(x)\,dx
\]
for ϕ ∈ S(R^n) is a tempered distribution.

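Proof. Let p ∈ R[x] be chosen so that
\[
|u(x)| \le |p(x)|
\]
for almost every x. Then for ϕ ∈ S(R^n)
\begin{align*}
|\langle u, \varphi\rangle| &\le \int_{\mathbb{R}^n} |p(x)\varphi(x)|\,dx \\
&= \int_{\mathbb{R}^n} \frac{1}{1+|x|^{2n}}\,(1+|x|^{2n})\,|p(x)\varphi(x)|\,dx \\
&\le \int_{\mathbb{R}^n} \frac{dx}{1+|x|^{2n}}\;\sup_x\,(1+|x|^{2n})\,|p(x)\varphi(x)| \\
&= c\,S_{P,0}(\varphi),
\end{align*}
where c := ∫_{R^n} dx/(1 + |x|^{2n}) and P(x) = (1 + |x|^{2n})p(x).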

Definition 12.4. A function a ∈ C^∞(R^n) is said to be of moderate growth if and only if a and
all partial derivatives D^α a, α ∈ N_0^n, have polynomial growth: for every α ∈ N_0^n there exists p_α ∈ R[x] such that
\[
|D^\alpha a(x)| \le |p_\alpha(x)|
\]
for all x ∈ R^n.

Example 12.5. Polynomials have moderate growth.

Lemma 12.6. Let a ∈ C^∞(R^n) be of moderate growth (a is a moderate C^∞ function) and
u ∈ S′(R^n). Then
\[
\langle au, \varphi\rangle := \langle u, a\varphi\rangle,
\]
ϕ ∈ S(R^n), defines a tempered distribution au. Furthermore, the map
\[
\mathcal{S}'(\mathbb{R}^n) \ni u \longmapsto au \in \mathcal{S}'(\mathbb{R}^n)
\]
is linear and continuous.

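Proof. Fix α, β ∈ N_0^n. Then for ϕ ∈ S(R^n) we compute using the Leibniz rule
\[
x^\alpha D^\beta(a\varphi) = \sum_{\gamma\le\beta}\binom{\beta}{\gamma}(D^\gamma a)\,x^\alpha\,(D^{\beta-\gamma}\varphi).
\]
For each γ ≤ β, D^γ a has polynomial growth so we can find constants c_γ > 0, m_γ ∈ N_0 such
that
\[
|D^\gamma a(x)| \le c_\gamma(1+|x|)^{m_\gamma}
\]
for all x ∈ R^n. Here
\[
(1+|x|)^{m_\gamma} \le (1+|x|)^{2m_\gamma} \le 2^{2m_\gamma-1}(1+|x|^{2m_\gamma})
\]
and
\[
|x|^{2m_\gamma} = (x_1^2+\cdots+x_n^2)^{m_\gamma} \le n^{m_\gamma-1}(x_1^{2m_\gamma}+\cdots+x_n^{2m_\gamma}),
\]
and so
\[
|D^\gamma a(x)| \le 2^{2m_\gamma-1}c_\gamma + 2^{2m_\gamma-1}n^{m_\gamma-1}c_\gamma\sum_{j=1}^n x_j^{2m_\gamma}.
\]
Put m̄ = max_{γ≤β} m_γ, c̄ = 2^{2m̄−1} n^{m̄} max_{γ≤β} c_γ. Then
\[
|D^\gamma a(x)| \le \bar{c}\Bigl(1+\sum_{j=1}^n x_j^{2\bar{m}}\Bigr)
\]
and so
\begin{align*}
S_{\alpha,\beta}(a\varphi) &\le \bar{c}\sum_{\gamma\le\beta}\binom{\beta}{\gamma}\sup_x\Bigl(1+\sum_{j=1}^n x_j^{2\bar{m}}\Bigr)|x^\alpha D^{\beta-\gamma}\varphi| \\
&\le \bar{c}\sum_{\gamma\le\beta}\binom{\beta}{\gamma}\Bigl(S_{\alpha,\beta-\gamma}(\varphi) + n\,\bar{S}_{|\alpha|+2\bar{m},|\beta|}(\varphi)\Bigr) \\
&\le \bar{c}\sum_{\gamma\le\beta}\binom{\beta}{\gamma}(n+1)\,\bar{S}_{|\alpha|+2\bar{m},|\beta|}(\varphi) \\
&\le c\,\bar{S}_{|\alpha|+2\bar{m},|\beta|}(\varphi).
\end{align*}
It follows that ϕ ↦ aϕ is S-continuous and hence au ∈ S′(R^n) since the linearity of au is clear
once we know it is well-defined. Next, u ↦ au is clearly linear and S′-continuous: the latter
follows by definition-chasing. Indeed, when u_j → u in S′(R^n), then
\[
\langle u_j, a\varphi\rangle \longrightarrow \langle u, a\varphi\rangle
\]
for each ϕ ∈ S(R^n), that is
\[
\langle au_j, \varphi\rangle \longrightarrow \langle au, \varphi\rangle
\]
for each ϕ ∈ S(R^n).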

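Lemma 12.7. If $u \in \mathcal{S}'(\mathbb{R}^n)$ and $\theta \in \mathcal{S}(\mathbb{R}^n)$, then $u * \theta$ can be defined by the adjoint identity scheme as
\[
\langle u * \theta, \varphi \rangle = \langle u, \tilde\theta * \varphi \rangle
\]
for all $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Furthermore, $u * \theta$ is a $C^\infty$ function of moderate growth and is given by
\[
(u * \theta)(x) = \langle u, \theta(x - \cdot) \rangle
\]
for $x \in \mathbb{R}^n$.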
Remark 12.8. We have not emphasized it so far, but since the convolution product is commu-
tative on S(Rn ), ϕ ∗ ψ = ψ ∗ ϕ for ϕ, ψ ∈ S(Rn ), we also have u ∗ θ = θ ∗ u. The proof is easy
(exercise).
Definition 12.9. We can define convolutions of $u, v \in \mathcal{S}'(\mathbb{R}^n)$ provided $\hat v$ is a $C^\infty$ function of moderate growth:
\[
u * v := \mathcal{F}^{-1}(\hat u \hat v).
\]
This is a good definition by virtue of the Fourier Inversion Formula on $\mathcal{S}'(\mathbb{R}^n)$ and Lemma 12.6.

The various rules for the Fourier transform continue to hold for tempered distributions.
Theorem 12.10 (Convolution Rule). Let $u, v \in \mathcal{S}'(\mathbb{R}^n)$ and assume $v$ is a $C^\infty$ function of moderate growth. Then $uv \in \mathcal{S}'(\mathbb{R}^n)$ and
\[
\widehat{uv} = (2\pi)^{-n} \, \hat u * \hat v.
\]

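Proof. By the Inversion Formula we define
\[
\hat u * \hat v := \mathcal{F}^{-1}\big( \hat{\hat u} \, \hat{\hat v} \big),
\]
where $\hat{\hat u} = (2\pi)^n \tilde u$, $\hat{\hat v} = (2\pi)^n \tilde v$, and the latter is clearly a $C^\infty$ function of moderate growth. Hence $\hat{\hat u} \, \hat{\hat v} \in \mathcal{S}'(\mathbb{R}^n)$ by Lemma 12.6, and using $\tilde u \tilde v = \widetilde{uv}$ together with $\mathcal{F}^{-1} = (2\pi)^{-n} \tilde{\mathcal{F}}$ we get
\[
\hat u * \hat v = \mathcal{F}^{-1}\big( (2\pi)^{2n} \widetilde{uv} \big) = (2\pi)^{-n} \tilde{\mathcal{F}}\big( (2\pi)^{2n} \widetilde{uv} \big) = (2\pi)^n \, \widehat{uv}.
\]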

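Theorem 12.11 (Plancherel's Theorem). The Fourier transform $\mathcal{F} \colon L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ is bijective, and $(2\pi)^{-n/2} \mathcal{F}$ is unitary (isometric and onto). That is, $\mathcal{F}(L^2) = L^2$ and
\[
\|\hat f\|_{L^2} = (2\pi)^{n/2} \|f\|_{L^2}
\]
for $f \in L^2(\mathbb{R}^n)$, and more generally
\[
\int_{\mathbb{R}^n} f(x) \overline{g(x)} \, dx = (2\pi)^{-n} \int_{\mathbb{R}^n} \hat f(\xi) \overline{\hat g(\xi)} \, d\xi
\]
for $f, g \in L^2(\mathbb{R}^n)$.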

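Proof. We start by observing that for $\varphi, \psi \in \mathcal{S}(\mathbb{R}^n)$
\[
\int_{\mathbb{R}^n} \varphi \bar\psi \, dx = (2\pi)^{-n} \int_{\mathbb{R}^n} \hat\varphi \, \overline{\hat\psi} \, d\xi,
\]
and in particular for $\varphi = \psi$
\[
\int_{\mathbb{R}^n} |\varphi|^2 \, dx = (2\pi)^{-n} \int_{\mathbb{R}^n} |\hat\varphi|^2 \, d\xi. \qquad (7)
\]
This follows from the Product Rule and the Inversion Formula on $\mathcal{S}(\mathbb{R}^n)$: clearly $\bar\psi \in \mathcal{S}(\mathbb{R}^n)$, so $\mathcal{F}^{-1}(\bar\psi) \in \mathcal{S}(\mathbb{R}^n)$ and so
\[
\int \varphi \bar\psi \, dx = \int \varphi \, \mathcal{F}(\mathcal{F}^{-1} \bar\psi) \, dx = \int \hat\varphi \, \mathcal{F}^{-1}(\bar\psi) \, dx.
\]
Now
\[
\mathcal{F}^{-1}(\bar\psi)(x) = (2\pi)^{-n} \int \overline{\psi(y)} e^{i x \cdot y} \, dy = (2\pi)^{-n} \overline{\int \psi(y) e^{-i x \cdot y} \, dy} = (2\pi)^{-n} \overline{\hat\psi(x)}.
\]
If now $f \in L^2(\mathbb{R}^n)$ we know that there exist $f_j \in \mathcal{D}(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n)$ with $\|f - f_j\|_{L^2} \to 0$. Clearly this means in particular that $f_j \to f$ in $\mathcal{S}'(\mathbb{R}^n)$, and thus by $\mathcal{S}'$-continuity of the Fourier transform, $\hat f_j \to \hat f$ in $\mathcal{S}'(\mathbb{R}^n)$. By (7) we see that
\[
\int_{\mathbb{R}^n} |\hat f_j - \hat f_k|^2 \, d\xi = (2\pi)^n \int_{\mathbb{R}^n} |f_j - f_k|^2 \, dx,
\]
so $(\hat f_j)$ is Cauchy in $L^2(\mathbb{R}^n)$. It is thus convergent in $L^2$ (by Riesz–Fischer), $\hat f_j \to g$ in $L^2(\mathbb{R}^n)$ for some $g \in L^2(\mathbb{R}^n)$. Clearly then $\hat f_j \to g$ in $\mathcal{S}'(\mathbb{R}^n)$ too, and so $g = \hat f$. Letting $j \to \infty$ in (7) applied to $f_j$ yields $\|\hat f\|_{L^2} = (2\pi)^{n/2} \|f\|_{L^2}$.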
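As a numerical sanity check of the normalisation in (7), the following sketch compares both sides for a Gaussian in dimension $n = 1$. It is an illustration only: the helper `trapezoid`, the test function $f(x) = e^{-x^2/2}$ and the truncation to $[-12, 12]$ are choices made here, and the closed form $\hat f(\xi) = \sqrt{2\pi}\, e^{-\xi^2/2}$ is the standard one for the convention $\hat f(\xi) = \int f(x) e^{-ix\xi}\,dx$ used in these notes.

```python
import math


def trapezoid(func, a, b, steps):
    """Composite trapezoid rule for func on the interval [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h


def f(x):
    # Test function (an arbitrary choice for this illustration).
    return math.exp(-x * x / 2.0)


def f_hat(xi):
    # Closed-form Fourier transform of f for the convention
    # f_hat(xi) = integral of f(x) * exp(-i x xi) dx.
    return math.sqrt(2.0 * math.pi) * math.exp(-xi * xi / 2.0)


# Left-hand side of (7): integral of |f|^2 dx.
lhs = trapezoid(lambda x: f(x) ** 2, -12.0, 12.0, 20000)
# Right-hand side of (7): (2 pi)^(-1) * integral of |f_hat|^2 d xi.
rhs = trapezoid(lambda xi: f_hat(xi) ** 2, -12.0, 12.0, 20000) / (2.0 * math.pi)

print(lhs, rhs)
```

Both quantities should agree with $\sqrt{\pi}$ up to quadrature error, which is consistent with the factor $(2\pi)^{-n}$ in (7).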

What happens on the other Lp spaces?

Theorem 12.12 (Hausdorff–Young). For $p \in (1, 2)$ and $\frac{1}{p} + \frac{1}{q} = 1$ we have for $f \in L^p(\mathbb{R}^n)$ that $\hat f \in L^q(\mathbb{R}^n)$ with
\[
\|\hat f\|_{L^q} \le (2\pi)^{n/q} \|f\|_{L^p}.
\]

Remark 12.13. For p > 2 the image F(Lp (Rn )) contains tempered distributions of positive
orders.

13 Lecture 13

Recall that a partial differential operator (PDO) with constant coefficients can be written as
\[
P(D) = \sum_{|\alpha| \le m} a_\alpha D^\alpha
\]
for $a_\alpha \in \mathbb{C}$, and if $a_\alpha \ne 0$ for some $\alpha$ with $|\alpha| = m$, we say $P(D)$ is of order $m$.

Definition 13.1. A fundamental solution for $P(D)$ is any $E \in \mathcal{S}'(\mathbb{R}^n)$ such that $P(D)E = \delta_0$.

Example 13.2. Recall from Problem Sheet 1 that $E(x) = \frac{x_+^m}{m!}$ satisfies
\[
\frac{d^{m+1}}{dx^{m+1}} E = \delta_0
\]
in $\mathcal{D}'(\mathbb{R})$, and from Problem Sheet 2 that $E(x) = \frac{1}{2\pi} \log |x|$ satisfies
\[
\Delta E = \delta_0
\]
in $\mathcal{D}'(\mathbb{R}^2)$. It is not difficult to check that in both cases the distribution $E$ is tempered and so is a fundamental solution.

We refer to the polynomial
\[
p(\xi) = \sum_{|\alpha| \le m} a_\alpha \xi^\alpha
\]
as the symbol of the PDO $P(D)$, and, provided $P(D)$ has order $m$, we call
\[
\sum_{|\alpha| = m} a_\alpha \xi^\alpha
\]
the principal symbol of $P(D)$.

Definition 13.3. A PDO
\[
P(D) = \sum_{|\alpha| \le m} a_\alpha D^\alpha
\]
of order $m$ is called elliptic if and only if the principal symbol satisfies
\[
\sum_{|\alpha| = m} a_\alpha \xi^\alpha \ne 0
\]
for all $\xi \in \mathbb{R}^n \setminus \{0\}$.

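The most prominent examples of elliptic PDOs are the Laplacian on $\mathbb{R}^n$,
\[
\Delta = D_1^2 + \dots + D_n^2,
\]
and the Cauchy–Riemann operators on the plane,
\[
\frac{\partial}{\partial \bar z} = \frac{1}{2} \left( \frac{\partial}{\partial x} + i \frac{\partial}{\partial y} \right), \qquad \frac{\partial}{\partial z} = \frac{1}{2} \left( \frac{\partial}{\partial x} - i \frac{\partial}{\partial y} \right).
\]
They are of orders 2 and 1 respectively, and their principal symbols are
\[
-|\xi|^2, \qquad -\frac{1}{2}(\xi_1 + i\xi_2), \qquad -\frac{1}{2}(\xi_1 - i\xi_2),
\]
respectively. Clearly the condition for ellipticity is satisfied by each of them.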
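Example 13.4. Recall that on $\mathbb{R}^2 \simeq \mathbb{C}$ we have
\[
\Delta = 4 \frac{\partial^2}{\partial z \, \partial \bar z},
\]
and $\frac{1}{2\pi} \log |z|$ is a fundamental solution for $\Delta$:
\[
\begin{aligned}
\delta_0 &= \Delta \left( \frac{1}{2\pi} \log |z| \right) \\
&= \frac{2}{\pi} \frac{\partial^2}{\partial \bar z \, \partial z} (\log |z|) \\
&= \frac{1}{\pi} \frac{\partial}{\partial \bar z} \left( \frac{\partial}{\partial z} \log (z \bar z) \right) \\
&= \frac{1}{\pi} \frac{\partial}{\partial \bar z} \left( \frac{1}{z \bar z} \, \bar z \right) \\
&= \frac{\partial}{\partial \bar z} \left( \frac{1}{\pi z} \right),
\end{aligned}
\]
so $(\pi z)^{-1}$ is a fundamental solution for $\frac{\partial}{\partial \bar z}$.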

Theorem 13.5. Assume E is a fundamental solution for a PDO P (D). Then for f ∈ S(Rn )
the general solution to
P (D)u = f
in S ′ (Rn ) is given by
u = E ∗ f + h,
where h ∈ S ′ (Rn ) ∩ ker P (D).

Remark 13.6. We can allow any f ∈ S ′ (Rn ) on the right-hand side for which we can define
E ∗ f as a tempered distribution (for instance, if fˆ is a moderate C ∞ function, but also much
more generally; we have not defined convolutions in full generality).

Proof. We simply chase definitions: E ∗ f ∈ S ′ (Rn ) and

P (D)(E ∗ f ) = (P (D)E) ∗ f = δ0 ∗ f = f,

so E ∗ f is a particular solution and hence the general solution is as stated.

Theorem 13.7. Let $n \ge 3$. Then
\[
E(x) = -\frac{1}{(n-2)\omega_{n-1}} |x|^{2-n},
\]
$x \in \mathbb{R}^n \setminus \{0\}$, is a fundamental solution for $\Delta$.

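Remark 13.8. Note that $E$ is $C^\infty$ away from zero, $E \in L^1_{loc}(\mathbb{R}^n) \cap \mathcal{S}'(\mathbb{R}^n)$, and that
\[
D_j E = \frac{1}{\omega_{n-1}} x_j |x|^{-n} \in L^1_{loc}(\mathbb{R}^n).
\]
The constant $\omega_{n-1}$ is the surface area of $S^{n-1}$ in $\mathbb{R}^n$. One can show that $\omega_{n-1} = n \mathcal{L}^n(B_1(0))$, and
\[
\mathcal{L}^n(B_1(0)) = \frac{\pi^{n/2}}{\Gamma\left(\frac{n}{2} + 1\right)}
\]
for all $n \in \mathbb{N}$. In particular, we record the values for $n = 2$ and $3$:
\[
\omega_1 = 2 \mathcal{L}^2(B_1(0)) = 2\pi, \qquad \omega_2 = 3 \mathcal{L}^3(B_1(0)) = 4\pi.
\]
The calculation uses $\Gamma(x+1) = x\Gamma(x)$ and $\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$. Note also that

• in $n = 1$ the Laplacian $\frac{d^2}{dx^2}$ has fundamental solution $x_+$,
• in $n = 2$ the Laplacian $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$ has fundamental solution $\frac{1}{2\pi} \log \sqrt{x^2 + y^2}$. This is called the logarithmic potential;
• in $n = 3$ the Laplacian $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$ has fundamental solution
\[
-\frac{1}{4\pi} \frac{1}{\sqrt{x^2 + y^2 + z^2}}.
\]
This is called the Newtonian potential.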
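The values of the surface area constant quoted in Remark 13.8 can be checked numerically from the Gamma function formula. The sketch below is only an illustration; the function names `unit_ball_volume` and `unit_sphere_area` are introduced here and do not appear in the notes.

```python
import math


def unit_ball_volume(n):
    """Lebesgue measure of the unit ball in R^n: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2.0) / math.gamma(n / 2.0 + 1.0)


def unit_sphere_area(n):
    """Surface area omega_{n-1} of the unit sphere S^(n-1) in R^n."""
    return n * unit_ball_volume(n)


for n in (1, 2, 3):
    print(n, unit_sphere_area(n))
```

For $n = 2$ and $n = 3$ this should reproduce $2\pi$ and $4\pi$; for $n = 1$ it gives $\omega_0 = 2$, the number of points of $S^0 = \{-1, 1\}$.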

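Proof. Fourier transforming $\Delta E = \delta_0$ we get
\[
1 = \hat\delta_0 = \widehat{\Delta E} = -|\xi|^2 \hat E(\xi).
\]
This is not enough to deduce that $\hat E(\xi) = -\frac{1}{|\xi|^2}$, only that
\[
\hat E(\xi) = -\frac{1}{|\xi|^2} + \hat T
\]
for some $T \in \mathcal{S}'(\mathbb{R}^n)$ satisfying $\Delta T = 0$. This means that $-|\xi|^2 \hat T = 0$ in $\mathcal{S}'(\mathbb{R}^n)$. Hence if $\varphi \in \mathcal{S}(\mathbb{R}^n)$ and $0 \notin \operatorname{supp} \varphi$, then
\[
\psi(\xi) = -\frac{\varphi(\xi)}{|\xi|^2}
\]
for $\xi \ne 0$ and $\psi(0) = 0$ belongs to $\mathcal{S}(\mathbb{R}^n)$, and so
\[
\langle \hat T, \varphi \rangle = \langle -|\xi|^2 \hat T, \psi \rangle = 0.
\]
We express this by $\operatorname{supp} \hat T \subseteq \{0\}$, that is, $\hat T$ is supported in $\{0\}$. We discuss below how this implies that $\hat T \in \operatorname{span}\{D^\alpha \delta_0 : \alpha \in \mathbb{N}_0^n\}$, and hence that $T \in \operatorname{span}\{(2\pi)^{-n} (ix)^\alpha : \alpha \in \mathbb{N}_0^n\} = \mathbb{C}[x]$. Since also $\Delta T = 0$, we see that $T$ must be a harmonic polynomial. Note that implicit in this is the Liouville-type result saying that if $T \in \mathcal{S}'(\mathbb{R}^n)$ is harmonic, then $T$ is a polynomial. Now we return to the quest for fundamental solutions:
\[
\hat E(\xi) = -\frac{1}{|\xi|^2} + \hat T(\xi).
\]
We only need one, so consider $\hat E = -\frac{1}{|\xi|^2}$. The result then follows from the following.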

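Lemma 13.9 (Auxiliary Lemma). Let $\alpha \in (-n, 0)$ and put $f(x) = |x|^\alpha$. Then $f \in L^1_{loc}(\mathbb{R}^n) \cap \mathcal{S}'(\mathbb{R}^n)$ and $\hat f(\xi) = c(n, \alpha) |\xi|^{-n-\alpha}$, where
\[
c(n, \alpha) = 2^{\alpha+n} \pi^{n/2} \frac{\Gamma\left(\frac{n+\alpha}{2}\right)}{\Gamma\left(-\frac{\alpha}{2}\right)}
\]
and $-n < -n-\alpha < 0$.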

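Proof. We start with the observation that for $x \ne 0$
\[
|x|^\alpha \, \Gamma\left(-\frac{\alpha}{2}\right) = |x|^\alpha \int_0^\infty t^{-\frac{\alpha}{2}-1} e^{-t} \, dt = \int_0^\infty s^{-\frac{\alpha}{2}-1} e^{-s|x|^2} \, ds,
\]
where we made the substitution $t = s|x|^2$, and hence
\[
|x|^\alpha = \frac{1}{\Gamma\left(-\frac{\alpha}{2}\right)} \int_0^\infty s^{-\frac{\alpha}{2}-1} e^{-s|x|^2} \, ds.
\]
Note that
\[
\frac{1}{\Gamma\left(-\frac{\alpha}{2}\right)} \int_0^j s^{-\frac{\alpha}{2}-1} e^{-s|x|^2} \, ds \xrightarrow[j \to \infty]{} |x|^\alpha
\]
in $\mathcal{S}'(\mathbb{R}^n)$ and Riemann sums for the integrals
\[
\frac{1}{\Gamma\left(-\frac{\alpha}{2}\right)} \int_0^j s^{-\frac{\alpha}{2}-1} e^{-s|x|^2} \, ds
\]
for $j$ fixed converge as the mesh size tends to zero in the $\mathcal{S}'(\mathbb{R}^n)$ sense. Consequently we get by $\mathcal{S}'$-continuity and linearity of $\mathcal{F}$ that
\[
\begin{aligned}
\mathcal{F}_{x \to \xi}(|x|^\alpha) &= \frac{1}{\Gamma\left(-\frac{\alpha}{2}\right)} \int_0^\infty s^{-\frac{\alpha}{2}-1} \mathcal{F}_{x \to \xi}\big(e^{-s|x|^2}\big) \, ds \\
&= \frac{1}{\Gamma\left(-\frac{\alpha}{2}\right)} \int_0^\infty s^{-\frac{\alpha}{2}-1} \left(\frac{\pi}{s}\right)^{\frac{n}{2}} e^{-\frac{|\xi|^2}{4s}} \, ds \\
&= \frac{\pi^{n/2}}{\Gamma\left(-\frac{\alpha}{2}\right)} \left(\frac{|\xi|}{2}\right)^{-n-\alpha} \int_0^\infty t^{\frac{n+\alpha}{2}-1} e^{-t} \, dt \\
&= 2^{n+\alpha} \pi^{n/2} \frac{\Gamma\left(\frac{n+\alpha}{2}\right)}{\Gamma\left(-\frac{\alpha}{2}\right)} |\xi|^{-n-\alpha},
\end{aligned}
\]
where the third line uses the substitution $t = \frac{|\xi|^2}{4s}$.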
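To see how Lemma 13.9 yields the constant in Theorem 13.7, take $\alpha = 2 - n$: then $\mathcal{F}(|x|^{2-n}) = c(n, 2-n)|\xi|^{-2}$, so $\hat E = -|\xi|^{-2}$ requires $c(n, 2-n) = (n-2)\omega_{n-1}$. The sketch below checks this identity numerically for a few dimensions. The function names `c` and `sphere_area` are introduced here for illustration; the formulas are the ones stated in Lemma 13.9 and Remark 13.8.

```python
import math


def c(n, alpha):
    """Constant c(n, alpha) from Lemma 13.9, valid for -n < alpha < 0."""
    return (2.0 ** (alpha + n) * math.pi ** (n / 2.0)
            * math.gamma((n + alpha) / 2.0) / math.gamma(-alpha / 2.0))


def sphere_area(n):
    """omega_{n-1} = n * pi^(n/2) / Gamma(n/2 + 1), as in Remark 13.8."""
    return n * math.pi ** (n / 2.0) / math.gamma(n / 2.0 + 1.0)


# Compare c(n, 2 - n) with (n - 2) * omega_{n-1} for n = 3, ..., 8.
pairs = [(c(n, 2 - n), (n - 2) * sphere_area(n)) for n in range(3, 9)]
for lhs, rhs in pairs:
    print(lhs, rhs)
```

For $n = 3$ both columns should equal $4\pi$, in agreement with the Newtonian potential $-\frac{1}{4\pi|x|}$.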

13.1 Localization of Distributions

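Theorem 13.10. If $u \in \mathcal{D}'(\Omega)$ and for each $x \in \Omega$ there exists $r_x > 0$ such that $\langle u, \varphi \rangle = 0$ for all $\varphi \in \mathcal{D}(\Omega \cap B_{r_x}(x))$, then $u = 0$.
Remark 13.11. For an open subset $\omega \subset \Omega$ we define the restriction of $u \in \mathcal{D}'(\Omega)$ to $\omega$, denoted $u|_\omega$, by
\[
\langle u|_\omega, \varphi \rangle := \langle u, \varphi \rangle
\]
for all $\varphi \in \mathcal{D}(\omega)$. Clearly $u|_\omega \in \mathcal{D}'(\omega)$, and the assumption in the above theorem is that $u|_{\Omega \cap B_{r_x}(x)} = 0$.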

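Proof. (Not examinable) Let $\varphi \in \mathcal{D}(\Omega)$. Since we clearly have
\[
\operatorname{supp} \varphi \subset \bigcup \{ \Omega \cap B_{r_x}(x) : x \in \Omega \},
\]
the compactness of $\operatorname{supp} \varphi$ means that we can find a finite subcover, say
\[
\operatorname{supp} \varphi \subset \big( \Omega \cap B_{r_{x_1}}(x_1) \big) \cup \dots \cup \big( \Omega \cap B_{r_{x_m}}(x_m) \big).
\]
Using Theorem 2.8 we find a partition of unity $\phi_1, \dots, \phi_m \in \mathcal{D}(\Omega)$ with $\operatorname{supp} \phi_j \subset \Omega \cap B_{r_{x_j}}(x_j)$, $0 \le \phi_j \le 1$, and $\sum_{j=1}^m \phi_j = 1$ on $\operatorname{supp} \varphi$. Thus
\[
\langle u, \varphi \rangle = \Big\langle u, \sum_{j=1}^m \varphi \phi_j \Big\rangle = \sum_{j=1}^m \langle u, \varphi \phi_j \rangle = 0.
\]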

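Definition 13.12. For $u \in \mathcal{D}'(\Omega)$ the support of $u$ is
\[
\operatorname{supp} u := \{ x \in \Omega : u|_{\Omega \cap B_r(x)} \ne 0 \text{ for all } r > 0 \}.
\]
Thus $x \in \Omega \setminus \operatorname{supp} u$ if and only if there exists $r > 0$ such that $u|_{\Omega \cap B_r(x)} = 0$. Consequently the above theorem means in particular that $\Omega \setminus \operatorname{supp} u$ must be the largest open subset of $\Omega$ on which $u$ vanishes. It also follows from this that the support of $u$ is a relatively closed subset of $\Omega$.
Theorem 13.13. Let $u \in \mathcal{D}'(\Omega)$ and $x_0 \in \Omega$. If $\operatorname{supp} u = \{x_0\}$, then $u \in \operatorname{span}\{D^\alpha \delta_{x_0} : \alpha \in \mathbb{N}_0^n\}$.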

The proof is omitted.

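Example 13.14. If $u \in C(\Omega)$, its support is by definition
\[
\operatorname{supp} u = \Omega \cap \overline{\{ x \in \Omega : u(x) \ne 0 \}}.
\]
We can also obviously consider the corresponding distribution
\[
\langle u, \varphi \rangle = \int u \varphi \, dx
\]
for $\varphi \in \mathcal{D}(\Omega)$. Its support as a distribution coincides with the above definition of the support for a continuous function and we may therefore use the same notation for both. To justify this, let $D$ be the support of the distribution. If $\varphi \in \mathcal{D}(\Omega)$ and $\operatorname{supp} \varphi \subset \Omega \setminus \operatorname{supp} u$, then
\[
\langle u, \varphi \rangle = \int u \varphi \, dx = 0,
\]
so the open set $\Omega \setminus \operatorname{supp} u$ must be contained in the largest open set on which $u$ vanishes, that is $\Omega \setminus \operatorname{supp} u \subset \Omega \setminus D$, i.e. $D \subset \operatorname{supp} u$. If there were a point $x_0 \in \operatorname{supp} u \setminus D$, then we could find $\varphi \in \mathcal{D}(\Omega)$ supported near $x_0$ and in $\Omega \setminus D$ such that $\langle u, \varphi \rangle \ne 0$, contradicting the fact that the distribution $u$ vanishes on $\Omega \setminus D$. Hence $D = \operatorname{supp} u$.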

14 Lecture 14

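Theorem 14.1 (Weyl's Lemma). Assume $u \in \mathcal{D}'(\Omega)$ and $\Delta u = 0$ in $\mathcal{D}'(\Omega)$. Then $u \in C^\infty(\Omega)$ and $u$ is harmonic.
Corollary 14.2. Let $\Omega \subset \mathbb{C}$ be open and assume $f \in \mathcal{D}'(\Omega)$ satisfies
\[
\frac{\partial f}{\partial \bar z} = 0
\]
in $\mathcal{D}'(\Omega)$. Then $f$ is holomorphic.

Proof. This is clear since we obviously also have that
\[
\Delta f = 4 \frac{\partial}{\partial z} \left( \frac{\partial}{\partial \bar z} f \right) = 0
\]
in $\mathcal{D}'(\Omega)$. Weyl's Lemma then implies that $f$ is $C^\infty$, in which case distributional and classical derivatives coincide. Thus $f$ satisfies the usual Cauchy–Riemann equations and is a holomorphic function.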

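Proof of Weyl's Lemma. Let $(\rho_\varepsilon)_{\varepsilon > 0}$ be the standard mollifier. Fix $\Omega' \Subset \Omega$ and put $\varepsilon_0 = \operatorname{dist}(\Omega', \partial\Omega)$. For each $x \in \Omega'$ and $\varepsilon \in (0, \varepsilon_0)$ the function
\[
y \longmapsto \rho_\varepsilon(x - y)
\]
belongs to $\mathcal{D}(\Omega)$ and so we may consider $\langle u, \rho_\varepsilon(x - \cdot) \rangle$. We assert that it is independent of $\varepsilon \in (0, \varepsilon_0)$. To prove it we calculate $\frac{d}{d\varepsilon} \rho_\varepsilon(x - y)$ for $x, y \in \mathbb{R}^n$. Recall that
\[
\rho_\varepsilon(x - y) = \varepsilon^{-n} \rho\left( \frac{x - y}{\varepsilon} \right)
\]
and that $\rho(x) = \theta(|x|^2)$ (since $\rho$ was a smooth radial function), where $\theta \in C^\infty(\mathbb{R})$ satisfies $\theta(t) = 0$ for $t \ge 1$. Now calculate
\[
\begin{aligned}
\frac{d}{d\varepsilon} \left[ \varepsilon^{-n} \rho\left( \frac{x-y}{\varepsilon} \right) \right] &= -n \varepsilon^{-n-1} \rho\left( \frac{x-y}{\varepsilon} \right) - \varepsilon^{-n} \nabla\rho\left( \frac{x-y}{\varepsilon} \right) \cdot \frac{x-y}{\varepsilon^2} \\
&= -\frac{1}{\varepsilon^{n+1}} \left[ n \rho\left( \frac{x-y}{\varepsilon} \right) + \nabla\rho\left( \frac{x-y}{\varepsilon} \right) \cdot \frac{x-y}{\varepsilon} \right].
\end{aligned}
\]
Put $K(x) = -n\rho(x) - \nabla\rho(x) \cdot x$ so that
\[
\frac{d}{d\varepsilon} \left[ \varepsilon^{-n} \rho\left( \frac{x-y}{\varepsilon} \right) \right] = \frac{1}{\varepsilon^{n+1}} K\left( \frac{x-y}{\varepsilon} \right).
\]
We now use that $\rho(x) = \theta(|x|^2)$. Hereby
\[
K(x) = -\operatorname{div}\big( \rho(x) x \big) = -\operatorname{div}\big( \theta(|x|^2) x \big)
\]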

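and if we set
\[
\Theta(t) = \frac{1}{2} \int_t^\infty \theta(s) \, ds,
\]
then $\Theta \in C^\infty(\mathbb{R})$ with $\Theta(t) = 0$ for $t \ge 1$, and $\Theta'(t) = -\frac{1}{2} \theta(t)$. Consequently
\[
-\theta(|x|^2) x = \nabla\big( \Theta(|x|^2) \big),
\]
and so $K(x) = \operatorname{div} \nabla\big( \Theta(|x|^2) \big) = (\Delta\Phi)(x)$, where $\Phi(x) = \Theta(|x|^2)$. Observe that $\Phi \in \mathcal{D}(B_1(0))$, and
\[
\begin{aligned}
-\frac{1}{\varepsilon^{n+1}} \left[ n \rho\left( \frac{x-y}{\varepsilon} \right) + \nabla\rho\left( \frac{x-y}{\varepsilon} \right) \cdot \frac{x-y}{\varepsilon} \right] &= \frac{1}{\varepsilon^{n+1}} (\Delta\Phi)\left( \frac{x-y}{\varepsilon} \right) \\
&= \Delta_y \left[ \varepsilon^{1-n} \Phi\left( \frac{x-y}{\varepsilon} \right) \right].
\end{aligned}
\]
Here $y \mapsto \varepsilon^{1-n} \Phi\left( \frac{x-y}{\varepsilon} \right)$ is supported in $B_\varepsilon(x) \subset \Omega$, and so by the assumption $\Delta u = 0$
\[
\left\langle u, \Delta_y \left[ \varepsilon^{1-n} \Phi\left( \frac{x - \cdot}{\varepsilon} \right) \right] \right\rangle = 0.
\]
Now by considering difference quotients we see that
\[
\frac{d}{d\varepsilon} \langle u, \rho_\varepsilon(x - \cdot) \rangle = \left\langle u, \frac{d}{d\varepsilon} \rho_\varepsilon(x - \cdot) \right\rangle.
\]
Indeed, for $\varepsilon, \varepsilon' > 0$ we have
\[
\frac{\rho_{\varepsilon+\varepsilon'}(x-y) - \rho_\varepsilon(x-y)}{\varepsilon'} \overset{\text{FTC}}{=} \frac{1}{\varepsilon'} \int_0^1 \frac{d}{dt} \rho_{\varepsilon + t\varepsilon'}(x-y) \, dt \xrightarrow[\varepsilon' \to 0]{} \frac{d}{ds}\Big|_{s=\varepsilon} \rho_s(x-y)
\]
in $\mathcal{D}(\Omega)$ with respect to $y$, provided $x \in \Omega'$ and $0 < \varepsilon < \varepsilon_0$ (since we may differentiate both sides with respect to $y$). But then $\frac{d}{d\varepsilon} \langle u, \rho_\varepsilon(x - \cdot) \rangle = 0$, and so $\langle u, \rho_\varepsilon(x - \cdot) \rangle = \langle u, \rho_{\varepsilon_1}(x - \cdot) \rangle$ for all $\varepsilon \in (0, \varepsilon_0)$, where $\varepsilon_1 \in (0, \varepsilon_0)$ is fixed. Now let $\varphi \in \mathcal{D}(\Omega')$. Then, by the usual trick when convolving distributions with test functions,
\[
\begin{aligned}
\int_{\Omega'} \langle u, \rho_\varepsilon(x - \cdot) \rangle \varphi(x) \, dx &= \left\langle u, \int_{\Omega'} \rho_\varepsilon(x - \cdot) \varphi(x) \, dx \right\rangle \\
&= \langle u, \rho_\varepsilon * \varphi \rangle,
\end{aligned}
\]
and so for $\varepsilon \in (0, \varepsilon_1)$ we have
\[
\langle u, \rho_\varepsilon * \varphi \rangle = \int_{\Omega'} \langle u, \rho_{\varepsilon_1}(x - \cdot) \rangle \varphi(x) \, dx.
\]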

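Hence, as $\rho_\varepsilon * \varphi \to \varphi$ in $\mathcal{D}(\Omega)$ as $\varepsilon \to 0^+$, we get
\[
\langle u, \varphi \rangle = \int_{\Omega'} \langle u, \rho_{\varepsilon_1}(x - \cdot) \rangle \varphi(x) \, dx.
\]
Consequently $u|_{\Omega'} \in C^\infty(\Omega')$, and since $\Omega'$ was arbitrary, we are done.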

Remark 14.3. The above proof is inspired by the mean value property that is known to characterize harmonic functions in the following sense. Let $h \in C(\Omega)$. Then $h$ is harmonic in the usual sense ($h \in C^2(\Omega)$ and $\Delta h = 0$) if and only if for all balls $B_r(x_0) \Subset \Omega$ we have
\[
h(x_0) = \frac{1}{\omega_{n-1} r^{n-1}} \int_{\partial B_r(x_0)} h(x) \, dS_x.
\]
Using polar coordinates we see that when $h$ is harmonic, then for $B_r(x_0) \Subset \Omega$
\[
h(x_0) = (\rho_r * h)(x_0).
\]
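The mean value property can be illustrated numerically in the plane, where $\omega_1 r = 2\pi r$ and the spherical mean is an average over a circle. The following sketch is an illustration under assumptions made here: the helper `circle_average`, the sample functions, the centre $(0.3, -0.2)$ and the radius $1.5$ are all arbitrary choices, not taken from the notes.

```python
import math


def circle_average(h, x0, y0, r, samples=2000):
    """Average of h over the circle of radius r centred at (x0, y0),
    approximated with equally spaced sample points."""
    total = 0.0
    for i in range(samples):
        t = 2.0 * math.pi * i / samples
        total += h(x0 + r * math.cos(t), y0 + r * math.sin(t))
    return total / samples


def harmonic(x, y):
    # Real part of exp(x + iy), hence harmonic on R^2.
    return math.exp(x) * math.cos(y)


def not_harmonic(x, y):
    # The Laplacian of x^2 + y^2 is 4, so the mean value property fails.
    return x * x + y * y


x0, y0, r = 0.3, -0.2, 1.5
mean_harmonic = circle_average(harmonic, x0, y0, r)
mean_other = circle_average(not_harmonic, x0, y0, r)
print(mean_harmonic, harmonic(x0, y0))
print(mean_other, not_harmonic(x0, y0))
```

For the harmonic function the circle average should match the value at the centre, whereas for $x^2 + y^2$ the average exceeds the centre value by $r^2$.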

15 Lecture 15

In this lecture we use the Plancherel Theorem to obtain estimates and regularity results for
distributional solutions to the Poisson equation.
We are interested in solving the Poisson equation $\Delta u = f$ in $\mathbb{R}^n$ in the context of tempered distributions. If $E$ denotes the fundamental solution for $\Delta$ in $\mathbb{R}^n$ found in Example 13.2 (for $n = 2$) and in Theorem 13.7 (for $n \ge 3$), then the general solution in $\mathcal{S}'(\mathbb{R}^n)$ is $E * f + h$, where $h$ is any harmonic polynomial on $\mathbb{R}^n$. This follows from Theorem 13.5 provided we can make sense of $E * f$ as a tempered distribution.
Example 15.1. If $f \in \mathcal{S}'(\mathbb{R}^n)$ has compact support, then $\hat f$ is a moderate $C^\infty$ function and so $\hat E \hat f$ is well-defined as a tempered distribution by Lemma 12.6. We can then define
\[
E * f := \mathcal{F}^{-1}\big( \hat E \hat f \big).
\]
This is consistent and a good definition (extending the Convolution Rule).

Theorem 15.2 (An $L^2$ identity for the Laplacian). Let $f \in L^2(\mathbb{R}^n)$ and assume $E * f$ is well-defined as a tempered distribution. Then the general solution to Poisson's equation $\Delta v = f$ in $\mathcal{S}'(\mathbb{R}^n)$ is $v = E * f + h$, where $h$ is any harmonic polynomial on $\mathbb{R}^n$. Furthermore, if $u = E * f$, then $D_j D_k u \in L^2(\mathbb{R}^n)$ for $1 \le j, k \le n$, and
\[
\sum_{j,k=1}^n \int_{\mathbb{R}^n} |D_j D_k u|^2 \, dx = \int_{\mathbb{R}^n} |f|^2 \, dx. \qquad (8)
\]

Remark 15.3. The symmetric $n \times n$ matrix
\[
D^2 u = (D_j D_k u)
\]
is called the Hessian matrix of $u$. When $u$ is a distribution with the property that the second order partial derivatives $D_j D_k u$ are regular distributions (i.e. they are $L^1_{loc}$ functions), then
\[
|D^2 u|^2 = \sum_{j,k=1}^n |D_j D_k u|^2.
\]
The right-hand side serves to define the left-hand side, and we record that for an $n \times n$ matrix $A = (a_{jk}) \in M_{n \times n}(\mathbb{C})$
\[
|A| := \sqrt{\operatorname{tr}(A^\top \bar A)} = \Big( \sum_{j,k=1}^n |a_{jk}|^2 \Big)^{1/2}.
\]
This is the standard norm on $M_{n \times n}(\mathbb{C})$, sometimes called the Frobenius norm or the Hilbert–Schmidt norm. In terms of the Hessian matrix we may rewrite (8) as $\|D^2 u\|_{L^2} = \|f\|_{L^2}$.

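Proof. By the Differentiation Rule,
\[
\widehat{D_j D_k u} = \frac{\xi_j \xi_k}{|\xi|^2} \hat f.
\]
Since $\left| \frac{\xi_j \xi_k}{|\xi|^2} \right| \le 1$ and $\hat f \in L^2(\mathbb{R}^n)$ by Plancherel, also $\widehat{D_j D_k u} \in L^2(\mathbb{R}^n)$ and thus $D_j D_k u \in L^2(\mathbb{R}^n)$, by applying Plancherel once again. Note that
\[
|\xi|^4 = \Big( \sum_{j=1}^n \xi_j^2 \Big)^2 = \sum_{j,k=1}^n \xi_j^2 \xi_k^2
\]
and therefore by Plancherel's Formula
\[
\begin{aligned}
\int_{\mathbb{R}^n} |D^2 u|^2 \, dx &= \sum_{j,k=1}^n \int_{\mathbb{R}^n} |D_j D_k u|^2 \, dx \\
&= \sum_{j,k=1}^n \int_{\mathbb{R}^n} \left| \mathcal{F}^{-1}_{\xi \to x} \left( \frac{\xi_j \xi_k}{|\xi|^2} \hat f \right) \right|^2 dx \\
&= (2\pi)^{-n} \sum_{j,k=1}^n \int_{\mathbb{R}^n} \left| \frac{\xi_j \xi_k}{|\xi|^2} \hat f \right|^2 d\xi \\
&= (2\pi)^{-n} \int_{\mathbb{R}^n} \Big( \frac{1}{|\xi|^4} \sum_{j,k=1}^n \xi_j^2 \xi_k^2 \Big) |\hat f|^2 \, d\xi \\
&= (2\pi)^{-n} \int_{\mathbb{R}^n} |\hat f|^2 \, d\xi \\
&= \int_{\mathbb{R}^n} |f|^2 \, dx.
\end{aligned}
\]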
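The algebraic heart of the proof is that the Fourier multipliers $\xi_j \xi_k / |\xi|^2$ have squares summing to $1$ at every $\xi \ne 0$. A minimal sketch checking this pointwise identity on a few sample vectors follows; the function name `multiplier_square_sum` and the sample vectors are choices made here for illustration.

```python
def multiplier_square_sum(xi):
    """Sum over j, k of (xi_j * xi_k / |xi|^2)^2 for a nonzero vector xi."""
    norm_sq = sum(t * t for t in xi)
    return sum((a * b / norm_sq) ** 2 for a in xi for b in xi)


# Sample nonzero vectors in dimensions 1 to 4 (arbitrary choices).
samples = [
    [2.0],
    [1.0, -3.0],
    [0.5, 2.0, -1.5],
    [1.0, 2.0, 3.0, 4.0],
]
values = [multiplier_square_sum(xi) for xi in samples]
print(values)
```

Each value should equal $1$ up to rounding, which is exactly why the factor multiplying $|\hat f|^2$ in the computation above collapses to $1$.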

Remark 15.4 (Calderón–Zygmund $L^p$ estimate for the Laplacian). Let $p \in (1, \infty)$. Then there exists a constant $c = c(p, n)$ such that for $f \in L^p(\mathbb{R}^n)$ (with compact support, say)
\[
\int_{\mathbb{R}^n} |D^2 (E * f)|^p \, dx \le c \int_{\mathbb{R}^n} |f|^p \, dx.
\]
This fails for $p = 1$ and for $p = \infty$. The Sobolev Embedding Theorem then implies that $E * f \in W^{2,p}_{loc}$ (see Theorem 16.1 for $p = 2$).

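Proposition 15.5 (Localization). Let $\Omega$ be an open subset of $\mathbb{R}^n$ and suppose that $u \in \mathcal{D}'(\Omega)$ satisfies $\Delta u = f$ in $\mathcal{D}'(\Omega)$. If $f \in L^2_{loc}(\Omega)$, then $u \in W^{2,2}_{loc}(\Omega)$. Recall that
\[
W^{k,2}_{loc}(\Omega) = \big\{ v \in L^2_{loc}(\Omega) : D^\alpha v \in L^2_{loc}(\Omega) \ \forall |\alpha| \le k \big\}.
\]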

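Proof. Fix $\Omega' \Subset \Omega$ and take a cut-off function $\theta \in \mathcal{D}(\Omega)$ between $\Omega'$ and $\partial\Omega$: $\mathbf{1}_{\Omega'} \le \theta \le \mathbf{1}_\Omega$. Now $\theta u \in \mathcal{S}'(\mathbb{R}^n)$ in view of the bound $u$ must satisfy on the compact set $\operatorname{supp} \theta \subset \Omega$. Using Leibniz we calculate
\[
\begin{aligned}
\Delta(\theta u) &= (\Delta\theta) u + 2 \sum_{j=1}^n D_j \theta \, D_j u + \theta \Delta u \\
&= f_1 + f_2,
\end{aligned}
\]
say, where $f_1 = (\Delta\theta) u + 2 \nabla\theta \cdot \nabla u$ and $f_2 = \theta f$. Observe that $f_1, f_2 \in \mathcal{S}'(\mathbb{R}^n)$ both have compact supports contained in $\operatorname{supp} \theta$. By linearity we then have $\theta u = u_1 + u_2$, where $u_i \in \mathcal{S}'(\mathbb{R}^n)$ satisfies $\Delta u_i = f_i$ in $\mathcal{S}'(\mathbb{R}^n)$. Now $f_2 \in L^2(\mathbb{R}^n)$ and so we may apply Theorem 15.2 which, together with the Sobolev Embedding stated in Theorem 16.1, gives that $u_2 \in W^{2,2}_{loc}(\mathbb{R}^n)$. For $u_1$ we observe that $u_1|_{\Omega'} \in \mathcal{D}'(\Omega')$ satisfies
\[
\Delta(u_1|_{\Omega'}) = f_1|_{\Omega'} = 0
\]
in $\mathcal{D}'(\Omega')$ since $\theta \equiv 1$ on $\Omega'$, so that $\Delta\theta = 0 = D_j \theta$ there for all $1 \le j \le n$. But then Weyl's Lemma (Theorem 14.1) implies that $u_1|_{\Omega'}$ is $C^\infty$ and harmonic, and so
\[
u|_{\Omega'} = (\theta u)|_{\Omega'} = u_1|_{\Omega'} + u_2|_{\Omega'} \in W^{2,2}_{loc}(\Omega').
\]
Because $\Omega' \Subset \Omega$ was arbitrary, the proof is complete.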

Remark 15.6. The above can also be done for other elliptic PDEs, for instance for $P(D)u = f$ in $\mathcal{D}'(\Omega)$, where
\[
P(D) = \sum_{|\alpha| = 2} a_\alpha D^\alpha,
\]
$a_\alpha \in \mathbb{R}$, and
\[
P(\xi) = \sum_{|\alpha| = 2} a_\alpha \xi^\alpha \ne 0
\]
for all $\xi \in \mathbb{R}^n \setminus \{0\}$. For more along these lines, see the course C4.3 Functional Analytic Methods for PDEs.

16 Lecture 16

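Theorem 16.1 ($L^2$ Sobolev Inequality). Let $u \in \mathcal{S}'(\mathbb{R}^n)$ and assume that for some $m \in \mathbb{N}$, $D^\alpha u \in L^2(\mathbb{R}^n)$ for all $\alpha \in \mathbb{N}_0^n$ with $|\alpha| = m$. Then

1. if $2m > n$, then $u \in C(\mathbb{R}^n)$ ($u$ need not be bounded),

2. if $2m = n$, then $u \in L^p_{loc}(\mathbb{R}^n)$ for all $p < \infty$ (but not in general for $p = \infty$),

3. if $2m < n$, then there exists a constant $c = c(m, n) > 0$ such that
\[
\inf_{k \in \mathbb{C}_{m-1}[x]} \left( \int_{\mathbb{R}^n} |u - k|^{\frac{2n}{n-2m}} \, dx \right)^{\frac{n-2m}{2n}} \le c \left( \int_{\mathbb{R}^n} \sum_{|\alpha| = m} |D^\alpha u|^2 \, dx \right)^{\frac{1}{2}},
\]
where $\mathbb{C}_{m-1}[x]$ denotes polynomials over $\mathbb{C}$ of degree at most $m - 1$.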

Proof. We omit the proof here. Instead we prove a slightly weaker variant of Theorem 16.1
(1).

Theorem 16.2 (An $L^2$ Sobolev Embedding). If $m \in \mathbb{N}$, $m > \frac{n}{2}$, and $u \in W^{m,2}(\mathbb{R}^n)$, then $u \in C_0(\mathbb{R}^n)$.$^1$

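Proof. Since $u \in \mathcal{S}'(\mathbb{R}^n)$ we can express $u \in W^{m,2}(\mathbb{R}^n)$ equivalently by use of Plancherel as
\[
\widehat{D^\alpha u}(\xi) = (-i\xi)^\alpha \hat u(\xi) \in L^2(\mathbb{R}^n)
\]
for $|\alpha| \le m$. Indeed,
\[
\sum_{|\alpha| \le m} \int_{\mathbb{R}^n} |D^\alpha u|^2 \, dx = (2\pi)^{-n} \sum_{|\alpha| \le m} \int_{\mathbb{R}^n} |\xi^\alpha \hat u|^2 \, d\xi = (2\pi)^{-n} \int_{\mathbb{R}^n} \Big( \sum_{|\alpha| \le m} |\xi^\alpha|^2 \Big) |\hat u|^2 \, d\xi.
\]
Observe that for some positive constants $c_1 = c_1(m, n)$, $c_2 = c_2(m, n)$
\[
c_1 (1 + |\xi|^2)^m \le \sum_{|\alpha| \le m} |\xi^\alpha|^2 \le c_2 (1 + |\xi|^2)^m
\]
for all $\xi \in \mathbb{R}^n$. Thus $u \in W^{m,2}(\mathbb{R}^n)$ if and only if $(1 + |\xi|^2)^{m/2} \hat u \in L^2(\mathbb{R}^n)$.$^2$ Now $g := (1 + |\xi|^2)^{m/2} \hat u \in L^2(\mathbb{R}^n)$ and $2m > n$, so
\[
|\hat u| = (1 + |\xi|^2)^{-m/2} |g| \le \frac{1}{2} (1 + |\xi|^2)^{-m} + \frac{1}{2} |g|^2 \in L^1(\mathbb{R}^n)
\]
and hence $u \in C_0(\mathbb{R}^n)$ by the Fourier Inversion Formula in $\mathcal{S}'(\mathbb{R}^n)$:
\[
u = \mathcal{F}^{-1}(\hat u) = (2\pi)^{-n} \tilde{\mathcal{F}}(\hat u).
\]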

Theorem 16.3 (Qualitative Version of the Uncertainty Principle). If $u \in C_c(\mathbb{R}^n)$ and $\hat u \in C_c(\mathbb{R}^n)$, then $u = 0$.

Proof. Put
\[
f(z) = \int_{\mathbb{R}^n} u(x) e^{-i x \cdot z} \, dx
\]
for $z \in \mathbb{C}^n$. This is well-defined because $u \in C_c(\mathbb{R}^n)$, and using the DCT we see that $f \in C^1(\mathbb{C}^n)$, with
\[
\frac{\partial}{\partial \bar z_k} f(z) = \int_{\mathbb{R}^n} u(x) \frac{\partial}{\partial \bar z_k} \big( e^{-i x \cdot z} \big) \, dx = 0
\]
for each $k = 1, \dots, n$. But then $z_k \mapsto f(z)$ is holomorphic (for fixed $z_1, \dots, z_{k-1}, z_{k+1}, \dots, z_n$). Now observe that for $\xi \in \mathbb{R}^n$ we have $f(\xi) = \hat u(\xi)$, and so if we fix $\xi_2, \dots, \xi_n \in \mathbb{R}$ and put $f_1(z_1) = f(z_1, \xi_2, \dots, \xi_n)$, then $f_1$ is entire and $f_1(\xi_1) = \hat u(\xi)$. Because $\hat u$ has compact support, it follows that $f_1$ must vanish on a half-line, so the Identity Theorem for holomorphic functions implies that $f_1 \equiv 0$. But then $\hat u(\xi) = 0$, and since $\xi_2, \dots, \xi_n \in \mathbb{R}$ were arbitrary, we have shown that $\hat u \equiv 0$. By the Fourier Inversion Formula in $\mathcal{S}'(\mathbb{R}^n)$ it follows that $u = 0$.

$^1$ Recall that this means that there exists a representative of $u$ belonging to $C_0(\mathbb{R}^n)$.
$^2$ Note that this gives us a way to define the Sobolev spaces $W^{m,2}$ in terms of the Fourier transform.
