Lecture Notes
Jan Kristensen
1 Lecture 1
There are many reasons to study distributions, but most of them are only really appreciated
after the fact. Many physical quantities are naturally not defined pointwise. For instance,
being able to measure temperature at a given point in space and time is an idealization – see
the discussion in R.S. Strichartz’s A Guide to Distribution Theory and Fourier Transforms, §1.
Similarly, in the theory of Lebesgue integration as discussed in the Part A Integration course
you encountered Lp functions and also they are not really defined uniquely everywhere, but
only almost everywhere. In fact they are strictly speaking not even functions, but equivalence
classes of functions under the equivalence relation equal almost everywhere. Nonetheless, for
f ∈ Lp (Rn ) and each measurable subset A ⊂ Rn the integral
\[ \int_A f(x)\, dx \]
is well-defined and does not depend on the representative used to calculate the integral. Note that if we know that f is continuous, then the integrals ∫_A f(x) dx for measurable subsets A of R^n with, say, L^n(A) < ∞, determine f(x) uniquely for all x ∈ R^n. Specifically, we have
\[ \frac{1}{\mathcal{L}^n(B_r(x_0))} \int_{B_r(x_0)} f(x)\, dx \longrightarrow f(x_0) \qquad \text{as } r \to 0^+ \]
for each x_0 ∈ R^n. For a general f ∈ L^p(R^n), knowledge of the integrals ∫_A f(x) dx for all measurable A ⊂ R^n with L^n(A) < ∞ determines f(x) uniquely almost everywhere (and so uniquely as an L^p function). Here
\[ \mathbf{1}_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A \end{cases} \]
acts as a test function, or measurement of f . It turns out that taking very nice test functions
here is a good idea that allows us to extend aspects of differential calculus to Lp functions and
beyond. This leads to the theory of distributions. But why should we bother?
The equation
\[ \frac{\partial^2 u}{\partial t^2}(x, t) = k^2\, \frac{\partial^2 u}{\partial x^2}(x, t) \tag{1} \]
can be used to model a vibrating string. A function given by
u(x, t) = f (x − kt),
where f is a function of one variable, represents a travelling wave with shape f (x) moving to
the right with velocity k. When f is twice differentiable, one can check that u is a solution
to (1). However, there is no physical reason for the shape of the travelling wave to be twice
differentiable. For instance, the triangular profile
[Figure: a triangular (piecewise linear) travelling-wave profile]
moving with speed k to the right is perfectly fine! We do not want to throw away physically
meaningful solutions because of technicalities. Looking at the example above, one could think
that if we accepted as solutions to differential equations any function that satisfies the differ-
ential equation except for some points (finitely many, say), where it fails to be differentiable,
then all would be fine. But this is too simplistic and does not work, as the next example shows.
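As a quick numerical sanity check (our own illustration, not part of the notes), one can verify that u(x, t) = f(x − kt) solves (1) for a smooth profile by comparing second central difference quotients in t and x:

```python
import numpy as np

# Verify numerically that u(x, t) = f(x - k t) satisfies u_tt = k^2 u_xx
# for a smooth profile f (here a Gaussian; the choice is arbitrary).
k = 2.0
f = lambda s: np.exp(-s ** 2)          # smooth travelling-wave profile

def u(x, t):
    return f(x - k * t)

h = 1e-4                               # step for central difference quotients
x0, t0 = 0.7, 0.3
u_tt = (u(x0, t0 + h) - 2 * u(x0, t0) + u(x0, t0 - h)) / h ** 2
u_xx = (u(x0 + h, t0) - 2 * u(x0, t0) + u(x0 - h, t0)) / h ** 2
residual = abs(u_tt - k ** 2 * u_xx)   # vanishes up to O(h^2)
```

For the triangular profile the classical second derivatives fail to exist at the kinks, which is exactly why a weaker notion of solution is wanted.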
On Rn Laplace’s equation is
\[ \Delta u := \frac{\partial^2 u}{\partial x_1^2} + \cdots + \frac{\partial^2 u}{\partial x_n^2} = 0. \tag{2} \]
For n = 2 or n = 3 a solution to the above equation has the physical interpretation of an
electric potential in a region with no external charges. From physical experience we know that
such potentials should be smooth. However, as you may have seen last year,
\[ u = G_2(x_1, x_2) := \log\!\left( x_1^2 + x_2^2 \right) \]
and
\[ u = G_3(x_1, x_2, x_3) := \left( x_1^2 + x_2^2 + x_3^2 \right)^{-1/2} \]
are solutions in R2 \{(0, 0)} and in R3 \{(0, 0, 0)} respectively. Clearly neither can be extended
to the origin in a smooth manner, and so these should not be considered as solutions on the
full space.
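One can also check numerically (our own sketch) that G3 is harmonic away from the origin, by approximating the Laplacian with second central differences at a point x ≠ 0:

```python
import numpy as np

# G3(x) = |x|^{-1} on R^3 \ {0}: approximate its Laplacian at a point
# away from the origin by second central differences in each coordinate.
def G3(x):
    return 1.0 / np.linalg.norm(x)

x0 = np.array([0.5, -0.3, 0.8])
h = 1e-3
lap = 0.0
for j in range(3):
    e = np.zeros(3)
    e[j] = h
    lap += (G3(x0 + e) - 2.0 * G3(x0) + G3(x0 - e)) / h ** 2
# lap ~ 0 up to discretisation error, consistent with Delta G3 = 0 off the origin
```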
Distribution theory allows us, among many other things, to distinguish between the case of
the one dimensional wave equation (1) and Laplace's equation (2). Indeed, the travelling wave u(x, t) = f(x − kt) satisfies the one dimensional wave equation in the sense of distributions for any continuous profile f, while
\[ \Delta G_2 = c_2\, \delta_0 \qquad \text{and} \qquad \Delta G_3 = c_3\, \delta_0 \]
as distributions, where δ0 is Dirac’s delta function, a distribution.
We must spend some time developing the notion of a test function before we can define
the notion of a distribution. It is worth mentioning that there is not just one class of test
functions. In this course we will define two classes of test functions, the compactly supported
ones and the Schwartz ones. But there are in fact many others, and, depending on the context,
they can sometimes be very useful. To each class of test functions there corresponds a class of
distributions. The principle to keep in mind here is that the nicer (smoother) the test functions
are, the wilder (rougher) the corresponding distributions are allowed to be.
and sometimes say that such functions are C^0 functions, and write C^0(Ω) = C(Ω). We will use the same notation for complex-valued functions, which will be clear from context or will be stated explicitly. Similarly, for k ∈ N we define
\[ C^k(\Omega) := \{ u : \Omega \to \mathbb{R} \ (\text{or } \mathbb{C}) : \text{all partial derivatives of } u \text{ of order at most } k \text{ exist and are continuous on } \Omega \}. \]
That is, u ∈ C^k(Ω) (or u is a C^k function) if and only if u and all its partial derivatives up to and including order k are continuous on Ω.
Example 1.1. The function u is C^1 if and only if u, ∂u/∂x_1, …, ∂u/∂x_n are all continuous on Ω. Note in particular that we require, for example, ∂u/∂x_1 to be jointly continuous in x = (x_1, x_2, …, x_n).
Lemma 1.2. If u ∈ C^2(Ω), then
\[ \frac{\partial^2 u}{\partial x_j \partial x_k} = \frac{\partial^2 u}{\partial x_k \partial x_j} \]
on Ω for all j, k ∈ {1, …, n}. Hence the order in which we take (two) partial derivatives is unimportant for C^2 functions.
Proof. Let {e_j}_{j=1}^n be the standard basis for R^n and denote by
\[ \triangle_h u(x) := u(x + h) - u(x) \]
the difference operator, for x, x + h ∈ Ω. Observe that △_{se_j}△_{re_k} u = △_{re_k}△_{se_j} u for s, r ∈ R. Because u is C^2, applying
the Fundamental Theorem of Calculus twice we have
\[ \frac{1}{sr}\, \triangle_{se_j}\triangle_{re_k} u(x) = \int_0^1\!\!\int_0^1 \frac{\partial^2 u}{\partial x_j \partial x_k}\big( x + \sigma s e_j + \rho r e_k \big)\, d\sigma\, d\rho, \]
and hence
\[ \frac{1}{sr}\, \triangle_{se_j}\triangle_{re_k} u(x) - \frac{\partial^2 u}{\partial x_j \partial x_k}(x) \to 0 \]
as (r, s) → (0, 0). Since △_{se_j}△_{re_k} u = △_{re_k}△_{se_j} u, the same limit also equals ∂²u/(∂x_k∂x_j)(x), so the two mixed partial derivatives agree.
We can extend this result to C k functions for k > 2 by induction, and so for such functions
we do not have to worry about the order in which we partially differentiate. When there are
many independent variables we shall often rely on multi-index notation: for a multi-index α = (α_1, …, α_n) ∈ N_0^n we write |α| := α_1 + ⋯ + α_n and
\[ D^\alpha u(x) := \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}(x). \]
For example, with n = 2, α = (1, 2), and β = (0, 2),
\[ D^\alpha u = \frac{\partial^3 u}{\partial x_1 \partial x_2^2}, \qquad D^\beta u = \frac{\partial^2 u}{\partial x_2^2}. \]
When u ∈ C^5(Ω),
\[ D^{\alpha+\beta} u = \frac{\partial^5 u}{\partial x_1 \partial x_2^4}. \]
Note that by lemma 1.2, Dα+β u = Dα (Dβ u) = Dβ (Dα u) = Dβ+α u. In a sense, lemma 1.2
justifies using multi-index notation for partial derivatives.
Note that the C^k(Ω) form a descending sequence of sets, C^{k+1}(Ω) ⊊ C^k(Ω). We define
\[ C^\infty(\Omega) := \bigcap_{k=0}^{\infty} C^k(\Omega), \]
the class of infinitely differentiable functions on Ω. Under the natural pointwise definitions of
addition, multiplication by scalars, and multiplication, these classes form commutative rings
with unity and vector spaces (over R or C).
The support of u is defined as supp(u), the closure of the set {u ≠ 0} relative to Ω. As such, supp(u) is closed in Ω, but need not be closed in R^n.
Example 1.4. Define u_1 : R → R by
\[ u_1(x) = \begin{cases} 1 - |x|, & |x| < 1, \\ 0, & |x| \ge 1. \end{cases} \]
Then supp(u_1) = [−1, 1].
One sees that the support of a function u depends on the ambient set Ω, and we could
instead write suppΩ (u). However, for our purposes it will suffice to write supp(u), where Ω
will be understood from context.
We shall be particularly interested in having compact support. Recall that a set K is
compact if and only if any open cover of K admits a finite subcover. Also recall that in Rn ,
the Heine–Borel theorem tells us that K is compact if and only if K is closed and bounded.
Note that K ⊂ Ω is compact in Ω if and only if K is compact in R^n. Also, when K ⊂ Ω is compact, then dist(K, ∂Ω) := inf_{x∈K, y∈∂Ω} |x − y| > 0.
Remark 1.6. Note that for u ∈ D(Ω), u not identically zero, we have dist(supp(u), ∂Ω) > 0.
We also write
C_c^k(Ω) := {u ∈ C^k(Ω) : supp(u) is compact}
for k ∈ N0 ∪ {∞}. So in fact D(Ω) = Cc∞ (Ω). As before, we can define ring operations in
the standard way, making Cck (Ω) and D(Ω) into commutative rings (without unity) and vector
spaces (over R or C).
We have defined a test function to be any smooth and compactly supported function, but
so far we have seen no example. Actually, are there any smooth compactly supported test
functions other than the trivial one φ ≡ 0?
2 Lecture 2
for |x| < 1, where pk (x) is a polynomial of order k. One deduces that φ ∈ C k (R). The rest is
left as an exercise.
By the Chain Rule, we see that ϕ ∈ C ∞ (Rn ). Clearly, supp(ϕ) = Br (x0 ), and so ϕ ∈ D(Rn ).
We can do much more using φ as a building block. We shall use the operation of convolution;
recall that when f, g ∈ L1 (Rn ), then
\[ (f * g)(x) := \int_{\mathbb{R}^n} f(x - y)\, g(y)\, dy \]
2.1.1 The Standard Mollifier in Rn
Put
\[ c_n := \int_{\mathbb{R}^n} \phi\big( |x|^2 \big)\, dx = \left( \int_0^\infty \phi(r^2)\, r^{n-1}\, dr \right) \omega_{n-1} > 0, \]
where ωn−1 is the surface area of Sn−1 . It can be shown that
\[ \omega_{n-1} = n\, \mathcal{L}^n(B_1(0)) = \frac{2\pi^{n/2}}{\Gamma(n/2)}. \]
We only need to know that 0 < ω_{n−1} < ∞; the actual value is not important here. Put
\[ \rho(x) := \frac{1}{c_n}\, \phi\big( |x|^2 \big), \qquad x \in \mathbb{R}^n. \]
Definition 2.3. For ε > 0 put ρ_ε(x) := ε^{−n} ρ(x/ε), x ∈ R^n. We call the family of functions (ρ_ε)_{ε>0} the standard mollifier on R^n.
Proposition 2.4. Let 1 ≤ p < ∞ and u ∈ L^p(Ω). Define u to be zero outside Ω. Then
(i) ρ_ε ∗ u ∈ C^∞(Ω),
(ii) ‖ρ_ε ∗ u‖_p ≤ ‖u‖_p for every ε > 0, and
(iii) ‖ρ_ε ∗ u − u‖_p → 0 as ε → 0⁺.
Proof. This is a simple application of the Dominated Convergence Theorem. We omit the
details.
Proof of Proposition 2.4. Part (i) follows by applying Lemma 2.5 inductively. For part (ii), we use Hölder's inequality. Let q be the conjugate exponent,
\[ \frac{1}{p} + \frac{1}{q} = 1, \]
and write for each x and almost every y,
\[ |\rho_\varepsilon(x-y)\, u(y)| = \rho_\varepsilon(x-y)^{1/q}\, \rho_\varepsilon(x-y)^{1/p}\, |u(y)|. \]
Integrating over x ∈ Ω,
\[
\begin{aligned}
\int_\Omega |(\rho_\varepsilon * u)(x)|^p\, dx &\le \int_\Omega \int_{\mathbb{R}^n} \rho_\varepsilon(x-y)\, |u(y)|^p\, dy\, dx \\
&\stackrel{\dagger}{=} \int_{\mathbb{R}^n} |u(y)|^p \int_\Omega \rho_\varepsilon(x-y)\, dx\, dy \\
&\le \int_{\mathbb{R}^n} |u(y)|^p \int_{\mathbb{R}^n} \rho_\varepsilon(x-y)\, dx\, dy = \|u\|_p^p,
\end{aligned}
\]
where in † we used Fubini–Tonelli, and in the last step that ∫_{R^n} ρ_ε dx = 1. For (iii), let τ > 0 and take v ∈ C_c(Ω) such that ‖u − v‖_p ≤ τ (that this is possible follows from the way we defined the integral in Part A Integration; you might want to write out the details of this). By uniform continuity of v, we can find an ε_0 > 0 such that
\[ \| \rho_\varepsilon * v - v \|_\infty < \tau \]
for 0 < ε ≤ ε_0. Using Minkowski's inequality, we have for 0 < ε ≤ ε_0
\[ \| \rho_\varepsilon * u - u \|_p \le \| \rho_\varepsilon * (u - v) \|_p + \| \rho_\varepsilon * v - v \|_p + \| v - u \|_p \le 2\tau + \| \rho_\varepsilon * v - v \|_p, \]
where we used (ii) on the first term. Since v and ρ_ε ∗ v are supported in a fixed bounded set for 0 < ε ≤ ε_0, the middle term is at most a constant (independent of ε) times ‖ρ_ε ∗ v − v‖_∞ < τ. As τ > 0 was arbitrary, (iii) follows.
2.1.2 Cut-off Functions and Partitions of Unity
Theorem 2.6. Let K be a compact subset of Ω. There exists φ ∈ D(Ω) such that 0 ≤ φ ≤ 1 and φ ≡ 1 on K. We refer to φ as a cut-off function between K and R^n \ Ω.
Proof. Put d := dist(K, ∂Ω) > 0 and fix δ ∈ (0, d/4]. Put K̃ := B_{2δ}(K). Recall that by definition,
\[ \tilde K = \{ x \in \mathbb{R}^n : \operatorname{dist}(x, K) \le 2\delta \}. \]
Let (ρ_ε)_{ε>0} be the standard mollifier and put φ := ρ_δ ∗ 1_{K̃}. Then φ ∈ C^∞(R^n), supp(φ) ⊂ B_δ(K̃) = B_{3δ}(K), and since δ ≤ d/4 we get supp(φ) ⊂ Ω, hence φ ∈ D(Ω). Next, 0 ≤ φ ≤ 1, and for x ∈ K we have B_δ(x) ⊂ K̃, so
\[ \phi(x) = \int_{\mathbb{R}^n} \rho_\delta(x - y)\, \mathbf{1}_{\tilde K}(y)\, dy = \int_{\mathbb{R}^n} \rho_\delta(x - y)\, dy = 1. \]
hence
\[ |D^\alpha \phi| \le c_\alpha\, d^{-|\alpha|}, \]
where c_α = 4^{|α|} ‖D^α ρ‖_{L^1}, a constant independent of d.
Theorem 2.8. Let Ω = ⋃_{j=1}^m Ω_j, where Ω_1, …, Ω_m are open, non-empty, potentially overlapping sets. For K ⊂ Ω compact there exist φ_1, …, φ_m ∈ D(Ω) satisfying supp(φ_j) ⊂ Ω_j, 0 ≤ φ_j ≤ 1, and
\[ \sum_{j=1}^m \phi_j = 1 \quad \text{on } K. \]
Proof. (Not examinable) Let x ∈ K ∩ Ω_j. Because Ω_j is open, we can find r_j(x) > 0 such that B_{r_j(x)}(x) ⊂ Ω_j. The set
\[ \left\{ B_{r_j(x)}(x) : x \in K \cap \Omega_j,\ 1 \le j \le m \right\} \]
is an open cover of K, so by compactness it admits a finite subcover, say
\[ \left\{ B_s := B_{r_{j_s}(x_s)}(x_s) : 1 \le s \le N \right\}. \]
Put J_j := {s : j_s = j}, so that
\[ \bigcup_{s \in J_j} B_s \subset \Omega_j. \]
Now K_j := K ∩ \overline{⋃_{s ∈ J_j} B_s} is compact, K_j ⊂ Ω_j and K = ⋃_{j=1}^m K_j. We now apply Theorem 2.6 to each K_j, Ω_j to find corresponding cut-off functions ψ_j ∈ D(Ω_j) satisfying 0 ≤ ψ_j ≤ 1 and ψ_j ≡ 1 on K_j. We extend ψ_j to Ω \ Ω_j by zero and, denoting this extension again by ψ_j, have ψ_j ∈ D(Ω). Now define
\[ \phi_1 := \psi_1, \quad \phi_2 := \psi_2(1 - \psi_1), \quad \ldots, \quad \phi_m := \psi_m \prod_{j=1}^{m-1} (1 - \psi_j). \]
By repeated use of the Leibniz rule we see that φ_1, …, φ_m ∈ C^∞(Ω). Clearly supp(φ_j) ⊂ Ω_j, and 0 ≤ φ_j ≤ 1. Finally, on K we have
\[ \sum_{j=1}^m \phi_j - 1 = -\prod_{j=1}^m (1 - \psi_j) = 0. \]
The proof of Theorem 2.8 is not examinable, but the result is.
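The construction in the proof can be carried out concretely in one dimension. The sketch below (our own illustration; the intervals, pads, and the C^∞ transition function are ad hoc choices) builds cut-offs ψ1, ψ2 and verifies that φ1 = ψ1 and φ2 = ψ2(1 − ψ1) sum to 1 on K:

```python
import numpy as np

# Omega = Omega1 ∪ Omega2 with Omega1 = (-1, 0.6), Omega2 = (0.4, 2),
# and K = [-0.5, 1.5].  Build smooth cut-offs psi_j (= 1 on K_j,
# supported inside Omega_j) and check phi1 + phi2 = 1 on K.

def smoothstep(t):
    """C^infinity transition: 0 for t <= 0, 1 for t >= 1."""
    def g(s):
        return np.where(s > 0, np.exp(-1.0 / np.maximum(s, 1e-300)), 0.0)
    return g(t) / (g(t) + g(1.0 - t))

def cutoff(x, a, b, pad):
    """Smooth function = 1 on [a, b], supported in (a - pad, b + pad)."""
    up = smoothstep((x - (a - pad)) / pad)
    down = smoothstep(((b + pad) - x) / pad)
    return up * down

x = np.linspace(-1.0, 2.0, 3001)
psi1 = cutoff(x, -0.5, 0.5, 0.05)   # K1 = [-0.5, 0.5], support in (-0.55, 0.55)
psi2 = cutoff(x, 0.5, 1.5, 0.05)    # K2 = [0.5, 1.5],  support in (0.45, 1.55)
phi1 = psi1
phi2 = psi2 * (1.0 - psi1)

on_K = (x >= -0.5) & (x <= 1.5)
max_defect = np.max(np.abs(phi1[on_K] + phi2[on_K] - 1.0))
```

On K each point lies in K1 or K2, so one of the ψ_j equals 1 there and the product (1 − ψ1)(1 − ψ2) vanishes, exactly as in the proof.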
3 Lecture 3
3.2 Distributions Corresponding to D(Ω)
Definition 3.4. A distribution u on Ω is a functional u : D(Ω) → R (or C) such that
(i) u is linear,
\[ u(\phi + t\psi) = u(\phi) + t\, u(\psi) \]
for φ, ψ ∈ D(Ω), t ∈ R (or C), and
(ii) u is continuous in the sense that u(φ_j) → u(φ) whenever φ_j → φ in D(Ω).
The set of all distributions on Ω is denoted D′(Ω).
Remark 3.5. Firstly, because of linearity, the continuity condition (ii) holds if and only if it
holds at φ = 0. Indeed, if u(φj ) → 0 whenever φj → 0 in D(Ω) and ψj → ψ in D(Ω), then we
take φj = ψj − ψ and note that φj → 0 in D(Ω). Then by assumption, u(φj ) → 0. But u is
linear, so u(φj ) = u(ψj ) − u(ψ) and so u(ψj ) → u(ψ).
Secondly, when u : D(Ω) → R is linear (and defined everywhere on D(Ω)), then chances
are that u is continuous in the sense defined above and thus is a distribution on Ω. Indeed,
the only counterexamples I know are constructed by use of the Axiom of Choice in the form
of existence of a Hamel basis for D(Ω).
Notation. When u ∈ D′(Ω) and φ ∈ D(Ω), we often write ⟨u, φ⟩ instead of u(φ).
Example 3.6. If f ∈ Lp (Ω), p ∈ [1, ∞], then
\[ \langle T_f, \phi \rangle = \int_\Omega f(x)\phi(x)\, dx, \qquad \phi \in \mathcal{D}(\Omega), \]
defines a distribution on Ω. Linearity follows from linearity of the integral, and continuity
follows from the Dominated Convergence Theorem. Note that since each φ ∈ D(Ω) has compact
support in Ω and since we defined convergence in D(Ω) by requiring all supports to be in a
fixed compact set in Ω, the above distribution Tf would also be well-defined if f was only
locally in Lp .
Definition 3.7. For p ∈ [1, ∞] we write f ∈ L^p_{loc}(Ω) and say f is locally L^p on Ω if and only if for each compact set K ⊂ Ω we have f|_K ∈ L^p(K). Specifically, we require ∫_K |f|^p dx < ∞ when p < ∞ and ess sup_K |f| < ∞ when p = ∞.
Example 3.8. The function x^{-1} ∉ L^1(0, ∞), but x^{-1} ∈ L^1_{loc}(0, ∞) and, in fact, x^{-1} ∈ L^p_{loc}(0, ∞) for all p ∈ [1, ∞]. Note that Ω determines what local means. For example, x^{-1} ∈ L^1_{loc}(0, ∞), but x^{-1} ∉ L^1_{loc}(−1, 1).
Example 3.9. Summarizing a previous discussion, each f ∈ Lploc (Ω), p ∈ [1, ∞], gives rise to a
distribution on Ω via
\[ \langle T_f, \varphi \rangle = \int_\Omega f(x)\varphi(x)\, dx \]
for each φ ∈ D(Ω).
Example 3.10 (Dirac's delta function at x_0 ∈ Ω). The map
\[ \langle \delta_{x_0}, \phi \rangle := \phi(x_0), \qquad \phi \in \mathcal{D}(\Omega), \]
is linear and continuous, and so δ_{x_0} ∈ D′(Ω).
While the continuity condition (ii) in Definition 3.4 often is not an issue, it is nonetheless
useful to reformulate it using linearity as follows.
Theorem 3.11. A linear functional u : D(Ω) → R (or C) is a distribution if and only if for every compact set K ⊂ Ω there exist constants c = c(K) > 0 and m = m(K) ∈ N_0 such that
\[ |\langle u, \phi \rangle| \le c \sum_{|\alpha| \le m} \sup_K |D^\alpha \phi| \tag{3} \]
for all φ ∈ D(Ω) with supp(φ) ⊂ K; we write D(K) for the set of such test functions.
Proof. If φj → 0 in D(Ω), then for some compact set K ⊂ Ω we have φj ∈ D(K) for all j.
Then by assumption we can find c = c(K) > 0 and m = m(K) ∈ N0 such that (3) holds. But
then
\[ |\langle u, \phi_j \rangle| \le c \sum_{|\alpha| \le m} \sup_K |D^\alpha \phi_j| \longrightarrow 0. \]
For the converse, we argue by contradiction. Assume there exists a compact set K ⊂ Ω such that (3) is violated for all choices of c and m. In particular, for c = m = j we
can find φ_j ∈ D(K) with
\[ |\langle u, \phi_j \rangle| > j \sum_{|\alpha| \le j} \sup_K |D^\alpha \phi_j|. \]
Put λ_j = ⟨u, φ_j⟩. Then |λ_j| > 0, ψ_j := φ_j/λ_j ∈ D(K), ⟨u, ψ_j⟩ = 1, and
\[ 1 > j \sum_{|\alpha| \le j} \sup_K |D^\alpha \psi_j|. \]
Thus |D^α ψ_j| < 1/j on Ω for j ≥ |α|, and in particular ψ_j → 0 in D(Ω). But ⟨u, ψ_j⟩ = 1, which does not converge to zero.
4 Lecture 4
Lemma 4.1 (The Fundamental Lemma of the Calculus of Variations). If f ∈ L^1_{loc}(Ω) and
\[ \int_\Omega f(x)\phi(x)\, dx = 0 \]
for all φ ∈ D(Ω), then f = 0 almost everywhere in Ω.
Proof. Let O be a non-empty open subset of Ω such that O is compact and O ⊂ Ω. In this
case we write O ⋐ Ω.
Put g = f 1O and extend g to Rn \ Ω by zero. Because O ⊂ Ω is compact, we have g ∈
L1 (Rn ). For the standard mollifier (ρε )ε>0 we know, by proposition 2.4, that kρε ∗ g − gk1 → 0
as ε → 0+ . Now note that for x ∈ Ω,
\[ (\rho_\varepsilon * g)(x) = \int_{\mathbb{R}^n} \rho_\varepsilon(x - y)\, g(y)\, dy = \int_O \rho_\varepsilon(x - y)\, f(y)\, dy. \]
If we take x ∈ O and ε ∈ (0, dist(x, ∂O)), then, denoting φx (y) ..= ρε (x − y) for y ∈ Ω, we have
φx ∈ C ∞ (Ω) and supp(φx ) = Bε (x) ⊂ O ⊂ Ω, so φx ∈ D(Ω). By assumption,
\[ 0 = \int_\Omega f(y)\, \phi_x(y)\, dy = \int_O f(y)\, \rho_\varepsilon(x - y)\, dy = (\rho_\varepsilon * g)(x). \]
Since ‖ρ_ε ∗ g − g‖_1 → 0, it follows that g = 0 a.e., that is, f = 0 a.e. on O. As the open set O ⋐ Ω was arbitrary, f = 0 a.e. in Ω.
Notation. When f ∈ Lploc (Ω), we shall also use f to denote the distribution Tf . This is of
course an abuse of notation, but it is convenient and should not cause too much trouble. We
shall often refer to distributions that correspond to an Lploc (Ω) function as a regular distribution
on Ω.
Definition 4.2. Let u ∈ D′(Ω). If there exists an m ∈ N_0 with the property that for all compact subsets K ⊂ Ω there exists a constant c = c_K > 0 such that
\[ |\langle u, \phi \rangle| \le c \sum_{|\alpha| \le m} \sup_K |D^\alpha \phi| \]
for all φ ∈ D(K), then we say that u has order at most m; the order of u is the smallest such m.
Note that by Theorem 3.11, any distribution has locally finite order.
Example 4.3. Let f ∈ L1loc (Ω). Then the corresponding distribution has order 0. Indeed, if
K ⊂ Ω is compact and ϕ ∈ D(K), then
\[ |\langle f, \varphi \rangle| = \left| \int_\Omega f(x)\varphi(x)\, dx \right| \le \int_K |f|\,|\varphi|\, dx \le \Big( \sup_K |\varphi| \Big) \int_K |f|\, dx. \]
Example 4.4. Fix x_0 ∈ Ω and α ∈ N_0^n, and define
\[ \langle T, \varphi \rangle := (D^\alpha \varphi)(x_0) \]
for ϕ ∈ D(Ω). Then T ∈ D′(Ω), as T is clearly linear and for compact K ⊂ Ω and ϕ ∈ D(K) we have
\[ |\langle T, \varphi \rangle| \le \sum_{|\beta| \le |\alpha|} \sup_K |D^\beta \varphi|. \]
It also shows that T has order at most |α|. If α = 0 so that T = δx0 , we see that T has order
0. Assume |α| > 0. We shall prove that T has order |α|. Suppose, for contradiction, that T
has order at most |α| − 1. Take r ∈ (0, dist(x0 , ∂Ω)) and put K = Br (x0 ). Then K ⊂ Ω is
compact. By assumption, we can then find c = cK > 0 such that
\[ |\langle T, \varphi \rangle| = |(D^\alpha \varphi)(x_0)| \le c \sum_{|\beta| \le |\alpha| - 1} \sup_K |D^\beta \varphi| \tag{4} \]
for all ϕ ∈ D(K). Take ψ ∈ D(B1 (0)) with ψ(0) = 1 and define, for ε ∈ (0, r),
\[ \varphi(x) := \frac{(x - x_0)^\alpha}{\alpha!}\, \psi\!\left( \frac{x - x_0}{\varepsilon} \right) \]
so that D^α ϕ(x_0) = 1. If β ∈ N_0^n, β ≤ α, and |β| < |α|, then for x ∈ B_ε(x_0) we get from the generalized Leibniz rule (see below)
\[
\begin{aligned}
|D^\beta \varphi(x)| &\le \sum_{\gamma \le \beta} \binom{\beta}{\gamma} \left| D_x^\gamma\, \frac{(x - x_0)^\alpha}{\alpha!} \right| \left| D_x^{\beta-\gamma}\, \psi\!\left( \frac{x - x_0}{\varepsilon} \right) \right| \\
&\le \sum_{\gamma \le \beta} \binom{\beta}{\gamma} c_\gamma\, |x - x_0|^{|\alpha| - |\gamma|}\, \sup |D^{\beta-\gamma}\psi|\; \varepsilon^{|\gamma| - |\beta|} \\
&\le \sum_{\gamma \le \beta} \binom{\beta}{\gamma} c_\gamma\, \sup |D^{\beta-\gamma}\psi|\; \varepsilon^{|\alpha| - |\gamma| + |\gamma| - |\beta|} = c_{\psi,\beta}\, \varepsilon^{|\alpha| - |\beta|},
\end{aligned}
\]
where
\[ c_{\psi,\beta} := \sum_{\gamma \le \beta} \binom{\beta}{\gamma} c_\gamma\, \sup |D^{\beta-\gamma}\psi|. \]
When ε < 1 we have, since |α| − |β| ≥ 1, that ε^{|α|−|β|} ≤ ε, and hence in combination with (4) we get
\[ 1 \le c \sum_{|\beta| \le |\alpha| - 1} c_{\psi,\beta}\, \varepsilon^{|\alpha| - |\beta|} \le c \sum_{|\beta| \le |\alpha| - 1} c_{\psi,\beta}\, \varepsilon =: \tilde c\, \varepsilon. \]
Letting ε → 0⁺ gives a contradiction, and so T has order |α|.
Theorem 4.5 (Generalized Leibniz Rule). Let f, g ∈ C^k(Ω). Then fg ∈ C^k(Ω) and for α ∈ N_0^n, |α| ≤ k, we have
\[ D^\alpha(fg) = \sum_{\beta \le \alpha} \binom{\alpha}{\beta} D^\beta f\, D^{\alpha-\beta} g, \]
where
\[ \binom{\alpha}{\beta} := \frac{\alpha!}{\beta!\, (\alpha - \beta)!} \]
and α! = α_1! α_2! ⋯ α_n!.
Proof. This can be proven by induction on |α|. We omit the details.
5 Lecture 5
Definition 5.1. Let {uj }j be a sequence in D′ (Ω) and let u ∈ D′ (Ω). We say uj converges to
u in the sense of distributions on Ω and write
uj −→ u in D′ (Ω)
if and only if
huj , ϕi −→ hu, ϕi
for each ϕ ∈ D(Ω).
Remark 5.2. As with convergence in D(Ω), one can define a topology T on D′ (Ω) so that
uj → u in D′ (Ω) corresponds to uj → u in the topological space (D′ (Ω), T ). We will not need
it and shall not pursue this here. We note, however, that the topology T is not metrizable.
Convergence in D′ (Ω) is very weak.
Example 5.3. Let p ∈ [1, ∞] and fj , f ∈ Lp (Ω). If fj → f in Lp (Ω), then fj → f in D′ (Ω).
This can be proven using the Dominated Convergence Theorem. The converse, however, is
false:
(i) Let f_j(x) = sin(jx), x ∈ (0, 1). Then f_j → 0 in D′(0, 1), but f_j ↛ 0 in L^p(0, 1) for any p ∈ [1, ∞].
(ii) Let g_j(x) = g(jx), x ∈ (0, 1), where g is T-periodic and on (0, T] is given by g = −117·1_{(0, T/2]} + 117·1_{(T/2, T]}. On Sheet 2 you will be asked to prove that g_j → 0 in D′(0, 1), but ‖g_j‖_1 = 117 ↛ 0.
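Here is a numerical illustration (our own) of (i): the pairings ⟨sin(j·), φ⟩ collapse as j grows while the L² norm does not:

```python
import numpy as np

# <sin(j x), phi> -> 0 as j -> infinity for a fixed bump phi, while
# ||sin(j x)||_2 stays bounded away from 0.  Trapezoid quadrature.
x = np.linspace(0.0, 1.0, 20001)
dx = x[1] - x[0]
integral = lambda y: np.sum(y[1:] + y[:-1]) * dx / 2.0

def phi(t):                         # smooth bump supported in (0.2, 0.8)
    out = np.zeros_like(t)
    m = (t > 0.2) & (t < 0.8)
    s = (t[m] - 0.5) / 0.3
    out[m] = np.exp(-1.0 / (1.0 - s ** 2))
    return out

pair = lambda j: integral(np.sin(j * x) * phi(x))
p_small, p_large = abs(pair(10)), abs(pair(500))
l2_norm = np.sqrt(integral(np.sin(500 * x) ** 2))
```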
Example 5.4. Let v ∈ L^1(R^n), x_0 ∈ Ω, and put
\[ v_\varepsilon(x) := \varepsilon^{-n}\, v\!\left( \frac{x - x_0}{\varepsilon} \right), \]
x ∈ Ω. Then v_ε ∈ D′(Ω) and v_ε → (∫_{R^n} v dx) δ_{x_0} in D′(Ω) as ε → 0⁺. Indeed, it is clear that v_ε ∈ D′(Ω) for all ε > 0, and for ϕ ∈ D(Ω) we have
\[
\langle v_\varepsilon, \varphi \rangle = \int_\Omega \varepsilon^{-n}\, v\!\left( \frac{x - x_0}{\varepsilon} \right) \varphi(x)\, dx = \int_{\mathbb{R}^n} v(y)\, \varphi(x_0 + \varepsilon y)\, dy \xrightarrow[\varepsilon \to 0^+]{} \int_{\mathbb{R}^n} v(y)\, \varphi(x_0)\, dy = \left( \int_{\mathbb{R}^n} v\, dy \right) \langle \delta_{x_0}, \varphi \rangle,
\]
where in the second step we made the change of variables y = ε^{−1}(x − x_0). In particular, note that if (ρ_ε)_{ε>0} is the standard mollifier, then ρ_ε → δ_0 in D′(R^n) as ε → 0⁺.
Note that for x ∈ R^n, φ^x(y) := ρ_ε(x − y), y ∈ R^n, is C^∞ and has support supp(φ^x) = B_ε(x), so in particular φ^x ∈ D(R^n). Now recall that for f ∈ L^p(R^n) we defined
\[ (\rho_\varepsilon * f)(x) = \int_{\mathbb{R}^n} \rho_\varepsilon(x - y)\, f(y)\, dy = \int_{\mathbb{R}^n} \phi^x(y)\, f(y)\, dy = \langle f, \phi^x \rangle. \]
This in fact all makes perfect sense for any f ∈ Lploc (Rn ), but we can go much further.
Definition 5.5 (Mollification of distributions). Let u ∈ D′(R^n) and let (ρ_ε)_{ε>0} be the standard mollifier. Then ρ_ε ∗ u ∈ D′(R^n) is defined by
\[ \langle \rho_\varepsilon * u, \varphi \rangle := \langle u, \tilde\rho_\varepsilon * \varphi \rangle, \]
ϕ ∈ D(R^n), where ρ̃_ε(x) = ρ_ε(−x), which is equal to ρ_ε(x) in the case of the standard mollifier since it is an even function.
so ρε ∗ u will satisfy the same bounds as u (uniformly in ε > 0). But for each fixed ε > 0, ρε ∗ u
is much better, as the following theorem shows.
Theorem 5.7. Let u ∈ D′(R^n) and let (ρ_ε)_{ε>0} be the standard mollifier. Then, for each fixed ε > 0, ρ_ε ∗ u is (represented by) a C^∞ function, and ρ_ε ∗ u → u in D′(R^n) as ε → 0⁺.
Sketch of proof. For ε > 0 fixed we consider
φx (y) ..= ρε (x − y)
for y ∈ Rn . Since φx → φx0 in D(Rn ) as x → x0 , we have by the continuity condition
hu, φx i −→ hu, φx0 i
as x → x0 . Thus ρε ∗ u ∈ C 0 (Rn ). Next, for a direction 1 6 j 6 n and an increment h 6= 0
consider
\[ \frac{1}{h} \Big( (\rho_\varepsilon * u)(x + he_j) - (\rho_\varepsilon * u)(x) \Big) = \left\langle u,\ \frac{1}{h} \big( \phi^{x+he_j} - \phi^x \big) \right\rangle. \]
We can easily check that
\[ \frac{1}{h} \big( \phi^{x+he_j} - \phi^x \big) \longrightarrow \psi_j^x \qquad (h \to 0) \]
in D(R^n), where ψ_j^x(y) = (D_j ρ_ε)(x − y), y ∈ R^n. Hence ρ_ε ∗ u admits partial derivatives
Dj (ρε ∗ u), and we see that, as above, they are continuous. Thus ρε ∗ u is C 1 , and an induction
argument along these lines gives that ρε ∗ u is C ∞ . For the approximation, let ϕ ∈ D(Rn ). By
definition,
\[ \langle \rho_\varepsilon * u, \varphi \rangle = \langle u, \rho_\varepsilon * \varphi \rangle, \]
and it is easy to check that ρ_ε ∗ ϕ → ϕ in D(R^n) as ε → 0⁺; consequently ρ_ε ∗ u → u in D′(R^n) as ε → 0⁺.
Strategy. Prove results for test functions and then try to extend them to distributions by the
above approximation result. We shall not prove Theorem 5.8 here, but we shall return to it
later.
How should we differentiate u ∈ D′(Ω)? Take u_j ∈ D(Ω) such that u_j → u in D′(Ω). Then
\[
\begin{aligned}
\langle D_k u_j, \varphi \rangle = \int_\Omega (D_k u_j)\varphi\, dx &= \int_{\mathbb{R}^n} (D_k u_j)\varphi\, dx \stackrel{\text{Fubini}}{=} \int_{\mathbb{R}^{n-1}} \left( \int_{\mathbb{R}} (D_k u_j)\varphi\, dx_k \right) d\hat x_k \\
&\stackrel{\text{parts}}{=} -\int_{\mathbb{R}^{n-1}} \left( \int_{\mathbb{R}} u_j\, D_k\varphi\, dx_k \right) d\hat x_k \stackrel{\text{Fubini}}{=} \langle u_j, -D_k \varphi \rangle \longrightarrow \langle u, -D_k \varphi \rangle \quad (j \to \infty),
\end{aligned}
\]
where \(\hat x_k\) denotes the remaining n − 1 variables.
6 Lecture 6
We outline a principle that often allows us to extend well-known operations on test functions
to corresponding operations on distributions. Let T be an operation on test functions, that
is T : D(Ω) → D(Ω) is a linear map. Suppose there exists a linear map S : D(Ω) → D(Ω)
satisfying
\[ \int_\Omega T(\varphi)\, \psi\, dx = \int_\Omega \varphi\, S(\psi)\, dx \]
for all ϕ, ψ ∈ D(Ω). We call this an adjoint identity. If S is continuous in the sense that
S(ψj ) → S(ψ) in D(Ω) whenever ψj → ψ in D(Ω), then we can extend T to distributions u by
the rule
\[ \langle \bar T(u), \psi \rangle := \langle u, S(\psi) \rangle, \]
ψ ∈ D(Ω). Because S is linear and continuous, it follows that T̄(u) ∈ D′(Ω), and in fact T̄ : D′(Ω) → D′(Ω) is linear and continuous,
T̄ (u + λv) = T̄ (u) + λT̄ (v)
for u, v ∈ D′(Ω), λ ∈ R (or C), since for ψ ∈ D(Ω)
\[ \langle \bar T(u + \lambda v), \psi \rangle = \langle u + \lambda v, S(\psi) \rangle = \langle u, S(\psi) \rangle + \lambda \langle v, S(\psi) \rangle = \langle \bar T(u), \psi \rangle + \lambda \langle \bar T(v), \psi \rangle, \]
and if u_j → u in D′(Ω), then for ψ ∈ D(Ω)
\[ \langle \bar T(u_j), \psi \rangle = \langle u_j, S(\psi) \rangle \longrightarrow \langle u, S(\psi) \rangle = \langle \bar T(u), \psi \rangle, \]
hence T̄ (uj ) → T̄ (u).
Example 6.1. 1. (Differentiation). T = d/dx = D on D(R). For ϕ, ψ ∈ D(R) we have by integration by parts
\[ \int_{\mathbb{R}} \varphi'\psi\, dx = [\varphi\psi]_{-\infty}^{+\infty} - \int_{\mathbb{R}} \varphi\psi'\, dx = \int_{\mathbb{R}} \varphi(-\psi')\, dx, \]
hence we have an adjoint identity with S = −D. Clearly, S : D(R) → D(R) is linear and
continuous, so we may extend to distributions u ∈ D′ (R) by
\[ \langle \bar D u, \psi \rangle = \langle u, -D\psi \rangle, \]
ψ ∈ D(R). To check consistency, suppose u ∈ C 1 (R) and consider also u as an element
of D′ (R). We would like to know the relation between the distributional derivative D̄u
defined above and the usual derivative Du. We have
\[ \langle \bar D u, \psi \rangle = \langle u, -D\psi \rangle = \int_{-\infty}^{\infty} u(-D\psi)\, dx = -[u\psi]_{-\infty}^{+\infty} + \int_{-\infty}^{\infty} \psi\, Du\, dx = \langle Du, \psi \rangle \]
for all ψ ∈ D(R), and so D̄u = Du. In the following we shall therefore not distinguish between the distributional and the classical derivatives, and simply denote both by Du or du/dx when they exist.
2. (Multiplication by smooth functions). For f ∈ C ∞ (R) define T (ϕ) ..= f ϕ for ϕ ∈ D(R).
Clearly T : D(R) → D(R) is linear and S = T yields an adjoint identity:
\[ \int_{\mathbb{R}} f\varphi\, \psi\, dx = \int_{\mathbb{R}} \varphi\, f\psi\, dx \]
for ϕ, ψ ∈ D(R). It is clear that S : D(R) → D(R) is linear and continuous (checked by
Leibniz), so we may extend T to distributions by the rule
\[ \langle fu, \psi \rangle := \langle u, f\psi \rangle \]
for u ∈ D′ (R), ψ ∈ D(R). Clearly we have consistency here: when u ∈ L1loc (R), then
f u ∈ L1loc (R) and f u can be identified with the above distribution.
3. Many other useful operations admit extensions to distributions.
Translation. T = τh defined by τh ϕ(x) = ϕ(x + h) yields adjoint identity with S = τ−h .
Thus for u ∈ D′(R), τ_h u ∈ D′(R) is defined by the rule
\[ \langle \tau_h u, \psi \rangle := \langle u, \tau_{-h}\psi \rangle \]
for ψ ∈ D(R).
Dilation. T = d_r defined by d_r ϕ(x) = ϕ(rx), r > 0, yields the adjoint identity with S = (1/r) d_{1/r}. Thus for u ∈ D′(R), d_r u ∈ D′(R) is defined by the rule
\[ \langle d_r u, \psi \rangle := \left\langle u,\ \tfrac{1}{r}\, d_{1/r}\, \psi \right\rangle \]
for ψ ∈ D(R).
Reflection through the origin. (T ϕ)(x) = ϕ̃(x) = ϕ(−x) admits the adjoint identity with
S = T. Thus for u ∈ D′(R), ũ ∈ D′(R) is defined by the rule
\[ \langle \tilde u, \psi \rangle := \langle u, \tilde\psi \rangle \]
for ψ ∈ D(R).
Convolution with a test function. For v ∈ D(R), Tϕ = v ∗ ϕ admits an adjoint identity with Sψ = ṽ ∗ ψ. Indeed, by Fubini,
\[
\begin{aligned}
\int_{-\infty}^{\infty} (v * \varphi)(x)\, \psi(x)\, dx &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} v(x - y)\varphi(y)\, dy\ \psi(x)\, dx \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} v(x - y)\psi(x)\, dx\ \varphi(y)\, dy = \int_{-\infty}^{\infty} (\tilde v * \psi)(y)\, \varphi(y)\, dy.
\end{aligned}
\]
Definition 6.2. Let Ω be a non-empty open subset of R^n. Let u ∈ D′(Ω) and j ∈ {1, …, n}. The j-th partial derivative of u, written D_j u or ∂u/∂x_j, in the sense of distributions is defined by the rule
\[ \langle D_j u, \varphi \rangle := \langle u, -D_j \varphi \rangle \]
for ϕ ∈ D(Ω).
Note that Dj fits into the adjoint identity scheme with T = Dj and S = −Dj , and so
is well-defined. Also note that Dj is continuous in the sense that if uk → u in D′ (Ω), then
Dj uk → Dj u in D′ (Ω). As in the one-dimensional case, when u ∈ C 1 (Ω) the distributional and
classical partial derivatives D1 u, . . . , Dn u coincide. Moreover, note that since for ϕ ∈ D(Ω) we
have
\[ \frac{\partial^2 \varphi}{\partial x_j \partial x_k} = \frac{\partial^2 \varphi}{\partial x_k \partial x_j}, \]
we also have D_j D_k u = D_k D_j u for u ∈ D′(Ω). We can therefore use multi-index notation for distributional derivatives: for u ∈ D′(Ω) and α ∈ N_0^n we have
\[ \langle D^\alpha u, \varphi \rangle = (-1)^{|\alpha|} \langle u, D^\alpha \varphi \rangle, \qquad \text{where } D^\alpha \varphi = \frac{\partial^{|\alpha|} \varphi}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}. \]
Definition 6.3. Let u be a distribution and f be a smooth function. Then the product fu in the sense of distributions is defined by the rule
\[ \langle fu, \varphi \rangle := \langle u, f\varphi \rangle \]
for ϕ ∈ D(Ω).
This definition also fits into the adjoint identity scheme, with T(ϕ) = fϕ = S(ϕ), and so is well-defined. It is clearly consistent, as in the one-dimensional case.
Example 6.4. The Heaviside function is the function
\[ H(x) = \begin{cases} 0 & x < 0, \\ 1 & x \ge 0. \end{cases} \]
Note that the value of H(x) at x = 0 is not particularly important and is sometimes taken to be 0 instead (or in some other contexts even 1/2). Clearly H ∈ L^1_{loc}(R), so H ∈ D′(R) and we
have H′ = δ_0. Indeed, for ϕ ∈ D(R),
\[
\begin{aligned}
\langle H', \varphi \rangle = \langle H, -\varphi' \rangle &= \int_{-\infty}^{\infty} H(x)\, (-\varphi'(x))\, dx \\
&= -\int_0^{\infty} \varphi'(x)\, dx \stackrel{\text{FTC}}{=} -\big[ \varphi(x) \big]_{x=0}^{x=\infty} = \varphi(0) = \langle \delta_0, \varphi \rangle.
\end{aligned}
\]
Note also that for m ∈ N
\[ \left\langle \frac{d^m}{dx^m}\, \delta_0,\ \varphi \right\rangle = \langle \delta_0, (-1)^m \varphi^{(m)} \rangle = (-1)^m \varphi^{(m)}(0). \]
A slight extension of the above formula for H ′ is obtained by differentiation of a piecewise C 1
function
\[ h(x) = \begin{cases} f(x) & x < 0, \\ g(x) & x \ge 0, \end{cases} \]
where f , g ∈ C 1 (R). This will be addressed on Sheet 2.
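A direct numerical check (our own) of ⟨H′, φ⟩ = φ(0), for a concrete bump φ:

```python
import numpy as np

# <H', phi> = -∫ H(x) phi'(x) dx should equal phi(0); check on a grid.
x = np.linspace(-1.5, 1.5, 60001)
dx = x[1] - x[0]
integral = lambda y: np.sum(y[1:] + y[:-1]) * dx / 2.0

phi = np.zeros_like(x)
m = np.abs(x) < 1
phi[m] = np.exp(-1.0 / (1.0 - x[m] ** 2))   # bump with phi(0) = e^{-1}

H = (x >= 0).astype(float)
dphi = np.gradient(phi, dx)                 # phi' via central differences

lhs = -integral(H * dphi)                   # the adjoint-rule pairing <H', phi>
rhs = float(np.exp(-1.0))                   # phi(0)
gap = abs(lhs - rhs)
```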
Example 6.5. We can define △_h = (τ_h − 1)/h for h ≠ 0 on distributions u ∈ D′(R) by the adjoint identity scheme: for ϕ ∈ D(R) put
\[ \langle \triangle_h u, \varphi \rangle = \left\langle u,\ \frac{\tau_{-h} - 1}{h}\, \varphi \right\rangle, \]
where ((τ_{−h} − 1)/h · ϕ)(x) = (ϕ(x − h) − ϕ(x))/h. If u ∈ C^1(R), then clearly
\[ \triangle_h u(x) = \frac{u(x + h) - u(x)}{h} \longrightarrow u'(x) \qquad (h \to 0) \]
locally uniformly in x. What happens when u ∈ D′(R)? One may check that
\[ \frac{\tau_{-h} - 1}{h}\, \varphi \longrightarrow -\varphi' \quad \text{in } \mathcal{D}(\mathbb{R}) \text{ as } h \to 0, \]
and hence ⟨△_h u, ϕ⟩ → ⟨u, −ϕ′⟩ = ⟨u′, ϕ⟩, that is, △_h u → u′ in D′(R) as h → 0.
Theorem 6.6 (Leibniz Rule). If u ∈ D′ (Ω), f ∈ C ∞ (Ω), and j ∈ {1, . . . , n}, then
Dj (f u) = (Dj f )u + f Dj u
in D′(Ω). In fact, the Generalized Leibniz Rule also holds for distributions: for a multi-index α ∈ N_0^n,
\[ D^\alpha(fu) = \sum_{\beta \le \alpha} \binom{\alpha}{\beta} D^\beta f\, D^{\alpha-\beta} u. \]
Proof. We only prove the basic case, the general case can be proved by induction, or simply
by using the formula for test functions. First note that Dj (f u), (Dj f )u + f Dj u ∈ D′ (Ω) and
that for ϕ ∈ D(Ω):
\[ \langle D_j(fu), \varphi \rangle = \langle fu, -D_j\varphi \rangle = \langle u, -f D_j\varphi \rangle = \langle u, (D_j f)\varphi - D_j(f\varphi) \rangle = \langle (D_j f)u, \varphi \rangle + \langle f D_j u, \varphi \rangle, \]
where we used the classical Leibniz rule D_j(fϕ) = (D_j f)ϕ + f D_jϕ for the test function fϕ.
7 Lecture 7
Theorem 7.1. Let Ω be a non-empty, open, connected subset of R^n and let u ∈ D′(Ω). If
\[ D_j u = 0 \]
for j = 1, 2, …, n, then u is constant in the sense that there exists c ∈ R (or C) such that
\[ \langle u, \varphi \rangle = c \int_\Omega \varphi(x)\, dx \]
for ϕ ∈ D(Ω).
Proof. We only give details for the case n = 1 and Ω = R. The general case can be done along
similar lines (not examinable). Suppose u ∈ D′(R) satisfies u′ = 0, that is,
\[ 0 = \langle u', \varphi \rangle = -\langle u, \varphi' \rangle \]
for all ϕ ∈ D(R). Fix ρ ∈ D(R) with ∫_R ρ dx = 1 (the standard mollifier kernel on R will do).
For ϕ ∈ D(R) we put
\[ c_\varphi := \int_{\mathbb{R}} \varphi(x)\, dx \]
and
\[ \psi(x) := \int_{-\infty}^{x} \big( \varphi(t) - c_\varphi\, \rho(t) \big)\, dt \]
for x ∈ R. Take a, b ∈ R with a < b and ϕ(t) = 0 = ρ(t) for t ≤ a and t ≥ b. Then ψ(x) = 0 for x ≤ a, and for x ≥ b,
\[ \psi(x) = \int_{-\infty}^{x} (\varphi(t) - c_\varphi \rho(t))\, dt = \int_{-\infty}^{\infty} (\varphi(t) - c_\varphi \rho(t))\, dt = 0. \]
By the FTC, ψ is C^1 with ψ′(x) = ϕ(x) − c_ϕ ρ(x), and hence ψ is C^∞. Since also supp(ψ) ⊂ [a, b], ψ ∈ D(R). Now
\[ \langle u, \varphi \rangle = \langle u, \psi' + c_\varphi \rho \rangle = \langle u, \psi' \rangle + c_\varphi \langle u, \rho \rangle = \langle -u', \psi \rangle + c_\varphi \langle u, \rho \rangle = c_\varphi \langle u, \rho \rangle, \]
so
\[ \langle u, \varphi \rangle = \langle u, \rho \rangle \int_{\mathbb{R}} \varphi(x)\, dx = c \int_{\mathbb{R}} \varphi(x)\, dx, \]
where we denoted c = ⟨u, ρ⟩.
Remark 7.2. Recall that any u ∈ L1loc (Ω) is uniquely determined by the corresponding distri-
bution
\[ \langle u, \varphi \rangle = \int_\Omega u\varphi\, dx, \]
ϕ ∈ D(Ω), and we do not distinguish between u as an L1loc function and u as a distribution in
our notation. In particular note that C k (Ω) ⊂ L1loc (Ω), and that a distribution u ∈ D′ (Ω) is a
C^k function precisely when
\[ \langle u, \varphi \rangle = \int_\Omega f(x)\varphi(x)\, dx \]
for some f ∈ C^k(Ω). Following the above convention we then write u = f. However, keep in mind that for u ∈ L^1_{loc}(Ω) and f ∈ C^k(Ω) we can conclude from
\[ \int_\Omega u(x)\varphi(x)\, dx = \int_\Omega f(x)\varphi(x)\, dx \]
for all ϕ ∈ D(Ω) only that u = f almost everywhere in Ω.
Theorem 7.3. If u ∈ D′(Ω) and each distributional derivative D_1 u, …, D_n u is (represented by) a continuous function on Ω, then u is a C^1 function.
Proof. We only consider the case n = 1 and Ω = R. Let (ρ_ε)_{ε>0} be the standard mollifier and
put uε = ρε ∗ u. Then by Theorem 5.7 we have uε ∈ C ∞ (R) and uε −→ u in D′ (R). By the
FTC,
\[ u_\varepsilon(x) - u_\varepsilon(y) = \int_y^x u_\varepsilon'(t)\, dt \]
for all x, y ∈ R. By considering difference quotients as in Example 6.5, we find that u′ε = ρε ∗u′ .
Here the distributional derivative u′ is a continuous function, so u′_ε = ρ_ε ∗ u′ → u′ locally uniformly on R. Now
\[ u_\varepsilon(x) = u_\varepsilon(y) + \int_y^x u_\varepsilon'(t)\, dt, \]
and multiplying by ρ(y) and then integrating over y ∈ R gives
\[ u_\varepsilon(x) = \int_{-\infty}^{\infty} u_\varepsilon(y)\rho(y)\, dy + \int_{-\infty}^{\infty} \rho(y) \int_y^x u_\varepsilon'(t)\, dt\, dy. \]
Multiply by ϕ(x) ∈ D(R) and integrate over x ∈ R:
\[ \langle u_\varepsilon, \varphi \rangle = \int_{-\infty}^{\infty} u_\varepsilon(x)\varphi(x)\, dx = \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} u_\varepsilon(y)\rho(y)\, dy + \int_{-\infty}^{\infty} \rho(y) \int_y^x u_\varepsilon'(t)\, dt\, dy \right) \varphi(x)\, dx. \]
Taking ε → 0⁺ we get
\[ \langle u, \varphi \rangle = \int_{-\infty}^{\infty} \left( \langle u, \rho \rangle + \int_{-\infty}^{\infty} \rho(y) \int_y^x u'(t)\, dt\, dy \right) \varphi(x)\, dx. \]
Notice that the first term in parentheses is a constant, while the second is a C 1 function in x
(by the FTC). Therefore u ∈ C 1 (R).
For m ∈ N_0 and p ∈ [1, ∞] we set
\[ W^{m,p}(\Omega) := \{ u \in L^p(\Omega) : D^\alpha u \in L^p(\Omega) \text{ for all } |\alpha| \le m \}, \]
the derivatives being taken in the sense of distributions. It is not difficult to check that W^{m,p}(Ω) is a vector space under the usual definitions of addition and scalar multiplication. It is called a Sobolev space, and is equipped with the norm
\[ \|u\|_{W^{m,p}} := \begin{cases} \displaystyle \left( \sum_{|\alpha| \le m} \|D^\alpha u\|_p^p \right)^{1/p} & \text{if } p \in [1, \infty), \\[6pt] \displaystyle \max_{|\alpha| \le m} \|D^\alpha u\|_\infty & \text{if } p = \infty. \end{cases} \]
This is a norm in the same sense that ‖·‖_p is a norm on L^p(Ω): one identifies functions that agree a.e.
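For a concrete function the norm can be computed directly; here is a numerical check (ours) for u(x) = sin(πx) in W^{1,2}(0, 1), where the exact norm is √(1/2 + π²/2):

```python
import numpy as np

# ||u||_{W^{1,2}}^2 = ||u||_2^2 + ||u'||_2^2 for u(x) = sin(pi x) on (0, 1).
x = np.linspace(0.0, 1.0, 100001)
dx = x[1] - x[0]
integral = lambda y: np.sum(y[1:] + y[:-1]) * dx / 2.0

u = np.sin(np.pi * x)
du = np.pi * np.cos(np.pi * x)      # classical derivative = distributional one

norm_num = np.sqrt(integral(u ** 2) + integral(du ** 2))
norm_exact = np.sqrt(0.5 + np.pi ** 2 / 2.0)
```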
8 Lecture 8
Definition 8.1. For f ∈ L^1(R^n) we define its Fourier transform by
\[ \hat f(\xi) := \int_{\mathbb{R}^n} f(x)\, e^{-ix\cdot\xi}\, dx, \qquad \xi \in \mathbb{R}^n. \]
Remark 8.2. Note that this is well-defined for each ξ ∈ R^n since
\[ |f(x)\, e^{-ix\cdot\xi}| = |f(x)| \]
is integrable. In fact, f̂ is bounded, with sup|f̂| ≤ ‖f‖_1, and continuous (by the Dominated Convergence Theorem); moreover,
• fˆ(ξ) → 0 as |ξ| → ∞ (by the Riemann–Lebesgue lemma which we shall prove later in
the course).
The precise range {fˆ : f ∈ L1 (Rn )} is not so easy to describe in terms not involving the Fourier
transform; however, it is strictly smaller than
\[ C_0(\mathbb{R}^n) := \left\{ g \in C(\mathbb{R}^n) : \lim_{|x|\to\infty} g(x) = 0 \right\}. \]
One reason that we are interested in the Fourier transform here is its ability to transform
partial derivatives to an algebraic operation.
Lemma 8.3 (Differentiation Rule). Let f ∈ L1 (Rn ) and assume Dj f ∈ L1 (Rn ) for some
j ∈ {1, . . . , n}. Then
\[ \widehat{D_j f}(\xi) = i\xi_j\, \hat f(\xi). \]
Proof. Let φ ∈ D(R^n) be such that φ(x) = 1 for |x| ≤ 1 (exercise: think about how to construct such a cut-off function). We then calculate
\[
\begin{aligned}
\widehat{D_j f}(\xi) &= \int_{\mathbb{R}^n} D_j f(x)\, e^{-ix\cdot\xi}\, dx \\
&\stackrel{\text{DCT}}{=} \lim_{r\to\infty} \int_{\mathbb{R}^n} D_j f(x)\, e^{-ix\cdot\xi}\, \phi\!\left( \frac{x}{r} \right) dx \\
&= \lim_{r\to\infty} \left\langle D_j f,\ e^{-i(\cdot)\cdot\xi}\, \phi\!\left( \frac{\cdot}{r} \right) \right\rangle \\
&= \lim_{r\to\infty} \left\langle f,\ i\xi_j\, e^{-i(\cdot)\cdot\xi}\, \phi\!\left( \frac{\cdot}{r} \right) - e^{-i(\cdot)\cdot\xi}\, \frac{1}{r}\, (D_j\phi)\!\left( \frac{\cdot}{r} \right) \right\rangle \\
&= \lim_{r\to\infty} \int_{\mathbb{R}^n} f(x) \left( i\xi_j\, e^{-ix\cdot\xi}\, \phi\!\left( \frac{x}{r} \right) - e^{-ix\cdot\xi}\, \frac{1}{r}\, (D_j\phi)\!\left( \frac{x}{r} \right) \right) dx \\
&\stackrel{\text{DCT}}{=} \int_{\mathbb{R}^n} f(x)\, i\xi_j\, e^{-ix\cdot\xi}\, dx = i\xi_j\, \hat f(\xi).
\end{aligned}
\]
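A numerical check (ours) of the differentiation rule for the Gaussian f(x) = e^{−x²/2}, whose transform under the convention f̂(ξ) = ∫ f(x) e^{−ixξ} dx is √(2π) e^{−ξ²/2}:

```python
import numpy as np

# Check F(f')(xi) = i xi fhat(xi) for f(x) = exp(-x^2/2) by quadrature,
# using the convention fhat(xi) = integral of f(x) e^{-i x xi} dx.
x = np.linspace(-20.0, 20.0, 40001)
dx = x[1] - x[0]
integral = lambda y: np.sum(y[1:] + y[:-1]) * dx / 2.0

f = np.exp(-x ** 2 / 2.0)
df = -x * f                                   # f'(x)
ft = lambda vals, xi: integral(vals * np.exp(-1j * x * xi))

xi = 1.7
rule_gap = abs(ft(df, xi) - 1j * xi * ft(f, xi))
gauss_gap = abs(ft(f, xi) - np.sqrt(2.0 * np.pi) * np.exp(-xi ** 2 / 2.0))
```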
Example 8.4. Let f = 1_{(−1,1)}. Clearly f ∈ L^1(R), and
\[ \hat f(\xi) = \begin{cases} \dfrac{2\sin\xi}{\xi} & \text{for } \xi \ne 0, \\[4pt] 2 & \text{for } \xi = 0. \end{cases} \]
Note that f̂ ∉ L^1(R).
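Checking Example 8.4 numerically (our own sketch):

```python
import numpy as np

# fhat(xi) = integral over (-1, 1) of e^{-i x xi} dx, which should equal
# 2 sin(xi)/xi for xi != 0 and 2 at xi = 0.
x = np.linspace(-1.0, 1.0, 100001)
dx = x[1] - x[0]
integral = lambda y: np.sum(y[1:] + y[:-1]) * dx / 2.0

fhat = lambda xi: integral(np.exp(-1j * x * xi))

xi = 3.2
gap = abs(fhat(xi) - 2.0 * np.sin(xi) / xi)
at_zero = abs(fhat(0.0) - 2.0)
```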
Example 8.5. Let ρ ∈ D(R) be the standard mollifier kernel on R (that is, ρ is an even function satisfying 0 ≤ ρ ≤ 1, supp(ρ) = [−1, 1], and ∫ρ = 1). Then
\[ \hat\rho(\xi) = 2 \int_0^1 \rho(x)\cos(x\xi)\, dx. \]
One can show that ρ̂ ∈ C^∞(R) and that it decays rapidly together with all its derivatives: for all k, m ∈ N_0,
\[ |\xi|^k\, D^m \hat\rho(\xi) \longrightarrow 0 \]
as |ξ| → ∞. However, ρ̂ does not have compact support, so ρ̂ ∉ D(R).
We would like to extend the Fourier transform to distributions, and to that end we seek an adjoint identity. The above example with ρ̂ ∉ D(R) shows that we will have to define a new class of test functions. We shall return to that shortly, but first we need a few lemmas.
Lemma 8.6 (The Product Rule). Let f , g ∈ L1 (Rn ). Then
\[ \int_{\mathbb{R}^n} f(x)\, \hat g(x)\, dx = \int_{\mathbb{R}^n} \hat f(x)\, g(x)\, dx. \]
Note that both sides are well-defined since the Fourier transform of an L1 function is bounded
and continuous. We thus have an adjoint identity with S = F = T , but there is an issue with
the domain that we shall address later.
Before addressing the issues with the domain and the appropriate class of test functions,
let us investigate the properties of the Fourier transform on L1 functions a little more. It will
turn out to be a useful source of insight.
Lemma 8.7 (Translation Rule). Let f ∈ L^1(R^n) and h ∈ R^n. Then
\[ \mathcal{F}(\tau_h f)(\xi) = e^{ih\cdot\xi}\, \hat f(\xi) \]
and
\[ \mathcal{F}\big( e^{-ix\cdot h} f(x) \big)(\xi) = \tau_h \hat f(\xi). \]
Proof. The first identity follows from the change of variables y = x + h, and for the second,
\[ \int_{\mathbb{R}^n} e^{-ix\cdot h} f(x)\, e^{-ix\cdot\xi}\, dx = \int_{\mathbb{R}^n} f(x)\, e^{-ix\cdot(\xi + h)}\, dx = \hat f(\xi + h) = \tau_h \hat f(\xi). \]
Lemma 8.8 (Dilation Rule). Let f ∈ L^1(R^n) and r > 0. Then
\[ (d_r \hat f)(\xi) = \mathcal{F}\big( r^{-n}\, d_{1/r} f \big)(\xi). \]
28
and
(dr fˆ)(ξ) = fˆ(rξ)
Z
= f (x)e−ix·rξ dx
R n
Z y
y=rx y
=n f e−i r ·rξ r−n dy
dy=r dx Rn r
Z
−n
=r (d 1 f )(y)e−iy·ξ dy
r
Rn
\
= r−n (d 1 f )(ξ).
r
Lemma 8.9 (Convolution Rule). Let f , g ∈ L1 (Rn ). Then
\[
\mathcal{F}(f*g)(\xi) = \hat{f}(\xi)\hat{g}(\xi).
\]
Proof. By Fubini,
\begin{align*}
\mathcal{F}(f*g)(\xi) &= \int_{\mathbb{R}^n} \Big( \int_{\mathbb{R}^n} f(x-y)g(y)\, dy \Big) e^{-ix\cdot\xi}\, dx \\
&= \int_{\mathbb{R}^n} \Big( \int_{\mathbb{R}^n} f(x-y)e^{-i(x-y)\cdot\xi}\, dx \Big) g(y)e^{-iy\cdot\xi}\, dy \\
&\overset{z = x-y}{=} \int_{\mathbb{R}^n} \Big( \int_{\mathbb{R}^n} f(z)e^{-iz\cdot\xi}\, dz \Big) g(y)e^{-iy\cdot\xi}\, dy \\
&= \hat{f}(\xi)\hat{g}(\xi). \qquad\square
\end{align*}
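The Convolution Rule can also be tested numerically. The sketch below (our own construction, with ad hoc helpers `ft` and `conv`) compares $\mathcal{F}(f*g)$ with $\hat f\hat g$ for two Gaussians, approximating both the convolution and the Fourier integrals by the trapezoid rule:

```python
import cmath
import math

def ft(g, xi, n=600, L=8.0):
    # trapezoid approximation of ∫ g(x) e^{-i x xi} dx
    h = 2.0 * L / n
    return sum((0.5 if k in (0, n) else 1.0) * g(-L + k * h)
               * cmath.exp(-1j * (-L + k * h) * xi) for k in range(n + 1)) * h

def conv(f, g, x, n=600, L=8.0):
    # trapezoid approximation of (f * g)(x) = ∫ f(x - y) g(y) dy
    h = 2.0 * L / n
    return sum((0.5 if k in (0, n) else 1.0) * f(x - (-L + k * h)) * g(-L + k * h)
               for k in range(n + 1)) * h

f = lambda x: math.exp(-x * x)
g = lambda x: math.exp(-x * x / 2.0)
for xi in (0.5, 1.5):
    lhs = ft(lambda x: conv(f, g, x), xi)   # F(f*g)(xi)
    rhs = ft(f, xi) * ft(g, xi)             # f^(xi) g^(xi)
    assert abs(lhs - rhs) < 1e-6
```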
Lemma 8.10 (Reverse Differentiation Rule). Let f ∈ L1 (Rn ) and assume xj f (x) ∈ L1 (Rn ) for
some j ∈ {1, . . . , n}. Then the distributional partial derivative Dj fˆ is a continuous function,
(Dj fˆ)(ξ) = F(−ixj f (x))(ξ).
In fact, Dj fˆ exists classically.
Proof. Let us start with the latter statement. Fix ξ ∈ Rn and h ∈ R \ {0} and consider the
following difference quotient. We have
\begin{align*}
\triangle_{he_j} \hat{f}(\xi) &:= \frac{\hat{f}(\xi + he_j) - \hat{f}(\xi)}{h} \\
&= \int_{\mathbb{R}^n} f(x)\, \triangle_{he_j}\big(e^{-ix\cdot(\cdot)}\big)(\xi)\, dx \\
&\overset{\text{DCT}}{\longrightarrow} \int_{\mathbb{R}^n} -ix_j f(x) e^{-ix\cdot\xi}\, dx \qquad (h \to 0) \\
&= \mathcal{F}(-ix_j f(x))(\xi),
\end{align*}
so the partial derivative Dj fˆ exists classically at ξ. Moreover, since the map ξ 7→
F(−ixj f (x))(ξ) is continuous, so is Dj fˆ. This is also the distributional derivative since, as in
Example 6.5, we have △hej fˆ → Dj fˆ in D′ (Rn ) as h → 0. More precisely, one has that
\[
\langle \triangle_{he_j}\hat{f}, \varphi \rangle \longrightarrow \langle D_j \hat{f}, \varphi \rangle \quad \text{as } h \to 0
\]
for every ϕ ∈ D(Rn ). But the difference quotient △hej fˆ(ξ) converges locally uniformly in
ξ to the classical derivative too, so Dj fˆ can be understood in either sense. If unconvinced,
the reader is invited to write out the precise details of the last part of the argument as an
exercise.
9 Lecture 9
We start with the observation that the differentiation rules for the Fourier transform on L1
admit generalizations to higher order derivatives. We formalize this using the following no-
tation. Recall that for a multi-index α = (α1 , . . . , αn ) ∈ Nn0 and x = (x1 , . . . , xn ) ∈ Rn ,
D = (D1 , . . . , Dn ), we denoted
\[
x^\alpha := x_1^{\alpha_1} \cdots x_n^{\alpha_n}
\qquad\text{and}\qquad
D^\alpha := \frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}.
\]
We can use this notation to write out polynomials in n variables: if p(x) is a polynomial of
degree at most k, then
\[
p(x) = \sum_{|\alpha| \le k} c_\alpha x^\alpha,
\]
where cα ∈ R and we sum over all multi-indices α ∈ Nn0 of length |α| 6 k. Corresponding to
the polynomial p(x) is a linear partial differential operator
\[
p(D) := \sum_{|\alpha| \le k} c_\alpha D^\alpha.
\]
If cα 6= 0 for some α ∈ Nn0 with |α| = k, then we say p(D) has order k. Sometimes we also
write p(iD) or p(−iD), the notation being self explanatory:
\[
p(iD) = \sum_{|\alpha| \le k} c_\alpha (iD)^\alpha = \sum_{|\alpha| \le k} c_\alpha i^{|\alpha|} D^\alpha,
\]
and so on.
Corollary 9.1.
1. If f ∈ L1 (Rn ) and p(D)f ∈ L1 (Rn ), then
\[
\widehat{p(D)f}\,(\xi) = p(i\xi)\hat{f}(\xi).
\]
2. If f ∈ L1 (Rn ) and p(x)f (x) ∈ L1 (Rn ), then p(D)fˆ exists and
\[
p(D)\hat{f} = \mathcal{F}\big(p(-ix)f(x)\big).
\]
We are now ready to address the domain issues in the adjoint identity for the Fourier
transform (Lemma 8.6).
Definition 9.2. A function f : Rn → R (or into C) is said to be rapidly decreasing if and only
if for every m ∈ N there exist rm , cm > 0 such that
\[
|f(x)| \le c_m |x|^{-m} \quad \text{for all } |x| \ge r_m .
\]
Remark 9.3. A continuous function f is rapidly decreasing if and only if for any polynomial
p(x) the function x 7→ p(x)f (x) is bounded on Rn . Clearly the bound will depend on the
polynomial p(x). Exercise: prove this.
Example 9.4.
• $\frac{1}{1 + x^{2m}}$ is not rapidly decreasing for any m ∈ N,
• $e^{-x^2}$ is rapidly decreasing,
Definition 9.5 (The Schwartz Space S(Rn )). We say that ϕ is a Schwartz test function on Rn
and write ϕ ∈ S(Rn ) if and only if ϕ ∈ C ∞ (Rn ) and for all α ∈ Nn0 , Dα ϕ is rapidly decreasing.
Example 9.6.
• $e^{-|x|^2}$ ∈ S(Rn ) \ D(Rn ),
• $e^{-|x|}$ ∈/ S(Rn ) because it is not differentiable at zero,
The following lemma collects a few of the properties of the class of Schwartz test functions.
Lemma 9.7.
(i) S(Rn ) is a vector space.
(ii) If ϕ ∈ S(Rn ) and p is a polynomial on Rn , then pϕ ∈ S(Rn ).
(iii) If ϕ ∈ S(Rn ) and p is a polynomial on Rn , then p(D)ϕ ∈ S(Rn ).
(iv) S(Rn ) ⊂ Lp (Rn ) for every p ∈ [1, ∞].
Proof.
(i) If ϕ, ψ ∈ S(Rn ) and λ ∈ R (or C), then certainly ϕ + λψ ∈ C ∞ (Rn ) because C ∞ (Rn ) is
a vector space. Now for a polynomial p on Rn we have
sup |p(x)(ϕ + λψ)(x)| 6 sup |p(x)ϕ(x)| + |λ| sup |p(x)ψ(x)| < ∞.
x x x
For (ii), the Leibniz rule gives for polynomials p, q on Rn and α ∈ Nn0
\[
\sup_x \big|q\, D^\alpha(p\varphi)\big| \le \sum_{\beta \le \alpha} \binom{\alpha}{\beta} \sup_x \big|q\, D^\beta p\, D^{\alpha-\beta}\varphi\big| < \infty.
\]
10 Lecture 10
Proof. By part (iv) of Lemma 9.7, S(Rn ) ⊂ L1 (Rn ) and so ϕ̂ is well-defined for ϕ ∈ S(Rn ).
We thus only need to check that ϕ̂ ∈ S(Rn ) whenever ϕ ∈ S(Rn ).
Recall that if ϕ ∈ S(Rn ) ⊂ L1 (Rn ), then ϕ̂ is continuous and
\[
|\hat{\varphi}(\xi)| = \Big| \int_{\mathbb{R}^n} \varphi(x)e^{-ix\cdot\xi}\, dx \Big| \le \|\varphi\|_1 . \tag{5}
\]
Step 1: ϕ̂ is rapidly decreasing. Let p be a polynomial in n variables. By Corollary 9.1,
\[
p(\xi)\hat{\varphi}(\xi) = \widehat{p(-iD)\varphi}\,(\xi),
\]
and so supξ |p(ξ)ϕ̂(ξ)| < ∞ by (5) applied to p(−iD)ϕ ∈ S(Rn ) ⊂ L1 (Rn ).
Step 2: Let α ∈ Nn0 . Then Dα ϕ̂ is rapidly decreasing.
Again, Dα ϕ̂ is continuous so we only need to show that supξ |p(ξ)Dα ϕ̂(ξ)| < ∞ whenever
p is a polynomial in n variables. By part (ii) of Lemma 9.7, ψ(x) ..= (−ix)α ϕ(x) ∈ S(Rn ), and
so by parts (iii) and (iv)
p(−iD)ψ ∈ S(Rn ) ⊂ L1 (Rn ).
Then Corollary 9.1 yields
\[
p(\xi) D^\alpha \hat{\varphi}(\xi) = p(\xi)\hat{\psi}(\xi) = \widehat{p(-iD)\psi}\,(\xi),
\]
and the right-hand side is bounded by ∥p(−iD)ψ∥1 in view of (5). □
Remark 10.2. We record the following principle that is implicit in the above proof.
(a) Let m ∈ N. If f ∈ L1 (Rn ) and Dα f ∈ L1 (Rn ) for all |α| 6 m, then supξ (1 + |ξ|)m |fˆ(ξ)| < ∞.
(b) Let m ∈ N, m > n+1. If f ∈ L1 (Rn ) and supx (1+|x|)m |f (x)| < ∞, then fˆ ∈ C m−n−1 (Rn )
and supξ |Dα fˆ(ξ)| < ∞ for |α| 6 m − n − 1.
There is clearly a gap of n+1 derivatives between (a) and (b), but in the proof of Theorem 10.1
this did not matter because the definition of a Schwartz function involves C ∞ smoothness and
rapid decrease.
We could now proceed to define the Fourier transform for certain distributions using the
adjoint identity scheme: the product rule holds in particular for ϕ, ψ ∈ S(Rn ), namely
Z Z
ϕ̂(x)ψ(x) dx = ϕ(x)ψ̂(x) dx.
Rn Rn
However, we shall first show that the Fourier transform is bijective on the Schwartz space.
Theorem 10.3 (Fourier Inversion Theorem on S(Rn )). The Fourier transform F : S(Rn ) →
S(Rn ) is bijective with inverse given by
Z
(F −1 ψ)(x) = (2π)−n ψ(ξ)eix·ξ dξ.
Rn
Proof. Let ϕ ∈ S(Rn ). Can we recover ϕ from ϕ̂? Now ϕ̂ ∈ S(Rn ) ⊂ L1 (Rn ) so we may
consider
\[
\mathcal{F}^2(\varphi)(x) = \int_{\mathbb{R}^n} \hat{\varphi}(\xi)e^{-ix\cdot\xi}\, d\xi
= \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \varphi(y)e^{-i(x+y)\cdot\xi}\, dy\, d\xi .
\]
Auxiliary Lemma. For t > 0 put $G_t(\xi) = e^{-t|\xi|^2}$, ξ ∈ Rn . Then
(1) $\hat{G}_t(\xi) = (\pi/t)^{n/2} e^{-|\xi|^2/(4t)}$ and $(2\pi)^{-n}\int_{\mathbb{R}^n} \hat{G}_t\, d\xi = 1$,
(2) $\hat{G}_t > 0$,
(3) $\sup_{|\xi| \ge \delta} \hat{G}_t(\xi) \to 0$ as $t \to 0^+$ for each fixed δ > 0,
and for f ∈ S(Rn ),
\[
(2\pi)^{-n} (\hat{G}_t * f)(x) \longrightarrow f(x) \quad \text{as } t \to 0^+,
\]
uniformly in x ∈ Rn . You might recognize the function (2π)−n Ĝt as the heat kernel from the
Part A Differential Equations course. We postpone the proof of the Auxiliary Lemma for now.
Note that
\[
\mathcal{F}^2(\varphi)(x) = \lim_{t\to 0^+} \int_{\mathbb{R}^n} \hat{\varphi}(\xi)e^{-ix\cdot\xi} G_t(\xi)\, d\xi .
\]
Now for t > 0 we have
\begin{align*}
\int_{\mathbb{R}^n} \hat{\varphi}(\xi)e^{-ix\cdot\xi} G_t(\xi)\, d\xi
&= \int_{\mathbb{R}^n} \hat{\varphi}(\xi)\, \underbrace{e^{-ix\cdot\xi - t|\xi|^2}}_{\in L^1}\, d\xi \\
&= \int_{\mathbb{R}^n} \varphi(\xi)\, \mathcal{F}_{y\to\xi}\big(e^{-ix\cdot y - t|y|^2}\big)\, d\xi && \text{(product rule)} \\
&= \int_{\mathbb{R}^n} \varphi(\xi)\, \hat{G}_t(\xi + x)\, d\xi && \text{(translation rule)} \\
&= \int_{\mathbb{R}^n} \varphi(-\eta)\, \hat{G}_t(x - \eta)\, d\eta && (\eta = -\xi) \\
&= (\hat{G}_t * \tilde{\varphi})(x).
\end{align*}
By the Auxiliary Lemma, $(2\pi)^{-n}(\hat{G}_t * \tilde{\varphi})(x) \to \tilde{\varphi}(x)$ uniformly as $t \to 0^+$, so $\mathcal{F}^2(\varphi) = (2\pi)^n \tilde{\varphi}$, or equivalently
\[
\varphi(x) = (2\pi)^{-n} \int_{\mathbb{R}^n} \hat{\varphi}(\xi)e^{ix\cdot\xi}\, d\xi,
\]
x ∈ Rn . It follows easily that F is bijective. It thus remains to prove the Auxiliary Lemma.
11 Lecture 11
Proof of the Auxiliary Lemma. We express (1)-(3) by saying that ((2π)−n Ĝt )t>0 is an approx-
imate identity, and for f ∈ S(Rn ) we have
\[
(2\pi)^{-n}(\hat{G}_t * f)(x) \longrightarrow f(x) \quad \text{as } t \to 0^+,
\]
uniformly in x.
Remark 11.1. In addition to (3) we record that also
\[
(3')\qquad \int_{|\xi| > \varepsilon} \hat{G}_t(\xi)\, d\xi \longrightarrow 0 \quad \text{as } t \to 0^+ \text{ for each fixed } \varepsilon > 0.
\]
Note that $G_t = d_{\sqrt{t}}\, G_1$, so from the dilation rule $\hat{G}_t = t^{-n/2}\, d_{t^{-1/2}} \hat{G}_1$. We can therefore focus
on the computation for t = 1. Put G ..= G1 , and note further that
\[
G(x) = e^{-|x|^2} = \prod_{j=1}^n e^{-x_j^2},
\]
hence by Fubini
\[
\hat{G}(\xi) = \prod_{j=1}^n \mathcal{F}_{x_j \to \xi_j}\big(e^{-x_j^2}\big)(\xi_j).
\]
We may therefore further assume that n = 1 and $G(x) = e^{-x^2}$, x ∈ R. In this case we compute
\[
\hat{G}(\xi) = \int_{-\infty}^{\infty} e^{-x^2} e^{-ix\xi}\, dx = e^{-\frac{1}{4}\xi^2} \int_{-\infty}^{\infty} e^{-(x + \frac{i}{2}\xi)^2}\, dx .
\]
Put
\[
F(\xi) = \int_{-\infty}^{\infty} e^{-(x + \frac{i}{2}\xi)^2}\, dx .
\]
Differentiating under the integral sign one checks that F ′ (ξ) = 0, so F (ξ) = F (0) = √π, and hence
\[
\hat{G}(\xi) = \sqrt{\pi}\, e^{-\frac{\xi^2}{4}},
\]
as required.
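The Gaussian computation above is easy to check numerically. The following sketch (ours, not from the notes) approximates $\int e^{-x^2}\cos(x\xi)\,dx$ by the trapezoid rule and compares with $\sqrt{\pi}\,e^{-\xi^2/4}$:

```python
import math

def gauss_ft(xi, n_steps=2000, L=8.0):
    # trapezoid approximation of ∫ e^{-x^2} e^{-i x xi} dx = ∫ e^{-x^2} cos(x xi) dx
    h = 2.0 * L / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        x = -L + k * h
        w = 0.5 if k in (0, n_steps) else 1.0
        total += w * math.exp(-x * x) * math.cos(x * xi)
    return total * h

for xi in (0.0, 1.0, 2.5, 5.0):
    exact = math.sqrt(math.pi) * math.exp(-xi * xi / 4.0)
    assert abs(gauss_ft(xi) - exact) < 1e-8
```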
We return to the general case and the proof of (1)-(3). Point (1) follows from the above
calculation upon performing a substitution, and (2) is clear. For (3) we note that when |ξ| > δ,
\[
|\hat{G}_t(\xi)| = \hat{G}_t(\xi) \le \Big(\frac{\pi}{t}\Big)^{n/2} e^{-\frac{\delta^2}{4t}} .
\]
For s > 0 one has $e^s > \frac{s^n}{n!}$, so $e^{-s} < \frac{n!}{s^n}$, and so
\[
\Big(\frac{\pi}{t}\Big)^{n/2} e^{-\frac{\delta^2}{4t}} < \Big(\frac{\pi}{t}\Big)^{n/2} \frac{n!\, 4^n t^n}{\delta^{2n}} = \frac{\pi^{n/2}\, n!\, 4^n}{\delta^{2n}}\, t^{n/2} \longrightarrow 0 \quad \text{as } t \to 0^+ .
\]
Finally, for f ∈ S(Rn ) we have that f is in particular uniformly continuous, so given ε > 0 we
can find δ > 0 such that |f (y) − f (x)| < ε whenever |x − y| 6 δ. As f is also bounded we get,
using (1),
\begin{align*}
\big|(2\pi)^{-n}(\hat{G}_t * f)(x) - f(x)\big| &\le (2\pi)^{-n} \int_{\mathbb{R}^n} \hat{G}_t(x-y)\, |f(y) - f(x)|\, dy \\
&\le (2\pi)^{-n} \int_{|x-y| > \delta} \hat{G}_t(x-y)\, dy \cdot 2\|f\|_\infty + (2\pi)^{-n} \int_{B_\delta(x)} \hat{G}_t(x-y)\, dy \cdot \varepsilon .
\end{align*}
By (3') the first term tends to 0 as t → 0+ uniformly in x, while the second is at most ε by (1), and the uniform convergence follows. □
Remark 11.2. When (Kt )t>0 is an approximate identity, it is not difficult to show that
\[
\|K_t * f - f\|_1 \longrightarrow 0 \quad \text{as } t \to 0^+
\]
whenever f ∈ L1 (Rn ).
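One can see this L1 convergence concretely for the heat kernel $K_t(x) = (4\pi t)^{-1/2}e^{-x^2/(4t)}$ acting on f = 1[−1,1] on R. The sketch below is our own illustration: the smoothed function $K_t * f$ has the closed form in terms of the error function used in the code, and the L1 error is approximated by a midpoint rule.

```python
import math

def smoothed_indicator(x, t):
    # (K_t * 1_{[-1,1]})(x) for the heat kernel K_t(x) = (4 pi t)^{-1/2} e^{-x^2/(4t)}
    s = 2.0 * math.sqrt(t)
    return 0.5 * (math.erf((x + 1.0) / s) - math.erf((x - 1.0) / s))

def l1_error(t, n_steps=20000, L=4.0):
    # midpoint approximation of || K_t * f - f ||_1 on [-L, L]
    h = 2.0 * L / n_steps
    total = 0.0
    for k in range(n_steps):
        x = -L + (k + 0.5) * h
        f = 1.0 if -1.0 < x < 1.0 else 0.0
        total += abs(smoothed_indicator(x, t) - f) * h
    return total

errs = [l1_error(t) for t in (0.1, 0.01, 0.001)]
assert errs[0] > errs[1] > errs[2]   # the L1 error decreases as t -> 0+
assert errs[2] < 0.1
```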
Let us summarize what we have proved so far about the Fourier transform
\[
\mathcal{F}(f)(x) = \int_{\mathbb{R}^n} f(\xi) e^{-i\xi\cdot x}\, d\xi
\]
for x ∈ Rn . Here F maps L1 (Rn ) into C0 (Rn ), but one can show that it is not onto. The
situation is better when we consider the Fourier transform on S(Rn ); then
F : S(Rn ) −→ S(Rn )
is bijective with
\[
\mathcal{F}^{-1}(g)(x) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} g(\xi)e^{ix\cdot\xi}\, d\xi,
\]
x ∈ Rn . Note that we may write this as F −1 = (2π)−n F̃, where the order in which we perform
the operations g 7→ F(g) and g 7→ g̃ is unimportant.
We return to the task of defining the Fourier transform on distributions using the adjoint
identity:
\[
\int_{\mathbb{R}^n} \varphi(x)\hat{\psi}(x)\, dx = \int_{\mathbb{R}^n} \hat{\varphi}(x)\psi(x)\, dx
\]
for ϕ, ψ ∈ S(Rn ). Observe that the distribution should be defined on S(Rn ) rather than on
D(Rn ), and since D(Rn ) ( S(Rn ), it is likely that we will have to exclude some distributions.
As with D′ (Rn ) we start with a notion of convergence on S(Rn ). In connection with this we
recall the following characterization of S(Rn ):
\[
\varphi \in \mathcal{S}(\mathbb{R}^n) \iff \varphi \in C^\infty(\mathbb{R}^n) \text{ and } \forall \alpha, \beta \in \mathbb{N}_0^n \colon\; S_{\alpha,\beta}(\varphi) := \sup_{x \in \mathbb{R}^n} |x^\alpha D^\beta \varphi(x)| < \infty .
\]
We may also replace the condition for Sα,β (ϕ) by the following apparently stronger, but in fact
equivalent, condition
\[
S_{p,q}(\varphi) := \sup_{x \in \mathbb{R}^n} |p(x)(q(D)\varphi)(x)| < \infty
\]
for all polynomials p and q on Rn . For k, l ∈ N0 put
\[
\bar{S}_{k,l}(\varphi) := \max_{|\alpha| \le k,\, |\beta| \le l} S_{\alpha,\beta}(\varphi).
\]
Note that all three of Sα,β , Sp,q , and S̄k,l are semi-norms. Observe that
\[
S_{p,q}(\varphi) \le c\, \bar{S}_{k,l}(\varphi)
\]
for suitable k, l ∈ N0 and all ϕ ∈ S(Rn ), where c is a constant that only depends on the polynomials p and q and
whose precise value is not important here.
Lemma 11.3. Let p be a polynomial on Rn of degree d. Then there exists a constant c = c(p)
such that for all α, β ∈ Nn0 and ϕ ∈ S(Rn ),
\[
S_{\alpha,\beta}(p\varphi) \le c\, \bar{S}_{|\alpha|+d,\,|\beta|}(\varphi).
\]
The proof is a simple but somewhat tedious application of the Leibniz rule and we omit the
details here.
Definition 11.4. Let ϕj , ϕ ∈ S(Rn ). Then we say ϕj converges to ϕ in the sense of the
Schwartz test functions, and write ϕj → ϕ in S(Rn ), if and only if
Sα,β (ϕ − ϕj ) −→ 0
as j → ∞ for all α, β ∈ Nn0 . This can also be stated in terms of Sp,q or S̄k,l .
Remark 11.5 (A metric on S(Rn )). Define for ϕ, ψ ∈ S(Rn )
\[
d(\varphi, \psi) := \sum_{k,l \in \mathbb{N}_0} 2^{-k-l}\, \frac{\bar{S}_{k,l}(\varphi - \psi)}{1 + \bar{S}_{k,l}(\varphi - \psi)} .
\]
Then d is a metric on S(Rn ), and we have ϕj → ϕ in S(Rn ) if and only if d(ϕj , ϕ) → 0. Note
that d is translation invariant,
\[
d(\varphi + \eta, \psi + \eta) = d(\varphi, \psi) \quad \text{for all } \varphi, \psi, \eta \in \mathcal{S}(\mathbb{R}^n),
\]
and it can be shown that (S(Rn ), d) is complete (such a space is called a Fréchet space).
We may compare this to the notion of convergence in D(Rn ). If ϕj , ϕ ∈ D(Rn ) and ϕj → ϕ
in D(Rn ), then ϕj , ϕ ∈ S(Rn ) and ϕj → ϕ in S(Rn ). The converse, however, is
clearly false.
Lemma 11.6. Let p be a polynomial on Rn . Then the maps ϕ 7→ pϕ, ϕ 7→ p(D)ϕ, and ϕ 7→ ϕ̂
are continuous with respect to the convergence on S(Rn ).
Proof. The first two follow immediately from Lemma 11.3. For the Fourier transform, note
that for all α, β ∈ Nn0 we have $\xi^\alpha D^\beta \hat{\varphi}(\xi) = (-1)^{|\alpha|+|\beta|}\, \mathcal{F}_{x\to\xi}\big(D^\alpha(x^\beta \varphi(x))\big)$, so
\begin{align*}
S_{\alpha,\beta}(\hat{\varphi}) &= \sup_\xi \big| \mathcal{F}_{x\to\xi}\big(D^\alpha(x^\beta \varphi(x))\big) \big| \\
&\le \int_{\mathbb{R}^n} |D^\alpha(x^\beta \varphi(x))|\, dx \\
&= \int_{\mathbb{R}^n} \frac{1}{1 + |x|^{2n}} \big(1 + |x|^{2n}\big)\big|D^\alpha(x^\beta \varphi(x))\big|\, dx \\
&\le \underbrace{\int_{\mathbb{R}^n} \frac{dx}{1 + |x|^{2n}}}_{=:\,c}\; \sup_x \big(1 + |x|^{2n}\big)\big|D^\alpha(x^\beta \varphi(x))\big| .
\end{align*}
Now $|x|^{2n} \le n^{n-1}\sum_{j=1}^n x_j^{2n}$, and thus
\begin{align*}
\sup_x \big(1 + |x|^{2n}\big)\big|D^\alpha(x^\beta \varphi)\big|
&\le \sup_x \Big(1 + n^{n-1}\sum_{j=1}^n x_j^{2n}\Big)\big|D^\alpha(x^\beta \varphi)\big| \\
&\le \sup_x \big|D^\alpha(x^\beta \varphi)\big| + n^{n-1} \sum_{j=1}^n \sup_x \big|x_j^{2n} D^\alpha(x^\beta \varphi)\big| \\
&\overset{\text{Lemma 11.3}}{\le} c_\beta \bar{S}_{|\beta|,|\alpha|}(\varphi) + n^n c_\beta \bar{S}_{2n+|\beta|,|\alpha|}(\varphi) \\
&\le C\, \bar{S}_{2n+|\beta|,|\alpha|}(\varphi),
\end{align*}
where C = (1 + nn )cβ . We have thus shown that
\[
S_{\alpha,\beta}(\hat{\varphi}) \le c\, C\, \bar{S}_{2n+|\beta|,|\alpha|}(\varphi),
\]
and hence F is continuous.
Definition 11.7. A tempered distribution on Rn is a linear functional u : S(Rn ) → R (or C)
that is S-continuous: hu, ϕj i → hu, ϕi
whenever ϕj → ϕ in S(Rn ). The set of all tempered distributions is denoted by S ′ (Rn ). It is
clearly a vector space with the usual definitions of vector space operations.
Remark 11.8. Since D(Rn ) ⊂ S(Rn ) and D-convergence implies S-convergence, it follows
that S ′ (Rn ) ⊂ D′ (Rn ). The space D′ (Rn ) is genuinely larger than S ′ (Rn ): for example
$u = e^{x^2} \in L^1_{loc}(\mathbb{R}) \subset \mathcal{D}'(\mathbb{R})$ by the rule
\[
\langle u, \varphi \rangle = \int_{-\infty}^{\infty} e^{x^2} \varphi(x)\, dx .
\]
However, u ∈/ S ′ (R): for instance $\varphi = e^{-x^2} \in \mathcal{S}(\mathbb{R})$ but $u\varphi = 1_{\mathbb{R}} \notin L^1(\mathbb{R})$.
Definition 11.9 (Convergence of Tempered Distributions). For a sequence (uj ) in S ′ (Rn ) and
u ∈ S ′ (Rn ) we write
uj −→ u in S ′ (Rn )
if and only if huj , ϕi → hu, ϕi for each fixed ϕ ∈ S(Rn ).
Definition 11.11. For α ∈ Nn0 , a polynomial p ∈ C[x], and u ∈ S ′ (Rn ) we define the tempered
distributions Dα u, pu, and û by the rules
\[
\langle D^\alpha u, \varphi \rangle := (-1)^{|\alpha|} \langle u, D^\alpha \varphi \rangle, \qquad
\langle pu, \varphi \rangle := \langle u, p\varphi \rangle, \qquad
\langle \hat{u}, \varphi \rangle := \langle u, \hat{\varphi} \rangle
\]
for ϕ ∈ S(Rn ).
Theorem 11.12 (Fourier Inversion Formula on S ′ (Rn )). The map F : S ′ (Rn ) → S ′ (Rn ) is a
linear bijection with inverse F −1 = (2π)−n F̃.
Proof. By chasing definitions and using the Fourier Inversion Formula on S(Rn ), for u ∈ S ′ (Rn )
and ϕ ∈ S(Rn ) we have
h(2π)−n F̃Fu, ϕi = hu, (2π)−n F F̃ϕi = hu, ϕi = hu, (2π)−n F̃Fϕi = h(2π)−n F F̃u, ϕi.
Example 11.13. Let u ∈ Lp (Rn ), p ∈ [1, ∞]. Then u ∈ S ′ (Rn ) by the rule
\[
\langle u, \varphi \rangle = \int_{\mathbb{R}^n} u\varphi\, dx ,
\]
which is well-defined by Hölder's inequality since ϕ ∈ Lq (Rn ) for the conjugate exponent q.
Indeed, if 1 6 q < ∞, then
\begin{align*}
\|\varphi\|_q^q &= \int_{\mathbb{R}^n} \frac{1}{1 + |x|^{2nq}} \big(1 + |x|^{2nq}\big)|\varphi|^q\, dx \\
&\le \int_{\mathbb{R}^n} \frac{dx}{1 + |x|^{2nq}} \cdot \sup_x \big(1 + |x|^{2nq}\big)|\varphi(x)|^q \\
&\le \int_{\mathbb{R}^n} \frac{dx}{1 + |x|^{2nq}} \Big( (\sup_x |\varphi|)^q + (\sup_x |x|^{2n}|\varphi(x)|)^q \Big) \\
&\le \int_{\mathbb{R}^n} \frac{dx}{1 + |x|^{2nq}} \Big( S_{0,0}(\varphi)^q + S_{|x|^{2n},0}(\varphi)^q \Big) < \infty .
\end{align*}
If q = ∞, then kϕk∞ = S0,0 (ϕ). It follows that hu, ϕi is well-defined and continuous: if ϕj → ϕ
in S(Rn ), then S0,0 (ϕj − ϕ) → 0, S|x|2n ,0 (ϕj − ϕ) → 0 and hence hu, ϕj − ϕi → 0. This implies
continuity by linearity of u.
The example with $u = e^{x^2}$ above shows that we cannot expect to make sense of general $L^p_{loc}$
functions as tempered distributions.
Example 11.14. For x0 ∈ Rn the Dirac delta δx0 is a tempered distribution, with
\[
\langle \delta_{x_0}, \varphi \rangle = \varphi(x_0)
\]
for all ϕ ∈ S(Rn ). Recall that we denoted by S̄k,l (ϕ) = max|α|6k, |β|6l Sα,β (ϕ) and
Sα,β (ϕ) = supx |xα Dβ ϕ(x)|.
Remark 11.15. Tempered distributions have finite order. More precisely, a linear functional
u on S(Rn ) is S-continuous if and only if there exist constants c > 0 and k, l ∈ N0 such that
\[
|\langle u, \varphi \rangle| \le c\, \bar{S}_{k,l}(\varphi) \tag{6}
\]
for all ϕ ∈ S(Rn ).
Proof. The 'if' part is clear. To prove the 'only if' statement, assume u is S-continuous but
that (6) fails for all c = k = l = j ∈ N: there exist ϕj ∈ S(Rn ) such that
\[
|\langle u, \varphi_j \rangle| > j\, \bar{S}_{j,j}(\varphi_j).
\]
Clearly ϕj 6= 0, so S̄j,j (ϕj ) > 0 and we may define
\[
\psi_j = \frac{\varphi_j}{j\, \bar{S}_{j,j}(\varphi_j)} \in \mathcal{S}(\mathbb{R}^n).
\]
For α, β ∈ Nn0 we have for |α|, |β| 6 j that Sα,β (ψj ) 6 j −1 → 0, hence ψj → 0 in S(Rn ) and so
hu, ψj i → 0. But this contradicts |hu, ψj i| > 1.
12 Lecture 12
Definition 12.4. A function a ∈ C ∞ (Rn ) is said to be of moderate growth if and only if a and
all partial derivatives Dα a, α ∈ Nn0 , have polynomial growth: ∀α ∈ Nn0 ∃pα ∈ R[x] such that
\[
|D^\alpha a(x)| \le |p_\alpha(x)|
\]
for all x ∈ Rn .
Lemma 12.6. Let a ∈ C ∞ (Rn ) be of moderate growth and u ∈ S ′ (Rn ). Then aϕ ∈ S(Rn )
for all ϕ ∈ S(Rn ), the rule hau, ϕi ..= hu, aϕi defines au ∈ S ′ (Rn ), and the map u 7→ au is
linear and S ′ -continuous.
Proof. Fix α, β ∈ Nn0 . Then for ϕ ∈ S(Rn ) we compute using the Leibniz rule
\[
x^\alpha D^\beta(a\varphi) = \sum_{\gamma \le \beta} \binom{\beta}{\gamma} (D^\gamma a)\, x^\alpha (D^{\beta-\gamma}\varphi).
\]
For each γ 6 β, Dγ a has polynomial growth so we can find constants cγ > 0, mγ ∈ N0 such
that
\[
|D^\gamma a(x)| \le c_\gamma (1 + |x|)^{m_\gamma}
\]
for all x ∈ Rn . Here
\[
(1 + |x|)^{m_\gamma} \le (1 + |x|)^{2m_\gamma} \le 2^{2m_\gamma - 1}\big(1 + |x|^{2m_\gamma}\big)
\]
and
\[
|x|^{2m_\gamma} = (x_1^2 + \dots + x_n^2)^{m_\gamma} \le n^{m_\gamma - 1}\big(x_1^{2m_\gamma} + \dots + x_n^{2m_\gamma}\big),
\]
and so
\[
|D^\gamma a(x)| \le 2^{2m_\gamma - 1} c_\gamma + 2^{2m_\gamma - 1} n^{m_\gamma - 1} c_\gamma \sum_{j=1}^n x_j^{2m_\gamma}.
\]
Put m̄ ..= maxγ6β mγ and choose a constant c̄ > 0 accordingly; then
\begin{align*}
S_{\alpha,\beta}(a\varphi) &\le \bar{c} \sum_{\gamma \le \beta} \binom{\beta}{\gamma} \sup_x \Big(1 + \sum_{j=1}^n x_j^{2\bar{m}}\Big)\big|x^\alpha D^{\beta-\gamma}\varphi\big| \\
&\le \bar{c} \sum_{\gamma \le \beta} \binom{\beta}{\gamma} \Big( S_{\alpha,\beta-\gamma}(\varphi) + n\, \bar{S}_{|\alpha|+2\bar{m},|\beta|}(\varphi) \Big) \\
&\le \bar{c} \sum_{\gamma \le \beta} \binom{\beta}{\gamma} (n+1)\, \bar{S}_{|\alpha|+2\bar{m},|\beta|}(\varphi) \\
&\le c\, \bar{S}_{|\alpha|+2\bar{m},|\beta|}(\varphi).
\end{align*}
It follows that ϕ 7→ aϕ is S-continuous and hence au ∈ S ′ (Rn ), since the linearity of au is clear
once we know it is well-defined. Next, u 7→ au is clearly linear and S ′ -continuous: the latter
follows by definition-chasing. Indeed, when uj → u in S ′ (Rn ), then
\[
\langle a u_j, \varphi \rangle = \langle u_j, a\varphi \rangle \longrightarrow \langle u, a\varphi \rangle = \langle au, \varphi \rangle . \qquad\square
\]
Lemma 12.7. If u ∈ S ′ (Rn ) and θ ∈ S(Rn ), then u ∗ θ can be defined by the adjoint identity
scheme as
\[
\langle u * \theta, \varphi \rangle = \langle u, \tilde{\theta} * \varphi \rangle
\]
for all ϕ ∈ S(Rn ). Furthermore, u ∗ θ is a C ∞ function of moderate growth and is given by
\[
(u * \theta)(x) = \langle u, \theta(x - \cdot) \rangle
\]
for x ∈ Rn .
Remark 12.8. We have not emphasized it so far, but since the convolution product is commu-
tative on S(Rn ), ϕ ∗ ψ = ψ ∗ ϕ for ϕ, ψ ∈ S(Rn ), we also have u ∗ θ = θ ∗ u. The proof is easy
(exercise).
Definition 12.9. We can define convolutions of u, v ∈ S ′ (Rn ) provided v̂ is a C ∞ function of
moderate growth:
\[
u * v := \mathcal{F}^{-1}(\hat{u}\hat{v}).
\]
This is a good definition by virtue of the Fourier Inversion Formula on S ′ (Rn ) and Lemma 12.6.
The various rules for the Fourier transform continue to hold for tempered distributions.
Theorem 12.10 (Convolution Rule). Let u, v ∈ S ′ (Rn ) and assume v is a C ∞ function of
moderate growth. Then uv ∈ S ′ (Rn ) and
\[
\widehat{uv} = (2\pi)^{-n}\, \hat{u} * \hat{v}.
\]
Proof. By the Inversion Formula we may define
\[
\hat{u} * \hat{v} := \mathcal{F}^{-1}\big( \hat{\hat{u}}\, \hat{\hat{v}}\, \big),
\]
where $\hat{\hat{u}} = (2\pi)^n \tilde{u}$, $\hat{\hat{v}} = (2\pi)^n \tilde{v}$, and the latter is clearly a C ∞ function of moderate growth.
Hence $\hat{\hat{u}}\,\hat{\hat{v}} \in \mathcal{S}'(\mathbb{R}^n)$ by Lemma 12.6, and using $\tilde{u}\tilde{v} = \widetilde{uv}$ we get
\[
\hat{u} * \hat{v} = \mathcal{F}^{-1}\big( (2\pi)^{2n}\, \widetilde{uv} \big) = (2\pi)^{-n}\, \tilde{\mathcal{F}}\big( (2\pi)^{2n}\, \widetilde{uv} \big) = (2\pi)^n\, \widehat{uv}. \qquad\square
\]
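A concrete check of the Convolution Rule in the simplest case n = 1: take u = v = e^{-x²}, so that (by the Gaussian computation of Lecture 11) û(ξ) = √π e^{-ξ²/4} and uv = e^{-2x²} has Fourier transform √(π/2) e^{-ξ²/8}. The sketch below (ours, not part of the notes) evaluates û * v̂ numerically and verifies $\widehat{uv} = (2\pi)^{-1}\,\hat u * \hat v$:

```python
import math

def u_hat(xi):
    # û(ξ) = √π e^{-ξ²/4} for u(x) = e^{-x²}  (Lecture 11)
    return math.sqrt(math.pi) * math.exp(-xi * xi / 4.0)

def uv_hat(xi):
    # Fourier transform of uv = e^{-2x²}: √(π/2) e^{-ξ²/8}
    return math.sqrt(math.pi / 2.0) * math.exp(-xi * xi / 8.0)

def conv_of_hats(xi, n=4000, L=30.0):
    # trapezoid approximation of (û * û)(xi) = ∫ û(η) û(xi - η) dη
    h = 2.0 * L / n
    total = 0.0
    for k in range(n + 1):
        eta = -L + k * h
        w = 0.5 if k in (0, n) else 1.0
        total += w * u_hat(eta) * u_hat(xi - eta)
    return total * h

for xi in (0.0, 0.7, 1.9):
    assert abs(uv_hat(xi) - conv_of_hats(xi) / (2.0 * math.pi)) < 1e-8
```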
Theorem 12.11 (Plancherel's Theorem). One has
\[
\int_{\mathbb{R}^n} f \bar{g}\, dx = (2\pi)^{-n} \int_{\mathbb{R}^n} \hat{f}\, \overline{\hat{g}}\, dx \tag{7}
\]
for f , g ∈ L2 (Rn ).
Proof. We first check (7) for ϕ, ψ ∈ S(Rn ). This follows from the Product Rule and the Inversion Formula on S(Rn ): clearly ψ̄ ∈ S(Rn ),
so F −1 (ψ̄) ∈ S(Rn ) and so
\[
\int \varphi \bar{\psi}\, dx = \int \varphi\, \mathcal{F}(\mathcal{F}^{-1}\bar{\psi})\, dx = \int \hat{\varphi}\, \mathcal{F}^{-1}(\bar{\psi})\, dx .
\]
Now
\[
\mathcal{F}^{-1}(\bar{\psi})(x) = (2\pi)^{-n} \int \bar{\psi}(y)e^{ix\cdot y}\, dy = (2\pi)^{-n}\, \overline{\int \psi(y)e^{-ix\cdot y}\, dy} = (2\pi)^{-n}\, \overline{\hat{\psi}(x)} .
\]
If now f ∈ L2 (Rn ) we know that there exist fj ∈ D(Rn ) ⊂ S(Rn ) so kf − fj kL2 → 0. Clearly
this means in particular that fj → f in S ′ (Rn ), and thus by S ′ -continuity of the Fourier
transform, fˆj → fˆ in S ′ (Rn ). By (7) we see that
Z Z
|fˆj − fˆk |2 dξ = (2π)n |fj − fk |2 dx,
Rn Rn
so (fˆj ) is Cauchy. It is thus convergent in L2 (by Riesz–Fischer), fˆj → g in L2 (Rn ) for some
g ∈ L2 (Rn ). Clearly then fˆj → g in S ′ (Rn ) too, and so g = fˆ.
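With f = g, (7) reads $\|\hat f\|_2^2 = (2\pi)^n \|f\|_2^2$. For the Gaussian f(x) = e^{-x²} on R, both sides can be evaluated in closed form using $\hat f(\xi) = \sqrt{\pi}\,e^{-\xi^2/4}$ from Lecture 11; the following sketch (ours) checks the constant:

```python
import math

def l2_norm_sq_of_gaussian(a):
    # ∫ |e^{-a x^2}|^2 dx = ∫ e^{-2 a x^2} dx = sqrt(pi / (2a))
    return math.sqrt(math.pi / (2.0 * a))

# f(x) = e^{-x^2}, f̂(ξ) = √π e^{-ξ²/4} = √π e^{-(1/4) ξ²}
lhs = math.pi * l2_norm_sq_of_gaussian(0.25)        # ∫ |f̂|² dξ
rhs = 2.0 * math.pi * l2_norm_sq_of_gaussian(1.0)   # (2π) ∫ |f|² dx
assert abs(lhs - rhs) < 1e-12
```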
What happens on the other Lp spaces?
Remark 12.13. For p > 2 the image F(Lp (Rn )) contains tempered distributions of positive
orders.
13 Lecture 13
Recall that a partial differential operator (PDO) with constant coefficients can be written as
\[
P(D) = \sum_{|\alpha| \le m} a_\alpha D^\alpha
\]
with constant coefficients aα .
Definition 13.1. A fundamental solution for P (D) is any E ∈ S ′ (Rn ) such that P (D)E = δ0 .
Example 13.2. Recall from Problem Sheet 1 that $E(x) = \frac{x_+^m}{m!}$ satisfies
\[
\frac{d^{m+1}}{dx^{m+1}} E = \delta_0
\]
in D′ (R), and from Problem Sheet 2 that $E(x) = \frac{1}{2\pi}\log|x|$ satisfies
\[
\Delta E = \delta_0
\]
in D′ (R2 ). It is not difficult to check that in both cases the distribution E is tempered and so
is a fundamental solution.
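The one-dimensional case in Example 13.2 with m = 1 can be tested numerically: $E'' = \delta_0$ for E(x) = x₊ means $\langle E, \varphi''\rangle = \int_0^\infty x\,\varphi''(x)\,dx = \varphi(0)$ for every test function. The sketch below (our own, using the Gaussian ϕ(x) = e^{-x²} as a stand-in test function) checks this pairing by a midpoint rule:

```python
import math

def pair_with_second_derivative(phi_dd, n_steps=8000, L=8.0):
    # midpoint approximation of  <E, phi''> = ∫_0^∞ x phi''(x) dx  for E(x) = x_+
    h = L / n_steps
    total = 0.0
    for k in range(n_steps):
        x = (k + 0.5) * h
        total += x * phi_dd(x) * h
    return total

# phi(x) = e^{-x^2}, so phi(0) = 1 and phi''(x) = (4x^2 - 2) e^{-x^2}
phi_dd = lambda x: (4.0 * x * x - 2.0) * math.exp(-x * x)
assert abs(pair_with_second_derivative(phi_dd) - 1.0) < 1e-5
```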
We refer to
\[
P(\xi) = \sum_{|\alpha| \le m} a_\alpha \xi^\alpha
\]
as the symbol of the PDO P (D), and, provided P (D) has order m, we call
\[
\sum_{|\alpha| = m} a_\alpha \xi^\alpha
\]
the principal symbol of P (D).
Definition 13.3. A PDO
\[
P(D) = \sum_{|\alpha| \le m} a_\alpha D^\alpha
\]
of order m is called elliptic if its principal symbol does not vanish on Rn \ {0}. Standard
examples are the Laplacian and, on R2 ≃ C, the Cauchy-Riemann operator and its conjugate,
\[
\Delta = D_1^2 + \dots + D_n^2, \qquad \frac{\partial}{\partial\bar{z}}, \qquad \frac{\partial}{\partial z}.
\]
They are of orders 2 and 1 respectively, and their principal symbols are
\[
-|\xi|^2, \qquad -\frac{1}{2}(\xi_1 + i\xi_2), \qquad -\frac{1}{2}(\xi_1 - i\xi_2),
\]
respectively. Clearly the condition for ellipticity is satisfied by each of them.
Example 13.4. Recall that on R2 ≃ C we have
\[
\Delta = 4\frac{\partial^2}{\partial z \partial \bar{z}},
\]
and $\frac{1}{2\pi}\log|z|$ is a fundamental solution for ∆:
\begin{align*}
\delta_0 = \Delta\Big( \frac{1}{2\pi}\log|z| \Big)
&= \frac{2}{\pi}\, \frac{\partial^2}{\partial\bar{z}\partial z}\big( \log|z| \big) \\
&= \frac{1}{\pi}\, \frac{\partial}{\partial\bar{z}} \frac{\partial}{\partial z} \log(z\bar{z}) \\
&= \frac{1}{\pi}\, \frac{\partial}{\partial\bar{z}} \Big( \bar{z}\, \frac{1}{z\bar{z}} \Big) \\
&= \frac{\partial}{\partial\bar{z}}\, \frac{1}{\pi z},
\end{align*}
so (πz)−1 is a fundamental solution for $\frac{\partial}{\partial\bar{z}}$.
Theorem 13.5. Assume E is a fundamental solution for a PDO P (D). Then for f ∈ S(Rn )
the general solution to
P (D)u = f
in S ′ (Rn ) is given by
u = E ∗ f + h,
where h ∈ S ′ (Rn ) ∩ ker P (D).
Remark 13.6. We can allow any f ∈ S ′ (Rn ) on the right-hand side for which we can define
E ∗ f as a tempered distribution (for instance, if fˆ is a moderate C ∞ function, but also much
more generally; we have not defined convolutions in full generality).
Proof. If u = E ∗ f , then
\[
P(D)(E * f) = (P(D)E) * f = \delta_0 * f = f,
\]
so E ∗ f is a particular solution, and any two solutions differ by some h ∈ S ′ (Rn ) with P (D)h = 0. □
Remark 13.8. Note that E is C ∞ away from zero, E ∈ L1loc (Rn ) ∩ S ′ (Rn ), and that
\[
D_j E = \frac{1}{\omega_{n-1}}\, x_j |x|^{-n} \in L^1_{loc}(\mathbb{R}^n).
\]
The constant ωn−1 is the surface area of Sn−1 in Rn . One can show that ωn−1 = nLn (B1 (0)),
and
\[
\mathcal{L}^n(B_1(0)) = \frac{\pi^{\frac{n}{2}}}{\Gamma\big(\frac{n}{2} + 1\big)}
\]
for all n ∈ N. In particular, ω1 = 2π and ω2 = 4π. Summarizing:
• in n = 1 the Laplacian $\frac{d^2}{dx^2}$ has fundamental solution x+ ,
• in n = 2 the Laplacian $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$ has fundamental solution $\frac{1}{2\pi}\log\sqrt{x^2 + y^2}$. This is
called the logarithmic potential ;
• in n = 3 the Laplacian $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$ has fundamental solution
\[
-\frac{1}{4\pi}\, \frac{1}{\sqrt{x^2 + y^2 + z^2}} .
\]
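The n = 2 case can also be tested numerically: $\Delta E = \delta_0$ for $E = \frac{1}{2\pi}\log|x|$ means $\langle E, \Delta\varphi\rangle = \varphi(0)$. The sketch below (ours, not from the notes) pairs the logarithmic potential with ∆ϕ for the test function ϕ(x, y) = e^{-(x²+y²)}, for which ∆ϕ = (4r² − 4)e^{-r²} and ϕ(0) = 1, using a midpoint rule on an offset grid that avoids the (integrable) singularity at the origin:

```python
import math

def pair_logpotential_with_laplacian(n=500, L=6.0):
    # midpoint approximation of ∫∫ (1/2π) log|x| · Δφ dx over [-L, L]^2
    h = 2.0 * L / n
    total = 0.0
    for i in range(n):
        x = -L + (i + 0.5) * h
        for j in range(n):
            y = -L + (j + 0.5) * h
            r2 = x * x + y * y
            # (1/2π) log sqrt(r2) = log(r2) / (4π);  Δφ = (4 r² - 4) e^{-r²}
            total += (math.log(r2) / (4.0 * math.pi)) * (4.0 * r2 - 4.0) * math.exp(-r2) * h * h
    return total

val = pair_logpotential_with_laplacian()
assert abs(val - 1.0) < 0.05   # should approximate φ(0) = 1
```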
Taking Fourier transforms in ∆E = δ0 gives −|ξ|2 Ê = 1 in S ′ (Rn ), so any fundamental solution satisfies
\[
\hat{E}(\xi) = -\frac{1}{|\xi|^2} + \hat{T}
\]
for some T ∈ S ′ (Rn ) with ∆T = 0. This means that −|ξ|2 T̂ = 0 in S ′ (Rn ). Hence if
ϕ ∈ S(Rn ) and 0 ∈/ supp ϕ, then the function ψ given by
\[
\psi(\xi) = -\frac{\varphi(\xi)}{|\xi|^2}
\]
for ξ 6= 0 and ψ(0) = 0 belongs to S(Rn ), and so
\[
\langle \hat{T}, \varphi \rangle = \langle -|\xi|^2 \hat{T}, \psi \rangle = 0.
\]
We express this by supp T̂ = {0}, that is T̂ has support {0}. We discuss below how this implies
that T̂ ∈ span{Dα δ0 : α ∈ Nn0 }, and hence that T ∈ span{(2π)−n (ix)α : α ∈ Nn0 } = C[x].
Since also ∆T = 0, we see that T must be a harmonic polynomial. Note that implicit in this is
the Liouville-type result saying that if T ∈ S ′ (Rn ) is harmonic, then T is a polynomial. Now
we return to the quest for fundamental solutions:
\[
\hat{E}(\xi) = -\frac{1}{|\xi|^2} + \hat{T}(\xi).
\]
We only need one, so consider $\hat{E} = -\frac{1}{|\xi|^2}$. The result then follows from the following.
Lemma 13.9 (Auxiliary Lemma). Let α ∈ (−n, 0) and put f (x) = |x|α . Then f ∈ L1loc (Rn ) ∩
S ′ (Rn ) and fˆ(ξ) = c(n, α)|ξ|−n−α , where
\[
c(n, \alpha) = 2^{\alpha + n}\, \pi^{\frac{n}{2}}\, \frac{\Gamma\big(\frac{n+\alpha}{2}\big)}{\Gamma\big(-\frac{\alpha}{2}\big)} .
\]
Proof. We start with the observation that for x 6= 0
\[
|x|^\alpha\, \Gamma\Big(-\frac{\alpha}{2}\Big) = |x|^\alpha \int_0^\infty t^{-\frac{\alpha}{2} - 1} e^{-t}\, dt = \int_0^\infty s^{-\frac{\alpha}{2} - 1} e^{-s|x|^2}\, ds,
\]
where we substituted t = s|x|2 . The Riemann sums of the last integral converge in the S ′ (Rn )
sense as the mesh size tends to zero, and consequently we get by S ′ -continuity and linearity of
F that
\begin{align*}
\mathcal{F}_{x\to\xi}(|x|^\alpha) &= \frac{1}{\Gamma(-\frac{\alpha}{2})} \int_0^\infty s^{-\frac{\alpha}{2} - 1}\, \mathcal{F}_{x\to\xi}\big(e^{-s|x|^2}\big)\, ds \\
&= \frac{1}{\Gamma(-\frac{\alpha}{2})} \int_0^\infty s^{-\frac{\alpha}{2} - 1} \Big(\frac{\pi}{s}\Big)^{\frac{n}{2}} e^{-\frac{|\xi|^2}{4s}}\, ds \\
&= \frac{\pi^{\frac{n}{2}}\, |\xi|^{-n-\alpha}}{\Gamma(-\frac{\alpha}{2})}\, 2^{n+\alpha} \int_0^\infty t^{\frac{n+\alpha}{2} - 1} e^{-t}\, dt \\
&= 2^{n+\alpha}\, \pi^{\frac{n}{2}}\, \frac{\Gamma\big(\frac{n+\alpha}{2}\big)}{\Gamma\big(-\frac{\alpha}{2}\big)}\, |\xi|^{-n-\alpha}. \qquad\square
\end{align*}
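The subordination identity $|x|^\alpha\,\Gamma(-\alpha/2) = \int_0^\infty s^{-\alpha/2-1}e^{-s|x|^2}\,ds$ that drives the proof is easy to verify numerically in n = 1. The sketch below (ours) evaluates the s-integral by the substitution s = e^u followed by the trapezoid rule:

```python
import math

def subordination_lhs(x, alpha):
    return abs(x) ** alpha * math.gamma(-alpha / 2.0)

def subordination_rhs(x, alpha, n_steps=5000):
    # ∫_0^∞ s^{-alpha/2 - 1} e^{-s x^2} ds  via the substitution s = e^u
    lo, hi = -200.0, 50.0
    h = (hi - lo) / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        u = lo + k * h
        s = math.exp(u)
        w = 0.5 if k in (0, n_steps) else 1.0
        # ds = s du, so the integrand in u is s^{-alpha/2} e^{-s x^2}
        total += w * s ** (-alpha / 2.0) * math.exp(-s * x * x)
    return total * h

for alpha in (-0.3, -0.5, -0.9):
    for x in (0.5, 1.0, 2.0):
        assert abs(subordination_lhs(x, alpha) - subordination_rhs(x, alpha)) < 1e-5
```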
Theorem 13.10. If u ∈ D′ (Ω) and for each x ∈ Ω there exists rx > 0 such that hu, ϕi = 0
for all ϕ ∈ D(Ω ∩ Brx (x)), then u = 0.
Remark 13.11. For an open subset ω ⊂ Ω we define the restriction of u ∈ D′ (Ω) to ω, denoted
u|ω , by
hu|ω , ϕi ..= hu, ϕi
for all ϕ ∈ D(ω). Clearly u|ω ∈ D′ (ω), and the assumption in the above theorem is that
u|Ω∩Brx (x) = 0.
Proof. (Not examinable) Let ϕ ∈ D(Ω). Since we clearly have
\[
\operatorname{supp}\varphi \subset \bigcup \{\Omega \cap B_{r_x}(x) : x \in \Omega\},
\]
the compactness of supp ϕ means that we can find a finite subcover, say
\[
\operatorname{supp}\varphi \subset \big(\Omega \cap B_{r_{x_1}}(x_1)\big) \cup \dots \cup \big(\Omega \cap B_{r_{x_m}}(x_m)\big).
\]
Using Theorem 2.8 we find a partition of unity φ1 , . . . , φm ∈ D(Ω) with supp φj ⊂ Ω∩Brxj (xj ),
0 6 φj 6 1, and $\sum_{j=1}^m \phi_j = 1$ on supp ϕ. Thus
\[
\langle u, \varphi \rangle = \Big\langle u, \sum_{j=1}^m \varphi\phi_j \Big\rangle = \sum_{j=1}^m \langle u, \varphi\phi_j \rangle = 0. \qquad\square
\]
Thus x ∈ Ω \ supp u if and only if there exists r > 0 such that u|Ω∩Br (x) = 0. Consequently
the above theorem means in particular that Ω \ supp u must be the largest open subset of Ω on
which u vanishes. It also follows from this that the support of u is a relatively closed subset
of Ω.
Theorem 13.13. Let u ∈ D′ (Ω) and x0 ∈ Ω. If supp u = {x0 }, then u ∈ span{Dα δx0 : α ∈
Nn0 }.
A continuous function u ∈ C(Ω) may be regarded as a distribution by the rule
\[
\langle u, \varphi \rangle = \int_\Omega u\varphi\, dx
\]
for ϕ ∈ D(Ω). Its support as a distribution coincides with the above definition of the support
for a continuous function, and we may therefore use the same notation for both. In order to
justify this, let D be the support of the distribution. If ϕ ∈ D(Ω) and supp ϕ ⊂ Ω \ supp u,
then
\[
\langle u, \varphi \rangle = \int_\Omega u\varphi\, dx = 0,
\]
so the open set Ω \ supp u must be contained in the largest open set on which u vanishes, that
is Ω \ supp u ⊂ Ω \ D, i.e. D ⊂ supp u. Conversely, if x0 ∈ supp u \ D, then we find ϕ ∈ D(Ω)
supported near x0 and in Ω \ D such that hu, ϕi 6= 0. But u vanishes as a distribution on the
open set Ω \ D, a contradiction, and so supp u = D.
14 Lecture 14
Theorem 14.1 (Weyl’s Lemma). Assume u ∈ D′ (Ω) and ∆u = 0 in D′ (Ω). Then u ∈ C ∞ (Ω)
and u is harmonic.
Corollary 14.2. Let Ω ⊂ C be open and assume f ∈ D′ (Ω) satisfies
∂f
=0
∂ z̄
in D′ (Ω) . Then f is holomorphic.
Proof of Weyl’s Lemma. Let (ρε )ε>0 be the standard mollifier. Fix Ω′ ⋐ Ω and put ε0 =
dist(Ω′ , ∂Ω). For each x ∈ Ω′ and ε ∈ (0, ε0 ) the function
y 7−→ ρε (x − y)
belongs to D(Ω) and so we may consider hu, ρε (x − ·)i. We assert that it is independent of
ε ∈ (0, ε0 ). To prove it we calculate $\frac{d}{d\varepsilon}\rho_\varepsilon(x - y)$ for x, y ∈ Rn . Recall that
\[
\rho_\varepsilon(x - y) = \varepsilon^{-n} \rho\Big(\frac{x - y}{\varepsilon}\Big)
\]
and that ρ(x) = θ(|x|2 ) (since ρ was a smooth radial function), where θ ∈ C ∞ (R) satisfies
θ(t) = 0 for t > 1. Now calculate
\begin{align*}
\frac{d}{d\varepsilon}\Big( \varepsilon^{-n}\rho\Big(\frac{x-y}{\varepsilon}\Big) \Big)
&= -n\varepsilon^{-n-1}\rho\Big(\frac{x-y}{\varepsilon}\Big) - \varepsilon^{-n}\nabla\rho\Big(\frac{x-y}{\varepsilon}\Big)\cdot\frac{x-y}{\varepsilon^2} \\
&= -\frac{1}{\varepsilon^{n+1}} \Big( n\rho\Big(\frac{x-y}{\varepsilon}\Big) + \nabla\rho\Big(\frac{x-y}{\varepsilon}\Big)\cdot\frac{x-y}{\varepsilon} \Big).
\end{align*}
Put K(x) = −nρ(x) − ∇ρ(x) · x, so that
\[
\frac{d}{d\varepsilon}\Big( \varepsilon^{-n}\rho\Big(\frac{x-y}{\varepsilon}\Big) \Big) = \frac{1}{\varepsilon^{n+1}}\, K\Big(\frac{x-y}{\varepsilon}\Big).
\]
We now use that ρ(x) = θ(|x|2 ). Hereby
\[
K(x) = -\operatorname{div}\big(\rho(x)x\big) = -\operatorname{div}\big(\theta(|x|^2)x\big),
\]
and if we set
\[
\Theta(t) = \frac{1}{2}\int_t^\infty \theta(s)\, ds,
\]
then Θ ∈ C ∞ (R) with Θ(t) = 0 for t > 1 and $\Theta'(t) = -\frac{1}{2}\theta(t)$. Consequently
\[
-\theta(|x|^2)\,x = \nabla\big(\Theta(|x|^2)\big),
\]
and so K(x) = div ∇ Θ(|x|2 ) = (∆Φ)(x), where Φ(x) = Θ(|x|2 ). Observe that Φ ∈ D(B1 (0)),
and
\[
\frac{1}{\varepsilon^{n+1}}\, K\Big(\frac{x-y}{\varepsilon}\Big)
= \frac{1}{\varepsilon^{n+1}}\, (\Delta\Phi)\Big(\frac{x-y}{\varepsilon}\Big)
= \Delta_y\Big( \varepsilon^{1-n}\, \Phi\Big(\frac{x-y}{\varepsilon}\Big) \Big).
\]
Here $y \mapsto \varepsilon^{1-n}\Phi\big(\frac{x-y}{\varepsilon}\big)$ is supported in Bε (x) ⊂ Ω, and so by assumption
\[
\Big\langle u,\; \Delta_y\Big( \varepsilon^{1-n}\Phi\Big(\frac{x-\cdot}{\varepsilon}\Big) \Big) \Big\rangle = 0,
\]
since ∆u = 0
in D′ (Ω), provided x ∈ Ω′ and 0 < ε < ε0 . But then $\frac{d}{d\varepsilon}\langle u, \rho_\varepsilon(x - \cdot)\rangle = 0$, and so hu, ρε (x − ·)i = hu, ρε1 (x − ·)i
for all ε ∈ (0, ε0 ), where ε1 ∈ (0, ε0 ). Now let ϕ ∈ D(Ω′ ). Then, by the usual trick when
convolving distributions with test functions,
\[
\int_{\Omega'} \langle u, \rho_\varepsilon(x - \cdot)\rangle \varphi(x)\, dx = \Big\langle u,\; \int_{\Omega'} \rho_\varepsilon(x - \cdot)\varphi(x)\, dx \Big\rangle = \langle u, \rho_\varepsilon * \varphi \rangle .
\]
Hence, as ρε ∗ ϕ → ϕ in D(Ω) as ε → 0+ , we get
\[
\langle u, \varphi \rangle = \int_{\Omega'} \langle u, \rho_{\varepsilon_1}(x - \cdot)\rangle \varphi(x)\, dx .
\]
Thus u coincides on Ω′ with the C ∞ function x 7→ hu, ρε1 (x − ·)i. Since Ω′ ⋐ Ω was arbitrary,
u ∈ C ∞ (Ω), and then ∆u = 0 holds classically. □
Remark 14.3. The above proof is inspired by the mean value property that is known to char-
acterize harmonic functions in the following sense. Let h ∈ C(Ω). Then h is harmonic in the
usual sense (h ∈ C 2 (Ω) and ∆h = 0) if and only if for all balls Br (x0 ) ⋐ Ω we have
\[
h(x_0) = \frac{1}{\omega_{n-1} r^{n-1}} \int_{\partial B_r(x_0)} h(x)\, dS_x .
\]
Using polar coordinates we see that when h is harmonic, then for Br (x0 ) ⋐ Ω we also have the
solid mean value property
\[
h(x_0) = \frac{1}{\mathcal{L}^n(B_r(x_0))} \int_{B_r(x_0)} h(x)\, dx .
\]
15 Lecture 15
In this lecture we use the Plancherel Theorem to obtain estimates and regularity results for
distributional solutions to the Poisson equation.
We are interested in solving the Poisson equation ∆u = f in Rn in the context of tempered
distributions. If E denotes the fundamental solution for ∆ in Rn found in Example 13.2 (for
n = 2) and in Theorem 13.7 (for n > 3), then the general solution in S ′ (Rn ) is E ∗ f + h, where
h is any harmonic polynomial on Rn . This follows from Theorem 13.5 provided we can make
sense of E ∗ f as a tempered distribution.
Example 15.1. If f ∈ S ′ (Rn ) has compact support, then fˆ is a moderate C ∞ function and so
Ê fˆ is well-defined as a tempered distribution by Lemma 12.6. We can then define
\[
E * f := \mathcal{F}^{-1}\big( \hat{E}\hat{f}\, \big).
\]
Theorem 15.2 (An L2 identity for the Laplacian). Let f ∈ L2 (Rn ) and assume E ∗ f is well-
defined as a tempered distribution. Then the general solution to Poisson’s equation ∆v = f in
S ′ (Rn ) is v = E ∗ f + h, where h is any harmonic polynomial on Rn . Furthermore, if u = E ∗ f ,
then Dj Dk u ∈ L2 (Rn ) for 1 6 j, k 6 n, and
\[
\sum_{j,k=1}^n \int_{\mathbb{R}^n} |D_j D_k u|^2\, dx = \int_{\mathbb{R}^n} |f|^2\, dx. \tag{8}
\]
Remark 15.3. The symmetric n × n matrix
D2 u = (Dj Dk u)
is called the Hessian matrix of u. When u is a distribution with the property that the second
order partial derivatives Dj Dk u are regular distributions (i.e. they are L1loc functions), then
order partial derivatives Dj Dk u are regular distributions (i.e. they are L1loc functions), then
\[
|D^2 u|^2 = \sum_{j,k=1}^n |D_j D_k u|^2 .
\]
The right-hand side serves to define the left-hand side, and we record that for an n × n matrix
A = (aij ) ∈ Mn×n (C),
\[
|A| := \sqrt{\operatorname{tr}(A^\top \bar{A})} = \Big( \sum_{j,k=1}^n |a_{jk}|^2 \Big)^{\frac{1}{2}} .
\]
This is the standard norm on Mn×n (C), sometimes called the Frobenius norm or the Hilbert-
Schmidt norm. In terms of the Hessian matrix we may rewrite (8) as kD2 ukL2 = kf kL2 .
Since ∆u = f we have −|ξ|2 û = fˆ, so that $D_j D_k u = \mathcal{F}^{-1}_{\xi\to x}\big( \frac{\xi_j \xi_k}{|\xi|^2}\hat{f}\, \big)$, and therefore by Plancherel's Formula
\begin{align*}
\int_{\mathbb{R}^n} |D^2 u|^2\, dx &= \sum_{j,k=1}^n \int_{\mathbb{R}^n} |D_j D_k u|^2\, dx \\
&= \sum_{j,k=1}^n \int_{\mathbb{R}^n} \Big| \mathcal{F}^{-1}_{\xi\to x}\Big( \frac{\xi_j \xi_k}{|\xi|^2}\hat{f}\, \Big) \Big|^2\, dx \\
&= (2\pi)^{-n} \sum_{j,k=1}^n \int_{\mathbb{R}^n} \Big| \frac{\xi_j \xi_k}{|\xi|^2}\hat{f}\, \Big|^2\, d\xi \\
&= (2\pi)^{-n} \int_{\mathbb{R}^n} \frac{1}{|\xi|^4} \sum_{j,k=1}^n \xi_j^2 \xi_k^2\, |\hat{f}|^2\, d\xi \\
&= (2\pi)^{-n} \int_{\mathbb{R}^n} |\hat{f}|^2\, d\xi \\
&= \int_{\mathbb{R}^n} |f|^2\, dx. \qquad\square
\end{align*}
Remark 15.4 (Calderón-Zygmund Lp estimate for the Laplacian). Let p ∈ (1, ∞). Then there
exists a constant c = c(p, n) such that for f ∈ Lp (Rn ) (with compact support, say)
\[
\int_{\mathbb{R}^n} |D^2(E * f)|^p\, dx \le c \int_{\mathbb{R}^n} |f|^p\, dx.
\]
This fails for p = 1 and for p = ∞. The Sobolev Embedding Theorem then implies that
$E * f \in W^{2,p}_{loc}$ (see Theorem 16.1 for p = 2).
Proposition 15.5 (Localization). Let Ω be an open subset of Rn and suppose that u ∈ D′ (Ω)
satisfies ∆u = f in D′ (Ω). If f ∈ L2loc (Ω), then $u \in W^{2,2}_{loc}(\Omega)$. Recall that
\[
W^{k,2}_{loc}(\Omega) = \big\{ v \in L^2_{loc}(\Omega) : D^\alpha v \in L^2_{loc}(\Omega)\ \forall |\alpha| \le k \big\}.
\]
Proof. Fix Ω′ ⋐ Ω and take a cut-off function θ ∈ D(Ω) between Ω′ and ∂Ω: 1Ω′ 6 θ 6 1Ω .
Now θu ∈ S ′ (Rn ) in view of the bound u must satisfy on the compact set supp θ ⊂ Ω. Using
Leibniz we calculate
\[
\Delta(\theta u) = (\Delta\theta)u + 2\sum_{j=1}^n D_j\theta\, D_j u + \theta\Delta u = f_1 + f_2,
\]
say, where f1 = (∆θ)u+2∇θ·∇u and f2 = θf . Observe that f1 , f2 ∈ S ′ (Rn ) both have compact
supports contained in supp θ. By linearity we then have θu = u1 + u2 , where ui ∈ S ′ (Rn )
satisfies ∆ui = fi in S ′ (Rn ). Now f2 ∈ L2 (Rn ) and so we may apply Theorem 15.2 which,
together with the Sobolev Embedding stated in Theorem 16.1, gives that $u_2 \in W^{2,2}_{loc}(\mathbb{R}^n)$. For
u1 we observe that u1 |Ω′ ∈ D′ (Ω′ ) satisfies ∆(u1 |Ω′ ) = f1 |Ω′ = 0, since f1 is supported in
supp ∇θ ⊂ Ω \ Ω′ . By Weyl's Lemma u1 is therefore C ∞ on Ω′ , and hence u|Ω′ = u1 |Ω′ + u2 |Ω′ ∈
W 2,2 (Ω′ ). As Ω′ ⋐ Ω was arbitrary, $u \in W^{2,2}_{loc}(\Omega)$. □
Remark 15.6. The above can also be done for other elliptic PDEs. For instance for P (D)u = f
in D′ (Ω), where
\[
P(D) = \sum_{|\alpha| = 2} a_\alpha D^\alpha, \qquad a_\alpha \in \mathbb{R}, \qquad\text{and}\qquad P(\xi) = \sum_{|\alpha| = 2} a_\alpha \xi^\alpha \neq 0
\]
for all ξ ∈ Rn \ {0}. For more along these lines, see the course C4.3 Functional Analytic
Methods for PDEs.
16 Lecture 16
Theorem 16.1 (L2 Sobolev Inequality). Let u ∈ S ′ (Rn ) and assume that for some m ∈ N,
Dα u ∈ L2 (Rn ) for all α ∈ Nn0 with |α| = m. Then
1. if 2m > n, then u ∈ C(Rn ) (more precisely, u has a continuous representative);
2. if 2m = n, then u ∈ Lploc (Rn ) for all p < ∞ (but not in general for p = ∞),
3. if 2m < n, then $u \in L^{\frac{2n}{n-2m}}_{loc}(\mathbb{R}^n)$.
Proof. We omit the proof here. Instead we prove a slightly weaker variant of Theorem 16.1
(1).
Theorem 16.2 (An L2 Sobolev Embedding). If m ∈ N, $m > \frac{n}{2}$, and u ∈ W m,2 (Rn ), then
u ∈ C0 (Rn )1 .
Proof. Since u ∈ S ′ (Rn ) we can express u ∈ W m,2 (Rn ) equivalently by use of Plancherel as2
\[
\widehat{D^\alpha u}\,(\xi) = (i\xi)^\alpha \hat{u}(\xi) \in L^2(\mathbb{R}^n) \quad \text{for all } |\alpha| \le m .
\]
Theorem 16.3. If u ∈ Cc (Rn ) and û has compact support, then u = 0.
Proof. Put
\[
f(z) = \int_{\mathbb{R}^n} u(x) e^{-ix\cdot z}\, dx
\]
for z ∈ Cn . This is well-defined because u ∈ Cc (Rn ), and using the DCT we see that f is
continuously differentiable, with
\[
\frac{\partial f}{\partial \bar{z}_k}(z) = \int_{\mathbb{R}^n} u(x)\, \frac{\partial}{\partial \bar{z}_k}\big( e^{-ix\cdot z} \big)\, dx = 0
\]
for each k = 1, . . . , n. But then zk 7→ f (z) is holomorphic (for fixed z1 , . . . , zk−1 , zk+1 , . . . , zn ).
Now observe that for ξ ∈ Rn we have f (ξ) = û(ξ), and so if we fix ξ2 , . . . , ξn ∈ R and denote by
f1 (z1 ) = f (z1 , ξ2 , . . . , ξn ), then f1 is entire and f1 (ξ1 ) = û(ξ). Because û has compact support,
it follows that f1 must vanish on a half-line, so the Identity Theorem for holomorphic functions
implies that f1 ≡ 0. But then û(ξ) = 0, and since ξ2 , . . . , ξn ∈ R were arbitrary, we have shown
that û ≡ 0. By the Fourier Inversion Formula in S ′ (Rn ) it follows that u = 0.
1 Recall that this means that there exists a representative of u belonging to C0 (Rn ).
2 Note that this gives us a way to define the Sobolev spaces W m,2 in terms of the Fourier transform.