Lindstrom1 PDF
Degree project in mathematics, 30 credits
Supervisor and examiner: Sten Kaijser
January 2008
Department of Mathematics
Uppsala University
Abstract
In this report we will study the origins and history of functional analysis up until 1918. We
begin by studying ordinary and partial differential equations in the 18th and 19th centuries
to see why there was a need to develop the concepts of functions and limits. We will see
how a general theory of infinite systems of equations and determinants by Helge von Koch
was used in Ivar Fredholm's 1900 paper on the integral equation
$$\varphi(s) = f(s) + \int^{b} \cdots \tag{1}$$
Acknowledgements
First of all, I would like to express my sincerest gratitude to Sten Kaijser, not only for
supervising this thesis, but also for being my mentor during my years at the university.
If it were not for him, I would have followed my original plan and studied theoretical
philosophy instead of mathematics. For preventing this, I am grateful. Secondly, I am
grateful for the help of Gunnar Berg, who provided me with helpful comments and criticism
to improve this thesis. Finally, I thank Olivier for interesting conversations
and for help with French translations and bad grammar.
Contents
1 Introduction
2 Differential equations
2.1 Linear ordinary differential equations
2.2 Partial differential equations
2.3 Spectral theory
3 Integral equations
3.1 Origins in applications
3.2 Potential theory and electrostatics
3.3 The connection between differential and integral equations
1 Introduction
Functional analysis is the branch of mathematics where vector spaces and operators on
them are in focus. In linear algebra, the discussion is about finite dimensional vector
spaces over any field of scalars. The functions are linear mappings which can be viewed as
matrices with scalar entries. If the functions are mappings from a vector space to itself, the
functions are called operators and they are represented by square matrices. In functional
analysis, the vector spaces are in general infinite dimensional and not all operators on
them can be represented by matrices. Hence the theory becomes more complicated, but
nonetheless there are many similarities.
Functional analysis has its origin in ordinary and partial differential equations, and in
the beginning of the 20th century it started to form a discipline of its own via integral
equations. However, for a long time there were doubts whether the mathematical theory
was rich enough. Despite the efforts of many prominent mathematicians, it was not certain
that there were sufficiently many functionals to support a good theory, and it was not until
the 1920s that the question was finally settled with the celebrated Hahn-Banach theorem.
Seen from the modern point of view, functional analysis can be considered as a generalization of linear algebra. However, from a historical point of view, the theory of linear
algebra was not developed enough to provide a basis for functional analysis at its time of
creation. Thus, to study the history of functional analysis we need to investigate which
mathematical concepts needed to be completed in order to get a theory rigorous
enough to support it. Those concepts turn out to be functions, limits and set theory.
For a long time, the definition of a function was due to Euler in his Introductio in
Analysin Infinitorum from 1748, which read: "A function of a variable quantity is an
analytic expression composed in any way whatsoever of the variable quantity and numbers
or constant quantities." For a detailed discussion about the problems concerning this
definition, see [11]. For the purpose of this report, it is enough to say that the entire focus
of this definition is on the function itself, and the properties of this particular function.
What led to the success of functional analysis was that the focus was lifted from the
function, and shifted to the algebraic properties of sets of functions: the algebraization
of analysis. The process of algebraization led mathematicians to study sets of functions
where the functions are nothing more than abstract points in the set.
At the same time as the theory is very concrete and applicable to physical problems, it
can be presented in a very abstract way. Some proofs and results are significantly simplified
by introducing the axiom of choice, Zorn's lemma or the Baire category theorem, some
of the most abstract concepts in set theory. The main theorems are
1. The Hahn-Banach theorem by Hans Hahn and Stefan Banach, which states
that there are sufficiently many continuous functionals on every normed space to
make the theory of dual spaces and adjoint operators interesting
2. The uniform boundedness principle or the Banach-Steinhaus theorem by
Banach and Hugo Steinhaus, which states that for any family of continuous linear
operators on a Banach space, pointwise boundedness is equivalent to boundedness
in the operator norm
3. The open mapping theorem or the Banach-Schauder theorem by Banach
and Juliusz Paweł Schauder, which states that every surjective continuous linear
operator between two Banach spaces is an open mapping
2 Differential equations
2.1 Linear ordinary differential equations
Due to the unclear notion of a function at the end of the 18th and the beginning of
the 19th century, one thought of a function in the same way as we today would think
of an analytic function. That is, it was assumed that around each point $x_0$, the function
was equal to a power series in $x - x_0$. Taking derivatives of this function amounted to
taking derivatives of the terms in the series expansion. In general, convergence was not
considered. The common recipe for solving a differential equation
$$y^{(n)} = F(x, y, y', y'', \ldots, y^{(n-1)}) \tag{1}$$
would then be to substitute the power series $y = \sum_{k=0}^{\infty} c_k (x - x_0)^k$ and its termwise derivatives into (1). Identifying the series on both sides would then determine $c_k$, for $k \geq n$, as a
function of $c_0, c_1, \ldots, c_{k-1}$.
The usefulness of this method was restricted to rather simple differential equations such
as the linear equation $y' = a(x)y + b(x)$, for which the solution had been known since the
17th century. It was not until after 1760 that a general study of ordinary linear differential
equations of arbitrary order began. [5]
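The power series recipe can be illustrated by a minimal Python sketch. The equation $y'' = -y$ with $x_0 = 0$ is an example chosen here for illustration, not one taken from the sources, and the function names are hypothetical. Substituting the series and identifying coefficients gives the recurrence $(k+2)(k+1)c_{k+2} = -c_k$, so every $c_k$ with $k \geq 2$ is determined by $c_0$ and $c_1$.

```python
from fractions import Fraction
import math

def series_coefficients(c0, c1, n_terms):
    """Coefficients c_k of y = sum c_k x^k solving y'' = -y (illustrative example).

    Matching the coefficient of x^k on both sides gives
    (k+2)(k+1) c_{k+2} = -c_k, so c_0 and c_1 fix all later coefficients.
    """
    c = [Fraction(c0), Fraction(c1)]
    for k in range(n_terms - 2):
        c.append(-c[k] / ((k + 1) * (k + 2)))
    return c

def evaluate(coeffs, x):
    """Evaluate the truncated power series at x."""
    return sum(float(ck) * x**k for k, ck in enumerate(coeffs))

coeffs = series_coefficients(1, 0, 20)   # y(0) = 1, y'(0) = 0
print(coeffs[:5])
print(evaluate(coeffs, 1.0), math.cos(1.0))
```

With the initial data $y(0) = 1$, $y'(0) = 0$ the series is that of the cosine, so the two printed values on the last line should agree closely. Note that, exactly as in the historical method, the sketch says nothing about convergence.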
2.2 Partial differential equations
In the 18th century the development was triggered by physical problems, and one of the
best examples of this fact is the theory of partial differential equations. In 1747, Jean
le Rond d'Alembert (1717-1783) published a paper which proposed a solution to the
vibrating string problem. Since the displacement of any point on the string depends on
both time and position, a function describing the shape of the string must depend on two
variables, y = f(x, t). d'Alembert considered the string to be composed of infinitely many
small parts, each with infinitely small mass, and used Newton's laws of motion to derive
a partial differential equation for the shape of the string, now called the wave equation,
$$\frac{\partial^2 y}{\partial t^2} = c^2 \frac{\partial^2 y}{\partial x^2}, \tag{2}$$
where c is a known function of x and constant if the mass of the string is homogeneous.
d'Alembert considered the special case when $c^2 = 1$ and by the change of variables $X = x - t$, $Y = x + t$ he reduced (2) to
$$\frac{\partial^2 y}{\partial X \partial Y} = 0. \tag{3}$$
From (3) d'Alembert concluded that the solution of (2) was $y(x, t) = f(x - t) + g(x + t)$
where f and g are arbitrary twice differentiable functions. [11]
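As a hedged numerical check of d'Alembert's conclusion, the following Python sketch evaluates $y_{tt} - y_{xx}$ by central differences for $y(x, t) = f(x - t) + g(x + t)$. The two profiles f and g are arbitrary choices made here for illustration, and the step size h is an assumption.

```python
import math

def f(s):
    return math.sin(s)          # any twice differentiable profile (assumed)

def g(s):
    return math.exp(-s * s)     # a second, unrelated profile (assumed)

def y(x, t):
    """d'Alembert's form of the solution for c^2 = 1."""
    return f(x - t) + g(x + t)

def second_difference(func, x, t, h, axis):
    """Central second difference of func in x (axis=0) or in t (axis=1)."""
    if axis == 0:
        return (func(x + h, t) - 2 * func(x, t) + func(x - h, t)) / h**2
    return (func(x, t + h) - 2 * func(x, t) + func(x, t - h)) / h**2

def wave_residual(x, t, h=1e-3):
    """Approximate y_tt - y_xx, which should vanish up to discretisation error."""
    return second_difference(y, x, t, h, 1) - second_difference(y, x, t, h, 0)

for point in [(0.3, 0.1), (1.0, 2.0), (-0.7, 0.5)]:
    print(point, wave_residual(*point))
```

The residuals should be tiny at every sample point, whatever smooth f and g are substituted, which is exactly the arbitrariness that caused the controversy.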
This caused quite a controversy because of the misconception that a function had to be
something very concrete, and not arbitrary. Thus already at this point we see the need
of an abstraction of the concept of a function to a level where the function itself is not
important, but rather the collection of functions with abstract properties, in this case the
property of being twice differentiable.
There was also another approach to the vibrating string problem which began as early as
1715 with Brook Taylor (1685-1731), but took almost 150 years to mature. By direct
arguments, without using (2), he concluded that when c is constant, the functions
$$u_n(x, t) = \sin\frac{n\pi x}{a}\cos\frac{n\pi t}{a}, \quad n \geq 1 \tag{4}$$
represented the vibrations of the string, where the value of n decided the tone (with
n = 1 representing the fundamental tone and n = 2, 3, ... the harmonics¹). This led D.
Bernoulli in 1750 to propose the general solution as a series
$$u(x, t) = \sum_{n=1}^{\infty} a_n \sin\frac{n\pi x}{a}\cos\frac{n\pi c}{a}(t - \beta_n) \tag{5}$$
…
$$\frac{\partial u}{\partial t}(x, 0) = \psi(x)$$
were prescribed. Note that these are functions of a single variable. Using this fact, Euler
was able to give a geometric construction equivalent to u(x, t) being explicitly given by
$$u(x, t) = \frac{1}{2}\bigl(\varphi(x - t) + \varphi(x + t)\bigr) + \frac{1}{2}\int_{-t}^{t} \psi(x - \tau)\,d\tau.$$
Now it was a well-known fact from experiments that the function $\varphi$ could look quite
terrible. For example, there could be points where it has no derivative. This forced
Euler to extend the notion of a function, from what he called continuous (analytical), to
the more general notion of mechanical. Euler does not define explicitly what he means
by a mechanical function, but it seems to mean piecewise twice differentiable in
modern terminology. [5]
From (5) Euler was led to the conclusion that any mechanical function defined on an
interval $-a \leq x \leq a$ could be represented as a series
$$\frac{a_0}{2} + a_1\cos\frac{\pi x}{a} + b_1\sin\frac{\pi x}{a} + a_2\cos\frac{2\pi x}{a} + b_2\sin\frac{2\pi x}{a} + \ldots$$
where each term in the series is a continuous (analytic) function. However, Euler could
not imagine that this sum of continuous functions could be anything but a continuous
function. Euler's opinions were shared by most of the mathematicians of his time, and no
progress was made until the work of Joseph Fourier (1768-1830) on the theory of heat.
[5]
2.3 Spectral theory
The work of Fourier on the theory of heat triggered not only the development of trigonometric series, which forced mathematicians to consider even more carefully what a function is
and what convergence means, but it also gave birth to spectral theory, which is a central
concept in functional analysis.
¹ The fact that if a vibrating string is cut in half, one will hear a tone which is one octave higher, was
already known to Pythagoras. [23]

Fourier studied the cooling problem for a solid sphere of radius r, which with
spherical symmetry gives the partial differential equation
$$\frac{\partial u}{\partial t} = k\left(\frac{\partial^2 u}{\partial x^2} + \frac{2}{x}\frac{\partial u}{\partial x}\right), \tag{6}$$
with the boundary condition that u(x, t) remains finite when x tends to 0 and satisfies the
relation
$$\frac{\partial u}{\partial x} + hu = 0 \quad \text{for } x = r \text{ and for all } t, \tag{7}$$
where h and k are constants. Using the method of separation of variables, Fourier proved
that the function
$$u(x, t) = \frac{1}{x}\exp(-k\lambda^2 t)\sin(\lambda x) \tag{8}$$
is a solution, where the parameter $\lambda$ is a solution of the transcendental equation
$$\frac{\lambda r}{\tan \lambda r} = 1 - hr \tag{9}$$
and that equation (9) has infinitely many real zeroes $\lambda_n$ tending to $+\infty$. In order to
obtain a solution to (6) with boundary condition (7) such that u(x, 0) = f(x), for a given
function f(x), he expressed xf(x) as a series $\sum_{n=1}^{\infty} c_n \sin(\lambda_n x)$ and proved the relations²
$$\int_0^r \sin(\lambda_n x)\sin(\lambda_m x)\,dx = 0 \quad \text{for } n \neq m \tag{10}$$
$$c_n = \frac{\int_0^r x f(x)\sin(\lambda_n x)\,dx}{\int_0^r \sin^2(\lambda_n x)\,dx} \tag{11}$$
As always with Fourier, no rigorous proofs or justifications were given, not even that the
series actually converged to xf(x). [5]
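The relation (10) can be checked numerically. The sketch below is a minimal illustration under assumed parameter values (r = 1 and h = 0.5 are chosen here, not taken from Fourier): it locates the first two roots of (9) by bisection and evaluates the integral in (10) with Simpson's rule. All function names are hypothetical.

```python
import math

R = 1.0   # radius r of the sphere (assumed value)
H = 0.5   # constant h in boundary condition (7) (assumed value)

def g(lam):
    """Equation (9) multiplied through by sin(lam * r); zero exactly at the lambda_n."""
    return lam * R * math.cos(lam * R) - (1.0 - H * R) * math.sin(lam * R)

def bisect(func, lo, hi, iterations=80):
    """Plain bisection; assumes func changes sign on [lo, hi]."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo > 0) == (f_mid > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

# For the assumed values of R and H, g changes sign on each bracket below.
lam1 = bisect(g, 0.1, math.pi / 2)
lam2 = bisect(g, math.pi, 1.5 * math.pi)

inner = simpson(lambda x: math.sin(lam1 * x) * math.sin(lam2 * x), 0.0, R)
norm1 = simpson(lambda x: math.sin(lam1 * x) ** 2, 0.0, R)
print(lam1, lam2, inner, norm1)
```

The mixed integral comes out as zero up to rounding, while the denominator of (11) stays positive, so the coefficient formula is well defined for these two modes.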
The ideas and results of Fourier in the 1820s were further developed and put on a
more rigorous basis by Siméon Denis Poisson (1781-1840), which in turn led Charles-François
Sturm (1803-1855), in 1836, and Joseph Liouville (1809-1882), in 1837, to develop
a general theory which included all of Fourier's work.
They began with the study of the second order differential equation
$$y'' - q(x)y + \lambda y = 0, \tag{12}$$
² This is what we now call an orthogonality relation, a word which was never used by Fourier. [5]
³ The definition used for continuity at this time was due to Cauchy, 1821. [11]
⁴ From here on I will use the terms eigenvalue and eigenfunction even though we have not actually
proved their existence yet, and even though those words were not used until Hilbert, 1904.
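give that
$$u''v - v''u + (\lambda - \mu)uv = 0,$$
which by integrating both sides gives
$$\begin{aligned}
\int_a^b \bigl(u''v - v''u + (\lambda - \mu)uv\bigr)\,dx &= \int_a^b (u''v - v''u)\,dx + \int_a^b (\lambda - \mu)uv\,dx \\
&= [u'v - v'u]_a^b + (\lambda - \mu)\int_a^b u(x)v(x)\,dx \\
&= (\lambda - \mu)\int_a^b u(x)v(x)\,dx \\
&= 0
\end{aligned} \tag{14}$$
because of (13). An immediate consequence of this is that all eigenvalues are real. Indeed, if $\lambda$ were not real, we could
replace $\mu$ by $\bar\lambda$ and v by $\bar u$ in (14), and since $\lambda - \bar\lambda \neq 0$ we would get that $\int_a^b |u(x)|^2\,dx = 0$, which would imply that
$u \equiv 0$ on [a, b], a contradiction to the assumptions.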
In a rather long and cumbersome paper from 1836, Sturm proves the existence of
eigenvalues. We will also prove this fact, but as a reformulation following [5]. We begin
by studying the equation $y'' + q(x)y = 0$ and in the usual way writing this as a system of
first order equations by introducing $y_1 = y$, $y_2 = y'$, which gives the system
$$\begin{cases} y_1' = y_2 \\ y_2' = -q(x)y_1. \end{cases} \tag{15}$$
Now we introduce two new functions r and $\theta$ such that $y_1 = r\sin(\theta)$ and $y_2 = r\cos(\theta)$,
which turns (15) into
$$\begin{cases} r' = (1 - q(x))\,r\sin(\theta)\cos(\theta) \\ \theta' = \cos^2(\theta) + q(x)\sin^2(\theta). \end{cases} \tag{16}$$
If we apply this change of variables to the equation (12) we get the equation
$$\theta' = \cos^2(\theta) + (\lambda - q(x))\sin^2(\theta), \tag{17}$$
and if we assume that a solution $\theta(x, \lambda)$ exists, such that $\theta(a, \lambda) = \alpha$, then the eigenvalues
are the solutions of the equations
$$\theta(b, \lambda) = \beta + n\pi \quad \text{for } n \in \mathbb{Z}. \tag{18}$$
One of Sturm's comparison theorems, which is found in the same paper, then shows that
for each $x \in\, ]a, b[$ the function $\lambda \mapsto \theta(x, \lambda)$ is strictly increasing and that it follows from
(17) that if $\theta(x, \lambda) = k\pi$ for some integer k then
$\frac{\partial\theta}{\partial x}(x, \lambda) = 1$. [5]
These are the results that Sturm needed for his conclusion that equation (18) has
one and only one solution $\lambda_n$ for each $n \geq 1$ and no solutions for $n \leq 0$, and finally that
the corresponding eigenfunctions $u_n$ have exactly n zeroes in the interval [a, b]. One year
later, Liouville continued Sturm's work and gave generalizations of the works of Fourier
and Poisson. One of his main results for our purposes is that he in (14) replaced
$\lambda$ and $\mu$ by $\lambda_n$ and $\lambda_m$, from which it follows that
$$\int_a^b u_n(x)u_m(x)\,dx = 0 \quad \text{for } n \neq m.$$
This is another orthogonality result for the eigenfunctions, but neither Liouville nor Sturm
used this word. These results are now known as Sturm-Liouville theory for second order
linear ordinary differential equations.
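The angle equation (17) also gives a practical way of computing eigenvalues. The following Python sketch is a minimal illustration under assumptions made here: q = 0 on [0, π], and α = β = 0 in (18), which corresponds to y(a) = y(b) = 0. It integrates (17) with the classical Runge-Kutta scheme and solves $\theta(b, \lambda) = n\pi$ by bisection, relying on the monotonicity in $\lambda$ stated above. The function names, step count and search interval are hypothetical choices.

```python
import math

def q(x):
    return 0.0          # illustrative choice of q; other continuous q could be used

def theta_at_b(lam, a=0.0, b=math.pi, steps=1000):
    """Integrate (17) from theta(a) = 0 to x = b with the classical RK4 scheme."""
    def rhs(x, th):
        return math.cos(th) ** 2 + (lam - q(x)) * math.sin(th) ** 2
    h = (b - a) / steps
    x, th = a, 0.0
    for _ in range(steps):
        k1 = rhs(x, th)
        k2 = rhs(x + h / 2, th + h * k1 / 2)
        k3 = rhs(x + h / 2, th + h * k2 / 2)
        k4 = rhs(x + h, th + h * k3)
        th += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return th

def eigenvalue(n, lo=0.0, hi=50.0, iterations=40):
    """Solve theta(b, lam) = n*pi by bisection, using monotonicity in lam."""
    target = n * math.pi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if theta_at_b(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print([round(eigenvalue(n), 4) for n in (1, 2, 3)])
```

For this choice of q and interval the eigenfunctions are sin(nx) with eigenvalues n², so the printed values should be close to 1, 4 and 9.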
3 Integral equations
The theory of integral equations provided mathematicians with three essential concepts
which were of great importance not only to the development of functional analysis, but
also to a richer and more general theory of other areas of mathematics. Some of them
were perhaps quite unexpected, like algebra and group theory. These are:
1. Solution to the Dirichlet problem, and thus connecting the theory of differential and
integral equations
2. Passing from finite to infinite systems of equations
3. Developing the notion of infinite spaces and function spaces
We will try to track these ideas which lead us a few steps closer to functional analysis,
and to a better understanding of mathematics in general.
3.1 Origins in applications
As for partial differential equations, the theory of integral equations has its cradle in
applications, mainly astronomy and electrostatics. The study of planetary motions led
mathematicians to successions of equations of the form
$$\begin{aligned}
y_{1,i}' &= f_{1,i}(x, a_1, \ldots, a_n) \\
y_{2,i}' &= f_{2,i}(x, y_{1,1}, \ldots, y_{1,n}) \\
y_{3,i}' &= f_{3,i}(x, y_{1,1}, \ldots, y_{1,n}, y_{2,1}, \ldots, y_{2,n}) \\
&\;\;\vdots
\end{aligned} \qquad \text{for } i = 1, 2, \ldots, n$$
where all the right-hand sides are known functions. The problem was hence reduced to
quadratures. No attempts to justify the procedure mathematically were made, since it
gave answers that agreed satisfactorily with observations. Yet it is an example of an iterative process for
solving large systems of equations or successive equations.
The question of whether a general differential equation has solutions, even though explicit solutions cannot be given, was raised and answered by Augustin Louis Cauchy
(1789-1857), who proved some existence theorems on differential equations. In a paper published in 1835 he considered a method like the one outlined above for the partial
differential equation
$$\frac{\partial U}{\partial t} = \sum_{i=1}^{p} A_i(t, x_1, \ldots, x_p)\frac{\partial U}{\partial x_i}, \tag{1}$$
where the problem was to find a solution which reduces to a given function $u(x_1, \ldots, x_p)$
for t = 0. Cauchy transformed (1), by considering $x_1, \ldots, x_p$ as parameters, to
$$U(t, x_1, \ldots, x_p) = u(x_1, \ldots, x_p) + \int_0^t \Bigl(\sum_{i=1}^{p} A_i(s, x_1, \ldots, x_p)\frac{\partial U}{\partial x_i}\Bigr)\,ds \tag{2}$$
which he was able to solve using the method of successive approximations. He started
with $U_0 = u$ and defined
$$U_n(t, x_1, \ldots, x_p) = u(x_1, \ldots, x_p) + \int_0^t \Bigl(\sum_{i=1}^{p} A_i(s, x_1, \ldots, x_p)\frac{\partial U_{n-1}}{\partial x_i}\Bigr)\,ds \tag{3}$$
which he could prove converged to a solution when the $A_i$ are analytic functions. [5]
In the previously mentioned paper from 1837, Liouville independently used a similar
method for the differential equation $y'' = f(x)y$ on [a, b] with boundary condition $y'(a) - hy(a) = 0$. He started his recursive definition with $y_0(x) = 1 + h(x - a)$ and considered
the series
$$y = y_0 + y_1 + \ldots + y_n + \ldots$$
where $y_n$ is determined by
$$y_n(x) = \int_a^x dt \int_a^t f(s)y_{n-1}(s)\,ds.$$
On the question of convergence, Liouville proved that $|y_n(x)| \leq c^n (x - a)^{2n}/(2n)!$, which
implies that the series converges for every x. Liouville then continued without further
motivation by assuming that y(x) is twice differentiable as the limit of twice differentiable
functions, a common misconception among mathematicians at this time since the notion of
uniform convergence did not exist yet.
An interesting remark is that Liouville gave another definition of y as
$$y = y_0 + \int_a^x dt \int_a^t f(s)y(s)\,ds$$
thus giving him the first example of what is now called a Volterra integral equation of the
second kind. [5]
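Liouville's successive approximations can be carried out exactly in a simple case. The Python sketch below assumes f(x) = 1 and a = 0 (choices made here for illustration), so that each $y_n$ is a polynomial and the double integration can be done on coefficient lists. The function names and the value of h are hypothetical.

```python
from fractions import Fraction
import math

def integrate(poly):
    """Antiderivative vanishing at 0 of a polynomial given by its coefficient list."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]

def next_term(prev):
    """y_n(x) = int_0^x dt int_0^t f(s) y_{n-1}(s) ds for the assumed case f = 1."""
    return integrate(integrate(prev))

def partial_sum(h, n_terms):
    """Coefficients of y_0 + y_1 + ... + y_{n_terms - 1} with y_0 = 1 + h x."""
    term = [Fraction(1), Fraction(h)]
    total = []
    for _ in range(n_terms):
        # pad the running total to the length of the current term, then add
        total = total + [Fraction(0)] * (len(term) - len(total))
        total = [t + c for t, c in zip(total, term)]
        term = next_term(term)
    return total

def evaluate(poly, x):
    """Evaluate a polynomial given by its coefficient list at x."""
    return sum(float(c) * x**k for k, c in enumerate(poly))

h = Fraction(1, 2)
approx = evaluate(partial_sum(h, 12), 1.0)
exact = math.cosh(1.0) + float(h) * math.sinh(1.0)
print(approx, exact)
```

For f = 1 the equation is y'' = y, and the solution with y(0) = 1, y'(0) = h is cosh x + h sinh x, which satisfies the boundary condition y'(0) - hy(0) = 0; the partial sums reproduce its Taylor polynomial term by term.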
3.2 Potential theory and electrostatics
When Daniel Bernoulli (1700-1782) and Adrien-Marie Legendre (1757-1833) studied
Newtonian attractions they arrived at expressions such as
$$\varphi(x, y, z) = \iiint_V \frac{\rho(\xi, \eta, \zeta)\,d\xi\,d\eta\,d\zeta}{\sqrt{(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2}} \tag{4}$$
for a point with a certain mass under the influence of a solid V with density $\rho$. Pierre-Simon Laplace (1749-1827) went on to show that this rather terrifying-looking function
satisfies the simple relation
$$\Delta\varphi \overset{\text{def}}{=} \frac{\partial^2\varphi}{\partial x^2} + \frac{\partial^2\varphi}{\partial y^2} + \frac{\partial^2\varphi}{\partial z^2} = 0, \tag{5}$$
for $(x, y, z) \notin V$, which has since played an important role in governing stationary phenomena in for example hydrostatics, the theory of heat and electrostatics.
The theory of partial differential equations had until now been in an embryonic stage
and had not gone through such drastic development as for example the theory of ordinary
differential equations. This was about to change when George Green (1793-1841) in
1828 published a paper on partial differential equations with general boundary conditions.
The concern of the paper was electrostatics and what he called potential functions, which
were not only of the type (4), but also
$$\varphi(M) = \iint_\Sigma \frac{\rho(P)}{MP}\,d\sigma(P) \tag{6}$$
where $\Sigma$ is a smooth surface, $\rho$ a continuous function on $\Sigma$ and $d\sigma$ the element of area on
$\Sigma$. Such functions would later be called simple layer potentials. The motivation for his results was
based on experiments showing that on conductors, the electric charges are concentrated
on the surface. He discovered his famous theorem when he studied which potentials the
surface density function would define, arriving at the relation
$$\iiint_V (u\Delta v - v\Delta u)\,d\omega = \iint_\Sigma \Bigl(v\frac{\partial u}{\partial n} - u\frac{\partial v}{\partial n}\Bigr)\,d\sigma, \tag{7}$$
where $\Sigma$ is a smooth surface limiting a bounded volume V, u and v are twice differentiable
functions in a neighborhood of V and $\frac{\partial u}{\partial n}$, $\frac{\partial v}{\partial n}$ are the derivatives along the exterior normal
of $\Sigma$. Green considered a function u with the following two properties,
1. u is twice differentiable for all points different from some point M in V
2. $u(P) - (1/MP)$ is bounded when $P \to M$,
to which he applied (7) on V, from which a small ball centered at M has been removed.
By letting the radius of this small ball tend to 0 he obtained the formula
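$$4\pi v(M) + \iiint_V (u\Delta v - v\Delta u)\,d\omega = \iint_\Sigma \Bigl(v\frac{\partial u}{\partial n} - u\frac{\partial v}{\partial n}\Bigr)\,d\sigma, \tag{8}$$
and finally by taking u(P) = 1/MP he obtained
$$4\pi v(M) = \iint_\Sigma \Bigl(v\frac{\partial}{\partial n}\Bigl(\frac{1}{r}\Bigr) - \frac{1}{r}\frac{\partial v}{\partial n}\Bigr)\,d\sigma. \tag{9}$$
This is an integral formula for solving the Laplace equation, $\Delta v = 0$, when v and $\frac{\partial v}{\partial n}$ are
known on $\Sigma$. Inspired by a paper published by Poisson in 1820, Green realized that he
could generalize (8) to a general domain V by replacing u with a function G(M, P) with
the following properties:
… $\frac{\partial G}{\partial n}$ exist on $\Sigma$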
3.3 The connection between differential and integral equations
After the papers by Green on electrostatics, which provided a theory of partial differential
equations with general boundary conditions, the embryonic theory of partial differential
equations awoke and started to grow rapidly. Carl Friedrich Gauss (1777-1855) had
been interested in the Laplace equation very early, both in two and three variables, in
connection with his work on complex numbers and astronomy, and as early as 1813 he
published some special cases of the Green formula (7) and used the word potential for
the function (4). [5]
Gauss's work on potential theory led him to a fundamental result. When he studied
equations of the type (6) with $\rho \geq 0$, and a function U continuous on $\Sigma$, such that
$$\iint_\Sigma (\varphi - 2U)\rho\,d\sigma$$
…
Hermann Amandus Schwarz (1843-1921) on the vibrating membrane problem, published
in 1885.
By the same principles as for the vibrating string problem (compare section 2.2) one can
deduce that if z = u(x, y, t) is the equation of the surface at time t, then u satisfies
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{\partial^2 u}{\partial t^2} \tag{10}$$
for suitable units of time and length, and small vibrations. If one looks for solutions of
the form $u(x, y, t) = v(x, y, \lambda)w(t)$ one finds a solution for v by solving¹
$$\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} + \lambda v = 0, \tag{11}$$
for a suitable constant $\lambda$. This equation (11) was successfully studied by Heinrich Weber
(1842-1913) in 1869, who proved interesting eigenvalue properties and orthogonality
relations, which implied basically the same properties as for the vibrating string problem.
The problem with Weber's solution is that it used methods of the calculus of variations
which were considered rather suspicious by Weierstrass among others. Hence his results
were not fully accepted until Schwarz in 1885 published a long paper on minimal surfaces
which used entirely new methods to obtain the same, and even more general, results.
Schwarz considered a type of equation slightly more general than (11), namely
$$\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} + \lambda^2 pv = 0 \tag{12}$$
in a domain D with a continuous function p > 0. His topic of interest was not to study
eigenvalue problems² of (12), but a Dirichlet problem for the equation
$$\Delta w + \xi pw = 0 \tag{13}$$
depending on the parameter $\xi$, and restricted to the case where w = 1 on the boundary
$\partial D$ of D. He expressed the solution as a power series in $\xi$,
$$w = w_0 + \xi w_1 + \xi^2 w_2 + \ldots, \tag{14}$$
The only thing left now is to prove the convergence of (14) for small enough $\xi$, which
he did by an ingenious use of the inequality that bears his name. The proof of this fact
brought him one step further than those who had studied this problem before him. First
he proved that w is a solution of (13) and is equal to 1 on $\partial D$. Second, he also proved
that when $\xi = 1/c$, for a well-defined constant c, the terms in (14) tend uniformly³ to a
limit U which is not identically zero in D, but vanishes on $\partial D$, and is a solution to
$$\Delta w + (1/c)pw = 0.$$
This means that he had proven the existence of the smallest eigenvalue $\lambda^2 = 1/c$ of (13)
together with a corresponding eigenfunction.
What is interesting from our perspective is that if we write $w = w_0 + v$ and solve
equation (13) using the explicit formula (17) we get as a solution for v
$$\iint \cdots\ \frac{1}{2\pi}\iint_D G(M, P)\,p(P)\,d\omega, \tag{19}$$
where (19) would later be known as a Fredholm integral equation of the second kind.
³ The concept of uniform convergence was introduced by Weierstrass in a series of lectures in the beginning
of the 1850s. [11]
4
4.1
Mathematicians have been concerned with solving systems of equations in two or three
variables for thousands of years, two variables representing a problem in the plane
and three variables a problem in space. Equations in more than three variables were suspicious
because they would no longer represent a real problem. As an example, we can take
Heron's formula, which dates back more than 2000 years. Given a triangle with sides a, b and c, Heron's
formula gives the area of this triangle as $A = \sqrt{s(s - a)(s - b)(s - c)}$ where $s = \frac{a + b + c}{2}$.
This formula raised suspicion among philosophers since it involves the multiplication of
four numbers. One number alone represents a distance, two distances multiplied give an
area and a distance multiplied by an area gives a volume, but how does one represent the
multiplication of two areas? [10]
The introduction of the Cartesian coordinate system in the 17th century allowed mathematicians to clearly envision the geometrical representation of equations and systems of
equations in one, two or three variables, and some started to consider a similar concept
of geometry in any number of variables. In 1844 there could have been a major breakthrough concerning the concept of geometry generalized to any finite system of unknowns,
but there was not. This was Die Lineare Ausdehnungslehre, ein neuer Zweig der Mathematik by Hermann Grassmann (1809-1877), a German high-school teacher. In this
book he gives a treatment of what we today refer to as linear algebra. The book was
persistently rejected by everyone for almost a century because of the way that it was
written. The precise language of mathematics which we use today was not available to
Grassmann, and due to its level of abstraction the book is written in an intricate and
philosophical way which made it almost unreadable. Several well-known mathematicians
of the time (Möbius, Dedekind, ...) tried to read it and realize its importance, but
failed. The negative response he received made him publish a second version in 1862, Die
Ausdehnungslehre: Vollständig und in strenger Form bearbeitet, but this version did not
have any influence on the mathematical community.
By the end of the 19th century all the basic theorems of linear algebra had been proven,
but they were presented in unclear notation and in terms of bilinear forms instead of vectors and
matrices. Thus they did not give a sufficient basis for a generalization towards functional
analysis. The development also took place in almost the exact reverse of the logical
order which is taught in linear algebra courses today: Linear systems of equations →
Determinants → Bilinear and quadratic forms → Matrices → Vector spaces.
4.2
When studying the first occurrences of infinite systems it is evident that there is no general
theory under consideration. Almost all such problems arose while studying differential
equations and representing the solutions as power series. As we have seen, the technique
was to assume that a convergent power series for the solution existed and to substitute this
into the differential equation, taking termwise derivatives and identifying the coefficients.
This gives an infinite system of equations in infinitely many unknowns and the problem is
to find a recursive relation for the coefficients.
To find the first attempts at a general theory we return to Fourier and his 1822 treatise
Théorie analytique de la chaleur, in which he uses a more sophisticated method for an
infinite system where no recursive formula is available. The problem was to determine the
coefficients $a_n$ of the series
$$\sum_{n=1}^{\infty} a_n\cos((2n - 1)x) \tag{1}$$
such that the function represented by this series will be constant for $-\pi/2 \leq x \leq \pi/2$. To
solve this problem, he considered a more general case, namely when a series similar to (1)
equaled an analytic function with a Maclaurin expansion only containing odd powers of
x,
$$f(x) = \sum_{n=1}^{\infty} (-1)^{(n+1)} A_{2n-1}\frac{x^{2n-1}}{(2n - 1)!}, \tag{2}$$
where the $A_k$ are known, which led him to consider a series expansion only involving
sin(nx) for n = 1, 2, .... That is, he assumed that
$$f(x) = \sum_{n=1}^{\infty} a_n\sin(nx) \tag{3}$$
and he wanted to find the coefficients $a_n$. By putting x = 0 and taking derivatives of (2)
and (3), he obtained the infinite system
$$\begin{aligned}
A_1 &= \sum_{n=1}^{\infty} n a_n \\
A_3 &= \sum_{n=1}^{\infty} n^3 a_n \\
&\;\;\vdots \\
A_{2k-1} &= \sum_{n=1}^{\infty} n^{2k-1} a_n \\
&\;\;\vdots
\end{aligned} \qquad \text{for } k = 1, 2, \ldots \tag{4}$$
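He began by solving the system (4) for the first m equations in the first m unknowns. This
gave him a set of solutions $\{a_n^{(m)}\}$ and his task was to determine $\lim_{m\to\infty} a_n^{(m)}$ for n = 1, 2, ....
By lengthy and cumbersome calculations, Fourier arrived at the solution
$$\begin{aligned}
\tfrac{1}{2}a_1 &= A_1 - A_3\Bigl(\frac{\pi^2}{3!} - \frac{1}{1^2}\Bigr) + A_5\Bigl(\frac{\pi^4}{5!} - \frac{1}{1^2}\frac{\pi^2}{3!} + \frac{1}{1^4}\Bigr) - \ldots, \\
-\tfrac{1}{2}\,2a_2 &= A_1 - A_3\Bigl(\frac{\pi^2}{3!} - \frac{1}{2^2}\Bigr) + A_5\Bigl(\frac{\pi^4}{5!} - \frac{1}{2^2}\frac{\pi^2}{3!} + \frac{1}{2^4}\Bigr) - \ldots, \\
\tfrac{1}{2}\,3a_3 &= A_1 - A_3\Bigl(\frac{\pi^2}{3!} - \frac{1}{3^2}\Bigr) + A_5\Bigl(\frac{\pi^4}{5!} - \frac{1}{3^2}\frac{\pi^2}{3!} + \frac{1}{3^4}\Bigr) - \ldots, \\
&\;\;\vdots
\end{aligned}$$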
As usual, he did not give any justification of his procedures, and there are plenty of
results which would require a more careful investigation. These results, however, are not
that important in themselves; they are rather an illuminating example of the fact that passing to the infinite
was necessary for the development of analysis.
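Fourier's truncation procedure for the system (4) can be sketched in a few lines of Python. The sketch below is a hedged illustration, not Fourier's own computation: it solves the first m equations in the first m unknowns exactly over the rationals, for the assumed data $A_1 = 1/2$ and all other $A_{2k-1} = 0$, which by (2) corresponds to f(x) = x/2. The function names are hypothetical.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Gauss-Jordan elimination over the rationals (no rounding errors)."""
    n = len(rhs)
    rows = [list(map(Fraction, row)) + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

def truncated_solution(m, A):
    """Solve the first m equations of (4) in the first m unknowns a_1..a_m.

    A(k) returns A_{2k-1}; row k reads sum_n n^(2k-1) a_n = A_{2k-1}.
    """
    matrix = [[n ** (2 * k - 1) for n in range(1, m + 1)] for k in range(1, m + 1)]
    rhs = [A(k) for k in range(1, m + 1)]
    return solve(matrix, rhs)

# Assumed test data: A_1 = 1/2 and all other A_{2k-1} = 0, i.e. f(x) = x/2.
def A(k):
    return Fraction(1, 2) if k == 1 else Fraction(0)

for m in (2, 5, 20):
    a = truncated_solution(m, A)
    print(m, a[0], a[1])
```

As m grows, the first two components approach 1 and -1/2, the leading coefficients of the classical expansion x/2 = sin x - (sin 2x)/2 + ..., although the convergence is slow. Exact rational arithmetic is used because the truncated systems are badly conditioned in floating point.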
The work done by Fourier on infinite systems went unnoticed for half a century
since it was not the major concern of his papers. According to Frigyes Riesz (1880-1956)
there was only one paper (published in 1828 by an Italian mathematician named G. Piola)
on Fourier's method.
The next time this method was used was in 1870 by Theodor Kötteritzsch, in a paper that
had a general system of infinitely many unknowns under consideration. The advance made in
this paper is that, under certain conditions, he was able to solve the infinite system (4),
which he pointed out is of importance when finding Fourier coefficients¹.
It took another 15 years until general infinite systems were considered again, but this
time with much greater success, since it triggered Henri Poincaré (1854-1912) to give the
theory a rigorous treatment. These were two papers published in 1884 by Paul Appell
(1855-1930) and in 1886 by George William Hill² (1838-1914). The paper by Appell
considered the problem of finding coefficients of a power series for certain elliptic and
periodic functions. His technique was the same as the one used by Fourier 62 years
earlier, but this time it caught the attention of Poincaré, which is the main reason for
the importance of Appell's paper. Poincaré realized the usefulness of the method, but it was
unclear under which assumptions the method could be applied.
Poincaré started by considering an infinite sequence of complex numbers $\{a_n\}$ with
$|a_{n+1}| > |a_n|$ and $\lim |a_n| = \infty$, and he wanted to find a sequence $\{A_n\}$ such that
$$\sum_{n=1}^{\infty} A_n a_n^p = 0, \quad \text{for } p = 0, 1, \ldots. \tag{5}$$
This system is similar to the one that Appell considered, but in general it has no solutions,
and Poincaré started to investigate under which assumptions one can solve the system
(5). By a theorem of Weierstrass there exists an entire function F which has simple zeroes
precisely at the $a_n$, and Poincaré assumed that this function F can be written as
$$F(z) = \prod_{n=1}^{\infty}\Bigl(1 - \frac{z}{a_n}\Bigr). \tag{6}$$
If $c_n$ is a sequence of concentric circles such that the radius $r_n$ of $c_n$ satisfies $|a_{n-1}| < r_n < |a_n|$, then Poincaré could state that the system (5) has a solution $\{A_n\}$ if
$$\lim_{n\to\infty}\oint_{c_n}\frac{z^p}{F(z)}\,dz = 0 \tag{7}$$
for every p.
If this requirement is fulfilled, then by (7), the system (5) has a solution and it is given
by
$$A_i = \frac{-a_i}{\prod_{n=1,\ n\neq i}^{\infty}\bigl(1 - \frac{a_i}{a_n}\bigr)}, \tag{8}$$
|An apn |
n=1
|p Sp | < ,
p=0
¹ In this paper he even uses the words Fourier coefficients, but he does not mention the work by Fourier
on infinite systems. [2]
² This paper was written as early as 1877, but it did not reach Europe and Poincaré until 1886.
then
X
p api
Bi = Ai
p=1
(9)
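…
$$\sum_{k=-\infty}^{\infty}\theta_k\zeta^{2k} \tag{10}$$
where $\zeta = e^{it}$, $\theta_{-k} = \theta_k$ for k = 1, 2, ... and that there existed a solution for (9) of the
form
$$w = \sum_{k=-\infty}^{\infty} b_k\zeta^{c+2k} \tag{11}$$
where all the $b_k$ are constants. By substituting (10) and (11) into (9), Hill obtained the
infinite system of homogeneous equations
$$\begin{aligned}
&\;\;\vdots \\
\cdots + [-2]b_{-2} - \theta_1 b_{-1} - \theta_2 b_0 - \theta_3 b_1 - \theta_4 b_2 - \cdots &= 0 \\
\cdots - \theta_1 b_{-2} + [-1]b_{-1} - \theta_1 b_0 - \theta_2 b_1 - \theta_3 b_2 - \cdots &= 0 \\
\cdots - \theta_2 b_{-2} - \theta_1 b_{-1} + [0]b_0 - \theta_1 b_1 - \theta_2 b_2 - \cdots &= 0 \\
&\;\;\vdots
\end{aligned} \tag{12}$$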
4.3 General theory
The first one to give a broad and general theory for infinite matrices and determinants was
Helge von Koch (1870-1924), beginning in 1891 with his interest in the Fuchs equation,
$$P(y) = \frac{d^n y}{dx^n} + P_2(x)\frac{d^{n-2} y}{dx^{n-2}} + \cdots + P_n(x)y = 0, \tag{13}$$
…
$$P_r(x) = \sum_{\lambda=-\infty}^{\infty}\alpha_{r\lambda}x^{\lambda} \quad \text{for } r = 2, 3, \ldots, n, \tag{14}$$
valid in some annulus A about the origin. It was already known that a general solution
$$y = \sum_{\lambda=-\infty}^{\infty} g_\lambda x^{\lambda+\varrho} \tag{15}$$
existed which was convergent in A. Von Koch's problem was to find a general formula for
the coefficients $g_\lambda$ and $\varrho$ in (15). His investigations led him to consider an infinite matrix
of the same type as Poincaré's, and using Poincaré's results, von Koch was able to give explicit
formulas for $g_\lambda$ and $\varrho$ under certain restrictive assumptions.
One year later, von Koch returned to the problem in order to relax the restrictions in his assumptions. He began by studying the infinite array $A = \{A_{ik}\}$ for $i, k =
\ldots, -2, -1, 0, 1, 2, \ldots$ and denoting
$$D_m = \det\{A_{ik}\} \quad \text{for } i, k = -m, \ldots, m. \tag{16}$$
The determinant $D$ of $A$ is then $\lim_{m\to\infty} D_m$ provided that the limit exists and is finite,
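$$\sum_{i=-\infty}^{\infty}\sum_{k=-\infty}^{\infty} |a_{ik}| < \infty \tag{17}$$
$$\prod_{i=-\infty}^{\infty}\left(1 + \sum_{k=-\infty}^{\infty} |a_{ik}|\right) < e^{S} < \infty.$$
Now let
$$P_m = \prod_{i=-m}^{m}\left(1 + \sum_{k=-m}^{m} a_{ik}\right) \tag{18}$$
and
$$\bar{P}_m = \prod_{i=-m}^{m}\left(1 + \sum_{k=-m}^{m} |a_{ik}|\right), \qquad
\bar{P}_{mn} = \prod_{i=-n}^{m}\left(1 + \sum_{k=-n}^{m} |a_{ik}|\right).$$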
Note that $D_{pp}$ and $\bar{P}_{pp}$ are the same as $D_p$ and $\bar{P}_p$, respectively, as above. Now we can state
the theorem we need for the final part:
Theorem 4.2. Let $A$ be in normal form. Then
$$\lim_{m,n\to\infty} D_{mn} = D.$$
Proof. By Theorem 4.1 we know that $D$ is finite and that (18) holds. For any pair $(m, n)$
let $p = \max(m, n)$. Then we have, as before,
$$|D_{pp} - D_{mn}| \le \bar{P}_{pp} - \bar{P}_{mn}.$$
The right-hand side can be made arbitrarily small for sufficiently large $m$ and $n$, and
hence $p$, because of the convergence of (19). The triangle inequality then gives the desired
result.
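The convergence of truncated determinants for a matrix in normal form can be illustrated numerically. The following is only a sketch under stated assumptions, not von Koch's own computation: the perturbation $a_{ik} = c\,2^{-(|i|+|k|)}$ with $c = 0.1$ and all function names are hypothetical choices made for this example. The perturbation is absolutely summable with $S = 9c$, and because it has rank one the truncated determinant equals $1 + c\sum_{|i|\le m} 4^{-|i|}$, which tends to $1 + 5c/3$.

```python
import math


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def a_entry(i, k, c=0.1):
    """Hypothetical perturbation a_ik = c * 2^-(|i|+|k|) (assumption for this sketch)."""
    return c * 2.0 ** (-(abs(i) + abs(k)))


def truncated_determinant(m):
    """D_m: determinant of (delta_ik + a_ik) for i, k = -m, ..., m."""
    idx = range(-m, m + 1)
    return det([[(1.0 if i == k else 0.0) + a_entry(i, k) for k in idx] for i in idx])


def majorant_product(m):
    """Product over i of (1 + sum over k of |a_ik|) for i, k = -m, ..., m."""
    idx = range(-m, m + 1)
    product = 1.0
    for i in idx:
        product *= 1.0 + sum(abs(a_entry(i, k)) for k in idx)
    return product


# The double sum of |a_ik| over all integers i, k is S = 0.1 * 3 * 3 = 0.9.
S = 0.9
values = [truncated_determinant(m) for m in range(1, 9)]
```

The list `values` settles quickly, and each truncated determinant stays below the corresponding majorant product, which in turn stays below $e^{S}$.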
Under the assumption that $A$ is in normal form, von Koch deduced several properties
of $D$. The most important of these, from our perspective, is the possibility to expand
$D$ by minors. Suppose we want to expand by minors at the $i$th row. To determine the
coefficient of $A_{ik}$, von Koch replaced, in the $i$th row of $A$, the element $A_{ik}$ by one and all other
elements by zero, and calculated the resulting determinant, which he denoted
$$\operatorname{adj}(A_{ik}) = \binom{i}{k} = \alpha_{ik} = \frac{\partial D}{\partial A_{ik}}. \tag{20}$$
The $\alpha_{ik}$ are called minors or subdeterminants of order one. As in the case of finite
determinants we have that
$$D = \sum_{k=-\infty}^{\infty} A_{ik}\,\alpha_{ik},$$
$$\sum_{i=-\infty}^{\infty} A_{ij}\,\alpha_{ik} = 0 \quad \text{for } j \neq k$$
and
$$\sum_{k=-\infty}^{\infty} A_{jk}\,\alpha_{ik} = 0 \quad \text{for } j \neq i.$$
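Using these ideas, von Koch continued to expand $D$ by two rows, say $i$ and $m$,
thus obtaining the subdeterminant of order two as
$$\operatorname{adj}\begin{vmatrix} A_{ik} & A_{in} \\ A_{mk} & A_{mn} \end{vmatrix}
= \binom{i\ \ m}{k\ \ n}
= \frac{\partial^2 D}{\partial A_{ik}\,\partial A_{mn}}. \tag{21}$$
Analogously, the determinant can now be written as⁴
$$D = \sum_{k<n}\ \sum_{n=-\infty}^{\infty}
\begin{vmatrix} A_{ik} & A_{in} \\ A_{mk} & A_{mn} \end{vmatrix}
\binom{i\ \ m}{k\ \ n}.$$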
Inductively, one can continue and obtain the subdeterminant of order $r$, by rows $i_1, \ldots, i_r$
and columns $k_1, \ldots, k_r$, as
$$\operatorname{adj}\begin{vmatrix}
A_{i_1 k_1} & \cdots & A_{i_1 k_r} \\
A_{i_2 k_1} & \cdots & A_{i_2 k_r} \\
\vdots & \ddots & \vdots \\
A_{i_r k_1} & \cdots & A_{i_r k_r}
\end{vmatrix}
= \binom{i_1\ \ i_2\ \ \ldots\ \ i_r}{k_1\ \ k_2\ \ \ldots\ \ k_r}$$
and thus writing $D$ as
$$D = \sum_{k_1}\sum_{k_2}\cdots\sum_{k_r}
\begin{vmatrix}
A_{i_1 k_1} & \cdots & A_{i_1 k_r} \\
A_{i_2 k_1} & \cdots & A_{i_2 k_r} \\
\vdots & \ddots & \vdots \\
A_{i_r k_1} & \cdots & A_{i_r k_r}
\end{vmatrix}
\binom{i_1\ \ i_2\ \ \ldots\ \ i_r}{k_1\ \ k_2\ \ \ldots\ \ k_r},$$
where $k_1 < k_2 < \ldots < k_r$ with $-\infty < k_r < \infty$. Von Koch pointed out that one is not
restricted to expanding $D$ by either rows or columns; any combination
of rows and columns can be used, even infinite sets.
Von Koch's final expression for $D$ is given by
$$D = 1 + \sum_{p=-\infty}^{\infty} a_{pp}
+ \sum_{p<q} \begin{vmatrix} a_{pp} & a_{pq} \\ a_{qp} & a_{qq} \end{vmatrix}
+ \sum_{p<q<r} \begin{vmatrix} a_{pp} & a_{pq} & a_{pr} \\ a_{qp} & a_{qq} & a_{qr} \\ a_{rp} & a_{rq} & a_{rr} \end{vmatrix}
+ \cdots \tag{22}$$
where the largest summation index in each term is to range over all integers. This expression is of the utmost importance to us since it later became known as the Fredholm
determinant, which Ivar Fredholm (1866–1927) used to solve his integral equations. [2]
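For a finite matrix, the expansion (22) reduces to the classical identity that $\det(I + a)$ equals the sum of all principal minors of $a$, the empty minor counting as 1. The sketch below checks this for a small example; the 3×3 test matrix and the function names are hypothetical and serve only as an illustration, not as von Koch's own procedure.

```python
from itertools import combinations


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting (empty matrix gives 1)."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def principal_minor_sum(a):
    """1 + sum of a_pp + sum over p<q of 2x2 principal minors + ... as in (22)."""
    n = len(a)
    total = 0.0
    for order in range(n + 1):
        for idx in combinations(range(n), order):
            total += det([[a[p][q] for q in idx] for p in idx])
    return total


def det_identity_plus(a):
    """Determinant of the matrix with entries delta_pq + a_pq."""
    n = len(a)
    return det([[(1.0 if p == q else 0.0) + a[p][q] for q in range(n)] for p in range(n)])


# Hypothetical test matrix, chosen only for illustration.
example = [[0.2, -0.1, 0.3],
           [0.05, 0.1, -0.2],
           [0.4, 0.0, -0.3]]
difference = abs(principal_minor_sum(example) - det_identity_plus(example))
```

The infinite case adds the analytic question of convergence, which is what the normal-form assumption takes care of.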
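4
We use the notation
$$\begin{vmatrix} A_{ik} & A_{in} \\ A_{mk} & A_{mn} \end{vmatrix}
= \det\begin{pmatrix} A_{ik} & A_{in} \\ A_{mk} & A_{mn} \end{pmatrix}.$$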
Set-theoretic concepts are not new in mathematics. They were used by Aristotle and even
earlier. Naively one can say that as soon as you make a statement valid for a collection of
objects, you are really talking about sets. For example, the Pythagorean theorem makes a
statement about all right triangles, which of course form a set. In the beginning of the 19th
century the word class was commonly used to denote a collection of objects having the
same property, but it was not until the mid-19th century that George Boole (1815–1864) and
Georg Cantor (1845–1918) formalized the ideas and introduced notation for calculating
with these concepts.
Since Die Ausdehnungslehre by Grassmann received little attention, we will follow [3]
and consider Riemann as the one who introduced the concept of space. In his famous 1851
doctoral thesis, Grundlagen für eine allgemeine Theorie der Functionen einer veränderlichen complexen Grösse, one reads¹:
The totality of the functions forms a connected domain closed in itself
[ein zusammenhängendes in sich abgeschlossenes Gebiet], since each of these
functions can go over continuously into every other . . ..
It is obvious that Riemann understood what we today mean by a function space. In his
even more famous talk, Ueber die Hypothesen, welche der Geometrie zu Grunde liegen, he
advanced further and introduced a notion of geometry to classes of objects, even infinite
classes2 :
But there also exist manifolds in which the determination of location [die
Ortsbestimmung] requires not a finite number but either an infinite sequence
or a continuum of determinations of quantities [. . . sondern entweder eine unendliche Reihe oder eine stetige Mannigfaltigkeit von Grössenbestimmungen
erfordert]. Such a manifold, for instance, is formed by the possible determinations of a function for a given domain.
This talk was published in 1868 by Richard Dedekind (1831–1916), two years after
Riemann's untimely death, and after that the ideas introduced by Riemann slowly began
to be understood and accepted.
At this time another revolution in mathematics began, mainly due to Weierstrass and
his school, which culminated in Felix Klein's (1849–1925) Erlanger Programm. The task
was to tidy up mathematics. According to Weierstrass, mathematics lacked rigor and
relied too much on intuition and physical observations. This prompted the mathematical
community to give more rigorous proofs and motivations, as well as to make definitions
and axioms clearer.
Concerning the concept of space, one man picked up all these ideas, but in the same
way as Die Ausdehnungslehre by Grassmann was persistently ignored, another potential
revolution was ignored. This was in 1888, when Giuseppe Peano (1858–1932) published
his book Calcolo geometrico secondo l'Ausdehnungslehre di H. Grassmann preceduto dalle
operazioni della logica deduttiva. This book is interesting in several ways. Among other things,
the symbols ∪, ∩ and ∈, representing union, intersection and an element belonging to a
set, respectively, were introduced. Regarding the concept of space, we give a passage from
chapter IX where Peano defined a linear space³.
4
Cited from http://www-history.mcs.st-and.ac.uk/HistTopics/Abstract_linear_spaces.html,
2007-10-12.
5
The definition of a function, as being a mapping from an arbitrary set into an another arbitrary set,
seems to have appeared for the first time in this book. [5]
With all the ideas and concepts developed at the end of the 19th century, mathematics,
and especially functional analysis, was ready to enter the 20th century. Following [5] we
find the major steps in four fundamental papers:
Fredholm 1900 on Integral equations
Lebesgue 1902 on Integration theory
Hilbert 1906 on Spectral theory
Maurice Fréchet (1878–1973) 1906 on Metric spaces
Yet none of these papers alone was enough, and it took F. Riesz to understand the
connection between them and to create a general theory. When discussing these papers we
will not follow the chronological order, but discuss the paper by Lebesgue last, since the
work of Hilbert is based on that of Fredholm, and that of Fréchet on that of Hilbert.
Otherwise, the chain of events would be broken.
The 1900 paper by Fredholm is entitled On a new method for the solution of Dirichlet's
problem¹ and, of course, concerns the solution of Dirichlet's problem. It is easy to dismiss
this paper as only dealing with this specific problem, but the theory and results in it are
very deep. Inspired by a visit to France in 1899, where he met and worked with both Poincaré
and Jacques Hadamard (1865–1963), he returned home and succeeded in improving the
results of all his predecessors.
Integral equations had been solved before Fredholm. The first who was able to give a
complete solution to an integral equation was Niels Henrik Abel (1802–1829), concerning
a problem in mechanics. After him followed successful attempts by Vito Volterra (1860–1940),
Neumann and Poincaré. The general recipe until Fredholm was to replace the
integral with finite Riemann sums, and then pass to the limit. Consider the equation
$$\varphi(s) = f(s) + \int_0^1 K(s,t)\,f(t)\,dt \tag{1}$$
where $\varphi$ and $K$ are known, and $f$ is to be determined. Partition the interval $[0, 1]$ into
$n$ subdivisions $x_0, x_1, \ldots, x_n$ with $x_p = p/n$. Set $\varphi(x_p) = \varphi_p$, $K(x_p, x_q) = K_{pq}$ and
$f(x_p) = f_p$. Then by substituting into (1) one obtains the finite system of equations
$$f_p + \frac{1}{n}\sum_{k=0}^{n} K_{pk} f_k = \varphi_p \quad \text{for } p = 0, 1, \ldots, n. \tag{2}$$
If we fix $n$ and let $\{f_k^{(n)}\}$ be a solution of (2), then by plotting $(x_k, f_k^{(n)})$, one obtains a
polygonal solution curve. By letting $n \to \infty$ the system (2) should go over into (1) and the
polygonal curve should represent a solution curve of (1), but it is of course this limiting
process that causes problems.
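The recipe of replacing the integral by a Riemann sum can be tried out on a case with a known answer. The following is a minimal sketch, assuming the hypothetical kernel $K(s,t) = st$ and the right-hand side $4s/3$, for which $f(s) = s$ solves the integral equation on $[0,1]$; the function names are illustrative only, and the linear system is solved by plain Gaussian elimination.

```python
def solve(matrix, rhs):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - tail) / a[r][r]
    return x


def discretize_and_solve(kernel, phi, n):
    """Set up and solve f_p + (1/n) * sum_k K_pk f_k = phi_p at the nodes x_p = p/n."""
    xs = [p / n for p in range(n + 1)]
    matrix = [[(1.0 if p == k else 0.0) + kernel(xs[p], xs[k]) / n
               for k in range(n + 1)] for p in range(n + 1)]
    rhs = [phi(x) for x in xs]
    return xs, solve(matrix, rhs)


# Hypothetical test problem: K(s, t) = s*t and phi(s) = 4s/3 give the exact solution f(s) = s.
xs, fs = discretize_and_solve(lambda s, t: s * t, lambda s: 4.0 * s / 3.0, 50)
max_error = max(abs(f - x) for x, f in zip(xs, fs))
```

The polygonal curve through the points `(xs[k], fs[k])` approaches the exact solution as `n` grows, with an error of order $1/n$ for this crude Riemann sum. Justifying the limit in general is the difficulty referred to in the text.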
Fredholm's goal was to obtain a complete theory for the integral equation²
$$\varphi(s) = f(s) + \lambda\int_a^b K(s,t)\,f(t)\,dt \tag{3}$$
1
The original title is Sur une nouvelle méthode pour la résolution du problème de Dirichlet. This was a
preliminary article, and the full version, entitled Sur une classe d'équations fonctionnelles, was published in
Acta Mathematica, 1903.
2
We do not follow Fredholm's original notation here.
where $\varphi$ is a given continuous function on $[a, b]$, $K$ is bounded and piecewise continuous on
$[a, b] \times [a, b]$, $\lambda$ is a complex parameter³ and $f$ is the unknown. Though he does not explicitly
describe the methods used by Volterra and von Koch, it is evident that he is well aware
of them. Fredholm's procedure is composed of three ideas:
1.) Replacing the integral in (3) by Riemann sums and thus obtaining a system of
equations,
$$f(x_j) + \frac{\lambda(b-a)}{n}\sum_{k=1}^{n} K(x_k, x_j)\,f(x_k) = \varphi(x_j) \quad \text{for } j = 1, 2, \ldots, n. \tag{4}$$
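2.) Writing the determinant of the resulting system by use of von Koch's formula
(compare section 4.3) as
$$1 + \frac{\lambda(b-a)}{n}\sum_{k=1}^{n} K(x_k, x_k)
+ \frac{\lambda^2(b-a)^2}{2!\,n^2}\sum_{k_1,k_2}
\begin{vmatrix} K(x_{k_1}, x_{k_1}) & K(x_{k_1}, x_{k_2}) \\ K(x_{k_2}, x_{k_1}) & K(x_{k_2}, x_{k_2}) \end{vmatrix}
+ \cdots,$$
introducing the notation
$$K\binom{x_1\ \ x_2\ \ \ldots\ \ x_m}{y_1\ \ y_2\ \ \ldots\ \ y_m}
= \begin{vmatrix}
K(x_1, y_1) & K(x_1, y_2) & \cdots & K(x_1, y_m) \\
K(x_2, y_1) & K(x_2, y_2) & \cdots & K(x_2, y_m) \\
\vdots & \vdots & \ddots & \vdots \\
K(x_m, y_1) & K(x_m, y_2) & \cdots & K(x_m, y_m)
\end{vmatrix} \tag{5}$$
and letting $n \to \infty$, which formally gives the series
$$1 + \lambda\int_a^b K(x,x)\,dx
+ \frac{\lambda^2}{2!}\int_a^b\!\!\int_a^b K\binom{x_1\ \ x_2}{x_1\ \ x_2}\,dx_1\,dx_2 + \cdots \tag{6}$$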
and it remains to
3.) prove the uniform convergence of (6) in any closed and bounded subset of the
complex plane. For this it is sufficient to give a good upper bound for the determinant
(5). By a theorem of Hadamard⁴, Fredholm showed that
$$\left| K\binom{x_1\ \ x_2\ \ \ldots\ \ x_n}{y_1\ \ y_2\ \ \ldots\ \ y_n} \right| \le \sqrt{n^n}\,M^n,$$
where $M = \max_{p,q} |K_{pq}|$, from which it follows that (6) converges since
$$\sum_{n=1}^{\infty} \frac{1}{n!}\sqrt{n^n}\,M^n < \infty.$$
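The passage from the determinant of the discretized system to the limiting series can be watched numerically. The sketch below is only an illustration under stated assumptions: the kernel $K(s,t) = st$ on $[0,1]$ and the parameter value 2 (called `lam` in the code) are hypothetical choices. For this rank-one kernel all determinants of order two and higher vanish, so the limiting series reduces to $1 + \lambda\int_0^1 x^2\,dx = 1 + \lambda/3$.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def discretized_determinant(kernel, lam, a, b, n):
    """Determinant of the n x n system with entries delta_jk + lam*(b-a)/n * K(x_k, x_j)."""
    h = (b - a) / n
    xs = [a + k * h for k in range(1, n + 1)]
    matrix = [[(1.0 if j == k else 0.0) + lam * h * kernel(xs[k], xs[j])
               for k in range(n)] for j in range(n)]
    return det(matrix)


def kernel(s, t):
    """Hypothetical rank-one kernel K(s, t) = s*t (assumption for this sketch)."""
    return s * t


lam = 2.0
limit_value = 1.0 + lam / 3.0
error_40 = abs(discretized_determinant(kernel, lam, 0.0, 1.0, 40) - limit_value)
error_80 = abs(discretized_determinant(kernel, lam, 0.0, 1.0, 80) - limit_value)
```

Doubling `n` roughly halves the error here, as expected for a Riemann sum. For a general kernel, the Hadamard bound is what makes the limiting series converge for every value of the parameter.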
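Fredholm's next step was to apply Cramer's rule to the system (4) and again let
$n \to \infty$. Expanding by the first row then gives
$$K\binom{s\ \ x_1\ \ \ldots\ \ x_m}{t\ \ x_1\ \ \ldots\ \ x_m}
= K(s,t)\,K\binom{x_1\ \ \ldots\ \ x_m}{x_1\ \ \ldots\ \ x_m}
+ \cdots + (-1)^m K(s, x_m)\,K\binom{x_1\ \ x_2\ \ \ldots\ \ x_m}{t\ \ x_1\ \ \ldots\ \ x_{m-1}}. \tag{7}$$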
3
Note that in connection with spectral properties, both Fredholm and Hilbert (chapter 7) studied $I - \lambda K$
instead of $\lambda I - K$ as we do today. Hence their spectra of operators
are different from ours.
4
Hadamard's theorem states that $|\det(A)|^2 \le \prod_{i=1}^{n}\left(\sum_{j=1}^{n} |a_{ij}|^2\right)$.
(8)
and replaced each integrand by its expression (7), which gives the relation
Zb
(s, t; ) = K(s, t)()
(9)
(s, ; )()d
(10)
(11)
The conclusion from (11) is that if () 6= 0 then f (s) = (s)/() is a solution of (3).
[3]
Fredholm did not stop there. He went on and proved that
d()
=
d
Zb
(s, s; )ds
(12)
from which he could deduce that if 0 is a zero of order of (), then for a suitable choice
of , the function (s) cannot be divisible by a power of 0 higher than ( 0 )1 .
That is, if (s) = ( 0 )k 1 (s) then from (11) one has that
Zb
1 (s) + 0
(13)
which means that if (13) has no non-trivial solutions, then () 6= 0 and for = 0 , there
exists a unique solution of (3). By taking 0 = 1 and using the properties of double layer
potentials one can deduce that (13) has no non-trivial solutions5 , and hence the existence
and uniqueness for the solutions of the Dirichlet problem is proved.
Despite the startling results in this paper, the methods that Fredholm used were not
very original. They relied heavily on von Koch's theory of infinite determinants and on taking
limits of finite systems. In the revised paper of 1903, Sur une classe d'équations fonctionnelles, he introduced a method which was many years ahead of its time. [3]
We again consider the equation (3),
$$\varphi(s) = f(s) + \lambda\int_a^b K(s,t)\,f(t)\,dt,$$
5
For a domain with sufficiently smooth boundary; in Fredholm's case, three times continuously differentiable.
but this time viewed as a transformation, depending on the kernel $K$, of the unknown
function $f$ into the known function $\varphi$. If we denote this transformation by $f \mapsto S_K f$, we
have that $S_K f = \varphi$ with
Zb
SK f (s) = f (s) +
Zb
K(s, )K 0 (, t)d.
(14)
Zb
Zb
s1 . . . sm
; ds1 . . . dsm
s1 . . . sm
(15)
from which he deduced that if () = 0, then there exist an integer m such that
s1 . . . sm
;
s1 . . . sm
is not identically equal to zero. Let m be the smallest such integer, which is exactly the
order of as a zero of , then the m solutions of (13),
s s2 . . . sm
t1 t2 . . . tm
1 (s) =
s1 . . . sm
t1 . . . tm
s1 s s3 . . . sm
t1 t2 t3 . . . tm
2 (s) =
s1 . . . sm
t1 . . . tm
..
.
are linearly independent and every other solution of (13) is a linear combination of the
j :s for 1 j m. He concluded the paper by showing that for two kernels K and K 0 ,
with corresponding determinants K and K 0 , the word determinant is justified by the
fact that for the composed kernel K 00 , one has K 00 = K K 0 . [6]
The success of this paper is due not only to the solution of a classical problem, but also
to its originality. What analysis needed was the introduction of algebra and group theory.
Indeed, the idea used by Fredholm is to consider the set of all transformations (operators)
which have a non-zero determinant, and realize that they form a group under composition.
Then by using algebraic properties of groups and group actions, he could say something
about the underlying problem in analysis. Hence this paper is not only a forerunner of
functional analysis, but of all spectral theory and operator theory, in particular operator
algebras.
The work of Fredholm got immediate attention from mathematicians all over the world,
and of those, Hilbert was one of the most enthusiastic. It made him drop almost everything
that he was doing, and turn his attention to the theory of integral equations. He even
proposed at his seminar in Göttingen that Fredholm's results on integral equations could
lead to a solution of the Riemann hypothesis. Hilbert hoped that the Riemann zeta-function, which is an entire function, could be expressed as the determinant of an integral
equation with symmetric kernel, but unfortunately no one has been able to find such a
representation yet. However, it clearly expresses Hilbert's faith in and enthusiasm for the
methods invented by Fredholm.
During the years 1904–1906, Hilbert published six papers on integral equations, which
were later all put together in a single volume entitled Grundzüge einer allgemeinen Theorie
der Integralgleichungen. Of these papers, the first and fourth are of main interest to
us. From what I can see, he began by taking one step back, to make sure the methods
used prior to him were rigorous enough, and then took two steps forward. He started
by returning to the transformation of the integral equation into a finite system of equations
(compare chapter 6, equation (4)), under the restriction that the kernel is symmetric, and
then taking limits. One might ask why he bothered to do so when it had been considered
already by Volterra and Poincaré. The answer is probably that, under the assumption
of a symmetric kernel, he was able to obtain much more precise results for this special case
than he could using the previous general methods.
Let the kernel K(s, t) of the integral equation
Zb
(s) = f (s) +
(1)
n (t)2 dt = 1
and define for each continuous function x on [a, b], the Fourier coefficients
Zb
(x, n ) =
x(t)n (t)dt.
a
X 1
(x, n )(y, n )
n
n
(2)
for any two continuous functions $x$ and $y$. Note that this is a generalization of the principal
axis theorem. An interesting remark is that he showed that the right-hand side of (2) is
uniformly convergent for arbitrary continuous functions $x$ and $y$ subject only to¹
$$\int_a^b x(t)^2\,dt \le 1 \quad \text{and} \quad \int_a^b y(t)^2\,dt \le 1.$$
We have already seen how Hilbert generalized and abstracted existing concepts, but his
purpose in these papers was, on the contrary, to deal with applications; abstraction was for Hilbert a tool to solve concrete problems. He even wrote that²
...the systematic building of a general theory of integral equations for the
whole of analysis, especially for the theory of the definite integral and the theory
of the development of arbitrary functions in an infinite series, besides for the
theory of linear differential equations and analytic functions, as well as for
potential theory and calculus of variations, is of the greatest importance, and
that, the most noteworthy result is that the developability of a function in [a
series] of eigenfunctions belonging to an integral equation of the second kind is
evidently dependent on the solvability of the corresponding integral equation of
the first kind.
Before we continue we need to make a remark about eigenvalues. Both Fredholm and
Hilbert studied eigenvalues in the sense that the operator $\lambda K - I$ is not invertible, instead
of considering $K - \lambda I$ as we do now. This means that a value $\lambda$, in the sense of Fredholm
and Hilbert, is an eigenvalue if and only if $1/\lambda$ is an eigenvalue in our sense.
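The reciprocal relation between the two conventions is easy to verify in finite dimensions. The following sketch uses a hypothetical symmetric 2×2 matrix in place of the kernel (an assumption made only for illustration). Its eigenvalues in the modern sense are 1 and 3, so the values at which the Fredholm-style determinant vanishes should be 1 and 1/3.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Hypothetical symmetric matrix standing in for the kernel K.
K = [[2.0, 1.0],
     [1.0, 2.0]]


def fredholm_style(lam):
    """det(I - lam*K): zero exactly when lam is an eigenvalue in the older sense."""
    return det2([[1.0 - lam * K[0][0], -lam * K[0][1]],
                 [-lam * K[1][0], 1.0 - lam * K[1][1]]])


def modern_style(mu):
    """det(K - mu*I): zero exactly when mu is an eigenvalue in the modern sense."""
    return det2([[K[0][0] - mu, K[0][1]],
                 [K[1][0], K[1][1] - mu]])


modern_eigenvalues = [1.0, 3.0]
older_eigenvalues = [1.0 / mu for mu in modern_eigenvalues]
```

In this convention small eigenvalues in the modern sense correspond to large ones in the older sense, which is why, for Fredholm and Hilbert, eigenvalues accumulate at infinity rather than at zero.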
Hilbert went on by showing that the set of n :s is infinite unless K(x, y) is a finite linear
combination of functions of the form u(x)v(y), and that the resolvent kernel R(s, t; ) has
eigenvalues n with the corresponding eigenfunctions n /(n ). Thus one has the
relation
Zb
R(s, t; ) R(s, y; ) = ( ) R(s, ; )R(, t; )d
a
for and different from n . Finally, he proved that if a function f can be written as
Zb
f (s) =
K(s, t)g(t)dt
(3)
(f, n )n
is absolutely and uniformly convergent and one has the Parseval relation
Zb
f (s)2 ds =
(f, n )2 .
The restriction to functions of type (3) was later removed by Erhard Schmidt (1876–1959),
a student of Hilbert, in his 1905 dissertation. [5]
kpq =
bp =
Rb Rb
a a
Rb
(s)wp (s)ds,
kpq xq = bp for p = 1, 2, . . . ,
(4)
q=1
p,q
(5)
Zb
kp (s)
which means that the series u(s) = p xp kp (s) is absolutely and uniformly convergent,
and hence that u is continuous and satisfying (u, wp ) = bp xp . Now if f = u then
from (f, wp ) = xp and the completeness of {wp } it follows that f is a solution of (1) with
= 1.
After this rather standard procedure, according to [5], Hilbert ventured where no one
had ever gone before:
1. He exclusively considered sequences $x = (x_p)$, with $p = 1, 2, \ldots$, of real numbers such
that $\sum_p x_p^2 < \infty$.
2. He dropped all restrictions on the double sequence $k_{pq}$ except that $k_{pq} = k_{qp}$.
3. The center of attention was no longer solutions of (4), but the bilinear symmetric
form
$$K(x, y) = \sum_{p,q=1}^{n} k_{pq}\,x_p\,y_q, \tag{6}$$
K(s, t)(t)dt
(7)
n
X
Kpq q for p = 1, 2, . . . , n.
(8)
q=1
which he applied to (8) and thus formed the system, involving bilinear forms,
(u, f ) = (u, ) (u, K).
(10)
With these definitions, he started his study of the infinite bilinear form
K(x, y) =
kpq xp yq
(11)
p,q=1
K(;
x, y) K(K(,
x, y)) = (x, y)
in a way which will be made clear later.
After the construction of this resolvent he further generalized the principal axis theorem
to infinite quadratic forms, and finally applied the theory of infinite bilinear forms to
infinite systems of equations.
With $K(x, y)$ defined as in (11), the $n$-section of $K$ is defined as
$$K_n(x, y) = \sum_{p,q=1}^{n} k_{pq}\,x_p\,y_q$$
and also
$$(x, y)_n = \sum_{p=1}^{n} x_p\,y_p$$
3
For simplicity, suppose that the interval of integration, [a, b], is [0, 1].
with corresponding quadratic forms, K(x, x) and (x, y), and their nsections
Kn (x, x) =
n
P
p,q=1
n
P
p,q=1
x2p .
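To the form
$$(x, x)_n - \lambda K_n(x, x)$$
there is the associated determinant
$$D_n(\lambda) = \begin{vmatrix}
1 - \lambda k_{11} & -\lambda k_{12} & \cdots & -\lambda k_{1n} \\
-\lambda k_{21} & 1 - \lambda k_{22} & \cdots & -\lambda k_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
-\lambda k_{n1} & -\lambda k_{n2} & \cdots & 1 - \lambda k_{nn}
\end{vmatrix}$$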
(n)
(n)
(n)
n
(n)
(n)
X
L (x)L (y)
i
i=1
(n)
i
where $\{L_i^{(n)}(x)\}$ is an orthonormal set of eigenforms. That is, $L_i^{(n)}(x) = (l_i^{(n)}, x)$ where $l_i^{(n)}$
is a normalized eigenvector associated with the eigenvalue $\lambda_i^{(n)}$ of $K_n(x, y)$. Assume for
simplicity that these eigenvalues $\lambda_i^{(n)}$ have multiplicity one. The product⁴ of the forms
$A_n(x, y) = \sum_{p,q=1}^{n} a_{pq}\,x_p\,y_q$ and $B_n(x, y) = \sum_{p,q=1}^{n} b_{pq}\,x_p\,y_q$ is denoted $A_n(x, \cdot)B_n(\cdot, y)$ and defined as
$$\sum_{p,q,r=1}^{n} a_{pq}\,b_{qr}\,x_p\,y_r.$$
Note that this is the form associated with the matrix product $AB$ and hence we will call
it the product form. For simplicity we denote $A_n(x, \cdot)B_n(\cdot, y)$ by $A(B(x, y))$ instead.
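The claim that the product form is the bilinear form of the matrix product can be checked directly. The sketch below sums $a_{pq}\,b_{qr}\,x_p\,y_r$ over all three indices and compares the result with the bilinear form of $AB$; the small integer matrices and vectors are hypothetical test data chosen so that the comparison is exact.

```python
def bilinear(matrix, x, y):
    """Value of the bilinear form sum over p, q of m_pq * x_p * y_q."""
    n = len(x)
    return sum(matrix[p][q] * x[p] * y[q] for p in range(n) for q in range(n))


def product_form(a, b, x, y):
    """Triple sum over p, q, r of a_pq * b_qr * x_p * y_r."""
    n = len(x)
    return sum(a[p][q] * b[q][r] * x[p] * y[r]
               for p in range(n) for q in range(n) for r in range(n))


def matmul(a, b):
    """Ordinary matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[p][q] * b[q][r] for q in range(n)) for r in range(n)]
            for p in range(n)]


# Hypothetical integer test data, so that both sides agree exactly.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
x = [1, 2]
y = [3, -1]
left = product_form(A, B, x, y)
right = bilinear(matmul(A, B), x, y)
```

Since matrix multiplication is not commutative, neither is the product of forms, so the order of the factors matters.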
For forms K with the spectrum of Kn uniformly bounded, Hilbert defined the functions
(n)
Xp () as
(
(n)
0
for p
(n)
Xp () =
(n)
(n)
(n) p = 1, 2, . . . , n.
(Lp (x))2 ( p ) for > p
and
X (n) () =
n
X
Xp(n) ().
p=1
. Furthermore, if p and q are fixed, then the coefficients xp xq in X (mj ) () will make
(m )
Xpq j () converge to a continuous function of , say Xpq ().
Finally, from all the above, Hilbert defined the bilinear form
X() =
n
X
Xpq ()xp xq .
p,q=1
From the beginning, this form was only defined for the distinguished set of variables {x(k) }
and the interval [a, b] containing the spectra of Kn , but Hilbert extended the definition of
X() to all real and all variables x by taking linear combinations of the distinguished
variables and extending X by linearity. Hilbert denoted the value of X() at the distinguished variable x(k) by X()k and proved that they have left (k ()) and right (k+ ())
derivatives with respect to for all k and all real , and that they are non-decreasing
functions of . The set of :s for which there is a k such that k () 6= k+ () is countable
is called the point or discontinuous spectrum of K and its elements are the eigenvalues of
K.
For those not in the point spectrum of K, Hilbert defined the quadratic form
() =
pq ()xp xq
p,q=1
where pq is the common value of the right and left derivatives of Xpq at . For an element
h of the point spectrum of K, the quadratic eigenform belonging to h is defined as
Eh (x, x) = Eh =
+
pq
(h ) pq
(h ) xp xq ,
p,q=1
and
() =
p <
where also has right and left derivatives at every point, which are equal everywhere except
def
for those belonging to the point spectrum of K. Next %() = X() () is proved to
be a continuously differentiable function of which is used to define the spectral form of
K as
d%
def X
() =
pq ()xp xq =
= () ().
d
p,q=1
The set of real such that in every neighborhood there are points 0 with () 6 (0 ) for
all x is called the line or continuous spectrum of K. The union of the point and continuous
spectrum is called the spectrum of K. From the assumption that the spectrum of Kn is
uniformly bounded and the fact that outside the spectrum of K, the coefficients of X are
linear and those of are constant, it follows that the spectrum of K is contained in some
finite interval s. Finally, by combining many results, Hilbert was able to get the resolvent
as
K
Z
X
1
1
K(; x, x) =
Ep (x, x) 1
+
1
d(),
(12)
p
p
s
where the sum is taken over all Ep , for which p is in the point spectrum of K, and s is
its continuous spectrum. How he obtained this result is outside the scope of this work,
and I refer to [3] or [9] for a detailed discussion.
To sharpen his results, Hilbert went on to define the concept of bounded forms. These
are the forms for which there exists a non-negative number $M$ such that $|K(x, y)| < M$
whenever $(x, x) < 1$ and $(y, y) < 1$. This definition extends in a natural way to linear forms
(21 +22 + )0
From this definition it is obvious that any bounded linear form is continuous. These
definitions made it possible to extend his previous results to the case where the spectrum of the kernel
$K$ has infinity as a point of accumulation. It also made it possible to prove the spectral
radius theorem, which in this case states that the spectrum of $K$ is bounded away from
zero by $M^{-1}$, where $M$ is the smallest bound for $K$. This means that $\lambda = 0$ is not an
eigenvalue, but it can happen that the absolute value of the eigenvalues tends to infinity.
We summarize, as Hilbert ([9] p. 137, Satz 32), these results in a major theorem.
Theorem 7.1. Let K(x, x) be a bounded quadratic form in the infinitely many variables
K(;
x, x) =
kpq ()xp xq
p,q
whose coefficients are regular analytic functions for all outside the spectrum of K.
For such , the resolvent is a bounded form; it represents for all arbitrary values of the
infinitely many variables, x1 , x2 , . . ., an analytic function of .
The resolvent permits for arbitrary values of the infinite variables x1 , x2 , . . ., and sufficiently small , the power series representation
K(;
x, x) = (x, x) + K(x, x) + 2 K 2 (x, x) + .
Furthermore, for arbitrary values of the infinitely many variables and for all outside the
spectrum of K, the resolvent satisfies the partial fraction representation
Z
X
1
d()
K(;
x, x) =
Ep (x, x) 1
,
+
p
(p,)
s 1
where the sum is taken over the entire point spectrum of K, namely extending over all
eigenvalues, if necessary with the inclusion of the eigenvalue . Ep denotes the quadratic
eigenform belonging to p ; it is a bounded form for which no set of values of the variables
x1 , x2 , . . . is negative. The spectral form () is a bounded form of the infinitely many
variables x1 , x2 , . . ., and indeed represents for each of these sets of variables, a function
which is continuous with respect to . Moreover it increases with increasing inside the
continuous spectrum s except for special values of x1 , x2 , . . . but remains constant in
every interval outside of s.
In particular, the following equations are satisfied:
Z
X
(x, x) =
Ep + d()
(p,)
and
X Ep Z d()
+
K(x, x) =
.
p
(p,)
K(;
x, y) K(K(;
x, x)) = (x, y)
which is satisfied for all outside the spectrum of K.
To illustrate the concepts introduced by Hilbert in these papers, it is interesting to
compare them with the basic concepts of spectral theory in a standard textbook on functional analysis.
For example, [12] (pp. 370–371) gives the following definitions.
Let $X \neq \{0\}$ be a complex normed space and $T : D(T) \to X$ a linear operator with
domain $D(T) \subset X$. With $T$ we associate the operator
$$T_\lambda = T - \lambda I$$
where $\lambda$ is a complex number and $I$ the identity operator on $D(T)$. If $T_\lambda$ has an inverse,
we denote it by $R_\lambda$, that is,
$$R_\lambda(T) = T_\lambda^{-1} = (T - \lambda I)^{-1}$$
and call it the resolvent operator of $T$ or, simply, the resolvent of $T$. The text then goes on with
Definition (Regular value, resolvent set, spectrum). Let $X \neq \{0\}$ be a complex normed
space and $T : D(T) \to X$ a linear operator with domain $D(T) \subset X$. A regular value $\lambda$ of
$T$ is a complex number such that
(R1) $R_\lambda$ exists,
(R2) $R_\lambda$ is bounded,
(R3) $R_\lambda$ is defined on a set which is dense in $X$.
The resolvent set $\rho(T)$ of $T$ is the set of all regular values $\lambda$ of $T$. Its complement
$\sigma(T) = \mathbb{C} \setminus \rho(T)$ in the complex plane $\mathbb{C}$ is called the spectrum of $T$, and a $\lambda \in \sigma(T)$
is called a spectral value of $T$. Furthermore, the spectrum $\sigma(T)$ is partitioned into three
disjoint sets as follows.
The point or discrete spectrum $\sigma_p(T)$ is the set such that $R_\lambda(T)$ does not exist. A
$\lambda \in \sigma_p(T)$ is called an eigenvalue of $T$.
The continuous or line spectrum $\sigma_c(T)$ is the set such that $R_\lambda(T)$ exists and satisfies
(R3), but not (R2), that is, $R_\lambda(T)$ is unbounded.
The residual spectrum $\sigma_r(T)$ is the set such that $R_\lambda(T)$ exists (and may be bounded
or not) but does not satisfy (R3), that is, the domain of $R_\lambda$ is not dense in $X$.
As we can see, all concepts introduced by Hilbert are used, though in terms of operators
on Hilbert spaces instead of bilinear and quadratic forms, and the notations are still the
same almost 100 years later.
Despite the importance of the discussions above, we have not yet arrived at the most
important part concerning the future development of functional analysis. This is the
1 0,
P 22 0,...
i <1
F (a1 + 1 , a2 + 2 , . . .) = F (a1 , a2 , . . .)
(k)
(13)
(k)
lim 1 = 0,
(k)
lim 2 = 0, . . . .
(14)
Hilbert used this definition to prove several sufficient conditions for a quadratic form to
be completely continuous, of which the most important from our perspective is that the
coefficients of $K$ satisfy $\sum_{p,q=1}^{\infty} k_{pq}^2 < \infty$. He proved that $\lim_{n\to\infty} K_n(x, x) = K(x, x)$ uniformly
for any completely continuous quadratic form, where, as before, $K_n$ is the $n$-section of $K$.
He continued by showing numerous results on completely continuous forms, such as that
they attain their maximum value on closed and bounded sets, that the continuous
spectrum is empty, and that the eigenvalues have no finite point of accumulation. This led him
to a further generalization of the principal axis theorem.
Theorem 7.2. If $K$ is a completely continuous bounded form, then it can be brought into
the following representation through an orthogonal substitution:
$$K(x, x) = \sum_{j} k_j\,x_j^2$$
8
8.1
In chapter 7, the modern reader recognizes almost all aspects of what is now called a
Hilbert space, and in particular the Hilbert space $l^2$, which played an essential role in
Hilbert's investigations of bilinear and quadratic forms. It seems clear that when Hilbert
created his theories, he had Euclidean geometry in mind. This can in particular be seen
in connection with his set of distinguished variables, which were chosen such that they in
today's notation would have norm one. The same idea applies to the completely continuous
forms. In some sense they are the forms which preserve lengths and distances. It is not
by coincidence that Hilbert worked with this intuition. At the same time a new concept
emerged in mathematics in general: the concept of structure.
Until the middle of the 19th century, mathematics had been something very concrete.
The problems dealt with concerned particular objects, such as numbers, points, curves,
areas, volumes, surfaces and so on, and the manipulation of these objects had relied heavily
on the type of object under consideration. Around 1840, some mathematicians
began to see that the manipulations of these objects did not depend on the nature of the
objects, but rather on which rules could be applied, and on how those rules could be
applied to numerous different kinds of objects. However, these ideas had to wait another
50 years to mature and it was not until Cantor had created his set theory that serious
investigations could begin. By 1895 the definition of a group on an arbitrary set had been
given by Weber in his famous Lehrbuch der Algebra, which was the starting point for an
abstraction and axiomatization of algebra, and by 1920 all fundamental notions of algebra
had been defined.
In analysis there was no similar development at this time. The central concepts of limits, convergence and continuity had been defined relative to special objects such as curves,
surfaces or functions, and no one had considered how they could be generalized to arbitrary
sets. Both Fredholm and Hilbert had intentionally avoided this question by claiming that
they were interested in explicitly solving integral equations without abstracting concepts
for the purpose of abstraction alone. This was about to change when Fréchet went in the
completely opposite direction and did everything for the sole purpose of abstraction.
8.2
To understand how and why Fréchet developed his ideas we need to look at the mathematical environment in Paris around 1900. Paris was the brilliant center of science and
mathematics. The old and established mathematicians were still active and provided great
knowledge. We had Camille Jordan (1838–1922), Charles Hermite (1822–1901), of which
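two functions $\varphi(x)$ and $\psi(x)$, and every $x$ in some interval $I$, we have that
$$|\varphi(x) - \psi(x)| < \varepsilon \quad \text{and} \quad
\left| \frac{d^k \varphi}{dx^k} - \frac{d^k \psi}{dx^k} \right| < \varepsilon, \quad \text{for } k = 1, 2, \ldots, p.$$
The functions $\varphi$ and $\psi$ are then said to be in an $\varepsilon$-neighborhood of order $p$. The importance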
of this definition lies not in its applications, but the fact that it gave sufficient structure
to a set of functions to make the concepts of limits and continuity meaningful. This was
improved by the italians Giulio Ascoli (1843 1896), and Cesare Arzel`a (1847 1912) when
they tried to extend the work by Cantor on set of points, to sets of curves or functions.
In particular, they were interested in sequences of lines and their limits. This led them
to the concept of equicontinuity of families of functions, and the requirement that for a
sequence of continuous functions to have a uniformly convergent subsequence, is that the
sequence is equicontinuous and bounded. A corollary of this statement is that there is a
subsequence of an equicontinuous and bounded sequence of functions such that
Zb
lim
Zb
fn (x)dx =
n
a
lim fn (x)dx.
n
a
The difference between quasiuniform and uniform convergence is that (1) need not hold for all $N' > N$.
Fréchet on metric spaces
Fréchet began his investigations as early as 1904 with a paper which can be considered
an aperitif to his 1906 thesis Sur quelques points du calcul fonctionnel. The thesis is divided
into two parts, of which the first deals with abstraction and the second with applications.
Fréchet had big ambitions for his project. He hoped that his generalization of analysis
would include all previous work by Fredholm and Hilbert as special cases, and even the
work by Cantor on point sets. We cite from [3] his motivation for undertaking this task:
The present work is a tentative first [effort] to establish systematically
certain fundamental principles of the Functional Calculus, and then to apply
them to certain concrete examples.
Interestingly, it is this procedure that he felt he had to justify, writing further:
In proceeding thus, it happens that certain demonstrations are made more
difficult because one does without some [of the] more concrete representation[s].
But that which is lost in this way, is largely regained in dispensing with the repetition, several times, of different forms of the same reasoning. One often gains
thereby from seeing more clearly that which was essential in the demonstrations
... from the simplifications, and in the freeing [of the proofs] from that which
only depends on the particular nature of the elements considered. It is this
which we are going to try to do for the Functional Calculus and in particular
for the theory of abstract sets.
Fréchet based his work on two considerations in order to obtain maximum generality:
first, the notions of Cantor on set theory, and second, a characterization of limit. In general
a limit in his sets would not be defined explicitly, but rather be characterized by two properties,
much as group multiplication is characterized by its axioms. The class of sets for which this
concept of limit is introduced is called L, and a set E belongs to the class L if, given
any infinite sequence of elements $A_1, A_2, \ldots$ of E chosen at random, it is possible to determine
whether or not there exists a unique element A (called the limit of $\{A_n\}$ when it exists)
subject to the following conditions:
I If $A_i = A$ for $i = 1, 2, \ldots$, then the limit is A itself.
II If A is the limit of $\{A_n\} = \{A_1, A_2, \ldots\}$, then A is the limit of every subsequence
$\{A_{n_1}, A_{n_2}, \ldots\}$ of $\{A_n\}$.
In the coming discussion we will assume that all sets under consideration are of class
L. That is, all sets have a limit defined, and all theory and structure of these sets are
compatible with this limit.
We begin by giving several important definitions. The derived set of a set E, denoted
by $E'$, is the set of points which are limits of sequences belonging to E. The set E is closed if
$E' \subseteq E$ and perfect if $E' = E$. An element A of E is an interior point of E if A is not the limit of any
sequence in the complement of E. A set E is called compact if either E has finitely many
elements or every infinite subset of E has at least one limit element. If E is both
compact and closed, it is called extremal. These concepts have changed very little since
Fréchet defined them. Compact and extremal sets in the sense of Fréchet are now known as
relatively sequentially compact and sequentially compact sets, respectively.
Note that the usage of the word neighbourhood is quite different from that of today.
Finally, Fréchet made one last specialization of the set V. This time he replaced condition 3 of the definition of a neighbourhood with the condition that for any elements A,
B and C of E, we have that
\[ (A, B) \le (A, C) + (C, B). \]
The sets satisfying these three conditions are said to be of class E and the real valued
function (A, B) is called an écart on E. The reason for introducing this is that he wanted
to classify the extremal sets C of class E. Thus, in 1906, the modern definition of a metric
space was born, and it has not changed since.
In the second part of his thesis, Fréchet dealt with very concrete sets of different objects
and defined écarts on them. For example the Fréchet metric (which is used in the study
of $C^\infty$ functions)
\[ (x, y) = \sum_{p=1}^{\infty} \frac{1}{p!} \, \frac{|x_p - y_p|}{1 + |x_p - y_p|}, \]
Today this is called the maximum norm and it was well-known even in 1906 that convergence in this norm is uniform. We will not go deeper into the second part of Fréchet's
thesis since it is somewhat outside the scope of this work, and does not serve the purpose of
motivating the future development. For those interested in Fréchet I can warmly recommend
the great articles [20], [21] and [22].
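To make the écart on sequences concrete, the following minimal Python sketch evaluates a truncated version of the series $\sum_{p \ge 1} \frac{1}{p!} \frac{|x_p - y_p|}{1 + |x_p - y_p|}$ for three sample sequences and checks the triangle inequality numerically. The function name `frechet_ecart`, the truncation after 20 terms and the sample sequences are assumptions made here for illustration; they are not taken from Fréchet's thesis.

```python
import math
from math import factorial


def frechet_ecart(x, y, terms=20):
    """Truncated ecart between two sequences, each given as a function p -> x_p.

    Assumption for illustration: the infinite series is cut off after
    `terms` terms, which is harmless because the p-th term is at most 1/p!.
    """
    total = 0.0
    for p in range(1, terms + 1):
        d = abs(x(p) - y(p))
        total += d / (1.0 + d) / factorial(p)
    return total


# Three hypothetical sample sequences.
x = lambda p: 1.0 / p
y = lambda p: (-1.0) ** p
z = lambda p: float(p)

d_xy = frechet_ecart(x, y)
d_xz = frechet_ecart(x, z)
d_zy = frechet_ecart(z, y)

# The triangle inequality (A, B) <= (A, C) + (C, B), up to rounding.
triangle_ok = d_xy <= d_xz + d_zy + 1e-12
print(d_xy, d_xz + d_zy, triangle_ok)
```

Since each term is bounded by 1/p!, the écart of any two sequences is smaller than e - 1, so unbounded sequences such as `z` cause no difficulty.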
9.1
In 1908, Eliakim Hastings Moore (1862–1932) published a paper entitled On a form of
general analysis with applications to linear differential and integral equations. In this paper
he followed in the footsteps of Fréchet and had the same ambition to include all theory of
finite linear systems, infinite linear systems in infinitely many unknowns, integral equations
and the work of Hilbert as special cases in his general analysis. His ambitions failed and
this paper had almost no influence on the European mathematical community. The reason
for this failure has been described in terms of everything from political and social to
individualistic and notational. For a somewhat detailed discussion, I refer to [18]. Still,
I think that Hellinger and Toeplitz summarize it best when they say that "... solution
theory is not accomplished through such axiomatic formulation ...". Simply, at this time
there was no need for further abstractions. [3]
Schmidt had more success with his approach to the recently developed theories. His
aim was to simplify Hilbert's proofs and to generalize some of his results. This resulted in
one of his greatest successes: the introduction of geometry into what he called function
\[ \sum_{p=1}^{\infty} |z_p|^2 < \infty. \]
He introduced, for what seems to be the first time, the notation $\|z\|$ for what was later
to become the norm of z, defined by
\[ \|z\|^2 = \sum_{p=1}^{\infty} z_p \bar{z}_p. \]
\[ \sum_{p=1}^{\infty} z_p w_p \]
\[ (z, z). \]
\[ \sum_{p=1}^{\infty} z_p \bar{w}_p. \]
10
Ever since Cantor created set theory, mathematicians had been struggling to associate
numbers to sets which would, in some sense, measure the sets. Intuitively this number
should always be zero for the empty set, and grow with bigger sets. During the 1880s the
Italians Ulisse Dini (1845–1918) and Volterra were investigating the relationship between
integrability in Dirichlet's sense and in Riemann's sense. Hermann Hankel (1839–1873)
had proposed a theorem, with a proof, stating that functions continuous everywhere except on
nowhere dense sets are necessarily integrable. Dini doubted this, but he could not come up
with a counterexample. Dini's skepticism was proven right by
Volterra, who proved the existence of what we today would call a nowhere dense set with
positive outer content, from which it followed that Hankel's theorem was false.
The usefulness of measuring sets began to gain recognition, and in the early 1880s it
spread to Germany where Paul Du Bois-Reymond (1831–1889) named sets of content
zero "integrable systems of points", to distinguish them from other nowhere dense sets. In
1882, Axel Harnack (1851–1888) introduced a notion similar to that of a property
holding almost everywhere. Two functions, f and g, were said to be equal in general if for
every $\varepsilon > 0$ the set of points x such that $|f(x) - g(x)| > \varepsilon$ is discrete.
Cantor himself, in 1884, tried to define content for subsets of n-dimensional
Euclidean space, without much success. His definition relied on the assumption
that a certain multiple integral was well-defined, a fact that was not sufficiently justified
until the work of Jordan in 1892 on multiple integrals. Even with that assumption justified,
the distinction between a set and its closure was not clear enough, with the result that
the content of a disjoint union of two sets was not in general the sum of their contents,
a property which is fundamental, and in some sense even defining. After this failure,
Cantor lost interest in contents and turned his attention to other areas. Harnack, on the
other hand, did not lose interest but picked out the most promising parts of Cantor's work
and reconsidered the definition of content, and thought about what would happen if one
allowed infinite coverings by intervals of a set in n-dimensional Euclidean space.
He writes: "in a certain sense, every countable point set has the property that all its
points can be enclosed in intervals whose sum [of lengths] is arbitrarily small." That is,
Harnack seems to have been the first one who considered this property for countable sets,
\[ m\left( \bigcup_{k=1}^{\infty} A_k \right) = \sum_{k=1}^{\infty} m(A_k) \]
for disjoint (Borel) sets $A_k$, the additive property that Cantor failed to obtain when defining his
measure.
Judging from comments made by Weierstrass, he was never satisfied with the Riemann definition of an integral. In correspondence with Du Bois-Reymond, concerning
his discovery that Dirichlet's condition for integrability was not sufficient for Riemann
integrability, Weierstrass responded that Dirichlet surely had in mind another, more
general, definition than that of Riemann. Weierstrass suggested that Dirichlet had in mind
an extension of Cauchy's definition to functions with infinitely many points of discontinuity. Since Hankel had proven that the points of continuity of an integrable function form
a dense set, the partition of any interval [a, b] could always be taken such that in the
Cauchy sum, $S = \sum f(t_i)(x_i - x_{i-1})$, the $t_i$ are continuity points of f. Working in this
direction it should then be possible to extend the integral to a larger class of integrable
functions. Weierstrass himself worked out a definition of an integral which he communicated
in a letter to his friend and student, Sofia Kovalevskaya (1850–1891). The main idea of Weierstrass
is to take any interval [a, b] and in each arbitrarily small part of this interval let there
be points where the function is defined. For each of these points where the function is
defined, erect the ordinate. These ordinates need not overlap continuously and hence the
integral cannot be defined as the area filled up by these ordinates. If we let each of these
ordinates be surrounded by a rectangle whose base is $\delta$, then these rectangles overlap, and
if we consider the set of those points that are in some rectangle, it is seen that they form
a continuum. This continuum has a content $S_\delta$ which is a function of $\delta$. It can be shown
that this content decreases with decreasing $\delta$ and hence has a limit as $\delta \to 0$. Then define
\[ \int_a^b f(x)\,dx = \lim_{\delta \to 0} S_\delta. \]
This definition is justified since it coincides with the usual definition for continuous functions. However, there were other problems with this definition. As Volterra pointed out,
this definition does not make integration additive. Weierstrass had thus encountered the
same problem as Cantor when trying to define these new concepts. [8]
It was not until 1902 that all these problems were definitively solved in Lebesgue's
famous doctoral thesis, Intégrale, longueur, aire. It is here we find all the familiar notions of
integration theory, such as measures, Lebesgue measure, sets of measure zero, measurable
functions, almost everywhere and of course, the Lebesgue integral. The only thing missing
in this thesis is the fundamental theorem of calculus,
\[ \frac{d}{dx} \int_a^x f(t)\,dt = f(x) \quad \text{almost everywhere.} \tag{1} \]
He knew that there would exist a theorem like this, but at the time of his dissertation he
was not able to prove it and it took him another year before he was able to give a complete
proof. [17]
It took another few years for Lebesgue's work to mature, and for other mathematicians
to realize the importance and usefulness of the Lebesgue integral. In particular, it was
now possible to take limits under the integral sign under very general assumptions,
\[ \lim_{n\to\infty} \int f_n(x)\,dx = \int \lim_{n\to\infty} f_n(x)\,dx, \]
which was an important step in putting the theory of orthogonal series and Fourier series
on a rigorous basis.
Let e be a measurable set and f a function which takes values f(x) for $x \in e$ and 0
elsewhere. The value of the integral is called the integral of f(x) on e and is written
\[ F(e) = \int_e f(x)\,dx. \]
With this definition, we can state that if $e_1, e_2, \ldots$ are disjoint, measurable sets and f is
Lebesgue integrable on $e_1 \cup e_2 \cup \cdots$, then
\[ F(e_1 \cup e_2 \cup \cdots) = F(e_1) + F(e_2) + \cdots. \]
That is, F(e) is a countably additive set function, which is what both Weierstrass and Cantor
had been searching for. [16]
11
With all these new ideas and concepts, and with the mathematical camps of abstraction versus
problem solving, the world waited for someone to come up with a unifying theory. The
one who should have credit for this unification is F. Riesz, a Hungarian mathematician
working as a high-school teacher at the turn of the century. After he finished his thesis
in 1902 (the same year as Lebesgue), he went to Göttingen where he met Hilbert and became
good friends with Schmidt. After his stay in Göttingen he went to Paris where he made
friends with Borel and Lebesgue, so Riesz was indeed the right man to come up with a
unifying theory. Back in Hungary, he started working on functional analysis inspired by
his visits. When he wrote about concrete problems he wrote in German and published in
German periodicals, and when he wrote about abstract theories he wrote in French and
published in French periodicals. [7]
In 1906, Riesz had become sufficiently acquainted with Lebesgue's work to understand
that by combining it with Fréchet's work on abstract spaces, he could greatly improve
some results by Schmidt. He observed that in the space of continuous functions on some
interval I with metric $\max |f(x) - g(x)|$ for $x \in I$, the concept of an orthogonal system of
continuous functions could be generalized to any orthogonal system of functions, as long
as they were integrable in some sense. Since Schmidt had proven that any such system is
countable, why not use the integrability condition in the Lebesgue sense, since it was well
compatible with countability?
As an example, Riesz considered all bounded and Lebesgue integrable functions defined
on a Lebesgue measurable set E with the distance defined as
\[ \left( \int_E |f(x) - g(x)|^2\,dx \right)^{1/2}, \]
and the convention that all functions with $\int_E |f(x)|\,dx = 0$ are identified with the function
everywhere identically equal to zero on E. Thus we see the origin of the important $L^p$
space theory, where in this case p = 2. [8]
Earlier, Hilbert had studied integral equations of the form
\[ f(s) = \varphi(s) + \int_a^b K(s, t)\varphi(t)\,dt \tag{1} \]
where f and K had been assumed to be continuous. With the new theory of Lebesgue
integration, Riesz wanted to see if he could extend Hilbert's results to more general functions. His study of equation (1) led to the question of whether the
generalized Fourier coefficients of f could be determined relative to a given orthonormal system of functions, $\{\varphi_p\}$. Conversely, he was also interested in finding out under which
circumstances a given sequence of numbers, $\{a_p\}$, was the set of Fourier coefficients of
some function f relative to an orthonormal system of functions $\{\varphi_p\}$. Pierre Fatou (1878–1929)
had shown that a necessary condition for this was that the sequence be square
summable. Maybe it was also sufficient? This question had not been of very much interest before the introduction of the Lebesgue integral, since a positive answer would have
seemed very unlikely. However, Riesz (and independently Ernst Fischer (1875–1954))
was able to give a complete answer to this question by the following celebrated theorem.
Theorem 11.1 (Riesz-Fischer theorem). If $\{\varphi_p\}$ is an orthonormal system of square
Lebesgue integrable functions defined on some interval [a, b] and if $\{a_p\}$ is a
sequence of real numbers, then there exists a square Lebesgue integrable function$^{1}$ f defined
on [a, b] such that
\[ a_p = \int_a^b f(x)\varphi_p(x)\,dx \]
if and only if
\[ \sum_{p=1}^{\infty} a_p^2 < \infty. \]
The necessity of the condition in this theorem follows from Bessel's inequality and was known to be
valid for such functions. Riesz began with the classical case in which the orthonormal system is
the set of trigonometric functions and the interval [a, b] is $[0, 2\pi]$. To establish the theorem
in this case, Riesz formed a trigonometric series,
\[ \sum_{p=1}^{\infty} a_p \Phi_p(x), \]
where the $\Phi_p$ are of the form $\frac{1}{p}\cos(px)$ or $\frac{1}{p}\sin(px)$ and $\{a_p\}$ is the given sequence of
numbers, and proved that it converges uniformly to a continuous function of bounded
variation with a derivative almost everywhere. The function f is then defined to be this
derivative where it exists, and to have arbitrary values on sets of measure zero. This f
is then shown to be measurable and square Lebesgue integrable and to have the desired
Fourier coefficients $\{a_p\}$. Thus the theorem is proved when the orthonormal system is the
sequence of trigonometric functions.
To prove the general case, Riesz considered a system of infinitely many equations in
infinitely many unknowns,
\[ a_p = \sum_{q=1}^{\infty} x_q b_{pq}, \quad \text{for } p = 1, 2, \ldots, \tag{2} \]
where $\{a_p\}$ is the given square summable sequence of numbers, $x_p$ are the unknowns and
\[ b_{pq} = \int_0^{2\pi} \varphi_p(x)\psi_q(x)\,dx. \tag{3} \]
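In the last equation (3), $\{\psi_p\}$ is the orthonormal system of trigonometric functions and
$\{\varphi_p\}$ is an arbitrary orthonormal system. It was known that if
\[ \sum_{r=1}^{\infty} b_{pr} b_{qr} = \delta_{pq}, \]
then the system (2) has the solution
\[ x_q = \sum_{p=1}^{\infty} b_{pq} a_p. \tag{4} \]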
1
The fact that the function f is square Lebesgue integrable was not proved in the first version of this
theorem. [3]
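\[ \sum_{r=1}^{\infty} b_{pr} b_{qr} = \sum_{r=1}^{\infty} \int_0^{2\pi} \varphi_p(x)\psi_r(x)\,dx \int_0^{2\pi} \varphi_q(x)\psi_r(x)\,dx = \int_0^{2\pi} \varphi_p(x)\varphi_q(x)\,dx, \]
and hence the $b_{pq}$ satisfy the correct conditions, which establishes the validity of (4), and
hence the $x_p$, $p = 1, 2, \ldots$, can be considered as known.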
The special case of the trigonometric system then assured him
that there is a measurable and square Lebesgue integrable function f such that f satisfies
\[ \int_0^{2\pi} f(x)\psi_q(x)\,dx = x_q. \tag{5} \]
Substituting (5) and (3) into (2) then gives
\[ a_p = \sum_{q=1}^{\infty} \int_0^{2\pi} f(x)\psi_q(x)\,dx \int_0^{2\pi} \varphi_p(x)\psi_q(x)\,dx = \int_0^{2\pi} f(x)\varphi_p(x)\,dx, \]
where again the last equality is valid due to Fatou's theorem. Thus he had proved that $a_p$
is the pth Fourier coefficient of f with respect to the arbitrary orthonormal system $\{\varphi_p\}$.
The final adjustment that needed to be done is a change of variable to obtain the result
for any interval [a, b]. [3]
If the orthonormal system $\{\varphi_p\}$ is complete, then the coefficients $b_{pq}$ determine the
solution $\{x_p\}$ uniquely and hence f is unique up to the addition of a null function. This means that for a fixed, complete orthonormal system we have a one-to-one
correspondence between the set of measurable and square Lebesgue integrable functions
and the set of square summable sequences.
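The correspondence between functions and coefficient sequences can be illustrated numerically. The sketch below, under assumptions chosen here for illustration (the function f(x) = x on [0, 2π], the orthonormal trigonometric system truncated at frequency 50, and a simple midpoint rule for the integrals), computes the Fourier coefficients of f and compares the sum of their squares with the integral of f². It only illustrates Bessel's inequality and the approach to equality for a complete system; it is not a proof of the theorem.

```python
import math


def integrate(func, a, b, n=2000):
    """Midpoint rule on n equal subintervals (an illustrative choice)."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


def trig_system(max_freq):
    """Orthonormal trigonometric functions on [0, 2*pi] up to max_freq."""
    system = [lambda x: 1.0 / math.sqrt(2.0 * math.pi)]
    for p in range(1, max_freq + 1):
        system.append(lambda x, p=p: math.cos(p * x) / math.sqrt(math.pi))
        system.append(lambda x, p=p: math.sin(p * x) / math.sqrt(math.pi))
    return system


f = lambda x: x  # hypothetical sample function

# Fourier coefficients a_p = integral of f * phi_p over [0, 2*pi].
coeffs = [
    integrate(lambda x, phi=phi: f(x) * phi(x), 0.0, 2.0 * math.pi)
    for phi in trig_system(50)
]

energy_f = integrate(lambda x: f(x) ** 2, 0.0, 2.0 * math.pi)
energy_coeffs = sum(a * a for a in coeffs)

# Bessel: the truncated sum of squares stays below the integral of f^2.
print(energy_f, energy_coeffs, energy_f - energy_coeffs)
```

Raising the cut-off frequency shrinks the gap between the two numbers, which is the numerical shadow of the identification of a square integrable function with its square summable coefficient sequence.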
2 If $\{\varphi_p\}$ is any given orthogonal system, then for arbitrary functions h and g we have that
\[ \int_a^b h(x)g(x)\,dx = \sum_{p=1}^{\infty} \int_a^b h(x)\varphi_p(x)\,dx \int_a^b g(x)\varphi_p(x)\,dx. \]
Less than a month after this publication by Riesz, E. Fischer published essentially the
same result in the same journal. In 1904, Fischer had published some papers on the
Parseval identity for Riemann integrable functions. During this work he had come
across an unsuccessful attempt by Harnack to prove that if $S_n$ denotes the nth partial sum
of a Fourier series of an integrable function and
\[ \lim_{m,n\to\infty} \int_0^{2\pi} (S_n - S_m)^2\,dx = 0, \]
then there would exist a limit function $g(x) = \lim_{n\to\infty} S_n(x)$ in general (almost everywhere).
It was probably this that led Fischer to introduce what he called mean convergence, defined
as follows: let $\Omega$ denote the class of square Lebesgue integrable functions on [a, b], and suppose
that $f_n \in \Omega$ for $n = 1, 2, \ldots$. Then the sequence of functions $\{f_n\}$ is said to converge in
the mean if
\[ \lim_{m,n\to\infty} \int_a^b (f_n - f_m)^2\,dx = 0. \]
\[ \lim_{n\to\infty} \int_a^b (f - f_n)^2\,dx = 0. \]
\[ \int_a^b K(s, t)\varphi(t)\,dt \]
could be completely solved under the more relaxed assumptions that $f \in L^2[I]$ and $K \in L^2[I \times I]$, where I = [a, b]. It allowed Fréchet to determine the compact sets in $L^2$, and
by considering the metric
\[ (f, g) = \int_a^b \big( f(x) - g(x) \big)^2\,dx, \tag{6} \]
where two functions are identified if they differ only on a set of measure zero, Fréchet
could prove the following theorem:
Theorem 11.3. For every continuous, linear functional U defined on $L^2[a, b]$ (with the
metric (6)), there is a function $u(x) \in L^2[a, b]$ such that for every $f \in L^2[a, b]$,
\[ U(f) = \int_a^b f(x)u(x)\,dx. \]
These results are in some sense the unification of the work of Fredholm, Hilbert, Fréchet
and Lebesgue. They not only showed that two apparently different sets, $\ell^2$ and $L^2$, could
actually be completely identified; even more importantly, they showed
how problem solving led to abstraction, and how these abstractions actually included all
previous work. Hence both Hilbert and Fréchet were right when one claimed that problem
solving was the essential part, and the other that abstraction was the essential part. There
could not have been a better man to realize this than F. Riesz, who worked independently
of both the German and French schools in the beginning, and later had training in both.
3 $\left( \int_a^b (f(x) - g(x))^2\,dx \right)^{1/2}$.
11.1
where $\{a_n\}$ is a given sequence, $\{g_n\}$ a, not necessarily orthonormal, sequence of functions
and the problem is to determine f when integration is in the Lebesgue sense, he was led
to consider $L^p$ spaces and their relation to $L^q$ spaces. However, it took until 1910 before
the theory was developed enough to become commonly accepted and usable. [5]
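His main tools for completing the theory were the Hölder inequalities
\[ \sum_{i=1}^{n} |a_i b_i| \le \left( \sum_{i=1}^{n} |a_i|^p \right)^{1/p} \left( \sum_{i=1}^{n} |b_i|^q \right)^{1/q} \tag{7} \]
or
\[ \left| \int_M f(x)g(x)\,dx \right| \le \left( \int_M |f(x)|^p\,dx \right)^{1/p} \left( \int_M |g(x)|^q\,dx \right)^{1/q}, \tag{8} \]
and the Minkowski inequalities
\[ \left( \sum_{i=1}^{n} |a_i + b_i|^p \right)^{1/p} \le \left( \sum_{i=1}^{n} |a_i|^p \right)^{1/p} + \left( \sum_{i=1}^{n} |b_i|^p \right)^{1/p} \tag{9} \]
or
\[ \left( \int_M |f(x) + g(x)|^p\,dx \right)^{1/p} \le \left( \int_M |f(x)|^p\,dx \right)^{1/p} + \left( \int_M |g(x)|^p\,dx \right)^{1/p}, \tag{10} \]
where M is the region of integration, $p > 1$ and, in (7) and (8), $\frac{1}{p} + \frac{1}{q} = 1$. This allowed Riesz to define the space $L^p$ as the set
of all functions f, measurable on a set M, for which $|f|^p$ is integrable.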
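The finite-sum forms of the Hölder and Minkowski inequalities can be checked numerically. The following Python sketch draws random vectors and exponents and records the smallest slack found in either inequality. The helper names `p_norm`, `holder_gap` and `minkowski_gap`, the sampling ranges and the random seed are assumptions made here for illustration; a numerical check of this kind is evidence, not a proof.

```python
import random


def p_norm(v, p):
    """(sum |v_i|^p)^(1/p) for a finite list v."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def holder_gap(a, b, p):
    """Right side minus left side of the Hoelder inequality, with 1/p + 1/q = 1."""
    q = p / (p - 1.0)
    lhs = sum(abs(s * t) for s, t in zip(a, b))
    return p_norm(a, p) * p_norm(b, q) - lhs


def minkowski_gap(a, b, p):
    """Right side minus left side of the Minkowski inequality."""
    total = [s + t for s, t in zip(a, b)]
    return p_norm(a, p) + p_norm(b, p) - p_norm(total, p)


random.seed(0)  # fixed seed so that the run is reproducible
gaps = []
for _ in range(1000):
    n = random.randint(1, 8)
    a = [random.uniform(-5.0, 5.0) for _ in range(n)]
    b = [random.uniform(-5.0, 5.0) for _ in range(n)]
    p = random.uniform(1.1, 6.0)
    gaps.append((holder_gap(a, b, p), minkowski_gap(a, b, p)))

# Both gaps should be non-negative up to rounding error.
worst = min(min(pair) for pair in gaps)
print(worst)
```

For p = q = 2 and proportional vectors the Hölder gap vanishes, which is the familiar equality case of the Cauchy-Schwarz inequality.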
Taking the set M to be the closed interval [a, b], Riesz defined strong convergence of a
sequence of functions $\{f_n\}$ to a function f in the mean of order p as
\[ \lim_{n\to\infty} \int_a^b |f_n(t) - f(t)|^p\,dt = 0. \]
\[ \lim_{n\to\infty} \int_a^b \big( f(x) - f_n(x) \big) g(x)\,dx = 0, \tag{11} \]
and noted in passing that if $\{f_n\}$ and f are such that (11) is satisfied for every $g \in L^q$,
then $\{f_n\}$ converges weakly to f. This is the modern definition of weak convergence, which is
completely equivalent to that of Riesz. [3]
With this new machinery, he again set out to study an eigenvalue problem, which
turned out to be one of the most fruitful so far. According to [5] (p. 145–146) it is "one of
the most beautiful [papers] ever written; it is entirely geometric in language and spirit, and
so perfectly adapted to its goal that it has never been superseded and that Riesz' proofs can
still be transcribed almost verbatim." The paper is entitled Untersuchungen über Systeme
integrierbarer Funktionen and was published in Acta Mathematica in 1918.
In finite dimensional linear algebra, an operator is a linear map from a vector space
to itself, represented by a square matrix. Given an arbitrary finite dimensional linear
operator one asks: what can we do with it? In some cases the operator permits a complete
eigenvalue decomposition; it is diagonalizable. Unfortunately that is not always the case. If
the vector space is complex, then there is a basis such that the corresponding matrix is
upper triangular. Let T be a linear operator on a finite dimensional complex vector space V. A basis
of V is called a Jordan basis for T if T, with respect to this basis, has a corresponding
block diagonal matrix
\[ \begin{pmatrix} A_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & A_m \end{pmatrix}, \tag{12} \]
where each block $A_j$ has the form
\[ A_j = \begin{pmatrix} \lambda_j & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_j \end{pmatrix}. \]
assumptions one had to impose on the operator in order to get good properties, like
the Jordan normal form for the finite dimensional case. For simplicity he considered the
set of continuous functions on the interval [a, b], but he claims that the theory is easily
generalizable ([15] p. 71, cited from [4]):
The restriction to continuous functions made in this paper is not essential.
The reader familiar with the more recent investigations on various function
spaces will recognize immediately the general applicability of the method; he will
also notice that certain among those, such as the square integrable functions
and Hilbert space of infinitely many dimensions, still admit simplifications,
whereas the seemingly simpler case treated here may be regarded as a test case
for the general applicability.
To start his investigations, he began with a few definitions ([15] p. 72). The set of all
continuous functions on the interval [a, b] is called a function space and the norm of f,
denoted $\|f\|$, is the maximum of |f(x)|. Hence the norm is in general positive, and zero
only when f is identically zero. Furthermore we have that
\[ \|cf(x)\| = |c|\,\|f(x)\|; \qquad \|f_1 + f_2\| \le \|f_1\| + \|f_2\|. \]
By the distance between $f_1$ and $f_2$ we mean the norm $\|f_1 - f_2\| = \|f_2 - f_1\|$. Convergence
of a sequence of functions $\{f_n\}$ to a limit function f is then understood as $\|f_n - f\| \to 0$
when $n \to \infty$. If f, $f_1$, $f_2$ are in this function space then so are cf and $f_1 + f_2$, and if $\{f_n\}$
is a convergent sequence in this space, then the limit function f is also in this space. That
is, this space of functions is a normed space which is complete with respect to the topology
of strong convergence, exactly as defined by Stefan Banach (1892–1945) four years later.
A transformation T of an element f in this space to a uniquely determined element T[f]
in this space is called a linear transformation if it is distributive and bounded, i.e. if for
all f, $f_1$, $f_2$ and every constant c we have
\[ T[cf] = cT[f]; \qquad T[f_1 + f_2] = T[f_1] + T[f_2]; \qquad \|T[f]\| \le M\|f\| \]
for some constant M.
The reason for introducing all these notations and definitions is to be able to treat the
eigenvalue problem
\[ \varphi(x) - \lambda K\big(\varphi(x)\big) = f(x), \tag{13} \]
where f is known, $\varphi$ is the unknown and K is a symmetric (bounded) linear transformation
in $L^2$. Fredholm and Hilbert studied this equation under the assumption that the functions
involved were continuous on the interval [a, b]. With the new theory of integration, Riesz
wanted to improve the results by Fredholm and Hilbert by relaxing the assumptions on
the functions and making (13) solvable for a larger class of functions. It turned out that the
most successful way to study equation (13) was to let the involved functions be of class $L^2$,
and that is why Riesz restricted himself to $L^2$ from here on, but the theory is applicable
in $L^p$ for any p > 1.
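A finite dimensional stand-in may help to see why small values of the parameter cause no difficulty. Assuming equation (13) is read as $\varphi - \lambda K(\varphi) = f$, and replacing K by a symmetric matrix, the series $\varphi = \sum_{k \ge 0} (\lambda K)^k f$ converges whenever $|\lambda| \, \|K\| < 1$ and solves the equation. The Python sketch below implements this Neumann series for a 2 × 2 example; the matrix, the right-hand side, the value of the parameter and the function names are assumptions chosen here for illustration, and the sketch is not a reconstruction of Riesz's own argument.

```python
def mat_vec(K, v):
    """Apply the matrix K (a list of rows) to the vector v."""
    return [sum(k_ij * v_j for k_ij, v_j in zip(row, v)) for row in K]


def neumann_solve(K, f, lam, terms=200):
    """Approximate the solution of phi - lam * K phi = f by a Neumann series.

    The partial sum f + (lam K) f + (lam K)^2 f + ... is accumulated term
    by term; it converges when |lam| times the operator norm of K is < 1.
    """
    phi = list(f)
    term = list(f)
    for _ in range(terms):
        term = [lam * t for t in mat_vec(K, term)]
        phi = [p + t for p, t in zip(phi, term)]
    return phi


# Hypothetical symmetric example: eigenvalues 3 and 1, so the norm of K is 3.
K = [[2.0, 1.0], [1.0, 2.0]]
f = [1.0, 0.0]
lam = 0.2  # |lam| < 1/3, inside the region of convergence

phi = neumann_solve(K, f, lam)

# Residual of the equation phi - lam * K phi - f, which should be near zero.
residual = [p - lam * kp - fi for p, kp, fi in zip(phi, mat_vec(K, phi), f)]
print(phi, residual)
```

For parameter values with $|\lambda| \, \|K\| \ge 1$ the series may diverge, and this is exactly the range in which the eigenvalues of K have to be taken into account.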
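For the parameter $\lambda$, Riesz proved the spectral radius theorem saying that if $|\lambda| < \frac{1}{\|K\|}$
then (13) has a solution which is unique up to a null function. This is proved by showing
that the transformation $T = E - \lambda K$, where E is the identity transformation on $L^2$, is
invertible. Furthermore he continued to show that if $K(f(x))$ is real whenever f(x) is
real, then for at least one of the two integrals
\[ \int_a^b \left| \varphi(x) \pm \frac{1}{\|K\|} K\big(\varphi(x)\big) \right|^2 dx \]
there is a sequence of functions $\{\varphi_n\}$ with
\[ \int_a^b |\varphi_n(x)|^2\,dx = 1 \quad \text{for } n = 1, 2, \ldots, \tag{14} \]
such that
\[ \lim_{n\to\infty} \int_a^b \left| \varphi_n(x) - \frac{1}{\|K\|} K\big(\varphi_n(x)\big) \right|^2 dx = 0. \tag{15} \]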
Riesz is hence faced with the problem of determining which properties one has
to impose on K in order for the equation
\[ \int_a^b \left| \varphi(x) - \frac{1}{\|K\|} K\big(\varphi(x)\big) \right|^2 dx = 0 \tag{16} \]
to have non-trivial solutions. From the boundedness of K it follows that if $\{f_n\}$ converges
strongly to f then $\{K(f_n)\}$ converges strongly to K(f). From (15), Riesz could conclude
that there is a subsequence $\{\varphi_{n_j}(x)\}$ of $\{\varphi_n(x)\}$ with $\{\varphi_{n_j}\}$ converging weakly to $\varphi_0$. Its
weak limit would then satisfy (16). The problem is that every such
subsequence $\{\varphi_{n_j}\}$ could converge to a null function, which would result in only trivial
solutions. However, that situation will not occur if K is assumed to be completely continuous, which was Riesz' motivation for introducing the concept of completely continuous
operators. If K is assumed to be completely continuous, then by the inequality
\[ \|K\| \left( \int_a^b |\varphi_{n_j}(x)|^2\,dx \right)^{1/2} \le \left( \int_a^b \big| K\big(\varphi_{n_j}(x)\big) \big|^2 dx \right)^{1/2} + \|K\| \left( \int_a^b \left| \varphi_{n_j}(x) - \frac{1}{\|K\|} K\big(\varphi_{n_j}(x)\big) \right|^2 dx \right)^{1/2}, \]
together with (14) and (15), we obtain
\[ \lim_{j\to\infty} \int_a^b \big| K\big(\varphi_{n_j}(x)\big) \big|^2 dx = \|K\|^2. \]
Since the sequence $\{K(\varphi_{n_j})\}$ converges strongly to some $K(\varphi_0)$ we have that
\[ \lim_{j\to\infty} \int_a^b |K(\varphi_{n_j})|^2\,dx = \int_a^b \big| K\big(\varphi_0(x)\big) \big|^2 dx = \|K\|^2, \]
and, by the boundedness of K,
\[ \|K\|^2 = \int_a^b \big| K\big(\varphi_0(x)\big) \big|^2 dx \le \|K\|^2 \int_a^b |\varphi_0(x)|^2\,dx, \]
so that
\[ \int_a^b |\varphi_0(x)|^2\,dx \ge 1, \]
which means that $\varphi_0(x)$ is not a null function. Thus it is proved that (16) has non-trivial
solutions, and a sufficient condition is that K is completely continuous. [3]
Riesz continued to show that this method is applicable to any $\lambda$ for which there is a
sequence of functions $\{\varphi_n\}$ satisfying (14) and (15). These $\lambda$ are called the eigenvalues
of K and the non-trivial solutions are called eigenfunctions. Phrased in modern language,
Riesz had proved that the continuous spectrum of a real symmetric compact operator in
$L^2$ is empty. He closed the discussion about completely continuous operators by proving
the Hilbert decomposition theorem for real symmetric completely continuous operators,
which states that
\[ K\big(f(x)\big) = \sum_{i=1}^{\infty} \frac{1}{\lambda_i} K_i\big(f(x)\big), \]
where the $K_i$ are certain transformations similar to projections and the sum is taken over
all eigenvalues. Hence with the completely continuous operators we have an analogy with
the finite dimensional case and the Jordan normal form.
This paper is without doubt one of the most significant in the history of functional
analysis. It finally settled the analogy with finite dimensional linear algebra and introduced
or developed almost every important concept that Banach axiomatized four years later.
There are even more astonishing features of this paper that I have not dealt with here,
among others the introduction of the adjoint operator $T^*$ of T, which is used to study
inverses of operators. For a more detailed discussion, I refer the reader to [3] or [4] and
for a thorough investigation of spectral theory in particular, see [19].
From this point on, the development of functional analysis was explosive. Between
1920 and 1932, both Banach and Hans Hahn (1879–1934) published their books on the
subject, which contained the major theorems of functional analysis: the Hahn–Banach
theorem, the uniform boundedness theorem and the open mapping theorem. With the
complete axiomatization of Banach and Hilbert spaces along with the new rigor of quantum
mechanics, many prominent mathematicians turned their attention to functional analysis
and developed the theory, and the physicists found use for it. Thus the establishment
was complete and there could be no questions about the usefulness of this new theory.
Because of the significance of Fredholm's work on the Dirichlet problem we will give it a
more careful investigation, following [16]. In the Dirichlet problem one seeks a harmonic
function which is continuous on a domain and reduces to a given function on the boundary.
The Neumann problem is similar, but instead of prescribing the value of the solution on
the boundary of the domain, one prescribes the value of its normal derivative. We will
limit ourselves to a domain in the plane which is bounded by a simple closed curve C
with continuous curvature and parametrized by arc length. We will refer to an interior
or exterior problem depending on whether the domain under consideration is the interior $D_i$ or the
exterior $D_e$ of C.
A.1
In the interior problem we seek a function u(P ) which is harmonic in Di and whose limit
when the point P tends to a point on C is equal to a given continuous function g(s), that
is
ui (s) = g(s).
(17)
Following the classical method by Neumann we try to find a harmonic function u of the
form
$$u(P) = \int_C \mu(t)\,\frac{\partial}{\partial n_t}\log\frac{1}{r_{Pt}}\,dt = \int_C \mu(t)\,\frac{\cos(r_{Pt}, n_t)}{r_{Pt}}\,dt, \tag{18}$$
that is, as the potential of a double layer $\mu(t)$ distributed over C. Here $r_{Pt}$ is the distance from
the point P to the point t on C, and $n_t$ is the interior normal at the point t of C.
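As a numerical illustration of (18), the following is a minimal sketch, assuming that C is the unit circle and that the double layer has constant density one. The function name `double_layer_unit_circle` and the sample points are hypothetical and chosen only for this example. For this special case the integral equals $2\pi$ when P lies inside C and 0 when P lies outside, which already shows the discontinuity of the potential across C.

```python
import math

def double_layer_unit_circle(px, py, n=2000):
    """Midpoint-rule approximation of (18) for density 1 on the unit circle.

    Uses cos(r_Pt, n_t) / r_Pt = (P - t) . n_t / |P - t|^2,
    with n_t the interior unit normal at t.
    """
    h = 2.0 * math.pi / n                      # arc-length step (radius 1)
    total = 0.0
    for j in range(n):
        theta = (j + 0.5) * h
        tx, ty = math.cos(theta), math.sin(theta)   # point t on C
        nx, ny = -tx, -ty                           # interior normal at t
        dx, dy = px - tx, py - ty                   # vector from t to P
        total += (dx * nx + dy * ny) / (dx * dx + dy * dy) * h
    return total

inside = double_layer_unit_circle(0.3, -0.2)    # P in the interior
outside = double_layer_unit_circle(1.5, 0.7)    # P in the exterior
print(inside, outside)
```

The midpoint rule is used only because the integrand is smooth and periodic when P is not on C; any other quadrature would serve equally well.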
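When this double layer $\mu$ is continuous it is known that the potential u is harmonic
in $D_i$ and $D_e$, but discontinuous when we cross C. The interior and exterior limits are
then related to the value u(s) of the potential on C by the relations
$$u_i(s) = u(s) + \pi\mu(s), \qquad u_e(s) = u(s) - \pi\mu(s). \tag{19}$$
By (17), the interior problem thus leads to the equation
$$\frac{1}{\pi}g(s) = \mu(s) + \int_C K(s,t)\mu(t)\,dt, \tag{20}$$
where
$$K(s,t) = \frac{1}{\pi}\frac{\partial}{\partial n_t}\log\frac{1}{r_{st}} = \frac{1}{\pi}\frac{\cos(r_{st}, n_t)}{r_{st}}.$$
The kernel K(s, t) is continuous not only for $s \neq t$, but also on the diagonal s = t. If we
denote the rectangular coordinates of the point s on C by x(s) and y(s), these functions
are twice continuously differentiable, and one finds that
$$\lim_{s,t \to s_0} K(s,t) = \frac{1}{2\pi}k(s_0),$$
where $k(s_0)$ is the curvature of C at the point $s_0$. Hence we are allowed to apply the
Fredholm theory.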
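To make the use of the integral equation concrete, here is a minimal numerical sketch. It rests on several assumptions that are not in the text: equation (20) is read as $\frac{1}{\pi}g(s) = \mu(s) + \int_C K(s,t)\mu(t)\,dt$ with $K = \frac{1}{\pi}\cos(r_{st}, n_t)/r_{st}$ and diagonal value $k/(2\pi)$; C is taken to be an ellipse with semi-axes 2 and 1, parametrized by angle rather than arc length (the arc-length element enters as a quadrature weight); and the integral is replaced by the trapezoidal rule. The names `solve_linear` and `interior_dirichlet` and the test function $x^2 - y^2 + 3x$ are hypothetical.

```python
import math

def solve_linear(A, b):
    """Gaussian elimination with partial pivoting (pure Python)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def interior_dirichlet(g, a=2.0, b=1.0, n=64):
    """Discretize (20) on the ellipse (a cos th, b sin th); return P -> u(P)."""
    h = 2.0 * math.pi / n
    th = [j * h for j in range(n)]
    pts = [(a * math.cos(t), b * math.sin(t)) for t in th]
    speed = [math.hypot(a * math.sin(t), b * math.cos(t)) for t in th]
    normal = [(-b * math.cos(t) / w, -a * math.sin(t) / w)   # interior normal
              for t, w in zip(th, speed)]
    curv = [a * b / w ** 3 for w in speed]                   # curvature k

    def kernel(i, j):
        if i == j:
            return curv[j] / (2.0 * math.pi)                 # diagonal limit
        dx, dy = pts[i][0] - pts[j][0], pts[i][1] - pts[j][1]
        return (dx * normal[j][0] + dy * normal[j][1]) / (math.pi * (dx * dx + dy * dy))

    A = [[(1.0 if i == j else 0.0) + kernel(i, j) * speed[j] * h
          for j in range(n)] for i in range(n)]
    rhs = [g(x, y) / math.pi for x, y in pts]
    mu = solve_linear(A, rhs)                                # density at the nodes

    def u(px, py):
        """Double layer potential (18) with the computed density."""
        total = 0.0
        for (tx, ty), (nx, ny), w, m in zip(pts, normal, speed, mu):
            dx, dy = px - tx, py - ty
            total += m * (dx * nx + dy * ny) / (dx * dx + dy * dy) * w * h
        return total

    return u

harmonic = lambda x, y: x * x - y * y + 3.0 * x   # a harmonic test function
u = interior_dirichlet(harmonic)
error = abs(u(0.5, 0.2) - harmonic(0.5, 0.2))
print(error)
```

The boundary data are taken from a known harmonic function so that the computed potential can be compared with the exact solution at an interior point. Replacing the integral by a quadrature sum is a standard numerical device and is not part of Fredholm's argument.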
From the Fredholm theory it follows that either the non-homogeneous equation (20) has
a continuous solution $\mu(s)$ for any continuous function g(s), or the homogeneous equation
$$v(s) + \int_C K(s,t)v(t)\,dt = 0 \tag{21}$$
has a continuous solution $v(s) \not\equiv 0$. If we study the latter case, we see that it is not
possible. From (17), (20) and (21) it follows that for the potential v(P) corresponding to
the double layer v(s) we have $v_i(s) \equiv 0$, which implies $v(P) \equiv 0$ in $D_i$, since a harmonic
function attains its extremal values on the boundary. Hence we also have
=
n i
De
A.2
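Analogously with the interior problem, the exterior problem leads to the equation
$$\frac{1}{\pi}g(s) = -\mu(s) + \int_C K(s,t)\mu(t)\,dt \tag{22}$$
for the double layer $\mu$. For the corresponding homogeneous equation,
$$\mu(s) - \int_C K(s,t)\mu(t)\,dt = 0,$$
one finds that the solutions are necessarily constant. However, they need not be zero. Hence the
number of linearly independent solutions is equal to one. The same is true for the adjoint
equation
$$\varrho(s) - \int_C K(t,s)\varrho(t)\,dt = 0.$$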
Let $\varrho_0(s)$ be a solution of the adjoint equation such that all other solutions are multiples
of $\varrho_0(s)$. Then it is necessary and sufficient for (22) to have a solution that
$$\int_C g(s)\varrho_0(s)\,ds = 0.$$
Now determine a constant c such that $g_1(s) = g(s) - c$ is orthogonal to $\varrho_0(s)$, and denote
by $u_1$ the potential which corresponds, through (22), to $g_1(s)$. Then $u = u_1 + c$ is a solution
of the exterior problem, and hence we have
Theorem A.2. The exterior Dirichlet problem has a solution for every continuous function g(s) given on the boundary.
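The statement that nonzero constants solve the homogeneous exterior equation amounts to $\int_C K(s,t)\,dt = 1$ for every s on C. The following is a minimal numerical check of this identity, assuming the normalisation $K(s,t) = \frac{1}{\pi}\cos(r_{st}, n_t)/r_{st}$ and, for illustration only, an ellipse with semi-axes 2 and 1 parametrized by angle. The function name `kernel_row_integral` is hypothetical.

```python
import math

def kernel_row_integral(s, a=2.0, b=1.0, n=200):
    """Approximate int_C K(s, t) dt on the ellipse (a cos t, b sin t).

    The quadrature nodes are shifted by half a step relative to s,
    so the kernel is never evaluated on the diagonal t = s.
    """
    h = 2.0 * math.pi / n
    sx, sy = a * math.cos(s), b * math.sin(s)
    total = 0.0
    for j in range(n):
        t = s + (j + 0.5) * h
        tx, ty = a * math.cos(t), b * math.sin(t)
        w = math.hypot(a * math.sin(t), b * math.cos(t))        # arc-length element
        nx, ny = -b * math.cos(t) / w, -a * math.sin(t) / w     # interior normal at t
        dx, dy = sx - tx, sy - ty
        total += (dx * nx + dy * ny) / (math.pi * (dx * dx + dy * dy)) * w * h
    return total

row = kernel_row_integral(0.7)
print(row)
```

With the integral equal to one, any constant $\mu \equiv c$ satisfies $\mu(s) - \int_C K(s,t)\mu(t)\,dt = c - c = 0$.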
A.3
The methods used to solve the Dirichlet problem can also be applied to the Neumann
problem, but instead of considering solutions of type (18) we seek a single layer potential,
$$u(P) = \int_C \varrho(t)\log\frac{1}{r_{Pt}}\,dt.$$
This potential is harmonic in both $D_i$ and $D_e$ and even continuous on C, but its normal
derivatives are discontinuous on C. Hence we have the relations
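$$u_i = u_e$$
and
$$\left(\frac{\partial u}{\partial n_s}\right)_i + \pi\varrho(s) = \left(\frac{\partial u}{\partial n_s}\right)_e - \pi\varrho(s) = \int_C \varrho(t)\,\frac{\partial}{\partial n_s}\log\frac{1}{r_{st}}\,dt.$$
If h(s) denotes the prescribed value of the normal derivative on C, the interior Neumann
problem thus leads to the equation
$$\frac{1}{\pi}h(s) = -\varrho(s) + \int_C \varrho(t)K(t,s)\,dt, \tag{23}$$
where the kernel K(t, s) is the adjoint of that encountered when studying the Dirichlet
problem. Thus using the same argument as for the Dirichlet problem, we can conclude
that (23) has a solution if and only if h(s) is orthogonal to 1.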
Theorem A.3. The interior Neumann problem has a solution for every continuous function h(s) such that
$$\int_C h(s)\,ds = 0.$$
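The condition in Theorem A.3 can be checked by hand in the simplest case. The following worked example is a sketch under stated assumptions: C is the unit circle, the kernel is normalised as $K = \frac{1}{\pi}\frac{\partial}{\partial n}\log\frac{1}{r}$ so that it is the constant $\frac{1}{2\pi}$ on the circle, and (23) is read as $\frac{1}{\pi}h = -\varrho + \int_C \varrho K$. The constant c is introduced here only for illustration.

```latex
% Unit circle C (length 2\pi), where K(t,s) = \frac{1}{2\pi} for all s, t:
\frac{1}{\pi} h(s) = -\varrho(s) + \frac{1}{2\pi}\int_C \varrho(t)\,dt .
% Integrating over C shows that the condition is necessary:
\frac{1}{\pi}\int_C h(s)\,ds = -\int_C \varrho(s)\,ds + \int_C \varrho(t)\,dt = 0 .
% Conversely, if \int_C h\,ds = 0, then for any constant c
\varrho(s) = -\frac{1}{\pi} h(s) + c
% is a solution, since \frac{1}{2\pi}\int_C \varrho(t)\,dt = c and hence
-\varrho(s) + \frac{1}{2\pi}\int_C \varrho(t)\,dt = \frac{1}{\pi} h(s) - c + c = \frac{1}{\pi} h(s) .
```

The free constant c reflects the one-dimensional solution space of the homogeneous equation.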
A.4
The exterior Neumann problem leads by exactly the same arguments to the equation
$$\frac{1}{\pi}h(s) = \varrho(s) + \int_C \varrho(t)K(t,s)\,dt. \tag{24}$$
The corresponding homogeneous equation has no nontrivial solutions, since the adjoint homogeneous
equation (21) has none, as we have seen in the Dirichlet problem. Thus we have
Theorem A.4. The exterior Neumann problem has a solution for every given continuous
function h(s).
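For comparison with the interior case, the same special case can be worked out for (24). This is again only a sketch, assuming the unit circle, the constant kernel $\frac{1}{2\pi}$ and the reading $\frac{1}{\pi}h = \varrho + \int_C \varrho K$ of (24). No condition on h appears, in agreement with Theorem A.4.

```latex
% Unit circle C (length 2\pi), where K(t,s) = \frac{1}{2\pi} for all s, t:
\frac{1}{\pi} h(s) = \varrho(s) + \frac{1}{2\pi}\int_C \varrho(t)\,dt .
% Integrating over C determines the total mass of the layer:
\frac{1}{\pi}\int_C h(s)\,ds = 2\int_C \varrho(t)\,dt
\quad\Longrightarrow\quad
\int_C \varrho(t)\,dt = \frac{1}{2\pi}\int_C h(s)\,ds .
% Substituting back gives the unique solution, for every continuous h:
\varrho(s) = \frac{1}{\pi} h(s) - \frac{1}{4\pi^2}\int_C h(t)\,dt .
```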
References
[1] Axler, Sheldon. Linear Algebra Done Right. Springer Science+Business Media, Inc,
New York, 1996.
[2] Bernkopf, Michael. A history of infinite matrices. Archive for History of Exact
Sciences, 4(4):308-358, 1968.
[3] Bernkopf, Michael. The development of function spaces with particular reference
to their origins in integral equation theory. Archive for History of Exact Sciences,
3(1):1-96, 1975.
[4] Birkhoff, Garrett and Kreyszig, Erwin. The Establishment of Functional Analysis.
Historia Mathematica, 11:258-321, 1984.
[5] Dieudonné, Jean. History of Functional Analysis. North-Holland, Amsterdam, 1981.
[6] Fredholm, Ivar. Sur une classe d'équations fonctionnelles. Acta Mathematica,
27(1):365-390, 1903.
[7] Gray, J. D. The shaping of the Riesz representation theorem: A chapter in the history
of analysis. Archive for History of Exact Sciences, 31(2):127-187, 1984.
[8] Hawkins, Thomas. Lebesgue's Theory of Integration: Its Origins and Development.
The University of Wisconsin Press, Madison, Milwaukee, London, 1970.
[9] Hilbert, David. Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen.
B.G. Teubner, Leipzig und Berlin, 1912.
[10] Johansson, Bo Göran. Matematikens historia. Studentlitteratur AB, Lund, 2004.
[11] Katz, Victor. The History of Mathematics: An Introduction. Addison-Wesley, Reading, 1998.
[12] Kreyszig, Erwin. Introductory Functional Analysis with Applications. John Wiley
and Sons, New York, 1989.
[13] Luciano, Erika. At the Origins of Functional Analysis: G. Peano and M. Gramegna on
Ordinary Differential Equations. Revue d'histoire des mathématiques, 12(1):35-79,
2006.
[14] Monna, A.F. Functional Analysis in Historical Perspective. Oosthoek Publishing
Company, Utrecht, 1973.
[20] Taylor, Angus E. A study of Maurice Fréchet: I. His early work on point set theory
and the theory of functionals. Archive for History of Exact Sciences, 27(3):233-295,
1982.
[21] Taylor, Angus E. A study of Maurice Fréchet: II. Mainly about his work on
general topology, 1909-1928. Archive for History of Exact Sciences, 34(4):279-380,
1985.
[22] Taylor, Angus E. A study of Maurice Fréchet: III. Fréchet as analyst, 1909-1930.
Archive for History of Exact Sciences, 37(1):25-76, 1987.
[23] Vretblad, Anders. Fourier Analysis and Its Applications. Springer-Verlag, New York,
2005.