ODE
GABRIEL NAGY
Mathematics Department,
Michigan State University,
East Lansing, MI, 48824.
Contents
2.2.4. Exercises 96
2.3. Homogeneous Constant Coefficients Equations 97
2.3.1. The Roots of the Characteristic Polynomial 97
2.3.2. Real Solutions for Complex Roots 101
2.3.3. Constructive Proof of Theorem 2.3.2 103
2.3.4. Exercises 106
2.4. Euler Equidimensional Equation 107
2.4.1. The Roots of the Indicial Polynomial 107
2.4.2. Real Solutions for Complex Roots 110
2.4.3. Transformation to Constant Coefficients 112
2.4.4. Exercises 113
2.5. Nonhomogeneous Equations 114
2.5.1. The General Solution Formula 114
2.5.2. The Undetermined Coefficients Method 115
2.5.3. The Variation of Parameters Method 119
2.5.4. Exercises 123
2.6. Applications 124
2.6.1. Review of Constant Coefficient Equations 124
2.6.2. Undamped Mechanical Oscillations 125
2.6.3. Damped Mechanical Oscillations 127
2.6.4. Electrical Oscillations 129
2.6.5. Exercises 132
We start our study of differential equations in the same way the pioneers in this field did. We show particular techniques to solve particular types of first order differential equations. The techniques were developed in the eighteenth and nineteenth centuries, and the equations include linear equations, separable equations, Euler homogeneous equations, and exact equations. Soon this way of studying differential equations reached a dead end: most differential equations cannot be solved by any of the techniques presented in the first sections of this chapter. People then tried something different. Instead of solving the equations, they tried to show whether an equation has solutions or not, and what properties such solutions may have. This is less information than obtaining the solution, but it is still valuable information. The results of these efforts are shown in the last sections of this chapter. We present theorems describing the existence and uniqueness of solutions to a wide class of first order differential equations.
Figure: Slope field and solutions of the equation $y' = 2\cos(t)\cos(y)$ in the ty-plane.
Remark: This is a second order (in time and space) partial differential equation (PDE).

The equations in examples (a) and (b) are called ordinary differential equations (ODE), because the unknown function depends on a single independent variable, t. The equations in examples (c) and (d) are called partial differential equations (PDE), because the unknown function depends on two or more independent variables, t, x, y, and z, and their partial derivatives appear in the equations.
The order of a differential equation is the highest derivative order that appears in the equation. Newton's equation in example (a) is second order, the time decay equation in example (b) is first order, the wave equation in example (d) is second order in time and space variables, and the heat equation in example (c) is first order in time and second order in space variables.
1.1.2. Linear Differential Equations. We start with a precise definition of a first order ordinary differential equation. Then we introduce a particular type of first order equations: linear equations.

Definition 1.1.1. A first order ODE on the unknown y is
$$y'(t) = f(t, y(t)), \qquad (1.1.1)$$
where f is given and $y' = \dfrac{dy}{dt}$. The equation is linear iff the source function f is linear in its second argument,
$$y' = a(t)\,y + b(t). \qquad (1.1.2)$$
The linear equation has constant coefficients iff both a and b above are constants. Otherwise the equation has variable coefficients.
There are different sign conventions for Eq. (1.1.2) in the literature. For example, Boyce-DiPrima [3] writes it as $y' = -a\,y + b$. The sign choice in front of the function a is a matter of taste. Some people like the negative sign, because later on, when they write the equation as $y' + a\,y = b$, they get a plus sign on the left-hand side. In any case, we stick here to the convention $y' = a\,y + b$.
Example 1.1.2:
(a) An example of a first order linear ODE is the equation
$$y' = 2\,y + 3.$$
On the right-hand side we have the function $f(t,y) = 2y + 3$, where we can see that $a(t) = 2$ and $b(t) = 3$. Since these coefficients do not depend on t, this is a constant coefficients equation.
(b) Another example of a first order linear ODE is the equation
$$y' = -\frac{2}{t}\,y + 4t.$$
In this case, the right-hand side is given by the function $f(t,y) = -2y/t + 4t$, where $a(t) = -2/t$ and $b(t) = 4t$. Since the coefficients are nonconstant functions of t, this is a variable coefficients equation.
(c) The equation $y' = -\dfrac{2}{ty} + 4t$ is nonlinear.
We denote by $y : D \subset \mathbb{R} \to \mathbb{R}$ a real-valued function y defined on a domain D. Such a function is a solution of the differential equation (1.1.1) iff the equation is satisfied for all values of the independent variable t on the domain D.
Example 1.1.3: Show that $y(t) = e^{2t} - \dfrac{3}{2}$ is a solution of the equation
$$y' = 2\,y + 3.$$
Solution: We need to compute the left and right-hand sides of the equation and verify they agree. On the one hand, we compute $y'(t) = 2e^{2t}$. On the other hand, we compute
$$2\,y(t) + 3 = 2\left(e^{2t} - \frac{3}{2}\right) + 3 = 2e^{2t}.$$
We conclude that $y'(t) = 2\,y(t) + 3$ for all $t \in \mathbb{R}$.
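Verifications like this one are easy to automate. The following snippet is an addition to these notes (not part of the original text); it checks the computation above with the sympy library.

```python
import sympy as sp

t = sp.symbols('t')
y = sp.exp(2*t) - sp.Rational(3, 2)    # candidate solution of y' = 2y + 3

# The residual y'(t) - (2 y(t) + 3) must vanish identically.
residual = sp.diff(y, t) - (2*y + 3)
print(sp.simplify(residual))           # prints 0
```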
Example 1.1.4: Find the differential equation $y' = f(y)$ satisfied by $y(t) = 4\,e^{2t} + 3$.
Solution: We compute the derivative of y,
$$y' = 8\,e^{2t}.$$
We now write the right-hand side above in terms of the original function y, that is,
$$y = 4\,e^{2t} + 3 \quad\Rightarrow\quad y - 3 = 4\,e^{2t} \quad\Rightarrow\quad 2(y - 3) = 8\,e^{2t}.$$
So we got a differential equation satisfied by y, namely
$$y' = 2y - 6.$$
1.1.3. Solving Linear Differential Equations. Linear equations with constant coefficients are simpler to solve than variable coefficients ones. But integrating each side of the equation does not work. For example, take the equation
$$y' = 2\,y + 3,$$
and integrate with respect to t on both sides,
$$\int y'(t)\,dt = 2\int y(t)\,dt + 3t + c, \qquad c \in \mathbb{R}.$$
Integrating both sides of the differential equation is not enough to find a solution y. We still need to find a primitive of y. We have only rewritten the original differential equation as an integral equation. Simply integrating both sides of a linear equation does not solve the equation.
We now state a precise formula for the solutions of constant coefficients linear equations. The proof relies on a new idea: a clever use of the chain rule for derivatives.

Theorem 1.1.2 (Constant Coefficients). The linear differential equation
$$y' = a\,y + b \qquad (1.1.3)$$
with $a \neq 0$, b constants, has infinitely many solutions,
$$y(t) = c\,e^{at} - \frac{b}{a}, \qquad c \in \mathbb{R}. \qquad (1.1.4)$$
Remarks:
(a) Equation (1.1.4) is called the general solution of the differential equation in (1.1.3).
(b) Theorem 1.1.2 says that Eq. (1.1.3) has infinitely many solutions, one solution for each
value of the constant c, which is not determined by the equation.
(c) It makes sense that we have a free constant c in the solution of the differential equa-
tion. The differential equation contains a first derivative of the unknown function y,
so finding a solution of the differential equation requires one integration. Every indefi-
nite integration introduces an integration constant. This is the origin of the constant c
above.
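As a quick sanity check of Theorem 1.1.2 (an addition to these notes, not in the original), sympy's dsolve recovers the same one-parameter family of solutions, with a and b kept symbolic:

```python
import sympy as sp

t, b = sp.symbols('t b')
a = sp.symbols('a', nonzero=True)
y = sp.Function('y')

sol = sp.dsolve(sp.Eq(y(t).diff(t), a*y(t) + b), y(t))
print(sol)   # y(t) = C1*exp(a*t) - b/a, up to the name of the constant
```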
The right-hand side above can be rewritten as a derivative, $b\,e^{-at} = -\left(\dfrac{b}{a}\,e^{-at}\right)'$, hence
$$\left(e^{-at}\,y\right)' + \left(\frac{b}{a}\,e^{-at}\right)' = 0 \quad\Leftrightarrow\quad \left[\left(y + \frac{b}{a}\right)e^{-at}\right]' = 0.$$
We have succeeded in writing the whole differential equation as a total derivative. The differential equation is the total derivative of a potential function, which in this case is
$$\psi(t, y) = \left(y + \frac{b}{a}\right)e^{-at}.$$
Notice that this potential function is the exponential of the potential function found in the first proof of this Theorem. The differential equation for y is a total derivative,
$$\frac{d}{dt}\,\psi(t, y(t)) = 0,$$
so it is simple to integrate,
$$\psi(t, y(t)) = c \quad\Leftrightarrow\quad \left(y(t) + \frac{b}{a}\right)e^{-at} = c \quad\Leftrightarrow\quad y(t) = c\,e^{at} - \frac{b}{a}.$$
This establishes the Theorem.
We solve the example below following the second proof of Theorem 1.1.2.

Example 1.1.6: Find all solutions to the constant coefficients equation
$$y' = 2y + 3. \qquad (1.1.8)$$
Solution: The equation above is the case $a = 2$ and $b = 3$ in Eq. (1.1.3). Therefore, using these values in the expression for the solution given in Eq. (1.1.4) we obtain
$$y(t) = c\,e^{2t} - \frac{3}{2}.$$
1.1.5. The Initial Value Problem. Sometimes in physics one is not interested in all solutions to a differential equation, but only in those solutions satisfying extra conditions. For example, in the case of Newton's second law of motion for a point particle, one could be interested only in solutions such that the particle is at a specific position at the initial time. Such a condition is called an initial condition, and it selects a subset of solutions of the differential equation. An initial value problem is to find a solution to both a differential equation and an initial condition.
Definition 1.1.3. The initial value problem (IVP) is to find all solutions y of
$$y' = a\,y + b, \qquad (1.1.10)$$
that satisfy the initial condition
$$y(t_0) = y_0, \qquad (1.1.11)$$
where a, b, t₀, and y₀ are given constants.
Remark: The equation (1.1.11) is called the initial condition of the problem.
Although the differential equation in (1.1.10) has infinitely many solutions, the associated
initial value problem has a unique solution.
Theorem 1.1.4 (Constant Coefficients IVP). Given the constants $a, b, t_0, y_0 \in \mathbb{R}$, with $a \neq 0$, the initial value problem
$$y' = a\,y + b, \qquad y(t_0) = y_0,$$
has the unique solution
$$y(t) = \left(y_0 + \frac{b}{a}\right)e^{a(t - t_0)} - \frac{b}{a}. \qquad (1.1.12)$$
Introducing this expression for the constant c into the general solution of the differential equation in Eq. (1.1.10), we obtain
$$y(t) = \left(y_0 + \frac{b}{a}\right)e^{a(t - t_0)} - \frac{b}{a}.$$
This establishes the Theorem.
Example 1.1.8: Find the unique solution of the initial value problem
$$y' = 2y + 3, \qquad y(0) = 1. \qquad (1.1.13)$$
Solution: All solutions of the differential equation are given by
$$y(t) = c\,e^{2t} - \frac{3}{2},$$
where c is an arbitrary constant. The initial condition in Eq. (1.1.13) determines c,
$$1 = y(0) = c - \frac{3}{2} \quad\Rightarrow\quad c = \frac{5}{2}.$$
Then, the unique solution to the initial value problem above is $y(t) = \dfrac{5}{2}\,e^{2t} - \dfrac{3}{2}$.
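A numerical cross-check of this initial value problem (added here; scipy is an assumption, not a tool used in the notes):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Integrate y' = 2y + 3 with y(0) = 1 on [0, 1].
sol = solve_ivp(lambda t, y: 2*y + 3, (0.0, 1.0), [1.0],
                dense_output=True, rtol=1e-10, atol=1e-12)

t = 1.0
exact = 2.5*np.exp(2*t) - 1.5          # y(t) = (5/2) e^{2t} - 3/2
print(sol.sol(t)[0], exact)            # both approximately 16.97
```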
Example 1.1.9: Find the solution y to the initial value problem
$$y' = 3y + 1, \qquad y(0) = 1.$$
Notes. This section corresponds to Boyce-DiPrima [3] Section 2.1, where both constant and variable coefficients equations are studied. Zill and Wright give a more concise exposition in [17] Section 2.3, and a one page description is given by Simmons in [10] in Section 2.10. The integrating factor method is shown in most of these books, but unlike them, here we emphasize that the integrating factor changes the linear differential equation into a total derivative, which is trivial to integrate. We also show here how to compute the potential functions for the linear differential equations. In § 1.4 we solve (nonlinear) exact equations and nonexact equations with integrating factors. We solve these equations by transforming them into a total derivative, just as we did in this section with the linear equations.
1.1.6. Exercises.
1.1.1.- Find the differential equation of the form $y' = f(y)$ satisfied by the function
$$y(t) = 8\,e^{5t} - \frac{2}{5}.$$

1.1.2.- Find constants a, b, so that
$$y(t) = (t + 3)\,e^{-2t}$$
is solution of the IVP
$$y' = a\,y + e^{-2t}, \qquad y(0) = b.$$

1.1.3.- Find all solutions y of
$$y' = 3y.$$

1.1.4.- Follow the steps below to find all solutions of
$$y' = 4y + 2.$$
(a) Find the integrating factor μ.
(b) Write the equation as a total derivative of a function ψ, that is, $y' = 4y + 2 \ \Leftrightarrow\ \psi' = 0$.
(c) Integrate the equation for ψ.
(d) Compute y using part (c).

1.1.5.- Find all solutions of
$$y' = 2y + 5.$$

1.1.6.- Find the solution of the IVP
$$y' = 4y + 2, \qquad y(0) = 5.$$

1.1.7.- Find the solution of the IVP
$$\frac{dy}{dt}(t) = 3\,y(t) - 2, \qquad y(1) = 1.$$

1.1.8.- Express the differential equation
$$y' = 6\,y + 1 \qquad (1.1.14)$$
as a total derivative of a potential function ψ(t, y), that is, find ψ satisfying
$$y' = 6\,y + 1 \quad\Leftrightarrow\quad \psi' = 0.$$
Integrate the equation for the potential function ψ to find all solutions y of Eq. (1.1.14).

1.1.9.- Find the solution of the IVP
$$y' = 6\,y + 1, \qquad y(0) = 1.$$

1.1.10.- * Follow the steps below to solve
$$y' = 3y + 5, \qquad y(0) = 1.$$
(a) Find any integrating factor μ for the differential equation.
(b) Write the differential equation as a total derivative of a potential function ψ.
(c) Use the potential function to find the general solution of the differential equation.
(d) Find the solution of the initial value problem above.
1.2.1. Review: Constant Coefficient Equations. Let us recall how we solved the constant coefficients case. We wrote the equation $y' = a\,y + b$ as follows,
$$y' = a\left(y + \frac{b}{a}\right).$$
The critical step was the following: since b/a is constant, then $(b/a)' = 0$, hence
$$\left(y + \frac{b}{a}\right)' = a\left(y + \frac{b}{a}\right).$$
At this point the equation was simple to solve,
$$\frac{\left(y + \frac{b}{a}\right)'}{\left(y + \frac{b}{a}\right)} = a \quad\Rightarrow\quad \left[\ln\left(y + \frac{b}{a}\right)\right]' = a \quad\Rightarrow\quad \ln\left(y + \frac{b}{a}\right) = c_0 + at.$$
However, the case where b/a is not constant is not so simple to solve: we cannot add zero to the equation in the form of $0 = (b/a)'$. We need a new idea. We now show an idea that works with all first order linear equations with variable coefficients: the integrating factor method.
1.2.2. Solving Variable Coefficient Equations. We now state our main result: the formula for the solutions of linear differential equations with variable coefficients.

Theorem 1.2.1 (Variable Coefficients). If the functions a, b are continuous, then
$$y' = a(t)\,y + b(t), \qquad (1.2.1)$$
has infinitely many solutions given by
$$y(t) = c\,e^{A(t)} + e^{A(t)} \int e^{-A(t)}\,b(t)\,dt, \qquad (1.2.2)$$
where $A(t) = \int a(t)\,dt$ and $c \in \mathbb{R}$.
Remarks:
(a) The expression in Eq. (1.2.2) is called the general solution of the differential equation.
(b) The function $\mu(t) = e^{-A(t)}$ is called the integrating factor of the equation.
Example 1.2.2: Show that for constant coefficients equations the solution formula given in Eq. (1.2.2) reduces to Eq. (1.1.4).
Solution: In the particular case of constant coefficients equations, a primitive, or antiderivative, of the constant function a is $A(t) = at$, so
$$y(t) = c\,e^{at} + e^{at}\int e^{-at}\,b\,dt.$$
Since b is constant, the integral in the second term above can be computed explicitly,
$$e^{at}\int b\,e^{-at}\,dt = e^{at}\left(-\frac{b}{a}\,e^{-at}\right) = -\frac{b}{a}.$$
Therefore, in the case of a, b constants we obtain $y(t) = c\,e^{at} - \dfrac{b}{a}$, given in Eq. (1.1.4).
Proof of Theorem 1.2.1: Write the differential equation with y on one side only,
$$y' - a\,y = b,$$
and then multiply the differential equation by a function μ, called an integrating factor,
$$\mu\,y' - a\,\mu\,y = \mu\,b. \qquad (1.2.3)$$
The critical step is to choose a function μ such that
$$-a\,\mu = \mu'. \qquad (1.2.4)$$
For any function μ solution of Eq. (1.2.4), the differential equation in (1.2.3) has the form
$$\mu\,y' + \mu'\,y = \mu\,b.$$
But the left-hand side is a total derivative of a product of two functions,
$$\left(\mu\,y\right)' = \mu\,b. \qquad (1.2.5)$$
This is the property we want in an integrating factor, μ. We want to find a function μ such that the left-hand side of the differential equation for y can be written as a total derivative, just as in Eq. (1.2.5). We only need to find one such function μ. So we go back to Eq. (1.2.4), the differential equation for μ, which is simple to solve,
$$\mu' = -a\,\mu \quad\Rightarrow\quad \frac{\mu'}{\mu} = -a \quad\Rightarrow\quad \left[\ln(|\mu|)\right]' = -a \quad\Rightarrow\quad \ln(|\mu|) = -A + c_0,$$
where $A = \int a\,dt$ is a primitive or antiderivative of a, and c₀ is an arbitrary constant. Computing the exponential of both sides we get
$$\mu = e^{c_0}\,e^{-A} = c_1\,e^{-A}, \qquad c_1 = e^{c_0}.$$
Since c₁ is a constant which will cancel out from Eq. (1.2.3) anyway, we choose the integration constant c₀ = 0, hence c₁ = 1. The integrating factor is then
$$\mu(t) = e^{-A(t)}.$$
This function is an integrating factor, because if we start again at Eq. (1.2.3), we get
$$e^{-A}\,y' - a\,e^{-A}\,y = e^{-A}\,b \quad\Rightarrow\quad e^{-A}\,y' + \left(e^{-A}\right)'\,y = e^{-A}\,b,$$
where we used the main property of the integrating factor, $-a\,e^{-A} = \left(e^{-A}\right)'$. Now the product rule for derivatives implies that the left-hand side above is a total derivative,
$$\left(e^{-A}\,y\right)' = e^{-A}\,b.$$
Integrating on both sides we get
$$e^{-A}\,y = \int e^{-A}\,b\,dt + c \quad\Rightarrow\quad e^{-A}\,y - \int e^{-A}\,b\,dt = c.$$
This is an implicit form of the solution of the differential equation. The explicit solution can be computed from the second equation above, and the result is
$$y(t) = c\,e^{A(t)} + e^{A(t)} \int e^{-A(t)}\,b(t)\,dt.$$
Using that $-3\,t^{-4} = \left(t^{-3}\right)'$ and $t^2 = \left(\dfrac{t^3}{3}\right)'$, we get
$$t^{-3}\,y' + \left(t^{-3}\right)'\,y - \left(\frac{t^3}{3}\right)' = 0 \quad\Rightarrow\quad \left(t^{-3}\,y - \frac{t^3}{3}\right)' = 0.$$
This last equation is a total derivative of a potential function $\psi(t,y) = t^{-3}\,y - \dfrac{t^3}{3}$. Since the equation is a total derivative, this confirms that we got a correct integrating factor. Now we need to integrate the total derivative, which is simple to do,
$$t^{-3}\,y - \frac{t^3}{3} = c \quad\Rightarrow\quad t^{-3}\,y = c + \frac{t^3}{3} \quad\Rightarrow\quad y(t) = c\,t^3 + \frac{t^6}{3},$$
where c is an arbitrary constant.
1.2.3. The Initial Value Problem. We now generalize Theorem 1.1.4 (initial value problems have unique solutions) from constant coefficients to variable coefficients equations. We start by introducing the initial value problem for a variable coefficients equation, a simple generalization of Def. 1.1.3.

Definition 1.2.2. The initial value problem (IVP) is to find all solutions y of
$$y' = a(t)\,y + b(t), \qquad (1.2.7)$$
that satisfy the initial condition
$$y(t_0) = y_0, \qquad (1.2.8)$$
where a, b are given functions and t₀, y₀ are given constants.
Let us introduce the particular primitives $\hat A(t) = A(t) - A(t_0)$ and $\hat K(t) = K(t) - K(t_0)$, which vanish at t₀, that is,
$$\hat A(t) = \int_{t_0}^{t} a(s)\,ds, \qquad \hat K(t) = \int_{t_0}^{t} e^{-\hat A(s)}\,b(s)\,ds,$$
which is equivalent to
$$y(t) = y_0\,e^{\hat A(t)} + e^{A(t) - A(t_0)} \int_{t_0}^{t} e^{-(A(s) - A(t_0))}\,b(s)\,ds,$$
so we conclude that
$$y(t) = y_0\,e^{\hat A(t)} + e^{\hat A(t)} \int_{t_0}^{t} e^{-\hat A(s)}\,b(s)\,ds.$$
Solution: In Example 1.2.4 we computed the general solution of the differential equation,
$$y(t) = \frac{c}{t^2} + t^2, \qquad c \in \mathbb{R}.$$
The initial condition implies that
$$2 = y(1) = c + 1 \quad\Rightarrow\quad c = 1 \quad\Rightarrow\quad y(t) = \frac{1}{t^2} + t^2.$$
Example 1.2.6: Find the solution of the problem given in Example 1.2.5, but this time using the results of Theorem 1.2.3.
Solution: We find the solution simply by using Eq. (1.2.10). First, find the integrating factor function μ as follows:
$$\hat A(t) = -\int_1^t \frac{2}{s}\,ds = -2\left(\ln(t) - \ln(1)\right) = -2\ln(t) \quad\Rightarrow\quad \hat A(t) = \ln\left(t^{-2}\right).$$
Then Eq. (1.2.10) gives
$$y(t) = \frac{2}{t^2} + \frac{1}{t^2}\int_1^t 4s^3\,ds = \frac{2}{t^2} + \frac{1}{t^2}\left(t^4 - 1\right) = \frac{2}{t^2} + t^2 - \frac{1}{t^2} \quad\Rightarrow\quad y(t) = \frac{1}{t^2} + t^2.$$
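The equation solved in Examples 1.2.5-1.2.6 appears to be $y' = -\dfrac{2}{t}\,y + 4t$, reconstructed here from the surviving computations. Under that assumption, the IVP can be checked with sympy (a check added to these notes):

```python
import sympy as sp

t = sp.symbols('t', positive=True)
y = sp.Function('y')

# y' = -(2/t) y + 4t with y(1) = 2; assumed form of Example 1.2.5.
ode = sp.Eq(y(t).diff(t), -2*y(t)/t + 4*t)
sol = sp.dsolve(ode, y(t), ics={y(1): 2})
print(sol)   # y(t) = t**2 + t**(-2)
```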
1.2.4. The Bernoulli Equation. In 1696 Jacob Bernoulli solved what is now known as the Bernoulli differential equation. This is a first order nonlinear differential equation. The following year Leibniz solved this equation by transforming it into a linear equation. We now explain Leibniz's idea in more detail.

Definition 1.2.4. The Bernoulli equation is
$$y' = p(t)\,y + q(t)\,y^n, \qquad (1.2.11)$$
where p, q are given functions and $n \in \mathbb{R}$.
Remarks:
(a) For $n \neq 0, 1$ the equation is nonlinear.
(b) If n = 2 we get the logistic equation (we'll study it in a later chapter),
$$y' = r\,y\left(1 - \frac{y}{K}\right).$$
(c) This is not the Bernoulli equation from fluid dynamics.
The Bernoulli equation is special in the following sense: it is a nonlinear equation that can be transformed into a linear equation.
Remark: This result summarizes Leibniz's idea to solve the Bernoulli equation: transform the Bernoulli equation for y, which is nonlinear, into a linear equation for $v = 1/y^{n-1}$. One then solves the linear equation for v using the integrating factor method. The last step is to transform back to $y = (1/v)^{1/(n-1)}$.
Proof of Theorem 1.2.5: Divide the Bernoulli equation by $y^n$,
$$\frac{y'}{y^n} = \frac{p(t)}{y^{n-1}} + q(t).$$
Introduce the new unknown $v = y^{-(n-1)}$ and compute its derivative,
$$v' = \left(y^{-(n-1)}\right)' = -(n-1)\,y^{-n}\,y' \quad\Rightarrow\quad \frac{v'(t)}{-(n-1)} = \frac{y'(t)}{y^n(t)}.$$
If we substitute v and this last equation into the Bernoulli equation we get
$$\frac{v'}{-(n-1)} = p(t)\,v + q(t) \quad\Rightarrow\quad v' = -(n-1)\,p(t)\,v - (n-1)\,q(t).$$
This establishes the Theorem.
Example 1.2.8: Given any constants a₀, b₀, find every solution of the differential equation
$$y' = a_0\,y + b_0\,y^3.$$
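The solution of Example 1.2.8 is not reproduced in this excerpt. Below is a sketch of Leibniz's substitution for this equation, done with sympy (an addition to these notes; the computation follows Theorem 1.2.5 with n = 3):

```python
import sympy as sp

t = sp.symbols('t')
a0, b0 = sp.symbols('a0 b0', nonzero=True)
v = sp.Function('v')

# With n = 3, v = y**(-2) turns y' = a0*y + b0*y**3 into the
# linear equation v' = -2*a0*v - 2*b0 (Theorem 1.2.5).
vsol = sp.dsolve(sp.Eq(v(t).diff(t), -2*a0*v(t) - 2*b0), v(t)).rhs
print(vsol)                                      # C1*exp(-2*a0*t) - b0/a0

# Transform back: y = 1/sqrt(v) (one sign branch), then verify the ODE.
y = 1/sp.sqrt(vsol)
print(sp.simplify(y.diff(t) - a0*y - b0*y**3))   # 0
```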
Notes. This section corresponds to Boyce-DiPrima [3] Section 2.1, and Simmons [10] Section 2.10. The Bernoulli equation is solved in the exercises of Section 2.4 in Boyce-DiPrima, and in the exercises of Section 2.10 in Simmons.
1.2.5. Exercises.

1.2.1.- Find all solutions of
$$y' = 4t\,y.$$

1.2.2.- Find the general solution of
$$y' = y + e^{2t}.$$

1.2.3.- Find the solution y to the IVP
$$y' = y + 2t\,e^{2t}, \qquad y(0) = 0.$$

1.2.4.- Find the solution y to the IVP
$$t\,y' + 2\,y = \frac{\sin(t)}{t}, \qquad y\left(\frac{\pi}{2}\right) = \frac{2}{\pi},$$
for t > 0.

1.2.5.- Find all solutions y to the ODE
$$\frac{y'}{(t^2 + 1)\,y} = 4t.$$

1.2.6.- Find all solutions y to the ODE
$$t\,y' + n\,y = t^2,$$
with n a positive integer.

1.2.7.- Find the solutions to the IVP
$$2t\,y - y' = 0, \qquad y(0) = 3.$$

1.2.8.- Find all solutions of the equation
$$y' = y^2\,\sin(t).$$

1.2.9.- Find the solution to the initial value problem
$$t\,y' = 2\,y + 4t^3\,\cos(4t), \qquad y\left(\frac{\pi}{8}\right) = 0.$$

1.2.10.- Find all solutions of the equation
$$y' + t\,y = t\,y^2.$$

1.2.11.- Find all solutions of the equation
$$y' = x\,y + 6x\,\sqrt{y}.$$

1.2.12.- Find all solutions of the IVP
$$y' = y + \frac{2}{y^3}, \qquad y(0) = 1.$$

1.2.13.- * Find all solutions of
$$y' = a\,y + b\,y^n,$$
where $a \neq 0$, b, and n are real constants with $n \neq 0, 1$.
Remark: A separable differential equation $h(y)\,y' = g(t)$ has the following properties:
- The left-hand side depends explicitly only on y, so any t dependence is through y.
- The right-hand side depends explicitly only on t.
- The left-hand side is of the form (something on y) × y'.
Example 1.3.1:
(a) The differential equation
$$y' = \frac{t^2}{1 - y^2}$$
is separable, since it is equivalent to
$$\left(1 - y^2\right)y' = t^2 \quad\Rightarrow\quad \begin{cases} g(t) = t^2, \\ h(y) = 1 - y^2. \end{cases}$$
From the last two examples above we see that linear differential equations, with $a \neq 0$, are separable when b/a is constant, and not separable otherwise. Separable differential equations are simple to solve: we just integrate both sides of the equation. We show this idea in the following example.
Remark: Notice the following about the equation and its implicit solution:
$$\frac{1}{y^2}\,y' = \cos(2t) \quad\Leftrightarrow\quad h(y)\,y' = g(t), \qquad h(y) = \frac{1}{y^2}, \quad g(t) = \cos(2t),$$
$$-\frac{1}{y} = \frac{1}{2}\sin(2t) \quad\Leftrightarrow\quad H(y) = G(t), \qquad H(y) = -\frac{1}{y}, \quad G(t) = \frac{1}{2}\sin(2t).$$
Here H is an antiderivative of h, that is, $H(y) = \int h(y)\,dy$, and G is an antiderivative of g, that is, $G(t) = \int g(t)\,dt$.
Remark: An antiderivative of h is $H(y) = \int h(y)\,dy$, while an antiderivative of g is the function $G(t) = \int g(t)\,dt$.
Proof of Theorem 1.3.2: Integrate with respect to t on both sides in Eq. (1.3.1),
$$h(y)\,y' = g(t) \quad\Rightarrow\quad \int h(y(t))\,y'(t)\,dt = \int g(t)\,dt + c,$$
where c is an arbitrary constant. Introduce on the left-hand side of the second equation above the substitution
$$y = y(t), \qquad dy = y'(t)\,dt.$$
The result of the substitution is
$$\int h(y(t))\,y'(t)\,dt = \int h(y)\,dy \quad\Rightarrow\quad \int h(y)\,dy = \int g(t)\,dt + c.$$
To integrate on each side of this equation means to find a function H, primitive of h, and a function G, primitive of g. Using this notation we write
$$H(y) = \int h(y)\,dy, \qquad G(t) = \int g(t)\,dt.$$
Solution: We write the differential equation in (1.3.3) in the form $h(y)\,y' = g(t)$,
$$\left(1 - y^2\right)y' = t^2.$$
In this example the functions h and g defined in Theorem 1.3.2 are given by
$$h(y) = 1 - y^2, \qquad g(t) = t^2.$$
We now integrate with respect to t on both sides of the differential equation,
$$\int \left(1 - y^2(t)\right)y'(t)\,dt = \int t^2\,dt + c,$$
where c is any constant. The integral on the right-hand side can be computed explicitly. The integral on the left-hand side can be done by substitution. The substitution is
$$y = y(t), \quad dy = y'(t)\,dt \quad\Rightarrow\quad \int \left(1 - y^2(t)\right)y'(t)\,dt = \int \left(1 - y^2\right)dy.$$
Definition 1.3.3. A function y is a solution in implicit form of the equation $h(y)\,y' = g(t)$ iff the function y is a solution of the algebraic equation
$$H\left(y(t)\right) = G(t) + c,$$
where H and G are any antiderivatives of h and g. In the case that the function H is invertible, the solution y above is given in explicit form iff it is written as
$$y(t) = H^{-1}\left(G(t) + c\right).$$
In the case that H is not invertible or $H^{-1}$ is difficult to compute, we leave the solution y in implicit form. We now solve the same example as in Example 1.3.3, but now we just use the result of Theorem 1.3.2.
Example 1.3.4: Use the formula in Theorem 1.3.2 to find all solutions y to the equation
$$y' = \frac{t^2}{1 - y^2}. \qquad (1.3.4)$$
Solution: Theorem 1.3.2 tells us how to obtain the solution y. Writing Eq. (1.3.4) as
$$\left(1 - y^2\right)y' = t^2,$$
Remark: Sometimes it is simpler to remember ideas than formulas. So one can solve a
separable equation as we did in Example 1.3.3, instead of using the solution formulas, as in
Example 1.3.4. (Although in the case of separable equations both methods are very close.)
In the next Example we show that an initial value problem can be solved even when the
solutions of the differential equation are given in implicit form.
Example 1.3.5: Find the solution of the initial value problem
$$y' = \frac{t^2}{1 - y^2}, \qquad y(0) = 1. \qquad (1.3.6)$$
Solution: From Example 1.3.3 we know that all solutions to the differential equation in (1.3.6) are given by
$$y(t) - \frac{y^3(t)}{3} = \frac{t^3}{3} + c,$$
where $c \in \mathbb{R}$ is arbitrary. This constant c is now fixed with the initial condition in Eq. (1.3.6),
$$y(0) - \frac{y^3(0)}{3} = \frac{0}{3} + c \quad\Rightarrow\quad 1 - \frac{1}{3} = c \quad\Rightarrow\quad c = \frac{2}{3} \quad\Rightarrow\quad y(t) - \frac{y^3(t)}{3} = \frac{t^3}{3} + \frac{2}{3}.$$
So we can rewrite the algebraic equation defining the solution functions y as the (time dependent) roots of a cubic (in y) polynomial,
$$y^3(t) - 3y(t) + t^3 + 2 = 0.$$
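A verification of this implicit solution (added here, not in the original notes): differentiating the cubic and using the differential equation should give zero identically.

```python
import sympy as sp

t = sp.symbols('t')
y = sp.Function('y')

# Implicit solution of y' = t**2/(1 - y**2) with y(0) = 1.
psi = y(t)**3 - 3*y(t) + t**3 + 2

# d(psi)/dt with y' replaced by the right-hand side of the ODE.
dpsi = sp.diff(psi, t).subs(y(t).diff(t), t**2/(1 - y(t)**2))
print(sp.simplify(dpsi))   # 0, so psi is constant along solutions
```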
Example 1.3.7: Follow the proof in Theorem 1.3.2 to find all solutions y of the equation
$$y' = \frac{4t - t^3}{4 + y^3}.$$
Example 1.3.8: Find the solution of the initial value problem below in explicit form,
$$y' = \frac{2t}{1 + y}, \qquad y(0) = 1. \qquad (1.3.8)$$
Remark:
(a) Any function F of t, y that depends only on the quotient y/t is scale invariant. This means that F does not change when we do the transformation $y \to cy$, $t \to ct$:
$$F\left(\frac{cy}{ct}\right) = F\left(\frac{y}{t}\right).$$
For this reason the differential equations above are also called scale invariant equations.
(b) Scale invariant functions are a particular case of homogeneous functions of degree n, which are functions f satisfying
$$f(ct, cy) = c^n\,f(t, y).$$
Scale invariant functions are the case n = 0.
(c) An example of a homogeneous function is the energy of a thermodynamical system, such as a gas in a bottle. The energy, E, of a fixed amount of gas is a function of the gas entropy, S, and the gas volume, V. Such energy is a homogeneous function of degree one,
$$E(cS, cV) = c\,E(S, V), \qquad \text{for all } c \in \mathbb{R}.$$
Example 1.3.9: Show that the functions f₁ and f₂ are homogeneous and find their degree,
$$f_1(t,y) = t^4 y^2 + t\,y^5 + t^3 y^3, \qquad f_2(t,y) = t^2 y^2 + t\,y^3.$$
Example 1.3.10: Show that the functions below are scale invariant functions,
$$f_1(t,y) = \frac{y}{t}, \qquad f_2(t,y) = \frac{t^3 + t^2\,y + t\,y^2 + y^3}{t^3 + t\,y^2}.$$
More often than not, Euler homogeneous differential equations come from a differential equation $N\,y' + M = 0$, where both N and M are homogeneous functions of the same degree.

Theorem 1.3.5. If the functions N, M, of t, y, are homogeneous of the same degree, then the differential equation
$$N(t,y)\,y'(t) + M(t,y) = 0$$
is Euler homogeneous.
Example 1.3.11: Show that $(t - y)\,y' - 2y + 3t + \dfrac{y^2}{t} = 0$ is an Euler homogeneous equation.
Solution: Rewrite the equation in the standard form
$$(t - y)\,y' = 2y - 3t - \frac{y^2}{t} \quad\Rightarrow\quad y' = \frac{2y - 3t - \dfrac{y^2}{t}}{t - y}.$$
So the function f in this case is given by
$$f(t,y) = \frac{2y - 3t - \dfrac{y^2}{t}}{t - y}.$$
This function is scale invariant, since the numerator and denominator are homogeneous of the same degree, n = 1 in this case,
$$f(ct, cy) = \frac{2cy - 3ct - \dfrac{c^2 y^2}{ct}}{ct - cy} = \frac{c\left(2y - 3t - \dfrac{y^2}{t}\right)}{c\,(t - y)} = f(t,y).$$
So, the differential equation is Euler homogeneous. We now write the equation in the form $y' = F(y/t)$. Since the numerator and denominator are homogeneous of degree n = 1, we multiply them by "1" in the form $(1/t)/(1/t)$, that is,
$$y' = \frac{\left(2y - 3t - \dfrac{y^2}{t}\right)(1/t)}{(t - y)\,(1/t)}.$$
Distribute the factors (1/t) in numerator and denominator, and we get
$$y' = \frac{2\left(\dfrac{y}{t}\right) - 3 - \left(\dfrac{y}{t}\right)^2}{1 - \left(\dfrac{y}{t}\right)} \quad\Rightarrow\quad y' = F\left(\frac{y}{t}\right),$$
where
$$F\left(\frac{y}{t}\right) = \frac{2\left(\dfrac{y}{t}\right) - 3 - \left(\dfrac{y}{t}\right)^2}{1 - \left(\dfrac{y}{t}\right)}.$$
So, the equation is Euler homogeneous and it is written in the standard form.
Remark: The original homogeneous equation for the function y is transformed into a separable equation for the unknown function v = y/t. One solves for v, in implicit or explicit form, and then transforms back to y = t v.
Proof of Theorem 1.3.6: Introduce the function v = y/t into the differential equation,
$$y' = F(v).$$
We still need to replace y' in terms of v. This is done as follows:
$$y(t) = t\,v(t) \quad\Rightarrow\quad y'(t) = v(t) + t\,v'(t).$$
Introducing these expressions into the differential equation for y we get
$$v + t\,v' = F(v) \quad\Rightarrow\quad v' = \frac{F(v) - v}{t} \quad\Rightarrow\quad \frac{v'}{F(v) - v} = \frac{1}{t}.$$
The equation on the far right is separable. This establishes the Theorem.
Example 1.3.13: Find all solutions y of the differential equation $y' = \dfrac{t^2 + 3y^2}{2ty}$.
Solution: The equation is Euler homogeneous, since
$$f(ct, cy) = \frac{c^2 t^2 + 3c^2 y^2}{2(ct)(cy)} = \frac{c^2\left(t^2 + 3y^2\right)}{c^2\,(2ty)} = \frac{t^2 + 3y^2}{2ty} = f(t,y).$$
Next we compute the function F. Since the numerator and denominator are homogeneous of degree 2, we multiply the right-hand side of the equation by "1" in the form $(1/t^2)/(1/t^2)$,
$$y' = \frac{\left(t^2 + 3y^2\right)\dfrac{1}{t^2}}{2ty\,\dfrac{1}{t^2}} \quad\Rightarrow\quad y' = \frac{1 + 3\left(\dfrac{y}{t}\right)^2}{2\left(\dfrac{y}{t}\right)}.$$
Now we introduce the change of functions v = y/t,
$$y' = \frac{1 + 3v^2}{2v}.$$
Since y = t v, then $y' = v + t\,v'$, which implies
$$v + t\,v' = \frac{1 + 3v^2}{2v} \quad\Rightarrow\quad t\,v' = \frac{1 + 3v^2}{2v} - v = \frac{1 + 3v^2 - 2v^2}{2v} = \frac{1 + v^2}{2v}.$$
We obtained the separable equation
$$v' = \frac{1}{t}\left(\frac{1 + v^2}{2v}\right).$$
We rewrite and integrate it,
$$\frac{2v}{1 + v^2}\,v' = \frac{1}{t} \quad\Rightarrow\quad \int \frac{2v}{1 + v^2}\,v'\,dt = \int \frac{1}{t}\,dt + c_0.$$
The substitution $u = 1 + v^2(t)$ implies $du = 2v(t)\,v'(t)\,dt$, so
$$\int \frac{du}{u} = \int \frac{dt}{t} + c_0 \quad\Rightarrow\quad \ln(u) = \ln(t) + c_0 \quad\Rightarrow\quad u = e^{\ln(t) + c_0}.$$
But $u = e^{\ln(t)}\,e^{c_0}$, so denoting $c_1 = e^{c_0}$, then $u = c_1 t$. So, we get
$$1 + v^2 = c_1 t \quad\Rightarrow\quad 1 + \left(\frac{y}{t}\right)^2 = c_1 t \quad\Rightarrow\quad y(t) = \pm t\,\sqrt{c_1 t - 1}.$$
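A check of the family just found (an addition to these notes; the constant c1 is kept symbolic and only one sign branch is tested):

```python
import sympy as sp

t, c1 = sp.symbols('t c1', positive=True)

y = t*sp.sqrt(c1*t - 1)                      # one branch of the solutions
residual = sp.diff(y, t) - (t**2 + 3*y**2)/(2*t*y)
print(sp.simplify(residual))                 # 0
```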
Example 1.3.14: Find all solutions y of the differential equation $y' = \dfrac{t(y + 1) + (y + 1)^2}{t^2}$.
Solution: This equation is Euler homogeneous when written in terms of the unknown u(t) = y(t) + 1 and the variable t. Indeed, $u' = y'$, thus we obtain
$$y' = \frac{t(y + 1) + (y + 1)^2}{t^2} \quad\Leftrightarrow\quad u' = \frac{tu + u^2}{t^2} \quad\Leftrightarrow\quad u' = \frac{u}{t} + \left(\frac{u}{t}\right)^2.$$
Therefore, we introduce the new variable v = u/t, which satisfies u = t v and $u' = v + t\,v'$. The differential equation for v is
$$v + t\,v' = v + v^2 \quad\Rightarrow\quad t\,v' = v^2 \quad\Rightarrow\quad \int \frac{v'}{v^2}\,dt = \int \frac{1}{t}\,dt + c,$$
with $c \in \mathbb{R}$. The substitution w = v(t) implies $dw = v'\,dt$, so
$$\int w^{-2}\,dw = \int \frac{1}{t}\,dt + c \quad\Rightarrow\quad -w^{-1} = \ln(|t|) + c \quad\Rightarrow\quad w = \frac{-1}{\ln(|t|) + c}.$$
Substituting back v, u and y, we obtain $w = v(t) = u(t)/t = [y(t) + 1]/t$, so
$$\frac{y + 1}{t} = \frac{-1}{\ln(|t|) + c} \quad\Rightarrow\quad y(t) = \frac{-t}{\ln(|t|) + c} - 1.$$
Notes. This section corresponds to Boyce-DiPrima [3] Section 2.2. Zill and Wright study separable equations in [17] Section 2.2, and Euler homogeneous equations in Section 2.5. Zill and Wright organize the material in a nice way: they present first separable equations, then linear equations, and then they group Euler homogeneous and Bernoulli equations in a section called Solutions by Substitution. Once again, a one page description is given by Simmons in [10] in Chapter 2, Section 7.
1.3.4. Exercises.

1.3.1.- Find all solutions y to the ODE
$$y' = \frac{t^2}{y}.$$
Express the solutions in explicit form.

1.3.2.- Find every solution y of the ODE
$$3t^2 + 4y^3\,y' - 1 + y' = 0.$$
Leave the solution in implicit form.

1.3.3.- Find the solution y to the IVP
$$y' = t^2\,y^2, \qquad y(0) = 1.$$

1.3.4.- Find every solution y of the ODE
$$t\,y + \sqrt{1 + t^2}\;y' = 0.$$

1.3.5.- Find every solution y of the Euler homogeneous equation
$$y' = \frac{y + t}{t}.$$

1.3.6.- Find all solutions y to the ODE
$$y' = \frac{t^2 + y^2}{t\,y}.$$

1.3.7.- Find the explicit solution to the IVP
$$\left(t^2 + 2t\,y\right)y' = y^2, \qquad y(1) = 1.$$

1.3.8.- Prove that if $y' = f(t,y)$ is an Euler homogeneous equation and $y_1(t)$ is a solution, then $y(t) = (1/k)\,y_1(kt)$ is also a solution for every nonzero $k \in \mathbb{R}$.

1.3.9.- * Find the explicit solution of the initial value problem
$$y' = \frac{4t - 6t^2}{y}, \qquad y(0) = 3.$$
Remark: The functions N, M depend on t, y, and we use the notation for partial derivatives
$$\partial_t N = \frac{\partial N}{\partial t}, \qquad \partial_y M = \frac{\partial M}{\partial y}.$$
In the definition above, the letter y has been used both as the unknown function (in the first equation), and as an independent variable (in the second equation). We use this dual meaning for the letter y throughout this section.
Our first example shows that all separable equations studied in § 1.3 are exact.
Example 1.4.1: Show whether a separable equation $h(y)\,y'(t) = g(t)$ is exact or not.
Solution: If we write the equation as $h(y)\,y' - g(t) = 0$, then
$$N(t,y) = h(y) \quad\Rightarrow\quad \partial_t N(t,y) = 0, \qquad M(t,y) = -g(t) \quad\Rightarrow\quad \partial_y M(t,y) = 0 \quad\Rightarrow\quad \partial_t N(t,y) = \partial_y M(t,y),$$
hence every separable equation is exact.
The next example shows that linear equations, written as in § 1.2, are not exact.
Example 1.4.2: Show whether the linear differential equation below is exact or not,
$$y'(t) = a(t)\,y(t) + b(t), \qquad a(t) \neq 0.$$
Solution: We first find the functions N and M, rewriting the equation as follows,
$$y' - a(t)\,y - b(t) = 0 \quad\Rightarrow\quad N(t,y) = 1, \quad M(t,y) = -a(t)\,y - b(t).$$
Let us check whether the equation is exact or not,
$$N(t,y) = 1 \quad\Rightarrow\quad \partial_t N(t,y) = 0, \qquad M(t,y) = -a(t)\,y - b(t) \quad\Rightarrow\quad \partial_y M(t,y) = -a(t) \quad\Rightarrow\quad \partial_t N(t,y) \neq \partial_y M(t,y).$$
So, the differential equation is not exact.
The following examples show that there are exact equations which are not separable.
Example 1.4.3: Show whether the differential equation below is exact or not,
$$2ty\,y' + 2t + y^2 = 0.$$
Solution: We first identify the functions N and M. This is simple in this case, since
$$(2ty)\,y' + \left(2t + y^2\right) = 0 \quad\Rightarrow\quad N(t,y) = 2ty, \quad M(t,y) = 2t + y^2.$$
The equation is indeed exact, since
$$N(t,y) = 2ty \quad\Rightarrow\quad \partial_t N(t,y) = 2y, \qquad M(t,y) = 2t + y^2 \quad\Rightarrow\quad \partial_y M(t,y) = 2y \quad\Rightarrow\quad \partial_t N(t,y) = \partial_y M(t,y).$$
Therefore, the differential equation is exact.
Example 1.4.4: Show whether the differential equation below is exact or not,
$$\sin(t)\,y' + t^2 e^y\,y' - y' = -y\cos(t) - 2t\,e^y.$$
Solution: We first identify the functions N and M by rewriting the equation as follows,
$$\left(\sin(t) + t^2 e^y - 1\right)y' + \left(y\cos(t) + 2t\,e^y\right) = 0.$$
1.4.2. Solving Exact Equations. Exact differential equations can be rewritten as a total derivative of a function, called a potential function. Once they are written in such a way they are simple to solve.

Theorem 1.4.2 (Exact Equations). If the differential equation
$$N(t,y)\,y' + M(t,y) = 0 \qquad (1.4.1)$$
is exact, then it can be written as
$$\frac{d}{dt}\,\psi(t, y(t)) = 0,$$
where ψ is called a potential function and satisfies
$$N = \partial_y \psi, \qquad M = \partial_t \psi. \qquad (1.4.2)$$
Therefore, the solutions of the exact equation are given in implicit form as
$$\psi(t, y(t)) = c, \qquad c \in \mathbb{R}.$$
Remarks:
(a) A differential equation defines the functions N and M. The exact condition in (1.4.3) is equivalent to the existence of ψ, related to N and M through Eq. (1.4.4).
(b) If we recall the definition of the gradient of a function of two variables, $\nabla\psi = \langle \partial_t \psi, \partial_y \psi \rangle$, then the equations in (1.4.4) say that $\nabla\psi = \langle M, N \rangle$.
Exact equations always have a potential function ψ, and this function is not difficult to compute: we only need to integrate Eq. (1.4.4). Having a potential function of an exact equation is essentially the same as solving the differential equation, since the integral curves of ψ define implicit solutions of the differential equation.
Proof of Theorem 1.4.2: Since the differential equation in (1.4.1) is exact, the Poincaré Theorem implies that there is a potential function ψ such that
$$N = \partial_y \psi, \qquad M = \partial_t \psi.$$
Therefore, the differential equation is given by
$$0 = N(t,y)\,y'(t) + M(t,y) = \partial_y \psi(t,y)\,y' + \partial_t \psi(t,y) = \frac{d}{dt}\,\psi(t, y(t)),$$
where in the last step we used the chain rule. Recall that the chain rule says
$$\frac{d}{dt}\,\psi(t, y(t)) = (\partial_y \psi)\,\frac{dy}{dt} + (\partial_t \psi).$$
So, the differential equation has been rewritten as a total t-derivative of the potential function, which is simple to integrate,
$$\frac{d}{dt}\,\psi(t, y(t)) = 0 \quad\Rightarrow\quad \psi(t, y(t)) = c,$$
where c is an arbitrary constant. This establishes the Theorem.
Example 1.4.6 (Calculation of a Potential): Find all solutions y to the differential equation
$$2ty\,y' + 2t + y^2 = 0.$$
Solution: The first step is to verify whether the differential equation is exact. We know the answer (the equation is exact, we did this calculation before in Example 1.4.3), but we reproduce it here anyway,
$$N(t,y) = 2ty \quad\Rightarrow\quad \partial_t N(t,y) = 2y, \qquad M(t,y) = 2t + y^2 \quad\Rightarrow\quad \partial_y M(t,y) = 2y \quad\Rightarrow\quad \partial_t N(t,y) = \partial_y M(t,y).$$
Since the equation is exact, Lemma 1.4.3 implies that there exists a potential function ψ satisfying the equations
$$\partial_y \psi(t,y) = N(t,y), \qquad (1.4.5)$$
$$\partial_t \psi(t,y) = M(t,y). \qquad (1.4.6)$$
Let us compute ψ. Integrate Eq. (1.4.5) in the variable y keeping the variable t constant,
$$\partial_y \psi(t,y) = 2ty \quad\Rightarrow\quad \psi(t,y) = \int 2ty\,dy + g(t),$$
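The remaining steps of this computation are not shown in this excerpt. Below is a sketch of where they lead, using sympy (added here, not part of the original notes):

```python
import sympy as sp

t, y = sp.symbols('t y')

N = 2*t*y          # coefficient of y'
M = 2*t + y**2     # remaining terms

# Integrate psi_y = N in y, then choose g(t) so that psi_t = M.
psi_partial = sp.integrate(N, y)                       # t*y**2
g_prime = sp.simplify(M - sp.diff(psi_partial, t))     # 2*t
psi = psi_partial + sp.integrate(g_prime, t)           # t*y**2 + t**2
print(psi)   # solutions are the level curves psi(t, y(t)) = c
```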
Remark: An exact equation and its solutions can be pictured on the graph of a potential function. This is called a geometrical interpretation of the exact equation. We saw that an exact equation $N\,y' + M = 0$ can be rewritten as $d\psi/dt = 0$. The solutions of the differential equation are functions y such that $\psi(t, y(t)) = c$, hence the solutions define level curves of the potential function. Given a level curve, the vector $r(t) = \langle t, y(t) \rangle$, which belongs to the ty-plane, points to the level curve, while its derivative $r'(t) = \langle 1, y'(t) \rangle$ is tangent to the level curve. Since the gradient vector $\nabla\psi = \langle M, N \rangle$ is perpendicular to the level curve,
$$\nabla\psi \cdot r' = 0 \quad\Leftrightarrow\quad M + N\,y' = 0.$$
We wanted to remark that the differential equation can be thought of as the condition $\nabla\psi \perp r'$.
Solution: The first step is to verify whether the differential equation is exact,
$$N(t,y) = \sin(t) + t^2 e^y - 1 \quad\Rightarrow\quad \partial_t N(t,y) = \cos(t) + 2t\,e^y,$$
$$M(t,y) = y\cos(t) + 2t\,e^y - 3t^2 \quad\Rightarrow\quad \partial_y M(t,y) = \cos(t) + 2t\,e^y.$$
So, the equation is exact. The Poincaré Theorem says there is a potential function ψ satisfying
$$\partial_y \psi(t,y) = N(t,y), \qquad \partial_t \psi(t,y) = M(t,y). \qquad (1.4.8)$$
To compute ψ we integrate on y the equation $\partial_y \psi = N$ keeping t constant,
$$\partial_y \psi(t,y) = \sin(t) + t^2 e^y - 1 \quad\Rightarrow\quad \psi(t,y) = \int \left(\sin(t) + t^2 e^y - 1\right)dy + g(t).$$
Remark: A potential function is also called a conserved quantity. This is a reasonable name, since a potential function evaluated at any solution of the differential equation is constant along the evolution. This is yet another interpretation of the equation $d\psi/dt = 0$, or its integral $\psi(t, y(t)) = c$. If we call $c = \psi_0 = \psi(0, y(0))$, the value of the potential function at the initial conditions, then $\psi(t, y(t)) = \psi_0$.
Conserved quantities are important in physics. The energy of a moving particle is a famous conserved quantity. In that case the differential equation is Newton's second law of motion: mass times acceleration equals force. One can prove that the energy E of a particle with position function y moving under a conservative force is kept constant in time. This statement can be expressed by $E(t, y(t), y'(t)) = E_0$, where E₀ is the particle's energy at the initial time.
Example 1.4.8: Show that linear differential equations $y' = a(t)\,y + b(t)$ are semi-exact.
Solution: We first show that linear equations $y' = a\,y + b$ with $a \neq 0$ are not exact. If we write them as
$$y' - a\,y - b = 0 \quad\Rightarrow\quad N\,y' + M = 0 \quad\text{with}\quad N = 1, \quad M = -a\,y - b.$$
Therefore,
$$\partial_t N = 0, \qquad \partial_y M = -a \quad\Rightarrow\quad \partial_t N \neq \partial_y M.$$
We now show that linear equations are semi-exact. Let us multiply the linear equation by a function μ, which depends only on t,
$$\mu(t)\,y' - a(t)\,\mu(t)\,y - \mu(t)\,b(t) = 0,$$
where we emphasized that μ, a, b depend only on t. Let us look for a particular function μ that makes the equation above exact. If we write this equation as $N\,y' + M = 0$, then
$$N(t,y) = \mu, \qquad M(t,y) = -a\,\mu\,y - \mu\,b.$$
We now check the condition for exactness,
$$\partial_t N = \mu', \qquad \partial_y M = -a\,\mu,$$
Remarks:
(a) The function $\mu(t) = e^{H(t)}$ is called an integrating factor.
(b) Any integrating factor μ is a solution of the differential equation
$$\mu'(t) = h(t)\,\mu(t).$$
(c) Multiplication by an integrating factor transforms a non-exact equation
$$N\,y' + M = 0$$
into an exact equation,
$$(\mu N)\,y' + (\mu M) = 0.$$
This is exactly what happened with linear equations.
Verification Proof of Theorem 1.4.5: We need to verify that the equation is exact,
$$\left(e^H N\right)y' + \left(e^H M\right) = 0 \quad\Rightarrow\quad \tilde N(t,y) = e^{H(t)}\,N(t,y), \quad \tilde M(t,y) = e^{H(t)}\,M(t,y).$$
We now check for exactness, and let us recall $\partial_t\left(e^H\right) = \left(e^H\right)' = h\,e^H$; then
$$\partial_t \tilde N = h\,e^H N + e^H \partial_t N, \qquad \partial_y \tilde M = e^H \partial_y M.$$
Let us use the definition of h in the first equation above,
$$\partial_t \tilde N = e^H\left(\frac{\partial_y M - \partial_t N}{N}\,N + \partial_t N\right) = e^H\,\partial_y M = \partial_y \tilde M.$$
So the equation is exact. This establishes the Theorem.
Constructive Proof of Theorem 1.4.5: The original differential equation
$$N\,y' + M = 0$$
where we have chosen in the second equation the integration constant to be zero. Then, multiplying the original differential equation in (1.4.13) by the integrating factor we obtain
$$3t^2\,y + t\,y^2 + \left(t^3 + t^2\,y\right)y' = 0. \qquad (1.4.14)$$
This latter equation is exact, since
$$N(t,y) = t^3 + t^2\,y \quad\Rightarrow\quad \partial_t N(t,y) = 3t^2 + 2ty, \qquad M(t,y) = 3t^2\,y + t\,y^2 \quad\Rightarrow\quad \partial_y M(t,y) = 3t^2 + 2ty,$$
so we get the exactness condition $\partial_t N = \partial_y M$. The solution y can be found as we did in the previous examples in this section. That is, we find the potential function ψ by integrating the equations
$$\partial_y \psi(t,y) = N(t,y), \qquad (1.4.15)$$
$$\partial_t \psi(t,y) = M(t,y). \qquad (1.4.16)$$
From the first equation above we obtain
$$\partial_y \psi = t^3 + t^2\,y \quad\Rightarrow\quad \psi(t,y) = \int \left(t^3 + t^2\,y\right)dy + g(t).$$
We have seen in Example 1.4.2 that linear differential equations with $a \neq 0$ are not exact. In Section 1.2 we found solutions to linear equations using the integrating factor method. We multiplied the linear equation by a function that transformed the equation into a total derivative. Those calculations are now a particular case of Theorem 1.4.5, as we can see in the following example.
Example 1.4.10: Use Theorem 1.4.5 to find all solutions to the linear differential equation
$$y' = a(t)\,y + b(t), \qquad a(t) \neq 0. \qquad (1.4.17)$$
Solution: We first write the linear equation in a way we can identify the functions N and M,
$$y' - a(t)\,y - b(t) = 0.$$
We now verify whether the linear equation is exact or not. Actually, we have seen in Example 1.4.2 that this equation is not exact, since
$$N(t,y) = 1 \quad\Rightarrow\quad \partial_t N(t,y) = 0, \qquad M(t,y) = -a(t)\,y - b(t) \quad\Rightarrow\quad \partial_y M(t,y) = -a(t).$$
But now we can go further: we can check whether the condition in Theorem 1.4.5 holds or not. We compute the function
$$\frac{\partial_y M(t,y) - \partial_t N(t,y)}{N(t,y)} = \frac{-a(t) - 0}{1} = -a(t),$$
and we see that it is independent of the variable y. Theorem 1.4.5 says that we can transform the linear equation into an exact equation. We only need to multiply the linear equation by a function μ, solution of the equation
$$\mu'(t) = -a(t)\,\mu(t) \quad\Rightarrow\quad \mu(t) = e^{-A(t)}, \qquad A(t) = \int a(t)\,dt.$$
This is the same integrating factor we discovered in Section 1.2. Therefore, the equation below is exact,
$$e^{-A(t)}\,y' - a(t)\,e^{-A(t)}\,y - b(t)\,e^{-A(t)} = 0. \qquad (1.4.18)$$
This new version of the linear equation is exact, since
$$N(t,y) = e^{-A(t)} \quad\Rightarrow\quad \partial_t N(t,y) = -a(t)\,e^{-A(t)},$$
$$M(t,y) = -a(t)\,e^{-A(t)}\,y - b(t)\,e^{-A(t)} \quad\Rightarrow\quad \partial_y M(t,y) = -a(t)\,e^{-A(t)}.$$
Since the linear equation is now exact, the solutions y can be found as we did in the previous examples in this section. We find the potential function ψ by integrating the equations
$$\partial_y \psi(t,y) = N(t,y), \qquad (1.4.19)$$
$$\partial_t \psi(t,y) = M(t,y). \qquad (1.4.20)$$
From the first equation above we obtain
$$\partial_y \psi = e^{-A(t)} \quad\Rightarrow\quad \psi(t,y) = \int e^{-A(t)}\,dy + g(t).$$
All solutions y to the linear differential equation in (1.4.17) satisfy the equation
$$e^{-A(t)}\,y(t) - \int b(t)\,e^{-A(t)}\,dt = c_0,$$
where $c_0 \in \mathbb{R}$ is arbitrary. This is the implicit form of the solution, but in this case it is simple to find the explicit form too,
$$y(t) = e^{A(t)}\left(c_0 + \int b(t)\,e^{-A(t)}\,dt\right).$$
This expression agrees with the one in Theorem 1.2.3, when we studied linear equations.
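A concrete instance of this procedure (an addition to these notes, with the sample choice a(t) = 2/t, b(t) = 4t):

```python
import sympy as sp

t, y = sp.symbols('t y', positive=True)

# The non-exact linear equation y' - (2/t)*y - 4*t = 0:
a, b = 2/t, 4*t
N, M = sp.Integer(1), -a*y - b
print(sp.diff(N, t) - sp.diff(M, y))                      # 2/t, not exact

# Multiply by mu = exp(-A), with A = int a dt:
mu = sp.exp(-sp.integrate(a, t))                          # t**(-2)
print(sp.simplify(sp.diff(mu*N, t) - sp.diff(mu*M, y)))   # 0: now exact
```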
1.4.4. The Equation for the Inverse Function. Sometimes the equation for a function y is neither exact nor semi-exact, but the equation for the inverse function y⁻¹ might be. We now try to find out when this can happen. To carry out this study it is more convenient to change a little bit the notation we have been using so far:
(a) We change the independent variable name from t to x. Therefore, we write differential equations as
$$N(x,y)\,y' + M(x,y) = 0, \qquad y = y(x), \quad y' = \frac{dy}{dx}.$$
(b) We denote by x(y) the inverse of y(x), that is,
$$x(y_1) = x_1 \quad\Leftrightarrow\quad y(x_1) = y_1.$$
(c) Recall the identity relating the derivatives of a function and its inverse,
$$x'(y) = \frac{1}{y'(x)}.$$
Our first result says that for exact equations it makes no difference to solve for y or its
inverse x. If one equation is exact, so is the other equation.
So, if the equation for y is exact, so is the equation for its inverse x. The same is not true
for semi-exact equations. If the equation for y is semi-exact, then the equation for its inverse
x might or might not be semi-exact. The next result states a condition on the equation for
the inverse function x to be semi-exact. This condition is not equal to the condition on the
equation for the function y to be semi-exact. Compare Theorems 1.4.5 and 1.4.7.
Remarks:
(a) The function $\mu(y) = e^{L(y)}$ is called an integrating factor.
(b) Any integrating factor μ is a solution of the differential equation
$$\mu'(y) = \ell(y)\,\mu(y).$$
(c) Multiplication by an integrating factor transforms a non-exact equation
$$M\,x' + N = 0$$
into an exact equation,
$$(\mu M)\,x' + (\mu N) = 0.$$
Verification Proof of Theorem 1.4.7: We need to verify that the equation is exact,
$$\left(e^L M\right)x' + \left(e^L N\right) = 0 \quad\Rightarrow\quad \tilde M(x,y) = e^{L(y)}\,M(x,y), \quad \tilde N(x,y) = e^{L(y)}\,N(x,y).$$
We now check for exactness, and let us recall $\partial_y\left(e^L\right) = \left(e^L\right)' = \ell\,e^L$; then
$$\partial_y \tilde M = \ell\,e^L M + e^L \partial_y M, \qquad \partial_x \tilde N = e^L \partial_x N.$$
Let us use the definition of ℓ in the first equation above,
$$\partial_y \tilde M = e^L\left(-\frac{\left(\partial_y M - \partial_x N\right)}{M}\,M + \partial_y M\right) = e^L\,\partial_x N = \partial_x \tilde N.$$
So the equation is exact. This establishes the Theorem.
Constructive Proof of Theorem 1.4.7: The original differential equation
$$M\,x' + N = 0$$
is not exact because $\partial_y M \neq \partial_x N$. Now multiply the differential equation by a nonzero function μ that depends only on y,
$$(\mu M)\,x' + (\mu N) = 0.$$
We look for a function μ such that this new equation is exact. This means that μ must satisfy the equation
$$\partial_y(\mu M) = \partial_x(\mu N).$$
Solution: We first check if the equation is exact for the unknown function y, which depends on the variable x. If we write the equation as $N\,y' + M = 0$, with $y' = dy/dx$, then
$$N(x,y) = 5x\,e^{-y} + 2\cos(3x) \quad\Rightarrow\quad \partial_x N(x,y) = 5\,e^{-y} - 6\sin(3x),$$
$$M(x,y) = 5\,e^{-y} - 3\sin(3x) \quad\Rightarrow\quad \partial_y M(x,y) = -5\,e^{-y}.$$
Since $\partial_x N \neq \partial_y M$, the equation is not exact. Let us check if there exists an integrating factor that depends only on x. Following Theorem 1.4.5 we study the function
$$h = \frac{\partial_y M - \partial_x N}{N} = \frac{-10\,e^{-y} + 6\sin(3x)}{5x\,e^{-y} + 2\cos(3x)},$$
which is a function of both x and y and cannot be simplified into a function of x alone. Hence an integrating factor cannot be a function of only x.
Let us now consider the equation for the inverse function x, which depends on the variable y. The equation is $M\,x' + N = 0$, with $x' = dx/dy$, where M and N are the same as before,
$$M(x,y) = 5\,e^{-y} - 3\sin(3x), \qquad N(x,y) = 5x\,e^{-y} + 2\cos(3x).$$
We know from Theorem 1.4.6 that this equation is not exact. Both the equation for y and the equation for its inverse x must satisfy the same condition to be exact. The condition is $\partial_x N = \partial_y M$, but we have seen that this is not true for the equation in this example. The last thing we can do is to check if the equation for the inverse function x has an integrating factor that depends only on y. Following Theorem 1.4.7 we study the function
$$\ell = -\frac{\left(\partial_y M - \partial_x N\right)}{M} = -\frac{-10\,e^{-y} + 6\sin(3x)}{5\,e^{-y} - 3\sin(3x)} = 2 \quad\Rightarrow\quad \ell(y) = 2.$$
The function above does not depend on x, so we can solve the differential equation for μ(y),
$$\mu'(y) = \ell(y)\,\mu(y) \quad\Rightarrow\quad \mu'(y) = 2\,\mu(y) \quad\Rightarrow\quad \mu(y) = \mu_0\,e^{2y}.$$
Since μ is an integrating factor, we can choose μ₀ = 1, hence $\mu(y) = e^{2y}$. If we multiply the equation for x by this integrating factor we get
$$e^{2y}\left(5\,e^{-y} - 3\sin(3x)\right)x' + e^{2y}\left(5x\,e^{-y} + 2\cos(3x)\right) = 0,$$
Notes. Exact differential equations are studied in Boyce-DiPrima [3], Section 2.6, and in
most differential equation textbooks.
1.4.5. Exercises.

1.4.3.- Consider the equation
$$y' = -\frac{2 + y\,e^{ty}}{2y + t\,e^{ty}}.$$
(a) Determine whether the differential equation is exact.
(b) Find every solution of the equation above.

1.4.4.- Consider the equation
$$\left(6x^5 - x\,y\right) + \left(x^2 + x\,y^2\right)y' = 0,$$
with initial condition y(0) = 1.
(a) Find an integrating factor μ that converts the equation above into an exact equation.
(b) Find an implicit expression for the solution y of the IVP.

$$\cdots + 3\,e^{2y} + 5\cos(5x) = 0.$$
(a) Is this equation for y exact? If not, does this equation have an integrating factor depending on x?
(b) Is the equation for $x = y^{-1}$ exact? If not, does this equation have an integrating factor depending on y?
(c) Find an implicit expression for all solutions y of the differential equation above.

1.4.7.- * Find the solution to the equation
$$2t^2\,y + 2t^2\,y^2 + 1 + \left(t^3 + 2t^3\,y + 2t\,y\right)y' = 0,$$
with initial condition
$$y(1) = 2.$$
Remark: The equation $N' = k\,N$, with k > 0, is called the exponential growth equation. We have seen in § 1.1 how to solve this equation, but we review it here one more time.

Theorem 1.5.2 (Exponential Decay). The solution N of the exponential decay equation $N' = -k\,N$ with initial condition N(0) = N₀ is
$$N(t) = N_0\,e^{-kt}.$$
Proof of Theorem 1.5.2: The differential equation above is both linear and separable. We choose to solve it using the integrating factor method. The integrating factor is $e^{kt}$,
$$\left(N' + k\,N\right)e^{kt} = 0 \quad\Rightarrow\quad \left(e^{kt}\,N\right)' = 0 \quad\Rightarrow\quad e^{kt}\,N = c, \qquad c \in \mathbb{R}.$$
The initial condition gives N₀ = N(0) = c, so the solution of the initial value problem is
$$N(t) = N_0\,e^{-kt}.$$
This establishes the Theorem.
Remark: Radioactive materials are often characterized not by their decay constant k but by their half-life τ. This is the time it takes for half the material to decay.

Definition 1.5.3. The half-life of a radioactive substance is the time τ such that
$$N(\tau) = \frac{N(0)}{2}.$$
There is a simple relation between the material decay constant and the material half-life.

Theorem 1.5.4. A radioactive material decay constant k and half-life τ are related by the equation
$$k\,\tau = \ln(2).$$
Proof of Theorem 1.5.4: We know that the amount of a radioactive material as a function of time is given by
$$N(t) = N_0\,e^{-kt}.$$
Then, the definition of half-life implies
$$\frac{N_0}{2} = N_0\,e^{-k\tau} \quad\Rightarrow\quad -k\,\tau = \ln\left(\frac{1}{2}\right) \quad\Rightarrow\quad k\,\tau = \ln(2).$$
This establishes the Theorem.
Remark: A radioactive material, N, can be expressed in terms of the half-life,
$$N(t) = N_0\,e^{-(t/\tau)\ln(2)} \quad\Rightarrow\quad N(t) = N_0\,e^{\ln\left[2^{-t/\tau}\right]} \quad\Rightarrow\quad N(t) = N_0\,2^{-t/\tau}.$$
From this last expression it is clear that for t = τ we get N(τ) = N₀/2.
Our first example is about dating remains with Carbon-14. Carbon-14 is a radioactive isotope of carbon with a half-life of τ = 5730 years. Carbon-14 is being constantly created in the upper atmosphere, by collisions of Carbon-12 with outer space radiation, and is accumulated by living organisms. While the organism lives, the amount of Carbon-14 in the organism is held constant: the decay of Carbon-14 is compensated with new amounts when the organism breathes or eats. When the organism dies, the amount of Carbon-14 in its remains decays. So the balance between normal and radioactive carbon in the remains changes in time.
Example 1.5.1: Bone remains in an ancient excavation site contain only 14% of the Carbon-14 found in living animals today. Estimate how old the bone remains are. Use that the half-life of Carbon-14 is τ = 5730 years.
Solution: Suppose that t = 0 is set at the time when the organism died. If at the present time t₁ the remains contain 14% of the original amount, that means
$$N(t_1) = \frac{14}{100}\,N(0).$$
Since Carbon-14 is a radioactive substance with half-life τ, the amount of Carbon-14 decays in time as follows,
$$N(t) = N(0)\,2^{-t/\tau},$$
where τ = 5730 years is the Carbon-14 half-life. Therefore,
$$2^{-t_1/\tau} = \frac{14}{100} \quad\Rightarrow\quad -\frac{t_1}{\tau} = \log_2\left(\frac{14}{100}\right) \quad\Rightarrow\quad t_1 = \tau\,\log_2\left(\frac{100}{14}\right).$$
We obtain $t_1 \approx 16{,}253$ years. The organism died more than 16,000 years ago.
Solution: (Using the decay constant k.) We write the solution of the radioactive decay equation as
$$N(t) = N(0)\,e^{-kt}, \qquad k\,\tau = \ln(2).$$
Write the condition for t₁, to be 14% of the original Carbon-14, as follows,
$$N(0)\,e^{-k t_1} = \frac{14}{100}\,N(0) \quad\Rightarrow\quad e^{-k t_1} = \frac{14}{100} \quad\Rightarrow\quad -k\,t_1 = \ln\left(\frac{14}{100}\right),$$
so $t_1 = \dfrac{1}{k}\ln\left(\dfrac{100}{14}\right)$. Recalling the expression for k in terms of τ, that is, $k\,\tau = \ln(2)$, we get
$$t_1 = \tau\,\frac{\ln(100/14)}{\ln(2)}.$$
We get $t_1 \approx 16{,}253$ years, which is the same result as above, since
$$\log_2\left(\frac{100}{14}\right) = \frac{\ln(100/14)}{\ln(2)}.$$
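The numbers above are a one-line computation (this snippet is an addition to the notes):

```python
import math

tau = 5730.0                   # Carbon-14 half-life in years
k = math.log(2) / tau          # decay constant, from k*tau = ln(2)

t1 = math.log(100/14) / k      # time at which N(t1)/N(0) = 14/100
print(round(t1))               # 16253 years, as found above
```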
1.5.2. Newton's Cooling Law. In 1701 Newton published, anonymously, the result of his home made experiments done fifteen years earlier. He focused on the time evolution of the temperature of objects that rest in a medium with constant temperature. He found that the difference between the temperatures of an object and the constant temperature of the medium varies geometrically towards zero as time varies arithmetically. This was his way of saying that the difference of temperatures, ΔT, depends on time as
$$(\Delta T)(t) = (\Delta T)_0\,e^{-t/\tau},$$
for some initial temperature difference $(\Delta T)_0$ and some time scale τ. Although this is called a Cooling Law, it also describes objects that warm up. When $(\Delta T)_0 > 0$, the object is cooling down, but when $(\Delta T)_0 < 0$, the object is warming up.
Newton knew pretty well that the function ΔT above is a solution of a very particular differential equation. But he chose to put more emphasis on the solution rather than on the equation. Nowadays people think that differential equations are more fundamental than their solutions, so we define Newton's cooling law as follows.
Definition 1.5.5. The Newton cooling law says that the temperature T at a time t of a material placed in a surrounding medium kept at a constant temperature Ts satisfies
$$(\Delta T)' = -k\,(\Delta T),$$
with $\Delta T(t) = T(t) - T_s$, and k > 0 a constant characterizing the material thermal properties.
Remark: Newton's cooling law for ΔT is the same as the radioactive decay equation. But now the initial temperature difference, $(\Delta T)(0) = T(0) - T_s$, can be either positive or negative.
Theorem 1.5.6. The solution of Newton's cooling law equation $(\Delta T)' = -k\,(\Delta T)$ with initial data T(0) = T₀ is
$$T(t) = (T_0 - T_s)\,e^{-kt} + T_s.$$
Proof of Theorem 1.5.6: Newton's cooling law is a first order linear equation, which we solved in § 1.1. The general solution is
$$(\Delta T)(t) = c\,e^{-kt} \quad\Rightarrow\quad T(t) = c\,e^{-kt} + T_s, \qquad c \in \mathbb{R},$$
where we used that $(\Delta T)(t) = T(t) - T_s$. The initial condition implies
$$T_0 = T(0) = c + T_s \quad\Rightarrow\quad c = T_0 - T_s \quad\Rightarrow\quad T(t) = (T_0 - T_s)\,e^{-kt} + T_s.$$
This establishes the Theorem.
Example 1.5.2: A cup with water at 45°C is placed in a cooler held at 5°C. If after 2 minutes the water temperature is 25°C, when will the water temperature be 15°C?
Solution: We know that the solution of the Newton cooling law equation is
$$T(t) = (T_0 - T_s)\,e^{-kt} + T_s,$$
and we also know that in this case we have
$$T_0 = 45, \qquad T_s = 5, \qquad T(2) = 25.$$
In this example we need to find t₁ such that T(t₁) = 15. In order to find that t₁ we first need to find the constant k,
$$T(t) = (45 - 5)\,e^{-kt} + 5 \quad\Rightarrow\quad T(t) = 40\,e^{-kt} + 5.$$
Now use the fact that T(2) = 25°C, that is,
$$20 = T(2) = 40\,e^{-2k} \quad\Rightarrow\quad \ln\left(\frac{1}{2}\right) = -2k \quad\Rightarrow\quad k = \frac{1}{2}\ln(2).$$
Having the constant k we can now go on and find the time t₁ such that T(t₁) = 15°C,
$$T(t) = 40\,e^{-t\ln(\sqrt{2})} + 5 \quad\Rightarrow\quad 10 = 40\,e^{-t_1\ln(\sqrt{2})} \quad\Rightarrow\quad t_1 = 4.$$
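The same computation in a few lines of Python (added here; not part of the original notes):

```python
import math

T0, Ts = 45.0, 5.0                          # initial and cooler temperatures (C)
k = -math.log((25 - Ts) / (T0 - Ts)) / 2    # from T(2) = 25: k = ln(2)/2
t1 = -math.log((15 - Ts) / (T0 - Ts)) / k   # solve T(t1) = 15
print(k, t1)                                # k ~ 0.3466 1/min, t1 = 4.0 min
```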
1.5.3. Mixing Problems. We study the system pictured in Fig. 3. A tank has a salt mass Q(t) dissolved in a volume V(t) of water at a time t. Water is pouring into the tank at a rate rᵢ(t) with a salt concentration qᵢ(t). Water is also leaving the tank at a rate rₒ(t) with a salt concentration qₒ(t). Recall that a water rate r means water volume per unit time, and a salt concentration q means salt mass per unit volume.

Definition 1.5.7. A Mixing Problem refers to water coming into a tank at a rate rᵢ with salt concentration qᵢ, and going out of the tank at a rate rₒ and salt concentration qₒ, so that the water volume V and the total amount of salt Q, which is instantaneously mixed, in the tank satisfy the following equations,
$$V'(t) = r_i(t) - r_o(t), \qquad (1.5.1)$$
$$Q'(t) = r_i(t)\,q_i(t) - r_o(t)\,q_o(t), \qquad (1.5.2)$$
$$q_o(t) = \frac{Q(t)}{V(t)}, \qquad (1.5.3)$$
$$r_i'(t) = r_o'(t) = 0. \qquad (1.5.4)$$
The first and second equations above are just the mass conservation of water and salt,
respectively. Water volume and mass are proportional, so both are conserved, and we
chose the volume to write down this conservation in Eq. (1.5.1). This equation is indeed
a conservation because it says that the water volume variation in time is equal to the
difference of volume time rates coming in and going out of the tank. Eq. (1.5.2) is the salt
mass conservation, since the salt mass variation in time is equal to the difference of the salt
mass time rates coming in and going out of the tank. The product of a water rate r times a
salt concentration q has units of mass per time and represents the amount of salt entering or
leaving the tank per unit time. Eq. (1.5.3) is the consequence of the instantaneous mixing
mechanism in the tank. Since the salt in the tank is well-mixed, the salt concentration is
homogeneous in the tank, with value Q(t)/V (t). Finally the equations in (1.5.4) say that
both rates in and out are time independent, hence constants.
Theorem 1.5.8. The amount of salt in the mixing problem above satisfies the equation
$$Q'(t) = a(t)\,Q(t) + b(t), \qquad (1.5.5)$$
where the coefficients in the equation are given by
$$a(t) = -\frac{r_o}{(r_i - r_o)\,t + V_0}, \qquad b(t) = r_i\,q_i(t). \qquad (1.5.6)$$
Proof of Theorem 1.5.8: The equation for the salt in the tank given in (1.5.5) comes from Eqs. (1.5.1)-(1.5.4). We start noting that Eq. (1.5.4) says that the water rates are constant. We denote them as rᵢ and rₒ. This information in Eq. (1.5.1) implies that V' is constant. Then we can easily integrate this equation to obtain
$$V(t) = (r_i - r_o)\,t + V_0, \qquad (1.5.7)$$
where V₀ = V(0) is the water volume in the tank at the initial time t = 0. On the other hand, Eqs. (1.5.2) and (1.5.3) imply that
$$Q'(t) = r_i\,q_i(t) - \frac{r_o}{V(t)}\,Q(t).$$
Since V(t) is known from Eq. (1.5.7), we get that the function Q must be a solution of the differential equation
$$Q'(t) = r_i\,q_i(t) - \frac{r_o}{(r_i - r_o)\,t + V_0}\,Q(t).$$
This is a linear ODE for the function Q. Indeed, introducing the functions
$$a(t) = -\frac{r_o}{(r_i - r_o)\,t + V_0}, \qquad b(t) = r_i\,q_i(t),$$
the differential equation for Q has the form
$$Q'(t) = a(t)\,Q(t) + b(t).$$
This establishes the Theorem.
We could use the formula for the general solution of a linear equation, given in § 1.2, to write the solution of Eq. (1.5.5) for Q. That formula covers all the cases we are going to study in this section. Since we already know that formula, we choose instead to find solutions in particular cases. These cases are given by specific choices of the rate constants r_i, r_o, the concentration function q_i, and the initial data constants V_0 and Q_0 = Q(0). The study of solutions to Eq. (1.5.5) in several particular cases might provide a deeper understanding of the physical situation under study than the expression of the solution Q in the general case.
Example 1.5.3 (General Case for V(t) = V_0): Consider a mixing problem with equal constant water rates r_i = r_o = r, with constant incoming concentration q_i, and with a given initial water volume in the tank V_0. Then, find the solution to the initial value problem

Q'(t) = a(t) Q(t) + b(t),    Q(0) = Q_0,

where the functions a and b are given in Eq. (1.5.6). Graph the solution function Q for different values of the initial condition Q_0.

Solution: The assumption r_i = r_o = r implies that the function a is constant, while the assumption that q_i is constant implies that the function b is constant too,

a(t) = - r_o / ((r_i - r_o) t + V_0)  ⇒  a(t) = - r / V_0 = a_0,
b(t) = r_i q_i(t)  ⇒  b(t) = r q_i = b_0.

Then, we must solve the initial value problem for a constant coefficients linear equation,

Q'(t) = a_0 Q(t) + b_0,    Q(0) = Q_0.

The integrating factor method can be used to find the solution of the initial value problem above. The formula for the solution is given in Theorem 1.1.4,

Q(t) = (Q_0 + b_0/a_0) e^{a_0 t} - b_0/a_0.

In our case we can evaluate the constant b_0/a_0, and the result is

b_0/a_0 = (r q_i)(-V_0/r)  ⇒  b_0/a_0 = -q_i V_0.

Then, the solution Q has the form

Q(t) = (Q_0 - q_i V_0) e^{-rt/V_0} + q_i V_0.    (1.5.8)

The initial amount of salt Q_0 in the tank can be any non-negative real number. The solution behaves differently for different values of Q_0. We classify these values in three classes:

(a) Q_0 > q_i V_0: the solution Q decreases monotonically towards q_i V_0;
(b) Q_0 = q_i V_0: the solution Q is the constant q_i V_0 for all times;
(c) Q_0 < q_i V_0: the solution Q increases monotonically towards q_i V_0.

In every case the solution approaches the asymptotic value q_i V_0 as t → ∞.
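A quick plot confirms these three behaviors. Here is a minimal matplotlib sketch of the solution (1.5.8); the parameter values are ours, chosen only for illustration.

    import numpy as np
    import matplotlib.pyplot as plt

    r, V0, qi = 2.0, 10.0, 1.0              # illustrative values, qi*V0 = 10
    t = np.linspace(0.0, 15.0, 200)

    # Solution (1.5.8): Q(t) = (Q0 - qi V0) e^{-r t / V0} + qi V0.
    for Q0 in (20.0, 10.0, 0.0):
        Q = (Q0 - qi * V0) * np.exp(-r * t / V0) + qi * V0
        plt.plot(t, Q, label=f"Q0 = {Q0}")

    plt.axhline(qi * V0, linestyle="--")    # the asymptote Q = qi V0
    plt.xlabel("t"); plt.ylabel("Q(t)"); plt.legend(); plt.show()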
Example 1.5.4 (Find a particular time, for V(t) = V_0): Consider a mixing problem with equal constant water rates r_i = r_o = r, where fresh water is coming into the tank, hence q_i = 0. Then, find the time t_1 such that the salt concentration in the tank Q(t)/V(t) is 1% of the initial value. Write that time t_1 in terms of the rate r and the initial water volume V_0.

Solution: The first step to solve this problem is to find the solution Q of the initial value problem

Q'(t) = a(t) Q(t) + b(t),    Q(0) = Q_0,

where the functions a and b are given in Eq. (1.5.6). In this case they are

a(t) = - r_o / ((r_i - r_o) t + V_0)  ⇒  a(t) = - r / V_0,
b(t) = r_i q_i(t)  ⇒  b(t) = 0.

The initial value problem we need to solve is

Q'(t) = -(r/V_0) Q(t),    Q(0) = Q_0.

From § 1.1 we know that the solution is given by

Q(t) = Q_0 e^{-rt/V_0}.

We can now proceed to find the time t_1. We first need to find the concentration Q(t)/V(t). We already have Q(t) and we know that V(t) = V_0, since r_i = r_o. Therefore,

Q(t)/V(t) = Q(t)/V_0 = (Q_0/V_0) e^{-rt/V_0}.

The condition that defines t_1 is

Q(t_1)/V(t_1) = (1/100) (Q_0/V_0).

From the two equations above we conclude that

(1/100) (Q_0/V_0) = Q(t_1)/V(t_1) = (Q_0/V_0) e^{-r t_1/V_0}.

The time t_1 comes from the equation

1/100 = e^{-r t_1/V_0}  ⇒  ln(1/100) = -r t_1/V_0  ⇒  r t_1/V_0 = ln(100).

The final result is given by

t_1 = (V_0 / r) ln(100).
C
Example 1.5.5 (Nonzero q_i, for V(t) = V_0): Consider a mixing problem with equal constant water rates r_i = r_o = r, with only fresh water in the tank at the initial time, hence Q_0 = 0, and with a given initial volume of water in the tank V_0. Then find the function salt in the tank Q if the incoming salt concentration is given by the function

q_i(t) = 2 + sin(2t).

Solution: As in the previous examples, a(t) = -r/V_0 = -a_0 and b(t) = r q_i(t), so we must solve the initial value problem

Q'(t) = -a_0 Q(t) + r (2 + sin(2t)),    Q(0) = 0,    a_0 = r/V_0.

The integrating factor method of § 1.1 gives

Q(t) = e^{-a_0 t} ∫_0^t e^{a_0 s} r (2 + sin(2s)) ds,

where we used that the initial condition is Q_0 = 0. This is the formula for the solution of the problem; we only need to compute the integral given in the equation above. This is not straightforward, though. We start with the following integral found in an integration table,

∫ e^{ks} sin(ls) ds = (e^{ks} / (k² + l²)) (k sin(ls) - l cos(ls)),

where k and l are constants. Therefore,

∫_0^t e^{a_0 s} (2 + sin(2s)) ds = [ (2/a_0) e^{a_0 s} ]_0^t + [ (e^{a_0 s} / (a_0² + 2²)) (a_0 sin(2s) - 2 cos(2s)) ]_0^t
  = (2/a_0) (e^{a_0 t} - 1) + (e^{a_0 t} / (a_0² + 2²)) (a_0 sin(2t) - 2 cos(2t)) + 2/(a_0² + 2²).

With the integral above we can compute the solution Q as follows,

Q(t) = r e^{-a_0 t} [ (2/a_0) (e^{a_0 t} - 1) + (e^{a_0 t} / (a_0² + 2²)) (a_0 sin(2t) - 2 cos(2t)) + 2/(a_0² + 2²) ],

recalling that a_0 = r/V_0. We rewrite the expression above as follows,

Q(t) = (2r/a_0) + r ( 2/(a_0² + 2²) - 2/a_0 ) e^{-a_0 t} + (r/(a_0² + 2²)) (a_0 sin(2t) - 2 cos(2t)).    (1.5.9)
C
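Formula (1.5.9) can be checked against a direct numerical integration of the initial value problem. Below is a scipy sketch; the parameter values are our own, for illustration.

    import numpy as np
    from scipy.integrate import solve_ivp

    r, V0 = 1.0, 2.0
    a0 = r / V0

    # Q' = -a0 Q + r (2 + sin(2t)), Q(0) = 0.
    sol = solve_ivp(lambda t, Q: -a0 * Q + r * (2.0 + np.sin(2.0 * t)),
                    (0.0, 10.0), [0.0], dense_output=True, rtol=1e-8)

    t = np.linspace(0.0, 10.0, 50)
    exact = (2.0 * r / a0
             + r * (2.0 / (a0**2 + 4.0) - 2.0 / a0) * np.exp(-a0 * t)
             + r * (a0 * np.sin(2.0 * t) - 2.0 * np.cos(2.0 * t)) / (a0**2 + 4.0))

    print(np.max(np.abs(sol.sol(t)[0] - exact)))    # tiny; the two agree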
1.5.4. Exercises.
1.5.2.- A vessel with liquid at 18°C is placed in a cooler held at 3°C, and after 3 minutes the temperature drops to 13°C.
(a) Find the differential equation satisfied by the temperature T of the liquid in the cooler at time t > 0.
(b) Find the temperature function of the liquid once it is put in the cooler.
(c) Find the liquid cooling constant.

1.5.3.- A tank initially contains V_0 = 100 liters of water with Q_0 = 25 grams of salt. The tank is rinsed with fresh water flowing in at a rate of r_i = 5 liters per minute and leaving the tank at the same rate. The water in the tank is well-stirred. Find the time such that the amount of salt in the tank is Q_1 = 5 grams.

1.5.5.- A tank with a capacity of V_m = 500 liters originally contains V_0 = 200 liters of water with Q_0 = 100 grams of salt in solution. Water containing salt with a concentration of q_i = 1 gram per liter is poured in at a rate of r_i = 3 liters per minute. The well-stirred water is allowed to pour out of the tank at a rate of r_o = 2 liters per minute. Find the salt concentration in the tank at the time when the tank is about to overflow. Compare this concentration with the limiting concentration as t → ∞ if the tank had infinite capacity.
is nonlinear, since the function f(t, y) = 2ty + ln(y) is nonlinear in the second argument, due to the term ln(y).
(c) The differential equation

y'(t)/y(t) = 2t²

is linear, since the function f(t, y) = 2t² y is linear in the second argument.
C
The Picard-Lindelöf Theorem shows that certain nonlinear equations have solutions, uniquely determined by appropriate initial data.

Theorem 1.6.2 (Picard-Lindelöf). Consider the initial value problem

y'(t) = f(t, y(t)),    y(t_0) = y_0.    (1.6.1)

If the function f is continuous on the domain D_a = [t_0 - a, t_0 + a] × [y_0 - a, y_0 + a] ⊂ ℝ², for some a > 0, and f is Lipschitz continuous in y, that is, there exists k > 0 such that

|f(t, y_2) - f(t, y_1)| < k |y_2 - y_1|

for all (t, y_2), (t, y_1) ∈ D_a, then there exists a positive b < a such that there exists a unique solution y, on the domain [t_0 - b, t_0 + b], of the initial value problem in (1.6.1).
Remark: We prove this theorem by rewriting the differential equation as an integral equation for the unknown function y. Then we use this integral equation to construct a sequence of approximate solutions {y_n} to the original initial value problem. Next we show that this sequence of approximate solutions has a unique limit as n → ∞. We end the proof showing that this limit is the only solution of the original initial value problem. This proof follows [15] § 1.6 and Zeidler's [16] § 1.8. It is important to read the review on complete normed vector spaces, called Banach spaces, given in these references.
Proof of Theorem 1.6.2: We start writing the differential equation in (1.6.1) as an integral equation, hence we integrate both sides of that equation with respect to t,

∫_{t_0}^t y'(s) ds = ∫_{t_0}^t f(s, y(s)) ds  ⇒  y(t) = y_0 + ∫_{t_0}^t f(s, y(s)) ds.    (1.6.2)

We have used the Fundamental Theorem of Calculus on the left-hand side of the first equation to get the second equation, and we have introduced the initial condition y(t_0) = y_0. We use this integral form of the original differential equation to construct a sequence of functions {y_n}_{n=0}^∞. The domain of every function in this sequence is D_a = [t_0 - a, t_0 + a]. The sequence is defined as follows,

y_{n+1}(t) = y_0 + ∫_{t_0}^t f(s, y_n(s)) ds,    n ≥ 0,    y_0(t) = y_0.    (1.6.3)
We see that the first element in the sequence is the constant function determined by the initial conditions in (1.6.1). The iteration in (1.6.3) is called the Picard iteration. The central idea of the proof is to show that the sequence {y_n} is a Cauchy sequence in the space C(D_b) of uniformly continuous functions on the domain D_b = [t_0 - b, t_0 + b], for a small enough b > 0. This function space is a Banach space under the norm

‖y‖ = max_{t ∈ D_b} |y(t)|.

See [15] and references therein for the definition of Cauchy sequences, Banach spaces, and the proof that C(D_b) with that norm is a Banach space. We now show that the sequence {y_n} is a Cauchy sequence in that space.
Any two consecutive elements in the sequence satisfy

‖y_{n+1} - y_n‖ = max_{t∈D_b} | ∫_{t_0}^t f(s, y_n(s)) ds - ∫_{t_0}^t f(s, y_{n-1}(s)) ds |
  ≤ max_{t∈D_b} ∫_{t_0}^t | f(s, y_n(s)) - f(s, y_{n-1}(s)) | ds
  ≤ k max_{t∈D_b} ∫_{t_0}^t | y_n(s) - y_{n-1}(s) | ds
  ≤ k b ‖y_n - y_{n-1}‖.

Denote r = kb. Iterating the inequality above we get ‖y_{n+1} - y_n‖ ≤ r^n ‖y_1 - y_0‖. Using the triangle inequality for norms and the sum of a geometric series one computes the following,

‖y_n - y_{n+m}‖ = ‖y_n - y_{n+1} + y_{n+1} - y_{n+2} + ... + y_{n+(m-1)} - y_{n+m}‖
  ≤ ‖y_n - y_{n+1}‖ + ‖y_{n+1} - y_{n+2}‖ + ... + ‖y_{n+(m-1)} - y_{n+m}‖
  ≤ (r^n + r^{n+1} + ... + r^{n+m-1}) ‖y_1 - y_0‖
  ≤ r^n (1 + r + r² + ... + r^{m-1}) ‖y_1 - y_0‖
  ≤ r^n ((1 - r^m)/(1 - r)) ‖y_1 - y_0‖.
Now choose the positive constant b such that b < min{a, 1/k}, hence 0 < r < 1. In this case the sequence {y_n} is a Cauchy sequence in the Banach space C(D_b) with norm ‖·‖, hence it converges. Denote the limit by y = lim_{n→∞} y_n. This function satisfies the equation

y(t) = y_0 + ∫_{t_0}^t f(s, y(s)) ds,

which says that y is not only continuous but also differentiable in the interior of D_b, hence y is a solution of the initial value problem in (1.6.1). The proof of uniqueness of the solution follows the same argument used to show that the sequence above is a Cauchy sequence. Consider two solutions y and ỹ of the initial value problem above. That means,

y(t) = y_0 + ∫_{t_0}^t f(s, y(s)) ds,    ỹ(t) = y_0 + ∫_{t_0}^t f(s, ỹ(s)) ds.

Subtracting these equations and repeating the estimate above gives ‖y - ỹ‖ ≤ r ‖y - ỹ‖ with r < 1, hence ‖y - ỹ‖ = 0, that is, y = ỹ. This establishes the Theorem.
Example 1.6.2: Use the proof of Picard-Lindelöf's Theorem to find the solution to

y' = 2y + 3,    y(0) = 1.

Solution: This is the particular case a = 2, b = 3, y_0 = 1 of Example 1.6.3 below. The Picard iteration produces partial sums of a power series, and the key step to add up the series is the expansion of the exponential,

e^{at} = Σ_{k=0}^∞ (at)^k / k! = 1 + Σ_{k=1}^∞ (at)^k / k!  ⇒  Σ_{k=1}^∞ (at)^k / k! = e^{at} - 1.

With a = 2 and b = 3 the result of that example gives y(t) = (1 + 3/2) e^{2t} - 3/2, that is, y(t) = (5/2) e^{2t} - 3/2.
C
Example 1.6.3: Use the proof of Picard-Lindelöf's Theorem to find the solution to

y' = a y + b,    y(0) = y_0,    a, b ∈ ℝ.

Solution: Computing the first Picard iterates as in the previous example, after three steps one finds

y_3(t) = y_0 + (a y_0 + b) t + (a y_0 + b) a t²/2 + (a y_0 + b) a² t³/6.

We already have the factorials n! on each term t^n. We now realize we can write the power functions as (at)^n if we multiply each term by one, as follows,

y_3(t) = y_0 + ((a y_0 + b)/a) (at)¹/1! + ((a y_0 + b)/a) (at)²/2! + ((a y_0 + b)/a) (at)³/3!.

Now we can pull a common factor,

y_3(t) = y_0 + (y_0 + b/a) [ (at)¹/1! + (at)²/2! + (at)³/3! ].

From this last expression it is simple to guess the N-th approximation

y_N(t) = y_0 + (y_0 + b/a) [ (at)¹/1! + (at)²/2! + (at)³/3! + ... + (at)^N/N! ],

lim_{N→∞} y_N(t) = y_0 + (y_0 + b/a) Σ_{k=1}^∞ (at)^k / k!.

Recall now the power series expansion for the exponential,

e^{at} = Σ_{k=0}^∞ (at)^k / k! = 1 + Σ_{k=1}^∞ (at)^k / k!.

Notice that the sum in the exponential starts at k = 0, while the sum in y_N starts at k = 1. Then, the limit N → ∞ is given by

y(t) = lim_{N→∞} y_N(t) = y_0 + (y_0 + b/a) Σ_{k=1}^∞ (at)^k / k! = y_0 + (y_0 + b/a)(e^{at} - 1).

We have been able to add the power series, and we have the solution written in terms of simple functions. One last rewriting of the solution and we obtain

y(t) = (y_0 + b/a) e^{at} - b/a.
C
Example 1.6.4: Use the Picard iteration to find the solution of

y' = 5t y,    y(0) = 1.

Solution: We now compute the first four elements in the sequence. The first one is y_0 = y(0) = 1; the second one, y_1, is given by

n = 0:  y_1(t) = 1 + ∫_0^t 5s ds = 1 + (5/2) t².

So y_1 = 1 + (5/2) t². Now we compute y_2,

y_2 = 1 + ∫_0^t 5s y_1(s) ds = 1 + ∫_0^t ( 5s + (5²/2) s³ ) ds = 1 + (5/2) t² + (5²/8) t⁴.

So we obtained y_2(t) = 1 + (5/2) t² + (5²/2³) t⁴. A similar calculation gives us y_3,

y_3 = 1 + ∫_0^t 5s y_2(s) ds = 1 + ∫_0^t ( 5s + (5²/2) s³ + (5³/2³) s⁵ ) ds
  = 1 + (5/2) t² + (5²/8) t⁴ + (5³/(2³ · 6)) t⁶.

So we obtained y_3(t) = 1 + (5/2) t² + (5²/2³) t⁴ + (5³/(2⁴ · 3)) t⁶. We now rewrite this expression so we can get a power series expansion that can be written in terms of simple functions. The first step is to write the powers of t as (t²)^n, for n = 1, 2, 3,

y_3(t) = 1 + (5/2)(t²)¹ + (5²/2³)(t²)² + (5³/(2⁴ · 3))(t²)³.

Now we multiply each term by one so that the right factorials n! appear,

y_3(t) = 1 + (5/2)(t²)¹/1! + (5²/2²)(t²)²/2! + (5³/2³)(t²)³/3!.

Now we realize that the factor 5/2 can be written together with the powers of t²,

y_3(t) = 1 + (5t²/2)¹/1! + (5t²/2)²/2! + (5t²/2)³/3!.

From this last expression it is simple to guess the N-th approximation

y_N(t) = 1 + Σ_{k=1}^N (5t²/2)^k / k!.

Recall now the power series expansion for the exponential,

e^{at} = Σ_{k=0}^∞ (at)^k / k! = 1 + Σ_{k=1}^∞ (at)^k / k!,

so we get

y(t) = lim_{N→∞} y_N(t) = 1 + (e^{5t²/2} - 1)  ⇒  y(t) = e^{5t²/2}.
C
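The Picard iterates above can also be generated symbolically. Here is a minimal sympy sketch, our own illustration, that reproduces y_1, y_2, y_3 for y' = 5t y, y(0) = 1:

    import sympy as sp

    t, s = sp.symbols("t s")

    y = sp.Integer(1)                  # y_0(t) = y(0) = 1
    for n in range(3):
        # Picard iteration (1.6.3): y_{n+1}(t) = 1 + \int_0^t 5 s y_n(s) ds.
        y = 1 + sp.integrate(5 * s * y.subs(t, s), (s, 0, t))
        print(sp.expand(y))

    # Last line: 125*t**6/48 + 25*t**4/8 + 5*t**2/2 + 1, matching y_3 above.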
Example 1.6.5: Use the Picard iteration to find the solution of

y' = 2t⁴ y,    y(0) = 1.

Solution: Proceeding as in the previous example, y_0(t) = 1, y_1(t) = 1 + ∫_0^t 2s⁴ ds = 1 + (2/5) t⁵, and

y_2(t) = 1 + (2/5) t⁵ + (2²/5²)(1/2) t¹⁰.

A similar calculation gives us y_3,

y_3 = 1 + ∫_0^t 2s⁴ y_2(s) ds = 1 + ∫_0^t ( 2s⁴ + (2²/5) s⁹ + (2³/5²)(1/2) s¹⁴ ) ds
  = 1 + (2/5) t⁵ + (2²/5)(1/10) t¹⁰ + (2³/5²)(1/2)(1/15) t¹⁵.

So we obtained y_3(t) = 1 + (2/5) t⁵ + (2²/5²)(1/2) t¹⁰ + (2³/5³)(1/(2·3)) t¹⁵. We now reorder the terms in this last expression so we can get a power series expansion we can write in terms of simple functions. This is what we do:

y_3(t) = 1 + (2/5)(t⁵) + (2²/5²)(t⁵)²/2 + (2³/5³)(t⁵)³/6
  = 1 + (2/5)(t⁵)/1! + (2²/5²)(t⁵)²/2! + (2³/5³)(t⁵)³/3!
  = 1 + (2t⁵/5)¹/1! + (2t⁵/5)²/2! + (2t⁵/5)³/3!.

From this last expression it is simple to guess the N-th approximation

y_N(t) = 1 + Σ_{n=1}^N (2t⁵/5)^n / n!,

which can be proven by induction. Therefore,

y(t) = lim_{N→∞} y_N(t) = 1 + Σ_{n=1}^∞ (2t⁵/5)^n / n!.

Recall now the power series expansion for the exponential,

e^{at} = Σ_{k=0}^∞ (at)^k / k! = 1 + Σ_{k=1}^∞ (at)^k / k!,

so we get

y(t) = 1 + (e^{2t⁵/5} - 1)  ⇒  y(t) = e^{2t⁵/5}.
C
1.6.2. Comparison of Linear and Nonlinear Equations. The main result in § 1.2 was Theorem 1.2.3, which says that an initial value problem for a linear differential equation

y' = a(t) y + b(t),    y(t_0) = y_0,

with a, b continuous functions on (t_1, t_2), and constants t_0 ∈ (t_1, t_2) and y_0 ∈ ℝ, has the unique solution y on (t_1, t_2) given by

y(t) = e^{A(t)} ( y_0 + ∫_{t_0}^t e^{-A(s)} b(s) ds ),

where we introduced the function A(t) = ∫_{t_0}^t a(s) ds.
From the result above we can see that solutions to linear differential equations satisfy the following properties:
(a) There is an explicit expression for the solutions of the differential equation.
(b) For every initial condition y_0 ∈ ℝ there exists a unique solution.
(c) For every initial condition y_0 ∈ ℝ the solution y(t) is defined for all t ∈ (t_1, t_2).
Remark: None of these properties hold for solutions of nonlinear differential equations.
From the Picard-Lindelöf Theorem one can see that solutions to nonlinear differential equations satisfy the following properties:
(i) There is no explicit formula for the solution to every nonlinear differential equation.
(ii) Solutions to initial value problems for nonlinear equations may be non-unique when
the function f does not satisfy the Lipschitz condition.
(iii) The domain of a solution y to a nonlinear initial value problem may change when we
change the initial data y0 .
The next three examples (1.6.6)-(1.6.8) are particular cases of the statements in (i)-(iii).
We start with an equation whose solutions cannot be written in explicit form.
Example 1.6.6: For every choice of the constants a_1, a_2, a_3, a_4, find all solutions y of the equation

y'(t) = t² / ( y⁴(t) + a_4 y³(t) + a_3 y²(t) + a_2 y(t) + a_1 ).    (1.6.4)

Solution: The nonlinear differential equation above is separable, so we follow § 1.3 to find its solutions. First we rewrite the equation as

( y⁴(t) + a_4 y³(t) + a_3 y²(t) + a_2 y(t) + a_1 ) y'(t) = t².

Integrate the left-hand side with respect to u = y(t) and the right-hand side with respect to t. Substituting u back by the function y, we obtain

(1/5) y⁵(t) + (a_4/4) y⁴(t) + (a_3/3) y³(t) + (a_2/2) y²(t) + a_1 y(t) = t³/3 + c.

This is an implicit form for the solution y of the problem. The solution is a root of a degree-five polynomial, for all possible values of the polynomial coefficients. But it has been proven that there is no formula for the roots of a general polynomial of degree five or higher. We conclude that there is no explicit expression for solutions y of Eq. (1.6.4). C
We now give an example of the statement in (ii), that is, a differential equation which does not satisfy one of the hypotheses in Theorem 1.6.2. The function f fails to be Lipschitz continuous on any domain containing the line in the (t, u) plane where the initial condition of the initial value problem is given. We then show that such an initial value problem has two solutions instead of a unique solution.
Example 1.6.7: Find every solution y of the initial value problem

y'(t) = y^{1/3}(t),    y(0) = 0.    (1.6.5)

Remark: The equation above is nonlinear, separable, and f(t, u) = u^{1/3} has the derivative

∂_u f = (1/3) (1/u^{2/3}).

Since the function ∂_u f is not continuous at u = 0, f does not satisfy the Lipschitz condition in Theorem 1.6.2 on any domain of the form S = [-a, a] × [-a, a] with a > 0.

Solution: The solution to the initial value problem in Eq. (1.6.5) exists but it is not unique, since we now show that it has two solutions. The first solution is

y_1(t) = 0.

The second solution can be computed using the ideas from separable equations, that is,

∫ y^{-1/3}(t) y'(t) dt = ∫ dt + c_0  ⇒  (3/2) y^{2/3}(t) = t + c_0.

The initial condition y(0) = 0 implies c_0 = 0, so a second solution is y_2(t) = (2t/3)^{3/2} for t ≥ 0. C
Example 1.6.8: Find the solution y of the initial value problem

y'(t) = y²(t),    y(0) = y_0 ≠ 0,

and the domain where that solution is defined.

Solution: This is a nonlinear separable equation, so we can again apply the ideas in § 1.3. We first find all solutions of the differential equation,

∫ ( y'(t)/y²(t) ) dt = ∫ dt + c_0  ⇒  -1/y(t) = t + c_0  ⇒  y(t) = -1/(c_0 + t).

We now use the initial condition in the last expression above,

y_0 = y(0) = -1/c_0  ⇒  c_0 = -1/y_0.

So, the solution of the initial value problem above is:

y(t) = 1 / ( 1/y_0 - t ).

This solution diverges at t = 1/y_0, so the domain of the solution y is not the whole real line ℝ. Instead, the domain is ℝ \ {1/y_0}, so it depends on the value of the initial data y_0. C
In the next example we consider an equation of the form y 0 (t) = f (t, y(t)), where f does
not satisfy the hypotheses in Theorem 1.6.2.
1.6.3. Direction Fields. Sometimes one needs to find information about solutions of a differential equation without having to actually solve the equation. One way to do this is with direction fields. Consider a differential equation

y'(t) = f(t, y(t)).

We interpret the right-hand side above in a new way.
(a) In the usual way, the graph of f is a surface in the tyz-space, where z = f(t, y).
(b) In the new way, f(t, y) is the value of a slope of a segment at each point (t, y) on the ty-plane.
(c) That slope is the value of y'(t), the derivative of a solution y at t.
Example 1.6.10: Find the direction field of the equation y' = y, and sketch a few solutions to the differential equation for different initial conditions.

Solution: Recall that the solutions are y(t) = y_0 e^t. The direction field is shown in Fig. 8. C
(Figure 8: the direction field of the equation y' = y in the ty-plane.)
Example 1.6.11: Find the direction field of the equation y' = sin(y), and sketch a few solutions to the differential equation for different initial conditions.

Solution: The equation is separable, so the solutions are

ln | (csc(y_0) + cot(y_0)) / (csc(y) + cot(y)) | = t,

for any y_0 ∈ ℝ. The graphs of these solutions are not simple to draw. But the direction field is simpler to plot and can be seen in Fig. 9. C
Example 1.6.12: Find the direction field of the equation y' = 2 cos(t) cos(y), and sketch a few solutions to the differential equation for different initial conditions.

Solution: We do not need to compute the explicit solution of y' = 2 cos(t) cos(y) to have a qualitative idea of its solutions. The direction field can be seen in Fig. 10. C
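Direction fields are straightforward to draw with software. Here is a minimal matplotlib sketch for the equation of Example 1.6.12; the grid and plotting choices are our own.

    import numpy as np
    import matplotlib.pyplot as plt

    t, y = np.meshgrid(np.linspace(-5, 5, 25), np.linspace(-4, 4, 21))
    slope = 2.0 * np.cos(t) * np.cos(y)     # f(t, y) = 2 cos(t) cos(y)

    # Draw a short segment of slope f(t, y) at each grid point.
    norm = np.sqrt(1.0 + slope**2)
    plt.quiver(t, y, 1.0 / norm, slope / norm, angles="xy", headwidth=1)
    plt.xlabel("t"); plt.ylabel("y"); plt.show()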
(Figure 9: the direction field of y' = sin(y). Figure 10: the direction field of y' = 2 cos(t) cos(y).)
1.6.4. Exercises.
1.6.1.- Use the Picard iteration to find the first four elements, y_0, y_1, y_2, and y_3, of the sequence {y_n}_{n=0}^∞ of approximate solutions to the initial value problem

y' = 6y + 1,    y(0) = 0.

1.6.2.- Use the Picard iteration to find the information required below about the sequence {y_n}_{n=0}^∞ of approximate solutions to the initial value problem

y' = 3y + 5,    y(0) = 1.

(a) The first 4 elements in the sequence, y_0, y_1, y_2, and y_3.
(b) The general term c_k(t) of the approximation

y_n(t) = 1 + Σ_{k=1}^n c_k(t)/k!.

(c) Find the limit y(t) = lim_{n→∞} y_n(t).

1.6.3.- Find the domain where the solution of the initial value problems below is well-defined.
(a) y' = -4t/y,    y(0) = y_0 > 0.
(b) y' = 2t y²,    y(0) = y_0 > 0.

1.6.4.- By looking at the equation coefficients, find a domain where the solution of the initial value problem below exists,
(a) (t² - 4) y' + 2 ln(t) y = 3t, with initial condition y(1) = 2.
(b) y' = y / (t(t - 3)), with initial condition y(1) = 2.

1.6.5.- State where in the plane with points (t, y) the hypotheses of Theorem 1.6.2 are not satisfied.
(a) y' = y² / (2t - 3y).
(b) y' = √(1 - t² - y²).
Newton's second law of motion, ma = f, is maybe one of the first differential equations ever written. This is a second order equation, since the acceleration is the second time derivative of the particle position function. Second order differential equations are more difficult to solve than first order equations. In § 2.1 we compare results on linear first and second order equations. While there is an explicit formula for all solutions to first order linear equations, no such formula exists for all solutions to second order linear equations. The most one can get is the result in Theorem 2.1.7. In § 2.2 we introduce the Reduction of Order Method to find a new solution of a second order equation if we already know one solution of the equation. In § 2.3 we find explicit formulas for all solutions to linear second order equations that are both homogeneous and with constant coefficients. These formulas are generalized to nonhomogeneous equations in § 2.5. In § 2.6 we describe a few physical systems modeled by second order linear differential equations.
Example 2.1.2: Find the differential equation satisfied by the family of functions

y(t) = c_1 e^{4t} + c_2 e^{-4t},

where c_1, c_2 are arbitrary constants.

Solution: From the definition of y compute c_1,

c_1 = y e^{-4t} - c_2 e^{-8t}.

Now compute the derivative of the function y,

y' = 4 c_1 e^{4t} - 4 c_2 e^{-4t}.

Replace c_1 from the first equation above into the expression for y',

y' = 4 ( y e^{-4t} - c_2 e^{-8t} ) e^{4t} - 4 c_2 e^{-4t}  ⇒  y' = 4y + (-4 - 4) c_2 e^{-4t},

so we get an expression for c_2 in terms of y and y',

y' = 4y - 8 c_2 e^{-4t}  ⇒  c_2 = (1/8) (4y - y') e^{4t}.

At this point we can compute c_1 in terms of y and y', although we do not need it for what follows. Anyway,

c_1 = y e^{-4t} - (1/8) (4y - y') e^{4t} e^{-8t}  ⇒  c_1 = (1/8) (4y + y') e^{-4t}.

We do not need c_1 because we can get a differential equation for y from the equation for c_2. Compute the derivative of that equation,

0 = c_2' = (1/2) (4y - y') e^{4t} + (1/8) (4y' - y'') e^{4t}  ⇒  4(4y - y') + (4y' - y'') = 0,

which gives us the following second order linear differential equation for y,

y'' - 16 y = 0.
C
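Eliminating constants by hand is error-prone, so a symbolic check is useful. Here is a minimal sympy sketch, our own illustration, verifying that every member of the family above satisfies y'' - 16y = 0:

    import sympy as sp

    t, c1, c2 = sp.symbols("t c1 c2")
    y = c1 * sp.exp(4 * t) + c2 * sp.exp(-4 * t)

    # The residual vanishes identically in t, c1, c2.
    print(sp.simplify(sp.diff(y, t, 2) - 16 * y))    # prints 0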
Example 2.1.3: Find the differential equation satisfied by the family of functions

y(t) = c_1/t + c_2 t,    c_1, c_2 ∈ ℝ.

Solution: Compute y' = -c_1/t² + c_2. Get one constant from y' and put it in y,

c_2 = y' + c_1/t²  ⇒  y = c_1/t + ( y' + c_1/t² ) t,

so we get

y = c_1/t + t y' + c_1/t  ⇒  y = (2c_1)/t + t y'.

Compute the constant from the expression above,

(2c_1)/t = y - t y'  ⇒  2c_1 = t y - t² y'.

Since the left-hand side is constant,

0 = (2c_1)' = (t y - t² y')' = y + t y' - 2t y' - t² y'',

so we get that y must satisfy the differential equation

t² y'' + t y' - y = 0.
C
Example 2.1.4: Find the differential equation satisfied by the family of functions

y(x) = c_1 x + c_2 x²,

where c_1, c_2 are arbitrary constants.

Solution: Compute the derivative of the function y,

y'(x) = c_1 + 2 c_2 x.

From here it is simple to get c_1,

c_1 = y' - 2 c_2 x.

Use this expression for c_1 in the expression for y,

y = (y' - 2 c_2 x) x + c_2 x² = x y' - c_2 x²  ⇒  c_2 = y'/x - y/x².

To get the differential equation for y we do not need c_1, but we compute it anyway,

c_1 = y' - 2 (y'/x - y/x²) x = y' - 2y' + 2y/x  ⇒  c_1 = -y' + 2y/x.

The equation for y can be obtained by computing a derivative in the expression for c_2,

0 = c_2' = y''/x - y'/x² - y'/x² + 2y/x³ = y''/x - 2y'/x² + 2y/x³  ⇒  x² y'' - 2x y' + 2y = 0.
C
2.1.2. Solutions to the Initial Value Problem. Here is the first of the two main results in this section. Second order linear differential equations have solutions in the case that the equation coefficients are continuous functions. Since the solution is unique when we specify two initial conditions, the general solution must have two arbitrary integration constants.

Theorem 2.1.2 (IVP). If the functions a_1, a_0, b are continuous on a closed interval I ⊂ ℝ, the constant t_0 ∈ I, and y_0, y_1 ∈ ℝ are arbitrary constants, then there is a unique solution y, defined on I, of the initial value problem

y'' + a_1(t) y' + a_0(t) y = b(t),    y(t_0) = y_0,    y'(t_0) = y_1.    (2.1.2)

Remark: The fixed point argument used in the proof of Picard-Lindelöf's Theorem 1.6.2 can be extended to prove Theorem 2.1.2.
Example 2.1.5: Find the domain of the solution to the initial value problem

(t - 1) y'' - 3t y' + ( 4(t - 1)/(t - 3) ) y = t(t - 1),    y(2) = 1,    y'(2) = 0.

Solution: We first write the equation above in the form given in the Theorem above,

y'' - ( 3t/(t - 1) ) y' + ( 4/(t - 3) ) y = t.

The equation coefficients are defined on the domain

(-∞, 1) ∪ (1, 3) ∪ (3, ∞).

So the solution may not be defined at t = 1 or t = 3. That is, the solution is defined on (-∞, 1), or (1, 3), or (3, ∞). Since the initial condition is at t_0 = 2 ∈ (1, 3), the domain of the solution is

D = (1, 3).
C
2.1.3. Properties of Homogeneous Equations. We simplify the problem in the hope of obtaining deeper properties of its solutions. From now on in this section we focus on homogeneous equations only. We will get back to nonhomogeneous equations in a later section. But before getting into homogeneous equations, we introduce a new notation to write differential equations. This is a shorter, more economical notation. Given two functions a_1,
a_0, introduce the function L acting on a function y as follows,

L(y) = y'' + a_1(t) y' + a_0(t) y.    (2.1.3)

The function L acts on the function y and the result is another function, given by Eq. (2.1.3).

Example 2.1.6: Compute the operator L(y) = t y'' + 2y' - (8/t) y acting on y(t) = t³.

Solution: Since y(t) = t³, then y'(t) = 3t² and y''(t) = 6t, hence

L(t³) = t (6t) + 2 (3t²) - (8/t) t³  ⇒  L(t³) = 4t².

The function L acts on the function y(t) = t³ and the result is the function L(t³) = 4t². C
The function L above is called an operator, to emphasize that L is a function that acts
on other functions, instead of acting on numbers, as the functions we are used to. The
operator L above is also called a differential operator, since L(y) contains derivatives of y.
These operators are useful to write differential equations in a compact notation, since

y'' + a_1(t) y' + a_0(t) y = f(t)

can be written using the operator L(y) = y'' + a_1(t) y' + a_0(t) y as

L(y) = f.
An important type of operator is the linear operator.

Definition 2.1.3. An operator L is a linear operator iff for every pair of functions y_1, y_2 and constants c_1, c_2 it holds that

L(c_1 y_1 + c_2 y_2) = c_1 L(y_1) + c_2 L(y_2).    (2.1.4)

In this Section we work with linear operators, as the following result shows.

Theorem 2.1.4 (Linear Operator). The operator L(y) = y'' + a_1 y' + a_0 y, where a_1, a_0 are continuous functions and y is a twice differentiable function, is a linear operator.
Proof of Theorem 2.1.4: This is a straightforward calculation:

L(c_1 y_1 + c_2 y_2) = (c_1 y_1 + c_2 y_2)'' + a_1 (c_1 y_1 + c_2 y_2)' + a_0 (c_1 y_1 + c_2 y_2).

Recall that differentiation is a linear operation, and then reorder terms in the following way,

L(c_1 y_1 + c_2 y_2) = (c_1 y_1'' + a_1 c_1 y_1' + a_0 c_1 y_1) + (c_2 y_2'' + a_1 c_2 y_2' + a_0 c_2 y_2).

Introduce the definition of L back on the right-hand side. We then conclude that

L(c_1 y_1 + c_2 y_2) = c_1 L(y_1) + c_2 L(y_2).
This establishes the Theorem.
The linearity of an operator L translates into the superposition property of the solutions
to the homogeneous equation L(y) = 0.
Remarks:
(a) Two functions y_1, y_2 are proportional iff there is a constant c such that y_1(t) = c y_2(t) for all t.
(b) The function y_1 = 0 is proportional to every other function y_2, since y_1 = 0 = 0 · y_2.
The definitions of linearly dependent or independent functions found in the literature are equivalent to the definition given here, but they are worded in a slightly different way. Often in the literature, two functions are called linearly dependent on the interval I iff there exist constants c_1, c_2, not both zero, such that for all t ∈ I

c_1 y_1(t) + c_2 y_2(t) = 0.

Two functions are called linearly independent on the interval I iff they are not linearly dependent, that is, the only constants c_1 and c_2 that for all t ∈ I satisfy the equation

c_1 y_1(t) + c_2 y_2(t) = 0

are the constants c_1 = c_2 = 0. This wording makes it simple to generalize these definitions to an arbitrary number of functions.
Example 2.1.7:
(a) Show that y_1(t) = sin(t), y_2(t) = 2 sin(t) are linearly dependent.
(b) Show that y_1(t) = sin(t), y_2(t) = t sin(t) are linearly independent.

Solution:
Part (a): This is trivial, since 2y_1(t) - y_2(t) = 0.
Part (b): Find constants c_1, c_2 such that for all t ∈ ℝ it holds that

c_1 sin(t) + c_2 t sin(t) = 0.

Evaluating at t = π/2 and t = 3π/2 we obtain

c_1 + (π/2) c_2 = 0,    -c_1 - (3π/2) c_2 = 0    ⇒    c_1 = 0,  c_2 = 0.

We conclude: the functions y_1 and y_2 are linearly independent. C
We now introduce the second main result in this section. If you know two linearly independent solutions to a second order linear homogeneous differential equation, then you actually know all possible solutions to that equation. Any other solution is just a linear combination of the previous two solutions. We repeat that the equation must be homogeneous. This is the closest we can get to a general formula for solutions to second order linear homogeneous differential equations.

Theorem 2.1.7 (General Solution). If y_1 and y_2 are linearly independent solutions of the equation L(y) = 0 on an interval I ⊂ ℝ, where L(y) = y'' + a_1 y' + a_0 y, and a_1, a_0 are continuous functions on I, then there are unique constants c_1, c_2 such that every solution y of the differential equation L(y) = 0 on I can be written as a linear combination

y(t) = c_1 y_1(t) + c_2 y_2(t).
Before we prove Theorem 2.1.7, it is convenient to state the following definitions, which come out naturally from this Theorem.
Definition 2.1.8.
(a) The functions y_1 and y_2 are fundamental solutions of the equation L(y) = 0 iff y_1, y_2 are linearly independent and

L(y_1) = 0,    L(y_2) = 0.

(b) The general solution of the homogeneous equation L(y) = 0 is a two-parameter family of functions y_gen given by

y_gen(t) = c_1 y_1(t) + c_2 y_2(t),

where the arbitrary constants c_1, c_2 are the parameters of the family, and y_1, y_2 are fundamental solutions of L(y) = 0.
Example 2.1.8: Show that y_1 = e^t and y_2 = e^{-2t} are fundamental solutions to the equation

y'' + y' - 2y = 0.

Solution: We first show that y_1 and y_2 are solutions to the differential equation, since

L(y_1) = y_1'' + y_1' - 2y_1 = e^t + e^t - 2e^t = (1 + 1 - 2) e^t = 0,
L(y_2) = y_2'' + y_2' - 2y_2 = 4 e^{-2t} - 2 e^{-2t} - 2 e^{-2t} = (4 - 2 - 2) e^{-2t} = 0.

It is not difficult to see that y_1 and y_2 are linearly independent. It is clear that they are not proportional to each other. A proof of that statement is the following: find the constants c_1 and c_2 such that

0 = c_1 y_1 + c_2 y_2 = c_1 e^t + c_2 e^{-2t} for all t ∈ ℝ  ⇒  0 = c_1 e^t - 2 c_2 e^{-2t}.

The second equation is the derivative of the first one. Take t = 0 in both equations,

0 = c_1 + c_2,    0 = c_1 - 2c_2    ⇒    c_1 = c_2 = 0.

We conclude that y_1 and y_2 are fundamental solutions to the differential equation above. C
Remark: The fundamental solutions to the equation above are not unique. For example, show that another set of fundamental solutions to the equation above is given by

ỹ_1(t) = (2/3) e^t + (1/3) e^{-2t},    ỹ_2(t) = (1/3) ( e^t - e^{-2t} ).
To prove Theorem 2.1.7 we need to introduce the Wronskian function and to verify some of its properties. The Wronskian function is studied in the following Subsection, where Abel's Theorem is proved. Once that is done we can say that the proof of Theorem 2.1.7 is complete.
Proof of Theorem 2.1.7: We need to show that, given any fundamental solution pair, y_1, y_2, any other solution y of the homogeneous equation L(y) = 0 must be a unique linear combination of the fundamental solutions,

y(t) = c_1 y_1(t) + c_2 y_2(t).    (2.1.5)

The uniqueness of the constants is simple to see: if y = c_1 y_1 + c_2 y_2 and also y = c̃_1 y_1 + c̃_2 y_2, then

0 = (c_1 - c̃_1) y_1 + (c_2 - c̃_2) y_2  ⇒  c_1 - c̃_1 = 0,  c_2 - c̃_2 = 0,

where we used that y_1, y_2 are linearly independent. This part of the proof can be obtained from the argument below, but we think it is better to highlight it here.

So we only need to show that the expression in Eq. (2.1.5) contains all solutions; we need to show that we are not missing any other solution. In this part of the argument enters Theorem 2.1.2. This Theorem says that, in the case of homogeneous equations, the initial value problem

L(y) = 0,    y(t_0) = d_1,    y'(t_0) = d_2,

always has a unique solution. That means a good parametrization of all solutions to the differential equation L(y) = 0 is given by the two constants d_1, d_2 in the initial condition. To finish the proof of Theorem 2.1.7 we need to show that the constants c_1 and c_2 are also good to parametrize all solutions to the equation L(y) = 0. One way to show this is to find an invertible map from the constants d_1, d_2, which we know parametrize all solutions, to the constants c_1, c_2. The map itself is simple to find,

d_1 = c_1 y_1(t_0) + c_2 y_2(t_0),
d_2 = c_1 y_1'(t_0) + c_2 y_2'(t_0).

We now need to show that this map is invertible. From linear algebra we know that this map acting on c_1, c_2 is invertible iff the determinant of the coefficient matrix is nonzero,

det [ y_1(t_0), y_2(t_0) ; y_1'(t_0), y_2'(t_0) ] = y_1(t_0) y_2'(t_0) - y_1'(t_0) y_2(t_0) ≠ 0.

This determinant is called the Wronskian of the two functions y_1, y_2. At the end of this section we prove Theorem 2.1.13, which says the following: if y_1, y_2 are fundamental solutions of L(y) = 0 on I ⊂ ℝ, then W_{12}(t) ≠ 0 on I. This statement establishes the Theorem.
2.1.4. The Wronskian Function. We now introduce a function that provides important information about the linear dependency of two functions y_1, y_2. This function, W, is called the Wronskian, to honor the Polish scientist Józef Wroński, who first introduced this function in 1821 while studying a different problem.

Definition 2.1.9. The Wronskian of the differentiable functions y_1, y_2 is the function

W_{12}(t) = y_1(t) y_2'(t) - y_1'(t) y_2(t).

Remark: Introducing the matrix-valued function A(t) = [ y_1(t), y_2(t) ; y_1'(t), y_2'(t) ], the Wronskian can be written using the determinant of that 2×2 matrix, W_{12}(t) = det(A(t)). An alternative notation is W_{12} = det [ y_1, y_2 ; y_1', y_2' ].
Example 2.1.9: Find the Wronskian of the functions:
(a) y_1(t) = sin(t) and y_2(t) = 2 sin(t). (ld)
(b) y_1(t) = sin(t) and y_2(t) = t sin(t). (li)

Solution:
Part (a): By the definition of the Wronskian:

W_{12}(t) = det [ sin(t), 2 sin(t) ; cos(t), 2 cos(t) ] = sin(t) 2 cos(t) - cos(t) 2 sin(t).

We conclude that W_{12}(t) = 0. Notice that y_1 and y_2 are linearly dependent.

Part (b): Again, by the definition of the Wronskian:

W_{12}(t) = det [ sin(t), t sin(t) ; cos(t), sin(t) + t cos(t) ] = sin(t) ( sin(t) + t cos(t) ) - cos(t) t sin(t).

We conclude that W_{12}(t) = sin²(t). Notice that y_1 and y_2 are linearly independent. C
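Wronskians are easy to compute symbolically. Here is a minimal sympy sketch, our own helper implementing Definition 2.1.9, reproducing both parts of this example:

    import sympy as sp

    t = sp.symbols("t")

    def wronskian(y1, y2):
        # Definition 2.1.9: W12 = y1 y2' - y1' y2.
        return sp.simplify(y1 * sp.diff(y2, t) - sp.diff(y1, t) * y2)

    print(wronskian(sp.sin(t), 2 * sp.sin(t)))    # 0         (part (a))
    print(wronskian(sp.sin(t), t * sp.sin(t)))    # sin(t)**2 (part (b))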
It is simple to prove the following relation between the Wronskian of two functions and the linear dependency of these two functions.

Theorem 2.1.10 (Wronskian I). If y_1, y_2 are linearly dependent on I ⊂ ℝ, then

W_{12} = 0 on I.

Proof of Theorem 2.1.10: Since the functions y_1, y_2 are linearly dependent, there exists a constant c such that y_1 = c y_2; hence

W_{12} = y_1 y_2' - y_1' y_2 = (c y_2) y_2' - (c y_2)' y_2 = 0.

This establishes the Theorem.
Remark: The converse statement to Theorem 2.1.10 is false. If W_{12}(t) = 0 for all t ∈ I, that does not imply that y_1 and y_2 are linearly dependent. A standard example is the pair y_1(t) = t² and y_2(t) = t |t| on I = ℝ. First, these functions are linearly independent, since no single constant c satisfies y_1 = c y_2 both for t < 0 and for t > 0. Second, their Wronskian vanishes on ℝ. This is simple to see: since y_1(t) = -y_2(t) for t < 0, then W_{12} = 0 for t < 0; since y_1(t) = y_2(t) for t > 0, then W_{12} = 0 for t > 0; finally, it is not difficult to see that W_{12}(0) = 0. C
Remark: Often in the literature one finds the contrapositive of Theorem 2.1.10, which is equivalent to it, and which we summarize in the following Corollary.
Corollary 2.1.11 (Wronskian I). If the Wronskian W_{12}(t_0) ≠ 0 at a point t_0 ∈ I, then the functions y_1, y_2 defined on I are linearly independent.
The results mentioned above provide different properties of the Wronskian of two functions. But none of these results is what we need to finish the proof of Theorem 2.1.7. In order to finish that proof we need one more result, Abel's Theorem.
2.1.5. Abel's Theorem. We now show that the Wronskian of two solutions of a differential equation satisfies a differential equation of its own. This result is known as Abel's Theorem.

Theorem 2.1.12 (Abel). If y_1, y_2 are twice continuously differentiable solutions of

y'' + a_1(t) y' + a_0(t) y = 0,    (2.1.6)

where a_1, a_0 are continuous on I ⊂ ℝ, then the Wronskian W_{12} satisfies

W_{12}' + a_1(t) W_{12} = 0.

Therefore, for any t_0 ∈ I, the Wronskian W_{12} is given by the expression

W_{12}(t) = W_{12}(t_0) e^{-A_1(t)},

where A_1(t) = ∫_{t_0}^t a_1(s) ds.

Proof of Theorem 2.1.12: We start computing the derivative of the Wronskian function,

W_{12}' = ( y_1 y_2' - y_1' y_2 )' = y_1 y_2'' - y_1'' y_2.

Recall that both y_1 and y_2 are solutions to Eq. (2.1.6), meaning,

y_1'' = -a_1 y_1' - a_0 y_1,    y_2'' = -a_1 y_2' - a_0 y_2.

Replace these expressions in the formula for W_{12}' above,

W_{12}' = y_1 ( -a_1 y_2' - a_0 y_2 ) - ( -a_1 y_1' - a_0 y_1 ) y_2  ⇒  W_{12}' = -a_1 ( y_1 y_2' - y_1' y_2 ).

That is, W_{12}' = -a_1 W_{12}, a first order linear equation whose solution is W_{12}(t) = W_{12}(t_0) e^{-A_1(t)}. This establishes the Theorem.
Example: Find the Wronskian of two solutions y_1, y_2 of the equation

t² y'' - t(t + 2) y' + (t + 2) y = 0,    t > 0.

Solution: Notice that we do not know the explicit expression for the solutions. Nevertheless, Theorem 2.1.12 says that we can compute their Wronskian. First, we have to rewrite the differential equation in the form given in that Theorem, namely,

y'' - ( 2/t + 1 ) y' + ( 2/t² + 1/t ) y = 0.

Then, Theorem 2.1.12 says that the Wronskian satisfies the differential equation

W_{12}'(t) - ( 2/t + 1 ) W_{12}(t) = 0.

This is a first order, linear equation for W_{12}, so its solution can be computed using the method of integrating factors. That is, first compute the integral

-∫_{t_0}^t ( 2/s + 1 ) ds = -2 ln(t/t_0) - (t - t_0) = ln(t_0²/t²) - (t - t_0).

Then, the integrating factor is given by

μ(t) = (t_0²/t²) e^{-(t - t_0)},

which satisfies the condition μ(t_0) = 1. So the solution W_{12} is given by

( μ(t) W_{12}(t) )' = 0  ⇒  μ(t) W_{12}(t) - μ(t_0) W_{12}(t_0) = 0,

so the solution is

W_{12}(t) = W_{12}(t_0) (t²/t_0²) e^{(t - t_0)}.

If we call the constant c = W_{12}(t_0) t_0^{-2} e^{-t_0}, then the Wronskian of any two solutions of the equation above is W_{12}(t) = c t² e^t. C
We now state and prove the statement we need to complete the proof of Theorem 2.1.7.

Theorem 2.1.13 (Wronskian II). If y_1, y_2 are fundamental solutions of L(y) = 0 on I ⊂ ℝ, then W_{12}(t) ≠ 0 on I.

Proof of Theorem 2.1.13: We prove the statement by contradiction: assume there is a point t_1 ∈ I such that W_{12}(t_1) = 0. We know that y_1, y_2 are solutions of L(y) = 0. Then Abel's Theorem says that their Wronskian W_{12} is given by

W_{12}(t) = W_{12}(t_0) e^{-A_1(t)},

for any t_0 ∈ I. Choosing the point t_0 to be t_1, the point where by assumption W_{12}(t_1) = 0, we get that

W_{12}(t) = 0 for all t ∈ I.

Knowing that the Wronskian vanishes identically on I, we can write

y_1 y_2' - y_1' y_2 = 0

on I. If either y_1 or y_2 is the zero function, then the set is linearly dependent, contradicting the hypothesis that y_1, y_2 are fundamental solutions. So we can assume that neither is identically zero. Let t_2 ∈ I be a point where y_1(t_2) ≠ 0; on an interval around t_2 where y_1 ≠ 0 we get (y_2/y_1)' = W_{12}/y_1² = 0, so y_2 is proportional to y_1 there. One can show that this proportionality extends to all of I, so y_1, y_2 are linearly dependent on I, again a contradiction. This establishes the Theorem.
2.1.6. Exercises.
2.1.1.- Compute the Wronskian of the following functions:
(a) f(t) = sin(t), g(t) = cos(t).
(b) f(x) = x, g(x) = x e^x.
(c) f(θ) = cos²(θ), g(θ) = 1 + cos(2θ).

2.1.2.- Find the longest interval where the solution y of the initial value problems below is defined. (Do not try to solve the differential equations.)
(a) t² y'' + 6y = 2t, y(1) = 2, y'(1) = 3.
(b) (t - 6) y'' + 3t y' - y = 1, y(3) = -1, y'(3) = 2.

2.1.3.- (a) Verify that y_1(t) = t² and y_2(t) = 1/t are solutions to the differential equation

t² y'' - 2y = 0,    t > 0.

(b) Show that y(t) = a t² + b/t is a solution of the same equation for all constants a, b ∈ ℝ.

2.1.4.- If the graph of y, solution to a second order linear differential equation L(y(t)) = 0 on the interval [a, b], is tangent to the t-axis at any point t_0 ∈ [a, b], then find the solution y explicitly.

2.1.5.- Can the function y(t) = sin(t²) be a solution, on an open interval containing t = 0, of a differential equation

y'' + a(t) y' + b(t) y = 0,

with continuous coefficients a and b? Explain your answer.

2.1.6.- Verify whether the functions y_1, y_2 below are a fundamental set for the differential equations given below:
(a) y_1(t) = cos(2t), y_2(t) = sin(2t),

y'' + 4y = 0.

(b) y_1(t) = e^t, y_2(t) = t e^t,

y'' - 2y' + y = 0.

(c) y_1(x) = x, y_2(x) = x e^x,

x² y'' - 2x(x + 2) y' + (x + 2) y = 0.

2.1.7.- If the Wronskian of any two solutions of the differential equation

y'' + p(t) y' + q(t) y = 0

is constant, what does this imply about the coefficients p and q?

2.1.8.- Let y(t) = c_1 t + c_2 t² be the general solution of a second order linear differential equation L(y) = 0. By eliminating the constants c_1 and c_2, find the differential equation satisfied by y.
Then, y = ∫ dt/(t² - 1) + c. We integrate using the method of partial fractions,

1/(t² - 1) = 1/((t - 1)(t + 1)) = a/(t - 1) + b/(t + 1).

Hence, 1 = a(t + 1) + b(t - 1). Evaluating at t = 1 and t = -1 we get a = 1/2, b = -1/2. So

1/(t² - 1) = (1/2) [ 1/(t - 1) - 1/(t + 1) ].

Therefore, the integral is simple to do,

y = (1/2) ( ln|t - 1| - ln|t + 1| ) + c,    2 = y(0) = (1/2)(0 - 0) + c.

We conclude y = (1/2) ( ln|t - 1| - ln|t + 1| ) + 2. C
The case (b) is way more complicated to solve.
Theorem 2.2.3 (Variable t Missing). If the initial value problem

y'' = f(y, y'),    y(0) = y_0,    y'(0) = y_1,

has an invertible solution y, then the function

w(y) = v(t(y)),

where v(t) = y'(t) and t(y) is the inverse of y(t), satisfies the initial value problem

ẇ = f(y, w)/w,    w(y_0) = y_1,

where we denoted ẇ = dw/dy.
Remark: The proof is based on the chain rule for the derivative of functions.

Proof of Theorem 2.2.3: The differential equation is y'' = f(y, y'). Denoting v(t) = y'(t), the equation reads

v' = f(y, v).

It is not clear how to solve this equation, since the function y still appears in the equation. On a domain where y is invertible we can do the following. Denote by t(y) the inverse of y(t), and introduce w(y) = v(t(y)). The chain rule implies

ẇ(y) = (dv/dt)|_{t(y)} (dt/dy)|_{t(y)} = ( v'(t)/y'(t) )|_{t(y)} = ( v'(t)/v(t) )|_{t(y)} = ( f(y(t), v(t))/v(t) )|_{t(y)} = f(y, w(y))/w(y),

where ẇ = dw/dy and v' = dv/dt. Therefore, we have obtained the equation for w, namely

ẇ = f(y, w)/w.

Finally, we need to find the initial condition for w. Recall that

y(t = 0) = y_0  ⇒  t(y = y_0) = 0,
y'(t = 0) = y_1  ⇒  v(t = 0) = y_1.

Therefore,

w(y = y_0) = v(t(y = y_0)) = v(t = 0) = y_1  ⇒  w(y_0) = y_1.

This establishes the Theorem.
Example: Find an invertible solution y of the initial value problem

y y'' = -3 (y')²,    y(0) = 1,    y'(0) = 6.

Solution: The variable t does not appear in the equation. So we start by introducing the function v(t) = y'(t). The equation is now given by v'(t) = -3 v²(t)/y(t). We look for invertible solutions y, then introduce the function w(y) = v(t(y)). This function satisfies

ẇ(y) = (dv/dt)(dt/dy) = ( v'/y' )|_{t(y)} = ( v'/v )|_{t(y)}  ⇒  ẇ = -3 w/y.

The equation for w is separable, so the method from § 1.3 implies that

ln(w) = -3 ln(y) + c_0 = ln(y^{-3}) + c_0  ⇒  w(y) = c_1 y^{-3},    c_1 = e^{c_0}.

The initial condition fixes the constant c_1, since

6 = w(1) = c_1  ⇒  w(y) = 6 y^{-3}.

We now transform from w back to v as follows,

v(t) = w(y(t)) = 6 y^{-3}(t)  ⇒  y'(t) = 6 y^{-3}(t).

This is now a first order separable equation for y. Again the method from § 1.3 implies that

y³ y' = 6  ⇒  y⁴/4 = 6t + c_2.

The initial condition for y fixes the constant c_2, since

1 = y(0)  ⇒  1/4 = 0 + c_2  ⇒  y⁴/4 = 6t + 1/4.

So we conclude that the solution y to the initial value problem is

y(t) = (24t + 1)^{1/4}.
C
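A symbolic check of this answer is a few lines of sympy. This is a small sketch; it verifies both the equation in the form y y'' = -3 (y')², as reconstructed above, and the initial conditions.

    import sympy as sp

    t = sp.symbols("t")
    y = (24 * t + 1) ** sp.Rational(1, 4)

    # Residual of y y'' + 3 (y')^2 should vanish identically.
    print(sp.simplify(y * sp.diff(y, t, 2) + 3 * sp.diff(y, t) ** 2))   # 0

    # Initial conditions y(0) = 1, y'(0) = 6.
    print(y.subs(t, 0), sp.diff(y, t).subs(t, 0))                       # 1 6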
2.2.2. Conservation of the Energy. We now study case (c) in Def. 2.2.1: second order differential equations such that both the variable t and the function y' do not appear explicitly in the equation. This case is important in Newtonian mechanics. For that reason we slightly change the notation we use to write the differential equation. Instead of writing the equation as y'' = f(y), as in Def. 2.2.1, we write it as

m y'' = f(y),

where m is a constant. This notation matches the notation of Newton's second law of motion for a particle of mass m, with position function y as a function of time t, acting under a force f that depends only on the particle position y.
for a particle of mass m, with position function y as function of time t, acting under a force
f that depends only on the particle position y.
It turns out that solutions to the differential equation above have a particular property:
There is a function of y 0 and y, called the energy of the system, that remains conserved
during the motion. We summarize this result in the statement below.
Theorem 2.2.4 (Conservation of the Energy). Consider a particle with positive mass m
and position y, function of time t, which is a solution of Newtons second law of motion
m y 00 = f (y),
with initial conditions y(t0 ) = y0 and y 0 (t0 ) = v0 , where f (y) is the force acting on the
particle at the position y. Then, the position function y satisfies
1
mv 2 + V (y) = E0 ,
2
where E0 = 21 mv02 + V (y0 ) is fixed by the initial conditions, v(t) = y 0 (t) is the particle
velocity, and V is the potential of the force f the negative of the primitive of f , that is
Z
dV
V (y) = f (y) dy f = .
dy
Remark: The term T(v) = (1/2) m v² is the kinetic energy of the particle. The term V(y) is the potential energy. The Theorem above says that the total mechanical energy

E = T(v) + V(y)

remains constant during the motion of the particle.
Proof of Theorem 2.2.4: We write the differential equation using the potential V,

m y'' = -dV/dy.

Multiply the equation above by y',

m y'(t) y''(t) = -(dV/dy) y'(t).

Use the chain rule on both sides of the equation above,

(d/dt) [ (1/2) m (y')² ] = -(d/dt) V(y(t)).

Introduce the velocity v = y', and rewrite the equation above as

(d/dt) [ (1/2) m v² + V(y) ] = 0.

This means that the quantity

E(y, v) = (1/2) m v² + V(y),

called the mechanical energy of the system, remains constant during the motion. Therefore, it must match its value at the initial time t_0, which we called E_0 in the Theorem. So we arrive at the equation

E(y, v) = (1/2) m v² + V(y) = E_0.
This establishes the Theorem.
Example 2.2.4: Find the potential energy and write the energy conservation for the following systems:
(i) A particle attached to a spring with constant k, moving in one space dimension.
(ii) A particle moving vertically close to the Earth's surface, under Earth's constant gravitational acceleration. In this case the force on the particle of mass m is f(y) = -mg, where g = 9.81 m/s².
(iii) A particle moving along the direction vertical to the surface of a spherical planet with mass M and radius R.
Solution:
Case (i). The force on a particle of mass m attached to a spring with spring constant k > 0, when displaced an amount y from the equilibrium position y = 0, is f(y) = -ky. Therefore, Newton's second law of motion says

m y'' = -ky.

The potential in this case is V(y) = (1/2) k y², since -dV/dy = -ky = f. If we introduce the particle velocity v = y', then the total mechanical energy is

E(y, v) = (1/2) m v² + (1/2) k y².

Case (ii). The force is the constant f(y) = -mg, so Newton's second law is m y'' = -mg. The potential is V(y) = m g y, since -dV/dy = -mg = f, and the total mechanical energy is

E(y, v) = (1/2) m v² + m g y.
Case (iii). Consider a particle of mass m moving on a line which is perpendicular to the surface of a spherical planet of mass M and radius R. The force on such a particle when it is at a distance y from the surface of the planet is, according to Newton's gravitational law,

f(y) = -G M m / (R + y)².

The potential is

V(y) = -G M m / (R + y),

since -dV/dy = f(y). The energy for this system is

E(y, v) = (1/2) m v² - G M m / (R + y),

where we introduced the particle velocity v = y'. The conservation of the energy says that

(1/2) m v² - G M m / (R + y) = E_0,

where E_0 is the energy at the initial time. C
Example 2.2.5: Find the maximum height of a ball of mass m = 0.1 kg that is shot vertically by a spring with spring constant k = 400 kg/s², compressed by 0.1 m. Use g = 10 m/s².

Solution: This is a difficult problem to solve if one tries to find the position function y and evaluate it at the time when its speed vanishes, that is, at the maximum altitude. One has to solve two differential equations for y, one with source f_1 = -ky - mg and the other with source f_2 = -mg, and the solutions must be glued together. The first source describes the particle while it is pushed by the spring under the Earth's gravitational force. The second source describes the particle when only the Earth's gravitational force acts on it. Also, the moment when the ball leaves the spring is hard to describe accurately.
A simpler method is to use the conservation of the total mechanical energy, elastic plus gravitational. The energy for this particle is

E(t) = (1/2) m v² + (1/2) k y² + m g y.

This energy must be constant along the movement. In particular, the energy at the initial time t = 0 must be the same as the energy at the time of the maximum height, t_M,

E(t = 0) = E(t_M)  ⇒  (1/2) m v_0² + (1/2) k y_0² + m g y_0 = (1/2) m v_M² + m g y_M.

But at the initial time we have v_0 = 0 and y_0 = -0.1 (the negative sign is because the spring is compressed), and at the time of maximum height we also have v_M = 0, hence

(1/2) k y_0² + m g y_0 = m g y_M  ⇒  y_M = y_0 + ( k/(2mg) ) y_0².
We conclude that yM = 1.9 m. C
Example 2.2.6: Find the escape velocity from Earth, that is, the initial velocity of a projectile moving vertically upwards, starting from the Earth's surface, such that it escapes Earth's gravitational attraction. Recall that the acceleration of gravity at the surface of Earth is g = GM/R² = 9.81 m/s², and that the radius of Earth is R = 6378 km. Here M denotes the mass of the Earth, and G is Newton's gravitational constant.
Solution: The projectile moves in the vertical direction, so the movement is along one space dimension. Let y be the position of the projectile, with y = 0 at the surface of the Earth. Newton's equation in this case is

m y'' = -G M m / (R + y)².

We start rewriting the force using the constant g instead of G,

-G M m / (R + y)² = -G M m R² / ( R² (R + y)² ) = -g m R² / (R + y)².

So the equation of motion for the projectile is

m y'' = -g m R² / (R + y)².

The projectile mass m can be canceled from the equation above (we do it later), so the result will be independent of the projectile mass. Now we introduce the gravitational potential

V(y) = -g m R² / (R + y).

We know that the motion of this particle satisfies the conservation of the energy

(1/2) m v² - g m R² / (R + y) = E_0,

where v = y'. The initial energy is simple to compute: y(0) = 0 and v(0) = v_0, so we get

(1/2) m v²(t) - g m R² / (R + y(t)) = (1/2) m v_0² - g m R.

We now cancel the projectile mass from the equation, and we rewrite the equation as

v²(t) = v_0² - 2gR + 2gR² / (R + y(t)).
Now we choose the initial velocity v_0 to be the escape velocity v_e. The latter is the smallest initial velocity such that v(t) is defined for all y, including y → ∞. Since

v²(t) ≥ 0    and    2gR²/(R + y(t)) → 0 as y → ∞,

the escape velocity must satisfy

v_0² - 2gR ≥ 0.

Since the escape velocity is the smallest velocity satisfying the condition above, that means

v_e = √(2gR)  ⇒  v_e = 11.2 km/s.
C
Example 2.2.7: Find the time t_M for a rocket to reach the Moon, if it is launched at the escape velocity. Use that the distance from the surface of the Earth to the Moon is d = 405,696 km.
Solution: From Example 2.2.6 we know that the position function y of the rocket satisfies the differential equation

v²(t) = v_0² - 2gR + 2gR² / (R + y(t)),

where R is the Earth radius, g the gravitational acceleration at the Earth surface, v = y', and v_0 is the initial velocity. Since the rocket initial velocity is the Earth escape velocity, v_0 = v_e = √(2gR), the differential equation for y is

(y')² = 2gR² / (R + y)  ⇒  y' = √(2g) R / √(R + y),

where we chose the positive square root because, in our coordinate system, the rocket leaving Earth means v > 0. Now, the last equation above is a separable differential equation for y, so we can integrate it,

(R + y)^{1/2} y' = √(2g) R  ⇒  (2/3) (R + y)^{3/2} = √(2g) R t + c,

where c is a constant, which can be determined by the initial condition y(t = 0) = 0, since at the initial time the projectile is on the surface of the Earth, the origin of our coordinate system. With this initial condition we get

c = (2/3) R^{3/2}  ⇒  (2/3) (R + y)^{3/2} = √(2g) R t + (2/3) R^{3/2}.    (2.2.1)

From the equation above we can compute an explicit form of the solution function y,

y(t) = ( (3/2) √(2g) R t + R^{3/2} )^{2/3} - R.    (2.2.2)

To find the time to reach the Moon we need to evaluate Eq. (2.2.1) at y = d and solve for t_M,

(2/3) (R + d)^{3/2} = √(2g) R t_M + (2/3) R^{3/2}  ⇒  t_M = (2/3) (1/(√(2g) R)) ( (R + d)^{3/2} - R^{3/2} ).
The formula above gives tM = 51.5 hours. C
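The two numbers in these examples are easy to check numerically. Here is a short sketch with the constants as given above; note that the t_M it prints, about 55 hours, is somewhat larger than the 51.5 hours quoted, so the quoted figure appears to use slightly different data or rounding.

    import numpy as np

    g = 9.81            # m/s^2, gravity at the Earth surface
    R = 6378.0e3        # m, Earth radius
    d = 405696.0e3      # m, Earth-Moon distance used in Example 2.2.7

    ve = np.sqrt(2.0 * g * R)
    tM = (2.0 / 3.0) * ((R + d)**1.5 - R**1.5) / (np.sqrt(2.0 * g) * R)

    print(ve / 1000.0, "km/s")     # about 11.2 km/s
    print(tM / 3600.0, "hours")    # about 55 hours with these constants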
2.2.3. The Reduction of Order Method. If we know one solution to a second order, linear, homogeneous differential equation, then we can find a second solution to that equation, and this second solution can be chosen to be not proportional to the known solution. One obtains the second solution by transforming the original problem into solving two first order differential equations.

Theorem 2.2.5 (Reduction of Order). If a nonzero function y_1 is a solution to

y'' + a_1(t) y' + a_0(t) y = 0,    (2.2.3)

where a_1, a_0 are given functions, then a second solution not proportional to y_1 is

y_2(t) = y_1(t) ∫ ( e^{-A_1(t)} / y_1²(t) ) dt,    (2.2.4)

where A_1(t) = ∫ a_1(t) dt.
Remark: In the first part of the proof we write y_2(t) = v(t) y_1(t) and show that y_2 is a solution of Eq. (2.2.3) iff the function v is a solution of

v'' + ( 2 (y_1'(t)/y_1(t)) + a_1(t) ) v' = 0.    (2.2.5)

In the second part we solve the equation for v. This is a first order equation for w = v', since v itself does not appear in the equation; hence the name reduction of order method. The equation for w is linear and first order, so we can solve it using the integrating factor method. One more integration gives v, which is the factor multiplying y_1 in Eq. (2.2.4).

Remark: The functions v and w in this subsection have no relation to the functions v and w from the previous subsection.
Proof of Theorem 2.2.5: We write y_2 = v y_1 and we put this function into the differential equation in (2.2.3), which gives us an equation for v. To start, compute y_2' and y_2'',

y_2' = v' y_1 + v y_1',    y_2'' = v'' y_1 + 2 v' y_1' + v y_1''.

Introduce these expressions into the differential equation,

0 = (v'' y_1 + 2 v' y_1' + v y_1'') + a_1 (v' y_1 + v y_1') + a_0 v y_1
  = y_1 v'' + (2 y_1' + a_1 y_1) v' + (y_1'' + a_1 y_1' + a_0 y_1) v.

The function y_1 is a solution to the original differential equation,

y_1'' + a_1 y_1' + a_0 y_1 = 0,

then the equation for v is given by

y_1 v'' + (2 y_1' + a_1 y_1) v' = 0  ⇒  v'' + ( 2 (y_1'/y_1) + a_1 ) v' = 0.

This is Eq. (2.2.5). The function v does not appear explicitly in this equation, so denoting w = v' we obtain

w' + ( 2 (y_1'/y_1) + a_1 ) w = 0.

This is a first order linear equation for w, so we solve it using the integrating factor method, with integrating factor

μ(t) = y_1²(t) e^{A_1(t)},    where A_1(t) = ∫ a_1(t) dt.
Therefore, the equation for w is (μ w)' = 0, so μ w = w_0, a constant, and

w = w_0 e^{-A_1} / y_1².

One more integration gives v, and y_2 = v y_1 is the expression in Eq. (2.2.4) (choosing w_0 = 1 and the integration constant of v equal to zero). Finally, since v' = w we get v' y_1² = w_0 e^{-A_1}, and then the Wronskian is given by

W_{12} = y_1 y_2' - y_1' y_2 = v' y_1² = w_0 e^{-A_1}.

This is a nonzero function, therefore the functions y_1 and y_2 = v y_1 are linearly independent. This establishes the Theorem.
Example 2.2.8: Find a second solution y_2, linearly independent of the solution y_1(t) = t, of the differential equation

t² y'' + 2t y' - 2y = 0.

Solution: We look for a solution of the form y_2(t) = t v(t). This implies that

y_2' = t v' + v,    y_2'' = t v'' + 2 v'.

So, the equation for v is given by

0 = t² (t v'' + 2 v') + 2t (t v' + v) - 2t v = t³ v'' + (2t² + 2t²) v'  ⇒  t v'' + 4 v' = 0.

Denoting w = v' we get w'/w = -4/t, so w(t) = c_1 t^{-4}, and one more integration gives v(t) = c_2 - (c_1/3) t^{-3}. Choosing c_2 = 0 and c_1 = -3 we get v(t) = t^{-3}, hence a second solution is

y_2(t) = t v(t) = 1/t².

Notice that y_2 is not proportional to y_1(t) = t, and one checks directly that t² y_2'' + 2t y_2' - 2 y_2 = 0. C
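Formula (2.2.4) can be checked symbolically for this example. A small sympy sketch, assuming t > 0:

    import sympy as sp

    t = sp.symbols("t", positive=True)
    y1 = t
    a1 = 2 / t                    # from y'' + (2/t) y' - (2/t**2) y = 0

    A1 = sp.integrate(a1, t)      # A1(t) = 2 log(t)
    y2 = y1 * sp.integrate(sp.exp(-A1) / y1**2, t)     # Eq. (2.2.4)

    print(sp.simplify(y2))        # -1/(3*t**2), proportional to 1/t**2

    # Verify that y2 solves the original equation.
    residual = t**2 * sp.diff(y2, t, 2) + 2*t*sp.diff(y2, t) - 2*y2
    print(sp.simplify(residual))  # 0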
2.2.4. Exercises.
2.3.1. The Roots of the Characteristic Polynomial. Thanks to the work done in § 2.1 we only need to find two linearly independent solutions to second order linear homogeneous equations. Then Theorem 2.1.7 says that every other solution is a linear combination of the former two. How do we find any pair of linearly independent solutions? Since the equation is so simple, having constant coefficients, we find such solutions by trial and error. Here is an example of this idea.
Example 2.3.1: Find solutions to the equation

y'' + 5y' + 6y = 0. (2.3.1)

Solution: We try to find solutions to this equation using simple test functions. For example, it is clear that power functions y = t^n won't work, since the equation

n(n − 1) t^{n−2} + 5n t^{n−1} + 6 t^n = 0

cannot be satisfied for all t ∈ R. We obtained, instead, a condition on t. This rules out power functions. A key insight is to try a test function whose derivative is proportional to the original function,

y'(t) = r y(t).

Such a function would cancel out of the equation. So we try the test function y(t) = e^{rt}. If we introduce this function into the differential equation we get

(r² + 5r + 6) e^{rt} = 0   ⇒   r² + 5r + 6 = 0. (2.3.2)

We have eliminated the exponential and any t dependence from the differential equation, and now the equation is a condition on the constant r. So we look for the appropriate values of r, which are the roots of a degree two polynomial,

r± = (1/2)(−5 ± √(25 − 24)) = (1/2)(−5 ± 1)   ⇒   r+ = −2,   r− = −3.

We have obtained two different roots, which implies we have two different solutions,

y1(t) = e^{−2t},    y2(t) = e^{−3t}.

These solutions are not proportional to each other, so they are fundamental solutions of the differential equation in (2.3.1). Therefore, Theorem 2.1.7 in § 2.1 implies that we have found all possible solutions of the differential equation, and they are given by

y(t) = c1 e^{−2t} + c2 e^{−3t},    c1, c2 ∈ R. (2.3.3)

C
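Remark: As a quick cross-check, one can reproduce this result with sympy's dsolve. The snippet below is a sketch under the assumption that sympy is installed; it is not part of the original text.

    import sympy as sp

    t = sp.symbols('t')
    y = sp.Function('y')
    ode = sp.Eq(y(t).diff(t, 2) + 5*y(t).diff(t) + 6*y(t), 0)
    print(sp.dsolve(ode, y(t)))   # a linear combination of exp(-2*t) and exp(-3*t), as in (2.3.3)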
From the example above we see that this idea will produce fundamental solutions to all constant coefficients homogeneous equations whose associated polynomials have two different roots. Such polynomials play an important role in finding solutions to differential equations like the one above, so we give them a name.
As we saw in Example 2.3.1, the roots of the characteristic polynomial are crucial to express the solutions of the differential equation above. The characteristic polynomial is a second degree polynomial with real coefficients, and the general expression for its roots is

r± = (1/2)(−a1 ± √(a1² − 4a0)).

If the discriminant (a1² − 4a0) is positive, zero, or negative, then the roots of p are, respectively, two different real numbers, only one real number, or a complex-conjugate pair of complex numbers. For each case the solution of the differential equation can be expressed in different forms.
Theorem 2.3.2 (Constant Coefficients). If r± are the roots of the characteristic polynomial of the second order linear homogeneous equation with constant coefficients

y'' + a1 y' + a0 y = 0, (2.3.4)

and if c+, c− are arbitrary constants, then the following statements hold true.
(a) If r+ ≠ r−, real or complex, then the general solution of Eq. (2.3.4) is given by

ygen(t) = c+ e^{r+ t} + c− e^{r− t}.

(b) If r+ = r− = r0 ∈ R, then the general solution of Eq. (2.3.4) is given by

ygen(t) = c+ e^{r0 t} + c− t e^{r0 t}.

Furthermore, given real constants t0, y0 and y1, there is a unique solution to the initial value problem given by Eq. (2.3.4) and the initial conditions y(t0) = y0 and y'(t0) = y1.
Remarks:
(a) The proof is to guess that functions y(t) = e^{rt} must be solutions for appropriate values of the exponent constant r, the latter being roots of the characteristic polynomial. When the characteristic polynomial has two different roots, Theorem 2.1.7 says we have all solutions. When the root is repeated we use the reduction of order method to find a second solution not proportional to the first one.
(b) At the end of the section we show a proof where we construct the fundamental solutions y1, y2 without guessing them. We do not need to use Theorem 2.1.7 in this second proof, which is based entirely on a generalization of the reduction of order method.
Proof of Theorem 2.3.2: We guess that particular solutions to Eq. (2.3.4) must be exponential functions of the form y(t) = e^{rt}, because the exponential will cancel out from the equation and only a condition for r will remain. This is what happens,

r² e^{rt} + a1 r e^{rt} + a0 e^{rt} = 0   ⇒   r² + a1 r + a0 = 0.

The second equation says that the appropriate values of the exponent are the roots of the characteristic polynomial. We now have two cases. If r+ ≠ r−, then the solutions

y+(t) = e^{r+ t},    y−(t) = e^{r− t},

are linearly independent, so the general solution of the differential equation is

ygen(t) = c+ e^{r+ t} + c− e^{r− t}.
If r+ = r− = r0, then we have found only one solution y+(t) = e^{r0 t}, and we need to find a second solution not proportional to y+. This is what the reduction of order method is perfect for. We write the second solution as

y−(t) = v(t) y+(t)   ⇒   y−(t) = v(t) e^{r0 t},

and we put this expression into the differential equation (2.3.4),

(v'' + 2 r0 v' + v r0²) e^{r0 t} + a1 (v' + r0 v) e^{r0 t} + a0 v e^{r0 t} = 0.
Example 2.3.2: Find the solution of the initial value problem

y'' + 5y' + 6y = 0,    y(0) = 1,    y'(0) = −1.

Solution: We know that the general solution of the differential equation above is

ygen(t) = c+ e^{−2t} + c− e^{−3t}.
We now find the constants c+ and c− that satisfy the initial conditions above,

1 = y(0) = c+ + c−,    −1 = y'(0) = −2c+ − 3c−   ⇒   c+ = 2,   c− = −1.

Therefore, the unique solution of the initial value problem is

y(t) = 2 e^{−2t} − e^{−3t}. C
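Remark: The initial value problem above can also be solved programmatically. The following sketch, again assuming sympy, passes the initial conditions directly to dsolve.

    import sympy as sp

    t = sp.symbols('t')
    y = sp.Function('y')
    ode = sp.Eq(y(t).diff(t, 2) + 5*y(t).diff(t) + 6*y(t), 0)
    ics = {y(0): 1, y(t).diff(t).subs(t, 0): -1}
    print(sp.dsolve(ode, y(t), ics=ics))   # y(t) = 2*exp(-2*t) - exp(-3*t)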
Example 2.3.3: Find the general solution ygen of the differential equation

2y'' − 3y' + y = 0.

Solution: We look for all solutions of the form y(t) = e^{rt}, where r is a solution of the characteristic equation

2r² − 3r + 1 = 0   ⇒   r = (1/4)(3 ± √(9 − 8))   ⇒   r+ = 1,   r− = 1/2.

Therefore, the general solution of the equation above is

ygen(t) = c+ e^{t} + c− e^{t/2}. C
So, the solution to the initial value problem above is y(t) = (1 + 2t) e^{t/3}. C
Since the roots of the characteristic polynomial are different, Theorem 2.3.2 says that the general solution of the differential equation above, which includes complex-valued solutions, can be written as follows,

ygen(t) = c+ e^{(1+i√5)t} + c− e^{(1−i√5)t},    c+, c− ∈ C.

C
2.3.2. Real Solutions for Complex Roots. We study in more detail the solutions of the differential equation (2.3.4) in the case that the characteristic polynomial has complex roots. Since these roots have the form

r± = −a1/2 ± (1/2)√(a1² − 4a0),

the roots are complex-valued when a1² − 4a0 < 0. We use the notation

r± = α ± iβ,    with α = −a1/2,    β = √(a0 − a1²/4).

The fundamental solutions in Theorem 2.3.2 are the complex-valued functions

y+ = e^{(α+iβ)t},    y− = e^{(α−iβ)t}.

The general solution constructed from these solutions is

ygen(t) = c+ e^{(α+iβ)t} + c− e^{(α−iβ)t},    c+, c− ∈ C.

This formula for the general solution includes real-valued and complex-valued solutions. But it is not so simple to single out the real-valued solutions. Knowing the real-valued solutions could be important in physical applications. If a physical system is described by a differential equation with real coefficients, more often than not one is interested in finding real-valued solutions. For that reason we now provide a new set of fundamental solutions that are real-valued. Using real-valued fundamental solutions it is simple to separate all real-valued solutions from the complex-valued ones.
Theorem 2.3.3 (Real Valued Fundamental Solutions). If the differential equation

y'' + a1 y' + a0 y = 0, (2.3.5)

where a1, a0 are real constants, has a characteristic polynomial with complex roots r± = α ± iβ and complex-valued fundamental solutions

y+(t) = e^{(α+iβ)t},    y−(t) = e^{(α−iβ)t},

then the equation also has real-valued fundamental solutions given by

ỹ+(t) = e^{αt} cos(βt),    ỹ−(t) = e^{αt} sin(βt).
Proof of Theorem 2.3.3: We start with the complex-valued fundamental solutions

y+(t) = e^{(α+iβ)t},    y−(t) = e^{(α−iβ)t}.

We take the function y+ and we use a property of complex exponentials,

y+(t) = e^{(α+iβ)t} = e^{αt} e^{iβt} = e^{αt} (cos(βt) + i sin(βt)),

where in the last step we used Euler's formula e^{iθ} = cos(θ) + i sin(θ). Repeating this calculation for y− we get,

y+(t) = e^{αt} (cos(βt) + i sin(βt)),    y−(t) = e^{αt} (cos(βt) − i sin(βt)).
Solution: We already found the roots of the characteristic polynomial, but we do it again,

r² − 2r + 6 = 0   ⇒   r = (1/2)(2 ± √(4 − 24))   ⇒   r± = 1 ± i√5.

So the complex-valued fundamental solutions are

y+(t) = e^{(1+i√5)t},    y−(t) = e^{(1−i√5)t}.

Theorem 2.3.3 says that real-valued fundamental solutions are given by

ỹ+(t) = e^{t} cos(√5 t),    ỹ−(t) = e^{t} sin(√5 t).

So the real-valued general solution is given by

ygen(t) = (c+ cos(√5 t) + c− sin(√5 t)) e^{t},    c+, c− ∈ R. C
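Remark: One can verify symbolically that the real-valued solutions found above indeed solve the equation. A minimal sympy sketch (our own check, not from the text):

    import sympy as sp

    t = sp.symbols('t')
    y = sp.exp(t)*sp.cos(sp.sqrt(5)*t)        # the real-valued solution e^t cos(sqrt(5) t)
    residual = y.diff(t, 2) - 2*y.diff(t) + 6*y
    print(sp.simplify(residual))              # 0, so y solves y'' - 2y' + 6y = 0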
Remark: Sometimes it is difficult to remember the formula for the real-valued solutions. One way to obtain those solutions without remembering the formula is to repeat the proof of Theorem 2.3.3: start with the complex-valued solution y+ and use the properties of the complex exponential,

y+(t) = e^{(1+i√5)t} = e^{t} e^{i√5 t} = e^{t} (cos(√5 t) + i sin(√5 t)).

The real-valued fundamental solutions are the real and imaginary parts of that expression.
Remark: Physical processes that oscillate in time without dissipation could be described
by differential equations like the one in this example.
2.3.3. Constructive Proof of Theorem 2.3.2. We now present an alternative proof for
Theorem 2.3.2 that does not involve guessing the fundamental solutions of the equation.
Instead, we construct these solutions using a generalization of the reduction of order method.
Proof of Theorem 2.3.2: The proof has two main parts: first, we transform the original equation into an equation that is simpler to solve for a new unknown; second, we solve this simpler problem.

In order to transform the problem into a simpler one, we express the solution y as a product of two functions, that is, y(t) = u(t) v(t). Choosing v in an appropriate way, the equation for u will be simpler to solve than the equation for y. Hence,

y = u v   ⇒   y' = u' v + v' u   ⇒   y'' = u'' v + 2 u' v' + v'' u.

Therefore, Eq. (2.3.4) implies that

(u'' v + 2 u' v' + v'' u) + a1 (u' v + v' u) + a0 u v = 0,

that is,

[u'' + (a1 + 2 v'/v) u' + a0 u] v + (v'' + a1 v') u = 0. (2.3.6)

We now choose the function v such that

a1 + 2 v'/v = 0   ⇔   v'/v = −a1/2. (2.3.7)

We choose a simple solution of this equation, given by

v(t) = e^{−a1 t/2}.

Having this expression for v one can compute v' and v'', and it is simple to check that

v'' + a1 v' = −(a1²/4) v. (2.3.8)
Introducing the first equation in (2.3.7) and Eq. (2.3.8) into Eq. (2.3.6), and recalling that v is nonzero, we obtain the simplified equation for the function u,

u'' − k u = 0,    k = a1²/4 − a0. (2.3.9)

Eq. (2.3.9) for u is simpler than the original equation (2.3.4) for y, since in the former there is no term with the first derivative of the unknown function.

In order to solve Eq. (2.3.9) we repeat the idea used to obtain this equation, that is, we express the function u as a product of two functions, and solve a simple problem for one of the functions. We first consider the harder case, which is when k ≠ 0. In this case, let us express u(t) = e^{√k t} w(t). Hence,

u' = √k e^{√k t} w + e^{√k t} w',    u'' = k e^{√k t} w + 2√k e^{√k t} w' + e^{√k t} w''.

Therefore, Eq. (2.3.9) for the function u implies the following equation for the function w,

0 = u'' − k u = e^{√k t} (2√k w' + w'')   ⇒   w'' + 2√k w' = 0.

Only derivatives of w appear in the latter equation, so denoting x(t) = w'(t) we have to solve a simple equation,

x' = −2√k x   ⇒   x(t) = x0 e^{−2√k t},    x0 ∈ R.

Integrating we obtain w as follows,

w' = x0 e^{−2√k t}   ⇒   w(t) = −(x0/(2√k)) e^{−2√k t} + c0.

Renaming c1 = −x0/(2√k), we obtain

w(t) = c1 e^{−2√k t} + c0   ⇒   u(t) = c0 e^{√k t} + c1 e^{−√k t}.

We then obtain the expression for the solution y = u v,

y(t) = c0 e^{(−a1/2 + √k) t} + c1 e^{(−a1/2 − √k) t}.

Since k = a1²/4 − a0, the numbers

r± = −a1/2 ± √k   ⇔   r± = (1/2)(−a1 ± √(a1² − 4a0))

are the roots of the characteristic polynomial

r² + a1 r + a0 = 0,

so we can express all solutions of Eq. (2.3.4) as follows,

y(t) = c0 e^{r+ t} + c1 e^{r− t},    k ≠ 0.

Finally, consider the case k = 0. Then Eq. (2.3.9) is simply

u'' = 0   ⇒   u(t) = c0 + c1 t,    c0, c1 ∈ R.

Then the solution y of Eq. (2.3.4) in this case is given by

y(t) = (c0 + c1 t) e^{−a1 t/2}.

Since k = 0, the characteristic equation r² + a1 r + a0 = 0 has only one root r+ = r− = −a1/2, so the solution y above can be expressed as

y(t) = (c0 + c1 t) e^{r+ t},    k = 0.

The furthermore part is proven as in the first proof of Theorem 2.3.2. This establishes the Theorem.
Notes.
(a) In the case that the characteristic polynomial of a differential equation has repeated roots, there is an interesting argument to guess the solution ỹ−. The idea is to take a particular type of limit in solutions of differential equations with complex-valued roots.

Consider the equation in (2.3.4) with a characteristic polynomial having complex-valued roots r± = α ± iβ, with

α = −a1/2,    β = √(a0 − a1²/4).

Real-valued fundamental solutions in this case are given by

y+ = e^{αt} cos(βt),    y− = e^{αt} sin(βt).

We now study what happens to these solutions y+ and y− in the following limit: the variable t is held constant, α is held constant, and β → 0. The last two conditions are conditions on the equation coefficients a1, a0. For example, we fix a1 and we let a0 → a1²/4 from above.

Since cos(βt) → 1 as β → 0 with t fixed, then keeping α fixed too, we obtain

y+(t) = e^{αt} cos(βt) → e^{αt} = ỹ+(t).

Since sin(βt)/(βt) → 1 as β → 0 with t constant, that is, sin(βt) ≈ βt, we conclude that

y−(t)/β = e^{αt} sin(βt)/β = t e^{αt} sin(βt)/(βt) → t e^{αt} = ỹ−(t).

The calculation above says that the function y−/β is close to the function ỹ−(t) = t e^{αt} in the limit β → 0, t held constant. This calculation provides a candidate, ỹ−(t) = t ỹ+(t), for a solution of Eq. (2.3.4). It is simple to verify that this candidate is in fact a solution of Eq. (2.3.4). Since ỹ− is not proportional to ỹ+, one concludes that the functions ỹ+, ỹ− are a fundamental set for the differential equation in (2.3.4) in the case that the characteristic polynomial has repeated roots.
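The limit β → 0 can also be illustrated numerically. The short sketch below, plain Python with numpy (an illustration of ours, not from the text), shows sin(βt)/β approaching t for a fixed t.

    import numpy as np

    t = 1.7                                  # any fixed time
    for beta in [1.0, 0.1, 0.01, 0.001]:
        print(beta, np.sin(beta*t)/beta)     # approaches t = 1.7 as beta -> 0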
(b) Brief Review of Complex Numbers.
Complex numbers have the form z = a + ib, where i² = −1.
The complex conjugate of z is the number z̄ = a − ib.
Re(z) = a and Im(z) = b are the real and imaginary parts of z.
Hence: Re(z) = (z + z̄)/2 and Im(z) = (z − z̄)/(2i).
The exponential of a complex number is defined as

e^{a+ib} = Σ_{n=0}^{∞} (a + ib)^n / n!.

In particular, e^{a+ib} = e^a e^{ib}.
Euler's formula: e^{ib} = cos(b) + i sin(b).
Hence, a complex number of the form e^{a+ib} can be written as

e^{a+ib} = e^a (cos(b) + i sin(b)),    e^{a−ib} = e^a (cos(b) − i sin(b)).
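Euler's formula can be checked numerically with Python's cmath module; a small sketch (not part of the text):

    import cmath

    a, b = 0.3, 1.2
    z = complex(a, b)
    lhs = cmath.exp(z)                                   # e^{a+ib}
    rhs = cmath.exp(a)*(cmath.cos(b) + 1j*cmath.sin(b))  # e^a (cos(b) + i sin(b))
    print(lhs, rhs)                                      # the two values agree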
2.3.4. Exercises.
2.3.1.- . 2.3.2.- .
Remarks:
(a) This equation is also called the Cauchy equidimensional equation, Cauchy equation, Cauchy-Euler equation, or simply Euler equation. As George Simmons says in [10], Euler's studies were so extensive that many mathematicians tried to avoid confusion by naming subjects after the person who first studied them after Euler.
(b) The equation is called equidimensional because if the variable t has any physical dimensions, then the terms with (t − t0)^n d^n/dt^n, for any nonnegative integer n, are actually dimensionless.
(c) The exponential functions y(t) = e^{rt} are not solutions of the Euler equation. Just introduce such a function into the equation, and it is simple to show that there is no constant r such that the exponential is a solution.
(d) The particular case t0 = 0 is

t² y'' + p0 t y' + q0 y = 0.
We now summarize what is known about solutions of the Euler equation.
Theorem 2.4.2 (Euler Equation). Consider the Euler equidimensional equation

(t − t0)² y'' + a1 (t − t0) y' + a0 y = 0,    t > t0, (2.4.1)

where a1, a0, and t0 are real constants, and denote by r± the roots of the indicial polynomial p(r) = r(r − 1) + a1 r + a0.
(a) If r+ ≠ r−, real or complex, then the general solution of Eq. (2.4.1) is given by

ygen(t) = c+ (t − t0)^{r+} + c− (t − t0)^{r−},    t > t0,    c+, c− ∈ R.

(b) If r+ = r− = r0 ∈ R, then the general solution of Eq. (2.4.1) is given by

ygen(t) = c+ (t − t0)^{r0} + c− (t − t0)^{r0} ln(t − t0),    t > t0,    c+, c− ∈ R.

Furthermore, given real constants t1 > t0, y0 and y1, there is a unique solution to the initial value problem given by Eq. (2.4.1) and the initial conditions

y(t1) = y0,    y'(t1) = y1.
Remark: We have restricted to a domain with t > t0. Similar results hold for t < t0. In fact one can prove the following: if a solution y has the value y(t − t0) for t − t0 > 0, then the function ỹ defined as ỹ(t − t0) = y(−(t − t0)), for t − t0 < 0, is a solution of Eq. (2.4.1) for t − t0 < 0. For this reason the solution for t ≠ t0 is sometimes written in the literature, see [3] § 5.4, as follows,

ygen(t) = c+ |t − t0|^{r+} + c− |t − t0|^{r−},    r+ ≠ r−,
ygen(t) = c+ |t − t0|^{r0} + c− |t − t0|^{r0} ln|t − t0|,    r+ = r− = r0.

However, when solving an initial value problem, we need to pick the domain that contains the initial data point t1. This domain will be a subinterval of either (−∞, t0) or (t0, ∞). For simplicity, in these notes we choose the domain (t0, ∞).
The proof of this theorem closely follows the ideas used to find all solutions of second order linear equations with constant coefficients, Theorem 2.3.2, in § 2.3. In that case we found fundamental solutions of the differential equation

y'' + a1 y' + a0 y = 0,

and then we recalled Theorem 2.1.7, which says that any other solution is a linear combination of a fundamental solution pair. In the case of constant coefficient equations, we looked for fundamental solutions of the form y(t) = e^{rt}, where the constant r was a root of the characteristic polynomial

r² + a1 r + a0 = 0.

When this polynomial had two different roots, r+ ≠ r−, we got the fundamental solutions

y+(t) = e^{r+ t},    y−(t) = e^{r− t}.

When the root was repeated, r+ = r− = r0, we used the reduction of order method to get the fundamental solutions

y+(t) = e^{r0 t},    y−(t) = t e^{r0 t}.

Well, the proof of Theorem 2.4.2 closely follows this proof, replacing the exponential functions by power functions.
Proof of Theorem 2.4.2: For simplicity we consider the case t0 = 0. The general case t0 ≠ 0 follows from the case t0 = 0 after replacing t by (t − t0). So, consider the equation

t² y'' + a1 t y' + a0 y = 0,    t > 0.

We look for solutions of the form y(t) = t^r, because power functions have the property that

y' = r t^{r−1}   ⇒   t y' = r t^r.

A similar property holds for the second derivative,

y'' = r(r − 1) t^{r−2}   ⇒   t² y'' = r(r − 1) t^r.

When we introduce this function into the Euler equation we get an algebraic equation for r,

[r(r − 1) + a1 r + a0] t^r = 0   ⇔   r(r − 1) + a1 r + a0 = 0.

If we have a repeated root r+ = r− = r0, then one solution is y+(t) = t^{r0}. To obtain the second solution we use the reduction of order method. Since we have one solution of the equation, y+, the second solution is

y−(t) = v(t) y+(t)   ⇒   y−(t) = v(t) t^{r0}.

We need to compute the first two derivatives of y−,

y−' = r0 v t^{r0−1} + v' t^{r0},    y−'' = r0(r0 − 1) v t^{r0−2} + 2 r0 v' t^{r0−1} + v'' t^{r0}.

We now put these expressions for y−, y−' and y−'' into the Euler equation,

t² [r0(r0 − 1) v t^{r0−2} + 2 r0 v' t^{r0−1} + v'' t^{r0}] + a1 t [r0 v t^{r0−1} + v' t^{r0}] + a0 v t^{r0} = 0.
Solution: We look for solutions of the form y(t) = t^r, which implies that

t y'(t) = r t^r,    t² y''(t) = r(r − 1) t^r,

therefore, introducing this function y into the differential equation we obtain

[r(r − 1) + 4r + 2] t^r = 0   ⇔   r(r − 1) + 4r + 2 = 0.
Example 2.4.2: Find the general solution of the Euler equation below for t > 0,

t² y'' − 3t y' + 4 y = 0.

Solution: We look for solutions of the form y(t) = t^r, so the constant r must be a root of the indicial polynomial,

r(r − 1) − 3r + 4 = 0   ⇔   r² − 4r + 4 = 0   ⇒   r+ = r− = 2.

Therefore, the general solution of the Euler equation for t > 0 in this case is given by

ygen(t) = c+ t² + c− t² ln(t). C
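Remark: sympy's dsolve also handles Euler equations directly. The following sketch, assuming sympy is available, reproduces the repeated-root solution above.

    import sympy as sp

    t = sp.symbols('t', positive=True)
    y = sp.Function('y')
    ode = sp.Eq(t**2*y(t).diff(t, 2) - 3*t*y(t).diff(t) + 4*y(t), 0)
    print(sp.dsolve(ode, y(t)))   # a combination of t**2 and t**2*log(t), as found above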
Example 2.4.3: Find the general solution of the Euler equation below for t > 0,

t² y'' − 3t y' + 13 y = 0.

Solution: We look for solutions of the form y(t) = t^r, which implies that

t y'(t) = r t^r,    t² y''(t) = r(r − 1) t^r,

therefore, introducing this function y into the differential equation we obtain

[r(r − 1) − 3r + 13] t^r = 0   ⇔   r(r − 1) − 3r + 13 = 0.
2.4.2. Real Solutions for Complex Roots. We study in more detail the solutions of the Euler equation in the case that the indicial polynomial has complex roots. Since these roots have the form

r± = −(a1 − 1)/2 ± (1/2)√((a1 − 1)² − 4a0),

the roots are complex-valued when (a1 − 1)² − 4a0 < 0. We use the notation

r± = α ± iβ,    with α = −(a1 − 1)/2,    β = √(a0 − (a1 − 1)²/4).

The fundamental solutions in Theorem 2.4.2 are the complex-valued functions

y+(t) = t^{(α+iβ)},    y−(t) = t^{(α−iβ)}.

The general solution constructed from these solutions is

ygen(t) = c+ t^{(α+iβ)} + c− t^{(α−iβ)},    c+, c− ∈ C.
This formula for the general solution includes real-valued and complex-valued solutions. But it is not so simple to single out the real-valued solutions. Knowing the real-valued solutions could be important in physical applications. If a physical system is described by a
differential equation with real coefficients, more often than not one is interested in finding real-valued solutions. For that reason we now provide a new set of fundamental solutions that are real-valued. Using real-valued fundamental solutions it is simple to separate all real-valued solutions from the complex-valued ones.
Theorem 2.4.3 (Real Valued Fundamental Solutions). If the differential equation

(t − t0)² y'' + a1 (t − t0) y' + a0 y = 0,    t > t0, (2.4.3)

where a1, a0, t0 are real constants, has an indicial polynomial with complex roots r± = α ± iβ and complex-valued fundamental solutions for t > t0,

y+(t) = (t − t0)^{(α+iβ)},    y−(t) = (t − t0)^{(α−iβ)},

then the equation also has real-valued fundamental solutions for t > t0 given by

ỹ+(t) = (t − t0)^{α} cos(β ln(t − t0)),    ỹ−(t) = (t − t0)^{α} sin(β ln(t − t0)).
Proof of Theorem 2.4.3: For simplicity consider the case t0 = 0. Take the solutions

y+(t) = t^{(α+iβ)},    y−(t) = t^{(α−iβ)}.

Rewrite the power function as follows,

y+(t) = t^{(α+iβ)} = t^{α} t^{iβ} = t^{α} e^{ln(t^{iβ})} = t^{α} e^{iβ ln(t)}   ⇒   y+(t) = t^{α} e^{iβ ln(t)}.

A similar calculation yields

y−(t) = t^{α} e^{−iβ ln(t)}.

Recall now Euler's formula for complex exponentials, e^{iθ} = cos(θ) + i sin(θ); then we get

y+(t) = t^{α} [cos(β ln(t)) + i sin(β ln(t))],    y−(t) = t^{α} [cos(β ln(t)) − i sin(β ln(t))].

To prove the case t0 ≠ 0, just replace t by (t − t0) in all the steps above. This establishes the Theorem.
Example 2.4.4: Find a real-valued general solution of the Euler equation below for t > 0,

t² y'' − 3t y' + 13 y = 0.

Solution: In Example 2.4.3 we found the roots of the indicial polynomial, r(r − 1) − 3r + 13 = 0, that is, r² − 4r + 13 = 0, so r± = 2 ± 3i. Theorem 2.4.3 then gives the real-valued fundamental solutions ỹ+(t) = t² cos(3 ln(t)) and ỹ−(t) = t² sin(3 ln(t)), so a real-valued general solution is

ygen(t) = t² [c+ cos(3 ln(t)) + c− sin(3 ln(t))],    c+, c− ∈ R. C
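Remark: A direct symbolic check that t² cos(3 ln(t)) solves the equation (a sketch of ours, assuming sympy):

    import sympy as sp

    t = sp.symbols('t', positive=True)
    y = t**2*sp.cos(3*sp.log(t))
    residual = t**2*y.diff(t, 2) - 3*t*y.diff(t) + 13*y
    print(sp.simplify(residual))   # 0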
2.4.3. Transformation to Constant Coefficients. Theorem 2.4.2 shows that y(t) = t^{r±}, where r± are roots of the indicial polynomial, are solutions of the Euler equation

t² y'' + a1 t y' + a0 y = 0,    t > 0.

The proof of this theorem is to verify that the power functions y(t) = t^{r±} solve the differential equation. How did we know we had to try power functions? One answer could be that this is a guess, a lucky one. Another answer is that the Euler equation can be transformed into a constant coefficient equation by a change of the independent variable.

Theorem 2.4.4 (Transformation to Constant Coefficients). The function y is a solution of the Euler equidimensional equation

t² y'' + a1 t y' + a0 y = 0,    t > 0, (2.4.4)

iff the function u(z) = y(t(z)), where t(z) = e^{z}, satisfies the constant coefficients equation

ü + (a1 − 1) u̇ + a0 u = 0,    z ∈ R, (2.4.5)

where y' = dy/dt and u̇ = du/dz.

Remark: The solutions of the constant coefficient equation in (2.4.5) are u(z) = e^{r± z}, where r± are the roots of the characteristic polynomial of Eq. (2.4.5),

r±² + (a1 − 1) r± + a0 = 0,

that is, r± must be a root of the indicial polynomial of Eq. (2.4.4).
(a) Consider the case r+ ≠ r−. Recalling that y(t) = u(z(t)) and z(t) = ln(t), we get

y±(t) = u(z(t)) = e^{r± z(t)} = e^{r± ln(t)} = e^{ln(t^{r±})}   ⇒   y±(t) = t^{r±}.

(b) Consider the case r+ = r− = r0. Recalling that y(t) = u(z(t)) and z(t) = ln(t), we get that y+(t) = t^{r0}, while the second solution is

y−(t) = u(z(t)) = z(t) e^{r0 z(t)} = ln(t) e^{r0 ln(t)} = ln(t) e^{ln(t^{r0})}   ⇒   y−(t) = ln(t) t^{r0}.
Proof of Theorem 2.4.4: Given t > 0, introduce t(z) = e^{z}. Given a function y, let

u(z) = y(t(z))   ⇒   u(z) = y(e^{z}).

Then the derivatives of u and y are related by the chain rule,

u̇(z) = (du/dz)(z) = (dy/dt)(t(z)) (dt/dz)(z) = y'(t(z)) d(e^{z})/dz = y'(t(z)) e^{z},

so we obtain

u̇(z) = t y'(t),

where we have denoted u̇ = du/dz. The relation for the second derivatives is

ü(z) = d/dt [t y'(t)] (dt/dz) = (t y''(t) + y'(t)) d(e^{z})/dz = (t y''(t) + y'(t)) t,

so we obtain

ü(z) = t² y''(t) + t y'(t).

Combining the equations for u̇ and ü we get

t² y'' = ü − u̇,    t y' = u̇.

The function y is a solution of the Euler equation t² y'' + a1 t y' + a0 y = 0 iff

(ü − u̇) + a1 u̇ + a0 u = 0   ⇔   ü + (a1 − 1) u̇ + a0 u = 0.

This establishes the Theorem.
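Remark: The statement of Theorem 2.4.4 can be checked symbolically: substituting u(z) = e^{rz} into Eq. (2.4.5) must reproduce the indicial polynomial. A sketch, assuming sympy:

    import sympy as sp

    z, r, a1, a0 = sp.symbols('z r a1 a0')
    u = sp.exp(r*z)                          # candidate solution of Eq. (2.4.5)
    expr = u.diff(z, 2) + (a1 - 1)*u.diff(z) + a0*u
    print(sp.expand(expr/u))                 # r**2 + (a1 - 1)*r + a0, i.e. r(r-1) + a1 r + a0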
2.4.4. Exercises.
2.4.1.- . 2.4.2.- .
2.5.1. The General Solution Formula. The general solution formula for homogeneous equations, Theorem 2.1.7, is no longer true for nonhomogeneous equations. But there is a general solution formula for nonhomogeneous equations. Such a formula involves three functions: two of them are fundamental solutions of the homogeneous equation, and the third is any solution of the nonhomogeneous equation. Every other solution of the nonhomogeneous equation can be obtained from these three functions.
Theorem 2.5.1 (General Solution). Every solution y of the nonhomogeneous equation

L(y) = f, (2.5.1)

with L(y) = y'' + a1 y' + a0 y, where a1, a0, and f are continuous functions, is given by

y = c1 y1 + c2 y2 + yp,

where the functions y1 and y2 are fundamental solutions of the homogeneous equation, L(y1) = 0, L(y2) = 0, and yp is any solution of the nonhomogeneous equation, L(yp) = f.
Before we prove Theorem 2.5.1 we state the following definition, which comes naturally from this Theorem.
Definition 2.5.2. The general solution of the nonhomogeneous equation L(y) = f is a
two-parameter family of functions
ygen (t) = c1 y1 (t) + c2 y2 (t) + yp (t), (2.5.2)
where the functions y1 and y2 are fundamental solutions of the homogeneous equation,
L(y1 ) = 0, L(y2 ) = 0, and yp is any solution of the nonhomogeneous equation L(yp ) = f .
Remark: The difference of any two solutions of the nonhomogeneous equation is actually a
solution of the homogeneous equation. This is the key idea to prove Theorem 2.5.1. In other
words, the solutions of the nonhomogeneous equation are a translation by a fixed function,
yp , of the solutions of the homogeneous equation.
Proof of Theorem 2.5.1: Let y be any solution of the nonhomogeneous equation L(y) = f. Recall that we already have one solution, yp, of the nonhomogeneous equation, L(yp) = f. We can now subtract the second equation from the first,

L(y) − L(yp) = f − f = 0   ⇒   L(y − yp) = 0.

The equation on the right is obtained from the linearity of the operator L. This last equation says that the difference of any two solutions of the nonhomogeneous equation is a solution of the homogeneous equation. The general solution formula for homogeneous equations says that all solutions of the homogeneous equation can be written as linear combinations of a pair of fundamental solutions, y1, y2. So there exist constants c1, c2 such that

y − yp = c1 y1 + c2 y2.

Since for every solution y of L(y) = f we can find constants c1, c2 such that the equation above holds true, we have found a formula for all solutions of the nonhomogeneous equation. This establishes the Theorem.
2.5.2. The Undetermined Coefficients Method. The general solution formula in (2.5.2) is most useful if there is a way to find a particular solution yp of the nonhomogeneous equation L(yp) = f. We now present a method to find such a particular solution, the undetermined coefficients method. This method works for linear operators L with constant coefficients and for simple source functions f. Here is a summary of the undetermined coefficients method:

(1) Find fundamental solutions y1, y2 of the homogeneous equation L(y) = 0.
(2) Given the source function f, guess the solution yp following Table 1 below.
(3) If the function yp given by the table satisfies L(yp) = 0, then change the guess to t yp. If t yp satisfies L(t yp) = 0 as well, then change the guess to t² yp.
(4) Find the undetermined constants k in the function yp using the equation L(y) = f, where y is yp, or t yp, or t² yp.

    f(t) (source)                       yp(t) (guess)
    K e^{at}                            k e^{at}
    K_m t^m + ··· + K_0                 k_m t^m + ··· + k_0
    K_1 cos(bt) + K_2 sin(bt)           k_1 cos(bt) + k_2 sin(bt)

    Table 1. A list of simple source functions f and the corresponding guesses yp.

This is the undetermined coefficients method. It is a set of simple rules to find a particular solution yp of a nonhomogeneous equation L(yp) = f in the case that the source function f is one of the entries in Table 1. There are a few formulas for particular cases and a few generalizations of the whole method. We discuss them after a few examples.
Example 2.5.1 (First Guess Right): Find all solutions of the nonhomogeneous equation

y'' − 3y' − 4y = 3 e^{2t}.

Solution: We follow the steps of the undetermined coefficients method.
(1): The characteristic equation r² − 3r − 4 = 0 has roots r+ = 4 and r− = −1, so fundamental solutions of the homogeneous equation are y+(t) = e^{4t} and y−(t) = e^{−t}.
(2): The table says: for f(t) = 3e^{2t} guess yp(t) = k e^{2t}. The constant k is the undetermined coefficient we must find.
(3): Since yp(t) = k e^{2t} is not a solution of the homogeneous equation, we do not need to modify our guess. (Recall: L(y) = 0 iff there exist constants c+, c− such that y(t) = c+ e^{4t} + c− e^{−t}.)
(4): Introduce yp into L(yp) = f and find k,

(2² − 3·2 − 4) k e^{2t} = 3 e^{2t}   ⇒   −6k = 3   ⇒   k = −1/2.

We guessed that yp must be proportional to the exponential e^{2t} in order to cancel out the exponentials in the equation above. We have obtained that

yp(t) = −(1/2) e^{2t}.

The undetermined coefficients method gives us a way to compute a particular solution yp of the nonhomogeneous equation. We now use the general solution theorem, Theorem 2.5.1, to write the general solution of the nonhomogeneous equation,

ygen(t) = c+ e^{4t} + c− e^{−t} − (1/2) e^{2t}. C
Remark: Step (4) in Example 2.5.1 is a particular case of the following statement.

Theorem 2.5.3. Consider the equation L(y) = f, where L(y) = y'' + a1 y' + a0 y has constant coefficients and p is its characteristic polynomial. If the source function is f(t) = K e^{at}, with p(a) ≠ 0, then a particular solution of the nonhomogeneous equation is

yp(t) = (K / p(a)) e^{at}.

Proof of Theorem 2.5.3: Since the linear operator L has constant coefficients, let us write L and its associated characteristic polynomial p as follows,

L(y) = y'' + a1 y' + a0 y,    p(r) = r² + a1 r + a0.

Since the source function is f(t) = K e^{at}, Table 1 says that a good guess for a particular solution of the nonhomogeneous equation is yp(t) = k e^{at}. Our hypothesis is that this guess is not a solution of the homogeneous equation, since

L(yp) = (a² + a1 a + a0) k e^{at} = p(a) k e^{at},    and p(a) ≠ 0.

Then L(yp) = f gives p(a) k e^{at} = K e^{at}, so k = K/p(a). This establishes the Theorem.
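Remark: Theorem 2.5.3 turns step (4) into a one-line computation. Here is a small sketch, assuming sympy (the names p, K, a are ours), applied to Example 2.5.1.

    import sympy as sp

    t, r = sp.symbols('t r')
    p = r**2 - 3*r - 4                 # characteristic polynomial of Example 2.5.1
    K, a = 3, 2                        # source f(t) = 3 e^{2t}
    yp = K/p.subs(r, a)*sp.exp(a*t)    # K e^{at}/p(a) = -(1/2) e^{2t}
    print(yp)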
In the following example our first guess for a particular solution yp happens to be a solution of the homogeneous equation.

Example 2.5.2 (First Guess Wrong): Find all solutions of the nonhomogeneous equation

y'' − 3y' − 4y = 3 e^{4t}.

Solution: If we write the equation as L(y) = f, with f(t) = 3 e^{4t}, then the operator L is the same as in Example 2.5.1. So the solutions of the homogeneous equation L(y) = 0 are the same as in that example,

y+(t) = e^{4t},    y−(t) = e^{−t}.

The source function is f(t) = 3 e^{4t}, so Table 1 says that we need to guess yp(t) = k e^{4t}. However, this function yp is a solution of the homogeneous equation, because

yp = k y+   ⇒   L(yp) = 0.

We have to change our guess, as indicated in the undetermined coefficients method, step (3),

yp(t) = k t e^{4t}.

This new guess is not a solution of the homogeneous equation. So we proceed to compute the constant k. We introduce the guess into L(yp) = f,

yp' = (1 + 4t) k e^{4t},    yp'' = (8 + 16t) k e^{4t}   ⇒   [8 − 3 + (16 − 12 − 4) t] k e^{4t} = 3 e^{4t},

that is, 5k = 3, so k = 3/5 and the particular solution is yp(t) = (3/5) t e^{4t}.
Example 2.5.3: Find all solutions of the nonhomogeneous equation

y'' − 3y' − 4y = 2 sin(t).

Solution: If we write the equation as L(y) = f, with f(t) = 2 sin(t), then the operator L is the same as in Example 2.5.1. So the solutions of the homogeneous equation L(y) = 0 are the same as in that example,

y+(t) = e^{4t},    y−(t) = e^{−t}.
Since the source function is f(t) = 2 sin(t), Table 1 says that we need to choose the function yp(t) = k1 cos(t) + k2 sin(t). This function yp is not a solution of the homogeneous equation. So we look for the constants k1, k2 using the differential equation,

yp' = −k1 sin(t) + k2 cos(t),    yp'' = −k1 cos(t) − k2 sin(t),

and then we obtain

[−k1 cos(t) − k2 sin(t)] − 3[−k1 sin(t) + k2 cos(t)] − 4[k1 cos(t) + k2 sin(t)] = 2 sin(t).

Reordering terms in the expression above we get

(−5k1 − 3k2) cos(t) + (3k1 − 5k2) sin(t) = 2 sin(t).

The last equation must hold for all t ∈ R. In particular, it must hold for t = π/2 and for t = 0. At these two points we obtain, respectively,

3k1 − 5k2 = 2,    −5k1 − 3k2 = 0   ⇒   k1 = 3/17,    k2 = −5/17.

So the particular solution of the nonhomogeneous equation is given by

yp(t) = (1/17) [3 cos(t) − 5 sin(t)].

The general solution theorem for nonhomogeneous equations implies

ygen(t) = c+ e^{4t} + c− e^{−t} + (1/17) [3 cos(t) − 5 sin(t)]. C
The next example collects a few nonhomogeneous equations and the guessed functions yp.

Example 2.5.4: We provide a few more examples of nonhomogeneous equations and the appropriate guesses for the particular solutions.

Remark: Suppose that the source function f does not appear in Table 1, but f can be written as f = f1 + f2, with f1 and f2 in the table. In such a case look for a particular solution yp = yp1 + yp2, where L(yp1) = f1 and L(yp2) = f2. Since the operator L is linear,

L(yp) = L(yp1 + yp2) = L(yp1) + L(yp2) = f1 + f2 = f   ⇒   L(yp) = f.

Solution: If we write the equation as L(y) = f, with f(t) = 3 e^{2t} + 2 sin(t), then the operator L is the same as in Examples 2.5.1 and 2.5.3. So the solutions of the homogeneous equation L(y) = 0 are the same as in those examples,

y+(t) = e^{4t},    y−(t) = e^{−t}.
The source function f(t) = 3 e^{2t} + 2 sin(t) does not appear in Table 1, but each term does, f1(t) = 3 e^{2t} and f2(t) = 2 sin(t). So we look for a particular solution of the form

yp = yp1 + yp2,    where L(yp1) = 3 e^{2t},    L(yp2) = 2 sin(t).

We have chosen this example because we solved each one of these equations before, in Examples 2.5.1 and 2.5.3. We found the solutions

yp1(t) = −(1/2) e^{2t},    yp2(t) = (1/17) [3 cos(t) − 5 sin(t)].

Therefore, the particular solution for the equation in this example is

yp(t) = −(1/2) e^{2t} + (1/17) [3 cos(t) − 5 sin(t)].

Using the general solution theorem for nonhomogeneous equations we obtain

ygen(t) = c+ e^{4t} + c− e^{−t} − (1/2) e^{2t} + (1/17) [3 cos(t) − 5 sin(t)]. C
2.5.3. The Variation of Parameters Method. This method provides a second way to find a particular solution yp of a nonhomogeneous equation L(y) = f. We summarize this method in a formula that computes yp in terms of any pair of fundamental solutions of the homogeneous equation L(y) = 0. The variation of parameters method works with second order linear equations having variable coefficients and continuous but otherwise arbitrary sources. When the source function of a nonhomogeneous equation is simple enough to appear in Table 1, the undetermined coefficients method is a quick way to find a particular solution of the equation. When the source is more complicated, one usually turns to the variation of parameters method, with its more involved formula for a particular solution.

Theorem 2.5.4 (Variation of Parameters). A particular solution of the equation

L(y) = f,

with L(y) = y'' + a1(t) y' + a0(t) y and a1, a0, f continuous functions, is given by

yp = u1 y1 + u2 y2,

where y1, y2 are fundamental solutions of the homogeneous equation L(y) = 0 and the functions u1, u2 are defined by

u1(t) = −∫ (y2(t) f(t) / W_{y1 y2}(t)) dt,    u2(t) = ∫ (y1(t) f(t) / W_{y1 y2}(t)) dt, (2.5.3)

where W_{y1 y2} is the Wronskian of y1 and y2.
The proof is a generalization of the reduction of order method. Recall that the reduction of order method is a way to find a second solution y2 of a homogeneous equation if we already know one solution y1. One writes y2 = u y1 and the original equation L(y2) = 0 provides an equation for u. This equation for u is simpler than the original equation for y2 because the function y1 satisfies L(y1) = 0.

The formula for yp can be seen as a generalization of the reduction of order method. We write yp in terms of both fundamental solutions y1, y2 of the homogeneous equation,

yp(t) = u1(t) y1(t) + u2(t) y2(t).

We put this yp into the equation L(yp) = f and we find an equation relating u1 and u2. It is important to realize that we have added one new function to the original problem. The original problem is to find yp. Now we need to find u1 and u2, but we still have only one equation to solve, L(yp) = f. The problem for u1, u2 cannot have a unique solution. So we
are completely free to add a second equation to the original equation L(yp) = f. We choose the second equation so that we can solve for u1 and u2.
Proof of Theorem 2.5.4: Motivated by the reduction of order method, we look for a yp of the form

yp = u1 y1 + u2 y2.

We hope that the equations for u1, u2 will be simpler to solve than the equation for yp. But we started with one unknown function and now we have two unknown functions. So we are free to add one more equation to fix u1, u2. We choose

u1' y1 + u2' y2 = 0.

In other words, we choose u2 = −∫ (y1/y2) u1' dt. Let us put this yp into L(yp) = f. We need yp' (and recall, u1' y1 + u2' y2 = 0),

yp' = u1' y1 + u1 y1' + u2' y2 + u2 y2'   ⇒   yp' = u1 y1' + u2 y2',

and we also need yp'',

yp'' = u1' y1' + u1 y1'' + u2' y2' + u2 y2''.

So the equation L(yp) = f is

(u1' y1' + u1 y1'' + u2' y2' + u2 y2'') + a1 (u1 y1' + u2 y2') + a0 (u1 y1 + u2 y2) = f.

We reorder a few terms and we see that

u1' y1' + u2' y2' + u1 (y1'' + a1 y1' + a0 y1) + u2 (y2'' + a1 y2' + a0 y2) = f.

The functions y1 and y2 are solutions of the homogeneous equation,

y1'' + a1 y1' + a0 y1 = 0,    y2'' + a1 y2' + a0 y2 = 0,

so u1 and u2 must be solutions of a simpler equation than the one above, given by

u1' y1' + u2' y2' = f. (2.5.4)

So we end up with the equations

u1' y1' + u2' y2' = f,
u1' y1 + u2' y2 = 0.

And this is a 2 × 2 algebraic linear system for the unknowns u1', u2'. It is hard to overstate the importance of the word algebraic in the previous sentence. From the second equation above we compute u2' and we introduce it into the first equation,

u2' = −(y1/y2) u1'   ⇒   u1' y1' − (y1 y2'/y2) u1' = f   ⇒   −u1' (y1 y2' − y1' y2)/y2 = f.

Recalling that the Wronskian of two functions is W12 = y1 y2' − y1' y2, we get

u1' = −y2 f / W12,    u2' = y1 f / W12.

These equations are the derivatives of Eq. (2.5.3). Integrating them in the variable t and choosing the integration constants to be zero, we get Eq. (2.5.3). This establishes the Theorem.
Remark: The integration constants in the expressions for u1, u2 can always be chosen to be zero. To understand the effect of the integration constants on the function yp, let us do the following. Denote by u1 and u2 the functions in Eq. (2.5.3), and given any real numbers c1 and c2 define

ũ1 = u1 + c1,    ũ2 = u2 + c2.
Example 2.5.6: Find a particular solution of the differential equation

y'' − 5y' + 6y = 2 e^{t}.

Solution: The formula for yp in Theorem 2.5.4 requires that we know fundamental solutions of the homogeneous problem, so we start by finding these solutions. Since the equation has constant coefficients, we compute the characteristic equation,

r² − 5r + 6 = 0   ⇒   r = (1/2)(5 ± √(25 − 24))   ⇒   r+ = 3,   r− = 2.

So, the functions y1 and y2 in Theorem 2.5.4 are in our case given by

y1(t) = e^{3t},    y2(t) = e^{2t}.

The Wronskian of these two functions is given by

W_{y1 y2}(t) = (e^{3t})(2 e^{2t}) − (3 e^{3t})(e^{2t})   ⇒   W_{y1 y2}(t) = −e^{5t}.

We are now ready to compute the functions u1 and u2. Notice that Eq. (2.5.3) gives the following differential equations,

u1' = −y2 f / W_{y1 y2},    u2' = y1 f / W_{y1 y2}.

So the equations for u1 and u2 are the following,

u1' = −e^{2t} (2 e^{t})(−e^{−5t})   ⇒   u1' = 2 e^{−2t}   ⇒   u1 = −e^{−2t},

u2' = e^{3t} (2 e^{t})(−e^{−5t})   ⇒   u2' = −2 e^{−t}   ⇒   u2 = 2 e^{−t},

where we have chosen the constants of integration to be zero. The particular solution we are looking for is given by

yp = (−e^{−2t})(e^{3t}) + (2 e^{−t})(e^{2t})   ⇒   yp = e^{t}.

Then, the general solution theorem for nonhomogeneous equations implies

ygen(t) = c+ e^{3t} + c− e^{2t} + e^{t},    c+, c− ∈ R. C
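Remark: The whole variation of parameters computation can be automated. The sketch below, assuming sympy (not from the text), evaluates Eq. (2.5.3) for this example.

    import sympy as sp

    t = sp.symbols('t')
    y1, y2 = sp.exp(3*t), sp.exp(2*t)     # fundamental solutions of the homogeneous equation
    f = 2*sp.exp(t)                       # source function
    W = y1*y2.diff(t) - y1.diff(t)*y2     # Wronskian, equal to -e^{5t}
    u1 = sp.integrate(-y2*f/W, t)         # -e^{-2t}
    u2 = sp.integrate(y1*f/W, t)          # 2 e^{-t}
    print(sp.simplify(u1*y1 + u2*y2))     # exp(t), the particular solution found above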
Remark: Sometimes it can be difficult to remember the formulas for the functions u1 and u2 in (2.5.3). In such a case one can always go back to the place in the proof of Theorem 2.5.4 where these formulas come from, the system

u1' y1' + u2' y2' = f,
u1' y1 + u2' y2 = 0.

The system above could be simpler to remember than the equations in (2.5.3). We end this section using the equations above to solve the problem in Example 2.5.7. Recall that the solutions of the homogeneous equation in Example 2.5.7 are y1(t) = t², and y2(t) = 1/t, while the source function is f(t) = 3 − 1/t². Then we need to solve the system

t² u1' + (1/t) u2' = 0,
2t u1' + u2' (−1/t²) = 3 − 1/t².

This is an algebraic linear system for u1' and u2'. It is simple to solve. From the equation on top we get u2' in terms of u1', and we use that expression in the bottom equation,

u2' = −t³ u1'   ⇒   2t u1' + t u1' = 3 − 1/t²   ⇒   u1' = 1/t − 1/(3t³).

Substitute back the expression for u1' into the first equation above and we get u2'. We get,

u1' = 1/t − 1/(3t³),
u2' = −t² + 1/3.

We should now integrate these functions to get u1 and u2 and then get the particular solution yp = u1 y1 + u2 y2. We do not repeat these calculations, since they are done in Example 2.5.7.
2.5.4. Exercises.
2.5.1.- . 2.5.2.- .
2.6. Applications

Different physical systems can be mathematically identical. In this section we show that a weight attached to a spring, oscillating either in air or under water, is mathematically identical to the behavior of an electric current in a circuit containing a resistance, a capacitor, and an inductance. Mathematically identical means that both systems are described by the same differential equation.
2.6.1. Review of Constant Coefficient Equations. In § 2.3 we have seen how to find solutions of second order, linear, constant coefficient, homogeneous differential equations,

y'' + a1 y' + a0 y = 0,    a1, a0 ∈ R. (2.6.1)

Theorem 2.3.2 provides formulas for the general solution of this equation. We review this result here, and at the same time we introduce new names describing these solutions, names that are common in the physics literature. The first step to obtain solutions of Eq. (2.6.1) is to find the roots of the characteristic polynomial p(r) = r² + a1 r + a0, which are given by

r± = −a1/2 ± (1/2)√(a1² − 4a0).
We then have three different cases to consider.
(a) A system is over damped when a1² − 4a0 > 0. In this case the characteristic polynomial has real and distinct roots, r+, r−, and the corresponding solutions of the differential equation are

y+(t) = e^{r+ t},    y−(t) = e^{r− t}.

So the solutions are exponentials, increasing or decreasing according to whether the roots are positive or negative, respectively. The decreasing exponential solutions originate the name over damped solutions.

(b) A system is critically damped when a1² − 4a0 = 0. In this case the characteristic polynomial has only one real, repeated root, r0 = −a1/2, and the corresponding solutions of the differential equation are

y+(t) = e^{−a1 t/2},    y−(t) = t e^{−a1 t/2}.

(c) A system is under damped when a1² − 4a0 < 0. In this case the characteristic polynomial has two complex roots, r± = α ± iβ, where one root is the complex conjugate of the other, since the polynomial has real coefficients. The corresponding real-valued solutions of the differential equation are

y+(t) = e^{αt} cos(βt),    y−(t) = e^{αt} sin(βt),    with α = −a1/2,    β = √(a0 − a1²/4).
2.6.2. Undamped Mechanical Oscillations. Springs are curious objects: when you slightly deform them they create a force proportional to and in the opposite direction of the deformation. When you release the spring, it goes back to its original size. This is true for small enough deformations; if you stretch the spring far enough, the deformations are permanent.

Definition 2.6.1. A spring is an object that when deformed by an amount ∆l creates a force Fs = −k ∆l, with k > 0.

Consider a spring-body system as shown in Fig. 2.6.2. A spring is fixed to a ceiling and hangs vertically with a natural length l. It stretches by ∆l when a body with mass m is attached to its lower end, just as in the middle spring in Fig. 2.6.2. We assume that the weight m is small enough so that the spring is not damaged. This means that the spring acts like a normal spring: whenever it is deformed by an amount ∆l it makes a force proportional to and opposite to the deformation,

Fs0 = −k ∆l.

Here k > 0 is a constant that depends on the type of spring. Newton's law of motion implies the following result.

Theorem 2.6.2. A spring-body system with spring constant k, body mass m, at rest with a spring deformation ∆l, within the range where the spring acts like a spring, satisfies

mg = k ∆l.

Proof of Theorem 2.6.2: Since the spring-body system is at rest, Newton's law of motion implies that all forces acting on the body must add up to zero. The only two forces acting on the body are its weight, Fg = mg, and the force done by the spring, Fs0 = −k ∆l. We have used the hypothesis that ∆l is small enough so that the spring is not damaged. We are using the sign convention displayed in Fig. 2.6.2, where forces pointing downwards are positive. Then Fg + Fs0 = 0 gives mg = k ∆l. This establishes the Theorem.

We now find out how the body will move when we take it away from the rest position. To describe that movement we introduce a vertical coordinate for the displacements, y, as shown in Fig. 2.6.2, with y positive downwards, and y = 0 at the rest position of the spring and the body. The physical system we want to describe is simple: we further stretch the spring with the body by y0 and then we release it with an initial velocity v0. Newton's law of motion determines the subsequent motion.
Theorem 2.6.3. The vertical movement of a spring-body system in air with spring constant k > 0 and body mass m > 0 is described by the solutions of the differential equation

m y'' + k y = 0, (2.6.2)

where y is the vertical displacement function as shown in Fig. 2.6.2. Furthermore, there is a unique solution of Eq. (2.6.2) satisfying the initial conditions y(0) = y0 and y'(0) = v0,

y(t) = A cos(ω0 t − φ),

with angular frequency ω0 = √(k/m), where the amplitude A > 0 and phase-shift φ ∈ (−π, π] are

A = √(y0² + v0²/ω0²),    φ = arctan(v0/(ω0 y0)).

Remark: The angular or circular frequency of the system is ω0 = √(k/m), meaning that the motion of the system is periodic with period T = 2π/ω0, which in turn implies that the system frequency is ν0 = ω0/(2π).
Proof of Theorem 2.6.3: Newton's second law of motion says that mass times acceleration of the body, m y''(t), must be equal to the sum of all forces acting on the body, hence

m y''(t) = Fg + Fs0 + Fs(t),

where Fs(t) = −k y(t) is the force done by the spring due to the extra displacement y. Since the first two terms on the right-hand side above cancel out, Fg + Fs0 = 0, the body displacement from the equilibrium position, y(t), must be a solution of the differential equation

m y''(t) + k y(t) = 0,

which is Eq. (2.6.2). In § 2.3 we have seen how to solve this type of differential equations. The characteristic polynomial is p(r) = m r² + k, which has complex roots r± = ± ω0 i, where we introduced the angular or circular frequency of the system,

ω0 = √(k/m).

The reason for this name is the calculation done in § 2.3, where we found that a real-valued expression for the general solution of Eq. (2.6.2) is given by

ygen(t) = c1 cos(ω0 t) + c2 sin(ω0 t).

This means that the body attached to the spring oscillates around the equilibrium position y = 0 with period T = 2π/ω0, hence frequency ν0 = ω0/(2π). There is an equivalent way to express the general solution above, given by

ygen(t) = A cos(ω0 t − φ).

These two expressions for ygen are equivalent because of the trigonometric identity

A cos(ω0 t − φ) = A cos(ω0 t) cos(φ) + A sin(ω0 t) sin(φ),

which holds for all A, φ, and ω0 t. Then it is not difficult to see that

c1 = A cos(φ),    c2 = A sin(φ)   ⇔   A = √(c1² + c2²),    φ = arctan(c2/c1).
Since both expressions for the general solution are equivalent, we use the second one, in terms of the amplitude and phase-shift. The initial conditions y(0) = y0 and y'(0) = v0 determine the constants A and φ. Indeed,

y0 = y(0) = A cos(φ),    v0 = y'(0) = A ω0 sin(φ)   ⇒   A = √(y0² + v0²/ω0²),    φ = arctan(v0/(ω0 y0)).

This establishes the Theorem.
Example 2.6.1: Find the movement of a 50 gr mass attached to a spring moving in air with initial conditions y(0) = 4 cm and y'(0) = 40 cm/s. The spring is such that a 30 gr mass stretches it 6 cm. Approximate the acceleration of gravity by 1000 cm/s².

Solution: Theorem 2.6.3 says that the equation satisfied by the displacement y is given by

m y'' + k y = 0.

In order to solve this equation we need to find the spring constant, k, which by Theorem 2.6.2 is given by k = mg/∆l. In our case, when a mass of m = 30 gr is attached to the spring, it stretches ∆l = 6 cm, so we get,

k = (30)(1000)/6   ⇒   k = 5000 gr/s².

Knowing the spring constant k we can now describe the movement of the body with mass m = 50 gr. The solution of the differential equation above is obtained as usual: first find the roots of the characteristic polynomial,

m r² + k = 0   ⇒   r± = ± ω0 i,    ω0 = √(k/m) = √(5000/50)   ⇒   ω0 = 10 s⁻¹.

We write down the general solution in terms of the amplitude A and phase-shift φ,

y(t) = A cos(ω0 t − φ)   ⇒   y(t) = A cos(10 t − φ).

To accommodate the initial conditions we need the function y'(t) = −A ω0 sin(ω0 t − φ). The initial conditions determine the amplitude and phase-shift as follows,

4 = y(0) = A cos(φ),    40 = y'(0) = 10 A sin(φ)   ⇒   A = √(16 + 16),    φ = arctan(40/((10)(4))).

We obtain A = 4√2 and tan(φ) = 1. The latter equation implies that either φ = π/4 or φ = −3π/4, for φ ∈ (−π, π]. If we pick the second value, φ = −3π/4, this would imply that y(0) < 0 and y'(0) < 0, which is not true in our case. So we must pick the value φ = π/4. We then conclude:

y(t) = 4√2 cos(10 t − π/4). C
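Remark: A symbolic check of this solution (a sketch of ours, assuming sympy):

    import sympy as sp

    t = sp.symbols('t')
    y = 4*sp.sqrt(2)*sp.cos(10*t - sp.pi/4)
    print(sp.simplify(50*y.diff(t, 2) + 5000*y))   # 0, so the equation is satisfied
    print(y.subs(t, 0), y.diff(t).subs(t, 0))      # 4 and 40, the initial conditions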
2.6.3. Damped Mechanical Oscillations. Suppose now that the body in the spring-body system is a thin square sheet of metal. If the main surface of the sheet is perpendicular to the direction of motion, then the air dragged by the sheet during the spring oscillations will be significant enough to slow down the spring oscillations in an appreciable time. One can find out that the friction force done by the air opposes the movement and is proportional to the velocity of the body, that is, Fd = −d y'(t). We call such a force a damping force, where
d > 0 is the damping coefficient, and systems having such a force are called damped systems. We now describe the spring-body system in the case that there is a nonzero damping force.
Theorem 2.6.4.
(a) The vertical displacement y, a function as shown in Fig. 2.6.2, of a spring-body system with spring constant k > 0, body mass m > 0, and damping constant d > 0, is described by the solutions of

m y'' + d y' + k y = 0. (2.6.3)

(b) The roots of the characteristic polynomial of Eq. (2.6.3) are r± = −ωd ± √(ωd² − ω0²), with damping coefficient ωd = d/(2m) and circular frequency ω0 = √(k/m).
(c) The solutions of Eq. (2.6.3) fall into one of the following cases:
(i) A system with ωd > ω0 is called over damped, with general solution of Eq. (2.6.3)

y(t) = c+ e^{r+ t} + c− e^{r− t}.

(ii) A system with ωd = ω0 is called critically damped, with general solution of Eq. (2.6.3)

y(t) = c+ e^{−ωd t} + c− t e^{−ωd t}.

(iii) A system with ωd < ω0 is called under damped, with general solution of Eq. (2.6.3)

y(t) = A e^{−ωd t} cos(βt − φ),

where β = √(ω0² − ωd²).
(d) There is a unique solution of Eq. (2.6.3) with initial conditions y(0) = y0 and y'(0) = v0.
Remark: In the case the damping coefficient vanishes we recover Theorem 2.6.3.
Proof of Theorem 2.6.4: Newton's second law of motion says that mass times acceleration of the body, m y''(t), must be equal to the sum of all forces acting on the body. In the case that we take into account the air dragging force we have

m y''(t) = Fg + Fs0 + Fs(t) + Fd(t),

where Fs(t) = −k y(t) as in Theorem 2.6.3, and Fd(t) = −d y'(t) is the air-body dragging force. Since the first two terms on the right-hand side above cancel out, Fg + Fs0 = 0, as mentioned in Theorem 2.6.2, the body displacement from the equilibrium position, y(t), must be a solution of the differential equation

m y''(t) + d y'(t) + k y(t) = 0,

which is Eq. (2.6.3). In § 2.3 we have seen how to solve this type of differential equations. The characteristic polynomial is p(r) = m r² + d r + k, which has roots

r± = (1/(2m)) (−d ± √(d² − 4mk))   ⇒   r± = −ωd ± √(ωd² − ω0²),

where ωd = d/(2m) and ω0 = √(k/m). In § 2.3 we found that the general solution of a differential equation with a characteristic polynomial having roots as above can be divided into three groups. For the case r+ ≠ r− real valued, we obtain case (ci); for the case r+ = r− we obtain case (cii). Finally, we said that the general solution for the case of two complex roots r± = α + iβ was given by

y(t) = e^{αt} (c1 cos(βt) + c2 sin(βt)).
In our case α = −ωd and β = √(ω0² − ωd²). We now rewrite the second factor on the right-hand side above in terms of an amplitude and a phase shift,

y(t) = A e^{−ωd t} cos(βt − φ).

The main result from § 2.3 says that the initial value problem in Theorem 2.6.4 has a unique solution in each of the three cases above. This establishes the Theorem.
Example 2.6.2: Find the movement of a 5 kg mass attached to a spring with constant k = 5 kg/s², moving in a medium with damping constant d = 5 kg/s, with initial conditions y(0) = 3 and y'(0) = 0.

Solution: By Theorem 2.6.4 the differential equation for this system is m y'' + d y' + k y = 0, with m = 5, k = 5, d = 5. The roots of the characteristic polynomial are

r± = −ωd ± √(ωd² − ω0²),    ωd = d/(2m) = 1/2,    ω0 = √(k/m) = 1,

that is,

r± = −1/2 ± √(1/4 − 1) = −1/2 ± i √3/2.

This means our system has under damped oscillations. Following Theorem 2.6.4 part (ciii), the general solution is given by

y(t) = A e^{−t/2} cos((√3/2) t − φ).

We only need to introduce the initial conditions into the expression for y to find the amplitude A and phase-shift φ. In order to do that we first compute the derivative,

y'(t) = −(1/2) A e^{−t/2} cos((√3/2) t − φ) − (√3/2) A e^{−t/2} sin((√3/2) t − φ).

The initial conditions in the example imply

3 = y(0) = A cos(φ),    0 = y'(0) = −(1/2) A cos(φ) + (√3/2) A sin(φ).

The second equation above allows us to compute the phase-shift, since

tan(φ) = 1/√3   ⇒   φ = π/6,    or φ = π/6 − π = −5π/6.

If φ = −5π/6, then y(0) < 0, which is not our case. Hence we must choose φ = π/6. With that phase-shift, the amplitude is given by

3 = A cos(π/6) = A √3/2   ⇒   A = 2√3.

We conclude: y(t) = 2√3 e^{−t/2} cos((√3/2) t − π/6). C
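Remark: The same kind of symbolic check works in the damped case (sketch, assuming sympy):

    import sympy as sp

    t = sp.symbols('t')
    y = 2*sp.sqrt(3)*sp.exp(-t/2)*sp.cos(sp.sqrt(3)*t/2 - sp.pi/6)
    print(sp.simplify(5*y.diff(t, 2) + 5*y.diff(t) + 5*y))                # 0
    print(sp.simplify(y.subs(t, 0)), sp.simplify(y.diff(t).subs(t, 0)))   # 3 and 0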
2.6.4. Electrical Oscillations. We describe the electric current flowing through an RLC-series electric circuit, which consists of a resistance, a coil, and a capacitor connected in series as shown in Fig. 13. A current can be started by moving a magnet close to the coil. If the circuit has low resistance, the current will keep flowing through the coil and between the capacitor plates, endlessly; there is no need of a power source to keep the current flowing. The presence of a resistance transforms the current energy into heat, damping the current oscillation.
Remark: When the circuit has no resistance, the current oscillates without dissipation.

Case (b): R < √(4L/C). This implies

R² < 4L/C   ⇔   R²/(4L²) < 1/(LC)   ⇔   ωd² < ω0².

Therefore, the characteristic polynomial has complex roots r± = −ωd ± i √(ω0² − ωd²), hence the fundamental solutions are

I1(t) = e^{−ωd t} cos(β t),
[Figure: the under damped currents I1 and I2 oscillating inside the decaying envelope ± e^{−ωd t}.]
2.6.5. Exercises.
2.6.1.- . 2.6.2.- .
The first differential equations were solved around the end of the seventeenth century and the beginning of the eighteenth century. We studied a few of these equations in § 1.1-1.4 and the constant coefficient equations in Chapter 2. By the middle of the eighteenth century people realized that the methods we learned in these first sections had reached a dead end. One reason was the lack of functions to write the solutions of differential equations. The elementary functions we use in calculus, such as polynomials, quotients of polynomials, trigonometric functions, exponentials, and logarithms, were simply not enough. People even started to think of differential equations as sources of new functions. It was only a matter of time before mathematicians started to use power series expansions to find solutions of differential equations. Convergent power series define functions far more general than the elementary functions from calculus.
In § 3.1 we study the simplest case, when the power series is centered at a regular point of the equation. The coefficients of the equation are analytic functions at regular points, in particular continuous. In § 2.4 we studied the Euler equidimensional equation. The coefficients of an Euler equation diverge at a particular point in a very specific way. No power series are needed to find solutions in this case. In § 3.2 we solve equations with regular singular points. The equation coefficients diverge at regular singular points in a way similar to the coefficients in an Euler equation. We will find solutions to these equations using the solutions of an Euler equation and power series centered precisely at the regular singular points of the equation.
[Figure: the polynomials P0, P1, P2, P3 plotted on the interval −1 ≤ x ≤ 1.]
3.1.1. Regular Points. We now look for solutions to second order linear homogeneous differential equations having variable coefficients. Recall we solved the constant coefficient case in Chapter 2. We have seen that the solutions to constant coefficient equations can be written in terms of elementary functions such as quotients of polynomials, trigonometric functions, exponentials, and logarithms. For example, the equation
y'' + y = 0
has the fundamental solutions y_1(x) = \cos(x) and y_2(x) = \sin(x). But the equation
x\, y'' + y' + x\, y = 0
cannot be solved in terms of elementary functions, that is, in terms of quotients of polynomials, trigonometric functions, exponentials, and logarithms. Except for equations with constant coefficients and equations with variable coefficients that can be transformed into constant coefficients by a change of variable, no other second order linear equation can be solved in terms of elementary functions. Still, we are interested in finding solutions to variable coefficient equations, mainly because these equations appear in the description of so many physical systems.
We have said that power series define more general functions than the elementary func-
tions mentioned above. So we look for solutions using power series. In this section we center
the power series at a regular point of the equation.
Definition 3.1.1. A point x_0 \in \mathbb{R} is called a regular point of the equation
y'' + p(x)\, y' + q(x)\, y = 0, \qquad (3.1.1)
iff p, q are analytic functions at x_0. Otherwise x_0 is called a singular point of the equation.
Remark: Near a regular point x_0 the coefficients p and q in the differential equation above can be written in terms of power series centered at x_0,
p(x) = p_0 + p_1 (x - x_0) + p_2 (x - x_0)^2 + \cdots = \sum_{n=0}^{\infty} p_n (x - x_0)^n,
q(x) = q_0 + q_1 (x - x_0) + q_2 (x - x_0)^2 + \cdots = \sum_{n=0}^{\infty} q_n (x - x_0)^n.
3.1.2. The Power Series Method. The differential equation in (3.1.1) is a particular case of the equations studied in § 2.1, and the existence result in Theorem 2.1.2 applies to Eq. (3.1.1). This theorem was known to Lazarus Fuchs, who in 1866 added the following: if the coefficient functions p and q are analytic on a domain, so is the solution on that domain. Fuchs went ahead and studied the case where the coefficients p and q have singular points, which we study in § 3.2. The result for analytic coefficients is summarized below.
Theorem 3.1.2. If the functions p, q are analytic on an open interval (x_0 - \rho, x_0 + \rho) \subset \mathbb{R}, then the differential equation
y'' + p(x)\, y' + q(x)\, y = 0
has two independent solutions, y_1, y_2, which are analytic on the same interval.
Remark: A complete proof of this theorem can be found in [2], page 169. See also [10], § 29. We present the first steps of the proof and leave the convergence issues to those references. The proof we present is based on power series expansions for the coefficients p, q, and the solution y. This is not the proof given by Fuchs in 1866.
Proof of Theorem 3.1.2: Since the coefficient functions p and q are analytic on (x_0 - \rho, x_0 + \rho), where \rho > 0, they can be written as power series centered at x_0,
p(x) = \sum_{n=0}^{\infty} p_n (x - x_0)^n, \qquad q(x) = \sum_{n=0}^{\infty} q_n (x - x_0)^n.
We look for solutions that can also be written as power series expansions centered at x_0,
y(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n.
We start computing the first derivative of the function y,
y'(x) = \sum_{n=0}^{\infty} n\, a_n (x - x_0)^{n-1} \quad\Rightarrow\quad y'(x) = \sum_{n=1}^{\infty} n\, a_n (x - x_0)^{n-1},
where in the second expression we started the sum at n = 1, since the term with n = 0 vanishes. Relabel the sum with m = n - 1, so when n = 1 we have that m = 0, and n = m + 1. Therefore, we get
y'(x) = \sum_{m=0}^{\infty} (m+1)\, a_{m+1} (x - x_0)^m.
We finally rename the summation index back to n,
y'(x) = \sum_{n=0}^{\infty} (n+1)\, a_{n+1} (x - x_0)^n. \qquad (3.1.2)
From now on we do these steps at once, and the notation n - 1 = m \to n means
y'(x) = \sum_{n=1}^{\infty} n\, a_n (x - x_0)^{n-1} = \sum_{n=0}^{\infty} (n+1)\, a_{n+1} (x - x_0)^n.
Therefore, the differential equation y'' + p(x)\, y' + q(x)\, y = 0 now has the form
\sum_{n=0}^{\infty} \Big[ (n+2)(n+1)\, a_{n+2} + \sum_{k=0}^{n} \big( (k+1)\, a_{k+1}\, p_{n-k} + a_k\, q_{n-k} \big) \Big] (x - x_0)^n = 0.
So we obtain a recurrence relation for the coefficients a_n,
(n+2)(n+1)\, a_{n+2} + \sum_{k=0}^{n} \big( (k+1)\, a_{k+1}\, p_{n-k} + a_k\, q_{n-k} \big) = 0,
for n = 0, 1, 2, \cdots. Equivalently,
a_{n+2} = -\frac{1}{(n+2)(n+1)} \sum_{k=0}^{n} \big( (k+1)\, a_{k+1}\, p_{n-k} + a_k\, q_{n-k} \big). \qquad (3.1.3)
We have obtained an expression for a_{n+2} in terms of the previous coefficients a_{n+1}, \cdots, a_0 and the coefficients of the functions p and q. If we choose arbitrary values for the first two coefficients a_0 and a_1, then the recurrence relation in (3.1.3) defines the remaining coefficients a_2, a_3, \cdots in terms of a_0 and a_1. The coefficients a_n chosen in such a way guarantee that the function y defined in (3.1.2) satisfies the differential equation.
In order to finish the proof of Theorem 3.1.2 we need to show that the power series for y defined by the recurrence relation actually converges on a nonempty domain, and furthermore that this domain is the same where p and q are analytic. This part of the proof is too complicated for us. The interested reader can find the rest of the proof in [2], page 169. See also [10], § 29.
It is important to understand the main ideas in the proof above, because we will follow
these ideas to find power series solutions to differential equations. So we now summarize
the main steps in the proof above:
(a) Write a power series expansion of the solution centered at a regular point x_0,
y(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n.
(b) Introduce the power series expansion above into the differential equation and find a
recurrence relation among the coefficients an .
(c) Solve the recurrence relation in terms of free coefficients.
(d) If possible, add up the resulting power series for the solutions y1 , y2 .
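Steps (a)-(c) are mechanical enough to automate. The following sketch is ours, not part of the notes; the helper name and the truncation order N are our own choices. It computes the coefficients a_n from the recurrence relation (3.1.3), given the Taylor coefficients of p and q at the regular point and the free values a_0, a_1.

def series_solution_coeffs(p, q, a0, a1, N):
    """Coefficients a_0,...,a_N of a series solution of y'' + p y' + q y = 0.
    p, q are lists of N Taylor coefficients of p and q at the regular point."""
    a = [a0, a1]
    for n in range(N - 1):
        # the inner sum in the recurrence relation (3.1.3)
        s = sum((k + 1)*a[k + 1]*p[n - k] + a[k]*q[n - k] for k in range(n + 1))
        a.append(-s / ((n + 2)*(n + 1)))
    return a

N = 6
p = [0.0]*N                      # p = 0
q = [1.0] + [0.0]*(N - 1)        # q = 1
print(series_solution_coeffs(p, q, 1.0, 0.0, N))
# [1.0, 0.0, -0.5, 0.0, 0.0416..., 0.0, -0.00138...]

For y'' + y = 0 (that is, p = 0, q = 1) the printed numbers are the Taylor coefficients of cos(x), in agreement with Example 3.1.3 below.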
We follow these steps in the examples below to find solutions to several differential equa-
tions. We start with a first order constant coefficient equation, and then we continue with
a second order constant coefficient equation. The last two examples consider variable coef-
ficient equations.
Example 3.1.2: Find a power series solution y around the point x_0 = 0 of the equation
y' + c\, y = 0, \qquad c \in \mathbb{R}.
Solution: We already know every solution to this equation. This is a first order, linear differential equation, so using the integrating factor method we find that the solution is
y(x) = a_0\, e^{-c x}, \qquad a_0 \in \mathbb{R}.
We are now interested in obtaining such a solution with the power series method. Although this is not a second order equation, the power series method still works in this example. Propose a solution of the form
y = \sum_{n=0}^{\infty} a_n x^n \quad\Rightarrow\quad y' = \sum_{n=1}^{\infty} n\, a_n x^{n-1}.
We can start the sum in y' at n = 0 or n = 1. We choose n = 1, since it is more convenient later on. Introduce the expressions above into the differential equation,
\sum_{n=1}^{\infty} n\, a_n x^{n-1} + c \sum_{n=0}^{\infty} a_n x^n = 0.
Relabel the first sum above so that the functions x^{n-1} and x^n in the first and second sums have the same label. One way is the following,
\sum_{n=0}^{\infty} (n+1)\, a_{n+1} x^n + c \sum_{n=0}^{\infty} a_n x^n = 0.
We can now write both sums as one single sum,
\sum_{n=0}^{\infty} \big( (n+1)\, a_{n+1} + c\, a_n \big) x^n = 0.
Since the function on the left-hand side must be zero for every x \in \mathbb{R}, we conclude that every coefficient that multiplies x^n must vanish, that is,
(n+1)\, a_{n+1} + c\, a_n = 0, \qquad n \ge 0.
The last equation is called a recurrence relation among the coefficients a_n. The solution of this relation can be found by writing down the first few cases and then guessing the general term; the relation a_{n+1} = -\frac{c}{n+1}\, a_n gives a_n = \frac{(-c)^n}{n!}\, a_0, hence
y(x) = a_0 \sum_{n=0}^{\infty} \frac{(-c x)^n}{n!} = a_0\, e^{-c x}. \qquad C
Example 3.1.3: Find a power series solution y(x) around the point x_0 = 0 of the equation
y'' + y = 0.
Solution: We know that the solution can be found computing the roots of the characteristic polynomial r^2 + 1 = 0, which gives us the solutions
y(x) = a_0 \cos(x) + a_1 \sin(x).
We now recover this solution using the power series,
y = \sum_{n=0}^{\infty} a_n x^n, \qquad y' = \sum_{n=1}^{\infty} n\, a_n x^{n-1}, \qquad y'' = \sum_{n=2}^{\infty} n(n-1)\, a_n x^{n-2}.
Introduce the expressions above into the differential equation, which involves only the function and its second derivative,
\sum_{n=2}^{\infty} n(n-1)\, a_n x^{n-2} + \sum_{n=0}^{\infty} a_n x^n = 0.
Relabel the first sum above, so that both sums have the same factor x^n. One way is,
\sum_{n=0}^{\infty} (n+2)(n+1)\, a_{n+2} x^n + \sum_{n=0}^{\infty} a_n x^n = 0.
Now we can write both sums using one single sum as follows,
\sum_{n=0}^{\infty} \big( (n+2)(n+1)\, a_{n+2} + a_n \big) x^n = 0 \quad\Rightarrow\quad (n+2)(n+1)\, a_{n+2} + a_n = 0, \qquad n \ge 0.
The last equation is the recurrence relation. The solution of this relation can again be found by writing down the first few cases, and we start with even values of n, that is,
n = 0: \quad (2)(1)\, a_2 = -a_0 \quad\Rightarrow\quad a_2 = -\frac{1}{2!}\, a_0,
n = 2: \quad (4)(3)\, a_4 = -a_2 \quad\Rightarrow\quad a_4 = \frac{1}{4!}\, a_0,
n = 4: \quad (6)(5)\, a_6 = -a_4 \quad\Rightarrow\quad a_6 = -\frac{1}{6!}\, a_0.
One can check that the even coefficients a_{2k} can be written as
a_{2k} = \frac{(-1)^k}{(2k)!}\, a_0.
The coefficients a_n for the odd values of n can be found in the same way, that is,
n = 1: \quad (3)(2)\, a_3 = -a_1 \quad\Rightarrow\quad a_3 = -\frac{1}{3!}\, a_1,
n = 3: \quad (5)(4)\, a_5 = -a_3 \quad\Rightarrow\quad a_5 = \frac{1}{5!}\, a_1,
n = 5: \quad (7)(6)\, a_7 = -a_5 \quad\Rightarrow\quad a_7 = -\frac{1}{7!}\, a_1.
One can check that the odd coefficients a_{2k+1} can be written as
a_{2k+1} = \frac{(-1)^k}{(2k+1)!}\, a_1.
Split the sum in the expression for y into even and odd sums. We have the expressions for the even and odd coefficients. Therefore, the solution of the differential equation is given by
y(x) = a_0 \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!}\, x^{2k} + a_1 \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!}\, x^{2k+1}.
One can check that these are precisely the power series representations of the cosine and sine functions, respectively,
y(x) = a_0 \cos(x) + a_1 \sin(x).
C
Example 3.1.4: Find the first four terms of the power series expansion around the point x_0 = 1 of each fundamental solution to the differential equation
y'' - x\, y' - y = 0.
Solution: This is a differential equation we cannot solve with the methods of previous sections. This is a second order, variable coefficients equation. We use the power series method, so we look for solutions of the form
y = \sum_{n=0}^{\infty} a_n (x-1)^n, \qquad y' = \sum_{n=1}^{\infty} n\, a_n (x-1)^{n-1}, \qquad y'' = \sum_{n=2}^{\infty} n(n-1)\, a_n (x-1)^{n-2}.
We start working on the middle term in the differential equation. Since the power series is centered at x_0 = 1, it is convenient to re-write this term as x\, y' = [(x-1) + 1]\, y', that is,
x\, y' = \sum_{n=1}^{\infty} n\, a_n\, x\, (x-1)^{n-1}
= \sum_{n=1}^{\infty} n\, a_n \big[ (x-1) + 1 \big] (x-1)^{n-1}
= \sum_{n=1}^{\infty} n\, a_n (x-1)^n + \sum_{n=1}^{\infty} n\, a_n (x-1)^{n-1}. \qquad (3.1.4)
As usual by now, the first sum on the right-hand side of Eq. (3.1.4) can start at n = 0, since we are only adding a zero term to the sum, that is,
\sum_{n=1}^{\infty} n\, a_n (x-1)^n = \sum_{n=0}^{\infty} n\, a_n (x-1)^n;
so both sums in Eq. (3.1.4) have the same factors (x-1)^n. We obtain the expression
x\, y' = \sum_{n=0}^{\infty} n\, a_n (x-1)^n + \sum_{n=0}^{\infty} (n+1)\, a_{n+1} (x-1)^n
= \sum_{n=0}^{\infty} \big( n\, a_n + (n+1)\, a_{n+1} \big) (x-1)^n. \qquad (3.1.5)
In the same way, relabeling the second derivative,
y'' = \sum_{n=0}^{\infty} (n+2)(n+1)\, a_{n+2} (x-1)^n. \qquad (3.1.6)
If we use Eqs. (3.1.5)-(3.1.6) in the differential equation, together with the expression for y, the differential equation can be written as follows,
\sum_{n=0}^{\infty} (n+2)(n+1)\, a_{n+2} (x-1)^n - \sum_{n=0}^{\infty} \big( n\, a_n + (n+1)\, a_{n+1} \big) (x-1)^n - \sum_{n=0}^{\infty} a_n (x-1)^n = 0.
We can now put all the terms above into a single sum,
\sum_{n=0}^{\infty} \Big[ (n+2)(n+1)\, a_{n+2} - (n+1)\, a_{n+1} - n\, a_n - a_n \Big] (x-1)^n = 0.
This expression provides the recurrence relation for the coefficients a_n with n \ge 0, that is,
(n+2)(n+1)\, a_{n+2} - (n+1)\, a_{n+1} - (n+1)\, a_n = 0
\quad\Leftrightarrow\quad (n+1) \Big[ (n+2)\, a_{n+2} - a_{n+1} - a_n \Big] = 0.
Example 3.1.5: Find the first three terms of the power series expansion around the point x_0 = 2 of each fundamental solution to the differential equation
y'' - x\, y = 0.
We now relabel the first sum on the right-hand side of Eq. (3.1.8) in the following way,
\sum_{n=0}^{\infty} a_n (x-2)^{n+1} = \sum_{n=1}^{\infty} a_{n-1} (x-2)^n. \qquad (3.1.9)
We can solve this recurrence relation for the first four coefficients,
n = 0: \quad a_2 - a_0 = 0 \quad\Rightarrow\quad a_2 = a_0,
n = 1: \quad (3)(2)\, a_3 - 2 a_1 - a_0 = 0 \quad\Rightarrow\quad a_3 = \frac{a_0}{6} + \frac{a_1}{3},
n = 2: \quad (4)(3)\, a_4 - 2 a_2 - a_1 = 0 \quad\Rightarrow\quad a_4 = \frac{a_0}{6} + \frac{a_1}{12}.
Therefore, the first terms in the power series expression for the solution y of the differential equation are given by
y = a_0 + a_1 (x-2) + a_0 (x-2)^2 + \Big( \frac{a_0}{6} + \frac{a_1}{3} \Big) (x-2)^3 + \Big( \frac{a_0}{6} + \frac{a_1}{12} \Big) (x-2)^4 + \cdots
which can be rewritten as
y = a_0 \Big[ 1 + (x-2)^2 + \frac{1}{6} (x-2)^3 + \frac{1}{6} (x-2)^4 + \cdots \Big] + a_1 \Big[ (x-2) + \frac{1}{3} (x-2)^3 + \frac{1}{12} (x-2)^4 + \cdots \Big].
So the first three terms of each fundamental solution are given by
y_1 = 1 + (x-2)^2 + \frac{1}{6} (x-2)^3,
y_2 = (x-2) + \frac{1}{3} (x-2)^3 + \frac{1}{12} (x-2)^4.
3.1.3. The Legendre Equation. The Legendre equation appears when one solves the Laplace equation in spherical coordinates. The Laplace equation describes several phenomena, such as the static electric potential near a charged body, or the gravitational potential of a planet or star. When the Laplace equation describes a situation having spherical symmetry it makes sense to use spherical coordinates to solve the equation. It is in that case that the Legendre equation appears for a variable related to the polar angle in the spherical coordinate system. See Jackson's classic book on electrodynamics [8], § 3.1, for a derivation of the Legendre equation from the Laplace equation.
Example 3.1.6: Find all solutions of the Legendre equation
(1 - x^2)\, y'' - 2x\, y' + l(l+1)\, y = 0,
where l is any real constant, using power series centered at x_0 = 0.
Solution: We start writing the equation in the form of Theorem 3.1.2,
y'' - \frac{2x}{(1-x^2)}\, y' + \frac{l(l+1)}{(1-x^2)}\, y = 0.
It is clear that the coefficient functions
p(x) = -\frac{2x}{(1-x^2)}, \qquad q(x) = \frac{l(l+1)}{(1-x^2)},
are analytic on the interval |x| < 1, which is centered at x_0 = 0. Theorem 3.1.2 says that there are two solutions, linearly independent and analytic on that interval. So we write the solution as a power series centered at x_0 = 0,
y(x) = \sum_{n=0}^{\infty} a_n x^n.
Then we get,
y'' = \sum_{n=0}^{\infty} (n+2)(n+1)\, a_{n+2} x^n,
-x^2\, y'' = \sum_{n=0}^{\infty} -(n-1) n\, a_n x^n,
-2x\, y' = \sum_{n=0}^{\infty} -2n\, a_n x^n,
l(l+1)\, y = \sum_{n=0}^{\infty} l(l+1)\, a_n x^n.
The Legendre equation says that the addition of the four equations above must be zero,
\sum_{n=0}^{\infty} \Big( (n+2)(n+1)\, a_{n+2} - (n-1) n\, a_n - 2n\, a_n + l(l+1)\, a_n \Big) x^n = 0.
Therefore, every term in that sum must vanish,
(n+2)(n+1)\, a_{n+2} - (n-1) n\, a_n - 2n\, a_n + l(l+1)\, a_n = 0, \qquad n \ge 0.
This is the recurrence relation for the coefficients a_n. After a few manipulations the recurrence relation becomes
a_{n+2} = -\frac{(l-n)(l+n+1)}{(n+2)(n+1)}\, a_n, \qquad n \ge 0.
By giving values to n we obtain,
a_2 = -\frac{l(l+1)}{2!}\, a_0, \qquad a_3 = -\frac{(l-1)(l+2)}{3!}\, a_1.
Since a_4 is related to a_2 and a_5 is related to a_3, we get,
a_4 = -\frac{(l-2)(l+3)}{(3)(4)}\, a_2 \quad\Rightarrow\quad a_4 = \frac{(l-2)\, l\, (l+1)(l+3)}{4!}\, a_0,
a_5 = -\frac{(l-3)(l+4)}{(4)(5)}\, a_3 \quad\Rightarrow\quad a_5 = \frac{(l-3)(l-1)(l+2)(l+4)}{5!}\, a_1.
If one keeps solving the coefficients a_n in terms of either a_0 or a_1, one gets the expression,
y(x) = a_0 \Big[ 1 - \frac{l(l+1)}{2!}\, x^2 + \frac{(l-2)\, l\, (l+1)(l+3)}{4!}\, x^4 + \cdots \Big]
+ a_1 \Big[ x - \frac{(l-1)(l+2)}{3!}\, x^3 + \frac{(l-3)(l-1)(l+2)(l+4)}{5!}\, x^5 + \cdots \Big].
Hence, the fundamental solutions are
y_1(x) = 1 - \frac{l(l+1)}{2!}\, x^2 + \frac{(l-2)\, l\, (l+1)(l+3)}{4!}\, x^4 + \cdots
y_2(x) = x - \frac{(l-1)(l+2)}{3!}\, x^3 + \frac{(l-3)(l-1)(l+2)(l+4)}{5!}\, x^5 + \cdots
The ratio test provides the interval where the series above converge. For the function y_1 we get, replacing n by 2n,
\Big| \frac{a_{2n+2}\, x^{2n+2}}{a_{2n}\, x^{2n}} \Big| = \Big| \frac{(l-2n)(l+2n+1)}{(2n+1)(2n+2)} \Big|\, |x|^2 \to |x|^2 \quad\text{as}\quad n \to \infty.
A similar result holds for y_2. So both series converge on the interval defined by |x| < 1. C
Remark: The functions y_1, y_2 are called Legendre functions. For a noninteger value of the constant l these functions cannot be written in terms of elementary functions. But when l is an integer, one of these series terminates and becomes a polynomial. The case of l being a nonnegative integer is especially relevant in physics. For l even the function y_1 becomes a polynomial, while y_2 remains an infinite series. For l odd the function y_2 becomes a polynomial, while y_1 remains an infinite series. For example, for l = 0, 1, 2, 3 we get,
l = 0: \quad y_1(x) = 1,
l = 1: \quad y_2(x) = x,
l = 2: \quad y_1(x) = 1 - 3x^2,
l = 3: \quad y_2(x) = x - \frac{5}{3}\, x^3.
The Legendre polynomials are proportional to these polynomials. The proportionality factor for each polynomial is chosen so that the Legendre polynomials have unit length in a particular chosen inner product. We just say here that the first four polynomials are
l = 0: \quad y_1(x) = 1, \qquad P_0 = y_1, \qquad P_0(x) = 1,
l = 1: \quad y_2(x) = x, \qquad P_1 = y_2, \qquad P_1(x) = x,
l = 2: \quad y_1(x) = 1 - 3x^2, \qquad P_2 = -\frac{1}{2}\, y_1, \qquad P_2(x) = \frac{1}{2} \big( 3x^2 - 1 \big),
l = 3: \quad y_2(x) = x - \frac{5}{3}\, x^3, \qquad P_3 = -\frac{3}{2}\, y_2, \qquad P_3(x) = \frac{1}{2} \big( 5x^3 - 3x \big).
These polynomials, Pn , are called Legendre polynomials. The graph of the first four Le-
gendre polynomials is given in Fig. 15.
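As a computational aside (ours, not part of the notes; the helper name is hypothetical), the recurrence relation obtained above already produces these polynomials: start the series with the parity of l, iterate until it terminates, and normalize so that P_l(1) = 1.

from fractions import Fraction

def legendre_poly(l):
    # a0 = 1 for even l, a1 = 1 for odd l, so the series terminates
    a = [Fraction(0)] * (l + 1)
    a[l % 2] = Fraction(1)
    for n in range(l % 2, l - 1, 2):
        # the recurrence a_{n+2} = -(l-n)(l+n+1)/((n+2)(n+1)) a_n
        a[n + 2] = Fraction(-(l - n)*(l + n + 1), (n + 2)*(n + 1)) * a[n]
    scale = sum(a)                     # value at x = 1; rescale so P_l(1) = 1
    return [c / scale for c in a]

for l in range(4):
    print(l, legendre_poly(l))   # [1], [0, 1], [-1/2, 0, 3/2], [0, -3/2, 0, 5/2]

The printed coefficient lists, constant term first, match P_0, ..., P_3 above.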
Figure 15. The graph of the first four Legendre polynomials P_0, P_1, P_2, P_3 on the interval -1 \le x \le 1.
3.1.4. Exercises.
3.1.1.- . 3.1.2.- .
Example 3.2.1: Show that the singular point of the Euler equation below is regular singular,
(x-3)^2\, y'' + 2(x-3)\, y' + 4\, y = 0.
Solution: Divide the equation by (x-3)^2, so we get the equation in the standard form
y'' + \frac{2}{(x-3)}\, y' + \frac{4}{(x-3)^2}\, y = 0.
The functions p and q are given by
p(x) = \frac{2}{(x-3)}, \qquad q(x) = \frac{4}{(x-3)^2}.
The functions p_3 and q_3 for the point x_0 = 3 are constants,
p_3(x) = (x-3)\, p(x) = 2, \qquad q_3(x) = (x-3)^2\, q(x) = 4.
Therefore they are analytic. This shows that x_0 = 3 is a regular singular point. C
Example 3.2.3: Find the regular singular points of the differential equation
(x+2)^2 (x-1)\, y'' + 3(x-1)\, y' + 2\, y = 0.
Remark: It is fairly simple to find the regular singular points of an equation. Take the equation in our last example, written in standard form,
y'' + \frac{3}{(x+2)^2}\, y' + \frac{2}{(x+2)^2 (x-1)}\, y = 0.
The functions p and q are given by
p(x) = \frac{3}{(x+2)^2}, \qquad q(x) = \frac{2}{(x+2)^2 (x-1)}.
The singular points are given by the zeros in the denominators, that is, x_0 = -2 and x_1 = 1. The point x_0 is not regular singular because the function p diverges at x_0 = -2 faster than \frac{1}{(x+2)}. The point x_1 = 1 is regular singular because the function p is regular at x_1 = 1 and the function q diverges at x_1 = 1 slower than \frac{1}{(x-1)^2}.
3.2.2. The Frobenius Method. We now assume that the differential equation
y'' + p(x)\, y' + q(x)\, y = 0, \qquad (3.2.1)
has a regular singular point. We want to find solutions to this equation that are defined arbitrarily close to that regular singular point. Recall that a point x_0 is a regular singular point of the equation above iff the functions (x-x_0)\, p and (x-x_0)^2\, q are analytic at x_0. A function is analytic at a point iff it has a convergent power series expansion in a neighborhood of that point. In our case this means that near a regular singular point we have
(x-x_0)\, p(x) = \sum_{n=0}^{\infty} p_n (x-x_0)^n = p_0 + p_1 (x-x_0) + p_2 (x-x_0)^2 + \cdots
(x-x_0)^2\, q(x) = \sum_{n=0}^{\infty} q_n (x-x_0)^n = q_0 + q_1 (x-x_0) + q_2 (x-x_0)^2 + \cdots
This means that near x_0 the function p diverges at most like (x-x_0)^{-1} and the function q diverges at most like (x-x_0)^{-2}, as can be seen from the equations
p(x) = \frac{p_0}{(x-x_0)} + p_1 + p_2 (x-x_0) + \cdots
q(x) = \frac{q_0}{(x-x_0)^2} + \frac{q_1}{(x-x_0)} + q_2 + \cdots
Therefore, for p_0 and q_0 nonzero and x close to x_0 we have the relations
p(x) \simeq \frac{p_0}{(x-x_0)}, \qquad q(x) \simeq \frac{q_0}{(x-x_0)^2}, \qquad x \simeq x_0,
where the symbol a \simeq b, with a, b \in \mathbb{R}, means that |a - b| is close to zero. In other words, for x close to a regular singular point x_0 the coefficients of Eq. (3.2.1) are close to the coefficients of the Euler equidimensional equation
(x-x_0)^2\, \tilde{y}'' + p_0 (x-x_0)\, \tilde{y}' + q_0\, \tilde{y} = 0,
where p_0 and q_0 are the zero order terms in the power series expansions of (x-x_0)\, p and (x-x_0)^2\, q given above. One could expect that solutions y to Eq. (3.2.1) are close to solutions \tilde{y} to this Euler equation. One way to put this relation in a more precise way is
y(x) = \tilde{y}(x) \sum_{n=0}^{\infty} a_n (x-x_0)^n \quad\Rightarrow\quad y(x) = \tilde{y}(x) \big( a_0 + a_1 (x-x_0) + \cdots \big).
Recalling that at least one solution to the Euler equation has the form \tilde{y}(x) = (x-x_0)^r, where r is a root of the indicial polynomial
r(r-1) + p_0\, r + q_0 = 0,
we then expect that for x close to x_0 the solution to Eq. (3.2.1) is close to
y(x) = (x-x_0)^r \sum_{n=0}^{\infty} a_n (x-x_0)^n.
This expression for the solution is usually written in a more compact way as follows,
y(x) = \sum_{n=0}^{\infty} a_n (x-x_0)^{(r+n)}.
This is the main idea of the Frobenius method to find solutions to equations with regular singular points: to look for solutions that are close to solutions of an appropriate Euler equation. We now state a theorem that summarizes a few formulas for solutions to differential equations with regular singular points.
(b) If (r_+ - r_-) = N, a nonnegative integer, then the differential equation in (3.2.2) has two independent solutions y_+, y_- of the form
y_+(x) = |x-x_0|^{r_+} \sum_{n=0}^{\infty} a_n (x-x_0)^n, \qquad \text{with } a_0 = 1,
y_-(x) = |x-x_0|^{r_-} \sum_{n=0}^{\infty} b_n (x-x_0)^n + c\, y_+(x) \ln|x-x_0|, \qquad \text{with } b_0 = 1.
The constant c is nonzero if N = 0. If N > 0, the constant c may or may not be zero.
In both cases above the series converge in the interval defined by |x-x_0| < \rho and the differential equation is satisfied for 0 < |x-x_0| < \rho.
Remarks:
(a) The statements above are taken from Apostol's second volume [2], Theorems 6.14, 6.15. For a sketch of the proof see Simmons [10]. A proof can be found in [5, 7].
(b) The existence of solutions and their behavior in a neighborhood of a singular point was first shown by Lazarus Fuchs in 1866. The construction of the solutions using singular power series expansions was first shown by Ferdinand Frobenius in 1874.
We now give a summary of the Frobenius method to find the solutions mentioned in Theorem 3.2.2 for a differential equation having a regular singular point. For simplicity we only show how to obtain the solution y_+.
(1) Look for a solution y of the form y(x) = \sum_{n=0}^{\infty} a_n (x-x_0)^{(n+r)}.
(2) Introduce this power series expansion into the differential equation and find the indicial equation for the exponent r. Find the larger solution of the indicial equation.
(3) Find a recurrence relation for the coefficients a_n.
(4) Introduce the larger root r into the recurrence relation for the coefficients a_n. Only then, solve this latter recurrence relation for the coefficients a_n.
(5) Using this procedure we will find the solution y_+ in Theorem 3.2.2.
We now show how to use these steps to find one solution of a differential equation near a regular singular point. We show the case where the roots of the indicial polynomial differ by an integer. We show that in this case we obtain only the solution y_+. The solution y_- does not have the form y(x) = \sum_{n=0}^{\infty} a_n (x-x_0)^{(n+r)}. Theorem 3.2.2 says that there is a logarithmic term in the solution. We do not compute that solution here.
Example 3.2.4: Find the solution y near the regular singular point x_0 = 0 of the equation
x^2\, y'' - x(x+3)\, y' + (x+3)\, y = 0.
As one can see from Eqs. (3.2.3)-(3.2.5), the guiding principle to rewrite each term is to have the power function x^{(n+r)} labeled in the same way on every term. For example, in Eqs. (3.2.3)-(3.2.5) we do not have a sum involving terms with factors x^{(n+r-1)} or factors x^{(n+r+1)}. Then, the differential equation can be written as follows,
\sum_{n=0}^{\infty} (n+r)(n+r-1)\, a_n x^{(n+r)} - \sum_{n=1}^{\infty} (n+r-1)\, a_{n-1} x^{(n+r)}
- \sum_{n=0}^{\infty} 3(n+r)\, a_n x^{(n+r)} + \sum_{n=1}^{\infty} a_{n-1} x^{(n+r)} + \sum_{n=0}^{\infty} 3 a_n x^{(n+r)} = 0.
In the equation above we need to split the sums containing terms with n \ge 0 into the term with n = 0 and a sum containing the terms with n \ge 1, that is,
\big[ r(r-1) - 3r + 3 \big] a_0 x^r + \sum_{n=1}^{\infty} \Big[ (n+r)(n+r-1)\, a_n - (n+r-1)\, a_{n-1} - 3(n+r)\, a_n + a_{n-1} + 3 a_n \Big] x^{(n+r)} = 0,
and this expression can be rewritten as follows,
\big[ r(r-1) - 3r + 3 \big] a_0 x^r + \sum_{n=1}^{\infty} \Big[ \big( (n+r)(n+r-1) - 3(n+r) + 3 \big) a_n - (n+r-1-1)\, a_{n-1} \Big] x^{(n+r)} = 0,
and then,
\big[ r(r-1) - 3r + 3 \big] a_0 x^r + \sum_{n=1}^{\infty} \Big[ \big( (n+r)(n+r-1) - 3(n+r-1) \big) a_n - (n+r-2)\, a_{n-1} \Big] x^{(n+r)} = 0,
hence,
\big[ r(r-1) - 3r + 3 \big] a_0 x^r + \sum_{n=1}^{\infty} \Big[ (n+r-1)(n+r-3)\, a_n - (n+r-2)\, a_{n-1} \Big] x^{(n+r)} = 0.
The indicial equation and the recurrence relation are given by the equations
r(r-1) - 3r + 3 = 0, \qquad (3.2.6)
(n+r-1)(n+r-3)\, a_n - (n+r-2)\, a_{n-1} = 0. \qquad (3.2.7)
The way to solve these equations in (3.2.6)-(3.2.7) is the following: first, solve Eq. (3.2.6) for the exponent r, which in this case has two solutions r_\pm; second, introduce the first solution r_+ into the recurrence relation in Eq. (3.2.7) and solve for the coefficients a_n; the result is a solution y_+ of the original differential equation; then introduce the second solution r_- into Eq. (3.2.7) and solve again for the coefficients a_n; the new result is a second solution y_-. Let us follow this procedure in the case of the equations above:
r^2 - 4r + 3 = 0 \quad\Rightarrow\quad r_\pm = \frac{1}{2} \big( 4 \pm \sqrt{16 - 12} \big) \quad\Rightarrow\quad r_+ = 3, \quad r_- = 1.
Introducing the value r_+ = 3 into Eq. (3.2.7) we obtain
(n+2)\, n\, a_n - (n+1)\, a_{n-1} = 0.
One can check that the solution y_+ obtained from this recurrence relation is given by
y_+(x) = a_0\, x^3 \Big[ 1 + \frac{2}{3}\, x + \frac{1}{4}\, x^2 + \frac{1}{15}\, x^3 + \cdots \Big].
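This claim can be checked quickly. The following sketch is ours, not part of the notes; it iterates the recurrence (n+2) n a_n = (n+1) a_{n-1} with a_0 = 1 in exact rational arithmetic.

from fractions import Fraction

a = [Fraction(1)]                                   # a_0 = 1
for n in range(1, 5):
    a.append(Fraction(n + 1, (n + 2)*n) * a[-1])    # a_n = (n+1)/((n+2) n) a_{n-1}
print(a)                                            # [1, 2/3, 1/4, 1/15, 1/72]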
3.2.3. The Bessel Equation. We saw in § 3.1 that the Legendre equation appears when one solves the Laplace equation in spherical coordinates. If one uses cylindrical coordinates instead, one needs to solve the Bessel equation. Recall we mentioned that the Laplace equation describes several phenomena, such as the static electric potential near a charged body, or the gravitational potential of a planet or star. When the Laplace equation describes a situation having cylindrical symmetry it makes sense to use cylindrical coordinates to solve it. Then the Bessel equation appears for the radial variable in the cylindrical coordinate system. See Jackson's classic book on electrodynamics [8], § 3.7, for a derivation of the Bessel equation from the Laplace equation.
The equation is named after Friedrich Bessel, a German astronomer from the first half of the nineteenth century, who was the first person to calculate the distance to a star other than our Sun. Bessel's parallax measurement of 1838 yielded a distance of 11 light years for the star 61 Cygni. In 1844 he discovered that Sirius, the brightest star in the sky, has a traveling companion; nowadays such a system is called a binary star. This companion has the size of a planet and the mass of a star, so it has a very high density, many thousands of times the density of water. This was the first dead star discovered. Bessel first obtained the equation that now bears his name when he was studying star motions. But the equation first appeared in Daniel Bernoulli's studies of oscillations of a hanging chain. (Taken from Simmons' book [10], § 34.)
Example 3.2.5: Find all solutions y(x) = \sum_{n=0}^{\infty} a_n x^{n+r}, with a_0 \ne 0, of the Bessel equation
x^2\, y'' + x\, y' + (x^2 - \alpha^2)\, y = 0, \qquad x > 0,
where \alpha is any real nonnegative constant, using the Frobenius method centered at x_0 = 0.
Solution: Let us double check that x_0 = 0 is a regular singular point of the equation. We start writing the equation in the standard form,
y'' + \frac{1}{x}\, y' + \frac{(x^2 - \alpha^2)}{x^2}\, y = 0,
so we get the functions p(x) = 1/x and q(x) = (x^2 - \alpha^2)/x^2. It is clear that x_0 = 0 is a singular point of the equation. Since the functions
\tilde{p}(x) = x\, p(x) = 1, \qquad \tilde{q}(x) = x^2\, q(x) = (x^2 - \alpha^2)
are analytic, we conclude that x_0 = 0 is a regular singular point. So it makes sense to look for solutions of the form
y(x) = \sum_{n=0}^{\infty} a_n x^{(n+r)}, \qquad x > 0.
We now compute the different terms needed to write the differential equation. We need,
x^2\, y(x) = \sum_{n=0}^{\infty} a_n x^{(n+r+2)} = \sum_{n=2}^{\infty} a_{n-2} x^{(n+r)},
where we did the relabeling n + 2 = m \to n. The term with the first derivative is given by
x\, y'(x) = \sum_{n=0}^{\infty} (n+r)\, a_n x^{(n+r)}.
The term with the second derivative has the form
x^2\, y''(x) = \sum_{n=0}^{\infty} (n+r)(n+r-1)\, a_n x^{(n+r)}.
Therefore, the differential equation takes the form
\sum_{n=0}^{\infty} (n+r)(n+r-1)\, a_n x^{(n+r)} + \sum_{n=0}^{\infty} (n+r)\, a_n x^{(n+r)} + \sum_{n=2}^{\infty} a_{n-2} x^{(n+r)} - \alpha^2 \sum_{n=0}^{\infty} a_n x^{(n+r)} = 0.
Group together the sums that start at n = 0,
\sum_{n=0}^{\infty} \Big[ (n+r)(n+r-1) + (n+r) - \alpha^2 \Big] a_n x^{(n+r)} + \sum_{n=2}^{\infty} a_{n-2} x^{(n+r)} = 0,
and cancel a few terms in the first sum,
\sum_{n=0}^{\infty} \Big[ (n+r)^2 - \alpha^2 \Big] a_n x^{(n+r)} + \sum_{n=2}^{\infty} a_{n-2} x^{(n+r)} = 0.
Split the sum that starts at n = 0 into its first two terms plus the rest,
(r^2 - \alpha^2)\, a_0 x^r + \big( (r+1)^2 - \alpha^2 \big)\, a_1 x^{(r+1)} + \sum_{n=2}^{\infty} \Big[ (n+r)^2 - \alpha^2 \Big] a_n x^{(n+r)} + \sum_{n=2}^{\infty} a_{n-2} x^{(n+r)} = 0.
The reason for this splitting is that now we can write the two sums as one,
(r^2 - \alpha^2)\, a_0 x^r + \big( (r+1)^2 - \alpha^2 \big)\, a_1 x^{(r+1)} + \sum_{n=2}^{\infty} \Big[ \big( (n+r)^2 - \alpha^2 \big) a_n + a_{n-2} \Big] x^{(n+r)} = 0.
We then conclude that each term must vanish,
(r^2 - \alpha^2)\, a_0 = 0, \qquad \big( (r+1)^2 - \alpha^2 \big)\, a_1 = 0, \qquad \big( (n+r)^2 - \alpha^2 \big)\, a_n + a_{n-2} = 0, \quad n \ge 2. \qquad (3.2.8)
This is the recurrence relation for the Bessel equation. It is here where we use that we look for solutions with a_0 \ne 0. In this example we do not look for solutions with a_1 \ne 0; maybe it is a good exercise for the reader to find such solutions. But in this example we look for solutions with a_0 \ne 0. This condition and the first equation above imply that
r^2 - \alpha^2 = 0 \quad\Rightarrow\quad r_\pm = \pm \alpha,
and recall that \alpha is a nonnegative but otherwise arbitrary real number. The choice r = r_+ = \alpha will lead to a solution y_\alpha, and the choice r = r_- = -\alpha will lead to a solution y_{-\alpha}. These solutions may or may not be linearly independent. This depends on the value of \alpha, since r_+ - r_- = 2\alpha. One must be careful to study all possible cases.
Remark: Let us start with a very particular case. Suppose that both equations below hold,
(r^2 - \alpha^2) = 0, \qquad \big( (r+1)^2 - \alpha^2 \big) = 0.
These equations are the result of both a_0 \ne 0 and a_1 \ne 0. These equations imply
r^2 = (r+1)^2 \quad\Rightarrow\quad 2r + 1 = 0 \quad\Rightarrow\quad r = -\frac{1}{2}.
But recall that r = \pm\alpha, with \alpha \ge 0, hence the case a_0 \ne 0 and a_1 \ne 0 happens only when \alpha = 1/2 and we choose r_- = -\alpha = -1/2. We leave the computation of the solution y_{-1/2} as an exercise for the reader. But the answer is
y_{-1/2}(x) = a_0\, \frac{\cos(x)}{\sqrt{x}} + a_1\, \frac{\sin(x)}{\sqrt{x}}.
From now on we assume that \alpha \ne 1/2. This condition on \alpha, the equation r^2 - \alpha^2 = 0, and the remark above imply that
(r+1)^2 - \alpha^2 \ne 0.
So the second equation in the recurrence relation in (3.2.8) implies that a_1 = 0. Summarizing, the first two equations in the recurrence relation in (3.2.8) are satisfied because
r_\pm = \pm\alpha, \qquad a_1 = 0.
We only need to find the coefficients a_n, for n \ge 2, such that the third equation in the recurrence relation in (3.2.8) is satisfied. But we need to consider two cases, r = r_+ = \alpha and r = r_- = -\alpha.
We start with the case r = r_+ = \alpha, and we get
(n^2 + 2n\alpha)\, a_n + a_{n-2} = 0 \quad\Rightarrow\quad n(n + 2\alpha)\, a_n = -a_{n-2}.
Since n \ge 2 and \alpha \ge 0, the factor (n + 2\alpha) never vanishes and we get
a_n = -\frac{a_{n-2}}{n(n + 2\alpha)}.
This equation and a_1 = 0 imply that all coefficients a_{2k+1} = 0 for k \ge 0; the odd coefficients vanish. On the other hand, the even coefficients are nonzero. The coefficient a_2 is
a_2 = -\frac{a_0}{2(2 + 2\alpha)} \quad\Rightarrow\quad a_2 = -\frac{a_0}{2^2 (1 + \alpha)},
the coefficient a_4 is
a_4 = -\frac{a_2}{4(4 + 2\alpha)} = -\frac{a_2}{2^2 (2)(2 + \alpha)} \quad\Rightarrow\quad a_4 = \frac{a_0}{2^4 (2!)\, (1 + \alpha)(2 + \alpha)},
the coefficient a_6 is
a_6 = -\frac{a_4}{6(6 + 2\alpha)} = -\frac{a_4}{2^2 (3)(3 + \alpha)} \quad\Rightarrow\quad a_6 = -\frac{a_0}{2^6 (3!)\, (1 + \alpha)(2 + \alpha)(3 + \alpha)}.
Now it is not so hard to show that the general term a_{2k}, for k = 0, 1, 2, \cdots, has the form
a_{2k} = \frac{(-1)^k\, a_0}{2^{2k} (k!)\, (1 + \alpha)(2 + \alpha) \cdots (k + \alpha)}.
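As an aside (ours, not derived in the notes): with the standard normalization a_0 = 1/(2^\alpha \Gamma(1+\alpha)), the series built from the coefficients a_{2k} above is the Bessel function J_\alpha. The sketch below, whose helper name and scipy dependence are our own choices, sums the series term by term and compares it with scipy's implementation.

from math import gamma
from scipy.special import jv

def y_alpha(x, alpha, K=30):
    term = (x/2.0)**alpha / gamma(1.0 + alpha)   # k = 0 term, a_0 = 1/(2^alpha Gamma(1+alpha))
    total = 0.0
    for k in range(K):
        total += term
        # ratio of consecutive terms: a_{2k+2} x^{2k+2+alpha} / (a_{2k} x^{2k+alpha})
        term *= -(x/2.0)**2 / ((k + 1)*(k + 1 + alpha))
    return total

print(y_alpha(1.5, 0.5), jv(0.5, 1.5))   # both approximately 0.6498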
\alpha = k + 1/2, for k an integer. Introducing this y_{-(k+1/2)} into the Bessel equation one can check that y_{-(k+1/2)} is indeed a solution to the Bessel equation.
Summarizing, the solution y_\alpha of the Bessel equation is defined for every nonnegative real number \alpha, and the solution y_{-\alpha} is defined for every nonnegative real number \alpha except the nonnegative integers. For a given \alpha such that both y_\alpha and y_{-\alpha} are defined, these functions are linearly independent. That these functions cannot be proportional to each other is simple to see, since for \alpha > 0 the function y_\alpha is regular at the origin x = 0, while y_{-\alpha} diverges there.
The last case we need to study is how to find the solution y_{-\alpha} when \alpha is a nonnegative integer. We see that the expression in (3.2.10) is not defined when \alpha is a nonnegative integer. And we just saw that this condition on \alpha is a particular case of the condition in Theorem 3.2.2 that (r_+ - r_-) is a nonnegative integer. Theorem 3.2.2 gives us the expression for a second solution, y_{-\alpha}, linearly independent of y_\alpha, in the case that \alpha is a nonnegative integer. This expression is
y_{-\alpha}(x) = y_\alpha(x) \ln(x) + x^{-\alpha} \sum_{n=0}^{\infty} c_n x^n.
If we put this expression into the Bessel equation, one can find a recurrence relation for the coefficients c_n. This is a long calculation, and the final result is
y_{-\alpha}(x) = y_\alpha(x) \ln(x) - \frac{1}{2} \Big( \frac{x}{2} \Big)^{-\alpha} \sum_{n=0}^{\alpha - 1} \frac{(\alpha - n - 1)!}{n!} \Big( \frac{x}{2} \Big)^{2n} - \frac{1}{2} \Big( \frac{x}{2} \Big)^{\alpha} \sum_{n=0}^{\infty} (-1)^n\, \frac{(h_n + h_{n+\alpha})}{n!\, (n+\alpha)!} \Big( \frac{x}{2} \Big)^{2n},
with h_0 = 0, h_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n} for n \ge 1, and \alpha a nonnegative integer. C
3.2.4. Exercises.
3.2.1.- . 3.2.2.- .
Notes on Chapter 3
Sometimes solutions to a differential equation cannot be written in terms of previously known functions. When that happens we say that the solutions to the differential equation define a new type of function. How can we work with, let alone write down, a new function, a function that cannot be written in terms of the functions we already know? It is the differential equation that defines the function. So the function properties must be obtained from the differential equation itself. A way to compute the function values must come from the differential equation as well. The few paragraphs that follow try to convey that this procedure is not as artificial as it may sound.
Differential Equations to Define Functions. We have seen in § 3.2 that the solutions of the Bessel equation for \alpha \ne 1/2 cannot be written in terms of simple functions, such as quotients of polynomials, trigonometric functions, logarithms, and exponentials. We used power series, including negative powers, to write solutions to this equation. To study properties of these solutions one needs to use either the power series expansions or the equation itself. This type of study on the solutions of the Bessel equation is too complicated for these notes, but the interested reader can see [14].
We want to give an idea how this type of study can be carried out. We choose a differential equation that is simpler to study than the Bessel equation. We study two solutions, C and S, of this particular differential equation and we will show, using only the differential equation, that these solutions have all the properties that the cosine and sine functions have. So we will conclude that these solutions are in fact C(x) = \cos(x) and S(x) = \sin(x). This example is taken from Hassani's textbook [?], example 13.6.1, page 368.
Example 3.2.6: Let the function C be the unique solution of the initial value problem
C'' + C = 0, \qquad C(0) = 1, \qquad C'(0) = 0,
and let the function S be the unique solution of the initial value problem
S'' + S = 0, \qquad S(0) = 0, \qquad S'(0) = 1.
Use the differential equation to study these functions.
Solution:
(a) We start showing that these solutions C and S are linearly independent. We only need to compute their Wronskian at x = 0,
W(0) = C(0)\, S'(0) - C'(0)\, S(0) = 1 \ne 0.
Therefore the functions C and S are linearly independent.
(b) We now show that the function S is odd and the function C is even. The function \bar{C}(x) = C(-x) satisfies the initial value problem
\bar{C}'' + \bar{C} = C'' + C = 0, \qquad \bar{C}(0) = C(0) = 1, \qquad \bar{C}'(0) = -C'(0) = 0.
This is the same initial value problem satisfied by the function C. The uniqueness of solutions to these initial value problems implies that C(-x) = C(x) for all x \in \mathbb{R}, hence the function C is even. The function \bar{S}(x) = -S(-x) satisfies the initial value problem
\bar{S}'' + \bar{S} = -(S'' + S) = 0, \qquad \bar{S}(0) = -S(0) = 0, \qquad \bar{S}'(0) = S'(0) = 1.
This is the same initial value problem satisfied by the function S. The uniqueness of solutions to these initial value problems implies that -S(-x) = S(x) for all x \in \mathbb{R}, hence the function S is odd.
(c) Next we find a differential relation between the functions C and S. Notice that the function C' is the unique solution of the initial value problem
(C')'' + (C') = 0, \qquad C'(0) = 0, \qquad (C')'(0) = -C(0) = -1.
This is precisely the same initial value problem satisfied by the function -S. The uniqueness of solutions to these initial value problems implies that C' = -S, that is, for all x \in \mathbb{R} holds
C'(x) = -S(x).
Take one more derivative in this relation and use the differential equation for C,
-S'(x) = C''(x) = -C(x) \quad\Rightarrow\quad S'(x) = C(x).
(d) Let us now recall that Abel's Theorem says that the Wronskian of two solutions to a second order differential equation y'' + p(x)\, y' + q(x)\, y = 0 satisfies the differential equation W' + p(x)\, W = 0. In our case the function p = 0, so the Wronskian is a constant function. If we compute the Wronskian of the functions C and S and use the differential relations found in (c), we get
W(x) = C(x)\, S'(x) - C'(x)\, S(x) = C^2(x) + S^2(x).
This Wronskian must be a constant function, but at x = 0 it takes the value W(0) = C^2(0) + S^2(0) = 1. We therefore conclude that for all x \in \mathbb{R} holds
C^2(x) + S^2(x) = 1.
(e) We end by computing power series expansions of these functions C and S, so we have a way to compute their values. We start with the function C. The initial conditions say
C(0) = 1, \qquad C'(0) = 0.
The differential equation at x = 0 and the first initial condition say that C''(0) = -C(0) = -1. The derivative of the differential equation at x = 0 and the second initial condition say that C'''(0) = -C'(0) = 0. If we keep taking derivatives of the differential equation we get
C''(0) = -1, \qquad C'''(0) = 0, \qquad C^{(4)}(0) = 1,
and in general,
C^{(n)}(0) = \begin{cases} 0 & \text{if } n \text{ is odd}, \\ (-1)^k & \text{if } n = 2k, \text{ where } k = 0, 1, 2, \cdots. \end{cases}
So we obtain the Taylor series expansion
C(x) = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{(2k)!},
which is the power series expansion of the cosine function. A similar calculation yields
S(x) = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k+1)!},
which is the power series expansion of the sine function. Notice that we have obtained these expansions using only the differential equation and its derivatives at x = 0, together with the initial conditions. The ratio test shows that these power series converge for all x \in \mathbb{R}. These power series expansions also say that the function S is odd and C is even. C
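A numerical sanity check of part (d) is straightforward. The sketch below is ours, not part of the notes; it integrates both initial value problems with scipy and tests the identity C^2 + S^2 = 1 on a grid.

import numpy as np
from scipy.integrate import solve_ivp

rhs = lambda x, u: [u[1], -u[0]]     # y'' = -y written as a first order system
xs = np.linspace(0.0, 10.0, 101)
C = solve_ivp(rhs, (0.0, 10.0), [1.0, 0.0], t_eval=xs, rtol=1e-10, atol=1e-12).y[0]
S = solve_ivp(rhs, (0.0, 10.0), [0.0, 1.0], t_eval=xs, rtol=1e-10, atol=1e-12).y[0]
print(np.max(np.abs(C**2 + S**2 - 1.0)))   # close to machine precision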
The Euler number e is defined as the solution of the equation \ln(e) = 1. The inverse of the natural logarithm, \ln^{-1}, is defined in the usual way,
\ln^{-1}(y) = x \quad\Leftrightarrow\quad \ln(x) = y, \qquad x \in (0, \infty), \quad y \in (-\infty, \infty).
Since the natural logarithm satisfies that \ln(x_1 x_2) = \ln(x_1) + \ln(x_2), the inverse function satisfies the related identity \ln^{-1}(y_1 + y_2) = \ln^{-1}(y_1)\, \ln^{-1}(y_2). To see this identity compute
\ln^{-1}(y_1 + y_2) = \ln^{-1}\big( \ln(x_1) + \ln(x_2) \big) = \ln^{-1}\big( \ln(x_1 x_2) \big) = x_1 x_2 = \ln^{-1}(y_1)\, \ln^{-1}(y_2).
This identity and the fact that \ln^{-1}(1) = e imply that for any positive integer n holds
\ln^{-1}(n) = \ln^{-1}(\underbrace{1 + \cdots + 1}_{n \text{ times}}) = \underbrace{\ln^{-1}(1) \cdots \ln^{-1}(1)}_{n \text{ times}} = \underbrace{e \cdots e}_{n \text{ times}} = e^n.
This relation says that \ln^{-1} is the exponential function when restricted to positive integers. This suggests a way to generalize the exponential function from positive integers to real numbers, e^y = \ln^{-1}(y), for y real. Hence the name exponential for the inverse of the natural logarithm. And this is how calculus brought us the logarithm and the exponential functions.
Finally, notice that by the definition of the natural logarithm, its derivative is \ln'(x) = 1/x. But there is a formula relating the derivative of a function f and its inverse f^{-1},
\big( f^{-1} \big)'(y) = \frac{1}{f'\big( f^{-1}(y) \big)}.
The Laplace Transform is a transformation, meaning that it changes a function into a new
function. Actually, it is a linear transformation, because it converts a linear combination of
functions into a linear combination of the transformed functions. Even more interesting, the
Laplace Transform converts derivatives into multiplications. These two properties make the
Laplace Transform very useful to solve linear differential equations with constant coefficients.
The Laplace Transform converts such differential equation for an unknown function into an
algebraic equation for the transformed function. Usually it is easy to solve the algebraic
equation for the transformed function. Then one converts the transformed function back
into the original function. This function is the solution of the differential equation.
Solving a differential equation using a Laplace Transform is radically different from all the methods we have used so far. This method, as we will use it here, is relatively new. The Laplace Transform we define here was first used in 1910, but its use grew rapidly after 1920, especially to solve differential equations. Transformations like the Laplace Transform were known much earlier. Pierre Simon de Laplace used a similar transformation in his studies of probability theory, published in 1812, but analogous transformations were used even earlier by Euler around 1737.
[Figure: graphs of three functions \delta_1(t), \delta_2(t), \delta_3(t), taking the values 1, 2, 3 on shrinking intervals near the origin.]
So we have transformed the derivative we started with into a multiplication by this constant s from the exponential. The idea in this calculation actually works to solve differential equations, and it motivates us to define the integral transformation y(t) \to Y(s) as follows,
y(t) \to Y(s) = \int e^{-st}\, y(t)\, dt.
The Laplace transform is a transformation similar to the one above, where we choose appropriate integration limits, which are very convenient to solve initial value problems.
We dedicate this section to introducing the precise definition of the Laplace transform and showing how it is used to solve differential equations. In the following sections we will see that this method can be used to solve linear constant coefficient differential equations with very general sources, including Dirac's delta generalized functions.
4.1.1. Overview of the Method. The Laplace transform changes a function into another function. For example, we will show later on that the Laplace transform changes
f(x) = \sin(ax) \quad\text{into}\quad F(x) = \frac{a}{x^2 + a^2}.
We will follow the notation used in the literature and use t for the variable of the original function f, while we use s for the variable of the transformed function F. Using this notation, the Laplace transform changes
f(t) = \sin(at) \quad\text{into}\quad F(s) = \frac{a}{s^2 + a^2}.
We will show that the Laplace transform is a linear transformation and that it transforms derivatives into multiplication. Because of these properties we will use the Laplace transform to solve linear differential equations.
We Laplace transform the original differential equation. Because of the properties above, the result will be an algebraic equation for the transformed function. Algebraic equations are simple to solve, so we solve the algebraic equation. Then we Laplace transform back the solution. We summarize these steps as follows: (1) apply \mathcal{L} to the differential equation for y, obtaining an algebraic equation for \mathcal{L}[y]; (2) solve the algebraic equation for \mathcal{L}[y]; (3) transform back (using the table) to obtain y.
4.1.2. The Laplace Transform. The Laplace transform is a transformation, meaning that it converts a function into a new function. We have seen transformations earlier in these notes. In Chapter 2 we used the transformation
L[y(t)] = y''(t) + a_1\, y'(t) + a_0\, y(t),
so that a second order linear differential equation with source f could be written as L[y] = f.
There are simpler transformations, for example the differentiation operation itself,
D[f(t)] = f'(t).
Not all transformations involve differentiation. There are integral transformations, for example integration itself,
I[f(t)] = \int_0^x f(t)\, dt.
Of particular importance in many applications are integral transformations of the form
T[f(t)] = \int_a^b K(s, t)\, f(t)\, dt,
where K is a fixed function of two variables, called the kernel of the transformation, and a, b are real numbers or \pm\infty. The Laplace transform is a transformation of this type, where the kernel is K(s, t) = e^{-st}, the constant a = 0, and b = \infty.
Definition 4.1.1. The Laplace transform of a function f defined on D_f = (0, \infty) is
F(s) = \int_0^{\infty} e^{-st}\, f(t)\, dt, \qquad (4.1.1)
so the integral converges only for s > a and the Laplace transform is given by
\mathcal{L}[e^{at}] = \frac{1}{(s-a)}, \qquad s > a. \qquad C
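As an aside (ours, not part of the notes): computer algebra systems implement this transform. In sympy, laplace_transform returns the transform together with its abscissa of convergence and auxiliary conditions; the printed forms may differ from those above by algebraic rewriting.

import sympy as sp

t, s = sp.symbols('t s', positive=True)
a = sp.symbols('a', positive=True)
print(sp.laplace_transform(sp.exp(a*t), t, s))   # 1/(s - a), valid for s > a
print(sp.laplace_transform(sp.sin(a*t), t, s))   # a/(a**2 + s**2)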
This improper integral diverges for s = a, so \mathcal{L}[t\, e^{at}] is not defined for s = a. From now on we consider only the case s \ne a. In this case we can integrate by parts,
\mathcal{L}[t\, e^{at}] = \lim_{N \to \infty} \Big[ -\frac{1}{(s-a)}\, t\, e^{-(s-a)t} \Big|_0^N + \frac{1}{s-a} \int_0^N e^{-(s-a)t}\, dt \Big],
that is,
\mathcal{L}[t\, e^{at}] = \lim_{N \to \infty} \Big[ -\frac{1}{(s-a)}\, t\, e^{-(s-a)t} \Big|_0^N - \frac{1}{(s-a)^2}\, e^{-(s-a)t} \Big|_0^N \Big]. \qquad (4.1.2)
then we get
\int_0^N e^{-st} \sin(at)\, dt = \frac{s^2}{(s^2 + a^2)} \Big[ -\frac{1}{s}\, e^{-st} \sin(at) \Big|_0^N - \frac{a}{s^2}\, e^{-st} \cos(at) \Big|_0^N \Big].
In Table 2 we present a short list of Laplace transforms. They can be computed in the same way we computed the Laplace transforms in the examples above.
4.1.3. Main Properties. Since we are more or less confident on how to compute a Laplace transform, we can start asking deeper questions. For example, what type of functions have a Laplace transform? It turns out that a large class of functions do: those that are piecewise continuous on [0, \infty) and bounded by an exponential. This last property is particularly important and we give it a name.
Remarks:
(a) When the precise value of the constant s_0 is not important we will say that f is of exponential order.
(b) An example of a function that is not of exponential order is f(t) = e^{t^2}.
This definition helps to describe a set of functions having Laplace transforms: piecewise continuous functions on [0, \infty) of exponential order have Laplace transforms.
Proof of Theorem 4.1.3: From the definition of the Laplace transform we know that
\mathcal{L}[f] = \lim_{N \to \infty} \int_0^N e^{-st}\, f(t)\, dt.
The definite integral on the interval [0, N] exists for every N > 0 since f is piecewise continuous on that interval, no matter how large N is. We only need to check whether the integral converges as N \to \infty. This is the case for functions of exponential order, because
\Big| \int_0^N e^{-st}\, f(t)\, dt \Big| \le \int_0^N e^{-st}\, |f(t)|\, dt \le \int_0^N e^{-st}\, k\, e^{s_0 t}\, dt = k \int_0^N e^{-(s-s_0)t}\, dt.
Theorem 4.1.4 (Linearity). If \mathcal{L}[f] and \mathcal{L}[g] exist, then for all a, b \in \mathbb{R} holds
\mathcal{L}[af + bg] = a\, \mathcal{L}[f] + b\, \mathcal{L}[g].
Proof of Theorem 4.1.4: Since integration is a linear operation, so is the Laplace transform, as this calculation shows,
\mathcal{L}[af + bg] = \int_0^{\infty} e^{-st} \big( a f(t) + b g(t) \big)\, dt = a \int_0^{\infty} e^{-st} f(t)\, dt + b \int_0^{\infty} e^{-st} g(t)\, dt = a\, \mathcal{L}[f] + b\, \mathcal{L}[g].
This establishes the Theorem.
Example 4.1.5: Compute \mathcal{L}[3t^2 + 5\cos(4t)].
Solution: From the Theorem above and the Laplace transforms in Table 2 we know that
\mathcal{L}[3t^2 + 5\cos(4t)] = 3\, \mathcal{L}[t^2] + 5\, \mathcal{L}[\cos(4t)] = 3 \Big( \frac{2}{s^3} \Big) + 5 \Big( \frac{s}{s^2 + 4^2} \Big), \qquad s > 0,
= \frac{6}{s^3} + \frac{5s}{s^2 + 4^2}.
Therefore,
\mathcal{L}[3t^2 + 5\cos(4t)] = \frac{5s^4 + 6s^2 + 96}{s^3 (s^2 + 16)}, \qquad s > 0. \qquad C
The Laplace transform can be used to solve differential equations. The Laplace transform converts a differential equation into an algebraic equation. This is so because the Laplace transform converts derivatives into multiplications. Here is the precise result.
Theorem 4.1.5 (Derivative into Multiplication). If a function f is continuously differentiable on [0, \infty) and of exponential order s_0, then \mathcal{L}[f'] exists for s > s_0 and
\mathcal{L}[f'] = s\, \mathcal{L}[f] - f(0), \qquad s > s_0. \qquad (4.1.4)
We start computing the definite integral above. Since f' is continuous on [0, \infty), that definite integral exists for all positive N, and we can integrate by parts,
\int_0^N e^{-st}\, f'(t)\, dt = \Big[ e^{-st} f(t) \Big]_0^N - \int_0^N (-s)\, e^{-st} f(t)\, dt = e^{-sN} f(N) - f(0) + s \int_0^N e^{-st} f(t)\, dt.
Let us use one more time that f is of exponential order s_0. This means that there exist positive constants k and T such that |f(t)| \le k\, e^{s_0 t} for t > T. Therefore,
\lim_{N \to \infty} \big| e^{-sN} f(N) \big| \le \lim_{N \to \infty} k\, e^{-sN} e^{s_0 N} = \lim_{N \to \infty} k\, e^{-(s-s_0)N} = 0, \qquad s > s_0.
These two results together imply that \mathcal{L}[f'] exists and
\mathcal{L}[f'] = s\, \mathcal{L}[f] - f(0), \qquad s > s_0.
This establishes the Theorem.
Example 4.1.6: Verify the result in Theorem 4.1.5 for the function f(t) = \cos(bt).
Solution: We need to compute the left-hand side and the right-hand side of Eq. (4.1.4) and verify that we get the same result. We start with the left-hand side,
\mathcal{L}[f'] = \mathcal{L}[-b \sin(bt)] = -b\, \mathcal{L}[\sin(bt)] = -b\, \frac{b}{s^2 + b^2} \quad\Rightarrow\quad \mathcal{L}[f'] = -\frac{b^2}{s^2 + b^2}.
We now compute the right-hand side,
s\, \mathcal{L}[f] - f(0) = s\, \mathcal{L}[\cos(bt)] - 1 = s\, \frac{s}{s^2 + b^2} - 1 = \frac{s^2 - s^2 - b^2}{s^2 + b^2},
so we get
s\, \mathcal{L}[f] - f(0) = -\frac{b^2}{s^2 + b^2}.
We conclude that \mathcal{L}[f'] = s\, \mathcal{L}[f] - f(0). C
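The same verification can be done symbolically. This sketch is ours, not part of the notes; it checks Theorem 4.1.5 for f(t) = cos(bt) with sympy.

import sympy as sp

t, s, b = sp.symbols('t s b', positive=True)
f = sp.cos(b*t)
# L[f'] versus s L[f] - f(0); noconds=True drops the convergence conditions
lhs = sp.laplace_transform(sp.diff(f, t), t, s, noconds=True)
rhs = s*sp.laplace_transform(f, t, s, noconds=True) - f.subs(t, 0)
print(sp.simplify(lhs - rhs))   # 0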
Proof of Theorem 4.1.6: We need to use Eq. (4.1.4) n times. We start with the Laplace transform of a second derivative,
\mathcal{L}[f''] = \mathcal{L}[(f')'] = s\, \mathcal{L}[f'] - f'(0) = s \big( s\, \mathcal{L}[f] - f(0) \big) - f'(0) = s^2\, \mathcal{L}[f] - s\, f(0) - f'(0).
4.1.4. Solving Differential Equations. The Laplace transform can be used to solve differential equations. We Laplace transform the whole equation, which converts the differential equation for y into an algebraic equation for \mathcal{L}[y]. We solve the algebraic equation and we transform back: (1) apply \mathcal{L} to the differential equation for y, obtaining an algebraic equation for \mathcal{L}[y]; (2) solve the algebraic equation for \mathcal{L}[y]; (3) transform back (using the table) to obtain y.
Example 4.1.8: Use the Laplace transform to find the solution y of
y'' + 9y = 0, \qquad y(0) = y_0, \qquad y'(0) = y_1.
Remark: Notice we already know what the solution of this problem is. Following § 2.3 we need to find the roots of
p(r) = r^2 + 9 \quad\Rightarrow\quad r_\pm = \pm 3\, i,
and then we get the general solution
y(t) = c_+ \cos(3t) + c_- \sin(3t).
Then the initial conditions will say that
y(t) = y_0 \cos(3t) + \frac{y_1}{3} \sin(3t).
We now solve this problem using the Laplace transform method.
Solution: We now use the Laplace transform method:
\mathcal{L}[y'' + 9y] = \mathcal{L}[0] = 0.
The Laplace transform is a linear transformation,
\mathcal{L}[y''] + 9\, \mathcal{L}[y] = 0.
But the Laplace transform converts derivatives into multiplications,
s^2\, \mathcal{L}[y] - s\, y(0) - y'(0) + 9\, \mathcal{L}[y] = 0.
This is an algebraic equation for \mathcal{L}[y]. It can be solved by rearranging terms and using the initial conditions,
(s^2 + 9)\, \mathcal{L}[y] = s\, y_0 + y_1 \quad\Rightarrow\quad \mathcal{L}[y] = y_0\, \frac{s}{(s^2 + 9)} + y_1\, \frac{1}{(s^2 + 9)}.
From Table 2, \frac{s}{(s^2 + 9)} = \mathcal{L}[\cos(3t)] and \frac{3}{(s^2 + 9)} = \mathcal{L}[\sin(3t)], so
\mathcal{L}[y] = y_0\, \mathcal{L}[\cos(3t)] + \frac{y_1}{3}\, \mathcal{L}[\sin(3t)] \quad\Rightarrow\quad y(t) = y_0 \cos(3t) + \frac{y_1}{3} \sin(3t). \qquad C
4.1.5. Exercises.
4.1.1.- . 4.1.2.- .
4.2.1. Solving Differential Equations. As we see in the sketch above, we start with a differential equation for a function y. We first compute the Laplace transform of the whole differential equation. Then we use the linearity of the Laplace transform, Theorem 4.1.4, and the property that derivatives are converted into multiplications, Theorem 4.1.5, to transform the differential equation into an algebraic equation for \mathcal{L}[y]. Let us see how this works in a simple example, a first order linear equation with constant coefficients; we already solved it in § 1.1.
Example 4.2.1: Use the Laplace transform to find the solution y to the initial value problem
y' + 2y = 0, \qquad y(0) = 3.
Solution: In § 1.1 we saw one way to solve this problem, using the integrating factor method. One can check that the solution is y(t) = 3\, e^{-2t}. We now use the Laplace transform. First, compute the Laplace transform of the differential equation,
\mathcal{L}[y' + 2y] = \mathcal{L}[0] = 0.
Theorem 4.1.4 says the Laplace transform is a linear operation, that is,
\mathcal{L}[y'] + 2\, \mathcal{L}[y] = 0.
Theorem 4.1.5 relates derivatives and multiplications, as follows,
s\, \mathcal{L}[y] - y(0) + 2\, \mathcal{L}[y] = 0 \quad\Rightarrow\quad (s+2)\, \mathcal{L}[y] = y(0).
In the last equation we have been able to transform the original differential equation for y into an algebraic equation for \mathcal{L}[y]. We can solve for the unknown \mathcal{L}[y] as follows,
\mathcal{L}[y] = \frac{y(0)}{s+2} \quad\Rightarrow\quad \mathcal{L}[y] = \frac{3}{s+2},
where in the last step we introduced the initial condition y(0) = 3. From the list of Laplace transforms given in § 4.1 we know that
\mathcal{L}[e^{at}] = \frac{1}{s-a} \quad\Rightarrow\quad \frac{3}{s+2} = 3\, \mathcal{L}[e^{-2t}] = \mathcal{L}[3\, e^{-2t}].
So we arrive at \mathcal{L}[y(t)] = \mathcal{L}[3\, e^{-2t}]. Here is where we need one more property of the Laplace transform. We show right after this example that
\mathcal{L}[y(t)] = \mathcal{L}[3\, e^{-2t}] \quad\Rightarrow\quad y(t) = 3\, e^{-2t}.
This property is called one-to-one. Hence the only solution is y(t) = 3\, e^{-2t}. C
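The three steps of the method translate directly into a short computation. The sketch below is ours, not part of the notes; it reproduces this example with sympy, where the symbol Y stands for L[y].

import sympy as sp

t, s = sp.symbols('t s', positive=True)
Y = sp.symbols('Y')                            # stands for L[y]
Ys = sp.solve(sp.Eq(s*Y - 3 + 2*Y, 0), Y)[0]   # steps (1)-(2): Y = 3/(s + 2)
print(Ys)
print(sp.inverse_laplace_transform(Ys, s, t))  # step (3): 3*exp(-2*t)*Heaviside(t)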
4.2.2. One-to-One Property. Let us repeat the method we used to solve the differential equation in Example 4.2.1. We first computed the Laplace transform of the whole differential equation. Then we used the linearity of the Laplace transform, Theorem 4.1.4, and the property that derivatives are converted into multiplications, Theorem 4.1.5, to transform the differential equation into an algebraic equation for \mathcal{L}[y]. We solved the algebraic equation and we got an expression of the form
\mathcal{L}[y(t)] = H(s),
where we have collected all the terms that come from the Laplace transformed differential equation into the function H. We then used a Laplace transform table to find a function h such that
\mathcal{L}[h(t)] = H(s).
We arrived at an equation of the form
\mathcal{L}[y(t)] = \mathcal{L}[h(t)].
Clearly, y = h is one solution of the equation above, hence a solution to the differential equation. We now show that there are no solutions to the equation \mathcal{L}[y] = \mathcal{L}[h] other than y = h. The reason is that the Laplace transform on continuous functions of exponential order is a one-to-one transformation, also called injective.
Theorem 4.2.1 (One-to-One). If f, g are continuous on [0, \infty) and of exponential order, then
\mathcal{L}[f] = \mathcal{L}[g] \quad\Rightarrow\quad f = g.
Remarks:
(a) The result above holds for continuous functions f and g. But it can be extended to piecewise continuous functions. In the case of piecewise continuous functions f and g satisfying \mathcal{L}[f] = \mathcal{L}[g] one can prove that f = g + h, where h is a null function, meaning that \int_0^T h(t)\, dt = 0 for all T > 0. See Churchill's textbook [4], page 14.
(b) Once we know that the Laplace transform is a one-to-one transformation, we can define the inverse transformation in the usual way.
Definition 4.2.2. The inverse Laplace transform, denoted \mathcal{L}^{-1}, of a function F is
\mathcal{L}^{-1}[F(s)] = f(t) \quad\Leftrightarrow\quad F(s) = \mathcal{L}[f(t)].
Remark: There is an explicit formula for the inverse Laplace transform, which involves an integral on the complex plane,
\mathcal{L}^{-1}[F(s)] = \frac{1}{2\pi i} \lim_{c \to \infty} \int_{a-ic}^{a+ic} e^{st}\, F(s)\, ds.
See for example Churchill's textbook [4], page 176. However, we do not use this formula in these notes, since it involves integration on the complex plane.
Proof of Theorem 4.2.1: The proof is based on a clever change of variables and on the Weierstrass Approximation Theorem of continuous functions by polynomials. Before we get to the change of variables we need to do some rewriting. Introduce the function u = f - g; then the linearity of the Laplace transform implies
\mathcal{L}[u] = \mathcal{L}[f - g] = \mathcal{L}[f] - \mathcal{L}[g] = 0.
What we need to show is that the function u vanishes identically. Let us start with the definition of the Laplace transform,
\mathcal{L}[u] = \int_0^{\infty} e^{-st}\, u(t)\, dt.
We know that f and g are of exponential order, say s_0, therefore u is of exponential order s_0, meaning that there exist positive constants k and T such that
|u(t)| < k\, e^{s_0 t}, \qquad t > T.
Evaluate \mathcal{L}[u] at s = s_1 + n + 1, where s_1 is any real number such that s_1 > s_0, and n is any positive integer. We get
\mathcal{L}[u] \Big|_{s = s_1+n+1} = \int_0^{\infty} e^{-(s_1+n+1)t}\, u(t)\, dt = \int_0^{\infty} e^{-s_1 t}\, e^{-(n+1)t}\, u(t)\, dt.
We now do the substitution y = e^{-t}, so dy = -e^{-t}\, dt,
\mathcal{L}[u] \Big|_{s = s_1+n+1} = \int_1^0 y^{s_1}\, y^n\, u\big( -\ln(y) \big)\, (-dy) = \int_0^1 y^{s_1}\, y^n\, u\big( -\ln(y) \big)\, dy.
Introduce the function v(y) = y^{s_1}\, u\big( -\ln(y) \big), so the integral is
\mathcal{L}[u] \Big|_{s = s_1+n+1} = \int_0^1 y^n\, v(y)\, dy. \qquad (4.2.1)
We know that \mathcal{L}[u] exists because u is continuous and of exponential order, so the function v does not diverge at y = 0. To double check this, recall that t = -\ln(y) \to \infty as y \to 0^+, and u is of exponential order s_0, hence
\lim_{y \to 0^+} |v(y)| = \lim_{t \to \infty} e^{-s_1 t}\, |u(t)| \le \lim_{t \to \infty} k\, e^{-(s_1 - s_0)t} = 0.
Our main hypothesis is that \mathcal{L}[u] = 0 for all values of s such that \mathcal{L}[u] is defined, in particular s = s_1 + n + 1. By looking at Eq. (4.2.1) this means that
\int_0^1 y^n\, v(y)\, dy = 0, \qquad n = 1, 2, 3, \cdots.
The equation above and the linearity of the integral imply that this function v is perpendicular to every polynomial p, that is,
\int_0^1 p(y)\, v(y)\, dy = 0, \qquad (4.2.2)
for every polynomial p. The last term in the second equation above vanishes because of Eq. (4.2.2), therefore
\int_0^1 v^2(y)\, dy = \int_0^1 \big( v(y) - p(y) \big)\, v(y)\, dy \le \int_0^1 \big| v(y) - p(y) \big|\, |v(y)|\, dy \le \max_{y \in [0,1]} |v(y)| \int_0^1 \big| v(y) - p(y) \big|\, dy. \qquad (4.2.3)
We remark that the inequality above is true for every polynomial p. Here is where we use the Weierstrass Approximation Theorem, which essentially says that every continuous function on a closed interval can be approximated by a polynomial.
The proof of this theorem can be found in a real analysis textbook. Weierstrass' result implies that, given v and \epsilon > 0, there exists a polynomial p_\epsilon such that the inequality in (4.2.3) has the form
\int_0^1 v^2(y)\, dy \le \max_{y \in [0,1]} |v(y)| \int_0^1 \big| v(y) - p_\epsilon(y) \big|\, dy \le \max_{y \in [0,1]} |v(y)|\ \epsilon.
that is, the polynomial has two real roots. In this case we factorize the denominator,
\mathcal{L}[y] = \frac{(s-1)}{(s-2)(s+1)}.
The partial fraction decomposition of the right-hand side in the equation above is the following: find constants a and b such that
\frac{(s-1)}{(s-2)(s+1)} = \frac{a}{s-2} + \frac{b}{s+1}.
A simple calculation shows
\frac{(s-1)}{(s-2)(s+1)} = \frac{a}{s-2} + \frac{b}{s+1} = \frac{a(s+1) + b(s-2)}{(s-2)(s+1)} = \frac{s(a+b) + (a-2b)}{(s-2)(s+1)}.
Hence the constants a and b must be solutions of the equations
(s-1) = s(a+b) + (a-2b) \quad\Rightarrow\quad a + b = 1, \qquad a - 2b = -1.
The solution is a = \frac{1}{3} and b = \frac{2}{3}. Hence,
\mathcal{L}[y] = \frac{1}{3}\, \frac{1}{(s-2)} + \frac{2}{3}\, \frac{1}{(s+1)}.
From the list of Laplace transforms given in Table 2 we know that
\mathcal{L}[e^{at}] = \frac{1}{s-a} \quad\Rightarrow\quad \frac{1}{s-2} = \mathcal{L}[e^{2t}], \qquad \frac{1}{s+1} = \mathcal{L}[e^{-t}].
So we arrive at the equation
\mathcal{L}[y] = \frac{1}{3}\, \mathcal{L}[e^{2t}] + \frac{2}{3}\, \mathcal{L}[e^{-t}] = \mathcal{L}\Big[ \frac{1}{3} \big( e^{2t} + 2\, e^{-t} \big) \Big].
We conclude that
y(t) = \frac{1}{3} \big( e^{2t} + 2\, e^{-t} \big). \qquad C
The Partial Fraction Method is usually introduced in a second course of Calculus to inte-
grate rational functions. We need it here to use Table 2 to find inverse Laplace transforms.
The method applies to rational functions
\[
R(s) = \frac{Q(s)}{P(s)},
\]
where P, Q are polynomials and the degree of the numerator is less than the degree of the
denominator. In the example above,
\[
R(s) = \frac{(s-1)}{(s^2 - s - 2)}.
\]
One starts by rewriting the polynomial in the denominator as a product of polynomials of
degree two or one. In the example above,
\[
R(s) = \frac{(s-1)}{(s-2)(s+1)}.
\]
One then rewrites the rational function as an addition of simpler rational functions. In the
example above,
\[
R(s) = \frac{a}{(s-2)} + \frac{b}{(s+1)}.
\]
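As a quick cross-check, a computer algebra system can reproduce the decomposition above.
A minimal sketch, assuming the Python library SymPy is available (the script is illustrative
and not part of the original notes):

    from sympy import symbols, apart

    s = symbols('s')
    R = (s - 1)/((s - 2)*(s + 1))

    # apart() computes the partial fraction decomposition of R in the variable s.
    print(apart(R, s))   # prints: 1/(3*(s - 2)) + 2/(3*(s + 1))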
We now solve a few examples to recall the different partial fraction cases that can appear
when solving differential equations.

Example 4.2.3: Use the Laplace transform to find the solution y to the initial value problem
\[
y'' - 4y' + 4y = 0, \qquad y(0) = 1, \quad y'(0) = 1.
\]

Example 4.2.4: Use the Laplace transform to find the solution y to the initial value problem
\[
y'' - 4y' + 4y = 3\, e^{t}, \qquad y(0) = 0, \quad y'(0) = 0.
\]
So we get
\[
\mathcal{L}[y] = \frac{3}{(s-1)(s-2)^2} = \frac{3}{s-1} + \frac{-3s+9}{(s-2)^2}.
\]
One last trick is needed on the last term above,
\[
\frac{-3s+9}{(s-2)^2} = \frac{-3(s-2+2)+9}{(s-2)^2} = \frac{-3(s-2)-6+9}{(s-2)^2}
= \frac{-3}{(s-2)} + \frac{3}{(s-2)^2}.
\]
So we finally get
\[
\mathcal{L}[y] = \frac{3}{s-1} - \frac{3}{(s-2)} + \frac{3}{(s-2)^2}.
\]
From our Laplace transforms Table we know that
\[
\mathcal{L}[e^{at}] = \frac{1}{s-a} \;\Rightarrow\; \frac{1}{s-2} = \mathcal{L}[e^{2t}],
\qquad
\mathcal{L}[t\, e^{at}] = \frac{1}{(s-a)^2} \;\Rightarrow\; \frac{1}{(s-2)^2} = \mathcal{L}[t\, e^{2t}].
\]
So we arrive at the formula
\[
\mathcal{L}[y] = 3\, \mathcal{L}[e^{t}] - 3\, \mathcal{L}[e^{2t}] + 3\, \mathcal{L}[t\, e^{2t}]
= \mathcal{L}\bigl[3\,(e^{t} - e^{2t} + t\, e^{2t})\bigr],
\]
so we conclude that y(t) = 3\,(e^{t} - e^{2t} + t\, e^{2t}).
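Results like this one are easy to cross-check symbolically. A minimal sketch, assuming
SymPy (illustrative, not from the notes):

    from sympy import symbols, Function, Eq, exp, dsolve

    t = symbols('t')
    y = Function('y')

    # Solve y'' - 4y' + 4y = 3 e^t with y(0) = 0, y'(0) = 0 directly.
    ode = Eq(y(t).diff(t, 2) - 4*y(t).diff(t) + 4*y(t), 3*exp(t))
    sol = dsolve(ode, y(t), ics={y(0): 0, y(t).diff(t).subs(t, 0): 0})

    # Expect y(t) = 3*(exp(t) - exp(2*t) + t*exp(2*t)).
    print(sol.rhs.expand())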
Example 4.2.5: Use the Laplace transform to find the solution y to the initial value problem
\[
y'' - 4y' + 4y = 3\sin(2t), \qquad y(0) = 1, \quad y'(0) = 1.
\]
We now use partial fractions to simplify the third term on the right-hand side of Eq. (4.2.4).
The appropriate partial fraction decomposition for this term is the following: Find constants
a, b, c, d, such that
\[
\frac{6}{(s-2)^2(s^2+4)} = \frac{as+b}{s^2+4} + \frac{c}{(s-2)} + \frac{d}{(s-2)^2}.
\]
Take common denominator on the right-hand side above, and one obtains the system
\[
a + c = 0, \qquad
-4a + b - 2c + d = 0, \qquad
4a - 4b + 4c = 0, \qquad
4b - 8c + 4d = 6.
\]
The solution for this linear system of equations is the following:
\[
a = \frac{3}{8}, \qquad b = 0, \qquad c = -\frac{3}{8}, \qquad d = \frac{3}{4}.
\]
Therefore,
\[
\frac{6}{(s-2)^2(s^2+4)}
= \frac{3}{8}\, \frac{s}{s^2+4} - \frac{3}{8}\, \frac{1}{(s-2)} + \frac{3}{4}\, \frac{1}{(s-2)^2}.
\]
We can rewrite this expression above in terms of the Laplace transforms given in Table 2,
in \S 4.1, as follows,
\[
\frac{6}{(s-2)^2(s^2+4)}
= \frac{3}{8}\, \mathcal{L}[\cos(2t)] - \frac{3}{8}\, \mathcal{L}[e^{2t}] + \frac{3}{4}\, \mathcal{L}[t\, e^{2t}],
\]
and using the linearity of the Laplace transform,
\[
\frac{6}{(s-2)^2(s^2+4)}
= \mathcal{L}\Bigl[\frac{3}{8}\cos(2t) - \frac{3}{8}\, e^{2t} + \frac{3}{4}\, t\, e^{2t}\Bigr]. \tag{4.2.6}
\]
Finally, introducing Eqs. (4.2.5) and (4.2.6) into Eq. (4.2.4) we obtain
\[
\mathcal{L}[y(t)] = \mathcal{L}\Bigl[(1-t)\, e^{2t} + \frac{3}{8}\,(2t-1)\, e^{2t} + \frac{3}{8}\cos(2t)\Bigr].
\]
Since the Laplace transform is an invertible transformation, we conclude that
\[
y(t) = (1-t)\, e^{2t} + \frac{3}{8}\,(2t-1)\, e^{2t} + \frac{3}{8}\cos(2t).
\]
C
4.2.4. Higher Order IVP. The Laplace transform method can be used with linear differ-
ential equations of higher order than second order, as long as the equation coefficients are
constant. Below we show how we can solve a fourth order equation.

Example 4.2.6: Use the Laplace transform to find the solution y to the initial value problem
\[
y^{(4)} - 4y = 0, \qquad
y(0) = 1, \quad y'(0) = 0, \quad y''(0) = -2, \quad y'''(0) = 0.
\]
4.2.5. Exercises.
4.2.1.- . 4.2.2.- .
Example 4.3.1: Graph the step function u, its right translation u_c(t) = u(t-c), and its
left translation u_{-c}(t) = u(t+c), for c > 0.

Solution: The step function u and its right and left translations are plotted in Fig. 16.

Figure 16. The graph of the step function given in Eq. (4.3.1), a right
and a left translation by a constant c > 0, respectively, of this step function.
Recall that given a function with values f(t) and a positive constant c, then f(t-c) and
f(t+c) are the function values of the right translation and the left translation, respectively,
of the original function f. In Fig. 17 we plot the graphs of the functions f(t) = e^{at},
g(t) = u(t)\, e^{at}, and their respective right translations by c > 0.

Figure 17. The function f(t) = e^{at}, its right translation by c > 0, the
function g(t) = u(t)\, e^{at}, and its right translation by c.
Right and left translations of step functions are useful to construct bump functions.
Example 4.3.2: Graph the bump function b(t) = u(t-a) - u(t-b), where a < b.

Solution: The bump function we need to graph is
\[
b(t) = u(t-a) - u(t-b)
\quad\Leftrightarrow\quad
b(t) = \begin{cases} 0, & t < a, \\ 1, & a \leqslant t < b, \\ 0, & t \geqslant b. \end{cases}
\tag{4.3.2}
\]
The graph of a bump function is given in Fig. 18, constructed from two step functions. Step
and bump functions are useful to construct more general piecewise continuous functions.

Figure 18. The step functions u(t-a) and u(t-b), and the bump function
b(t) = u(t-a) - u(t-b).
4.3.2. The Laplace Transform of Steps. We compute the Laplace transform of a step
function using the definition of the Laplace transform.

Theorem 4.3.2. For every number c \in \mathbb{R} and every s > 0 holds
\[
\mathcal{L}[u(t-c)] =
\begin{cases}
\dfrac{e^{-cs}}{s} & \text{for } c > 0, \\[1ex]
\dfrac{1}{s} & \text{for } c < 0.
\end{cases}
\]

Proof of Theorem 4.3.2: Consider the case c > 0. The Laplace transform is
\[
\mathcal{L}[u(t-c)] = \int_0^\infty e^{-st}\, u(t-c)\, dt = \int_c^\infty e^{-st}\, dt,
\]
where we used that the step function vanishes for t < c. Now compute the improper integral,
\[
\mathcal{L}[u(t-c)] = \lim_{N\to\infty} \Bigl[-\frac{1}{s}\bigl(e^{-Ns} - e^{-cs}\bigr)\Bigr] = \frac{e^{-cs}}{s}
\quad\Rightarrow\quad
\mathcal{L}[u(t-c)] = \frac{e^{-cs}}{s}.
\]
Consider now the case of c < 0. The step function is identically equal to one in the domain
of integration of the Laplace transform, which is [0, \infty), hence
\[
\mathcal{L}[u(t-c)] = \int_0^\infty e^{-st}\, u(t-c)\, dt = \int_0^\infty e^{-st}\, dt = \mathcal{L}[1] = \frac{1}{s}.
\]
This establishes the Theorem.
Example 4.3.4: Compute \mathcal{L}[3\, u(t-2)].

Solution: The Laplace transform is a linear operation, so
\[
\mathcal{L}[3\, u(t-2)] = 3\, \mathcal{L}[u(t-2)],
\]
and Theorem 4.3.2 above implies that \mathcal{L}[3\, u(t-2)] = \dfrac{3\, e^{-2s}}{s}. C
Remarks:
(a) The Laplace transform is an invertible transformation in the set of functions we work
with in our class.
(b) \mathcal{L}[f] = F \quad\Leftrightarrow\quad \mathcal{L}^{-1}[F] = f.

Example 4.3.5: Compute \mathcal{L}^{-1}\Bigl[\dfrac{e^{-3s}}{s}\Bigr].

Solution: Theorem 4.3.2 says that \dfrac{e^{-3s}}{s} = \mathcal{L}[u(t-3)], so
\mathcal{L}^{-1}\Bigl[\dfrac{e^{-3s}}{s}\Bigr] = u(t-3). C
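These step function transforms are easy to reproduce symbolically. A minimal sketch,
assuming SymPy (illustrative, not from the notes):

    from sympy import symbols, laplace_transform, inverse_laplace_transform, Heaviside, exp

    t, s = symbols('t s', positive=True)

    # Laplace transform of a right-translated step u(t - 2); expect exp(-2*s)/s.
    print(laplace_transform(Heaviside(t - 2), t, s, noconds=True))

    # Inverse transform of exp(-3*s)/s; expect the step function u(t - 3).
    print(inverse_laplace_transform(exp(-3*s)/s, s, t))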
4.3.3. Translation Identities. We now introduce two properties relating the Laplace
transform and translations. The first property relates the Laplace transform of a trans-
lation with a multiplication by an exponential. The second property can be thought of as
the inverse of the first one.

Theorem 4.3.3 (Translation Identities). If \mathcal{L}[f(t)](s) exists for s > a, then
\[
\mathcal{L}[u(t-c)\, f(t-c)] = e^{-cs}\, \mathcal{L}[f(t)], \qquad s > a, \quad c > 0, \tag{4.3.3}
\]
\[
\mathcal{L}[e^{ct}\, f(t)] = \mathcal{L}[f(t)](s-c), \qquad s > a + c, \quad c \in \mathbb{R}. \tag{4.3.4}
\]
Example 4.3.6: Take f(t) = \cos(t) and write the equations given by the Theorem above.

Solution: Since \mathcal{L}[\cos(t)] = \dfrac{s}{s^2+1}, then
\[
\mathcal{L}[u(t-c)\cos(t-c)] = e^{-cs}\, \frac{s}{s^2+1},
\qquad
\mathcal{L}[e^{ct}\cos(t)] = \frac{(s-c)}{(s-c)^2+1}.
\]
C
Remarks:
(a) We can highlight the main idea in the theorem above as follows:
\[
\mathcal{L}\bigl[\text{right-translation}\,(u\, f)\bigr] = (\text{exponential})\, \mathcal{L}[f],
\qquad
\mathcal{L}\bigl[(\text{exponential})\,(f)\bigr] = \text{translation}\bigl(\mathcal{L}[f]\bigr).
\]
(b) Denoting F(s) = \mathcal{L}[f(t)], then an equivalent expression for Eqs. (4.3.3)-(4.3.4) is
\[
\mathcal{L}[u(t-c)\, f(t-c)] = e^{-cs}\, F(s),
\qquad
\mathcal{L}[e^{ct}\, f(t)] = F(s-c).
\]
(c) The inverse form of Eqs. (4.3.3)-(4.3.4) is given by,
\[
\mathcal{L}^{-1}[e^{-cs}\, F(s)] = u(t-c)\, f(t-c), \tag{4.3.5}
\]
\[
\mathcal{L}^{-1}[F(s-c)] = e^{ct}\, f(t). \tag{4.3.6}
\]
(d) Eq. (4.3.4) holds for all c \in \mathbb{R}, while Eq. (4.3.3) holds only for c > 0.
(e) Show that in the case that c < 0 the following equation holds,
\[
\mathcal{L}[u(t+|c|)\, f(t+|c|)] = e^{|c|s}\Bigl(\mathcal{L}[f(t)] - \int_0^{|c|} e^{-st}\, f(t)\, dt\Bigr).
\]
Proof of Theorem 4.3.3: The proof is again based on a change of the integration variable.
We start with Eq. (4.3.3), as follows,
\[
\begin{aligned}
\mathcal{L}[u(t-c)\, f(t-c)] &= \int_0^\infty e^{-st}\, u(t-c)\, f(t-c)\, dt \\
&= \int_c^\infty e^{-st}\, f(t-c)\, dt, \qquad \tau = t-c,\ d\tau = dt,\ c > 0, \\
&= \int_0^\infty e^{-s(\tau+c)}\, f(\tau)\, d\tau \\
&= e^{-cs} \int_0^\infty e^{-s\tau}\, f(\tau)\, d\tau = e^{-cs}\, \mathcal{L}[f(t)].
\end{aligned}
\]
Solution: Since \mathcal{L}[\cos(at)] = \dfrac{s}{s^2+a^2}, then
\[
\mathcal{L}\bigl[u(t-2)\cos\bigl(a(t-2)\bigr)\bigr] = e^{-2s}\, \frac{s}{(s^2+a^2)},
\qquad
\mathcal{L}[e^{3t}\cos(at)] = \frac{(s-3)}{(s-3)^2+a^2}.
\]
C
Solution: The idea is to rewrite the function f so we can use the Laplace transform Table 2,
in \S 4.1, to compute its Laplace transform. Since the function f vanishes for all t < 1, we
use step functions to write f as
\[
f(t) = u(t-1)\,(t^2 - 2t + 2).
\]
Now, notice that completing the square we obtain,
\[
t^2 - 2t + 2 = (t^2 - 2t + 1) - 1 + 2 = (t-1)^2 + 1.
\]
The polynomial is the parabola t^2 translated to the right and up by one. So f is a
discontinuous function, as can be seen in Fig. 20.
Example 4.3.11: Find the function f such that \mathcal{L}[f(t)] = \dfrac{e^{-4s}}{s^2+5}.

Solution: Notice that
\[
\mathcal{L}[f(t)] = e^{-4s}\, \frac{1}{s^2+5}
\quad\Rightarrow\quad
\mathcal{L}[f(t)] = \frac{1}{\sqrt{5}}\, e^{-4s}\, \frac{\sqrt{5}}{s^2+5}.
\]
Recall that \mathcal{L}[\sin(at)] = \dfrac{a}{(s^2+a^2)}, then
\[
\mathcal{L}[f(t)] = \frac{1}{\sqrt{5}}\, e^{-4s}\, \mathcal{L}[\sin(\sqrt{5}\, t)].
\]
But the translation identity
\[
e^{-cs}\, \mathcal{L}[f(t)] = \mathcal{L}[u(t-c)\, f(t-c)]
\]
implies
\[
\mathcal{L}[f(t)] = \frac{1}{\sqrt{5}}\, \mathcal{L}\bigl[u(t-4)\sin\bigl(\sqrt{5}\,(t-4)\bigr)\bigr],
\]
hence we obtain
\[
f(t) = \frac{1}{\sqrt{5}}\, u(t-4)\sin\bigl(\sqrt{5}\,(t-4)\bigr).
\]
C
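A quick symbolic cross-check of this inverse transform, assuming SymPy (an illustrative
sketch, not part of the notes):

    from sympy import symbols, inverse_laplace_transform, exp

    t, s = symbols('t s', positive=True)

    # Invert e^{-4s}/(s^2 + 5); expect u(t-4) sin(sqrt(5)(t-4))/sqrt(5).
    F = exp(-4*s)/(s**2 + 5)
    print(inverse_laplace_transform(F, s, t))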
Example 4.3.12: Find the function f(t) such that \mathcal{L}[f(t)] = \dfrac{(s-1)}{(s-2)^2+3}.

Solution: We first rewrite the right-hand side above as follows,
\[
\begin{aligned}
\mathcal{L}[f(t)] &= \frac{(s-1-1+1)}{(s-2)^2+3} \\
&= \frac{(s-2)}{(s-2)^2+3} + \frac{1}{(s-2)^2+3} \\
&= \frac{(s-2)}{(s-2)^2+(\sqrt{3})^2} + \frac{1}{\sqrt{3}}\, \frac{\sqrt{3}}{(s-2)^2+(\sqrt{3})^2} \\
&= \mathcal{L}[\cos(\sqrt{3}\, t)](s-2) + \frac{1}{\sqrt{3}}\, \mathcal{L}[\sin(\sqrt{3}\, t)](s-2).
\end{aligned}
\]
But the translation identity \mathcal{L}[f(t)](s-c) = \mathcal{L}[e^{ct} f(t)] implies
\[
\mathcal{L}[f(t)] = \mathcal{L}\bigl[e^{2t}\cos(\sqrt{3}\, t)\bigr] + \frac{1}{\sqrt{3}}\, \mathcal{L}\bigl[e^{2t}\sin(\sqrt{3}\, t)\bigr].
\]
So, we conclude that
\[
f(t) = \frac{e^{2t}}{\sqrt{3}}\Bigl[\sqrt{3}\cos(\sqrt{3}\, t) + \sin(\sqrt{3}\, t)\Bigr].
\]
C
Example 4.3.13: Find \mathcal{L}^{-1}\Bigl[\dfrac{2\, e^{-3s}}{s^2-4}\Bigr].

Solution: Since \mathcal{L}^{-1}\Bigl[\dfrac{a}{s^2-a^2}\Bigr] = \sinh(at) and
\mathcal{L}^{-1}\bigl[e^{-cs}\, \hat f(s)\bigr] = u(t-c)\, f(t-c), then
\[
\mathcal{L}^{-1}\Bigl[\frac{2\, e^{-3s}}{s^2-4}\Bigr]
= \mathcal{L}^{-1}\Bigl[e^{-3s}\, \frac{2}{s^2-4}\Bigr]
\quad\Rightarrow\quad
\mathcal{L}^{-1}\Bigl[\frac{2\, e^{-3s}}{s^2-4}\Bigr] = u(t-3)\sinh\bigl(2(t-3)\bigr).
\]
C
Example 4.3.14: Find a function f such that \mathcal{L}[f(t)] = \dfrac{e^{-2s}}{s^2+s-2}.

Solution: Since the right-hand side above does not appear in the Laplace transform Table
in \S 4.1, we need to simplify it in an appropriate way. The plan is to rewrite the denominator
of the rational function 1/(s^2+s-2), so we can use partial fractions to simplify this rational
function. We first find out whether this denominator has real or complex roots:
\[
s_\pm = \frac{1}{2}\bigl(-1 \pm \sqrt{1+8}\bigr)
\quad\Rightarrow\quad
s_+ = 1, \qquad s_- = -2.
\]
We are in the case of real roots, so we rewrite
\[
s^2 + s - 2 = (s-1)\,(s+2).
\]
4.3.4. Solving Differential Equations. The last three examples in this section show how
to use the methods presented above to solve differential equations with discontinuous source
functions.

Example 4.3.15: Use the Laplace transform to find the solution of the initial value problem
\[
y' + 2y = u(t-4), \qquad y(0) = 3.
\]

Example 4.3.16: Use the Laplace transform to find the solution to the initial value problem
\[
y'' + y' + \frac{5}{4}\, y = b(t), \qquad y(0) = 0, \quad y'(0) = 0, \qquad
b(t) = \begin{cases} 1, & 0 \leqslant t < \pi, \\ 0, & t \geqslant \pi. \end{cases}
\tag{4.3.8}
\]
Figure 21. The graph of the step function u, its translation u(t-\pi), and
the bump function b as given in Eq. (4.3.8).
The source can be written as b(t) = u(t) - u(t-\pi). This last expression for b is particularly
useful to find its Laplace transform,
\[
\mathcal{L}[b(t)] = \mathcal{L}[u(t)] - \mathcal{L}[u(t-\pi)] = \frac{1}{s} - e^{-\pi s}\, \frac{1}{s}
\quad\Rightarrow\quad
\mathcal{L}[b(t)] = (1 - e^{-\pi s})\, \frac{1}{s}.
\]
Now Laplace transform the whole equation,
\[
\mathcal{L}[y''] + \mathcal{L}[y'] + \frac{5}{4}\, \mathcal{L}[y] = \mathcal{L}[b].
\]
Since the initial conditions are y(0) = 0 and y'(0) = 0, we obtain
\[
\Bigl(s^2 + s + \frac{5}{4}\Bigr)\, \mathcal{L}[y] = (1 - e^{-\pi s})\, \frac{1}{s}
\quad\Rightarrow\quad
\mathcal{L}[y] = (1 - e^{-\pi s})\, \frac{1}{s\,\bigl(s^2 + s + \frac{5}{4}\bigr)}.
\]
Introduce the function
\[
H(s) = \frac{1}{s\,\bigl(s^2 + s + \frac{5}{4}\bigr)}
\quad\Rightarrow\quad
\mathcal{L}[y] = (1 - e^{-\pi s})\, H(s).
\]
That is, we only need to find the inverse Laplace transform of H. We use partial fractions to
simplify the expression of H. We first find out whether the denominator has real or complex
roots:
\[
s^2 + s + \frac{5}{4} = 0
\quad\Rightarrow\quad
s_\pm = \frac{1}{2}\bigl(-1 \pm \sqrt{1-5}\bigr),
\]
so the roots are complex valued. An appropriate partial fraction decomposition is
\[
H(s) = \frac{1}{s\,\bigl(s^2 + s + \frac{5}{4}\bigr)} = \frac{a}{s} + \frac{(bs+c)}{s^2 + s + \frac{5}{4}}.
\]
Therefore, we get
\[
1 = a\,\Bigl(s^2 + s + \frac{5}{4}\Bigr) + s\,(bs + c) = (a+b)\, s^2 + (a+c)\, s + \frac{5}{4}\, a.
\]
Example 4.3.17: Use the Laplace transform to find the solution to the initial value problem
\[
y'' + y' + \frac{5}{4}\, y = g(t), \qquad y(0) = 0, \quad y'(0) = 0, \qquad
g(t) = \begin{cases} \sin(t), & 0 \leqslant t < \pi, \\ 0, & t \geqslant \pi. \end{cases}
\tag{4.3.9}
\]

Solution: From Fig. 22, the source function g can be written as the following product,
\[
g(t) = \bigl[u(t) - u(t-\pi)\bigr] \sin(t),
\]
since u(t) - u(t-\pi) is a box function, taking value one in the interval [0, \pi] and zero on
the complement. Finally, notice that the equation \sin(t) = -\sin(t-\pi) implies that the
function g can be expressed as follows,
\[
g(t) = u(t)\sin(t) - u(t-\pi)\sin(t)
\quad\Rightarrow\quad
g(t) = u(t)\sin(t) + u(t-\pi)\sin(t-\pi).
\]
The last expression for g is particularly useful to find its Laplace transform,
Figure 22. The graph of the sine function, the box function u(t) - u(t-\pi),
and the source function g given in Eq. (4.3.9).
\[
\mathcal{L}[g(t)] = \frac{1}{(s^2+1)} + e^{-\pi s}\, \frac{1}{(s^2+1)}.
\]
With this last transform it is not difficult to solve the differential equation. As usual, Laplace
transform the whole equation,
\[
\mathcal{L}[y''] + \mathcal{L}[y'] + \frac{5}{4}\, \mathcal{L}[y] = \mathcal{L}[g].
\]
Since the initial conditions are y(0) = 0 and y'(0) = 0, we obtain
\[
\Bigl(s^2 + s + \frac{5}{4}\Bigr)\, \mathcal{L}[y] = (1 + e^{-\pi s})\, \frac{1}{(s^2+1)}
\quad\Rightarrow\quad
\mathcal{L}[y] = (1 + e^{-\pi s})\, \frac{1}{(s^2+1)\bigl(s^2 + s + \frac{5}{4}\bigr)}.
\]
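The step-function rewriting of g and its transform can be cross-checked symbolically. A
minimal sketch, assuming SymPy (illustrative, not from the notes):

    from sympy import symbols, laplace_transform, Heaviside, sin, pi, exp, simplify

    t, s = symbols('t s', positive=True)

    # The source of Eq. (4.3.9) written with step functions.
    g = Heaviside(t)*sin(t) + Heaviside(t - pi)*sin(t - pi)

    G = laplace_transform(g, t, s, noconds=True)
    # Expect 1/(s^2+1) + exp(-pi*s)/(s^2+1).
    print(simplify(G - (1 + exp(-pi*s))/(s**2 + 1)))   # prints 0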
4.3.5. Exercises.
4.3.1.- . 4.3.2.- .
4.4.1. Sequence of Functions and the Dirac Delta. A sequence of functions is a se-
quence whose elements are functions. If each element in the sequence is a continuous func-
tion, we say that this is a sequence of continuous functions. Given a sequence of functions
\{y_n\}, we compute the limit \lim_{n\to\infty} y_n(t) for a fixed t. The limit depends on t, so it is
a function of t, and we write it as
\[
\lim_{n\to\infty} y_n(t) = y(t).
\]
The domain of the limit function y is smaller than or equal to the domain of the y_n. The
limit of a sequence of continuous functions may or may not be a continuous function.

Example 4.4.1: The limit of the sequence below is a continuous function,
\[
f_n(t) = \sin\Bigl(\bigl(1 + \frac{1}{n}\bigr)\, t\Bigr) \to \sin(t) \quad\text{as}\quad n \to \infty.
\]
As usual in this section, the limit is computed for each fixed value of t. C
However, not every sequence of continuous functions has a continuous function as a limit.
Exercise: Find a sequence {un } so that its limit is the step function u defined in 4.3.
Although every function in the sequence \{u_n\} is continuous, the limit u is a discontinuous
function. It is not difficult to see that one can construct sequences of continuous functions
having no limit at all. A similar situation happens when one considers sequences of piecewise
continuous functions. In this case the limit could be a continuous function, a piecewise
continuous function, or not a function at all.

We now introduce a particular sequence of piecewise continuous functions with domain
\mathbb{R} such that the limit as n \to \infty does not exist for all values of the independent variable t.
The limit of the sequence is not a function with domain \mathbb{R}. In this case, the limit is a new
type of object that we will call Dirac's delta generalized function. Dirac's delta is the limit
of a sequence of particular bump functions.

Definition 4.4.1. The Dirac delta generalized function is the limit
\[
\delta(t) = \lim_{n\to\infty} \delta_n(t),
\]
for every fixed t \in \mathbb{R}, of the sequence of functions \{\delta_n\}_{n=1}^\infty,
\[
\delta_n(t) = n\,\Bigl[u(t) - u\Bigl(t - \frac{1}{n}\Bigr)\Bigr]. \tag{4.4.2}
\]
The sequence of bump functions introduced above can be rewritten as follows,
\[
\delta_n(t) = \begin{cases} 0, & t < 0, \\ n, & 0 \leqslant t < \dfrac{1}{n}, \\ 0, & t \geqslant \dfrac{1}{n}. \end{cases}
\]
We then obtain the equivalent expression,
\[
\delta(t) = \begin{cases} 0 & \text{for } t \neq 0, \\ \infty & \text{for } t = 0. \end{cases}
\]
The Dirac delta generalized function is the function identically zero on the domain
\mathbb{R} - \{0\}. Dirac's delta is not defined at t = 0, since the limit diverges at that point. If we
shift each element in the sequence by a real number c, then we define
\[
\delta(t-c) = \lim_{n\to\infty} \delta_n(t-c), \qquad c \in \mathbb{R}.
\]
This shifted Dirac's delta is identically zero on \mathbb{R} - \{c\} and diverges at t = c. If we shift
the graphs given in Fig. 24 by any real number c, one can see that
\[
\int_c^{c+1} \delta_n(t-c)\, dt = 1
\]
for every n \geqslant 1. Therefore, the sequence of integrals is the constant sequence, \{1, 1, \cdots\},
which has a trivial limit, 1, as n \to \infty. This says that the divergence at t = c of the sequence
\{\delta_n\} is of a very particular type. The area below the graph of the sequence elements is always
the same. We can say that this property of the sequence provides the main defining property
of the Dirac delta generalized function.

Using a limit procedure one can generalize several operations from a sequence to its
limit. For example, translations, linear combinations, multiplications of a function by
a generalized function, integration, and Laplace transforms.

Definition 4.4.2. We introduce the following operations on the Dirac delta:
\[
f(t)\,\delta(t-c) + g(t)\,\delta(t-c) = \lim_{n\to\infty} \bigl[f(t)\,\delta_n(t-c) + g(t)\,\delta_n(t-c)\bigr],
\]
\[
\int_a^b \delta(t-c)\, dt = \lim_{n\to\infty} \int_a^b \delta_n(t-c)\, dt,
\qquad
\mathcal{L}[\delta(t-c)] = \lim_{n\to\infty} \mathcal{L}[\delta_n(t-c)].
\]
Remark: The notation in the definitions above could be misleading. On the left-hand
sides above we use the same notation as we use for functions, although Dirac's delta is not
a function on \mathbb{R}. Take the integral, for example. When we integrate a function f, the
integration symbol means take a limit of Riemann sums, that is,
\[
\int_a^b f(t)\, dt = \lim_{n\to\infty} \sum_{i=0}^{n} f(x_i)\,\Delta x,
\qquad x_i = a + i\,\Delta x, \quad \Delta x = \frac{b-a}{n}.
\]
However, when f is a generalized function in the sense of a limit of a sequence of functions
\{f_n\}, then by the integration symbol we mean to compute a different limit,
\[
\int_a^b f(t)\, dt = \lim_{n\to\infty} \int_a^b f_n(t)\, dt.
\]
We use the same symbol, the integration, to mean two different things, depending on whether
we integrate a function or a generalized function. This remark also holds for all the oper-
ations we introduce on generalized functions, especially the Laplace transform, which will
be used often in the rest of this section.
4.4.2. Computations with the Dirac Delta. Once we have the definitions of operations
involving the Dirac delta, we can actually compute these limits. The following statement
summarizes a few interesting results. The first formula below says that the infinity we found
in the definition of Dirac's delta is of a very particular type; that infinity is such that Dirac's
delta is integrable, in the sense defined above, with integral equal to one.

Theorem 4.4.3. For every c \in \mathbb{R} and \epsilon > 0 holds
\[
\int_{c-\epsilon}^{c+\epsilon} \delta(t-c)\, dt = 1.
\]
c
Proof of Theorem 4.4.3: The integral of a Diracs delta generalized function is computed
as a limit of integrals,
Z c+ Z c+
(t c) dt = lim n (t c) dt.
c n c
If we choose n > 1/, equivalently 1/n < , then the domain of the functions in the sequence
is inside the interval (c , c + ), and we can write
Z c+ Z c+ n1
1
(t c) dt = lim n dt, for < .
c n c n
Theorem 4.4.4. If f is continuous on [a, b] and c \in [a, b), then
\[
\int_a^b \delta(t-c)\, f(t)\, dt = f(c).
\]

Proof of Theorem 4.4.4: We again compute the integral of a Dirac's delta as a limit of
a sequence of integrals,
\[
\begin{aligned}
\int_a^b \delta(t-c)\, f(t)\, dt
&= \lim_{n\to\infty} \int_a^b \delta_n(t-c)\, f(t)\, dt \\
&= \lim_{n\to\infty} \int_a^b n\,\Bigl[u(t-c) - u\Bigl(t-c-\frac{1}{n}\Bigr)\Bigr]\, f(t)\, dt \\
&= \lim_{n\to\infty} \int_c^{c+\frac{1}{n}} n\, f(t)\, dt, \qquad \frac{1}{n} < (b-c),
\end{aligned}
\]
where in the last line we used that c \in [a, b). Let F be any primitive of f, so
F(t) = \int f(t)\, dt. Then we can write,
\[
\begin{aligned}
\int_a^b \delta(t-c)\, f(t)\, dt
&= \lim_{n\to\infty} n\,\Bigl(F\Bigl(c + \frac{1}{n}\Bigr) - F(c)\Bigr) \\
&= \lim_{n\to\infty} \frac{1}{\bigl(\frac{1}{n}\bigr)}\,\Bigl(F\Bigl(c + \frac{1}{n}\Bigr) - F(c)\Bigr)
= F'(c) = f(c).
\end{aligned}
\]
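Both integral identities can be checked with a computer algebra system that understands
distributions. A minimal sketch, assuming SymPy (illustrative, not from the notes):

    from sympy import symbols, integrate, DiracDelta, cos

    t = symbols('t')

    # Theorem 4.4.3: the delta integrates to 1 over an interval containing c = 2.
    print(integrate(DiracDelta(t - 2), (t, 0, 5)))         # prints 1

    # Theorem 4.4.4: the delta picks out the value f(c); here f(t) = cos(t).
    print(integrate(DiracDelta(t - 2)*cos(t), (t, 0, 5)))  # prints cos(2)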
First Proof of Theorem 4.4.5: We use the previous theorem on the integral that defines
a Laplace transform. Although the previous theorem applies to definite integrals, not to
improper integrals, it can be extended to cover improper integrals. In this case we get
\[
\mathcal{L}[\delta(t-c)] = \int_0^\infty e^{-st}\, \delta(t-c)\, dt =
\begin{cases}
e^{-cs} & \text{for } c > 0, \\
0 & \text{for } c < 0.
\end{cases}
\]
Second Proof of Theorem 4.4.5: The Laplace transform of a Dirac's delta is computed
as a limit of Laplace transforms,
\[
\begin{aligned}
\mathcal{L}[\delta(t-c)] &= \lim_{n\to\infty} \mathcal{L}[\delta_n(t-c)] \\
&= \lim_{n\to\infty} \mathcal{L}\Bigl[n\,\Bigl(u(t-c) - u\Bigl(t-c-\frac{1}{n}\Bigr)\Bigr)\Bigr] \\
&= \lim_{n\to\infty} \int_0^\infty n\,\Bigl[u(t-c) - u\Bigl(t-c-\frac{1}{n}\Bigr)\Bigr]\, e^{-st}\, dt.
\end{aligned}
\]
The case c < 0 is simple. For \dfrac{1}{n} < |c| holds
\[
\mathcal{L}[\delta(t-c)] = \lim_{n\to\infty} \int_0^\infty 0\, dt
\quad\Rightarrow\quad
\mathcal{L}[\delta(t-c)] = 0, \qquad \text{for } s \in \mathbb{R}, \quad c < 0.
\]
4.4.3. Applications of the Dirac Delta. Dirac's delta generalized functions describe
impulsive forces in mechanical systems, such as the force exerted by a stick hitting a marble.
An impulsive force acts during an infinitely short time and transmits a finite momentum to
the system.
Example 4.4.3: Use Newton's equation of motion and Dirac's delta to describe the change
of momentum when a particle is hit by a hammer.

Solution: A point particle with mass m, moving in one space direction, x, with a force F
acting on it is described by
\[
ma = F \quad\Leftrightarrow\quad m\, x''(t) = F(t, x(t)),
\]
where x(t) is the particle position as function of time, a(t) = x''(t) is the particle acceleration,
and we will denote v(t) = x'(t) the particle velocity. We saw in \S 1.1 that Newton's second
law of motion is a second order differential equation for the position function x. Now it is
more convenient to use the particle momentum, p = mv, to write Newton's equation,
\[
m\, x'' = m\, v' = (m\, v)' = F \quad\Rightarrow\quad p' = F.
\]
So the force F changes the momentum p. If we integrate on an interval [t_1, t_2] we get
\[
\Delta p = p(t_2) - p(t_1) = \int_{t_1}^{t_2} F(t, x(t))\, dt.
\]
Suppose that an impulsive force acting on a particle at t_0 transmits a finite momentum,
say \Delta p_0. This is what the Dirac delta is useful for, because we can write the force as
\[
F(t) = \Delta p_0\, \delta(t - t_0),
\]
then F = 0 on \mathbb{R} - \{t_0\} and the momentum transferred to the particle by the force is
\[
\Delta p = \int_{t_0 - \Delta t}^{t_0 + \Delta t} \Delta p_0\, \delta(t - t_0)\, dt = \Delta p_0.
\]
The momentum transferred is \Delta p = \Delta p_0, but the force is identically zero on
\mathbb{R} - \{t_0\}. We have transferred a finite momentum to the particle by an interaction at a
single time t_0. C
4.4.4. The Impulse Response Function. We now want to solve differential equations
with the Dirac delta as a source. There is a particular type of solution that will be
important later on: solutions to initial value problems with a Dirac delta source and zero
initial conditions. We give these solutions a particular name.

Definition 4.4.6. The impulse response function at the point c \geqslant 0 of the constant
coefficients linear operator L(y) = y'' + a_1\, y' + a_0\, y is the solution y_\delta of
\[
L(y_\delta) = \delta(t-c), \qquad y_\delta(0) = 0, \quad y_\delta'(0) = 0.
\]
Theorem 4.4.7. The function y_\delta is the impulse response function at c \geqslant 0 of the constant
coefficients operator L(y) = y'' + a_1\, y' + a_0\, y iff holds
\[
y_\delta = \mathcal{L}^{-1}\Bigl[\frac{e^{-cs}}{p(s)}\Bigr],
\]
where p is the characteristic polynomial of L.

Proof of Theorem 4.4.7: Compute the Laplace transform of the differential equation for
the impulse response function y_\delta,
\[
\mathcal{L}[y''] + a_1\, \mathcal{L}[y'] + a_0\, \mathcal{L}[y] = \mathcal{L}[\delta(t-c)] = e^{-cs}.
\]
Since the initial data for y_\delta is trivial, we get
\[
(s^2 + a_1\, s + a_0)\, \mathcal{L}[y] = e^{-cs}.
\]
Since p(s) = s^2 + a_1\, s + a_0 is the characteristic polynomial of L, we get
\[
\mathcal{L}[y] = \frac{e^{-cs}}{p(s)}
\quad\Leftrightarrow\quad
y(t) = \mathcal{L}^{-1}\Bigl[\frac{e^{-cs}}{p(s)}\Bigr].
\]
All the steps in this calculation are "if and only if"s. This establishes the Theorem.
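Theorem 4.4.7 gives a direct recipe: invert e^{-cs}/p(s). A minimal sketch of that recipe,
assuming SymPy (illustrative names, not from the notes):

    from sympy import symbols, inverse_laplace_transform

    t, s = symbols('t s', positive=True)

    # Impulse response of L(y) = y'' + 2y' + 2y at c = 0:
    # invert 1/p(s) with p(s) = s^2 + 2s + 2 = (s+1)^2 + 1.
    p = s**2 + 2*s + 2
    y_delta = inverse_laplace_transform(1/p, s, t)
    print(y_delta)   # expect exp(-t)*sin(t), times a step function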
Example 4.4.4: Find the impulse response function at t = 0 of the linear operator
L(y) = y'' + 2y' + 2y.

Solution: The source is a generalized function, so we need to solve this problem using the
Laplace transform. So we compute the Laplace transform of the differential equation, where
we have already introduced the initial conditions.

In a similar example, since for c > 0 holds e^{-cs}\, \mathcal{L}[f](s) = \mathcal{L}[u(t-c)\, f(t-c)],
one arrives at the equation
\[
\mathcal{L}[y] = \frac{s}{(s^2-1)} + 20\, e^{-3s}\, \frac{1}{(s^2-1)}
= \mathcal{L}[\cosh(t)] + 20\, \mathcal{L}\bigl[u(t-3)\sinh(t-3)\bigr],
\]
which leads to the solution y(t) = \cosh(t) + 20\, u(t-3)\sinh(t-3).

In another example, where in the second equation above we have introduced the initial
conditions, then
\[
\begin{aligned}
\mathcal{L}[y] &= \frac{e^{-\pi s}}{(s^2+4)} - \frac{e^{-2\pi s}}{(s^2+4)} \\
&= \frac{e^{-\pi s}}{2}\, \frac{2}{(s^2+4)} - \frac{e^{-2\pi s}}{2}\, \frac{2}{(s^2+4)} \\
&= \frac{1}{2}\, \mathcal{L}\bigl[u(t-\pi)\sin\bigl(2(t-\pi)\bigr)\bigr]
 - \frac{1}{2}\, \mathcal{L}\bigl[u(t-2\pi)\sin\bigl(2(t-2\pi)\bigr)\bigr].
\end{aligned}
\]
The last equation can be rewritten as follows,
\[
y(t) = \frac{1}{2}\, u(t-\pi)\sin\bigl(2(t-\pi)\bigr) - \frac{1}{2}\, u(t-2\pi)\sin\bigl(2(t-2\pi)\bigr),
\]
which leads to the conclusion that
\[
y(t) = \frac{1}{2}\bigl[u(t-\pi) - u(t-2\pi)\bigr]\sin(2t).
\]
C
4.4.5. Comments on Generalized Sources. We have used the Laplace transform to solve
differential equations with the Dirac delta as a source function. It may be convenient to
understand a bit more clearly what we have done, since the Dirac delta is not an ordinary
function but a generalized function defined by a limit. Consider the following example.

Example 4.4.8: Find the impulse response function at t = c > 0 of the linear operator
L(y) = y'.

Looking at the differential equation y'(t) = \delta(t-c) and at the solution y(t) = u(t-c), one
would like to write them together as
\[
u'(t-c) = \delta(t-c). \tag{4.4.3}
\]
But this is not correct, because the step function is discontinuous at t = c, hence
not differentiable. What we have done is something different. We have found a sequence of
functions u_n with the properties,
\[
\lim_{n\to\infty} u_n(t-c) = u(t-c), \qquad \lim_{n\to\infty} u_n'(t-c) = \delta(t-c),
\]
and we have called y(t) = u(t-c). This is what we actually do when we solve a differential
equation with a source defined as a limit of a sequence of functions, such as the Dirac delta.
The Laplace transform method used on differential equations with generalized sources allows
us to solve these equations without the need to write any sequence; the sequences are hidden
in the definitions of the Laplace transform of generalized functions. Let us solve the problem
in Example 4.4.8 one more time, but this time let us show where all the sequences actually
are.
Example 4.4.9: Find the solution to the initial value problem
\[
y'(t) = \delta(t-c), \qquad y(0) = 0, \quad c > 0. \tag{4.4.4}
\]

Solution: Recall that the Dirac delta is defined as a limit of a sequence of bump functions,
\[
\delta(t-c) = \lim_{n\to\infty} \delta_n(t-c), \qquad
\delta_n(t-c) = n\,\Bigl[u(t-c) - u\Bigl(t-c-\frac{1}{n}\Bigr)\Bigr], \quad n = 1, 2, \cdots.
\]
The problem we are actually solving involves a sequence and a limit,
\[
y'(t) = \lim_{n\to\infty} \delta_n(t-c), \qquad y(0) = 0.
\]
We start computing the Laplace transform of the differential equation,
\[
\mathcal{L}[y'(t)] = \mathcal{L}\bigl[\lim_{n\to\infty} \delta_n(t-c)\bigr].
\]
We have defined the Laplace transform of the limit as the limit of the Laplace transforms,
\[
\mathcal{L}[y'(t)] = \lim_{n\to\infty} \mathcal{L}[\delta_n(t-c)].
\]
If the solution is at least piecewise differentiable, we can use the property
\[
\mathcal{L}[y'(t)] = s\, \mathcal{L}[y(t)] - y(0).
\]
Assuming that property, and the initial condition y(0) = 0, we get
\[
s\, \mathcal{L}[y(t)] = \lim_{n\to\infty} \mathcal{L}[\delta_n(t-c)]
\quad\Rightarrow\quad
\mathcal{L}[y(t)] = \lim_{n\to\infty} \frac{\mathcal{L}[\delta_n(t-c)]}{s}.
\]
Introduce now the function y_n(t) = u_n(t-c), given in Eq. (4.4.1), which for each n is the
only continuous, piecewise differentiable, solution of the initial value problem
\[
y_n'(t) = \delta_n(t-c), \qquad y_n(0) = 0.
\]
It is not hard to see that this function u_n satisfies
\[
\mathcal{L}[u_n(t)] = \frac{\mathcal{L}[\delta_n(t-c)]}{s}.
\]
Therefore, using this formula back in the equation for y we get,
\[
\mathcal{L}[y(t)] = \lim_{n\to\infty} \mathcal{L}[u_n(t)].
\]
For continuous functions we can interchange the Laplace transform and the limit,
\[
\mathcal{L}[y(t)] = \mathcal{L}\bigl[\lim_{n\to\infty} u_n(t)\bigr].
\]
So we get the result,
\[
y(t) = \lim_{n\to\infty} u_n(t)
\quad\Rightarrow\quad
y(t) = u(t-c).
\]
We see above that we have found something more than just y(t) = u(t-c). We have found
\[
y(t) = \lim_{n\to\infty} u_n(t-c),
\]
where the sequence elements u_n are continuous functions with u_n(0) = 0 and
\[
\lim_{n\to\infty} u_n(t-c) = u(t-c), \qquad \lim_{n\to\infty} u_n'(t-c) = \delta(t-c).
\]
Finally, derivatives and limits cannot be interchanged for u_n,
\[
\lim_{n\to\infty} u_n'(t-c) \neq \Bigl(\lim_{n\to\infty} u_n(t-c)\Bigr)',
\]
so it makes no sense to talk about y'. C
When the Dirac delta is defined by a sequence of functions, as we did in this section, the
calculation needed to find impulse response functions must involve sequences of functions
and limits. The Laplace transform method used on generalized functions allows us to hide
all the sequences and limits. This is true not only for the derivative operator L(y) = y' but
for any second order differential operator with constant coefficients.

Definition 4.4.8. A solution of the initial value problem with a Dirac's delta source
\[
y'' + a_1\, y' + a_0\, y = \delta(t-c), \qquad y(0) = y_0, \quad y'(0) = y_1, \tag{4.4.5}
\]
where a_1, a_0, y_0, y_1, and c \in \mathbb{R} are given constants, is a function
\[
y(t) = \lim_{n\to\infty} y_n(t),
\]
where the functions y_n, with n \geqslant 1, are the unique solutions to the initial value problems
\[
y_n'' + a_1\, y_n' + a_0\, y_n = \delta_n(t-c), \qquad y_n(0) = y_0, \quad y_n'(0) = y_1, \tag{4.4.6}
\]
and the sources \delta_n satisfy \lim_{n\to\infty} \delta_n(t-c) = \delta(t-c).
The definition above makes clear what we mean by a solution to an initial value problem
having a generalized function as source, when the generalized function is defined as the limit
of a sequence of functions. The following result says that the Laplace transform method
used with generalized functions hides all the sequence computations.

Theorem 4.4.9. The function y is solution of the initial value problem
\[
y'' + a_1\, y' + a_0\, y = \delta(t-c), \qquad y(0) = y_0, \quad y'(0) = y_1, \quad c > 0,
\]
iff its Laplace transform satisfies the equation
\[
\bigl(s^2\, \mathcal{L}[y] - s\, y_0 - y_1\bigr) + a_1\, \bigl(s\, \mathcal{L}[y] - y_0\bigr) + a_0\, \mathcal{L}[y] = e^{-cs}.
\]

This Theorem tells us that to find the solution y to an initial value problem when the source
is a Dirac's delta we have to apply the Laplace transform to the equation and perform the
same calculations as if the Dirac delta were a function. This is the calculation we did when
we computed the impulse response functions.
Proof of Theorem 4.4.9: Compute the Laplace transform on Eq. (4.4.6),
\[
\mathcal{L}[y_n''] + a_1\, \mathcal{L}[y_n'] + a_0\, \mathcal{L}[y_n] = \mathcal{L}[\delta_n(t-c)].
\]
Recall the relations between the Laplace transform and derivatives and use the initial con-
ditions,
\[
\mathcal{L}[y_n''] = s^2\, \mathcal{L}[y_n] - s\, y_0 - y_1, \qquad
\mathcal{L}[y_n'] = s\, \mathcal{L}[y_n] - y_0,
\]
and use these relations in the differential equation,
\[
(s^2 + a_1\, s + a_0)\, \mathcal{L}[y_n] - s\, y_0 - y_1 - a_1\, y_0 = \mathcal{L}[\delta_n(t-c)].
\]
Since \delta_n satisfies \lim_{n\to\infty} \delta_n(t-c) = \delta(t-c), an argument like the one in the proof of
Theorem 4.4.5 says that for c > 0 holds
\[
\lim_{n\to\infty} \mathcal{L}[\delta_n(t-c)] = \mathcal{L}[\delta(t-c)] = e^{-cs}.
\]
Then
\[
(s^2 + a_1\, s + a_0)\, \lim_{n\to\infty} \mathcal{L}[y_n] - s\, y_0 - y_1 - a_1\, y_0 = e^{-cs}.
\]
Interchanging limits and Laplace transforms we get
\[
(s^2 + a_1\, s + a_0)\, \mathcal{L}[y] - s\, y_0 - y_1 - a_1\, y_0 = e^{-cs},
\]
which is equivalent to
\[
\bigl(s^2\, \mathcal{L}[y] - s\, y_0 - y_1\bigr) + a_1\, \bigl(s\, \mathcal{L}[y] - y_0\bigr) + a_0\, \mathcal{L}[y] = e^{-cs}.
\]
This establishes the Theorem.
4.4.6. Exercises.
4.4.1.- . 4.4.2.- .
Remark: The convolution is defined for functions f and g such that the integral in (4.5.1) is
defined. For example, for f and g piecewise continuous functions, or one of them continuous
and the other a Dirac's delta generalized function.

Example 4.5.1: Find f * g, the convolution of the functions f(t) = e^{-t} and g(t) = \sin(t).

Solution: The definition of convolution is,
\[
(f * g)(t) = \int_0^t e^{-\tau}\, \sin(t-\tau)\, d\tau.
\]
Integrating by parts twice one obtains,
\[
2 \int_0^t e^{-\tau}\, \sin(t-\tau)\, d\tau
= \Bigl[e^{-\tau}\cos(t-\tau)\Bigr]_0^t - \Bigl[e^{-\tau}\sin(t-\tau)\Bigr]_0^t
= e^{-t} - \cos(t) + \sin(t),
\]
that is,
\[
(f * g)(t) = \frac{1}{2}\bigl(e^{-t} + \sin(t) - \cos(t)\bigr). \tag{4.5.2}
\]
In the graphs below we can see that the values of the convolution function f * g measure
the overlap of the functions f and g when one function slides over the other.
C
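The convolution integral itself can be evaluated symbolically. A minimal sketch, assuming
SymPy (illustrative, not from the notes):

    from sympy import symbols, integrate, exp, sin, simplify

    t, tau = symbols('t tau', positive=True)

    # Convolution (f*g)(t) = int_0^t f(tau) g(t - tau) dtau, with f = e^{-t}, g = sin(t).
    conv = integrate(exp(-tau)*sin(t - tau), (tau, 0, t))
    print(simplify(conv))   # expect (exp(-t) + sin(t) - cos(t))/2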
A few properties of the convolution operation are summarized in the Theorem below.
But we save the most important property for the next subsection.

Theorem 4.5.2 (Properties). For all piecewise continuous functions f, g, and h, the
following hold:
(i) Commutativity: f * g = g * f;
(ii) Associativity: f * (g * h) = (f * g) * h;
(iii) Distributivity: f * (g + h) = f * g + f * h;
(iv) Neutral element: f * 0 = 0;
(v) Identity element: f * \delta = f.
Proof of Theorem 4.5.2: We only prove properties (i) and (v); the rest are left as an
exercise and they are not so hard to obtain from the definition of convolution. The first
property can be obtained by a change of the integration variable as follows,
\[
(f * g)(t) = \int_0^t f(\tau)\, g(t-\tau)\, d\tau.
\]
Changing the integration variable to \hat\tau = t - \tau, so d\hat\tau = -d\tau, we get
\[
(f * g)(t) = \int_0^t g(\hat\tau)\, f(t-\hat\tau)\, d\hat\tau,
\]
so we conclude that
\[
(f * g)(t) = (g * f)(t).
\]
We now move to property (v), which is essentially a property of the Dirac delta,
\[
(f * \delta)(t) = \int_0^t f(\tau)\, \delta(t-\tau)\, d\tau = f(t).
\]
This establishes the Theorem.
4.5.2. The Laplace Transform. The Laplace transform of a convolution of two functions
is the pointwise product of their corresponding Laplace transforms. This result will be a
key part of the solution decomposition result we show at the end of the section.

Theorem 4.5.3 (Laplace Transform). If both \mathcal{L}[f] and \mathcal{L}[g] exist, including the case where
either f or g is a Dirac's delta, then
\[
\mathcal{L}[f * g] = \mathcal{L}[f]\, \mathcal{L}[g]. \tag{4.5.3}
\]
Remark: It is not an accident that the convolution of two functions satisfies Eq. (4.5.3).
The definition of convolution is chosen so that it has this property. One can see that this is
the case by looking at the proof of Theorem 4.5.3. One starts with the expression
\mathcal{L}[f]\, \mathcal{L}[g], then changes the order of integration, and one ends up with the Laplace
transform of some quantity. Because this quantity appears in that expression, it deserves a
name. This is how the convolution operation was created.
Proof of Theorem 4.5.3: We start writing the right-hand side of Eq. (4.5.3), the product
\mathcal{L}[f]\, \mathcal{L}[g]. We write the two integrals coming from the individual Laplace transforms
and we rewrite them in an appropriate way,
\[
\begin{aligned}
\mathcal{L}[f]\, \mathcal{L}[g]
&= \Bigl[\int_0^\infty e^{-s\tilde t}\, f(\tilde t)\, d\tilde t\Bigr]
   \Bigl[\int_0^\infty e^{-st}\, g(t)\, dt\Bigr] \\
&= \int_0^\infty e^{-st}\, g(t)\, \Bigl[\int_0^\infty e^{-s\tilde t}\, f(\tilde t)\, d\tilde t\Bigr]\, dt \\
&= \int_0^\infty g(t)\, \Bigl[\int_0^\infty e^{-s(\tilde t + t)}\, f(\tilde t)\, d\tilde t\Bigr]\, dt,
\end{aligned}
\]
where we only introduced the integral in \tilde t as a constant inside the integral in t. Introduce
the change of variables in the inside integral \tau = \tilde t + t, hence d\tau = d\tilde t. Then, we get
\[
\mathcal{L}[f]\, \mathcal{L}[g]
= \int_0^\infty g(t)\, \Bigl[\int_t^\infty e^{-s\tau}\, f(\tau - t)\, d\tau\Bigr]\, dt \tag{4.5.4}
\]
\[
\phantom{\mathcal{L}[f]\, \mathcal{L}[g]}
= \int_0^\infty \int_t^\infty e^{-s\tau}\, g(t)\, f(\tau - t)\, d\tau\, dt. \tag{4.5.5}
\]
Solution: Since u = f * g, with f(t) = e^{-t} and g(t) = \sin(t), then from Example 4.5.3,
\[
\mathcal{L}[u] = \mathcal{L}[f * g] = \frac{1}{(s+1)(s^2+1)}.
\]
A partial fraction decomposition of the right-hand side above implies that
\[
\begin{aligned}
\mathcal{L}[u]
&= \frac{1}{2}\Bigl[\frac{1}{(s+1)} + \frac{(1-s)}{(s^2+1)}\Bigr] \\
&= \frac{1}{2}\Bigl[\frac{1}{(s+1)} + \frac{1}{(s^2+1)} - \frac{s}{(s^2+1)}\Bigr] \\
&= \frac{1}{2}\bigl(\mathcal{L}[e^{-t}] + \mathcal{L}[\sin(t)] - \mathcal{L}[\cos(t)]\bigr).
\end{aligned}
\]
This says that
\[
u(t) = \frac{1}{2}\bigl(e^{-t} + \sin(t) - \cos(t)\bigr).
\]
So, we recover Eq. (4.5.2) in Example 4.5.1, that is,
\[
(f * g)(t) = \frac{1}{2}\bigl(e^{-t} + \sin(t) - \cos(t)\bigr).
\]
C
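Theorem 4.5.3 is also easy to check on this pair of functions. A minimal sketch, assuming
SymPy (illustrative, not from the notes):

    from sympy import symbols, laplace_transform, integrate, exp, sin, simplify

    t, tau, s = symbols('t tau s', positive=True)

    f = exp(-t)
    g = sin(t)

    # Left-hand side: transform of the convolution, computed from its definition.
    conv = integrate(f.subs(t, tau)*g.subs(t, t - tau), (tau, 0, t))
    lhs = laplace_transform(conv, t, s, noconds=True)

    # Right-hand side: product of the individual transforms.
    rhs = laplace_transform(f, t, s, noconds=True)*laplace_transform(g, t, s, noconds=True)

    print(simplify(lhs - rhs))   # prints 0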
Example 4.5.5: Find the function g such that
f(t) = \displaystyle\int_0^t \sin(4\tau)\, g(t-\tau)\, d\tau has the Laplace transform
\[
\mathcal{L}[f] = \frac{s}{(s^2+16)\bigl((s-1)^2+9\bigr)}.
\]
Solution: Since f(t) = \bigl(\sin(4t) * g\bigr)(t), we can write
\[
\frac{s}{(s^2+16)\bigl((s-1)^2+9\bigr)}
= \mathcal{L}[f] = \mathcal{L}\bigl[\sin(4t) * g(t)\bigr]
= \mathcal{L}[\sin(4t)]\, \mathcal{L}[g]
= \frac{4}{(s^2+4^2)}\, \mathcal{L}[g],
\]
so we get that
\[
\frac{4}{(s^2+4^2)}\, \mathcal{L}[g] = \frac{s}{(s^2+16)\bigl((s-1)^2+9\bigr)}
\quad\Rightarrow\quad
\mathcal{L}[g] = \frac{1}{4}\, \frac{s}{(s-1)^2+3^2}.
\]
We now rewrite the right-hand side of the last equation,
\[
\mathcal{L}[g] = \frac{1}{4}\, \frac{(s-1+1)}{(s-1)^2+3^2}
\quad\Rightarrow\quad
\mathcal{L}[g] = \frac{1}{4}\Bigl[\frac{(s-1)}{(s-1)^2+3^2} + \frac{1}{3}\, \frac{3}{(s-1)^2+3^2}\Bigr],
\]
that is,
\[
\mathcal{L}[g] = \frac{1}{4}\Bigl[\mathcal{L}[\cos(3t)](s-1) + \frac{1}{3}\, \mathcal{L}[\sin(3t)](s-1)\Bigr]
= \frac{1}{4}\Bigl[\mathcal{L}\bigl[e^{t}\cos(3t)\bigr] + \frac{1}{3}\, \mathcal{L}\bigl[e^{t}\sin(3t)\bigr]\Bigr],
\]
which leads us to
\[
g(t) = \frac{1}{4}\, e^{t}\Bigl[\cos(3t) + \frac{1}{3}\sin(3t)\Bigr].
\]
C
4.5.3. Solution Decomposition. The Solution Decomposition Theorem is the main result
of this section. Theorem 4.5.4 shows one way to write the solution to a general initial
value problem for a linear second order differential equation with constant coefficients. The
solution to such a problem can always be divided into two terms. The first term contains
information only about the initial data. The second term contains information only about
the source function. This second term is a convolution of the source function itself and the
impulse response function of the differential operator.

Theorem 4.5.4 (Solution Decomposition). Given constants a_0, a_1, y_0, y_1 and a piecewise
continuous function g, the solution y to the initial value problem
\[
y'' + a_1\, y' + a_0\, y = g(t), \qquad y(0) = y_0, \quad y'(0) = y_1, \tag{4.5.6}
\]
can be decomposed as
\[
y(t) = y_h(t) + (y_\delta * g)(t), \tag{4.5.7}
\]
where y_h is the solution of the homogeneous initial value problem
\[
y_h'' + a_1\, y_h' + a_0\, y_h = 0, \qquad y_h(0) = y_0, \quad y_h'(0) = y_1, \tag{4.5.8}
\]
and y_\delta is the impulse response solution, that is,
\[
y_\delta'' + a_1\, y_\delta' + a_0\, y_\delta = \delta(t), \qquad y_\delta(0) = 0, \quad y_\delta'(0) = 0.
\]
Remark: The solution decomposition in Eq. (4.5.7) can be written in the equivalent way
\[
y(t) = y_h(t) + \int_0^t y_\delta(\tau)\, g(t-\tau)\, d\tau.
\]
Also, recall that the impulse response function can be written in the equivalent way
\[
y_\delta = \mathcal{L}^{-1}\Bigl[\frac{e^{-cs}}{p(s)}\Bigr], \quad c \neq 0,
\qquad\text{and}\qquad
y_\delta = \mathcal{L}^{-1}\Bigl[\frac{1}{p(s)}\Bigr], \quad c = 0.
\]
Using the result in Theorem 4.5.3 in the last term above we conclude that
\[
y(t) = y_h(t) + (y_\delta * g)(t).
\]
Example 4.5.6: Use the Solution Decomposition Theorem to express the solution of
\[
y'' + 2\, y' + 2\, y = g(t), \qquad y(0) = 1, \quad y'(0) = 1.
\]

Example 4.5.7: Use the Laplace transform to solve the same IVP as above,
\[
y'' + 2\, y' + 2\, y = g(t), \qquad y(0) = 1, \quad y'(0) = 1.
\]
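For this operator the impulse response is y_\delta(t) = e^{-t}\sin(t) (invert 1/p(s) with
p(s) = s^2 + 2s + 2), so the theorem gives y(t) = y_h(t) + \int_0^t y_\delta(\tau)\, g(t-\tau)\, d\tau.
A minimal sketch checking the decomposition against a direct solve for a concrete source,
say g(t) = 1, assuming SymPy (all names illustrative, not from the notes):

    from sympy import symbols, Function, Eq, dsolve, integrate, exp, sin, simplify

    t, tau = symbols('t tau', positive=True)
    y = Function('y')

    g = 1  # a concrete source, chosen only for illustration

    # Homogeneous part: y'' + 2y' + 2y = 0 with the given initial data.
    yh = dsolve(Eq(y(t).diff(t, 2) + 2*y(t).diff(t) + 2*y(t), 0),
                y(t), ics={y(0): 1, y(t).diff(t).subs(t, 0): 1}).rhs

    # Impulse response of p(s) = s^2 + 2s + 2 is e^{-t} sin(t).
    ydelta = exp(-t)*sin(t)

    # Solution decomposition: y = yh + (ydelta * g).
    decomposed = yh + integrate(ydelta.subs(t, tau)*g, (tau, 0, t))

    # Direct solve of the full problem for comparison.
    direct = dsolve(Eq(y(t).diff(t, 2) + 2*y(t).diff(t) + 2*y(t), g),
                    y(t), ics={y(0): 1, y(t).diff(t).subs(t, 0): 1}).rhs

    print(simplify(decomposed - direct))   # prints 0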
4.5.4. Exercises.
4.5.1.- . 4.5.2.- .
Newton's second law of motion for point particles is one of the first differential equations
ever written. Even this early example of a differential equation consists not of a single
equation but of a system of three equations on three unknowns. The unknown functions are
the particle's three space coordinates as functions of time. One important difficulty in
solving a differential system is that the equations in a system are usually coupled. One
cannot solve for one unknown function without knowing the other unknowns. In this chapter
we study how to solve the system in the particular case that the equations can be uncoupled.
We call such systems diagonalizable. Explicit formulas for the solutions can be written in
this case. Later we generalize this idea to systems that cannot be uncoupled.
5.1.1. First Order Linear Systems. A single differential equation on one unknown func-
tion is often not enough to describe certain physical problems. For example, problems in
several dimensions or containing several interacting particles. The description of a point
particle moving in space under Newton's law of motion requires three functions of time,
the space coordinates of the particle, together with three differential equations. To describe
several proteins activating and deactivating each other inside a cell also requires as many
unknown functions and equations as proteins in the system. In this section we present a
first step aimed to describe such physical systems. We start by introducing a first order
linear differential system of equations.

Definition 5.1.1. An n \times n first order linear differential system is the equation
\[
x'(t) = A(t)\, x(t) + b(t), \tag{5.1.1}
\]
where the n \times n coefficient matrix A, the source n-vector b, and the unknown n-vector x
are given in components by
\[
A(t) = \begin{bmatrix} a_{11}(t) & \cdots & a_{1n}(t) \\ \vdots & & \vdots \\ a_{n1}(t) & \cdots & a_{nn}(t) \end{bmatrix},
\qquad
b(t) = \begin{bmatrix} b_1(t) \\ \vdots \\ b_n(t) \end{bmatrix},
\qquad
x(t) = \begin{bmatrix} x_1(t) \\ \vdots \\ x_n(t) \end{bmatrix}.
\]
The system in Eq. (5.1.1) is called homogeneous iff the source vector b = 0, of constant
coefficients iff the matrix A is constant, and diagonalizable iff the matrix A is diagonalizable.

Remarks:
(a) The derivative of the unknown vector is computed component-wise,
\[
x'(t) = \begin{bmatrix} x_1'(t) \\ \vdots \\ x_n'(t) \end{bmatrix}.
\]
(b) By the definition of the matrix-vector product, Eq. (5.1.1) can be written as
\[
\begin{aligned}
x_1'(t) &= a_{11}(t)\, x_1(t) + \cdots + a_{1n}(t)\, x_n(t) + b_1(t), \\
&\ \ \vdots \\
x_n'(t) &= a_{n1}(t)\, x_1(t) + \cdots + a_{nn}(t)\, x_n(t) + b_n(t).
\end{aligned}
\]
(c) We recall that in \S 8.3 we say that a square matrix A is diagonalizable iff there exists
an invertible matrix P and a diagonal matrix D such that A = P D P^{-1}.
A solution of an n \times n linear differential system is an n-vector valued function x, that
is, a set of n functions \{x_1, \cdots, x_n\}, that satisfies every differential equation in the
system. When we write down the equations we will usually write x instead of x(t).

Example 5.1.1: The case n = 1 is a single differential equation: Find a solution x_1 of
\[
x_1' = a_{11}(t)\, x_1 + b_1(t).
\]
Solution: This is a linear first order equation, and solutions can be found with the inte-
grating factor method described in Section 1.2. C
Example 5.1.2: Find the coefficient matrix, the source vector, and the unknown vector for
the 2 \times 2 linear system
\[
\begin{aligned}
x_1' &= a_{11}(t)\, x_1 + a_{12}(t)\, x_2 + g_1(t), \\
x_2' &= a_{21}(t)\, x_1 + a_{22}(t)\, x_2 + g_2(t).
\end{aligned}
\]

Solution: The coefficient matrix A, the source vector b, and the unknown vector x are,
\[
A(t) = \begin{bmatrix} a_{11}(t) & a_{12}(t) \\ a_{21}(t) & a_{22}(t) \end{bmatrix},
\qquad
b(t) = \begin{bmatrix} g_1(t) \\ g_2(t) \end{bmatrix},
\qquad
x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}.
\]
C
Example 5.1.3: Use matrix notation to write down the 2 \times 2 system given by
\[
\begin{aligned}
x_1' &= x_1 - x_2, \\
x_2' &= -x_1 + x_2.
\end{aligned}
\]

Solution: In this case, the matrix of coefficients and the unknown vector have the form
\[
A = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix},
\qquad
x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}.
\]
This is a homogeneous system, so the source vector b = 0. The differential equation can
be written as follows,
\[
\begin{aligned}
x_1' &= x_1 - x_2 \\
x_2' &= -x_1 + x_2
\end{aligned}
\quad\Leftrightarrow\quad
\begin{bmatrix} x_1' \\ x_2' \end{bmatrix}
= \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\quad\Leftrightarrow\quad
x' = A\, x.
\]
C
Example 5.1.4: Find the explicit expression for the linear system x' = A\, x + b, where
\[
A = \begin{bmatrix} 1 & 3 \\ 3 & 1 \end{bmatrix},
\qquad
b(t) = \begin{bmatrix} e^{t} \\ 2\, e^{3t} \end{bmatrix},
\qquad
x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}.
\]
Example 5.1.6: Find the explicit expression of the most general 3 \times 3 homogeneous linear
differential system.

Solution: This is a system of the form x' = A(t)\, x, with A being a 3 \times 3 matrix. Therefore,
we need to find functions x_1, x_2, and x_3 solutions of
\[
\begin{aligned}
x_1' &= a_{11}(t)\, x_1 + a_{12}(t)\, x_2 + a_{13}(t)\, x_3, \\
x_2' &= a_{21}(t)\, x_1 + a_{22}(t)\, x_2 + a_{23}(t)\, x_3, \\
x_3' &= a_{31}(t)\, x_1 + a_{32}(t)\, x_2 + a_{33}(t)\, x_3.
\end{aligned}
\]
C
5.1.2. Existence of Solutions. We first introduce the initial value problem for linear
differential systems. This problem is similar to the initial value problem for a single
differential equation. In the case of an n \times n first order system we need n initial conditions,
one for each unknown function, which are collected in an n-vector.

Definition 5.1.2. An Initial Value Problem for an n \times n linear differential system is
the following: Given an n \times n matrix valued function A, an n-vector valued function b,
a real constant t_0, and an n-vector x_0, find an n-vector valued function x solution of
\[
x' = A(t)\, x + b(t), \qquad x(t_0) = x_0.
\]

Remark: The initial condition vector x_0 represents n conditions, one for each component
of the unknown vector x.
Example 5.1.7: Write down explicitly the initial value problem for
x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} given by
\[
x' = A\, x, \qquad
x(0) = \begin{bmatrix} 2 \\ 3 \end{bmatrix},
\qquad
A = \begin{bmatrix} 1 & 3 \\ 3 & 1 \end{bmatrix}.
\]
The main result about existence and uniqueness of solutions to an initial value problem
for a linear system is also analogous to Theorem 2.1.2.

Theorem 5.1.3 (Existence and Uniqueness). If the functions A and b are continuous on
an open interval I \subset \mathbb{R}, and if x_0 is any constant vector and t_0 is any constant in I,
then there exists only one function x, defined on an interval \tilde I \subset I with t_0 \in \tilde I,
solution of the initial value problem
\[
x' = A(t)\, x + b(t), \qquad x(t_0) = x_0. \tag{5.1.2}
\]

Remark: The fixed point argument used in the proof of Picard-Lindelöf's Theorem 1.6.2
can be extended to prove Theorem 5.1.3. This proof will be presented later on.
First Proof of Theorem 5.1.5: We start with the following identity, which is satisfied by
every 2 \times 2 matrix A (exercise: prove it on 2 \times 2 matrices by a straightforward
calculation),
\[
A^2 - \operatorname{tr}(A)\, A + \det(A)\, I = 0.
\]
This identity is the particular case n = 2 of the Cayley-Hamilton Theorem, which holds for
every n \times n matrix. If we use this identity on the equation for x'' we get the equation in
Theorem 5.1.5, because
\[
x'' = (A\, x)' = A\, x' = A^2 x = \operatorname{tr}(A)\, A x - \det(A)\, I x.
\]
Recalling that A\, x = x' and I x = x, we get the vector equation
\[
x'' - \operatorname{tr}(A)\, x' + \det(A)\, x = 0.
\]
The initial conditions for a second order differential equation are x(0) and x'(0). The first
condition is given by hypothesis, x(0) = x_0. The second condition comes from the original
first order system evaluated at t = 0, that is, x'(0) = A\, x(0) = A\, x_0. This establishes the
Theorem.
Second Proof of Theorem 5.1.5: This proof is based on a straightforward computation.
Denote A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, then the system has the form
\[
x_1' = a_{11}\, x_1 + a_{12}\, x_2, \tag{5.1.10}
\]
\[
x_2' = a_{21}\, x_1 + a_{22}\, x_2. \tag{5.1.11}
\]
We start considering the case a_{12} \neq 0. Compute the derivative of the first equation,
\[
x_1'' = a_{11}\, x_1' + a_{12}\, x_2'.
\]
Use Eq. (5.1.11) to replace x_2' on the right-hand side above,
\[
x_1'' = a_{11}\, x_1' + a_{12}\bigl(a_{21}\, x_1 + a_{22}\, x_2\bigr).
\]
Since we are assuming that a_{12} \neq 0, we can replace the term with x_2 above using
Eq. (5.1.10),
\[
x_1'' = a_{11}\, x_1' + a_{12}\, a_{21}\, x_1 + a_{12}\, a_{22}\,\Bigl(\frac{x_1' - a_{11}\, x_1}{a_{12}}\Bigr).
\]
A simple cancellation and reorganization of terms gives the equation,
\[
x_1'' = (a_{11} + a_{22})\, x_1' + (a_{12}\, a_{21} - a_{11}\, a_{22})\, x_1.
\]
Recalling that \operatorname{tr}(A) = a_{11} + a_{22} and \det(A) = a_{11}\, a_{22} - a_{12}\, a_{21}, we get
\[
x_1'' - \operatorname{tr}(A)\, x_1' + \det(A)\, x_1 = 0.
\]
The initial conditions for x_1 are x_1(0) and x_1'(0). The first one comes from the first
component of x(0) = x_0, that is,
\[
x_1(0) = x_{01}. \tag{5.1.12}
\]
The second condition comes from the first component of the first order differential equation
evaluated at t = 0, that is, x'(0) = A\, x(0) = A\, x_0. The first component is
\[
x_1'(0) = a_{11}\, x_{01} + a_{12}\, x_{02}. \tag{5.1.13}
\]
Consider now the case a_{12} = 0. In this case the system is
\[
x_1' = a_{11}\, x_1, \qquad
x_2' = a_{21}\, x_1 + a_{22}\, x_2.
\]
In this case compute one more derivative in the first equation above,
\[
x_1'' = a_{11}\, x_1'.
\]
Now rewrite the first equation in the system as follows,
\[
a_{22}\,(x_1' - a_{11}\, x_1) = 0.
\]
Subtracting this last equation from the one for x_1'' we get
\[
x_1'' - a_{11}\, x_1' - a_{22}\,(x_1' - a_{11}\, x_1) = 0,
\]
so we get the equation
\[
x_1'' - (a_{11} + a_{22})\, x_1' + (a_{11}\, a_{22})\, x_1 = 0.
\]
Recalling that in the case a_{12} = 0 we have \operatorname{tr}(A) = a_{11} + a_{22} and
\det(A) = a_{11}\, a_{22}, we get
\[
x_1'' - \operatorname{tr}(A)\, x_1' + \det(A)\, x_1 = 0.
\]
The initial conditions are the same as in the case a_{12} \neq 0. A similar calculation gives x_2
and its initial conditions. This establishes the Theorem.
Example 5.1.9: Express as a single second order equation the 2 \times 2 system and solve it,
\[
\begin{aligned}
x_1' &= -x_1 + 3\, x_2, \\
x_2' &= x_1 - x_2.
\end{aligned}
\]

Solution: Instead of using the result from Theorem 5.1.5, we solve this problem following
the second proof of that theorem. But instead of working with x_1, we work with x_2. We
start computing x_1 from the second equation: x_1 = x_2' + x_2. We then introduce this
expression into the first equation,
\[
(x_2' + x_2)' = -(x_2' + x_2) + 3\, x_2
\quad\Rightarrow\quad
x_2'' + x_2' = -x_2' - x_2 + 3\, x_2,
\]
so we obtain the second order equation
\[
x_2'' + 2\, x_2' - 2\, x_2 = 0.
\]
We solve this equation with the methods studied in Chapter 2, that is, we look for solutions
of the form x_2(t) = e^{rt}, with r solution of the characteristic equation
\[
r^2 + 2\, r - 2 = 0
\quad\Rightarrow\quad
r_\pm = \frac{1}{2}\bigl(-2 \pm \sqrt{4+8}\bigr)
\quad\Rightarrow\quad
r_\pm = -1 \pm \sqrt{3}.
\]
Therefore, the general solution to the second order equation above is
\[
x_2 = c_+\, e^{(-1+\sqrt{3})t} + c_-\, e^{(-1-\sqrt{3})t}, \qquad c_+, c_- \in \mathbb{R}.
\]
Since x_1 satisfies the same equation as x_2, we obtain the same general solution
\[
x_1 = \tilde c_+\, e^{(-1+\sqrt{3})t} + \tilde c_-\, e^{(-1-\sqrt{3})t}, \qquad \tilde c_+, \tilde c_- \in \mathbb{R}.
\]
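A quick cross-check of this reduction, assuming SymPy (illustrative script, not from the
notes):

    from sympy import symbols, Function, Eq, dsolve

    t = symbols('t')
    x1, x2 = Function('x1'), Function('x2')

    # The system x1' = -x1 + 3 x2, x2' = x1 - x2 of Example 5.1.9.
    eqs = [Eq(x1(t).diff(t), -x1(t) + 3*x2(t)),
           Eq(x2(t).diff(t), x1(t) - x2(t))]

    # dsolve returns exponentials with rates -1 + sqrt(3) and -1 - sqrt(3),
    # matching the characteristic roots found above.
    for sol in dsolve(eqs):
        print(sol)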
Proof of Theorem 5.1.6: We check that the function x = a\, x^{(1)} + b\, x^{(2)} is a solution of
the differential equation in the Theorem. Indeed, since the derivative of a vector valued
function is a linear operation, we get
\[
x' = \bigl(a\, x^{(1)} + b\, x^{(2)}\bigr)' = a\, x^{(1)\prime} + b\, x^{(2)\prime}.
\]
Replacing the differential equation on the right-hand side above,
\[
x' = a\, A\, x^{(1)} + b\, A\, x^{(2)}.
\]
The matrix-vector product is a linear operation,
A\bigl(a\, x^{(1)} + b\, x^{(2)}\bigr) = a\, A\, x^{(1)} + b\, A\, x^{(2)}, hence,
\[
x' = A\bigl(a\, x^{(1)} + b\, x^{(2)}\bigr)
\quad\Rightarrow\quad
x' = A\, x.
\]
This establishes the Theorem.
We now introduce the notion of linearly dependent and linearly independent sets of functions.

Definition 5.1.7. A set of n vector valued functions \{x^{(1)}, \cdots, x^{(n)}\} is called linearly
dependent on an interval I \subset \mathbb{R} iff for all t \in I there exist constants c_1, \cdots, c_n, not
all of them zero, such that it holds
\[
c_1\, x^{(1)}(t) + \cdots + c_n\, x^{(n)}(t) = 0.
\]
A set of n vector valued functions is called linearly independent on I iff the set is not
linearly dependent.

Remark: This notion is a generalization of Def. 2.1.6 from two functions to n vector valued
functions. For every value of t \in \mathbb{R} this definition agrees with the definition of a set of
linearly dependent vectors given in Linear Algebra, reviewed in Chapter 8.
We now generalize Theorem 2.1.7 to linear systems. If you know a linearly independent
set of n solutions to an n \times n first order, linear, homogeneous system, then you actually
know all possible solutions to that system, since any other solution is just a linear
combination of the previous n solutions.

Theorem 5.1.8 (General Solution). If \{x^{(1)}, \cdots, x^{(n)}\} is a linearly independent set of
solutions of the n \times n system x' = A\, x, where A is a continuous matrix valued function,
then there exist unique constants c_1, \cdots, c_n such that every solution x of the differential
equation x' = A\, x can be written as the linear combination
\[
x(t) = c_1\, x^{(1)}(t) + \cdots + c_n\, x^{(n)}(t). \tag{5.1.14}
\]
Before we present a sketch of the proof for Theorem 5.1.8, it is convenient to state the
following definitions, which come out naturally from Theorem 5.1.8.
Definition 5.1.9.
(a) The set of functions \{x^{(1)}, \cdots, x^{(n)}\} is a fundamental set of solutions of the
equation x' = A\, x iff the set \{x^{(1)}, \cdots, x^{(n)}\} is linearly independent and
x^{(i)\prime} = A\, x^{(i)}, for every i = 1, \cdots, n.
(b) The general solution of the homogeneous equation x' = A\, x denotes any vector
valued function x_{gen} that can be written as a linear combination
\[
x_{gen}(t) = c_1\, x^{(1)}(t) + \cdots + c_n\, x^{(n)}(t),
\]
where x^{(1)}, \cdots, x^{(n)} are the functions in any fundamental set of solutions of
x' = A\, x, while c_1, \cdots, c_n are arbitrary constants.

Remark: The names above are appropriate, since Theorem 5.1.8 says that knowing the
n functions of a fundamental set of solutions is equivalent to knowing all solutions to the
homogeneous linear differential system.
Example 5.1.12: Show that the set of functions
\Bigl\{ x^{(1)} = e^{-2t}\begin{bmatrix} 1 \\ -1 \end{bmatrix},\
x^{(2)} = e^{4t}\begin{bmatrix} 1 \\ 1 \end{bmatrix} \Bigr\}
is a fundamental set of solutions to the system x' = A\, x, where
A = \begin{bmatrix} 1 & 3 \\ 3 & 1 \end{bmatrix}.

Solution: In Example 5.1.11 we have shown that x^{(1)} and x^{(2)} are solutions to the dif-
ferential equation above. We only need to show that these two functions form a linearly
independent set. That is, we need to show that the only constants c_1, c_2 solutions of the
equation below, for all t \in \mathbb{R}, are c_1 = c_2 = 0, where
\[
0 = c_1\, x^{(1)} + c_2\, x^{(2)}
= c_1\, e^{-2t}\begin{bmatrix} 1 \\ -1 \end{bmatrix} + c_2\, e^{4t}\begin{bmatrix} 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} e^{-2t} & e^{4t} \\ -e^{-2t} & e^{4t} \end{bmatrix}
\begin{bmatrix} c_1 \\ c_2 \end{bmatrix}
= X(t)\, c,
\]
where X(t) = \bigl[x^{(1)}(t),\, x^{(2)}(t)\bigr] and c = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}. Using this
matrix notation, the linear system for c_1, c_2 has the form
\[
X(t)\, c = 0.
\]
We now show that the matrix X(t) is invertible for all t \in \mathbb{R}. This is the case, since its
determinant is
\[
\det\bigl(X(t)\bigr) = \begin{vmatrix} e^{-2t} & e^{4t} \\ -e^{-2t} & e^{4t} \end{vmatrix}
= e^{2t} + e^{2t} = 2\, e^{2t} \neq 0 \quad\text{for all } t \in \mathbb{R}.
\]
Since X(t) is invertible for t \in \mathbb{R}, the only solution for the linear system above is c = 0,
that is, c_1 = c_2 = 0. We conclude that the set \bigl\{x^{(1)}, x^{(2)}\bigr\} is linearly independent,
so it is a fundamental set of solutions to the differential equation above. C
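A short symbolic check of both claims (that the two functions solve the system and that
their solution matrix is invertible), assuming SymPy (illustrative, not from the notes):

    from sympy import symbols, Matrix, exp, simplify

    t = symbols('t')
    A = Matrix([[1, 3], [3, 1]])

    x1 = exp(-2*t)*Matrix([1, -1])
    x2 = exp(4*t)*Matrix([1, 1])

    # Both candidates satisfy x' = A x.
    print(simplify(x1.diff(t) - A*x1))   # zero vector
    print(simplify(x2.diff(t) - A*x2))   # zero vector

    # Their solution matrix has nonvanishing determinant 2 e^{2t}.
    X = Matrix.hstack(x1, x2)
    print(simplify(X.det()))             # prints 2*exp(2*t)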
Proof of Theorem 5.1.8: The superposition property in Theorem 5.1.6 says that given any
set of solutions \{x^{(1)}, \cdots, x^{(n)}\} of the differential equation x' = A\, x, the linear
combination x(t) = c_1\, x^{(1)}(t) + \cdots + c_n\, x^{(n)}(t) is also a solution. We now must
prove that, in the case that \{x^{(1)}, \cdots, x^{(n)}\} is linearly independent, every solution of
the differential equation is included in this linear combination.

Let x be any solution of the differential equation x' = A\, x. The uniqueness statement in
Theorem 5.1.3 implies that this is the only solution that at t_0 takes the value x(t_0). This
means that the initial data x(t_0) parametrizes all solutions to the differential equation. We
now try to find the constants \{c_1, \cdots, c_n\} solutions of the algebraic linear system
\[
x(t_0) = c_1\, x^{(1)}(t_0) + \cdots + c_n\, x^{(n)}(t_0).
\]
Introducing the matrix notation
\[
X(t_0) = \bigl[x^{(1)}(t_0), \cdots, x^{(n)}(t_0)\bigr],
\qquad
c = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix},
\]
the algebraic linear system has the form
\[
x(t_0) = X(t_0)\, c.
\]
This algebraic system has a unique solution c for every source x(t_0) iff the matrix X(t_0)
is invertible. This matrix is invertible iff \det\bigl(X(t_0)\bigr) \neq 0. The generalization of
Abel's Theorem to systems, Theorem 5.1.11, says that \det\bigl(X(t_0)\bigr) \neq 0 iff the set
\{x^{(1)}, \cdots, x^{(n)}\} is a fundamental set of solutions to the differential equation. This
establishes the Theorem.
Example 5.1.13: Find the general solution to the differential equation in Example 5.1.5 and
then use this general solution to find the solution of the initial value problem
\[
x' = A\, x, \qquad
x(0) = \begin{bmatrix} 1 \\ 5 \end{bmatrix},
\qquad
A = \begin{bmatrix} 3 & -2 \\ 2 & -2 \end{bmatrix}.
\]

Solution: From Example 5.1.5 we know that the general solution of the differential equa-
tion above can be written as
\[
x(t) = c_1 \begin{bmatrix} 2 \\ 1 \end{bmatrix} e^{2t} + c_2 \begin{bmatrix} 1 \\ 2 \end{bmatrix} e^{-t}.
\]
Before imposing the initial condition on this general solution, it is convenient to write this
general solution using a matrix valued function, X, as follows
\[
x(t) = \begin{bmatrix} 2\, e^{2t} & e^{-t} \\ e^{2t} & 2\, e^{-t} \end{bmatrix}
\begin{bmatrix} c_1 \\ c_2 \end{bmatrix}
\quad\Leftrightarrow\quad
x(t) = X(t)\, c,
\]
where we introduced the solution matrix and the constant vector, respectively,
\[
X(t) = \begin{bmatrix} 2\, e^{2t} & e^{-t} \\ e^{2t} & 2\, e^{-t} \end{bmatrix},
\qquad
c = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}.
\]
The initial condition fixes the vector c, that is, its components c_1, c_2, as follows,
\[
x(0) = X(0)\, c
\quad\Rightarrow\quad
c = \bigl[X(0)\bigr]^{-1} x(0).
\]
Since the solution matrix X at t = 0 has the form,
\[
X(0) = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}
\quad\Rightarrow\quad
\bigl[X(0)\bigr]^{-1} = \frac{1}{3}\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix},
\]
introducing \bigl[X(0)\bigr]^{-1} in the equation for c above we get
\[
c = \frac{1}{3}\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}
\begin{bmatrix} 1 \\ 5 \end{bmatrix}
= \begin{bmatrix} -1 \\ 3 \end{bmatrix}
\quad\Rightarrow\quad
c_1 = -1, \quad c_2 = 3.
\]
We conclude that the solution to the initial value problem above is given by
\[
x(t) = -\begin{bmatrix} 2 \\ 1 \end{bmatrix} e^{2t} + 3 \begin{bmatrix} 1 \\ 2 \end{bmatrix} e^{-t}.
\]
C
5.1.5. The Wronskian and Abel's Theorem. From the proof of Theorem 5.1.8 above
we see that it is convenient to introduce the notions of the solution matrix and the Wronskian
of a set of n solutions to an n \times n linear differential system.

Definition 5.1.10.
(a) A solution matrix of any set of vector functions \{x^{(1)}, \cdots, x^{(n)}\}, solutions to a
differential equation x' = A\, x, is the n \times n matrix valued function
\[
X(t) = \bigl[x^{(1)}(t), \cdots, x^{(n)}(t)\bigr]. \tag{5.1.15}
\]
X is called a fundamental matrix iff the set \{x^{(1)}, \cdots, x^{(n)}\} is a fundamental set.
(b) The Wronskian of the set \{x^{(1)}, \cdots, x^{(n)}\} is the function W(t) = \det\bigl(X(t)\bigr).

Remark: A fundamental matrix provides a more compact way to write the general solution
of a differential equation. The general solution in Eq. (5.1.14) can be rewritten as
\[
x_{gen}(t) = c_1\, x^{(1)}(t) + \cdots + c_n\, x^{(n)}(t)
= \bigl[x^{(1)}(t), \cdots, x^{(n)}(t)\bigr]
\begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix}
= X(t)\, c,
\qquad
c = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix}.
\]
This is a more compact notation for the general solution,
\[
x_{gen}(t) = X(t)\, c. \tag{5.1.16}
\]
Remark: The definition of the Wronskian in Def. 5.1.10 agrees with the Wronskian of
solutions to second order linear scalar equations given in Def. 2.1.9, \S 2.1. We can see
this relation if we compute the first order reduction of a second order equation. So, the
Wronskian of two solutions y_1, y_2 of the second order equation y'' + a_1\, y' + a_0\, y = 0 is
\[
W_{y_1 y_2} = \begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}.
\]
Now compute the first order reduction of the differential equation above, as in Theorem 5.1.4,
\[
x_1' = x_2, \qquad
x_2' = -a_0\, x_1 - a_1\, x_2.
\]
The solutions y_1, y_2 define two solutions of the 2 \times 2 linear system,
\[
x^{(1)} = \begin{bmatrix} y_1 \\ y_1' \end{bmatrix},
\qquad
x^{(2)} = \begin{bmatrix} y_2 \\ y_2' \end{bmatrix}.
\]
The Wronskian for the scalar equation coincides with the Wronskian for the system, because
\[
W_{y_1 y_2} = \begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}
= \begin{vmatrix} x_1^{(1)} & x_1^{(2)} \\ x_2^{(1)} & x_2^{(2)} \end{vmatrix}
= \det\bigl[x^{(1)}, x^{(2)}\bigr] = W.
\]
Example 5.1.14: Find two fundamental matrices for the linear homogeneous system in
Example 5.1.11.

Solution: One fundamental matrix is simple to find; we use the solutions in Example 5.1.11,
\[
X = \bigl[x^{(1)}, x^{(2)}\bigr]
\quad\Rightarrow\quad
X(t) = \begin{bmatrix} e^{-2t} & e^{4t} \\ -e^{-2t} & e^{4t} \end{bmatrix}.
\]
A second fundamental matrix can be obtained by multiplying each solution above by any
nonzero constant. For example, another fundamental matrix is
\[
\tilde X = \bigl[2\, x^{(1)}, 3\, x^{(2)}\bigr]
\quad\Rightarrow\quad
\tilde X(t) = \begin{bmatrix} 2\, e^{-2t} & 3\, e^{4t} \\ -2\, e^{-2t} & 3\, e^{4t} \end{bmatrix}.
\]
C
Example 5.1.15: Compute the Wronskian of the vector valued functions given in Exam-
ple 5.1.11, that is, x^{(1)} = e^{-2t}\begin{bmatrix} 1 \\ -1 \end{bmatrix} and
x^{(2)} = e^{4t}\begin{bmatrix} 1 \\ 1 \end{bmatrix}.

Solution: The Wronskian is the determinant of the solution matrix, with the vectors
placed in any order. For example, we can choose the order \bigl[x^{(1)}, x^{(2)}\bigr]. If we choose
the order \bigl[x^{(2)}, x^{(1)}\bigr], this second Wronskian is the negative of the first one. Choosing
the first order for the solutions, we get
\[
W(t) = \det\bigl[x^{(1)}, x^{(2)}\bigr]
= \begin{vmatrix} e^{-2t} & e^{4t} \\ -e^{-2t} & e^{4t} \end{vmatrix}
= e^{2t} + e^{2t}.
\]
We conclude that W(t) = 2\, e^{2t}. C
Example 5.1.16: Show that the set of functions
\Bigl\{ x^{(1)} = \begin{bmatrix} e^{3t} \\ 2\, e^{3t} \end{bmatrix},\
x^{(2)} = \begin{bmatrix} e^{-t} \\ -2\, e^{-t} \end{bmatrix} \Bigr\}
is linearly independent for all t \in \mathbb{R}.

Solution: We compute the determinant of the matrix
X(t) = \begin{bmatrix} e^{3t} & e^{-t} \\ 2\, e^{3t} & -2\, e^{-t} \end{bmatrix}, that is,
\[
w(t) = \begin{vmatrix} e^{3t} & e^{-t} \\ 2\, e^{3t} & -2\, e^{-t} \end{vmatrix}
= -2\, e^{2t} - 2\, e^{2t}
\quad\Rightarrow\quad
w(t) = -4\, e^{2t} \neq 0 \quad\text{for all } t \in \mathbb{R}.
\]
C
We now generalize Abel's Theorem 2.1.12 from a single equation to an n \times n linear system.

Theorem 5.1.11 (Abel). The Wronskian function W = \det\bigl(X(t)\bigr) of a solution matrix
X = \bigl[x^{(1)}, \cdots, x^{(n)}\bigr] of the linear system x' = A(t)\, x, where A is an n \times n
continuous matrix valued function, satisfies the differential equation
\[
W'(t) = \operatorname{tr}\bigl(A(t)\bigr)\, W(t).
\]
Proof of Theorem 5.1.11: The proof is based on an identity satisfied by the determinant
of certain matrix valued functions. The proof of this identity is quite involved, so we do not
provide it here. The identity is the following: Every n \times n, differentiable, invertible, matrix
valued function Z, with values Z(t) for t \in \mathbb{R}, satisfies the identity
\[
\frac{d}{dt}\det(Z) = \det(Z)\, \operatorname{tr}\Bigl(Z^{-1}\, \frac{dZ}{dt}\Bigr).
\]
We use this identity with any fundamental matrix X = \bigl[x^{(1)}, \cdots, x^{(n)}\bigr] of the linear
homogeneous differential system x' = A\, x. Recalling that the Wronskian is
W(t) = \det\bigl(X(t)\bigr), the identity above says that
\[
W'(t) = W(t)\, \operatorname{tr}\bigl(X^{-1} X'\bigr).
\]
Since each column of X is a solution of the system, we have
X' = \bigl[A\, x^{(1)}, \cdots, A\, x^{(n)}\bigr] = A\, X, where the equation on the far right comes
from the definition of matrix multiplication. Replacing this equation in the Wronskian
equation we get
\[
W'(t) = W(t)\, \operatorname{tr}\bigl(X^{-1} A\, X\bigr)
= W(t)\, \operatorname{tr}\bigl(X\, X^{-1} A\bigr)
= W(t)\, \operatorname{tr}(A),
\]
where in the second equation above we used a property of the trace of three matrices:
\operatorname{tr}(ABC) = \operatorname{tr}(CAB) = \operatorname{tr}(BCA). Therefore, we have
seen that the Wronskian satisfies the equation
\[
W'(t) = \operatorname{tr}\bigl(A(t)\bigr)\, W(t).
\]
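Abel's formula is easy to verify on the fundamental matrix of Example 5.1.14, where
tr(A) = 2 and W(t) = 2 e^{2t}. A minimal sketch, assuming SymPy (illustrative only):

    from sympy import symbols, Matrix, exp, simplify

    t = symbols('t')
    A = Matrix([[1, 3], [3, 1]])

    # Fundamental matrix from Example 5.1.14 and its Wronskian W(t) = 2 e^{2t}.
    X = Matrix([[exp(-2*t), exp(4*t)], [-exp(-2*t), exp(4*t)]])
    W = X.det()

    # Abel: W'(t) = tr(A) W(t).
    print(simplify(W.diff(t) - A.trace()*W))   # prints 0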
5.1.6. Exercises.
5.1.1.- . 5.1.2.- .
Remark: See \S 8.4 for the definition of the exponential of a square matrix. In particular,
recall the following properties of e^{At}, for a constant square matrix A and any s, t \in \mathbb{R}:
\[
\frac{d}{dt}\, e^{At} = A\, e^{At} = e^{At} A,
\qquad
\bigl(e^{At}\bigr)^{-1} = e^{-At},
\qquad
e^{As}\, e^{At} = e^{A(s+t)}.
\]
Proof of Theorem 5.2.1: We generalize to linear systems the integrating factor method
used in \S 1.1 to solve linear scalar equations. Therefore, rewrite the equation as
x' - A\, x = 0, where 0 is the zero n-vector, and then multiply the equation on the left by
e^{-At},
\[
e^{-At}\, x' - e^{-At} A\, x = 0
\quad\Leftrightarrow\quad
e^{-At}\, x' - A\, e^{-At}\, x = 0,
\]
since e^{-At} A = A\, e^{-At}. We now use the properties of the matrix exponential to rewrite
the system as
\[
e^{-At}\, x' + \bigl(e^{-At}\bigr)'\, x = 0
\quad\Leftrightarrow\quad
\bigl(e^{-At}\, x\bigr)' = 0.
\]
If we integrate in the last equation above, and we denote by c a constant n-vector, we get
\[
e^{-At}\, x(t) = c
\quad\Rightarrow\quad
x(t) = e^{At}\, c,
\]
where we used \bigl(e^{-At}\bigr)^{-1} = e^{At}. If we now evaluate at t = t_0 we get the constant
vector c,
\[
x_0 = x(t_0) = e^{At_0}\, c
\quad\Rightarrow\quad
c = e^{-At_0}\, x_0.
\]
Using this expression for c in the solution formula above we get
\[
x(t) = e^{At}\, e^{-At_0}\, x_0
\quad\Rightarrow\quad
x(t) = e^{A(t-t_0)}\, x_0.
\]
This establishes the Theorem.
Example 5.2.1: Compute the exponential function eAt and use it to express the vector-
valued function x solution to the initial value problem
0 1 2 x
x = A x, A= , x(0) = x0 = 01 .
2 1 x02
Solution: The exponential of a matrix is simple to compute in the case that the matrix
is diagonalizable. So we start checking whether matrix A above is diagonalizable. Theo-
rem 8.3.8 says that a 22 matrix is diagonalizable if it has two eigenvectors not proportional
to each other. In oder to find the eigenvectors of A we need to compute its eigenvalues,
which are the roots of the characteristic polynomial
(1 ) 2
p() = det(A I2 ) =
= (1 )2 4.
2 (1 )
The roots of the characteristic polynomial are
( 1)2 = 4 = 1 2 + = 3, - = 1.
The eigenvectors corresponding to the eigenvalue + = 3 are the solutions v+ of the linear
system (A 3I2 )v+ = 0. To find them, we perform Gauss operations on the matrix
2 2 1 1 + + + 1
A 3I2 = v1 = v2 v = .
2 2 0 0 1
The eigenvectors corresponding to the eigenvalue - = 1 are the solutions v- of the linear
system (A + I2 )v- = 0. To find them, we perform Gauss operations on the matrix
2 2 1 1 1
A + I2 = v1- = v2- v- = .
2 2 0 0 1
Summarizing, the eigenvalues and eigenvectors of matrix A are following,
1 1
+ = 3, v+ = , and - = 1, v- = .
1 1
Then, Theorem 8.3.8 says that the matrix A is diagonalizable, that is A = P DP 1 , where
1 1 3 0 1 1 1
P = , D= , P 1 = .
1 1 0 1 2 1 1
Now Theorem ?? says that the exponential of At is given by
1 1 e3t
At Dt 1 0 1 1 1
e = Pe P = ,
1 1 0 et 2 1 1
so we conclude that
1 (e3t + et ) (e3t et )
At
e = . (5.2.2)
2 (e3t et ) (e3t + et )
Finally, we get the solution to the initial value problem above,
1 (e3t + et ) (e3t et ) x01
x(t) = eAt x0 = .
2 (e3t et ) (e3t + et ) x02
In components, this means
1 (x01 + x02 ) e3t + (x01 x02 ) et
x(t) = .
2 (x01 + x02 ) e3t (x01 x02 ) et
C
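Remark: The diagonalization computation above can be checked numerically; here is a minimal Python sketch, assuming NumPy and SciPy are available, that rebuilds e^{At} from the eigenpairs of A and compares it with Eq. (5.2.2) and with SciPy's general-purpose matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])

def exp_At(A, t):
    # Diagonalize A = P D P^{-1}; then e^{At} = P e^{Dt} P^{-1}.
    eigvals, P = np.linalg.eig(A)
    return P @ np.diag(np.exp(eigvals * t)) @ np.linalg.inv(P)

t = 0.7
closed_form = 0.5 * np.array([
    [np.exp(3*t) + np.exp(-t), np.exp(3*t) - np.exp(-t)],
    [np.exp(3*t) - np.exp(-t), np.exp(3*t) + np.exp(-t)],
])
print(np.allclose(exp_At(A, t), closed_form))  # True
print(np.allclose(expm(A * t), closed_form))   # True
```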
Example 5.2.2: Find the functions x₁, x₂ solutions of the 2 × 2 differential system
\[
x_1' = x_1 - x_2, \qquad x_2' = -x_1 + x_2.
\]
Solution: As is usually the case, the equations in the system above are coupled. One must know the function x₂ in order to integrate the first equation to obtain the function x₁. Similarly, one has to know the function x₁ to integrate the second equation to get the function x₂. The system is coupled; one cannot integrate one equation at a time. One must integrate the whole system together.

However, the coefficient matrix of the system above is diagonalizable, and in this case the equations can be decoupled. If we add the two equations, and if we subtract the second equation from the first, we obtain, respectively,
\[
(x_1 + x_2)' = 0, \qquad (x_1 - x_2)' = 2\,(x_1 - x_2).
\]
To see more clearly what we have done, let us introduce the new unknowns y₁ = x₁ + x₂ and y₂ = x₁ − x₂, and rewrite the equations above with these new unknowns,
\[
y_1' = 0, \qquad y_2' = 2\, y_2.
\]
We have decoupled the original system. The equations for x₁ and x₂ are coupled, but we have found a linear combination of the equations such that the equations for y₁ and y₂ are not coupled. We now solve each equation independently of the other,
\[
y_1' = 0 \;\Rightarrow\; y_1 = c_1, \qquad y_2' = 2\,y_2 \;\Rightarrow\; y_2 = c_2\, e^{2t},
\]
with c₁, c₂ ∈ ℝ. Having obtained the solutions of the decoupled system, we now transform these solutions back to the original unknown functions. From the definitions of y₁ and y₂ we see that
\[
x_1 = \frac{1}{2}(y_1 + y_2), \qquad x_2 = \frac{1}{2}(y_1 - y_2).
\]
We conclude that for all c₁, c₂ ∈ ℝ the functions x₁, x₂ below are solutions of the 2 × 2 differential system in the example, namely,
\[
x_1(t) = \frac{1}{2}\big(c_1 + c_2\, e^{2t}\big), \qquad x_2(t) = \frac{1}{2}\big(c_1 - c_2\, e^{2t}\big).
\]
C
The equations for x1 and x2 in the example above are coupled, so we found an appropriate
linear combination of the equations and the unknowns such that the equations for the new
unknown functions, y1 and y2 , are decoupled. We integrated each equation independently
of the other, and we finally transformed the solutions back to the original unknowns x1 and
x2 . The key step is to find the transformation from x1 , x2 to y1 , y2 . For general systems
this transformation may not exist. It exists, however, for diagonalizable systems.
Remark: Recall Theorem 8.3.8, which says that an n × n matrix A is diagonalizable iff A has a linearly independent set of n eigenvectors. Furthermore, if λᵢ, v^{(i)} are eigenpairs of A, then the decomposition A = PDP^{-1} holds for
\[
P = \big[v^{(1)}, \dots, v^{(n)}\big], \qquad D = \mathrm{diag}\big[\lambda_1, \dots, \lambda_n\big].
\]
Remark: We show two proofs of this Theorem. The first one is just a verification that the expression in Eq. (5.2.3) satisfies the differential equation x' = Ax. The second proof follows the same idea presented to solve Example 5.2.2: we decouple the system, we solve the uncoupled system, and we transform back to the original unknowns. The differential system is decoupled when written in the basis of eigenvectors of the coefficient matrix.

First proof of Theorem 5.2.2: Each function x^{(i)} = e^{λᵢt} v^{(i)}, for i = 1, ..., n, is a solution of the system x' = Ax, because
\[
x^{(i)\prime} = \lambda_i\, e^{\lambda_i t}\, v^{(i)}, \qquad
A\, x^{(i)} = A\big(e^{\lambda_i t}\, v^{(i)}\big) = e^{\lambda_i t}\, A\, v^{(i)} = \lambda_i\, e^{\lambda_i t}\, v^{(i)}.
\]
Since the eigenvectors v^{(1)}, ..., v^{(n)} form a linearly independent set, the set of functions {x^{(1)}, ..., x^{(n)}} is a fundamental set of solutions to the system. Therefore, the superposition property says that the general solution to the system is
\[
x(t) = c_1\, e^{\lambda_1 t}\, v^{(1)} + \cdots + c_n\, e^{\lambda_n t}\, v^{(n)}.
\]
The constants c₁, ..., cₙ are computed by evaluating the equation above at t₀ and recalling the initial condition x(t₀) = x₀. This establishes the Theorem.
Remark: In the proof above we verify that the functions x^{(i)} = e^{λᵢt} v^{(i)} are solutions, but we do not say why we chose these functions in the first place. In the proof below we construct the solutions, and we find that they are the ones given in the proof above.

Second proof of Theorem 5.2.2: Since the coefficient matrix A is diagonalizable, there exist an invertible matrix P and a diagonal matrix D such that A = PDP^{-1}. Introduce this expression into the differential equation and multiply the whole equation by P^{-1},
\[
P^{-1} x'(t) = P^{-1} \big(PDP^{-1}\big)\, x(t).
\]
Notice that to multiply the differential system by the matrix P^{-1} means to perform a very particular type of linear combination among the equations in the system. This is the linear combination that decouples the system. Indeed, since the matrix A is constant, so are P and D. In particular, P^{-1} x' = (P^{-1} x)', hence
\[
\big(P^{-1} x\big)' = D\, \big(P^{-1} x\big).
\]
Define the new unknown y = P^{-1} x; then the differential equation is
\[
y'(t) = D\, y(t).
\]
Since the matrix D is diagonal, the system above is decoupled in the variable y. Transform the initial condition too, that is, P^{-1} x(t₀) = P^{-1} x₀, and use the notation y₀ = P^{-1} x₀, so we get the initial condition in terms of the y variable,
\[
y(t_0) = y_0.
\]
Solve the decoupled initial value problem y'(t) = D y(t),
\[
y_1'(t) = \lambda_1\, y_1(t), \;\dots,\; y_n'(t) = \lambda_n\, y_n(t)
\quad\Longrightarrow\quad
y(t) = \begin{bmatrix} c_1\, e^{\lambda_1 t} \\ \vdots \\ c_n\, e^{\lambda_n t} \end{bmatrix}.
\]
Once y is known, we transform back to the original unknown, x = P y, that is,
\[
x(t) = \big[v^{(1)}, \dots, v^{(n)}\big]
\begin{bmatrix} c_1\, e^{\lambda_1 t} \\ \vdots \\ c_n\, e^{\lambda_n t} \end{bmatrix}
= c_1\, e^{\lambda_1 t}\, v^{(1)} + \cdots + c_n\, e^{\lambda_n t}\, v^{(n)}.
\]
This is Eq. (5.2.3). Evaluating it at t₀ we get Eq. (5.2.4). This establishes the Theorem.
Example 5.2.3: Find the vector-valued function x solution to the differential system
\[
x' = A x, \qquad x(0) = \begin{bmatrix} 3 \\ 2 \end{bmatrix}, \qquad
A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}.
\]
Solution: First we need to find out whether the coefficient matrix A is diagonalizable or not. Theorem 8.3.8 says that a 2 × 2 matrix is diagonalizable iff there exists a linearly independent set of two eigenvectors. So we start by computing the matrix eigenvalues, which are the roots of the characteristic polynomial
\[
p(\lambda) = \det(A - \lambda I_2) = \begin{vmatrix} 1-\lambda & 2 \\ 2 & 1-\lambda \end{vmatrix} = (1-\lambda)^2 - 4.
\]
The roots of the characteristic polynomial are
\[
(\lambda - 1)^2 = 4 \quad\Longrightarrow\quad \lambda_{\pm} = 1 \pm 2 \quad\Longrightarrow\quad \lambda_{+} = 3, \quad \lambda_{-} = -1.
\]
The eigenvectors corresponding to the eigenvalue λ₊ = 3 are the solutions v⁺ of the linear system (A − 3I₂)v⁺ = 0. To find them, we perform Gauss operations on the matrix
\[
A - 3I_2 = \begin{bmatrix} -2 & 2 \\ 2 & -2 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & -1 \\ 0 & 0 \end{bmatrix}
\quad\Longrightarrow\quad v_1^{+} = v_2^{+} \quad\Longrightarrow\quad v^{+} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
The eigenvectors corresponding to the eigenvalue λ₋ = −1 are the solutions v⁻ of the linear system (A + I₂)v⁻ = 0. To find them, we perform Gauss operations on the matrix
\[
A + I_2 = \begin{bmatrix} 2 & 2 \\ 2 & 2 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}
\quad\Longrightarrow\quad v_1^{-} = -v_2^{-} \quad\Longrightarrow\quad v^{-} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}.
\]
Summarizing, the eigenvalues and eigenvectors of the matrix A are the following,
\[
\lambda_{+} = 3, \quad v^{+} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad\text{and}\qquad
\lambda_{-} = -1, \quad v^{-} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}.
\]
Once we have the eigenvalues and eigenvectors of the coefficient matrix, Eq. (5.2.3) gives us the general solution
\[
x(t) = c_{+}\, e^{3t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} + c_{-}\, e^{-t} \begin{bmatrix} -1 \\ 1 \end{bmatrix},
\]
where the coefficients c₊ and c₋ are solutions of the initial condition equation
\[
c_{+}\begin{bmatrix} 1 \\ 1 \end{bmatrix} + c_{-}\begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}
\quad\Longrightarrow\quad
\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} c_{+} \\ c_{-} \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}
\quad\Longrightarrow\quad
\begin{bmatrix} c_{+} \\ c_{-} \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 3 \\ 2 \end{bmatrix}.
\]
We conclude that c₊ = 5/2 and c₋ = −1/2, hence
\[
x(t) = \frac{5}{2}\, e^{3t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} - \frac{1}{2}\, e^{-t} \begin{bmatrix} -1 \\ 1 \end{bmatrix}
\quad\Longrightarrow\quad
x(t) = \frac{1}{2}\begin{bmatrix} 5e^{3t} + e^{-t} \\ 5e^{3t} - e^{-t} \end{bmatrix}.
\]
C
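Remark: The closed form above is easy to sanity check numerically; a minimal Python sketch, assuming SciPy is available, integrates the initial value problem and compares the result with the formula:

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[1.0, 2.0], [2.0, 1.0]])
x0 = np.array([3.0, 2.0])

# Integrate x' = A x with x(0) = x0 up to t = 1.
sol = solve_ivp(lambda t, x: A @ x, (0.0, 1.0), x0,
                t_eval=[1.0], rtol=1e-10, atol=1e-12)

t = 1.0
closed = 0.5 * np.array([5*np.exp(3*t) + np.exp(-t),
                         5*np.exp(3*t) - np.exp(-t)])
print(np.allclose(sol.y[:, -1], closed, rtol=1e-6))  # True
```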
Example 5.2.4: Find the general solution to the differential equation
\[
x' = A x, \qquad A = \begin{bmatrix} 1 & 3 \\ 3 & 1 \end{bmatrix}.
\]
Solution: We need to find the eigenvalues and eigenvectors of the coefficient matrix A. But they were found in Example 8.3.4, and the result is
\[
\lambda_{+} = 4, \quad v^{(+)} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad\text{and}\qquad
\lambda_{-} = -2, \quad v^{(-)} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}.
\]
With these eigenpairs we construct fundamental solutions of the differential equation,
\[
\lambda_{+} = 4, \quad v^{(+)} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\quad\Longrightarrow\quad
x^{(+)}(t) = e^{4t} \begin{bmatrix} 1 \\ 1 \end{bmatrix},
\]
\[
\lambda_{-} = -2, \quad v^{(-)} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}
\quad\Longrightarrow\quad
x^{(-)}(t) = e^{-2t} \begin{bmatrix} -1 \\ 1 \end{bmatrix}.
\]
Therefore, the general solution of the differential equation is
\[
x(t) = c_{+}\, e^{4t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} + c_{-}\, e^{-2t} \begin{bmatrix} -1 \\ 1 \end{bmatrix},
\qquad c_{+}, c_{-} \in \mathbb{R}.
\]
C
The formula in Eq. (5.2.3) is a remarkably simple way to write the general solution of the equation x' = A x in the case A is diagonalizable. It is a formula easy to remember: you just add all terms of the form e^{λᵢt} v^{(i)}, where λᵢ, v^{(i)} is any eigenpair of A. But this formula is not the best one to write down solutions to initial value problems. As you can see in Theorem 5.2.2, we did not provide a formula for that. We only said that the constants c₁, ..., cₙ are the solutions of the algebraic linear system in (5.2.4). But we did not write down the solution for the c's. It is too complicated in this notation, though it is not difficult to do in every particular case, as we did near the end of Example 5.2.3.

A simple way to introduce the initial condition into the expression of the solution is with a fundamental matrix, which we introduced in Eq. (5.1.10).

Theorem 5.2.3 (Fundamental Matrix Expression). If the n × n constant matrix A is diagonalizable, with a set of linearly independent eigenvectors {v^{(1)}, ..., v^{(n)}} and corresponding eigenvalues {λ₁, ..., λₙ}, then the initial value problem x' = A x with x(t₀) = x₀ has a unique solution given by
\[
x(t) = X(t)\, X(t_0)^{-1}\, x_0, \tag{5.2.5}
\]
where X(t) = [e^{λ₁t} v^{(1)}, ..., e^{λₙt} v^{(n)}] is a fundamental matrix of the system.
Example 5.2.5: Find a fundamental matrix for the system below and use it to write down the general solution to the system,
\[
x' = A x, \qquad A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}.
\]
Solution: One way to find a fundamental matrix of a system is to start by computing the eigenvalues and eigenvectors of the coefficient matrix. The differential equation in this example is the same as the one given in Example 5.2.3, where we found that the eigenvalues and eigenvectors of the coefficient matrix are
\[
\lambda_{+} = 3, \quad v^{+} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad\text{and}\qquad
\lambda_{-} = -1, \quad v^{-} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}.
\]
We see that the coefficient matrix is diagonalizable, so with the eigenpairs above we can construct a fundamental set of solutions,
\[
\Big\{\, x^{(+)}(t) = e^{3t} \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad
x^{(-)}(t) = e^{-t} \begin{bmatrix} -1 \\ 1 \end{bmatrix} \,\Big\}.
\]
From here we construct a fundamental matrix,
\[
X(t) = \begin{bmatrix} e^{3t} & -e^{-t} \\ e^{3t} & e^{-t} \end{bmatrix}.
\]
Then we have the general solution x_gen(t) = X(t) c, where c = [c₊; c₋], that is,
\[
x_{gen}(t) = \begin{bmatrix} e^{3t} & -e^{-t} \\ e^{3t} & e^{-t} \end{bmatrix}
\begin{bmatrix} c_{+} \\ c_{-} \end{bmatrix}
\quad\Longrightarrow\quad
x_{gen}(t) = c_{+}\, e^{3t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} + c_{-}\, e^{-t} \begin{bmatrix} -1 \\ 1 \end{bmatrix}.
\]
C
Example 5.2.6: Use the fundamental matrix found in Example 5.2.5 to write down the solution to the initial value problem
\[
x' = A x, \qquad x(0) = x_0 = \begin{bmatrix} x_{01} \\ x_{02} \end{bmatrix}, \qquad
A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}.
\]
Solution: In Example 5.2.5 we found the general solution to the differential equation,
\[
x_{gen}(t) = \begin{bmatrix} e^{3t} & -e^{-t} \\ e^{3t} & e^{-t} \end{bmatrix}
\begin{bmatrix} c_{+} \\ c_{-} \end{bmatrix}.
\]
The initial condition has the form
\[
\begin{bmatrix} x_{01} \\ x_{02} \end{bmatrix} = x(0) = X(0)\, c = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} c_{+} \\ c_{-} \end{bmatrix}.
\]
We need to compute the inverse of the matrix X(0),
\[
X(0)^{-1} = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix},
\]
so we compute the constant vector c,
\[
\begin{bmatrix} c_{+} \\ c_{-} \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} x_{01} \\ x_{02} \end{bmatrix}.
\]
So the solution to the initial value problem is
\[
x(t) = X(t)\, X(0)^{-1} x_0
\quad\Longrightarrow\quad
x(t) = \begin{bmatrix} e^{3t} & -e^{-t} \\ e^{3t} & e^{-t} \end{bmatrix}
\frac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} x_{01} \\ x_{02} \end{bmatrix}.
\]
If we compute the matrix product in the last equation explicitly, we get
\[
x(t) = \frac{1}{2}\begin{bmatrix} e^{3t}+e^{-t} & e^{3t}-e^{-t} \\ e^{3t}-e^{-t} & e^{3t}+e^{-t} \end{bmatrix}
\begin{bmatrix} x_{01} \\ x_{02} \end{bmatrix}.
\]
C
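Remark: The fundamental matrix formula in Eq. (5.2.5) is also convenient for computation; a minimal Python sketch, assuming NumPy and using the eigenpairs above:

```python
import numpy as np

lam = np.array([3.0, -1.0])              # eigenvalues of A = [[1, 2], [2, 1]]
V = np.array([[1.0, -1.0],
              [1.0,  1.0]])              # eigenvectors as columns

def X(t):
    # Fundamental matrix X(t) = [e^{lam_1 t} v1, e^{lam_2 t} v2]:
    # multiplying V by the row exp(lam * t) scales column j by e^{lam_j t}.
    return V * np.exp(lam * t)

x0 = np.array([1.0, 0.0])
t = 0.25
x_t = X(t) @ np.linalg.solve(X(0.0), x0)  # Eq. (5.2.5) with t0 = 0
print(x_t)
```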
Remark: In Example 5.2.6 above we found that, for
\[
A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix},
\]
it holds that
\[
X(t)\, X(0)^{-1} = \frac{1}{2}\begin{bmatrix} e^{3t}+e^{-t} & e^{3t}-e^{-t} \\ e^{3t}-e^{-t} & e^{3t}+e^{-t} \end{bmatrix},
\]
which is precisely the same as the expression for e^{At} we found in Eq. (5.2.2) in Example 5.2.1,
\[
e^{At} = \frac{1}{2}\begin{bmatrix} e^{3t}+e^{-t} & e^{3t}-e^{-t} \\ e^{3t}-e^{-t} & e^{3t}+e^{-t} \end{bmatrix}.
\]
This is not a coincidence. If a matrix A is diagonalizable, then e^{A(t−t₀)} = X(t)X(t₀)^{-1}. We summarize this result in the theorem below.

Theorem 5.2.4. If the n × n constant matrix A is diagonalizable, with eigenpairs λᵢ, v^{(i)}, for i = 1, ..., n, then
\[
e^{A(t-t_0)} = X(t)\, X(t_0)^{-1},
\]
where X(t) = [e^{λ₁t} v^{(1)}, ..., e^{λₙt} v^{(n)}].
Proof of Theorem 5.2.4: We start by rewriting the formula for the fundamental matrix given in Theorem 5.2.3,
\[
X(t) = \big[v^{(1)} e^{\lambda_1 t}, \dots, v^{(n)} e^{\lambda_n t}\big]
= \big[v^{(1)}, \dots, v^{(n)}\big]
\begin{bmatrix} e^{\lambda_1 t} & & 0 \\ & \ddots & \\ 0 & & e^{\lambda_n t} \end{bmatrix}.
\]
The diagonal matrix in the last equation above can be written as
\[
\begin{bmatrix} e^{\lambda_1 t} & & 0 \\ & \ddots & \\ 0 & & e^{\lambda_n t} \end{bmatrix}
= \mathrm{diag}\big[e^{\lambda_1 t}, \dots, e^{\lambda_n t}\big].
\]
If we recall the exponential of a matrix defined in §8.4, we can see that the matrix above is an exponential, since
\[
\mathrm{diag}\big[e^{\lambda_1 t}, \dots, e^{\lambda_n t}\big] = e^{Dt},
\qquad\text{where}\quad Dt = \mathrm{diag}\big[\lambda_1 t, \dots, \lambda_n t\big].
\]
One more thing: let us denote P = [v^{(1)}, ..., v^{(n)}], as we did in §8.3. If we use these two expressions in the formula for X above, we get
\[
X(t) = P\, e^{Dt}.
\]
Using properties of invertible matrices, given in §8.2, and properties of the exponential of a matrix, given in §8.4, we get
\[
X(t_0)^{-1} = \big(P\, e^{Dt_0}\big)^{-1} = e^{-Dt_0}\, P^{-1},
\]
where we used that (e^{Dt₀})^{-1} = e^{−Dt₀}. These manipulations lead us to the formula
\[
X(t)\, X(t_0)^{-1} = P\, e^{Dt}\, e^{-Dt_0}\, P^{-1}
\quad\Longrightarrow\quad
X(t)\, X(t_0)^{-1} = P\, e^{D(t-t_0)}\, P^{-1}.
\]
Since A is diagonalizable, with A = PDP^{-1}, we know from §8.4 that
\[
P\, e^{D(t-t_0)}\, P^{-1} = e^{A(t-t_0)}.
\]
We conclude that
\[
X(t)\, X(t_0)^{-1} = e^{A(t-t_0)}.
\]
This establishes the Theorem.
Example 5.2.7: Verify Theorem 5.2.4 for the matrix A = \(\begin{bmatrix} 1 & 3 \\ 3 & 1 \end{bmatrix}\) and t₀ = 0.

Solution: We know from Example 5.2.4 that the eigenpairs of the matrix A above are
\[
\lambda_{+} = 4, \quad v^{(+)} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad\text{and}\qquad
\lambda_{-} = -2, \quad v^{(-)} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}.
\]
This means that a fundamental matrix for A is
\[
X(t) = \begin{bmatrix} e^{4t} & -e^{-2t} \\ e^{4t} & e^{-2t} \end{bmatrix}.
\]
This fundamental matrix at t = 0 is
\[
X(0) = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}, \qquad
X(0)^{-1} = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}.
\]
Therefore we get that
\[
X(t)\, X(0)^{-1} = \begin{bmatrix} e^{4t} & -e^{-2t} \\ e^{4t} & e^{-2t} \end{bmatrix}
\frac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}
= \frac{1}{2}\begin{bmatrix} e^{4t}+e^{-2t} & e^{4t}-e^{-2t} \\ e^{4t}-e^{-2t} & e^{4t}+e^{-2t} \end{bmatrix}.
\]
On the other hand, e^{At} can be computed using the formula e^{At} = P e^{Dt} P^{-1}, where
\[
Dt = \begin{bmatrix} 4t & 0 \\ 0 & -2t \end{bmatrix}, \qquad
P = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}, \qquad
P^{-1} = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}.
\]
Then we get
\[
e^{At} = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} e^{4t} & 0 \\ 0 & e^{-2t} \end{bmatrix}
\frac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix},
\]
so we get
\[
e^{At} = \frac{1}{2}\begin{bmatrix} e^{4t}+e^{-2t} & e^{4t}-e^{-2t} \\ e^{4t}-e^{-2t} & e^{4t}+e^{-2t} \end{bmatrix}.
\]
We conclude that e^{At} = X(t)X(0)^{-1}. C
5.2.3. Nonhomogeneous Systems. The solution formula of an initial value problem for a nonhomogeneous linear system is a generalization of the solution formula for a scalar equation given in §1.2. We use the integrating factor method, just as in §1.2.

Theorem 5.2.5 (Nonhomogeneous Systems). If A is a constant n × n matrix and b is a continuous n-vector function, then the initial value problem
\[
x'(t) = A\, x(t) + b(t), \qquad x(t_0) = x_0,
\]
has a unique solution for every initial condition t₀ ∈ ℝ and x₀ ∈ ℝⁿ, given by
\[
x(t) = e^{A(t-t_0)}\, x_0 + e^{A(t-t_0)} \int_{t_0}^{t} e^{-A(\tau - t_0)}\, b(\tau)\, d\tau. \tag{5.2.6}
\]

Remark: Since e^{±At₀} are constant matrices, an equivalent expression for Eq. (5.2.6) is
\[
x(t) = e^{A(t-t_0)}\, x_0 + e^{At} \int_{t_0}^{t} e^{-A\tau}\, b(\tau)\, d\tau.
\]

Proof of Theorem 5.2.5: As in the scalar case, rewrite the equation as x' − Ax = b and multiply it on the left by the integrating factor e^{−At}. Since e^{−At} x' − A e^{−At} x = (e^{−At} x)', we get
\[
\big(e^{-At}\, x\big)' = e^{-At}\, b(t).
\]
Integrating from t₀ to t, we get
\[
e^{-At}\, x(t) - e^{-At_0}\, x_0 = \int_{t_0}^{t} e^{-A\tau}\, b(\tau)\, d\tau
\quad\Longrightarrow\quad
x(t) = e^{At} e^{-At_0}\, x_0 + e^{At} \int_{t_0}^{t} e^{-A\tau}\, b(\tau)\, d\tau.
\]
Finally, using the group property of the exponential, e^{At} e^{−At₀} = e^{A(t−t₀)} and e^{At₀} e^{−Aτ} = e^{−A(τ−t₀)}, we get
\[
x(t) = e^{A(t-t_0)}\, x_0 + e^{A(t-t_0)} \int_{t_0}^{t} e^{-A(\tau - t_0)}\, b(\tau)\, d\tau.
\]
This establishes the Theorem.
Example 5.2.8: Use Eq. (5.2.6) to find the solution to the initial value problem
\[
x'(t) = A\, x(t) + b, \qquad x(0) = \begin{bmatrix} 3 \\ 2 \end{bmatrix}, \qquad
A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}, \qquad
b = \begin{bmatrix} 1 \\ 2 \end{bmatrix}.
\]
Solution: In Example 5.2.3 we found the eigenvalues and eigenvectors of the coefficient matrix, and the result is
\[
\lambda_1 = 3, \quad v^{(1)} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad\text{and}\qquad
\lambda_2 = -1, \quad v^{(2)} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}.
\]
The eigenvectors above say that A is diagonalizable,
\[
A = PDP^{-1}, \qquad P = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}, \qquad
D = \begin{bmatrix} 3 & 0 \\ 0 & -1 \end{bmatrix}.
\]
We also know how to compute the exponential of a diagonalizable matrix,
\[
e^{At} = P\, e^{Dt}\, P^{-1} = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} e^{3t} & 0 \\ 0 & e^{-t} \end{bmatrix}
\frac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix},
\]
so we conclude that
\[
e^{At} = \frac{1}{2}\begin{bmatrix} e^{3t}+e^{-t} & e^{3t}-e^{-t} \\ e^{3t}-e^{-t} & e^{3t}+e^{-t} \end{bmatrix}
\quad\Longrightarrow\quad
e^{-At} = \frac{1}{2}\begin{bmatrix} e^{-3t}+e^{t} & e^{-3t}-e^{t} \\ e^{-3t}-e^{t} & e^{-3t}+e^{t} \end{bmatrix}.
\]
The solution to the initial value problem above is
\[
x(t) = e^{At} x_0 + e^{At} \int_0^t e^{-A\tau}\, b\, d\tau.
\]
Since
\[
e^{At} x_0 = \frac{1}{2}\begin{bmatrix} e^{3t}+e^{-t} & e^{3t}-e^{-t} \\ e^{3t}-e^{-t} & e^{3t}+e^{-t} \end{bmatrix}
\begin{bmatrix} 3 \\ 2 \end{bmatrix}
= \frac{1}{2}\begin{bmatrix} 5e^{3t}+e^{-t} \\ 5e^{3t}-e^{-t} \end{bmatrix},
\]
in a similar way we get
\[
e^{-A\tau}\, b = \frac{1}{2}\begin{bmatrix} e^{-3\tau}+e^{\tau} & e^{-3\tau}-e^{\tau} \\ e^{-3\tau}-e^{\tau} & e^{-3\tau}+e^{\tau} \end{bmatrix}
\begin{bmatrix} 1 \\ 2 \end{bmatrix}
= \frac{1}{2}\begin{bmatrix} 3e^{-3\tau}-e^{\tau} \\ 3e^{-3\tau}+e^{\tau} \end{bmatrix}.
\]
Integrating the last expression above, we get
\[
\int_0^t e^{-A\tau}\, b\, d\tau
= \frac{1}{2}\begin{bmatrix} -e^{-3t}-e^{t} \\ -e^{-3t}+e^{t} \end{bmatrix}
+ \begin{bmatrix} 1 \\ 0 \end{bmatrix}.
\]
Therefore, we get
\[
x(t) = \frac{1}{2}\begin{bmatrix} 5e^{3t}+e^{-t} \\ 5e^{3t}-e^{-t} \end{bmatrix}
+ \frac{1}{2}\begin{bmatrix} e^{3t}+e^{-t} & e^{3t}-e^{-t} \\ e^{3t}-e^{-t} & e^{3t}+e^{-t} \end{bmatrix}
\Big[\frac{1}{2}\begin{bmatrix} -e^{-3t}-e^{t} \\ -e^{-3t}+e^{t} \end{bmatrix}
+ \begin{bmatrix} 1 \\ 0 \end{bmatrix}\Big].
\]
Multiplying out the matrix-vector product in the second term on the right-hand side above,
\[
x(t) = \frac{1}{2}\begin{bmatrix} 5e^{3t}+e^{-t} \\ 5e^{3t}-e^{-t} \end{bmatrix}
+ \begin{bmatrix} -1 \\ 0 \end{bmatrix}
+ \frac{1}{2}\begin{bmatrix} e^{3t}+e^{-t} \\ e^{3t}-e^{-t} \end{bmatrix}.
\]
We conclude that the solution to the initial value problem above is
\[
x(t) = \begin{bmatrix} 3e^{3t} + e^{-t} - 1 \\ 3e^{3t} - e^{-t} \end{bmatrix}.
\]
C
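Remark: The formula in Eq. (5.2.6) is straightforward to evaluate numerically; a minimal Python sketch of this check, assuming a SciPy version that provides expm and quad_vec, with the data of the example above:

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import quad_vec

A = np.array([[1.0, 2.0], [2.0, 1.0]])
b = np.array([1.0, 2.0])
x0 = np.array([3.0, 2.0])

def x(t):
    # Eq. (5.2.6) with t0 = 0: x(t) = e^{At} (x0 + int_0^t e^{-A tau} b dtau).
    integral, _ = quad_vec(lambda tau: expm(-A * tau) @ b, 0.0, t)
    return expm(A * t) @ (x0 + integral)

t = 0.5
closed = np.array([3*np.exp(3*t) + np.exp(-t) - 1.0,
                   3*np.exp(3*t) - np.exp(-t)])
print(np.allclose(x(t), closed))  # True
```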
Remark: The formula in Eq. (5.2.6) is also called the variation of parameters formula. The reason is that Eq. (5.2.6) can be seen as
\[
x(t) = x_h(t) + x_p(t),
\]
where x_h(t) = e^{A(t−t₀)} x₀ is the solution of the homogeneous equation x' = A x, and x_p is a particular solution of the nonhomogeneous equation. One can generalize the variation of parameters method to get x_p as follows,
\[
x_p(t) = X(t)\, u(t),
\]
where X(t) is a fundamental matrix of the homogeneous system, and u is a vector of functions to be determined. If one introduces this x_p in the nonhomogeneous equation, one gets
\[
X' u + X u' = A X u + b.
\]
One can prove that the fundamental matrix satisfies the differential equation X' = AX. If we use this equation for X in the equation above, we get
\[
A X u + X u' = A X u + b \quad\Longrightarrow\quad X u' = b,
\]
so we get the equation
\[
u' = X^{-1} b \quad\Longrightarrow\quad u(t) = \int_{t_0}^{t} X(\tau)^{-1}\, b(\tau)\, d\tau.
\]
Therefore, a particular solution found with this method is
\[
x_p(t) = X(t) \int_{t_0}^{t} X(\tau)^{-1}\, b(\tau)\, d\tau.
\]
Now, one can also prove that e^{A(t−t₀)} = X(t)X(t₀)^{-1} for all n × n coefficient matrices, not just diagonalizable ones. If we use that formula we get
\[
x_p(t) = e^{A(t-t_0)} \int_{t_0}^{t} e^{-A(\tau - t_0)}\, b(\tau)\, d\tau.
\]
So we recover the expression in Eq. (5.2.6) for x = x_h + x_p. This is why Eq. (5.2.6) is also called the variation of parameters formula.
5.2.4. Exercises.
5.2.1.- . 5.2.2.- .
Example 5.3.1: Find the general solution of x' = Ax, where A = \(\begin{bmatrix} 1 & 3 \\ 3 & 1 \end{bmatrix}\).

Solution: We have computed in Example 8.3.4 the eigenpairs of the coefficient matrix,
\[
\lambda_{+} = 4, \quad v^{+} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad\text{and}\qquad
\lambda_{-} = -2, \quad v^{-} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}.
\]
This coefficient matrix has distinct real eigenvalues, so the general solution to the differential equation is
\[
x_{gen}(t) = c_{+}\, e^{4t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} + c_{-}\, e^{-2t} \begin{bmatrix} -1 \\ 1 \end{bmatrix}.
\]
C
We now focus on case (ii): the coefficient matrix is real-valued with complex-valued eigenvalues. In this case each eigenvalue is the complex conjugate of the other. A similar result is true for n × n real-valued matrices: when such a matrix has a complex eigenvalue λ, then its conjugate λ̄ is also an eigenvalue. A similar result holds for the respective eigenvectors.
Proof of Theorem 5.3.3: Theorem 8.3.9 implies that the set in (5.3.3) is a linearly independent set. The new information in Theorem 5.3.3 above is the real-valued solutions in Eq. (5.3.4). They are obtained from Eq. (5.3.3) as follows:
\[
x^{(\pm)} = (a \pm i b)\, e^{(\alpha \pm i\beta)t}
= e^{\alpha t} (a \pm i b)\, e^{\pm i\beta t}
= e^{\alpha t} (a \pm i b)\big(\cos(\beta t) \pm i \sin(\beta t)\big),
\]
that is,
\[
x^{(\pm)} = e^{\alpha t}\big(a\cos(\beta t) - b\sin(\beta t)\big) \pm i\, e^{\alpha t}\big(a\sin(\beta t) + b\cos(\beta t)\big).
\]
Since the differential equation x' = Ax is linear, the functions below are also solutions,
\[
x^{(1)} = \frac{1}{2}\big(x^{(+)} + x^{(-)}\big) = \big(a\cos(\beta t) - b\sin(\beta t)\big)\, e^{\alpha t},
\]
\[
x^{(2)} = \frac{1}{2i}\big(x^{(+)} - x^{(-)}\big) = \big(a\sin(\beta t) + b\cos(\beta t)\big)\, e^{\alpha t}.
\]
This establishes the Theorem.
Example 5.3.2: Find a real-valued set of fundamental solutions to the differential equation
\[
x' = Ax, \qquad A = \begin{bmatrix} 2 & 3 \\ -3 & 2 \end{bmatrix}. \tag{5.3.5}
\]
Solution: We first find the eigenvalues of the matrix A, the roots of its characteristic polynomial,
\[
p(\lambda) = \det(A - \lambda I_2) = (2-\lambda)^2 + 9 = 0
\quad\Longrightarrow\quad \lambda_{\pm} = 2 \pm 3i.
\]
Then we find the respective eigenvectors. The one corresponding to λ₊ is the solution of the homogeneous linear system with coefficients given by
\[
A - (2+3i)\, I_2
= \begin{bmatrix} -3i & 3 \\ -3 & -3i \end{bmatrix}
\rightarrow \begin{bmatrix} -i & 1 \\ -1 & -i \end{bmatrix}
\rightarrow \begin{bmatrix} 1 & i \\ 0 & 0 \end{bmatrix}.
\]
Therefore the eigenvector v⁺ = [v₁⁺; v₂⁺] is given by
\[
v_1^{+} = -i\, v_2^{+} \quad\Longrightarrow\quad v_2^{+} = 1, \quad v_1^{+} = -i
\quad\Longrightarrow\quad
v^{+} = \begin{bmatrix} -i \\ 1 \end{bmatrix}, \qquad \lambda_{+} = 2 + 3i.
\]
The second eigenvector is the complex conjugate of the eigenvector found above, that is,
\[
v^{-} = \begin{bmatrix} i \\ 1 \end{bmatrix}, \qquad \lambda_{-} = 2 - 3i.
\]
Notice that
\[
v^{(\pm)} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \pm i \begin{bmatrix} -1 \\ 0 \end{bmatrix}.
\]
Then, the real and imaginary parts of the eigenvalues and of the eigenvectors are given by
\[
\alpha = 2, \qquad \beta = 3, \qquad
a = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad
b = \begin{bmatrix} -1 \\ 0 \end{bmatrix}.
\]
So a real-valued expression for a fundamental set of solutions is given by
\[
x^{(1)} = \Big(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\cos(3t) - \begin{bmatrix} -1 \\ 0 \end{bmatrix}\sin(3t)\Big) e^{2t}
\quad\Longrightarrow\quad
x^{(1)} = \begin{bmatrix} \sin(3t) \\ \cos(3t) \end{bmatrix} e^{2t},
\]
\[
x^{(2)} = \Big(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\sin(3t) + \begin{bmatrix} -1 \\ 0 \end{bmatrix}\cos(3t)\Big) e^{2t}
\quad\Longrightarrow\quad
x^{(2)} = \begin{bmatrix} -\cos(3t) \\ \sin(3t) \end{bmatrix} e^{2t}.
\]
C
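Remark: The construction of real solutions from a complex eigenpair can be checked numerically; a minimal Python sketch, assuming NumPy (whose eig routine returns complex eigenpairs for real matrices):

```python
import numpy as np

A = np.array([[2.0, 3.0], [-3.0, 2.0]])
lam, V = np.linalg.eig(A)          # complex eigenpairs of the real matrix A

k = np.argmax(lam.imag)            # pick lambda = alpha + i beta with beta > 0
alpha, beta = lam[k].real, lam[k].imag
a, b = V[:, k].real, V[:, k].imag  # eigenvector v = a + i b

def x1(t):
    # Real solution (a cos(beta t) - b sin(beta t)) e^{alpha t}.
    return (a*np.cos(beta*t) - b*np.sin(beta*t)) * np.exp(alpha*t)

t, h = 0.3, 1e-6                   # finite-difference check of x1' = A x1
deriv = (x1(t + h) - x1(t - h)) / (2*h)
print(np.allclose(deriv, A @ x1(t), atol=1e-4))  # True
```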
We end with case (iii). There are not many possibilities left for a 2 × 2 real matrix that both is diagonalizable and has a repeated eigenvalue: such a matrix must be proportional to the identity matrix.

Theorem 5.3.4. Every 2 × 2 diagonalizable matrix with repeated eigenvalue λ₀ has the form A = λ₀ I.

Proof of Theorem 5.3.4: Since the matrix A is diagonalizable, there exists an invertible matrix P such that A = PDP^{-1}. Since A is 2 × 2 with a repeated eigenvalue λ₀, then
\[
D = \begin{bmatrix} \lambda_0 & 0 \\ 0 & \lambda_0 \end{bmatrix} = \lambda_0\, I_2.
\]
Putting these two facts together,
\[
A = P\, \lambda_0 I\, P^{-1} = \lambda_0\, P P^{-1} = \lambda_0\, I.
\]
This establishes the Theorem.
Remark: The general solution x_gen for x' = λ₀ I x is simple to write. Since any nonzero 2-vector is an eigenvector of λ₀ I₂, we choose the linearly independent set
\[
\Big\{\, v^{(1)} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad
v^{(2)} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \,\Big\}.
\]
Using these eigenvectors we can write the general solution,
\[
x_{gen}(t) = c_1\, e^{\lambda_0 t}\, v^{(1)} + c_2\, e^{\lambda_0 t}\, v^{(2)}
= c_1\, e^{\lambda_0 t} \begin{bmatrix} 1 \\ 0 \end{bmatrix}
+ c_2\, e^{\lambda_0 t} \begin{bmatrix} 0 \\ 1 \end{bmatrix}
\quad\Longrightarrow\quad
x_{gen}(t) = e^{\lambda_0 t} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}.
\]
Theorem 5.3.5 (Repeated Eigenvalue). If a 2 × 2 matrix A has a repeated eigenvalue λ with only one associated eigendirection, given by an eigenvector v, then the differential system x' = A x has a linearly independent set of solutions
\[
\big\{\, x^{(1)}(t) = e^{\lambda t}\, v, \quad x^{(2)}(t) = e^{\lambda t}\, (t\, v + w) \,\big\},
\]
where the vector w is one of infinitely many solutions of the algebraic linear system
\[
(A - \lambda I)\, w = v. \tag{5.3.6}
\]

Remark: The eigenvalue λ is the precise number that makes the matrix (A − λI) not invertible, that is, det(A − λI) = 0. This implies that an algebraic linear system with coefficient matrix (A − λI) is not consistent for every source. Nevertheless, the Theorem above says that Eq. (5.3.6) has solutions. The fact that the source vector in that equation is v, an eigenvector of A, is crucial to show that this system is consistent.
Proof of Theorem 5.3.5: One solution to the differential system is x^{(1)}(t) = e^{λt} v. Inspired by the reduction of order method, we look for a second solution of the form
\[
x^{(2)}(t) = e^{\lambda t}\, u(t).
\]
Inserting this function into the differential equation x' = A x we get
\[
u' + \lambda\, u = A\, u \quad\Longrightarrow\quad (A - \lambda I)\, u = u'.
\]
We now introduce a power series expansion of the vector-valued function u,
\[
u(t) = u_0 + u_1 t + u_2 t^2 + \cdots,
\]
into the differential equation above,
\[
(A - \lambda I)\big(u_0 + u_1 t + u_2 t^2 + \cdots\big) = \big(u_1 + 2 u_2 t + \cdots\big).
\]
If we evaluate the equation above at t = 0, and then its derivative at t = 0, and so on, we get the following infinite set of linear algebraic equations,
\[
(A - \lambda I)\, u_0 = u_1, \qquad
(A - \lambda I)\, u_1 = 2 u_2, \qquad
(A - \lambda I)\, u_2 = 3 u_3, \qquad \dots
\]
Here is where we use the Cayley-Hamilton Theorem. Recall that the characteristic polynomial p(λ̃) = det(A − λ̃I) has the form
\[
p(\tilde\lambda) = \tilde\lambda^2 - \mathrm{tr}(A)\, \tilde\lambda + \det(A).
\]
The Cayley-Hamilton Theorem says that the matrix-valued polynomial p(A) = 0, that is,
\[
A^2 - \mathrm{tr}(A)\, A + \det(A)\, I = 0.
\]
Since in the case we are interested in the matrix A has a repeated root λ, the characteristic polynomial is
\[
p(\tilde\lambda) = (\tilde\lambda - \lambda)^2 = \tilde\lambda^2 - 2\lambda\, \tilde\lambda + \lambda^2.
\]
Therefore, the Cayley-Hamilton Theorem for the matrix in this Theorem has the form
\[
0 = A^2 - 2\lambda\, A + \lambda^2\, I \quad\Longrightarrow\quad (A - \lambda I)^2 = 0.
\]
This last equation is the one we need to solve the system for the vector-valued function u. Multiply the first equation in the system by (A − λI) and use that (A − λI)² = 0; then we get
\[
0 = (A - \lambda I)^2 u_0 = (A - \lambda I)\, u_1 \quad\Longrightarrow\quad (A - \lambda I)\, u_1 = 0.
\]
This implies that u₁ is an eigenvector of A with eigenvalue λ. We can denote it as u₁ = v. Using this information in the rest of the system we get
\[
(A - \lambda I)\, u_0 = v, \qquad
(A - \lambda I)\, v = 2 u_2 \;\Rightarrow\; u_2 = 0, \qquad
(A - \lambda I)\, u_2 = 3 u_3 \;\Rightarrow\; u_3 = 0, \qquad \dots
\]
We conclude that all terms u₂ = u₃ = ⋯ = 0. Denoting u₀ = w, we obtain the following system of algebraic equations,
\[
(A - \lambda I)\, w = v, \qquad (A - \lambda I)\, v = 0.
\]
For vectors v and w solutions of the system above we get u(t) = w + t v. This means that the second solution to the differential equation is
\[
x^{(2)}(t) = e^{\lambda t}\, (t\, v + w).
\]
This establishes the Theorem.
Example 5.3.3: Find the fundamental solutions of the differential equation
\[
x' = Ax, \qquad A = \frac{1}{4}\begin{bmatrix} -6 & 4 \\ -1 & -2 \end{bmatrix}.
\]
Solution: As usual, we start by finding the eigenvalues and eigenvectors of the matrix A. The former are the solutions of the characteristic equation
\[
0 = \begin{vmatrix} -\tfrac{3}{2}-\lambda & 1 \\ -\tfrac{1}{4} & -\tfrac{1}{2}-\lambda \end{vmatrix}
= \lambda^2 + \tfrac{3}{2}\lambda + \tfrac{1}{2}\lambda + \tfrac{3}{4} + \tfrac{1}{4}
= \lambda^2 + 2\lambda + 1 = (\lambda + 1)^2.
\]
Therefore, the solution is the repeated eigenvalue λ = −1. The associated eigenvectors are the vectors v solutions of the linear system (A + I)v = 0,
\[
\begin{bmatrix} -\tfrac{1}{2} & 1 \\ -\tfrac{1}{4} & \tfrac{1}{2} \end{bmatrix}
\rightarrow \begin{bmatrix} 1 & -2 \\ 0 & 0 \end{bmatrix}
\quad\Longrightarrow\quad v_1 = 2\, v_2.
\]
Choosing v₂ = 1, then v₁ = 2, and we obtain
\[
\lambda = -1, \qquad v = \begin{bmatrix} 2 \\ 1 \end{bmatrix}.
\]
Any other eigenvector associated to λ = −1 is proportional to the eigenvector above, so the matrix A is not diagonalizable. We therefore follow Theorem 5.3.5 and solve for a vector w the linear system (A + I)w = v. The augmented matrix for this system is given by
\[
\left[\begin{array}{cc|c} -\tfrac{1}{2} & 1 & 2 \\ -\tfrac{1}{4} & \tfrac{1}{2} & 1 \end{array}\right]
\rightarrow \left[\begin{array}{cc|c} 1 & -2 & -4 \\ 0 & 0 & 0 \end{array}\right]
\quad\Longrightarrow\quad w_1 = 2\, w_2 - 4.
\]
Choosing w₂ = 0 we get w = [−4; 0] (as a column vector), so by Theorem 5.3.5 a fundamental set of solutions is
\[
x^{(1)}(t) = e^{-t}\begin{bmatrix} 2 \\ 1 \end{bmatrix}, \qquad
x^{(2)}(t) = e^{-t}\Big(t\begin{bmatrix} 2 \\ 1 \end{bmatrix} + \begin{bmatrix} -4 \\ 0 \end{bmatrix}\Big).
\]
C
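Remark: The generalized eigenvector w can also be computed numerically; a minimal Python sketch, assuming NumPy, which then checks that x^{(2)} solves the system:

```python
import numpy as np

A = np.array([[-6.0, 4.0], [-1.0, -2.0]]) / 4.0
lam = -1.0                          # repeated eigenvalue of A
v = np.array([2.0, 1.0])            # eigenvector: (A - lam*I) v = 0

# (A - lam*I) is singular, so solve (A - lam*I) w = v by least squares;
# any particular solution works, since solutions differ by multiples of v.
w, *_ = np.linalg.lstsq(A - lam*np.eye(2), v, rcond=None)

def x2(t):
    # Second fundamental solution e^{lam t} (t v + w) from Theorem 5.3.5.
    return np.exp(lam*t) * (t*v + w)

t, h = 0.4, 1e-6                    # finite-difference check of x2' = A x2
deriv = (x2(t + h) - x2(t - h)) / (2*h)
print(np.allclose(deriv, A @ x2(t), atol=1e-5))  # True
```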
5.3.3. Exercises.
5.3.1.- . 5.3.2.- .
5.4.1. Real Distinct Eigenvalues. We study the system in (5.4.1) in the case that the matrix A has two real eigenvalues λ₊ ≠ λ₋. The case where one eigenvalue vanishes is left as one of the exercises at the end of the section. We study the case where both eigenvalues are nonzero. Two nonzero eigenvalues belong to one of the following cases:
(i) λ₊ > λ₋ > 0, both eigenvalues positive;
(ii) λ₊ > 0 > λ₋, one eigenvalue negative and the other positive;
(iii) 0 > λ₊ > λ₋, both eigenvalues negative.
In a phase portrait the solution vector x(t) at time t is displayed on the x₁x₂-plane. The whole vector is not shown; only the end point of the vector is shown, for t ∈ (−∞, ∞). The result is a curve in the x₁x₂-plane. One usually adds arrows to determine the direction of increasing t. A phase portrait contains several curves, each one corresponding to a solution given in Eq. (5.4.2) for a particular choice of the constants c₊ and c₋. A phase diagram can be sketched by following these few steps:
(a) Plot the eigenvectors v⁺ and v⁻ corresponding to the eigenvalues λ₊ and λ₋.
(b) Draw the whole lines parallel to these vectors and passing through the origin. These straight lines correspond to solutions with either c₊ or c₋ zero.
(c) Draw arrows on these lines to indicate how the solution changes as the variable t increases. If t is interpreted as time, the arrows indicate how the solution changes into the future. The arrows point towards the origin if the corresponding eigenvalue λ is negative, and they point away from the origin if the eigenvalue is positive.
(d) Find the non-straight curves that correspond to solutions with both coefficients c₊ and c₋ nonzero. Again, arrows on these curves indicate how the solution moves into the future.
Case λ₊ > λ₋ > 0.

Example 5.4.1: Sketch the phase diagram of the solutions to the differential equation
\[
x' = Ax, \qquad A = \frac{1}{4}\begin{bmatrix} 11 & 3 \\ 1 & 9 \end{bmatrix}. \tag{5.4.3}
\]
Solution: The curved lines on each quadrant correspond to the following four solutions:
\[
c_{+} = 1,\; c_{-} = 1; \qquad
c_{+} = 1,\; c_{-} = -1; \qquad
c_{+} = -1,\; c_{-} = 1; \qquad
c_{+} = -1,\; c_{-} = -1.
\]
Figure 28. Eight solutions to Eq. (5.4.3), where λ₊ > λ₋ > 0. The trivial solution x = 0 is called an unstable point.

Figure 29. Several solutions to Eq. (5.4.4), where λ₊ > 0 > λ₋. The trivial solution x = 0 is called a saddle point.

Figure 30. Several solutions to Eq. (5.4.5), where 0 > λ₊ > λ₋. The trivial solution x = 0 is called a stable point.
Solution: We have found in Example 5.3.2 that the eigenvalues and eigenvectors of the coefficient matrix are
\[
\lambda_{\pm} = 2 \pm 3i, \qquad v^{(\pm)} = \begin{bmatrix} \mp i \\ 1 \end{bmatrix}.
\]
Writing them in real and imaginary parts, λ± = α ± iβ and v^{(±)} = a ± ib, we get
\[
\alpha = 2, \qquad \beta = 3, \qquad
a = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad
b = \begin{bmatrix} -1 \\ 0 \end{bmatrix}.
\]
These eigenvalues and eigenvectors imply the following real-valued fundamental solutions,
\[
\Big\{\, x^{(1)}(t) = \begin{bmatrix} \sin(3t) \\ \cos(3t) \end{bmatrix} e^{2t}, \quad
x^{(2)}(t) = \begin{bmatrix} -\cos(3t) \\ \sin(3t) \end{bmatrix} e^{2t} \,\Big\}. \tag{5.4.7}
\]
The phase diagram of these two fundamental solutions is given in Fig. 31 below. There is also a circle given in that diagram, corresponding to the trajectory of the vectors
\[
\tilde{x}^{(1)}(t) = \begin{bmatrix} \sin(3t) \\ \cos(3t) \end{bmatrix}, \qquad
\tilde{x}^{(2)}(t) = \begin{bmatrix} -\cos(3t) \\ \sin(3t) \end{bmatrix}.
\]
The phase portrait of these functions is a circle, since they are unit vector-valued functions; they have length one. C
Figure 31. The graph of the fundamental solutions x^{(1)} and x^{(2)} in Eq. (5.4.7).

We now consider a general system with complex eigenpairs, λ± = α ± iβ and v^{(±)} = a ± ib.
We now sketch phase portraits of these solutions for a few choices of α, a and b. We start by fixing the vectors a, b and plotting phase diagrams for solutions having α > 0, α = 0, and α < 0. The result can be seen in Fig. 32. For α > 0 the solutions spiral outward as t increases, and for α < 0 the solutions spiral inward to the origin as t increases. The rotation direction is from vector b towards vector a. The critical point x = 0 is called unstable for α > 0 and stable for α < 0.
We now change the direction of vector b, and we repeat the three phase portraits given above, for α > 0, α = 0, and α < 0. The result is given in Fig. 33. Comparing Figs. 32 and 33 shows that the relative directions of the vectors a and b determine the rotation direction of the solutions as t increases.
Figure 32. Phase portraits of solutions with complex eigenvalues, for α > 0, α = 0, and α < 0.

Figure 33. The same three phase portraits, for α > 0, α = 0, and α < 0, with the direction of the vector b reversed.
5.4.3. Repeated Eigenvalues. A matrix with repeated eigenvalues may or may not be diagonalizable. If a 2 × 2 matrix A is diagonalizable with repeated eigenvalues, then by Theorem 5.3.4 this matrix is proportional to the identity matrix, A = λ₀ I, with λ₀ the repeated eigenvalue. We saw in Section 5.3 that the general solution of a differential system with such a coefficient matrix is
\[
x_{gen}(t) = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} e^{\lambda_0 t}.
\]
Phase portraits of these solutions are just straight lines, starting from the origin for λ₀ > 0, or ending at the origin for λ₀ < 0.
Non-diagonalizable 2 × 2 differential systems are more interesting. If x' = A x is such a system, it has the fundamental solutions
\[
x^{(1)}(t) = e^{\lambda_0 t}\, v, \qquad x^{(2)}(t) = e^{\lambda_0 t}\, (t\, v + w), \tag{5.4.8}
\]
where λ₀ is the repeated eigenvalue of A with eigenvector v, and the vector w is any solution of the linear algebraic system
\[
(A - \lambda_0 I)\, w = v.
\]
The phase portrait of these fundamental solutions is given in Fig. 34. To construct this figure, start by drawing the vectors v and w. The solution x^{(1)} is simpler to draw than x^{(2)}, since the former is a straight semi-line starting at the origin and parallel to v.
Figure 34. Functions x^{(1)}, x^{(2)} in Eq. (5.4.8) for the cases λ₀ > 0 and λ₀ < 0.
The solution x^{(2)} is more difficult to draw. One way is to first draw the trajectory of the time-dependent vector
\[
\tilde{x}^{(2)} = v\, t + w.
\]
This is a straight line parallel to v and passing through w: one of the black dashed lines in Fig. 34, the one passing through w. The solution x^{(2)} differs from x̃^{(2)} by the multiplicative factor e^{λ₀t}. Consider the case λ₀ > 0. For t > 0 we have |x^{(2)}(t)| > |x̃^{(2)}(t)|, and the opposite happens for t < 0. In the limit t → −∞ the solution values x^{(2)}(t) approach the origin, since the exponential factor e^{λ₀t} decreases faster than the linear factor t increases. The result is the purple line in the first picture of Fig. 34. The other picture, for λ₀ < 0, can be constructed following similar ideas.
5.4.4. Exercises.
5.4.1.- . 5.4.2.- .
By the end of the seventeenth century Newton had invented differential equations, discovered his laws of motion and the law of universal gravitation, and combined all of them to explain Kepler's laws of planetary motion. Newton solved what is now called the two-body problem. Kepler's laws correspond to the case of one planet orbiting the Sun. People then started to study the three-body problem, for example the movement of Earth, Moon, and Sun. This problem turned out to be far more difficult than the two-body problem, and no explicit solution was ever found. Around the end of the nineteenth century Henri Poincaré proved a breakthrough result: the solutions of the three-body problem could not be found explicitly in terms of elementary functions, such as combinations of polynomials, trigonometric functions, exponentials, and logarithms. This led him to invent the so-called Qualitative Theory of Differential Equations. In this theory one studies the geometric properties of solutions, whether they show periodic behavior, tend to fixed points, tend to infinity, etc. This approach evolved into the modern field of Dynamics. In this chapter we introduce a few basic concepts and we use them to find qualitative information about a particular type of differential equations, called autonomous equations.
6.1.1. Autonomous Equations. Let us study, one more time, first order nonlinear differential equations. In §1.3 we learned how to solve separable equations: we integrated on both sides of the equation, and we got an implicit expression for the solution in terms of the antiderivative of the equation coefficients. In this section we concentrate on a particular type of separable equations, called autonomous, where the independent variable does not appear explicitly in the equation. For these systems we find a few qualitative properties of their solutions without actually computing the solution. We find these properties of the solutions by studying the equation itself.

Definition 6.1.1. A first order autonomous differential equation is
\[
y' = f(y), \tag{6.1.1}
\]
where y' = dy/dt, and the function f does not depend explicitly on t.

Remarks: The equation in (6.1.1) is a particular case of a separable equation where the independent variable t does not appear in the equation. This is the case, since Eq. (6.1.1) has the form
\[
h(y)\, y' = g(t),
\]
as in Def. 1.3.1, with h(y) = 1/f(y) and g(t) = 1.
The autonomous equations we study in this section are a particular type of the separable equations we studied in §1.3, as we can see in the following examples.

Example 6.1.1: The following first order separable equations are autonomous:
(a) y' = 2y + 3.
(b) y' = sin(y).
(c) y' = r y (1 − y/K).
The independent variable t does not appear explicitly in these equations. The following equations are not autonomous:
(a) y' = 2y + 3t.
(b) y' = t² sin(y).
(c) y' = t y (1 − y/K). C
Remark: Since the autonomous equation in (6.1.1) is a particular case of the separable equations from §1.3, the Picard-Lindelöf Theorem applies to autonomous equations. Therefore, the initial value problem y' = f(y), y(0) = y₀, with f continuous, always has a unique solution in a neighborhood of t = 0 for every value of the initial data y₀.
Solution: This is a linear, constant coefficients equation, so it could be solved using the integrating factor method. But this is also a separable equation, so we solve it as follows,
\[
\int \frac{dy}{a\,y + b} = \int dt
\quad\Longrightarrow\quad
\frac{1}{a}\,\ln(a\,y + b) = t + c_0,
\]
so we get
\[
a\,y + b = e^{at}\, e^{a c_0},
\]
and denoting c = e^{a c₀}/a, we get the expression
\[
y(t) = c\, e^{at} - \frac{b}{a}. \tag{6.1.2}
\]
This is the expression for the solution we got in Theorem 1.1.2. C

Figure 35. A few solutions to Eq. (6.1.2) for different choices of c (c > 0, c = 0, c < 0).
It is not so easy to see certain properties of the solution from the exact expression in (6.1.3). For example, what is the behavior of the solution values y(t) as t → ∞ for an arbitrary initial condition y₀? To be able to answer questions like this one, we introduce a new approach, a geometric approach.
6.1.2. Geometrical Characterization of Stability. The idea is to obtain qualitative
information about solutions to an autonomous equation using the equation itself, without
solving it. We now use the equation in Example 6.1.3 to show how this can be done.
Example 6.1.4: Sketch a qualitative graph of solutions to y' = sin(y), for different initial data conditions y(0).

Solution: The differential equation has the form y' = f(y), where f(y) = sin(y). The first step in the graphical approach is to graph the function f.

Figure 36. The graph of f(y) = sin(y).

The second step is to identify all the zeros of the function f. In this case,
\[
f(y) = \sin(y) = 0 \quad\Longrightarrow\quad y_n = n\pi,
\quad\text{where}\quad n = \dots, -2, -1, 0, 1, 2, \dots
\]
It is important to realize that these constants yₙ are solutions of the differential equation. On the one hand, they are constants, t-independent, so yₙ' = 0. On the other hand, these constants yₙ are zeros of f, hence f(yₙ) = 0. So the yₙ are solutions of the differential equation,
\[
0 = y_n' = f(y_n) = 0.
\]
The constants yₙ are called critical points, or fixed points. When the emphasis is on the fact that these constants define constant functions that are solutions of the differential equation, then they are called stationary solutions, or equilibrium solutions.

Figure 37. Critical points and increase/decrease information added to Fig. 36.
The third step is to identify the regions on the line where f is positive, and where f is negative. These regions are bounded by the critical points. A solution y of y' = f(y) is increasing where f(y) > 0, and it is decreasing where f(y) < 0. We indicate this behavior of the solution by drawing arrows on the horizontal axis: in an interval where f > 0 we write a right arrow, and in the intervals where f < 0 we write a left arrow, as shown in Fig. 37.
There are two types of critical points in Fig. 37. The points y₋₁ = −π, y₁ = π have arrows on both sides pointing to them. They are called attractors, or stable points, and are pictured with solid blue dots. The points y₋₂ = −2π, y₀ = 0, y₂ = 2π have arrows on both sides pointing away from them. They are called repellers, or unstable points, and are pictured with white dots.
The fourth step is to find the regions where the curvature of a solution is concave up or concave down. That information is given by
\[
y'' = (y')' = \big(f(y)\big)' = f'(y)\, y' = f'(y)\, f(y).
\]
So, in the regions where f(y) f'(y) > 0 a solution is concave up (CU), and in the regions where f(y) f'(y) < 0 a solution is concave down (CD). See Fig. 38.

Figure 38. Concavity information (CU/CD regions) added to Fig. 37.
This is all we need to sketch a qualitative graph of solutions to the differential equation. The last step is to collect all this information on a ty-plane. The horizontal axis above is now the vertical axis, and we now plot solutions y of the differential equation. See Fig. 39.

Figure 39. Qualitative graphs of solutions y to y' = sin(y) for several initial conditions.

Fig. 39 contains the graph of several solutions y for different choices of initial data y(0). Stationary solutions are in blue, t-dependent solutions in green. The stationary solutions are separated into two types. The stable solutions y₋₁ = −π, y₁ = π are pictured with solid blue lines. The unstable solutions y₋₂ = −2π, y₀ = 0, y₂ = 2π are pictured with dashed blue lines. C
Remark: A qualitative graph of the solutions does not provide all the possible information about the solution. For example, we know from the graph above that for some initial conditions the corresponding solutions have inflection points at some t > 0. But we cannot know the exact value of t where the inflection point occurs. Such information could be useful to have, since |y'| has its maximum value at those points.

In Example 6.1.4 above we have used that the second derivative of the solution function is related to f and f'. This is a result that we state here in its own statement.

Theorem 6.1.2. If y is a solution of the autonomous system y' = f(y), then
\[
y'' = f'(y)\, f(y).
\]

Remark: This result has been used to find out the curvature of the solution y of an autonomous system y' = f(y). The graph of y has positive curvature iff f'(y) f(y) > 0 and negative curvature iff f'(y) f(y) < 0.

Proof: The differential equation relates y'' to f(y) and f'(y) because of the chain rule,
\[
y'' = \frac{d}{dt}\Big(\frac{dy}{dt}\Big) = \frac{d}{dt}\, f(y(t)) = \frac{df}{dy}\,\frac{dy}{dt}
\quad\Longrightarrow\quad
y'' = f'(y)\, f(y).
\]
6.1.3. Critical Points and Linearization. Let us summarize a few definitions we introduced in Example 6.1.3 above.

Definition 6.1.3. A point y_c is a critical point of y' = f(y) iff f(y_c) = 0. A critical point is:
(i) an attractor (or sink), iff solutions flow toward the critical point;
(ii) a repeller (or source), iff solutions flow away from the critical point;
(iii) neutral, iff solutions flow towards the critical point from one side and flow away from it on the other side.

In this section we keep the convention used in Example 6.1.3: filled dots denote attractors, and white dots denote repellers. We will use a half-filled dot for neutral points. We recall that attractors have arrows directed to them on both sides, while repellers have arrows directed away from them on both sides. A neutral point has an arrow pointing towards the critical point on one side and an arrow pointing away from the critical point on the other side. We will usually refer to critical points as stationary solutions when we describe them in the ty-plane, and we reserve the name critical point for when we describe them on the phase line, the y-line.

We also talked about stable and unstable solutions. Here is a precise definition.

Definition 6.1.4. Let y₀ be a constant solution of y' = f(y), and let y be a solution with initial data y(0) = y₁. The solution given by y₀ is stable iff given any ε > 0 there is a δ > 0 such that if the initial data y₁ satisfies |y₁ − y₀| < δ, then the solution values y(t) satisfy |y(t) − y₀| < ε for all t > 0. Furthermore, if lim_{t→∞} y(t) = y₀, then y₀ is asymptotically stable. If y₀ is not stable, we call it unstable.
The geometrical method described in Example 6.1.3 above is useful to get a quick qualitative picture of solutions to an autonomous differential system. But it is always nice to complement geometric methods with analytic methods. For example, one would like an analytic way to determine the stability of a critical point. One would also like a quantitative measure of a solution's decay rate to a stationary solution. A linear stability analysis can provide this type of information.
We start by assuming that the function f has a Taylor expansion at any y₀. That is,
\[
f(y) = f(y_0) + f'(y_0)\,(y - y_0) + o\big((y - y_0)^2\big).
\]
Denote f₀ = f(y₀) and f₀' = f'(y₀), and introduce the variable u = y − y₀. Then we get
\[
f(y) = f_0 + f_0'\, u + o(u^2).
\]
Let us use this Taylor expansion on the right-hand side of the equation y' = f(y); recalling that y' = (y₀ + u)' = u', we get
\[
y' = f(y) \quad\Longleftrightarrow\quad u' = f_0 + f_0'\, u + o(u^2).
\]
If y₀ is a critical point of f, then f₀ = 0, and then
\[
y' = f(y) \quad\Longleftrightarrow\quad u' = f_0'\, u + o(u^2).
\]
From the equations above we see that for y(t) close to a critical point y₀ the right-hand side of the equation y' = f(y) is close to f₀' u. Therefore, one can get information about a solution of a nonlinear equation near a critical point by studying an appropriate linear equation. We give this linear equation a name.

Definition 6.1.5. The linearization of an autonomous system y' = f(y) at a critical point y_c is the linear differential equation for the function u given by
\[
u' = f'(y_c)\, u.
\]

As we see in Def. 6.1.5 and in Example 6.1.5, the linearization of y' = f(y) at a critical point y₀ is quite simple: it is the linear equation u' = a u, where a = f'(y₀). We know all the solutions to this linear equation; we computed them in §1.1.
Theorem 6.1.6 (Stability of Linear Equations). The constant coefficient linear equation u' = a u, with a ≠ 0, has only one critical point, u₀ = 0. And the constant solution defined by this critical point is unstable for a > 0, and it is asymptotically stable for a < 0.

Proof of Theorem 6.1.6: The critical points of the linear equation u' = a u are the solutions of a u = 0. Since a ≠ 0, that means we have only one critical point, u₀ = 0. Since the linear equation is so simple to solve, we can study the stability of the constant solution u₀ = 0 from the formula for all the solutions of the equation,
\[
u(t) = u(0)\, e^{at}.
\]
The graph of all these solutions is sketched in Fig. 40. In the case that u(0) ≠ 0, we see that for a > 0 the solutions diverge to ±∞ as t → ∞, and for a < 0 the solutions approach zero as t → ∞.

Figure 40. The graph of the functions u(t) = u(0) e^{at} for a > 0 (unstable) and a < 0 (stable).
Remark: In Example 6.1.5 above (and later on in Example 6.1.8) we see that the stability of a critical point y_c of a nonlinear differential equation y' = f(y) is the same as the stability of the trivial solution u = 0 of the linearization u' = f'(y_c) u. This is a general result, which we state below.

Theorem 6.1.7 (Stability of Nonlinear Equations). Let y_c be a critical point of the autonomous system y' = f(y).
(a) The critical point y_c is stable iff f'(y_c) < 0.
(b) The critical point y_c is unstable iff f'(y_c) > 0.
Furthermore, if the initial data y(0) ≃ y_c is close enough to the critical point y_c, then the solution with that initial data of the equation y' = f(y) is close to y_c in the sense that
\[
y(t) \simeq y_c + u(t),
\]
where u is the solution to the linearized equation at the critical point y_c,
\[
u' = f'(y_c)\, u, \qquad u(0) = y(0) - y_c.
\]

Remark: The proof of this result can be found in §2.4 of Strogatz's textbook [12].

Remark: The first part of Theorem 6.1.7 highlights the importance of the sign of the coefficient f'(y_c), which determines the stability of the critical point y_c. The furthermore part of the Theorem quantifies how stable a critical point is. The value |f'(y_c)| plays the role of an exponential growth or exponential decay rate. Its reciprocal, 1/|f'(y_c)|, is a characteristic scale: it determines the value of t required for the solution y to vary significantly in a neighborhood of the critical point y_c.
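Remark: The stability criterion in Theorem 6.1.7 is easy to apply by computer; a minimal Python sketch, assuming NumPy, that classifies the critical points yₙ = nπ of y' = sin(y) from Example 6.1.4 by the sign of f'(y_c):

```python
import numpy as np

f = np.sin        # right-hand side of y' = sin(y)
fprime = np.cos   # its derivative

for n in range(-2, 3):
    yc = n * np.pi                  # critical points y_n = n*pi
    slope = fprime(yc)              # sign of f'(y_c) decides stability
    kind = "asymptotically stable" if slope < 0 else "unstable"
    print(f"y_c = {yc:+.4f}: f'(y_c) = {slope:+.1f} -> {kind}")
```

The output reproduces the classification found graphically: y = −π and y = π are stable, while y = −2π, 0, 2π are unstable.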
6.1.4. Population Growth Models. The simplest model for the population growth of an organism is N' = rN, where N(t) is the population at time t and r > 0 is the growth rate. This model predicts exponential population growth N(t) = N₀ e^{rt}, where N₀ = N(0). This model assumes that the organisms have an unlimited food supply, hence the per capita growth N'/N = r is constant.
A more realistic model assumes that the per capita growth decreases linearly with N, starting with a positive value, r, and going down to zero for a critical population N = K > 0. So when we consider the per capita growth N'/N as a function of N, it must be given by the formula N'/N = −(r/K)N + r. This is the logistic model for population growth.

Definition 6.1.8. The logistic equation describes the organism's population function N in time as the solution of the autonomous differential equation
\[
N' = rN\Big(1 - \frac{N}{K}\Big),
\]
where the initial growth rate constant r and the carrying capacity constant K are positive.

Remark: The logistic equation is, of course, a separable equation, so it can be solved using the method from §1.3. We solve it below, so you can compare the qualitative graphs from Example 6.1.7 with the exact solution below.
Example 6.1.6: Find the exact expression for the solution to the logistic equation for population growth,
\[
y' = ry\Big(1 - \frac{y}{K}\Big), \qquad y(0) = y_0, \qquad 0 < y_0 < K.
\]
Solution: Separating variables and using a partial fraction decomposition, as in §1.3, one obtains
\[
y(t) = \frac{K\, y_0}{y_0 + (K - y_0)\, e^{-rt}}.
\]
C

Remark: The expression above provides all solutions to the logistic equation with initial data on the interval (0, K). With some more work one could graph these solutions and get a picture of the solution behaviour. We now use the graphical method discussed above to get a qualitative picture of the solution graphs without solving the differential equation.
Example 6.1.7: Sketch a qualitative graph of solutions for different initial data conditions y(0) = y₀ to the logistic equation below, where r and K are given positive constants,
\[
y' = ry\Big(1 - \frac{y}{K}\Big).
\]
Solution: The logistic differential equation for population growth can be written as y' = f(y), where the function f is the polynomial
\[
f(y) = ry\Big(1 - \frac{y}{K}\Big).
\]
The first step in the graphical approach is to graph the function f. The result is in Fig. 41.

Figure 41. The graph of f; its maximum value rK/4 is attained at y = K/2.

The second step is to identify all critical points of the equation. The critical points are the zeros of the function f. In this case, f(y) = 0 implies
\[
y_0 = 0, \qquad y_1 = K.
\]
The third step is to find out whether the critical points are stable or unstable. Where the function f is positive, a solution will be increasing, and where the function f is negative, a solution will be decreasing. These regions are bounded by the critical points. Now, in an interval where f > 0 write a right arrow, and in the intervals where f < 0 write a left arrow, as shown in Fig. 42.

Figure 42. Critical points added.

This is all the information we need to sketch a qualitative graph of solutions to the differential equation. So, the last step is to put all this information on a yt-plane. The horizontal axis above is now the vertical axis, and we now plot solutions y of the differential equation. The result is given in Fig. 44.

Figure 44. Qualitative graphs of solutions to the logistic equation for different initial data.

The picture above contains the graph of several solutions y for different choices of initial data y(0). Stationary solutions are in blue, t-dependent solutions in green. The stationary solution y₀ = 0 is unstable and pictured with a dashed blue line. The stationary solution y₁ = K is stable and pictured with a solid blue line. C
Example 6.1.8: Find the linearization of the logistic equation y' = ry(1 − y/K) at the critical points y₀ = 0 and y₁ = K. Solve the linear equations for arbitrary initial data.

Solution: If we write the nonlinear system as y' = f(y), then f(y) = ry(1 − y/K). The critical points are y₀ = 0 and y₁ = K. We also need to compute
\[
f'(y) = r - \frac{2r}{K}\, y.
\]
For the critical point y₀ = 0 we get f'(0) = r, so the linearized system is
\[
u_0'(t) = r\, u_0 \quad\Longrightarrow\quad u_0(t) = u_0(0)\, e^{rt}.
\]
For the critical point y₁ = K we get f'(K) = −r, so the linearized system is
\[
u_1'(t) = -r\, u_1 \quad\Longrightarrow\quad u_1(t) = u_1(0)\, e^{-rt}.
\]
From this last expression we can see that for y₀ = 0 the critical solution u₀ = 0 is unstable, while for y₁ = K the critical solution u₁ = 0 is stable. The stability of the trivial solution u₀ = u₁ = 0 of the linearized systems coincides with the stability of the critical points y₀ = 0, y₁ = K for the nonlinear equation. C
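Remark: The stability conclusions can be confirmed by integrating the logistic equation numerically; a minimal Python sketch, assuming SciPy, with hypothetical parameter values r = 1 and K = 10:

```python
import numpy as np
from scipy.integrate import solve_ivp

r, K = 1.0, 10.0
f = lambda t, y: r * y * (1.0 - y / K)   # logistic right-hand side

# y = 0 repels and y = K attracts: solutions from several initial
# conditions all approach the carrying capacity K.
for y0 in (0.1, 5.0, 15.0):
    sol = solve_ivp(f, (0.0, 20.0), [y0], t_eval=[20.0])
    print(f"y(0) = {y0:5.1f} -> y(20) = {sol.y[0, -1]:.4f}")
```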
Notes. This section follows a few parts of Chapter 2 in Steven Strogatz's book on Nonlinear Dynamics and Chaos, [12], and also §2.5 in the Boyce-DiPrima classic textbook [3].
6.1.5. Exercises.
6.1.1.- . 6.1.2.- .
Example 6.2.2 (Predator-Prey): The physical system consists of two biological species where one preys on the other; for example, cats prey on mice, foxes prey on rabbits. If we call x₁ the predator population and x₂ the prey population, then the predator-prey equations, also known as the Lotka-Volterra equations for predator-prey systems, are
\[
x_1' = -a\, x_1 + b\, x_1 x_2,
\]
\[
x_2' = -c\, x_1 x_2 + d\, x_2.
\]
The constants a, b, c, d are all nonnegative. Notice that in the case of absence of predators, x₁ = 0, the prey population grows without bounds, since x₂' = d x₂. In the case of absence of prey, x₂ = 0, the predator population becomes extinct, since x₁' = −a x₁. The term −c x₁x₂ represents the prey death rate due to predation, which is proportional to the number of encounters, x₁x₂, between predators and prey. These encounters have a positive contribution b x₁x₂ to the predator population. C
Example 6.2.3 (Competing Species): The physical system consists of two species that compete for the same food resources; for example, rabbits and sheep, which compete for the grass on a particular piece of land. If x₁ and x₂ are the competing species' populations, then the differential equations, also called the Lotka-Volterra equations for competition, are
\[
x_1' = r_1\, x_1 \Big(1 - \frac{x_1}{K_1} - \alpha\, x_2\Big),
\]
\[
x_2' = r_2\, x_2 \Big(1 - \frac{x_2}{K_2} - \beta\, x_1\Big).
\]
The constants r₁, r₂, α, β are all nonnegative, and K₁, K₂ are positive. Note that in the case of absence of one species, say x₂ = 0, the population of the other species, x₁, is described by a logistic equation. The terms −α x₁x₂ and −β x₁x₂ say that the competition between the two species is proportional to the number of competitive pairs x₁x₂. C
Figure 46. A curve in a phase portrait represents all the end points of the vectors x(t), for t on some interval. The arrows on the curve show the direction of increasing t.
Figure 47. The stability of the solution x₀ = 0. (Boyce-DiPrima, §9.1, [3].)

The trivial solution x₀ = 0 is called a critical point of the linear system x' = Ax. Here is a more detailed classification of this critical point.

Definition 6.2.2. The critical point x₀ = 0 of a 2 × 2 linear system x' = Ax is:
(a) an attractor (or sink), iff both eigenvalues of A have negative real part;
(b) a repeller (or source), iff both eigenvalues of A have positive real part;
(c) a saddle, iff one eigenvalue of A is positive and the other is negative;
(d) a center, iff both eigenvalues of A are purely imaginary;
(e) a higher order critical point, iff at least one eigenvalue of A is zero.
The critical point x₀ = 0 is called hyperbolic iff it belongs to cases (a)-(c), that is, iff the real parts of all eigenvalues of A are nonzero.
We saw in ?? that the behavior of solutions to a linear system x' = Ax, with initial data x(0), depends on what type of critical point x₀ = 0 is. The results presented in that section can be summarized in the following statement.

Theorem 6.2.3 (Stability of Linear Systems). Let x(t) be the solution of a 2 × 2 linear system x' = Ax, with det(A) ≠ 0 and initial condition x(0) = x₁.
(a) The critical point x₀ = 0 is an attractor iff for any initial condition x(0) the corresponding solution x(t) satisfies lim_{t→∞} x(t) = 0.
(b) The critical point x₀ = 0 is a repeller iff for any initial condition x(0) the corresponding solution x(t) satisfies lim_{t→∞} |x(t)| = ∞.
(c) The critical point x₀ = 0 is a center iff for any initial data x(0) the corresponding solution x(t) describes a closed periodic trajectory around 0.
Phase portraits will be very useful to understand solutions to two-dimensional nonlinear differential equations. The main result about solutions to autonomous systems x' = f(x) is the following.

Theorem 6.2.4 (IVP). If the field f is differentiable on some open connected set D ⊂ ℝ², then the initial value problem
\[
x' = f(x), \qquad x(0) = x_0 \in D,
\]
has a unique solution x(t) on some nonempty interval (−t₁, t₁) about t = 0.

Remark: The fixed point argument used in the proof of the Picard-Lindelöf Theorem 1.6.2 can be extended to prove Theorem 6.2.4. This proof will be presented later on.

Remark: That the field f is differentiable on D ⊂ ℝ² means that f is continuous, and all the partial derivatives ∂fᵢ/∂xⱼ, for i, j = 1, 2, are continuous for all x in D.

Theorem 6.2.4 has an important corollary: different trajectories never intersect. If two trajectories did intersect, then there would be two solutions starting from the same point, the crossing point. This would violate the uniqueness part of the theorem. Because trajectories cannot intersect, phase portraits of autonomous systems have a well-groomed appearance.
6.2.3. Critical Points and Linearization. We now extend to two-dimensional systems the concept of linearization we introduced for one-dimensional systems. The hope is that solutions to nonlinear systems close to critical points behave in a similar way to solutions of the linearized system. We will see that this is the case if the linearized system has distinct eigenvalues. We start with the definition of critical points.

Definition 6.2.5. A critical point of a two-dimensional system x' = f(x) is a vector x⁰ where the field f vanishes,
\[
f(x^0) = 0.
\]

Remark: A critical point defines a constant vector function x(t) = x⁰ for all t, a solution of the differential equation, since
\[
x^{0\prime} = 0 = f(x^0).
\]
In components, the field is f = [f₁; f₂], and the critical point x⁰ = [x₁⁰; x₂⁰] is a solution of
\[
f_1(x_1^0, x_2^0) = 0, \qquad f_2(x_1^0, x_2^0) = 0.
\]
When there is more than one critical point we will use the notation xⁱ, with i = 0, 1, 2, ..., to denote the critical points.
Example 6.2.4: Find all the critical points of the two-dimensional (decoupled) system
\[
x_1' = -x_1 + (x_1)^3,
\]
\[
x_2' = -2\, x_2.
\]
Solution: We need to find all constant vectors x = [x₁; x₂] solutions of
\[
-x_1 + (x_1)^3 = 0, \qquad -2\, x_2 = 0.
\]
From the second equation we get x₂ = 0. From the first equation we get
\[
x_1\big((x_1)^2 - 1\big) = 0 \quad\Longrightarrow\quad x_1 = 0, \quad\text{or}\quad x_1 = \pm 1.
\]
Therefore, we got three critical points,
\[
x^0 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \qquad
x^1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad
x^2 = \begin{bmatrix} -1 \\ 0 \end{bmatrix}.
\]
C
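Remark: Critical points can also be found with a computer algebra system; a minimal Python sketch, assuming SymPy is available (the order of the solutions in the output may vary):

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2", real=True)
f1 = -x1 + x1**3
f2 = -2*x2

# All real solutions of f1 = f2 = 0 are the critical points.
print(sp.solve([f1, f2], [x1, x2], dict=True))
# [{x1: -1, x2: 0}, {x1: 0, x2: 0}, {x1: 1, x2: 0}]
```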
We now use this Taylor expansion of the field f in the differential equation x' = f. Recall that x₁ = x₁⁰ + u₁ and x₂ = x₂⁰ + u₂, and that x₁⁰ and x₂⁰ are constants; then
\[
u_1' = f_1^0 + (\partial_1 f_1)\, u_1 + (\partial_2 f_1)\, u_2 + o\big(u_1^2, u_2^2\big),
\]
\[
u_2' = f_2^0 + (\partial_1 f_2)\, u_1 + (\partial_2 f_2)\, u_2 + o\big(u_1^2, u_2^2\big).
\]
Let us write this differential equation using vector notation. If we introduce the vectors and the matrix
\[
u = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}, \qquad
f^0 = \begin{bmatrix} f_1^0 \\ f_2^0 \end{bmatrix}, \qquad
Df^0 = \begin{bmatrix} \partial_1 f_1 & \partial_2 f_1 \\ \partial_1 f_2 & \partial_2 f_2 \end{bmatrix},
\]
then we have that
\[
x' = f(x) \quad\Longleftrightarrow\quad u' = f^0 + (Df^0)\, u + o\big(|u|^2\big).
\]
In the case that x⁰ is a critical point, then f⁰ = 0. In this case we have that
\[
x' = f(x) \quad\Longleftrightarrow\quad u' = (Df^0)\, u + o\big(|u|^2\big).
\]
The relation above says that the equation coefficients of x' = f(x) are close, of order o(|u|²), to the coefficients of the linear differential equation u' = (Df⁰) u. For this reason, we give this linear differential equation a name.

Definition 6.2.6. The linearization of a two-dimensional system x' = f(x) at a critical point x⁰ is the 2 × 2 linear system
\[
u' = (Df^0)\, u,
\]
where u = x − x⁰, and we have introduced the Jacobian matrix at x⁰,
\[
Df^0 = \begin{bmatrix}
\dfrac{\partial f_1}{\partial x_1}\Big|_{x^0} & \dfrac{\partial f_1}{\partial x_2}\Big|_{x^0} \\[2mm]
\dfrac{\partial f_2}{\partial x_1}\Big|_{x^0} & \dfrac{\partial f_2}{\partial x_2}\Big|_{x^0}
\end{bmatrix}
= \begin{bmatrix} \partial_1 f_1 & \partial_2 f_1 \\ \partial_1 f_2 & \partial_2 f_2 \end{bmatrix}.
\]
Example 6.2.5: Find the linearization of the system in Example 6.2.4 at each of its critical points.

Solution: We found earlier that this system has three critical points,
\[
x^0 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \qquad
x^1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad
x^2 = \begin{bmatrix} -1 \\ 0 \end{bmatrix}.
\]
This means we need to compute three linearizations, one for each critical point. We start by computing the Jacobian matrix at an arbitrary point x,
\[
Df(x) = \begin{bmatrix}
\dfrac{\partial}{\partial x_1}\big(-x_1 + x_1^3\big) & \dfrac{\partial}{\partial x_2}\big(-x_1 + x_1^3\big) \\[2mm]
\dfrac{\partial}{\partial x_1}\big(-2 x_2\big) & \dfrac{\partial}{\partial x_2}\big(-2 x_2\big)
\end{bmatrix},
\]
so we get that
\[
Df(x) = \begin{bmatrix} -1 + 3 x_1^2 & 0 \\ 0 & -2 \end{bmatrix}.
\]
We only need to evaluate this matrix Df at the critical points. We start with x⁰,
\[
x^0 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\quad\Longrightarrow\quad
Df^0 = \begin{bmatrix} -1 & 0 \\ 0 & -2 \end{bmatrix}
\quad\Longrightarrow\quad
\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}' = \begin{bmatrix} -1 & 0 \\ 0 & -2 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}.
\]
The Jacobian at x¹ and x² is the same, so we get the same linearization at these points,
\[
x^1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}
\quad\Longrightarrow\quad
Df^1 = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}
\quad\Longrightarrow\quad
\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}' = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix},
\]
\[
x^2 = \begin{bmatrix} -1 \\ 0 \end{bmatrix}
\quad\Longrightarrow\quad
Df^2 = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}
\quad\Longrightarrow\quad
\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}' = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}.
\]
C
282 G. NAGY ODE september 11, 2017
Critical points of nonlinear systems are classified according to the eigenvalues of their
corresponding linearization.
Definition 6.2.7. A critical point x0 of a two-dimensional system x' = f(x) is:
(a) an attractor (or sink), iff both eigenvalues of Df0 have negative real part;
(b) a repeller (or source), iff both eigenvalues of Df0 have positive real part;
(c) a saddle, iff one eigenvalue of Df0 is positive and the other is negative;
(d) a center, iff both eigenvalues of Df0 are purely imaginary;
(e) a higher order critical point, iff at least one eigenvalue of Df0 is zero.
A critical point x0 is called hyperbolic iff it belongs to cases (a)-(c), that is, the real parts of all eigenvalues of Df0 are nonzero.
Example 6.2.6: Classify all the critical points of the nonlinear system
x1' = -x1 + (x1)^3,
x2' = -2 x2.
Solution: We already know that this system has three critical points,
x0 = (0, 0)^T,  x1 = (1, 0)^T,  x2 = (-1, 0)^T.
We have already computed the linearizations at these critical points too,
Df0 = [-1  0; 0  -2],  Df1 = Df2 = [2  0; 0  -2].
We now need to compute the eigenvalues of the Jacobian matrices above. For the critical point x0 we have λ+ = -1, λ- = -2, so x0 is an attractor. For the critical points x1 and x2 we have λ+ = 2, λ- = -2, so x1 and x2 are saddle points. C
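The classification in Def. 6.2.7 reduces to checking the signs of the real parts of the eigenvalues of each Jacobian, which is easy to do numerically. A sketch with NumPy (a tool assumed by this note, not part of the text):

    # Sketch: classify the critical points of Example 6.2.6 by the signs
    # of the real parts of the eigenvalues of their Jacobians.
    import numpy as np

    def classify(J):
        eigs = np.linalg.eigvals(J)
        re = eigs.real
        if np.all(re < 0):  return 'attractor'
        if np.all(re > 0):  return 'repeller'
        if np.any(re > 0) and np.any(re < 0):  return 'saddle'
        if np.all(np.isclose(re, 0)) and np.all(eigs.imag != 0):  return 'center'
        return 'higher order'

    print(classify(np.array([[-1.0, 0.0], [0.0, -2.0]])))   # attractor (x0)
    print(classify(np.array([[ 2.0, 0.0], [0.0, -2.0]])))   # saddle (x1, x2)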
Remark: The Hartman-Grobman theorem says that, for hyperbolic critical points, the phase portrait of the linearization at the critical point is enough to determine the phase portrait of the nonlinear system near that critical point.
Example 6.2.7: Use the Hartman-Grobman theorem to sketch the phase portrait of
x1' = -x1 + (x1)^3,
x2' = -2 x2.
Solution: Since we now know that Fig. 6.2.4 is also the phase portrait of the nonlinear system near its critical points, we only need to fill in the gaps in that phase portrait. In this example, a decoupled system, we can complete the phase portrait from the symmetries of the solutions. Indeed, in the x2 direction all trajectories must decay exponentially to the x2 = 0 line. In the x1 direction, all trajectories are attracted to x1 = 0 and repelled from x1 = ±1. The vertical lines x1 = 0 and x1 = ±1 are invariant, since x1' = 0 on these lines; hence any trajectory that starts on these lines stays on these lines. Similarly, x2 = 0 is an invariant horizontal line. We also note that the phase portrait must be symmetric with respect to both the x1 and x2 axes, since the equations are invariant under the transformations x1 → -x1 and x2 → -x2. Putting all this extra information together we arrive at the phase portrait in Fig. 49.
Figure 49. Phase portrait of the nonlinear system in Example 6.2.7.
C
6.2.5. Competing Species. Suppose we have two species competing for the same food resources. Can we predict what will happen to the species populations over time? Is there an equilibrium situation where both species cohabit together? Or must one of the species become extinct? If this is the case, which one?
We study in this section a particular competing species system, taken from Strogatz [12],
x1' = x1 (3 - x1 - 2 x2),   (6.2.1)
x2' = x2 (2 - x2 - x1),   (6.2.2)
where x1(t) is the population of one of the species, say rabbits, and x2(t) is the population of the other species, say sheep, at the time t. We restrict to nonnegative functions x1, x2.
We start finding all the critical points of the rabbits-sheep system. We need to find all constants x1, x2, solutions of
x1 (3 - x1 - 2 x2) = 0,   (6.2.3)
x2 (2 - x2 - x1) = 0.   (6.2.4)
From Eq. (6.2.3) we get that one solution is x1 = 0. In that case Eq. (6.2.4) says that
x2 (2 - x2) = 0  ⇒  x2 = 0 or x2 = 2.
So we got two critical points, x0 = (0, 0)^T and x1 = (0, 2)^T. We now consider the case that x1 ≠ 0. In this case Eq. (6.2.3) implies
(3 - x1 - 2 x2) = 0  ⇒  x1 = 3 - 2 x2.
Using this equation in Eq. (6.2.4) we get that
x2 (2 - x2 - 3 + 2 x2) = 0  ⇒  x2 (x2 - 1) = 0  ⇒  x2 = 0, hence x1 = 3, or x2 = 1, hence x1 = 1.
So we got two more critical points, x2 = (3, 0)^T and x3 = (1, 1)^T. We now proceed to find the linearization of the rabbits-sheep system in Eqs. (6.2.1)-(6.2.2). We first compute the derivative of the field f, where
f(x) = (f1; f2) = (x1 (3 - x1 - 2 x2); x2 (2 - x2 - x1)).
The derivative of f at an arbitrary point x is
Df(x) = [∂f1/∂x1  ∂f1/∂x2; ∂f2/∂x1  ∂f2/∂x2] = [(3 - 2 x1 - 2 x2)  -2 x1; -x2  (2 - x1 - 2 x2)].
We now evaluate the matrix Df(x) at each of the critical points we found.
At x0 = (0, 0)^T we get Df0 = [3  0; 0  2].
This coefficient matrix has eigenvalues λ0+ = 3 and λ0- = 2, both positive, which means that the critical point x0 is a repeller. To sketch the phase portrait we will need the corresponding eigenvectors, v0+ = (1, 0)^T and v0- = (0, 1)^T.
At x1 = (0, 2)^T we get Df1 = [-1  0; -2  -2].
This coefficient matrix has eigenvalues λ1+ = -1 and λ1- = -2, both negative, which means that the critical point x1 is an attractor. One can check that the corresponding eigenvectors are v1+ = (1, -2)^T and v1- = (0, 1)^T.
At x2 = (3, 0)^T we get Df2 = [-3  -6; 0  -1].
This coefficient matrix has eigenvalues λ2+ = -1 and λ2- = -3, both negative, which means that the critical point x2 is an attractor. One can check that the corresponding eigenvectors are v2+ = (-3, 1)^T and v2- = (1, 0)^T.
At x3 = (1, 1)^T we get Df3 = [-1  -2; -1  -1].
One can check that this coefficient matrix has eigenvalues λ3+ = -1 + √2 and λ3- = -1 - √2, which means that the critical point x3 is a saddle. One can check that the corresponding eigenvectors are v3+ = (-√2, 1)^T and v3- = (√2, 1)^T. We summarize this information about the linearized systems in Fig. 51.
We would like to have the complete phase portrait for the nonlinear system, that is, we would like to fill the gaps in Fig. 51. This is difficult to do analytically in this example, as well as for general nonlinear autonomous systems. At this point we need to turn to computer-generated solutions to fill the gaps in Fig. 51. The result is in Fig. 52.
Figure 52. The phase portrait of the rabbits-sheep system in Eqs. (6.2.1)-(6.2.2).
We can now study the phase portrait in Fig. 52 to obtain some biological insight into the rabbits-sheep system. The picture says that most of the time one species drives the other to extinction. If the initial data for the system is a point in the blue region, called the rabbit basin, then the solution evolves in time toward the critical point x2 = (3, 0)^T. This means that the sheep become extinct. If the initial data for the system is a point in the green region, called the sheep basin, then the solution evolves in time toward the critical point x1 = (0, 2)^T. This means that the rabbits become extinct.
The two basins of attraction are separated by a curve, called the basin boundary. Only when the initial data lies on that curve do the rabbits and sheep coexist, with neither becoming extinct. The solution then moves towards the critical point x3 = (1, 1)^T. Therefore, the populations of rabbits and sheep become equal to each other as time goes to infinity. But if we pick initial data outside this basin boundary, no matter how close to this boundary, one of the species becomes extinct.
6.2.6. Exercises.
6.2.1.- . 6.2.2.- .
We study a simple case of the Sturm-Liouville problem, then we present how to compute the Fourier series expansion of continuous and discontinuous functions. We end this chapter introducing the separation of variables method to find solutions of a partial differential equation, the heat equation.
[Figure: a bar of length ℓ with insulated sides, held at u(t, 0) = 0 and u(t, ℓ) = 0 at its ends, with initial temperature u(0, x) = f(x), satisfying the heat equation ∂t u = k ∂x^2 u.]
Remarks:
(a) The two boundary conditions are held at different points, x1 ≠ x2.
(b) Both y and y 0 may appear in the boundary condition.
Example 7.1.1: We now show five examples of boundary value problems that differ only in the boundary conditions. Solve the differential equation
y'' + a1 y' + a0 y = e^{2t},
with the boundary conditions at x1 = 0 and x2 = 1 given below.
(a) Boundary Condition: y(0) = y1, y(1) = y2, which is the case b1 = 1, b2 = 0 at the first point and b1 = 1, b2 = 0 at the second point.
(b) Boundary Condition: y(0) = y1, y'(1) = y2, which is the case b1 = 1, b2 = 0 at the first point and b1 = 0, b2 = 1 at the second point.
(c) Boundary Condition: y'(0) = y1, y(1) = y2, which is the case b1 = 0, b2 = 1 at the first point and b1 = 1, b2 = 0 at the second point.
(d) Boundary Condition: y'(0) = y1, y'(1) = y2, which is the case b1 = 0, b2 = 1 at the first point and b1 = 0, b2 = 1 at the second point.
(e) BC: 2 y(0) + y'(0) = y1, y(1) + 3 y'(1) = y2, which is the case b1 = 2, b2 = 1 at the first point and b1 = 1, b2 = 3 at the second point.
C
7.1.2. Comparison: IVP and BVP. We now review the initial value problem for the equation above, which was discussed in Sect. 2.1, where we showed in Theorem 2.1.2 that this initial value problem always has a unique solution.
Definition 7.1.2 (IVP). Find all solutions of the differential equation y'' + a1 y' + a0 y = 0 satisfying the initial condition (IC)
y(t0) = y0,  y'(t0) = y1.   (7.1.1)
A typical boundary value problem that appears in many applications is the following.
Definition 7.1.3 (BVP). Find all solutions of the differential equation y'' + a1 y' + a0 y = 0 satisfying the boundary condition (BC)
y(0) = y0,  y(L) = y1,  L ≠ 0.   (7.1.2)
The names initial value problem and boundary value problem come from physics. An example of the former is to solve Newton's equations of motion for the position function of a point particle that starts at a given initial position and velocity. An example of the latter is to find the equilibrium temperature of a cylindrical bar with thermal insulation on the round surface and held at constant temperatures at the top and bottom sides.
Let us recall an important result we saw in Sect. 2.1 about solutions to initial value problems.
Theorem 7.1.4 (IVP). The equation y'' + a1 y' + a0 y = 0 with IC y(t0) = y0 and y'(t0) = y1 has a unique solution y for each choice of the IC.
The solutions to boundary value problems are more complicated to describe. A boundary
value problem may have a unique solution, or may have infinitely many solutions, or may
have no solution, depending on the boundary conditions. In the case of the boundary value
problem in Def. 7.1.3 we get the following.
Theorem 7.1.5 (BVP). The equation y'' + a1 y' + a0 y = 0 with BC y(0) = y0 and y(L) = y1, with L ≠ 0, and with r± the roots of the characteristic polynomial p(r) = r^2 + a1 r + a0, satisfies the following.
(A) If r+ ≠ r- are real, then the BVP above has a unique solution for all y0, y1 ∈ R.
(B) If r± = α ± iβ are complex, with α, β ∈ R, then the solution of the BVP above belongs to one of the following three possibilities:
(i) There exists a unique solution;
(ii) There exist infinitely many solutions;
(iii) There exists no solution.
e^{r- L} - e^{r+ L} = e^{αL} (e^{-iβL} - e^{iβL}) = -2i e^{αL} sin(βL).
We conclude that
e^{r- L} - e^{r+ L} = -2i e^{αL} sin(βL) = 0  ⟺  βL = nπ.
So for βL ≠ nπ the BVP has a unique solution, case (Bi). But for βL = nπ the BVP has either no solution or infinitely many solutions, cases (Bii) and (Biii). This establishes the Theorem.
Example 7.1.2: Find all solutions to the BVPs y'' + y = 0 with the BCs:
(a) y(0) = 1, y(π) = 0;  (b) y(0) = 1, y(π/2) = 1;  (c) y(0) = 1, y(π) = -1.
Solution: We first find the roots of the characteristic polynomial r^2 + 1 = 0, that is, r± = ±i. So the general solution of the differential equation is
y(x) = c1 cos(x) + c2 sin(x).
BC (a):
1 = y(0) = c1  ⇒  c1 = 1,
0 = y(π) = -c1  ⇒  c1 = 0.
Therefore, there is no solution.
BC (b):
1 = y(0) = c1  ⇒  c1 = 1,
1 = y(π/2) = c2  ⇒  c2 = 1.
So there is a unique solution y(x) = cos(x) + sin(x).
BC (c):
1 = y(0) = c1  ⇒  c1 = 1,
-1 = y(π) = -c1  ⇒  c1 = 1.
Therefore, c2 is arbitrary, so we have infinitely many solutions
y(x) = cos(x) + c2 sin(x),  c2 ∈ R.
C
Example 7.1.3: Find all solutions to the BVPs y'' + 4 y = 0 with the BCs:
(a) y(0) = 1, y(π/4) = -1;  (b) y(0) = 1, y(π/2) = -1;  (c) y(0) = 1, y(π/2) = 1.
Solution: We first find the roots of the characteristic polynomial r^2 + 4 = 0, that is, r± = ±2i. So the general solution of the differential equation is
y(x) = c1 cos(2x) + c2 sin(2x).
BC (a):
1 = y(0) = c1  ⇒  c1 = 1,
-1 = y(π/4) = c2  ⇒  c2 = -1.
Therefore, there is a unique solution y(x) = cos(2x) - sin(2x).
BC (b):
1 = y(0) = c1  ⇒  c1 = 1,
-1 = y(π/2) = -c1  ⇒  c1 = 1.
So, c2 is arbitrary and we have infinitely many solutions
y(x) = cos(2x) + c2 sin(2x),  c2 ∈ R.
BC (c):
1 = y(0) = c1  ⇒  c1 = 1,
1 = y(π/2) = -c1  ⇒  c1 = -1.
Therefore, we have no solution. C
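The three alternatives (unique, infinitely many, none) can be checked mechanically by solving the linear system for c1, c2. A sketch with SymPy (a tool assumed by this note):

    # Sketch: impose each pair of BCs of Example 7.1.3 on the general
    # solution y = c1 cos(2x) + c2 sin(2x) and solve for c1, c2.
    import sympy as sp

    x, c1, c2 = sp.symbols('x c1 c2')
    y = c1*sp.cos(2*x) + c2*sp.sin(2*x)

    cases = {'a': [(0, 1), (sp.pi/4, -1)],
             'b': [(0, 1), (sp.pi/2, -1)],
             'c': [(0, 1), (sp.pi/2, 1)]}

    for name, bcs in cases.items():
        eqs = [sp.Eq(y.subs(x, pt), val) for pt, val in bcs]
        print(name, sp.solve(eqs, [c1, c2], dict=True))
    # a: one solution; b: c1 = 1 with c2 free; c: no solution.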
7.1.3. Eigenfunction Problems. We now focus on boundary value problems that have infinitely many solutions. A particular type of these problems is called an eigenfunction problem. They are similar to the eigenvector problems we studied in Sect. 8.3. Recall that the eigenvector problem is the following: given an n × n matrix A, find all numbers λ and nonzero vectors v solutions of the algebraic linear system
A v = λ v.
We saw that for each λ there are infinitely many solutions v, because if v is a solution so is any multiple a v. An eigenfunction problem is something similar.
Definition 7.1.6. An eigenfunction problem is the following: Given a linear operator L(y) = y'' + a1 y' + a0 y, find a number λ and a nonzero function y solutions of
L(y) = -λ y,
with homogeneous boundary conditions
b1 y(x1) + b2 y'(x1) = 0,
b1 y(x2) + b2 y'(x2) = 0.
Remarks:
Notice that y = 0 is always a solution of the BVP above.
Eigenfunctions are the nonzero solutions of the BVP above.
Hence, the eigenfunction problem is a BVP with infinitely many solutions.
So, we look for λ such that the operator L(y) + λ y has a characteristic polynomial with complex roots.
So, λ is such that L(y) + λ y has oscillatory solutions.
Our examples focus on the linear operator L(y) = y''.
Example 7.1.4: Find all numbers λ and nonzero functions y solutions of the BVP
y'' + λ y = 0, with y(0) = 0, y(L) = 0, L > 0.
Solution: We divide the problem into three cases: (a) λ < 0, (b) λ = 0, and (c) λ > 0.
Case (a): λ = -μ^2 < 0, so the equation is y'' - μ^2 y = 0. The characteristic equation is
r^2 - μ^2 = 0  ⇒  r± = ±μ.
The general solution is y = c+ e^{μx} + c- e^{-μx}. The BC imply
0 = y(0) = c+ + c-,  0 = y(L) = c+ e^{μL} + c- e^{-μL}.
From the first equation we get c+ = -c-, so
0 = -c- e^{μL} + c- e^{-μL}  ⇒  c- (e^{-μL} - e^{μL}) = 0  ⇒  c- = 0, c+ = 0.
So the only solution is y = 0; then there are no eigenfunctions with negative eigenvalues.
Case (b): λ = 0, so the differential equation is
y'' = 0  ⇒  y = c0 + c1 x.
The BC imply
0 = y(0) = c0,  0 = y(L) = c1 L  ⇒  c1 = 0.
So the only solution is y = 0; then there are no eigenfunctions with eigenvalue λ = 0.
Case (c): λ = μ^2 > 0, so the equation is y'' + μ^2 y = 0. The characteristic equation is
r^2 + μ^2 = 0  ⇒  r± = ±μ i.
The general solution is y(x) = c1 cos(μx) + c2 sin(μx). The BC y(0) = 0 implies c1 = 0, and then y(L) = 0 with c2 ≠ 0 implies sin(μL) = 0, that is, μn L = nπ. So the eigenvalues and eigenfunctions are
λn = (nπ/L)^2,  yn(x) = sin(nπx/L),  n ≥ 1. C
Example 7.1.5: Find the numbers λ and the nonzero functions y solutions of the BVP
y'' + λ y = 0,  y(0) = 0,  y'(L) = 0,  L > 0.
Solution: We divide the problem into three cases: (a) λ < 0, (b) λ = 0, and (c) λ > 0.
Case (a): Let λ = -μ^2, with μ > 0, so the equation is y'' - μ^2 y = 0. The characteristic equation is
r^2 - μ^2 = 0  ⇒  r± = ±μ.
The general solution is y(x) = c1 e^{μx} + c2 e^{-μx}. The BC imply
0 = y(0) = c1 + c2,
0 = y'(L) = μ c1 e^{μL} - μ c2 e^{-μL},
that is,
[1  1; e^{μL}  -e^{-μL}] (c1; c2) = (0; 0).
The matrix above is invertible, because its determinant is
-(e^{-μL} + e^{μL}) ≠ 0.
So, the linear system above for c1, c2 has a unique solution c1 = c2 = 0. Hence we get the only solution y = 0. This means there are no eigenfunctions with negative eigenvalues.
Case (b): Let λ = 0, so the differential equation is
y'' = 0  ⇒  y(x) = c1 + c2 x,  c1, c2 ∈ R.
The boundary conditions imply the following conditions on c1 and c2,
0 = y(0) = c1,  0 = y'(L) = c2.
So the only solution is y = 0. This means there are no eigenfunctions with eigenvalue λ = 0.
Case (c): Let λ = μ^2, with μ > 0, so the equation is y'' + μ^2 y = 0. The characteristic equation is
r^2 + μ^2 = 0  ⇒  r± = ±μ i.
The general solution is y(x) = c1 cos(μx) + c2 sin(μx). The BC imply
0 = y(0) = c1,
0 = y'(L) = -μ c1 sin(μL) + μ c2 cos(μL)  ⇒  c2 cos(μL) = 0.
Since we are interested in nonzero solutions y, we look for solutions with c2 ≠ 0. This implies that μ cannot be arbitrary but must satisfy the equation
cos(μL) = 0  ⇒  μn L = (2n - 1) π/2,  n ≥ 1.
Choosing c2 = 1, the eigenvalues and eigenfunctions are
λn = ((2n - 1)π/(2L))^2,  yn(x) = sin((2n - 1)πx/(2L)),  n ≥ 1. C
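A quick symbolic check of these eigenpairs (the closing formula above was reconstructed, so this is a sketch under that assumption), using SymPy:

    # Sketch: verify y'' + lambda_n y = 0, y(0) = 0, y'(L) = 0 for the
    # eigenpairs of Example 7.1.5, with L = 1.
    import sympy as sp

    x = sp.symbols('x')
    L = 1
    for n in range(1, 4):
        mu = (2*n - 1)*sp.pi/(2*L)
        y = sp.sin(mu*x)
        print(n,
              sp.simplify(y.diff(x, 2) + mu**2*y),  # 0: solves the ODE
              y.subs(x, 0),                          # 0: BC at x = 0
              sp.simplify(y.diff(x).subs(x, L)))     # 0: BC at x = L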
Example 7.1.6: Find the numbers λ and the nonzero functions y solutions of the BVP
x^2 y'' - x y' = -λ y,  y(1) = 0,  y(ℓ) = 0,  ℓ > 1.
Solution: This is an Euler equidimensional equation, x^2 y'' - x y' + λ y = 0, with indicial polynomial r(r - 1) - r + λ = r^2 - 2r + λ, whose roots are r± = 1 ± √(1 - λ). We divide the problem into three cases according to the sign of 1 - λ.
Case (a): Let 1 - λ = μ^2 > 0, with μ > 0, so r± = 1 ± μ and the general solution is y(x) = c1 x^{1+μ} + c2 x^{1-μ}. The boundary conditions give the linear system
c1 + c2 = 0,  c1 ℓ^{1+μ} + c2 ℓ^{1-μ} = 0.
Since ℓ > 1, the matrix above is invertible, and the linear system for c1, c2 has a unique solution given by c1 = c2 = 0. Hence we get the only solution y = 0. This means there are no eigenfunctions with eigenvalues λ < 1.
Case (b): Let 1 - λ = 0; then r = 1 is a repeated root and y(x) = c1 x + c2 x ln(x). The boundary conditions again imply c1 = c2 = 0, so there are no eigenfunctions with eigenvalue λ = 1.
Case (c): Let 1 - λ < 0, so we can rewrite it as 1 - λ = -μ^2, with μ > 0. Then r± = 1 ± μ i, and so the general solution of the differential equation is
y(x) = x [c1 cos(μ ln(x)) + c2 sin(μ ln(x))].
The boundary conditions imply the following conditions on c1 and c2,
0 = y(1) = c1,
0 = y(ℓ) = c1 ℓ cos(μ ln(ℓ)) + c2 ℓ sin(μ ln(ℓ))  ⇒  c2 ℓ sin(μ ln(ℓ)) = 0.
Since we are interested in nonzero solutions y, we look for solutions with c2 ≠ 0. This implies that μ cannot be arbitrary but must satisfy the equation
sin(μ ln(ℓ)) = 0  ⇒  μn ln(ℓ) = nπ,  n ≥ 1.
Recalling that 1 - λn = -μn^2, we get λn = 1 + μn^2, hence
λn = 1 + n^2 π^2 / ln^2(ℓ),  yn(x) = cn x sin(nπ ln(x) / ln(ℓ)),  n ≥ 1.
Since we only need one eigenfunction for each eigenvalue, we choose cn = 1, and we get
λn = 1 + n^2 π^2 / ln^2(ℓ),  yn(x) = x sin(nπ ln(x) / ln(ℓ)),  n ≥ 1.
C
7.1.4. Exercises.
7.1.1.- . 7.1.2.- .
7.2.1. Fourier Expansion of Vectors. We review the basic concepts about vectors in R3
we will need to generalize to the space of functions. These concepts include: the dot (or
inner) product of two vectors, orthogonal and orthonormal set of vectors, Fourier expansion
(or orthonormal expansion) of vectors, and vector approximations.
Definition 7.2.1. The dot product of two vectors u, v ∈ R^3 is
u · v = |u| |v| cos(θ),
with |u|, |v| the magnitudes of the vectors, and θ ∈ [0, π] the angle between them.
Example 7.2.1: The set of vectors {i, j, k} used in physics is an orthonormal set in R^3.
Solution: These are the vectors
i = (1, 0, 0)^T,  j = (0, 1, 0)^T,  k = (0, 0, 1)^T.
The Fourier expansion theorem says that the set above is not just a set, it is a basis: any vector in R^3 can be decomposed as a linear combination of the basis vectors. Furthermore, there is a simple formula for the vector components.
Theorem 7.2.4. The orthonormal set {i, j, k} is an orthonormal basis, that is, every vector v ∈ R^3 can be decomposed as
v = vx i + vy j + vz k.
The orthonormality of the vector set implies a formula for the vector components,
vx = v · i,  vy = v · j,  vz = v · k.
The vector components are the dot product of the whole vector with each basis vector.
The decomposition above allows us to introduce vector approximations.
7.2.2. Fourier Expansion of Functions. The ideas described above for vectors in R^3 can be extended to functions. We start by introducing a notion of projection, hence of perpendicularity, among functions. Unlike what happens in R^3, we do not have a geometric intuition that can help us find such a product. So we look for any dot product of functions having the positivity property, the symmetry property, and the linearity property. Here is one product with these properties.
Definition 7.2.5. The dot product of two functions f, g on [-L, L] is
f · g = ∫_{-L}^{L} f(x) g(x) dx.
The dot product above takes two functions and produces a number. And one can verify
that the product has the following properties.
Theorem 7.2.6. For all functions f, g, h and all a, b ∈ R the following holds:
(a) Positivity: f · f = 0 iff f = 0; and f · f > 0 for f ≠ 0.
(b) Symmetry: f · g = g · f.
(c) Linearity: (a f + b g) · h = a (f · h) + b (g · h).
Remark: To show that the set above is orthogonal we need to show that the dot product of any two different functions in the set vanishes (the three equations below on the left). To show that the set is orthonormal we also need to show that all the functions in the set are unit functions (the two equations below on the right).
um · un = 0, m ≠ n,    un · un = 1, for all n,
vm · vn = 0, m ≠ n,    vn · vn = 1, for all n,
um · vn = 0, for all m, n.
Example 7.2.2: The normalization condition is simple to see, because for n ≥ 1 holds
un · un = ∫_{-L}^{L} (1/√L) cos(nπx/L) (1/√L) cos(nπx/L) dx = (1/L) ∫_{-L}^{L} cos^2(nπx/L) dx = (1/L) L = 1.
C
The orthogonality of the set above is equivalent to the following statement about the functions sine and cosine.
Theorem 7.2.9. The following relations hold for all n, m ∈ N,
∫_{-L}^{L} cos(nπx/L) cos(mπx/L) dx = { 0 if n ≠ m;  L if n = m ≠ 0;  2L if n = m = 0 },
∫_{-L}^{L} sin(nπx/L) sin(mπx/L) dx = { 0 if n ≠ m;  L if n = m },
∫_{-L}^{L} cos(nπx/L) sin(mπx/L) dx = 0.
Still for n > 0 or m > 0, assume that n ≠ m; then the second term above is
(1/2) ∫_{-L}^{L} cos((n - m)πx/L) dx = [L/(2(n - m)π)] sin((n - m)πx/L) |_{-L}^{L} = 0.
The remaining equations in the Theorem are proven in a similar way. This establishes the Theorem.
Remark: Instead of an orthonormal set we will use an orthogonal set, which is often used in the literature on Fourier series:
{ u0 = 1/2,  un = cos(nπx/L),  vn = sin(nπx/L) }_{n=1}^{∞}.
Idea of the Proof of Theorem 7.2.10: It is not simple to prove that the set in (7.2.1) is a basis, that is, that every continuous function on [-L, L] can be written as a linear combination
f(x) = a0/2 + Σ_{n=1}^{∞} [an cos(nπx/L) + bn sin(nπx/L)].
We skip that part of the proof. But once we have the expansion above, it is not difficult to find a formula for the coefficients a0, an, and bn, for n ≥ 1. To find a coefficient bm we just multiply the expansion above by sin(mπx/L) and integrate on [-L, L], that is,
f(x) · sin(mπx/L) = (a0/2) · sin(mπx/L) + Σ_{n=1}^{∞} [an cos(nπx/L) + bn sin(nπx/L)] · sin(mπx/L).
The linearity property of the dot product implies
f(x) · sin(mπx/L) = (a0/2) (1 · sin(mπx/L)) + Σ_{n=1}^{∞} an (cos(nπx/L) · sin(mπx/L)) + Σ_{n=1}^{∞} bn (sin(nπx/L) · sin(mπx/L)).
But (1/2) · sin(mπx/L) = 0, since the sine functions above are perpendicular to the constant functions. Also cos(nπx/L) · sin(mπx/L) = 0, since all sine functions above are perpendicular to all cosine functions above. Finally sin(nπx/L) · sin(mπx/L) = 0 for m ≠ n, since
sine functions with different values of m and n are mutually perpendicular. So, on the right-hand side above only one term survives, n = m,
f(x) · sin(mπx/L) = bm (sin(mπx/L) · sin(mπx/L)).
But on the right-hand side we got the magnitude squared of the sine function above,
sin(mπx/L) · sin(mπx/L) = ||sin(mπx/L)||^2 = L.
Therefore,
f(x) · sin(mπx/L) = bm L  ⇒  bm = (1/L) ∫_{-L}^{L} f(x) sin(mπx/L) dx.
To get the coefficient am, multiply the series expansion of f by cos(mπx/L) and integrate on [-L, L], that is,
f(x) · cos(mπx/L) = (a0/2) (1 · cos(mπx/L)) + Σ_{n=1}^{∞} [an (cos(nπx/L) · cos(mπx/L)) + bn (sin(nπx/L) · cos(mπx/L))].
As before, the linearity of the dot product together with the orthogonality properties of the basis implies that only one term survives,
f(x) · cos(mπx/L) = am (cos(mπx/L) · cos(mπx/L)).
Since
cos(mπx/L) · cos(mπx/L) = ||cos(mπx/L)||^2 = L,
we get that
f(x) · cos(mπx/L) = am L  ⇒  am = (1/L) ∫_{-L}^{L} f(x) cos(mπx/L) dx.
The coefficient a0 is obtained by integrating on [-L, L] the series expansion for f and using that all the sine and cosine functions above are perpendicular to the constant functions; then we get
∫_{-L}^{L} f(x) dx = (a0/2) ∫_{-L}^{L} dx = (a0/2) 2L,
so we get the formula
a0 = (1/L) ∫_{-L}^{L} f(x) dx.
We also skip the part of the proof about the values of the Fourier series of discontinuous functions at the points of discontinuity.
We now use the formulas in the Theorem above to compute the Fourier series expansion
of a continuous function.
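These coefficient formulas are easy to evaluate numerically. A sketch with SciPy quadrature; the sample function f(x) = x on [-3, 3] is an assumption of this note, not the text's own example:

    # Sketch: compute a_n and b_n on [-L, L] by numerical integration.
    import numpy as np
    from scipy.integrate import quad

    L = 3.0
    f = lambda x: x          # assumed sample function (odd)

    def a(n):
        return quad(lambda x: f(x)*np.cos(n*np.pi*x/L), -L, L)[0] / L

    def b(n):
        return quad(lambda x: f(x)*np.sin(n*np.pi*x/L), -L, L)[0] / L

    print([round(a(n), 6) for n in range(3)])       # ~0, since f is odd
    print([round(b(n), 6) for n in range(1, 4)])    # 6*(-1)**(n+1)/(n*pi)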
7.2.3. Even or Odd Functions. The Fourier series expansion of a function takes a simpler
form in case the function is either even or odd. More interestingly, given a function on [0, L]
one can extend such function to [L, L] requiring that the extension be either even or odd.
Definition 7.2.11. A function f on [-L, L] is:
even iff f(-x) = f(x) for all x ∈ [-L, L];
odd iff f(-x) = -f(x) for all x ∈ [-L, L].
Remark: Not every function is either odd or even. The function y = ex is neither even nor
odd. And in the case that a function is even, such as y = cos(x), or odd, such as y = sin(x),
it is very simple to break that symmetry: add a constant. The functions y = 1 + cos(x) and
y = 1 + sin(x) are neither even nor odd.
The graph of a typical even function is symmetric about the vertical axis, while the graph of a typical odd function is symmetric about the origin.
[Figure: the even function y = x^2 and the odd function y = x^3.]
Even and odd functions interact with integration in a simple way: if f is even, then ∫_{-L}^{L} f(x) dx = 2 ∫_{0}^{L} f(x) dx, while if f is odd, then ∫_{-L}^{L} f(x) dx = 0.
Remark: We leave the proof as an exercise. Notice that the last two equations above are simple to understand just by looking at the figure: for y = x^2 the areas on both sides of the vertical axis add up, while for y = x^3 they cancel each other.
7.2.4. Sine and Cosine Series. In the case that a function is either even or odd, half of its Fourier series expansion coefficients vanish. In this case the Fourier series is called either a sine or a cosine series.
(a) If f is even, then bn = 0. The Fourier series is called a cosine series,
f(x) = a0/2 + Σ_{n=1}^{∞} an cos(nπx/L).
(b) If f is odd, then an = 0. The Fourier series is called a sine series,
f(x) = Σ_{n=1}^{∞} bn sin(nπx/L).
Part (a): Suppose that f is even; then for n ≥ 1 we get
bn = (1/L) ∫_{-L}^{L} f(x) sin(nπx/L) dx,
but f is even and the sine is odd, so the integrand is odd. Therefore bn = 0.
Part (b): Suppose that f is odd; then for n ≥ 1 we get
an = (1/L) ∫_{-L}^{L} f(x) cos(nπx/L) dx,
but f is odd and the cosine is even, so the integrand is odd. Therefore an = 0. Finally,
a0 = (1/L) ∫_{-L}^{L} f(x) dx,
which also vanishes, since f is odd.
The Fourier series of an odd function is actually a sine series. Consider the odd function f(x) = -1 for x ∈ [-3, 0) and f(x) = 1 for x ∈ (0, 3]. All the coefficients an = 0 for n ≥ 0, so we only need to compute the coefficients bn. Since in our case L = 3, we have
bn = (1/3) ∫_{-3}^{3} f(x) sin(nπx/3) dx
   = (1/3) [∫_{-3}^{0} (-1) sin(nπx/3) dx + ∫_{0}^{3} sin(nπx/3) dx]
   = (2/3) ∫_{0}^{3} sin(nπx/3) dx
   = (2/3) (-3/(nπ)) cos(nπx/3) |_{0}^{3}
   = -(2/(nπ)) ((-1)^n - 1)  ⇒  bn = (2/(nπ)) ((-1)^{n+1} + 1).
Therefore, we get
fF(x) = Σ_{n=1}^{∞} (2/(nπ)) ((-1)^{n+1} + 1) sin(nπx/3).
C
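A numerical sketch of the partial sums of this sine series (NumPy is an assumption of this note); away from the jump at x = 0 they approach ±1, and at the jump the series converges to the average value 0:

    # Sketch: partial sums of the sine series of the square wave above.
    import numpy as np

    def partial_sum(x, N):
        n = np.arange(1, N + 1)
        bn = (2.0/(n*np.pi))*((-1.0)**(n + 1) + 1.0)
        return np.sum(bn*np.sin(n*np.pi*np.asarray(x)[:, None]/3.0), axis=1)

    x = np.array([-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0])
    print(np.round(partial_sum(x, 200), 3))   # ~ -1, -1, -1, 0, 1, 1, 1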
Example: Find the Fourier series expansion of the even function
f(x) = 1 + x for x ∈ [-1, 0],  f(x) = 1 - x for x ∈ [0, 1].
Solution: Since L = 1, we start computing a0,
a0 = ∫_{-1}^{1} f(x) dx = ∫_{-1}^{0} (1 + x) dx + ∫_{0}^{1} (1 - x) dx
   = (x + x^2/2) |_{-1}^{0} + (x - x^2/2) |_{0}^{1}
   = (1 - 1/2) + (1 - 1/2)  ⇒  a0 = 1.
Similarly,
an = ∫_{-1}^{1} f(x) cos(nπx) dx = ∫_{-1}^{0} (1 + x) cos(nπx) dx + ∫_{0}^{1} (1 - x) cos(nπx) dx.
7.2.5. Applications. The Fourier series expansion is a powerful tool for signal analysis. It allows us to view any signal in a different way, where several difficult problems are very simple to solve. Take sound, for example. Sound can be transformed into electrical currents by a microphone. There are electric circuits that compute the Fourier series expansion of the currents, and the result is the frequencies and their corresponding amplitudes present in that signal. Then it is possible to manipulate a precise frequency and recombine the result into a current. That current is transformed into sound by a speaker.
This type of sound manipulation is very common. You might remember the annoying sound of the vuvuzelas (a kind of loud plastic trumpet, very cheap) in the 2010 soccer World Cup. Their sound drowned out the TV commentators during the World Cup. But by the 2014 World Cup you could see the vuvuzelas in the stadiums, yet you did not hear them. It turns out vuvuzelas produce a single-frequency sound, at about 235 Hz. The TV equipment had incorporated a circuit that eliminated that sound, just as we described above: Fourier-expand the sound, kill that annoying frequency, and recombine the sound.
A similar, although more elaborate, sound manipulation is done constantly by sound editors in any film. Suppose you like an actor but you do not like his voice. You record the movie, then take the actor's voice, compute its Fourier series expansion, increase the amplitudes of the frequencies you like, kill the frequencies you do not like, and recombine the resulting sound. Now the actor has a new voice in the movie.
7.2.6. Exercises.
7.2.1.- . 7.2.2.- .
Remarks:
u is the temperature of a solid material.
t is a time coordinate, while x is a space coordinate.
k > 0 is the heat conductivity, with units [k] = [x]^2 / [t].
The partial differential equation above has infinitely many solutions.
We look for solutions satisfying both boundary conditions and initial conditions.
The heat equation contains partial derivatives with respect to time and space. Solving the equation means doing several integrations, which introduce a few arbitrary integration constants; so the equation has infinitely many solutions. We are going to look for solutions that satisfy some additional conditions, known as boundary conditions and initial conditions.
Boundary Conditions: u(t, 0) = 0, u(t, L) = 0.   Initial Conditions: u(0, x) = f(x), with f(0) = f(L) = 0.
We are going to try to understand the qualitative behavior of the solutions to the heat
equation before we start any detailed calculation. Recall that the heat equation is
∂t u = k ∂x^2 u.
The meaning of the left- and right-hand sides of the equation is the following:
(How fast the temperature increases or decreases) = k (> 0) × (The concavity of the graph of u in the variable x at a given time).
Suppose that at a fixed time t > 0 the graph of the temperature u as a function of x is given by Fig. 64. We assume that the boundary conditions are u(t, 0) = T0 = 0 and u(t, L) = TL > 0. Then the temperature will evolve in time following the red arrows in that figure.
The heat equation relates the time variation of the temperature, ∂t u, to the curvature of the function u in the x variable, ∂x^2 u. In the regions where the function u is concave up, hence ∂x^2 u > 0, the heat equation says that the temperature must increase, ∂t u > 0. In the regions where the function u is concave down, hence ∂x^2 u < 0, the heat equation says that the temperature must decrease, ∂t u < 0.
Therefore, the heat equation tries to make the temperature along the material vary the least possible amount that is consistent with the boundary conditions. In the case of the figure below, the temperature will try to get to the dashed line.
[Figure 64: the graph of u(t, x) at a fixed time t, with boundary values T0 = 0 and TL ≠ 0; arrows indicate ∂t u > 0 where u is concave up and ∂t u < 0 where u is concave down.]
Before we start solving the heat equation we mention one generalization and a couple of similar equations.
The heat equation in three space dimensions is
∂t u = k (∂x^2 u + ∂y^2 u + ∂z^2 u).
The method we use in this section to solve the one-space-dimensional equation can be generalized to solve the three-space-dimensional equation.
The wave equation in three space dimensions is
∂t^2 u = v^2 (∂x^2 u + ∂y^2 u + ∂z^2 u).
This equation describes how waves propagate in a medium. The constant v has units of velocity, and it is the wave speed.
The Schrodinger equation of Quantum Mechanics is
i ħ ∂t u = -(ħ^2 / (2m)) (∂x^2 u + ∂y^2 u + ∂z^2 u) + V(t, x) u,
where m is the mass of a particle and ħ is the Planck constant divided by 2π, while i^2 = -1. The solutions of this equation behave more like the solutions of the wave equation than the solutions of the heat equation.
7.3.2. The IBVP: Dirichlet Conditions. We now find solutions of the one-space-dimensional heat equation that satisfy a particular type of boundary conditions, called Dirichlet boundary conditions. These conditions fix the values of the temperature at two sides of the bar.
Theorem 7.3.2. The boundary value problem for the one-space-dimensional heat equation,
∂t u = k ∂x^2 u,  BC: u(t, 0) = 0, u(t, L) = 0,
where k > 0, L > 0 are constants, has infinitely many solutions
u(t, x) = Σ_{n=1}^{∞} cn e^{-k (nπ/L)^2 t} sin(nπx/L),  cn ∈ R.
Furthermore, for every continuous function f on [0, L] satisfying f(0) = f(L) = 0, there is a unique solution u of the boundary value problem above that also satisfies the initial condition
u(0, x) = f(x).
This solution u is given by the expression above, where the coefficients cn are
cn = (2/L) ∫_{0}^{L} f(x) sin(nπx/L) dx.
The proof of the Theorem above is based on the separation of variables method:
(1) Look for simple solutions of the boundary value problem.
(2) Any linear combination of simple solutions is also a solution. (Superposition.)
(3) Determine the free constants with the initial condition.
Proof of Theorem 7.3.2: Look for simple solutions of the heat equation given by
u(t, x) = v(t) w(x).
So we look for solutions having the variables separated into two functions. Introduce this particular function into the heat equation,
v'(t) w(x) = k v(t) w''(x)  ⇒  (1/k) (v'(t)/v(t)) = w''(x)/w(x),
where we used the notation v' = dv/dt and w' = dw/dx. The separation of variables in the function u implies a separation of variables in the heat equation. The left-hand side in the last equation above depends only on t and the right-hand side depends only on x. The only possible solution is that both sides are equal to the same constant, call it -λ. So we end up with two equations
(1/k) (v'(t)/v(t)) = -λ,  and  w''(x)/w(x) = -λ.
The equation on the left is first order and simple to solve. The solution depends on λ,
vλ(t) = cλ e^{-kλt},  cλ = vλ(0).
The second equation leads to an eigenfunction problem for w once boundary conditions are provided. These boundary conditions come from the heat equation boundary conditions,
u(t, 0) = v(t) w(0) = 0 for all t > 0,
u(t, L) = v(t) w(L) = 0 for all t > 0,
hence w(0) = w(L) = 0. This is the eigenfunction problem solved in Example 7.1.4, with eigenvalues and eigenfunctions
λn = (nπ/L)^2,  wn(x) = sin(nπx/L),  n ≥ 1.
Therefore, any linear combination
u(t, x) = Σ_{n=1}^{∞} cn e^{-k(nπ/L)^2 t} sin(nπx/L)
is solution of the heat equation with homogeneous Dirichlet boundary conditions. Here the cn are arbitrary constants. Notice that at t = 0 we have
u(0, x) = Σ_{n=1}^{∞} cn sin(nπx/L),
so the initial condition determines the cn as the coefficients of the sine series of f.
Example: Find all solutions of the heat equation 4 ∂t u = ∂x^2 u on the interval x ∈ [0, 2] with homogeneous Dirichlet boundary conditions u(t, 0) = u(t, 2) = 0.
Solution: We look for simple solutions of the form u(t, x) = v(t) w(x),
4 w(x) v'(t) = v(t) w''(x)  ⇒  4 v'(t)/v(t) = w''(x)/w(x) = -λ.
So, the equations for v and w are
v'(t) = -(λ/4) v(t),  w''(x) + λ w(x) = 0.
The solution for v depends on λ, and is given by
vλ(t) = cλ e^{-λt/4},  cλ = vλ(0).
Next we turn to the equation for w, and we solve the BVP
w''(x) + λ w(x) = 0, with BC w(0) = w(2) = 0.
This is an eigenfunction problem for w and λ. This problem has solutions only for λ > 0, since only in that case the characteristic polynomial has complex roots. Let λ = μ^2; then
p(r) = r^2 + μ^2 = 0  ⇒  r± = ±μ i.
The general solution of the differential equation is
w(x) = c1 cos(μx) + c2 sin(μx).
The first boundary condition on w implies
0 = w(0) = c1  ⇒  w(x) = c2 sin(μx).
The second boundary condition on w implies
0 = w(2) = c2 sin(2μ),  c2 ≠ 0  ⇒  sin(2μ) = 0.
Then 2 μn = nπ, that is, μn = nπ/2. Choosing c2 = 1, we conclude,
λn = (nπ/2)^2,  wn(x) = sin(nπx/2),  n = 1, 2, ….
Using the values of λn found above in the formula for v we get
vn(t) = cn e^{-(nπ/4)^2 t},  cn = vn(0).
Therefore, we get
u(t, x) = Σ_{n=1}^{∞} cn e^{-(nπ/4)^2 t} sin(nπx/2).
So we get
bn = (10/(nπ)) (cos(2nπ/3) - cos(nπ/3)).
Since fodd(x) = f(x) for x ∈ [0, 2], we get that cn = bn. So, the solution of the initial-boundary value problem for the heat equation is
u(t, x) = Σ_{n=1}^{∞} (10/(nπ)) (cos(2nπ/3) - cos(nπ/3)) e^{-(nπ/4)^2 t} sin(nπx/2).
C
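Series solutions like the one above are straightforward to evaluate numerically by truncating the sum. A sketch with NumPy, using the (reconstructed) coefficients of the example above:

    # Sketch: evaluate the truncated series solution u(t, x) above.
    import numpy as np

    def u(t, x, N=100):
        n = np.arange(1, N + 1)
        cn = (10.0/(n*np.pi))*(np.cos(2*n*np.pi/3) - np.cos(n*np.pi/3))
        modes = cn*np.exp(-(n*np.pi/4)**2*t)*np.sin(n*np.pi*np.asarray(x)[:, None]/2)
        return np.sum(modes, axis=1)

    x = np.linspace(0.0, 2.0, 5)
    print(np.round(u(0.0, x), 3))   # approximates the initial data
    print(np.round(u(1.0, x), 3))   # decays toward zero as t grows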
7.3.3. The IBVP: Neumann Conditions. We now find solutions of the one-space dimen-
sional heat equation that satisfy a particular type of boundary conditions, called Neumann
boundary conditions. These conditions fix the values of the heat flux at two sides of the
bar.
Theorem 7.3.3. The boundary value problem for the one-space-dimensional heat equation,
∂t u = k ∂x^2 u,  BC: ∂x u(t, 0) = 0, ∂x u(t, L) = 0,
where k > 0, L > 0 are constants, has infinitely many solutions
u(t, x) = c0/2 + Σ_{n=1}^{∞} cn e^{-k(nπ/L)^2 t} cos(nπx/L),  cn ∈ R.
Furthermore, for every continuous function f on [0, L] satisfying f'(0) = f'(L) = 0, there is a unique solution u of the boundary value problem above that also satisfies the initial condition
u(0, x) = f(x).
This solution u is given by the expression above, where the coefficients cn are
cn = (2/L) ∫_{0}^{L} f(x) cos(nπx/L) dx,  n = 0, 1, 2, ….
One can use Dirichlet conditions on one side and Neumann on the other side. This is
called a mixed boundary condition. The proof, in all cases, is based on the separation of
variables method.
Proof of Theorem 7.3.3: Look for simple solutions of the heat equation given by
u(t, x) = v(t) w(x).
So we look for solutions having the variables separated into two functions. Introduce this particular function into the heat equation,
v'(t) w(x) = k v(t) w''(x)  ⇒  (1/k) (v'(t)/v(t)) = w''(x)/w(x),
where we used the notation v' = dv/dt and w' = dw/dx. The separation of variables in the function u implies a separation of variables in the heat equation. The left-hand side in the last equation above depends only on t and the right-hand side depends only on x. The only possible solution is that both sides are equal to the same constant, call it -λ. So we end up with two equations
(1/k) (v'(t)/v(t)) = -λ,  and  w''(x)/w(x) = -λ.
The equation on the left is first order and simple to solve. The solution depends on λ,
vλ(t) = cλ e^{-kλt},  cλ = vλ(0).
The second equation leads to an eigenfunction problem for w once boundary conditions are provided. These boundary conditions come from the heat equation boundary conditions,
∂x u(t, 0) = v(t) w'(0) = 0 for all t > 0,
∂x u(t, L) = v(t) w'(L) = 0 for all t > 0,
hence w'(0) = w'(L) = 0. So we need to solve the following BVP for w:
w'' + λ w = 0,  w'(0) = w'(L) = 0.
This is an eigenfunction problem, which has solutions only for λ > 0, because in that case the associated characteristic polynomial has complex roots. If we write λ = μ^2, for μ > 0, we get the general solution
w(x) = c1 cos(μx) + c2 sin(μx).
The boundary conditions apply on the derivative,
w'(x) = -μ c1 sin(μx) + μ c2 cos(μx).
The first boundary condition is
0 = w'(0) = μ c2  ⇒  c2 = 0.
So the function is w(x) = c1 cos(μx). The second boundary condition is
0 = w'(L) = -μ c1 sin(μL)  ⇒  sin(μL) = 0  ⇒  μn L = nπ,  n = 1, 2, ….
Then μn = nπ/L, so λn = (nπ/L)^2 and wn(x) = cos(nπx/L). Therefore, any linear combination
u(t, x) = c0/2 + Σ_{n=1}^{∞} cn e^{-k(nπ/L)^2 t} cos(nπx/L)
is solution of the heat equation with homogeneous Neumann boundary conditions. Notice that the constant solution c0/2 is a trivial solution of the Neumann boundary value problem, which was not present in the Dirichlet boundary value problem. Here the cn are arbitrary constants. Notice that at t = 0 we have
u(0, x) = c0/2 + Σ_{n=1}^{∞} cn cos(nπx/L),
so the initial condition determines the cn as the coefficients of the cosine series of f.
Example: Find the solution of the initial-boundary value problem
∂t u = ∂x^2 u,  t > 0,  x ∈ [0, 3],
with Neumann boundary conditions ∂x u(t, 0) = ∂x u(t, 3) = 0 and the initial condition given below.
Solution: We look for simple solutions of the form u(t, x) = v(t) w(x),
w(x) v'(t) = v(t) w''(x)  ⇒  v'(t)/v(t) = w''(x)/w(x) = -λ.
So, the equations for v and w are
v'(t) = -λ v(t),  w''(x) + λ w(x) = 0.
The solution for v depends on λ, and is given by
vλ(t) = cλ e^{-λt},  cλ = vλ(0).
Next we turn to the equation for w, and we solve the BVP
w''(x) + λ w(x) = 0, with BC w'(0) = w'(3) = 0.
This is an eigenfunction problem for w and λ. This problem has solutions only for λ > 0, since only in that case the characteristic polynomial has complex roots. Let λ = μ^2; then
p(r) = r^2 + μ^2 = 0  ⇒  r± = ±μ i.
The general solution of the differential equation is
w(x) = c1 cos(μx) + c2 sin(μx).
Its derivative is
w'(x) = -μ c1 sin(μx) + μ c2 cos(μx).
The first boundary condition on w implies
0 = w'(0) = μ c2  ⇒  c2 = 0  ⇒  w(x) = c1 cos(μx).
The second boundary condition on w implies
0 = w'(3) = -μ c1 sin(3μ),  c1 ≠ 0  ⇒  sin(3μ) = 0.
Then 3 μn = nπ, that is, μn = nπ/3. Choosing c1 = 1, we conclude,
λn = (nπ/3)^2,  wn(x) = cos(nπx/3),  n = 1, 2, ….
Using the values of λn found above in the formula for v we get
vn(t) = cn e^{-(nπ/3)^2 t},  cn = vn(0).
Therefore, we get
u(t, x) = c0/2 + Σ_{n=1}^{∞} cn e^{-(nπ/3)^2 t} cos(nπx/3),
where we have added the trivial constant solution written as c0/2. The initial condition is
f(x) = u(0, x) = { 7 for x ∈ (3/2, 3],  0 for x ∈ [0, 3/2) }.
We extend f to [-3, 3] as an even function,
feven(x) = { 7 for x ∈ (3/2, 3],  0 for x ∈ (-3/2, 3/2),  7 for x ∈ [-3, -3/2) }.
But the function f has exactly the same Fourier expansion on [0, 3], which means that
c0 = 7,  c2k = 0,  c(2k-1) = 7 (2(-1)^k / ((2k - 1)π)).
So the solution of the initial-boundary value problem for the heat equation is
u(t, x) = 7/2 + 7 Σ_{k=1}^{∞} (2(-1)^k / ((2k - 1)π)) e^{-((2k-1)π/3)^2 t} cos((2k - 1)πx/3).
C
The expressions for vn and wn imply that the simple solution un has the form
un(t, x) = cn e^{-(nπ/4)^2 t} sin(nπx/2).
Since any linear combination of the functions above is also a solution, we get
u(t, x) = Σ_{n=1}^{∞} cn e^{-(nπ/4)^2 t} sin(nπx/2).
The initial condition is
3 sin(πx/2) = Σ_{n=1}^{∞} cn sin(nπx/2).
We now consider this function on the interval [-2, 2], where it is an odd function. Then, the orthogonality of the sine functions above implies
∫_{-2}^{2} 3 sin(πx/2) sin(mπx/2) dx = Σ_{n=1}^{∞} cn ∫_{-2}^{2} sin(nπx/2) sin(mπx/2) dx.
The integrals on the right-hand side vanish except for n = m, so c1 = 3 and cn = 0 for n ≥ 2, and the solution is u(t, x) = 3 e^{-(π/4)^2 t} sin(πx/2).
7.3.4. Exercises.
7.3.1.- . 7.3.2.- .
We review a few concepts of linear algebra, such as the Gauss operations to solve linear
systems of algebraic equations, matrix operations, determinants, inverse matrix formulas,
eigenvalues and eigenvectors of a matrix, diagonalizable matrices, and the exponential of a
matrix.
(b) The coefficients of the algebraic linear systems in Example 8.1.1 can be grouped in matrices, as follows: the system
2 x1 - x2 = 0,
-x1 + 2 x2 = 3,
has coefficient matrix A = [2  -1; -1  2], while the system
x1 + 2 x2 + x3 = 1,
-3 x1 + x2 + 3 x3 = 24,
x2 - 4 x3 = -1,
has coefficient matrix A = [1  2  1; -3  1  3; 0  1  -4].
Remark: A square matrix is upper (lower) triangular iff all the matrix coefficients below (above) the diagonal vanish. For example, the 3 × 3 matrix A below is upper triangular while B is lower triangular,
A = [1  2  3; 0  4  5; 0  0  6],  B = [1  0  0; 2  3  0; 4  5  6].
An m-vector is an ordered list of m numbers, v = (v1, …, vm)^T, whose components are vi ∈ C, with i = 1, …, m.
Example 8.1.3: The unknowns of the algebraic linear systems in Example 8.1.1 can be grouped in vectors, as follows: the unknowns of the 2 × 2 system form the vector x = (x1, x2)^T, while the unknowns of the 3 × 3 system form the vector x = (x1, x2, x3)^T. C
Example 8.1.5: Use the matrix-vector product to express the algebraic linear system below,
2 x1 - x2 = 0,
-x1 + 2 x2 = 3.
Solution: Introduce the coefficient matrix A, the unknown vector x, and the source vector b as follows,
A = [2  -1; -1  2],  x = (x1, x2)^T,  b = (0, 3)^T.
Since the matrix-vector product Ax is given by
Ax = [2  -1; -1  2] (x1; x2) = (2 x1 - x2; -x1 + 2 x2),
then we conclude that
2 x1 - x2 = 0, -x1 + 2 x2 = 3  ⟺  (2 x1 - x2; -x1 + 2 x2) = (0; 3)  ⟺  Ax = b.
C
It is simple to see that the result found in the Example above can be generalized to every n × n algebraic linear system.
Theorem 8.1.5 (Matrix Notation). The system in Eqs. (8.1.1)-(8.1.2) can be written as
Ax = b,
where the coefficient matrix A, the unknown vector x, and the source vector b are
A = [a11 … a1n; ⋮ ⋱ ⋮; an1 … ann],  x = (x1, …, xn)^T,  b = (b1, …, bn)^T.
Proof of Theorem 8.1.5: From the definition of the matrix-vector product we have that
Ax = [a11 … a1n; ⋮ ⋱ ⋮; an1 … ann] (x1; ⋮; xn) = (a11 x1 + … + a1n xn; ⋮; an1 x1 + … + ann xn).
Then, we conclude that
a11 x1 + … + a1n xn = b1,
⋮
an1 x1 + … + ann xn = bn,
⟺  (a11 x1 + … + a1n xn; ⋮; an1 x1 + … + ann xn) = (b1; ⋮; bn)  ⟺  Ax = b.
We introduce one last definition, which will be helpful in the next subsection.
Definition 8.1.6. The augmented matrix of Ax = b is the n × (n + 1) matrix [A|b].
The augmented matrix of an algebraic linear system contains the equation coefficients and
the sources. Therefore, the augmented matrix of a linear system contains the complete
information about the system.
Example 8.1.6: Find the augmented matrix of both the linear systems in Example 8.1.1.
Solution: The coefficient matrix and source vector of the first system imply that
A = [2  -1; -1  2],  b = (0; 3)  ⇒  [A|b] = [2  -1 | 0; -1  2 | 3].
The coefficient matrix and source vector of the second system imply that
A = [1  2  1; -3  1  3; 0  1  -4],  b = (1; 24; -1)  ⇒  [A|b] = [1  2  1 | 1; -3  1  3 | 24; 0  1  -4 | -1].
C
Recall that the linear combination of two vectors is defined component-wise, that is, given any numbers a, b ∈ R and any vectors x, y, their linear combination is the vector given by
ax + by = (a x1 + b y1; ⋮; a xn + b yn),  where  x = (x1; ⋮; xn),  y = (y1; ⋮; yn).
With this definition of linear combination of vectors it is simple to see that the matrix-vector
product is a linear operation.
Theorem 8.1.7 (Linearity). The matrix-vector product is a linear operation, that is, given an n × n matrix A, then for all n-vectors x, y and all numbers a, b ∈ R holds
A(ax + by) = a (Ax) + b (Ay).
Proof of Theorem 8.1.7: Just write down the matrix-vector product in components,
A(ax + by) = [a11 … a1n; ⋮ ⋱ ⋮; an1 … ann] (a x1 + b y1; ⋮; a xn + b yn) = (a11 (a x1 + b y1) + … + a1n (a xn + b yn); ⋮; an1 (a x1 + b y1) + … + ann (a xn + b yn)).
Expand the linear combinations on each component on the far right-hand side above and reorder terms as follows,
A(ax + by) = (a (a11 x1 + … + a1n xn) + b (a11 y1 + … + a1n yn); ⋮; a (an1 x1 + … + ann xn) + b (an1 y1 + … + ann yn)) = a (Ax) + b (Ay).
This establishes the Theorem.
As we said above, the Gauss elimination operations change the coefficients of the augmented
matrix of a system but do not change its solution. Two systems of linear equations having
the same solutions are called equivalent. It can be shown that there is an algorithm using
these operations that transforms any n n linear system into an equivalent system where
the solutions are explicitly given.
Example 8.1.7: Find the solution to the 2 × 2 linear system given in Example 8.1.1 using the Gauss elimination operations.
Solution: Consider the augmented matrix of the 2 × 2 linear system in Example 8.1.1, and perform the following Gauss elimination operations,
[2  -1 | 0; -1  2 | 3] → [2  -1 | 0; -2  4 | 6] → [2  -1 | 0; 0  3 | 6] → [2  -1 | 0; 0  1 | 2] → [2  0 | 2; 0  1 | 2] → [1  0 | 1; 0  1 | 2]
⇒  x1 + 0 = 1, 0 + x2 = 2  ⇒  x1 = 1, x2 = 2.
C
Example 8.1.8: Find the solution to the 3 × 3 linear system given in Example 8.1.1 using the Gauss elimination operations.
Solution: Consider the augmented matrix of the 3 × 3 linear system in Example 8.1.1 and perform the following Gauss elimination operations,
[1  2  1 | 1; -3  1  3 | 24; 0  1  -4 | -1] → [1  2  1 | 1; 0  7  6 | 27; 0  1  -4 | -1] → [1  2  1 | 1; 0  1  -4 | -1; 0  7  6 | 27],
[1  0  9 | 3; 0  1  -4 | -1; 0  0  34 | 34] → [1  0  9 | 3; 0  1  -4 | -1; 0  0  1 | 1] → [1  0  0 | -6; 0  1  0 | 3; 0  0  1 | 1]
⇒  x1 = -6, x2 = 3, x3 = 1.
C
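These eliminations can be reproduced with a computer algebra system. A sketch using SymPy's reduced-row-echelon-form routine (SymPy is an assumption of this note):

    # Sketch: Gauss elimination of Examples 8.1.7-8.1.8 via rref().
    import sympy as sp

    Ab2 = sp.Matrix([[2, -1, 0], [-1, 2, 3]])
    Ab3 = sp.Matrix([[1, 2, 1, 1], [-3, 1, 3, 24], [0, 1, -4, -1]])
    print(Ab2.rref()[0])   # last column gives x1 = 1, x2 = 2
    print(Ab3.rref()[0])   # last column gives x1 = -6, x2 = 3, x3 = 1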
In the last augmented matrix on both Examples 8.1.7 and 8.1.8 the solution is given ex-
plicitly. This is not always the case with every augmented matrix. A precise way to define
the final augmented matrix in the Gauss elimination method is captured in the notion of
echelon form and reduced echelon form of a matrix.
Definition 8.1.9. An m × n matrix is in echelon form iff the following conditions hold:
(i) The zero rows are located at the bottom rows of the matrix;
(ii) The first non-zero coefficient on a row is always to the right of the first non-zero
coefficient of the row above it.
The pivot coefficient is the first non-zero coefficient on every non-zero row in a matrix in
echelon form.
Example 8.1.9: The 6 × 8, 3 × 5 and 3 × 3 matrices given below are in echelon form, where * means any non-zero number and pivots are highlighted.
[Displays of three echelon-form matrices with highlighted pivots and * entries.]
C
Example 8.1.10: The following matrices are in echelon form, with pivots highlighted,
[1  3; 0  1],  [2  3  2; 0  4  -2],  [2  1  1; 0  3  4; 0  0  0].
C
Definition 8.1.10. An m × n matrix is in reduced echelon form iff the matrix is in
echelon form and the following two conditions hold:
(i) The pivot coefficient is equal to 1;
(ii) The pivot coefficient is the only non-zero coefficient in that column.
We denote by EA a reduced echelon form of a matrix A.
Example 8.1.11: The 6 × 8, 3 × 5 and 3 × 3 matrices given below are in reduced echelon form, where * means any non-zero number and pivots are highlighted.
[Displays of three reduced-echelon-form matrices with highlighted unit pivots.]
C
Example 8.1.12: And the following matrices are not only in echelon form but also in reduced echelon form; again, pivot coefficients are highlighted,
[1  0; 0  1],  [1  0  4; 0  1  5],  [1  0  0; 0  1  0; 0  0  0].
C
Summarizing, the Gauss elimination operations can transform any matrix into reduced
echelon form. Once the augmented matrix of a linear system is written in reduced echelon
form, it is not difficult to decide whether the system has solutions or not.
Example 8.1.13: Use Gauss operations to find the solution of the linear system
2 x1 - x2 = 0,
-(1/2) x1 + (1/4) x2 = 1/4.
Solution: We find the system's augmented matrix and perform appropriate Gauss elimination operations,
[2  -1 | 0; -1/2  1/4 | 1/4] → [2  -1 | 0; -2  1 | 1] → [2  -1 | 0; 0  0 | 1].
From the last augmented matrix above we see that the original linear system has the same solutions as the linear system given by
2 x1 - x2 = 0,
0 = 1.
Since the latter system has no solutions, the original system has no solutions. C
The situation shown in Example 8.1.13 is true in general. If the augmented matrix [A|b] of an algebraic linear system is transformed by Gauss operations into an augmented matrix having a row of the form [0, …, 0 | 1], then the original algebraic linear system Ax = b has no solution.
Example 8.1.14: Find all vectors b such that the system Ax = b has solutions, where
A = [1  2  3; -1  -1  -2; 2  1  3],  b = (b1; b2; b3).
Solution: We do not need to write down the algebraic linear system; we only need its augmented matrix,
[A|b] = [1  2  3 | b1; -1  -1  -2 | b2; 2  1  3 | b3] → [1  2  3 | b1; 0  1  1 | b1 + b2; 2  1  3 | b3]
→ [1  2  3 | b1; 0  1  1 | b1 + b2; 0  -3  -3 | b3 - 2 b1] → [1  2  3 | b1; 0  1  1 | b1 + b2; 0  0  0 | b3 + b1 + 3 b2].
Therefore, the system Ax = b has solutions iff the source vector satisfies b1 + 3 b2 + b3 = 0. C
8.1.3. Linear Dependence. We generalize the idea of two vectors lying on the same line, and three vectors lying on the same plane, to an arbitrary number of vectors.
Definition 8.1.11. A set of vectors {v1, …, vk}, with k ≥ 1, is called linearly dependent iff there exist constants c1, …, ck, with at least one of them non-zero, such that
c1 v1 + … + ck vk = 0.   (8.1.4)
The set of vectors is called linearly independent iff it is not linearly dependent, that is, the only constants c1, …, ck that satisfy Eq. (8.1.4) are given by c1 = … = ck = 0.
In other words, a set of vectors is linearly dependent iff one of the vectors is a linear combination of the other vectors. When this is not possible, the set is called linearly independent.
Example 8.1.15: Show that the following set of vectors is linearly dependent,
{ (1; 2; 3), (3; 2; 1), (-1; 2; 5) },
and express one of the vectors as a linear combination of the other two.
Solution: We need to find constants c1, c2, and c3, solutions of the equation
c1 (1; 2; 3) + c2 (3; 2; 1) + c3 (-1; 2; 5) = (0; 0; 0)  ⟺  [1  3  -1; 2  2  2; 3  1  5] (c1; c2; c3) = (0; 0; 0).
The solution to this linear system can be obtained with Gauss elimination operations,
[1  3  -1; 2  2  2; 3  1  5] → [1  3  -1; 0  -4  4; 0  -8  8] → [1  3  -1; 0  1  -1; 0  0  0] → [1  0  2; 0  1  -1; 0  0  0]
⇒  c1 = -2 c3,  c2 = c3,  c3 free.
Since there are non-zero constants c1, c2, c3 solutions to the linear system above, the vectors are linearly dependent. Choosing c3 = 1 we obtain the third vector as a linear combination of the other two vectors,
-2 (1; 2; 3) + (3; 2; 1) + (-1; 2; 5) = (0; 0; 0)  ⇒  (-1; 2; 5) = 2 (1; 2; 3) - (3; 2; 1).
C
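The same conclusion follows from the rank and null space of the matrix whose columns are the given vectors. A sketch with SymPy (an assumption of this note):

    # Sketch: linear dependence test for the vectors of Example 8.1.15.
    import sympy as sp

    V = sp.Matrix([[1, 3, -1], [2, 2, 2], [3, 1, 5]])  # columns v1, v2, v3
    print(V.rank())        # 2 < 3, so the columns are linearly dependent
    print(V.nullspace())   # [Matrix([-2, 1, 1])]: -2 v1 + v2 + v3 = 0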
8.1.4. Exercises.
8.1.1.- . 8.1.2.- .
[Figure 66: the action of 2 × 2 matrices on vectors in the plane; the first panel shows a reflection along the line x2 = x1, and the second panel shows vectors x, y, z and their images Ax, Ay, Az under a rotation.]
Example 8.2.2: Describe the action on R^2 of the function given by the 2 × 2 matrix
A = [0  -1; 1  0].   (8.2.2)
These cases are plotted in the second figure on Fig. 66, and the vectors are called x, y and z, respectively. We therefore conclude that this matrix produces a ninety-degree counterclockwise rotation of the plane. C
Definition 8.2.2. The transpose of a matrix A = [Aij] ∈ F^{m,n} is the matrix denoted as A^T = [(A^T)kl] ∈ F^{n,m}, with its components given by (A^T)kl = Alk.
Example 8.2.6: Find the transpose of the 2 × 3 matrix A = [1  3  5; 2  4  6].
Solution: Matrix A has components Aij with i = 1, 2 and j = 1, 2, 3. Therefore, its transpose has components (A^T)ji = Aij, that is, A^T has three rows and two columns,
A^T = [1  2; 3  4; 5  6].
C
If a matrix has complex-valued coefficients, then the conjugate of a matrix can be defined as the conjugate of each component.
Definition 8.2.3. The complex conjugate of a matrix A = [Aij] ∈ F^{m,n} is the matrix Ā = [Āij] ∈ F^{m,n}.
Example 8.2.7: A matrix A and its conjugate are given below,
A = [1  2+i; -i  3-4i],  Ā = [1  2-i; i  3+4i].
C
Example 8.2.8: A matrix A has real coefficients iff Ā = A; it has purely imaginary coefficients iff Ā = -A. Here are examples of these two situations:
A = [1  2; 3  4]  ⇒  Ā = [1  2; 3  4] = A;
A = [i  2i; 3i  4i]  ⇒  Ā = [-i  -2i; -3i  -4i] = -A.
C
Definition 8.2.4. The adjoint of a matrix A ∈ F^{m,n} is the matrix A* = (Ā)^T ∈ F^{n,m}.
Since (Ā)^T = (A^T)-bar, the order of the operations does not change the result; that is why there is no parenthesis in the definition of A*.
Example 8.2.9: A matrix A and its adjoint are given below,
A = [1  2+i; -i  3-4i],  A* = [1  i; 2-i  3+4i].
C
The transpose, conjugate and adjoint operations are useful to specify certain classes of matrices with particular symmetries. Here we introduce a few of these classes.
Definition 8.2.5. An n × n matrix A is called:
(a) symmetric iff A = A^T;
(b) skew-symmetric iff A = -A^T;
(c) Hermitian iff A = A*;
(d) skew-Hermitian iff A = -A*.
Example 8.2.10: We present examples of each of the classes introduced in Def. 8.2.5.
Part (a): Matrices A and B are symmetric. Notice that A is also Hermitian, while B is not Hermitian,
A = [1  2  3; 2  7  4; 3  4  8] = A^T,  B = [1  2+3i  3; 2+3i  7  4i; 3  4i  8] = B^T.
Part (b): Matrix C is skew-symmetric,
C = [0  2  3; -2  0  4; -3  -4  0]  ⇒  C^T = [0  -2  -3; 2  0  -4; 3  4  0] = -C.
Notice that the diagonal elements in a skew-symmetric matrix must vanish, since Cij = -Cji in the case i = j means Cii = -Cii, that is, Cii = 0.
Part (c): Matrix D is Hermitian but is not symmetric:
D = [1  2+i  3; 2-i  7  4+i; 3  4-i  8]  ⇒  D^T = [1  2-i  3; 2+i  7  4-i; 3  4+i  8] ≠ D,
however,
D* = (D-bar)^T = [1  2+i  3; 2-i  7  4+i; 3  4-i  8] = D.
Notice that the diagonal elements in a Hermitian matrix must be real numbers, since the condition Aij = Āji in the case i = j implies Aii = Āii, that is, 2i Im(Aii) = Aii - Āii = 0. We can also verify what we said in part (a): matrix A is Hermitian, since A* = (Ā)^T = A^T = A.
Part (d): The following matrix E is skew-Hermitian:
E = [i  2+i  3; -2+i  7i  4+i; -3  -4+i  8i]  ⇒  E^T = [i  -2+i  -3; 2+i  7i  -4+i; 3  4+i  8i],
therefore,
E* = (E-bar)^T = [-i  -2-i  -3; 2-i  -7i  -4-i; 3  4-i  -8i] = -E.
A skew-Hermitian matrix has purely imaginary elements in its diagonal, and the off-diagonal elements have skew-symmetric real parts with symmetric imaginary parts. C
The trace of a square matrix is a number, the sum of all the diagonal elements of the matrix.
Definition 8.2.6. The trace of a square matrix A = [Aij] ∈ F^{n,n}, denoted as tr(A) ∈ F, is the sum of its diagonal elements, that is, the scalar given by tr(A) = A11 + … + Ann.
Example 8.2.11: Find the trace of the matrix A = [1  2  3; 4  5  6; 7  8  9].
Solution: We only have to add up the diagonal elements:
tr(A) = 1 + 5 + 9  ⇒  tr(A) = 15.
C
The product is not defined for two arbitrary matrices, since the size of the matrices is important: the number of columns in the first matrix must match the number of rows in the second matrix,
A (m × n) times B (n × ℓ) defines AB (m × ℓ).
Example 8.2.12: Compute AB, where A = [2  -1; -1  2] and B = [3  0; 2  -1].
Solution: The component (AB)11 = 4 is obtained from the first row in matrix A and the first column in matrix B as follows,
(AB)11 = (2)(3) + (-1)(2) = 4.
The component (AB)12 = 1 is obtained as follows,
(AB)12 = (2)(0) + (-1)(-1) = 1.
The component (AB)21 = 1 is obtained as follows,
(AB)21 = (-1)(3) + (2)(2) = 1.
And finally the component (AB)22 = -2 is obtained as follows,
(AB)22 = (-1)(0) + (2)(-1) = -2.
Therefore,
AB = [4  1; 1  -2].
C
Example 8.2.13: Compute BA, where A = [2  -1; -1  2] and B = [3  0; 2  -1].
Solution: We find that BA = [6  -3; 5  -4]. Notice that in this case AB ≠ BA. C
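A quick check of these products, and of the fact that AB ≠ BA, with NumPy (an assumption of this note):

    # Sketch: the products of Examples 8.2.12-8.2.13.
    import numpy as np

    A = np.array([[2, -1], [-1, 2]])
    B = np.array([[3, 0], [2, -1]])
    print(A @ B)   # [[ 4  1], [ 1 -2]]
    print(B @ A)   # [[ 6 -3], [ 5 -4]]: AB != BA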
Example 8.2.14: Compute AB and BA, where A = [4  3; 2  1] and B = [1  2  3; 4  5  6].
Solution: The product AB is
AB = [4  3; 2  1] [1  2  3; 4  5  6] = [16  23  30; 6  9  12].
The product BA is not possible. C
Example 8.2.15: Compute AB and BA, where A = [-1  2; -1  2] and B = [1  -1; 1  -1].
Solution: We find that
AB = [-1  2; -1  2] [1  -1; 1  -1] = [1  -1; 1  -1],
BA = [1  -1; 1  -1] [-1  2; -1  2] = [0  0; 0  0].
Remarks:
(a) Notice that in this case AB ≠ BA.
(b) Notice that BA = 0 but A ≠ 0 and B ≠ 0.
C
8.2.3. The Inverse Matrix. We now introduce the concept of the inverse of a square
matrix. Not every square matrix is invertible. The inverse of a matrix is useful to compute
solutions to linear systems of algebraic equations.
Definition 8.2.8. The matrix In ∈ F^{n,n} is the identity matrix iff In x = x for all x ∈ F^n.
It is simple to see that the components of the identity matrix are given by
In = [Iij]  with  Iii = 1 and Iij = 0 for i ≠ j.
The cases n = 2, 3 are given by
I2 = [1  0; 0  1],  I3 = [1  0  0; 0  1  0; 0  0  1].
Definition 8.2.9. A matrix A ∈ F^{n,n} is called invertible iff there exists a matrix, denoted as A^{-1}, such that (A^{-1}) A = In and A (A^{-1}) = In.
Example 8.2.16: Verify that the matrix and its inverse are given by
A = [2  2; 1  3],  A^{-1} = (1/4) [3  -2; -1  2].
The number Δ = a11 a22 - a12 a21 is called the determinant of A, since it is the number that determines whether A is invertible or not.
Example 8.2.17: Compute the inverse of matrix A = [2  2; 1  3], given in Example 8.2.16.
Solution: Following Theorem 8.2.10 we first compute Δ = 6 - 2 = 4. Since Δ ≠ 0, then A^{-1} exists and it is given by
A^{-1} = (1/4) [3  -2; -1  2].
C
Example 8.2.18: Compute the inverse of matrix A = [1  2; 3  6].
Solution: Following Theorem 8.2.10 we first compute Δ = 6 - 6 = 0. Since Δ = 0, then matrix A is not invertible. C
The matrix operations we have introduced are useful to solve matrix equations, where the unknown is a matrix. We now show an example of a matrix equation.
Example 8.2.19: Find a matrix X such that A X B = I, where
A = [1  3; 2  1],  B = [2  1; 1  2].
Solution: There are many ways to solve a matrix equation. We choose to multiply the equation by the inverses of matrices A and B, if they exist. So first we check whether A is invertible. But
det(A) = 1 - 6 = -5 ≠ 0,
so A is indeed invertible. Regarding matrix B we get
det(B) = 4 - 1 = 3 ≠ 0,
so B is also invertible. We then compute
A X B = I  ⇒  A^{-1} (A X B) B^{-1} = A^{-1} I B^{-1}  ⇒  X = A^{-1} B^{-1}.
Therefore,
X = (-1/5) [1  -3; -2  1] (1/3) [2  -1; -1  2] = (-1/15) [5  -7; -5  4],
so we obtain
X = [-1/3  7/15; 1/3  -4/15].
C
8.2.4. Computing the Inverse Matrix. Gauss operations can be used to compute the
inverse of a matrix. The reason for this is simple to understand in the case of 2 2 matrices,
as can be seen in the following Example.
Example 8.2.20: Given any 2 × 2 matrix A, find its inverse matrix, A^{-1}, or show that the
inverse does not exist.
Solution: If the inverse matrix A^{-1} exists, then denote it as A^{-1} = [x_1, x_2]. The equation
A (A^{-1}) = I_2 is then equivalent to A [x_1, x_2] = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. This equation is equivalent to solving
two algebraic linear systems,
A x_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix},  A x_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.
Here is where we can use Gauss elimination operations. We use them on both systems,
\Big[ A \Big| \begin{matrix} 1 \\ 0 \end{matrix} \Big],  \Big[ A \Big| \begin{matrix} 0 \\ 1 \end{matrix} \Big].
However, we can solve both systems at the same time if we do Gauss operations on the
bigger augmented matrix
\Big[ A \Big| \begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix} \Big].
Now, perform Gauss operations until we obtain the reduced echelon form for [A|I_2]. Then
we can have two different types of results:
If there is no line of the form [0, 0 | ∗, ∗], with any of the star coefficients nonzero,
then matrix A is invertible, and the solution vectors x_1, x_2 form the columns of the
inverse matrix, that is, A^{-1} = [x_1, x_2].
If there is a line of the form [0, 0 | ∗, ∗], with any of the star coefficients nonzero, then
matrix A is not invertible. C
Example 8.2.21: Use Gauss operations to find the inverse of A = \begin{pmatrix} 2 & 2 \\ 1 & 3 \end{pmatrix}.
Solution: As we said in the example above, perform Gauss operations on the augmented
matrix [A|I_2] until the reduced echelon form is obtained, that is,
\Big[ \begin{matrix} 2 & 2 \\ 1 & 3 \end{matrix} \Big| \begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix} \Big] →
\Big[ \begin{matrix} 1 & 3 \\ 2 & 2 \end{matrix} \Big| \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \Big] →
\Big[ \begin{matrix} 1 & 3 \\ 0 & -4 \end{matrix} \Big| \begin{matrix} 0 & 1 \\ 1 & -2 \end{matrix} \Big] →
\Big[ \begin{matrix} 1 & 3 \\ 0 & 1 \end{matrix} \Big| \begin{matrix} 0 & 1 \\ -1/4 & 1/2 \end{matrix} \Big] →
\Big[ \begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix} \Big| \begin{matrix} 3/4 & -1/2 \\ -1/4 & 1/2 \end{matrix} \Big].
That is, matrix A is invertible and the inverse is
A^{-1} = \begin{pmatrix} 3/4 & -1/2 \\ -1/4 & 1/2 \end{pmatrix}  ⟹  A^{-1} = \frac{1}{4} \begin{pmatrix} 3 & -2 \\ -1 & 2 \end{pmatrix}.
C
Example 8.2.22: Use Gauss operations to find the inverse of A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 5 & 7 \\ 3 & 7 & 9 \end{pmatrix}.
Solution: We perform Gauss operations on the augmented matrix [A|I_3] until we obtain
its reduced echelon form, that is,
\Big[ \begin{matrix} 1 & 2 & 3 \\ 2 & 5 & 7 \\ 3 & 7 & 9 \end{matrix} \Big| \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix} \Big] →
\Big[ \begin{matrix} 1 & 2 & 3 \\ 0 & 1 & 1 \\ 0 & 1 & 0 \end{matrix} \Big| \begin{matrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -3 & 0 & 1 \end{matrix} \Big] →
\Big[ \begin{matrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & -1 \end{matrix} \Big| \begin{matrix} 5 & -2 & 0 \\ -2 & 1 & 0 \\ -1 & -1 & 1 \end{matrix} \Big] →
\Big[ \begin{matrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{matrix} \Big| \begin{matrix} 5 & -2 & 0 \\ -2 & 1 & 0 \\ 1 & 1 & -1 \end{matrix} \Big] →
\Big[ \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix} \Big| \begin{matrix} 4 & -3 & 1 \\ -3 & 0 & 1 \\ 1 & 1 & -1 \end{matrix} \Big].
We conclude that matrix A is invertible and
A^{-1} = \begin{pmatrix} 4 & -3 & 1 \\ -3 & 0 & 1 \\ 1 & 1 & -1 \end{pmatrix}.
C
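The Gauss operations above translate directly into a short routine. A sketch, with partial pivoting added for numerical safety, a detail not needed in the hand computation:

    import numpy as np

    def inverse_gauss(A):
        """Row reduce [A | I] to [I | A^{-1}]."""
        n = A.shape[0]
        M = np.hstack([A.astype(float), np.eye(n)])
        for i in range(n):
            p = i + np.argmax(np.abs(M[i:, i]))   # pivot row
            if np.isclose(M[p, i], 0.0):
                raise ValueError("matrix is not invertible")
            M[[i, p]] = M[[p, i]]                 # swap rows
            M[i] /= M[i, i]                       # normalize the pivot to 1
            for j in range(n):
                if j != i:
                    M[j] -= M[j, i] * M[i]        # clear the rest of the column
        return M[:, n:]

    A = np.array([[1, 2, 3], [2, 5, 7], [3, 7, 9]])
    print(inverse_gauss(A))   # [[4, -3, 1], [-3, 0, 1], [1, 1, -1]]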
Example 8.2.23: The following three examples show that the determinant can be a negative,
zero or positive number:
\begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix} = 4 − 6 = −2,  \begin{vmatrix} 2 & 1 \\ 3 & 4 \end{vmatrix} = 8 − 3 = 5,  \begin{vmatrix} 1 & 2 \\ 2 & 4 \end{vmatrix} = 4 − 4 = 0.
The following example shows how to compute the determinant of a 3 × 3 matrix,
\begin{vmatrix} 1 & 3 & -1 \\ 2 & 1 & 1 \\ 3 & 2 & 1 \end{vmatrix} = (1) \begin{vmatrix} 1 & 1 \\ 2 & 1 \end{vmatrix} − 3 \begin{vmatrix} 2 & 1 \\ 3 & 1 \end{vmatrix} + (−1) \begin{vmatrix} 2 & 1 \\ 3 & 2 \end{vmatrix}
= (1 − 2) − 3 (2 − 3) − (4 − 3)
= −1 + 3 − 1
= 1. C
Remark: The determinant of upper or lower triangular matrices is the product of the
diagonal coefficients.
[Figure: the vectors a_1, a_2 in R^2 (left) and a_1, a_2, a_3 in R^3 (right).]
Example 8.2.25: Show whether the set of vectors below is linearly independent,
\Big\{ \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}, \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix}, \begin{pmatrix} -3 \\ 2 \\ 7 \end{pmatrix} \Big\}.
Solution: The determinant of the matrix whose column vectors are the vectors above is
given by
\begin{vmatrix} 1 & 3 & -3 \\ 2 & 2 & 2 \\ 3 & 1 & 7 \end{vmatrix} = (1)(14 − 2) − 3 (14 − 6) + (−3)(2 − 6) = 12 − 24 + 12 = 0.
Therefore, the set of vectors above is linearly dependent. C
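This determinant test is a one-liner numerically; a minimal sketch:

    import numpy as np

    # Columns are the three vectors of Example 8.2.25.
    V = np.array([[1.0, 3.0, -3.0],
                  [2.0, 2.0,  2.0],
                  [3.0, 1.0,  7.0]])
    print(np.linalg.det(V))   # 0 (up to rounding): the columns are linearly dependent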
The determinant of a square matrix also determines whether the matrix is invertible or not.
8.2.6. Exercises.
8.2.1.- . 8.2.2.- .
Example 8.3.1: Verify that the pair λ_1, v_1 and the pair λ_2, v_2 are eigenvalue and eigenvector
pairs of the matrix A given below,
A = \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix},  λ_1 = 4, v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix},  λ_2 = −2, v_2 = \begin{pmatrix} -1 \\ 1 \end{pmatrix}.
Solution: We just have to verify the definition of eigenvalue and eigenvector given above.
We start with the first pair,
A v_1 = \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 4 \\ 4 \end{pmatrix} = 4 \begin{pmatrix} 1 \\ 1 \end{pmatrix} = λ_1 v_1  ⟹  A v_1 = λ_1 v_1.
A similar calculation for the second pair implies,
A v_2 = \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix} \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ -2 \end{pmatrix} = −2 \begin{pmatrix} -1 \\ 1 \end{pmatrix} = λ_2 v_2  ⟹  A v_2 = λ_2 v_2.
C
Example 8.3.2: Find the eigenvalues and eigenvectors of the matrix A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
Solution: This is the matrix given in Example 8.2.1. The action of this matrix on the
plane is a reflection along the line x_1 = x_2, as was shown in Fig. 66. Therefore, this line
x_1 = x_2 is left invariant under the action of this matrix. This property suggests that an
eigenvector is any vector on that line, for example
v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix},  \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}  ⟹  λ_1 = 1.
So, we have found one eigenvalue-eigenvector pair: λ_1 = 1, with v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. We remark that
any nonzero vector proportional to v_1 is also an eigenvector. Another choice of eigenvalue-
eigenvector pair is λ_1 = 1, with v_1 = \begin{pmatrix} 3 \\ 3 \end{pmatrix}. It is not so easy to find a second eigenvector
which does not belong to the line determined by v_1. One way to find such an eigenvector is
to notice that the line perpendicular to the line x_1 = x_2 is also left invariant by matrix A.
Therefore, any nonzero vector on that line must be an eigenvector. For example the vector
v_2 below, since
v_2 = \begin{pmatrix} -1 \\ 1 \end{pmatrix},  \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \end{pmatrix} = (−1) \begin{pmatrix} -1 \\ 1 \end{pmatrix}  ⟹  λ_2 = −1.
So, we have found a second eigenvalue-eigenvector pair: λ_2 = −1, with v_2 = \begin{pmatrix} -1 \\ 1 \end{pmatrix}. These
two eigenvectors are displayed in Fig. 68. C
[Figure: on the left, the eigenvectors of Example 8.3.2, with A v_1 = v_1 on the line x_2 = x_1 and A v_2 = −v_2 on the perpendicular line; on the right, a vector x and its rotated image A x.]
Figure 68. The first picture shows the eigenvalues and eigenvectors of
the matrix in Example 8.3.2. The second picture shows that the matrix
in Example 8.3.3 makes a counterclockwise rotation by an angle θ, which
shows that this matrix does not have real eigenvalues or eigenvectors.
There exist matrices that do not have real eigenvalues and eigenvectors, as is shown in the
example below.
Example 8.3.3: Fix any number θ ∈ (0, 2π), with θ ≠ π, and define the matrix
A = \begin{pmatrix} \cos(θ) & -\sin(θ) \\ \sin(θ) & \cos(θ) \end{pmatrix}.
Show that A has no real eigenvalues.
Solution: One can compute the action of matrix A on several vectors and verify that the
action of this matrix on the plane is a counterclockwise rotation by an angle θ, as shown
in Fig. 68. A particular case of this matrix was shown in Example 8.2.2, where θ = π/2.
Since eigenvectors of a matrix determine directions which are left invariant by the action of
the matrix, and such a rotation does not leave any direction invariant, we conclude that the
matrix A above does not have real eigenvectors, and so it does not have real eigenvalues
either. C
Remark: We will show that matrix A in Example 8.3.3 has complex-valued eigenvalues.
We now describe a method to find eigenvalue-eigenvector pairs of a matrix, if they exist.
In other words, we are going to solve the eigenvalue-eigenvector problem: Given an n × n
matrix A, find, if possible, all its eigenvalues and eigenvectors, that is, all pairs λ and v ≠ 0
solutions of the equation
A v = λ v.
This problem is more complicated than finding the solution x to a linear system A x = b,
where A and b are known. In the eigenvalue-eigenvector problem above neither λ nor v is
known. To solve the eigenvalue-eigenvector problem for a matrix A we proceed as follows:
(a) First, find the eigenvalues λ;
(b) Second, for each eigenvalue λ, find the corresponding eigenvectors v.
The following result summarizes a way to carry out the steps above.
Theorem 8.3.2 (Eigenvalues-Eigenvectors).
(a) All the eigenvalues λ of an n × n matrix A are the solutions of
det(A − λI) = 0.    (8.3.1)
(b) Given an eigenvalue λ of an n × n matrix A, the corresponding eigenvectors v are the
nonzero solutions to the homogeneous linear system
(A − λI) v = 0.    (8.3.2)
Proof of Theorem 8.3.2: The number λ and the nonzero vector v are an eigenvalue-
eigenvector pair of matrix A iff the following holds,
A v = λ v  ⟺  (A − λI) v = 0,
where I is the n × n identity matrix. Since v ≠ 0, the last equation above says that the
columns of the matrix (A − λI) are linearly dependent. This last property is equivalent, by
Theorem 8.2.12, to the equation
det(A − λI) = 0,
which is the equation that determines the eigenvalues λ. Once this equation is solved,
substitute each solution λ back into the original eigenvalue-eigenvector equation
(A − λI) v = 0.
Since λ is known, this is a linear homogeneous system for the eigenvector components. It
always has nonzero solutions, since λ is precisely the number that makes the coefficient
matrix (A − λI) not invertible. This establishes the Theorem.
Example 8.3.4: Find the eigenvalues λ and eigenvectors v of the matrix A = \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix}.
Solution: We first find the eigenvalues as the solutions of Eq. (8.3.1). Compute
A − λI = \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix} − λ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1-λ & 3 \\ 3 & 1-λ \end{pmatrix}.
Then we compute its determinant,
0 = det(A − λI) = \begin{vmatrix} 1-λ & 3 \\ 3 & 1-λ \end{vmatrix} = (λ − 1)² − 9  ⟹  λ_+ = 4, λ_- = −2.
We have obtained two eigenvalues, so now we introduce λ_+ = 4 into Eq. (8.3.2), that is,
A − 4I = \begin{pmatrix} 1-4 & 3 \\ 3 & 1-4 \end{pmatrix} = \begin{pmatrix} -3 & 3 \\ 3 & -3 \end{pmatrix}.
The equation (A − 4I) v = 0 then says −3 v_1 + 3 v_2 = 0, that is, v_1 = v_2, so we choose
v_+ = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. A similar computation with λ_- = −2 gives v_1 = −v_2, so we choose
v_- = \begin{pmatrix} -1 \\ 1 \end{pmatrix}. C
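The hand computation can be confirmed numerically; a minimal sketch using NumPy's eigensolver, which returns unit-length eigenvectors, so only the directions should be compared:

    import numpy as np

    A = np.array([[1.0, 3.0], [3.0, 1.0]])
    evals, evecs = np.linalg.eig(A)
    print(evals)    # eigenvalues 4 and -2 (order may vary)
    print(evecs)    # columns proportional to [1, 1] and [-1, 1]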
It is useful to introduce a few more concepts that are common in the literature.
Definition 8.3.3. The characteristic polynomial of an n × n matrix A is the function
p(λ) = det(A − λI).
Example 8.3.5: Find the characteristic polynomial of the matrix A = \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix}.
Solution: We need to compute the determinant
p(λ) = det(A − λI) = \begin{vmatrix} 1-λ & 3 \\ 3 & 1-λ \end{vmatrix} = (1 − λ)² − 9 = λ² − 2λ + 1 − 9.
We conclude that the characteristic polynomial is p(λ) = λ² − 2λ − 8. C
Since the matrix A in this example is 2 × 2, its characteristic polynomial has degree two.
One can show that the characteristic polynomial of an n × n matrix has degree n. The
eigenvalues of the matrix are the roots of the characteristic polynomial. Different matrices
may have different types of roots, so we try to classify these roots in the following definition.
Example 8.3.6: Find the algebraic and geometric multiplicities of the eigenvalues of the
matrix A = \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix}.
Solution: In order to find the algebraic multiplicity of the eigenvalues we need first to find
the eigenvalues. We know that the characteristic polynomial of this matrix is given by
p(λ) = \begin{vmatrix} 1-λ & 3 \\ 3 & 1-λ \end{vmatrix} = (λ − 1)² − 9.
The roots of this polynomial are λ_1 = 4 and λ_2 = −2, so we know that p(λ) can be rewritten
in the following way,
p(λ) = (λ − 4)(λ + 2).
We conclude that the algebraic multiplicity of each eigenvalue is one, that is,
λ_1 = 4, r_1 = 1, and λ_2 = −2, r_2 = 1.
In order to find the geometric multiplicities of the matrix eigenvalues we need first to find
the matrix eigenvectors. This part of the work was already done in Example 8.3.4 above
and the result is
λ_1 = 4, v^{(1)} = \begin{pmatrix} 1 \\ 1 \end{pmatrix},  λ_2 = −2, v^{(2)} = \begin{pmatrix} -1 \\ 1 \end{pmatrix}.
From this expression we conclude that the geometric multiplicities for each eigenvalue are
just one, that is,
λ_1 = 4, s_1 = 1, and λ_2 = −2, s_2 = 1.
C
The following example shows that two matrices can have the same eigenvalues, and so the
same algebraic multiplicities, but different eigenvectors with different geometric multiplici-
ties.
Example 8.3.7: Find the eigenvalues and eigenvectors of the matrix A = \begin{pmatrix} 3 & 0 & 1 \\ 0 & 3 & 2 \\ 0 & 0 & 1 \end{pmatrix}.
Solution: We start finding the eigenvalues, the roots of the characteristic polynomial
p(λ) = \begin{vmatrix} 3-λ & 0 & 1 \\ 0 & 3-λ & 2 \\ 0 & 0 & 1-λ \end{vmatrix} = −(λ − 1)(λ − 3)²  ⟹  λ_1 = 1, r_1 = 1;  λ_2 = 3, r_2 = 2.
We now compute the eigenvector associated with the eigenvalue λ_1 = 1, which is the solution
of the linear system
(A − I) v^{(1)} = 0  ⟺  \begin{pmatrix} 2 & 0 & 1 \\ 0 & 2 & 2 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} v_1^{(1)} \\ v_2^{(1)} \\ v_3^{(1)} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
The solution is v_1^{(1)} = −v_3^{(1)}/2 and v_2^{(1)} = −v_3^{(1)}; choosing v_3^{(1)} = −2 we get
v^{(1)} = \begin{pmatrix} 1 \\ 2 \\ -2 \end{pmatrix},  λ_1 = 1, r_1 = 1, s_1 = 1.
We now compute the eigenvectors associated with λ_2 = 3, the nonzero solutions of the
system (A − 3I) v^{(2)} = 0. This system implies v_3^{(2)} = 0, while v_1^{(2)} and v_2^{(2)} remain free.
Therefore, we obtain two linearly independent solutions, the first one v^{(2)} with the choice
v_1^{(2)} = 1, v_2^{(2)} = 0, and the second one w^{(2)} with the choice w_1^{(2)} = 0, w_2^{(2)} = 1, that is,
v^{(2)} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix},  w^{(2)} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix},  λ_2 = 3, r_2 = 2, s_2 = 2.
Summarizing, the matrix in this example has three linearly independent eigenvectors. C
Example 8.3.8: Find the eigenvalues and eigenvectors of the matrix A = \begin{pmatrix} 3 & 1 & 1 \\ 0 & 3 & 2 \\ 0 & 0 & 1 \end{pmatrix}.
Solution: Notice that this matrix has only the coefficient a_{12} different from the previous
example. Again, we start finding the eigenvalues, which are the roots of the characteristic
polynomial
p(λ) = \begin{vmatrix} 3-λ & 1 & 1 \\ 0 & 3-λ & 2 \\ 0 & 0 & 1-λ \end{vmatrix} = −(λ − 1)(λ − 3)²  ⟹  λ_1 = 1, r_1 = 1;  λ_2 = 3, r_2 = 2.
So this matrix has the same eigenvalues and algebraic multiplicities as the matrix in the
previous example. We now compute the eigenvector associated with the eigenvalue λ_1 = 1,
which is the solution of the linear system
(A − I) v^{(1)} = 0  ⟺  \begin{pmatrix} 2 & 1 & 1 \\ 0 & 2 & 2 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} v_1^{(1)} \\ v_2^{(1)} \\ v_3^{(1)} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
The solution is v_2^{(1)} = −v_3^{(1)} and v_1^{(1)} = 0; choosing v_3^{(1)} = 1 we get v^{(1)} = \begin{pmatrix} 0 \\ -1 \\ 1 \end{pmatrix},
so s_1 = 1. For λ_2 = 3 the system (A − 3I) v^{(2)} = 0 implies v_2^{(2)} = v_3^{(2)} = 0, so all the
eigenvectors associated with λ_2 are proportional to v^{(2)} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, and s_2 = 1 < r_2 = 2.
This matrix has only two linearly independent eigenvectors. C
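Geometric multiplicities are dimensions of null spaces, so the difference between the last two examples can be checked numerically as well. A sketch using SciPy's null_space, our tool choice:

    import numpy as np
    from scipy.linalg import null_space

    # Examples 8.3.7 and 8.3.8 share the eigenvalue 3 with r2 = 2.
    A7 = np.array([[3.0, 0.0, 1.0], [0.0, 3.0, 2.0], [0.0, 0.0, 1.0]])
    A8 = np.array([[3.0, 1.0, 1.0], [0.0, 3.0, 2.0], [0.0, 0.0, 1.0]])
    I = np.eye(3)

    print(null_space(A7 - 3 * I).shape[1])   # 2: geometric multiplicity s2 = 2
    print(null_space(A8 - 3 * I).shape[1])   # 1: geometric multiplicity s2 = 1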
That is, a matrix is diagonal iff every nondiagonal coefficient vanishes. From now on we use
the following notation for a diagonal matrix A:
A = diag(a_{11}, ⋯, a_{nn}) = \begin{pmatrix} a_{11} & & 0 \\ & \ddots & \\ 0 & & a_{nn} \end{pmatrix}.
This notation says that the matrix is diagonal and shows only the diagonal coefficients,
since any other coefficient vanishes. The next result says that the eigenvalues of a diagonal
matrix are the matrix diagonal elements, and it gives the corresponding eigenvectors.
Many properties of diagonal matrices are shared by diagonalizable matrices. These are
matrices that can be transformed into a diagonal matrix by a simple transformation.
Definition 8.3.7. An n × n matrix A is called diagonalizable iff there exists an invertible
matrix P and a diagonal matrix D such that
A = P D P^{-1}.
Remarks:
(a) Systems of linear differential equations are simple to solve in the case that the coefficient
matrix is diagonalizable. One decouples the differential equations, solves the decoupled
equations, and transforms the solutions back to the original unknowns.
(b) Not every square matrix is diagonalizable. For example, matrix A below is diagonaliz-
able while B is not,
A = \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix},  B = \frac{1}{2} \begin{pmatrix} 3 & 1 \\ -1 & 5 \end{pmatrix}.
Example 8.3.10: Show that the matrix A = \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix} is diagonalizable, where
P = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}  and  D = \begin{pmatrix} 4 & 0 \\ 0 & -2 \end{pmatrix}.
Solution: With standard linear algebra methods one can find out that the inverse matrix
is P^{-1} = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}. Now we only need to verify that P D P^{-1} is indeed A. A
straightforward calculation shows
P D P^{-1} = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 4 & 0 \\ 0 & -2 \end{pmatrix} \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}
= \begin{pmatrix} 4 & 2 \\ 4 & -2 \end{pmatrix} \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}
= \frac{1}{2} \begin{pmatrix} 2 & 6 \\ 6 & 2 \end{pmatrix}
= \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix}  ⟹  P D P^{-1} = A.
C
There is a deep relation between the eigenpairs of a matrix and whether that matrix is
diagonalizable.
Theorem 8.3.8 (Diagonalizable Matrix). An n × n matrix A is diagonalizable iff A has
a linearly independent set of n eigenvectors. Furthermore, if λ_i, v_i, for i = 1, ⋯, n, are
eigenpairs of A, then
A = P D P^{-1},  P = [v_1, ⋯, v_n],  D = diag(λ_1, ⋯, λ_n).
Proof of Theorem 8.3.8: (⇒) Suppose that A is diagonalizable, that is, A = P D P^{-1}
with D = diag(d_{11}, ⋯, d_{nn}). Multiplying by P on the right gives A P = P D. Since the
standard basis vectors e^{(i)} satisfy D e^{(i)} = d_{ii} e^{(i)}, we obtain
A (P e^{(i)}) = P D e^{(i)} = d_{ii} (P e^{(i)}),
where the last equation comes from multiplying the former equation by P on the left. This
last equation says that the vectors v^{(i)} = P e^{(i)} are eigenvectors of A with eigenvalue d_{ii}.
By definition, v^{(i)} is the i-th column of matrix P, that is,
P = [v^{(1)}, ⋯, v^{(n)}].
Since matrix P is invertible, the eigenvector set {v^{(1)}, ⋯, v^{(n)}} is linearly independent.
This establishes this part of the Theorem.
(⇐) Let λ_i, v^{(i)} be eigenvalue-eigenvector pairs of matrix A, for i = 1, ⋯, n. Now use the
eigenvectors to construct the matrix P = [v^{(1)}, ⋯, v^{(n)}]. This matrix is invertible, since
the eigenvector set {v^{(1)}, ⋯, v^{(n)}} is linearly independent. We now show that the matrix
P^{-1} A P is diagonal. We start computing the product
A P = A [v^{(1)}, ⋯, v^{(n)}] = [A v^{(1)}, ⋯, A v^{(n)}] = [λ_1 v^{(1)}, ⋯, λ_n v^{(n)}],
that is,
P^{-1} A P = P^{-1} [λ_1 v^{(1)}, ⋯, λ_n v^{(n)}] = [λ_1 P^{-1} v^{(1)}, ⋯, λ_n P^{-1} v^{(n)}].
Since P e^{(i)} = v^{(i)}, we have e^{(i)} = P^{-1} v^{(i)}, for i = 1, ⋯, n. Using these equations in the
equation for P^{-1} A P,
P^{-1} A P = [λ_1 e^{(1)}, ⋯, λ_n e^{(n)}] = diag(λ_1, ⋯, λ_n).
We conclude that
A = P D P^{-1},  P = [v^{(1)}, ⋯, v^{(n)}],  D = diag(λ_1, ⋯, λ_n).
This means that A is diagonalizable. This establishes the Theorem.
Example 8.3.11: Show that the matrix A = \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix} is diagonalizable.
Solution: We know that the eigenvalue-eigenvector pairs are
λ_1 = 4, v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}  and  λ_2 = −2, v_2 = \begin{pmatrix} -1 \\ 1 \end{pmatrix}.
Introduce P and D as follows,
P = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}  ⟹  P^{-1} = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix},  D = \begin{pmatrix} 4 & 0 \\ 0 & -2 \end{pmatrix}.
We must show that A = P D P^{-1}. This is indeed the case, since
P D P^{-1} = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 4 & 0 \\ 0 & -2 \end{pmatrix} \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} 4 & 2 \\ 4 & -2 \end{pmatrix} \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 2 & 6 \\ 6 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix}.
We conclude that P D P^{-1} = A, that is, A is diagonalizable. C
Theorem 8.3.8 shows the importance of knowing whether an n × n matrix has a linearly
independent set of n eigenvectors. However, more often than not, there is no simple way to
check this property other than to compute all the matrix eigenvectors. But there is a simpler
particular case: when an n × n matrix has n different eigenvalues, we do not need to
compute the eigenvectors. The following result says that such a matrix always has a linearly
independent set of n eigenvectors, hence, by Theorem 8.3.8, it is diagonalizable.
Theorem 8.3.9 (Different Eigenvalues). If an n × n matrix has n different eigenvalues,
then this matrix has a linearly independent set of n eigenvectors.
Proof of Theorem 8.3.9: Let λ_1, ⋯, λ_n be the eigenvalues of an n × n matrix A,
all different from each other. Let v^{(1)}, ⋯, v^{(n)} be the corresponding eigenvectors, that is,
A v^{(i)} = λ_i v^{(i)}, with i = 1, ⋯, n. We have to show that the set {v^{(1)}, ⋯, v^{(n)}} is linearly
independent. We assume that the opposite is true and we obtain a contradiction. Let us
assume that the set above is linearly dependent, that is, there are constants c_1, ⋯, c_n, not
all zero, such that
c_1 v^{(1)} + ⋯ + c_n v^{(n)} = 0.    (8.3.4)
Let us relabel the eigenvalues and eigenvectors such that c_1 ≠ 0. Now, multiply the equation
above by the matrix A; the result is
c_1 λ_1 v^{(1)} + ⋯ + c_n λ_n v^{(n)} = 0.
Multiply Eq. (8.3.4) by the eigenvalue λ_n; the result is
c_1 λ_n v^{(1)} + ⋯ + c_n λ_n v^{(n)} = 0.
Subtract the second from the first of the equations above; then the last terms on the left-
hand sides cancel out, and we obtain
c_1 (λ_1 − λ_n) v^{(1)} + ⋯ + c_{n-1} (λ_{n-1} − λ_n) v^{(n-1)} = 0.    (8.3.5)
Repeat the whole procedure starting with Eq. (8.3.5), that is, multiply this last equation
by matrix A and also by λ_{n-1}, then subtract the second from the first; the result is
c_1 (λ_1 − λ_n)(λ_1 − λ_{n-1}) v^{(1)} + ⋯ + c_{n-2} (λ_{n-2} − λ_n)(λ_{n-2} − λ_{n-1}) v^{(n-2)} = 0.
Repeat the whole procedure a total of n − 1 times; in the last step we obtain the equation
c_1 (λ_1 − λ_n)(λ_1 − λ_{n-1}) ⋯ (λ_1 − λ_3)(λ_1 − λ_2) v^{(1)} = 0.
Since all the eigenvalues are different, we conclude that c_1 = 0; however, this contradicts
our assumption that c_1 ≠ 0. Therefore, the set of n eigenvectors must be linearly
independent. This establishes the Theorem.
Example 8.3.13: Is the matrix A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} diagonalizable?
Solution: We compute the matrix eigenvalues, starting with the characteristic polynomial,
p(λ) = \begin{vmatrix} 1-λ & 1 \\ 1 & 1-λ \end{vmatrix} = (1 − λ)² − 1 = λ² − 2λ  ⟹  p(λ) = λ(λ − 2).
The roots of the characteristic polynomial are the matrix eigenvalues,
λ_1 = 0,  λ_2 = 2.
The eigenvalues are different, so by Theorem 8.3.9, matrix A is diagonalizable. C
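Theorem 8.3.9 thus gives a cheap numerical test for diagonalizability: compute the eigenvalues and check whether they are all different. A sketch for the matrix of Example 8.3.13:

    import numpy as np

    A = np.array([[1.0, 1.0], [1.0, 1.0]])
    evals = np.linalg.eig(A)[0]
    print(np.sort(evals))   # [0. 2.]: two different eigenvalues,
                            # so Theorem 8.3.9 guarantees A is diagonalizable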
8.3.3. Exercises.
8.3.1.- . 8.3.2.- .
8.4.1. The Exponential Function. The exponential function defined on real numbers,
f(x) = e^{ax}, where a is a constant and x ∈ R, satisfies f′(x) = a f(x). We want to find a
function of a square matrix with a similar property. Since the exponential on real numbers
can be defined in several equivalent ways, we start with a short review of three ways to
define the exponential e^x.
(a) The exponential function can be defined as a generalization of the power function from
the positive integers to the real numbers. One starts with positive integers n, defining
e^n = e ⋯ e, n times.
Then one defines e^0 = 1, and for negative integers −n,
e^{-n} = \frac{1}{e^n}.
The next step is to define the exponential for rational numbers m/n, with m, n integers,
e^{m/n} = \sqrt[n]{e^m}.
The difficult part in this definition of the exponential is the generalization to irrational
numbers x, which is done by a limit,
e^x = \lim_{m/n → x} e^{m/n}.
It is nontrivial to define that limit precisely, which is why many calculus textbooks do
not show it. In any case, it is not clear how to generalize this definition from real numbers
x to square matrices X.
(b) The exponential function can be defined as the inverse of the natural logarithm function
g(x) = ln(x), which in turn is defined as the area under the graph of the function
h(x) = \frac{1}{x} from 1 to x, that is,
ln(x) = \int_1^x \frac{1}{y} \, dy,  x ∈ (0, ∞).
Again, it is not clear how to extend this definition of the exponential function on real
numbers to matrices.
(c) The exponential function can also be defined by its Taylor series expansion,
e^x = \sum_{k=0}^{∞} \frac{x^k}{k!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + ⋯.
Most calculus textbooks show this series expansion, a Taylor expansion, as a result from
the exponential definition, not as a definition itself. But one can define the exponential
using this series and prove that the function so defined satisfies the properties in (a)
and (b). It turns out, this series expression can be generalized to square matrices.
We now use the idea in (c) to define the exponential function on square matrices. We
start with the power function of a square matrix, f(X) = X^n = X ⋯ X, n times, for X a
square matrix and n a positive integer. Then we define a polynomial of a square matrix,
p(X) = a_n X^n + a_{n-1} X^{n-1} + ⋯ + a_0 I.
Now we are ready to define the exponential of a square matrix.
Definition 8.4.1. The exponential of a square matrix A is the infinite sum
e^A = \sum_{n=0}^{∞} \frac{A^n}{n!}.    (8.4.1)
This definition makes sense, because the infinite sum in Eq. (8.4.1) converges.
Theorem 8.4.2. The infinite sum in Eq. (8.4.1) converges for all n × n matrices.
Proof: See Sections 2.1.2 and 4.5 in Hassani [6] for a proof using the Spectral Theorem.
The infinite sum in the definition of the exponential of a matrix is in general difficult to
compute. However, when the matrix is diagonal, the exponential is remarkably simple.
Theorem 8.4.3 (Exponential of Diagonal Matrices). If D = diag(d_1, ⋯, d_n), then
e^D = diag(e^{d_1}, ⋯, e^{d_n}).
Proof of Theorem 8.4.3: We start with the definition of the exponential,
e^D = \sum_{k=0}^{∞} \frac{D^k}{k!} = \sum_{k=0}^{∞} \frac{1}{k!} diag\big((d_1)^k, ⋯, (d_n)^k\big),
where in the second equality we used that the matrix D is diagonal. Then,
e^D = \sum_{k=0}^{∞} diag\Big(\frac{(d_1)^k}{k!}, ⋯, \frac{(d_n)^k}{k!}\Big) = diag\Big(\sum_{k=0}^{∞} \frac{(d_1)^k}{k!}, ⋯, \sum_{k=0}^{∞} \frac{(d_n)^k}{k!}\Big).
Each sum in the diagonal of the matrix above satisfies \sum_{k=0}^{∞} \frac{(d_i)^k}{k!} = e^{d_i}. Therefore,
we arrive at the equation e^D = diag(e^{d_1}, ⋯, e^{d_n}). This establishes the Theorem.
Example 8.4.1: Compute e^A, where A = \begin{pmatrix} 2 & 0 \\ 0 & 7 \end{pmatrix}.
Solution: We follow the proof of Theorem 8.4.3 to get this result. We start with the
definition of the exponential,
e^A = \sum_{n=0}^{∞} \frac{A^n}{n!} = \sum_{n=0}^{∞} \frac{1}{n!} \begin{pmatrix} 2 & 0 \\ 0 & 7 \end{pmatrix}^n.
Since the matrix A is diagonal, we have that
\begin{pmatrix} 2 & 0 \\ 0 & 7 \end{pmatrix}^n = \begin{pmatrix} 2^n & 0 \\ 0 & 7^n \end{pmatrix}.
Therefore,
e^A = \sum_{n=0}^{∞} \frac{1}{n!} \begin{pmatrix} 2^n & 0 \\ 0 & 7^n \end{pmatrix} = \begin{pmatrix} \sum_{n=0}^{∞} 2^n/n! & 0 \\ 0 & \sum_{n=0}^{∞} 7^n/n! \end{pmatrix}.
Since \sum_{n=0}^{∞} a^n/n! = e^a, for a = 2, 7, we obtain that e^A = \begin{pmatrix} e^2 & 0 \\ 0 & e^7 \end{pmatrix}. C
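The defining series (8.4.1) can also be summed numerically and compared with a library implementation. A sketch, where SciPy's expm is our choice for the reference value and 30 terms is an arbitrary truncation that is ample for this matrix:

    import numpy as np
    from scipy.linalg import expm

    def expm_series(A, terms=30):
        """Partial sum of the series (8.4.1)."""
        E = np.eye(A.shape[0])
        term = np.eye(A.shape[0])
        for n in range(1, terms):
            term = term @ A / n    # builds A^n / n! incrementally
            E = E + term
        return E

    A = np.array([[2.0, 0.0], [0.0, 7.0]])
    print(expm_series(A))   # approximately diag(e^2, e^7), as in Theorem 8.4.3
    print(expm(A))          # reference value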
Recall that Theorem 8.4.4 states that the powers of a diagonalizable matrix A = P D P^{-1}
satisfy A^n = P D^n P^{-1}, which is Eq. (8.4.2). We use this result and induction in n to
prove Eq. (8.4.2). Since the case n = 1 is trivially true, we start computing the case n = 2.
We get
A² = (P D P^{-1})² = (P D P^{-1})(P D P^{-1}) = P D D P^{-1}  ⟹  A² = P D² P^{-1},
that is, Eq. (8.4.2) holds for n = 2. Now assume that Eq. (8.4.2) is true for n = k. This
equation also holds for k + 1, since
A^{k+1} = A^k A = (P D^k P^{-1})(P D P^{-1}) = P D^k (P^{-1} P) D P^{-1} = P D^{k+1} P^{-1}.
This establishes the result.
Remark: Theorem 8.4.5 says that the infinite sum in the definition of e^A reduces to a
product of three matrices when the matrix A is diagonalizable. This Theorem also says that
to compute the exponential of a diagonalizable matrix we need to compute the eigenvalues
and eigenvectors of that matrix.
Proof of Theorem 8.4.5: We start with the definition of the exponential,
e^A = \sum_{k=0}^{∞} \frac{1}{k!} A^k = \sum_{k=0}^{∞} \frac{1}{k!} (P D P^{-1})^k = \sum_{k=0}^{∞} \frac{1}{k!} (P D^k P^{-1}),
where the last step comes from Theorem 8.4.4. Now, in the expression on the far right we
can take common factor P on the left and P^{-1} on the right, that is,
e^A = P \Big( \sum_{k=0}^{∞} \frac{D^k}{k!} \Big) P^{-1}.
The sum between parentheses is the exponential of the diagonal matrix D, which we
computed in Theorem 8.4.3,
e^A = P e^D P^{-1}.
This establishes the Theorem.
We have defined the exponential function F(A) = e^A : R^{n,n} → R^{n,n}, which is a function
from the space of square matrices into the space of square matrices. However, when one
studies solutions to linear systems of differential equations, one needs a slightly different
type of function. One needs functions of the form F(t) = e^{At} : R → R^{n,n}, where A is a
constant square matrix and the independent variable is t ∈ R. That is, one needs to
generalize the real constant a in the function f(t) = e^{at} to an n × n matrix A. In the case
that the matrix A is diagonalizable, with A = P D P^{-1}, so is the matrix At, and
At = P (Dt) P^{-1}. Therefore, the formula for the exponential of At is simply
e^{At} = P e^{Dt} P^{-1}.
We use this formula in the following example.
Example 8.4.2: Compute e^{At}, where A = \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix} and t ∈ R.
Solution: To compute e^{At} we need the decomposition A = P D P^{-1}, which in turn
implies that At = P (Dt) P^{-1}. Matrices P and D are constructed with the eigenvectors
and eigenvalues of matrix A. We computed them in Example 8.3.4,
λ_1 = 4, v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}  and  λ_2 = −2, v_2 = \begin{pmatrix} -1 \\ 1 \end{pmatrix}.
Introduce P and D as follows,
P = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}  ⟹  P^{-1} = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix},  D = \begin{pmatrix} 4 & 0 \\ 0 & -2 \end{pmatrix}.
Then, the exponential function is given by
e^{At} = P e^{Dt} P^{-1} = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} e^{4t} & 0 \\ 0 & e^{-2t} \end{pmatrix} \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}.
Usually one leaves the function in this form. If we multiply the three matrices out we get
e^{At} = \frac{1}{2} \begin{pmatrix} (e^{4t} + e^{-2t}) & (e^{4t} - e^{-2t}) \\ (e^{4t} - e^{-2t}) & (e^{4t} + e^{-2t}) \end{pmatrix}.
C
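The formula e^{At} = P e^{Dt} P^{-1} is easy to evaluate for any t. A sketch that also checks the closed formula obtained above:

    import numpy as np

    P = np.array([[1.0, -1.0], [1.0, 1.0]])
    Pinv = np.array([[1.0, 1.0], [-1.0, 1.0]]) / 2.0

    def exp_At(t):
        # e^{At} = P e^{Dt} P^{-1}, with D = diag(4, -2).
        return P @ np.diag([np.exp(4 * t), np.exp(-2 * t)]) @ Pinv

    t = 0.3
    print(exp_At(t))
    print(0.5 * (np.exp(4 * t) + np.exp(-2 * t)))   # matches the (1,1) entry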
Proof of Theorem 8.4.7 (which says that e^{As} e^{At} = e^{A(s+t)} for all s, t ∈ R): We start
with the definition of the exponential function,
e^{As} e^{At} = \Big( \sum_{j=0}^{∞} \frac{A^j s^j}{j!} \Big) \Big( \sum_{k=0}^{∞} \frac{A^k t^k}{k!} \Big) = \sum_{j=0}^{∞} \sum_{k=0}^{∞} \frac{A^{j+k} s^j t^k}{j! \, k!}.
We now introduce the new label n = j + k, then j = n − k, and we reorder the terms,
e^{As} e^{At} = \sum_{n=0}^{∞} \sum_{k=0}^{n} \frac{A^n s^{n-k} t^k}{(n-k)! \, k!} = \sum_{n=0}^{∞} \frac{A^n}{n!} \sum_{k=0}^{n} \frac{n!}{(n-k)! \, k!} s^{n-k} t^k.
If we recall the binomial theorem, (s + t)^n = \sum_{k=0}^{n} \frac{n!}{(n-k)! \, k!} s^{n-k} t^k, we get
e^{As} e^{At} = \sum_{n=0}^{∞} \frac{A^n}{n!} (s + t)^n = e^{A(s+t)}.
This establishes the Theorem.
If we set s = 1 and t = −1 in Theorem 8.4.7 we get that
e^A e^{-A} = e^{A(1-1)} = e^0 = I,
so we have a formula for the inverse of the exponential.
Theorem 8.4.8 (Inverse Exponential). If A is an n × n matrix, then
(e^A)^{-1} = e^{-A}.
Example 8.4.3: Verify Theorem 8.4.8 for e^{At}, where A = \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix} and t ∈ R.
Solution: In Example 8.4.2 we found that
e^{At} = \frac{1}{2} \begin{pmatrix} (e^{4t} + e^{-2t}) & (e^{4t} - e^{-2t}) \\ (e^{4t} - e^{-2t}) & (e^{4t} + e^{-2t}) \end{pmatrix}.
A 2 × 2 matrix is invertible iff its determinant is nonzero. In our case,
det(e^{At}) = \frac{1}{4} (e^{4t} + e^{-2t})² − \frac{1}{4} (e^{4t} − e^{-2t})² = e^{2t},
hence e^{At} is invertible. The inverse is
(e^{At})^{-1} = \frac{1}{e^{2t}} \cdot \frac{1}{2} \begin{pmatrix} (e^{4t} + e^{-2t}) & -(e^{4t} - e^{-2t}) \\ -(e^{4t} - e^{-2t}) & (e^{4t} + e^{-2t}) \end{pmatrix},
that is,
(e^{At})^{-1} = \frac{1}{2} \begin{pmatrix} (e^{-4t} + e^{2t}) & (e^{-4t} - e^{2t}) \\ (e^{-4t} - e^{2t}) & (e^{-4t} + e^{2t}) \end{pmatrix} = e^{-At}.
C
We now want to compute the derivative of the function F(t) = e^{At}, where A is a constant
n × n matrix and t ∈ R. It is not difficult to show the following result.
Theorem 8.4.9 (Derivative of the Exponential). If A is an n × n matrix, and t ∈ R, then
\frac{d}{dt} e^{At} = A e^{At}.
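Theorem 8.4.9 can be checked numerically with a centered difference quotient; a minimal sketch:

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[1.0, 3.0], [3.0, 1.0]])
    t, h = 0.5, 1e-6

    # Centered difference approximation of d/dt e^{At} at t.
    dFdt = (expm(A * (t + h)) - expm(A * (t - h))) / (2 * h)
    print(np.max(np.abs(dFdt - A @ expm(A * t))))   # essentially zero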
8.4.4. Exercises.
8.4.1.- Use the definition of the matrix exponential to prove Theorem 8.4.6. Do not use
any other theorems in this Section.
8.4.2.- If A² = A, find a formula for e^A which does not contain an infinite sum.
8.4.3.- Compute e^A for the following matrices:
(a) A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.
(b) A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.
(c) A = \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}.
8.4.4.- Show that, if A is diagonalizable, then
det(e^A) = e^{tr(A)}.
Remark: This result is true for all square matrices, but it is hard to prove for
nondiagonalizable matrices.
8.4.5.- Compute e^A for the following matrices:
(a) A = \begin{pmatrix} a & b \\ 0 & 0 \end{pmatrix}.
(b) A = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}.
8.4.6.- If A² = I, show that
2 e^A = \Big( e + \frac{1}{e} \Big) I + \Big( e − \frac{1}{e} \Big) A.
8.4.7.- If λ and v are an eigenvalue and eigenvector of A, then show that
e^A v = e^λ v.
8.4.8.- By direct computation show that e^{(A+B)} ≠ e^A e^B for
A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},  B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.
Chapter 9. Appendices
(c) A function y having at x_0 both infinitely many continuous derivatives and a convergent
power series is analytic where the series converges. The Taylor expansion centered at
x_0 of such a function is
y(x) = \sum_{n=0}^{∞} \frac{y^{(n)}(x_0)}{n!} (x − x_0)^n,
and this means
y(x) = y(x_0) + y′(x_0) (x − x_0) + \frac{y″(x_0)}{2!} (x − x_0)² + \frac{y‴(x_0)}{3!} (x − x_0)³ + ⋯.
C
The Taylor series can be very useful to find the power series expansions of functions having
infinitely many continuous derivatives.
Example B.2: Find the Taylor series of y(x) = sin(x) centered at x_0 = 0.
Solution: We need to compute the derivatives of the function y and evaluate these
derivatives at the point where we center the expansion, in this case x_0 = 0:
y(x) = sin(x) ⟹ y(0) = 0,  y′(x) = cos(x) ⟹ y′(0) = 1,
y″(x) = −sin(x) ⟹ y″(0) = 0,  y‴(x) = −cos(x) ⟹ y‴(0) = −1.
One more derivative gives y^{(4)}(x) = sin(x), so y^{(4)} = y, and the cycle repeats itself. It
is not difficult to see that Taylor's formula implies
sin(x) = x − \frac{x^3}{3!} + \frac{x^5}{5!} − ⋯  ⟹  sin(x) = \sum_{n=0}^{∞} \frac{(−1)^n}{(2n+1)!} x^{2n+1}.
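The partial sums of this series converge quickly; a minimal sketch comparing a truncated sum with the library sine:

    import math

    def sin_taylor(x, terms=10):
        """Partial sum of sin(x) = sum_n (-1)^n x^(2n+1) / (2n+1)!."""
        return sum((-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
                   for n in range(terms))

    print(sin_taylor(1.0), math.sin(1.0))   # the two values agree to many digits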
Remark: The Taylor series at x_0 = 0 for y(x) = cos(x) is computed in a similar way,
cos(x) = \sum_{n=0}^{∞} \frac{(−1)^n}{(2n)!} x^{2n}.
Example B.3: Find the Taylor series of y(x) = \frac{1}{1 − x} centered at x_0 = 0.
Solution: Notice that this function is well defined for every x ∈ R − {1}. The function
agrees with its Taylor series, y(x) = \sum_{n=0}^{∞} x^n, only on the interval (−1, 1). C
Remark: The power series y(x) = \sum_{n=0}^{∞} x^n does not converge on (−∞, −1] ∪ [1, ∞). But
there are different power series that converge to y(x) = \frac{1}{1 − x} on intervals inside that
domain. For example, the Taylor series about x_0 = 2 converges for |x − 2| < 1, that is,
1 < x < 3:
y^{(n)}(x) = \frac{n!}{(1 − x)^{n+1}}  ⟹  y^{(n)}(2) = \frac{n!}{(−1)^{n+1}}  ⟹  y(x) = \sum_{n=0}^{∞} (−1)^{n+1} (x − 2)^n.
Later on we might need the notion of convergence of an infinite series in absolute value.
Definition B.2. The power series y(x) = \sum_{n=0}^{∞} a_n (x − x_0)^n converges in absolute
value iff the series \sum_{n=0}^{∞} |a_n| \, |x − x_0|^n converges.
Remark: If a series converges in absolute value, it converges. The converse is not true.
Example B.4: One can show that the series s = \sum_{n=1}^{∞} \frac{(−1)^n}{n} converges, but this
series does not converge absolutely, since \sum_{n=1}^{∞} \frac{1}{n} diverges. See [11, 13]. C
Since power series expansions of functions might not converge on the same domain where
the function is defined, it is useful to introduce the region where the power series converges.
Definition B.3. The radius of convergence of a power series y(x) = \sum_{n=0}^{∞} a_n (x − x_0)^n
is the number ρ > 0 satisfying both: the series converges absolutely for |x − x_0| < ρ, and
the series diverges for |x − x_0| > ρ.
Remark: The radius of convergence defines the size of the biggest open interval where the
power series converges. This interval is symmetric around the series center point x_0.
Example B.5: We state the radius of convergence of a few power series. See [11, 13].
(1) The series \frac{1}{1 − x} = \sum_{n=0}^{∞} x^n has radius of convergence ρ = 1.
(2) The series e^x = \sum_{n=0}^{∞} \frac{x^n}{n!} has radius of convergence ρ = ∞.
(3) The series sin(x) = \sum_{n=0}^{∞} \frac{(−1)^n}{(2n+1)!} x^{2n+1} has radius of convergence ρ = ∞.
(4) The series cos(x) = \sum_{n=0}^{∞} \frac{(−1)^n}{(2n)!} x^{2n} has radius of convergence ρ = ∞.
(5) The series sinh(x) = \sum_{n=0}^{∞} \frac{1}{(2n+1)!} x^{2n+1} has radius of convergence ρ = ∞.
(6) The series cosh(x) = \sum_{n=0}^{∞} \frac{1}{(2n)!} x^{2n} has radius of convergence ρ = ∞.
One of the most used tests for the convergence of a power series is the ratio test.
Theorem B.4 (Ratio Test). Given the power series y(x) = \sum_{n=0}^{∞} a_n (x − x_0)^n,
introduce the number L = \lim_{n→∞} \frac{|a_{n+1}|}{|a_n|}. Then, the following statements hold:
(a) The power series converges in the domain |x − x_0| L < 1;
(b) The power series diverges in the domain |x − x_0| L > 1;
(c) The power series may or may not converge for |x − x_0| L = 1.
Therefore, if L ≠ 0, then ρ = 1/L is the radius of convergence of the series; if L = 0, then
the radius of convergence is ρ = ∞.
1.4.1.-
(a) The equation is exact. N = (1 + t²), M = 2t y, so ∂_t N = 2t = ∂_y M.
(b) Since a potential function is given by ψ(t, y) = t² y + y, the solution is
y(t) = \frac{c}{t² + 1},  c ∈ R.
1.4.2.-
(a) The equation is exact. We have N = t cos(y) − 2y, M = t + sin(y), and
∂_t N = cos(y) = ∂_y M.
(b) Since a potential function is given by ψ(t, y) = \frac{t²}{2} + t sin(y) − y², the solution is
\frac{t²}{2} + t sin(y(t)) − y²(t) = c,  for c ∈ R.
1.4.3.-
(a) The equation is exact. We have N = 2y + t e^{ty}, M = 2 + y e^{ty}, and
∂_t N = (1 + t y) e^{ty} = ∂_y M.
1.4.4.-
(a) μ(x) = 1/x.
(b) y³ − 3x y + \frac{18}{5} x⁵ = 1.
1.4.5.-
(a) μ(x) = x².
(b) y² (x⁴ + 1/2) = 2.
(c) y(x) = −\frac{2}{\sqrt{1 + 2x⁴}}. The negative square root is selected because the initial
condition is y(0) < 0.
1.4.6.-
(a) The equation for y is not exact. There is no integrating factor depending only on x.
(b) The equation for x = y^{-1} is not exact. But there is an integrating factor depending
only on y, given by μ(y) = e^y.
(c) An implicit expression for both y(x) and x(y) is given by
3x e^y + sin(5x) e^y = c.
(a) T′ = −k (T − 3).
(b) The integrating factor method implies (T′ + k T) e^{kt} = 3k e^{kt}, so
(T e^{kt})′ − (3 e^{kt})′ = 0.
Integrating we get (T − 3) e^{kt} = c, so the general solution is T = c e^{-kt} + 3. The
initial condition implies 18 = T(0) = c + 3, so c = 15, and the temperature function is
T(t) = 15 e^{-kt} + 3.
(c) To find k we use that T(3) = 13 C. This implies 13 = 15 e^{-3k} + 3, so we arrive at
e^{-3k} = \frac{13 − 3}{15} = \frac{2}{3},
which leads us to −3k = ln(2/3), so we get k = \frac{1}{3} ln(3/2),
and \lim_{t→∞} Q(t) = 300 grams.
1.5.5.- Denoting Δr = r_i − r_o and V(t) = Δr t + V_0, we obtain
Q(t) = Q_0 \Big[ \frac{V_0}{V(t)} \Big]^{r_o/Δr} + q_i \Big( V(t) − V_0 \Big[ \frac{V_0}{V(t)} \Big]^{r_o/Δr} \Big).
A reordering of terms gives
Q(t) = q_i V(t) − (q_i V_0 − Q_0) \Big[ \frac{V_0}{V(t)} \Big]^{r_o/Δr},
and replacing the problem values yields
Q(t) = (t + 200) − 100 \frac{(200)²}{(t + 200)²}.
The concentration q(t) = Q(t)/V(t) is
q(t) = q_i − \Big( q_i − \frac{Q_0}{V_0} \Big) \Big[ \frac{V_0}{V(t)} \Big]^{r_o/Δr + 1}.
The concentration at V(t) = V_m is
q_m = q_i − \Big( q_i − \frac{Q_0}{V_0} \Big) \Big[ \frac{V_0}{V_m} \Big]^{r_o/Δr + 1},
which gives the value
q_m = \frac{121}{125} grams/liter.
In the case of an unlimited capacity, \lim_{t→∞} V(t) = ∞, thus the equation for q(t) above
says
\lim_{t→∞} q(t) = q_i.
1.6.1.-
y_0 = 0,
y_1 = t,
y_2 = t + 3t²,
y_3 = t + 3t² + 6t³.
1.6.2.-
(a)
y_0 = 1,
y_1 = 1 + 8t,
y_2 = 1 + 8t + 12t²,
y_3 = 1 + 8t + 12t² + 12t³.
(b) c_k(t) = \frac{8}{3} 3^k t^k.
(c) y(t) = \frac{8}{3} e^{3t} − \frac{5}{3}.
1.6.3.-
(a) Since y = \sqrt{y_0² − 4t²} and the initial condition is at t = 0, the solution domain is
D = \Big( −\frac{y_0}{2}, \frac{y_0}{2} \Big).
(b) Since y = \frac{y_0}{1 − t² y_0} and the initial condition is at t = 0, the solution domain is
D = \Big( −\frac{1}{\sqrt{y_0}}, \frac{1}{\sqrt{y_0}} \Big).
1.6.4.-
(a) Write the equation as
y′ = \frac{2 ln(t)}{t² − 4} y.
The equation is not defined for t = 0 or t = ±2. This provides the intervals
(−∞, −2), (−2, 2), (2, ∞).
Since the initial condition is at t = 1, the interval where the solution is defined is
D = (0, 2).
(b) The equation is not defined for t = 0 and t = 3. This provides the intervals
(−∞, 0), (0, 3), (3, ∞).
Since the initial condition is at t = −1, the interval where the solution is defined is
D = (−∞, 0).
1.6.5.-
(a) y = \frac{2}{3} t.
(b) Outside the disk t² + y² ≤ 1.
2.1.1.- . 2.1.2.- .
3.1.1.- . 3.1.2.- .
2.4.1.- . 2.4.2.- .
3.2.1.- . 3.2.2.- .
8.1.1.- . 8.1.2.- .
8.2.1.- . 8.2.2.- .
8.3.1.- . 8.3.2.- .
7.1.1.- . 7.1.2.- .
7.2.1.- . 7.2.2.- .
7.3.1.- . 7.3.2.- .
References
[1] T. Apostol. Calculus. John Wiley & Sons, New York, 1967. Volume I, Second edition.
[2] T. Apostol. Calculus. John Wiley & Sons, New York, 1969. Volume II, Second edition.
[3] W. Boyce and R. DiPrima. Elementary differential equations and boundary value problems. Wiley, New
Jersey, 2012. 10th edition.
[4] R. Churchill. Operational Mathematics. McGraw-Hill, New York, 1958. Second edition.
[5] E. Coddington. An Introduction to Ordinary Differential Equations. Prentice Hall, 1961.
[6] S. Hassani. Mathematical physics. Springer, New York, 1999. Corrected second printing, 2000.
[7] E. Hille. Analysis. Vol. II.
[8] J.D. Jackson. Classical Electrodynamics. Wiley, New Jersey, 1999. 3rd edition.
[9] W. Rudin. Principles of Mathematical Analysis. McGraw-Hill, New York, NY, 1953.
[10] G. Simmons. Differential equations with applications and historical notes. McGraw-Hill, New York,
1991. 2nd edition.
[11] J. Stewart. Multivariable Calculus. Cengage Learning. 7th edition.
[12] S. Strogatz. Nonlinear Dynamics and Chaos. Perseus Books Publishing, Cambridge, USA, 1994. Pa-
perback printing, 2000.
[13] G. Thomas, M. Weir, and J. Hass. Thomas' Calculus. Pearson. 12th edition.
[14] G. Watson. A treatise on the theory of Bessel functions. Cambridge University Press, London, 1944.
2nd edition.
[15] E. Zeidler. Nonlinear Functional Analysis and its Applications I, Fixed-Point Theorems. Springer, New
York, 1986.
[16] E. Zeidler. Applied functional analysis: applications to mathematical physics. Springer, New York,
1995.
[17] D. Zill and W. Wright. Differential equations and boundary value problems. Brooks/Cole, Boston, 2013.
8th edition.