Ordinary Differential Equations: Principles and Applications
Many interesting and important real life problems in the fields of mathematics, physics,
chemistry, biology, engineering, economics, sociology and psychology are modelled
using the tools and techniques of ordinary differential equations (ODEs). This book
offers a detailed treatment of fundamental concepts of ordinary differential equations.
Important topics including first and second order linear equations, initial value problems
and qualitative theory are presented in separate chapters. The concepts of physical models
and first order partial differential equations are discussed in detail. The text covers two-
point boundary value problems for second order linear and nonlinear equations. Using
two linearly independent solutions, a Green’s function is also constructed for given
boundary conditions.
The text emphasizes the use of calculus concepts in justification and analysis of
equations to get solutions in explicit form. While discussing first order linear systems,
tools from linear algebra are used and the importance of these tools is clearly explained
in the book. Real life applications are interspersed throughout the book. Methods
and techniques for solving numerous mathematical problems are provided, with
sufficient derivations and explanations.
The first few chapters can be used for an undergraduate course on ODE, and later
chapters can be used at the graduate level. Wherever possible, the authors present the
subject in a way that students at undergraduate level can easily follow advanced topics,
such as qualitative analysis of linear and nonlinear systems.
P. S. Datti superannuated from the Centre for Applicable Mathematics at the Tata
Institute of Fundamental Research, Bangalore after serving for over 35 years. His
research interests include nonlinear hyperbolic equations, hyperbolic conservation laws,
ordinary differential equations, evolution equations and boundary layer phenomenon.
Raju K. George is Senior Professor and Dean (R&D) at the Indian Institute of Space
Science and Technology (IIST), Thiruvananthapuram. His research areas include
functional analysis, mathematical control theory, soft computing, orbital mechanics and
industrial mathematics.
CAMBRIDGE–IISc SERIES
Cambridge–IISc Series aims to publish the best research and scholarly work on
different areas of science and technology with emphasis on cutting-edge research.
The books will be aimed at a wide audience including students, researchers,
academicians and professionals and will be published under three categories:
research monographs, centenary lectures and lecture notes.
The editorial board has been constituted with experts from a range of disciplines
in diverse fields of engineering, science and technology from the Indian Institute
of Science, Bangalore.
IISc Press Editorial Board:
G. K. Ananthasuresh, Professor, Department of Mechanical Engineering
K. Kesava Rao, Professor, Department of Chemical Engineering
Gadadhar Misra, Professor, Department of Mathematics
T. A. Abinandanan, Professor, Department of Materials Engineering
Diptiman Sen, Professor, Centre for High Energy Physics
A. K. Nandakumaran
P. S. Datti
Raju K. George
University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
4843/24, 2nd Floor, Ansari Road, Daryaganj, Delhi - 110002, India
79 Anson Road, #06–04/06, Singapore 079906
Cambridge University Press is part of the University of Cambridge.
It furthers the University’s mission by disseminating knowledge in the pursuit of
education, learning and research at the highest international levels of excellence.
www.cambridge.org
Information on this title: www.cambridge.org/9781108416412
© Cambridge University Press 2017
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2017
Printed in India
A catalogue record for this publication is available from the British Library
Many interesting and important real life problems are modelled using
ordinary differential equations (ODE). These arise in, but are not limited
to, physics, chemistry, biology, engineering, economics, sociology and
psychology. In mathematics, ODE have a deep connection with
geometry, among other branches. In many of these situations, we are
interested in understanding the future, given the present phenomenon. In
other words, we wish to understand the time evolution or the dynamics of
a given phenomenon. The field of ODE has developed, over the
years, to answer such questions adequately. Yet, there are many
important and intriguing situations where complete answers are still awaited.
The present book aims at giving a good foundation for a beginner,
starting at an undergraduate level, without compromising on the rigour.
We have had several occasions to teach students at the
undergraduate and graduate levels in various universities and institutions
across the country, including our own institutions, on many topics
covered in the book. From our experience and the interactions we have had
with students, we felt that many students lack a clear notion of ODE,
beginning with the simplest integral calculus problem. For other students, a
course on ODE meant learning a few tricks to solve equations. In India,
in particular, the books which are generally prescribed consist of a few
tricks to solve problems, making ODE one of the most uninteresting
subjects in the mathematical curriculum. We are of the opinion that many
students at the beginning level do not have clarity about the essence of
ODE, compared to other subjects in mathematics.
While we were still contemplating writing a book on ODE, to address
some of the issues discussed earlier, we got an opportunity to present
a video course on ODE under the auspices of the National Programme
on Technology Enhanced Learning (NPTEL).
First and second order equations are dealt with in Chapter 3. This
chapter also contains the usual methods of solutions, but with sufficient
mathematical explanation, so that students feel that there is indeed
rigorous mathematics behind these methods. The concept behind the
exact differential equation is also explained. Second order linear
equations, with or without constant coefficients, are given a detailed
treatment. This will make a student better equipped to study linear
systems, which are treated in Chapter 5.
Chapter 4 deals with the hard questions of existence, non-existence,
uniqueness, etc., for a single equation and also for a system of first order
equations. We have tried to motivate the reader to wonder why these
questions are important and how to deal with them. We have also
discussed other topics such as continuous dependence on initial data,
continuation of solutions and the maximal interval of existence of a
solution.
Linear systems are studied in great detail in Chapter 5. We have tried to
show the power of linear algebra in obtaining the phase portrait of 2 × 2
and general systems. We have also included a brief discussion on Floquet
theory, which deals with linear systems with periodic coefficients.
In the case of a second order linear equation with variable coefficients,
it is not possible, in general, to obtain a solution in explicit form. This has
been discussed at length in Chapter 3. Chapter 6 deals with a class of
second order linear equations, whose solutions may be written explicitly,
although in the form of an infinite series. This method is attributed to
Frobenius.
Chapter 7 deals with the regular Sturm–Liouville theory. This theory is
concerned with boundary value problems associated with linear second
order equations with smooth coefficients, in a compact interval on the
real line, involving a parameter. We, then, show the existence of a countable
number of values of the parameter and associated non-trivial solutions of
the differential equation satisfying the boundary conditions. There are
many similarities with the existence of eigenvalues and eigenvectors of a
matrix, though we are now in an infinite dimensional situation.
The qualitative theory of nonlinear systems is the subject of Chapter 8.
The contents may be suitable for a senior undergraduate course or a
beginning graduate course. This chapter does demand more
prerequisites, and these are described in Chapter 2. The main topics of the
chapter are equilibrium points or solutions of autonomous systems and
their stability analysis; existence of periodic orbits in a two-dimensional
mathematical analysis. We remark that the first existence theorem for first
order differential equations is due to Cauchy in 1820. A class of
differential equations known as linear differential equations, is much
easier to handle. We will analyse linear equations and linear systems in
more detail and see the extensive use of linear algebra; in particular, we
will see how the nature of eigenvalues of a given matrix influences the
stability of solutions.
After the invention of differential calculus, the question of the
existence of antiderivative led to the following question regarding
differential equation: Given a function f , does there exist a function g
such that ġ(t ) = f (t )? Here, ġ(t ) is the derivative of g with respect to t.
This was the beginning of integral calculus and we refer to this problem
as an integral calculus problem. In fact, Newton’s second law of motion
describing the motion of a particle having mass m states that the rate of
change of momentum equals the applied force. Mathematically, this is
written as (d/dt)(mv) = −F, where v is the velocity of the particle. If
x = x(t ) is the position of the particle at time t, then v(t ) = ẋ(t ). In
general, the applied force F is a function of t, x and v. If we assume F is
a function of t, x, we have a second order equation for x given by
mẍ = −F (t, x). If F is a function of x alone, we obtain a conservative
equation which we study in Chapter 8. If on the other hand, F is a
function of t alone, then the second law leads to two integral calculus
problems: namely, first solve for the momentum p = mv by ṗ = −F (t )
and then solve for the position using mẋ = p. This also suggests that one
of the best ways to look at a differential equation is to view it as a
dynamical system; namely, the motion of some physical object. Here t,
the independent variable is viewed as time and x is the unknown variable
which depends on the independent variable t, and is known as the
dependent variable.
A large number of physical and biological phenomena can be
modelled via differential equations. Applications arise in almost all
branches of science and engineering: radiation decay, aging, tumor
growth, population growth, electrical circuits, mechanical vibrations,
simple pendulum, motion of artificial satellites, to mention a few.
In summary, real life phenomena together with physical and other
relevant laws, observations and experiments lead to mathematical models
(which could be ODE). One would like to do mathematical analysis and
computations of solutions of these models to simulate the behaviour of
these physical phenomena for better understanding.
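As a small illustration of this last point (a sketch of ours, not from the book), a decay model dy/dt = −ky can be simulated numerically and compared against its exact solution; the rate k and initial value y0 below are purely illustrative.

```python
# Minimal sketch: simulate dy/dt = -k*y and compare with y0*exp(-k*t).
import numpy as np
from scipy.integrate import solve_ivp

k, y0 = 0.5, 100.0                       # illustrative decay rate and initial amount
sol = solve_ivp(lambda t, y: -k * y, (0.0, 10.0), [y0], dense_output=True)

t = np.linspace(0.0, 10.0, 5)
print(sol.sol(t)[0])                     # numerical values ...
print(y0 * np.exp(-k * t))               # ... agree with the exact solution
```

Such checks against known solutions are exactly the kind of validation one performs before trusting simulations of models that have no explicit solution.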
Definition 1.1.1
An ODE is an equation consisting of an independent variable t, an
unknown function (dependent variable) y = y(t ) and its derivatives up
to a certain order. Such a relation can be written as
f(t, y, dy/dt, · · · , d^n y/dt^n) = 0. (1.1.1)
Here, n is a positive integer, known as the order of the differential
equation.
For example, first and second order equations, respectively, can be written
as
f(t, y, dy/dt) = 0 and f(t, y, dy/dt, d²y/dt²) = 0. (1.1.2)
We will be discussing some special cases of these two classes of
equations. It is possible that there will be more than one unknown
function and in that case, we will have a system of differential equations.
A higher order differential equation in one unknown function may be
reduced into a system of first order differential equations. On the other
hand, if there are more than one independent variable, we end up with
partial differential equations (PDEs).
where r denotes the difference between birth rate and death rate. If y(t0 ) =
y0 is the population at time t0 , our problem is to find the population for all
t > t0 . This leads to the so-called initial value problem (IVP) which will
be discussed in Chapter 3. Assuming that r is a constant, the solution is
given by y(t) = y0 e^{r(t − t0)}; we ask the reader to work out
the details for this and the other examples in this chapter.
[Figure: the logistic curve approaching the limiting population a/b, with the half-way level a/(2b) marked]
population crosses the half way mark a/(2b). This indicates that if the initial
population is less than half the limiting population, then there is an
accelerated growth (dy/dt > 0, d²y/dt² > 0), but after reaching half the
limiting population, the population still grows (dy/dt > 0), but it has now a
decelerated growth (d²y/dt² < 0).
When we analyse the case where the initial population is bigger than
the limiting population, we observe that dy/dt < 0 and d²y/dt² > 0. Thus, the
population decreases, with the decline slowing down, to the limiting population.
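The behaviour just described can be checked numerically; the sketch below (ours, with illustrative parameters) uses the explicit solution that appears in Exercise 1(c) of this chapter, together with dy/dt = y(a − by) and d²y/dt² = (dy/dt)(a − 2by).

```python
# Sketch: accelerated growth below a/(2b), decelerated growth above it.
import numpy as np

a, b, y0, t0 = 1.0, 0.5, 0.1, 0.0        # illustrative values with y0 < a/(2b)
t = np.linspace(0.0, 25.0, 2001)
y = a * y0 / (b * y0 + (a - b * y0) * np.exp(-a * (t - t0)))  # explicit solution

dy = y * (a - b * y)                      # dy/dt from the equation
d2y = dy * (a - 2 * b * y)                # differentiate once more
print((d2y[y < a / (2 * b)] > 0).all())   # True: accelerated growth
print((d2y[y > a / (2 * b)] < 0).all())   # True: decelerated growth
```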
Remark 1.2.1
exerted by water (it is a kind of resistance), where V = dy/dt is the velocity
of the object and c > 0 is a constant of proportionality. Thus, we have the
differential equation
d²y/dt² = (1/m)F = (1/m)(W − B − cV) = (g/W)(W − B − cV), y(0) = 0. (1.2.6)
Equivalently,
dV/dt + (cg/W)V = (g/W)(W − B), V(0) = 0. (1.2.7)
Equation (1.2.7) can be solved to get
V(t) = ((W − B)/c)(1 − e^{−(cg/W)t}). (1.2.8)
Thus, V(t) is increasing and tends to (W − B)/c as t → ∞, and the value
of (W − B)/c is (practically) ≈ 700.
The limiting value 700 ft/sec of velocity is far above the permitted
critical value. Thus, it remains to ensure that V (t ) does not reach 40 ft/sec
by the time it reaches the sea bed. But it is not possible to compute t at
which time the drum hits the sea bed and one needs to do further analysis.
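The further analysis can at least be explored numerically. The sketch below integrates (1.2.6) with purely hypothetical values of W, B, c and an assumed depth D of the sea bed (none of these numbers are from the book), stopping the integration when the drum reaches the bottom and reporting the impact velocity.

```python
# Sketch: integrate y'' = (g/W)(W - B - c*V) until the drum reaches depth D.
import numpy as np
from scipy.integrate import solve_ivp

g, W, B, c, D = 32.2, 527.0, 470.0, 0.08, 300.0  # hypothetical ft-lb-sec data
rhs = lambda t, z: [z[1], (g / W) * (W - B - c * z[1])]  # z = (y, V)

hit = lambda t, z: z[0] - D              # event: the drum reaches the sea bed
hit.terminal = True

sol = solve_ivp(rhs, (0.0, 120.0), [0.0, 0.0], events=hit, max_step=0.01)
print("impact time:", sol.t[-1], "impact velocity:", sol.y[1, -1])
```

With numbers of this size, the impact velocity comes out near the 40 ft/sec threshold, which is why the cruder limiting-velocity argument alone cannot settle the question.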
m d²y/dt² = −ky − c dy/dt + F0(t). (1.2.11)
That is,
m d²y/dt² + c dy/dt + ky = F0(t), m, c, k ≥ 0. (1.2.12)
This is a second order non-homogeneous linear equation with constant
coefficients and we study such equations in detail in Chapter 3. Such a
system also arises in electrical circuits, which we discuss next.
[Figure: an electrical circuit with source V, switch S, inductance L and capacitance C]
[Figure: a satellite at P with polar coordinates (r, θ) and control components u1, u2; (x, y) are Cartesian coordinates]
d²θ/dt² = −(2/r(t))(dθ/dt)(dr/dt) + u2(t)/r(t). (1.2.30)
In applications, when a satellite is injected into an orbit, it usually drifts
from its prescribed orbit due to the influence of other cosmic forces. The
thrusters (controls) are activated to maintain the desired orbit of the
satellite.
dy/dt = −v0 sin θ + w = −v0 y/√(x² + y²) + w.
[Figure 1.4: the aeroplane at (x, y), at distance √(x² + y²) from the origin, flying with speed v0 toward the origin in a crosswind w; it starts at (a, 0)]
The path {(x(t ), y(t )),t ≥ 0} is called the orbit or trajectory of the aircraft
in the xy plane (Figure 1.4). These equations can be implicitly written as
dy/dx = (1/(v0 x))(v0 y − w√(x² + y²)).
Example 1.2.3
circuits by the Dutch engineer van der Pol when he was working for the
Philips company (in the Netherlands) around 1920. He also studied this
equation with forced periodic term A sin ωt and observed the
phenomenon, which in the current literature is termed as chaos. A
detailed mathematical analysis of this equation was done by Cartwright
and Littlewood [CL45] and by Levinson [Lev49]; their study revealed the
existence of the paradoxical combination of randomness and structure,
which is also called deterministic chaos in the current literature; see
Example 1.2.5, Lorenz equations.
The van der Pol equation is also used to model certain situations in
physical and biological sciences. For example, in seismology, it is used to
model the motion of two plates in a geological fault; in biology, it is used
to model the action potential of neurons.
Example 1.2.4
When x is small, we have sin x ≈ x and one obtains the linear pendulum
equation.
Example 1.2.5
ẋ = σ(y − x),
ẏ = Rx − y − xz, (1.2.36)
ż = −bz + xy,
where R, σ, b are fixed parameters.
Example 1.2.6
Example 1.2.7
Example 1.2.8
Example 1.2.9
xn+1 = axn (1 − xn ),
for n = 1, 2, · · · . This is a first order difference equation. The constant a ∈ [0, 4]. Thus,
if x1 ∈ [0, 1], then xn ∈ [0, 1] for all n > 1. The logistic map has been studied
extensively and it reveals many surprising properties of the sequence {xn }
for a certain range of values of a.
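To get a feel for this, one can iterate the map directly; the sketch below (with values of a chosen for illustration) prints the tail of the orbit for a convergent case, a 2-periodic case and an apparently chaotic case.

```python
# Sketch: tails of logistic-map orbits x_{n+1} = a*x_n*(1 - x_n) for a few a.
def orbit(a, x1=0.2, n=200):
    xs = [x1]
    for _ in range(n - 1):
        xs.append(a * xs[-1] * (1.0 - xs[-1]))
    return xs

for a in (2.8, 3.2, 3.9):                # fixed point, 2-cycle, chaotic (typical)
    print(a, [round(x, 4) for x in orbit(a)[-4:]])
```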
We will not pursue this subject in this book, but the interested reader
may look into, for example in [Hao84, Wig90].
1.3 Exercises
1. Consider the initial value problem²
dy/dt = ay(t) − by²(t), y(t0) = y0,
where a, b > 0, t0 , y0 ∈ R. Assume the unique existence of the (local)
solution y = y(t ) in the interval (t1 ,t2 ) with t0 ∈ (t1 ,t2 ).
(c) Use the first part to obtain the solution y in the explicit form
y(t) = ay0 / (by0 + (a − by0) e^{−a(t−t0)})
(d) In each of the cases of the first part, describe the maximal
interval (t∗ ,t ∗ ), where the solution y is defined. This is referred
to as the maximal interval of existence, which will be
discussed in detail in Chapter 4. (Note that t ∗ can be +∞ or t∗
can be −∞). Further, compute the limits
lim_{t↓t∗} y(t) and lim_{t↑t∗} y(t).
² The methods of solutions are described in Chapter 3.
(e) In each of the cases, find dy/dt and d²y/dt², and analyse the shape of the curve.
(f) Find the conditions on y0 so that t∗ = −∞ and/or t ∗ = +∞.
(g) Plot the graphs of the solutions y in the ty plane for different
values of y0 .
(h) Let y = y(t ) be the solution as earlier and z = z(t ) be the
solution to the initial value problem:
dz/dt = az(t) − bz²(t), z(t1) = y0.
Represent z in terms of y. Sketch with different initial times. Do
you observe any property? Describe the observed properties for
the general problem
dy/dt = f(y(t)), y(t0) = y0.
2. Consider the modified population model with a real parameter λ ,
namely
dy/dt = ay(t) − by²(t) − λ, y(t0) = y0.
Do a similar analysis for various values of the parameter. More
precisely, show that there is a critical value λcr such that for
λ > λcr , the behaviour is exactly similar, but for λ < λcr , the
behaviour of the solution is completely different.
3. Consider the linear model of the atomic waste disposal problem:
dV/dt + (cg/W)V(t) = (g/W)(W − B), V(0) = 0,
where V = V (t ) is the velocity at time t.
(a) Find the solution V and find the limit lim_{t→∞} V(t).
(b) Now derive the non-linear model:
(v/(W − B − cv)) dv/dy = g/W, v(0) = 0
1.4 Notes
We have presented a few real world problems to highlight the importance
of modelling using ODEs and their analysis. Of course, the examples are
not exhaustive; in fact, one can find several text books devoted to a
particular topic, for example, mathematical biology, mechanical systems,
etc. We have seen through the atomic waste disposal problem (Section
1.2.3) that, through the simple linear model, we can solve the problem
explicitly, but obtain only an incomplete answer to the question set out therein.
However, a little reformulation gives us a non-linear equation, which in
general is hard to solve, yet gives us a complete answer to the question.
This exhibits the importance of correct modelling and its analysis even if
the solution is not available in explicit form. Such phenomena can be
observed in other models like population growth (Sections 1.2.1 and
1.2.2). One should bear in mind such peculiarities arising in the analysis
of ODEs. In general, it is hard to obtain an explicit, an implicit, or even a
series representation of a solution, leading to the necessity of analysing the
solution in the absence of such forms.
A large number of real life examples are available in Martin Braun
[Bra78, Bra75]. See also [AMR95, TS86].
2 Preliminaries
2.1 Introduction
In this chapter, we present some topics from linear algebra and analysis
which are extensively used in the subsequent chapters of the book. Our
discussion will only be brief and more details and longer proofs may be
found in the references cited at the end of this chapter. Quite often, the
explicit solution may not be available and we may appeal to the analysis
to derive the qualitative nature of the solution, which in turn may help us
to arrive at conclusions about the behaviour of the physical or biological
problems modelled through ODE. Even when the explicit solution is
known, it may be hard to draw significant conclusions regarding the
global behaviour of the system. We have therefore emphasized the
importance of analysis and linear algebra throughout this book, with the
hope that the beginner starts appreciating the essential role of these
subjects in the study of ODE.
Notice that each function fk defined here is continuous on I = [0, 1], but
the limit function f is discontinuous at x = 0. We have thus lost the important
property of continuity under pointwise convergence. Therefore, we
now discuss a stronger convergence under which the continuity property
is preserved. This is the notion of uniform convergence.
Definition 2.2.1
Theorem 2.2.2
f(x) = 0 for 0 ≤ x < 1, and f(x) = 1 for x = 1,
Example 2.2.3
Theorem 2.2.4
lim_{k→∞} ∫_a^b f_k(t) dt = ∫_a^b f(t) dt.
In general, we may not be able to interchange the limit and the integral signs
if the convergence is not uniform.
In view of Theorem 2.2.2 and Theorem 2.2.4, for a given sequence of
functions, extracting a uniformly convergent subsequence is very
important in analysis. In this direction, we need to have conditions under
which one can derive uniformly convergent subsequences. A well-known
theorem is the Arzela–Ascoli theorem. Before stating this result, we
introduce some more concepts.
We discuss the convergence and uniform convergence of series of
functions. Let {u_k} be a sequence of functions defined on I. Consider the
sequence of partial sums f_k = ∑_{i=1}^{k} u_i. If the sequence f_k converges
pointwise (respectively, uniformly) to a function u on I, then we say that
the infinite series, denoted by ∑_{k=1}^{∞} u_k, converges pointwise
(respectively, uniformly) to u on I.
Theorem 2.2.2 and Theorem 2.2.4 are valid for series under appropriate
hypotheses.
Theorem 2.2.5
Definition 2.2.6
Definition 2.2.7
Theorem 2.2.8
A proof can be found in several books, see, for instance, [Rud76, CL72].
Definition 2.2.9
Theorem 2.2.10
Definition 2.2.11
Usually, it is the third property of the norm that does not follow in an
obvious way and needs proof. In the context of Rn , it is called Minkowski’s
inequality. When p = 2, it is the usual Euclidean norm (or distance).
It is convenient to take p = 1 for the discussion in Chapter 4 and we
write k · k1 = | · | and state the definition of Lipschitz continuity now in
terms of this 1−norm.
Definition 2.2.12
Example 2.2.13
(iv) For the function f(t, y) = e^{−t²} y² sin t on D = {(t, y) : 0 ≤ y ≤ 2,
t ∈ R}, we have
|f(t, y1) − f(t, y2)| = |e^{−t²} sin t||y1 + y2||y1 − y2| ≤ 4|y1 − y2|
for any (t, y1), (t, y2) in D. Thus, f(t, y) is Lipschitz continuous on
the strip D.
(v) f(t, y) = t√y on the rectangle D = {(t, y) : 0 ≤ t ≤ 1, 0 ≤ y ≤ 1}.
Note that
|f(1, y) − f(1, 0)| = √y = (1/√y)|y − 0|
and 1/√y → +∞ as y → 0+. Hence, the function f is not Lipschitz
continuous on the rectangle D, but is continuous on D.
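The failure of the Lipschitz condition in example (v) is visible numerically: the difference quotient grows like 1/√y as y → 0+, as the following small sketch shows.

```python
# Sketch: the quotient |f(1,y) - f(1,0)|/|y - 0| = 1/sqrt(y) blows up near 0.
import math

f = lambda t, y: t * math.sqrt(y)
for y in (1e-2, 1e-4, 1e-6):
    print(y, abs(f(1.0, y) - f(1.0, 0.0)) / abs(y))   # 10, 100, 1000
```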
Here we state a sufficient condition for Lipschitz continuity of f(t, y) with
respect to y.
Theorem 2.2.14
Example 2.2.15
Example 2.2.16
Lemma 2.2.17
How do we interpret the symbolic notation 2y dy = dt? This can be done
via the change of variable formula; take f(y) = y³ and g(t) = √t; then,
by Theorem 2.2.18, ∫ y³ dy = ∫ (√t)³ (1/(2√t)) dt = (1/2) ∫ t dt.
2 t 2
We now discuss an important result known as differentiation under the
integral sign.
Theorem 2.2.19
[Generalized Leibnitz Formula]
Theorem 2.2.20
[Taylor’s Formula]
F′(0) = ∇f(x0) · y,
where
∇f(x0) = (∂f/∂x1 (x0), · · · , ∂f/∂xn (x0))
is the gradient of f at x0. Doing a further differentiation of F′(t), we get
F″(t) = ∑_{i,j=1}^{n} (∂²f/∂xi ∂xj)(x0 + ty) yi yj.
Df(x0) is the n × n matrix whose rows are ∇f1(x0), · · · , ∇fn(x0), and we may write
f(x0 + y) = f(x0 ) + Df(x0 )y + O(|y|2 ).
Note that ∇ f (x0 ) · y is the dot product, whereas Df(x0 )y is the action of
the matrix Df(x0 ) on the vector y.
Definition 2.3.1
d(x, y) = (∑_{i=1}^{n} |xi − yi|²)^{1/2} for all x = (x1, · · · , xn), y = (y1, · · · , yn) ∈ Rⁿ or Cⁿ.
This is the standard Euclidean metric or distance. There are many other
metrics we can introduce on Rn and Cn . We will see more examples later.
Let (X, d ) be a metric space. A sequence {xk } in X is said to converge
to a point x ∈ X if for given ε > 0, there exists N ∈ N such that d (xk , x) < ε
for all k ≥ N. This statement may also be written as d (xk , x) → 0 or xk → x
as k → ∞. It is easy to see, from the triangle inequality, that if xk → x and
xk → y, then x = y.
A sequence {xn } ⊂ X is said to be a Cauchy sequence, if d (xn , xm ) → 0
as n, m → ∞. A metric space (X, d ) is said to be a complete metric space if
every Cauchy sequence in X converges in X. A normed linear space which
is a complete metric space (metric induced by the norm) is called a Banach
space (see, Definition 2.4.2).
In particular, if X = Rn , uk ∈ Rn converges to u ∈ Rn if |uk − u| → 0
as k → ∞ and Rn is a Banach space. It is also a Banach space under the
norms
‖x‖_p = (∑_{i=1}^{n} |xi|^p)^{1/p}, 1 ≤ p < ∞,
and
‖x‖_∞ = max_{1≤i≤n} |xi|,
for x ∈ Rn . The function space C [0, 1] or, more generally, C [a, b] with sup
norm is a Banach space. However, it is not a complete space with respect
to k · k1 introduced earlier. The completeness plays a crucial role in the
fixed point theorem to be studied later.
It is also easy to check that for a sequence { fn } ⊂ C [a, b], the statement
fn → f in sup norm is equivalent to saying that fn converges uniformly to
f.
Suppose (X, d ) is a metric space, x ∈ X, r > 0. The set Br (x) ≡ {y ∈ X :
d (x, y) < r} is called an open ball of radius r centred at x. The collection
{Br (x) : x ∈ X, r > 0} forms a basis for a topology in X. This is referred
to as the topology induced by the metric d in X. An open set in X is, by
definition, an arbitrary union of open balls. A subset of X is closed if its
complement is open in X.
Theorem 2.3.2
Corollary 2.3.3
The corollary follows from the theorem. Let x∗ be the unique fixed point
of T k , that is, T k x∗ = x∗ . Applying T , we get T k (T x∗ ) = T x∗ and hence,
T x∗ is also a fixed point of T k . By uniqueness, T x∗ = x∗ and thus, the
unique fixed point of T k is also a fixed point of T . If x1 is another fixed
point of T , that is, T x1 = x1 , then by repeated application of T , we see
that T k x1 = x1 . By uniqueness of the fixed point of T k , we have x1 = x∗
as required.
Definition 2.4.1
(a f )(t ) = a f (t ), t ∈ X,
where on the right are the usual addition and multiplication of real
numbers. Thus, f + g, a f ∈ V whenever f , g ∈ V and a ∈ R. It is easy
to check that V is a vector space with these operations. The additive
identity in V is the zero function: 0(t ) = 0 for all t ∈ X and the
additive inverse of f ∈ V is the function − f defined by (− f )(t ) =
− f (t ), t ∈ X.
2. If we take X = {1, 2, · · · , n} (n, a given positive integer) in
Example 1, then we identify the vector space V with Rn .
3. If instead we take X as an interval in R, then we may consider the
subsets of V consisting of polynomial functions, continuous
functions, continuously differentiable functions, etc. It is easy to
verify that all these are examples of real vector spaces. A
continuously differentiable function is one which is differentiable
and its derivative is also continuous. Higher order continuously
differentiable functions are defined in a similar way.
We now define some important concepts such as linear dependence and
independence of vectors, linear span, basis and dimension.
Definition 2.4.2
which is the same as
|A| = sup_{x∈Rⁿ, x≠0} |Ax|/|x| = sup_{x∈Rⁿ, |x|≤1} |Ax|.
Note that for identity matrix I, we have |I| = 1. Using the properties of | · |
in Rn , it is not hard to verify the following:
Note that the term on the right side is a partial sum of the tail of the
(scalar) exponential e|A| . Thus, {Sk } is a Cauchy sequence and
consequently converges to some S ∈ Mn (R).
Definition 2.5.1
where S = lim_{k→∞} ∑_{j=0}^{k} A^j/j!.
We also write e^A = ∑_{j=0}^{∞} A^j/j!. Note that e^A ∈ Mn(R). Clearly,
|e^A| ≤ e^{|A|}, which is an interesting inequality. The computation of e^A is not easy.
However, if A = diag (λ1 , · · · , λn ) is a diagonal matrix, that is, the main
diagonal entries are λ1 , · · · , λn and all other elements are zero, then Ak
is also a diagonal matrix with diagonal entries λ1k , · · · , λnk (show this by
induction) and hence, eA = diag (eλ1 , · · · , eλn ).
Here are a couple of important observations:
1. Suppose that the matrix A is similar to a matrix B, that is, there exists
a non-singular matrix P such that B = PAP−1 . Then,
B2 = (PAP−1 )(PAP−1 ) = PA(P−1 P)AP−1 = PA2 P−1 ,
and, by induction, we get Bk = PAk P−1 for any k = 1, 2, · · · . This
implies that
eB = P eA P−1 and eA = P−1 eB P (2.5.1)
Thus, eA and eB are also similar.
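The identity (2.5.1) is easy to test numerically; the sketch below (with an arbitrary choice of A and P of ours) compares e^B with P e^A P^{−1} using scipy's matrix exponential.

```python
# Sketch: if B = P A P^{-1}, then e^B = P e^A P^{-1}.
import numpy as np
from scipy.linalg import expm

A = np.diag([1.0, 2.0])                   # e^A = diag(e, e^2) for a diagonal A
P = np.array([[1.0, 1.0], [0.0, 1.0]])    # any non-singular matrix
B = P @ A @ np.linalg.inv(P)

print(np.allclose(expm(B), P @ expm(A) @ np.linalg.inv(P)))   # True
```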
If it is not possible to get any set of n distinct directions invariant under T, then
T will not be diagonalizable.
Invariant subspaces
Let M and N be two subspaces of Rn such that M ∩ N = {0}. We say M
and N are disjoint subspaces though 0 ∈ M ∩ N always. We say that Rn is
a direct sum of M and N, if, by definition, for every x ∈ Rn , there exist
unique y ∈ M, z ∈ N such that x = y + z. We denote the direct sum by
Rn = M ⊕ N. For example, R2 = {(x, 0) : x ∈ R} ⊕ {(0, y) : y ∈ R}.
We can also introduce the direct sum of more than two subspaces
M1 , · · · , Mk as Rn = M1 ⊕ · · · ⊕ Mk , that is, each vector x ∈ Rn has a
unique representation x = u1 + · · · + uk , where ui ∈ Mi , i = 1, 2, · · · , k.
For example,
R3 = {(x, y, 0) : x, y ∈ R} ⊕ {(0, 0, z) : z ∈ R}
Definition 2.5.2
Further,
C−1 eA C = diag(eA1 , · · · , eAk )
The computation of eAi need not be easy in general. We shall next describe
a procedure to find a suitable C so that eAi are easily computed.
It is not hard to see that these real vectors are linearly independent. In
conclusion, we have the following theorem.
Theorem 2.5.3
Let A ∈ Mn (R).
Then, for each λ ∈ σ (A) real or non-real, there exists an invariant
subspace Nλ of Rn such that
dim(Nλ) = algebraic multiplicity of λ, if λ is real, and
dim(Nλ) = twice the algebraic multiplicity of λ, if λ is non-real.
Our aim is to find suitable bases for Nλi and Nµ j so that Aλi and Aµ j have
simple structures. Hence, exp(Aλi ) and exp(Aµ j ) can be computed easily.
Before proceeding further, we illustrate this with an example.
Example 2.5.4
[ B2  I2  O  · · ·  O ]
[ O  B2  I2  · · ·  O ]
[ · · ·  · · ·  · · ·  · · · ]
[ O  O  · · ·  · · ·  B2 ]   (2.5.5)

where B2 = [ a  b ; −b  a ], I2 = [ 1  0 ; 0  1 ], O = [ 0  0 ; 0  0 ] are all 2 × 2 matrices.
This analysis can be worked out for every eigenvalue and we get the final
decomposition known as Jordan decomposition theorem (JDT).
Theorem 2.5.5
J = [ λ  1  0  · · ·  0 ]
    [ 0  λ  1  · · ·  0 ]
    [ · · ·  · · ·  · · ·  · · · ]
    [ 0  0  · · ·  · · ·  λ ]   (2.5.7)

and

N = [ 0  1  0  · · ·  0 ]
    [ 0  0  1  · · ·  0 ]
    [ · · ·  · · ·  · · ·  · · · ]
    [ 0  0  · · ·  · · ·  0 ].
Thus, since I and N commute with each other,
e^J = e^{λI} · e^N = e^λ I e^N = e^λ e^N.
It is easy to see that N^r = N^{r+1} = · · · = O, the zero matrix. Hence,
e^J = e^λ (I + N + · · · + N^{r−1}/(r − 1)!)
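Since N^r = O, the exponential of a Jordan block is a finite sum, which the following sketch evaluates for an illustrative λ and block size r and compares with scipy's expm.

```python
# Sketch: e^J = e^lam * (I + N + ... + N^(r-1)/(r-1)!) for a Jordan block J.
import numpy as np
from math import factorial
from scipy.linalg import expm

lam, r = 2.0, 4
J = lam * np.eye(r) + np.diag(np.ones(r - 1), 1)   # Jordan block of size r
N = J - lam * np.eye(r)                            # nilpotent part, N^r = 0

eJ = np.exp(lam) * sum(np.linalg.matrix_power(N, j) / factorial(j)
                       for j in range(r))
print(np.allclose(eJ, expm(J)))                    # True
```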
where

D = [ O  I2  O  · · ·  O ]
    [ O  O  I2  · · ·  O ]
    [ · · ·  · · ·  · · ·  · · · ]
    [ O  O  O  · · ·  O ]

and B2 = [ a  b ; −b  a ].
Further, it is straightforward to see that e^{B2} = e^a [ cos b  sin b ; −sin b  cos b ]. From
(2.5.6), it follows that
A = C diag(J1, · · · , Jk) C^{−1},
With some more computation, one can prove the following theorem (using
the representation of eJ ). See [CL72].
Theorem 2.5.6
a1 u1 (t2 ) + a2 u2 (t2 ) = 0.
The non-singularity of the matrix implies that a1 = 0 = a2 . Since the class
V is too large, we cannot make a statement about the converse. We now
consider a special class from V . Let C1 (I ) be the class of continuously
differentiable functions defined on I. Clearly C1 (I ) ⊂ V , which again is a
subspace of V . In the class C1 (I ), we get a simpler sufficient condition for
linear independence. For u1 , u2 ∈ C1 (I ), define the Wronskian of u1 , u2 ,
denoted by W = W (t ) = W (u1 , u2 )(t ) by
W (u1 , u2 )(t ) = u1 (t )u̇2 (t ) − u̇1 (t )u2 (t ), t ∈ I,
which is the determinant of the Wronskian matrix [ u1(t)  u2(t) ; u̇1(t)  u̇2(t) ].
It is not hard to see the following. Suppose u1 , u2 ∈ C1 (I ). If there is a
point t0 ∈ I such that W (t0 ) 6= 0, then u1 , u2 are linearly independent.
The converse need not be true. The functions u1 (t ) = t 3 , u2 (t ) = |t|3 ,
t ∈ I = [−1, 1] are in C1 (I ) and are linearly independent, but
W (u1 , u2 )(t ) = 0 for all t ∈ [−1, 1]. This easy verification is left as an
exercise for the reader.
It is interesting and important that this situation does not occur when
we deal with functions which are solutions of linear second order ODE, as
will be shown in Chapter 3.
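The verification left to the reader can also be done symbolically; the following sketch evaluates W(t³, |t|³) at sample points of [−1, 1] using sympy.

```python
# Sketch: the Wronskian of t^3 and |t|^3 vanishes identically on [-1, 1].
import sympy as sp

t = sp.symbols('t', real=True)
u1, u2 = t**3, sp.Abs(t)**3
W = u1 * sp.diff(u2, t) - sp.diff(u1, t) * u2      # Wronskian W(u1, u2)(t)
print([sp.simplify(W.subs(t, v)) for v in (-0.7, -0.1, 0.3, 1.0)])  # all zero
```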
2.7 Exercises
1. Consider f_k : [0, 1] → R defined by
f_k(x) = k²x for 0 ≤ x ≤ 1/k,
f_k(x) = k²(2/k − x) for 1/k ≤ x ≤ 2/k,
f_k(x) = 0 for 2/k ≤ x ≤ 1.
Show that f_k(x) → f ≡ 0 pointwise, but not uniformly, and that
∫_0^1 f_k(x) dx = 1 while ∫_0^1 f(t) dt = 0.
(a) Show that f (x) = |x|1/2 is not locally Lipschitz at 0, that is, f
is not Lipschitz in any interval (a, b) containing the origin. But,
it is Lipschitz in any interval (finite or infinite) away from the
origin. More specifically, prove that it is Lipschitz in (a, b) if
a > 0 and it is Lipschitz in (a, b) with b < 0. Is it Lipschitz in
(0, 1)? Justify your answer.
(b) Write down 3 different solutions for ẋ = |x|1/2 satisfying
x(0) = 0.
4. Show that the matrix A = [ 1  1 ; 0  1 ] is not diagonalizable by
proving A has only one eigenvalue and the corresponding
eigenspace is one dimensional. Thus, it will not be possible to
obtain two linearly independent eigenvectors.
2.8 Notes
In this chapter, we have merely listed some results from analysis and
linear algebra which are used throughout the book. For a comfortable
understanding of the book, the reader is advised to get familiarized with
these basics. A good course on basic analysis and linear algebra will be
sufficient to follow the book. Quite often, the beauty and importance of
many interesting notions like diagonalization, eigenvalues and
eigenvectors are hidden in the abstraction. We have made an effort to
introduce these notions in a very natural way and hence, the
diagonalization of matrices is no longer unreachable to undergraduate
students. There are many books for both linear algebra and analysis; for
example, see [Apo11, BS05, Rud76] for analysis and [Apo11, HK97,
Kum00, Str06] for linear algebra.
3 First and Second Order Linear Equations
f, throughout the rest of this book. Using the area concept and continuity
assumption on f, we indeed prove that such a function y exists. In fact, all
the solutions are given by y(t) = ∫^t f(τ) dτ + C, where C is a constant.
This really is the content of the fundamental theorem of calculus. Thus, if
we know the value of y at some point, say at t0 , that is, y(t0 ) = y0 , then C
can be determined uniquely, as C = y0 and the solution is
y(t) = y0 + ∫_{t0}^{t} f(τ) dτ (3.1.5)
ÿ = f(t, y, ẏ) for t ∈ (a, b),
α1 y(a) + β1 ẏ(a) = γ1, α2 y(b) + β2 ẏ(b) = γ2. (3.1.8)
We remark that boundary value problems are generally more difficult than
initial value problems. We will discuss some of these issues in Chapter 7
and Chapter 9.
Definition 3.1.1
This means that for each t ∈ (ā, b̄), y(t ) ∈ (c, d ) and ẏ(t ) = f (t, y(t )) and
y(t0 ) = y0 . The interval (ā, b̄) is referred to as an interval of existence of
the solution. Here, t0 ∈ (ā, b̄) ⊂ (a, b) for some interval (ā, b̄) and
y(t ) ∈ (c, d ) for all t ∈ (ā, b̄). If (ā, b̄) = (a, b), then we say y is a global
solution to the IVP; otherwise, it is known as a local solution. If the
function f is continuous, then y is continuously differentiable, that is
y ∈ C1 (ā, b̄). It is also possible to define a weaker notion of the solution
concept. Throughout this book, we will assume that f is continuous and
hence, we seek a solution in C1 (ā, b̄).
A similar concept of a solution may be extended to a system of first
order equations. Let f : (a, b) × Ω → Rn be a vector valued continuous
function so that f = ( f1 , · · · , fn ) and each fi is a real valued continuous
function, where Ω is an open domain in Rn . For a given initial value y0 ∈
Ω, the IVP is given by
ẏ = f(t, y), y(t0) = y0. (3.1.9)
Example 3.1.2
Equivalently, ẏ = 2y/t when t ≠ 0. Separating the variables and integrating,
we obtain the general solution as y(t) = Ct², where C is a constant. For
any fixed C, the solution y therefore represents a parabola in the (t, y)
plane passing through the origin. Thus, if we consider the IVP for this
equation with the initial value y(0) = 0, there are infinitely many solutions
satisfying the initial condition, but no solution if the initial value is y(0) =
y0 ≠ 0.
Example 3.1.3
Consider the equation ẏ = −t/y.
We can easily see that y and t satisfy the implicit equation y² + t² = C²,
where C is a constant, which implies y = ±√(C² − t²) and |t| ≤ |C|.
Therefore, for −|C| ≤ t ≤ |C|, there exist solutions. The solution is not
defined for |t| > |C|.
A general regular (that is, the coefficient of highest order term is never
zero) first order linear ODE can be written as
Ly := ẏ + p(t )y = q(t ), (3.1.12)
where, p and q are functions of t. We assume that p and q are continuous
functions of t. For the basic equation, namely the integral calculus
problem, ẏ = f(t), the general solution is given by
y(t) = ∫^t f(τ) dτ + C.
Now recall the linear equation (3.1.12) and consider the corresponding
homogeneous equation Ly = 0; that is, ẏ + p(t )y = 0 or ẏ = −p(t )y.
Writing this formally as ẏ/y = −p(t), an integration gives
(d/dt) log |y(t)| = −p(t)
and therefore
|y(t)| = C exp(−∫^t p(τ) dτ), that is, y(t) exp(∫^t p(τ) dτ) = C̃,
for some arbitrary constant C̃. The reader should verify that if f is a
continuous function defined in an interval in R whose modulus is a
constant, then f itself is a constant. It is also easy to directly verify that y
given by (3.1.13) indeed satisfies Ly = 0.
Remark 3.1.4
Example 3.1.5
Consider the differential equation ẏ + 2ty = t. Here, the I.F. is e^{∫ 2t dt} =
e^{t²}. Thus, e^{t²}(ẏ + 2ty) = t e^{t²}, which implies
(d/dt)(y e^{t²}) = t e^{t²} ⇒ y e^{t²} = (1/2) e^{t²} + C,
or y(t) = 1/2 + C e^{−t²}.
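The same answer can be recovered with a computer algebra system; a quick sketch:

```python
# Sketch: checking Example 3.1.5 with sympy's dsolve.
import sympy as sp

t = sp.symbols('t')
y = sp.Function('y')
print(sp.dsolve(sp.Eq(y(t).diff(t) + 2 * t * y(t), t), y(t)))
# y(t) = C1*exp(-t**2) + 1/2, matching 1/2 + C e^{-t^2} up to the constant's name
```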
Example 3.1.6
Example 3.1.7
Example 3.1.8
Definition 3.2.1
If the differential equation ẏ = f(t, y) can be written as (d/dt) ϕ(t, y(t)) =
0 for a two variable function ϕ in a domain in the (t, y) plane, then the
differential equation is said to be an exact differential equation (EDE).
Example 3.2.2
The equation 1 + cos(t + y) + cos(t + y)ẏ = 0 can be written as (d/dt)[t +
sin(t + y)] = 0 and hence is exact. The solution is implicitly given by
t + sin(t + y) = constant.
Theorem 3.2.3
ϕ(t, y) = ∫_{t0}^{t} M(s, y) ds + h(y),
so that
∂ϕ/∂y = N(t, y) − N(t0, y) + dh/dy.
Therefore, the second relation, namely ∂ϕ/∂y = N, is satisfied if we choose h
such that dh/dy = N(t0, y). But this is an integral calculus problem for h and
we obtain h(y) = ∫_{y0}^{y} N(t0, ξ) dξ. Thus, the required function is given by
ϕ(t, y) = ∫_{t0}^{t} M(s, y) ds + ∫_{y0}^{y} N(t0, ξ) dξ.
We remark that ϕ is determined only up to a constant. Thus, if we change
t0 , y0 in this equation, only the constant term is going to change. Therefore,
the role of t0 , y0 is minimal and one can discard all the constants in the
expression for ϕ. We will observe this in the following examples. First,
make the following definition.
Definition 3.2.4
The DE M(t, y) + N(t, y)ẏ = 0 is said to be exact if ∂M/∂y = ∂N/∂t.
Example 3.2.5
The DE 3y + e^t + (3t + cos y) dy/dt = 0 is exact.
We have
M = 3y + e^t, N = 3t + cos y, and thus ∂M/∂y = 3 = ∂N/∂t.
Therefore, M = ∂ϕ/∂t, that is, ∂ϕ/∂t = 3y + e^t, which gives ϕ(t, y) = 3yt +
e^t + h(y). Differentiating with respect to y, we get N = ∂ϕ/∂y = 3t + dh/dy.
Thus, dh/dy = cos y or h(y) = sin y. We may take the constant of integration
as 0. Hence, ϕ(t, y) = 3yt + e^t + sin y. Therefore, the given DE can be
written as (d/dt) ϕ(t, y) = 0. The solution is given by ϕ(t, y) = 3yt + e^t +
sin y = constant.
We now discuss the notion of an integrating factor. If the DE (3.2.1)
is not exact, we may possibly make it exact by multiplying it with a
suitable function, which is called an integrating factor (I.F.). Multiplying
(3.2.1) by µ (t, y), we get
µ (t, y)M (t, y) + µ (t, y)N (t, y)ẏ = 0. (3.2.4)
Note that if the function µ > 0, then any solution y of (3.2.4) is also a
solution of (3.2.1) and vice versa. Equation (3.2.4) is exact if and only if
∂(µM)/∂y = ∂(µN)/∂t, which implies
(∂µ/∂y)M + µ(∂M/∂y) = (∂µ/∂t)N + µ(∂N/∂t). (3.2.5)
If this equation has a solution µ, then (3.2.4) is exact and µ is an I.F. of
the original equation. As (3.2.5) is a PDE for µ, it is more difficult to
solve and goes beyond the realm of ODE! However, since we have some
freedom in choosing µ, we will try to choose it as simple as possible, say
µ is a function of only t or only y. Fortunately, such an assumption works
in many situations.
Consider a special case: µ = µ(t) is a function of t alone. Then, (3.2.5)
becomes µ(t)(∂M/∂y − ∂N/∂t) = µ̇(t)N and hence,
µ̇(t)/µ(t) = (1/N)(∂M/∂y − ∂N/∂t).
As the expression on the left is a function of t alone, this equation makes
sense only when the expression on the right side is also a function of t
alone, say R(t); then one can find an I.F. µ(t) = exp(∫^t R(t) dt).
Similarly, if the expression (1/M)(∂N/∂t − ∂M/∂y) is a function of y alone,
then we can choose µ as a function of y alone.
Example 3.2.6
Here,
M = p(t)y − q(t), N = 1.
Now
(1/N)(∂M/∂y − ∂N/∂t) = p(t).
Hence, µ(t) = exp(∫^t p(t) dt) is an I.F., as we have already seen in
Section 1.2.
Example 3.2.7
Now
N = t² cos y + 3y² e^t = ∂ϕ/∂y = t² cos y + 3y² e^t + dh/dy.
Therefore, dh/dy = 0 and h(y) = constant. We can take ϕ(t, y) =
t² sin y + y³ e^t. The equation becomes (d/dt)(t² sin y + y³ e^t) = 0, which
implies t² sin y + y³ e^t = k, a constant.
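The exactness test of Definition 3.2.4 is mechanical to verify with a computer algebra system. In the sketch below, M is read off as ∂ϕ/∂t = 2t sin y + y³e^t from the ϕ(t, y) = t² sin y + y³e^t found above (the original statement of the example is inferred from this ϕ).

```python
# Sketch: verify M_y = N_t for M = 2t sin y + y^3 e^t, N = t^2 cos y + 3y^2 e^t.
import sympy as sp

t, y = sp.symbols('t y')
M = 2 * t * sp.sin(y) + y**3 * sp.exp(t)
N = t**2 * sp.cos(y) + 3 * y**2 * sp.exp(t)
print(sp.simplify(sp.diff(M, y) - sp.diff(N, t)) == 0)   # True: the DE is exact
```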
Example 3.2.8
Consider the DE t ẏ − 2y = 0.
For t > 0, we can see that the function µ(t) = 1/t³ is an integrating factor,
since
(d/dt)(y/t²) = (1/t²)ẏ − (2/t³)y = 0.
Thus, y = ct² is a solution for any constant c.
Theorem 3.3.1
Proposition 3.3.2
Proof: The first part of the proposition is trivial to verify. Now, let y
be any solution of (3.3.3) with y0 = y(t0 ) and y1 = ẏ(t0 ). We now show
that there are constants α and β such that y(t ) = αz(t ) + β w(t ), for all
t ∈ I (t0 ). In particular, taking t = t0 , we see that α and β should satisfy
the 2 × 2 matrix system
z(t0)α + w(t0)β = y0, ż(t0)α + ẇ(t0)β = y1. (3.3.4)
z, w satisfy (3.3.3). Thus, W is given by W(t) = C exp(−∫^t p(s) ds), for
some constant C. Hence, W ≡ 0 if C = 0. If C ≠ 0, then W(t) ≠ 0, for
all t.
The following proposition will complete the proof of Proposition 3.3.2.
Proposition 3.3.3
Theorem 3.3.4.
dim(S) = 2.
Hence, the Wronskian is non-zero for all t and by Proposition 3.3.3, z and
w are linearly independent. It now follows from Proposition 3.3.2 that any
solution of (3.3.3) can be written as a linear combination of z and w.
We remark that the aforementioned proposition holds true for an nth order
linear equation as well. Further, the existence of a unique solution for the
aforementioned IVP is guaranteed under the assumption that the functions
p, q are continuous in a compact interval I (t0 ). But, in general, even for
second order equations, it is difficult to find independent solutions to Ly =
0 in explicit form. We present two methods describing the possibility of
obtaining linearly independent solutions. When applicable, these methods
generate two linearly independent solutions.
Method 1: The idea is to remove the term involving the first order
derivative ẏ via an integrating factor. We look for a solution of the form
y = uv, where u and v are to be properly chosen. In this case, ẏ = uv̇ + u̇v
and ÿ = uv̈ + 2u̇v̇ + üv. Substituting in Ly = 0, we get
(uv̈ + 2u̇v̇ + üv) + p(t )(uv̇ + u̇v) + q(t )uv = 0. (3.3.6)
Rearranging the terms in (3.3.6), we obtain
uv̈ + (2u̇ + p(t )u)v̇ + (ü + p(t )u̇)v + q(t )uv = 0.
Now choose u so that the coefficient of v̇ in this equation vanishes. That
is, choose u satisfying
2u̇ + p(t )u = 0, (3.3.7)
which can be easily solved for u. Note that u never vanishes, if it is not
zero initially. The equation satisfied by v now becomes
v̈ + (q(t) + (ü + p(t)u̇)/u) v = 0. (3.3.8)
Since the v̇ term is absent and u is known in (3.3.8), it may be possible to solve
this equation for v, at least in some situations.
Example 3.3.5
Solve ÿ + 2tẏ + (1 + t²)y = 0.
Here p(t) = 2t, q(t) = 1 + t², u(t) = e^{−t²/2}. It is easy to see that v satisfies
v̈ = 0. Thus, v(t) = C1 t + C2 and the solution is given by
y(t) = u(t)v(t) = e^{−t²/2}(C1 t + C2).
r1 = (−b + √(b² − 4ac))/(2a) and r2 = (−b − √(b² − 4ac))/(2a). (3.3.13)
We now analyse various cases, depending on the nature of the discriminant
of the quadratic equation (3.3.12).
Case (i) b2 − 4ac > 0: In this case, the roots r1 and r2 of (3.3.12) are real
and distinct and we get two linearly independent solutions y1 (t ) = er1t and
y2 (t ) = er2t . Hence, the general solution can be written as
y(t ) = Aer1t + Ber2t , (3.3.14)
where, A and B are arbitrary constants.
Case (iii) b² − 4ac < 0: Then, the roots r1 and r2 are complex and e^{r1 t}
and e^{r2 t} are complex valued solutions. Clearly, if y(t) = u(t) + iv(t) is a
complex valued solution, then u and v are real valued solutions. Thus, if
r1 = α + iβ and r2 = α − iβ, β ≠ 0, the two independent solutions are
given by y1(t) = e^{αt} cos βt and y2(t) = e^{αt} sin βt, where α = −b/(2a) and
β = √(4ac − b²)/(2a). Thus, the general solution is given by
y(t) = e^{αt}(A cos βt + B sin βt), where A and B are arbitrary constants.
Theorem 3.3.6
Example 3.3.7
Example 3.3.8
Example 3.3.9
Example 3.3.10
Consider the linear equation
ÿ − ((1 + t)/t) ẏ + (1/t) y = t e^{2t}.
By inspection, we observe that y1 (t ) = et is a solution of the
homogeneous equation. By the method of reduction of order, it can
be shown that y2 (t ) = 1 + t is another (linearly independent)
solution. Therefore, a particular solution of the given
non-homogeneous equation is given by
y_p(t) = e^t ∫ ((1 + t) t e^{2t})/(t e^t) dt − (1 + t) ∫ (e^t · t e^{2t})/(t e^t) dt
= t e^{2t} − (1/2)(1 + t) e^{2t} = (1/2)(t − 1) e^{2t}.
One can verify by direct substitution that this y_p indeed satisfies the equation.
y_p(t) = e^{at}/(a² + pa + q).
If a is a root of (3.3.12), we now look for a solution of the form
y_p = At e^{at}. In fact, one can use the reduction of order method to get
the coefficient as At. A computation will lead to A(2a + p) = 1.
Again, we get a particular solution by choosing A = 1/(2a + p) if
2a + p ≠ 0, as
y_p(t) = t e^{at}/(2a + p).
Now, note that 2a + p = (d/da)(a² + ap + q). Thus, 2a + p = 0 is
equivalent to a being a double root. If a is a double root (that is,
a² + ap + q = 0 and 2a + p = 0), then look for a solution of the
form y_p(t) = At² e^{at}, which will give us A = 1/2. In summary, we have
y_p(t) = e^{at}/(a² + pa + q) if a is not a root of (3.3.12),
y_p(t) = t e^{at}/(2a + p) if a is a simple root of (3.3.12),
y_p(t) = (1/2) t² e^{at} if a is a double root of (3.3.12). (3.3.23)
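The summary (3.3.23) is easy to confirm by direct substitution; here is a sympy sketch with illustrative choices of p, q and a covering the three cases.

```python
# Sketch: check that each y_p in (3.3.23) solves y'' + p y' + q y = e^(a t).
import sympy as sp

t = sp.symbols('t')

def solves(p, q, a, yp):
    return sp.simplify(yp.diff(t, 2) + p * yp.diff(t) + q * yp - sp.exp(a * t)) == 0

print(solves(0, 1, 1, sp.exp(t) / 2))                  # a=1 not a root of r^2+1
print(solves(-3, 2, 1, t * sp.exp(t) / (2 * 1 - 3)))   # a=1 simple root of r^2-3r+2
print(solves(-2, 1, 1, t**2 * sp.exp(t) / 2))          # a=1 double root of r^2-2r+1
```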
Example 3.3.11
Consider ÿ + y = sint.
Example 3.3.12
where ω0 = √(k/m), called the natural frequency of the system. This can
also be written as
y(t) = R cos(ω0 t − δ).
Here, R = √(a² + b²) and δ = tan⁻¹(b/a) are, respectively, the amplitude and
phase angle. Further, T0 = 2π/ω0 is the period of the motion and the motion
is periodically oscillating between −R and R (see Fig. 3.1). Note that the
term involving c is the damping term. Indeed, Newton's law is justified as
the motion never stops.
[Figure 3.1: periodic oscillation between −R and R with period 2π/ω0]
Case (ii) (Damped, free motion: F = 0, c > 0): If r1, r2 are the roots
of the characteristic equation mr² + cr + k = 0, we can write the general
solution as
y(t) = a e^{r1 t} + b e^{r2 t} if c² − 4mk > 0,
y(t) = (a + bt) e^{−(c/2m)t} if c² − 4mk = 0, (3.3.24)
y(t) = e^{−(c/2m)t} [a cos µt + b sin µt] if c² − 4mk < 0,
where µ = √(4mk − c²)/(2m). Further, it is easy to see that r1, r2 are negative
2m
real numbers or have negative real parts. Hence, in the first two cases,
y(t ) → 0 as t → ∞, y(t ) creeps back to the equilibrium position and there
are no oscillations at all. These are referred to as over-damped or critically
damped motions.
[Figure: under-damped oscillation y(t) lying between the envelopes ±R exp(−(c/2m)t)]
and ω0 = √(k/m). The general solution can be written as
m
y(t ) = ϕ (t ) + y p (t ), (3.3.27)
where ϕ is the general solution to the homogeneous equation and as
observed earlier, ϕ (t ) → 0 as t → ∞. Thus, for large time, y(t ) behaves
like y p (t ). The solution y p is called the steady state part of y(t ) and ϕ (t )
is called the transient part.
The first two terms are periodic functions of time. The last term is
oscillatory and its amplitude keeps increasing due to the presence of t.
Thus, if the forcing term F0 cos ω0 t is in resonance with the natural
frequency of the system, then it will cause unbounded oscillations,
leading to mechanical catastrophes.
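A numerical experiment makes this growth visible. The sketch below integrates ÿ + ω0²y = cos ω0t with zero initial data (an illustrative ω0); the exact solution is y(t) = t sin(ω0t)/(2ω0), so the amplitude grows linearly in t.

```python
# Sketch: resonant forcing produces an amplitude growing linearly in time.
import numpy as np
from scipy.integrate import solve_ivp

w0 = 2.0
rhs = lambda t, z: [z[1], -w0**2 * z[0] + np.cos(w0 * t)]
sol = solve_ivp(rhs, (0.0, 50.0), [0.0, 0.0], max_step=0.01)

print(np.abs(sol.y[0][sol.t < 10.0]).max())   # small amplitude early on
print(np.abs(sol.y[0][sol.t > 40.0]).max())   # roughly 5 times larger later
```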
A phenomenon similar to this, a lack of sufficient damping, was the reason
for the collapse of the Tacoma bridge on November 7, 1940 at 11.00 am.
This is also the cause of the collapse of the Broughton suspension bridge
near Manchester. This occurred when a column of soldiers marched in
cadence over the bridge, thereby setting up a periodic force with a rather
large amplitude. The frequency was almost equal to the natural frequency
of the bridge, and thus, large oscillations were induced and the bridge
collapsed. It is for this reason that soldiers are ordered to break cadence
when crossing a bridge.
Among many similarities with mechanical vibrations, electrical
circuits also have the property of resonance. Unlike in mechanical systems,
resonance is put to good use here: the tuning knob of a radio or
television is used to vary the capacitance in such a manner that the
resonant frequency is changed until it agrees with the frequency of the
external signal, that is, from a radio or television station; the amplitude of
the current produced by this signal will be much greater than that of other
signals, so that we get the desired sound quality or picture quality or both.
and
ÿ(t) = ∫_a^t G_tt(t, ξ) r(ξ) dξ + G_t(t, t) r(t) + (d/dt)(G(t, t) r(t)).
Now substituting in (3.3.16), we get
Ly(t) = ∫_a^t LG(·, ξ) r(ξ) dξ + G_t(t, t) r(t) + p(t) G(t, t) r(t)
+ (d/dt)(G(t, t) r(t)).
This motivates us to define G as a solution to the following homogeneous
problem: For fixed ξ ≥ a as a parameter (in fact initial point), define G,
for t ≥ ξ to be the solution of
LG(., ξ ) = 0, G(ξ , ξ ) = 0, Gt (ξ , ξ ) = 1.
Further, define G(t, ξ ) = 0 for t ∈ [a, ξ ]. Then, y given by (3.3.29) will
satisfy the non-homogeneous equation Ly = r satisfying the initial
conditions y(0) = ẏ(0) = 0. The kernel G is called Green’s function
associated with the problem (3.3.16). For a class of problems associated
with second order equations, the process of obtaining G is done in detail
in Chapter 9.
Example 3.4.1
[Figure: (a) the characteristic line through x0 = (x − ct, 0); (b) the profile u(·, t0) = u0(· − ct0) at t = 0 and t = t0]
x(t ), that is, x(t ) = ct + ξ0 . These curves, straight lines in this case, are
called the characteristic curves of (3.4.1). In general, when c is a function
of t and x, these characteristic curves need not be straight lines. Fix ξ = ξ0
and restrict u along this curve (line). Consider the function of one variable
g(t ) = u(x(t ),t ). Now, it is easy to see by the chain rule that
(d/dt) g(t) = (d/dt) u(x(t), t) = u_x ẋ(t) + u_t · 1 = u_x · c + u_t = 0. (3.4.2)
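The constancy of u along characteristics is immediate to check numerically; the sketch below uses an illustrative initial profile u0 and speed c.

```python
# Sketch: u(x, t) = u0(x - c t) is constant along x(t) = c t + xi0.
import numpy as np

c = 1.5
u0 = lambda x: np.exp(-x**2)             # illustrative initial profile
u = lambda x, t: u0(x - c * t)           # the transported solution

xi0 = 0.7
ts = np.linspace(0.0, 5.0, 6)
print(u(c * ts + xi0, ts))               # the same value u0(xi0) at every time
```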
We now discuss a second order linear PDE, namely, the heat equation.
Our only purpose here is to indicate that this equation may be viewed
as an ODE, though in an infinite dimensional (Hilbert) space. As such,
much of the following terminology is not explained in a rigorous manner.
An inquisitive reader may explore this and similar topics after gaining
sufficient knowledge in functional analysis and related topics.
Example 3.4.2
3.5 Exercises
1. Prove that every separable equation is exact.
2. Find the unique solution to the IVP ẏ = 2y/t, y(t0) = y0, where t0 ≠ 0.
Also find the interval of existence and plot the solution for different
values of y0 .
3. Show that the solution of ẏ + (sin t)y = 0, y(0) = 3/2 is given by
y(t) = (3/2) e^{cos t − 1}.
4. The solution of dy/dt + e^{t²} y = 0, y(1) = 2 can be represented as
y(t) = 2 exp(−∫_1^t e^{s²} ds).
5. Classify the following into linear or non-linear:
(a) ẏ = ay − by², ẏ = −t/y, ẏ = −y/t, ẏ(t) = sin(t),
sin y + x cos(ẏ) = 0, ẏ = |y|, yẏ = y,
ẏ = sin y, yẏ = (g/W)(W − B − cy).
(b) (Duffing equation): ÿ + δẏ + αy + βy³ = 0
(c) (van der Pol equation): ÿ − µ (y2 − 1)ẏ + y = 0
(d) (Prey–predator system): ẋ = ax − bxy, ẏ = −cy + dxy
(e) (Epidemiology): Ṡ = −β SI, I˙ = β SI − γI
(f) (Bernoulli equation): ẏ + φ (t )y = ψ (t )yn
(g) (Reduced Bernoulli equation): ẏ + (1 − n)φ (t )y = (1 − n)ψ (t )
(h) (Generalized Riccati equation): ẏ + ψ (t )y2 + φ (t )y + χ (t ) = 0
6. Consider the Bernoulli equation
ẋ + φ x = ψxn ,
where φ , ψ are continuous functions. For n 6= 1, it is non-linear;
show that it can be reduced to a linear equation by the substitution
y = x1−n . Then, solve the equation.
7. Find the general solution of (i) ẋ + et x = et x2 (ii) ẋ + t n x = xn .
8. Consider the Jacobi equation
(a1 + b1t + c1 x)(tdx − xdt ) − (a3 + b3t + c3 x)dx
(c) ẏ + ty = t³y³
(e) ((1/t²) + (3y²/t⁴)) dt = (2y/t³) dy
(f) (t² dy − y² dt)/(t − y)² = 0
17. Find the general solution of the following equations
(a) t² ÿ + t ẏ − y = 0.
(b) t ÿ − (t + n) ẏ + ny = 0.
(d) t (d³y/dt³) = 2 and ÿ = a/y³.
18. Three solutions of a certain second order non-homogeneous linear
equation in R are
ϕ1 (t ) = t 2 , ϕ2 (t ) = t 2 + e2t , ϕ3 (t ) = 1 + t 2 + 2e2t .
Find the general solution of this equation.
19. Three solutions of a certain second order non-homogeneous linear
equation L y = g in R are
ψ1 (t ) = t 2 , ψ2 (t ) = t 2 + e2t , ψ3 (t ) = 1 + t 2 + 2e2t .
Here g is a continuous function in R. Find the solution y of L y = g
satisfying y(0) = 1, ẏ(0) = 2.
20. Assume the unique existence of a solution to the nth order IVP
3.6 Notes
The discussion on linear first and second order equations is available in
every basic book on ODE. In addition to the linear equations, we have
also introduced a section on exact differential equations. On one hand,
we have shown how every first order linear equation can be reduced to an
integral calculus problem, namely ẏ(t ) = h(t ) with the introduction of an
integrating factor (I.F.). This also makes it clear why an IVP for a first
4.1 Introduction
4.1.1 Well-posed problems
In this chapter, we address the problem of the existence and uniqueness
of solutions of initial value problems (IVP). For this purpose, our first
task is to ensure that the given differential equation has a solution. A
mathematical model originating from a real life system may exhibit more
than one solution starting from the same initial condition, though a
unique solution is expected. This may be due to rough approximations
and assumptions made while making a mathematical model of the
physical system. On the other hand, a mathematical model may not have
a solution at all. Similarly, it is also important to study the behaviour of
the solution with respect to the initial data as the initial data is usually
measured by using some devices and is bound to have some small errors.
Continuous dependence of solutions on initial data guarantees that a
small error in the initial data does not cause a drastic change in the
solution of the system. According to the French mathematician Jacques
Hadamard, if an initial value problem arising from a physical
phenomenon passes the above mentioned tests, namely, a solution
exists (existence problem), the solution is unique (uniqueness problem)
and the solution depends continuously on the initial conditions (stability
problem) in appropriate norms, then the IVP is said to be well-posed.
Otherwise, the problem is ill-posed. In this chapter, we will address these
issues and prove results which ensure the well-posedness of an IVP,
under suitable assumptions. We consider the following IVP:
ẏ = f(t, y), y(t₀) = y₀. (4.1.1)
When the function f(t, y) (called the vector field) is not continuous at a
point in the (t, y)-plane, then there may be a possibility of non-existence
of a solution at that point. Similarly, if the vector field is not differentiable
at a point, then there may be a possibility of non-uniqueness of solutions
through that point. The initial condition also plays a crucial role in the
existence and behaviour of the solution of an IVP.
Before proceeding further, we will consider some examples which
exhibit one or more phenomena discussed here. Also see Examples 3.1.2,
3.1.3.
4.1.2 Examples
Example 4.1.1
It has been shown in Chapter 3 that the function y(t) = ce^{2t} is a solution of the equation ẏ = 2y for every arbitrary constant c. Thus, the differential
equation has infinitely many solutions. Geometrically, this is a
one-parameter family of curves.
Let us remark that a differential equation always comes with some
associated physically meaningful conditions such as initial conditions
and boundary conditions. So when we talk about well-posedness, it is for
the differential equation together with the associated conditions provided.
Example 4.1.2
We consider the problem ẏ = (3/t)y,
with various initial conditions to exhibit multiplicity of solutions, non-
existence and a unique existence. Note that the vector field is not defined
at t = 0. By the method of separation of variables, we get y = ct³ as the solution, for an arbitrary constant c. Now consider the initial condition y(0) = 0; we see that y = ct³ is a solution to the initial value
problem for any value of c. Thus, the IVP has infinitely many solutions
with the initial condition y(0) = 0. All the solution curves pass through
(0, 0) in the (t, y)-plane.
On the other hand, if we take the same differential equation but with
the initial condition y(0) = 2, then the IVP has no solution as the general solution is y = ct³. The trouble is due to the singularity at t = 0. It is also
easy to see that the IVP with y(t₀) = y₀, t₀ ≠ 0, has a unique solution
y(t) = (y₀/t₀³) t³.
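The computation above is easy to confirm with a computer algebra system. A minimal sketch using sympy (the call below is standard; the equation is the one of this example):

```python
import sympy as sp

t = sp.symbols('t')
y = sp.Function('y')

# General solution of the singular equation y' = 3y/t.
sol = sp.dsolve(sp.Eq(y(t).diff(t), 3*y(t)/t), y(t))
print(sol)                    # Eq(y(t), C1*t**3)

# Every member of the family vanishes at t = 0, so y(0) = 0 admits
# infinitely many solutions, while y(0) = 2 admits none.
print(sol.rhs.subs(t, 0))     # 0, independently of C1
```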
We now give an example of an initial value problem in which f is continuous but not linear, and which exhibits infinitely many solutions.
Example 4.1.3
Consider the IVP ẏ = 3y^{2/3}, y(0) = 0. For any a ≥ 0, the function
y(t) = 0 for 0 ≤ t ≤ a, and y(t) = (t − a)³ for t > a,
is a solution; thus, there are infinitely many solutions.
Example 4.1.4
ẏ = y², y(0) = y₀ > 0.
By an integration and using y(0) = y₀, we get a solution
y(t) = y₀/(1 − y₀t).
It is the only solution to the problem. We will see this fact in the next section. Note that y(t) is defined only for t < 1/y₀, despite the fact that the
vector field f(t, y) = y² is very smooth on the entire real line. If y₀ is large,
then we have solutions only on a very small interval to the right of t = 0;
however, they exist for all t < 0.
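The blow-up at t = 1/y₀ can also be observed numerically; the sketch below, with illustrative values of y₀, integrates up to just short of the blow-up time:

```python
import numpy as np
from scipy.integrate import solve_ivp

def f(t, y):
    return y**2          # the vector field of this example

for y0 in (1.0, 10.0):
    T = 0.999 / y0       # stop just before the blow-up time 1/y0
    sol = solve_ivp(f, (0.0, T), [y0], rtol=1e-10, atol=1e-12)
    exact = y0 / (1.0 - y0 * sol.t[-1])
    print(y0, sol.y[0, -1], exact)   # the solution has become huge
```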
We now see an example of a nonlinear IVP which does not have a
solution.
Example 4.1.5
Consider the IVP ẏ = f(y), y(0) = 0, where f(y) = −1 for y ≥ 0 and f(y) = 1 for y < 0. Any solution must be decreasing initially as ẏ(0) = −1. Therefore, y(t) < 0 for some
t > 0. However, for all negative values of y, the solution must be
increasing. Since these two statements contradict each other, there is no
solution to this IVP.
Lemma 4.2.1
[Basic Lemma]
A function y is a solution of the IVP (4.1.1) on an interval around t₀ if and only if
∫_{t₀}^{t} ẏ(τ) dτ = ∫_{t₀}^{t} f(τ, y(τ)) dτ,
that is,
y(t) = y₀ + ∫_{t₀}^{t} f(τ, y(τ)) dτ. (4.2.1)
Remark 4.2.2
Lemma 4.2.3
Suppose p and q are non-negative, continuous functions on [a, b] satisfying
p(t) ≤ C + k ∫_{t₀}^{t} q(s) p(s) ds,
for all t ∈ [a, b], where t₀ ∈ [a, b] is fixed and C, k are constants with k ≥ 0. Then,
p(t) ≤ C exp(k ∫_{t₀}^{t} q(s) ds). (4.2.2)
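A quick numerical sanity check of the inequality; the data p, q, C, k below are illustrative and chosen so that the hypothesis holds with equality:

```python
import numpy as np

# p(t) = e^t satisfies p(t) = 1 + int_0^t p(s) ds (q = 1, C = 1, k = 1),
# and Gronwall then gives p(t) <= e^t -- here with equality.
t = np.linspace(0.0, 2.0, 201)
p = np.exp(t)
C, k = 1.0, 1.0
rhs = C + k * np.array([np.trapz(p[:i + 1], t[:i + 1]) for i in range(len(t))])
bound = C * np.exp(k * t)
print(np.all(p <= rhs + 1e-9), np.all(p <= bound + 1e-9))   # True True
```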
Remark 4.2.4
Theorem 4.2.5
Proof: Suppose that y and z are two solutions of the IVP (4.1.1) defined
on an interval [c, d ] contained in the interval [t0 − a,t0 + a] and t0 ∈ [c, d ].
Thus by the basic lemma, we have
y(t) = y₀ + ∫_{t₀}^{t} f(τ, y(τ)) dτ and z(t) = y₀ + ∫_{t₀}^{t} f(τ, z(τ)) dτ.
Theorem 4.3.1
Let (t0 , y0 ) ∈ D and a and b be positive constants such that the rectangle
R defined by
R = {(t, y) : |t − t0 | ≤ a, |y − y0 | ≤ b}
is a subset of D. Let M = max_{(t,y)∈R} |f(t, y)| and h = min(a, b/M). Then, IVP (4.1.1) has a unique solution in the interval |t − t₀| ≤ h.
Fig. 4.1 Picard’s theorem
Consider the interval [t0 ,t0 + h]. Similar arguments hold for the interval
[t0 − h,t0 ]. The proof will be established by the construction of successive
approximations, called Picard’s iterates {yn }, n = 0, 1, 2, · · · and showing
that {yn } converges uniformly to some y defined on [t0 ,t0 + h], a solution
of the integral equation (4.2.1). The basic lemma (Lemma 4.2.1), then
gives the existence of a solution to (4.1.1). The proof will be accomplished
through the following three steps.
Step 1: Here we define Picard’s iterates. For t ∈ [t0 ,t0 + h], let
y0 (t ) = y0
and define successively,
yₙ(t) = y₀ + ∫_{t₀}^{t} f(τ, yₙ₋₁(τ)) dτ. (4.3.1)
Assume, as induction hypothesis, that (t, yₙ₋₁(t)) ∈ R₁ and |yₙ₋₁(t) − y₀| ≤ b hold, for all t ∈ [t₀, t₀ + h]. We
show that the same statements are true when yn−1 is replaced by yn and
that completes the induction argument. Since R1 ⊂ R, we have
| f (t, yn−1 (t ))| ≤ M on [t0 ,t0 + h].
Now consider yₙ(t) = y₀ + ∫_{t₀}^{t} f(τ, yₙ₋₁(τ)) dτ. The aforementioned
induction assumption implies that the definition of yn (t ) makes sense and
yn is continuously differentiable on [t0 ,t0 + h]. Now,
|yₙ(t) − y₀| = |∫_{t₀}^{t} f(τ, yₙ₋₁(τ)) dτ| ≤ ∫_{t₀}^{t} |f(τ, yₙ₋₁(τ))| dτ ≤ M(t − t₀) ≤ Mh ≤ b.
Thus, (t, yn (t )) lies in the rectangle R1 and hence, f (t, yn (t )) is defined
and continuous on [t0 ,t0 + h]. Hence, the said properties hold also for yn
and induction is complete.
|yₙ(t) − yₙ₋₁(t)| ≤ ∫_{t₀}^{t} |f(τ, yₙ₋₁(τ)) − f(τ, yₙ₋₂(τ))| dτ
≤ α ∫_{t₀}^{t} |yₙ₋₁(τ) − yₙ₋₂(τ)| dτ
≤ α ∫_{t₀}^{t} (Mα^{n−2}/(n−1)!) (τ − t₀)^{n−1} dτ, by (4.3.3),
= (Mα^{n−1}/(n−1)!) [(τ − t₀)ⁿ/n] evaluated from τ = t₀ to τ = t
= (Mα^{n−1}/n!) (t − t₀)ⁿ.
Therefore, the inequality is true for n. For the case n = 1, we have
|y₁(t) − y₀| ≤ ∫_{t₀}^{t} |f(τ, y₀)| dτ ≤ M(t − t₀).
Thus, the inequality (4.3.2) is true for n = 1, and hence, it is also true for
any n ≥ 1 as shown earlier using mathematical induction.
Consider the series formed by the constants on the right side in the inequality (4.3.2); this series converges to (M/α)(e^{αh} − 1). Now consider the infinite series
∑_{n=1}^{∞} |yₙ(t) − yₙ₋₁(t)|.
By comparison, the iterates converge uniformly to a limit function y, and
y(t) = y₀ + lim_{n→∞} ∫_{t₀}^{t} f(τ, yₙ(τ)) dτ = y₀ + ∫_{t₀}^{t} f(τ, y(τ)) dτ on [t₀, t₀ + h].
Therefore, y satisfies (4.2.1). Thus, by the basic lemma, the limit function
y(t ) satisfies IVP (4.1.1) on [t0 ,t0 + h]. Using similar arguments, one can
show the existence of a solution on the interval [t0 − h,t0 ]. Thus, Picard’s
iterates converge uniformly to the unique solution of IVP (4.1.1). This
completes the proof.
Remark 4.3.2
Definition 4.3.3
Theorem 4.3.4
R = {(t, y) : |t − t0 | ≤ a, |y − y0 | ≤ b} ,
where a, b are positive real numbers. Let
M = max_{(t,y)∈R} |f(t, y)| and h = min(a, b/M).
Then, for given ε > 0, there exists an ε-approximate solution y for the IVP
(4.1.1) on |t − t0 | ≤ h. Note that h does not depend on ε.
(Figure: an ε-approximate polygonal solution starting at (t₀, y₀), with corner points t₁, t₂, t₃ and segment slopes between −M and M, on the interval up to t = t₀ + h.)
Theorem 4.3.5
Proof: Choose εₙ = 1/n, n = 1, 2, · · · . From Theorem 4.3.4, we have for
each εn , there exists an εn -approximate solution, which we denote by
yn (t ), defined on |t − t0 | ≤ h. This implies that
|yn (t ) − y0 | ≤ b
and hence, |yn (t )| ≤ |y0 | + b. Thus, the family of approximate solutions
{yn } is uniformly bounded. Again, from (4.3.6), we have
|yn (t ) − yn (t˜)| ≤ M|t − t˜|, for all t, t˜ ∈ [t0 ,t0 + h].
Therefore, {yn } is an equicontinuous family of functions; see
Chapter 2. Thus, by the Arzela–Ascoli theorem (Theorem 2.2.8), there
exists a subsequence {ynk } of {yn } such that ynk → y uniformly on
[t0 − h,t0 + h] as nk → ∞. This implies that y is continuous and
|y(t ) − y(t˜)| ≤ M|t − t˜|.
We now prove that this limit function y is a required solution to IVP (4.1.1). Consider the error defined by
∆ₙₖ(t) = ẏₙₖ(t) − f(t, yₙₖ(t)) if ẏₙₖ exists, and ∆ₙₖ(t) = 0 otherwise.
Furthermore, we have |∆ₙₖ| ≤ εₙₖ = 1/nₖ. Therefore, we can pass to the limit in (4.3.7), to get
y(t) = y₀ + ∫_{t₀}^{t} f(τ, y(τ)) dτ.
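The ε-approximate solutions of Theorem 4.3.4 are the familiar Euler polygons, and the uniform limit extracted above is what a numerical integrator approximates. A minimal sketch (the IVP ẏ = y, y(0) = 1 is chosen for illustration):

```python
import numpy as np

def euler_polygon(f, t0, y0, h, n):
    """Polygonal approximation on [t0, t0 + h] built from n segments;
    its slope differs from f(t, y(t)) by an epsilon that shrinks with n."""
    ts = np.linspace(t0, t0 + h, n + 1)
    ys = np.empty(n + 1)
    ys[0] = y0
    for i in range(n):
        ys[i + 1] = ys[i] + (ts[i + 1] - ts[i]) * f(ts[i], ys[i])
    return ts, ys

for n in (4, 16, 64, 256):
    ts, ys = euler_polygon(lambda t, y: y, 0.0, 1.0, 1.0, n)
    print(n, abs(ys[-1] - np.e))   # error at t = 1 decreases as n grows
```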
Define X = {y ∈ C[t₀, t₀ + h] : |y(t) − y₀| ≤ b, for all t ∈ [t₀, t₀ + h]}, which is a closed ball in the Banach space C[t₀, t₀ + h] with the sup norm ‖y‖ = sup_{t∈[t₀,t₀+h]} |y(t)|.
|(Ty₁)(t) − (Ty₂)(t)| ≤ α(t − t₀) ‖y₁ − y₂‖.
Successively applying the first and second inequalities, we get
|(T²y₁)(t) − (T²y₂)(t)| ≤ α ∫_{t₀}^{t} |(Ty₁)(τ) − (Ty₂)(τ)| dτ
≤ α² ∫_{t₀}^{t} (τ − t₀) ‖y₁ − y₂‖ dτ
= α² ((t − t₀)²/2) ‖y₁ − y₂‖.
Hence,
‖T²y₁ − T²y₂‖ ≤ (α²h²/2) ‖y₁ − y₂‖.
An induction argument now gives that, for any n ≥ 1,
‖Tⁿy₁ − Tⁿy₂‖ ≤ (αⁿhⁿ/n!) ‖y₁ − y₂‖.
By choosing n large, the quantity αⁿhⁿ/n! can be made less than 1, and hence, Tⁿ is a contraction. Thus, by the generalized Banach contraction principle (Theorem 2.3.2 and its Corollary 2.3.3), T has a unique fixed point. This completes the proof.
Remark 4.3.7
Example 4.3.8
Consider the IVP ẏ = y, y(0) = 1, and take y₀(t) = 1. We successively get
y₁(t) = 1 + ∫₀ᵗ y₀(τ) dτ = 1 + ∫₀ᵗ 1 dτ = 1 + t,
y₂(t) = 1 + ∫₀ᵗ y₁(τ) dτ = 1 + ∫₀ᵗ (1 + τ) dτ = 1 + t + t²/2!,
y₃(t) = 1 + ∫₀ᵗ y₂(τ) dτ = 1 + t + t²/2! + t³/3!,
and, in general,
yₙ(t) = ∑_{m=0}^{n} tᵐ/m! → eᵗ as n → ∞.
But we know that the solution to the IVP is indeed given by y(t ) = et .
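The iterates are easy to generate symbolically; the following sympy sketch reproduces the partial sums computed above:

```python
import sympy as sp

t, tau = sp.symbols('t tau')

# Picard iteration for y' = y, y(0) = 1:
#   y_n(t) = 1 + int_0^t y_{n-1}(tau) dtau.
y = sp.Integer(1)
for n in range(1, 5):
    y = 1 + sp.integrate(y.subs(t, tau), (tau, 0, t))
    print(n, sp.expand(y))
# 1 + t, 1 + t + t**2/2, ... : the Taylor partial sums of exp(t).
```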
Theorem 4.4.1
Proof: We give a proof when t0 = t˜0 . From the basic lemma, the solutions
y and ỹ satisfy the following integral equations:
y(t) = y₀ + ∫_{t₀}^{t} f(τ, y(τ)) dτ, ỹ(t) = ỹ₀ + ∫_{t₀}^{t} f̃(τ, ỹ(τ)) dτ,
for all t ∈ I. Subtracting the second equation from the first, we get
y(t) − ỹ(t) = y₀ − ỹ₀ + ∫_{t₀}^{t} (f(τ, y(τ)) − f̃(τ, ỹ(τ))) dτ.
Example 4.5.1
Then,
φ₀(t) = y₀ + ∫_{t₀}^{t} f(τ, φ₀(τ)) dτ, for t₀ − h ≤ t ≤ t₁,
φ₁(t) = φ₀(t₁) + ∫_{t₁}^{t} f(τ, φ₁(τ)) dτ, for t₁ ≤ t ≤ t₁ + h₁.
Thus, we have
y(t) = y₀ + ∫_{t₀}^{t} f(τ, y(τ)) dτ
Theorem 4.5.2
Theorem 4.5.3
an interval (a, b̄] with b̄ > b. Similar statements hold at the left end
point a.
Definition 4.5.4
Proposition 4.5.5
Proof: If not, suppose J = (α, β ]. In this case, β < ∞. Then, one can
consider the IVP for the same ODE with initial condition at β , to get
a solution in [β , β + h] for some h > 0. This will produce a solution in
(α, β + h], contradicting the maximality of J. A similar contradiction can be arrived at if α is a point in J.
Theorem 4.5.6
We infer the following from the conclusion of the theorem. Only one of
the following statements is true:
• If y(β−) = lim_{t→β−} y(t) exists, then (β, y(β−)) ∈ D̄ \ D, the boundary of D.
• The solution y becomes unbounded near β , that is, given any large
positive number C, there exists t1 < β such that |y(t1 )| ≥ C.
The following examples illustrate both these situations.
The proof of the theorem follows immediately from Theorem 4.5.3.
Example 4.5.7
Consider the equation ẏ = 1/(ty).
Here, the function f(t, y) = 1/(ty) is defined in the entire (t, y)-plane, except on the t-axis and the y-axis. We will consider the IVP in the first quadrant in
the (t, y)-plane: y(t0 ) = y0 where both t0 , y0 are positive. The solution
is given by y = [2 log(t/t₀) + y₀²]^{1/2}. Therefore, the maximal interval of existence is (α, ∞), where α = t₀ e^{−y₀²/2}, and as t → α−, y(t) → 0 with (α, 0) belonging to the boundary of the domain in question.
Example 4.5.8
Consider the equation ẏ = 1/(t + y).
In this case, we take the domain as {(t, y) : t + y > 0} and impose the initial
condition as y(t₀) = y₀ with t₀ + y₀ > 0. By introducing a new variable u(t) = t + y(t), we see that the solution is implicitly given by
e^{u(t)}/(1 + u(t)) = [e^{y₀}/(1 + t₀ + y₀)] eᵗ.
We notice that the maximal interval of existence in this case is given by (α, ∞), where α = t₀ + log(1 + t₀ + y₀) − (t₀ + y₀) < t₀. Again,
it is not hard to see that as t → α−, (t, y(t )) approaches the boundary of
the domain in question, that is, t + y(t ) → 0.
Example 4.5.9
Consider the IVP: ẏ = (π/2)(1 + y²), y(0) = 0.
We now see that the solution y(t) = tan(πt/2) cannot be extended beyond the interval (−1, 1). Note that if we take any rectangle {(t, y) : |t| ≤ a, |y| ≤ b} around the origin (0, 0), then, as in the local existence theorem, we get the existence of a unique solution in an interval [−h, h], where h = min(a, (2/π) b/(1 + b²)), which is always less than 1/π.
ẋ₁ = f₁(t, x₁, x₂, · · · , xₙ)
ẋ₂ = f₂(t, x₁, x₂, · · · , xₙ)
············
ẋₙ = fₙ(t, x₁, x₂, · · · , xₙ)
x(t) = [x₁(t), x₂(t), · · · , xₙ(t)]ᵀ
and
x₀ = [x₀₁, x₀₂, · · · , x₀ₙ]ᵀ.
Here, the superscript T denotes the transpose of a matrix/vector. Using
these notations, the aforementioned system of differential equations can
be written in the following compact form
ẋ = f(t, x), x(t0 ) = x0 . (4.6.1)
An nth order scalar differential equation can be reduced into a system of
n first order differential equations of this form. Such a representation is
known as the state-space representation of the system.
Example 4.6.1
x2 (t ) = y(1) (t ) = ẋ1 (t )
General Theory of Initial Value Problems 127
x3 (t ) = y(2) (t ) = ẋ2 (t )
············
xn (t ) = y(n−1) (t ) = ẋn−1 (t ).
Then,
ẋ₁(t) = x₂(t)
ẋ₂(t) = x₃(t)
············
ẋₙ₋₁(t) = xₙ(t)
ẋₙ(t) = g(t, x₁(t), x₂(t), . . . , xₙ(t))
and the initial conditions reduce to
x1 (t0 ) = x01 , x2 (t0 ) = x02 , · · · , xn (t0 ) = x0n .
In vector notation, we get
ẋ(t ) = f (t, x(t )) , x(t0 ) = x0 ,
where
x0 = [x01 , x02 , . . . , x0n ]T
or, component-wise,
xᵢ(t) = x₀ᵢ + ∫_{t₀}^{t} fᵢ(τ, x(τ)) dτ.
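This reduction is also how higher order equations are fed to numerical solvers in practice. A small sketch, with ÿ = −y as an illustrative choice of g:

```python
import numpy as np
from scipy.integrate import solve_ivp

# y'' = g(t, y, y') with g = -y, written as x1' = x2, x2' = g.
def rhs(t, x):
    x1, x2 = x
    return [x2, -x1]

sol = solve_ivp(rhs, (0.0, np.pi), [0.0, 1.0], dense_output=True)
# The exact solution of this IVP is y = sin t.
print(abs(sol.sol(np.pi / 2)[0] - 1.0))    # close to 0
```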
Following exactly the procedure used for a single equation (see Theorem 4.3.1), we can show that these Picard iterates converge uniformly to the
Theorem 4.6.2
The interval of existence obtained from the theorem need not be the best
possible interval. One can also prove the results on continuation of
solutions to a larger interval as in the case of a single equation in a similar
fashion. We can also introduce the maximal interval of existence and
eventually obtain global solutions under the assumption that f is a
continuous bounded function and is globally Lipschitz with respect to the
x variable in the domain of definition.
Example 4.6.3
|f(t, x) − f(t, x̃)| ≤ 3|x − x̃|.
Thus, f(t, x) is globally Lipschitz continuous with Lipschitz constant less
than or equal to 3. Hence, by the existence and uniqueness theorem, there
exists a unique solution for the given differential system around the given
initial data.
Let A = [ai j ] be a constant n × n matrix. Then, f(t, x) = Ax is
obviously Lipschitz continuous with Lipschitz constant α = kAk. Thus,
the linear system with constant coefficients ẋ = Ax, x(t0 ) = x0 has a
unique solution, and the solution is global. A detailed study of such linear systems is carried out in Chapter 5.
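The global solution of a constant coefficient system is realised numerically by the matrix exponential. A minimal sketch (the matrix is illustrative):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])       # harmonic oscillator as a system
x0 = np.array([1.0, 0.0])

# x(t) = e^{tA} x0 exists for every t: the solution is global.
for t in (0.5, 1.0, 10.0):
    x = expm(t * A) @ x0
    print(t, x, np.allclose(x, [np.cos(t), -np.sin(t)]))   # True
```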
4.7 Exercises
1. Discuss the existence and uniqueness of the solution of the following
IVPs
(a) ẏ = (2/t)y, y(t₀) = y₀, t₀ ≠ 0.
(b) ẏ = (cot t)y, y(1) = 0.
(ii) ẏ = 1/y, y(0) = 0
(iii) ẏ = |y|^{1/2}, y(0) = 0
θ (0) = a0 , θ̇ (0) = a1 .
(a) ẏ = 1/(1 + y²).
(b) ẏ = 1/(1 − y²).
(c) ẏ = sin t/(1 − y²).
(d) ẏ = 1/(y(1 − y)).
11. Prove the continuity of the solution of the equation given below, in
appropriate norm with respect to the initial data x0 and f
ẋ = f(t, x), x(t0 ) = x0 ,
assuming that f is continuous and Lipschitz continuous with respect
to x variable, in the domain of definition.
12. Consider the n-dimensional control system
ẋ = f(t, x(t ), u(t )), x(t0 ) = x0 ,
where the function f : R × Rn × Rm → Rn is continuous and
Lipschitz continuous with respect to x and u, where the continuous
function u(t ) is an external control input applied to the system.
Prove that the system has a unique solution for a given initial
condition x0 and a given control function u(t ). Also, prove the
following:
(a) Let x be the unique solution with initial state x₀ and x̃ be the unique
solution with initial condition x̃0 , for a fixed control input u. Then
there exists K1 > 0 such that
||x − x̃|| ≤ K1 ||x0 − x̃0 ||.
(b) Let xᵤ be the unique solution with a control u and xᵤ̃ be the unique
solution with a control ũ for a fixed initial state x0 . Then prove that
there exists K2 > 0 such that
||xu − xũ || ≤ K2 ||u − ũ||.
With higher order smoothness in the data f, we can get the corresponding higher order smoothness in the solution.
4.8 Notes
This chapter deals with some important topics on the existence and uniqueness of a solution to an ODE. The significance of these topics is explained through several examples so that a beginner starts appreciating the topics. We have included three results on existence: the Cauchy–Peano existence theorem, existence using Picard's iterates and existence using the fixed point theorem. The first one requires only the minimal assumption of continuity, but uniqueness is not guaranteed. The other two results require the assumption of Lipschitz continuity and uniqueness is guaranteed. Gronwall's inequality is stated and proved, which in turn is
used to prove the uniqueness of solutions. Gronwall’s inequality is also
useful for comparison of different solutions with different coefficients and/or initial data. There are other types of uniqueness results; see for
example [AO12]. For general theory, there are many good books, see for
example [CL72, Sim91, SK07, Tay11, MU78, HSD04]. Continuous
dependence on the data is also discussed in detail. Also discussed is the
topic on continuation of solutions to larger intervals; this leads to the
concept of maximal interval of existence of a solution. In particular, the conditions for global existence of a solution are dealt with. An application
of these results is seen in the proof of Perron’s theorem in Chapter 9. A
brief discussion on systems is also carried out.
5
Linear Systems and
Qualitative Analysis
Definition 5.1.1
Definition 5.2.1
this case to bring out the aspects of diagonalizability. The general case is
more involved and Jordan decomposition is the best possible reduction.
We already know that diagonalizability is equivalent to the existence
of n independent eigenvectors. Note that we do not demand n distinct
eigenvalues. However, the existence of n distinct eigenvalues implies the existence of n independent eigenvectors and hence, diagonalizability is guaranteed (distinctness is a sufficient condition). In general, if the algebraic and
geometric multiplicities are equal for all the eigenvalues of a matrix, then
that matrix is diagonalizable.
The reader can show that the matrix [1 0; 1 1] is not diagonalizable.
Theorem 5.3.1
Proof: If the two eigenvalues λ and µ of A are real and distinct, then
the corresponding eigenvectors x and y are linearly independent. If
λ = µ is the double real eigenvalue of A and the corresponding
eigenspace is two-dimensional, then also we obtain two linearly
independent eigenvectors
x and y. In either of these cases, we obtain, with P = [x y] = [x₁ y₁; x₂ y₂], AP = P diag(λ, µ) = P [λ 0; 0 µ]. From now
onwards, note that we may represent matrices in the form [x y], where x
and y are the column vectors of the matrix in question. Since x and y are
independent, the matrix P is invertible, and we get the first form B1 .
Equivalently,
y₁(t) = e^{ta}(y₀₁ cos(tb) − y₀₂ sin(tb)), y₂(t) = e^{ta}(y₀₁ sin(tb) + y₀₂ cos(tb)).
When we go back to the original system corresponding to A, we may write
the solution to the IVP (5.2.1), respectively as
Example 5.3.2
The matrix A = [−1 −3; 0 2] has eigenvalues λ₁ = −1, λ₂ = 2 with corresponding eigenvectors [1; 0] and [−1; 1]. Hence, P = [1 −1; 0 1] and P⁻¹ = [1 1; 0 1]. Further, B = P⁻¹AP = [−1 0; 0 2]. This implies that a linearly equivalent system is given by ẏ₁ = −y₁, ẏ₂ = 2y₂, which is a diagonal system.
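A two-line numerical check of this diagonalization (numpy's eigensolver normalises eigenvectors, but P can be entered directly):

```python
import numpy as np

A = np.array([[-1.0, -3.0],
              [0.0, 2.0]])
P = np.array([[1.0, -1.0],
              [0.0, 1.0]])

B = np.linalg.inv(P) @ A @ P      # should be diag(-1, 2)
print(B)
print(np.linalg.eigvals(A))       # [-1., 2.]
```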
Example 5.3.3
Example 5.3.4
A = [0 1; −3 −2]. Observe that A has the complex eigenvalues given by λ = −1 + i√2 and µ = −1 − i√2. Thus, a = −1 and b = √2 in (5.3.1). A complex eigenvector corresponding to the eigenvalue −1 + i√2 can be computed as x = [1; λ] = [1; −1] + i [0; √2]. Thus, P = [0 1; √2 −1] and P⁻¹ = (1/√2) [1 1; √2 0]. Finally, the solution to the system is given by
x(t) = e⁻ᵗ P [cos(√2 t) −sin(√2 t); sin(√2 t) cos(√2 t)] P⁻¹ x₀.
In mechanics, the position x(t) and the velocity ẋ(t) are called the phases of the system under consideration. We adopt the same terminology for a first order system (5.4.1) and call the components x₁, x₂, · · · , xₙ of the solution x the phases of (5.4.1). Indeed, if Newton's law is transformed to a first order system, we get x₁ = x and x₂ = ẋ. If
x is a solution of (5.4.1) in some interval I ⊂ R, containing t0 , the set
{x(t ) ∈ Rn : t ∈ I} is called a trajectory or an orbit passing through x0 . In
this scenario, Rn is referred to as the phase space (phase plane if n = 2
and phase line if n = 1) of (5.4.1). Thus, the phase space contains all the
trajectories of (5.4.1) passing through different points of the phase space.
Description of all the trajectories of (5.4.1) in the phase space (or phase plane or phase line) is referred to as the phase portrait of (5.4.1), and the analysis involved in this process may be called the phase space (or phase plane or phase line) analysis of (5.4.1).
We now give an example.
Example 5.4.1
Consider the system ẋ₁ = −x₁, ẋ₂ = x₂, whose solution is x₁(t) = x₀₁e⁻ᵗ, x₂(t) = x₀₂eᵗ; the trajectories are the hyperbolas x₁x₂ = constant.
Flow: Given the dynamical system Φ as described earlier, for any fixed t, introduce the map φₜ : Rⁿ → Rⁿ by φₜ(x₀) = Φ(t + t₀, x₀) = x(t + t₀, x₀).
Then, the collection G = {φt : t ∈ R} is called the flow of the system
(5.4.1).
The notion of a flow gives an entirely different perspective which is
quite useful in applications. For example, when we watch fluid flowing,
we normally do not see the trajectory lines (stream lines); rather, we see
a body of fluid moving. This is the view incorporated in the concept of
flow. More precisely, we would like to see a neighbourhood, say U of x0
moving with time. Thus, φt (U ) is the position of all particles at time t,
whose initial position is in U. The collection G satisfies
φ0 (x) = x, φs (φt (x)) = φs+t (x), φ−t (φt (x)) = φt (φ−t (x)) = x,
(5.4.3)
for all x ∈ Rn . The first property comes from the initial condition,
whereas the second property follows from the uniqueness of a solution to
the IVP. This is known as semigroup property. In other words, the flow is
a semigroup. The last property is a consequence of the second one, and
asserts the existence of inverse and hence, the flow G has properties of a
group. Thus, the flow can be visualized as a group action. We remark that
this notion can even be generalized to PDE. In general, every system may
not have the group structure; for example, the heat equation does not
produce a group structure, but only a semigroup structure. However, the
wave equation does produce a group structure.
In the case of an autonomous linear system, that is, if f(x) = Ax, the dynamical system and flow are, respectively, given by Φ(t, x₀) = e^{(t−t₀)A} x₀ and φₜ = e^{tA}.
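The group property (5.4.3) of the linear flow φₜ = e^{tA} can be verified directly; a quick sketch with an arbitrarily chosen matrix:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
s, t = 0.7, 1.3

# phi_s o phi_t = phi_{s+t}, and phi_{-t} inverts phi_t.
print(np.allclose(expm(s * A) @ expm(t * A), expm((s + t) * A)))  # True
print(np.allclose(expm(-t * A) @ expm(t * A), np.eye(2)))         # True
```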
Now consider the system (5.4.1). Here f(x) is the given information or
data which we view as a vector located at the point x and thus producing
a vector at each point in a domain Ω ⊂ Rn , where the ODE is described.
This is what we call a vector field. Thus, a vector field X in a domain Ω ⊂ Rⁿ
is a mapping such that a vector X (x) ∈ Rn is associated with every x ∈ Ω.
We say that the vector field is smooth if this mapping is smooth. The vector
field associated with the system in Example 5.4.1 is represented in Fig. 5.2.
We may ask a question: what is the connection between the vector field and the solutions of a system? It is easy to see that the tangents to the solution curves give the vector field and, conversely, any curve whose tangents come from the vector field is a solution of the system.
Definition 5.4.2
We would like to state an important point at this stage. Note that x(t ) = x̄
for all t is a solution to the system ẋ = f(x) if x̄ is an equilibrium point.
This means that if the motion starts from the equilibrium point, the
trajectory will remain there forever. In physical problems, especially in
mechanics, it represents the steady state solution, the one that does not
change with time. Hence, we not only view an equilibrium point as a
point, but also as a solution to the system.
Since equilibrium point is a steady state solution, we would be
interested in the behaviour of solutions which start close to an
equilibrium point. This is very important since when we make small
errors, we would like to know whether the trajectory also remains in a
neighbourhood of the equilibrium point. This is the motivation behind
stability analysis of equilibrium points.
For a linear system, that is, f(x) = Ax, observe that x̄ = 0 is always
an equilibrium point and in addition, if A is invertible, this is the only
equilibrium point. In general, the set of all equilibrium points is given by
ker(A). In this section, we will characterize various types of equilibrium
points for a 2 × 2 system. We will do this via various examples. Stability
of nonlinear systems will be studied in Chapter 8.
Saddle Point, Node, Focus and Center: Note that in Example 5.4.1, the
first component x1 (t ) → 0, whereas the second component x2 (t ) → ±∞ as
t → ∞ depending on the initial condition. This equilibrium point is called a
saddle point and it is classified as unstable. In fact, this will be the feature
for any system having two real non-zero eigenvalues with opposite sign
except that the trajectories will remain in four parts separated by a different
set of coordinate axes given by the eigenvectors, and which need not be
the standard coordinate axes (see Example 5.3.2).
Example 5.4.3
Let A = [λ 0; 0 λ], λ > 0.
consider A = [−λ 0; 0 −λ] with λ > 0, and both the trajectories will now approach 0 as t → ∞. See Fig. 5.3(b). This equilibrium in both the cases is referred to as a node. In the first case, we have an unstable node and the second case corresponds to a stable node.
Example 5.4.4
Take A = [2 0; 0 1].
This has two distinct eigenvalues, 2 and 1, having the same sign and the
solution is given by x1 (t ) = x01 e2t , x2 (t ) = x02 et . Eliminating t, we will
get x1 = cx22 (see Fig. 5.4(b)) and an unstable node. Again the arrows will
get reversed if we take negative numbers in A and we get a stable node
(see Fig. 5.4(a)). The situation is exactly the same if we replace 2 and 1 by
any two real numbers with the same sign. More generally, the behaviour
of the trajectories will remain the same for any system having two distinct
real eigenvalues of the same sign except that the trajectories will remain
in four parts separated by a different coordinate system.
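Phase portraits like the ones referred to above are easy to draw. A sketch for the unstable node of Example 5.4.4 (the plotting parameters are arbitrary):

```python
import numpy as np
import matplotlib.pyplot as plt

# Vector field of x1' = 2 x1, x2' = x2: an unstable node at the origin.
x1, x2 = np.meshgrid(np.linspace(-2, 2, 25), np.linspace(-2, 2, 25))
plt.streamplot(x1, x2, 2.0 * x1, x2, density=1.2)
plt.xlabel('x1')
plt.ylabel('x2')
plt.title('Unstable node: trajectories x1 = c x2**2')
plt.show()
```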
These two examples cover matrices of the form B1 in Theorem 5.3.1. Now,
we consider an example with a double real eigenvalue, but which is not
diagonalizable.
Example 5.4.5
Now, consider A = [λ 1; 0 λ].
Example 5.4.6
Let A = [a −b; b a].
The solution is x(t) = e^{at} [cos(bt) −sin(bt); sin(bt) cos(bt)] x₀. Indeed, the sign of a will determine the stability; the components of the matrix appearing in the solution are periodic, with the sign of b determining the orientation of the rotation. Of course, we take b ≠ 0 to get the complex (non-real) eigenvalues.
Case (i), a = 0: Note that the matrix C = [cos(bt) −sin(bt); sin(bt) cos(bt)] shows
the periodic nature of the trajectories, rotating around the origin. In fact,
C is a rotation matrix with determinant 1. Thus, we have |x(t )| = |Cx0 | =
|x0 | for all t. In other words, x(t ) rotates around the origin along the circle
of radius |x0 | as t increases or decreases. The rotation will be clockwise
if b < 0 and it is counter-clockwise if b > 0. In this case, the equilibrium
point 0 is referred to as a center. See Fig. 5.6.
(Fig. 5.6: a center; left panel a = 0, b < 0; right panel a = 0, b > 0.)
Case (ii), a ≠ 0: Here also the rotation matrix C acts the same way, but the presence of e^{at} changes the amplitude, making a spiral around the origin.
The spiral moves towards infinity as time increases if a > 0 and it tends to
the origin, if a < 0. The situation leads to four different cases and these
are depicted in Fig. 5.7 and Fig. 5.8. The equilibrium point in this case is
referred to as a focus. It is a stable/unstable focus depending on whether
a < 0/a > 0, respectively.
Example 5.4.7
Take the matrix A = [0 0; 0 −2]. Then, x₁(t) = x₀₁, x₂(t) = x₀₂ e^{−2t}.
In this degenerate case, where one eigenvalue is zero, all the points on
the x₁-axis are equilibrium points. Figure 5.9 is self-explanatory. However, note that since the eigenvalues are distinct, A has two linearly independent eigenvectors. Now, consider A = [0 1; 0 0]. The double eigenvalue 0 has
geometric multiplicity one. The reader should work out the further details
and observe the different behaviour in this case compared to the previous
example.
Definition 5.4.8
4. center if B = [0 −b; b 0], b ≠ 0.
A stable node or a focus is also called a sink and an unstable node or focus
is called a source. If det(A) = 0, then the origin is called a degenerate
equilibrium point.
(Figure: classification of the equilibrium point 0 in terms of the discriminant ∆: saddle point, stable and unstable nodes (∆ > 0), stable and unstable foci (∆ < 0), and a center on the line ∆ = 0.)
according to the case when A has eigenvalues: 3 real (need not be distinct) with 3 independent eigenvectors; 3 real with only two independent eigenvectors; 3 real with only a single independent eigenvector; or one real and two complex eigenvalues, respectively.
Example 5.5.1
Consider A = [1 0 0; 0 1 0; 0 0 −1].
Example 5.5.2
that is,
x₁(t) = e^{ta}(x₀₁ cos(tb) − x₀₂ sin(tb)), x₂(t) = e^{ta}(x₀₁ sin(tb) + x₀₂ cos(tb)).
Now consider the same system with specific signs of the eigenvalues; the reader may work out the same problem with different signs for a, b and λ. Let a > 0, b < 0
and λ > 0. If we take the initial point in the x1 x2 -plane, that is, x03 = 0,
then the entire trajectory will remain in the x1 x2 -plane and it is similar to
a planar trajectory corresponding to an unstable focus, rotating clockwise
as b < 0, with increasing amplitude of the spiral as a > 0. On the other
hand, if the initial point is on the x3 -axis, that is, x01 = 0 = x02 , then the
trajectory will remain on the x3 -axis and, since λ > 0, tend to ±∞ along the
x3 -axis as t → ∞, according to whether x03 is positive or negative. So, if we
put both these arguments together, for a general initial point, the trajectory
will spiral around the x3 -axis, increasing the distance of the spiral from
the x3 -axis, but moving towards ±∞. See Fig. 5.12(b). As another case,
if we take a = 0 and b > 0, we get spirals around the x3 -axis, maintaining
the same distance from the x3 -axis, moving towards ±∞, since λ > 0, but
in the counter-clockwise direction since b > 0. See Fig. 5.12(a).
Theorem 5.5.3
Theorem 5.5.4
" #
cos(b j t ) − sin(b j t )
Note that etB j = ea j t .
sin(b j t ) cos(b j t )
" #
cos(b j t ) − sin(b j t )
Further, observe that represents a pure
sin(b j t ) cos(b j t )
rotation.
for i = 1, · · · , n. Finally,
e^{tA} = exp(t ∑_{j=1}^{n} λ_j P_j).
Definition 5.5.5
A matrix N is said to be nilpotent of order k if there exists an integer k ≥ 1 such that N^{k−1} ≠ 0 and Nᵏ = 0.
For a nilpotent matrix N of order k, we have e^N = ∑_{i=0}^{k−1} Nⁱ/i!. For example, the matrix N = [0 1; 0 0] is nilpotent of order 2 and e^N = I + N = [1 1; 0 1]. The n × n matrix whose only non-zero entries are 1 along the first off-diagonal, that is,
N = [0 1 0 · · · 0; 0 0 1 · · · 0; · · · ; 0 0 0 · · · 1; 0 0 0 · · · 0],
is nilpotent of order n.
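For a nilpotent matrix the exponential series terminates, so the finite sum is exact. A quick check for the 3 × 3 shift matrix:

```python
import numpy as np
from scipy.linalg import expm

N = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])   # nilpotent of order 3: N^3 = 0

finite_sum = np.eye(3) + N + (N @ N) / 2.0   # e^N = I + N + N^2/2!
print(np.allclose(expm(N), finite_sum))      # True
```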
Definition 5.5.6
[Generalized eigenvector] Let λ be an eigenvalue; then any vector v
satisfying (A − λ I)k v = 0 for some k ≥ 1 is called a generalized
eigenvector; if k = 1, it is the usual eigenvector.
It is a known fact that the smallest such k is less than or equal to the algebraic multiplicity of λ. For example, in A = [1 1; 0 1], the vector [1; 0] is an eigenvector. Since we are in dimension two and the algebraic multiplicity of the eigenvalue 1 is 2, the matrix A satisfies (A − I)²v = 0 for any vector v. Hence, any vector can be taken as a generalized eigenvector. We now state the following theorem without proof.
Theorem 5.5.7
Let λ1 , λ2 , · · · , λn be real eigenvalues of an n × n matrix A counted
according to their (algebraic) multiplicity. Then, there exists an
invertible matrix P = [v1 v2 · · · vn ] consisting of generalized
eigenvectors of A such that A = S + N, where S is diagonalizable
using P, that is P−1 SP = diag[λ1 , · · · , λn ] and N = A − S is nilpotent
of order k less than or equal to n. Further, SN = NS.
Since S and N commute, the solution to the linear system (5.2.1) is given by
x(t) = P diag[e^{λ₁t}, · · · , e^{λₙt}] P⁻¹ [I + tN + · · · + t^{k−1}N^{k−1}/(k−1)!] x₀. (5.5.1)
Example 5.5.8
Solve the linear system with A = [3 1; −1 1].
Example 5.5.9
Let A = [1 0 0; −1 2 0; 1 1 2].
linearly independent of v₂; of course, it is also linearly independent of v₁. Thus,
P = [1 0 0; 1 0 1; −2 1 0], P⁻¹ = [1 0 0; 2 0 1; −1 1 0].
Now compute
S = P diag(1, 2, 2) P⁻¹ = [1 0 0; −1 2 0; 2 0 2] and N = A − S = [0 0 0; 0 0 0; −1 1 0],
with N² = 0. The solution is given by
x(t) = [eᵗ 0 0; eᵗ − e^{2t} e^{2t} 0; −2eᵗ + (2 − t)e^{2t} te^{2t} e^{2t}] x₀.
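The closed form can be checked against the matrix exponential (the value of t is arbitrary):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 0.0, 0.0],
              [-1.0, 2.0, 0.0],
              [1.0, 1.0, 2.0]])
t = 0.8
et, e2t = np.exp(t), np.exp(2 * t)

# The solution matrix obtained from the S + N decomposition above.
M = np.array([[et, 0.0, 0.0],
              [et - e2t, e2t, 0.0],
              [-2 * et + (2 - t) * e2t, t * e2t, e2t]])
print(np.allclose(M, expm(t * A)))   # True
```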
If all the generalized eigenvectors are complex, we have the following
theorem.
Theorem 5.5.10
Thus, in the case of all eigenvalues complex, the solution to (5.2.1) is given by
x(t) = P diag[e^{tB₁}, · · · , e^{tBₖ}] P⁻¹ [I + tN + · · · + t^{m−1}N^{m−1}/(m−1)!] x₀.
Here e^{tB_j} = e^{a_j t} [cos(b_j t) −sin(b_j t); sin(b_j t) cos(b_j t)].
The final result is the Jordan form for a general matrix A.
Theorem 5.5.11
[The Jordan Canonical Form] Let A be a real matrix of order
n = k + 2m with real eigenvalues λ1 , λ2 , · · · , λk and complex
eigenvalues λ_j = a_j + ib_j, λ̄_j = a_j − ib_j, j = k + 1, · · · , k + m. Then,
there exists a basis {v1 , v2 , · · · , vk , vk+1 , uk+1 , · · · , vk+m , uk+m } of Rn ,
where v j , j = 1, 2, · · · , k, w j = u j + iv j , j = k + 1, · · · , k + m are
generalized eigenvectors corresponding to the eigenvalues λ j , such
that
P = [v1 v2 ··· vk vk+1 uk+1 ··· vk+m uk+m ]
is invertible and P−1 AP = diag[B1 , B2 , · · · , Br ] is block diagonal with
Jordan blocks B j , j = 1, · · · , r for some r. Further, B j takes one of the
following two forms
B_j = [λ 1 0 · · · 0; 0 λ 1 · · · 0; · · · ; 0 0 · · · λ 1; 0 0 · · · 0 λ] or B_j = [D I₂ 0₂ · · · 0₂; 0₂ D I₂ · · · 0₂; · · · ; 0₂ · · · · · · D I₂; 0₂ · · · · · · 0₂ D],
where D is a 2 × 2 block of the form [a_j −b_j; b_j a_j], I₂ is the 2 × 2 identity matrix and 0₂ is the 2 × 2 zero matrix.
e^{tB} = e^{tλ} e^{tN} = e^{tλ} [1 t t²/2! · · · t^{m−1}/(m−1)!; 0 1 t · · · t^{m−2}/(m−2)!; · · · ; 0 0 0 · · · 1].
Definition 5.6.1
Recall Example 5.4.1, where x1 and x2 axes are two invariant subspaces,
namely E1 = {(x1 , 0), x1 ∈ R} and E2 = {(0, x2 ), x2 ∈ R}. For any initial
condition (x01 , 0) ∈ E1 , we have the solution x(t ) = (x1 (t ), 0) ∈ E1 for
all t. Further, x1 (t ) → 0 as t → ∞. This subspace is referred to as a stable
subspace. On the other hand, for E2 , any solution which starts in E2 ,
remains there for all t, but now it goes to ±∞ as t → ∞. In this case, the
subspace is called an unstable subspace.
In Example 5.4.4, both axes are unstable invariant subspaces and
hence, the entire R2 space is unstable. In Example 5.5.1, the x1 x2 -plane is
the unstable subspace, whereas the x3 -axis is the stable subspace.
The subspace generated by the generalized eigenvectors of an
eigenvalue λ of a matrix A is called the generalized eigenspace
corresponding to λ .
Proposition 5.6.2
v̄ j ∈ ker(A − λ I)k j −1
and thus, v̄ j is a generalized eigenvector and v̄ j ∈ E. Finally, it follows
that Av j = λ v j + v̄ j ∈ E. Hence the proposition.
Now, the entire space Rn can be decomposed into stable, unstable and
center spaces. This is the content of the following theorem. Given an n ×
n real matrix A, denote by E s , E u and E c the subspaces spanned by the
Theorem 5.6.3
Proof: To see the invariance under the flow, it suffices to consider one
of the subspaces, say for E s . Let z be a generalized eigenvector, then, by
Proposition 5.6.2, it follows that Az ∈ E s and hence, Ak z ∈ E s for any
positive integer k. Therefore, it follows that
e^{tA} z = lim_{k→∞} ∑_{j=0}^{k} (tʲ Aʲ z)/j! ∈ Eˢ.
Remark 5.6.4
where I is the identity matrix. Clearly, Φ̃(t, t₀) = Φ(t − t₀) = e^{(t−t₀)A} will satisfy the same differential system with the initial time at t₀, that is,
(d/dt) Φ̃(t, t₀) = A Φ̃(t, t₀), Φ̃(t₀, t₀) = I, (5.7.3)
and the ith column of Φ̃ will satisfy (5.7.1) with g = 0 and x(t0 ) = ei .
Definition 5.7.1
Proposition 5.7.2
flow etA to a varying vector. Thus, look for a solution of the form
x(t ) = etA y(t ), where y(t ) is to be determined so that x(t ) satisfies
(5.7.1).
A simple computation yields
ẋ(t ) = AetA y(t ) + etA ẏ(t ) = Ax(t ) + etA ẏ(t ).
Thus, we need to choose y which satisfies etA ẏ(t ) = g(t ). In other words,
y(t) = y(t₀) + ∫_{t₀}^{t} e^{−sA} g(s) ds = e^{−t₀A} x₀ + ∫_{t₀}^{t} e^{−sA} g(s) ds.
t0 t0
x(t) = e^{(t−t₀)A} x₀ + ∫_{t₀}^{t} e^{(t−s)A} g(s) ds = Φ(t − t₀) x₀ + ∫_{t₀}^{t} Φ(t − s) g(s) ds. (5.7.4)
In the case of the finite dimensional linear control theory, we have g(t ) =
Bu(t ), where B is an n × r matrix and u is an r × 1 control vector. In this
case,
x(t) = Φ(t − t₀) x₀ + ∫_{t₀}^{t} Φ(t − s) B u(s) ds. (5.7.5)
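Formula (5.7.4) is directly computable by quadrature; a minimal sketch for a forced system (A and the forcing g are illustrative):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
g = lambda s: np.array([0.0, np.sin(3.0 * s)])   # forcing term
x0 = np.array([1.0, 0.0])
t0, t = 0.0, 2.0

# x(t) = Phi(t - t0) x0 + int_{t0}^{t} Phi(t - s) g(s) ds,
# the integral evaluated with the trapezoidal rule.
s = np.linspace(t0, t, 2001)
vals = np.array([expm((t - si) * A) @ g(si) for si in s])
x = expm((t - t0) * A) @ x0 + np.trapz(vals, s, axis=0)
print(x)
```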
Remark 5.7.3
The last two properties together are known as semi-group properties and
in this particular ODE system, we have group structure due to the first
property explained here. As remarked earlier, when we consider a PDE,
the heat equation, for example, we may not get a group structure, but a
semi-group structure. We have also noted earlier that it is possible to study
ODEs in infinite dimensional spaces such as a Hilbert space, a Banach
space, etc.
Now the solution to (5.7.6) is given by
x(t ) = Φ(t,t0 )x0 = Ψ(t )Ψ−1 (t0 )x0 .
The matrix Φ(t,t0 ) is known as the transition matrix. The solution of the
non-homogeneous system (5.7.7) is given by
x(t) = Φ(t, t₀) x₀ + ∫_{t₀}^{t} Φ(t, t₀) Φ⁻¹(s, t₀) g(s) ds
= Φ(t, t₀) x₀ + ∫_{t₀}^{t} Φ(t, s) g(s) ds. (5.7.9)
Proposition 5.7.4
Example 5.7.5
Let A(t) = [1 1+t; 0 t].
which gives
det(Φ(t + ∆t,t0 )) = det(I + A(t )∆t ) det Φ(t,t0 ) + O((∆t )2 ).
For the first term, we have
det(I + A(t)∆t) = 1 + ∆t tr(A(t)) + O((∆t)²).
Thus, as ∆t → 0, we see that (d/dt) det(Φ(t, t₀)) = tr(A(t)) det(Φ(t, t₀)). This, upon integration, produces Abel's formula
det Φ(t, t₀) = exp(∫_{t₀}^{t} tr(A(s)) ds).
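Abel's formula admits a direct numerical check; a sketch with an illustrative non-autonomous matrix:

```python
import numpy as np
from scipy.integrate import solve_ivp

A = lambda t: np.array([[np.sin(t), 1.0],
                        [0.0, np.cos(t)]])

# Integrate Phi' = A(t) Phi, Phi(0) = I, as a flattened 4-dimensional system.
def rhs(t, phi):
    return (A(t) @ phi.reshape(2, 2)).ravel()

T = 2.0
sol = solve_ivp(rhs, (0.0, T), np.eye(2).ravel(), rtol=1e-10, atol=1e-12)
det_phi = np.linalg.det(sol.y[:, -1].reshape(2, 2))

# tr A(s) = sin s + cos s, so int_0^T tr A = (1 - cos T) + sin T.
print(det_phi, np.exp((1.0 - np.cos(T)) + np.sin(T)))   # the two agree
```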
but λ x will also satisfy the same system with the same initial condition.
Hence, by uniqueness, we get x(t + T ) = λ x(t ) for all t. This is quasi-
periodicity. Moreover, since λ 6= 0, one can choose an α so that λ = eαT .
Now, it is easy to see that x(t ) = eαt z(t ), where z is a periodic function of
period T . Thus, we have the following theorem.
Theorem 5.7.6
Note that the eigenvalues may, in general, be complex and hence the solutions appear to be complex valued; but this is not the case. This requires a little more work. The interested reader is referred to [Inc26, Lef77] for further
reading.
5.8 Exercises
1. Let A be an n × n matrix with an eigenvalue λ0 of multiplicity n.
Show that the standard basis can be chosen as the basis of
generalized eigenvectors so that B = I, which allows us to write
A = S + N in the appropriate theorem and then represent the
solution.
2. Use the decomposition in Exercise 1 and solve the system ẋ = Ax, where A = [2 0 0; −1 2 0; 1 1 2].
3. Find the general solution and phase portraits of the following
systems
(a) ẋ1 = −x1 + x2 , ẋ2 = −x2 .
(b) ẋ1 = x1 , ẋ2 = 5x2 .
(c) ẋ1 = −x1 − 3x2 , ẋ2 = −2x2 .
(d) ẋ1 = −x2 , ẋ2 = −x1 , ẋ3 = x3 .
(a) [0 −1; 1 1], [1 −1; 1 0], [0 1; 0 −1], [1 1; 1 1], [1 1; −1 1], [1 1; 0 1].
(b) [1 0 0; 0 0 1; 0 1 0], [1 0 0; 0 1 1; 0 0 −1], [1 0 0; 0 0 −1; 0 1 0], [1 1 0; 0 1 1; 0 0 −1], [1 0 0; 1 2 0; 1 2 3], [1 0 0; −1 2 0; 1 0 2].
(c) [−1 0 0 0; 1 −2 0 0; 1 −2 3 0; 1 2 3 −4], [−1 0 0 0; 1 2 0 0; 1 0 2 0; 1 1 0 2], [−3 1 4 0; 0 −3 1 0; 0 0 −3 0; 0 0 0 −3], [2 1 4 0; 0 2 1 −1; 0 0 2 1; 0 0 0 2].
18. The result in the previous exercise, in general, is not true if A(t) and ∫_{t₀}^{t} A(s) ds do not commute. To see this (see Example 5.7.5), work out the details with the following matrix: A(t) = [1 1+t; 0 t]. Also find the solution to the corresponding IVP.
5.9 Notes
Qualitative analysis of linear systems is the main concept in this chapter.
This chapter is also a precursor to the study of stability analysis of
nonlinear systems carried out in Chapter 8. A good reference for this
chapter among others is [Per01]; see also [Tay11, Sim91, SK07, CL72,
HSD04]. A detailed study of 2 × 2 systems is done here by directly
developing the required linear algebra. However, for higher order
systems, the analysis is done by borrowing the Jordan decomposition
theorem from linear algebra. The other notions that have been introduced
are dynamical systems, flow, invariant subspaces, which will also be
useful in the study of nonlinear systems. Non-homogeneous and
non-autonomous systems are studied by introducing the concepts of
fundamental matrix and transition matrix. Floquet theory, which concerns non-autonomous systems with periodic coefficients, is also briefly discussed.
6
Series Solutions: Frobenius Theory
6.1 Introduction
In Chapter 3, we have seen that the solutions of linear first order
equations can be obtained in explicit form by converting the problem
essentially to an integral calculus problem. We have also seen that there is
no general procedure to obtain the solutions of linear second order
equations with variable coefficients, in explicit form. Nevertheless, we
could obtain valuable information about the solutions by exploiting the
linearity, superposition principle, etc. In this chapter, we consider a class
of linear second order equations whose solutions may be written down in
explicit form. Since the solutions will be in the form of an infinite
(power) series, eliciting the qualitative behavior of solutions will be
difficult. The results of this chapter are collectively called Frobenius
theory. Some important equations such as Bessel’s equation, Hermite
equation, Chebyshev equation, Laguerre equation, etc., are included in
the class of equations considered here. Owing to the importance of these
equations, which appear in applications frequently, the major properties
of their solutions have been tabulated in mathematical handbooks. The
interested reader may refer to [AS72]. We restrict our discussion to the
real domain. There are also very interesting and important results for
equations in the complex domain and the reader is referred to [Inc26].
Definition 6.2.1
[Analyticity] A function f : (a, b) → R, where (a, b) is an open
interval in R, is said to be (real) analytic at t0 ∈ (a, b) if there exists
δ > 0 such that (t0 − δ ,t0 + δ ) ⊂ (a, b) and
f(t) = ∑_{n=0}^{∞} aₙ (t − t₀)ⁿ,
for all t ∈ (t₀ − δ, t₀ + δ), where the aₙ are real numbers; that is, f(t) is
represented as a convergent power series in t − t0 in a neighborhood
of t0 . If f is analytic at every point in the interval (a, b), we say that f
is analytic in (a, b).
We now recall certain facts about convergent power series which will be
needed in what follows. For details, see [Apo11, Rud76].
Consider a real power series ∑_{n=0}^{∞} aₙtⁿ and put R⁻¹ = lim sup_{n→∞} |aₙ|^{1/n}.
Then, the given power series converges for all t satisfying |t| < R and
Then, the given power series converges for all t satisfying |t| < R and
diverges for |t| > R; the case of |t| = R is, in general, inconclusive. The
number R is called the radius of convergence of the power series. Note
that R can also take the value 0 or ∞. Put f(t) = ∑_{n=0}^{∞} aₙtⁿ for t ∈ (−R, R).
The following statements hold:
1. The series converges uniformly in any compact subset of (−R, R).
2. The function f is infinitely differentiable in (−R, R) and
f^{(k)}(t) = ∑_{n=0}^{∞} (n+1)(n+2) · · · (n+k) aₙ₊ₖ tⁿ,
Remark 6.2.2
If aₙ ≠ 0 after a certain stage and lim_{n→∞} |aₙ₊₁|/|aₙ| = l, then it is well known that lim_{n→∞} |aₙ|^{1/n} also equals l. Thus, we have an alternative way of calculating the radius of convergence, when applicable.
Example 6.2.3
Theorem 6.2.4
Remark 6.2.5
and
ÿ(t) = ∑_{n=0}^{∞} (n+1)(n+2) aₙ₊₂ tⁿ. (6.3.4)
Therefore, we have
(n+1)(n+2) aₙ₊₂ + aₙ = 0, n = 0, 1, 2, · · · .
The two power series in (6.3.6) are very familiar to us; they represent cos t and sin t respectively. Thus,
y(t) = a₀ cos t + a₁ sin t,
where a0 and a1 are arbitrary real constants. Thus, the series (6.3.2) for y
converges for all t ∈ R. This may be expected as the coefficients in (6.3.1)
are analytic in R.
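The recursion aₙ₊₂ = −aₙ/((n+1)(n+2)) can be run mechanically; starting from (a₀, a₁) = (1, 0) it reproduces the Taylor coefficients of cos t, as the sketch below confirms:

```python
import math

def series_coeffs(a0, a1, N):
    """Coefficients of the power series solution of y'' + y = 0."""
    a = [0.0] * N
    a[0], a[1] = a0, a1
    for n in range(N - 2):
        a[n + 2] = -a[n] / ((n + 1) * (n + 2))
    return a

print(series_coeffs(1.0, 0.0, 8))                        # cos t coefficients
print([(-1)**k / math.factorial(2 * k) for k in range(4)])
```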
Of course, we would have obtained the aforementioned solution
without going through the exercise of power series, as (6.3.1) is an
equation with constant coefficients. Nevertheless, this exercise contains
all the ingredients of a general procedure to obtain series solutions to
linear equations with analytic coefficients. In general, we will not be as
lucky as in this example to recognize the power series in terms of familiar
functions.
We consider one more example before stating the general result. The
second order equation
ÿ − 2t ẏ + 2py = 0, (6.3.7)
where p is a real constant, is termed as Hermite’s equation. If we again
assume the solution in the form (6.3.2), then we obtain from (6.3.7), after
the substitution of expressions in (6.3.2), (6.3.3) and (6.3.4),
∑_{n=0}^{∞} [(n+1)(n+2) aₙ₊₂ − 2n aₙ + 2p aₙ] tⁿ = 0. (6.3.8)
y(t ) = a0 y1 (t ) + a1 y2 (t ), (6.3.9)
where y1 and y2 are given by the following series
y₁(t) = 1 − (2p/2!) t² + (2² p(p−2)/4!) t⁴ − (2³ p(p−2)(p−4)/6!) t⁶ + · · · (6.3.10)
and
y₂(t) = t − (2(p−1)/3!) t³ + (2² (p−1)(p−3)/5!) t⁵ − (2³ (p−1)(p−3)(p−5)/7!) t⁷ + · · · (6.3.11)
By the simple ratio test, it is straightforward to verify that both these series
converge for all t ∈ R. It is also not difficult to see that they are linearly
independent and hence they span the solution space of Hermite’s equation.
We now make the following observations.
First, note that unless p is a non-negative integer, the infinite series for
y1 and y2 do not terminate. If p is a non-negative even integer, the series
for y1 terminates and y1 becomes a polynomial of degree p. Similarly, if
p is a non-negative odd integer, y2 becomes a polynomial of degree p.
Any other polynomial solution of Hermite’s equation is a multiple of one
of these polynomials. It is not difficult to compute these polynomials for
small p. For example, when p = 0, 1, 2, 3, the respective polynomials are
given by 1, t, 1 − 2t², t − (2/3)t³.
Since any constant multiple of these polynomials is also a solution of
Hermite’s equation with p a non-negative integer, it is customary to take
the coefficient of t n , the leading term, as 2n . The resulting polynomials
are then termed as Hermite polynomials and are denoted by Hn (t ). Thus,
H0 (t ) = 1, H1 (t ) = 2t and H3 (t ) = 8t 3 −12t. Hermite polynomials appear
frequently in several applications, especially in quantum mechanics.
The following interesting formula for Hₙ may be deduced from the expressions (6.3.10) and (6.3.11):
Hₙ(t) = (−1)ⁿ e^{t²} (dⁿ/dtⁿ) e^{−t²}.
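This formula generates the polynomials mechanically; a brief sympy sketch (sympy's built-in hermite is used for comparison):

```python
import sympy as sp

t = sp.symbols('t')

def hermite_rodrigues(n):
    """H_n(t) = (-1)^n e^{t^2} d^n/dt^n e^{-t^2}."""
    return sp.expand((-1)**n * sp.exp(t**2) * sp.diff(sp.exp(-t**2), t, n))

for n in range(4):
    print(n, hermite_rodrigues(n), sp.hermite(n, t))
# 1, 2*t, 4*t**2 - 2, 8*t**3 - 12*t: leading coefficient 2^n, as stated.
```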
Theorem 6.3.1
Proof: The uniqueness question has already been dealt with in detail in
Chapter 3. We may take t0 = 0, by changing the variable, if necessary.
Suppose
P(t) = ∑_{n=0}^{∞} pₙtⁿ and Q(t) = ∑_{n=0}^{∞} qₙtⁿ, (6.3.13)
Assuming for the moment that the series for y converges in a small interval
around 0 and the operations of term-by-term differentiation are legitimate,
we obtain
ẏ(t) = ∑_{n=0}^{∞} (n+1) aₙ₊₁ tⁿ, (6.3.15)
and
ÿ(t) = ∑_{n=0}^{∞} (n+1)(n+2) aₙ₊₂ tⁿ. (6.3.16)
" #
∞ n
= ∑ ∑ (k + 1) pn−k ak+1 tn (6.3.17)
n=0 k =0
" #
∞ n
= ∑ ∑ qn−k ak t n. (6.3.18)
n=0 k =0
n
where γn = ∑ αk βn−k .
k =0
(n+1)(n+2)|aₙ₊₂| ≤ (M/rⁿ) ∑_{k=0}^{n} [(k+1)|aₖ₊₁| + |aₖ|] rᵏ + M|aₙ₊₁| r,
where the term M|aₙ₊₁|r is added for the purpose of what follows. Now define b₀ = |a₀|, b₁ = |a₁| and recursively
(n+1)(n+2) bₙ₊₂ = (M/rⁿ) ∑_{k=0}^{n} [(k+1)bₖ₊₁ + bₖ] rᵏ + M bₙ₊₁ r. (6.3.20)
It follows that |aₙ| ≤ bₙ for all n. We now consider the ratios bₙ₊₁/bₙ for large n, for the application of the ratio test. Replace first n by n − 1 and then by n − 2 in (6.3.20) to obtain
n(n+1) bₙ₊₁ = (M/r^{n−1}) ∑_{k=0}^{n−1} [(k+1)bₖ₊₁ + bₖ] rᵏ + M bₙ r
and
(n−1)n bₙ = (M/r^{n−2}) ∑_{k=0}^{n−2} [(k+1)bₖ₊₁ + bₖ] rᵏ + M bₙ₋₁ r.
Multiplying the first expression here by r and using the second, we obtain
r n(n+1) bₙ₊₁ = (M/r^{n−2}) ∑_{k=0}^{n−2} [(k+1)bₖ₊₁ + bₖ] rᵏ + rM(n bₙ + bₙ₋₁) + M bₙ r².
Example 6.4.1
ÿ = ∑_{n=0}^{∞} aₙ (m+n)(m+n−1) t^{m+n−2} = t^{m−2} ∑_{n=0}^{∞} aₙ (m+n)(m+n−1) tⁿ.
For the terms P(t)ẏ and Q(t)y, using (6.4.2) and (6.4.3), we get
P(t)ẏ = ((1/t) ∑_{n=0}^{∞} pₙtⁿ)(∑_{n=0}^{∞} aₙ(m+n) t^{m+n−1})
= t^{m−2} ∑_{n=0}^{∞} [∑_{k=0}^{n} pₙ₋ₖ aₖ (m+k)] tⁿ
= t^{m−2} ∑_{n=0}^{∞} [∑_{k=0}^{n−1} pₙ₋ₖ aₖ (m+k) + p₀ aₙ (m+n)] tⁿ
and
Q(t)y = ((1/t²) ∑_{n=0}^{∞} qₙtⁿ)(∑_{n=0}^{∞} aₙ t^{m+n})
= t^{m−2} ∑_{n=0}^{∞} [∑_{k=0}^{n} qₙ₋ₖ aₖ] tⁿ
= t^{m−2} ∑_{n=0}^{∞} [∑_{k=0}^{n−1} qₙ₋ₖ aₖ + q₀ aₙ] tⁿ.
After the substitution of these expressions for ÿ, P(t )ẏ and Q(t )y in (6.4.1)
and canceling the common factor t m−2 throughout, we obtain
"
∞
∑ an {(m + n)(m + n − 1) + (m + n) p0 + q0 }
n=0
#
n−1
+ ∑ ak {(m + k) pn−k + qn−k } t n = 0. (6.4.5)
k =0
a₀[m(m−1) + mp₀ + q₀] = 0,
a₁[m(m+1) + (m+1)p₀ + q₀] + a₀(mp₁ + q₁) = 0,
a₂[(m+1)(m+2) + (m+2)p₀ + q₀] + a₀(mp₂ + q₂) + a₁[(m+1)p₁ + q₁] = 0,
··· ··· ···
aₙ[(m+n−1)(m+n) + (m+n)p₀ + q₀] + a₀(mpₙ + qₙ) + · · · + aₙ₋₁[(m+n−1)p₁ + q₁] = 0,
··· ··· ···
(6.4.6)
a₀ f(m) = 0,
a₁ f(m+1) + a₀(mp₁ + q₁) = 0,
a₂ f(m+2) + a₀(mp₂ + q₂) + a₁[(m+1)p₁ + q₁] = 0,
··· ··· ···
aₙ f(m+n) + a₀(mpₙ + qₙ) + · · · + aₙ₋₁[(m+n−1)p₁ + q₁] = 0,
··· ··· ···
(6.4.7)
Since a₀ ≠ 0, it follows that f(m) = 0, that is,
m(m − 1) + mp0 + q0 = 0. (6.4.8)
This is the indicial equation, which determines the possible values of the
exponent m in the assumed expression for the solution y. Let m1 and m2
be the roots of (6.4.8). If we choose m = m1 , then, from the
aforementioned expressions, we see that an is determined in terms of
a0 , a1 , · · · , an−1 , successively for n = 1, 2, · · · , provided that
f(m+n) ≠ 0. The process breaks off if f(m+n) = 0. Thus, if
m1 = m2 + n, for some positive integer n, the choice m = m1 gives a
formal solution, but in general, the choice m = m2 does not, since
f (m2 + n) = f (m1 ) = 0. If m1 = m2 , then also we obtain only one
formal solution. In all the other cases, when the roots of the indicial
equation are real, we obtain two linearly independent formal solutions.
The roots of the indicial equation may also be complex, and therefore,
this procedure leads to a formal series with complex coefficients. Since
we are only interested in real solutions, we need to consider real and
imaginary parts of these formal solutions, which in general is quite
complicated and requires tools from complex analysis. We will not
pursue these topics here and the interested reader may refer to [Inc26] for
a discussion on differential equations in the complex domain.
We now state the foregoing discussion in the following theorem.
Theorem 6.4.2
The series in (6.4.9) and (6.4.10) are called Frobenius series. In a specific
problem, it is much preferable to start with a series of the form (6.4.4)
and derive the indicial equation and recursion relations. However, the
recursion formula (6.4.7) finds its main application in the proof of
Theorem 6.4.2, which is similar to the one in the previous section, but is
more delicate because of the presence of the terms f (m + n). We will not
present a proof here and the reader is referred to [Sim91] for details.
The theorem leaves unanswered the cases of m1 = m2 and when m1 −
m2 is a positive integer.
Suppose m1 = m2 and y1 is a solution given by the Frobenius series. We
may now proceed to find a second independent solution by the procedure
described in Chapter 3. Let y2 = y1 v be another solution, where v is a
non-constant function. Then,
v̇ = (1/y₁²) exp(−∫ P(t) dt)
= (1/(t^{2m₁}(a₀ + a₁t + · · ·)²)) exp(−∫ [p₀/t + p₁ + · · ·] dt)
= (1/(t^{2m₁}(a₀ + a₁t + · · ·)²)) exp(−p₀ log t − p₁t − · · ·)
= (1/(t(a₀ + a₁t + · · ·)²)) exp(−p₁t − · · ·)
= (1/t) g(t), say,
where we have used the fact that 2m₁ + p₀ = 1 when m₁ = m₂ and g is an analytic function at t = 0 with g(0) = 1/a₀².
(m+1)² a₁ = 0,
(m+2)² a₂ + 1 = 0,
(m+3)² a₃ + a₁ = 0,
··· ···
Then, unless m is a negative integer, we have aₖ = 0 for odd integers k and
a₂ = −1/(m+2)²,
a₄ = −a₂/(m+4)² = 1/((m+2)²(m+4)²),
·········
Substituting these values in (6.4.12) and (6.4.13), we infer that if
y = tᵐ (1 − t²/(m+2)² + t⁴/((m+2)²(m+4)²) − · · ·) (6.4.14)
and if m is not a negative integer, then
t ÿ + ẏ + t y = m² t^{m−1}. (6.4.15)
Choosing m = 0 in (6.4.14) and (6.4.15), we see that
y = 1 − t²/2² + t⁴/(2²·4²) − · · · (6.4.16)
is a solution of Bessel's equation
t ÿ + ẏ + t y = 0. (6.4.17)
The series in (6.4.16) is denoted by J0 (t ) and is called Bessel’s function
of zero order of the first kind. It is easy to see that J0 (t ) is an even
function of t and converges for all t ∈ R with J0 (0) = 1. We can also see
that the indicial equation for Bessel’s equation is given by m2 = 0; thus,
its roots are equal and equal to zero. We now proceed to find another
independent solution of Bessel’s equation. The general procedure tells us
that the second solution involves a logarithm term. We are going to derive
an expression for the same using (6.4.13). Differentiating both the sides
of (6.4.13) with respect to m and then choosing m = 0, we obtain
t Ÿ₀ + Ẏ₀ + t Y₀ = 0,
where Y₀ = ∂y/∂m evaluated at m = 0. Now, from (6.4.14),
∂y/∂m = tᵐ log t (1 − t²/(m+2)² + t⁴/((m+2)²(m+4)²) − · · ·)
+ tᵐ [ (2t²/(m+2)²)(1/(m+2)) − (2t⁴/((m+2)²(m+4)²))(1/(m+2) + 1/(m+4))
+ (2t⁶/((m+2)²(m+4)²(m+6)²))(1/(m+2) + 1/(m+4) + 1/(m+6)) − · · · ].
Hence, putting m = 0, we obtain
Y₀(t) = J₀(t) log t + t²/2² − (t⁴/(2²·4²))(1 + 1/2) + (t⁶/(2²·4²·6²))(1 + 1/2 + 1/3) − · · · , (6.4.18)
which is called Bessel’s function of the second kind of order zero. Using
1 + 1/2 + · · · + 1/n = log n + γ + εₙ,
where γ is Euler’s constant and εn → 0 as n → ∞, it is straightforward
to check that the power series in (6.4.18) (excluding the term J0 (t ) logt)
converges for all values of t. It follows that the general solution of Bessel’s
equation is given by
y = AJ0 + BY0 ,
for arbitrary constants A, B.
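The series for J₀ converges fast and can be compared against scipy's implementation; a brief sketch:

```python
import numpy as np
from scipy.special import j0

def J0_partial(t, N):
    """Partial sum of 1 - t^2/2^2 + t^4/(2^2 4^2) - ... with N terms."""
    s, term = 1.0, 1.0
    for k in range(1, N):
        term *= -t**2 / (4.0 * k * k)   # ratio of consecutive terms
        s += term
    return s

for t in (0.5, 2.0, 5.0):
    print(t, J0_partial(t, 20), j0(t))   # the values agree closely
```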
Remark 6.4.3
6.5 Exercises
1. Show that the function f in Example 6.2.3 is in C∞ (R) and
f (n) (0) = 0 for n = 1, 2, · · · .
2. Let
y₁(t) = t^{r₁} ∑_{n=0}^{∞} aₙtⁿ, y₂(t) = t^{r₂} ∑_{n=0}^{∞} bₙtⁿ,
for t > 0. Here r1 and r2 are real and unequal; a0 , b0 are non-zero.
State and prove a general theorem concerning the linear
independence of y1 and y2 .
3. Consider the second order Euler equation
t 2 ÿ + at ẏ + by = 0,
where a, b are real. Find two linearly independent solutions of the
equation in each of the following cases by applying the method of
Frobenius.
(a) a = 1/2, b = −1/2.
(b) a = −5, b = 9.
4. Discuss the solution of Legendre’s equation
(1 − t 2 )ÿ − 2t ẏ + a(a + 1)y = 0
in the neighborhoods of t = 1 and t = −1.
5. For each of the following equations, write the indicial equation and
find its roots. Write the form of two linearly independent solutions
6.6 Notes
In this chapter, we have considered a couple of classes of linear second
order equations with variable coefficients whose solutions can be
obtained explicitly, in the form of a power series; see [Sim91]. The
analysis mainly involves proving the convergence of the power series,
obtained heuristically, of a solution. For power series solutions of a
system of linear equations, the reader is referred to [Tay11].
7
Regular Sturm–Liouville Theory
7.1 Introduction
In this chapter, we are going to study certain boundary value problems
(BVP) associated with regular second order linear equations containing a
parameter. More specifically, we will be looking for non-trivial solutions
of the following equation
$$L u(t) \equiv -\frac{d}{dt}\left(p(t)\frac{du}{dt}\right) + q(t)u(t) = \lambda\,\rho(t)u(t).$$
are (infinite) power series, we need to discuss their convergence; the
tools required come from functional analysis (Hilbert space theory), and the
discussion of these topics is outside the purview of the present book.
Before proceeding further, let us consider a simple example
L u = −ü.
It is easy to see that non-trivial solutions to the BVP
L u = λ u, u(0) = u(π ) = 0,
exist if and only if λ = n2 , n ∈ Z \ {0}. Thus, the eigenvalues are n2 ,
n a non-zero integer and the corresponding eigenfunctions are sin(nt ).
If, instead, we consider the boundary conditions as u(−π ) = u(π ) = 0,
the eigenvalues are now n2 /4, n ∈ Z \ {0}, and the eigenfunctions are
sin(nt/2) and cos(nt/2) for n even and n odd respectively. A reader
familiar with Fourier (sine and cosine) series recognizes that any suitable
function satisfying the given boundary conditions, can be written as an
infinite series involving the corresponding eigenfunctions.
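These eigenvalues can also be checked numerically. The sketch below, assuming NumPy is available, discretizes −ü on [0, π] with Dirichlet boundary conditions by finite differences and recovers the leading eigenvalues n²:

```python
# Finite-difference check that -u'' = lambda * u, u(0) = u(pi) = 0,
# has eigenvalues n^2; a minimal sketch assuming NumPy.
import numpy as np

N = 500
h = np.pi / N
# second-difference matrix for -u'' at the interior nodes t_1, ..., t_{N-1}
A = (np.diag(2.0 * np.ones(N - 1)) - np.diag(np.ones(N - 2), 1)
     - np.diag(np.ones(N - 2), -1)) / h**2
print(np.sort(np.linalg.eigvalsh(A))[:4])   # approximately 1, 4, 9, 16
```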
If we now consider the periodic boundary condition u(−π ) = u(π ),
we do obtain the situation of the Fourier series. However, the question of
convergence, especially that of point-wise convergence, of a Fourier series
is a delicate issue.
From this example, we learn that the form of the boundary conditions
plays an important role in the determination of the eigenvalues; it is also
important in making the operator L self-adjoint and in obtaining the
orthogonality of the eigenfunctions, as we will see in the next section.
Theorem 7.2.1
Let
Let
$$L = -\frac{d}{dt}\left(p(t)\frac{d}{dt}\right) + q(t),$$
c1 u1 (b) + c2 u2 (b) = 0.
Therefore, (7.2.2) has only trivial solutions if and only if the matrix
$$A = \begin{bmatrix} u_1(a) & u_2(a) \\ u_1(b) & u_2(b) \end{bmatrix}$$
is non-singular.
It turns out that the more interesting situation is when (7.2.2) has
non-trivial solutions. This will lead to the existence of eigenvalues and
eigenfunctions for L .
We are now going to obtain the self-adjointness of L . To this end, we
introduce the following inner product. For any continuous functions u, v
defined on [a, b], define the inner product by
$$\langle u, v\rangle = \int_a^b u(t)v(t)\, dt.$$
(If u, v are complex valued, v(t) should be replaced by $\overline{v(t)}$, the complex
conjugate.)
We impose the following boundary conditions on functions in C2 [a, b]:
$$\alpha_1 u(a) + \alpha_2 \dot{u}(a) = 0, \quad |\alpha_1| + |\alpha_2| > 0,$$
$$\beta_1 u(b) + \beta_2 \dot{u}(b) = 0, \quad |\beta_1| + |\beta_2| > 0. \qquad (7.2.3)$$
Theorem 7.2.2
Remark 7.2.3
Remark 7.2.4.
Theorem 7.3.1
and therefore
$$\left|\frac{\partial F}{\partial \theta}\right| \le \sup_{t\in[a,b]} |Q(t)| + \sup_{t\in[a,b]} \frac{1}{|P(t)|}.$$
Hence, we obtain a unique solution θ defined on [a, b] for any initial value
θ (a) = γ. Once θ is known, (7.3.5) for r gives
$$r(t) = r(a)\exp\left(-\frac{1}{2}\int_a^t \left[Q(s) - \frac{1}{P(s)}\right]\sin 2\theta(s)\, ds\right),$$
for all t ∈ [a, b]. Each solution of the Prüfer system depends on an initial
amplitude r (a) and an initial phase γ = θ (a). Changing r (a) just
multiplies the solution u by a constant factor. Thus, the zeros of any
solution u can be located by studying only the ODE for the phase θ .
From (7.3.2), we see that the zeros of any non-trivial solution u of
(7.3.1) occur where the phase function θ assumes the values nπ, n ∈ Z. At
these points, cos2 θ = 1 and θ̇ > 0, as follows from (7.3.4). Geometrically,
this means that the curve (P(t )u̇(t ), u(t )), t ∈ [a, b] in the (Pu̇, u) plane,
corresponding to a solution u can cross the Pu̇-axis at θ = nπ only counter-
clockwise.
The advantage of Prüfer substitution in studying the zeros of the
solution u is now evident from (7.3.4) satisfied by the phase variable. It is
only a first order equation for θ and does not contain r and the solution
exists in [a, b] for any given initial condition. We will now make the
following observations, which will be useful when we consider S–L
systems.
1. If θ is a solution of (7.3.4), so are −θ and θ + nπ for any integer n.
We may thus fix the initial condition θ (a) = γ ∈ [0, π ).
2. If we are just interested in the location of the zeros of u, then, it is
sufficient to solve the first order equation (7.3.4) and find the points
where θ takes the values nπ, n a positive integer.
3. Fix a non-negative integer n. If there is a tₙ ∈ [a, b] such that θ(tₙ) =
nπ, then, from (7.3.4), it follows that θ̇(tₙ) = 1/P(tₙ) > 0. Hence,
θ(t) > nπ for t > tₙ, close to tₙ.
We claim that θ (t ) > nπ for all t > tn . For, if there is a t > tn
such that θ (t ) = nπ, then we would have that θ̇ (t ) ≤ 0. But this
contradicts the fact that θ̇(t) = 1/P(t) > 0, which follows from
(7.3.4). Though θ need not be a monotonically increasing function
(it is if Q is also non-negative), it remains above the line θ = nπ, (n
any non-negative integer) once it crosses that line, for all future
times. In particular, for the chosen initial condition, θ > 0 in (a, b].
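Since (7.3.4) is a scalar first order equation, it is also convenient for computation. The following sketch, assuming SciPy is available, integrates the phase equation for ü + 25u = 0 on [0, π] (so P = 1, Q = 25) and counts zeros from the growth of θ:

```python
# Integrate the Prufer phase equation (7.3.4) for u'' + 25u = 0 on [0, pi]:
# theta increases by pi each time the solution u vanishes.
import numpy as np
from scipy.integrate import solve_ivp

def phase(t, th):
    Q, P = 25.0, 1.0
    return [Q * np.sin(th[0]) ** 2 + np.cos(th[0]) ** 2 / P]

sol = solve_ivp(phase, [0.0, np.pi], [0.0], rtol=1e-10, atol=1e-12)
print(sol.y[0, -1] / np.pi)   # ~5.0: u = sin(5t) has five zeros in (0, pi]
```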
The existence of zeros of a non-trivial solution will now be established using
comparison theorems.
Theorem 7.3.2
Theorem 7.3.3
Corollary 7.3.4
[Corollary to Theorem 7.3.2] For any t1 ∈ (a, b], either f (t1 ) < g(t1 )
or f ≡ g in [a,t1 ].
Corollary 7.3.5
Proof: Suppose the conclusion is false. Then, we can find a t1 > a such
that f (t1 ) = g(t1 ). Now, consider the functions φ and ψ defined by
φ (t ) = f (−t ), ψ (t ) = g(−t ), t ∈ [−t1 , −a].
Then, φ and ψ satisfy the DEs
φ̇ (t ) = −F (−t, φ (t )), ψ̇ (t ) = −G(−t, ψ (t )),
for t ∈ [−t1 , −a] and satisfy the condition φ (−t1 ) = ψ (−t1 ). Since
−F (−t, y) ≥ −G(−t, y), we can apply Theorem 7.3.3 in the interval
[−t1 , −a]. We conclude that φ (−a) ≥ ψ (−a). This implies f (a) ≥ g(a),
a contradiction. The proof is complete.
Theorem 7.3.6
We are now in a position to discuss the oscillation results for the solutions
of the Sturm–Liouville system
$$L u(t) \equiv -\frac{d}{dt}\left(p(t)\frac{du}{dt}\right) + q(t)u(t) = \lambda\,\rho(t)u(t), \qquad (7.3.8)$$
satisfying the boundary conditions (7.2.3). Comparing (7.3.8) with the
equation (7.3.1), we find that P = p and Q = λ ρ − q. We would like to
study the number of zeros of a non-trivial solution of (7.3.8) as the real
parameter λ varies. Denote by θ (t, λ ), the corresponding phase variable.
Then, by (7.3.4), we have
$$\dot{\theta}(t, \lambda) = [\lambda\rho(t) - q(t)]\sin^2\theta(t, \lambda) + \frac{1}{p(t)}\cos^2\theta(t, \lambda), \qquad (7.3.9)$$
for t ∈ [a, b]. Recall that p is a positive C1 function and ρ > 0, q are
continuous functions defined on [a, b]. Now, fix a real number γ and
consider the solution θ (t, λ ) of (7.3.9) satisfying θ (a, λ ) = γ, for all λ .
Here, γ is determined by the boundary condition (7.2.3) at a as
$$\alpha_1 \sin\gamma + \frac{\alpha_2}{p(a)}\cos\gamma = 0. \qquad (7.3.10)$$
There is a unique solution γ ∈ [0, π) of (7.3.10). If α₁ ≠ 0, we have tan γ =
−α₂/(α₁ p(a)); if α₁ = 0, then put γ = π/2 (tan γ = ∞).
We are now going to obtain the following results as a direct
consequence of the comparison theorems and their corollaries proved
earlier. See also the observations made on the phase function in the
previous section.
Lemma 7.3.7
For any fixed t > a, the phase variable θ (t, λ ) is a strictly increasing
function of λ .
Lemma 7.3.8
Proof: This follows from the third observation made earlier in the
previous section.
Lemma 7.3.9
Theorem 7.3.10
[Oscillation Theorem]
and t − a < (γ₁ − γ)/(K − m). Thus, the solution curve lies below the straight line in
[a, a₁] for some a₁ > a. Now suppose, if possible, that θ(t, λ) > s(t) for
some t ∈ [a,t1 ]. We will obtain a contradiction for a choice of λ .
By continuity, we can find the smallest t∗ ∈ [a,t1 ] such that θ (t∗ , λ ) =
s(t∗ ) with θ̇ (t∗ , λ ) ≥ m, since the solution curve can cross the straight line
only from below. Then, observe that θ (t∗ , λ ) = s(t∗ ) = γ1 + m(t∗ − a).
Substituting the expression for m and the upper bound for γ1 , we see that
θ(t∗, λ) ∈ [ε, π − ε]. Therefore, sin θ(t∗, λ) ≥ sin ε.
Let ρ∗ = inf_{t∈[a,t₁]} ρ(t) and choose Λ = (m − K)/(ρ∗ sin²ε). Then, remembering
that λ < 0, we obtain the following contradiction:
$$m \le \dot{\theta}(t_*, \lambda) \le \lambda\rho(t_*)\sin^2\theta(t_*, \lambda) + K \le \lambda\rho_*\sin^2\varepsilon + K < m,$$
provided that λ < Λ. This shows that θ (t1 , λ ) < ε if λ < Λ and completes
the proof.
Lemma 7.3.11
Theorem 7.4.1
7.5 Exercises
1. Let u, v satisfy the following equations
$$\frac{d}{dt}\left(P_1(t)\frac{du}{dt}(t)\right) - Q_1(t)u(t) = 0,$$
$$\frac{d}{dt}\left(P_2(t)\frac{dv}{dt}(t)\right) - Q_2(t)v(t) = 0,$$
on some interval in R, where P1 ≥ P2 > 0 are differentiable functions
and Q1 ≥ Q2 are continuous functions. If v does not vanish at any
point in a closed interval [a, b], show that
$$\left[\frac{u}{v}\left(P_1\dot{u}v - P_2u\dot{v}\right)\right]_a^b = \int_a^b (Q_1 - Q_2)u^2\, dt + \int_a^b (P_1 - P_2)\dot{u}^2\, dt + \int_a^b P_2\,\frac{(\dot{u}v - u\dot{v})^2}{v^2}\, dt,$$
where [χ]ₐᵇ = χ(b) − χ(a). This formula is known as Picone's
formula. Deduce the Sturm comparison theorem from it.
2. Suppose u satisfies the following equations
$$\frac{d}{dt}\left(P_1(t)\frac{du}{dt}(t)\right) - Q_1(t)u(t) = 0,$$
$$\frac{d}{dt}\left(P_2(t)\frac{du}{dt}(t)\right) - Q_2(t)u(t) = 0,$$
7.6 Notes
We have studied regular Sturm–Liouville boundary value problems in
this chapter, mainly concentrating on the existence of eigenvalues and the
corresponding eigenfunctions. We have followed the approach in [BR03],
based on the Prüfer substitution. For other approaches to
this problem, the reader is referred to [Inc26, CL72, Sim91, SK07],
among others. We have not done the expansion in terms of
eigenfunctions, as this requires tools from Hilbert space. We also have
not considered the more difficult topic of singular S–L systems.
Representation of solutions of BVP through Green’s function will be
taken up in Chapter 9. We may also use the integral operator defined
through Green’s function to show the existence of eigenvalues and
eigenfunctions; but this also requires tools from functional analysis
(compact operators).
8
Qualitative Theory
8.1 Introduction
Nonlinear dynamics, essentially concentrated around the study of
planetary motions, has some claim to be the most ancient of scientific
problems, perhaps as old as geometry. It, therefore, seems surprising that
until the twentieth century, geometric methods in nonlinear dynamics
were not much pursued. Henri Poincaré is universally acknowledged as
the founder of geometric dynamics, followed by G. D. Birkhoff. But apart
from a few instances, such as the stability analysis of Liapunov,¹
Poincaré's ideas seemed to have had little impact on applied dynamics for
almost half a century. A reason perhaps could be that Poincaré and
Birkhoff concentrated on conservative systems motivated by problems in
celestial mechanics. Dissipative systems, on the other hand, have the
property that an evolving ensemble of states occupies a region of phase
space whose volume decreases with time. Over a long period of time, this
contraction tends to simplify the topological structure of the orbits
in the phase space; this may be true even in an infinite dimensional phase
space, for example, governed by a partial differential equation.
In this chapter, we study the qualitative behavior of solutions to
nonlinear ODE. We wish to do this by plotting the phase portrait of these
systems similar to the one that was done for 2 × 2 linear systems in
Chapter 5. The material in this chapter will be developed through
important examples described in Chapter 1, which will be recalled in the
sequel frequently.
We close this section with a few remarks on the phase portrait. There
is a similarity between the plotting of a phase portrait of a system and
plotting of a plane or space curve given by a parametric representation. In
both the situations, we suppress the independent variable t while plotting
the curve in question. We have already observed this in great detail while
analyzing 2 × 2 linear systems. In general, it will be more difficult to have
a complete phase portrait for nonlinear systems. Physically, the position
vector x(t ) and its velocity vector ẋ(t ) are called phases of the system,
hence the name phase portrait.
Definition 8.2.1
Lemma 8.2.3 shows that any solution passing through x0 may be used to
define O (x0 ) or O + (x0 ) unambiguously. Generally speaking, the phase
space (plane) analysis is about describing all the (positive) orbits of
(8.2.1). The other terminologies used for orbit are trajectory and path.
We will now discuss some important properties of solutions of
autonomous systems. In the following results, statements regarding t
refer to all t ∈ R.
Lemma 8.2.2
Lemma 8.2.3
This lemma shows that O (x0 ) or O + (x0 ) is the same set whether x or y
is used in its definition.
Corollary 8.2.4
Lemma 8.2.5.
Lemma 8.2.6
Remark 8.2.7
Definition 8.2.8
Lemma 8.2.9
Proof: For any fixed h > 0, x(t + h) is also a solution and converges to
ξ as t → ∞. By the mean value theorem, we have
$$x(t + h) - x(t) = h\,\dot{x}(\tilde{t}) = h\,\mathbf{f}(x(\tilde{t}))$$
for some t̃ between t and t + h. Hence, t̃ → ∞ and x(t + h) − x(t) → 0
as t → ∞. By continuity, we therefore get hf(ξ ) = 0 and conclude that
f(ξ ) = 0 as required.
Thus, if a solution x(t ) has a finite limit as t → ±∞, then, the limit is an
equilibrium point. In one dimension, it is easy to see that in the absence
of equilibrium points, all orbits will be unbounded and will not have finite
limits as t → ±∞.
8.2.1 Examples
Example 8.2.10
Example 8.2.11
the equilibrium points are nπ, n ∈ Z. All these equilibrium points are
isolated.
Example 8.2.12
If we take the negative sign in the second equation, then the origin (0, 0) is
the only equilibrium point. On the other hand, for the case of the positive
sign, (0, 0) and (±1, 0) are the equilibrium points. In either case, they are
isolated.
Example 8.2.13
Writing the van der Pol equation (1.2.34) as the following 2D system:
$$\dot{x} = y, \qquad \dot{y} = -\mu(x^2 - 1)y - x, \qquad (8.2.3)$$
Example 8.2.14
We find that (nπ, 0), n ∈ Z are the equilibrium points and each one of them
is isolated.
Example 8.2.15
The origin (0, 0) is the only equilibrium point of this system (Why?).
Example 8.2.16
Writing this as a first order system in x, ẋ, we see that each point on the
line ẋ = 0 is an equilibrium point. Hence, none of the equilibrium points
is isolated.
Definition 8.3.1
Definition 8.3.2
8.3.1 Linearization
We now discuss the linearization around an equilibrium point of (8.2.1).
We assume that f in (8.2.1) is a C2 function. If x̄ is an equilibrium point,
then by Taylor’s formula (see Chapter 2), we have
f(x̄ + y) = f(x̄) + Ay + O(|y|2 ) = Ay + O(|y|2 ), (8.3.1)
where $A = D\mathbf{f}(\bar{x}) \equiv \left(\dfrac{\partial f_i}{\partial x_j}(\bar{x})\right)$ denotes the Jacobian matrix of f at x̄.
Writing x = x̄ + y and ignoring quadratic and higher order terms in y, we
obtain from (8.2.1) and (8.3.1), the following linear system:
ẏ = Ay. (8.3.2)
Theorem 8.3.3
Theorem 8.3.4
ẋ = x², ẋ = −µx + x², µ > 0.
for t ∈ [0,t ∗ ]. The hypothesis (8.3.4) on f means that given any ε > 0, there
is a δ > 0, depending only on ε such that
$$|f(t, x)| \le \frac{\varepsilon}{K}\,|x|, \qquad (8.3.6)$$
for all x satisfying |x| ≤ δ and for all t. Therefore, we have from (8.3.5)
and (8.3.6), using (8.3.4),
$$|x(t)| \le K e^{-\sigma t}|x_0| + \varepsilon \int_0^t e^{-\sigma(t-s)}|x(s)|\, ds, \qquad (8.3.7)$$
as long as the solution x(t ) satisfies the condition |x(t )| ≤ δ , for t ∈ [0,t ∗ ];
this may be achieved by choosing small |x0 | and t ∗ if necessary. Next, by
multiplying the inequality (8.3.7) throughout by eσt and using Gronwall’s
inequality, we obtain
$$e^{\sigma t}|x(t)| \le K|x_0|e^{\varepsilon t}.$$
Choosing ε = σ/2, we obtain the a priori estimate
$$|x(t)| \le K|x_0|e^{-\sigma t/2}, \qquad (8.3.8)$$
provided that |x(t )| ≤ δ . If we choose x0 such that K|x0 | ≤ δ , then, from
(8.3.8), we see that |x(t )| ≤ δ , for all t, where the local solution exists.
Therefore, for the chosen initial data x0 , all the aforementioned
arguments are justified and the solution x(t ) satisfies (8.3.8) in its interval
of existence. This allows us to extend the solution for all t ≥ 0 and (8.3.8)
Definition 8.3.5
8.3.2 Examples
Example 8.3.6
In this case, we also have an explicit formula for the solution, from which
these conclusions may be drawn. We have, with α as earlier
$$x(t) = \begin{cases} n\pi + 2\arctan(Ce^{-t}) & \text{if } n \text{ is odd,} \\ n\pi + 2\arctan(Ce^{t}) & \text{if } n \text{ is even,} \end{cases}$$
Example 8.3.7
In this case, we know that the solutions do not exist for all time; but there is
a maximum interval of existence depending on the initial condition. Here
0 is the only equilibrium point and linearization gives the equation ẏ = 0.
Thus, in the linearization, 0 is stable but not asymptotically stable. For the
nonlinear equation, we have the solution in explicit form:
$$x(t) = \frac{x_0}{1 - x_0 t}$$
with x(0) = x₀. The solution is defined in the interval (1/x₀, ∞) if x₀ < 0 and
in the interval (−∞, 1/x₀) if x₀ > 0. Since any solution is always increasing,
we see that 0 is an unstable equilibrium point and its behavior is more like
a saddle point: if x₀ < 0, then x(t) → 0 as t → ∞ and when x₀ > 0, then
x(t) → ∞ as t → 1/x₀.
Example 8.3.8
The only equilibrium point is (0, 0). The corresponding linearized system
is
ẋ = −y, ẏ = x
and (0, 0) is stable, but not asymptotically stable for the linearized
system. However, by considering the original equations in polar
coordinates, we see that ṙ = r3 , where r2 = x2 + y2 . Thus, r is increasing
and the orbits starting near the origin spiral away from the origin as t
Example 8.3.9
The equilibrium points are (0, 0) and (±1, 0). We now discuss the
linearization of the system around each of these equilibrium points. At
any point (x, y) ∈ R2 , the Jacobian of the right side functions is given by
$$\begin{bmatrix} 0 & 1 \\ 1 - 3x^2 & -\delta \end{bmatrix}.$$
At (0, 0), this becomes $\begin{bmatrix} 0 & 1 \\ 1 & -\delta \end{bmatrix}$, whose eigenvalues are
$\frac{1}{2}\left(-\delta \pm \sqrt{\delta^2 + 4}\right)$. Hence, for δ ≥ 0, there is always one positive
eigenvalue and the equilibrium point (0, 0) is linearly unstable.
At (±1, 0), the Jacobian matrix is given by
$$\begin{bmatrix} 0 & 1 \\ -2 & -\delta \end{bmatrix}.$$
Here, the eigenvalues are $\frac{1}{2}\left(-\delta \pm \sqrt{\delta^2 - 8}\right)$. Hence, the equilibrium points
(±1, 0) are asymptotically stable when δ > 0. If δ = 0, the eigenvalues are
±√2 i and the equilibrium points are now stable, but not asymptotically
stable, in the linear approximation.
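These conclusions can be checked mechanically. The sketch below, assuming NumPy and writing the system as ẋ = y, ẏ = x − x³ − δẋ (the system consistent with the Jacobian displayed above), evaluates the eigenvalues at each equilibrium:

```python
# Eigenvalues of the Jacobian at the equilibria of x' = y, y' = x - x^3 - delta*y,
# the damped Duffing system consistent with the Jacobian above; assumes NumPy.
import numpy as np

def J(x, delta):
    return np.array([[0.0, 1.0], [1.0 - 3.0 * x**2, -delta]])

delta = 0.5
for xeq in (0.0, 1.0, -1.0):
    lam = np.linalg.eigvals(J(xeq, delta))
    print(xeq, lam, bool(np.all(lam.real < 0)))
# (0,0) has a positive eigenvalue (unstable); (+-1, 0) are asymptotically stable
```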
Example 8.3.10
The corresponding Jacobian matrix is $\begin{bmatrix} 0 & 1 \\ -1 & \mu \end{bmatrix}$, whose eigenvalues
are $\frac{1}{2}\left(\mu \pm \sqrt{\mu^2 - 4}\right)$. Thus, (0, 0) is asymptotically stable for µ < 0 and
unstable for µ > 0. For µ = 0, it is stable in the linear approximation; the
original system itself is linear when µ = 0.
Example 8.3.11
We next consider the equilibrium points $\left(\pm\sqrt{b(R-1)}, \pm\sqrt{b(R-1)}, R-1\right)$,
which exist when R > 1. The corresponding Jacobian matrix is given by
$$\begin{bmatrix} -\sigma & \sigma & 0 \\ 1 & -1 & \mp\sqrt{b(R-1)} \\ \pm\sqrt{b(R-1)} & \pm\sqrt{b(R-1)} & -b \end{bmatrix},$$
whose eigenvalues are the roots of the cubic equation
$$\lambda^3 + (\sigma + b + 1)\lambda^2 + (R + \sigma)b\lambda + 2\sigma b(R - 1) = 0.$$
The analysis of this cubic equation becomes more difficult, as there are
three parameters. Being a cubic equation with real coefficients, there is
always a real root, which can be shown to be negative. The Hurwitz
criterion (see, for instance, [Mer97]) shows that the eigenvalues all have
negative real parts if and only if
$$R < \sigma(\sigma + b + 3)(\sigma - b - 1)^{-1},$$
assuming σ > b + 1.
We will not discuss this further and the interested reader can refer to many
works on the subject. For example, see [GH83, Hao84, Wig90]. However,
we wish to make the following remark on the value R = 1, which is special
as observed earlier. As the value of R moves from the region R < 1 to
the region R > 1, we have observed either a change in the stability of an
equilibrium point or increase in the number of equilibrium points. For this
reason, R = 1 is referred to as a bifurcation point. The topic of bifurcation
is an important and difficult part of the qualitative analysis. The interested
reader may refer to the works cited earlier.
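The threshold just described is easy to test numerically. The following sketch, assuming NumPy and using the classical parameter values σ = 10, b = 8/3, examines the roots of the cubic on either side of the critical value:

```python
# Roots of the characteristic cubic at the nontrivial equilibria, against the
# Hurwitz bound R_c = sigma*(sigma + b + 3)/(sigma - b - 1); assumes NumPy.
import numpy as np

sigma, b = 10.0, 8.0 / 3.0
Rc = sigma * (sigma + b + 3.0) / (sigma - b - 1.0)   # ~24.74 for these values
for R in (0.9 * Rc, 1.1 * Rc):
    coeffs = [1.0, sigma + b + 1.0, (R + sigma) * b, 2.0 * sigma * b * (R - 1.0)]
    roots = np.roots(coeffs)
    print(R, bool(np.all(roots.real < 0)))   # True below Rc, False above
```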
Example 8.3.12
where
$$A(t) = \begin{bmatrix} -1 + \frac{3}{2}\cos^2 t & 1 - \frac{3}{2}\cos t\sin t \\ -1 - \frac{3}{2}\cos t\sin t & -1 + \frac{3}{2}\sin^2 t \end{bmatrix}.$$
The eigenvalues of A(t) are given by $\frac{1}{4}\left(-1 \pm \sqrt{7}\,i\right)$ for all t and thus,
they have negative real parts. However, this system has the following two
linearly independent solutions:
$$v_1(t) = e^{t/2}\begin{bmatrix} -\cos t \\ \sin t \end{bmatrix}, \qquad v_2(t) = e^{-t}\begin{bmatrix} \sin t \\ \cos t \end{bmatrix}.$$
Definition 8.4.1
A C1 function V : Ω → R satisfying
(1) V (0) = 0, V (x) > 0 for all x ∈ Ω \ {0},
(2) ∇V · f ≤ 0 in Ω
is called a Liapunov function for (8.2.1).
Theorem 8.4.2
² Later, while discussing conservative equations, we will see that V may be taken as the sum of
kinetic energy and potential energy. Thus, V_c may be thought of as the surface at energy level c.
Since k > 0, V(x(t)) < 0 for large t, which contradicts the positivity of
V. Thus, L = 0, that is, lim_{t→∞} V(x(t)) = 0. We leave it as an exercise to the
reader to show that lim_{t→∞} x(t) exists and the limit is 0, as V(x) > 0 for x ≠ 0.
Thus, 0 is asymptotically stable. The proof is complete.
Theorem 8.4.3
Actually, one does not need to assume the condition (2) that every sphere
around 0 contains a point where V > 0; it may be replaced by a weaker
assumption. Theorem 8.4.3 follows from the following theorem, due to
Chetaev.
Theorem 8.4.4
Claim: This positive orbit crosses C1 after a finite time greater than t0 .
Hence, one more integration shows that V(x(t)) ≥ V(x₀) + m(t − t₀) for
all t > t0 , which contradicts the boundedness of V in K. This completes
the proof.
Example 8.4.5
Example 8.4.6
Choosing c₁ = c₃ > 0 and c₂ = 2c₁, we find that V(x) > 0 for x ≠ 0 and
∇V · f ≡ 0. Thus, the orbits of the system lie on the ellipsoids
$$x_1^2 + 2x_2^2 + x_3^2 = c^2,$$
where c is a constant. These are ellipsoids surrounding the origin and
therefore (0, 0, 0) is stable, but not asymptotically stable. This conclusion
follows directly from the nature of the trajectories. The Liapunov theorem
is not applicable here as the equilibrium point (0, 0, 0) is not isolated.
Note that this system has many more equilibrium points. In fact, the
points (0, 0, c), (0, b, 2) and (a, 0, 1), where a, b, c ∈ R, are all equilibrium
points and none of them is isolated! The reader should be able to construct
suitable Liapunov functions and do the stability analysis.
Example 8.4.7
[Slight modification of Example 8.4.6] Consider
The reader can verify that the origin (0, 0, 0) is the only equilibrium point.
The Jacobian matrix is the same as in Example 8.4.6. If we take V(x) = x₁² +
2x₂² + x₃², we find that ∇V · f = −2(x₁⁴ + 2x₂⁴ + x₃⁴) < 0 for x ≠ 0. Thus,
(0, 0, 0) is asymptotically stable.
Example 8.4.8
Consider the 2D system
ẋ = x² + 2y⁵, ẏ = xy².
The origin (0, 0) is the only equilibrium point. It is easy to see that the
linearization does not reveal much regarding the nature of the equilibrium
point. Consider the Liapunov function V(x, y) = x² − y⁴. This is not a
positive definite function, but it has subsets in any neighborhood of the
origin where it is positive. These subsets are bounded by the parabolas
x = y² and x = −y². Next, along a trajectory (x(t), y(t)) of the given
system, we find that V̇ = ∇V · f = 2x(x² + 2y⁵) − 4y³ · xy² = 2x³, which is
positive whenever x > 0.
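This computation is quick to confirm symbolically; a minimal sketch, assuming SymPy is available:

```python
# SymPy check that V = x^2 - y^4 has derivative 2x^3 along trajectories of
# x' = x^2 + 2y^5, y' = x*y^2, as computed in this example.
import sympy as sp

x, y = sp.symbols('x y')
f = sp.Matrix([x**2 + 2*y**5, x*y**2])
V = x**2 - y**4
Vdot = sp.simplify((sp.Matrix([V]).jacobian([x, y]) * f)[0])
print(Vdot)   # 2*x**3: positive where x > 0, so Chetaev-type instability applies
```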
Definition 8.5.1
Example 8.5.2
$$x\left(y - \frac{x^2}{3}\right) = c,$$
c a constant; c = x₀(y₀ − x₀²/3), x₀, y₀ being the initial values of x and y
respectively.
We choose c = 0. If x0 = 0, then x(t ) = 0 for all t. This in turn gives
the y-axis, which is an invariant set for the given system. It is a stable
manifold, denoted by W^s(0, 0), which is the same as E^s, as any solution of
the given system starting on the y-axis remains there for all t and converges
to 0 as t → ∞. Now suppose x₀ ≠ 0. Then, as c is assumed to be 0, we
have y₀ − x₀²/3 = 0. It is not difficult to see that the parabola given by
$$W^u(0, 0) = \left\{(x, y) \in \mathbb{R}^2 : y = \frac{x^2}{3}\right\}$$
is an invariant manifold of the given nonlinear system. It is unstable as
any non-trivial solution starting on this parabola remains there and moves
away from the origin as t increases.
Example 8.5.3
Theorem 8.5.4
For the description of the subspaces E^s and E^u, see Theorem 5.6.3. A more
delicate and detailed analysis is contained in the following:
Theorem 8.5.5
[Hartman–Grobman Theorem]
8.6.1 Examples
Example 8.6.1
Solving the equation to obtain the solution explicitly, we see that every
solution is periodic. The same conclusion may be reached by analyzing
(8.6.2), which turns out to be
$$(\dot{x}(t))^2 + k(x(t))^2 = 2E$$
and E is obtained from the initial conditions. The orbits in this case are
ellipses (circles if k = 1) surrounding the origin in the phase plane.
Example 8.6.2
Fig. 8.4 Potential function and phase plane for the pendulum equation
Example 8.6.3
$$\frac{1}{2}\dot{x}^2 - \frac{x^2}{2} + \frac{x^4}{4} = E.$$
We will now do a similar phase portrait analysis as in the previous
example. Note that V is symmetric around the origin, that is,
V(x) = V(−x), and attains its minimum at ±1 with V(±1) = −1/4 and
V(±√2) = 0. Thus, E ≥ −1/4 and we consider the following cases. See
Fig. 8.5.
Case 1: E = −1/4: Here, we obtain only the equilibrium solutions
(±1, 0).
Case 2: −1/4 < E < 0: In this case, the values of x are restricted to
the symmetric intervals around the equilibrium points (±1, 0) of length
2b where b > 0 satisfies V (±1 ± b) = E; see the graph of the potential
function in Fig. 8.5. We, then, obtain periodic solutions surrounding each
of these equilibrium points separately.
Case 3: E = 0: Now, we first obtain the equilibrium solution (0, 0).
Thus, any other orbit can reach this equilibrium point only as t → ±∞.
Any such solution x therefore lies either in the interval (0, √2] or
[−√2, 0) and is thus bounded. The direction of the orbit for increasing t
can easily be determined and is shown in Fig. 8.5. In this case, we obtain
2 orbits, one with x(0) > 0 and the other with x(0) < 0. Each one of
Fig. 8.5 Potential function and phase plane for Duffing's equation, with energy levels E = −1/4, −1/4 < E < 0, E = 0 and E > 0
A good way to visualize the variation of the angle the vector v makes with
the x-axis as it moves in the positive direction along Γ, is to place all these
³ A reader familiar with one-variable complex analysis will realize that the Poincaré index is
similar to the notion of the winding number of a closed curve in the plane.
vectors at one point and see how these vectors rotate. The reader should
try this with the simple examples mentioned a little later and more.
Since the angle φ = arctan(v2 /v1 ), it is not difficult to see that the
analytic expression for the index is as follows. Suppose the vector field is
given by v = (v1 , v2 ), where v1 , v2 are smooth real valued functions. Then,
$$I_v(\Gamma) = \frac{1}{2\pi}\int_\Gamma d\phi = \frac{1}{2\pi}\int_\Gamma d\left(\arctan\frac{v_2(x, y)}{v_1(x, y)}\right) = \frac{1}{2\pi}\int_\Gamma \frac{v_1\, dv_2 - v_2\, dv_1}{v_1^2 + v_2^2}. \qquad (8.7.2)$$
It is important to keep the direction right as far as the line integral is
concerned; it is always in the counter-clockwise direction, which we may
call positive direction. Using the parametric representation of Γ, the line
integral in (8.7.2) may be expressed as the following one-dimensional
integral:
$$I_v(\Gamma) = \frac{1}{2\pi}\int_a^b \left(v_1^2 + v_2^2\right)^{-1}\left(v_1\frac{dv_2}{ds} - v_2\frac{dv_1}{ds}\right)\, ds, \qquad (8.7.3)$$
where, in the integrand, vi = vi (x(s), y(s)), i = 1, 2. The derivatives with
respect to s may be evaluated, using the chain rule, in terms of the partial
derivatives of v1 , v2 and the derivatives of x, y. Before proceeding to see
the relevance of the index with the periodic orbits of (8.7.1), we will see
some examples.
Example 8.7.1
Let Γ be the unit circle centered at (−2, 0) and the vector field be
v1 (x, y) = x, v2 (x, y) = y. We now find that Iv (Γ) = 0.
Example 8.7.3
Let Γ again be the unit circle centered at the origin and the vector field
be given by v₁(x, y) = x² − y², v₂(x, y) = 2xy.
The reader should try to visualize the rotation of the given vector field
along the positive direction on Γ as described earlier. We leave it to the
reader as an exercise to show that Iv(Γ) = 2. On the other hand, if we now
take v₁(x, y) = x² − y², v₂(x, y) = −2xy, the index will be −2.
These are typical examples covering many vector fields with isolated
equilibrium points.
Theorem 8.7.4
Let Γ be a closed Jordan curve such that Γ and its interior DΓ do not
contain any equilibrium points of a smooth vector field v. Then,
Iv (Γ) = 0.
4 Green’s theorem is more generally valid for any bounded open set with a smooth boundary.
where
$$Q = -\left(v_1^2 + v_2^2\right)^{-1}\left(v_1\frac{\partial v_2}{\partial x} - v_2\frac{\partial v_1}{\partial x}\right), \qquad P = \left(v_1^2 + v_2^2\right)^{-1}\left(v_1\frac{\partial v_2}{\partial y} - v_2\frac{\partial v_1}{\partial y}\right),$$
and they satisfy the conditions of Green’s theorem, as the denominator is
never zero on Γ and in its interior DΓ . Hence, using (8.7.4), we obtain
$$I_v(\Gamma) = \frac{1}{2\pi}\iint_{D_\Gamma}\left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}\right)\, dx\, dy.$$
It is easy to check that the integrand, after the evaluation of the partial
derivatives, is identically zero. Thus, Iv (Γ) = 0 and the proof is complete.
Corollary 8.7.5
Suppose Γ1 and Γ2 are two simple closed curves in the plane, one
lying in the interior of the other, such that the vector field v has no
equilibrium points on Γ1 , Γ2 and the ‘annular’ region between them.
Then, Iv (Γ1 ) = Iv (Γ2 ).
The conclusion of the corollary is usually stated as: If a closed curve not
containing any equilibrium points of v, is continuously deformed without
crossing any equilibrium points of v, then the index is unchanged. Thus,
the index is in a way independent of the closed curve in question and
is associated with some special points in the plane. This enables us to
talk of the index of an isolated equilibrium point x0 of the vector field
v as the index of any Jordan curve containing x0 in its interior, and no
other equilibrium point of v. This will be denoted by Iv (x0 ). The reader is
advised to look at the examples discussed earlier keeping this discussion
in mind. For isolated equilibrium points, it can be shown that the index
of an equilibrium point, computed from the linearized system, is the same
for the nonlinear system as well. For more details, see [CL72, JS03]. For
linear systems, we can do the computations easily and find the following.
Suppose the vector field v is linear and is given by
v₁(x, y) = ax + by, v₂(x, y) = cx + dy, ad − bc ≠ 0.
Then, the origin is the only equilibrium point of v and one finds, using
(8.7.4), that Iv (0) = the sign of ad − bc. This is left as an exercise to the
reader.
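A numerical sanity check of these indices is straightforward; the sketch below (assuming NumPy; the helper `index` is ours, not from the text) accumulates the continuous angle of v along the circle:

```python
# Numerical winding-number check of the indices in the examples above.
import numpy as np

def index(v1, v2, center=(0.0, 0.0), n=4000):
    s = np.linspace(0.0, 2.0 * np.pi, n + 1)       # closed parametrization
    x, y = center[0] + np.cos(s), center[1] + np.sin(s)
    phi = np.unwrap(np.arctan2(v2(x, y), v1(x, y)))  # continuous angle of v
    return (phi[-1] - phi[0]) / (2.0 * np.pi)

print(index(lambda x, y: x, lambda x, y: y))                     # 1.0
print(index(lambda x, y: x, lambda x, y: y, center=(-2, 0)))     # 0.0
print(index(lambda x, y: x**2 - y**2, lambda x, y: 2 * x * y))   # 2.0
a, b, c, d = 1.0, 2.0, 3.0, 4.0                                  # ad - bc < 0
print(index(lambda x, y: a*x + b*y, lambda x, y: c*x + d*y))     # -1.0
```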
The idea of the proof of the corollary can also be extended to prove the
following general result.
Theorem 8.7.6
Theorem 8.7.7
Let Γ be a closed Jordan curve with a continuous tangent vector v at
each point of Γ, which has no equilibrium points on Γ. Then, Iv (Γ)= 1.
Fig. 8.7 The triangular region T in the (s, t)-plane, with vertices (a, a), (b, b) and (a, b), and the auxiliary vector field ū
We will now construct an auxiliary vector field ū, which will be used to
prove the theorem. Let
T = {(s,t ) : a ≤ s ≤ b, s ≤ t ≤ b},
be the triangular region in the (s,t )-plane. Define ū on T by
ū(s, s) = u(s),
for a ≤ s ≤ b and ū(a, b) = −u(a); at all other points in T , define ū(s,t ) to
be the unit vector in the direction from p(s) to p(t ) on Γ. See Figure 8.7.
Let θ (s,t ) be the angle the vector ū(s,t ) makes with the positive x-axis.
Therefore, θ (a, a) = 0. Since Γ is assumed to lie in the region y ≥ 0,
θ (a,t ) varies from 0 to π as t varies from a to b. Similarly, θ (s, b) varies
from π to 2π as s varies from a to b. Also, by the definition, ū does not
vanish on the boundary Γ̄ of T . Therefore, by Theorem 8.7.4, we conclude
that Iū (Γ̄) = 0. This means that the variation of θ (s, s) as s varies from a
to b is 2π. But this is precisely the same as saying that the variation of the
angle that u makes with the positive x-axis as Γ is traversed once in the
positive direction is 2π. Hence, Iu (Γ) = 1 and the proof is complete.
Theorem 8.7.8
Theorem 8.7.9
Theorem 8.7.9, thus, asserts that equilibrium points are necessary for the
existence of periodic orbits in the plane.
Example 8.7.10
This system does not have any equilibrium points, but has periodic orbits
given by (cost, sint, c), for arbitrary constants c.
Theorem 8.7.11
which vanishes using (8.7.1) and, by hypothesis, the right side is either
positive or negative. This contradiction proves the theorem.
Theorem 8.7.12
orbit which lies in Ω at t = t0 and remains there for all t > t0 , then, C
itself is a periodic orbit or it spirals towards a periodic orbit as t → ∞.
Example 8.7.13
where t₀ is the initial time and c = (1 − r₀²)/r₀² is given in terms of the initial
condition r(t₀) = r₀ > 0. Hence, every orbit that either begins inside
or outside the circle r = 1, spirals towards the circle r = 1 as t → ∞. Thus,
r = 1 is the only periodic orbit for the given system and it is a stable limit
cycle.
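The closed-form expression can be compared with a direct numerical integration. The sketch below assumes the system has the polar form ṙ = r(1 − r²), θ̇ = 1, which is consistent with the solution formula above, and uses SciPy:

```python
# Compare r(t)^2 = 1/(1 + c*exp(-2(t - t0))), c = (1 - r0^2)/r0^2, with a
# direct integration of the assumed polar system r' = r(1 - r^2), theta' = 1.
import numpy as np
from scipy.integrate import solve_ivp

r0, t0, T = 0.2, 0.0, 8.0
sol = solve_ivp(lambda t, u: [u[0] * (1.0 - u[0] ** 2), 1.0],
                [t0, T], [r0, 0.0], rtol=1e-10)
c = (1.0 - r0**2) / r0**2
print(sol.y[0, -1], 1.0 / np.sqrt(1.0 + c * np.exp(-2.0 * (T - t0))))
# both values are close to 1: the orbit spirals onto the limit cycle r = 1
```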
The following theorem, due to Liénard (see [Sim91, SK07]), on the
existence of periodic orbits of second order equations, has particularly
verifiable hypotheses compared to the Poincaré–Bendixson theorem.
Theorem 8.7.14
Example 8.7.15
(van der Pol Equation) Applying Liénard's theorem to the van der Pol
equation,
$$\ddot{x} + \mu(x^2 - 1)\dot{x} + x = 0, \qquad \mu > 0,$$
we see that it possesses a unique periodic orbit, which is a stable limit
cycle.
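The limit cycle guaranteed by Liénard's theorem is easy to observe numerically. A minimal sketch assuming SciPy (the final amplitude near 2 is an empirical observation, not part of the theorem):

```python
# Two different initial states of the van der Pol equation approach the
# same periodic orbit; assumes SciPy.
import numpy as np
from scipy.integrate import solve_ivp

mu = 1.0
def vdp(t, u):
    x, y = u
    return [y, -mu * (x**2 - 1.0) * y - x]

for u0 in ([0.1, 0.0], [4.0, 0.0]):
    sol = solve_ivp(vdp, [0.0, 60.0], u0, rtol=1e-9, max_step=0.05)
    # the amplitude over the tail of the run settles near 2 in both cases
    print(u0, np.max(np.abs(sol.y[0][sol.t > 50.0])))
```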
8.8 Exercises
1. In the following systems, find the equilibrium points, draw the phase
portraits and find explicit solutions wherever possible.
(a) Consider the Lotka–Volterra prey–predator model discussed in
Example 1.2.6. This system is given by ẋ = ax − bxy, ẏ =
−cy + dxy, where a, b, c, d are all positive real numbers.
(b) Consider the system ẋ = y + sin x, ẏ = x − cos x.
(c) Show that all solutions of the system
ẋ = x² + y sin x,
ẏ = 1 + xy cos y,
which start in the first quadrant must remain in the first quadrant
for all future time.
(d) (Competition between two species) Consider the system
ẋ = ax − bxy − ex²,
ẏ = −cy + dxy − f y²,
where the constants a, b, . . . are all positive. If c/d > a/e,
show that every orbit that starts in the first quadrant approaches
the equilibrium point (a/e, 0) as t → ∞. (Note that this system
is a generalization of the logistic model with two competing
species, that is, the Lotka–Volterra prey–predator model).
2. Consider the equation ẍ = e^x. There are no equilibrium points!
Hence, all solutions are unbounded. Find the solution explicitly
using the equation of conservation of energy (8.6.2).
3. Do the same as in the previous exercise for the equation ẍ = −e^x. Is
there any difference?
4. Work out all the details in Example 8.6.2 of the pendulum equation
when E = 2k.
5. Work out all the details in Example 8.6.3 of Duffing’s equation with
no damping when E = 0.
6. Consider the equation −ẍ + xẋ = 0. For this equation, all the
equilibrium points are non-isolated! Solve this equation explicitly.
ẏ = Cx + Dy.
10. Draw the phase portrait of ẍ = ½(x² − 1). Derive the formulas for the
solution x in Example 8.2.6, using the method of separation of variables.
11. Work out all the details in Example 8.6.11.
12. Let (x, y) be a solution of the system in Example 8.5.2 with initial
data (x0 , y0 ) at t = 0. Show that
(a) x(y − x²/3) is a constant.
(b) y(t) − x(t)²/3 = e^{−t}(y₀ − x₀²/3) for all t ∈ R.
8.9 Notes
In this chapter, we have studied the very basic notion of stability of an
equilibrium point of an autonomous system. This is an important aspect
in many physical systems such as mechanical systems, motion of a
satellite, etc. The stability analysis of a hyperbolic equilibrium point of a
general autonomous system follows from the analysis of the linearized
system. This follows from the theorems of Perron and
Hartman–Grobman. However, the case of a non-hyperbolic equilibrium
point is more delicate. We have discussed a powerful tool, namely the
Liapunov function, to deal with this situation. Though the stability results
are easy to state and prove using the method of Liapunov, it is not trivial
to construct a Liapunov function for a general system. For polynomial
vector fields, one may try to use the quadratic forms to generate a
Liapunov function.
In the same way, one can consider the stability of an orbit of a
solution. However, a notion called structural stability is required, which is
not considered here, and which is more involved and complicated.
Another aspect we have completely left out is the study of stability of a
periodic orbit. We have briefly mentioned this during our discussion on
Floquet theory, which concerns linear systems with periodic coefficients. An
interested reader, after thoroughly going through the present chapter, may
look into more advanced texts regarding the other aspects of stability
theory. A good list of references is [CL72, HSD04, Wig90, Per01, JS03].
9
Two Point Boundary Value
Problems
9.1 Introduction
In this chapter, we discuss some boundary value problems (BVPs) for
linear and nonlinear second order equations. These problems arise in a
vast number of practical situations, ranging from physics and engineering to
biology. For a very good collection of such problems and their detailed
descriptions, refer to [AMR95].
The analysis, in the linear case, makes use of the existence of two
linearly independent solutions, discussed thoroughly in Chapter 3. These
are, then, used to construct the so-called Green’s function of the given
BVP, which in turn will generate the required solution.
The nonlinear case is more delicate. We describe a well-known
method, shooting method, to prove the existence and uniqueness of a
solution to BVP. Several examples will be discussed in detail as
illustrations of the theory developed. Doing a phase plane analysis, at
least in the case of autonomous equations, may help decide whether a
solution to the given BVP is possible or not. However, in most situations
one needs to use a suitable numerical scheme to obtain a solution.
The study of the existence and uniqueness of solutions to a BVP is
more difficult than that of an IVP, even in the linear case. We first look at
some examples to see these difficulties.
Example 9.1.1
Example 9.1.2
This represents a steady state heat flow in a rod; the boundary conditions
represent the heat fluxes at the ends of the rod. The given function f
represents the external heating or cooling of the rod.
Assuming that a solution exists, we obtain after an integration that
$$\int_0^1 f(t)\, dt = -\dot{u}(1) + \dot{u}(0) = \gamma_2 + \gamma_1.$$
The left hand side represents the total heat (or cooling) supplied to the rod and
the term on the right side represents the total heat flux at the ends.
Therefore, we immediately conclude that no solution exists if, for
example, f ≡ 1 and γ1 = γ2 = 0. On the other hand, if f (t ) = sin(2πt ),
0 ≤ t ≤ 1 and γ1 = γ2 = 0, then there are infinitely many solutions given
by
$$u(t) = a - \frac{t}{2\pi} + \frac{1}{4\pi^2}\sin(2\pi t),$$
with an arbitrary constant a.
Example 9.1.3
Here a0 , a1 , a2 are real constants. The reader should work out the details
to find out various cases of existence and non-existence of solutions.
" # " #
c1 0
(AW(a) + BW(b)) = . (9.2.5)
c2 0
y1 (t ) y2 (t )
Here, W(t ) = denotes the Wronskian matrix of the
ẏ1 (t ) ẏ2 (t )
solutions y1 and y2 at t. The aforementioned system of linear algebraic
equations has a non-trivial solution if and only if the matrix
AW(a) + BW(b) is singular, that is, its rank is either 0 or 1; notice that
y ≡ 0 is always a solution of (9.2.3) and (9.2.4) (the trivial solution).
Also, note that W(t ) is non-singular and the rank of AW(a) + BW(b)
does not depend on any particular choice of a pair of linearly independent
solutions.
Now, let y0 be any particular solution of the non-homogeneous
equation (9.2.1). Then, any general solution of (9.2.1) is given by
y = y0 + c1 y1 + c2 y2 for arbitrary real constants c1 and c2 . If this y were
to satisfy the boundary conditions (9.2.2), then c1 , c2 should satisfy
" # " #
c1 γ1 − ξ1
(AW(a) + BW(b)) = γ −ξ ≡ , (9.2.6)
c2 γ2 − ξ2
y0 (a) y0 (b)
ξ1
where =A +B . The system of algebraic equations
ξ2 ẏ0 (a) ẏ0 (b)
(9.2.6) has a solution if and only if
rank (AW(a) + BW(b)) = rank [AW(a) + BW(b) γ − ξ ] . (9.2.7)
Note that this condition on ranks is automatically satisfied if
AW(a) + BW(b) is non-singular. We now state the foregoing discussion
in the following theorem.
Theorem 9.2.1
solution is not unique. In this case, the BVP (9.2.3) and (9.2.4) has
non-trivial solutions.
For simplicity of presentation, we now consider the BVP (9.2.1) with the
following homogeneous boundary conditions, replacing the general
boundary conditions in (9.2.2):
y(a) = 0 and y(b) = 0. (9.2.8)
Fix α, β . We wish to derive a formula for the solution y, when it exists,
in the form of an integral involving the ‘input’ function g, much similar to
the case of first order linear equations. We rewrite (9.2.1) in the equivalent
form as
$$\frac{d}{dt}\left[p(t)\dot{y}(t)\right] + q(t)y(t) = f(t), \qquad (9.2.9)$$
where p is a positive C¹ function on [a, b]. Given (9.2.9), it is obvious
that it can be written in the form (9.2.1) by taking α(t) = ṗ(t)/p(t),
β(t) = q(t)/p(t) and g(t) = f(t)/p(t). Conversely, given (9.2.1), we
define $p(t) = \exp\left(\int_a^t \alpha(s)\, ds\right)$ and find that (9.2.1) can be put in the form
(9.2.9) with q(t) = p(t)β(t) and f(t) = p(t)g(t).
We begin with a heuristic description of the method to obtain a
solution to the problem (9.2.9) satisfying the boundary conditions (9.2.8).
Let u1 , u2 be two linearly independent solutions of (9.2.9) with f = 0,
that is, the homogeneous equation corresponding to (9.2.9). By the
method of variation of parameters, we find a general solution of (9.2.9) as
$$y(t) = Au_1(t) + Bu_2(t) + \int_a^t \left[u_1(s)u_2(t) - u_1(t)u_2(s)\right]\frac{f(s)}{W(s)}\, ds, \qquad (9.2.10)$$
where A, B are constants and W is the Wronskian of u1 and u2 defined by
W (t ) = u1 (t )u̇2 (t ) − u̇1 (t )u2 (t ). By the linear independence of u1 , u2 , it
follows that W is never zero. If we now require that y given by (9.2.10)
satisfy the boundary conditions (9.2.8), then we must have, using (9.2.10),
¹ Note that the existence of w₁ and w₂ is not automatic and the given boundary conditions play an
important role in their existence.
$$G(t, s) = \begin{cases} w_1(s)w_2(t) & \text{if } a \le s \le t, \\ w_1(t)w_2(s) & \text{if } t < s \le b. \end{cases}$$
We now directly verify that y given by (9.2.13) is a solution of (9.2.9)
satisfying the boundary conditions (9.2.8), after suitably normalizing
w1 , w2 . Clearly y satisfies (9.2.8) as w1 (a) = 0 and w2 (b) = 0.
Note that
$$\lim_{s\to t^-}\frac{\partial}{\partial t}G(t, s) - \lim_{s\to t^+}\frac{\partial}{\partial t}G(t, s) = w_1(t)\dot{w}_2(t) - \dot{w}_1(t)w_2(t).$$
The right side expression being the Wronskian of w1 , w2 , it follows from
(9.2.9) that this limit equals C/p(t ), where C is a constant. This follows
from the fact that the Wronskian satisfies a first order linear equation; see
Chapter 3. We may normalize w1 , w2 so that C = 1. With this
normalization, we have²
$$\dot{y}(t) = \int_a^t \frac{\partial G}{\partial t}(t, s)f(s)\, ds + G(t, t)f(t) + \int_t^b \frac{\partial G}{\partial t}(t, s)f(s)\, ds - G(t, t)f(t),$$
and therefore,
$$\dot{y}(t) = \dot{w}_2(t)\int_a^t w_1(s)f(s)\, ds + \dot{w}_1(t)\int_t^b w_2(s)f(s)\, ds.$$
Now multiply this expression by p(t ) and differentiate once again with
respect to t to obtain
$$\frac{d}{dt}\left[p(t)\dot{y}(t)\right] = \int_a^t w_1(s)\frac{d}{dt}\left[p(t)\dot{w}_2(t)\right]f(s)\, ds + p(t)w_1(t)\dot{w}_2(t)f(t)$$
$$+ \int_t^b \frac{d}{dt}\left[p(t)\dot{w}_1(t)\right]w_2(s)f(s)\, ds - p(t)\dot{w}_1(t)w_2(t)f(t).$$
Using the normalization and that w1 , w2 satisfy the homogeneous equation
(9.2.9), we see that the expression on the right equals −q(t )y(t ) + f (t ).
This completes the verification that y is indeed a solution of the BVP. We
take up the uniqueness question in the next section.
2 See differentiation under the integral sign in Chapter 2.
$$G(t, s) = \begin{cases} \Phi(t)\Phi^{-1}(s) - \Phi(t)\mathbf{Y}^{-1}N\Phi(b)\Phi^{-1}(s), & \text{if } a \le s \le t; \\ -\Phi(t)\mathbf{Y}^{-1}N\Phi(b)\Phi^{-1}(s), & \text{if } t < s \le b. \end{cases}$$
9.2.2 Examples
Example 9.2.2
In particular, if f ≡ 1, we find that y(t) = ½ t(t − 1).
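The Green's function representation can be checked by quadrature. The sketch below assumes the setting of this example is ÿ = f(t), y(0) = y(1) = 0, for which w₁(t) = t, w₂(t) = t − 1 and G(t, s) = s(t − 1) for s ≤ t, t(s − 1) for s > t; it uses SciPy:

```python
# Quadrature check of y(t) = int_0^1 G(t, s) f(s) ds for the assumed BVP
# y'' = f(t), y(0) = y(1) = 0, against the closed form for f = 1.
import numpy as np
from scipy.integrate import quad

def G(t, s):
    return s * (t - 1.0) if s <= t else t * (s - 1.0)

f = lambda s: 1.0
for t in (0.25, 0.5, 0.75):
    y, _ = quad(lambda s: G(t, s) * f(s), 0.0, 1.0, points=[t])
    print(t, y, 0.5 * t * (t - 1.0))   # the two columns agree
```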
Example 9.2.3
Since sin(2t) and cos(2t) form a basis for the solution space of the
homogeneous equation, we choose w₁(t) = sin(2t) and w₂(t) = −½ cos(2t)
to satisfy the required boundary conditions and normalization
condition. Thus, Green's function is given by
$$G(t, s) = \begin{cases} -\frac{1}{2}\sin(2s)\cos(2t), & \text{if } 0 \le s \le t, \\ -\frac{1}{2}\sin(2t)\cos(2s), & \text{if } t < s \le 1. \end{cases}$$
Example 9.2.4
Theorem 9.3.1
y(t) = y_j(t) ≡ u(t; s_j),
where u solves the IVP (9.3.3).
Theorem 9.3.2
Corollary 9.3.3
with the boundary conditions (9.3.2) and the same conditions on the
coefficients a0 , a1 , b0 , b1 as in Theorem 9.3.1 and Theorem 9.3.2. If p, q
are continuous in [a, b] and q > 0 in [a, b], then, the linear BVP has a
unique solution.
Proof: (of Theorem 9.3.2) It suffices to show that (9.3.5) has a unique
root. If u(t; s) denotes a solution of (9.3.3), put ξ(t) = (∂u/∂s)(t; s). By
differentiating (9.3.3) with respect to s, we obtain
ξ̈ = p(t )ξ̇ + q(t )ξ , (9.3.7)
where
$$p(t) = \frac{\partial f}{\partial \dot{u}}(t, u(t; s), \dot{u}(t; s)) \quad \text{and} \quad q(t) = \frac{\partial f}{\partial u}(t, u(t; s), \dot{u}(t; s)).$$
and thus,
$$\dot{\xi}(t) = \frac{\partial \dot{u}}{\partial s}(t; s) > a_0\exp(-M(t - a)) \ge 0,$$
9.3.1 Examples
Example 9.3.4
This is an equation in conservative form and its phase portrait is not
hard to draw; it is depicted in Fig. 9.1. Note that E = E(t) = ½u̇²(t) −
2u(t) − ⅓u³(t) is the total conserved energy.
If we do not insist that u(1) = 0, but just require that u(b) = 0 for
some b > 0, then we obtain an infinite number of solutions of (9.3.8) by
choosing u(0) = 0 and u̇(0) < 0, as can be seen from the phase portraits
shown in Fig. 9.1.
The requirement of the latter condition that u̇(0) < 0 may also be seen as
follows. From the equation in (9.3.8), we see that u is convex. Thus, if it
satisfies the boundary conditions in (9.3.8), we must have u̇(0) < 0 and
u̇(1) > 0. It is not possible to integrate the given equation explicitly, but
the equation may be solved numerically. We find that if we ‘shoot’ with
u̇(0) slightly less than −1, we obtain a solution of (9.3.8).
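The shooting procedure suggested here is easily automated. The following sketch assumes SciPy and takes the equation in the form ü = 2 + u², which is consistent with the conserved energy E given above; it locates the correct initial slope by root finding:

```python
# Shooting sketch for the BVP u'' = 2 + u^2, u(0) = u(1) = 0 (the form
# consistent with the conserved energy above); assumes SciPy.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

def endpoint(s):
    # integrate the IVP u(0) = 0, u'(0) = s and return u(1; s)
    sol = solve_ivp(lambda t, u: [u[1], 2.0 + u[0] ** 2],
                    [0.0, 1.0], [0.0, s], rtol=1e-10)
    return sol.y[0, -1]

s_star = brentq(endpoint, -2.0, -0.5)
print(s_star)   # slightly less than -1, as suggested by the phase portrait
```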
Example 9.3.5
Here, λ > 0. A complete phase portrait of the equation is shown in Fig. 9.2.
Note that in this case a solution u is concave and therefore, we need to
‘shoot’ with u̇(0) > 0 to possibly obtain a solution of (9.3.9). In the present
situation, an explicit solution is possible. Here, the constant conserved
energy is given by E = E(t) = ½u̇²(t) + λe^{u(t)}.
$$\frac{\kappa}{\cosh(\kappa)} = \frac{\sqrt{\lambda}}{2\sqrt{2}}.$$
The function of κ > 0 on the left side has a unique positive maximum
and tends to 0 as κ → ∞. It follows, therefore, that there is a critical value
λcr of λ such that (9.3.9) has no solution for λ > λcr, a unique solution for
λ = λcr, and two solutions for λ < λcr.
For a slightly different representation of the solution, see [AMR95].
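The critical value λcr can be computed from the displayed equation: maximizing κ/cosh κ amounts to solving κ tanh κ = 1, and then λcr = 8(κ/cosh κ)². A minimal sketch assuming SciPy (the value near 3.5138 is the well-known critical parameter for this problem):

```python
# Critical parameter for (9.3.9): kappa* solves kappa*tanh(kappa) = 1 and
# lambda_cr = 8*(kappa*/cosh(kappa*))^2; assumes SciPy.
import numpy as np
from scipy.optimize import brentq

kappa_star = brentq(lambda k: k * np.tanh(k) - 1.0, 0.5, 2.0)
lam_cr = 8.0 * (kappa_star / np.cosh(kappa_star)) ** 2
print(kappa_star, lam_cr)   # ~1.1997 and ~3.5138
```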
Example 9.3.6
9.4 Exercises
1. Determine the values of λ for which a Green’s function can be
constructed for the equation ÿ + λ y = f (t ), with the following
prescribed boundary conditions. Construct a Green’s function for
all such values of λ :
(a) y(0) = y(1) = 0.
9.5 Notes
Two point BVPs are studied in this chapter for both linear and nonlinear
second order equations. For linear equations, we take advantage of
Let u0 = u0 (s) be a given function defined on the initial curve. The IVP
for the PDE can be defined as follows: Find u = u(x, y) satisfying the PDE
(10.1.1) together with the initial condition
u(x0 (s), y0 (s)) = u0 (s) (10.1.2)
for all s ∈ [0, 1]. The problem of local solvability of IVP is to find u in a
neighborhood in Ω of the initial curve satisfying (10.1.1) and (10.1.2).
Example 10.1.1
Consider the transport equation uy (x, y) + kux (x, y) = 0 with the initial
condition u(x, 0) = u0 (x) on Γ : y = 0 considered in (3.4.1).
Here a = k, b = 1, c = d = 0. Thus, dy/dx = 1/k, or x − ky = constant, are the
characteristics which we have already seen in Example 3.4.1. Note that
we had used t instead of y.
Example 10.1.2
Consider the PDE xuₓ + yu_y = αu with u = φ(x) on the initial curve
y = 1.
It is easy to see that y = cx, c constant, are the characteristic curves and
along any of these curves, u satisfies
$$\frac{d}{dx}u(x, cx) = u_x(x, cx) + u_y(x, cx)\cdot c = u_x + \frac{y}{x}u_y = \frac{\alpha}{x}\,u(x, cx),$$
whose solution is given by u(x, cx) = kx^α. Here, k = k(c) depends on c,
which may differ from characteristic to characteristic. Thus, we have the
general solution u(x, y) = k(y/x)x^α, where k is an arbitrary function. Now
applying the condition u = φ(x) at y = 1, we get
$$\phi(x) = k\left(\frac{1}{x}\right)x^{\alpha} \quad\text{or}\quad k(x) = \phi\left(\frac{1}{x}\right)x^{\alpha},$$
so that u(x, y) = φ(x/y) y^α.
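This closed form can be verified symbolically; a minimal sketch, assuming SymPy is available:

```python
# SymPy check: u(x, y) = phi(x/y) * y^alpha solves x*u_x + y*u_y = alpha*u
# with u(x, 1) = phi(x), as derived in this example.
import sympy as sp

x, y, alpha = sp.symbols('x y alpha', positive=True)
phi = sp.Function('phi')
u = phi(x / y) * y**alpha
residual = sp.simplify(x * sp.diff(u, x) + y * sp.diff(u, y) - alpha * u)
print(residual, u.subs(y, 1))   # 0 and phi(x)
```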
Definition 10.2.1
satisfies
$$\frac{dz}{dt} = u_x\frac{dx}{dt} + u_y\frac{dy}{dt} = au_x + bu_y = c.$$
Thus, the curve (x(t), y(t), z(t)), with z(t) = u(x(t), y(t)), satisfies the system
(10.2.4) and hence, it is the characteristic through (x₀, y₀, z₀). Moreover,
it lies on ∑ by definition. Further, if two integral surfaces intersect at a
point, then the characteristic curve through the point would lie on both
the surfaces and hence, they intersect along the whole characteristic
through this common point. With this detailed discussion, we can now
formulate the initial value problem as follows:
Theorem 10.2.2
space curve Γ̄0 by (10.2.5) and assume that the transversality condition
holds:
$$a(x_0(s), y_0(s), u_0(s))\frac{dy_0}{ds} - b(x_0(s), y_0(s), u_0(s))\frac{dx_0}{ds} \ne 0 \qquad (10.2.6)$$
for all 0 ≤ s ≤ 1. Then, there exists a unique solution u(x, y) defined in
some neighborhood of the initial curve Γ0 , which satisfies the PDE
(10.2.2) and the initial conditions
u(x0 (s), y0 (s)) = u0 (s). (10.2.7)
The theorem thus, confirms that there is an integral surface through the
space curve Γ̄0 in some neighborhood.
for small t, with x(0, s) = x0 (s), y(0, s) = y0 (s), u(0, s) = u(x0 (s),
y0 (s)) = u0 (s). Note that the derivatives of x, y, u with respect to s and t
are continuous. Thus, we can solve for u along the characteristic curve.
But we need to do more because we need to solve for u in any arbitrary
point in the neighborhood of the initial space curve as we discussed
earlier in the linear case; in other words, we need to answer the question:
does there exist a characteristic curve passing through any arbitrary point
in the neighborhood and meeting the initial space curve? Since the
Jacobian
$$\left.\frac{\partial(x, y)}{\partial(s, t)}\right|_{t=0} = \left.\begin{vmatrix} \dfrac{\partial x}{\partial s} & \dfrac{\partial x}{\partial t} \\[4pt] \dfrac{\partial y}{\partial s} & \dfrac{\partial y}{\partial t} \end{vmatrix}\right|_{t=0} = b\frac{dx_0}{ds} - a\frac{dy_0}{ds} \ne 0,$$
we can invoke the inverse function theorem ([Apo11, Rud76]) to obtain
s,t as functions of x, y in some neighborhood of the initial curve t = 0, say
s = s(x, y),t = t (x, y). Now define
ϕ (x, y) = u(s(x, y),t (x, y)).
One can verify that ϕ is the unique solution satisfying the initial
conditions.
Example 10.2.3
Remark 10.3.1
In the quasi-linear case, the cone degenerates into a straight line whose
direction is given by (a, b, c).
At each point, the surface will be tangent to a Monge cone. The line of
contact of the surface and the cones define a field of directions on the
surface called the characteristic directions and the integral curves of this
field define a family of characteristic curves. The Monge cone at
(x0 , y0 , z0 ) is the envelope of the one parameter family of planes (whose
normal is ( p, q, −1)) which can be written as
z − z0 = p(x − x0 ) + q(y − y0 ), (10.3.3)
where p, q solves (10.3.2). By solving (10.3.2) for q in terms of p as
q = q(x0 , y0 , z0 , p), we can write (10.3.3) as
z − z0 = p(x − x0 ) + q(x0 , y0 , z0 , p)(y − y0 )
which is a one parameter family of planes describing the Monge cone.
Differentiating with respect to p, we get
$$0 = (x - x_0) + (y - y_0)\frac{dq}{dp}.$$
From (10.3.2), we have
$$\frac{dF}{dp} = F_p + F_q\frac{dq}{dp} = 0. \qquad (10.3.4)$$
Eliminating dq/dp, the equations describing the Monge cone can be written
as
$$F(x_0, y_0, z_0, p, q) = 0, \qquad z - z_0 = p(x - x_0) + q(y - y_0), \qquad \frac{x - x_0}{F_p} = \frac{y - y_0}{F_q}. \qquad (10.3.5)$$
Given p and q, the last two equations give the line of contact between the
tangent plane and the cone. The last two equations can be written as
$$\frac{x - x_0}{F_p} = \frac{y - y_0}{F_q} = \frac{z - z_0}{pF_p + qF_q}. \qquad (10.3.6)$$
Thus, on the given integral surface, at each point, p₀ = p(x₀, y₀) and q₀ =
q(x₀, y₀) are known, and the tangent plane
$$z - z_0 = p_0(x - x_0) + q_0(y - y_0),$$
together with the third equation in (10.3.5) determines the line of contact
with the Monge cone given by (10.3.6) or the characteristic direction.
Thus, the characteristic curves are given by the system of ODE
$$\frac{dx}{F_p} = \frac{dy}{F_q} = \frac{dz}{pF_p + qF_q},$$
or
$$\frac{dx}{dt} = F_p, \qquad \frac{dy}{dt} = F_q, \qquad \frac{dz}{dt} = pF_p + qF_q. \qquad (10.3.7)$$
As there are five unknowns x(t ), y(t ), z(t ), p(t ), q(t ), we need two more
equations to complete the system (10.3.7). But along a characteristic curve
on the given integral surface, we have
$$\frac{dp}{dt} = p_x\frac{dx}{dt} + p_y\frac{dy}{dt} = p_xF_p + p_yF_q, \qquad \frac{dq}{dt} = q_xF_p + q_yF_q. \qquad (10.3.8)$$
However, px , py , qx , qy are second derivatives of u which are undesirable;
we need to eliminate them. Differentiating the given PDE with respect to
x and y, we obtain
$$F_x + F_z p + F_p p_x + F_q q_x = 0, \qquad F_y + F_z q + F_p p_y + F_q q_y = 0,$$
so that (10.3.8) becomes
$$\frac{dp}{dt} = -F_x - F_z p, \qquad \frac{dq}{dt} = -F_y - F_z q, \qquad (10.3.9)$$
where we have used $p_y = \dfrac{\partial^2 u}{\partial y\,\partial x} = q_x$. Thus, on the integral surface
z = u(x, y), we have a family of characteristic curves with coordinates
x(t ), y(t ), z(t ) along with the numbers p(t ), q(t ) and which is given by
the system (10.3.7), (10.3.9). Moreover along the curve, we have
$$\frac{dF}{dt} = F_x\frac{dx}{dt} + F_y\frac{dy}{dt} + F_z\frac{dz}{dt} + F_p\frac{dp}{dt} + F_q\frac{dq}{dt},$$
and we readily see that dF/dt = 0 using (10.3.7) and (10.3.9), showing that
F = constant, is an integral of ODE. Thus, if F = 0 is satisfied at an
initial point x0 , y0 , z0 , p0 , q0 for t = 0, then (10.3.7), (10.3.9) will determine
a unique solution x(t ), y(t ), z(t ), p(t ), q(t ) passing through this point and
along which F = 0 will be satisfied for all t.
Hence, a solution can be interpreted using these five numbers and is
called a strip, that is, a space curve x = x(t ), y = y(t ), z = z(t ) and, along
it, a family of tangent planes whose normal directions are ( p(t ), q(t ), −1).
Theorem 10.3.2
Example 10.3.3
Remark 10.3.4
Remark 10.3.5
One can easily deduce the quasi-linear and linear case from the general
equations. In the quasi-linear case
F (x, y, z, p, q) = a(x, y, z) p + b(x, y, z)q − c(x, y, z) = 0.
Thus, F_p = a, F_q = b and pF_p + qF_q = c; hence, the equations in
(10.3.7) are independent of p and q, and so they can be solved to
determine the characteristic curves (x(t), y(t), z(t)). But, in the
nonlinear case, one has to solve for (x(t ), y(t ), z(t )) together with the
direction numbers p and q. Moreover, in the quasi-linear case, the
Monge cone equations (10.3.6) reduce to
$$\frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c},$$
which represents the equation of a line in the space showing that the
Monge cone degenerates to a line.
In the linear case, a and b are independent of z as well, so that the
first two equations in (10.3.7) form a complete system for x and y;
the characteristic curves are plane curves, that is, the curves lie on the
(x, y) plane. Moreover, the third equation reduces to
$$\frac{du}{dt}(x(t), y(t)) = \frac{dz}{dt}(t) = c(x(t), y(t)),$$
which can be solved to obtain u.
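The characteristic system (10.3.7), (10.3.9) can also be integrated numerically. The sketch below, assuming SciPy, does this for the eikonal equation F = p² + q² − 1 = 0 with u = 0 on the line x + y = 1 (compare Exercise 9 of this chapter), where the strip condition forces p = q = ±1/√2 on the initial line:

```python
# Characteristic-strip integration of (10.3.7) + (10.3.9) for the eikonal
# equation F = p^2 + q^2 - 1 = 0 with u = 0 on x + y = 1; assumes SciPy.
import numpy as np
from scipy.integrate import solve_ivp

def strip(t, w):
    x, y, z, p, q = w
    # dx/dt = F_p, dy/dt = F_q, dz/dt = p*F_p + q*F_q, dp/dt = dq/dt = 0
    return [2 * p, 2 * q, 2 * (p**2 + q**2), 0.0, 0.0]

s, p0 = 0.3, 1.0 / np.sqrt(2.0)          # a point (s, 1 - s) on the initial line
sol = solve_ivp(strip, [0.0, 0.5], [s, 1.0 - s, 0.0, p0, p0], rtol=1e-10)
x, y, z = sol.y[0, -1], sol.y[1, -1], sol.y[2, -1]
print(z, (x + y - 1.0) / np.sqrt(2.0))   # agree: u = (x + y - 1)/sqrt(2)
```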
Theorem 10.4.1
Example 10.4.2
10.5 Exercises
1. Find and sketch some sample characteristic curves of the PDE
(x + 2)ux + 2yuy = 2u
(b) $u(x, 0) = \begin{cases} 1 & \text{if } x < 0 \\ 0 & \text{if } x \ge 0 \end{cases}$
(c) $u(x, 0) = \begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } x \ge 1, \end{cases}$ and u(x, 0) is smooth and
increasing.
8. Find the integral surface of the equation $x\left(\dfrac{\partial u}{\partial x}\right)^2 + y\,\dfrac{\partial u}{\partial y} = u$
passing through the line y = 1, x + z = 0.
9. Consider the equation p² + q² = 1 with initial condition u(x, y) = 0
on the line x + y = 1. Show that there are two solutions, given by
$u(x, y) = \pm\frac{1}{\sqrt{2}}(x + y - 1)$, using the method of characteristics.
10.6 Notes
The purpose of this chapter is not to give an expository introduction to
PDE, but to show how the ODE play an important role in the analysis of
first order PDE. The reader can refer to [Eva98, Joh75, RR04, PR96] for
further discussion and more details.
Appendix A
Poincaré–Bendixson and
Liénard's Theorems
A.1 Introduction
In this appendix (see [CL72]), we present a proof of the
Poincaré–Bendixson theorem concerning the existence of periodic orbits
of two-dimensional autonomous systems. We also discuss Liénard's
theorem. First, we discuss some basic notions of limit sets. Consider an
n-dimensional autonomous system
ẋ = f(x), (A.1.1)
where f : Rn → Rn is a continuous, locally Lipschitz function. Denote by
φt (x0 ), the unique solution x of (A.1.1) with x(0) = x0 , for t ∈ Ix0 , where
Ix0 is the corresponding maximal interval of existence.
Recall the definition of an invariant set. A set A ⊂ Rn is said to be
invariant with respect to (A.1.1), if φt (x) ∈ A for every x ∈ A and t ∈ Ix .
Next, recall the definitions of the orbit O(x), the positive (semi) orbit O⁺(x)
and the negative (semi) orbit O⁻(x) through a given point x ∈ Rn:

O(x) = {φt(x) : t ∈ Ix}, O⁺(x) = {φt(x) : t ∈ Ix, t ≥ 0}, O⁻(x) = {φt(x) : t ∈ Ix, t ≤ 0}.

In particular, a set A ⊂ Rn is invariant if and only if

A = ⋃_{x∈A} O(x).
We now introduce the notions of α-limit and ω-limit sets (observe that α
and ω are, respectively, the first and the last letters of the Greek alphabet).
Definition A.1.1
Given x ∈ Rn, the α-limit set and the ω-limit set of x, with respect to
(A.1.1), are defined, respectively, by

α(x) = αf(x) = ⋂_{y∈O(x)} cl O⁻(y)

and

ω(x) = ωf(x) = ⋂_{y∈O(x)} cl O⁺(y),

where cl denotes the closure.
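A numerical caricature of this definition may help: integrate φt(x0) forward for a long time and keep only the tail of the trajectory. The planar system below (a standard example with an attracting unit circle; it does not appear in this appendix and is used only as an illustration, in Python with NumPy/SciPy) has ω(x0) equal to the unit circle for every x0 ≠ 0.

import numpy as np
from scipy.integrate import solve_ivp

# In polar coordinates this system reads r' = r(1 - r^2), theta' = 1,
# so every nonzero orbit spirals onto the unit circle.
def f(t, x):
    r2 = x[0]**2 + x[1]**2
    return [x[0]*(1 - r2) - x[1], x[1]*(1 - r2) + x[0]]

t_eval = np.linspace(0, 100, 20001)
sol = solve_ivp(f, (0, 100), [0.1, 0.0], t_eval=t_eval, rtol=1e-9)
tail = sol.y[:, sol.t > 90]          # points phi_s(x0) for large s
radii = np.hypot(tail[0], tail[1])
print(radii.min(), radii.max())      # both close to 1: omega(x0) is the unit circle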
Theorem A.1.2
Given x ∈ Rn, we have

ω(x) = ⋂_{t>0} cl At,   (A.1.2)

where
At = {φs(x) : s > t}.

Moreover, y ∈ ω(x) if and only if there exists a sequence tk ↗ ∞ such
that φtk(x) → y.
Using (A.1.2), we now establish the second property. If y ∈ ω(x), then
there exists a sequence tk ↗ ∞ such that y ∈ cl Atk for k ∈ N. Then, there is
also a sequence sk ↗ ∞, sk ≥ tk, such that φsk(x) → y. Conversely, if there
exists a sequence tk ↗ ∞ such that φtk(x) → y, then y ∈ cl Atk for k ∈ N,
and hence,

y ∈ ⋂_{k=1}^∞ cl Atk = ⋂_{t>0} cl At = ω(x),

by (A.1.2).
Lemma A.2.1
continuity, we then have a · f(z) = 0 for some z along the line segment
joining x(t1 ) and y(t2 ) along L. This again contradicts the transversality
condition, proving the second statement. Some such crossings are
depicted in Fig. A.1.
For the last statement in the theorem, let the equation for L be given by
a · x + b = 0 for some non-zero vector a ∈ R2 . By continuity of f, there is a
circle around x0 containing only regular points. The solution φt (x) passing
through any x inside this circle at t = 0 is continuous in (t, x) in an open set
containing (0, x0 ); this follows from the continuous dependence on initial
data. Put L(t, x) = a · φt(x) + b. Then, L(0, x0) = 0 and ∂L/∂t (0, x0) ≠ 0,
by transversality. Hence, by the implicit function theorem, there is a circle
C, centered at x0, and a continuous function t = t(x) defined inside C
satisfying t(x0) = 0 and L(t(x), x) = 0 for all x inside C. By continuity of
t at x0, it now follows that, given any ε > 0, there is a circle Cε ⊂ C,
centered at x0, such that |t(x)| < ε for all x inside Cε. Hence, the orbit
passing through any x inside Cε at t = 0 crosses L at the time t(x), and
|t(x)| < ε.
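The crossing time t(x) furnished by the implicit function theorem can also be computed directly, by solving L(t, x) = a · φt(x) + b = 0 for t with a one-dimensional root finder. A hypothetical sketch, reusing the attracting-circle system from the earlier illustration, with the transversal L taken to be the x1-axis near the regular point (1, 0) (all of these choices are illustrative, not from the text):

import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

def f(t, x):
    r2 = x[0]**2 + x[1]**2
    return [x[0]*(1 - r2) - x[1], x[1]*(1 - r2) + x[0]]

# transversal L: a . x + b = 0 with a = (0, 1), b = 0, i.e., the x1-axis;
# at (1, 0) the field is (0, 1), so L is indeed transversal there.
a, b = np.array([0.0, 1.0]), 0.0

def L(t, x0):
    # a . phi_t(x0) + b, with phi_t obtained by integrating from 0 to t
    if t == 0.0:
        phi = np.array(x0)
    else:
        phi = solve_ivp(f, (0, t), x0, rtol=1e-10, atol=1e-12).y[:, -1]
    return a @ phi + b

x0 = [0.98, -0.05]                       # a point near (1, 0)
t_cross = brentq(lambda t: L(t, x0), -0.2, 0.2)
print(t_cross)                           # small crossing time, |t(x)| < eps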
Proposition A.2.2
Theorem A.2.3
Theorem A.2.4
Proof (of Theorem A.2.4) Since ω(x) ⊂ cl O⁺(x), the set ω(x) contains
at most a finite number of equilibrium points. If ω(x) consists only of
equilibrium points, then it is necessarily a single equilibrium point,
because of connectedness.
Now, assume that ω(x) contains some regular points as well, and at least
one periodic orbit O(p). We claim that ω(x) is precisely this periodic orbit. For
otherwise, by connectedness, there would exist a sequence {xk } ⊂ ω (x) \
O (p) and a point x0 ∈ O (p) such that xk → x0 as k → ∞. Next consider a
transversal L to f such that x0 ∈ L. It follows from Proposition A.2.2 that
ω (x) ∩ L = {x0 }. On the other hand, proceeding as in Proposition A.2.2,
we infer that O + (xk ) ⊂ ω (x) intersects L for sufficiently large k. Since
ω (x) ∩ L = {x0 }, it follows that xk ∈ O (x0 ) = O (p) for sufficiently large
k, which contradicts the choice of the sequence {xk }. Therefore, ω (x) is
a periodic orbit.
Finally, assume that ω (x) contains regular points, but no periodic
orbit. We show that for any regular p ∈ ω (x), the sets ω (p) and α (p) are
equilibrium points.
If p ∈ ω (x) is a regular point, notice that ω (p) ⊂ ω (x). If q ∈ ω (p) is
a regular point and L is a transversal to f containing q in its interior, then
by Proposition A.2.2, we have
Theorem A.3.1
for the system (A.3.2). Thus, any periodic orbit must surround the origin.
Now

ẍ + f(x)ẋ = d/dt [ dx/dt + ∫₀ˣ f(ξ) dξ ] = d/dt [ y + F(x) ],
which suggests that we introduce a new variable
z = y + F(x).
Thus, (A.3.2) can be written in the equivalent form

ẋ = z − F(x),
ż = −g(x),     (A.3.3)

in the (x, z) plane. Again, the origin is the only equilibrium point for
(A.3.3) and the usual existence and uniqueness result holds. Because of
assumption (2), the correspondence (x, y) ↔ (x, z) between the points in
the two planes is one–one and continuous both ways. Therefore, periodic
orbits correspond to periodic orbits and the configurations of orbits in the
two planes are qualitatively similar. The orbits of (A.3.3) satisfy the
differential equation
dz/dx = −g(x) / (z − F(x)).     (A.3.4)
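Before describing the orbits qualitatively, it may help to see the system (A.3.3) numerically. The sketch below (Python with NumPy/SciPy) uses the van der Pol choice f(x) = μ(x² − 1), g(x) = x, hence F(x) = μ(x³/3 − x), a standard example satisfying the hypotheses of the theorem; the text itself does not fix a particular f and g.

import numpy as np
from scipy.integrate import solve_ivp

mu = 1.0
F = lambda x: mu*(x**3/3 - x)  # odd; negative on (0, sqrt(3)), increasing afterwards
g = lambda x: x                # odd; positive for x > 0

def lienard(t, w):
    x, z = w
    return [z - F(x), -g(x)]   # the system (A.3.3)

sol = solve_ivp(lienard, (0, 60), [0.0, 3.0], rtol=1e-10, dense_output=True)
late = sol.sol(np.linspace(50, 60, 2000))  # after transients have died out
print(late[0].min(), late[0].max())        # x-extent of the periodic orbit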
The orbits of (A.3.3) may easily be described using the hypothesis in the
theorem and (A.3.4). We will now make the following observations which
will help in understanding the directions of the orbit in Fig. A.4.
First note that since g and F are odd functions, (A.3.3) and (A.3.4) are
unchanged when x, z are replaced by −x, −z. This means that any curve
symmetric to an orbit with respect to the origin is also an orbit. Therefore,
if we know an orbit in the right half plane (x > 0), it is also known in the
left half plane (x < 0) by reflection through the origin.
Next, if an orbit starts on the z-axis with positive z coordinate (denoted
by P in Fig. A.4), then, the orbit is horizontal (parallel to the x-axis) at
this point. The x coordinate of the orbit is increasing, and thus the orbit moves into
the right half plane, until the orbit meets the curve z = F (x) (denoted by
Q in Fig. A.4), where the orbit becomes vertical, that is, parallel to the
z-axis; after crossing the curve z = F (x), the x coordinate of the orbit
starts decreasing, which continues up to the time when the orbit meets the
z-axis again (denoted by R in Fig. A.4). As long as the orbit is in the
right half plane, the z coordinate is decreasing. Let b be the abscissa (the x
coordinate) of the point Q and denote by Cb the orbit described previously.
[Fig. A.4: the orbit Cb in the (x, z) plane, shown with the curve z = F(x), the vertical lines x = a and x = b, the origin O, and the points P, S, T, R on the orbit.]
It is not hard to see that when the orbit is continued beyond P and R into
the left half plane, the result will be a periodic orbit if and only if the
distances OP and OR are equal, by using the reflection through the origin;
O is the origin. Therefore, to show that there is a unique periodic orbit, it
suffices to show that there is a unique value of b which gives OP = OR.
To this end, we introduce the function
G(x) = ∫₀ˣ g(ξ) dξ

and the function E(x, z) = G(x) + z²/2. Note that E(0, z) = z²/2. Along
any orbit, we have
Ė = g(x)ẋ + zż = F(x)ż,
which may be written as dE = F dz. Now evaluate the line integral of F
along the orbit Cb from P to R, to obtain
I(b) = ∫_PR F dz = ∫_PR dE = ER − EP = (1/2)(OR² − OP²).
Thus, it suffices to show the existence of a unique b such that I (b) = 0.
For b ≤ a (see the hypothesis), we have F < 0 and ż < 0. Hence, I (b) >
0. For b > a, write I (b) = I1 (b) + I2 (b), where
I1(b) = ∫_PS F dz + ∫_TR F dz   and   I2(b) = ∫_ST F dz.
See Fig. A.4. Since F < 0 and ż < 0 as we move along Cb from P to S and
from T to R, we have I1 (b) > 0. On the other hand, when we move from S
to T along Cb , we have F > 0 and ż < 0, so I2 (b) < 0. Therefore, we need
to find a b such that I1 (b) = −I2 (b) > 0.
We now show that I (b) is a decreasing function of b for b ≥ a and
tends to −∞ as b → ∞. Since I (a) > 0, this gives, by continuity, a unique
b0 such that I (b0 ) = 0. We, then, have the unique periodic orbit Cb0 .
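This argument can be mirrored numerically. It is convenient to parametrize the half-orbit by its starting height OP = z0 on the z-axis instead of by the abscissa b (each b corresponds to exactly one such z0), compute I = (1/2)(OR² − OP²) from the first return to the z-axis, and locate the zero with a root finder. A sketch, again with the van der Pol choice of F and g from the earlier illustration (the helper names I_of and z_star are hypothetical):

import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

mu = 1.0
F = lambda x: mu*(x**3/3 - x)
g = lambda x: x

def lienard(t, w):
    x, z = w
    return [z - F(x), -g(x)]

def I_of(z0):
    # follow the orbit from P = (0, z0) to its first return R = (0, zR),
    # detected as x decreasing through 0; then I = E_R - E_P = (zR^2 - z0^2)/2
    ev = lambda t, w: w[0]
    ev.terminal, ev.direction = True, -1.0
    sol = solve_ivp(lienard, (0, 100), [0.0, z0], events=ev,
                    rtol=1e-10, atol=1e-12)
    zR = sol.y_events[0][0][1]
    return 0.5*(zR**2 - z0**2)

z_star = brentq(I_of, 0.5, 5.0)   # I > 0 for small z0, I < 0 for large z0
print(z_star)                     # the unique height giving OP = OR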
From (A.3.4), it follows that
F dz = F (dz/dx) dx = [−g(x)F(x) / (z − F(x))] dx.

Hence, the effect of increasing b is to raise the arc PS (the arc PS is part
of the orbit) and to lower the arc TR, which decreases the magnitude of
−g(x)F(x)/(z − F(x))
for a given x ∈ (0, a). Since the limits of integration for I1 (b) are fixed,
the result is a decrease in I1(b). Furthermore, since F is positive and
non-decreasing for x > a, we see that an increase in b gives an increase
in the positive number −I2(b), and hence a decrease in I2(b). This proves