Some Examples. Constraints and Lagrange Multipliers
the EL equation for the $x$ coordinate is easily seen to be (exercise)
$$ m\ddot{x} + \frac{\partial V}{\partial x} = 0. $$
Of course, the $y$ and $z$ coordinates get a similar treatment. We then get (exercise)
$$ m\ddot{\vec r} + \nabla V = 0. $$
Let us consider cylindrical coordinates,
$$ x = \rho\cos\phi, \qquad y = \rho\sin\phi. $$
Can you write down the equations of motion following from $\vec F = m\vec a$ in cylindrical $(\rho, \phi, z)$
coordinates? It is completely straightforward using the EL equations, a distinct practical
advantage of this formalism. We just need to express the kinetic energy ($T$) and the
potential energy ($V$) in terms of cylindrical coordinates and take the difference to make
the Lagrangian, from which the EL equations are easily computed.
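To compute the kinetic energy we take the Cartesian expression
$$ T = \frac{1}{2} m(\dot x^2 + \dot y^2 + \dot z^2) $$
and substitute
$$ \dot x = \dot\rho\cos\phi - \rho\dot\phi\sin\phi, \qquad \dot y = \dot\rho\sin\phi + \rho\dot\phi\cos\phi, $$
to get
$$ T = \frac{1}{2} m(\dot\rho^2 + \rho^2\dot\phi^2 + \dot z^2). $$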
To get the potential energy we just substitute into $V(x, y, z, t)$ for $x$ and $y$ in terms of
$\rho$ and $\phi$. The Lagrangian is of the form
$$ L = \frac{1}{2} m(\dot\rho^2 + \rho^2\dot\phi^2 + \dot z^2) - V(\rho, \phi, z, t). $$
Caution:
The notation for the potential energy function is the usual one used by physicists,
but it can be misleading because, strictly speaking, it violates standard mathematical
notational rules. $V(\rho, \phi, z, t)$ means the potential energy at the location defined by $(\rho, \phi, z)$
and at the time $t$. $V(\rho, \phi, z, t)$ is not the function obtained by substituting $x \to \rho$, $y \to \phi$ in
$V(x, y, z, t)$, which a strict interpretation of the function notation would require. What we
are calling $V(\rho, \phi, z, t)$ is in fact the function obtained from $V(x, y, z, t)$ by the substitution
$x \to \rho\cos\phi$, etc. For example, $V(x, y, z, t) = x^2 t$ corresponds to $V(\rho, \phi, z, t) = \rho^2 t\cos^2\phi$.
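The notational point can be made concrete with a small sketch. The function names below (`V_cartesian`, `V_cylindrical`, `V_naive`) and the sample point are illustrative choices, not part of the notes; only the example potential $x^2 t$ comes from the text.

```python
import math

def V_cartesian(x, y, z, t):
    # Example potential from the text: V(x, y, z, t) = x^2 t.
    return x**2 * t

def V_cylindrical(rho, phi, z, t):
    # The same physical potential after x -> rho cos(phi), y -> rho sin(phi).
    return rho**2 * t * math.cos(phi)**2

def V_naive(rho, phi, z, t):
    # "Strict" reading of the notation: plug rho, phi into the x, y slots.
    # This is NOT what the physicist's V(rho, phi, z, t) means.
    return V_cartesian(rho, phi, z, t)

# One sample point (values chosen arbitrarily for illustration).
rho, phi, z, t = 2.0, math.pi / 3, 0.5, 1.5
x, y = rho * math.cos(phi), rho * math.sin(phi)

same_point = V_cartesian(x, y, z, t)       # potential at the physical location
physicist = V_cylindrical(rho, phi, z, t)  # agrees with same_point
naive = V_naive(rho, phi, z, t)            # generally differs
```

The first two values agree because they describe the same point in space; the third does not, which is exactly the ambiguity the caution warns about.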
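The EL equations are (exercise)
$$ m\ddot\rho - m\rho\dot\phi^2 + \frac{\partial V}{\partial\rho} = 0, $$
$$ \frac{d}{dt}(m\rho^2\dot\phi) + \frac{\partial V}{\partial\phi} = 0, $$
$$ m\ddot z + \frac{\partial V}{\partial z} = 0. $$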
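As a sanity check on the radial and angular equations, here is a minimal numerical sketch, assuming $V = 0$ by default (a free particle) and a hand-written fourth-order Runge-Kutta step. The helper names (`derivs`, `rk4_step`, `integrate`) and the initial data are assumptions made for illustration. The $z$ equation decouples and is omitted. For a free particle the result should be straight-line motion in Cartesian coordinates.

```python
import math

def derivs(state, m, dV_drho, dV_dphi):
    """First-order form of the rho and phi EL equations."""
    rho, phi, rho_dot, phi_dot = state
    # m rho'' - m rho phi'^2 + dV/drho = 0
    rho_ddot = rho * phi_dot**2 - dV_drho(rho, phi) / m
    # d/dt(m rho^2 phi') + dV/dphi = 0, expanded and solved for phi''
    phi_ddot = -2.0 * rho_dot * phi_dot / rho - dV_dphi(rho, phi) / (m * rho**2)
    return (rho_dot, phi_dot, rho_ddot, phi_ddot)

def rk4_step(state, dt, *args):
    """One classical Runge-Kutta step for the system defined by derivs."""
    k1 = derivs(state, *args)
    k2 = derivs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), *args)
    k3 = derivs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), *args)
    k4 = derivs(tuple(s + dt * k for s, k in zip(state, k3)), *args)
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, t_end, n_steps, m=1.0,
              dV_drho=lambda rho, phi: 0.0, dV_dphi=lambda rho, phi: 0.0):
    dt = t_end / n_steps
    for _ in range(n_steps):
        state = rk4_step(state, dt, m, dV_drho, dV_dphi)
    return state

# Free particle starting at (x, y) = (1, 0) with velocity (0.3, 0.5),
# i.e. rho = 1, phi = 0, rho_dot = 0.3, phi_dot = 0.5.
rho, phi, _, _ = integrate((1.0, 0.0, 0.3, 0.5), t_end=2.0, n_steps=2000)
x_final, y_final = rho * math.cos(phi), rho * math.sin(phi)
# Straight-line motion predicts (1 + 0.3*2, 0 + 0.5*2) = (1.6, 1.0).
```

Passing non-zero `dV_drho` and `dV_dphi` callables lets the same sketch handle a potential, though the integrator is only intended for short illustrative runs.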
Exercise: Does the $\rho\dot\phi^2$ term in the radial equation represent an attractive or repulsive
effect in the radial motion?
Exercise: Repeat this computation for spherical polar coordinates.
Lagrangians for Systems
Often it is useful to view a dynamical system as consisting of several subsystems.
For example, the solar system can be modeled as consisting of ten particles interacting gravitationally. In the absence of interaction, the Lagrangian $L_0$ for the total (non-interacting)
system can be viewed as the sum of Lagrangians for each of its parts:
$$ L_0 = L_1 + L_2 + L_3 + \cdots. $$
Here $L_1$, $L_2$, etc. are the Lagrangians for the subsystems. For example, if we have a
system of (non-interacting) Newtonian subsystems, each Lagrangian is of the form (for the
$i$th subsystem)
$$ L_i = T_i - V_i. $$
Here $V_i$ is the potential energy of the $i$th system due to external forces, not due to inter-system interactions, which we are ignoring for the moment. It is easy to see that $L_0$ correctly
describes the motion of the system of non-interacting systems through its EL equations.
This is because the EL equations for the $k$th system involve derivatives of $L_0$ with respect
to coordinates and velocities of the $k$th system, and this just picks out $L_k$ from the sum
$L_0 = \sum_j L_j$.
As a simple example, let us consider a system consisting of the planets, viewed as
non-interacting point particles. They all move in a central force field due to the sun, which
is non-dynamical in this model. The Lagrangian for the $k$th planet, with position $\vec r_k$, is of
the form
$$ L_k = \frac{1}{2} m_k \dot{\vec r}_k^{\,2} + \frac{G M m_k}{r_k}, $$
where $M$ is the mass of the sun and $G$ is Newton's constant.
Of course, non-interacting systems are an idealization and, ultimately, are of little
physical interest. Interactions are what make the world what it is. One of the main
ways to mathematically represent interactions between the subsystems is to introduce a
potential energy function, $V = V(q_1, q_2, \ldots, t)$, which couples various degrees of freedom.
We then have the Lagrangian for the interacting system given by
$$ L = L_0 - V. $$
The effect of this potential energy function in the Lagrangian is to couple the motion of
the subsystems. You can see this by noting the EL equations for a degree of freedom can
now, in general, depend upon the other degrees of freedom (exercise).
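A minimal illustration of this coupling, using a system chosen here for simplicity and not taken from the notes: two particles on a line with coordinates $q_1$, $q_2$, masses $m_1$, $m_2$, and an assumed spring-like interaction $V = \frac{1}{2}k(q_1 - q_2)^2$. Each EL equation then involves the other particle's coordinate.

```latex
\begin{align*}
L &= \tfrac{1}{2} m_1 \dot q_1^2 + \tfrac{1}{2} m_2 \dot q_2^2
     - \tfrac{1}{2} k (q_1 - q_2)^2, \\
m_1 \ddot q_1 + k (q_1 - q_2) &= 0, \\
m_2 \ddot q_2 - k (q_1 - q_2) &= 0.
\end{align*}
```

Setting $k = 0$ recovers $L = L_0$ and two independent free-particle equations.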
As an example, let us consider the Lagrangian for a pair of electrons in a Helium atom.
We view the nucleus as fixed, with charge $Q$; it is part of the environment. The system
consists of the two electrons, each with mass $m$ and charge $q$. The configuration space is
$\mathbf{R}^3 \times \mathbf{R}^3 = \mathbf{R}^6$, and we label points in the configuration space with position vectors $\vec r_1$ and
$\vec r_2$. The Lagrangian is
$$ L = \frac{1}{2} m \dot{\vec r}_1^{\,2} - \frac{qQ}{|\vec r_1|} + \frac{1}{2} m \dot{\vec r}_2^{\,2} - \frac{qQ}{|\vec r_2|} - \frac{q^2}{|\vec r_1 - \vec r_2|}. $$
The first two terms represent electron 1 moving in the Coulomb field of the nucleus.
Likewise for the next two terms regarding electron two. With these first 4 terms alone
the electrons orbit the nucleus independently and do not interact among themselves. The
last term represents the interaction between the electrons, which is Coulomb repulsion. It
is this term which couples the motion of the two electrons and makes the EL equations
somewhat complex, lacking an explicit solution.
The other principal way to mathematically represent interactions is via constraints. In
this scenario, the coupling between subsystems will typically occur via the kinetic energy
function. We will give a couple of examples in what follows.
Example: Double Pendulum
Consider a system consisting of two plane pendulums (pendula?) connected in series.
Don't even try to write down the equations of motion using Newton's second law! The
Lagrangian analysis is straightforward.
To begin with, we have two particles moving in a plane. We denote their $x$ and $y$
positions via $(x_1, y_1)$ and $(x_2, y_2)$, where the origin of coordinates is placed at the fixed
point of the double pendulum. The masses are $m_1$ and $m_2$. The motion of the particles
is constrained: the lengths are $l_1$ and $l_2$; pendulum 1 is attached to a fixed point in space
and pendulum 2 is attached to the end of pendulum 1. Mathematically we have
$$ x_1^2 + y_1^2 = l_1^2, \qquad (x_2 - x_1)^2 + (y_2 - y_1)^2 = l_2^2. $$
These two constraints on the 4 Cartesian coordinates leave 2 degrees of freedom for this
system. As we have already mentioned, the configuration of the system is uniquely specified
once the angular displacement of each pendulum from (say) the vertical is specified. These
angles, the generalized coordinates for this system, are denoted by $\theta_1$ and $\theta_2$.
The kinetic energy for mass 1 is easily seen to be (exercise)
$$ T_1 = \frac{1}{2} m_1 (\dot x_1^2 + \dot y_1^2) = \frac{1}{2} m_1 l_1^2 \dot\theta_1^2. $$
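To find $T_2$ we note that, defining $l_2$ and $\theta_2$ in the same way as for mass 1, we have
$$ x_2 = l_1\sin\theta_1 + l_2\sin\theta_2, \qquad y_2 = -l_1\cos\theta_1 - l_2\cos\theta_2. $$
Along any curve we have (exercise)
$$ \dot x_2 = l_1\cos\theta_1\,\dot\theta_1 + l_2\cos\theta_2\,\dot\theta_2, \qquad \dot y_2 = l_1\sin\theta_1\,\dot\theta_1 + l_2\sin\theta_2\,\dot\theta_2. $$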
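The kinetic energy of mass 2 is then (exercise)
$$ T_2 = \frac{1}{2} m_2 (\dot x_2^2 + \dot y_2^2) = \frac{1}{2} m_2 \left[ l_1^2\dot\theta_1^2 + l_2^2\dot\theta_2^2 + 2 l_1 l_2 \cos(\theta_1 - \theta_2)\,\dot\theta_1\dot\theta_2 \right]. $$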
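The identity between the two forms of $T_2$ can be spot-checked numerically. The sketch below is illustrative only: the function names and the sample values are assumptions, with `th1`, `th2` standing for the two angles from the vertical and `w1`, `w2` for their time derivatives.

```python
import math

def t2_cartesian(m2, l1, l2, th1, th2, w1, w2):
    # Velocities of mass 2 obtained by differentiating
    # x2 = l1 sin(th1) + l2 sin(th2), y2 = -l1 cos(th1) - l2 cos(th2).
    x2_dot = l1 * math.cos(th1) * w1 + l2 * math.cos(th2) * w2
    y2_dot = l1 * math.sin(th1) * w1 + l2 * math.sin(th2) * w2
    return 0.5 * m2 * (x2_dot**2 + y2_dot**2)

def t2_angles(m2, l1, l2, th1, th2, w1, w2):
    # Closed form in terms of the generalized coordinates and velocities.
    return 0.5 * m2 * (l1**2 * w1**2 + l2**2 * w2**2
                       + 2.0 * l1 * l2 * math.cos(th1 - th2) * w1 * w2)

# Arbitrary sample configurations: (m2, l1, l2, th1, th2, w1, w2).
samples = [
    (1.3, 0.7, 1.1, 0.4, -1.2, 0.9, -0.5),
    (2.0, 1.0, 0.5, 2.5, 0.3, -1.7, 2.2),
]
max_diff = max(abs(t2_cartesian(*s) - t2_angles(*s)) for s in samples)
```

The cross term $\cos(\theta_1 - \theta_2)$ arises from $\cos\theta_1\cos\theta_2 + \sin\theta_1\sin\theta_2$; it is this term that couples the two pendulums through the kinetic energy.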
$$ (x_2 - x_1)^2 + y_2^2 = l^2. $$
Here we have set the $x$ axis along the line upon which $m_1$ moves. We then have (exercise)
$$ (x_1, y_1) = (x, 0) $$
and
$$ (x_2, y_2) = (x + l\sin\phi,\; l\cos\phi). $$
Thus the generalized coordinates are the horizontal displacement $x_1 \equiv x$ for $m_1$ and the
angle $\phi$ made with the vertical for the pendulum with mass $m_2$. The kinetic energy is
therefore (exercise)
$$ T = \frac{1}{2} m_1 \dot x^2 + \frac{1}{2} m_2 \left( \dot x^2 + 2 l \dot x \dot\phi \cos\phi + l^2 \dot\phi^2 \right). $$
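$$ (m_1 + m_2)\ddot x + \frac{d}{dt}\left( m_2 l \dot\phi \cos\phi \right) = 0. $$
$$ m_2 l^2 \ddot\phi + \frac{d}{dt}\left( m_2 l \dot x \cos\phi \right) + m_2 l (\dot x \dot\phi - g)\sin\phi = 0. $$
$$ m_2 \left[ l^2\ddot\phi - \frac{d}{dt}\left( A\omega l \sin\omega t \cos\phi \right) - l\left( A\omega\dot\phi \sin\omega t + g \right)\sin\phi \right] = 0. $$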
As an exercise you can check that this equation correctly describes the reduction of the
original EL equations by the substitution $x = A\cos\omega t$.
Example: Charged particle in a Prescribed Electromagnetic Field
Two of the fundamental interactions allow themselves to be treated fruitfully using
classical mechanics: gravity and electromagnetism. Of course, we have in mind macroscopic systems here. (The other fundamental interactions, strong and weak, only operate microscopically and require a quantum treatment.) Here we will give a Lagrangian
formulation of the dynamics of a charged (test) particle in a given electromagnetic field.
While a fully relativistic treatment is certainly feasible, we will stick to a non-relativistic
(slow motion) treatment for simplicity.
We consider a particle with mass $m$ and electric charge $q$ moving in a given electromagnetic field $\vec E(\vec r, t)$, $\vec B(\vec r, t)$. The equations of motion come from the Lorentz force law,*
which asserts that the force $\vec F$ at time $t$ on a charge $q$ located at position $\vec r$ and moving
with velocity $\vec v$ is given by
$$ \vec F(\vec r, \vec v, t) = q\vec E(\vec r, t) + \frac{q}{c}\,\vec v \times \vec B(\vec r, t). $$
Here, of course,
$$ \vec v = \dot{\vec r} = \dot x\,\hat\imath + \dot y\,\hat\jmath + \dot z\,\hat k. $$
Thus the equations of motion for the curve $\vec r = \vec r(t)$ are
$$ m\ddot{\vec r}(t) - q\vec E(\vec r(t), t) - \frac{q}{c}\,\dot{\vec r}(t) \times \vec B(\vec r(t), t) = 0. $$
It is not possible to find a Lagrangian whose EL equations correspond to the Lorentz
force law without introducing the electromagnetic potentials. Recall that 4 of the eight
Maxwell equations, the homogeneous equations,
$$ \nabla \times \vec E + \frac{1}{c}\frac{\partial \vec B}{\partial t} = 0, \qquad \nabla \cdot \vec B = 0, $$
are equivalent to the existence of a scalar potential $\phi$ and a vector potential $\vec A$ so that
$$ \vec E = -\nabla\phi - \frac{1}{c}\frac{\partial \vec A}{\partial t}, $$
and
$$ \vec B = \nabla \times \vec A. $$
* We use Gaussian units. The text uses mks units.
You can easily check that these relations lead to electromagnetic fields satisfying the homogeneous Maxwell equations (exercise). Conversely, given any electromagnetic field satisfying the homogeneous Maxwell equations, one can find a function $\phi(\vec r, t)$ and a vector
field $\vec A(\vec r, t)$ such that the above relations are satisfied.
Remark: One issue that arises here is that of gauge transformations: For each configuration
of the electromagnetic field there are infinitely many potentials that can describe it. You
can easily check that if $\phi$ and $\vec A$ correspond to a given $\vec E$ and $\vec B$, then so do
$$ \phi' = \phi - \frac{1}{c}\frac{\partial \Lambda}{\partial t}, \qquad \vec A' = \vec A + \nabla\Lambda, $$
where $\Lambda(\vec r, t)$ is any function. This change in potentials is called a gauge transformation.
Because the potentials are not uniquely defined by the electromagnetic fields, and because
the effect of the electromagnetic field on matter is via the Lorentz force law involving
only $\vec E$ and $\vec B$, the potentials have no direct physical significance, e.g., one cannot measure
$\phi$ by studying the motion of test particles.
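The check that the fields are unchanged is short enough to write out. The sketch below assumes $\Lambda$ is smooth enough that its mixed partial derivatives commute, and uses the identity $\nabla \times \nabla\Lambda = 0$.

```latex
\begin{align*}
\vec E' &= -\nabla\phi' - \frac{1}{c}\frac{\partial \vec A'}{\partial t}
         = -\nabla\phi + \frac{1}{c}\nabla\frac{\partial\Lambda}{\partial t}
           - \frac{1}{c}\frac{\partial \vec A}{\partial t}
           - \frac{1}{c}\frac{\partial}{\partial t}\nabla\Lambda
         = \vec E, \\
\vec B' &= \nabla\times\vec A' = \nabla\times\vec A + \nabla\times\nabla\Lambda
         = \vec B.
\end{align*}
```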
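In terms of the potentials, the equations of motion are (in an inertial reference frame)
$$ m\ddot{\vec r} + q\left( \nabla\phi + \frac{1}{c}\frac{\partial \vec A}{\partial t} \right) - \frac{q}{c}\,\dot{\vec r} \times (\nabla \times \vec A) = 0. $$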
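We now show that the Lagrangian
$$ L(\vec r, \dot{\vec r}, t) = \frac{1}{2} m \dot{\vec r}^{\,2} - q\phi(\vec r, t) + \frac{q}{c}\vec A(\vec r, t) \cdot \dot{\vec r} $$
yields these equations of motion as EL equations. To do this, we consider the EL equation
for $x(t)$ and compare with the $x$ component of the Lorentz force law. The $y$ and $z$ EL
equations are handled in an identical manner. We have (exercise)
$$ \frac{\partial L}{\partial x} = -q\frac{\partial \phi}{\partial x} + \frac{q}{c}\frac{\partial \vec A}{\partial x} \cdot \dot{\vec r}, $$
and
$$ \frac{\partial L}{\partial \dot x} = m\dot x + \frac{q}{c} A_x, $$
so that, on a curve $\vec r = \vec r(t)$, (exercise)
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot x} = m\ddot x + \frac{q}{c}\,\dot{\vec r} \cdot \nabla A_x + \frac{q}{c}\frac{\partial A_x}{\partial t}. $$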
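Here $\nabla A_x$ means to take the gradient of the function $A_x$. The EL equations are therefore
$$ m\ddot x + \frac{q}{c}\,\dot{\vec r} \cdot \nabla A_x + \frac{q}{c}\frac{\partial A_x}{\partial t} + q\frac{\partial \phi}{\partial x} - \frac{q}{c}\frac{\partial \vec A}{\partial x} \cdot \dot{\vec r} = 0. $$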
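Using (exercises)
$$ (\nabla\phi)_x = \frac{\partial \phi}{\partial x}, \qquad \left( \frac{\partial \vec A}{\partial t} \right)_x = \frac{\partial A_x}{\partial t}, $$
$$ \left[ \dot{\vec r} \times (\nabla \times \vec A) \right]_x = -\dot{\vec r} \cdot \nabla A_x + \frac{\partial \vec A}{\partial x} \cdot \dot{\vec r}, $$
it is easy to see that the EL equation for $x$ is the same as the $x$ component of the Lorentz
force law (exercise).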
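To see the Lorentz force law (the equations these EL equations reproduce) in action, here is a minimal numerical sketch in Gaussian units. It assumes a uniform magnetic field along $z$ and no electric field; the parameter values, helper names, and Runge-Kutta integrator are illustrative assumptions. The particle should move on a circle with angular frequency $qB/(mc)$ at constant speed.

```python
import math

def lorentz_accel(v, q, m, c, E, B):
    # m a = q E + (q/c) v x B  (Gaussian units)
    vxB = (v[1] * B[2] - v[2] * B[1],
           v[2] * B[0] - v[0] * B[2],
           v[0] * B[1] - v[1] * B[0])
    return tuple((q * E[i] + (q / c) * vxB[i]) / m for i in range(3))

def add(a, b, s=1.0):
    return tuple(x + s * y for x, y in zip(a, b))

def rk4_step(r, v, dt, accel):
    # Uniform fields: the acceleration depends on the velocity only.
    k1r, k1v = v, accel(v)
    v2 = add(v, k1v, 0.5 * dt)
    k2r, k2v = v2, accel(v2)
    v3 = add(v, k2v, 0.5 * dt)
    k3r, k3v = v3, accel(v3)
    v4 = add(v, k3v, dt)
    k4r, k4v = v4, accel(v4)
    r_new = tuple(r[i] + dt * (k1r[i] + 2 * k2r[i] + 2 * k3r[i] + k4r[i]) / 6.0
                  for i in range(3))
    v_new = tuple(v[i] + dt * (k1v[i] + 2 * k2v[i] + 2 * k3v[i] + k4v[i]) / 6.0
                  for i in range(3))
    return r_new, v_new

# Illustrative values: q = m = c = 1, B = 2 along z, E = 0.
q, m, c = 1.0, 1.0, 1.0
E, B = (0.0, 0.0, 0.0), (0.0, 0.0, 2.0)
omega = q * B[2] / (m * c)           # cyclotron frequency
period = 2.0 * math.pi / abs(omega)

r, v = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
n_steps = 2000
dt = period / n_steps
for _ in range(n_steps):
    r, v = rk4_step(r, v, dt, lambda vel: lorentz_accel(vel, q, m, c, E, B))

return_error = math.sqrt(sum(x * x for x in r))   # distance from the start
speed_final = math.sqrt(sum(x * x for x in v))    # should stay equal to 1
```

After one full period the particle returns to its starting point, and the speed is unchanged because the magnetic force does no work.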
There is a technical issue of interest here. The Lagrangian is built from potentials
$(\phi, \vec A)$. As we have already pointed out, there are infinitely many potentials for any given
electromagnetic field. Thus there are infinitely many Lagrangians describing motion in
a single electromagnetic field. Each of these Lagrangians will yield the same Lorentz
force law, which is built from $(\vec E, \vec B)$. Each of the Lagrangians can be related via a
gauge transformation of the potentials. Apparently, under a gauge transformation of the
potentials the Lagrangian changes in just the right way so that the EL equations do not
change. How does this happen? You will explore this in a homework problem.
Constraints
Often times we consider dynamical systems which are defined using some kind of restrictions on the motion. For example, the spherical pendulum can be defined as a particle
moving in 3-d such that its distance from a given point is fixed. Thus the true configuration space is defined by giving a simpler (usually bigger) configuration space along with
some constraints which restrict the motion to some subspace. We now give a systematic
treatment of this idea and show how to handle it using the Lagrangian formalism.
For simplicity we will only consider holonomic constraints, which are restrictions which
can be expressed in the form of the vanishing of some set of functions (the constraints)
on the configuration space and time:
$$ C_\alpha(q, t) = 0, \qquad \alpha = 1, 2, \ldots, m. $$
We assume these functions are smooth and independent so that if there are $n$ coordinates
$q^i$, then at each time $t$ the constraints restrict the motion to a nice $(n - m)$-dimensional
space. For example, the spherical pendulum has a single constraint on the three Cartesian
configuration variables $(x, y, z)$:
$$ C(x, y, z) = x^2 + y^2 + z^2 - l^2 = 0. $$
This constraint restricts the configuration to a two dimensional sphere of radius $l$ centered
at the origin. To see another example of such constraints, see our previous discussion of
the double pendulum and pendulum with moving point of support.
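We note that the constraints will restrict the velocities:
$$ \frac{d}{dt} C_\alpha = \frac{\partial C_\alpha}{\partial q^i}\dot q^i + \frac{\partial C_\alpha}{\partial t} = 0. $$
For example in the spherical pendulum we have
$$ \vec r \cdot \dot{\vec r} = 0. $$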
There are two ways to deal with such constraints. Firstly, one can simply solve the
constraints, i.e., find an independent set of generalized coordinates. We have been doing
this all along in our examples with constraints. For the spherical pendulum, we solve the
constraint by
$$ x = l\sin\theta\cos\phi, \qquad y = l\sin\theta\sin\phi, \qquad z = l\cos\theta, $$
and express everything in terms of $\theta$ and $\phi$, in particular the Lagrangian and EL equations.
In principle this can always be done, but in practice this might be difficult. There is another
method in which one can find the equations of motion without having to explicitly solve
the constraints. This is known as the method of Lagrange multipliers. This method is
not just popular in mechanics, but also features in constrained optimization problems,
e.g., in economics. As we shall see, the Lagrange multiplier method is more than just an
alternative approach to constraints: it provides additional physical information about the
forces which maintain the constraints.
Lagrange Multipliers
The method of Lagrange multipliers in the calculus of variations has an analog in
ordinary calculus. Suppose we are trying to find the critical points of a function $f(x, y)$
subject to a constraint $C(x, y) = 0$. That is to say, we want to find where on the curve
defined by the constraint the function has a maximum, minimum, or saddle point. Again, we
could try to solve the constraint, getting a solution of the form $y = g(x)$. Then we could
substitute this into the function $f$ to get a (new) function $h(x) = f(x, g(x))$. Then we find
the critical points by solving $h'(x) = 0$ for $x = x_0$, whence the critical point is $(x_0, g(x_0))$.
This is analogous to our treatment of constraints in the variational calculus thus far (where
we solved the constraints via generalized coordinates before constructing the Lagrangian
and EL equations). There is another method, due to Lagrange, which does not require
explicit solution of the constraints and which gives useful physical information about the
constraints.
To begin with, when finding a critical point $(x_0, y_0)$ subject to the constraint $C(x, y) = 0$
we are looking for a point on the curve $C(x, y) = 0$ such that a displacement tangent
to the curve does not change the value of $f$ to first order. Let the tangent vector to
$C(x, y) = 0$ at the point $(x_0, y_0)$ on the curve be denoted by $\vec t$. We want
$$ \vec t \cdot \nabla f(x_0, y_0) = 0, \qquad \text{where } C(x_0, y_0) = 0. $$
Evidently, at the critical point the gradient of $f$ is orthogonal to the curve $C(x, y) = 0$.
Now, any vector orthogonal to the curve (orthogonal to $\vec t$ at $(x_0, y_0)$) will be proportional
to the gradient of $C$ at that point.* Thus the condition for a critical point $(x_0, y_0)$ of $f$
(where $C(x_0, y_0) = 0$) is that the gradient of $f$ and the gradient of $C$ are proportional at
$(x_0, y_0)$. We write
$$ \nabla f + \lambda \nabla C = 0, $$
for some $\lambda$. This requirement is meant to hold only on the curve $C = 0$, so without loss of
generality we can take as the critical point condition
$$ \nabla (f + \lambda C) = 0, \qquad C = 0. $$
This constitutes three conditions on 3 unknowns; the unknowns being $(x, y)$ and $\lambda$. The
quantity $\lambda$ is known as a Lagrange multiplier. In fact, if we artificially enlarge our $x$-$y$ plane
to a 3-d space parametrized by $(x, y, \lambda)$ we can replace the above critical point condition
with
$$ \tilde\nabla (f + \lambda C) = 0, $$
where $\tilde\nabla$ is the gradient in $(x, y, \lambda)$ space. You should prove this as an exercise.
To summarize: the critical points $(x_0, y_0)$ of a function $f(x, y)$ constrained to a curve
$C(x, y) = 0$ can be obtained by finding unconstrained critical points $(x_0, y_0, \lambda_0)$ of a
function
$$ \tilde f(x, y, \lambda) = f(x, y) + \lambda C(x, y). $$
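Here is a small numerical sketch of this summary, using an example chosen for illustration (it is not from the notes): $f(x, y) = x + y$ on the unit circle $C(x, y) = x^2 + y^2 - 1 = 0$. The code looks for an unconstrained critical point of $\tilde f = f + \lambda C$ in $(x, y, \lambda)$ space by Newton's method; the function names, starting guess, and use of Cramer's rule are all assumptions of this sketch.

```python
import math

def grad_f_tilde(x, y, lam):
    # Gradient of f~ = (x + y) + lam * (x^2 + y^2 - 1) in (x, y, lam) space.
    return [1.0 + 2.0 * lam * x, 1.0 + 2.0 * lam * y, x * x + y * y - 1.0]

def hessian_f_tilde(x, y, lam):
    # Matrix of second derivatives of f~ (the Jacobian of grad_f_tilde).
    return [[2.0 * lam, 0.0, 2.0 * x],
            [0.0, 2.0 * lam, 2.0 * y],
            [2.0 * x, 2.0 * y, 0.0]]

def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def solve3(a, b):
    # Cramer's rule for a 3x3 linear system a . sol = b.
    d = det3(a)
    sol = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        sol.append(det3(m) / d)
    return sol

def newton(x, y, lam, iters=20):
    # Newton iteration for grad f~ = 0.
    for _ in range(iters):
        g = grad_f_tilde(x, y, lam)
        dx, dy, dlam = solve3(hessian_f_tilde(x, y, lam), [-gi for gi in g])
        x, y, lam = x + dx, y + dy, lam + dlam
    return x, y, lam

# Starting guess near the maximum of x + y on the circle.
x0, y0, lam0 = newton(0.8, 0.6, -0.7)
```

The iteration converges to $x_0 = y_0 = 1/\sqrt{2}$ with $\lambda_0 = -1/\sqrt{2}$; a starting guess near $(-1/\sqrt{2}, -1/\sqrt{2})$ with positive $\lambda$ would find the constrained minimum instead.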
We can do the same thing with our variational principle. Suppose we have an action
for $n$ degrees of freedom $q^i$, $i = 1, 2, \ldots, n$:
$$ S[q] = \int_{t_1}^{t_2} dt\, L(q(t), \dot q(t), t), $$
together with $m$ constraints
$$ C_\alpha(q, t) = 0, \qquad \alpha = 1, 2, \ldots, m. $$
* This follows from the basic calculus result that the gradient of a function is orthogonal to
the locus of points where the function takes a constant value.
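$$ \frac{\partial F^i}{\partial s^A} \left[ \frac{\partial L}{\partial q^i} - \frac{d}{dt}\frac{\partial L}{\partial \dot q^i} \right]_{q = F(s,t)} = 0. $$
$$ \tilde S[q, \lambda] = \int_{t_1}^{t_2} dt\, \tilde L = S[q] + \int_{t_1}^{t_2} dt\, \lambda^\alpha(t)\, C_\alpha(q(t), t), $$
$$ \delta \tilde S = \int_{t_1}^{t_2} dt \left( \frac{\partial L}{\partial q^i} - \frac{d}{dt}\frac{\partial L}{\partial \dot q^i} \right) \delta q^i + \int_{t_1}^{t_2} dt \left( \delta\lambda^\alpha\, C_\alpha + \lambda^\alpha \frac{\partial C_\alpha}{\partial q^i}\, \delta q^i \right). $$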
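The EL equations of motion coming from $\tilde L$ are
$$ \frac{\partial L}{\partial q^i} - \frac{d}{dt}\frac{\partial L}{\partial \dot q^i} + \lambda^\alpha \frac{\partial C_\alpha}{\partial q^i} = 0, $$
which come from the variations in $q^i$, and also
$$ C_\alpha = 0, $$
which come from variations of $\lambda^\alpha$. We have $(n + m)$ equations for $(n + m)$ unknowns. In
principle they can be solved to get the $q^i$ and the $\lambda^\alpha$ as functions of $t$.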
What is the meaning of these equations? Well, the constraints are there, of course.
But what about the modified EL expressions? The EL equations you would have gotten
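$$ 0 = \frac{\partial}{\partial s^A} C_\alpha(F(s,t), t) = \frac{\partial C_\alpha}{\partial q^i}\frac{\partial F^i}{\partial s^A}. $$
The meaning of the EL equations coming from $\tilde L$ is that the EL expressions coming from
$L$ no longer have to vanish; they simply have to be orthogonal to the constraint surface,
since the equations of motion say that
$$ \frac{\partial L}{\partial q^i} - \frac{d}{dt}\frac{\partial L}{\partial \dot q^i} = -\lambda^\alpha \frac{\partial C_\alpha}{\partial q^i}. $$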
One physically interprets this force term as the force required to keep the motion on
this surface.
It is easy to verify that these modified equations, $(n + m)$ in number, are equivalent to
the correct $(n - m)$ equations obtained for $s^A$ earlier. Indeed, we have the $m$ equations
of constraint. And, given this constraint, to say the EL expression coming from $L$ is
orthogonal to $C_\alpha = 0$ is the same as saying its projection tangent to the surface vanishes,
i.e.,
$$ \frac{\partial F^i}{\partial s^A} \left[ \frac{\partial L}{\partial q^i} - \frac{d}{dt}\frac{\partial L}{\partial \dot q^i} \right]_{q = F(s,t)} = -\lambda^\alpha \frac{\partial C_\alpha}{\partial q^i}\frac{\partial F^i}{\partial s^A} = 0, $$
which is precisely the content of the equations for the $s^A$ we obtained above.
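$$ 2\lambda x - m\ddot x = 0, \qquad 2\lambda y - mg - m\ddot y = 0, $$
Multiplying the first equation by $x$ and the second by $y$, adding, and using $x^2 + y^2 = l^2$ gives
$$ \lambda = \frac{m}{2l^2}\,(x\ddot x + y\ddot y + gy). $$
Differentiating the constraint twice with respect to time gives
$$ x\ddot x + y\ddot y = -(\dot x^2 + \dot y^2), $$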
so that the multiplier can be solved for in terms of the original velocity phase space
variables:
$$ \lambda = -\frac{m}{2l^2}\,(\dot x^2 + \dot y^2 - gy). $$
Substituting this result back into the EL equations for $x$ and $y$ we get the equations of
motion for $x$ and $y$ with the effect of the constraint (physically, the tension in the rod)
taken into account:
$$ m\ddot x = -\frac{m}{l^2}\,(\dot x^2 + \dot y^2 - gy)\,x, $$
$$ m\ddot y = -\frac{m}{l^2}\,(\dot x^2 + \dot y^2 - gy)\,y - mg. $$
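These two equations can be integrated directly in Cartesian coordinates. The following is a minimal sketch, assuming a hand-written Runge-Kutta integrator and illustrative values $g = 9.8$, $l = 1$, with the mass released from rest at an angle of 0.5 rad from the downward vertical (all of these are assumptions of the sketch). It checks that the motion stays on the circle $x^2 + y^2 = l^2$ and that energy is conserved, even though the constraint is never solved.

```python
import math

def rhs(state, g, l):
    x, y, vx, vy = state
    # Common factor (xdot^2 + ydot^2 - g y) / l^2 left after eliminating
    # the multiplier; the mass cancels from both equations.
    k = (vx * vx + vy * vy - g * y) / (l * l)
    return (vx, vy, -k * x, -k * y - g)

def rk4_step(state, dt, g, l):
    k1 = rhs(state, g, l)
    k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), g, l)
    k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), g, l)
    k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)), g, l)
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def simulate(theta0, t_end, n_steps, g=9.8, l=1.0):
    # Start on the circle, at rest, at angle theta0 from the downward vertical.
    state = (l * math.sin(theta0), -l * math.cos(theta0), 0.0, 0.0)
    dt = t_end / n_steps
    for _ in range(n_steps):
        state = rk4_step(state, dt, g, l)
    return state

g, l, theta0 = 9.8, 1.0, 0.5
x, y, vx, vy = simulate(theta0, t_end=5.0, n_steps=5000, g=g, l=l)

constraint_violation = abs(x * x + y * y - l * l)
energy_initial = -g * l * math.cos(theta0)          # energy per unit mass
energy_final = 0.5 * (vx * vx + vy * vy) + g * y
energy_drift = abs(energy_final - energy_initial)
```

The initial data must satisfy the constraint and its time derivative; the equations of motion then keep the particle on the circle, up to integration error.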
Note we never had to solve the constraint! Still, as a nice exercise you can check that,
after solving the constraint with $x = l\sin\theta$, $y = -l\cos\theta$, these remaining 2 equations are
equivalent to the familiar equation of motion for a plane pendulum, namely,
$$ \ddot\theta = -\frac{g}{l}\sin\theta, $$
where $\theta$ is the angular displacement from equilibrium.
Using Lagrange multipliers, the equations of motion for $x$ and $y$ tell us that the
pendulum moves according to a superposition of forces consisting of (i) gravity, (ii) the
force of constraint $\vec F_{\rm constraint}$ needed to keep the mass moving in a circle of radius $l$.
This latter force is supplied by the Lagrange multiplier terms in the equation of motion.
Indeed, thanks to these Lagrange multiplier terms, the radial component of the net force
is (exercise)
$$ \frac{\vec r}{l} \cdot \vec F = -\frac{m}{l}\,(\dot x^2 + \dot y^2), $$
which is the centripetal force, as it should be.
To summarize: Given a dynamical system with coordinates $q^i$ and Lagrangian $L$, we
can impose constraints $C_\alpha(q, t) = 0$ by the following recipe.
(i) Add variables $\lambda^\alpha$ (the Lagrange multipliers) to the configuration space.
(ii) Define a Lagrangian on the augmented velocity phase space: $\tilde L = L + \lambda^\alpha C_\alpha$.
(iii) Compute the usual EL equations from $\tilde L$ for the $q^i$ and $\lambda^\alpha$ degrees of freedom.
The resulting equations will include the constraints themselves as equations of motion
coming from variations of $\lambda^\alpha$. The equations coming from the variations of the $q^i$ will have
extra terms involving the multipliers. For Newtonian systems these terms represent the
forces in the system which are necessary to enforce the constraints.
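As a sketch of the recipe in use, consider the spherical pendulum with its single constraint $C = x^2 + y^2 + z^2 - l^2$. The choice of gravity along $-z$ (potential $mgz$) is an assumption made here for illustration.

```latex
\begin{align*}
\tilde L &= \tfrac{1}{2} m (\dot x^2 + \dot y^2 + \dot z^2) - m g z
            + \lambda\,(x^2 + y^2 + z^2 - l^2), \\
m\ddot x &= 2\lambda x, \qquad
m\ddot y = 2\lambda y, \qquad
m\ddot z = -mg + 2\lambda z, \\
0 &= x^2 + y^2 + z^2 - l^2 .
\end{align*}
```

In vector form the multiplier term is $2\lambda\vec r$, a force directed along the rod, which is what one expects of the force holding the mass at a fixed distance from the origin.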
Thus the Lagrange multiplier method has distinct advantages over our previous approach in which we just solve the constraints at the beginning. Namely, you do not have
to explicitly solve the constraints in order to compute the equations of motion, and the
equations of motion have additional physical information: the forces of constraint.