Bellman Equation

Existence and Uniqueness of a Fixed Point for the Bellman Operator in
Deterministic Dynamic Programming∗
Takashi Kamihigashi†
February 19, 2012
Abstract
We study existence and uniqueness of a fixed point for the Bellman
operator in deterministic dynamic programming. Without any topo-
logical assumption, we show that the Bellman operator has a unique
fixed point in a restricted domain, that this fixed point is the value
function, and that the value function can be computed by value iter-
ation.
Keywords: Dynamic programming, Bellman operator, value function, fixed point.
JEL Classification: C61
∗
This is an extensively revised version of the paper presented as “The Bellman Operator
as a Monotone Map” at the 11th SAET Conference in Faro, 2011. I would like to thank
all participants at this conference and the Workshop in Honor of Cuong Le Van in Exeter,
2011, for comments and discussions. In particular, I am grateful to Larry Blume for his
encouraging comments and suggestions, and to Juan Pablo Rincón-Zapatero, V. Filipe
Martins-da-Rocha, Yiannis Vailakis, and Cuong Le Van for their helpful comments and
discussions. Financial support from the Japan Society for the Promotion of Science is
gratefully acknowledged.
†
RIEB, Kobe University, Rokkodai, Nada, Kobe 657-8501 JAPAN. Email:
[email protected]. Tel/Fax: +81-78-803-7015.
1 Introduction
Dynamic programming is one of the most fundamental tools in economic
analysis. This has been particularly true since the publication of the influ-
ential book by Stokey and Lucas (1989). In this book and earlier studies,
however, models with unbounded returns were not fully covered, though such
models are extremely common in economics, especially in macroeconomics.
This problem has been treated in several important contributions, including
Alvarez and Stokey (1998), Le Van and Morhaim (2002), and Rincón-Zapatero
and Rodríguez-Palmero (2003, 2007, 2009).
Building on the work by the last pair of authors, Martins-da-Rocha and
Vailakis (2010) recently established one of the most general results on ex-
istence and uniqueness of a solution to the Bellman equation—or a fixed
point of the Bellman operator—applicable to models with unbounded re-
turns. Among the assumptions of their result are the following:
(i) The state space is $\mathbb{R}^n_+$.
(ii) The feasibility correspondence is continuous and compact-valued.
(iii) The return function is continuous.
(iv) Except at the origin, a return of −∞ can be avoided by following some continuous (suboptimal) policy.
Using these and other assumptions, Martins-da-Rocha and Vailakis (2010)
showed that the Bellman operator is a local contraction, which allowed them
to apply their general fixed point theorem. This is a powerful approach that guarantees not
only the existence and uniqueness of a fixed point within a restricted domain,
but also the continuity of the value function and the convergence of value
iteration from any initial function in that domain.
In this paper we show that the assumptions listed above are in fact unnec-
essary for establishing the existence and uniqueness of a fixed point for the
Bellman operator. Indeed, no topological assumption is required given the
remaining assumptions used by Martins-da-Rocha and Vailakis (2010). More
precisely, under weaker versions of those remaining assumptions, we obtain
the following conclusions: (a) the Bellman operator has a unique fixed point
in a restricted domain; (b) this fixed point is the value function; and (c) the
value function can be computed by value iteration starting from the lower
boundary of the restricted domain.
Although the uniqueness part of this result can be shown by extending
some of Stokey and Lucas’s (1989) arguments,1 the existence and conver-
gence parts require an additional tool. In the case of Martins-da-Rocha and
Vailakis (2010), it is their fixed point theorem on local contractions for both
existence and convergence (as well as uniqueness). In our case, we exploit
the monotonicity of the Bellman operator, apply the Knaster-Tarski fixed
point theorem (e.g., Aliprantis and Border, 2006) for existence, and develop
additional monotonicity-based arguments for convergence.2
Unlike the previous contributions mentioned above, we establish no regu-
larity property of the value and policy functions. However, many properties
of these functions can be shown separately under additional assumptions.
For example, if the return function is upper semi-continuous, then the value
function is also upper semi-continuous; see Le Van and Morhaim (2002). If
the return function is concave, then the value function is also concave and
thus continuous except on the boundary of the state space.3 Such arguments
can easily be added to our analysis. Moreover, on a practical level, it is
useful to know that the value function can be computed by value iteration
regardless of its regularity properties.
The rest of the paper is organized as follows. In the next section, we
describe our framework and state our main result, which is proved in Section
4. In Section 3, we present two examples. The first one is trivial but has
a continuum of fixed points, illustrating the importance of restricting the
domain of the Bellman operator. The second example shows that value
iteration may fail to converge to the value function unless the initial function
is chosen appropriately.
1 We do not entirely follow their approach since we prove uniqueness along with existence and convergence.
2 The monotonicity arguments of Bertsekas and Shreve (1978, Chapter 5) are not applicable to our setting since they require the return function to be everywhere positive or everywhere negative. Le Van and Vailakis (2011) also use a monotonicity argument, but they require the Bellman operator to be concave to ensure uniqueness of a fixed point. Both Bertsekas and Shreve (1978) and Le Van and Vailakis (2011) deal with stochastic models, which are beyond the scope of this paper.
3 For yet another example, if the return function is strictly concave, then the optimal policy correspondence is single-valued and thus continuous (provided that it is upper hemicontinuous).
2 The Main Result
Let X be a set. Let Γ be a nonempty-valued correspondence from X to X.
Let D be the graph of Γ:
D = {(x, y) ∈ X × X : y ∈ Γ(x)}. (2.1)
Let u : D → [−∞, ∞). In the optimization problem introduced below, X is
the state space, Γ is the feasibility correspondence, u is the return function,
and D is the domain of u.
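Let $\Pi$ and $\Pi(x_0)$ denote the set of feasible paths and that of feasible paths
from $x_0$, respectively:
\[ \Pi = \{\{x_t\}_{t=0}^{\infty} \in X^{\infty} : \forall t \in \mathbb{Z}_+,\ x_{t+1} \in \Gamma(x_t)\}, \tag{2.2} \]
\[ \Pi(x_0) = \{\{x_t\}_{t=1}^{\infty} \in X^{\infty} : \{x_t\}_{t=0}^{\infty} \in \Pi\}, \quad x_0 \in X. \tag{2.3} \]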
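Let β ≥ 0. Given $x_0 \in X$, consider the following optimization problem:
\[ \sup_{\{x_t\}_{t=1}^{\infty} \in \Pi(x_0)} \ \underset{T \uparrow \infty}{L} \ \sum_{t=0}^{T} \beta^t u(x_t, x_{t+1}), \tag{2.4} \]
where $L \in \{\varliminf, \varlimsup\}$ with $\varliminf = \liminf$ and $\varlimsup = \limsup$. Since u(x, y) < ∞
for all (x, y) ∈ D, the objective function is well-defined for any feasible path.
For $\{x_t\}_{t=0}^{\infty} \in \Pi$, we define
\[ S(\{x_t\}_{t=0}^{\infty}) = \underset{T \uparrow \infty}{L} \ \sum_{t=0}^{T} \beta^t u(x_t, x_{t+1}). \tag{2.5} \]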
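The value function $v^* : X \to \overline{\mathbb{R}}$ is defined by
\[ v^*(x_0) = \sup_{\{x_t\}_{t=1}^{\infty} \in \Pi(x_0)} S(\{x_t\}_{t=0}^{\infty}), \quad x_0 \in X. \tag{2.6} \]
Note that $v^*(x_0)$ is unaffected if $\Pi(x_0)$ is replaced by $\Pi^0(x_0)$,4 where
\[ \Pi^0 = \{\{x_t\}_{t=0}^{\infty} \in \Pi : S(\{x_t\}_{t=0}^{\infty}) > -\infty\}, \tag{2.7} \]
\[ \Pi^0(x_0) = \{\{x_t\}_{t=1}^{\infty} \in \Pi(x_0) : \{x_t\}_{t=0}^{\infty} \in \Pi^0\}, \quad x_0 \in X. \tag{2.8} \]
4 We follow the convention that sup ∅ = −∞.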
Let V be the set of functions from X to [−∞, ∞). The Bellman operator
B on V is defined by
\[ (Bv)(x) = \sup_{y \in \Gamma(x)} \{u(x, y) + \beta v(y)\}, \quad x \in X,\ v \in V. \tag{2.9} \]
Given v ∈ V, it need not be the case that Bv ∈ V. A fixed point of B is a
function v ∈ V such that Bv = v.
Let v, w ∈ V . We define the partial order ≤ on V in the usual way:
v≤w ⇐⇒ ∀x ∈ X, v(x) ≤ w(x). (2.10)
It is immediate from (2.9) that B is a monotone operator:
v≤w ⇒ Bv ≤ Bw. (2.11)
If v ≤ w, we define the order interval [v, w] by
[v, w] = {f ∈ V : v ≤ f ≤ w}. (2.12)
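To make (2.9)–(2.11) concrete, the following Python sketch applies the Bellman operator on a small finite state space and checks monotonicity on one pair of functions. The dictionary representation, the names `bellman` and `leq`, and the three-state example are illustrative assumptions, not part of the paper; the sketch also assumes β > 0 so that products with −∞ stay well-defined in floating point.

```python
import math

def bellman(v, gamma, u, beta):
    """Return Bv, where (Bv)(x) = max over y in gamma[x] of u[(x, y)] + beta * v[y].

    On a finite state space the supremum in (2.9) is a maximum.
    """
    return {x: max(u[(x, y)] + beta * v[y] for y in gamma[x]) for x in gamma}

def leq(v, w):
    """Pointwise partial order (2.10): v <= w iff v(x) <= w(x) for all x."""
    return all(v[x] <= w[x] for x in v)

# Hypothetical example: three states, nonempty feasible sets, one return of -inf.
gamma = {0: [0, 1], 1: [1, 2], 2: [2]}
u = {(0, 0): 0.0, (0, 1): 1.0, (1, 1): -math.inf, (1, 2): 2.0, (2, 2): 0.5}
beta = 0.9

v = {0: -1.0, 1: -math.inf, 2: 0.0}
w = {0: 0.0, 1: 1.0, 2: 3.0}

Bv = bellman(v, gamma, u, beta)
Bw = bellman(w, gamma, u, beta)
monotone_here = leq(v, w) and leq(Bv, Bw)  # instance of (2.11)
```

A single pair (v, w) only illustrates (2.11); the general statement follows directly from the definition because each term u(x, y) + βv(y) is nondecreasing in v(y).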
We are ready to state the main result of this paper:

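Theorem 2.1. Suppose that there exist $\underline{v}, \overline{v} \in V$ such that
\[ \underline{v} \le \overline{v}, \tag{2.13} \]
\[ B\underline{v} \ge \underline{v}, \tag{2.14} \]
\[ B\overline{v} \le \overline{v}, \tag{2.15} \]
\[ \forall \{x_t\}_{t=0}^{\infty} \in \Pi^0, \quad \varliminf_{t \uparrow \infty} \beta^t \underline{v}(x_t) \ge 0, \tag{2.16} \]
\[ \forall \{x_t\}_{t=0}^{\infty} \in \Pi, \quad \varlimsup_{t \uparrow \infty} \beta^t \overline{v}(x_t) \le 0. \tag{2.17} \]
Then the following conclusions hold:

(a) The Bellman operator B has a unique fixed point in $[\underline{v}, \overline{v}]$.
(b) This fixed point is the value function $v^*$.
(c) The sequence $\{B^n \underline{v}\}_{n=1}^{\infty}$ converges to $v^*$ pointwise.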
Proof. See Section 4.
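Conclusion (c) can be tried out numerically in a simple special case. The sketch below assumes a finite state space, bounded returns, and 0 < β < 1, so that the constant functions min u/(1 − β) and max u/(1 − β) serve as the lower and upper bounds of the order interval and satisfy (2.13)–(2.17). The function names, the stopping rule, and the two-state example are hypothetical and chosen only for illustration.

```python
import math

def bellman(v, gamma, u, beta):
    """(Bv)(x) = max over y in gamma[x] of u[(x, y)] + beta * v[y] (finite X)."""
    return {x: max(u[(x, y)] + beta * v[y] for y in gamma[x]) for x in gamma}

def value_iteration_from_below(gamma, u, beta, tol=1e-12, max_iter=10_000):
    """Iterate B starting from the constant lower bound min(u) / (1 - beta)."""
    lower = min(u.values()) / (1.0 - beta)
    v = {x: lower for x in gamma}
    for _ in range(max_iter):
        nxt = bellman(v, gamma, u, beta)
        gap = max(abs(nxt[x] - v[x]) for x in gamma)
        v = nxt
        if gap < tol:  # stop once successive iterates are numerically equal
            break
    return v

# Hypothetical two-state example with beta = 0.5.
gamma = {0: [0, 1], 1: [0, 1]}
u = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 2.0, (1, 1): 0.0}
v_star = value_iteration_from_below(gamma, u, beta=0.5)
```

In this example the value function can be found by hand: staying at state 0 forever yields 1/(1 − 0.5) = 2, and from state 1 the best choice is to move to state 0, yielding 2 + 0.5 · 2 = 3. The iterates approach these values from below.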

Theorem 2.1 does not require any of the assumptions (i)–(iv) listed in the
introduction. In fact, no topological assumption is required given conditions
(2.13)–(2.17), as far as conclusions (a)–(c) are concerned. Rincón-Zapatero
and Rodríguez-Palmero (2003) offer several nontrivial, economically relevant
examples satisfying stronger versions of these conditions. A detailed com-
parison between Theorem 2.1 and Martins-da-Rocha and Vailakis’s (2010)
result is available in an earlier version of this paper (Kamihigashi, 2011).
If there exist $\underline{v}, \overline{v} \in V$ satisfying (2.13)–(2.15), then the Bellman operator
B has a fixed point in $[\underline{v}, \overline{v}]$ by the Knaster-Tarski fixed point theorem (see
Section 4 for a precise argument). But if (2.16) and (2.17) are violated, then
B can have multiple fixed points in $[\underline{v}, \overline{v}]$; see Section 3.1 for an example.
If (2.16) is strengthened by replacing $\Pi^0$ with $\Pi$, then it essentially follows
from Stokey and Lucas (1989, Theorem 4.3) that any fixed point of B in $[\underline{v}, \overline{v}]$
coincides with $v^*$. However, this strengthened version of (2.16) is almost
never satisfied if u is unbounded below. Conditions similar to (2.16) have
been used to solve this problem since Le Van and Morhaim (2002).
In conclusion (c), we have convergence to $v^*$ only from $\underline{v}$. Our argument
for (c) is based on the observation that the limit of the increasing sequence
$\{B^n \underline{v}\}_{n=1}^{\infty}$ is the supremum of the sequence. This allows us to interchange this
supremum and another supremum (see (4.16)–(4.18)) to show that the limit
is the value function $v^*$. The case of the decreasing sequence $\{B^n \overline{v}\}_{n=1}^{\infty}$, which
also converges pointwise, is not symmetric since the sup and inf operators
are in general not interchangeable. See Section 3.2 for an example satisfying
(2.13)–(2.17) in which $\lim_{n \uparrow \infty} B^n \overline{v} \ne v^*$.5
5 See Strauch (1966, p. 880) for a related example of an undiscounted stochastic model.
3 Counterexamples
3.1 Multiple Fixed Points
The Bellman operator B can have multiple fixed points in $[\underline{v}, \overline{v}]$ if (2.16) and
(2.17) are violated. To see this, suppose that β > 0, $X = \mathbb{Z}_+$, and
\[ \forall i \in X, \quad \Gamma(i) = \{i + 1\}, \quad u(i, i + 1) = 0. \tag{3.1} \]
At each state i ∈ X, there is only one feasible choice (i + 1) with a return
of zero. Thus $v^*(i) = 0$ for all i ∈ X. Let α > 0. Define $\underline{v}, \overline{v} \in V$ by
$\underline{v}(i) = -\alpha\beta^{-i}$ and $\overline{v}(i) = \alpha\beta^{-i}$ for all i ∈ X. Then (2.13) holds. Since
$\underline{v}(i) = \beta\underline{v}(i + 1)$ and $\overline{v}(i) = \beta\overline{v}(i + 1)$ for all i ∈ X, (2.14) and (2.15) hold
with equality. This observation alone shows that B has multiple fixed points
in $[\underline{v}, \overline{v}]$. In fact, for any a ∈ [−α, α], the function v defined by $v(i) = a\beta^{-i}$
for all i ∈ X is a fixed point of B. Therefore B has a continuum of fixed
points. Note that there is only one feasible path from state 0, which is given
by $\{x_t\}_{t=1}^{\infty} = \{t\}_{t=1}^{\infty}$. Then $\beta^t \underline{v}(x_t) = -\alpha$ and $\beta^t \overline{v}(x_t) = \alpha$ for all $t \in \mathbb{Z}_+$; i.e.,
(2.16) and (2.17) are violated in this example.
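The fixed-point claim in this example is easy to check numerically. Because Γ(i) = {i + 1} and u(i, i + 1) = 0, the supremum in the Bellman operator has a single term, so (Bv)(i) = βv(i + 1). The sketch below verifies this on the first 40 states for a few values of a; the helper names and the choices β = 0.5 and α = 1 are assumptions made only for illustration.

```python
import math

def v_family(a, beta):
    """Candidate fixed point v(i) = a * beta**(-i) from Section 3.1."""
    return lambda i: a * beta ** (-i)

def bellman_at(v, i, beta):
    """(Bv)(i) for the model (3.1): one feasible successor i + 1, zero return."""
    return 0.0 + beta * v(i + 1)

beta, alpha = 0.5, 1.0
candidates = [-alpha, -0.3, 0.0, 0.7, alpha]

# Each member of the family satisfies Bv = v on the states that are checked.
is_fixed = {
    a: all(
        math.isclose(bellman_at(v_family(a, beta), i, beta), v_family(a, beta)(i))
        for i in range(40)
    )
    for a in candidates
}

# Along the unique feasible path x_t = t, beta**t * v(x_t) stays equal to a,
# so it does not vanish unless a = 0.
discounted = {a: beta ** 30 * v_family(a, beta)(30) for a in candidates}
```

Only a = 0 gives the value function; every other member of the family is a fixed point whose discounted tail does not vanish along the feasible path.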
3.2 Nonconvergence to v∗
Even under (2.13)–(2.17), the sequence $\{B^n \overline{v}\}_{n=1}^{\infty}$ may not converge to $v^*$.
To see this, let α > 0 and suppose that α < β < 1. Consider the example
depicted in Figure 1; more precisely, assume the following:
\[ X = \{(i, j) : i, j \in \mathbb{Z}_+,\ j \le i\}, \tag{3.2} \]
\[ \Gamma((i, j)) = \begin{cases} \{(i', 0) : i' \in \mathbb{Z}_+\} & \text{if } (i, j) = (0, 0), \\ \{(i, j)\} & \text{if } i = j \ne 0, \\ \{(i, j + 1)\} & \text{if } j < i, \end{cases} \tag{3.3} \]
\[ u((i, j), (i', j')) = \begin{cases} -\alpha & \text{if } (i, j) = (i', j') = (0, 0), \\ -\beta^{-i} & \text{if } (i, j) = (i', j') \ne (0, 0), \\ 0 & \text{otherwise.} \end{cases} \tag{3.4} \]
Then the value function $v^*$ can be computed directly:6
\[ v^*((i, j)) = \begin{cases} -\alpha/(1 - \beta) & \text{if } (i, j) = (0, 0), \\ -\beta^{-j}/(1 - \beta) & \text{otherwise.} \end{cases} \tag{3.5} \]
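Let $\underline{v} = v^*$ and $\overline{v} = 0$. Then $\underline{v} \le \overline{v}$ and $B\underline{v} = \underline{v}$. Since u ≤ 0, we have
$B\overline{v} \le \overline{v}$. Thus (2.13)–(2.15) hold. As any feasible path eventually becomes
constant, (2.16) and (2.17) hold with equality. Hence Theorem 2.1 applies.
Consider the sequence $\{\overline{v}_n\}_{n=1}^{\infty} \equiv \{B^n \overline{v}\}_{n=1}^{\infty}$. If (i, j) ≠ (0, 0), there is
only one feasible transition from (i, j), so that $\overline{v}_n((i, j))$ can be computed
directly:
\[ \overline{v}_n((i, j)) = \begin{cases} -\beta^{-j} \sum_{k=0}^{n-(i-j)-1} \beta^k & \text{if } i > 0 \text{ and } n \ge i - j + 1, \\ 0 & \text{otherwise.} \end{cases} \tag{3.6} \]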
6 Let i ∈ N. Then at state (i, i), we have $v^*((i, i)) = -\beta^{-i}/(1 - \beta)$. Note that $v^*((i, i - k)) = \beta^k v^*((i, i))$ for k = 1, …, i; thus $v^*((i, j)) = \beta^{i-j} v^*((i, i)) = -\beta^{-j}/(1 - \beta)$. It remains to compute $v^*((0, 0))$. If $x_t = (0, 0)$ for all $t \in \mathbb{Z}_+$, then $S(\{x_t\}_{t=0}^{\infty}) = -\alpha/(1 - \beta)$. If $x_1 = (i, 0)$ with i > 0, then $S(\{x_t\}_{t=0}^{\infty}) = \beta v^*((i, 0)) = -\beta/(1 - \beta) < -\alpha/(1 - \beta)$. Hence it is never optimal to leave state (0, 0), so that $v^*((0, 0)) = -\alpha/(1 - \beta)$.
Figure 1: States (i, j) ∈ X (circles), feasible transitions (arrows), and associated returns (values adjacent to arrows) under (3.2)–(3.4)
This formula works for (i, j) = (0, 0) as well; i.e., $\overline{v}_n((0, 0)) = 0$ for all n ∈ N.7
Now letting $\overline{v}^* = \lim_{n \uparrow \infty} \overline{v}_n$,8 we see that $\overline{v}^*((i, j)) = v^*((i, j))$ for all
(i, j) ∈ X \ {(0, 0)}, but $\overline{v}^*((0, 0)) = 0 > v^*((0, 0))$; i.e., the sequence $\{\overline{v}_n\}_{n=1}^{\infty}$
fails to converge to $v^*$ at (0, 0).
Interestingly, the sequence $\{B^n \overline{v}^*\}_{n=1}^{\infty}$ restarted from $\overline{v}^*$ converges to $v^*$.
Indeed, $(B^n \overline{v}^*)((0, 0)) = -\alpha(1 + \beta + \cdots + \beta^{n-1}) \to v^*((0, 0))$ as n ↑ ∞.
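The behavior at state (0, 0) can be reproduced with a short Python sketch. Since the state space in (3.2) is infinite, the code truncates it to i ≤ N; this truncation, the parameter values, and the function names are assumptions of the sketch. The truncated model reproduces the iterates at (0, 0) only for n ≤ N, because state (N, 0) must still carry a zero value for the supremum over successors of (0, 0) to equal zero.

```python
import math

def bellman(v, alpha, beta, N):
    """One application of B for the example (3.2)-(3.4), truncated to i <= N."""
    out = {}
    for i in range(N + 1):
        for j in range(i + 1):
            if i == 0:
                # At (0, 0): stay (return -alpha) or jump to some (k, 0), k >= 1 (return 0).
                stay = -alpha + beta * v[(0, 0)]
                leave = max(beta * v[(k, 0)] for k in range(1, N + 1))
                out[(0, 0)] = max(stay, leave)
            elif i == j:
                # Absorbing state (i, i) with return -beta**(-i) every period.
                out[(i, j)] = -beta ** (-i) + beta * v[(i, j)]
            else:
                # Forced move to (i, j + 1) with zero return.
                out[(i, j)] = beta * v[(i, j + 1)]
    return out

def iterate(v, n, alpha, beta, N):
    for _ in range(n):
        v = bellman(v, alpha, beta, N)
    return v

alpha, beta, N, n = 0.5, 0.9, 25, 20  # requires 0 < alpha < beta < 1 and n <= N
states = [(i, j) for i in range(N + 1) for j in range(i + 1)]

# Value iteration from the upper bound 0: the iterate at (0, 0) stays at 0.
from_above = iterate({s: 0.0 for s in states}, n, alpha, beta, N)
v_star_00 = -alpha / (1 - beta)

# Restart from the pointwise limit: 0 at (0, 0), v* elsewhere.
limit = {(i, j): -beta ** (-j) / (1 - beta) for (i, j) in states}
limit[(0, 0)] = 0.0
restarted = iterate(limit, n, alpha, beta, N)
```

With these parameters `from_above[(0, 0)]` remains 0 while `v_star_00` is −5, and `restarted[(0, 0)]` follows the partial sums −α(1 + β + ⋯ + β^{n−1}), which tend to −5.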
4 Proof of Theorem 2.1

The proof consists of three lemmas and a concluding argument. The proof of
the first lemma slightly generalizes an argument of Stokey and Lucas (1989,
Theorem 4.3). The second lemma essentially shows that $B^T v$ with T ∈ N
and v ∈ V is the value function of the T-period problem with the value
of the terminal stock $x_T$ given by $v(x_T)$. This result extends the classical
idea of Bertsekas and Shreve (1978, Section 3.2) to our setting. The last
lemma is less trivial than the first two. The concluding argument applies the
Knaster-Tarski fixed point theorem and combines the first and last lemmas.
7 To see this, define $\overline{v}_0 = \overline{v} = 0$. Then $\overline{v}_0((0, 0)) = 0$. Let $n \in \mathbb{Z}_+$. With $\overline{v}_n$ given by (3.6), we have $\overline{v}_{n+1}((0, 0)) = \beta \sup_{i \in \mathbb{N}} \overline{v}_n((i, 0)) = 0$ since $\overline{v}_n((i, 0)) = 0$ for all i ≥ n. By induction, $\overline{v}_n((0, 0)) = 0$ for all n ∈ N.
8 In this paper the limit is taken pointwise: $\overline{v}^*(x) = \lim_{n \uparrow \infty} \overline{v}_n(x)$ for all x ∈ X.
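Lemma 4.1. Let $\overline{v} \in V$ satisfy (2.17). Let v ∈ V be a fixed point of B with
$v \le \overline{v}$. Then $v \le v^*$.
Proof. Let $x_0 \in X$. If $v(x_0) = -\infty$, then $v(x_0) \le v^*(x_0)$. Consider the case
$v(x_0) > -\infty$. Let ε > 0. Let $\{\epsilon_t\}_{t=0}^{\infty} \subset (0, \infty)$ be such that $\sum_{t=0}^{\infty} \beta^t \epsilon_t \le \epsilon$.
Since v = Bv, for any $t \in \mathbb{Z}_+$ and $x_t \in X$, there exists $x_{t+1} \in \Gamma(x_t)$ such that
\[ v(x_t) \le u(x_t, x_{t+1}) + \beta v(x_{t+1}) + \epsilon_t. \tag{4.1} \]
We pick $x_1 \in \Gamma(x_0)$, $x_2 \in \Gamma(x_1)$, … so that (4.1) holds for all $t \in \mathbb{Z}_+$. Then
$\{x_t\}_{t=1}^{\infty} \in \Pi(x_0)$. By repeated application of (4.1) we have
\begin{align}
v(x_0) &\le u(x_0, x_1) + \beta v(x_1) + \epsilon_0 \tag{4.2} \\
&\le u(x_0, x_1) + \beta[u(x_1, x_2) + \beta v(x_2) + \epsilon_1] + \epsilon_0 \tag{4.3} \\
&\ \ \vdots \tag{4.4} \\
&\le \sum_{t=0}^{T-1} \beta^t u(x_t, x_{t+1}) + \beta^T v(x_T) + \epsilon, \quad \forall T \in \mathbb{N}. \tag{4.5}
\end{align}
Since $v(x_0) > -\infty$, we have $\beta^T v(x_T) > -\infty$ for all T ∈ N. It follows that
\[ v(x_0) - \epsilon - \beta^T v(x_T) \le \sum_{t=0}^{T-1} \beta^t u(x_t, x_{t+1}). \tag{4.6} \]
Applying $\varliminf_{T \uparrow \infty}$ to both sides, recalling (2.5) and (2.6), we have
\[ v(x_0) - \epsilon - \varlimsup_{T \uparrow \infty} \beta^T v(x_T) \le \varliminf_{T \uparrow \infty} \sum_{t=0}^{T-1} \beta^t u(x_t, x_{t+1}) \le S(\{x_t\}_{t=0}^{\infty}) \le v^*(x_0). \tag{4.7} \]
Since $v \le \overline{v}$, (2.17) implies $\varlimsup_{T \uparrow \infty} \beta^T v(x_T) \le 0$, so we have $v(x_0) - \epsilon \le v^*(x_0)$. Since this is true for any ε > 0, we
have $v(x_0) \le v^*(x_0)$. Since $x_0$ was arbitrary, we obtain $v \le v^*$.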
For any v ∈ V , define v1 = Bv; for each n ∈ N, provided that vn ∈ V ,
define vn+1 = Bvn . The following remark follows from (2.11).
Remark 4.1. Let v, w ∈ V satisfy v ≤ w and Bw ≤ w. Then for all n ∈ N,
we have vn ≤ w and thus vn ∈ V .
Lemma 4.2. Let $\overline{v} \in V$ satisfy (2.15). Let v ∈ V satisfy $v \le \overline{v}$. Then for
any T ∈ N, we have $v_T \in V$ and
\[ \forall x_0 \in X, \quad v_T(x_0) = \sup_{\{x_t\}_{t=1}^{\infty} \in \Pi(x_0)} \left\{ \sum_{t=0}^{T-1} \beta^t u(x_t, x_{t+1}) + \beta^T v(x_T) \right\}. \tag{4.8} \]
Proof. Note from (2.15) and Remark 4.1 with $w = \overline{v}$ that $v_n \in V$ for all
n ∈ N. For any $x_0 \in X$, we have
\begin{align}
v_1(x_0) &= \sup_{x_1 \in \Gamma(x_0)} \{u(x_0, x_1) + \beta v(x_1)\} \tag{4.9} \\
&= \sup_{x_1 \in \Gamma(x_0)} \ \sup_{\{x_t\}_{t=2}^{\infty} \in \Pi(x_1)} \{u(x_0, x_1) + \beta v(x_1)\} \tag{4.10} \\
&= \sup_{\{x_t\}_{t=1}^{\infty} \in \Pi(x_0)} \{u(x_0, x_1) + \beta v(x_1)\}, \tag{4.11}
\end{align}
where (4.10) holds since $\{u(x_0, x_1) + \beta v(x_1)\}$ is independent of $\{x_t\}_{t=2}^{\infty}$,9 and
(4.11) follows by combining the two suprema (see Kamihigashi, 2008, Lemma
1). It follows that (4.8) holds for T = 1.
9 This step uses the assumption that Γ is nonempty-valued.
Now assume (4.8) for T = n ∈ N. For any $x_0 \in X$, we have
\begin{align}
v_{n+1}(x_0) &= \sup_{x_1 \in \Gamma(x_0)} \{u(x_0, x_1) + \beta v_n(x_1)\} \tag{4.12} \\
&= \sup_{x_1 \in \Gamma(x_0)} \left\{ u(x_0, x_1) + \beta \sup_{\{x_{i+1}\}_{i=1}^{\infty} \in \Pi(x_1)} \left\{ \sum_{i=0}^{n-1} \beta^i u(x_{i+1}, x_{i+2}) + \beta^n v(x_{n+1}) \right\} \right\} \tag{4.13} \\
&= \sup_{x_1 \in \Gamma(x_0)} \ \sup_{\{x_{i+1}\}_{i=1}^{\infty} \in \Pi(x_1)} \left\{ \sum_{t=0}^{n} \beta^t u(x_t, x_{t+1}) + \beta^{n+1} v(x_{n+1}) \right\} \tag{4.14} \\
&= \sup_{\{x_t\}_{t=1}^{\infty} \in \Pi(x_0)} \left\{ \sum_{t=0}^{n} \beta^t u(x_t, x_{t+1}) + \beta^{n+1} v(x_{n+1}) \right\}, \tag{4.15}
\end{align}
where (4.13) uses (4.8) for T = n, (4.14) holds since $u(x_0, x_1)$ is independent
of $\{x_{i+1}\}_{i=1}^{\infty}$, and (4.15) follows by combining the two suprema (see Kamihigashi,
2008, Lemma 1). It follows that (4.8) holds for T = n + 1. By
induction, (4.8) holds for all T ∈ N.
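Identity (4.8) can be checked by brute force on a small finite model: T applications of the Bellman operator should agree with the maximum over all feasible length-T paths of the discounted returns plus the discounted terminal value. The Python sketch below does this; the three-state model, the terminal function, and all names are hypothetical. Only the first T steps of a path enter the objective, and because Γ is nonempty-valued every finite feasible path extends to an infinite one, so enumerating finite paths suffices.

```python
import math

def bellman(v, gamma, u, beta):
    """(Bv)(x) = max over y in gamma[x] of u[(x, y)] + beta * v[y] (finite X)."""
    return {x: max(u[(x, y)] + beta * v[y] for y in gamma[x]) for x in gamma}

def iterate(v, gamma, u, beta, T):
    """Return B^T v by applying the Bellman operator T times."""
    for _ in range(T):
        v = bellman(v, gamma, u, beta)
    return v

def finite_horizon_value(x0, v, gamma, u, beta, T):
    """Right-hand side of (4.8), computed by enumerating feasible paths x_1..x_T."""
    best = -math.inf

    def extend(x, t, acc):
        nonlocal best
        if t == T:
            best = max(best, acc + beta ** T * v[x])
            return
        for y in gamma[x]:
            extend(y, t + 1, acc + beta ** t * u[(x, y)])

    extend(x0, 0, 0.0)
    return best

# Hypothetical three-state model and terminal value function.
gamma = {0: [0, 1], 1: [1, 2], 2: [0, 2]}
u = {(0, 0): 1.0, (0, 1): -0.5, (1, 1): 0.2, (1, 2): 2.0, (2, 0): -1.0, (2, 2): 0.0}
v = {0: 0.3, 1: -2.0, 2: 1.5}
beta, T = 0.9, 5

lhs = iterate(v, gamma, u, beta, T)
rhs = {x: finite_horizon_value(x, v, gamma, u, beta, T) for x in gamma}
```

The two sides agree up to floating-point rounding, since the recursion and the enumeration add the same terms in a different order.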
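Lemma 4.3. Let $\underline{v}, \overline{v} \in V$ satisfy (2.13)–(2.16). Then $\underline{v}^* \equiv \lim_{T \uparrow \infty} \underline{v}_T \ge v^*$.10
Proof. Note from (2.13)–(2.15), (2.11), and Remark 4.1 that $\{\underline{v}_T\}_{T=1}^{\infty}$ is an
increasing sequence in V. Thus for any $x_0 \in X$, we have
\begin{align}
\underline{v}^*(x_0) &= \sup_{T \in \mathbb{N}} \underline{v}_T(x_0) \tag{4.16} \\
&= \sup_{T \in \mathbb{N}} \ \sup_{\{x_t\}_{t=1}^{\infty} \in \Pi(x_0)} \left\{ \sum_{t=0}^{T-1} \beta^t u(x_t, x_{t+1}) + \beta^T \underline{v}(x_T) \right\} \tag{4.17} \\
&= \sup_{\{x_t\}_{t=1}^{\infty} \in \Pi(x_0)} \ \sup_{T \in \mathbb{N}} \left\{ \sum_{t=0}^{T-1} \beta^t u(x_t, x_{t+1}) + \beta^T \underline{v}(x_T) \right\} \tag{4.18} \\
&\ge \sup_{\{x_t\}_{t=1}^{\infty} \in \Pi^0(x_0)} \ \underset{T \uparrow \infty}{L} \left\{ \sum_{t=0}^{T-1} \beta^t u(x_t, x_{t+1}) + \beta^T \underline{v}(x_T) \right\} \tag{4.19} \\
&\ge \sup_{\{x_t\}_{t=1}^{\infty} \in \Pi^0(x_0)} \left\{ \underset{T \uparrow \infty}{L} \sum_{t=0}^{T-1} \beta^t u(x_t, x_{t+1}) + \varliminf_{T \uparrow \infty} \beta^T \underline{v}(x_T) \right\} \tag{4.20} \\
&\ge \sup_{\{x_t\}_{t=1}^{\infty} \in \Pi^0(x_0)} \ \underset{T \uparrow \infty}{L} \sum_{t=0}^{T-1} \beta^t u(x_t, x_{t+1}) = v^*(x_0), \tag{4.21}
\end{align}
where (4.17) uses Lemma 4.2, (4.18) follows by interchanging the two suprema
(see Kamihigashi, 2008, Lemma 1), (4.19) holds since $\Pi^0(x_0) \subset \Pi(x_0)$ (recall
(2.8)) and $L_{T \uparrow \infty}\, a_T \le \sup_{T \in \mathbb{N}} a_T$ for any sequence $\{a_T\}$ in [−∞, ∞), (4.20)
follows from the properties of $\varliminf$ and $\varlimsup$,11 and the inequality in (4.21) uses
(2.16). It follows that $\underline{v}^* \ge v^*$.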
To complete the proof of Theorem 2.1, suppose that there exist $\underline{v}, \overline{v} \in V$
satisfying (2.13)–(2.17). The order interval $[\underline{v}, \overline{v}]$ is partially ordered by ≤
(recall (2.10)). Given any $F \subset [\underline{v}, \overline{v}]$, we have $\sup F \in [\underline{v}, \overline{v}]$ because
\[ \forall x \in X, \quad (\sup F)(x) = \sup\{f(x) : f \in F\} \in [\underline{v}(x), \overline{v}(x)]. \tag{4.22} \]
Since B is a monotone operator, and since $B([\underline{v}, \overline{v}]) \subset [\underline{v}, \overline{v}]$ by (2.13)–(2.15)
and (2.11), it follows that B has a fixed point v in $[\underline{v}, \overline{v}]$ by the Knaster-Tarski
fixed point theorem (e.g., Aliprantis and Border, 2006, p. 16). Since
$\underline{v} \le v = Bv$, we have $\underline{v}_n \le v$ for all n ∈ N by Remark 4.1; thus $\underline{v}^* \le v$.12
Since $v \le v^*$ by Lemma 4.1, and since $v^* \le \underline{v}^*$ by Lemma 4.3, it follows that
$v \le v^* \le \underline{v}^* \le v$. Hence $v = v^* = \underline{v}^*$. Therefore $v^*$ is the unique fixed point of
B in $[\underline{v}, \overline{v}]$; this establishes (a) and (b). Finally (c) holds since $\underline{v}^* = v^*$.
10 Here $\underline{v}_T = B^T \underline{v}$ for all T ∈ N and $\underline{v}^*(x) = \lim_{T \uparrow \infty} \underline{v}_T(x)$ for all x ∈ X.
11 We have $\varliminf(a_t + b_t) \ge \varliminf a_t + \varliminf b_t$ and $\varlimsup(a_t + b_t) \ge \varlimsup a_t + \varliminf b_t$ for any sequences $\{a_t\}$ and $\{b_t\}$ in [−∞, ∞) whenever both sides are well-defined (e.g., Michel, 1990, p. 706).
12 See footnote 10 for the definitions of $\underline{v}_n$ and $\underline{v}^*$.
References
Aliprantis, C.D., Border, K.C., 2006, Infinite Dimensional Analysis: A Hitchhiker’s Guide, 3rd Edition, Springer-Verlag, Berlin.
Alvarez, F., Stokey, N.L., 1998, Dynamic programming with homogeneous functions, Journal of Economic Theory 82, 167–189.
Bertsekas, D.P., Shreve, S.E., 1978, Stochastic Optimal Control: The Discrete Time Case, Academic Press, New York.
Kamihigashi, T., 2008, On the principle of optimality for nonstationary deterministic dynamic programming, International Journal of Economic Theory 4, 519–525.
Kamihigashi, T., 2011, Existence and uniqueness of a fixed point for the Bellman operator in deterministic dynamic programming, RIEB Discussion Paper DP2011-23.
Le Van, C., Morhaim, L., 2002, Optimal growth models with bounded or unbounded returns: a unifying approach, Journal of Economic Theory 105, 158–187.
Le Van, C., Vailakis, Y., 2011, Monotone concave (convex) operators: applications to stochastic dynamic programming with unbounded returns, mimeo, University of Paris 1 and University of Exeter Business School.
Martins-da-Rocha, V.F., Vailakis, Y., 2010, Existence and uniqueness of a fixed point for local contractions, Econometrica 78, 1127–1141.
Michel, P., 1990, Some clarifications on the transversality condition, Econometrica 58, 705–723.
Rincón-Zapatero, J.P., Rodríguez-Palmero, C., 2003, Existence and uniqueness of solutions to the Bellman equation in the unbounded case, Econometrica 71, 1519–1555.
Rincón-Zapatero, J.P., Rodríguez-Palmero, C., 2007, Recursive utility with unbounded aggregators, Economic Theory 33, 381–391.
Rincón-Zapatero, J.P., Rodríguez-Palmero, C., 2009, Corrigendum to “Existence and uniqueness of solutions to the Bellman equation in the unbounded case” Econometrica, Vol. 71, No. 5 (September, 2003), 1519–1555, Econometrica 77, 317–318.
Stokey, N., Lucas, R.E., Jr., 1989, Recursive Methods in Economic Dynamics, Harvard University Press, Cambridge, MA.
Strauch, R.E., 1966, Negative dynamic programming, Annals of Mathematical Statistics 37, 871–890.
