The Dynamics of Distributions in Continuous-Time Stochastic Models




Christian Bayer (a) and Klaus Wälde (b)
(a) Weierstraß Institute Berlin and (b) Johannes-Gutenberg University Mainz [1]

November 2015

We study an optimal precautionary-saving problem in continuous time. The optimal
evolution of the state variables, wealth and labour market status, can
be described by stochastic differential equations. We derive conditions under which
an invariant distribution for state variables exists and is unique. We also provide
conditions such that initial distributions converge to the long-run distribution. By
deriving Fokker-Planck equations for these state variables, we can provide an intuitive
interpretation of the evolution and determinants of the implied distribution of
wealth.

JEL Codes: C62, D91, J63


Keywords: uncertainty in continuous time, Poisson process,
existence, uniqueness, stability, Fokker-Planck equations

1 Introduction
[Motivation] Dynamic and stochastic models are widely used for macroeconomic analysis and
also for many analyses in labour economics. When the development of these models started with
the formulation of stochastic growth models, a lot of emphasis was put on understanding formal
properties of these models. Does a unique solution exist, both for the control variables and
general equilibrium itself? Is there a stationary long-run distribution (of state variables being
driven by optimally chosen control variables) to which initial distributions of states converge?
The literature employing continuous time models only initially put some emphasis on looking
at stability issues (Merton, 1975; Bismut, 1975; Magill, 1977; Brock and Magill, 1979; Chang
and Malliaris, 1987). In recent decades, applications to economic questions have been the main
focus. This does not mean, however, that all formal problems have been solved. In fact, we
argue in this paper that formal work is badly missing for continuous time uncertainty.
[Objectives] The goal of this paper is threefold: First, we introduce methods for analysing
existence and stability of distributions described by stochastic differential equations from the
mathematical literature. The approach to proving the existence and uniqueness of an invariant
distribution and its ergodicity, i.e. of convergence to the said distribution, builds on the work
of Meyn and Tweedie (1993 a,b,c) and Down et al. (1995). Their work is especially useful for
understanding properties of systems driven by jump processes.[2] The methods we use here are
therefore particularly relevant for the search and matching analyses cited above.
Second, we use these methods to analyse stability properties of a precautionary-savings
model where individuals can smooth consumption by accumulating wealth. Any extension
of search and matching models in continuous time that allow for self-insurance (see e.g. Lise,
2013) would display similar stability properties. Individuals have constant relative risk aversion
and an infinite planning horizon. Optimal behaviour implies that the two state variables of
an individual, wealth and employment status, follow a process described by two stochastic
differential equations. We analyse under which conditions an invariant (stationary) distribution
for wealth and employment status exists and is unique, and when the model is stable in the sense
that the distribution converges for any initial distribution to the unique invariant one. We
prove the corresponding theorem.

[1] Christian Bayer: Weierstrass Institute, Mohrenstr. 39, 10117 Berlin, Germany. christian.bayer@wias-berlin.de.
Klaus Wälde: Gutenberg University Mainz, Gutenberg School of Management and Economics,
Jakob-Welder-Weg 4, 55128 Mainz, Germany. [email protected], www.waelde.com. We are grateful to
William Brock, Benjamin Moll, Manuel Santos, John Stachurski and Stephen Turnovsky for comments and
suggestions. Klaus Wälde acknowledges generous financial support from the Gutenberg Research Council.
[2] These methods are also used for understanding how to estimate models that contain jumps (e.g. Bandi and
Nguyen, 2003) or for understanding long-term risk-return trade-offs (Hansen and Scheinkman, 2009).

Our third objective consists in providing some economic interpretations for the determinants
of the distribution of wealth implied by the matching and saving process. To this end, we provide
a tool that embeds the analysis of distributions into a standard mathematical tool - the so-called
Fokker-Planck equations. These equations describe the distributional properties of stochastic
processes in a fairly general but still intuitive way. The advantage of these equations consists in
the fact that one is no longer restricted to specific distributions for which closed-form solutions
can be found. The entire dynamics of distributions is described and not simply distributions
in a “steady-state”. They can also be applied to much more general processes than has been
done so far in the literature. By their nature, all existing distributions must be special cases of
these general equations.[3]
[Findings] One crucial component of our proofs is a smoothing condition. As we allow
for Poisson processes, we have to use more advanced methods based on T-processes than in
the case of a stochastic differential equation driven by a Brownian motion. In the latter case
the strong smoothing properties of Brownian motion can be used to obtain the strong Feller
property. In this sense, the corresponding analysis will often be more straightforward than
the one presented here. For the wealth-employment process of our model, we find that the
wealth process is not smoothing and the strong Feller property does not hold. However, for the
economically relevant parameter case (the low-interest rate regime), we can still show a strong
version of recurrence (namely Harris recurrence) by using a weaker smoothing property, and
thus obtain uniqueness of the invariant distribution. Ergodicity is then implied by properties
of discrete skeleton chains.
Using the Dynkin formula, we compute the Fokker-Planck equations for the wealth-employment
status system, a two-dimensional partial differential equation system. It describes the evolution
of the density of wealth and employment status over time, given some initial condition. When
we are interested in long-run properties only, we can set time derivatives equal to zero in the
Fokker-Planck equations and obtain an ordinary two-dimensional non-autonomous differential
equation system. Boundary conditions can be motivated from our phase diagram analysis.
The big advantage of our example for illustrating the usefulness of Fokker-Planck equations
lies in the generic nature of the resulting stochastic system. There will be one fundamental
equation that describes the flows into and out of employment. Then, there will be one
“dependent” equation that describes the accumulation of wealth. If wealth is replaced by
firm size, human capital, entitlement to benefits or duration in employment or unemployment,
exactly the same structure occurs.
[Table of contents] The structure of our paper is as follows. The next section relates our
analysis to the literature. Section 3 presents the consumption-saving problem and derives the
stochastic differential equations describing the evolution of state variables. Section 4 proves
existence and uniqueness of an invariant measure for the state variables together with convergence
to the long-run invariant distribution. Section 5 provides a more applied approach to
describing the dynamics of distributions by presenting the Fokker-Planck equations. The final
section concludes. Section 4.1 provides some general background on stochastic processes in
continuous time to which we refer in the main text. Appendix A derives the Fokker-Planck
equations step by step.

[3] As Fokker-Planck equations describe densities, this method would allow for structural maximum likelihood
estimation of models that include additional features to those usually captured in labour models (see e.g. van
den Berg, 1990; Postel-Vinay and Robin, 2002; Flinn, 2006 or Launov and Wälde, 2013). In work in progress,
we use the Fokker-Planck equations derived in this paper to understand determinants of the wealth distribution
of a cohort in the US based on the NLSY79.

2 Related literature
To illustrate the usefulness of our methods, we employ a continuous-time Bewley-Huggett-
Aiyagari model of precautionary savings. Our economic background is therefore in the tradition
of Huggett (1993) and Aiyagari (1994). Huggett (1993) analyses an exchange economy with
idiosyncratic risk and incomplete markets. Agents can smooth consumption by holding an asset
and endowment in each period is either high or low, following a stationary Markov process. This
structure is similar in spirit to our setup. Huggett provides existence and uniqueness results for
the value function and the optimal consumption function and shows that there is a unique long-
run distribution function to which initial distributions converge. Regarding stability, he relies
on the results of Hopenhayn and Prescott (1992). An overview of the various directions the
precautionary savings model took is provided by Heathcote et al. (2009). Miao (2006) proves
the existence of a sequential competitive equilibrium in a Bewley-Huggett-Aiyagari model.
More recent analyses include Ortigueira and Siassi (2013) who focus on risk-sharing within a
family in the presence of idiosyncratic risk.
The only fully-developed continuous-time model with precautionary savings we are aware
of is Lise (2013). In addition to our setup, he also allows for on-the-job search. He does
not employ Fokker-Planck equations and abstracts from existence and stability analyses, however.
Scheinkman and Weiss (1986) study a precautionary savings setup with a borrowing
constraint when the interest rate is zero (e.g. for holding cash) in a two-type economy. Lippi
et al. (2015) extend their framework to time-varying money supply.[4] Achdou et al. (2014)
survey continuous-time models in macroeconomics with a focus on partial differential equations,
emphasizing open theoretical ends like the lack of proofs of existence and uniqueness. They
also present a precautionary saving model where uncertainty results from Brownian motion.[5]
The theory we will employ below provides a useful contribution to the economic literature as
the latter, as just presented, focuses on related, but different methods. For one, we treat Markov
processes in continuous time, while references in the macroeconomic literature in the context
of Markov-process stability are mostly related to discrete time.[6] But even in discrete time,
the theory of T-processes of Meyn and Tweedie (a weaker version of strong Feller processes)
seems new in the economics literature. While relying on other results from Meyn and Tweedie
(1993a), Kamihigashi and Stachurski (2012, 2013), for instance, infer stability from order mixing
properties instead.
In the economic continuous-time literature, the starting point is Merton’s (1975) analysis of
the continuous-time stochastic growth model. For the case of a constant saving rate and a Cobb-
Douglas production function, the “steady-state distributions for all economic variables can be
solved for in closed form”. No such closed-form results are available, of course, for the general
case of optimal consumption. Chang and Malliaris (1987) also allow for uncertainty that results
from stochastic population growth as in Merton (1975) and they assume the same exogenous
saving function where savings are a function of the capital stock. They follow a different route,
however, by studying the class of strictly concave production functions (thus including CES
production functions and not restricting their attention to the Cobb-Douglas case). They prove
“existence and uniqueness of the solution to the stochastic Solow equation”. They build their
proof on the so-called reflection principle. More work on growth was undertaken by Brock and
Magill (1979) building on Bismut (1975). Magill (1977) undertakes a local stability analysis
for a many-sector stochastic growth model with Brownian motions using methods going back
to Rishel (1970). All of these models use Brownian motion as their source of uncertainty and
do not allow for Poisson jumps. To the best of our knowledge, little or no work has been
done on these issues since then.

[4] The two-type economy is very useful as it is the simplest economy with a wealth distribution that varies
over time. At the individual level, the decision problem and therefore the wealth distribution for some future
point in time has properties that we describe here by Fokker-Planck equations. The aggregate distributions in
two-type economies and in economies with many types (as here) differ of course.
[5] See also Achdou et al. (2015) for a model similar to ours. One main difference to their approach is our
analysis of the dynamics of the wealth distribution. In particular, we show convergence of the wealth distribution
to the unique stationary distribution.
[6] Continuous time models are treated thoroughly, but under different conditions, in the finance literature. As
an example, Raimondo (2005) proves existence of equilibrium in a model with incomplete and with complete
markets. Anderson and Raimondo (2008) prove dynamic completeness of the equilibrium price process.

The principles behind and the derivation of the Fokker-Planck equation (FPE) for Brownian
motion are treated e.g. in Friedman (1975, ch. 6.5) or Øksendal (1998, ch. 8.1). For our case of
a stochastic differential equation driven by a Markov chain, we use the infinitesimal generator
as presented e.g. in Protter (1995, ex. V.7). From general mathematical theory, we know that
the density satisfies the corresponding FPE $\frac{\partial}{\partial t} p(t,x) = A^{*} p(t,x)$, where $p$ denotes the density
of the process with state variable $x$ at time $t$ and $A^{*}$ is the adjoint operator of the infinitesimal
generator $A$ of this process. We follow this approach in our framework and obtain the FPE for
the law of the employment-wealth process.
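
As a minimal illustration of the relation between a generator and its adjoint, consider a process with only two states, say w and b, which leaves w at rate s and b at rate mu (the notation used for the employment chain in Section 3). Its generator is a 2x2 matrix and the adjoint is simply the transpose, so the forward equation reads dp_w/dt = -s p_w + mu p_b and dp_b/dt = s p_w - mu p_b. The sketch below integrates this system with an explicit Euler scheme; the function name, the step size and the numerical rates are illustrative assumptions and are not taken from the paper.

```python
def forward_equation_two_state(p_w0: float, s: float, mu: float,
                               horizon: float, dt: float = 1e-3):
    """Euler-integrate dp_w/dt = -s*p_w + mu*p_b and dp_b/dt = s*p_w - mu*p_b.

    p_w0 is the initial probability of state w; returns (p_w, p_b) at `horizon`.
    """
    p_w, p_b = p_w0, 1.0 - p_w0
    steps = int(round(horizon / dt))
    for _ in range(steps):
        # Net probability flow into state w over one time step.
        flow = (-s * p_w + mu * p_b) * dt
        p_w, p_b = p_w + flow, p_b - flow
    return p_w, p_b


if __name__ == "__main__":
    # Hypothetical rates: leave w at rate 0.1, leave b at rate 0.5.
    print(forward_equation_two_state(1.0, 0.1, 0.5, horizon=50.0))
```

Setting the time derivatives to zero gives the long-run probabilities mu/(s+mu) and s/(s+mu), which is the finite-state analogue of the stationary version of the FPE discussed in the text.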
In economics, versions of Fokker-Planck equations, equivalently called Kolmogorov forward
equations, have rarely been used or referred to so far. Lo (1988) derives an FPE for a one-dimensional
process. Merton (1975) applies the method to analyse distributional properties of a stochastic
Solow growth model. Bertola and Caballero (1994) study the distribution of capital when
investment is irreversible. Klette and Kortum (2004) employ a method related to FPEs to
derive firm-size distributions. Moscarini (2005) uses them to derive the distribution of the belief
about the quality of a match. Koeniger and Prat (2007) obtain an employment distribution
and Prat (2007) describes the distribution of detrended productivity. Impullitti et al. (2011)
study the firm-size distribution in an international trade context. Stokey (2008) provides a
textbook treatment of Kolmogorov forward and backward equations for Brownian motion.
The main difference in our application consists in its considerable generalization (as we allow
for a system of stochastic differential equations with jumps), in the detailed derivation and in the
explanations linking the derivation to standard methods taught in advanced graduate courses.
The only new tool we require is the Dynkin formula. This approach focusing on the principles
of FPEs in a tractable and accessible way should allow and encourage a much wider use of this
tool for other applications. We would like to move Fokker-Planck equations much more into
the mainstream. In fact, one could argue that Fokker-Planck equations should become a tool
as common as Keynes-Ramsey rules.[7]
By transforming the FPEs from equations describing densities into equations describing
distribution functions, we obtain a description of densities whose intuitive interpretation is
very similar to derivations of less complex distributions as in Burdett and Mortensen (1998)
or Burdett et al. (2011). In addition, however, our equations exhibit new “advection” terms
that capture the shift of the distribution due to the evolution of the additional state variable,
i.e. due to wealth.

[7] We would like to thank Philipp Kircher for having put this so nicely.

3 The model
3.1 The setup and optimal consumption
Consider an individual that maximizes a standard intertemporal utility function,
$U(t) = E_t \int_t^{\infty} e^{-\rho[\tau - t]} u(c(\tau))\, d\tau$, where expectations need to be formed due to the uncertainty of labour
income which in turn makes consumption $c(\tau)$ uncertain. The expectations operator is denoted
$E_t$ and conditions on the current state in $t$. The planning horizon starts in $t$ and is infinite.
The time preference rate $\rho$ is positive. We assume that the instantaneous utility function has
a CRRA structure

$$u(c(\tau)) = \frac{c(\tau)^{1-\sigma} - 1}{1 - \sigma} \qquad (1)$$

with $\sigma \neq 1$. All proofs for the logarithmic case $\sigma = 1$ should work accordingly.
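
The following sketch evaluates the utility function (1) and its marginal utility numerically, and treats the logarithmic case as the limit of (1) when the risk-aversion parameter approaches one. The function names are hypothetical and only serve this illustration.

```python
import math


def crra_utility(c: float, sigma: float) -> float:
    """Instantaneous utility (1): (c**(1 - sigma) - 1) / (1 - sigma), log(c) for sigma = 1."""
    if c <= 0.0:
        raise ValueError("consumption must be positive")
    if math.isclose(sigma, 1.0):
        return math.log(c)  # limiting case of (1) as sigma -> 1
    return (c ** (1.0 - sigma) - 1.0) / (1.0 - sigma)


def crra_marginal_utility(c: float, sigma: float) -> float:
    """Marginal utility u'(c) = c**(-sigma) implied by (1)."""
    if c <= 0.0:
        raise ValueError("consumption must be positive")
    return c ** (-sigma)


if __name__ == "__main__":
    # Utility for sigma slightly above 1 is close to log utility.
    print(crra_utility(2.0, 1.000001), math.log(2.0))
```

Subtracting 1 in the numerator of (1) is what makes the logarithmic case appear as a continuous limit rather than a separate specification.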
Each individual can save in an asset $a$. Allowing optimal consumption to be a function of
state variables, $c(a(t), z(t))$, the optimal evolution of individual wealth is given by

$$da(t) = \{ r a(t) + z(t) - c(a(t), z(t)) \}\, dt. \qquad (2)$$

Wealth $a(t)$ increases (or decreases) per unit of time $dt$ if capital income $r a(t)$ plus labour
income $z(t)$ is larger (or smaller) than optimally chosen consumption $c(a(t), z(t))$. Labour
income $z(t)$ is given by constants $w$ and $b$ [8] and is described by the second constraint of the
household, a stochastic differential equation,

$$dz(t) = \Delta\, dq_{\mu} - \Delta\, dq_s, \qquad \Delta \equiv w - b. \qquad (3)$$

The Poisson process $q_s$ counts how often our individual moves from employment into unemployment.
The arrival rate of this process is given by $s > 0$ when the individual is employed and by
$s = 0$ when the individual is unemployed. The Poisson process related to job finding is denoted
by $q_{\mu}$ with an arrival rate $\mu > 0$ when unemployed and $\mu = 0$ when employed (as there is no
search on the job). It counts how often the individual finds a job. In effect, $z(t)$ is a continuous
time Markov chain with state space $\{w, b\}$, where the transition $w \to b$ happens with rate $s$
and the transition $b \to w$ with rate $\mu$. This description of $z$ will be used in the remainder of
the paper. As usual, the wealth-employment process $(a, z)$ is defined on a probability space
$(\Omega, \mathcal{F}, P)$.
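
The two-state description of $z(t)$ can be made concrete with a short simulation. The sketch below draws exponentially distributed spell lengths (rate s while employed, rate mu while unemployed) and compares the simulated share of time spent in state b with the long-run share s/(s+mu) implied by the two transition rates. The function names, the seed and the numerical rates are assumptions made for this illustration only; wealth and consumption are not modelled here.

```python
import random


def stationary_unemployment_share(s: float, mu: float) -> float:
    """Long-run share of time in state b when w -> b has rate s and b -> w has rate mu."""
    return s / (s + mu)


def simulate_unemployment_share(s: float, mu: float, horizon: float, seed: int = 0) -> float:
    """Simulate z(t) on [0, horizon], starting employed; return the share of time in state b."""
    rng = random.Random(seed)
    t, employed, time_in_b = 0.0, True, 0.0
    while t < horizon:
        rate = s if employed else mu  # only the relevant Poisson process is active
        spell_end = min(t + rng.expovariate(rate), horizon)
        if not employed:
            time_in_b += spell_end - t
        t = spell_end
        employed = not employed  # jump w -> b or b -> w
    return time_in_b / horizon


if __name__ == "__main__":
    # Hypothetical rates: separation rate 0.1, job-finding rate 0.5.
    print(simulate_unemployment_share(0.1, 0.5, horizon=20000.0, seed=1))
    print(stationary_unemployment_share(0.1, 0.5))
```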
We now let the individual maximize her objective function by choosing a consumption path
subject to the budget constraint (2) and the equation for the employment status (3). Optimal
consumption is described by the following generalized Keynes-Ramsey rules which extend the
approach suggested by Wälde (1999) for the case of an uncertain interest rate to our case
of uncertain labour income. We suppress the time argument for readability. Consumption
$c(a_w, w)$ of an employed individual with current wealth $a_w$ follows (see app. B.1)

$$-\frac{u''(c(a_w, w))}{u'(c(a_w, w))}\, dc(a_w, w) = \left\{ r - \rho + s \left[ \frac{u'(c(a_w, b))}{u'(c(a_w, w))} - 1 \right] \right\} dt - \frac{u''(c(a_w, w))}{u'(c(a_w, w))}\, [c(a_w, b) - c(a_w, w)]\, dq_s \qquad (4a)$$

while her wealth evolves according to

$$da_w = [r a_w + w - c(a_w, w)]\, dt. \qquad (4b)$$

[8] In some broader equilibrium perspective, $w$ and $b$ would be endogenous objects. As long as there is only
idiosyncratic risk and income is a deterministic function of time, all of our proofs below would work as well.

Analogously, solving for the optimal consumption of an unemployed individual with current
wealth $a_b$ yields

$$-\frac{u''(c(a_b, b))}{u'(c(a_b, b))}\, dc(a_b, b) = \left\{ r - \rho - \mu \left[ 1 - \frac{u'(c(a_b, w))}{u'(c(a_b, b))} \right] \right\} dt - \frac{u''(c(a_b, b))}{u'(c(a_b, b))}\, [c(a_b, w) - c(a_b, b)]\, dq_{\mu} \qquad (4c)$$

and her wealth follows

$$da_b = [r a_b + b - c(a_b, b)]\, dt. \qquad (4d)$$
Without uncertainty about future labor income, i.e. $s = \mu = dq_s = dq_{\mu} = 0$, the above
Keynes-Ramsey rules reduce to the classical deterministic consumption rule, $-\frac{u''(c)}{u'(c)} \dot{c} = r - \rho$.
The additional $s[\ldots]$ term in (4a) shows that consumption growth is faster under the risk of
a job loss. Note that the expression $[u'(c(a_w, b)) / u'(c(a_w, w)) - 1]$ is positive as consumption
$c(a_w, b)$ of an unemployed worker is smaller than consumption of an employed worker $c(a_w, w)$
(see lem. B.12 for a proof) and marginal utility is decreasing, $u'' < 0$. Similarly, the $\mu[\ldots]$ term
in (4c) shows that consumption growth for unemployed workers is smaller.
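
To make the two correction terms concrete, the rules can be specialised to the CRRA utility (1). The block below is a worked restatement of (4a) and (4c) between jumps (i.e. for $dq_s = dq_{\mu} = 0$), not an additional result; it uses only $u'(c) = c^{-\sigma}$ and $u''(c) = -\sigma c^{-\sigma - 1}$.

```latex
\begin{align*}
-\frac{u''(c)}{u'(c)} &= \frac{\sigma}{c}, \qquad
\frac{u'(c(a,b))}{u'(c(a,w))} = \left( \frac{c(a,w)}{c(a,b)} \right)^{\sigma}, \\
\frac{\dot{c}(a_w,w)}{c(a_w,w)} &= \frac{1}{\sigma}
  \left\{ r - \rho + s \left[ \left( \frac{c(a_w,w)}{c(a_w,b)} \right)^{\sigma} - 1 \right] \right\}, \\
\frac{\dot{c}(a_b,b)}{c(a_b,b)} &= \frac{1}{\sigma}
  \left\{ r - \rho - \mu \left[ 1 - \left( \frac{c(a_b,b)}{c(a_b,w)} \right)^{\sigma} \right] \right\}.
\end{align*}
```

With $c(a,w) > c(a,b)$, the bracket in the second line is positive and the bracket in the third line lies between zero and one, which reproduces the signs discussed in the text.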
As the additional term in (4a) contains the ratio of marginal utility from consumption when
unemployed relative to marginal utility when employed, this suggests that it stands for precautionary
savings (Leland, 1968, Aiyagari, 1994, Huggett and Ospina, 2001).[9] When marginal
utility from consumption under unemployment is much higher than marginal utility from employment,
individuals experience a high drop in consumption when becoming unemployed. If
relative consumption shrinks as wealth rises, i.e. if $\frac{d}{da} \frac{c(a,w)}{c(a,b)} < 0$, reducing this gap and smoothing
consumption is best achieved by fast capital accumulation. This fast capital accumulation
would go hand in hand with fast consumption growth as visible in (4a).
In the case of unemployment, the $\mu[\ldots]$ term in (4c) suggests that the possibility to find a
new job induces unemployed individuals to increase their current consumption level. Relative to
a situation in which unemployment is an absorbing state (once unemployed, always unemployed,
i.e. $\mu = 0$), the prospect of a higher labor income in the future reduces the willingness to give
up today’s consumption. With higher consumption levels, wealth accumulation is lower and
consumption growth is reduced.
The stochastic dq-terms in (4a) and (4c) (tautologically) represent the discrete jumps in
the level of consumption whenever the employment status changes. We will understand more
about these jumps after the phase-diagram analysis below.
For our analysis to follow, we assume that the interest rate is lower than the time-preference
rate, $r < \rho$. For convenience, we also assume that the initial wealth level $a(t)$ is chosen inside
the interval $[-b/r, a_w^{*}]$. The lower bound $-b/r$ is a natural borrowing constraint as discussed
below and the upper bound $a_w^{*}$ is endogenously determined below as well.[10]

3.2 An illustration of consumption and wealth dynamics


The dynamics of consumption and wealth can be illustrated in the wealth-consumption space.
The background for this illustration results from initially focusing on the evolution between
jumps and from eliminating time as the exogenous variable. Computing the derivatives of consumption
with respect to wealth in both states and considering wealth as the exogenous variable,
we obtain a two-dimensional system of non-autonomous ordinary differential equations (ODE).
As wealth is now the argument for these two differential equations, there is no longer a need
to distinguish between wealth of employed and unemployed workers (i.e. between $a_w$ and $a_b$).
The dynamics between jumps therefore follows

$$-\frac{u''(c(a,w))}{u'(c(a,w))} \frac{dc(a,w)}{da} = \frac{r - \rho + s \left[ \frac{u'(c(a,b))}{u'(c(a,w))} - 1 \right]}{r a + w - c(a,w)}, \qquad (5a)$$

$$-\frac{u''(c(a,b))}{u'(c(a,b))} \frac{dc(a,b)}{da} = \frac{r - \rho - \mu \left[ 1 - \frac{u'(c(a,w))}{u'(c(a,b))} \right]}{r a + b - c(a,b)}. \qquad (5b)$$

[9] If the individual knew the points in time where she moves to another state, the Keynes-Ramsey rule would
not display this term. In fact, an explicit solution for the consumption level would be available for any wage
path (see e.g. Wälde, 2012, eq. (5.6.10)).
[10] Our discussion below suggests that wealth will lie within this interval after a finite length of time with
probability one even when initial wealth $a(t)$ lies outside the interval.

With two boundary conditions, this system provides a unique solution for $c(a,w)$ and $c(a,b)$.
Once solved, the effect of a jump is then simply the effect of a jump of consumption from, say,
$c(a,w)$ to $c(a,b)$.
Properties of this system can then be illustrated in the usual way by plotting zero-motion
lines and by plotting the sign of the derivatives into a phase diagram. Following these steps, it
turns out (see app. B.2) that there is an endogenous upper limit $a_w^{*}$ of the wealth distribution
determined by the zero-motion line for consumption. The ratio of consumption at this point is
given by

$$\frac{u'(c(a_w^{*}, b))}{u'(c(a_w^{*}, w))} = 1 + \frac{\rho - r}{s}. \qquad (6)$$
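
For the CRRA utility (1), condition (6) pins down the relative consumption drop at the upper limit of the wealth distribution in closed form, since the ratio of marginal utilities equals the consumption ratio raised to the power sigma. The sketch below evaluates this; the function names and the parameter values in the usage lines are hypothetical and chosen only for illustration.

```python
def tss_marginal_utility_ratio(rho: float, r: float, s: float) -> float:
    """Right-hand side of (6): u'(c(a*, b)) / u'(c(a*, w)) = 1 + (rho - r) / s."""
    if s <= 0.0:
        raise ValueError("the separation rate s must be positive")
    return 1.0 + (rho - r) / s


def tss_consumption_ratio(rho: float, r: float, s: float, sigma: float) -> float:
    """Consumption ratio c(a*, w) / c(a*, b) under CRRA utility with parameter sigma.

    With u'(c) = c**(-sigma), the marginal-utility ratio in (6) equals
    (c(a*, w) / c(a*, b))**sigma, so the consumption ratio is its sigma-th root.
    """
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    return tss_marginal_utility_ratio(rho, r, s) ** (1.0 / sigma)


if __name__ == "__main__":
    # Hypothetical values: rho = 0.05, r = 0.03, s = 0.02, sigma = 2.
    print(tss_consumption_ratio(0.05, 0.03, 0.02, 2.0))
```

The ratio exceeds one whenever the interest rate is below the time preference rate, and it shrinks towards one as the two rates approach each other.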
Together with an endogenous natural borrowing limit of $a \geq -b/r$ (see app. B.3), this allows
us to plot a phase diagram as in fig. 1.[11] This figure displays wealth on the horizontal and
consumption $c(a, z)$ on the vertical axis. It plots dashed zero-motion lines for $a_w$ and $c(a, w)$
and a solid zero-motion line for $a_b$ following from (4b), (5a) and (4d), respectively. We assume
for this figure that the threshold level $a_w^{*}$ is positive.[12] The intersection point of the zero-motion
lines for $c(a, w)$ and $a_w$ is the temporary steady state (TSS),

$$(a_w^{*}, c(a_w^{*}, w)). \qquad (7)$$

We call this point temporary steady state for two reasons. On the one hand, employed
workers experience no change in wealth, consumption or any other variable when at this point
(as in a standard steady state of a deterministic system). On the other hand, the expected
spell in employment is finite and a random transition into unemployment will eventually occur.
Hence, the state is steady only temporarily.
As we know from the proposition in app. B.2 that consumption for the unemployed always
falls, both consumption and wealth fall above the zero-motion line for $a_b$. The arrow-pairs for
the employed workers are also added. They show that one can draw a saddle-path through the
TSS. To the left of the TSS, wealth and consumption of employed workers rise, to the right,
they fall.
Relative consumption when the employed worker is in the TSS is given by (6). A trajectory
going through $(a_w^{*}, c(a_w^{*}, b))$ and hitting the zero-motion line of $a_b$ at $-b/r$ is in accordance
with the laws of motion for the unemployed worker.

[11] App. B proves various properties of our system used for plotting this phase diagram under a mild technical
condition. A proof of the existence of an optimal consumption path is in app. C.
[12] This is of course a quantitative issue. In ongoing numerical work, the threshold is positive for reasonable
parameter values. It approaches infinity for $r$ approaching $\rho$.

Figure 1 Policy functions for employed and unemployed workers

For our assumption of an interest rate being lower than the time preference rate, $r < \rho$,
the range of wealth a worker can hold is bounded. Whatever the initial wealth level, there is a
positive probability that the wealth level will be in the range $[-b/r, a_w^{*}]$ after some finite length
of time. For an illustration, consider the policy functions in fig. 1: Wealth decreases both for
employed and unemployed workers for $a > a_w^{*}$. The transition into the range $[-b/r, a_w^{*}]$ will
take place only in the state of unemployment which, however, occurs with positive probability.
When wealth of an individual is within the range $[-b/r, a_w^{*}]$, consumption and wealth will
rise while employed and fall while unemployed. While employed, precautionary saving motives
drive the worker to accumulate wealth. While unemployed, the worker runs down current
wealth as higher income for the future is anticipated: “postcautionary dissaving” takes place.
When a worker loses a job at a wealth level of, say, $a_w^{*}/2$, his consumption level will drop
from $c(a_w^{*}/2, w)$ to $c(a_w^{*}/2, b)$. Conversely, if an unemployed worker finds a job at, say, $a = 0$,
her consumption increases from $c(0, b)$ to $c(0, w)$. A worker will therefore be in a permanent
consumption and wealth cycle. Given these dynamics, wealth will never leave the interval
$[-b/r, a_w^{*}]$ and one can easily imagine a distribution of wealth over the range $[-b/r, a_w^{*}]$.

4 Stability of the wealth-employment process


We would now like to formally understand the stability properties of the model just presented.
As the fundamental state variables are wealth (2) and the employment status (3) of an individual,
the process we are interested in is the wealth-employment process $X_{\tau} \equiv (a(\tau), z(\tau))$. All
other variables (like control variables or e.g. factor rewards in a general equilibrium version)
are known deterministic functions of the state variables. Hence, if we understand the process
governing the state variables, we also understand the properties of all other variables in this
model. The state space of this process $X$ is $\mathsf{X} \equiv [-b/r, a_w^{*}] \times \{w, b\}$ and has all the properties
required for the state space in the general ergodicity theory for Markov processes, which we
review in section 4.1 below. Moreover, for the sake of simplicity, we now set the initial time
$t = 0$, following the usual practice in the mathematical literature.
The goal of this section is a proof of stability of the Markov process $X$ in the sense that we
want to show that the distribution of $X_{\tau}$ converges for $\tau \to \infty$ to a unique limiting distribution
(no matter what the initial value $X_0$). (See def. 4.10 for the precise meaning of that statement.)
The general structure of the stability or ergodicity proof is quite usual:

First we prove existence of an invariant probability measure, i.e., of a distribution on
the state space such that the process is stationary when started with this distribution, i.e.,
when $X_0$ is drawn from it. Hence, the first step is looking for candidates for the limiting distribution,
if it exists. (Note that we here use “probability measure” and “distribution” essentially as
synonyms.) As our state space is already compact, existence will follow from a continuity
condition on the paths of $X$, more precisely the weak Feller property, cf. def. 4.5 below.
We review the theoretical underpinnings in section 4.1.2 and carry out the proofs for our
model in section 4.2.

Then we prove uniqueness of such invariant probability measures. Technically, the usual
techniques actually only provide uniqueness of invariant measures (which may well be
infinite if no invariant probability measure exists), but the combination with the first
step, of course, gives existence and uniqueness of the invariant distribution. As in the
case of Markov chains, uniqueness follows from irreducibility (def. 4.1) and recurrence
(def. 4.3) of the process $X$. Proving the latter property requires us to have some smoothing
properties of $X$, which are often easy to verify in a diffusion setting, but not so clear in
a pure jump setting such as ours. We critically rely on the notion of T-processes defined in
def. 4.8. Verifying that our wealth-employment process $X$ is a T-process is the main task
of section 4.3.

The unique invariant distribution identified in the last step is the natural candidate for
the limiting distribution, so we only have to prove convergence in the third step. This is
done in section 4.4. Note that we are using the notion of convergence in the total variation
sense as compared to the more usual (and weaker) convergence in distribution.

We now continue with an overview of ergodicity theory for Markov processes in continu-
ous time with continuous state spaces. All the results in section 4.1 are well known in the
mathematical literature and, hence, the reader only interested in the new results might directly
proceed with section 4.2.

4.1 Review of ergodicity results for continuous time Markov processes


The wealth-employment process $(a(\tau), z(\tau))$ described by (2) and (3) is a continuous-time
Markov process with a non-discrete state space $[-b/r, a_w^{*}] \times \{w, b\}$. Thus, we will rely on
results from the general stability theory of Markov processes as presented in the works of Meyn
and Tweedie and their coauthors cited above. In the present section, we will recapitulate the
most important elements of the stability theory for Markov processes in continuous time. Here, we
will discuss the theory in full generality, i.e., we assume that we are given a Markov process
$(X_t)_{t \in \mathbb{R}_{\geq 0}}$ on a state space $\mathsf{X}$, which is assumed to be a locally compact separable metric space
endowed with its Borel $\sigma$-algebra. All Markov processes are assumed to be time-homogeneous,
i.e., the conditional distribution of $X_{t+s}$ given $X_t = x$ only depends on $s$, not on $t$.

4.1.1 Preliminaries
Let $(X_t)_{t \in \mathbb{R}_{\ge 0}}$ be a (homogeneous) Markov process with state space $X$, where $X$ is assumed
to be a locally compact and separable metric space endowed with its Borel $\sigma$-algebra
$\mathcal{B}(X)$. Let $P^t(x, A)$, $t \ge 0$, $x \in X$, $A \in \mathcal{B}(X)$, denote the corresponding transition kernel, i.e.

\[ P^t(x, A) \equiv P(X_t \in A \mid X_0 = x) \equiv P_x(X_t \in A), \tag{8} \]

where $P_x$ is shorthand for the conditional probability $P(\cdot \mid X_0 = x)$. Note that $P^t(\cdot, \cdot)$
is a Markov kernel, i.e., for every $x \in X$, the map $A \mapsto P^t(x, A)$ is a probability measure on
$\mathcal{B}(X)$ and, for every $A \in \mathcal{B}(X)$, the map $x \mapsto P^t(x, A)$ is a measurable function. Similarly, by
a kernel we understand a function $K : X \times \mathcal{B}(X) \to \mathbb{R}_{\ge 0}$ such that $K(x, \cdot)$ is a measure, not
necessarily normalized to 1, for every $x$, and $K(\cdot, A)$ is a measurable function for every measurable
set $A$. Moreover, let us denote the corresponding semi-group by $P_t$, i.e.

\[ P_t f(x) \equiv E(f(X_t) \mid X_0 = x) = \int_X f(y) \, P^t(x, dy) \tag{9} \]

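for $f : X \to \mathbb{R}$ bounded measurable. For a measurable set $A$, we consider the stopping time $\tau_A$
and the number of visits $\eta_A$ of $X$ in the set $A$,

\[ \tau_A \equiv \inf\{t \ge 0 \mid X_t \in A\}, \qquad \eta_A \equiv \int_0^\infty 1_A(X_t) \, dt. \]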

Definition 4.1 Assume that there is a $\sigma$-finite, non-trivial measure $\varphi$ on $\mathcal{B}(X)$ such that, for
sets $B \in \mathcal{B}(X)$, $\varphi(B) > 0$ implies $E_x(\eta_B) > 0$, $\forall x \in X$. Here, similar to $P_x$, $E_x$ is shorthand
notation for the conditional expectation $E(\cdot \mid X_0 = x)$. Then $X$ is called $\varphi$-irreducible.

In the more familiar case of a finite state space and discrete time, we would simply require
$\eta_{\{x\}}$ to have positive expectation for any state $x$. In the continuous case, such a requirement
would obviously be far too strong, since singletons $\{x\}$ usually have probability zero. The above
definition only requires positive expectation for sets $B$ which are “large enough”, in the sense
that they are non-null for some reference measure.
A simple sufficient condition for irreducibility is given in Meyn and Tweedie (1993b, prop. 2.1),
which will be used to show irreducibility of the wealth-employment process.

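Proposition 4.2 Suppose that there exists a $\sigma$-finite measure $\nu$ such that $\nu(B) > 0$ implies
that $P_x(\tau_B < \infty) > 0$ for every $x \in X$. Then $X$ is $\varphi$-irreducible, where

\[ \varphi(A) \equiv \int_X R(x, A) \, \nu(dx), \qquad R(x, A) \equiv \int_0^\infty P^t(x, A) \, e^{-t} \, dt. \]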

Definition 4.3 The process $X$ is called Harris recurrent if there is a non-trivial $\sigma$-finite
measure $\varphi$ such that $\varphi(A) > 0$ implies that $P_x(\eta_A = \infty) = 1$, $\forall x \in X$. Moreover, if a Harris
recurrent process $X$ has an invariant probability measure, then it is called positive Harris.

As in the discrete case, Harris recurrence may be equivalently defined by the existence of
a $\sigma$-finite measure $\nu$ such that $\nu(A) > 0$ implies that $P_x(\tau_A < \infty) = 1$. As already remarked in
the context of irreducibility, in the discrete framework one would consider sets $A = \{y\}$ with
only one element.
Let $\nu$ be a measure on $(X, \mathcal{B}(X))$. We define a measure $P^t \nu$ by

\[ P^t \nu(A) = \int_X P^t(x, A) \, \nu(dx). \]

We say that $\nu$ is an invariant measure iff $P^t \nu = \nu$ for all $t$. Here, the measure $\nu$ might be
infinite. If it is a finite measure, we may, without loss of generality, normalize it to have total
mass $\nu(X) = 1$. The resulting probability measure is obviously still invariant, and we call it
an invariant distribution. (Note that any constant multiple of an invariant measure is again
invariant.) In the case of an invariant distribution, we can interpret invariance as meaning that
the Markov process always has the same marginal distribution over time when started with the
distribution $\nu$.
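The invariance condition can be checked by hand on a small example. The following minimal Python sketch assumes a two-state chain on {w, b} with constant rates `s` (from w to b) and `mu` (from b to w) and transition probabilities of the form (19)-(20); the rate values and all function names are illustrative assumptions, not taken from the paper. It pushes a candidate distribution through the transition kernel for several horizons and compares it with a distribution that is not invariant.

```python
import math

def transition_matrix(t, mu, s):
    """Return P^t for a two-state chain on (w, b).

    Assumption: w -> b at constant rate s and b -> w at constant rate mu.
    """
    lam = mu + s
    decay = math.exp(-lam * t)
    p_ww = mu / lam + s / lam * decay
    p_bw = mu / lam - mu / lam * decay
    return [[p_ww, 1.0 - p_ww],
            [p_bw, 1.0 - p_bw]]

def push_forward(measure, kernel):
    """Compute (P^t nu)(y) = sum_x nu(x) P^t(x, y) on a finite state space."""
    n = len(measure)
    return [sum(measure[x] * kernel[x][y] for x in range(n)) for y in range(n)]

mu, s = 0.5, 0.1                      # illustrative rates, not from the paper
candidate = [mu / (mu + s), s / (mu + s)]
other = [0.5, 0.5]                    # a starting distribution that is not invariant

for t in (0.1, 1.0, 10.0):
    kernel = transition_matrix(t, mu, s)
    # The candidate is reproduced for every t; the other distribution drifts.
    print(t, push_forward(candidate, kernel), push_forward(other, kernel))
```

In this toy setting the candidate is left unchanged for every horizon, which is exactly the requirement that the marginal distribution stays the same over time.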

4.1.2 Existence of an invariant probability measure
The existence of finite invariant measures follows from a combination of two different types of
conditions. The first is a growth property. Several such properties have been used in
the literature; a particularly useful one is boundedness in probability on average.

Definition 4.4 The process $X$ is called bounded in probability on average if for every $x \in X$
and every $\varepsilon > 0$ there is a compact set $C \subset X$ such that

\[ \liminf_{t \to \infty} \frac{1}{t} \int_0^t P_x(X_s \in C) \, ds \ge 1 - \varepsilon. \tag{10} \]

The second property is a continuity condition.

Definition 4.5 The Markov process $X$ has the weak Feller property if for every continuous
bounded function $f : X \to \mathbb{R}$ the function $P_t f : X \to \mathbb{R}$ from (9) is again continuous. Moreover,
if $P_t f$ is continuous even for every bounded measurable function $f$, then $X$ has the strong Feller
property.

Given these two conditions, Meyn and Tweedie (1993b, th. 3.1) establish the existence of
an invariant probability measure in the following

Proposition 4.6 If a Markov process X is bounded in probability on average and has the weak
Feller property, then there is an invariant probability measure for X.

4.1.3 Uniqueness
Turning to uniqueness, the following proposition is cited in Meyn and Tweedie (1993b, page
491). For a proof see Azéma, Duflo and Revuz (1969, Théorème 2.5).

Proposition 4.7 If the Markov process $X$ is Harris recurrent and irreducible for a non-trivial
$\sigma$-finite measure $\varphi$, then there is a unique invariant measure (up to constant multiples).

Proposition 4.7 gives existence and uniqueness of the invariant measure. A simple example
shows that irreducibility and Harris recurrence do not guarantee existence of an invariant
probability measure: Let $X = \mathbb{R}$ and let $X_t = B_t$ denote one-dimensional Brownian motion.
Brownian motion is both irreducible and Harris recurrent: irreducibility is easily seen,
while recurrence is classical in dimension one. Therefore, there is a unique invariant measure.
By the Fokker-Planck equation, the density $f$ of the invariant measure must satisfy $\Delta f = 0$,
i.e., $f'' = 0$ in one dimension, so that $f$ is affine. By non-negativity on all of $\mathbb{R}$, this implies
that $f$ is constant, $f \equiv c$ for some $c > 0$. Thus, any invariant
measure is a constant multiple of the Lebesgue measure, and there is no invariant probability
measure for this example.
Given this example, and as we are only interested in invariant probability measures, we need
to combine this proposition with the previous section: boundedness in probability on average
together with the weak Feller property gives us the existence of an invariant probability measure
(sect. 4.1.2), whereas irreducibility together with Harris recurrence implies uniqueness
of invariant measures. Thus, for existence and uniqueness of the invariant probability measure,
we will need all four conditions.
Whereas irreducibility, boundedness in probability on average and the weak Feller property
are rather straightforward to check in practical situations, Harris recurrence seems to be
harder. Thus, we next discuss some sufficient conditions for Harris recurrence. If the Markov
process has the strong Feller property, then Harris recurrence follows from a very weak
growth property, namely that $P_x(X_t \to \infty) = 0$ for all $x \in X$, see Meyn and Tweedie (1993b,
th. 3.2). While the strong Feller property is often satisfied for models driven by Brownian
motion (e.g., for hypo-elliptic diffusions), it may not be satisfied in models where randomness
is driven by a pure-jump process. Thus, we next formulate an intermediate notion between
the weak and strong Feller properties, which still guarantees enough smoothing for stability.

Definition 4.8 The Markov process $X$ is called a T-process if there is a probability measure $\alpha$
on $\mathbb{R}_{\ge 0}$ and a kernel $T$ on $(X, \mathcal{B}(X))$ satisfying the following three conditions:

1. For every $A \in \mathcal{B}(X)$, the function $x \mapsto T(x, A)$ is continuous.$^{13}$

2. For every $x \in X$ and every $A \in \mathcal{B}(X)$ we have $K_\alpha(x, A) \equiv \int_0^\infty P^t(x, A) \, \alpha(dt) \ge T(x, A)$.

3. $T(x, X) > 0$ for every $x \in X$.

The kernel $K_\alpha$ is the transition kernel of a discrete-time Markov process $(Y_n)_{n \in \mathbb{N}}$ obtained
from $(X_t)_{t \ge 0}$ by random sampling according to the distribution $\alpha$: more precisely, let us draw a
sequence $\sigma_n$ of independent samples from the distribution $\alpha$ and define a discrete-time process
$Y_n \equiv X_{\sigma_1 + \cdots + \sigma_n}$, $n \in \mathbb{N}$. Then the process $Y_n$ is Markov and has transition probabilities given
by $K_\alpha$. Using this definition and theorem 3.2 in Meyn and Tweedie (1993b), we can formulate

Proposition 4.9 Suppose that $X$ is a $\varphi$-irreducible T-process. Then it is Harris recurrent
(with respect to $\varphi$) if and only if $P_x(X_t \to \infty) = 0$ for every $x \in X$.

Hence, in a practical sense and in order to prove existence of a unique invariant probability
measure, one needs to establish that a process $X$ has the weak Feller property and is an
irreducible T-process which is bounded in probability on average (as the latter implies the
growth condition $P_x(X_t \to \infty) = 0$ of prop. 4.9).
Let us briefly compare the continuous but compact case, where boundedness in probability
is always satisfied, with the discrete case. In the latter situation, existence of an invariant
distribution always holds, while uniqueness is then given by irreducibility. In the compact,
continuous case irreducibility and Harris recurrence only guarantee existence and uniqueness of
an invariant measure, which might be infinite. On the other hand, existence of a finite invariant
measure is given by the weak Feller property. Thus, for existence and uniqueness of an invariant
probability measure, we will need the weak Feller property, irreducibility and Harris recurrence,
which we will conclude from the T-property. Thus, the situation in the continuous (but
compact) case is roughly the same as in the discrete case, except for some required continuity
property, namely the weak Feller property.
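The sampled kernel of def. 4.8 can be made concrete in a toy setting. The minimal Python sketch below assumes a two-state chain on {w, b} with constant rates `s` and `mu` and a unit-rate exponential sampling distribution, so that the sampled kernel coincides with the kernel R of prop. 4.2; the rates, the truncation horizon and the function names are illustrative assumptions. It approximates the integral of P^t(w, {w}) against the sampling distribution by a midpoint rule and compares it with the closed form obtained by integrating term by term.

```python
import math

def p_ww(t, mu, s):
    """P^t(w, {w}) for a two-state chain: w -> b at rate s, b -> w at rate mu."""
    lam = mu + s
    return mu / lam + s / lam * math.exp(-lam * t)

def sampled_kernel_ww(mu, s, horizon=60.0, steps=60000):
    """Midpoint-rule approximation of int_0^inf P^t(w, {w}) e^{-t} dt.

    The integral is truncated at `horizon`; the neglected tail is below e^{-horizon}.
    """
    h = horizon / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += p_ww(t, mu, s) * math.exp(-t) * h
    return total

mu, s = 0.5, 0.1                       # illustrative rates, not from the paper
lam = mu + s
# Term-by-term integration: int e^{-t} dt = 1 and int e^{-(1 + lam) t} dt = 1 / (1 + lam).
closed_form = mu / lam + (s / lam) / (1.0 + lam)
numeric = sampled_kernel_ww(mu, s)
print(closed_form, numeric)
```

On a finite state space with the discrete topology every function is continuous, so in this toy example the sampled kernel itself could serve as the continuous component. The difficulty addressed in the paper arises only from the continuous wealth dimension.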

4.1.4 Stability
By now we have established a framework for showing existence and uniqueness of an invariant
distribution, i.e., probability measure. However, by stability we mean more, namely
the convergence of the marginal distributions to the invariant distribution, i.e., that for any
starting distribution $\nu$, the law $P^\tau \nu$ of the Markov process at time $\tau$ converges to the unique
invariant distribution $\pi$ for $\tau \to \infty$. In the context of T-processes, we are going to discuss two
methods which allow us to derive stability. But first, let us define the notion of stability more
precisely.
$^{13}$ A more general definition requires lower semi-continuity only. As we can show continuity for
our applications, we do not need this more general version here.

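Definition 4.10 For a signed measure $\nu$ consider the total variation norm

\[ \|\nu\| \equiv \sup_{|f| \le 1} \left| \int_X f(x) \, \nu(dx) \right|. \]

Then we call a Markov process $(X_t)_{t \in \mathbb{R}_{\ge 0}}$ stable or ergodic iff there is an invariant probability
measure $\pi$ such that

\[ \forall x \in X : \quad \lim_{t \to \infty} \| P^t(x, \cdot) - \pi \| = 0. \]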

Note that this implies in particular that the law $P^t \nu$ of the Markov process converges to $\pi$
for any starting distribution $\nu$, and that $\pi$ is the unique invariant probability measure.
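Convergence in total variation can be observed directly in a toy setting. The minimal Python sketch below again assumes a two-state chain on {w, b} with constant rates `s` and `mu` (values illustrative, function names hypothetical). On a finite state space the supremum over |f| <= 1 in the total variation norm is attained at the sign of the signed measure, so the norm reduces to a sum of absolute values and ranges up to 2.

```python
import math

def law_at(t, start, mu, s):
    """Distribution (P(w), P(b)) of the state at time t when started in `start`."""
    lam = mu + s
    decay = math.exp(-lam * t)
    if start == "w":
        p_w = mu / lam + s / lam * decay
    else:
        p_w = mu / lam - mu / lam * decay
    return (p_w, 1.0 - p_w)

def total_variation_norm(signed):
    """sup over |f| <= 1 of |sum_x f(x) signed(x)|, attained at f = sign(signed)."""
    return sum(abs(v) for v in signed)

mu, s = 0.5, 0.1                       # illustrative rates, not from the paper
invariant = (mu / (mu + s), s / (mu + s))
for t in (0.0, 1.0, 5.0, 20.0):
    for start in ("w", "b"):
        law = law_at(t, start, mu, s)
        dist = total_variation_norm([p - q for p, q in zip(law, invariant)])
        print(t, start, dist)
```

In this example the distance decays exponentially at rate `mu + s` from either starting state, which is the behaviour required of a stable process.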
In the case of a finite state space in discrete time, ergodicity follows (inter alia) from
aperiodicity. Down, Meyn and Tweedie (1995) also give a result in this direction for the
continuous case.

Definition 4.11 A $\psi$-irreducible Markov process $(X_t)$ is called aperiodic iff there is a measurable
set $C$ with $\psi(C) > 0$ satisfying the following properties:

1. there is $\Delta > 0$ and a non-trivial measure $\nu$ on $\mathcal{B}(X)$ such that

\[ \forall x \in C, \ \forall A \in \mathcal{B}(X) : \quad P^\Delta(x, A) \ge \nu(A);^{14} \]

2. there is $T > 0$ such that

\[ \forall t \ge T, \ \forall x \in C : \quad P^t(x, C) > 0. \]

If we are given an irreducible, aperiodic Markov process, then stability is implied by
conditions on the infinitesimal generator. In the following proposition we give a special case of
Down, Meyn and Tweedie (1995, th. 5.2) suitable for the employment-wealth process in our
model.

Proposition 4.12 Let $X_t$ be an irreducible, aperiodic T-process with infinitesimal generator $\mathcal{A}$
on a compact state space. Assume we can find a measurable function $V \in D(\mathcal{A})$ with $V \ge 1$
and constants $d, c > 0$ such that

\[ \mathcal{A} V \le -cV + d. \]

Then the Markov process is ergodic.

The problem with aperiodicity in the continuous-time framework is that it seems hard to
characterize the small sets appearing in def. 4.11. For this reason, we also give an alternative
theorem, which avoids small sets (but is clearly related to the notion of aperiodicity). Given
a fixed $\Delta > 0$, the process $Y_n \equiv X_{n\Delta}$, $n \in \mathbb{N}$, clearly defines a Markov process in discrete time,
a so-called skeleton of $X$. These skeleton chains are a very useful construction for transferring
results from Markov processes in discrete time to continuous time. In particular, Meyn and
Tweedie (1993b, th. 6.1) give a characterization of stability in terms of irreducibility of skeleton
chains.

Proposition 4.13 Let $X$ be a Harris recurrent Markov process with invariant probability
measure $\pi$. Then $X$ is stable iff there is some irreducible skeleton chain.
$^{14}$ Such a set $C$ is then called small.

4.2 Existence
After the review of the general ergodicity theory, we now come back and implement the scheme
for our particular model. Hence, from now on we again work with the two-dimensional Markov
process $X(\tau) = (a(\tau), z(\tau))$. As seen above, in order to show existence of an invariant
probability measure for $X$, we need (i) some compactness result for $X$ like boundedness in probability
on average recalled in def. 4.4 and (ii) a continuity property like the weak Feller property,
see prop. 4.6. Showing that $X$ is bounded in probability on average is straightforward:
according to def. 4.4 we need to find a compact set for any initial condition $x$ and any small
number $\varepsilon$ such that the average probability of being in this set is at least $1 - \varepsilon$. As our process
$X \equiv (a(\tau), z(\tau))$ is bounded, we can choose the state space $X \equiv [-b/r, a_w] \times \{w, b\}$ as our
set for any $x$ and $\varepsilon$. Concerning the weak Feller property, we offer the following
Lemma 4.14 The wealth-employment process has the weak Feller property.
Proof. Let us first show that the wealth-employment process depends continuously on
its initial values. To see this, fix some $\omega \in \Omega$, the probability space on which the
wealth-employment process is defined. Notice that $z_\tau(\omega)$ is certainly continuous in the starting values,
because any function defined on $\{w, b\}$ is continuous by our choice of topology. Thus, we
only need to consider the wealth process. For fixed $\omega$, $a_\tau(\omega)$ is a composition of solutions
to deterministic ODEs, each of which is a continuous function of the respective initial value.
Therefore, $a_\tau(\omega)$ is a continuous function of the initial wealth.
Now assume, without loss of generality, that the wealth-employment process has a deterministic
initial value $(a_0, z_0)$ and fix some bounded, continuous function $f : [-b/r, a_w] \times \{w, b\} \to \mathbb{R}$.
For the weak Feller property, we need to show that
\[ P_\tau f(a_0, z_0) = E(f(a_\tau, z_\tau)) \]
is a continuous function of $(a_0, z_0)$. Thus, take any sequence $(a_0^n, z_0^n)$ converging to $(a_0, z_0)$ and
denote the wealth-employment process started at $(a_0^n, z_0^n)$ by $(a_\tau^n, z_\tau^n)$. Then, by continuous
dependence on the initial value, $(a_\tau^n(\omega), z_\tau^n(\omega)) \to (a_\tau(\omega), z_\tau(\omega))$ for every $\omega \in \Omega$. By
continuity of $f$, this implies convergence of $f(a_\tau^n(\omega), z_\tau^n(\omega))$. Since $f$ is bounded, we may conclude
convergence $P_\tau f(a_0^n, z_0^n) \to P_\tau f(a_0, z_0)$ by the dominated convergence theorem. Thus, $P_\tau f$ is,
indeed, bounded and continuous whenever $f$ is bounded and continuous, and the weak Feller
property holds.

4.3 Uniqueness
Given existence of an invariant distribution, uniqueness will follow from (Harris) recurrence
together with irreducibility of the process X. The details are spelled out in section 4.1.3, in
particular in prop. 4.7.

4.3.1 Irreducibility
We prove irreducibility in the following
Lemma 4.15 In the low-interest-regime with $r < \rho$, $(a(\tau), z(\tau))$ is an irreducible Markov
process, with the non-trivial irreducibility measure $\varphi$ introduced in prop. 4.2.
Proof. Let $-b/r < a < a_w$, $z \in \{w, b\}$. Then, regardless of the initial point $a_t \in [-b/r, a_w]$
and regardless of $z_t$, it is possible to attain the state $(a, z)$ in finite time with probability greater
than zero. Thus, prop. 4.2 implies irreducibility with irreducibility measure

\[ \varphi(A) \equiv \int_X R(x, A) \, \nu(dx), \qquad R(x, A) \equiv \int_0^\infty P^t(x, A) \, e^{-t} \, dt, \]

where we can take the Lebesgue measure on $[-b/r, a_w]$ times the counting measure on $\{w, b\}$
as the measure $\nu$.

4.3.2 Harris recurrence


The proof of Harris recurrence is more elaborate and builds on some auxiliary results, most
importantly on the T-process property (compare Definition 4.8), which will be proved in
Theorem 4.17 below. We start with an auxiliary result on the distribution of jumps in the
employment status.

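Lemma 4.16 The conditional density of the time of the first jump in employment given that
there is precisely one such jump in $[0, \tau]$ and that $z(0) = w$ is given by

\[ g^{(1)}(u) = \begin{cases} \dfrac{\mu - s}{e^{(\mu - s)\tau} - 1} \, e^{(\mu - s)u}, & 0 \le u \le \tau, \ \mu \ne s, \\[1ex] 1/\tau, & 0 \le u \le \tau, \ \mu = s. \end{cases} \]

Proof. Since the formula is well-known for $\mu = s$, we only prove the result for $\mu \ne s$. The
joint probability of the first jump time $\tau_1 \le u$ and $N_\tau = 1$, where $N_\tau$ denotes the number of
jumps in $[0, \tau]$, is given by

\begin{align*}
P(\tau_1 \le u, N_\tau = 1) &= P(\tau_1 \le u, \tau_2 \ge \tau - \tau_1) = \int_0^u P(\tau_2 \ge \tau - v) \, s e^{-sv} \, dv \\
&= \int_0^u e^{-\mu(\tau - v)} \, s e^{-sv} \, dv = \frac{s}{\mu - s} \, e^{-\mu\tau} \left( e^{(\mu - s)u} - 1 \right).
\end{align*}

Here, $\tau_2$ denotes the time between the first and the second jump, and we have used independence
of $\tau_1$ and $\tau_2$. Dividing by the probability of $N_\tau = 1$, we get

\[ P(\tau_1 \le u \mid N_\tau = 1) = \frac{e^{(\mu - s)u} - 1}{e^{(\mu - s)\tau} - 1}, \]

and we obtain the above density by differentiating with respect to $u$.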
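The density of Lemma 4.16 lends itself to a quick numerical sanity check. The minimal Python sketch below assumes that the process starts employed, leaves w at rate `s` and leaves b at rate `mu`; the parameter values, the seed and all function names are illustrative assumptions. It verifies that the density integrates to one and compares the implied mean of the first jump time with a simulated mean over paths that have exactly one jump in [0, tau].

```python
import math
import random

def first_jump_density(u, mu, s, tau):
    """Conditional density of the first jump time given exactly one jump in [0, tau].

    Assumption: start in w, leave w at rate s, leave b at rate mu.
    """
    if not 0.0 <= u <= tau:
        return 0.0
    k = mu - s
    if abs(k) < 1e-12:
        return 1.0 / tau               # uniform case mu = s
    return k * math.exp(k * u) / math.expm1(k * tau)

def midpoint_integral(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def simulated_mean_first_jump(mu, s, tau, draws, seed=0):
    """Average first jump time over simulated paths with exactly one jump in [0, tau]."""
    rng = random.Random(seed)
    kept = []
    for _ in range(draws):
        t1 = rng.expovariate(s)        # time of the first jump (w -> b)
        t2 = rng.expovariate(mu)       # waiting time until the second jump (b -> w)
        if t1 <= tau < t1 + t2:        # exactly one jump before tau
            kept.append(t1)
    return sum(kept) / len(kept)

mu, s, tau = 0.5, 1.0, 1.0             # illustrative values, not from the paper
mass = midpoint_integral(lambda u: first_jump_density(u, mu, s, tau), 0.0, tau)
mean_formula = midpoint_integral(lambda u: u * first_jump_density(u, mu, s, tau), 0.0, tau)
mean_simulated = simulated_mean_first_jump(mu, s, tau, draws=50000)
print(mass, mean_formula, mean_simulated)
```

The simulated mean is subject to Monte Carlo error, so only approximate agreement with the quadrature value should be expected.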
Before starting the somewhat elaborate proof of the T-property, let us briefly discuss why
the conventional route to uniqueness of invariant measures is not open to us. As discussed in
section 4.1, uniqueness of the invariant distribution of a Markov process is implied by smoothing
properties of the process, and this approach is usually employed in the literature on
continuous-time models. However, the wealth-employment process $(a, z)$ does not satisfy the strong Feller
property (see def. 4.5). Indeed, assume that $f : [a_b, a_w] \times \{w, b\} \to \mathbb{R}$ is bounded measurable,
but not continuous. For the sake of concreteness, let us assume that $f$ has a jump at some
point $a_b < a_0 < a_w$. If there is no jump in the employment status until time $\tau$ (an event with
positive probability), then the trajectory of the wealth process $a$ is deterministic until time $\tau$
and $z$ is even constant. Hence, on this event the jump cannot be smeared out.
On the other hand, the distribution of the jump times has a smooth density. If there is at
least one jump until time $\tau$, we therefore expect the discontinuity of $f$ to be smeared out due
to the density of the jump times. If both these heuristics are true, then

(i) the wealth-employment process is not strong Feller, as

\[ P_\tau f(a_0, z_0) = E[f(a_\tau, z_\tau)] = \underbrace{E[f(a_\tau, z_\tau) 1_{N_\tau = 0}]}_{\text{discontinuous in } (a_0, z_0)} + \underbrace{E[f(a_\tau, z_\tau) 1_{N_\tau > 0}]}_{\text{continuous in } (a_0, z_0)} \]

is discontinuous in $(a_0, z_0)$, where $N_\tau$ denotes the number of jumps in the employment
status;

(ii) the wealth-employment process conditioned on the number of jumps being greater than zero
should satisfy the strong Feller condition. Hence, the kernel
$T((a_0, z_0), A) \equiv P^\tau((a_0, z_0), A \cap \{N_\tau > 0\})$ should be a continuous component of $P^\tau$ in the
sense of def. 4.8. In other words, the wealth-employment process is a T-process.

Indeed, it turns out that these heuristic considerations lead to a correct conclusion.

Theorem 4.17 The wealth-employment process $(a(\tau), z(\tau))$ is a T-process.

Given that there are some technical difficulties concerning the proof of th. 4.17, we first give
a detailed heuristic sketch of the proof. A formal proof is provided afterwards. The main step
in establishing that a kernel $T$ is a continuous component of $P^\tau$ in the sense of def. 4.8 is to
show continuity. To this end, let us consider a measurable set $A \subset [a_b, a_w] \times \{w, b\}$ and define

\begin{align*}
T_{>0}((a_0, z_0), A) \equiv P^\tau((a_0, z_0), A \cap \{N_\tau > 0\}) = {} & \int 1_A(a, w) \, p_{>0}((a_0, z_0), (a, w)) \, da \, P(N_\tau > 0) \\
& + \int 1_A(a, b) \, p_{>0}((a_0, z_0), (a, b)) \, da \, P(N_\tau > 0),
\end{align*}

where $p_{>0}((a_0, z_0), (a, z))$ denotes the transition density of the wealth-employment process
conditioned on $\{N_\tau > 0\}$. Obviously, continuity of $T_{>0}$ is equivalent to continuity of
$a_0 \mapsto p_{>0}((a_0, z_0), (a, w))$ and $a_0 \mapsto p_{>0}((a_0, z_0), (a, b))$. Moreover, if the heuristic argument is correct,
we may actually restrict ourselves to the case when there is exactly one jump in the employment
process until time $\tau$. This means that we consider the kernel

\[ T_1((a_0, z_0), A) \equiv P^\tau((a_0, z_0), A \cap \{N_\tau = 1\}) = \int 1_A(a, z_0') \, p_1((a_0, z_0), (a, z_0')) \, da \, P(N_\tau = 1), \]

where $z_0' \in \{w, b\}$, $z_0' \ne z_0$, and $p_1$ denotes the transition density conditioned on the event that
there is exactly one jump until time $\tau$. Now the picture becomes much clearer. Indeed, let us
assume that the jump in employment status happens at some time $u < \tau$. Up to time $u$, the
wealth process moves deterministically according to the ODE (2); after time $u$ it again moves
in a deterministic way according to (2). Hence, there is a deterministic function $\Psi_{z_0}$ (see (13)
for the precise definition) such that

\[ a_\tau = \Psi_{z_0}(a_0, u, \tau) \]

provided that there is precisely one jump of the employment status at time $u$ (and no other
jump before $\tau$). Hence, we may express $T_1$ by

\[ T_1((a_0, z_0), A) = \int_0^\tau 1_A(\Psi_{z_0}(a_0, u, \tau), z_0') \, g^{(1)}(u) \, du \, P(N_\tau = 1). \]

If $u \mapsto \Psi_{z_0}(a_0, u, \tau)$ were smooth and invertible with smooth inverse $y \mapsto \Psi_{z_0}^{-1}(a_0, y, \tau)$, then we
could re-write the equation as

\[ T_1((a_0, z_0), A) = \int_{\mathrm{low}(a_0)}^{\mathrm{up}(a_0)} 1_A(y, z_0') \, g^{(1)}(\Psi_{z_0}^{-1}(a_0, y, \tau)) \left| \frac{\partial}{\partial y} \Psi_{z_0}^{-1}(a_0, y, \tau) \right| dy \, P(N_\tau = 1), \tag{11} \]

which is continuous in $a_0$ provided that $a_0 \mapsto \frac{\partial}{\partial y} \Psi_{z_0}^{-1}(a_0, y, \tau)$ and $a_0 \mapsto \mathrm{low}(a_0)$, $a_0 \mapsto \mathrm{up}(a_0)$
are continuous (plus some boundedness assumption). Assuming that we can make all these
steps rigorous, we thus have proved the theorem.

In order to verify the various assumptions made in the above sketch, we need to understand
the solution of the ODE

\[ \frac{d a_z(\tau)}{d\tau} = r a_z(\tau) + z - c(a_z(\tau), z) \tag{12} \]

better. Indeed, the properties would be essentially trivial if it were not for the (possible)
singularity of the consumption function $c(a, z)$ at $a = a_b$ and $a = a_w$ induced by the explosion
of the right hand side in (5). Nevertheless, by careful analysis we can establish the assumptions
made above, at least when we further restrict the domain.
We denote the solution of (12) started at $a_0 \in [a_b, a_w]$ at time 0 and evaluated at time $\tau = u$
by $\Phi_z(a_0, u)$, i.e., $\Phi_z(a_0, 0) = a_0$. Let $\mathbf{T}(a, z) \in [0, \infty]$ be the time it takes for the deterministic
function $\Phi_z(a, \cdot)$ to reach the boundary $\{a_b, a_w\}$ of the domain. Note that $\mathbf{T}$ may be infinite,
which is actually the good situation, as the consumption function $c(a, z)$ is actually $C^1$ in that
case and, hence, stability holds. While it seems not clear how to obtain $C^1$ on the whole
interval $[a_b, a_w]$, it is clear how to get it on the interior of the domain. Of course, if $\mathbf{T}(a, z) = \infty$
for some $a \in \,]a_b, a_w[$, then it is infinite for any such $a$.
Lemma 4.18 For $z = w, b$, the map $a \mapsto c(a, z)$ is $C^1$ in the interior $]a_b, a_w[$ of the domain.
Proof. $x(a) \equiv (c(a, w), c(a, b))$ solves an ODE in $a$ (the reduced form ODE system), with
a right hand side which is locally Lipschitz in the interior of the domain. Fix some interior
value $a_0$ and consider the initial value problem for $x$ started at $x(a_0)$ on the $a$-domain $[a_0, a_w[$.
As the right hand side is locally Lipschitz, we can apply the usual existence and uniqueness
theorem, which gives, in particular, that the solution is $C^1$ up to (but not necessarily including)
$a = a_w$. On the other hand, for $a \in \,]a_b, a_0]$, we just reverse the direction, which gives another
locally Lipschitz right hand side, and, hence, $C^1$ follows in the same way.
This directly implies that $\Phi_z(a, u)$ is $C^1$ in both $a$ and $u$ for $u < \mathbf{T}(a, z)$, and continuous in
both variables even for $u \ge \mathbf{T}(a, z)$.
Lemma 4.19 The map $a \mapsto \mathbf{T}(a, z)$ is continuous on $[a_b, a_w] \setminus \{a_z\}$. Moreover, if $\mathbf{T}(a, z) < \infty$
for any $a_b < a < a_w$, then $\mathbf{T}(\cdot, z)$ is continuous on the whole domain.$^{15}$
Proof. Let $\Phi_z(a, u)$ denote the solution map of the ODE driving $a_z$ evaluated at time $u$
for initial value $\Phi_z(a, 0) = a$. Obviously, $\Phi_w(a, \cdot)$ is strictly increasing (until the time that $a_w$
is hit), while $\Phi_b(a, \cdot)$ is strictly decreasing. Hence, they have continuous inverse functions (in $t$,
for fixed $a$).
Fix any point $a^0 \in \,]a_b, a_w[$ and the corresponding value $\mathbf{T}^0(z) \equiv \mathbf{T}(a^0, z)$. For any positive $t$
we obviously have
\[ \mathbf{T}(\Phi_z(a, t), z) = \mathbf{T}(a, z) - t. \]
Denoting $\Phi_z^0(t) \equiv \Phi_z(a^0, t)$, we get for any $a < a^0$ for $z = b$ and any $a > a^0$ for $z = w$ that
\[ \mathbf{T}(a, z) = \mathbf{T}(\Phi_z^0((\Phi_z^0)^{-1}(a)), z) = \mathbf{T}^0(z) - (\Phi_z^0)^{-1}(a), \]
which is continuous in $a$. As $a^0$ was arbitrary in the interior of the interval, the claim follows.

Let us introduce a little bit of notation: for $z \in \{w, b\}$ we denote by $z'$ the other element
of $\{w, b\}$. Moreover, we define
\[ \Psi_z(a, u, \tau) \equiv \Phi_{z'}(\Phi_z(a, u), \tau - u), \qquad 0 \le u \le \tau, \ z \in \{w, b\}. \tag{13} \]
In words, $\Psi_z$ denotes the value of the wealth process at time $\tau$ given that the wealth process at
time 0 has the value $a$ and there is precisely one change of the employment status (from $z$ to
$z'$) in $[0, \tau]$, which takes place at time $u$. We are going to identify a sufficiently large set of
values of $u$ on which $u \mapsto \Psi_z(a, u, \tau)$ is differentiable and invertible with differentiable inverse.
$^{15}$ Otherwise, we have a jump from $+\infty$ to 0 at $a = a_z$.

Lemma 4.20 Define the set
\[ S(a, z, \tau) \equiv \{u \in [0, \tau] \mid u > \tau - \mathbf{T}(\Phi_z(a, u), z')\}. \]
If $\mathbf{T}(a, z') = \infty$ for some $a_b < a < a_w$, then
\[ S(a, z, \tau) = \begin{cases} [0, \tau], & a \ne a_{z'}, \\ ]0, \tau], & a = a_{z'}. \end{cases} \]
Otherwise, the following three properties hold:
1. There are numbers $s(a, z, \tau)$ such that $S(a, z, \tau) = \,]s(a, z, \tau), \tau]$.
2. $a \mapsto s(a, z, \tau)$ is continuous on $]a_b, a_w[$.
3. For every $(a, z) \in [a_b, a_w] \times \{w, b\}$ we have (uniformly) $\tau - s(a, z, \tau) > 0$.
Proof. The description for $\mathbf{T}(a, z') = \infty$ is obvious, so we assume that $\forall a \in [a_b, a_w] \setminus \{a_{z'}\}$:
$\mathbf{T}(a, z') < \infty$.
First note that $\tau \in S(a, z, \tau)$. Moreover, for $u < v < \tau$ we have that $u \in S(a, z, \tau)$ implies
$v \in S(a, z, \tau)$, since
\[ \tau - \mathbf{T}(\Phi_z(a, v), z') \le \tau - \mathbf{T}(\Phi_z(a, u), z') < u < v, \]
which shows that $S(a, z, \tau)$ is an interval. However, for its lower endpoint the inequality is no
longer strict, implying that the interval is closed to the right, but open to the left.
For the continuity of $s$, let us consider any (monotone) converging sequence $a_n \to a \in
[a_b, a_w]$. First, assume that $u \in S(a_n, z, \tau)$ for all $n \ge N$. Then $u > \tau - \mathbf{T}(\Phi_z(a_n, u), z')$. Thus,
continuity of $\Phi_z(\cdot, u)$ and $\mathbf{T}(\cdot, z')$ (cf. Lemma 4.19) imply that
\[ u \ge \tau - \mathbf{T}(\Phi_z(a, u), z'). \]
The right hand side of the inequality is decreasing in $u$, so that we can infer that every $u' > u$
is contained in $S(a, z, \tau)$, hence $u$ lies in the closure $\overline{S(a, z, \tau)}$. In a similar way, we can show that
$u \in [0, \tau] \setminus S(a_n, z, \tau)$ for every $n \ge N$ implies that $u \in [0, \tau] \setminus S(a, z, \tau)$. However, this is only
possible if $s(a_n, z, \tau) \to s(a, z, \tau)$, proving continuity in the interior of the domain.
It is obvious that $\tau > s(a, z, \tau)$ as $\tau \in S(a, z, \tau)$ and $S(a, z, \tau)$ is half-open. The uniformity
is also clear.
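Lemma 4.21 The map $u \mapsto \Psi_z(a, u, \tau)$ is differentiable on $S(a, z, \tau)$ and we have
\[ \left| \frac{\partial}{\partial u} \Psi_z(a, u, \tau) \right| > 0. \]
Proof. By (13), $\Psi_z$ is differentiable in $u$ provided that $a' \mapsto \Phi_{z'}(a', \tau - u)$ is differentiable
at $a' = \Phi_z(a, u)$. It is a well-known fact that the solution map of an ODE is differentiable in
its initial value provided that the right hand side is $C^1$. By Lemma 4.18, the right hand side
of (12) (for $z = z'$) is $C^1$ (in $a$) as long as we do not hit $a_{z'}$, which is precisely guaranteed by
$u \in S(a, z, \tau)$. Hence, we can apply the chain rule and obtain
\begin{align*}
\frac{\partial}{\partial u} \Psi_z(a, u, \tau) &= -\frac{\partial \Phi_{z'}}{\partial u}(\Phi_z(a, u), \tau - u) + \frac{\partial \Phi_{z'}}{\partial a}(\Phi_z(a, u), \tau - u) \, \frac{\partial \Phi_z}{\partial u}(a, u) \\
&= -\underbrace{[r \Psi_z(a, u, \tau) + z' - c(\Psi_z(a, u, \tau), z')]}_{I} \\
&\quad + \underbrace{\frac{\partial \Phi_{z'}}{\partial a}(\Phi_z(a, u), \tau - u)}_{II} \, \underbrace{[r \Phi_z(a, u) + z - c(\Phi_z(a, u), z)]}_{III}.
\end{align*}
For $z = w$, we have $I < 0$ (with strict inequality as $u \in S(a, z, \tau)$), and $II \ge 0$, $III \ge 0$,
implying that
\[ \frac{\partial}{\partial u} \Psi_w(a, u, \tau) > 0. \]
On the other hand, for $z = b$, we have $I > 0$ (again, with strict inequality), $II \ge 0$ and $III \le 0$,
implying that
\[ \frac{\partial}{\partial u} \Psi_b(a, u, \tau) < 0. \]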
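The sign pattern of Lemma 4.21 can be illustrated with a deliberately simple toy model. The Python sketch below does not use the paper's consumption function. It assumes, purely for illustration, that wealth relaxes exponentially towards an upper bound while employed and towards a lower bound while unemployed, so that the boundaries are never reached in finite time. The constants `A_B`, `A_W`, `KAPPA` and all function names are hypothetical. The code composes the two deterministic flows as in (13) and differentiates the result numerically with respect to the switching time.

```python
import math

A_B, A_W = -1.0, 3.0    # hypothetical wealth bounds with A_B < A_W
KAPPA = 0.8             # hypothetical adjustment speed of the toy drift

def flow(a, u, z):
    """Toy wealth after time u spent in state z, started at a.

    Assumed drift: KAPPA * (A_W - a) in state 'w', KAPPA * (A_B - a) in state 'b'.
    """
    target = A_W if z == "w" else A_B
    return target + (a - target) * math.exp(-KAPPA * u)

def psi(a, u, tau, z):
    """Wealth at time tau with exactly one switch from z to the other state at time u."""
    other = "b" if z == "w" else "w"
    return flow(flow(a, u, z), tau - u, other)

def dpsi_du(a, u, tau, z, h=1e-6):
    """Central finite-difference derivative of psi with respect to the switching time."""
    return (psi(a, u + h, tau, z) - psi(a, u - h, tau, z)) / (2.0 * h)

a0, tau = 1.0, 2.0
for u in (0.2, 1.0, 1.8):
    # A later switch from w to b leaves more wealth at tau; from b to w, less.
    print(u, dpsi_du(a0, u, tau, "w"), dpsi_du(a0, u, tau, "b"))
```

Under these assumptions the derivative is strictly positive when starting employed and strictly negative when starting unemployed, so the map from switching time to terminal wealth is invertible, which is what the change of variables requires.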

By Lemma 4.21 together with Lemma 4.20 we now understand rigorously on which domains
of integration we can perform the change of variables in (11), which is crucial for establishing
continuity. Therefore, we are now prepared to finish the proof of the theorem.
Proof of th. 4.17. We choose the measure $\alpha(dt) = \delta_\tau(dt)$ for some fixed $\tau > 0$ and define
a candidate $\tilde{T}$ for a continuous component of $P^\tau$ by

\[ \tilde{T}((a, z), A) \equiv \int_0^\tau 1_A(\Psi_z(a, u, \tau), z') \, 1_{S(a, z, \tau)}(u) \, g^{(1)}(u) \, du \, P(N(\tau) = 1), \tag{14} \]

for $a \in [a_b, a_w]$, $z \in \{w, b\}$, $A \subset [a_b, a_w] \times \{w, b\}$ measurable, i.e.,

\[ \tilde{T}((a, z), A) = P^\tau((a, z), A \cap \{N_\tau = 1\} \cap \{T_1 \in S(a, z, \tau)\}), \]

where $T_1$ denotes the first jump time of the Poisson process $N$. Hence, it is clear that $\tilde{T} \le P^\tau$.
Now, introduce a change of variables $u \to y \equiv \Psi_z(a, u, \tau)$ as in (11). By Lemma 4.21, we get

\[ \tilde{T}((a, z), A) = \int_{L(a, z, \tau)}^{U(a, z, \tau)} 1_A(y, z') \, 1_{S(a, z, \tau)}(\Psi_z^{-1}(a, y, \tau)) \, g^{(1)}(\Psi_z^{-1}(a, y, \tau)) \left| \frac{\partial}{\partial y} \Psi_z^{-1}(a, y, \tau) \right| dy \, P(N(\tau) = 1), \tag{15} \]

where the lower and upper limits of the integration are given by

\[ L(a, z, \tau) \equiv \begin{cases} \Psi_z(a, 0, \tau), & z = w, \\ \Psi_z(a, \tau, \tau), & z = b, \end{cases} \qquad U(a, z, \tau) \equiv \begin{cases} \Psi_z(a, \tau, \tau), & z = w, \\ \Psi_z(a, 0, \tau), & z = b, \end{cases} \]

respectively. Here, $y \mapsto \Psi_z^{-1}(a, y, \tau)$ denotes the inverse function of $u \mapsto \Psi_z(a, u, \tau)$.
Comparing (15) with (14), we note two important differences: the integrand (including the limits of
the integration) in (15) is continuous in $a$ almost everywhere but, on the other hand, generally
unbounded.
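By a slight abuse of notation, let us denote $S(a, z, \tau) \equiv \,]s(a, z, \tau), \tau]$.$^{16}$ Lemma 4.20 implies
that we may choose $0 < \varepsilon < \inf_{(a, z)} (\tau - s(a, z, \tau))$. Now define $S_\varepsilon(a, z, \tau) \equiv \,]s(a, z, \tau) + \varepsilon, \tau]$
and

\[ T((a, z), A) \equiv \int_0^\tau 1_A(\Psi_z(a, u, \tau), z') \, 1_{S_\varepsilon(a, z, \tau)}(u) \, g^{(1)}(u) \, du \, P(N(\tau) = 1). \tag{16} \]

By the same change of variables as above, we arrive at

\[ T((a, z), A) = \int_{L(a, z, \tau)}^{U(a, z, \tau)} 1_A(y, z') \, 1_{S_\varepsilon(a, z, \tau)}(\Psi_z^{-1}(a, y, \tau)) \, g^{(1)}(\Psi_z^{-1}(a, y, \tau)) \left| \frac{\partial}{\partial y} \Psi_z^{-1}(a, y, \tau) \right| dy \, P(N(\tau) = 1). \tag{17} \]
$^{16}$ This means that $s(a, z, \tau) \equiv 0$ in the case $\mathbf{T}(a, z') = \infty$ and $S(a, z, \tau) = [0, \tau]$ is replaced by $]0, \tau]$ in that
case.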

Since the term $I$ in the proof of Lemma 4.21 only gets close to 0 when $u$ is close to $s(a, z, \tau)$,
now
\[ 1_{S_\varepsilon(a, z, \tau)}(\Psi_z^{-1}(a, y, \tau)) \left| \frac{\partial}{\partial y} \Psi_z^{-1}(a, y, \tau) \right| \]
is uniformly bounded, implying that $(a, z) \mapsto T((a, z), A)$ is continuous for any measurable set
$A$.
As, by construction, $\tau - (s(a, z, \tau) + \varepsilon) > 0$, we have $T((a, z), [a_b, a_w] \times \{w, b\}) > 0$. Finally,
it is obvious that $T((a, z), A) \le \tilde{T}((a, z), A) \le P^\tau((a, z), A)$ for any $(a, z)$ and any measurable
set $A$.

Corollary 4.22 The wealth-employment process $(a(\tau), z(\tau))$ is Harris recurrent.

Proof. By lemma 4.15 and theorem 4.17, the wealth-employment process $(a(\tau), z(\tau))$ is an
irreducible T-process. Thus, prop. 4.9 implies that $(a(\tau), z(\tau))$ is Harris recurrent, given that
$P_x(X_t \to \infty) = 0$ holds for our bounded state space.

4.3.3 Uniqueness
We can now complete our proof of uniqueness.

Theorem 4.23 Suppose that $r < \rho$. Then there is a unique invariant probability measure for
the wealth-employment process $(a(\tau), z(\tau))$.

Proof. By prop. 4.7, there is a unique invariant measure (up to a constant multiple), and
prop. 4.6 implies that we may choose the invariant measure to be a probability measure.

4.4 Stability
Stability, i.e., convergence of the distribution of $(a(\tau), z(\tau))$ to the unique invariant distribution
for any given initial distribution, is implied by the existence of an irreducible skeleton chain, see
prop. 4.13.

Corollary 4.24 Under the assumptions of theorem 4.23, the employment-wealth process is
stable in the sense of def. 4.10.

Proof. Recall that the employment-wealth process is a T-process, see theorem 4.17.
Moreover, we have shown irreducibility in lemma 4.15. Proposition 4.13 will imply the desired
conclusion if we can show irreducibility of a skeleton chain. Take any $\Delta > 0$ and consider
the corresponding skeleton $Y_n$, $n \in \mathbb{N}$, with transition probabilities $P^\Delta$. By the proof of
theorem 4.17, we see that $(Y_n)$ is also a T-process, where the definition of T-processes is generalized
to discrete-time processes in the obvious way. By Meyn and Tweedie (1993, prop. 6.2.1), the
discrete-time T-process $Y$ is irreducible if there is a point $x \in X$ such that for any open
neighborhood $O$ of $x$, we have

\[ \forall y \in X : \quad \sum_{n=1}^\infty P^{n\Delta}(y, O) > 0. \tag{18} \]

This property, however, can be easily shown for the wealth-employment process $(a, z)$ as
illustrated in fig. 1 and formally analysed in app. B and C. Indeed, take $x = (-b/r, b)$. Then any
open neighborhood $O$ of $x$ contains $[-b/r, -b/r + \varepsilon[ \, \times \{b\}$ for some $\varepsilon > 0$. We start at some point
$y = (a_0, z_0) \in X$ and assume the following scenario: if necessary, at some time between 0 and $\Delta$,
the employment status changes to $b$, then it stays constant until the random time $N\Delta$ defined
by $N \equiv \inf\{n \mid a(n\Delta) < -b/r + \varepsilon\}$. Note that the wealth is decreasing in a deterministic way
while $z = b$. Thus, we can find a deterministic upper bound $N \le K(a_0)$. The event that the
employment status attains the value $b$ during the time interval $[0, \Delta]$ and retains this value until time
$K(a_0)\Delta$ has positive probability. In this case, however, the trajectory of the wealth-employment
process reaches $O$, implying that $\sum_{n=1}^\infty P^{n\Delta}(y, O) > 0$. Thus, the $\Delta$-skeleton chain is irreducible
and the wealth-employment process is stable.

5 Describing the distribution of labour income and wealth


We now come to the applied part of this paper where we describe distributional properties of $z(\tau)$ and $a(\tau)$ by Fokker-Planck equations. This is of importance per se for our setup and serves as an example that can be adapted for many other applications.

5.1 Labour market probabilities


Consider first the distribution of the labour market state. Given that the transition rates between $w$ and $b$ are constant, the conditional probabilities of being in state $z(\tau)$ follow e.g. from solving Kolmogorov's backward equations as presented e.g. in Ross (1993, ch. 6). As an example, the probabilities of being employed in $\tau$ conditional on being in state $z \in \{w, b\}$ in $t$ are
\[ P(z(\tau) = w \mid z(t) = w) \equiv p_{ww}(\tau) = \frac{\mu}{\mu + s} + \frac{s}{\mu + s}\, e^{-(\mu + s)(\tau - t)}, \tag{19} \]
\[ P(z(\tau) = w \mid z(t) = b) \equiv p_{bw}(\tau) = \frac{\mu}{\mu + s} - \frac{\mu}{\mu + s}\, e^{-(\mu + s)(\tau - t)}. \tag{20} \]

The complementary probabilities are $p_{wb}(\tau) = 1 - p_{ww}(\tau)$ and $p_{bb}(\tau) = 1 - p_{bw}(\tau)$. Letting $p_w(t)$ denote the probability of $z(t) = w$, i.e. letting it describe the initial distribution of $z(t)$, the unconditional probability of being in state $z$ in $\tau$ is
\[ p_z(\tau) = p_w(t)\, p_{wz}(\tau) + (1 - p_w(t))\, p_{bz}(\tau). \tag{21} \]

Equations (19) and (20) nicely show the influence of the initial condition on the probability of having a job. Consider a point in time $\tau$ which is just an instant after $t$. Let this instant be so small that $\tau$ is basically identical to $t$. Then, the probability of being employed in $\tau$ for an initially employed worker (set $\tau = t$ in (19)) is given by $\frac{\mu}{\mu + s} + \frac{s}{\mu + s} = 1$. Similarly, the probability of being employed in $\tau$ for an initially unemployed worker, where $\tau$ is very close to $t$, is given by (set $\tau = t$ in (20)) $\frac{\mu}{\mu + s} - \frac{\mu}{\mu + s} = 0$. The further the point $\tau$ lies in the future, the less important the initial state becomes and the closer both probabilities approach the unconditional probability of being employed, which is $\frac{\mu}{\mu + s}$.
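The behaviour of (19) to (21) at short and long horizons can be checked numerically. The following is a minimal sketch, not part of the paper: the function names and the rates $\mu = 0.4$, $s = 0.1$ are illustrative assumptions only.

```python
import math


def employment_probs(mu, s, horizon):
    """Return (p_ww, p_bw) from (19) and (20) for horizon = tau - t >= 0."""
    total = mu + s
    long_run = mu / total                # unconditional employment probability
    decay = math.exp(-total * horizon)   # weight still attached to the initial state
    p_ww = long_run + (s / total) * decay
    p_bw = long_run - (mu / total) * decay
    return p_ww, p_bw


def unconditional_employment(p_w_initial, mu, s, horizon):
    """Equation (21) for z = w: mix the conditional probabilities by the initial distribution."""
    p_ww, p_bw = employment_probs(mu, s, horizon)
    return p_w_initial * p_ww + (1.0 - p_w_initial) * p_bw


# Illustrative rates only (assumed here, not calibrated values from the paper).
mu, s = 0.4, 0.1
for horizon in (0.0, 1.0, 5.0, 50.0):
    p_ww, p_bw = employment_probs(mu, s, horizon)
    print(f"tau - t = {horizon:5.1f}: p_ww = {p_ww:.4f}, p_bw = {p_bw:.4f}")
```

If the initial distribution already equals the long-run one, $p_w(t) = \mu/(\mu + s)$, the exponential terms in (21) cancel and the employment probability stays constant over time.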

5.2 Fokker-Planck equations for wealth


5.2.1 The question and how to answer it
Our individual faces an uncertain future labour income stream $z(\tau)$. We would like to understand the joint distribution of $a(\tau)$ and $z(\tau)$ for $\tau \ge t$. To this end, we consider the stochastic processes of $a(\tau)$ in (2) and $z(\tau)$ in (3). After defining the (joint) density of $(a(\tau), z(\tau))$, we apply the “Fokker-Planck machinery” to obtain a description of the densities.
We denote the joint density by $p(a, z, \tau)$. For each point in time $\tau$, there is obviously a discrete and a continuous random variable. We can therefore split the density into two “subdensities” $p(a, w, \tau)$ and $p(a, b, \tau)$, both drawn in fig. 2 for some $\tau \ge t$. The subdensities can be understood as the product of a conditional density $p(a, \tau \mid z)$ times the probability of being in employment state $z$,
\[ p(a, z, \tau) \equiv p(a, \tau \mid z)\, p_z(\tau). \tag{22} \]
The probability $p_z(\tau)$ of an individual to be in a state $z$ in $\tau$ is given by (21). As is clear from (22), the $p(a, z, \tau)$ are not conditional densities; they rather integrate to the probability of $z(\tau) = z$. Looking at an individual who is in state $z$ in $\tau$, we get
\[ \int p(a, z, \tau)\, da = \int p(a, \tau \mid z)\, p_z(\tau)\, da = p_z(\tau) \int p(a, \tau \mid z)\, da = p_z(\tau). \tag{23} \]

The density of $a$ at some point in time $\tau$ is then simply
\[ p(a, \tau) = p(a, w, \tau) + p(a, b, \tau). \tag{24} \]

Figure 2 The subdensities $p(a, b, \tau)$ and $p(a, w, \tau)$ and the density $p(a, \tau)$

Note that the distribution of $(a(\tau), z(\tau))$ certainly depends on the initial condition $(a(t), z(t))$, which needs to be specified in order to calculate $p(a, z, \tau)$. In the notation we do not distinguish between the following two possibilities. Firstly, $(a(t), z(t))$ can be deterministic numbers, in which case $p(a, z, t)$ is a Dirac distribution centered in $(a(t), z(t))$ (more precisely, the mapping $a \mapsto p(a, z, t)$ is a Dirac distribution). Secondly, $(a(t), z(t))$ can itself be random, either because we regard them as outcomes of the employment-wealth process started at an even earlier time, or because there is some intrinsic uncertainty in measuring $a(t)$ (as the exact value of some asset, think e.g. of a house, is not known).
Let us now step back and ask how this approach can be applied to other setups. If one would like to understand the process of accumulation and depreciation of skills and experience during different employment states, one would have to specify a differential equation for skill similar to the budget constraint (2). Together with the fundamental process (3), one could then derive Fokker-Planck equations for densities. If one would like to model the endogenous distribution of entitlement to unemployment benefits, one would have to “translate” regulations concerning entitlement into a differential equation, add again (3) and proceed to derive Fokker-Planck equations. Similar procedures are possible for analysing distributions over the business cycle where some aggregate shock process would be added to (2), (3) or both. Note that this approach works just as well for processes driven e.g. by Brownian motion.

5.2.2 The equations and their economic interpretation
The derivation of the Fokker-Planck equations is in app. A. The result is a system of two non-autonomous quasi-linear partial differential equations in $p(a, w, \tau)$ and $p(a, b, \tau)$,
\[ \frac{\partial}{\partial \tau} p(a, w, \tau) + \{ra + w - c(a, w)\} \frac{\partial}{\partial a} p(a, w, \tau) = -\left\{ r - \frac{\partial}{\partial a} c(a, w) + s \right\} p(a, w, \tau) + \mu\, p(a, b, \tau), \tag{25a} \]
\[ \frac{\partial}{\partial \tau} p(a, b, \tau) + \{ra + b - c(a, b)\} \frac{\partial}{\partial a} p(a, b, \tau) = s\, p(a, w, \tau) - \left\{ r - \frac{\partial}{\partial a} c(a, b) + \mu \right\} p(a, b, \tau). \tag{25b} \]

The system is a partial differential equation system as there are two derivatives, one with respect to time $\tau$ and one with respect to wealth $a$, which is not surprising: as the FPEs describe the evolution of the density for wealth over time, two derivatives are needed. The derivative with respect to $a$ describes the “cross-sectional” property of the density for a given $\tau$. The time derivative describes how a density changes over time (see fn. 17). The differential equations are called quasi-linear as the factors in front of the wealth derivatives are functions of $a$. The PDEs are non-autonomous as some of the terms (other than the densities) also depend explicitly on one of the exogenous variables (exogenous in a differential equation sense), i.e. on wealth $a$.
As we can see, the density depends on properties of optimizing behaviour through the consumption levels $c(a, w)$ and $c(a, b)$ and through the marginal propensities to consume out of wealth, $\partial c(a, w)/\partial a$ and $\partial c(a, b)/\partial a$. These FPEs therefore describe the evolution of wealth for any specification of the utility function (e.g. CRRA, CARA, log, etc.). Modifying the utility function (e.g. allowing for labour supply or separating the intertemporal elasticity of substitution from risk aversion) affects the density of wealth through the effect on the optimal consumption plan $c(a, z)$.
Before we give an economic interpretation to these equations, we transform them such that they do not describe densities but distribution functions. To this end, define “subdistribution” functions as
\[ P(a, z, \tau) \equiv \int_{-b/r}^{a} p(x, z, \tau)\, dx. \tag{26} \]

The term $P(a, w, \tau)$ gives the probability that an individual will be employed in $\tau$ and own wealth equal to or lower than $a$. Given our definition of subdensities and their property in (23), we know that $\lim_{a \to \infty} P(a, w, \tau) = p_{zw}(\tau)$ where the term $p_{zw}(\tau)$ is given in either (19) or (20), depending on the initial state in $t$.
The transformation of our FPEs is subject to the condition that $p(-b/r, z, \tau) = 0$ for all $\tau$. This means that there is no worker with wealth equal to $-b/r$. As a wealth of $-b/r$ for unemployed workers would imply zero consumption, $c(-b/r, b) = 0$, this can indeed be ruled out as marginal utility from consumption would then be infinite. This would violate optimality. As employed workers with wealth of $-b/r$ can only originate from unemployed workers with this wealth level (as wealth of employed workers increases) and as $p(-b/r, b, \tau) = 0$ for all $\tau$, we know that $p(-b/r, w, \tau) = 0$ for all $\tau$ as well.
17 Compare this to the Pearson system of distributions that describes densities by ordinary non-autonomous differential equations (see e.g. Johnson, Kotz and Balakrishnan, 1994, ch. 12). These ordinary differential equations describe the density of one random variable. Here, we analyse a stochastic process, i.e. a sequence of random variables, and therefore need two derivatives.

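The subdistribution functions in (26) obey the following system (cf. app. D.2)
\[ \frac{\partial}{\partial \tau} P(a, w, \tau) = -\{ra + w - c(a, w)\} \frac{\partial}{\partial a} P(a, w, \tau) - s\, P(a, w, \tau) + \mu\, P(a, b, \tau), \tag{27a} \]
\[ \frac{\partial}{\partial \tau} P(a, b, \tau) = -\{ra + b - c(a, b)\} \frac{\partial}{\partial a} P(a, b, \tau) + s\, P(a, w, \tau) - \mu\, P(a, b, \tau). \tag{27b} \]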
This system is now extremely easy to understand. Starting with the first equation, the evolution of the distribution function over time, i.e. the time derivative $\partial P(a, w, \tau)/\partial \tau$ on the left-hand side, depends on three terms. Starting at the end, there is an increase in the probability $P(a, w, \tau)$ if there is a high flow from the state of being unemployed. This flow can be high if the matching rate $\mu$, the probability of being unemployed $P(a, b, \tau)$, or a combination of the two is high. Similarly, the probability $P(a, w, \tau)$ decreases (ceteris paribus) exponentially at the rate $s$, and the faster so, the higher the separation rate. The interpretation of the last two terms in the second equation (27b) is identical (subject to reversed signs). These two terms are very familiar from derivations of wage distributions in the Burdett-Mortensen (1998) tradition.
We can think of these equations as describing how wealth of a worker flows up and down depending on her current state. The labour income levels of workers are stochastically moving back and forth between the different states $w$ and $b$. The effect of these stochastic jumps on the distribution of wealth is captured by the two terms at the end of (27a,b). Wealth is moving non-stochastically within the states, either upwards (when employed) or downwards (when unemployed). The direction of the movement is on the wealth line, i.e. the partial derivative $\partial P(a, w, \tau)/\partial a$ gives the direction of $a$. The speed of this movement is determined by savings $ra + z - c(a, z)$. The speed is positive when employed and negative when unemployed. The overall effect of positive savings for the probability $P(a, w, \tau)$ of employed workers is then to decrease this probability. As wealth increases, the probability of having a wealth level equal to or lower than a certain level $a$ obviously falls as there is a permanent flow towards higher wealth levels. This flow is then reversed in the state of unemployment where the speed (i.e. savings $ra + b - c(a, b)$) is negative. As a consequence, the probability $P(a, b, \tau)$ ceteris paribus increases over time as unemployed workers “gather” towards the lower end of the wealth distribution.

5.2.3 Initial conditions


Obtaining a unique solution for ODEs generally requires certain differentiability conditions and as many initial conditions as differential equations. Conditions for obtaining a unique solution for PDEs differ in various respects, of which the most important one from an intuitive perspective is the fact that instead of initial conditions (i.e. an initial value or vector), initial functions are required. This can easily be understood for our case. Let us assume two initial functions for $a$, one for each labour market state $z \in \{w, b\}$. The obvious interpretation for these initial functions is as densities, just as illustrated in fig. 2. Initial functions would therefore be given by $p(a, b, t) = p^{ini}(a, b)$ and $p(a, w, t) = p^{ini}(a, w)$. Clearly, they take positive values on the range $[-b/r, a_w]$ only and need to jointly integrate to unity. Given these initial functions, one can then compute the partial derivatives with respect to $a$ in (25). This gives an ODE system which allows us to compute the density for the “next” $\tau$. Repeating this gives us the densities for all $z$, $a$ and $\tau$ we are interested in.
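The procedure just described (discretize wealth, evaluate the wealth derivatives, step to the “next” point in time) can be sketched as follows. This is an illustration under strong assumptions rather than the paper's numerical method: the savings rules `savings_w` and `savings_b` are hypothetical stand-ins for $ra + z - c(a, z)$, all parameter values and the upper end of the grid are made up, and the scheme is a simple explicit upwind discretization of (25) written in flux form.

```python
def simulate_subdensities(n_cells=100, steps=2000, dt=0.01,
                          r=0.02, b=0.1, a_upper=5.0, mu=0.4, s=0.1):
    """Explicit upwind time steps for (25) written in flux form.

    Returns the two subdensities on the wealth grid and the cell width.
    """
    a_lower = -b / r                                # lower end of the support, -b/r
    da = (a_upper - a_lower) / n_cells
    faces = [a_lower + i * da for i in range(n_cells + 1)]

    # HYPOTHETICAL savings rules standing in for ra + z - c(a, z):
    # positive when employed, negative when unemployed, zero at the ends.
    def savings_w(a):
        return 0.5 * (a_upper - a)

    def savings_b(a):
        return -0.5 * (a - a_lower)

    # Initial functions: uniform in wealth, half of the mass in each state.
    p_w = [0.5 / (a_upper - a_lower)] * n_cells
    p_b = [0.5 / (a_upper - a_lower)] * n_cells

    for _ in range(steps):
        # Flux through each cell face; the two boundary faces carry no flux.
        flux_w = [0.0] * (n_cells + 1)
        flux_b = [0.0] * (n_cells + 1)
        for i in range(1, n_cells):
            flux_w[i] = savings_w(faces[i]) * p_w[i - 1]  # wealth rises: use left cell
            flux_b[i] = savings_b(faces[i]) * p_b[i]      # wealth falls: use right cell
        new_w, new_b = [], []
        for i in range(n_cells):
            to_w = mu * p_b[i] - s * p_w[i]               # net flow into employment
            new_w.append(p_w[i] + dt * (to_w - (flux_w[i + 1] - flux_w[i]) / da))
            new_b.append(p_b[i] + dt * (-to_w - (flux_b[i + 1] - flux_b[i]) / da))
        p_w, p_b = new_w, new_b
    return p_w, p_b, da


p_w, p_b, da = simulate_subdensities()
print("total mass:", sum(p_w) * da + sum(p_b) * da)
print("employed mass:", sum(p_w) * da)
```

With the default values the time step respects the usual stability restriction for explicit upwind schemes (largest drift times `dt / da` below one). The flux form is chosen because it keeps total probability mass constant by construction, and the employed mass approaches $\mu/(\mu + s)$, in line with sect. 5.1.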

5.2.4 A density gives a density


The Fokker-Planck equations have a very convenient property that makes it easy to show that they indeed describe densities (in the sense that their solutions integrate to one). The only condition is that the initial functions integrate to one. We summarize this in the following

Proposition 5.1 Define $I(\tau) \equiv \int_{-\infty}^{\infty} [p(a, w, \tau) + p(a, b, \tau)]\, da$. Given the laws of motion for $p(a, z, \tau)$ from (25) and the fact of a bounded support $[-b/r, a_w]$, this integral is mass-preserving, i.e. $dI(\tau)/d\tau = 0$ for all $\tau$. Assuming initial densities, i.e. initial functions $p(a, z, t) \ge 0$ such that $I(t) = 1$, the PDEs in (25) indeed describe the dynamics of distributions over time.
Proof. See app. D.1.
This is an extremely useful property as it implies that, with an initial density, all other functions $p(a, w, \tau) + p(a, b, \tau)$ integrate to one and therefore represent densities.
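The intuition behind mass preservation can be sketched in a few lines. This is a heuristic outline and not the proof of app. D.1: the shorthand $v_z(a) \equiv ra + z - c(a, z)$ is introduced here for convenience, and we assume that differentiation and integration can be interchanged and that the flux $v_z(a)\, p(a, z, \tau)$ vanishes at both ends of the support.

```latex
\begin{align*}
\frac{\partial}{\partial\tau} p(a,w,\tau)
  &= -\frac{\partial}{\partial a}\bigl[v_w(a)\,p(a,w,\tau)\bigr]
     - s\,p(a,w,\tau) + \mu\,p(a,b,\tau),\\
\frac{\partial}{\partial\tau} p(a,b,\tau)
  &= -\frac{\partial}{\partial a}\bigl[v_b(a)\,p(a,b,\tau)\bigr]
     + s\,p(a,w,\tau) - \mu\,p(a,b,\tau),\\
\frac{dI(\tau)}{d\tau}
  &= -\Bigl[v_w(a)\,p(a,w,\tau) + v_b(a)\,p(a,b,\tau)\Bigr]_{a=-b/r}^{a=a_w} = 0 .
\end{align*}
```

The first two lines are (25a) and (25b) rearranged using $\partial v_z(a)/\partial a = r - \partial c(a, z)/\partial a$. Adding them and integrating over $a$, the transition terms with $s$ and $\mu$ cancel and only the boundary terms remain.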

5.2.5 The long-run distribution of individual wealth


When we are interested in the long-run distribution of wealth and income only, the time derivatives of the densities would be zero and the long-run densities would be described by two linear ordinary differential equations. This is true both for the system in densities (25) and for the system for distributions (27).
Initial conditions for these ordinary differential equations are given by
\[ p(a_w, w) = 0, \qquad p(a_w, b) = 0. \tag{28} \]
The intuition for $p(a_w, w) = 0$ comes from the saddle-path nature of the TSS in (7): there is one path going into the TSS from the left and one going into it from the right, and two (not drawn) starting from it and going North and South. In saddle points of ODE systems, one can prove by linearization around the fixed point that local solutions of the ODE approach the saddle point asymptotically. Linearization here is more involved given the special structure of our system (see fn. 20). Assuming that the qualitative properties of local behaviour are not affected by this structure, we would observe asymptotic behaviour here as well and the TSS would actually never be reached: $p(a_w, w) = 0$ would follow. The second boundary condition is then an immediate consequence. As the state $(a_w, b)$ can occur only through a transition from $(a_w, w)$ but the density at $(a_w, w)$ is zero, $p(a_w, b) = 0$ as well.
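As an illustration of what the long-run system looks like, one can set the time derivatives in (27) to zero. The following lines are a sketch derived here from (27) and are not taken from the paper; $p(a, z)$ and $P(a, z)$ denote the long-run subdensities and subdistributions, using $\partial P(a, z)/\partial a = p(a, z)$.

```latex
\begin{align*}
0 &= -\{ra + w - c(a,w)\}\,p(a,w) - s\,P(a,w) + \mu\,P(a,b),\\
0 &= -\{ra + b - c(a,b)\}\,p(a,b) + s\,P(a,w) - \mu\,P(a,b),\\
\Rightarrow\quad
0 &= \{ra + w - c(a,w)\}\,p(a,w) + \{ra + b - c(a,b)\}\,p(a,b).
\end{align*}
```

The last line, obtained by adding the first two, says that in the long run the upward flow of employed workers through any wealth level $a$ is exactly offset by the downward flow of unemployed workers. Evaluating the first line above the upper end of the support, where $p(a, w) = 0$, gives $s\, p_w = \mu\, p_b$ for the long-run employment probabilities, which is consistent with the unconditional employment probability $\mu/(\mu + s)$ of sect. 5.1.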

6 Conclusion
This paper has introduced methods that allow us to prove existence, uniqueness and stability of distributions described by stochastic differential equations driven by a jump process. These methods were applied to a model of precautionary saving. Existence, uniqueness and stability of the optimal process for the state variables, wealth and labour market status, were proven. The results hold for an interest rate being lower than the time-preference rate.
The $T$-property turned out to be especially useful for models where randomness is introduced by finite-activity jump processes, i.e., by compound Poisson processes. In diffusion models, usually even the strong Feller property holds, which makes it easy to conclude the $T$-property. On the other hand, in models driven by infinite-activity jump processes, e.g., Lévy processes with infinite activity, it does not seem clear whether the $T$-property can lead to useful results. Indeed, in these models, the strong Feller property may or may not hold, see, for instance, Picard (1995/97). On the other hand, the weak Feller property is satisfied for all Lévy processes, implying existence of invariant distributions, see Applebaum (2004, theorem 3.1.9). Looking at these issues in economic applications offers many fascinating research projects for years to come.
From a more applied perspective, we derived Fokker-Planck equations for wealth and labour market status. We saw inter alia how matching and separation rates and savings shape the evolution of the wealth distribution over time. Our approach and our derivation provide a considerable generalization of existing applications in economics. This will facilitate the use of these equations in many other applications in future work.

A Appendix on deriving the Fokker-Planck equations
This appendix derives the Fokker-Planck equations (25) of the wealth-employment process $(a(t), z(t))$. We proceed step by step as this facilitates applications for other purposes. Step 1: We start with some function $f$ having as arguments the variables whose density we would like to understand. We compute the differential of this function in the usual way and also compute its expected change. Step 2: The starting point here is Dynkin's formula. This formula, intuitively speaking, gives the expected value of some function $f$, whose arguments are the random variables we are interested in, as the sum of the current value of $f$ plus the integral over expected future changes of $f$. The expected change of $f$ is expressed by using the density of our random variables. The Dynkin formula is differentiated with respect to time. Step 3: By using integration by parts or the adjoint operator, we get an expression for the change of the expected value of $f$. Step 4: A different expression for this change of the expected value can be obtained by starting from the expected value and differentiating it. Step 5: Equating the two gives the differential equations for the density.
It should be kept in mind that this approach can be applied to systems beyond (2) and (3). As long as there are one or several stochastic processes described by stochastic differential equations, this approach can be used to obtain a description of the corresponding densities. Uncertainty can stem from Brownian motion, Poisson processes, a combination of the two or Lévy processes.

A.1 The expected change of some function f


Assume there is a function $f$ having as arguments the state variables $a$ and $z$. This function has a bounded support $S$, i.e. $f(a, z) = 0$ outside this support (see fn. 18). Heuristically, the differential of this function, using a change of variable formula (see fn. 19), gives
\[ \begin{aligned} df(a(\tau), z(\tau)) ={}& f_a(\cdot)\, \{r a(\tau) + z(\tau) - c(a(\tau), z(\tau))\}\, d\tau \\ &+ \{f(a(\tau), z(\tau) + \Delta) - f(a(\tau), z(\tau))\}\, dq_\mu \\ &+ \{f(a(\tau), z(\tau) - \Delta) - f(a(\tau), z(\tau))\}\, dq_s. \end{aligned} \]
Due to the state-dependent arrival rates, see after (3), only one Poisson process is active at a time.
When we are interested in the expected change, we need to form expectations. Applying the conditional expectations operator $E$ and dividing by $d\tau$ yields the heuristic equation
\[ \begin{aligned} \frac{E\, df(\cdot)}{d\tau} ={}& f_a(\cdot)\, \{r a(\tau) + z(\tau) - c(a(\tau), z(\tau))\} \\ &+ \mu(z(\tau))\, [f(a(\tau), z(\tau) + \Delta) - f(a(\tau), z(\tau))] \\ &+ s(z(\tau))\, [f(a(\tau), z(\tau) - \Delta) - f(a(\tau), z(\tau))]. \end{aligned} \tag{29} \]
In what follows, we denote this expression by
\[ A f(a(\tau), z(\tau)) \equiv \frac{E\, df(a(\tau), z(\tau))}{d\tau}, \tag{30} \]
which is, more precisely, the infinitesimal generator $A$ defined by
\[ A f(a, z) = \lim_{\epsilon \searrow 0} \frac{E\bigl(f(a(\tau + \epsilon), z(\tau + \epsilon)) \mid a(\tau) = a,\, z(\tau) = z\bigr) - f(a, z)}{\epsilon}. \]

18 We can make this assumption without any restriction. As we will see below, this function will not play any role in the determination of the actual density.
19 There are formal derivations of this equation in mathematical textbooks like Protter (1995). For a more elementary presentation, see Wälde (2012, part IV).

Notice that $A f(a, z)$ does not depend on $\tau$, because the Markov process $(a(\tau), z(\tau))$ is time-homogeneous. We understand $A$ as an operator mapping functions (in $a$ and $z$) to other such functions. Moreover, note that all test functions, i.e. $C^\infty$ functions of bounded support, are in the domain of the operator $A$, i.e. the set of all functions $f$ such that the above limit exists (for all $a$ and $z$).
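To make the operator in (29) and (30) concrete, the following minimal sketch evaluates $A f(a, z)$ for a given function. It is an illustration only: the linear consumption rule, the parameter values and the function $f$ (which, for simplicity, does not have bounded support) are hypothetical choices made here.

```python
def make_generator(savings, w, b, mu, s):
    """Return a function (f, f_a, a, z) -> A f(a, z) following (29)."""
    def generator(f, f_a, a, z):
        drift_part = f_a(a, z) * savings(a, z)
        if z == w:
            jump_part = s * (f(a, b) - f(a, w))    # employed: only a separation can occur
        else:
            jump_part = mu * (f(a, w) - f(a, b))   # unemployed: only a match can occur
        return drift_part + jump_part
    return generator


# Illustrative numbers and a HYPOTHETICAL linear consumption rule.
r, w, b, mu, s = 0.02, 1.0, 0.5, 0.4, 0.1


def consumption(a, z):
    return 0.9 * z + 0.03 * a


def savings(a, z):
    return r * a + z - consumption(a, z)           # ra + z - c(a, z)


def f(a, z):
    return a * a + z


def f_a(a, z):
    return 2.0 * a                                 # partial derivative of f w.r.t. a


A = make_generator(savings, w, b, mu, s)
print("A f(2, w) =", A(f, f_a, 2.0, w))
print("A f(2, b) =", A(f, f_a, 2.0, b))
```

The branch on `z` reflects the remark after (29) that only one Poisson process is active at a time: an employed worker can only lose the job and an unemployed worker can only find one.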

A.2 Dynkin’s formula and its manipulation


To abbreviate notation, we now define $x(\tau) \equiv (a(\tau), z(\tau))$. The expected value of our function $f(x(\tau))$ is by Dynkin's formula (e.g. Yuan and Mao, 2003) given by
\[ E f(x(\tau)) = E f(x(t)) + \int_t^\tau E\bigl(A f(x(s))\bigr)\, ds. \tag{31} \]

To understand this equation, use the definition in (30) and formally write it as
\[ E f(x(\tau)) = E f(x(t)) + \int_t^\tau \frac{E\, df(x(s))}{ds}\, ds = E f(x(t)) + \int_t^\tau E\, df(x(s)). \]

Intuitively speaking, Dynkin's formula says that the expected value of $f(x(\tau))$ is the expectation for the current value, $E f(x(t))$ (given that we allow for a random initial condition $x(t)$), plus the “sum of” expected future changes, $\int_t^\tau E\, df(x(s))$.
Let us now differentiate (31) with respect to time $\tau$ and find
\[ \frac{\partial}{\partial \tau} E f(x(\tau)) = \frac{\partial}{\partial \tau} \int_t^\tau E\bigl(A f(x(s))\bigr)\, ds = E\bigl(A f(x(\tau))\bigr), \tag{32} \]
where the first equality used that $E f(x(t))$ is a constant and pulled the expectations operator into the integral. This equation says the following: We form expectations in $t$ about $f(x(\tau))$. We now ask how this expectation changes when $\tau$ moves further into the future, i.e. we look at $\frac{\partial}{\partial \tau} E[f(x(\tau))]$. We see that this change is given by the expected change of $f(x(\tau))$, where the change is $A f(x(\tau))$.
We now introduce the densities we defined in sect. 5.2.1. The expectation operator $E$ in (32) integrates over all possible states of $x(\tau)$. When we express this joint density as $p(a, z, \tau) \equiv p(a, \tau \mid z)\, p_z(\tau)$, we can write (32) as
\[ \begin{aligned} \frac{\partial}{\partial \tau} E f(x(\tau)) &= E\bigl(A f(x(\tau))\bigr) \\ &= p_w(\tau) \int_{-\infty}^{\infty} A f(a, w)\, p(a, \tau \mid w)\, da + p_b(\tau) \int_{-\infty}^{\infty} A f(a, b)\, p(a, \tau \mid b)\, da. \end{aligned} \]

Now pull $p_w(\tau)$ and $p_b(\tau)$ back into the integral and use $p(a, z, \tau) \equiv p(a, \tau \mid z)\, p_z(\tau)$ again for $z = w$ and $z = b$. Then
\[ \frac{\partial}{\partial \tau} E f(x(\tau)) = \int_{-\infty}^{\infty} A f(a, w)\, p(a, w, \tau)\, da + \int_{-\infty}^{\infty} A f(a, b)\, p(a, b, \tau)\, da \equiv \Omega_w + \Omega_b. \tag{33} \]

A.3 The adjoint operator and integration by parts


This is now the crucial step in obtaining a differential equation for the density. It consists in applying an integration by parts formula which allows us to move the derivatives in $A f(x(\tau))$ into the density $p(x, \tau)$. Let us briefly review this method, without getting into technical details. Given two functions $f, g: \mathbb{R} \to \mathbb{R}$ and two fixed real numbers $c < d$, the product rule of differentiation
\[ d(f(x)\, g(x)) = df(x)\, g(x) + f(x)\, dg(x) \tag{34} \]
implies that $f(d) g(d) - f(c) g(c) = \int_c^d f'(x) g(x)\, dx + \int_c^d f(x) g'(x)\, dx$, a formula referred to as the partial integration rule. In particular, it also holds for $c = -\infty$ and $d = +\infty$, if the function evaluations are understood as limits for $c \to -\infty$ and $d \to +\infty$, respectively. If $f$ has bounded support, i.e. is equal to zero outside a fixed bounded set, then the function evaluations at $\pm\infty$ vanish and we get
\[ \int_{-\infty}^{+\infty} f'(x)\, g(x)\, dx = -\int_{-\infty}^{+\infty} f(x)\, g'(x)\, dx. \tag{35} \]
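Formula (35) is easy to check numerically. The sketch below is an illustration added here, with freely chosen functions: $f$ is a standard smooth bump function with support in $(-1, 1)$, $g$ is an arbitrary smooth function, and both integrals are approximated by the trapezoid rule.

```python
import math


def bump(x):
    """Smooth test function that is zero outside (-1, 1)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def bump_prime(x):
    """Derivative of bump, obtained by the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2


def g(x):
    return math.sin(x) + x * x


def g_prime(x):
    return math.cos(x) + 2.0 * x


def trapezoid(func, lo, hi, n):
    """Composite trapezoid rule with n subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * h)
    return total * h


# Integrate over an interval that contains the support of the bump function.
lhs = trapezoid(lambda x: bump_prime(x) * g(x), -1.5, 1.5, 20000)
rhs = -trapezoid(lambda x: bump(x) * g_prime(x), -1.5, 1.5, 20000)
print("integral of f' g :", lhs)
print("minus integral of f g':", rhs)
```

The two numbers agree up to discretization error because the boundary terms $f(d) g(d) - f(c) g(c)$ are exactly zero for a function of bounded support; this is the property of the test function $f$ that is used below.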

We now apply (35) to equation (33). We can do this as the expressions in (33) “lost” all stochastic features. To this end, insert the definition of $A$ given in (30) together with (29) into (33). To avoid getting lost in long expressions, we look at both integrals in (33) in turn. For the second, observe that
\[ A f(a, b) = f_a(\cdot)\, \{ra + b - c(a, b)\} + \mu\, [f(a, w) - f(a, b)], \]
i.e. the term with $s$ in (29) is missing given that we are in state $b$. Hence,
\[ \begin{aligned} \Omega_b &= \int_{-\infty}^{\infty} \bigl[ f_a(a, b)\, \{ra + b - c(a, b)\} + \mu\, [f(a, w) - f(a, b)] \bigr]\, p(a, b, \tau)\, da \\ &= \int_{-\infty}^{\infty} f_a(a, b)\, \{ra + b - c(a, b)\}\, p(a, b, \tau)\, da + \mu \int_{-\infty}^{\infty} [f(a, w) - f(a, b)]\, p(a, b, \tau)\, da. \end{aligned} \]

Now integrate by parts. As this integral shows, we only need to integrate by parts for the $f_a$ term. The rest remains untouched. This gives with (35), where $g(x)$ stands for $\{ra + b - c(a, b)\}\, p(a, b, \tau)$ and $x$ for $a$,
\[ \begin{aligned} \Omega_b ={}& -\int_{-\infty}^{\infty} f(a, b) \left[ \left\{ r - \frac{\partial}{\partial a} c(a, b) \right\} p(a, b, \tau) + \{ra + b - c(a, b)\} \frac{\partial}{\partial a} p(a, b, \tau) \right] da \\ &+ \mu \int_{-\infty}^{\infty} [f(a, w) - f(a, b)]\, p(a, b, \tau)\, da. \end{aligned} \tag{36} \]

Now look at the first integral of (33). After similar steps (as the principle is the same, we replace $b$ by $w$ and the arrival rate $\mu$ by $s$ in the last equation), this reads
\[ \begin{aligned} \Omega_w ={}& -\int_{-\infty}^{\infty} f(a, w) \left[ \left\{ r - \frac{\partial}{\partial a} c(a, w) \right\} p(a, w, \tau) + \{ra + w - c(a, w)\} \frac{\partial}{\partial a} p(a, w, \tau) \right] da \\ &+ s \int_{-\infty}^{\infty} [f(a, b) - f(a, w)]\, p(a, w, \tau)\, da. \end{aligned} \tag{37} \]

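Summarizing, we find
\[ \begin{aligned} \frac{\partial}{\partial \tau} E f(x(\tau)) ={}& \Omega_w + \Omega_b \\ ={}& \int_{-\infty}^{\infty} f(a, w) \left[ -\left\{ r - \frac{\partial}{\partial a} c(a, w) \right\} p(a, w, \tau) - \{ra + w - c(a, w)\} \frac{\partial}{\partial a} p(a, w, \tau) \right] da \\ &+ s \int_{-\infty}^{\infty} [f(a, b) - f(a, w)]\, p(a, w, \tau)\, da \\ &+ \int_{-\infty}^{\infty} f(a, b) \left[ -\left\{ r - \frac{\partial}{\partial a} c(a, b) \right\} p(a, b, \tau) - \{ra + b - c(a, b)\} \frac{\partial}{\partial a} p(a, b, \tau) \right] da \\ &+ \mu \int_{-\infty}^{\infty} [f(a, w) - f(a, b)]\, p(a, b, \tau)\, da. \end{aligned} \tag{38} \]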

A.4 The expected value again


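Let us now derive the second expression for the change in the expected value. By definition, and as an alternative to the Dynkin formula (31), we have
\[ E f(x(\tau)) = \int_{-\infty}^{\infty} f(a, b)\, p(a, b, \tau)\, da + \int_{-\infty}^{\infty} f(a, w)\, p(a, w, \tau)\, da. \tag{39} \]

When we differentiate this expression with respect to time, we get
\[ \frac{\partial}{\partial \tau} E f(x(\tau)) = \int_{-\infty}^{\infty} f(a, b)\, \frac{\partial}{\partial \tau} p(a, b, \tau)\, da + \int_{-\infty}^{\infty} f(a, w)\, \frac{\partial}{\partial \tau} p(a, w, \tau)\, da. \tag{40} \]
Note that we can use
\[ \frac{\partial}{\partial \tau} \int_{-\infty}^{\infty} f(a, z)\, p(a, z, \tau)\, da = \int_{-\infty}^{\infty} f(a, z)\, \frac{\partial}{\partial \tau} p(a, z, \tau)\, da \]
as $z$ and $a$ inside this integral are no longer functions of time.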

A.5 Equating the two expressions


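We now equate (38) with (40). Collecting terms belonging to $f(a, w)$ and $f(a, b)$ gives
\[ \int_{-\infty}^{\infty} f(a, w)\, \varphi_w\, da + \int_{-\infty}^{\infty} f(a, b)\, \varphi_b\, da = 0, \tag{41} \]
where
\[ \varphi_w \equiv -\left\{ r - \frac{\partial}{\partial a} c(a, w) + s \right\} p(a, w, \tau) - \{ra + w - c(a, w)\} \frac{\partial}{\partial a} p(a, w, \tau) + \mu\, p(a, b, \tau) - \frac{\partial}{\partial \tau} p(a, w, \tau) \]
and
\[ \varphi_b \equiv -\left\{ r - \frac{\partial}{\partial a} c(a, b) + \mu \right\} p(a, b, \tau) - \{ra + b - c(a, b)\} \frac{\partial}{\partial a} p(a, b, \tau) + s\, p(a, w, \tau) - \frac{\partial}{\partial \tau} p(a, b, \tau). \]
Obviously, the above equation is satisfied if
\[ \varphi_b = \varphi_w = 0. \tag{42} \]

These are the Fokker-Planck equations used in (25).
It is easy to see that the integral equation can only be satisfied for all functions $f$ if these Fokker-Planck equations are satisfied. Indeed, assume that $\varphi_b > 0$ on an interval $I = [d - \epsilon, d + \epsilon]$. One can find a non-negative function $f$ smooth in $a$ such that $f(a, w) = 0$ for all $a$ and
\[ f(a, b) = \begin{cases} 1, & a \in [d - \epsilon/2,\, d + \epsilon/2], \\ 0, & a \in\, ]-\infty, d - \epsilon] \cup [d + \epsilon, \infty[. \end{cases} \]

Inserting this test function into the integral equation gives
\[ \int_{-\infty}^{\infty} f(a, w)\, \varphi_w\, da + \int_{-\infty}^{\infty} f(a, b)\, \varphi_b\, da = 0 + \int_{d - \epsilon}^{d + \epsilon} f(a, b)\, \varphi_b\, da > 0 \]
by construction, which contradicts (41). The case $\varphi_b < 0$ on an interval is analogous. Therefore, $\varphi_b = 0$ has to hold for all $a \in \mathbb{R}$, and similarly for $\varphi_w$.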

B Referee appendix
For all further appendices, please see the Referees' appendix.

References
Achdou, Y., F. Buera, J. Lasry, P. Lions, and B. Moll (2014): “Partial differential equation models in macroeconomics,” Philosophical Transactions of the Royal Society A 372: 20130397, pp. 1–19.
Achdou, Y., J. Han, J. Lasry, P. Lions, and B. Moll (2015): “Heterogeneous Agent Models in Continuous Time,” mimeo Princeton University.
Aiyagari, S. R. (1994): “Uninsured Idiosyncratic Risk and Aggregate Saving,” Quarterly Journal of Economics, 109, 659–84.
Anderson, R., and R. Raimondo (2008): “Equilibrium in Continuous-Time Financial Markets: Endogenously Dynamically Complete Markets,” Econometrica, 76(4), 841–907.
Applebaum, D. (2004): Lévy processes and stochastic calculus, vol. 93 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge.
Azema, J., M. Duflo, and D. Revuz (1969): “Mesure invariante des processus de Markov recurrents,” Sem. Probab. III, Univ. Strasbourg 1967/68, Lect. Notes Math. 88, 24–33.
Bandi, F. M., and T. H. Nguyen (2003): “On the functional estimation of jump-diffusion models,” Journal of Econometrics, 116(1-2), 293–328.
Bertola, G., and R. Caballero (1994): “Irreversibility and Aggregate Investment,” Review of Economic Studies, 61(207), 223–246.
Bismut, J.-M. (1975): “Growth and Optimal Intertemporal Allocation of Risks,” Journal of Economic Theory, 10(2), 239–257.
Brock, W., and M. Magill (1979): “Dynamics under Uncertainty,” Econometrica, 47(4), 843–868.
Burdett, K., and D. T. Mortensen (1998): “Wage Differentials, Employer Size, and Unemployment,” International Economic Review, 39, 257–273.
Chang, F.-R., and A. Malliaris (1987): “Asymptotic Growth under Uncertainty: Existence and Uniqueness,” Review of Economic Studies, 54(1), 169–174.
Down, D., S. P. Meyn, and R. L. Tweedie (1995a): “Exponential and Uniform Ergodicity of Markov Processes,” Annals of Probability, 23, 1671–1691.
Down, D., S. P. Meyn, and R. L. Tweedie (1995b): “Exponential and uniform ergodicity of Markov processes,” Ann. Probab., 23(4), 1671–1691.
Flinn, C. (2006): “Minimum Wage Effects on Labor Market Outcomes under Search, Matching, and Endogenous Contact Rates,” Econometrica, 74, 1013–1062.
Friedman, A. (1975): Stochastic differential equations and applications. Vol. 1. Academic Press [Harcourt Brace Jovanovich Publishers], New York, Probability and Mathematical Statistics, Vol. 28.
Hansen, L. P., and J. A. Scheinkman (2009): “Long-Term Risk: An Operator Approach,” Econometrica, 77(1), 177–234.
Heathcote, J., K. Storesletten, and G. Violante (2009): “Quantitative Macroeconomics with Heterogeneous Households,” Annual Review of Economics, 1, 319–354.
Hopenhayn, H., and E. Prescott (1992): “Stochastic Monotonicity and Stationary Distributions for Dynamic Economies,” Econometrica, 60(6), 1387–1406.
Huggett, M. (1993): “The risk-free rate in heterogeneous-agent incomplete-insurance economies,” Journal of Economic Dynamics and Control, 17, 953–969.
Huggett, M., and S. Ospina (2001): “Aggregate precautionary savings: when is the third derivative irrelevant?,” Journal of Monetary Economics, 48, 373–396.
Impullitti, G., A. A. Irarrazabal, and L. D. Opromolla (2011): “A Theory of Entry into and Exit from Export Markets,” mimeo Cambridge University.
Johnson, N., S. Kotz, and N. Balakrishnan (1994): “Continuous Distributions (General),” in Continuous Univariate Distributions Vol. 1, ed. by N. Johnson, S. Kotz, and N. Balakrishnan, pp. 1–79. Wiley Publications.
Kamihigashi, T., and J. Stachurski (2012): “An order-theoretic mixing condition for monotone Markov chains,” Statistics & Probability Letters, 82(2), 262–267.
Kamihigashi, T., and J. Stachurski (2013): “Stochastic Stability in Monotone Economies,” Theoretical Economics, forthcoming.
Klette, T. J., and S. Kortum (2004): “Innovating Firms and Aggregate Innovation,” Journal of Political Economy, 112(5), 986–1018.
Koeniger, W., and J. Prat (2007): “Employment protection, product market regulation and firm selection,” Economic Journal, 117(521), F302–F332.
Launov, A., and K. Wälde (2013): “Estimating Incentive and Welfare Effects of Non-Stationary Unemployment Benefits,” International Economic Review, 54, 1159–1198.
Leland, H. (1968): “Saving and Uncertainty: The Precautionary Demand for Saving,” Quarterly Journal of Economics, 82(3), 465–473.
Lippi, F., S. Ragni, and N. Trachter (2015): “Optimal monetary policy with heterogeneous money holdings,” Journal of Economic Theory, 159, 339–368.
Lise, J. (2013): “On-the-Job Search and Precautionary Savings,” Review of Economic Studies, 80, 1086–1113.
Lo, A. W. (1988): “Maximum likelihood estimation of generalized Ito processes with discretely sampled data,” Econometric Theory, 4, 231–247.
Magill, M. (1977): “A Local Analysis of N-Sector Capital Accumulation under Uncertainty,” Journal of Economic Theory, 15(1), 211–219.
Mattheij, R., and J. Molenaar (2002): Ordinary differential equations in theory and practice, vol. 43 of Classics in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, Reprint of the 1996 original.
Merton, R. C. (1975): “An Asymptotic Theory of Growth under Uncertainty,” The Review of Economic Studies, 42(3), 375–393.
Meyn, S. P., and R. L. Tweedie (1993a): Markov chains and stochastic stability, Communications and Control Engineering Series. Springer-Verlag London Ltd., London.
Meyn, S. P., and R. L. Tweedie (1993b): “Stability of Markovian processes. II. Continuous-time processes and sampled chains,” Adv. in Appl. Probab., 25(3), 487–517.
Meyn, S. P., and R. L. Tweedie (1993c): “Stability of Markovian processes. III. Foster-Lyapunov criteria for continuous-time processes,” Adv. in Appl. Probab., 25(3), 518–548.
Miao, J. (2006): “Competitive equilibria of economies with a continuum of consumers and aggregate shocks,” Journal of Economic Theory, 128(1), 274–298.
Moscarini, G. (2005): “Job Matching and the Wage Distribution,” Econometrica, 73(2), 481–516.
Øksendal, B. (1998): Stochastic Differential Equations. Springer, Fifth Edition, Berlin.
Ortigueira, S., and N. Siassi (2013): “How important is intra-household risk sharing for savings and labor supply?,” Journal of Monetary Economics, 60(6), 650–666.
Picard, J. (1995/97): “Density in small time for Lévy processes,” ESAIM Probab. Statist., 1, 357–389 (electronic).
Postel-Vinay, F., and J.-M. Robin (2002): “Equilibrium Wage Dispersion with Worker and Employer Heterogeneity,” Econometrica, 70, 2295–2350.
Prat, J. (2007): “The impact of disembodied technological progress on unemployment,” Review of Economic Dynamics, 10, 106–125.
Protter, P. (1995): Stochastic Integration and Differential Equations. Springer-Verlag, Berlin.
Raimondo, R. C. (2005): “Market clearing, utility functions, and securities prices,” Economic Theory, 25(2), 265–285.
Rishel, R. (1970): “Necessary and Sufficient Dynamic Programming Conditions for Continuous Time Stochastic Optimal Control,” SIAM Journal on Control and Optimization, 8(4), 559–571.
Ross, S. M. (1993): Introduction to Probability Models, 5th edition. Academic Press, San Diego.
Scheinkman, J., and L. Weis (1986): “Borrowing Constraints and Aggregate Economic Activity,” Econometrica, 54(1), 23–45.
Stokey, N. L. (2008): The Economics of Inaction: Stochastic Control Models with Fixed Costs. Princeton University Press.
van den Berg, G. J. (1990): “Nonstationarity in Job Search Theory,” Review of Economic Studies, 57, 255–277.
Wälde, K. (1999): “Optimal Saving under Poisson Uncertainty,” Journal of Economic Theory, 87, 194–217.
Wälde, K. (2012): Applied Intertemporal Optimization. Know Thyself - Academic Publishers, available at www.waelde.com/KTAP.
Yuan, C., and X. Mao (2003): “Asymptotic stability in distribution of stochastic differential equations with Markovian switching,” Stochastic Processes and Their Applications, 103, 277–291.
