“Where one door shuts, another opens.”
– Miguel de Cervantes, Don Quixote.
Abstract
We consider the problem of finding the optimal time to switch between two measurable cash-flow streams. A complete characterization of the set of solutions is obtained in terms of adjoint variables which measure the available gain from deviations. An iterative procedure for the computation of the adjoint variables is provided. The results are generalized to multiple switching times, multiple cash-flow streams, switching costs, as well as switch-triggered cash-flow streams that arise in equipment-replacement problems.
1 Introduction
Conceptually, the question of optimally switching between several (deterministic) cash-flow streams—considered in this paper—can be viewed as a scheduling problem for substitutable machines (processors) with time-varying yield that can be allocated to the task of creating time-discounted value in a single job. We characterize the solutions to the simple switching problem and generalize the approach to include switching costs, switch-triggered obligations, as well as any finite number of allowable switches and available cash-flow streams.
1.1 Literature
From a scheduling viewpoint, the cash-flow switching problem fits within the deterministic production-planning and scheduling framework laid out by Salveson (1952). Subsequent contributions to the theory of scheduling, beginning with Johnson (1954), focus on minimizing the time needed to complete heterogeneous jobs on a discrete number of machines. Because of their combinatorial nature, these types of problems are often NP-hard and require an algorithmic approach (Veinott and Wagner 1962; Gawiejnowicz 2008). For the cash-flow switching problem with a single switching time, we pursue a semi-analytical approach. For multiple switching times, the simple solution can be used repeatedly, effectively amounting to a solution via dynamic programming (Bellman 1957).
In a stochastic setting, when the yield of the processors is random (but usually stationary), the cash-flow switching problem is also known as the multi-armed bandit problem (Gittins et al. 2011).Footnote 1 For uncorrelated processors, the optimal switching policy is obtained by comparing processor-specific “Gittins indices,” corresponding to an expected retirement reward for a given machine, and then always choosing the machine with the highest available reward (Gittins and Jones 1974). Our problem amounts to a “time-varying deterministic multi-armed bandit problem,” thus focusing on pure exploitation without experimental exploration of the various processors. It is optimal to switch to another cash-flow stream as soon as the benefit of delaying the switch, as measured by an appropriate “adjoint variable,” drops to zero. Switching costs, which in our problem tend to delay actions, have also been studied for the stochastic multi-armed bandit problem; see Jun (2004) for a survey.
Finally, our approach to the cash-flow switching problem is related to global optimization, as the optimal switching time needs to globally maximize a differentiable function on the interval of consideration. Early semi-heuristic methods included golden-section search (Kiefer 1953, 1957) among others; see, e.g., Wilde (1964) for a textbook survey. Derivative-free approaches based on a Lipschitz constant have been advanced by Shubert (1972), Breiman and Cutler (1993), Sergeyev (1995), and more recently by Lera and Sergeyev (2013). For an overview of random sampling methods see, e.g., Törn and Žilinskas (1989, Ch. 4). Another interesting approach, at least from a computational viewpoint, is the (approximate) global maximization of polynomials formulated in terms of semi-definite programming, pioneered by Nesterov (2000), Lasserre (2001), and Parrilo (2003); see also Li et al. (2012). To solve the cash-flow switching problem we use the semi-analytic approach recently proposed by Weber (2017) which requires the knowledge of a derivative that is naturally available in our problem as the difference between the two available cash-flow streams. The latter is assumed merely essentially bounded (as an element of \({\mathcal L}^\infty \)) instead of the absolute integrability (as an element of \({\mathcal L}^1\)) stipulated originally.Footnote 2
Cash-flow switching in its various forms relates to many well-known problems in capital budgeting: equipment replacement (Alchian 1958) as well as the evaluation of project returns, for example. Regarding the latter Arrow and Levhari (1969), as well as Wright (1959), earlier, and Flemming and Wright (1971), later, with more generality, considered the uniqueness of the internal rate of return subject to optimal stopping of a cash-flow stream, thus assuming a “free disposal of investment projects” (Arrow 1985, p. 373). In other words, given a solution to (a special case of) the cash-flow switching problem (switching to a zero cash-flow stream), these authors show that the resulting maximized present value is nonincreasing in the constant interest rate, so that the internal rate of return should be unique.Footnote 3
1.2 Outline
The remainder of this paper is organized as follows. In Sect. 2, we introduce the basic cash-flow switching problem and formulate it as a global optimization problem, the solutions to which can be characterized by means of a suitable adjoint variable. Section 3 considers the effects of switching cost, while Sect. 4 allows for a generalization of the basic problem to multiple switches and multiple cash-flow streams, respectively. Section 5 concludes.
2 Simple switching
2.1 Preliminaries
Let \(T>0\) be a fixed time horizon. An investment project consists of a deterministic (measurable) cash-flow stream x(t), defined for all \(t\in [0,T]\). Given the (measurable) schedule \(r:[0,T]\rightarrow {\mathbb R}\), which for any \(t\in [0,T]\) specifies an external rate r(t), representing a cost of capital or a hurdle rate,Footnote 4 let
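\[R(t) \triangleq \int _0^t r(s)\,ds\]
denote the cumulative interest up to time t. The present value of the cash-flow stream x is
\[\text{ PV }(x) \triangleq \int _0^T \beta (t)\,x(t)\,dt,\]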
where \(\beta (t)\triangleq \exp [-R(t)]>0\) denotes the discount factor, i.e., the present value of a concentrated unit payment that is set to arrive at time \(t\in [0,T]\).
Example 1
For a constant discount rate \(r(t)\equiv \hat{r}\in {\mathbb R}\), we have \(R(t) = \hat{r}t\) for all \(t\in [0,T]\), and the present value of the cash-flow stream x becomes \(\text{ PV }(x) = \int _0^T e^{-\hat{r}t} x(t)\,dt\).
No assumption is made about the sign of the interest rate. Time-varying interest rates characterize the term structure of financial instruments (e.g., in the form of yield curves) and help describe macroeconomic performance (e.g., relative to the real interest rate); see, e.g., Gürkaynak and Wright (2012) for an overview. They also describe the (deterministic) dynamics of the capital cost for any given firm or individual decision-maker, who are the subjects of our study.
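The definitions above translate directly into a numerical quadrature. The following is a minimal sketch, assuming a uniform time grid and the trapezoidal rule; the helper names `cumulative_trapezoid` and `present_value` and the grid size are our own choices for illustration, not part of the original development. It checks the constant-rate formula of Example 1 for a unit cash flow.

```python
import numpy as np

def cumulative_trapezoid(y, t):
    """Running trapezoidal integral of the samples y over the grid t."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(t)
    return np.concatenate(([0.0], np.cumsum(steps)))

def present_value(x, r, T, n=20001):
    """Approximate PV(x) = int_0^T beta(t) x(t) dt with beta = exp(-R)."""
    t = np.linspace(0.0, T, n)
    R = cumulative_trapezoid(r(t), t)   # cumulative interest R(t)
    beta = np.exp(-R)                   # discount factor beta(t)
    return cumulative_trapezoid(beta * x(t), t)[-1]

# Unit cash flow, constant rate of 5%, horizon T = 10 (illustrative values).
r_hat, T = 0.05, 10.0
pv = present_value(lambda t: np.ones_like(t),
                   lambda t: np.full_like(t, r_hat), T)
closed_form = (1.0 - np.exp(-r_hat * T)) / r_hat
print(pv, closed_form)
```

Because the rate enters only through R, the same function accepts negative or time-varying schedules without modification.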
2.2 Problem formulation
Consider the following basic switching problem: Given two cash-flow streams, \(x^1\) (“project 1”) and \(x^2\) (“project 2”), both defined on the interval [0, T], at what time \(\tau ^*\) is it optimal to switch from project 1 to project 2? To formulate this question as an optimization problem, note first that when switching at time \(\tau \in [0,T]\), the decision-maker obtains the switched cash-flow streamFootnote 5
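\[x_\tau (t) \triangleq \begin{cases} x^1(t), & \text{if } t\in [0,\tau ),\\ x^2(t), & \text{if } t\in [\tau ,T]. \end{cases}\]
The original cash-flow streams obtain as special cases for \(\tau \in \{0,T\}\): \(x_0 = x^2\) and \(x_T = x^1\) (the value at the single point \(t=\tau \) does not affect present values). The decision-maker’s payoff from the switched cash-flow stream corresponds to its present value,
\[V(\tau ) \triangleq \text{ PV }(x_\tau ) = \int _0^\tau \beta (t)\,x^1(t)\,dt + \int _\tau ^T \beta (t)\,x^2(t)\,dt.\]
The objective function V, which represents the present value of the switched cash-flow stream, is differentiable, with derivative
\[\dot{V}(\tau ) = \beta (\tau )\left( x^1(\tau ) - x^2(\tau )\right) .\]
The decision-maker’s cash-flow switching problem amounts to maximizing the objective function,
\[\max _{\tau \in [0,T]} V(\tau ). \qquad \text{(P)}\]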
By virtue of the Weierstrass theorem (see, e.g., Bertsekas 1995, p. 540) a solution \(\tau ^*\) to (P) exists. Let \({\mathcal P}\subseteq [0,T]\) denote the set of all solutions to the switching problem. For an interior \(\tau ^*\in (0,T)\), Fermat’s lemma implies that \(\dot{V}(\tau ^*) = 0\), leading to the necessary optimality condition
\[x^1(\tau ^*) = x^2(\tau ^*), \qquad (4)\]
provided that \(\dot{V}\) is continuous in a neighborhood of \(\tau ^*\). We note that Eq. (4) need not be satisfied at any point of discontinuity of \(x^1\) or \(x^2\). It also does not need to hold at the boundary of the interval [0, T]. Moreover, the condition is at most necessary, i.e., there may be many points \(\tau \in [0,T]\) for which \(x^1(\tau )=x^2(\tau )\) but which do not solve (P), including local minima and maxima, as well as saddle points and points of discontinuity. The usual approach to obtain a better characterization of the optimum, namely to use a second-order necessary optimality condition, requires that the cash-flow streams are differentiable. Even if satisfied and valid, such an additional condition leads neither to a truly necessary nor a sufficient optimality condition. Our approach is different in that we construct a necessary and sufficient optimality condition for any solution to (P), without any further regularity assumptions, thus arriving at a simple characterization and representation of the solution set \(\mathcal P\).
Remark 1
We note two important special cases of the switching problem. For \(x^2=0\), one obtains the problem of optimally stopping project 1, and for \(x^1 = 0\) the problem of optimally starting project 2.
Remark 2
In the basic version (P) of the switching problem, the implicit assumption is that both cash-flow streams are synchronized, in the sense that both start at \(t=0\). By switching at time \(\tau \), the decision-maker changes his exposure from one stream of obligations (characterized by the net cash inflow \(x^1\) on \([0,\tau ]\)) to another stream of obligations (characterized by the net cash inflow \(x^2\) on \([\tau ,T]\)). As shown in Sect. 3.3, the solution of this problem can be extended to the situation where the switch from one cash-flow stream triggers the start of another cash-flow stream, such as for equipment-replacement decisions.
Remark 3
If the decision-maker’s (indirect) utility u(x(t)) for a net cash inflow x(t) at time t is evaluated by a (measurable) utility function \(u:{\mathbb R}\rightarrow {\mathbb R}\), then it is enough to replace x(t) by \(\hat{x}(t)\triangleq u(x(t))\), for all \(t\in [0,T]\). That is, any cash-flow stream can also be interpreted as a stream of utility, as experienced—for example—by a representative consumer in a model of (finite-horizon) economic growth.
Example 2
For \(t\ge 0\), consider the cash-flow streams \(x^1(t)\triangleq 1-c\exp (-at)\cos (\omega t)\) and \(x^2(t)\triangleq 1-\exp (-bt)\), where \(a,b>0\) are damping constants and \(\omega \ge 0\) describes an oscillation frequency of the default option before the switch (e.g., due to seasonalities). As in Example 1, let the discount rate be constant, so \(r(t)\equiv \hat{r}\in {\mathbb R}\). While in the very long run, i.e., for \(t\rightarrow \infty \), both cash-flow streams behave like a unit perpetuity, in the short run they may differ significantly. In terms of their present values, over a sufficiently long time horizon T, it is \(\text{ PV }(x^2) > \text{ PV }(x^1)\) if (and only if) \(b>a + \omega ^2/(a+\hat{r})\).Footnote 6 Figure 1 shows the cash-flow streams \(x^1\) and \(x^2\) for the parameter values \((a,b,c) = (1,2,5)\) and \(\omega =2\pi \). For the horizon \(T=3\) and the constant interest rate \(\hat{r}=20\%\), Fig. 2 depicts the corresponding objective function V.
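To make Example 2 concrete, the following sketch evaluates the objective function on a grid, using the parameter values quoted in the example. It is an illustration under our own assumptions (a uniform grid of 30001 points and the trapezoidal rule), not the computation behind Figs. 1 and 2. It also counts the sign changes of \(x^1-x^2\), i.e., the candidate points singled out by the necessary condition in Eq. (4), of which at most a few can be global maximizers.

```python
import numpy as np

# Parameter values quoted in Example 2.
a, b, c, omega = 1.0, 2.0, 5.0, 2.0 * np.pi
r_hat, T = 0.20, 3.0

def x1(t):
    return 1.0 - c * np.exp(-a * t) * np.cos(omega * t)

def x2(t):
    return 1.0 - np.exp(-b * t)

def cumulative_trapezoid(y, t):
    """Running trapezoidal integral of the samples y over the grid t."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(t)
    return np.concatenate(([0.0], np.cumsum(steps)))

t = np.linspace(0.0, T, 30001)
beta = np.exp(-r_hat * t)
C1 = cumulative_trapezoid(beta * x1(t), t)   # int_0^tau beta x1
C2 = cumulative_trapezoid(beta * x2(t), t)   # int_0^tau beta x2
V = C1 + (C2[-1] - C2)                       # V(tau) on the grid

# Grid points right after a sign change of x1 - x2 (candidates from Eq. (4)).
d = x1(t) - x2(t)
crossings = t[1:][np.sign(d[1:]) != np.sign(d[:-1])]

tau_grid = t[np.argmax(V)]                   # first grid maximizer of V
print(f"PV(x2) = V(0) = {V[0]:.4f}, PV(x1) = V(T) = {V[-1]:.4f}")
print(f"{crossings.size} sign changes, grid maximizer at tau = {tau_grid:.3f}")
```

A brute-force grid search of this kind only approximates the solution set; the adjoint-variable construction that follows characterizes it exactly.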
2.3 Adjoint variable
The solution of the switching problem (P) is achieved by means of an adjoint variable, the cumulative (right-sided) gain,
where the (right-sided) gain inflow f, for \((t,\hat{x},\hat{y})\in [0,T]\times {\mathbb R}\times {\mathbb R}\), is defined as
Starting at an initial value of zero at \(s=0\), the adjoint variable would measure any nonnegative gain available by continuing from \(\tau = T-s\) until the interval horizon T. By reversing the time-scale, the right-hand side of Eq. (5) becomes
A solution y to this integral equation is necessarily differentiable and such that \(y(0)=0\), corresponding to the logic that the right-sided cash-flow gain must vanish at the right interval boundary. As a result, the right-sided adjoint variable y satisfies the following initial-value problem:
The gain inflow f is a discontinuous function, and in particular does not satisfy the Carathéodory conditions (Filippov 1988, p. 3), so that the existence of a solution to Eq. (8) deserves special attention. The latter conditions allow for a measurable dependence on time, so that it is not the possible discontinuities of \(x^1-x^2\), but rather the structural properties of the gain inflow itself which require care in the construction of a solution.Footnote 7 Before turning our attention to the question of existence and uniqueness, it is useful to establish a lower bound for an adjoint variable y, provided it exists.
Lemma 1
The (right-sided) adjoint variable y, as solution to the initial-value problem (8), is bounded from below: \(y(s)\ge \max \{0,V(T) - V(T-s)\}/\beta (T-s)\), for all \(s\in [0,T]\).
The adjoint variable captures the value of the option to delay switching. It is therefore necessarily nonnegative and cannot be smaller than the value of waiting until the end of the interval horizon. Its interpretation as the cumulative right-sided gain is the key to finding the smallest solution to the switching problem (P), as detailed in Sect. 2.4 below.
As in the standard approach (see, e.g., Coddington and Levinson 1955), the idea of proving the existence of a solution y to Eq. (8) is to construct an approximating sequence \((y_k)_{k=0}^\infty \) by means of a successive Picard–Lindelöf iteration. However, while usually the convergence of this sequence and the uniqueness of its limit are established jointly using the Banach fixed-point theorem, in our setting this strategy proves unsuccessful because of the lack of a suitable Lipschitz constant for the gain inflow. We consider any “admissible” solution of Eq. (8) in the Sobolev space \({\mathcal W}^{1,\infty }([0,T])\) of differentiable functions with measurable and essentially bounded derivatives (in \({\mathcal L}^\infty ([0,T])\)). For this sufficiently large class of functions, it is possible to construct a Cauchy sequence which converges in the \(\Vert \cdot \Vert _{1,\infty }\)-norm, the latter being defined as
for any \(y\in {\mathcal W}^{1,\infty }([0,T])\). To determine the solution set of Eq. (8) (as a subset of \({\mathcal W}^{1,\infty }([0,T])\)), our approach is to introduce an outcome-equivalent transformation of the gain inflow and use it to define a successive-approximation procedure that converges to the unique solution of the initial-value problem. For this, note that by Lemma 1 a solution y to Eq. (8), if it exists, is necessarily nonnegative-valued. Introducing the modified gain inflow,
we obtain a measure that counts a net cash flow \(\hat{x}\) (at a given time) if it is positive or if the cumulative gain \(\hat{y}\) is positive. Specifically, provided the cumulative gain is nonnegative, the modified gain inflow equals the (regular) gain inflow, i.e., for any \((\hat{t},\hat{x},\hat{y})\in [0,T]\times {\mathbb R}\times {\mathbb R}\):
If we further introduce the time-reversed cash-flow difference \(\varphi (s)\triangleq x^1(T-s) - x^2(T-s)\) and the time-reversed interest rate \(\rho (s)\triangleq r(T-s)\), for \(s\in [0,T]\), then the initial-value problem in Eq. (8) is equivalent to
that is, it has the same set of solutions as Eq. (8). The space \({\mathcal W}^{1,\infty }([0,T])\) is a Banach space, i.e., a complete normed vector space, which means that any Cauchy sequence with elements in the vector space converges (in the \(\Vert \cdot \Vert _{1,\infty }\)-norm) to an element of the vector space. Based on the equivalent integral representation (7) of the initial-value problem (8), we introduce the operator \({\mathbf P}:{\mathcal W}^{1,\infty }([0,T])\rightarrow {\mathcal W}^{1,\infty }([0,T])\) mapping any admissible function y to a function \({\mathbf P}y\),
which (as can be verified) is also an element of \({\mathcal W}^{1,\infty }([0,T])\). The solution set of the initial-value problem (8′) is therefore the set of fixed points of \(\mathbf P\). The following result provides existence and uniqueness of a solution to the initial-value problem (8) and its equivalent representation (8′).
Proposition 1
There exists a unique solution to the initial-value problem (8).
As shown in the proof of Proposition 1, a repeated application of the operator \(\mathbf P\) to the initial seed \(\phi \), withFootnote 8
converges to the unique solution of Eq. (8). That is, when considering the sequence \(\sigma \triangleq (y_k)_{k=0}^\infty \), with the initial function \(y_0 = \phi \) and the Picard–Lindelöf iteration \(y_{k+1} = {\mathbf P}y_k\) for \(k\ge 0\), then \(y_k \rightarrow y\) as \(k\rightarrow \infty \), and y solves Eq. (8). In practice, the convergence of the sequence \(\sigma \) to the adjoint variable \(y = \lim _{k\rightarrow \infty } {\mathbf P}^k\phi \) is usually very efficient and takes place within a few iterations, as illustrated by the following example.
Example 3
In the setting of Example 2, let \(y_0 = \phi \). As illustrated in Fig. 3, the Picard–Lindelöf iterations converge rapidly: \(y = \lim _{k\rightarrow \infty } y_k = y_5 = {\mathbf P}^5\phi \). Note that successive iterates are “alternatingly nested,” in the sense that \(y_0\le y_2\le y_4\le y=y_5 \le y_3\le y_1\).Footnote 9
2.4 Optimal switching
For any solution \(\tau \) of the cash-flow switching problem (P), the right-sided adjoint variable y(s) in the inverted time scale \(s=T-\tau \) must vanish, i.e., necessarily \(y(T-\tau )=0\). This guarantees that no improvement is possible on the interval \([\tau ,T]\) relative to the present value \(V(\tau )\) obtained by switching at time \(\tau \). Conversely, when viewed from the left, the smallest point \(\tau ^*\in [0,T]\) which satisfies \(y(T-\tau ^*) = 0\) is the first for which all right-sided improvements vanish. It must therefore be globally optimal.
Proposition 2
Let \(s^*\triangleq \sup \{s\in [0,T]: y(s)=0\}\). The smallest solution of the cash-flow switching problem (P) is \(\tau ^* = T - s^*\).
The smallest solution of (P) is the earliest switching time \(\tau ^*\) for which no improvement of the objective can be obtained. By a symmetric consideration, one can determine the largest solution \(\tau ^{**}\) of (P) by measuring left-sided gains on the interval \([0,\tau ]\). For this, we introduce a (left-sided) (co-)adjoint variable z as solution to the initial-value problem
where the (left-sided) gain inflow g, for any \((t,\hat{x},\hat{z})\in [0,T]\times {\mathbb R}\times {\mathbb R}\), is defined as
Because Eqs. (8) and (12) have the same structure, Proposition 1 also guarantees the existence and uniqueness of z. By symmetric considerations, we obtain the latest optimal switching time as the largest feasible point where no left-sided improvement is available.
Proposition 3
The largest solution of (P) is \(\tau ^{**} = \sup \{t\in [0,T]:z(t)=0\}\).
As with Eq. (8′) for Eq. (8), there exists an equivalent formulation for the initial-value problem (12) for the computation of z,
with the modified gain inflow \(\Phi \) as specified in Eq. (10). To determine the left-sided gain z, we use again a Picard–Lindelöf approximation sequence \((z_k)_{k=0}^\infty \), where \(z_0 \triangleq \hat{\phi }\) and \(z_{k+1} = \hat{\mathbf P} z_k\) for \(k\ge 0\), with initial seed
The operator \(\hat{\mathbf P}:{\mathcal W}^{1,\infty }([0,T])\rightarrow {\mathcal W}^{1,\infty }([0,T])\) maps any admissible function z to an admissible function \(\hat{\mathbf P}z\), with
similar to the definition of the operator \(\mathbf P\) in Eq. (11). With this, the left-sided adjoint variable z is obtained by repeatedly applying \(\hat{\mathbf P}\) to \(\hat{\phi }\):
\[z = \lim _{k\rightarrow \infty } \hat{\mathbf P}^k\hat{\phi }.\]
The convergence usually obtains in a finite number of iterations, e.g., when both cash-flow streams are Lipschitz-continuous.
Remark 4
Propositions 2 and 3 together characterize the uniqueness of a solution to the cash-flow switching problem (P). Indeed, the solution is unique if and only if \(\tau ^* = \tau ^{**}\). In the case where \(\tau ^*\ne \tau ^{**}\), it is still \(V^* = V(\tau ^*)=V(\tau ^{**})\). The solution set is then the upper contour set of V relative to its globally optimal value \(V^*\) on [0, T]: \({\mathcal P} = \{\tau \in [\tau ^*,\tau ^{**}] : V(\tau )\ge V^*\}\).
The adjoint variables y and z describe the right-sided and left-sided gains available at any point in the interval [0, T] of available switching times. By introducing the combined (or two-sided) adjoint variable
\[\lambda (t) \triangleq \beta (t)\max \{y(T-t),z(t)\}, \qquad t\in [0,T],\]
one naturally obtains a necessary and sufficient condition for an optimal switching time.
Proposition 4
A point \(\hat{t}\in [0,T]\) is a solution to (P) if and only if
\[\lambda (\hat{t}) = 0.\]
Accordingly, the solution set is \({\mathcal P} = \{t\in [0,T]: \lambda (t)=0\}\).
The fact that \(\lambda (\tau )\) can be interpreted as the total gain available on the domain [0, T] relative to the point \(\tau \) implies that the present value \(V(\tau )\) plus \(\lambda (\tau )\) must always be equal to the optimal value of (P); see Fig. 4.
Corollary 1
For all \(\tau \in [0,T]: \lambda (\tau ) + V(\tau ) = V^*\).
Knowing the adjoint variable \(\lambda \) is therefore equivalent to knowing the objective function V and its largest value \(V^*\). Combining the last result with the initial conditions in Eqs. (8) and (12) yields an expression of the optimal value of (P) as a function of the adjoint variables evaluated at the interval horizon.
Corollary 2
\(y(T) = \lambda (0) = V^* - V(0)\) and \(\beta (T)z(T) = \lambda (T) = V^* - V(T)\).
The preceding result means that a one-sided adjoint variable (either y or z) is enough to determine the optimal value of the switching problem (P):
\[V^* = V(0) + y(T) = V(T) + \beta (T)\,z(T).\]
Each one-sided adjoint variable can also be considered a solution to a complete family of one-sided problems. For example, the right-sided gain y determines the optimal policy of whether to stop at \(\tau \) or continue searching for a better stopping time on the interval \([\tau ,T]\) because
\[\beta (\tau )\,y(T-\tau ) = \max _{t\in [\tau ,T]} V(t) - V(\tau )\]
for any \(\tau \in [0,T]\). Similarly, the left-sided gain z fully describes the optimal policy of whether to stop at \(\tau \) or earlier, since
\[\beta (\tau )\,z(\tau ) = \max _{t\in [0,\tau ]} V(t) - V(\tau )\]
for any \(\tau \in [0,T]\). As will become apparent in Sect. 4, the one-sided adjoint variables are intimately related to a dynamic-programming solution of the switching problem.
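The one-sided interpretation suggests a simple numerical cross-check. The sketch below assumes that, in present-value terms, the right-sided and left-sided gains at \(\tau \) equal the gaps between \(V(\tau )\) and the running maxima of V on \([\tau ,T]\) and \([0,\tau ]\), respectively, which is how we read Corollaries 1 and 2. It computes these gaps directly on a grid rather than by the Picard–Lindelöf iteration, and the objective V used here is hypothetical, chosen to have two global maximizers.

```python
import numpy as np

def adjoint_gains(V):
    """Present-value gains on a grid: right-sided, left-sided, two-sided."""
    right = np.maximum.accumulate(V[::-1])[::-1] - V   # max on [tau, T] minus V(tau)
    left = np.maximum.accumulate(V) - V                # max on [0, tau] minus V(tau)
    return right, left, np.maximum(right, left)

# Hypothetical objective on [0, 2] with global maximizers at 0.5 and 1.5.
t = np.linspace(0.0, 2.0, 2001)
V = -((t - 0.5) ** 2) * ((t - 1.5) ** 2)

right, left, lam = adjoint_gains(V)
V_star = V.max()
tol = 1e-12                               # numerical zero for the gains
solutions = t[lam <= tol]                 # zero set of the two-sided gain
tau_smallest = t[right <= tol].min()      # earliest time with no right-sided gain
tau_largest = t[left <= tol].max()        # latest time with no left-sided gain
print(solutions, tau_smallest, tau_largest)
```

In this example the zero set of the right-sided gain also contains the whole interval on which V decreases towards the horizon, so only its smallest element is informative; the two-sided gain is what isolates the solution set.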
Remark 5
Let \(x^1\) and \(x^2\) be defined on the general interval \([{\underline{t}},\bar{t}]\), where \({\underline{t}},\bar{t}\) are real numbers such that \({\underline{t}}<\bar{t}\). Consider the cash-flow switching problem
\[\max _{\tau \in [{\underline{t}},\bar{t}]} W(\tau ), \qquad \text{(P′)}\]
where
is a differentiable real-valued objective function in \({\mathcal W}^{1,\infty }([{\underline{t}},\bar{t}])\). While (P′) may appear more general than the basic switching problem discussed thus far, it can be reduced to (P) by maximizing \(V(\tau ) \triangleq W({\underline{t}}+\tau )\) on the interval [0, T] with \(T\triangleq \bar{t}-{\underline{t}}\), just as in (P). By a translation, any solution \(\tau ^*\) of (P) directly corresponds to a solution \(\hat{\tau }^*\) of (P′): \(\hat{\tau }^* = {\underline{t}}+\tau ^*\).
2.5 Comparative statics
We now examine how changes in the interest-rate schedule affect the solution to the switching problem. To this end, let us consider two different (measurable and real-valued) interest-rate schedules, r and \(\hat{r}\), defined on [0, T]. Let \(\tau ^*\) and \(\hat{\tau }^*\) be the corresponding (smallest) solutions to the cash-flow switching problem (P). An ordinal relationship in the interest-rate schedules implies an (inverse) ordinal relationship in the optimal switching times.
Proposition 5
If \(r\le \hat{r}\), then \(\tau ^*\ge \hat{\tau }^*\).
This result confirms the intuition that a decrease in patience (i.e., a higher discount rate) at the margin tends to speed up cash-flow switching.
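A small numerical experiment illustrates Proposition 5. The cash-flow difference below is hypothetical: it is positive on [0, 1), negative on [1, 2), and positive again on [2, 3], so the decision-maker must decide whether the late gain justifies sitting through the intermediate loss. With a constant rate \(\hat{r}\), a direct calculation shows that waiting until \(T=3\) pays off only if \(1.2\,e^{-\hat{r}}>1\), i.e., for \(\hat{r}<\ln 1.2\approx 0.18\). The function name and grid size are our own assumptions.

```python
import numpy as np

def smallest_switch_time(d, r_hat, T=3.0, n=3001):
    """First grid maximizer of V(tau) - V(0) = int_0^tau exp(-r_hat t) d(t) dt."""
    t = np.linspace(0.0, T, n)
    g = np.exp(-r_hat * t) * d(t)
    steps = 0.5 * (g[1:] + g[:-1]) * np.diff(t)
    gain = np.concatenate(([0.0], np.cumsum(steps)))
    return t[np.argmax(gain)]             # argmax returns the first maximizer

def d(t):
    """Hypothetical x1 - x2: +1 on [0, 1), -1 on [1, 2), +1.2 on [2, 3]."""
    return np.where(t < 1.0, 1.0, np.where(t < 2.0, -1.0, 1.2))

tau_patient = smallest_switch_time(d, r_hat=0.05)    # low discount rate
tau_impatient = smallest_switch_time(d, r_hat=0.50)  # high discount rate
print(tau_patient, tau_impatient)
```

The patient decision-maker holds out until the horizon, whereas the impatient one switches near \(t=1\) (up to the grid resolution), in line with the inverse ordering stated in Proposition 5.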
3 Switching cost
If there is a cost c of performing a switch, payable at the time of the switch, then the optimal solution depends on whether switching is mandatory or optional.
3.1 Mandatory switching
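If a switch from \(x^1\) to \(x^2\) is required at some point on the time interval [0, T], then the optimal cash-flow switching problem becomes
\[\max _{\tau \in [0,T]} V_c(\tau ), \qquad V_c(\tau ) \triangleq V(\tau ) - c\,\beta (\tau ). \qquad (\hbox {P}_c)\]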
A positive switching cost introduces a bias towards preserving the status quo, as delaying the transition increases the decision-maker’s present value. A negative switching cost c amounts to a subsidy, encouraging the switch. For zero switching cost one obtains \(V_0=V\), i.e., the same payoff function as in the original problem discussed in Sect. 2. Moreover, by (fictitiously) including the benefit of the implicit rent from not switching in the default cash-flow stream \(x^1\) for all times before the switching time, the problem (\(\hbox {P}_c\)) with nonzero switching cost can be reduced to the simple cash-flow switching problem (P). For this, we introduce \(({\mathbf P}_{x^1-x^2}) y \triangleq {\mathbf P}y\) and \((\hat{\mathbf P}_{x^1-x^2})z \triangleq \hat{\mathbf P}z\) for any admissible y and z, just as in Eqs. (11) and (14), with the only difference that the extended notation indicates the difference of the available cash-flow streams.
Proposition 6
Let \(c\in {\mathbb R}\). A point \(\tau _c^*\in [0,T]\) is a solution to the switching problem (\(\hbox {P}_c\)) if and only if \(\tau _c^*\in {\mathcal P}_c = \{\tau : \lambda _c(\tau )=0\}\), where \(\lambda _c = \beta \max \{y_c,z_c\}\) is the adjoint variable, and \(y_c,z_c\) are the unique solutions of \(y_c = ({\mathbf P}_{x^1+cr-x^2})y_c\) and \(z_c = (\hat{\mathbf P}_{x^1+cr-x^2})z_c\). The resulting optimal value is \(V_c^* = V(\tau _c^*)-c\beta (\tau _c^*)\).
To incorporate switching costs when switching is mandatory, it is therefore sufficient to consider the default cash-flow stream \(\hat{x}^1_c \triangleq x^1 + cr\) instead of \(x^1\). As noted earlier, a positive switching cost introduces inertia in the decision-maker’s willingness to move from \(\hat{x}^1_c\) to the alternative cash-flow stream \(x^2\).
Lemma 2
The (smallest) solution to (\(\hbox {P}_c\)) is nondecreasing in the switching cost c.
If switching becomes more expensive, then it is never optimal to switch earlier.
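The reduction behind Proposition 6 and the monotonicity in Lemma 2 can be checked numerically. The sketch below reuses the streams of Example 2 with a constant rate, takes the payoff with switching cost to be \(V(\tau )-c\beta (\tau )\) as in the optimal value stated in Proposition 6, and compares it with the payoff obtained from the modified default stream \(x^1+cr\). The list of cost levels and all function names are illustrative assumptions.

```python
import numpy as np

a, b, c_amp, omega = 1.0, 2.0, 5.0, 2.0 * np.pi   # Example 2 parameters
r_hat, T = 0.20, 3.0

def cumulative_trapezoid(y, t):
    """Running trapezoidal integral of the samples y over the grid t."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(t)
    return np.concatenate(([0.0], np.cumsum(steps)))

t = np.linspace(0.0, T, 30001)
beta = np.exp(-r_hat * t)
d = (1.0 - c_amp * np.exp(-a * t) * np.cos(omega * t)) - (1.0 - np.exp(-b * t))
gain = cumulative_trapezoid(beta * d, t)           # V(tau) - V(0)

def gain_with_cost(cost):
    """V_c(tau) - V_c(0), using V_c(tau) = V(tau) - cost * beta(tau)."""
    return gain - cost * (beta - 1.0)

def gain_via_rent(cost):
    """Same quantity from the modified default stream x1 + cost * r."""
    return cumulative_trapezoid(beta * (d + cost * r_hat), t)

costs = [-0.5, 0.0, 0.5, 1.0, 2.0]                 # negative cost = subsidy
taus = [t[np.argmax(gain_with_cost(k))] for k in costs]
print(list(zip(costs, taus)))
```

Both gain functions agree up to quadrature error, because \(\int _0^\tau \beta (t)\,c\,r(t)\,dt = c\,(1-\beta (\tau ))\), and the first grid maximizer does not move earlier as the cost increases.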
3.2 Optional switching
If switching is optional, then the decision-maker can opt to forgo the option of switching away from the default cash-flow stream. The optimal switching problem becomes
For any \(\tau \in [0,T)\), the decision-maker’s payoff is identical to the payoff in the problem (\(\hbox {P}_c\)) with mandatory switching. Any extra payoff from the additional flexibility in the relaxed problem (\(\hat{\hbox {P}}_c\)) must therefore come from saving the switching cost by holding out until the end of the horizon without switching.
Proposition 7
For \(c\in {\mathbb R}\), consider the optional switching problem (\(\hat{\hbox {P}}_c\)). Let \({\mathcal P}_c\) and \(V^*_c\) be as in Proposition 6. If \(V(T)>V_c^*\), then it is optimal to never switch. Otherwise, it is optimal to switch at a time \(\tau _c^*\in {\mathcal P}_c\). The resulting optimal value is \(\hat{V}_c^* = \max \{V_c^*,V(T)\}\).
The solution to the problem (\(\hat{\hbox {P}}_c\)) with flexibility reduces to the problem (\(\hbox {P}_c\)) without flexibility. As in Sect. 3.1, if the subsidies to switching increase (so c decreases), it is never optimal to increase the switching time (see Lemma 2).
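The decision rule of Proposition 7 amounts to a single comparison once the mandatory problem is solved. The following sketch applies it on a grid; the objective V, the rate, and the function name `optional_switch` are hypothetical and serve only to show the two possible outcomes.

```python
import numpy as np

def optional_switch(V, beta, cost):
    """Return (grid index of the switch or None, optimal value) per Proposition 7."""
    V_c = V - cost * beta                 # payoff when a switch is mandatory
    i = int(np.argmax(V_c))
    if V[-1] > V_c[i]:                    # V(T): stay on the default stream
        return None, float(V[-1])
    return i, float(V_c[i])

t = np.linspace(0.0, 1.0, 101)
beta = np.exp(-0.1 * t)                   # constant 10% rate
V = 1.0 - (t - 0.4) ** 2                  # hypothetical objective, V(T) = 0.64
free_switch = optional_switch(V, beta, cost=0.0)
costly_switch = optional_switch(V, beta, cost=0.5)
print(free_switch, costly_switch)
```

Without a cost the switch takes place at the interior maximizer of V; with the higher cost the best mandatory payoff falls below V(T), so the rule returns `None` and the default stream is kept throughout.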
3.3 Switch-triggered cash-flow streams
If a switch at time \(\tau \in [0,T]\) triggers a commitment that is associated with a cash-flow stream \(x^0\) (which can be of any duration), then the present value of this cash-flow stream is
The switch-triggered cash-flow stream may well extend beyond the consideration interval [0, T]. When triggered at time \(\tau \), only its (differentiable) time-0 value \(v(\tau )\) is important.Footnote 10 An example of practical interest is the equipment-replacement problem, where at the time of purchasing new equipment a new cash-flow stream starts, associated with the lifecycle of the replacement product. The corresponding cash-flow switching problem becomes
\[\max _{\tau \in [0,T]} \left\{ V(\tau ) + v(\tau )\right\} . \qquad (\hbox {P}_v)\]
As in the simpler problem (\(\hbox {P}_c\)) with constant switching cost, the strategy for solving (\(\hbox {P}_v\)) is to convert the present value into a cash-flow stream associated with the default option. By not exerting the option to switch at t the decision-maker earns the time-t rent \(\dot{v}(t)\).
Proposition 8
Assume that switching at time \(\tau \) triggers an obligation with present value \(v(\tau )\). A point \(\tau _v^*\in [0,T]\) is a solution to (\(\hbox {P}_v\)) if and only if \(\tau ^*_v\in {\mathcal P}_v = \{\tau : \lambda _v(\tau )=0\}\), where \(\lambda _v = \beta \max \{y_v,z_v\}\) is the adjoint variable, and \(y_v,z_v\) are the unique solutions of \(y_v = ({\mathbf P}_{\hat{x}^0+x^1-x^2})y_v\) and \(z_v = (\hat{\mathbf P}_{\hat{x}^0+x^1-x^2})z_v\), given the auxiliary cash-flow stream \(\hat{x}^0 \triangleq \dot{v}/\beta \). The resulting optimal value is \(V_v^* = V(\tau _v^*)+v(\tau _v^*)\).
The intuition of this result is that a positive switch-triggered present value would generally decrease when delayed, thus producing a negative gradient and therefore also a negative fictitious cash-flow stream to be added to the default cash-flow stream. Thus, delaying the switch amounts to an opportunity cost. Conversely, delaying a negative switch-triggered cash-flow stream produces a positive bias to stick with the default cash-flow stream.
4 Multi-switching
The results obtained so far can be generalized for settings where multiple switches are possible or multiple cash-flow streams are available.Footnote 11
4.1 Multiple switches
Let \(N\ge 1\) be the number of allowable switches between the cash-flow streams \(x^1\) and \(x^2\) on the time interval [0, T], beginning with the default cash-flow stream \(x^1\). Let
be the decision-maker’s value for a vector of switching times \(\tau = (\tau ^1,\ldots ,\tau ^N)\in {\mathcal T}_N\) with
The “N-multi-switching problem” is the original cash-flow switching problem (P) generalized to N switches:
To understand the general argument for a solution to (\(\hbox {P}_N\)), consider first the case with two possible switches. If at time t, there is only one switch left, then the present value (at time 0) of the switched cash-flow stream (from 2 to 1) on the interval [t, T] is the payoff of switching immediately at t, plus the value of the option to switch later,
Consider now the fictitious cash-flow stream \(\hat{x}^2\) such that \(\text{ PV }(\hat{x}^2_{[t,T]}) \equiv W_{21}(t)\), i.e.,
Then necessarily \(\hat{x}^2(t) \equiv - \dot{W}_{21}(t)/\beta (t)\), i.e.,
The only remaining problem is how to optimally switch from \(x^1\) to \(\hat{x}^2\), which is equivalent to the simple cash-flow switching problem (P) discussed in Sect. 2.
In the general case with \(N\ge 2\) switches, odd-numbered switches are from \(x^1\) to \(x^2\), and even-numbered switches are from \(x^2\) to \(x^1\). Let \(\xi ^k(t)\) be the (somewhat fictitious) cash-flow stream that when switched to at time t will have the same present value as the cash-flow stream which on the interval [t, T] includes a maximum of k switches. For this, we set \(\xi ^0\triangleq x^1\) if N is even, and \(\xi ^0 \triangleq x^2\) if N is odd. Furthermore, we define recursively
for all \(k\in \{1,\ldots ,N\}\), where the (right-sided) adjoint variable for k remaining switches is given by
The auxiliary cash-flow stream \(\xi ^k\) with k remaining switches incorporates the improvement of switching from the current cash-flow stream to the auxiliary cash-flow stream \(\xi ^{k-1}\) with \(k-1\) remaining switches. This implies a solution to the N-multi-switching problem.
Proposition 9
A solution \(\tau (N)=(\tau ^1(N),\ldots ,\tau ^N(N))\in {\mathcal T}_N\) to the cash-flow switching problem (\(\hbox {P}_N\)) with \(N\ge 1\) allowable switches between the cash-flow streams \(x^1\) and \(x^2\) is such that the i-th switching time is \(\tau ^i(N) = T - s^i(N)\), where \(s^i(N)\triangleq \sup \{s\in [\tau ^{i-1},T] : y^{N-i+1}(s)=0\}\), for all \(i\in \{1,\ldots ,N\}\), and \(\tau ^0\triangleq 0\).
Analogous to our discussion after Corollary 2, at any time \(t\in [0,T]\), the right-sided adjoint variable \(y^k(T-t)\) determines the optimal stopping policy for the k-th switching time on the remaining time interval [t, T]. Similarly, \(\xi ^k(t)\) defines the (fictitious) cash-flow stream with that optimal policy implemented on the interval [t, T]. The solution in Proposition 9 therefore implements Bellman’s principle of optimality via backward induction from the right interval end \(t=T\) in the form of dynamic programming (Bellman 1957, Ch. III.3).
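The backward induction described above can be illustrated in discrete time. The following is a minimal sketch under stated assumptions, not the continuous-time construction of Proposition 9: there are T periods with discount factors `beta[t]`, a switch at time t applies from period t onward, and "N allowable switches" is read as "at most N switches". The names `present_value` and `multi_switch` are illustrative. The array `W[k, c, t]` loosely plays the role of the present value of the auxiliary stream \(\xi ^k\) on [t, T]: it is the best continuation value when stream c is active and k switches remain.

```python
from itertools import combinations_with_replacement

import numpy as np


def present_value(taus, x1, x2, beta):
    """PV of the stream that starts on x1 and flips at every time in taus."""
    total = 0.0
    for t in range(len(beta)):
        flips = sum(1 for tau in taus if tau <= t)  # switches made up to period t
        total += beta[t] * (x1[t] if flips % 2 == 0 else x2[t])
    return total


def multi_switch(x1, x2, beta, N):
    """Backward induction for at most N switches, starting on x1."""
    T = len(beta)
    flow = (np.asarray(x1, dtype=float), np.asarray(x2, dtype=float))
    # W[k, c, t]: best PV of periods t..T-1 when stream c is active on
    # arrival at t and k switches remain; go[k, c, t] marks "switch at t".
    W = np.zeros((N + 1, 2, T + 1))
    go = np.zeros((N + 1, 2, T), dtype=bool)
    for t in range(T - 1, -1, -1):
        for k in range(N + 1):
            for c in (0, 1):
                W[k, c, t] = beta[t] * flow[c][t] + W[k, c, t + 1]
                if k > 0:
                    alt = beta[t] * flow[1 - c][t] + W[k - 1, 1 - c, t + 1]
                    if alt > W[k, c, t]:
                        W[k, c, t], go[k, c, t] = alt, True
    # Forward pass: read the switching times off the optimal path.
    taus, k, c = [], N, 0
    for t in range(T):
        if go[k, c, t]:
            taus.append(t)
            k, c = k - 1, 1 - c
    return float(W[N, 0, 0]), taus


# Illustrative data (hypothetical, not the streams of Example 2).
rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=8), rng.normal(size=8)
beta = 0.95 ** np.arange(8)
for N in (1, 2, 3):
    print(N, multi_switch(x1, x2, beta, N))
```

For small instances the result can be compared with a brute-force search over all nondecreasing vectors of switching times, for example via `combinations_with_replacement(range(T + 1), N)`.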
Example 4
Given the cash-flow streams \(x^1\) and \(x^2\) described in Example 2, consider the N-multi-switching problem (\(\hbox {P}_N\)) for \(N\in \{1,\ldots ,7\}\). Table 1 shows the components of the optimal timing vector \(\tau (N)\in {\mathcal T}_N\). The solutions for \(N\in \{2,\ldots ,7\}\) suggest a monotonicity property in the sense that optimal switches remain optimal when more switches are available. However, a counterexample to this heuristic is easily obtained by comparing the solutions for \(N=1\) and \(N=2\), as indeed \(\tau ^1(1)\notin \{\tau ^1(2),\tau ^2(2)\}\); see Fig. 5.
As can be gleaned from Table 1 (with \(V^*_0\triangleq \text{ PV }(x^1)\approx 2.1133\)), the optimal value \(V^*_N\) of an N-multi-switching problem (\(\hbox {P}_N\)) is not necessarily concave in the allowed number of switches \(N\ge 1\). On the other hand, it is straightforward to see that in general both \(V^*_{2n+2}\) and \(V^*_{2n+1}\) are nondecreasing and concave in n, because each increase of n provides two extra switches, adding a full cycle relative to the current cash-flow stream. The resulting extra cycles must, by optimality, be chosen in the order of nonincreasing payoff increments.
Remark 6
As N becomes large, the optimal value \(V_N^*\) of the N-multi-switching problem (\(\hbox {P}_N\)) converges to the upper bound
i.e., \(\lim _{N\rightarrow \infty } V^*_N = V^*_\infty \). This follows from the monotone convergence theorem (see, e.g., Rudin 1976, p. 55), since the sequence \((V^*_N)_{N=1}^\infty \) is nondecreasing and tightly bounded by \(V^*_\infty \).
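With unrestricted switching, the decision-maker can collect the larger of the two cash flows at almost every instant. Under that reading, and with the present value taken as the \(\beta \)-discounted integral used throughout, the upper bound and the resulting sandwich can be written as follows. This is a sketch of the presumable form of the bound, not a quotation of the displayed equation.

```latex
V^*_\infty \;=\; \int_0^T \beta(t)\,\max\{x^1(t),\,x^2(t)\}\,dt ,
\qquad
V^*_N \;\le\; V^*_{N+1} \;\le\; V^*_\infty \quad \text{for all } N\ge 1 .
```

If \(x^1-x^2\) changes sign only finitely many times, the bound is attained for a finite N, which is consistent with Example 5.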
Example 5
In the setting of Example 4, one obtains \(V^*_\infty = V^*_N \approx 3.2339\), for all \(N\ge 7\).
4.2 Multiple cash-flow streams
Consider the switching problem for \(M\ge 2\) cash-flow streams \(x^1,\ldots ,x^M\). As before, we assume that \(x^1\) is the initial default option. The decision-maker’s problem is to find the best cash-flow stream to switch to at the optimal time, which amounts to solving
Having the option to switch between multiple cash-flow streams means that the decision-maker can participate in the proceeds of a portfolio of simultaneous projects without being able to alter their timing and without being able to have stakes in more than one project at a time.
Proposition 10
The tuple \((j^*,\tau ^*)\) solves the cash-flow switching problem (\(\hbox {P}_M\)) if and only if
where \({\mathcal P}_M = \{\tau \in [0,T]:\lambda _{j^*}(\tau )=0\}\), with \(\lambda _j =\beta \max \{y_j,z_j\}\) and \(y_j,z_j\) uniquely determined by \(y_j = ({\mathbf P}_{x^1-x^j})y_j\) and \(z_j = (\hat{\mathbf P}_{x^1-x^j})z_j\), for \(j\in \{2,\ldots ,M\}\), respectively.
The earliest optimal jump should be to the cash-flow stream which promises the highest payoff at a time when no strict improvement from waiting can be achieved.
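The selection rule can be sketched in discrete time. This is a minimal illustration under stated assumptions rather than the construction in Proposition 10: periods are indexed 0..T-1, `beta[t]` is the discount factor, a switch at time tau replaces \(x^1\) by \(x^j\) from period tau onward, and the right-sided gain is computed in present-value terms. The function names are illustrative.

```python
import numpy as np


def right_sided_gain(d):
    """G[t] = max over tau >= t of sum(d[t:tau]), with G[T] = 0."""
    G = np.zeros(len(d) + 1)
    for t in range(len(d) - 1, -1, -1):
        G[t] = max(0.0, d[t] + G[t + 1])
    return G


def best_stream_and_time(x1, others, beta):
    """Return (stream label j*, switching time tau*, present value)."""
    x1 = np.asarray(x1, dtype=float)
    beta = np.asarray(beta, dtype=float)
    best = None
    for j, xj in enumerate(others, start=2):  # streams are labelled 2..M
        xj = np.asarray(xj, dtype=float)
        d = beta * (x1 - xj)  # discounted gain from waiting one more period
        # Earliest time at which waiting offers no strict improvement.
        tau = int(np.argmax(right_sided_gain(d) == 0.0))
        value = float(np.sum(beta[:tau] * x1[:tau]) + np.sum(beta[tau:] * xj[tau:]))
        if best is None or value > best[2]:
            best = (j, tau, value)
    return best


# Illustrative data (hypothetical).
rng = np.random.default_rng(2)
T = 10
beta = 0.96 ** np.arange(T)
x1 = rng.normal(size=T)
others = [rng.normal(size=T) for _ in range(3)]
print(best_stream_and_time(x1, others, beta))
```

If never switching is optimal for every alternative, all candidates tie at the present value of \(x^1\) and the sketch simply reports the first label with tau equal to T.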
5 Conclusion
The problem of switching between cash-flow streams can be reinterpreted as finding the optimal choice between a finite number of intertemporal streams of expected utility by a rational economic agent who can change activities, possibly at a cost (see footnote 12). The key insight from the analysis is that the optimal switching times, and in fact the entire switching policy up to the decision horizon, are characterized by an adjoint variable that can be precomputed as the unique solution of an initial-value problem. The adjoint variable measures the one-sided gain that is available in the future. Applying this logic backward leads to a natural dynamic-programming solution of the switching problem. As in Weber (2014), all results can be transposed into a discrete-time setting.
In terms of future research, it will be interesting to study the combinatorial extension of the cash-flow switching problem for finitely many cash-flow streams with a given maximum number of switching times. Moreover, one may consider a robust version of the problem when its primitives, such as the interval horizon, the cash-flow streams, or the interest-rate process, are only imperfectly known. Similarly, an investigation of the comparative statics may reveal that structured perturbations of the problem primitives—valid for a base case—may, under certain conditions, lead to monotone changes of the optimal solutions and resulting values of the objective function.
Notes
Splitting hairs somewhat, this assertion requires strict monotonicity. Their result also follows from the more general comparative-statics methods developed by Quah and Strulovici (2009). For an unrelated approach that yields a unique internal rate of return for any cash-flow stream (without the need of truncation), see Weber (2014).
Poterba and Summers (1995) note that, to evaluate their investment projects, firms often use hurdle rates which differ significantly from their actual costs of capital.
The term in square brackets serves to establish a homotopic mapping from \(x^2\) to \(x^1\) as \(\tau \) varies from 0 to 1; see, e.g., Zangwill and Garcia (1984). It can be omitted without changing any of the following results, provided one of the two strict inequalities on the right-hand side is replaced by its weak equivalent. More generally, the value of all the cash-flow streams can be changed on a set of (Lebesgue-)measure zero without affecting any of the present values.
For any given horizon \(T>0\), the present values are \(\text{ PV }(x^2) =\left( 1-e^{-\hat{r}T}\right) /\hat{r} - \left( 1-e^{-\left( \hat{r}+b\right) T}\right) /\left( \hat{r}+b\right) \) and \(\text{ PV }(x^1) = \left( 1-e^{-\hat{r}T}\right) /\hat{r} - c \left[ \hat{r} + a + \left( \omega \sin (\omega T) - (\hat{r}+a)\cos (\omega T)\right) e^{-(\hat{r}+a)T}\right] /\left( \left( \hat{r}+a\right) ^2+\omega ^2\right) \), respectively. The present values coincide if (for example): \(a=b\), \(c = 1+\omega ^2/(\hat{r}+a)^2\), and \(\omega T=2\pi k\) for some \(k\in {\mathbb Z}\).
Even if \(x^1-x^2\) is continuous, the right-hand side of Eq. (8) is generally discontinuous.
In general, a more effective seed is \(\phi _0\triangleq \max \{0,\phi \}\), corresponding to the lower bound for y in Lemma 1.
See Lemma 3 in the appendix for the general formulation of this property.
If \(v(\tau )=-c\beta (\tau )\) for \(\tau \in [0,T]\), then (\(\hbox {P}_v\)) reduces to (\(\hbox {P}_c\)) with the constant switching cost c.
A combination of both generalizations is beyond the scope of this paper and left for future research.
At this point, we note in passing—without introducing the stochastic machinery—that the utility streams (see Remark 3) may in principle be risky, as long as the decision-maker remains risk-neutral, thus limiting attention to the expected utility, represented at each instant by the current cash flow.
The iterative solution of an ordinary differential equation in this manner originated with Picard (1893) and Lindelöf (1894). Its convergence is usually established using the Banach fixed-point theorem, which cannot be used here, as the (right-sided) gain inflow f in Eq. (6) is discontinuous and does not satisfy the Carathéodory conditions.
If there were another bound \(\hat{T}<T\), then whenever \(s_k=\hat{T}\), by virtue of \(\mathscr {A}(k)\) one would obtain \(s_{k+1}>\hat{T}\), i.e., a contradiction.
The precise difference is: \(V(\tau ) - \hat{V}_j(\tau ) = \int _0^{\min \{\tau ,T/j\}}\beta (\theta )x^1(\theta )\,d\theta + \int _{\min \{\tau ,T/j\}}^{T/j} \beta (\theta )x^2(\theta )\,d\theta \), \(\tau \in [0,T]\).
References
Alchian AA (1958) Economic replacement policy, research memorandum RM-2153 (an abbreviated version of R-224). Rand Corporation, Santa Monica
Arrow KJ (1985) Production and capital (collected papers of Kenneth J. Arrow, Vol. 5). Belknap, Cambridge
Arrow KJ, Levhari D (1969) Uniqueness of the internal rate of return with variable life investment. Econ J 79(315):560–566
Bellman RE (1957) Dynamic programming. Princeton University Press, Princeton
Bertsekas DP (1995) Nonlinear programming. Athena Scientific, Belmont
Breiman L, Cutler A (1993) A deterministic algorithm for global optimization. Math Program 58(2):179–199
Cairoli R, Dalang RC (1995) Optimal switching between two random walks. Ann Probab 23(4):1982–2013
Coddington EA, Levinson N (1955) Theory of ordinary differential equations. McGraw-Hill, New York
Filippov AF (1988) Differential equations with discontinuous righthand sides. Kluwer, Dordrecht
Flemming JS, Wright JF (1971) Uniqueness of the internal rate of return: a generalisation. Econ J 81(322):256–263
Gawienjnowicz S (2008) Time-dependent scheduling. Springer, Berlin
Gittins JC, Glazebrook KD, Weber RR (2011) Multi-armed bandit allocation indices. Wiley, New York
Gittins JC, Jones DM (1974) A dynamic allocation index for the sequential design of experiments. In: Gani J, Sarkadi K, Vincze I (eds) Progress in statistics. North-Holland, Amsterdam, pp 241–266
Gürkaynak RS, Wright JH (2012) Macroeconomics and the term structure. J Econ Lit 50(2):331–367
Johnson SM (1954) Optimal two and three stage production schedules with setup times included. Nav Res Logist Q 1(1):61–68
Jun T (2004) A survey on the bandit problem with switching costs. Neth Econ Rev (De Econ) 152(4):513–541
Kiefer J (1953) Sequential minimax search for a maximum. Proc Am Math Soc 4(3):502–506
Kiefer J (1957) Optimum sequential search and approximation methods under minimum regularity assumptions. J Soc Ind Appl Math 5(3):105–136
Lasserre JB (2001) Global optimization with polynomials and the problem of moments. SIAM J Optim 11(3):796–817
Lera D, Sergeyev YD (2013) Acceleration of univariate global optimization algorithms working with Lipschitz functions and Lipschitz first derivatives. SIAM J Optim 23(1):508–529
Li Z, He S, Zhang Z (2012) Approximation methods for polynomial optimization: models, algorithms, and applications. Springer, New York
Lindelöf E (1894) Sur l’application de la méthode des approximations successives aux équations différentielles ordinaires du premier ordre. C R Hebd Séances l’Académie Sci 116:454–457
Mandelbaum A, Shepp LA, Vanderbei RJ (1990) Optimal switching between a pair of Brownian motions. Ann Probab 18(3):1010–1033
Mas-Colell A, Whinston MD, Green JR (1995) Microeconomic theory. Oxford University Press, New York
Milgrom P, Segal I (2002) Envelope theorems for arbitrary choice sets. Econometrica 70(2):583–601
Nesterov Y (2000) Squared functional systems and optimization problems. In: Frenk H, Roos K, Terlaky T, Zhang S (eds) High performance optimization. Kluwer, Dordrecht, pp 405–440
Parrilo PA (2003) Semidefinite programming relaxations for semialgebraic problems. Math Program Ser B 96(2):293–320
Picard E (1893) Sur l’application des méthodes d’approximations successives à l’étude de certaines équations différentielles ordinaires. J Math Pures Appl (4e série) 9:217–272
Poterba JM, Summers LH (1995) A CEO survey of US companies time horizons and hurdle rates. Sloan Manag Rev 37(1):43–53
Quah J-H, Strulovici B (2009) Comparative statics, informativeness, and the interval dominance order. Econometrica 77(6):1949–1992
Rudin W (1976) Principles of mathematical analysis, 3rd edn. McGraw-Hill, New York
Salveson ME (1952) On a quantitative method in production planning and scheduling. Econometrica 20(4):554–590
Sergeyev YD (1995) A one-dimensional deterministic global minimization algorithm. Comput Math Math Phys 35(5):553–562
Shubert BO (1972) A sequential method seeking the global maximum of a function. SIAM J Numer Anal 9(3):379–388
Törn A, Žilinskas A (1989) Global optimization, lecture notes in computer science, vol 350. Springer, New York
Veinott AF, Wagner HM (1962) Optimal capacity scheduling I/II. Oper Res 10(4):518–532, 533–546
Weber TA (2011) Optimal control theory with applications in economics. MIT Press, Cambridge
Weber TA (2014) On the (non-)equivalence of IRR and NPV. J Math Econ 52:25–39
Weber TA (2017) Global optimization on an interval. J Optim Theory Appl 182(2):684–705
Wilde DJ (1964) Optimum seeking methods. Prentice-Hall, Englewood Cliffs
Wright JF (1959) The marginal efficiency of capital. Econ J 69(276):813–816
Zangwill WI, Garcia CB (1984) Pathways to solutions, fixed points, and equilibria. Prentice-Hall, Englewood Cliffs
Appendix: Proofs
Proof of Lemma 1
Since \(f(t,\nu ,0) = \max \{0,\nu \}\ge 0\) for all \(\nu \in {\mathbb R}\), the right-hand side of the differential equation in Eq. (8) is nonnegative whenever \(y=0\). Together with the initial condition \(y(0)=0\), this implies—using the continuity of y and the intermediate value theorem—that the adjoint variable \(y(t)\ge 0\) for all \(t\in [0,T]\). In other words, it is not possible that y takes on negative values because when reaching the boundary where \(y=0\), it can only stay or grow (as in that case \(\dot{y}\ge 0\)), but not decrease. We now show that \(y(s)\ge \left( V(T) - V(T-s)\right) /\beta (T-s)\), which is equivalent to
For this, consider the initial-value problem
which for any measurable cash-flow streams \(x^1\) and \(x^2\) satisfies the Carathéodory conditions. By the Cauchy formula (see, e.g., Weber 2011, p. 25), the solution to Eq. (19) is of the form
Consider the difference \(\Delta \triangleq y - \nu \). Then \(\Delta (0)=0\) and, using the fact that \(y(s)\ge 0\), it is
for all \(s\in [0,T]\). Similar to our earlier argument that the adjoint variable y cannot become negative, we observe that \(\dot{\Delta }(s)\ge 0\). Indeed, at the boundary where \(\Delta (s)=0\), it is \(\dot{\Delta }(s) = \max \{0,x^2(T-s)-x^1(T-s)\}\ge 0\), so that the difference \(y(s)-\nu (s)\) must be nondecreasing on [0, T]. More generally, the Cauchy formula yields the solution to the initial-value problem for \(\Delta \):
which implies that \(y(s)\ge \nu (s)\) for all \(s\in [0,T]\), thus establishing the claim. \(\square \)
Proof of Proposition 1
Existence and uniqueness of a solution to the initial-value problem (8) are established separately. The result largely parallels but differs from Weber (2017, Thm. 1), as we allow here for solutions in the Sobolev space \({\mathcal W}^{1,\infty }([0,T])\).
(i) Existence. Consider a sequence of admissible functions, \(\sigma \triangleq (y_k)_{k=0}^\infty \subset {\mathcal W}^{1,\infty }([0,T])\), defined by the recursion
for all \(k\ge 0\), where \(\phi (s) = \int _0^s \varphi (\varsigma )\,d\varsigma = V(0) - V(T-s)\) (see footnote 13). Consider now the sequence of the largest possible horizons \(s_k\) such that consecutive elements of \(\sigma \) coincide, \(y_k(s)=y_{k-1}(s)\), for all \(s\in [0,s_k]\):
with the additional definition \(s_0\triangleq 0\). We now show the following statement:
for all \(k\ge 1\). For this, we first introduce \(\varphi _-(s)\triangleq \min \{0,\varphi (s)\}\) for all \(s\in [0,T]\), and then note that \(y_1 = {\mathbf P}y_0 = {\mathbf P}\phi \), with
so \(0\le s_1 = \inf \{s\in [0,T]:\phi (s)\le 0\}\). Since by definition \(\phi (0)=0\), the preceding infimum is nonnegative, and by Eq. (21) it describes \(s_1\in [0,T]\) as introduced in Eq. (20). By a contradiction argument, it is straightforward to see that \(s_1>0\). Indeed, if \(s_1=0\), then \(\phi (s) > 0\) for all \(s\in (0,T]\). Thus, by the continuity of \(\varphi \) there exists an \(\varepsilon _0\in (0,T]\) such that \(\varphi (s)>0\) for all \(s\in (0,\varepsilon _0)\). This implies \(\varphi _-(s)=0\) and by Eq. (21) therefore \(y_1(s)=y_0(s)\) on \([0,\varepsilon _0]\), whence by Eq. (20): \(s_1\ge \varepsilon _0>0\), as claimed. If \(s_1=T\), then \(\mathscr {A}(1)\) holds automatically. Consider now the interesting case where \(0<s_1<T\). By the definition of \(s_1\), there exists an \(\varepsilon _1\in (0,T-s_1)\) such that for all \(s\in (s_1,s_1+\varepsilon _1)\): \(\phi (s)<0=y_1(s)\), whence \({\mathbf 1}_{\{\phi (s)\le 0<y_1(s)\}}=0\). With this, the inequality in (21) yields
for all \(s\in [0,T]\). This means that \(y_1(s)=y_2(s)\) for all \(s\in [0,s_1+\varepsilon _1]\), so necessarily
Thus, the statement \(\mathscr {A}(1)\) is true. The following “alternating nestedness” of the sequence \(\sigma \) is useful in the remainder of the argument.
Lemma 3
(Weber 2017) The even and odd subsequences \((y_{2j})_{j=0}^\infty \) and \((y_{2j+1})_{j=0}^\infty \) of \(\sigma \) are both monotonic, and its elements are such that \(y_{2j}\le y_{2j+2} \le y_{2j+3} \le y_{2j+1}\), for all \(j\ge 0\).
By Eqs. (21) and (22) it is \(\phi = y_0\le y_2 \le y_1\). By virtue of Lemma 3, if \(y_k = y_{k+1}\) (i.e., \(s_{k+1}=T\)), then \(y_k = y_{k+n}\) (i.e., \(s_{k+n}=T\)) for all \(n\ge 1\). In our proof of \(\mathscr {A}(k)\) for \(k\ge 1\) we therefore consider the nontrivial case where \(s_k<T\).
The forward difference between two consecutive elements of \(\sigma \), starting with an even element \(y_{k}=y_{2j+2}\), is
for all \(s\in [0,T]\) and any integer \(j\ge 0\). By the definition of \(s_k\) in Eq. (20) this yields
Since \(y_k(s_k) = y_{k-1}(s_k)\), by the continuity of \(\varphi \) there exists an \(\varepsilon _k\in (0,T-s_k]\) such that \(y_{k}(s)>y_{k-1}(s)\) for all \(s\in (s_k,s_k+\varepsilon _k)\). But then \({\mathbf 1}_{\{y_{k}(\varsigma )\le 0<y_{k-1}(\varsigma )\}}=0\) on \((s_k,s_k+\varepsilon _k)\), which (by continuity) implies that \(y_{k+1}(s) = y_k(s)\) for all \(s\in [s_k,s_k+\varepsilon _k]\), whence (given that \(s_1>0\), as shown earlier):
Similarly, the forward difference between two consecutive elements of \(\sigma \), starting with an odd element \(y_k = y_{2j+1}\), is
for all \(s\in [0,T]\) and any integer \(j\ge 0\). As a result, using again the definition of \(s_k\):
The fact that \(y_k(s_k) = y_{k-1}(s_k)\) implies (by continuity) that there exists an \(\varepsilon _k\in (0,T-s_k]\) such that \(y_k(s)<y_{k-1}(s)\) and therefore also \({\mathbf 1}_{\{y_{k-1}(s)\le 0<y_{k}(s)\}}=0\), for all \(s\in (s_k,s_k+\varepsilon _k)\). Hence, \(y_{k+1}(s)=y_k(s)\) on \([s_k,s_k+\varepsilon _k]\), resulting in
Combining the monotonicity of \(s_k\) in (23) and (24), \((s_k)_{k=0}^\infty \) is an increasing sequence with upper bound T. As such it must converge (Rudin 1976, p. 55), and since T is the smallest upper bound (see footnote 14):
Taking into account the \(\Vert \cdot \Vert _{1,\infty }\)-norm in Eq. (9) yields
where
The first term in inequality (25) tends to zero as \(k\rightarrow \infty \). The second term is by definition
That this term becomes irrelevant as \(k\rightarrow \infty \) may be shown as follows. Consider a small perturbation that replaces the cash-flow streams \(x^i(t)\) by \(\hat{x}^i_j(t) \triangleq x^i(t)\,{\mathbf 1}_{\{t\in [T/j,T]\}}\) for \(i\in \{1,2\}\) and \(t\in [0,T]\), where \(j>1\) is a large integer. As \(j\rightarrow \infty \), the measure of the set \([0,T/j)\), where \(\hat{x}^i_j\) can differ from \(x^i\), goes to zero. For any given \(j>1\), there is a \(k(j)\) such that \([0,T-s_{2k(j)}]\subsetneq [0,T/j)\), which implies that for all \(k\ge k(j)\):
If we therefore use the cash-flow streams \(\hat{x}^i_j\), and consider the corresponding Picard–Lindelöf iterates \(\hat{y}_{j,k}\) of the adjoint variable \(\hat{y}_j\), computed analogously to the original iterates \(y_k\) for y, then
Thus, the j-th sequence \((\hat{y}_{j,k})_{k=0}^\infty \) must be a Cauchy sequence. By completeness of the Banach space \({\mathcal W}^{1,\infty }([0,T])\), there exists an admissible function \(\hat{y}_j\in {\mathcal W}^{1,\infty }([0,T])\) such that
The limit function \(\hat{y}_j\) solves the fixed-point problem (written for the perturbed problem):
But this means that \(\hat{y}_j\) solves the initial-value problem (8′) for the perturbed problem. Consider now the switched cash-flow stream \(\hat{x}_{\tau ,j}\) for the perturbed problem, analogous to \(x_\tau \) for the original problem in Sect. 2.2, with
Let \(\hat{V}_j(\tau )\triangleq \text{ PV }(\hat{x}_{\tau ,j})\). Then the difference in the objectives (see footnote 15),
as \(j\rightarrow \infty \), independent of \(\tau \in [0,T]\). By the continuity of V and \(\hat{V}_j\) we obtain that the sequence of solution sets \(\hat{\mathcal P}_j\) to the perturbed problem converges to the solution set \(\mathcal P\) of the cash-flow switching problem (P), in the sense that for any converging sequence \((\hat{\tau }_j)_{j=2}^\infty \) with \(\hat{\tau }_j \in \hat{\mathcal P}_j\) for all \(j>1\), there exists a \(\tau \in {\mathcal P}\) such that \(\lim _{j\rightarrow \infty }\hat{\tau }_j = \tau \), and, conversely, for any \(\tau \in {\mathcal P}\) there exists a converging sequence \((\hat{\tau }_j)_{j=2}^\infty \) with limit \(\tau \). This means that, as \(j\rightarrow \infty \), the k-th Picard–Lindelöf iterate \(\hat{y}_{j,k}\) of the right-sided improvement for the j-th perturbed problem must converge uniformly to the k-th iterate \(y_k\) of the right-sided improvement of the original problem. Moreover,
It is now possible to choose a \(j(k)>1\) such that each of the terms is less than \(1/k\) for all \(j\ge j(k)\) [the third term converges to zero in j by Eq. (26)]. But this means that by successive approximation of the problem with the j(k)-th perturbed problem, we obtain that
This in turn implies that by the completeness of \({\mathcal W}^{1,\infty }([0,T])\) there exists a limit function \(y\in {\mathcal W}^{1,\infty }([0,T])\) with \(\lim _{k\rightarrow \infty } \Vert y_k - y\Vert _{1,\infty }=0\). In particular, \(\Vert {\mathbf P}y - y\Vert _{1,\infty } = 0\), so
i.e., the limit function y solves the initial-value problem (8′).
(ii) Uniqueness. For any given solutions \(y^1\) and \(y^2\) of Eq. (8), let
denote the corresponding pointwise difference. By the initial condition in Eq. (8′) it is \(\rho (0) = 0\), and furthermore:
Thus, \(\dot{\rho }(s)=0\) whenever the values \(y^1(s)\) and \(y^2(s)\) are either both positive or both equal to 0. On the other hand, if \(y^1(s)>y^2(s)=0\), then \(\dot{\rho }(s) = \varphi _-(s)\le 0\); and if \(y^1(s)=0<y^2(s)\), then \(\dot{\rho }(s) = -\varphi _-(s)\ge 0\). Combining these insights yields
Together with the initial condition \(\rho (0)=0\), Eq. (27) implies
so \(y^1 = y^2\), which yields the claimed uniqueness.
The claims (i) and (ii) together imply that there exists a unique solution to the initial-value problem (8′), which by construction has the same solution set as the initial-value problem (8), thus concluding our proof. \(\square \)
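The successive-approximation argument can be illustrated on a grid. The sketch below is an assumption-laden illustration, not the operator \({\mathbf P}\) of the paper: it takes \(\varphi \) as a given function on the reversed time axis, assumes the inflow equals \(\varphi \) where the iterate is positive and \(\max \{0,\varphi \}\) elsewhere, and replaces integrals by left Riemann sums with step h. All function names are illustrative. Because the explicit step can overshoot zero, grid values may dip below zero by at most \(h\max |\varphi |\), whereas the continuous solution is nonnegative by Lemma 1.

```python
import numpy as np


def seed(phi, h):
    """y_0 on the grid: left Riemann sum of phi, so y_0(0) = 0."""
    return np.concatenate(([0.0], h * np.cumsum(phi[:-1])))


def picard_operator(y, phi, h):
    """Assumed operator: integrate phi where y > 0 and max(0, phi) elsewhere."""
    inflow = np.maximum(phi, 0.0) + np.minimum(phi, 0.0) * (y > 0)
    return np.concatenate(([0.0], h * np.cumsum(inflow[:-1])))


def picard_iterates(phi, h):
    """Iterate y_{k+1} = P y_k until two consecutive iterates coincide."""
    iterates = [seed(phi, h)]
    for _ in range(len(phi)):
        nxt = picard_operator(iterates[-1], phi, h)
        iterates.append(nxt)
        if np.array_equal(nxt, iterates[-2]):
            break
    return iterates


def euler_march(phi, h):
    """Direct forward solution of the same grid equation, for comparison."""
    y, acc = np.zeros(len(phi)), 0.0
    for i in range(len(phi) - 1):
        acc += phi[i] if y[i] > 0 else max(phi[i], 0.0)
        y[i + 1] = h * acc
    return y


def stable_prefix(a, b):
    """Number of leading grid points on which two iterates agree."""
    differs = np.flatnonzero(a != b)
    return len(a) if differs.size == 0 else int(differs[0])


# Illustrative inflow (hypothetical).
s = np.linspace(0.0, 4.0, 200)
phi = np.cos(2.5 * s) - 0.2
its = picard_iterates(phi, s[1] - s[0])
print(len(its) - 1, [stable_prefix(its[k], its[k - 1]) for k in range(1, len(its))])
```

On the grid, the stabilised prefix grows by at least one point per iteration, mirroring the increasing sequence \((s_k)\) in the proof, and the first four iterates satisfy the ordering of Lemma 3.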
Proof of Proposition 2
Note first that by definition \(y(0)=0\), and by virtue of Lemma 1 the adjoint variable y(s) is nonnegative for all \(s\in [0,T]\). Thus, the set \({\mathcal S}\triangleq \{s\in [0,T]: y(s)=0\}\) is nonempty (as \(0\in {\mathcal S}\)). Its supremum, \(s^* \triangleq \sup \,{\mathcal S}\), exists and lies in the interval [0, T]. We distinguish two cases, depending on whether \(\mathcal S\) is a singleton or not.
Case 1: \({\mathcal S}=\{0\}\). Provided that \(y(s)>0\) for all \(s\in (0,T]\), the Cauchy formula yields the terminal value of the adjoint variable, as for the initial-value problem (19) (in the proof of Lemma 1):
Thus, for any \(\tau \in [0,T)\), by setting \(s=T-\tau \), one obtains
Since \(s^*=0\), this implies that \(\tau ^*= T - s^* = T\).
Case 2: \({\mathcal S}\supsetneq \{0\}\). Suppose there exists \(\hat{s}\in (0,T]\) such that \(y(\hat{s})=0\). Thus, \(\hat{s}\in {\mathcal S}\) and \(s^*\ge \hat{s}>0\). Using \(\Delta = y - \nu \) as in the proof of Lemma 1, note that \(\beta (T-s)\Delta (s) = \int _{T-s}^T \beta (\theta ) \max \{0,x^2(\theta )-x^1(\theta )\}{\mathbf 1}_{\{y(T-\theta )=0\}}\,d\theta \) is nondecreasing in s on [0, T]. Now consider the value of the switching problem for switching times \(\tau \) restricted to the interval \([T-\hat{s},T]\),
Then by the monotonicity of \(\beta (T-s)\Delta (s)\), it is
Since by hypothesis \(y(\hat{s})=0\), we obtain
Taking into account Eq. (28), this yields
Using again the monotonicity of \(\beta (T-s)\Delta (s)\) and setting \(s^*\triangleq \sup \{s\in [0,T]: y(s)=0\}\), one therefore finds
and \(y(s)>0\) for all \(s\in (s^*,T]\). Thus, the cardinality of \(\hat{\mathcal S}\triangleq \{s\in [s^*,T]: y(s)=0\}\) is 1, \(\hat{\mathcal S} = \{s^*\}\). Just as in Case 1, one obtains that the maximum of V on the interval \([0,T-s^*]\) is achieved at the upper interval boundary, so
Combining Eqs. (29) and (30) the optimal switching time is therefore \(\tau ^*=T-s^*\), and
Moreover,
This completes the proof. \(\square \)
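A discrete-time analogue makes the role of the right-sided adjoint variable concrete. The following is a minimal sketch under stated assumptions: periods 0..T-1, discount factors `beta[t]`, a switch at tau applying from period tau onward, and the gain measured in present-value terms (so `G[t]` corresponds to \(\beta (t)\,y(T-t)\)). The function names are illustrative. The smallest optimal switching time is then the first t with zero gain, mirroring \(\tau ^* = T - s^*\).

```python
import numpy as np


def switch_values(x1, x2, beta):
    """V[tau]: PV of x1 on periods < tau plus x2 on periods >= tau, tau = 0..T."""
    x1, x2, beta = (np.asarray(a, dtype=float) for a in (x1, x2, beta))
    gain = beta * (x1 - x2)
    return float(np.sum(beta * x2)) + np.concatenate(([0.0], np.cumsum(gain)))


def right_sided_gain(x1, x2, beta):
    """G[t] = max over tau >= t of V[tau] - V[t], computed backward from G[T] = 0."""
    x1, x2, beta = (np.asarray(a, dtype=float) for a in (x1, x2, beta))
    T = len(beta)
    G = np.zeros(T + 1)
    for t in range(T - 1, -1, -1):
        # Delaying the switch by one period earns x1 instead of x2 in period t.
        G[t] = max(0.0, beta[t] * (x1[t] - x2[t]) + G[t + 1])
    return G


def smallest_optimal_switch(x1, x2, beta):
    """First time at which waiting offers no strict improvement."""
    G = right_sided_gain(x1, x2, beta)
    return int(np.argmax(G == 0.0))


# Illustrative data (hypothetical).
rng = np.random.default_rng(5)
T = 20
x1, x2 = rng.normal(size=T), rng.normal(size=T)
beta = np.exp(-0.05 * np.arange(T))
tau = smallest_optimal_switch(x1, x2, beta)
print(tau, switch_values(x1, x2, beta)[tau])
```

The recursion visits each period once, so the gain and the switching time are obtained in O(T) operations without evaluating V at every candidate time.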
Proof of Proposition 3
For any \(s\in [0,T]\), let \(W(s)\triangleq V(T-s)\). Then any solution to the cash-flow switching problem,
yields, via \(\tau = T-s\), a solution of (P). Moreover, by Proposition 2 the smallest solution \(s^*\) of (P′) is equal to T minus the largest solution \(\tau ^{**}\) of (P). Mirroring the objective function from V to W also mirrors the sign of the corresponding net inflow, \(\dot{W}(s) = -\dot{V}(T-s)\). Accordingly, instead of discounting from T to \(t=T-s\), it is necessary to compound from 0 to t, so that the left-sided cumulative cash-flow gain z necessarily satisfies the initial-value problem
The latter corresponds to the initial-value problem (8) with inverse cash-flow difference and r replaced by \(-r\). By Proposition 2 the smallest solution of (P′) is \(s^* = T - \sup \{t\in [0,T]:z(t) = 0\}\), so the largest solution of (P) becomes \(\tau ^{**} = T - s^* = \sup \{t\in [0,T]:z(t)=0\}\). Since Eq. (31) is equivalent to the initial-value problem (12), this concludes our proof. \(\square \)
Proof of Proposition 4
Consider the solution set \({\mathcal P}\) and the optimal value \(V^*\) of the cash-flow switching problem (P). We first establish necessity and then sufficiency of the optimality condition (15).
(i) Necessity: If \(\tau \in {\mathcal P}\), then by Remark 4 no improvement is possible on the interval \([\tau ,T]\), so \(y(T-\tau )=0\) necessarily. Similarly, no improvement is possible on the interval \([0,\tau ]\) which implies that \(z(\tau )=0\). Together with the definition of \(\lambda \), this establishes Eq. (15) as a necessary optimality condition for any element of the set \(\mathcal P\).
(ii) Sufficiency: Consider a switching time \(\tau \in [0,T]\) which satisfies \(\lambda (\tau )=0\). By Lemma 1 the adjoint variable y is nonnegative-valued, which, by symmetry, is also true for z. Hence, \(y(T-\tau )=z(\tau )=0\), so a positive gain over \(V(\tau )\) is attainable neither to the right (on \([\tau ,T]\)) nor to the left (on \([0,\tau ]\)), which implies that \(V(\tau )=V^*\), and, consequently, it must be \(\tau \in {\mathcal P}\).
Based on (i) and (ii), Eq. (15) characterizes any solution to (P), which implies the claimed representation of the solution set \(\mathcal P\), concluding the proof. \(\square \)
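The two-sided characterization can be checked on a small discrete example. This is a minimal sketch under stated assumptions: periods 0..T-1, a switch at tau applying from period tau onward, and both one-sided gains measured in present-value terms, so that their pointwise maximum plays the role of \(\lambda \). The function names are illustrative, and the data are chosen so that the optimum is not unique.

```python
import numpy as np


def one_sided_gains(x1, x2, beta):
    """Right- and left-sided gains over V[t] for t = 0..T (present value)."""
    d = np.asarray(beta, dtype=float) * (
        np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    )
    T = len(d)
    right, left = np.zeros(T + 1), np.zeros(T + 1)
    for t in range(T - 1, -1, -1):
        right[t] = max(0.0, d[t] + right[t + 1])  # best gain from switching later
    for t in range(T):
        left[t + 1] = max(0.0, left[t] - d[t])    # best gain from switching earlier
    return right, left


def solution_set(x1, x2, beta):
    """All switching times at which neither one-sided gain is positive."""
    right, left = one_sided_gains(x1, x2, beta)
    lam = np.maximum(right, left)
    return np.flatnonzero(lam == 0.0).tolist()


# No discounting and integer flows, so ties are exact.
x1 = [1.0, 0.0, 1.0, 0.0, 0.0]
x2 = [0.0, 1.0, 0.0, 1.0, 2.0]
print(solution_set(x1, x2, np.ones(5)))  # → [1, 3]
```

Here switching at time 1 and at time 3 yield the same maximal value, and both appear in the zero set, while the smallest and largest elements correspond to the solutions singled out by the right-sided and left-sided gains, respectively.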
Proof of Proposition 5
Assume that \(\hat{r}(t)\ge r(t)\) for all \(t\in [0,T]\). Consider further a homotopic mapping \(r_\mu = (1-\mu )r + \mu \hat{r}\), parametrized by \(\mu \in [0,1]\), which is such that \(r_0=r\) and \(r_1=\hat{r}\) (see also footnote 5). Let R and \(\hat{R}\) be the cumulative-interest functions, as in Eq. (1), associated with r and \(\hat{r}\), respectively. As a result, the parametrized discount factor becomes
and the corresponding parametrized switching problem is
where \(V_\mu (\tau )\triangleq V(\tau ;\beta _\mu )\) denotes the objective obtained when the discount factor \(\beta \) in Eq. (3) is replaced by \(\beta _\mu \), for all \(\tau \in [0,T]\). Since the (nonnegative) right-sided gain inflow f in Eq. (6) is nonincreasing in the current value r(t) of the interest rate, for any \(t\in [0,T]\), the right-sided adjoint variable \(y_\mu \) for the parametrized switching problem (\(\hbox {P}_\mu \)) is nonincreasing in \(\mu \):
Consider now the smallest solution \(\tau ^*_\mu \) of the parametrized cash-flow switching problem (\(\hbox {P}_\mu \)), which by Proposition 2 solves
Provided that \(\tau ^*_\mu \in (0,T)\), the envelope theorem (see, e.g., Mas-Colell et al. 1995; Milgrom and Segal 2002) yields
where \({\mathcal L}_\mu (s)\triangleq T-s + \ell y_\mu (s)\) is the corresponding Lagrangian and \(\ell \ge 0\) is the Lagrange multiplier for the binding constraint \(y_\mu (s) = 0\). Thus, \(\tau ^*_\mu \) is nonincreasing in \(\mu \), which in turn implies that \(\tau ^*\ge \hat{\tau }^*\) as claimed. \(\square \)
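The monotonicity in the interest rate can be illustrated with a discrete-time, current-value recursion. This is a minimal sketch under stated assumptions, not the envelope argument above: periods 0..T-1, one-period rates `r[t]` with \(1+r_t>0\) used to discount from t+1 back to t, and illustrative function names. Since the gain is nonnegative, a higher rate lowers every term of the recursion, so the zero set of the gain can only grow and its first element can only move earlier.

```python
import numpy as np


def current_value_gain(x1, x2, r):
    """y[t] = max(0, x1[t] - x2[t] + y[t+1] / (1 + r[t])), with y[T] = 0."""
    d = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    r = np.broadcast_to(np.asarray(r, dtype=float), d.shape)
    y = np.zeros(len(d) + 1)
    for t in range(len(d) - 1, -1, -1):
        y[t] = max(0.0, d[t] + y[t + 1] / (1.0 + r[t]))
    return y


def smallest_optimal_switch(x1, x2, r):
    """First time at which the right-sided gain vanishes."""
    return int(np.argmax(current_value_gain(x1, x2, r) == 0.0))


# Staying on x1 costs 1 now but pays 3 next period.
x1, x2 = [0.0, 3.0], [1.0, 0.0]
print(smallest_optimal_switch(x1, x2, 0.5))  # → 2 (never switch)
print(smallest_optimal_switch(x1, x2, 3.0))  # → 0 (switch immediately)
```

In the example, the delayed payoff of 3 is worth 2 at the low rate and 0.75 at the high rate, which is why the higher rate makes the immediate switch optimal.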
Proof of Proposition 6
Since by the fundamental theorem of calculus and by the definition of the discount factor \(\beta \) it is
the decision-maker’s objective function with mandatory switching can be written in the form
As a result, the solution to the cash-flow switching problem follows from Proposition 4 with the adjoint variable \(\lambda _c = \beta \max \{y_c,z_c\}\) instead of \(\lambda \), where the one-sided adjoint variables \(y_c\) and \(z_c\) are uniquely specified as solutions to the fixed-point problems
respectively. \(\square \)
Proof of Lemma 2
For any \(c\in {\mathbb R}\), let \({\mathcal S}_c\triangleq \{s\in [0,T]:y_c(s) = 0 \}\), generalizing the definition of \({\mathcal S}={\mathcal S}_0\) in the proof of Proposition 2. Consider now two different switching costs c and \(\hat{c}\) such that \(c<\hat{c}\). Then because of the (weak) monotonicity of gain inflow f (see Eq. (6)) in its second variable (i.e., \(\hat{x}\)), it is \(y_c\le y_{\hat{c}}\). This in turn implies that \({\mathcal S}_{\hat{c}}\subseteq {\mathcal S}_{c}\), so necessarily \(\sup \,{\mathcal S}_{\hat{c}}\le \sup \,{\mathcal S}_c\), and consequently also \(\tau _{\hat{c}}^*\ge \tau _c^*\), completing the proof. \(\square \)
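The effect of a switching cost can be sketched in discrete time. The following is a minimal illustration under stated assumptions: the objective with mandatory switching is taken to be \(V(\tau ) - c\,\beta (\tau )\), `beta` has T+1 nonincreasing entries (one per period plus one for a switch at T), and the function names are illustrative. Waiting one period defers the cost, which adds the nonnegative term \(c\,(\beta _t-\beta _{t+1})\) to the one-period gain for \(c\ge 0\); the gain is therefore nondecreasing in c, and the smallest optimal switching time cannot move earlier as c grows.

```python
import numpy as np


def gain_with_cost(x1, x2, beta, c):
    """Right-sided gain for V_c(tau) = V(tau) - c * beta[tau], tau = 0..T."""
    x1, x2, beta = (np.asarray(a, dtype=float) for a in (x1, x2, beta))
    T = len(x1)
    G = np.zeros(T + 1)
    for t in range(T - 1, -1, -1):
        # Waiting one period earns x1 instead of x2 and defers the cost c.
        step = beta[t] * (x1[t] - x2[t]) + c * (beta[t] - beta[t + 1])
        G[t] = max(0.0, step + G[t + 1])
    return G


def smallest_switch_with_cost(x1, x2, beta, c):
    """First time at which delaying the (mandatory) switch yields no gain."""
    return int(np.argmax(gain_with_cost(x1, x2, beta, c) == 0.0))


def value_with_cost(x1, x2, beta, c, tau):
    """Present value of switching at tau, net of the discounted cost."""
    x1, x2, beta = (np.asarray(a, dtype=float) for a in (x1, x2, beta))
    T = len(x1)
    pv = np.sum(beta[:tau] * x1[:tau]) + np.sum(beta[tau:T] * x2[tau:])
    return float(pv - c * beta[tau])


def optional_switch_value(x1, x2, beta, c):
    """Best of switching at the optimal time and never switching (no cost)."""
    x1, beta = np.asarray(x1, dtype=float), np.asarray(beta, dtype=float)
    tau = smallest_switch_with_cost(x1, x2, beta, c)
    never = float(np.sum(beta[: len(x1)] * x1))
    return max(value_with_cost(x1, x2, beta, c, tau), never)


# Illustrative data (hypothetical).
rng = np.random.default_rng(4)
T = 20
x1, x2 = rng.normal(size=T), rng.normal(size=T)
beta = np.exp(-0.05 * np.arange(T + 1))
print([smallest_switch_with_cost(x1, x2, beta, c) for c in (0.0, 1.0, 5.0)])
```

The last function mirrors the comparison between mandatory switching and never switching that underlies the optional-switching policy.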
Proof of Proposition 7
For any solution \(\tau _c^*\in {\mathcal P}_c\) of the problem (\(\hbox {P}_c\)) with optional switching, we have
Hence, \(\hat{V}_c^* = \max \{V_c^*,V(T)\}\), and the stated optimal switching policy follows. \(\square \)
Proof of Proposition 8
By the fundamental theorem of calculus
where, using the Leibniz rule and taking into account the definition of \(\beta \) in Sect. 2.1,
Hence, by setting
where ‘\(*\)’ denotes the convolution product, the switching problem (\(\hbox {P}_v\)) is equivalent to
which is a simple switching problem of the form (P). By Proposition 4, we therefore obtain that \(\tau _v^*\) solves (\(\hbox {P}_v\)) if and only if it is an element of \({\mathcal P}_v\triangleq \left\{ \tau : \lambda _v(\tau )=0 \right\} \), where \(\lambda _v = \beta \max \{y_v,z_v\}\) and \(y_v,z_v\) are one-sided adjoint variables, uniquely determined as the solutions (in the space \({\mathcal W}^{1,\infty }([0,T])\)) of the fixed-point problems \(y_v = ({\mathbf P}_{\hat{x}^0+x^1 - x^2})y_v\) and \(z_v = (\hat{\mathbf P}_{\hat{x}^0+x^1-x^2})z_v\), respectively. \(\square \)
Proof of Proposition 9
For \(N=1\), an optimal switching time is \(\tau ^1=\tau ^*\), as in Proposition 2. Consider now the case where \(N>1\). Let \(t\in [0,T]\), and for \(k\in \{2,\ldots ,N\}\) assume that an optimal solution has been found for the problem with \(k-1\) switches on the interval [t, T] resulting in the present value \(U^*_{k-1}(t)\). Switching at time \(\tau \) from an admissible cash-flow stream x (defined on [0, T]) to the cash-flow stream \(\xi ^{k-1}\) at the start of the interval \([\tau ,T]\) results in the payoff
where the cash-flow stream \(\xi ^{k-1}\) is such that
As discussed in Sect. 2, the solution to the family of optimization problems,
can be obtained—by the principle of optimality—from the (right-sided) adjoint variable \(y^N\) of the “all-inclusive” problem
which is of the form (P). By Proposition 1, the adjoint variable \(y^N\) is the unique solution of
that is: \(y^k = ({\mathbf P}_{x-\xi ^{k-1}})y^k\). By Proposition 2, an optimal solution to the cash-flow switching problem (33) is \(\hat{\tau }^k(t) = T - \hat{s}^k(t)\), where
Differentiating the maximized objective in Eq. (33) yields
which, using Eq. (32), implies that
If the number of remaining switches \(k\) and the total number of available switches \(N\) have the same parity (equivalently, \(k+N\) is even), then the current switch \(k\) is from \(x^1\) to \(x^2\). If they have opposite parity (equivalently, \(k+N\) is odd), then the current switch \(k\) is from \(x^2\) to \(x^1\). Hence, we set \(x=x^1\) if \(k+N\) is even, and \(x=x^2\) if \(k+N\) is odd. As a result:
By induction, the preceding arguments apply for all \(k\in \{2,\ldots ,N\}\). The corresponding switching times \(\tau ^1, \ldots , \tau ^N\) (with \(\tau ^1\ge \cdots \ge \tau ^N\)) are obtained recursively, starting with the smallest (\(\tau ^N\)):
where \(\hat{\tau }^k = T - \hat{s}^k\) as introduced earlier; see Eq. (34). This completes our proof. \(\square \)
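The backward recursion over the number of remaining switches, followed by a forward pass that recovers the switching times starting with the smallest, can be illustrated on a time grid. The following is a minimal sketch of a discretized Bellman recursion and is not the adjoint-variable method of the proof. It assumes exponential discounting \(e^{-rt}\), no switching costs, and exactly \(N\) switches that may coincide. The function name `alternating_switches`, the grid size, and the example streams \(x^1=0\), \(x^2(t)=\sin t\) are hypothetical.

```python
import numpy as np

def alternating_switches(x1, x2, N, r=0.1, T=10.0, n=1001):
    """Grid-based dynamic program for N alternating switches, starting on x1.
    Returns (present value at t = 0, switching times in chronological order)."""
    t = np.linspace(0.0, T, n)
    beta = np.exp(-r * t)  # assumed discount factor

    def cum_pv(x):
        # cumulative trapezoidal integral of the discounted stream on [0, t]
        g = beta * x(t)
        return np.concatenate(([0.0], np.cumsum(0.5 * (g[1:] + g[:-1]) * np.diff(t))))

    G = {1: cum_pv(x1), 2: cum_pv(x2)}

    def stream(k):
        # stream held while k switches remain: x1 if k + N is even, else x2
        return 1 if (k + N) % 2 == 0 else 2

    # k = 0: no switches left, collect the held stream until T
    W = G[stream(0)][-1] - G[stream(0)]
    H = {}
    for k in range(1, N + 1):
        Gk = G[stream(k)]
        H[k] = Gk + W  # payoff of switching at t_j, up to the constant -Gk[i]
        # best switch time at or after t_i: reverse running maximum
        W = np.maximum.accumulate(H[k][::-1])[::-1] - Gk

    # forward pass: earliest switch (tau^N) first, then tau^{N-1}, ..., tau^1
    i, taus = 0, []
    for k in range(N, 0, -1):
        i += int(np.argmax(H[k][i:]))
        taus.append(t[i])
    return W[0], taus

# Illustrative (hypothetical) streams: x2 is worth holding only while sin(t) > 0.
x1 = lambda t: np.zeros_like(t)
x2 = lambda t: np.sin(t)

value4, taus4 = alternating_switches(x1, x2, N=4)
value2, taus2 = alternating_switches(x1, x2, N=2)
print(taus4, value4, value2)
```

In this example the four switches bracket the two intervals on which \(\sin t>0\), so the computed times should lie close to \(0,\pi ,2\pi ,3\pi \). The reverse running maximum avoids an explicit inner loop over candidate switching times, which keeps each stage linear in the grid size.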
Proof of Proposition 10
Given that \(x^1\) is the current default cash-flow stream, the decision-maker compares the post-switch cash-flow streams
for \(j\in \{2,\ldots ,M\}\), where \(y_j\triangleq ({\mathbf P}_{x^1-x^j})y_j\). The present value of \(\xi ^j\) is equal to the optimal value of the simple cash-flow switching problem, from \(x^1\) to \(x^j\), so by Corollary 2 and the definition of the present value in Eq. (2) we have
This implies that from \(x^1\) the best cash-flow stream to switch to is \(x^{j^*}\), where
The optimal switching time \(\tau ^*\) is obtained by Proposition 4, in the sense that \(\tau ^*\) must be an element of the solution set \({\mathcal P}_M = \{\tau \in [0,T]:\lambda _{j^*}(\tau )=0\}\) with the adjoint variable \(\lambda _j =\beta \max \{y_j,z_j\}\). The one-sided adjoint variables \(y_j,z_j\) are uniquely determined as solutions (in the space \({\mathcal W}^{1,\infty }([0,T])\)) to the fixed-point problems \(y_j = ({\mathbf P}_{x^1-x^j})y_j\) and \(z_j = (\hat{\mathbf P}_{x^1-x^j})z_j\), for any \(j\in \{2,\ldots ,M\}\), which completes the proof. \(\square \)
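The selection of the target stream \(x^{j^*}\) amounts to solving one simple switching problem per candidate \(j\) and keeping the candidate with the largest optimal value. The sketch below does this by grid search rather than through the adjoint variables \(y_j,z_j\). It assumes exponential discounting \(e^{-rt}\) and no switching cost; the function name `best_target_stream` and the three example streams are hypothetical.

```python
import numpy as np

def best_target_stream(streams, r=0.1, T=10.0, n=1001):
    """streams[0] is the default stream x^1; the rest are candidates x^2, ..., x^M.
    Returns (j_star, optimal value, switching time), with j_star in {2, ..., M}."""
    t = np.linspace(0.0, T, n)
    beta = np.exp(-r * t)  # assumed discount factor

    def cum_pv(x):
        # cumulative trapezoidal integral of the discounted stream on [0, t]
        g = beta * x(t)
        return np.concatenate(([0.0], np.cumsum(0.5 * (g[1:] + g[:-1]) * np.diff(t))))

    G1 = cum_pv(streams[0])
    best = None
    for j, xj in enumerate(streams[1:], start=2):
        Gj = cum_pv(xj)
        value = G1 + (Gj[-1] - Gj)  # x^1 on [0, tau], then x^j on [tau, T]
        i = int(np.argmax(value))   # first maximizer on the grid
        if best is None or value[i] > best[1]:
            best = (j, value[i], t[i])
    return best

# Illustrative (hypothetical) streams.
streams = [
    lambda t: np.ones_like(t),  # x^1: constant default stream
    lambda t: 0.2 * t,          # x^2: overtakes x^1 at t = 5
    lambda t: 0.05 * t**2,      # x^3: overtakes x^1 at t = sqrt(20)
]
j_star, v_star, tau_star = best_target_stream(streams)
print(j_star, tau_star, v_star)
```

Here \(x^3\) exceeds both \(x^1\) and \(x^2\) from \(t=\sqrt{20}\) onward, so it is the preferred target and the switch occurs near that time. Ties between candidates are broken toward the smallest index, which is a choice of this sketch only.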
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Weber, T.A. Optimal switching between cash-flow streams. Math Meth Oper Res 86, 567–600 (2017). https://doi.org/10.1007/s00186-017-0586-0
DOI: https://doi.org/10.1007/s00186-017-0586-0
Keywords
- Capital budgeting
- Cash flows
- Deterministic multi-armed bandit
- Optimal stopping and starting
- Replacement decisions