(BAO, 2014) A Simulation-Based Portfolio Optimization Approach With Least Squares Learning
Abstract—This paper introduces a simulation-based numerical method for solving the dynamic portfolio optimization problem. We describe a recursive numerical approach, based on the Least Squares Monte Carlo method, that calculates the conditional value functions of investors over a sequence of discrete decision dates. The method is data driven rather than restricted to a specific asset model. Importantly, the intermediate transaction costs associated with portfolio rebalancing are considered in the dynamic optimization, and investors' risk preferences and risk management constraints are also taken into account in the current implementation.

The method is applied to a case study of a global equity portfolio invested in five equity markets, with foreign exchange risk included. We examine the portfolio performance of three optimizers in an out-of-sample simulation study, together with a benchmark portfolio that is passively managed with equally weighted positions.

Index Terms—Portfolio Optimization, Least-squares Monte Carlo, Approximate Stochastic Dynamic Programming, Optimal Asset Allocation.

I. INTRODUCTION

In this paper we introduce a new computational method to solve the dynamic portfolio optimization problem numerically. The Monte Carlo method is used to simulate a large number of hypothetical sample paths of asset returns and state variables. We call these sample paths the training set. The key idea is that these sample paths should incorporate the investor's beliefs about the stochastic properties of the future asset dynamics and state variables. For example, the sample paths may have arbitrarily complex marginal and joint distributions, correlation structure, path dependency, and non-stationarity.

Given the simulated training set, we can solve the optimal portfolio allocation problem using an approximate stochastic dynamic programming framework in the form of the Least Squares Monte Carlo (LSM) method. LSM was first introduced by [3] as a numerical methodology for valuing American or Bermudan options by least-squares regression. In this paper we extend the LSM approach to solve a multiple switching options problem which also incorporates the complex features of the intermediate transaction cost and
depend on all the qualitative or quantitative research or even luck. The optimal portfolio model implemented in this paper provides an efficient translation of the market forecast and risk management requirements into the corresponding portfolio.

In the next section, we describe in detail the framework of the dynamic portfolio optimization problem and the numerical implementation of the algorithm for the approximate stochastic dynamic programming. In Section III, we apply our method to a case study of a global equity portfolio. Some computational results and implementation issues are discussed in Section IV. Section V concludes.

TABLE I
STRATEGY SET FOR 5 ASSETS, 5-STEP DISCRETIZED CASE.

x(1):  (0, 0, 0, 0, 1)          x(2):  (0, 0, 0, 0.2, 0.8)
x(3):  (0, 0, 0, 1, 0)          x(4):  (0, 0, 0.2, 0.8, 0)
x(5):  (0, 0, 1, 0, 0)          x(6):  (0, 0.2, 0.8, 0, 0)
x(7):  (0, 1, 0, 0, 0)          x(8):  (0.2, 0.8, 0, 0, 0)
x(9):  (1, 0, 0, 0, 0)          x(10): (0.2, 0, 0.8, 0, 0)
x(11): (0, 0.4, 0.4, 0, 0)      x(12): (0.4, 0.4, 0.2, 0, 0)
...                             ...
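The strategy set listed in Table I can be enumerated mechanically. The following is a minimal Python sketch of one way to do so, not the implementation used in the paper; the function name `build_strategy_set` and its parameters are hypothetical. It lists every weight vector whose entries are multiples of 1/m and sum to one, working in integer grid units so that the sum condition is checked exactly. The order it produces need not match the numbering x(1), x(2), ... of Table I.

```python
from itertools import product


def build_strategy_set(n_assets, m):
    """Return all weight vectors on an m-step grid whose entries sum to 1.

    Hypothetical helper for illustration only. Each weight is a multiple
    of 1/m, so a vector is described by integer "units" that sum to m.
    """
    strategies = []
    # Enumerate grid units for the first n_assets - 1 assets; the last
    # asset receives whatever is left so that the units sum to m.
    for units in product(range(m + 1), repeat=n_assets - 1):
        remainder = m - sum(units)
        if remainder >= 0:  # skip combinations that overshoot 100%
            full = units + (remainder,)
            strategies.append(tuple(u / m for u in full))
    return strategies


theta = build_strategy_set(n_assets=5, m=5)
print(len(theta))  # number of admissible weight vectors
```

For five assets on a 5-step grid the sketch returns 126 vectors, the count stated for this example case. Position limits or other constraints could be applied afterwards by filtering the returned list.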
II. THE FRAMEWORK

A. The investor's problem

We consider the dynamic portfolio optimization problem of an investor at time $t$. There are $N$ assets that the investor can invest in; the unit price of buying or selling the $i$-th asset at time $t$ is given by $S_t^i$. For each asset, the price $S_t^i$ depends on some underlying stochastic processes, called risk factors, on the probability space $(\Omega, \mathcal{F}_t, P)$.

We consider a planning horizon with maturity date $T$; the portfolio can be rebalanced at a sequence of discrete rebalancing dates $t, t+1, t+2, \ldots, T$. The investor's problem is

$$V_t(\omega_t) = \max_{x_t} \left\{ E\left[ f_t(x_t, \omega_t) + V_{t+1}(\omega_{t+1}) \mid \mathcal{F}_t \right] \right\}, \quad \omega_t \in \mathcal{F}_t, \qquad (1)$$

where $x_t$ is the vector of portfolio positions in the $N$ assets at time $t$, and $V_t$ is the value function at time $t$ with boundary condition $V_T(\omega_T) = 0$ a.s. at the maturity date $T$. The utility function $f_t(\cdot)$ represents the investor's preference over portfolio performance and is discussed further in Section III.

The value function $V_t$ can be seen as the expected total future utility of the investor at time $t$, on the condition that all the portfolio position weights $x_t, x_{t+1}, \ldots, x_T$ are chosen optimally with respect to all random events $\omega_s \in \mathcal{F}_s$, $s = t, t+1, \ldots, T$.

We assume that the value function in Equation (1) is the objective function of the investor. This is a typical problem of decision under uncertainty: the decision maker has to decide based only on the realized history of the portfolio's performance, while taking into account the dynamics of future scenarios and all possible future decisions, which are not revealed until the future decision dates.

The vector of dynamic portfolio positions $x_t(\omega)$ is an $\mathcal{F}_t$-adapted random variable. The value of $x_t$ is determined by all the information available at time $t$. This may include the current values of all risk factors and the portfolio history $x_0, x_1, \ldots, x_{t-1}$.

B. Constructing Strategy Set

The position vector $x_t = \{w_1, w_2, \ldots, w_N\}$ at time $t$ represents the weights, as a percentage of the total book value of the portfolio, invested in the assets $S^i$, $i = 1, 2, \ldots, N$. We discretize the position weight $w_i$ of asset $S_t^i$, $i = 1, 2, \ldots, N$, in the following way: an $m$-step discrete grid is used to represent the portfolio weight as $(0, 1/m, 2/m, \ldots, 1)$. The discretized portfolio weight vectors $x$ are then defined as all possible combinations of the discrete portfolio weight values for the individual assets, subject to the condition $\sum_{i=1}^{N} w_i = 1$.

The investor thus has a set of possible portfolio weightings in vector form, and each possible portfolio weight composition represents a so-called strategy. The full set of strategies available for adoption can be listed as $\Theta = \{x^{(1)}, x^{(2)}, x^{(3)}, \ldots\}$.

As an example, for a portfolio with 5 assets in which each asset weight is discretized on a 5-step grid (i.e. 0%, 20%, 40%, 60%, 80% and 100%), there are in total 126 possible strategies. We list some of the strategies in Table I for this example case.

C. Constraints

Strategies of the portfolio are subject to some constraints. The first constraint is on the position limits, that is, the individual upper and lower bounds on the position (portfolio) weight of each asset:

$$w_i \in [a_i, b_i], \quad i = 1, 2, \ldots, N, \quad a_i < b_i,$$

where $a_i, b_i \in \mathbb{R}$. The limits on individual asset weights or positions may be set by legislation or by the investor's risk management requirements.

Investors may be required to operate the portfolio within thresholds on risk exposure, expressed in the form of risk measures such as VaR. Other constraints may apply to turnover, and to restrictions caused by the trading ability of the investor.

Also, large orders in the market may adversely affect the market price itself. Liquidity constraints can apply at all rebalancing dates. Depending on the assumptions on liquidity, a minimum absolute liquidity can be specified for the portfolio. For instance, the first 20% of the local market traded equity is highly liquid, whereas any amount larger than 20% of the local market equity is assumed to take more time and slippage to liquidate.

For simplicity, we represent the liquidity constraint by setting a maximum total turnover, so as to restrict any large rebalancing trades within a short period.

D. Least Squares Monte Carlo Method

In this section we describe the approximation algorithm we use to estimate the expected value function in Equation (1).

The input of the LSM model takes:
• $n + s$ Monte Carlo simulated sample paths of the $M$ underlying risk factors at times $t = 0, 1, 2, \ldots, T$, denoted $\{X_t^{(i,k)}\}$, where $i$ is the risk factor index, $k$ is the sample path number and $t$ is the time index. The first $n$ sample paths are the training set and we use the rest for out-of-sample tests.
• The realization of $\omega_t$, which is a vector of the simulated risk factors $\{X_t^{(i,k)}\}$ up to time $t$ together with the decision history of the portfolio positions $x_0, x_1, \ldots, x_{t-1}$.
• The structure of the basis functions $L(\cdot)$ and the truncation order parameter $K$;
• The investor's utility function $f(\cdot)$;
• A transaction cost function $TC(x_{t-1}, x_t, \omega_t)$;
• The $N$ assets $S^1, S^2, \ldots, S^N$ and the step size $1/m$ for the discrete weight values in the portfolio; and,
• Some constraint conditions, written as an indicator function $1_t(\omega_t, x)$ which returns 1 if the position $x$ satisfies all the constraints at time $t$ given the state variable information $\omega_t$, and 0 otherwise.

The first step is to construct the strategy set $\Theta_t = \{x_t^{(1)}, x_t^{(2)}, x_t^{(3)}, \ldots\}$. For any $x' = \{w_1, w_2, \ldots, w_N\}$ with $w_i = 0, 1/m, 2/m, \ldots, 1$, if $1_t(\omega_t, x') = 1$ then add $x'$ to the set $\Theta_t$.

The next step is to approximate the conditional value function:

$$V_t^{(l)}(\omega_t) = E\left[ f_t(x_t^{(l)}, \omega_t) + V_{t+1}(\omega_{t+1} \mid x_t = x_t^{(l)}) \mid \mathcal{F}_t \right], \quad \omega_t \in \mathcal{F}_t. \qquad (2)$$

The conditional value function is the expected value of the total utility at time $t$ given that the strategy at time $t$ is $x_t^{(l)}$. Here, we approximate Equation (2) by using a cross-sectional least-squares regression scheme.

    Data: $\{X_t^{(i,k)}\}$, $i = 1, 2, \ldots, n$
    Result: $\hat{c}_{t,k}$
    initialization;
    Set $\hat{c}_{T,j}^{(k)} := 0$;
    for $t = T - 1$ to $1$ do
        for $j = 1$ to size($\Theta$) do
            $$Q_j = \sum_{i=1}^{n} \Big( f(x^{(j)}, \omega_{t+1}) + \max_{x^{(l)} \in \Theta} \Big\{ f(-TC(x_t^{(j)}, x_{t+1}^{(l)}, \omega_{t+1})) + \sum_{k=1}^{K} \hat{c}_{t+1,l}^{(k)} L^{(k)}(X_{t+1}^{(1,i)}, \ldots, X_{t+1}^{(M,i)}) \Big\} - \sum_{k=1}^{K} \hat{c}_{t,j}^{(k)} L^{(k)}(X_t^{(1,i)}, \ldots, X_t^{(M,i)}) \Big)^2$$
            $\hat{c}_{t,j}^{(k)} = \arg\min_{\hat{c}} Q_j$;
        end
    end
    Algorithm 1: Algorithm to estimate $\hat{c}$

Algorithm 1 describes the algorithm in pseudo-code form. The coefficient parameters $c$ are estimated by minimizing the sum of squared sample differences (function $Q$) between the sample utility at time $t$ and the expected conditional value function at time $t + 1$ with strategy $l$, where strategy $l$ is the one that gives the maximum value of the total utility of the transaction cost and the conditional value function at $t + 1$.

After all the coefficient parameters for the basis functions are estimated, we calculate the optimal strategy at time $t = 0$; the algorithm is described in pseudo-code form in Algorithm 2. The algorithm computes the mean of the conditional value function at $t = 0$ for all strategies; the one that gives the highest value is chosen as the initial optimal strategy.

    Data: $\{X_t^{(i,k)}\}$, $i = 1, 2, \ldots, n$
    Result: $x_0$
    initialization;
    for $j = 1$ to size($\Theta$) do
        for $i = n + 1$ to $n + s$ do
            $$EV_0^{(j)} \mathrel{+}= f(x_0^{(j)}, \omega_1) + \max_{x_1^{(l)} \in \Theta_1} \Big\{ f(-TC(x_0^{(j)}, x_1^{(l)}, \omega_1)) + \sum_{k=1}^{K} \hat{c}_{1,l}^{(k)} L^{(k)}(X_1^{(1,i)}, \ldots, X_1^{(M,i)}) \Big\}$$
        end
    end
    $l = \arg\max_{x_0^{(l)}} \big\{ EV_0^{(l)} \big\}$
    Algorithm 2: Algorithm to estimate $x_0$

III. A CASE STUDY OF GLOBAL EQUITY PORTFOLIO

We consider an investor who manages an equity portfolio invested in five major equity markets globally: Australia (AU), the United States of America (US), the United Kingdom (UK), Japan (JP) and the emerging equity markets (EM).

The investor operates the portfolio from the viewpoint of the home currency. In this paper we take the Australian dollar (AUD) as the home currency; all valuations need to be converted to the home currency.

Except for the AU market, the returns from the four foreign equity markets are subject to foreign exchange risk. We therefore have to consider the dynamics of the foreign exchange rates between the local currencies of the equity markets and the home currency:
• $E_{AUD}^{USD}$: US dollar to Australian dollar;
• $E_{AUD}^{GBP}$: British pound sterling to Australian dollar;
• $E_{AUD}^{JPY}$: Japanese yen to Australian dollar; and
• A basket of emerging market currencies to Australian dollar.

A. Data

Our datasets have an eight-dimensional structure: five equity indices and three exchange rates. Table II sets out the data used as proxies for the market indices and their local currencies.

We use the adjusted closing price of the equity market indices on the last trading day of the month. We don't have access
The Power utility function is also known as the constant relative risk aversion (CRRA) utility function. The relative measure of risk aversion is defined by $-w\,u''(w)/u'(w)$, and $\alpha$ is a constant.

IV. RESULT AND DISCUSSIONS

The algorithm presented here is implemented in the software package RiskLab, and is used to compute the numerical results in this section.¹ Once the model parameters of these risk factors are calibrated, we can proceed to calculate all the expected value functions at any future asset prices for all the portfolio strategies at time $t$. At time $t$, if the previous strategy at time $t - 1$ is known, we can readily select the current optimal target portfolio strategy. In other words, the algorithm can be used as a what-if forecasting tool for portfolio strategies. For example, for any given scenario of the risk factors or a mathematically generated random event $\omega$, the current methodology produces optimal target portfolio positions at every rebalancing date for the given scenario events.

¹ RiskLab is a software package developed by CSIRO for asset modelling, simulation, decision support, real option pricing and portfolio optimization.

For this example, we first generate 5000 Monte Carlo sample scenarios from the stochastic models of the 8 risk factors as the training set, then we generate another set of 5000 simulation scenarios as the out-of-sample data to analyse the decision output for optimal portfolios. We also set up four investment styles for managing the portfolio:

P0: A constant position strategy with equally weighted positions on each of the five assets: 20% of the total book size of the portfolio is allocated to each of the five assets at every rebalancing time.
P1: The investor chooses to manage the portfolio through the linear utility function of Equation (4).
P2: The investor selects the risk aversion utility function of Equation (5) with parameter $\alpha = 5$.
P3: The investor selects the risk aversion utility function of Equation (5) with parameter $\alpha = 7$.

Table IV shows the portfolio performance in terms of the values of the CRRA utility functions and the achieved expected excess returns and volatilities over the 10-year investment horizon with a 5-step strategy set (portfolio weight steps).

TABLE IV
STATISTICS OF THE PORTFOLIO RETURNS AND CRRA UTILITY FUNCTION VALUES FOR PORTFOLIO OPTIMIZERS P0, P1, P2, P3 IN THE OUT-OF-SAMPLE TESTS.

                                     excess returns
optimizer   CRRA-5     CRRA-7     mean     volatility
P0          0.613388   0.998549    1.77%   43.20%
P1          0.556802   0.811427   25.66%   56.99%
P2          0.661708   0.987756   19.00%   49.60%
P3          0.638013   1.001234   10.80%   45.85%

First we look at the expected total returns and volatilities over the entire investment horizon. The P0 investment style gives a 1.77% excess return against the 10-year bond yield, with a 43.20% volatility, which corresponds to an average of 13.66% per year (43.20% divided by $\sqrt{10}$). The P0 portfolio follows a passive management style and does not require decision making or market forecasting by the investor. We choose the P0 style as the benchmark portfolio.

The P1 investment style produces the highest expected return but also the highest volatility among the four investment styles. Recall that in this case study we use the same calibrated stochastic asset models to generate the Monte Carlo sample set for model training and for the out-of-sample test. The overall superior return of the P1 investment style indicates that the algorithm has successfully captured the properties of the future asset dynamics through the training data set, and the superior performance in portfolio returns for the out-of-sample data indicates that the implemented dynamic portfolio optimization scheme achieved the objective set out in the P1 investment style.

For the P1 style, the high volatility of the portfolio return brings down the corresponding CRRA utility value. An investor with risk preferences expressed by the CRRA utility function will find that the P1 investment style does not achieve the objective of maximizing the CRRA utility function. For the investment style of maximizing the CRRA utility function, we have chosen to evaluate two investment styles, P2 and P3, with $\alpha = 5$ and $\alpha = 7$ respectively.

The expected excess return for the P2 style is 6.66 percentage points lower than for the P1 style (19.00% against 25.66%), whereas the return volatility is reduced by 7.39 percentage points (49.60% against 56.99%). This suggests that an investor following the decision rule given by the P2 style would end up with lower risk, in the form of reduced volatility, and a smaller return than with the P1 investment style. The difference in expected excess returns can be seen as the risk premium paid to reduce the volatility of the portfolio. A similar behaviour is observed for the P3 investment style, for which the CRRA utility function is adopted with the parameter $\alpha = 7$. The P3 style achieves the highest CRRA-7 value. Interestingly, we observe that the calculated CRRA-7 value of the P0 style is very close to the CRRA-7 value of the P3 style. Correspondingly, the P2 investment style maximizes the value of CRRA-5, that is, the utility function value with parameter $\alpha = 5$.

A. Visualization of dynamic decisions

One intuitive way to show the real-time dynamics of portfolio positions changing with respect to different scenarios is to use a motion plot. Figure 1 shows a snapshot of the motion plot created by the visualization tool built into the RiskLab software package. We have also generated motion plots for the investment styles P1, P2 and P3, which can be accessed through the web link: https://dl.dropboxusercontent.com/u/788580/Presentation/IAENG/googleEmbedded.html.

B. Basis functions

One important issue when applying the LSM algorithm is the choice of basis functions. The selection of basis functions depends on the application at hand. The work of [3] suggests that (weighted) Laguerre polynomials should be selected as the orthogonal basis functions for single-asset American put options. The robustness and convergence of LSM algorithms have also been an issue when selecting basis functions. For example, [4] shows that the LSM method is more efficient than either a finite difference or a binomial method when valuing options
on multiple assets, and monomials are suggested as possible basis functions.

Fig. 1. A motion plot of the optimal weights calculated by P2 for 30 selected sample paths.

For the case study of this paper, we have tested LSM with a set of different basis polynomial functions including Laguerre, Monomial, Hermite, Hyperbolic and Legendre. We use the total standard deviation of the least-squares residuals as a measure of goodness of fit. We observe that essentially all the tested orthogonal functions provide comparable results. For this particular example, Laguerre polynomials with order greater than 3 provide the lowest error among the 5 basis polynomial functions.

As a standard approach for selecting a numerical basis approximation function, we suggest testing multiple possible orthogonal functions for each new application before choosing the appropriate basis functions.

C. Computational time

Table V shows the calibration and optimization times when running on a PC with an Intel Core i5-2540 2.6 GHz processor and 4 GB RAM, compiled using the 32-bit version of Visual C++ 2010.

TABLE V
PERFORMANCE OF RISKLAB PORTFOLIO OPTIMIZER ON SELECTED TEST CASES

      number of scenarios           calculation time (′ minutes, ″ seconds)
      calibration   optimization    calibration   optimization
P1    5000          5000            267.242′      7″
P2    5000          5000            268.249′      8″
P3    5000          5000            267.242′      8″

As can be seen from Table V, the calibration phase for 5000 simulated sample paths took more than four hours to finish. The computation time also depends on the size of the strategy set and the number of risk factors. We have also tested smaller numbers of sample paths, different numbers of strategies in the strategy set, and different basis functions (results not listed). The computational time for calibration is asymptotically linear in the number of strategies and the number of simulation sample paths. Once the LSM model is calibrated, it normally takes only a few seconds for the LSM algorithm to calculate the optimal portfolio positions for the 5000 out-of-sample scenarios.

It is worth noting that the calibrating process for the LSM model, as described in Algorithm 1, can be performed in

V. CONCLUSION

We have presented a simulation-based numerical method for solving dynamic portfolio optimization problems. There is no restriction on the choice of asset models, investor preferences, transaction costs, or liquidity and position constraints.

We have applied the method to managing an equity portfolio invested across five global equity markets. For the case study shown in this paper, the views of an investor on future market returns are modelled and calibrated by a multi-factor mean-reverting process with eight risk factors, and auto- and cross-asset correlation structures are also considered. Four investment styles are chosen in the test case, and a Least Squares Monte Carlo approximation method has been developed to calibrate the dynamic portfolio model. Through the test case, we have shown that the three dynamic investment styles outperform the benchmark portfolio in out-of-sample tests. Viewed on a mean-variance plane, the dynamic portfolios are located on a new efficient frontier, whereas the benchmark static portfolio is less efficient with a higher risk premium. Some computational issues with the LSM model have also been discussed.

REFERENCES

[1] E.F. Fama and K.R. French, "The Cross-Section of Expected Stock Returns", The Journal of Finance, vol. XLVII, no. 2, June 1992, pp. 427-465.
[2] R. Grinold and R. Kahn, "Active Portfolio Management: A Quantitative Approach for Producing Superior Returns and Controlling Risk", 2nd ed., McGraw-Hill, 1999.
[3] F.A. Longstaff and E.S. Schwartz, "Valuing American options by simulation: A simple least-squares approach", Review of Financial Studies, vol. 14, no. 1, 2001, pp. 113-147.
[4] L. Stentoft, "Assessing the Least Squares Monte-Carlo Approach", Review of Derivatives Research, 7, 2004, pp. 129-168.
[5] G. Consigli and M.A.H. Dempster, "Dynamic stochastic programming for asset-liability management", Annals of Operations Research, vol. 81, June 1998, pp. 131-162.
[6] M.A.H. Dempster, M. Germano, E.A. Medova and M. Villaverde, "Global Asset Liability Management", British Actuarial Journal, 9, pp. 137-195, doi:10.1017/S1357321700004153.
[7] M.W. Brandt, A. Goyal, P. Santa-Clara and J.R. Stroud, "A Simulation Approach to Dynamic Portfolio Choice with an Application to Learning About Return Predictability", Review of Financial Studies, 18, 2005, pp. 831-873.
[8] C. Bao and Z. Zhu, "Land use decisions under uncertainty: optimal strategies to switch between agriculture and afforestation", in MODSIM2013, 20th International Congress on Modelling and Simulation, Modelling and Simulation Society of Australia and New Zealand, December 2013, pp. 1419-1425. ISBN: 978-0-9872143-3-1.
[9] C. Bao, M. Mortazavi-Naeini, S. Northey, T. Tarnopolskaya, A. Monch and Z. Zhu, "Valuing flexible operating strategies in nickel production under uncertainty", in MODSIM2013, 20th International Congress on Modelling and Simulation, Modelling and Simulation Society of Australia and New Zealand, December 2013, pp. 1426-1432. ISBN: 978-0-9872143-3-1.