Partial Differential Equations in Applied Mathematics 6 (2022) 100457


journal homepage: www.elsevier.com/locate/padiff

A test of backward stochastic differential equations solver for solving semilinear parabolic differential equations in 1D and 2D✩

Evan Davis a, Guangming Yao b,∗, Elizabeth Javor c, Kalani Rubasinghe b, Luis Antonio Topete Galván d

a Department of Mathematics, University of Wisconsin, Madison, WI 53706, USA
b Department of Mathematics, Clarkson University, 8 Clarkson Ave, Potsdam, NY 13699, USA
c Department of Mathematics, Rochester Institute of Technology, Rochester, NY 14623-5600, USA
d Universidad Autónoma del Estado de Hidalgo, Kilómetro 4.5 carretera Pachuca-Hidalgo, Pachuca, Hidalgo, 42074, Mexico

ARTICLE INFO

Keywords:
Semilinear PDE
Backward stochastic PDE
Euler method
Deep learning
Parabolic
Black–Scholes equations
Allen–Cahn equations
Diffusion–reaction equations
Radial basis functions
Picard method

ABSTRACT

The backward stochastic differential equation solver was first introduced by Han et al. in 2017. A semilinear parabolic partial differential equation is converted into a stochastic differential equation and then solved by the backward stochastic differential equation (BSDE) solver. The BSDE solver uses the backward Euler scheme for time discretization, and a stochastic process together with a neural network to learn the derivative functions at each time step. The algorithm is ideal for solving high-dimensional PDEs, especially in more than three dimensions, where traditional numerical techniques fail to produce any simulations. We modified the BSDE solver so that it works for one-dimensional problems as well. The focus of this paper is to understand how the BSDE solver works in comparison with traditional numerical techniques in low-dimensional spaces (1D and 2D). We examined the BSDE solver in terms of accuracy, efficiency, and convergence. Through five classical differential equations, we discovered that the solver works in low-dimensional spaces, and is as accurate there as it is in high-dimensional spaces. It is more accurate than the radial basis function collocation method reported in the literature and than the results of the Picard method. However, the BSDE solver is time consuming; this can be addressed by parallel computing if needed.

1. Introduction

Partial differential equations (PDEs) are among the fundamental tools used to model physical phenomena in science and engineering.1 Traditional analytical or numerical methods are sometimes inadequate for realistic problems due to the increase in dimensions and complexity.2–5 Additionally, numerical stochastic differential equations (SDEs)6–9 are very successful for modeling stochastic phenomena such as those in finance, biology, mechanical engineering, and population dynamics. Methods for solving SDEs include the finite difference method,10,11 the finite element method,12 meshfree methods,13 the Bernoulli wavelet method,14 hybrid methods,15,16 and many others. Furthermore, over the last decade, deep learning algorithms have gained a lot of interest for dealing with high-dimensional problems in many fields. They have effectively been used in solving problems with complicated geometries,17 many-electron Schrödinger equations,18 high-dimensional forward–backward stochastic differential equations,19 computational fluid dynamics,20 and so on.

In this paper, we are interested in examining semilinear parabolic partial differential equations (PDEs) of the following form:

∂u/∂t(t, x) + ½ Tr[σσᵀ(t, x) 𝐇ₓu(t, x)] + ∇u(t, x) ⋅ μ(t, x) + f(t, x, u, σᵀ∇u) = 0,   (1.1)

where x ∈ ℝᵈ and t ∈ ℝ, subject to the terminal condition

u(T, x) = g(x),   (1.2)

where g(x) is a given real-valued function defined on ℝᵈ. Note that Tr denotes the trace of a matrix, ∇u is the gradient, 𝐇ₓu(t, x) is the Hessian matrix, σ(t, x): ℝ × ℝᵈ → M^{d×d}(ℝ) is a known matrix-valued function, μ(t, x): ℝ × ℝᵈ → ℝᵈ is a known vector-valued function, and f is a known nonlinear function. The goal is to obtain u(0, ξ) for some fixed ξ ∈ ℝᵈ.

The backward stochastic differential equation solver was first introduced in Ref. 21 in 2017 to solve such semilinear parabolic PDEs in high-dimensional spaces, as high as hundreds of dimensions. Instead of setting up

✩ This work was supported in part by National Security Agency, USA grants H98230-21-1-0336 and H89230-22-1-0008.
∗ Corresponding author.
E-mail address: [email protected] (G. Yao).

https://doi.org/10.1016/j.padiff.2022.100457
Received 23 January 2022; Received in revised form 12 June 2022; Accepted 19 October 2022

2666-8181/© 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license
(http://creativecommons.org/licenses/by/4.0/).

Fig. 1. Flowchart of the BSDE solver algorithm for solving semilinear parabolic PDEs.
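To make the workflow in Fig. 1 concrete, the following is a deliberately stripped-down NumPy sketch of the training loop for the 1D heat equation of Example 1 (f = 0, dX = dW, g(x) = 1_{x≥1}). Here the per-step neural networks are replaced by trainable constants Z_n, and the learning rate, batch size, and iteration count are arbitrary choices of ours; this illustrates the scheme only and is not the published TensorFlow implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Example-1-like problem: dX = dW, f = 0, payoff g(x) = 1_{x >= 1};
# the exact initial value is u(0, 0) = 1 - Phi(1) ~ 0.159.
T, N, batch = 1.0, 30, 256
dt = T / N

# Trainable parameters: the initial value u0 and one "Z" per time step
# (a stand-in for the per-step neural networks of the real solver).
u0, Z = 0.5, np.zeros(N)
lr = 0.05

for it in range(2000):
    dW = rng.normal(0.0, np.sqrt(dt), size=(batch, N))
    X = np.cumsum(dW, axis=1)          # Euler paths X_{t_n} started at x = 0
    Y = u0 + dW @ Z                    # Y_T = u0 + sum_n Z_n dW_n (f = 0)
    g = (X[:, -1] >= 1.0).astype(float)
    r = Y - g                          # residual in the terminal loss E|Y_T - g|^2
    u0 -= lr * np.mean(2 * r)          # gradient step in u0
    Z -= lr * (2 * r @ dW) / batch     # gradient step in each Z_n

print(u0)  # ~0.16
```

With these simplifications the minimizer of the terminal loss in u0 is E[g(X_T)] = 1 − Φ(1) ≈ 0.159, which the loop approaches after a few hundred iterations; the Z_n only reduce the variance of the estimate.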

initial value problems, as is commonly done in the traditional numerical methods literature, a problem with terminal conditions is considered, since this enables connections with BSDEs. To handle such high dimensions, the PDEs are converted into SDEs and then solved by the backward stochastic differential equation (BSDE) solver. The BSDE solver uses the backward Euler scheme for time discretization, and a stochastic process together with a neural network to learn the derivative functions at each time step. Fig. 1 illustrates the steps involved in the BSDE solver. The solver was programmed in Python using TensorFlow by Han et al. in 2017 and can be downloaded from GitHub.

The focus of our paper is to examine how the BSDE solver works in low-dimensional spaces (1D, 2D) and to compare its performance with traditional numerical techniques. The original BSDE solver was found to be faulty for 1D problems; we modified the solver so that it works for 1D problems. The numerical experiments include classical benchmark differential equation problems: 1D and 2D heat equations with a terminal condition, a diffusion–reaction equation in 2D and in dimensions up to 100, nonlinear Black–Scholes equations in 2D, and an Allen–Cahn equation in 2D. We discovered that the solver works in low-dimensional spaces, as accurately as it does in high-dimensional spaces. However, the solver can only approximate the solution at a single point at a time, which makes it inefficient when we are interested in solutions over a region. Note that the authors claimed that the solver can be improved to solve PDEs over a region. Furthermore, the method was analyzed and shown to be convergent for high-dimensional forward–backward stochastic differential equations in Ref. 19.

2. Itô's Lemma and the BSDE

For completeness, we briefly review the algorithm used in the BSDE solver in this section. First, we need to define an Itô process. An Itô process is a type of stochastic process described by Kiyoshi Itô. It is expressed as the sum of the integral of a process over time and the integral of another process over a Brownian motion. Such processes are the basis of stochastic integration and are therefore widely used in stochastic calculus.

Definition 1. For a vector μ ∈ L¹ and a matrix σ ∈ L², X(t) is an Itô process22 if

X(t) = X₀ + ∫₀ᵗ μ ds + ∫₀ᵗ σ dW.   (2.1)

This implies that

dX_t = μ dt + σ dW_t.   (2.2)

Definition 2. Let {W_t}_{t∈[0,T]} be a d-dimensional Brownian motion. We define the d-dimensional stochastic processes {X_t}_{t∈[0,T]}, {Y_t}_{t∈[0,T]}, and {Z_t}_{t∈[0,T]} as follows23:

X_t = ξ + ∫₀ᵗ μ ds + ∫₀ᵗ σ dW,   (2.3)
Y_t = g(X_T) + ∫ₜᵀ f(s, X_s, Y_s, Z_s) ds − ∫ₜᵀ (Z_s)ᵀ dW_s,   (2.4)
Z_t = σᵀ(t, X_t) ∇u(t, X_t).   (2.5)

Theorem 1 (Itô's Lemma22). For a vector μ and a matrix σ, let X_t = (X_t¹, X_t², …, X_tᵈ)ᵀ be a vector of Itô processes such that dX_t = μ dt + σ dW_t. Then

du(t, X_t) = (∂u/∂t) dt + (∇u)ᵀ dX_t + ½ (dX_t)ᵀ (𝐇ₓu) dX_t
           = {∂u/∂t + (∇u)ᵀ μ + ½ Tr[σᵀσ 𝐇ₓu]} dt + (∇u)ᵀ σ dW_t,   (2.6)

where ∇u is the gradient of u w.r.t. x, 𝐇ₓu is the Hessian matrix of u w.r.t. x, and Tr is the trace operator.

If u(t, x) is any twice differentiable scalar function of two real variables t and x, then for an Itô drift–diffusion process

dX_t = μ dt + σ dW_t,   (2.7)

one has

du(t, X_t) = (∂u/∂t + μ ∂u/∂x + (σ²/2) ∂²u/∂x²) dt + σ (∂u/∂x) dW_t.   (2.8)

This immediately implies that u(t, X_t) is itself an Itô drift–diffusion process. We seek to show that the solution u to the semilinear parabolic PDE (1.1)–(1.2) leads to a solution of a stochastic differential equation (SDE) and vice versa.

Theorem 2. The semilinear parabolic PDE (1.1)–(1.2) has a solution u(t, x) if and only if u(t, X_t) satisfies the following backward SDE (BSDE):

u(t, X_t) − u(0, X₀) = −∫₀ᵗ f(s, X_s, u(s, X_s), σᵀ(s, X_s)∇u(s, X_s)) ds + ∫₀ᵗ [∇u(s, X_s)]ᵀ σ(s, X_s) dW_s,   (2.9)

where X_t is defined in (2.3).

Proof. For simplicity, we rewrite (2.9) as follows:

u(t, X_t) − u(0, X₀) = −∫₀ᵗ f ds + ∫₀ᵗ [∇u(s, X_s)]ᵀ σ(s, X_s) dW_s.   (2.10)

We can also reformulate (2.4) as follows:

dY_t = −f(t, X_t, Y_t, Z_t) dt + (Z_t)ᵀ dW_t.   (2.11)

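The forward process (2.3) is what the solver simulates first. A minimal Euler–Maruyama sampling sketch (with made-up constant μ and σ, not coefficients from any example in this paper) looks like:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative constant coefficients: a 2D drift mu and diffusion sigma.
d, N, T, n_paths = 2, 30, 1.0, 10000
mu = np.array([0.1, -0.2])
sigma = 0.3 * np.eye(d)
xi = np.zeros(d)
dt = T / N

X = np.tile(xi, (n_paths, 1))
for n in range(N):
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, d))
    X = X + mu * dt + dW @ sigma.T   # X_{t_{n+1}} = X_{t_n} + mu dt + sigma dW

print(X.mean(axis=0))  # ~ xi + mu * T = [0.1, -0.2]
```

Because μ and σ are constant here, the empirical mean of X_T approaches ξ + μT, which serves as a quick sanity check on the discretization.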

In addition, by Itô's Lemma, we have that

d(u(t, X_t)) = {u_t + ∇u ⋅ μ + ½ Tr(σσᵀ 𝐇ₓu)} dt + [∇u]ᵀ σ dW.   (2.12)

• If u(t, x) is a solution to the semilinear parabolic PDE (1.1)–(1.2), then by Eq. (1.1), u_t + ∇u ⋅ μ + ½ Tr(σσᵀ 𝐇ₓu) = −f(t, X_t, Y_t, Z_t). Thus,

d(u(t, X_t)) = −f(t, X_t, Y_t, Z_t) dt + [∇u]ᵀ σ dW.   (2.13)

Note the definitions of Y_t and Z_t by Eqs. (2.4) and (2.5) or (2.11):

d(u(t, X_t)) = −f(t, X_t, Y_t, Z_t) dt + (Z_t)ᵀ dW = dY_t.   (2.14)

Thus,

u(t, X_t) = Y_t = g(X_T) + ∫ₜᵀ f(s, X_s, Y_s, Z_s) ds − ∫ₜᵀ (Z_s)ᵀ dW_s,
u(0, X₀) = Y₀ = g(X_T) + ∫₀ᵀ f(s, X_s, Y_s, Z_s) ds − ∫₀ᵀ (Z_s)ᵀ dW_s.   (2.15)

Therefore,

u(t, X_t) − u(0, X₀) = −∫₀ᵗ f(s, X_s, u(s, X_s), σᵀ(s, X_s)∇u(s, X_s)) ds + ∫₀ᵗ [∇u(s, X_s)]ᵀ σ(s, X_s) dW_s
                     = −∫₀ᵗ f ds + ∫₀ᵗ (Z_s)ᵀ dW_s.

This is the BSDE (2.9).

• If u(t, X_t) is a solution of the BSDE (2.9), then

u(t, X_t) = u(0, X₀) − ∫₀ᵗ f ds + ∫₀ᵗ (Z_s)ᵀ dW_s,
du(t, X_t) = −f dt + (Z_t)ᵀ dW.   (2.16)

Thus, by Eq. (2.4) we have that

u(T, X_T) = u(0, X₀) − ∫₀ᵀ f ds + ∫₀ᵀ (Z_s)ᵀ dW_s = u(0, X₀) + g(X_T) − Y₀ = g(X_T).   (2.17)

Thus, u(T, x) = g(x). On the other hand, recalling Itô's lemma and combining Eqs. (2.12) and (2.16), we have that

{u_t + ∇u ⋅ μ + ½ Tr(σσᵀ 𝐇ₓu)} dt + [∇u]ᵀ σ dW = −f dt + (Z_t)ᵀ dW,
u_t + ∇u ⋅ μ + ½ Tr(σσᵀ 𝐇ₓu) = −f,
u_t + ∇u ⋅ μ + ½ Tr(σσᵀ 𝐇ₓu) + f = 0.   (2.18)

Thus, we have a solution u(t, x) to the PDE (1.1)–(1.2). □

3. Numerical solutions to the BSDE

The BSDE solver built by Han et al. in Ref. 21 uses a simple explicit Euler scheme to discretize the time domain for the BSDE, and then uses the deep learning method to approximate the derivatives in the spatial variables at each time step. In this section, we briefly introduce the numerical algorithm.

3.1. Euler's method for time discretization

First, we apply a temporal discretization using the explicit Euler scheme on (2.3)–(2.5). Let Δt_n = t_{n+1} − t_n and ΔW_n = W_{t_{n+1}} − W_{t_n}, for n = 0, 1, …, N, with t₀ = 0 and t_N = T. With this time discretization, Eq. (2.3) can be discretized as

X_{t_{n+1}} − X_{t_n} ≈ μ(t_n, X_{t_n}) Δt_n + σ(t_n, X_{t_n}) ΔW_n.   (3.1)

The choice in Eq. (3.1) is the explicit Euler method. The local truncation error of this time discretization is approximately proportional to the square of the time step size. We chose to use small time steps to ensure accuracy in the numerical experiments, so that the effect of neural network training on the derivative approximations can be studied. With the time discretization, (2.9) becomes

u(t_{n+1}, X_{t_{n+1}}) − u(t_n, X_{t_n}) ≈ −f(t_n, X_{t_n}, u(t_n, X_{t_n}), σᵀ(t_n, X_{t_n})∇u(t_n, X_{t_n})) Δt_n + [∇u(t_n, X_{t_n})]ᵀ σ(t_n, X_{t_n}) ΔW_n.   (3.2)

Given this temporal discretization, the path X_{t_n}, 0 ≤ n ≤ N, can easily be sampled using (3.1).

3.2. Neural network for spatial derivative approximations

The next key step is to approximate the function

x ↦ σᵀ(t, x)∇u(t, x)   (3.3)

at each time step t = t_n by a multilayer feedforward network, under the assumption that u(T, x) is known from the given terminal condition. Denote

(σᵀ∇u)(t_n, X_{t_n}) ≈ (σᵀ∇u)(t_n, X_{t_n}; θ_n)

for n = 1, …, N − 1, where θ_n refers to the neural network parameters. The loss function is defined based on the squared approximation error associated with the terminal condition of the BSDE:

l(θ) = 𝔼[|g(X_{t_N}) − u(t_N, X_{t_N})|²].   (3.4)

Optimization methods such as the stochastic gradient descent (SGD) algorithm or the Adam optimizer24 can be used to minimize l(θ) over θ. Once the maximum number of iterations has been reached, a final estimate of the initial value is obtained. We refer readers to Ref. 21 for a detailed description of the training of the neural networks in the algorithm.

Traditional numerical methods represent functions using polynomials or other basis functions, which leads to high complexity. In contrast, the BSDE algorithm uses compositions of simple functions in the neural network, which leads to lower computational cost. This makes the numerical approximation of higher-dimensional PDEs possible, whereas traditional numerical techniques are unable to produce any solution within reasonable time and storage limits on computers.

4. Numerical experiments

In this section, we extended the solver so that it works for one-dimensional problems. Additionally, a comparison of the BSDE solver with traditional numerical techniques in low-dimensional spaces is presented through five classical benchmark differential equation problems: 1D and 2D heat equations with a terminal condition, a diffusion–reaction equation in 2D and in dimensions up to 100, nonlinear Black–Scholes equations in 2D, and an Allen–Cahn equation in 2D.

• In Examples 1 and 2, we were able to derive analytical solutions to the 1D and 2D heat equations, so that the performance of the BSDE solver can be examined by comparing the numerical solution with the analytical solution.
• In Example 3, we examined a diffusion–reaction equation in 2D and in dimensions up to 100. We also compared the performance of the solver with a radial basis function collocation method reported in Ref. 25.
• In Example 4, the focus is a Black–Scholes equation, where the performance of the solver is closely examined by considering


different locations in the domain, the number of iterations in the neural network, and the number of time steps in the time discretization.
• In Example 5, an Allen–Cahn equation in 2D is considered.

In all the numerical experiments, spatial domains have been discretized and approximate solutions at several random points have been obtained using the BSDE solver. Due to the stochastic nature of the solver, it is tested 5 times on all spatial points of interest. Averages over the 5 runs are reported in the tables, and standard deviations are also presented in the figures. All simulations are performed on a MacBook Pro with a 2.4 GHz Quad-Core Intel Core i5 processor and 16 GB of memory.

Example 1. We consider the following 1D heat equation with terminal condition26:

u_t(t, x) + ½ ∂²u/∂x²(t, x) = 0, ∀(t, x) ∈ [0, T) × ℝ,
u(T, x) = χ_{[1,∞)}(x), ∀x ∈ ℝ,   (4.1)

where χ_{[1,∞)} is the indicator function of [1, ∞), i.e., χ_{[1,∞)}(x) = 1 if x ≥ 1 and 0 otherwise. It can be shown that there exists at most one strong-viscosity solution to the equation above. The solution is given by

u(t, x) = 1 − Φ((1 − x)/√(T − t)), (t, x) ∈ [0, T] × ℝ,   (4.2)

where Φ(z) = ∫_{−∞}^{z} (1/√(2π)) e^{−s²/2} ds.

Table 1 shows a comparison between the analytical solution and the approximated solutions at different spatial locations: x = −1, 0, 1, 2, and 3. Absolute errors and relative errors are shown, where T = 1.0, the number of iterations in the deep learning network is 1000, and the number of time steps is 30.

Table 1
Example 1: Comparison of analytical and approximated solutions at different spatial locations: x = −1, 0, 1, 2, and 3. Absolute errors and relative errors are shown in the table, where T = 1.0, the number of iterations in the deep learning network is 1000, and the number of time steps is 30.

x    BSDE          u(T, x)       Absolute error  Relative error
−1   2.554 × 10⁻²  2.275 × 10⁻²  2.790 × 10⁻³    0.123
0    0.147         0.159         1.137 × 10⁻²    7.163 × 10⁻²
1    0.482         0.5           1.825 × 10⁻²    3.788 × 10⁻²
2    0.800         0.841         4.174 × 10⁻²    5.22 × 10⁻²
3    0.988         0.977         1.091 × 10⁻²    1.104 × 10⁻²

The left of Fig. 2 shows the profile of the approximated solution at t = 0, x = 0 as a function of the number of iterations. When the number of iterations is higher than 200, the approximated solutions start to converge to the true solution. The line represents the average values over 5 runs when the number of iterations is fixed, and the shaded blue region around the line demonstrates the standard deviation of the approximation for that iteration count. The right of Fig. 2 shows the profile of the approximated solution versus the analytical solution as a function of x at t = 0.

Example 2. We consider the following 2D heat equation with terminal condition:

u_t(t̃, x̃) + ½ Δu(t̃, x̃) = 0, ∀(t̃, x̃ = (x̃₁, x̃₂)) ∈ [0, T) × ℝ²,
u(T, x̃) = χ_{[1,∞)×[1,∞)}(x̃), ∀x̃ ∈ ℝ².   (4.3)

We will first derive the analytical solution of the system above. A change of variables is performed to transform the terminal condition problem into an initial value problem. Let x₁ = x̃₁ − 1, x₂ = x̃₂ − 1, and t = T − t̃; then the system above becomes

u_t(t, x) = ½ Δu(t, x), ∀(t, x) ∈ [0, T) × ℝ²,
u(0, x) = g(x) = χ_{[0,∞)×[0,∞)}(x), ∀x ∈ ℝ².   (4.4)

The solution to the initial value problem27 is

u(t, x) = ∬ g(k, h) (1/(2πt)) e^{−[(x₁−k)² + (x₂−h)²]/(2t)} dh dk.   (4.5)

Imposing the initial condition g and applying Fubini's theorem, we arrive at

u(t, x) = (1/(2πt)) (∫₀^∞ e^{−(x₁−h)²/(2t)} dh)(∫₀^∞ e^{−(x₂−k)²/(2t)} dk).

Let z = (h − x₁)/√(2t) and w = (k − x₂)/√(2t), and note that ∫₀ᶻ e^{−s²} ds = (√π/2) erf(z). The solution can be rewritten as follows:

u(t, x) = (1/π) ∫_{−x₁/√(2t)}^{∞} e^{−z²} dz ∫_{−x₂/√(2t)}^{∞} e^{−w²} dw
        = (1/π) (∫₀^∞ e^{−z²} dz + ∫₀^{x₁/√(2t)} e^{−z²} dz)(∫₀^∞ e^{−w²} dw + ∫₀^{x₂/√(2t)} e^{−w²} dw)
        = (1/π) [√π/2 + (√π/2) erf(x₁/√(2t))] [√π/2 + (√π/2) erf(x₂/√(2t))]
        = ½[1 + erf(x₁/√(2t))] ⋅ ½[1 + erf(x₂/√(2t))]
        = Φ(x₁/√t) Φ(x₂/√t),   (4.6)

where Φ(x) = ½[1 + erf(x/√2)]. Substituting in the original variables, we arrive at the solution of the original terminal condition problem:

u(t̃, x̃) = Φ((x̃₁ − 1)/√(T − t̃)) Φ((x̃₂ − 1)/√(T − t̃)).   (4.7)

The profile of our analytical solution is shown on the left of Fig. 3. The right of Fig. 3 shows the absolute error surface when T = 1, N = 1000, and 30 time steps are used. A detailed comparison of the approximated solution by the BSDE solver and the analytical solution computed at several locations of the domain can be found in Table 2.

Table 2
Example 2: Comparison of the approximated solution by the BSDE solver and the analytical solution. Absolute errors and relative errors are shown in the table, where T = 1.0, the number of iterations in the deep learning network is N = 1000, and 30 time steps are used.

(x₁, x₂)  BSDE          u(T, x₁, x₂)  Absolute error  Relative error
(−1,−1)   8.570 × 10⁻⁴  5.176 × 10⁻⁴  3.394 × 10⁻⁴    0.656
(−1,0)    3.423 × 10⁻³  3.609 × 10⁻³  1.860 × 10⁻⁴    5.155 × 10⁻²
(−1,1)    1.019 × 10⁻²  1.138 × 10⁻²  1.188 × 10⁻³    0.104
(0,0)     2.547 × 10⁻²  2.517 × 10⁻²  2.936 × 10⁻⁴    1.166 × 10⁻²
(0,1)     8.240 × 10⁻²  7.933 × 10⁻²  3.070 × 10⁻³    3.870 × 10⁻²
(0,2)     0.129         0.133         4.177 × 10⁻³    3.129 × 10⁻²
(0,3)     0.153         0.155         2.340 × 10⁻³    1.509 × 10⁻²
(1,1)     0.249         0.25          9.973 × 10⁻⁴    3.989 × 10⁻³
(1,2)     0.422         0.421         8.990 × 10⁻⁴    2.137 × 10⁻³
(1,3)     0.490         0.487         1.252 × 10⁻³    2.562 × 10⁻³
(2,2)     0.707         0.708         9.257 × 10⁻⁴    1.308 × 10⁻³
(2,3)     0.820         0.822         1.767 × 10⁻³    2.149 × 10⁻³
(3,3)     0.956         0.955         1.104 × 10⁻³    1.156 × 10⁻³

The left of Fig. 4 shows the absolute error calculated at (0, 0), and the right shows the profile of the approximate solution at (0, 0) against the number of iterations, with the standard deviation shown as the shaded region around the curves.

Compared to the 1D heat equation in Example 1, the approximated solution of the 2D heat equation experienced less deviation from the mean


value than in 1D. This indicates that the solver performs more stably as the dimension gets higher.

Fig. 2. Example 1: Left is the profile of the approximated solution at t = 0, x = 0 as a function of the number of iterations; the line represents the average values over 5 runs when the number of iterations is fixed, and the shaded blue region around the line demonstrates the standard deviation of the approximation for that iteration count. Right is the analytical solution vs. the approximated solution as a function of x at t = 0.

Fig. 3. Example 2: On the left, a profile of the analytical solution; on the right, a profile of the absolute error surface when T = 1.0, N = 1000, and 30 time steps are used.

Fig. 4. Example 2: Left: Absolute error at (0, 0). Right: Approximate solution at (0, 0) with standard deviations shown as the shaded region around the mean curves.

Example 3. In this example, we consider the following 2D problem:

u_t(t, x) = 0.2 Δu(t, x) + 0.1 u(t, x)   (4.8)

subject to the initial condition u(0, x) = cos(x₁) + sin(x₂). This problem has an exact analytical solution:

u(t, x) = exp(−0.1t) [cos(x₁) + sin(x₂)].   (4.9)

By considering the time reversal t → T − t for some T > 0, we can obtain an equation in the form of (1.1):

u_t(t, x) + 0.2 Δu(t, x) + 0.1 u(t, x) = 0.   (4.10)

This matches the semilinear parabolic form with σ = √0.4 I₂, μ = 0, and f(t, x) = 0.1 u(t, x), where I₂ is the identity matrix of size 2 × 2. We have the terminal condition g(x) = u(T, x) = cos(x₁) + sin(x₂).

In our numerical experiments, a 6 × 6 grid of points was tested over the domain [0, 2π] × [0, 2π], using a total of 5000 iterations in the neural network with a learning rate of 0.01 for the first half and 0.0002 for


the second half. In two separate rounds of testing, 10 and 100 time steps were used with T = 0.1.

Fig. 5. Example 3: The exact solution profile on the left and the absolute error profile on the right.

Fig. 6. Example 3: Left: The approximate value of u(0, 0) at different numbers of iterations of the Deep BSDE solver. Right: The absolute errors of the Deep BSDE solver at various numbers of iterations.

Fig. 7. Example 3: The value of u(t, 0) for various times t. The BSDE solver results are in blue, while the exact solution is in red. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

The surface of the exact solution in 2D and the surface of the absolute error in 2D obtained by the BSDE solver are shown on the left and right of Fig. 5, respectively. To ascertain the impact of the number of iterations used, five independent runs were completed for several total iteration counts. The average approximate solutions and approximation errors at u(t = 0, x = 0) are plotted in Fig. 6. The shaded regions around the lines demonstrate the standard deviation of the approximation and of the error for the particular iteration count. It is clear from Fig. 6 that, as the number of iterations increases, the approximate solution converges to an asymptotic value equal to the analytical solution, and the absolute error decreases with an increasing number of iterations. The time profile of the 2D solution was obtained by running the solver with different end conditions, which can be found in Fig. 7, when the number of iterations in the deep learning network is sufficient.

In Table 3, we compare the performance of the BSDE solver with the RBFCM in Ref. 28. The accuracy of the results from the BSDE solver did not improve substantially when the time step size was reduced, while the time required to perform the approximation increased greatly. Compared to the RBFCM, the BSDE solver with the aforementioned parameters produces results more accurate than those obtained by the RBFCM. It should be noted that the computational time is high, requiring several minutes per point of interest when 100 time steps are used. Even when 10 time steps are used, the computation time is still substantial.

To further evaluate the performance of the Deep BSDE solver, the previous equation was considered in d dimensions, with d = 5, 10, 20, 50, 100, 200, and 500:

u_t(t, x) = 0.2 Δu(t, x) + 0.1 u(t, x),

subject to u(0, x) = g(x) = Σᵢ₌₁ᵈ cos(xᵢ). This has the solution

u(t, x) = exp(−0.1t) Σᵢ₌₁ᵈ cos(xᵢ).

Table 4 shows the approximated solutions at the origin for various d. For each value of d, the approximations are calculated by averaging


10 independent trials. In running the trials, a total time of T = 0.1 was considered, split into 20 time steps. A total of 3000 iterations was considered for each trial, with a learning rate of 0.5 for the first half and 0.0001 for the second half. The BSDE solver is fairly accurate regardless of the dimension of the problem. However, the relative error when d = 500 is slightly higher than when the number of dimensions is small.

Table 3
Example 3: Comparison of the BSDE solution with different time step sizes and with the RBFCM presented in Ref. 28. Decreasing the step size by a factor of 10 had minimal impact on the error while increasing the time greatly.

Δt          BSDE Solver                            RBFCM28
            L∞            Lrms          CPU time   L∞           Lrms
Δt = 10⁻²   3.002 × 10⁻⁴  1.598 × 10⁻⁴  38         8.31 × 10⁻³  2.45 × 10⁻³
Δt = 10⁻³   3.456 × 10⁻⁴  1.340 × 10⁻⁴  408.5      3.33 × 10⁻³  2.45 × 10⁻³

Table 4
Example 3: The approximated solutions at the origin, errors, and CPU time for various d = 5, 10, 20, 50, 100, 200, and 500, where T = 0.1. 3000 iterations are used in the deep learning network; the learning rate is set to 0.5 for the first half of the iterations to speed up computation, and to 0.0001 for the second half.

d               5        10       20       50       100      200      500
Y₀ (BSDE)       4.951    9.900    19.80    49.50    99.00    198.0    479.0
Y₀ (exact)      4.951    9.901    19.80    49.50    99.01    198.0    495.0
Absolute error  2.63E−4  4.82E−4  2.70E−3  2.30E−3  6.74E−4  5.47E−4  1.60E1
Relative error  0.0053%  0.0049%  0.0136%  0.0046%  0.0007%  0.0003%  3.23%
CPU Time        50.8     51.4     53.4     61.4     75.8     109.1    220

Example 4. Let us consider a model of pricing with default risk on the domain [0, T] × ℝ². Let R be the interest rate of the risk-free asset and Q be an intensity function defined on three regions (vʰ < vˡ, γʰ > γˡ), given by

Q(y) = 1_{(−∞,vʰ)}(y) γʰ + 1_{[vˡ,∞)}(y) γˡ + 1_{[vʰ,vˡ)}(y) [((γʰ − γˡ)/(vʰ − vˡ))(y − vʰ) + γʰ],   (4.11)

f(t, x, u(t, x), σᵀ(t, x)∇u(t, x)) = −(1 − δ) Q(u(t, x)) u(t, x) − R u(t, x),   (4.12)

for any δ ∈ (0, 1], where u(t, x) is the value process. The nonlinear Black–Scholes equation can then be formed as follows:

∂u/∂t + μ x ⋅ ∇u(t, x) + (σ²/2) [|x₁|² ∂²u/∂x₁²(t, x) + |x₂|² ∂²u/∂x₂²(t, x)] − (1 − δ) Q(u(t, x)) u(t, x) − R u(t, x) = 0.   (4.13)

In mathematical finance, the Black–Scholes equation29 is a partial differential equation that governs the price evolution of European call or European put options. Finding the analytic closed-form solution of the Black–Scholes equation is not easy; therefore, it is necessary to approximate the solutions using numerical methods. Finite difference methods (FDM) such as the operator splitting method30 and the multigrid method31 are very popular for approximating the solution of the BS equations.32 In 2018, Grohs33 showed that artificial neural networks overcome the curse of dimensionality in the numerical approximation of Black–Scholes partial differential equations.

To verify the accuracy of the BSDE solver in lower dimensions, we approximate the initial value u(t = 0, (x₁, x₂)), with (x₁, x₂) ∈ [0, 100]², using the BSDE solver and compare with approximations computed by means of the multilevel Picard method.34 We set all the known parameters as follows:

T = 1, δ = 2/3, R = 0.02, μ = 0.02, σ = 0.2, vʰ = 50, vˡ = 70, γʰ = 0.2, γˡ = 0.02,   (4.14)

and terminal condition g(x) = min{x₁, x₂}. Thus, the nonlinear Black–Scholes equation simplifies to:

∂u/∂t + 0.02 x ⋅ ∇u(t, x) + 0.02 [|x₁|² ∂²u/∂x₁²(t, x) + |x₂|² ∂²u/∂x₂²(t, x)] − ⅓ Q(u(t, x)) u(t, x) − 0.02 u(t, x) = 0.   (4.15)

Table 5 shows the approximations of u(t = 0) computed at different locations of the domain, and the left of Fig. 9 shows the corresponding solution profile. According to Fig. 8, the solver seems to converge to a solution when it is allowed to run for a higher number of iterations. Moreover, we computed the approximations using the multilevel Picard method for the exact same (x₁, x₂) as in Table 5 and present them in Table 6. The accuracy of the BSDE solver in lower dimensions is evident from the errors computed in Table 6 and the absolute error profile on the right of Fig. 9.

Table 5
Example 4: Approximation of u(t = 0, x = (x₁, x₂)) at different values of x₁ and x₂.

x₂∖x₁  0              20      40           60           80      100
0      2.41 × 10⁻⁴    0.0632  9.67 × 10⁻³  2.24 × 10⁻⁴  0.0183  0.0075
20     0.03           16.596  18.811       18.835       18.176  18.722
40     3.44 × 10⁻³    18.583  33.199       37.194       37.43   37.472
60     4.25 × 10⁻³    18.599  36.805       50.412       56.108  57.428
80     3.3942 × 10⁻²  18.605  37.234       56.083       70.096  76.269
100    3.6747 × 10⁻²  18.606  37.274       57.537       76.224  88.16

From the left of Fig. 10 it is clear that the solutions from the BSDE solver converge for any given point (x₁, x₂) if the number of iterations in the neural network is high enough. The right of Fig. 10 shows that the BSDE solver can obtain an approximation that is accurate up to fourth order by increasing the number of iterations in the neural network. The approximate solutions of u(t = 0, x = (100, 100)) are reported at different numbers of time steps for 1000 and 8000 iterations in Table 7. The approximations vary significantly as the number of time steps changes when the number of iterations in the network is not high enough to achieve convergence of the solution. The not explicitly known ‘‘exact’’ solution at t = 0, x = (100, 100) has been approximately computed using the multilevel Picard method35: u(t = 0, x = (100, 100)) ≈ 88.16.

Example 5. We consider the following Allen–Cahn equation in 2D on the domain [0, 10] × [0, 10]:

u_t(t, x) = Δu(t, x) + u(t, x) − u(t, x)³   (4.16)

subject to the initial condition u(0, x) = 1/(2 + 0.4‖x‖²).

By considering the time reversal t → T − t for some T > 0, we can obtain an equation as expressed in (1.1):

u_t(t, x) + Δu(t, x) + u(t, x) − u(t, x)³ = 0.   (4.17)

This matches the semilinear parabolic form with σ = √2 I₂, μ = 0, and f(t, x) = u(t, x) − u(t, x)³.

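As a concrete illustration, the coefficients of Example 5 can be bundled into a small problem definition in the generic form of Eq. (1.1). The class and method names below are our own illustration of how such coefficients plug into a solver, not the interface of the published code, and the default horizon T is an arbitrary placeholder:

```python
import numpy as np

class AllenCahn2D:
    """Example-5 coefficients written in the generic form of Eq. (1.1)."""

    def __init__(self, T=0.3, d=2):        # T is an arbitrary placeholder
        self.T, self.d = T, d
        self.sigma = np.sqrt(2.0) * np.eye(d)   # sigma = sqrt(2) I_2

    def mu(self, t, x):
        return np.zeros_like(x)                 # mu = 0

    def f(self, t, x, u, z):
        return u - u ** 3                       # nonlinearity f = u - u^3

    def g(self, x):
        # terminal condition after time reversal: g(x) = 1 / (2 + 0.4 |x|^2)
        return 1.0 / (2.0 + 0.4 * np.sum(x ** 2, axis=-1))

p = AllenCahn2D()
print(p.g(np.zeros(2)))  # 0.5 at the origin
```

Examples 3 and 4 can be expressed in exactly the same way by swapping in their own σ, μ, f, and g.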

Table 6
Example 4: Relative and absolute errors of 𝑢(𝑡 = 0, 𝑥 = (𝑥1, 𝑥2)), for each 𝑥1, 𝑥2 ∈ {0, 20, 40, 60, 80, 100}, compared with approximations made with the multilevel Picard method. Relative errors are omitted (–) where the Picard reference value is 0.

𝑥2∖𝑥1                   0            20         40         60         80         100
0    𝑌0 (Picard)        0            0          0          0          0          0
     Relative error     –            –          –          –          –          –
     Absolute error     2.41×10⁻⁴    0.0632     9.67×10⁻³  2.24×10⁻⁴  0.0183     0.0075
20   𝑌0 (Picard)        0            16.5963    18.8110    18.8354    18.8356    18.8354
     Relative error     –            0.01%      0.58%      0.64%      0.64%      0.6%
     Absolute error     0.03         0.002137   0.1095     0.121      0.1199     0.1133
40   𝑌0 (Picard)        0            18.5834    33.1986    37.1937    37.611     37.6367
     Relative error     –            0.61%      0.02%      0.53%      0.48%      0.44%
     Absolute error     3.44×10⁻³    0.1142     0.0054     0.1962     0.1813     0.1643
60   𝑌0 (Picard)        0            18.5994    36.8051    50.4123    56.2028    57.6156
     Relative error     –            0.63%      0.63%      0.04%      0.17%      0.33%
     Absolute error     4.25×10⁻³    0.1165     0.2324     0.0187     0.0944     0.1873
80   𝑌0 (Picard)        0            18.6048    37.2338    56.0825    70.1898    76.3015
     Relative error     –            0.6%       0.49%      0.02%      0.13%      0.04%
     Absolute error     0.0339       0.1118     0.1821     0.0131     0.0935     0.0321
100  𝑌0 (Picard)        0            18.6059    37.2738    57.569     76.1515    88.16
     Relative error     –            0.6%       0.45%      0.06%      0.09%      0.0001%
     Absolute error     0.03674      0.1109     0.1674     0.0318     0.072      9.83×10⁻⁵
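The error entries in Table 6 follow the usual definitions of absolute and relative error. A minimal sketch (the function name is ours), which also shows why the relative error is undefined whenever the Picard reference value is 0:

```python
def errors(approx, reference):
    """Absolute and relative error of an approximation against a reference value.
    The relative error is undefined (None) when the reference is zero, which is
    why those entries are blank in the first row and column of Table 6."""
    abs_err = abs(approx - reference)
    rel_err = abs_err / abs(reference) if reference != 0 else None
    return abs_err, rel_err

# e.g. the (x1, x2) = (20, 20) cell: Picard reference 16.5963, absolute error 0.002137
abs_err, rel_err = errors(16.5963 + 0.002137, 16.5963)
```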
Fig. 8. Example 4: Approximations of 𝑢(𝑡 = 0) at (100, 20) and (80, 100) using the BSDE solver for the Black–Scholes equation, against the number of iterations in the neural network.
Fig. 9. Example 4: Left: Approximate solution 𝑢(𝑡 = 0, 𝑥 = (𝑥1, 𝑥2)) profile based on Table 5. Right: Absolute error compared to the approximation of 𝑢 obtained by the multilevel Picard method.
Table 7
Example 4: Comparison of approximations of 𝑢(𝑡 = 0, 𝑥 = (100, 100)) as a mean of 5 independent runs, for different numbers of time steps and for 1000 and 8000 iterations.

# of time steps    10       20       40       60       100
1000 iterations    35.165   37.725   37.605   35.864   36.781
8000 iterations    87.631   88.057   87.269   87.180   87.530

The Allen–Cahn equation was first introduced by Allen and Cahn in Ref. 36 to describe the motion of anti-phase boundaries in crystalline solids. It has since been widely used in many fields.36–39 Numerical methods,40 including the finite difference method,41 reduced order methods,42,43 spectral methods,40,44 the reproducing kernel method,45 the radial basis function collocation method,46 and many others, have been investigated.

In this example, the Allen–Cahn equation was explored using both the BSDE solver and the RBF solver. Solution surfaces for the two methods are shown in Fig. 11. Fig. 12 shows the absolute error of the BSDE solver, using the RBF solution as the reference; the error is viewed from two different angles.
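As a sketch of one such traditional baseline, the 2D Allen–Cahn equation (4.16) can be advanced with an explicit finite difference scheme: forward Euler in time with the standard 5-point Laplacian. The grid resolution, time step, and homogeneous Neumann boundary treatment below are our assumptions; this is not the RBF solver used in the comparison:

```python
import numpy as np

def allen_cahn_fd(n=21, T=0.5, dt=0.01):
    """Explicit FD solve of u_t = Lap(u) + u - u^3 on [0, 10]^2 up to time T."""
    xs = np.linspace(0.0, 10.0, n)
    h = xs[1] - xs[0]
    X1, X2 = np.meshgrid(xs, xs, indexing="ij")
    u = 1.0 / (2.0 + 0.4 * (X1**2 + X2**2))   # initial condition of Example 5
    for _ in range(int(round(T / dt))):
        # homogeneous Neumann boundaries via edge replication
        up = np.pad(u, 1, mode="edge")
        lap = (up[2:, 1:-1] + up[:-2, 1:-1] + up[1:-1, 2:] + up[1:-1, :-2]
               - 4.0 * u) / h**2
        u = u + dt * (lap + u - u**3)          # forward Euler step
    return u

u_half = allen_cahn_fd()   # approximation of u(t = 0.5, x1, x2)
```

Note the explicit scheme requires dt ≲ h²∕4 for stability; the values above satisfy this.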
Fig. 10. Example 4: Left shows an approximation of 𝑢 at 𝑡 = 0 as a mean of 5 independent runs against the number of iterations in the network, for different points in the domain. Right shows the relative approximation error of 𝑢(𝑡 = 0, 𝑥 = (100, 100)) against the number of iterations in the network. The shaded area depicts the mean ± the SD of the relative error for five different runs.
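Values such as those in Table 7 are reported as means of independent runs, and the shaded band in Fig. 10 spans the mean ± one standard deviation. A minimal sketch with hypothetical run values:

```python
import statistics

# hypothetical approximations of u(t = 0, x = (100, 100)) from 5 independent runs
runs = [87.6, 88.1, 87.3, 87.2, 87.5]

mean = statistics.mean(runs)   # the value reported in the tables
sd = statistics.stdev(runs)    # sample standard deviation across runs
band = (mean - sd, mean + sd)  # the shaded region in Fig. 10
```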
Fig. 11. Example 5: Approximate solutions of 𝑢(𝑥1, 𝑥2) at 𝑡 = 0.5 in a square domain, using the BSDE method on the left and the RBF method on the right.
Fig. 12. Example 5: The approximation error of 𝑢 at 𝑡 = 0.5 in two different views.
5. Conclusions

The backward stochastic differential equation solver solves semilinear parabolic partial differential equations by converting them into stochastic differential equations. This algorithm is especially powerful for solving high-dimensional PDEs, which traditional numerical techniques cannot handle.

This paper modified the BSDE solver so it can be used for one-dimensional problems. Furthermore, we compared the BSDE solver with traditional numerical techniques in low-dimensional spaces (1D and 2D). Through classical differential equations, we found that the solver works well for low-dimensional problems, and is as accurate as it is in high-dimensional spaces. However, the BSDE solver may take much longer than traditional methods, since it approximates the solution at only one point at a time, and the approximation should be computed as the mean of several independent runs of the algorithm.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References

1. Evans LC. Partial Differential Equations, Vol. 19. American Mathematical Society; 2010.
2. Morton KW, Mayers DF. Numerical Solution of Partial Differential Equations: An Introduction. Cambridge University Press; 2005.
3. Lapidus L, Pinder GF. Numerical Solution of Partial Differential Equations in Science and Engineering. John Wiley & Sons; 2011.
4. Farshid M, Shadi R, Nasrin S. Application of combination schemes based on radial basis functions and finite difference to solve stochastic coupled nonlinear time fractional sine–Gordon equations. Comput Appl Math. 2021;41(10).
5. Cheng L. KdV-type Wronskian rational solutions to the (4+1)-dimensional Fokas equation. Partial Differ Equ Appl Math. 2022;5:100222.
6. Kloeden PE, Platen E. Stochastic differential equations. In: Numerical Solution of Stochastic Differential Equations. Springer; 1992:103–160.
7. Platen E. An introduction to numerical methods for stochastic differential equations. Acta Numer. 1999;8:197–246.
8. Burrage K, Burrage P, Mitsui T. Numerical solutions of stochastic differential equations: implementation and stability issues. J Comput Appl Math. 2000;125(1–2):171–182.
9. Higham DJ. An algorithmic introduction to numerical simulation of stochastic differential equations. SIAM Rev. 2001;43(3):525–546.
10. Mirzaee F, Sayevand K, Rezaei S, Samadyar N. Finite difference and spline approximation for solving fractional stochastic advection-diffusion equation. Iran J Sci Technol Trans A Sci. 2021;45(2):607–617.
11. Mirzaee F, Samadyar N. Implicit meshless method to solve 2D fractional stochastic Tricomi-type equation defined on irregular domain occurring in fractal transonic flow. Numer Methods Partial Differential Equations. 2021;37(2):1781–1799.
12. Yan Y. Galerkin finite element methods for stochastic parabolic partial differential equations. SIAM J Numer Anal. 2005;43(4):1363–1384.
13. Mirzaee F, Rezaei S, Samadyar N. Solving one-dimensional nonlinear stochastic Sine-Gordon equation with a new meshfree technique. Int J Numer Modell: Electron Netw. 2020;34.
14. Mirzaee F, Samadyar N. Application of Bernoulli wavelet method for estimating a solution of linear stochastic Itô-Volterra integral equations. Multidiscip Model Mater Struct. 2018;15(3):575–598.
15. Mirzaee F, Samadyar N. Combination of finite difference method and meshless method based on radial basis functions to solve fractional stochastic advection–diffusion equations. Eng Comput. 2020;36(4):1673–1686.
16. Samadyar N, Ordokhani Y, Mirzaee F. Hybrid Taylor and block-pulse functions operational matrix algorithm and its application to obtain the approximate solution of stochastic evolution equation driven by fractional Brownian motion. Commun Nonlinear Sci Numer Simul. 2020;90:105346.
17. Berg J, Nyström K. A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing. 2018;317:28–41.
18. Han J, Zhang L, Weinan E. Solving many-electron Schrödinger equation using deep neural networks. J Comput Phys. 2019;399:108929.
19. Han J, Long J. Convergence of the deep BSDE method for coupled FBSDEs. Probab Uncertain Quant Risk. 2020;5(1):1–33.
20. Lye KO, Mishra S, Ray D. Deep learning observables in computational fluid dynamics. J Comput Phys. 2020;410:109339.
21. Weinan E, Han J, Jentzen A. Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Commun Math Stat. 2017;5(4):349–380.
22. Øksendal B. Stochastic Differential Equations: An Introduction with Applications, Vol. 82. 2000.
23. Pardoux É, Peng S. Backward stochastic differential equations and quasilinear parabolic partial differential equations. In: Stochastic Partial Differential Equations and their Applications. Springer; 1992:200–217.
24. Kingma DP, Ba J. Adam: A method for stochastic optimization. In: International Conference on Learning Representations. San Diego: ICLR; 2015. abs/1412.6980.
25. Yao G. An improved localized method of approximate particular solutions for solving elliptic PDEs. Comput Math Appl. 2016;71(1):171–184.
26. Cosso A, Russo F. Strong-viscosity solutions: Classical and path-dependent PDEs. Osaka J Math. 2019;56(2):323–373.
27. Haberman R. Elementary Applied Partial Differential Equations, Vol. 987. Prentice Hall, Englewood Cliffs, NJ; 1983.
28. Yao G. A comparative study of global and local meshless methods for diffusion-reaction equation. Comput Model Eng Sci (CMES). 2010;59(2):127–154.
29. Black F, Scholes M. The pricing of options and corporate liabilities. In: World Scientific Reference on Contingent Claims Analysis in Corporate Finance: Volume 1: Foundations of CCA and Equity Valuation. World Scientific; 2019:3–21.
30. Ikonen S, Toivanen J. Operator splitting methods for American option pricing. Appl Math Lett. 2004;17(7):809–814.
31. Ramage A, von Sydow L. A multigrid preconditioner for an adaptive Black-Scholes solver. BIT Numer Math. 2011;51(1):217–233.
32. Duffy DJ. Finite Difference Methods in Financial Engineering: A Partial Differential Equation Approach. John Wiley & Sons; 2013.
33. Grohs P, Hornung F, Jentzen A, Von Wurstemberger P. A proof that artificial neural networks overcome the curse of dimensionality in the numerical approximation of Black-Scholes partial differential equations. 2018. arXiv, abs/1809.02362.
34. Hutzenthaler M, Jentzen A, Kruse T, et al. On multilevel Picard numerical approximations for high-dimensional nonlinear parabolic partial differential equations and high-dimensional nonlinear backward stochastic differential equations. J Sci Comput. 2019;79(3):1534–1571.
35. Weinan E, Hutzenthaler M, Jentzen A, Kruse T. On multilevel Picard numerical approximations for high-dimensional nonlinear parabolic partial differential equations and high-dimensional nonlinear backward stochastic differential equations. J Sci Comput. 2019;79(3):1534–1571.
36. Allen SM, Cahn JW. A microscopic theory for antiphase boundary motion and its application to antiphase domain coarsening. Acta Metall. 1979;27(6):1085–1095.
37. Li Y, Jeong D, Choi J-i, Lee S, Kim J. Fast local image inpainting based on the Allen–Cahn model. Digit Signal Process. 2015;37:65–74.
38. Cheng M, Warren JA. An efficient algorithm for solving the phase field crystal model. J Comput Phys. 2008;227(12):6241–6248.
39. Ward MJ. Metastable bubble solutions for the Allen-Cahn equation with mass conservation. SIAM J Appl Math. 1996;56(5):1247–1279.
40. Shen J, Yang X. Numerical approximations of Allen-Cahn and Cahn-Hilliard equations. Discrete Contin Dyn Syst. 2010;28(4):1669–1691.
41. Kim J, Jeong D, Yang S-D, Choi Y. A finite difference method for a conservative Allen–Cahn equation on non-flat surfaces. J Comput Phys. 2017;334:170–181.
42. Song H, Jiang L, Li Q. A reduced order method for Allen–Cahn equations. J Comput Appl Math. 2016;292:213–229.
43. Abbaszadeh M, Dehghan M. A reduced order finite difference method for solving space-fractional reaction-diffusion systems: The Gray-Scott model. Eur Phys J Plus. 2019;134(12):1–15.
44. Lee HG, Lee J-Y. A semi-analytical Fourier spectral method for the Allen–Cahn equation. Comput Math Appl. 2014;68(3):174–184.
45. Niu J, Xu M, Yao G. An efficient reproducing kernel method for solving the Allen–Cahn equation. Appl Math Lett. 2019;89:78–84.
46. Hon Y-C, Mao X-Z. A radial basis function method for solving options pricing model. Financ Eng. 1999;81(1):31–49.