Deep neural network approach to forward-inverse problems

* Corresponding author: Hyung Ju Hwang

Abstract
In this paper, we construct approximate solutions of Differential Equations (DEs) using a Deep Neural Network (DNN). Furthermore, we present an architecture that includes the process of finding model parameters from experimental data, i.e., the inverse problem. That is, we provide a unified DNN framework that approximates an analytic solution and its model parameters simultaneously. The architecture consists of a feed-forward DNN with non-linear activation functions chosen according to the DEs, automatic differentiation [2], reduction of order, and a gradient-based optimization method. We also prove theoretically that the proposed DNN solution converges to an analytic solution in a suitable function space for fundamental DEs. Finally, we perform numerical experiments to validate the robustness of our simple DNN architecture on the 1D transport equation, the 2D heat equation, the 2D wave equation, and the Lotka-Volterra system.
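In the notation of Algorithm 1 below (with $ u_N $ the network, $ L_p $ the differential operator carrying the model parameter $ p $, $ f $ and $ g $ the initial and boundary data, and $ u_j $ the observed values), the simultaneous forward-inverse approximation amounts to minimizing a combined residual-plus-observation loss over both the network weights $ (w, b) $ and the parameter $ p $; a sketch, consistent with the gradient in Algorithm 1:

$\begin{equation} \nonumber \min_{w, b, p} \; \frac{1}{m} \sum\limits_{i = 1}^m \big[ L_p(u_N)(z^i)^2 + (u_N(z_I^i)-f(z_I^i))^2 + (u_N(z_B^i)-g(z_B^i))^2 \big] + \frac{1}{k}\sum\limits_{j = 1}^k (u_N(z_O^j)-u_j)^2 \end{equation}$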

    Mathematics Subject Classification: Primary: 58F15, 58F17; Secondary: 53C35.

  • Figure 1.  Network architecture

    Figure 2.  Experimental result for 1D transport equation

    Figure 3.  Experimental result for 2D heat equation with $ u(0,x,y) = x(1-x)y(1-y) $

    Figure 4.  Experimental result for 2D heat equation with $ u(0,x,y) = 1 $ if $ (x,y) \in \Omega $, $ 0 $ otherwise

    Figure 5.  Experimental result for 2D wave equation

    Figure 6.  Experimental result for Lotka-Volterra equation

    Figure 7.  Experimental result for CFL condition

    Algorithm 1: Training
    1: procedure train(number of epochs)
    2:   Initialize the neural network.
    3:   for number of epochs do
    4:     Sample $ z^1, z^2, \dots, z^m $ from the uniform distribution over $ \Omega $
    5:     Sample $ z_I^1, z_I^2, \dots, z_I^m $ from the uniform distribution over $ \{0\} \times \Omega $
    6:     Sample $ z_B^1, z_B^2, \dots, z_B^m $ from the uniform distribution over $ \partial\Omega $
    7:     Sample $ k $ observation points $ z_O^1, z_O^2, \dots, z_O^k $
    8:     Find the true values $ u_j = u_p(z_O^j) $ for $ j = 1, 2, \dots, k $
    9:     Update the neural network by descending its stochastic gradient:
    $\begin{equation} \nonumber \nabla_{w, b} \Big[ \frac{1}{m} \sum\limits_{i = 1}^m \big[ L_p(u_N)(z^i)^2 + (u_N(z_I^i)-f(z_I^i))^2 + (u_N(z_B^i)-g(z_B^i))^2 \big] + \frac{1}{k}\sum\limits_{j = 1}^k (u_N(z_O^j)-u_j)^2 \Big] \end{equation}$
    10:   end for
    11: end procedure
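A single loss evaluation of Algorithm 1 can be sketched as follows for the 1D transport equation $ u_t + c u_x = 0 $. The paper computes the derivatives in $ L_p(u_N) $ by automatic differentiation [2]; this dependency-free sketch approximates them by central finite differences instead, and all names (`u_N`, `loss`, the two-layer toy network) are ours, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(hidden=16):
    """A tiny fully connected network u_N(t, x) standing in for the paper's DNN."""
    return {
        "W1": rng.normal(0.0, 1.0, (2, hidden)), "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 1.0, (hidden, 1)), "b2": np.zeros(1),
    }

def u_N(params, z):
    """Evaluate the network on a batch z of (t, x) points, shape (n, 2) -> (n,)."""
    h = np.tanh(z @ params["W1"] + params["b1"])
    return (h @ params["W2"] + params["b2"]).ravel()

def loss(params, m=64, k=8, c=1.0, eps=1e-4):
    # Step 4: interior points z^i ~ Uniform over Omega = [0,1] x [0,1]
    z = rng.uniform(0.0, 1.0, (m, 2))
    # Step 5: initial points z_I^i ~ Uniform over {0} x [0,1]
    z_I = np.column_stack([np.zeros(m), rng.uniform(0.0, 1.0, m)])
    # Step 6: boundary points z_B^i on the inflow boundary x = 0
    z_B = np.column_stack([rng.uniform(0.0, 1.0, m), np.zeros(m)])
    # Steps 7-8: observation points with "true" values u_j,
    # here taken from the exact solution u(t, x) = sin(x - c t)
    z_O = rng.uniform(0.0, 1.0, (k, 2))
    u_obs = np.sin(z_O[:, 1] - c * z_O[:, 0])

    # Residual L_p(u_N) = u_t + c u_x via central finite differences
    # (the paper uses automatic differentiation here instead)
    dt, dx = np.array([eps, 0.0]), np.array([0.0, eps])
    u_t = (u_N(params, z + dt) - u_N(params, z - dt)) / (2 * eps)
    u_x = (u_N(params, z + dx) - u_N(params, z - dx)) / (2 * eps)
    residual = u_t + c * u_x

    f = np.sin(z_I[:, 1])        # initial condition f(x) = sin(x)
    g = np.sin(-c * z_B[:, 0])   # boundary condition g(t) = sin(-c t)
    return (np.mean(residual ** 2
                    + (u_N(params, z_I) - f) ** 2
                    + (u_N(params, z_B) - g) ** 2)
            + np.mean((u_N(params, z_O) - u_obs) ** 2))

params = init_params()
print(loss(params))  # a finite, non-negative scalar
```

In the paper, step 9 descends the gradient of this quantity with respect to the network weights (and the model parameter $ p $ in the inverse setting) using Adam [12]; here only the loss itself is shown.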

    Table 1.  Information of grid and observation points

    Data Generation
    Equation | Grid Range | Number of Grid Points | Number of Observations
    1D Transport | $ (t,x) \in [0,1]\times[0,1] $ | $ 17 \times 100 $ | 17
    2D Heat | $ (t,x,y) \in [0,1]\times[0,1]\times[0,1] $ | $ 100 \times 100 \times 100 $ | 13
    2D Wave | $ (t,x,y) \in [0,1]\times[0,1]\times[0,1] $ | $ 100 \times 100 \times 100 $ | 61
    Lotka-Volterra | $ t \in [0,100] $ | 20,000 | 40

    Table 2.  Neural network architecture

    Neural Network Architecture
    Equation | Fully Connected Layers | Activation Functions | Learning Rate
    1D Transport | 2 (input)-128-256-128-1 (output) | ReLU | $ 10^{-5} $
    2D Heat | 3 (input)-128-128-1 (output) | Sin, Sigmoid | $ 10^{-5} $
    2D Wave | 3 (input)-128-256-128-1 (output) | Sin, Tanh | $ 10^{-5} $
    Lotka-Volterra | 1 (input)-64-64-2 (output) | Sin | $ 10^{-4} $
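As a concrete reading of Table 2, the Lotka-Volterra row (layers 1-64-64-2, Sin activation) corresponds to the forward pass below: one scalar time input, two sine-activated hidden layers, and a linear two-component output (prey and predator populations). The weight initialization scale is our assumption; the paper does not specify one.

```python
import numpy as np

rng = np.random.default_rng(1)

# Layer widths from Table 2 for the Lotka-Volterra system: 1(input)-64-64-2(output)
sizes = [1, 64, 64, 2]
weights = [rng.normal(0.0, sizes[i] ** -0.5, (sizes[i], sizes[i + 1]))
           for i in range(len(sizes) - 1)]
biases = [np.zeros(sizes[i + 1]) for i in range(len(sizes) - 1)]

def forward(t):
    """Map a batch of times t, shape (n, 1), to two population predictions, shape (n, 2)."""
    h = t
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.sin(h @ W + b)               # Sin activation on hidden layers
    return h @ weights[-1] + biases[-1]     # linear output layer

t = np.linspace(0.0, 100.0, 5).reshape(-1, 1)  # times in the grid range of Table 1
print(forward(t).shape)  # (5, 2)
```

The other rows of Table 2 differ only in the `sizes` list and in the activation applied per hidden layer.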
    [1] W. Arloff, K. R. B. Schmitt and L. J. Venstrom, A parameter estimation method for stiff ordinary differential equations using particle swarm optimisation, Int. J. Comput. Sci. Math., 9 (2018), 419-432.  doi: 10.1504/IJCSM.2018.095506.
    [2] A. G. Baydin, B. A. Pearlmutter, A. A. Radul and J. M. Siskind, Automatic differentiation in machine learning: A survey, J. Mach. Learn. Res., 18 (2017), 43pp.
    [3] J. Berg and K. Nyström, Neural network augmented inverse problems for PDEs, preprint, arXiv: 1712.09685.
    [4] J. Berg and K. Nyström, A unified deep artificial neural network approach to partial differential equations in complex geometries, Neurocomputing, 317 (2018), 28-41.  doi: 10.1016/j.neucom.2018.06.056.
    [5] G. Chavent, Nonlinear Least Squares for Inverse Problems. Theoretical Foundations and Step-By-Step Guide for Applications, Scientific Computation, Springer, New York, 2009. doi: 10.1007/978-90-481-2785-6.
    [6] N. E. Cotter, The Stone-Weierstrass theorem and its application to neural networks, IEEE Trans. Neural Networks, 1 (1990), 290-295.  doi: 10.1109/72.80265.
    [7] R. Courant, K. Friedrichs and H. Lewy, On the partial difference equations of mathematical physics, IBM J. Res. Develop., 11 (1967), 215-234.  doi: 10.1147/rd.112.0215.
    [8] G. Cybenko, Approximation by superpositions of a sigmoidal function, Math. Control Signals Systems, 2 (1989), 303-314.  doi: 10.1007/BF02551274.
    [9] L. C. Evans, Partial Differential Equations, Graduate Studies in Mathematics, 19, American Mathematical Society, Providence, RI, 2010. doi: 10.1090/gsm/019.
    [10] G. E. Fasshauer, Solving partial differential equations by collocation with radial basis functions, Proceedings of Chamonix, 1997 (1996), 1-8. 
    [11] K. Hornik, M. Stinchcombe and H. White, Multilayer feedforward networks are universal approximators, Neural Networks, 2 (1989), 359-366.  doi: 10.1016/0893-6080(89)90020-8.
    [12] D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, preprint, arXiv: 1412.6980.
    [13] I. E. Lagaris, A. Likas and D. I. Fotiadis, Artificial neural networks for solving ordinary and partial differential equations, IEEE Trans. Neural Networks, 9 (1998), 987-1000.  doi: 10.1109/72.712178.
    [14] I. E. Lagaris, A. C. Likas and D. G. Papageorgiou, Neural-network methods for boundary value problems with irregular boundaries, IEEE Trans. Neural Networks, 11 (2000), 1041-1049.  doi: 10.1109/72.870037.
    [15] K. Levenberg, A method for the solution of certain non-linear problems in least squares, Quart. Appl. Math., 2 (1944), 164-168.  doi: 10.1090/qam/10666.
    [16] L. Jianyu, L. Siwei, Q. Yingjian and H. Yaping, Numerical solution of elliptic partial differential equation using radial basis function neural networks, Neural Networks, 16 (2003), 729-734.  doi: 10.1016/S0893-6080(03)00083-2.
    [17] J. Li and X. Li, Particle swarm optimization iterative identification algorithm and gradient iterative identification algorithm for Wiener systems with colored noise, Complexity, 2018 (2018), 8pp. doi: 10.1155/2018/7353171.
    [18] X. Li, Simultaneous approximations of multivariate functions and their derivatives by neural networks with one hidden layer, Neurocomputing, 12 (1996), 327-343.  doi: 10.1016/0925-2312(95)00070-4.
    [19] D. W. Marquardt, An algorithm for least-squares estimation of nonlinear parameters, J. Soc. Indust. Appl. Math., 11 (1963), 431-441.  doi: 10.1137/0111030.
    [20] W. S. McCulloch and W. Pitts, A logical calculus of the ideas immanent in nervous activity, Bull. Math. Biophys., 5 (1943), 115-133.  doi: 10.1007/BF02478259.
    [21] A. Paszke et al., Automatic differentiation in PyTorch, 2017.
    [22] M. Raissi, P. Perdikaris and G. E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comput. Phys., 378 (2019), 686-707.  doi: 10.1016/j.jcp.2018.10.045.
    [23] S. J. Reddi, S. Kale and S. Kumar, On the convergence of Adam and beyond, preprint, arXiv: 1904.09237.
    [24] S. A. Sarra, Adaptive radial basis function methods for time dependent partial differential equations, Appl. Numer. Math., 54 (2005), 79-94.  doi: 10.1016/j.apnum.2004.07.004.
    [25] P. Tsilifis, I. Bilionis, I. Katsounaros and N. Zabaras, Computationally efficient variational approximations for Bayesian inverse problems, J. Verif. Valid. Uncert., 1 (2016), 13pp. doi: 10.1115/1.4034102.
    [26] F. Yaman, V. G. Yakhno and R. Potthast, A survey on inverse problems for applied sciences, Math. Probl. Eng., 2013 (2013), 19pp. doi: 10.1155/2013/976837.
