June 2020, 15(2): 247-259. doi: 10.3934/nhm.2020011

Deep neural network approach to forward-inverse problems

Hyeontae Jo, Hwijae Son, Hyung Ju Hwang, Eun Heui Kim

1. Department of Mathematics, Pohang University of Science and Technology, South Korea
2. Department of Mathematics and Statistics, California State University Long Beach, USA

* Corresponding author: Hyung Ju Hwang

Received January 2020; Revised April 2020; Published April 2020

In this paper, we construct approximate solutions of differential equations (DEs) using deep neural networks (DNNs). Furthermore, we present an architecture that incorporates the inverse problem, that is, the estimation of model parameters from experimental data. In other words, we provide a unified DNN framework that approximates an analytic solution and its model parameters simultaneously. The architecture consists of a feed-forward DNN with nonlinear activation functions chosen according to the DEs, automatic differentiation [2], reduction of order, and a gradient-based optimization method. We also prove theoretically that the proposed DNN solution converges to the analytic solution in a suitable function space for fundamental DEs. Finally, we perform numerical experiments to validate the robustness of our simple DNN architecture on the 1D transport equation, the 2D heat equation, the 2D wave equation, and the Lotka-Volterra system.
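As an illustration of the main ingredient, the following is a minimal sketch (not the authors' code) of how automatic differentiation [2] yields exact derivatives of a network output, so that a PDE residual such as $ u_t + a u_x $ for the 1D transport equation can be penalized during training; the network size and the speed $ a = 1 $ are assumptions made purely for illustration.

    import torch

    # Feed-forward DNN mapping (t, x) to an approximate solution value u_N(t, x).
    net = torch.nn.Sequential(
        torch.nn.Linear(2, 64), torch.nn.Tanh(),
        torch.nn.Linear(64, 1),
    )
    t = torch.rand(32, 1, requires_grad=True)
    x = torch.rand(32, 1, requires_grad=True)
    u = net(torch.cat([t, x], dim=1))
    # Exact derivatives of the network output via automatic differentiation.
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    residual = u_t + 1.0 * u_x  # vanishes when u_N solves u_t + u_x = 0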

Citation: Hyeontae Jo, Hwijae Son, Hyung Ju Hwang, Eun Heui Kim. Deep neural network approach to forward-inverse problems. Networks & Heterogeneous Media, 2020, 15 (2) : 247-259. doi: 10.3934/nhm.2020011
References:

[1] W. Arloff, K. R. B. Schmitt and L. J. Venstrom, A parameter estimation method for stiff ordinary differential equations using particle swarm optimisation, Int. J. Comput. Sci. Math., 9 (2018), 419-432. doi: 10.1504/IJCSM.2018.095506.
[2] A. G. Baydin, B. A. Pearlmutter, A. A. Radul and J. M. Siskind, Automatic differentiation in machine learning: A survey, J. Mach. Learn. Res., 18 (2017), 43pp.
[3] J. Berg and K. Nyström, Neural network augmented inverse problems for PDEs, preprint, arXiv:1712.09685.
[4] J. Berg and K. Nyström, A unified deep artificial neural network approach to partial differential equations in complex geometries, Neurocomputing, 317 (2018), 28-41. doi: 10.1016/j.neucom.2018.06.056.
[5] G. Chavent, Nonlinear Least Squares for Inverse Problems: Theoretical Foundations and Step-by-Step Guide for Applications, Scientific Computation, Springer, New York, 2009. doi: 10.1007/978-90-481-2785-6.
[6] N. E. Cotter, The Stone-Weierstrass theorem and its application to neural networks, IEEE Trans. Neural Networks, 1 (1990), 290-295. doi: 10.1109/72.80265.
[7] R. Courant, K. Friedrichs and H. Lewy, On the partial difference equations of mathematical physics, IBM J. Res. Develop., 11 (1967), 215-234. doi: 10.1147/rd.112.0215.
[8] G. Cybenko, Approximation by superpositions of a sigmoidal function, Math. Control Signals Systems, 2 (1989), 303-314. doi: 10.1007/BF02551274.
[9] L. C. Evans, Partial Differential Equations, Graduate Studies in Mathematics, 19, American Mathematical Society, Providence, RI, 2010. doi: 10.1090/gsm/019.
[10] G. E. Fasshauer, Solving partial differential equations by collocation with radial basis functions, Proceedings of Chamonix, 1997 (1996), 1-8.
[11] K. Hornik, M. Stinchcombe and H. White, Multilayer feedforward networks are universal approximators, Neural Networks, 2 (1989), 359-366. doi: 10.1016/0893-6080(89)90020-8.
[12] D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, preprint, arXiv:1412.6980.
[13] I. E. Lagaris, A. Likas and D. I. Fotiadis, Artificial neural networks for solving ordinary and partial differential equations, IEEE Trans. Neural Networks, 9 (1998), 987-1000. doi: 10.1109/72.712178.
[14] I. E. Lagaris, A. C. Likas and D. G. Papageorgiou, Neural-network methods for boundary value problems with irregular boundaries, IEEE Trans. Neural Networks, 11 (2000), 1041-1049. doi: 10.1109/72.870037.
[15] K. Levenberg, A method for the solution of certain non-linear problems in least squares, Quart. Appl. Math., 2 (1944), 164-168. doi: 10.1090/qam/10666.
[16] L. Jianyu, L. Siwei, Q. Yingjian and H. Yaping, Numerical solution of elliptic partial differential equation using radial basis function neural networks, Neural Networks, 16 (2003), 729-734. doi: 10.1016/S0893-6080(03)00083-2.
[17] J. Li and X. Li, Particle swarm optimization iterative identification algorithm and gradient iterative identification algorithm for Wiener systems with colored noise, Complexity, 2018 (2018), 8pp. doi: 10.1155/2018/7353171.
[18] X. Li, Simultaneous approximations of multivariate functions and their derivatives by neural networks with one hidden layer, Neurocomputing, 12 (1996), 327-343. doi: 10.1016/0925-2312(95)00070-4.
[19] D. W. Marquardt, An algorithm for least-squares estimation of nonlinear parameters, J. Soc. Indust. Appl. Math., 11 (1963), 431-441. doi: 10.1137/0111030.
[20] W. S. McCulloch and W. Pitts, A logical calculus of the ideas immanent in nervous activity, Bull. Math. Biophys., 5 (1943), 115-133. doi: 10.1007/BF02478259.
[21] A. Paszke et al., Automatic differentiation in PyTorch, 2017.
[22] M. Raissi, P. Perdikaris and G. E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comput. Phys., 378 (2019), 686-707. doi: 10.1016/j.jcp.2018.10.045.
[23] S. J. Reddi, S. Kale and S. Kumar, On the convergence of Adam and beyond, preprint, arXiv:1904.09237.
[24] S. A. Sarra, Adaptive radial basis function methods for time dependent partial differential equations, Appl. Numer. Math., 54 (2005), 79-94. doi: 10.1016/j.apnum.2004.07.004.
[25] P. Tsilifis, I. Bilionis, I. Katsounaros and N. Zabaras, Computationally efficient variational approximations for Bayesian inverse problems, J. Verif. Valid. Uncert., 1 (2016), 13pp. doi: 10.1115/1.4034102.
[26] F. Yaman, V. G. Yakhno and R. Potthast, A survey on inverse problems for applied sciences, Math. Probl. Eng., 2013 (2013), 19pp. doi: 10.1155/2013/976837.


Figure 1.  Network architecture
Figure 2.  Experimental result for 1D transport equation
Figure 3.  Experimental result for 2D heat equation with $ u(0,x,y) = x(1-x)y(1-y) $
Figure 4.  Experimental result for 2D heat equation with $ u(0,x,y) = \begin{cases} 1, & (x,y) \in \Omega \\ 0, & \text{otherwise} \end{cases} $
Figure 5.  Experimental result for 2D wave equation
Figure 6.  Experimental result for Lotka-Volterra equation
Figure 7.  Experimental result for CFL condition
Algorithm 1: Training
1: procedure train(number of epochs)
2:   Initialize the neural network.
3:   for number of epochs do
4:     Sample $ z^1, z^2, \dots, z^m $ from the uniform distribution over $ \Omega $
5:     Sample $ z_I^1, z_I^2, \dots, z_I^m $ from the uniform distribution over $ \{0\} \times \Omega $
6:     Sample $ z_B^1, z_B^2, \dots, z_B^m $ from the uniform distribution over $ \partial\Omega $
7:     Sample $ k $ observation points $ z_O^1, z_O^2, \dots, z_O^k $
8:     Evaluate the true values $ u_j = u_p(z_O^j) $ for $ j = 1, 2, \dots, k $
9:     Update the neural network by descending its stochastic gradient:
$ \nabla_{w,b} \left[ \frac{1}{m} \sum\limits_{i = 1}^{m} \Big( L_p(u_N)(z^i)^2 + \big(u_N(z_I^i) - f(z_I^i)\big)^2 + \big(u_N(z_B^i) - g(z_B^i)\big)^2 \Big) + \frac{1}{k} \sum\limits_{j = 1}^{k} \big(u_N(z_O^j) - u_j\big)^2 \right] $
10:   end for
11: end procedure
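The loop above translates almost line for line into a modern autodiff framework. Below is a hedged PyTorch sketch of Algorithm 1 for the 1D transport equation $ u_t + a u_x = 0 $, treating the speed $ a $ as a trainable parameter (the inverse problem); the network shape and learning rate follow Table 2, while the initial condition $ f $, the boundary data, and the synthetic observations are illustrative assumptions rather than the paper's exact setup.

    import math
    import torch

    torch.manual_seed(0)
    a_true = 1.0                                  # hidden transport speed to recover
    f = lambda x: torch.sin(math.pi * x)          # assumed initial condition
    u_exact = lambda t, x: f(x - a_true * t)      # data-generating solution

    # Feed-forward DNN, 2-128-256-128-1 with ReLU (1D transport row of Table 2).
    net = torch.nn.Sequential(
        torch.nn.Linear(2, 128), torch.nn.ReLU(),
        torch.nn.Linear(128, 256), torch.nn.ReLU(),
        torch.nn.Linear(256, 128), torch.nn.ReLU(),
        torch.nn.Linear(128, 1),
    )
    a = torch.nn.Parameter(torch.tensor(0.5))     # initial guess for the speed
    opt = torch.optim.Adam(list(net.parameters()) + [a], lr=1e-5)

    def u_N(t, x):
        return net(torch.cat([t, x], dim=1))

    m, k = 256, 17
    t_obs, x_obs = torch.rand(k, 1), torch.rand(k, 1)
    u_obs = u_exact(t_obs, x_obs)                 # k observation values u_j

    for epoch in range(10000):
        # Lines 4-6: sample interior, initial (t = 0), and boundary (x = 0) points.
        t = torch.rand(m, 1, requires_grad=True)
        x = torch.rand(m, 1, requires_grad=True)
        t0, x0 = torch.zeros(m, 1), torch.rand(m, 1)
        tb, xb = torch.rand(m, 1), torch.zeros(m, 1)

        u = u_N(t, x)
        u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
        u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]

        # Line 9: governing-equation, initial, boundary, and observation terms.
        loss = ((u_t + a * u_x).pow(2).mean()
                + (u_N(t0, x0) - f(x0)).pow(2).mean()
                + (u_N(tb, xb) - u_exact(tb, xb)).pow(2).mean()
                + (u_N(t_obs, x_obs) - u_obs).pow(2).mean())
        opt.zero_grad()
        loss.backward()
        opt.step()

Because the observation term depends on $ a $ only through the residual, the optimizer recovers the model parameter and the solution jointly, which is the unified forward-inverse framework described in the abstract.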
Table 1.  Information of grid and observation points

Problem        | Grid Range                                  | Number of Grid Points         | Number of Observations
1D Transport   | $ (t,x) \in [0,1]\times[0,1] $              | $ 17 \times 100 $             | 17
2D Heat        | $ (t,x,y) \in [0,1]\times[0,1]\times[0,1] $ | $ 100 \times 100 \times 100 $ | 13
2D Wave        | $ (t,x,y) \in [0,1]\times[0,1]\times[0,1] $ | $ 100 \times 100 \times 100 $ | 61
Lotka-Volterra | $ t \in [0,100] $                           | 20,000                        | 40
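For concreteness, here is a small sketch of how the 1D transport row of Table 1 might be realized; the table does not state how observation points are placed on the grid, so drawing one random spatial location per time level is an assumption made for illustration.

    import torch

    t = torch.linspace(0, 1, 17)                          # 17 time levels
    x = torch.linspace(0, 1, 100)                         # 100 spatial points
    tt, xx = torch.meshgrid(t, x, indexing="ij")          # 17 x 100 grid
    grid = torch.stack([tt.flatten(), xx.flatten()], 1)   # 1700 grid points
    idx = torch.randint(0, 100, (17,))                    # one random x per time level
    obs = torch.stack([t, x[idx]], dim=1)                 # 17 observation points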
Table 2.  Neural network architecture

Problem        | Fully Connected Layers         | Activation Functions | Learning Rate
1D Transport   | 2(input)-128-256-128-1(output) | ReLU                 | $ 10^{-5} $
2D Heat        | 3(input)-128-128-1(output)     | Sin, Sigmoid         | $ 10^{-5} $
2D Wave        | 3(input)-128-256-128-1(output) | Sin, Tanh            | $ 10^{-5} $
Lotka-Volterra | 1(input)-64-64-2(output)       | Sin                  | $ 10^{-4} $
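As a concrete reading of one row, the following is a hedged sketch of the 2D heat network; Table 2 lists two activation functions without specifying their arrangement, so applying them to successive hidden layers is an assumption made purely for illustration.

    import torch

    class Sin(torch.nn.Module):
        """Sine activation, listed in Table 2 for several problems."""
        def forward(self, x):
            return torch.sin(x)

    # 3(input)-128-128-1(output) with Sin then Sigmoid (one possible arrangement).
    heat_net = torch.nn.Sequential(
        torch.nn.Linear(3, 128), Sin(),                # input (t, x, y)
        torch.nn.Linear(128, 128), torch.nn.Sigmoid(),
        torch.nn.Linear(128, 1),                       # output u(t, x, y)
    )
    opt = torch.optim.Adam(heat_net.parameters(), lr=1e-5)  # rate from Table 2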