Recently, neural networks (NNs) with an infinite number of layers have been introduced. Training such very large networks is expensive, so there is interest in studying their robustness with respect to the input data in order to avoid unnecessary retraining.
Typically, model-based statistical inference methods, e.g. Bayesian neural networks, are used to quantify uncertainties. Here, we consider a special class of residual neural networks and study the case in which the number of layers can be arbitrarily large. Kinetic theory then allows the network to be interpreted as a dynamical system described by a partial differential equation. We study the robustness of this mean-field neural network with respect to perturbations in the initial data by applying uncertainty quantification (UQ) approaches to the loss functions.
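The continuous-depth viewpoint and the resulting UQ of the loss can be illustrated with a short sketch. The code below is a minimal illustration under stated assumptions, not the formulation of the paper: it treats a scalar residual network as an explicit Euler discretisation of an ODE with fixed, randomly chosen weights, perturbs the input signal with uniform noise, and estimates the mean and variance of a squared loss by plain Monte Carlo sampling. The architecture, weights, activation, target and noise level are all illustrative choices.

```python
import numpy as np

# Minimal sketch (illustrative assumptions throughout): a scalar residual
# network x_{k+1} = x_k + dt * tanh(w_k * x_k + b_k) is read as an explicit
# Euler step of the ODE dx/dt = tanh(w(t) x + b(t)). The input signal is
# perturbed by a uniform random variable and the mean and variance of a
# squared loss are estimated by Monte Carlo sampling.

rng = np.random.default_rng(0)

L, dt = 50, 1.0 / 50           # number of layers and layer "time" step
w = rng.normal(size=L)         # hypothetical fixed weights (not trained here)
b = rng.normal(size=L)         # hypothetical fixed biases

def resnet(x0):
    """Forward pass = explicit Euler discretisation of the underlying ODE."""
    x = x0
    for k in range(L):
        x = x + dt * np.tanh(w[k] * x + b[k])
    return x

def loss(x0, target=0.5):
    """Squared loss of the network output against an illustrative target."""
    return (resnet(x0) - target) ** 2

# Monte Carlo UQ: perturb the input signal x0 = 1 with uniform noise of
# level eps and estimate the mean and variance of the random loss.
eps, n_samples = 0.1, 10_000
xi = rng.uniform(-eps, eps, size=n_samples)   # random input perturbations
samples = loss(1.0 + xi)                      # vectorised over all samples
print("E[loss]   ~", samples.mean())
print("Var[loss] ~", samples.var())
```

A stochastic collocation or stochastic Galerkin treatment would replace the plain Monte Carlo sampling above by a polynomial chaos expansion in the random perturbation.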
Figure 3. Top left panel: Regression of the target function
Figure 4. Top left panel: Classification problem with the statistical quantities computed with the Monte-Carlo method and the stochastic collocation method with noise level
Figure 5. CASE 1: Projection of initial data and root mean squared error (RMSE) with exactly computed mean and variance. Upper panel: The left
Figure 6. CASE 2: Projection of initial data and root mean squared error (RMSE) with exactly computed mean and variance. Upper panel: The blue lines describe deviations from the mean of the initial data (red). Two realisations are shown in black, which lie in the confidence region (gray shaded). The lower left panel shows for each fixed signal
Figure 7. CASE 1: Solutions to the stochastic Galerkin formulation with truncation
Figure 8. CASE 2: Solutions to the stochastic Galerkin formulation with truncation
Figure 9. CASE 1: Mean, variance and distribution for random loss. The left panels show for each fixed signal
Figure 10. CASE 2: Mean, variance and distribution for random loss. The left panels show for each fixed signal