[1] R. T. Chen, Y. Rubanova, J. Bettencourt and D. K. Duvenaud, Neural ordinary differential equations, Advances in Neural Information Processing Systems, (2018), 6571–6583.
[2] T. Chen, E. B. Fox and C. Guestrin, Stochastic gradient Hamiltonian Monte Carlo, Proceedings of the 31st International Conference on Machine Learning, (2014).
[3] B. Dai, A. Shaw, L. Li, L. Xiao, N. He, Z. Liu, J. Chen and L. Song, SBEED: Convergent reinforcement learning with nonlinear function approximation, Proceedings of the 35th International Conference on Machine Learning, PMLR, 80 (2018), 1125-1134.
[4] W. E, J. Han and Q. Li, A mean-field optimal control formulation of deep learning, Research in the Mathematical Sciences, 6 (2019), 41 pp. doi: 10.1007/s40687-018-0172-y.
[5] N. El Karoui, S. Peng and M. C. Quenez, Backward stochastic differential equations in finance, Math. Finance, 7 (1997), 1-71. doi: 10.1111/1467-9965.00022.
[6] C. Fang, Z. Lin and T. Zhang, Sharp analysis for nonconvex SGD escaping from saddle points, Conference on Learning Theory, (2019), 1192–1234.
[7] X. Feng, R. Glowinski and M. Neilan, Recent developments in numerical methods for fully nonlinear second order partial differential equations, SIAM Rev., 55 (2013), 205-267. doi: 10.1137/110825960.
[8] Z. Ghahramani, Probabilistic machine learning and artificial intelligence, Nature, 521 (2015), 452-459. doi: 10.1038/nature14541.
[9] B. Gong, W. Liu, T. Tang, W. Zhao and T. Zhou, An efficient gradient projection method for stochastic optimal control problems, SIAM J. Numer. Anal., 55 (2017), 2982-3005. doi: 10.1137/17M1123559.
[10] E. Haber and L. Ruthotto, Stable architectures for deep neural networks, Inverse Problems, 34 (2018), 014004, 22 pp. doi: 10.1088/1361-6420/aa9a90.
[11] E. Haber, L. Ruthotto, E. Holtham and S. Jun, Learning across scales - multiscale methods for convolution neural networks, arXiv preprint, (2017), arXiv: 1703.02009v2.
[12] K. He, X. Zhang, S. Ren and J. Sun, Deep residual learning for image recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2016). doi: 10.1109/CVPR.2016.90.
[13] J. M. Hernández-Lobato and R. P. Adams, Probabilistic backpropagation for scalable learning of Bayesian neural networks, Proceedings of the 32nd International Conference on Machine Learning, (2015).
[14] P. Jain and P. Kar, Non-convex optimization for machine learning, Foundations and Trends in Machine Learning, 10 (2017), 142–336. doi: 10.1561/9781680833690.
[15] J. Jia and A. Benson, Neural jump stochastic differential equations, Advances in Neural Information Processing Systems, 32 (2019).
[16] P. Kidger and T. Lyons, Universal approximation with deep narrow networks, Proceedings of the 33rd Conference on Learning Theory, PMLR, 125 (2020), 2306-2327.
[17] L. Kong, J. Sun and C. Zhang, SDE-Net: Equipping deep neural networks with uncertainty estimates, Proceedings of the 37th International Conference on Machine Learning, (2020).
[18] X. Li, T. Wong, T. Chen and D. Duvenaud, Scalable gradients for stochastic differential equations, Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, (2020), 3870–3882.
[19] X. Liu, T. Xiao, S. Si, Q. Cao, S. Kumar and C.-J. Hsieh, Neural SDE: Stabilizing neural ODE networks with stochastic noise, arXiv preprint, (2019), arXiv: 1906.02355.
[20] J. Ma, P. Protter and J. Yong, Solving forward-backward stochastic differential equations explicitly–A four step scheme, Probab. Theory Related Fields, 98 (1994), 339-359. doi: 10.1007/BF01192258.
[21] J. Ma and J. Zhang, Representation theorems for backward stochastic differential equations, Ann. Appl. Probab., 12 (2002), 1390-1418. doi: 10.1214/aoap/1037125868.
[22] G. N. Milstein and M. V. Tretyakov, Numerical algorithms for forward-backward stochastic differential equations, SIAM J. Sci. Comput., 28 (2006), 561-582. doi: 10.1137/040614426.
[23] M. Morzfeld, M. S. Day, R. W. Grout, G. S. H. Pau, S. A. Finsterle and J. B. Bell, Iterative importance sampling algorithms for parameter estimation, SIAM J. Sci. Comput., 40 (2018), B329–B352. doi: 10.1137/16M1088417.
[24] S. G. Peng, A general stochastic maximum principle for optimal control problems, SIAM J. Control Optim., 28 (1990), 966-979. doi: 10.1137/0328054.
[25] H. Pham, On some recent aspects of stochastic control and their applications, Probab. Surv., 2 (2005), 506-549. doi: 10.1214/154957805100000195.
[26] J. T. Springenberg, A. Klein, S. Falkner and F. Hutter, Bayesian optimization with robust Bayesian neural networks, Advances in Neural Information Processing Systems, Curran Associates, Inc., 29 (2016), 4134-4142.
[27] B. Tzen and M. Raginsky, Neural stochastic differential equations: Deep latent Gaussian models in the diffusion limit, arXiv preprint, (2019).
[28] M. Welling and Y. W. Teh, Bayesian learning via stochastic gradient Langevin dynamics, Proceedings of the 28th International Conference on Machine Learning, (2011).
[29] J. Yong and X. Y. Zhou, Stochastic Controls: Hamiltonian Systems and HJB Equations, Applications of Mathematics (New York), 43, Springer-Verlag, New York, 1999. doi: 10.1007/978-1-4612-1466-3.
[30] J. Zhang, A numerical scheme for BSDEs, Ann. Appl. Probab., 14 (2004), 459-488. doi: 10.1214/aoap/1075828058.
[31] W. Zhao, L. Chen and S. Peng, A new kind of accurate numerical method for backward stochastic differential equations, SIAM J. Sci. Comput., 28 (2006), 1563-1581. doi: 10.1137/05063341X.