Frontiers of Mathematical Finance, June 2022, 1(2): 287-319. doi: 10.3934/fmf.2021011

Convergence of deep fictitious play for stochastic differential games

Jiequn Han, Ruimeng Hu and Jihao Long

1. Center for Computational Mathematics, Flatiron Institute, 162 5th Avenue, New York, NY, USA
2. Department of Mathematics, Princeton University, Princeton, NJ, USA
3. Department of Mathematics and Department of Statistics and Applied Probability, University of California, Santa Barbara, CA, USA
4. The Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA

* Corresponding author: Ruimeng Hu

Received: September 2021. Revised: March 2022. Early access: May 2022. Published: June 2022.

Fund Project: R.H. was partially supported by NSF grant DMS-1953035.

Stochastic differential games have been used extensively to model agents' competition in finance, for instance, in peer-to-peer (P2P) lending platforms in the Fintech industry, the banking system in the context of systemic risk, and insurance markets. The recently proposed machine learning algorithm, deep fictitious play, provides a novel and efficient tool for finding the Markovian Nash equilibrium of large $ N $-player asymmetric stochastic differential games [J. Han and R. Hu, Mathematical and Scientific Machine Learning Conference, pages 221-245, PMLR, 2020]. By incorporating the idea of fictitious play, the algorithm decouples the game into $ N $ sub-optimization problems and identifies each player's optimal strategy with the deep backward stochastic differential equation (BSDE) method, solving the sub-problems in parallel and repeatedly. In this paper, we prove the convergence of deep fictitious play (DFP) to the true Nash equilibrium. We also show that the strategy based on DFP forms an $ \epsilon $-Nash equilibrium. We generalize the algorithm by proposing a new approach to decouple the games, and present numerical results for large population games that demonstrate the empirical convergence of the algorithm beyond the technical assumptions of the theorems.
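For reference, the notion used above can be stated precisely: a strategy profile $ \hat{\alpha} = (\hat{\alpha}^1, \dots, \hat{\alpha}^N) $ is an $ \epsilon $-Nash equilibrium if no player can reduce its expected cost by more than $ \epsilon $ through a unilateral deviation; writing $ J^i $ for player $ i $'s cost functional and $ \hat{\alpha}^{-i} $ for the other players' strategies,
$$ J^i(\hat{\alpha}^i, \hat{\alpha}^{-i}) \le J^i(\beta^i, \hat{\alpha}^{-i}) + \epsilon \quad \text{for every admissible } \beta^i \text{ and every } i = 1, \dots, N. $$
To make the decoupling concrete, below is a minimal Python sketch of the fictitious play outer loop, under stated assumptions: the helper solve_bsde_best_response (standing in for the deep BSDE sub-problem solver) and the representation of policies as callables are hypothetical placeholders, not the authors' implementation.

    # Sketch of the deep fictitious play (DFP) outer loop. At each stage, every
    # player best-responds to the other players' previous-stage strategies, so
    # the N sub-problems are independent and can be solved in parallel.
    def deep_fictitious_play(N, num_stages, solve_bsde_best_response, initial_policies):
        policies = list(initial_policies)      # policies[i]: callable alpha_i(t, x)
        for _ in range(num_stages):
            frozen = list(policies)            # freeze the current strategy profile
            policies = [
                # hypothetical deep BSDE solver for player i's sub-problem,
                # with all opponents fixed at their frozen strategies
                solve_bsde_best_response(i, [frozen[j] for j in range(N) if j != i])
                for i in range(N)              # embarrassingly parallel across i
            ]
        return policies                        # approximate Markovian Nash equilibrium

Classical fictitious play responds to an average of opponents' past strategies; responding only to the latest profile, as sketched here, is one possible decoupling choice, and the loop structure is the same either way.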

Citation: Jiequn Han, Ruimeng Hu, Jihao Long. Convergence of deep fictitious play for stochastic differential games. Frontiers of Mathematical Finance, 2022, 1 (2) : 287-319. doi: 10.3934/fmf.2021011
References:
[1]

A. Angiuli, J.-P. Fouque and M. Laurière, Unified reinforcement Q-learning for mean field game and control problems, arXiv: 2006.13912, 2020.

[2]

M. Arjovsky, S. Chintala and L. Bottou, Wasserstein generative adversarial networks, In Proceedings of the 34th International Conference on Machine Learning, volume 70 of PMLR, 2017, 214–223.

[3]

R. Arora, A. Basu, P. Mianjy and A. Mukherjee, Understanding deep neural networks with rectified linear units, arXiv preprint, arXiv: 1611.01491, 2016.

[4]

E. Bayraktar, A. Budhiraja and A. Cohen, A numerical scheme for a mean field game in some queueing systems based on Markov chain approximation method, SIAM J. Control Optim., 56 (2018), 4017-4044. doi: 10.1137/17M1154357.

[5]

C. Beck, S. Becker, P. Cheridito, A. Jentzen and A. Neufeld, Deep splitting method for parabolic PDEs, SIAM J. Sci. Comput., 43 (2021), A3135–A3154. doi: 10.1137/19M1297919.

[6]

C. Beck, W. E and A. Jentzen, Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations, J. Nonlinear Sci., 29 (2019), 1563-1619. doi: 10.1007/s00332-018-9525-3.

[7]

A. Bensoussan, C. C. Siu, S. C. P. Yam and H. Yang, A class of non-zero-sum stochastic differential investment and reinsurance games, Automatica J. IFAC, 50 (2014), 2025-2037. doi: 10.1016/j.automatica.2014.05.033.

[8]

U. Berger, Fictitious play in 2 × n games, J. Econom. Theory, 120 (2005), 139-154.  doi: 10.1016/j.jet.2004.02.003.

[9]

H. Brezis, Functional Analysis, Sobolev Spaces and Partial Differential Equations, Universitext. Springer, New York, 2011.

[10]

A. Briani and P. Cardaliaguet, Stable solutions in potential mean field game systems, NoDEA Nonlinear Differential Equations Appl., 25 (2018), Paper No. 1, 26 pp. doi: 10.1007/s00030-017-0493-3.

[11]

G. W. Brown, Some Notes on Computation of Games Solutions, Technical report, Rand Corp, Santa Monica, CA, 1949.

[12]

G. W. Brown, Iterative solution of games by fictitious play, Activity Analysis of Production and Allocation, 13 (1951), 374-376. 

[13]

P. Cardaliaguet and S. Hadikhanloo, Learning in mean field games: The fictitious play, ESAIM Control Optim. Calc. Var., 23 (2017), 569-591.  doi: 10.1051/cocv/2016004.

[14]

P. Cardaliaguet and C.-A. Lehalle, Mean field game of controls and an application to trade crowding, Math. Financ. Econ., 12 (2018), 335-363.  doi: 10.1007/s11579-017-0206-z.

[15]

R. Carmona and F. Delarue, Probabilistic Theory of Mean Field Games with Applications I-II, Springer, 2018.

[16]

R. Carmona, J.-P. Fouque and L.-H. Sun, Mean field games and systemic risk, Commun. Math. Sci., 13 (2015), 911-933. doi: 10.4310/CMS.2015.v13.n4.a4.

[17]

P. Casgrain, B. Ning and S. Jaimungal, Deep Q-learning for Nash equilibria: Nash-DQN, arXiv: 1904.10554, 2019.

[18]

S. Chen, H. Yang and Y. Zeng, Stochastic differential games between two insurers with generalized mean-variance premium principle, Astin Bull., 48 (2018), 413-434. doi: 10.1017/asb.2017.35.

[19]

E. J. Dockner, S. Jørgensen, N. V. Long and G. Sorger, Differential Games in Economics and Management Science, Cambridge University Press, 2000. doi: 10.1017/CBO9780511805127.

[20]

W. E, J. Han and A. Jentzen, Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations, Commun. Math. Stat., 5 (2017), 349-380. doi: 10.1007/s40304-017-0117-6.

[21]

N. El Karoui, S. Peng and M. C. Quenez, Backward stochastic differential equations in finance, Math. Finance, 7 (1997), 1-71. doi: 10.1111/1467-9965.00022.

[22]

R. Elie, J. Pérolat, M. Laurière, M. Geist and O. Pietquin, On the convergence of model free learning in mean field games, AAAI-20 Technical Tracks 5, Vol. 34, 2020. arXiv: 1907.02633. doi: 10.1609/aaai.v34i05.6203.

[23]

M. Fazlyab, A. Robey, H. Hassani, M. Morari and G. Pappas, Efficient and accurate estimation of Lipschitz constants for deep neural networks, In Advances in Neural Information Processing Systems, (2019), 11427–11438.

[24]

M. Germain, H. Pham and X. Warin, Deep backward multistep schemes for nonlinear PDEs and approximation error analysis, arXiv preprint, arXiv: 2006.01496, 2020.

[25]

D. A. Gomes, S. Patrizi and V. Voskanyan, On the existence of classical solutions for stationary extended mean field games, Nonlinear Anal., 99 (2014), 49-79. doi: 10.1016/j.na.2013.12.016.

[26]

D. A. Gomes and V. K. Voskanyan, Extended deterministic mean-field games, SIAM J. Control Optim., 54 (2016), 1030-1055.  doi: 10.1137/130944503.

[27]

A. Gosavi, A reinforcement learning algorithm based on policy iteration for average reward: Empirical results with yield management and convergence analysis, Machine Learning, 55 (2004), 5-29. 

[28]

I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin and A. C. Courville, Improved training of Wasserstein GANs, In Advances in Neural Information Processing Systems, (2017), 5767–5777.

[29]

X. Guo, A. Hu, R. Xu and J. Zhang, Learning mean-field games, Advances in Neural Information Processing Systems, 32 (2019), 4966-4976.

[30]

J. Han and W. E, Deep learning approximation for stochastic control problems, arXiv: 1611.07422, 2016.

[31]

J. Han and R. Hu, Deep fictitious play for finding Markovian Nash equilibrium in multi-agent games, Proceedings of The First Mathematical and Scientific Machine Learning Conference (MSML), 107 (2020), 221-245. 

[32]

J. Han, A. Jentzen and W. E, Solving high-dimensional partial differential equations using deep learning, Proc. Natl. Acad. Sci. USA, 115 (2018), 8505-8510. doi: 10.1073/pnas.1718942115.

[33]

J. Han and J. Long, Convergence of the deep BSDE method for coupled FBSDEs, Probab. Uncertain. Quant. Risk, 5 (2020), Paper No. 5, 33 pp. doi: 10.1186/s41546-020-00047-w.

[34]

J. Han, J. Lu and M. Zhou, Solving high-dimensional eigenvalue problems using deep neural networks: A diffusion Monte Carlo like approach, J. Comput. Phys., 423 (2020), 109792, 13 pp. doi: 10.1016/j.jcp.2020.109792.

[35]

J. Han, L. Zhang and W. E, Solving many-electron Schrödinger equation using deep neural networks, J. Comput. Phys., 399 (2019), 108929, 8 pp. doi: 10.1016/j.jcp.2019.108929.

[36]

J. Hofbauer and W. H. Sandholm, On the global convergence of stochastic fictitious play, Econometrica, 70 (2002), 2265-2294. 

[37]

U. Horst, Stability of linear stochastic difference equations in strategically controlled random environments, Adv. in Appl. Probab., 35 (2003), 961-981.  doi: 10.1239/aap/1067436330.

[38]

U. Horst, Stationary equilibria in discounted stochastic games with weakly interacting players, Games Econom. Behav., 51 (2005), 83-108.  doi: 10.1016/j.geb.2004.03.003.

[39]

R. A. Howard, Dynamic Programming and Markov Processes, John Wiley, 1960.

[40]

R. Hu, Deep learning for ranking response surfaces with applications to optimal stopping problems, Quant. Finance, 20 (2020), 1567-1581.  doi: 10.1080/14697688.2020.1741669.

[41]

R. Hu, Deep fictitious play for stochastic differential games, Commun. Math. Sci., 19 (2021), 325-353.  doi: 10.4310/CMS.2021.v19.n2.a2.

[42]

C. Huré, H. Pham and X. Warin, Deep backward schemes for high-dimensional nonlinear PDEs, Math. Comp., 89 (2020), 1547-1579. doi: 10.1090/mcom/3514.

[43]

S. Ioffe and C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, In International Conference on Machine Learning, (2015), 448–456.

[44]

R. Isaacs, Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization, John Wiley & Sons, Inc., New York-London-Sydney, 1965.

[45]

S. Ji, S. Peng, Y. Peng and X. Zhang, Three algorithms for solving high-dimensional fully-coupled FBSDEs through deep learning, IEEE Intelligent Systems, 35 (2020), 71-84. doi: 10.1109/MIS.2020.2971597.

[46]

D. Kingma and J. Ba, Adam: A method for stochastic optimization, In Proceedings of the International Conference on Learning Representations, 2015.

[47]

P. E. Kloeden and E. Platen, Numerical Solution of Stochastic Differential Equations, volume 23, Springer-Verlag, Berlin, 1992. doi: 10.1007/978-3-662-12616-5.

[48]

V. Krishna and T. Sjöström, On the convergence of fictitious play, Math. Oper. Res., 23 (1998), 479-511.  doi: 10.1287/moor.23.2.479.

[49]

H. Liu, H. Qiao, S. Wang and Y. Li, Platform competition in peer-to-peer lending considering risk control ability, European J. Oper. Res., 274 (2019), 280-290. doi: 10.1016/j.ejor.2018.09.024.

[50]

N. V. Long, Dynamic games in the economics of natural resources: A survey, Dyn. Games Appl., 1 (2011), 115-148.  doi: 10.1007/s13235-010-0003-2.

[51]

J. Ma, P. Protter and J. Yong, Solving forward-backward stochastic differential equations explicitly - a four step scheme, Probab. Theory Related Fields, 98 (1994), 339-359. doi: 10.1007/BF01192258.

[52]

J. Ma and J. Zhang, Representation theorems for backward stochastic differential equations, Ann. Appl. Probab., 12 (2002), 1390-1418.  doi: 10.1214/aoap/1037125868.

[53]

E. J. McShane, Extension of range of functions, Bull. Amer. Math. Soc., 40 (1934), 837-842.  doi: 10.1090/S0002-9904-1934-05978-0.

[54]

P. Milgrom and J. Roberts, Adaptive and sophisticated learning in normal form games, Games Econom. Behav., 3 (1991), 82-100.  doi: 10.1016/0899-8256(91)90006-Z.

[55]

D. Monderer and L. S. Shapley, Fictitious play property for games with identical interests, J. Econom. Theory, 68 (1996), 258-265.  doi: 10.1006/jeth.1996.0014.

[56]

T. Nakamura-Zimmerer, Q. Gong and W. Kang, Adaptive deep learning for high dimensional Hamilton-Jacobi-Bellman equations, SIAM J. Sci. Comput., 43 (2021), A1221–A1247. doi: 10.1137/19M1288802.

[57]

É. Pardoux and S. Peng, Backward stochastic differential equations and quasilinear parabolic partial differential equations, in Stochastic Partial Differential Equations and their Applications, 200–217. Springer, 1992. doi: 10.1007/BFb0007334.

[58]

E. Pardoux and S. Tang, Forward-backward stochastic differential equations and quasilinear parabolic PDEs, Probab. Theory Related Fields, 114 (1999), 123-150.  doi: 10.1007/s004409970001.

[59]

P. Pauli, A. Koch, J. Berberich, P. Kohler and F. Allgöwer, Training robust neural networks using Lipschitz bounds, 2021 American Control Conference (ACC), (2021), 2595–2600. doi: 10.23919/ACC50511.2021.9482773.

[60]

D. Pfau, J. S. Spencer, A. G. D. G. Matthews and W. M. C. Foulkes, Ab-initio solution of the many-electron Schrödinger equation with deep neural networks, Phys. Rev. Research, 2 (2020), 033429. doi: 10.1103/PhysRevResearch.2.033429.

[61]

H. Pham, X. Warin and M. Germain, Neural networks-based backward scheme for fully nonlinear PDEs, Partial Differ. Equ. Appl., 2 (2021), Paper No. 16, 24 pp. doi: 10.1007/s42985-020-00062-8.

[62]

W. B. Powell and J. Ma, A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications, J. Control Theory Appl., 9 (2011), 336-352.  doi: 10.1007/s11768-011-0313-y.

[63]

A. Prasad and S. P. Sethi, Competitive advertising under uncertainty: A stochastic differential game approach, J. Optim. Theory Appl., 123 (2004), 163-185.  doi: 10.1023/B:JOTA.0000043996.62867.20.

[64]

M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, John Wiley & Sons, 1994.

[65]

S. Cacace, F. Camilli and A. Goffi, A policy iteration method for mean field games, ESAIM Control Optim. Calc. Var., 27 (2021).

[66]

J. Sirignano and K. Spiliopoulos, DGM: A deep learning algorithm for solving partial differential equations, J. Comput. Phys., 375 (2018), 1339-1364.  doi: 10.1016/j.jcp.2018.08.029.

[67]

Z. Wei and M. Lin, Market mechanisms in online peer-to-peer lending, Management Science, 63 (2017), 4236-4257.  doi: 10.1287/mnsc.2016.2531.

[68]

Y. Xuan, R. Balkin, J. Han, R. Hu and H. D. Ceniceros, Optimal policies for a pandemic: A stochastic game approach and a deep learning algorithm, Proceedings of The Second Mathematical and Scientific Machine Learning Conference (MSML), 145 (2022), 987-1012.

[69]

B. Yu, X. Xing and A. Sudjianto, Deep-learning based numerical BSDE method for barrier options, Available at SSRN. arXiv: 1904.05921, 2019. doi: 10.2139/ssrn.3366314.

[70]

X. Zeng, A stochastic differential reinsurance game, J. Appl. Probab., 47 (2010), 335-349.  doi: 10.1239/jap/1276784895.

[71]

J. Zhang, Backward Stochastic Differential Equations: From Linear to Fully Nonlinear Theory, Springer, 2017. doi: 10.1007/978-1-4939-7256-2.


Figure 1.  A sample path for all $ N = 10 $ players in the inter-bank game, obtained by decoupling the problem via policy update and solving the sub-problems with the deep BSDE method. Top: the optimal state process $ X_t^i $ (solid lines) and its neural network approximation $ \hat{X}_t^i $ (circles), under the same realized path of Brownian motion. Bottom: comparison of the strategies $ \alpha_t^i $ and $ \hat{\alpha}_t^i $ (dashed lines).