October 2021, 8(4): 403-443. doi: 10.3934/jdg.2021023

Linear-quadratic zero-sum mean-field type games: Optimality conditions and policy optimization

Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08540, USA

Received: July 2020; Revised: July 2021; Early access: August 2021; Published: October 2021

Fund Project: A preliminary version of this work was submitted to the 59th Conference on Decision and Control

In this paper, zero-sum mean-field type games (ZSMFTG) with linear dynamics and quadratic cost are studied for an infinite-horizon discounted utility function. ZSMFTG are a class of games in which two decision makers, whose utilities sum to zero, compete to influence a large population of indistinguishable agents. In particular, the case in which the transition and utility functions depend on the state, the actions of the controllers, and the means of the state and of the actions is investigated. The optimality conditions of the game are analyzed for both open-loop and closed-loop controls, and explicit expressions for the Nash equilibrium strategies are derived. Moreover, two policy optimization methods based on policy gradients are proposed, one for a model-based framework and one for a sample-based framework. In the model-based case, the gradients are computed exactly using the model, whereas in the sample-based case they are estimated using Monte-Carlo simulations. Numerical experiments are conducted to show the convergence of the utility function as well as of the two players' controls.
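To make the contrast between exact and estimated gradients concrete, here is a minimal Python sketch. It is not the paper's algorithm: it assumes a scalar zero-sum linear-quadratic game with no mean-field terms and no noise, linear feedback controls $ u_i = -k_i x $, and illustrative constants. The function names (`exact_gradient`, `estimated_gradient`, `descent_ascent`) are hypothetical, and a two-point zeroth-order estimator with random directions stands in for the Monte-Carlo gradient estimation mentioned above.

```python
import math
import random

# Illustrative scalar constants (assumed for this sketch, no mean-field terms).
A, B1, B2 = 0.4, 0.4, 0.3      # dynamics: x' = A*x + B1*u1 + B2*u2
Q, R1, R2 = 0.4, 0.4, 0.4      # stage cost: Q*x^2 + R1*u1^2 - R2*u2^2
GAMMA = 0.9                    # discount factor
DIM = 2                        # number of policy parameters (k1, k2)


def exact_gradient(k1, k2):
    """Model-based: differentiate the closed-form cost J = num / den (x0 = 1)."""
    alpha = A - B1 * k1 - B2 * k2            # closed-loop multiplier
    num = Q + R1 * k1 ** 2 - R2 * k2 ** 2
    den = 1.0 - GAMMA * alpha ** 2           # valid while GAMMA * alpha^2 < 1
    d_num = (2.0 * R1 * k1, -2.0 * R2 * k2)
    d_den = (2.0 * GAMMA * alpha * B1, 2.0 * GAMMA * alpha * B2)
    return tuple((dn * den - num * dd) / den ** 2 for dn, dd in zip(d_num, d_den))


def rollout_cost(k1, k2, horizon=40, x0=1.0):
    """Discounted cost of one truncated, noise-free rollout under u_i = -k_i * x."""
    x, total, discount = x0, 0.0, 1.0
    for _ in range(horizon):
        u1, u2 = -k1 * x, -k2 * x
        total += discount * (Q * x * x + R1 * u1 * u1 - R2 * u2 * u2)
        x = A * x + B1 * u1 + B2 * u2
        discount *= GAMMA
    return total


def estimated_gradient(k1, k2, rng, num_directions=100, tau=0.05):
    """Sample-based: two-point zeroth-order estimate that only queries rollout costs."""
    g1 = g2 = 0.0
    for _ in range(num_directions):
        angle = rng.uniform(0.0, 2.0 * math.pi)   # random unit direction in the plane
        d1, d2 = math.cos(angle), math.sin(angle)
        diff = (rollout_cost(k1 + tau * d1, k2 + tau * d2)
                - rollout_cost(k1 - tau * d1, k2 - tau * d2))
        scale = DIM * diff / (2.0 * tau)          # directional derivative times DIM
        g1 += scale * d1
        g2 += scale * d2
    return g1 / num_directions, g2 / num_directions


def descent_ascent(grad_fn, steps=150, eta=0.1):
    """Player 1 descends in k1 while player 2 ascends in k2, simultaneously."""
    k1 = k2 = 0.0
    for _ in range(steps):
        g1, g2 = grad_fn(k1, k2)
        k1, k2 = k1 - eta * g1, k2 + eta * g2
    return k1, k2


if __name__ == "__main__":
    rng = random.Random(0)
    print("model-based :", descent_ascent(exact_gradient))
    print("sample-based:", descent_ascent(lambda a, b: estimated_gradient(a, b, rng)))
```

In this toy setting both variants run the same descent-ascent loop and differ only in how the gradient is obtained, so the sample-based iterates should track the model-based ones up to estimation error. The perturbation radius `tau`, the number of directions, and the rollout horizon are tuning choices of this sketch, not values taken from the paper.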

Citation: René Carmona, Kenza Hamidouche, Mathieu Laurière, Zongjun Tan. Linear-quadratic zero-sum mean-field type games: Optimality conditions and policy optimization. Journal of Dynamics and Games, 2021, 8 (4) : 403-443. doi: 10.3934/jdg.2021023


Figure 1.  Model-based policy optimization: Convergence of each part of the utility. (a) $ C_y $ as a function of $ (K_1,K_2) $. (b) $ C_z $ as a function of $ (L_1,L_2) $
Figure 2.  Model-based policy optimization: Convergence of the control parameters in (a) and of the relative error on the utility in (b)
Figure 3.  Sample-based policy optimization: Convergence of each part of the utility. (a) $ C_y $ as a function of $ (K_1,K_2) $. (b) $ C_z $ as a function of $ (L_1,L_2) $
Figure 4.  Sample-based policy optimization: Convergence of the control parameters in (a) and of the relative error on the utility in (b)
Table 1.  Simulation parameters

Model parameters

| $ A $ | $ \overline{A} $ | $ B_1=\overline{B}_1 $ | $ B_2=\overline{B}_2 $ | $ Q $ | $ \overline{Q} $ | $ R_1=\overline{R}_1 $ | $ R_2=\overline{R}_2 $ | $ \gamma $ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.4 | 0.4 | 0.4 | 0.3 | 0.4 | 0.4 | 0.4 | 0.4 | 0.9 |

Initial distribution and noise processes

| $ \epsilon_0^0 $ | $ \epsilon^1_0 $ | $ \epsilon^0_t $ | $ \epsilon^1_t $ |
| --- | --- | --- | --- |
| $ \mathcal{U}([-1, 1]) $ | $ \mathcal{U}([-1, 1]) $ | $ \mathcal{N}(0, 0.01) $ | $ \mathcal{N}(0, 0.01) $ |

AG and DGA methods parameters

| $ \mathcal{N}^{max}_1 $ | $ \mathcal{N}^{max}_2 $ | $ T $ | $ \eta_1 $ | $ \eta_2 $ | $ K_1^0 $ | $ L_1^0 $ | $ K_2^0 $ | $ L_2^0 $ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 10 | 200 | 2000 | 0.1 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0 |

Gradient estimation algorithm parameters

| $ \mathcal{T} $ | $ M $ | $ \tau $ |
| --- | --- | --- |
| 50 | 10000 | 0.1 |
