Article Contents
Article Contents

A differential game on Wasserstein space. Application to weak approachability with partial monitoring

This research was partially by research contract AFOSR-FA9550-18-1-0254. The first author also benefited from the support of the FMJH Program Gaspard Monge in optimization and operations research (supported in part by EDF) and the CNRS through the PEPS 3IA program.
• Studying continuous time counterpart of some discrete time dynamics is now a standard and fruitful technique, as some properties hold in both setups. In game theory, this is usually done by considering differential games on Euclidean spaces. This allows to infer properties on the convergence of values of a repeated game, to deal with the various concepts of approachability, etc. In this paper, we introduce a specific but quite abstract differential game defined on the Wasserstein space of probability distributions and we prove the existence of its value. Going back to the discrete time dynamics, we derive results on weak approachability with partial monitoring: we prove that any set satisfying a suitable compatibility condition is either weakly approachable or weakly excludable. We also obtain that the value for differential games with nonanticipative strategies is the same that those defined with a new concept of strategies very suitable to make links with repeated games.

Mathematics Subject Classification: Primary: 91A20, 91A23; Secondary: 49N99.

 Citation:

•  [1] J. Abernethy, L. P. Bartlett and E. Hazan, Blackwell approachability and low-regret learning are equivalent, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 27-46. [2] S. As Soulaimani, M. Quincampoix and S. Sorin, Repeated games and qualitative differential games: Approachability and comparison of strategies, SIAM J. Control Optim., 48 (2009), 2461-2479.  doi: 10.1137/090749098. [3] L. Ambrosio, N. Gigli and G. Savaré, Gradient Flows in Metric Spaces and in the Space of Probability Measures, Lectures in Mathematics, Birkhäuser, 2005. [4] R. J. Aumann and  M. B. Maschler,  Repeated Games with Incomplete Information, MIT Press, 1995. [5] M. Bardi and I. Capuzzo-Dolcetta, Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations, Birkhäuser, 1997. doi: 10.1007/978-0-8176-4755-1. [6] P. Bettiol, P. Cardaliaguet and M. Quincampoix, Zero-sum state constrained differential games: Existence of value for Bolza problem, Internat. J. Game Theory, 34 (2006), 495-527.  doi: 10.1007/s00182-006-0030-9. [7] D. Blackwell, An analog of the minimax theorem for vector payoffs, Pacific J. Math., 6 (1956), 1-8.  doi: 10.2140/pjm.1956.6.1. [8] D. Blackwell, Controlled random walks, in: Proceedings of the International Congress of Mathematicians, 1954, Amsterdam, 3 (1956), 336-338. [9] R. Buckdahn, P. Cardaliaguet and M. Quincampoix, Some recent aspects of differential game theory, Dynamic Games and Applications, 1 (2011), 74-114.  doi: 10.1007/s13235-010-0005-0. [10] P. Cardaliaguet, M. Quincampoix and P. Saint-Pierre, Pursuit differential games with state constraints, SIAM J. Control Optim., 39 (2000), 1615-1632.  doi: 10.1137/S0363012998349327. [11] P. Cardaliaguet and M. Quincampoix, Deterministic differential games under probability knowledge of initial condition, International Game Theory Review, 10 (2008), 1-16.  doi: 10.1142/S021919890800173X. [12] P. Cardaliaguet, R. Laraki and S. Sorin, A continuous time approach for the asymptotic value in two-person zero-sum repeated games, SIAM J. on Control and Optimization, 50 (2012), 1573-1596.  doi: 10.1137/110839473. [13] N. Cesa-Bianchi and G. Lugosi, Prediction, Learning, and Games, Cambridge University Press, Cambridge, 2006. doi: 10.1017/CBO9780511546921. [14] M. G. Crandall, H. Ishii and P. L. Lions, User's guide to viscosity solutions of Hamilton Jacobi Equations, Trans. Amer. Math. Soc., 282 (1984), 487-502.  doi: 10.1090/S0002-9947-1984-0732102-X. [15] A. P. Dawid, Self-calibrating priors do not exist: Comment, J. Amer. Statist. Assoc., 80 (1985), 340-341.  doi: 10.2307/2287892. [16] R. M. Dudley, Real Analysis and Probability, Revised reprint of the 1989 original. Cambridge Studies in Advanced Mathematics, 74. Cambridge University Press, Cambridge, 2002. doi: 10.1017/CBO9780511755347. [17] L. Evans and T. Souganidis, Differential games and representation formulas for solutions of Hamilton-Jacobi Equations, ndiana Univ. Math. J., 33 (1984), 773-797.  doi: 10.1512/iumj.1984.33.33040. [18] W. Fleming, The convergence problem for differential games, J. Math. Anal. Appl., 3 (1961), 102-116.  doi: 10.1016/0022-247X(61)90009-9. [19] W. Fleming, The convergence problem for differential games, Ⅱ, in Advances in Game Theory, Ann. of Math. Studies, 52 (1964), Princeton Univ. Press, Princeton, NJ, 195-210. [20] D. P. Foster and R. Vohra, Calibrated learning and correlated equilibrium, Games and Economic Behavior, 21 (1997), 40-55.  doi: 10.1006/game.1997.0595. [21] R. Isaacs, Differential Games. A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization, John Wiley & Sons, Inc., New York-London-Sydney, 1965. [22] C. Jimenez and M. Quincampoix, Hamilton Jacobi Isaacs equations for differential games with asymmetric information on probabilistic initial condition, J. Math. Anal. Appl., 457 (2018), 1422-1451.  doi: 10.1016/j.jmaa.2017.08.012. [23] E. Kohlberg, Optimal strategies in repeated games with incomplete information, Internat. J. Game Theory, 4 (1975), 7-24.  doi: 10.1007/BF01766399. [24] J. Kwon and V. Perchet, Online learning and blackwell approachability with partial monitoring: optimal convergence rates, PMLR Work. Conf. Proc., 54 (2014), 604-613. [25] G. Lugosi, S. Mannor and G. Stoltz, Strategies for prediction under imperfect monitoring, Math. Oper. Res., 33 (2008), 513-528.  doi: 10.1287/moor.1080.0312. [26] S. Mannor, V. Perchet and G. Stoltz, Robust approachability and regret minimization in games with partial monitoring, JMLR Work. Conf. Proc., 19 (2011), 515-536. [27] S. Mannor, V. Perchet and G. Stoltz, Approachability in unknown games: Online learning meets multi-objective optimization, JMLR Work. Conf. Proc., 35 (2014), 1-17. [28] A. Marigonda and M. Quincampoix, Mayer control problem with probabilistic uncertainty on initial positions, J. Differential Equations, 264 (2018), 3212-3252.  doi: 10.1016/j.jde.2017.11.014. [29] J. F. Mertens, S. Sorin and S. Zamir, Repeated Games, With a foreword by Robert J. Aumann. Econometric Society Monographs, 55. Cambridge University Press, New York, 2015. doi: 10.1017/CBO9781139343275. [30] V. Perchet, Approachability of convex sets with partial monitoring, J. Optim. Theory. Appl., 149 (2011), 665-677.  doi: 10.1007/s10957-011-9797-3. [31] V. Perchet, Internal regret with partial monitoring: calibration-based optimal algorithms, J. Mach. Learn. Res., 12 (2011), 1893-1921. [32] V. Perchet, Approachability, regret and calibration: Implications and equivalences, J. Dyn. Games, 1 (2014), 181-254.  doi: 10.3934/jdg.2014.1.181. [33] V. Perchet, Exponential weight approachability, applications to calibration and regret minimization, Dynamic Games And Applications, 5 (2015), 136-153.  doi: 10.1007/s13235-014-0119-x. [34] V. Perchet and M. Quincampoix, On a unified framework for approachability with full or partial monitoring, Mathematics of Operations Research, 40 (2014), 596-610.  doi: 10.1287/moor.2014.0686. [35] S. Plaskacz and M. Quincampoix, Value-functions for differential games and control systems with discontinuous terminal cost, SIAM J. Control Optim., 39 (2000), 1485-1498.  doi: 10.1137/S0363012998340387. [36] L. Pontrjagin, Linear differential games. Ⅰ, Ⅱ, Dokl. Akad. Nauk SSSR, 174 (1967), 1278-1280. [37] A. Rustichini, Minimizing regret: The general case, Games Econom. Behav., 29 (1999), 224-243.  doi: 10.1006/game.1998.0690. [38] F. Santambrogio, Optimal Transport for Applied Mathematicians, Calculus of variations, PDEs, and modeling. Progress in Nonlinear Differential Equations and their Applications, 87. Birkhauser/Springer, Cham, 2015. doi: 10.1007/978-3-319-20828-2. [39] M. Sion, On general minimax theorems, Pacific J. Math, 8 (1958), 171-176.  doi: 10.2140/pjm.1958.8.171. [40] X. Spinat, A necessary and sufficient condition for approachability, Math. Oper. Res., 27 (2002), 31-44.  doi: 10.1287/moor.27.1.31.333. [41] T. Tomala, Belief-free Communication Equilibria, Math. Oper. Res., 38 (2013), 617-637.  doi: 10.1287/moor.2013.0594. [42] N. Vieille, Weak approachability, Math. Op. Res., 17 (1992), 781-791.  doi: 10.1287/moor.17.4.781. [43] C. Villani, Topics in Optimal Transportation, Graduate studies in Mathematics, AMS, Vol. 58, 2003. doi: 10.1007/b12016.