This paper reframes approachability theory within the context of population games. Thus, whilst a player still aims at driving her average payoff to a predefined set, her opponent is no longer malevolent but instead is extracted randomly at each instant of time from a population of individuals choosing actions in a similar manner. First, we define the notion of 1st-moment approachability, a weakening of Blackwell's approachability. Second, since the endogenous evolution of the population's play is then important, we develop a model of two coupled partial differential equations (PDEs) in the spirit of mean-field game theory: one describing the best-response of every player given the population distribution, the other capturing the macroscopic evolution of average payoffs if every player plays her best response. Third, we provide a detailed analysis of existence, nonuniqueness, and stability of equilibria (fixed points of the two PDEs). Fourth, we apply the model to regret-based dynamics, and use it to establish convergence to Bayesian equilibrium under incomplete information.
Citation: |
Figure 1. Payoff space of Prisoner's dilemma: State space $ X = conv\{(3, 3), (1, 1), (0, 4), (4, 0)\} $ (boundary a solid line), supporting hyperplane $ H $ (dot-dashed line) passing through the barycenter, vector field $ dx(t) $ pointing towards $ (\frac{3}{2}, \frac{7}{2}) $ for those who cooperate (region below $ H $) and towards $ (\frac{5}{2}, \frac{1}{2}) $ for those who defect (region above $ H $), $ conv\{(\frac{3}{2}, \frac{7}{2}), (\frac{5}{2}, \frac{1}{2})\} $ is set of approachable points with population strategy $ q = ((\frac{1}{2}, \frac{1}{2}), (\frac{1}{2}, \frac{1}{2})) $, barycenter is self-confirmed with uniform distribution over $ X $
Figure 3. Regret space of the coordination game: State space $ X = conv\{(-1, 0), (0, 1), (0, -2), (2, 0)\} $ (boundary a solid line), and vector field $ dx(t) $ converging to $ (1, 0) $ (grey area) and $ (0, -1) $ (white area), approachable point is $ y = (0, -1) $, set of approachable points is $ conv\{(1, 0), (0, -1)\} $ (dashed line) with mixed population strategy $ q = (\frac{2}{3}, \frac{1}{3}) $
Figure 4. Regret space of parametric game with $ a< 0 < b $: State space $ X = conv\{(0, a), (-a, 0), (-b, 0), (0, b)\} $ (boundary a solid line), vector field $ dx(t) $ converging to $ (0, a) $ which is also an approachable vertex with population strategy $ q = (1, 0) $, supporting hyperplane $ H $ (dot-dashed line) intersects $ X $ only at one point (the vertex)
Figure 5. Regret space of parametric game with $ 0<b < a $: State space $ X = conv\{(0, a), (-a, 0), (-b, 0), (0, b)\} $ (boundary a solid line), supporting hyperplane $ H $ (dot-dashed line) passing through the vertex $ (-b, 0) $, vector field $ dx(t) $ converging to $ (0, b) $ left of $ H $ and to $ (-b, 0) $ right of $ H $, $ conv\{(0, b), (-b, 0)\} $ is set of approachable points with population strategy $ q = (0, 1) $, vertex $ (-b, 0) $ is not self-confirmed, while vertex $ (0, a) $ is self-confirmed with population strategy $ q = (1, 0) $
[1] | R. J. Aumann, Utility theory without the completeness axiom, Econometrica, 30 (1962), 445-462. doi: 10.2307/1913746. |
[2] | R. J. Aumann, Markets with a continuum of traders, Econometrica, 32 (1964), 39-50. doi: 10.2307/1913732. |
[3] | R. J. Aumann and M. B. Maschler, Repeated Games with Incomplete Information, MIT Press, 1995. |
[4] | F. Bagagiolo and D. Bauso, Objective function design for robust optimality of linear control under state-constraints and uncertainty, ESAIM: Control, Optimisation and Calculus of Variations, 17 (2011), 155-177. doi: 10.1051/cocv/2009040. |
[5] | M. Bardi, Explicit solutions of some linear-quadratic mean field games, Network and Heterogeneous Media, 7 (2012), 243-261. doi: 10.3934/nhm.2012.7.243. |
[6] | J. Battelle, The Search, Nicholas Brealey Publishing, 2006. |
[7] | D. Bauso, E. Lehrer, E. Solan and X. Venel, Attainability in repeated games with vector payoffs, INFORMS Mathematics of Operations Research, 40 (2015), 739-755. doi: 10.1287/moor.2014.0693. |
[8] | M. Benaïm, J. Hofbauer and S. Sorin, Stochastic approximations and differential inclusions, SIAM Journal on Control and Optimization, 44 (2005), 338–348. doi: 10.1137/S0363012904439301. |
[9] | D. Blackwell, An analog of the minimax theorem for vector payoffs, Pacific J. Math., 6 (1956), 1-8. doi: 10.2140/pjm.1956.6.1. |
[10] | F. Blanchini, Set invariance in control – a survey, Automatica, 35 (1999), 1747-1768. doi: 10.1016/S0005-1098(99)00113-2. |
[11] | L. E. Blume, A. Brandenburger and E. Dekel, Lexicographic probabilities and choice under uncertainty, Econometrica, 59 (1991), 61-79. doi: 10.2307/2938240. |
[12] | N. Cesa-Bianchi and G. Lugosi, Prediction, Learning and Games, Cambridge University Press, 2006. doi: 10.1017/CBO9780511546921. |
[13] | B. Edelman, M. Ostrovsky and M. Schwarz, Internet advertising and the generalised second-price auction: Selling billions of dollars worth of keywords, American Economic Review, 97 (2007), 242-259. |
[14] | N. J. Elliot and N. J. Kalton, The existence of value in differential games of pursuit and evasion, J. Differential Equations, 12 (1972), 504-523. doi: 10.1016/0022-0396(72)90022-8. |
[15] | Jeffrey C. Ely and William H. Sandholm, Evolution in Bayesian games I: Theory, Games and Economic Behavior, 53 (2005), 83-109. doi: 10.1016/j.geb.2004.09.003. |
[16] | D. Foster and R. Vohra, Regret in the on-line decision problem, Games and Economic Behavior, 29 (1999), 7-35. doi: 10.1006/game.1999.0740. |
[17] | John C. Harsanyi, Games with incomplete information played by 'bayesian' players, i–iii. part ii. Bayesian equilibrium points, Management Science, 14 (1968), 320-334. doi: 10.1287/mnsc.14.5.320. |
[18] | S. Hart, Adaptive heuristics, Econometrica, 73 (2005), 1401-1430. doi: 10.1111/j.1468-0262.2005.00625.x. |
[19] | S. Hart and A. Mas-Colell, A general class of adaptive strategies, Journal of Economic Theory, 98 (2001), 26-54. doi: 10.1006/jeth.2000.2746. |
[20] | S. Hart and A. Mas-Colell, Regret-based continuous-time dynamics, Games and Economic Behavior, 45 (2003), 375-394. doi: 10.1016/S0899-8256(03)00178-7. |
[21] | M. Y. Huang, P. E. Caines and R. P. Malhamé, Large population stochastic dynamic games: Closed loop McKean-Vlasov systems and the nash certainty equivalence principle, Communications in Information and Systems, 6 (2006), 221-252. doi: 10.4310/CIS.2006.v6.n3.a5. |
[22] | M. Y. Huang, P. E. Caines and R. P. Malhamé, Large-population cost-coupled LQG problems with non-uniform agents: Individual-mass behaviour and decentralized $\epsilon$-nash equilibria, IEEE Trans. on Automatic Control, 9 (2007), 1560-1571. doi: 10.1109/TAC.2007.904450. |
[23] | M. Y. Huang, P. E. Caines and R. P. Malhamé, Individual and mass behaviour in large population stochastic wireless power control problems: Centralized and nash equilibrium solutions., In Proc. of the IEEE Conference on Decision and Control, volume 42, pages 98–103, HI, USA, December 2003. |
[24] | B. Jovanovic and R. W. Rosenthal, Anonymous sequential games, Journal of Mathematical Economics, 17 (1988), 77-87. doi: 10.1016/0304-4068(88)90029-8. |
[25] | J.-M. Lasry and P.-L. Lions, Jeux à champ moyen. I. Le cas stationnaire, Comptes Rendus Mathematique, 343 (2006), 619-625. doi: 10.1016/j.crma.2006.09.019. |
[26] | J.-M. Lasry and P.-L. Lions, Jeux à champ moyen. II. horizon fini et controle optimal, Comptes Rendus Mathematique, 343 (2006), 679-684. doi: 10.1016/j.crma.2006.09.018. |
[27] | J.-M. Lasry and P.-L. Lions, Mean field games, Japanese Journal of Mathematics, 2 (2007), 229-260. doi: 10.1007/s11537-007-0657-8. |
[28] | E. Lehrer, Allocation processes in cooperative games, International Journal of Game Theory, 31 (2002), 341-351. doi: 10.1007/s001820200123. |
[29] | E. Lehrer, Approachability in infinite dimensional spaces, International Journal of game Theory, 31 (2002), 253-268. doi: 10.1007/s001820200115. |
[30] | E. Lehrer, A wide range no-regret theorem, Games and Economic Behavior, 42 (2003), 101-115. doi: 10.1016/S0899-8256(03)00032-0. |
[31] | E. Lehrer and E. Solan, Excludability and bounded computational capacity strategies, Mathematics of Operations Research, 31 (2006), 637-648. doi: 10.1287/moor.1060.0211. |
[32] | E. Lehrer, E. Solan and D. Bauso, Repeated games over networks with vector payoffs: the notion of attainability, in Proceedings of the NetGCoop 2011, IEEE, Paris, France, 2011. |
[33] | E. Lehrer and S. Sorin, Minmax via differential inclusion, Journal of Convex Analysis, 14(2): 271–273, 2007. |
[34] | M. Maschler, E. Solan and S. Zamir, Game Theory, Cambridge University Press, Cambridge, 2013. doi: 10.1017/CBO9780511794216. |
[35] | E. Roxin, Axiomatic approach in differential games, J. Optim. Theory Appl., 3 (1969), 153-163. doi: 10.1007/BF00929440. |
[36] | W. H. Sandholm, Population Games and Evolutionary Dynamics, MIT Press, Cambridge, MA, 2010. |
[37] | W. H. Sandholm, Evolution in Bayesian games II: Stability of purified equilibria, Journal of Economic Theory, 136 (2007), 641-667. doi: 10.1016/j.jet.2006.10.003. |
[38] | A. S. Soulaimani, M. Quincampoix and S. Sorin, Approachability theory, discriminating domain and differential games, SIAM Journal of Control and Optimization, 48 (2009), 2461-2479. |
[39] | P. Varaiya, The existence of solutions to a differential game, SIAM Journal of Control and Optimization, 5 (1967), 153-162. doi: 10.1137/0305009. |
[40] | H. R. Varian, Position auctions, International Journal of Industrial Organization, 25 (2007), 1163-1178. doi: 10.1016/j.ijindorg.2006.10.002. |
[41] | N. Vieille, Weak approachability, Mathematics of Operations Research, 17 (1992), 781-791. doi: 10.1287/moor.17.4.781. |
[42] | John von Neumann, Zur Theorie der Gesellschaftsspiele, Math. Ann., 100 (1928), 295-320. doi: 10.1007/BF01448847. |
[43] | S. Zamir, Bayesian games: Games with incomplete information, in Computational Complexity, Vol. 1–6, Springer, New York, 2012. doi: 10.1007/978-1-4614-1800-9_16. |
Payoff space of Prisoner's dilemma: State space
Regret space of the Prisoner's dilemma: State space
Regret space of the coordination game: State space
Regret space of parametric game with
Regret space of parametric game with