Article Contents
Article Contents

# Approachability in population games

• * Corresponding author: Dario Bauso
• This paper reframes approachability theory within the context of population games. Thus, whilst a player still aims at driving her average payoff to a predefined set, her opponent is no longer malevolent but instead is extracted randomly at each instant of time from a population of individuals choosing actions in a similar manner. First, we define the notion of 1st-moment approachability, a weakening of Blackwell's approachability. Second, since the endogenous evolution of the population's play is then important, we develop a model of two coupled partial differential equations (PDEs) in the spirit of mean-field game theory: one describing the best-response of every player given the population distribution, the other capturing the macroscopic evolution of average payoffs if every player plays her best response. Third, we provide a detailed analysis of existence, nonuniqueness, and stability of equilibria (fixed points of the two PDEs). Fourth, we apply the model to regret-based dynamics, and use it to establish convergence to Bayesian equilibrium under incomplete information.

Mathematics Subject Classification: Primary: 91A16, 91A22; Secondary: 35Q89.

 Citation:

• Figure 1.  Payoff space of Prisoner's dilemma: State space $X = conv\{(3, 3), (1, 1), (0, 4), (4, 0)\}$ (boundary a solid line), supporting hyperplane $H$ (dot-dashed line) passing through the barycenter, vector field $dx(t)$ pointing towards $(\frac{3}{2}, \frac{7}{2})$ for those who cooperate (region below $H$) and towards $(\frac{5}{2}, \frac{1}{2})$ for those who defect (region above $H$), $conv\{(\frac{3}{2}, \frac{7}{2}), (\frac{5}{2}, \frac{1}{2})\}$ is set of approachable points with population strategy $q = ((\frac{1}{2}, \frac{1}{2}), (\frac{1}{2}, \frac{1}{2}))$, barycenter is self-confirmed with uniform distribution over $X$

Figure 2.  Regret space of the Prisoner's dilemma: State space $X = conv\{(-1, 0), (0, 1)\}$ (solid line), initial distribution $\rho(x, 0)$ (grey area), and vector field $dx(t)$ converging to $y = (-0.5, 0.5)$

Figure 3.  Regret space of the coordination game: State space $X = conv\{(-1, 0), (0, 1), (0, -2), (2, 0)\}$ (boundary a solid line), and vector field $dx(t)$ converging to $(1, 0)$ (grey area) and $(0, -1)$ (white area), approachable point is $y = (0, -1)$, set of approachable points is $conv\{(1, 0), (0, -1)\}$ (dashed line) with mixed population strategy $q = (\frac{2}{3}, \frac{1}{3})$

Figure 4.  Regret space of parametric game with $a< 0 < b$: State space $X = conv\{(0, a), (-a, 0), (-b, 0), (0, b)\}$ (boundary a solid line), vector field $dx(t)$ converging to $(0, a)$ which is also an approachable vertex with population strategy $q = (1, 0)$, supporting hyperplane $H$ (dot-dashed line) intersects $X$ only at one point (the vertex)

Figure 5.  Regret space of parametric game with $0<b < a$: State space $X = conv\{(0, a), (-a, 0), (-b, 0), (0, b)\}$ (boundary a solid line), supporting hyperplane $H$ (dot-dashed line) passing through the vertex $(-b, 0)$, vector field $dx(t)$ converging to $(0, b)$ left of $H$ and to $(-b, 0)$ right of $H$, $conv\{(0, b), (-b, 0)\}$ is set of approachable points with population strategy $q = (0, 1)$, vertex $(-b, 0)$ is not self-confirmed, while vertex $(0, a)$ is self-confirmed with population strategy $q = (1, 0)$

•  [1] R. J. Aumann, Utility theory without the completeness axiom, Econometrica, 30 (1962), 445-462.  doi: 10.2307/1913746. [2] R. J. Aumann, Markets with a continuum of traders, Econometrica, 32 (1964), 39-50.  doi: 10.2307/1913732. [3] R. J. Aumann and  M. B. Maschler,  Repeated Games with Incomplete Information, MIT Press, 1995. [4] F. Bagagiolo and D. Bauso, Objective function design for robust optimality of linear control under state-constraints and uncertainty, ESAIM: Control, Optimisation and Calculus of Variations, 17 (2011), 155-177.  doi: 10.1051/cocv/2009040. [5] M. Bardi, Explicit solutions of some linear-quadratic mean field games, Network and Heterogeneous Media, 7 (2012), 243-261.  doi: 10.3934/nhm.2012.7.243. [6] J. Battelle, The Search, Nicholas Brealey Publishing, 2006. [7] D. Bauso, E. Lehrer, E. Solan and X. Venel, Attainability in repeated games with vector payoffs, INFORMS Mathematics of Operations Research, 40 (2015), 739-755.  doi: 10.1287/moor.2014.0693. [8] M. Benaïm, J. Hofbauer and S. Sorin, Stochastic approximations and differential inclusions, SIAM Journal on Control and Optimization, 44 (2005), 338–348. doi: 10.1137/S0363012904439301. [9] D. Blackwell, An analog of the minimax theorem for vector payoffs, Pacific J. Math., 6 (1956), 1-8.  doi: 10.2140/pjm.1956.6.1. [10] F. Blanchini, Set invariance in control – a survey, Automatica, 35 (1999), 1747-1768.  doi: 10.1016/S0005-1098(99)00113-2. [11] L. E. Blume, A. Brandenburger and E. Dekel, Lexicographic probabilities and choice under uncertainty, Econometrica, 59 (1991), 61-79.  doi: 10.2307/2938240. [12] N. Cesa-Bianchi and  G. Lugosi,  Prediction, Learning and Games, Cambridge University Press, 2006.  doi: 10.1017/CBO9780511546921. [13] B. Edelman, M. Ostrovsky and M. Schwarz, Internet advertising and the generalised second-price auction: Selling billions of dollars worth of keywords, American Economic Review, 97 (2007), 242-259. [14] N. J. Elliot and N. J. Kalton, The existence of value in differential games of pursuit and evasion, J. Differential Equations, 12 (1972), 504-523.  doi: 10.1016/0022-0396(72)90022-8. [15] Jeffrey C. Ely and William H. Sandholm, Evolution in Bayesian games I: Theory, Games and Economic Behavior, 53 (2005), 83-109.  doi: 10.1016/j.geb.2004.09.003. [16] D. Foster and R. Vohra, Regret in the on-line decision problem, Games and Economic Behavior, 29 (1999), 7-35.  doi: 10.1006/game.1999.0740. [17] John C. Harsanyi, Games with incomplete information played by 'bayesian' players, i–iii. part ii. Bayesian equilibrium points, Management Science, 14 (1968), 320-334.  doi: 10.1287/mnsc.14.5.320. [18] S. Hart, Adaptive heuristics, Econometrica, 73 (2005), 1401-1430.  doi: 10.1111/j.1468-0262.2005.00625.x. [19] S. Hart and A. Mas-Colell, A general class of adaptive strategies, Journal of Economic Theory, 98 (2001), 26-54.  doi: 10.1006/jeth.2000.2746. [20] S. Hart and A. Mas-Colell, Regret-based continuous-time dynamics, Games and Economic Behavior, 45 (2003), 375-394.  doi: 10.1016/S0899-8256(03)00178-7. [21] M. Y. Huang, P. E. Caines and R. P. Malhamé, Large population stochastic dynamic games: Closed loop McKean-Vlasov systems and the nash certainty equivalence principle, Communications in Information and Systems, 6 (2006), 221-252.  doi: 10.4310/CIS.2006.v6.n3.a5. [22] M. Y. Huang, P. E. Caines and R. P. Malhamé, Large-population cost-coupled LQG problems with non-uniform agents: Individual-mass behaviour and decentralized $\epsilon$-nash equilibria, IEEE Trans. on Automatic Control, 9 (2007), 1560-1571.  doi: 10.1109/TAC.2007.904450. [23] M. Y. Huang, P. E. Caines and R. P. Malhamé, Individual and mass behaviour in large population stochastic wireless power control problems: Centralized and nash equilibrium solutions., In Proc. of the IEEE Conference on Decision and Control, volume 42, pages 98–103, HI, USA, December 2003. [24] B. Jovanovic and R. W. Rosenthal, Anonymous sequential games, Journal of Mathematical Economics, 17 (1988), 77-87.  doi: 10.1016/0304-4068(88)90029-8. [25] J.-M. Lasry and P.-L. Lions, Jeux à champ moyen. I. Le cas stationnaire, Comptes Rendus Mathematique, 343 (2006), 619-625.  doi: 10.1016/j.crma.2006.09.019. [26] J.-M. Lasry and P.-L. Lions, Jeux à champ moyen. II. horizon fini et controle optimal, Comptes Rendus Mathematique, 343 (2006), 679-684.  doi: 10.1016/j.crma.2006.09.018. [27] J.-M. Lasry and P.-L. Lions, Mean field games, Japanese Journal of Mathematics, 2 (2007), 229-260.  doi: 10.1007/s11537-007-0657-8. [28] E. Lehrer, Allocation processes in cooperative games, International Journal of Game Theory, 31 (2002), 341-351.  doi: 10.1007/s001820200123. [29] E. Lehrer, Approachability in infinite dimensional spaces, International Journal of game Theory, 31 (2002), 253-268.  doi: 10.1007/s001820200115. [30] E. Lehrer, A wide range no-regret theorem, Games and Economic Behavior, 42 (2003), 101-115.  doi: 10.1016/S0899-8256(03)00032-0. [31] E. Lehrer and E. Solan, Excludability and bounded computational capacity strategies, Mathematics of Operations Research, 31 (2006), 637-648.  doi: 10.1287/moor.1060.0211. [32] E. Lehrer, E. Solan and D. Bauso, Repeated games over networks with vector payoffs: the notion of attainability, in Proceedings of the NetGCoop 2011, IEEE, Paris, France, 2011. [33] E. Lehrer and S. Sorin, Minmax via differential inclusion, Journal of Convex Analysis, 14(2): 271–273, 2007. [34] M. Maschler,  E. Solan and  S. Zamir,  Game Theory, Cambridge University Press, Cambridge, 2013.  doi: 10.1017/CBO9780511794216. [35] E. Roxin, Axiomatic approach in differential games, J. Optim. Theory Appl., 3 (1969), 153-163.  doi: 10.1007/BF00929440. [36] W. H. Sandholm,  Population Games and Evolutionary Dynamics, MIT Press, Cambridge, MA, 2010. [37] W. H. Sandholm, Evolution in Bayesian games II: Stability of purified equilibria, Journal of Economic Theory, 136 (2007), 641-667.  doi: 10.1016/j.jet.2006.10.003. [38] A. S. Soulaimani, M. Quincampoix and S. Sorin, Approachability theory, discriminating domain and differential games, SIAM Journal of Control and Optimization, 48 (2009), 2461-2479. [39] P. Varaiya, The existence of solutions to a differential game, SIAM Journal of Control and Optimization, 5 (1967), 153-162.  doi: 10.1137/0305009. [40] H. R. Varian, Position auctions, International Journal of Industrial Organization, 25 (2007), 1163-1178.  doi: 10.1016/j.ijindorg.2006.10.002. [41] N. Vieille, Weak approachability, Mathematics of Operations Research, 17 (1992), 781-791.  doi: 10.1287/moor.17.4.781. [42] John von Neumann, Zur Theorie der Gesellschaftsspiele, Math. Ann., 100 (1928), 295-320.  doi: 10.1007/BF01448847. [43] S. Zamir, Bayesian games: Games with incomplete information, in Computational Complexity, Vol. 1–6, Springer, New York, 2012. doi: 10.1007/978-1-4614-1800-9_16.

Figures(5)