DISCRETIZED BEST-RESPONSE DYNAMICS FOR THE ROCK-PAPER-SCISSORS GAME

Discretizing a differential equation may change the qualitative behaviour drastically, even if the stepsize is small. We illustrate this by looking at the discretization of a piecewise continuous differential equation that models a population of agents playing the Rock-Paper-Scissors game. The globally asymptotically stable equilibrium of the differential equation turns, after discretization, into a repeller surrounded by an annulus-shaped attracting region. In this region, more and more periodic orbits emerge as the discretization step approaches zero.

1. Introduction. Due to its origins in biology, evolutionary game theory often uses the replicator equation as a main tool to model changes in population states [11]. The replicator equation does not require assuming cognitive skills or rational behaviour of the individuals. However, in some contexts it is useful to assume a certain level of rationality. Here, the best-response dynamics [6] can be preferable to the replicator equation. It is defined as follows. For n ∈ N (the number of strategies), a payoff matrix A ∈ R^{n×n}, and a mixed strategy x ∈ ∆_n (the probability simplex in R^n), we define [9,12]

\[ BR(x) := \operatorname*{argmax}_{y \in \Delta_n} (yAx) = \Bigl\{ y \in \Delta_n : yAx = \max_{z \in \Delta_n} (zAx) \Bigr\}. \tag{1} \]

Then the best-response dynamics (BR-dynamics) is given by

\[ \dot{x} \in BR(x) - x, \tag{2} \]

which is a differential inclusion (or a piecewise continuous differential equation). The BR-dynamics is the continuous-time limit of fictitious play [3]. The usual derivation of the BR-dynamics is as follows. Consider a large population of players, each using one of the n pure strategies. The strategy distribution at time t is x(t) ∈ ∆_n. Assume that in a short time interval ε, a small randomly chosen fraction ε of the population changes their strategies to a best response from BR(x(t)) against the current population state x(t). Then

\[ x(t+\varepsilon) \in (1-\varepsilon)\, x(t) + \varepsilon\, BR(x(t)). \tag{3} \]

In the limit ε → 0 this leads to the BR-dynamics (2). In the present paper, we shall investigate the map (3), which can be viewed as a discretization of (2). Some general aspects of the best-response dynamics and its discretizations were analyzed in [2,10]. In this paper, we study the attractor of the discretized best-response dynamics for a simple game with n = 3 strategies: the Rock-Paper-Scissors game.
Other simple games such as the hawk-dove game and matching pennies were studied in [1].
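The update rule (3) is easy to experiment with numerically. The following is a minimal Python sketch, not the authors' code: it assumes the standard zero-sum Rock-Paper-Scissors payoff matrix with strategies ordered R, P, S, it always picks a pure best response, and it breaks ties by the lowest index. The names `best_response`, `step` and `orbit` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction

# Assumption: standard zero-sum Rock-Paper-Scissors payoffs (order R, P, S);
# A[i][j] is the payoff of strategy i against strategy j.
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]

def best_response(x, payoff=A):
    """Index of a pure best response to the population state x.
    Ties are broken by the lowest index (an arbitrary, illustrative choice)."""
    payoffs = [sum(row[j] * x[j] for j in range(len(x))) for row in payoff]
    return payoffs.index(max(payoffs))

def step(x, eps, payoff=A):
    """One step of the map (3): a fraction eps switches to a pure best response."""
    i = best_response(x, payoff)
    return tuple((1 - eps) * xj + (eps if j == i else 0)
                 for j, xj in enumerate(x))

def orbit(x, eps, n, payoff=A):
    """The first n iterates of x under step (x itself is not included)."""
    points = []
    for _ in range(n):
        x = step(x, eps, payoff)
        points.append(x)
    return points
```

Using exact fractions avoids rounding issues near the lines where the best response switches. For instance, with eps = 1/2 the state (4/7, 1/7, 2/7) returns to itself after three steps.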
2. Attractor of the discretized best-response dynamics. For convenience, we shall use the discretization step h > 0 such that ε = h/(1+h), to obtain

\[ x_{n+1} \in F(x_n) := \frac{x_n + h\, BR(x_n)}{1+h}. \tag{4} \]

Further, let us consider the Rock-Paper-Scissors game with the payoff matrix

\[ A = \begin{pmatrix} 0 & -1 & 1 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{pmatrix}. \tag{5} \]

For more general Rock-Paper-Scissors games see section 4 below. BR(x) is given by

with the (open) best response regions R_1, R_2 and R_3 (cf. Figure 1). On the boundary of the best response regions, i.e., on the line segments ℓ_i, BR(x) is multivalued. For example, for x ∈ ℓ_1, i.e., x_1 = 1/3, x_3 < 1/3, we have BR(x) = {(0, α, 1 − α) : 0 ≤ α ≤ 1}. And for the equilibrium e = (1/3, 1/3, 1/3) we have BR(e) = ∆_3. For the continuous-time BR dynamics (2), it is easy to see that

and therefore e is globally asymptotically stable [5,9]. Similar results hold for general zero-sum games [10,12]. The discretized BR dynamics is given by the following piecewise continuous map

The multivalued map F in (4) is the extension of F to the whole simplex ∆_3 such that the graph of F is closed and F(x) is convex for each x. Define the global attractor A_h of F as

Obviously, the equilibrium e is in A_h. Since e is the global attractor for (2), general results [2,10] imply that A_h shrinks to e as h → 0. However, e is a repeller for (4), see the Proposition below. Let A^e_h denote its dual attractor. It contains the ω-limit set of all orbits starting at x ≠ e. In the following we want to locate and fence in this attractor A^e_h. Let us consider the point

which is the intersection point of F_1(ℓ_3) with ℓ_1. Further, let T be the rotation

Let ∆_p denote the equilateral triangle spanned by p, Tp, and T²p (cf. Figure 3). Note that p, T²p, Tp form a periodic orbit of period 3 under the multivalued map F (but not under the piecewise continuous map (8)). This orbit is unstable. Since p ∈ F(e), there are orbits starting in e (and on ℓ_i close to e) that jump onto this periodic orbit in one step.
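The claim that a point of ℓ_1 generates a period-3 orbit of the multivalued map, but only with a mixed best response, can be checked with exact arithmetic. The sketch below is an illustration under stated assumptions, not a quotation from the paper: it uses the update x ↦ (x + h·b)/(1+h) with b ∈ BR(x), takes ℓ_1 to be the segment x_1 = 1/3, x_3 < 1/3, and uses coordinates for the candidate point and a mixing weight `alpha` that were worked out for this example. The helper `rotate` is a hypothetical cyclic coordinate shift standing in for the rotation (or its inverse).

```python
from fractions import Fraction

def rotate(x):
    """Cyclic shift (x1, x2, x3) -> (x3, x1, x2); hypothetical helper."""
    return (x[2], x[0], x[1])

def candidate_p(h):
    """Assumed coordinates of a point on l_1 (derived for this sketch)."""
    return (Fraction(1, 3), (1 + 2 * h) / (3 * (1 + h)), 1 / (3 * (1 + h)))

def multivalued_step(x, b, h):
    """(x + h*b)/(1+h) for one chosen element b of BR(x)."""
    return tuple((xi + h * bi) / (1 + h) for xi, bi in zip(x, b))

def check_period_three(h):
    """True if three steps with suitably mixed best responses return to p."""
    p = candidate_p(h)
    alpha = h / (3 * (1 + h))            # assumed mixing weight on l_1
    b = (Fraction(0), alpha, 1 - alpha)  # an element of BR(p) on l_1
    x = p
    for _ in range(3):
        nxt = multivalued_step(x, b, h)
        if nxt != rotate(x):
            return False
        # By cyclic symmetry, the next best response is the shifted one.
        x, b = nxt, rotate(b)
    return x == p
```

Since `alpha` lies strictly between 0 and 1 for h > 0, no pure best response reproduces this orbit, which is consistent with the remark that it is not an orbit of the piecewise continuous map.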
Proposition 1. The points inside the triangle ∆_p have no pre-image under the map (8).
Proof. The line segment from p to Tp is given by

We want to show that the points

Then we get a contradiction by

Finally, to check that also a pre-image in R_3 is impossible, we calculate

A slightly stronger result, now formulated for (4) instead of (8), is

Thus, the equilibrium e is a repeller and its dual attractor A^e_h is contained in the complement of the (open) triangle ∆_p and even the larger triangle ∆′_p = {x ∈ ∆_3 : h .

Now we want to find an upper bound or 'outer boundary' for the attractor A^e_h. Following orbits from R_1 we see that not all points of R_2 can be reached. The image F_1(ℓ_1) is a line segment parallel to ℓ_1. From the strip between ℓ_1 and F_1(ℓ_1), i.e., the trapezoid T_{q_0} spanned by the four corners e, F_1(e), q_0 = (1/3, 2/3, 0) ∈ ℓ_1, and F_1(q_0), the orbits move towards the vertex S. By tracking the 'outermost' point q_0 on ℓ_1 we can determine the border between the points in R_2 which can be reached from R_1 and the points in R_2 which are inaccessible from R_1 and, in the limit, obtain an 'outer boundary' for the attractor.
Constructing the outer boundary for the attractor.
More precisely, we construct a mapping G : ℓ_1 → ℓ_1 such that all orbits under the dynamics F of points in R_1 are inside of the orbit¹ of q_0 under G. Geometrically speaking, we take the ray from F_1(q_0) towards S and intersect it with the line segment ℓ_2. Instead of looking at the trapezoid between ℓ_2 and F_2(ℓ_2), etc., we exploit the cyclic symmetry and rotate 120° clockwise or 240° anticlockwise to the trapezoid T_{q_0}, see Figure 2. We define

where T is the rotation (10) and Z : R_2 → ℓ_2 is the central projection onto ℓ_2 in the direction of S, which is given by²

All together, we get

Since ℓ_1 is a line segment, it suffices to consider one coordinate

, g is contracting and therefore also G is contracting, which means that there exists a unique stable fixed point q ∈ ℓ_1. It is immediately clear that q ≠ q_0 and also that q is not the equilibrium e (because e is a repeller). Thus, the orbit of q under G defines an outer boundary for all orbits of R_1 (and thus for all points in ∆_3) under the original dynamics F. To get q, we solve

and obtain

and hence

¹ Every orbit under the map F (except the constant one at e) repeatedly enters T_{q_0} on its way cycling around e. Each orbit of (4), i.e., each sequence (x_n) such that x_{n+1} ∈ F(x_n) for all n = 1, 2, . . ., can be extended to a broken line by connecting x_n with x_{n+1} by a line segment, for n = 1, 2, . . . . As this broken line circles indefinitely around the equilibrium e, by looking at its intersection points with ℓ_1 and ℓ_2, we obtain a multivalued transition map F : ℓ_1 → ℓ_1. The graph of F consists of all pairs (x, Ty) such that x ∈ ℓ_1 and y ∈ ℓ_2 are consecutive intersection points of such a broken line with ℓ_1 and ℓ_2. For each x ∈ ℓ_1, F(x) is an interval (a subsegment) in ℓ_1. Our map G from (13) is the upper (outer) limit of F, i.e., F(x) ⊆ eG(x).
Thus, we have shown that the attractor of the discretized BR dynamics lies within the triangle ∆_q spanned by F_1(q), T(F_1(q)) and T²(F_1(q)). Together with Proposition 1, we have shown the following.
Theorem 2.1. The attractor A^e_h of the discretized BR dynamics (8) is contained in the band between the two triangles, ∆_q \ ∆_p (cf. Figure 3).
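The contraction argument for the outer boundary can be illustrated numerically. The sketch below rests on assumptions made for this example and is not quoted from the paper: ℓ_1 is parametrised by its second coordinate s ∈ [1/3, 2/3], a point of ℓ_1 is first moved by one step towards P, then projected from S onto the next boundary segment and rotated back. Under these assumptions a direct computation gives the one-dimensional map g(s) = 2/3 − 1/(9(s + h)), whose derivative 1/(9(s + h)²) is below 1 on the interval. The function names are hypothetical.

```python
import math

def g(s, h):
    """Assumed one-dimensional return map on l_1 (second coordinate)."""
    return 2.0 / 3.0 - 1.0 / (9.0 * (s + h))

def fixed_point_by_iteration(h, s0=2.0 / 3.0, n=10000):
    """Iterate g starting from the outermost point s0 = 2/3 of l_1."""
    s = s0
    for _ in range(n):
        s = g(s, h)
    return s

def fixed_point_closed_form(h):
    """Larger root of s^2 + (h - 2/3) s + (1/9 - 2h/3) = 0, i.e. of g(s) = s."""
    return 1.0 / 3.0 - h / 2.0 + 0.5 * math.sqrt(h * h + 4.0 * h / 3.0)
```

For small h the fixed point lies at a distance of roughly sqrt(h/3) from 1/3, which matches the square-root order of the outer boundary discussed in the text.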

Figure 3. The outer triangle ∆_q is constructed such that the ω-limits of all orbits must be inside of it. The inner triangle ∆_p contains the set of points which do not have a pre-image under F. Thus, the region bounded by the two green triangles attracts all orbits, except the constant one at e. (Vertices S, R, P; the point p is marked.)
3. Periodic orbits. We now compute the periodic orbits of the map (8) whose broken lines produce an isosceles triangle. If a point x_n ∈ R_1 generates such a periodic orbit of period 3n, then by cyclic symmetry with the rotation T as above, and

Solving the linear equation (21) gives

These points exist for any n; however, they define periodic orbits only if

The latter follows from equation (23), and for the former we need

Thus, if n = 1, this condition holds for all h > 0, and for each n ≥ 2 there is a threshold h_n such that inequality (26) holds for h ≤ h_n. In other words, there is an (asymptotically stable) periodic orbit with period 3 for every h > 0. For h → ∞ it approaches the best reply cycle {R, P, S}. As h gets smaller, higher-period orbits emerge, while the lower-period orbits are maintained, resulting in multiple coexisting periodic orbits (cf. Figure 4). Each of these periodic orbits is asymptotically stable as soon as none of its points is on a line segment ℓ_i, i.e., if h ≠ h_n. This follows since the maps F_i are contractions on R_i. We can measure their distance from the center e by comparing their position on ℓ_1. Intersecting ℓ_1 with the line segment x_n P gives

Note that for n = 1 we get precisely the point p from (9). Figure 4 shows the emergence of periodic orbits and their distance from the equilibrium. Figure 5 shows the basins of attraction of these orbits. The accompanying movie³ shows how these basins change with decreasing stepsize h.

Figure 4. Periodic orbits of periods 3n, which exist for h < h_n, are shown for n ≤ 5. The red curves correspond to the inner and outer demarcation of the attractor calculated in section 2, and h_k are numerical solutions to the equation corresponding to (26).
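For the simplest case n = 1, the period-3 orbit can be written down and checked exactly. The sketch below is an illustration under assumptions made here rather than the paper's own formulas: the update is x ↦ (x + h·e_i)/(1+h) with e_i a pure best response for the standard zero-sum payoff matrix, and with r = 1/(1+h) the candidate orbit point is (1, r², r)/(1 + r + r²). It also intersects the line from that point to the vertex P = (0, 1, 0) with the line x_1 = 1/3. All function names are hypothetical.

```python
from fractions import Fraction

# Assumption: standard zero-sum Rock-Paper-Scissors payoffs (order R, P, S).
A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

def best_response(x):
    """Index of a pure best response (lowest index wins ties)."""
    payoffs = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    return payoffs.index(max(payoffs))

def step(x, h):
    """One step of the discretized map with stepsize h."""
    i = best_response(x)
    return tuple((xj + (h if j == i else 0)) / (1 + h)
                 for j, xj in enumerate(x))

def period_three_point(h):
    """Candidate point of the period-3 orbit whose best response is P."""
    r = 1 / (1 + h)
    total = 1 + r + r * r
    return (1 / total, r * r / total, r / total)

def meet_l1_towards_P(x):
    """Point where the line from x to P = (0, 1, 0) meets x1 = 1/3."""
    lam = Fraction(1, 3) / x[0]  # scaling of the non-P coordinates
    return (lam * x[0], 1 - lam * (x[0] + x[2]), lam * x[2])
```

As h grows, r tends to 0 and the orbit point tends to a vertex, in line with the statement that the orbit approaches the best reply cycle. The intersection point has third coordinate 1/(3(1+h)), so its distance from e on ℓ_1 is of linear order in h.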

4. Generalisation.
Let us now consider the general symmetric Rock-Paper-Scissors game with the payoff matrix

\[ A = \begin{pmatrix} 0 & -b & a \\ a & 0 & -b \\ -b & a & 0 \end{pmatrix} \tag{28} \]

with a, b > 0. Then the regions R*_i are separated by the line segments

with i ∈ Z_3, i.e., i + 3 = i. Decoration with a * signifies the analogues for the game (28) of the concepts and notation used so far for game (5).
If a > b then the equilibrium e is evolutionarily stable, and globally asymptotically stable for all standard continuous-time dynamics, whereas for a < b the game (28) is positive definite and the equilibrium e is repelling, see [7]. Since the expected payoff at e is (a − b)/3, for a < b the equilibrium e is unattractive as a solution: it gives less than the tie payoff 0.
For the discretized best response dynamics, e is repelling for all a, b > 0 and all h > 0. Again, the attractor can be confined to an annulus-shaped region.Compared to the case a = b = 1, the inner boundary is a smaller (a > b) or larger (a < b) rotated triangle.Similar to Proposition 2 we get.
Proof. Suppose x ≠ e and w.l.o.g.
Note that for a < b, the repelling region around e is even wider, similar to the continuous best-response dynamics, where the BR dynamics converges to the Shapley triangle [5,9]. Regarding the 'outer boundary', let us now consider the point on ℓ*

By tracking q*_0 we get the border between the points in R*_2 which can be reached from R*_1 and the points in R*_2 which are inaccessible from R*_1. As before, we construct a mapping G* : ℓ*_1 → ℓ*_1 such that all orbits under the dynamics F* of points in R*_1 are inside of the orbit of q*_0 under G*:

Note that the central projection Z* now maps R*_2 to ℓ*_2. It is given by

The explicit formula for G* is more complicated, so we give only the second coordinate

Since g* is a linear fractional map (both numerator and denominator are linear in x_2) that maps the interval [1/3, (a+b)/(2a+b)] into itself, it is a contraction (w.r.t. projective distance), see e.g. [5].

³ Supplementary file at http://homepage.univie.ac.at/Josef.Hofbauer/AnimationBR.avi
Therefore g* and hence G* has a unique, stable fixed point q* ∈ ℓ*_1. Thus, as before, the orbit of q* under G* defines an outer boundary for all orbits of R*_1 (and thus for all points in ∆_3) under the map F*. To get q*, we solve g*(q*_2) = q*_2 and obtain

Hence

and, for a = b, the difference is of order h (since 2 approaches 1/3 as h → 0. And for a > b, the attractor A*_h shrinks to e linearly with h as h → 0, whereas for a = b it does so only with order √h. For b > a, q*_2 → b²/(a² + ab + b²), and the limit of q* is a corner of the Shapley triangle; compare the expressions in [5,9]. Thus, we have shown that the attractor of the discretized BR dynamics lies in the triangle ∆_{q*} spanned by F_1(q*), T(F_1(q*)) and T²(F_1(q*)). Together with Proposition 3, and by defining

we have shown the following.

Theorem 4.1. The attractor A^{e*}_h of the discretized BR dynamics for game (28) is contained in the region between the two triangles, ∆_{q*} \ ∆_{p*}.
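The three regimes a > b, a = b and a < b can be explored with a small numerical sketch. The formula used below is an assumption derived for this illustration and is not the paper's displayed expression: it takes the payoff matrix with win payoff a and loss payoff −b, parametrises the boundary segment between the regions with best responses P and S by its second coordinate s, moves one step towards P, projects from S onto the next boundary segment and rotates back. The names `g_star` and `outer_fixed_point` are hypothetical.

```python
def g_star(s, h, a, b):
    """Assumed one-dimensional return map for win payoff a and loss payoff b."""
    # First coordinate of the point on the boundary segment with second
    # coordinate s, from (a + b) x1 = a x2 + b x3 and x1 + x2 + x3 = 1.
    x1 = (b + (a - b) * s) / (a + 2 * b)
    # After the step towards P the common factor 1/(1+h) cancels in the
    # central projection, so unnormalised coordinates suffice.
    y1, y2 = x1, s + h
    lam = a / ((2 * a + b) * y2 + (a - b) * y1)
    return 1.0 - lam * (y1 + y2)

def outer_fixed_point(h, a, b, n=5000):
    """Iterate g_star from the outermost point s0 = (a + b)/(2a + b)."""
    s = (a + b) / (2 * a + b)
    for _ in range(n):
        s = g_star(s, h, a, b)
    return s
```

With a = b = 1 this reduces to s ↦ 2/3 − 1/(9(s + h)). For a = 1, b = 2 and very small h the iteration settles near 4/7 = b²/(a² + ab + b²), the Shapley-triangle value quoted in the text, while for a = 2, b = 1 the distance of the fixed point from 1/3 roughly doubles when h is doubled.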
Overall, if a > b, the dynamics behaves qualitatively very similarly to the case a = b: more and more periodic orbits emerge as h → 0. For b > a, the general results in [2] imply that the dual attractor A^{e*}_h approaches the Shapley triangle as h → 0; therefore only orbits within a narrow range of periods (proportional to 1/h) are possible. If the stepsize h > 0 is fixed, then the behaviour of F is robust against small perturbations of a and b. The periodic orbits constructed in section 3 persist as long as they stay in the open regions R*_i and do not hit the lines ℓ*_i.

5. Conclusion.
For the standard Rock-Paper-Scissors game (5) we have shown that all trajectories of (4) and (8) (except the one staying at the fixed point e) end up in a region confined between the two triangles, as can be seen in Figure 3. As h approaches 0, higher-period orbits emerge. By construction of the map G for the outer boundary, new periodic orbits emerge precisely on this boundary. Therefore the convergence of the outer boundary of the attractor towards the center cannot be faster than of square-root order, in the following sense: we can measure the distance of the triangles ∆_p and ∆_q from the center by, e.g., the x_2 coordinate of their intersection with ℓ_1. Then from (9) and (19) we get

As h → 0, the inner boundary converges to the center with linear order, while the outer boundary approaches the center only with the order of the square root.
To some extent the behaviour is similar to that in [8]: for a smooth differential equation in R² with an asymptotically stable equilibrium with purely imaginary eigenvalues, discretization makes the equilibrium unstable and produces, as a new attractor, an invariant curve around the equilibrium of radius proportional to √h and with rotation number proportional to h. The behaviour in the present paper is more complicated, due to the discontinuity. The attractor still shrinks like √h towards the equilibrium, but the smaller h is, the more complex the dynamics.
For the non-zero-sum version (28) the results are similar. The equilibrium e is always unstable. Thus, this dynamics is a better match for many experimental results [4,13,14] than the continuous-time dynamics. Cason et al. [4] observe persistent deviation from the equilibrium and sustained oscillations around the equilibrium, both for the stable and the unstable type of RPS games. Only the cycle amplitudes are consistently larger in the unstable games, similar to our dynamics.

Figure 1. Best response regions R_i of the Rock-Paper-Scissors game, separated by line segments ℓ_i.

Figure 5. Periodic orbits of various periods together with their (numerically calculated) respective basins of attraction, for various values of the stepsize h.Red is the basin of attraction for period 3, dark red for period 6, light green: 9, green: 12, yellow 15, olive 18 and blue 21.The inner and outer triangles ∆ p and ∆ q are also shown (gray lines).