SPATIAL COMPETITIVE GAMES WITH DISINGENUOUSLY DELAYED POSITIONS



(Communicated by Jim M Cushing)
Abstract. Over the last decades, spatial games have received great attention from researchers studying the behavior of populations of players over time in a spatial structure. One of the main factors that can greatly affect the behavior of such populations is the updating scheme used to assign new strategies to players. Synchronous updating, in which all players update their strategies at the same time, is the most common scheme. To describe the behavior of populations more realistically, several asynchronous updating schemes have been proposed. An asynchronous game does not use a universal clock, and players can update their strategies at different time steps during play.
In this paper, we introduce a new type of asynchronous strategy updating in which some of the players hide their updated strategy from their neighbors for several time steps. It is shown that this hiding can change the behavior of populations but does not necessarily lead to a higher payoff for the dishonest players. The paper also shows that, with dishonest players present, the average payoff of players is less than what they think they get, because they are not aware of their neighbors' true strategies.
1. Introduction. Game theory is defined as the study of mathematical models that capture conflict and cooperation between rational decision-makers [16]. In classical game theory, an individual plays a game against another individual, and each has to decide between different strategies to maximize a payoff that depends on the other player's strategy. Thus, it can describe how rational players behave when interacting with each other. It has also been used to model competition in competitive evolution, which is referred to as evolutionary game theory [18]. Evolutionary game theory is mainly the study of a population of competing players in which players interact with each other and go through evolution [18].
Evolutionary game theory was fostered by evolutionary biologists and then found many applications in non-biological fields such as economics and learning theory.
One can say that evolutionary game theory, in contrast with classical game theory, deals with a population of players who have to decide between different strategies, and strategies with high payoff spread through the population by learning or copying [12]. This behavior is studied using the replicator equation, which describes the evolution of the frequencies of population types or strategies. In deriving the replicator equation, it is assumed that there exists a well-mixed population in which every individual interacts with all other individuals with the same probability. In many populations, however, individuals have different probabilities of interaction, and the structure of the interactions between individuals can affect the outcome of evolution. Spatial evolutionary game theory was created in order to model these types of interactions [21].
Spatial game theory was introduced by Nowak and May in the early 1990s and involves evolutionary games with strategies distributed over some spatial region [14]. Spatial games therefore apply evolutionary game dynamics on a spatial structure, combining evolutionary game theory and cellular automata (CA). In these games, each player plays the game with its neighbors on a grid and, based on the players' strategies, the various positions are occupied by the winning strategy (or species in evolutionary games) [18]. To illustrate, assume a lattice in which each cell is occupied by a player, each player has a set of strategies to adopt, and there is a payoff associated with each strategy playing against another strategy. At time zero a strategy from this set is assigned to each player, and the total payoff of a player is defined as the sum of the payoffs resulting from playing with all the neighbors of that player. Using the total payoff, one can define a dynamic process that assigns a strategy to a cell in the next generation, usually the strategy of the player with the highest total payoff in the neighborhood of that cell, including the cell itself. This dynamic process continues until the lattice reaches a steady state in which no player changes its strategy.
In most spatial games, the strategies of cells are updated in synchrony. A synchronous method of updating means that at every round all players on the lattice play at the same time and all cells are updated simultaneously. In contrast, in the asynchronous approach not all cells are updated at the same time; they can be updated at different times. This approach can be used for modeling real-world problems in which there is no universal clock to enforce synchronized updating. Using an asynchronous updating method can result in different dynamic behavior in comparison with the synchronous method [2]. Several studies have used asynchronous updating methods and have probed the differences between their behavior and that of the synchronous ones. The types of asynchronous updating most commonly used in the literature [17] are as follows:
1. Random updating: In this updating scheme, a time step used for synchronously updating all the strategies is broken into n smaller time steps, where n is the number of players. These smaller time steps are known as micro-time-steps. At each micro-time-step a player is selected at random from the population to play against its neighbors and update its strategy. The probability of being chosen for updating is independent of the updating history. This method is commonly used to study the effect of asynchronous updating on the stability of cellular automata or on emerging cooperation [2,17,20,24,9,28,3,7]. This approach is also known as "Random Asynchronous Updating with Replacement", because the strategy of a player can be updated several times within the micro-time-steps of one time step [17].
2. Random order updating: In this updating method the strategies of players are updated in a random order at each time step. At each micro-time-step one player updates its strategy, and the order of updating is randomly generated. After all the players have updated their strategies, a new order for updating is created. This approach is also known as "Random Asynchronous Updating without Replacement", because it is similar to Random updating except that once a player has updated its strategy it cannot be updated again in the same time step. Several studies have used this method in comparison to Random updating [2,17,20,24,3].
3. Cyclic updating: This updating scheme is similar to Random order updating, but cells are updated in a fixed order. The order of updating is fixed through the entire simulation and is not regenerated after each time step. This approach is also known as "Random asynchronous updating with a fixed order" and has been used in some studies in comparison with Random updating [2,3,10].
4. Clocked updating: In this updating method a clock is assigned to each player. Initially, each clock is set to a random starting position. A player updates its status when its clock reaches the top of its period, which is the updating time of that player. This updating process continues until a given number of time steps is reached [17,24,3]. A newer variant of this updating type lets the clock of each player be influenced by the clocks of other players [17].
5. α-synchronous updating: In this updating scheme an updating probability function is used to update a cell. Based on this function a cell updates its value with probability α, or remains unchanged with probability 1 − α. Several studies in the literature on asynchronous updating have used this kind of probability-based updating [19,11,5,4].
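As a rough illustration of how four of these schemes differ, the sketch below generates the list of player indices to update during one time step. The function names are hypothetical and chosen only for this example; clocked updating is omitted because its clock periods are not specified in the text.

```python
import random

def random_with_replacement(n, rng):
    """Random updating: n micro-time-steps, any player may be drawn repeatedly."""
    return [rng.randrange(n) for _ in range(n)]

def random_order(n, rng):
    """Random order updating: a fresh permutation of all players per time step."""
    order = list(range(n))
    rng.shuffle(order)
    return order

def cyclic_order(n, rng):
    """Cyclic updating: draw one permutation and reuse it at every time step."""
    fixed = random_order(n, rng)
    return lambda: list(fixed)

def alpha_synchronous(n, alpha, rng):
    """Alpha-synchronous updating: each player updates with probability alpha."""
    return [i for i in range(n) if rng.random() < alpha]

rng = random.Random(0)
print(random_with_replacement(5, rng))  # indices may repeat
print(random_order(5, rng))             # every index exactly once
```

With alpha equal to 1 the last scheme selects every player, which recovers synchronous updating.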
In addition to these more common methods of asynchronous updating, some studies have used novel updating methods. Lee et al. [15] presented a new cellular automaton in which each cell consists of sub-cells that are joined to each other, and at each time step a random portion of these cells is updated.
As illustrated here, there are several methods that can be used to update a game asynchronously. In most of these methods, a time step for updating the strategies of all players is broken into micro-time-steps. Thus, players update their strategies at some instant before the general time step by which all players have updated their strategies. This study proposes a method in which, in contrast to most asynchronous updating methods, a few players appear not to update their strategy for several general time steps, although they do update it privately. Then, after a defined number of time steps, those players reveal their updated strategy to their neighbors. This updating scheme can be referred to as an asynchronous strategy with dishonest players, since the flow of information and the decisions about winning strategies in this method are not the same as in synchronous updating.
Moreover, some studies have used a type of vague information shared between players. Bouré et al. [6] introduced a β-synchronism updating method. In this method, there is a probability of disrupting the transmission of information between cells. They showed that β-synchronous updating causes phase transitions. Chen et al. [8] analyzed players' long-run behavior in evolutionary coordination games with imperfect monitoring in a large population, where players observe signals of other players' unseen actions and extract information from the signals by using a proposed simple or maximum likelihood estimation algorithm. They showed that the players' method of extracting information from the observed signal has a critical impact on the long-run behavior in evolutionary games. Zhang and Chen [29] considered a situation in which the required payoff information does not exist. They built a model based on a switching probability for each player and showed that the number of neighbors consulted for updating and the relationship of the switching probabilities between competing strategies can intrinsically affect the evolutionary game. Wang et al. [25] considered a silence strategy in the prisoner's dilemma game, where players can engage in the game as cooperators or defectors, or stay silent and gain no payoff. They showed that for a small payoff level the silence strategy can increase the frequency of cooperation, but for a large payoff level a great majority of players choose the silence strategy to avoid a potentially high loss from engaging in the game. They also presented an intermediate payoff level that could guarantee the optimal circumstances for cooperation. Tanimoto [22] proposed a mixed strategy system for the spatial prisoner's dilemma in which a player stochastically shows different behaviors to its neighbors based on the agent's overall strategy. He showed that in this model cooperation increases in comparison with the common mixed strategy model in which players offer strategies stochastically.
The proposed method can be useful in many real-world situations where information transmission is delayed (intentionally or not), for example between competing companies, between political parties, or even in disease transmission. In the following section the delayed updating method is described, followed by a description of the game used in this model; finally, the results of delayed updating and the consequent analysis are discussed.
2. Methodology. This model considers players on a two-dimensional square lattice in which each player has 8 neighbors, namely the eight cells surrounding it (Moore neighborhood). It is also assumed that the boundaries are not wrapped, so the number of neighbors of a cell at an edge or in a corner equals the number of cells it is in immediate contact with. Thus the cells at the edges have 5 neighbors and the cells in the corners have 3 neighbors. The steps required to build a model that represents the behavior of a lattice in evolutionary spatial games with some dishonest delayed players are as follows:
1. Generate a lattice of size $n^2$. Populate the lattice with players of type A and type B. Players of type A constitute i% of the players while players of type B constitute the remaining (100 − i)% of the $n^2$ players. In this model players using strategy A are called Hawks and players with strategy B are referred to as Doves.
2. The honest and dishonest positions are randomly allocated to cells in the lattice.
3. Each player plays against its neighbors based on a given payoff matrix. Payoff calculation is described in section 2.2. Each player then adopts the strategy with the highest payoff.
4. All players update their true strategy in a synchronous manner, but the dishonest players do not reveal their strategy for t time steps. The reader should note that the apparent and true strategies are the same for honest players.
5. Steps 3 and 4 repeat for a defined number of iterations or until the lattice reaches quasi-equilibrium.
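Steps 1 and 2 can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `init_lattice`, the single-letter strategy codes, and the choice to place Hawks uniformly at random are all assumptions made for the example.

```python
import random

HAWK, DOVE = "H", "D"  # hypothetical codes for strategies A and B

def init_lattice(n, hawk_fraction, liar_fraction, rng):
    """Return (strategies, liars): two n-by-n grids as lists of rows.

    strategies holds HAWK or DOVE; liars holds True for dishonest cells.
    Both are shuffled independently, which is an assumption of this sketch.
    """
    cells = n * n
    n_hawks = round(cells * hawk_fraction)
    strategies = [HAWK] * n_hawks + [DOVE] * (cells - n_hawks)
    rng.shuffle(strategies)

    n_liars = round(cells * liar_fraction)
    liars = [True] * n_liars + [False] * (cells - n_liars)
    rng.shuffle(liars)

    def to_grid(flat):
        return [flat[r * n:(r + 1) * n] for r in range(n)]

    return to_grid(strategies), to_grid(liars)

# Settings matching the experiments described later: 50 by 50, 50% Hawks, 30% liars.
strategies, liars = init_lattice(50, 0.5, 0.3, random.Random(1))
```

Keeping the dishonesty mask separate from the strategy grid makes it easy to rerun the same initial lattice with a different distribution of liars.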
2.1. Type of game used. Spatial evolutionary games have looked into a variety of well-defined games. In populations with two strategies, 2 × 2 games are used to represent the payoff of players of the different strategies competing with each other. These games are normally described by a general payoff matrix as follows [18]:
\[
\begin{array}{c|cc}
 & A & B \\ \hline
A & M_{11} & M_{12} \\
B & M_{21} & M_{22}
\end{array}
\]
This payoff matrix shows that A gets a payoff of $M_{11}$ when playing with A; A gets $M_{12}$ when playing with B; similarly, B gets $M_{21}$ when playing with A and B gets $M_{22}$ when playing with B.
Based on the payoff matrix, several types of games can be defined by changing the relations between the payoffs. There are three types of games most commonly used to study the behavior of players; these games are the Prisoners' Dilemma, Chicken (or Hawk and Dove) and Stag Hunt game.
In the Prisoners' Dilemma game, two persons are arrested because of a joint crime. Each of them can either cooperate with the other person and remain silent (strategy C), or can defect and confess (strategy D). If both cooperate, then both get $M_{22}$ points. If one cooperates while the other one defects, then the cooperator gets $M_{21}$ points, which is less than $M_{22}$, and the defector gets $M_{12}$ points, which is more than $M_{22}$. If both defect, they both get $M_{11}$ points, which is more than $M_{21}$ but less than $M_{22}$. So, the relation between the payoffs is $M_{12} > M_{22} > M_{11} > M_{21}$. This game shows that the most rewarding policy is to defect but only if the other player does not defect.
In the Chicken Game, there are two players fighting for a resource. In this game b is the value of the contested resource, and C is the cost of an escalated fight. It is also assumed that the value of the resource is less than the cost of a fight.
In the Stag Hunt game, two individuals are going on a hunt. Each person can choose either to hunt a stag or a hare. If one decides to hunt a stag, he must have the cooperation of the other person to succeed, and a stag is worth more than a hare. In this game, if both individuals decide to hunt a hare each gets $M_{11}$ points. If one of them decides to hunt a stag and the other one a hare, the one who has decided to hunt a hare gets $M_{12}$ while the other person gets nothing ($M_{21} = 0$). Otherwise, they can hunt a stag together and each gets $M_{22}$. So, the relation between the payoffs is $M_{22} > M_{12} > M_{11} > M_{21}$.
In evolutionary dynamic games, a game contains a dilemma if the joint strategy with the highest payoff ($M_{ii}$) does not always have a non-negative replicator dynamic [23]. In a 2 × 2 game world, the payoff matrix alone determines whether a dilemma occurs in a game or not [29]. In all the games presented above $M_{22} > M_{11}$, and the difference in their dilemma potential can be expressed as $DL_1 = M_{11} − M_{21}$ and $DL_2 = M_{12} − M_{22}$. If $DL_1 ≥ 0$ or $DL_2 ≥ 0$, then a certain type of dilemma arises. The Prisoners' Dilemma is a particular kind of game because, under the payoff ordering given above, both conditions are satisfied at the same time. The Chicken game has just $DL_2 > 0$ and the Stag Hunt game has just $DL_1 > 0$. This study analyzes only the second type of dilemma, which is reflected in Chicken-type games, as it is deemed the most relevant type of game to the issue at hand [27,13].
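The two dilemma potentials are simple differences of payoff entries, so the classification can be checked numerically. The sketch below uses hypothetical function names and illustrative payoff values chosen only to satisfy the orderings stated for each game; it tests strict positivity, as in the per-game description.

```python
def dilemma_potentials(m11, m12, m21, m22):
    """Return (DL1, DL2) with DL1 = M11 - M21 and DL2 = M12 - M22."""
    return m11 - m21, m12 - m22

def dilemma_label(m11, m12, m21, m22):
    """Name which of the two potentials is strictly positive."""
    dl1, dl2 = dilemma_potentials(m11, m12, m21, m22)
    if dl1 > 0 and dl2 > 0:
        return "DL1 and DL2"
    if dl2 > 0:
        return "DL2 only"
    if dl1 > 0:
        return "DL1 only"
    return "none"

# Illustrative values only; each tuple is (M11, M12, M21, M22).
examples = {
    "ordering M12 > M22 > M11 > M21": (1, 5, 0, 3),
    "Chicken-like, M11 < M21": (-3, 4, 0, 2),
    "ordering M22 > M12 > M11 > M21": (2, 3, 0, 4),
}
for name, entries in examples.items():
    print(name, "->", dilemma_label(*entries))
```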

2.2. Payoff calculation. In every time step, each player plays against all of its immediate neighbors. The payoff of each player (cell) is calculated as the sum of all the payoffs that the player gets from playing with all its neighbors. The payoff matrix shows the payoff that a player gets when confronting any other player type (Table 1). It is worth mentioning that using the payoff matrix alone in calculating the game's outcome turns the game into a deterministic one, meaning that if one knows the arrangement of the players in the lattice, the result of the game is known with certainty.
In this table, b is the value of the contested resource, and C is the cost of an escalated fight in the Hawk and Dove game.
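The payoff sum over a non-wrapped Moore neighborhood can be sketched as below. Table 1 is not reproduced in this text, so the payoff entries here are an assumption: the common textbook Hawk and Dove matrix, in which two Hawks each get (b − C)/2, a Hawk meeting a Dove gets b, a Dove meeting a Hawk gets 0, and two Doves each get b/2. All function names are hypothetical.

```python
HAWK, DOVE = "H", "D"

def payoff_matrix(b, cost):
    """Assumed textbook Hawk-Dove payoffs, keyed by (own, opponent)."""
    return {
        (HAWK, HAWK): (b - cost) / 2,
        (HAWK, DOVE): b,
        (DOVE, HAWK): 0.0,
        (DOVE, DOVE): b / 2,
    }

def moore_neighbors(r, c, n):
    """Yield the in-bounds cells around (r, c); edges are not wrapped."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr or dc) and 0 <= r + dr < n and 0 <= c + dc < n:
                yield r + dr, c + dc

def total_payoffs(grid, matrix):
    """Sum, for every cell, its payoff against each of its neighbors."""
    n = len(grid)
    return [
        [
            sum(matrix[(grid[r][c], grid[nr][nc])]
                for nr, nc in moore_neighbors(r, c, n))
            for c in range(n)
        ]
        for r in range(n)
    ]

# A single Hawk surrounded by Doves on a 3 by 3 lattice, with b = 4 and C = 10.
grid = [[DOVE] * 3 for _ in range(3)]
grid[1][1] = HAWK
payoffs = total_payoffs(grid, payoff_matrix(4, 10))
```

Because the computation uses only the lattice and the matrix, running it twice on the same arrangement gives the same payoffs, which is the determinism noted above.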

2.3. Updating rule. To update a cell in general, we look for the player with the highest payoff in the neighborhood of the cell (the center cell) and check whether the payoff of that player is higher than the payoff of the center cell. If it is higher, the center cell adopts the strategy of the cell with the highest payoff; otherwise it keeps its current strategy. It is possible that more than one neighbor has the same highest payoff. In this case, if there is at least one winning neighbor whose strategy is the same as that of the center cell, the center cell keeps its strategy. Otherwise the center cell adopts the winning strategy.
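The rule, including the tie-breaking case, can be written as a small function. This is a sketch assuming a two-strategy game and payoffs that have already been computed for every cell; the names `next_strategy` and `moore_neighbors` are hypothetical.

```python
def moore_neighbors(r, c, n):
    """Return the in-bounds cells around (r, c); edges are not wrapped."""
    return [(r + dr, c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr or dc) and 0 <= r + dr < n and 0 <= c + dc < n]

def next_strategy(grid, payoffs, r, c):
    """Strategy of cell (r, c) in the next generation."""
    neigh = moore_neighbors(r, c, len(grid))
    best = max(payoffs[nr][nc] for nr, nc in neigh)
    if best <= payoffs[r][c]:
        return grid[r][c]          # no neighbor does strictly better
    winners = {grid[nr][nc] for nr, nc in neigh if payoffs[nr][nc] == best}
    if grid[r][c] in winners:
        return grid[r][c]          # a winning neighbor shares our strategy
    # With two strategies, winners now holds exactly the other strategy.
    return next(iter(winners))

grid = [["D", "D", "D"],
        ["D", "H", "D"],
        ["D", "D", "D"]]
payoffs = [[4, 8, 4],
           [8, 32, 8],
           [4, 8, 4]]
print(next_strategy(grid, payoffs, 0, 0))  # corner Dove sees the central Hawk
```

A synchronous update applies this function to every cell using the same payoff grid and only then overwrites the lattice.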

2.4. Hiding strategy. Hiding strategy is the behavior of players who are dishonest in showing their updated strategy to their neighbors. In other words, they present an outdated strategy but keep track of their real strategy, so their own payoff calculations are correct but their neighbors' are not. It is assumed here that a player learns about other players' payoffs through direct communication, and thus each player can only know the displayed payoff of its neighbors, which is calculated based on their displayed strategy. However, the dishonest players, knowing their true strategy, can calculate their own actual payoff. In the following, consider the center player as the dishonest player. In Figure 1, it can be seen that the strategy of the central player after one step will be Hawk, but the player displays its previous strategy, which is Dove. All other players calculate their payoff based on the displayed strategy (Dove) except the "liar", who knows its true strategy. The resulting payoff is what the neighbors of a liar perceive, and the honest payoff is the payoff that a liar calculates based on its true strategy and the perceived payoff information that it gets from its neighbors. In the next time step the center player reveals its true strategy, Hawk, which is calculated based on the honest payoff lattice, and all other players reveal their strategy based on the perceived payoff lattice. This hiding strategy of the center player has changed the destiny of the lattice from all Hawks to more Doves.
In Figure 1, the first row shows the displayed strategy in asynchronous updating, the second row shows the true strategy of each player and the third one shows the strategy of the players using the regular synchronous updating.
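One possible reading of the hiding mechanism is sketched below; it is not the authors' code. It assumes that a liar displays the strategy it truly held `delay` steps earlier, that each advertised payoff is computed from displayed strategies only, and that a liar judges itself by its true strategy played against what its neighbors display. The payoff entries are the assumed textbook Hawk and Dove values, and all names are hypothetical.

```python
HAWK, DOVE = "H", "D"

def payoff_matrix(b, cost):
    # Assumed textbook Hawk-Dove payoffs, keyed by (own, opponent).
    return {(HAWK, HAWK): (b - cost) / 2, (HAWK, DOVE): b,
            (DOVE, HAWK): 0.0, (DOVE, DOVE): b / 2}

def neighbors(r, c, n):
    return [(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr or dc) and 0 <= r + dr < n and 0 <= c + dc < n]

def payoff(own, r, c, shown, matrix):
    """Payoff of strategy `own` at (r, c) against displayed neighbors."""
    return sum(matrix[(own, shown[nr][nc])]
               for nr, nc in neighbors(r, c, len(shown)))

def displayed(history, liars, delay):
    """Honest cells show the latest strategy; liars show a stale one."""
    latest = history[-1]
    stale = history[max(0, len(history) - 1 - delay)]
    n = len(latest)
    return [[stale[r][c] if liars[r][c] else latest[r][c]
             for c in range(n)] for r in range(n)]

def step(true, shown, matrix):
    """Synchronously update true strategies from displayed information."""
    n = len(true)
    shown_pay = [[payoff(shown[r][c], r, c, shown, matrix)
                  for c in range(n)] for r in range(n)]
    new_true = [row[:] for row in true]
    for r in range(n):
        for c in range(n):
            own_pay = payoff(true[r][c], r, c, shown, matrix)
            neigh = neighbors(r, c, n)
            best = max(shown_pay[nr][nc] for nr, nc in neigh)
            if best > own_pay:
                winners = {shown[nr][nc] for nr, nc in neigh
                           if shown_pay[nr][nc] == best}
                if true[r][c] not in winners:
                    new_true[r][c] = next(iter(winners))
    return new_true

def simulate(initial, liars, matrix, delay, steps):
    """Return the list of true-strategy lattices, one per time step."""
    history = [initial]
    for _ in range(steps):
        shown = displayed(history, liars, delay)
        history.append(step(history[-1], shown, matrix))
    return history
```

With an all-False `liars` mask the displayed and true lattices coincide, so the same loop reproduces ordinary synchronous updating and can serve as the baseline for comparison.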
3. Experimental results. There are several factors to take into consideration when analyzing the results of these experiments. In brief, the variables in these models are the size of the lattice (size of population of players), distribution of players in the first lattice, payoff matrix, the distribution of dishonest players in the lattice, and the time steps to hide a strategy. Changing each of these factors can change the final result of the game.
In these experiments we have shown that after a few iterations, the population of each strategy in the final lattice becomes stable, so the final lattice can be used as a measure for comparisons. Figure 2 shows that, for both synchronous and asynchronous updating and for different values of b, after some time steps the population converges to an equilibrium or quasi-equilibrium state in which the percentage of players with the same strategy does not change or oscillates around the converged value. In this figure, the red line shows the percentage of Hawks for synchronous updating, the blue line shows the percentage of players displaying a Hawk strategy in asynchronous updating, and the green line shows the true percentage of real Hawks in each time step. We can see in the figures that in all cases there is an enduring phase (END), in which the number of Hawks in the population grows, followed by an expanding period (EXP), in which the clusters of Doves start to grow [26,1]. The lattices in Figure 3 show the arrangement of players in the first and final lattice for the graphs in Figure 2. In these lattices, black cells show players with the Hawk strategy and white cells show players with the Dove strategy.
In order to probe the behavior of our methodology in comparison with synchronous updating, the following experiments were conducted. These experiments are performed on a 50 by 50 lattice updated 50 times in which 50% of the initial population of players are Hawks. The following experiments (sections 3.1 to 3.3) have been executed for 5 different payoff matrices (b = 1, 3, 5, 7, 9 and c = −10) and for 20 different random lattices in which 30% of players in their first lattice are dishonest. The time step for hiding the true strategy is 2.
3.1. Comparing percentage of Hawks in the final lattice. This experiment compares the percentage of Hawks in the final lattice for synchronous and asynchronous updating. Table 2 shows the average percentage of Hawks in the final lattice over 20 experiments. Table 2 shows that when b is small, the final number of players with the Hawk strategy using asynchronous updating is slightly lower than when using synchronous updating (no dishonest players). As b increases, the population of Hawks increases (more players choose the Hawk strategy) and using asynchronous updating (adding "dishonest players") gives an advantage to the Hawk strategy.
Thus, this hiding strategy increases the slope of the graph which shows changes in percentage of Hawks with regard to changes in b as shown in Figure 4.
Since the numbers of Hawks using asynchronous updating with the displayed strategy and with the true strategy are very close, only the true strategy is used in plotting the values in this figure. In this graph, the red line shows the percentage of Hawks using synchronous updating, while the green line shows the percentage of true Hawks in asynchronous updating. This is the case when "dishonest" players show their true strategy in the final lattice.
The graph shows that for a smaller b, the average payoff of the players in populations with dishonest players is more than the payoff in populations with no dishonest players, but as b increases the result is reversed (rewarding the more "honest" population). However, in both populations the average payoff is highest when b is equal to C/2, but the changes for asynchronous updating are smoother.
Comparing the average payoffs of asynchronous updating (the blue and green lines) shows that the average true payoff of players (green line) is slightly less than what they think they get (blue line), since they are not aware of their neighbors' true strategy. However, these two payoffs are still very close to each other.
This experiment shows that the average payoff that honest players get when playing in a dishonest society is more than the average payoff of liars, and so displaying a dishonest position does not lead these players to a higher payoff.
3.4. Effect of changing time steps for updating. This analysis starts with one random lattice and three payoff matrices with b = 3, 5, and 7. In particular, this experiment considers the behavior of asynchronous updating when the number of time steps for hiding is changed from 1 to 20 with a fixed percentage of liars (30%). To compare the behavior, the percentage of Hawks in the final lattice is calculated. In the following graph (Figure 7) the blue line shows the percentage of Hawks when dishonest players are hiding their true strategy and the green line shows the percentage of Hawks when dishonest players show their true strategy in the final lattice.
This figure demonstrates that an increase in time delay can increase the percentage of Hawks in the final lattice when b is small (and the Hawks are at a disadvantage). As b increases, the time step does not play an important role in the destiny of the occupants of the lattice.
3.5. Effect of changing percentage of liars in the population. This experiment uses one random lattice and three payoff matrices (b = 3, 5, 7) and checks the behavior of asynchronous updating when the percentage of liars changes from 5% to 95% with a time step of 2 and a fixed distribution of liars. To compare the behavior, the percentage of Hawks in the final lattice is illustrated in Figure 8. Figure 8 shows that increasing the percentage of liars while it is below 50% can decrease the percentage of Hawks in the final lattice for small b and increase the percentage of Hawks for large values of b. However, when the percentage of liars increases beyond 50%, it has the reverse effect on the population of Hawks in the final lattice. Although the percentage of Hawks grows considerably for some combinations of b and percentage of liars, generally this value is still less than the percentage of Hawks in synchronous updating for small b and more for large b, as shown in Figure 4.
On the other hand, when the number of liars is not large, the honest players' real payoff is less than what they think they get, and as the number of liars increases the honest players' true payoff grows larger than the displayed one, which can be viewed as the result of living in a dishonest society. It is also worth mentioning that when the percentage of liars is large for a small b, the displayed payoff of dishonest players is more than that of the honest players, but the true payoff of liars is always less than the true payoff of honest players.
3.6. Further analysis of the effect of b. In the limit of weak selection and for a large population size, selection in the Hawk and Dove game favors the fixation of Hawk if b > C/3 (proof in the Appendix). When C is equal to 10, if b grows larger than 10/3 there is a higher probability for the Hawk strategy to become dominant, and if b is less than 10/3 the Dove strategy has a higher chance to become dominant. Although this ratio holds for a general population of players, Figure 4 shows that even in a spatial structure, as b grows, the number of Hawks in the final lattice increases. Figure 4 also shows that asynchronous updating is more sensitive to the fixation ratio. Therefore when b = 10/3, the asynchronous method reacts more significantly to the change of payoff matrix, and the relation between the synchronous and asynchronous methods with regard to the final percentage of Hawks changes. Thus, when b is less than 10/3, the percentage of Hawks in asynchronous updating is less than in synchronous updating, and when b is more than 10/3, the percentage of Hawks in asynchronous updating is more than in synchronous updating. Figure 9 shows the percentage of Hawks in the final lattice for 20 different randomly generated initial lattices for b equal to 3.2, 3.3, 3.4 and 3.5. In all figures, the red line shows the percentage of Hawks for synchronous updating, the blue line shows the percentage of players displaying a Hawk strategy in asynchronous updating and the green line shows the true percentage of real Hawks in the final lattice. In all experiments, when b < 10/3, the percentage of Hawks in the final lattice for asynchronous updating is less than the percentage of Hawks for synchronous updating, and when b > 10/3 the percentage of Hawks for asynchronous updating is more than the percentage of Hawks for synchronous updating. This is true for all 20 experiments.
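The Appendix proof is not reproduced here. As a sketch of where the C/3 threshold can come from, assume the textbook Hawk and Dove payoffs (A = Hawk, B = Dove) and the standard large-population, weak-selection condition under which fixation of A is favored, $M_{11} + 2M_{12} > M_{21} + 2M_{22}$. Both the payoff entries and the use of this particular condition are assumptions of the example, not statements taken from the paper.

```latex
% Assumed payoffs: M_{11} = (b - C)/2, M_{12} = b, M_{21} = 0, M_{22} = b/2
\begin{align*}
M_{11} + 2M_{12} &> M_{21} + 2M_{22} \\
\frac{b - C}{2} + 2b &> 0 + b \\
\frac{3b}{2} &> \frac{C}{2} \\
b &> \frac{C}{3}
\end{align*}
```

For C = 10 this gives b > 10/3, which agrees with the threshold used in this section.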

4. Conclusion. This study presents a new methodology for updating spatial games when some players hide their updated strategies from their neighbors for some time steps. Using the Hawk and Dove game, a series of numerical simulations shows that, in comparison with the common synchronous game, this method of updating can result in a higher number of Hawks and thus a lower average payoff in the quasi-equilibrium state of the game when the value of the contested resource (b) is large. This is reversed when b is small, resulting in a lower number of Hawks and a higher average payoff.
Moreover, the results indicate that the true average payoff of honest players is more than the average payoff of dishonest players, which shows that the disingenuous delay of those players does not result in a higher payoff for them, although it may seem that a hiding strategy would lead to a higher payoff for these players. This unintuitive result can be the effect of the false information that a player gets from its neighbors due to its own dishonest behavior. The sensitivity analysis on the number of time steps for which a dishonest player hides its true strategy shows that an increase in the number of time steps can increase the percentage of Hawks in the final lattice when b is small, but as b increases this effect diminishes. The sensitivity analysis on the percentage of dishonest players in the society also shows that when b is small, an increase in the percentage of liars decreases the percentage of Hawks while the percentage of liars is not large, and as the percentage of liars increases beyond 50% the percentage of Hawks starts increasing again. However, this trend is reversed for a large b.