EVOLVING MULTIPLAYER NETWORKS : MODELLING THE EVOLUTION OF COOPERATION IN A MOBILE POPULATION

We consider a finite population of individuals that can move through a structured environment using our previously developed flexible evolutionary framework. In the current paper the behaviour of the individuals follows a Markov movement model where decisions about whether they should stay or leave depends upon the group of individuals they are with at present. The interaction between individuals is modelled using a public goods game. We demonstrate that cooperation can evolve when there is a cost associated with movement. Combining the movement cost with a larger population size has a positive effect on the evolution of cooperation. Moreover, increasing the exploration time, which is the amount of time an individual is allowed to explore its environment, also has a positive effect. Unusually, we find that the evolutionary dynamics used does not have a significant effect on these results.

1. Introduction.Evolutionary game theory has proved to be an effective method of modelling the evolution of populations.The original models focused on wellmixed infinite populations [36,35], with games such as the Hawk-Dove game [34] and the sex ratio game [26] being used.With further development, these models can be considered within well-mixed finite populations [39, ( [37,38] provided important results for finite populations without game theoretical methods).
The seminal work of [31] (see also [5,10,53,32], and [3,49] for reviews) in which evolutionary graph theory was developed, allowed the modelling of structured finite populations within a given framework.It also provided important results in the fixed fitness case [31,33,45].However, this approach is limited by the fact that it is suited to modelling pairwise interactions, whereas in real populations, there are interactions between multiple individuals [50,19], and there are many examples of multiplayer games used in the literature [44,8,14,24].In [41] it was shown that evolutionary graph theory can be used in conjunction with a different 'interaction' graph to model more complex behaviours but there is no obvious link between the two graphs, that is, one graph has not been derived from the other nor is there some clear connection, for instance both arising from some common population-derived factors.
We should also mention that structured populations have been considered in an evolutionary context in different ways, see for example [51].This includes island models, where populations evolve in isolated communities with a low rate of migration between them, as in [18].Community-structured populations were considered in [54], where interactions occur at multiple levels, with members of the same community interacting more commonly than those in different communities.
A more general framework that can be used is that of [11] where it is possible to consider multiplayer interactions in groups of any size, depending upon various factors like the population's history, whilst keeping the beneficial aspects of evolutionary graph theory.More recently this framework has been used to model different kinds of multiplayer behaviour [13,9,12].In this paper, we extend this work to consider a population of mobile individuals, focusing on a specific multiplayer game, a public goods game [6,7,25,55].
When using the evolutionary graph theory approach [30,52,29,46,58], individuals group with their neighbours within a fixed population structure.One potential problem with this is that individuals could spend more time with some of their neighbours, less with others and some time alone.The framework of [11] solves this problem as shown in [13,9] using a simple model where individuals are confined to their neighbourhood but are still allowed to form groups of different sizes.The framework, though, is capable of handling much more complex movement behaviour [1,2,21] where individuals make a choice of where to move given the information they have at hand.In this paper we apply the framework for the first time to one such model where the movement of individuals follows the Markov property.
The paper is structured as follows: in Section 2 the model framework is described in general, with examples of each concept being given to motivate how it can be applied, in Section 3 the framework is applied to create a Markov movement model, in Section 4 we describe the results of the Markov movement model, and Section 5 is a general discussion.
2. The framework of [11].This section presents the framework of [11] for modelling the evolution of a population in a which the movement of individuals follows a discrete-time stochastic process.In particular we update the terminology from the original paper somewhat, and the methodology described here will be applicable to a wide variety of scenarios, although we focus on a Markov movement model (and indeed a specific one only) in the current paper.The framework can be broken down into three components that each describe a certain aspect of the population: structure, fitness, and evolutionary dynamics.
2.1.The population: Structure and distribution.The population structure describes the restrictions upon how members of the population can interact with each other, including through the different places each individual can and cannot visit.This paper focuses on a Markov movement model, and in the type of examples that we consider all places are visitable by all individuals.The structure here will reduce to simply considering the distribution of the population individuals at any given time, and so we shall find it convenient to talk about distribution in place of structure.In a population of N individuals who can move around M places, the population distribution at time t is given in [11] by an N × M binary matrix Cooperator (defector) with staying propensity α. γ (δ) ∈ [0, 1] Nash equilibrium staying propensity of cooperator (defector).
Table 1.Notation used in the paper. denoted n,m ) and defined To consider the Markov movement models that are the subject of the current paper, it is convenient to use an alternative matrix representation of the population distribution.Here the population distribution at time t will be denoted by the matrix The framework assumes that the movement of individuals is probabilistic such that there is dependence upon time and the current and past movements of individuals in the population.In particular, the transition probability function denoted p t (m|m <t ) gives the probability that the movement of individuals at time t results in a population distribution m given the population distribution history m <t = (m t−1 , . . ., m 1 , m 0 ).The transition probability function is defined as follows whose exact form will depend upon the model being used but will always satisfy The population distribution probability function (PDPF) π t (m) gives the probability that the population distribution is m after t time steps regardless of the population distribution history.It can be expressed using the transition probabilities as where P (m <t ) denotes the historical PDPF that gives the probability that the population distribution history is m <t and is written as where the probability of the initial population distribution, π 0 (m 0 ), is assumed to be known.

2.1.
1.An individual movement model.In this model it is assumed that individuals move independently of each other.The PDPF can then be defined as follows where π n,t (m n ) denotes the individual distribution probability function (IDPF) that gives the probability of individual I n being present in place P mn at time t independently of the history of the process.The expression for π n,t (m n ) will depend upon whether the movement of I n is dependent upon the whole population distribution history or just its own individual history.
Dependence on the population distribution history.When the movement of individual I n depends upon the distribution history of the whole population, the individual transition probability function p n,t (m n |m <t ) gives the probability that I n moves to place m n at time t given the population history m <t and is given as follows The individual transition probability function is then defined as follows Dependence on the individual distribution history.When the movement of individual I n depends only upon its own distribution history m n,<t = (m n,t−1 , . . ., m n,0 ), independent from the history of the other individuals, then the individual transition probability function is given as follows The IDPF is then given by where P n (m n,<t ) denotes the individual history distribution as follows 2.1.2.The fully independent movement model.In this model individuals move independently of each other, history and time.In this case, the individual transition function is denoted p n (m) and we have that and therefore the PDPF can simply be written 2.2.Fitness.In the framework the contribution to an individual's fitness depends upon the time t, the current population distribution m and historical population distributions m <t .The fitness contribution of I n is denoted where the exact form will depend upon the assumptions about the factors that contribute to an individual's fitness.The mean fitness contribution at time t is then as follows fn,t = m m<t f n,t (m|m <t )p t (m|m <t )P (m <t ).(15) We assume that the fitness of an individual at time t is given by averaging the mean fitness contribution across all time periods up to and including t.The fitness function is then defined as follows Note that there are other definitions of the fitness function that one can use instead of the one given here, for example, one could use a weighted average of the mean fitness contribution instead.When there is fully independent movement, the mean fitness change simplifies to fn,t = m m<t f n,t (m|m <t )p(m)P (m <t ).(17) In [9] it is assumed that the fitness contribution of individual I n only depends upon those individuals that it can directly interact with.The direct group (or simply the group) of individual I n , denoted G n (m), is the set of individuals that are present with it in the same place for population distribution m and is defined as follows Dynamics Dynamics defined using the replacement weights and fitnesses as in [45].In each case, B (D) is appended to the name of the dynamics if selection happens in the birth (death) event.For BDB and BDD dynamics r ij = b i d ij , for DBD and DBB dynamics We then denote the fitness contribution as f n (G n (m)).In this case, the mean fitness change is constant over time and therefore the fitness is equal to the mean fitness contribution, that is 2.3.Evolutionary dynamics.In the framework it is assumed that there is one birth and death per replacement event.A replacement event at time t is governed by an evolutionary graph defined using an N × N weighted adjacency matrix W t = [w i,j,t ] i,j=1,...,N where the replacement weight w i,j,t gives the weight of the edge from node i to node j in the evolutionary graph that represent individuals I i and I j respectively.The contribution to a replacement weight depends upon the time t, the current population distribution m and the historical population distributions m <t .The replacement weight contribution that individual I i assigns individual I j is denoted by The exact form will depend upon the assumptions made about the replacement weight contributions.The mean replacement weight contribution is given as follows ūi,j,t = m m<t u i,j,t (m|m <t )p t (m|m <t )P (m <t ).
In this paper, we choose the replacement weight at time t as the mean replacement weight contribution at time t as in [21] that is but, as for the fitness function, there are other definitions that one can use.The probability that the offspring of individual I i replaces individual I j , denoted r ij , is defined using the replacement weights and fitnesses as in [45].The different definitions of the replacement probabilities are summarised in Table 2.
For the fully independent movement model, the mean replacement weight contribution is defined as follows ūi,j,t = m m<t u i,j,t (m, m <t )p(m)P (m <t ).(23) In [9], the replacement weight contribution is independent of time and history, and depends only upon direct groups.This implies that the mean replacement weight is invariant over time and is as follows 2.4.The evolutionary Markov chain.The evolution of the population can now be described in terms of a Markov chain.We will assume that there are only two types of individuals in the population, which we label A and B. Furthermore, each type is made up of two different characteristics, and we will say more about this in the following sections.A state of the population gives its composition in terms of type A and B individuals.We use S to denote a state of the population such that n ∈ S if I n is of type A. There are a total of 2 N different states where N (∅) is the state consisting of all type A (B) individuals.The state transition probabilities are described using the dynamics as follows Given that the state of the population is given by S, type A (B) is said to fixate from that state when all type B (A) individuals have been replaced and we reach state N (∅).Once a certain type has fixated no more changes can take place and the population remains in this state.The probability of type A individuals given by S fixating in a population where the type B individuals are given by N \ S is denoted ρ A S (and we denote the equivalent fixation probability for type B individuals by ρ B S ).This probability is found by solving the following equation with boundary conditions For type B individuals we can use the fact that ρ B S = 1 − ρ A S .We shall consider a population where a population is all of a single type, but where a single population member is selected uniformly at random to be replaced by one of the opposite type.We are thus interested in calculating the fixation probability where state S consists of only one individual (all but one individual).There are N initial states from which the fixation probability can be calculated, and we take an arithmetic mean of these fixation probabilities, which we denote as ρ A (ρ B ).Alternatively, one could weight the fixation probability of a mutant using the likelihood of that mutant appearing [4].Sometimes this is an important distinction, but in the examples considered in the current paper the differences are small, and so we have stuck with the traditional, simpler, version.
3. The Markov movement model.In the previous models [9] considered in this framework, the movement of individuals is limited to their neighbourhood and exogenously controlled by the home fidelity parameter that measures how likely the individual is to remain in their home.A natural extension to this is to allow individual distributions to vary with time.A logical first step is to consider a Markov model, which is based on the assumption that history dependence is Markov, that is, the current population distribution is only dependent upon the previous population distribution.The concept of a Markov movement model within the framework was introduced in [11], but was only discussed in general terms.In this paper we fully develop it and apply it to example populations for the first time.The definitions we have given before would then change as follows; for the PDPF we have where π n,t = [π n,t (m)] m=1,...,M .Furthermore, if we assume that there is time homogeneity, that is p n,t = p n for all t, then this simplifies to In this case, assuming that p n is irreducible and aperiodic for all n, then as t → ∞ the IDPF π n,∞ is stationary for all n.Essentially, our model is then equivalent to the fully independent movement model.We do not consider this case further here, but rather refer the reader to [9] for a detailed discussion of this kind of model.

3.2.
Individual movement with dependence on population history.In this model individuals move to a new position independently of each other but dependent upon the current distribution of the whole population.The IDPF is then as follows In this paper we construct a model of this type that is made up of the following four components: population structure, movement strategy, game and evolutionary dynamics.
3.2.1.The population structure.The population is assumed to be of size N where each individuals has a home that they can return to.The structure is described by a graph such that each node represents a place.We consider the complete graph structure where all places are connected to each other.We assume that every place is home to precisely one individual.
3.2.2.Individual movement.We assume that the individual transition probabilities are time homogeneous but dependent upon the previous group and previous position of the individuals, that is where h n (G n (m t−1 )) denotes the staying probability of individual I n and N − 1 is the number of neighbouring places that an individual can move to in a complete graph.
The staying probability h n (G n (m t−1 )) will depend upon the staying propensity α n of individual I n and the attractiveness of remaining in group G n (m t−1 ).The staying propensity α n measures the likelihood that individual I n will stay where it is, in particular, The staying propensity is assumed to be one of the characteristics that makes up the type of an individual.However, when present in a group (|G n (m t−1 )| > 1), individual I n would take into account the benefit of remaining in that group.The benefit β i of group member I i to others depends upon its interactive strategy, the second characteristic that makes up the type of an individual.We will assume that there are two interactive strategies, cooperate (C) and defect (D).The benefit function, β i is then defined as follows where β C and β D are the benefits of being with a cooperator and defector, respectively.The benefit of group G n (m t−1 ) to individual I n is then defined as follows Finally, combining the effects of the staying propensity and the group benefit, in the rest of the paper the staying probability is expressed as the following sigmoid function where 0 < S < 1 is the sensitivity shown to group members.So, for example, S → 0 implies that I n shows great sensitivity and would move away immediately if remaining in group G n (m t−1 ) is unattractive, which is the case when β Gn(mt−1)\{n} < 0. An alternative way of representing the S → 0 limit involves the staying probability being defined using the following step function For example, if we set α n = 0 ∀n, β C = 0 and β D < 0 then the attractiveness of a group is completely determined by the presence or absence of defectors.An individual would therefore leave with probability 1 if a defector is present in the group.This was referred to as the 'walk away' strategy in [1].
In our model we select an exploration time T , which is the number of steps an individual takes moving around the region before returning to its home place.Thus the larger T , the more time cooperators have to find other cooperators, but also the more time there is for them to be found by defectors.

3.2.3.
Fitness.We assume that the change in fitness of an individual depends upon direct group interactions and whether a movement has been made.
For these group interactions we will consider a public goods game in which the payoffs are determined by the interactive strategies, cooperate and defect, that we introduced earlier.Each individual receives a base reward of 1 regardless of their strategy.A cooperator always pays a cost 0 ≤ c < 1 so that every individual that it can directly interact with (excluding itself) receives an equal share of a reward v > 0. The cost cannot exceed 1 in order to prevent the fitness contribution from going negative (this is done for convenience of calculation; it is important that total fitness is not negative, and we could deal with large costs if necessary by truncating the resulting total fitness at 0).A defector does not pay a cost but receives a share of the reward from cooperators present in the group.Note that the base reward has been normalised to 1 and the reward v and cost c are multiples of the base reward.The direct group interaction payoff functions are then defined as follows where |G| C is the number of cooperators in group G.Note the cooperators still pay a cost when they are alone.An individual will pay a cost of λ for every movement that it makes.The movement cost is chosen so that it does not exceed the direct group interaction payoff an individual receives (for the same reasons as for the cooperative cost c, and large movement costs could be similarly accommodated if necessary), that is 0 ≤ λ < min(R n,t (G n (m t ))).The fitness contribution is then given by It is clear that these fitness contributions vary with time, as the first move from the home place follows the distribution for a lone individual, and then movement depends upon the groups formed.For instance in a population entirely composed of cooperators, individuals would almost cease to move when they had found another cooperator, so the level of movement would decrease (and the fitness contributions would increase) with time, until the exploration time T is reached.
3.2.4.Evolutionary dynamics.We assume that the replacement weight contribution will only depend upon the direct group.As in [9], the replacement weight contribution will depend upon the amount of time spent with each individual.In particular, it is assumed that an individual spends an equal amount of time with each individual in the group excluding itself.However, if the individual is alone, then it effectively allocates all the time to itself.The replacement weight contribution function is then defined as follows We note that combining equations ( 24) and ( 42), we have that w i,j = w j,i and w i,i = 1 − j =i w i,j , which implies that our selected weights have the isothermal property (see [31]).
3.2.5.Simulating the evolutionary Markov chain.The approach used in this paper to calculate the fixation probability is a semi-analytic one where the fitnesses of individuals are found by simulation, and these results are then used to evolve the population using the evolutionary Markov chain, which results in a more accurate solution than simulating the whole process (the movement process is too complex to allow for a fully analytic solution).
Individuals start on their home place and then undergo an exploration phase of T time steps as described in Section 3.2.2.To calculate the fitness, the individuals move T times such that their fitness contribution is calculated for each of these movements; the total of these T fitness contributions gives their fitness for one simulation.The position of the individuals is then reset, that is, they return to their home place before the next simulation is run.Their average fitness for 10,000 simulations is used in the evolutionary Markov chain.
To calculate the replacement weights, individuals start on their home place and move only one time to determine their replacement weight.This represents individuals returning to their home place to reproduce, with individuals being replaced according to the corresponding local connections.This counts as one simulation and, before the next simulation is run, we reset the position of the individuals so they all start in their home place.The replacement weights are calculated exactly because they are comprised of only one movement.This involves calculating the probability that an individual is alone, which gives the self-replacement weight.Since we have a complete graph and only two different types of individuals, there are only four distinct weights for the replacement of other individuals.
The fitnesses and the replacement weights are all that is required to construct the transition probabilities of the evolutionary Markov chain.The transition probabilities are substituted into the formula of [28] to give the fixation probability of i type A mutants in a population of N − i type B residents as follows where P − k (P + k ) is the backward (forward) transition probability for a state with k type A individuals.Note that the weights w ij and the fitnesses from Section 3.2.3depend upon the composition of the population, so at successive steps of the evolutionary Markov chain the transition probabilities will in general be different.
We also note that this formula can easily be modified to find the fixation probability of type B individuals.What exactly makes a type A or B individual would depend upon its interactive strategy and staying propensity.For example, we could have that A = C 0.1 and B = D 0.5 , which means that type A is a cooperator with a staying propensity of 0.1 and type B is a defector with staying propensity 0.5, or we could have A = C 0.1 and B = C 0.2 so both types have the same behavioural strategy but different staying propensities.However, the important thing to note is that, at any one time, there are only two unique types A and B in the population.
The advantage of such an approach is that we can relatively quickly calculate the fixation probability starting from any state.The saving comes from the fact that we do not simulate the entire process, which would take much longer because the number of steps to reach fixation could be high.However, this approach necessarily requires that we have a population in which individuals can differ only in terms of their type.To ensure that this is the case, we consider a complete structure with N places such that each individual has their own home place.We note that the advantage of efficient algorithmic processes over simulations was demonstrated in [48], but also that it was shown in [27] that for frequency-dependent selection this approach will not work for arbitrary spatial populations.4. Results.In this section the effect of the model parameters on the fixation probability are investigated.In particular, we investigate how the model parameters affect assortment, which is the mechanism that allows cooperation to evolve as shown in [22].There is positive assortment between cooperators if they are more likely to interact with other cooperators than defectors.In our model, this occurs due to an increase (decrease) in the time it takes for defectors (cooperators) to find cooperators.According to [20] the time to find cooperators should depend upon the density of the population and an individual's movement speed.In their model, N individuals pair up with one another to form a coalition such that the probability of a pair forming is exponentially distributed with rate µ, which is a function of N and the population density.The time to find cooperators in their model is essentially determined by the rate µ.We have one-to-one correspondence between individuals and places and therefore the density remains constant; on the other hand, since we consider a complete graph, the movement speed is high as individuals can directly get from one place to another.Therefore, the time it takes to find cooperators is mostly determined by the staying propensity of the individuals, however, this relationship is not so straightforward as it is not globally controlled and the individuals may have different staying propensities (which are subject to the evolutionary process).This means that some individuals may find cooperators faster than others.The parameters used in the simulations are summarised in Table 3.
Apart from an individual's interactive strategy and staying propensity, all other parameters are considered to be fixed.Each individual inherits these two characteristics from its parent, and different interactive strategies or staying propensities are introduced into the population through mutations.Staying propensities can take any value 0.01m for m = 1, . . ., 99; this means that no individual moves all the time or never, and so makes some adjustment to their behaviour depending upon the group they are in.In particular we have max(α) = 0.99; some movement is a necessary requirement otherwise the replacement weights would be zero and there  3. Parameters used for the simulations.The other parameters are fixed such that we have a complete structure with each individual having its own home, β C = 1, β D = −1, S = 0.03 and the dynamics used are BDB.would be no evolution within the population.In a real world setting, a minimum movement requirement can be explained by, for example, foraging behaviour where an individual searches its environment to find food and therefore needs to move in order to survive.
The mutations of these characteristics are sufficiently infrequent that the population is assumed to consist of a maximum of two types; resident and mutant, whose competition will result in fixation of one of the types before a new mutant appears.We consider two different scenarios to account for the different mutation rates of each characteristic.

4.1.
Scenario A: Interactive strategy mutations are rare.As previously stated, it is assumed that fixation happens much faster than new mutations arise.A mutation can result in a change of the interactive strategy and/ or the staying propensity.In this scenario, the mutation rate of an individual's interactive strategy is much slower than the rate of mutations that involve their staying propensity.Since it is much more likely that the staying propensity mutates than the interactive strategy does, once one of the interactive strategies (cooperate or defect) is removed from the population, it will be a long time before a new mutant involving this strategy appears.During this time, there will be a sequence of contests among individuals with the same interactive strategy but different staying propensities and the population will eventually evolve to the point where all individuals have the same interactive strategy and are using a (strict) Nash equilibrium staying propensity (a strict Nash equilibrium propensity is one where the fixation probability is maximised and changing the staying propensity is disadvantageous).Eventually, a mutant with a different interactive strategy and staying propensity will appear, and the quantity of interest at this point is the fixation probability of this mutant type.We assume that the staying propensity of the mutant can be different from the Nash equilibrium staying propensity of the resident population it is invading.The resident population will therefore be stable if it can resist invasion from a mutant using any staying propensity.Rather than considering any arbitrary mutant, the focus will be on the mutant most likely to invade, i.e. one maximising its fixation probability.
Cooperator residents are of the type C γ R where their Nash equilibrium staying propensity γ R is the staying propensity where a = b in the set In this set we identify all the points (a, b) where a is the best response staying propensity of 1 individual of type C a when playing against N − 1 individual of type C b , who are using some arbitrary staying propensity b.Therefore, at the point where a = b, C a is a best response to itself, i.e. a Nash equilibrium.
A defector mutant is of the type D δ M where the staying propensity δ M satisfies Defector residents are of the type D 0.99 (i.e. in the equivalent terminology to the above δ R = 0.99) where their Nash equilibrium staying propensity is max(α) = 0.99 whenever the movement cost is greater than 0 because the only way for them to maximize their fixation probability is by moving as little as possible.
A cooperator mutant is of the type C γ M where the staying propensity γ M satisfies ρ Cγ M ,D0.99 1 = max ρ Cc,D0.99 The Nash equilibrium staying propensity of the resident cooperators γ R is calculated as follows.We consider N − 1 residents of the type C b and calculate the fixation probability of 1 individual of the type C a for all values of a in the range [max(0.01,b − 0.09), min(b + 0.09, 0.99)], and the a that gives the highest fixation probability is picked.Note that using a wider range of values for a gives the same result so this range is used for efficiency.The N − 1 residents then use the staying propensity a that was picked and this process is repeated several times.After around 20 repetitions, the staying propensity that gives the maximum fixation probability remains the same, that is, we can see that it is a (strict) Nash equilibrium because it is a best response to itself and any other strategy will be disadvantageous.We therefore set γ R to the value of a we get after 20 repetitions.
We hypothesize that there is only one solution to the Nash equilibrium staying propensity.As seen in Figure 1, the best response staying propensity of one type C a against N − 1 type C b is relatively flat (the jagged line of the figure being an approximation to a smooth "real" value, caused by the stochasticity of the simulations).Intuitively the real solution should be smooth; a small change in the movement cost would have a small change on the payoff to a focal individual.It is possible that at some point this would lead to a sudden jump of the best response strategy as the payoffs from two different values pass.We would expect to see either a single smooth continuous function for the best response, or a piecewise continuous collection of distinct parts, and it is the former that we have here.This flatness means that the best response staying propensity is predominantly determined by the movement cost λ regardless of what the other players are doing.Therefore, there is only one intersection point with the line a = b as shown in Figure 1, which gives the Nash equilibrium staying propensity γ R of resident cooperators.A nonunique solution would occur if there were multiple crossings (or indeed no crossings, which would need a discontinuity in Figure 1, as described above).We should note that we have no proof of the uniqueness of the Nash equilibrium staying propensity, although in all cases considered, the solution to the process described in the previous paragraph is independent of the starting position.4.1.1.The effect of the movement cost.In Figure 2 the effect of the movement cost is shown.In particular, it increases the time it takes to find cooperators by increasing the staying propensity, that is, γ R , γ M , δ M are positively correlated with movement cost; the (partial) exception is resident defectors, which we know have a staying propensity of max(α) = 0.99 regardless of the movement cost.This plot shows the best response staying propensities for 1 type C i individual playing against N − 1 type C j individuals.Parameter set 1 is used with λ = 0.2 and i, j ∈ {0.01, 0.02, . . ., 0.99}.The intersection point of the plots gives the unique strategy which is a best response to itself, i.e. the unique cooperator resident Nash equilibrium staying propensity γ R , which is somewhere between 0.3 and 0.4.This value is similar to the one obtained using the iterative method (see Figure 2).The values from the current figure are approximate only because of the jagged nature of the lines; these occur because of the very large number of simulations that would be necessary to obtain a smooth version (the figure uses 10000 simulations for each combination).The figure is used to illustrate the uniqueness of the solution only.
For very low movement cost, both mutant types have a significantly lower staying propensity than the resident population that they are invading.They can therefore invade the resident population because they take less time to find cooperators.
For higher, but still low, movement costs, whilst mutant cooperators can still invade, mutant defectors cannot.Here the resident cooperators are better at preventing invasion even when δ M < γ R for some values of the movement cost.This is because the movement cost impacts the invading mutant defector more adversely than the resident cooperators, who on average leave and regroup less often than a defector who will be repeatedly deserted by its cooperator groupmates.
For intermediate movement costs, neither mutant type can invade.At this point, since δ M > γ R , a mutant defector is slower at finding cooperators than the resident cooperators and therefore cannot take advantage of them.For a mutant cooperator, γ M becomes much larger thereby diminishing their advantage over the resident defectors, in particular, not only are they paying a higher movement cost but it takes longer to find the other cooperators, which in turn reduces the amount of time that they can spend with them.
For high movement costs, defecting mutants can invade, but cooperator mutants cannot.At this point all types have a large staying propensity and therefore do not interact much with one another.However, a mutant defector is helped by the fact that the resident cooperators always pay a cooperating cost that they now find difficult to recoup because they are moving very little and also paying a very large movement cost whenever they do so.4.1.2.The effect of the exploration time.The exploration time T plays an important role in the evolution of cooperation.Changing the exploration time has a minimal effect on the time it takes to find cooperators because it will not alter the speed of movement of the individuals.This is because we are using a complete graph and individuals can directly get from one place to any other.However, increasing the exploration time has a positive effect on the coalition time, that is, the amount of time that cooperators spend cooperating with one another.[20] showed that increasing the coalition time helps with the evolution of cooperation.In our model, one explanation for this is that the fitness of the individuals, which is the average reward over the exploration time, will naturally have a higher value the larger the coalition time.
In Figure 3 reducing the exploration time T from 10 to 5 steps decreases the coalition time which adversely affects the cooperators.One of the key differences is that the resident cooperators now find it much more difficult to prevent invasion from a mutant defector.The shape of the plot for a mutant cooperator is largely the same but with a consistently lower fixation probability.In Figure 4 increasing the exploration time T from 10 to 25 steps benefits the cooperators.Not only does it help the resident cooperators prevent invasion from a mutant defector but it also increases the success of an invading mutant cooperator.This again has to do with the increased coalition time that allows the cooperators to increase their fitness.4.1.3.The effect of population size.Fixation probability is reduced in general when the size of the population increases, as we see when comparing Figures 2 and 5, with population sizes of 10 and 20 respectively.The key value to compare fixation probabilities against is the neutral fixation probability of 1/N , the horizontal line in each of these figures, however.We see that the fixation probability is slightly higher for cooperators when compared to this line for the larger population of Figure 5 (although it is also more sensitive to the movement cost) than for the smaller population.The key difference is that a mutant defector has fixation probability consistently under the neutral line in Figure 5 and so cannot invade even for very low movement cost in the larger population.Thus larger populations help a little in establishing cooperation, but help a lot in making it stable against defection.
Increasing the population size has a positive impact on the evolution of cooperation because it increases the time it takes to find cooperators.Note that we are assuming that there is a one-to-one correspondence between individuals and places and therefore increasing the number of individuals also increases the number of places.Even though the density remains the same, there would be more places for the individuals to search in order to find cooperators thereby increasing the overall time it takes to find cooperators.In particular, an individual that is currently not in a cooperating group will have to search N − 1 places to find one, therefore, the probability of a defector finding a cooperating group decreases as N gets larger.This means that cooperators would resist invasion by defectors better, as we have noted above.
4.1.4.The effect of reward and cost.The reward to cost ratio v/c is important because, even if other external factors favour cooperation, cooperation will not evolve if the reward to cost ratio is too low.This is seen in Figure 6 where the cost is set to 0.04 with the reward written as a multiple of the cost.When v/c is low, a mutant cooperator cannot invade but a mutant defector can.This is simply because the value of v/c is too low to promote cooperation.Increasing v/c makes cooperation more viable and, in particular, it allows a mutant cooperator to reduce the time it takes to find cooperators by reducing its staying propensity.It becomes more difficult for a mutant defector to invade because, on average, resident cooperators move less than the mutant defector as they are more in number and the larger v/c helps them quickly recoup any movement cost they incur whilst evading the mutant defector.This is the case even when δ < γ R , that is, a mutant defector takes less time to find cooperators.For comparison with a different value of v/c, in Figure 7 the cost is set to 0.09.However, there is no fundamental change in what happens and we have a very similar figure to that for c = 0.04.

Scenario B:
Interactive strategy mutation is not rare.In this scenario, the mutation rate of an individual's interactive strategy is not much slower than that of their staying propensity.Since the staying propensity would take a number of mutations to reach the right level for any scenario, any successful strategy will have to repeatedly face individuals of both types.The (strict) Nash equilibrium staying propensity will then be determined in a mixed population, i.e. there are individuals of both types.For simplicity we choose only one mixed state to determine the Nash equilibrium staying propensity which is the one where there are N/2 individuals of each type.The Nash equilibrium staying propensity for each type is therefore the one in which the fixation probability from the mixed state of each type is maximised.
Resident and mutant defectors are of the same type D δ .Similarly, resident and mutant cooperators are of the same type C γ .The Nash equilibrium staying : c ∈ (0, 1) and a ∈ (0, 1) .
In the first set we are finding the Nash equilibrium staying propensity a of N/2 type C a playing against N/2 type D b , where b is some arbitrary staying propensity.In the second set we are finding the Nash equilibrium staying propensity b of N/2 type D b playing against N/2 type C a , where a is some arbitrary staying propensity.The point at which these two sets intersect is (γ, δ), that is, both types will be using their Nash equilibrium staying propensities.
To calculate γ and δ we use a similar iterative procedure from scenario A. To initialise the iterative procedure we arbitrarily choose some staying propensities a 0 and b 0 , and the iterative step is as follows.We calculate the fixation probability of N/2 type C a individuals against N/2 type D b0 for all values of a in the range [max(0.01,a 0 − 0.09), min(a 0 + 0.09, 0.99)].The staying propensity a that gives the maximum fixation probability is picked, which is labelled a 1 .We then calculate the fixation probability of N/2 type D b individuals against N/2 type C a1 for all values of b in the range [max(0.01,b 0 − 0.09), min(b 0 + 0.09, 0.99)].The staying propensity b that gives the maximum fixation probability is picked, which is labelled b 1 .Note that using a wider ranges for a and b gives the same result so these ranges were used for efficiency.After around 20 repetitions of the iterative step, the staying propensities a and b that give the maximum fixation probability remain the same, which means that we are at a (strict) Nash equilibrium because any other values would be disadvantageous.We therefore set γ = a 20 and δ = b 20 .
We hypothesize that γ and δ are unique.For cooperators, their Nash equilibrium staying propensity is relatively stable because it is predominantly determined by the movement cost regardless of what the defectors are doing.As seen in Figure 8, the plot for this is a roughly vertical line.For defectors, their Nash equilibrium staying propensity is negatively correlated with the staying propensity of the cooperators given that the movement cost is not too large, otherwise it would be max(α).In Figure 8, the plot for this slopes downwards as the staying propensity of the cooperators increases.There is therefore only one intersection point of the two curves that gives γ and δ.

4.2.1.
The effect of movement cost.As in scenario A, the movement cost increases the staying propensity of the individuals and, therefore, increases the time it takes to find cooperators.As seen in Figure 9, what happens in this case is quite different to the situation in scenario A. Here, the mutant cooperator does not benefit from the fact that the resident defectors have a very high staying propensity as in scenario A. In this case, δ changes with the movement cost in a similar way that γ changes.Therefore, the key difference here is that a mutant cooperator cannot invade for very low movement cost because the resident defectors have a very low staying propensity, which means that they take much less time to find cooperators.4.2.2.The effect of exploration time.As in scenario A, the cooperators do worse when the exploration time is lower; this is shown in Figure 10 where T is decreased from 10 to 5, and in Figure 11 where T is increased from 10 to 25.The explanation is as in scenario A where the coalition time is lower when the exploration time is lower and the coalition time increases, since, as we already know, increasing the coalition time helps the cooperators do better.4.2.3.The effect of population size.Similarly to scenario A, increasing the population size helps cooperators as shown in Figure 12, where N is increased from 10 to 20.As before, increasing the population size increases the time it takes to find cooperators because there is a one-to-one correspondence between individuals and places.Increasing the population size therefore increases the number of places that need to be searched to find cooperators.Furthermore, as in scenario A, a mutant defector can no longer invade resident cooperators for very small movement cost.4.2.4.The effect of reward and cost.For a mutant defector, the effect of the reward to cost ratio v/c is the same as in scenario A. However, a mutant cooperator does not do better with increasing v/c.In this scenario, the fixation probability of a mutant cooperator peaks, then starts dropping, as v/c is increased.This is because the resident defectors have a very low staying propensity, and are therefore faster at finding cooperators, making it difficult for a mutant cooperator to invade because it cannot avoid the defectors.This is shown in Figure 13 where c = 0.04.Increasing  .This plot shows the best response cooperator staying propensity (solid line, value shown on the x-axis) versus the range of defector staying propensities on the y-axis, and the best response defector staying propensity (dashed line, value shown on the y-axis) versus the range of cooperator staying propensities (on the x-axis) for N/2 cooperators and N/2 defectors.Parameter set 1 is used with λ = 0.2 and the staying propensities are chosen from the set {0.01, 0.02, . . ., 0.99}.The best response staying propensities cross at one point only, which is thus the unique Nash equilibrium, where γ ≈ 0.7 and δ ≈ 0.5.These values are similar to those obtained using the iterative method described earlier (see Figure 9).As before, the values from the current figure are approximate only because of the jagged nature of the lines; the figure is used to illustrate the uniqueness of the solution only.
the cost c though, makes it even more difficult for the cooperators regardless of v/c.In Figure 14, a mutant cooperator cannot invade for any v/c.This is because a larger c reduces the cooperators' background fitness by a larger amount, increasing the handicap that the cooperators already have.

4.3.
The effect of other parameters.The effects of other parameters are not shown using plots but will be explained in this section.
Making the individuals more sensitive to their group members by decreasing the sensitivity parameter S improves the chances of cooperation evolving.In equation (38), we can see that decreasing S will increase the size of the denominator if the group benefit is negative, thereby increasing the probability that an individual moves away from its current position if it is undesirable to stay.Therefore, as S → 0 the more sensitive individuals become, which helps the evolution of cooperation  9 with exploration time T decreased from 10 to 5.
because it reduces the exploitation of cooperators (cooperators are now more likely to move away if the group they are in becomes undesirable).
Another way in which the group member sensitivity can be changed is by choosing β A > 0 and β B < 0 such that β B /β A → −∞.As seen in equation (38), this will cause the group benefit to become negative very quickly in the presence of a defector, even if there are significantly more cooperators present.Once again, this reduces the exploitation of cooperators by defectors, hence, improving the chances that cooperation evolves.
In all of the plots shown, we have only used BDB dynamics because the effect of a change to other dynamics is quite small.The reason for this is that the evolutionary  9 but λ is fixed and reward to cost ratio v/c varied such that c = 0.09.
graph is always complete, that is, whilst the replacement weights change, all individuals can still replace one other.For example, in the case of DBB dynamics, to make a significant difference a defector randomly chosen for death should be more likely to be replaced with the offspring of a cooperator.However, this is not the case here and, in particular, the only way the evolutionary graph can be changed is by changing the staying propensity such that increasing the staying propensity increases the probability that an individual replaces itself.Therefore, the dynamics overall have a small effect.We note that this would not be the case for some other underlying structure that was not complete.

4.4.
The limiting fixed fitness case.Our general framework is complex, and hence so far there have been few analytical results associated with it.In particular payoffs for the games considered, the public goods game as in the current paper and the multiplayer Hawk Dove game as in [9], are frequency dependent, and so general analytical solutions are hard to find.This is especially true for history-dependent models such as the Markov model that we consider in this paper.An alternative, simpler, case is that of fixed fitness, i,e.where payoffs depend only upon an individual's type, and not its interactions.This case is considered in many of the classical evolutionary graph theory papers, and in particular yields some analytical solutions (see for example [31,10,53]).We note that this applies for the public goods game considered here in the limiting case of either very small v or the probability of being alone being close to 1 (i.e.|G n (m)| = 1 almost always), in which case we approximately have fitnesses of 1 and 1 − c for defectors and cooperators, respectively.Here, the interactions only affect the replacement probabilities as described in Section 2.3.Below we shall give some new analytical results for our framework for this fixed fitness case.
The classical fixed fitness models involve a resident population of fitness 1 and an invading mutant of fitness r.The Moran fixation probability is given as where ρ A S denotes the fixation probability of a set of mutants S in a completely unstructured population (because the population is unstructured, only the number of mutants matters).A complete analysis of the conditions under which fixation on an evolutionary graph satisfies the Moran probability was carried out in [45] and was summarised as Table 2 of that paper.
For the weights used in this paper (and commonly elsewhere) the weight matrix W satisfies the isothermal property, as we have noted in Section 3.2.4.The conditions for the Moran fixation probability to hold were shown in [45] to include all isothermal cases for each of the four dynamics BDB, DBD, LB and LD.Thus for the fixed fitness case and these selected weights, in our framework every population is equivalent to the well-mixed population for these dynamics.Substituting the payoffs we gave above for cooperators and defectors into equation 44, we then have the following respective fixation probabilities for i cooperators (defectors) in a population with N − i defectors (cooperators) In our results we have used the BDB dynamics, so that in the limiting case of v → 0 we will obtain the fixation probabilities given above.It is easy to see that that is indeed the case by substituting i = 1 and N = 10 into equations 45 and 46 and comparing with the fixation probabilities near the axis in the third subfigure of Figures 6 and 13  This leaves the dynamics BDD and DBB.Only a very special subclass of weight matrices, some isothermal and some not, could yield the Moran probability for these two dynamics (different for each dynamics).Thus in general these dynamics will not yield the Moran probability in the fixed fitness case, although for the structure used in this paper this is actually a reasonable approximation.
Thus it is clear that, for the weights described in Section 3.2.4,our framework affects evolution primarily through how it affects the fitnesses through the interaction of individuals, and when this effect is removed (as above) significant structural effects disappear.We note firstly that we can have different weights that do not satisfy the isothermal condition, and so for which these results do not apply; for example if self-replacement is replaced by a resampling from the distribution of groups when an individual is alone.
Secondly, we note that some of the more extreme effects that occur in the fixed fitness case from evolutionary graph theory come about precisely because the weights involved are very uneven, for example relating to the star graph, where there is a single central vertex with many neighbours but these vertices only have the central vertex as a neighbour.The payoffs are typically calculated using either the average or the total of a set of games, one played against each neighbour.Yet if we consider weights in the way that we think of in the current paper, namely time spent together, there is a problem with this.The central individual can only spend a small amount of time with each of its many neighbours.What then do these other individuals do the rest of the time?In the average payoff case they are effectively able to acquire the same payoff as for the interaction with their single neighbour (irrespective of what that is), in the total payoff case they gain zero for the rest of the time.In our framework individuals can gain certain payoffs when alone, and this would perhaps be logical for classical evolutionary graph theory too. 5. Discussion.In this paper we have developed the framework of [11], for considering the evolution of structured populations involving multiplayer interactions, and in particular created a model of a mobile population in which the movement of the individuals is Markov, where the place an individual moves to next depends upon their current position.In previous models [9], individuals moved independently of their current position so the model in this paper gives a different perspective on the movement of individuals.In particular, we looked at the movement of individuals in relation to the evolution of cooperation.In what follows, we discuss some of the results of this Markov movement model.
In the Markov movement model we considered in detail the version where the movement of individuals depends upon population history.Here, individuals make a decision of whether they should stay or leave their current position depending upon the other individuals present with them in the same place.This movement strategy is akin to the "walk away" strategy of [1,2].However, we note that this is only one interpretation we can use for the Markov movement model.The framework provides the tools to construct different kinds of Markov movement behaviour.For example, in [21], individuals would study all surrounding areas before making a decision about where to move to next.In terms of the framework, individuals would consider a larger subset of the current population distribution rather than just the distribution of individuals that are currently present with each other.Both simple and complex Markov movement behaviour provide useful insight into the movement behaviour of individuals but we have opted to start with a simpler behaviour to make it easier to show how the framework can be applied.
For cooperation to evolve, it was shown in [22] that there should be assortment, in particular there should be a mechanism that allows the cooperators to increase their preference for interacting with other cooperators.Here, this mechanism is provided by the Markov movement of the individuals.Our results are in line with [2] who also modelled the Markov movement of individuals where individuals would stay where they are if the payoff they received was above some minimum threshold.However, the structure we have used is substantially different.We have used a complete graph with one-to-one correspondence between individuals and places instead of a two-dimensional array.This means that there is a high potential movement speed as individuals can go directly from one place to another, which is mitigated in our model with the introduction of a movement cost.A higher staying propensity slows down an individual because they are more likely to stay where they are.In terms of choosing the staying propensity an individual should use, we calculated the staying propensity which maximises their fixation probability.We considered two different scenarios where the staying propensity of an individual mutates very quickly or slowly.The key difference between the two scenarios was that a mutant cooperator can invade a resident population of defectors for very low movement cost if their staying propensity mutates very slowly.We also investigated the effect of changing the other model parameters.
The BDB dynamics used here allows cooperation to evolve even though typically selection does not favour cooperators with these dynamics [40].Other dynamics that favour cooperators showed little improvement over the results we got for BDB dynamics.This shows that Markov movement is quite effective in allowing cooperation to evolve.Its effectiveness is further backed up by the fact that the structure of the evolutionary graph is complete, which is known to be detrimental for cooperators [40].In particular, in a complete evolutionary graph all individuals can replace each other and, therefore, the individuals with the highest fitness are more likely to be favoured by selection.This shows that conditional movement makes the choice of dynamics being used less important.
We note that our work effectively involves a coevolution of population strategy and structure, and that there has been significant research on this over the past ten years or so, as in for example [42,43].In such models the growth and structure of the graph can be strongly influenced by the game played, as well as previous interactions of individuals.In this case connections between pairs of individuals change, and are formed or broken depending upon the types of the individuals, in a population that evolves with link dynamics happening on a faster timescale that the evolutionary dynamics.A similar but more general set-centred approach is considered in [56].In [23] it is reputation rather than previous interactions that causes structural changes; in [15] the key factor is prosperity.For an excellent review of this type of work prior to 2010 see [47] (see also [3] for a more recent but more general review).As noted by [3], a common feature of a lot of this work is that cooperative behaviour occurs more readily when cooperators are able to both group themselves together and exclude defectors to a significant extent, and this is a feature of our work too.In our case a key difference is the presence of variable-sized multiplayer interactions, the distribution of which is closely linked to population structure.
Furthermore through our framework, we can see a clear connection between models with mobile individuals as in the current paper, and those on a fixed structure.We see an interesting alternative (but which has a similar effect) in [16] and [17], where individuals are on a lattice and move when their current interactions are unsatisfactory.In [57] mobility (also on a lattice) is linked to reputation (where individuals with a higher reputation level than their locality tend to move).These works demonstrate that an intermediate level of mobility can help cooperation to evolve, which we have also seen in our different type of structure.
In this paper we have made several advances on our previous work.We have largely completed the development of the framework of [11] and have shown how it incorporates different aspects of evolutionary game theory thereby making it very flexible in terms of what can be modelled.We have then applied this to a Markov movement model, the simplest type of history-dependent model within our framework.In turn we have used this to explore the evolution of cooperative behaviour, making predictions upon when cooperation can occur, with high exploration time and low movement cost both helping cooperation; interestingly, the evolutionary dynamics used is not so important for our chosen model.The example model used in this paper made use of quite a simplistic territorial structure that allowed the results to be calculated semi-analytically, that is, only a part of the results were calculated using a simulation.In future, we would like to model a more complex territorial structure to determine the effect this has on the evolution of cooperation.As we have seen, the space within which individuals move in has an effect on the speed of movement.Indeed, being able to directly move from one location to another means that individuals have a very high movement speed.However, having to pass through a number of places before reaching the desired location would reduce this movement speed.This again opens up new opportunities for study, for example, the effect of common hubs that all individuals regularly pass through on the evolution of cooperation.

Figure 1 .
Figure1.This plot shows the best response staying propensities for 1 type C i individual playing against N − 1 type C j individuals.Parameter set 1 is used with λ = 0.2 and i, j ∈ {0.01, 0.02, . . ., 0.99}.The intersection point of the plots gives the unique strategy which is a best response to itself, i.e. the unique cooperator resident Nash equilibrium staying propensity γ R , which is somewhere between 0.3 and 0.4.This value is similar to the one obtained using the iterative method (see Figure2).The values from the current figure are approximate only because of the jagged nature of the lines; these occur because of the very large number of simulations that would be necessary to obtain a smooth version (the figure uses 10000 simulations for each combination).The figure is used to illustrate the uniqueness of the solution only.

Figure 2 .
Figure 2.These plots show the effect of movement cost on the evolution of cooperation using parameter set 1. The left (centre) plot shows the staying propensities δ R = 0.99 (γ R ) for resident defectors (cooperators) and γ M (δ M ) for a mutant cooperator (defector) used to invade the resident population.In the right plot, we have the fixation probability of a mutant cooperator C γ M (defector D δ M ) against N − 1 resident defectors D 0.99 (cooperators C γ R ).

Figure 4 .
Figure 4. Plots created using parameter set 3. The exploration time T has been increased from 10 to 25.

Figure 6 .Figure 7 . 2 = max ρ Cc,D b N/ 2 : 2 =
Figure 6.Plots have been created using parameter set 5. The plots here are against the reward to cost ratio v/c such that c = 0.04.

Figure 8
Figure 8.This plot shows the best response cooperator staying propensity (solid line, value shown on the x-axis) versus the range of defector staying propensities on the y-axis, and the best response defector staying propensity (dashed line, value shown on the y-axis) versus the range of cooperator staying propensities (on the x-axis) for N/2 cooperators and N/2 defectors.Parameter set 1 is used with λ = 0.2 and the staying propensities are chosen from the set {0.01, 0.02, . . ., 0.99}.The best response staying propensities cross at one point only, which is thus the unique Nash equilibrium, where γ ≈ 0.7 and δ ≈ 0.5.These values are similar to those obtained using the iterative method described earlier (see Figure9).As before, the values from the current figure are approximate only because of the jagged nature of the lines; the figure is used to illustrate the uniqueness of the solution only.
Figure 3. Plots created using parameter set 2. The exploration time T has been decreased from 10 to 5.
Figure 9.These plots show the effect of movement cost λ on the evolution of cooperation and are created using parameter set 1. The plot on the left shows the Nash equilibrium staying propensity γ for cooperators and δ for defectors in a mixed population where there are N/2 individuals of each type.The plot in the centre shows the fixation probability of each type from the mixed state with N/2 individuals of each type.The plot on the right shows the fixation probability of a mutant cooperator C γ (defector D δ ) in a population of N − 1 resident defectors D δ (cooperators C γ ).Figure 10.Plots created using parameter set 2. Plots are as in Figure Figure 11.Plots created using parameter set 3. Plots are as in Figure 9 with exploration time T increased from 10 to 25.Figure 12. Plots created using parameter set 4. Plots are as in Figure 9 with population size N increased from 10 to 20.Figure 13.Plots created using parameter set 5. Plots are as in Figure 9 but λ is fixed and reward to cost ratio v/c varied such that c = 0.04.Figure 14.Plots created using parameter set 5. Plots are as in Figure