ON THE EVOLUTION OF COMPLIANCE AND REGULATION WITH TAX EVADING AGENTS

Abstract. We study the evolution of compliance and regulation with tax-evading agents, allowing for imitation rather than rationality in the evolution of the distribution of available strategies in the population. The general framework combines a classical model of tax evasion with agents who are imitators rather than rational optimizers and who form an endogenized subjective probability of audit. A regulator chooses values for the available policy instruments, either myopically or optimally within an optimal control setup, always with respect to the behavior of agents. A comparison is drawn between the evolutionary and the rational case in order to evaluate the differences that occur.

1. Introduction. Tax evasion and tax avoidance, as well as ways to remedy such issues, are real, timely and universal facts and have always been the subject of research and investigation of tax authorities in the majority of countries. By affecting a key sector of public finance, i.e. fiscal policy, tax noncompliance renders any action to restrict its emergence and development worth it in terms of budget poise. However, the design and enforcement of such institutions is a rather awkward and costly task for any regulator, since monitoring costs and the existence of loopholes in the law system hinder desirable effectiveness. Tax authorities have not tackled this issue successfully, since tax evasion is still observable and has always been the center of interest in an immense economic literature and economic modelling.
Most of the modern economic literature treats taxpayers as rational agents who make decisions based on net expected utility gains or losses, having exact knowledge of the characteristics of other agents, or even of the government sector's goals and motives. The intrinsic problem with this approach is that it tends to disregard other aspects of human behavior, such as imitation processes, personal influence and tastes, among many other characteristics. These particularities are often omitted for the sake of tractability, since their inclusion renders mathematical models highly non-linear. The ways in which the modern literature models the interaction between tax authorities and taxpayers are reviewed by [2], who categorize them into either principal-agent based models, e.g. [7,16,20,22], or game-theoretic models, e.g. [4,11,17,23]. In these approaches, as in the most well-known models of tax evasion, i.e. the seminal paper of [1], the contribution of [25], as well as [21], agents are treated as rational criminals, as earlier proposed in the seminal paper of [5], having no interaction with each other and maximizing their expected payoff through cheating.
More recent models try to capture these characteristics, namely the behavioral aspects of agents, using classical or evolutionary game theory. The latter offers a considerably more realistic background, treating the strategies of agents in the population as evolving over time, a process that allows for imitation and the emergence of a different concept of equilibrium. An approach towards relaxing the assumption that there is no social interaction among taxpayers is offered by [15]. The agents are considered imitators in the sense that they learn each other's strategy through the stages of a repeated game in which the probability of audit is either held fixed or allowed to change. In a recent paper, [3] study tax evasion using an evolutionary approach, but in a different framework than ours. Their framework includes cheaters, honest citizens and punishers who are willing to sanction cheaters by incurring costs, but includes no regulatory authority which audits and penalizes tax evaders. Our work starts from the same basis in the sense that we allow for imitation, but it differs as far as the steps of the analysis and regulation are concerned. We use the notion of the replicator dynamics equation as proposed by [19] and we also allow for optimization with respect to that behavior. Nevertheless, we view both this work and the work of [15] as ways to expand the research on tax evasion by improving on assumptions and moving towards evolutionary game theory. In the contemporary literature, social imitation is considered among the fundamental motives in psychology regarding social norms and cultural evolution (e.g. [6]); consequently, contemporary research needs to allow for imitation rules and social interaction among agents.
Thus, our work incorporates the concepts of both classical and contemporary approaches in an attempt to unify regulation and enforcement mechanisms with evolutionary concepts of agent behavior. In particular, we consider a general model with a large population of agents and a public sector represented by a regulator whose main concern is to keep a balanced budget. Note that there is no tax avoidance in our model, since we are only interested in the illegal manifestations of the phenomenon. The players are the taxpayers, who interact with each other by playing an evolutionary stage game in which, each period, a random pair of players is formed and each player compares her own payoff with the payoff of the other member of the pair. The mathematical equation explicitly showing the aforementioned procedure is the replicator dynamics equation, which is induced by imitation dynamics and describes the evolution of the different strategies of agents in the population through time:

$$\dot{x}_i = x_i \left[ \pi(i, x) - \bar{\pi}(x) \right], \tag{1}$$

where $x_i$ is the share of strategy $i$ in the population of strategies, $\pi(\cdot)$ is the payoff function and $\bar{\pi}(\cdot)$ is the average payoff in the population. The equation states that as long as the payoff of strategy $i$ is greater than the average payoff in the population, strategy $i$ will spread (thrive) in the population of strategies, as it is relatively fitter than its competitors. This result is derived from imitation dynamics in the sense that there is a positive probability that the agent following strategy $j$ will switch to strategy $i$ if the latter enjoys a higher payoff. Indeed, if we consider a sufficiently large population and restrict attention to two available strategies in the population, say $i$ and $j$, then the replicator dynamics equation becomes:

$$\dot{x}_i = x_i (1 - x_i) \left[ \pi(i, x) - \pi(j, x) \right]. \tag{2}$$

This ongoing process determines which strategy (suppose $i$ = evade and $j$ = comply) has a higher fitness relative to the other, rendering it more abundant in the population as the game is played for a long time.
Thus, the agents in this doctrine are considered boundedly rational and not fully rational, having only subjective beliefs about the way the population of strategies evolves. On the other hand, the regulator has the task to keep an intertemporal balanced budget path and enforce the tax regime by taking into account that the agents have evolving strategies governed by imitation dynamics. However, tax enforcement is a costly procedure by itself and thus the regulator needs to account for these costs as well, i.e. the cost of audit and the cost of the policy instruments she uses. The cost of audit incorporates the sum of operating costs of the fiscal sector as a whole, whereas the cost of policy instruments relates to the inelastic ability of the government to overact with the tax and penalty rates. Policy instrument costs have an inseparable relation to welfare and social stability and they should be normalized or even restricted to viable levels.
The novelty of this work lies primarily in the fact that the optimal control problem faced by the regulator is constrained by the boundedly rational imitating behavior of the taxpayers. Consequently, the regulator aims to determine the optimal paths of the available costly policy instruments, i.e. monitoring effort, tax rate or penalty rate, by taking into account the non-best-response imitating behavior of the tax base. Having found the optimal paths of the instruments, one can then evaluate fiscal policy in terms of equilibria, comparative statics and dynamical behavior. Thus our approach is distinct from the standard 'rational tax-evader approach', which constrains the regulator's problem by the optimality conditions of the rational tax evader, since in our approach the regulator is constrained by the imitation dynamics of the boundedly rational tax evaders.
There is a discussion about whether models of this kind can capture stylized facts about the realized levels of evasion, the underestimation of the subjective probability of audit, etc. In our opinion, all these models, including ours, can only deal with petty crimes, namely tax evasion performed by individuals or by small homogeneous firms, where decisions on whether to evade taxes or not are taken by individuals as well, i.e. firm owners or freelancers. In this context, 'getting away with murder' is easier and less costly in most cases, thus imitation plays an important role in this decision-making. On the other hand, when we deal with large multinational corporations or conglomerates, the decision on whether to evade taxes or not cannot be regarded as a result of imitation. It is a strictly rational choice and most of the time it lies between tax evasion and tax avoidance, since these firms hire professional tax experts in order to take advantage of loopholes present in the tax law. This is considered legal and is obviously a rather rational choice. Consequently, all these models can be said to act as an Occam's razor in terms of modelling reality and providing an explanation for stylized facts.

2. The model. We consider an economy with a sufficiently large population of agents, for which our primary objective is to investigate the evolution of compliance and tax evasion under certain assumptions. The model revolves, almost exclusively, around the interaction of agents with each other and the implications that such an interaction has for tax enforcement policy and regulation. Restricting attention to tax evasion by excluding tax avoidance and other similar practices, we assume that agents have two available strategies, i.e. either to evade taxes, by reporting a taxable income below its true value, or not, by legitimately reporting their whole taxable income.
The decision of each agent on whether to tax evade or not depends on her inclination to imitate the behavior of the agent she comes across in the game that agents play, which is a standard 2x2 symmetric evolutionary stage game according to which, each period, random pairs of agents are formed. Within each pair, each agent has a non-zero probability to switch to the opposing strategy, if her own payoff is lower than the perceived payoff of her match.
On the other side, we consider a regulator, whose role is to optimally control for the aforementioned agents' behavior by setting the level of the different policy instruments at her disposal, accordingly, in order to attain a certain goal. We assume this goal to be a primary deficit/surplus level towards which she optimizes a corresponding social objective function. The regulator has perfect information over the agents' behavior and beliefs and is only limited by intrinsic characteristics of each policy instrument, e.g. the tax rate or penalty rate cannot surpass viable levels.
2.1. Agents side. We assume that we have a sufficiently large population of agents who are subject to income taxation, with tax rate $t$. Each agent receives an income, denoted by $Y$, and decides whether to report it as a whole or just a part of it. The income is exogenously given and is known only to the taxpayer and not to the tax collection authorities. We consider the case where each agent has two available strategies, i.e. to evade taxes or not. When the agent chooses to evade, her reported income, $Y_R$, is strictly below $Y$, i.e. $Y_R < Y$, while $Y - Y_R$ denotes the evaded income.
Thus, we limit attention to two types of agents in the population: those complying and those not complying with the tax rule. At the end of each period, tax authorities conduct random audits in a part of the population and any tax understatement detected is subject to a proportional penalty $\theta$, applied over and above the evaded tax, as proposed by [25]. The agents, therefore, face an audit probability, which is detrimental, in the form of a penalty/fine, only to the non-complying type. We will assume that tax investigation will always penalize the non-complying agent type. In other words, we do not allow for corruption of auditors or in the tax enforcement authority as a whole. Hence, the non-compliers' expected utility, $EU_{NC}$, incorporates a subjective probability of audit $p$, in order to properly weigh the two contingent outcomes, i.e. being caught or not, following Becker's concept (see [5]):

$$EU_{NC} = (1 - p)\, U_{NU} + p\, U_{NA}, \tag{3}$$

where $U_{NU}$ denotes the utility of the non-complier that remains unaudited and $U_{NA}$ the utility of the non-complier that is audited and thus gets caught. It follows from the standard von Neumann-Morgenstern utility function properties that $U_{NU} > U_{NA}$. The expected utility of the complying agent, $EU_C$, is simply $U_C$, since she faces no threat as far as a penalty is concerned, i.e.:

$$EU_C = U_C. \tag{4}$$

We will come to the issue of how we define $p$ later on. For the time being, we consider it fixed and exogenous, satisfying all the desirable properties of a probability measure.
To be more rigorous in notation, each agent is considered risk-averse and has a utility-based payoff depending on the strategy she follows and on the likelihood of being audited. Starting with the non-compliers, each agent reports $Y_R$ and faces two contingencies: to be caught cheating or not. In the first case, the evader reported $Y_R < Y$ and has been taxed with $tY_R$. Since she has been caught, she will have to pay her full tax responsibilities, i.e. the remaining $t\Delta Y$ plus the penalty on evaded tax, $\theta t \Delta Y$, where $\theta$ is the penalty rate imposed and $\Delta Y \equiv Y - Y_R$. Her payoff, $U_{NA}$, is therefore $U(Y - tY_R - t\Delta Y - \theta t \Delta Y)$, which in terms of the true after-tax income $\Upsilon \equiv Y - tY$ equals $U(\Upsilon - \theta t \Delta Y)$. In the latter case, the respective payoff of the "lucky" non-complying agent who does not get audited, i.e. $U_{NU}$, turns out to be just $U(Y - tY_R)$, which in terms of after-tax income equals $U(\Upsilon + t\Delta Y)$. Substituting into (3), we obtain the following expression for the expected utility of the non-complying agent:

$$EU_{NC} = (1 - p)\, U(\Upsilon + t\Delta Y) + p\, U(\Upsilon - \theta t \Delta Y).$$

On the other hand, the complying agent, i.e. the one who reports her whole income $Y$, gets a payoff, $U_C$, equal to $U(Y - tY)$, which simplifies to just $U(\Upsilon)$, since she fully paid her tax liabilities. Following, once again, the von Neumann-Morgenstern properties for utility ranking, it is always true that $U_{NU} > U_C > U_{NA}$. This is an intuitive and necessary condition, so that the individual rationality constraint for the complying agents is satisfied. Notice again in (4) that the expected utility of the complying agent is $U_C$ itself, i.e. $EU_C \equiv U_C$, since she faces no uncertainty.
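The payoff ranking above can be checked numerically for a single illustrative taxpayer. The sketch below is only an illustration: the logarithmic utility, the function names and all parameter values (Y = 100, Y_R = 60, t = 0.3, θ = 1.5, p = 0.25) are assumptions made here and are not taken from the model's specification.

```python
import math

def payoffs(Y, Y_R, t, theta, U=math.log):
    """Return (U_NU, U_C, U_NA) for true income Y and reported income Y_R.

    U is an assumed concave utility (log by default, purely illustrative).
    """
    dY = Y - Y_R                          # evaded income, Delta Y
    upsilon = (1.0 - t) * Y               # true after-tax income
    U_NU = U(upsilon + t * dY)            # evader who is not audited
    U_C = U(upsilon)                      # complier
    U_NA = U(upsilon - theta * t * dY)    # evader who is audited and fined
    return U_NU, U_C, U_NA

def expected_utility_evader(p, U_NU, U_NA):
    """Expected utility of a non-complier under audit probability p."""
    return (1.0 - p) * U_NU + p * U_NA

# Hypothetical parameter values, chosen only for illustration.
U_NU, U_C, U_NA = payoffs(Y=100.0, Y_R=60.0, t=0.3, theta=1.5)
EU_NC = expected_utility_evader(0.25, U_NU, U_NA)

print("ranking U_NU > U_C > U_NA holds:", U_NU > U_C > U_NA)
print("evasion pays at p = 0.25:", EU_NC > U_C)
```

With these assumed numbers the ranking holds and evasion still pays at p = 0.25; raising p far enough reverses the second comparison.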

2.2. Replicator dynamics equation and tax evasion. Let $x$ denote the share of the non-compliers in the population; consequently, the share of compliers will be the remaining $(1 - x)$. The average expected payoff in terms of utility in the population of agents will be given by:

$$\overline{EU} = x\, EU_{NC} + (1 - x)\, EU_C. \tag{5}$$

As described in (1) and (2), the change in the share of non-compliers in the population of agents will be given by the following expression:

$$\dot{x} = x \left( EU_{NC} - \overline{EU} \right). \tag{6}$$

YANNIS PETROHILOS-ANDRIANOS AND ANASTASIOS XEPAPADEAS
After combining equations (3), (4), (5) and (6), the replicator dynamics equation will have the following form:

$$\dot{x} = x (1 - x) \left[ (1 - p)\, U_{NU} + p\, U_{NA} - U_C \right]. \tag{7}$$

For further details on the derivation of the replicator dynamics equation one can refer to Appendix B of the present work. Until now, we have treated the subjective probability of audit $p$, formed by the non-complying agents, as exogenous and fixed. This means that the level of monitoring effort is taken as given, i.e. as if the regulator publicly announced the number of inspections to be exercised per period. We want to go beyond the case of a precommitment to a certain auditing probability and we will, therefore, endogenize $p$ in a way that makes our analysis more realistic while keeping the results tractable, since there exists a trade-off between these two properties.

We assume that each agent elicits information in order to form the probability with which she thinks she will be audited, through two channels. The first channel consists of all publicly available information on the stringency of the tax authorities, whereas the second channel incorporates all the knowledge that the agent can acquire through her interaction with the other agents, especially via the evolutionary stage game. Broadcasts, television, radio, Internet or printed news, as well as political announcements, economic crises or booms directly affecting the real economy, etc., serve as suitable examples for the first channel. People tend to shape their beliefs regarding the stringency of audits according to the aforementioned sources. As far as the second channel is concerned, human communication and interpersonal relationships provide plausible examples through which agents form beliefs and, most importantly, imitate one another; and we have seen that the replicator dynamics is all about imitation.
Thus, we assume that the subjective probability of audit is jointly defined firstly by the overall perceived effort put into auditing by the tax authorities and secondly by the perceived magnitude of tax evasion in the population.
As far as the former is concerned, let $\varepsilon$ denote the effort put into auditing by the tax authorities. We can think of $\varepsilon$ as the ratio of the spending channeled to audits over the total government spending towards tax authorities, i.e. the net expenditure ratio financing the audit mechanism. Thus, it holds that $0 \le \varepsilon \le 1$, since it constitutes a ratio. The lower bound, 0, and the upper bound, 1, imply the two extreme cases of no audit spending and full audit spending, respectively, and can never be achieved in reality. Although the two boundary values are only limiting cases, they ensure that effort has desirable probability-like characteristics and they serve in the formation of the control variable constraints in the optimization problem below.
As far as the latter is concerned, the ratio of non-compliers in the population, i.e. $x \in [0, 1]$, represents the relative magnitude of tax evasion in the population and can easily be inferred by each agent through public relations and social interaction in general. Notice that the taxpayer has no knowledge of the true values of $\varepsilon$ and $x$; instead she infers a value for each one of them. Let these subjective values be denoted by $(\hat{\varepsilon}, \hat{x})$. These values then enter as parameters in the class of probability distributions of audit, identifying the true function of subjective belief of being audited. In other words, $\hat{\varepsilon}$ and $\hat{x}$ must satisfy that:

$$\hat{\varepsilon} = \varepsilon + v_1(\omega), \qquad \hat{x} = x + v_2(\omega).$$

That is, the perceived values of $\varepsilon$ and $x$, i.e. $\hat{\varepsilon}$ and $\hat{x}$, are a function of the true values of $\varepsilon$ and $x$ plus random shocks $v_1(\omega)$ and $v_2(\omega)$ on the sample space, respectively. Note that the true values of $\varepsilon$ and $x$ may depend on certain factors, but this has no impact on the form of the subjective probability of audit whatsoever, since it incorporates only the parametric values $\hat{\varepsilon}$ and $\hat{x}$ in order to measure the audit chance. Both random shocks $v_1(\omega)$ and $v_2(\omega)$ are considered to be zero-mean stationary random variables. We can now define the endogenized subjective probability of audit to be a distribution function measuring the probability that the agent gets audited given $(\hat{\varepsilon}, \hat{x})$, which will be denoted for simplicity by $p(\varepsilon, x)$, satisfying the property that:

$$0 \le p(\varepsilon, x) \le 1.$$

Thus, we get a specific probability measure that, given the values of the parameters $\hat{\varepsilon}$ and $\hat{x}$, takes values from 0 to 1 for the event that the agent gets audited. Substituting the endogenized probability back into (7), we obtain the replicator dynamics equation with endogenized subjective probability, i.e.:

$$\dot{x} = x (1 - x) \left[ \big(1 - p(\varepsilon, x)\big)\, U_{NU} + p(\varepsilon, x)\, U_{NA} - U_C \right]. \tag{8}$$

Notice that with the endogenized probability we may obtain a feasible interior steady state solution for the replicator dynamics equation. In its original form, as depicted in (7), the only candidate solutions for a steady state ($x$; $\dot{x} = 0$) were the corner solutions, i.e. $x = 0$ and $x = 1$, interpreted as full compliance and full evasion in the population, respectively. This was due to the fact that all the terms in the bracketed section were independent of $x$.
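The difference between an exogenous and an endogenized audit probability can be illustrated by integrating the replicator dynamics numerically. The following is a minimal sketch under stated assumptions: the utility levels come from a hypothetical log-utility example (incomes 82, 70 and 52), the form p = 0.8·x is one assumed example of an endogenized probability, and forward Euler integration is simply a convenient choice made here.

```python
import math

# Hypothetical utility levels (log utility of assumed incomes 82, 70, 52).
U_NU, U_C, U_NA = math.log(82.0), math.log(70.0), math.log(52.0)

def xdot(x, p):
    """Right-hand side of the replicator dynamics for audit probability p."""
    return x * (1.0 - x) * ((1.0 - p) * U_NU + p * U_NA - U_C)

def simulate(x0, p_of_x, dt=0.05, steps=40_000):
    """Forward-Euler integration of x' = xdot(x, p(x)), clipped to [0, 1]."""
    x = x0
    for _ in range(steps):
        x += dt * xdot(x, p_of_x(x))
        x = min(max(x, 0.0), 1.0)
    return x

# Exogenous probability: the bracket does not depend on x, so x ends at a corner.
x_fixed = simulate(0.5, lambda x: 0.2)

# Assumed endogenized form p = eps * x with eps = 0.8: an interior rest point.
x_endog = simulate(0.5, lambda x: 0.8 * x)

print(f"exogenous p = 0.2  -> x = {x_fixed:.4f}")
print(f"endogenous p = 0.8x -> x = {x_endog:.4f}")
```

With the fixed probability the share of evaders drifts to a corner, whereas the state-dependent probability lets the trajectory settle at an interior share, which is the qualitative point made above.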
2.3. Monomorphic vs. polymorphic behavior. The replicator dynamics equation, as a differential equation, implies some dynamical behavior for $x$, i.e. the fraction of non-compliers in the population. Taking equation (7) as a benchmark, where probability is still considered exogenous and the bracketed section is independent of $x$, we can rewrite it as:

$$\dot{x} = x (1 - x) K, \tag{9}$$

where $K \equiv (1 - p)\, U_{NU} + p\, U_{NA} - U_C$ can be viewed as a scaling constant, as far as the differential equation is concerned. It is clear that $\dot{x}$ is a parabola in $x$, the convexity (curvature) of which depends on the sign of $K$. The critical points of (9), i.e. the ones for which it holds that $\dot{x} = 0$, are $x_1^* = 0$, $x_2^* = 1$, or $K = 0$, which is trivial. Differentiating $\dot{x}$ with respect to $x$ we obtain:

$$\frac{\partial \dot{x}}{\partial x} = (1 - 2x) K. \tag{10}$$

Computing (10) in each steady state, i.e. $x_1^* = 0$ and $x_2^* = 1$, we derive the dynamical behavior of the steady states of (9), which is summarized as:

$$\left.\frac{\partial \dot{x}}{\partial x}\right|_{x_1^* = 0} = K, \quad \text{stable if } K < 0, \text{ unstable if } K > 0,$$

$$\left.\frac{\partial \dot{x}}{\partial x}\right|_{x_2^* = 1} = -K, \quad \text{stable if } K > 0, \text{ unstable if } K < 0,$$

$$\left.\frac{\partial \dot{x}}{\partial x}\right|_{K = 0} = 0, \quad \text{all levels of } x \text{ satisfy stability (not desirable)}.$$

Notice that in the benchmark case with exogenous probability of audit we only get two corner solutions, i.e. $x_1^* = 0$ and $x_2^* = 1$, and one trivial solution that consists of all interior $x$'s. Hence it describes a monomorphic behavior and leads to monomorphic phase diagrams, in the sense that there are no transitional dynamics, but only a jump to each stable point. It is obvious that any tax authority would want to achieve a stable $x_1^* = 0$ state, which means that all agents become complying and fully pay their tax responsibilities. In the benchmark environment this implies that $K$ must be negative, which in turn means that $U_C > (1 - p)\, U_{NU} + p\, U_{NA}$ at all times; a condition that makes sense, since switching to the "good" strategy means that it must score higher in terms of payoff than the "bad" strategy. This can be regarded as a participation constraint for the complying agent type.

On the other hand, by endogenizing the subjective probability of audit we manage to transform $K$ into a function of $x$ through $p$, i.e. $K : K(p(\varepsilon, x))$. Thus, for the replicator dynamics equation (8), steady states, as before, are defined by $\dot{x} = 0$ as $x_1^* = 0$, $x_2^* = 1$ and, assuming that it exists, an interior $x^\dagger$ that satisfies the following relationship:

$$\big(1 - p(\varepsilon, x^\dagger)\big)\, U_{NU} + p(\varepsilon, x^\dagger)\, U_{NA} = U_C. \tag{11}$$

To pinpoint the interior $x^\dagger$ satisfying (11), one must assume a specific form for $p(\varepsilon, x)$. Assuming that this interior $x^\dagger$ exists, we get a polymorphic phase diagram whose dynamical behavior depends on the sign of the derivative of $\dot{x}$ calculated at each steady state. All the possible variations for both monomorphic and polymorphic (with one interior $x^\dagger$) phase diagrams are summarized in Figure 1 below.
Notice that for the monomorphic case we can attain a stable full compliance level at once (see Figure 1(A) and Figure 1(C)), if the dynamics permit it, whereas for the polymorphic case of one interior steady state, full compliance and full non-compliance are both present in the same contingency. In Figure 1(B) a small perturbation from the unstable interior steady state $x^\dagger$ leads to either total conformity or total non-compliance. A desirable polymorphic case is, therefore, one that is characterized by unstable corner solutions and a stable interior $x^\dagger$, i.e. the one depicted in Figure 1(D). This is also the most realistic of all four contingencies, not to mention cases with multiple interior steady states, which are more interesting than all of the above.
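The stability classification can be reproduced numerically by evaluating the sign of the slope of the right-hand side at each steady state. The sketch below is illustrative only: the utility levels are hypothetical (log utility of assumed incomes 82, 70 and 52), the two forms p = εx and p = ε(1 − x) with ε = 0.8 are assumed examples of an endogenized audit probability, and the central finite difference is just a convenient way to obtain the slope.

```python
import math

# Hypothetical utility levels (log utility of assumed incomes 82, 70, 52).
U_NU, U_C, U_NA = math.log(82.0), math.log(70.0), math.log(52.0)

def xdot(x, p):
    """Replicator right-hand side for a state-dependent audit probability p(x)."""
    return x * (1.0 - x) * ((1.0 - p(x)) * U_NU + p(x) * U_NA - U_C)

def classify(x_star, p, h=1e-6):
    """Classify a steady state by the sign of d(xdot)/dx (central difference)."""
    slope = (xdot(x_star + h, p) - xdot(x_star - h, p)) / (2.0 * h)
    return "stable" if slope < 0 else "unstable"

eps = 0.8
r = (U_NU - U_C) / (eps * (U_NU - U_NA))   # interior root when p = eps * x

cases = {
    "p = eps*x":     (lambda x: eps * x, r),
    "p = eps*(1-x)": (lambda x: eps * (1.0 - x), 1.0 - r),
}
results = {}
for name, (p, x_int) in cases.items():
    results[name] = [classify(x, p) for x in (0.0, x_int, 1.0)]
    print(name, dict(zip(("x=0", "interior", "x=1"), results[name])))
```

Under these assumptions the first form gives unstable corners with a stable interior point, the pattern described for Figure 1(D), while the second gives stable corners with an unstable interior point, the pattern described for Figure 1(B).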
2.4. Primary deficit and tax evasion. We assume that the economy has a single regulating agent in lieu of the government, in the sense that she has no other responsibility than the planning, enforcement and optimization with respect to resource and policy instrument constraints, towards a certain goal. The regulator is aware of the behavior of the agents and has full knowledge over the whole system that describes it. The objective function towards which policy is set can vary depending on the ultimate goals of the economy. These goals could regard spending, welfare, the monetary or fiscal sector, trade etc.; they could even concern idiosyncratic objectives, e.g. reelection. In our setup, the regulator will be a fiscal policy regulator, who sets the target for the primary deficit of the economy. The primary deficit consideration offers two major advantages.
Firstly, it incorporates, as we shall see below, the two channels that the agents use to form their subjective probability of audit, i.e. the perceived effort of audit, $\varepsilon$, and the ratio of evaders in the population, $x$. That is mainly because the primary deficit is publicly available information, which enables agents to make a forecast about the stringency of the government concerning tax investigation. In our model, changes in the primary deficit will be attributed to fluctuations in the tax revenues, and can thus be interpreted as a corresponding change in the non-compliance rate, $x$. Secondly, it can be considered a more straightforward approach than a generic social welfare target, since it is a measure that is also used in practice in order to assess the effectiveness of fiscal policy and determine necessary changes in policy instruments. Let the primary deficit be given by $G - T^e$, where $G$ is the government spending and $T^e$ are the expected tax revenues. Government spending can be regarded as constant over time, i.e. $\dot{G} = 0$, so that the rate of change of the primary deficit is fully attributed to the change of the tax revenues.

[Figure 1: phase diagrams, with panel titles "Monomorphic" and "Polymorphic".]
Notice that $T^e$ denotes the total net expected tax revenues of the government, since the effort put into audit is costly for the government. Let $c(\varepsilon)$ be the cost function of effort faced by the tax authorities. We also define a function $\phi(\varepsilon)$, denoting the ratio of the population to be randomly drawn and audited per period. Notice that $\phi(\varepsilon)$ is deterministic, as it is the regulator's choice, but it resembles a probability measure, since it constitutes a ratio, and will be treated as such, taking values in $[0, 1]$. For notational simplicity, let $\phi \equiv \phi(\varepsilon)$ and $Y - Y_R \equiv \Delta Y$. In order to compute $T^e$, it is wiser to split agents, once again, into evaders and compliers. Both groups face a probability of being audited equal to $\phi$; thus, a ratio $\phi$ of the population is indeed audited and the remaining $(1 - \phi)$ remains unaudited. For both audited and unaudited agents, $x$ of them evade and $(1 - x)$ of them comply. Table 1 below summarizes the payoffs for each contingency:

Table 1.

| | Evader (share $x$) | Complier (share $1 - x$) |
|---|---|---|
| Audited (share $\phi$) | $tY_R + (1 + \theta)\, t\Delta Y$ | $tY$ |
| Unaudited (share $1 - \phi$) | $tY_R$ | $tY$ |

Thus, the audited agent has a tax collection of $(1 - x)\, tY + x \left[ tY_R + (1 + \theta)\, t\Delta Y \right]$, whereas the unaudited agent has a tax collection of $(1 - x)\, tY + x\, tY_R$. Therefore, total net expected tax revenues $T^e$ are the sum of audited plus unaudited tax collections, weighted by $\phi$ and $(1 - \phi)$ respectively, net of auditing costs:

$$T^e = \underbrace{(1 - x)\, tY}_{A} + \underbrace{x \Big[ \underbrace{tY_R}_{b_1} + \underbrace{\phi\, (1 + \theta)\, t\Delta Y}_{b_2} \Big]}_{B} - c(\varepsilon). \tag{12}$$

The intuition behind this result is that the government earns a part of the tax revenues from the complying agents, i.e. the $(1 - x)$ share of the population, who fully pay their tax responsibilities $tY$, as indicated by part $A$ of (12), and the rest comes from the non-complying agents, namely the $x$ share of the population, as in part $B$ of (12). Notice that all evading agents pay part $b_1$ of $B$, that is to say, the tax on understated income $tY_R$, while the remaining part $b_2$ of $B$ comes from the $\phi$ share of them that gets caught and is fined by the amount $(1 + \theta)\, t\Delta Y$, which consists of the evaded tax $t\Delta Y$ and the penalty on evaded tax $\theta t\Delta Y$. Subtracting the cost of effort $c(\varepsilon)$ from the summation gives the total net expected tax revenues.
Notice that the magnitude of the tax revenues T e can only be known to the regulator and not the agents, since she is the only one to know the realized ratio of agents to be audited in the population, φ.
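The revenue decomposition described above (compliers pay tY, every evader pays tY_R, and the audited share φ of evaders additionally pays (1 + θ)t∆Y) can be cross-checked against a direct sum over the four audited/unaudited and evader/complier cells. The sketch below works per capita; the function names, the constant audit cost and all parameter values are hypothetical and serve only as an illustration.

```python
def revenue_formula(x, phi, t, theta, Y, Y_R, cost):
    """Net expected revenue per capita, written as parts A and B minus the audit cost."""
    dY = Y - Y_R
    part_A = (1.0 - x) * t * Y                              # compliers
    part_B = x * (t * Y_R + phi * (1.0 + theta) * t * dY)   # evaders (b1 + b2)
    return part_A + part_B - cost

def revenue_by_cells(x, phi, t, theta, Y, Y_R, cost):
    """Same quantity, summed over audited and unaudited parts of the population."""
    dY = Y - Y_R
    audited = (1.0 - x) * t * Y + x * (t * Y_R + (1.0 + theta) * t * dY)
    unaudited = (1.0 - x) * t * Y + x * t * Y_R
    return phi * audited + (1.0 - phi) * unaudited - cost

# Hypothetical values; `cost` stands in for c(eps) at some fixed effort level.
args = dict(x=0.4, phi=0.2, t=0.3, theta=1.5, Y=100.0, Y_R=60.0, cost=1.0)
T_formula = revenue_formula(**args)
T_cells = revenue_by_cells(**args)
print(T_formula, T_cells)
```

Both routes give the same number, which confirms that only the audited evaders contribute the extra term weighted by φ.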
3. Regulating tax evasion. In a standard optimization setup, the regulator would aim to control for tax evasion towards a certain goal, with respect to the behavior of agents as described by equation (8). Let us suppose at first, as a benchmark case, that the regulator can perform an unconditional regulation, being able to tamper with the replicator dynamics equation without having to worry about costs or specific goals.
3.1. A benchmark case: Controlling replicator dynamics. Imagine that the regulator sees only equation (8), i.e. the replicator dynamics equation, and is indifferent towards other kinds of goals and/or restrictions:

$$\dot{x} = x (1 - x) \left[ \big(1 - p(\varepsilon, x)\big)\, U_{NU} + p(\varepsilon, x)\, U_{NA} - U_C \right].$$

Remember also that:

$$U_{NU} = U(\Upsilon + t\Delta Y), \qquad U_{NA} = U(\Upsilon - \theta t \Delta Y), \qquad U_C = U(\Upsilon).$$

The regulator's goal immediately becomes to make the fraction of non-compliers, $x$, in the population reach zero and make this point, i.e. $x = 0$, a stable steady state. With this being a first-best outcome in terms of efficiency of tax regulation, the second-best outcome, when $x = 0$ is unattainable, would be to somehow drive the system to the lowest nonzero $x$ possible and make that one a stable critical point. Notice that in the benchmark case effort is costless, hence the regulator can exert as much of it as is required in order to attain the desirable result.
The channels through which the regulator can affect the replicator dynamics equation are the three policy instruments at her disposal, namely the effort $\varepsilon$, the tax rate $t$ and the penalty rate $\theta$. We will see what can be achieved in terms of compliance when the regulator uses each policy instrument, unconditionally and to its full extent, implying that she uses one instrument at a time while the others remain fixed at a certain value (ceteris paribus). In order to be able to proceed, we must impose specific functional forms for the von Neumann-Morgenstern utility function $U(\cdot)$ describing the preferences of agents towards risk, as well as for the subjective probability of audit, $p(\varepsilon, x)$. Since agents are treated as risk-averse, we choose one of the most widely used utility functions exhibiting constant relative risk aversion, i.e. the isoelastic utility function:

$$U(w) = \frac{w^{1-\rho} - 1}{1 - \rho},$$

with $\rho > 0$ measuring the degree of risk aversion. Notice that, when $\rho = 1$, using l'Hôpital's rule, we obtain that $\lim_{\rho \to 1} \frac{w^{1-\rho} - 1}{1 - \rho} = \log(w)$, as a special case. For all $\rho \neq 1$, the constant term "$-1$" in the numerator will be omitted, since optimal decisions are not affected by additive constant terms. As far as the subjective probability of audit $p(\varepsilon, x)$ is concerned, since both $\varepsilon$ and $x$ take values in $[0, 1]$, we can use any relationship for which $p(\varepsilon, x)$ also ranges in $[0, 1]$. Examples may include, among others, 1) $p(\varepsilon, x) = \varepsilon x$ and 2) $p(\varepsilon, x) = \varepsilon (1 - x)$, together with two further forms, 3) and 4). In all cases it is realistic to assume a positive relationship between $p(\varepsilon, x)$ and $\varepsilon$, since higher perceived stringency causes a risk-averse taxpayer to believe that the odds of being audited increase. This statement, though, is not always true as far as the relation between $p(\varepsilon, x)$ and $x$ is concerned.
Observe that in cases 1) and 3) above, there is a positive relation between $p(\varepsilon, x)$ and $x$, which implies that a rise in the perceived ratio of evaders causes the agent to strengthen her belief that she might get audited, because she expects the tax authorities to increase effort to compensate for that rise in evasion. On the other hand, in cases 2) and 4) we have the exact opposite logic: a perceived rise in $x$ leads the agent to conclude that the tax authorities cannot deal with tax evasion efficiently and will not be able to do so in the near future either, which in turn makes her infer that the probability of audit falls, in the sense that it has become overwhelmingly costly for the government to increase audits. We will be using cases 1) and 2) and see the implications below. Incorporating the above in the replicator dynamics equation, we end up with the respective alternative cases:

Case 1).

$$\dot{x}_1 = x (1 - x) \left[ (1 - \varepsilon x)\, \frac{(\Upsilon + t\Delta Y)^{1-\rho}}{1 - \rho} + \varepsilon x\, \frac{(\Upsilon - \theta t \Delta Y)^{1-\rho}}{1 - \rho} - \frac{\Upsilon^{1-\rho}}{1 - \rho} \right]. \tag{13}$$

Case 2).

$$\dot{x}_2 = x (1 - x) \left[ \big(1 - \varepsilon (1 - x)\big)\, \frac{(\Upsilon + t\Delta Y)^{1-\rho}}{1 - \rho} + \varepsilon (1 - x)\, \frac{(\Upsilon - \theta t \Delta Y)^{1-\rho}}{1 - \rho} - \frac{\Upsilon^{1-\rho}}{1 - \rho} \right]. \tag{14}$$
Note that the dynamical behavior of the special case where U (w) = log (w), i.e. ρ = 1, is almost identical 6 to the general case and that all results hold for both types of utility functions. The steady states of the replicator dynamics equations (13) and (14) are the solutions to ẋ 1 = 0 and ẋ 2 = 0 and include the corner solutions x * 1 = 0 and x * 2 = 1 as well as the interior x † at which the bracketed term of each equation is zero. For the first case, i.e. when p (ε, x) = εx, and after making substitutions in order to avoid complex notation while providing a formula that applies to all cases, the interior x † 1 is given by: By analogy, the interior steady state, x † 2 , for the second case, i.e. when p (ε, x) = ε (1 − x), is given by: Although the fact that x † 2 = 1 − x † 1 implies a completely inverse trajectory of x † 2 in comparison to x † 1 , the two share a common feature. The effort level ε and the penalty rate θ have opposite effects on x † 1 and x † 2 , because they only appear positively in the denominator, whereas the tax rate t has an ambiguous effect, making the interior steady states x † 1 and x † 2 behave erratically for overly high levels of t. This is mainly attributed to the fact that as t reaches unsustainable levels (t → 1), the compliers' utility goes to zero, since the true after-tax income reaches zero (Υ → 0), and the evaders gain a complete advantage, rendering a strategy switch towards evasion perfectly profitable and spreading it in the population.
6 The only thing that changes in the case where U (w) = log (w) is that for the second case, where p (ε, x) = ε (1 − x), the interior solution x † 2 could be a complex number for some parametric values of the tax rate t.
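The relation x † 2 = 1 − x † 1 can be illustrated with a small numerical sketch. It rests on assumptions that are not spelled out in this section: logarithmic utility, a complier payoff log((1 − t)Y), an unaudited evader payoff log(Y − tY R), and an audited evader who pays the evaded tax plus a penalty θ on it. The values Y = 1, Y R = 0.9 and ε = 0.8 are illustrative only.

```python
import math


def payoffs(t, theta, Y=1.0, Y_R=0.9):
    """Assumed log-utility payoffs: complier, unaudited evader, audited evader."""
    dY = Y - Y_R                                             # concealed income
    u_c = math.log((1.0 - t) * Y)                            # complier pays tax on true income
    u_nu = math.log(Y - t * Y_R)                             # evader who is not audited
    u_na = math.log(Y - t * Y_R - (1.0 + theta) * t * dY)    # evader who is audited and fined
    return u_c, u_nu, u_na


def interior_steady_states(eps, t, theta):
    """Zeros of the bracketed term for p = eps * x and for p = eps * (1 - x)."""
    u_c, u_nu, u_na = payoffs(t, theta)
    ratio = (u_nu - u_c) / (eps * (u_nu - u_na))
    return ratio, 1.0 - ratio


x1, x2 = interior_steady_states(eps=0.8, t=0.1, theta=1.0)
print(round(x1, 3), round(x2, 3))
```

Under these assumptions ε and θ enter only through the denominator (θ via the audited payoff), so raising either lowers x † 1 and raises x † 2, in line with the discussion above.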
In order to characterize the steady states in terms of stability we need to compute the derivative of ẋ with respect to x, evaluated at each steady state x i , i.e.:

$\left. \frac{\partial \dot{x}}{\partial x} \right|_{x = x_i}$
For the first case, where p (ε, x) = εx, we have that: ≶ 0, < 0 as ε and/or θ rise. This shows that as long as effort (or the penalty) remains low and close to zero, we have a phase diagram as depicted in Figure 1(A), which implies full evasion, and that as effort (or the penalty) rises we get a smooth transition to the state described by Figure 1(D), with a stable interior x † 1 which steadily decreases towards zero. We say nothing here about the behavior of the phase diagram with respect to the tax rate, because the results become ambiguous once it exceeds a specific level (t > 0.7) that no economy could sustain. Up to that level, the transition is exactly the same as for ε and θ, but beyond this threshold we get an inverse, smooth yet rapid, transition back to the initial monomorphic full evasion situation. This is attributed to the fact mentioned above, i.e. the evasion strategy gains in terms of payoffs beyond a certain tax rate. Figure 2 depicts frames [A-D] of this smooth transition of the phase diagram as ε (or θ) goes from 0 to 1, ceteris paribus. For the tax rate, imagine moving from frame A through D as t grows, as before, but once t reaches the critical threshold the transition reverses, following the path from frame D back to A at a faster pace.
Figure 2. Transition of the replicator dynamics for p (ε, x) = εx.

YANNIS PETROHILOS-ANDRIANOS AND ANASTASIOS XEPAPADEAS
For the second case, where p (ε, x) = ε (1 − x), we have that: ≶ 0, > 0 as ε and/or θ rise. This, in turn, shows that as long as effort (or the penalty) remains low and close to zero, we have a phase diagram as depicted in Figure 1(A), which implies full evasion, and that as effort (or the penalty) rises we get a smooth transition to the state described by Figure 1(B), with an unstable interior x † 2 which steadily increases towards one. This is exactly the opposite of the behavior in the first case. Notice that the result is counter-intuitive, since a rise in ε and θ leads to higher levels of evasion. This is due to our assumption that taxpayers perceive large levels of evasion, i.e. large x, as a signal that the chance of audit falls because effort is ineffective, and thus the system eventually makes evasion grow. Nevertheless, remember that in the Figure 1(B) case the interior steady state x † 2 is unstable, which means that x † 2 will only hold in the short run, as a small perturbation of this fragile steady state will lead the economy to either full compliance (x * 1 = 0) or full evasion (x * 2 = 1). The regulator would surely want to take advantage of this and lead the system to full compliance, which, after all, is the first-best outcome, but this is only as likely as the full evasion scenario. The response of this case to changes in the policy instruments is depicted below, in Figure 3, following the same notion as in Figure 2. Moreover, the phase diagram's response to changes in the tax rate, t, follows the same pattern as in the first case: increasing the tax rate up to a certain level causes evasion to grow, as do changes in ε and θ [Frames A → D, Figure 3], but once that level is surpassed the interior level of evasion falls back to zero [Frames D → A, Figure 3]. Nevertheless, this still means full evasion, since Frame A describes a stable x * 2 = 1 steady state.
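The stability pattern of the two cases can be reproduced with a finite-difference estimate of ∂ẋ/∂x at each steady state. This is a sketch under assumed log-utility payoffs (complier log((1 − t)Y), unaudited evader log(Y − tY R), audited evader paying the evaded tax plus a penalty θ on it) and illustrative values Y = 1, Y R = 0.9, t = 0.1, θ = 1; none of these numbers come from the paper's figures.

```python
import math

# Assumed payoffs and illustrative parameter values (not taken from the paper's figures).
T, THETA, Y, Y_R = 0.1, 1.0, 1.0, 0.9
U_C = math.log((1.0 - T) * Y)
U_NU = math.log(Y - T * Y_R)
U_NA = math.log(Y - T * Y_R - (1.0 + THETA) * T * (Y - Y_R))


def audit_belief(x, eps, case):
    """Case 1: p = eps * x.  Case 2: p = eps * (1 - x)."""
    return eps * x if case == 1 else eps * (1.0 - x)


def x_dot(x, eps, case):
    """Replicator dynamics: x (1 - x) times the evader-minus-complier payoff gap."""
    p = audit_belief(x, eps, case)
    return x * (1.0 - x) * ((1.0 - p) * U_NU + p * U_NA - U_C)


def interior(eps, case):
    """Interior zero of the payoff gap (may fall outside [0, 1] for low effort)."""
    ratio = (U_NU - U_C) / (eps * (U_NU - U_NA))
    return ratio if case == 1 else 1.0 - ratio


def classify(x, eps, case, h=1e-6):
    """Sign of a central-difference estimate of d(x_dot)/dx at x."""
    slope = (x_dot(x + h, eps, case) - x_dot(x - h, eps, case)) / (2.0 * h)
    return "stable" if slope < 0 else "unstable"


def report(eps):
    """Stability of both corners and the interior state, for each belief case."""
    return {
        case: {
            "x=0": classify(0.0, eps, case),
            "interior": classify(interior(eps, case), eps, case),
            "x=1": classify(1.0, eps, case),
        }
        for case in (1, 2)
    }


print(report(0.8))
```

With ε = 0.8 the sketch yields a stable interior state flanked by unstable corners in case 1, and an unstable interior state flanked by stable corners in case 2, matching the qualitative description of Figures 1(D) and 1(B).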
One can easily conclude that the regulator would prefer a population of agents whose subjective belief of audit resembles the first case, i.e. p (ε, x) = εx, so that raising the policy instruments reduces evasion and every interior x † is stable and thus attainable. This means that the regulator hopes that agents view a rise in evasion around them not as an opportunity to evade as well, but rather as a sign of an impending rise in audit effort by the tax authorities. The same results also hold for parameterizations with weighted behavior of agents, e.g. p (ε, x) = zεx + (1 − z) ε (1 − x), with 0 ≤ z ≤ 1, depending on which behavior gets the larger weight.

3.2. Optimal regulation of tax evasion. We now suppose that the regulator wants to optimally control the policy instruments with respect to the behavior of agents as described by the replicator dynamics equation. We consider only one policy instrument, namely effort, ε, as it generates tractable results. Recall that the regulator wants to set a goal for the primary deficit equation, i.e.
More specifically, we will assume that her objective is to minimize the squared deviations from a target primary deficit, D̄, plus any costs this minimization incurs. Moreover, we consider this to be an infinite horizon problem 7 , so that it can be rendered autonomous and the time dimension enters only through the discount term, r. Appendix C provides a more detailed view of the optimal control problem.
The problem faced by the regulator will be: subject to the replicator dynamics equation: and the bounds for the control and state variable: The Lagrange equation of the problem comprises the current value Hamiltonian and the constraints for the control and state variables: where λ, w 1 , w 2 , µ 1 , µ 2 are the costate variables.
Notice that if the constraints for the control and state variables are to hold, i.e. in order to acquire an interior solution, we end up solving the Hamiltonian system of the problem: ẋ = g (x, ε * ) , and a steady state (x s , λ s ), if it exists, will satisfy: In order to describe the dynamical behavior of the steady states of the form (x s , λ s ), for which ε * = E (x s , λ s ; t, θ) is the solution of (19), we study the linear differential equation system that approximates (20) and (21) at x s , λ s .
In our setup, we use the following simplifying assumptions in order to obtain analytical results:
• p (ε, x) = εx, since this type of reaction, as shown above, exhibits satisfactory dynamical behavior. Note that the case where p (ε, x) = zε + (1 − z) x, with 0 ≤ z ≤ 1, also provides tractable results.
• U (w) = log (w), as a special case. This is only assumed for tractability reasons. The general case of the isoelastic function also provides tractable but more complex results.
• c (ε) = cε, a linear cost function. The case where c (ε) = (1/2) cε² is indisputably a better choice, but analytical solutions could not be attained.
• φ (ε) = ε, which means that the ratio of the population to be audited is proportional to the effort put into audit. This can be scaled by any constant, including the cost constant c, or take other plausible forms without affecting the general behavior of the system. The best φ (ε) formulation would be a sigmoid type as in φ (ε) = 1 − exp (−ε), but this renders the model unsolvable due to high non-linearity.
• D̄ = 0 without any loss of generality, i.e. setting a balanced budget goal from which to minimize deviations.
After the substitutions, the solution for the optimal ε * will be: where
The intuition behind ε * is not easy to describe, so we split it into parts to which an economic interpretation can be given:
• A, for which it must hold that A ≠ 0, denotes the net effective effort revenues, i.e. benefit of effort minus cost of effort. Notice that in A, the term xt (1 + θ) ∆Y is the tax revenue from the audited evaders, whereas c is the cost of that audit.
• B is the certain primary deficit. Certain in the sense that without any auditing at all, i.e. φ (ε) = 0, the tax revenues would only come from the compliers, namely (1 − x) tY, and from the underreporting evaders, i.e. xtY R . Subtracting these two tax revenues from the social spending G gives the certain primary deficit when no audits occur.
• D is a little more complicated. It can be said to describe the shadow price of remaining unaudited, measured in terms of a replicator dynamics equation inside the evaders' strategy subpopulation.
Substituting ε * from (22) into the Hamiltonian system (49) we can investigate the dynamical behavior of the resulting steady states. To do so, we must solve the system linearized around the steady state. The linearization matrix around any emerging steady state (x s , λ s ), i.e. the Jacobian of the Hamiltonian system, will be of the form: The dynamical behavior of each steady state depends on the properties of the characteristic roots (eigenvalues) of J (xs,λs) , denoted by r 1 and r 2 , for which it holds that: tr (J) = r 1 + r 2 and |J| = r 1 r 2 .
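The trace and determinant relations above are enough to classify a steady state of a two-dimensional system without computing the eigenvalues explicitly. The following is a generic sketch of that classification; the function name and the example matrices are hypothetical and are not the Jacobian of the paper's Hamiltonian system.

```python
def classify_steady_state(j11, j12, j21, j22):
    """Classify a 2x2 linearization using tr(J) = r1 + r2 and det(J) = r1 * r2."""
    tr = j11 + j22
    det = j11 * j22 - j12 * j21
    disc = tr * tr - 4.0 * det          # sign decides real versus complex roots
    if det < 0:
        return "saddle point"           # real roots of opposite sign
    if det == 0 or tr == 0:
        return "degenerate (linearization inconclusive)"
    kind = "node" if disc >= 0 else "focus"
    stability = "unstable" if tr > 0 else "stable"
    return f"{stability} {kind}"


# Hypothetical example matrices, chosen only to exercise each branch.
print(classify_steady_state(1.0, 0.0, 0.0, -1.0))   # saddle point
print(classify_steady_state(2.0, 0.0, 0.0, 1.0))    # unstable node
print(classify_steady_state(1.0, -2.0, 2.0, 1.0))   # unstable focus
```

The three labels printed are the same types that appear in the stability verdicts reported for the numerical cases (saddle point, unstable node, unstable focus).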
3.3. Numerical approximation. Both the Hamiltonian system in (19) and its linearization matrix become too complicated to present and offer little intuition. We therefore assign plausible values to all parameters in order to get an idea of how the system works.
The values used for parameterization attempt to simulate real relative magnitudes of data, some of which are normalized to unity to facilitate calculations. This follows from the fact that the whole problem, including the replicator dynamics equation, the net expected tax revenues equation and the objective function, does not take into account the number of agents in the population, but rather a representative agent or, more rigorously, her mixed strategy of whether to evade or not. Correspondingly, the level of public spending, G, is to be read in per-capita terms. Thus, global parameters will include: Remember that r is the rate of time preference, whereas ρ is the rate of risk aversion, if the isoelastic function is used. However, we will use the logarithmic utility case for ρ = 1 8 . Tax evasion, if exhibited, is considered to be around 10% of total income in order to conform with data on petty crimes. Finally, we assume the regulator aims for a balanced budget, setting D̄ = 0. Before proceeding, it is worth noting that the results were very sensitive to the value of the cost of effort, c, which acts as a free parameter. It plays a crucial role in the numerical approximation, since it has to be set at a level that makes the regulation and the effort put into audits worthwhile, in the sense that the benefits of audits, i.e. the tax revenues, should always exceed the costs they imply. Since the cost of effort is linear, c (ε) = cε, the constant c can be regarded as a measure of the constant marginal (and average) cost of effort.
1. Benchmark case: G = 0.35, t = 0.1, θ = 1, c = 0.014
We set a plausible value G = 0.35, meaning that per-capita public spending is 35% of individual income, Y = 1. The per-capita public spending rate follows data concerning most OECD countries. The tax rate is 10% of income, following data on direct taxes as a percentage of GDP (see [8]), i.e. personal income taxes 9 . The penalty is set to 100%, which is a typical penalty rate conditional on a 10% income evasion, without any loss of generality.
Optimization yields:
Steady State    Optimal Effort    Verdict
The interior steady state of x † = 0.698, i.e. 69.8% evasion, is attained with an optimal level of effort of ε * = 0.711, i.e. 71.1% of the spending on tax authorities is channeled towards audits. Both corner solutions x * 1 = 0 and x * 2 = 1 activate the constraint on the effort level, setting it equal to zero, namely no effort will be put into audits at all when the system settles at the corner states. Remember that the planner's goal is to minimize the squared deviations from a balanced budget, so the effort exerted at every point is optimal as far as spending is concerned. The intuition is that for both full evasion and full compliance the regulator understands that the best response is to stop funding the audit mechanism, since any effort will affect neither tax revenues nor the level of evasion, at least locally around the critical point attained by the economy.
Stability analysis implies that the steady state at x * 1 = 0 behaves locally as an unstable node, whereas both the steady state at x * 2 = 1 and the interior steady state x † = 0.698 behave as saddle points 10 .
2. G = 0.35, t = 0.11, θ = 1, c = 0.014
We change the tax rate to 11% in order to investigate the sensitivity to an increase in t. The new optimization yields:

Steady State    Optimal Effort      Verdict
x * 1 = 0       ε * = −88.571       constraint active: ε * = 0
x * 2 = 1       ε * = −187.375      constraint active: ε * = 0
x † = 0.635     ε * = 0.781         accepted

A slight increase in the tax rate leads to a fall in the level of evasion, as expected, and to a slight increase in the effort put into audits. The latter can be attributed to the fact that an increased tax rate not only induces a cost, through the increase of the evaded tax when the agent is a non-complier, but also increases the penalty earned by the tax authorities if the agent is caught. This means that the increased tax rate has an ambiguous effect on optimal effort through two conflicting channels. The results here imply that the revenues from the increased tax rate make the rise in effort worthwhile. The stability properties of the steady states are the same as before.

9 In general, GDP per capita is different from income per capita and should not be used as a proxy. However, data show that the ratio of income over GDP is very close to unity for the majority of OECD countries. The mean tax rate as a percentage of GDP is around 9% and since income/GDP is around 0.9 we settled on a corrected tax rate of 10%.
10 As far as the corner solutions are concerned, their dynamical behavior remains the same in similar parameter sets, i.e. x * 1 = 0 always behaves as an unstable node and x * 2 = 1 always behaves as a saddle point. The interior steady state behaves as a saddle point as well, with some exceptions, in which it behaves as an unstable focus. See Appendix A for details.
3. G = 0.35, t = 0.1, θ = 1.1, c = 0.014
We revert the tax rate to 10%, as in the benchmark case 1, and increase the penalty from 100% to 110% in order to test the sensitivity to θ. The results are:

Steady State    Optimal Effort      Verdict
x * 1 = 0       ε * = −88.571       constraint active: ε * = 0
x * 2 = 1       ε * = −1010         constraint active: ε * = 0
x † = 0.665     ε * = 0.710         accepted

An increase in the penalty rate leads to a fall in the level of evasion and in the level of effort put into audits as compared to the benchmark case. Both effects are expected and intuitively appealing, i.e. an increase in the penalty has no ambiguous effects on the preferences of the agent. It increases her cost of being or becoming a non-complier, thus leading to a fall in the level of evasion, and it also allows for a costless readjustment for the tax authorities, which manage to reduce evasion with virtually no change in the level of effort. This result follows by construction from the fact that the penalty rate affects only the evaders' utility and the revenue is earned only if the agent is an evader and gets caught. The stability properties of the steady states remain the same as in all of the above cases.
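The "Verdict" column in the tables above can be read as a simple bound check on the unconstrained optimum. The sketch below assumes that an out-of-range value is replaced by the nearest bound of [0, 1]; the function name is hypothetical, and the sample values are those reported for the t = 0.11 case.

```python
def effort_verdict(eps_unconstrained, lower=0.0, upper=1.0):
    """Clip an unconstrained optimum to [lower, upper] and report whether a bound binds."""
    if eps_unconstrained < lower:
        return lower, "constraint active"
    if eps_unconstrained > upper:
        return upper, "constraint active"
    return eps_unconstrained, "accepted"


# Unconstrained values reported for G = 0.35, t = 0.11, theta = 1, c = 0.014.
rows = [("x*_1 = 0", -88.571), ("x*_2 = 1", -187.375), ("interior", 0.781)]
for label, eps in rows:
    print(label, effort_verdict(eps))
```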
In all three cases, the corner solutions activated the control variable constraints. In the full compliance case, where x * 1 = 0, the optimal effort fluctuated around ε * = −88, triggering the constraint which set it back to ε * = 0. This means that when the system finds itself in an environment with no evasion, the tax authorities need not allocate any funds to control tax evasion, since it is not present at all. The same holds for the case of full non-compliance, x * 2 = 1, which again triggers the constraint ε * = 0. This in turn means that in a full evasion scenario the tax authorities see that, since the whole population evades, the optimal policy is to earn the minimum tax revenues, because no level of effort will be cost effective 11 . Both an increase in the tax rate and an increase in the penalty rate lowered the ratio of evaders in the population, whereas only the tax rate seems to have an ambiguous effect on the optimal effort level needed to attain the interior steady state. Note that respective reductions in the tax rate and the penalty rate produce the expected converse results, which can be found in more detail in Appendix A.
4. Rational case and comparison. In this section we compare the results obtained in our model with a directly analogous model of the rational case, i.e. where the taxpayer acts as a rational optimizer. Following this approach imposes some changes on the model, making it follow the methodology of the original articles of [1] and [25]. In these papers, there is a representative taxpayer who chooses the level of reported income, Y R , that maximizes her expected utility: Notice that this is exactly the same as in (5), without any need to distinguish the taxpayers between compliers and non-compliers, since the outcome of the maximization process itself will characterize the taxpayer.
The theoretical properties of this optimization process are fully described in the aforementioned articles, so we shall not go through them again. However, since in our case there is also a second step, i.e. the optimal regulation, we describe the alterations needed to produce an analogous setting before we proceed with the comparison. First, we no longer need an evolutionary context for the behavior of agents, since we have a rational representative agent rather than an imitating one. This implies that we have to drop the concept of the shares of non-compliers and compliers, i.e. x and (1 − x). In turn, this changes the way the expected tax revenues are formed and, by extension, the way the optimal regulation works. More specifically, remember that the expected tax revenues in the original model were described by equation (12): In the rational case, the regulator knows that all agents will have the same reported income, Y R , and thus it all comes down to how much she is able to audit. The level of audit performed is described by φ (ε), and its cost by c (ε). To derive the expected tax revenues for the rational case we construct a new table, accounting for the fact that the reported income resulting from the maximization of (23) determines whether the taxpayer is an evader (Y R < Y ) or a complier (Y R = Y ): Note that ∆Y = (Y − Y R ), φ ≡ φ (ε) and p ≡ p (ε), with exactly the same properties as before. The tax revenues are derived by adding the elements of each row: The tax revenues for the complier case simplify to just T e = tY − c (ε), rendering the case of compliance almost trivial, because the cost of effort c (ε) is not a necessary expenditure for the regulator when everyone fully pays their tax obligations.
On the other hand, the tax revenues for any level of reported income below Y simplify to T e = tY R + φ (t∆Y + θt∆Y ) − c (ε), which is the general formula for the tax revenues in the rational case.
Second, we need to find the optimal level of reported income Y R . In order to produce a closed form solution, we impose the logarithmic utility function, U (w) = log (w), without any loss of generality. The first order condition for a maximum will produce a Y * R such that: The closed form solution for Y * R turns out to be: It is obvious that Y * R can also be negative for certain parameter values. The conditions for a positive interior solution are thoroughly described in [1], and will not be covered here.
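As a hedged reconstruction of the step above, the first order condition can be worked out under the assumption that an unaudited taxpayer keeps Y − tY R , while an audited one additionally pays the evaded tax plus the penalty on it, (1 + θ) t∆Y, which is the payment implied by the revenue formula T e. The wealth expressions are an assumption of this sketch and may differ in form from the authors' equation (23).

```latex
\begin{align*}
\max_{Y_R}\; & (1-p)\,\log\!\left(Y - tY_R\right)
  + p\,\log\!\left(Y - tY_R - (1+\theta)\,t\,(Y - Y_R)\right) \\
\text{FOC:}\quad & -\frac{(1-p)\,t}{Y - tY_R}
  + \frac{p\,\theta t}{(1 - t - \theta t)\,Y + \theta t\,Y_R} = 0 \\
\Rightarrow\quad & Y_R^{*} =
  \frac{\left[\,p\,\theta - (1-p)\,(1 - t - \theta t)\,\right] Y}{\theta t}
\end{align*}
```

With Y = 1, t = 0.1 and θ = 1 this gives Y * R = −8 + 18p, which agrees with the benchmark expression reported in the numerical approximation of the rational case, and it shows directly that Y * R is negative whenever p is small enough.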
Having set the framework for the rational case, we can now proceed to the regulation. The regulator takes Y * R as given and acts as a Stackelberg leader, choosing the optimal level of effort ε that minimizes the squared deviation from a target primary deficit, as seen before. The major difference lies in the fact that in the evolutionary case, i.e. when the agents were imitators, the level of reported income Y R was given and the regulator controlled compliance in the population through the effort put into audits. In the rational case, we have a variable reported income that changes with perceived effort, making all taxpayers act as one. Therefore, the problem becomes static, and for each level of the fixed instruments, i.e. the tax rate t and the penalty rate θ, the regulator finds the optimal effort that will drive the system to a level of controlled evasion, that is, a specific level of Y * R . The objective function of the regulator remains the same as far as the target is concerned; only the nature of the constraints changes. Note that D (ε) = G − T e is the primary deficit, incorporating the newly formulated tax revenues as in (24): The problem is solved by simple substitution of the constraint into the objective function. An analytical solution is not attainable, because the effort level appears in c (ε). Only numerical solutions can be obtained, given appropriate parameter sets.

Numerical approximation. In order to make the comparison as direct as possible, we impose the same global parameter values as in the benchmark case of the original model: Notice that we no longer parameterize Y R , as its value will result from the optimization of the taxpayer.
1. Benchmark case: G = 0.35, t = 0.1, θ = 1, c = 0.014
In the benchmark case, using the same values as in the benchmark for the evolutionary case, we get a reported income Y * R = −8 + 18p (ε), which of course is a function of the effort level, ε. Solving for the optimal ε, we get ε * = 0.490 as the real root, along with two complex roots. This level of effort is admissible since it lies within [0, 1]. Substituting ε * back into Y * R , we get Y * R = 0.828, i.e. ceteris paribus, the taxpayers will report 82.8% of their income and the authorities will channel 49% of the available resources of the Ministry of Finance towards audits.
2. G = 0.35, t = 0.11, θ = 1, c = 0.014
Following the same practice as before, we raise the tax rate by 1 percentage point. The optimal reported income of the taxpayers, after they notice and adapt to this change, becomes Y * R = −7.09 + 16.181p (ε). The real root for the optimal ε, in turn, becomes ε * = 0.490, staying at almost the same level. Substituting it back into the reported income equation, we get Y * R = 0.839, which is slightly greater than Y * R in the benchmark case. This means that a rise in the tax rate causes a rise in reported income and a trivial fall in the effort put in by the tax authorities. In the evolutionary case, this change led to a higher effort level of the authorities and a smaller ratio of evaders. One can say that agents who are allowed to imitate, rather than behave as rational criminals, show a greater propensity to comply with the new regime, whereas the regulator is heavily penalized when facing imitators, as shown by the increase in effort in the evolutionary case as opposed to the decrease in the rational case.
3. G = 0.35, t = 0.1, θ = 1.1, c = 0.014
We set the tax rate back to the benchmark level and increase the penalty from 100% to 110%. The optimal reported income becomes Y * R = −7.181 + 17.181p (ε). The real root for the optimal ε becomes ε * = 0.466, exhibiting a notable fall. Substituting it back into the reported income equation, we get Y * R = 0.836, which is greater than the reported income in the benchmark case. In the evolutionary case, an increase in the penalty caused a fall in both effort and evasion, confirming the monotonic effect of a change in the penalty rate in contrast to the ambiguity of a change in the tax rate.
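The three reported-income expressions above can be cross-checked against a single closed form. The sketch below assumes logarithmic utility and that an audited taxpayer pays the evaded tax plus the penalty θ on it; under those assumptions the first order condition gives Y * R = [pθ − (1 − p)(1 − t − θt)] Y / (θt). The function names are hypothetical.

```python
def reported_income(p, t, theta, Y=1.0):
    """Optimal reported income under log utility (assumed closed form, see lead-in)."""
    return (p * theta - (1.0 - p) * (1.0 - t - theta * t)) * Y / (theta * t)


def linear_form(t, theta, Y=1.0):
    """Intercept a and slope b of the closed form written as a + b * p."""
    a = reported_income(0.0, t, theta, Y)
    b = reported_income(1.0, t, theta, Y) - a
    return a, b


# Parameter sets of the three numerical cases: (t, theta).
for t, theta in [(0.1, 1.0), (0.11, 1.0), (0.1, 1.1)]:
    a, b = linear_form(t, theta)
    print(f"t={t}, theta={theta}: Y_R* = {a:.3f} + {b:.3f} p")
```

The intercepts and slopes produced this way match the coefficients quoted in the three cases up to rounding or truncation of the last digit.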

5. Conclusion and further research. In this paper we sought to merge classical and contemporary approaches into an evolutionary game theoretical setting that simultaneously incorporates imitation among agents, an endogenized probability of audit, and regulation with either non-optimizing (myopic) behavior of the regulator or optimizing behavior with respect to the boundedly rational behavior of imitating agents. The resulting model has a highly non-linear structure, which precludes analytical results. Nevertheless, the main results prove to be particularly descriptive as far as the behavior of the agents is concerned. We have seen that if the agents are imitators and their choices over the two available strategies (evasion or compliance with the tax rule) are governed by the replicator dynamics equation, different types of dynamics arise depending on the way the regulator acts. These dynamics lead to either monomorphic or polymorphic evasion or compliance in the population of taxpayers, an outcome which the regulator wishes to influence by means of the policy instruments at her disposal, i.e. the tax rate, the penalty rate and the effort put into auditing. We have seen two ways in which the regulator can intervene. The first was a direct interference with the replicator dynamics equation, the results of which were highly dependent on the form of the subjective probability of audit and led to polymorphic steady states, whereas the second involved an optimal control of the effort put into audit subject to the constraint that the agents are imitators. The optimal control problem led to a more complex result which, when approached numerically, behaved in an intuitively appealing way: the system exhibited multiple equilibria, of which the interior steady state, where a fraction of the population evades, behaved as a saddle point.
The dynamics of the system have shown very intense flows over the manifolds of each steady state. Nevertheless, saddle path stability is achieved for both the interior steady state and for the state of full evasion.
It is interesting to note, when comparing the case of imitating agents with that of optimizing agents, that when agents are imitators a proportion of them complies, around 30 percent in our benchmark numerical example, while 70 percent evades. On the other hand, when agents are fully rational in choosing reported income, everybody evades in the corresponding numerical example. Thus imitating behavior seems to support the idea of polymorphic behavior regarding compliance and evasion.
Further research and refinement of such models is necessary in order to overcome the problems of high nonlinearity, so that the optimal control problem can also cover the optimal choice of the tax rate and penalty level, which exhibited more complex behavior and did not yield analytical solutions. Another issue that could be studied further is the potential inclusion of a public good financed by the public spending G(t) which, along with tax collections, determines the primary deficit. In our model individuals were assumed not to consider the impact of their tax behavior on this public spending, so they perceived G as exogenous with respect to their decision to evade or not. In an extended model, G could enter individual utilities as a function of the target deficit and expected tax collections. Since expected tax collection depends on x, i.e. the share of the population that evades, this extension will increase the complexity of the replicator dynamics equation. Determining the final outcome requires further research. Our intuition is that the dependence of individual utilities on expected tax collections could increase the number of polymorphic steady states, by increasing the nonlinearities of the replicator dynamics. Further research could also address the case where imitating agents evade by considering more than one level of reported income. This will increase the dimensionality of the replicator dynamics system, but could provide more insight into a case where we want to distinguish between large and small tax evasion. Finally, another possible extension of the model is to study localized interaction regarding the decision to evade or not, that is, agents imitating their neighbors, where the concept of neighborhood can be defined in terms of geographical or socioeconomic characteristics.
Appendix B. Replicator dynamics equation. In order to derive the replicator dynamics equation, let x(t) = (x N C (t) , x C (t)) be the state vector denoting the share of the non-complying and complying agents in the population, respectively, at time t. As previously shown, if an agent follows the complying strategy, she will have a payoff of U C . On the other hand, the non-complying agent will have a payoff as shown in (3), i.e. (1 − p) U N U + pU N A . Thus, using the fact that x N C (t) + x C (t) = 1, for each t, the average expected payoff in terms of utility EU , of the population will be as follows: Consider now the evolutionary game that the agents play, where each one follows one of two pure strategies s i for i = N C, C, for "not comply" and "comply" respectively. The game is repeated in periods t = 1, 2, .... Since x t i is the fraction of players in the population playing s i in period t, and the payoff to s i is EU t i = EU i (x t ), where x t = (x t N C , x t C ), we look at a given time t, and we rank the strategies so that, e.g. EU t N C ≤ EU t C . Suppose that in every time period dt, each agent following a certain strategy N C or C learns the payoff, and consequently the strategy, of another randomly chosen agent, i.e. her pair in the game, and with a probability αdt > 0 changes her strategy to her pair's strategy if she perceives that the other strategy's payoff is higher. The larger the difference in the expected payoffs, the higher the probability that the agent will perceive it and change to the more profitable strategy. In this context we will regard strategy choices to be a result of imitation, in the sense that a specific strategy that leads to a payoff higher than the alternative will be imitated with a probability proportional to the payoff difference; see Schlag (1998, 1999). Thus, the probability p t N C,C that an agent playing strategy N C will shift to the strategy C is given by: where β is sufficiently small that p t N C,C ≤ 1.
Then, the expected fraction of the population following the non-complying strategy N C, in period t + dt is defined as: where the second term in (26) reflects the expected proportion of the population that switches to the non-complying strategy. Using the definition of the average payoff as given in (25) and the fact that x t C = 1 − x t N C , we obtain that: For a sufficiently large population, one can replace Ex t+dt N C with x t+dt N C . Furthermore, subtracting x t N C from both sides of (27), dividing by dt, taking the limit as dt → 0, and setting without loss of generality αβ = 1, we get the replicator dynamics: The replicator dynamics equation describes the evolution of the share of the non-complying agents, and therefore the frequency of the evading strategy N C, in the population. More specifically, it states that its frequency increases exactly when the non-complying strategy has above average payoff and vice versa. Using the payoff definitions in (25), dropping t and renaming x t N C into just x for the sake of notational simplicity, we can rewrite the replicator dynamics equation as: In (29) we clearly notice that the ratio of non-compliers increases as long as the non-compliers' payoff is greater than the compliers' payoff in the population. Nonetheless, it is important to note that the replicator dynamics equation does not describe a best-reply dynamic; that is, the agents do not adopt a best reply to the overall frequency distribution of strategies in the previous period. On the contrary, agents have a localized and limited knowledge regarding the system as a whole and are thus considered to be "boundedly rational" as far as the distribution of information is concerned.
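The limiting argument above can be illustrated by iterating the expected-share update with a small step dt and αβ = 1, which amounts to an Euler discretization of the replicator dynamics. The sketch assumes p (ε, x) = εx, logarithmic utility, and payoffs in which an audited evader pays the evaded tax plus a penalty θ on it; the parameter values Y = 1, Y R = 0.9, t = 0.1, θ = 1 and ε = 0.8 are illustrative assumptions.

```python
import math

# Assumed payoffs and illustrative parameter values (not taken from the paper's tables).
T, THETA, Y, Y_R = 0.1, 1.0, 1.0, 0.9
U_C = math.log((1.0 - T) * Y)                                  # complier
U_NU = math.log(Y - T * Y_R)                                   # evader, not audited
U_NA = math.log(Y - T * Y_R - (1.0 + THETA) * T * (Y - Y_R))   # evader, audited


def expected_payoffs(x, eps):
    """Expected utility of not complying and of complying when p = eps * x."""
    p = eps * x
    return (1.0 - p) * U_NU + p * U_NA, U_C


def step(x, eps, dt, alpha_beta=1.0):
    """One imitation round: share of evaders moves with the payoff difference."""
    eu_nc, eu_c = expected_payoffs(x, eps)
    return x + alpha_beta * dt * x * (1.0 - x) * (eu_nc - eu_c)


def simulate(x0, eps, dt=1.0, steps=10_000):
    """Iterate the update from an initial share x0 of non-compliers."""
    x = x0
    for _ in range(steps):
        x = step(x, eps, dt)
    return x


print(simulate(0.05, 0.8), simulate(0.95, 0.8))
```

Starting from either a small or a large initial share of evaders, the iteration settles on the same interior share, which is the behavior expected of a stable polymorphic steady state.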