Mean-field limit of a spatially-extended FitzHugh-Nagumo neural network

We consider a spatially-extended model for a network of interacting FitzHugh-Nagumo neurons without noise, and rigorously establish its mean-field limit towards a nonlocal kinetic equation as the number of neurons goes to infinity. Our approach is based on deterministic methods, namely on the stability of the solutions of the kinetic equation with respect to their initial data. The main difficulty lies in adapting to a deterministic framework arguments previously introduced for the mean-field limit of stochastic systems of interacting particles with a certain class of locally Lipschitz continuous interaction kernels. This result establishes a rigorous link between the microscopic and mesoscopic scales of observation of the network, and can further serve as an intermediate step to derive macroscopic models. We also propose a numerical scheme for the discretization of the solutions of the kinetic model, based on a particle method, in order to study the dynamics of its solutions and to compare them with the microscopic model.

1 Introduction

The FitzHugh-Nagumo (FHN) model [21,30] focuses on the evolution of the membrane electrical potential of a nerve cell depending on the input it receives. Such variations depend on the ion exchanges between the neuron and its environment through synapses, which were precisely described by the Hodgkin-Huxley model [27]. The FHN model was developed as a simplification of the Hodgkin-Huxley model. Rather than precisely describing all the variations of activating and inhibiting ion channels, the FHN model gathers all the fast excitable dynamics in the variable v, still interpreted as the electrical membrane potential, and the slow refractory dynamics in a newly introduced adaptation variable w. It can be written as follows for t > 0: where I_ext stands for the input current the neuron receives from its environment, τ ≥ 0, a ∈ R and b ≥ 0 are given constants, and N(v) is a nonlinearity which models the cell excitability. A typical choice for the nonlinearity N (see for instance [2,5,29,30]) is the cubic function for any α, β > 0.

Since the input I_ext is the result of the interactions between the nerve cell and its neighbours, we can replace it with an interaction term with other similar neurons, so as to consider an individual-based model for a network of n interacting neurons, where n ∈ N. Biological observations seem to exclude the case of homogeneous interactions [32,40], and show that the interactions inside a network of neurons are spatially structured [7]. Hence, as in [7,28], we choose to modulate the neural connectivity with a spatial weight. Thus, we consider the following spatially-extended FHN system for n interacting neurons for t > 0: where each neuron within the network is labelled by its index i ∈ {1, ..., n}, x_i ∈ R^d with d ∈ {1, 2, 3} is a parameter which stands for the position of neuron i, and (v_i, w_i) ∈ R^2 is the pair of membrane potential and adaptation variable of neuron i.
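For reference, one standard way of writing the FHN dynamics with the constants above is the following; this is a sketch consistent with the terms N(v) − w and A(v, w) appearing in the kinetic equation below, and the exact sign conventions of (1) may differ:

```latex
\left\{
\begin{aligned}
\frac{\mathrm{d}v}{\mathrm{d}t} &= N(v) - w + I_{\mathrm{ext}}, \\
\frac{\mathrm{d}w}{\mathrm{d}t} &= A(v,w) := \tau\,\bigl(v + a - b\,w\bigr),
\end{aligned}
\right.
\qquad t > 0.
```

Here the fast variable v is driven by the cubic nonlinearity N, while the slow variable w relaxes on the time scale set by τ.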
Here, the connectivity weight Ψ : R^{2d} → R models the effect of spatial dependence on the strength of neuronal interactions, which we assume to be of electric type. More precisely, for every neural cell i, each other neuron j contributes to the input current received by neuron i, and we write this interaction using Ohm's law via Ψ(x_i, x_j)(v_i − v_j), which is then summed over the network. In doing so, we assume that the conductance from neuron j to neuron i, denoted Ψ(x_i, x_j), is modulated by the positions of the two neurons i and j. Moreover, the scaling factor 1/n is introduced in order to renormalize the contributions of the neurons. From a biological viewpoint, since a neuron interacts with its closest neighbours, a reasonable choice for Ψ(x_i, ·) would be the indicator function of a small ball centered at x_i. However, the framework considered in this paper only admits bounded Lipschitz continuous connectivity kernels.
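To fix ideas, here is a minimal numerical sketch of the right-hand side of such a network. The kernel, nonlinearity and parameter values are hypothetical, and the slow dynamics τ(v + a − b w) and the sign of the coupling are assumptions consistent with the discussion above, not the exact system (3):

```python
import numpy as np

def fhn_network_rhs(v, w, x, psi, N, tau=0.1, a=0.7, b=0.8):
    """Right-hand side of a spatially-extended FHN network of n neurons.

    psi(x_i, x_j) plays the role of the connectivity weight Psi, and the
    factor 1/n renormalizes the sum of the pairwise contributions.
    """
    n = len(v)
    P = psi(x[:, None], x[None, :])          # pairwise conductances, shape (n, n)
    # Ohmic coupling: (1/n) * sum_j Psi(x_i, x_j) * (v_j - v_i)
    coupling = (P * (v[None, :] - v[:, None])).sum(axis=1) / n
    dv = N(v) - w + coupling                 # fast excitable dynamics
    dw = tau * (v + a - b * w)               # slow adaptation dynamics
    return dv, dw
```

Note that when all potentials coincide, the coupling term vanishes, as expected from its Ohmic form.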
The main purpose of this article is to rigorously derive a kinetic mean-field model from the FHN system (3) as the number of neurons n goes to infinity. The classical method to derive a mean-field description from an individual-based model is to study the time evolution, at each location, of the probability of finding neurons characterized by a certain pair of potential and adaptation variables. The idea of this approach is to work in a set of probability measures independent of the number of neurons. Thus, we recall the following notion of empirical measure.

Definition 1.1 (Empirical measure). To each n-tuple X_n = (x_i)_{1≤i≤n} ∈ (R^d)^n, V_n(t) = (v_i(t))_{1≤i≤n} ∈ R^n and W_n(t) = (w_i(t))_{1≤i≤n} ∈ R^n for t > 0, one associates its empirical measure:

f_n(t, dx, dv, dw) := (1/n) Σ_{j=1}^{n} δ_{(x_j, v_j(t), w_j(t))}(dx, dv, dw).
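Integrating a test function against the empirical measure of Definition 1.1 reduces to an average over the particles; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def empirical_integral(phi, x, v, w):
    """Integral of phi against f_n = (1/n) sum_j delta_{(x_j, v_j, w_j)}.

    Since every atom carries mass 1/n, the integral is the plain average
    of phi evaluated at the particles.
    """
    return float(np.mean(phi(x, v, w)))
```

In particular, integrating the constant function 1 returns 1: an empirical measure is a probability measure.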
Let us highlight that an empirical measure is a probability measure on R^{d+2}. As the number of neurons n goes to infinity, the interaction term in (3) is expected to formally satisfy: where f(t, dx, dv, dw) is the probability of finding neurons in an elementary volume of size dx, with a membrane potential and an adaptation variable in elementary intervals of respective lengths dv and dw, at time t ≥ 0 within the cortex. Our purpose is thus to prove the convergence of the empirical measures associated to the solutions of the FHN system (3) towards such a probability measure f satisfying the following nonlocal kinetic equation: The term −∂_v(f K[f]) describes nonlocal interactions through the whole network, and ∂_v(f(N(v) − w)) + ∂_w(f A(v, w)) accounts for the local reaction due to the excitability of nerve cells. Since this PDE can be written in divergence form, we directly have the conservation of mass, which leads us to complement (5) with an initial condition: It is worth noticing that for any x ∈ R^d, t → (v(t), w(t)) is a solution to the FHN system (1) if and only if the function f : t → δ_{(x, v(t), w(t))} is a measure solution to the kinetic equation (5) (and then K[f] ≡ 0 in this case). Furthermore, we stress that for all n ∈ N and all n-tuples X_n = (x_i)_{1≤i≤n} ∈ (R^d)^n, V_n = (v_i)_{1≤i≤n} ∈ R^n and W_n = (w_i)_{1≤i≤n} ∈ R^n, (X_n, V_n, W_n) is a solution to the spatially-extended FHN system (3) if and only if the associated empirical measure f_n as defined in Definition 1.1 is a solution to the kinetic equation (5). The derivation of the kinetic equation (5) joins a large literature in mathematical neuroscience and kinetic theory. Indeed, the problem of the mean-field limit of systems of interacting particles towards kinetic models has been widely investigated, for various types of interactions.
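With the notation of Definition 1.1 and the sign convention of the interaction term written in the introduction, the formal limit of the interaction reads as follows (a sketch; the actual display may normalize K[f] differently):

```latex
\frac{1}{n}\sum_{j=1}^{n} \Psi(x_i,x_j)\,\bigl(v_i - v_j\bigr)
\;\xrightarrow[n\to+\infty]{}\;
K[f](t,x,v) := \int_{\mathbb{R}^{d+2}} \Psi(x,x')\,\bigl(v - v'\bigr)\,
f(t,\mathrm{d}x',\mathrm{d}v',\mathrm{d}w').
```

This is the substitution of the empirical measure by its limit f inside the interaction sum.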
We mention the work of Dobrushin [17], who introduced some classical methods to prove the mean-field limit of an individual-based model of interacting particles with a bounded globally Lipschitz continuous interaction kernel towards a Vlasov equation, using a stability result for the solutions of the kinetic model with respect to their initial data, in a suitable topology of probability measures. Such a stability result is called Dobrushin's estimate. See for instance [22] for a more recent review of this approach. Similar results have since been proved for a larger variety of interaction kernels. For example, in [25], the authors proved the mean-field limit towards a Vlasov equation in the case where the interaction kernel has a singularity. Besides, in [4], the authors introduced an extension of the classical mean-field theory from [17] which also works for a certain class of locally Lipschitz continuous interaction kernels.
Coming back to a neuroscience viewpoint, in [18,38], the authors analysed the mean-field limit of a network of neurons modeled by their firing rates, divided into a fixed number of populations, with random synaptic weights whose probability distributions only depend upon the population indexes. They also studied the influence of the noise on collective behaviors in the network. A particular case of firing rate models is the time-elapsed model, which focuses on the probability that each neuron releases an action potential depending on the delay since the last firing time (see for instance [12,13,14,33,34,35]). In this article, we use another approach, with the derivation of a kinetic equation from the FHN model, which does not abstract the emission of action potentials, following the example of the Hodgkin-Huxley model.
The mean-field limit of the stochastic spatially-extended FHN model for interacting neurons has already been studied in the case of bounded coupling instead of linear coupling, more specifically when the connectivity kernel is compactly supported in [28], and in the case when each cell interacts only with its p nearest neighbours in [28,31]. Other approaches are considered in the literature to adjust the interactions with quantities other than space. For example, in [2,5], the authors focused on the mean-field limit of a stochastic FHN model or Hodgkin-Huxley model of n neurons interacting through electro-chemical synapses towards a kinetic PDE similar to (5). In [29], the authors proved a similar result for a stochastic FHN model of a neural network with homogeneous conductance. We also mention works like [6,36], which prove the synchronization of mean-field models of neural networks of respectively Hodgkin-Huxley and FHN type. The specificity of the present article compared to this literature is that we consider spatially-weighted neural networks without noise.
The main contribution of this article is the rigorous justification of the kinetic model (5), which we considered in [16], as the mean-field limit of the FHN system (3). It provides a mesoscopic description of the neural network, which can be used as an intermediate step for the derivation of macroscopic models from the FHN system (3), as in [16]. Here, we work with initial data for (5) which are not compactly supported in general, but have finite exponential moments. Even though the main mathematical methods that we use were introduced in [4], to our knowledge they have not been applied to the FHN system in a deterministic and spatially structured framework. Furthermore, in order to display some numerical simulations of the kinetic model (5), we consider a numerical scheme for this model using a particle method, that is, we consider a set of particles approximating the solution of the characteristic system associated to (5) instead of the finite-size model (3). It appears that for certain sets of parameters, we observe some dynamics usually expected for macroscopic models. We also compare this scheme with the FHN system (3), to get numerical evidence of the relevance of the kinetic equation (5).
In our framework, the interaction kernel (x, y, v) → Ψ(x, y) v is locally Lipschitz continuous but not globally so, because of the factor v. Indeed, it is Lipschitz continuous on every bounded set of R^{d+1}, but its Lipschitz constant goes to infinity with the diameter of the set considered. Therefore, some higher-order terms arise in the computation of a Dobrushin estimate. Thus, we need additional assumptions to control these terms, and hence to obtain a similar stability estimate. In the spirit of [4], we circumvent this issue with an estimate of the exponential moments of the solution of the kinetic equation (5), and with an appropriate splitting of the set of integration in the nonlocal terms. We mention that in [10,23], the authors tackled a similar difficulty for some kinetic models of collective motion; they chose to work with compactly supported solutions specifically to overcome this problem, controlling the growth of the velocity support to construct measure solutions.
2 Main result

This section is devoted to the statement of our main result on the mean-field limit from the solution of the microscopic model (3) towards the solution of the kinetic equation (5) as the number of neurons n goes to infinity.
Before stating our main result, let us precisely define the notion of solution of the kinetic equation (5) we use in this article. In the following, we denote by P 2 (R d+2 ) the set of probability measures on R d+2 with finite moments of order 2.
Definition 2.1 (Measure solution of (5)). Consider an initial data f_0 ∈ P_2(R^{d+2}). Let T > 0. Then, f is said to be a measure solution of (5) with initial data f_0 if f ∈ C([0, T], P_2(R^{d+2})) and for all t ∈ [0, T], f(t) = Z_{f_0}(t, ·)#f_0, where # is our notation for the push-forward^1, and Z_{f_0} is defined for all t ∈ [0, T] and all z = (x, v, w) ∈ R^{d+2} by Z_{f_0}(t, z) := (x, V_{f_0}(t, z), W_{f_0}(t, z)), where (V_{f_0}, W_{f_0}) is a solution of the characteristic system associated to (5) for t > 0.

Remark 1. In the rest of this paper, for the sake of clarity, we will use the notation z = (x, v, w) ∈ R^{d+2} and z' = (x', v', w') ∈ R^{d+2}.
As one can expect from this notion of measure solution to (5), the regularity of the solution to the characteristic system (8) is crucial. Therefore, we need to clearly define our framework for the connectivity kernel Ψ and the nonlinear function N in order to prove, on the one hand, the existence and uniqueness of solutions to the FHN system (3) and the kinetic equation (5), and on the other hand, our mean-field limit result. In the following, in the spirit of [4,10], we choose a connectivity kernel Ψ : R^{2d} → R satisfying Ψ ∈ Lip_b(R^{2d}), where Lip_b(R^{2d}) is the set of bounded and globally Lipschitz continuous functions from R^{2d} to R. In the following, we denote by L > 0 the Lipschitz constant of the connectivity kernel Ψ, and by ‖Ψ‖_∞ := sup_{(x,y)} |Ψ(x, y)| its supremum.
^1 For every map h from R^{d+2} to itself and every probability measure µ, the notation ν = h#µ means that ν(B) = µ(h^{−1}(B)) for every measurable subset B of R^{d+2}.
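For an empirical measure, the push-forward of the footnote is concrete: apply h to each atom and transport its mass. A minimal sketch, with the set B given as an indicator function (names are illustrative):

```python
import numpy as np

def pushforward_mass(h, atoms, B):
    """Mass that nu = h#mu assigns to a set B, for mu an equal-weight
    empirical measure: nu(B) = mu(h^{-1}(B)), i.e. the fraction of
    atoms z such that h(z) lies in B."""
    return float(np.mean([1.0 if B(h(z)) else 0.0 for z in atoms]))
```

This makes the identity ν(B) = µ(h⁻¹(B)) a simple counting statement on the particles.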
Here, the Lipschitz continuity of the connectivity kernel is only a technical assumption. On the other hand, the choice of a bounded Ψ seems reasonable from a biological viewpoint, since currents transmitted through synapses are bounded. We also choose a locally Lipschitz continuous nonlinearity N : R → R such that there exist two constants κ, κ̄ > 0 satisfying: Then, it remains to choose a suitable topology on P_2(R^{d+2}) to state our mean-field limit result. The most convenient distance to describe the convergence of an empirical measure in such problems is the Wasserstein distance [4,10,22,23], recalled in the next definition.
In this paper, we choose to work with the Wasserstein distance of order 2, instead of order 1 as in [10,17,22], in order to deal with the nonlinearity N, which naturally brings out second-order moments in v and w in the computation of the stability estimate.
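On the real line, and for two empirical measures with the same number of atoms, d_2 has an explicit expression: the optimal coupling is monotone and matches sorted samples. A minimal one-dimensional sketch (unlike the distance on R^{d+2} used in the text):

```python
import numpy as np

def wasserstein2_1d(xs, ys):
    """2-Wasserstein distance between two equal-size empirical measures on R.

    In dimension one the optimal transport plan pairs the i-th smallest
    atom of xs with the i-th smallest atom of ys.
    """
    xs = np.sort(np.asarray(xs, dtype=float))
    ys = np.sort(np.asarray(ys, dtype=float))
    return float(np.sqrt(np.mean((xs - ys) ** 2)))
```

For example, translating a sample by a constant c moves it at distance exactly |c|.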
Then, as explained in the introduction, to circumvent the issues caused by the class of interaction kernels we consider, we make some assumptions on the initial data of the kinetic equation (5), even though they are not needed for the existence and uniqueness of a measure solution to (5). Thus, we consider an initial data f_0 ∈ P_2(R^{d+2}) satisfying (11) for some constant α_0 > 0, using the notation, for all (v, w) ∈ R^2: ⟨v, w⟩ := (1 + |v|^2 + |w|^2)^{1/2}.
Now, we have all the tools we need to state our main theorem.
Theorem 2.3. (i) Assume that f_{0,2} satisfies (11) for some constant α_0 > 0. Further assume that there exist two measure solutions f_1 and f_2 ∈ C([0, T], P_2(R^{d+2})) of the kinetic equation (5), with respective initial conditions f_{0,1} and f_{0,2}. Then there exist two positive constants C_T > 0 and K_T > 0 such that for all t ∈ [0, T], where β(t) := e^{−C_T t}.

(ii) For all n ∈ N, consider the initial data (x_i, v_{0,i}, w_{0,i})_{1≤i≤n} ∈ (R^{d+2})^n and its associated empirical measure f_{0,n}. Consider an initial data f_0 ∈ P_2(R^{d+2}).
Then, for all n ∈ N, there exists a unique solution of the FHN system (3), and for all t ∈ [0, T], we denote by f_n(t) its associated empirical measure. There also exists a unique measure solution f ∈ C([0, T], P_2(R^{d+2})) to (5) with initial data f_0.
Further assume that f_0 satisfies (11) for some positive constant α_0 > 0, and that d_2(f_{0,n}, f_0) → 0 as n → ∞. Then, we get: sup_{t∈[0,T]} d_2(f_n(t), f(t)) → 0 as n → ∞.

Remark 2. Let us discuss the dependence of the two constants C_T and K_T mentioned in Theorem 2.3 (i) with respect to the initial data. First of all, C_T and K_T depend on T, N, τ, a, b, and on the exponential moment of only one of the two initial data, say f_{0,2}. Actually, we need the assumption d_2(f_{0,1}, f_{0,2}) < 1 so that K_T does not depend on the Wasserstein distance between the two initial data.
The proof of Theorem 2.3 is postponed to Section 5. The first part (i) is the stability result of the measure solutions to (5) with respect to their initial data. Our approach follows the ideas from [4,10,17,22]. Indeed, the main difficulty comes from the interaction kernel of the form (x, x', v) → Ψ(x, x') v, which is only locally Lipschitz continuous.
As for the second part (ii), the existence and uniqueness of the solution of the FHN system (3) follow from the Cauchy-Lipschitz theorem, and the proof of the well-posedness of the kinetic equation (5) relies on the well-posedness of the characteristic system (8), using classical arguments as in [22]. Then, the mean-field limit from (3) towards (5) is simply a consequence of the first part (i).

Remark 3.
We can extend the existence and uniqueness result for (3) and the Definition 2.1 of measure solutions of (5) to the case where Ψ is only bounded and continuous with respect to its first variable, uniformly with respect to its second variable.
The rest of this paper is organized as follows. In Section 3, we prove an a priori estimate for the solution of the kinetic equation (5), which will be crucial for the proof of Theorem 2.3. Then, in Section 4, we present the proof of the existence and uniqueness of the measure solution of the kinetic equation (5). Furthermore, we prove our main mean-field limit result in Section 5, and present as an application a stability result regarding monokinetic solutions in Section 6. Finally, in Section 7, we provide a numerical scheme for the kinetic equation (5) based on a particle method, and we display some numerical simulations, to illustrate the results established in Sections 5 and 6, and to show that this numerical scheme accurately reproduces the behavior of a large neural network.
3 A priori estimate

The purpose of this section is to prove an a priori estimate of the exponential moments of a solution of the kinetic equation (5). But first, we start by proving a technical lemma which will be useful throughout the rest of the article. We will need to work with the characteristic system (8). In Section 4, we will prove that for all T > 0, the characteristics are well-defined in the Banach space E introduced in (17). This choice of Banach space is justified since, in general, the characteristics are not bounded, so we need to control some moments in v and w.
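In view of the quantities |·|/⟨v, w⟩ whose suprema are taken in the computations of Section 4, the space E of (17) can be thought of as the following weighted space (stated here as a plausible form, an assumption on our part):

```latex
E := \Bigl\{\, U \in C\bigl(\mathbb{R}^{d+2};\mathbb{R}\bigr) \;:\;
\|U\|_{E} := \sup_{z=(x,v,w)\in\mathbb{R}^{d+2}} \frac{|U(z)|}{\langle v,w\rangle} < +\infty \,\Bigr\},
\qquad \langle v,w\rangle = \bigl(1+|v|^2+|w|^2\bigr)^{1/2}.
```

Functions in E may grow linearly in (v, w), which is exactly the growth expected of the characteristics.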
Lemma 3.1. Let f_{0,1} and f_{0,2} ∈ P(R^{d+2}), and let U_1 and U_2 be the associated characteristics. Then, we have for all t ∈ [0, T] and all z = (x, v, w) ∈ R^{d+2}: where κ and κ̄ are the two constants defined in (10).

Proof. The proof is straightforward, and mainly relies on Young's inequality and the properties (10) satisfied by the nonlinearity N.
Now, let us prove the a priori estimate of the exponential moments.
Lemma 3.2. We consider a connectivity kernel Ψ satisfying (9) and a locally Lipschitz continuous nonlinearity N satisfying (10). Let p ≥ 1. Consider an initial data f_0 satisfying (11). Then, there exists a constant C_T^{f_0} > 0, which depends only on the parameters of the equation (5), on T and on the moments of f_0, such that the estimate (20) holds.

Proof. According to the inequality (18) from Lemma 3.1, we have the following estimate: First, we can easily estimate the first term T_l(t) by factoring out the quantity |⟨V_{f_0}(t, z), W_{f_0}(t, z)⟩|^2 as follows: Then, we treat the second term T_nl(t) using the moment estimate (11) satisfied by f_0, with Young's inequality, and then factoring out |⟨V_{f_0}(t, z), W_{f_0}(t, z)⟩|^2 as follows: Finally, we get that there exists a positive constant C_T^{f_0} > 0 such that: Consequently, to conclude the proof, we integrate this last inequality between 0 and t ∈ [0, T] to get the estimate (20).
4 Proof of the well-posedness of the kinetic equation (5)

This section is devoted to the proof of existence and uniqueness of a measure solution to the kinetic equation (5), in the sense of Definition 2.1. Let T > 0 be a fixed final time and f_0 ∈ P_2(R^{d+2}). First, we focus on the well-posedness of the characteristic system (8), and then we will conclude by defining the measure solution to (5) as the push-forward of the initial data by the solution of the characteristic system.

4.1 Proof of the well-posedness of the characteristic system (8)

As a preliminary step, we establish the existence and uniqueness of the solution of the characteristic system (8) in the space E defined in (17).
Proof. Since we cannot directly conclude with the Cauchy-Lipschitz theorem, because of the term resulting from the nonlocal interactions in (8), our approach is based on the construction of a Cauchy sequence (V_p, W_p)_{p∈N} in C([0, T], E) in order to circumvent this difficulty. Then, we will define the couple (V_{f_0}, W_{f_0}) as its limit as p tends to infinity.
Step 1: construction of the sequences. First, we prove the following lemma, which yields the existence and uniqueness of the solution of a system of equations approximating (8), in which we consider the contribution of the interactions as a source term.
The proof of Lemma 4.1 only relies on classical arguments, but for the sake of completeness, it is postponed to Appendix A. Then, by induction, Lemma 4.1 implies the existence and uniqueness of a sequence (V_p, W_p)_{p∈N} such that (V_0, W_0) := (0, 0), and such that (22) holds for all p ∈ N, all t ∈ [0, T] and all z = (x, v, w) ∈ R^{d+2}.

Step 2: Cauchy sequences. Now, we want to prove that for all t ∈ [0, T], {V_p(t)}_{p∈N} and {W_p(t)}_{p∈N} are two Cauchy sequences in E. For all p ∈ N, we define: and we want to prove by induction that this quantity is summable. Let p ∈ N, t ∈ [0, T] and z = (x, v, w). According to the inequality (19) from Lemma 3.1, we have: where, for all s ∈ [0, t], the right-hand side is split into a local term T_l(s) and a nonlocal term T_nl(s). Let s ∈ [0, t]. The first term T_l(s) is easily controlled by factoring out |⟨v, w⟩|^2 ≥ 1, and then taking the supremum over R^2, as follows: Then, to deal with the nonlocal term T_nl(s), using the boundedness of Ψ and Young's inequality, we can compute: Moreover, factoring out |⟨v, w⟩|^2 in each term, we get: Finally, the estimates (23) and (24) together yield that there exist two positive constants C_1 > 0 and C_2 > 0, independent of p, such that: which implies, by dividing this inequality by |⟨v, w⟩|^2 and then taking the supremum over R^{d+2}: On the one hand, one can check with a straightforward induction argument that G_{p+2} is continuous with respect to time. Hence, using Grönwall's lemma, we get (26) for all t ∈ [0, T]. On the other hand, for all t ∈ [0, T], (27) holds for some constant C_T > 0, since V_1 and W_1 ∈ C([0, T], E) according to Lemma 4.1.
Hence, by induction, we can deduce from (26) and (27) that for all p ∈ N and all t ∈ [0, T]: which is summable. Consequently, for all t ∈ [0, T], {V_p(t)}_{p∈N} and {W_p(t)}_{p∈N} are Cauchy sequences in E. Since E is a Banach space, and since V_p and W_p ∈ C([0, T], E) for all p ∈ N, there exist V_{f_0} and W_{f_0} ∈ C([0, T], E) such that for all t ∈ [0, T], V_p(t, ·) (respectively W_p(t, ·)) converges towards V_{f_0}(t, ·) (respectively W_{f_0}(t, ·)) in E, uniformly in time. Thus, passing to the limit p → +∞ in (22), we get that for all z ∈ R^{d+2}, (V_{f_0}(·, z), W_{f_0}(·, z)) is a solution in C([0, T], E)^2 of the characteristic system (8) in integral form. Furthermore, since the nonlinearity N is locally Lipschitz continuous and V_{f_0} and W_{f_0} are continuous with respect to time, we directly deduce from the integral form of (8) that (V_{f_0}(·, z), W_{f_0}(·, z)) is of class C^1 in time.
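The construction of Steps 1-2 is a Picard-type iteration: each iterate is obtained by integrating the right-hand side evaluated at the previous iterate, and the scheme converges because the iteration map is contractive on [0, T]. A drastically simplified scalar sketch on a time grid (illustrative names; an explicit left-endpoint quadrature replaces the exact integral, and a generic F stands in for the FHN right-hand side):

```python
import numpy as np

def picard(v0, F, t_grid, n_iter=30):
    """Picard iterates V^{p+1}(t) = v0 + \\int_0^t F(V^p(s)) ds,
    with the integral replaced by a left-endpoint Riemann sum."""
    dt = np.diff(t_grid)
    V = np.full_like(t_grid, v0, dtype=float)   # V^0 : constant initial guess
    for _ in range(n_iter):
        # One application of the integral map to the previous iterate.
        V = v0 + np.concatenate(([0.0], np.cumsum(F(V[:-1]) * dt)))
    return V
```

For F(v) = v and v0 = 1, the iterates converge to the discretized exponential, the fixed point of the integral equation.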
Step 3: Uniqueness. Now, we want to check that the solution of (8) is unique. Suppose that there exist two solutions (V_1, W_1) and (V_2, W_2) of (8) in C([0, T], E)^2. We define for all t ∈ [0, T]: Thus, using computations similar to the previous ones, we get that there exists a positive constant C such that for all t ∈ [0, T]: Since V_i and W_i are continuous with respect to time for i ∈ {1, 2}, using Grönwall's inequality, we get that (V_1, W_1) = (V_2, W_2).

4.2 Construction of the measure solution to (5)

We have proved so far that there exists a unique map Z_{f_0} := (id_{R^d}, V_{f_0}, W_{f_0}) such that (V_{f_0}, W_{f_0}) is a solution of the characteristic system (8) in C([0, T], E). We define:

5 Proof of Theorem 2.3
This section is devoted to the proof of our main result Theorem 2.3. We start with the stability result (i). Our approach consists in using an estimate of the Wasserstein distance between two measure solutions of (5). Then, we will show how this stability result implies the mean-field limit result (ii).

5.1 Proof of the stability result
For j ∈ {1, 2}, we define the map Z_j such that for all t ∈ [0, T] and all z = (x, v, w) ∈ R^{d+2}: Z_j(t, z) := (x, V_j(t, z), W_j(t, z)), where (V_j, W_j) is the solution of the characteristic system (8) with initial data f_{0,j}. V_j and W_j are in C([0, T], E) according to Proposition 1. Let π ∈ Λ(f_{0,1}, f_{0,2}) be the optimal measure to compute d_2(f_{0,1}, f_{0,2}); the existence of such a minimizing measure is proved in [39]. First, we want to estimate the function D[π] defined for all t ∈ [0, T] by:

JOACHIM CREVAT
Then, we will be able to conclude using the fact that for all t ∈ [0, T], d_2(f_1(t), f_2(t)) ≤ D[π](t).

Step 1: Estimate of D[π]. Let t ∈ [0, T], z_1 = (x_1, v_1, w_1) and z_2 = (x_2, v_2, w_2) ∈ R^{d+2}. We start by estimating the integrand |Z_1(t, z_1) − Z_2(t, z_2)|^2. Then, we will integrate with respect to the measure π(dz_1, dz_2) in order to estimate D[π]^2(t). In the following, for j ∈ {1, 2}, we use the shorthand notations V_j := V_j(t, z_j) and V'_j := V_j(t, z'_j), and similarly for W_j. First, since V_j and W_j are of class C^1 with respect to time for j ∈ {1, 2}, according to the inequality (19) from Lemma 3.1, we have: where By integrating (31) with respect to the measure π(dz_1, dz_2), we get: We easily treat the first term T_l using the definition of D[π]: Now, we deal with the nonlocal terms T_nl. Since we have: we get: The first two terms J_1 and J_2 are easy to estimate, using the fact that Ψ is bounded, together with Young's inequality. We find: Furthermore, since Ψ satisfies the assumption (9), we have for all (x_1, x'_1) and (x_2, x'_2) ∈ R^{2d}: Let R(t) > 0 be a radius to be determined later. We define the sets Then, we can deal with the third term J_3 using (34) and then splitting the set of integration into Θ_{R(t)} and its complement Θ^C_{R(t)}: We can treat the term J_{31} with the Cauchy-Schwarz inequality: Then, using the Young and Cauchy-Schwarz inequalities, we can estimate the second term J_{32} as follows: On the one hand, according to Lemma 3.2, there exists a positive function α: On the other hand, there exists a constant C > 0, according to Lemma 3.2, such that Finally, there exists a constant C > 0 such that and consequently, there exists a constant C > 0 such that Now, we use (33) and (36) to estimate respectively the local term T_l and the nonlocal terms T_nl in (32).
Finally, we get that there exists a constant C_T > 0 such that for all t ∈ [0, T] and all R(t) > 0: First, if we choose R(t) = 1 for all t ∈ [0, T] in the inequality (37), since D[π] is continuous with respect to time (because Z_1 and Z_2 are continuous), Grönwall's lemma yields that there exists a constant K_T > 0 such that for all t ∈ [0, T], D[π](t)^2 < K_T. Then, we define the function u in such a way that for all t ∈ [0, T] with u(t) > 0, we have 1 ≤ −ln(u(t)). Let t ∈ [0, T] be such that u(t) > 0. Then, we choose the quantity R(t) in (37) as follows: Hence, (37) becomes, for some constant C_T > 0: Since u(t) > 0, we can divide the last inequality by u(t), which yields: (ln(u(t)))' ≤ −C_T ln(u(t)). Consequently, since u is continuous with respect to time, applying Grönwall's inequality to ln(u), if we define the function we get that for all t ∈ [0, T]: and this remains true if u(t) = 0. Finally, we can conclude that for all t ∈ [0, T], where K_T > 0 is a positive constant.
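The loss of the usual Lipschitz-in-initial-data rate can be read off the scalar computation above: if u > 0 satisfies the logarithmic Grönwall inequality obtained there, then

```latex
\frac{\mathrm{d}}{\mathrm{d}t}\ln u(t) \;\le\; -\,C_T\,\ln u(t)
\quad\Longrightarrow\quad
\ln u(t) \;\le\; e^{-C_T t}\,\ln u(0)
\quad\Longrightarrow\quad
u(t) \;\le\; u(0)^{\,\beta(t)}, \qquad \beta(t) = e^{-C_T t},
```

which is a Hölder-type stability estimate whose exponent deteriorates in time, rather than the Lipschitz stability available for globally Lipschitz interaction kernels.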
Step 2: Conclusion of the estimate of the Wasserstein distance. For all j ∈ {1, 2} and all t ∈ [0, T], we note Thus, for all t ∈ [0, T] and for all π ∈ Λ(f_{0,1}, f_{0,2}), and therefore, Finally, we can conclude using the estimate (5.1):

Remark 4. If we make the additional assumption that there exists p > 1 such that then, instead of the estimate (36), we can conclude with a similar argument that there exists a constant C_T > 0 such that for all t ∈ [0, T], Hence, since D[π] is continuous with respect to time, Grönwall's lemma yields that for all R > 0 and all t ∈ [0, T], Therefore, this implies that for all t ∈ [0, T] and all R > 0, Choosing R appropriately, the estimate (40) is enough to prove part (ii) of Theorem 2.3. Indeed, since (f_{0,n})_{n∈N} and f_0 satisfy assumption (14), we get the desired convergence.

5.2 Proof of the mean-field limit from (3) towards (5)

First, the existence and uniqueness of the solution to the FHN system (3) is a direct consequence of the Cauchy-Lipschitz theorem. Then, we have already proved the well-posedness of the kinetic equation (5) in Section 4. Now, let us conclude the proof using the stability result from the first part of Theorem 2.3. We notice that for all n ∈ N, the empirical measure f_n is the measure solution of the kinetic equation (5) with initial condition f_{0,n}. Then, part (i) of Theorem 2.3 yields that there exist two positive constants C_T and K_T, independent of n, such that for all t ∈ [0, T] and all n ∈ N, where β(t) := e^{−C_T t}. Finally, using the assumption (14), we get the convergence stated in (ii), which concludes the proof of Theorem 2.3.
6 Application: Stability of monokinetic solutions

One of the motivations to study the mean-field model is the analysis of the macroscopic quantities computed from a measure solution to (5), even though the equation formally satisfied by the average membrane potential in the network is not closed. A way to overcome this difficulty is to look for monokinetic solutions of (5), that is, solutions f of the form (41), where ρ_0 is the average density of neurons, and (V, W) is the average pair of membrane potential and adaptation variable in the network. Therefore, if f is a monokinetic solution of (5), then the couple (V, W) formally satisfies the nonlocal reaction-diffusion FHN system (42).
In this section, we consider a connectivity kernel Ψ of the form Ψ(x, y) = Φ(x − y), where Φ : R^d → R is symmetric, which means that the conductance between two neurons only depends on the distance between them. We want to use the stability result from Theorem 2.3 to prove that monokinetic solutions of the kinetic equation (5) are stable with respect to the Wasserstein distance d_2. Under some additional assumptions, we can state the following well-posedness result.

Lemma 6.1. Assume that Φ is a non-negative symmetric connectivity kernel in L^1(R^d), and that N is the nonlinearity defined through (43). Consider an initial data (ρ_0, V_0, W_0) satisfying (44). Then for any T > 0, there exists a unique couple (V, W) which is a classical solution to the nonlocal reaction-diffusion system (42) with initial data (V_0, W_0).

The proof of Lemma 6.1 is based on a classical fixed-point argument, and we refer to [16] for the details. Now, as a direct consequence of Theorem 2.3, we get the following stability result for monokinetic solutions of the equation (5).
Proposition 2 (Stability of monokinetic solutions). Let T > 0. Assume that Φ is a non-negative symmetric connectivity kernel in L^1(R^d), and that N is the nonlinearity defined through (43). Consider an initial data f_0 ∈ P_2(R^{d+2}), and (ρ_0, V_0, W_0) satisfying (44), together with an exponential moment condition for some positive constant α_0 > 0. Let (V, W) be the solution of (42) with initial data (V_0, W_0) provided by Lemma 6.1, and let f ∈ C([0, T], P_2(R^{d+2})) be the measure solution of (5) with initial data f_0. Then there exist two positive constants C_T > 0 and K_T > 0 such that for all t ∈ [0, T], where β(t) := e^{−C_T t}.
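For concreteness, the monokinetic ansatz (41) and the macroscopic system (42) invoked above can be written as follows. This is a sketch: the sign of the nonlocal term follows the diffusive reading of the coupling in (3), and A denotes the slow FHN dynamics appearing in (5); the actual displays may differ in normalization:

```latex
f(t,\mathrm{d}x,\mathrm{d}v,\mathrm{d}w)
= \rho_0(x)\,\mathrm{d}x \otimes \delta_{\left(V(t,x),\,W(t,x)\right)}(\mathrm{d}v,\mathrm{d}w),
\qquad
\left\{
\begin{aligned}
\partial_t V(t,x) &= N\bigl(V(t,x)\bigr) - W(t,x)
  + \int_{\mathbb{R}^d} \Phi(x-x')\,\bigl(V(t,x') - V(t,x)\bigr)\,\rho_0(x')\,\mathrm{d}x',\\
\partial_t W(t,x) &= A\bigl(V(t,x),\,W(t,x)\bigr).
\end{aligned}
\right.
```

The convolution term explains the "nonlocal reaction-diffusion" terminology: it acts as a nonlocal diffusion weighted by the density ρ_0.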

7 Numerical simulations
In this section, we approximate the solution f of the kinetic equation (5) for a one-dimensional network (i.e. d = 1), normalized to [0, 1]. There are few numerical methods specifically adapted to kinetic theory. In [1], the authors numerically approximate a mean-field model of a neural network of FHN type using finite differences, without considering any space dependence. On the contrary, we are particularly interested in the influence of space in the kinetic model (5). In order to approximate (5), we use a particle method. This kind of numerical scheme was first introduced by Harlow [24] for the numerical computation of specific problems in fluid dynamics, and was rigorously studied later [37]. Since then, a large diversity of particle methods have been introduced for simulations of kinetic equations.
We consider three regimes for the parameters of the FHN dynamics. (i) Bistable regime: the parameters are chosen so that the system (1) admits two stable fixed points, and we take τ = 0. (ii) Oscillatory regime: the parameters are chosen so that the system (1) admits a unique fixed point (0.25, 0), which is unstable, and we consider τ = 0.02; consequently, in this setting, the solution of the system (1) converges towards a stable limit cycle if it is not initialized at the fixed point. (iii) Excitable regime: we choose a = 0 and b = 7, so that the system (1) admits a unique fixed point (0, 0), which is stable; all the solutions of (1) converge towards (0, 0) as t → +∞. Moreover, we fix τ = 0.002, so that (1) exhibits a slow/fast dynamics.
From now on, we define the connectivity kernel (50) as a Gaussian with variance ε > 0, where ε acts as a rescaling parameter. Let us discuss this choice of connectivity kernel. First of all, since Ψ is non-negative, we consider a purely excitatory regime. Then, the parameter ε > 0 varies between 1 and very small values, in order to consider the regime of strong local interactions, which seems reasonable from a biological point of view since two neurons interact only through the contact point provided by their shared synapse.
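The particle method can be sketched as an explicit Euler discretization of the microscopic system (3) with a Gaussian connectivity kernel. The cubic nonlinearity, the form of the slow dynamics and all constants below are illustrative assumptions, not the paper's exact formulas:

```python
import numpy as np

def simulate_fhn_network(n=100, eps=1e-3, a=0.0, b=7.0, tau=0.002,
                         dt=1e-3, steps=500, seed=0):
    """Explicit-Euler particle scheme for a spatially-extended FHN
    network on [0, 1].  The nonlinearity, the slow dynamics and the
    kernel normalization are illustrative choices."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n)              # neuron positions
    v = 0.05 * rng.standard_normal(n)         # membrane potentials
    w = np.zeros(n)                           # adaptation variables
    N = lambda u: u * (1.0 - u) * (u - 0.25)  # cubic excitability (assumed)
    # Gaussian connectivity with variance eps.
    K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2.0 * eps))
    K /= np.sqrt(2.0 * np.pi * eps)
    for _ in range(steps):
        # (1/n) * sum_j Phi(x_i - x_j) * (v_j - v_i)
        coupling = (K @ v - v * K.sum(axis=1)) / n
        v, w = (v + dt * (N(v) - w + coupling),
                w + dt * tau * (v + a - b * w))
    return x, v, w
```

The tuple assignment performs a simultaneous update of (v, w), so both right-hand sides are evaluated at the current time step, as in the continuous system.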
We are interested in the dynamics of the average macroscopic couple (V_f, W_f) computed from the solution f of the kinetic model (5). In the previous section, we proved the stability of monokinetic solutions (41) from well-prepared initial data (see Proposition 2). In the following, we present, in the three regimes discussed above, some numerical evidence that monokinetic solutions have a larger basin of attraction, in the sense that the dynamics of (V_f, W_f) is found to be close to the dynamics of the solutions of the nonlocal reaction-diffusion system (42). To conclude this section, we compare the dynamics of the FHN system (3) and of the mean-field model (5) in the oscillatory regime for different values of the number of neurons n.
Case (i) - Bistable regime. We study the kinetic model (5) in the bistable regime with τ = 0. As initial condition for our numerical scheme, we choose, for all 1 ≤ i ≤ n, a front-like profile built from the complementary error function erfc.
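Such an initial profile can be built from erfc as a smoothed step connecting the two stable states. The center and width below are hypothetical values, not those used in the paper:

```python
import math
import numpy as np

n = 200
x = np.linspace(0.0, 1.0, n)
# Hypothetical smoothed front: v0 ≈ 1 to the left of x0, ≈ 0 to its
# right, with a transition layer of size `width` (illustrative constants).
x0, width = 0.2, 0.02
v0 = np.array([0.5 * math.erfc((xi - x0) / width) for xi in x])
w0 = np.zeros(n)
print(round(v0[0], 6), round(v0[-1], 6))  # → 1.0 0.0
```

Since erfc is monotonically decreasing, the profile decreases from 1 to 0 across the interval, matching the invasion-front initial data described below.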
In Figure 1, we show the spatio-temporal evolution of the macroscopic quantity V_f for different values of the parameter ε. First, in the case (A), ε = 10^{−1} is large compared to the width of the considered interval [0, 1]. Thus, the space influence is almost homogenized, and the interactions between the particles of the neural network are expected to vanish after a short time. Consequently, for t large enough and for any fixed position x ∈ [0, 1], the macroscopic quantity V_f(·, x) is expected to behave as a solution of the one-neuron equation (1) in the same framework, that is, to converge towards one of the two stable fixed points. Indeed, we can observe that it converges towards 0. Then, in the case (B), ε = 10^{−3} is sufficiently small to observe a different dynamics. Here, the behavior of the function V_f qualitatively looks like an invasion front, connecting the steady state 0 to 1, propagating at constant speed. Moreover, as shown in (C), after an initial transition phase, the shape of the front seems to be invariant and smooth.
We note that the qualitative behavior of the macroscopic function V_f when ε is small enough corresponds well to the dynamics of the nonlocal reaction-diffusion system (42), for which traveling front solutions are known to exist [3] when considered on the real line. Regarding the density function f, we show in Figure 2 its temporal evolution for ε = 10^{−3}. We observe that the density f remains concentrated around the state v = 1 in an interval [0, x_0(t)], and around v = 0 in the complementary interval [x_0(t), 1], for some x_0(t) ∈ (0, 1) propagating at a constant speed. This shows that f remains close to a monokinetic solution of the form ρ0 dx ⊗ δ_{V_f(t,·)}(dv), where the qualitative behavior of V_f is that of a traveling front, as previously detailed. This supports the fact that monokinetic solutions seem to have a large basin of attraction.
As explained in [3], the macroscopic speed of propagation is proportional to the integral ∫₀¹ N(u) du. Thus, if we instead work with a nonlinearity for which this integral is negative, we observe a similar behaviour, where the state V_f = 0 invades the state V_f = 1. Furthermore, in the specific case of a nonlinearity satisfying ∫₀¹ N(u) du = 0, there is no invasion.
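For the cubic bistable nonlinearity N(u) = u(1 − u)(u − α) (an assumed form, consistent with the cubic choice recalled in the introduction), the integral evaluates exactly to (1 − 2α)/12, so the invasion direction flips with its sign and the front is pinned at α = 1/2:

```python
from fractions import Fraction

def front_integral(alpha):
    """Exact value of the integral of u*(1-u)*(u-alpha) over [0, 1],
    namely (1 - 2*alpha)/12, for the assumed cubic nonlinearity."""
    a = Fraction(alpha)
    # integral of (u^2 - u^3) du, minus alpha times integral of (u - u^2) du
    return (Fraction(1, 3) - Fraction(1, 4)) - a * (Fraction(1, 2) - Fraction(1, 3))

print(front_integral(Fraction(1, 4)))  # → 1/24  (positive: state 1 invades)
print(front_integral(Fraction(1, 2)))  # → 0    (pinned front, no invasion)
print(front_integral(Fraction(3, 4)))  # → -1/24 (negative: state 0 invades)
```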
Case (ii) - Oscillatory regime. In the oscillatory regime, we choose as initial data a perturbation of the steady state (0.25, 0) concentrated around the position x = 0: v_{0,i} = 0.25 + 0.5 exp(−5000 x_i²), w_{0,i} = 0. In Figure 3, we display the spatio-temporal evolution of the macroscopic function V_f computed from the solution f of the kinetic equation (5) with three different values of the variance ε. In the first case (A), we fix the parameter ε = 10^{−1}. In that case, we expect the space dependence of V_f to be suppressed, and we indeed observe synchronized homogeneous oscillations. Then, we reduce the value of ε to localize the interactions and enforce the spatial dependence. For both ε = 10^{−3} and ε = 10^{−5}, we observe temporal oscillations whose phase is modulated spatially, similar to what is usually found for the local FHN reaction-diffusion system in the oscillatory regime [11]. More precisely, the dynamics is that of a modulated traveling wave propagating at constant speed from 0 towards the right, leaving in its wake an oscillatory pattern with constant frequency and amplitude. This is illustrated in Figure 3, panel (F), where the trajectory followed by the average couple (V_f, W_f) in the phase space at a given time converges towards a limit cycle.
Case (iii) - Excitable regime. We consider an initial condition that is a perturbation of the steady state (0, 0) concentrated at the middle of the interval [0, 1]. We report in Figure 4, panel (A), the space-time representation of the macroscopic function V_f for ε = 10^{−5}. In that case, we see that the dynamics generates two counter-propagating traveling pulses. Once again, the behavior of the function V_f is qualitatively the same as the expected dynamics of the corresponding macroscopic model (42), for which it is well known that the nonlocal reaction-diffusion FHN system supports traveling pulse solutions [19]. Further aspects of the dynamics are drawn in Figure 4. To the best of our knowledge, this is the first time that traveling pulses are reported for the FHN kinetic model.

Numerical comparison between the FHN system and the kinetic model
We consider a set of parameters corresponding to the oscillatory regime of the FHN model (1), that is, the same parameters and initial condition as in the paragraph above. We also fix ε = 10^{−4}, so that the average couple (V_f, W_f) computed from the solution to the kinetic equation (5) presents an oscillatory pattern modulated by a traveling wave propagating through space, as previously.
In Figure 5, we display the profile of the macroscopic function V_f at fixed time t = 225 with respect to x, together with the points (x_i, v_i)_{1≤i≤n} standing for the solution of the FHN system (3), computed with the same parameters as in Figure 3. In each case, an oscillatory pattern emerges, though the frequency of these oscillations seems to be slightly higher than for the kinetic model (5). Then, we observe that, even for a relatively small number of neurons n in the network, the points (x_i, v_i)_{1≤i≤n} match the trajectory of V_f(t, ·). Indeed, in the case (B), the oscillations of both models (3) and (5) are almost synchronized, and in the case (C), the points (x_i, v_i)_{1≤i≤n} overlay the trajectory of V_f. The microscopic system (3) does not seem to vary much any more for higher values of n, and quickly converges towards the solution of (5) as n tends to infinity.
On the one hand, the computations are obviously quicker than for the microscopic model (3) with a very large value of n. This is natural, since in the numerical scheme for (5) we replace the sum over all the neurons in the network with a mean-field operator. On the other hand, the kinetic model (5) manages to accurately represent the behavior of a large neural network, and we still have access to more information than with a macroscopic model. Indeed, the kinetic model provides an approximation of the density of neurons rather than only the average membrane potential. This highlights the interest of the kinetic model (5) compared to both the microscopic model (3) and macroscopic models.
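The speed-up comes from replacing the explicit pairwise sum of (3) by a single matrix-vector product against the empirical measure, i.e. a discrete mean-field operator. A sketch checking that both evaluations of the interaction term agree (the kernel and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, eps = 400, 1e-2
x, v = np.sort(rng.random(n)), rng.random(n)
phi = lambda r: np.exp(-r ** 2 / (2.0 * eps))  # illustrative Gaussian kernel

def coupling_pairwise():
    """Pairwise sum over neurons, as written in the microscopic system (3)."""
    out = np.zeros(n)
    for i in range(n):
        for j in range(n):
            out[i] += phi(x[i] - x[j]) * (v[j] - v[i])
    return out / n

def coupling_matrix():
    """Same quantity evaluated at once with a matrix-vector product."""
    K = phi(x[:, None] - x[None, :])
    return (K @ v - v * K.sum(axis=1)) / n

assert np.allclose(coupling_pairwise(), coupling_matrix())
```

The matrix form costs one O(n²) product instead of an interpreted double loop, and the kernel matrix K can be assembled once per simulation when the positions are fixed.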
We mention that, in the presence of noise, the authors of [36] showed numerical simulations of the FHN system (3) with homogeneous interactions, in which the finite number of neurons can cause the emergence of relaxation cycles near the transition from the excitable regime to the oscillatory regime, whereas the deterministic model still presents a unique stable fixed point.

8 Discussion
In this paper, we have proved the mean-field limit of the deterministic spatially-extended FHN model for neural networks towards a nonlocal kinetic equation as the number of neurons goes to infinity. Our approach is based on a stability estimate for solutions of the kinetic equation (5) with respect to their initial data. We have also proved the well-posedness of the kinetic equation in the space of probability measures with finite second moments, equipped with the Wasserstein distance of order 2. This mean-field limit provides a rigorous link between the microscopic scale of the neural network and a mesoscopic scale, which can then be used as an intermediary step for the derivation of macroscopic descriptions of the neural network. Our microscopic model was obtained by coupling the FHN dynamics for a finite number of neurons, whose interactions with their neighbours are modulated only by their spatial positions in the network through a connectivity kernel Ψ. Moreover, we have ignored the noise in the interactions, so that we only used deterministic tools. Finally, our numerical simulations showed that the kinetic model (5), with a sufficiently localized connectivity kernel, is robust enough to display some qualitative behaviors expected for macroscopic quantities in specific frameworks, while retaining more information than macroscopic models.
Several extensions of this work seem natural. For example, taking inspiration from [8], a possibility could be to choose the connectivity weights Ψ_{i,j} in (3) at random, with a probability law which depends only on the distance between the neurons i and j. A direct consequence of a mean-field limit result could be the propagation of chaos in the network as the number of neurons goes to infinity, that is, the neurons become less and less correlated as their number gets large. Another interesting extension, inspired by the recent work of Chiba and Medvedev [15] on the Kuramoto model, would be to study the mean-field limit of the FHN model on various types of random graphs.
9 Acknowledgments I thank Julien Chevallier for fruitful discussions during a poster session at the conference ICMNS 2018, and Grégory Faye and Francis Filbet for their help.