Measure solutions to a system of continuity equations driven by Newtonian nonlocal interactions

We prove global-in-time existence and uniqueness of measure solutions of a nonlocal interaction system of two species in one spatial dimension. For initial data including atomic parts we provide a notion of gradient-flow solutions in terms of the pseudo-inverses of the corresponding cumulative distribution functions, for which the system can be stated as a gradient flow on the Hilbert space $L^2(0,1)^2$ according to the classical theory by Br\'ezis. For absolutely continuous initial data we construct solutions using a minimising movement scheme in the set of probability measures. In addition we show that the scheme preserves finiteness of the $L^m$-norms for all $m\in (1,+\infty]$ and of the second moments. We then provide a characterisation of equilibria and prove that they are achieved (up to time subsequences) in the large time asymptotics. We conclude the paper constructing two examples of non-uniqueness of measure solutions emanating from the same (atomic) initial datum, showing that the notion of gradient flow solution is necessary to single out a unique measure solution.

(Communicated by the associate editor name) Abstract. We prove global-in-time existence and uniqueness of measure solutions of a nonlocal interaction system of two species in one spatial dimension. For initial data including atomic parts we provide a notion of gradient-flow solutions in terms of the pseudo-inverses of the corresponding cumulative distribution functions, for which the system can be stated as a gradient flow on the Hilbert space L 2 (0, 1) 2 according to the classical theory by Brézis. For absolutely continuous initial data we construct solutions using a minimising movement scheme in the set of probability measures. In addition we show that the scheme preserves finiteness of the L m -norms for all m ∈ (1, +∞] and of the second moments. We then provide a characterisation of equilibria and prove that they are achieved (up to time subsequences) in the large time asymptotics. We conclude the paper constructing two examples of non-uniqueness of measure solutions emanating from the same (atomic) initial datum, showing that the notion of gradient flow solution is necessary to single out a unique measure solution.

Introduction
In this work we consider a particular instance of the following nonlocal interaction system for the evolution of two probability measures ρ and η on the whole real line (1) Here H 1 , H 2 model the way any two agents of the same species interact with one another (socalled self-interaction potentials, or intraspecific interaction potentials). Respectively, K 1 , K 2 are called the cross-interaction potentials, or interspecific interaction potentials, as they describe the interaction between any two agents of opposing species. This model can be easily understood as a natural extension of the well-known aggregation equation (cf. [6,9,25,51,55]) to two species. This link was first established in the paper [36] as the continuous counterpart of a system of ordinary differential equations. More precisely, for M, N ∈ N suppose (x i ) N i=1 and (y i ) M i=1 denote the locations of agents of two different species, each of them with masses 1 N and 1 M respectively. Then, assuming the velocity of any agent is given as an average of the forces exhibited by all other agents upon that agent, one getṡ .., N, The choice of the interaction potentials depends on the application or the phenomena of interest.
In particular in mathematical biology contexts the potentials are often assumed to be radial, i.e. W (x) = w W (|x|), for W ∈ {H i , K i | i = 1, 2}, i.e. they only depend on the relative distance between any two agents. An interaction potential is said to be attractive if w ′ W (|x|) > 0 and repulsive if w ′ W (|x|) < 0. The existence theory developed in [36] covers the case of C 1 -potentials H i , K i , i = 1, 2 (with suitable growth conditions at infinity) and provides a semigroup defined in the space of probability measures with finite second moment equipped with the Wasserstein distance, in the spirit of [1]. More specifically, the JKO scheme, [43], can be adopted in the special case K 1 = K 2 = K. Then Eq. (1) can actually be seen as the gradient flow of the functional (2) F(ρ, η) : In this case a slightly lower regularity needs to be required on the potentials H 1 , H 2 , K as long as all of them are convex up to a quadratic perturbation. Thus, uniqueness can be proven via the notion of λ-convexity along geodesics of F, see [50,1]. Common interaction potentials for the one species case include power laws W (x) = |x| p /p, as for instance in the case of granular media models, cf. [5,56]. Another possible choice is a combination of power laws of the form W (x) = |x| a /a − |x| b /b, for −d < b < a where d is the space dimension. These potentials, featuring short-range repulsion and long-range attraction, are typically chosen in the context of swarming models, cf. [4,7,24,21,28,41,42,44]. Other typical choices include characteristic functions of sets or Morse potentials W (x) = −c a exp(−|x|/l a ) + c r exp(−|x|/l r ), or their regularised versions W p (x) = −c a exp(−|x| p /l a ) + c r exp(−|x| p /l r ), where c a , c r and l a , l r denote the interaction strength and radius of the attractive (resp. repulsive) part and p ≥ 2, cf. [28,29,38]. These potentials display a decaying interaction strength, e.g. accounting for biological limitations of visual, acoustic or olfactory sense. The asymptotic behaviour of solutions to one single equation where the repulsion is modelled by non-linear diffusion and the attraction by nonlocal forces has also received lots of attention in terms of qualitative properties, stationary states and metastability, see [17,19,20,22,40,23] and the references therein, as well as its two-species counterparts, cf. e.g. [35,21,27,18], and references therein.
As set out earlier we shall study a particular instance of the above system where all interactions are modelled by Newtonian potentials. More precisely, by setting N (x) := |x|, we consider repulsive Newtonian intraspecific interactions and attractive Newtonian interspecific interactions, i.e. we will deal with the system Following (2), there is a natural functional that can be associated to system (3), namely We mention at this stage that this choice of the functional does not fit the set of assumptions in [36], in that the (repulsive) intraspecific parts of F are not defined through convex potential (up to a quadratic perturbation).
The corresponding equation for one species has been attracting a lot of interest. In [6] and [9], the authors provide an L ∞ and an L p -theory for the aggregation equation ∂ t u + div(uv) = 0, v = −∇K ⋆ u, with initial data in P 2 (R d ) ∩ L p (R d ), where d ≥ 2 and P 2 (R d ) denotes the set of probability measures with bounded second order moments. They consider radially symmetric kernels whose singularity is of order |x| α , α > 2 − d, at the origin. In particular, the authors prove local well-posedness in P 2 (R d )∩L p (R d ) for p > p s , where p s = p s (d, α). Moreover, when K(x) = |x|, the exponent p s = d d−1 is sharp since for any p < d d−1 the solution instantaneously concentrates mass at the origin for initial data in P 2 (R d ) ∩ L p (R d ). Global well-posedness of solutions with initial data in P 2 (R d ) was proven in [25] for a general class of potentials including in particular K(x) = |x|. The gradient flow structure was crucial to show a unique continuation after blow-up of solutions to the aggregation equation. Let us also mention [8] that provides a well-posedness theory of compactly supported L 1 ∩ L ∞ -solutions for the Newtonian potentials in d ≥ 2. The gradient flow structure introduced in [25] in the particular case of K(x) = |x| in one dimension was further developed in [12], where the authors prove the equivalence of the Wasserstein gradient flow for in the repulsive or attractive case respectively, being Such a result is relevant in particular in the repulsive case, as it shows that all point particles initial data evolve into an L 1 -density on t ∈ (0, +∞) as a simple consequence of the uniqueness of entropy solutions to the corresponding scalar conservation law, [45]. More precisely, a point particle ρ 0 = δ 0 in the repulsive case corresponds to the initial datum F 0 = 1 [0,+∞) for the equation F t + (F 2 − F ) x = 0, and the discontinuity can be resolved (in a weak solution sense) either via a stationary Heaviside function or through a rarefaction wave with time-decaying slope connecting the two states 0 and 1. As the flux g(F ) = F 2 − F is convex, the latter is the only admissible solution in the entropy sense (see e.g. [32]). Therefore, the equivalence result in [12] implies that the distributional derivative ρ = ∂ x F is the only gradient flow solution to the repulsive aggregation equation ρ t = −∂ x (ρ∂ x (|x| ⋆ ρ)). Notice that such a solution satisfies ρ(t, ·) ∈ L ∞ (R) for all t > 0, whereas the initial condition ρ 0 is an atomic measure. The occurrence of such a smoothing effect in the one-species repulsive case suggests that similar phenomena may be observed in the two-species case, at least in one space dimension. Understanding such an issue is one of the purposes of this work. However, the equivalence to the 2 × 2 system of conservation laws does not provide any useful insights in this case, as we shall discuss in detail in Section 5. We would like to stress at this stage that the persistence of an atomic part for one of the two species in (3) would make the definition of measure solutions rather difficult, as the velocity fields are given by convolutions of the solution with a discontinuous function. In the (F, G) version (5) this corresponds to the impossibility of e.g. multiplying a discontinuous function F − G by an atomic measure ∂ x F . On the other hand, we shall see that the equivalence to L 2 (0, 1) 2 -gradient flows in the pseudo-inverse formalism (see e.g. [13,30,12]) provides a natural way to state a suitable notion of solution for (3) with measure initial data giving rise to a unique solution for all times. As for the L p -regularity of solutions, we mention here that a system similar to (3) with the addition of linear diffusion in both components was studied in the context of semiconductor device modelling, see e.g. [3] in which solutions are shown to maintain the finiteness of the L p -norms for p ∈ (1, +∞). As we will show in our paper, the same holds in the one-dimensional diffusion free case (3). However, when initial data feature an atomic part, the attractive part in the functional F may inhibit solutions to instantaneously become L 1 -densities. This may happen for instance when the two species share an atomic part at the same position initially, see Section 5 below. Another interesting issue related to (3) is its asymptotic behaviour for large times. The one species case features total collapse, i.e. the formation of one point particle in a finite time in the attractive case with all the mass of the system, see [25], and large time decay to zero in the repulsive case (as a consequence of the results in [12] and on classical results on the large time behaviour for scalar conservation laws, see e.g. [48]). The asymptotic behaviour in the 2 × 2 case (3) is much more complex. Finite time concentration for smooth initial data cannot happen in the view of the non-expansiveness of the L p -norms proven in the present work. Similarly the case for the large time decay to zero is impossible, as we will construct explicit solutions featuring a steady atomic part for all times. We shall prove that the ω-limit in a suitable topology for (3) is a subset of {ρ = η}, which also coincides with the minimising set of the corresponding functional F. The rest of this paper is organised as follows: • Section 2 contains preliminary concepts on gradient flows in Wasserstein spaces and about the one-dimensional case in particular. • Section 3 deals with the existence and uniqueness of solutions. We first prove it in Subsection 3.1, for the notion of solution introduced in Definition 7. In Subsection 3.2 we consider the case of densities as initial conditions, more precisely in L m (R) for some m ∈ (1, +∞], and we show that a suitable notion of gradient flow solution in the Wasserstein sense (see Definition 9) can be achieved via the Jordan-Kinderlehrer-Otto scheme, which also allows to prove that the L m -regularity is maintained. In addition a uniform-in-time control of the second moment is obtained. Moreover, we prove that our solutions also satisfy Definition 7 given the additional regularity. All the results on the absolutely continuous case are collected in Theorem 12. • Section 4 contains a detailed study of the steady states for (3), as well as of the minimisers of (4). A characterisation of the steady states is provided in Proposition 9. A consequent result concerning the asymptotic behaviour is provided in Theorem 14. • Section 5 describes two relevant examples of atomic initial data. In both cases, non uniqueness of weak measure solutions is shown, and the relevant gradient flow solution is singled out as well. These two examples allow to conclude interesting properties related with the occurrence or not of the smoothing effect (or lack thereof) of initial atomic parts, see Remarks 9 and 10. The link with the hyperbolic system (5) is described in detail in Subsection 5.3 leading to several open problems.

Preliminaries
This section is devoted to setting up the framework to show existence and uniqueness of solutions to the system with Newtonian interactions, N (x) = |x|. Throughout the paper, ρ = ρ(t) and η = η(t) will be considered as time dependent curves with values on the set P(R) of probability measures on R. System (6a) is equipped with initial data for some ρ 0 , η 0 ∈ P(R). Moreover, we write P 2 (R) to denote the set of probability measures with finite second moment, i.e.
In the following we shall use the symbol P a 2 (R) referring to elements of P 2 (R) which are absolutely continuous with respect to the Lebesgue measure. Consider a measure µ ∈ P(R) and a Borel map T : R → R. We denote by ν = T # µ the push-forward of µ through T , defined by for all Borel sets A ⊂ R. We refer to T as the transport map pushing µ to ν. Next let us equip the set P 2 (R) with the 2-Wasserstein distance. For any measures µ, ν ∈ P 2 (R) it is defined as where Γ(µ, ν) is the class of transport plans between µ and ν, that is, where π i : R×R → R, i = 1, 2, denotes the projection operator on the i th component of the product space R 2 . Setting Γ 0 (µ, ν) as the class of optimal plans, i.e. minimisers of (7), the Wasserstein distance becomes for any γ ∈ Γ 0 (µ, ν). The set P 2 (R) equipped with the 2-Wasserstein metric is a complete metric space which can be seen as a length space, see for instance [1,54,57,58].
Next we introduce the notion of the Fréchet sub-differential in the space of probability measures.
This definition will play a crucial role when introducing the notion of gradient flow solutions to system (6) later on, cf. Section 3.2.
It is necessary to recall that the λ-geodesic convexity is strictly linked with the concept of k-flow.
with respect to W 2 if, for an arbitrary µ ∈ P 2 (R), the curve t → S t φ µ is absolutely continuous on [0, +∞[ and satisfies the evolution variational inequality (E.V.I.) for all t > 0, with respect to any reference measureμ ∈ P 2 (R) such that φ(μ) < ∞.
As already mentioned, the previous concepts of k-flow and λ-convexity are closely intertwined. Indeed, a λ−convex functional possesses a uniquely determined k−flow for k ≥ λ, and if a functional possesses a k−flow, then it is λ−convex with λ ≥ k, cf. Refs. [1,37,49], for further details. The notion of k-flow will be of great help in the use of the flow interchange technique, cf. Subsection 3.2. Now, let µ t ∈ AC([0, +∞); P 2 (R)) be an absolutely continuous curve in P 2 (R). We can define the metric derivative of µ t as which is well-defined almost everywhere since µ t is an absolutely continuous curve, cf. [1,54].
Since F µ is a non-decreasing, right continuous function such that we may define the pseudo-inverse function X µ associated to F µ , by for any s ∈ (0, 1). It is easy to see that X µ is right-continuous and non-decreasing as well. Having introduced the pseudo-inverse, let us now recall some of its important properties. First we notice that it is possible to pass from X µ to F µ as follows For any probability measure µ ∈ P 2 (R) and the pseudo-inverse , X µ , associated to it, we have for every bounded continuous function f . Moreover, for µ, ν ∈ P 2 (R), the Hoeffding-Fréchet theorem [53, Section 3.1] allows us to represent the 2-Wasserstein distance, W 2 (µ, ν), in terms of the associated pseudo-inverse functions as since the optimal plan is given by (X µ (s) ⊗ X ν (s)) # L, where L is the Lebesgue measure on the interval [0, 1], cf. also [57,30]. We have seen that for every µ ∈ P 2 (R) we can construct a nondecreasing X µ according to (10), and by the change of variables formula (12) we also know that X µ is square integrable. We now recall that this mapping is indeed a distance-preserving bijection between the space of probability measure with finite second moments and the convex cone of nondecreasing L 2 -functions Proposition 1 (µ → X µ is an isometry). The map mapping probability measures onto the convex cone of non-decreasing L 2 -functions is an isometry.
Let us introduce the notion of sub-differential for functions in L 2 (0, 1) 2 .
Remark 2 (Mass normalisation). In the most general situation possible, the two species, ρ and η, have different masses M ρ , M η > 0. The change of variables allows to rewrite system (6a) (by dropping the tildes) as and the gradient flow structure in the product Wasserstein metric W 2 would be lost because the two interspecific potentials are different. However, this problem can be overcome by using a weighted version of the W 2 product distance of the form as done in [36]. As these multiplying constants M ρ , M η do not bring significant technical difficulties (while making the notation much heavier), for the sake of convenience we shall assume throughout the whole paper that M ρ = M η = 1.

Existence and Uniqueness
In this section we provide the mathematical theory for system (6). In the first subsection we will deal with the case of general probability measures in P 2 (R) × P 2 (R) as initial conditions. In the second subsection we shall restrict ourselves to the case of measures that are absolutely continuous with respect to the Lebesgue measure. In the former we will provide a notion of solutions that is linked to the concept of gradient flows in Hilbert space a-la Brézis [15], working with the pseudoinverse formulation of the problem. In the latter, a better regularity can be achieved and the theory is developed in the framework of gradient flows in Wasserstein space, [1]. Before entering the details, let us recall the definition of interaction energy functional F in (4): which is well defined due to the control on the second order moment.

General Measures Initial Data.
In this first subsection we will use the concept of L 2gradient flow by studying system (1) in terms of the pseudo-inverse functions X ρ and X η defined in Section 2. Throughout the rest of this section we set X := X ρ and Y := X η to simplify the notation. Hence, system (6) (formally) becomes for s ∈ (0, 1) and t ≥ 0, cf. [11,30,47] for similar computations. In order to give a meaning to the above system in the case of µ or η having atoms, we use the convention sign(0) = 0. In terms of the pseudo inverses X and Y , the functional F(ρ, η) becomes In the remainder of this section we shall see that (16) is the L 2 × L 2 -gradient flow associated to (an extended version of) the energy functional (17).
Remark 3. Later on in the paper we need to distinguish between the self-interaction part of F and its cross-interaction part. Thus, let us rewrite F as where S is the energy functional arising from the self-interactions and K is associated to the crossinteraction.
Following the procedure of [12], since we are dealing with distribution of particles, we have to ensure that the flow remains in the set C × C, with C defined in (14), see also [10,13]. Hence, for X ∈ L 2 (0, 1) given, we introduce the the indicator function of C, defined by, Thus we consider the extended functional In [12,Proposition 2.8] the authors proved that the self-interaction part of F, i.e. S, is actually linear when restricted to C. Let us recall this result in the next proposition.
As a trivial consequence of Proposition 2 we have the following result.
Proof. The proof is trivial since K, the cross-interaction part ofF, is convex due to the convexity of the Newtonian potential N . Moreover, from [12, Proposition 2.9] we argue that the remaining part is convex.
We now present the definition of L 2 -gradient flow solutions to system (16).
We observe that the assumption (X 0 , Y 0 ) ∈ C × C is natural as X 0 (respectively Y 0 ) is the pseudoinverse of the cumulative distribution of the initial measure ρ 0 (respectively η 0 ). We also observe that this assumption easily implies ∂F X 0 , Y 0 = ∅.
Since we know thatF is a proper, lower semi-continuous, and convex functional on the Hilbert space L 2 (0, 1) 2 , it easy to show that ∂F is a maximal monotone operator. Thus we can apply the theory of Brézis [15,Theorem 3.1] combined with [39, Section 9.6, Theorem 3] in order to prove existence and uniqueness of an absolutely continuous curve satisfying the differential inclusion above.
There exists a unique solution (X(t, ·), Y (t, ·)) in the sense of Definition 5 with initial datum (X 0 , Y 0 ). Now, let us go back to system (6) and state our definition of solution.
are a solution to system (16) in the sense of Definition 5 with initial datum According to Definition 7 the following theorem is then a consequence of the isometry (15) and Theorem 6.
There exists a unique solution to the system (6) in the sense of Definition 7.
So far we assumed that the link between (6) and (16) is somewhat natural and we just referred to similar situations in the literature. However, the theory developed in this subsection would be somewhat meaningless if we did not show that the concept of solution in Definition 5 extends a more classical notion of solution for (6). The following subsection is dedicated to establishing exactly this link.

3.2.
Absolutely Continuous Initial Data. In this subsection we consider the case of densities as initial data. Following the approach of [1] combined with the results from [11,12,25,26,36], we pose system (6) as the gradient flow of the interaction energy functional (4) (that we recall here for the reader's convenience) for all (ρ, η) ∈ P a 2 (R) × P a 2 (R), and N (x) = |x|, for all x ∈ R. Definition 9. Given any γ 0 = (ρ 0 , η 0 ) ∈ P a 2 (R) × P a 2 (R), an absolutely continuous curve and η(t) solve the following system in the distributional sense Note that it is easy to check that the element of minimal norm in ∂N (x) is given by Using the results obtained in [12,25,26,36], we easily get the following proposition.
is the unique element of the minimal Fréchet sub-differential of F, where Proof. The geodesic convexity of F on P 2 (R)× P 2 (R) is the consequence of two observations. First, the cross-interaction part is geodesically convex as the interspecific interaction potential is given by N (x) = |x|, a convex function. Second, the geodesic convexity of the intraspecific self-interaction part can be proven using a nice monotonicity property of the transport map between two measures in P a 2 (R) in one dimension. Indeed, in [26, Lemma 1.4] the authors prove that, given µ, ν ∈ P a 2 (R), the transport map T = T ν µ is essentially non-decreasing, i.e. it is non-decreasing except on a µ-null set. Thus, when interpolating measures, it is sufficient to consider the restriction of −N (x) = −|x| on x ∈ [0, +∞), which is convex. Hence, we can prove that − 1 2 R N ⋆ µ dµ is geodesically convex. Formula (23) can be proven as in [25,Proposition 2.2].
Remark 5. We highlight that in the presence of atomic parts for ρ or η the sub-differential may be empty, as shown in [12].
and it can be written as is a curve of maximal slope for the functional F if the map t → F(γ(t)) is an absolutely continuous function and the following inequality holds In order to construct a solution to system (6) in the sense of Definition 9 we follow the strategy proposed in [1] and used in [25,36]. First, we prove the existence of a curve of maximal slope by means of the so-called "Minimizing Movement Scheme", cf. [1,34], or Jordan-Kinderlehrer-Otto scheme, cf. [43]. Then, we prove the limit curve of the scheme is absolutely continuous w.r.t. the Lebesgue measure provided the initial datum is in L m (R) × L m (R) for some m > 1. Let τ > 0 be a fixed time step and let γ 0 = (ρ 0 , η 0 ) ∈ P a 2 (R) × P a 2 (R) be a fixed initial datum such that F(γ 0 ) < +∞. We define a sequence {γ n τ } n∈N = {(ρ n τ , η n τ )} n∈N recursively. We set γ 0 τ = γ 0 and, for a given γ n τ ∈ P a Note that (24) We proceed by showing that the family {γ τ } τ >0 admits a limiting curve and conclude by identifying this limit as a distributional solution to system (6). The proof in Proposition 5 follows a, by now, classical argument of [1,Chapter 3], with only minor issues related to some moment estimates in our case. We present it here for the reader's convenience.
From estimate (29) and the inequality in Remark 1 we can deduce the second moment of γ τ (t) is uniformly bounded on compact time intervals. Moreover, as a consequence of estimates (27) and (29), using once again the bound from below for F(γ n τ ) we get Now, let us consider 0 ≤ s < t such that s ∈ ((m − 1)τ, mτ ] and t ∈ ((n − 1)τ, nτ ]. It easy to check that |n − m| < |t−s| τ + 1. By means of the Cauchy-Schwartz inequality and (30) we obtain 1 2 -Hölder equi-continuity for γ τ (up to a negligible error of order √ τ ) since As already mentioned, we can prove more refined estimates for the solution γ to system (6). Indeed, in this subsection we prove that, starting with initial data ρ 0 , η 0 ∈ P a 2 (R) ∩ L m (R), for m ∈ (1, +∞], the solution keeps this regularity for every t ≥ 0 and it has second order moments uniformly bounded in time. These properties can be establish by using the "flow interchange" technique developed by Otto in [52], and Matthes, McCann and Savaré in [49]. This technique is based on the idea that the dissipation of one functional along the gradient flow of another functional equals the dissipation of the second functional along the gradient flow of the first one. In this spirit, the "Evolution Variational Inequality" (E.V.I.) linked with the auxiliary gradient flow is crucial in order to obtain useful refined estimates (see for instance [35,37]). The connection between gradient flows and evolutionary PDEs of diffusion type shown in [1,43,52,54] allows us to consider the (decoupled) system as the gradient flow of the functional as well as the following system which can be seen as the gradient flow of the functional for ε > 0 and m ∈ (1, ∞), with respect to the product 2-Wasserstein distance W 2 . We shall employ the flow interchange strategy twice taking as auxiliary functional (1) E to get L m -regularity (m > 1) for the solution γ, (2) G in order to obtain a uniform bound in time for the second order moments of γ. For the reader's convenience we shall sometimes use the symbol A to denote either G or E. The functional A ∈ {G, E} possess a 0-flow given by the semigroup and S A = (S 1 A , S 2 A ), see for instance [33]. In particular, by setting ·)) is the unique classical solution at time t of system (32a) (respectively (33a)) coupled with an initial value (ν 1 , ν 2 ) at t = 0 in case A = E (A = G, respectively).
is bounded from below in terms of the second moment m 2 (ρ), i.e.
H(ρ) ≥ −C(m 2 (ρ) + 1) β , for every ρ ∈ P a 2 (R d ), β ∈ ( d d+2 , 1), and C < +∞, depending only on the space dimension d. We are going to use this inequality in order to have a uniform bound from below for the entropic part in (32b) and (33b).
For every γ = (ρ, η) ∈ P a 2 (R) 2 , let us define the dissipation of F along S A by where A ∈ {G, E}. We prove the following proposition.
Proof. As an easy consequence of the definition of the sequence {γ n τ } n∈N , for all s > 0 we have that 1 2τ Dividing by s > 0 and passing to the lim sup as s ↓ 0 we get where in the last inequality the well-known connection between displacement convexity and the E.V.I. is crucial, see e.g. [33]. Now, concerning the left-hand side of (35), we notice that Hence, let us focus on the time derivative inside the above integral. Keep in mind that S t E γ n+1 τ is the solution to the decoupled system of nonlinear parabolic equations with strictly positive coefficients, system (32a). Then, using the C ∞ -regularity of S t E γ n+1 τ we may infer where the terms at infinity in the integration by parts vanish due to the rapid decay of the solution to a nondegenerate diffusion equation [46]. (37) yields Combining (38) with (36) and (35) we obtain for all n ∈ N. By Remark 7 we control the log-entropic part of E and we deduce that, as ε ↓ 0, for every t ≥ 0. This proves estimate (34) for m ∈ (1, ∞). In order to extend the estimate to the case m = +∞, we observe that We conclude that, for all T > 0, the subsequence {γ τ k } k∈N obtained form Proposition 5 is uniformly bounded in L ∞ ([0, T ]; L m (R)) 2 . By Banach-Alaoglu's Theorem, in case m is finite, there exists a subsequence (τ ′ k ) ⊂ (τ k ) such that {γ τ ′ k } k∈N converges in the weak L m x,t topology to some limit γ ′ ∈ L m ([0, T ] × R) 2 . In the case of m = +∞ the above subsequence exists in the weak-⋆ topology of L ∞ ([0, T ] × R). Due to Proposition 5 the limit γ ′ coincides with γ on [0, T ]. By a simple weak lower semi-continuity argument we deduce that γ inherits the same estimates as the approximating sequence γ τ k . Since The T was arbitrary we conclude the proof.
Proof. Arguing as in the proof of Proposition 6, from the scheme (24) we easily get (43) τ where, again, the left-hand side of (43) can be rewritten as is the solution to system (33a), which is a decoupled system of linear Fokker-Planck equations, we use its C ∞ -regularity to obtain Let us consider the boundary terms individually. For convenience we set ρ t = S 1,t G ρ n+1 τ , η t = S 2,t G η n+1 τ , and κ := ρ t − η t in order to simplify the notation. Concerning the first term, we have which vanishes as |x| → +∞. Regarding the second term we note that which vanishes as |x| → +∞ because ρ t and η t are solutions of linear Fokker-Planck equations decaying exponentially at infinity. Now, for the same reason, there holds vanishes |x| → +∞. As for the last boundary term we get which, again, goes to 0 as |x| → +∞. Using (46), (47), (48), and (49) in (45) we deduce that which, in combination with (44) and (43), gives G(γ n τ ) ≤ G(γ 0 ).
Hence, taking into account Remark (7), letting ε → 0 + we get for every t ≥ 0, i.e. estimate (42). By similar considerations to the ones at the end of the proof of Proposition 6 and from Proposition 5 we conclude that γ has second order moments uniformly bounded in time.
The final step in this procedure is to prove the curve γ obtained in Proposition 5 is a curve of maximal slope for F, arguing as in [25,36]. Since curves of maximal slope coincide with gradient flows, see [1, Theorem 11.1.3], we actually have that γ is a gradient flow solution to (6) in the sense of Definition 9. Let us summarise the procedure for the sake of completeness. We denote byγ τ the De Giorgi variational interpolation (cf. [1, Definition 3.2.1]), i.e. any interpolatioñ γ τ : [0, +∞) → P a 2 (R) 2 of the discrete values {γ n τ } n∈N defined through scheme (24) such that if t = (n − 1)τ + δ ∈ ((n − 1)τ, nτ ]. Now, from [1, Theorem 3.14, Lemma 3.2.2] it is possible to get the following energy inequality where the pair (γ k , v k ) is the solution of the continuity equation ∂ t γ k (t) + div(v k (t)γ k (t)) = 0 in the sense of distributions. Here γ k is the subsequence from Proposition 5 and v k (t) is the unique velocity field with minimal L 2 (γ k (t))-norm (see Remark below), andγ k :=γ τ k is defined by (52).

Remark 8 (Absolutely continuous curve and the continuity equation).
Thanks to Proposition 5 we know γ k is an absolutely continuous curve, therefore we can identify its tangent vectors with the velocity fields v k (t) such that the continuity equation ∂ t γ k (t) + div(v k (t)γ k (t)) = 0 is satisfied in a distributional sense, according to [1, Theorem 8.3.1]. Furthermore, [1,Proposition 8.4.5] asserts there is only one v k (t) with minimal L 2 (γ k (t))-norm, equal to the metric derivative of γ k (t) for a.e. t.
Up to a subsequence both the interpolations γ τ andγ τ narrowly converge to γ in view of Proposition 5. Proving that γ is a curve of maximal slope in the sense of Definition 10 is then a consequence of the lower semi-continuity of the slope and the energy inequality (53) (6) in the sense of Definition 9. At this stage, the uniqueness of the gradient flow solutions in the sense of Definition 9 follows from the geodesic convexity of F proven in Proposition 4, relying on [1, Theorem 11.1.4]. More precisely, Given two gradient flow solutions γ 1 (t) and γ 2 (t) in the sense of Definition 9, we obtain the stability property (54) W 2 (γ 1 (t), γ 2 (t)) ≤ W 2 (γ 1 (0), γ 2 (0)), for all t ≥ 0. In addition, the unique gradient flow solution satisfies the Evolution Variational Inequality (E.V.I.): for almost all t > 0 and allγ ∈ P(R) 2 . The property (55) can actually be used to show a stronger property, namely that γ is a gradient flow in the sense of Definition 7, which implies uniqueness in the weaker notion of solution defined in Definition 7 in view of Theorem 8. We prove this statement in the following Theorem, which also collects all the estimates proven in this subsection.
Theorem 12. Let m ∈ (1, +∞]. Let ρ 0 , η 0 ∈ P 2 (R) ∩ L m (R). Then, there exists a unique γ = (ρ, η) ∈ L ∞ ([0, +∞); P 2 (R) 2 ∩ L m (R) 2 ) solving (6) in the sense of Definition 9. Moreover, γ is the unique solution to (6) in the sense of Definition 7 as well. Finally, we have the properties Proof. All the statements have been proven earlier in this subsection, in particular in Proposition 6 and Lemma 11. We only need to prove that γ is a solution to (6) in the sense of Definition 7. Recall the E.V.I. (55) for a general γ ∈ P 2 (R) 2 . Integrating the inequality (55) on the time interval [s, t] and dividing by t − s we obtain Consider now the pseudo-inverse variables X ρ , X η , X ρ , and X η of ρ, η, ρ, and η respectively, where γ = (ρ, η) and γ = (ρ, η). The formulas (12) and (13) applied to our case imply We notice that the indicator function has been considered as zero in F as all the involved variables are in the cone C. By absolute continuity in time of the curve t → (X ρ (t), X η (t)), recalling the expression of F, we can let s ↑ t and obtain which, since (X ρ , X η ) was arbitrary, is equivalent to state that The proof will be completed once we show that Z(t) is a Lipschitz curve. Recall the estimate (28) at the level of the JKO scheme, which can be rewritten as ] , for all h > 0. Sending τ ↓ 0 and using the fact that the functional F is continuous w.r.t. W 2 , we get 1 In the pseudo-inverse formalism the above estimate reads 1 where X 0 and Y 0 are the pseudo-inverses corresponding to ρ 0 and η 0 , respectively. The definition of sub-differential, ∂F, then implies Hence, we easily get thatŻ(0) is bounded in L 2 (0, 1) 2 by the estimate Finally, the stability property (54) implies for all t, h > 0 Upon dividing by h and letting h ↓ 0 we get Ż (t) L 2 2 ≤ Ż (0) L 2 2 , which gives the desired regularity for Z(t).

Steady states and minimisers of the energy
In what follows we shall study the energy (4) associated to system (3). First we shall see that the energy can be written in a completely symmetric way which then allows us to show its boundedness from below by zero.
As the kernel, N (x) = |x|, is symmetric we may swap the roles of x and y in the second term in the integral to obtain hence the energy does not depend on the individual densities but merely on their difference. By abuse of notation we shall write for κ ∈ L 1 ((1 + |x| 2 ) dx) with zero mean. We introduce the set of L 1 -functions with finite second order moments with zero mean in order to formulate the following Proposition establishing the boundedness of the energy functional.
Proof. Let κ ∈ L 1 (1 + |x| 2 ) dx be arbitrary. It is well-known that where the last equality holds due to the fact that N is the fundamental solution of the Laplace equation in one dimension. Thus we may write by an integration by parts. Arguing as in the proof of Proposition 11, the boundary term vanishes. Hence we conclude Clearly, equality holds if and only if N ′ ⋆ κ = 0. Differentiating this expression once yields the second assertion and concludes the proof.
Proof. We adopt the strategy of [16,Proposition 3.1]. Let (ρ, η) be a minimiser of the energy F. For i = 1, 2, let v i ∈ C ∞ c (R) be arbitrary velocity fields for the continuity equations with initial data γ 1 (0) = ρ and γ 2 (0) = η. For i = 1, 2, the solution γ i can be constructed by means of the method of characteristic ODE, i.e. by considering X i (s, x) such that for all ϕ ∈ C 1 c (R), so that γ i has mass 1 and it is positive almost everywhere. Then, since (ρ, η) are minimiser of the energy, at s = 0 there holds We choose v 2 = 0 to obtain Since v 1 is arbitrary we obtain ρ∂ x (N ⋆ (ρ − η)) = 0, almost everywhere and similarly, with v 1 = 0 and v 2 arbitrary we also obtain almost everywhere. This shows that (ρ, η) is a stationary solution to (6). Moreover, from Proposition 7 we see that (ρ, η) is a minimiser if and only if κ = 0 a.e., which implies the assertion.
Proof. Since (ρ, η) is a steady state of system (6), from the energy inequality (53) we get which implies |∂F| 2 [γ] = 0. Thus, in view of Proposition 4 we obtain ρ = η almost everywhere. As a consequence of Proposition 7 we conclude (ρ, η) is a minimiser of the energy F.
We conclude this section by providing a characterisation for the ω-limit set of a solution to system (6). For the sake of completeness, let us recall the definition of ω-limit set according to [31, Definition 9.1.5].
Definition 13. Let (X, d) be a complete metric space and consider a dynamical system {S t } t≥0 . For x ∈ X the set Now, let us state the following Theorem.
Theorem 14. Let γ = (ρ, η) be the solution to system (6) with initial datum for a.e. t > 0. Then we can consider an unbounded, increasing sequence {t n } n∈N such that t n → +∞ and γ(t n ) ⇀γ weakly in L m , as n → +∞, whereγ = (ρ,η). According to [2,Theorem 5.3], since γ is a gradient flow solution to system (6) we have Now, if we integrate in a general time interval (t n , t n + 1), we have which gives, passing to the lim inf as n → +∞,   . This example has two separated indicator functions as initial data. In the left graph we see the evolution of system (6) to the stationary state (right graph).
In the middle we see the energy (black) of the solution and its dissipation (red). The dotted line is the numerical time derivative of the energy. It matches well with the analytically obtained dissipation.

Initial Data with Atomic Part, Non-Uniqueness & link with Hyperbolic Systems
In this section we study the evolution of measure valued initial data consisting of atomic parts. We will consider two peculiar examples: first we handle the case of two distinct Dirac deltas as initial condition, namely ρ 0 = δ −1 and η 0 = δ 1 , and then we will consider the case ρ 0 = δ 0 and η 0 = mδ 0 + (1 − m)δ 1 , for some m ≥ 0. In both cases we will provide two candidate weak solutions for system (3) and we will show that only one of them can be selected in the spirit of Definition 7. As the notion of gradient flow in measure valued solution setting is essentially formulated in the pseudo-inverse formalism through Definition 5, in this section we shall work directly with the pseudo-inverse variables. We start by providing an explicit expression for elements in the sub-differential of F in both the examples we shall consider.  Y (z). Then, and in particular the functional becomes In addition, let X, Y ∈ L 2 (0, 1). First, let us compute I 1 taking into account that X = Y on [0, m): Since X is non-decreasing, we have and therefore we get Concerning the other integrals, we easily obtain Summing up all the contributions we have Then (57) follows as a direct consequence. Now, let us characterise the sub-differential ofF. Assume without loss of generality that X ∈ C and Y ∈ C, which implies I C (X) = +∞ and I C (Y ) = 0. If for all (R 1 , R 2 ) ∈ L 2 (0, 1) 2 with (X, Y ) − (R 1 , R 2 ) → 0, and in particular for all (R 1 , R 2 ) ∈ C × C.
In the latter case we obviously have a contradiction because the left-hand side is finite while the right-hand side is infinite and therefore ∂F[(X, Y )] = ∅. Let (X, Y ) ∈ C ×C and (R 1 , R 2 ) ∈ L 2 (0, 1) 2 . Now we have to consider two cases: (1) (R 1 , R 2 ) ∈ C × C; (2) (R 1 , R 2 ) ∈ C × C. In the first case the definition of sub-differential is trivially satisfied, whereas in the second one from (59) we get which concludes the proof.
We remark in particular that under the assumptions in the previous proposition the sub-differential of F is single-valued on C × C. Note that the case sup X < inf Y (m = 0) is included in the previous proposition, and the functional becomes 5.1. The case of two distinct deltas as initial condition. Let us consider the first example, with (60) ρ 0 = δ −1 and η 0 = δ 1 .
At the level of weak (measure) solutions, both and (ρ,η) given byρ satisfy system (6) in the weak sense on [0, 1/2] × R and equal (60) at t = 0, see considerations in Subsection 5.1.1. The corresponding pseudo-inverse functions (X, Y ), given by as well as (X,Ỹ ), given byX satisfy system (16) in the strong sense on [0, 1/2] × [0, 1], see Subsection 5.1.2. Working in the context of pseudo-inverses we can show, in Subsection 5.1.3 that actually only the pseudo-inverse in (63) is an element of the sub-differential. 5.1.1. Weak measure solutions. In order to prove that (ρ, η) in (61) is a weak solution of the system we begin by simplifying the velocity term. To this end we compute the convolution with the Heaviside function Similarly, for the convolution with the second species we obtain We claim that ρ, η as defined above are weak solutions to system (6). Here we only check that ρ is a weak solution as the computation for the second species is done in an analogous way. Now, let φ ∈ C ∞ c and consider the weak formulation where and Let us begin by simplifying the time related term. By changing the order of integration it is easy to see that by an integration by parts. Hence, switching the order of integration another time and simplifying the boundary term we obtain A change of variables x + 1 = 2t finally yields Next we shall address the term space related term. We observe An integration by parts yields Upon adding up I t and I x we observe i.e. ρ is a weak solution of the first equation in system (6). Similarly it can be shown that η is a weak solution to the second equation in (6).
Next we show that (ρ,η) is also a weak solution. As before we compute the terms including the convolutions first. It is easy to check that for all x ∈ R, thus the velocity is given by u := N ′ ⋆ (η −ρ) = sign(x + t − 1) − sign(x − t + 1). Let us consider a test function φ ∈ C ∞ c in order to check the weak formulation as follows: Arguing similarly forη we have (ρ,η) is a weak solution to system (6) with initial data ρ 0 = δ −1 and η 0 = δ 1 as well.

5.1.2.
Strong solutions in the pseudo-inverse formalism. Next, let us show that (X, Y ) defined in (63) is the solution to system (16) in the strong sense. Using that t < 1/2 there holds As for the second pair of pseudoinverses (64) we observe and similarly the equation forỸ is satisfied.

5.1.3.
Characterisation of the sub-differential. We need to check the differential inclusion. According to Proposition 10 we have ∂ ∂t as we claimed. However, there holds ∂ ∂tX (t, z) = 1 = 2z, which shows the pair (X,Ỹ ) defined in (64) is not a solution to system (16) in the sense of Definition 5 for the given initial data.
Remark 9. In this example, both species are initially concentrated at one point, but there is no overlap between them. Therefore, only the intraspecific energies are affected at "singular points", i.e. at points in which the convolution kernel is not smooth. In fact, the interspecific energy is not effected by the Lipschitz point at the origin. In this sense, one expects the qualitative behaviour of this system to be essentially the same as in the one species case, see [12]. More precisely, both species get immediately absolutely continuous w.r.t. to the Lebesgue measure. The attractive crossinteraction energy makes the two patches get closer to each other until they eventually merge. In conclusion, the existence of two distinct measure solutions in this example is not a distinctive feature of the two species system, but rather an extension of a property holding in the one species case.
Then the pair (ρ, η) given by is a weak solution to system (6) whereas, for the cross-interactions, we get (N ′ ⋆ η)(t, x) = m sign(x) − Hence, the velocity on [0, 1/2] is given by We shall now verify that ρ and η are weak solutions. It is easy to see that We will treat each term individually. Note that u(t, 0) = 0 for all t > 0. Together with the fact that φ is compactly supported and the application of the fundamental theorem the first term vanishes and it remains to treat the terms I 1 , I 2 . Using Fubini's theorem and an integration by parts we may write As for the second term a simple integration by parts yields Thus we get Now, we need to check that (ρ,η) satisfies the weak formulation. Here we only check the statement only forρ as the second species is shown analogously. Again we compute the velocity field forρ.

5.2.2.
Strong solutions in the pseudo-inverse formalism. Let us now consider the associated pseudoinverse functions X and Y , given by for z ∈ [0, 1] and 0 ≤ t ≤ T . (X, Y ) is a solution to system (16) in a strong sense, in fact ∂ ∂t we have ∂ ∂t X(t, z) Y (t, z) ∈ −∂F X(t, z), Y (t, z) , so that we can affirm (X, Y ) is a gradient flow solution to system (16). In conclusion, (X,Ỹ ) is not a gradient flow solution since ∂ ∂tX (t, z) = 1 − m = 2(z − m), as we claimed.
Remark 10. Unlike the case of two separate Dirac deltas, the phenomenon arising in this example is indeed a distinctive feature of the two species case. This time the inter-specific energy is indeed affected at the singular point, since both species are present at the same position initially. The common mass m at the point zero is driven both by a self-repulsion and by a cross-attraction effect annihilating each other and producing no movement at all as a result. The extra mass of ρ is instead only driven by self-repulsion, and therefore it gets smoothed. At the point 1, only the smoothing effect occurs, as there is no singular cross-interaction. There is a significant aspect in this solution: the gradient flow solution maintains a bit of its initial atomic part, which never happens in the one species case.

5.3.
Link with hyperbolic systems. In this section we want to highlight the link between system (6) and a particular nonlinear 2× 2 system of conservation laws in one space dimension, see [14,32]. Indeed, considering the cumulative distribution functions F and G of ρ and η respectively (as defined in (9)), we can rewrite system (6) as (67) ∂ t F + 2(F − G)∂ x F = 0, ∂ t G + 2(G − F )∂ x G = 0, or in the equivalent matrix form We stress that the initial condition F 0 , G 0 for F and G are non-decreasing and achieving values in [0, 1], with F 0 (−∞) = G 0 (−∞) = 0 and F 0 (+∞) = G 0 (+∞) = 1. System (67) is hyperbolic, though not strictly, as the eigenvalues are To our knowledge, no general theories on hyperbolic systems currently allow to define a notion of entropy solution for such a system due to the lack of strict hyperbolicity. In particular, there is no canonical way to define a suitable Riemann solver. The link between (67) and (6) can be easily established at the level of weak solutions for (67) with sufficiently smooth initial data, which will correspond to weak solutions for (6). Such a link is slightly more tricky at the level of discontinuous solutions. On the other hand, the use of the Evolution Variational Inequality for (6) seems to be like a natural way to characterise a solution for (67) as well. This task will be performed in a future work. In this subsection we will just display the gradient flow solutions found in the previous subsections at the level of the hyperbolic system (67), as relevant examples of solutions of Cauchy problems which can be solved via the composition of two Riemann problems. Let us start by considering the Cauchy problem which correspond to the initial condition for (6) ρ 0 = δ −1 , and η 0 = δ 1 .
As anticipated, we construct the solution (only for short time) by solving two separate Riemann problems, i.e. (70) On the basis of our previous results, we know the solution at the level of pseudo-inverses functions, i.e. for z ∈ [0, 1] and for t small enough, X(t, z) = 2tz − 1, and Y (t, z) = 2t(z − 1) + 1.
Computing the corresponding cumulative distributions F and G, our candidate solution to problem (70) for t small enough is given by G(t, x) = 0 for x < 1, whereas for problem (71) we have The composition of these two solutions for short times is represented in Figure 5.3. Let us now consider as initial datum the cumulative distribution functions of ρ 0 = δ 0 , and η 0 = mδ 0 + (1 − m)δ 1 , as in (66). As before we have to deal with two different Riemann problems, i.e. (72) Going back to the results of the previous subsection, the gradient flow solution for the pseudoinverse system (16) is given by Y (t, z) = 0 0 ≤ z < m 2t(z − 1) + 1 m ≤ z ≤ 1, and then, for t small enough, the candidate solution to problem (72) is given by