Measure dynamics with Probability Vector Fields and sources

We introduce a new formulation of differential equations describing the dynamics of measures on a Euclidean space, which we call Measure Differential Equations with sources. They mix two different phenomena: on one side, a transport-type term, in which the vector field is replaced by a Probability Vector Field, that is, a probability distribution on the tangent bundle; on the other side, a source term. This new formulation allows one to write in a unified way both classical transport and diffusion with finite speed, together with creation of mass. The main result of this article shows that, by introducing a suitable Wasserstein-like functional, one can ensure existence of solutions to Measure Differential Equations with sources under Lipschitz conditions. We also prove a uniqueness result under the following additional hypothesis: the measure dynamics needs to be compatible with the dynamics of measures that are sums of Dirac masses.


Introduction
The problem of optimal transportation, also called the Monge-Kantorovich problem, has been intensively studied in the mathematical community. Related to this problem, the definition of the Wasserstein distance on the space of probability measures has proved to be a powerful tool, in particular for dealing with dynamics of measures (such as the transport PDE, see e.g. [3]). For a complete introduction to Wasserstein distances, see [12,13].
This approach has at least two main limitations. The first is that the use of the transport equation, together with its counterpart in terms of ordinary differential equations [1,2], allows one to model neither mass diffusion nor concentration phenomena. The second is that the Wasserstein distance W p (µ, ν) is defined only if the two measures µ, ν have the same mass; hence PDEs with sources cannot be studied with such tools.
Both limits were recently overcome by a variety of contributions. The first was addressed in [9], in which a generalization of the concept of vector field was introduced. Such a tool, called a Probability Vector Field (PVF in the following), allows one to model concentration and diffusion phenomena within the formalism of the transport equation, thus making it possible to transfer several useful techniques from dynamical systems.
The second limit was addressed by a series of papers introducing generalizations of the Wasserstein distance to measures with different masses. In [10] we defined a generalized Wasserstein distance W g (µ, ν), combining the standard Wasserstein and L 1 distances. In rough words, for W g (µ, ν) an infinitesimal mass δµ of µ (or ν) can either be removed at cost a|δµ|, or moved from µ to ν at cost bW p (δµ, δν). This distance is a generalization of the so-called flat distance. Other generalizations of the Wasserstein distance, in the same spirit of allowing sources of mass, are studied in [5,6,8,7]. As a consequence, sources of mass can be introduced in the transport equation, even when they depend on the measure itself, see [11].
The goal of this article is to define a new class of equations, able to describe complex dynamics in the space of measures, including mass diffusion, concentration and sources. The idea is to merge two different dynamics, already individually described in [9,10], and couple them.
The first contribution is given by the dynamics induced by Probability Vector Fields (PVF in the following), recently introduced in [9]. There, the equation
µ̇ t = V [µ t ], (1)
is considered, where V : P(R n ) → P(T R n ) is a function from the space P(R n ) of probability measures to the space P(T R n ) of probability measures on the tangent space T R n . The idea of such a function is to describe the infinitesimal spreading of the mass µ(x) at a point x along the velocities described by the measure V [µ](x, ·) on the fiber T x R n . Given the projection π : T R n → R n defined by π(x, v) = x, we also require π#V [µ] = µ, i.e. that the projection of V [µ] from P(T R n ) to P(R n ) coincides with µ. This is the measure counterpart of the fact that a vector field is a section of the tangent bundle. The main contribution of [9] is to introduce conditions ensuring existence and/or uniqueness of the solution of the Cauchy problem with dynamics (1). In particular, two key tools are defined. The first is a new non-negative operator W on the space P(T R n ), based on the Wasserstein distance and enjoying some of its properties. The idea is that W measures the cost of the minimizing transference plan on the fibers, among plans whose projections are optimal on the base space; the formal definition is given in Definition 17. If one assumes that V is Lipschitz from P(R n ), endowed with the Wasserstein distance, to P(T R n ), endowed with W, then there exists at least one solution to (1). The second tool is the definition of Dirac germs, that are specific choices of solutions to (1) for measures composed of Dirac deltas only.
Once a Dirac germ for (1) is fixed, for each initial measure there exists at most one solution to (1) that is compatible with the chosen germ. In some specific but relevant cases, the coupling of Lipschitz continuity of V with the choice of a compatible Dirac germ ensures both existence and uniqueness of a solution to (1).
The second contribution is given by sources and sinks. In this case, the dynamics reads
µ̇ t = s[µ t ], (2)
where s is a measure on the space R n , representing a source/sink of mass. The description of such a Partial Differential Equation with a fixed source s is very classical, since the solution is clearly µ t = µ 0 + ts. Instead, we introduced in [10] new conditions ensuring that the dynamics (2) is well posed even when the source s[µ] depends on the whole measure µ itself. The key tool is the introduction of a new distance on the space of measures with finite mass, called the generalized Wasserstein distance W g . If s is Lipschitz with respect to this distance, then one has existence and uniqueness of the solution to the Cauchy problem with dynamics (2).
For simplicity, from now on we restrict ourselves to the space M(R n ) of Borel measures with bounded support and finite mass. In this space, the generalized Wasserstein distance W g (µ, ν) is always finite, while the standard Wasserstein distance W (µ, ν) is defined only if the masses of the two measures coincide, i.e. µ(R n ) = ν(R n ). We endow the space M(R n ) with the topology of weak convergence; it coincides with the topology induced by the generalized Wasserstein distance, see Proposition 11 below.
We are now ready to define Measure Differential Equations with Source:
µ̇ t = V [µ t ] + s[µ t ], (3)
where V [µ] is a PVF V : M(R n ) → M(T R n ) and s[µ] is a source s : M(R n ) → M(R n ). The goal is to prove existence and/or uniqueness of a solution to the associated Cauchy problem, under the joint hypotheses ensuring existence and/or uniqueness for each of the dynamics (1) and (2).
More precisely, we first give the definition of a solution to (3).
Such a definition is rather weak and does not allow for uniqueness results; thus we are also interested in stronger properties of solutions to (3). In particular, we focus on the existence of semigroups of solutions, whose definition in this setting is given below.
2. the map t → S t µ is a solution to (3);
3. for every R, M > 0 there exists C = C(R, M ) > 0 such that if supp(µ) ∪ supp(ν) ⊂ B(0, R) and µ(R n ) + ν(R n ) ≤ M , then it holds:
(a) supp(S t µ) ⊂ B(0, e Ct (R + M + 1));
We also need to define a natural tool, merging the properties of the operator W on P(T R n ) with the setting of the generalized Wasserstein distance W g on M(R n ). This non-negative operator, which we denote by W g , measures the minimal standard Wasserstein distance on the fibers among transference plans whose projections give a minimizing decomposition for the generalized Wasserstein distance on the base space. The operator is precisely defined in Section 2.5.
We are now ready to state the two main results of this article. The first deals with the existence of a solution to (3), while the second focuses on uniqueness.
(V1) support sublinearity: there exists C > 0 such that for all µ ∈ M(R n ) it holds
(s) The source s : M(R n ) → M(R n ) satisfies:
(s1) Lipschitz continuity: there exists L such that for all µ, ν ∈ M(R n ) it holds
Several corollaries about existence and/or uniqueness of solutions to (3) can be derived directly from the corresponding results about PVFs in [9]. In particular, one can observe that the uniqueness property depends on the PVF V only, and not on the source s. We then have the following two remarkable cases: with v a locally Lipschitz vector field with sublinear growth. Then, (3) admits a unique Lipschitz semigroup, obtained as the limit of the discretization described in Section 3.1.
is the cumulative distribution of µ, and λ is the Lebesgue measure. This choice of the PVF allows solutions that diffuse with finite velocity; see [9, Section 7.1] for more details.
In this case, for any choice of the source s satisfying (s), one has existence of a solution to (3). Even though this solution is in general not unique, there exists a unique semigroup obtained as the limit of the discretization algorithm described in Section 3.1.
The structure of the article is the following. In Section 2 we fix the notation and recall the main properties of the tools used later: the Wasserstein distance, the generalized Wasserstein distance, and Measure Differential Equations with Probability Vector Fields. In the main Section 3, we prove the results of this paper: in Section 3.1 we prove Theorem 3 about existence of a solution to (3), while in Section 3.2 we prove Theorem 4 about uniqueness.

Dynamics in generalized Wasserstein Spaces
In this section, we fix the notation and define the main tools used in the rest of the article: the Wasserstein distance, the generalized Wasserstein distance and Measure Differential Equations with Probability Vector Fields.

The Wasserstein distance
We use M(R d ) to denote the space of positive Borel regular measures with bounded support and finite mass on R d . Given two Radon measures µ, µ 1 (i.e. positive Borel measures with locally finite mass), we write µ 1 ≪ µ if µ 1 is absolutely continuous with respect to µ, while we write µ 1 ≤ µ if µ 1 (A) ≤ µ(A) for every Borel set A. We denote by |µ| := µ(R d ) the norm of µ (also called its mass). More generally, if µ = µ + − µ − is a signed Borel measure, we define |µ| := |µ + | + |µ − |.
Given a Borel map γ : R d → R d , the push-forward of a measure µ ∈ M(R d ) is defined by
γ#µ(A) := µ(γ −1 (A)) for every Borel set A ⊂ R d .
Note that the mass of µ is identical to the mass of γ#µ. Therefore, given two measures µ, ν with the same mass, one may look for a map γ such that ν = γ#µ and which minimizes the cost
J [γ] = ∫ R d |x − γ(x)| p dµ(x).
This means that each infinitesimal mass δµ is sent to δν and that its infinitesimal cost is the p-th power of the distance between them. Such a minimization problem is known as the Monge problem. A generalization of the Monge problem is defined as follows. Given a probability measure π on R d × R d , one can interpret π as a method to transfer a measure µ on R d to another measure on R d : each infinitesimal mass at a location x is sent to a location y with a probability given by π(x, y). Formally, µ is sent to ν if the following properties hold:
π(A × R d ) = µ(A), π(R d × B) = ν(B), for all Borel sets A, B ⊂ R d . (7)
Such a π is called a transference plan from µ to ν, and we denote the set of transference plans from µ to ν by P (µ, ν). A condition equivalent to (7) is that, for all f, g ∈ C 0 c (R d ), it holds ∫ (f (x) + g(y)) dπ(x, y) = ∫ f dµ + ∫ g dν. One can define a cost for π as
J [π] = ∫ R d ×R d |x − y| p dπ(x, y),
and look for a minimizer of J in P (µ, ν). Such a problem is called the Monge-Kantorovich problem. It is important to observe that this problem is a generalization of the Monge problem. The main advantage of this approach is that a minimizer of J in P (µ, ν) always exists. We then denote by P opt (µ, ν) the set of transference plans minimizing J, which is always non-empty.
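For discrete measures, the push-forward and the Monge cost reduce to elementary operations on atoms. The following sketch (the helper names `push_forward` and `monge_cost` are ours, purely illustrative, not part of the paper's framework) checks that the push-forward preserves the mass and computes the Monge cost of a translation:

```python
import numpy as np

def push_forward(points, masses, gamma):
    """Push-forward gamma#mu of a discrete measure mu = sum_i m_i delta_{x_i}:
    atoms move to gamma(x_i), masses are unchanged."""
    return gamma(points), masses

def monge_cost(points, masses, gamma, p=1):
    """Cost of the transport map gamma: integral of |x - gamma(x)|^p dmu."""
    return float(np.sum(masses * np.abs(points - gamma(points)) ** p))

# mu = delta_0 + 2*delta_1, pushed forward by gamma(x) = x + 3
x = np.array([0.0, 1.0])
m = np.array([1.0, 2.0])
gamma = lambda t: t + 3.0

y, my = push_forward(x, m, gamma)
assert my.sum() == m.sum()          # the mass of gamma#mu equals that of mu
print(monge_cost(x, m, gamma))      # 9.0 = (1 + 2) * 3
```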
One can thus define on M(R d ) the following operator between measures of the same mass, called the Wasserstein distance:
It is indeed a distance on the subspace of measures in M(R d ) with a given mass, see [12]. It is easy to prove that W p (kµ, kν) = W p (µ, ν) for k ≥ 0, by observing that P (kµ, kν) = P (µ, ν) and that J [π] does not depend on the mass.
From now on, we only consider the Wasserstein distance with parameter p = 1, that will then be denoted by W (µ, ν). It satisfies the following fundamental dual property.
Such a property plays a crucial role in the theory of PVFs, see [9]. It is then unclear whether a corresponding theory can be generalized to p > 1.
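For discrete measures of equal mass, the Monge-Kantorovich problem is a finite linear program over transference plans: minimize Σ ij π ij |x i − y j | subject to the marginal constraints (7). A minimal sketch for p = 1 (the function `wasserstein1` is ours, for illustration only), using scipy's LP solver:

```python
import numpy as np
from scipy.optimize import linprog

def wasserstein1(x, mu, y, nu):
    """W_1 between discrete measures of equal mass on R, via the
    Monge-Kantorovich linear program over transference plans pi."""
    assert abs(mu.sum() - nu.sum()) < 1e-12
    n, m = len(x), len(y)
    cost = np.abs(x[:, None] - y[None, :]).ravel()   # c_ij = |x_i - y_j|
    # Marginal constraints: sum_j pi_ij = mu_i and sum_i pi_ij = nu_j.
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0
    for j in range(m):
        A_eq[n + j, j::m] = 1.0
    b_eq = np.concatenate([mu, nu])
    res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.fun

x = np.array([0.0, 1.0]); mu = np.array([0.5, 0.5])
y = np.array([2.0]);      nu = np.array([1.0])
print(wasserstein1(x, mu, y, nu))   # 0.5*2 + 0.5*1 = 1.5
```

Here the only admissible plan sends half the mass from 0 and half from 1 to the single target atom at 2, so the minimizer is unique.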

The generalized Wasserstein distance
In this section, we give the definition of the generalized Wasserstein distance, introduced in [10,11], together with some useful properties. To simplify the notation, we consider here the generalized Wasserstein distance with parameters a = b = 1 and p = 1.
We now provide some properties of W g . Proofs can be adapted from those given in [10].

The infimum in
6. If |µ| = |ν|, it holds
We now recall some useful topological results about the metric space M(R d ) endowed with the generalized Wasserstein distance. We first recall the definition of tightness in this context.
We now recall the definition of weak convergence of measures, as well as an important result about convergence with respect to the generalized Wasserstein distance, see [10,Theorem 13].
Definition 10 Let {µ n } be a sequence of measures on R d , and µ a measure. We say that µ n converges to µ with respect to the weak topology, and we write µ n ⇀ µ, if for all functions f ∈ C ∞ c it holds lim n→∞ ∫ f dµ n = ∫ f dµ.
We also recall the result of completeness, see [10,Proposition 15].

Proposition 12
The space M(R d ) endowed with the distance W g is a complete metric space.
The generalized Wasserstein distance also satisfies a useful dual formula, showing that it coincides with the so-called flat distance. See [11].
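For discrete measures, the dual (flat-distance) formulation can be solved directly as a linear program over the values of f at the atoms, with the box constraint ‖f ‖ ∞ ≤ 1 and pairwise Lipschitz constraints. A sketch under these assumptions (the helper `flat_distance` is purely illustrative, for the case a = b = 1, p = 1):

```python
import numpy as np
from scipy.optimize import linprog

def flat_distance(x, mu, y, nu):
    """Generalized Wasserstein / flat distance (a = b = 1, p = 1) between
    discrete measures on R, via the dual formulation:
    sup { int f d(mu - nu) : |f| <= 1, Lip(f) <= 1 }."""
    pts = np.concatenate([x, y])
    signed = np.concatenate([mu, -nu])       # weights of mu - nu at each atom
    n = len(pts)
    # Lipschitz constraints: f_i - f_j <= |p_i - p_j| for all pairs.
    rows, rhs = [], []
    for i in range(n):
        for j in range(n):
            if i != j:
                row = np.zeros(n); row[i], row[j] = 1.0, -1.0
                rows.append(row); rhs.append(abs(pts[i] - pts[j]))
    # linprog minimizes, so negate the objective; bounds enforce |f| <= 1.
    res = linprog(-signed, A_ub=np.array(rows), b_ub=np.array(rhs),
                  bounds=(-1, 1))
    return -res.fun

# Removing mass is cheaper than moving it far: W_g(delta_0, delta_5) = 2.
print(flat_distance(np.array([0.0]), np.array([1.0]),
                    np.array([5.0]), np.array([1.0])))
```

On δ 0 and δ 5 the optimizer removes both unit masses at total cost 1 + 1 = 2, which is cheaper than transporting over distance 5; for δ 0 and δ 0.5 transport wins and the value is 0.5.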
We recall that the L 1 distance satisfies a dual formula too, that is
|µ − ν| = sup { ∫ R d f d(µ − ν) : f ∈ C(R d ), ‖f ‖ ∞ ≤ 1 } .
We also have the following useful estimate to bound integrals, see [10].
It then holds
We end this section by giving useful estimates, both for the standard and generalized Wasserstein distances W p and W g , under flow actions. Proofs are given in [10,11].
Proposition 15 Let v t , w t be two time-varying vector fields, uniformly Lipschitz with respect to the space variable, and let φ t , ψ t be the flows generated by v, w respectively. Let L be the Lipschitz constant of v and w, i.e. |v t (x) − v t (y)| ≤ L|x − y| for all t, and similarly for w. Let µ, ν ∈ M(R d ). We have the following estimates for the standard Wasserstein distance
We have the following estimates for the generalized Wasserstein distance
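In the simplest one-dimensional case, the estimate W (φ t #µ, φ t #ν) ≤ e Lt W (µ, ν) can be checked numerically: for the linear vector field v(x) = Lx, the flow φ t (x) = e Lt x is exactly e Lt -Lipschitz and monotone, so the bound is attained with equality. A hypothetical check (illustrative only) using scipy's one-dimensional Wasserstein distance:

```python
import math
import numpy as np
from scipy.stats import wasserstein_distance

L, t = 0.7, 1.3
flow = lambda x: math.exp(L * t) * x          # flow of v(x) = L*x

# Two discrete probability measures on R.
x = np.array([0.0, 1.0, 2.0]); mu = np.array([0.2, 0.5, 0.3])
y = np.array([0.5, 1.5]);      nu = np.array([0.6, 0.4])

w_before = wasserstein_distance(x, y, mu, nu)
w_after = wasserstein_distance(flow(x), flow(y), mu, nu)

# For this linear, increasing flow the estimate holds with equality,
# since one-dimensional quantile functions are simply rescaled.
assert abs(w_after - math.exp(L * t) * w_before) < 1e-9
print(w_before, w_after)
```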

Measure Differential Equations with Probability Vector Fields
In this section, we summarize the main results and tools about PVFs, introduced in [9]. We slightly enlarge the setting of [9], since we consider general measures with finite mass and not only probability measures.
We first recall the definition of a solution to the Cauchy problem
µ̇ t = V [µ t ], µ(0) = µ 0 . (15)
Definition 16 Fix a final time T > 0. A solution to (15) is a map µ : [0, T ] → M(R n ) such that µ(0) = µ 0 and the following holds:
• for every f ∈ C ∞ c (R n ), the map t → ∫ f dµ(t) is absolutely continuous, and for almost every t ∈ [0, T ] it satisfies
d/dt ∫ R n f dµ(t) = ∫ T R n (∇f (x) · v) dV [µ(t)](x, v). (16)
We now recall the definition of the pseudo-distance W, which will be useful in the following.
Denote by µ 1 = π 1 #V 1 and µ 2 = π 1 #V 2 the projections of the PVFs on the base space. Define
Clearly, such a functional is not a distance, see examples in [9]. Nevertheless, we will see in the following that the local Lipschitz condition (V2) ensures existence of solutions to (15). Observe that it also holds
See [9] for more details.
We now address the problem of existence of solutions to (15). The idea developed in [9] is to define a semigroup of solutions as the limit of approximated ones. We first describe precisely the discretization method, which will also be useful in the following.
Define M x N ⊂ M(R n ) as the space of measures on R n supported on the set of points x i , and M v N ⊂ M(R 2n ) as the space of measures on R 2n supported on the set of points (x i , v j ). Define the discretization operator in the velocity variable
The first property of such a discretization is that it introduces an arbitrarily small error in the Wasserstein distance.
Proof. The proof for µ and V [µ] being probability measures is given in [9]. The generalization to measures with finite mass is straightforward.
One can then define an approximated solution (called the Lattice Approximate Solution) to (15) via an explicit Euler scheme.
Definition 20 Given the Cauchy problem (15), we define the following Lattice Approximate Solution µ N :
We are now ready to state the existence of a solution to (15) as a limit of the Lattice Approximate Solutions introduced above.
Then, there exists a Lipschitz semigroup of solutions to (15), obtained as the uniform-in-time limit of Lattice Approximate Solutions with respect to the Wasserstein metric.
Proof. The first key observation is that both A x N and A v N preserve the mass for N sufficiently large, i.e. A x N (µ)(R n ) = µ(R n ), and similarly for the PVF. As a consequence, the mass of µ N (t) coincides with that of µ N (0), which in turn coincides with that of µ 0 for N sufficiently large.
If µ 0 (R n ) = 1, then the whole sequence µ N (t) lies in P c (R n ), and one can apply the proof of [9, Theorem 4.1]. Otherwise, rescale the mass by defining ν N (t) = (1/µ 0 (R n )) µ N (t), apply the previous case to define ν(t), and prove that µ(t) = µ 0 (R n )ν(t) is a solution to (15).
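One explicit Euler step of the scheme of Definition 20 splits each atom of the discretized measure along the discretized velocities prescribed by the PVF, and transports the resulting fractions for a time ∆ N . The sketch below is a simplified illustration (it omits the re-projection onto the lattices M x N , M v N , and the PVF is a toy choice, not from the paper): a PVF dividing all mass evenly between velocities ±1 produces the finite-speed spreading mentioned in the introduction.

```python
import numpy as np

def euler_step(atoms, dt, pvf):
    """One explicit Euler step for the PVF dynamics: each atom (x, m) is split
    along the velocities prescribed by V[mu](x, .) and transported for time dt.
    `atoms` is a list of (position, mass) pairs; `pvf` maps the current measure
    and a position to a list of (velocity, fraction) pairs with the fractions
    summing to 1 (the projection condition pi#V[mu] = mu)."""
    new_atoms = []
    for x, m in atoms:
        for v, frac in pvf(atoms, x):
            new_atoms.append((x + dt * v, m * frac))
    return new_atoms

# Toy PVF splitting mass evenly along velocities -1 and +1.
split = lambda mu, x: [(-1.0, 0.5), (1.0, 0.5)]

atoms = [(0.0, 1.0)]                  # mu_0 = delta_0
for _ in range(3):
    atoms = euler_step(atoms, 0.1, split)

print(sorted(atoms))                  # 8 atoms spreading at finite speed
assert abs(sum(m for _, m in atoms) - 1.0) < 1e-12   # mass is conserved
```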
We now recall the definition of Dirac germs, which permits us to address the problem of uniqueness of solutions to (15). We also give the definition of a semigroup compatible with the germ.
We are now ready to prove the main result about uniqueness of solutions to (15).
Theorem 24 Consider a PVF satisfying (V1) and fix a Dirac germ γ. There exists at most one Lipschitz semigroup S t of solutions to (15) compatible with γ.
Proof. First observe that the uniform boundedness of the support and the weak formulation (16), when choosing f ≡ 1 on ∪ t∈[0,T ] supp(µ(t)), imply that the mass µ(t)(R n ) is constant along trajectories of (1). Thus, the Dirac germ satisfies conservation of mass too. Apply now the proof of Theorem 5.1 in [9] for initial data being a probability measure, with the Dirac germ restricted to probability measures. For initial data with general finite mass, apply the rescaling trick described in the proof of Theorem 21, both to the initial data and to the Dirac germ.

Measure Equations with sources
In this section, we briefly study the measure equation with source
µ̇ t = s[µ t ]. (20)
The goal is to prove that condition (s) in Theorem 3 ensures existence and uniqueness of a solution to (20). This is indeed a particular case of a more general result, stated in [10], in which a transport term is added too. For future use, we prove the statement with the same discretization method of the Lattice Approximate Solutions introduced in Definition 18.
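The discretized scheme is an explicit Euler iteration of the form µ N ((k + 1)∆ N ) = µ N (k∆ N ) + ∆ N s[µ N (k∆ N )] (up to the lattice projection). As a minimal numerical sketch, with a hypothetical linear sink s[µ] = −λµ that satisfies the Lipschitz condition (s1) and no spatial discretization, the scheme reproduces the expected exponential decay of the total mass:

```python
import math

def euler_source(masses, dt, steps, source):
    """Explicit Euler scheme for mu' = s[mu], acting on the atom masses of a
    discrete measure with fixed atom positions."""
    for _ in range(steps):
        masses = [m + dt * source(m) for m in masses]
    return masses

lam = 0.5
sink = lambda m: -lam * m             # hypothetical linear sink s[mu] = -lam*mu

T, N = 1.0, 1000
dt = T / N
masses = euler_source([1.0, 2.0], dt, N, sink)
total = sum(masses)
# The total mass approximates 3 * exp(-lam * T) as N grows.
assert abs(total - 3.0 * math.exp(-lam * T)) < 1e-2
print(total)
```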
We also define the time-interpolated solution for t ∈ [0, ∆ N ] as follows:
Proof. We first prove existence of a solution, based on the Lattice Approximate Solution. We prove that µ N is a sequence of equi-Lipschitz and equi-bounded curves in C 0 ([0, T ], M(X)), where X is a compact subset of R n and the space M(X) is endowed with the generalized Wasserstein distance W g . For τ, σ ∈ [0, ∆ N ] it holds
We are then left to prove that s[µ N (k∆ N )] is uniformly bounded for k∆ N ∈ [0, T ]. It is sufficient to observe that (10) and hypothesis (s1), together with (22), imply
We now prove that there exists R′ such that supp(µ N (t)) ⊂ B(0, R′) for all N and t ∈ [0, T ]. Enlarging if necessary the radius R given in hypothesis (s2), one can assume that supp(µ 0 ) ⊂ B(0, R). Thus, the approximation operator A x N satisfies supp(A x N (µ 0 )) ⊂ B(0, R + 1), as well as supp(A x N (s[µ])) ⊂ B(0, R + 1) for any µ ∈ M(R n ). Since the support of a sum of measures is contained in the union of their supports, one can easily prove by induction that the measures µ N (k∆ N + τ ) defined by the scheme (21) all have support contained in B(0, R′) with R′ = R + 1.
Choose now X = B(0, R′), which is a compact space. Then, M(X) is complete when endowed with the generalized Wasserstein distance W g , see Proposition 12. The sequence µ N is equi-Lipschitz in M(X), due to (22), and equi-bounded, since the masses are equi-bounded. Then, there exists a subsequence converging to some µ * ∈ C 0 ([0, T ], M(X)). Recall that convergence with respect to W g coincides with weak convergence of measures.
We now prove that µ * is a solution to (20). We first observe that W g (µ N (0), µ 0 ) ≤ |µ 0 |∆ x N for N sufficiently large, which implies µ * (0) = µ 0 . We now prove that for all f ∈ C ∞ c (R n ) and τ, σ ∈ [0, T ] with σ > τ , it holds
The definition (21) implies that, for σ, τ ∈ [0, ∆ N ], it holds
We then have
where k t N is the largest integer such that t ≥ k∆ N , i.e. k t N = ⌊t/∆ N ⌋. The first two terms converge to zero since µ N ⇀ µ * , while the last term is identically zero due to (24). For the third term, observe that it holds
where we used condition (s1) about the Lipschitz continuity of s, as well as the dual formulation (14) for W g . The proof now follows by observing that both terms on the right hand side of (26) converge to zero: the first satisfies
with the constant K = e LT |s[µ 0 ]| given by (22). The second converges to zero since W g metrizes weak convergence.
Remark 27 One might require the minimization of the functional |V 1 −Ṽ 1 | + ∫ T R n ×T R n |v−w| dp(x, v, y, w) + |V 2 −Ṽ 2 | in (28), which seems closer to the definition of W g . Nevertheless, recall that |V i −Ṽ i | = |µ i −μ i |. As a consequence, when the choice of the minimizer for W g (µ 1 , µ 2 ) is unique, there is no difference between the minimization of the two functionals. When minimizers for W g (µ 1 , µ 2 ) are not unique, this would introduce two additional terms |V i −Ṽ i | on the right hand side of (29), thus providing a less restrictive inequality.
Moreover, the chosen definition of W g is the correct one to prove existence of a solution to (3), which is the main goal of this paper. This definition is indeed the natural one in the estimate (51) below.
When two measures µ 1 , µ 2 have the same mass |µ 1 | = |µ 2 | and sufficiently near supports, one can chooseμ i = µ i among the minimizers of W g (µ 1 , µ 2 ). Moreover, if µ 1 ⊥ µ 2 , i.e. the two measures share no mass, such a choice is the unique minimizer. In this case, the choiceṼ i = V i is unique too, and the operator W g coincides with W.
Similarly to W, the operator W g is not a distance: the same counterexamples, with sufficiently near supports and no shared mass, can be found. For example, choose µ 1 = δ 0 , µ 2 = δ ε , V 1 = δ 0 ⊗ δ 0 , V 2 = δ ε ⊗ δ 0 , with ε > 0 sufficiently small. The unique minimizer in (28) is given byμ
Again, similarly to the estimate (17) between the generalized Wasserstein distance and the operator W, one can easily prove the following proposition.
Observe now thatV 1 ,V 2 is an admissible decomposition to estimate
i.e. the decompositionμ 1 ,μ 2 realizes the minimum in W g (µ 1 , µ 2 ). Since π 13 #q ∈ P opt (μ 1 ,μ 2 ), by the contradiction hypothesis it holds
This contradicts the definition of W g as the infimum of the functional ∫ T R n ×T R n |v − w| dp(x, v, y, w).

Proof of the main theorems
In this section, we prove the main results of this article, that are Theorem 3 about existence of solutions to (3) and Theorem 4 about uniqueness.

Existence -Proof of Theorem 3
In this section, we prove Theorem 3 about existence of solutions to (3). The idea is to define the Lattice Approximate Solutions described in Definition 20, and then pass to the limit. This procedure proved useful for each term of (3) separately, namely the PVF studied in (1) and the source in (2).
Proof of Theorem 3. We first fix an initial datum µ 0 and prove the existence of a solution to (3) with initial datum µ 0 , having bounded support and Lipschitz with respect to time. This corresponds to proving Properties 1-2-3a-3c in Definition 2 of semigroups for (3). We will then prove Property 3b.
Fix an initial datum µ 0 . For each N , define the following approximated solution µ N , based on the discretization in Definition 18: set µ N (0) := A x N (µ 0 ), then recursively
Also define the interpolated measure for τ ∈ [0, ∆ N ] as follows:
Clearly, the first term on the right hand side corresponds to the transport by the PVF V , while the second term corresponds to the source term s. We now prove that the sequence µ N (t) is equi-bounded and equi-Lipschitz on a complete space, in order to apply the Ascoli-Arzelà theorem.
We first prove that, for a fixed T > 0, the measures µ N (t) are all supported in a compact set. Choose the radius R in hypothesis (s2) giving the maximal support of s[µ]. Then, eventually enlarging R, one can assume that supp(µ(0)) ⊂ B(0, R). This implies supp(µ N (0)) ⊂ B(0, R + n 2 ∆ x N ) ⊂ B(0, R + 1), where n is the dimension of the space R n and N is chosen sufficiently large. Observe the following simple estimate: if supp(µ N (k∆ N )) ⊂ B(0, r) with r > R + 1, then it holds supp(µ N (k∆ N + τ )) ⊂ B(0, r + ∆ N C(1 + r)) for τ ∈ (0, ∆ N ). Indeed:
• for each pair i, j in the first term, Hypothesis (V1) gives |v j | ≤ C(1 + |x i |), hence supp(δ xi +τ vj ) ⊂ B(0, r + ∆ N C(1 + r));
• for the second term it holds supp(A x N (s[µ N (k∆ N )])) ⊂ B(0, R + 1).
Since the support of a sum of measures is contained in the union of the supports, it holds supp(µ N (k∆ N + τ )) ⊂ B(0, r + ∆ N C(1 + r)). Eventually replacing r by max {1, r}, it holds by induction supp(µ N (k∆ N + τ )) ⊂ B(0, r(1 + 2C∆ N ) k+1 ). Since k ≤ T /∆ N + 1, this implies supp(µ N (t)) ⊂ B(0, r(1 + 2C) 2 e 2CT ) for all t ∈ [0, T ]. Observe that the space X = B(0, r(1 + 2C) 2 e 2CT ) is compact. Then, the space M(X) of measures with finite mass endowed with the generalized Wasserstein distance W g is complete, see Proposition 12. Moreover, if we prove that the limit of a subsequence of µ N exists, then it satisfies Property 3a in Definition 2.
We now prove that the sequences µ N (t) are equi-Lipschitz in time with respect to the distance W g . We also prove that the masses |µ N (t)| are uniformly bounded. First observe that the operator µ → A x N (µ) does not increase the mass of µ. The same property holds for the operator µ → Σ i,j m v ij (V [µ])δ xi+τ vj . Then, by the explicit expressions (30)-(31) for µ N (t), it holds
for τ, σ ∈ [0, ∆ N ]. Here, we estimated the generalized Wasserstein distance by decomposing it into the Wasserstein distance for the transport term given by the PVF V and the L 1 distance for the source term given by s. Use now (10) to estimate |µ N (k∆ N )| ≤ |µ 0 | + W g (µ N (k∆ N ), µ 0 ). Use the uniform boundedness of the supports of µ N (k∆ N ) and Hypothesis (V1) to estimate sup j |v j | ≤ C 1 := C(1 + diam(X)). Also use Hypothesis (s1) to estimate W g (s[µ N (k∆ N )], s[µ 0 ]) ≤ LW g (µ N (k∆ N ), µ 0 ). This gives
where C 2 is a constant depending on |µ 0 | and X only, i.e. independent of N . We now prove that W g (µ N (k∆ N ), µ 0 ) is bounded, uniformly in N, k. Observe that the sequence µ N (0), for N sufficiently large, satisfies W g (µ N (0), µ 0 ) ≤ W (µ N (0), µ 0 ) ≤ |µ 0 |∆ N , as a consequence of (11) and Proposition 19. Thus, there exists a constant C 3 such that W g (µ N (0), µ 0 ) ≤ C 3 for all N . We now prove the estimate
by induction on k. The case k = 0 is already proved. We now prove that, if the estimate holds for k, then it holds for k + 1 too. Use (32) with τ = 0, σ = ∆ N , which gives
The estimate (33) is now proved. Since k ≤ T /∆ N , it also holds W g (µ N (k∆ N ), µ 0 ) ≤ C 4 := e C2T C 3 + (e C2T − 1). Then, again by (32) and the triangular inequality, it holds
with C 5 := C 2 C 4 + C 2 , i.e. uniform Lipschitz continuity with respect to t of the sequence µ N . Moreover, (10) also implies |µ N (t)| ≤ C 6 := |µ 0 | + T C 5 , hence uniform boundedness of the mass.
As a consequence, the Ascoli-Arzelà theorem implies the existence of a converging subsequence of µ N , which we do not relabel. The limit µ * (t) clearly satisfies Property 1 and Property 3c in Definition 2. We now prove Property 2, i.e. the fact that the limit is a solution to (3). Since the limit is uniformly Lipschitz with bounded mass, it is easy to prove that the first two properties in Definition 1 are satisfied, as well as the fact that the function t → ∫ R n f dµ(t) is absolutely continuous (and even Lipschitz) for each f ∈ C ∞ c (R n ). We are left to prove that the limit µ * satisfies (4) for each f ∈ C ∞ c (R n ) and almost every t ∈ [0, T ]. For a fixed f ∈ C ∞ c (R n ), define the operator F N for τ, σ ∈ [0, T ] as follows:
For each τ, σ ∈ [k∆ N , (k + 1)∆ N ], the first term on the right hand side of (35) coincides with
Define
By applying Taylor's theorem with Lagrange remainder to each g ij , there exist α ij ∈ (0, 1) such that (37) coincides with
where Hf is the Hessian of f . We now estimate F N (τ, σ) with τ, σ ∈ [k∆ N , (k + 1)∆ N ]. It holds
The first term on the right hand side of (38) is bounded from above by ∫ σ τ I 1 (t) dt + I 2 (τ, σ), where
The second term on the right hand side of (38) is bounded from above by
By the duality formulas (14)-(13) for the generalized Wasserstein and the L 1 distances, the Lipschitz condition (34) and the estimate |t − k∆ N | ≤ ∆ N , for sufficiently large N it holds
We recall that the support of the velocities is bounded (sup j |v j | ≤ C 1 ), and that the projection condition π#V [µ] = µ implies |V [µ]| = |µ|. Similarly, for I 2 (τ, σ), by removing and adding the term (τ −k∆ N ) 2 Hf (x i + α ij σv j ), it holds
where · R n ,R n is the operator norm. Apply the mean-value theorem to Hf to obtain
Recall that σ − k∆ N , τ − k∆ N ≤ ∆ N , as well as α ij ∈ (0, 1). It then holds
Finally, for I 3 (t) it holds
Observe that (10) implies
Merging (39)-(40)-(41)-(42), it then holds
for a suitable constant C 7 .
Take now a general pair τ, σ ∈ [0, T ]. For simplicity, assume that τ < σ and that N is sufficiently large to have
Our aim is to prove that it holds lim σ→τ L = 0, with
Observe that, restricting ourselves to the converging subsequence µ N ⇀ µ * , it holds
We used here the continuity of both V and s with respect to weak convergence of measures, as a consequence of (29) and (V2) for V , and of (s1) for s. Observe that L coincides with
We finally prove Property 3b in Definition 2. Take two different initial data µ 0 , ν 0 and build the Lattice Approximate Solutions µ N , ν N according to the scheme (30)-(31). Assume N to be sufficiently large so that it holds
for all k ∈ N with k ≤ T /∆ N , and similarly for ν N . Such an N exists for two reasons: first, µ N , ν N have uniformly bounded supports (Property 3a proved above), thus for N sufficiently large A x N conserves the mass. Second, observe that µ N (0) satisfies (43) by construction, and that if µ N (k∆ N ) satisfies it, then µ N ((k + 1)∆ N ) satisfies it too. Then, by induction, this holds for all k.
Similarly, we assume N to be sufficiently large to have
and similarly for ν N . This is based first on the fact that the uniformly bounded supports of the measures µ N t (Property 3a proved above), together with the support sublinearity (V1) of V , imply uniform boundedness of the supports of the PVFs V [µ N t ]; thus A v N conserves the mass for N sufficiently large. Moreover, the support of π 1 #A v N (V [µ N (k∆ N )]) coincides with that of A x N (µ N (k∆ N )), since π 1 #V [µ] = µ and the discretization (18) has the same effect as A x N on the base space. Then, (43) implies (44).
Consider now the following corresponding decomposition: write µ N (k∆ N ) = Σ i m i δ xi + Σ j n j δ yj and ν N (k∆ N ) = Σ l p l δ z l + Σ j n j δ tj , whereμ 1 = Σ j n j δ yj ,μ 2 = Σ j n j δ tj , and the optimal transference plan π 13 #p sends each n j δ yj to n j δ tj . Decompose accordingly the PVF as follows:
and similarly, with the additional requirement that the optimal transference plan p ∈ P (Ṽ 1 ,Ṽ 2 ) sends n jk δ (yj ,v k ) to n jk δ (tj ,v k ) . By definition of µ N , it then holds µ N ((k + 1)∆ N ) = Σ ik m ik δ xi+∆ N v k + Σ jk n jk δ yj +∆ N v k , and similarly for ν N ((k + 1)∆ N ). Estimate the distance W g (µ N ((k + 1)∆ N ), ν N ((k + 1)∆ N )) by choosing the first component for mass removal and the second one for transport. It then holds
We now need to compare
Denote byṼ 1 ,Ṽ 2 the decomposition and by p ∈ P (Ṽ 1 ,Ṽ 2 ) the transference plan in (28) realizing
that is a finite sum of Dirac deltas, and the same holds for π 1 #Ṽ 2 . Thus, p can be decomposed as
where each p ik is a transference plan on T xi R n × T y k R n .
We are now ready to define a decompositionV 1 ≤ V 1 ,V 2 ≤ V 2 and a transference plan q ∈ P (V 1 ,V 2 ) to estimate W g (V 1 , V 2 ) from above. For each transference plan p ik , define q ik := Σ jl p ik ((v j + Q ) × (w l + Q ))δ vj ,w l , where the v j , w l are the equispaced discretization points on T x R n , T y R n , respectively, and Q is defined in Definition 18. Then define q := Σ i,k q ik δ xi,y k . (48)
Define nowV 1 := π 12 #q. By the definition of A v N in (18) and of q in (48), it is easy to prove that it holds
One can analogously prove that it holdsV 2 := π 34 #q ≤ V 2 . Moreover, it holds π 13 #q = π 13 #p ∈ P opt (µ N (k∆ N ), ν N (k∆ N )), where we used that p is a minimizer of W g (V [µ N (k∆ N )], V [ν N (k∆ N )]). Then, the decompositionV 1 ,V 2 with the transference plan q is admissible on the right hand side of (28). It then holds
We estimate each term by following the definition of q ik , as follows
Here, we used that |v j − w l | ≤ |v j − v| + |v − w| + |w − w l | for any v ∈ v j + Q and w ∈ w l + Q . Observe now that, by the decomposition, it holds Σ jl ∫ (|v − w| + 2 diam(Q )) d(p ik )| ((vj +Q )×(w l +Q )) = ∫ |v − w| dp ik + 2 diam(Q )|p ik |.
We now use a density argument to pass to the limit. Consider the countable set
Choose µ 0 ∈ D, define the corresponding sequence µ N 0 , and choose a subsequence N k such that µ N k 0 converges uniformly to a solution µ(t) of (3). Then choose µ 1 ∈ D, consider the subsequence µ N k 1 , and extract a converging subsequence µ N k l 1 . Repeat this diagonal argument for the countable set of initial data in D, and observe that, passing to the limit in (52) as N → ∞, it holds W g (µ(t), ν(t)) ≤ e Kt W g (µ 0 , ν 0 ) for all µ 0 , ν 0 ∈ D. Observe now that D is dense in M(R n ); thus the continuous semigroup µ 0 → S t µ 0 = µ(t) can be uniquely extended from D to M(R n ).

Uniqueness
We now prove Theorem 4, i.e. uniqueness of the solution to (3) when a Dirac germ γ is fixed. We first need to define the compatibility of a semigroup with the dynamics (3), as follows.
Observe that this definition coincides with Definition 23, where the dynamics (1) is replaced by (3) and the metric W is replaced by W g . The proof of Theorem 4 is then based on the following lemma.