Long-time behavior of a fully discrete Lagrangian scheme for a family of fourth order equations

A fully discrete Lagrangian scheme for solving a family of fourth order equations numerically is presented. The discretization is based on the equations' underlying gradient flow structure with respect to the Wasserstein metric, and preserves many of their most important structural properties by construction, such as conservation of mass and entropy dissipation. In this paper, the long-time behavior of our discretization is analysed: We show that discrete solutions decay exponentially to equilibrium at the same rate as smooth solutions of the original problem. Moreover, we give a proof of convergence of discrete entropy minimizers towards Barenblatt profiles or Gaussians, respectively, using $Γ$-convergence.

1. Introduction. In this paper, we propose and study a fully discrete numerical scheme for a family of nonlinear fourth order equations of the type (1), with $u(0,\cdot) = u_0$ on $\mathbb{R}$ at initial time $t = 0$. The initial density $u_0 \ge 0$ is assumed to be compactly supported and integrable with total mass $M > 0$, and we further require strict positivity of $u_0$ on $\operatorname{supp}(u_0) = [a, b]$. For the sake of simplicity, let us further assume that $M = 1$. We are especially interested in the long-time behavior of discrete solutions and their rate of decay towards equilibrium. For the exponent in (1), we consider values $\alpha \in [\tfrac12, 1]$, and assume $\lambda \ge 0$. The most famous examples of parabolic equations described by (1) are the so-called DLSS equation for $\alpha = \tfrac12$ (first analysed by Derrida, Lebowitz, Speer and Spohn in [23,24], with application in semiconductor physics) and the thin-film equation for $\alpha = 1$. Indeed, references for other values of $\alpha$ are very rare in the literature, except for [44] by Matthes, McCann and Savaré. Since (1) is physically motivated (especially for $\alpha = \tfrac12$ and $\alpha = 1$), it is not surprising that solutions to (1) carry many structural properties, for instance nonnegativity, conservation of mass and dissipation of (several) entropy functionals. In Section 2, we list more properties of solutions to (1). For the numerical approximation of solutions to (1), it is hence natural to ask for structure-preserving discretizations that inherit at least some of those properties. A minimal criterion for such a scheme should be the preservation of nonnegativity, which can already be a difficult task if standard discretizations are used. So far, many (semi-)discretizations have been proposed in the literature, and most of them keep some basic structural properties of the equation.
Take for example [10,16,39,41], where positivity follows from Lyapunov functionals: a logarithmic/power entropy [10,16,39] or some variant of a (perturbed) information functional. But there are only a few examples where structural properties of (1) are inherited by the discretization by construction. A first approach in this direction was a fully Lagrangian discretization for the DLSS equation by Düring, Matthes and Milišić [25], which is based on its $L^2$-Wasserstein gradient flow representation and thus preserves nonnegativity and dissipation of the Fisher information. A similar approach was then applied in [46], again for the special case $\alpha = \tfrac12$, where we even showed convergence of our numerical scheme. As far as we know, this was the first convergence proof for a fully discrete numerical scheme for the DLSS equation that additionally dissipates two Lyapunov functionals.
1.1. Description of the numerical scheme. We are now going to present a scheme which is practical, stable and easy to implement. In fact, our discretization seems so mundane that, at first glance, one would not expect it to have any special properties. But we are going to show later in Section 2 that our numerical approximation can be derived as a natural restriction of an $L^2$-Wasserstein gradient flow in the potential landscape of the so-called perturbed information functional to a discrete Lagrangian setting, and thus preserves a deep structure. The starting point for our discretization is the Lagrangian representation of (1). Since each $u(t,\cdot)$ is of mass $M$, there is a Lagrangian map $X(t,\cdot) : [0,M] \to \mathbb{R}$, the so-called pseudo-inverse distribution function of $u(t,\cdot)$, such that $\xi = \int_{-\infty}^{X(t,\xi)} u(t,x)\,dx$ for each $\xi \in [0,M]$. Written in terms of $X$, the Wasserstein gradient flow for $F_{\alpha,\lambda}$ turns into an $L^2$-gradient flow for where $Z(t,\xi) := \frac{1}{\partial_\xi X(t,\xi)} = u(t, X(t,\xi))$.
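The identity $Z = 1/\partial_\xi X = u \circ X$ can be checked numerically. The following Python sketch is only an illustration and not part of the scheme: it assumes a standard Gaussian of unit mass as test density and uses `statistics.NormalDist` from the Python standard library, whose `inv_cdf` plays the role of the pseudo-inverse distribution function $X$. The function names and the finite-difference step `h` are hypothetical choices.

```python
from statistics import NormalDist

# Assumption: a standard Gaussian of unit mass (M = 1) serves as test density u.
u = NormalDist(mu=0.0, sigma=1.0)

def lagrangian_map(xi):
    """Pseudo-inverse distribution function X(xi) of u, for 0 < xi < 1."""
    return u.inv_cdf(xi)

def Z_finite_difference(xi, h=1e-5):
    """Z = 1 / (dX/dxi), with the derivative approximated by a central difference."""
    dX = (lagrangian_map(xi + h) - lagrangian_map(xi - h)) / (2.0 * h)
    return 1.0 / dX

def max_relative_error(points):
    """Largest relative gap between 1/X'(xi) and u(X(xi)) over the sample points."""
    worst = 0.0
    for xi in points:
        exact = u.pdf(lagrangian_map(xi))
        worst = max(worst, abs(Z_finite_difference(xi) - exact) / exact)
    return worst

sample_points = [0.1, 0.25, 0.5, 0.75, 0.9]
error = max_relative_error(sample_points)
```

For this test density the two expressions for $Z$ should agree up to the finite-difference error.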

HORST OSBERGER
We later show in Proposition 1 that the solvability of the system (6) is guaranteed. The above procedure (1)-(2) yields a sequence of monotone vectors $\vec{x}_\Delta := (\vec{x}^0_\Delta, \vec{x}^1_\Delta, \ldots, \vec{x}^n_\Delta, \ldots)$, and any entry $\vec{x}^n_\Delta$ defines a spatial decomposition of the compact interval $[x^n_0, x^n_K] \subset \mathbb{R}$, $n \in \mathbb{N}$. Fixing $k = 0, \ldots, K$, the sequence $n \mapsto x^n_k$ defines a discrete temporal evolution of spatial grid points in $\mathbb{R}$, and if one assigns to each interval $[x^n_{k-1}, x^n_k]$ a constant mass package $\delta$, the map $n \mapsto [x^n_{k-1}, x^n_k]$ characterizes the temporal movement of mass. Hence $\vec{x}_\Delta$ is uniquely related to a sequence of locally constant density functions $u_\Delta$, see (8). We will see later in Section 2.1 that the information functional $F_{\alpha,\lambda}$ can be derived using the dissipation of the entropy $H_{\alpha,\lambda}$ in (9), where $\Theta_\alpha := \sqrt{2\alpha}/(2\alpha + 1)$ and $\Lambda_{\alpha,\lambda} := \lambda/(2\alpha + 1)$, see (10). To discretize the entropy $H_{\alpha,\lambda}$ and the perturbed information functional $F_{\alpha,\lambda}$, we introduce the discrete functionals in (11) and (12).

1.2. Related schemes. The construction of numerical schemes as solutions of discrete Wasserstein gradient flows with Lagrangian representation is not new in the literature. Many approaches in this spirit have been realised for second-order diffusion equations [9,11,43,48], but also for chemotaxis systems [6], for non-local aggregation equations [17,19], and for variants of the Boltzmann equation [32]. We further refer the reader interested in a very general numerical treatment of Wasserstein gradient flows to [42]. In the case of fourth order equations, there are some results for the thin-film equation and its more general version, the Hele-Shaw flow, see [20,32], but convergence results are missing. Rigorous stability and convergence results for fully discrete schemes are rare and can only be found in [31,45] for second order equations, and in [46] for the DLSS equation. However, there are results available for semi-discrete Lagrangian approximations, see e.g. [2,26].
All analytical results that follow arise from the fundamental observation that solutions to the scheme defined in Section 1.1 can be successively derived as solutions to the discrete minimizing movement scheme: For fixed $\vec{x}^0_\Delta$ and $n \ge 1$, define $\vec{x}^n_\Delta$ recursively as the minimizer of the functional An immediate consequence of the minimization procedure is that solutions $\vec{x}^n_\Delta$ dissipate the functional $F_{\alpha,\lambda}$.
Concerning the long-time behavior of solutions $\vec{x}_\Delta$, remarkable similarities to the continuous case appear. Assuming first the case $\lambda > 0$, it turns out that the unique minimizer $\vec{x}^{\min}_\delta$ of $H_{\alpha,\lambda}$ is also a minimizer of the discrete information functional $F_{\alpha,\lambda}$, and the corresponding set of density functions $u^{\min}_\delta = u_\delta[\vec{x}^{\min}_\delta]$ converges for $\delta \to 0$ towards a Barenblatt profile $b_{\alpha,\lambda}$ or Gaussian $b_{1/2,\lambda}$, respectively, defined in (14) and (15), where $a \in \mathbb{R}$ is chosen to conserve unit mass and $\Lambda_{\alpha,\lambda}$ is defined as in (10). Beyond this, solutions $\vec{x}^n_\Delta$ satisfying (6) converge as $n \to \infty$ towards a minimizer $\vec{x}^{\min}_\delta$ of $F_{\alpha,\lambda}$ with an exponential decay rate, which is "asymptotically equal" to the one obtained in the continuous case. The above claims are summarized in the following theorems: where $\hat{u}^{\min}_\delta$ is a locally affine interpolation of $u^{\min}_\delta$ defined in Lemma 3.2.
Let us now consider the zero-confinement case $\lambda = 0$. In the continuous setting, the long-time behavior of solutions to (1) with $\lambda = 0$ can be studied by a rescaling of solutions to (1) with $\lambda > 0$. We are able to translate this method to the discrete case and derive a discrete counterpart of [44, Corollary 5.5], which describes the intermediate asymptotics of solutions that approach self-similar Barenblatt profiles as $t \to \infty$. Theorem 1.3. Assume $\lambda = 0$ and take a sequence of monotone $\vec{x}^n_\Delta$ satisfying (13). Then there exists a constant $c_\alpha > 0$ depending only on $\alpha$, such that where $b^n_{\Delta,\alpha,0}$ is a rescaled discrete Barenblatt profile and $a_\tau, b_\tau > 0$ are such that $a_\tau, b_\tau \to 1$ for $\tau \to 0$, see Section 3.2 for more details.
Before we come to the analytical part of this paper, we want to point out the following: The ideas for the proofs of Theorems 1.2 and 1.3 are mainly guided by the techniques developed in [44]. The remarkable observation of this work is the structure preservation of our discretization, which allows us to adapt almost all calculations from the continuous theory to the discrete setting.
1.4. Structure of paper. In the following Section 2, we point out some of the main structural features of (1) and the functionals $H_{\alpha,\lambda}$ and $F_{\alpha,\lambda}$, and show that our scheme arises from a discrete $L^2$-Wasserstein gradient flow, so that many properties of the continuous flow are inherited. Section 3 treats the analysis of discrete equilibria in the case of positive confinement $\lambda > 0$: We prove convergence of discrete stationary states to Barenblatt profiles or Gaussians, respectively, and analyse the asymptotics of discrete solutions for $\lambda = 0$. Finally, some numerical experiments are presented in Section 4.
2.1. Structural properties of (1). The family of fourth order equations (1) has a number of remarkable structural properties. The most fundamental one is the conservation of mass, which is a naturally given property if one interprets solutions to (1) as an $L^2$-Wasserstein gradient flow of the perturbed information functional $F_{\alpha,\lambda}$ in (2). As an immediate consequence, $F_{\alpha,\lambda}$ is a Lyapunov functional, and one can find infinitely many other (formal) Lyapunov functionals at least for special choices of $\alpha$, see [7,12,36] for $\alpha = \tfrac12$ or [3,18,28] for $\alpha = 1$. Apart from $F_{\alpha,\lambda}$, one of the most important such Lyapunov functionals is given by the $\Lambda_{\alpha,\lambda}$-convex entropy $H_{\alpha,\lambda}$. It turns out that the functionals $F_{\alpha,\lambda}$ and $H_{\alpha,\lambda}$ are not just Lyapunov functionals, but share numerous remarkable similarities. One can indeed see (1) as a higher order extension of the second order porous media/heat equation (21) [35], which is nothing less than the $L^2$-Wasserstein gradient flow of $H_{\alpha,\lambda}$. Furthermore, the unperturbed functional $F_{\alpha,0}$, i.e. $\lambda = \Lambda_{\alpha,\lambda} = 0$, equals the dissipation of $H_{\alpha,0}$ along its own gradient flow, see (22). In view of the gradient flow structure, this relation makes (1) the "big brother" of the porous media/heat equation (21), see [22,44] for structural consequences. Another astonishing common feature is the correlation of $F_{\alpha,\lambda}$ and $H_{\alpha,\lambda}$ by the so-called fundamental entropy-information relation (23), which holds for any $u \in P(\mathbb{R})$ with $H_{\alpha,\lambda}(u) < \infty$, see [44, Corollary 2.3]. This equation is a crucial tool for the analysis of equilibria of both functionals and the corresponding long-time behavior of solutions to (1) and (21). In addition to the above, a typical property of diffusion processes like (1) or (21) with positive confinement $\lambda, \Lambda_{\alpha,\lambda} > 0$ is the convergence towards unique stationary solutions $u_\infty$ and $v_\infty$, respectively, independent of the choice of initial data.
It is perhaps one of the most surprising facts that both equations, (1) and (21), share the same steady state, i.e. the stationary solutions $u_\infty$ and $v_\infty$ are identical. Those stationary states are solutions of the elliptic equations with $P_\alpha(s) := \Theta_\alpha s^{\alpha+1/2}$, and have the form of Barenblatt profiles or Gaussians, respectively, see definitions (14) and (15). This was first observed by Denzler and McCann in [22], and further studied in [44] using the Wasserstein gradient flow structure of both equations and their remarkable relation via (22). For $\alpha \in \{\tfrac12, 1\}$, the mathematical literature contains numerous results, owing to the physical importance of (1) in those limiting cases.

2.1.1. DLSS equation. As already mentioned at the very beginning, the DLSS equation, i.e. (1) with $\alpha = \tfrac12$, arises from the Toom model [23,24] in one spatial dimension on the half-line $[0, +\infty)$, where it was used to describe interface fluctuations. Moreover, the DLSS equation also finds application in semiconductor physics, namely as a simplified model (low-temperature, field-free) for a quantum drift diffusion system for electron densities, see [38].
From the analytical point of view, a large variety of results in different settings has been developed in the last few decades. For results on existence and uniqueness, we refer e.g. to [7,27,34,29,37,38], and to [12,18,14,29,37,40,44] for qualitative and quantitative descriptions of the long-time behavior. The main reason why research on this topic is so non-trivial is the lack of comparison/maximum principles as in the theory of second order equations (21). Unfortunately, the absence of such analytical tools is not negligible, as the work [7] of Bleher et al. shows: As soon as a solution $u$ of (1) with $\alpha = \tfrac12$ is strictly positive, one can show that it is even $C^\infty$-smooth, but there are no regularity results available from the moment when $u$ touches zero. The question of strict positivity of such solutions seems to be difficult, since it is still open. This is why alternative theories for nonnegative weak solutions have become matters of great interest, e.g. an approach based on entropy methods developed in [29,37].

2.1.2. Thin-film equation. The thin-film equation, i.e. (1) with $\alpha = 1$, is of similar physical importance as the DLSS equation, since it gives a dimension-reduced description of the free-surface problem with the Navier-Stokes equation in the case of laminar flow, see [47]. In the case of linear mobility, which is exactly our situation, the thin-film equation can also be used to describe the pinching of thin necks in a Hele-Shaw cell in one spatial dimension, and thus plays an extraordinary role in physical applications. On this topic, the literature provides some interesting results in the framework of entropy methods, see [13,18,28]. In the (more general) case of nonnegative mobility functions $m$, i.e.
one of the first contributions available in the mathematical literature is due to Bernis and Friedman [4]. The same equation is studied in [5], which treats a large number of results for various mobility functions of physical meaning. There are several other references in this direction, e.g. Grün et al. [3,21,33], concerning the long-time behavior of solutions and the non-trivial question of the spreading behavior of the support.

2.2. Structure-preservation of the numerical scheme. In this section, we try to get a better intuition of the scheme in Section 1.1. First, we derive (6) as a discrete system of Euler-Lagrange equations of a variational problem that arises from an $L^2$-Wasserstein gradient flow restricted to a discrete submanifold $P_\delta(\mathbb{R})$ of the space of probability measures $P(\mathbb{R})$ on $\mathbb{R}$. This is why the numerical scheme satisfies several discrete analogues of the results discussed in the previous section. As the following section shows, some of the inherited properties are obtained by construction (e.g. preservation of mass and dissipation of the entropy), whereas others are caused by the underlying discrete gradient flow structure and a smart choice of a discrete $L^2$-Wasserstein metric. Moreover, it is possible to prove that the entropy and the information functional share the same minimizer $\vec{x}^{\min}_\delta$ even in the discrete case, and that solutions of the discrete gradient flow converge at an exponential rate to this stationary state. The proof of this observation is more sophisticated, which is why we dedicate a separate section (Section 3) to the treatment of this special property.
2.2.1. Ansatz space and discrete entropy/information functionals. The entropies $H_{\alpha,\lambda}$ and $F_{\alpha,\lambda}$ as defined in (9) and (2) are nonnegative functionals on $P(\mathbb{R})$. If we first consider the zero-confinement case $\lambda = 0$, one can derive, in analogy to [45], the discretization in (11) of $H_{\alpha,0}$ just by restriction to a finite-dimensional submanifold $P_\delta(\mathbb{R})$ of $P(\mathbb{R})$: For fixed $K \in \mathbb{N}$, the set $P_\delta(\mathbb{R})$ consists of all locally constant density functions $u = u_\delta[\vec{x}]$ (recall definition (8)) such that $\vec{x} \in \mathbb{R}^{K+1}$ is a monotone vector, i.e. $\vec{x} \in \mathfrak{x}_\delta := \{(x_0, \ldots, x_K) : x_0 < x_1 < \ldots < x_K\}$.
Such density functions $u = u_\delta[\vec{x}] \in P_\delta(\mathbb{R})$ bear a one-to-one relation to their Lagrangian maps, which are defined on the mass grid $[0, M]$ with uniform decomposition $(0 = \xi_0, \ldots, \xi_k, \ldots, \xi_K = M)$. More precisely, we define for $\vec{x} \in \mathfrak{x}_\delta$ the locally affine and monotonically increasing function X ξ for $u \in P_\delta(\mathbb{R})$ and its corresponding Lagrangian map. For later analysis, we introduce in addition to the decomposition In view of the entropy's discretization, this implies using (11) and (9), a change of variables $x = X_\delta[\vec{x}]$, and the definition (7) of the $\vec{x}$-dependent vectors $\vec{z}$. This is perfectly compatible with (11). Obviously, one cannot derive the discrete information functional $F_{\alpha,0}$ in the same way, since $F_{\alpha,0}$ is not defined on $P_\delta(\mathbb{R})$. So instead of restriction, we mimic property (22) that is for any is a function on $\mathfrak{x}_\delta$, and where we denote for $k = 0, \ldots, K$ by $e_k \in \mathbb{R}^{K+1}$ the $(k+1)$th canonical unit vector.
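The correspondence between a monotone vector and its locally constant density can be sketched in a few lines of Python. This is a minimal illustration under the assumptions stated in the text (uniform mass grid, each cell carrying mass $\delta = M/K$); the function names `densities`, `lagrangian_map` and `total_mass_of` are hypothetical and not taken from the paper.

```python
def densities(x, total_mass=1.0):
    """Cell values of the piecewise constant density u_delta[x].

    Each of the K cells [x[k-1], x[k]] carries the same mass delta = M / K,
    so the density on that cell is delta / (x[k] - x[k-1]).
    """
    K = len(x) - 1
    if any(x[k] <= x[k - 1] for k in range(1, K + 1)):
        raise ValueError("x must be strictly increasing")
    delta = total_mass / K
    return [delta / (x[k] - x[k - 1]) for k in range(1, K + 1)]

def lagrangian_map(x, xi, total_mass=1.0):
    """Piecewise affine map on [0, M] that sends the mass node k * delta to x[k]."""
    K = len(x) - 1
    delta = total_mass / K
    k = min(int(xi / delta), K - 1)   # index of the mass cell containing xi
    t = (xi - k * delta) / delta      # local coordinate in [0, 1]
    return (1.0 - t) * x[k] + t * x[k + 1]

def total_mass_of(x, total_mass=1.0):
    """Integrate the piecewise constant density over its support."""
    z = densities(x, total_mass)
    return sum(z[k - 1] * (x[k] - x[k - 1]) for k in range(1, len(x)))

x = [-1.0, -0.4, 0.0, 0.3, 1.2]       # a monotone vector with K = 4
z = densities(x)
```

By construction, the density built from any strictly increasing `x` integrates to the prescribed total mass, which mirrors the conservation of mass on the discrete submanifold.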
Remark 1. One of the most fundamental properties of the $L^2$-Wasserstein metric $W_2$ on $P(\mathbb{R})$ in one space dimension is its explicit representation in terms of Lagrangian coordinates. We refer to [1,49] for a comprehensive introduction to the topic. This enables us to prove the existence of $K$-independent constants $c_1, c_2 > 0$, such that A proof of the analogous statement formulated on a domain $[a, b] \subset (-\infty, +\infty)$ is given in [45, Lemma 7], and can easily be adapted to the whole set of real numbers.
Let us further introduce the sets of (semi-)indices The calculation (26) in the above example yields the explicit representation of the gradient of $H_{\alpha,\lambda}(\vec{x})$, and further of the discretized information functional In the case of positive confinement $\lambda > 0$, note that the drift potential $u \mapsto \int_{\mathbb{R}} |x|^2 u(x)\,dx$ attains an equivalent representation in terms of Lagrangian coordinates, namely $X \mapsto \int_0^M |X(\xi)|^2\,d\xi$. In our setting, the simplest discretization of this functional is hence obtained by summing over all values $x_k$ weighted with $\delta$. This yields as an extension to the case of positive $\lambda$, which is nothing else than (11) and (12). Note in addition that δ k∈I 0 A first structural property of the above simple discretization is the retention of convexity from the continuous to the discrete setting. For reasons of readability, the proof of the following lemma is located in the appendix.
for any x, y ∈ x δ and s ∈ (0, 1). It therefore admits a unique minimizer x min δ ∈ x δ . If we further assume Λ α,λ > 0, then one attains for any As a further conclusion of our natural discretization, we get a discrete fundamental entropy-information relation analogously to the continuous case (23).
The proof of this corollary can again be found in the appendix.
Remark 2. Note that the apparent discontinuity at $\alpha = \tfrac12$ above is not real. For $\alpha > \tfrac12$, the second term on the right-hand side of (31) is explicitly given by For $\alpha \downarrow \tfrac12$, one has $\Lambda_{\alpha,\lambda} \to \Lambda_{1/2,\lambda}$, $\Theta_\alpha \to \tfrac12$, and especially δ κ∈I 1/2 For the following reason, the above representation of $F_{\alpha,\lambda}$ is indeed a little miracle: From a naive point of view, one would hope to gain a discrete counterpart of the fundamental entropy-information relation (23) by taking the one-to-one discretization of the $L^2$-Wasserstein metric, which is (in the language of Lagrangian vectors) realized by the norm x → W 2 (u δ [ x], u δ [ x]) instead of our simpler choice $\vec{x} \mapsto \|\vec{x}\|_\delta$. However, with this ansatz, the proof of the above statement would fail at the point where one tries to calculate the scalar product of ∂ x H α,0 and ). This is why our discretization of the $L^2$-Wasserstein metric by the norm $\|\cdot\|_\delta$ seems to be the right choice if one is interested in a structure-preserving discretization.
Corollary 2. The unique minimizer $\vec{x}^{\min}_\delta \in \mathfrak{x}_\delta$ of $H_{\alpha,\lambda}$ is a minimizer of $F_{\alpha,\lambda}$ and it satisfies for any

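Proof. Equality (31) and $2\alpha - 1 \ge 0$ show that $\vec{x} \mapsto F_{\alpha,\lambda}(\vec{x})$ is minimal if and only if $\|\nabla_\delta H_{\alpha,\lambda}(\vec{x})\|_\delta = 0$ and $H_{\alpha,\lambda}(\vec{x})$ is minimal, which is the case for $\vec{x} = \vec{x}^{\min}_\delta$. The representation in (31) further implies where we used (30) in the last step.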

2.2.2. Interpretation of the scheme as discrete Wasserstein gradient flow. Starting from the discretized perturbed information functional $F_{\alpha,\lambda}$, we approximate the spatially discrete gradient flow equation (34) also in time, using minimizing movements. To this end, recall the temporal decomposition of $[0, +\infty)$ by using time step sizes $\boldsymbol{\tau} := \{\tau_1, \tau_2, \ldots, \tau_n, \ldots\}$ with $\tau_n \le \tau$ and $\tau > 0$. As before in the introduction, we combine the spatial and temporal mesh widths in a single discretization parameter $\Delta = (\boldsymbol{\tau}; \delta)$. For each $\vec{y} \in \mathfrak{x}_\delta$, introduce the Yosida-regularized information functional $F_\alpha(\cdot, \cdot, \cdot, \vec{y})$ : A fully discrete approximation $(\vec{x}^n_\Delta)_{n=0}^\infty$ of (34) is defined inductively from a given initial datum $\vec{x}^0_\Delta$ by choosing each $\vec{x}^n_\Delta$ as a global minimizer of $F_\alpha(\lambda, \tau_n, \cdot, \vec{x}^{n-1}_\Delta)$. Below, we prove that such a minimizer always exists (see Lemma 2.3).
In practice, one wishes to define $\vec{x}^n_\Delta$ as the (preferably unique) solution of the Euler-Lagrange equations associated to $F_\alpha(\lambda, \tau_n, \cdot, \vec{x}^{n-1}_\Delta)$, which leads to the implicit Euler time stepping: Using the explicit representation of $\partial_{\vec{x}} F_{\alpha,\lambda}$, it is immediately seen that (36) is indeed the same as (6). Equivalence of (36) and the minimization problem is guaranteed at least for sufficiently small $\tau > 0$, as the following proposition shows.
Proposition 1. For each discretization $\Delta$ and every initial condition $\vec{x}^0 \in \mathfrak{x}_\delta$, the sequence of equations (36) can be solved inductively. Moreover, if $\tau > 0$ is sufficiently small with respect to $\delta$ and $F_{\alpha,\lambda}(\vec{x}^0)$, then each equation in (36) possesses a unique solution with $F_{\alpha,\lambda}(\vec{x}) \le F_{\alpha,\lambda}(\vec{x}^0)$, and that solution is the unique global minimizer of $F_\alpha(\lambda, \tau_n, \cdot, \vec{x}^{n-1}_\Delta)$.

The proof of this proposition is a consequence of the following rather technical lemma.

Lemma 2.3. Fix a spatial discretization parameter $\delta$ and a bound $C > 0$. Then for every $\vec{y} \in \mathfrak{x}_\delta$ with $F_{\alpha,\lambda}(\vec{y}) \le C$, the following are true:
• for each $\sigma > 0$, the function $F_\alpha(\lambda, \sigma, \cdot, \vec{y})$ possesses at least one global minimizer $\vec{x}^* \in \mathfrak{x}_\delta$;
• there exists a $\tau_C > 0$ independent of $\vec{y}$ such that for each $\sigma \in (0, \tau_C)$, the global minimizer $\vec{x}^* \in \mathfrak{x}_\delta$ is strict and unique, and it is the only critical point of $F_\alpha(\lambda, \sigma, \cdot, \vec{y})$ with $F_{\alpha,\lambda}(\vec{x}) \le C$.
Proof. Fix y ∈ x δ with F α,λ ( y) ≤ C, and define the nonempty (since it contains y) sublevel hence x ∞ is bounded from above by √ 2σC + y ∞ . Especially, which means, in the sense of density functions, that any u = u δ [ x] with x ∈ A C is compactly supported in [−L(δ, σ, y), L(δ, σ, y)]. Consequently, take again x ∈ x δ arbitrarily and declare z * = min κ∈I 1/2 K z κ and z * = max κ∈I 1/2 K z κ , then on the one hand the conservation of mass yields the boundedness of z * from above, and on the other hand F α,λ ( x) ≤ F α,λ ( y) ≤ C yields an upper bound for z * , since , and hence z * ≤ M Θ −1 α C + 2L(δ, σ, y) Collecting the above observations, we first conclude that A C ⊆ x δ is a compact subset of R K+1 , due to |x 0 |, |x K | ≤ L(δ, σ, y) and the continuity of F α,λ . Moreover, K with a positive constant x that depends on C and L(δ, σ, y). Thus A C does not touch the boundary (in the ambient R K+1 ) of x δ . Consequently, A C is closed and bounded in x δ , endowed with the trace topology.
The restriction of the continuous function F α (λ, σ, ·, y) to the compact set A C possesses a minimizer x * ∈ A C . We clearly have F α,λ ( x * ) ≤ F α,λ ( y) ≤ C, and so x * lies in the interior of A C and therefore is a global minimizer of F α (λ, σ, ·, y). This proves the first claim.
Since F α,λ : x δ → R is smooth, its restriction to A C is λ C -convex with some λ C ≤ 0, i.e., ∂ 2 x F α,λ ( x) ≥ λ C 1 K+1 for all x ∈ A C . Independently of y, we have that Consequently, each such F α (λ, σ, ·, y) has at most one critical point x * in the interior of A C , and this x * is necessarily a strict global minimizer.
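The equivalence between one minimizing movement step and one implicit Euler step can be illustrated on a toy problem. The following Python sketch does not use the functional $F_{\alpha,\lambda}$; as a stated simplification, it replaces it by a quadratic confinement energy $\frac{\Lambda}{2}\|\vec{x}\|_\delta^2$, and it assumes that the discrete norm has the form $\|\vec{x}\|_\delta^2 = \delta \sum_k x_k^2$. All function names are hypothetical.

```python
def norm_sq(x, delta):
    """Assumed discrete norm: ||x||_delta^2 = delta * sum_k x_k^2."""
    return delta * sum(v * v for v in x)

def toy_functional(x, lam, delta):
    """Quadratic stand-in for the energy (NOT the paper's F_{alpha,lambda})."""
    return 0.5 * lam * norm_sq(x, delta)

def yosida(x, y, tau, lam, delta):
    """Yosida-regularized toy functional: ||x - y||^2 / (2 tau) + energy(x)."""
    diff = [a - b for a, b in zip(x, y)]
    return norm_sq(diff, delta) / (2.0 * tau) + toy_functional(x, lam, delta)

def minimizing_movement_step(y, tau, lam, iterations=200):
    """Minimize x -> yosida(x, y, ...) by damped gradient descent."""
    x = list(y)
    step = 0.5 * tau / (1.0 + tau * lam)   # halves the error in every iteration
    for _ in range(iterations):
        # gradient with respect to the delta-weighted scalar product
        grad = [(a - b) / tau + lam * a for a, b in zip(x, y)]
        x = [a - step * g for a, g in zip(x, grad)]
    return x

def implicit_euler_step(y, tau, lam):
    """Solve (x - y) / tau = -lam * x, the Euler-Lagrange equation of the step."""
    return [b / (1.0 + tau * lam) for b in y]

y = [-1.0, -0.4, 0.0, 0.3, 1.2]
tau, lam, delta = 0.1, 2.0, 0.25
x_mm = minimizing_movement_step(y, tau, lam)
x_ie = implicit_euler_step(y, tau, lam)
```

For this convex toy energy the minimizer is unique for every step size, so both routes return the same vector, and the energy does not increase along the step.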
Remark 3 (propagation of the support). Take a solution $\vec{x}_\Delta$ of (35) with density functions $u_\Delta$. As we already noted in the above proof, any density $u^n_\Delta$ has compact support in $[-L_n, L_n]$ with $L_n = L(\delta, \tau_n, \vec{x}^{n-1}_\Delta)$ as in (38). Hence which is the best we can assume in the case of $\lambda = 0$. If $\lambda > 0$, one can find a much better bound on the support of $u_\Delta$, namely by replacing (37) by,

3. Analysis of equilibrium. In what follows, we analyse the long-time behavior in the discrete setting and, in particular, prove Theorem 1.2. As we have already seen in [44], the scheme's underlying variational structure is essential to get optimal decay rates. Due to our structure-preserving discretization, it is even possible to derive analogous, asymptotically equal decay rates for solutions to (35).

3.1. Entropy dissipation: the case of positive confinement $\lambda > 0$. In this section, we study the discrete rate of decay towards discrete equilibria and verify the corresponding statements in Theorem 1.2. That is why we assume henceforth $\lambda > 0$.
for any time step n ∈ N.
The proof is a special case of [1, Theorem 3.1.4].
Remark 4. In the continuous situation, the analogous proofs of (40) and (41) require a deeper understanding of variational techniques. An essential tool in this context is the flow interchange lemma, see e.g. [44, Theorem 3.2]. Although one can easily prove a discrete counterpart of the flow interchange lemma, it is not essential in the above proof, since the smoothness of $\vec{x} \mapsto H_{\alpha,\lambda}(\vec{x})$ allows an explicit calculation of its gradient and Hessian.
Lemma 3.1 paves the way for the exponential decay rates of Theorem 1.2. Effectively, (18) and (19) are just applications of the following version of the discrete Gronwall lemma: Assume $\{c_n\}_{n\in\mathbb{N}}$ and $\{y_n\}_{n\in\mathbb{N}}$ to be sequences with values in $\mathbb{R}_+$, satisfying $(1 + c_n) y_n \le y_{n-1}$ for any $n \in \mathbb{N}$, then $y_n \le y_0 \exp\Big(-\sum_{k=0}^{n-1} \frac{c_k}{1+c_k}\Big)$ for any $n \in \mathbb{N}$.
This statement can easily be proven by induction. Furthermore, inequality (20) is then a corollary of (18) and a Csiszár-Kullback inequality, see [15, Theorem 30].
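The discrete Gronwall lemma rests on the elementary one-step estimate $\frac{1}{1+c} \le \exp\big(-\frac{c}{1+c}\big)$ for $c \ge 0$, which follows from $\ln(1+c) \ge \frac{c}{1+c}$. The following Python sketch checks the resulting bound on the extremal sequence that satisfies the recursion with equality. It is only a numerical illustration: the coefficient values are arbitrary, the function names are hypothetical, and the sum is taken over the coefficients that actually enter the recursion up to step $n$, which is an assumption about the index convention.

```python
import math

def worst_case_sequence(y0, c):
    """Largest sequence allowed by (1 + c_n) y_n <= y_{n-1}: equality at each step."""
    ys = [y0]
    for cn in c:
        ys.append(ys[-1] / (1.0 + cn))
    return ys

def gronwall_bound(y0, c):
    """y_0 * exp(-sum c_k / (1 + c_k)), accumulated over the coefficients used so far."""
    bounds = [y0]
    total = 0.0
    for cn in c:
        total += cn / (1.0 + cn)
        bounds.append(y0 * math.exp(-total))
    return bounds

coefficients = [0.1, 0.5, 2.0, 0.0, 1.0]   # arbitrary nonnegative test values
ys = worst_case_sequence(1.0, coefficients)
bs = gronwall_bound(1.0, coefficients)
```

Since every admissible sequence is dominated by the extremal one, the comparison of `ys` with `bs` covers the general case for these coefficients.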
3.1.1. Convergence towards Barenblatt profiles and Gaussians. This section is devoted to the proof of Theorem 1.1. Hence let us assume again $\lambda > 0$. To prove the statement of this theorem, we are going to show that the sequence of functionals $H^\delta_{\alpha,\lambda} : P(\mathbb{R}) \to (-\infty, +\infty]$ given by Γ-converges towards $H_{\alpha,\lambda}$. In more detail, the following points are satisfied for any $u \in P(\mathbb{R})$:
(i) For every sequence $u_\delta$ with $\lim_{\delta\to0} W_2(u_\delta, u) = 0$, one has $\liminf_{\delta\to0} H^\delta_{\alpha,\lambda}(u_\delta) \ge H_{\alpha,\lambda}(u)$.
(ii) There exists a recovery sequence $u_\delta$ of $u$, i.e. $\limsup_{\delta\to0} H^\delta_{\alpha,\lambda}(u_\delta) \le H_{\alpha,\lambda}(u)$ and $\lim_{\delta\to0} W_2(u_\delta, u) = 0$.

The Γ-convergence of $H^\delta_{\alpha,\lambda}$ towards $H_{\alpha,\lambda}$ is a powerful property, since it implies convergence of the sequence of minimizers $u^{\min}_\delta = u_\delta[\vec{x}^{\min}_\delta]$ towards $b_{\alpha,\lambda}$ or $b_{1/2,\lambda}$, respectively, with respect to the $L^2$-Wasserstein metric, see [8]. To conclude even strong convergence of $u^{\min}_\delta$ at least in $L^p(\mathbb{R})$ for arbitrary $p \ge 1$, we proceed similarly to [45, Proposition 18]. To this end, recall that the total variation of a function $f \in L^1(\mathbb{R})$ is given by we refer to [30, Definition 1.1]. If $f$ is a piecewise constant function with compact support $[x_0, x_K]$, taking values $f_{k-\frac12}$ on intervals $(x_{k-1}, x_k]$, then the integral in (45) amounts to Consequently, for such $f$, the supremum in (45) equals and is attained for every Lipschitz continuous function $\varphi$ with ϕ( where $\hat{u}^{\min}_\delta : \mathbb{R} \to \mathbb{R}$ is a locally affine interpolation of $u^{\min}_\delta$ on $\mathbb{R}$, such that for any κ ∈ I 1/2

Proof. We will first prove the Γ-convergence of $H^\delta_{\alpha,\lambda}$ towards $H_{\alpha,\lambda}$. The first requirement (i) is a consequence of the lower semi-continuity of $H_{\alpha,\lambda}$.
For the second point (ii), we fix $u \in P(\mathbb{R})$ and assume $X : [0, M] \to [-\infty, +\infty]$ to be the Lagrangian map of $u$. In addition, assume for the moment that $\int_K u\,dx < M$ for any compact subset $K \subseteq \mathbb{R}$, hence there is no bounded interval in which the whole mass of $u$ is concentrated. Further assume without loss of generality that the center of mass is at $x = 0$, i.e. $\int_{-\infty}^0 u(x)\,dx = M/2$. Then one can find for any $\varepsilon > 0$ a compact set of the form $K = [L_1, L_2]$ with $L_1 < 0 < L_2$, and an integer $K \in \mathbb{N}$, such that The first statement is valid due to the boundedness of the second moment of $u$, and the last one is satisfied since one can choose $K \in \mathbb{N}$ arbitrarily large. An immediate consequence of the above choices is that $2\delta L^2 < \varepsilon$ for $L = \max\{|L_1|, |L_2|\}$ due to Using $\delta = M K^{-1}$, we define an equidistant decomposition of the mass domain $[0, M]$. We furthermore declare $x_0 = -2L$, $x_K = 2L$ and $x_k = X(\xi_k)$ for any $k = 1, \ldots, K-1$, and introduce the locally constant density $u_\delta \in P_\delta(\mathbb{R})$ that corresponds to the Lagrangian map $X_\delta[\vec{x}]$. This procedure defines a sequence of densities $u_\delta$, since $\varepsilon > 0$ was arbitrary, and we are going to prove that $u_\delta$ is the right choice for the recovery sequence.
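The construction of the grid vector behind the recovery sequence can be sketched as follows. This Python illustration assumes unit mass and a standard Gaussian as the density $u$, with `statistics.NormalDist.inv_cdf` from the standard library standing in for the Lagrangian map $X$. The choice of $L$ in the code (just large enough to contain all interior nodes) is a simplifying assumption and differs from the $\varepsilon$-dependent choice in the proof; the function name is hypothetical.

```python
from statistics import NormalDist

def recovery_vector(K, dist=NormalDist()):
    """Grid vector with x_k = X(xi_k) for 0 < k < K and x_0 = -2L, x_K = 2L."""
    if K < 3:
        raise ValueError("need at least two interior nodes")
    delta = 1.0 / K                               # unit mass, M = 1
    interior = [dist.inv_cdf(k * delta) for k in range(1, K)]
    # Assumption: L only has to dominate the interior nodes here; the proof
    # fixes L through a compact set depending on epsilon instead.
    L = max(abs(interior[0]), abs(interior[-1]))
    return [-2.0 * L] + interior + [2.0 * L]

K = 8
x = recovery_vector(K)
gauss = NormalDist()
# Mass that the continuous density u places on the interior cells [x_{k-1}, x_k].
interior_masses = [gauss.cdf(x[k]) - gauss.cdf(x[k - 1]) for k in range(2, K)]
```

On every interior cell the continuous density carries exactly the mass $\delta$ that the locally constant density assigns to it; only the two outermost cells, where the tails are cut off at $\pm 2L$, differ.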
Since $H_{\alpha,\lambda}$ is lower semi-continuous, we obtain $\lim_{\delta\to0} H^\delta_{\alpha,\lambda}(u_\delta) = H_{\alpha,\lambda}(u)$. In the simpler case that $\int_K u\,dx = M$ for a compact subset $K \subseteq \mathbb{R}$, we choose $X_\delta[\vec{x}]$ as the locally affine function with $X_\delta[\vec{x}](\xi_k) = X(\xi_k)$ for any $k = 0, \ldots, K$, and take the corresponding sequence $u_\delta$ of locally constant density functions. Then the above calculation can be carried out analogously.
To conclude the convergence of $u^{\min}_\delta$ towards $b_{\alpha,\lambda}$ with respect to $W_2$ and the convergence of $H_{\alpha,\lambda}(\vec{x}^{\min}_\delta)$ towards $H_{\alpha,\lambda}(b_{\alpha,\lambda})$, we invoke [8, Theorem 1.21]. Note that $\inf_{u\in P(\mathbb{R})} H^\delta_{\alpha,\lambda}(u) = H_{\alpha,\lambda}(\vec{x}^{\min}_\delta)$ by definition of $H^\delta_{\alpha,\lambda}$, hence the minimizer of $H^\delta_{\alpha,\lambda}$ is $u^{\min}_\delta$. Furthermore, each functional $H^\delta_{\alpha,\lambda}$ has precompact sublevels, which is a consequence of $\lambda > 0$ and Prokhorov's Theorem, see for instance [

Let us now prove (47). The convergence of $H_{\alpha,\lambda}(\vec{x}^{\min}_\delta)$ to $H_{\alpha,\lambda}(b_{\alpha,\lambda})$ yields on the one hand the uniform boundedness of $H_{\alpha,\lambda}(\vec{x}^{\min}_\delta)$ with respect to the spatial discretization parameter $\delta$, and on the other hand the uniform boundedness of $F_{\alpha,\lambda}(\vec{x}^{\min}_\delta)$, which is a consequence of (31) and $\nabla_\delta H_{\alpha,\lambda}(\vec{x}^{\min}_\delta) = 0$. Similarly to [45, Proposition 18], one can prove that the term $F_{\alpha,\lambda}(\vec{x}^{\min}_\delta)$ is an upper bound on the total variation of $P_\alpha(u^{\min}_\delta)$ with $P_\alpha(s) = \Theta_\alpha s^{\alpha+1/2}$: Take an arbitrary $\vec{y} \in \mathbb{R}^{K+1}$ with $\|\vec{y}\|_\infty \le 1$. Then and the left-hand side can be reformulated, using (28), Observing that $\|\vec{y}\|_\delta \le M \|\vec{y}\|_\infty$, we can take the supremum over all $\vec{y}$ with $\|\vec{y}\|_\infty \le 1$ in (52). Then the Cauchy-Schwarz inequality and the representation of {·} TV in (46) yield , which is uniformly bounded from above due to (31) and the uniform boundedness of $F_{\alpha,\lambda}(\vec{x}^{\min}_\delta)$. This proves the uniform boundedness of {P α (u min δ )} TV, and using the superlinear growth of $s \mapsto P_\alpha(s)$ and [30, Proposition 1.19], we conclude (47).
To prove (48), we show that the $H^1(\mathbb{R})$-norm of $\hat{u}^{\min}_\delta$ is bounded by the information functional. This was already done in [46] for $\alpha = \tfrac12$, where we also showed û min So assume $\alpha \in (\tfrac12, 1]$, then the concavity of the mapping $s \mapsto s^{\alpha-1/2}$ yields for any values $b > a > 0$ the validity of ,

3.2. The case of zero confinement $\lambda = 0$. This section is essentially devoted to the proof of Theorem 1.3. Let us therefore consider (1) in the case of vanishing confinement $\lambda = 0$, hence and $u(0) = u_0$ for arbitrary initial density $u_0 \in P(\mathbb{R})$. From the continuous theory, it is known that solutions to (53) or (21) with $\Lambda_{\alpha,\lambda} = 0$ spread out over the whole set of real numbers, hence converge towards zero at a.e. point. This fact makes a rigorous analysis of the long-time behavior of solutions to (53) more difficult than in the case of positive confinement. However, the unperturbed functionals $H_{\alpha,0}$ and $F_{\alpha,0}$ satisfy the scaling property, see again [44], for any $r > 0$ and $d_r u(x) := r^{-1} u(r^{-1}\,\cdot)$ with $u \in P(\mathbb{R})$. Due to this, it is possible to find weak solutions to a rescaled version of (53) by solving problem (1) with $\lambda = 1$. In more detail, the following lemma is valid: is a weak solution to (53).
A consequence of the above lemma is that one can describe how solutions $w$ to (53) vanish asymptotically as $t \to \infty$, although the information gained is rather weak: the first observation (without studying local asymptotics in more detail) is that $w$ decays to zero at the same rate as the rescaled (time-dependent) Barenblatt-profile $b^*_{\alpha,0}$ defined by $b^*_{\alpha,0}(t, \cdot) := d_{R(t)} b_{\alpha,1}$ with $R(t)$ of (55). Hence there exists a constant $C > 0$ depending only on $H_{\alpha,0}(w_0) = H_{\alpha,0}(u_0)$ with for any $t > 0$. In [44], this behavior was described using weak solutions constructed by minimizing movements. We adopt this method to derive a discrete analogue of (56) for the discrete solutions $\vec{x}_\Delta$ of (35).

First of all, we reformulate the scaling operator $d_r$ for fixed $r > 0$ in the setting of monotonically increasing vectors $\vec{x} \in \mathfrak{x}_\delta$. Since $d_r u = r^{-1} u(r^{-1}\,\cdot)$ for an arbitrary density in $\mathcal{P}(\mathbb{R})$, the same can be done for $u_\delta = u_\delta[\vec{x}]$, hence for any $x \in \mathbb{R}$. The natural extension of $d_r$ to the set $\mathfrak{x}_\delta$ is hence As a consequence of this definition, note the validity of the discrete scaling property for $H_{\alpha,0}$ and $F_{\alpha,0}$, i.e. for any $r > 0$ and $\vec{x} \in \mathfrak{x}_\delta$, The first equality is satisfied due to $H_{\alpha,0}(\vec{x}) = H_{\alpha,0}(u_\delta[\vec{x}])$ and the scaling property (54) of the continuous entropy functionals. The analogous claim for $F_{\alpha,0}$ in (57) follows by inserting $d_r \vec{x}$ into $\partial_{\vec{x}} H_{\alpha,0}$ and using $d_r \vec{z} = r^{-1} \vec{z}$, then . These scaling properties can now be used to build a bridge between solutions of discrete minimizing movement schemes with $\lambda = 0$ and those with positive confinement.
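The action of the scaling operator on vectors can be checked numerically. The following is a minimal sketch under stated assumptions, not the scheme's implementation: it assumes an equidistant mass grid with cell mass `delta`, a piecewise constant density with height `delta / (x[k] - x[k-1])` on each cell, and the vector scaling `d_r x = r x`, which is consistent with the relation $d_r \vec{z} = r^{-1} \vec{z}$ used above. All function names are hypothetical.

```python
def density_values(x, delta):
    """Cell heights z of the piecewise constant density u_delta[x]."""
    return [delta / (x[k] - x[k - 1]) for k in range(1, len(x))]


def density_at(x, delta, y):
    """Evaluate u_delta[x] at the point y (zero outside [x_0, x_K])."""
    for k in range(1, len(x)):
        if x[k - 1] < y <= x[k]:
            return delta / (x[k] - x[k - 1])
    return 0.0


def scale_vector(x, r):
    """Assumed discrete scaling: d_r x = r * x."""
    return [r * xk for xk in x]


def scaled_density_at(x, delta, r, y):
    """Continuous scaling (d_r u)(y) = u(y / r) / r applied to u = u_delta[x]."""
    return density_at(x, delta, y / r) / r


def total_mass(x, delta):
    """Integral of u_delta[x] over the real line."""
    z = density_values(x, delta)
    return sum(z[k - 1] * (x[k] - x[k - 1]) for k in range(1, len(x)))
```

With these definitions, `density_at(scale_vector(x, r), delta, y)` and `scaled_density_at(x, delta, r, y)` agree at points `y` away from the cell boundaries, and the total mass is unchanged by the scaling.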
The following lemma is based on the proof of [44, Theorem 5.6]; nevertheless, it illustrates how well our numerical scheme preserves the structure of the continuous problem.
if and only if $d_R \vec{x} \in \mathfrak{x}_\delta$ minimizes the functional It is not difficult to see that this lemma can be formulated for any functional mapping $\mathfrak{x}_\delta$ to $\mathbb{R}$ with the same scaling property as $F_{\alpha,0}$ in (57). We refer to the appendix for the very technical proof of Lemma 3.4.

4.1. Non-uniform meshes. An equidistant mass grid, as used in the analysis above, leads to a good spatial resolution of regions where the value of $u_0$ is large, but provides a very poor resolution in regions where $u_0$ is small. Since we are interested in regions of low density, and especially in the evolution of supports, it is natural to use a non-equidistant mass grid with an adapted spatial resolution, like the one defined as follows: The mass discretization of $[0, M]$ is determined by a vector $\vec{\delta} = (\xi_0, \xi_1, \ldots, \xi_{K-1}, \xi_K)$ with $0 = \xi_0 < \xi_1 < \cdots < \xi_{K-1} < \xi_K = M$, and we introduce accordingly the distances (note the convention $\xi_{-1} = \xi_{K+1} = 0$) for $\kappa \in I^{1/2}_K$ and $k \in I^0_K$, respectively. The piecewise constant density function $u \in \mathcal{P}_\delta(\mathbb{R})$ corresponding to a vector $\vec{x} \in \mathbb{R}^{K-1}$ is now given by The Wasserstein-like metric (and its corresponding norm) needs to be adapted as well: The scalar product $\langle \cdot, \cdot \rangle_\delta$ is replaced by and we set $\|v\|_\delta = \sqrt{\langle v, v \rangle_\delta}$. Hence the metric gradient $\nabla_\delta f(\vec{x}) \in \mathbb{R}^{K+1}$ of a function $f : \mathfrak{x}_\delta \to \mathbb{R}$ at $\vec{x} \in \mathfrak{x}_\delta$ is given by Otherwise, we proceed as before: The entropy is discretized by restriction, and the discretized information functional is the self-dissipation of the discretized entropy. Explicitly, the resulting fully discrete gradient flow equation attains the form

4.2. Implementation. To guarantee the existence of an initial vector $\vec{x}^0_\Delta \in \mathfrak{x}_\delta$ which "reaches" any mass point of $u_0$, i.e. $[x^0_0, x^0_K] \subseteq \operatorname{supp}(u_0)$, one has to consider initial density functions $u_0$ with compact support.
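The non-equidistant mass grid and the initial vector can be illustrated as follows. This is a sketch under assumptions that are not taken from the text: the mass points are clustered near $0$ and $M$ by a cosine rule (one hypothetical choice among many), the initial vector is defined by $\int_{-\infty}^{x_k} u_0 \, dx = \xi_k$, and the initial density is the illustrative profile $u_0(x) = \frac{3}{4}(1 - x^2)$ on $[-1, 1]$, whose cumulative mass is known in closed form. The cell heights are taken as cell mass divided by cell width.

```python
import math


def clustered_mass_grid(K, M=1.0):
    """Hypothetical non-equidistant mass grid 0 = xi_0 < ... < xi_K = M,
    refined near both ends, i.e. near the edges of the support."""
    return [0.5 * M * (1.0 - math.cos(math.pi * k / K)) for k in range(K + 1)]


def cumulative_mass(x):
    """Mass of u_0(x) = 0.75 * (1 - x^2) on [-1, x], for x in [-1, 1]."""
    return 0.5 + 0.75 * x - 0.25 * x ** 3


def invert_cumulative_mass(m, lo=-1.0, hi=1.0, iterations=60):
    """Bisection for the point x with cumulative_mass(x) = m."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cumulative_mass(mid) < m:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def initial_vector(xi):
    """Positions x_k that carry the mass xi_k to their left."""
    return [invert_cumulative_mass(m) for m in xi]


def cell_densities(xi, x):
    """Heights of the piecewise constant density: cell mass / cell width."""
    return [(xi[k] - xi[k - 1]) / (x[k] - x[k - 1]) for k in range(1, len(x))]
```

For `K = 8`, the outermost cells carry much less mass than the central ones, so the grid places comparatively more points where $u_0$ is small, while the total mass of the piecewise constant density remains $M$.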
Starting from the initial condition $\vec{x}^0_\Delta$, the fully discrete solution is calculated inductively by solving the implicit Euler scheme (64) for $\vec{x}^n_\Delta$, given $\vec{x}^{n-1}_\Delta$. In each time step, a damped Newton iteration is performed, with the solution from the previous time step as initial guess. More precisely, for given $\vec{x}^{n-1}_\Delta$, we calculate $\vec{x}^n_\Delta$ by means of the following algorithm: In all of our experiments, we use $\mathrm{tol} = 10^{-8}$. Relatively slow convergence of the Newton iteration has been observed in situations where the density $u^{n-1}_\Delta$ has steep gradients and/or intervals of very low values. In the experiments that follow, the number of Newton iterations is ten in the very first time steps and decreases to one at later times.

As a first numerical experiment, we analyse the rate of decay in the case of positive confinement $\lambda = 5$, using $\alpha = 1$. For that purpose, consider the initial density function
$$u_0(x) = \begin{cases} 0.25\,|\sin(x)| \cdot \bigl(0.5 + \mathbb{1}_{x>0}(x)\bigr), & x \in [-\pi, \pi], \\ 0, & \text{else}. \end{cases}$$
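The time stepping described in this subsection, an implicit Euler step solved by a damped Newton iteration started from the previous time step, can be sketched as follows. The sketch does not use the functionals of this paper: as a stand-in it takes the toy energy $E(\vec{x}) = -\sum_k \log(x_{k+1} - x_k) + \frac{\lambda}{2} \sum_k x_k^2$, and the damping rule (halve the step until the trial vector is strictly increasing and the residual has decreased) is an assumption, since the algorithm listing itself is not reproduced here. All names are hypothetical.

```python
def residual(x, x_prev, tau, lam):
    """G(x) = x - x_prev + tau * grad E(x) for the toy energy E."""
    n = len(x)
    out = []
    for k in range(n):
        grad = lam * x[k]
        if k > 0:
            grad -= 1.0 / (x[k] - x[k - 1])
        if k < n - 1:
            grad += 1.0 / (x[k + 1] - x[k])
        out.append(x[k] - x_prev[k] + tau * grad)
    return out


def jacobian(x, tau, lam):
    """Diagonal and off-diagonal of the symmetric tridiagonal matrix DG(x)."""
    n = len(x)
    inv_sq = [1.0 / (x[k + 1] - x[k]) ** 2 for k in range(n - 1)]
    diag = []
    for k in range(n):
        d = 1.0 + tau * lam
        if k > 0:
            d += tau * inv_sq[k - 1]
        if k < n - 1:
            d += tau * inv_sq[k]
        diag.append(d)
    return diag, [-tau * w for w in inv_sq]


def solve_tridiagonal(diag, off, rhs):
    """Thomas algorithm for a symmetric tridiagonal system (len(diag) >= 2)."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = off[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = off[i] / denom
        d[i] = (rhs[i] - off[i - 1] * d[i - 1]) / denom
    sol = [0.0] * n
    sol[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        sol[i] = d[i] - c[i] * sol[i + 1]
    return sol


def max_norm(v):
    return max(abs(a) for a in v)


def is_increasing(x):
    return all(x[k] < x[k + 1] for k in range(len(x) - 1))


def implicit_euler_step(x_prev, tau, lam, tol=1e-8, max_iter=50):
    """One implicit Euler step; returns the new vector and the Newton count."""
    x = list(x_prev)  # initial guess: solution of the previous time step
    for iteration in range(max_iter + 1):
        g = residual(x, x_prev, tau, lam)
        if max_norm(g) < tol:
            return x, iteration
        diag, off = jacobian(x, tau, lam)
        step = solve_tridiagonal(diag, off, [-v for v in g])
        damping = 1.0
        while damping > 1e-10:
            trial = [a + damping * s for a, s in zip(x, step)]
            # Accept only monotone vectors that reduce the residual.
            if is_increasing(trial) and max_norm(
                    residual(trial, x_prev, tau, lam)) < max_norm(g):
                break
            damping *= 0.5
        else:
            raise RuntimeError("damping failed to find an admissible step")
        x = trial
    raise RuntimeError("Newton iteration did not converge")
```

A call such as `implicit_euler_step([-0.2, -0.1, 0.0, 0.1, 0.2], tau=0.01, lam=5.0)` returns a strictly increasing vector together with the number of Newton iterations used. The monotonicity test inside the damping loop is what keeps the associated piecewise constant density positive in this toy setting.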