Weighted fast diffusion equations (Part II): Sharp asymptotic rates of convergence in relative error by entropy methods

This paper is the second part of the study. In Part I, self-similar solutions of a weighted fast diffusion equation (WFD) were related to optimal functions in a family of subcritical Caffarelli-Kohn-Nirenberg inequalities (CKN) applied to radially symmetric functions. For these inequalities, the linear instability (symmetry breaking) of the optimal radial solutions relies on the spectral properties of the linearized evolution operator. Symmetry breaking in (CKN) was also related to large-time asymptotics of (WFD), at a formal level. A first purpose of Part II is to give a rigorous justification of this point, that is, to determine the asymptotic rates of convergence of the solutions to (WFD) in the symmetry range of (CKN) as well as in the symmetry breaking range, and even in regimes beyond the supercritical exponent in (CKN). Global rates of convergence with respect to a free energy (or entropy) functional are also investigated, as well as uniform convergence to self-similar solutions in the strong sense of the relative error. Differences with large-time asymptotics of fast diffusion equations without weights will be emphasized.


Introduction. Let us consider the fast diffusion equation with weights
Such an equation admits the self-similar solution At least when 1 − m > 0 is not too big, this self-similar solution attracts all solutions to (1) as t → ∞, but we will also prove that, exactly as for the non-weighted equation corresponding to (β, γ) = (0, 0), there is a basin of attraction of the self-similar solution for any m ∈ (0, 1). However, there are many differences with respect to the non-weighted case, which will be summarized in Section 4.3. To study the convergence of u to the self-similar solution, it is simpler to use self-similar variables (see Section 2.3 for details) and consider the Fokker-Planck-type equation with initial condition v(t = 0, ·) = v_0. Self-similar solutions are transformed into Barenblatt-type stationary solutions given by where C(M) is a positive constant uniquely determined by the weighted mass condition 1). Altogether, what we aim at is establishing an exponential convergence of v to B as t → ∞, when the corresponding distance is measured in terms of the free energy (which is sometimes called generalized relative entropy in the literature) By evolving such free energy along the flow and differentiating with respect to t, we formally obtain that where I[v] denotes the relative Fisher information This will be proved rigorously in Section 3.1. Note that, with some abuse of notation, when we write v(t) we mean the whole spatial profile of the function evaluated at time t.
Our goal is to relate F[v] and I [v], at least as t → ∞, and for that we need a detour by a family of Caffarelli-Kohn-Nirenberg inequalities which have been introduced in [7] and studied in Part I of this work, [4]. Let us explain this a bit more in detail. For all q ≥ 1, consider the weighted norms and define L q,γ (R d ) as the space of all measurable functions w such that w L q,γ (R d ) is finite. Actually, at some points below, we shall also make use of the above definition for q ∈ (0, 1) (in which cases clearly w L q,γ (R d ) is no more a norm). The Caffarelli-Kohn-Nirenberg interpolation inequalities with ϑ := (d−γ) (p−1) p [d+β+2−2 γ−p (d−β−2)] and parameters β, γ, p subject to and p ∈ (1, p ] with p : are valid for all functions in the space obtained by completion of D(R d ) with respect to the norm · defined by w → w 2 = ∇w 2 L 2,β (R d ) + w 2 L p+1,γ (R d ) : see [4, Section 2.1] for more details. We shall take for granted once for all assumptions (6) on the parameters, even when it is not mentioned explicitly. However, the subcriticality condition p ≤ p will be assumed for global decay estimates, but not in the study of asymptotic decay estimates.
In this case, inequality (5) can be written as an entropy-entropy production inequality and, as a straightforward consequence (recall (4)), we formally deduce the convergence rate This connection is well known and goes back to [9], where similar properties were investigated in the non-weighted case, namely for (β, γ) = (0, 0). For more details see [4, Proposition 1]. Conditions have to be given to make Estimate (8) rigorous. In fact much more is known. Let us consider again the entropy-entropy production inequality where K(M) is the best constant, which may possibly take the value 0. With Λ(M ) := m 2 (1 − m) −2 K(M ), we formally deduce from (4) and (9) that For brevity, this is what we shall call a global rate of decay because the dependence on v at time t = 0 is explicitly given by F[v(0)]. Let and denote by Λ_{0,1} the lowest eigenvalue associated with non-radial eigenfunctions of the operator L defined by where B denotes the Barenblatt profile defined by for some C > 0. Of course we shall later take C = C(M) if m ∈ (m_c, 1). Based on variational methods, the following result has been proved in [4]. We shall also give a short additional proof of (i) below in Section 4.2.
We shall assume that the initial datum v 0 is sandwiched between two Barenblatt profiles: there exist two positive constants C 1 and C 2 such that (12) Condition (12) may look rather restrictive, but it is probably not, because it is expected that the condition is satisfied, for some positive t, by any solution with nonnegative initial datum having finite initial free energy, as it is the case when (β, γ) = (0, 0) and for m sufficiently close to 1: see for instance [6,2]. However, initial regularization effects of (2) are out of the scope of the present paper. If m is not close to 1, Condition (12) determines, to some extent, the basin of attraction of B. What we shall prove is the following result on global rates. Corollary 2. Let (6) hold and m ∈ [m 1 , 1). With the above notations, (10) holds if v is a solution to (2) with initial datum subject to (12).
We know that symmetry holds in (5) whenever β ≤ β FS (γ) and γ < 0, or if 0 ≤ γ < d. The restriction m ≥ m 1 comes from the (sub-)criticality condition p ≤ p in (5). As in [1,2,3], if one is interested only in the asymptotic rate of decay of F[v(t)] as t → ∞ (i.e., without requiring that the multiplicative constant is F[v(0)]), this restriction can be lifted and better estimates of the rates can be given using an appropriate linearization of the problem. Let us give some explanations in this regard.
Still at a formal level, we may consider a solution v = B (1 + ε B 1−m f ) to (2) and keep only the first order term in ε, as in [2]. The corresponding linear evolution equation is Mass conservation (more precisely, relative mass conservation, see Sections 2.2 and 2.3), which is taken into account by requesting that suggests to analyse the spectral gap of L considered as an operator acting on the space L 2 (R d , B 2−m |x| −γ dx). The reader interested in more details is invited to refer to [4,Section 3.3]. Let us define and pick the unique positive solution to η (η + n − 2) = (d − 1)/α 2 , which is given by The following result can be deduced from [4,Lemma 8] (please note the discrepancy of a factor α 2 , which is due to the change of variables x → x |x| α−1 ). See in particular [4, Figure 4 and Appendix B] for details.
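As a quick numerical illustration of the definition of η, the following minimal sketch solves the quadratic η (η + n − 2) = (d − 1)/α² for its positive root. The function name `eta` and the sample values of n, d and α are hypothetical and chosen only for illustration; n and α stand for the parameters introduced in the text.

```python
import math


def eta(n: float, d: int, alpha: float) -> float:
    """Positive root of eta * (eta + n - 2) = (d - 1) / alpha**2.

    The quadratic eta**2 + (n - 2) * eta - (d - 1) / alpha**2 = 0 has roots
    of opposite signs when d >= 2, so the positive root is unique.
    """
    half = (n - 2.0) / 2.0
    return math.sqrt(half * half + (d - 1) / alpha**2) - half


# Illustrative (hypothetical) parameter values.
n_val, d_val, alpha_val = 5.0, 3, 1.2
e = eta(n_val, d_val, alpha_val)
residual = e * (e + n_val - 2.0) - (d_val - 1) / alpha_val**2
```

The closed form used above is the standard quadratic formula, η = sqrt(((n − 2)/2)² + (d − 1)/α²) − (n − 2)/2, and `residual` measures how well the computed value satisfies the defining equation.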
The optimal constant is Λ = Λ ess if δ ≤ (n + 2)/2 and Λ = min {Λ ess , Λ 0,1 , Λ 1,0 } otherwise, with Notice that the optimal constant Λ in the Hardy-Poincaré-type inequality is independent of C. In the range m ∈ (m c , 1), it is also independent of M since B as defined in (11) with C = C(M ) explicitly depends on M , according to the expression of C(M ) given in [4,Appendix A].
Notice that m * as defined in (13) is the unique value of m for which, eventually, Λ ess = 0 and, as a consequence, for which there is no spectral gap. Notice that for some values of d, γ and β, the exponent m * takes nonpositive values. However our results are limited to m ∈ (0, 1) and in particular m > 0 will be assumed throughout this paper. Since δ = 1/(1 − m), we obtain precisely the value given by (13). If m > m * then where B 1 and B 2 are defined as in (12). This is not anymore true if m ≤ m * . In that case we shall consider L 1,γ -perturbations of B as defined in (11), for some constant C ∈ [C 2 , C 1 ]. If m > m * then the condition We shall refer to these conditions as the relative mass condition: see Assumptions (H1) and (H2) in Section 2.1 for further details.
We are now able to state the main results of this paper.
Theorem 4. Let (6) hold and m ∈ (0, 1), with m ≠ m_*. Under the relative mass condition and with the same notations as in Proposition 3, if v solves (2) subject to (12), then there exists a positive constant C such that
Figure 1. The spectrum of L as a function of δ = 1/(1−m), with n = 5. The essential spectrum corresponds to the grey area, and its bottom is determined by the parabola δ → Λ_ess(δ). The two eigenvalues Λ_{0,1} and Λ_{1,0} are given by the plain half-lines, away from the essential spectrum. Note that solutions of the eigenvalue problem exist for any value of δ but may not be in the domain of the operator or below the essential spectrum and are then represented as dotted half-lines.
As in [2,3], this result itself relies on a result of relative uniform convergence which is the key estimate to relate the free energy to the spectrum of L. Before stating the latter, let us define (1−m) (2+β)+2+β−γ and observe that in view of hypotheses (6) and of the change of variables (13), θ is in the range 0 < θ < (1−m)/(2−m) < 1.
Theorem 5. Under the assumptions of Theorem 4, there exist positive constants K and t_0 such that, for all q between (2−m)/(1−m) and ∞, the function w = v/B satisfies in the case γ ∈ (0, d), and We point out that Estimate (14) yields an improvement of a similar result, namely [2, Theorem 3], in the non-weighted case (β, γ) = (0, 0). We shall comment more on the rates of convergence provided by Theorem 5 in Section 4.3.
The proof of Theorem 5 partially relies on uniform Hölder-regularity estimates for bounded solutions to a linearized version of Equation (2). In view of possible degeneracies or singularities of the weights |x| −γ and |x| −β at the origin, such results do not follow from standard parabolic theory and therefore have to be proved separately. We devote an Appendix to these issues, where we give sketches of proofs. These are based on a strategy developed for similar equations by [8].
Using refinements that will be discussed in Section 4.1, we can also prove convergence results in L 1,γ norms. For this purpose, we need to restrict the range of m to ( m 1 , 1), where m 1 is the smallest number such that R d |x| 2+β−γ B |x| −γ dx is finite for all m ∈ ( m 1 , 1), that is Theorem 6. Under the assumptions of Theorem 4 with in addition m ∈ ( m 1 , 1), there exists a monotone, positive function t → µ(t) with lim t→+∞ µ(t) = 1 such that This result is an improvement in the spirit of [15] in the non-weighted case (β, γ) = (0, 0). As it appears in Proposition 3 and Theorems 4 and 5, Λ is smaller than min{Λ ess , Λ 0,1 } only under conditions that are discussed in [4,Appendix B]. Also notice that, by undoing the self-similar change of variables outlined in Section 2.3, it is possible to give algebraic rates of convergence for the original solutions to (1), as in [2].
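The lower threshold for m in Theorem 6 can be recovered by a power-counting argument. The sketch below assumes that the Barenblatt profile has the form B(x) = (C + |x|^{2+β−γ})^{−1/(1−m)} and that γ < d and 2 + β − γ > 0; both are assumptions made here for illustration only.

```latex
% Assumptions (for illustration): B(x) = (C + |x|^{2+\beta-\gamma})^{-1/(1-m)},
% \gamma < d and 2+\beta-\gamma > 0.
\begin{align*}
& |x|^{2+\beta-\gamma}\, B(x)\, |x|^{-\gamma}
  \sim |x|^{(2+\beta-\gamma)\left(1-\frac{1}{1-m}\right)-\gamma}
  = |x|^{-\frac{m\,(2+\beta-\gamma)}{1-m}-\gamma}
  \quad\text{as } |x|\to\infty, \\
& \int^{\infty} r^{\,d-1-\gamma-\frac{m\,(2+\beta-\gamma)}{1-m}}\, dr < \infty
  \iff \frac{m\,(2+\beta-\gamma)}{1-m} > d-\gamma
  \iff m > \frac{d-\gamma}{d+2+\beta-2\gamma}.
\end{align*}
```

Near the origin the integrand behaves like a constant multiple of |x|^{2+β−2γ}, which is integrable in R^d because d + 2 + β − 2γ = (d − γ) + (2 + β − γ) > 0 under the stated assumptions, so only the behaviour at infinity restricts m.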
Let us conclude this introduction by a few bibliographical references. In the case without weights, we primarily refer to [1,2,3] and references therein. The special case corresponding to δ = (n−2)/2 has been treated in [5]. Still in the non-weighted case, improvements have been obtained more recently in [15,16,17,18,19,20] using refinements of relative entropy methods, and in [10] using a detailed analysis of fast and slow variables and of the invariant manifolds. These papers are in any case limited to the choice (β, γ) = (0, 0). More references can be found therein.
As for problems with power law weights, we shall refer to [30,31,27] for approaches based on comparison techniques in the case β = 0 and for the porous media equation. The papers [29,26] deal with the critical power |x| −2 , where asymptotics is more subtle. A detailed long-time analysis has been carried out in [24] for the fractional porous media equation with a weight. Diffusion equations of porous media type with two weights (i.e. weights having the same role as |x| −γ and |x| −β here) have been investigated, e.g., in [13,23], where well-posedness issues as well as smoothing effects and asymptotic estimates are discussed in rather general weighted frameworks by means of functional inequalities. In the fast diffusion regime, convergence in relative error to a separable profile for radial solutions on the hyperbolic space has been proved by [22], through pure barrier methods. Note that, in radial coordinates, the Laplace-Beltrami operator is in fact a two-weight Laplacian. The corresponding analysis for the porous medium equation (for general solutions) has then been carried out in [34].
A detailed justification of the introduction of weights and especially power law weights in case of porous media and fast diffusion equations can be found in [28,32]. In [14], for β = 0 and γ > 0 small enough, symmetry of optimal functions in the Caffarelli-Kohn-Nirenberg inequalities (5) is proved to hold. Notice however that the inequalities are then more of Hardy-Sobolev type than of Caffarelli-Kohn-Nirenberg because only one weight is involved. The other case of symmetry in Caffarelli-Kohn-Nirenberg inequalities, which is now fully understood, is the one corresponding to the threshold case p = p , which has been recently solved in [11]. Remarkably the proof relies on the very same flow (1) and an approach based on the Bakry-Emery Γ 2 method. The reader interested in further considerations on Caffarelli-Kohn-Nirenberg inequalities, Γ 2 computations and rigidity results in nonlinear elliptic problems on compact and non-compact manifolds is invited to refer to this paper for a more complete review of the literature in this direction.
The paper is organized as follows. Properties of self-similar solutions, an existence result, a comparison result, the conservation of the relative mass, results on the relative entropy and the rewriting of (1) in relative variables after a self-similar change of variables have been collected in Section 2. These results are adapted from the case (β, γ) = (0, 0). Section 3 is devoted to regularity issues and to the relative uniform convergence, that is, the uniform convergence of the quotient of the solution in self-similar variables by the Barenblatt profile: this is at the core of our results and it is also where our paper differs from the case (β, γ) = (0, 0). There we prove Theorems 4 and 5. Because of the weights, the Hölder regularity at the origin is an issue. It relies on a technical result, based on an adaptation of [8]: the proof is given in an Appendix. Some additional results, including the proof of Theorem 6 and some comments, have been collected in Section 4.
Throughout this paper, B_ρ denotes the centered ball of radius ρ, that is, B_ρ := {x ∈ R^d : |x| < ρ}.
2. Self-similar variables, relative entropy and large time asymptotics.

2.1. The self-similar solutions.
In order to avoid confusion between original variables and rescaled variables, let us rewrite (1) as The whole family of explicit self-similar solutions of Barenblatt type is given by where C, T > 0 are free parameters and R(τ ) is defined by if m = m c , with m c as above, namely In the special case m = m c we shall replace the initial condition with R(0) = e T , for T ∈ R. More explicitly, If m ≥ m c Barenblatt-type solutions are positive for all τ > 0. If m < m c these solutions extinguish at τ = T .
As already mentioned in Section 1, we shall require that the initial datum u(0) = u_0 is trapped between two Barenblatt profiles. More precisely: (H1) There exist positive constants T and C_1 > C_2 such that If m < m_c solutions with initial datum as above extinguish at t = T < ∞ as we shall deduce from the comparison principle (see Corollary 9 and related comments below). Such solutions do not belong to L^{1,γ}(R^d). On the other hand, if m ≥ m_c solutions are positive at all τ > 0. They belong to Assumption (H2) is in fact a consequence of (H1). Indeed in such a range Barenblatt solutions may not be in L^{1,γ}(R^d) but the difference of two Barenblatt profiles still belongs to L^{1,γ}(R^d). On the contrary, if m ≤ m_* then (H2) induces an additional restriction.

2.2. Existence, comparison and conservation of relative mass. In agreement with [25], we provide the following definition of a weak solution.
In [25], (β, γ) = (0, 0) and we point out that it is only required that u 0 ∈ L 1 loc (R d ) and u ∈ C([0, ∞); L 1 loc (R d )). However, because of the weight |y| −β , a priori the equation may not make sense, since in general u m ∈ L 1,β loc (R d ). Hence, for simplicity, we also assume that initial data and solutions are globally bounded, as in the sequel we shall only deal with this kind of solutions.
Proposition 7 (Existence). Assume that m ∈ (0, 1). For any nonnegative u 0 ∈ L ∞ (R d ) there exists a solution to (1) in the sense of the above definition.
Proof. We refer the reader to the proof of [25, Theorem 2.1]: minor changes have to be implemented in order to adapt it to our weighted context. The basic idea consists in approximating the initial datum, e.g., with the sequence u 0n : The corresponding sequence of solutions u n is well defined in view of standard L 1,γ theory (there is no additional difficulty due to the weights compared to the standard theory as exposed in [33]). One can then pass to the limit on such a sequence by exploiting local L 1,γ estimates (as in [25, Lemma 3.1]) along with the global bound u n (τ ) ∞ ≤ u 0n ∞ , valid for all τ > 0.
Proposition 8 (L 1,γ -contraction). Assume that m ∈ (0, 1) and let u 01 , u 02 ∈ L ∞ (R d ) be any two nonnegative initial data with corresponding solutions u 1 , u 2 to (1), that are constructed via the approximation scheme of the proof of Proposition 7. Then Proof. The inequality holds for the approximate solutions u 1,n and u 2,n still as a consequence of the standard L 1,γ theory. Hence, the assertion just follows by taking limits as n → ∞.
Proposition 8 trivially implies the following key comparison result.
Corollary 9 (Comparison principle). Under the same hypotheses as in Proposition 8, if u_01 ≤ u_02 then u_1(τ) ≤ u_2(τ) for all τ > 0.
As the reader may note, we do not claim that we have a comparison principle (and hence a uniqueness result) for any solutions in the sense of Definition 2.2, but only for those obtained as limits of L^{1,γ} approximations. Nevertheless, in the sequel by solution we shall tacitly mean the one constructed as in the proof of Proposition 7, for which comparison holds. Since we consider initial data satisfying (H1), in order to conclude that the corresponding solutions are trapped between Barenblatt profiles at any time, one has to check that the self-similar solution given by (15) can also be obtained as a limit of L^{1,γ} approximate solutions. This is a standard fact given the explicit profile of U_{C,T}.
Mass conservation is used in the range m > m c to determine the parameter C = C(M ) which characterizes the Barenblatt profile U C,T having the same mass as u. In the range m ≤ m c we can still prove that the quantity which we shall refer to as relative mass, is conserved at any τ > 0, even if U C,T (τ ) ∈ L 1,γ (R d ).
Proposition 10 (Conservation of relative mass). Assume that m ∈ (0, 1) and consider a solution u of (1) with initial datum u 0 satisfying (H1)-(H2). Then Proof. We proceed along the lines of the proof of [2, As λ → ∞, we observe that U m−1 C1,T , |∆φ λ | and |∇φ λ | behave like λ 2+β−γ , λ −2 and λ −1 , respectively, in the region B 2λ \ B λ . In particular, The L 1,γ -contraction of Proposition 8 ensures that the r.h.s. vanishes as λ → ∞. Actually, since a priori u τ does not exist as a function, we have to use a test function φ λ that depends on time and whose time derivative approximates the difference between two Dirac deltas at times τ 2 and τ 1 . However, this is a standard technicality, which we omitted in order to make the proof more readable.
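The decay of the derivatives of the cut-off functions used in the proof above is a pure scaling effect. The following sketch assumes, for illustration, that φ_λ is obtained by dilation of a fixed smooth profile φ which equals 1 on B_1 and vanishes outside B_2; this specific construction is an assumption.

```latex
% Assumption (for illustration): \phi_\lambda(x) := \phi(x/\lambda), with \phi smooth,
% \phi = 1 on B_1 and \phi = 0 outside B_2.
\begin{align*}
\nabla\phi_\lambda(x) &= \lambda^{-1}\,(\nabla\phi)(x/\lambda), &
\Delta\phi_\lambda(x) &= \lambda^{-2}\,(\Delta\phi)(x/\lambda), \\
\|\nabla\phi_\lambda\|_{L^\infty} &= \lambda^{-1}\,\|\nabla\phi\|_{L^\infty}, &
\|\Delta\phi_\lambda\|_{L^\infty} &= \lambda^{-2}\,\|\Delta\phi\|_{L^\infty}.
\end{align*}
% Both derivatives are supported in B_{2\lambda} \setminus B_\lambda.
```

Under this assumption the gradient and the Laplacian of φ_λ are of order λ^{−1} and λ^{−2} respectively, and they only see the annulus B_{2λ} \ B_λ, which is why the growth of U^{m−1}_{C_1,T} needs to be controlled in that region alone.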
2.3. Self-similar variables: a nonlinear Fokker-Planck equation. As already discussed in [2,4] and outlined in Section 1, it is convenient to analyse the asymptotic behaviour of solutions to (1) whose initial data comply with (H1)-(H2) by means of a suitable time-space change of variables that makes Barenblatt profiles stationary.
Let us rescale the function u according to where R is defined by (16). Similarly, the Barenblatt solution U_{C,T} is transformed into B as defined by (11). Straightforward computations show that v solves the nonlinear, weighted Fokker-Planck equation In terms of the initial datum v_0(x) = R(0)^{d−γ} u_0(R(0) x), conditions (H1)-(H2) can be rewritten as follows: (H1') There exist positive constants C_1 > C_2 such that Assumption (H1') is nothing but (12). Note that, for greater readability, It is clear that, as a consequence of Proposition 10, under assumptions (H1')-(H2') the relative mass of v is also conserved, that is On the other hand, if m ≤ m_* we cannot ensure that this identity still holds, but, nevertheless, the conservation of relative mass is still true and reads (17) corresponding to an initial datum that satisfies (H1')-(H2'). As in [2, Section 2.3], let us introduce the ratio The difference w − 1 is usually referred to as the relative error between v and B. In view of (17), it is straightforward to check that w is a solution to which can be seen as a nonlinear, weighted equation of Ornstein-Uhlenbeck type. Let us also define the quantities A straightforward calculation yields In terms of w_0, assumptions (H1') and (H2') can in turn be rewritten as follows: (H1") There exist positive constants C_1 > C_2 and C ∈ [C_2, C_1] such that Note that, as a consequence of the L^{1,γ}-contraction estimate and the comparison principle (Proposition 8 and Corollary 9), assumptions (H1")-(H2") are satisfied by a solution w(t) of (18) at any t > 0 if they are satisfied by w_0, for a suitable f depending also on t (clearly the same holds for v, with respect to (H1')-(H2')).
3. Regularity, relative uniform convergence and asymptotic rates. The goal of this section is to show that the relative error w(t) − 1 converges to zero uniformly as t → ∞. Then in Section 3.2, by taking advantage of this result, we shall prove that such convergence occurs with explicit exponential rates. Lemma 11. Assume that m ∈ (0, 1). Let w be the solution of (18) corresponding to an initial datum w 0 satisfying (H1")-(H2"). Then there exist ν ∈ (0, 1) and a positive constant K > 0, depending on d, m, β, γ, C, C 1 , C 2 such that: where v is the solution of (17) corresponding to the initial datum v 0 = w 0 B.
Proof. For all λ > 1, let us consider the following rescaling: It is straightforward to check that v_λ satisfies the same equation as v, with initial datum (v_0)_λ. In particular, since (v_0)_λ is bounded and bounded away from zero in B_2 \ B_{ε/2} independently of λ (a consequence of (H1')), in view of standard parabolic regularity there holds for all k ∈ N and some Q_k > 0 depending only on d, m, β, γ, C, C_1, C_2 and ε, but independent of λ. By undoing the scaling, this is equivalent to where by C^k_x and C^k_τ we mean partial derivatives restricted to space and time, respectively. As a special case, this proves (19) and (20) upon observing that As for proving (21), it is enough to notice that, for some ν ∈ (0, 1) and another constant Q_ν > 0 depending on the same quantities as Q_k, with the exception of ε, the estimate follows from the regularity results of the Appendix. Indeed, since v is bounded and bounded away from zero in R^+ × B_1, we can apply Corollary 25 with the choices As for (22), let z(t, x) := v(t, x) − B(x). Straightforward computations show that the equation solved by z reads |x|^{−γ} z_t = ∇ · [ |x|^{−β} a(t, x) (∇z + B(t, x) z) ], with the same function a as above and We are therefore again in a position to use Corollary 25 to get for some K > 0 depending on d, m, β, γ, C, C_1, C_2. From here on K will denote a general positive constant, which may change from line to line. Corollary 25 holds with inessential modifications if one replaces balls with annuli. By performing scalings, we deduce that By standard computations and (23), which holds even if k is not an integer, and the identity w(t) − 1 = z(t)/B, we obtain By taking λ = 2^{j/α}, with j ≥ −1 integer, we conclude the proof of (22).
b) The relative free energy and Fisher information. By proceeding along the lines of [2, Section 2.5], we redefine the relative free energy functional as B m |x| γ dx , with a slight abuse of notation in the sense that we consider it as a functional acting on w = v/B. Again, if we formally differentiate F[w(t)] with respect to t along the flow (18) where I is the relative Fisher information, redefined in terms of w as However, the rigorous justification of (24) is not straightforward, and to this end we need to take advantage of the global regularity estimates provided in Section 3.1.
Proof. We proceed through three steps, following the lines of proof of [2, Proposition 2]. We skip the proof of the fact that F[w(t)] is finite, since it goes exactly as in [2,Lemma 4], with inessential modifications. For the sake of greater readability we shall omit time-dependence, at least when this does not compromise comprehension.
• Step 1. Consider the same cut-off function φ λ as in the proof of Proposition 10, with λ > 1. Then, by using (18), the identity w B = v and integrating by parts, we obtain − d dt We have where, in the last step, we used the inequality for some c > 0, independent of λ > 1. We shall establish (25) in Step 2. By assumptions (H1')-(H2') and the L 1,γ -contraction principle, the difference v − B is in L 1,γ (R d ), so that lim λ→∞ R(λ) = 0 and the proof is completed by passing to the limit as λ → ∞. • Step 2. Recalling (19), we know that Moreover, since v is trapped between two Barenblatt profiles, we have The estimates hold for suitable positive constants Q 0 , Q 1 and Q 2 which are all independent of λ, t > 1. As for φ λ , by construction we have that for some c 1 , c 2 > 0 which are also independent of λ. Estimate (25) readily follows.
• Step 3. It remains to take care of the origin. In principle solutions are only Hölder regular (see the Appendix). Nevertheless, since v is uniformly bounded and locally bounded away from zero, standard energy estimates (see again [33] as a general reference) ensure, e.g., that the quantities |x| γ dt are finite for all t 1 , t 2 > 0, which is enough in order to give sense to (24) at least in a L 1 loc (R + ) sense.
c) Uniform convergence in relative error. By mimicking the proofs of [2, Lemma 5 and Corollary 1], we can show that the rescaled solution v converges to B uniformly in the strong sense of the relative error.
Proposition 14 (Convergence in relative error without rates). Assume that m ∈ (0, 1). If w is a solution of (18) corresponding to an initial datum w 0 satisfying assumptions (H1")-(H2"), then Proof. For all τ > 0, we set w τ (t, x) := w(t + τ, x). In view of (20), there exists a sequence τ n → ∞ such that w τn converges locally uniformly in (1, Moreover, by Corollary 9 we deduce that Thanks to Proposition 13, there holds Since F[w(s n + 2)] is bounded from below (as a consequence of (H1")-(H2"), see again [2, Lemma 4]), we infer that I[w τn (t)] converges to zero in L 1 ((1, 2)) as n → ∞, that is, By Fatou's lemma, this implies Still as a consequence of (20) and (26) we have that This means that the function w m−1 It is readily seen that the only possibility is c ≡ 1. Indeed, if m > m * this is due to the conservation of relative mass (Proposition 10), while in the case m ≤ m * it is a consequence of the L 1,γ -contraction principle (Proposition 8). Since we can repeat the same argument as above, up to subsequences, along any sequence τ n → ∞, in fact we have shown that In order to obtain the global uniform convergence, it is enough to recall (26) and note that by dominated convergence we have lim t→∞ w(t) − 1 L p (R d ) = 0 for all p > d/(2 + β − γ): the global C ν estimate, (21), and a standard interpolation like [2, Proof of Theorem 1] allow us to conclude.

3.2. Hardy-Poincaré inequalities: convergence with rates. As in [2,3], if m ≠ m_*, sharp rates of convergence towards the Barenblatt profile B are related to the optimal constant Λ > 0 of the Hardy-Poincaré-type inequality for any function f ∈ C^∞_c(R^d) such that, additionally, The value of Λ has been computed explicitly in [4], and is provided in Proposition 3.
• Weighted linearization. In order to better understand the asymptotic behaviour of the solutions at hand, let us outline our strategy. The idea, as in [2,Section 3.3], is to linearize the equation of the relative error (18) around the equilibrium, by introducing a convenient weight. More precisely, let f be such that for some small ε > 0. By substituting this expression in (18) and neglecting higher order terms in ε as ε → 0, we formally obtain a linear equation for f , where the r.h.s. involves a positive, self-adjoint operator on L 2 (R d , B 2−m |x| −γ dx) associated with the closure of the quadratic form defined by The functional I[φ] is the linearized version of the Fisher information I, divided by (1 − m). By means of the same heuristics, we can linearize the free energy F as well to get, up to a factor 1/m, If f is a solution of (28) then it is straightforward to infer that it satisfies which by the way could also have been obtained by linearizing (24). In the case m > m * the conservation of relative mass becomes, after linearization, Hence, as a consequence of (27) and (29), we formally get the following exponential decay for the linearized free energy: • Comparing linear and nonlinear quantities. Our aim here is to proceed in a similar way as in [2, Sections 5 and 6.2] so as to compare the free energy and Fisher information F and I with their linearized versions F and I, respectively. This will then allow us to give a rigorous justification of the above exponential decay and to use such an information to infer a precise exponential decay for the relative error.
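The formal exponential decay of the linearized free energy mentioned above is an instance of a Grönwall-type argument. In the sketch below, y, i and κ are placeholders introduced for illustration: y stands for a nonnegative free energy, i for the corresponding Fisher information, and κ > 0 for the constant provided by a spectral gap inequality. The precise value of κ in terms of Λ depends on the normalizations in (27) and (29) and is not fixed here.

```latex
% Placeholders (assumptions): y(t) \ge 0 is a free energy, i(t) the associated
% Fisher information, \kappa > 0 the constant in the inequality i \ge \kappa\, y.
\begin{align*}
y'(t) = -\,i(t), \quad i(t) \ge \kappa\, y(t)
&\;\Longrightarrow\; \frac{d}{dt}\Big(e^{\kappa t}\, y(t)\Big)
   = e^{\kappa t}\,\big(y'(t) + \kappa\, y(t)\big) \le 0 \\
&\;\Longrightarrow\; y(t) \le y(0)\, e^{-\kappa t} \quad \text{for all } t \ge 0.
\end{align*}
```

The same computation applies to the nonlinear quantities once the entropy and the Fisher information have been compared with their linearized versions, at the price of a constant that is only asymptotically sharp.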
Let us consider For t 0 ≥ 0 large enough, we deduce from Proposition 14 the existence of h ∈ (0, 1/4) such that w(t) − 1 L ∞ (R d ) ≤ h for any t ≥ t 0 . The next result, whose proof we omit since it is identical to the one of [2, Lemma 3], shows the free energy compares with the linearized free energy.
Lemma 15. Assume that m ∈ (0, 1). If w is a solution of (18) corresponding to an initial datum w 0 satisfying assumptions (H1")-(H2"), then there exists t 0 ≥ 0 such that For simplicity we shall assume that t 0 = 0 from now on. We now state the analogue of [5,Lemma 5.4]. The proof is again identical to the one performed in the case (β, γ) = (0, 0), so we skip it.
The next step is to get a bound of the L ∞ norm of the relative error in terms of the free energy.
Proof. Estimate (30) is a direct consequence of Lemmas 15-16 and of the fact that the free energy is nonincreasing by (24).
Let us now deal with the case γ < 0, where inequality (32) is no longer valid, so we have to proceed in a different way. To this end, first of all note that by Hölder's interpolation we obtain for all r > 0 and p, q ∈ (0, 2−m 1−m ). In particular, in view of Lemma 16, there exist p (sufficiently close to 0), q (sufficiently close to 2−m 1−m ) and a positive constant D depending on d, m, γ, C 1 , C 2 , r, such that Let φ λ be the same family of cut-off functions as in the proof of Proposition 10. It is clear that and for some c > 0 depending only on ν and φ. Thanks to (22), by applying (34) to the functions φ 2 (w(t) − 1) and (1 − φ 1 ) (w(t) − 1), we obtain for all t ≥ 1. Hence, by exploiting (35) with r = 4 and r = 1 in the right-hand sides and summing up the two estimates, we end up with This completes the proof of (31) with θ = 1−m 2−m .
Now we compare the Fisher information with its linearized version in the spirit of [2, Lemma 7] and [5, Lemma 5.1].
Lemma 18. Assume that m ∈ (0, 1). If w is a solution of (18) corresponding to an initial datum $w_0$ satisfying assumptions (H1")-(H2"), then where $\mu_h$ is such that

Proof. The proof is similar to the one of [5, Lemma 5.1]: here we give some details for the reader's convenience. For the sake of readability we shall again omit time dependence.
To begin with, let us rewrite the Fisher information I as where we have set It is easy to check that $\lim_{w \to 1} A(w) = 1$, $A(w) > 0$ and $A(w) \to 0$ as $w \to \infty$. Moreover, since the function a(w) is concave in w, so that its incremental quotient A(w) (evaluated at w = 1) is a nonincreasing function of w. In particular, Similarly, it is straightforward to show that $A'(w)$ is bounded. Now let us set $g = (w - 1)\,B^{m-1}$. Since $(w - 1)\,A'(w) + A(w) = w^{m-2}$, we get: Using Young's inequality $a\,b \le h\,a^2 + b^2/(4h)$ (for all $a, b \in \mathbb{R}$) and the bounds $1 - h \le w \le 1 + h$, we get: (in the last passage we have used the fact that h < 1/4). We have therefore established the inequality To complete the proof, it is enough to establish the inequality To this end, we observe that By definition of $g = (w - 1)\,B^{m-1}$, using Taylor expansions and the bounds on w, through elementary computations we deduce that which concludes the proof.
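The elementary properties of A used in the proof can be checked numerically. The sketch below is only an illustration under an assumption: the explicit formula for A is not displayed above, so the code takes $A(w) = (w^{m-1} - 1)/((m-1)(w-1))$, which is the function satisfying the stated identity $(w-1)A'(w) + A(w) = w^{m-2}$ together with A(1) = 1. The helper names and the sample values of m and h are hypothetical.

```python
def A(w, m):
    """Assumed form of A: incremental quotient at w = 1 of a(w) = (w**(m-1) - 1)/(m - 1)."""
    x = w - 1.0
    if abs(x) < 1e-6:
        # First-order Taylor expansion near w = 1, to avoid cancellation errors.
        return 1.0 + 0.5 * (m - 2.0) * x
    return (w ** (m - 1.0) - 1.0) / ((m - 1.0) * x)


def identity_residual(w, m, eps=1e-5):
    """Residual of (w - 1) A'(w) + A(w) = w**(m - 2), with A' by central differences."""
    dA = (A(w + eps, m) - A(w - eps, m)) / (2.0 * eps)
    return (w - 1.0) * dA + A(w, m) - w ** (m - 2.0)


def young_gap(a, b, h):
    """h a^2 + b^2/(4h) - a b, which equals (sqrt(h) a - b/(2 sqrt(h)))**2 >= 0."""
    return h * a * a + b * b / (4.0 * h) - a * b


if __name__ == "__main__":
    m = 0.5                              # sample exponent in (0, 1)
    grid = [0.8, 0.9, 1.0, 1.1, 1.2]     # sample values of w in [1 - h, 1 + h], h = 0.2
    print([round(A(w, m), 4) for w in grid])            # nonincreasing in w
    print(max(abs(identity_residual(w, m)) for w in grid))
```

The monotonicity of the printed values reflects the concavity of a, and the Young gap is a perfect square, which is why the inequality holds for every real a, b and every h > 0.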
• Convergence with sharp rates. By means of the results of Section 3.2 we shall first obtain a global (namely involving F and I) inequality of Hardy-Poincaré type and then use it to get sharp rates of convergence for F[w(t)], which in turn will yield rates for the relative error in view of Lemma 17.
Lemma 19. Assume that m ∈ (0, 1), $m \neq m_*$. If w is a solution of (18) corresponding to an initial datum $w_0$ satisfying assumptions (H1")-(H2"), then there holds where Λ is the best constant appearing in the Hardy-Poincaré inequality (27), and $\mu_h$ is the same quantity as in Lemma 18.
Proof. With no loss of generality we can assume that F[w(t)] ≠ 0 (and so also F[w(t)] ≠ 0 thanks to Lemma 15), otherwise there is nothing to prove. The Hardy-Poincaré inequality (27) plus Lemmas 15 and 18 then yield Finally, since This concludes the proof.
Proof of Theorem 4. Let $m \neq m_*$ and assume that w is a solution of (18) corresponding to an initial datum $w_0$ satisfying assumptions (H1")-(H2"). We have to prove that, for some constants $K_0, t_0 > 0$ that depend on d, m, γ, β, $C_1$, C, $C_2$ and $w_0$, the decay estimate holds. We split the proof into two steps: in the first one we provide a non-sharp exponential decay for F[w(t)], in the second one we use the latter to get the sharp rate. We implicitly adopt the same notation as in Lemma 19.
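The two-step mechanism (a crude exponential decay first, then the sharp rate) can be illustrated on a scalar toy model. The sketch below is not the differential inequality of the proof: it assumes a hypothetical model ODE $y' = -(\lambda - C\,y^{\theta})\,y$, where y plays the role of the free energy, λ of the sharp rate, and $C\,y^{\theta}$ of the correction that vanishes as the relative error tends to zero. All names and parameter values are illustrative.

```python
import math


def rk4_step(f, t, y, dt):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + dt / 2.0, y + dt * k1 / 2.0)
    k3 = f(t + dt / 2.0, y + dt * k2 / 2.0)
    k4 = f(t + dt, y + dt * k3)
    return y + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def solve_toy(lam=1.0, C=1.0, theta=0.5, y0=0.04, T=20.0, dt=0.01):
    """Integrate the toy model y' = -(lam - C y**theta) y on [0, T] and return y(T)."""
    f = lambda t, y: -(lam - C * y ** theta) * y
    y = y0
    for i in range(round(T / dt)):
        y = rk4_step(f, i * dt, y, dt)
    return y


def two_step_bounds(lam=1.0, C=1.0, theta=0.5, y0=0.04):
    """Step 1: crude rate c = lam - C y0**theta, valid since y is nonincreasing.
    Step 2: y(t) exp(lam t) <= y0 exp(C y0**theta / (theta c)), by integrating
    (log y)' + lam = C y**theta and using the crude decay of Step 1."""
    c = lam - C * y0 ** theta
    K = y0 * math.exp(C * y0 ** theta / (theta * c))
    return c, K


if __name__ == "__main__":
    T = 20.0
    yT = solve_toy(T=T)
    c, K = two_step_bounds()
    print("crude rate:", c, " sharp-rate constant bound:", K)
    print("y(T) exp(lam T) =", yT * math.exp(1.0 * T))
```

In this toy setting the crude rate c < λ only serves to make the correction term integrable in time; once that is known, the decay holds with the full rate λ, at the price of a larger multiplicative constant.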
• Step 1. By Proposition 14 we know that $h(t) := \|w(t) - 1\|_\infty \to 0$ as $t \to \infty$. According to Lemma 15, there exists $t_0 > 0$ such that h(t) ≤ 1/4 for any $t \ge t_0$, and we can additionally require that By combining this information, (24) and (37), we obtain which yields the exponential-decay estimate

• Step 2. As a consequence of Lemma 17 and in particular (31), we can infer that Moreover, it is clear that hence, inequality (37), which also holds with h = h(t), implies for a.e. $t > t_0$, so that by using again (24) we end up with the differential inequality

4.1. Best matching, refined estimates and $L^{1,\gamma}$-convergence. The relative entropy to the best matching Barenblatt function is defined as where the optimization is taken with respect to the scaling parameter µ > 0, that is, with respect to the set of the scaled Barenblatt functions We start by a computation of the asymptotic rates which follows the line of thought developed in [15,18]. Also see [35] for earlier considerations in this direction. An elementary calculation shows that in fact where µ is the unique scaling parameter for which the moment condition (40) holds. This approach can be applied to any function $v \in L^{1,\gamma}(\mathbb{R}^d)$ and in particular to a t-dependent solution to (2). Moreover, we observe that Hence µ = µ (t) is monotone, with a positive limit as t → ∞, and this limit has to be equal to 1. Another remark is that where J [v] denotes the relative Fisher information with respect to the best matching Barenblatt function, defined as

We can consider the linearized regime: if $v = B_\mu\,(1 + \varepsilon\,B_\mu^{1-m}\,f)$, by neglecting higher order terms in ε, the moment condition (40) becomes Let us recall the parameter ρ defined for the self-similar solution of the introduction by 1 With a simple scaling, we can also note that the spectral gap inequality of Proposition 3 is changed into (41) holds. However, compared to Proposition 3, we obtain that the inequality holds with $\Lambda = \Lambda_{\rm ess}$ if $\delta \le (n + 2)/2$ and with $\Lambda = \Lambda_{0,1}$ if $\delta \ge n/(2 - \eta)$, but with an improved spectral gap $\Lambda > \Lambda_{1,0}$ if $(n + 2)/2 < \delta < n/(2 - \eta)$, because of the orthogonality condition (41). See [4, Appendix B] for details. Hence, by arguing as for the proof of Theorem 4, we obtain for the relative entropy G the following improved convergence rate.
Next, we adapt the Csiszár-Kullback-Pinsker inequality of [16] to our setting. We recall that $m_1 := \frac{d-\gamma}{d+2+\beta-2\gamma}$.

Lemma 21. Let d ≥ 1, $m \in (m_1, 1)$ and assume that (6) holds. If v is a nonnegative function in

The proof goes exactly along the lines of the one of [16, Theorem 4], except that the expression of $B_\mu$ and the weight $|x|^{-\gamma}$ have to be taken into account. Details are left to the reader. Proposition 20 and Lemma 21 can be combined to give the result of convergence in $L^{1,\gamma}(\mathbb{R}^d)$ stated in Theorem 6.

4.2. Optimality of the constant on the curve of Felli and Schneider. For completeness, let us give the key idea of the proof of Theorem 1, (i), since the framework of the functional G is well adapted. In [4], the proof is purely variational, but the flow setting is particularly convenient as we shall see next.
Proof. As in [4, Proposition 7], we notice that for some explicit constants a and b and for $w = v^{m - 1/2}$. For a given function $w \in C_0^\infty(\mathbb{R}^d)$, let us consider $w_\mu(x) := \mu^{\frac{n}{2p}}\,w(\mu\,x)$ for any $x \in \mathbb{R}^d$. An optimization with respect to µ as in [20] shows the existence of a convex function Ψ such that .
Proof of Theorem 1, (i). Under the assumptions of Theorem 4, it is clear that the optimality in the inequality J (2) can be achieved only in the asymptotic regime, hence showing that $\Lambda = \tfrac12\,(2 + \beta - \gamma)^2/(1 - m) \ge \Lambda_{0,1}$. On the other hand, if symmetry holds in (5), the opposite inequality also holds and hence we have equality. This characterizes the curve $\beta = \beta_{\rm FS}(\gamma)$. This proof of course holds only for solutions corresponding to initial data such that (12) is satisfied, but an appropriate regularization allows us to conclude in the general case.
Using the results of [4, Lemma 8], we can deduce that this holds whenever η = 1, which means $\alpha = \alpha_{\rm FS}$ or, equivalently, $\beta = \beta_{\rm FS}(\gamma)$. As mentioned in the Introduction, in the case (β, γ) = (0, 0) Theorem 5 provides a better rate of convergence for the relative error than the one obtained in [2]; in particular, we have the same rate for all $L^q$ norms with q ranging from $\frac{2-m}{1-m}$ to ∞. However, in the case γ ∈ (0, d) the rate in (14) still depends on q. To some extent, this has to be expected. Indeed, as soon as γ > 0, it can easily be shown that Gagliardo-Nirenberg interpolation inequalities of the type of (34) fail if one puts an $L^{p,\gamma}$ norm in the right-hand side. Since such inequalities are key to turning the decay of the free energy (38) into a uniform decay, the only way we can exploit them, as is clear from the proof of Lemma 17, is by bounding a non-weighted norm of the relative error by a weighted norm or by the free energy, as in (33), and this is precisely what causes the rate to differ.
Finally, let us mention a puzzling moment conservation. It is straightforward to check that

Appendix. Hölder regularity at the origin for a degenerate/singular linear problem. First of all we observe that, for our purposes, it is convenient to change variables as in [4, Section 3.3], so that $v(t, r, \omega) = z(t, s, \omega)$ with $s = r^\alpha$ transforms (2) into
$$z_t - D_\alpha^*\big[\,z\,D_\alpha\big(z^{m-1} - |x|^2\big)\big] = 0$$
upon defining $D_\alpha^*$ as the adjoint to $D_\alpha$ on $L^2(\mathbb{R}^d, |x|^{n-d}\,dx)$, where the parameters α and n are as in (13) and
$$D_\alpha z := \Big(\alpha\,\frac{\partial z}{\partial s},\ \frac{1}{s}\,\nabla_\omega z\Big).$$
In this regard, let us recall here some basic facts taken from [4, Section 3.3]. If f and g are respectively a vector-valued function and a scalar-valued function, then In other words, if we take a representation of f adapted to spherical coordinates, that is s = |x| and ω = x/s, and consider $f_s := f \cdot \omega$ and $f_\omega := f - f_s\,\omega$, then where $\nabla_\omega$ denotes the gradient with respect to angular derivatives only. In particular,
$$D_\alpha^*\big[z_1\,D_\alpha z_2\big] = -\,D_\alpha z_1 \cdot D_\alpha z_2 + z_1\,D_\alpha^*(D_\alpha z_2)$$
with
$$-\,D_\alpha^*(D_\alpha z_2) = \frac{\alpha^2}{s^{n-1}}\,\frac{\partial}{\partial s}\Big(s^{n-1}\,\frac{\partial z_2}{\partial s}\Big) + \frac{1}{s^2}\,\Delta_\omega z_2,$$
where $\Delta_\omega$ represents the Laplace-Beltrami operator acting on $\omega \in \mathbb{S}^{d-1}$.
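As a quick sanity check of the last formula, one can evaluate it on radial powers. The following worked computation is only an illustration (the test function $z(s) = s^k$ is an arbitrary choice, not taken from the text): since z does not depend on ω, the angular part vanishes and only the radial part of $-D_\alpha^* D_\alpha$ contributes.

```latex
% Radial power z(s) = s^k, k real: the angular term \Delta_\omega z vanishes.
\begin{align*}
  -\,D_\alpha^*\big(D_\alpha s^k\big)
    &= \frac{\alpha^2}{s^{n-1}}\,\frac{\partial}{\partial s}
       \Big(s^{n-1}\,k\,s^{k-1}\Big)
     = \frac{\alpha^2\,k}{s^{n-1}}\,(n+k-2)\,s^{n+k-3} \\
    &= \alpha^2\,k\,(k+n-2)\,s^{k-2}.
\end{align*}
% Two special cases:
%   k = 2     gives  -D_\alpha^*(D_\alpha s^2) = 2\,n\,\alpha^2  (a constant),
%   k = 2 - n gives  -D_\alpha^*(D_\alpha s^{2-n}) = 0  for s > 0.
```

Up to the factor $\alpha^2$, this is the behaviour of the Laplacian on radial functions in a space of "dimension" n, which need not be an integer here.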
The advantage of resorting to this change of variables is that we can transform a problem with two different weights $|x|^{-\gamma}$ and $|x|^{-\beta}$ into a problem with two weights that are equal to $|x|^{n-d}$. It is remarkable that Barenblatt-type stationary solutions (11) are transformed into the standard Barenblatt profiles Details on the change of variables can be found in [4, Section 2.3]. With regard to the purpose of this Appendix, the main interest of the change of variables is that it allows us to use standard intrinsic cylinders. Given $(t_0, x_0) \in \mathbb{R}^+ \times \mathbb{R}^d$ and r > 0, let
$$Q_r(t_0, x_0) := \big\{(t, x) \in \mathbb{R}^+ \times \mathbb{R}^d \,:\, t_0 - 2\,r^2 < t < t_0,\ |x - x_0| < 2\,r\big\},$$
$$Q_r^+(t_0, x_0) := \big\{(t, x) \in \mathbb{R}^+ \times \mathbb{R}^d \,:\, t_0 - \tfrac14\,r^2 < t < t_0,\ |x - x_0| < \tfrac12\,r\big\},$$
$$Q_r^-(t_0, x_0) := \big\{(t, x) \in \mathbb{R}^+ \times \mathbb{R}^d \,:\, t_0 - \tfrac78\,r^2 < t < t_0 - \tfrac58\,r^2,\ |x - x_0| < \tfrac12\,r\big\}.$$
See Fig. 2. As a straightforward consequence of the above definitions, there holds $Q_{r/4}(t_0, x_0) \subset Q_r^+(t_0, x_0)$. The above cylinders are the same as the classical parabolic cylinders: having the same weights gives the same scaling properties as in the non-weighted case, as first remarked in [8].
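The geometry of the three cylinders can be checked by direct sampling. The sketch below is only an illustration: the function names, the sampling box and the parameter values are hypothetical, and the definitions coded are $Q_r$: $t_0 - 2r^2 < t < t_0$, $|x - x_0| < 2r$; $Q_r^+$: $t_0 - r^2/4 < t < t_0$, $|x - x_0| < r/2$; $Q_r^-$: $t_0 - 7r^2/8 < t < t_0 - 5r^2/8$, $|x - x_0| < r/2$.

```python
import math
import random


def in_Q(t, x, t0, x0, r):
    """Membership in Q_r(t0, x0)."""
    return t0 - 2.0 * r ** 2 < t < t0 and math.dist(x, x0) < 2.0 * r


def in_Q_plus(t, x, t0, x0, r):
    """Membership in the upper cylinder Q_r^+(t0, x0)."""
    return t0 - r ** 2 / 4.0 < t < t0 and math.dist(x, x0) < r / 2.0


def in_Q_minus(t, x, t0, x0, r):
    """Membership in the lower cylinder Q_r^-(t0, x0)."""
    return (t0 - 7.0 * r ** 2 / 8.0 < t < t0 - 5.0 * r ** 2 / 8.0
            and math.dist(x, x0) < r / 2.0)


def check_inclusion(t0, x0, r, trials=20000, seed=0):
    """Sample points and test Q_{r/4} inside Q_r^+; return (ok, number of points tested)."""
    rng = random.Random(seed)
    tested = 0
    for _ in range(trials):
        t = t0 - rng.uniform(0.0, r ** 2 / 4.0)
        x = tuple(c + rng.uniform(-r, r) for c in x0)
        if in_Q(t, x, t0, x0, r / 4.0):
            tested += 1
            if not in_Q_plus(t, x, t0, x0, r):
                return False, tested
    return True, tested


if __name__ == "__main__":
    print(check_inclusion(1.0, (0.0, 0.0, 0.0), 0.5))
```

The inclusion is in fact immediate: $Q_{r/4}$ has time span $2\,(r/4)^2 = r^2/8 < r^2/4$ and radius $2\,(r/4) = r/2$. Note also that $Q_r^-$ and $Q_r^+$ are separated in time by a gap of length $3r^2/8$, which is the usual waiting time in a parabolic Harnack inequality.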
Our aim here is to study the local Hölder regularity for solutions to a weighted linear problem of the form for some functions a and B which depend on (t, x) ∈ R + × R d . By following the ideas of F. Chiarenza and R. Serapioni in [8], we start by establishing a parabolic Harnack inequality, through a weighted Moser iteration.

Proposition 23 (A parabolic Harnack inequality).
Assume that a is locally bounded and bounded away from zero and that B is locally bounded in R + × R d . Let d ≥ 2, α > 0 and n > d. If u is a bounded positive solution of (42), then for all (t 0 , x 0 ) ∈ R + × R d and r > 0 such that Q r (t 0 , x 0 ) ⊂ R + × B 1 , we have The constant H > 1 depends only on the local bounds on the coefficients a, B and on d, α, and n.
Proof. The proof follows the lines of [8, Theorem 2.1] with minor modifications. Let us emphasize the main adaptations. We observe that a critical Caffarelli-Kohn-Nirenberg inequality can be rewritten after the change of variables $s = r^\alpha$ as and is actually scale invariant. See [11, Inequality 3.2] for details, including symmetry issues and the computation of $K_{n,\alpha}$ in the symmetry range. This inequality plays the same role as the one of [8, Lemma 1.1]. Then the proof follows upon replacing ∇ by $D_\alpha$. The term B u is in fact of lower order, since it is locally bounded: it can easily be reabsorbed into the energy estimates. By translating the intrinsic cylinders with respect to t by $r^2$, we achieve the conclusion.
The Harnack inequality of Proposition 23 implies Hölder continuity, by adapting the classical method à la De Giorgi to our weighted framework.
Since the change of variables $s = r^\alpha$ transforms Hölder functions into Hölder functions (but of course not $C^1$ functions into $C^1$ functions), as a direct consequence of Corollary 24 we have an analogous result for the original (linear) equation.