Improved energy methods for nonlocal diffusion problems

We prove an energy inequality for nonlocal diffusion operators of the following type, and some of its generalisations: $Lu (x) := \int_{\mathbb{R}^N} K(x,y) (u(y) - u(x)) dy$, where $L$ acts on a real function $u$ defined on $\mathbb{R}^N$, and we assume that $K(x,y)$ is uniformly strictly positive in a neighbourhood of $x=y$. The inequality is a nonlocal analogue of the Nash inequality, and plays a similar role in the study of the asymptotic decay of solutions to the nonlocal diffusion equation $\partial_t u = L u$ as the Nash inequality does for the heat equation. The inequality allows us to give a precise decay rate of the $L^p$ norms of $u$ and its derivatives. As compared to existing decay results in the literature, our proof is perhaps simpler and gives new results in some cases (particularly, and surprisingly, in dimensions $N = 1, 2$).


Introduction
In this paper we develop energy methods which are useful in the study of some partial differential equations involving nonlocal diffusion terms. We start with the basic example, which is the following integro-differential equation in convolution form:

$$\partial_t u(t,x) = \int_{\mathbb{R}^N} J(x-y)\,\big(u(t,y) - u(t,x)\big)\,dy, \tag{1}$$

where $t \ge 0$ is the time variable, $x \in \mathbb{R}^N$ is the space variable, $u = u(t,x) \in \mathbb{R}$ is the unknown, and $J$ is the diffusion kernel. Typically one assumes that $J$ is smooth, nonnegative, radially symmetric, and with integral 1; we also mention a variety of models with different assumptions and variations of (1) in Section 4. Equation (1) and its relatives appear as a nonlocal version of the usual diffusion equation $\partial_t u = \Delta u$, and it is known that (1) approximates it when $J$ is close to a Dirac delta (see Theorem 1.8 and the remarks before it). We will apply energy methods to deal with nonlocal problems that do not necessarily involve a convolution, that is, problems of the form

$$\partial_t u(t,x) = \int_{\mathbb{R}^N} K(x,y)\,\big(u(t,y) - u(t,x)\big)\,dy, \tag{2}$$

where our main hypotheses on $K$ can be summarized as follows: $K(x,y)$ is a nonnegative symmetric function with $\sup_{y \in \mathbb{R}^N} \int_{\mathbb{R}^N} K(x,y)\,dx \le C_K$ and such that $K$ is strictly positive in a neighborhood of the closed set $\{x = y\}$. Furthermore, the symmetry of $K$ can be replaced by integrability conditions (see Subsection 4.2). Observe also that it makes sense to assume that $K(x,x) > 0$, since in many models it means that individuals have a positive probability of remaining for some time at the point where they are. As a particular application which motivates our arguments we consider the nonlocal dispersal model proposed by Cortázar et al. (2007) (see also Cortázar et al. (2011); Cortázar et al. (2015); Cortázar et al. (2016)):

$$\partial_t u(x,t) = \int_{\mathbb{R}} J\Big(\frac{x-y}{g(y)}\Big)\,\frac{u(y,t)}{g(y)}\,dy - u(x,t), \tag{3}$$

with prescribed initial data $u(x,0) = u_0(x)$ defined on $\mathbb{R}$. Here $J$ is an even, positive, smooth function such that $\int_{\mathbb{R}} J(x)\,dx = 1$ and $\operatorname{supp} J = [-1,1]$, and $g$ is a continuous positive function which accounts for the dispersal distance, which depends on the departing point.
In this model u represents the spatial distribution of a certain species, and g models the heterogeneity of the environment which can affect the distribution of a species through space-dependent dispersal strategies. For this model we are able to give an explicit rate of decay of the L p norm of solutions, which is to our knowledge a new result (see Theorem 4.3).
The driving idea of our methods is that solutions to (1) behave in many ways like solutions to the heat equation

$$\partial_t u = \Delta u, \tag{4}$$

where as usual the Laplacian $\Delta$ acts only on the space variable $x$ (see Theorem 1.8 and the comments before it). For more details we refer the reader to Sun et al. (2011) for the Cauchy problem, Cortázar et al. (2009) for Dirichlet boundary conditions (see also Molino and Rossi (2016) in a more general framework) and Cortázar et al. (2008) for Neumann boundary conditions. One important property of (4) is the following time decay of solutions (see for instance Giga et al. (2010)): there is a constant $C = C(N,p) > 0$ such that, for $u_0 \in L^1(\mathbb{R}^N) \cap L^p(\mathbb{R}^N)$,

$$\|u(t,\cdot)\|_p^p \le \Big(\|u_0\|_p^{-p\gamma} + C\,\|u_0\|_1^{-p\gamma}\,t\Big)^{-1/\gamma} \quad \text{for all } t \ge 0, \qquad \gamma := \frac{2}{N(p-1)}. \tag{5}$$

In fact, it still holds for $u_0 \in L^1(\mathbb{R}^N)$ and all $t > 0$ by removing the term $\|u_0\|_p^{-p\gamma}$. Here and below, $L^p(\mathbb{R}^N)$ denotes the usual Lebesgue space of $p$-integrable functions on $\mathbb{R}^N$, with associated norm denoted by $\|\cdot\|_p$. There are several ways of showing this decay and regularization property for the heat equation. One of them is noticing that the $L^p$ norms are Lyapunov functionals for (4): if $u$ solves (4) with $u_0 \in L^p(\mathbb{R}^N)$ then

$$\frac{d}{dt}\int_{\mathbb{R}^N} |u|^p\,dx = -p(p-1)\int_{\mathbb{R}^N} |u|^{p-2}\,|\nabla u|^2\,dx = -\frac{4(p-1)}{p}\int_{\mathbb{R}^N} \big|\nabla\big(|u|^{p/2}\big)\big|^2\,dx. \tag{6}$$

One can then compare the right hand side to $\|u\|_p$ by using the Gagliardo-Nirenberg-Sobolev inequality (which in this particular case is known as the Nash inequality, Nash (1958)):

$$\|\nabla v\|_2^2 \ge C\,\|v\|_1^{-4/N}\,\|v\|_2^{2+4/N}. \tag{7}$$

This inequality is valid in any dimension $N$; in dimensions $N \ge 3$ it can easily be obtained as a consequence of the more familiar Sobolev inequality $\|u\|_{2^*} \le C\|\nabla u\|_2$, where $2^* := 2N/(N-2)$. By using (7) with $v = |u|^{p/2}$ we obtain for any $p \ge 2$ that

$$\big\|\nabla\big(|u|^{p/2}\big)\big\|_2^2 \ge C\,\big\||u|^{p/2}\big\|_1^{-4/N}\,\big\||u|^{p/2}\big\|_2^{2+4/N} \ge C\,\|u\|_1^{-p\gamma}\,\|u\|_p^{p(1+\gamma)}, \tag{8}$$

where the last step is obtained through an interpolation of $\|u\|_{p/2}$ between $\|u\|_p$ and $\|u\|_1$. Due to mass conservation for the heat equation we have $\|u\|_1 \le \|u_0\|_1$ for all times $t \ge 0$ (this inequality is of course an equality for nonnegative, finite-mass solutions). Hence using (8) in (6) one has

$$\frac{d}{dt}\|u\|_p^p \le -C\,\|u_0\|_1^{-p\gamma}\,\|u\|_p^{p(1+\gamma)}$$

for some constant $C = C(N,p)$. This is a differential inequality for $\|u\|_p^p$ that readily gives the decay (5).
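As a sanity check on the heat-equation decay rate discussed above, the following minimal sketch evaluates $\|G_t\|_p^p$ for the one-dimensional heat kernel $G_t$ by a Riemann sum and fits the decay exponent in $t$. The grid parameters, the choice $p = 3$ and the function names are illustrative assumptions; the expected exponent is $-1/\gamma = -N(p-1)/2$ with $N = 1$.

```python
import numpy as np

def heat_kernel_lp_norm_p(t, p, half_width=200.0, n=400_001):
    """Riemann-sum approximation of ||G_t||_p^p for the 1D heat kernel G_t."""
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    g = np.exp(-x**2 / (4.0 * t)) / np.sqrt(4.0 * np.pi * t)
    return np.sum(g**p) * dx

def fitted_decay_exponent(p, times):
    """Least-squares slope of log ||G_t||_p^p against log t."""
    norms = np.array([heat_kernel_lp_norm_p(t, p) for t in times])
    slope, _intercept = np.polyfit(np.log(times), np.log(norms), 1)
    return slope

p = 3.0                                   # illustrative exponent
times = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
slope = fitted_decay_exponent(p, times)
predicted = -(p - 1.0) / 2.0              # -1/gamma with gamma = 2/(N(p-1)), N = 1
print(slope, predicted)
```

The heat kernel is used here only because its norms are explicit; it has unit mass, so the fitted slope isolates the power of $t$ in (5).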
In the context of diffusion equations, the strategy of using the $L^p$ norm of $u$ and its derivative as a means for studying properties of solutions is known as the energy method. It is a close relative of a common and quite successful strategy in kinetic equations and dissipative PDE sometimes known as the entropy method (Arnold et al., 2004; Bakry and Émery, 1985; Bonforte et al., 2010; Carrillo et al., 2001; Desvillettes and Villani, 2004; Gross, 1975; Otto and Villani, 2000; Villani, 2002), where one compares the time derivative of a Lyapunov functional with the Lyapunov functional itself via a functional inequality in order to obtain a certain decay rate for solutions. These energy methods have the advantage of being quite robust, often being applicable to equations that are not explicitly solvable by Fourier transform methods, and to nonlinear problems. The question that motivates this paper is whether these ideas can be adapted to equation (1) in order to show a decay property similar to (5). One important observation is that the same statement cannot be true for solutions of (1), since there is no instantaneous $L^1$ to $L^p$ regularization. Nevertheless, the $L^p$ norms are still Lyapunov functionals for (1) (as is well known, any convex function gives a Lyapunov functional for (1)): if $u$ is an $L^p$ solution to (1) then

$$\frac{d}{dt}\int_{\mathbb{R}^N} |u|^p\,dx = -\frac{p}{2}\,D_p^J(u). \tag{9}$$

Here, the $L^p$ dissipation $D_p^J(u)$ is defined for any measurable $u : \mathbb{R}^N \to \mathbb{R}$ as

$$D_p^J(u) := \int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(x-y)\,\big(u(x) - u(y)\big)\,\big(\phi_{p-1}(u(x)) - \phi_{p-1}(u(y))\big)\,dx\,dy, \tag{10}$$

where for $q > 0$ we denote by $\phi_q$ the antisymmetric extension of the usual $q$-th power, that is, $\phi_q(s) := |s|^q \operatorname{sgn}(s)$, $s \in \mathbb{R}$.
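To make the dissipation functional concrete, here is a minimal numerical sketch in dimension one. It assumes that $D_p^J(u)$ is the double integral of $J(x-y)\,(u(x)-u(y))\,(\phi_{p-1}(u(x))-\phi_{p-1}(u(y)))$, possibly up to a normalising constant, and approximates it by a Riemann sum on a uniform grid. The box kernel `J_box`, the test function and the grid are hypothetical choices made only for illustration.

```python
import numpy as np

def phi(s, q):
    """Antisymmetric extension of the q-th power: |s|^q * sgn(s)."""
    return np.abs(s) ** q * np.sign(s)

def dissipation(u, x, J, p):
    """Riemann-sum approximation of D_p^J(u) on a uniform 1D grid x."""
    dx = x[1] - x[0]
    kernel = J(x[:, None] - x[None, :])              # J(x_i - x_j)
    du = u[:, None] - u[None, :]                     # u(x_i) - u(x_j)
    w = phi(u, p - 1.0)
    dphi = w[:, None] - w[None, :]                   # phi_{p-1} differences
    return np.sum(kernel * du * dphi) * dx * dx

def J_box(z):
    """Hypothetical kernel: normalised indicator of (-1, 1)."""
    return 0.5 * (np.abs(z) < 1.0)

x = np.linspace(-10.0, 10.0, 401)
u = np.sin(3.0 * x) * np.exp(-x**2 / 4.0)            # sign-changing test function
D3 = dissipation(u, x, J_box, 3.0)
D3_abs = dissipation(np.abs(u), x, J_box, 3.0)       # dissipation of |u|
D2 = dissipation(u, x, J_box, 2.0)
print(D3, D3_abs, D2)
```

Since $\phi_{p-1}$ is nondecreasing, each term of the discrete sum is nonnegative, so the computed value is nonnegative as well; for $p = 2$ the integrand reduces to $J(x-y)\,(u(x)-u(y))^2$.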
Of course, since φ p−1 is nondecreasing, the integrand in (10) is also nonnegative and always makes sense as a number in [0, +∞]. We point out that for nonnegative u the expression becomes a bit simpler, Precisely this strategy was discussed in Ignat and Rossi (2009), where it was remarked that no inequality of the following form can hold, for any q > 2 and a smooth, nonnegative, compactly supported function J: Hence the natural analogue of the usual Sobolev inequality does not hold in the nonlocal case. Similarly, the direct analogue of (8) (with D J p (u) on the left hand side) cannot hold, since it would imply an L 1 − L p regularization effect on (1) which is known to fail. In view of this failure, a different strategy was followed there, leading to different inequalities and applications to several linear and nonlinear equations involving nonlocal diffusions. Similar ideas were developed in Brändle and de Pablo (2015) in order to establish decay estimates for fractional diffusions, with modified inequalities used in place of the usual Nash inequality. After the statement of our results we compare them in more detail to those in Brändle and de Pablo (2015); Ignat and Rossi (2009) and other previous works.
Main results. Our purpose is to show a simple inequality that plays the role of (8) and provides a means to show precise decay properties of (1) and (2). Hypothesis 1. $J : \mathbb{R}^N \to [0,+\infty)$ is a measurable function such that for some $r, R > 0$ we have $J(z) \ge r$ for all $|z| < R$. In particular, this is obviously satisfied if $J$ is continuous in a neighborhood of 0 with $J(0) > 0$.
Theorem 1.1 ($L^p$ energy inequality). Let $J : \mathbb{R}^N \to \mathbb{R}$ be a function satisfying Hypothesis 1. For every $N \ge 1$ and $p \ge 2$, there exists a constant $C = C(N,p) > 0$ such that

$$D_p^J(u) \ge C\,r R^N \min\Big\{ R^2\,\|u\|_1^{-p\gamma}\,\|u\|_p^{p(1+\gamma)},\ \|u\|_p^p \Big\}, \qquad \gamma := \frac{2}{N(p-1)}. \tag{12}$$

This inequality serves as a useful analogue of (8) in the nonlocal case, as we will see shortly. If one does not care about the precise dependence of the constant $C$ on $J$ then this can be more simply stated as: there exists a constant $C = C(N,p,J)$ depending only on $N$, $p$ and $J$ such that

$$D_p^J(u) \ge C \min\Big\{ \|u\|_1^{-p\gamma}\,\|u\|_p^{p(1+\gamma)},\ \|u\|_p^p \Big\}. \tag{13}$$

The constants in the above inequalities can be estimated explicitly by following the proof. To our knowledge, inequality (12) is new. Similar modified Nash inequalities are considered in Carlen et al. (1987); Ignat and Rossi (2009), and especially in Brändle and de Pablo (2015, Corollary 4.7). In the latter, $(p,q)$-inequalities involving the $p$ and $q$ norms of $u$ are given for $p > q > 1$; ours is the limiting case with $q = 1$, not included there. We notice that the $L^1$ case is fundamental for the generalisations we describe later, since mass is a natural conserved quantity in many models.
The inequality in Theorem 1.1 immediately allows us to deduce bounds on the asymptotic behaviour of several nonlocal diffusion equations (see Section 4). Let us give the argument for equation (1), which is the simplest possible model: using Taking into account that u 1 is nonincreasing in time (it is conserved for nonnegative solutions) one has This is a differential inequality for u p , which can be solved (see Lemma 4.1) to yield the following result: Theorem 1.2. Take a function J ∈ L 1 (R N ) satisfying Hypothesis 1 and p ∈ [2, +∞). Consider the solution u to equation (1) with initial data u 0 ∈ L 1 (R N ) ∩ L p (R N ). There exists a constant C = C(N, p) (the same as in Theorem 1.1) such that where γ := 2 N (p−1) and Again, if we are not interested in the precise dependence of the bound on J, u 0 1 and u 0 p then the following statement is simpler: there exists a constant C = C(r, R, N, p, u 0 1 , u 0 p ) such that This is a direct consequence of the bound (14); see Remark 4.2. In this sense, Theorem 1.1 is a nonlocal analogue of the Gagliardo-Nirenberg-Sobolev (or Nash) inequality: it allows us to give a decay rate of the nonlocal diffusion equation (1), and in fact this decay rate approaches that of the heat equation as (1) approaches it (see Theorem 1.8). Furthermore, due to the interpolation formula and using inequality (15) for p = 2, we obtain that for q ∈ [1, 2] which means that (15) also holds for 1 ≤ p < 2 and some positive constant C = C(J, N, p, u 0 1 , u 0 2 ). We also give inequalities related to higher derivatives of u in Section 3, and deduce from them corresponding decay properties of derivatives of u, still at the same asymptotic rate as those for the heat equation. For k ≥ 0 we define the differential operator D k acting on a function u as In order to ensure that this expression makes sense we will always assume that u ∈ H k (R N ) (i.e., the classical Sobolev space W k,2 (R N )) when applying D k . 
The following result gives an estimate of D J 2 (D k u); note that the case k = 0 is just the p = 2 case of Theorem 1.1: Theorem 1.3. Let N ≥ 1 be an integer and J : R N → R be a function satisfying Hypothesis 1. There exists a positive constant C = C(N ) such that As a consequence one can obtain a decay of higher derivatives of solutions to (1). Notice that the case k = 0 of the following result is just Theorem 1.2 with p = 2: Theorem 1.4. Take a function J satisfying Hypothesis 1 and a real k ≥ 0. Consider the solution u to equation (1) with initial data u 0 ∈ L 1 (R N ) ∩ H k (R N ). There exists a constant C = C(N, k) (the same as in Theorem 1.3) such that where γ := 2 N +2k and The decay in Theorem 1.2 can be interpreted as follows: for large times, the asymptotic decay of the L p norm of solutions to the nonlocal diffusion equation (1) is the same as that of the heat equation. However, there can be an initial time during which a different decay takes place. The threshold between the two is related to the value of the L p norm of u: if it is large then heuristically (since we are assuming u 0 is integrable) the main contribution to the L p norm comes from local concentrations of u. Since the smoothing effect of (1) is much weaker than that of the heat equation, the rates of decay of the two differ. On the other hand, when u p is small, the concentrations of u do not play a major role and the decay of both equations becomes comparable. The inequality (12) and the corresponding decay (14) are quite precise on the dependence on J and the initial data, giving a direct estimate of the time when the "heat-like" diffusion kicks in: the time t 0 depends logarithmically on the ratio between u 0 p and u 0 1 . Theorem 1.2 as stated is not new; the simplified statement (15) can be proved for example by Fourier transform methods (Andreu-Vaillo et al., 2010), and the decay (14) can probably be obtained as well. 
The important advantage of using Theorem 1.1 to prove Theorem 1.2 is that the method is quite robust under modifications in the linear operator. In Subsection 4.2 we prove a result similar to Theorem 1.2 which gives decay properties for general nonlocal diffusion equations with a more general kernel K(x, y) instead of J(x − y): consider the equation is a general kernel (not necessarily symmetric) and σ : R N → [0, +∞) is a function. Let us keep our discussion at a formal level for the moment and not worry about the problem of existence of solutions to (17) or the precise regularity of K and σ. Equation (17) is a general form of the scattering equation (see for example Michel et al. (2004)), and contains many others as a particular case. The nonlocal diffusion (1) which is a type of nonlocal diffusion equation, where the nonlocality is not given by a convolution. Similarly, if we assume then equation (17) is formally the Kolmogorov forward equation for a Markov jump process with jump rates given by K, where u represents the probability density of the process (Ethier and Kurtz, 1986, Chapter 4.2). Notice that (19) is just the statement that the total mass R N u(t, x) dx is formally conserved in time (as should happen for a probabilistic evolution). In that sense, equation (17) contains many evolution equations linked to Markov processes, and has multiple applications. (We give an example linked to a population dispersal in Section 4.3.) Equation (17) has some properties in common with diffusion processes, but it is important to notice that (17) may have finite-mass equilibria (unlike the usual heat equation, whose only finite-mass equilibrium is 0). Let us state a precise result which is relevant for nonlocal diffusions. For all of them we will assume: This is the analogue of Hypothesis 1 in this setting. 
In order to ensure that L p solutions of (17) exist we will also assume that K is measurable and that for some This ensures that the linear operator on the right hand side of (17) is bounded in L 1 (R N ) and L ∞ (R N ) (and hence, by interpolation, in any L p (R N ) with 1 ≤ p ≤ ∞).
Theorem 1.5. Assume Hypothesis 2 and (20). Consider equation (17) with σ given by (19), and assume that there exists an equilibrium u ∞ of (17) satisfying for some m > 0. Let u be any solution to equation (17) with initial data There exists a constant C depending only on r, R, N , m, p, u 0 1 and u 0 p such that In Section 4.3 we give an application of these results to a dispersal equation proposed in Cortázar et al. (2007), obtaining an explicit rate of convergence to equilibrium.
Remark 1.6. Condition (20) is included only in order to ensure that there are well-defined solutions to (17); it does not play a role in the decay estimates. It can be removed if it can be justified by other means that solutions to (17) exist and rigorously satisfy the entropy property (9).
Remark 1.7. In Theorem 1.5 one can also give a more precise estimation of the decay and the constants involved, as we did in Theorem 1.2. We have preferred in this case to leave the statement in this form for simplicity, but the reader can state the analogue of Theorem 1.2 without difficulty.
We refer to Section 4.2 for details on this and a proof of Theorem 1.5.
Heat equation scaling. It is worth mentioning that Theorems 1.1 and 1.2 pass to the limit well when the nonlocal equation (1) approximates the heat equation. Let J be a smooth and radially symmetric convolution kernel with J(0) > 0, and denote by J ǫ the rescaling It is well-known that, u ε , the solution to the equation with initial data u 0 ∈ C(R N ) converges to the solution of the heat equation ∂ t v = ∆v with the same initial data (see for instance Andreu-Vaillo et al. (2010); Rey and Toscani (2013)). Since J satisfies Hypothesis 1 for some r, R > 0 one has J ε (z) ≥ rC(J) ε 2+N , for all |z| < Rε. Replacing this in expression (14) the ε is cancelled and we obtain the following result: Theorem 1.8. Assume J satisfies Hypothesis 1. Let u ε be a solution of (22) with initial data u 0 ∈ L 1 (R N ) ∩ L p (R N ) with p ∈ [2, ∞). Then it holds where C 1 = C(N, p)γrR N +2 C(J) does not depend on ε and In particular, t 0 = 0 for all ε < ε 0 = u 0 The interest of the above theorem is that the decay is preserved in the scaling that leads to the heat equation. In addition, for small ε the expression of the decay is exactly of the same form as that of the heat equation, given in (5).
Comparison to results in the literature. Several precise results exist already regarding the decay properties of equation (1). Let us give a brief review and compare them to our own. Nonlocal diffusions including (1) have been studied in Chasseigne et al. (2006), and we refer the reader to the recent book Andreu-Vaillo et al. (2010) for background and an extensive review of the state of the art for equations involving similar nonlocal terms. A similar approximation to the heat equation, with a particular kernel J, was studied in Rey and Toscani (2013), and some nonlocal approximations to Fokker-Planck equations have been recently considered in Mischler and Tristani (2016) and very recently in Toscani (2017).
The observation that solutions to (1) decay asymptotically like the heat equation has been present since the first works on the matter, with several analogues of (5). The first ones were based on the Fourier transform of (1), which is explicitly solvable (Chasseigne et al. (2006); Rossi (2007, 2008)). Energy methods were considered in Ignat and Rossi (2009); results were given on the decay of several models including the linear nonlocal diffusion equation (1) and a nonlocal version of the p-Laplacian evolution equation. The method in Ignat and Rossi (2009) is different from ours, and is based on a splitting of the function u into a "smooth" part and a "rough" part. The ideas are somewhat reminiscent of ours, since they borrow techniques from Fourier splitting by Schonbek (1980) and there is a parallel with our splitting of the function u in Fourier space. The results from Ignat and Rossi (2009) are valid in dimensions N ≥ 3 and for symmetric K; on the other hand, they are well-adapted to nonlinear problems like the nonlocal p-Laplacian equation. Our inequality seems to follow from a simpler argument, which works in any dimension and is well-adapted to the linear nonlocal diffusion operator, but does not easily carry over to nonlinear nonlocal operators. It also gives a simple way to track the dependence of the decay on the parameters of the problem, especially the diffusion kernel J.
Inequalities of the type (12) were already noticed in Brändle and de Pablo (2015), and used in order to obtain decay and regularisation properties for nonlinear diffusions of the type (1) where the function J typically behaves as |x| −N −α as |x| → +∞, for some 0 < α ≤ 2. Their proof goes along the lines of Ignat and Rossi (2009). Inequality (12) is a limit case of their results, but is not included there, for reasons similar to those in Ignat and Rossi (2009).
As compared to previous results, we summarise our contributions as follows: (1) Inequality (12) seems to be new. Similar ideas were used in Brändle and de Pablo (2015); Ignat and Rossi (2009), but (12) is a limiting case not included in these works.
(2) Our proof of the inequality (12) is straightforward, works in any dimension, and in our opinion simplifies previous arguments for related inequalities. It also leads to a precise estimate of the constants in the inequality, which have in particular the correct scaling when approximating the heat equation (see Theorem 1.8).
(3) A similar method of proof yields inequalities and decay results involving higher derivatives of the function u; see Section 3.
(4) The entropy method used allows for an extension to linear mass-conserving equations with general kernels K(x, y) (not necessarily symmetric) instead of J(x − y); see Subsection 4.2.
The paper is organised as follows: in Section 2 we give the proof of the inequality in Theorem 1.1, and in Section 3 we prove similar inequalities involving derivatives. Finally, in Section 4 we show how these inequalities yield decay properties for several equations involving general kernels K(x, y), in particular proving Theorem 1.2 in Subsection 4.1.

Energy inequalities for nonlocal diffusion operators
We are interested in finding useful lower bounds of $D_p^J(u)$ in terms of $L^p$ norms of $u$. Since $(|a| - |b|)(|a|^s - |b|^s) \le (a - b)(\phi_s(a) - \phi_s(b))$ for any $a, b \in \mathbb{R}$ and $s > 1$ (where $\phi_s(a) := |a|^s \operatorname{sgn}(a)$), it is easily seen that

$$D_p^J(|u|) \le D_p^J(u)$$

for any measurable $u : \mathbb{R}^N \to \mathbb{R}$. This allows us to work only with nonnegative functions $u$. This section is devoted to the proof of Theorem 1.1. We first show the case $p = 2$, and then deduce from it the general inequality for $p \ge 2$. The proof of the $p = 2$ case is a modification of the original proof of the Nash inequality (7) appearing in the paper by Nash (1958): Lemma 2.1. Let $I$ be the normalised characteristic function of the unit ball in $\mathbb{R}^N$,

$$I(x) := \frac{1}{\omega_N}\,\mathbf{1}_{\{|x| < 1\}}(x), \qquad x \in \mathbb{R}^N, \tag{23}$$

where $\omega_N$ is the volume of the unit ball in dimension $N$. There exists a constant $C = C(N)$ depending only on $N$ such that

$$D_2^I(u) \ge C \min\Big\{ \|u\|_1^{-4/N}\,\|u\|_2^{2+4/N},\ \|u\|_2^2 \Big\}. \tag{24}$$

We point out that the constant $C$ can be estimated explicitly by following the calculations in the proof below.
Since $I$ has integral one we can write, using that the Fourier transform is an isometry of $L^2(\mathbb{R}^N; \mathbb{C})$, where $\langle \cdot, \cdot \rangle$ denotes the usual inner product in the space of $L^2$ complex functions in $\mathbb{R}^N$. We can break the integral of u 2 in two parts, for any $\delta > 0$: These two terms can be estimated as follows: for the first one, For the second one, using (25) and assuming $\delta < 1$ we have Using (27) and (28) in (26) we obtain

$$\|u\|_2^2 \le \omega_N\,\delta^N\,\|u\|_1^2 + \frac{C_1}{\delta^2}\,D_2^I(u) \qquad \text{for any } 0 < \delta < 1. \tag{29}$$
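We would like to optimise this quantity in $\delta$, but it is only valid for $0 < \delta < 1$. If we could choose $\delta$ freely we would take the one that achieves the best bound in the inequality (29), that is,

$$\delta_0 := \left( \frac{2 C_1\,D_2^I(u)}{N \omega_N\,\|u\|_1^2} \right)^{1/(N+2)}.$$

Now we discuss two cases. Case 1. If $\delta_0 < 1$, then replacing $\delta$ by $\delta_0$ in (29) we have

$$\|u\|_2^2 \le C_2\,\|u\|_1^{4/(N+2)}\,D_2^I(u)^{N/(N+2)}$$

for some constant $C_2$ depending only on $N$ and $C_1$. Equivalently,

$$D_2^I(u) \ge C_2^{-(N+2)/N}\,\|u\|_1^{-4/N}\,\|u\|_2^{2+4/N}. \tag{30}$$

Case 2. If $\delta_0 \ge 1$ then this means that $N \omega_N \|u\|_1^2 \le 2 C_1 D_2^I(u)$. In this case, choosing $\delta = 1$ in (29) and using the above inequality we get

$$\|u\|_2^2 \le \omega_N\,\|u\|_1^2 + C_1\,D_2^I(u) \le \frac{N+2}{N}\,C_1\,D_2^I(u). \tag{31}$$

Finally, summarising (30) and (31) we obtain

$$D_2^I(u) \ge C \min\Big\{ \|u\|_1^{-4/N}\,\|u\|_2^{2+4/N},\ \|u\|_2^2 \Big\}. \tag{32}$$

Notice that $D_2^J(u)$ satisfies the following scaling property. For $\lambda > 0$ and any function $g$ on $\mathbb{R}^N$ we denote $g_\lambda(x) := g(x/\lambda)$. Then one sees that

$$D_2^{J_\lambda}(u_\lambda) = \lambda^{2N}\,D_2^J(u). \tag{33}$$

This easily gives the following extension of Lemma 2.1: Corollary 2.2 ($L^2$ energy inequality). Let $J$ satisfy Hypothesis 1. There is some constant $C = C(N)$ that depends only on the dimension $N$ such that

$$D_2^J(u) \ge C\,r R^N \min\Big\{ R^2\,\|u\|_1^{-4/N}\,\|u\|_2^{2+4/N},\ \|u\|_2^2 \Big\}. \tag{34}$$

Proof of Corollary 2.2. Call $I = I(z)$ the normalised characteristic function of the unit ball, and define $K(z) := \frac{1}{r \omega_N}\,J(Rz)$, $z \in \mathbb{R}^N$.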
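Then $K(z) \ge I(z)$ for all $z \in \mathbb{R}^N$, so $D_2^K(u) \ge D_2^I(u)$. Since $J = r \omega_N K_R$, due to the scaling (33) we have, writing $u_{1/R}(x) := u(Rx)$,

$$D_2^J(u) = r \omega_N\,D_2^{K_R}(u) = r \omega_N R^{2N}\,D_2^K(u_{1/R}) \ge r \omega_N R^{2N}\,D_2^I(u_{1/R}).$$

Hence we can use Lemma 2.1 (writing $C_N$ to denote the constant $C$ in it), together with $\|u_{1/R}\|_1 = R^{-N}\|u\|_1$ and $\|u_{1/R}\|_2^2 = R^{-N}\|u\|_2^2$, to say that

$$D_2^J(u) \ge C_N\,r \omega_N R^{2N} \min\Big\{ \|u_{1/R}\|_1^{-4/N}\,\|u_{1/R}\|_2^{2+4/N},\ \|u_{1/R}\|_2^2 \Big\} = C_N\,r \omega_N R^N \min\Big\{ R^2\,\|u\|_1^{-4/N}\,\|u\|_2^{2+4/N},\ \|u\|_2^2 \Big\}.$$

This shows the result.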
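Corollary 2.2 gives the case $p = 2$ of Theorem 1.1. In order to obtain the general case for $p \ge 2$ and complete the proof, let us first state without proof a simple elementary inequality in the next lemma: Lemma 2.3. Let $p > 1$. There exists $c(p) > 0$ such that

$$(a - b)\big(a^{p-1} - b^{p-1}\big) \ge c(p)\,\big(a^{p/2} - b^{p/2}\big)^2 \qquad \text{for all } a, b \ge 0. \tag{35}$$

We can now complete the proof of Theorem 1.1: Proof of Theorem 1.1. As explained at the beginning of Section 2, we may assume that $u$ is nonnegative. By using the inequality (35) we obtain

$$D_p^J(u) = \int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(x-y)\,\big(u(x) - u(y)\big)\big(u(x)^{p-1} - u(y)^{p-1}\big)\,dx\,dy \ge c(p)\int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(x-y)\,\big(u(x)^{p/2} - u(y)^{p/2}\big)^2\,dx\,dy = c(p)\,D_2^J(u^{p/2}).$$

Now, by virtue of Corollary 2.2, and calling $C_N$ the constant in it, it follows that

$$D_p^J(u) \ge c(p)\,C_N\,r R^N \min\Big\{ R^2\,\big\|u^{p/2}\big\|_1^{-4/N}\,\|u\|_p^{p(1+2/N)},\ \|u\|_p^p \Big\}.$$

Finally, due to the interpolation formula

$$\big\|u^{p/2}\big\|_1 = \|u\|_{p/2}^{p/2} \le \|u\|_1^{\frac{p}{2(p-1)}}\,\|u\|_p^{\frac{p(p-2)}{2(p-1)}},$$

we have $\|u^{p/2}\|_1^{-4/N} \ge \|u\|_1^{-p\gamma}\,\|u\|_p^{-p\gamma(p-2)}$ with $\gamma = \frac{2}{N(p-1)}$. Since $p(1 + 2/N) - p\gamma(p-2) = p(1+\gamma)$, this gives (12).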
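The elementary inequality of Lemma 2.3, $(a-b)(a^{p-1}-b^{p-1}) \ge c(p)\,(a^{p/2}-b^{p/2})^2$ for $a, b \ge 0$, can be probed numerically. The sketch below assumes the classical value $c(p) = 4(p-1)/p^2$, which follows from the Cauchy-Schwarz inequality applied to $a^{p/2} - b^{p/2} = \int_b^a \tfrac{p}{2} s^{p/2-1}\,ds$; this constant is a standard choice and is not taken from the text above. The random sampling range and the values of $p$ are arbitrary.

```python
import numpy as np

def lemma_gap(a, b, p):
    """Left side minus c(p) times right side, with c(p) = 4(p-1)/p^2 (assumed)."""
    c = 4.0 * (p - 1.0) / p**2
    lhs = (a - b) * (a ** (p - 1.0) - b ** (p - 1.0))
    rhs = (a ** (p / 2.0) - b ** (p / 2.0)) ** 2
    return lhs - c * rhs

rng = np.random.default_rng(0)
a = rng.uniform(0.0, 10.0, size=100_000)
b = rng.uniform(0.0, 10.0, size=100_000)

# Smallest observed gap for each p; it should be nonnegative up to rounding.
worst_gap = {p: float(np.min(lemma_gap(a, b, p))) for p in (1.5, 2.0, 3.0, 6.0)}
print(worst_gap)
```

For $p = 2$ the two sides coincide, which is consistent with $c(2) = 1$.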

Energy inequalities involving derivatives
We now prove Theorem 1.3, an inequality which is useful when studying the decay of derivatives of solutions to nonlocal diffusion equations: Proof of Theorem 1.3. The proof is a direct extension of the technique in the proof of Theorem 1.1. We follow the same steps. First, we assume that J is the normalised characteristic function of the unit ball in R N , given by (23). Then, closely following Lemma 2.1, we claim for some constant C N > 0 depending only on N . As in the proof of Lemma 2.1, Now, recalling inequality (25) and taking into account that | D k u(ξ)| 2 = |ξ| 2k |û(ξ)| 2 ≤ |ξ| 2k u 2 1 we obtain for 0 < δ ≤ 1 that .
We obtain, as in Lemma 2.1, two possibilities: if δ 0 ≤ 1, we get In the other case, δ 0 > 1, we get Collecting inequalities (38) and (39) we have Reversing the inequality we have thus proved (36).
In order to complete the proof we consider any J satisfying Hypothesis 1. We have a scaling property which is an extension of (33): for any λ > 0. Of course, we also have D k u λ = λ −k (D k u) λ , the usual scaling for derivatives. If I denotes the characteristic function of the unit ball on R N and we define K = 1 rωN J 1/R as in the proof of Corollary 2.2 then K ≥ I, and J = rω N K R . Using the scaling property (40) and the normalised case (36) we see that which shows the result.
We point out that analogous results can be stated for other differential operators. As an example we consider ∇u. Following the notation of the preceding section we set defined for any u ∈ H 1 (R N ). Reasoning along the same lines as in the previous result one obtains the following result for ∇u (notice that this is not the same as the k = 1 case of Theorem 1.3, since D 1 u is not equal to ∇u): Theorem 3.1. Let N ≥ 1 be an integer and J : R N → R be a function satisfying Hypothesis 1. There exists a positive constant C = C(N ) such that Proof. If J has integral one we can write, as before, Since | ∇u(ξ)| 2 = |ξ| 2 |û(ξ)| 2 ≤ |ξ| 2 u 2 1 , one can follow the same reasoning as in the k = 1 case of Theorem 1.3 to obtain the result.

Some applications
4.1. The linear nonlocal diffusion equation in convolution form. The most direct application of the inequalities in the previous section concerns the long-time behaviour of the linear nonlocal diffusion equation: where t ≥ 0 is the time variable, x ∈ R N is the space variable, u = u(t, x) ∈ R is the unknown, and J is the diffusion kernel. As a straightforward consequence of Theorem 1.1 we obtain Theorem 1.2, which we prove now.
Proof of Theorem 1.2. The regularity of the solution u allows us to write the following H-theorem for the L p norm: Due to Theorem 1.1, and taking into account that u(t, ·) 1 = u 0 1 (mass conservation), we have for some constant C = C(N, p). This is a differential inequality for u p p which allows us to compare it to the solution to the equation We can then apply Lemma 4.1 with to obtain the result.
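The qualitative behaviour used in this proof (mass conservation and monotone decay of $\|u\|_p^p$) can be observed in a simple simulation. The following is a minimal sketch, assuming the convolution form $\partial_t u = J * u - u$ of the equation in dimension one, a box kernel supported on $[-1, 1]$, an explicit Euler scheme and a Gaussian initial datum; none of these numerical choices come from the text.

```python
import numpy as np

def simulate_nonlocal_diffusion(u0, dx, kernel, dt, n_steps, p=2.0):
    """Explicit Euler for u_t = J*u - u on a uniform 1D grid.

    Returns the final state and the mass and ||u||_p^p recorded at each step.
    """
    u = u0.copy()
    masses = np.empty(n_steps + 1)
    norms = np.empty(n_steps + 1)
    for k in range(n_steps + 1):
        masses[k] = np.sum(u) * dx
        norms[k] = np.sum(np.abs(u) ** p) * dx
        if k < n_steps:
            conv = dx * np.convolve(u, kernel, mode="same")  # discrete J*u
            u = u + dt * (conv - u)
    return u, masses, norms

dx = 0.05
x = np.arange(-800, 801) * dx          # grid on [-40, 40]
z = np.arange(-20, 21) * dx            # kernel support [-1, 1]
kernel = np.ones_like(z)
kernel /= kernel.sum() * dx            # discrete integral of J equals 1
u0 = np.exp(-x**2)                     # nonnegative initial datum
u, masses, norms = simulate_nonlocal_diffusion(u0, dx, kernel, dt=0.05, n_steps=400)
print(masses[0], masses[-1], norms[0], norms[-1])
```

With $dt \le 1$ each Euler step is a convex combination of $u$ and a local average of $u$, so the discrete $L^p$ norms cannot increase; the domain is taken large enough that boundary losses of mass are negligible over the simulated time.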
Lemma 4.1. Take C 1 , C 2 , γ > 0 and let X = X(t) be a solution on [0, +∞) to the ordinary differential equation with X(0) > 0. Then we have Remark 4.2. The solution of the ordinary differential equation in the above lemma is actually explicit (see the proof), and we just aim to give a simple statement that captures the decay of the solution as t → +∞. One can simplify even further and say that there is a constant C = C(C 1 , C 2 , γ, X(0)) such that This is easily deduced from (46) with which is finite since both X and (1 + t) − 1 γ have the same decay as t → +∞, and obviously depends only on C 1 , C 2 , γ and X(0).
Proof of Lemma 4.1. By usual theorems in ordinary differential equations, equation (45) has a unique solution on $[0,+\infty)$ with the given initial condition $X(0)$, and this solution is nonnegative on $[0,+\infty)$. The condition that decides which of the two terms achieves the minimum at each time $t$ is whether

$$X(t)^\gamma \le \frac{C_2}{C_1} \tag{47}$$

or not. Since $X$ is nonincreasing, once this condition is satisfied at a certain $t_0 \ge 0$ it will be satisfied for all $t \ge t_0$. With this it is easy to calculate the explicit solution: calling $t_0 \ge 0$ the first time at which (47) holds, it is given by

$$X(t) = X(0)\,e^{-C_2 t} \quad \text{for } 0 \le t \le t_0, \qquad X(t) = \Big( X(t_0)^{-\gamma} + \gamma C_1 (t - t_0) \Big)^{-1/\gamma} \quad \text{for } t \ge t_0.$$

One obtains the result by noticing that $X(0)\,e^{-C_2 t} \le X(0)$ and $X(t_0) \le X(0)$.
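The two-regime behaviour described in this proof can be checked numerically. The sketch below assumes that the differential equation of Lemma 4.1 reads $X' = -\min\{C_1 X^{1+\gamma}, C_2 X\}$, which is the form suggested by condition (47); it compares a piecewise explicit solution (exponential decay until $X^\gamma = C_2/C_1$, algebraic decay afterwards) with a classical Runge-Kutta integration. The parameter values are arbitrary.

```python
import math

def explicit_solution(t, x0, c1, c2, gamma):
    """Piecewise solution of X' = -min(c1 * X**(1 + gamma), c2 * X)."""
    threshold = c2 / c1                      # condition (47): X**gamma <= c2 / c1
    if x0**gamma > threshold:
        t0 = math.log(x0**gamma / threshold) / (gamma * c2)
    else:
        t0 = 0.0                             # already in the algebraic regime
    if t <= t0:
        return x0 * math.exp(-c2 * t)        # exponential regime
    x_t0 = x0 * math.exp(-c2 * t0)
    return (x_t0 ** (-gamma) + gamma * c1 * (t - t0)) ** (-1.0 / gamma)

def rk4_solution(t_end, x0, c1, c2, gamma, steps=20_000):
    """Fourth-order Runge-Kutta integration of the same equation."""
    f = lambda x: -min(c1 * x ** (1.0 + gamma), c2 * x)
    h = t_end / steps
    x = x0
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x

x0, c1, c2, gamma, t_end = 100.0, 1.0, 2.0, 0.5, 10.0   # illustrative values
exact = explicit_solution(t_end, x0, c1, c2, gamma)
numeric = rk4_solution(t_end, x0, c1, c2, gamma)
print(exact, numeric)
```

For these values the switching time is $t_0 = \log(C_1 X(0)^\gamma / C_2)/(\gamma C_2) = \log 5$, which grows only logarithmically with $X(0)$.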
Similarly, with the help of the previous lemma the inequalities in Theorem 1.3 imply the decay in Theorem 1.4: Proof of Theorem 1.4. If u satisfies equation (1) then D k u satisfies the same equation, with initial condition D k u(0, x) = D k u 0 (x). Hence we have, as in (9), This is again a differential inequality for D k u 2 2 , to which we can apply Lemma 4.1 with 1 , C 2 = CrR k+N . This directly gives the result.

4.2. General linear mass-conserving nonlocal equations. In this section we prove Theorem 1.5, which concerns equation (17), recalled here:

$$\partial_t u(t,x) = \int_{\mathbb{R}^N} K(x,y)\,u(t,y)\,dy - \sigma(x)\,u(t,x), \tag{48}$$

where $K : \mathbb{R}^N \times \mathbb{R}^N \to [0,+\infty)$ is a general kernel (not necessarily symmetric) and $\sigma : \mathbb{R}^N \to [0,+\infty)$ is a function. In order to apply our strategy to equation (48) we need to have suitable Lyapunov functionals for it. To our knowledge, the most general setting in which one can do this is that of the so-called general relative entropy method (Michel et al., 2004, 2005), which we state here in a particular case: assume that (19) holds and that

There exists a positive equilibrium $u_\infty : \mathbb{R}^N \to (0,+\infty)$ of (48). (49)
(That is, a solution u ∞ of (48) which does not depend on time t.) Then it is known that whenever Φ is a convex function and u is any solution of (48). This fact is well known in probability theory (see the review by Chafaï (2004)) and is a direct consequence of the general relative entropy method (Michel et al., 2004). The explicit form of its time derivative can be found in Michel et al. (2004): where we denote f (t, x) ≡ u(t, x)/u ∞ (x), and where the t variable has been omitted for brevity. Notice that the integrand is always nonnegative due to the convexity of Φ. The following particular cases are of interest for us here: for Φ(f ) = |f | p with p > 1 we have where the dissipation E K p (f ) is an operator acting only on the x variable. Its expression is given by the right hand side of (50) (with Φ(f ) = |f | p ) and is not so simple. But if we additionally assume that for all x, y ∈ R N , then one can check that for all nonnegative functions f ; note the parallel with (9). The last equality in (53) is obtained by noticing that the integrals corresponding to f (x) p and f (y) p cancel out (easily seen by using (52)), and using (52) again to symmetrise the remaining integral: Condition (52) is known in probability as the detailed balance or reversibility condition (it holds for example if u ∞ ≡ 1 and K is symmetric). If one works in a setting where (51) holds then it may still be possible to use the inequality in Theorem 1.1 (or related ones) and deduce some information on the rate of decay of solutions.
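The monotonicity of relative entropies can be illustrated on a finite-state analogue of the equation. The sketch below is only an illustration under stated assumptions: the state space is finite, the evolution is $\dot u_i = \sum_j K_{ij} u_j - \sigma_i u_i$ with $\sigma_j = \sum_i K_{ij}$, the kernel is built so that the detailed balance relation $K_{ij}\,u_{\infty,j} = K_{ji}\,u_{\infty,i}$ holds for a randomly chosen positive vector `u_inf`, and time is discretised by explicit Euler. All names and numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
u_inf = rng.uniform(0.5, 2.0, size=n)     # hypothetical positive equilibrium
S = rng.uniform(0.1, 1.0, size=(n, n))
S = 0.5 * (S + S.T)                       # symmetric matrix
np.fill_diagonal(S, 0.0)
K = S / u_inf[None, :]                    # K[i, j] * u_inf[j] == S[i, j] (detailed balance)
sigma = K.sum(axis=0)                     # sigma[j] = sum_i K[i, j] (mass conservation)
A = K - np.diag(sigma)                    # du/dt = A u

def relative_entropy(u, p):
    """Discrete analogue of the integral of u_inf * |u / u_inf|^p."""
    f = u / u_inf
    return float(np.sum(u_inf * np.abs(f) ** p))

dt = 0.5 / sigma.max()                    # keeps I + dt*A entrywise nonnegative
u = rng.uniform(-1.0, 1.0, size=n)        # signed initial datum
entropies, masses = [], []
for _ in range(200):
    entropies.append(relative_entropy(u, p=3.0))
    masses.append(float(u.sum()))
    u = u + dt * (A @ u)

equilibrium_residual = float(np.max(np.abs(A @ u_inf)))
print(entropies[0], entropies[-1], equilibrium_residual)
```

With this step size the matrix $I + dt\,A$ has nonnegative entries and unit column sums, so the monotonicity of the discrete relative entropy follows from Jensen's inequality, mirroring the role of convexity of $\Phi$ in the continuous setting.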
Proof of Theorem 1.5. Condition (20) is easily seen to imply that the linear operator given by is well defined and bounded both in L 1 (R N ) and L p (R N ). This shows that equation (48) with initial condition u 0 has a unique solution in C 1 ([0, +∞), L p (R N )∩L 1 (R N )) which conserves mass (that is, R N u(t, x) dx = R N u 0 (x) dx for all t ≥ 0), and that it satisfies the entropy property (9). It is also easily seen that equation (48) preserves sign: if the initial condition is nonnegative (nonpositive) then u(t, x) is nonnegative (nonpositive) for all t, x. As a consequence, it is enough to prove the result when u 0 is nonnegative; the general result is then obtained by linearity from u 0 = u + 0 − u − 0 , with u + 0 := max{u 0 , 0} and u − 0 := max{−u 0 , 0}. For x, y ∈ R N call we have d dt where C 2 also depends on m, and we have used mass conservation and again the bounds in (21). Due to the differential inequality in Lemma 4.1 we obtain that for all t ≥ 0, for some constant C as stated in the result. We complete the proof by noticing that

4.3. A nonlocal dispersal equation. We consider the following integro-differential equation (the dispersal model that was briefly mentioned in the introduction):

$$\partial_t u(x,t) = \int_{\mathbb{R}} J\Big(\frac{x-y}{g(y)}\Big)\,\frac{u(y,t)}{g(y)}\,dy - u(x,t), \tag{54}$$

with prescribed initial data $u(x,0) = u_0(x)$ defined on $\mathbb{R}$. Here $J$ is an even, positive, smooth function such that $\int_{\mathbb{R}} J(x)\,dx = 1$ and $\operatorname{supp} J = [-1,1]$, and $g$ is a continuous positive function which accounts for the dispersal distance, which depends on the departing point. In this model $u$ represents the spatial distribution of a certain species, and $g$ models the heterogeneity of the environment, which can affect the distribution of a species through space-dependent dispersal strategies. This model was proposed in Cortázar et al. (2007) (see also Cortázar et al. (2011); Cortázar et al. (2015); Cortázar et al. (2016)). It was shown there that if we assume $g$ is bounded above and below then there exists a positive steady state solution of (54), that is, a solution of the corresponding stationary problem

$$u_\infty(x) = \int_{\mathbb{R}} J\Big(\frac{x-y}{g(y)}\Big)\,\frac{u_\infty(y)}{g(y)}\,dy \qquad \text{in } \mathbb{R}.$$
Moreover, $u_\infty$ is bounded above and below by positive constants. It was also proved in Cortázar et al. (2007) that any solution $u$ of (54) converges to 0 locally as $t \to \infty$. Using the general result in Theorem 1.5 we are able to improve this asymptotic behavior, obtaining a precise decay rate of the $L^p$ norms of $u$: Theorem 4.3. Take $p \in [2,+\infty)$. Let $u$ be a solution of (54) with initial data $u_0 \in L^1(\mathbb{R}) \cap L^p(\mathbb{R})$, and assume that
(1) $J \in L^\infty(\mathbb{R})$ is a bounded, nonnegative function with compact support, satisfying Hypothesis 1;
(2) $g$ is a continuous function satisfying $\frac{1}{M} \le g(x) \le M$ for all $x \in \mathbb{R}$ and for some $M > 0$.
Then for some constant $C > 0$ depending on $J$, $M$, $p$, $\|u_0\|_1$ and $\|u_0\|_p$,

$$\|u\|_p^p \le C\,(1+t)^{-\frac{p-1}{2}} \qquad \text{for all } t \ge 0.$$