An operator splitting scheme for the fractional kinetic Fokker-Planck equation

In this paper, we develop an operator splitting scheme for the fractional kinetic Fokker-Planck equation (FKFPE). The scheme consists of two phases: a fractional diffusion phase and a kinetic transport phase. The first phase is solved exactly using the convolution operator while the second one is solved approximately using a variational scheme that minimizes an energy functional with respect to a certain Kantorovich optimal transport cost functional. We prove the convergence of the scheme to a weak solution to FKFPE. As a by-product of our analysis, we also establish a variational formulation for a kinetic transport equation that is relevant in the second phase. Finally, we discuss some extensions of our analysis to more complex systems.


Introduction
In this paper, we study the existence of solutions to the following fractional kinetic Fokker-Planck equation (FKFPE) with s ∈ (0, 1]. In the above, div denotes the divergence operator; the differential operators ∇, div and △ with subscripts x and v indicate that these operators act only on the corresponding variables; the operator −(−△ v ) s is the fractional Laplacian operator on the variable v, where the fractional Laplacian −(−△) s , is defined by −(−△) s f (x) := −F −1 (|ξ| 2s F [f ](ξ))(x). Here F denotes the Fourier transform on R d , i.e. F [f ](ξ) = 1 (2π) d/2 R d f (x)e −ix·ξ dx. Note that the fractional Laplacian operator with 0 < s < 1 is a non-local operator since it can also be expressed as the singular integral where the normalisation constant is given by C d,s = s2 2s Γ( d+2s 2 )/(π d 2 Γ(1 − s)) and Γ(t) is the Gamma function. See [33] for more equivalent definitions of fractional Laplacian operator.
The equation (1.1) is interesting to us because it can be viewed as the Fokker-Planck (forward Kolmogorov) equation of the following generalized Langevin equation where L s t is the Lévy stable process with exponent 2s. The stochastic differential equation (SDE) (1.2) describes the motion of a particle moving under the influence of a (generalized) frictional force and a stochastic noise and in the absence of an external force field. FKFPE (1.1) is the evolution of the probability distribution of (X t , V t ). In particular, the fractional operator −(−△) s is the Markov generator of the process L s t . When s = 1 and Ψ(v) = |v| 2 2 , equation (1.1) becomes the classical kinetic Fokker-Planck (or Kramers) equation (without external force field) which is a local PDE and has been used widely in chemistry as a simplified model for chemical reactions [32,26] and in statistical mechanics [35,38]. The non-local Lévy process plays an important role in modelling systems that include jumps and long-distance interactions such as anomalous diffusion or transport in confined plasma [5]. Singular limits of Equation (1.1) with Ψ(v) = |v| 2 2 was studied in [12], see also [11] for a similar result for the same equation but on a spatially bounded domain. In a recent work [1], the authors have extended [12] to a system that contains an additional external force field and they have also proved its wellposedness by the means of the Lax-Milgram theorem. We will prove the existence of solutions of (1.1) for a general Ψ based on the trick of operator splitting. For more recent developments on PDEs involving the fractional Laplacian operator, we refer the interested reader to expository surveys [42,43,41].
The aim of this paper is to develop a variational formulation for approximating solutions to equation (1.1). The theory of variational formulation for PDEs took off with the introduction of Wasserstein gradient flows by the seminal work of Jordan, Kinderlehrer and Otto [30]. Such a variational structure has important applications for the analysis of an evolution equation such as providing general methods for proving well-posedness [4] and characterizing large time behaviour (e.g., [10]), giving rise to natural numerical discretizations (e.g., [22]), and offering techniques for the analysis of singular limits (e.g., [39,40,6,18]). There are now a significantly large number of papers in exploring variational structures for local PDEs, see the aforementioned papers and references therein as well as the monographs [4,44] for more details. However, variational formulations for non-local PDEs are less understood. Erbar [24] showed that the fractional heat equation is a gradient flow of the Boltzmann entropy with respect to a new modified Wasserstein distance that is built from the Lévy measure and based on the Benamou-Brenier variant of the Wasserstein distance. Bowles and Agueh [8] proved the existence of the fractional Fokker-Planck equation which can be viewed as the spatially homogeneous version of equation (1.1) or the fractional heat equation with a drift. Erbar's proof is variational based on the so-called "evolution variational inequality" concept introduced in [4]. However, it seems that his method can not be extended to the fractional Fokker-Planck equation since the distance that he introduced was particularly tailored for the Boltzmann entropy. Instead, Bowles and Agueh's proof is "semi-variational" based on a novel splitting argument which we sketch now. They split up the original dynamics (1.3) into two processes: a fractional diffusion process, namely ∂ t f = −(−△) s f , and a transport process in the field of the potential Ψ, namely ∂ t f = div(∇Ψf ), and then alternatively run these processes on a small time interval. Furthermore, the transport process can be understood as a Wasserstein gradient flow of the potential energy. By adopting a suitable interpolation of the individual processes, they were able to show that the constructed splitting scheme converges to a weak solution of (1.3). In the literature, the technique of operator splitting is often used to construct numerical methods for solving PDEs, see [27]. On the theoretical side, the idea of splitting had also been used to study the well-posedness of PDEs, see [9,2] on kinetic equations and [3,16] on fractional PDEs. In the present work, we adopt the same splitting argument in [8] to construct a weak solution to the fractional kinetic equation (1.1). More specifically, we split the dynamics described in (1.1) by two phases: (1) Fractional diffusion phase. At every fixed position x ∈ R d , the probability density f (x, v, t), as a function of velocity v, evolves according to the fractional heat equation (2) Kinetic transport phase. The density f (x, v, t) evolves according to the following equation We expect that successive alternative iterating the above two phases with vanishing period of time would give an approximation to the dynamics (1.1). The key difference between our splitting scheme above and the scheme in [8] is that the transport process here is not only driven by the potential energy but also the kinetic energy. In [8], the transport process is approximated by a discrete Wasserstein gradient flow based on the work [31]. However, due to the presence of the kinetic term, the kinetic transport equation is not a Wasserstein gradient flow; thus one can no longer use the Wasserstein distance. To overcome this obstacle, we employ instead the minimal acceleration cost function and the associated Kantorovich optimal transportation cost functional that has been used in [28,19] for the kinetic Fokker-Planck equation and in [25] for the isentropic Euler system, see Section 3.
1.1. Main result. Throughout the paper, we make the following important assumption on the potential Ψ.
We adopt the following notion of weak solution to KFPE (1.1).
We say that f (x, v, t) is a weak solution to (1.1) if it satisfies the following: The main result of the paper is the following theorem.
The proof of Theorem 1.3 is constructive, that is we will build a converging sequence to a solution of (1.1) from the splitting scheme discussed above that will be rigorously formulated in Section 4. The proof is based on a series of lemmas and is postponed to Section 5. As a by-product of the analysis, we also construct a discrete variational scheme and obtain its convergence for the kinetic transport equation, see Theorem 3.3 in Section 3; thus extending the work [31] to include the kinetic feature. Furthermore, some possible extensions to more complex systems are discussed in Section 6. It is not clear to us how to obtain the uniqueness and regularity result. The bootstrap argument in [30] to prove smoothness of weak solutions (and hence also uniqueness) seems not working for the fractional Laplacian operator due to the lack of a product rule. It should be mentioned that in the recent paper [34], the author has proved the existence and uniqueness of a solution to the fractional Fokker-Planck equation (1.3) in some weighted Lebesgue spaces. It would be an interesting problem to generalize [34] to FKFPE. This is to be investigated in future work.

1.2.
Organization of the paper. The rest of the paper is organized as follows. Section 2 summarizes some basic results about the fractional heat equation. Section 3 studies the kinetic transport equation and its variational formulation. The splitting scheme of the paper is formulated explicitly in Section 4 and some a priori estimates are established for the discrete sequences as well as their timeinterpolation. The proof of the main result is presented in Section 5. Finally, in Section 6 we discuss several possible extensions of the analysis to more complex systems.
1.3. Notation. Let P 2 (R d ) be the collection of probability measures on R d with finite second moments. Let P 2 a (R d ) be the subset of probability measures in P 2 (R d ) that are absolutely continuous with respect to the Lebesgue measure on R d . For µ, ν ∈ P 2 (R d ), the 2-Wasserstein distance W 2 (µ, ν) is defined by where P(µ, ν) is the set of probability measures on R 2d with marginals µ, ν, i.e. p ∈ P(µ, ν) if and only if hold every Borel set A ∈ R d . In the case that µ, ν ∈ P 2 a (R d ) with densities f, g, we may write W 2 (f, g) instead of W 2 (µ, ν).
We use the notation F # µ to denote the push-forward of a probability measure µ on R 2d under map F , that is a probability measure on R 2d satisfying for all smooth test function ϕ,

The fractional heat equation
This section collects some basic results on the fractional heat equation. We start by defining the fractional heat kernel Remember that the fractional Laplacian operator in (1.1) is only an operator in v-variable. With the fractional heat kernel, the solution to the fractional heat equation (1.4) with initial condition f 0 (x, v) can be expressed as where * v is the convolution operator in v-variable. The following elementary result is immediate from the definition of the kernel; see also [8].
(3) R |v| 2 Φ s (v, t)dv = +∞ for all s ∈ (0, 1) and t > 0. Lemma 2.1 (3) demonstrates a significant difference between the fractional heat kernel and standard Gaussian kernel, i.e. the former has infinite second moment. The loss of second moment bound may lead to infinite potential energy for example when the potential Ψ(v) = |v| 2 . To overcome this issue, it is more convenient to make a renormalisation on the fractional heat kernel. To be more precise, for any where 1 BR is an indicator function of a centred ball of radius R. Given a function f ∈ P 2 a (R d ), we can define the renormalised convolution Proof. Notice that it suffices to prove part (2) since part (1) follows directly from part (2) by setting F (v) = |v| 2 . The proof is similar to that of [8, Lemma 4.1], but for completeness we give the proof below. First from the definition off h,R , one sees that Using change of variable z = v − w and Taylor's expansion, we can write the numerator as Note that in the above ξ w,z is an intermediate point between w and z and the term with the modulus vanishes since the kernel Φ h s is symmetric with respect to the origin.
The following lemma provides an upper bound for the ratio on the right side of (2.4). Lemma 2.3. For any s ∈ (0, 1], there exists a constant C > 0 such that holds for all R, h > 0. Proof. This lemma follows directly from a two-sided point-wise estimate on Φ h s (w) as shown in [8,Proposition 2.1]. See also equation (16) in [8].
3. The kinetic transport equation and its variational formulation 3.1. The minimum acceleration cost. Consider the kinetic transport equation with initial value f 0 We are interested in the variational structure of (3.1) which is an interesting problem on its own right. In [31], Kinderlehrer and Tudorascu proved that the transport equation , which is the spatially homogeneous version of (3.1), is a Wasserstein gradient flow of the energy R d Ψf . Their proof is via constructing a discrete variational scheme as in [30]. However, due to the absence of the entropy term, which is super-linear, several non-trivial technicalities were introduced to obtain the compactness of the discrete approximations thus establishing the convergence of the scheme. For the kinetic transport equation (3.1), due to the presence of the kinetic term, it is not a Wasserstein gradient flow in the phase space thus the Wasserstein distance can no longer be used. Therefore to construct a discrete variational scheme for this equation, we need a different Kantorovich optimal transportation cost functional. To this end, we will employ the Kantorovich optimal transportation cost functional that is associated to the minimal acceleration cost. This cost functional has been used before in [28,19] for the kinetic Fokker-Planck equation and in [25] for the isentropic Euler system. We follow the heuristics of defining the minimal acceleration cost as in [25]. Consider the motion of particle going from position x with velocity v to a new position x ′ with velocity v ′ , within a time interval of length h. Suppose that the particle follows a curve ξ : and such that the average acceleration cost along the curve, that is 1 Then the curve is actually a cubic polynomial and the minimal average acceleration cost is given by The Kantorovich functional W h (µ, ν) associated with the cost function C h , is defined by, for any µ, ν ∈ P 2 (R 2d ), where P(µ, ν) is the set of all couplings between µ and ν. It is important to notice that W h is not a distance. In fact, W h is not symmetric in the arguments µ, ν, due to the asymmetry of the cost function C h . In addition, W h (µ, ν) does not vanish when µ = ν. Instead, we have that where F h is the free transport map defined by It is also useful to define the map . As a consequence, the infimum in (3.3) is attained and thus W h (µ, ν) is a minimum.

Variational formulation.
With W h being defined, we want to interpret (3.1) as a generalized gradient flow of the potential energy R 2d Ψ(v)f (x, v)dxdv with respect to W h . For doing so, we consider the variational problem Here f 0 ∈ P 2 a (R 2d ) is an initial probability density with R 2d Ψ(v)f 0 (x, v) dxdv < ∞ and h > 0 is the time step. The next lemma establishes some properties about the minimizer to (3.7).

Lemma 3.2.
(1) For h being sufficiently small, the variational problem (3.7) has a unique minimizer f ∈ P 2 a (R 2d ). (2) Let h > 0 be small enough such that det(I + hD 2 (Ψ(v))) ≤ 1 + αh for some fixed (3) f satisfies the following Euler-Lagrange equation: for any ϕ ∈ C ∞ c (R 2d ), where P * is the optimal coupling in W h (f 0 , f ) and Proof. (1) Thanks to Lemma 3.1, we can rewrite the functional A as (3) The derivation of the Euler-Langrange equation for the minimizer f of the variational problem (3.7) follows the now well-established procedure (see e.g. [30,28]). For the reader's convenience, we sketch the main steps here. First, we consider the perturbation of f defined by push-forwarding f under the flows φ, ψ : [0, ∞)×R 2d → R d : where ζ, η ∈ C ∞ 0 (R 2d , R d ) will be chosen later. Let us denote γ s to be the push forward of f under the flow (ψ s , φ s ). Since (ψ 0 , φ 0 ) = Id, it follows that γ 0 = f , and an explicit calculation gives in the sense of distributions. Second, thanks to the optimality of f , we have that A(γ s ) ≥ A(f ) for all γ s defined via the flow above. Then the standard variational arguments as in [30,28] leads to the following stationary equation on f : where P * is the optimal coupling in the definition of W h (f 0 , f ). Third, we choose ζ and η with a given ϕ ∈ C ∞ 0 (R 2d , R) as follows

Now from the definition of the cost functional
Therefore, together with (3.13), we calculate The Euler-Lagrange equation (3.9) for the minimizer f follows directly by substituting the equation above back into (3.12).
We now can build up a discrete variational scheme for the kinetic transport equation as follows. Given f 0 ∈ P 2 a (R 2d ) with R 2d Ψ(v)f 0 (x, v) dxdv < ∞ and h > 0 is the time step. For every integer k ≥ 1, we define f k as the minimizer of the minimization problem The following theorem extends the work [31] to the kinetic transport equation.
Theorem 3.3. Suppose that Assumption 1.1 holds. Given a f 0 ∈ P 2 a (R 2d ) ∩ L p (R 2d ) for some 1 < p ≤ ∞ and R 2d f 0 (x, v)Ψ(v)dvdx < ∞, there exists a weak solution f (x, v, t) to equation (3.1) in the sense of Definition 1.2 but with the fractional Laplacian term removed.
Proof. The proof of this theorem follows the same lines as that of the Theorem 1.3, that is to show that the discrete variational scheme (3.14) above converges to a weak solution of the kinetic transport equation. Since the proof of Theorem 1.3 will be carried out in details in Section 5, we omit this proof here.

A splitting scheme for FKFPE
4.1. Definition of splitting scheme. As we mentioned in the introduction section, we object to construct an operator splitting scheme for equation (1.1) by continuously alternating processes (1.4) and (1.5), where the later is approximated by the generalized gradient flow of the potential energy, or equivalently, the density after a short time step h is approximately given by the solution to the variational problem (3.7). However, there is an issue associated with iterating (1.4) and (3.7). That is, the solution of the fractional heat equation may not have a finite second moment (see Lemma 2.1 (3)). Hence it can not be used as the initial condition in the variational problem (3.7) since the potential energy might be infinite. To around this issue, we define an approximate fractional diffusion process by using the renormalised convolution (2.3) based on the truncted fractional heat kernel.
To be more precise, given a fixed N ∈ N, let us consider a uniform partition 0 = t 0 < t 1 < · · · < t N = T of the time interval [0, T ] with t k = kh and h = 1/N . With an initial condition f 0 h = f 0 , for n = 1, · · · , N − 1 we iteratively compute the following: • Given a trunction parameter R > 0, compute the renormalised convolution .
• Solve for the minimizer f n+1 h,R of the problem Note that thanks to Lemma 3.2 (1) the minimizer f n+1 h,R in (4.2) is well-defined and unique. Moreover, it follows from Lemma 3.2 (3) that f n+1 h,R satisfies the following equation With the scheme being defined above, we obtain a discrete approximating sequence {f n h,R } 0≤n≤N . Below we define a time-interpolation based on {f n h,R } and our ultimate goal is to prove that this sequence converges to a weak solution of (1.1).
Time-interpolation: We define f h,R by setting Proof. Since f n h,R minimizes the functional f → 1 In particular, if we set f = f * := F # h f n h,R where F h is the free transport map defined in (3.4), then since W h (f n h,R , f * ) = 0 we obtain We have also used the fact that the free transport map F h has unit Jacobian in the last equality. According to Lemma 2.2 (2), we have Substituting (4.9) into (4.8), we obtain from which, by summing over n from 1 to N we obtain (4.10) Then the desired estimate follows directly from (4.10) and Lemma 2.3.
We also need some second moment bounds on f with respect to variable v. Given a density function f , let us set M 2,v (f ) := |v| 2 f (x, v)dxdv. Lemma 4.2. There exist positive constants C, h 0 such that when 0 < h < h 0 , it holds for any index i > 0 that It follows that In addition, letP i h be the optimal coupling in the definition of Proof. First from the definition of the cost function C h in (3.2) we have the following inequalities: Then there exist constants C, h 0 > 0 such that when h < h 0 , Now for any fixed i > 0, we have R 2d This proves (4.11). The estimate (4.12) follows by summing the estimate (4.11) over the index i from 1 to n and inequality (2.4) with F (v) = |v| 2 . Finally, the estimate (4.13) follows directly from inequality (4.16) and the definition of In the next lemma, we prove a uniform L p -bound for the time-interpolation sequence {f h,R }. Lemma 4.3. Let h > 0 be small enough such that det(I + hD 2 (Ψ(v))) ≤ 1 + αh for some fixed α > D 2 Ψ L ∞ (R d ) . If f 0 ∈ L p (R 2d ) for 1 < p < ∞, then . Proof. First, according to Lemma 3.2 (2), we have that In addition, by the definition off n h,R (see (4.1)) and Young's inequality for convolution, . This implies that for any n > 0, Then by the definition of the time-interpolation f h,R in (4.5), we have for any t ∈ (t n , t n+1 ) that

Approximate equation.
We first show in the next lemma that the timeinterpolation f h,R satisfies an approximate equation. Moreover, Here P n h,R is the optimal coupling in the definition of W h (f n h,R , f n h,R ).
Proof. From the definition of f h,R (see (4.5)) and integration by parts, we obtain that where the second equality holds because f h,R solves the fractional heat equation.
By adding and subtracting a few tems, we can write the first term on the right hand side of (5.6) as Now substituting (5.7) back into (5.6) and then summing over index n from 0 to N − 1 yields In the above we also used the fact that ϕ is compactly supported in (−T, T ) so that ϕ(t N ) = 0. Let P n h,R (dxdvdx ′ dv ′ ) be the optimal coupling in W h (f n h,R , f n h,R ). Then it is easy to see that (5.9) where we have used Taylor expansion in the last equality and the error term ε n can be bounded as In view of (4.3), (4.4) and (5.9), we have that and that As a result the last term on the right-hand side of (5.8) can be written as Now using again the fact that ϕ(t N ) = 0, we rewrite the first term on the right side of (5.13) as follows (5.14) Therefore the lemma follows by combining (5.8), (5.13) and (5.14).

5.2.
Passing to the limit. Now we set the truncation parameter R = h −1/2 and define Our aim is to prove that f h converges to a weak solution of (1.1). To this end, we first show that the residual term in the last lemma goes to zero when h → 0.
Lemma 5.2. Let f 0 be a non-negative function such that f 0 ∈ P 2 a (R 2d ) and Proof. The proof follows closely the proof of Lemma 5.3 of [8]. In particular, by using the same arguments there, we can first obtain the following estimates Notice that the supreme norms appearing in the above are finite since ϕ ∈ C ∞ 0 ((−T × T ) × R 2d ) and Ψ ∈ C 1,1 ∩ C 2,1 (R d ). Next, we can bound R 4 (h, R) as In addition, thanks to inequality (4.13) and Lemma 4.1, the error termR can be bounded as follows Finally, the desired estimate (5.16) follows by combining the above estimates and by setting R = h −1/2 . Now we are ready to prove the main Theorem 1.3.
Proof of Theorem 1.3. First, thanks to Lemma 4.3 and the assumption that f 0 ∈ L p (R 2d ) for some 1 < p < ∞, the constructed time-interpolation {f h } in (5.15) is uniformly bounded in L p (R 2d ×(0, T )). Therefore there exists a f ∈ L p (R 2d ×(0, T )) such that f h ⇀ h in L p (R 2d × (0, T )). In view of equation (5.1) of Lemma 5.1, and by using the fact that Remark 5.3. By using the similar technique as in the proof of Lemma 5.8 of [8], one can show that the weak solution f of (1.1) is indeed a probability density for every t ∈ (0, T ), i.e. R 2d f (t, x, v)dxdv = R 2d f 0 (x, v)dxdv = 1.

Possible extensions to more complex systems
With suitable adaptations, it should be possible, in principle, to extend the analysis of the present work to deal with more complex systems. Below we briefly discuss two such systems.
6.1. FKFPE with external force fields. When an external force field, which is assumed to be conservative, is present, the SDE (1.2) becomes where U : R d → R is the external potential. The corresponding FKFPE (1.1) is then given by One can view (6.1) as a dissipative (frictional and stochastic noise) perturbation of the classical Hamiltonian dX t dt = V t , dV t dt = −∇U (X t ).
Thus FKFPE (6.2) contains both conservative and dissipative effects. To construct an approximation scheme for it, instead of the minimal acceleration cost function (3.2), one would use the following minimal Hamiltonian cost function which has been introduced in [19] for the development of a variational scheme for the classical Kramers equation: Under the assumption that U ∈ C 2 (R d ) with ∇ 2 U ≤ C and using the properties of the cost function C h established in [19] we expect that the splitting scheme (4.1)-(4.2), where in (3.3) the Kantorovich optimal cost functional C h is replaced by C h , can be proved to converge to a weak solution of FKFPE (6.2).
Several properties including an explicit representation of the mean squared derivative cost function has been studied in [20] and a variational formulation using this cost function for equation (6.4) with and s = 1 has been developed recently in [21].
Using the properties of the cost functionC n,h established in [20] it should be possible, in principle, to adapt the analysis of the present paper to show that, under suitable assumptions, the splitting scheme (4.1)-(4.2) with C h being substituted bȳ C n,h , converges to a weak solution of the multi-component FKFPE (6.4).