A preconditioned fast Hermite finite element method for space-fractional diffusion equations

We develop a fast Hermite finite element method for a one-dimensional space-fractional diffusion equation, by proving that the stiffness matrix of the method can be expressed as a Toeplitz block matrix. Then a block circulant preconditioner is presented. Numerical results are presented to show the utility of the fast method.


1.
Introduction. Fractional partial differential equations (FPDEs) provide a competitive approach for the description of transport dynamics in complex systems which are governed by anomalous diffusion and non-exponential relaxation patterns [25,29]. In the last decade, many numerical methods have been developed for the numerical solution of FPDEs, including finite difference methods, finite element methods and spectral methods [8,11,21,20,24,31,38]. In particular, Ervin and Roop developed and analyzed Galerkin finite element methods for space-fractional PDEs [11].
Due to the nonlocal nature of fractional differential operators, numerical methods for space-fractional PDEs usually generate dense or full stiffness matrices, for which traditional solvers require O(m 3 ) of operations step and O(m 2 ) of memory for a problem with m unknowns. This is deemed computationally very expensive to solve. Fast numerical methods were previous developed for finite difference methods and finite volume methods [35,36,37], by proving that the stiffness matrix is Toeplitz like. This would reduce the memory from O(m 2 ) to O(m) and computational cost from O(m 3 ) to O(m log m).
In this paper we develop a fast cubic Hermite element method for space-fractional diffusion equation by proving that the stiffness matrix can be expressed as a block Toeplitz matrix. Due to the stiffness matrix are ill-conditioned, the condition number of the stiffness matrix is huge with the increasing number of grids, even makes the iterative method divergent. So we present a block circulant preconditioner to accelerate the Krylov subspace iterative method. The rest of the paper is organized as follows. In §2, we present the model problem. In §3 we assemble the stiffness matrix of the Hermite cubic finite element method. In §4 we prove that the stiffness matrix is a block Toeplitz matrix, which yields an O(m) memory requirement of the stiffness matrix and a lossless O(m log m) matrix-vector multiplication. In §5 we present a block circulant preconditioner to accelerate the fast Krylov subspace iterative method. In §6 we present numerical experiments to investigate the performance of the fast method.
2. Problem formulation and preliminaries. We consider the homogeneous Dirichlet boundary-value problem of the one-dimensional steady-state fractional diffusion equation Here Du(x) = u is the first-order differential operator, 2 − β with 0 < β < 1 represents the order of anomalous diffusion, K is the diffusivity coefficient, 0 ≤ γ ≤ 1 indicates the relative weight of forward versus backward transition problem, f (x) is the source and sink term, and u l , u r are the prescribed Dirichlet boundary data. The left and right fractional integrals of order β are defined by [27] where Γ(·) is the Gamma function. The left and right fractional derivatives of Caputo form of order β are defined by [27] Let H µ 0 (0, 1) with µ > 1/2 be the fractional Sobolev space of order µ with trace 0 at the boundary x = 0 and 1. [11] Multiplying (1) by any test function v ∈ H 1 0 (0, 1) gives the following weak formulation, Then the Galerkin formulation is followed: for given f ∈ H −(1−β/2) (0, 1), find u ∈ H 1−β/2 0 (0, 1) such that where ·, · is the duality pairing between H −µ (0, 1) and H µ 0 (0, 1) and for the twosided problem the associated bilinear form B is defined: We recall some results for fractional elliptic differential equations with constant coefficients. The following theorem was proved in [11].
We define the finite dimensional subspace V h ⊂ H µ 0 (Ω) on the mesh Ω h , which can be expressed by where P 3 (I i ) is the space of cubic polynomials defined on I i . The nodal base functions ϕ m of V h can be expressed in the form for i = 1, 2 · · · , m − 1, and ϕ (1) Thus, the Hermite P 3 finite element scheme for the homogeneous Dirichlet boundary value problem can be written in a matrix form where u be an 2(m − 1)-dimensional vector given by the 2(m − 1)-dimensional vector f can be expressed in a similar fashion, and G is an 2(m − 1) × 2(m − 1) stiffness matrix. We go through some tedious algebraic manipulations to evaluate all the entries of the stiffness matrix G and present them below. Denote The entries of G are given by The lower triangular entries of the G are given by and Similarly, the upper triangular entries of the G are given by and We have evaluated all the entries of the stiffness matrix G.

4.
A fast and efficient method. From above entries of the stiffness matrix G, we note that the matrix G be a full coefficient matrix, and it does not possess no any regular pattern for matrix structure. A direct solver for (17)   Proof. Let the vector v denote the reindexing of the vector u that corresponds to the labeling given in (18) v := u 1 , u 2 , · · · , u m−1 , u 1 , u 2 , · · · , u m−1 T .
where P represents the permutation matrix that maps u to v. The vector g has the reindexing of the vector f that given in (18) by above same fashion. Applied the permutation matrix P to the matrix G that given in (17), P G gives the matrix G with rows interchanged according to the permutation vector u, and GP T gives the matrix G with columns interchanged according to the given permutation vector. Then we have Combining (40) and (41) to obtain a new linear system the matrix A can be decomposed as where the sub-matrices A 1,1 , A 1,2 , A 2,1 , A 2,2 represent the stiffness matrices corresponding to the bilinear form B(ϕ j ) and B(ϕ j ), i, j = 1, 2, · · · , m − 1, respectively. All of the four sub-matrices are of order m − 1.
We define the sub-matrix Comparing to the corresponding entries of the matrix G, we can obtain We can find that the entries of each descending diagonal of A 1,1 are equivalent from observation. So, A 1,1 is an (m − 1)-order Toeplitz matrix.
Similarly, we define the sub-matrix A 1,2 = [a (1,2) i,j ] m−1 i,j=1 , A 2,1 = [a (2,1) i,j ] m−1 i,j=1 , A 2,2 = [a (2,2) i,j ] m−1 i,j=1 . Comparing to the corresponding entries of the matrix G, we can obtain (45) We can get that A 1,1 , A 1,2 , A 2,1 and A 2,2 are an (m − 1)-order Toeplitz matrix. Conclusively, the new linear system (42) can be obtained by a direction examination from (17), and the new stiffness matrix A is an 2 × 2 block matrix with (m − 1) × (m − 1) Toeplitz blocks.  Proof. We divide w into two parts in block form as follows: Then the matrix-vector multiplication Aw is of the form We only need to show that, for a Toeplitz matrix T m−1 with the following form the matrix-vector multiplication T m−1 v can be computed by means of fast Fourier transform (FFTs). First we embedded T m−1 into a (2m − 2) × (2m − 2) circulant matrix C 2m−2 such that More precisely, the matrix-vector multiplication can be computed by: Furthermore, it is well-known that circulant matrices can be diagonalized by the discrete Fourier transform matrix F 2m−2 [7,13] where c 2m−2 is the first columm vector of C 2m−2 . Therefore, for any vector v ∈ R m−1 , the matrix-vector multiplication T m−1 v can be carried out in O(m log m) Table 1. The condition number of stiffness matrix A with K = 1, γ = 0.5, β = 0.5.  Table 1, we find the condition number of matrix A becomes larger and larger with the increasing number of grids. Thus the number of iterations becomes predictively more and more and it gives rise to nonlinear increasing of the overall computational cost. In this section we present an efficient preconditioner for the fast method. For systems with Toeplitz blocks, we will apply two classes of preconditioners: Strang's circulant preconditioner and Chan' s circulant preconditioner, which are outlined below. In [6] the authors study block preconditioners for solving separable block systems with Toeplitz blocks.
For a Toeplitz matrix of order n , which is of the form (51), the Strang' preconditioner S = S(T n ) is defined to be a circulant matrix obtained by copying the central diagonals of T n and reflecting them around to complete the circulant requirement. The diagonals s j of S = [s k−j ] 0≤k,l≤n are given by 0 < j ≤ n/2 , t j−n , n/2 < j ≤ n − 1, It was shown [5] that Chan' preconditioner C F = C F (T n ) is defined to be the minimizer of T n − C n F over all circulant matrices C n . The diagonals c j of C F (T n ) are given by We note that the stiffness matrix A, which is of the form (43), is a 2 × 2 block system with Toeplitz blocks of order m − 1. To motivate the development of a efficient preconditioner based on Strang's preconditioner and T.Chen's preconditioner, we construct a 2 × 2 block matrix with circulant block preconditioner M for A as follows where M 1,1 , M 1,2 , M 2,1 and M 2,2 are constructed as (m − 1) × (m − 1) circulant preconditioners for A 1,1 , A 1,2 , A 2,1 and A 2,2 , respectively. We will use a preconditioned fast conjugate gradient squared method [28] to solve the system(42). In this Krylov subspace algorithm, the matrix-vector multiplications Aw can be performed in O(m log m) operations by Theorem 4.3. Then, we will analyze the computational cost of the inverse of the preconditioner M .
We apply the permutation matrix P that given in (40) to interchange rows and columns of the matrix Λ, and obtain a block-diagonal matrix Λ D of order m − 1 According to the (61), the inverse of the block-diagonal matrix Λ D has the following formulation The inverse of the 2 × 2 matrix S i can be computed as So the matrix-vector multiplication Λ −1 w can be carried out in O(m) operations for any vector w ∈ R 2m−2 . Note that Consequently, M −1 w can be evaluated in O(m log m) operations. 6. Numerical experiments. we consider the numerical simulation of the Dirichlet boundary-value problem of a two-sided fractional differential equation (1) with K = 1, γ = 0.5, u r = 0, u l = 0, and the source term given by The true solution of the above problem is given by In the numerical experiments, we investigate the performance of the cubic Hermite finite element method (16) to solve the two-sided fractional elliptic differential problem on Ω h . We use the numerical solutions by Gaussian elimination as a benchmark. We then solve the linear system by the conjugate gradient squared (CGS) methods, by the fast conjugate gradient squared (FCGS) method in which the matrix-vector multiplication of Aw is carried out via Theorem 4.3, by the preconditioned fast conjugate gradient squared method with Chan' preconditioner (CFCGS) and the preconditioned fast conjugate gradient method with the Strang' preconditioner (SFCGS) presented in Section 5. In the numerical experiment we compute the errors of u h −u From table 2, we observe that the cubic hermite finite element methods have a convergence order of 4 in the L 2 norm and a convergence order of 3 + β in the fractional energy norm. From table 3 and 4 we present the numerical results and reach the following observations: (i) The CGS method consumes much more CPU time than Gaussian elimination. The reason is the number of iterations exceeds the number of grid points m, that means the computational cost has exceeded O(m 3 ) operations. (ii) The FCGS method requires only O(m log m) operations per iteration and significantly reduces the over CPU time. But the number of iterations has not changed, when the number of grid points is huge, iterative methods will diverge. (iii) Both the CFCGS and the SFCGS methods significantly reduce the number of iterations in the FCGS method, consequently, these preconditioned FCGS methods further significantly reduce the CPU time, due to the significant reduction of the number of iterations.