On the fundamental solution and a variational formulation of a degenerate diffusion of Kolmogorov type

In this paper, we construct the fundamental solution to a degenerate diffusion of Kolmogorov type and develop a time-discrete variational scheme for its adjoint equation. The so-called mean squared derivative cost function plays a crucial role, appearing in both the fundamental solution and the variational scheme. The latter is implemented by minimizing a free energy functional with respect to the Kantorovich optimal transport cost functional associated with the mean squared derivative cost. We establish the convergence of the scheme to the solution of the adjoint equation, generalising previously known results for the Fokker-Planck equation and the Kramers equation.

In the above equations, the unknowns are ρ = ρ(t, x_1, …, x_n) and f = f(t, x_1, …, x_n), where t > 0 and x = (x_1, …, x_n) ∈ R^{dn} are the time and spatial co-ordinates, respectively. The notations ∇, div and ∆ denote the gradient, divergence and Laplacian operators, respectively; the subscripts in these operators indicate that they act only on the corresponding variables. The two equations are complemented with initial data. The initial data for (1) is a probability measure ρ_0 on R^{dn}. Since the right-hand side of (1) is in divergence form, noting that the summation term can be written as −Σ_{i=2}^{n} div_{x_{i−1}}(x_i ρ), for each t > 0, ρ(t) is also a probability measure on R^{dn}.
Since we will be interested in the fundamental solution of (2), the initial data for (2) is a Dirac measure δ_y for y ∈ R^{dn}. Two special cases of (1), corresponding to n = 1 and n = 2 respectively, are the Fokker-Planck equation and the Kramers equation. Their corresponding adjoint equations with V ≡ 0 are, respectively, the diffusion equation ∂_t f = ∆f and the ultra-parabolic equation ∂_t f = x_2 · ∇_{x_1} f + ∆_{x_2} f. Equations (1) and (2), particularly the Fokker-Planck equation and the Kramers equation, play an important role in statistical mechanics [Ris84], reaction-rate theory [HTB90] and mathematical finance [LPP02, Pas05]. For instance, in the context of statistical mechanics, (1) has been used to describe the time evolution of the probability density function of a many-particle system. Equation (1) with n > 2 can also be viewed as a simplified model of a finite Markovian approximation of the generalised Langevin dynamics [OP11, Duo15], or as a model of a harmonic chain of oscillators arising in non-equilibrium statistical mechanics [EH00, BL08, DM10]. Equations (1) and (2) belong to a class of degenerate diffusions of Kolmogorov type, where the Laplacian acts only on some of the variables. Two main issues that have attracted considerable research interest for this class are (i) constructing the fundamental solution of (2) and (ii) developing a time-discrete variational scheme for (1). To motivate our work, let us discuss the relevant literature on these issues for the Fokker-Planck equation and the Kramers equation. Regarding the first issue, it is a classical result that the diffusion equation ∂_t f = ∆f has the fundamental solution Φ_FP(t, x, y) = (4πt)^{−d/2} exp(−C_FP(x, y)/(4t)) with C_FP(x, y) = |x − y|². In the seminal paper [Kol34], Kolmogorov showed that

Φ_2(t, x_1, y_1; x_2, y_2) = (√3/(2πt²))^d exp(−C^{KR}_t(x, y)/(4t)), with C^{KR}_t(x, y) = |y_2 − x_2|² + 12 |(y_1 − x_1)/t − (x_2 + y_2)/2|²,

is the fundamental solution to the ultra-parabolic equation (5).
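Kolmogorov's formula can be checked numerically in the case d = 1. The sketch below (our own illustration; all function names are ours) evaluates Φ_2 and verifies by central finite differences that, as a function of (t, x_1, x_2), it satisfies the ultra-parabolic equation ∂_t Φ = x_2 ∂_{x_1} Φ + ∂²_{x_2} Φ.

```python
import math

def cost_kr(t, x1, x2, y1, y2):
    # Kramers cost, normalised so that the exponent of the kernel is C/(4t):
    # C_t(x, y) = |y2 - x2|^2 + 12 * |(y1 - x1)/t - (x2 + y2)/2|^2
    return (y2 - x2)**2 + 12.0 * ((y1 - x1)/t - (x2 + y2)/2.0)**2

def phi_kr(t, x1, x2, y1, y2):
    # Kolmogorov's fundamental solution for d = 1:
    # sqrt(3)/(2*pi*t^2) * exp(-C_t / (4t))
    return math.sqrt(3.0) / (2.0 * math.pi * t**2) \
        * math.exp(-cost_kr(t, x1, x2, y1, y2) / (4.0 * t))

def pde_residual(t, x1, x2, y1, y2, h=1e-4):
    # residual of d_t(phi) - x2 * d_{x1}(phi) - d^2_{x2}(phi),
    # computed with central differences; it should vanish up to O(h^2)
    f = phi_kr
    dt = (f(t + h, x1, x2, y1, y2) - f(t - h, x1, x2, y1, y2)) / (2 * h)
    dx1 = (f(t, x1 + h, x2, y1, y2) - f(t, x1 - h, x2, y1, y2)) / (2 * h)
    d2x2 = (f(t, x1, x2 + h, y1, y2) - 2 * f(t, x1, x2, y1, y2)
            + f(t, x1, x2 - h, y1, y2)) / h**2
    return dt - x2 * dx1 - d2x2
```

The residual is of the order of the finite-difference error at generic points, consistent with Φ_2 solving the equation exactly.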
It is this equation that was Hörmander's starting point in developing the hypoelliptic theory [Hör67], which has become a powerful tool in the theory of partial differential equations; see e.g. [Bra14] for a discussion. Since Kolmogorov's paper there has been a considerable amount of work extending his result to other hypoelliptic equations, including (2); see e.g. [Web51, Kup72, Pol94, FP05, DM10, IB15]. We refer to the mentioned papers and the references therein for more information in this direction. Regarding the second issue, the functions C_FP and C_KR also play a crucial role in the more recent development of time-discrete variational schemes for the Fokker-Planck equation and the Kramers equation, respectively. In the seminal paper [JKO98], Jordan, Kinderlehrer and Otto proved the remarkable result that the Fokker-Planck equation (3) can be seen as a gradient flow of the free energy, which is the sum of the potential energy ∫Vρ and the Boltzmann entropy ∫ρ log ρ, with respect to the Wasserstein distance on the space of probability measures with finite second moments. Let µ, ν ∈ P_2(X) be two probability measures on some Euclidean space X having finite second moments. Throughout this paper, Γ(µ, ν) denotes the set of all probability measures on X × X having µ and ν as first and second marginals. The Wasserstein distance W_2(µ, ν) between them is defined via

W_2(µ, ν)² = inf_{γ ∈ Γ(µ,ν)} ∫_{X×X} |x − y|² dγ(x, y).
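For two empirical measures on the real line with the same number of atoms, the infimum in the definition of W_2 is attained by the monotone (sorted) matching, which gives a quick way to experiment with the distance. A minimal sketch (ours, not from the paper):

```python
def wasserstein2_1d(xs, ys):
    # For uniform empirical measures on R, the optimal coupling for the
    # quadratic cost is the monotone (sorted) matching, so W_2^2 is the
    # mean squared difference of the order statistics.
    assert len(xs) == len(ys)
    xs, ys = sorted(xs), sorted(ys)
    return (sum((a - b)**2 for a, b in zip(xs, ys)) / len(xs)) ** 0.5
```

For example, translating a measure by a constant c gives W_2 = |c|, and two measures with the same atoms are at distance zero.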

Fundamental solution & variational formulation for hypoelliptic PDEs
The main ingredient in [JKO98] is the following variational approximation scheme, now known as the JKO-scheme.

JKO-scheme: Let h > 0 be a time-step. Define ρ^0_h := ρ_0. Then, for each k = 1, 2, …, ρ^k_h is determined as a minimizer of

ρ ↦ (1/(2h)) W_2(ρ^{k−1}_h, ρ)² + ∫Vρ + ∫ρ log ρ.

The main result in [JKO98] then states that, after an appropriate interpolation, the sequence {ρ^k_h} constructed from the JKO-scheme converges to the solution of the Fokker-Planck equation (3). This result has sparked a large body of research over the last two decades in the field of partial differential equations, linking the field to other branches of mathematics such as optimal transport theory, geometric measure theory and functional inequalities; see the monographs [Vil03, AGS08, Vil09] for excellent expositions of these developments.
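To see the mechanics of a single step of such a scheme in the simplest possible setting, drop the entropy term and let the measure be a single Dirac mass at x; the minimization then reduces to the proximal map x_k = argmin_x { V(x) + |x − x_{k−1}|²/(2h) }, i.e. implicit Euler for the gradient flow ẋ = −∇V(x). A sketch with the quadratic potential V(x) = x²/2 (our own illustration; all names are hypothetical):

```python
def jko_dirac_step(x_prev, h):
    # Proximal step for V(x) = x^2/2: argmin_x { x^2/2 + (x - x_prev)^2/(2h) }
    # has the closed form x = x_prev / (1 + h), i.e. implicit Euler for x' = -x.
    return x_prev / (1.0 + h)

def evolve(x0, h, T):
    # iterate the proximal (JKO-type) step up to time T
    x = x0
    for _ in range(int(round(T / h))):
        x = jko_dirac_step(x, h)
    return x
```

As h → 0, the iterates at time T = 1 approach e^{−1} ≈ 0.3679, the exact solution of ẋ = −x started at 1; this is the discrete-to-continuum convergence that the JKO result establishes in full generality.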
Inspired by the JKO-scheme, Huang [Hua00] and later Duong et al. [DPZ14] established different approximation schemes for the Kramers equation (4). The challenge here is that the techniques of [JKO98] cannot be applied directly, as the Kramers equation is neither a gradient flow nor a Hamiltonian flow, even though these schemes are of the same form as (9). The free energy functional is the same, but instead of the Wasserstein distance, the following Monge-Kantorovich optimal transport cost has been used in these papers:

W_h(µ, ν) = inf_{γ ∈ Γ(µ,ν)} ∫ C^{KR}_h(x, y) dγ(x, y),

where C^{KR}_h(x, y) is defined in (7). The cost functional W_h has also been used to construct time-discrete variational schemes for other evolution equations, such as the system of isentropic Euler equations [GW09, Wes10] and the compressible Euler equations [CSW14].
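For empirical measures with N atoms each, the infimum in a Monge-Kantorovich cost of this kind is attained at a permutation matrix (an extreme point of the coupling polytope), so for a handful of atoms it can be evaluated by brute force over matchings. A tiny sketch with a generic cost c (our own helper; the names are hypothetical):

```python
from itertools import permutations

def kantorovich_cost(xs, ys, c):
    # min over couplings of (1/N) * sum_i c(x_i, y_sigma(i)); by Birkhoff's
    # theorem the minimum over couplings of two uniform empirical measures
    # is attained at a permutation, so we enumerate matchings directly.
    n = len(xs)
    return min(sum(c(xs[i], ys[s[i]]) for i in range(n)) / n
               for s in permutations(range(n)))
```

This is only practical for very small N, but it makes the definition of the transport cost concrete for an arbitrary cost function such as C^{KR}_h.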
The above discussion reveals a similarity between the Fokker-Planck equation and the Kramers equation, namely the role played by the cost functions C_FP and C^{KR}_t: they appear both in the time-discrete variational scheme and in the fundamental solution. In addition, these cost functions are, up to a factor of t, the minimal values of the velocity and acceleration integrals ∫_0^t |ξ̇(s)|² ds and ∫_0^t |ξ̈(s)|² ds, respectively. Recently, in [DT16], we studied the minimization problem

C_t(x, y) = inf ∫_0^t |ξ^{(n)}(s)|² ds,

where x = (x_1, …, x_n) ∈ R^{dn}, y = (y_1, …, y_n) ∈ R^{dn} and the infimum is taken over all curves ξ ∈ C^n([0, t], R^d) that satisfy the boundary conditions (ξ, ξ̇, …, ξ^{(n−1)})(0) = (x_1, x_2, …, x_n) and (ξ, ξ̇, …, ξ^{(n−1)})(t) = (y_1, y_2, …, y_n).
The optimal value C_t(x, y) is called the mean squared derivative cost function and has been found useful in the modelling and design of various real-world systems, such as motor control, biometrics, online signatures and robotics; see [DT16] for further discussion.
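For n = 2 and d = 1, the Euler-Lagrange equation of the acceleration functional is ξ'''' = 0, so the minimizer is the cubic interpolating the boundary data, and the optimal value has the closed form C_t(x, y) = |y_2 − x_2|²/t + 12 |y_1 − x_1 − t(x_2 + y_2)/2|²/t³. The sketch below (ours; assuming this normalisation of the cost) recovers the closed form by solving for the cubic directly.

```python
def msd_cost_n2(t, x1, x2, y1, y2):
    # closed form for min \int_0^t |xi''|^2 with (xi, xi')(0) = (x1, x2)
    # and (xi, xi')(t) = (y1, y2)
    return (y2 - x2)**2 / t + 12.0 * (y1 - x1 - t * (x2 + y2) / 2.0)**2 / t**3

def msd_cost_n2_via_cubic(t, x1, x2, y1, y2):
    # The Euler-Lagrange equation of \int |xi''|^2 is xi'''' = 0, so the
    # minimizer is the cubic xi(s) = x1 + x2*s + c*s^2 + e*s^3 matching
    # the boundary data; solve the 2x2 linear system for (c, e).
    A = y1 - x1 - x2 * t
    B = y2 - x2
    c = (3 * A - B * t) / t**2
    e = (B * t - 2 * A) / t**3
    # \int_0^t (2c + 6 e s)^2 ds = 4 c^2 t + 12 c e t^2 + 12 e^2 t^3
    return 4 * c * c * t + 12 * c * e * t**2 + 12 * e * e * t**3
```

The two computations agree up to rounding, confirming that the interpolating cubic attains the closed-form value.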
Inspired by the role of C_FP and C^{KR}_t for the Fokker-Planck equation and the Kramers equation as discussed above, it is natural to ask:

(Q1) Is the function Φ(t, x, y) = β_d t^{−n²d/2} exp(−C_t(x, y)/4), where β_d is the normalising constant, the fundamental solution to Equation (2)?
(Q2) Can Equation (1) be approximated by a discrete-time variational scheme in the spirit of the JKO-scheme where C FP (x, y) is replaced by C h (x, y)?
The aim of the present paper is to provide affirmative answers to these questions. We now describe our main results.

Main results of the present paper
Our first main result is the following theorem about the fundamental solution to (2).
Theorem 1.1. The function Φ(t, x, y) defined in (13) is the fundamental solution to (2). That is, for each y, Φ(t, x, y), as a function of t and x, satisfies (2); in addition, Φ(t, ·, y) converges weakly to δ_y as t → 0. This theorem, together with the explicit formula for the mean squared derivative cost function C_t(x, y) in [DT16, Theorem 1.2] (see also Theorem 2.1 below), provides an explicit formula for the fundamental solution Φ(t, x, y).
Our second main result is concerned with a variational formulation of (1). We first introduce the approximation scheme in the spirit of the JKO-scheme. Let h > 0 be given and let C_h(x, y) be the mean squared derivative cost function defined in (11). Let µ and ν be two probability measures on R^{dn} having finite second moments. The Monge-Kantorovich optimal transport cost W_h(µ, ν) between µ and ν is defined by

W_h(µ, ν) = inf_{γ ∈ Γ(µ,ν)} ∫_{R^{dn} × R^{dn}} C_h(x, y) dγ(x, y).

The variational approximation scheme of this paper is constructed as follows.
Next we introduce the concept of a weak solution of (1). A function ρ ∈ L¹(R_+ × R^{dn}) is called a weak solution of equation (1) with initial datum ρ_0 ∈ P_2(R^{dn}) if it satisfies the weak formulation (17) of (1). Throughout the paper we make the following assumptions.
and there exists a constant C > 0 such that for all

The second main result of the paper is about the convergence of the approximation scheme to a weak solution of (1).
Theorem 1.4. Suppose that V satisfies Assumption 1.3. Let ρ_0 ∈ P_2(R^{dn}) satisfy ∫_{R^{dn}} (V(x_n) + log ρ_0) ρ_0 dx < ∞. For any sufficiently small h > 0, let {ρ^h_k} be the sequence of solutions of (16), and for any t ≥ 0 define the piecewise-constant time interpolation ρ_h as in (20). Then, for any T > 0, ρ_h converges in the sense of (21) to ρ, where ρ is the unique weak solution of Equation (1) with initial value ρ_0. Moreover, the stronger convergence (22) holds, as does the continuity (23) as t → 0.

Comparison to related work
The contribution of our work is twofold: we construct the fundamental solution to equation (2) (Theorem 1.1), and we develop a time-discrete variational scheme for equation (1), which is the adjoint equation of (2) with an additional external force field (Theorem 1.4). The mean squared derivative cost function (11) plays a central role, appearing both in the fundamental solution and in the approximation scheme. We now comment further on these issues.
On the fundamental solution of (2). The fundamental solution to (2) is not new. As mentioned above, it has been constructed in [Web51, Hör67, Kup72, Pol94, FP05, DM10, IB15] using various methods, such as the parametrix method and the Fourier transform. In this paper, we provide a direct verification based on elementary combinatorial and linear-algebra techniques, using the explicit formulas for the mean squared derivative cost function obtained in our recent work [DT16]. Our method is closest to that of [Kup72]; however, that work does not represent the fundamental solution of (2) in terms of the mean squared derivative cost function. The reference [DM10] provides an implicit representation via the controllability property of a differential system, but it does not address the variational formulation of (1).
Variational formulation for equation (1). Theorem 1.4 is a generalisation of the main results of [JKO98, Hua00, DPZ14] to an arbitrary n. Even though the standard procedure in these papers can be used to prove the theorem, two additional difficulties arise. The first is how to select an appropriate perturbation flow to derive the Euler-Lagrange equation for the sequence ρ^h_k. The second is to prove the convergence of the scheme to a solution of (1), which amounts to showing that the error terms vanish as h → 0. These difficulties are overcome using the explicit formulas involving the mean squared derivative cost function derived in our previous work [DT16]. We also note that [Hua00] and [DPZ14] considered the full Kramers equation (i.e., with n = 2) with an external force field −∇_{x_1} U(x_1), where U = U(x_1) is a given potential. In this case, the right-hand side of (1) has an additional term div_{x_2}(∇_{x_1}U ρ). [Hua00] dealt with this term by adding (1/h) ∫ U(x_1) ρ dx to the free-energy functional, resulting in an unusual scaling; in [DPZ14], by contrast, this term is encoded in the cost function. In the present work, for an arbitrary number of variables n, we assume that V depends only on the last co-ordinate x_n. Under this assumption, Equation (1) resembles the Fokker-Planck equation in the last co-ordinate, and the cost function C_t depends only on the n-th order derivative of ξ, giving rise to a tractable formula. It is not clear to us at the moment how to adapt [Hua00, DPZ14] to the more general case where V depends on more than the last co-ordinate, or where the co-ordinates are coupled in a more complex way as in [EH00, DM10, OP11]. We leave this issue for future research.
Microscopic interpretation of the variational scheme. The main results of the recent papers [ADPZ11, DLR13, DPZ13, EMR15] show that Scheme 1.2 with n = 1 and n = 2, i.e., for the Fokker-Planck equation and the Kramers equation respectively, can be derived from large-deviation principles for empirical measures associated with underlying stochastic processes. These results provide, among other things, microscopic interpretations of Scheme 1.2 in these cases. We expect that they can be extended to general n, but we do not pursue this direction in this paper.

Organization of the paper
The rest of the paper is organised as follows. In Section 2, we summarize the main properties of the mean squared derivative cost function obtained in [DT16]. In Sections 3 and 4, we prove Theorem 1.1 and Theorem 1.4, respectively. Finally, Appendix 5 contains the proofs of technical lemmas.

The mean squared derivative cost function
In this section, we collect relevant results on the mean squared derivative cost function from [DT16].
Moreover, the matrix A and its inverse have explicit LU -decompositions given in the following theorem.

(1) A = LU, where the matrices L and U are defined as follows. (2) The inverse of A is given by the product of the following two matrices. Note that throughout this paper, all the matrices A, B, M, L and U are of order dn. Each entry of these matrices should be understood as a d-dimensional block equal to the entry multiplied by the d-dimensional identity matrix I_d; for instance, A_{ij}, for 1 ≤ i, j ≤ n, should be understood as the matrix A_{ij} I_d. Multiplications of these matrices are carried out as multiplications of block matrices.
In the subsequent sections, we will also need the following property of the cost function C_t(x, y).
Proof. It follows from the formula for C_t(x, y) that only the symmetric part M_s of M contributes to C_t. By the Cauchy-Schwarz inequality, for t sufficiently small, we have T ≤ K and |w|² ≤ K t² |x|² for some constant K > 0; here and below, K denotes a universal constant that may change from line to line. Therefore, we obtain the desired estimate. This finishes the proof of the lemma.
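The fact that only the symmetric part of M contributes to C_t reflects the elementary identity x^T A x = 0 for an anti-symmetric matrix A (Lemma 3.3 below): indeed, x^T A x = (x^T A x)^T = x^T A^T x = −x^T A x. A short numerical check of this identity (ours):

```python
def quad_form(A, x):
    # computes x^T A x for a matrix given as a list of rows
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
```

For any anti-symmetric A the result vanishes (up to rounding), while a symmetric A produces a genuinely nonzero quadratic form.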

The fundamental solution of the adjoint equation (2)
In this section we prove Theorem 1.1. The proof consists of two main steps which are Proposition 3.1 and Proposition 3.2 below.
Proposition 3.1. Let Φ(t, x, y) be defined as in (13). Then it is a solution of (2) if and only if C_t(x, y) satisfies the following equation:

∂_t C − Σ_{i=2}^{n} x_i · ∇_{x_{i−1}} C − ∆_{x_n} C + (1/4)|∇_{x_n} C|² + 4α(n, d)/t = 0.    (29)

Proof. Let α(n, d) = n²d/2. We compute each term in (2) from the representation of Φ in (13). For simplicity of notation, in the following computations we write C := C_t(x, y).

First, we calculate the time-derivative of Φ:

∂_t Φ = (−α(n, d)/t − (1/4) ∂_t C) Φ.    (30)

Next, we calculate Σ_{i=2}^{n} x_i · ∇_{x_{i−1}} Φ. We have ∇_{x_{i−1}} Φ = −(1/4)(∇_{x_{i−1}} C) Φ, from which, by taking the sum over i from 2 to n, we get

Σ_{i=2}^{n} x_i · ∇_{x_{i−1}} Φ = −(1/4) (Σ_{i=2}^{n} x_i · ∇_{x_{i−1}} C) Φ.    (31)

The Laplacian ∆_{x_n} Φ with respect to the variable x_n is computed analogously:

∆_{x_n} Φ = (−(1/4) ∆_{x_n} C + (1/16) |∇_{x_n} C|²) Φ.    (32)

It follows from (30), (31) and (32) that Φ is a solution of (2) if and only if

−α(n, d)/t − (1/4) ∂_t C = −(1/4) Σ_{i=2}^{n} x_i · ∇_{x_{i−1}} C − (1/4) ∆_{x_n} C + (1/16) |∇_{x_n} C|².

By re-arranging the terms in the above equality and recalling that α(n, d) = n²d/2, we obtain (29). This finishes the proof of the proposition.
The following matrices will play an important role in the rest of the paper. Note that H_1(t), H_2(t), Q, D ∈ R^{dn×dn}; each entry of these matrices should be understood as a matrix of order d equal to the entry multiplied by the d-dimensional identity matrix.
Proof. The assertions of the lemma are proved by using combinatorial techniques and are given in Appendix 5.
The following lemma is elementary but will be used several times in the sequel. We include it for the sake of completeness and reference.
Lemma 3.3. Suppose that A ∈ R^{N×N} is an anti-symmetric matrix. Then for all x ∈ R^N we have x^T A x = 0.

We are now ready to prove Theorem 1.1.
Proof of Theorem 1.1. By Proposition 3.1, it is sufficient to prove (29). According to Theorem 2.1, C admits a representation in terms of the matrices H_1 and H_2 given in (33). We now verify the hypothesis of Proposition 3.1, that is, we show that the function C satisfies Equation (29). We first compute the time-derivative of C.
From (38), we have the expression for ∂_t C. Using the matrices Q and D defined in (34), the terms Σ_{i=2}^{n} x_i · ∇_{x_{i−1}} C and ∇_{x_n} C can be computed similarly. Therefore, we get the desired identity, where to obtain the second equality we have used the fact that D^T D = diag(0, …, 0, 1) = D.
The Laplacian ∆_{x_n} C is then computed via the trace operator.
Therefore, using (i)-(ii) above, we obtain

y^T T_1 y + 2 x^T T_2 y + x^T T_3 x + 2t Tr(D H_2^T M H_2) = 2 d n² t^{2n−1},

which is equal to the left-hand side of (43), as required. Finally, the initial condition (14) follows from the representation of C in Theorem 2.1 and the formula for b in (24). The proofs of the following two lemmas are standard; hence we will omit them here.
Lemma 4.1. Let ρ_0, ρ ∈ P_2(R^{dn}) be given. There exists a unique optimal plan P_opt ∈ Γ(ρ_0, ρ) attaining the infimum in the definition of W_h(ρ_0, ρ).

Lemma 4.2. Let ρ_0 ∈ P_2(R^{dn}) be given. If h is small enough, then the minimization problem (16) has a unique solution.
Next we establish the Euler-Lagrange equation for the sequence of minimizers of Scheme 1.2. We will need two auxiliary lemmas whose proofs are presented in Appendix 5.
Lemma 4.3. The matrix H satisfies, in particular, H_{ii} = 1, H_{i,i+1} = −h and H_{ij} = o(h²) for j ≥ i + 2. Note that H ∈ R^{dn×dn}, where H_{ij} should be understood as H_{ij} I_d.
Lemma 4.4. Let K = h^{2n−2} (H_2^T M H_1)^{−1}. Then K_{nn} = 1 and K_{ij} = o(h) for all (i, j) ≠ (n, n). Note also that K ∈ R^{dn×dn}, where K_{ij} should be understood as K_{ij} I_d.
Having these two lemmas, we are now ready to derive the Euler-Lagrange equation for the sequence of minimizers in Scheme 1.2.
Proof. Let µ ∈ P_2(R^{dn}) be given and let µ̄ be the unique solution of the corresponding minimization problem over P_2(R^{dn}). We will show that (49) holds, where P_opt is the optimal plan in W_h(µ, µ̄).

Although establishing the Euler-Lagrange equation for the minimizer µ̄ follows a well-established route, see e.g. [JKO98] for the Fokker-Planck equation (n = 1) and [Hua00, DPZ13] for the Kramers equation (n = 2), there is one additional difficulty: how to select an appropriate perturbation flow from which the Euler-Lagrange equation is deduced. We first define a perturbation of µ̄ by a push-forward under an appropriate flow. Let φ_1, …, φ_n ∈ C^∞_0(R^{dn}, R^d), and define the flows Φ^1, …, Φ^n accordingly. Since (Φ^1_0, …, Φ^n_0) = x, we have µ_0(x) = µ̄(x). Taking derivatives with respect to s of both sides gives the first variation. We then compute, using (14) and (15), the stationarity condition for µ̄, following the calculations in [JKO98, Hua00, DPZ13]. According to (38), we have the corresponding identity. Therefore, for a given ϕ ∈ C^∞_0(R^{dn}, R), we choose φ_1, …, φ_n in terms of ϕ and the matrix K given in Lemma 4.4; note that the definition of K implies that h^{2−2n} K^T (H_1^T M H_2) = I.

Using Lemmas 4.3 and 4.4, we compute
Substituting these calculations back into (52), we obtain the desired equality (49). Applying (49) to the minimizers {ρ^h_k}_{k≥1} of Scheme 1.2 yields the statement of Lemma 4.5.

A priori estimates
In this section, we derive a priori estimates for the sequence of minimizers of Scheme 1.2. The proofs of Lemma 4.6, Lemma 4.7 and Lemma 4.8 below are by now standard, see e.g. [JKO98, Hua00, DPZ13]; hence we omit them.
The following lemma provides an upper bound for the sum Σ_{k=1}^{N} W_h(ρ^h_{k−1}, ρ^h_k). From now on, we denote by M_2(ρ) the second moment of a probability measure ρ, i.e., M_2(ρ) = ∫ |x|² dρ.

Lemma 4.6. Let {ρ^h_k}_{k≥1} be the sequence of minimizers of Scheme 1.2 for fixed h > 0. Then for any positive integer N and sufficiently small h, we have, for some constant C > 0 independent of N,
The next lemma shows boundedness of the second moment M 2 (ρ h k ) and the entropy S(ρ h k ) locally in time.
Lemma 4.7. There exist positive constants T 0 , h 0 , and C, independent of the initial data, such that for any 0 < h ≤ h 0 , the solutions {ρ h k } k≥1 for Scheme 1.2 satisfy where K 0 = ⌈T 0 /h⌉.
The last lemma of this section extends Lemma 4.7 to any final time T > 0.
Lemma 4.8. Let {ρ h k } k≥1 be the sequence of the minimizers of Scheme 1.2 for fixed h > 0. For any T > 0, there exists a constant C > 0 depending on T and on the initial data such that

Proof of Theorem 1.4
Having established the Euler-Lagrange equation and the a priori estimates in the previous sections, we now prove Theorem 1.4. The proof is similar to those of [JKO98, Hua00, DPZ13]; therefore, we only present the part that is different, namely the proof of the convergence of the discrete Euler-Lagrange equations to the weak formulation (17) of Equation (1) as h → 0. The key point is to link the Euler-Lagrange equation for the sequence of minimizers obtained in Lemma 4.5 to a time-discretization of Equation (1).
Let T > 0 be a given final time. For each h > 0 we set K_h := ⌈T/h⌉. Let (ρ^h_k)_{k≥1} be the sequence of minimizers of Scheme 1.2 and let t ↦ ρ_h(t) be the piecewise-constant interpolation (20). By Lemma 4.8 we have the bound (60). Since the function z ↦ max{z log z, 0} has super-linear growth, (60) guarantees that there exists a subsequence, denoted again by ρ_h, and a function ρ ∈ L¹((0, T) × R^{dn}) such that ρ_h converges weakly to ρ in L¹((0, T) × R^{dn}). We now prove that the limit ρ satisfies the weak formulation (17). Let ϕ ∈ C^∞_c((−∞, T) × R^{dn}) be given. All constants C below depend on the parameters of the problem, on the initial datum ρ_0, and on ϕ, but are independent of k and of h.
Let P^h_k ∈ Γ(ρ^h_{k−1}, ρ^h_k) be the optimal plan for W_h(ρ^h_{k−1}, ρ^h_k). For any 0 < t < T, we have (62), where the error term ε_k, coming from the Taylor expansion of ϕ, can be suitably estimated. Multiplying (62) by 1/h and combining with (48), we get (64). It is worth noting that θ_k depends on t through the t-dependence of ϕ. Integrating (64) with respect to t from (k − 1)h to kh and summing the resulting relation from k = 1 to K_h gives (66), where R_h denotes the error term. By a discrete integration by parts, the left-hand side of (66) can be written as (68). From (66) and (68) we obtain (69). Next, we show that R_h → 0 as h → 0. Taking the limit h → 0 in (69) then yields equation (17), proving (21). The proof of the stronger convergence (22) and of the continuity (23) at t = 0 follows from the equi-near-continuity estimate, see [JKO98, Hua00, DPZ13], where W_2(ρ_0, ρ_1) is the Wasserstein distance between ρ_0 and ρ_1 defined in (8). This estimate follows from the inequality (see (28))

|x − y|² ≤ C ( C_h(x, y) + h² (|x|² + |y|²) ),

and the estimates (60) and (58).
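The discrete integration by parts used above is the summation-by-parts identity Σ_{k=1}^{K} a_k (b_k − b_{k−1}) = a_K b_K − a_1 b_0 − Σ_{k=1}^{K−1} (a_{k+1} − a_k) b_k, the discrete analogue of ∫ a b′ = [ab] − ∫ a′ b. A quick numerical check of the identity (our own sketch):

```python
def lhs_sum(a, b):
    # a holds a_1..a_K as a[0..K-1]; b holds b_0..b_K as b[0..K]
    K = len(a)
    return sum(a[k - 1] * (b[k] - b[k - 1]) for k in range(1, K + 1))

def rhs_sum(a, b):
    # boundary terms minus the "shifted difference" sum
    K = len(a)
    return a[K - 1] * b[K] - a[0] * b[0] \
        - sum((a[k] - a[k - 1]) * b[k] for k in range(1, K))
```

Both sides agree for arbitrary sequences, which is exactly what allows the time difference to be moved from ρ^h_k onto the test function ϕ in the proof.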

Appendix
This appendix contains proofs of technical lemmas in the previous sections.

Proof of Proposition 3.2
In this appendix, we prove Proposition 3.2. For convenience, we recall the relevant matrices here. We need several auxiliary lemmas; the first is an explicit formula for the inverse of the matrix B defined in (26).
Lemma 5.1. Recall the matrix B in (26). Then B^{−1} has the following form.

Proof. Define the matrix B̃ by the formula above. We now show that B B̃ = I. We consider three cases.

1. If k < j, then (B B̃)_{kj} = 0, since by the definition of B̃ we have B̃_{ij} = 0 for every i ≥ n + 1 − k > n + 1 − j.
2. If k = j, then, arguing as in the previous case, we get (B B̃)_{kk} = 1.

3. If k > j, then, since B_{ki} = 0 for i < n + 1 − k and B̃_{ij} = 0 for i > n + 1 − j, we get (B B̃)_{kj} = 0.

From these three cases, we conclude that B B̃ = I, and therefore B^{−1} = B̃, which completes the proof of the lemma.
The next two lemmas show relations between the matrices H 1 , H 2 , Q and D.
Lemma 5.2. It holds that Proof. For three matrices X, Y, Z, we have

In particular, applying this identity with X = H_2, Y = D and Z = H_2^T, we get the claimed identity, where we used the definitions of D and H_2 to obtain (∗) and (∗∗), respectively. This finishes the proof of the lemma.
Lemma 5.3. It holds that (75).

Proof. We compute the entries (T_23)_{ij} and consider three cases.

1. In the first case, a direct computation gives (T_23)_{ij} = 0.

2. If j < i, then consider the function g(x) = (x − 1)^{i−j}. On the one hand, since g′(x) = (i − j)(x − 1)^{i−j−1}, it follows that g′(1) = 0. On the other hand, expanding g′(1) by the binomial theorem implies (76). In addition, we have (77). Summing up (76) and (77) yields (I) = 0. The term (II) is computed directly. Hence, if j < i, then (T_23)_{ij} = 0.
3. Finally, if i = j, then we compute (T_23)_{ii} directly. From these three cases, we obtain the claimed identity (75). This finishes the proof of the lemma.
The following lemma is a purely combinatorial identity. We expect that this identity is not new, but we include a proof here for the sake of completeness.
Lemma 5.4. It holds that (79).

Proof. We prove (79) by induction. Obviously, (79) holds for k = 0. Assume that (79) holds for some value of k; we prove that it then also holds for k + 1. A direct computation, where in the last equality we use a standard binomial identity, gives (79) for k + 1. Therefore, by the induction principle, (79) is proved.

Lemma 5.5. For all j ≤ k and k − j ≤ n, we have (80).

Proof. The identity (80) was used and proved in our previous paper [DT16, Proof of Theorem 4.1]. For the convenience of the reader, we provide a proof here. Consider the function f(x) = x^n (1 − x)^{j−1}. On the one hand, we differentiate f directly. Therefore,

It follows that
and by multiplying both sides of this equality by (−1)^j, we get (81). On the other hand, according to the Leibniz formula, we have, in particular, (82). The identity (80) then follows from (81) and (82).

To prove that T_1 is anti-symmetric, it suffices to prove that T_11 is anti-symmetric. Let C_{ij} be the (i, j) entry of T_11. Then C_{ij} can be computed as follows.
Now, applying (85) with p = n − i and q = n − j, we get (86). From (86) and (84) we obtain C_{ji} = −C_{ij}. Therefore, T_11 is anti-symmetric, and so is T_1.
According to Lemma 5.2, we have H_2 D H_2^T = H_0. To prove that T_3 is anti-symmetric, it suffices to show that T_31 is anti-symmetric.
From (84), using the fact that (H_0)_{ij} = 1/((n − i)!(n − j)!), we obtain (87). Using (87) and Lemma 5.3, we conclude that T_31 is anti-symmetric, and so is T_3.

Finally, we prove the last identity in Proposition 3.2. We will show that Tr(D H_2^T M H_2) = n² d t^{2(n−1)}.
We recall that all the matrices A, B, H_2, H_1, D, M, L and U are of order dn; each entry of these matrices should be understood as a matrix of order d equal to the entry multiplied by the d-dimensional identity matrix I_d.
Next we show that M_{nn} = n² I_d. According to Theorem 2.2, we have

(L^{−1})_{ln} = 1 if l = n, and (L^{−1})_{ln} = 0 if l ≠ n.