On multi-dimensional hypocoercive BGK models

We study hypocoercivity for a class of linearized BGK models for continuous phase spaces. We develop methods for constructing entropy functionals that enable us to prove exponential relaxation to equilibrium with explicit and physically meaningful rates. In fact, we not only estimate the exponential rate, but also the second time scale governing the time one must wait before one begins to see the exponential relaxation in the L1 distance. This waiting time phenomenon, with a long plateau before the exponential decay"kicks in"when starting from initial data that is well-concentrated in phase space, is familiar from work of Aldous and Diaconis on Markov chains, but is new in our continuous phase space setting. Our strategies are based on the entropy and spectral methods, and we introduce a new"index of hypocoercivity"that is relevant to models of our type involving jump processes and not only diffusion. At the heart of our method is a decomposition technique that allows us to adapt Lyapunov's direct method to our continuous phase space setting in order to construct our entropy functionals. These are used to obtain precise information on linearized BGK models. Finally, we also prove local asymptotic stability of a nonlinear BGK model.


Introduction
This paper is concerned with the large time behavior of nonlinear BGK models (named after the physicists Bhatnagar-Gross-Krook [7]) and their linearizations around their Maxwellian steady state. With respect to position, we consider here only models onT d := L 2π T d , the d-dimensional torus of side length L without confinement potential. Then, the usual BGK model for a phase space density f (x, v,t); x ∈T d , v ∈ R d satisfies the kinetic evolution equation and pressure (setting the gas constant R = 1) Let dx := L −d dx denote the normalized Lebesgue measure onT d , and consider normalized initial data (

1.2)
This means, our system has unit mass, zero mean momentum, and unit position-averaged pressure (w.l.o.g. this can be obtained by a Galilean transformation and choice of units). One easily checks that this normalization is conserved under the flow of (1.1). Hence the system (1.1) is expected to have the unique, space-homogeneous steady state the centered Maxwellian at unit temperature, which clearly has the same normalization as (1.2). A standard argument involving the Boltzmann entropy confirms that this is indeed the case, but it gives no information on the rate of convergence to equilibrium, nor does it even prove convergence. We remark that (1.1) involves two different time scales: the generic transport time is O(L), while the relaxation time is O(1). The goal of this paper is to prove the large time convergence to this f ∞ for solutions of (1.1) and its linearizations in 1, 2, and 3D with explicitly computable exponential rates. This extends our previous work [1], which considered the 1D linear BGK model: where M T denotes the normalized Maxwellian at some temperature T > 0: In [1] we studied the rate at which normalized solutions of (1.3) approach the steady state f ∞ = M T as t → ∞. This problem is interesting since the collision mechanism drives the local velocity distribution towards M T , but a more complicated mechanism involving the interaction of the streaming term v∂ x and the collision operator Q lin is responsible for the emergence of spatial uniformity.
To elucidate this key point, let us define the operator L by The evolution equation (1.3) can be written ∂ t f = L f . Let H denote the weighted space L 2 (T×R d ; M −1 T (v) dx dv), where in the current discussion d = 1. Then Q lin is self-adjoint on H , L f ∞ = 0, and a simple computation shows that if f (t) is a solution of (1. 3), where, as before, ρ(x,t) := R f (x, v,t) dv. Thus, while the norm f (t) − f ∞ H is monotone decreasing, the derivative is zero whenever f (t) has the form f (t) = M T ρ for any smooth density ρ. In particular, the inequality (1.5) Many hypocoercive equations have been studied in recent years [27,15,13,12,5], including BGK models in §1.4 and §3.1 of [12], but sharp decay rates were rarely an issue there. In our earlier work [1], we established hypocoercivity for such models in 1D by an approach that yields explicit -and quite reasonable -values for c and λ . To this end, our main tools have been variants of the entropy-entropy production method.
The articles [1] and [12] only consider BGK models with conserved mass, and partly with also conserved energy. But the tools presented there did not apply to BGK equations that also conserve momentum. This is in fact an important structural restriction that we shall formalize in §2.2 with the notion hypocoercivity index. The common feature of all models analyzed in [1] as well as in [12] is that their hypocoercivity index is 1. The main goal of this paper is to extend the methods from [1] (i.e. constructing feasible Lyapunov functionals) to models with higher hypocoercivity index. Applied to BGK equations this then also includes models with conserved momentum.
The existence of global solutions for the Cauchy problem of (1.1) has been proven in case of unbounded domains [21] and bounded domains [24,22], respectively. In case of bounded domains (such as x ∈T d ), these solutions are essentially bounded and unique [22]. For a space-inhomogeneous nonlinear BGK model with an external confinement potential, the global existence of solutions for its Cauchy problem and their strong convergence in L 1 to a Maxwellian equilibrium state has been proven recently [8].
In the first part of this paper we shall study the linearization of the BGK equation (1.1) around the centered Maxwellian with constant-in-x temperature equal to one. To this end we consider f close to the global equilibrium M 1 (v), with h defined by f (x, v,t) = M 1 (v) + h(x, v,t). Then The perturbation h then satisfies For σ , µ, and τ small we have which yields the linearized BGK model that we shall analyze in dimensions 1, 2, and 3 in this paper: Here and in the sequel we only have h(x, v,t) ≈ f (x, v,t) − M 1 (v), but for simplicity of notation we shall still denote the perturbation by h.  for all t. Moreover, if most of the mass density is initially located in a small portion ofT; e.g., if the gas molecules are initially released from a small container into a vacuum in the rest ofT, then f (t) − M 1 L 1 (T×R d , dx dv) will be close to 2 until the streaming has had time to distribute the particles more uniformly overT. Our estimates bound the time that it takes for this to happen.
Combining (1.13) with (1.14) yields For the one dimensional case, it is shown in Remark 3.4 that λ 1 (L) = O(1/L 2 ) in the limit L → ∞. Moreover, the constant C 1 (L) approaches 1 in the limit L → ∞ by using the limiting behavior α * (L) = O(1/L) in expression (3.14). For initial dataf I with all of the gas molecules initially located in a small region ofT with a volume fraction of order ε, the initial entropy E 1 (f I ) will satisfy E 1 (f I ) = O(ε −2 ). In this case, t init is approximately given by O(−L 2 (C + logε)) for ε ≪ 1, L ≫ 1 and some positive constant C. Thus one time scale in our problems is given, or at least bounded, by t init . After this time, the solution satisfies 16) and the second time scale, is given by 2/λ d (L), the waiting time after t init for f (t)− M 1 L 1 (T×R d , dx dv) to decrease by a factor of 1/e; see Figure 1.
These two times scales are quite similar to what one observes in interacting particle systems or even in card shuffling; see [3,11]. In particular, [3,Fig. 2] is quite similar to our Fig. 1 below.
(c) The resemblance of (1.16) to the results of Aldous and Diaconis for finite Markov chains in [3,11], and in particular for card shuffling, is not a coincidence. The equation (1.3) can be interpreted as the Kolmogorov forward equation for a Markov process. Exponential rates for related Markov process with exponential rates have been obtained by probabilistic methods; see [6] for an early study of this type. However, the approach in [6] relies on compactness arguments and does not yield explicit values for c or λ . One difference between our results and those for finite Markov chains is that in our case, the initial relative entropy can be infinite. In card shuffling, starting form a perfectly ordered deck of cards, one starts from a state of maximal-but finite-relative entropy, and the waiting time for uniformization from this state dominates that of any other starting point. For this reason, the initial waiting time for finite Markov chains is a universal "worst case", while this is impossible in our setting; our result must refer to E d (f I ).
(d) Our bound on the decay rate is monotonically decreasing in L and satisfies λ d (L = 0) > 0 and λ d (L = ∞) = 0 (for d = 1 see Fig. 2 below). Moreover c d (L = 0) = C d (L = 0) = 1 (see (3.13), (3.14) below).  (1.15). The values of C d , λ d correspond to the 1D case with L = 2π, and we chose E d (f I ) = 15. We also show the two time scales of the BGK equation: t init marks the intersection point of the two blue curves and it corresponds to the generic transport time. t 2 := t init + 2 λ marks the intersection point of the exponential curve with the value 2/e, and t 2 − t init corresponds to the relaxation time scale. For larger values of L, t init will be much larger.
To prove local asymptotic stability for the nonlinear BGK equation (1.1) in 3D, we make use of another set of norms: For γ ≥ 0, let H γ (T 3 ) be the Sobolev space consisting of the completion of smooth functions ϕ onT 3 in the Hilbertian norm ϕ 2 Then the inner product in H γ is given by (a) For all γ ≥ 0 there is an entropy functional E γ ( f ) satisfying This paper is organized as follows: In §2 we review from [1] a Lyapunov-type method for hypocoercive ODEs that yields their sharp exponential decay rate. While this approach requires all eigenvectors of the system matrix, we also develop an approach using simplified Lyapunov functionals. This alternative strategy comes at the price of yielding only a suboptimal decay rate, but it can be extended to infinite dimensional systems and BGK equations. In §3 we apply the second strategy to the linearized BGK model (1.9) in 1D, proving exponential decay of the solution towards the spatially uniform Maxwellian, as stated in Theorem 1.1. This is based on decomposing (1.9) into spatial Fourier modes and introducing a Hermite function basis in velocity direction. In the Sections 4 and 5 we extend our result to 2D and 3D, respectively. But this is not straightforward, as it is already not obvious how to choose a convenient Hermite function basis in multi dimensions. Finally, in §6 we prove local exponential stability of the nonlinear BGK equation (1.1) in 3D as stated in Theorem 1.3(b).

Decay of hypocoercive ODEs
The local convergence result in Theorem 1.3(b) is obtained from the global convergence result in Theorem 1.1 and a relatively straightforward control of the errors involved in linearization. Therefore, the essential content of the paper concerns the linearized BGK equations. To this end we shall rewrite them as ODEs -of infinite dimension -in fact. We therefore begin this section with a discussion of the hypocoercivity structure of ODEs and review (from [1]) a Lyapunov-type method that yields their sharp decay rate.

Lyapunov's direct method
To illustrate the method we start with linear, finite dimensional ODEs. Consider an ODE for a vector for some (typically non-Hermitian) matrix C ∈ C n×n . The stability of the steady state f 0 ≡ 0 is determined by the eigenvalues of matrix C: Theorem 2.1. Let C ∈ C n×n and let λ j ( j = 1, . . . , n) denote the eigenvalues of C (counted with their multiplicity).
(S3) The equilibrium f 0 of (2.1) is unstable in all other cases.
For positive definite Hermitian matrices C, using the Lyapunov functional f 2 in the energy method allows to obtain the sharp decay rate, which is the smallest eigenvalue µ of C: The derivative of f 2 along solutions f (t) of (2.1) satisfies where C * denotes the Hermitian transpose of C. Note that the derivative of f 2 depends only on the Hermitian part 1 2 (C * + C) of matrix C, such that for a Hermitian matrix C there is no loss of information. But for non-Hermitian matrices it is more natural to use a modified norm: for some positive definite Hermitian matrix P ∈ C n×n , to be derived from C. The derivative of f 2 P along solutions f (t) of (2.1) satisfies Then, f 0 ≡ 0 is asymptotically stable, if there exists a positive definite Hermitian matrix P such that C * P + PC is positive definite. To determine the decay rate to f 0 , and to choose P conveniently we shall use the following algebraic result. Lemma 2]). For any fixed matrix C ∈ C n×n , let µ := min{ℜ(λ )|λ is an eigenvalue of C}. Let {λ j |1 ≤ j ≤ j 0 } be all the eigenvalues of C with ℜ(λ j ) = µ, only counting their geometric multiplicity. If all λ j ( j = 1, . . . , j 0 ) are non-defective, then there exists a positive definite Hermitian matrix P ∈ C n×n with But P is not uniquely determined. Moreover, if all eigenvalues of C are non-defective, such matrices P satisfying (2.2) are given by where w j ( j = 1, . . . , n) denote the left eigenvectors of C, and b j ∈ R + ( j = 1, . . . , n) are arbitrary weights.

Remark 2.3.
(i) This result was proven in [5,Lemma 4.3] for real matrices C ∈ R n×n , and in [1, Lemma 2] for complex matrices C. In particular, if C is a real matrix, then the inequality (2.2) of Lemma 2.2 holds true for some real positive definite symmetric matrices P ∈ R n×n .
(ii) For the extension of the above lemma to the case of defective eigenvalues see [5,Lemma 4.3(i)] and [2,Prop. 2.2]. But the construction of P then involves also the generalized eigenvectors, (iii) The Lyapunov inequality (2.2) is a special case of a linear matrix inequality. In a standard reference of system and control theory [10], the problem of finding the maximal positive constant µ and a positive definite matrix P satisfying (2.2) is formulated as a generalized eigenvalue problem, see [10, §5.1.3]. The optimal value for the constant µ is pointed out, but the associated matrices P (like in our construction (2.3)) are not specified.
Now we consider examples, where all eigenvalues of C ∈ C n×n are non-defective and have positive real parts. Then the origin is the unique and asymptotically stable steady state f 0 = 0 of (2.1): Due to Lemma 2.2, there exists a positive definite Hermitian matrix P ∈ C n×n such that Let λ P j ( j = 1, . . . , n) denote the positive eigenvalues of the positive definite Hermitian matrix P being ordered by magnitude such that 0 < λ P 1 ≤ . . . ≤ λ P n . Then the matrix inequality λ P 1 I ≤ P ≤ λ P n I implies the equivalence of norms Thus the decay in P-norm (2.4) translates into a decay in the Euclidean norm with the constant c = λ P n /λ P 1 ≥ 1, i.e. the condition number of P. Note that c = 1 if and only if P = I. Remark 2.4. In a popular textbook on linear systems theory [16], the exponential decay (2.5) is obtained as follows [16, §8.5]: For a stable matrix −C (i.e. all eigenvalues of −C have negative real part) and a matrix Q, the unique solution P of Lyapunov's equation If Q is a positive definite symmetric matrix, then the unique solution P is also symmetric and positive definite. Moreover, the P-norm of any solution f (t) of (2.1) satisfies where λ Q j and λ P j are the positive eigenvalues of the positive definite symmetric matrices Q and P. This implies (2.4) with 2µ = min λ Q j /maxλ P j . However, only a suitable choice for Q would allow to recover the optimal decay rate as achieved in Lemma 2.2.
The preceding discussion allows to characterize coercive and hypocoercive systems of linear ODEs (as well as matrices) according to the definition in the introduction: Equation (2.1) with matrix C is coercive, if the Hermitian part of C is positive definite, i.e.
In this case, the trivial energy method (i.e. multiplying (2.1) by f (t) ⊤ and using f 2 as a Lyapunov functional) shows decay of f (t) with rate κ and c = 1. But this exponential rate is not necessarily sharp, e.g. for some non-Hermitian matrices C.
While this notion was originally coined for operators in PDEs, such matrices are typically also called positively stable.
Comparing the spectrum of C and C H , it is well known that the maximum constants κ and µ satisfy κ ≤ µ. If all eigenvalues of C with ℜ(λ j ) = µ are non-defective, then f (t) decays at least with rate µ. However, if C has a defective eigenvalue with ℜ(λ ) = µ, then f (t) decays "slightly slower", i.e. with rate µ − ε, for any ε > 0 (see [5,Proposition 4.5] and [2, Proposition 2.2] for details -applied to hypocoercive Fokker-Planck equations). Very recently this decay result has been improved as follows: In this case there is still a positive definite matrix P, but it cannot be given by the simple formula (2.3), and (2.4) becomes for some C > 0, where m is the maximal defect of the eigenvalues of C with ℜ(λ j ) = µ. See [4] for more information.

Index of hypocoercivity
For the BGK models analyzed below we intend to construct convenient Lyapunov functionals of the form f , P f , where the matrix P does not necessarily have to reveal the sharp spectral gap of C (in the sense of Lemma 2.2). To this end we first give a definition of the structural complexity of a hypocoercive equation of the form d dt Here we decomposed the matrix C ∈ C n×n as C = iC 1 + C 2 with Hermitian matrices C 1 and C 2 with C 2 ≥ 0. In the special case C 1 = 0, (ker C 2 ) ⊥ corresponds to the subspace of decaying solutions f (t), and ker C 2 to the non-decaying subspace. In hypocoercive equations, the semigroup generated by the skew-Hermitian matrix iC 1 may turn non-decaying directions into decaying directions, hence allowing for an exponential decay of all solutions. More precisely, we assume ∃τ ∈ N 0 and ∃κ > 0 : Definition 2.5. For Hermitian matrices C 1 and C 2 with C 2 ≥ 0, the hypocoercivity index of the matrix C (and of the ODE (2.7)) is the smallest τ ∈ N 0 , such that (2.8) holds.
Clearly, τ = 0 corresponds to coercive matrices C; i.e., those for which all eigenvalues of its Hermitian part 1 2 (C + C * ) are strictly positive. A simple computation shows that this definition is invariant under a change of basis. We note that condition (2.8) is identical to the matrix condition in Lemma 2.3 of [5], which characterizes the hypoellipticity of degenerate Fokker-Planck operators of the form L f = div(D∇ f + Cx f ) (using the matrix correspondence D = C 2 , C = C 1 ). Hence, condition (2.8) for the ODE (2.7) and its hypocoercivity index can be seen as an analogue of the finite rank Hörmander condition for hypoelliptic and degenerate diffusion equations [18,Th. 1.1]. While the hypocoercivity index of degenerate parabolic equations determines the algebraic regularization rate (e.g. from L 2 into H 1 , see Theorem A.12 in [27] and Theorem 4.8 in [5]), its role in hypocoercive ODEs is not yet clear.

Equivalent hypocoercivity conditions
Next, we collect several statements which are equivalent to condition (2.8). They will be useful for the analysis in §2.3. Proposition 2.6. Suppose that C 1 ∈ C n×n and C 2 ∈ C n×n are Hermitian matrices. Suppose furthermore that C 2 is positive semi-definite. Then the following conditions are equivalent: (B1) There exists τ ∈ N 0 such that which is often called Kalman rank condition.
(B3) No non-trivial subspace of ker C 2 is invariant under C 1 .
(B4) No eigenvector of C 1 lies in the kernel of C 2 .
(B5) There exists a skew-Hermitian matrix K such that C 2 + [K, Moreover, the smallest possible τ in (B1) and (B2) coincides; it is the hypocoercivity index of C.
Proof. The equivalence of (B1) and (B2) (with the same τ) follows from [ (a) In order to use condition (B1) later on also for "infinite matrices" we give here an equivalent version: Condition (2.9) implies that the columns of C τ+1 √ C 2 , j ∈ {k − 1, . . ., τ + k − 1}. Hence, for a hypocoercive matrix we have to gain with each added term in (2.9) at least one rank until we reach full rank, i.e. space dimension n. Thus, for hypocoercive matrices its hypocoercivity index is bounded from above by the dimension of ker C 2 (or equivalently corank of C 2 ).
In [27,Remark 17] the connections of the above conditions to Kawashima's nondegeneracy condition for the study of degenerate hyperbolic-parabolic systems [20] and Hörmander's rank condition for hypoelliptic equations [18] are noted.
For real symmetric matrices C 1 , C 2 ∈ R n×n with C 2 ≥ 0, condition (B4) is equivalent to the condition that C := iC 1 + C 2 has only eigenvalues with positive real part, see [25,Theorem 1.1]. And the latter statement is equivalent to the exponential stability of (2.7). Using Proposition 2.6, we shall now prove a similar statement for Hermitian matrices: Proof of Lemma 2.8. First, we show that condition (2.8) implies that all eigenvalues λ C of C := iC 1 + C 2 have positive real part ℜ(λ C ) > 0: Let φ be an eigenvector of C corresponding to an eigenvalue λ , i.e. (2.10) Take the complex inner product of this equation with φ , to obtain due to the assumptions on the matrices C 1 and C 2 . Moreover, there exists a skew-Hermitian matrix K such that C 2 + [K, C 1 ] is positive definite by Proposition 2.6. We multiply equation (2.10) with iK and take the inner product with φ such that λ iKφ , φ = iKCφ , φ .
We conclude that, if all eigenvalues λ C of C have positive real part ℜ(λ C ) > 0, then condition (B4)and equivalently (2.8) -must hold. Remark 2.9. In the study of hypocoercivity for discrete velocity BGK models, a family of matrices C (k) := ik C 1 + C 2 (k ∈ N) for some real symmetric matrices C 1 , C 2 ∈ R n×n with C 2 ≥ 0 has to be considered, see [1, §4.1- §4.2]. Following the proof of [26,Prop. 2.4], a uniform bound for the real parts of the eigenvalues λ C (k) of these matrices C (k) (k ∈ N) can be proven: Remark 2.10. Next we relate our study of equation (2.7) to the one of d dt f + L f = 0 in [27]. In the first part of [27], operators L = A * A + B with a skew-symmetric operator B are considered. Our operator/matrix C = iC 1 + C 2 (for some Hermitian matrices C 1 , C 2 ∈ C n×n with C 2 ≥ 0) is of the form L = A * A + B for the choice A = √ C 2 and B = iC 1 acting on the complex Hilbert space C n . First, we notice that K := ker L = ker A ∩ ker B, see [27,Prop. I.2]. There, the study of hypocoercivity is based on the assumptions [27, (3.4)-(3.5)]: 17) or more clearly, where the iterated commutators D k (k ∈ N 0 ) are defined recursively as In [27,Remark 17] it is noted (without a proof) that on finite dimensional Hilbert spaces, condition (2.18) is equivalent to (B3) in Proposition 2.6 (with credit to Denis Serre).
The following simple example shows that this "equivalence" needs a small modification in complex Hilbert spaces: Consider the matrices Matrix A has kernel ker A = span{ 0 1 }. Moreover D 0 = A and D k = 0 for all k ∈ N. Hence, K = ker A ∩ ker B = ker A and conditions (2.17) and (2.18) are satisfied for all τ ∈ N 0 . But (B3) does not hold. Now we give a proof of a slightly modified equivalence. On finite dimensional Hilbert spaces, conditions (2.17) and (2.18) are obviously equivalent. Moreover, we will make use of Proposition 2.6 and only show the equivalence of (B1) and a modified (2.17): if and only if (C 1 , C 2 ) satisfies (B1). Moreover, the smallest possible τ in (2.19) and (B1) coincides.
Proof. First, notice that for all τ ∈ N 0 Defining K ′ := k≥0 ker D k , the inclusion K ⊂ K ′ is proven in [27,Prop I.15]. Next we prove that by induction: For τ = 0, w ∈ ker D 0 = ker A holds. Assume now condition (2.21) for τ and prove it for τ + 1.

Ansatz for the transformation matrix P
For finite dimensional matrices with non-defective eigenvalues, an optimal transformation matrix P (yielding the sharp spectral gap and thus the sharp decay rate) can be constructed as stated in Lemma 2.2. But for "infinite matrices" the eigenfunctions w j will not be known in general. Hence, an optimal matrix P cannot be obtained from formula (2.3). Even for finite dimensional systems with n large, it may not be possible to explicitly construct the matrix P defined in (2.3). However, Lemma 2.2 still provides a guide to the construction of a non-optimal choice of P that can still be used to prove hypocoercivity and to give a quantitative decay rate. We shall exploit this in §3-6 to prove hypocoercivity for BGK equations. To this end we shall only consider minimal matrices P, i.e. matrices with a minimal number of non-zero entries in P − I, such that Lemma 2.2 still allows to deduce hypocoercivity (but then with a suboptimal rate µ). Our focus will be to find a usable and simple ansatz for P and to prove that such an ansatz will give rise to a matrix inequality of the form (2.2). The structure of these ansatzes shall be derived from the connectivity structure of the matrix C: We consider examples of equations (2.7), where we assume w.l.o.g. that the Hermitian matrix C 2 is diagonal and hence real. Next we consider how the zero and negative diagonal elements of −C 2 (or equivalently the non-decaying and decaying eigenmodes of d dt f = −C 2 f ) are coupled via a (non-zero) off-diagonal pair in the Hermitian matrix C 1 . More precisely, a non-zero off-diagonal element of C 1 at j, k (and hence also at k, j) couples, in the evolution equation, the j-th mode of C 2 to its k-th mode (or diagonal element). In the sequel we shall use a simple graphical representation of such connections: there the dots • and • represent, respectively, zero and negative diagonal elements of −C 2 , and an arrow between such dots represents their connection (or coupling).
For each zero element in the diagonal of C 2 , we next consider a shortest connection graph to a nonzero element in diag(C 2 ) -realized by a sequence of non-zero off-diagonal elements of C 1 . This leads to a guideline to find a simple ansatz for a minimal transformation matrix of the form P = I + A: The ansatz parameters of the Hermitian matrix A ∈ C n×n should be put at the positions of the non-zero off-diagonal coupling elements of C 1 that are needed to establish the shortest connection graphs -choosing only one graph per zero element in diag(C 2 ).
Next we shall list some hypocoercive cases with low dimensionality of ker C 2 , because these are the most important cases in kinetic equations (as discussed in §3-6). For those cases we shall then prove that the above mentioned ansatzes indeed allow to establish a spectral gap of C.
In this situation there exists only one (structurally relevant) case. For (2.7) to be hypocoercive, the only zero element of the diagonal of C 2 (w.l.o.g. say with index j = 1) needs to be coupled (via C 1 ) to a positive element of the diagonal of C 2 . Due to our assumptions, The matrix C = iC 1 + C 2 is hypocoercive if and only if (B3) holds. Since ker C 2 = span{e 1 }, Condition (B3) reads here C 1 e 1 ∈ span{e 1 }. Thus, we conclude from C 1 e 1 = (c 1,1 , . . . , c n,1 ) ⊤ that c j,1 = 0 for some j ∈ {2, . . . , n}. Of course, j does not have to be unique, but we now fix one such index j 0 . This means that c 1, j 0 = c j 0 ,1 = 0. In this case the hypocoercivity index is always 1, since Remark 2.7(b) yields here that the hypocoercivity index is less or equal dim(ker C 2 ) = 1.
W.l.o.g. we assume j 0 = 2. The coupling within the relevant 2 × 2-subspace (i.e. the upper left 2 × 2 block of the matrix C) can then be symbolized as •−→• . Such an example was analyzed in §4.3 of [1] (representing a linear BGK equation in 1D) using a transformation matrix with the ansatz for some λ ∈ C. Here, P and I are square matrices of the same size as C, possibly even infinite. The second matrix on the r.h.s. has the same size, but only its upper left 2 × 2 block is non-zero. While the above transformation matrix P is not optimal, this approach is important in practice: in theory, Lemma 2.2 provides the optimal transformation matrix P to deduce the optimal ODE-decay (2.4) or (2.6). But in practice, its computation is tedious, particularly when the system matrix involves a parameter, which is the case for the BGK-models to be analyzed below (cf. Remark 2.9). For large systems, there is therefore a need to design a method that does not require all eigenvectors, even if the resulting decay rates are then sub-optimal. For the case dim(ker C 2 ) = 1, an approximate transformation matrix P of the simple structure (2.22) is sufficient, and it always allows to prove an explicit exponential decay of the ODE (2.7): the following theorem shows that C and P satisfy a matrix inequality of form (2.2), but not necessarily with the optimal constant µ. Moreover, it shows that the ansatz (2.22) from §4.3 of [1] was not a "wild guess" but rather a systematic approach.
Proof. We set P = I + rA with Then we consider C * P + PC = 2C 2 + r(C * A + AC) as a perturbation of the matrix 2C 2 for sufficiently small r ≥ 0. In particular, zero is a simple eigenvalue of C 2 with eigenvector e 1 . For small r ≥ 0, the eigenvalues of C * P + PC are close to the eigenvalues of 2C 2 . Therefore, we only need to study the evolution of the zero eigenvalue w.r.t. r. Due to [17,Thm. 6.3.12], the lowest eigenvalue µ(r) is a continuous function satisfying Due to our assumptions, c 1,2 = 0. Hence, we can choose φ such that ℑ(e −iφ c 1,2 ) is positive. For such a choice, the smallest eigenvalue µ(r) of 2C 2 + r(C * A + AC) will be positive. This finishes the proof.

Hypocoercive matrix with
Up to a change in basis of C n , we consider the Hermitian matrices such that c j > 0 for j ≥ 3 and c j, j ∈ R for all j ∈ {1, . . ., n}. We only consider hypocoercive matrices C = iC 1 + C 2 . Then, C 1 cannot have a block-diagonal structure of partition size (2, n − 2) as, otherwise, the kernel of C 2 would be invariant under C 1 in contradiction to condition (B3). Hence, we shall assume in the sequel w.l.o.g. that c 2,3 = 0. In order to construct (later on) appropriate transformation matrices P we shall distinguish two cases depending on the rank of the upper right submatrix C ur 1 = (c j,k ) j∈{1,2}, k∈{3,...,n} of C 1 . These cases with appropriate ansatz for the matrix P are summarized in Table 1.
Case 2A: In this case the upper right submatrix C ur 1 ∈ C 2×(n−2) has rank 2. Its hypocoercivity index is 1 which can be inferred from condition (B1): Using Due to rank C ur Hence, the hypocoercivity index of C is 1. Such an example (a linearized BGK equation in 1D) was analyzed in §4.4 of [1] using a transformation matrix with ansatz (2.24).
where the upper right submatrix C ur 1 ∈ C 2×(n−2) has rank 1. Again, we assume w.l.o.g. that c 2,3 = 0. The right choice for the unitary matrix U depends on the structure of C 1 : Table 1: We give a classification of Hermitian matrices C 1 , such that the associated matrix C = iC 1 + diag(0, 0, c 2 , . . . , c n ) is hypocoercive. The restrictions on the coefficients of C 1 are depicted as 0 if zero, • if non-zero, and * if there is no restriction. Furthermore, we give the corresponding two-parameter ansatz for the transformation matrix P = I + A. The guideline to construct an admissible Hermitian perturbation matrix A, is to put the parameters λ j at the positions of the (non-zero) coupling elements of C 1 . In case (2B2) this will be apparent after a suitable transformation, see the proof of Theorem 2.15.
only for n ≥ 4. Here, the two connections in the relevant (upper left) 4 × 4-subspace can be symbolized as Case 2B: In this case the upper right submatrix C ur 1 ∈ C 2×(n−2) has rank 1. Then rank √ C 2 , C 1 √ C 2 = n − 1. Hence, the hypocoercivity index of C is 2 since it is bounded from above by dim(ker C 2 ) = 2, see Remark 2.7(b).
This finishes the complete classification of the situation when dim(ker C 2 ) = 2. Our ansatz for matrix P depends on the structure of matrix C 1 . Therefore we distinguish between the subcases (2B1) and (2B2), see also Table 1. We shall prove that these ansatzes will allow for a matrix inequality of the form (2.2) and hence for an explicit exponential decay (2.4) in the ODE (2.7). As in Theorem 2.12 we shall construct P as a perturbation of I. To verify, then, a matrix inequality of the form (2.2) we shall use the following perturbation result on multiple eigenvalues: Lemma 2.14 (Theorem II.2.3 in [19]). Let C 1 and C 2 be Hermitian matrices with C 2 ≥ 0 and dim(ker C 2 ) = k ∈ N 0 , such that the associated matrix C = iC 1 + C 2 is hypocoercive. Let {v j ; j = 1, . . . , k} be an orthonormal basis of the kernel ker C 2 and let A be a Hermitian matrix (which makes P(r) := I + rA a positive definite Hermitian matrix for sufficiently small r ≥ 0). Then, for sufficiently small r > 0, the k lowest eigenvalues µ j (r) of the Hermitian matrix C * P(r) + P(r)C satisfy We will use this result to construct perturbation matrices A and to check the admissibility of the various ansatzes for the transformation matrices P -mostly in the case dim(ker C 2 ) = 2. The two matrices in Lemma 2.14 are related via and C 2 has a k-fold 0-eigenvalue by assumption. Now, if A is chosen such that all eigenvalues ξ j , j = 1, ..., k in (2.32) are positive, then we deduce the positive definiteness of C * P(r)+ P(r)C for sufficiently small r > 0. We remark that the positivity of ξ 1 , ..., ξ k is first of all a sufficient condition for the positive definiteness of C * P(r) + P(r)C (for sufficiently small r > 0). But one sees easily from (2.33) that it is also necessary.
Theorem 2.15. Let C 1 and C 2 be Hermitian matrices with C 2 ≥ 0 and dim(ker C 2 ) = 2, such that the associated matrix C = iC 1 + C 2 is hypocoercive. Then there exists a two-parameter ansatz for a positive definite matrix P = P(λ 1 , λ 2 ), according to Table 1, such that C * P + PC is positive definite (for an appropriate choice of λ 1 , λ 2 ).
Proof. First, one easily checks that all matrices P from Table 1 are positive definite if |λ 1 | 2 + |λ 2 | 2 < 1. Thus, P(r) := I + rA with A := P − I yields for r ∈ [0, 1] a family of positive definite Hermitian matrices P(r). Now, up to a change of basis in C n , we assume without loss of generality that C 2 is a diagonal matrix of the form C 2 = diag(0, 0, c 3 , . . . , c n ) with c j > 0. Then, ker C 2 = span{e 1 , e 2 } and we choose R = (e 1 , e 2 ) ∈ R n×2 . According to Lemma 2.14, the positive definiteness of C * P + PC (for sufficiently small r > 0) can be inferred from the positive definiteness of R * (C * A + AC)R.
Next we deal with each case of C 1 and its corresponding ansatz P = I+A (as listed in Table 1) separately: we need to prove that λ 1 and λ 2 can be chosen such that R * (C * A + AC)R is indeed positive definite.
to be positive definite, all three of its minors have to be positive for appropriately chosen λ 1 and λ 2 . We set for some positive numbers ℓ 1 and ℓ 2 . Then, the minors of first order satisfy The minor of second order reads (using (2.35)) Then, the minor of second order is positive for the choice ℓ 1 = ε|c 1,3 c 2,3 | and ℓ 2 = ε|c 1,4 c 2,4 | with any ε > 0, due to our assumption (2.34). Finally, for sufficiently small ε > 0 the Hermitian matrix P is positive definite.
(2B) First, we verify that the ansatz for P in (2.25) is admissible in case (2B1).
Then, the connections in the relevant (upper left) 3 × 3-subspace can be symbolized as •−→•−→• ; see (2.26). To prove that the ansatz for P in (2.25) with U = I is admissible, we use Lemma 2.14 and we need to check the positive definiteness of for appropriately chosen λ 1 and λ 2 . The minors of first order are They are positive if and only if Due to our assumptions c 1,2 = 0 and c 2,3 = 0, we can choose λ 1 and λ 2 such that this condition is satisfied. The minor of second order reads where the first summand is positive due to (2.37). First we choose λ 1 and λ 2 such that the minors of first order are positive. Then we consider rλ 1 for r ∈ (0, 1) instead of λ 1 , and we choose r ∈ (0, 1) sufficiently small such that the second minor becomes positive, and hence (2.36) is positive definite.
and recall the hypocoercivity condition (2.28). The guideline to construct a simple ansatz for P at the beginning of this section would suggest to connect each non-decaying mode to the same decaying mode. However, for some examples in subcase (2B2) this ansatz is not admissible. Therefore this guideline is not universally true.
Due to Lemma 2.14, we need to check the positive definiteness of R * (C * UAU * + UAU * C) R for appropriately chosen λ 1 and λ 2 . Using R = UR, we deduce , the positive definiteness of iR * − (U * C * 1 U)A + A(U * C 1 U) R for suitable λ 1 and λ 2 follows as in case (2B1).
For dim(ker C 2 ) = 1 or 2, we just listed all possible cases. But for dim(ker C 2 ) = 3 we will next only consider the one situation relevant below for the linearized BGK equation in 1D, i.e. (3.6), (3.7).

Hypocoercive matrix with dim(ker
If the three zeros in the diagonal of C 2 are connected (via C 1 ) only consecutively to a positive entry in the diagonal of C 2 , the relevant 4 × 4-subspace can be symbolized as •−→•−→•−→•. Proceeding as in §2.3.2 one easily checks that rank C 2 , C 1 C 2 , C 2 1 C 2 = n − 1 , rank C 2 , C 1 C 2 , C 2 1 C 2 , C 3 1 C 2 = n , and hence the hypocoercivity index of C is 3.
With C 1 of the form a natural ansatz for a simple transformation matrix is given by with some λ 1 , λ 2 , λ 3 ∈ C. Indeed, this ansatz always yields a useful Lyapunov functional and hence a quantitative exponential decay rate, as we shall now show under the simplifying restriction c 1,1 = c 2,2 = c 3,3 (which is the relevant situation in §3): 0, 0, c 4 , ..., c n ) with c j > 0, and C 1 be a Hermitian matrices of form (2.38) and satisfying c 1,1 = c 2,2 = c 3,3 . Then there exists a three-parameter ansatz for a positive definite matrix P = P(λ 1 , λ 2 , λ 3 ) of form (2.39), such that C * P + PC is positive definite (for an appropriate choice of λ 1 , λ 2 , λ 3 ).
We have ker C 2 = span{e 1 , e 2 , e 3 } and R = (e 1 , e 2 , e 3 ) ∈ R n×3 . According to Lemma 2.14, the positive definiteness of C * P + PC (for sufficiently small r > 0) can be inferred from the positive definiteness of R * (C * A + AC)R.
As in the proof of Theorem 2.15 we search for conditions on λ j ( j = 1, 2, 3) such that the eigenvalues of To satisfy the former conditions it is convenient to choose Now the parameters λ j ( j = 1, 2, 3) can be chosen in analogy to the proof of Theorem 2.15, case (2B) to satisfy the conditions (2.41). Once λ 1 , λ 2 , and arg(λ 3 ) are fixed, we can choose |λ 3 | large enough to also satisfy the positivity of (2.43).
This analysis to construct appropriate matrices P could, of course, also be extended to higher dimensions of ker C 2 , but this gets more cumbersome. In §4 and §5 we have dim(ker C 2 ) = 4 and 5, respectively.

Linearized BGK equation in 1D
In this section we shall analyze the large time behavior of the linearized BGK equation (1.9) in 1D, To prepare for the proof of Theorem 1.1 we shall use an expansion in v-modes, as in [1]. Using the probabilists' Hermite polynomials, we define the normalized Hermite functions corresponding to T = 1: The first three normalized Hermite functions g m (v) are With this notation, (3.1) reads We start with the x-Fourier series of h: Each spatial mode h k (v,t) is decoupled and evolves according to Here, σ k , µ k and τ k denote the spatial modes of the v-moments σ , µ and τ defined in (1.7); hence Next we expand h k (·,t) ∈ L 2 (R; M −1 1 ) in the orthonormal basis {g m (v)} m∈N 0 : For each k ∈ Z, the "infinite vector"ĥ k (t) = (ĥ k,0 (t),ĥ k,1 (t), ...) ⊤ ∈ ℓ 2 (N 0 ) contains all Hermite coefficients of h k (·,t). In particular we havê Hence, (3.5) can be written equivalently as Thus, the vector of its Hermite coefficients satisfies where the operators L 1 , L 2 are represented by "infinite matrices" on ℓ 2 (N 0 ): Remark 3.1. The bi-diagonal form of L 1 is a direct expression of the two-term recursion relation (3.4). This is not special to the Hermite polynomials; a similar expression holds for the orthogonal polynomials with respect to any even reference measure.
Equation (3.6) provides a decomposition of the generator in its skew-symmetric part −ik 2π L L 1 and its symmetric part −L 2 , the latter introducing the decay in the evolution.
We remark that (3.6) simplifies for the spatial mode h 0 with k = 0. One easily verifies that, for all d, the flow of (1.9) preserves (1.7), i.e. σ 0 (t) = 0, µ 0 (t) = 0, τ 0 (t) = 0 for all t ≥ 0. Hence, (3.5) yields For k = 0, we note that the linearized BGK equation is very similar to the equation specified in [1, §4.4]: The only difference is that L 2 now has one more zero -at the second entry on the diagonal, which corresponds to the conservation of momentum. For k = 0, (3.6) has the structure of the example in §2.3.3, and thus hypocoercivity index 3. This has a simple interpretation: The mass-conservation mode is coupled to the momentum-conservation mode, which is coupled to the energy-conservation mode. Finally, the latter is coupled to the decreasing mode that corresponds to g 3 (v). The hypocoercivity index of (3.6) can also be obtained directly from Definition 2.5, in its equivalent formulation (B1') that also applies to "infinite matrices": With corank L 2 = 3, ker L 2 = span{e 0 , e 1 , e 2 }, and the relations we again find τ = 3.
We define the matrices C k := ik 2π L L 1 + L 2 , k ∈ Z which determine the evolution of the spatial modes of the BGK equation in 1D, cf. (3.6). Our next goal is to establish a spectral gap of C k , uniformly in k = 0. This will prove Theorem 1.1 in 1D. Clearly, this matrix corresponds to C = iC 1 + C 2 in §2.3. There, the construction of the transformation matrix P(r) = I + rA was based on Lemma 2.14, and hence on proving the positive definiteness of To compensate for k, it is natural to choose the perturbation matrix A proportional to 1 k . Following §2.3.3 we hence use the ansatz (2.39) for the k-dependent transformation matrices P k : For parameters λ j ; j = 1, 2, 3 to be chosen below, we define P k , k = 0 to be the infinite matrix that has     as its upper-left 4 × 4 block, with all other entries being those of the identity. Under the assumption |λ 1 | 2 + |λ 2 | 2 + |λ 3 | 2 < 1, the matrix P k will be positive definite for all k = 0. Recalling that L 1 is an (infinite) real matrix as well as the parameter choice in (2.42), it is natural to choose also here arg(λ j ) = − π 2 . Hence (3.9) turns into  with α := |λ 1 |, β := |λ 2 |, γ := |λ 3 |. Now, (the infinite dimensional analog of) Theorem 2.16 asserts that the above ansatz will yield an admissible transformation matrix P and hence an exponential decay rate for (3.6), uniformly in k. But, as a perturbation result, it neither provides an explicit value for the decay rate µ, nor does it yield a rather natural ratio between the parameters λ j . These two aspects will be our next task.
Next we search for conditions on α, β , γ > 0 such that the eigenvalues ξ j of are positive. If all minors are positive, then the matrix will be positive definite (by Sylvester's criterion). We deduce the conditions which are special cases of (2.41), (2.43). In fact, the matrix L We note that the special choice β := √ 2α and γ := √ 3α makes all eigenvalues of R * (C * k A + AC k )R equal, which seems to be beneficial to obtain eventually a good decay estimate. Moreover, it will simplify the proof of Lemma 3.3.
In the following lemma we establish an infinite dimensional analog of Lemma 2.2 -for (3.6), the transformed linearized BGK equation in 1D. However, here we shall not aim at obtaining the optimal decay constant µ in the matrix inequality (2.2). Still, µ will be independent of the modal index k ∈ Z, thus providing exponential decay of the full solution. . If the matrices P k are chosen with some α ∈ (0, α (3) ), β = √ 2α, and γ = √ 3α uniformly for all |k| ∈ N, then P k from (3.10) and C * k P k + P k C k are positive definite for all k ∈ Z \ {0}. Moreover, C * k P k + P k C k ≥ 2µP k uniformly in |k| ∈ N , (3.11) with µ := δ 3 (1, α) . The proof of this lemma is deferred to Appendix A.
(3.12) (b) In the limit L → ∞, the matrix C * k P k + P k C k has zero eigenvalues, which is apparent from its upper left submatrix D k,α, √ 2α, √ 3α defined in (A.1). Accordingly, α (3) → 0 with α (3) = O( 1 L ) and µ * = O( 1 L 2 ) in the limit L → ∞. It is no surprise that the exponential decay rate vanishes in this limit, as the limiting whole space problem only exhibits algebraic decay (cf. [9] for the large-time analysis of (1.3) on R d ).
To appreciate the above decay estimate, let us compare it to the spectral gap obtained in numerical tests for L = 2π. In this case the estimate from Remark 3.4 yields the analytic bound with µ * = 0.041812.... As a comparison we computed the spectrum of finite dimensional approximation matrices to L 2 + ikL 1 up to the matrix size n = 500. Apparently the spectral gap is determined by the lowest spatial modes k = ±1. With increasing n it grows monotonically to µ num = 0.558296.... So, our estimate is off by a factor of about 13. Following the strategy from §4.3 in [1], i.e. by maximizing µ in the matrix inequality C * k P k + P k C − 2µP k ≥ 0, the above estimate on the decay rate could be improved further. But we shall not pursue this strategy here again.
Let us briefly compare this gap to the situation in the two 1D BGK models analyzed in §4.3 and §4.4 of [1]. They only differ from the 1D model (3.6)-(3.7) of this section, concerning the matrix L 2 : there we had L 2 = diag(0, 1, ...) and L 2 = diag(0, 1, 0, 1, ...), resp. We recall from §2.2 that both models have hypocoercivity index 1, and their spectral gaps are 0.6974... and 0.3709717660..., resp. One might expect that removing 1 entries from L 2 and hence increasing the hypocoercivity index would decrease the spectral gap. But this is obviously not always the case.

Linearized BGK equation in 2D
Next we shall analyze the linearized BGK equation (1.9) in 2D: Again we consider the x-Fourier series of h: each spatial mode h k (v,t) is decoupled and evolves as Here, σ k , µ k and τ k denote the spatial modes of the v-moments σ , µ and τ defined in (1.7); hence Next we shall introduce an orthonormal basis in v-direction, to represent the spatial modes h k (·,t) ∈ L 2 (R 2 ; M −1 1 ), k ∈ Z 2 . As in 1D we shall again use Hermite functions. But their multi-dimensional generalization is not unique, and we shall present two options that seem to be practical: Basis 1 ("pure tensor-basis"): A complete set of orthogonal polynomials in d variables can be formed as products of d such polynomials, each in a single variable. Using the Hermite polynomials H m in 1D, i.e.
we construct Hermite polynomials on R d as with the multi-index m = (m 1 , . . . , m d ) ∈ N d 0 . They are also generated by a simple generalization of formula (3.2): with |m| = ∑ d j=1 m j (see [14], e.g.). For d = 2, we obtain Using definition (3.3) of normalized Hermite functions in 1D, we define the normalized Hermite functions in d dimensions as holds.
In order to give a vector representation of (4.1), the evolution equation of the spatial modes h k (v,t), we first need to introduce a linear ordering of the velocity basis g m (m ∈ N 2 0 ). We shall use a lexicographic order, i.e. first (increasingly) with respect to the total order |m|, and within a set of order |m| we order w.r.t. m 1 (decreasingly) (for d = 2). Thus, we obtain the linearly ordered basis Given a multi-index m ∈ N 2 0 , its lexicographic index is computed as |m|(|m| + 1)/2 + m 2 with |m| = m 1 + m 2 .
Basis 2 ("energy-basis"): The second basis is a simple variant of the first one. We recall that the evolution with the BGK equation (1.1) conserves the (kinetic) energy and mass. Hence, their difference is also conserved and it is related to the polynomial |v| 2 2 − 1. In analogy to the 1D case from §3 it is thus a natural option to construct a basis of orthogonal polynomialsH m (v), m ∈ N d 0 , such that |v| 2 2 − 1 is a basis element. Compared to {H m (v)}, in fact, we only have to modify the Hermite polynomials of second order. For d = 2 they read:H Similarly, we define normalized Hermite functions The elementsg m satisfy a recurrence relation similar to (4.4), except for identities involvingg 2,0 org 0,2 . For example, While the first basis g m (m ∈ N d 0 ) inherits a simple recurrence formula with three elements, the recurrence formulas for the second basisg m (m ∈ N d 0 ) involve four elements, when includingg 2,0 (v) org 0,2 (v).
To derive the vector representation of (4.1), it is preferable to use the first basis with the linear ordering g m (m ∈ N 0 ). With the identity (g 3 (v) + g 5 (v))/ √ 2 = (|v| 2 /2 − 1)M 1 (v) we rewrite (4.1) as First we consider the spatial mode h 0 with k = 0. With the same argument as in 1D we again obtain (3.8), For each spatial mode k ∈ Z 2 , the "infinite vector"ĥ k (t) = (ĥ k,0 (t),ĥ k,1 (t), ...) ⊤ ∈ ℓ 2 (N 0 ) contains all 2D-Hermite coefficients of h k (·,t). In particular we havê Thus, we can rewrite (4.6) as Our next goal is to rewrite this system in the Hermite function basis as an infinite vector system -in analogy to (3.6) in 1D. In that equation, the operator L 1 is multiplied by the (scalar and integer) mode number k, which is then used in the construction of the transformation matrices P k . To extend this structure and strategy to 2D, we first have to consider the rotational symmetry of (4.7): We note that the basis functions g 0 and g 3 + g 5 depend only on |v|, and that the interplay between the vectors k and v only occurs via k · v. Hence, evolution equations from the family (4.7) having the same modulus |k| are identical in the following sense: Rotating the spatial mode vector k and the v-coordinate system by the same angle, leaves (4.7) invariant. Thus it suffices to consider (4.7) for vectors k = (κ, 0) ⊤ with the discrete moduli We skipped here the mode h 0 , as it was already analyzed before. In the sequel we also denote h κ := h κ,0 and h κ :=ĥ κ,0 . With this notation, (4.7) reads Then, the vector of its Hermite coefficients satisfies where the operators L 1 , L 2 are represented by symmetric "infinite matrices" on ℓ 2 (N 0 ): To compute the hypocoercivity index of the BGK model in 2D, it is preferable to use the second basis with the linear orderingg m (m ∈ N 0 ). We shall give the matrix representation of the two dimensional BGK equation (4.1) in the second velocity basis, again only for the spatial modes k = (κ, 0) ⊤ , κ ∈ K. To obtain the corresponding matricesL 1 andL 2 , we simply represent the linear basis transformation by the infinite matrix which is self-inverse, i.e. S = S −1 . Thus we computeL 1 = S −1 L 1 S andL 2 = S −1 L 2 S, yielding This second basis representation makes it easy of determine the hypocoercivity index of the BGK model in 2D. As for the 1D model, we use Definition 2.5 in its equivalent formulation (B1'): With corankL 2 = 4, kerL 2 = span{e 0 , e 1 , e 2 , e 3 }, and the relations L 1 e 0 = e 1 ,L 1 e 1 = e 0 + e 3 + e 5 ,L 1 e 2 = e 4 ,L 1 e 3 = e 1 + 3/2 e 6 + 1/2 e 8 , (4.9) we find the index τ = 2. At first glance this may come as a surprise, since the analogous 1D model has index 3. But in 2D, each of the two momentum-conservation modes (represented by e 1 and e 2 ) is directly coupled to a decreasing mode (represented by e 5 and e 4 , respectively). These modes are quadratic polynomials, but orthogonal to |v| 2 , where the latter corresponds to the (conserved) kinetic energy, cf. (4.5).
We define the matrices C κ := i 2π L κL 1 +L 2 , κ ∈ K ∪ {0} which determine the evolution of the spatial modes of the BGK equation in 2D, cf. (4.8). Our next goal is to establish a spectral gap of C κ , uniformly in κ ∈ K. This will prove Theorem 1.1 in 2D. To this end we make an ansatz for the transformation matrices: Following the detailed motivation from the 1D analog in §3, let P κ , κ ∈ K be the identity matrix whose upper-left 7 × 7 block is replaced by  with positive parameters α, β , γ, and ω to be chosen below. The distribution of the non-zero off-diagonal elements follows from the pattern in matrixL 1 , with the following rationale: The α-term couples the e 0mode to the e 1 -mode, which is the only choice according to (4.9). The β -term couples the e 1 -mode to the decaying e 5 -mode, and the γ-term couples the e 2 -mode to the decaying e 4 -mode. Finally, the ω-term couples the e 3 -mode to the e 6 -mode, the first decaying mode according to (4.9).
Proof of Theorem 1.1 in 2D. We consider a solution h of (1.9), and the entropy functional E (f ) defined by Here the matrices P 0 = I and P κ defined in (4.10) for κ = |k| = 0 are regarded as bounded operators on L 2 (M −1 1 ). Then d dt where 1 is the decay rate of h 0 , cf. (3.8). This implies (1.11) with λ 2 (L) := 2 min{1, µ} and µ from (4.12). The constants c 2 and C 2 in the estimate (1.10) follow from (A.7): This finishes the proof of Theorem 1.1 in 2D.

Linearized BGK equation in 3D
Next we shall analyze the linearized BGK equation (1.9) in 3D: Again we consider the x-Fourier series of h: Each spatial mode h k (v,t) is decoupled and evolves as Here, σ k , µ k and τ k denote the spatial modes of the v-moments σ , µ and τ defined in (1.7); hence Next we introduce an orthonormal basis in v-direction, to represent the spatial modes h k (·,t) ∈ L 2 (R 3 ; M −1 1 ), k ∈ Z 3 . Again we will use Hermite functions. As in 2D, their multi-dimensional generalization is not unique, and we present two options which seem to be practical: Basis 1 ("pure tensor-basis"): This is a straightforward generalization of the 2D case. Using (4.2) and the normalized 1D-Hermite functions g n (n ∈ N 0 ), we define the normalized Hermite functions in 3D as in (4.3), Then, g m (m ∈ N 3 0 ) form an orthonormal basis of L 2 (R 3 ; M −1 1 ) and inherit a simple recurrence relation (4.4). As in 2D, we shall use a lexicographic order, i.e. we order {g m } first (increasingly) with respect to the total order |m|, and within a set of order |m|, we order first (decreasingly) with respect to m 1 , and then m 2 . Thus, we obtain the linearly ordered basis Basis 2 ("energy-basis"): In analogy to the 2D case from §4, it is natural to construct a basis of orthogonal polynomialsH m (v) (m ∈ N 3 0 ) that involves the kinetic energy polynomial |v| 2 /2 (minus a multiple of the mass); in 3D the relevant term is (|v| 2 − 3)/2. Again, we only have to modify the Hermite polynomials of second order:H Similarly, we define normalized Hermite functions We remark that it is most convenient to obtaing 0,2,0 andg 0,0,2 from diagonalizing the matrix L 2 (see (5.5) below). The elementsg m satisfy a recurrence relation similar to (4.4); except for identities involvingg 2,0,0 , g 0,2,0 org 0,0,2 . For example, Whereas the first basis g m (v) (m ∈ N 3 0 ) inherits a simple recurrence formula with three elements; for the second basisg m (v) (m ∈ N 3 0 ) the recurrence formula forg 2,0,0 (v) relates five elements.
To derive the vector representation of (5.1), it is preferable to use the first basis with the linear ordering for t ≥ 0. As in 1D, the spatially homogeneous mode again satisfies d For each spatial mode k ∈ Z 3 , the "infinite vector"ĥ k (t) = (ĥ k,0 (t),ĥ k,1 (t), ...) ⊤ ∈ ℓ 2 (N 0 ) contains all Hermite coefficients of h k (·,t). In particular, we havê h k,0 = σ k , ĥ k,1 ,ĥ k,2 ,ĥ k,3 ⊤ = µ k ∈ R 3 , 1 √ 2 (ĥ k,4 +ĥ k,7 +ĥ k,9 ) = Thus, we can rewrite (5.2) as for k ∈ Z 3 , t ≥ 0. Since (5.3) is rotationally invariant (as in 2D), it suffices to consider (5.3) for vectors k = (κ, 0) ⊤ with the discrete moduli With the notation h κ := h κ,0 andĥ κ :=ĥ κ,0 , (5.3) reads Then, the vector of its Hermite coefficients satisfies where the operators L 1 , L 2 are represented by "infinite matrices" on ℓ 2 (N 0 ): To determine the hypocoercivity index of the BGK model in 3D, it is preferable to use the second basis with the linear orderingg m (v), m ∈ N 0 . Again, we shall give the matrix representation of the BGK equation (5.1) in the second velocity basis only for the spatial modes k = (κ, 0) ⊤ , κ ∈ K. To obtain the corresponding matricesL 1 andL 2 , we simply represent the linear basis transformation by the infinite matrix which is self-inverse, i.e. S = S −1 . Thus we computeL 1 = S −1 L 1 S andL 2 = S −1 L 2 S, which yields To determine the hypocoercivity index of the BGK model in 3D, we use Definition 2. We define the matrices C κ := i 2π L κL 1 +L 2 , κ ∈ K ∪ {0}, which determine the evolution of the spatial modes of the BGK equation in 3D, cf. (5.4). Our next goal is to establish a spectral gap of C κ , uniformly in κ ∈ K. This will prove Theorem 1.1 in 3D. To this end we make an ansatz for the transformation matrices: Following the detailed motivation from the 1D analog in §3, let P κ , κ ∈ K be the identity matrix whose upper-left 11 × 11 block is replaced by  with positive parameters α, β , γ, ω, and η to be chosen below. The distribution of the non-zero off-diagonal elements follows from the pattern in matrixL 1 .
Proof of Theorem 1.1 in 3D. We consider a solution h of (1.9), and the entropy functional E (f ) defined by Here the matrices P 0 = I and P κ defined in (5.6) for κ = |k| = 0 are regarded as bounded operators on L 2 (M −1 1 ). Then d dt This finishes the proof of Theorem 1.1 in 3D.

Local exponential stability for the BGK equation in 3D
This analysis is an extension of §4.5 in [1]. To make is self-contained we give the complete proof and not only the modification of the key steps. For γ ≥ 0, let H γ (T 3 ) be the Sobolev space consisting of the completion of smooth functions ϕ onT 3 in the Hilbertian norm ϕ 2 , where the inner product in H γ is given by where dx denotes the normalized Lebesgue measure onT 3 . Then H 0 is simply the weighted space Proof of Theorem 1.3. (a) For any solution h(t) to (1.9) with E d (h I + M 1 ) < ∞, normalized according to (1.7), we consider the functionf (t) := h(t) + f ∞ with f ∞ = M 1 . We define a family of entropy functionals E γ (γ ≥ 0) by as an extension of the entropy E (f ) in (5.9). For all γ ≥ 0, the estimates Using these estimates it is a simple matter to control the approximation in (1. , (6.7) so that the gain term in the linearized BGK equation (1.9) is ∂ s F(0, x, v). In this notation, To display the complicated expression for ∂ 2 s F(s, x, v), we define ρ s := 1 + sσ , u s := s ρ s µ , µ s := sµ , P s : where ∂ s P s := 1 3 τ − 2 s|µ| 2 sσ +1 + s 2 |µ| 2 σ (sσ +1) 2 . One can now verify that ∂ 2 s F(s, x, v) is of the order O(σ 2 + |µ| 2 + τ 2 ), which will be related to O(( f − f ∞ ) 2 ) due to the estimates (6.3)-(6.5).
Simple but cumbersome calculations now show that if γ > 3/2 and f − f ∞ H γ is sufficiently small, then there exists a finite constantC γ depending only on γ such that for all s ∈ [0, 1], and hence [The calculations are simplest for non-negative integer γ, in which case the Sobolev norms can be calculated by differentiation. For γ > 3/2 and sufficiently small f − f ∞ H γ , the estimates (6.6) ensure for all s ∈ [0, 1] the boundedness of 0 < ε < 1 + sσ ∞ , 1 + 1 3 (sτ − s 2 |µ| 2 1+sσ ) ∞ < ∞ (i.e. the denominators in (6.7)) for some fixed ε > 0. They also ensure the L 2 (R 3 ; In (1.1), higher powers of f − f ∞ H γ (arising due to derivatives of σ , µ and τ) can be absorbed into the constant of the quadratic term.] Now define the linearized BGK operator where σ , µ and τ are determined by h, and hence f . For all γ ≥ 0, Q 2 is self-adjoint on H γ . Then the nonlinear BGK equation (1.1) becomes which differs from the linearized BGK equation (1.9) only by the additional term R f . It is now a simple matter to prove local exponential stability. We shall use here exactly the entropy functional E γ ( f ) defined in (6.1) with f = M 1 + h. Now assume that h solves (6.9). To compute d dt E γ ( f ) we use an inequality like (5.10) for the drift term and for Q 2 h in (6.9), as well as P |k| ≤ (1 + 2α * ) and (6.8) for the term R f . This yields (if h H γ is small enough) where we have used the fact that h = f − f ∞ . Due to (6.2), it is now simple to complete the proof of Theorem 1.1(b) for L = 2π: In this case, the best decay rate µ = 0, 0001774540949 is attained for α * = 0, 1644256115 (cf. Remark 5.2(b)). Estimate (6.10) shows that there is a δ γ > 0 so that if the initial data Here we used that the linear decay rate in (6.10) is slightly better than 1 2820 , to compensate the nonlinear term.

A Appendix: Deferred proofs
Proof of Lemma 3.3. We compute that C * k P k + P k C k is twice the identity matrix whose upper left 5 × 5 block is replaced by where ℓ := 2π L . We seek to choose α, β and γ to make the matrices P k and D k,α,β ,γ positive definite for all k ∈ Z \ {0}. Under the assumption |α| 2 + |β | 2 + |γ| 2 < 1, the matrix P k will be positive definite for all k = 0.
The following lemma will be needed in the proofs of Lemma 4.1 and Lemma 5.1.
The seventeenth to twentieth minors are multiples of the sixteenth minor. Therefore, these minors are positive for all κ ∈ K under the same condition 0 < α < α δ 16 .