Multiscale mixed finite elements

In this work, we propose a mixed finite element method for solving elliptic multiscale problems based on a localized orthogonal decomposition (LOD) of Raviart-Thomas finite element spaces. It requires to solve local problems in small patches around the elements of a coarse grid. These computations can be perfectly parallelized and are cheap to perform. Using the results of these patch problems, we construct a low dimensional multiscale mixed finite element space with very high approximation properties. This space can be used for solving the original saddle point problem in an efficient way. We prove convergence of our approach, independent of structural assumptions or scale separation. Finally, we demonstrate the applicability of our method by presenting a variety of numerical experiments, including a comparison with an MsFEM approach.


1.
Introduction. In this work we study the mixed formulation of Poisson's equation with a multiscale diffusion coefficient, i.e. where the diffusion coefficient is highly varying on a continuum of different scales. For such coefficients, the solution is typically also highly varying and standard Galerkin methods fail to converge to the correct solution, unless the features on the finest scale are resolved by the underlying computational mesh. A classical application is the flow in a porous medium, modeled by Darcy's law. In this case, the multiscale coefficient describes a permeability field, which is heterogeneous, rapidly varying and has high contrast. Classical discretizations that involve the full fine scale often lead to a vast number of degrees of freedom, which limits the performance and feasibility of corresponding computations. In this paper, we address this kind of problems in the context of mixed finite elements.
We will interpret the mixed formulation of Poisson's equation in a Darcy flow setting, referring to the vector component as flux, and the scalar component as pressure. In Darcy flow applications the flux solution is of particular interest since it tells us how a fluid is transported through the medium. It is desirable and common to use flux conservative discretization schemes. The proposed method is based on the Raviart-Thomas finite element [31] which is locally flux conservative. Concerning the mixed formulation of Poisson's equation, corresponding multiscale methods were for instance proposed in [1,5,6,9]. These methods are based on the Raviart-Thomas finite element and fit into the framework of the Multiscale Finite Element Method (MsFEM, cf. [19]). Another family of multiscale methods is derived from the framework of the Variational Multiscale Method (VMS) [20,21,22,24,29]. Multiscale methods for mixed finite elements based on VMS are proposed and studied in [4,25,28]. Inspired by the results presented in [28], a new multiscale framework arose [26]. We refer to this framework as Localized Orthogonal Decomposition (LOD). It is based on the idea that a finite element space is decomposed into a low dimensional space that incorporates multiscale features and a high dimensional remainder space which is given as the kernel of an interpolation or quasi-interpolation operator. The multiscale space can be used for Galerkinapproximations and allows for cheap computations. Various realizations have been proposed so far. For corresponding formulations and rigorous convergence results for elliptic multiscale problems, we refer to [2, 14,15,18,26] for Galerkin finite element methods, to [13,14] for discontinuous Galerkin methods and to [17] for Galerkin Partition of Unity methods. Among the various applications we refer to the realizations for eigenvalue problems [27], for semilinear equations [16], for the wave equation [3] and for the Helmholtz equation [30].
In this paper we introduce a two level discretization of the mixed problem, that is we work with two meshes: A fine mesh (mesh size h) which resolves all the fine scale features in the solution and a coarse mesh (mesh size H) which is of computationally feasible size. This gives us a fine and a coarse Raviart-Thomas function space for the flux. We denote them respectively by V h (high dimensional) and V H (low dimensional). The kernel of the (standard) nodal Raviart-Thomas interpolation operator Π H onto V H is the detail space V f h . This space can be interpreted as all fine scale features that cannot be captured in the coarse space V H . A low dimensional ideal multiscale space is constructed as the orthogonal complement to the divergence free fluxes in V f h . We prove that this space has good approximation properties in the sense that the energy norm of the error converges with H without pre-asymptotic effects due to the multiscale features. However, the basis functions of the ideal multiscale space have global support and are expensive to compute. We show exponential decay of these basis functions allowing them to be truncated to localized patches with a preserved order of accuracy for the convergence. The resulting space is called the localized multiscale space. The problems that are associated with the localized basis functions have a small number of degrees of freedom and can be solved in parallel with reduced computational cost and memory requirement. Once computed, the low dimensional localized multiscale space can be reused in a nonlinear or time iterative scheme.
We prove inf-sup stability and a priori error estimates (of linear order in H) for both the ideal and the localized method. The local L 2 -instability of the nodal Raviart-Thomas interpolation operator leads to instabilities as h decreases for the localized method. We show that these instabilities can be compensated by increasing the patch size or using Clément-type interpolators instead. In the numerical examples we verify that the localized method has the theoretically derived order of accuracy. We confirm our theoretical findings by performing experiments on the unit square and an L-shaped domain, as well as using a diffusion coefficient with high contrast noise and channel structures. The proposed method is also compared numerically with results from an MsFEM-based approach using a permeability field from the SPE10 benchmark problem.
2. Preliminaries. We consider a bounded Lipschitz domain Ω ⊂ R d (dimension d = 2 or 3) with a piecewise polygonal boundary ∂Ω and let n denote the outgoing normal vector of ∂Ω. For any subdomain ω ⊆ Ω, we shall use standard notation for Lebesgue and Sobolev spaces, i.e. for r ∈ [1, ∞], L r (ω) consists of measurable functions with bounded L r -norm and the space H 1 (ω) consists of L 2 -bounded weakly differentiable functions with L 2 -bounded partial derivatives. The full norm on H 1 (ω) shall be denoted by · H 1 (ω) , whereas the semi-norm is denoted by For scalar functions p and q we denote by (p, q) ω := ω p q the L 2 -scalar product on ω. When ω = Ω, we omit the subscript, i.e. (p, q) := (p, q) Ω . For d-dimensional vector valued functions u and v, we define (u, v) Observe that we use the same notation for norms and scalar products in L 2 without distinguishing between scalar and vector valued functions. This is purely for simplicity, since the appropriate definition is always clear from the context. We use, however, bold face letters for vector valued quantities.
2.1. Continuous problem. With these definitions we are ready to state the continuous problem, which is Poisson's equation in mixed form with Neumann boundary conditions on the full boundary.
Assumption A (Assumptions on coefficients, data and domain).
d×d is a diffusion coefficient, possibly with rapid fine scale variations. Its value is an almost everywhere symmetric matrix and bounded in the sense that there exist real numbers α and β such that for almost every x and any v ∈ R d \ {0} (A2) f ∈ L 2 (Ω) is a source function that fulfills the compatibility condition Ω f = 0.
(A3) The domain Ω is a bounded Lipschitz domain with polygonal (or polyhedral) boundary.
We introduce the following bilinear forms and norms. Let and, further, v V := v H(div,Ω) and q Q := q L 2 (Ω) .
The energy norm is defined as the following weighted flux L 2 -norm, The energy norm can be subscripted with a subdomain ω ⊆ Ω, for example |||·||| 2 ω , to indicate that the integral is taken only over that subdomain.
The following lemma gives the conditions for existence and uniqueness of a solution to the mixed formulation in (1) for subspaces V ⊆ V and Q ⊆ Q. This lemma is useful for establishing existence and uniqueness for all discretizations presented in this paper, since all presented discretizations are conforming.
Lemma 2 (Existence and uniqueness of solution to mixed formulation).
Under Assumptions (A1)-(A3), the conditions for Lemma 2 are satisfied for V = V and Q = Q withα = α,β = β andγ being a constant that depends only on the computational domain. The lemma then yields a unique solution to the continuous problem (1).

2.2.
Discretization with the Raviart-Thomas element. Regarding the discretization, we introduce two conforming families of simplicial (i.e. triangular or tetrahedral) meshes {T h } and {T H } of Ω where h and H are the maximum element diameters. Throughout the paper we refer to T h as the fine mesh and to T H as the coarse mesh. Hence, we indirectly assume h H. We pose the following assumptions on the meshes.
Assumption B (Assumptions on meshes). (B1) The fine mesh T h is the result of one or more conforming (but possibly nonuniform) refinements of the coarse mesh T H such that T h ∩ T H = ∅. (B2) Both meshes T h and T H are shape regular. In particular the positive shape regularity constant ρ for the coarse mesh T H will be referred to below and is defined as The coarse family of meshes {T H } is quasi-uniform, whereas {T h } could be obtained from an arbitrary adaptive refinement.
Remark 3 (Quadrilateral or hexahedral elements). Affine quadrilateral (or hexahedral) elements can also be used. However, the definition of the Raviart-Thomas element presented below in this paper is based on triangular (or tetrahedral) meshes.
We denote by t and T an element of T h or T H , respectively. Similarly e and E denote an edge (for d = 2) or a face (for d = 3) of the elements of T h and T H . Further, n e (respectively n E ) is the outward normal vector of an edge (or face) e (respectively E). We continue this section by discussing finite element discretizations using the two meshes.
We denote all polynomials of degree ≤ k on a subdomain ω by P k (ω) and a ddimensional vector of such polynomials by [P k (ω)] d . We introduce the H 0 (div, Ω)conforming lowest (zeroth) order Raviart-Thomas finite element. For each fine element t ∈ T h and coarse element T ∈ T H , the spaces of Raviart-Thomas shape functions are given by respectively, where x = (x 1 , . . . , x d ) is the space coordinate vector. The Raviart-Thomas finite element spaces on T h and T H are then defined as The degrees of freedom (in the coarse and fine Raviart-Thomas spaces) are given by the averages of the normal fluxes over the edges (respectively faces for d = 3). We denote the degrees of freedom by for the fine and coarse discretization, respectively. The direction of the normal n e (respectively n E ) can be fixed arbitrarily for each edge (respectively face). Here, N e and N E are bounded linear functionals on the space W := H 0 (div, Ω) ∩ L s (Ω), for some s > 2. Note, that the additional regularity (i.e. L s (Ω) for s > 2) is necessary for the edge integrals to be well-defined (cf. [8] Additionally, we let Q H ⊂ Q h ⊂ Q be the space of all piecewise constant functions on T H and T h with zero mean. We denote by P h and P H the L 2 -projections onto Q h and Q H , respectively. Using the fine spaces, we define the fine scale discretization of (1), which will be referred to as the reference problem.
Definition 4 (Reference problem). Find u h ∈ V h and p h ∈ Q h , such that for all v h ∈ V h and q h ∈ Q h .
A similar problem can be stated with the coarse spaces V H and Q H with flux solution u H . The remainder of this section treats only the fine discretization. However, all results hold also for the coarse discretization.
We denote the space of divergence free functions on the fine grid by Remark 5 (Kernel of divergence operator). A natural definition of K h for our To establish existence and uniqueness of a solution to the reference problem, we use that Π h is divergence compatible, i.e. we have the commuting property Using this, the inf-sup stability of b(·, ·) with respect to V h and Q h follows: where w ∈ W is chosen such that ∇ · w = q and w W ≤ C Ω q L 2 (Ω) . This is possible by letting w = ∇φ for a solution φ to ∆φ = q in Ω with homogeneous Neumann boundary conditions. Now, applying Lemma 2 with V = V h , Q = Q h , K = K h , we can derive the constantsα = α,β = β andγ = γ := C −1 W C −1 Ω and establish existence and uniqueness of a solution to the reference problem (2). Note that the inf-sup stability constant γ is independent of h and hence also holds for the pair of spaces V H and Q H .
In the following, we are mainly interested in approximating the flux component u h of the solution. We treat u h as a reliable reference to the exact solution. Note that the L 2 -norm of the divergence error is controlled by the data For the energy norm of the flux error, we have the following error estimate in the energy norm for the lowest order Raviart-Thomas element: where C is independent of h. For a problem with a coefficient A that has fast variations at a scale of size , we have in general that |u| H 1 (Ω) ≈ −1 . Hence, we require h before we can observe the linear convergence in h numerically. We call the regime with h ≥ a pre-asymptotic regime. The goal of this work is the construction of a discrete space which does not suffer from such pre-asymptotic effects triggered by A. In the following, we assume that the fine mesh is fine enough (in the sense that h ) so that |||u − u h ||| is sufficiently small and hence u h a sufficiently accurate reference solution. With the same argument, the accuracy of the coarse solution u H will not be satisfying as long as H > . Note that reference problem (2) never needs to be solved. It just serves as a reference.
In the next section, we will construct the ideal multiscale space of the same (low) dimension as V H , but which yields approximations that are of similar accuracy as the reference solution u h (in particular in the regime H ). Throughout the paper, we do not consider errors that arise from numerical quadrature. For simplicity, we assume that all integrals can be computed exactly.
3. Ideal multiscale problem. In this section, we construct a low dimensional space that can capture the fine scale features of the true multiscale solution. We focus on constructing a good multiscale representation of the flux solution u only. We call it ideal since the reference flux solution is in this space for all f ∈ Q H . This should be contrasted to a localized multiscale space to be introduced in Section 4. In addition to the spaces V h and V H defined above we introduce the following detail space as the intersection of the fine space and the kernel of the coarse Raviart-Thomas interpolation operator, We refer to V f h as the detail space. In this section we aim at constructing a modified splitting, where V H is replaced by a multiscale space which incorporates fine scale features.
3.1. Ideal multiscale space. We will construct the ideal multiscale space by applying fine scale correctors to all coarse functions in V H , i.e. so that (Id − G h )(V H ) is the desired multiscale space for a linear corrector operator G h . The corrector operator is constructed using information from the coefficient A, and has divergence free range in order to keep the flux conservation property of the coarse space.
The definition of the corrector requires us to construct the splitting Next, we introduce an ideal corrector operator. We distinguish between local (element-wise) correctors and a global corrector.
for all v f ∈ K f h . Furthermore, we define the ideal global corrector operator by summing the local contributions, i.e. G h := T ∈T H G T h . The ideal corrector operators are well-defined since equation (5) is guaranteed a unique solution by the Lax-Milgram theorem due to the coercivity and boundedness of a(·, ·) on K f h . Using the ideal global corrector operator, we can define the discrete multiscale function space by where Id is the identity operator. This space has the same dimension as V H . Furthermore, it allows for the splitting

FREDRIK HELLMAN, PATRICK HENNING AND AXEL MÅLQVIST
3.2. Ideal multiscale problem formulation. In this section, we use the previously defined ideal multiscale space to define a (preliminary) multiscale approximation. The ideal multiscale problem reads as follows.
for all v h ∈ V ms H,h and q H ∈ Q H . Lemma 8 (Unique solution of the ideal multiscale problem). Under Assumptions (A1)-(A3) and (B1)-(B3), the ideal multiscale problem (7) has a unique solution. In particular, we have i.e. inf-sup stability independent of h and H.
for all v ∈ V . Combining these results with the inf-sup stability of b(·, ·) on V H and Q H , we get 3.3. Error estimate for ideal problem. In this section, we show that the flux solution of the ideal multiscale problem above converges in the energy norm with linear order in H to the reference solution. This convergence is independent of the variations of A, i.e. we do not have any pre-asymptotic effects from the multiscale features.  (7), then where C ρ,d and C Π are independent of h and H.

Proof. Parametrizing the solutions u h (f ) and u ms
H,h (f ) by the data f , we use the triangle inequality to obtain . The two last terms will be shown to equal zero.
For the first term, we proceed in several steps. Let us defineũ h : In order to bound the term p h − P Hph L 2 (Ω) , we let φ ∈ H 1 0 (Ω) be the weak solution to ∆φ =p h − P Hph . Then we have The existence of such operators is proved in [11]. Exploiting this stability and the fact thatp Combining this estimate with (9) yields For the second term, we have ∇ · u ms H,h (P H f ) ∈ Q H since the correctors are divergence free. This implies ∇ · u ms (2) and (7) in combination with the a(·, ·)-orthogonality between V ms H,h (P H f ), thus the second term equals zero.
To show that the third term is zero, it is sufficient to show that u ms H,h (f − P H f ) = 0. The data f −P H f is L 2 -orthogonal to the test space Q H and it enters the equation (7) only in an L 2 scalar product with test functions. Hence u ms 4. Localized multiscale method. The ideal corrector problems (5) are at least as expensive to solve as the original reference problem. Hence, we require to localize these problems to very small patches, without sacrificing the good approximation properties. If we can achieve this, the corrector problems can be solved with low computational costs and fully in parallel. In this section, we show that this is indeed possible. We prove that we can truncate the computational domain Ω in the local corrector problems (5) to a small environment of a coarse element T . This is possible, since the solutions of (5) decay with exponential rate outside the coarse element T . We obtain a new localized corrector operator which can be used analogously to the ideal corrector operator to construct a localized multiscale space. This localization reduces the computational effort for assembling the multiscale space significantly.
In addition to the assumptions (A1)-(A3) and (B1)-(B3), we require additional assumptions on the computational domain and the mesh. More precisely we assume the following for the analysis.
(A4) We consider d = 2 and a simply-connected domain Ω ⊂ R 2 . (B4) The fine grid T h is quasi-uniform, i.e. the ratio between the maximum and the minimum diameter of a grid element is bounded by a generic constant.
We note that assumption (A4) is crucial for our proof. Assumption (B4) on the other hand could be dropped with a more careful analysis. In this case the estimates (and in particular the decay) will depend on the inverse of the minimum mesh size of the fine grid in a patch U (T ). For simplicity of the presentation, we do not elaborate this case and restrict ourselves to quasi-uniform meshes, i.e. to (B4). Note that even though we fix d = 2, we keep the general notation d to illustrate how the results are influenced by the dimension. The localized method can be formulated analogously for d = 3. In order to localize the detail space K f h , we use admissible patches. We call this restriction to patches localization. For each T ∈ T H we pick a connected patch U (T ) consisting of coarse grid elements and containing T . More precisely, for positive k ∈ N we define k-coarse-layer patches iteratively in the following way. For all T ∈ T H (which are assumed to be closed sets), we define the element patch U k (T ) in the coarse mesh T H by See Figure 1 for an illustration of patches. For a given patch U (T ), we define the restriction of V f h to U (T ) by   Definition 10 (Localized corrector operators). For each T ∈ T H and k ≥ 1 layers, we define a localized element corrector operator G T h,k : for all w ∈ K f h (U k (T )). Further, we define the localized global corrector operator G h,k := T ∈T H G T h,k . The localized corrector operators are again well-defined by the Lax-Milgram theorem, exploiting that a(·, ·) is a weighted L 2 -scalar product. Note that the definition of K f h (U k (T )) implies Neumann boundary conditions on the localized corrector problems (11). We define a localized multiscale function space by V ms,k H,h := (Id − G h,k )(V H ) and state the localized multiscale problem as follows.
Definition 11 (Localized multiscale problem). The localized multiscale problem reads: find u ms,k H,h ∈ V ms,k H,h and p H ∈ Q H , such that for all v h ∈ V ms,k H,h and q H ∈ Q H . Definitions 10 and 11 constitute the proposed multiscale method. Next, we show that the above stated problem is well-posed. Proof. We use similar arguments as in Lemma 8. The basic difference is that we need to show stability for the localized corrector operator G h,k . We start with the stability of the localized element corrector operators. Here we have for arbitrary Now, we can prove L 2 -stability of the localized global operator. We get where C ρ is a constant only depending on the shape regularity constant ρ of the coarse mesh. Similar to (8) we derive inf-sup stability with Observe that the inf-sup stability constant γ 0 k := γ(1 + α −1/2 β 1/2 C 1/2 ρ k d/2 ) −1 depends on k this time.
The inf-sup stability constant γ 0 k depends on k due to overlapping patches. We come back to another estimate of the inf-sup stability constant in Section 4.3 after proving the decay of the correctors.
It is important to note that in the localized case we do not have orthogonality between V ms,k H,h and K f h as in the ideal case (cf. equation (6)). This orthogonality was crucial in the error estimate for the ideal method presented in Lemma 9. In the localized case, we rely on the exponential decay of the localized element correctors, which justifies localization to patches.

4.1.
Error estimate for localized problem. In this section we state the main result of this paper, which is an a priori error estimate in the energy norm between the reference solution and the localized multiscale approximation. We first present a logarithmic stability result for the nodal Raviart-Thomas interpolation operator Π H for fine scale functions and then state a lemma on the exponential decay of the correctors. Then the main theorem follows. The proof of the exponential decay is contained in Section 4.2. The notation a b stands for a ≤ Cb with some constant C that might depend on d, Ω, α, β and coarse and fine mesh regularity constants, but not on the mesh sizes h and H. In particular it does not depend on the possibly rapid oscillations in A.
We recall a well known stability result for the nodal Raviart-Thomas interpolation operator.
Lemma 13 (Logarithmic stability of the nodal interpolation operator for divergence free functions). Assume (B1)-(B4). For any given element T ∈ T H there exists a constant C that only depends on the regularity of T and the quasi-uniformity of T h , such that A proof for this can be found in [34,Lemma 4.1]. This result holds for both d = 2 and 3.

Remark 14.
There exist unconditionally L 2 -stable Clément-type interpolation operators for which we could define λ(H/h) := 1 for all h and H instead, see [7,10,11,32]. In particular, the operators introduced in [7,11] are projections and were used as a technical tool in the proof of Lemma 9 above. However, these operators are hard to implement in practice and hence are not used in the proposed numerical method.
Proof. The lemma is a direct consequence of Lemma 21 in Section 4.2.
Now, combining the error estimate for the ideal multiscale method in Lemma 9 and Lemma 15 we get the following a priori error estimate of the localized multiscale method.
for some 0 < θ < 1 depending on the contrast β/α, but not on k, h and H.
Before stating the proof, we discuss the role and choice of k. The second term in the error estimate (15) is an effect of the localization. This term can be made small by choosing large values of k, i.e. large patch sizes. A natural question is how to choose k to make the second term of order H to some power.
We write λ = λ(H/h) for convenience. Letk = 2d −1 log(θ)λ −1 k = −C θ λ −1 k, where C θ = −2d −1 log(θ) > 0 is a constant independent of H and h. We are interested in the asymptotic behavior, so we consider H 1. Setting the second term in (15) equal to H f L 2 (Ω) yields where W is the Lambert W -function. In terms of the number of layers k, we get k = −C −1 θ λW (−C θ λ −4/d−1 H 2/d ). This equation has two solutions for sufficiently small H. Since we require k ≥ 1, we pick the branch W ≤ −1.
Another, more practical option is to choose k = Rλ log(1/H) for some constant R. Then the expression k d/2 λθ k/λ will be asymptotically (as H → 0) dominated by the power H −R log θ . Choosing R sufficiently large yields arbitrary order of accuracy of the term. The fine mesh size h is often fixed and we can choose for some bases r and s of the two logarithms.
Remark 17. If Clément-type interpolation operators are used, we have λ ≡ 1 independent of H/h. Choosing k = C log(1/H) makes the second term in (15) proportional to log(1/H) d/2 H −C log θ . For an appropriate C we can make the first term in (15)

4.2.
Proof of exponential decay of correctors. This section consists of four lemmas, Lemma 18-21, of which the last one is the main result. The two first lemmas are auxiliary and are motivated by steps in the proofs of the latter two. Before starting, we need to set some notation and introduce some tools. We use the notation Note that we will use the letter K to denote arbitrary triangles of the coarse mesh T H . The first lemma says that every divergence free function w in H(div, Ω) is the divergence of a skew-symmetric matrix.
Here, the divergence of ψ is defined along the rows.
Note that the above lemma is the only instance, where we require the restriction d = 2. Even though the existence of the skew-symmetric matrix is also available for d = 3, we could not prove localized estimates of the type ∇ψ ij L 2 (ω) w L 2 (ω) .
Proof. The result is a combination of well-known results. First, we extend the divergence-free vector field w ∈ H(div, Ω) to a divergence-free vector fieldw ∈ H(div, R d ). In particular we havew ∈ [L 2 (R d )] d andw = w in Ω. Note that the extension of w to R d will be typically not zero outside of Ω. The existence of such an extension operator was proved in [33,Proposition 3.8]. It is well known that there exists a skew-symmetric matrix ψ ∈ [W 1,2 [23,Lemma 2.3]). The matrix is only unique up to a constant, so we fix the constant by Ω ψ = 0 (which gives us a Poincaré inequality). The inequality ∇ψ ij L 2 (ω) w L 2 (ω) (for ω ⊂ R d ) can be seen as follows for d = 2. Obviously, if i = j we obtain ∇ψ ii = ∇ψ jj = 0 and estimate (17) is trivial. If i = j, we obtain by using the skew-symmetry i.e. we obtain even equality in estimate (17).
We also require suitable cut-off functions that are central for the proof. For T ∈ T H and positive k ∈ N, we let the function η T,k ∈ P 1 (T H ) (globally continuous and piecewise linear w.r.t. T H ) be defined as We start with the following lemma, which enables us to approximate truncated functions from K f h .
Lemma 19. Let w h ∈ K f h and let ψ ∈ [W 1,2 loc (Ω)] d×d with w h = ∇ · ψ denote the corresponding skew-symmetric matrix as in Lemma 18. Let furthermore ψ K := |K| −1 K ψ denote the average on K ∈ T H and let ψ H ∈ [L 2 (Ω)] d×d denote the corresponding piecewise constant matrix with ψ H (x) = ψ K for x ∈ K. The broken divergence-operator ∇ H · is given by ∇ H · v := ∇ · v| K for K ∈ T H . The function η T,k ∈ P 1 (T H ) is a given cut-off function as defined in (18) for k > 0. Then, we have that the functionw h : h fulfills the following estimate for any K ∈ T H : otherwise.
Proof. First, we observe that the skew-symmetric matrix ψ must be a polynomial of maximum degree 2 on each fine grid element. We use this in the following without mentioning.

FREDRIK HELLMAN, PATRICK HENNING AND AXEL MÅLQVIST
We fix the element T ∈ T H and k ∈ N and denote η := η T,k . Furthermore, we define for K ∈ T H We definew h := Π h (∇ · (ηψ))−(Π H •Π h ) (∇ · (ηψ)) and observe thatw h ∈ K f h and w h =w h on Ω\U k (T ). The property Π H (w h ) = 0 is clear. The property ∇·w h = 0 follows from the fact that ηψ is still skew symmetric and that ∇ · (Π H • Π h )(·) = (P H • P h )(∇ · ). Since ψ K and c K are constant on K we have Finally, we also have on K, Combining (19), (20), and (21) we obtain for every K ∈ T H Now, we consider the quantity we want to estimate. For any K ∈ T H , In the last step we used Lemma 13, the property that Π h ∇ · ((η − c K )(ψ − ψ K )) is divergence free and the fact that Π h is locally L 2 -stable when applied to functions of small fixed polynomial degree, i.e. for fixed t ∈ T h and r ∈ N there exists a constant C(r) that only depends on r and the shape regularity of t such that Continuing from (23) we obtain Note that we used the properties of η to obtain the Lipschitz bound η−c K L ∞ (K) H ∇η L ∞ (K) 1 and that ∇η has no support outside U k (T ) \ U k−1 (T ). We also used the Poincaré inequality for η − c K which has a zero average on K. Combining (23) and (24) yields the sought result.
We continue with a lemma showing the exponential decay of solutions to problems of the form in (5).
Then, there exists a generic constant 0 < θ < 1 (depending on the contrast β/α) such that for all positive k ∈ N: Proof. The proof exploits similar arguments as in [30]. Let us fix k ∈ N. We denote again η := η T,k ∈ P 1 (T H ) (as in (18)). We apply Lemma 19 to w T ∈ K f h . The corresponding skew symmetric matrix shall again be denoted by ψ = ψ(w T ) and we definew We obtain that ∇ · (ηψ) − ∇ H · (ηψ H ) −w T is zero outside U k (T ) \ U k−1 (T ) and First observe that and With that we have = Ω\U k−1 (T ) = Ω\U k−1 (T )

FREDRIK HELLMAN, PATRICK HENNING AND AXEL MÅLQVIST
For I we use (27) to obtain and for II we obtain Now, denote by L := Cλ(H/h), and we get where C is independent of T , k and A, but can depend on the contrast. We obtain A recursive application of this inequality and w T Ω\U0(T ) ≤ w T Ω yields where we used Bernoulli's inequality and that 0 < L −1 ≤ C −1 in the last step. The choice θ := (1 + C −1 ) −1 proves the lemma.
The following lemma is the main result of this subsection. It can be directly applied to the localized corrector problems (11) Lemma 21. Let the setting of Lemma 20 hold true and let additionally w T,k ∈ K f h (U k (T )) denote the solution of Then, there exists a generic constant 0 < θ < 1 (depending on the contrast) such that for all positive k ∈ N: Proof. Let η T,k be defined according to (18) and denote z : The first term is estimated by For the second term we have z ∈ K f h , hence there exists again a skew-symmetric matrix ψ = ψ(z) with the properties as in Lemma 18 with k+1 ψ)). Using Lemma 19 and supp(η T,k+1 z)∩supp(w T,k ) = ∅ we get

Now proceed as in Lemma 20 to obtain
Combining the estimates for I and II and applying Hölder's inequality finally yields, for k ≥ 1, It remains to bound w T − w T,k 2 Ω . In order to do this, we use Galerkin orthogonality for the local problems, which gives us Again, we use Lemma 20 to show Combining (32) and (33) proves the lemma.

4.3.
Inf-sup stability revisited. The decay results can be used to prove another inf-sup stability constant γ 1 k in addition to γ 0 k from Lemma 12 for the bilinear form b(·, ·) with the localized multiscale space. Using Lemma 21, we obtain . We get the following stability Using the same technique as in Lemma 12, we obtain an inf-sup stability constant For the nodal Raviart-Thomas interpolation operator Π H , λ(H/h) depends on h and H, and we cannot obtain a uniform bound on the constant for this estimate either. However, for L 2 -stable Clément-type interpolation operators (discussed in Remark 14), we have λ(H/h) ≡ 1, independently of h and H. If using such an interpolator in place of Π H , the inf-sup stability constant γ 1 k can be bounded from below by a positive constant independent of h and H, since k d/2 θ k is bounded from above with respect to k.

Numerical experiments.
Four numerical experiments are presented in this section. Their purpose is to show that the error estimate for the localized multiscale method presented in Theorem 16 is valid and useful for determining the patch sizes and that the method is competitive.
A brief overview of the implementation of the method follows. The two dimensional Raviart-Thomas finite element is used. For all free degrees of freedom e (interior edges), the localized global corrector G h,k Φ e for the corresponding basis function Φ e is computed according to equation (11). The additional constraints on the test and trial functions to be in the kernel to the coarse Raviart-Thomas projection operator are implemented using Lagrange multipliers (in addition to those already there due to the mixed formulation). The corrector problems are cheap since they are solved only on small patches. This can be done in parallel over all basis functions. Finally, problem (12) is solved. Regarding the linear system arising here, we compare it with the linear system arising from a standard Raviart-Thomas discretization (using V H for the flux) of the mixed formulation on the coarse mesh: for matrices K and B and a vector b. The difference with the multiscale method is that matrix corresponding to the bilinear form a(·, ·) is computed using the low dimensional modified localized multiscale basis {Φ E − G h,k Φ E } E spanning V ms,k H,h . Since the correctors are divergence free, K is replaced by a different matrixK in the system above, whereas B and b are left intact.
In all numerical experiments below, the diffusion matrix is diagonal with identical diagonal elements, A(x) = A(x)I, with I being the identity matrix, for a scalarvalued function A.

Investigation of error from localization.
In this experiment, we investigate how the error in energy norm of the localized multiscale solution is affected by the localization to patches of the correctors. The error due to localization is bounded by the second term in the estimate in Theorem 16. This term will be the focus of this experiment.
The computational domain is the unit square Ω = [0, 1] 2 and the source function is given by     Figure 4 for the resulting convergence of this error with respect to H for the two values of C. Note that since f ∈ Q H for all examples, the first term in (15) vanishes. The error is hence bounded by k d/2 λ(H/h) 2 θ k/λ(H/h) f L 2 (Ω) , which allows for a careful investigation of the influence of k, H and h. A reference line proportional to H 2 is plotted for guidance. We can see that we achieve convergence for both choices of C. However, since k is rounded to an integer, the convergence plots have a staggered appearance. This example shows that the error due to localization can be kept small and decreases with H. The plots also show the relative error in energy norm for the standard Raviart-Thomas discretization on the coarse mesh. It is evident that the localized multiscale space has good approximation properties since it permits convergence while the standard space of the same dimension does not.

Investigation of instability.
In this experiment we show how singularitylike features can appear in the solution, probably as a result of high contrast in combination with the L 2 -instability of the nodal Raviart-Thomas interpolator.
Again, we consider the unit square Ω = [0, 1] 2 . The diffusion coefficient A is chosen according to Figure 5. In other words, A is defined as otherwise.
The source function is chosen as This particular choice of A and f yields a localized multiscale solution with a clear singularity-like feature at x = (x 1 , x 2 ) = (1/2, 1/2) in the localized multiscale solution.
We use the family of triangulations presented in Figure 3 and fix H = 1/4 so that f is resolved on the coarse scale. Then f ∈ Q H and all error stems from localization (see Theorem 16). We let the resolution h of the fine space be h = 2 −5 , 2 −6 , . . . , 2 −9 . Choosing k = 2, we compute the localized multiscale solution u ms,k H,h and reference solution u h for the given values of h.
From the error estimate in Theorem 16, we expect to have The energy norm of the error is plotted in Figure 6. We can see that for this particular problem and range of h, the error increases with h and with the rate log(h −1 ) as predicted by the error estimate. However, the error estimate seems not to be sharp for this particular example. Figure 7 shows the reference and multiscale flux solutions. The magnitude of the reference solution is in the range [0, 3], while the multiscale solution has a spike reaching magnitude 30 at x = (1/2, 1/2). Interesting to note is that the singularities vanish for the ideal multiscale method, i.e. without localization, see Lemma 9. Standard RT error otherwise. to the localized multiscale problem were computed using H = 2 −2 , 2 −3 , . . . , 2 −6 . The patch size k was chosen as k = C(1 + log 2 (H/h)) 1/2 log 2 (H −1 )    rounded to the nearest integer, with C = 0.25 and C = 0.5. The relative error in energy norm was recorded for the solutions corresponding to the values of H. The resulting convergence plot can be found in Figure 9. We expect the first term in the error estimate, to be of order H 2 . From the convergence plots we can see that C = 0.25 is not sufficient to make the localization error of at least order H 2 , however, C = 0.5 is.

Comparison with MsFEM.
We compare the proposed method with the results obtained using the Multiscale Finite Element Method (MsFEM) based approach in [5]. The domain is Ω = [0, 1.2] × [0, 2.2] and the permeability coefficient A is given in a uniform rectangular grid of size 60 × 220 by the 85th permeability layer in model 2 of SPE10 [12]. The method proposed in [5] is based on a fine and a coarse mesh with quadrilateral elements. The fine mesh is uniform 60 × 220, i.e. aligned with the permeability data, and the coarse mesh is 6 × 22, so that each coarse element is subdivided into 10 × 10 fine elements. The implementation of the method proposed in this work uses triangular meshes, which is why we divide each of the rectangular elements into two triangular elements by a diagonal line drawn from the upper left corner to the lower right corner. As coarse mesh, we use a similar triangular mesh that is constructed from a 6 × 22 rectangular mesh such that the fine mesh is a conforming refinement of the coarse mesh. The (quasi-singular) source data f is equal to 1 in the lower left and −1 in the upper right fine quadrilateral element. Note that such f is a discretization of point sources that model production wells. In particular, the source terms on the continuous level are mathematically described by Dirac delta functions. Hence, for h → 0, we only have f ∈ W −m,2 (Ω) for m > d 2 , opposed to f ∈ L 2 (Ω) as is required for our analysis. To account for this difference, we follow [28] and compute the localized source corrections F T, h f ∈ V f h (U (T )) on -coarse-layer patches for ) and q f h ∈ Q f h (U (T )), where Q f h (U (T )) is the restriction of Q f h to U (T ), analogous to the definition of V f h (U (T )). (The pressure solutioñ F T, h f is not needed for correcting the flux and is discarded after its use as Lagrange multiplier). Since f is non-zero only for the two triangles T 1 and T 2 in the lower left and upper right corners, only two such corrector problems need to be solved. The total localized source correction is h . The localized corrector problems (11) are unaffected by the source correction. The right hand side of the localized multiscale problem (12) is appended with the localized source corrections and instead reads: find u ms,k, . Using a value of = 0 will be referred to as an ad-hoc source correction, since we do not expect to have any decay of the correction already within the source triangle itself. The source corrected solution is u ms,k, H,h + F h f . We emphasize that the need for source correctors for singular source terms is not an exclusive drawback for our approach, but it is a common necessity shared by all comparable multiscale methods in this setting. In particular they are also used for the MsFEM-based approach in [5] that we use for our comparative study.
The proposed localized multiscale method was used to solve for the flux in the described problem for three corrector patch sizes: k = 1, 2, and 3. Three variants of H,h +F h f for = 0 (without interpolation constraint), and iii) with source correction, i.e. u ms,k, H,h + F h f for = k, k + 1, ∞. A reference solution u h was computed on the fine mesh. Table 1 shows the relative energy norm and L 2norm of the difference between the localized multiscale solution and the reference solution for the different values of k and . The corresponding L 2 -norm of the error for the MsFEM method with oversampling HE0-OS proposed in [5] is also presented in the table. Note that HE0-OS is based on a discretization with roughly 33% less degrees of freedom than the proposed method, since it uses quadrilaterals instead of triangles (however, since this holds for both the fine and the coarse mesh, the relative change in the amount of degrees of freedom with respect to the reference solution is the same). The flux solutions are plotted in Figure 10.
The results show that the proposed method even without error correction compares favorably with the homogenization based approach. Ad-hoc error correction gives small errors for this problem in both norms. For source correction with patch size = k, instabilities similar to that studied in Section 5.2 cause the error to increase. However, letting = k + 1 is enough to get errors that compare favorably with [5].