A-POSTERIORI ERROR ESTIMATE FOR A HETEROGENEOUS MULTISCALE APPROXIMATION OF ADVECTION-DIFFUSION PROBLEMS WITH LARGE EXPECTED DRIFT

In this contribution we address a-posteriori error estimation in $L^\infty(L^2)$ for a heterogeneous multiscale finite element approximation of time-dependent advection-diffusion problems with rapidly oscillating coefficient functions and with a large expected drift. Based on the error estimate, we derive an algorithm for an adaptive mesh refinement. The estimate and the algorithm are validated in numerical experiments, showing applicability and good results even for heterogeneous microstructures.


1. Introduction. In this work we focus on a-posteriori error estimation and adaptivity for a heterogeneous multiscale finite element method (HMM) applied to advection-diffusion problems with rapidly oscillating coefficient functions and large expected drift, i.e. on equations of the following type:
Here, $b^\epsilon$ is divergence-free and $\epsilon$ denotes a very small parameter, which is a characteristic size for the microscale property of the problem. Equations of this type have a lot of applications, especially in hydrology. They can be used for describing reservoir displacement problems or the transport of solutes in ground- and surface water. When the water content takes values that are close to saturation, the diffusion process has only a minor influence and the flow is primarily caused by gravity. In this spirit, the scaling of the advective part with $\epsilon^{-1}$ refers to a large Péclet number as suggested by Bourlioux and Majda [18]. In fluid dynamics, the Péclet number describes the ratio of the advective part to the diffusive part. Therefore, a large number indicates an advection-dominated transport process. As mentioned above, this particularly occurs with the transport of solutes in groundwater. Besides this, there is a wide range of other applications, including the modeling of semiconductor devices or polymer chemistry.
PATRICK HENNING AND MARIO OHLBERGER
One possibility for solving equations of type (1) is related to the technique of homogenization, where the limit problem for $\epsilon \to 0$ is examined. For corresponding results in the periodic setting (i.e. $\epsilon$-periodic coefficients), we refer to contributions of Allaire and Raphael [15,16,38] for advection-diffusion problems with reaction in a porous medium and Allaire and Orive [14] for advection-diffusion-reaction problems with coefficients that vary both on the macro- and the microscale. The case of nonlinear convection-diffusion problems was treated by Marušić-Paloka and Piatnitski [53]. As the range of the very effective homogenization approach is limited to specific structures, such as periodicity, it is increasingly important to consider numerical multiscale methods for more general scenarios. These methods can strongly reduce the complexity of the regarded system by decoupling the problem into macroscale and microscale contributions.
A prominent example of multiscale methods is the so-called multiscale finite element method developed by Hou et al. The method can be applied to heterogeneous composite materials and also to porous media. Elliptic equations were considered in [41,42] and two-phase flow in porous media was studied in [24]. Multiscale methods for solving parabolic equations with continuum spatial scales and heterogeneous coefficients were discussed in the work of Jiang, Efendiev and Ginting [46]. Other numerical multiscale approaches are the two-scale finite element methods by Schwab and Matache [54,55,59,40], the multiscale mortar mixed finite element discretization by Arbogast et al. [17], or the 'divide-and-conquer' spatial and temporal multiscale method for transient advection-diffusion-reaction equations by Gravemeier and Wall [34]. Another very powerful class is the variational multiscale method proposed by Hughes et al. [44,45]. Here the solution is split into a fine-scale and a coarse-scale part. The corresponding fine-scale equations are solved in dependence on the residual of the coarse-scale solution. Larson and Målqvist derive an a-posteriori estimate via duality techniques for this method. This estimate is used to create an adaptive algorithm. Diffusion-dominated elliptic problems are treated in [48,49] and the case of stationary advection-diffusion problems in [50]. Another general framework for adaptive multiscale methods for elliptic problems is suggested by Nolen, Papanicolaou and Pironneau [57].
In our contribution we focus on the heterogeneous multiscale finite element method (HMM) which is, contrary to most other multiscale methods, not designed to approximate the exact solution $u^\epsilon$, but to approximate a homogenized solution $u_0$ which contains no fine-scale oscillations. The HMM was originally introduced by E and Engquist in 2003 [20,21,22]. It is based on a standard finite element approach for the macroscopic part of the problem. For the evaluation of the corresponding discrete bilinear form, the solutions of so-called cell problems are used. These solutions, only defined on small cells around quadrature points, help to reconstruct the effective macroscopic properties of the coefficient functions. The method is not restricted to the periodic case but requires scale separation.
In the works of E, Ming and Zhang [23], Abdulle and Schwab [12], Ohlberger [58] and Henning and Ohlberger [36] the elliptic case is treated. The parabolic case is analysed by Abdulle and E [6], Abdulle and Huber [9], Ming and Zhang [56] and Henning and Ohlberger [37]. An algorithm for solving advection-diffusion problems is suggested by Abdulle [1]. An HMM for the wave equation was proposed and analyzed by Abdulle and Grote [8] and Engquist, Holst and Runborg [26,27]. A higher-order HMM based on least-squares reconstruction was introduced in [51]. A combination of the HMM with a Reduced Basis approach can be found in [4,5]. For an overview of the HMM we refer to the work by Abdulle, E, Engquist and Vanden-Eijnden [7].
Among others, a-priori results concerning the HMM were achieved by E et al. [20,23], Abdulle [1,2], Ohlberger [58], Henning and Ohlberger [36,37], Du and Ming [19], Abdulle and Vilmart [13] and Gloria [31,32]. First a-posteriori error estimates for the heterogeneous multiscale method for elliptic problems can be found in Ohlberger [58] and Henning and Ohlberger [36] in the periodic setting. For a-posteriori error estimates for the HMM in general (non-periodic) settings, we refer to [10,11,39]. Another approach towards a-posteriori error estimation is given by Larson and Målqvist [50] in the context of the variational multiscale method (VMM). However, the techniques applied for the VMM do not generalize to the HMM.
The goal of this work is to derive an a-posteriori error estimate for the heterogeneous multiscale method for advection-diffusion problems with rapidly oscillating coefficient functions and a large expected drift. This method was introduced in [37] and is constructed to capture the effective global properties of the solution $u^\epsilon$ of problem (1); it is not designed to determine $u^\epsilon$ itself. The a-posteriori error estimate proposed in this contribution is derived under the assumption of periodicity. More precisely, we establish the estimate by using the result that the method is equivalent to a discretization of the two-scale homogenized equation of (1) by means of a discontinuous Galerkin time stepping method. The derived estimate contains local error indicators, which allow the use of adaptive mesh refinement algorithms. Although we prove the estimate under the restriction of a periodic micro-structure, the result is also applicable to ergodic stochastic coefficients or moderately heterogeneous structures. This claim is demonstrated by two numerical experiments.
Outline: In Section 2 we state our heterogeneous multiscale method and present our main results: an a-posteriori error estimate, and a strategy for adaptive mesh refinement. In Section 3 we discuss two numerical experiments to validate our claims and the achieved estimate. Section 4 is dedicated to the proof of the a-posteriori result.
2. Formulation of the method and a-posteriori error estimate. In the following we study the advection-diffusion problem (1) with rapidly oscillating and time-dependent coefficient functions. In addition to standard assumptions that guarantee existence and uniqueness of a solution of (1), such as ellipticity of $A^\epsilon$ uniformly in $t$ and $x$, we demand that $k^\epsilon$ is a positive function and that $b^\epsilon$ is divergence-free. If the advective effects created by $b^\epsilon$ are expected to average out (i.e. $\int_{x_T + [0,\epsilon]^d} b^\epsilon(t,\cdot) = 0$ for every relevant quadrature point $x_T \in \mathbb{R}^d$), we do not need further assumptions on $A^\epsilon$, except that there is some kind of scale separation. If this is not the case, we need to presume that $A^\epsilon(t,\cdot)$ and $b^\epsilon(t,\cdot)$ are only micro-scale functions, i.e. they only show a microscopic behaviour and are almost constant on the macro-scale. Examples are periodic coefficients or ergodic stochastic coefficients. For $k^\epsilon$ we assume that the function is $\epsilon$-periodic with average 1. We note that this assumption is not a real restriction, but a simplification. The case with a completely general $k^\epsilon$ yields no further difficulties. See for instance [37] on how to formulate the subsequent multiscale method for this case.
Before stating the HMM, we introduce the following notations and definitions: $\mathcal{T}_H$ defines a regular simplicial partition of $\mathbb{R}^d$, the corresponding set of the inner faces is defined by $\Gamma(\mathcal{T}_H) := \{E \mid E = T \cap \tilde{T} \neq \emptyset,\ T, \tilde{T} \in \mathcal{T}_H\}$ and by $x_T$ we denote the barycenter of $T \in \mathcal{T}_H$. $\mathcal{T}_h$ defines a regular periodic partition of the zero-centered unit cube $Y := [-\frac{1}{2}, \frac{1}{2}]^d$. The set of inner faces of $\mathcal{T}_h$ is given by $\Gamma(\mathcal{T}_h) := \{E_Y \mid E_Y = S \cap \tilde{S} \neq \emptyset,\ S, \tilde{S} \in \mathcal{T}_h\}$. Furthermore, for $\delta \in \mathbb{R}_{>0}$, we introduce the $\delta$-scaled unit cells, centered around a barycenter $x_T$, by $Y_{T,\delta} := \{x_T + \delta y \mid y \in Y\}$. The associated bijection $x^\delta_T : Y \to Y_{T,\delta}$ is given by $x^\delta_T(y) := x_T + \delta y$ for $y \in Y$, and its extension
In the following, equidistant time steps will be used for simplification. We define $t^n := n \Delta t$, where $\Delta t$ denotes the step size, such that $N := T_0 / \Delta t \in \mathbb{N}$. The jump over $t^n$ is defined by $[u]^n := u^n_+ - u^n_-$, where $u^n_+ := \lim_{t \searrow t^n} u(t,\cdot)$ and $u^n_- := \lim_{t \nearrow t^n} u(t,\cdot)$. Furthermore, for any bounded domain $M$, we denote the outer normal by $\nu_M : \partial M \to \mathbb{R}^d$. For two domains $M_1$ and $M_2$, with $\Gamma := \bar{M}_1 \cap \bar{M}_2$ and for a function $g \in (L^\infty(M))$
We introduce a Sobolev space of periodic functions with zero mean by $\tilde{H}^1(Y)$:
For the HMM itself, we need the subsequent discrete spaces:
Assuming that the coefficient functions are continuous, we use the following discrete approximations of the coefficients $A^\epsilon$, $b^\epsilon$ and $k^\epsilon$ (in $\bigcup_{T \in \mathcal{T}_H} Y_{T,\delta}$):
For $(t,x) \in [t^n, t^{n+1}) \times x_T(S)$, the local mean of the discrete advection $b_h$ is defined
with some $\delta > 0$ that is specified in the method itself. Using a Newton-Cotes quadrature formula of order zero, the heterogeneous multiscale finite element method for advection-diffusion problems with rapidly oscillating coefficient functions is introduced in the subsequent definition. Note that the method is only designed to capture the effective macroscopic properties of $u^\epsilon$ but not its fine-scale oscillations.
As proved in [37], in the periodic setting there is a clear relation between the HMM approximation $u_H$ and the homogenized solution $u_0$, in the sense that $u_H$ converges to $u_0$ strongly in $L^2$.
with the macroscopic bilinear form $A^n_H$ defined as
Here, the local centered reconstructions $R^{(n)}_T$ are given as
denotes the local reconstruction operator defined through the following cell problems. The image $R^{(n)}$
where $E_T : \mathbb{P}^1(T) \to \mathbb{P}^1(\mathbb{R}^d)$ denotes the canonical extension operator. The initial value $u^0_H = v^0_H$ is given by a suitable discretization of $v_0$. For the parameter $\delta$ we assume $\delta \ge \epsilon$. An expedient choice for the periodic case is $\delta = \epsilon$. For $\delta > \epsilon$, the strategy is called oversampling and it is used to erase the effects of a possibly wrong boundary condition for the cell problems.
Practically, it is of great interest to have some a-priori knowledge on how to choose the oversampling parameter $\delta$. Until now, there is no general answer to this question, since corresponding error estimates fundamentally rely on the homogenization setting that we are in. For instance, consider a linear elliptic homogenization problem with $A^\epsilon(\cdot) = A(\frac{\cdot}{\epsilon})$ and a corresponding HMM where the local problems are solved in $Y_{T,\delta}$-cells, but with a homogeneous Dirichlet boundary condition. For the averaging in the global bilinear form, the $Y_{T,\delta}$-cells are used instead of smaller cells (i.e. no oversampling). In this case, the error produced by an inappropriate choice of $\delta$ is of order $\frac{\epsilon}{\delta}$ (cf. [23, Theorem 1.2] and [3, Theorems 16 and 17]), which means that $\delta$ should be significantly larger than $\epsilon$. If we replace the periodic homogenization setting by a stochastic homogenization setting, the order of the error term can degenerate to $(\frac{\epsilon}{\delta})^\kappa$ with some $\kappa \le \frac{1}{2}$ (cf. [23, Theorem 1.3]). The same techniques lead to similar results for the MsFEM (cf. [25,43]). However, in practice and using a periodic boundary condition in combination with oversampling, these rates do not become visible, unless $H$ becomes very small. This indicates that the pre-factor in front of the error terms is often small. In order to derive similar estimates as in the aforementioned works ([23] and [3]) for the HMM with large expected drift, the same strategies can be used where the oversampling can be treated as in [43]. However, one has to consider an asymptotic expansion (cf. [47]) of the solution to a periodic homogenization problem of a stationary advection-diffusion equation with a periodic boundary condition. Deriving bounds for the various terms in such an estimate is significantly complicated by the periodic boundary condition. To our knowledge, an analysis of such cases has not yet been carried out.
An alternative to the usage of a periodic boundary condition for the local problems was proposed in [33] for the elliptic case. Here, Dirichlet boundary conditions are used in combination with an additional regularization of the cell problems, which is obtained by adding an extra term to the left-hand side of the cell problems. The new problem is analyzed using Green functions. This modified approach improves the estimates in the periodic and stochastic setting considerably.
However, no results for general cases are available. The reason why it is extremely difficult to generalize results from periodic (or stochastic) settings to other types of problems is related to the fact that the HMM is always constructed to approximate the homogenized solution, but not to approximate the exact solution. If we are in a setting where we do not know the homogenized solution (or how to characterize it), there do not yet exist any tools that allow us to derive explicit convergence rates, which could be used for a more general numerical analysis of the HMM. Exact methods like the VMM do not suffer from this since they approximate the exact solution.
Hence, more precise (and more general) answers to the question of how big sampling domains should be are possible (cf. [52]). Practically, the HMM has an advantage in cases where there is no need for resolving the microstructure everywhere, or where data is only available in small representative volume elements (which would be the $Y_{T,\delta}$-cells).
We also note that the choice of $\delta$ has no influence on the approximation of the speed of the drift. The direction of the drift on the other hand is purely determined by a local average $\delta^{-d} \int_{Y_{T,\delta}} b^\epsilon$. Assume that we are in the periodic setting, where the exact period $\epsilon$ is unknown. In this case, $\epsilon^{-d} \int_{Y_{T,\epsilon}} b^\epsilon$ is the correct drift value that should be approximated. We easily see that the error is determined by how often the $\epsilon$-cell fits into the $\delta$-cell and an $L^1$-bound for $b^\epsilon$ on the 'remainder cell' divided by the number of '$\epsilon$-samples', i.e. the error behaves like
This is an error that is of smaller order than the $\frac{\epsilon}{\delta}$-terms that show up in the mentioned analysis of the homogenization error.

2.1. Homogenization. Our goal is to derive an a-posteriori result to estimate the $L^\infty(L^2)$-error between the HMM approximation and the solution of the two-scale homogenized equation of (1). To this end, let us pose the following periodicity and regularity assumption: We suppose that the initial value belongs to $H^1(\mathbb{R}^d)$ and that the coefficients are Lipschitz-continuous and space periodic with period $\epsilon$, i.e. denoting $A(t,y) := A^\epsilon(t, \epsilon y)$, $b(t,y) := b^\epsilon(t, \epsilon y)$ and $k(t,y) := k^\epsilon(t, \epsilon y)$, we assume
denotes the full norm, we introduce the subsequent semi-norm and norm on
In order to define the two-scale homogenized equation, we introduce the solution
With the averaged advection field defined as $\bar{b}(t) := \fint_Y b(t,y)\,dy$ we introduce the elliptic part of the two-scale homogenized operator $E$ at $t \in [0, T_0]$ by
Finally, we define the space-time operator
With these definitions, we are prepared to formulate the following two-scale homogenization result (cf. [38] and the references therein).
Theorem 2.3. Let $u^\epsilon$ denote the solution of (1) and suppose that the assumptions of this section hold true. Then there exists a unique solution $(u_0, u_1) \in X(0, T_0)$ of the two-scale homogenized equation with drift:
Moreover, $u^\epsilon$ converges towards $u_0$ in the following sense
and the subsequent regularity for the homogenized solution holds:
and
Furthermore, we have the estimates
2.2. A posteriori error estimate. In the following, the notation $f \lesssim g$ is used if $f \le Cg$, where $C$ only depends on the domain, the coefficients and the initial value of our problem. We are now prepared to state our main result.
Here, $\eta^N_H$ and $\xi^N_H$ denote the residual error indicator and the approximation error indicator, respectively. Using
for $\Phi_H \in V_H$ and $x \in T$, the local constituents of the residual and approximation error indicators are given as follows. With
The local residual error indicator splits into time and space constituents:
has specific contributions from the macro scale ($H$) and the micro scale ($h$):
where we use for $1 \le n \le N$, $T \in \mathcal{T}_H$, $S \in \mathcal{T}_h$, $E \in \Gamma(\mathcal{T}_H)$ and $E_Y \in \Gamma(\mathcal{T}_h)$ the following notation for local contributions of the residual error
Furthermore, $n_{\max}$, $\tilde{n}_{\max}$ and $\hat{n}_{\max}$ are defined as
The indicators depending on $\mu_{n,E,Y}$ and $\mu^{(5)}_{n,E,Y}$ account for jump residuals on the macro-grid and the indicators depending on $\mu^{(1)}_{n,T,E_Y}$ account for jump residuals on the micro-grid. The gradient jumps typically converge like $H^{1/2}$ and $h^{1/2}$ respectively, such that all residual terms are expected to converge with the optimal rate (i.e. $O(H^2 + h^2)$).
The approximation indicators reflect the typical approximation properties of the scheme. The temporal order is slightly polluted by the logarithmic term $(\log \frac{t^N}{\Delta t})^{1/2}$, which is a typical artifact of the analysis. For comparable results for standard parabolic equations, we refer e.g. to the work of Eriksson, Johnson and Larsson [29].
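To give a feeling for the size of this logarithmic factor, the following evaluation inserts the time grids of the numerical experiments in Section 3 (final times 3 and 1/2, step sizes 0.15 and 0.05). It assumes that the factor has the form $(\log(t^N/\Delta t))^{1/2}$ with the natural logarithm; this is our reading of the term and not a statement taken from the proof.

```latex
\left(\log\frac{t^N}{\Delta t}\right)^{1/2}
  = \left(\log\frac{3}{0.15}\right)^{1/2} = (\log 20)^{1/2} \approx 1.73,
\qquad
\left(\log\frac{0.5}{0.05}\right)^{1/2} = (\log 10)^{1/2} \approx 1.52 .
```

Under the same assumption, halving the step size in the first case only raises the factor to $(\log 40)^{1/2} \approx 1.92$, so the pollution of the temporal order is mild.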
Even though the estimate is derived under the restriction of space-periodic coefficients, the final result can also be applied to a more general setting. This claim is emphasized by the numerical experiments in Section 3.
where $e^N_{\mathrm{approx}}$ denotes a data approximation error. On the other hand, after removing the contributions of the coefficient functions in $\eta^N_H$, we get that the remaining parts of $\eta^N_H$ can be estimated in analogy to the proof of Theorem 3.3 in [28]. In this case, we obtain $\eta^N_H \lesssim E_{0n} L_N$ + data-approximation. This yields that (except for data approximation) the estimator converges with the same speed as the
Concerning the cell problems, we use a uniform micro-mesh $\mathcal{T}_h$ with a fixed grid size of $h = 2^{-4}$ for simplicity. The micro-mesh is fine enough (in comparison to the resolution of the macro-grid) such that we do not see the influence of the discretization error in the cell problems. We denote for the n'th time step $V^n_H$:
There are several other possibilities for adaptive algorithms using the local error indicators. Depending on the considered problem we might use a splitting of residual error and approximation error to improve the performance. Furthermore, the macro-mesh error indicators $\eta^N_H(T)$ and $\xi^N_H(T)$ can be used to determine suitable values for the grid size of the micro-mesh $\mathcal{T}_h$. We refer to [58] for such a strategy and corresponding numerical results. Going even further, each $\eta^N_H(T)$ can be decomposed into constituents depending on $S \in \mathcal{T}_h$. With such a decomposition we can also adaptively refine the computational micro-grid while coupling the resolution with the macro-grid. This involves a balancing between the various local error indicators (macro- and micro-grid).
We also note that the above algorithm does not involve a time step control. In particular, if the given time step size is too large, the algorithm might fail to fall below a desired tolerance. One way to overcome this problem is to couple the time step size with the average macro mesh size (average over the local mesh sizes H T ). Another way is to only decrease the time step size if the ratio between two estimated errors (two cycles of the algorithm) is close to 1. These are easy ways to prevent the algorithm from not terminating, but they are of course not optimal. A more detailed analysis concerning the spatial and temporal contributions to the error indicators would probably allow for a more efficient space-time adaptivity, but such analysis is beyond the scope of this paper and is left for further studies.
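The two simple time step heuristics described above can be sketched as follows. This is only an illustration under assumptions: the function names, the proportionality constant `c`, the choice of halving the step, and the stagnation threshold `0.9` are hypothetical and are not taken from the adaptive algorithm of this paper.

```python
def couple_time_step(local_mesh_sizes, c=1.0):
    """Heuristic 1: tie the time step to the average local macro mesh size H_T.

    `c` is an assumed proportionality constant.
    """
    return c * sum(local_mesh_sizes) / len(local_mesh_sizes)


def update_time_step(dt, est_prev, est_curr, stagnation=0.9):
    """Heuristic 2: reduce dt only if two consecutive estimated errors stagnate.

    A ratio est_curr / est_prev close to 1 means that the last spatial
    refinement cycle barely reduced the estimated error, so the temporal
    contribution presumably dominates. Halving dt is an assumed choice.
    """
    ratio = est_curr / est_prev
    return dt / 2.0 if ratio >= stagnation else dt


# Hypothetical estimated errors of two consecutive cycles of the algorithm.
dt_stagnating = update_time_step(0.15, est_prev=0.40, est_curr=0.39)  # ratio 0.975
dt_improving = update_time_step(0.15, est_prev=0.40, est_curr=0.20)   # ratio 0.5
```

In the first call the estimated error barely changes, so the step size is halved; in the second call the spatial refinement was effective and the step size is kept.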
Remark 2.6. Practically it can be difficult to make a proper choice of $\delta$ if $\epsilon$ is unknown. Since we do not know the correct boundary condition for the local problems on the $Y_{T,\delta}$-cells, the only strategy seems to be that we make the $Y_{T,\delta}$-cell sufficiently large so that it contains enough 'optimal sample cells'. The strategy can be described as follows: first, we solve a local problem in a cell of size $Y_{T,\delta}$ and compute (for this element $T$) the corresponding effective global coefficients as they appear in the macroscopic bilinear form $A^n_H$. Then we increase the size of the $Y_{T,\delta}$-cell and repeat the computation. If the new effective global coefficient is close to the old one, the first choice for $\delta$ was probably a good choice. If there is a discrepancy between the computed values, we do not yet have enough 'optimal sample cells' in the chosen $Y_{T,\delta}$-cell. Repeating this procedure iteratively, we can stop whenever the determined effective coefficients start to stagnate. In this case we observe the convergence to the correct homogenized coefficients and hence we have a good choice for $\delta$. This strategy is independent of the derived a-posteriori error estimate stated in Theorem 2.4, in the sense that it is not reflected by any of the error terms. However, this strategy can be easily combined with the adaptive algorithm.
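The iterative choice of $\delta$ described in this remark can be illustrated by a minimal sketch. It rests on strong simplifying assumptions that are not part of the paper: a one-dimensional, purely diffusive toy problem, for which the effective coefficient of a cell problem on a cell equals the harmonic mean of the coefficient over that cell, so that no local problem has to be solved. The names `effective_coefficient_1d` and `choose_delta`, the growth factor, and the tolerance are hypothetical.

```python
import math


def effective_coefficient_1d(a, x_T, delta, h):
    """Harmonic mean of a over the cell [x_T - delta/2, x_T + delta/2].

    Assumption: 1D pure diffusion, where the cell problem yields exactly
    this harmonic mean. Midpoint rule with about delta / h sample points.
    """
    n = max(1, math.ceil(delta / h))
    step = delta / n
    left = x_T - 0.5 * delta
    mean_inverse = sum(1.0 / a(left + (i + 0.5) * step) for i in range(n)) / n
    return 1.0 / mean_inverse


def choose_delta(a, x_T, delta0, h, growth=2.0, rtol=1e-3, max_iter=10):
    """Enlarge the cell until the effective coefficient stagnates."""
    delta = delta0
    a_eff = effective_coefficient_1d(a, x_T, delta, h)
    for _ in range(max_iter):
        delta_new = growth * delta
        a_eff_new = effective_coefficient_1d(a, x_T, delta_new, h)
        if abs(a_eff_new - a_eff) <= rtol * abs(a_eff):
            # New value is close to the old one: the old delta was good enough.
            return delta, a_eff
        delta, a_eff = delta_new, a_eff_new
    return delta, a_eff  # no stagnation observed within max_iter enlargements


# Toy coefficient with period eps; the routine itself never uses eps.
eps = 0.01
coefficient = lambda x: 2.0 + math.sin(2.0 * math.pi * x / eps)
delta, a_eff = choose_delta(coefficient, x_T=0.3, delta0=10.3 * eps, h=eps / 100)
```

For this toy coefficient the exact homogenized value is the harmonic mean over one period, $\sqrt{3} \approx 1.732$, so the returned `a_eff` can be compared against it. In the actual method each evaluation would require solving the cell problems on the enlarged $Y_{T,\delta}$-cell.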
The a-posteriori error estimates available for the HMM in general nonperiodic homogenization settings (such as those derived in [39,10,11]) incorporate the error made by the choice of $\delta$ indirectly in a 'modeling error contribution'. The modeling error essentially measures the error between the real homogenized matrix and the effective coefficient that is computed by the HMM for $H \to 0$. However, this modeling error cannot be computed, unless periodicity is assumed. Hence, it cannot be used to adaptively control the size of $\delta$.
The fully rigorous a-posteriori estimates obtained for the VMM (as in [48,49,50]) are based on the fact that the local problems decay to zero outside of the support of coarse basis functions. This completely justifies a homogeneous Dirichlet boundary condition for the local problems and opens the door to various techniques. In e.g. [49] it is used that the proper size of a patch can be measured by the normal flow of the local solution over the boundary of the patch. Since the local solutions decay to zero (with up to exponential speed), the normal flow must tend to zero. This gives a very reliable indicator. However, this and similar strategies do not generalize to the HMM, since there is no decay to zero of the local solutions.
3. Numerical experiments. Next, we analyze the HMM and the corresponding error indicators from Theorem 2.4 numerically in a periodic and heterogeneous setting. In particular, we apply the a-posteriori estimate for adaptive mesh refinement. Note that the formal assumptions of the theorem are not fulfilled in all our examples, but only the weak assumptions of Section 2. Nevertheless, the results suggest that the method and the achieved global and local error indicators work fine even with stochastic perturbations of periodic structures and moderately heterogeneous coefficient functions. For a-priori convergence results in a completely periodic setting, we refer to [37].
Formally, we are concerned with problems on unbounded domains. In the implementation, however, we restrict ourselves to a bounded computational domain. In general, this may produce a globally wrong behaviour of the solution, such as boundary layers or reflection. The most reliable strategy to (almost) avoid these effects is the usage of absorbing boundary conditions. For advection-diffusion problems such boundary conditions are derived by Halpern [35] or Nataf, Rogier and de Sturler [30]. In our test problems, however, it was sufficient to use a homogeneous Dirichlet boundary condition for the HMM macro problem on a sufficiently large computational domain.
In both model problems, we make use of the subsequent notations. By $u$ we denote the reference solution of the regarded problem, i.e. the function with which we compare our HMM approximation, to obtain an accurate value for the $L^2$-error. In the first model problem $u$ is the homogenized solution $u_0$ (cf. Section 2.1) and in the second model problem $u$ is $u^\epsilon(t, x + \frac{B(t)}{\epsilon})$ (a shift of the exact solution). In order to reduce the effects of a possibly wrong periodic boundary condition for the cell problems in the heterogeneous multiscale method, we use an oversampling technique by choosing $\delta = 2\epsilon$ for all numerical tests (cf. Definition 2.1).
To distinguish between computations on uniformly and adaptively refined grids, we denote the corresponding numerical solutions for the n'th time step by
and introduce the relative improvement in error and the relative improvement in computational time (9).
Here $t^u_{\mathrm{CPU}}$ and $t^a_{\mathrm{CPU}}$ denote the CPU times for the uniform and the adaptive computation. We note that $t^a_{\mathrm{CPU}}$ also contains the time that is required for computing the estimated error and the local error indicators, whereas $t^u_{\mathrm{CPU}}$ only contains the time for assembling and solving the HMM operator. For $m \in \mathbb{N}_+$, $\delta \in \mathbb{R}^m_+$ and the error function $g : \mathbb{R}^m_+ \to \mathbb{R}_+$ we define the experimental order of convergence (EOC) of $g$ in $(2\delta \to \delta)$ by $\mathrm{EOC} := \log\left(\frac{g(2\delta)}{g(\delta)}\right) / \log(2)$.
For simplicity, we do not refer to m and δ in our results. Instead we refer to the numbers of corresponding uniform computations.
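The experimental order of convergence can be evaluated with a few lines of code. The following sketch assumes the usual form of the definition, $\log(g(2\delta)/g(\delta)) / \log(2)$, for two consecutive computations whose discretization parameters differ by a factor of two; the function names and the sample error values are hypothetical and do not reproduce any table of this paper.

```python
import math


def eoc(g_coarse, g_fine, ratio=2.0):
    """EOC between two computations whose parameters differ by `ratio`."""
    return math.log(g_coarse / g_fine) / math.log(ratio)


def eoc_sequence(values, ratio=2.0):
    """EOCs for a list of errors ordered from coarsest to finest computation."""
    return [eoc(values[i], values[i + 1], ratio) for i in range(len(values) - 1)]


# Hypothetical errors that are reduced by a factor of 4 per refinement,
# i.e. second order convergence.
sample_errors = [4.0e-2, 1.0e-2, 2.5e-3]
sample_eocs = eoc_sequence(sample_errors)
```

The same function can be applied to the estimated errors, which allows a direct comparison of the rates of error and estimator.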
Model problem 1. Find $u^\epsilon \in L^2(0, 3; H^1(\mathbb{R}^2))$, with
We choose $\epsilon = 0.001$. The entries of $A^\epsilon$ are given by a ij (x) = 10 δ ij 1 + sin 2 (2π x 1 )cos 2 (2π
where $X$ is a log-normal distributed random variable, with variance $\sigma$ and expectation $E(X) = e^{\sigma^2/2}$ (see Fig. 1 for $a_{ii}$ with $\sigma = 0.1$). If $a_{ij}(x)$ would take a value below $\epsilon$, we set it to $\epsilon$ to keep the ellipticity. The advection $b^\epsilon$ is defined by
Dividing the above equation for model problem 1 by $\epsilon$, we see that the problem does formally not fulfill the assumption $k^\epsilon = 1$ as assumed in Section 2. However, if we define $v^\epsilon(t,x) := u^\epsilon(\epsilon^{-1} t, x)$, then $v^\epsilon$ solves
which fulfills the desired assumptions after dividing by $\epsilon$.
Model problem 1 deals with a stochastic perturbation of a space-periodic diffusion matrix. As a reference for the exact solution, we use the homogenized solution $u_0$, determined by ignoring the perturbation. In this example, we study in three test cases the properties of the HMM and of the global error indicator for increasing perturbations from the periodic setting. The first test is without perturbation, i.e. $\sigma = 0$, the second with a small variance ($\sigma = 0.01$) and the third one with a relatively large variance ($\sigma = 0.1$). We expect the HMM to yield good results for all of the tests, but nevertheless the approximation should become worse with increasing variance. If the global error indicator is reliable in such examples, we expect it to show the same behaviour as the error itself, i.e. the experimental order of convergence (EOC) should be roughly the same for error and estimated error. At least, the estimated error must not converge faster than the error itself. For further tests in the periodic setting we refer to [37]. The number of time steps is set to $N = 20$ for all the computations. Hence, the time step size ($\Delta t = 0.15$) is quite large in comparison to the used mesh sizes ($H$, $h$). This results in empirical convergence rates which do not reach second order. However, we have chosen this situation on purpose to see an influence of $\Delta t$, $H$ and $h$. Note that computations with smaller step sizes and $\sigma = 0$ yield the desired quadratic experimental orders of convergence, as demonstrated in [37].
Table 4. Model problem 1. EOC's for errors and global error indicators. In the first and the second column we refer to the numbers of the computations that are used to determine the associated EOC's. The numbers refer to the results depicted in Tab. 1, 2, and 3 (for $\sigma = 0, 0.01, 0.1$, respectively).
Comparing the results depicted in Tables 1, 2 and 3 we see that the HMM solutions are very accurate even for the case of a quite large stochastic perturbation. This is also emphasized by Fig. 1 where a comparison between the isolines of the homogenized solution and the HMM approximation for $\sigma = 0.1$ is plotted. Nevertheless, we discover that the experimental orders of convergence (in space) are slightly decreasing for larger variance of the stochastic perturbation (cf. Tab. 4). Thus, the numerical experiments confirm our expectations. As a verification for the reliability of the a-posteriori error estimate from Theorem 2.4, we observe that the estimated error captures the behaviour of the error itself. The estimated error is 8 to 9 times larger than the real error for all computations. These results indicate that the error estimator is of good quality, in the sense that it mimics the characteristics of the error itself. If the error worsens due to a larger stochastic perturbation, the estimated error does the same. The reason for this behaviour is the approximation error indicator part of $\Xi^N_H$. The higher the stochastic perturbation, the longer it takes for this part to converge. This claim is also confirmed by Tab. 4. Here, the experimental orders of convergence for the estimated error without the approximation part, $\eta^{N,\mathrm{rel}}_H$, are constantly at about 1.7, independent of the variance $\sigma$. Hence, the approximation error part alone captures the influence of the perturbation and leads to reduced EOC's of $\Xi^N_H$ with increasing variance.
Model problem 2. Find $u^\epsilon \in L^2(0, \frac{1}{2}; H^1(\mathbb{R}^2))$, with
We choose $\epsilon = 0.01$. The diffusion matrix $A^\epsilon(x)$ is given by
Here, the diffusion matrix is rapidly oscillating, but not periodic. The exact solution $u^\epsilon$ is determined by a computation with a standard method on a highly resolved grid. We use $u^\epsilon(t, x + \frac{B(t)}{\epsilon})$ as a reference for the accuracy of the HMM approximation $u_H$. In this example, we examine the adaptive Algorithm 1 that is based on the error indicators from Theorem 2.4. We performed six adaptive computations for various choices of $TOL$ and $\sigma_{TOL}$. For computations 1 and 2, we choose $\sigma_{TOL} = 1$ (equal distribution strategy), for computations 3, 4, 5, and 6 we only allow a variation of 80% of the average local error indicator, i.e. $\sigma_{TOL} = 0.8$. We start the algorithm with a uniform initial macro grid of mesh size $H = 2^{-2}$. For all computations, the micro mesh size is set to $h = 2^{-4}$ and the number of time steps is set to $N = 10$ which leads to $\Delta t = 0.05$. Tab. 5 shows that the average EOC of the HMM error is about 2.04, while the EOC for $\Xi^N_H$ tends to 1.79. The reason for not completely reaching second order can be found in the approximation error part and the part depending on $\Delta t$. Only in optimal cases and for very small time step sizes, we find the space EOC for the estimated error to be really quadratic. Figures 2 and 3 show adaptively refined grids of computation 6 for $t = 0.05$ and $t = 0.5$, respectively. Moreover, Fig. 3 shows a perfect match of the isolines of the exact solution and a corresponding adaptive HMM approximation. Tab. 6 shows a comparison between computations on uniformly refined grids and comparable computations using the adaptive strategy. The results clearly indicate a significant advantage of the adaptive computations over the uniform ones. We observe that the adaptive computations are at least 20% and up to 42% faster than comparable computations on uniform grids.
4. Proof of the a posteriori error estimate. In this section we sketch the proof of the a-posteriori error estimate in Theorem 2.4.
The proof follows the ideas originally developed in [58,36] for elliptic multiscale problems and is based on a reformulation of the numerical scheme in the case of periodic homogenization problems. A corresponding reformulation in the current setting has been derived in [37] and was exploited there to derive an a priori error estimate. The subsequent proof of the a posteriori error estimate will then be done in the two-scale variational setting and follows the classical concept for parabolic problems suggested e.g. in [60].
Let Assumption 2.2 be satisfied. Since the coefficients are then periodic in space, we use the notation A h (t, y) := A h (t n , y S ) for y ∈ S. b h and k h are defined analogously. If V 0 t denotes the piecewise constant functions in time, a discrete version where C is a constant only depending on E.
The proof of Lemma 4.5 adapts the ideas presented in [60] for parabolic problems with time-dependent coefficients and is left to the reader.
In the following we specify the test functions in equation (12) as t n z 0 (t, ·) dt and z h (t, x, y) := I h (z 1 (t, x, ·))(y).
Here $I_H$ and $I_h$ denote the corresponding Lagrange interpolation operators. The interpolation is well defined since $(z_0, z_1)$ is sufficiently regular in 1d, 2d and 3d. Therefore, $(Z_H, z_h)$ is an admissible test function in the error identity in Lemma 4.4.
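For orientation, the approximation properties of the Lagrange interpolation operator that the subsequent lemmas invoke are typically of the following standard form. This is only a sketch under assumptions not stated in the text: piecewise linear elements on a shape-regular grid, an element $T$ of diameter $H_T$, a generic function $v$, and a generic constant $C$ independent of $H_T$ (the symbols $H_T$, $v$ and $C$ are introduced here for illustration). The regularity $v \in H^2(T)$ makes point values meaningful for $d \le 3$ through the embedding of $H^2$ into the continuous functions, which is presumably why the dimensions 1d, 2d and 3d are singled out.

```latex
\| v - I_H v \|_{L^2(T)} \le C \, H_T^{2} \, | v |_{H^2(T)},
\qquad
| v - I_H v |_{H^1(T)} \le C \, H_T \, | v |_{H^2(T)},
\qquad v \in H^2(T), \quad d \le 3 .
```

The first bound is the source of the second order in the mesh size that one expects from the spatial part of the estimator.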
Since Remark 4.3 shows that $z_0(t, \cdot)$ . In the following we use this inequality without further mention.

Lemma 4.6. Let $I_H$ denote the Lagrange interpolation operator and let $z_0^{n+1}$ be defined as in (19). Then we have the following estimate:

We see that the expected second-order convergence of the error estimator in (20) is slightly worsened by the factor log t N

Proof. We estimate $|z_0^{n+1}|_{H^2(\mathbb{R}^d)}$ by means of (16) and obtain for $n = N - 1$: and with (17) Using these results and the approximation properties of the Lagrange interpolation operator yields the following: t n ∂ t z 0 (t, ·) L 2 (T ) dt In the last step we made use of equation (18). For simplification, 2C + t is replaced by a single constant C.
Lemma 4.7. Let $z_0^{n+1}$ be given by (19). Then the following estimate holds true:

Proof. For $n = N - 1$ we calculate: Proceeding analogously to the proof of Lemma 4.6 and using equation (18) again gives the result.
Lemma 4.8. With (19), the following estimates hold true:

Rewriting the right-hand side with x yields the estimate. And in complete analogy: Similarly, using (6) and the approximation properties of the Lagrange operator: and

Combining these results, we get the desired a-posteriori error estimate for $\| u_0(t^N, \cdot) - u_H^N \|_{L^2(\mathbb{R}^d)}$ as claimed in Theorem 2.4.

5.
Conclusion. In this contribution we derived an a-posteriori error estimate for the heterogeneous multiscale FEM for advection-diffusion problems with rapidly oscillating coefficient functions and a large expected drift. The estimate provides a basis for error control and adaptive mesh refinement algorithms. To demonstrate the applicability and the advantages of the method, we performed two numerical experiments. In the first one, we analyzed the behaviour of the global error estimate under an increasing stochastic perturbation of the periodic setting. In the second one, we used an adaptive mesh refinement algorithm and analyzed its benefit over computations on uniformly refined grids. Even for the non-periodic diffusion matrix in model problem 2, the method produced good results.
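The two quantities used to judge the estimator in the numerical experiments, the experimental order of convergence (EOC) and the ratio of estimated to true error, can be computed along the following lines. This is a minimal sketch, not the authors' implementation: the function names `eoc` and `overestimation_factor` are hypothetical, the EOC is assumed to follow the usual definition via the logarithm of consecutive error ratios, and the error values are synthetic placeholders rather than data from Tables 1 to 6.

```python
import math


def eoc(errors, mesh_sizes):
    """Experimental orders of convergence between consecutive refinement levels.

    Assumes the usual definition EOC = log(e_i / e_{i+1}) / log(H_i / H_{i+1}).
    """
    if len(errors) != len(mesh_sizes):
        raise ValueError("errors and mesh_sizes must have the same length")
    return [
        math.log(errors[i] / errors[i + 1])
        / math.log(mesh_sizes[i] / mesh_sizes[i + 1])
        for i in range(len(errors) - 1)
    ]


def overestimation_factor(estimated_errors, true_errors):
    """Ratio of estimated error to true error on each refinement level."""
    return [est / err for est, err in zip(estimated_errors, true_errors)]


# Synthetic placeholder data (NOT taken from the paper's tables):
# a second-order error on macro grids H = 2^-2, ..., 2^-5 and an
# estimator that overestimates the error by a constant factor of 8.5.
mesh_sizes = [2.0 ** -k for k in range(2, 6)]
true_errors = [0.5 * H ** 2 for H in mesh_sizes]
estimated_errors = [8.5 * e for e in true_errors]

orders = eoc(true_errors, mesh_sizes)
ratios = overestimation_factor(estimated_errors, true_errors)

print("EOC per refinement step:", orders)
print("estimated / true error:", ratios)
```

A ratio that stays roughly constant across refinement levels, as with the factor of 8 to 9 reported for model problem 1, indicates that the estimator follows the behaviour of the true error.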