Mori-Zwanzig reduced models for uncertainty quantification

In many time-dependent problems of practical interest the parameters and/or initial conditions entering the equations describing the evolution of the various quantities exhibit uncertainty. One way to address the problem of how this uncertainty impacts the solution is to expand the solution using polynomial chaos expansions and obtain a system of differential equations for the evolution of the expansion coefficients. We present an application of the Mori-Zwanzig (MZ) formalism to the problem of constructing reduced models of such systems of differential equations. In particular, we construct reduced models for a subset of the polynomial chaos expansion coefficients that are needed for a full description of the uncertainty caused by uncertain parameters or initial conditions. Even though the MZ formalism is exact, its straightforward application to the problem of constructing reduced models for estimating uncertainty involves the computation of memory terms whose cost can become prohibitively expensive. For those cases, we present a Markovian reformulation of the MZ formalism which can lead to approximations that can alleviate some of the computational expense while retaining an accuracy advantage over reduced models that discard the memory altogether. Our results support the conclusion that successful reduced models need to include memory effects.


Introduction
The problem of quantifying the uncertainty in the solution of systems of partial or ordinary differential equations has become in recent years a rather active area of research. The realization that, more often than not, the parameters, initial conditions, boundary conditions etc. of problems of practical interest cannot be determined to within high enough accuracy has led to a flourishing literature of methods for quantifying the impact of this uncertainty on the solution of the problems under investigation (see e.g. [5,10,11,12,14,15,1]). However, despite the increase in computational power and the development of various techniques for uncertainty quantification, there is still a wealth of problems where reliable uncertainty quantification is beyond reach. One way to address this problem is to look for reduced models for a subset of the variables needed for a complete description of the uncertainty. The effect of any type of uncertainty is intimately connected with the inherent instabilities that may be present in the underlying system to which the uncertainty is applied. These considerations remain equally, if not more, important when we attempt to construct reduced models for uncertainty quantification.
In the current work, we are concerned with the construction of reduced models for systems of differential equations that arise from polynomial chaos expansions of solutions of a PDE or ODE system. In particular, we focus on the case where the given PDE or ODE system contains uncertain parameters or initial conditions, and we want to construct a reduced model for the evolution of a subset of the polynomial chaos expansion coefficients that are needed for a complete description of the uncertainty caused by the uncertain parameters. There are different methods to construct reduced models for PDE or ODE systems (see e.g. [6,4] and references therein). We choose to use the Mori-Zwanzig formalism to construct the reduced model [2,3].
The main issue with all model reduction approaches is the computation of the memory caused by the process of eliminating variables from the given system (referred to as the full system from this point on) [4]. The memory terms are, in general, integral terms which account for the history of the variables that are not resolved. In principle the integrands appearing in the memory terms can be computed through the solution of the orthogonal dynamics equation [3]. We present some examples where this procedure can be implemented and the resulting reduced model can be estimated. Those examples highlight the definite improvement in accuracy of a reduced model when it includes a memory term. However, it is also easy to come up with examples where the solution of the orthogonal dynamics equation becomes prohibitively expensive.
For such cases we present a Markovian reformulation of the MZ formalism which allows the calculation of the memory terms through the solution of ordinary differential equations instead of the computation of the convolution integrals that appear in the original formulation. We present an algorithm which allows the estimation of the necessary parameters on the fly. This means that one starts evolving the full system and uses it to estimate the reduced model parameters. Once this is achieved, the simulation continues by evolving only the reduced model, with the necessary parameters set equal to their estimated values from the first part of the algorithm. Of course, such an approximation of the memory term cannot work under all circumstances. We present results for a nontrivial problem where it does yield a reduced model with improved behavior compared to a model that ignores the memory terms altogether.
We should note that this alternative approach to computing the memory term fits in the renormalization framework advocated recently by one of the authors [13] in order to construct reduced models for singular PDEs. In particular, the idea is that one embeds the MZ reduced model in a larger family of reduced models which share the same functional form but may have additional parameters for enhanced flexibility. These extra parameters are determined so that the reduced model reproduces some dynamic features of the full system. After this is done, one can switch to the reduced model for the rest of the simulation. In the current work, the extra parameters are the lengths of the memory appearing in the MZ reduced model.

Section 2 presents a brief introduction to the MZ formalism for the construction of reduced models of systems of ODEs. In Section 3 we develop the Markovian reformulation of the MZ formalism and show how one can estimate adaptively the parameters appearing in the reduced model. Section 4 presents numerical results both for the original MZ formalism (Sections 4.1-4.2) and its Markovian reformulation (Section 4.3). Finally, in Section 5 we discuss directions for future work.

Mori-Zwanzig formalism
We begin with a brief presentation of the Mori-Zwanzig formalism [2,3]. Suppose we are given the system of ordinary differential equations

du/dt = R(u),    (1)

where u = ({u_k}), k ∈ H ∪ G, with initial condition u(0) = u_0. The unknown variables (modes) are divided into two groups, one group indexed in H and the other indexed in G. Our goal is to construct a reduced model for the modes in the set H. The system (1) can be transformed into the system of linear partial differential equations

∂ϕ_k/∂t = L ϕ_k,  ϕ_k(u_0, 0) = u_0k,  where  L = Σ_{i ∈ H∪G} R_i(u_0) ∂/∂u_0i.    (2)

The solution of (2) is given by u_k(u_0, t) = ϕ_k(u_0, t). Using semigroup notation we can rewrite (2) as

∂/∂t e^{tL} u_0k = L e^{tL} u_0k.

Suppose that the vector of initial conditions can be divided as u_0 = (û_0, ũ_0), where û_0 is the vector of the resolved variables (those in H) and ũ_0 is the vector of the unresolved variables (those in G). Let P be an orthogonal projection on the space of functions of û_0 and Q = I − P.
Equation (2) can be rewritten as

∂/∂t e^{tL} u_0k = e^{tL} P L u_0k + e^{tQL} QL u_0k + ∫_0^t e^{(t−s)L} P L e^{sQL} QL u_0k ds.    (3)

Equation (3) is the Mori-Zwanzig identity. Note that this relation is exact and is an alternative way of writing the original PDE. It is the starting point of our approximations. Of course, we have one such equation for each of the resolved variables u_k, k ∈ H. The first term in (3) is usually called Markovian since it depends only on the values of the variables at the current instant, the second is called "noise" and the third "memory". If we write e^{tQL} QL u_0k = w_k, then w_k satisfies

∂/∂t w_k(u_0, t) = QL w_k(u_0, t),  w_k(u_0, 0) = QL u_0k.    (5)

If we project (5) we get ∂/∂t P w_k = P QL w_k = 0, since P Q = 0. Also, for the initial condition, P w_k(u_0, 0) = P QL u_0k = 0 by the same argument. Thus, the solution of (5) is at all times orthogonal to the range of P. We call (5) the orthogonal dynamics equation. Since the solutions of the orthogonal dynamics equation remain orthogonal to the range of P, we can project the Mori-Zwanzig equation (3) and find

∂/∂t P e^{tL} u_0k = P e^{tL} P L u_0k + P ∫_0^t e^{(t−s)L} P L e^{sQL} QL u_0k ds.    (6)
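For a linear system du/dt = Au with the projection above and the unresolved initial data set to zero, the memory integrand can be written in closed form: the reduced resolved dynamics are exactly dû/dt = A_HH û + ∫_0^t A_HG e^{(t−s)A_GG} A_GH û(s) ds. The following sketch (our own illustration, not taken from the paper; the matrix A, step sizes and tolerance are arbitrary choices) integrates this reduced equation with a trapezoidal quadrature for the convolution and checks it against the full solution.

```python
import numpy as np

def expm(M, t):
    # matrix exponential e^{tM} via eigendecomposition (adequate for the
    # small, diagonalizable toy matrix below)
    lam, V = np.linalg.eig(M)
    return (V @ np.diag(np.exp(t * lam)) @ np.linalg.inv(V)).real

A = np.array([[-1.0, 0.5],
              [0.3, -2.0]])
a_hh, a_hg, a_gh, a_gg = A[0, 0], A[0, 1], A[1, 0], A[1, 1]

dt, T = 1e-3, 2.0
n = int(T / dt)
t_grid = dt * np.arange(n + 1)
u0 = np.array([1.0, 0.0])                 # unresolved mode starts at zero

# full (reference) solution, resolved component
full = np.array([expm(A, t) @ u0 for t in t_grid])[:, 0]

# exact memory kernel K(s) = A_HG e^{s A_GG} A_GH (a scalar here)
K = a_hg * np.exp(t_grid * a_gg) * a_gh

uh = np.zeros(n + 1)
uh[0] = u0[0]
for i in range(n):
    if i == 0:
        mem = 0.0
    else:
        vals = K[i::-1] * uh[:i + 1]      # K(t-s) * u(s) on the grid
        mem = dt * (vals.sum() - 0.5 * (vals[0] + vals[-1]))  # trapezoid
    uh[i + 1] = uh[i] + dt * (a_hh * uh[i] + mem)

assert abs(uh[-1] - full[-1]) < 1e-2
```

The reduced trajectory with the exact memory kernel reproduces the resolved component of the full solution, while dropping the memory term (setting `mem = 0`) does not.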
We will not present here more details about how to start from Eq. (6) and construct reduced models of different orders for a general system of ODEs. Such constructions have been documented thoroughly elsewhere (see e.g. [3]). However, we will provide such details for the specific numerical examples in Sections 4.1-4.2.

Markovian reformulation of the MZ formalism
While the MZ model given by Eq. (6) is exact, its construction can be involved and, most importantly, very costly. The main source of computational expense is the memory term. Technically, the cost associated with the memory term comes from two sources: i) the presence of the orthogonal dynamics equation solution operator e^{sQL} and ii) the need to find an expression in terms of the resolved variables and time for P L e^{sQL} QL u_0k, which appears in the memory integrand. The presence of e^{sQL} is problematic because the orthogonal dynamics equation is, for the general case, a PDE in as many dimensions as the original system of ODEs. Also, finding an expression for P L e^{sQL} QL u_0k is problematic because, in general, it is not possible to separate the dependence of the expression on time and on the resolved variables. Both are formidable tasks and we will show with several examples how they can increase the cost of constructing the reduced model. For some cases (see e.g. Sections 4.1 and 4.2) both tasks can be tackled through the use of a finite-rank projection for the operator P. However, we will show with a simple example (see Section 4.3) that the use of a finite-rank projection may be too costly itself. For such cases, we need an alternative approach to the construction of the memory term. In this section we describe a reformulation of the problem of computing the memory term which can alleviate some of these issues. Also, we present numerical results from the application of this approach in Section 4.3.

Finite memory
We focus on the case when the memory has a finite extent only. The case of infinite memory is simpler and is a special case of the formulation presented below. Also, the current reformulation allows us to comment on what happens in the case when the memory is very short. Let

w_0k(t) = P ∫_0^t e^{(t−s)L} P L e^{sQL} QL u_0k ds = P ∫_0^t e^{sL} P L e^{(t−s)QL} QL u_0k ds,

by the change of variables t′ = t − s. Note that w_0k depends both on t and on the resolved part of the initial conditions û_0. We have suppressed the û_0 dependence for simplicity of notation. If the memory extends only for t_0 units in the past (with t_0 ≤ t), then

w_0k(t) = P ∫_{t−t_0}^{t} e^{sL} P L e^{(t−s)QL} QL u_0k ds.

The evolution of w_0k is given by

dw_0k/dt = P e^{tL} P L QL u_0k − P e^{(t−t_0)L} P L e^{t_0 QL} QL u_0k + w_1k(t),    (7)

where

w_1k(t) = P ∫_{t−t_0}^{t} e^{sL} P L e^{(t−s)QL} QL QL u_0k ds.

To allow for more flexibility, let us assume that the integrand in the formula for w_1k(t) contributes only for t_1 units with t_1 ≤ t_0. Then w_1k(t) = P ∫_{t−t_1}^{t} e^{sL} P L e^{(t−s)QL} QL QL u_0k ds.
We can proceed and write an equation for the evolution of w_1k(t), which reads

dw_1k/dt = P e^{tL} P L QL QL u_0k − P e^{(t−t_1)L} P L e^{t_1 QL} QL QL u_0k + w_2k(t),    (8)

where

w_2k(t) = P ∫_{t−t_1}^{t} e^{sL} P L e^{(t−s)QL} (QL)^2 QL u_0k ds.

Similarly, if this integral extends only for t_2 units in the past with t_2 ≤ t_1, then

w_2k(t) = P ∫_{t−t_2}^{t} e^{sL} P L e^{(t−s)QL} (QL)^2 QL u_0k ds.

This hierarchy of equations continues indefinitely. Also, we can assume, for more flexibility, that at every level of the hierarchy the interval of integration extends to fewer or the same units of time than the integral in the previous level. If we keep, say, n terms in this hierarchy, the equation for w_(n−1)k(t) will read

dw_(n−1)k/dt = P e^{tL} P L (QL)^{n−1} QL u_0k − P e^{(t−t_{n−1})L} P L e^{t_{n−1} QL} (QL)^{n−1} QL u_0k + w_nk(t).    (9)

Note that the last term in (9) involves the unknown evolution operator for the orthogonal dynamics equation. This situation is the well-known closure problem. We can stop the hierarchy at the nth term by assuming that w_nk(t) = 0. In addition to the closure problem, the unknown evolution operator for the orthogonal dynamics equation appears in the equations for the evolution of the quantities w_0k(t), . . . , w_(n−1)k(t) through the terms P e^{(t−t_0)L} P L e^{t_0 QL} QL u_0k, . . . , P e^{(t−t_{n−1})L} P L e^{t_{n−1} QL} (QL)^{n−1} QL u_0k respectively.
We describe now a way to express these terms involving the unknown orthogonal dynamics operator through known quantities so that we obtain a closed system for the evolution of w 0k (t), . . . , w (n−1)k (t).
Since we want to treat the case where t_0 is not necessarily small, we divide the interval [t − t_0, t] into n_0 subintervals of length Δt_0 = t_0/n_0. Define

w^i_0k(t) = P ∫_{t−iΔt_0}^{t−(i−1)Δt_0} e^{sL} P L e^{(t−s)QL} QL u_0k ds,  i = 1, . . . , n_0,

so that w_0k(t) = Σ_{i=1}^{n_0} w^i_0k(t). In a similar fashion we can define corresponding quantities for all the memory terms up to w_(n−1)k(t), using n_1, . . . , n_{n−1} subintervals of lengths Δt_1 = t_1/n_1, . . . , Δt_{n−1} = t_{n−1}/n_{n−1} respectively. In order to proceed we need to make an approximation for the integrals over the subintervals.

Trapezoidal rule approximation
Applying the trapezoidal rule to each subinterval integral and differentiating in time, using (7) and its analogs at the higher levels of the hierarchy, yields evolution equations for the subinterval quantities whose right-hand sides involve only the values of the integrands at the subinterval endpoints. By dropping the O((Δt_0)^2), . . . , O((Δt_{n−1})^2) terms we obtain a system of n_0 + n_1 + . . . + n_{n−1} differential equations for the evolution of the subinterval quantities at all levels of the hierarchy. This system allows us to determine the memory term w_0k(t). Since the approximation we have used for the integrals leads to an error of O((Δt)^2), the ODE solver should also be second-order accurate. We have used the modified Euler method to solve numerically the equations for the reduced model.
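The modified Euler (Heun) method referred to above is a standard second-order predictor-corrector scheme, matching the O((Δt)^2) accuracy of the trapezoidal quadrature. A minimal generic sketch (our own, not code from the paper):

```python
import math

def heun_step(f, t, y, dt):
    """One step of the modified Euler (Heun) method: Euler predictor
    followed by a trapezoidal corrector; second-order accurate."""
    k1 = f(t, y)
    k2 = f(t + dt, y + dt * k1)
    return y + 0.5 * dt * (k1 + k2)

# sanity check on dy/dt = -y, whose exact solution is exp(-t)
y, t, dt = 1.0, 0.0, 0.01
for _ in range(100):
    y = heun_step(lambda s, v: -v, t, y, dt)
    t += dt
assert abs(y - math.exp(-1.0)) < 1e-4
```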
Note that the implementation of the above scheme requires the knowledge of the expressions for P e tL P LQLu 0k , . . . , P e tL P L(QL) n−1 QLu 0k . Since the computation of these expressions for large n can be rather involved for nonlinear systems (see Section 4.3), we expect that the above scheme will be used with a small to moderate value of n. Finally, we mention that the above construction can be carried out for integration rules of higher order e.g. Simpson's rule.

Estimation of the memory length
The construction presented above relies on an accurate determination of the memory lengths t 0 , t 1 , . . . , t n−1 . We present in this section a way to estimate these quantities on the fly. This means that we start evolving the full system, use it to estimate t 0 , t 1 , . . . , t n−1 and then switch to the reduced model with the estimated values for t 0 , t 1 , . . . , t n−1 .
For simplicity of presentation we assume that we evolve only w_0k(t). If we use the trapezoidal rule to discretize w_0k(t) and eliminate the term P e^{(t−t_0)L} P L e^{t_0 QL} QL u_0k from (7), the reduced model reads

du_k/dt = P e^{tL} P L u_0k + w_0k(t),    (12)
dw_0k/dt = 2 P e^{tL} P L QL u_0k − λ_0 w_0k(t),    (13)

for k ∈ H, where λ_0 = 2/t_0. We can solve (13) formally and substitute in (12) to get

du_k/dt = P e^{tL} P L u_0k + ∫_0^t e^{−λ_0 (t−s)} 2 P e^{sL} P L QL u_0k ds.    (14)

Recall that, for the resolved variables, we have from the full system

dP u_k/dt = P e^{tL} P L u_0k + P e^{tL} QL u_0k.    (15)
We would like to estimate the memory decay parameter t_0 so that the reduced equation (14) for u_k reproduces the behavior of u_k as predicted by the full system (15). We can do that by requiring that the evolution of some integral quantity of the solution is the same when predicted by the reduced and full systems. We begin by discretizing the integral term in (14). Suppose that we are evolving the full system with a step size δt, where t = n_t δt (note that n_t increases as t increases). If we discretize the integral with the trapezoidal rule we find

∫_0^t e^{−λ_0(t−s)} 2 P e^{sL} P L QL u_0k ds ≈ δt [ (1/2) e^{−λ_0 t} f_k(0, û_0) + Σ_{j=1}^{n_t−1} e^{−λ_0 (t−jδt)} f_k(jδt, û_0) + (1/2) f_k(n_t δt, û_0) ],    (16)

where f_k(jδt, û_0) = 2 P e^{jδtL} P L QL u_0k for j = 0, . . . , n_t. The quantities f_k(jδt, û_0) can be computed from the full system.
There is freedom in the choice of the integral quantity whose evolution the reduced model should be able to reproduce. For example, we can use Σ_{k∈H} |P u_k(t)|^2, the squared l^2 norm of the resolved variables. If we use this integral quantity, then from (16) and (15) we find that the unknown parameter t_0 must satisfy

Σ_{k∈H} Re{ \bar{u}_k(t) δt [ (1/2) y^{n_t} f_k(0, û_0) + Σ_{j=1}^{n_t−1} y^{n_t−j} f_k(jδt, û_0) + (1/2) f_k(n_t δt, û_0) ] } = Σ_{k∈H} Re{ \bar{u}_k(t) P e^{tL} QL u_0k },    (17)

where

y = e^{−λ_0 δt}    (18)

and Re{·} denotes the real part.
With this identification, equation (17) becomes a polynomial equation for y with y ∈ [0, 1]. It is not difficult to solve equation (17) with an iterative method, for example Newton's method. For the numerical results we present in Section 4.3, Newton's method converged to double precision accuracy within 4-5 iterations. After an estimate ŷ has been obtained, we can find the estimate t̂_0 of t_0 (recall λ_0 = 2/t_0) from

t̂_0 = −2δt / ln ŷ.    (19)
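A Newton iteration for a polynomial equation with a root in [0, 1] is straightforward to sketch. The polynomial below is a hypothetical stand-in, not the actual left-hand side of (17), which depends on the full-system data f_k:

```python
def newton(g, dg, y0, tol=1e-14, maxit=50):
    """Newton's method; in the paper's setting g would be the polynomial
    equation (17) in y = exp(-lambda_0 * dt), with a root in [0, 1]."""
    y = y0
    for _ in range(maxit):
        step = g(y) / dg(y)
        y -= step
        if abs(step) < tol:
            break
    return y

# hypothetical stand-in polynomial with a single root in [0, 1]
g = lambda y: y**3 + 2.0 * y - 1.0
dg = lambda y: 3.0 * y**2 + 2.0
y_hat = newton(g, dg, 0.5)
assert abs(g(y_hat)) < 1e-12
```

Once ŷ is found, the memory-length estimate follows from t̂_0 = −2δt / ln ŷ.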

Determination of the optimal estimate t̂_0
For each time instant t we can obtain, through equations (17) and (19), an estimate t̂_0(t) of t_0. Thus, the most important issue that we have to address is deciding which is the best estimate of t_0. In other words, at what time t_f should we stop estimating the value of t_0, so that we can use the estimated value t̂_0(t_f) to evolve the reduced model from then on. We define ε(t) = max_{l ∈ [1, n_t]} |ŷ^l(t + δt) − ŷ^l(t)|. The quantity ε(t) monitors the convergence not only of the value of the estimate ŷ as a function of the time t, but of the whole function e^{−λ_0 (t−s)}. Ideally, ε(t) converges to zero with increasing t. That will be the case if the approximation of the memory term through P e^{tL} P L QL u_0k alone is enough (see (12)-(13)). However, this will not always be the case. If keeping P e^{tL} P L QL u_0k is not enough, then ε(t) will decrease with increasing t up to some time t_min when it will reach a nonzero minimum. After that time, it starts increasing. This signals that keeping only P e^{tL} P L QL u_0k is not enough to describe accurately the memory.
In order to proceed we have two options: (i) construct a higher order model or (ii) identify t_f = t_min and thus t̂_0(t_f) = t̂_0(t_min). Results for higher order models will be presented elsewhere (see also the discussion in Section 5). In the numerical experiments we present in the next section we have chosen t̂_0(t_f) = t̂_0(t_min). Note that the procedure just outlined allows the automation of the algorithm. This means that there is no adjustable reduced model parameter that needs to be specified at the onset of the algorithm.
We are now in a position to state the adaptive Mori-Zwanzig algorithm which constructs a reduced model with the necessary memory term parameter t 0 estimated on the fly.
Adaptive Mori-Zwanzig Algorithm 1. Evolve the full system and compute, at every step, the estimate t̂_0(t).
2. When ε(t) reaches a minimum (possibly nonzero) value at some instant t_min, pick t̂_0(t_min) as the final estimate of t_0.
3. For the remaining simulation time (t > t_min), switch from the full system to the reduced model. The reduced model is evolved with the necessary parameter t_0 set to its estimated value t̂_0(t_min).
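The steps above can be sketched schematically as follows. The callables `estimate_t0` and `eps` stand in for the on-the-fly estimator (17)-(19) and the convergence monitor ε(t) produced while evolving the full system; both names are illustrative, not from the paper:

```python
def adaptive_memory_length(estimate_t0, eps, t_grid):
    """Scan the on-the-fly estimates t0_hat(t) produced during the
    full-system run and return the one at the time where the monitor
    eps(t) attains its minimum (step 2 of the adaptive MZ algorithm)."""
    t_min = min(t_grid, key=eps)
    return estimate_t0(t_min)

# toy monitor with an interior minimum at t = 1.0, toy estimator
eps = lambda t: (t - 1.0) ** 2 + 0.1
t0_hat = adaptive_memory_length(lambda t: 0.3 + 0.01 * t, eps, [0.5, 1.0, 1.5])
assert abs(t0_hat - 0.31) < 1e-12
```

After this call, the simulation would switch to the reduced model with the memory length frozen at `t0_hat`.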
This procedure can be extended to the computation of optimal estimates for t 1 , t 2 , . . . , i.e. when we evolve, in addition to w 0k (t), the quantities w 1k (t), w 2k (t), . . . . Results for such higher order models will be presented elsewhere.
Because of the spectral decay in the gPC coefficients, it is natural to choose the coefficients u_i, i = 0, . . . , Λ, of the lower degree Legendre polynomials to be the resolved variables û, and u_i, i = Λ + 1, . . . , M, to be the unresolved variables ũ. To conform with the notation in Section 2, we have H = {0, . . . , Λ} and G = {Λ + 1, . . . , M}. We have chosen M = 6 for the full system and Λ = 1 for the reduced system. The solution of the full system is converged for M = 6 and thus we do not need to keep further terms in the expansion.
The projection P we have chosen is defined as (P f)(û_0) = f(û_0, 0). Also, we define Q = I − P. In order to be able to compute the expressions for the memory terms we use a finite-rank projection P̃ to approximate the projection P. To define the finite-rank projection we need to introduce a measure for the distribution of the coefficients. We consider the coefficients u_0i, i = 0, . . . , M, to be i.i.d. Gaussian random variables with means at the values given initially (see (21)) and a prescribed variance. In the case of the linear ODE, the variance was set to 0.01 for all the variables in the full system. Also, ω is the joint probability measure with respect to these random variables. Then, for a function ϕ_j(u_0, t) of the initial conditions and time, the finite-rank projection reads

(P̃ ϕ_j)(û_0, t) = Σ_{ν ∈ I} [(ϕ_j, h_ν)/(h_ν, h_ν)] h_ν(û_0),    (22)

where h_ν(û_0) are tensor product Hermite polynomials up to some order p, ν is the multi-index ν = (ν_0, . . . , ν_Λ) with |ν| = Σ_{i=0}^{Λ} ν_i, and I is the index set up to order p, i.e., I = {µ : |µ| ≤ p}. The order p for the basis functions was set to 5, for a total of 21 basis functions. In formula (22) the inner product is defined as (f, g) = ∫ f g dω.
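As a one-dimensional illustration of the inner product (f, g) = ∫ f g dω and of the projection coefficients in (22) (our own sketch; the paper uses tensor product Hermite polynomials in Λ + 1 variables and sparse grid quadrature), probabilists' Hermite polynomials with Gauss quadrature for the Gaussian measure can be set up as:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval
from math import factorial

# Gauss quadrature for the standard Gaussian measure (1D slice of omega)
x, w = hermegauss(20)
w = w / np.sqrt(2.0 * np.pi)      # normalize: weights now sum to 1

def He(n, xs):
    """Probabilists' Hermite polynomial He_n evaluated at xs."""
    c = np.zeros(n + 1)
    c[n] = 1.0
    return hermeval(xs, c)

# orthogonality check: (He_m, He_n) = n! * delta_{mn}
for m in range(4):
    for n in range(4):
        ip = np.sum(w * He(m, x) * He(n, x))
        target = factorial(n) if m == n else 0.0
        assert abs(ip - target) < 1e-9

def project(f, p, mean=0.0, std=0.1):
    """Coefficients (f, He_n)/(He_n, He_n) for n = 0..p, with the
    argument of f a Gaussian of given mean and std (variance 0.01)."""
    xi = mean + std * x
    return [np.sum(w * f(xi) * He(n, x)) / factorial(n) for n in range(p + 1)]

coeffs = project(lambda v: v, 2)   # projecting f(xi) = xi
```

For the identity function the only nonzero coefficient is the first-order one, equal to the standard deviation 0.1.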
For each j ≤ Λ, let F_j(u_0, t) = e^{tQL} QL u_0j denote the solution of the orthogonal dynamics equation with initial condition QL u_0j. The Dyson formula

e^{tL} = e^{tQL} + ∫_0^t e^{(t−s)L} P L e^{sQL} ds,    (24)

applied to QL u_0j, gives

F_j(u_0, t) = e^{tL} QL u_0j − ∫_0^t e^{(t−s)L} P L F_j(u_0, s) ds.    (25)

Eq. (25) is a Volterra integral equation for F_j(u_0, t). To proceed, we replace the projection operator P with the finite-rank projection operator P̃ and find

P̃ L F_j(u_0, s) = Σ_ν a_ν^j(s) h_ν(û_0),    (26)

where a_ν^j(s) = (L F_j(u_0, s), h_ν(û_0)). We substitute e^{(t−s)L} P̃ L F_j(u_0, s) for e^{(t−s)L} P L F_j(u_0, s) in Eq. (25), multiply both sides by L and take the inner product with h_µ(û_0); the result is (dropping the approximation sign)

a_µ^j(t) = f_µ^j(t) − ∫_0^t Σ_ν a_ν^j(s) g_νµ(t − s) ds,    (27)

a Volterra integral equation for the functions a_ν^j(t), where f_µ^j(t) = (L e^{tL} F_j(u_0, 0), h_µ(û_0)) and g_νµ(t) = (L e^{tL} h_ν(û_0), h_µ(û_0)).
The functions f_µ^j(t), g_νµ(t) can be found by averaging over a collection of experiments or simulations, with initial conditions drawn from the initial distribution. In this example, we use a sparse grid quadrature rule for the multi-dimensional integrals [16].
Finally, we perform one more projection to eliminate the noise term (see Section 2). After calculating a_µ^i and γ_µν we obtain the reduced system (29), where A and Γ are the matrix forms of a_µ^i and γ_µν, and û_0 is the initial condition of the resolved variables. Fig. 1 shows the evolution of the memory kernel (L e^{tQL} QL u_1, h_{01}), which is indicative of the behavior of the memory kernels. The basis function h_{01} is the product of the zeroth-order Hermite polynomial in the variable u_0 and the first-order Hermite polynomial in the variable u_1. We see that the memory kernel is rather slowly decaying, which means that the resulting reduced order model will have a long memory. Fig. 2 shows the solution for the resolved variables as predicted by the full system and two different reduced order models: the Markovian model which results from dropping the memory term in (29), and the non-Markovian reduced model given by (29). It is obvious from Fig. 2 that the Markovian model loses accuracy quickly. On the other hand, the non-Markovian model retains its accuracy for the length of the simulation interval. This difference in behavior is quantified in Fig. 3, where we see that for both resolved variables the relative error of the Markovian model becomes greater than 50% by the end of the simulation interval. On the other hand, the error of the non-Markovian model remains less than 1% for the whole simulation interval.

Nonlinearly damped and randomly forced particle
Consider the following equation describing a particle moving in a double-well potential and driven by a force term (see [16]), where f = sin(t + t_0)ξ and ξ ∼ U[−1, 1]. We use polynomials of order M = 6 in ξ to approximate the full system solution up to time 10. We want to construct a reduced model for the first two coefficients of the polynomial expansion (Λ = 1). As before, we let u(t, ξ) ≈ Σ_{i=0}^{M} u_i(t) φ_i(ξ) and obtain through Galerkin projection the system, where e_jkmi = ∫ φ_j(ξ) φ_k(ξ) φ_m(ξ) φ_i(ξ) (1/2) dξ and u_0i = u_• for i = 0, and 0 otherwise. Let R be the vector with R_i = u_i − Σ_{j,k,m=0}^{M} u_j u_k u_m e_jkmi + f_i. In order to apply the MZ formalism we need an autonomous system of equations to begin with. For this purpose, we introduce an auxiliary time variable τ such that τ = t and dτ/dt = 1. The projection operator P projects onto the function space of the first two coefficients and τ. Again, we use the finite-rank projection onto the function space spanned by Hermite polynomials up to order 3 to represent the orthogonal dynamics (a total of 10 functions) and solve the Volterra equation for the memory kernels as we did for the linear ODE example. The variance of the Gaussian variables used to define the inner product for the finite-rank projection was set to 10^{−2i−2} for the coefficient u_i, i = 0, 1, . . . , 6. The reason we used a decreasing sequence of variances as we go up in the order of the Legendre polynomials is to stabilize the behavior of the reduced model.
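Since the integrands are polynomials, the quadruple products e_jkmi can be precomputed exactly by Gauss-Legendre quadrature. A small sketch (our own) for the M = 6 case:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

# e_{jkmi} = (1/2) * integral_{-1}^{1} P_j P_k P_m P_i dxi
x, w = leggauss(16)       # exact for polynomial degree <= 31; 4*6 = 24 here

def P(n, xs):
    c = np.zeros(n + 1)
    c[n] = 1.0
    return legval(xs, c)

M = 6
e = np.zeros((M + 1,) * 4)
for j in range(M + 1):
    for k in range(M + 1):
        for m in range(M + 1):
            for i in range(M + 1):
                e[j, k, m, i] = 0.5 * np.sum(
                    w * P(j, x) * P(k, x) * P(m, x) * P(i, x))

# basic checks: e_0000 = 1 and e_1100 = E[P_1^2] = 1/3
assert abs(e[0, 0, 0, 0] - 1.0) < 1e-12
assert abs(e[1, 1, 0, 0] - 1.0 / 3.0) < 1e-12
```

The tensor needs to be computed only once and is reused at every step of the Galerkin system.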
As can be seen from Fig. 5, the difference between the (memoryless) Markovian and non-Markovian reduced models is even more pronounced than in the case of the linear ODE. The inclusion of the memory term is indeed crucial for maintaining the accuracy of the reduced model for long times. For the case of the resolved variable u_1, the relative error spikes at a couple of points even for the otherwise very accurate non-Markovian reduced model. As can be seen from Fig. 4, this is because the exact value of u_1 becomes zero at these points, so that the relative error becomes very large even for an accurate approximation. However, the significant improvement in accuracy with the inclusion of the memory term is evident in Fig. 5, which plots the error on a logarithmic scale.

Viscous 1D Burgers with uncertain initial conditions
In this section we show how the above MZ formulation can be used for uncertainty quantification for the one-dimensional Burgers equation with uncertain initial conditions. As explained at the end of this section, the calculation of the MZ memory term cannot proceed as in the last two examples: it is prohibitively expensive due to the number of basis functions needed. Thus, we will apply the alternative construction that was presented in Section 3. The equation is given by

∂u/∂t + u ∂u/∂x = ν ∂²u/∂x²,    (32)

where ν > 0. Equation (32) should be supplemented with an initial condition u(x, 0) = u_0(x) and boundary conditions. We solve (32) in the interval [0, 2π] with periodic boundary conditions. This allows us to expand the solution in the Fourier series u(x, t, ξ) = Σ_{k∈F} u_k(t, ξ) e^{ikx}, which, after substitution in (32), leads to the system

du_k/dt = −(ik/2) Σ_{p+q=k; p,q∈F} u_p u_q − ν k² u_k,  k ∈ F.    (33)

We assume that the initial condition u_0(x) is uncertain (random) and can be expanded as u_0(x, ξ) = (α_0 + α_1 ξ) v_0(x), where ξ is uniformly distributed in [−1, 1] and v_0(x) is a given function. In the numerical experiments we have taken α_0 = α_1 = 1 and v_0(x) = sin x. Thus, the initial condition varies "uniformly" between the functions 0 and 2 sin x.
To proceed we expand the solution u_k(t, ξ), k ∈ F, in a polynomial chaos expansion using the standard Legendre polynomials, which are orthogonal in the interval [−1, 1]. For each wavenumber k we keep the first M polynomials:

u_k(t, ξ) ≈ Σ_{i=0}^{M−1} u_ki(t) φ_i(ξ),    (34)

where φ_i(ξ) is the standard Legendre polynomial of order i. Similarly, the initial condition can be written as u_0(x, ξ) = sin x Σ_{i=0}^{1} α_i φ_i(ξ), since φ_0(ξ) = 1 and φ_1(ξ) = ξ. Substitution of (34) in (33) and use of the orthogonality property of the Legendre polynomials gives

du_kr/dt = −(ik/2) (2r+1) Σ_{p+q=k; p,q∈F} Σ_{l,m=0}^{M−1} u_pl u_qm E[φ_l φ_m φ_r] − ν k² u_kr,    (36)

for k ∈ F and r = 0, . . . , M − 1, where the expectation E[·] is taken with respect to the uniform density on [−1, 1] and we have used E[φ_r²] = 1/(2r+1). The Legendre polynomial triple product integral defines a tensor with the following sparsity pattern: E[φ_l φ_m φ_r] = 0 if l + m < r, or l + r < m, or m + r < l, or l + m + r is odd [7]. Due to this sparsity pattern, for a given value of M only about 1/4 of the M³ tensor entries are different from zero.

Before we proceed we have to comment on the cost of applying the MZ formalism to construct a reduced model. We have set the viscosity coefficient to ν = 0.03. The solution of the full system was computed with N = 196 Fourier modes (F = [−98, 97]) and the first 7 Legendre polynomials (M = 7). The first 7 Legendre polynomials were enough to obtain converged statistics for the full system. We want to construct reduced models for the evolution of the coefficients of the first 2 Legendre polynomials, i.e., u_k0, u_k1 for k ∈ F. If we wanted to apply the MZ formalism in the way we did for the previous two examples (employing a finite-rank projection etc.) we would need to construct a basis in 2 × 98 dimensions (exploiting the fact that the solution of the Burgers equation is real-valued). Any attempt to use basis functions up to a high order is infeasible for such a high-dimensional situation.
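The sparsity pattern of the Legendre triple products stated above can be checked numerically; a small sketch (our own), which also counts the fraction of nonzero entries:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

x, w = leggauss(12)        # exact for polynomial degree <= 23; 3*6 = 18 here

def phi(n, xs):
    c = np.zeros(n + 1)
    c[n] = 1.0
    return legval(xs, c)

M = 7
T = np.zeros((M, M, M))
for l in range(M):
    for m in range(M):
        for r in range(M):
            # E[phi_l phi_m phi_r] for the uniform density on [-1, 1]
            T[l, m, r] = 0.5 * np.sum(w * phi(l, x) * phi(m, x) * phi(r, x))

nonzero = 0
for l in range(M):
    for m in range(M):
        for r in range(M):
            zero_by_rule = ((l + m < r) or (l + r < m) or (m + r < l)
                            or ((l + m + r) % 2 == 1))
            if zero_by_rule:
                assert abs(T[l, m, r]) < 1e-13
            if abs(T[l, m, r]) > 1e-13:
                nonzero += 1
# for M = 7 roughly 30% of the entries survive; the fraction tends
# toward 1/4 as M grows (parity removes 1/2, the triangle rule 1/2)
```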
We have attempted to use only low order basis functions but they are not enough to guarantee accuracy of the reduced model. Thus, we turn to the reformulated reduced model that was presented in Section 3.

Reformulated MZ reduced model
To conform with the Mori-Zwanzig formalism we set û = (u_kr), k ∈ F, r = 0, . . . , Λ − 1, and ũ = (u_kr), k ∈ F, r = Λ, . . . , M − 1. In other words, we resolve, for all the Fourier modes, only the first Λ of the Legendre expansion coefficients, and we shall construct a reduced model for them.
The system (36) is supplemented by the initial condition u_0 = (û_0, ũ_0). We focus on initial conditions where the unresolved variables are set to zero, i.e. u_0 = (û_0, 0). We also define the operator L = Σ_{k∈F} Σ_{r=0}^{M−1} R_kr(u_0) ∂/∂u_0kr, where R_kr denotes the right-hand side of (36). To construct a MZ reduced model we need to define a projection operator P. For a function h(u_0) of all the variables, the projection operator we will use is defined by P(h(u_0)) = P(h(û_0, ũ_0)) = h(û_0, 0), i.e. it replaces the value of the unresolved variables ũ_0 in any function h(u_0) by zero. Note that this choice of projection is consistent with the initial conditions we have chosen. Also, we define the Markovian term P e^{tL} P L u_0kr; it has the same functional form as the RHS of the full system but is restricted to a sum over only the first Λ Legendre expansion coefficients for each Fourier mode.
For the term P L QL u_0kr we find an explicit expression in terms of the resolved variables û_0 and the Legendre triple product tensor. Finally, to implement any method of solving equation (17) for the estimation of t_0, we need to specify the RHS of equation (17). This requires the evaluation of the expression P e^{tL} QL u_0kr. Note that since we restrict attention to initial conditions for which the unresolved variables are zero, and the projection sets the unresolved variables to zero, the quantity P e^{tL} QL u_0kr can be computed through the evolution of the full system (36).
The full system was solved with the modified Euler method with δt = 0.001. The reduced model uses N = 196 Fourier modes but only the first two Legendre polynomials, so Λ = 2. It was also solved with the modified Euler method with δt = 0.001. The parameter t_0 needed for the evolution of the memory term was found to be 0.3783 through the procedure described in Section 3.3.1. Figure 6 shows the evolution of the mean energy of the solution,

E[Σ_{k∈F} 2π |u_k(t)|²] = Σ_{k∈F} Σ_{r=0}^{M−1} 2π |u_kr(t)|² / (2r+1),

as computed from the full system (with M = 7 Legendre polynomials), the MZ reduced model with Λ = 2 without memory (keeping only the Markovian term) and the MZ reduced model with Λ = 2 with memory. Figure 7 shows the evolution of the standard deviation of the energy of the solution, computed similarly from the chaos expansion coefficients. The reduced model performs equally well with or without memory. Of course, the reduced model with memory is slower than the reduced model without memory. However, the reduced model with memory is still about 4 times faster than the full system. Figure 8 shows the evolution of the mean squared l² norm of the gradient of the solution as computed from the full system (with M = 7 Legendre polynomials), the MZ reduced model with Λ = 1 without memory (keeping only the Markovian term) and the MZ reduced model with Λ = 1 with memory. Figure 9 shows the evolution of the corresponding standard deviation. The large values of the standard deviation of the mean squared l² norm of the gradient are justified by the uncertainty in the initial condition. Recall that we have chosen an initial condition which can vary "uniformly" between the functions 0 and 2 sin x. As a result, the standard deviation is large because it has to account for a wide range of possible initial conditions. It is evident from the figures that the inclusion of the memory term improves the performance of the reduced model. Also, it is evident that there is room for improvement of the reduced model with memory.
In particular, more terms are needed in the reformulated MZ model to better approximate the memory.
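For reference, the mean of the energy can be read off directly from the chaos coefficients via the orthogonality of the Legendre polynomials, while the standard deviation can be checked by Monte Carlo sampling of the random variable ξ. The sketch below uses random stand-in coefficients $u_{kr}$; the shapes and values are illustrative assumptions, not data from the paper:

```python
import numpy as np
from numpy.polynomial.legendre import legval

# Statistics of the energy from Fourier-Legendre chaos coefficients u_{kr}.
# For xi ~ U[-1,1], the Legendre polynomials satisfy E[L_r L_s] = d_rs/(2r+1).
rng = np.random.default_rng(1)
N, M = 8, 4                           # Fourier modes, Legendre polynomials
u = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))

# Mean energy: sum_{k,r} 2*pi/(2r+1) |u_{kr}|^2
# (Parseval in x on [0, 2*pi], Legendre orthogonality in xi).
r = np.arange(M)
mean_energy = np.sum(2 * np.pi / (2 * r + 1) * np.abs(u) ** 2)

# Cross-check the mean (and estimate the spread) by Monte Carlo over xi:
xi = rng.uniform(-1.0, 1.0, 200_000)
Lvals = np.array([legval(xi, np.eye(M)[j]) for j in range(M)])  # (M, samples)
u_xi = u @ Lvals                       # u_k(xi) = sum_r u_{kr} L_r(xi)
energy_samples = 2 * np.pi * np.sum(np.abs(u_xi) ** 2, axis=0)
print(mean_energy, energy_samples.mean(), energy_samples.std())
```

The sampled mean agrees with the closed-form expression, and the sample standard deviation is the quantity plotted in Figure 7 for the actual Burgers coefficients.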
Recall that the solution operator of the Burgers equation is a contraction [9]. Eventually, the complete description of the uncertainty caused by the uncertainty in the initial condition requires only a few polynomial chaos expansion coefficients. This happens on a time scale that is dictated by the magnitude of the viscosity coefficient. That is why, for long times, the reduced models with and without memory behave comparably to the full system. However, for short times, the inclusion of the memory term does make a difference, because information from the higher chaos expansion coefficients is needed. The higher chaos expansion coefficients will have a more prolonged contribution for systems that possess unstable modes. In such cases, the inclusion of the memory term becomes imperative for short as well as long times. Results for such cases will be presented elsewhere.

Discussion and future work
We have examined the application of the Mori-Zwanzig formalism to the problem of constructing reduced models for uncertainty quantification. In particular, we have constructed reduced models for subsets of the polynomial chaos expansion coefficients needed to describe fully the uncertainty. We have examined cases of parametric or initial condition uncertainty. The main conclusion from the current work is that while the MZ formalism can be applied for the construction of reduced models, the task of constructing an efficient (or even feasible) reduced model can be involved. For cases where the straightforward application of the MZ formalism is not possible, we have offered an alternative construction. The implementation of this alternative construction is reminiscent of renormalization constructions used to describe the evolution of complex solutions of PDEs [13].
The current work opens several directions for future work. First, we should investigate whether there is a more economical way of choosing the basis functions for cases when the basis functions have many arguments (as was the case for the Burgers example). This is important because the calculation of the memory kernels through the finite-rank projection is well defined and the solution of the corresponding Volterra equations can be performed with high accuracy. A related question is whether there is sparsity in the coefficients of the basis functions. It is plausible that even though in principle the number of basis functions to reach a specific order may be very large, many of them may not contribute to the representation. A related approach would be the use of machine learning algorithms to obtain a more efficient representation of the memory term. Finally, a related issue to be investigated is how to ensure the stability of the reduced model when the finite-rank projection is employed. For example, for the nonlinearly damped and forced particle case, we had to assign smaller variances for the higher coefficients to stabilize the reduced model. This procedure needs to be investigated and, if possible, automated.
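The sparsity question can be probed cheaply: fit a sampled quantity against a large basis and count the coefficients that actually contribute. The toy below is a hypothetical stand-in (a cubic target fitted with twelve Legendre basis functions), not a computation from the paper:

```python
import numpy as np
from numpy.polynomial.legendre import legvander

# Toy probe of sparsity in the basis-function coefficients: fit a sampled
# quantity (standing in for a memory-term sample) against 12 Legendre basis
# functions and count the coefficients that actually contribute.
t = np.linspace(0.0, 1.0, 200)
x = 2.0 * t - 1.0                        # map to the Legendre interval [-1, 1]
target = 0.8 * t - 0.3 * t**3            # cubic stand-in: few active terms
basis = legvander(x, 11)                 # Legendre polynomials L_0 .. L_11
coef, *_ = np.linalg.lstsq(basis, target, rcond=None)
active = np.abs(coef) > 1e-6 * np.abs(coef).max()
print(active.sum(), "of", coef.size, "coefficients contribute")
```

Here only the coefficients up to degree three survive the threshold, while the remaining ones are numerically negligible; a similar diagnostic applied to the memory kernels would indicate whether a pruned basis suffices.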
A second interesting research direction has to do with the representation of the memory term when the finite-rank projection is not possible due to a prohibitively large number of basis functions. We have explored here an expansion of the memory term that involves, in essence, a Taylor expansion of the orthogonal dynamics operator. Such an expansion seems more plausible when the timescale of the orthogonal dynamics is slower than that of the resolved variables. However, there is an alternative way of performing the expansion of the memory term that is more suited to the case when the orthogonal dynamics is faster than the resolved variables. Such an expansion leads to a Taylor expansion of the whole memory term, not just the orthogonal dynamics operator. If the memory kernel becomes insignificant after a time interval t 0 , then one can use the full system up to time t 0 , estimate the Taylor expansion of the whole memory term around time t 0 and then switch to the reduced model with the memory given by the Taylor expansion. We will investigate this alternative memory representation and report the results elsewhere.
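As a rough numerical illustration of the idea, suppose the memory term m(t) has been computed from the full system up to t 0; its derivatives at t 0 can then be estimated by one-sided finite differences and the memory extended past t 0 by the resulting Taylor polynomial. The smooth m(t) below is a hypothetical stand-in, chosen only so the approximation can be checked against an exact value:

```python
import numpy as np

def m(t):
    """Hypothetical stand-in for a memory term computed up to t0."""
    return np.exp(-2.0 * t) * np.sin(t)

t0, h = 0.5, 1e-3

# One-sided (backward) finite differences using only samples with t <= t0,
# since the full system is available only up to t0.
m0 = m(t0)
d1 = (3 * m(t0) - 4 * m(t0 - h) + m(t0 - 2 * h)) / (2 * h)          # m'(t0)
d2 = (2 * m(t0) - 5 * m(t0 - h) + 4 * m(t0 - 2 * h) - m(t0 - 3 * h)) / h**2

def taylor(t):
    """Second-order Taylor extension of the memory term past t0."""
    return m0 + d1 * (t - t0) + 0.5 * d2 * (t - t0) ** 2

print(abs(taylor(0.6) - m(0.6)))
```

For this smooth stand-in the second-order extension stays accurate a short distance past t 0, which is exactly the regime where the reduced model would take over from the full system.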