A symmetry-adapted numerical scheme for SDEs

We propose a geometric numerical analysis of SDEs admitting Lie symmetries which allows us to individuate a symmetry adapted coordinates system where the given SDE has notable invariant properties. An approximation scheme preserving the symmetry properties of the equation is introduced. Our algorithmic procedure is applied to the family of general linear SDEs for which two theoretical estimates of the numerical forward error are established.


Introduction
The exploitation of special geometric structures in numerical integration of both ordinary and partial differential equations (ODEs and PDEs) is nowadays a mature subject of the numerical analysis often called geometric numerical integration (see e.g. [17,24,28,38]). The importance of this research topic is due to the fact that many differential equations in mathematical applications have some particular geometrical features such as for example a conservation law, a variational origin, an Hamiltonian or symplectic structure, a symmetry structure etc.. The development of geometrically adapted numerical algorithms permits to obtain suitable integration methods which both preserve the qualitative properties of the integrated equations and have a more efficient numerical behaviour with respect to the corresponding standard discretization schemes. In comparison the study of geometric numerical integration of stochastic differential equations (SDEs) is not so well developed. In the current literature the principal aims consist in producing numerical stochastic integrators which are able to preserve the symplectic structure (see e.g. [1,35,40]), some conserved quantities (see e.g. [5,23,32]) or the variational structure (see e.g. [2,3,22,42]) of the considered SDEs. For the study of the algebraic structure of stochastic expansions in order to achive an optimal efficient stochastic integrators at all orders see [11]. Although the exploitation of Lie symmetries of ODEs and PDEs (see e.g. [36]) to obtain better numerical integrators is an active research topic (see e.g. [4,10,31,30] and references therein), the application of the same techniques in the stochastic setting to the best of our knowledge is not yet pursued, probably because the concept of symmetry of a SDE has been quite recently developed (see e.g. [6,9,8,7,16,27,29,33]). In this paper we introduce two different numerical methods taking advantage of the presence of Lie symmetries in a given SDE in order to provide a more efficient numerical integration of it. We first propose the definition of an invariant numerical integrator for a symmetric SDE as a natural generalization of the corresponding concept for an ODE. When one tries to construct general invariant numerical methods in the stochastic framework, in fact, a non trivial problem arises. Since both the SDE solution as well as the Brownian motion driving it are continuous but not differentiable processes, the finite differences discretization could not converge to the SDE solution. We give some necessary and sufficient conditions in order that the two standard numerical methods for SDEs (the Euler and the Milstein schemes) are also invariant numerical methods. By using this result, in particular, we are able to identify a class of privileged coordinates systems where it is convenient to make the discretization procedure. Our second numerical method, based on a well-defined change of the coordinates system, is inspired by the standard techniques of reduction and reconstruction of a SDE with a solvable Lie algebra of symmetries (see [8,26]). Indeed a SDE with a solvable Lie algebra of symmetries can be reduced to a triangular system and, when the number of symmetries is sufficiently high, the latter can be explicitly integrated. In the stochastic setting the explicit integration concept is of course a quite different notion with respect to the deterministic one. Indeed the evaluation of an Ito integral, a necessary step in the reconstruction of a reduced SDE, can only be numerically implemented. We apply our two proposed numerical techniques to the general linear SDEs, being the first nontrivial class of symmetric equations. In this case the two algorithmic methods can be harmonized in such a way to produce the same simple family of best coordinates systems for the discretization procedure. Interestingly, the identified coordinate changes are closely related to the explicit solution formula of linear SDEs. Although the integration formula of linear SDEs is widely known, it is certainly original the recognition of the proposed numerical scheme for linear SDEs as a particular implementation of a general procedure for SDEs with Lie symmetries. We finally point out that the simple case of affine drift and diffusion coefficients plays an important role since any SDE with real analytic drift and diffusion coefficients can be seen locally like it. Moreover we theoretically investigate the numerical advantages of the new numerical scheme for linear SDEs. More precisely we obtain two estimates for the forward numerical error which, in presence of an equilibrium distribution, guarantee that the constructed method is numerically stable for any size of the time step h. This means that for any h > 0 the error does not grow exponentially with the maximum-integration-time T , but it remains finite for T → +∞. This property is not shared by standard explicit or implicit Euler and Milstein methods. The obtained estimates can be considered original results mainly because the coordinate changes involved in the formulation of our numerical scheme are strongly not-Lipschitz, and so the standard convergence theorems can not be applied. Our theoretical results are also numerically illustrated. It is interesting to note that the main part of the theory, in particular the definitions of strong symmetry of a SDE and of numerical scheme, can be easily extended to Stratonovich type SDE driven by general noises ( [6]), for example by exploiting rough paths theory. Unfortunately, since the proofs of Theorem 5.1 and Theorem 5.2 use in essential way the (forward and bachward) Ito formula, the long terms estimates obtained here cannot be straightforwardly generalized to the rough paths driven SDEs framework. At the same time we think that some ideas in the proof of Theorem 5.2 can be suitable exploited to obtain pathwise estimates of long term error in the rough paths setting. The article is structured as follows: in Section 2 we recall the notion of strong symmetry of a SDE and we describe the two standard discretization schemes used in the rest of the paper. In Section 3 we propose two numerical procedures adapted with respect to the Lie symmetries of a SDE. We apply the proposed integration methods to general one and twodimensional linear SDEs in Section 4 . In Section 5 some theoretical estimates showing the stability and efficiency of our adapted-to-symmetries numerical schemes in linear-SDEs are proved. In the last section we expose some numerical experiments confirming the theoretical estimates obtained in the previous section.

Strong symmetries of SDEs
In the following M will be an open subset of R n , and x 1 , ..., x n will be the standard coordinates system of M . If F : M → R k we denote by ∇(F ) the Jacobian of F i.e. the matrix-valued function Furthermore we can identify the vector fields Y ∈ T M with the functions Y : Definition 2.1 Let (Ω, F, F t , P) be a filtered probability space. Let µ and σ be two smooth functions defined on M and valued in n vectors and n × m matrices respectively. A solution to a SDE(µ, σ) is a pair (X, W ) of adapted processes such that i) W is a F t -Brownian motion in R m ; ii) For i = 1, 2, ..., n Remark 2.2 In particular all the integrals are meaningful if a.s.: is said a strong solution if X is adapted to the filtration F W t generated by the BM W and completed with respect to P.
Of course a solution (X, W ) is called a weak solution when it is not strong. In this paper we fix a Brownian motion W , that is we consider only strong solutions of a SDE(µ, σ) and, consequently, we denote them simply by X. For a symmetry analysis via weak solutions of SDEs see [9].
A solution X to a SDE(µ, σ) is a diffusion process admitting as infinitesimal generator: It is particularly useful for obtaining stochastic differentials the following celebrated formula.
Theorem 2.4 (Ito formula) Let X be a solution of the SDE (µ, σ) and let f : M → R be a smooth function. Then F = f (X) has the following stochastic differential We recall important definitions of symmetries of a SDE.
Definition 2.5 (strong finite symmetry) We say that a diffeomorphism Φ is a (strong finite) symmetry of the SDE (µ, σ) if for any solution X to the SDE (µ, σ) also Φ(X) is a solution to the SDE (µ, σ).
By using Ito formula it is immediate to prove the following result.
Theorem 2.6 A diffeomorphism Φ is a symmetry of the SDE (µ, σ) if and only if It is well-known that vector fields acting as infinitesimal generators of one parameter transformation groups are the most important tools in Lie group theory. Definition 2.7 (strong infinitesimal symmetry) A vector field Y is said a (strong infinitesimal) symmetry of the SDE (µ, σ) if the group of the local diffeomorphism Φ a generated by Y is a symmetry of the SDE (µ, σ) for any a ∈ R.
The following determining equations for (any) infinitesimal symmetries are well-known (see, e.g., [16]). For their generalization to a weak solution case see [9].

Numerical integration of SDEs
For convenience of the reader, we recall the two main numerical methods for simulating a SDE and a theorem on the strong convergence of these methods (for a detailed description see e.g. [25]). Consider the SDE having coefficients (µ, σ), driven by the Brownian motion W , and let {t n } n be a partition of [0, T ]. The Euler scheme for the equation (µ, σ) with respect to the given partition is provided by the following sequence of random variables X n ∈ M where ∆t n = t n − t n−1 and ∆W α n = W α tn − W α tn−1 . The Milstein scheme for the same equation (µ, σ) is instead constituted by the sequence of random variablesX n ∈ M such that where ∆W α,β n = tn tn−1 (W β s − W β tn−1 )dW α s . We recall that when m = 1 we have that Theorem 2.9 Let us denote by X t the exact solution of a SDE (µ, σ) and by X N andX N the N-step approximations according with Euler and Milstein scheme respectively. Suppose that the coefficients (µ, σ) are C 2 with bounded derivatives and put t n = nT N and h = T N . Then there exists a constant C(T, µ, σ) such that Furthermore when the coefficients (µ, σ) are C 3 with bounded derivatives then there exists a constant C(T, µ, σ) such that¯ Proof. See Theorem 10.2.2 and Theorem 10.3.5 in [25].
Theorem 2.9 states that X N andX N strongly converge in L 2 (Ω) to the exact solution X T of the SDE (µ, σ), where the order of the convergence with respect to the step size variation h = T N is 1 2 in the Euler case and 1 in the Milstein one. Nevertheless the theorem gives no information on the behaviour of the numerical approximations when we fix the step size h and we vary the final time T . In the standard proof of Theorem 2.9 one estimates the constants C(T, µ, σ) andC(T, µ, σ) by proving that there exist two positive constants K(µ, σ), K (µ, σ) such that C(T, µ, σ) = exp(T · K(µ, σ)) andC(T, µ, σ) = exp(T · K (µ, σ)), by using Gronwall Lemma. In some situations the exponential growth of the error is a correct prediction (see for example [34]). Of course this fact does not mean that in any case the errors n and¯ n exponentially diverge with the time T . Indeed if the SDE (µ, σ) admits an equilibrium distribution it could happen that the two errors remain bounded with respect to the time T . Unfortunately this favorable situation does not happen for any values of the step size h, but only for values within a certain region. The phenomenon just described is known as the stability problem for a discretization method of a SDE. This problem, and the corresponding definition, is usually stated and tested for some specific SDEs (see e.g. [19,41] for the geometric Brownian motion, see e.g. [18,39] the Ornstein-Uhlenbeck process, see e.g. [20,21] for non-linear equations with a Dirac delta equilibrium distribution, and see e.g. [43] for more general situation). We will show some numerical examples of the stability phenomenon for general linear SDEs in Section 6.

Invariant numerical algorithms
When a system of ODEs admits Lie-point symmetries then invariant numerical algorithms can be constructed (see e.g. [31,30,10,4]). By completeness we recall the definition of an invariant numerical scheme for a system of ODEs, in the simple case of one-step algorithms. The obvious extension for multi-step numerical schemes is immediate. The discretization of an ODEs system is a function F : M × R → M such that if x n , x n−1 ∈ M are the n, n − 1 steps respectively and ∆t n is the step size of our discretization we have that If Φ : M → M is a diffeomorphism we say that the discretization defined by the map F is invariant with respect to the map Φ if it happens that Φ(x n ) = F (Φ(x n−1 ), ∆t n ).
If we require that the previous property holds for any x n ∈ R n and for any ∆t n ∈ R + we get for any x ∈ M and ∆t ∈ R. If Φ a is an one-parameter group generated by the vector field which guarantees that the discretization F is invariant with respect to Φ a , generated by Y .
We can extend the previous definition to the case of a SDE in the following way. Let us discuss an integration scheme which depends only on the time ∆t and on the Brownian motion ∆W α n , α = 1, . . . , m (as for example the Euler method). The same discussion for integration methods depending also on ∆W α,β n or other random variables (as the Milstein method) is immediate. In the stochastic case the discretization is a map F : M × R × R m → M and we have Equations (4) and (5) become Since Ito integral strongly depends on the fact that the approximation is backward (and not forward), we stress again that it is not easy to prove that a given discretization X n converges to the real solution of the SDE (µ, σ). For this reason we give a theorem which provides a sufficient (and necessary) condition in order that Euler and Milstein discretizations of a SDE are invariant with respect to any strong symmetries Y 1 , ..., Y r .
Proof. We give the proof for the Euler discretization because for the Milstein discretization the proof is very similar. In the case of Euler discretization we have that The discretization is invariant if and only if Recalling that Y j is a symmetry for the SDE (µ, σ) and therefore it has to satisfy the determining equations (2) and (3), we have that the Euler discretization is invariant if and only if Conversely, suppose that the Euler discretization is invariant and so equality (8) holds. Let x 0 be as in the hypotheses of the theorem and choose ∆t = 0. Then Since ∆W α are arbitrary and span{σ 1 (

Remark 3.2
The affinity of the coefficients Y i j in Theorem 3.1 strongly depends on the fact that Euler and Milstein numerical approximations depend in an affine way from the noise ∆t, ∆W α , ∆W α,β . If we coonsider a non affine numerical approximation we can have non affine symmetries Y 1 , ..., Y r (see the discussion below).
Theorem 3.1 can be fruitfully applied in the following way. If Y 1 , ..., Y r are strong symmetries of a SDE we search a diffeomorphism Φ : have coefficients of first degree in the new coordinates system x 1 , ..., x n . We discretize the transformed SDE Φ(µ, σ) using the Euler discretization, obtaining a discretizatioñ It is easy to prove that if the map Φ is Lipschitz we have that the constructed discretization converges in L 1 to the solution, while if the map Φ is only locally Lipschitz, the weaker convergence in probability can be established. The existence of the diffeomorphism Φ allowing the application of Theorem 3.1 for general Y 1 , ..., Y r is not guaranteed. Furthermore, even when the map Φ exists, unfortunately in general it is not unique. Consider for example the following one-dimensional SDE which has Y = tanh(x)∂ x as a strong symmetry. There are many transformations Φ which are able to put Y with coefficients of first degree, for example the following two transformations: Indeed we have that While the map Φ 1 transforms equation (9) into a geometrical Brownian motion, the transformation Φ 2 reduces equation (9) to a Brownian motion with drift. By applying Euler method by means of Φ 1 we obtain a poor numerical result ( in fact Φ 1 is not a Lipschitz function and in this circumstance errors are amplified). By exploiting Φ 2 to make the discretization we obtain instead an exact simulation. The example shows that this first approach strongly depends on the choice of the diffeomorphism Φ (which has to be invertible in terms of elementary functions). So it is better to have another procedure able to individuate the best coordinate system for performing the SDE discretization.

Adapted coordinates and triangular systems
We introduce a further possible use of Lie's symmetries in the numerical simulation of a SDE which turns out to be relevant only in the stochastic framework. Indeed in the deterministic setting one can obtain a completely explicit result.
, and consider the following triangular SDE The above SDE is triangular in the variables (x 1 1 , ..., x r 1 ). By discretizing a triangular SDE (µ, σ) we reasonable aspect a better behavior than in the general case. Furthermore if X 1 2,t , ..., X n−r 2,t can be exactly simulated with σ i 2,α , µ i 2 growing at most polynomially, we can conjecture that the error grows polynomially with respect to the maximal integration time T .
We recall that the triangular property of stochastic systems is closely related with their symmetries and in particular to SDEs with a solvable Lie algebra of symmetries. In order to briefly explain the connection between symmetries and the triangular form of SDEs, we introduce the following definitions (for more details see [8]).
.., Y r as strong symmetries and let us suppose that Y 1 , ..., Y r constitute a solvable Lie algebra in canonical form. Then the SDE (µ, σ) assumes a triangular form with respect to x 1 , ...., x r .
Proof. The proof is an application of the determing equations and Definition 3.4 (see [8]). As a notable consequence when we have a SDE (µ, σ) admitting a solvable regular Lie algebra Y 1 , ..., Y r we can apply a methodology similar to the one proposed in the previous subsection.
Indeed we can start by searching a map Φ : M → M such that Φ(Y 1 ), ..., Φ(Y r ) constitute a solvable Lie algebra in canonical form so implying that Φ(µ, σ) is a triangular SDE. We can discretize Φ(µ, σ) according with one of standard methods obtaining a discretizationF . By composingF with Φ we obtain a discretization F (x, ∆t, ∆W α ) = Φ −1 (F (Φ(x), ∆t, ∆W α ) which, when Φ is Lipschitz, has the property of being a more simple triangular discretization scheme. Differently from Theorem 3.1, in the present situation we can always construct the diffeomorphism Φ, as the following proposition states. Proposition 3.6 Let G be an r-dimensional solvable Lie algebra on M such that G has constant dimension r as a distribution of T M . Then, for any x 0 ∈ M , there exist a set of generators Y 1 , ..., Y r of G and a local diffeomorphism Φ : Proof. See [8].
We conclude by pointing out that for a general solvable Lie algebra Y 1 , ..., Y r , the map Φ, whose existence is guaranteed by Proposition 3.6, does not transform Φ * (Y 1 ), ..., Φ * (Y r ) into a set of vector fields with coefficients of first degree in x 1 , ..., x n . For this reason and by Theorem 3.1, the discretization F constructed by using the diffeomorphism Φ and the usual Euler discretization algorithm is not invariant with respect to Y 1 , ..., Y r . Nevertheless if we consider solvable Lie algebras satisfying a special relation, then Φ * (Y 1 ), ..., Φ * (Y r ) will have coefficients of first degree in x 1 , ..., x r . (1) and the fact that Φ * (Y 1 ), ..., Φ * (Y r ) are in canonical form, we must have that Φ * (Y k+1 ), ..., Φ * (Y r ) do not depend on x k+1 , ..., x r and their coefficients must be of first degree in x 1 , ..., x r . The second part of the proposition follows from the well known fact that when the vector fields Z 1 , ..., Z r generate an integrable distribution, it is possible to choose a local coordinate system such that the coefficients of Z 1 , ..., Z r do not depend on x r+1 , ..., x n .

General linear SDEs
We first consider the one-dimensional linear SDE where a, b, c, d ∈ R and we apply the procedure previously presented in order to obtain a symmetry adapted discretization scheme.
Although it is possible to prove that equation (10) for ad−bc = 0 does not admit strong symmetries (see [9]), we can look at equation (10) as a part of a two dimensional system admitting Lie symmetries. Let us consider the system on R × R + = M , consisting of the original linear equation and the associated homogeneous one. It is simple to prove, by solving the determining equations (2) and (3), that the system (11) admits the following two strong symmetries: The more general adapted coordinate system system for the symmetries Y 1 , Y 2 is given by where l ∈ R and f : R + → R is a smooth function. Indeed in the coordinate system (x , z ) T = Φ(x, z) we have that In order to guarantee that the Euler and Milstein discretization schemes are invariant, by Theorem 3.1 it is sufficient to choose f (z) = − k z for some constant k. In the new coordinates the original two dimensional SDE becomes In the following, for simplicity, we consider the discretization scheme only for l = 0. The Euler integration scheme becomes: and the Milstein scheme: We note that when k = − d c the two discretization schemes coincide. Coming back to the original problem, in the Euler case we get: (14) and in the Milstein case we obtain: Remark 4.1 There is a deep connection between equations (14) and (15) and the well-known integration formula for scalar linear SDEs. Indeed the equation (10) admits as solution where Equation (14) and (15) can be viewed as the equations obtained by expanding the integrals in formula (16) according with stochastic Taylor's Theorem (see [25]). This fact should not surprise since the adapted coordinates obtained in Subsection 3.2 were introduced exactly to obtain formula (16) from equation (11). Since the discretizations schemes (14) and (15) are closely linked with the exact solution formula of linear SDEs we call them exact methods (or exact discretizations) for the numerical simulation of linear SDEs.
Let us now consider the following two dimensional SDE The previous equation can be solved explicitly. In particular the homogeneous linear part has solution given by (see, e.g. [12]) where µ = α + σ 2 2 . Thus the solution of the initial equation is The Euler discretization of the previous equation becomes: where ∆t n = t n − t n−1 and ∆W i n = W i tn − W i tn−1 .

Theoretical estimation of the numerical forward error for linear SDEs
We provide an explicit estimation of the forward error associated with the exact numerical schemes proposed in the previous section for simulating a general linear SDE. The explicit solution of a linear SDE is well known and the use of the resolutive formula for its simulation is extensively used, but in the literature, to the best of our knowledge, there is no explicit estimation of the forward error.

Enunciates of the Theorems
Dividing [0, T ] in N parts we obtain N + 1 instants t 0 = 0, t n = nh, t N = T , with h = T N . We denote by X N,T t the approximate solution given by exact Euler method,X N,T t the approximate solution with respect to exact Milstein method and by X t the exact solution to the linear SDE. In the following we will omit T where it is possible.
Before giving the proof of the two previous theorems we propose some remarks. We recall that a linear SDE with ad−bc = 0 has an equilibrium distribution if and only if a− c 2 2 < 0. Furthermore the equilibrium distribution admits a finite first moment if and only if a < 0 and a finite second moment if and only if a + c 2 2 < 0. Since we approximate the Ito integral up to the order h 1/2 , the three cases in Theorem 5.1 follow from the fact that for giving an estimate of the error in Euler discretization we need a second moment control. More precisely we can expect a bounded error with respect to T only when the second moment is finite as T → +∞. Since in the Milstein case a finite first moment sufficies, in the second theorem we obtain that the error does not grow with T when a < 0. We can obtain an analogous estimate for the Euler method when d = 0, i.e. in the case in which the Milstein and Euler discretizations coincide (situation similar to the additive-noise-SDEs setting). The use of only the first moment finitess for estimating the error has a price: indeed we obtain an h 1/2 dependence of the error. We remark that the techniques used in the proof of Theorem 5.2 exploit some ideas from the recent rough path integration theory (see e.g. [15]), and in particular this circumstance explains the 1 2 order of convergence. This fact induces us to conjecture that our proof probably works also in the general rough path framework (for example for fractional Brownian motion by following [14]). If in Theorem 5.2 we do not require an uniform-in-time estimate, we can apply the methods used in the proof of Theorem 5.1 for obtaining an error convergence of order 1. Essentially the above theorems prove that for a + c 2 2 < 0 and for a < 0 respectively, our symmetry adapted discretization methods are stable for any value of h. In Section 6 we give a comparison between the stability of the adapted-coordinates schemes with respect to the standard Euler and Milstein ones, via numerical simulations. We conclude by noting that Theorem 5.1 and Theorem 5.2 cannot be deduced in a trivial way from the standard theorems about the convergence of Euler and Milstein methods (such as Theorem 2.9). Indeed the Euler and Milstein discretizations of equations (12) and (13) do not have Lipschitz coefficients. Furthermore even if a given discretization (X n , Z n ) of the system composed by (12) and (13) should converge to the exact solution in L 2 (Ω), being the coordinate change Φ ( introduced in Section 4) not globally Lipschitz, it does not imply that the transformed discretization (X n , Z n ) converges to the exact solution (X, Z) of the equation (11) in L 2 (Ω). Finally, as pointed out in Subsection 2.2, Theorem 2.9 does not guarantee an uniform-in-time convergence as Theorem 5.1 and Theorem 5.2 instead state.
For proving the theorems we need the following two lemmas. The second allows to avoid very long calculations (see Appendix A).

Lemma 5.3
Let W t be a Brownian motion, α, β ∈ R and n ∈ N then for any t ∈ R + is a continuous function of t and in particular it is locally bounded. Moreover we have that Proof. The proof is based on the fact that W t is a normal random variable with zero mean and variance equal to t.
Lemma 5.4 Let F : R 2 → R be a smooth function such that F (0, 0) = 0 and such that for some α ∈ 2N, for any h and for some continuous function L : R → R + . Then there exists an increasing function C : R → R such that If furthermore ∂ w (F )(0, 0) = 0 and there exists an increasing function C : R → R such that Proof. The statements of the lemma derive as special cases from Lemma 5.6.4 and Lemma 5.6.5 in [25].

Proof of Theorem 5.1
We consider the case t = T . In fact we will find that our estimate is uniform for t ≤ T . Using the notations in Remark 4.1 we can write X T = I 1 + I 2 where Also the approximation X N T can be written as the sums of two integrals of the form X N Obviously the strong error N can be estimated by Setting Ψ s,t = Φ t (Φ s ) −1 for any s < t, we obtain (with ∆t i = h) By Jensen's inequality . and by Fubini theorem we have to calculate and Ψ s,t = Ψ s,u Ψ u,t for any s ≤ u ≤ t we obtain that because Ψ t,T and Ψ ti−1,t are independent as a consequence of the Brownian increments independence.

Estimate of
We first consider Since Ito integral involves adapted processes we cannot bring Φ T under the integral sign. However it is possible to take advantage of the backward integral formulation which allows to integrate processes that are measurable with respect to the (future) filtration F t = σ{W s |s ∈ [t, T ]}. In particular when X s is F t -measurable then where {t n i }| i is a sequence of n points partitions of the interval [0, T ], having amplitude decreasing to 0 and the limit is understood in probability. When F is a regular function, F (W t , t) is a process which is measurable with respect to both the filtrations F t and F t ; therefore one can calculate either The next well-known lemma says that we can write I 2 in terms of a backward integral, which allows to bring Φ T under the integral sign.
Proof. We report the proof for convenience of the reader (see, e.g., [37]). Setting since F is C 2 then alsoF is C 2 . From this fact one deduces that By equating the two expressions one obtains the final formula.
Remark 5.6 Since in the proof of Lemma 5.5 the backward Ito integral plays a fundamental role, the same result cannot be obtained within a pathwise approach such as the rough paths one. Since and ∂ w (F )(w, t) = −cF (w, t), by Lemma 5.5, we can write we have that We first consider the term Ĩ 2 −Ĩ N 2 2 . The processĨ N 2 can be written as where 1 (ti−1,ti] is the characteristic function of the interval (t i−1 , t i ]. By Ito's isometry and Fubini's Theorem we obtain Since Brownian motion has independent increments, we have that Introducing the function: which satisfies H(0, 0) = 0, by Lemma 5.4 and Lemma 5.3 we obtain where C 2 (h) is an increasing function and, finally, where In order to estimate the other term in the right-hand side of (20) we note that by introducing we have By applying Lemma 5.5 to K i (t i , W ti ) we can write From the previous equality, by Ito isometry and Minkowski's integral inequality we get Introducing and so, by Lemma 5.4, there exist two continuous increasing functions C 3 (t), C 4 (t) such that Since by independence analogously we can prove that there exists an increasing function C 5 such that For the second term in the right-hand side of (20), we have finally the following estimate where G 1 (T ) and G 2 (T ) are given by (19) and (23) respectively.

Proof of Theorem 5.2
We make the proof only for a < 0, since in the other case the estimate are equal to the Euler case and can be addressed with the same proof. We introduce the two integrals

Estimate of
First we note that (with δt i = h) where we have taken, n ∈ N, 1 2n + 1 α = 1 and 1 < α < 2 such that αa + α(α − 1) c 2 2 ≤ 0 (the last condition guarantees that when T → ∞ we have E[Ψ α ti,T ] → 0). By Jensen's inequality and Lemma 5.4 we can derive the following estimate: where C 5 (h) is an increasing function and in the last inequality we have used the fact that the function F 4 (t, W t ) = Ψ t,h − Ψ 0,h is such that F 4 (0, 0) = 0. By Lemma 5.3, we have that and so

Estimate of
First we note that where α, n are as in the previous subsection. We introduce the following notation where we have used Lemma 5.5 and the fact that we have that It is simple to see that the two norms on the right-hand side of the previous expression do not depend on t i but only on the difference h = t i − t i−1 , so we study the functions (with Ψ ti,ti = 1): By a well-known consequence of Ito isometry (see, e.g., [13]) we can estimate the function Z 1 (h) as: where D n = (n(2n − 1)) n . Since the function satisfies F 5 (0, 0) = ∂ w (F 5 )(0, 0) = 0, by Lemma 5.4 there exists an increasing function C 6 (h) such that As far as concerned the function Z 2 (h), by introducing it is immediate to see that By applying Lemma 5.5 to K(h, W h ), and by noting that K(h, W h ) = 0, we obtain , by Jensen's inequality, Lemma 5.4 and by applying the same techniques used for obtaining (24) we find that with the obvious definition of the function C 9 (h). Finally we have where G 4 (T ) is given by (25).

Numerical examples
We show some numerical experiments which confirm the theoretical estimate proved in Section 5 and permit to study other properties of the new discretization methods introduced in Section 4. We simulate the linear SDE (10) with coefficients a = −2, b = 10, c = 10 e d = 10. The coefficients are such that a + c 2 2 > 0 with a < 0. This means that the considered linear equation admits an equilibrium probability density with finite first moment and infinite second moment. The coefficient d has been chosen big enough to put in evidence the noise effect.
We make a comparison between the Euler and Milstein methods applied directly to equation (10) and the new exact methods (14) and (15) with the constants k = 0 and k = −d c = −1. In particular we observe that when k = −1, the schemes (14) and (15) coincide. We calculate the following two errors: The weak error is estimated trought the explicit expression for the first moment of the linear SDE solution, and by using Monte-Carlo method with 1000000 paths for calculating E[X N t ]. The strong error is estimated by exploiting Monte-Carlo simulation of X t and X N t with 1000000 paths. In order to simulate X t we apply the Milstein method with a steps-size of h = 0.0001, for which we have verified that it gives a good approximation of both E[X t ] and the equilibrium density for t → +∞. Since we use Monte-Carlo methods for estimating E w and E s , the two errors include both the systematic errors of the considered schemes and the statistical errors of the Monte-Carlo estimate procedure. In Figure 1 we report the weak and strong errors with respect to the maximum time of integration t which varies from 0.1 to 1 and stepsize h = 0.025. As predicted by Theorem 5.2, the error of the exact method for k = −1 remains bounded. It is important to note that for the exact method in the case k = 0 (where Theorem 5.1 and Theorem 5.2 do not apply) the errors remains bounded too, while for Euler and Milstein methods the errors grow exponentially with t.
In Figure 2 we report the weak and strong errors with respect to the maximum time of integration t, which varies from 0.1 to 1, and stepsize h = 0.01. In this situation also the errors of the Mistein method remain bounded. In other words h = 0.01 belongs to the stability region of the Milstein method but not to the stability region of the Euler method.
In Figure 3 we plot the weak and strong errors with fixed final time t = 0.5 and steps number N = 10, ..., 80, where the stepsize h = t N . Here we note that the weak and strong errors for the exact methods do not change with the stepsize. This means that with a stepsize of only h = 0.05 the exact methods have weak and strong systematic errors less than the statistical errors. Instead for the Milstein scheme the errors grow and only with a stepsize equal to h = 0.0125 the systematic In Figure 4 we report the total variation distance between the empirical probabilities of X t and of X N t obtained simulating 1000000 paths. We note that there is a big difference between the exact method for k = 0 and for k = −1. The discrepancy is due to the fact that the exact method with k = 0 tends to overestimate the points with probability less then −d c more than the Euler scheme does. Now we simulate the two dimensional linear SDE analized in Section 4 by choosing α = −20, β = −0.5, σ = σ = 5, c = e = 0.1 0.1 T and d = 1 1 T . Our choice of the parameters guarantees the existence of an equilibrium probability density. We compare approximated solutions obtained with the Euler method and with our exact method using h = 0.01. To this end we calculated both the strong and weak componentwise error where X i t is the i−th component of the solution. This time our true solution is calculated using  In Figure 5 and Figure 6 we compare the strong and weak error of both components of the simulated solutions with respect to the maximum time of integration varying from 0.1 to 1. As can be seen the error from our new method is bounded at all times while the Euler method errors show an exponential growth with respect to the maximum time.
In Figure 7 and Figure 8 we compare the errors of both approximations for solutions with T = 1 and timestep size varying between 0.1 to 0.01. As in the previous one-dimensional case we can see how the new exact method gives a good approximation of the true solution even with large timesteps, while the Euler method fails to achieve the same magnitude of error even using a significative smaller timesteps.

Appendix
In the proof of Theorem 5.1, by using Lemma 5.3 and the independence of Brownian increments, we can estimate the errors in a very explicitely way. In particular without exploiting Lemma 5.4. We show main steps and final expressions.
From (17) we obtain that to be compared with (24).