On the Smoluchowski-Kramers approximation for SPDEs and its interplay with large deviations and long time behavior

We discuss here the validity of the small mass limit (the so-called Smoluchowski-Kramers approximation) on a fixed time interval for a class of semi-linear stochastic wave equations, both in the presence of a constant friction term and in the presence of a constant magnetic field. We also consider the small mass limit on an infinite time interval and we show how the approximation is stable in terms of the invariant measure, of the large deviation estimates, and of the exit problem from a bounded domain of the space of square integrable functions.


Introduction
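The motion of a particle of mass $\mu$ in the field $b(q) + \sigma(q)\dot{W}$, with a constant damping proportional to the speed, is described, according to Newton's law, by the Langevin equation
$$\mu\,\ddot{q}^{\mu}_t = b(q^{\mu}_t) + \sigma(q^{\mu}_t)\,\dot{W}_t - \dot{q}^{\mu}_t, \qquad q^{\mu}_0 = q \in \mathbb{R}^n, \quad \dot{q}^{\mu}_0 = p \in \mathbb{R}^n \tag{1.1}$$
(for the sake of simplicity the friction coefficient is taken equal to 1).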
Here $b(q)$ is the deterministic component of the force and $\sigma(q)\dot{W}_t$ is the stochastic part, where $\dot{W}_t$ is the standard Gaussian white noise in $\mathbb{R}^n$ and $\sigma(q)$ is an $n \times n$ matrix. It is known that, for $0 < \mu \ll 1$, the random position $q^{\mu}_t$ of the particle can be approximated by the solution of the first order equation
$$\dot{q}_t = b(q_t) + \sigma(q_t)\,\dot{W}_t, \qquad q_0 = q \in \mathbb{R}^n, \tag{1.2}$$
in the sense that
$$\lim_{\mu \downarrow 0} \mathbb{P}\Big( \max_{0 \le t \le T} |q^{\mu}_t - q_t| > \delta \Big) = 0, \tag{1.3}$$
for any fixed $0 \le T < \infty$ and $\delta > 0$. Statement (1.3) is called the Smoluchowski-Kramers approximation of $q^{\mu}_t$ by $q_t$ (see [36,51,26]). This statement justifies the description of the motion of a small particle by the first order equation (1.2) instead of the second order equation (1.1). Several authors have considered generalizations of this phenomenon in the presence of a magnetic field (see [9,38]) and for a non-constant friction, both in the case where it is strictly positive, as in [27,34,33], and in the case where it is possibly vanishing, as in [28]. Large deviations and exit problems in the small noise regime have also been studied (see [14]).
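The limit (1.3) can be observed numerically in a one-dimensional toy case. The following is only a minimal sketch under assumptions that are not taken from the text: the drift $b(q) = -q$, the constant $\sigma = 1$, the initial data, the time step and the semi-implicit discretization (function names `drift`, `simulate`, `max_gap`) are all illustrative choices. The friction term is treated implicitly, so that for $\mu = 0$ the scheme reduces to the Euler-Maruyama scheme for (1.2), and both equations are driven by the same Brownian increments.

```python
import math
import random


def drift(q):
    """Deterministic force b(q); a linear restoring force is assumed here."""
    return -q


def simulate(mu, noise, dt, q0=1.0, p0=0.0, sigma=1.0):
    """Position path for mass mu >= 0, driven by the given Brownian increments.

    The friction term is treated implicitly, so the scheme stays stable for
    small mu; for mu = 0 it reduces to Euler-Maruyama for the first order
    equation.
    """
    q, p = q0, p0
    path = [q]
    for dw in noise:
        # Discretization of mu * dp = (b(q) - p) dt + sigma dW, implicit in p.
        p = (mu * p + drift(q) * dt + sigma * dw) / (mu + dt)
        q = q + p * dt
        path.append(q)
    return path


def max_gap(mu, noise, dt):
    """Maximum over the time grid of |q^mu_t - q_t| along one noise path."""
    second_order = simulate(mu, noise, dt)
    first_order = simulate(0.0, noise, dt)
    return max(abs(a - b) for a, b in zip(second_order, first_order))


rng = random.Random(0)
n_steps, T = 1000, 1.0
dt = T / n_steps
noise = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)]
for mu in (0.1, 0.01, 0.001):
    print(f"mu = {mu:<6} max gap = {max_gap(mu, noise, dt):.4f}")
```

Along a typical noise path the printed gap shrinks as $\mu$ decreases. This illustrates (1.3) for a single sample only and is of course not a substitute for the convergence in probability stated there.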
In the present paper, we review some results on the Smoluchowski-Kramers approximation for systems with an infinite number of degrees of freedom, which the three authors have obtained in a series of papers written in the last several years (see [6,7,11,12,13]).
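Let $O$ be a bounded smooth domain of $\mathbb{R}^d$, with $d \ge 1$. We are dealing here with the following stochastic semi-linear damped wave equation on $O$
$$\begin{cases} \mu \dfrac{\partial^2 u_\mu}{\partial t^2}(t,x) = \Delta u_\mu(t,x) - \dfrac{\partial u_\mu}{\partial t}(t,x) + b(x, u_\mu(t,x)) + g(x, u_\mu(t,x))\, \dfrac{\partial w^Q}{\partial t}(t,x), & x \in O,\ t \ge 0, \\[2mm] u_\mu(0,x) = u_0(x), \quad \dfrac{\partial u_\mu}{\partial t}(0,x) = v_0(x), \quad u_\mu(t,x) = 0, & x \in \partial O. \end{cases} \tag{1.4}$$
Equation (1.4) models the displacement of an elastic material with mass density $\mu > 0$ in the region $O$, which is exposed to deterministic and random forces. The term $-\partial u/\partial t$ models the damping, the Laplacian $\Delta$ models the forces that neighboring particles exert on each other, and the non-linearity $b$ models some deterministic forcing. State-dependent stochastic perturbations are modeled by the term $g\,\partial w^Q/\partial t$, for some Wiener process $w^Q$ that is white in time and $Q$-correlated in space (see Section ?? for all definitions), and for some nonlinear coefficient $g$.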
Many authors have studied stochastic wave equations under various assumptions on the non-linear coefficients b and g, on the correlation Q of the noise and on the domain O and the boundary conditions (see for example [3,16,17,35,41,42,44,45,49,50]). We will specify our hypotheses in the following sections.
In this paper we explore the asymptotic behavior of the solution to (1.4) as the mass density $\mu$ of the material vanishes. This is the infinite dimensional analogue of the Smoluchowski-Kramers approximation described in (1.3). In Sections 3 and 4, where the cases of additive and multiplicative noise are considered, respectively, we review the results of [6,7] to show that, as $\mu \to 0$, the solution of equation (1.4) converges to the solution of the stochastic heat equation
$$\begin{cases} \dfrac{\partial u}{\partial t}(t,x) = \Delta u(t,x) + b(x, u(t,x)) + g(x, u(t,x))\, \dfrac{\partial w^Q}{\partial t}(t,x), & x \in O,\ t \ge 0, \\[2mm] u(0,x) = u_0(x), \quad u(t,x) = 0, & x \in \partial O. \end{cases} \tag{1.5}$$
Specifically, if we denote by $u_\mu$ and $u$ the solutions to (1.4) and (1.5), respectively, we show that for any fixed $T > 0$ and $\delta > 0$,
$$\lim_{\mu \to 0} \mathbb{P}\Big( |u_\mu - u|_{C([0,T]; L^2(O))} > \delta \Big) = 0. \tag{1.6}$$
This means that when the mass density of a material is small, the stochastic heat equation approximates the stochastic wave equation well on any finite time interval. The proof of this infinite dimensional Smoluchowski-Kramers approximation requires uniform estimates on the Sobolev regularity in space and the Hölder regularity in time for the solutions to the stochastic wave equation (1.4). As is well known, such uniform bounds are used to establish tightness in an appropriate functional space. Once we obtain a weakly convergent subsequence by the Prokhorov theorem, the identification of the unique limit, and hence the convergence of the whole sequence, is obtained by using a non-trivial integration-by-parts formula for SPDEs.
In Section 5 we review the results of [11], which concern the Smoluchowski-Kramers approximation for an electrically charged material in the presence of a uniform magnetic field. In this case, the displacement of the material is modeled by equation (1.7), where $\vec{m} = (0, 0, m)$ is a constant vector field that is perpendicular to the plane of motion of the material and that models a magnetic field (notice that here the symbol $\times$ denotes the usual vector product in $\mathbb{R}^3$). The material is assumed to be electrically charged, and therefore equation (1.7) describes the movement of an elastic material with constant mass density $\mu > 0$ that is exposed to the magnetic field, some deterministic forcing $b$, and a state-dependent stochastic forcing $g\,\partial w^Q/\partial t$. One might hope, based on the results from Sections 3 and 4, that, for any fixed $T > 0$ and $\delta > 0$, a limit analogous to (1.6) holds, where $u_\mu$ is the solution to equation (1.7) and $u$ is the solution to the first order equation formally obtained by setting $\mu = 0$, which involves the inverse $J_0^{-1}$ of a constant matrix $J_0$. Unfortunately, because of the presence of the stochastic term, the small-mass limit (1.6) is no longer true. A similar situation was explored in [9], where it was shown that, in the case of a system with a finite number of degrees of freedom, the difference $u_\mu - u$ does not converge to zero as $\mu$ tends to zero. Thus, in Section 5 we study the problem of the small mass limit by adding a small friction $-\varepsilon\,\partial u/\partial t$ to equation (1.7), and we show that, for any fixed $\varepsilon > 0$, the solution to (1.7) with fixed positive friction converges to the solution of the system of semi-linear stochastic heat equations formally obtained by setting $\mu = 0$.
Here, we use a slightly different line of argument than in Sections 3 and 4, which allows us to prove the convergence in $L^p(\Omega; C([0,T]; L^2(O)))$. This means that, if $u^\varepsilon_\mu$ is the solution to the wave equation exposed to a magnetic field and a friction of intensity $\varepsilon$, and if $u_\varepsilon$ is the solution to the system of stochastic heat equations formally obtained by setting $\mu = 0$, then for any fixed $p \ge 1$ and $T > 0$,
$$\lim_{\mu \to 0} \mathbb{E}\, |u^\varepsilon_\mu - u_\varepsilon|^p_{C([0,T]; L^2(O))} = 0.$$
Notice that the $L^p$ convergence can also be proven in the case where there is no magnetic field, as long as $b$ and $g$ are suitably regular.
In Sections 6, 7, and 8, we investigate the multi-scale interactions between the small mass limit and long-time behaviors of the stochastic wave equation. In Sections 3 to 5, we demonstrate that the solutions to (1.4) converge to the solution to (1.5) in the topology of C([0, T ]; L 2 (O)). In fact (1.6) is only true for fixed finite T > 0 and, on longer time scales, the solutions will deviate arbitrarily far apart in a pathwise sense. But, despite the fact that the Smoluchowski-Kramers approximation is not valid in a pathwise sense on long-time scales, there are still many important long-time characteristics of the small-mass wave equation that are approximated by the heat equation.
In Section 6 we review some results from [6] concerning the relationship between the Smoluchowski-Kramers approximation and invariant measures for equations (1.4) and (1.5), in the case of gradient systems. We recall here that by gradient system we mean a system where the noise is additive (that is g ≡ I) and, if B : then there exists a potential functional F : L 2 (O) → R, such that for any h ∈ L 2 (O) where DF is the Fréchet derivative of F in L 2 (O) and Q is the correlation of the noise.
Due to the fact that (1.4) is a gradient system, by using suitable finite dimensional approximations, we show that the invariant measure for the pair (u µ , ∂u µ /∂t) solving (1.4) in the space where Z is a normalizing constant, independent of µ > 0. This means that we have an explicit expression of the density of the invariant measure with respect to a suitable Gaussian measure on the product space L 2 (O) × H −1 (O). In particular, due to the special form of ν µ , its first marginal Π 1 ν µ does not depend on µ > 0 and coincides with the probability measure ν(du) : which is the invariant measure of system (1.5). In particular, we have that the Smoluchowski-Kramers approximation is valid in the sense that the stochastic heat equation and the damped stochastic wave equation have the same long-time behavior. The case of non-gradient systems is still open, and of course we cannot expect to have any explicit expression for the invariant measures of systems (1.4) and (1.5). Nevertheless, we expect that if ν µ and ν denote the invariant measures of (1.4) and (1.5), respectively, then some sort of convergence for Π 1 ν µ to ν holds, in the small mass µ limit.
In Sections 7 and 8 we review results from [13] and [12] about the relationship between the small mass and the small noise asymptotics. More precisely, we consider for any $\varepsilon > 0$ the stochastic damped wave equation (1.10), with appropriate initial and boundary conditions, and the corresponding heat equation (1.11), with the same initial and boundary conditions. We are interested in the multiscale behavior of system (1.10), as both $\mu$ and $\varepsilon$ go to zero. From the earlier results, it is clear that if we first take the limit as $\mu \to 0$ and then as $\varepsilon \to 0$, the large deviation principle for the heat equation should describe the behavior of the system. Our purpose here is to show that when the order of the two limits is reversed, and we first take the limit as $\varepsilon \to 0$ and then as $\mu \to 0$, the large deviation principle for the heat equation is still appropriate for studying the long-time behavior of the wave equation. In particular, this result provides a rigorous mathematical justification of what is done in applications when, in order to study rare events and transitions between metastable states for the more complicated system (1.10), as well as exit times from basins of attraction and the corresponding exit places, one considers the relevant quantities associated with the large deviations for system (1.11).
Because of the dissipation introduced by the friction term, under suitable conditions on $b$, the solution to the unperturbed version of (1.10) (that is, with $\varepsilon = 0$) converges to 0 as $t \to +\infty$. When the system is exposed to small random perturbations (that is, when $0 < \varepsilon \ll 1$), the solution deviates from this equilibrium point on long time scales. It is thus of interest to study exit times of the form
$$\tau^{\mu,\varepsilon} := \inf\{ t \ge 0 : u^\varepsilon_\mu(t) \notin D \},$$
where $D \subset L^2(O)$ contains the equilibrium solution 0.
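To give a feeling for the exponential growth of such exit times as the noise vanishes, one can look at a one-dimensional first order toy analogue. This is only a sketch under assumptions that are not taken from the text: the equation $dq = -q\,dt + \sqrt{\varepsilon}\,dW$, the domain $D = (-1,1)$, the time step, the cap `t_max` and the number of sampled paths are all illustrative choices. For this toy equation the quasi-potential is $V(x) = x^2$, so the classical finite dimensional theory predicts $\varepsilon \log \mathbb{E}\,\tau^\varepsilon \to 1$ as $\varepsilon \to 0$.

```python
import math
import random


def exit_time(eps, rng, dt=0.01, radius=1.0, t_max=2000.0):
    """First time the Euler-Maruyama path of dq = -q dt + sqrt(eps) dW,
    started at the equilibrium 0, leaves the interval (-radius, radius).

    Returns a value of at least t_max if no exit is observed earlier
    (a cap assumed for this sketch only).
    """
    q, t = 0.0, 0.0
    scale = math.sqrt(eps * dt)
    while abs(q) < radius and t < t_max:
        q += -q * dt + scale * rng.gauss(0.0, 1.0)
        t += dt
    return t


def mean_exit_time(eps, n_paths=40, seed=1):
    """Monte Carlo estimate of the expected exit time for noise intensity eps."""
    rng = random.Random(seed)
    return sum(exit_time(eps, rng) for _ in range(n_paths)) / n_paths


for eps in (1.0, 0.5, 0.25):
    tau = mean_exit_time(eps)
    print(f"eps = {eps:<5} mean exit time = {tau:8.2f}  eps*log(mean) = {eps * math.log(tau):.2f}")
```

The estimated mean exit time grows quickly as $\varepsilon$ decreases, in line with the logarithmic asymptotics governed by the quasi-potential. With so few paths and such moderate values of $\varepsilon$, the printed values of $\varepsilon \log \mathbb{E}\,\tau^\varepsilon$ are only rough indications of the limit.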
By extending to this infinite dimensional setting well known results in the theory of large deviations for finite dimensional systems (see [23,29]), in Section 8 we show that the logarithmic exit time asymptotics, as well as the logarithmic expected value of the exit time and the exit position $u^\varepsilon_\mu(\tau^{\mu,\varepsilon}, \cdot)$, can be characterized by the quasi-potential $\bar{V}_\mu$. More precisely, we show that, for small enough fixed $\mu > 0$ and appropriate initial conditions We also prove that if $N \subset \partial D$ has the property that inf (1.14) Thus, due to the role played by the quasi-potential in the description of these important asymptotic features of the system, our purpose here is to compare the quasi-potential $V_\mu(u, v)$ associated with (1.10) with the quasi-potential $V(u)$ associated with (1.11), and to show that for any closed set (1.15) This means that, in the description of the large deviation principle, taking first the limit as $\varepsilon \downarrow 0$ (large deviation) and then the limit as $\mu \downarrow 0$ (Smoluchowski-Kramers approximation) is the same as taking first the limit as $\mu \downarrow 0$ and then as $\varepsilon \downarrow 0$.
In Section 7, we address this problem in the particular case where system (1.10) is of gradient type. As for the invariant measures studied in Section 6, in the case of gradient systems all relevant quantities associated with the large deviations can be explicitly computed. In particular, we show that for any $\mu > 0$ for any $(u, v) \in \mathrm{Dom}((-\Delta)^{1/2} Q^{-1}) \times \mathrm{Dom}(Q^{-1})$. Therefore, as from (1.16) we have that for any $\mu > 0$, In particular, this means that $\bar{V}_\mu(u)$ does not coincide with $V(u)$ only in the limit, but for any fixed $\mu > 0$.
In the general non-gradient case considered in Section 8, the situation is considerably more delicate and we cannot expect anything explicit as in (1.16). The lack of an explicit expression for V µ (u, v) and V (u) makes the proof of (1.15) much more difficult and requires the introduction of new arguments and techniques.
The first key idea in order to prove (1.15) is to characterize V µ (u, v) as the minimum value for a suitable functional. We recall that the quasi-potential V µ (u, v) is defined as the minimum energy required to the system to go from the asymptotically stable equilibrium 0 where is the large deviation action functional and z µ ψ = (u µ ψ , ∂u µ ψ /∂t) is a mild solution of the skeleton equation associated with equation (1.10), with control ψ ∈ L 2 ((0, T ); L 2 (O)), By working thoroughly with the skeleton equation (1.18), we show that, for small enough µ > 0, where the minimum is taken over all z ∈ C((−∞, 0]; L 2 (O)×H −1 (O)). In particular, we get that the level sets of V µ andV µ are compact in L 2 (O) × H −1 (O) and L 2 (O), respectively. Moreover, we show that both V µ andV µ are well defined and continuous in suitable Sobolev spaces of functions. The second key idea is based on the fact that, as in [14] where the finite dimensional case is studied, for all functions z ∈ C((−∞, 0]; L 2 (O) × H −1 (O)) that are regular enough, if we denote ϕ(t) = Π 1 z(t), we have (1.20) Thus, ifz µ is the minimizer ofV µ (u), whose existence is guaranteed by (1.19), and ifz µ has enough regularity to guarantee that all terms in (1.20) are meaningful, we obtain In the same way, ifφ is a minimizer for V (u) and is regular enough, then  (1.23). Thus, we have to proceed with suitable approximations, which, among other things, require us to prove the continuity of the mappingsV µ : Dom((−∆) 1/2 Q −1 ) → R, uniformly with respect to µ ∈ (0, 1]. In the second part of Section 8, we apply (1.15) to the study of the exit time and of the exit place of u µ from a given domain in L 2 (D) . If is the exit time from D for the solution of (1.11), and V (u) is the quasi-potential associated with this system, the exit time and exit place results for the first-order system are analogous to (1.12), (1.13), and (1.14).
As a consequence of (1.17), in the gradient case, (1.12), and (1.13) imply that, for any fixed µ > 0, the exit time and exit place asymptotics of (1.11) match those of (1.10). In particular, for any µ > 0 (e.g. see [37,Theorem 6.3.2]) and then, by interpolation, we get (for a proof see [37,Lemma 6.3.1]). This implies For any δ ∈ R, we denote by H δ (O) the completion of C ∞ 0 (O) with respect to the norm Here and in what follows, for each h ∈ H δ (O) we shall denote by h k the k-th Fourier Next, for any δ ∈ R we denote by H δ the Hilbert space For any µ > 0 and δ ∈ R, we define the unbounded operator A µ by setting The operators A µ defined on different H δ are all consistent. It is known that A µ is the generator of a group of bounded linear transformations {S µ (t)} t∈ R on H δ which is strongly continuous (for a proof see e.g. [48, section 7.4]). This means that for any (u 0 , v 0 ) ∈ H δ and for any µ > 0, S µ (t)(u 0 , v 0 ) is the solution of the deterministic linear system which can be written as the following abstract evolution problem in H δ where z(t) := (u(t), v(t)).
Next, we notice that the adjoint operator to $A_\mu$ is given by In what follows we shall denote by $\{S^\star_\mu(t)\}_{t \ge 0}$ the semigroup generated by $A^\star_\mu$.
If we set Π 1 (u, v) := u and Π 2 (u, v) := v we have that where the pair (f µ k (t; u, v), g µ k (t; u, v)) is for each k ∈ N and µ > 0 the solution of the system (2.1) In fact, both f µ k and g µ k can be explicitly computed.
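As an illustration of how such a pair can be computed explicitly, the sketch below assumes that, for the eigenvalue $-\alpha_k$ of the Laplacian, the mode system has the form $f' = g$, $\mu g' = -\alpha_k f - g$, $f(0) = u$, $g(0) = v$, which is what the deterministic damped linear wave equation gives mode by mode. This form, the function name `mode_solution` and the sample parameters are assumptions made for the example and are not copied from (2.1) or (2.2). The solution is written in terms of the two roots of $\mu\lambda^2 + \lambda + \alpha_k = 0$.

```python
import cmath
import math


def mode_solution(t, mu, alpha, u, v):
    """Closed-form (f(t), g(t)) for f' = g, mu*g' = -alpha*f - g,
    with f(0) = u and g(0) = v (assumed form of the mode system).

    The roots of mu*lam**2 + lam + alpha = 0 are real for 4*mu*alpha < 1
    and complex conjugate for 4*mu*alpha > 1; the degenerate case
    4*mu*alpha == 1 is not handled in this sketch.
    """
    disc = cmath.sqrt(1.0 - 4.0 * mu * alpha)
    lam_plus = (-1.0 + disc) / (2.0 * mu)
    lam_minus = (-1.0 - disc) / (2.0 * mu)
    # Coefficients fixed by the initial conditions f(0) = u, f'(0) = v.
    c_plus = (v - lam_minus * u) / (lam_plus - lam_minus)
    c_minus = (lam_plus * u - v) / (lam_plus - lam_minus)
    e_plus, e_minus = cmath.exp(lam_plus * t), cmath.exp(lam_minus * t)
    f = c_plus * e_plus + c_minus * e_minus
    g = c_plus * lam_plus * e_plus + c_minus * lam_minus * e_minus
    return f.real, g.real


alpha = 2.0
for mu in (1e-1, 1e-2, 1e-3):
    f1, _ = mode_solution(1.0, mu, alpha, 1.0, 0.0)
    print(f"mu = {mu:<6} f(1) = {f1:.5f}   heat mode exp(-alpha) = {math.exp(-alpha):.5f}")
```

In this example, both roots have negative real part for every $\mu > 0$, so each mode decays exponentially, and for small $\mu$ the first component approaches the heat equation mode $u\,e^{-\alpha_k t}$.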
Moreover, in view of the explicit formula (2.2), it is possible to prove the following bounds for $f^\mu_k(t; u, v)$. Lemma 2.2. [6, (3.12) and (3.13) in Lemma 3.2] For every $k \in \mathbb{N}$, $\mu > 0$ and $t \ge 0$, we have Moreover, for any $k \in \mathbb{N}$, $\mu > 0$, $\theta \in [0, 1]$ and $t > s$, we have An important consequence of Proposition 2.1 is the following result on the asymptotic behavior of $S_\mu(t)$. Proposition 2.3. [6, Proposition 2.4] For any fixed $\mu > 0$ and for any $\delta \in \mathbb{R}$ the semigroup $\{S_\mu(t)\}_{t \ge 0}$ is of negative type in $\mathcal{H}_\delta$. This means that there exist some $\omega_\mu > 0$ and $M_\mu > 0$ such that
$$\|S_\mu(t)\|_{L(\mathcal{H}_\delta)} \le M_\mu\, e^{-\omega_\mu t}, \qquad t \ge 0.$$
In fact, we have the following representation for the dual semigroup $S^\star_\mu(t)$ in terms of the semigroup $S_\mu(t)$.
In the case O = [0, L], we can prove the following integral representation of S µ (t) in terms of a suitable kernel K µ (t, x, y).
The kernel $K_\mu(t, x, y)$ satisfies some regularity properties both in the time and in the space variables, as shown in the next lemma. Lemma 2.6. [7, Lemma 2.3] Let $\delta < 1/2$. Then, for any $\rho < 1/2 - \delta$ there exists some constant $c_\rho > 0$ such that for any $0 \le r \le t$ and $x, y$

The approximation in the case of additive noise
We are concerned here with the following stochastic damped semi-linear wave equation (3.1) Our aim is to prove that the solution $u_\mu(t)$ converges to the solution of the stochastic semi-linear heat equation (3.2) as the parameter $\mu$ converges to zero.
Here and in what follows, w Q (t, x) is a cylindrical Wiener process. We shall assume that for any h, k ∈ H and t, s ≥ 0 for some operator Q ∈ L(H). We will assume that Q satisfies the following condition. In general it holds |e k | ∞ ≤ c k α , k ∈ N,

In case
(and this is true for several reasonable domains) (3.4) becomes Thus, in dimension d = 1 condition (3.4) is fulfilled by a white noise in space and time. As soon as one goes to higher dimension, this of course is no longer possible. It is important to stress, however, that if the sup-norms of the eigenfunctions e k are equi-bounded, condition (3.4) does not require a noise with trace-class covariance, no matter how large the space dimension is.
Concerning the nonlinearity b, we shall assume the following condition Hypothesis 2. The mapping b :Ō × R → R is measurable and for some positive constant L. Moreover

Estimates for the stochastic convolution
For each $\mu > 0$, we consider the linear problem (3.5) It is well known that if condition (3.4) holds for some $\theta \in \mathbb{R}$, then for any $\mu > 0$ there exists a unique solution $\eta_\mu$ to problem (3.5) such that for any $T > 0$ and $p \ge 1$ (for a proof we refer for example to [18] and [17]). Our aim here is to prove that if the constant $\theta$ above is strictly positive (as in Hypothesis 1), then for any $\delta < \theta/2$ the process $\eta_\mu$ has a version which is $\delta$-Hölder continuous with respect to $t \ge 0$ and $\xi \in \bar{O}$, and the moments of the $\delta$-Hölder norms of $\eta_\mu$ are equi-bounded with respect to $\mu > 0$. Namely, we prove the following result.
Moreover, for any p ≥ 1 Due to (3.3) and Hypothesis 1, the cylindrical Wiener process w Q (t, x) can be written as where {e k } k∈ N is the complete orthonormal basis in H which diagonalizes ∆ and {β k (t)} k∈ N is a sequence of mutually independent standard Brownian motions all defined on some stochastic basis (Ω, F, F t , P). This means that for all (t, x) ∈ [0, ∞) ×Ō we have where, for each k ∈ N, η µ k (t) is the solution of the one dimensional problem Notice that the second order equation (3.6) can be rewritten more rigorously as the following system η µ k (0) = 0, θ µ k (0) = 0. Then, by the variation of constants formula, it is immediate to check that with f µ k and g µ k defined as the solutions of system (2.1), with initial conditions f (0) = 0 and g(0) = 1.
Our aim here is obtaining estimates for η µ (t) which are independent of µ > 0. We start with a uniform estimate for the mean-square of η µ (t, x) − η µ (s, y), for each t, s ≥ 0 and x, y ∈Ō.
for any t ≥ 0 and x, y ∈Ō. Moreover, there exists a constant c 2 > 0 such that for any t, s ≥ 0 and x ∈Ō.
Once we have the uniform estimates (3.7) and (3.8), the proof of Proposition 3.2 is straightforward. Actually, since for any t, s ≥ 0 and x, y ∈Ō the random variable η µ (t, x)− η µ (s, y) is Gaussian, according to Lemma 3.3 for any p ≥ 1 there exists a constant c p such that Now, if we fix any δ ∈ (0, θ/2), there exists p δ ≥ 1 such that Thus, if for any p ≥ p δ we define we have This implies that η µ belongs to W α,p ([0, T ] × O), P-a.s., so that, due to the Sobolev embedding theorem, there exists a version of η µ belonging to C δ ([0, T ] ×Ō), P-a.s. Moreover, there exists some c T,p independent of µ > 0 such that

The convergence result
According to Proposition 2.1 for any constant c > 0 and any k ∈ N and hence, integrating with respect to t ≥ 0 we get This implies that the families of functions are equi-bounded, and by the Ascoli-Arzelà theorem we have for any T > 0. Now, for any µ > 0 and δ ∈ [0, 1], we define the operators and and then, thanks to Hypothesis 2 for any T > 0, and Note that with these notations, the weak solution η µ of problem (3.5) studied in Subsection 3.1 can be written as Due to the Lipschitz continuity of b, and hence of B µ , it is possible to prove the following well-posedness result for problem (3.1), for every µ > 0 fixed (for a proof see e.g. [19,Theorem 5.3.1]). Proposition 3.5. Assume Hypotheses 1 and 2. Then for any µ > 0 and for any initial data u 0 ∈ H θ (O) and v 0 ∈ H θ−1 (O), there exists a unique mild solution z µ (t) to problem (3.1). Moreover, for any T > 0 and p ≥ 1 there exists c p,µ (T ) > 0 such that Once one has the well-posedness of equation (  To this purpose, the following integration by parts formula holds. Concerning the remainder term R µ (t) defined in (3.13) we have the following limiting result.
Lemma 3.8. Under the same hypotheses of Lemma 3.7 we have

By the tightness of the family $\{u_\mu\}_{\mu \in (0,1]}$, the Skorokhod theorem assures that for any two sequences $\{\mu_n\}_n$ and $\{\mu_m\}_m$ converging to zero there exist subsequences $\{\mu_{n(k)}\}_{k \in \mathbb{N}}$ and $\{\mu_{m(k)}\}_{k \in \mathbb{N}}$ and a sequence of random elements $\rho_k := (u^k_1, u^k_2, \hat{w}^k_Q)$, defined on some probability space $(\hat{\Omega}, \hat{\mathcal{F}}, \hat{\mathbb{P}})$, such that the law of $\rho_k$ coincides with the law of $(u_{\mu_{n(k)}}, u_{\mu_{m(k)}}, w^Q)$, for each $k \in \mathbb{N}$, and $\rho_k$ converges $\hat{\mathbb{P}}$-a.s. to some random element $\rho := (u_1, u_2, \hat{w}_Q) \in C$. Now, if we show that $u_1 = u_2$, we have that there exists some $u \in C([0,T]; H)$ such that $u_\mu$ converges to $u$ in probability. Actually, as observed by Gyöngy and Krylov in [32], if $E$ is any Polish space equipped with the Borel $\sigma$-algebra, a sequence $\{\rho_n\}$ of $E$-valued random variables converges in probability if and only if for every pair of subsequences $\{\rho_m\}$ and $\{\rho_l\}$ there exists an $E^2$-valued subsequence $w_k := (\rho_{m(k)}, \rho_{l(k)})$ converging weakly to a random variable $w$ supported on the diagonal $\{(h, k) \in E^2 : h = k\}$.

Note that both $u^k_1$ and $u^k_2$ solve equation (3.1) with $w^Q$ replaced by $\hat{w}^k_Q$. Then they both verify formula (3.12), with $R^k_1$ and $R^k_2$ obtained by replacing $u_\mu$ with $u^k_1$ and $u^k_2$, respectively, and $w^Q$ with $\hat{w}^k_Q$. According to Lemma 3.8, both $R^k_1$ and $R^k_2$ converge to zero in $L^2(\hat{\Omega})$, as $\mu_{n(k)}$ and $\mu_{m(k)}$ go to zero, and then, possibly for a subsequence, they converge $\hat{\mathbb{P}}$-a.s. to zero. Due to formula (3.12) and then they coincide with the solution of the semi-linear heat equation perturbed by the noise $\hat{w}_Q$, which is unique. As we have recalled above, thanks to the Gyöngy-Krylov remark in [32] this implies that $u_\mu$ converges in probability to some random variable $u \in C([0,T]; H)$. Moreover, by using again formula (3.12) and Lemma 3.8, we have that $u$ solves the heat equation (3.2).
We have just proved the main result of this section.

The approximation in the case of multiplicative noise
We are dealing here with the following damped semi-linear wave equation perturbed by multiplicative noise in the interval $[0, L]$ (4.1) In the present section, we assume that $w(t, x)$ is a cylindrical Wiener process, white in space and time. The coefficients $b$ and $g$ are measurable from $[0, L] \times \mathbb{R}$ with values in $\mathbb{R}$, and $g$ is Lipschitz continuous in the second variable, uniformly with respect to the first one. The coefficient $b$ is assumed either to be Lipschitz continuous in the second variable, uniformly with respect to the first one, or to satisfy some polynomial growth and dissipation conditions in the spirit of the Klein-Gordon model (and in this second case $g$ is assumed to be bounded).
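As in the case of Lipschitz continuous $b$ and additive noise in any space dimension, considered in Section 3, we want to show that for any $\varepsilon > 0$ and $T > 0$
$$\lim_{\mu \to 0} \mathbb{P}\Big( |u_\mu - u|_{C([0,T]; L^2(0,L))} > \varepsilon \Big) = 0,$$
where $u$ is the solution of the corresponding stochastic heat equation.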

The coefficients b and g
Concerning the coefficients $b$ and $g$, as we already mentioned above, we shall consider two different cases. The first one is described here.
In particular from the assumptions above we have that both b and g have linear growth in the second variable, uniformly with respect to the first. Namely for some constant c > 0. As we are assuming that b(x, ·) is Lipschitz continuous, uniformly with respect to x ∈ [0, L], we have seen that the mapping B µ defined in (3.9) is Lipschitz continuous from H δ into itself. Now, for any µ > 0 and δ ∈ [0, 1], we define Due to Hypothesis 3, the mapping G µ (·)h : The second case that we shall consider is described below.
In this second case the mapping b(x, ·) : R → R is no more Lipschitz continuous and does not necessarily have sublinear growth. From (4.6) we obtain and then according to (4.5) it follows (4.8) Moreover, due to (4.5), for any (x, σ) ∈ [0, L] × R we have Now, by proceeding as in [42], where the particular case of the Klein-Gordon equation is considered, we approximate b by means of Lipschitz continuous mappings, by setting for , so that the mapping B n,µ defined by is Lipschitz continuous from H δ into itself, for any δ ∈ [0, 1], and for any (u, v) ∈ H δ it holds for some constant c not depending on n.
Next, for any n ∈ N we set Due to (4.6), (4.8) and (4.9) it is possible to show that there exists some n 0 > 0 such that for any n ≥ n 0 and (x, for some constant c not depending on n. In particular, as b n coincides with b on [0, L] × [−n, n], for any n > 0, an analogous inequality is fulfilled by b and β on [0, L] × R. As far as the diffusion coefficient G µ is concerned, due to (4.7) in this second case the mapping G µ (·)h : H δ → H δ is bounded, for any fixed h ∈ L ∞ (0, L).

Uniform bounds for the stochastic convolution
For any µ > 0 and T > 0 and for any z ∈ L p (Ω; C([0, T ]; H δ )) we define Our aim in this section is proving some a-priori estimates for Γ µ (z), which are uniform with respect to µ ∈ The estimates above for the H δ (0, L)-norm of Π 1 Γ µ (z), are obtained only for δ < 1/2. This means that they do not provide any bound in the space of continuous functions. In what follows we shall state some pointwise uniform bounds, both in the space and in the time variables, which will lead us to uniform bounds in the space of Hölder continuous functions.
for some $\kappa \in [0, 1]$. Then, for any As a consequence of the Garsia-Rodemich-Rumsey theorem, from the previous lemma we obtain the following result.

The convergence result
If we set $z := (u, \partial u/\partial t)$, equation (4.1) can be written in the abstract form (4.11). The existence and uniqueness of a mild solution of equation (4.11) for any fixed $\mu > 0$ is a well known fact in the literature, both under Hypothesis 3 (see [3]) and under Hypothesis 4 (see [42] for the more delicate case of space dimension $d = 2$, under more restrictive conditions on the noise and on the initial data $u_0$ and $v_0$, due to the higher dimension). Namely, we have

The proof of the theorem above is obtained by considering an analogue of equation (4.1), in which the coefficient $b$ is replaced by the truncated coefficients $b_n$. Uniform estimates are obtained for the solutions of the approximating problems, and a global solution is obtained by introducing stopping times. The uniform estimates are obtained by using a splitting method and the uniform bounds for the stochastic convolution given in Proposition 4.5.
1. By looking at the proof of the previous theorem, one sees that, in the case of Lipschitz continuous b, in order to have solutions in L p (Ω; C([0, T ]; H δ )) it is not necessary to take the initial data z 0 = (u 0 , v 0 ) in H 1 , but it is sufficient to take them in H δ .
2. From the proof we see that the solution is also unique in L p (Ω; L p (0, T ; H δ )).  The proposition above is obtained by considering separately the case b is Lipschitz and b is locally Lipschitz.
This implies that the family {u µ } µ∈ (0,1] is bounded in L p (Ω; C([0, T ]; H δ )), for any p ≥ 1 and δ < 1/2. In the case that b is not Lipschitz continuous we cannot prove that. Nevertheless, from the proof of the proposition above we have that for any θ < 1/4 the family Actually, if we denote by f T the inverse of the function x → c(1 + x 2λ ) exp(cx 2 T ), by following the proof of the second step in the proposition above we have P sup .
Hence, as f T (R) diverges to +∞ as R → ∞, due to (4.10) we obtain (4.12). Now, as in the case of additive noise, it is possible to prove the following integration by parts formula. For any ϕ ∈ C 2 ([0, T ] × [0, L]), such that ϕ(t, 0) = ϕ(t, L) = 0, Then, also in this case, we have to show that the remainder term R µ (t) converges to zero, as the parameter µ goes to zero. In the proof of Lemma 3.8, b was Lipschitz continuous and the noise was additive, so that we could prove the mean-square convergence of R µ (t) to zero, for any fixed t ≥ 0. Here, in the case of non-Lipschitz b, it is only possible to prove convergence in probability, but, as we will show later on, this is sufficient in order to establish the validity of the Smoluchowski-Kramers approximation.

The Smoluchowski-Kramers approximation in presence of a magnetic field
We consider here the following two dimensional system of stochastic PDEs (5.1). This could model the displacement of a charged one-dimensional string, with fixed endpoints, that can move through two other spatial dimensions, where the Laplacian $\Delta$ models the forces that neighboring particles exert on each other, the uniform magnetic field points in the direction of $O$, $B$ is some nonlinear forcing, and $\partial w^Q/\partial t$ is a Gaussian random forcing field, whose intensity $G$ may depend on the state $u_\mu$.
In Section 3 and Section 4, we have studied the validity of the so-called Smoluchowski-Kramers approximation, in the case the magnetic field is replaced by a constant friction. Namely, it has been shown that, as µ tends to 0, the solutions of the second order system converge to the solution of the first order system which is obtained simply by taking µ = 0.
One might hope that a similar result would be true in the case treated in the present section. Namely, one would expect that for any $T > 0$ and $\delta > 0$ a limit analogous to (1.6) holds, where $u(t)$ is the solution of the first order system of stochastic PDEs (5.2). Unfortunately, as shown in [9], such a limit is not valid, even for finite dimensional analogues of this problem. Actually, one can prove that if the stochastic term in (5.1) is replaced by a continuous function, then $u_\mu$ converges uniformly in $[0, T]$ to the solution of (5.2). But in the presence of the white noise term, this is no longer true. An explanation of this lies in the fact that a limit that holds for any continuous function $\varphi(s)$ is no longer valid when $\varphi$ is replaced by the white noise.

Nevertheless, the problem under consideration can be regularized in such a way that a counterpart of the Smoluchowski-Kramers approximation is still valid, and there are various ways to do this. One possible way consists in regularizing the noise (see [9] and [38] for the analysis of finite dimensional systems, both in the case of constant and in the case of state dependent magnetic field). Another possible way, which is the one we use here, consists in introducing a small friction proportional to the velocity in equation (5.1) and considering the regularized problem (5.3), which now depends on two small positive parameters $\varepsilon$ and $\mu$. Our purpose here is to show that, for any fixed $\varepsilon > 0$, we can take the limit as $\mu$ goes to 0. Namely, we want to prove that for any $T > 0$ and $p \ge 1$
$$\lim_{\mu \to 0} \mathbb{E}\, |u^\varepsilon_\mu - u_\varepsilon|^p_{C([0,T]; H)} = 0,$$
where $u_\varepsilon(t)$ is the unique mild solution of the problem which is precisely what we get from (5.3) when we formally set $\mu = 0$.

Assumptions and notations
In the present section, unlike in the rest of the paper, we shall denote by H the Hilbert space L 2 (O, R 2 ), endowed with the scalar product we have that Ae k = −α k e k , k ∈ N.
Next, in the same way as in Sections 3 and 4, for any $\delta \in \mathbb{R}$, we define $H^\delta$ to be the completion of $C^\infty_0(O; \mathbb{R}^2)$ with respect to the norm

Moreover, we define $\mathcal{H}_\delta := H^\delta \times H^{\delta-1}$, and in the case $\delta = 0$ we simply set $\mathcal{H} := \mathcal{H}_0$. Finally, for any $(h, k) \in \mathcal{H}_\delta$, we denote

The cylindrical Wiener process $w^Q(t, x)$ is defined as the formal sum

where $Q = (Q_1, Q_2) \in L(H)$, $\{\beta_k\}_{k \in \mathbb{N}}$ is a sequence of independent, identically distributed one-dimensional Brownian motions defined on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and $\{e_k\}_{k \in \mathbb{N}}$ is the orthonormal basis of $H$ introduced above.
Concerning the non-linearity B we assume the following conditions and sup In particular, this implies that for any h 1 , h 2 , , z ∈ H If for any h ∈ L 2 (O) and z ∈ L ∞ (O) we define for some measurable g : and it has linear growth Actually, in this case and by the same reasoning Now, for any µ > 0 and δ ∈ R, as in Section 2we define on H δ the unbounded linear operator where J 0 is the skew symmetric 2 × 2 matrix It can be proven that A µ is the generator of a strongly continuous group of bounded linear operators {S µ (t)} t≥0 on each H δ (for a proof see [48,Section 7.4]). Moreover, for any µ > 0 we define With these notations, if we set system (5.1) can be rewritten as the following stochastic equation in the Hilbert space H dz µ (t) = [A µ z µ (t) + B µ (z µ (t), t)] dt + G µ (z µ (t), t)dw Q (t), z µ (0) = (u 0 , v 0 ).

The approximating semigroup
For any $\mu, \epsilon > 0$ and $\delta \in \mathbb{R}$, we define

As we have seen for $A_\mu$, it is possible to prove that for any $\mu, \epsilon > 0$ the operator $A^\epsilon_\mu$ generates a strongly continuous group of bounded linear operators $S^\epsilon_\mu(t)$, $t \geq 0$, on $\mathcal{H}_\delta$.

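(5.4)
and

Notice that in particular this implies that for any $\mu, \epsilon > 0$ there exists $c_{\mu,\epsilon} > 0$ such that for any (u

As a consequence of the Datko theorem, this allows us to conclude that there exist $M_{\mu,\epsilon}$ and $\omega_{\mu,\epsilon} > 0$ such that $\|S^\epsilon_\mu(t)\|_{L(\mathcal{H}_\theta)} \leq M_{\mu,\epsilon}\, e^{-\omega_{\mu,\epsilon} t}$, $t \geq 0$.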
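The role of the friction in this exponential bound can be seen on a single eigenmode of the Laplacian. The sketch below rests on an assumption that is not spelled out above: that the regularized operator acts as $(u, v) \mapsto (v, \mu^{-1}[\Delta u - (J_0 + \epsilon I)v])$. Under that assumption, on the mode $e_k$ with $Ae_k = -\alpha_k e_k$ the eigenvalues $\lambda$ solve $\mu\lambda^2 + (\epsilon \pm i)\lambda + \alpha_k = 0$, because the skew symmetric matrix $J_0$ has eigenvalues $\pm i$. The function names are hypothetical.

```python
import cmath


def mode_eigenvalues(mu, eps, alpha):
    """Roots of mu*lam**2 + (eps + s*1j)*lam + alpha = 0 for s = +1 and s = -1.

    Assumption: these are the eigenvalues of the regularized wave operator
    restricted to one Laplacian mode with eigenvalue -alpha.
    """
    roots = []
    for s in (1.0, -1.0):
        b = complex(eps, s)
        disc = cmath.sqrt(b * b - 4.0 * mu * alpha)
        roots.append((-b + disc) / (2.0 * mu))
        roots.append((-b - disc) / (2.0 * mu))
    return roots


def spectral_bound(mu, eps, alpha):
    """Largest real part among the four eigenvalues of the mode."""
    return max(r.real for r in mode_eigenvalues(mu, eps, alpha))


if __name__ == "__main__":
    for eps in (0.0, 0.1, 0.5):
        print(f"eps={eps:g}  spectral bound={spectral_bound(0.1, eps, 1.0):+.4f}")
```

With a positive friction the spectral bound is strictly negative, while without friction all four eigenvalues sit on the imaginary axis, so no decay rate $\omega_{\mu,\epsilon} > 0$ can be expected in that case.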

Now, for any $\epsilon > 0$ we define

and we denote by $T_\epsilon(t)$, $t \geq 0$, the strongly continuous semigroup generated by $A_\epsilon$ in $H^\theta$, for any $\theta \in \mathbb{R}$. Moreover, we denote

Moreover, if there exists a non-negative sequence $\{\lambda_k\}_{k \in \mathbb{N}}$ such that

then, for any $0 < \delta < 1$ and $\epsilon > 0$ there exists a constant $c = c(\delta, \epsilon)$ such that for any

In view of the previous estimates for $S^\epsilon_\mu(t)$ and $T_\epsilon(t)$, the following convergence result holds.
where P n is the projection of H onto the n-dimensional subspace H n := span{e 1 , . . . , e 2n }.
The two limits above imply that for any $\epsilon > 0$ and $T > 0$ and for any $(u, v) \in \mathcal{H}$,

and, for any $v \in H$ and $0 < t_0 \leq T$,

Approximation by small friction for additive noise
We assume here that the noisy perturbation in system (5.1) is of additive type, that is G(u, t) = I, for any u ∈ H and t ≥ 0. Moreover, we assume that the covariance operator Q satisfies the following condition.

Hypothesis 7.
There exists a non-negative sequence $\{\lambda_k\}_{k \in \mathbb{N}}$ such that $Qe_k = \lambda_k e_k$, for any $k \in \mathbb{N}$. Moreover, there exists $\delta > 0$ such that

With the notations introduced above, if we denote $z^\epsilon_\mu(t) = \big(u^\epsilon_\mu(t), \frac{\partial u^\epsilon_\mu}{\partial t}(t)\big)$, $t \geq 0$, the regularized system (5.3) can be rewritten as the abstract evolution equation (5.9) in the Hilbert space $\mathcal{H}$.
Our purpose here is to show that for any fixed $\epsilon > 0$ the process $u^\epsilon_\mu(t)$ converges to the solution $u_\epsilon(t)$ of the system of stochastic PDEs (5.10), where for any $\epsilon > 0$ we have defined $Q_\epsilon = J_\epsilon^{-1} Q$ and $B_\epsilon(u, t) = J_\epsilon^{-1} B(u, t)$, $u \in H$, $t \geq 0$.
Notice that with these notations, system (5.10) can be rewritten as the abstract evolution equation in the Hilbert space H.
According to Lemma 5.3, due to Hypothesis 7 for any t ≥ 0 we have This implies that the stochastic convolution takes values in L p (Ω; C([0, T ]; H)), for any T > 0 and p ≥ 1 (for a proof see [18]). Therefore, as the mapping B µ (·, t) : H → H is Lipschitz-continuous, uniformly with respect to t ∈ [0, T ], we have that there exists a unique process z µ ∈ L p (Ω; C([0, T ]; H)) which solves equation (5.9) in the mild sense, that is In the same way, due to (

Approximation by small friction for multiplicative noise
In this section we assume that the space dimension d = 1 and O is a bounded interval, the diffusion coefficient G satisfies Hypothesis 6 and the covariance operator Q satisfies the following condition.

Hypothesis 8.
There exists a bounded non-negative sequence {λ k } k∈ N such that We begin by studying the stochastic convolutions Γ µ (z)(t) := With the notations introduced above, the regularized system (5.3) can be rewritten as 12) and the limiting problem (5.10) can be rewritten as where G (u, t) = J −1 G(u, t). Moreover, there exists a constant c := c( , µ, p, T ) such that Remark 5.8. We can show that the first component for a constant c = c( , p, T ) > 0 that is independent of µ.
In the same way, we have the following result for the stochastic convolution associated with the parabolic problem. As a consequence of this lemma, since the mapping B (·, t) : H → H is Lipschitz continuous, uniformly for t ∈ [0, T ], we have that for any initial condition u 0 ∈ H, system (5.12) admits a unique adapted mild solution u ∈ L p (Ω; C([0, T ]; H)). This follows from a stochastic factorization argument, and limits (5.6), (5.7) and (5.8), combined with arguments analogous to those used in the proof of Lemma 5.7 (see [11,Lemma 5.1]).
Theorem 5.11. Let z µ = (u µ , v µ ) and u be the mild solutions of problems (5.12) and (5.13), with initial conditions z 0 ∈ H and u 0 = Π 1 z 0 ∈ H, respectively. Then, under Hypotheses 2, 3 and 11, for any T > 0, > 0 and p ≥ 1 we have Actually, we have and Then By Lemma 5.2, and Hypothesis 5, there is a constant independent of µ and of 0 < s < t, such that Thanks to (5.14), this implies Finally, the result follows because of (5.6), (5.8), and Theorem 5.10.

The convergence for ε ↓ 0
In the previous sections, we have shown that under suitable conditions on the coefficients and the noise, for any fixed $\epsilon > 0$, $T > 0$ and $p \geq 1$

$$\lim_{\mu \to 0} \mathbb{E} \sup_{t \in [0,T]} |u^\epsilon_\mu(t) - u_\epsilon(t)|_H^p = 0.$$

This limit is not uniform in $\epsilon > 0$, and the limit is not true for $\epsilon = 0$. In this section we want to show that

$$\lim_{\epsilon \to 0} \mathbb{E} \sup_{t \in [0,T]} |u_\epsilon(t) - u(t)|_H^p = 0, \qquad (5.16)$$

where $u$ is the mild solution of the problem (5.17). This statement is not true unless we strengthen Hypothesis 8. Actually, Hypothesis 8 is the weakest assumption on the regularity of the noise that implies Theorem 5.6 and Theorem 5.11, for $\epsilon > 0$. But in order to prove (5.16) we need to assume the following stronger condition on the covariance $Q$.

Hypothesis 9.
There exists a non-negative sequence $\{\lambda_k\}_{k \in \mathbb{N}}$ such that $Qe_k = \lambda_k e_k$, for any $k \in \mathbb{N}$, and

In what follows, we shall denote by $T_0(t)$, $t \geq 0$, the semigroup generated by the differential operator

This means that if we take the scalar product in $H^\theta$ of the first equation by $u_1$ and of the second equation by $u_2$, we get $\frac{d}{dt}$

for any $\theta \in \mathbb{R}$ and $x \in H$. Now, let us consider the stochastic convolution associated with problem (5.17), in the simple case $G = I$

As a consequence of (5.18), we have

and this implies that Hypothesis 9 is necessary in order to have a solution in $H$ for the limiting equation (

Then, in view of the previous two lemmas, the arguments used in the proof of Theorem 5.6 and Theorems 5.10 and 5.11 can be repeated and we have the following result.

In fact, it is possible to show that the convergence result proved above for $\epsilon \downarrow 0$ is also valid for the second order system, that is for every $\mu > 0$ fixed.

Theorem 5.14. [11, Theorem 6.5] Assume either $G$ satisfies Hypothesis 6 or $G(x, t) = I$. Then, under Hypotheses 5 and 9, we have that for any initial conditions $(u_0, v_0)$ and $\mu > 0$

for any $T > 0$ and $p \geq 1$.
As long as we can show that $S^\epsilon_\mu(t) P_n z \to S^0_\mu(t) P_n z$ as $\epsilon \to 0$, for any fixed $n$, we can prove Theorem 5.14 by following the arguments of Theorems 5.6 and 5.11. Fortunately, we can prove something stronger. Actually, it is possible to show that $\lim_{\epsilon \to 0} \sup_{t \geq 0} \|S^\epsilon_\mu(t) - S^0_\mu(t)\|_{L(\mathcal{H})} = 0$, for any fixed $\mu > 0$.
To this purpose, it is useful to introduce an equivalent norm on $H \times H^{-1}$, depending on $\mu > 0$, by setting

Because of (5.4), for any $\epsilon \geq 0$,

Note that if $\epsilon = 0$, then, by (5.4), for any $z \in \mathcal{H}$ and $t \geq 0$,

The long time behavior
In this section we study the relation between the stationary distributions of the processes u µ (t) and u(t), defined respectively as the solution of the semi-linear stochastic damped wave equation (3.1) and as the solution of the semi-linear stochastic heat equation (3.2).
With the notations introduced in Section 3, we can write equation (3.1) as the abstract evolution equation (6.1) on the Hilbert space $\mathcal{H}$, where $B_\mu$ and $Q_\mu$ are the operators defined in (3.9) and (3.10), respectively. Note that the adjoint of the operator $Q_\mu : H \to \mathcal{H}$ is the operator $Q_\mu^\star : \mathcal{H} \to H$ defined by

In particular we have that $Q_\mu Q_\mu^\star : \mathcal{H} \to \mathcal{H}$ is given by

Next, for any $\mu > 0$, we introduce the operator $C_\mu \in L^+(\mathcal{H})$ by setting

where $\{S_\mu^\star(t)\}_{t \geq 0}$ is the semigroup generated by $A_\mu^\star$, the adjoint of the operator $A_\mu$. The next proposition provides an explicit expression for the operator $C_\mu$.
In particular, if we assume that $Qe_k = \lambda_k e_k$, for every $k \in \mathbb{N}$, where $\{\lambda_k\}_{k \in \mathbb{N}}$ is a non-negative sequence such that

then $C_\mu$ is a trace-class operator with

In particular, we have that $C_\mu$ is a non-negative, symmetric trace-class operator in $\mathcal{H}$, and we conclude that the centered Gaussian measure $\nu_\mu := N(0, C_\mu)$, of covariance $C_\mu$, is well defined in $\mathcal{H}$. Moreover, $\nu_\mu$ can be written as a product of two centered Gaussian measures on $H$ and $H^{-1}$, respectively, that is

The linear case
Our aim here is studying the invariant measure of the linear system and showing that the stationary distribution for the solution of the linear damped wave equation coincides for all µ > 0 with the unique invariant measure of the stochastic heat equation In what follows, we shall assume the following conditions on Q.

Hypothesis 10. The linear operator Q is bounded in H and diagonal with respect to the basis {e
In particular, if $d = 1$ we can take $Q = I$, but if $d > 1$ the noise has to be colored in space.

(6.6)

so that $N(0, C_\mu)$ is ergodic and strongly mixing. Moreover the Gaussian measure $\Pi_1 \nu_\mu = N(0, (-\Delta)^{-1} Q^2/2)$ is the stationary distribution of (6.4). In particular, $\Pi_1 \nu_\mu$ does not depend on $\mu > 0$ and coincides with the unique invariant measure $\nu$ of the stochastic heat equation (6.5).
As stated in Proposition 6.1, the operator $C_\mu$ is non-negative, symmetric and of trace class on $\mathcal{H}$. Thus problem (6.3) admits an invariant measure of the form

where $\lambda_\mu$ is an invariant measure for the semigroup $S_\mu(t)$ and $N(0, C_\mu)$ is the Gaussian measure with zero mean and covariance operator $C_\mu$ (for a proof see e.g. [18, Theorem 11.7]). Moreover, as the semigroup $\{S_\mu(t)\}_{t \geq 0}$ is of negative type (see Proposition 2.3), due to [18, Theorem 11.11] $N(0, C_\mu)$ is the unique invariant measure for (6.1) and (6.6) holds. As is well known, this implies that $N(0, C_\mu)$ is ergodic and strongly mixing.
Next, due to (6.2) the measure N (0, C µ ) defined on H is the product of two Gaussian measures, defined respectively on L 2 (O) and on H −1 (O). Namely In particular the marginal measure Π 1 N (0, C µ ) equals N (0, (−∆) −1 Q 2 /2), so that it does not depend on µ > 0 and coincides with the unique invariant measure ν of the Ornstein-Uhlenbeck process solving problem (6.5).
This allows us to conclude, as the process $\bar{u}_\mu(t) = \Pi_1 \bar{z}$ is the stationary solution to problem (6.4) and its distribution coincides with the measure $\Pi_1 N(0, C_\mu)$.
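The fact that the first marginal does not depend on $\mu$ can be checked by hand on a single mode. The sketch below assumes a scalar mode $\mu \ddot{q} = -\alpha q - \dot{q} + \lambda \dot{W}$, with $\alpha$ standing for the eigenvalue of $-\Delta$ and $\lambda$ for that of $Q$ on that mode, written as a first order system in $(q, p = \dot{q})$. It verifies that $C = \mathrm{diag}(\lambda^2/(2\alpha), \lambda^2/(2\mu))$ solves the stationary Lyapunov equation $AC + CA^T + \Sigma = 0$. The first entry is the one-mode version of $(-\Delta)^{-1}Q^2/2$ and does not involve $\mu$. All names are hypothetical.

```python
def lyapunov_residual(mu, alpha, lam):
    """Residual of A C + C A^T + Sigma for the candidate stationary covariance C."""
    # Drift matrix of (q, p) for mu*q'' = -alpha*q - q' + lam*dW/dt.
    a = [[0.0, 1.0], [-alpha / mu, -1.0 / mu]]
    # The noise enters only the velocity equation, with intensity lam/mu.
    sigma = [[0.0, 0.0], [0.0, (lam / mu) ** 2]]
    # Candidate covariance: position and velocity are uncorrelated.
    c = [[lam ** 2 / (2.0 * alpha), 0.0], [0.0, lam ** 2 / (2.0 * mu)]]
    res = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            ac = sum(a[i][k] * c[k][j] for k in range(2))
            cat = sum(c[i][k] * a[j][k] for k in range(2))
            res[i][j] = ac + cat + sigma[i][j]
    return res


if __name__ == "__main__":
    for mu in (1.0, 0.1, 0.01):
        res = lyapunov_residual(mu, alpha=2.0, lam=1.0)
        worst = max(abs(x) for row in res for x in row)
        print(f"mu={mu:g}  max |residual|={worst:.2e}")
```

The residual vanishes up to rounding for every mass, while the position variance $\lambda^2/(2\alpha)$ stays fixed; only the velocity variance $\lambda^2/(2\mu)$ blows up as $\mu$ goes to 0.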

The semi-linear case
We show that an analogous result holds also in the non-linear case, when (3.1) is a gradient system. To this purpose, we need to assume that the non-linearity B has some special structure.
Hypothesis 11. There exists F : Moreover, there exists some κ > 0 such that Moreover, it is clear that F (0) = 0, F (h) ≥ 0 for all h ∈ H, and 2. Assume now d ≥ 1, so that Q is a general bounded operator in H, satisfying Hypothesis 1. Let b : R → R be a function of class C 1 , with Lipschitz-continuous first derivative, such that b(0) = 0 and b(x) ≥ 0, for all x ∈ R. Moreover, the only local minimum of b occurs at 0. Let It is immediate to check that F (0) = 0 and F (h) ≥ 0, for all h ∈ H. Furthermore, Therefore, the nonlinearity satisfies Hypothesis 11.
Our aim first is showing that, under the above conditions, system (6.1) is of gradient type and admits an invariant measure of the following type for some normalizing constant c µ .
To this purpose we introduce some notations. For any n ∈ N and δ ∈ R we define andT n : R n × R n → H δ , (x, η) → (T n x, T n η).
Clearly we have $R_n T_n = \mathrm{Id}_{\mathbb{R}^n}$. Furthermore, if we set $P_n := T_n R_n$, we have that $P_n$ is the projection of $H^\delta(O)$ onto the finite dimensional space generated by $\{e_1, \ldots, e_n\}$ and for any fixed $u \in H^\delta(O)$ we have that $P_n u$ converges to $u$ in $H^\delta$, as $n$ goes to infinity. In particular, setting $\bar{P}_n(z) := (P_n u, P_n v)$,

In what follows, for any Banach space $X$ we denote by $B_b(X)$ the Banach space of Borel and bounded functions from $X$ into $\mathbb{R}$, endowed with the sup-norm, and we denote by $C_b(X)$ the subspace of uniformly continuous functions.
We recall that the transition semigroup {P µ (t)} t≥0 associated with system (6.1) in H is defined for any t ≥ 0 and ϕ ∈ B b (H) by where (u µ (t), v µ (t)) is the solution of (6.1) with initial datum z = (u, v). Next, we denote by z µ n (t) the solution to the finite dimensional problem where Q µ,n =P n Q µ . Due to (6.7), to the fact that B µ is Lipschitz continuous and to estimate (3.11), it is possible to prove the following approximation result An important consequence of this fact is that the semigroup P µ (t) can be approximated by the semigroup P n µ (t) associated with equation (6.8). Namely, for any ϕ ∈ C b (H) and t ≥ 0 it holds lim n→∞ P µ n (t)ϕ(z) = lim n→∞ E ϕ(z µ n (t)) = P µ (t)ϕ(z), z ∈ H. (6.9) Since DF : H → H is Lipschitz continuous, it is not difficult to check that so that, in particular, F : H → R is locally Lipschitz continuous. 2. From the proof of Theorem 6.5 one sees that it is sufficient to assume a weaker condition than F ≥ 0. Namely, what is needed is that Z and Z n are finite and for any sequence {ϕ n } ⊂ C b (H) uniformly bounded and pointwise convergent to some ϕ ∈ C b (H).

Moreover the distribution
is stationary for equation (3.1) and coincides with the unique invariant measure for the stochastic semi-linear heat equation The proof of the theorem above is obtained by first considering the finite dimensional case and then by a limiting argument.
The transition semigroup associated with system (6.11) is defined for any $\varphi \in C_b(\mathbb{R}^{2n})$ by $\hat{P}^\mu_n(t)\varphi(q, p) = \mathbb{E}\, \varphi(\zeta^\mu_n(t))$, $t \geq 0$. Note that if we define $F_n(q) := F(T_n q)$, we have $DF_n(q) = R_n DF(T_n q)$, $q \in \mathbb{R}^n$, and $\frac{1}{2} D \langle R_n \Delta T_n q, q \rangle_{\mathbb{R}^n} = R_n \Delta T_n q$, $q \in \mathbb{R}^n$.

Moreover, since
for the obvious normalizing constant c n , by a change of variable from Hypothesis 3 we have As a well-known fact, the Boltzmann distribution ν µ,n (dq, dp) = c µ,n exp − Q −2 n R n ∆T n q, q R n − 2 F n (q) exp −µ|Q −1 n p| 2 R n (dq, dp) is invariant for system (6.11), so that for anyφ ∈ C b (R n ) and t ≥ 0 R n ×R nP µ n (t)φ(q, p)ν µ,n (dq, dp) = R n ×R nφ (q, p)ν µ,n (dq, dp). (6.12) Now, it is immediate to check that the H-valued processT n ζ µ n (t) coincides with the solution z µ n (t) of the approximating system (6.8) with initial datumT n (q, p). For any ϕ ∈ C b (H), this yields and hence from (6.12) for any ϕ ∈ C b (H) we obtain R n ×R n P µ n (t)ϕ(T n (q, p))ν µ,n (dq, dp) = R n ×R n ϕ(T n (q, p))ν µ,n (dq, dp).
If T n is considered as a mapping from R n into H by reasoning as above we have Moreover, if T n is considered as a mapping from R n into H −1 (O) we have Actually, for any λ ∈ H −1 (O) we have and by uniqueness of the Fourier transform we obtain (6.15).
Therefore, from (6.14) and (6.15) we havê and hence, since P µ n (t)ϕ(z) = P µ n (t)ϕ(P n z), z ∈ H, from (6.13) it follows Now, due to (6.7) and (6.9) we have Then, thanks to Hypothesis 3, by the dominated convergence theorem we can take the limit as n goes to infinity in both sides of (6.16) and we get for any ϕ ∈ C b (H). By a monotone class argument the same identity follows for arbitrary ϕ ∈ B b (H). This in particular implies that the measure is invariant for P µ (t). Finally, since we obtain the second part of the theorem.

Large deviations in the gradient case
We are here interested in the small noise behavior of the solution of the evolution equation (6.1) in $\mathcal{H}$ and in comparing it with the small noise behavior of the solution of the evolution equation (6.10). Namely, we introduce the equation (7.1), for some parameters $0 < \epsilon, \mu \ll 1$. We assume that Hypotheses 10 and 11 are verified, so that the non-linearity $B$ has a gradient structure.
We keep $\mu > 0$ fixed and let $\epsilon$ tend to zero, to study some relevant quantities associated with the large deviation principle for this system, such as the quasi-potential, which also describes the asymptotic behavior of the expected exit time from a domain and the corresponding exit places. Due to the gradient structure of (7.1), as in the finite dimensional case studied in [26], we are here able to calculate explicitly the quasi-potentials $V_\mu(u, v)$ for system (7.1) as

Actually, we can prove that for any $\mu > 0$

where

and $C_\mu$ is the operator defined in (6.2). From (7.3), we obtain that

Thus, we obtain the equality (7.2) by constructing a path which realizes the minimum. An immediate consequence of (7.2) is that for each $\mu > 0$

where $V(u)$ is the quasi-potential associated with the equation
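As a sanity check of the relation between the two quasi-potentials, one can work out a single linear mode. This is a simplification assumed here for illustration only: take $F = 0$, let $\alpha > 0$ stand for the eigenvalue of $-\Delta$ and $\lambda > 0$ for that of $Q$ on that mode, and normalize the action functional as one half of the squared $L^2$ norm of the control. For a linear system with stable drift, the quasi-potential is then $\frac{1}{2}\langle C^{-1}z, z\rangle$, where $C$ is the stationary covariance computed for noise intensity one.

```latex
% One-mode sketch (assumptions: F = 0, scalar mode, action = (1/2)\int |\psi|^2 dt).
% Second order system: \mu \ddot q = -\alpha q - \dot q + \sqrt{\epsilon}\,\lambda \dot W,
% first order system:  \dot q = -\alpha q + \sqrt{\epsilon}\,\lambda \dot W.
\begin{align*}
C_\mu &= \operatorname{diag}\Big(\frac{\lambda^2}{2\alpha},\ \frac{\lambda^2}{2\mu}\Big)
  && \text{(stationary covariance of $(q,\dot q)$ for $\epsilon = 1$)},\\
V_\mu(q,p) &= \tfrac12 \big\langle C_\mu^{-1}(q,p),(q,p)\big\rangle
  = \frac{\alpha}{\lambda^2}\,q^2 + \frac{\mu}{\lambda^2}\,p^2,\\
V(q) &= \tfrac12 \Big(\frac{\lambda^2}{2\alpha}\Big)^{-1} q^2 = \frac{\alpha}{\lambda^2}\,q^2,\\
\inf_{p\in\mathbb{R}} V_\mu(q,p) &= V_\mu(q,0) = V(q) \qquad \text{for every } \mu > 0.
\end{align*}
```

In this toy case the minimum over the velocity is attained at zero velocity and equals the first order quasi-potential for every mass, not only in the limit of small $\mu$.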

A characterization of the quasi-potential
For any $\mu > 0$ and $t_1 < t_2$, and for any $z \in C([t_1, t_2]; \mathcal{H})$ and $z_0 \in \mathcal{H}$, we define

where $z^\mu_\psi$ solves the skeleton equation associated with (7.1)

Analogously, for $t_1 < t_2$, and for any $\varphi \in C([t_1, t_2]; H)$ and $u_0 \in H$, we define

where $\varphi_{u_0, \psi}$ solves the problem

In what follows, we shall also denote

In what follows, for any fixed $\mu > 0$ we shall denote by $V_\mu$ the quasi-potential associated with system (7.1), namely

Analogously, we shall denote by $V$ the quasi-potential associated with equation (7.4), that is

Moreover, for any $\mu > 0$ we shall define $\bar{V}_\mu$

In [10, Proposition 5.4], it has been proven that $V(u)$ can be represented as

Here, we want to show a similar representation for $V_\mu(u, v)$, for any fixed $\mu > 0$.

Now, if we define

we see that $z_\mu \in C((t - T_\mu, 0); \mathcal{H})$ and

Due to the arbitrariness of $\epsilon > 0$, we can conclude.

The main result
If z ∈ C((−∞, 0]; H) is such that I µ −∞,0 (z) < +∞, then we have for ϕ = Π 1 z, This means that and (7.5) follows. By the same argument, if I −∞,0 (ϕ) < +∞, then it follows that In particular, for any µ > 0, As known, an analogous result holds for V (u). In what follows, for completeness, we give a proof. We have From this we see that Just as for the wave equation, for u ∈ D((−∆) 1 2 Q −1 ), we defineφ to be the solution of We have lim so that Therefore, in order to conclude the proof of Theorem 7.2, one has to show that (7.7) holds. Then

Large deviations in the non-gradient case
Here, we assume that the smooth bounded domain O ⊂ R d is regular enough so that where {e k } k∈ N is the complete orthonormal basis in H that diagonalizes ∆ and {β k (t)} k∈ N is a sequence of mutually independent Brownian motions, all defined on the same stochastic basis. Moreover, we have assumed that Q is diagonal with respect to the basis {e k } k∈ N , with Qe k = λ k e k . In what follows we shall assume the following for the sequence {λ k } k∈ N .
3. As a consequence of (8.2), for any δ ∈ R and there exists c δ > 0 such that for any x ∈ H δ+2β Concerning the nonlinearity B, we shall assume the following conditions. Moreover B(0) = 0. We also assume that B is differentiable in the space H 2β , with The assumption that B is differentiable is made for convenience to simplify the proof of lower bounds in Theorem 8.19. We believe that by approximating the Lipschitz continuous B with a sequence of differentiable functions whose C 1 seminorm is controlled by the Lipschitz semi-norm of B, the results proved in Theorem 8.19 should remain true.

If for any
as in Section 3 and after, and if we assume that b(x, ·) ∈ C 2k (R), for k ∈ [β + δ/2 − 5/4, β + δ/2 − 1/4], and then B maps H δ into itself, for any δ ∈ [0, 1 + 2β]. The Lipschitz continuity of B in H δ and the bound on the Lipschitz norm, are satisfied if the derivatives of b(x, ·) are small enough.
In Section 6 we have compared the small noise asymptotic behavior of the solution of equation (7.1) with the small noise asymptotic behavior of the solution of equation (7.4). We have seen that in the gradient case (that is, when $B$ satisfies Hypothesis 11), the quasi-potential $V_\mu(u, v)$ associated with equation (7.1) is given by

and $V(u)$ is, as is well known, the quasi-potential for equation (7.4).
In [12], we have studied the same problem in the more delicate situation where the system under consideration is not of gradient type and hence there is no explicit expression for $V_\mu(u, v)$ and $V(u)$, as in (8.3) and (8.4).
In what follows, we will need the following bounds and limit for the solution z µ z 0 to equation (8.5).
Moreover, for any R > 0, lim In particular, this means that the unperturbed system is uniformly attracted to 0 from any bounded set in H. Next, we show that if the initial velocity is large enough, Π 1 z µ z 0 will leave any bounded set. Actually, as shown in [12,Lemma 3.3], for any µ > 0 and t > 0, there exists c 2 (µ, t) > 0 such that Therefore, by using the mild formulation of the unperturbed equation (8.5), we can conclude that the following lower bound estimate holds for the solution of (8.5).

The skeleton equation
For any µ > 0 and s < t and for any ψ ∈ L 2 ((s, t); H) we define Clearly L µ s,t is a continuous bounded linear operator from L 2 ([s, t]; H) into H. If we define the pseudo-inverse of L µ s,t as we have the following bounds.
2. By proceeding as in the proof of Lemma 8.9, it is possible to prove that

A characterization of the quasi-potential
For any $t_1 < t_2$, $\mu > 0$ and $z \in C((t_1, t_2); \mathcal{H})$, we define

where $z^\mu_{\psi, z_0}$ is a mild solution of the skeleton equation associated with equation (7.1), with deterministic control $\psi \in L^2((t_1, t_2); H)$ and initial conditions $z_0$, namely

For $\epsilon, \mu > 0$ and $z_0 \in \mathcal{H}$ we denote by $z^\mu_{\epsilon, z_0} \in L^2(\Omega; C([0, T]; \mathcal{H}))$ the mild solution of equation (7.1). Since the mapping $B_\mu : \mathcal{H} \to \mathcal{H}$ is Lipschitz-continuous and the noisy perturbation in (7.1) is of additive type, as an immediate consequence of the contraction lemma, for any fixed $\mu > 0$ the family $\{\mathcal{L}(z^\mu_{\epsilon, z_0})\}_{\epsilon > 0}$ satisfies a large deviation principle in $C([t_1, t_2]; \mathcal{H})$, with action functional $I^\mu_{t_1, t_2}$. In particular, for any $\delta > 0$ and $T > 0$,

Analogously, if for any $\epsilon > 0$ we denote by $u_\epsilon$ the mild solution of equation (7.4), the family $\{\mathcal{L}(u_\epsilon)\}_{\epsilon > 0}$ satisfies a large deviation principle in $C([t_1, t_2]; H)$ with action functional

where $\varphi_\psi$ is a mild solution of the skeleton equation associated with equation (7.4)

In particular, the functionals $I^\mu_{t_1, t_2}$ and $I_{t_1, t_2}$ are lower semi-continuous and have compact level sets. Moreover, it is not difficult to show that for any compact sets $E \subset H$ and $\mathcal{E} \subset \mathcal{H}$, the level sets

In what follows, for the sake of brevity, for any $\mu > 0$ and $t \in (0, +\infty]$ we shall define $I^\mu_t := I^\mu_{0,t}$ and $I^\mu_{-t} := I^\mu_{-t,0}$ and, analogously, for any $t \in (0, +\infty]$ we shall define $I_t := I_{0,t}$ and $I_{-t} := I_{-t,0}$. In particular, we shall set

Moreover, for any $r > 0$ we shall set

Once we have introduced the action functionals $I^\mu_{t_1, t_2}$ and $I_{t_1, t_2}$, as we have already seen in Section 6 we can introduce the corresponding quasi-potentials, by setting for any $\mu > 0$ and $(u, v) \in \mathcal{H}$

$$V_\mu(u, v) = \inf \big\{ I^\mu_{0,T}(z) \; ; \; z(0) = 0, \ z(T) = (u, v), \ T > 0 \big\},$$

and for any $u \in H$

$$V(u) = \inf \big\{ I_{0,T}(\varphi) \; ; \; \varphi(0) = 0, \ \varphi(T) = u, \ T \geq 0 \big\}.$$

Moreover, for any $\mu > 0$ and $u \in H$, we shall define $\bar{V}_\mu(u)$

In [10, Proposition 5.1] it has been proved that the level set $K_{-\infty}(r)$ is compact in the space $C((-\infty, 0]; H)$, endowed with the uniform convergence on bounded sets, and in [10, Proposition 5.4] it has been proven that

In what follows we want to prove an analogous result for $K^\mu_{-\infty}$, $V_\mu(u, v)$ and $\bar{V}_\mu(u)$.
Theorem 8.11. [12, Theorem 5.1] For small enough µ > 0, the level sets K µ −∞ (r) are compact in the topology of uniform convergence on bounded intervals.
Therefore, we need to show that V µ (u, v) ≤ M µ (u, v), for all (u, v) ∈ H.
The characterization of $V_\mu(u, v)$ and $\bar{V}_\mu(u)$ given in Theorem 8.12 implies that $V_\mu$ and $\bar{V}_\mu$ have compact level sets.

Application to the exit problem
We are here interested in the problem of the exit of the solution $u^\mu_\epsilon$ of equation (7.1) from a domain $D \subset H$, for any $\mu > 0$ fixed. Then we apply the limiting results proved in Theorems 8.17 and 8.19 to show that, when $\mu$ is small, the relevant quantities in the exit problem from $D$ for the solution $u^\mu_\epsilon$ of equation (7.1) can be approximated by the corresponding ones arising for equation (7.4). First, let us give some assumptions on the set $D$.

In [8] it has been proven that for any $u_0 \in D$ such that $u_{0, u_0}(t) \in D$, for any $t \geq 0$, it holds

Similarly, as we would expect, it also holds that

3. For any $N \subset \partial D$ such that $\inf_{x \in N} V(x) > \inf_{x \in \partial D} V(x)$, there exists $\mu_0 > 0$ such that for all $\mu < \mu_0$, $\lim_{\epsilon \to 0} \mathbb{P}_{z_0}\big(u^\mu_\epsilon(\tau^{\mu, \epsilon}) \in N\big) = 0$.
We recall that in Theorem 7.2 we have proved that, in the case of gradient systems, for any $\mu > 0$, $\bar{V}_\mu$

This means that in this case, for any $z_0 = (u_0, v_0) \in \mathcal{H}$ and $\mu > 0$ such that the unperturbed system satisfies $u^\mu_{0, z_0}(t) \in D$ for all $t > 0$, $\lim_{\epsilon \to 0} \epsilon \log \mathbb{E}\, \tau^{\mu, \epsilon}$

and (8.39) holds for any $\mu > 0$.