THE WIND-DRIVEN OCEAN CIRCULATION: APPLYING DYNAMICAL SYSTEMS THEORY TO A CLIMATE PROBLEM

. The large-scale, near-surface ﬂow of the mid-latitude oceans is dominated by the presence of a larger, anticyclonic and a smaller, cyclonic gyre. The two gyres share the eastward extension of western boundary currents, such as the Gulf Stream or Kuroshio, and are induced by the shear in the winds that cross the respective ocean basins. This physical phenomenol-ogy is described mathematically by a hierarchy of systems of nonlinear partial diﬀerential equations (PDEs). We study the low-frequency variability of this wind-driven, double-gyre circulation in mid-latitude ocean basins, subject to time-constant, purely periodic and more general forms of time-dependent wind stress. Both analytical and numerical methods of dynamical systems theory are applied to the PDE systems of interest. Recent work has focused on the application of non-autonomous and random forcing to double-gyre models. We discuss the associated pullback and random attractors and the non-uniqueness of the invariant measures that are obtained. The presentation moves from ob- servations of the geophysical phenomena to modeling them and on to a proper mathematical understanding of the models thus obtained. Connections are made with the highly topical issues of climate change and climate sensitivity.

1. Introduction and motivation. Since the early '70s, when I chose a thesis topic on theoretical climate dynamics and a dynamical systems approach to treat this topic [20,67,68], the importance of the climate sciences has only been growing [32,89,90,178,182]. Likewise, the usefulness of dynamical systems and of their successive bifurcations in navigating across the hierarchy of climate models [52,[70][71][72]162]while far from universally accepted in the climate sciences -has found a wide community of users and produced a substantial literature.
An important recent development in this literature was caused by the realization that the theory of differentiable dynamical systems (DDS) -in its classical form dealing with autonomous DDSs -was fairly well adapted to applications in which neither the forcing nor the coefficients were time dependent. To a good approximation, this could be said to be the case in numerical weather prediction [97,190], but much less so in climate problems in which the seasonal forcing [30,75,80,93,94,161,193,194] or anthropogenic effects [29,32,89,90,178] play a major role.
To cope with the latter type of problems, applications of the theory of nonautonomous and random dynamical systems (NDS and RDS) have started to appear in the theoretical climate dynamics literature [19,36,57,79]. The purpose of the present paper is to give some flavor of the systematic application of DDS, NDS and RDS theory across the hierarchy of climate models, from the simplest, conceptual ones all the way to the most elaborate ones [54,70,71,75].
In mathematical terms, this means considering, at the lower end of the hierarchy, small systems of ordinary differential equations (ODEs) and at the higher end fullblown systems of nonlinear partial differential equations (PDEs). In the climate literature, the former are sometimes called "toy models," while the latter are referred to as general circulation models (GCMs) or, more recently, as global climate models, which preserves the GCM acronym.
An interesting problem to which these ideas and results will be applied herein is that of the wind-driven circulation in mid-latitude ocean basins [52,54], subject at first to time-constant and then to purely periodic and more general forms of time-dependent wind stress. The wind-driven circulation is obviously strongest near the surface of the oceans, and its variability lies mostly in the subannual and interannual range. In many simplified, theoretical treatments of the oceans, this circulation is treated separately from the so-called buoyancy-driven, thermohaline or overturning circulation. The latter is driven by mass fluxes at the ocean-atmosphere interface; it involves the density distribution of water masses -which depends on both temperature and salinity, hence its characterization as thermohaline -and it reaches all the way to the bottom of the oceans. The variability of this overturning circulation lies largely in the interdecadal and centennial range [52,54,69,71,137,145,183,187]. It will only be mentioned here in passing in Section 4.
Key features of the wind-driven circulation are described in Section 2 below. The influence of strong thermal fronts -like the Gulf Stream in the North Atlantic or the Kuroshio in the North Pacific -on the mid-latitude atmosphere above is severely underestimated, thus contributing to the current range of uncertainties in climate simulations over the 21st century. Typical spatial resolutions in recent century-scale GCM simulations [1,89,90,153,178,180] are still of the order of 100 km at best, whereas resolutions of 50 km and less would be needed to really capture the strong mid-latitude ocean-atmosphere coupling just above the oceanic fronts [62,63,132,176].
An important additional source of uncertainty comes from the difficulty to correctly parametrize global and regional effects of clouds and their highly complex small-scale physics. This difficulty is particularly critical in the tropics, where largescale features such as the El-Niño/Southern Oscillation (ENSO) and the Madden-Julian Oscillation are strongly coupled with convective phenomena [75,124,138].
The outline of the paper is the following. First, we describe in Section 2 the most recent theoretical results regarding the internal variability of the mid-latitude winddriven circulation, viewed as a problem in nonlinear fluid mechanics. These results rely to a large extent on autonomous DDS theory [11,85]. Next, we summarize in Section 3 the key concepts and methods of NDS [163] and RDS [7] theory.
With the results of Sections 2 and 3 in hand, we address in Section 4 the changes in this purely oceanic variability as a result of time-dependent forcing, on the one hand, and truly coupled ocean-atmosphere variability, on the other. Much of the material in this section is new [65,150,198]. A summary and an outlook on future work follow in Section 5. Rigorous mathematical results appear in Appendices A, B and C, which follow closely [79] (Appendix A) and [36] (Appendices B and C).
This paper targets an audience of mathematicians already interested or on the point of becoming interested in the climate sciences, as well as of climate scientists with an interest in the formulation and use of a mathematical framework for their problems. I hope that it will help, rather than hinder, the communication between these two target audiences.
2. Internal variability of the wind-driven ocean circulation.

2.1.
Observations. To a first approximation, the main near-surface currents in the oceans are driven by the mean effect of the winds. The trade winds near the equator blow mainly from east to west and are also called the tropical easterlies. In mid-latitudes, the dominant winds are the prevailing westerlies, and towards the poles the winds are easterly again. Thus an idealized version of wind effects on the ocean must involve zonally oriented winds that blow westward at low and high latitudes, and eastward in mid-latitudes.
Five of the strongest near-surface, mid-and-high-latitude currents are the Antarctic Circumpolar Current, the Gulf Stream and the Labrador Current in the North Atlantic, and the Kuroshio and Oyashio off Japan. These currents are all clearly visible in Fig. 1.
The Gulf Stream [184] is an oceanic jet with a strong influence on the climate of eastern North America and of western Europe. From Mexico's Yucatan Peninsula, the Gulf Stream flows north through the Florida Straits and along the East Coast of the United States. Near Cape Hatteras, it detaches from the coast and begins to drift off into the North Atlantic towards the Grand Banks near Newfoundland. Actually, the Gulf Stream is part of a larger, gyre-like current system, which includes the North Atlantic Drift, the Canary Current and the North Equatorial Current. It is also coupled with the pole-to-pole overturning circulation.
The Coriolis force is responsible for the so-called Ekman transport, which deflects water masses orthogonally to the near-surface wind direction and to the right [72,83,144]. In the North Atlantic, this Ekman transport creates a divergence and a convergence of near-surface water masses, respectively, resulting in the formation of two oceanic gyres: a smaller, cyclonic one in subpolar latitudes, the other larger and anticyclonic in the subtropics. This type of double-gyre circulation characterizes all Figure 1. A map of the main oceanic currents: warm currents in red and cold ones in blue. Reproduced from [79], with permission from Elsevier. mid-latitude ocean basins, including the South Atlantic, as well as the North and South Pacific.
The double-gyre circulation is intensified as the currents approach the East Coast of North America due to the β-effect. This effect arises primarily from the variation of the Coriolis force with latitude, while the oceans' bottom topography also contributes to it. The former, planetary β-effect is of crucial importance in geophysical flows and induces free Rossby waves propagating westward [72,83,144].
The currents along the western shores of the North Atlantic and of the other mid-latitude ocean basins exhibit boundary-layer characteristics and are commonly called western boundary currents. The northward-flowing Gulf Stream and the southward-flowing Labrador Current extension meet near Cape Hatteras and give rise to a strong, mainly eastward jet that eventually reaches Scandinavia. In all four mid-latitude ocean basins -i.e., the North and South Atlantic, as well as the North and South Pacific -the confluence of a warm, poleward current with a cold, equatorward one produces an intense eastward jet. The formation of this jet and of the associated recirculation vortices near the western boundary, to either side of the jet, is mostly driven by internal, nonlinear effects. Figure 2 illustrates how these large-scale wind-driven oceanic flows self-organize, as well as the resulting eastward jet. Different spatial and time scales contribute to this self-organization: so-called mesoscales eddies in the ocean play the role of the synoptic-scale weather systems in the atmosphere. Warm and cold rings last for several months up to a year and have a diameter of about 100 km vs. the much larger horizontal dimensions of atmospheric weather systems, of the order of 1000 km; two cold rings are clearly visible in Fig. 2. Gulf Stream meanders involve larger spatial scales, up to 1000 km, and are associated with interannual variability.
The characteristic scale of the jet and gyres is of several thousand kilometers and they exhibit their own intrinsic dynamics on time scales of several years to possibly one or two decades. A striking feature of the wind-driven circulation is the existence of two well-known North Atlantic Oscillations, with a period of about 7-8 and 14-15 years, respectively. Data analysis of various climatic variables, such as  Figure 2. A satellite image of the sea surface temperature (SST) field over the northwestern North Atlantic (in false color, from the U.S. National Oceanic and Atmospheric Administration), together with a sketch of the associated double-gyre circulation (white arrows). A highly simplified and smoothed view of the amount of potential vorticity injected into the ocean circulation by the equatorial trade winds, the mid-latitudes' prevailing westerlies and the polar easterlies is shown in the sketch to the right. Reproduced from [79], with permission from Elsevier. sea surface temperature (SST) over the North Atlantic or sea level pressure (SLP) over western Europe [49,135,201] and local surface air temperatures in Central England [151], as well as of proxy records, such as tree rings in Britain, travertine concretions in southeastern France [58], and Nile floods over the last millennium or so [105], all exhibit strikingly robust oscillatory behavior with a 7-8-yr period and, to a lesser extent, with a 14-yr period. Variations in the path and intensity of the Gulf Stream are most likely to exert a major influence on the climate in this part of the world [184]. These climatic implications add to the intrinsic interest of theoretical studies of the low-frequency variability of the oceans' double-gyre circulation.
Given the complexity of the processes involved, climate studies have been most successful when using not just a single model but a full hierarchy of models, from the simplest to models to the most detailed GCMs [70,75]. In the following, we describe one of the simplest models of the hierarchy used in studying this problem, which still retains many of its essential features.

2.2.
A simple model of the double-gyre circulation. The simplest model that includes many of the mechanisms described above is governed by the barotropic quasi-geostrophic (QG) equations. The term geostrophic refers to the fact that largescale rotating flows tend to run parallel to, rather than perpendicular to constantpressure contours; in the oceans, these contours are associated with the deviation from rest of the surfaces of equal water mass, due to Ekman pumping. Geostrophic balance implies in particular that the flow is divergence-free. The term barotropic, as opposed to baroclinic, has a slightly different meaning in geophysical fluid dynamics than in engineering fluid mechanics: it means that the model describes a single fluid layer of constant density and therefore the solutions do not depend on depth [72,83,144].
We consider an idealized, rectangular basin geometry and simplified forcing that mimics the distribution of vorticity due to the wind stress, as sketched to the right of Fig. 2. In our idealized model, the amounts of subpolar and subtropical vorticity injected into the basin are equal and the rectangular domain Ω = (0, L x ) × (0, L y ) is symmetric about the axis of zero wind stress curl.
The barotropic two-dimensional (2-D) QG equations in this idealized setting are: (2.1) Here q and ψ are the potential vorticity and the streamfunction, respectively, and the Jacobian J gives the advection of potential vorticity by the flow, J(ψ, q) = ψ x q y − ψ y q x = u · ∇q, where u = (−ψ y , ψ x ), x points east and y points north. The physical parameters are the strength of the planetary vorticity gradient β, the Rossby radius of deformation λ −2 R , the eddy-viscosity coefficient ν, the bottom friction coefficient µ, and the wind-stress intensity τ . We use here free-slip boundary conditions ψ = ∆ 2 ψ = 0; the qualitative results described below do not depend on the particular choice of homogeneous boundary conditions [54,92].
We consider the nonlinear PDE system (2.1) as an infinite-dimensional dynamical system and study its bifurcations as the parameters change. Two key parameters are the wind stress intensity τ and the eddy viscosity ν. An important property of (2.1) is its mirror symmetry in the y = L y /2 axis. This symmetry can be expressed as invariance with respect to the discrete Z 2 group S, given by any solution of (2.1) is thus accompanied by its mirror-conjugated solution. Hence the prevailing bifurcations are of either the symmetry-breaking or the Hopf type.

2.3.
Bifurcations in the double-gyre problem. The development of a comprehensive nonlinear theory of the double-gyre circulation over the last two decades has gone through four main steps. These four steps can be followed through the bifurcation tree in Fig. 3.

2.3.1.
Symmetry-breaking bifurcation. The "trunk" of the bifurcation tree is plotted as the solid blue line in the lower part of the figure. When the forcing τ is weak or the dissipation ν is large, there is only one steady solution, which is antisymmetric with respect to the mid-axis of the basin. This solution exhibits two large gyres, along with their β-induced western boundary currents. Away from the western boundary, such a near-linear solution (not shown) is dominated by so-called Sverdrup balance between wind stress curl and the meridional mass transport [83,186]. The first generic bifurcation of this QG model was found to be a genuine pitchfork bifurcation that breaks the system's symmetry as the nonlinearity becomes large enough with increasing wind stress intensity τ [27,91,92]. As the wind stress increases, the near-linear Sverdrup solution that lies along the solid blue line in the figure develops an eastward jet along the mid-axis, which penetrates farther into the domain and also forms two intense recirculation vortices, on either side of the jet and near the western boundary of the domain.
The resulting more intense, and hence more nonlinear solution is still antisymmetric about the mid-axis, but loses its stability for some critical value of the wind-stress intensity, τ = τ P . This value is indicated by the filled square on the symmetry axis of Fig. 3 and is labeled "Pitchfork" in the figure.
A pair of mirror-symmetric solutions emerges and it is plotted as the two red solid lines in the figure's lower part. The streamfunction fields associated with the two stable steady-state branches have a rather different vorticity distribution and they are plotted in the two small panels to the upper-left and upper-right of Fig. 3. In particular, the jet in such a solution exhibits a large meander, reminiscent of the one seen in Fig. 2 just downstream of Cape Hatteras; note that the colors in Fig.  3 were chosen to facilitate the comparison with Fig. 2. These asymmetric flows are characterized by one recirculation vortex being stronger in intensity than the other, which deflects the jet accordingly either to the southeast, as is the case in the observations for the North Atlantic, or to the northeast.

2.3.2.
Gyre modes. The next step in the theoretical treatment of the problem was taken in part concurrently with the first one above [91,92] and in part shortly thereafter [55,164,179]. It involved the study of time-periodic instabilities through Hopf bifurcation from either an antisymmetric or an asymmetric steady flow. Some of these studies treated wind-driven circulation models limited to a stand-alone, single gyre [145,164]; such a model concentrates on the larger, subtropical gyre while neglecting the smaller, subpolar one.
The overall idea was to develop a full, generic picture of the time-dependent behavior of the solutions in more turbulent regimes, by classifying the various instabilities in a comprehensive way. However, it quickly appeared that a particular kind of instability leads to so-called gyre modes [92,179], and was prevalent across the full hierarchy of models of the double-gyre circulation; furthermore, this instability triggers the lowest nonzero frequency present in all such models [52,54].
These gyre modes always appear after the first pitchfork bifurcation, and it took several years to really understand their genesis: gyre modes arise as two eigenvalues merge -one is associated with a symmetric eigenfunction and responsible for the pitchfork bifurcation, the other is associated with an antisymmetric eigenfunction [166]; this merging is marked by a filled circle on the left branch of antisymmetric stationary solutions and labeled M in Fig. 3.
Such a phenomenon is not a bifurcation stricto sensu: one has topological C 0equivalence before and after the eigenvalue merging, but not from the C 1 point of view. Still, this phenomenon is quite common in small-dimensional dynamical systems with symmetry, as exemplified by the unfolding of codimension-2 bifurcations of Bogdanov-Takens type [85]. In particular, the fact that gyre modes trigger the lowest-frequency of the model is due to the frequency of these modes growing quadratically from zero until nonlinear saturation. Of course these modes, in turn, become unstable shortly after the merging, through a Hopf bifurcation indicated in Fig. 3 by a heavy dot marked "Hopf," from which a stylized limit cycle emerges. A mirror-symmetric M and Hopf bifurcation also occur on the right branch of stationary solutions, but have been omitted in the figure for visual clarity.
More generally, Hopf bifurcations of various types give rise to features that recur more-or-less periodically in fully turbulent planetary-scale flows, atmospheric, oceanic and coupled [52,54,71,72]. In the climate sciences, one commonly refers to this type of near-periodic recurrence as low-frequency variability (LFV).

Global bifurcations.
The importance of the gyre modes was further confirmed through an even more puzzling discovery. Several authors realized, independently of each other, that the low-frequency dynamics of their respective double-gyre models was driven by intense relaxation oscillations of the jet [28,130,136,[167][168][169][170]. These relaxation oscillations, already described in [92,179], were now attributed to a homoclinic bifurcation, with a global character in phase space [72,85]. In effect, the QG model reviewed here undergoes a genuine homoclinic bifurcation that is generic across the full hierarchy of double-gyre models. This bifurcation is due to the growth and eventual merging of the limit cycles that arise from the two mutually symmetric Hopf bifurcations. It is marked in the figure by a filled circle and a stylized lemniscate, and labeled "Homoclinic." This global bifurcation is associated with chaotic behavior of the flow due to the Shilnikov phenomenon [136,170], which induces horseshoes in phase space.
The connection between such homoclinic bifurcations and gyre modes was not immediately obvious, but Simonnet et al. [170] emphasized that the two were part of a single, global dynamical phenomenon. The homoclinic bifurcation indeed results from the unfolding of the gyre modes' limit cycles. This familiar dynamical scenario is again well illustrated by the unfolding of a codimension-2 Bogdanov-Takens bifurcation, where the homoclinic orbits emerge naturally.
Since homoclinic orbits have an infinite period, it was natural to hypothesize that the gyre-mode mechanism, in this broader, global-bifurcation context, gave rise to the observed 7-yr and 14-yr North Atlantic oscillations. Although this hypothesis may appear a little farfetched -given the simplicity of the double-gyre models analyzed so far -it is reinforced by results with much more detailed model in the hierarchy, cf. [52,54] and Section 4 herein.
The successive-bifurcation theory appears therewith to be fairly complete for barotropic, single-layer models of the double-gyre circulation. This theory also provides a self-consistent, plausible explanation for the climatically important 7year and 14-year oscillations of the oceanic circulation and the related atmospheric phenomena in and around the North Atlantic basin [49, 52, 54, 62-65, 105, 135, 151, 169,170,201]. The dominant 7-and 14-year modes of this theory survive, moreover, perturbation by seasonal-cycle changes in the intensity and meridional position of the westerly winds [185].
In baroclinic models, with two or more active layers of different density, baroclinic instabilities [17,54,63,72,83,108,144,145,169,184] surely play a fundamental role, as they do in the observed dynamics of the oceans. However, it is not known to what extent baroclinic instabilities can destroy gyre-mode dynamics. The difficulty lies in a deeper understanding of the so-called rectification process [100], which arises from the nonzero mean effect of the baroclinic component of the flow.
Roughly speaking, rectification drives the dynamics far away from any stationary solutions. In this situation, dynamical systems theory by itself cannot be used as a full explanation of complex, observed behavior resulting from successive bifurcations that are rooted in simple stationary or periodic solutions. Other tools from statistical mechanics and nonequilibrium thermodynamics should, therefore, be considered [22,60,121,122,125,158,191]. Combining these tools with those of the successive-bifurcation approach may eventually lead to a more general and complete physical characterization of gyre modes in realistic models.

Non-autonomous and random dynamical systems (NDSs and RDSs).
An additional way of improving upon the simple results so far is to include timedependent forcing and stochasticity. As discussed in Section 1, the appropriate general framework for doing this is the one provided by NDS and RDS theory [19,36,57,79]. In the present section, we summarize this mathematical framework in as straightforward a way as possible, and start with some simple ideas about deterministic vs. stochastic modeling.
3.1. Background and motivation. We are interested in behavior that is robust across a full hierarchy of climate models in general, and of oceanic double-gyre models in particular. Moreover, we would like this robust behavior of the models to be recognizable in the very large and ever-increasing mass of observational data. Finally, the features in whose robustness we are most interested are those that obtain over long time intervals.
Classical DDS theory, going back to H. Poincaré [152], is essentially a geometric approach to studying the asymptotic, long-term properties of solutions to autonomous ODE systems in phase space. Extensions to nonlinear PDE systems are covered, for instance, in [189]. To apply the theory in a reliable manner to a set of complex physical phenomena, one needs a criterion for the robustness of a given model within a class of dynamical systems. Such a criterion should help one deal with the inescapable uncertainties in model formulation, whether due to incomplete knowledge of the governing laws or inaccuracies in determining model parameters.
In this context, A. A. Andronov and L. S. Pontryagin [2] introduced the concept of structural stability for classifying dynamical systems. Structural stability means that a small, continuous perturbation of a given system preserves its dynamics up to a homeomorphism, i.e., up to a one-to-one continuous change of variables that transforms the phase portrait of our system into that of a nearby system; thus fixed points go into fixed points, limit cycles into limit cycles, etc. Closely related is the notion of hyperbolicity formulated by S. Smale [175]. A system is hyperbolic if, (very) loosely speaking, its limit set can be continuously decomposed into invariant sets that are either contracting or expanding; see [99] for more rigorous definitions.
A very simple example is the phase portrait in the neighborhood of a fixed point of saddle type. The Hartman-Grobman theorem states that the dynamics in this neighborhood is structurally stable. The converse statement, i.e. whether structural stability implies hyperbolicity, is still an open question; the equivalence between structural stability and hyperbolicity has only been shown in the C 1 case, under certain technical conditions [126,142,157,159]. Bifurcation theory is quite complete and satisfying in the setting of hyperbolic dynamics. Problems with hyperbolicity and bifurcations arise, however, when one deals with more complicated limit sets.
Hyperbolicity was introduced initially to help pursue the "dynamicist's dream" of finding -in an abstract space X of all possible dynamical systems with given dimension and regularity -an open and dense set S consisting of structurally stable ones. For this set to be open and dense means, roughly speaking, that any dynamical system in X can be approximated by structurally stable ones in S, while systems in the complement of S are negligible in a suitable sense.
Smale conjectured that hyperbolic systems form an open and dense set in the space of all C 1 dynamical systems. If this conjecture were true then hyperbolicity would be typical of all dynamics. Unfortunately, though, this conjecture is only true for one-dimensional dynamics and flows on disks and surfaces [146]. Smale [174] himself found several counterexamples to his conjecture. Newhouse [139] was able to generate open sets of nonhyperbolic diffeomorphisms using homoclinic tangencies. For the physicist, it is even more striking that the famous Lorenz attractor [119] is structurally unstable. Families of Lorenz attractors, classified by topological type, are not even countable [86,200]. In each of these examples, we observe chaotic behavior in a nonhyperbolic situation, i.e., nonhyperbolic chaos.
Nonhyperbolic chaos appears, therefore, to be a severe obstacle to any easy classification of dynamic behavior. As mentioned by J. Palis [142], A. N. Kolmogorov already suggested at the end of the sixties that "the global study of dynamical systems could not go very far without the use of new additional mathematical tools, like probabilistic ones." Once more, Kolmogorov showed prophetic insight, and nowadays the concept of stochastic stability is an important tool in the study of genericity and robustness for dynamical systems. A system is stochastically stable if its Sinai-Ruelle-Bowen (SRB) measure [173] is stable with respect to stochastic perturbations, and the SRB measure is given by lim n→∞ To replace the failed program of classifying dynamical systems based on structural stability and hyperbolicity, J. Palis [142] formulated his so-called global conjecture: The set S of systems satisfying the following conditions is dense in the C r -topology for r ≥ 1. The three conditions are: (i) the system has only finitely many attractors, i.e. periodic or chaotic sinks; (ii) the union of the corresponding basins of attraction has full Lebesgue measure in X ; and (iii) each attractor is stochastically stable in its basin of attraction.
Stochastic stability is thus based on ergodic theory. We would like to consider a more geometric approach, which can provide a coarser, more robust classification of climate models across a full hierarchy of such models. In this section, we propose such an approach, based on RDS theory [7,39,47,113, and references therein].
RDS theory describes the behavior of dynamical systems subject to external stochastic forcing; its tools have been developed to help study the geometric properties of stochastic differential equations (SDEs). RDS theory is thus the stochastic, SDE counterpart of the geometric theory of ODEs, as presented for instance in [11]. This approach provides a rigorous mathematical framework for a stochastic form of robustness, while the more traditional, topological concepts do not seem to be appropriate for the climate sciences, given the prevalence of nonhyperbolic chaos.

3.2.
Pullback and random attractors. Stochastic parametrizations for GCMs aim at compensating for our lack of detailed knowledge on small spatial scales in the best way possible [95, 115-117, 143, 181]. The underlying assumption is that the associated time scales are also much shorter than the scales of interest and, therefore, the lag correlation of the phenomena being parametrized is negligibly small. Such assumptions clearly hold, for instance, for the spatial and temporal scales of cloud processes, whose parametrization poses a notorious obstacle to realistic climate simulation and prediction [89,90,178,182]. Stochastic parametrizations thus essentially transform a deterministic autonomous system into a nonautonomous one, subject to random forcing.
Explicit time dependence, whether deterministic or stochastic, in a dynamical system immediately raises a technical difficulty. Indeed, the classical notion of attractor is not always relevant, since any object in phase space is moving with time and the concept of forward asymptotics, as considered in autonomous DDS theory, is meaningless. One needs therefore another notion of attractor. In the deterministic nonautonomous framework, the appropriate notion is that of a pullback attractor [26,103,155], which we present below. The closely related notion of random attractor in the stochastic framework is also explained briefly below, with further details given in Appendix A.
3.2.1. Pullback attraction. Before defining the notion of pullback attractor, let us recall some basic facts about nonautonomous dynamical systems. Consider the ODEẋ = f (x, t) (3.1) on a vector space X; this space could even be infinite-dimensional, if we were dealing with partial or functional differential equations, as is often the case in fluidflow and climate problems. Rigorously speaking, we cannot associate a dynamical system acting on X with a nonautonomous ODE; nevertheless, in the case of unique solvability of the initial-value problem, we can introduce a two-parameter family of operators {S(t, s)} t≥s acting on X, with s and t real, such that S(t, s)x(s) = x(t) for t ≥ s, where x(t) is the solution of the Cauchy problem with initial data x(s). This family of operators satisfies S(s, s) = Id X and S(t, τ ) • S(τ, s) = S(t, s) for all t ≥ τ ≥ s, and all real s. Such a family of operators is called a "process" by Sell [163].It extends the classical notion of the resolvent of a nonautonomous linear ODE to the nonlinear setting.
We can now define the pullback attractor as the family of invariant sets {A(t) : −∞ < t < +∞} that satisfy, for every real t and all x 0 in X: "Pullback" attraction does not involve running time backwards; it corresponds instead to the idea of measurements being performed at present time t in an experiment that was started at some time s < t in the past: the experiment has been running for long enough, and we are thus looking at the present moment for an "attracting state." Note that there exist several ways of defining a pullback attractor -the one retained here is a local one, cf. [111, and references therein]; see [16] for further information on nonautonomous dynamical systems in general.
A simple example will be helpful for the general reader. Consider the scalar linear ODEẋ with both α and σ positive. Here α > 0 implies the system is dissipative, a crucial property both mathematically and physically. In the climate context, its importance has been emphasized by [72,119], among others. The situation is depicted in Fig. 4. The PBA is easily computed in this case, and it is given by The figure clearly shows that the approximation of the PBA at a given time t = t 1 or t = t 2 improves as we "pull back" further, from s = s 1 to s = s 2 (red vs. blue orbits in the figure); it also shows that the accuracy of the approximation depends essentially on the time interval |t − s|, rather than on t and s separately. In practice, the characteristic time for obtaining an approximation with given accuracy will depend largely on the rate of dissipation α.
In the stochastic context, noise forcing is modeled by a stationary stochastic process. If the deterministic dynamical system of interest is coupled to this stochastic process in a reasonable way -defined below by the "cocycle property" -then random pullback attractors may appear. These pullback attractors will exist for almost each sample path of the driving stochastic process, so that the same probability distribution governs both sample paths and their corresponding pullback attractors. A more detailed and rigorous explanation is given in Appendix A.
Roughly speaking, the concept of a random attractor provides a geometric framework for the description of asymptotic regimes in the context of stochastic dynamics. Figure 4 showed how additive and linear deterministic forcing modifies a fixed point. We illustrate next, following [36,71], the way that multiplicative stochastic forcing modifies a strange attractor.

3.2.2.
The stochastically perturbed Lorenz model. We consider in this section the Lorenz convection model [119] and its randomly forced version studied in [36]. The model is governed by the well-known three nonlinear, coupled ODEs that are now  forced by linearly multiplicative white noise [7,39]. [SLM]    dx = s(y − x)dt + σx dW t , dy = (rx − y − xz)dt + σy dW t , dz = (−bz + xy)dt + σz dW t . (3.5) In the deterministic context, purely geometric map models were proposed in the 1970s [85] to interpret the dynamics observed numerically by E. N. Lorenz in [119]. These geometric models attracted considerable attention and it was shown that they possess a unique SRB measure [42,101], i.e., a time-independent measure that is invariant under the flow and has conditional measures on unstable manifolds that are absolutely continuous with respect to Lebesgue measure [59]. This result has been extended recently to the Lorenz flow [119] itself, in which the SRB measure is supported by a strange attractor of vanishing volume [6,192].
Even though this result was only proven recently, the existence of such an SRB measure was suspected for a long time and has motivated several numerical studies to compute a probability density function (PDF) associated with the Lorenz model, by filtering out the stable manifolds [51,134, and references therein]. The Lorenz attractor is then approximated by a two-dimensional manifold, called the branched manifold [85], which supports this PDF. Based on such a strategy, Dorfle & Graham [51] showed that the stationary solution of the Fokker-Planck equation for the Lorenz model perturbed by additive white noise possesses a density with two components: the PDF of the deterministic system supported by the branched manifold plus a narrow Gaussian distribution transversal to that manifold. µ ω (x, y, z)dx. One billion initial points have been used in both panels and the pullback attractor is computed for t = 40. The parameter values are the classical ones -r = 28, s = 10, and b = 8/3, while the time step is ∆t = 5 · 10 −3 . The color bar to the right of each panel is on a log-scale and quantifies the probability to end up in a particular region of phase space. Both panels use the same noise realization ω but with noise intensity (a) σ = 0.3 and (b) σ = 0.5. Notice the interlaced filamentary structures between highly (yellow) and moderately (red) populated regions; these structures are much more complex in panel (b), where the noise is stronger. Weakly populated regions cover an important part of the random attractor and are, in turn, entangled with (almost) zero-probability regions (black). After [36].
It follows that, in the presence of additive noise, the resulting PDF looks very much like that of the unperturbed system, only slightly fuzzier: the noise smoothes the small-scale structures of the attractor. More generally, this smoothing appears in the classical, forward approach that only considers forward asymptotics, with t → +∞ -as opposed to the pullback approach, introduced here in Section 3.2.1, in which one considers also the pullback asymptotics of s → −∞ -and it does so for a broad class of additive as well as multiplicative noises, in the sense of [7]. More precisely, this smoothing occurs provided that the diffusion terms due to the stochastic components in the Fokker-Planck equation associated with the SDE system under study are sufficiently non-degenerate; see [36, Appendix C and references therein].
Hörmander's theorem guarantees that this is indeed the case for hypoelliptic SDEs [15]. The corresponding non-degeneracy conditions allow one to regularize the stationary solutions of the transport equation which is the counterpart of the Fokker-Planck equation in the absence of noise; a measure-theoretic justification for it can be found, for instance in [112, p. 210]. This transport equation is also known as the Liouville equation and it provides the probability density at time t of S(t)x when the initial state x is sampled from a probability measure that is absolutely continuous with respect to Lebesgue measure; here {S(t)} t∈R is the flow ofẋ = f (x), for some sufficiently smooth vector field f on R d . As a matter of fact, when f is dissipative and the dynamics associated with it is chaotic, the stationary solutions of (3.6) are very often singular with respect to Lebesgue measure; these solutions are therefore expected to be SRB measures. For a broad class of noises -such as those that obey a hypoellipticity condition -the forward approach leads us to suspect that noise effects tend to remove the singular aspects with respect to Lebesgue measure. This smoothing aspect of random perturbations is often useful in the theoretical understanding of any stochastic system, in particular in the analysis of the low-and higher-order moments, which have been thoroughly studied in various contexts.
For chaotic systems subject to noise, however, this noise-induced smoothing observed in the forward approach compresses a lot of crucial information about the dynamics itself; quite to the contrary, the pullback approach brings this information into sharp focus. A quick look at Figs. 5(a,b) and 6 is already enlightening in this respect. All three figures refer to the invariant measure µ ω supported by the random attractor of our stochastic Lorenz model [SLM]. This model obeys the following three SDEs: In system (3.5), each of the three equations of the classical, deterministic model [119] is perturbed by linearly multiplicative noise in the Itô sense, with W t a Wiener process and σ > 0 the noise intensity. The other parameter values are the standard ones for chaotic behavior [72], and are given in the caption of Fig. 5. Figures 5(a,b) show two snapshots of the sample measure µ ω supported by the random attractor of [SLM] -for the same realization ω but for two different noise intensities, σ = 0.3 and 0.5, while Fig. 6 provides four successive snapshots of µ θtω , for the same noise intensity σ = 0.5 as in Fig. 5(b), but with t = t 0 + k∆t and k = 0, 1, 2, 3 for some t 0 .
The sample measures in these three figures, and in the associated short video given in the SM, exhibit amazing complexity, with fine, very intense filamentation; note logarithmic scale on color bars in the three figures. There is no fuzziness whatsoever in the topological structure of this filamentation, which evokes the Cantor-set foliation of the deterministic attractor [85]. Such a fine structure strongly suggests that these measures are supported by an object of vanishing volume.
Much more can be said, in fact, about these objects. RDS theory offers a rigorous way to define random versions of stable and unstable manifolds, via the Lyapunov spectrum, the Oseledec multiplicative theorem, and a random version of the Hartman-Grobman theorem [7]. These random invariant manifolds can support measures, like in the deterministic context. When the sample measures µ ω of an RDS have absolutely continuous conditional measures on the random unstable manifolds, then µ ω is called a random SRB measure. One can prove rigorously, by relying on Theorem B of [113], that the sample measures of the discretized stochastic system obtained from the [SLM] model share the SRB property. Indeed, it can be shown that a Hörmander hypoellipticity condition is satisfied for our discretized [SLM] model, thus ensuring that the random process generated by this model has a smooth density p(t, x) [104]; see [36, Appendix C1] for the details. Standard arguments [177] can then be used to prove that the stationary solution ρ of our model's Fokker-Planck equation is in fact absolutely continuous with respect to Lebesgue measure.
Since these simulations exhibit exactly one positive Lyapunov exponent, the absolute continuity of ρ implies that the sample measures seen in Figs. 5 and 6 are, actually, good numerical approximations of a genuine random SRB measure for our discretized [SLM], whenever δt is sufficiently small; see also the next section. In fact, Ledrappier & Young's [113] Theorem B is a powerful result, which clearly shows that -in noisy systems, and subject to fairly general conditions -chaos can lead to invariant sample measures with the SRB property; see also [36,Appendix C2]. It is striking that the same noise-induced smoothing that was "hiding" the dynamics in the forward approach allows one here to exhibit the existence of an SRB measure from a pullback point of view, and thus to compute explicitly the unstable manifold supporting this invariant measure.
Note that, since the sample measures associated with the discrete [SLM] system are SRB here, they are so-called physical measures [7,36, and references therein] and can thus be computed at any time t by simply flowing a large set of initial data from the remote past s t up till t, for a fixed realization ω; this is exactly how Figs. 5 and 6 were obtained. Given the SRB property, the nonzero density supported on the model's unstable manifolds delineates numerically these manifolds; Figs. 5 and 6 provide therefore an approximation of the global random attractor of our stochastic Lorenz system.
Finally, these random measures are Markovian, in the sense that they are measurable with respect to the past σ-algebra of the noise [7]. The latter statement results directly from the fact that these measures are physical, and thus satisfy the required measurability conditions in the pullback limit. The information about the moments that is available in the classical Fokker-Planck approach is complemented here by information about the pathwise moments. These pathwise statistics are naturally associated with the sample measures -when the latter are SRB -by taking appropriate averages.
The evolution of the sample measures µ θtω -as apparent from the short video in [36, Supplementary Material] -is quite complex, and two types of motion are present. First, a pervasive "jiggling" of the overall structure can be traced back to the roughness of the Wiener process W t and to the multiplicative way it enters into the [SLM] model. Second, there is a smooth, regular low-frequency motion present in the evolution of the sample measures, which seems to be driven by the deterministic system's unstable limit cycles and is thus related to the well-known lobe dynamics. The latter motion is clearly illustrated in Fig. 6.
More generally, it is worth noting that this type of low-frequency motion seems to occur quite often in the evolution of the samples measures of chaotic systems perturbed by noise; it appears to be related to the recurrence properties of the unperturbed deterministic flow, especially when energetic oscillatory modes characterize the latter. To the best of our knowledge, there are no rigorous results on this type of phenomenon in RDS theory.
Besides this low-frequency motion, abrupt changes in the global structure occur from time to time, with the support of the sample measure either shrinking or expanding suddenly. These abrupt changes recur frequently in the video associated with Fig. 6, which reproduces a relatively short sequence out of a very long stochastic model integration; see SM.
As the noise intensity σ tends to zero, the sample-measure evolution slows down, and one recovers numerically the measure of the deterministic Lorenz system (not shown). This convergence as σ → 0 may be related to the concept of stochastic stability [101,202]. Such a continuity property of the sample measures in the zeronoise limit does not, however, hold in general; it depends on properties of the noise, as well as of the unperturbed attractor [25,44,48].
As stated in the theoretical section, the forward approach is recovered by taking the expectation, E[µ • ] := Ω µ ω P(dω), of these invariant sample measures. In practice, E[µ • ] is closely related to ensemble or time averages that typically yield the previously mentioned PDFs. In addition, when the random invariant measures are Markovian and the Fokker-Planck equation possesses stationary solutions, E[µ • ] = ρ, where ρ is such a solution. Subject to these conditions, there is even a one-to-one correspondence between Markovian invariant measures and stationary measures of the Markov semi-group [7,46]. The inverse operation of µ → ρ = E[µ • ] is then given by ρ → µ ω = lim t→∞ Φ(−t, ω) −1 ρ; the latter is in fact the pullback limit of ρ due to the cocycle property [46]. It follows readily from this result that RDS theory "sees" many more invariant measures than those given by the Markov semi-group approach: non-Markovian measures appear to play an important role in stochastic bifurcation theory [7], for instance.
To summarize, one might say that the classical forward approach considers only expectations and PDFs, whereas the RDS approach "slices" the statistics very finely: the former takes a hammer to the problem, while the latter takes a scalpel. Clearly, distinct physical processes may lead to the same observed PDF: the RDS approach and, in particular, the pullback limit are able to discriminate between these processes and thus provide further insight into them.

Time-dependent forcing and coupled ocean-atmosphere variability.
Given the NDS and RDS concepts of pullback attraction and time-dependent invariant measures, as introduced in the previous section, we are ready now to return to the wind-driven ocean circulation of Section 2 and consider two additional steps in its understanding, simulation and prediction, namely time-dependent forcing and coupling with the atmosphere above. We start by applying a still prescribed but now time-dependent wind-stress to a double-gyre model.

4.1.
Time-dependent forcing. The most obvious way in which wind stress varies in mid-latitudes is from summer to winter: the so-called atmospheric jet stream is both stronger and closer to the equator in winter. Such periodic changes in the forcing were already considered in [185], albeit without the theoretical underpinnings discussed herein. The first application of pullback attractors to the double-gyre problem appears in [149], for the case of periodic forcing, and we rely here on [150] for a systematic presentation of this novel application of NDS theory to the winddriven ocean circulation. 4.1.1. Model formulation. The model, in its autonomous form [147], differs somewhat from that of Eq. (2.1) by the absence of the Laplacian eddy viscosity -given by the bilaplacian ∆ 2 ψ of the streamfunction ψ -and by a slightly different form of the wind stress. Its discretized form is obtained by projection of Eqs (2.1) onto a set of cartesian basis functions, and low-order truncation of the expansion thus obtained. The discrete, low-order model is thus governed by four nonlinear coupled ODEs for the variables {Ψ i (t), i = 1, · · · , 4} that are the coefficients of the streamfunction ψ(x, y, t) retained in this expansion. The choice of basis functions follows up on such idealized double-gyre models introduced in [92,170], in particular on this basis including the exponential form of a current that decays away from the domain's western boundary.
In the presence of nonautonomous forcing, the model takes the forṁ Here Ψ denotes the vector of expansion coefficients (Ψ 1 , Ψ 2 , Ψ 3 , Ψ 4 ) of the streamfunction, W is the vector of forcing terms (W 1 , W 2 , W 3 , W 4 ), and the bilinear terms B i are given by In [150], the forcing W(x, t) is defined as where x = (x, y) and w(x) is the time-constant double-gyre wind stress curl used in [147]; furthermore, γ is the dimensionless intensity of this wind stress, while f (t) is an aperiodic time dependence weighed by the dimensionless parameter . Clearly, w(x) is projected onto the same four leading-order basis functions as Ψ and W.

Model behavior.
The aperiodic forcing f (t) in Fig. 7a corresponds to an idealized but aperiodic choice with a substantial decadal component. Such decadal climate signals are associated with the so-called Pacific Decadal Oscillation [31,127], for instance, and climate prediction on such time scales is a matter of great recent interest for the climate sciences [29,73,151,178,182].
The evolution of the model solutions subject to this forcing is shown in Figs. 7b and 7c for two reference cases, γ = 0.96 and γ = 1.1, respectively; the solutions of the corresponding autonomous system, i.e. with = 0, are plotted in [150, Figs. 1(b,c)]. It is clear from the two panels that a regime change occurs as the forcing is increased. This regime change occurs in the autonomous sytem at γ = 1.0 and corresponds to a global bifurcation associated with a homoclinic orbit; see [147,170] and Section 2.3.3 here. The persistence of the regime shift in the nonautonomous case, for as large as 0.2, is interesting but none too surprising.
In Figs. 7(b, c), N trajectories Ψ k (t) emanate from N initial states uniformly distributed at time t 0 = 0 on a reference subset Γ of the (Ψ 1 , Ψ 3 )-plane, while the (Ψ 2 , Ψ 4 ) coordinates of the initial states (not shown) are chosen the same way as in [149]. The panels (b,c) here provide a first representation of the sets that approximate the corresponding PBAs; the small number of trajectories is in fact chosen for the sake of graphical clarity. The correct identification and characterization of the PBAs requires, however, an analysis of the PDFs that evolve along the trajectories and of the convergence to the appropriate invariant time-dependent sets; see Figs. 7(b',c'). This convergence was shown in [150] to take no longer than about 15 years of model simulation. in [150,Appendix], based on general, NDS-theoretical results [26,103]. Numerically, though, this unique global attractor seems to possess two separate local PBAs, as apparent from Fig. 8. Panels (a) and (b) in the figure refer again to the two types of regimes apparent in Figs. 7(b,c), for γ = 0.96 and γ = 1.1, respectively. The mean normalized distance plotted in the figure is defined as follows: Here δ t is the distance, at time t, between two trajectories of the model that were a distance δ 0 apart at time t = t 0 , and the normalized distance δ n = δ t /δ 0 is averaged over the whole forward time integration T tot = t fin −t init of the available trajectories. The maps of σ in Fig. 8 reveal large chaotic regions where δ n 1 on average (warm colors) but also non-chaotic regions, in which σ ≤ 1 (blue) and thus initially close trajectories do remain close on average. The rectangular regions in the two panels that are labeled by letters A and B and by numbers 1 − 4 correspond to subdomains of the initial set Γ and are discussed further in [150,Sec. 5]. The numerical evidence in Fig. 8 suggests, furthermore that the boundary between the two types of local attractors has fractal properties.
In the autonomous context, the coexistence of topologically distinct local attractors is well known in the climate sciences [52,54,72,170,171, and references therein]. The coexistence of local PBAs with chaotic vs. non-chaotic characteristics, within a unique global PBA, as illustrated by Fig. 8 here, seems to be novel, at least in the climate literature.

Coupled variability.
Accounting for time dependence in atmospheric forcing represents but one step on the way to a fully coupled description of the oceanatmosphere system. The striking effects of intrinsic ocean variability on the atmosphere -even in the presence of prescribed, time-independent wind stress, and at very small spatial scales -have been abundantly documented in [23,[62][63][64][65]176, inter alia]. The study of fully coupled ocean-atmosphere models, with two-way interaction, is thus an obvious next step.

Coupled model formulation.
In the hierarchy of climate models discussed in Section 1, we rely here on a model that can still be considered as being a loworder one -like the one of the previous subsection -but is more complex in terms of the physical mechanisms included. Its ocean component is based on the quasi-geostrophic model in [147][148][149] and in Section 4.1 here, while the atmospheric component is likewise quasi-geostrophic and goes back to work by J. G. Charney and associates [33,156].
S. Vannitsem and associates [196,197] had studied the coupling of these two components via momentum fluxes, as customary in ocean models that focus on the wind-driven circulation. Here we rely on the model version in [198], which also includes heat fluxes as an important ingredient in the coupling, and thus respects the energy balance of the coupled system, cf. [72, Ch. 10, and references therein].
The atmospheric model is based on the vorticity equations of a two-layer, quasigeostrophic flow defined on a β-plane [144,195], while the oceanic model is based on the reduced-gravity, quasi-geostrophic shallow-water model on a β-plane [77,148,195]. The originality of the model resides in both the atmospheric and oceanic components being thermally active and communicating through radiative, latent and sensible heat fluxes [198]; the latter follow a suggestion in [14].
After the usual procedure of expansion and truncation of the nondimensional form of the governing PDEs, one obtains n a = 20 ODEs for the atmosphere and n o = 16 ODEs for the ocean. The full coupled model is thus based on a set of n a + n o = 36 nonlinear ODEs, which couple the expansion coefficients of the streamfunction field in the atmosphere and in the ocean with the temperature field. The time-dependent solutions of this system were obtained by very long numerical integrations, using a time step ∆t = 0.01 nondimensional time units.

4.2.2.
Coupled model behavior. The model's ocean component has a double-gyre pattern, while its atmospheric mean circulation reveals the presence of predominant low-and high-pressure zones, as well as of a subtropical jet [198,Figs. 4 and 5]. The latter features recall realistic climatological properties of the oceanic atmosphere.
Bifurcation analysis [198, Figs. 1 and 2], along with a large number of numerical model simulations for various parameter values, reveal the presence of LFV concentrated on and near a long-periodic, attracting orbit. This orbit arises for large values of the meridional gradient of radiative input C o and of the frictional coupling parameter d; it is plotted in Fig. 9(a) for several values of C o between 270 Wm −2 and 320 Wm −2 , and for d = 1 × 10 −8 s −1 . A somewhat surprising result of the coupled model is that this long-periodic limit cycle combines atmospheric and oceanic modes, thus contradicting the expectation -due to the ocean's greater inertia and slower time scales -of being limited to oceanic modes only.
Chaotic behavior develops around this limit cycle as it loses its stability. In Fig. 9(b), one sees several solutions for C o = 300 Wm −2 and 10 −9 s −1 ≤ d ≤ 8 × 10 −8 s −1 . It is only for the highest of these d-values that the orbit is purely periodic and stable (light blue). The other orbits are visibly chaotic, but their behavior is still dominated by the LFV on decadal and multi-decadal time scales that is typical of oceanic processes. Relative changes in % of the STD Relative changes in % of the skewness Relative changes in % of the STD Relative changes in % of the skewness Figure 9. Long-periodic orbits and slow manifold of the coupled ocean-atmosphere model of [198], in a three-dimensional (3-D) projection onto the leading modes (ψ a,1 , ψ o,2 , T o,2 ) of the atmospheric and oceanic streamfunction fields and that of the oceanic temperature field. . After [198].
All the periodic orbits plotted in Fig. 9(a) involve not just the three variables shown, but a total of d s = 17 variables -namely ψ o,i for i = 2, 4, 6, 8 in the oceanic streamfunction, T o,i for i = 2, 4, 6 in the oceanic temperatures, ψ a,i for i = 1, 5, 6, 9, 10 in the atmospheric streamfunction, and θ a,i for i = 1, 5, 6, 9, 10 in the atmospheric temperatures -while the d f = 19 other variables are equal to 0. If the latter variables are set to 0 initially, they remain equal to 0 as the flow evolves. Hence the subspace of the 17 variables listed above is invariant under the phase space flow induced by the 36 ODEs that govern our coupled model. Moreover, all the orbits computed within this subspace were dominated by slow motions; hence the use of the subscript 's' for "slow" and of 'f' for "fast" with respect to the 19 other variables.
The difference in character between orbits dominated by long periods and those that are much more chaotic -as increasingly apparent in Fig. 9(b) for the magenta, dark blue, green and red orbits -is shown in Figs. 10(a,b). The time series in panel (a), while irregular, are fairly smooth and clearly dominated by a decadal periodicity. Singular-spectrum analysis of such solutions [78, and references therein] confirmed a statistically significant peak at roughly 11 years, while the total length of the limit cycles in Fig. 9(a) is of roughly 22 years.
To assess more quantitatively the difference between the hypothetical slow attractors associated with the smoother orbits vs. the ones associated with more chaotic behavior, the Kaplan-Yorke dimension d KY [98], often called the Lyapunov dimension, was calculated. In this case, calculations were carried out for orbits that differ from each other by the third important parameter of the coupled model, λ, which controls the intensity of the heat flux between the ocean and the atmosphere; this parameter can also affect strongly the stability of model solutions.
The leading Lyapunov exponent σ 1 decreases monotonically as a function of λ for most values of C o and d, cf. [198,Fig. 14]. Overall, the results there suggest that increasing the heat fluxes between the ocean and the atmosphere stabilizes the coupled model. This stabilizing effect of the fluxes is also seen in examining the Kaplan-Yorke dimension d KY of the attractor associated with the slow behavior in the coupled model. This dimension is given by a simple functional of the Lyapunov exponents, where k * is the largest k such that k σ k > 0. Under fairly general circumstances, d KY equals the information dimension and hence it is preferable to several other ways of estimating the dimensionality of attractors [188,Ch. 2]. The results are very instructive, indeed, for five orbits computed at C o = 350 Wm −2 and d = 6 × 10 −8 s −1 , three of which are plotted in [198,Fig. 7]. For the slow limit cycle at λ = 100 Wm −2 K −1 (blue curve in the figure), one obviously has d KY = 1. As λ decreases, one gets: for λ = 50 (not in the figure) d KY 8.7, for λ = 20 (green curve) d KY 14.6, for λ = 1.0 (not in the figure) d KY 20.4, and for λ = 0 (i.e., no heat flux at all; red curve in the figure) d KY 21.5. Visually, the green and red objects in Fig. 7 of [198] look successively stranger, while the Kaplan-Yorke dimension calculations indicate that the corresponding attractors have d KY < d s = 17 for all but the two lowest λ-values, namely 1.0 and 0.
There is thus strong numerical evidence to indicate that, for parameter ranges within which the coupled model's solutions are purely periodic or not far from it, a slow attractor of dimension d KY < d s = 17 exists and that its dimension increases as solutions become noisier, like in Fig. 10(b). 5. Concluding remarks. Over the last four decades, the climate sciences have become of paramount importance in understanding, predicting and attempting to improve the relations between humanity and the planet it inhabits [32,89,90,178,182]. This development could not have occurred without a strong contribution from the mathematical sciences. Several books [52,53,72,120,125,129,195] have documented the mathematics that have been used and that have in turn benefited from this area of applications.
Geopotential height difference (m) Figure 10. Low-frequency variability (LFV) of the coupled oceanatmosphere model. Time series of geopotential height difference between locations (π/n, π/4)) and (π/n, 3π/4)) of the model's nondimensional domain, for different values of meridional temperature gradient C o and coupling coefficient d; this height difference plays the role of a North Atlantic Oscillation index in the model. (a) Chaotic but smooth trajectories living on a hypothetical slow attractor; and (b) strongly fluctuating trajectories that are not lying close to such a slow attractor.
Having started this paper in the first person singular, a voice not often used in contemporary scientific literature, I'd like to ask for the reader's permision to conclude in the same voice. It seems to me that -in spite of increasing efforts in this direction, e.g. [131, and further articles in the same volume] -insufficient attention has been paid to an understanding of the strong interaction between natural or intrinsic climate variability [29,137], on the one hand, and the anthropogenic forcing to which a huge fraction of the recent climate literature has paid attention [178,182].
I would like, therefore, to suggest an avenue for overcoming this gap, and summoning the necessary mathematics for doing so. Not surprisingly for the reader who has followed me through this paper, it relies heavily on the application of NDS and RDS theory.

5.1.
A generalized definition of climate sensitivity. The usual way of defining climate sensitivity, going back to [32], relies on the conceptual picture of Fig. 11a: climate is in a steady state characterized by a global average temperature T = T 0 ; a parameter that is affected by human activities, say CO 2 concentration, is suddenly changed; and T (t) follows -according to the solution of a scalar, linear ODE like Eq. (3.3), in which T replaces x and a Heaviside function replaces σt on the right-hand side -so that T (t) → T 1 , and T 1 > T 0 when the CO 2 jump is positive.
This cartoon has been, of course, considerably enriched in successive assessment reports of the the Intergovernmental Panel on Climate Change [89,90,178,182] from a scalar, linear ODE to the coupled, nonlinear PDE systems that govern GCMs. But no mathematically consistent picture has emerged for reconciling the intrinsic variability that those PDE systems include, as illustrated in Sections 2 and 4 herein, with the complex nature of the forcing, both natural and anthropogenic. Thus climate sensitivity is still essentially thought of as ∆T /∆CO 2 , where ∆T = T 1 −T 0 , the difference between a final and an initial "equilibrium" temperature, and ∆CO 2 is likewise the difference between a final and an initial forcing level. Consideration has been given to changes in other fields than temperature, such as precipitation [118,160], to variances and energetics [121,122], or to regional differences [24,123]. Relative changes in % of the SD Relative changes in % of the skewne Relative changes in % of the SD Relative changes in % of the skewness a nonequilibrium model. Given a jump in a parameter, such as CO2 concentration, only the mean global temperature T changes in (a), while in (b) it is also the period, amplitude and phase of a purely periodic oscillation, such as the seasonal cycle or the intrinsic ENSO cycle. Finally, in panel (c), it is also the character of the oscillation, whether deterministic or stochastically perturbed, which may change. After [71].
To include these considerations and follow up on ideas formulated in [71], I'd like to suggest a definition of climate sensitivity γ 1 that takes full advantage of the NDS and RDS framework and that encompasses changes in the natural varibility, in its most general meaning, as the forcing level or some other parameter changes. This definition replaces the more-or-less standard definition of γ = ∂T /∂µ [72, p. 320], with µ an arbitrary parameter generalizing CO 2 -a definition that corresponds to the equilibrium situation in Fig. 11a -by a more appropriate and self-consistent one that corresponds to the chaotic and random situation in Fig. 11c [56,71] between the time-dependent invariant measures supported on the system's PBA at two distinct parameter values. An idea of the large differences that may exist between two such PBAs is given by the three snapshots of the invariant measure on the PBA of the infinite-dimensional, but still relatively simple ENSO model of [66] in Fig. 12 below.
The model's two dependent variables are sea surface temperature T in the eastern Tropical Pacific and thermocline depth h there, as a function of time t: In Eqs. (5.2), F stands for the seasonal forcing, with period 2π/ω = 12 months, and all three variables -T, h and F -depend on the time t and the delays τ 1 and τ 2 ; these delays characterize the traveling times along the Equator of eastward Kelvin and westward Rossby waves. Several authors have studied delay-differential models of ENSO; see [52] for a review and [38,82, and references therein] for further mathematical details on such models.
The solutions of Eqs. (5.2) exhibit periodic, quasi-periodic and chaotic behavior, as well as frequency locking to the time-dependent, seasonal cycle. Thus, in principle, an infinite number of scalars are required to define the dependence of these solutions on the parameters τ 1 and τ 2 ; these scalars need to include not just the means of temperature T and depth h, but also their variance and higher-order moments. In [71,Fig. 7] this dependence was shown for the zeroth, second and fourth moments of h(t); more precisely the plotted quantities were the mean, standard deviation and fourth root of the kurtosis of the PDF. While the figure illustrates successive snapshots of the invariant measure supported on the PBA for the same value of the model parameters, there are similarly large differences between such measures for different parameter values [71,82]. Closer attention to the definition of Eq. (5.1) can provide better insights into the changes in time, as well as with respect to a pararameter, of higher moments of a model's PDF but also into the delicate and important matter of changes in the distribution of extreme events [34,81].
Steps in the application of the NDS and RDS framework to climate change and climate sensitivity are being taken by a number of research groups [19,53,57,123]. This area of research is only opening up and should provide many opportunities for contributions by mathematicians and climate scientists, as well as for fruitful interactions among them.
Pierini and S. Vannitsem [150,198]. Figure 12  Appendix A. RDS theory and random attractors. We present here briefly the mathematical concepts and tools of random dynamical systems, random attractors and stochastic equivalence. We shall use the concept of pullback attractor introduced in Section 3.2.1 to define the closely related notion of a random attractor, but need first to define an RDS. We denote by T the set Z, for maps, or R, for flows. Let (X, B) be a measurable phase space, and (Ω, F, P, (θ(t)) t∈T ) be a metric dynamical system i.e. a flow in the probability space (Ω, F, P), such that (t, ω) → θ(t)ω is measurable and θ(t) : Ω → Ω is measure preserving, i.e., θ(t)P = P.
If ϕ is measurable, it is called a measurable RDS over θ. If, in addition, X is a topological space (respectively a Banach space), and ϕ satisfies (t, ω) → ϕ(t, ω)x continuous (resp. C k , 1 ≤ k ≤ ∞) for all (t, ω) ∈ T×Ω, then ϕ is called a continuous (resp. C k ) RDS over the flow θ. If so, then is a (measurable) flow on Ω × X, and is called the skew-product of θ and ϕ. In the sequel, we shall use the terms "RDS" or "cocycle" synonymously. The choice of the so-called driving system θ is a crucial step in this set-up; it is mostly dictated by the fact that the coupling between the stationary driving and the deterministic dynamics should respect the time invariance of the former, as illustrated in Fig. 13. The driving system θ also plays an important role in establishing stochastic conjugacy [43] and hence the kind of classification we aim at.
The concept of random attractor is a natural and straightforward extension of the definition of pullback attractor (3.2), in which Sell's [163] process is replaced by a cocycle, cf. Fig. 13, and the attractor A now depends on the realization ω of the noise, so that we have a family of random attractors A(ω), cf. Fig. 14. Roughly speaking, for a fixed realization of the noise, one "rewinds" the noise back to t → −∞ and lets the experiment evolve (forward in time) towards a possibly attracting set A(ω); the driving system θ enables one to do this rewinding without changing the statistics, cf. Figs. 13 and 14. Figure 13. Random dynamical systems (RDS) viewed as a flow on the bundle X × Ω = "dynamical space" × "probability space." For a given state x and realization ω, the RDS ϕ is such that Θ(t)(x, ω) = (θ(t)ω, ϕ(t, ω)x) is a flow on the bundle. Reproduced from [79], with permission from Elsevier.  Figure 14. Schematic diagram of a random attractor A(ω), where ω ∈ Ω is a fixed realization of the noise. To be attracting, for every set B of X in a family B of such sets, one must have lim t→+∞ dist(B(θ(−t)ω), A(ω)) = 0 with B(θ(−t)ω) := ϕ(t, θ(−t)ω)B; to be invariant, one must have ϕ(t, ω)A(ω) = A(θ(t)ω). This definition depends strongly on B; see [45] for more details. Reproduced from [79], with permission from Elsevier.
Other notions of attractor can be defined in the stochastic context, in particular based on the original SDE; see [45,47] for a discussion on this topic. The present definition, though, will serve us well.
Having defined RDSs and random attractors, we now introduce the notion of stochastic equivalence or conjugacy, in order to rigourously compare two RDSs; it is defined as follows: two cocycles ϕ 1 (ω, t) and ϕ 2 (ω, t) are conjugated if and only if there exists a random homeomorphism h ∈ Homeo(X) and an invariant set such that h(ω)(0) = 0 and ϕ 1 (ω, t) = h(θ(t)ω) −1 • ϕ 2 (ω, t) • h(ω). (5.4) Stochastic equivalence extends classic topological conjugacy to the bundle space X × Ω, stating that there exists a one-to-one, stochastic change of variables that continuously transforms the phase portrait of one sample system in X into that of any other such system.
Appendix B. Mixing in random dynamical systems. In this appendix, we define rigorously the concept of an ω-wise mixing RDS, in the continuous-time context. Recall first the well-known definition of mixing in a deterministic dynamical system. Given a flow {φ t } on a topological space X, which possesses an invariant (Borel) probability measure µ, we say that the dynamical system (φ t , µ) is mixing if for any two measurable sets A and B, µ(A ∩ φ −t (B)) −→ t→∞ µ(A)µ(B), (5.5) or equivalently, for any pair of continuous functions F, G : X → R. Equation (5.5) states that the set of points in A whose images belong to B by {φ t } tends towards having the same proportion in A as B has in X, with proportions being understood in terms of the measure µ. Hence any measurable set will tend to redistribute itself over the state space according to µ.
Let us now consider a cocycle {Φ(t, ω)} (t,ω)∈R×Ω on the base space (Ω, F, P, {θ t }), which possesses the sample measures {µ ω }. We say that Φ is ω-wise mixing or fiber mixing [21] -or even simply mixing, if no confusion is possible -if for any random sets [45] A(ω) and B(ω), almost surely with respect to P. This mixing concept and its interpretation are natural extensions of their deterministic counterparts just recalled above, except that the mixing property has to be checked across the fibers ω and θ t ω, due to the skew-product nature of the RDS (Φ, θ) [21].
Appendix C. Low-frequency variability (LFV) and mixing. Low-frequency variability (LFV) is a widely used, but not clearly defined concept in the atmospheric, oceanic and climate sciences [72,75,79]. In general, one just refers to phenomena whose periods are longer than those previously studied. Examples include atmospheric LFV -referring to so-called intraseasonal oscillations whose characteristic time scale of 10-100 days is longer than the 5-10-day life cycle of mid-latitude storms but not longer than a season [75,76] -or oceanic LFV, referring to interannual or interdecadal variability whose characteristic time scales are longer than the several-months-long ones of mesoscale eddies and the seasonal cycle of a year [70,75].
In this appendix, we clarify the notion of LFV from a mathematical perspective. Let us reconsider the deterministic Lorenz system [119]. It is known that the power spectral density, or power spectrum, of this system is exponentially decaying [74,140]. At the same time, one can check numerically that the decay of the autocorrelation function is exponentially decaying, too. Other types of power-spectrum behavior may be encountered for chaotic dynamical systems, though. Aside from pure power-law decay, it may also happen that the power spectrum contains one or several broad peaks that stand out above the continuous background, whether the latter has a power-law [18] or exponential decay. If the central frequencies of these peaks are located in a frequency band that lies close to the lower end of the frequency range being studied, the system is said to exhibit LFV [74,78]. This climatically motivated, but vague notion of LFV can be formalized mathematically through the mixing concept introduced in Appendix B. Indeed, for a general flow {φ t } on a topological space X, which possesses an invariant (Borel) probability measure µ, let us define the correlation function by using the same notations as above. If the system (φ t , µ) is mixing, the rate of approach to zero of C t (F, G) is called the rate of decay of correlations for its observables F and G. A system exhibits a slow decay rate of correlations at short lags if the rate is slower than exponential over some characteristic time interval [0, T ]. This emphasis on the nonuniform decay rate of correlations is consistent with the heuristic LFV notion used in the climate sciences, as described above, and it connects the mixing properties of the flow with its power spectral density. In [37], the authors elucidated the relationship between these two approaches, as well as the relationships between the rate of decay of correlations and the occurrence of rough or smooth parameter dependence of the model's statistics. Moreover, the quantitative knowledge about the system's LFV -when combined with the pathwise approach discussed in Section 3.2 -can be used, according to [35,106], to improve ensemble prediction skills of data-driven stochastic models, such as those discussed in [107].