Bayesian inference of chaotic dynamics by merging data assimilation, machine learning and expectation-maximization

  * Corresponding author: Marc Bocquet

  • The reconstruction of high-dimensional chaotic dynamics, such as geophysical flows, from observations is hampered by (ⅰ) the partial and noisy observations that can realistically be obtained, (ⅱ) the need to learn from long time series of data, and (ⅲ) the unstable nature of the dynamics. To achieve such inference from observations over long time series, it has been suggested to combine data assimilation and machine learning in several ways. We show how to unify these approaches from a Bayesian perspective using expectation-maximization and coordinate descent. In doing so, the model, the state trajectory, and the model error statistics are estimated jointly. Implementations and approximations of these methods are discussed. Finally, we successfully test the approach numerically on two relevant low-order chaotic models with distinct identifiability.

    Mathematics Subject Classification: Primary: 49J15, 65C60; Secondary: 86-08.


  • Figure 1.  From top to bottom: representation of the flow rate $ \boldsymbol{\phi}_{\mathbf{A}} $ with a NN, integration of the flow rate into $ \mathbf{f}_{\mathbf{A}} $ using an explicit integration scheme (here a second-order Runge-Kutta scheme), and $ {N_\mathrm{c}} $-fold composition up to the full resolvent $ \mathbf{F}_{\mathbf{A}} $. $ \delta t $ is the integration time step corresponding to the resolvent $ \mathbf{f}_{\mathbf{A}} $
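
    The construction in Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names `flow_rate`, `rk2_step` and `resolvent` are hypothetical, the Lorenz-96 tendency stands in for the trained NN $ \boldsymbol{\phi}_{\mathbf{A}} $, and the midpoint rule is assumed as the second-order Runge-Kutta scheme (the caption does not say which variant is used).

```python
import numpy as np


def flow_rate(x, forcing=8.0):
    """Stand-in for the NN flow rate phi_A (assumption: Lorenz-96 tendency).

    dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + forcing, periodic in i.
    """
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing


def rk2_step(x, dt):
    """Elementary resolvent f_A: one explicit RK2 (midpoint) step of length dt."""
    k1 = flow_rate(x)
    k2 = flow_rate(x + 0.5 * dt * k1)
    return x + dt * k2


def resolvent(x, dt, n_c):
    """Full resolvent F_A: n_c-fold composition of f_A, spanning n_c * dt."""
    for _ in range(n_c):
        x = rk2_step(x, dt)
    return x


# Usage: advance a slightly perturbed 40-variable state by n_c = 2 steps.
x0 = 8.0 * np.ones(40)
x0[0] += 0.01
x_next = resolvent(x0, dt=0.05, n_c=2)
```

    In this sketch the time between two states of the surrogate model is $ {N_\mathrm{c}}\, \delta t $, so increasing `n_c` refines the integration without changing that interval, provided `dt` is reduced accordingly.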

    Figure 2.  On the left-hand side: properties of the surrogate model obtained from full but noisy observation of the L96 model in the nominal configuration ($ L = 4 $, $ K = 5000 $, $ \sigma_y = 1 $, $ {N_\mathrm{y}} = {N_\mathrm{x}} = 40 $). On the right-hand side: properties of the surrogate model obtained from full but noisy observation of the L05Ⅲ model in the nominal configuration ($ L = 4 $, $ K = 5000 $, $ \sigma_y = 1 $, $ {N_\mathrm{y}} = {N_\mathrm{x}} = 36 $). From top to bottom: the FS (NRMSE as a function of lead time in Lyapunov time), the LS (all exponents), and the PSD (in log-log scale). A total of $ 10 $ experiments have been performed for each configuration. The curves corresponding to each member are drawn as thin blue lines, while the mean of each indicator over the ensemble is drawn as a thick dashed orange line

    Figure 3.  Same as Figure 2 but for several values of the training window length $ K $. Each curve is the mean over $ 10 $ experiments with different sets of observations. The LS and PSD of the reference models are also plotted for comparison

    Figure 4.  On the left-hand side: properties of the surrogate model obtained from full but noisy observation of the L96 model in the nominal configuration ($ L = 4 $, $ K = 5000 $, $ {N_\mathrm{y}} = {N_\mathrm{x}} = 40 $) and with several values of $ \sigma_y $. On the right-hand side: properties of the surrogate model obtained from full but noisy observation of the L05Ⅲ model in the nominal configuration ($ L = 4 $, $ K = 5000 $, $ {N_\mathrm{y}} = {N_\mathrm{x}} = 36 $) and with several values of $ \sigma_y $. From top to bottom: the FS (NRMSE as a function of lead time in Lyapunov time) and the PSD (in log-log scale), averaged over an ensemble of $ 10 $ samples

    Figure 5.  On the left-hand side: properties of the surrogate model obtained from partial and noisy observation of the L96 model in the nominal configuration ($ L = 4 $, $ K = 5000 $, $ \sigma_y = 1 $, $ {N_\mathrm{x}} = 40 $), where $ {N_\mathrm{y}} $ is varied. On the right-hand side: properties of the surrogate model obtained from partial and noisy observation of the L05Ⅲ model in the nominal configuration ($ L = 4 $, $ K = 5000 $, $ \sigma_y = 1 $, $ {N_\mathrm{x}} = 36 $), where $ {N_\mathrm{y}} $ is varied. From top to bottom: the mean FS (NRMSE as a function of lead time in Lyapunov time), the mean LS (all exponents), and the mean PSD (in log-log scale). A total of $ 10 $ experiments have been performed for each configuration

    Table 1.  Scalar indicators for nominal experiments based on L96 and L05Ⅲ. Key hyperparameters are recalled. The statistics of the indicators are obtained over $ 10 $ samples

    | Model | $ {N_\mathrm{y}} $ | $ \sigma_y $ | $ K $ | $ L $ | $ \pi_{\frac{1}{2}} $ | $ \sigma_q $ | $ \lambda_1 $ |
    |---|---|---|---|---|---|---|---|
    | L96 | $ 40 $ | $ 1 $ | $ 5000 $ | $ 4 $ | $ 4.56 \pm 0.06 $ | $ 0.08790 \pm 2 \times 10^{-5} $ | $ 1.66 \pm 0.02 $ |
    | L05Ⅲ | $ 36 $ | $ 1 $ | $ 5000 $ | $ 4 $ | $ 4.06 \pm 0.21 $ | $ 0.07720 \pm 2 \times 10^{-5} $ | $ 1.03 \pm 0.05 $ |

    Table 2.  Scalar indicators for L96 and L05Ⅲ in their nominal configuration, using either the full or the approximate schemes. The statistics of the indicators are obtained over $ 10 $ samples

    | Model | Scheme | $ \pi_{\frac{1}{2}} $ | $ \sigma_q $ | $ \lambda_1 $ |
    |---|---|---|---|---|
    | L96 | Approximate | $ 4.56 \pm 0.06 $ | $ 0.08790 \pm 2 \times 10^{-5} $ | $ 1.66 \pm 0.02 $ |
    | L96 | Full | $ 4.24 \pm 0.07 $ | $ 0.09152 $ | $ 1.66 \pm 0.02 $ |
    | L05Ⅲ | Approximate | $ 4.06 \pm 0.21 $ | $ 0.07720 \pm 2 \times 10^{-5} $ | $ 1.03 \pm 0.05 $ |
    | L05Ⅲ | Full | $ 3.97 \pm 0.17 $ | $ 0.08024 $ | $ 1.03 \pm 0.04 $ |
