Reconstructing high-dimensional chaotic dynamics, such as geophysical flows, from observations is hampered by (ⅰ) the partial and noisy observations that can realistically be obtained, (ⅱ) the need to learn from long time series of data, and (ⅲ) the unstable nature of the dynamics. To achieve such inference from observations over long time series, it has been suggested to combine data assimilation and machine learning in several ways. We show how to unify these approaches from a Bayesian perspective using expectation-maximization and coordinate descent. In doing so, the model, the state trajectory, and the model error statistics are estimated together. Implementations and approximations of these methods are discussed. Finally, we successfully test the approach numerically on two relevant low-order chaotic models with distinct identifiability.
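The idea of estimating the model and the state trajectory together by coordinate descent can be illustrated on a toy problem. The sketch below is not the paper's implementation: it assumes a hypothetical scalar linear model x[k+1] = a * x[k] plus noise in place of a neural-network surrogate, observes every state directly, and holds the model error variance `q` and observation error variance `r` fixed, so the expectation-maximization update of the error statistics is omitted. All function names and parameter values are illustrative assumptions.

```python
import numpy as np

def joint_cost(a, x, y, q, r):
    """Weak-constraint cost: observation misfit plus model-error misfit."""
    obs_term = np.sum((y - x) ** 2) / r
    model_term = np.sum((x[1:] - a * x[:-1]) ** 2) / q
    return 0.5 * (obs_term + model_term)

def update_state(a, y, q, r):
    """Minimize the cost over the trajectory x with the model parameter a fixed."""
    n = y.size
    # Rows of D encode the model residuals x[k+1] - a * x[k].
    D = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    D[idx, idx] = -a
    D[idx, idx + 1] = 1.0
    # Normal equations of the quadratic cost in x.
    hessian = np.eye(n) / r + D.T @ D / q
    return np.linalg.solve(hessian, y / r)

def update_model(x):
    """Minimize the cost over a with the trajectory x fixed (scalar least squares)."""
    return float(x[:-1] @ x[1:] / (x[:-1] @ x[:-1]))

def coordinate_descent(y, q, r, a_init=0.0, n_iter=20):
    """Alternate exact minimizations over the state and the model parameter."""
    a, x = a_init, y.copy()
    costs = [joint_cost(a, x, y, q, r)]
    for _ in range(n_iter):
        x = update_state(a, y, q, r)  # data-assimilation-like step
        a = update_model(x)           # learning-like step
        costs.append(joint_cost(a, x, y, q, r))
    return a, x, costs

# Synthetic experiment (all values are illustrative assumptions).
rng = np.random.default_rng(0)
a_true, q, r, n = 0.9, 0.1, 0.5, 500
x_true = np.zeros(n)
x_true[0] = 1.0
for k in range(n - 1):
    x_true[k + 1] = a_true * x_true[k] + np.sqrt(q) * rng.standard_normal()
y = x_true + np.sqrt(r) * rng.standard_normal(n)

a_est, x_est, costs = coordinate_descent(y, q, r)
```

Because each step minimizes the joint cost exactly over one block of variables, the sequence `costs` cannot increase. The estimate `a_est` is a joint mode under fixed error variances and need not coincide with `a_true`; estimating the error statistics as well is what the expectation-maximization part of the approach is meant to address.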
Figure 1.
From top to bottom: representation of the flow rate
Figure 2.
On the left hand side: Properties of the surrogate model obtained from full but noisy observation of the L96 model in the nominal configuration (
Figure 3.
Same as Figure 2 but for several values of the training window length
Figure 4.
On the left hand side: Properties of the surrogate model obtained from full but noisy observation of the L96 model in the nominal configuration (
Figure 5.
On the left hand side: Properties of the surrogate model obtained from partial and noisy observation of the L96 model in the nominal configuration (
Table 1.
Scalar indicators for nominal experiments based on L96 and L05Ⅲ. Key hyperparameters are recalled. The statistics of the indicators are obtained over
Model | |||||||
L96 | |||||||
L05Ⅲ |
Table 2.
Scalar indicators for L96 and L05Ⅲ in their nominal configuration, using either the full or the approximate schemes. The statistics of the indicators are obtained over
Model | Scheme | |||
L96 | Approximate | |||
L96 | Full | |||
L05Ⅲ | Approximate | |||
L05Ⅲ | Full |