Machine learning-based conditional mean filter: A generalization of the ensemble Kalman filter for nonlinear data assimilation

  • *Corresponding author: Truong-Vinh Hoang
  • This paper presents the machine learning-based ensemble conditional mean filter (ML-EnCMF), a filtering method built on the conditional mean filter (CMF) previously introduced in the literature. The updated mean of the CMF matches that of the posterior obtained by applying Bayes' rule to the filter's forecast distribution. Moreover, we show that the CMF's updated covariance coincides with the expected conditional covariance (a sketch of the update behind these properties follows below). Implementing the EnCMF requires computing the conditional mean (CM). A likelihood-based estimator is prone to significant errors for small ensemble sizes, causing filter divergence. We develop a systematic methodology for integrating machine learning into the EnCMF based on the CM's orthogonal projection property. First, we approximate the CM by combining an artificial neural network (ANN) with a linear function derived from the ensemble Kalman filter (EnKF), enabling the ML-EnCMF to inherit the EnKF's advantages. Second, we apply a suitable variance reduction technique to reduce the statistical error when estimating the loss function. Finally, we propose a model selection procedure that chooses the applied filter, either the EnKF or the ML-EnCMF, element-wise at each updating step. We demonstrate the performance of the ML-EnCMF on the Lorenz-63 and Lorenz-96 systems, showing that it outperforms both the EnKF and the likelihood-based EnCMF.

    Mathematics Subject Classification: Primary: 62M45, 62M20, 65C20; Secondary: 86-08, 93E11.

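    For context, the mean and covariance properties stated in the abstract follow from the conditional-mean update map. The display below is our reconstruction in the notation of the figure captions ($ Q^f $ the forecast quantity, $ Y^f $ the forecast observation, $ y $ the actual observation), not a verbatim excerpt of the paper's numbered equations:

    $$ Q^{a} = Q^{f} + \phi(y) - \phi(Y^{f}), \qquad \phi(y) := \mathbb{E}\left[ {Q^{f} \mid Y^{f} = y} \right]. $$

    Taking expectations gives $ \mathbb{E}\left[ {Q^{a}} \right] = \phi(y) $, the posterior mean at the observed $ y $, while $ \mathrm{Cov}\left[ {Q^{a}} \right] = \mathbb{E}\left[ {\mathbb{V} \text{ar}\left[ {Q^{f}|Y^f} \right]} \right] $ by the law of total variance, since $ Q^{f} - \phi(Y^{f}) $ has zero conditional mean given $ Y^f $.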
  • Figure 1.  CM computed using Eq. (9) and its linear approximation (Eq. (11))

    Figure 2.  Empirical PDF of the conditional variance $ \mathbb{V} \text{ar}\left[ {Q^{f }|Y^f}\right] $ given in Eq. (25b), compared with the variances of the updated ensembles obtained using the EnKF and the EnCMF. The expected conditional variance $ \mathbb{E}\left[ {\mathbb{V} \text{ar}\left[ {Q^{f }|Y^f}\right]}\right] $ and the updated ensemble variance of the EnCMF yield nearly identical estimates of approximately 0.17

    Figure 3.  Comparison of the empirical densities of the updated ensembles with the Bayesian posterior: (a) actual value $ \mathrm{q}^{\mathrm{tr}} = -2 $ (prior mean minus one standard deviation), (b) actual value $ \mathrm{q}^{\mathrm{tr}} = 0 $ (prior mean), (c) actual value $ \mathrm{q}^{\mathrm{tr}} = +2 $ (prior mean plus one standard deviation)

    Figure 4.  Lorenz-63: average RMSE $ \overline{rmse}_{\phi} $ of the CM approximations obtained with the LL-EnCMF and the ML-EnCMF: (a) $ \Delta T_{\text{obs}} = 0.5 $, (b) $ \Delta T_{\text{obs}} = 1 $
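
    The linear approximation in Figure 1 is the Kalman-type map underlying the EnKF. As a point of reference, here is a hypothetical Python sketch (ours, not the authors' code; names such as `enkf_linear_map` and `encmf_update` are our assumptions) of estimating that linear map from forecast ensembles and applying the conditional-mean update sketched above; the ML-EnCMF would add an ANN correction to `phi` trained on the residual $ Q^f - \phi_{\text{lin}}(Y^f) $:

    ```python
    import numpy as np

    def enkf_linear_map(Q, Y):
        """Kalman-type linear CM approximation phi_lin(y) = q_bar + K (y - y_bar),
        with gain K = C_qy C_yy^{-1} estimated from the forecast ensembles
        Q (N x d_q) and Y (N x d_y)."""
        q_bar, y_bar = Q.mean(axis=0), Y.mean(axis=0)
        Qc, Yc = Q - q_bar, Y - y_bar
        C_qy = Qc.T @ Yc / (Q.shape[0] - 1)   # cross-covariance of Q^f and Y^f
        C_yy = Yc.T @ Yc / (Y.shape[0] - 1)   # covariance of Y^f
        K = C_qy @ np.linalg.inv(C_yy)        # Kalman-type gain
        return lambda y: q_bar + K @ (y - y_bar)

    def encmf_update(Q, Y, y_obs, phi):
        """Ensemble conditional-mean update: q_i^a = q_i^f + phi(y_obs) - phi(y_i^f)."""
        return Q + phi(y_obs) - np.apply_along_axis(phi, 1, Y)
    ```

    When $ Y^f $ includes the simulated observation noise, plugging the linear map in as `phi` yields an EnKF-style update; the paper's element-wise model selection then chooses between this linear update and the ANN-corrected one at each updating step.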

    Table 1.  L63 system: performance metrics of the ML-EnCMF compared with the LL-EnCMF and the EnKF for $ \Delta T_{\mathrm{obs}} = 0.5 $; each cell shows average RMSE - average ensemble spread (average coverage probability $ f_{ \text{cv}} $). The LL-EnCMF uses inflation coefficients $ 1.25 $, $ 1.2 $, and $ 1.05 $ for ensemble sizes 20, 30, and 60, respectively. For $ N = 20 $, 30, and 60, the coverage probability is computed for the 90%, 93.3%, and 93.3% confidence intervals, respectively

    $ N $      | 20                 | 30                 | 60                 | 100                | 200
    ML-EnCMF   | 1.27 - 0.96 (0.83) | 1.11 - 0.97 (0.90) | 0.94 - 0.97 (0.92) | 0.86 - 0.97 (0.94) | 0.81 - 0.96 (0.95)
    EnKF       | 1.37 - 1.17 (0.86) | 1.29 - 1.21 (0.90) | 1.24 - 1.27 (0.93) | 1.23 - 1.29 (0.93) | 1.22 - 1.30 (0.93)
    LL-EnCMF   | 1.43 - 1.02 (0.85) | 1.21 - 1.04 (0.92) | 0.99 - 0.91 (0.93) | 0.90 - 0.89 (0.93) | 0.85 - 0.90 (0.95)

    Table 2.  L63 system: performance metrics of the ML-EnCMF compared with the LL-EnCMF and the EnKF for $ \Delta T_{\mathrm{obs}} = 1 $; each cell shows average RMSE - average ensemble spread (average coverage probability $ f_{ \text{cv}} $). The LL-EnCMF uses inflation coefficients $ 1.25 $, $ 1.2 $, and $ 1.1 $ for ensemble sizes 20, 30, and 60, respectively. For $ N = 20 $, 30, and 60, the coverage probability is computed for the 90%, 93.3%, and 93.3% confidence intervals, respectively

    $ N $      | 20                 | 30                 | 60                 | 100                | 200
    ML-EnCMF   | 1.66 - 1.19 (0.82) | 1.50 - 1.20 (0.90) | 1.22 - 1.21 (0.90) | 1.14 - 1.18 (0.93) | 1.06 - 1.17 (0.93)
    EnKF       | 1.67 - 1.46 (0.87) | 1.61 - 1.55 (0.92) | 1.53 - 1.57 (0.91) | 1.51 - 1.60 (0.93) | 1.51 - 1.61 (0.93)
    LL-EnCMF   | 2.52 - 1.11 (0.72) | 1.78 - 1.15 (0.86) | 1.35 - 1.12 (0.90) | 1.18 - 1.06 (0.93) | 1.05 - 1.06 (0.93)

    Table 3.  L96 system: performance metrics of the ML-EnCMF compared with the LL-EnCMF and the EnKF for $ \Delta T_{\mathrm{obs}} = 0.4 $; each cell shows average RMSE - average ensemble spread (average coverage probability $ f_{ \text{cv}} $)

    $ N $      | 100                | 150                | 200                | 300                | 400
    ML-EnCMF   | 0.84 - 0.70 (0.94) | 0.79 - 0.67 (0.94) | 0.75 - 0.67 (0.94) | 0.72 - 0.64 (0.94) | 0.69 - 0.63 (0.94)
    EnKF       | 0.88 - 0.71 (0.92) | 0.85 - 0.73 (0.93) | 0.83 - 0.74 (0.94) | 0.84 - 0.76 (0.94) | 0.83 - 0.78 (0.95)
    LL-EnCMF   | 1.21 - 0.74 (0.89) | 0.92 - 0.73 (0.93) | 0.88 - 0.76 (0.95) | 0.74 - 0.68 (0.95) | 0.70 - 0.69 (0.96)
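
    The tables report three standard ensemble data-assimilation diagnostics. Below is a minimal sketch of how they can be computed for a single assimilation step, assuming a Gaussian fit to the ensemble for the coverage intervals; these conventions and function names are our assumptions, not taken from the paper:

    ```python
    import numpy as np
    from scipy.stats import norm

    def rmse(ens, truth):
        """RMSE of the ensemble mean against the true state."""
        return np.sqrt(np.mean((ens.mean(axis=0) - truth) ** 2))

    def spread(ens):
        """Ensemble spread: root of the component-averaged ensemble variance."""
        return np.sqrt(np.mean(ens.var(axis=0, ddof=1)))

    def coverage(ens, truth, level=0.933):
        """Fraction of state components whose true value lies inside the
        central `level` interval of a Gaussian fitted to the ensemble."""
        z = norm.ppf(0.5 + level / 2.0)   # interval half-width in std deviations
        m, s = ens.mean(axis=0), ens.std(axis=0, ddof=1)
        return np.mean(np.abs(truth - m) <= z * s)
    ```

    A well-calibrated filter has an ensemble spread comparable to its RMSE and a coverage probability close to the nominal confidence level.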