Foundations of Data Science
September 2021, Volume 3, Issue 3
Special issue on Data Assimilation
Managing Guest Editor: Christopher Jones (1)
Guest Editors: Marc Bocquet (2), Jana de Wiljes (3), John Harlim (4), Matthias Morzfeld (5), Elaine Spiller (6), Xin T. Tong (7)
(1) RENCI, University of North Carolina at Chapel Hill, USA
(2) CEREA, École des Ponts and EDF R&D, Île-de-France, France
(3) Institute for Mathematics, University of Potsdam, Germany
(4) Department of Mathematics, Department of Meteorology and Atmospheric Science, Institute for Computational and Data Sciences, The Pennsylvania State University, USA
(5) Cecil H. and Ida M. Green Institute of Geophysics and Planetary Physics, Scripps Institution of Oceanography, University of California, San Diego, USA
(6) Mathematical and Statistical Sciences, Marquette University, USA
(7) National University of Singapore, Singapore
The reconstruction of the dynamics of an observed physical system as a surrogate model has been brought to the fore by recent advances in machine learning. To deal with partial and noisy observations in that endeavor, machine learning representations of the surrogate model can be used within a Bayesian data assimilation framework. However, these approaches require long time series of observational data, meant to be assimilated all together. This paper investigates the possibility of learning both the dynamics and the state online, i.e. updating their estimates at any time, in particular when new observations are acquired. The estimation is based on the ensemble Kalman filter (EnKF) family of algorithms, using a rather simple representation for the surrogate model and state augmentation. We consider the implications of learning dynamics online through (ⅰ) a global EnKF, (ⅱ) a local EnKF and (ⅲ) an iterative EnKF, and we discuss in each case issues and algorithmic solutions. We then demonstrate numerically the efficiency and assess the accuracy of these methods using one-dimensional, one-scale and two-scale chaotic Lorenz models.
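As a rough illustration of state augmentation in an EnKF, the sketch below performs a single stochastic (perturbed-observation) analysis step on a two-component augmented vector made of one scalar state and one scalar model parameter. It is not the algorithm of the paper: the function name `enkf_augmented_update`, the toy relation between state and parameter, and all numerical values are assumptions made for this example. Only the state is observed, so the parameter is corrected solely through its sample covariance with the state.

```python
import random
from statistics import fmean, pvariance


def enkf_augmented_update(states, params, y_obs, obs_var, rng):
    """One stochastic EnKF analysis step on the augmented vector (state, parameter).

    Only the state is observed (observation operator H = [1, 0]), so the
    parameter is updated through its sample covariance with the state.
    """
    n = len(states)
    x_mean, p_mean = fmean(states), fmean(params)
    # Sample (co)variances with the unbiased 1/(n-1) normalisation.
    var_x = sum((x - x_mean) ** 2 for x in states) / (n - 1)
    cov_px = sum((p - p_mean) * (x - x_mean)
                 for p, x in zip(params, states)) / (n - 1)
    # Kalman gain, one entry per component of the augmented vector.
    gain_x = var_x / (var_x + obs_var)
    gain_p = cov_px / (var_x + obs_var)
    new_states, new_params = [], []
    for x, p in zip(states, params):
        # Perturbed observation: each member gets its own noisy copy of y_obs.
        innovation = y_obs + rng.gauss(0.0, obs_var ** 0.5) - x
        new_states.append(x + gain_x * innovation)
        new_params.append(p + gain_p * innovation)
    return new_states, new_params


# Toy forecast ensemble (hypothetical): the state is twice the parameter plus noise.
rng = random.Random(0)
params = [rng.gauss(1.0, 0.5) for _ in range(200)]
states = [2.0 * p + rng.gauss(0.0, 0.05) for p in params]
new_states, new_params = enkf_augmented_update(states, params, 3.0, 0.01, rng)
print("parameter mean before/after:", fmean(params), fmean(new_params))
print("parameter variance before/after:", pvariance(params), pvariance(new_params))
```

In this toy setting the observation of the state alone pulls the parameter ensemble toward the value consistent with it, which is the mechanism that lets an augmented filter update dynamics and state together each time an observation arrives.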
This paper provides a unified perspective of iterative ensemble Kalman methods, a family of derivative-free algorithms for parameter reconstruction and other related tasks. We identify, compare and develop three subfamilies of ensemble methods that differ in the objective they seek to minimize and the derivative-based optimization scheme they approximate through the ensemble. Our work emphasizes two principles for the derivation and analysis of iterative ensemble Kalman methods: statistical linearization and continuum limits. Following these guiding principles, we introduce new iterative ensemble Kalman methods that show promising numerical performance in Bayesian inverse problems, data assimilation and machine learning tasks.
Ensemble Kalman Inversion (EnKI) [
In this paper we propose WEnKI and WEnSRF, weighted versions of EnKI and EnSRF. They follow the same gradient flow as EnKI/EnSRF, with weight corrections. Compared with the classical methods, the new methods are unbiased, and compared with IS, they have bounded weight variance. Both properties are proved rigorously in this paper. We further discuss the stability of the underlying Fokker-Planck equation. This partially explains why EnKI, despite being inconsistent, performs well occasionally in nonlinear settings. Numerical evidence is presented at the end.
This work demonstrates the efficiency of using iterative ensemble smoothers to estimate the parameters of an SEIR model. We have extended a standard SEIR model with age classes and compartments of sick, hospitalized, and dead. The data conditioned on are the daily numbers of accumulated deaths and the number of hospitalized. It is also possible to condition the model on the number of cases obtained from testing. We start from a wide prior distribution for the model parameters; the ensemble conditioning then leads to a posterior ensemble of estimated parameters yielding model predictions in close agreement with the observations. The updated ensemble of model simulations has predictive capabilities and includes uncertainty estimates. In particular, we estimate the effective reproductive number as a function of time, and we can assess the impact of different intervention measures. By starting from the updated set of model parameters, we can make accurate short-term predictions of the epidemic development, assuming knowledge of the future effective reproductive number. The model system also allows for the computation of long-term scenarios of the epidemic under different assumptions. We have applied the model system to data sets from several countries and regions, namely the four European countries Norway, England, The Netherlands, and France; the province of Quebec in Canada; the South American countries Argentina and Brazil; and the four US states Alabama, North Carolina, California, and New York. These countries and states all have vastly different developments of the epidemic, and we could accurately model the SARS-CoV-2 outbreak in all of them. We realize that more complex models, e.g., with regional compartments, may be desirable, and we suggest that the approach used here should also be applicable to these models.
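For readers unfamiliar with the model class, the following is a minimal sketch of a basic SEIR model driven by a time-dependent reproduction number. It is deliberately much simpler than the model described above: there are no age classes and no hospitalized or dead compartments, and no ensemble conditioning is performed. The function name `simulate_seir`, the forward-Euler scheme, the incubation and infectious periods, and the relation beta(t) = R(t) / infectious period are all assumptions chosen for illustration, not details taken from the paper.

```python
def simulate_seir(population, initial_infectious, reproduction_number,
                  incubation_days=5.0, infectious_days=7.0, days=100, dt=0.1):
    """Forward-Euler integration of a basic SEIR model.

    reproduction_number is a callable t -> R(t). The transmission rate is
    taken as beta(t) = R(t) / infectious_days (textbook relation, assumed here).
    Returns a list of (t, S, E, I, R) tuples, one per time step.
    """
    s = population - initial_infectious
    e, i, r = 0.0, float(initial_infectious), 0.0
    sigma = 1.0 / incubation_days   # rate of leaving the exposed compartment
    gamma = 1.0 / infectious_days   # rate of leaving the infectious compartment
    trajectory = [(0.0, s, e, i, r)]
    steps = int(round(days / dt))
    for k in range(steps):
        t = k * dt
        beta = reproduction_number(t) * gamma
        new_exposed = beta * s * i / population * dt
        new_infectious = sigma * e * dt
        new_removed = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        r += new_removed
        trajectory.append(((k + 1) * dt, s, e, i, r))
    return trajectory


# Hypothetical scenario: an intervention lowers R(t) from 2.5 to 0.8 on day 30.
run = simulate_seir(1_000_000, 10, lambda t: 2.5 if t < 30 else 0.8, days=60)
t_end, s_end, e_end, i_end, r_end = run[-1]
print("infectious on final day:", i_end)
```

In an ensemble setting, a function like this would be run once per ensemble member with different parameter draws, and the resulting trajectories compared with observed data.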
The disparity in the impact of COVID-19 on minority populations in the United States has been well established in the available data on deaths, case counts, and adverse outcomes. However, critical metrics used by public health officials and epidemiologists, such as a time dependent viral reproductive number (
The purpose of this paper is to describe the feedback particle filter algorithm for problems where there are a large number (
Consider the class of Ensemble Square Root filtering algorithms for the numerical approximation of the posterior distribution of nonlinear Markovian signals, partially observed with linear observations corrupted with independent measurement noise. We analyze the asymptotic behavior of these algorithms in the large ensemble limit both in discrete and continuous time. We identify limiting mean-field processes on the level of the ensemble members, prove corresponding propagation of chaos results and derive associated convergence rates in terms of the ensemble size. In continuous time we also identify the stochastic partial differential equation driving the distribution of the mean-field process and perform a comparison with the Kushner-Stratonovich equation.
Many recent advances in sequential assimilation of data into nonlinear high-dimensional models are modifications to particle filters which employ efficient searches of a high-dimensional state space. In this work, we present a complementary strategy that combines statistical emulators and particle filters. The emulators are used to learn and offer a computationally cheap approximation to the forward dynamic mapping. This emulator-particle filter (Emu-PF) approach requires a modest number of forward-model runs, but yields well-resolved posterior distributions even in non-Gaussian cases. We explore several modifications to the Emu-PF that utilize mechanisms for dimension reduction to efficiently fit the statistical emulator, and present a series of simulation experiments on an atypical Lorenz-96 system to demonstrate their performance. We conclude with a discussion on how the Emu-PF can be paired with modern particle filtering algorithms.
Control-type particle filters have been receiving increasing attention over the last decade as a means of obtaining sample-based approximations to the sequential Bayesian filtering problem in the nonlinear setting. Here we analyse one such type, namely the feedback particle filter, and a recently proposed approximation of the associated gain function based on diffusion maps. The key purpose is to provide analytic insights into the form of the approximate gain, which are of interest in their own right. These are then used to establish a roadmap to obtaining well-posedness and convergence of the finite
This paper shows that the nonlinear filter in the case of deterministic dynamics is stable with respect to the initial conditions, provided that the observations are sufficiently rich, in the context of both continuous- and discrete-time filters. Earlier works on the stability of nonlinear filters are in the context of stochastic dynamics and assume conditions such as a compact state space or a time-independent observation model, whereas we prove filter stability for deterministic dynamics with more general assumptions on the state space and observation process. We give several examples of systems that satisfy these assumptions. We also show that the asymptotic structure of the filtering distribution is related to the dynamical properties of the signal.