
Estimating linear response statistics using orthogonal polynomials: An RKHS formulation

  • * Corresponding author

The research of XL was supported under NSF grant DMS-1819011, and JH was supported under NSF grant DMS-1854299.

  • We study the problem of estimating linear response statistics under external perturbations using time series of unperturbed dynamics. Based on the fluctuation-dissipation theory, this problem is reformulated as an unsupervised learning task of estimating a density function. We consider a nonparametric density estimator formulated by the kernel embedding of distributions with "Mercer-type" kernels, constructed based on the classical orthogonal polynomials defined on non-compact domains. While the resulting representation is analogous to Polynomial Chaos Expansion (PCE), the connection to the reproducing kernel Hilbert space (RKHS) theory allows one to establish the uniform convergence of the estimator and to systematically address the practical question of identifying the PCE basis for consistent estimation. We also provide practical conditions for the well-posedness of not only the estimator but also the underlying response statistics. Finally, we provide a statistical error bound for the density estimation that accounts for the Monte-Carlo averaging over non-i.i.d. time series and the biases due to a finite basis truncation. This error bound provides a means to understand the feasibility as well as the limitations of the kernel embedding with Mercer-type kernels. Numerically, we verify the effectiveness of the estimator on two stochastic dynamics with known yet non-trivial equilibrium densities.
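To make the idea of a polynomial-based density estimator concrete, the following is a minimal sketch of a classical orthogonal-series estimator in one dimension. It is a simplified stand-in and not the Mercer-type kernel construction of the paper: it assumes a standard Gaussian weight with normalized probabilists' Hermite polynomials, and the function names (`hermite_basis`, `fit_coefficients`, `density_estimate`) are hypothetical. The coefficients are plain time averages of the basis functions over the samples, which is the Monte-Carlo averaging step mentioned above.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He


def hermite_basis(x, M):
    """Evaluate psi_m(x) = He_m(x) / sqrt(m!) for m = 0..M.

    These are orthonormal with respect to the standard normal density
    (an assumed weight, chosen here only for illustration).
    """
    x = np.asarray(x, dtype=float)
    V = He.hermevander(x, M)  # column m holds He_m(x)
    norms = np.array([math.sqrt(math.factorial(m)) for m in range(M + 1)])
    return V / norms


def fit_coefficients(samples, M):
    """Estimate c_m = E[psi_m(X)] by averaging over the sample time series."""
    return hermite_basis(samples, M).mean(axis=0)


def density_estimate(x, coeffs):
    """Truncated expansion f(x) ~ W(x) * sum_m c_m psi_m(x), W = N(0, 1) pdf."""
    x = np.asarray(x, dtype=float)
    M = len(coeffs) - 1
    W = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
    return W * (hermite_basis(x, M) @ coeffs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = rng.standard_normal(10_000)  # placeholder data, not from the paper
    coeffs = fit_coefficients(samples, M=6)
    print(density_estimate(np.array([0.0, 1.0]), coeffs))
```

Because the basis is orthonormal with respect to the weight, the estimate integrates to the zeroth coefficient, which equals one by construction. The truncation level `M` plays the role of the finite basis truncation whose bias the error bound accounts for; how to choose it, and the weight, consistently is the question the RKHS formulation addresses.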

    Mathematics Subject Classification: Primary: 46E22, 62G07; Secondary: 82C31, 33C45, 33C50, 37A25.


  • Figure 1.  On the left panel, we show an example of how $ \eta_M $ in (54) behaves as a function of $ M $. On the same panel, we also show the rejection probability $ \mathcal{R}_M $ as a function of $ M $. One can see that as the algorithm converges (with small $ \eta_M $), the rejection probability converges to a relatively small value. On the right panel, we show $ \delta_M $ as a function of $ M $. Here, there is no pattern for $ \delta_M $. In practice, since $ \delta_M $ can be arbitrarily small, one can set $ \delta $ in (43) to be slightly larger than the floating-point single precision to guarantee a well-posed estimator. The results in this figure are based on the gradient system to be discussed in Section 5.1.

    Figure 2.  The equilibrium distribution of the triple-well model (59) (upper left panel) and its kernel embedding estimate (upper right panel) based on a total of $ 1\times 10^{7} $ samples. The contour plot (lower left panel) displays the absolute error of the estimate. The error plot (lower right panel) shows the $ \ell_{\infty} $-error of the estimates $ \hat{k}_{A} $ via kernel embedding linear response. We separate the diagonal entries (D) from the non-diagonal entries (ND) due to their scale difference.

    Figure 3.  The linear response operator $ k_{A} $ in (62) (blue solid) and the corresponding estimates $ \hat{k}_{A} $ in (63) via kernel embedding linear response (red dash) and KDE (yellow dot-dash). For the two-point statistics, both $ k_{A} $ and $ \hat{k}_{A} $ are computed via Monte-Carlo. The diagonal entries of $ k_{A} $ and $ \hat{k}_{A} $ are normalized so that they share the same initial value $ 1 $. Two inset figures are added to the diagonal entries to show the details of the estimates at the initial stage.

    Figure 4.  Left panel: $ \eta_M $ for the Gaussian marginal density of variable $ v $ as a function of $ M = M_2 $ (dotted blue line). In the same panel, we also show $ \eta_M $ for the marginal density of variable $ x $ (dashed blue) and the rejection probability $ \mathcal{R}_M $ (solid red) as functions of $ M = M_1 $ for a fixed $ M_2 = 0 $. Right panel: The marginal distribution of $ x $ (left) of the Langevin dynamics (64) at equilibrium. The kernel embedding estimate uses Laguerre polynomials with $ M = 90 $. All the results in this figure are based on a total of $ N = 10^{7} $ samples.

    Figure 5.  The linear response operator $ k_{A} $ in (66) (blue solid) and the corresponding estimates $ \hat{k}_{A} $ in (68) via kernel embedding linear response (red dash) and KDE (yellow dot-dash). All the statistics are computed via Monte-Carlo. Similar to Figure 3, the diagonal entries of $ k_{A} $ and $ \hat{k}_{A} $ are normalized so that they share the same initial value $ 1 $. The $ (1, 2) $ and $ (2, 2) $ components reach perfect fits for both methods since $ v $ is Gaussian at the equilibrium.

    Figure 6.  Graph of $ k_{\beta, 0.64, 1}(x, x) $ in (32) for $ \beta = 0.42 $ (blue-solid), $ \beta = 0.45 $ (red-dot-dash), and $ \beta = 0.48 $ (yellow-dash). Notice that for $ \rho = 0.64 $, to ensure the boundedness, we need $ \beta \geq \frac{0.8}{1+0.8}\approx 0.44 $, which is consistent with the numerical results.

    Table 1.  Elapsed time (on a desktop computer equipped with a 3.2 GHz quad-core Intel Core i5 processor and 32 GB of RAM) of the KDE approach and the kernel embedding approach in computing the linear response statistics.

    | Method | Number of basis functions | Elapsed time (s) |
    | --- | --- | --- |
    | KDE (Triple-Well, $ N = 1\times 10^7 $) | $ 1\times 10^7 $ | $ 1.99 \times 10^4 $ |
    | Kernel Embedding (Triple-Well, $ M = 60 $) | $ 1891 $ | $ 1.54 \times 10^3 $ |
    | Kernel Embedding (Langevin, $ M_1 = 90 $, $ M_2 = 0 $) | $ 91 $ | $ 8.21 $ |