doi: 10.3934/fods.2021014

Geometric adaptive Monte Carlo in random environment

1. Department of Mathematics, The University of Manchester, Manchester, M13 9PL, UK
2. School of Mathematics and Statistics, University of Glasgow, Glasgow, G12 8QQ, UK
3. Department of Astronomy and Astrophysics
4. Center for Exoplanets and Habitable Worlds
5. Center for Astrostatistics
6. Institute for Computational and Data Sciences, 525 Davey Laboratory, The Pennsylvania State University, University Park, PA, 16802, USA

* Corresponding author: Theodore Papamarkou

Received: March 2021. Revised: May 2021. Published: June 2021.

Manifold Markov chain Monte Carlo algorithms have been introduced to sample more effectively from challenging target densities exhibiting multiple modes or strong correlations. Such algorithms exploit the local geometry of the parameter space, thus enabling chains to achieve a faster convergence rate when measured in number of steps. However, acquiring local geometric information can often increase computational complexity per step to the extent that sampling from high-dimensional targets becomes inefficient in terms of total computational time. This paper analyzes the computational complexity of manifold Langevin Monte Carlo and proposes a geometric adaptive Monte Carlo sampler that balances the benefits of exploiting local geometry against the associated computational cost, with the aim of attaining a high effective sample size for a given amount of computation. The suggested sampler is a discrete-time stochastic process in a random environment. The random environment allows the sampler to switch between local geometric and adaptive proposal kernels according to a schedule. An exponential schedule is put forward that favors the use of geometric information in the early, transient phase of the chain, while saving computational time in the late, stationary phase. The average complexity can be set manually, depending on the degree of geometric exploitation required by the underlying model.
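To make the switching mechanism concrete, the following minimal sketch draws, at every iteration, a Bernoulli indicator whose success probability decays exponentially, and uses it to choose between the two proposal kernels. The schedule form $p_t = a\,e^{-\lambda t}$, the decay rate `lam` and the step functions `geometric_step` and `adaptive_step` are illustrative assumptions, not the implementation from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def exponential_schedule(t, a=1.0, lam=0.01):
    """Probability of taking a geometric (manifold) step at iteration t.

    Decays exponentially, so the costly geometric kernel is used frequently
    in the transient phase and rarely once the chain is near stationarity.
    """
    return a * np.exp(-lam * t)

def chain_in_random_environment(x0, n_iter, geometric_step, adaptive_step):
    """Sketch of a discrete-time process in a random environment: the
    environment selects one of two proposal mechanisms at each iteration."""
    x, samples = x0, []
    for t in range(n_iter):
        if rng.uniform() < exponential_schedule(t):
            x = geometric_step(x)  # expensive, geometry-exploiting kernel
        else:
            x = adaptive_step(x)   # cheaper, adaptive Metropolis-style kernel
        samples.append(x)
    return np.asarray(samples)
```

Under such a schedule, the expected cost of step $t$ is roughly $p_t C_{\mathrm{geo}} + (1 - p_t) C_{\mathrm{adapt}}$, which is how the average complexity of the sampler can be tuned through the schedule parameters.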

Citation: Theodore Papamarkou, Alexey Lindo, Eric B. Ford. Geometric adaptive Monte Carlo in random environment. Foundations of Data Science, doi: 10.3934/fods.2021014
References:
[1] C. Andrieu and É. Moulines, On the ergodicity properties of some adaptive MCMC algorithms, Ann. Appl. Probab., 16 (2006), 1462-1505. doi: 10.1214/105051606000000286.
[2] Y. Bai, G. O. Roberts and J. S. Rosenthal, On the containment condition for adaptive Markov chain Monte Carlo algorithms, Adv. Appl. Stat., 21 (2011), 1-54.
[3] M. Betancourt, A general metric for Riemannian manifold Hamiltonian Monte Carlo, in Geometric Science of Information, Lecture Notes in Comput. Sci., 8085, Springer, Heidelberg, 2013, 327-334. doi: 10.1007/978-3-642-40020-9_35.
[4] B. Calderhead, M. Epstein, L. Sivilotti and M. Girolami, Bayesian approaches for mechanistic ion channel modeling, in In Silico Systems Biology, Methods in Molecular Biology, 1021, Humana Press, Totowa, NJ, 2013, 247-272. doi: 10.1007/978-1-62703-450-0_13.
[5] B. Calderhead and M. Girolami, Statistical analysis of nonlinear dynamical systems using differential geometric sampling methods, Interface Focus, 1 (2011). doi: 10.1098/rsfs.2011.0051.
[6] S. Chib and E. Greenberg, Understanding the Metropolis-Hastings algorithm, Amer. Statistician, 49 (1995), 327-335. doi: 10.2307/2684568.
[7] A. M. Davie and A. J. Stothers, Improved bound for complexity of matrix multiplication, Proc. Roy. Soc. Edinburgh Sect. A, 143 (2013), 351-369. doi: 10.1017/S0308210511001648.
[8] S. Duane, A. D. Kennedy, B. J. Pendleton and D. Roweth, Hybrid Monte Carlo, Phys. Lett. B, 195 (1987), 216-222. doi: 10.1016/0370-2693(87)91197-X.
[9] E. B. Ford, Improving the efficiency of Markov chain Monte Carlo for analyzing the orbits of extrasolar planets, Astrophysical J., 642 (2006), 505-522. doi: 10.1086/500802.
[10] E. B. Ford, Quantifying the uncertainty in the orbits of extrasolar planets, Astronomical J., 129 (2005), 1706-1717. doi: 10.1086/427962.
[11] F. Le Gall, Powers of tensors and fast matrix multiplication, in Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, Association for Computing Machinery, 2014, 296-303. doi: 10.1145/2608628.2608664.
[12] C. J. Geyer, Practical Markov chain Monte Carlo, Statist. Sci., 7 (1992), 473-483. doi: 10.1214/ss/1177011137.
[13] P. E. Gill, G. H. Golub, W. Murray and M. A. Saunders, Methods for modifying matrix factorizations, Math. Comp., 28 (1974), 505-535. doi: 10.1090/S0025-5718-1974-0343558-6.
[14] M. Girolami and B. Calderhead, Riemann manifold Langevin and Hamiltonian Monte Carlo methods, J. R. Stat. Soc. Ser. B Stat. Methodol., 73 (2011), 123-214. doi: 10.1111/j.1467-9868.2010.00765.x.
[15] A. Griewank, On automatic differentiation and algorithmic linearization, Pesquisa Operacional, 34 (2014), 621-645. doi: 10.1590/0101-7438.2014.034.03.0621.
[16] A. Griewank and A. Walther, Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2008. doi: 10.1137/1.9780898717761.
[17] J. E. Griffin and S. G. Walker, On adaptive Metropolis-Hastings methods, Stat. Comput., 23 (2013), 123-134. doi: 10.1007/s11222-011-9296-2.
[18] H. Haario, M. Laine, A. Mira and E. Saksman, DRAM: Efficient adaptive MCMC, Stat. Comput., 16 (2006), 339-354. doi: 10.1007/s11222-006-9438-0.
[19] H. Haario, E. Saksman and J. Tamminen, An adaptive Metropolis algorithm, Bernoulli, 7 (2001), 223-242. doi: 10.2307/3318737.
[20] B. Hajek, Cooling schedules for optimal annealing, in Open Problems in Communication and Computation, Springer, New York, 1987, 147-150. doi: 10.1007/978-1-4612-4808-8_42.
[21] N. J. Higham, Computing the nearest correlation matrix - a problem from finance, IMA J. Numer. Anal., 22 (2002), 329-343. doi: 10.1093/imanum/22.3.329.
[22] N. J. Higham, Computing a nearest symmetric positive semidefinite matrix, Linear Algebra Appl., 103 (1988), 103-118. doi: 10.1016/0024-3795(88)90223-6.
[23] N. J. Higham and N. Strabić, Anderson acceleration of the alternating projections method for computing the nearest correlation matrix, Numer. Algorithms, 72 (2016), 1021-1042. doi: 10.1007/s11075-015-0078-3.
[24] T. House, Hessian corrections to the Metropolis adjusted Langevin algorithm, preprint, arXiv:1507.06336.
[25] O. Kallenberg, Random Measures, Theory and Applications, Probability Theory and Stochastic Modelling, 77, Springer, Cham, 2017. doi: 10.1007/978-3-319-41598-7.
[26] S. Kirkpatrick, C. D. Gelatt Jr. and M. P. Vecchi, Optimization by simulated annealing, Science, 220 (1983), 671-680. doi: 10.1126/science.220.4598.671.
[27] T. S. Kleppe, Adaptive step size selection for Hessian-based manifold Langevin samplers, Scand. J. Stat., 43 (2016), 788-805. doi: 10.1111/sjos.12204.
[28] S. Lan, T. Bui-Thanh, M. Christie and M. Girolami, Emulation of higher-order tensors in manifold Monte Carlo methods for Bayesian inverse problems, J. Comput. Phys., 308 (2016), 81-101. doi: 10.1016/j.jcp.2015.12.032.
[29] S. Livingstone and M. Girolami, Information-geometric Markov chain Monte Carlo methods using diffusions, Entropy, 16 (2014), 3074-3102. doi: 10.3390/e16063074.
[30] M. Locatelli, Simulated annealing algorithms for continuous global optimization: Convergence conditions, J. Optim. Theory Appl., 104 (2000), 121-133. doi: 10.1023/A:1004680806815.
[31] J. F. D. Martin and J. M. Riaño Sierra, A comparison of cooling schedules for simulated annealing, in Encyclopedia of Artificial Intelligence, 2009, 344-352. doi: 10.4018/9781599048499.ch053.
[32] R. M. Neal, Bayesian Learning for Neural Networks, Lecture Notes in Statistics, 118, Springer, New York, 1996. doi: 10.1007/978-1-4612-0745-0.
[33] J. Neveu, Mathematical Foundations of the Calculus of Probability, Holden-Day, Inc., San Francisco-London-Amsterdam, 1965.
[34] Y. Nourani and B. Andresen, A comparison of simulated annealing cooling strategies, J. Phys. A: Math. General, 31 (1998), 8373-8385. doi: 10.1088/0305-4470/31/41/011.
[35] T. Papamarkou, A. Mira and M. Girolami, Monte Carlo methods and zero variance principle, in Current Trends in Bayesian Methodology with Applications, CRC Press, Boca Raton, FL, 2015, 457-476.
[36] M. Pereyra, Proximal Markov chain Monte Carlo algorithms, Stat. Comput., 26 (2016), 745-760. doi: 10.1007/s11222-015-9567-4.
[37] J. Revels, M. Lubin and T. Papamarkou, Forward-mode automatic differentiation in Julia, preprint, arXiv:1607.07892.
[38] G. O. Roberts and J. S. Rosenthal, Coupling and ergodicity of adaptive Markov chain Monte Carlo algorithms, J. Appl. Probab., 44 (2007), 458-475. doi: 10.1239/jap/1183667414.
[39] G. O. Roberts and J. S. Rosenthal, Examples of adaptive MCMC, J. Comput. Graph. Statist., 18 (2009), 349-367. doi: 10.1198/jcgs.2009.06134.
[40] G. O. Roberts and J. S. Rosenthal, Optimal scaling of discrete approximations to Langevin diffusions, J. R. Stat. Soc. Ser. B Stat. Methodol., 60 (1998), 255-268. doi: 10.1111/1467-9868.00123.
[41] G. O. Roberts and O. Stramer, Langevin diffusions and Metropolis-Hastings algorithms, Methodol. Comput. Appl. Probab., 4 (2002), 337-357. doi: 10.1023/A:1023562417138.
[42] G. O. Roberts and R. L. Tweedie, Exponential convergence of Langevin distributions and their discrete approximations, Bernoulli, 2 (1996), 341-363. doi: 10.2307/3318418.
[43] E. Saksman and M. Vihola, On the ergodicity of the adaptive Metropolis algorithm on unbounded domains, Ann. Appl. Probab., 20 (2010), 2178-2203. doi: 10.1214/10-AAP682.
[44] R. Schwentner, T. Papamarkou, M. O. Kauer, V. Stathopoulos, F. Yang, et al., EWS-FLI1 employs an E2F switch to drive target gene expression, Nucleic Acids Research, 43 (2015), 2780-2789. doi: 10.1093/nar/gkv123.
[45] M. Seeger, Low Rank Updates for the Cholesky Decomposition, Technical report, University of California, Berkeley, 2004. Available from: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.585.5275&rep=rep1&type=pdf.
[46] U. Şimşekli, R. Badeau, A. T. Cemgil and G. Richard, Stochastic quasi-Newton Langevin Monte Carlo, in Proceedings of the 33rd International Conference on Machine Learning, 2016, 642-651.
[47] N. W. Tuchow, E. B. Ford, T. Papamarkou and A. Lindo, The efficiency of geometric samplers for exoplanet transit timing variation models, Monthly Notices Roy. Astronomical Soc., 484 (2019), 3772-3784. doi: 10.1093/mnras/stz247.
[48] M. Vihola, Robust adaptive Metropolis algorithm with coerced acceptance rate, Stat. Comput., 22 (2012), 997-1008. doi: 10.1007/s11222-011-9269-5.
[49] J. H. Wilkinson, Modern error analysis, SIAM Rev., 13 (1971), 548-568. doi: 10.1137/1013095.
[50] V. V. Williams, Breaking the Coppersmith-Winograd barrier, manuscript, 2011.
[51] T. Xifara, C. Sherlock, S. Livingstone, S. Byrne and M. Girolami, Langevin diffusions and the Metropolis-adjusted Langevin algorithm, Statist. Probab. Lett., 91 (2014), 14-19. doi: 10.1016/j.spl.2014.04.002.


Figure 1.  Overlaid running means as a function of Monte Carlo iteration, and overlaid linear autocorrelations of single chains, corresponding to one of the twenty, six and eleven parameters of the respective t-distribution, one-planet and two-planet systems. The black horizontal line in the t-distribution running-mean plot represents the true mode
Figure 2.  Trace plots of single chains as a function of Monte Carlo iteration corresponding to one of the twenty and eleven parameters of the respective t-distribution and two-planet system. The same chains were used for generating the trace plots of figure 2 and the associated running means and autocorrelations of figure 1. The black horizontal lines in the t-distribution trace plots represent the true mode
Table 1.  General complexity bounds per step of the MALA, SMMALA, MMALA and AM samplers, together with two special cases of a log-target $f$: linear complexity $\mathcal{O}(f) = \mathcal{O}(n)$ and expensive log-targets with $\mathcal{O}(f) \gg \mathcal{O}(n)$

Method | General $\mathcal{O}(f)$ | $\mathcal{O}(f)=\mathcal{O}(n)$ | $\mathcal{O}(f)\gg\mathcal{O}(n)$
MALA | $\mathcal{O}(\max\{fn,n^2\})$ | $\mathcal{O}(n^2)$ | $\mathcal{O}(fn)$
SMMALA | $\mathcal{O}(\max\{fn^2,n^3\})$ | $\mathcal{O}(n^3)$ | $\mathcal{O}(fn^2)$
MMALA | $\mathcal{O}(\max\{fn^3,n^4\})$ | $\mathcal{O}(n^4)$ | $\mathcal{O}(fn^3)$
AM | $\mathcal{O}(\max\{f,n^{2.373}\})$ | $\mathcal{O}(n^{2.373})$ | $\mathcal{O}(f)$
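As a consistency check on the special-case columns, substituting a log-target of linear complexity $\mathcal{O}(f) = \mathcal{O}(n)$ into the general bounds recovers the middle column; for the MMALA and AM rows, for instance,

$$\mathcal{O}\left(\max\{fn^3, n^4\}\right) = \mathcal{O}\left(\max\{n^4, n^4\}\right) = \mathcal{O}(n^4), \qquad \mathcal{O}\left(\max\{n, n^{2.373}\}\right) = \mathcal{O}(n^{2.373}),$$

where the exponent $2.373$ comes from the fast matrix multiplication bounds of [7], [11] and [50].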
Table 2.  Comparison of sampling efficiency between MALA, AM, SMMALA and GAMC for the t-distribution, one-planet and two-planet systems. AR: acceptance rate; ESS: effective sample size; t: CPU runtime in seconds; ESS/t: minimum ESS across model parameters divided by runtime; Speed: ESS/t of each sampler divided by ESS/t of MALA. All tabulated numbers have been rounded to the second decimal place, apart from effective sample sizes, which have been rounded to the nearest integer. The minimum, mean, median and maximum ESS across the twenty, six and eleven parameters (of the respective t-distribution, one-planet and two-planet system) are displayed

Student’s t-distribution
Method | AR | ESS min | ESS mean | ESS median | ESS max | t | ESS/t | Speed
MALA | 0.59 | 135 | 159 | 145 | 234 | 9.33 | 14.52 | 1.00
AM | 0.03 | 85 | 118 | 117 | 155 | 17.01 | 5.03 | 0.35
SMMALA | 0.71 | 74 | 87 | 86 | 96 | 143.63 | 0.52 | 0.04
GAMC | 0.26 | 1471 | 1558 | 1560 | 1629 | 31.81 | 46.23 | 3.18

One-planet system
Method | AR | ESS min | ESS mean | ESS median | ESS max | t | ESS/t | Speed
MALA | 0.55 | 4 | 76 | 18 | 394 | 57.03 | 0.07 | 1.00
AM | 0.08 | 1230 | 1397 | 1279 | 2035 | 48.84 | 25.18 | 378.50
SMMALA | 0.71 | 464 | 597 | 646 | 658 | 208.46 | 2.23 | 33.45
GAMC | 0.30 | 1260 | 2113 | 2151 | 3032 | 76.80 | 16.41 | 246.59

Two-planet system
Method | AR | ESS min | ESS mean | ESS median | ESS max | t | ESS/t | Speed
MALA | 0.59 | 5 | 52 | 10 | 377 | 219.31 | 0.02 | 1.00
AM | 0.01 | 18 | 84 | 82 | 248 | 81.24 | 0.22 | 9.05
SMMALA | 0.70 | 53 | 104 | 100 | 161 | 1606.92 | 0.03 | 1.37
GAMC | 0.32 | 210 | 561 | 486 | 1110 | 328.08 | 0.64 | 26.39
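The derived columns of Table 2 can be recomputed from the tabulated values. The minimal sketch below does this for the t-distribution block, using the printed (rounded) minimum ESS and runtimes, so its output matches the table up to rounding.

```python
# Recompute ESS/t and Speed for the Student's t-distribution block of Table 2
# from the tabulated minimum ESS and CPU runtimes (seconds). Small
# discrepancies with the table arise from rounding of the printed values.
min_ess = {"MALA": 135, "AM": 85, "SMMALA": 74, "GAMC": 1471}
runtime = {"MALA": 9.33, "AM": 17.01, "SMMALA": 143.63, "GAMC": 31.81}

ess_per_sec = {m: min_ess[m] / runtime[m] for m in min_ess}
baseline = ess_per_sec["MALA"]  # Speed is defined relative to MALA

for method, rate in ess_per_sec.items():
    print(f"{method:7s} ESS/t = {rate:6.2f}  Speed = {rate / baseline:.2f}")
```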