
Diffusion-based Sparse-Grid generative models for density estimation

  • *Corresponding author: Guannan Zhang

Notice: This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (https://www.energy.gov/doe-public-access-plan).

Abstract
  • In density estimation, generative models are usually categorized as unsupervised learning because labeled data are unavailable. These models rely on indirect loss functions to train neural networks and face well-known challenges: generative adversarial networks suffer from mode collapse and training instability, while normalizing flows must compute the determinant of the Jacobian matrix, which constrains network design. Although neural networks are well suited to very high-dimensional data, they can be unnecessarily complex for moderately high dimensions, where traditional sparse polynomial approximation avoids expensive training altogether. This work combines a score-based diffusion model with sparse grid interpolation to estimate an unknown density function. The diffusion model generates labeled data pairs that link samples from the standard Gaussian distribution to samples from the target distribution: it transports the Gaussian distribution to the target density through a backward stochastic differential equation, in which a Monte Carlo method approximates the score function at any point, so the transport map can be treated as a function-interpolation problem. A sparse grid interpolant is then built from the labeled data. We use the Tasmanian library [32] to build this sparse-grid generative model and demonstrate its performance on a set of multi-dimensional benchmark distributions.

    Mathematics Subject Classification: 60G25, 65L99, 65Z05.
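The pipeline in the abstract can be illustrated with a minimal 1D numpy sketch. All names here (`score`, `transport`, `F`, `beta`) are illustrative, not from the paper; the sketch assumes a VP-type forward SDE with a constant noise schedule, uses the probability-flow ODE (a deterministic variant of the backward SDE) so that each Gaussian sample maps to a unique target sample, and stands in a piecewise-linear `np.interp` fit for the Tasmanian sparse-grid interpolant:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in 1D target: a two-mode Gaussian mixture (like the paper's GMM benchmarks)
data = np.concatenate([rng.normal(-2, 0.5, 1000), rng.normal(2, 0.5, 1000)])

beta = 10.0  # constant noise schedule for the VP-type forward SDE

def score(x, t):
    """Training-free Monte Carlo estimate of grad log q_t(x) from data samples."""
    a = np.exp(-0.5 * beta * t)          # signal coefficient alpha_t
    s2 = 1.0 - np.exp(-beta * t)         # noise variance sigma_t^2
    d = x[:, None] - a * data[None, :]   # (n_points, n_data) residuals
    logw = -d**2 / (2 * s2)
    w = np.exp(logw - logw.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)    # softmax weights over data samples
    return (w * (-d / s2)).sum(axis=1)   # weighted mixture-score estimate

def transport(z, n_steps=120, t_min=1e-3):
    """Probability-flow ODE from t=1 down to t_min: Gaussian samples -> target."""
    ts = np.linspace(1.0, t_min, n_steps + 1)
    x = z.copy()
    for k in range(n_steps):
        t, dt = ts[k], ts[k] - ts[k + 1]
        drift = -0.5 * beta * x - 0.5 * beta * score(x, t)
        x = x - drift * dt               # Euler step backward in time
    return x

# Labeled pairs (z_i, x_i): the supervised data for interpolation
z_train = np.sort(rng.standard_normal(400))
x_train = transport(z_train)

# Piecewise-linear interpolant F (1D stand-in for a sparse-grid interpolant)
def F(z):
    return np.interp(z, z_train, x_train)

# Cheap generation: push fresh Gaussian samples through F
samples = F(rng.standard_normal(10000))
```

After the labeled pairs are built, sampling costs only one interpolant evaluation per draw; no further score evaluations or SDE solves are needed, which is the point of the sparse-grid surrogate.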

  • Figure 1.  Semi-local polynomial points and basis functions, left to right: linear, quadratic, and cubic functions

    Figure 2.  Comparison of the true PDF and estimated histogram for 1D GMM with two connected modes. From left to right: local polynomial grids with polynomial orders $ \{1,2,3\} $. From top to bottom: local polynomial grids with levels $ \{3,4,6\} $. The red line represents the exact PDF. The grey histogram is based on the samples generated by the sparse-grid generative model $ F $

    Figure 3.  KL divergence and log KL divergence between true PDF and estimated PDF with orders $ \{1,2,3\} $ and sparse grid levels $ \{3,4,5,6\} $ for the 1D GMM with connected modes

    Figure 4.  KL divergence and log KL divergence between true PDF and estimated PDF with orders $ \{1,2,3\} $ and training data sample sizes $ \{6000,60000,600000\} $ for the 1D GMM with connected modes

    Figure 5.  Comparison of the true PDF and estimated histogram for 1D well-separated GMM. From left to right: local polynomial grids with orders $ \{1,2,3\} $. From top to bottom: local polynomial grids with levels $ \{3,4,9\} $. The red line represents the exact PDF. The grey histogram is the PDF approximated by the local polynomial grids

    Figure 6.  1D well-separated GMM KL divergence and log KL divergence between true PDF and estimated PDF with orders $ \{1,2,3\} $ and levels $ \{3,4,\cdots,9\} $

    Figure 7.  Grids of the 2D local polynomial interpolation with levels $ \{3,4,9\} $

    Figure 8.  Comparison of the banana distribution using a scatter plot of the exact density function and samples obtained by the sparse grid generative model $ F $ of order $ 1 $ with levels $ \left\{ {3,4,9} \right\} $

    Figure 9.  Comparison of the true PDF and estimated histogram for the banana distribution, using local polynomial grids of order $ 1 $ with levels $ \{3,4,9\} $. From left to right: $ x_1 $ and $ x_2 $. From top to bottom: local polynomial grids with levels $ \{3,4,9\} $. The red line represents the exact PDF. The grey histogram is the PDF approximated by the local polynomial grids

    Figure 10.  Banana distribution KL divergence and log KL divergence between true marginal PDF and estimated marginal PDF with sparse grid orders $ \{1,2,3\} $ and levels $ \{3,4,\cdots,9\} $. From left to right: $ x_1 $ and $ x_2 $. From top to bottom: KL divergence and log KL divergence

    Figure 11.  Comparison of the true PDF and estimated histogram for a funnel distribution, using local polynomial grids of order $ 1 $ with levels $ \{2,3,4\} $. From left to right: $ x_1 $, $ x_2 $, $ \cdots $, $ x_5 $. From top to bottom: local polynomial grids with levels $ \{2,3,4\} $. The red line represents the exact PDF. The grey histogram is the PDF approximated by the local polynomial grids

    Figure 12.  Funnel distribution KL divergence and log KL divergence between true marginal PDF and estimated marginal PDF with sparse grid orders $ \{1,2,3\} $ and levels $ \{2,3,4\} $. From left to right: $ x_1 $, $ x_2 $, $ \cdots $, $ x_5 $. From top to bottom: KL divergence and log KL divergence

  • [1] J. Austin, D. D. Johnson, J. Ho, D. Tarlow and R. van den Berg, Structured denoising diffusion models in discrete state-spaces, in Advances in Neural Information Processing Systems, NeurIPS 2021, virtual, 34 (2021), 17981-17993.
    [2] F. Bao, Z. Zhang and G. Zhang, An ensemble score filter for tracking high-dimensional nonlinear dynamical systems, Computer Methods in Applied Mechanics and Engineering, 432 (2024), 117447.  doi: 10.1016/j.cma.2024.117447.
    [3] F. Bao, Z. Zhang and G. Zhang, A score-based nonlinear filter for data assimilation, Journal of Computational Physics, 514 (2024), Paper No. 113207, 16 pp. doi: 10.1016/j.jcp.2024.113207.
    [4] R. Baptista, Y. Marzouk and O. Zahm, On the representation and learning of monotone triangular transport maps, Foundations of Computational Mathematics, Springer, (2023), 1-46. doi: 10.1007/s10208-023-09630-x.
    [5] H.-J. Bungartz and M. Griebel, Sparse grids, Acta Numerica, 13 (2004), 147-269.  doi: 10.1017/S0962492904000182.
    [6] R. Cai, G. Yang, H. Averbuch-Elor, Z. Hao, S. J. Belongie, N. Snavely and B. Hariharan, Learning gradient fields for shape generation, in ECCV 2020, Part III, Lecture Notes in Computer Science, Springer, 12348 (2020), 364-381. doi: 10.1007/978-3-030-58580-8_22.
    [7] T. Chakraborty, U. K. Reddy, S. M. Naik, M. Panja and B. Manvitha, Ten years of generative adversarial nets (GANs): A survey of the state-of-the-art, Machine Learning: Science and Technology, IOP Publishing, 5 (2024), 011001. doi: 10.1088/2632-2153/ad1f77.
    [8] C. Chen, C. Li, L. Chen, W. Wang, Y. Pu and L. Carin, Continuous-time flows for efficient inference and density estimation, in International Conference on Machine Learning, PMLR, (2018), 824-833.
    [9] A. Chkifa, A. Cohen and C. Schwab, High-dimensional adaptive sparse polynomial interpolation and applications to parametric PDEs, Foundations of Computational Mathematics, 14 (2014), 601-633.  doi: 10.1007/s10208-013-9154-z.
    [10] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville and Y. Bengio, Generative Adversarial Nets, in Advances in Neural Information Processing Systems, vol. 27, Curran Associates, Inc., 2014.
    [11] A. Haisam, J. Yin, Y. Geng, S. Liang, F. Bao, L. Ju and G. Zhang, A Scalable Training-Free Diffusion Model for Uncertainty Quantification, in Proceedings of the IEEE/ACM International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), IEEE, Denver, CO, USA, 2024.
    [12] J. Ho, A. Jain and P. Abbeel, Denoising diffusion probabilistic models, in Advances in Neural Information Processing Systems, Curran Associates, Inc., 33 (2020), 6840-6851.
    [13] E. Hoogeboom, D. Nielsen, P. Jaini, P. Forré and M. Welling, Argmax flows and multinomial diffusion: Learning categorical distributions, in Advances in Neural Information Processing Systems, NeurIPS 2021, virtual, 34 (2021), 12454-12465.
    [14] A. Hyvärinen, Estimation of non-normalized statistical models by score matching, Journal of Machine Learning Research, 6 (2005), 695-709. 
    [15] J. D. Jakeman and S. G. Roberts, Local and dimension adaptive stochastic collocation for uncertainty quantification, in Sparse Grids and Applications, Springer, 88 (2012), 181-203. doi: 10.1007/978-3-642-31703-3_9.
    [16] D. P. Kingma and M. Welling, Auto-Encoding Variational Bayes, in 2nd International Conference on Learning Representations, ICLR 2014, Conference Track Proceedings, Banff, AB, Canada, 2014.
    [17] I. Kobyzev, S. J. D. Prince and M. A. Brubaker, Normalizing flows: An introduction and review of current methods, IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE, 43 (2021), 3964-3979. doi: 10.1109/TPAMI.2020.2992934.
    [18] Y. Liu, M. Yang, Z. Zhang, F. Bao, Y. Cao and G. Zhang, Diffusion-model-assisted supervised learning of generative models for density estimation, Journal of Machine Learning for Modeling and Computing, 5 (2024), 25-38.  doi: 10.1615/JMachLearnModelComput.2024051346.
    [19] D. Lu, Y. Liu, Z. Zhang, F. Bao and G. Zhang, A diffusion-based uncertainty quantification method to advance E3SM land model calibration, Journal of Geophysical Research: Machine Learning and Computation, 1 (2024), e2024JH000234. doi: 10.1029/2024JH000234.
    [20] X. Ma and N. Zabaras, An adaptive hierarchical sparse grid collocation algorithm for the solution of stochastic differential equations, Journal of Computational Physics, Elsevier, 228 (2009), 3084-3113. doi: 10.1016/j.jcp.2009.01.006.
    [21] D. Onken, S. W. Fung, X. Liang and L. Ruthotto, OT-flow: Fast and accurate continuous normalizing flows via optimal transport, Proceedings of the AAAI Conference on Artificial Intelligence, 35 (2021), 9223-9232.  doi: 10.1609/aaai.v35i10.17113.
    [22] G. Papamakarios, E. Nalisnick, D. J. Rezende, S. Mohamed and B. Lakshminarayanan, Normalizing flows for probabilistic modeling and inference, The Journal of Machine Learning Research, 22 (2021), Paper No. 57, 64 pp.
    [23] G. Papamakarios, T. Pavlakou and I. Murray, Masked Autoregressive Flow for Density Estimation, in Advances in Neural Information Processing Systems, vol. 30, 2017.
    [24] M. Parno, P.-B. Rubio, D. Sharp, M. Brennan, R. Baptista, H. Bonart and Y. Marzouk, Mpart: Monotone parameterization toolkit, Journal of Open Source Software, 7 (2022), 4843.  doi: 10.21105/joss.04843.
    [25] D. Pflüger, Spatially Adaptive Sparse Grids for High-Dimensional Problems, Verlag Dr. Hut, Munich, 2010. doi: 10.1016/j.jco.2010.04.001.
    [26] T. Salimans, I. J. Goodfellow, W. Zaremba, V. Cheung, A. Radford and X. Chen, Improved techniques for training GANs, in Advances in Neural Information Processing Systems, NeurIPS, 29 (2016), 2226-2234.
    [27] T. Schlegl, P. Seeböck, S. M. Waldstein, U. Schmidt-Erfurth and G. Langs, Unsupervised anomaly detection with generative adversarial networks to guide marker discovery, in Information Processing in Medical Imaging, IPMI 2017, Lecture Notes in Computer Science, Springer, 10265 (2017), 146-157. doi: 10.1007/978-3-319-59050-9_12.
    [28] Y. Song and S. Ermon, Generative Modeling by Estimating Gradients of the Data Distribution, in Advances in Neural Information Processing Systems, vol. 32, 2019.
    [29] Y. Song, S. Garg, J. Shi and S. Ermon, Sliced score matching: A scalable approach to density and score estimation, Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, PMLR, 115 (2020), 574-584. 
    [30] Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon and B. Poole, Score-Based Generative Modeling Through Stochastic Differential Equations, in International Conference on Learning Representations, 2021.
    [31] M. Stoyanov, Adaptive sparse grid construction in a context of local anisotropy and multiple hierarchical parents, in Sparse Grids and Applications–Miami 2016, Springer, 123 (2018), 175-199. doi: 10.1007/978-3-319-75426-0_8.
    [32] M. Stoyanov, D. Lebrun-Grandie, J. Burkardt and D. Munster, Tasmanian, 9, 2013. Available from: https://github.com/ORNL/Tasmanian.
    [33] M. K. Stoyanov and C. G. Webster, A dynamically adaptive sparse grids method for quasi-optimal interpolation of multidimensional functions, Computers & Mathematics with Applications, Elsevier, 71 (2016), 2449-2465. doi: 10.1016/j.camwa.2015.12.045.
    [34] P. Vincent, A connection between score matching and denoising autoencoders, Neural Comput., MIT Press, 23 (2011), 1661-1674. doi: 10.1162/NECO_a_00142.
    [35] L. Yang, Z. Zhang, Y. Song, S. Hong, R. Xu, Y. Zhao, W. Zhang, B. Cui and M.-H. Yang, Diffusion models: A comprehensive survey of methods and applications, ACM Comput. Surv., 56 (2023), Article No.: 105, 1-39. doi: 10.1145/3626235.
    [36] J. Yin, S. Liang, S. Liu, F. Bao, H. G. Chipilski, D. Lu and G. Zhang, A Scalable Real-Time Data Assimilation Framework for Predicting Turbulent Atmosphere Dynamics, in Proceedings of the IEEE/ACM International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), IEEE, Atlanta, GA, USA, 2024.
