Machine learning (ML) methods, which fit data to the parameters of a given parameterized model class, have garnered significant interest as potential methods for learning surrogate models for complex engineering systems where traditional simulation is expensive. However, in many scientific and engineering settings, generating high-fidelity data to train ML models is expensive, and the available budget for generating training data is limited, making high-fidelity training data scarce. ML models trained on scarce data have high variance, resulting in poor expected generalization performance. We propose a new multifidelity training approach for scientific machine learning via linear regression that exploits the scientific context where data of varying fidelities and costs are available; for example, high-fidelity data may be generated by an expensive fully resolved physics simulation, whereas lower-fidelity data may arise from a cheaper model based on simplifying assumptions. We use the multifidelity data within an approximate control variate framework to define new multifidelity Monte Carlo estimators for linear regression models. We provide bias and variance analyses of our new estimators that guarantee the approach's accuracy and its improved robustness to scarce high-fidelity data. Numerical results demonstrate that our multifidelity training approach achieves accuracy similar to that of the standard high-fidelity-only approach while significantly reducing high-fidelity data requirements.
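The core idea of the abstract—using plentiful cheap low-fidelity evaluations as a control variate to reduce the variance of statistics estimated from scarce high-fidelity data—can be sketched as follows. This is an illustrative two-fidelity mean estimator, not the paper's exact estimator for the regression quantities; the toy models `f_hi` and `f_lo`, the sample sizes, and the fixed coefficient `alpha = 1` are all assumptions for the sketch.

```python
import numpy as np

def mf_mean(f_hi, f_lo, z_small, z_large, alpha=1.0):
    """Two-fidelity control-variate estimate of E[f_hi(Z)].

    The expensive model is evaluated only on the small sample z_small;
    the cheap model is evaluated on both z_small and the large sample
    z_large. The estimator is unbiased for any fixed alpha, because the
    two low-fidelity sample means share the same expectation.
    """
    y_hi = f_hi(z_small)                       # scarce high-fidelity data
    correction = f_lo(z_large).mean() - f_lo(z_small).mean()
    return y_hi.mean() + alpha * correction

rng = np.random.default_rng(0)
f_hi = np.exp                                  # hypothetical high-fidelity model
f_lo = lambda z: 1.0 + z + 0.5 * z**2          # hypothetical cheap surrogate (truncated Taylor)
z_small = rng.uniform(0.0, 1.0, 50)            # few expensive evaluations
z_large = rng.uniform(0.0, 1.0, 5000)          # many cheap evaluations
estimate = mf_mean(f_hi, f_lo, z_small, z_large)
```

Because the two models are strongly correlated here, most of the high-fidelity sampling noise cancels in the correction term; in the scalar case the variance-optimal coefficient is the familiar $\alpha^* = \rho\,\sigma_1/\sigma_2$, while the paper works with matrix-valued coefficients for the vector regression quantities.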
Figure 1. Exponential function example: Box plots of 500 realizations of estimators for the first entry of $ \hat{c}_{XY} $ (left), the first regression coefficient (center), and the regression model prediction $ \hat{f}(z = 5;\hat{\beta}) $ (right), when true model statistics are known. Black lines show the mean over the 500 realizations of training data, and the dotted gray line shows the true reference value.
Figure 4. PDE model problem: convergence of multifidelity linear regression estimators for $ \hat c_{XY} $ (top), $ \hat\beta $ (middle), and $ \hat f(z;\hat\beta) $ (bottom), when model statistics are estimated using $ 10^5 $ (left), 100 (center), or 10 (right) pilot samples. Results for the multifidelity approach with the optimal matrix coefficient are omitted in the second and third columns because their variances are so large that they would significantly distort the plot axes.
Table 1. Sample allocations for different computational budgets based on (10), assuming exact model statistics for the analytic example
| Computational budget | $ m_1 $ | $ m_2 $ |
|---|---|---|
| 10 | 8 | 1126 |
| 100 | 88 | 11263 |
| 1000 | 887 | 112631 |
Table 2. Analytical example: variances of key regression quantities estimated using high-fidelity-only and multifidelity strategies with a computational budget equivalent to 10 high-fidelity samples. Values correspond to using exact model statistics to compute the control variate coefficient
| Estimator | $ \mathsf{Tr}( \operatorname{\mathbb{C}ov}[\hat c_{XY}]) $ | $ \mathsf{Tr}( \operatorname{\mathbb{C}ov}[\hat\beta]) $ | $ \operatorname{\mathbb{V}ar}[\hat f(z;\hat\beta)] $ |
|---|---|---|---|
| MF-$ A^* $ | 3.2e5 | 1.3e3 | 3.3e2 |
| MF-$ \alpha^* $ | 1.2e6 | 1.1e4 | 8.1e2 |
| MF-$ \alpha^{\rm mean} $ | 1.6e6 | 1.4e4 | 8.1e2 |
| High-fidelity only | 3.4e7 | 5.2e4 | 2.9e4 |
Table 3. High-sample model statistics for the high- and low-fidelity models for the CDR problem, computed using all available samples in the data set

| Model | | $ \mu_k $ | $ \sigma_k $ | $ \rho_{1k} $ | $ w_k $ |
|---|---|---|---|---|---|
| High-fidelity (FD) | $ f^{(1)} $ | 1406 | 276 | 1 | 1.94 |
| Low-fidelity (POD-DEIM) | $ f^{(2)} $ | 1349 | 356 | 0.95 | 6.2e-3 |
Table 4. Sample allocation based on (10) for the CDR problem based on reference model statistics computed using all available samples in the data set
| Computational budget | $ m_1 $ | $ m_2 $ |
|---|---|---|
| 10 | 4 | 250 |
| 100 | 43 | 2504 |
| 1000 | 435 | 25045 |
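The sample counts in Tables 1 and 4 come from the paper's allocation rule (10), which is not reproduced on this page. As a rough illustration of how such two-model allocations are computed, the sketch below implements the classical MFMC allocation rule (Peherstorfer et al. style), taking the budget in units of high-fidelity evaluations and using the costs $w_k$ and correlation $\rho_{12}$ reported in Table 3. The exact constants in the paper's rule (10) for regression differ, so the numbers below will not match Table 4 exactly.

```python
import math

def mfmc_allocation(budget_hf_equiv, w1, w2, rho):
    """Classical two-model MFMC sample allocation (illustrative sketch).

    budget_hf_equiv: total budget expressed as an equivalent number of
    high-fidelity evaluations. Returns (m1, m2): the number of high- and
    low-fidelity samples. The paper's regression-specific rule (10) uses
    different constants; this shows only the standard MFMC structure.
    """
    # Optimal ratio of low- to high-fidelity samples: trade cheapness
    # of the low-fidelity model against its correlation with the
    # high-fidelity model.
    r = math.sqrt(w1 * rho**2 / (w2 * (1.0 - rho**2)))
    total_cost = budget_hf_equiv * w1        # budget in absolute cost units
    m1 = total_cost / (w1 + r * w2)          # spend the budget at ratio r
    m2 = r * m1
    return max(1, int(m1)), int(m2)

# Statistics from Table 3 (CDR problem): costs w1, w2 and correlation rho.
m1, m2 = mfmc_allocation(10, w1=1.94, w2=6.2e-3, rho=0.95)
```

As in the tables, the allocation assigns the low-fidelity model tens of samples for every high-fidelity sample, reflecting its much lower cost and high correlation with the high-fidelity model.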
Additional figure captions: Exponential example: comparing models learned with the standard high-fidelity-only (HF) and proposed multifidelity (MF) training approaches. Generalization error over 1000 unseen test data points: plotted lines and shaded regions show the mean and one standard deviation over 500 models trained on independent realizations of the training data.