
Solving implicit inverse problems with homotopy-based regularization path

*Corresponding author: Davide Parodi
Abstract

Implicit inverse problems, in which noisy observations of a physical quantity are used to infer a nonlinear functional applied to an associated function, are inherently ill-posed and often exhibit non-uniqueness of solutions. Such problems arise in a range of domains, including the identification of systems governed by ordinary and partial differential equations (ODEs/PDEs), optimal control, and data assimilation. Their solution is complicated by the nonlinear nature of the underlying constraints and by the instability that manifests in the presence of noise.
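
Concretely, the setting can be phrased as follows, in illustrative notation of our own (the operators $\mathcal{Q}$ and $\mathbf{c}$ are assumptions, not the paper's numbered equations): the data are noisy observations of a quantity that depends on the parameters only implicitly, through a function defined by a nonlinear constraint.

```latex
% Illustrative statement of an implicit inverse problem (notation assumed);
% requires amsmath. Recover m from noisy data d, where u depends on m only
% implicitly through the nonlinear constraint c(u, m) = 0.
\begin{equation*}
  \mathbf{d} = \mathcal{Q}\bigl(\mathbf{u}(\mathbf{m})\bigr) + \boldsymbol{\varepsilon},
  \qquad \text{where } \mathbf{u}(\mathbf{m}) \text{ solves }
  \mathbf{c}\bigl(\mathbf{u}, \mathbf{m}\bigr) = \mathbf{0}.
\end{equation*}
```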

In this paper, we propose a homotopy-based optimization method for solving such problems. Starting from a constrained formulation augmented with a sparsity-promoting regularization term, we employ a gradient-based algorithm in which gradients with respect to the model parameters are efficiently computed using the adjoint-state method. Nonlinear constraints are handled through a Newton–Raphson procedure.
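
To make the pipeline concrete, here is a minimal sketch, assuming a generic constraint $\mathbf{c}(\mathbf{u}, \mathbf{m}) = \mathbf{0}$ and a least-squares data loss; the cubic residual and all function names are illustrative stand-ins, not the paper's actual system or code.

```python
import numpy as np

# Hedged sketch: adjoint-state gradient for a generic implicit constraint
# c(u, m) = 0 with data loss J = 0.5 * ||u - d||^2. The cubic residual
# below is a hypothetical stand-in, not the system used in the paper.

def c(u, m):
    return u + m * u**3 - 1.0                    # residual of c(u, m) = 0

def dc_du(u, m):
    return np.diag(1.0 + 3.0 * m * u**2)         # Jacobian of c w.r.t. u

def dc_dm(u, m):
    return u**3                                  # Jacobian of c w.r.t. scalar m

def forward(m, u0, tol=1e-12, max_iter=50):
    """Solve c(u, m) = 0 for u with Newton-Raphson."""
    u = u0.copy()
    for _ in range(max_iter):
        r = c(u, m)
        if np.linalg.norm(r) < tol:
            break
        u -= np.linalg.solve(dc_du(u, m), r)     # Newton step
    return u

def data_loss_grad(m, d, u0):
    """Solution u and gradient of J w.r.t. m via the adjoint-state method."""
    u = forward(m, u0)
    lam = np.linalg.solve(dc_du(u, m).T, -(u - d))   # adjoint equation
    return u, float(dc_dm(u, m) @ lam)               # dJ/dm = (dc/dm)^T lam
```

Only one linear solve with the transposed Jacobian is needed per gradient evaluation, which is what makes the adjoint-state approach efficient when the parameter vector is high-dimensional.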

By solving a sequence of problems with decreasing regularization, we trace a solution path that improves stability and enables the exploration of multiple candidate solutions. The method is applied to the latent dynamics discovery problem in simulation, highlighting performance as a function of ground-truth sparsity and illustrating semi-convergence behavior.
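
The path-tracing step admits an equally short sketch, reusing `data_loss_grad` from above; the geometric decay of $\alpha$, the soft-thresholding proximal step for the $\ell_1$ term, and all names are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1 (promotes sparsity)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def regularization_path(d, m0, u0, alpha0=1.0, decay=0.9, n_steps=100,
                        lr=1e-2, inner_iters=200):
    """Trace solutions of the alpha-regularized problem for decreasing alpha,
    warm-starting each solve from the previous one (illustrative sketch)."""
    m = np.atleast_1d(np.asarray(m0, dtype=float))
    alpha, path = alpha0, []
    for _ in range(n_steps):
        for _ in range(inner_iters):             # proximal gradient at fixed alpha
            u, g = data_loss_grad(m, d, u0)
            m = soft_threshold(m - lr * g, lr * alpha)
        path.append((alpha, m.copy()))
        alpha *= decay                           # relax the regularization
    return path
```

Warm-starting keeps consecutive solves cheap and returns the entire family of candidate solutions, from which a best regularization parameter can be selected per trial, as in Figure 2.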

    Mathematics Subject Classification: Primary: 65K10, 49M37; Secondary: 90C30, 49N45.

Figure 1.  $\textbf{m}_1$ (first row) and $\textbf{m}_2$ (second row): synthetic data (light blue line) and noisy discrete synthetic data (blue dots) with different amounts of noise; from left to right, $\boldsymbol{\sigma} = [0.01, 0.1, 0.2]$. The $x$-axis shows time; the $y$-axis shows the values of the state variable.

Figure 2.  Violin plots of relative errors on parameters and solutions computed with the best regularization parameter $\alpha_q^*$ for each trial $q = 1, \dots, n$. The first row shows results for the parameters $\textbf{m}$; the second row shows results for the solutions $\textbf{u}$. The $x$-axis reports the different levels of noise $\boldsymbol{\sigma} = [0.01, 0.1, 0.2]$; the $y$-axis reports the relative error.

Figure 3.  Regularization paths with noise level $\sigma = 0.1$. The first row shows results for $\textbf{m}_1$ (19); the second row shows results for $\textbf{m}_2$ (20). The first column reports the regularization paths of the solutions, with curve colors transitioning from blue ($l = 0$) to red ($l = 99$); the $x$-axis reports time and the $y$-axis the value of the state variable. The second column reports the regularization paths of the parameters; the $x$-axis shows the regularization parameter and the $y$-axis the parameter values. Ground-truth values are marked by a cross at the last regularization parameter; curves and crosses of the same color correspond to the same parameter. The third column shows the regularization path of the relative error on the parameters; the $x$-axis shows the regularization parameter and the $y$-axis the relative error.

Figure 4.  Regularization paths with noise level $\sigma = 0.2$ for $\textbf{m}_1$ (19). Top left: regularization path of the relative error on the parameters ($x$-axis: regularization parameter; $y$-axis: relative error). Top right: regularization paths of the parameters ($x$-axis: regularization parameter; $y$-axis: parameter values); ground-truth values are marked by a cross at the last regularization parameter, and curves and crosses of the same color correspond to the same parameter. Bottom left: terminal part of the regularization path of the data loss function (4). Bottom right: regularization paths of the solutions, with curve colors transitioning from blue ($l = 0$) to red ($l = 99$) ($x$-axis: time; $y$-axis: value of the state variable).

Table 1.  Mean relative errors on parameters and solutions. Each entry reports the mean relative error $\pm$ standard deviation; each row is a different ground truth, and each column a different noise level $\boldsymbol{\sigma} = [0.01, 0.1, 0.2]$.

|                | Low ($\sigma = 0.01$) | Medium ($\sigma = 0.1$) | High ($\sigma = 0.2$) |
|----------------|-----------------------|-------------------------|-----------------------|
| $\textbf{m}_1$ | 0.01 $\pm$ 0.00       | 0.02 $\pm$ 0.02         | 0.02 $\pm$ 0.01       |
| $\textbf{m}_2$ | 0.11 $\pm$ 0.00       | 0.12 $\pm$ 0.02         | 0.11 $\pm$ 0.05       |
| $\textbf{u}_1$ | 0.01 $\pm$ 0.00       | 0.05 $\pm$ 0.02         | 0.04 $\pm$ 0.09       |
| $\textbf{u}_2$ | 0.03 $\pm$ 0.00       | 0.05 $\pm$ 0.01         | 0.04 $\pm$ 0.16       |
