\`x^2+y_1+z_12^34\`
Advanced Search
Article Contents
Article Contents

Optimization via conformal Hamiltonian systems on manifolds

Abstract / Introduction Full Text(HTML) Figure(8) Related Papers Cited by
  • In this work we propose a method to perform optimization on manifolds. We assume to have an objective function $ f $ defined on a manifold and think of it as the potential energy of a mechanical system. By adding a momentum-dependent kinetic energy, we define its Hamiltonian function, which allows us to write the corresponding Hamiltonian system. We make it into a conformal Hamiltonian system by adding a dissipation term: the result is the continuous model of our scheme. We solve it via splitting methods (Lie-Trotter and leapfrog): we combine the RATTLE scheme, approximating the conserved flow, with the exact dissipated flow. The result is a conformal symplectic method for constant stepsizes. We also propose an adaptive stepsize implementation. We test the proposed schemes on an example, the minimization of a function defined on a sphere and they are competitive in comparison with the standard gradient descent method.

    Mathematics Subject Classification: Primary: 65L05, 37M15; Secondary: 65K10, 53C18.

    Citation:

    \begin{equation} \\ \end{equation}
  • 加载中
  • Figure 1.  Results for the SM order 1, for different matrices $ A $ and different parameters $ \gamma $. The stepsize $ h $ is specified in the title

    Figure 2.  Results for the SM of order 1, the SM of order 2 and the adaptive scheme (AD), for different matrices $ A $

    Figure 3.  Verification of the constraints for all methods used: SM of order 1, SM of order 2 and the adaptive routine (AD). Left: checking $ q_n $ belongs to the unit sphere by plotting $ g(q_n) = ||q_n||^2 - 1 $. Right: checking $ p_n $ belongs to the tangent space related to the point $ q_n $ by plotting the inner product $ \langle q_n, p_n \rangle $

    Figure 4.  Distance between the exact solution and the iterates computed with the SM of order 1, of order 2 and the adaptive routine (AD). The distance is calculated between points on the sphere, $ d(x, y) = 2 \arcsin(\frac{||x - y||_2}{2}) $

    Figure 5.  Constant $ \alpha $ such that $ e_{n+1} = \alpha e_n $ related to the plots in Figure 4

    Figure 6.  Results for the SM of order 1, the SM of order 2, the adaptive scheme (AD) and the GD method, for different matrices $ A $

    Figure 7.  Examples of the evolution of the spectral radius $ \rho(DF(q)) $ for different values of $ h $. The second row is a zoom of the plots in the first one

    Figure 8.  Examples of the evolution of the 2-norm of $ DF(q) $ for different values of $ h $. The second row is a zoom of the plots in the first one, and they can be compared to the one in Figure 7 as the data are the same

  • [1] P. A. AbsilR. Mahony and  R. SepulchreOptimization Algorithms on Matrix Manifolds, Princeton University Press, Princeton, NJ, 2008.  doi: 10.1515/9781400830244.
    [2] H. C. Andersen, Rattle: A "velocity" version of the shake algorithm for molecular dynamics calculations, Journal of computational Physics, Elsevier, 52, (1983), 24-34. doi: 10.1016/0021-9991(83)90014-1.
    [3] N. Boumal, An Introduction to Optimization on Smooth Manifolds, 1st edition, Cambridge University Press, 2023.
    [4] E. Celledoni, S. Eidnes, B. Owren and T. Ringholm, Dissipative numerical schemes on Riemannian manifolds with applications to gradient flows, SIAM Journal on Scientific Computing, SIAM, 40 (2018), A3789-A3806. doi: 10.1137/18M1190628.
    [5] E. CelledoniS. EidnesB. Owren and T. Ringholm, Energy-preserving methods on Riemannian manifolds, Mathematics of Computation, 89 (2020), 699-716.  doi: 10.1090/mcom/3470.
    [6] G. França, J. Sulam, D. Robinson and R. Vidal, Conformal symplectic and relativistic optimization, J. Stat. Mech. Theory Exp., 2020 (2020), 124008, 29 pp. doi: 10.1088/1742-5468/abcaee.
    [7] L. C. García-Naranjo and M. Vermeeren, Structure preserving discretization of time-reparametrized Hamiltonian systems with application to nonholonomic mechanics, Journal of Computational Dynamics, 8 (2021), 241-271.  doi: 10.3934/jcd.2021011.
    [8] E. Hairer, C. Lubich and G. Wanner, Geometric Numerical Integration, 2$^{st}$ edition, Springer-Verlag, Berlin, 2006.
    [9] E. Hairer and G. Wanner, Solving Ordinary Differential Equations II, Springer Ser. Comput. Math., 14, Springer-Verlag, Berlin, 1996. doi: 10.1007/978-3-642-05221-7.
    [10] P. Libermann and C. M. Marle, Symplectic Geometry and Analytical Mechanics, 1$^{st}$ edition, Springer Science & Business Media, 2012.
    [11] C. J. Maddison, D. Paulin, Y. W. Teh, B. O'Donoghue and A. Doucet, Hamiltonian Descent Methods, preprint, 2018, arXiv: 1809.05042.
    [12] J. E. Marsden and T. S Ratiu, Introduction to Mechanics and Symmetry: A Basic Exposition of Classical Mechanical Systems, Texts Appl. Math., 17, Springer-Verlag, New York, 1994.
    [13] E. Massart and V. Abrol, Coordinate descent on the orthogonal group for recurrent neural network training, Proceedings of the AAAI Conference on Artificial Intelligence, textbf36 (2022), 7744-7751.
    [14] R. McLachlan and M. Perlmutter, Conformal Hamiltonian systems, Journal of Geometry and Physics, Elsevier, 39 (2001), 276-300. doi: 10.1016/S0393-0440(01)00020-1.
    [15] Y. Nesterov, A method for solving the convex programming problem with convergence rate $O(1/k^2)$, Proceedings of the USSR Academy of Sciences, textbf269 (1983), 543-547.
    [16] B. T. Polyak, Some methods of speeding up the convergence of iteration methods, Ussr Computational Mathematics and Mathematical Physics, Elsevier, 4 (1964), 791-803.
    [17] W. Su, S. Boyd and E. J. Candès, A differential equation for modeling Nesterov's accelerated gradient method: Theory and insights, Journal of Machine Learning Research, 17 (2016), Paper No. 153, 43 pp.
    [18] N. S. Wadia, M. I. Jordan and M. Muehlebach, Optimization with adaptive step size selection from a dynamical systems perspective, Proceedings of the 35th Conference on Neural Information Processing Systems, NeurIPS Workshop on Optimization for Machine Learning, 2021.
    [19] H. Zhang and S. Sra, Towards Riemannian Accelerated Gradient Methods, preprint, 2018, arXiv: 1806.02812.
  • 加载中

Figures(8)

SHARE

Article Metrics

HTML views(2558) PDF downloads(218) Cited by(0)

Access History

Other Articles By Authors

Catalog

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return