• Previous Article
    Mathematical analysis of an abstract model and its applications to structured populations (I)
  • EECT Home
  • This Issue
  • Next Article
    Robustness of global attractors: Abstract framework and application to dissipative wave equations
doi: 10.3934/eect.2022022
Online First

Online First articles are published articles within a journal that have not yet been assigned to a formal issue. This means they do not yet have a volume number, issue number, or page numbers assigned to them, however, they can still be found and cited using their DOI (Digital Object Identifier). Online First publication benefits the research community by making new scientific discoveries known as quickly as possible.

Readers can access Online First articles via the “Online First” tab for the selected journal.

On the effect of perturbations in first-order optimization methods with inertia and Hessian driven damping

1. 

IMAG, Univ. Montpellier, CNRS, Montpellier, France

2. 

Normandie Univ, ENSICAEN, CNRS, GREYC, Caen, France

3. 

Department of Computer Science and Engineering, Czech Technical University, Prague, Czech Republic

* Corresponding author: Hedy Attouch

Received  January 2022 Revised  March 2022 Early access April 2022

Fund Project: The last author is supported by the OP VVV project CZ.02.1.01/0.0/0.0/16_019/0000765 "Research Center for Informatics"

Second-order continuous-time dissipative dynamical systems with viscous and Hessian driven damping have inspired effective first-order algorithms for solving convex optimization problems. While preserving the fast convergence properties of the Nesterov-type acceleration, the Hessian driven damping makes it possible to significantly attenuate the oscillations. To study the stability of these algorithms with respect to perturbations, we analyze the behaviour of the corresponding continuous systems when the gradient computation is subject to exogenous additive errors. We provide a quantitative analysis of the asymptotic behaviour of two types of systems, those with implicit and explicit Hessian driven damping. We consider convex, strongly convex, and non-smooth objective functions defined on a real Hilbert space and show that, depending on the formulation, different integrability conditions on the perturbations are sufficient to maintain the convergence rates of the systems. We highlight the differences between the implicit and explicit Hessian damping, and in particular point out that the assumptions on the objective and perturbations needed in the implicit case are more stringent than in the explicit case.

Citation: Hedy Attouch, Jalal Fadili, Vyacheslav Kungurtsev. On the effect of perturbations in first-order optimization methods with inertia and Hessian driven damping. Evolution Equations and Control Theory, doi: 10.3934/eect.2022022
References:
[1]

S. Adly and H. Attouch, Finite convergence of proximal-gradient inertial algorithms combining dry friction with Hessian-driven damping, SIAM J. Optim., 30 (2020), 2134-2162.  doi: 10.1137/19M1307779.

[2]

S. Adly and H. Attouch, Finite time stabilization of continuous inertial dynamics combining dry friction with Hessian-driven damping, J. Conv. Analysis, 28 (2021), 281-310. 

[3]

C. D. Alecsa, S. László and T. Pinţa, An extension of the second order dynamical system that models Nesterov's convex gradient method, Appl. Math. Optim., 84 2020), 1687–1716. doi: 10.1007/s00245-020-09692-1.

[4]

F. AlvarezH. AttouchJ. Bolte and P. Redont, A second-order gradient-like dissipative dynamical system with Hessian-driven damping. Application to optimization and mechanics, J. Math. Pures Appl., 81 (2002), 747-779.  doi: 10.1016/S0021-7824(01)01253-3.

[5]

V. ApidopoulosJ.-F. Aujol and C. Dossal, Convergence rate of inertial Forward-Backward algorithm beyond Nesterov's rule, Math. Program. Ser. A, 180 (2020), 137-156.  doi: 10.1007/s10107-018-1350-9.

[6]

H. Attouch, R. I. Boţ and E. R. Csetnek, Fast optimization via inertial dynamics with closed-loop damping, Journal of the European Mathematical Society (JEMS), (2021), arXiv: 2008.02261v2 [math. OC].

[7]

H. Attouch and A. Cabot, Asymptotic stabilization of inertial gradient dynamics with time-dependent viscosity, J. Differential Equations, 263 (2017), 5412-5458.  doi: 10.1016/j.jde.2017.06.024.

[8]

H. Attouch, A. Cabot, Z. Chbani and H. Riahi, Accelerated forward-backward algorithms with perturbations: Application to Tikhonov regularization, J. Optim. Theory Appl., 179 (2018), 1–36. doi: 10.1007/s10957-018-1369-3.

[9]

H. Attouch, Z. Chbani, J. Fadili and H. Riahi, First order optimization algorithms via inertial systems with Hessian driven damping, Math. Program., (2020). doi: 10.1007/s10107-020-01591-1.

[10]

H. Attouch, Z. Chbani, J. Fadili and H. Riahi, Convergence of iterates for first-order optimization algorithms with inertia and hessian driven damping, Optimization, (2021). doi: 10.1080/02331934.2021.2009828.

[11]

H. AttouchZ. ChbaniJ. Peypouquet and P. Redont, Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity, Math. Program. Ser. B, 168 (2018), 123-175.  doi: 10.1007/s10107-016-0992-8.

[12]

H. Attouch, Z. Chbani and H. Riahi, Rate of convergence of the Nesterov accelerated gradient method in the subcritical case $\alpha \leq 3$, ESAIM Control Optim. Calc. Var., 25 (2019), Paper No. 2, 34 pp. doi: 10.1051/cocv/2017083.

[13]

H. AttouchZ. Chbani and H. Riahi, Fast proximal methods via time scaling of damped inertial dynamics, SIAM J. Optim., 29 (2019), 2227-2256.  doi: 10.1137/18M1230207.

[14]

H. Attouch and M.-O. Czarnecki, Asymptotic control and stabilization of nonlinear oscillators with non-isolated equilibria, J. Differential Equations, 179 (2002), 278-310.  doi: 10.1006/jdeq.2001.4034.

[15]

H. Attouch and A. Damlamian, Strong solutions for parabolic variational inequalities, Nonlinear Analysis, 2 (1978), 329-353.  doi: 10.1016/0362-546X(78)90021-4.

[16]

H. Attouch and S. C. László, Newton-like inertial dynamics and proximal algorithms governed by maximally monotone operators, SIAM J. Optim., 30 (2020), 3252-3283.  doi: 10.1137/20M1333316.

[17]

H. Attouch and S. C. László, Continuous Newton-like inertial dynamics for monotone inclusions, Set-Valued Var. Anal., 29 (2020), 555-581.  doi: 10.1007/s11228-020-00564-y.

[18]

H. AttouchP.-E. Maingé and P. Redont, A second-order differential system with Hessian-driven damping; Application to non-elastic shock laws, Differ. Equ. Appl., 4 (2012), 27-65.  doi: 10.7153/dea-04-04.

[19]

H. AttouchJ. Peypouquet and P. Redont, Fast convex minimization via inertial dynamics with Hessian driven damping, J. Differential Equations, 261 (2016), 5734-5783.  doi: 10.1016/j.jde.2016.08.020.

[20]

J.-F. Aujol and C. Dossal, Stability of over-relaxations for the forward-backward algorithm, application to FISTA, SIAM J. Optim., 25 (2015), 2408-2433.  doi: 10.1137/140994964.

[21]

J.-F. AujolC. Dossal and A. Rondepierre, Optimal convergence rates for Nesterov acceleration, SIAM J. Optim., 29 (2019), 3131-3153.  doi: 10.1137/18M1186757.

[22]

A. Beck and M. Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM J. Imaging Sci., 2 (2009), 183-202.  doi: 10.1137/080716542.

[23]

R. I. BotE. R. Csetnek and S. C. Laszlo, Tikhonov regularization of a second order dynamical system with Hessian damping, Math. Program. Ser. B, 189 (2021), 151-186.  doi: 10.1007/s10107-020-01528-8.

[24]

H. Brézis, Opérateurs Maximaux Monotones Dans Les Espaces de Hilbert et Équations D'évolution, Lecture Notes 5, North Holland, 1972.

[25]

C. Castera, J. Bolte, C. Févotte and E. Pauwels, An inertial newton algorithm for deep learning, J. Mach. Learn. Res., 22 (2021), Paper No. 134, 31 pp. doi: 10.22405/2226-8383-2021-22-2-121-134.

[26]

A. Chambolle and Ch. Dossal, On the convergence of the iterates of the "fast iterative shrinkage thresholding algorithm", J. Opt. Theory Appl., 166 (2015), 968-982.  doi: 10.1007/s10957-015-0746-4.

[27]

A. Haraux and M. A. Jendoubi, On a second order dissipative ode in Hilbert space with an integrable source term, Acta Math. Sci., 32 (2012), 155-163.  doi: 10.1016/S0252-9602(12)60009-5.

[28]

T. Lin and M. I. Jordan, A control-theoretic perspective on optimal high-order optimization, (2019), arXiv: 1912.07168v1 [math. OC].

[29]

M. Muehlebach and M. I. Jordan, A dynamical systems perspective on nesterov acceleration, (2019), arXiv: 1905.07436.

[30]

Y. Nesterov, A method of solving a convex programming problem with convergence rate $O(1/k^2)$, Dokl. Akad. Nauk SSSR, 269 (1983), 543-547. 

[31]

Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, Applied Optimization, 87. Kluwer Academic Publishers, Boston, MA, 2004. doi: 10.1007/978-1-4419-8853-9.

[32]

Z. Opial, Weak convergence of the sequence of successive approximations for nonexpansive mappings, Bull. Amer. Math. Soc., 73 (1967), 591-597.  doi: 10.1090/S0002-9904-1967-11761-0.

[33]

B. T. Polyak, Introduction to Optimization, Optimization Software, Inc., Publications Division, New York, 1987.

[34]

M. Schmidt, N. Le Roux and F. Bach, Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization, NIPS'11 - 25 th Annual Conference on Neural Information Processing Systems, Dec 2011, Grenada, Spain, HAL inria-00618152v3.

[35]

B. Shi, S. S. Du, M. I. Jordan and W. J. Su, Understanding the acceleration phenomenon via high-resolution differential equations, Math. Program., (2021). doi: 10.1007/s10107-021-01681-8.

[36]

W. Su, S. Boyd and E. J. Candès, A differential equation for modeling nesterov's accelerated gradient method: Theory and insights, J. Mach. Learn. Res., 27 (2016), Paper No. 153, 43 pp.

[37]

S. VillaS. SalzoL. Baldassarres and A. Verri, Accelerated and inexact forward-backward, SIAM J. Optim., 23 (2013), 1607-1633.  doi: 10.1137/110844805.

show all references

References:
[1]

S. Adly and H. Attouch, Finite convergence of proximal-gradient inertial algorithms combining dry friction with Hessian-driven damping, SIAM J. Optim., 30 (2020), 2134-2162.  doi: 10.1137/19M1307779.

[2]

S. Adly and H. Attouch, Finite time stabilization of continuous inertial dynamics combining dry friction with Hessian-driven damping, J. Conv. Analysis, 28 (2021), 281-310. 

[3]

C. D. Alecsa, S. László and T. Pinţa, An extension of the second order dynamical system that models Nesterov's convex gradient method, Appl. Math. Optim., 84 2020), 1687–1716. doi: 10.1007/s00245-020-09692-1.

[4]

F. AlvarezH. AttouchJ. Bolte and P. Redont, A second-order gradient-like dissipative dynamical system with Hessian-driven damping. Application to optimization and mechanics, J. Math. Pures Appl., 81 (2002), 747-779.  doi: 10.1016/S0021-7824(01)01253-3.

[5]

V. ApidopoulosJ.-F. Aujol and C. Dossal, Convergence rate of inertial Forward-Backward algorithm beyond Nesterov's rule, Math. Program. Ser. A, 180 (2020), 137-156.  doi: 10.1007/s10107-018-1350-9.

[6]

H. Attouch, R. I. Boţ and E. R. Csetnek, Fast optimization via inertial dynamics with closed-loop damping, Journal of the European Mathematical Society (JEMS), (2021), arXiv: 2008.02261v2 [math. OC].

[7]

H. Attouch and A. Cabot, Asymptotic stabilization of inertial gradient dynamics with time-dependent viscosity, J. Differential Equations, 263 (2017), 5412-5458.  doi: 10.1016/j.jde.2017.06.024.

[8]

H. Attouch, A. Cabot, Z. Chbani and H. Riahi, Accelerated forward-backward algorithms with perturbations: Application to Tikhonov regularization, J. Optim. Theory Appl., 179 (2018), 1–36. doi: 10.1007/s10957-018-1369-3.

[9]

H. Attouch, Z. Chbani, J. Fadili and H. Riahi, First order optimization algorithms via inertial systems with Hessian driven damping, Math. Program., (2020). doi: 10.1007/s10107-020-01591-1.

[10]

H. Attouch, Z. Chbani, J. Fadili and H. Riahi, Convergence of iterates for first-order optimization algorithms with inertia and hessian driven damping, Optimization, (2021). doi: 10.1080/02331934.2021.2009828.

[11]

H. AttouchZ. ChbaniJ. Peypouquet and P. Redont, Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity, Math. Program. Ser. B, 168 (2018), 123-175.  doi: 10.1007/s10107-016-0992-8.

[12]

H. Attouch, Z. Chbani and H. Riahi, Rate of convergence of the Nesterov accelerated gradient method in the subcritical case $\alpha \leq 3$, ESAIM Control Optim. Calc. Var., 25 (2019), Paper No. 2, 34 pp. doi: 10.1051/cocv/2017083.

[13]

H. AttouchZ. Chbani and H. Riahi, Fast proximal methods via time scaling of damped inertial dynamics, SIAM J. Optim., 29 (2019), 2227-2256.  doi: 10.1137/18M1230207.

[14]

H. Attouch and M.-O. Czarnecki, Asymptotic control and stabilization of nonlinear oscillators with non-isolated equilibria, J. Differential Equations, 179 (2002), 278-310.  doi: 10.1006/jdeq.2001.4034.

[15]

H. Attouch and A. Damlamian, Strong solutions for parabolic variational inequalities, Nonlinear Analysis, 2 (1978), 329-353.  doi: 10.1016/0362-546X(78)90021-4.

[16]

H. Attouch and S. C. László, Newton-like inertial dynamics and proximal algorithms governed by maximally monotone operators, SIAM J. Optim., 30 (2020), 3252-3283.  doi: 10.1137/20M1333316.

[17]

H. Attouch and S. C. László, Continuous Newton-like inertial dynamics for monotone inclusions, Set-Valued Var. Anal., 29 (2020), 555-581.  doi: 10.1007/s11228-020-00564-y.

[18]

H. AttouchP.-E. Maingé and P. Redont, A second-order differential system with Hessian-driven damping; Application to non-elastic shock laws, Differ. Equ. Appl., 4 (2012), 27-65.  doi: 10.7153/dea-04-04.

[19]

H. AttouchJ. Peypouquet and P. Redont, Fast convex minimization via inertial dynamics with Hessian driven damping, J. Differential Equations, 261 (2016), 5734-5783.  doi: 10.1016/j.jde.2016.08.020.

[20]

J.-F. Aujol and C. Dossal, Stability of over-relaxations for the forward-backward algorithm, application to FISTA, SIAM J. Optim., 25 (2015), 2408-2433.  doi: 10.1137/140994964.

[21]

J.-F. AujolC. Dossal and A. Rondepierre, Optimal convergence rates for Nesterov acceleration, SIAM J. Optim., 29 (2019), 3131-3153.  doi: 10.1137/18M1186757.

[22]

A. Beck and M. Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM J. Imaging Sci., 2 (2009), 183-202.  doi: 10.1137/080716542.

[23]

R. I. BotE. R. Csetnek and S. C. Laszlo, Tikhonov regularization of a second order dynamical system with Hessian damping, Math. Program. Ser. B, 189 (2021), 151-186.  doi: 10.1007/s10107-020-01528-8.

[24]

H. Brézis, Opérateurs Maximaux Monotones Dans Les Espaces de Hilbert et Équations D'évolution, Lecture Notes 5, North Holland, 1972.

[25]

C. Castera, J. Bolte, C. Févotte and E. Pauwels, An inertial newton algorithm for deep learning, J. Mach. Learn. Res., 22 (2021), Paper No. 134, 31 pp. doi: 10.22405/2226-8383-2021-22-2-121-134.

[26]

A. Chambolle and Ch. Dossal, On the convergence of the iterates of the "fast iterative shrinkage thresholding algorithm", J. Opt. Theory Appl., 166 (2015), 968-982.  doi: 10.1007/s10957-015-0746-4.

[27]

A. Haraux and M. A. Jendoubi, On a second order dissipative ode in Hilbert space with an integrable source term, Acta Math. Sci., 32 (2012), 155-163.  doi: 10.1016/S0252-9602(12)60009-5.

[28]

T. Lin and M. I. Jordan, A control-theoretic perspective on optimal high-order optimization, (2019), arXiv: 1912.07168v1 [math. OC].

[29]

M. Muehlebach and M. I. Jordan, A dynamical systems perspective on nesterov acceleration, (2019), arXiv: 1905.07436.

[30]

Y. Nesterov, A method of solving a convex programming problem with convergence rate $O(1/k^2)$, Dokl. Akad. Nauk SSSR, 269 (1983), 543-547. 

[31]

Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, Applied Optimization, 87. Kluwer Academic Publishers, Boston, MA, 2004. doi: 10.1007/978-1-4419-8853-9.

[32]

Z. Opial, Weak convergence of the sequence of successive approximations for nonexpansive mappings, Bull. Amer. Math. Soc., 73 (1967), 591-597.  doi: 10.1090/S0002-9904-1967-11761-0.

[33]

B. T. Polyak, Introduction to Optimization, Optimization Software, Inc., Publications Division, New York, 1987.

[34]

M. Schmidt, N. Le Roux and F. Bach, Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization, NIPS'11 - 25 th Annual Conference on Neural Information Processing Systems, Dec 2011, Grenada, Spain, HAL inria-00618152v3.

[35]

B. Shi, S. S. Du, M. I. Jordan and W. J. Su, Understanding the acceleration phenomenon via high-resolution differential equations, Math. Program., (2021). doi: 10.1007/s10107-021-01681-8.

[36]

W. Su, S. Boyd and E. J. Candès, A differential equation for modeling nesterov's accelerated gradient method: Theory and insights, J. Mach. Learn. Res., 27 (2016), Paper No. 153, 43 pp.

[37]

S. VillaS. SalzoL. Baldassarres and A. Verri, Accelerated and inexact forward-backward, SIAM J. Optim., 23 (2013), 1607-1633.  doi: 10.1137/110844805.

Figure 1.  Example on a smooth function: Evolution of the objective error and distance to the minimizer as a function of $ t $ for different error decay exponents
Figure 2.  Example on a non-smooth function: Evolution of the objective error and distance to the minimizer as a function of $ t $ for different error decay exponents
[1]

Hedy Attouch, Aïcha Balhag, Zaki Chbani, Hassan Riahi. Fast convex optimization via inertial dynamics combining viscous and Hessian-driven damping with time rescaling. Evolution Equations and Control Theory, 2022, 11 (2) : 487-514. doi: 10.3934/eect.2021010

[2]

Hedy Attouch, Alexandre Cabot, Zaki Chbani, Hassan Riahi. Rate of convergence of inertial gradient dynamics with time-dependent viscous damping coefficient. Evolution Equations and Control Theory, 2018, 7 (3) : 353-371. doi: 10.3934/eect.2018018

[3]

Wilhelm Schlag. Regularity and convergence rates for the Lyapunov exponents of linear cocycles. Journal of Modern Dynamics, 2013, 7 (4) : 619-637. doi: 10.3934/jmd.2013.7.619

[4]

Ruiying Wei, Yin Li, Zheng-an Yao. Global existence and convergence rates of solutions for the compressible Euler equations with damping. Discrete and Continuous Dynamical Systems - B, 2020, 25 (8) : 2949-2967. doi: 10.3934/dcdsb.2020047

[5]

Mina Jiang, Changjiang Zhu. Convergence rates to nonlinear diffusion waves for $p$-system with nonlinear damping on quadrant. Discrete and Continuous Dynamical Systems, 2009, 23 (3) : 887-918. doi: 10.3934/dcds.2009.23.887

[6]

Yazheng Dang, Jie Sun, Honglei Xu. Inertial accelerated algorithms for solving a split feasibility problem. Journal of Industrial and Management Optimization, 2017, 13 (3) : 1383-1394. doi: 10.3934/jimo.2016078

[7]

Florian Dorsch, Hermann Schulz-Baldes. Random Möbius dynamics on the unit disc and perturbation theory for Lyapunov exponents. Discrete and Continuous Dynamical Systems - B, 2022, 27 (2) : 945-976. doi: 10.3934/dcdsb.2021076

[8]

Ronan Costaouec, Haoyun Feng, Jesús Izaguirre, Eric Darve. Analysis of the accelerated weighted ensemble methodology. Conference Publications, 2013, 2013 (special) : 171-181. doi: 10.3934/proc.2013.2013.171

[9]

Sebastian J. Schreiber. Expansion rates and Lyapunov exponents. Discrete and Continuous Dynamical Systems, 1997, 3 (3) : 433-438. doi: 10.3934/dcds.1997.3.433

[10]

Yazheng Dang, Fanwen Meng, Jie Sun. Convergence analysis of a parallel projection algorithm for solving convex feasibility problems. Numerical Algebra, Control and Optimization, 2016, 6 (4) : 505-519. doi: 10.3934/naco.2016023

[11]

James Broda, Alexander Grigo, Nikola P. Petrov. Convergence rates for semistochastic processes. Discrete and Continuous Dynamical Systems - B, 2019, 24 (1) : 109-125. doi: 10.3934/dcdsb.2019001

[12]

Zehui Jia, Xue Gao, Xingju Cai, Deren Han. The convergence rate analysis of the symmetric ADMM for the nonconvex separable optimization problems. Journal of Industrial and Management Optimization, 2021, 17 (4) : 1943-1971. doi: 10.3934/jimo.2020053

[13]

Jean-Paul Chehab, Georges Sadaka. On damping rates of dissipative KdV equations. Discrete and Continuous Dynamical Systems - S, 2013, 6 (6) : 1487-1506. doi: 10.3934/dcdss.2013.6.1487

[14]

P. Fabrie, C. Galusinski, A. Miranville. Uniform inertial sets for damped wave equations. Discrete and Continuous Dynamical Systems, 2000, 6 (2) : 393-418. doi: 10.3934/dcds.2000.6.393

[15]

Xueyong Wang, Yiju Wang, Gang Wang. An accelerated augmented Lagrangian method for multi-criteria optimization problem. Journal of Industrial and Management Optimization, 2020, 16 (1) : 1-9. doi: 10.3934/jimo.2018136

[16]

C.Y. Wang, M.X. Li. Convergence property of the Fletcher-Reeves conjugate gradient method with errors. Journal of Industrial and Management Optimization, 2005, 1 (2) : 193-200. doi: 10.3934/jimo.2005.1.193

[17]

Sanming Liu, Zhijie Wang, Chongyang Liu. On convergence analysis of dual proximal-gradient methods with approximate gradient for a class of nonsmooth convex minimization problems. Journal of Industrial and Management Optimization, 2016, 12 (1) : 389-402. doi: 10.3934/jimo.2016.12.389

[18]

Murat Adivar, Shu-Cherng Fang. Convex optimization on mixed domains. Journal of Industrial and Management Optimization, 2012, 8 (1) : 189-227. doi: 10.3934/jimo.2012.8.189

[19]

Luis Barreira, Claudia Valls. Quadratic Lyapunov sequences and arbitrary growth rates. Discrete and Continuous Dynamical Systems, 2010, 26 (1) : 63-74. doi: 10.3934/dcds.2010.26.63

[20]

Moez Daoulatli. Rates of decay for the wave systems with time dependent damping. Discrete and Continuous Dynamical Systems, 2011, 31 (2) : 407-443. doi: 10.3934/dcds.2011.31.407

2021 Impact Factor: 1.169

Metrics

  • PDF downloads (145)
  • HTML views (64)
  • Cited by (0)

[Back to Top]