AN ADAPTIVELY REGULARIZED SEQUENTIAL QUADRATIC PROGRAMMING METHOD FOR EQUALITY CONSTRAINED OPTIMIZATION

Abstract. In this paper, we devise an adaptively regularized SQP method for equality constrained optimization that is resilient to constraint degeneracy, with a relatively small departure from the classical SQP method. The main feature of our method is an adaptive choice of the regularization parameter, embedded in a trust-funnel-like algorithmic scheme. Unlike general regularized methods, which update the regularization parameter only after a regularized problem has been approximately solved, our method updates the regularization parameter at each iteration according to the infeasibility measure and the improvements promised by the trial step. The sequence of regularization parameters is not necessarily monotonically decreasing. The whole algorithm is globalized by a trust-funnel-like strategy, in which neither a penalty function nor a filter is needed. We present global and fast local convergence under weak assumptions. Preliminary numerical results on a collection of degenerate problems are reported, which are encouraging.

1. Introduction. We are concerned in this paper with the solution of the equality constrained optimization problem

\[ \min_{x\in\mathbb{R}^n} f(x) \quad \text{s.t.}\quad c(x) = 0, \tag{1} \]

where x is a vector of dimension n and f(x): R^n → R, c(x): R^n → R^m are assumed to be smooth. The Lagrangian for (1) is defined as L(x, y) = f(x) + y^T c(x). The KKT conditions for (1) are

\[ \nabla f(x) + \nabla c(x)\,y = 0, \qquad c(x) = 0. \tag{2} \]

Sequential quadratic programming (SQP) methods [4,25] are among the most successful methods for the solution of (1). They compute steps via a sequence of quadratic programs (QPs):

\[ \min_d\ \nabla f_k^T d + \tfrac12 d^T H_k d \quad \text{s.t.}\quad c_k + \nabla c_k^T d = 0, \]

where ∇f_k := ∇f(x_k), c_k := c(x_k), ∇c_k := ∇c(x_k) and H_k is the Hessian (or an approximation of the Hessian) of L(x_k, y_k). SQP methods for (1) may be interpreted as applying Newton's method to (2). A Newton-like step (d_k, y_k^+) for (2) from (x_k, y_k) solves the linear system

\[ \begin{pmatrix} H_k & \nabla c_k \\ \nabla c_k^T & 0 \end{pmatrix} \begin{pmatrix} d_k \\ y_k^+ \end{pmatrix} = -\begin{pmatrix} \nabla f_k \\ c_k \end{pmatrix}. \tag{3} \]

As is common for most SQP methods, it should be ensured that H_k is positive definite on the null space of ∇c_k^T. If ∇c_k is not of full rank, the coefficient matrix is singular, and a solution of (3) might not exist. Therefore, stabilized SQP (sSQP) or regularization techniques are necessary. sSQP methods [24,29,23,56,33,30,14] are specifically designed to improve the numerical and theoretical behavior on degenerate problems. Given the current primal-dual iterate (x_k, y_k) and the stabilization parameter v_k > 0, sSQP for problem (1) generates a direction of change via the following QP subproblem [33]:

\[ \min_{(d,y)}\ \nabla f_k^T d + \tfrac12 d^T H_k d + \tfrac{v_k}{2}\|y\|^2 \quad \text{s.t.}\quad c_k + \nabla c_k^T d - v_k y = 0. \tag{4} \]

Problem (4) has the advantage of always being consistent. Most existing sSQP methods focus on the local properties of the methods, and few globalizations of sSQP have been proposed so far. In [24] and [23], sSQP is globalized by combining it with a certain primal-dual augmented Lagrangian algorithm. In [14], sSQP is combined with the inexact restoration method. In [30], sSQP is combined with the usual augmented Lagrangian algorithm when dealing with problems having inequality constraints.
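To fix ideas, here is a minimal numerical sketch (our own illustration, not code from the paper) that forms and solves the Newton-like system (3); the toy problem min x_1 + x_2 s.t. x_1^2 + x_2^2 = 2, the identity Hessian and all names are assumptions made for illustration.

```python
import numpy as np

# Minimal sketch (our illustration) of one Newton-like SQP step: form and
# solve system (3) for the toy problem min x1 + x2 s.t. x1^2 + x2^2 = 2.
def sqp_step(grad_f, jac_c, c_val, H):
    n, m = jac_c.shape
    K = np.block([[H, jac_c], [jac_c.T, np.zeros((m, m))]])
    sol = np.linalg.solve(K, -np.concatenate([grad_f, c_val]))
    return sol[:n], sol[n:]                      # step d_k, multiplier y_k^+

x = np.array([2.0, 0.5])
d, y_plus = sqp_step(np.array([1.0, 1.0]),       # grad f
                     2.0 * x.reshape(2, 1),      # grad c (full rank here)
                     np.array([x @ x - 2.0]),    # c(x)
                     np.eye(2))                  # H_k approximated by I
print(d, y_plus)
```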
In parallel with the development of stabilized SQP methods, regularized methods have been proposed to solve an ill-posed problem by constructing a related problem whose solution is unique and differs only slightly from a solution of the original problem (see, e.g., [22,45,35,20,25]). One of the regularization schemes is to modify the coefficient matrix to

\[ K_k(\delta_k, v_k) := \begin{pmatrix} H_k + \delta_k I & \nabla c_k \\ \nabla c_k^T & -v_k I \end{pmatrix} \]

for some regularization parameters δ_k > 0 and v_k > 0. Consequently, the search direction is obtained by solving

\[ K_k(\delta_k, v_k)\begin{pmatrix} d_k \\ y_k^+ \end{pmatrix} = -\begin{pmatrix} \nabla f_k \\ c_k \end{pmatrix}. \tag{5} \]

The connection between these two schemes is easy to verify: the KKT system for (4) can be written in the form

\[ \begin{pmatrix} H_k & \nabla c_k \\ \nabla c_k^T & -v_k I \end{pmatrix}\begin{pmatrix} d \\ y \end{pmatrix} = -\begin{pmatrix} \nabla f_k \\ c_k \end{pmatrix}, \]

which coincides with (5) in the case where δ_k = 0. Convergence from arbitrary starting points is typically enforced by penalty or penalty-free techniques. Both classes of methods carefully keep a balance between reducing the objective function and satisfying the constraints. Penalty methods keep this balance with a merit function, which defines some compromise between reducing the objective function and satisfying the constraints. Penalty-free methods usually keep this balance by using a filter (see, e.g., [18,17,16,55,49]), trust-funnel-like techniques [27,42,43] or other techniques [48,34,3,57].
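The effect of the regularization is easy to see numerically. The following sketch (our illustration; the matrix data and the parameter values δ_k, v_k are arbitrary assumptions) builds a rank-deficient Jacobian for which the plain SQP matrix of (3) is singular while the regularized matrix of (5) is nonsingular.

```python
import numpy as np

# Sketch (our illustration, arbitrary data): with a rank-deficient Jacobian
# the plain SQP matrix of (3) is singular, while the regularized matrix of
# (5) is nonsingular for any v > 0 here.
n, m = 3, 2
H = np.eye(n)
A = np.array([[1.0, 2.0],
              [1.0, 2.0],
              [0.0, 0.0]])                        # rank(A) = 1 < m
K_plain = np.block([[H, A], [A.T, np.zeros((m, m))]])
print(np.linalg.matrix_rank(K_plain))             # 4 < n + m = 5: singular

delta, v = 0.0, 1e-2                              # illustrative parameters
K_reg = np.block([[H + delta * np.eye(n), A], [A.T, -v * np.eye(m)]])
print(np.linalg.matrix_rank(K_reg))               # 5: system (5) is solvable
```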
In this paper, we propose an adaptively regularized SQP method for (1). The presented method embeds the regularization in a trust-funnel-like globalization scheme. The trust-funnel method was proposed by Gould and Toint [27]. It has the nice feature that it uses neither a penalty function nor a filter. Instead, it is globalized by the trust funnel, which is defined as a progressively decreasing limit on the allowed infeasibility of the iterates. Predicting the behavior of the algorithm is important in finding an acceptable step: if the objective function is likely to be reduced, then a step is required to provide sufficient reduction of the objective function; otherwise, significant improvement in feasibility is expected.
In our algorithm, the trust-funnel-like framework is also used in setting the regularization parameters. At each iteration, a regularization parameter is determined during the course of finding the search direction. If the direction d_k does not promise sufficient reduction of the objective function (so that the algorithm focuses on reducing infeasibility), the magnitude of the regularization term v_k y_k^+ should be a fraction of the current infeasibility measure. On the other hand, if d_k provides sufficient reduction of the linearized objective function, then v_k y_k^+ should be a fraction of the limit on permitted infeasibility.
Note that the regularization parameter is determined iteratively and may require repeated subproblem solves per step computation. Repeated subproblem solving is common in nonlinear programming solvers, for example in trust region methods [8].
In addition, as we will show in Lemma 3.11, a good estimate of v_k, which leads to a step satisfying the above requirements, can be made if the linearized constraints are consistent; this implies that repeated subproblem solving can be avoided in this case.
We use ‖·‖ to denote the Euclidean norm ‖·‖_2. The subscript k refers to iteration indices and the superscript (i) denotes the ith component of a vector. We define the inertia of a symmetric matrix P, denoted by In(P), to be the scalar triple that indicates the numbers n_+, n_− and n_0 of positive, negative, and zero eigenvalues, respectively, that is, In(P) = (n_+, n_−, n_0). For a symmetric matrix G, we use λ_min^G and λ_max^G to denote the smallest and largest eigenvalues of G. Similarly, we use σ_max^A to denote the largest singular value of a matrix A.
The rest of this paper is organized as follows. In Section 2, we present the algorithm. In Section 3, we discuss its global convergence properties. In Section 4, a modified algorithm, which is fast locally convergent under certain assumptions, is given. In Section 5, we give some details of numerical implementation and report preliminary numerical results on DEGEN [10], a set of degenerate test problems. Finally, in Section 6, we give some comments on further work.
2.1. Regularization and computing steps. In short, our algorithm is a trust-funnel-like SQP method with an adaptive regularization scheme. We use a variant of the regularized linear system (5), namely

\[ K_k(\delta_k, v_k)\begin{pmatrix} d_k \\ y_k^+ \end{pmatrix} = -\begin{pmatrix} \nabla f_k \\ c_k \end{pmatrix}, \tag{6} \]

in which the primal regularization parameter δ_k is chosen so that In(K_k(δ_k, v_k)) = (n, m, 0). At each iteration, the regularization parameter v_k is initialized as

\[ v_k^{\rm in} = \min\{\bar v, \max\{\underline v, \gamma_v h_k\}\}, \tag{7} \]

where v̄ > v̲ > 0, γ_v > 0 and h_k := ‖c_k‖ acts as a measure of the infeasibility. The trust-funnel-like algorithm sets an upper limit h_k^max on the permitted infeasibility of the kth iterate, i.e., there always holds h_k ≤ h_k^max. Suppose that (d_k, y_k^+) solves (6). If d_k promises sufficient reduction of the linearized objective function in the sense of condition (8), with constants γ_1, σ_1 > 0, then the parameter v_k is required to satisfy

\[ \|v_k y_k^+\| \le \gamma_2 h_k^{\max}, \tag{9} \]

where γ_2 ∈ (0,1). If (8) is not satisfied, then v_k should satisfy

\[ \|v_k y_k^+\| \le \gamma_2 h_k. \tag{10} \]

Conditions (9) and (10) ensure that the requirements on feasibility can be met via a line search along d_k. However, a desired regularization parameter is not necessarily available when the current feasibility is bad. To prevent the algorithm from reducing v_k infinitely many times or accepting a too small regularization parameter, we set a smallest value for v_k. In practice, we set

\[ v_k^{\min} = \min\{v_{\min}, \gamma_v^{\min} h_k^{\sigma_v}\}, \tag{11} \]

where v_min ∈ (0,1), γ_v^min > 0 and σ_v > 1. If v_k becomes smaller than v_k^min, the algorithm resorts to a feasibility restoration algorithm, which tries to get closer to the feasible region. We describe the details of the restoration phase in Section 2.3.
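The adaptive choice of v_k can be summarized by the following sketch (a simplified illustration, not the paper's exact procedure, and formalized in Algorithm 1 below); the stand-in test for condition (8) and all constants and toy data are our assumptions.

```python
import numpy as np

# Simplified sketch of the adaptive choice of v_k (Algorithm 1 in spirit).
# The concrete test standing in for condition (8), the constants and the
# toy data below are our assumptions, not the paper's exact choices.
def choose_v(solve_reg_system, grad_f, h_k, h_max, v_init, v_min,
             gamma_2=0.5, shrink=0.1):
    v = v_init
    while v >= v_min:
        d, y_plus = solve_reg_system(v)            # solve system (6)
        suff = -grad_f @ d >= 1e-8 * np.linalg.norm(d)   # stand-in for (8)
        if suff and v * np.linalg.norm(y_plus) <= gamma_2 * h_max:
            return v, d, 1                         # (8)+(9): f-type line search
        if not suff and v * np.linalg.norm(y_plus) <= gamma_2 * h_k:
            return v, d, 2                         # (10): h-type line search
        v *= shrink                                # otherwise reduce v, re-solve
    return None                                    # v < v_k^min: restoration

# toy usage: H_k = I and a single constraint gradient
H, A = np.eye(2), np.array([[4.0], [1.0]])
g, cv = np.array([1.0, 1.0]), np.array([2.25])
def solve(v):
    s = np.linalg.solve(np.block([[H, A], [A.T, -v * np.eye(1)]]),
                        -np.concatenate([g, cv]))
    return s[:2], s[2:]
print(choose_v(solve, g, h_k=2.25, h_max=22.5, v_init=0.1, v_min=1e-8))
```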
Details of updating v k and computing steps are given in Algorithm 1.

Algorithm 1 Regularization and Computing steps
Set v_k = v_k^in, where v_k^in is defined in (7). Compute the (approximate) Hessian H_k.
repeat
    Solve the linear system (6) to obtain (d_k, y_k^+).
    if (d_k, y_k^+) satisfies (8) and (9) then
        Set LS_type = 1 and f_v = 0;
    else if (d_k, y_k^+) does not satisfy (8), but satisfies (10) then
        Set LS_type = 2 and f_v = 0;
    else
        Reduce v_k.
        if v_k < v_k^min then
            Update h_k^max via (17). Enter the restoration phase (Algorithm 2).
        end if
    end if
until f_v = 0

2.2. A line search trust-funnel-like framework. Our algorithm allows some flexibility in the line search procedure. In the case where (8) holds, the step d_k predicts sufficient reduction of the linearized objective function. Hence, sufficient actual reduction of f(x) along this step is expected. For this purpose, a classical Armijo line search is used to find a step size α_k which satisfies

\[ f(x_k + \alpha_k d_k) \le f_k + \rho\,\alpha_k \nabla f_k^T d_k, \tag{12} \]

where ρ ∈ (0,1). In this case, we require the infeasibility measure of the next iterate not to exceed h_k^max, i.e.,

\[ h(x_k + \alpha_k d_k) \le h_k^{\max}. \tag{13} \]

Iterations in this case are called f-iterations. If (8) does not hold, which always implies that the current point is far from the feasible region, then the algorithm aims to find a step size α_k satisfying condition (14), in which κ_h and κ̄_h are constants in (0,1).

Sometimes, the line search procedure may fail to find an acceptable step size, or may find a step which is too small to produce sufficient departure from the current point. In these cases, we quit the algorithm temporarily and enter a restoration phase in the hope of finding a new point with better properties by reducing infeasibility. In practice, we set a lower bound α_k^min on the step size, defined by (16), where ᾱ is a small positive constant, γ_α ∈ (0, 1/2), σ_α > 1, σ_β > 0 and γ_p is a small positive constant used to keep the denominators in (16) bounded away from 0. If α_k becomes smaller than α_k^min, then the algorithm turns to the restoration phase.

2.3. Feasibility restoration. A restoration phase is necessary for filter methods, and this idea applies to our method too. In the restoration phase, we first reduce h^max via

\[ h_k^{\max} \leftarrow \kappa_r h_k^{\max}, \tag{17} \]

with κ_r a constant in (0,1). Then, we try to find an iterate x_k^r satisfying h(x_k^r) ≤ h_k^max by iteratively solving the problem

\[ \min_x\ \tfrac12\|c(x)\|^2. \tag{18} \]

Given the current iterate x_j in a restoration course, a direction d_j is obtained by solving a Levenberg-Marquardt subproblem [32,38]

\[ \min_d\ \tfrac12\|c_j + \nabla c_j^T d\|^2 + \tfrac12 h_j^{\sigma_r}\|d\|^2, \tag{19} \]

or the equivalent augmented linear system

\[ \begin{pmatrix} h_j^{\sigma_r} I & \nabla c_j \\ \nabla c_j^T & -I \end{pmatrix}\begin{pmatrix} d \\ r \end{pmatrix} = -\begin{pmatrix} 0 \\ c_j \end{pmatrix}, \tag{20} \]

where σ_r ∈ [1,2]. An Armijo line search is used to compute α_j = 2^{−l_j}, where l_j is the smallest nonnegative integer satisfying the Armijo decrease condition (21) for the merit function ½‖c(x)‖². Note that subproblem (19) is a convex problem, hence can be easily solved. Moreover, nice convergence properties of Levenberg-Marquardt methods for degenerate problems have been established by researchers; see, for instance, [59,9,60,13].

Algorithm 2 Restoration Algorithm
Initialization. Given the iterate x_k at which the restoration phase is called and the maximal infeasibility h_k^max. Set j := 0. Let x_j = x_k, c_j = c_k and ∇c_j = ∇c_k.
repeat
    Solve (19) or (20) to obtain the search direction d_j.
    Compute the step size α_j = 2^{−l_j} by the Armijo line search.
    Set x_{j+1} = x_j + α_j d_j and j := j + 1.
until h(x_j) ≤ h_k^max
Return x_k^r = x_j.
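A minimal sketch of one restoration iteration is given below; the choice σ_r = 1.5, the Armijo constant and the toy constraint are illustrative assumptions of ours.

```python
import numpy as np

# Sketch of one restoration iteration: a Levenberg-Marquardt step on
# min 0.5*||c(x)||^2 (problem (18)) plus an Armijo backtracking search.
# sigma_r = 1.5 and the Armijo constant rho are illustrative assumptions.
def restoration_step(c, jac_c, x, sigma_r=1.5, rho=1e-4, max_backtracks=30):
    cj, Aj = c(x), jac_c(x)                       # c_j and its Jacobian (n x m)
    h = np.linalg.norm(cj)
    mu = h ** sigma_r                             # LM regularization weight
    # normal equations of (19), i.e., the eliminated form of system (20)
    d = np.linalg.solve(Aj @ Aj.T + mu * np.eye(len(x)), -Aj @ cj)
    slope = (Aj @ cj) @ d                         # derivative of 0.5*||c||^2
    alpha = 1.0
    for _ in range(max_backtracks):               # Armijo on 0.5*||c||^2
        if 0.5 * np.linalg.norm(c(x + alpha * d)) ** 2 \
                <= 0.5 * h ** 2 + rho * alpha * slope:
            return x + alpha * d
        alpha *= 0.5
    return x                                      # failure: keep the point

# toy usage for c(x) = [x1^2 + x2^2 - 2], starting far from feasibility
print(restoration_step(lambda x: np.array([x @ x - 2.0]),
                       lambda x: 2.0 * x.reshape(-1, 1),
                       np.array([3.0, 0.0])))
```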
2.4. The whole algorithm. Now we are ready to give a detailed description of the whole algorithm; see Algorithm 3. A heavily simplified sketch of the overall loop is given below, before the formal statement.
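The following compact driver is our own heavily simplified sketch of Algorithm 3 (a fixed v, a stand-in for condition (8), and crude funnel and acceptance rules; all constants are assumptions), not a faithful implementation.

```python
import numpy as np

# Heavily simplified sketch of the main loop: regularized step (6), an
# f-/h-iteration split, Armijo-type acceptance and a crude funnel update.
def funnel_sqp(f, grad_f, c, jac_c, x, v=1e-4, max_iter=50, tol=1e-8):
    h = lambda z: np.linalg.norm(c(z))
    h_max = max(1.0, 10.0 * h(x))                    # initial funnel width
    for _ in range(max_iter):
        g, A, cv = grad_f(x), jac_c(x), c(x)
        n, m = A.shape
        K = np.block([[np.eye(n), A], [A.T, -v * np.eye(m)]])  # (6), H_k = I
        d = np.linalg.solve(K, -np.concatenate([g, cv]))[:n]
        if np.linalg.norm(d) + h(x) <= tol:
            break
        f_type = -g @ d >= 1e-8 * np.linalg.norm(d)  # stand-in for (8)
        alpha = 1.0
        while alpha > 1e-12:
            xt = x + alpha * d
            if f_type and f(xt) <= f(x) + 1e-4 * alpha * (g @ d) \
                    and h(xt) <= h_max:
                break                                # f-iteration accepted
            if not f_type and h(xt) <= (1.0 - 1e-4 * alpha) * h(x):
                h_max = 0.5 * (h(x) + h(xt))         # shrink the funnel
                break                                # h-iteration accepted
            alpha *= 0.5
        x = x + alpha * d
    return x

# toy run: min x1 + x2  s.t.  x1^2 + x2^2 = 2, solution (-1, -1)
print(funnel_sqp(lambda x: x[0] + x[1], lambda x: np.array([1.0, 1.0]),
                 lambda x: np.array([x @ x - 2.0]),
                 lambda x: 2.0 * x.reshape(2, 1), np.array([2.0, 0.5])))
```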

Algorithm 3 Main Algorithm
Choose the algorithm parameters. Initialize the primal-dual pair (x_0, y_0). Let h_0^max be the initial bound on infeasibility and set k := 0.
repeat
    Get the step d_k and the line search type LS_type using Algorithm 1.
    if LS_type = 1 then
        Use (16) to compute α_k^min. Let α_k be a step size in (α_k^min, 1] which satisfies (12) and (13).
    else
        Let α_k be the largest scalar in (α_k^min, 1] which satisfies (14).
    end if
    If the line search procedure above fails, quit the current iteration, update h_k^max via (17) and enter the restoration phase (Algorithm 2).
    Set x_{k+1} = x_k + α_k d_k and k := k + 1.
until ‖d_k‖ + h_k ≤ ε for a given stopping tolerance ε > 0

3. Global convergence. At the beginning of the discussion on global convergence, we consider the possible outputs of the restoration phase. By the convergence results of [59,60,13], the restoration phase, which applies Levenberg-Marquardt methods to (18), may terminate with one of two situations: (a) finding a point x_k^r that approximately solves (18) in the sense of satisfying h(x_k^r) ≤ h_k^max; or (b) failing to do so because the iterates approach an infeasible stationary point of (18). In the second case, the restoration algorithm fails to get close enough to the feasible region. Hence, the whole algorithm cannot go any further. As Fletcher and Leyffer [18] pointed out, this situation always indicates that the problem is locally infeasible. Therefore, if case (b) happens, then the algorithm stops and reports local infeasibility.
From now on, we assume that the restoration phase always stops with case (a). To continue the argument on global convergence, we need the following assumptions.
is consistent for any x_k ∈ Ñ; (A4.b) (Relaxed Constant Rank Constraint Qualification (RCRCQ)) [37] the Jacobian ∇c_k has the same rank for all x_k ∈ Ñ; and (A4.c) (Second Order Sufficient Condition (SOSC)) there is a constant η_1 > 0 such that d^T H_k d ≥ η_1‖d‖² for all d ∈ Null(∇c_k^T) and all x_k ∈ Ñ. Of all the assumptions, (A1)-(A3) are standard for nonlinear constrained optimization. Assumptions (A4.a) and (A4.b) are weaker than the linear independence constraint qualification (LICQ), which requires that ∇c_k have full rank. Assumption (A4.c) is a usual assumption for line search algorithms. Hence, these assumptions are reasonable and weaker than standard assumptions.

3.1. Preliminary results.
For the sake of simplifying notation, we use M to denote a uniform upper bound on |f_k|, ‖c_k‖, ‖∇f_k‖, ‖∇c_k‖ and ‖H_k‖. Without loss of generality, we assume M > 1.
Firstly, we consider the inertia of K_k(δ_k, v_k) under the SOSC. We begin by recalling a result which is a variant of [29, Lemma 3].
Lemma 3.2. Suppose that Assumption 3.1 holds. Then there exist a positive constant (without loss of generality, η_1), a neighborhood Ñ of x̄ and a positive v̄ such that

\[ H_k + \tfrac{1}{v_k}\nabla c_k \nabla c_k^T \succeq \eta_1 I \quad \text{for all } x_k \in \tilde N \text{ and } v_k \in (0, \bar v). \]

Next, we show the relation between the SOSC and the inertia of K_k(δ_k, v_k).
Theorem 3.3. For the symmetric matrix H_k, the following two statements are equivalent: (a) Assumption (A4.c) holds; (b) there exists a positive constant which, with a slight abuse of notation, is also denoted by v̄, such that In(K_k(0, v_k)) = (n, m, 0) holds for all x_k ∈ Ñ and v_k ∈ (0, v̄).
Proof. For x_k ∈ Ñ, suppose that rank(∇c_k) = r. We write the singular value decomposition (SVD)

\[ \nabla c_k^T = \begin{pmatrix} U_k & \hat U_k \end{pmatrix} \begin{pmatrix} \Sigma_k & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} V_k & \hat V_k \end{pmatrix}^T, \tag{23} \]

where Σ_k is diagonal with positive diagonal elements σ_k^{(1)}, ..., σ_k^{(r)}, and (U_k, Û_k) and (V_k, V̂_k) are orthogonal. Note, in particular, that the columns of V̂_k constitute an orthonormal basis for the null space of ∇c_k^T. Let the transformed quantities be defined as in (24). By (23), substituting these equalities into (24) yields (25). Pre-multiplying and post-multiplying by the orthogonal factors gives (26) and (27). Since (V_k, V̂_k) is orthonormal and by Lemma 3.2, the symmetric matrix K̃_k appearing in (27) is positive definite. Then it follows from the decomposition (27) that In(K̄_k) = (n, r, 0), and then from (26) that In(K_k(0, v_k)) = (n, m, 0). Hence, we have proved that (a) ⇒ (b). Next, we prove the reverse implication. Suppose that In(K_k(0, v_k)) = (n, m, 0). Using the decompositions (26) and (27), we conclude that the matrix K̃_k is positive definite. Choose v̄ > 0 small enough such that the (1,1) block of this matrix is positive definite for all v_k ∈ (0, v̄). Note that the positive definiteness of K̃_k implies that of its principal blocks. Therefore, the matrix V̂_k^T H_k V̂_k must be positive definite, which is equivalent to (a). The proof is complete.
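Theorem 3.3 is easy to check numerically. The sketch below (our illustration; the dimensions, the rank-1 Jacobian and H_k = I are assumptions under which the SOSC-type condition holds trivially) computes the inertia of K_k(0, v_k) by counting eigenvalue signs.

```python
import numpy as np

# Numeric check of Theorem 3.3 (our illustration): H_k = I is positive
# definite on Null(grad c^T), the Jacobian has rank 1 < m, and the inertia
# of K_k(0, v_k) still comes out as (n, m, 0).
n, m, v = 4, 3, 1e-3
rng = np.random.default_rng(0)
A = rng.standard_normal((n, 1)) @ rng.standard_normal((1, m))  # rank 1
K = np.block([[np.eye(n), A], [A.T, -v * np.eye(m)]])
eig = np.linalg.eigvalsh(K)
print(int(np.sum(eig > 0)), int(np.sum(eig < 0)), int(np.sum(eig == 0)))
# expect: 4 3 0, i.e., In(K) = (n, m, 0)
```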
This theorem indicates that when x_k is close enough to a feasible point, the primal regularization parameter δ_k can be taken as 0. When the point is far from the feasible region, a proper δ_k can be found because of the following lemma. Without loss of generality, we assume δ_k = 0 from now on. For convenience, we abbreviate K_k(0, v_k) as K_k. Next, we show the boundedness of y_k^+ near a feasible point.
Lemma 3.5. Suppose that Assumption 3.1 holds. Let x̄ be a feasible point. Then there is a neighborhood N of x̄ such that, reducing v̄ if necessary, the set {y_k^+ : x_k ∈ N, v_k ∈ (0, v̄)} is bounded.

Proof. To analyze the step (d_k, y_k^+), we decompose it according to the SVD (23) as in (29). By substituting (29) into (28), pre-multiplying the blocks of this system by Û_k^T, V̂_k^T, U_k^T and V_k^T, and using (25), we obtain the system (30). Choose a proper neighborhood N of x̄ such that the singular value σ_k^{(r)} ≥ σ̄ > 0 for some fixed σ̄ > 0 and such that w^T V̂_k^T H_k V̂_k w ≥ η_1‖w‖² for all w ∈ R^{n−r} and all x_k ∈ N. By the fourth row of (30), we have z_k^{V̂} = 0 and the reduced system (31). By arguments similar to [56, Section 3.1] and possibly shrinking N, we have that, reducing v̄ if necessary, the solution of (31) is bounded for all x_k ∈ N and v_k ∈ (0, v̄). In particular, there is a positive M_y such that ‖y_k^+‖ ≤ M_y in this case. (32)
By compactness, we have the following corollary.

Corollary 3.6. Suppose that Assumption 3.1 holds. Then the sequence {y_k^+} is bounded.
Next, we provide some preliminary properties of the matrix K_k.

Lemma 3.7. Let x̄ and Ñ be a feasible accumulation point of {x_k} and its neighborhood that satisfy (A4). Let v̄ be the constant given in Theorem 3.3. Then for all x_k ∈ Ñ and v_k ∈ (0, v̄], the inequalities (33) hold.

Proof. Let ξ ≠ 0 be an eigenvector of H_k associated with λ_min^{H_k}. The first inequality of (33) follows from a direct computation. The second inequality of (33) follows immediately from the first inequality and λ_min^{H_k} ≤ λ_max^{H_k}.

Lemma 3.7 is useful for the following result, which is inspired by Rusten and Winther.

Theorem 3.8. The eigenvalues of K_k satisfy the bounds (34)-(37). Moreover, the smallest positive eigenvalue λ_1 is bounded away from 0 for all x_k ∈ Ñ and 0 < v_k ≤ v̄; i.e., there is a fixed λ̄ such that (38) holds.

Proof. If (ξ, η) ≠ 0 is an eigenvector of K_k associated with the eigenvalue λ, then (39) and (40) hold. Note that λ = −v_k < 0 is an eigenvalue of K_k if and only if ∇c_k does not have full column rank, and in this case, its associated eigenspace is {0} × Null(∇c_k). If λ ≠ −v_k, by (40) we have η = (λ + v_k)^{−1}∇c_k^T ξ. Necessarily, ξ ≠ 0. Substituting into (39) and taking the inner product with ξ yields (41). If −v_k < λ < 0, then by Lemma 3.2, the right-hand side of (41) satisfies (42), which contradicts λ < 0. Hence, either λ ≤ −v_k or λ ≥ 0. In particular, λ ≤ −v_k must be true of λ_{−1}, which implies (36).
Upon simplifying and substituting z for λ, we see that the resulting quadratic in z takes a non-positive value when evaluated at an eigenvalue λ < −v_k of K_k. In particular, this must be true of λ_{−m}, which implies (34).
If λ > 0, then we deduce from (41) that the corresponding quadratic in z takes a non-positive value when evaluated at an eigenvalue λ > 0 of K_k, which implies (35). Now, we consider (37). For a positive eigenvalue λ, we have, by (41) and Lemma 3.2, that the quadratic in z takes a nonnegative value when evaluated at an eigenvalue λ > 0 of K_k, which implies (44). Note that the right-hand side of (44) is monotonically decreasing in v_k. Then, a positive eigenvalue of K_k satisfies the lower bound obtained from (44). In particular, this must be true of λ_1, which implies (37). This lower bound is positive and bounded away from 0 because the last term in (43) is. Thus, there is a positive λ̄ such that (38) holds.
Two immediate consequences of this theorem are as follows; they provide upper bounds on ‖K_k^{−1}‖ and ‖d_k‖ with respect to v_k.
Corollary 3.9. Let x̄ and Ñ be a feasible accumulation point and its neighborhood that satisfy (A4) and (A5). Then for all x_k ∈ Ñ and v_k ∈ (0, v̄], the bound on ‖K_k^{−1}‖ holds.

Corollary 3.10. Let x̄ and Ñ be a feasible accumulation point and its neighborhood that satisfy (A4) and (A5). Then for all x_k ∈ Ñ and v_k ∈ (0, v̄], the bound on ‖d_k‖ holds; if v_k is sufficiently small, the bound can be sharpened accordingly.

3.2. Well-definedness. By their mechanisms, both Algorithm 1 and Algorithm 3 are well defined, because both algorithms turn to the restoration phase when the inner loops encounter too small parameters (v_k in Algorithm 1 and α_k in Algorithm 3). However, entering the restoration phase near the solution region is undesirable because it leads to feasible points instead of stationary ones. In this subsection, we show that, under Assumption 3.1, this undesirable situation does not happen when the iterate is close enough to the feasible region.
Lemma 3.11. Let x̄ and Ñ be a feasible accumulation point and its neighborhood that satisfy (A4). Then, by choosing v_k small enough, the corresponding step y_k^+ satisfies either (9) or (10).

Proof. It follows directly from Lemma 3.5 and Corollary 3.6 that (9) and (10) hold if v_k M_y ≤ γ_2 h_k, where M_y is the constant given in (32).
Thus, Algorithm 1 will not call the restoration phase near a feasible point.

Lemma 3.12. Let x̄ and Ñ be a feasible accumulation point and its neighborhood that satisfy (A4). Then v_k ≥ v_k^min for all x_k ∈ Ñ, i.e., the restoration algorithm is not called.
Proof. By (11), we have v_k^min = o(h_k). Thus, the lemma follows immediately from (45) and (46).
Next, we consider the finite termination of the Armijo line search in Algorithm 3.
Lemma 3.13. Suppose that Algorithm 3 does not terminate at x_k and that v_k is chosen such that (8) and (9) are satisfied. Then there exists a positive scalar α_k^f such that for any α ∈ (0, α_k^f], both (12) and (13) are satisfied.

Proof. By eliminating y_k^+ in (6), we get

\[ \Big(H_k + \tfrac{1}{v_k}\nabla c_k \nabla c_k^T\Big) d_k = -\nabla f_k - \tfrac{1}{v_k}\nabla c_k c_k. \]

Taking the inner product with d_k and using Lemma 3.2, we obtain η_1‖d_k‖² ≤ −∇f_k^T d_k − (1/v_k) c_k^T ∇c_k^T d_k. Using the standard Taylor's theorem and Lipschitz continuity, we have a bound with Lipschitz constant L_df. Using this inequality, we deduce that (12) holds for any step size α below a first threshold. Next, using the convexity of the squared 2-norm and similar arguments, (13) is satisfied for any step size below a second threshold. To sum up, the statement is true with α_k^f taken as the smaller of the two thresholds. In the other case, there is a similar result.

Lemma 3.14. Suppose that Algorithm 3 does not terminate at x_k and that v_k is chosen such that (10) holds. Define α_k^h accordingly. Then for any α ∈ (0, α_k^h], condition (14) is satisfied.

Lemmas 3.13 and 3.14 indicate that, under Assumption 3.1 and by (16), the line search terminates successfully and the restoration algorithm is not called if the iterate is close enough to the feasible region.

3.3. Global convergence theorem. Now, we are ready to present the global convergence results for the presented algorithm. These results parallel the convergence discussions in [43], with the main difference that the regularization parameter v_k is involved here, which adds some complexity to the proofs. We shall use the following two index sets: K_f, the indices of f-iterations (at which (8) holds), and K_h, the indices of h-iterations (at which (8) fails). The main result we will establish is the following.
Theorem 3.17. Suppose that Algorithm 3 does not terminate finitely, and that Assumption 3.1 holds. Then lim_{k→∞} h_k = 0 and lim_{k→∞} ‖d_k‖ = 0.
We shall prove this theorem through a series of lemmas. First, we show convergence to the feasible region by proving two lemmas.
The last inequality combined with Lemma 3.14 yields (48). It then follows from Corollary 3.10 and (38) that (49) holds, where M is the positive constant defined at the beginning of Subsection 3.1. By Lemma 3.11 and (46), we obtain (50), and (48)-(50) together give the desired bound. By Lemma 3.13, (47), the mechanism of the line search and (8), we obtain (51). By Lemma 3.11 and (45), and increasing k̄ if necessary, we obtain (52). Note that {f_k} is monotonically decreasing except at finitely many iterations. Then, by (51)-(52) and (49)-(50), the algorithm does not call the restoration algorithm for sufficiently large k. Without loss of generality, we assume that (8) does not hold at any k ∈ K_h. Then, eliminating y_k^+ in system (6), we have (53) for all k ∈ K_h. From the second row of (6) and (10), it follows that (54). It is noteworthy that, by the mechanism of Algorithm 1 and (46), v_k^{-1} h_k is bounded. Then, taking limits on both sides of (54) for k ∈ K_h and using Lemma 3.18, we have lim_{k∈K_h} d_k = 0. This completes the proof.

Proof. In this case, condition (8) holds for all k ≥ k̄, where k̄ is the constant defined in the proof of Lemma 3.19. We first prove lim_{k→∞} −∇f_k^T d_k = 0 by contradiction. Suppose that there is an index set K such that (55) holds. Note that f_k − f_{k+1} ≥ 0 for all k ≥ k̄. By (47), (49) and the fact that h_k^max remains unchanged for sufficiently large k, the step size α_j is bounded away from 0 for j ∈ K. Then, by (55), the accumulated reduction in f tends to +∞. However, under Assumptions (A1) and (A2), the difference f_0 − f_{k+1} is bounded, which yields a contradiction. Therefore, lim_{k→∞} −∇f_k^T d_k = 0. Thus, by (53), (32) and Lemma 3.19, we have (56), which implies lim_{k→∞} ‖d_k‖ = 0.
Summarizing the previous results, we have proved Theorem 3.17. Note that Theorem 3.17 implies convergence to KKT points.

Lemma 3.22. Suppose that Assumption 3.1 holds, and that K̄ is an index set satisfying lim_{k∈K̄} h_k = 0 and lim_{k∈K̄} ‖d_k‖ = 0. Suppose also, without loss of generality, that lim_{k∈K̄} x_k = x̄. Then x̄ is a KKT point.
Proof. By continuity, x̄ is a feasible point. Eliminating y_k^+ in system (6) gives

\[ \Big(H_k + \tfrac{1}{v_k}\nabla c_k \nabla c_k^T\Big) d_k = -\nabla f_k - \tfrac{1}{v_k}\nabla c_k c_k. \]

If {v_k} is bounded away from 0, then by taking limits on both sides of this equality we obtain ∇f(x̄) = 0, which indicates that x̄ is a KKT point with 0 as the corresponding multiplier. Otherwise, assume, without loss of generality, that lim_{k∈K̄} v_k = 0. Then, by the mechanism of updating v_k, the index set K_h is infinite. By (10) and the second row of (6), the multipliers y_k^+ = v_k^{-1}(c_k + ∇c_k^T d_k) are bounded in norm by γ_2 v_k^{-1} h_k. Since, by (50) and (52), the sequence v_k^{-1} h_k is bounded, we may assume, taking a subsequence if necessary, that lim_{k∈K̄} c_k/v_k = ỹ. Then, by taking limits in the first row of (6), we have ∇f(x̄) + ∇c(x̄)ỹ = 0. Hence, x̄ is a KKT point.

4. Local convergence. In this section, we modify the algorithm presented above in order to achieve fast local convergence near a solution under certain assumptions.
Algorithm 3 itself is globally convergent, as we have shown. However, in order to satisfy condition (9) or (10), the algorithm may have to choose a small regularization parameter v_k, which can cause slow convergence. Thus, in order to accelerate the algorithm near a solution, we make some modifications to it.
The major idea of the modifications is to embed Algorithm 3 in a fast locally convergent algorithm. We use the following algorithm framework (Algorithm 4), which is a simplification of [1, Algorithm 1] that ignores inequality constraints.
We use the following assumptions. (B1) The sequence (x_k, y_k) generated by Algorithm 4 converges to (x*, y*) for a certain y* ∈ Y.
5. Numerical experience. In this section, we report the performance of Algorithm 3 on the degenerate problems of DEGEN collection [10], compared with the well-known implementations MINOS and LOQO [52,53].

5.1. An adaptively regularized interior point method. Since most of the test problems in DEGEN have inequality constraints, we first modify the new algorithm so that it can be applied to nonlinear optimization problems of the standard form

\[ \min f(x) \quad \text{s.t.}\quad c(x) = 0,\ x \ge 0. \tag{57} \]

Interior point methods are an efficient way of handling inequality constraints. We now demonstrate how the new algorithm presented in Section 2 can be used within an interior point framework. In general, an interior point method approximately solves a sequence of barrier problems

\[ \min\ f(x) - \mu\sum_{i=1}^n \ln x^{(i)} \quad \text{s.t.}\quad c(x) = 0, \tag{58} \]

with a positive barrier parameter µ, which is decreased and converges to 0. An approximate solution (x, y, z) is required to satisfy E_µ(x, y, z) ≤ κµ, where

\[ E_\mu(x, y, z) = \max\{\|\nabla f + \nabla c\, y - z\|_\infty,\ h(x),\ \|Xz - \mu e\|_\infty\} \]

for some fixed κ > 0. The main scheme of the adaptively regularized interior point method for (57) is given in Algorithm 5.
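For concreteness, a sketch of the error measure is given below; note that we read E_µ as the largest of the three residual norms, which is an interpretation on our part (standard for interior point stopping tests).

```python
import numpy as np

# Sketch of the barrier error measure, read as the largest of the three
# residual norms (this "max" interpretation is ours). g = grad f(x),
# A = grad c(x), cv = c(x); x*z plays the role of Xz.
def E_mu(x, z, y, mu, g, A, cv):
    dual = np.linalg.norm(g + A @ y - z, ord=np.inf)
    primal = np.linalg.norm(cv)                     # the measure h(x)
    compl = np.linalg.norm(x * z - mu, ord=np.inf)  # || Xz - mu e ||_inf
    return max(dual, primal, compl)

# the barrier problem (58) is considered solved once E_mu(...) <= kappa * mu
```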
The main computation lies in approximately solving the barrier problem (58). At a given point x_k, a general interior point method obtains a search direction from the solution of the system

\[ \begin{pmatrix} H_k + \Sigma_k & \nabla c_k \\ \nabla c_k^T & 0 \end{pmatrix}\begin{pmatrix} d_k \\ y_k^+ \end{pmatrix} = -\begin{pmatrix} \nabla f_k - \mu X_k^{-1} e \\ c_k \end{pmatrix}, \tag{60} \]

where the matrix Σ_k equals µX_k^{−2} in primal methods and X_k^{−1}Z_k in primal-dual methods, Z_k being the diagonal matrix formed by the elements of the dual variable z_k. Our method adaptively regularizes (60) by replacing the (2,2) block with −v_k I, as in (6). To keep the components of the iterates bounded away from 0, the step size α should satisfy the following fraction-to-the-boundary rule:

\[ x_k + \alpha d_k \ge (1 - \tau)\, x_k, \tag{61} \]

where τ is given by τ = max{κ_τ, 1 − µ}, with κ_τ a positive constant less than 1. Wächter and Biegler [54] showed that, under certain assumptions, the components of x_k are bounded away from 0 for a fixed µ, as long as there exists κ_Σ > 1 such that the diagonal elements of Σ_k satisfy

\[ \sigma_k^{(i)} \in \big[\kappa_\Sigma^{-1}\mu\,(x_k^{(i)})^{-2},\ \kappa_\Sigma\,\mu\,(x_k^{(i)})^{-2}\big] \quad\text{for } i = 1, 2, \dots, n, \tag{62} \]

and a proper restoration algorithm is used. Note that (62) is trivially true in primal interior point methods and is a commonly used safeguard in primal-dual interior point methods; see, e.g., [7,58,55]. Hence, under the same assumptions as [54], the algorithm remains globally convergent when applied within the interior point framework. Whether similar convergence results under weaker assumptions still hold for adaptively regularized interior point methods is a topic for further work.
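The fraction-to-the-boundary computation reduces to a componentwise ratio test. The sketch below is our illustration; the rule τ = max{κ_τ, 1 − µ} and the value κ_τ = 0.99 are assumptions modeled on common interior point practice.

```python
import numpy as np

# Sketch of the fraction-to-the-boundary rule (61): the largest alpha in
# (0, 1] with x + alpha*d >= (1 - tau)*x.
def max_step_to_boundary(x, d, mu, kappa_tau=0.99):
    tau = max(kappa_tau, 1.0 - mu)
    neg = d < 0                       # only components moving toward 0 bind
    if not np.any(neg):
        return 1.0
    return float(min(1.0, np.min(-tau * x[neg] / d[neg])))

print(max_step_to_boundary(np.array([1.0, 2.0]), np.array([-4.0, 1.0]), 0.1))
```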

5.2. Calculating inertia.
In Algorithm 1, the parameters δ_k and v_k depend on the inertia of K_k(δ_k, v_k). Hence, calculating the inertia of a symmetric matrix is important for the implementation of the algorithm. Suppose that K is a symmetric matrix that need not be positive definite or quasi-definite [51]. Then K may not be strongly factorable, i.e., it does not necessarily admit a factorization P K P^T = L D L^T, where P is a permutation matrix, L is a unit lower-triangular matrix and D is a diagonal matrix. Bunch and Kaufman [5] introduced stable methods for calculating the inertia of indefinite matrices. Following their ideas, we first compute a block LDL^T factorization of K, P K P^T = L B L^T, where P and L are as above and B is a block diagonal matrix in which each block is of order 1 or 2. Then In(K) = In(B). Second, we scan the diagonal and lower subdiagonal elements of B to find the numbers of positive, negative and zero 1×1 blocks, and the number of 2×2 blocks; denote these by n_p, n_n, n_0 and n_2, respectively. Bunch and Kaufman [5] pointed out that each 2×2 block corresponds to a positive-negative pair of eigenvalues. Therefore, we have In(K) = In(B) = (n_p + n_2, n_n + n_2, n_0).
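This procedure is straightforward to implement on top of any block LDL^T factorization. The following sketch (our illustration) uses SciPy's ldl routine in place of MA57 and scans the blocks of B exactly as described.

```python
import numpy as np
from scipy.linalg import ldl

# Sketch of the inertia computation described above, with SciPy's ldl in
# place of MA57: factor K = L B L^T and scan the 1x1 and 2x2 blocks of B.
def inertia(K, tol=1e-12):
    _, B, _ = ldl(K)                 # B is block diagonal (1x1 / 2x2 blocks)
    n = B.shape[0]
    n_plus = n_minus = n_zero = 0
    i = 0
    while i < n:
        if i + 1 < n and abs(B[i + 1, i]) > tol:
            n_plus += 1              # each 2x2 block contributes one positive
            n_minus += 1             # and one negative eigenvalue
            i += 2
        else:
            b = B[i, i]
            if b > tol: n_plus += 1
            elif b < -tol: n_minus += 1
            else: n_zero += 1
            i += 1
    return (n_plus, n_minus, n_zero)

print(inertia(np.array([[0.0, 2.0], [2.0, 0.0]])))   # expect (1, 1, 0)
```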
The block LDL^T factorization of the matrix K_k(δ_k, v_k) can be obtained in the course of solving (6) via a direct solver for symmetric indefinite systems such as MA57 [12]. Hence, the only extra cost is scanning the diagonal and lower subdiagonal.
Note that if H_k + (1/v_k)∇c_k∇c_k^T is positive definite, then so is H_k + (1/v)∇c_k∇c_k^T for all v ∈ (0, v_k], which implies, by Theorem 3.3, that In(K_k(δ_k, v)) = (n, m, 0) for all v ∈ (0, v_k]. Hence, we do not need to calculate In(K_k(δ_k, v)) for 0 < v ≤ v_k once In(K_k(δ_k, v_k)) has been found to be (n, m, 0).

5.3. Second order correction. Newton-like methods for constrained optimization may suffer from the Maratos effect, in which full steps can be rejected even arbitrarily close to the solution. Several techniques have been suggested to deal with this situation; see, for instance, [6,15,21,36,40,41,50,48,28,55], among which is the second order correction (SOC) technique [15,36,21,55]. The SOC technique computes an additional step which further improves feasibility and hence has a better chance of being accepted. In our implementation, the SOC step is computed only if h_k < 100 and the first trial step α_k^max d_k is rejected. An SOC step d_k^soc is obtained from a linear system [55] whose coefficient matrix is the same as that of (60) and whose right-hand side involves c(x_k + α_k^max d_k), where α_k^max is the maximal step size satisfying (61). Thus additional matrix factorizations are avoided. If the hybrid step α_k^max d_k + d_k^soc is also rejected, then the algorithm continues the line search along the direction d_k. Since the SOC step is tried at most once per iteration, the global convergence theorems established before still hold.
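Reusing the factorization for the SOC solve can be sketched as follows (our illustration with a generic dense LU standing in for MA57; K, grad_f, c_val and c_trial are assumed inputs).

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

# Sketch of reusing one factorization for the SOC solve. c_trial is the
# constraint value at the rejected trial point x_k + alpha_max * d_k.
def step_with_soc(K, grad_f, c_val, c_trial, n):
    factors = lu_factor(K)                               # factorize once
    d = lu_solve(factors, -np.concatenate([grad_f, c_val]))[:n]
    d_soc = lu_solve(factors,                            # second solve only
                     -np.concatenate([np.zeros(n), c_trial]))[:n]
    return d, d_soc
```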

5.4. Numerical results.
Our implementation is written in Matlab, combined with MA57, which provides an effective solver for sparse symmetric linear systems. Regarding MINOS and LOQO, we use their interfaces to the AMPL modelling language [19], with the default values of all parameters except the following: we increase the maximum number of major iterations of MINOS from 100 to 1500, and increase the iteration limit of LOQO from 200 to 1500. DEGEN is an AMPL collection which currently contains 111 test problems with degenerate constraints. Most problems are very small, but many are in some way difficult. We use all of them except the following 5 instances: problem 20205 is unbounded, and problems 2DD01-500h, 2DD01-500v, 2DD01-1000h and 2DD01-1000v are too large compared to the other problems.
In the DEGEN AMPL models, the starting point is randomly chosen in a domain of a specified size. For the convenience of comparison, we use the fixed starting point (10, 10, ..., 10) for most of the problems, except for problems 20212 and 20215, because (10, 10) is a solution of them; for these two problems, the algorithms start at (10, 5). The new algorithm computes initial Lagrangian multipliers by (63), while MINOS and LOQO use their default multiplier settings.
Note that MINOS and LOQO use different stopping rules; see [39,52,53]. Though these rules are equivalent in theory, they do have different impacts on the performance of an algorithm. In order to put the solvers in more-or-less similar conditions, we compare the new algorithm with these two implementations separately, using stopping rules as similar as possible to those of the compared solver in each comparison. When compared with LOQO, the algorithm uses the termination criteria that LOQO uses, namely that the primal and dual objectives agree to 8 significant figures and that the primal and dual iterates are feasible to a relative error level of 10^{-6}. When compared with MINOS, the algorithm is terminated if both the measure of optimality and the measure of feasibility are less than 10^{-6}.
The results are summarized in Tables 1 and 2, which report the number of problems solved within 1500 function evaluations, the robustness of the solvers, and the average and median numbers of function evaluations used on those problems solved within 1500 function evaluations.
Next, we compare the performance of these algorithms using the performance profiles proposed by Dolan and Moré [11]. Suppose that we are comparing the performance of n_s solvers on n_p problems, and let t_{p,s} denote the number of function evaluations required to solve problem p by solver s. Then the performance ratio of solver s on problem p is defined as

\[ r_{p,s} = \frac{t_{p,s}}{\min\{t_{p,s'} : 1 \le s' \le n_s\}}. \]
If a solver s does not solve problem p, then the ratio r_{p,s} is assigned the value r_{p,s} = 2 max{r_{q,s'} : solver s' solves problem q}.
Then the logarithmic performance profile for each solver is defined as

\[ \rho_s(\tau) = \frac{\text{number of problems such that } \log_2(r_{p,s}) \le \tau}{n_p}. \]
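The profile is simple to compute from a table of function-evaluation counts. The sketch below is our illustration; using np.inf to mark failures and the small data table are assumptions.

```python
import numpy as np

# Sketch of the logarithmic performance profile of Dolan and More: t[p, s]
# holds function-evaluation counts, with np.inf marking a failure.
def performance_profile(t, taus):
    n_p, n_s = t.shape
    r = t / t.min(axis=1, keepdims=True)          # performance ratios r_{p,s}
    r_max = r[np.isfinite(r)].max()
    r = np.where(np.isfinite(r), r, 2.0 * r_max)  # penalize unsolved problems
    # rho_s(tau): fraction of problems with log2(r_{p,s}) <= tau
    return [[float(np.mean(np.log2(r[:, s]) <= tau)) for s in range(n_s)]
            for tau in taus]

t = np.array([[10.0, 12.0], [30.0, np.inf], [8.0, 8.0]])
print(performance_profile(t, taus=[0.0, 1.0, 5.0]))
```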
We compare in Figures 1 and 2 the relative performance of the new algorithm against MINOS and LOQO in terms of function evaluations. Observe that the new algorithm performs better than MINOS and LOQO on the DEGEN collection.

6. Final remarks. We have presented an adaptively regularized SQP scheme for equality constrained optimization. This method is shown to be globally convergent and, after certain modifications, fast locally convergent. Preliminary numerical results are encouraging. The main ideas of this algorithm can be generalized to general constrained optimization problems. Further work will mainly focus on two aspects: (1) generalizing the framework to other problem classes, such as mathematical programs with equilibrium constraints and nonlinear complementarity problems; (2) practical considerations aimed at developing a robust and efficient implementation for large-scale problems.