Complexity analysis of primal-dual interior-point methods for semidefinite optimization based on a parametric kernel function with a trigonometric barrier term

In this paper, we present a class of large- and small-update primal-dual interior-point methods for 
semidefinite optimization based on a parametric kernel function with a trigonometric barrier term. 
For both versions of the kernel-based interior-point methods, the worst-case iteration bounds are established, namely,
$O(n^{\frac{2}{3}}\log\frac{n}{\varepsilon})$ and $O(\sqrt{n}\log\frac{n}{\varepsilon})$, respectively. 
These results match the ones obtained in the linear optimization case.

The IPMs are based on barrier functions that are defined by a large class of univariate functions called eligible kernel functions [2], which have recently been used successfully to design new primal-dual IPMs for various optimization problems. It is well known that the use of certain eligible kernel functions leads to a significant reduction of the complexity gap between large- and small-update methods compared to the logarithmic kernel function. This was one of the main motivations for the study of new kernel functions. For instance, the kernel function $\frac{t^2-1}{2}-\int_1^t e^{3(\tan(\pi/(2+2x))-1)}\,dx$ proposed in [17] yields the iteration bounds $O(\sqrt{n}(\log n)^2\log\frac{n}{\varepsilon})$ and $O(\sqrt{n}\log\frac{n}{\varepsilon})$ for large- and small-update methods, respectively. The authors of [16,17] and Cai et al. [3] constructed some new trigonometric kernel functions and presented classes of large- and small-update IPMs for LO. Some trigonometric kernel functions and the corresponding iteration bounds for large- and small-update methods are collected in Table 1.
The purpose of this paper is to present a class of large- and small-update primal-dual IPMs for SDO based on the following parametric kernel function, which was first studied by Cai et al. [3] for LO. It should be noted that if $\lambda = 0$, then $\psi(t)$ reduces to the kernel function of the classic barrier function. Compared to the existing ones [7,16,17], the proposed kernel function covers a whole class of kernel functions. By exploiting the features of the parametric kernel function, we obtain the iteration bound $O(n^{\frac{2}{3}}\log\frac{n}{\varepsilon})$ for large-update methods, which improves the classical iteration complexity by a factor of $n^{\frac{1}{3}}$. For small-update methods, we derive the iteration bound $O(\sqrt{n}\log\frac{n}{\varepsilon})$, which coincides with the currently best known iteration bound for this type of methods.
The paper is organized as follows. In Section 2, we introduce the framework of kernel-based IPMs for SDO. In Section 3, we recall some useful properties of the parametric kernel function, as well as of the corresponding barrier function. The complexity analysis of the algorithms is presented in Section 4. Finally, Section 5 contains some conclusions and remarks.
Some of the notation used throughout the paper is as follows. $S^n$, $S^n_+$ and $S^n_{++}$ denote the set of symmetric, the cone of symmetric positive semidefinite, and the cone of symmetric positive definite $n \times n$ matrices, respectively. $A \cdot B = \mathrm{Tr}(A^T B)$ denotes the matrix inner product of two matrices $A$ and $B$. $\|\cdot\|$ denotes the Frobenius norm for matrices. The Löwner partial order $\succeq$ (or $\succ$) on positive semidefinite (or positive definite) matrices means $A \succeq B$ (or $A \succ B$) if $A - B$ is positive semidefinite (or positive definite). For any $Q \in S^n_{++}$, the expression $Q^{\frac{1}{2}}$ (or $\sqrt{Q}$) denotes its symmetric square root. For any $V \in S^n$, we define $\lambda_{\min}(V)$ (or $\lambda_{\max}(V)$) to be the minimal (or maximal) eigenvalue of $V$. Finally, if $g(x) \geq 0$ is a real-valued function of a real nonnegative variable, the notation $g(x) = O(x)$ means that $g(x) \leq c x$ for some positive constant $c$, and $g(x) = \Theta(x)$ means that $c_1 x \leq g(x) \leq c_2 x$ for two positive constants $c_1$ and $c_2$.
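As a quick numerical illustration of this notation, the following NumPy sketch (the variable names and the random test matrices are ours, not part of the paper) checks the matrix inner product, the Frobenius norm, the Löwner order, and the symmetric square root:

```python
import numpy as np

# Random symmetric positive definite example matrix A and comparison B.
rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))
A = M @ M.T + 3 * np.eye(3)      # A is symmetric positive definite
B = np.eye(3)

# Matrix inner product A . B = Tr(A^T B).
inner = np.trace(A.T @ B)

# Frobenius norm ||A|| = sqrt(A . A).
fro = np.linalg.norm(A, 'fro')
assert np.isclose(fro, np.sqrt(np.trace(A.T @ A)))

# Loewner order: A >= B iff A - B is positive semidefinite,
# i.e. all eigenvalues of the symmetric matrix A - B are nonnegative.
assert np.all(np.linalg.eigvalsh(A - B) >= 0)

# Symmetric square root A^{1/2} via the eigendecomposition of A.
lam, U = np.linalg.eigh(A)
sqrtA = U @ np.diag(np.sqrt(lam)) @ U.T
assert np.allclose(sqrtA @ sqrtA, A)
```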
2. The framework of kernel-based IPMs for SDO. Throughout the paper, we assume that (SDOP) and (SDOD) satisfy the interior-point condition (IPC), i.e., there exists a strictly feasible triple $(X^0, y^0, S^0)$ with $X^0 \succ 0$ and $S^0 \succ 0$. The perturbed Karush-Kuhn-Tucker (KKT) conditions for (SDOP) and (SDOD) are equivalent to the system (5), where $\mu > 0$ and $E$ is the $n \times n$ identity matrix. The system (5) has a unique solution, denoted by $(X(\mu), y(\mu), S(\mu))$. The set of $\mu$-centers (with $\mu$ running through the positive real numbers) forms a homotopy path, which is called the central path of (SDOP) and (SDOD). If $\mu \to 0$, then the limit of the central path exists and, since the limit points satisfy the complementarity condition $XS = 0$, it naturally yields an optimal solution for (SDOP) and (SDOD) (see, e.g., [6]). It is well known that the search directions for SDO require some symmetrization scheme. Zhang [24] suggested replacing the nonlinear equation $XS = \mu E$ by $H_P(XS) = \mu E$, where $H_P$ is the linear transformation given by
$$H_P(M) := \frac{1}{2}\left[P M P^{-1} + \left(P M P^{-1}\right)^T\right]$$
for any symmetric matrix $M$, and where the scaling matrix $P$ determines the symmetrization strategy. For any given nonsingular matrix $P$, $H_P(XS) = \mu E$ holds if and only if $XS = \mu E$, so the system (5) is equivalent to the symmetrized system (7).
Applying Newton's method to the system (7), we obtain the Newton system (8). The search direction obtained from (8) is called the Monteiro-Zhang (MZ) unified direction. Different choices of the matrix $P$ result in different search directions (see, e.g., [6,24]). In this paper, we consider the so-called NT-symmetrization scheme, from which the NT search direction is derived. Let
$$P := X^{\frac{1}{2}}\left(X^{\frac{1}{2}} S X^{\frac{1}{2}}\right)^{-\frac{1}{2}} X^{\frac{1}{2}}$$
and $D = P^{\frac{1}{2}}$. The matrix $D$ can be used to rescale $X$ and $S$ to the same matrix $V$, defined by
$$V := \frac{1}{\sqrt{\mu}}\, D^{-1} X D^{-1} = \frac{1}{\sqrt{\mu}}\, D S D.$$
Furthermore, we define the scaled quantities $\bar{A}_i$, $D_X$ and $D_S$ in (10). From (9) and (10), after some elementary reductions, we obtain the scaled Newton system (11). The scaled NT search direction $(D_X, \Delta y, D_S)$ is computed by solving (11), and $\Delta X$ and $\Delta S$ are then obtained through (10). So far we have described the scheme that defines the classical NT search direction. Now, following [19], we turn to the new approach of this paper. Given the kernel function $\psi(t)$ as defined by (3), we briefly recall the definition of a matrix function (see, e.g., [9,14,19]).
Let $V \in S^n_{++}$ and let
$$V = Q\,\mathrm{diag}\left(\lambda_1(V), \ldots, \lambda_n(V)\right) Q^T$$
be an eigenvalue decomposition of $V$, where $\lambda_i(V)$, $i = 1, \ldots, n$, denote the eigenvalues of $V$ and $Q$ is any orthonormal matrix that diagonalizes $V$. Then, the (matrix-valued) matrix function $\psi(V) : S^n_{++} \to S^n$ is defined by
$$\psi(V) := Q\,\mathrm{diag}\left(\psi(\lambda_1(V)), \ldots, \psi(\lambda_n(V))\right) Q^T. \qquad (13)$$
Note that $\psi(V)$ depends only on the restriction of $\psi(t)$ to the set of eigenvalues of $V$. Replacing $\psi(\lambda_i(V))$ by $\psi'(\lambda_i(V))$ in (13), we obtain the matrix function $\psi'(V)$ as well. Furthermore, we define the real-valued matrix function $\Psi(V) : S^n_{++} \to \mathbb{R}_+$ by
$$\Psi(V) := \mathrm{Tr}(\psi(V)) = \sum_{i=1}^n \psi(\lambda_i(V)),$$
where $\psi(V)$ is given by (13). It should be noted that the gradient of $\Psi(V)$ is $\psi'(V)$. A crucial observation is that the right-hand side $V^{-1} - V$ in the third equation of the system (11) equals minus the gradient of the classical logarithmic barrier function $\Psi_c(V)$, where $\psi_c(t)$ is the kernel function of the classic barrier function given by (4). That is to say,
$$V^{-1} - V = -\nabla \Psi_c(V),$$
where $\nabla \Psi_c(V)$ denotes the gradient of $\Psi_c(V)$, i.e., $\psi_c'(V)$. Hence, the system (11) can be written equivalently as the system (16). This means that the logarithmic barrier function essentially determines the classical NT search direction. Now, we replace the right-hand side $-\nabla \Psi_c(V)$ in the third equation of (16) by $-\nabla \Psi(V)$, i.e., $-\psi'(V)$. This yields the system (17). The scaled new search direction $(D_X, \Delta y, D_S)$ is computed by solving (17), and $\Delta X$ and $\Delta S$ are then obtained through (10). If $(X, y, S) \neq (X(\mu), y(\mu), S(\mu))$, then $(\Delta X, \Delta y, \Delta S)$ is nonzero. By taking a default step size $\alpha$ along the search directions, we obtain the new iterate $(X_+, y_+, S_+)$ according to
$$X_+ := X + \alpha \Delta X, \quad y_+ := y + \alpha \Delta y, \quad S_+ := S + \alpha \Delta S.$$
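The matrix-function construction can be sketched numerically. The following NumPy example (names are ours) applies a scalar kernel to a symmetric positive definite matrix via its eigendecomposition; for concreteness it uses the classic logarithmic kernel $\psi_c(t) = \frac{t^2-1}{2} - \log t$ and checks the identity $V^{-1} - V = -\psi_c'(V)$:

```python
import numpy as np

def matrix_function(V, f):
    """Apply a scalar function f to V in S^n_++ via V = Q diag(lam) Q^T."""
    lam, Q = np.linalg.eigh(V)
    return Q @ np.diag(f(lam)) @ Q.T

# Classic logarithmic kernel psi_c and its derivative.
psi_c  = lambda t: (t**2 - 1) / 2 - np.log(t)
dpsi_c = lambda t: t - 1 / t

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
V = M @ M.T + 4 * np.eye(4)          # a symmetric positive definite V

# Psi(V) = Tr(psi(V)) = sum_i psi(lambda_i(V)).
Psi = np.trace(matrix_function(V, psi_c))
assert np.isclose(Psi, np.sum(psi_c(np.linalg.eigvalsh(V))))

# The right-hand side V^{-1} - V equals minus the gradient psi_c'(V).
lhs = np.linalg.inv(V) - V
rhs = -matrix_function(V, dpsi_c)
assert np.allclose(lhs, rhs)

# Psi vanishes exactly at V = E (all eigenvalues equal to 1).
assert np.isclose(np.trace(matrix_function(np.eye(4), psi_c)), 0.0)
```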
One can easily verify that $\Psi(V) = 0$ if and only if $V = E$, i.e., if and only if $(X, y, S) = (X(\mu), y(\mu), S(\mu))$. Hence, the value of $\Psi(V)$ can be considered as a measure of the distance between the given iterate $(X, y, S)$ and the $\mu$-center $(X(\mu), y(\mu), S(\mu))$.
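The NT rescaling can also be checked on a small example. The sketch below (a toy verification, not the paper's algorithm; the formula for $P$ is the standard NT scaling choice and all names are ours) confirms that $D$ maps $X$ and $S$ to the same matrix $V$:

```python
import numpy as np

def sym_sqrt(Q):
    """Symmetric square root of Q in S^n_++ via eigendecomposition."""
    lam, U = np.linalg.eigh(Q)
    return U @ np.diag(np.sqrt(lam)) @ U.T

rng = np.random.default_rng(2)
n, mu = 4, 0.5
MX = rng.standard_normal((n, n)); X = MX @ MX.T + n * np.eye(n)
MS = rng.standard_normal((n, n)); S = MS @ MS.T + n * np.eye(n)

# Standard NT scaling matrix P = X^{1/2} (X^{1/2} S X^{1/2})^{-1/2} X^{1/2}.
Xh = sym_sqrt(X)
W = Xh @ S @ Xh
P = Xh @ np.linalg.inv(sym_sqrt(W)) @ Xh
assert np.allclose(P @ S @ P, X)      # the defining property P S P = X

# D = P^{1/2} rescales X and S to the same matrix:
# V = (1/sqrt(mu)) D^{-1} X D^{-1} = (1/sqrt(mu)) D S D.
D = sym_sqrt(P)
Dinv = np.linalg.inv(D)
V1 = Dinv @ X @ Dinv / np.sqrt(mu)
V2 = D @ S @ D / np.sqrt(mu)
assert np.allclose(V1, V2)
```

On the central path, where $XS = \mu E$, this common matrix $V$ equals $E$ and the proximity $\Psi(V)$ vanishes.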
The above discussion can be summarized in the form of the generic kernel-based primal-dual IPMs for SDO presented in Figure 1.
3. Properties of the parametric kernel (barrier) function. First, we recall some important properties of the parametric kernel function $\psi(t)$. The details can be found in [3].
The condition (19) of Lemma 3.1, i.e., $\psi''(t) > 1$, implies that the parametric kernel function $\psi(t)$ is strongly convex. The following lemma provides an important consequence of this property.
The exponential convexity of the parametric kernel function implies the exponential convexity of the associated barrier function.
Lemma 3.6 (Proposition 3 in [14]). Let $V_1, V_2 \in S^n_{++}$. Then
$$\Psi\left(\left[V_1^{\frac{1}{2}} V_2 V_1^{\frac{1}{2}}\right]^{\frac{1}{2}}\right) \leq \frac{1}{2}\left(\Psi(V_1) + \Psi(V_2)\right).$$
The norm-based proximity measure $\delta(V) : S^n_{++} \to \mathbb{R}_+$ is given by
$$\delta(V) := \frac{1}{2}\left\|\psi'(V)\right\|.$$
As a consequence of Lemma 3.2, the following result was obtained in the LO case.

Corollary 3.1 (Corollary 6 in [3]).
Let $v \in \mathbb{R}^n_{++}$ and $\Psi(v) \geq 1$. Then $\delta(v)$ is bounded below in terms of $\Psi(v)$. Note that $\Psi(V)$ and $\delta(V)$ depend only on the eigenvalues $\lambda_i(V)$ of the symmetric matrix $V$, so the same bound follows from Corollary 3.1 in the SDO case. It should be mentioned that, during the course of the algorithm, the largest values of $\Psi(V)$ occur just after an update of $\mu$. Therefore, we next derive an estimate for the effect of a $\mu$-update on the value of $\Psi(V)$. For this purpose, we recall the corresponding result in the LO case.
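Since both measures depend only on the spectrum of $V$, they can be evaluated directly on the eigenvalues. A hedged sketch, assuming the norm-based definition $\delta(V) = \frac{1}{2}\|\psi'(V)\|$ and substituting the classic logarithmic kernel for concreteness (the paper's parametric kernel is not reproduced here):

```python
import numpy as np

# Assumption: delta(v) = (1/2) * ||psi'(v)||; classic kernel used in place
# of the paper's parametric psi, purely for illustration.
psi  = lambda t: (t**2 - 1) / 2 - np.log(t)
dpsi = lambda t: t - 1 / t

def Psi_and_delta(eigs):
    """Evaluate Psi(V) and delta(V) from the eigenvalues lambda_i(V)."""
    Psi = np.sum(psi(eigs))
    delta = 0.5 * np.sqrt(np.sum(dpsi(eigs) ** 2))
    return Psi, delta

# At the mu-center all eigenvalues of V equal 1, and both measures vanish.
P0, d0 = Psi_and_delta(np.ones(5))
assert np.isclose(P0, 0.0) and np.isclose(d0, 0.0)

# Away from the mu-center both measures are positive.
P1, d1 = Psi_and_delta(np.array([0.5, 1.0, 2.0]))
assert P1 > 0 and d1 > 0
```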
By applying Theorem 3.8 with $v$ being the vector in $\mathbb{R}^n$ consisting of the eigenvalues $\lambda_i(V)$ of the symmetric matrix $V$, the theorem below immediately follows.

4. Complexity analysis of the algorithms. From (18) and (10), we have
$$X_+ = \sqrt{\mu}\, D (V + \alpha D_X) D, \qquad S_+ = \sqrt{\mu}\, D^{-1} (V + \alpha D_S) D^{-1}.$$
It follows from (9) that $V_+^2$ is unitarily similar to the matrix $\frac{1}{\mu} X_+^{\frac{1}{2}} S_+ X_+^{\frac{1}{2}}$. This implies that the eigenvalues of $V_+$ are precisely the same as those of the matrix
$$\bar{V}_+ := \frac{1}{\sqrt{\mu}}\left(X_+^{\frac{1}{2}} S_+ X_+^{\frac{1}{2}}\right)^{\frac{1}{2}}.$$
From the definition of $\Psi(V)$, we obtain $\Psi(\bar{V}_+) = \Psi(V_+)$. Lemma 3.6 implies that
$$\Psi(V_+) \leq \frac{1}{2}\left[\Psi(V + \alpha D_X) + \Psi(V + \alpha D_S)\right].$$
Now, we consider the decrease in $\Psi(V)$ as a function of $\alpha$ and define
$$f(\alpha) := \Psi(V_+) - \Psi(V).$$
Furthermore, we define
$$f_1(\alpha) := \frac{1}{2}\left[\Psi(V + \alpha D_X) + \Psi(V + \alpha D_S)\right] - \Psi(V).$$
It follows that $f(\alpha) \leq f_1(\alpha)$, which means that $f_1(\alpha)$ gives an upper bound for the decrease of the barrier function $\Psi(V)$. It is worth pointing out that $f_1(\alpha)$ is convex, whereas in general $f(\alpha)$ is not; this is an important advantage of using $f_1(\alpha)$ instead of the original decrease function $f(\alpha)$. Moreover, we have $f(0) = f_1(0) = 0$. From Lemma 3.10, we have expressions for $f_1'(\alpha)$ and $f_1''(\alpha)$, where
$$\omega_2 = \max\left\{\left|\Delta \psi'\left(\lambda_j(V + \alpha D_S), \lambda_k(V + \alpha D_S)\right)\right| : j, k = 1, \ldots, n\right\},$$
$\Delta \psi'$ denotes the first divided difference of $\psi'$, and $\omega_1$ is defined analogously. Hence, using the third equation of the system (17), we have $f_1'(0) = -2\delta^2$. To facilitate the discussion, we denote $\delta := \delta(V)$; the main result, Lemma 4.1 (cf. [19]), then provides an upper bound of $f_1(\alpha)$ in terms of $\delta$. The default step size for the algorithm should be chosen such that $X_+$ and $S_+$ are feasible and $\Psi(V_+) - \Psi(V)$ decreases sufficiently; for the details, we refer the interested reader to [2,3,14]. Following the strategy considered in [3], we briefly recall how to choose the default step size. Suppose that the step size $\alpha$ satisfies the condition (27). Then $f_1(\alpha) \leq 0$. The largest possible value $\bar{\alpha}$ of the step size satisfying (27) is given by (28), where $\rho(s) : [0, \infty) \to (0, 1]$ is the inverse function of $-\frac{1}{2}\psi'(t)$ for $t \in (0, 1]$. Furthermore, we can conclude that
$$\bar{\alpha} \geq \frac{1}{\psi''(\rho(2\delta))}.$$
After some elementary reductions, we can conclude that the default step size can be taken as $\tilde{\alpha}$ in (32), where $C(\lambda)$ is a constant depending only on the parameter $\lambda$. In the sequel, we use $\tilde{\alpha}$ as the default step size; it essentially depends only on $\delta$ and $C(\lambda)$. It is clear that $\bar{\alpha} \geq \tilde{\alpha}$.
In what follows, we show that the barrier function $\Psi(V)$ decreases in each inner iteration with the default step size $\tilde{\alpha}$ defined by (32). For this, we need the following technical result.

Lemma 4.2 (Lemma 12 in [14]). Let $h(t)$ be a twice differentiable convex function with $h(0) = 0$ and $h'(0) < 0$, and let $h(t)$ attain its (global) minimum at $t^* > 0$. If $h''(t)$ is increasing for $t \in [0, t^*]$, then
$$h(t) \leq \frac{t\, h'(0)}{2}, \quad 0 \leq t \leq t^*.$$

As a consequence of Lemma 4.2 and the fact that $f(\alpha) \leq f_1(\alpha)$, where $f_1(\alpha)$ is a twice differentiable convex function with $f_1(0) = 0$ and $f_1'(0) = -2\delta^2 < 0$, the following lemma is immediate. The lemma after it shows that the default step size (32) yields a sufficient decrease of the barrier function during each inner iteration.

After updating the parameter $\mu$ to $(1 - \theta)\mu$ with $0 < \theta < 1$, we have, by Theorem 3.9, the estimate (33). At the start of an outer iteration, just before the update of $\mu$, we have $\Psi(V) \leq \tau$. It follows from (33) that the value of $\Psi(V)$ may exceed the threshold $\tau$ after the update of $\mu$. Therefore, we need to count how many inner iterations are required to return to the situation where $\Psi(V) \leq \tau$. We denote the value of $\Psi(V)$ after the $\mu$-update by $\Psi_0$; the subsequent values in the same outer iteration are denoted by $\Psi_k$, $k = 1, \ldots, K$, where $K$ denotes the total number of inner iterations in the outer iteration. Hence, we have $\Psi_K \leq \tau$ and $\Psi_k > \tau$ for $k = 0, \ldots, K - 1$. According to the decrease of $f(\alpha)$ in Lemma 4.4, we have
$$\Psi_{k+1} \leq \Psi_k - \beta \Psi_k^{1-\gamma}, \quad k = 0, 1, \ldots, K - 1, \qquad (35)$$
where $\beta = \left(3\sqrt{2C(\lambda)}\right)^{-1}$ and $\gamma = \frac{2}{3}$.

Lemma 4.5 (Lemma 14 in [14]). Let $t_0, t_1, \ldots, t_K$ be a sequence of positive numbers such that
$$t_{k+1} \leq t_k - \beta t_k^{1-\gamma}, \quad k = 0, 1, \ldots, K - 1,$$
where $\beta > 0$ and $0 < \gamma \leq 1$. Then $K \leq \dfrac{t_0^{\gamma}}{\beta \gamma}$.

The following lemma provides an estimate for the number of inner iterations between two successive barrier parameter updates, in terms of $\Psi_0$ and the constant $C(\lambda)$.

Lemma 4.6. One has $K \leq \dfrac{\Psi_0^{\gamma}}{\beta \gamma}$.

Proof. From Lemma 4.5 and (35), the result of the lemma follows.
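The counting argument of Lemma 4.5 is easy to verify numerically. A sketch under the assumption that the decrease takes the worst-case form $t_{k+1} = t_k - \beta t_k^{1-\gamma}$ (the values of $t_0$, $\beta$ and the threshold are arbitrary choices of ours):

```python
def inner_iteration_count(t0, beta, gamma, threshold=1.0):
    """Simulate the worst-case recurrence t_{k+1} = t_k - beta * t_k^(1-gamma)
    and count the steps until t_k drops below the threshold tau."""
    t, K = t0, 0
    while t > threshold:
        t = t - beta * t ** (1 - gamma)
        K += 1
    return K

# Lemma 4.5 bounds the number of steps by t_0^gamma / (beta * gamma).
t0, beta, gamma = 100.0, 0.1, 2.0 / 3.0
K = inner_iteration_count(t0, beta, gamma)
bound = t0 ** gamma / (beta * gamma)
assert 0 < K <= bound
```

Stopping at a positive threshold $\tau$ rather than at zero only reduces the step count, so the bound of the lemma applies a fortiori.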
The following lemma provides an upper bound on the number of outer iterations.
Lemma 4.7 (Lemma II.17 in [18]). If the barrier parameter $\mu$ has the initial value $\mu^0$ and is repeatedly multiplied by $1 - \theta$ with $0 < \theta < 1$, then after at most
$$\left\lceil \frac{1}{\theta} \log \frac{n \mu^0}{\varepsilon} \right\rceil$$
iterations we have $n\mu \leq \varepsilon$.

By multiplying the number of outer iterations by the number of inner iterations, we obtain an upper bound on the total number of iterations, namely,
$$\frac{\Psi_0^{\gamma}}{\beta \gamma\, \theta} \log \frac{n \mu^0}{\varepsilon}.$$
For the analysis of the iteration bound of small-update methods, we need a more accurate upper bound on $\Psi_0$. This is obtained via the following lemma.
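The outer-iteration bound of Lemma 4.7 can also be checked by direct simulation (the values of $n$, $\mu^0$, $\theta$ and $\varepsilon$ below are arbitrary, and we assume the stopping rule $n\mu \leq \varepsilon$):

```python
import math

def outer_iteration_count(mu0, theta, eps, n):
    """Count the mu-updates mu <- (1 - theta) * mu until the duality-gap
    proxy n * mu falls below eps."""
    mu, count = mu0, 0
    while n * mu > eps:
        mu *= 1 - theta
        count += 1
    return count

n, mu0, theta, eps = 50, 1.0, 0.5, 1e-6
K_outer = outer_iteration_count(mu0, theta, eps, n)

# Lemma 4.7: at most ceil((1/theta) * log(n * mu0 / eps)) updates are needed.
bound = math.ceil((1 / theta) * math.log(n * mu0 / eps))
assert 1 <= K_outer <= bound
```

The bound follows from $(1-\theta)^k \leq e^{-\theta k}$, which is why the simulated count is somewhat smaller than the guaranteed worst case.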
Lemma 4.9 (Corollary 4.17 in [20]). Let $V \in S^n_{++}$ and $V_+ = \frac{V}{\sqrt{1-\theta}}$ with $0 < \theta < 1$. If $\Psi(V) \leq \tau$, then $\Psi(V_+)$ can be bounded from above in terms of $\tau$, $\theta$ and $n$.

From Lemma 4.9, Lemma 3.4 with $\psi''(t) > 1$, and Lemma 3.3, we obtain an upper bound on $\Psi_0$. Applying Lemma 4.6 again, we obtain an upper bound on the total number of iterations. After some elementary reductions, we arrive at the following theorem, which shows that the total number of iterations for small-update methods is bounded above by $O\left(\sqrt{n} \log \frac{n}{\varepsilon}\right)$, matching the currently best known iteration bound for this type of methods.

5. Conclusions and remarks. In this paper, we have shown that the class of large- and small-update primal-dual IPMs for LO based on the parametric kernel function with a trigonometric barrier term presented in [3] can be extended to the context of SDO. By exploiting the features of the parametric kernel function, we obtained the same iteration bounds as in the LO case, both for large- and small-update methods. Although expected, these results were not obvious, and several steps of the analysis were neither trivial nor straightforward generalizations of the LO case. Some interesting topics for further research remain. One is to investigate whether the NT-scaling scheme can be replaced by other scaling schemes while still retaining polynomial-time iteration bounds. Furthermore, numerical experiments may help to compare the behavior of the algorithms of this paper with existing methods.