A FAST CONTINUOUS METHOD FOR THE EXTREME EIGENVALUE PROBLEM

Abstract. Matrix eigenvalue problems play a significant role in many areas of computational science and engineering. In this paper, we propose a fast continuous method for the extreme eigenvalue problem. Based on the continuous methods developed by Golub and Liao, we first convert this nonconvex optimization problem into a minimization problem with a strictly concave objective function and a convex constraint set. We then propose a continuous method for solving the resulting minimization problem. To accelerate its convergence, a self-adaptive step-length strategy is adopted. Under mild conditions, we prove the global convergence of this method. Finally, some preliminary numerical results are presented to verify the effectiveness of the proposed method.


1. Introduction. Let A ∈ R^{n×n} be a symmetric matrix. The well-known real Schur theorem states that all eigenvalues of A are real and that there exist an orthogonal matrix U = (u_1, u_2, ..., u_n) and a diagonal matrix Λ such that

A = U Λ U^T,

where Λ = diag(λ_1, λ_2, ..., λ_n), λ_1 = ... = λ_s < λ_{s+1} ≤ ... ≤ λ_n, and 1 ≤ s ≤ n. The extreme eigenvalue problem is then to find the smallest eigenvalue λ_1 and an associated eigenvector x ∈ X, where X is defined as

X = { x ∈ R^n | Ax = λ_1 x, x^T x = 1 }.

The eigenvalue problem is a classical but very important problem in various disciplines of science and engineering [21]. The extreme eigenvalue and its associated eigenvector are used in many engineering and science applications, such as control theory, vibration analysis, and electric circuits.
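As a quick sanity check of this setup, the following NumPy sketch (ours, not from the paper; the matrix size and random seed are illustrative) verifies the decomposition A = UΛU^T and extracts the extreme eigenpair of a small random symmetric matrix:

```python
import numpy as np

# Build a random symmetric matrix and verify the spectral decomposition
# A = U * diag(lambda) * U^T with real eigenvalues.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2.0                    # symmetrize

lam, U = np.linalg.eigh(A)             # eigenvalues returned in ascending order
assert np.allclose(U @ np.diag(lam) @ U.T, A)

# The extreme eigenpair: the smallest eigenvalue and a unit eigenvector.
lam1, x = lam[0], U[:, 0]
assert np.allclose(A @ x, lam1 * x)
assert np.isclose(np.linalg.norm(x), 1.0)
```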
There are many efficient numerical methods for solving the extreme eigenvalue problem: for instance, the power method, QR iteration, and the Lanczos method discussed in the classical monograph by Golub and Van Loan [10]; conjugate gradient type methods [18]; the filtered Lanczos method [6]; Barzilai-Borwein-like methods [8]; and so on. Besides these, continuous methods are also powerful tools for solving the extreme eigenvalue problem. In [2, 3], ordinary differential equation (ODE) systems were introduced to model many iterative processes for solving application problems. After the successful application of Hopfield's neural network approach to optimization [14, 15, 16], Cichocki and Unbehauen introduced a neural network model for computing a particular eigenvalue and the corresponding eigenvector [4]. For the problem of finding the minimal (maximal) eigenvalue of a real symmetric matrix, they first converted it into a constrained optimization problem; a neural network model was then applied to this constrained problem by using either the penalty method or the Lagrange multiplier method. The optimization problems formulated in [4] are not easy to solve, which makes the application of their methods quite limited.
In [11], Golub and Liao proposed a new continuous method for the extreme and interior eigenvalue problems, where they also first converted the eigenvalue problem into optimization problems. The main difference between their method and that of Cichocki and Unbehauen [4] is that the optimization problem in [11] minimizes a strictly concave function over a unit ball, which is very easy to solve. Golub and Liao also proposed a continuous method for the resulting optimization problem, consisting of a merit function and an ordinary differential equation, and proved the convergence of every ODE solution for any starting point. Recently, this kind of continuous method has been successfully extended to symmetric generalized eigenvalue problems and to linear programming [9, 17].
In this paper, our first purpose is to simplify the proofs of Golub and Liao in [11], and our second purpose is to solve the resulting dynamic system by a self-adaptive method. In detail, we introduce a parameter into the dynamic system that plays a key role in the numerical method. We then introduce a self-adaptive strategy to find a suitable parameter along with the progress of the iterative method, such as the Euler method. Our preliminary numerical results indicate that this strategy greatly enhances the performance of the proposed method.
The rest of this paper is organized as follows. In Section 2, we state our problem formally and summarize some necessary preliminary results. In Section 3, we construct the dynamic system for solving the eigenvalue problem and prove the convergence of every ODE solution for any starting point. In Section 4, we propose a self-adaptive strategy for choosing a suitable parameter and prove the global convergence of our self-adaptive method. In Section 5, we conduct some numerical experiments to verify the proposed method. We complete this paper with some concluding remarks in Section 6.

2. Problem formulation. Denote

S = { (x, λ) ∈ R^n × R | Ax = λx, x^T x = 1 }.

Then the extreme eigenvalue problem is to find (x*, λ*) ∈ S such that

λ* ≤ λ for all (x, λ) ∈ S.

Therefore, this problem can be formulated as the optimization problem

min { x^T A x | x^T x = 1 }.   (4)

From the constraint x^T x = 1, it is noted that the above problem is equivalent to

min { x^T (A + αI) x | x^T x = 1 },   (5)

where α can be an arbitrary real constant. Denoting B = A + αI, the problem (5) can be rewritten as

min { x^T B x | x^T x = 1 },

which has the same form as the original problem (4). Since α is arbitrary, we can always choose α big enough that B is a real symmetric positive definite matrix. Thus, in the following, without loss of generality, we assume that the matrix A is real symmetric positive definite. The feasible region of problem (4) is a sphere, which is not a convex region; this makes the problem difficult to solve. However, it can be converted into a new optimization problem with a convex feasible set. In detail, we first choose a constant c such that c ≥ λ_n + 1.
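The shift argument above is easy to check numerically. In this sketch (ours; the choice of α is illustrative), B = A + αI is made positive definite without changing the extreme eigenvector on the unit sphere:

```python
import numpy as np

# Shifting A by alpha*I leaves the eigenvectors unchanged and shifts every
# eigenvalue by alpha, so B = A + alpha*I becomes positive definite once
# alpha exceeds -lambda_1.
rng = np.random.default_rng(1)
M = rng.standard_normal((6, 6))
A = (M + M.T) / 2.0

alpha = abs(np.linalg.eigvalsh(A)[0]) + 1.0    # big enough to push lambda_1 above 0
B = A + alpha * np.eye(6)

assert np.all(np.linalg.eigvalsh(B) > 0)       # B is symmetric positive definite
# The smallest-eigenvalue eigenvector is the same (up to sign):
uA = np.linalg.eigh(A)[1][:, 0]
uB = np.linalg.eigh(B)[1][:, 0]
assert np.allclose(uA**2, uB**2)
```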
Since A is a symmetric matrix, Corollary 2.3.2 in [10] gives

λ_n ≤ ||A||_2 ≤ sqrt(||A||_1 ||A||_∞) = ||A||_1.

Thus, we can choose c = ||A||_1 + 1 (which was used as the default value in the numerical experiments of Golub and Liao [11] and will also be adopted as the default value in our numerical computation). Then we can establish the following optimization problem:

min { f(x) := (1/2) x^T (A − cI) x | x^T x ≤ 1 },   (8)

which differs from problem (4) in that the objective function is quadratic and strictly concave while the constraint is a simple ball constraint. The feasible region of problem (8) is a closed convex set; as a consequence, problem (8) is easier to solve than (4). Although problem (8) does not belong to the class of traditional convex optimization problems, there is still an important relationship between its local and global minimizers. The following lemma from [11] will play a key role in the dynamic system and the convergence analysis.

Lemma 2.1 ([11]). Every local minimizer of problem (8) is a global minimizer.

3. The dynamic system and convergence. From the above extreme eigenvalue problem configuration, we have established an optimization problem (8) equivalent to our eigenvalue problem. However, (8) is not a convex optimization problem: the objective function is strictly concave although the feasible region is convex, so many effective approaches for convex optimization cannot be applied directly.
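The bound behind the default c can be verified directly. This sketch (ours) checks that c = ||A||_1 + 1 satisfies c ≥ λ_n + 1 and makes A − cI negative definite, so that f is strictly concave:

```python
import numpy as np

# c = ||A||_1 + 1 (maximum absolute column sum plus one) dominates the
# largest eigenvalue of a symmetric A, so f(x) = 0.5 x^T (A - c I) x is
# strictly (indeed uniformly) concave.
rng = np.random.default_rng(2)
M = rng.standard_normal((8, 8))
A = (M + M.T) / 2.0

c = np.linalg.norm(A, 1) + 1.0          # ord=1: maximum absolute column sum
lam = np.linalg.eigvalsh(A)
assert c >= lam[-1] + 0.999             # c >= lambda_n + 1 (up to rounding)

# A - c I is negative definite, with all eigenvalues below -1:
assert np.all(np.linalg.eigvalsh(A - c * np.eye(8)) < -0.999)
```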
With the help of Lemma 2.1, it is guaranteed that the solution of problem (8) can be obtained by finding any local minimizer of it. Moreover, it is well known that the first-order optimality condition holds at every local minimizer of a minimization problem

min { f(x) | x ∈ Ω },   (9)

where ∅ ≠ Ω ⊂ R^n is a convex set and f : R^n → R is a continuously differentiable function on Ω. The first-order optimality condition of problem (9) at x* is that no feasible direction is a descent direction. Denote by

F(x) = { d ∈ R^n | x + td ∈ Ω for some t > 0 } and D(x) = { d ∈ R^n | ∇f(x)^T d < 0 }

the sets of feasible directions and descent directions at a point x ∈ Ω, respectively. The optimality condition of problem (9) at x* can then be written in variational inequality form as the problem VI(Ω, ∇f): find x* ∈ Ω such that

(x − x*)^T ∇f(x*) ≥ 0 for all x ∈ Ω.   (11)

Next, let us introduce the projection operator P_Ω(·) : R^n → Ω, defined as

P_Ω(x) = argmin { ||y − x|| | y ∈ Ω }.

One fundamental property of P_Ω(·) is that, for each x ∈ R^n, P_Ω(x) is the unique vector x̄ ∈ Ω satisfying the inequality

(x − x̄)^T (y − x̄) ≤ 0 for all y ∈ Ω.   (13)

Then we have the following proposition [5].

Proposition 2.2. x* is a solution of (11) if and only if e(x*) := x* − P_Ω(x* − ∇f(x*)) = 0.
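For the unit ball used later, the projection and its characterizing inequality (13) are easy to check numerically; the sketch below (ours, with sampled feasible points y) illustrates both:

```python
import numpy as np

def proj_ball(x):
    """Projection onto the unit ball Omega = {x : x^T x <= 1}."""
    nrm = np.linalg.norm(x)
    return x if nrm <= 1.0 else x / nrm

# Check the characterizing inequality (x - P(x))^T (y - P(x)) <= 0
# for sampled points y in Omega.
rng = np.random.default_rng(3)
x = 3.0 * rng.standard_normal(4)          # a point (almost surely) outside Omega
px = proj_ball(x)
for _ in range(100):
    y = rng.standard_normal(4)
    y = y / max(1.0, np.linalg.norm(y))   # force y into the unit ball
    assert (x - px) @ (y - px) <= 1e-12
```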
Proof. The defining inequality of (11) is

(x − x*)^T ∇f(x*) ≥ 0 for all x ∈ Ω,

which can be rewritten as

((x* − ∇f(x*)) − x*)^T (x − x*) ≤ 0 for all x ∈ Ω.

From the property (13), the above inequality is equivalent to x* = P_Ω(x* − ∇f(x*)), or equivalently e(x*) = 0. Moreover, the solution set of the minimization problem (9) is invariant under scaling the objective function f(x) by any positive constant β > 0, which means that the zero point of e(x, β) := x − P_Ω(x − β∇f(x)) is also invariant with respect to the parameter β.
Based on the above discussion, we can solve the optimization problem (8) by finding a zero point of the mapping

e(x, β) = x − P_Ω(x − β∇f(x)),   (14)

that is, by solving

e(x, β) = 0,   (15)

where β > 0 is an arbitrary constant, Ω = { x ∈ R^n | x^T x ≤ 1 }, and ∇f(x) = (A − cI)x. In the following, based on the above preparations, we introduce a continuous method that finds a zero point of (15) and thereby solves the extreme eigenvalue problem. Generally, a continuous method for an optimization problem consists of two components: a merit function (bounded below) and a dynamical system, where the merit function must be monotonically non-increasing along the trajectories of the dynamical system.
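Concretely, e(x, β) can be evaluated as below (a sketch, ours); at a unit eigenvector of the smallest eigenvalue the residual vanishes for any β > 0, in line with the β-invariance of the zero point:

```python
import numpy as np

def proj_ball(x):
    nrm = np.linalg.norm(x)
    return x if nrm <= 1.0 else x / nrm

def residual(x, A, c, beta=1.0):
    """e(x, beta) = x - P_Omega(x - beta * grad f(x)), grad f(x) = (A - c*I) x."""
    g = A @ x - c * x
    return x - proj_ball(x - beta * g)

# At a unit eigenvector u1 of the smallest eigenvalue, e(u1, beta) = 0.
rng = np.random.default_rng(4)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2.0
c = np.linalg.norm(A, 1) + 1.0
u1 = np.linalg.eigh(A)[1][:, 0]

for beta in (0.5, 1.0, 10.0):
    assert np.linalg.norm(residual(u1, A, c, beta)) < 1e-12
```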
Taking the objective function f(x) of (8) as the merit function (16), the dynamical system can be constructed as

dx/dt = −e(x, β) = P_Ω(x − β∇f(x)) − x,   (17)

where β > 0 is an arbitrary parameter. Note that the dynamic system (17) is a simple generalization of that in [11]: we introduce a parameter β, and the dynamic system in [11] is the special case β = 1. From the numerical point of view, this simple modification enables us to use a self-adaptive strategy while solving the dynamic system (17), which is much faster than using a fixed parameter. We now give a simple proof that any limit point of the dynamic system is a stationary point of the original problem. The following lemma shows that the limit point set of the dynamic system (17) is nonempty.
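A forward-Euler discretization of (17) already traces the trajectory to the extreme eigenvector. The sketch below (ours: the test matrix with known spectrum 1, ..., n, the step length h, and the fixed β are illustrative choices) drives the Rayleigh quotient to λ_1:

```python
import numpy as np

def proj_ball(x):
    # Projection onto the unit ball {x : x^T x <= 1}.
    nrm = np.linalg.norm(x)
    return x if nrm <= 1.0 else x / nrm

# Forward-Euler tracking of the dynamic system (17),
# dx/dt = P_Omega(x - beta * grad f(x)) - x, with a fixed beta.
rng = np.random.default_rng(0)
n = 20
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q.T @ np.diag(np.arange(1.0, n + 1.0)) @ Q   # known spectrum 1, ..., n
c = np.linalg.norm(A, 1) + 1.0
beta, h = 1.0, 0.05

x = np.ones(n) / np.sqrt(n)                       # start on the unit sphere
for _ in range(20000):
    x = x + h * (proj_ball(x - beta * (A @ x - c * x)) - x)

assert abs(x @ A @ x - 1.0) < 1e-5                # Rayleigh quotient -> lambda_1 = 1
```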
Lemma 3.2. Suppose that the right-hand side of (17) satisfies the Lipschitz condition, that Ω is a convex, closed, and bounded set, and that the initial point x_0 ∈ Ω. Then the trajectory of the process (17) has a nonempty set of limit points.
Proof. Since the right-hand side of (17) is continuous in R^n, the Cauchy-Peano theorem ensures that there exists a solution x(t) of the dynamical system (17) with x(t_0) = x_0; since the trajectory stays in the bounded set Ω, its set of limit points is nonempty.

Theorem 3.3. For any starting point x_0 ∈ R^n, the solution x(t) of the dynamical system (17) exists on [t_0, +∞).

Proof. From Lemma 3.2, there exists a solution x(t) of the dynamical system (17) with x(t_0) = x_0. For this solution x(t), define E(x(t)), the square of the distance of x(t) to Ω. It follows from Theorem 1.7 in [1] that E(x(t)) is differentiable along the trajectory.
By setting z := x and y := P_Ω(x − β∇f(x)) in the projection inequality (13), we obtain dE(x(t))/dt ≤ 0; that is, E(x(t)) is monotonically nonincreasing in t. Hence E(x(t)) ≤ E(x_0) for all t ≥ t_0, and since Ω is bounded, the trajectory x(t) stays in a bounded set. This implies that the right-hand side of (17) is bounded for any given x_0. Again, the Cauchy-Peano theorem ensures that the solution x(t) exists on [t_0, +∞).
Since Ω is a closed convex set, the nonexpansiveness of the projection operator gives

||P_Ω(x) − P_Ω(y)|| ≤ ||x − y|| for all x, y ∈ R^n.

Then

||e(x, β) − e(y, β)|| ≤ ||x − y|| + ||(x − β∇f(x)) − (y − β∇f(y))|| ≤ (2 + β||A − cI||) ||x − y||,

which implies that e(x, β) in (14) is Lipschitz continuous in R^n. From the Picard-Lindelöf theorem, the solution of the dynamical system (17) is unique. The result of Theorem 3.3 indicates that the dynamical system (17) is well defined. From the proof of Theorem 3.3, we can see that if x_0 ∉ Ω, the solution x(t) of (17) moves towards the feasible region, while if x_0 ∈ Ω, the solution x(t) of (17) stays in Ω from then on. We now verify that the limit of the solution of (17) solves the optimization problem (8). For this purpose, we need the following lemma.

Lemma 3.4 ([19], Exercise 26, p. 119). Let h(t) ≥ 0 be differentiable on [t_0, T) with ∫_{t_0}^{T} h(t) dt < +∞ and |dh(t)/dt| ≤ M for some constant M > 0. Then h(t) → 0 as t → T.

Proof. See Exercise 26, p. 119 in [19]. As noted in [11], it is easy to see that T in the above lemma can be extended to +∞. Now we prove the following convergence result for the solution of (17).

Proof. (i) Let h(t) = ||e(x(t), β)||^2. By using Theorem 1.7 in [1] again, we have |dh(t)/dt| ≤ 2. The assumption of Lemma 3.4 is then satisfied with M = 2, and the assertion follows immediately.
This completes the proof.
With the convergence results for the dynamical system, we can obtain its solution directly using standard numerical tools, for instance the ODE solver ode45 provided by Matlab, as done by Golub and Liao [11]. Since we are going to incorporate a self-adaptive parameter selection strategy into the algorithm, we instead solve this dynamical system adaptively by a classical numerical method for ODEs, namely the Euler method.

4. The fast algorithm. From the above discussion, in order to solve the original optimization problem (4), the continuous method can be used by taking the objective function (16) as the merit function and (17) as the dynamic system. The efficiency of numerical methods for tracking the dynamic system (17) depends heavily on the choice of β. However, in [11], Golub and Liao simply fix β = 1. In fact, from our numerical experience, an optimal constant β is usually difficult to find: it varies from one problem to another. The freedom in setting β plays an important role in the numerical method for solving the dynamic system. We thus develop a fast algorithm by incorporating a self-adaptive strategy to select a suitable parameter β; that is, at each iteration, we solve the dynamic system (17) adaptively. In detail, we use the Euler method: at time t, we generate the next approximation to the optimal solution via the iteration

x(t + h) = x(t) + h [ P_Ω(x(t) − β(t)∇f(x(t))) − x(t) ],   (21)

where h is the time step-length and β(t) is the parameter selected self-adaptively. The iteration formula (21) is a simple but effective strategy: it was successfully used by Fukushima [7] for solving variational inequality problems with strongly monotone mappings, was employed by He [13] for solving monotone linear complementarity problems and convex quadratic problems, and was extended by Han and Lo [12] to nonlinear variational inequality problems with co-coercive mappings.
We are now in a position to describe our algorithm for solving the dynamic system (17) formally.

Algorithm 4.1.
S0. Given a nonnegative sequence τ(t) with Σ_{t=t_0}^{∞} τ(t) < +∞, choose an arbitrary initial point x(t_0) ∈ Ω and positive parameters ε > 0, h ∈ (0, 1), µ ∈ (0, 1), and γ = γ_0 > 0. Set t := t_0.
S1. If ||e(x(t))|| ≤ ε, stop; else, go to S2.
S2. (Self-adaptive procedure) Find the smallest nonnegative integer l_k such that β(t + h) = µ^{l_k} γ satisfies the condition (23). If the condition holds with l_k = 0, set γ = (1 + τ(t + h))β(t + h); otherwise, set γ = β(t).
S3. Compute x(t + h) by the iteration (21).
S4. Set t := t + h, and go to S1.
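To make the procedure concrete, here is a NumPy sketch of Algorithm 4.1 (ours). Since the acceptance condition (23) is not reproduced in this excerpt, we substitute a standard test from the self-adaptive literature, β‖∇f(x) − ∇f(y)‖ ≤ ν‖x − y‖ with ν ∈ (0, 1), and a summable sequence τ of our own choosing; both are assumptions, not the paper's exact rule:

```python
import numpy as np

def proj_ball(x):
    nrm = np.linalg.norm(x)
    return x if nrm <= 1.0 else x / nrm

def self_adaptive_euler(A, c, x0, h=0.05, mu=0.6, gamma=1.0,
                        tol=1e-6, max_iter=100000):
    """Sketch of Algorithm 4.1.  The acceptance test standing in for the
    missing condition (23) is beta*||G(x - y)|| <= nu*||x - y|| (assumed)."""
    nu = 0.9
    G = A - c * np.eye(len(x0))               # grad f(x) = G x
    x = x0.copy()
    for k in range(max_iter):
        g = G @ x
        if np.linalg.norm(x - proj_ball(x - g)) <= tol:   # S1: stopping test
            return x
        beta = gamma                           # S2: try beta = mu^l * gamma, l = 0, 1, ...
        while True:
            y = proj_ball(x - beta * g)
            if beta * np.linalg.norm(G @ (x - y)) <= nu * np.linalg.norm(x - y):
                break
            beta *= mu
        x = x + h * (y - x)                    # S3: Euler step (21)
        gamma = (1.0 + 0.5 / (k + 1) ** 2) * beta   # enlarge gamma; tau(t) summable
    return x

# Usage on a test problem with known smallest eigenvalue 1:
rng = np.random.default_rng(5)
Q, _ = np.linalg.qr(rng.standard_normal((10, 10)))
A = Q.T @ np.diag(np.arange(1.0, 11.0)) @ Q
c = np.linalg.norm(A, 1) + 1.0
x = self_adaptive_euler(A, c, np.ones(10) / np.sqrt(10))
assert abs(x @ A @ x - 1.0) < 1e-4
```

The backtracking loop terminates because the acceptance test always holds once β falls below ν divided by the Lipschitz constant of ∇f, mirroring Lemma 4.2.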
The following lemma shows that, at each iteration, the self-adaptive procedure for selecting the parameter terminates in finitely many steps; consequently, the whole algorithm is well defined.

Lemma 4.2. For any t in Algorithm 4.1, the search for β(t + h) (the self-adaptive procedure, S2) terminates in finitely many steps. Furthermore, there exists a positive real number β_min such that β(t) ≥ β_min > 0 for all t ≥ t_0.
Proof. By the update of γ in Algorithm 4.1, we derive that β(t + h) ≤ (1 + τ(t))β(t) for all t, and since Σ_t τ(t) < +∞, the product Π_t (1 + τ(t)) is finite. This implies that there is a positive constant K > 0 such that γ ≤ K and β(t) ≤ K for all t. Because f is uniformly concave with modulus 1 on Ω (as c ≥ λ_n + 1), the concavity inequality (26) holds. By setting y := x(t) and z := P_Ω(x(t) − β(t + h)∇f(x(t))) and combining the resulting inequality with (26), we obtain that the condition (23) holds provided that β(t + h) ≤ 1. Based on the facts that γ ≤ K and lim_{l→∞} µ^l = 0, we conclude that the self-adaptive procedure at each iteration terminates in finitely many steps. The parameter l_k in Algorithm 4.1 is the minimum nonnegative integer fulfilling the condition (23), and this yields a uniform lower bound β(t) ≥ β_min > 0. The proof is completed.

Lemma 4.2 indicates that the whole Algorithm 4.1 is well defined. Thus, it either generates an approximate solution or generates an infinite iterative sequence. In the following, we assume that the stopping tolerance ε is zero and that the algorithm generates an infinite sequence of iterates. We now prove that any cluster point of the sequence generated by Algorithm 4.1 is a local minimizer of problem (8), which, by invoking Lemma 2.1, is a global minimizer of (8) and hence a solution of the original problem.

Proof. Since x(t_0) ∈ Ω and h ∈ (0, 1), we have x(t) ∈ Ω for all t ≥ t_0. Due to the compactness of Ω and the continuity of f, f(x(t)) is bounded for all t ≥ t_0; that is, there exists a constant C such that |f(x(t))| ≤ C. It then follows from (26) that the sequence {f(x(t))} converges. By using (18), we have ∇f(x(t))^T e(x(t), β(t + h)) ≥ ||e(x(t), β(t + h))||^2.
This completes the proof.

5. Numerical experiments. In this section, we apply the proposed method to two examples to verify its effectiveness for the extreme eigenvalue problem, and we compare it with the ODE solver used by Golub and Liao [11]. The stopping criterion used in our experiments is

||e(x(t))|| ≤ ε,

where ε is a preset tolerance; we use ε = 10^{-6} in all our tests. All codes were run in Matlab on a PC with an Intel(R) Pentium(R) 4 CPU at 3.2 GHz. The ODE solver used is ode45, a non-stiff medium-order method, with RelTol = 10^{-6} and AbsTol = 10^{-9} in all our runs. Our examples are constructed as follows.
Example 1. The test problems are generated as follows:
2. Generate a random matrix B ∈ R^{n×n}, and compute its QR decomposition by [Q, R] = qr(B).
3. Set A = Q^T Λ Q, which is a real symmetric positive definite matrix.

We select u = ones(n, 1) and x_0 = u/||u|| as the initial value, and we choose c = 5. In our algorithm, we choose t_0 = 0, γ = γ_0 = 1, µ = 0.6, and h = 0.05, and the nonnegative sequence {τ(t)} is set to τ(t) = 0.5, t ≤ 720s. We compare our method with that of Golub and Liao [11], with n varying from 100 to 5000. Figure 1 shows the trajectories of some entries of the solutions of the dynamic system for problem sizes n = 1000 and n = 5000, where 'x' and 'y' denote the trajectories obtained by Golub and Liao's method and by our proposed method, respectively.
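The construction above can be mirrored in NumPy (a sketch, ours). Since the first step of the recipe, the choice of Λ, is not reproduced in the text, we assume a spectrum inside [1, 4], which is consistent with the stated choice c = 5:

```python
import numpy as np

# Test-problem construction following Example 1.  The diagonal Lambda is
# our assumption (eigenvalues in [1, 4], compatible with c = 5).
rng = np.random.default_rng(2)
n = 100
Lam = np.diag(np.linspace(1.0, 4.0, n))    # assumed spectrum
B = rng.standard_normal((n, n))
Q, R = np.linalg.qr(B)                     # step 2: [Q, R] = qr(B)
A = Q.T @ Lam @ Q                          # step 3: symmetric positive definite

u = np.ones(n)
x0 = u / np.linalg.norm(u)                 # initial value x_0 = u / ||u||

assert np.allclose(A, A.T)
assert np.isclose(np.linalg.eigvalsh(A)[0], 1.0)    # smallest eigenvalue known
assert np.linalg.eigvalsh(A)[-1] < 4.0 + 1e-6       # c = 5 >= lambda_n + 1
```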
From the results shown in Figure 1, we can see that all three trajectories of 'x' and of 'y' (the 200th, 700th, and 800th entries of the solution for n = 1000; the 1500th, 3000th, and 4100th entries for n = 5000) converge to the same values. This means that both methods find the correct solution of the dynamic system. Moreover, our method takes less time to drive the dynamic system to a stable state. In detail, we summarize the computation time and the number of iterations of our proposed method in Table 1 and compare them with the ODE solver. From the results shown in Table 1, we can see that the proposed method is about 10 times faster than the ODE solver in all testing scenarios.

Example 2. In the second example, we construct a new extreme eigenvalue problem whose eigenvalues are pairwise distinct; the diagonal matrix is selected as Λ = diag(1, 1/2, 3/4, 4/5, 5/6, ..., (n−1)/n) ∈ R^{n×n}.

Figure 1. Trajectories of three solution entries for each of the two problem sizes, obtained by the two methods.
All parameters are set the same as in Example 1. We again test our proposed method against the ODE solver for comparison.
The trajectories of some entries of the solutions (the 200th, 400th, and 900th entries for n = 1000; the 1000th, 2000th, and 4000th entries for n = 5000) obtained by the two methods are plotted in Figure 2. It can be seen that both methods are effective, whereas the convergence of our proposed method is faster. We also summarize the computation time and the number of iterations of the proposed method in Table 2. From the results, we can see that our proposed method is again about 10 times faster than the ODE solver in all testing scenarios.

6. Conclusions. In this paper, we first simplify the proofs of Golub and Liao in [11] on the continuous method for the extreme eigenvalue problem. Then we propose a fast continuous method for solving the large-scale extreme eigenvalue problem. To accelerate the convergence speed, we introduce a self-adaptive strategy