AN ADAPTIVE TRUST REGION ALGORITHM FOR LARGE-RESIDUAL NONSMOOTH LEAST SQUARES PROBLEMS

Abstract. In this paper, an adaptive trust region algorithm whose trust region radius converges to zero is presented for solving large-residual nonsmooth least squares problems. The algorithm uses a smoothing approximation function and combines it with an adaptive trust region radius. Moreover, it differs from existing methods for solving nonsmooth equations in that it uses second-order information of the approximation function, which improves the convergence rate for large-residual nonsmooth least squares problems. Under suitable conditions, global and local superlinear convergence of the proposed method are proven. Preliminary numerical results indicate that the proposed algorithm is effective and suitable for solving large-residual nonsmooth least squares problems.


1. Introduction. Consider the following nonsmooth least squares problem:
\[ \min_{x\in\mathbb{R}^n} F(x)=\tfrac12\|f(x)\|^2, \tag{1.1} \]
where f : R^n → R^m is a nonsmooth function. Problem (1.1) and other types of nonsmooth problems [10,23,24,27] have recently appeared in many applications in medicine, image restoration, optimal control, functional approximation and curve fitting. Initially, we consider n = m, in which case (1.1) is equivalent to the following nonsmooth nonlinear equation problem:
\[ f(x)=0. \tag{1.2} \]
For solving the above nonsmooth equation problem, Qi [15] considers the smooth plus nonsmooth (SPN) decomposition, namely, f is decomposed as f = p + q, where p : R^n → R^n is smooth and q : R^n → R^n is locally Lipschitzian and relatively small; two examples of SPN decompositions are given there, and further examples are given in [2,3,4]. Qi [15] presents a trust region method for solving the nonsmooth equation (1.2), in which the trial step d_k is obtained by solving the following subproblem:
\[ \min_{d\in\mathbb{R}^n}\ \|p(x_k)+p'(x_k)d+q(x_k)\| \quad \text{s.t.}\ \|d\|\le\Delta_k. \tag{1.3} \]
This is a very effective numerical method for solving the nonsmooth equation (1.2). However, Sun, Sampaio and Yuan [20] report that (i) when this method is used to solve the nonsmooth least squares problem (1.1), it achieves only a slow convergence rate because only the first-order information of p(x) is used, and (ii) the second-order information of F(x) cannot be neglected when solving the large-residual nonsmooth least squares problem (1.1). Hence, they consider using the SPN decomposition and solving the subproblem
\[ \min_{d\in\mathbb{R}^n}\ \tfrac12\|p(x_k)+p'(x_k)d+q(x_k)\|^2+\tfrac12 d^TV_kd \quad \text{s.t.}\ \|d\|\le\Delta_k, \tag{1.4} \]
where V_k is symmetric and carries the second-order information of F(x) and Δ_k > 0. The ratio between the actual reduction and the predicted reduction is defined as
\[ r_k=\frac{Ared_k}{Pred_k}, \tag{1.5} \]
which plays an important role in determining whether to accept the trial step and how to adjust the trust region radius. The next iterate x_{k+1} is computed as x_{k+1} = x_k + d_k if r_k ≥ μ and x_{k+1} = x_k otherwise, where 0 < μ < 1 is a positive constant. The trust region radius for the next iteration is chosen as
\[ \Delta_{k+1}=\begin{cases}\gamma_1\Delta_k, & r_k<\mu,\\ \gamma_2\Delta_k, & \mu\le r_k<\mu_1,\\ \min\{\gamma_3\Delta_k,\Delta\}, & r_k\ge\mu_1,\end{cases} \tag{1.6} \]
where Δ > 0, 0 < μ < μ_1 < 1, and 0 < γ_1 < γ_2 < γ_3.
As is known, one of the most effective methods for problem (1.1) is the trust region method. The trust region method plays an important role in nonlinear optimization and has been proven to be very efficient (see [17,21,22], among others). Levenberg [9] and Marquardt [12] first applied this type of method to nonlinear least squares problems, and Powell [14] established the convergence of this method for unconstrained problems.
Based on the concept in [7], note that the ratio sequence {r_k} converges to 1 when the sequence generated by the traditional trust region algorithm converges to a solution of problem (1.1). By the updating formula (1.6), it is easy to see that Δ_k will then be larger than a positive constant for all sufficiently large k. Because {‖x_k − x*‖} converges to zero, the trust region constraint ultimately becomes inactive. Many algorithms with an adaptive trust region radius have been proposed for solving nonlinear equation problems. For example, Zhang and Wang [26] presented a new trust region method with the trust region radius Δ_k = c^p‖f(x_k)‖^δ for solving nonlinear equations, where 0 < c < 1, p is a nonnegative integer, and 1/2 < δ < 1. Based on the concept in [26], Yuan, Wei and Lu [25] presented a modified BFGS trust region algorithm with Δ_k = c^p‖f(x_k)‖, where c ∈ (0, 1) and p is again a nonnegative integer; quadratic convergence is obtained under suitable conditions. Fan and Pan [6] also presented a new trust region algorithm in which the trust region radius is chosen as Δ_k = μ_k‖f(x_k)‖^δ, where μ_k is updated according to r_k and δ ∈ (0, 1); under the local error bound condition, quadratic convergence is again obtained. For more examples, see [8,18], among others. The purpose of this paper is to present an efficient trust region algorithm to solve (1.1). By using the approximation function [3] and an adaptive trust region radius [7], the proposed method possesses several good properties.
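To make the adaptive radius concrete, the following minimal Python sketch implements the rule Δ_k = c^p‖f(x_k)‖^δ of Zhang and Wang [26]; the default parameter values below are illustrative assumptions, not values prescribed in [26]. The key property is that the radius shrinks with the residual, so the trust region constraint stays active near a solution.

```python
import numpy as np

def adaptive_radius(f_xk, c=0.5, p=0, delta=0.75):
    """Adaptive trust region radius in the style of Zhang and Wang [26]:
    Delta_k = c**p * ||f(x_k)||**delta, with 0 < c < 1, p a nonnegative
    integer, and 1/2 < delta < 1.  As ||f(x_k)|| -> 0, the radius tends
    to zero, unlike the traditional update (1.6)."""
    return (c ** p) * np.linalg.norm(f_xk) ** delta
```

For example, halving the residual norm shrinks the radius by a factor of 2^δ, and increasing p (as done after rejected steps in [25,26]) shrinks it geometrically.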
The remainder of this paper is organized as follows. In the next section, we briefly review some basic results in nonsmooth analysis and state an approximation function. In Section 3, we present an adaptive algorithm for solving problem (1.1), and we prove the global convergence of the proposed method. In Section 4, we prove the superlinear convergence of the algorithm under suitable conditions. Preliminary numerical results are reported in Section 5. We conclude our paper in Section 6.
Throughout this paper, unless otherwise specified, ‖·‖ denotes the Euclidean norm of vectors and its induced norm for matrices.
2. Preliminary. In this section, we state an approximation function. First, we provide the definition of an approximation function [3] as follows.
(ii) for each fixed y ∈ R^n, there is a number θ_y > 0 such that P_y(·) ≡ P(·, y) is continuously differentiable in S(y, θ_y); (iii) for each fixed x ∈ R^n, there is a number θ_x > 0 such that
\[ \|f(y)-P(y,x)\|\le \tfrac{L}{2}\|y-x\|^2 \quad \text{for all } y\in S(x,\theta_x), \]
where L > 0 is a constant and S(x, θ) denotes the open ball in R^n with center x and radius θ. Some important properties of P from [3] are given in the following: 1. Let f : R^n → R^n be continuously differentiable. Define P as
\[ P(y,x)=f(x)+f'(x)(y-x). \]
Then, P is a point-based smooth approximation function of f.
where φ is continuously differentiable and ψ is continuous. Then, P, defined in the corresponding composite form, is also a point-based smooth approximation function. Based on the above propositions, we can now decompose f as
\[ f(x)=P(x,y)+\big(f(x)-P(x,y)\big), \]
where P : R^n × R^n → R^m is a point-based smooth approximation function. Thus, we can use P_x(x) ≡ P(x, x) to define the approximation of F(x) as
\[ F_P(x)=\tfrac12\|P_x(x)\|^2. \]
Based on the above discussion of P_x(x), it possesses the following remarkable feature.
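To illustrate the smoothing idea on the simplest kink, the following Python sketch replaces |t| by a quadratic on a band of half-width w, in the spirit of the Chen-Qi construction [3]; the exact smoothing used in the paper's examples is assumed to have this Huber-type form, with the band parameter w matching the value chosen in Section 5.

```python
import numpy as np

def smooth_abs(t, w=0.625):
    """Illustrative point-based smoothing of the nonsmooth term |t|
    (an assumed Huber-type form, not necessarily the paper's exact one).
    For |t| < w the kink is replaced by a quadratic; outside that band
    the function is exact.  The two pieces match in value and slope at
    |t| = w, so the result is continuously differentiable."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) < w, t * t / (2.0 * w) + w / 2.0, np.abs(t))
```

At t = ±w both pieces equal w and both derivatives equal ±1, which is exactly the C^1 matching required of P_y(·) in condition (ii) of the definition above.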
3. New algorithm and global convergence. In this section, based on the concept in [7], we present an adaptive trust region model for solving (1.1), motivated by the approximation function, the general trust region method and the secant equation. We now describe the trust region method. In each iteration, a trial step d_k is generated by solving an adaptive trust region subproblem in which the Jacobian J(x_k) is used:
\[ \min_{d\in\mathbb{R}^n}\ q_k(d)=\tfrac12\|P_k(x_k)+J(x_k)d\|^2+\tfrac12 d^TB_kd \quad \text{s.t.}\ \|d\|\le\Delta_k, \tag{3.1} \]
where Δ_k is the trust region radius and B_k is generated by the BFGS update formula
\[ B_{k+1}=B_k-\frac{B_kd_kd_k^TB_k}{d_k^TB_kd_k}+\frac{y_ky_k^T}{y_k^Td_k}, \]
so that B_{k+1} satisfies the secant equation B_{k+1}d_k = y_k. Let d_k be the optimal solution of (3.1). The actual reduction is defined by
\[ Ared_k=F_P(x_k)-F_P(x_k+d_k), \]
and we define the predicted reduction as
\[ Pred_k=q_k(0)-q_k(d_k). \]
Then, we define r_k as the ratio between Ared_k and Pred_k. We now list the steps of the modified trust region algorithm.
Step 1. If ‖J(x_k)^T P_k(x_k)‖ ≤ ε, then stop. Otherwise, go to Step 2.
Step 2. Solve the trust region subproblem (3.1) to obtain d k .
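The overall iteration can be sketched in Python as follows. This is a minimal illustration, not the paper's exact Algorithm 1: the subproblem (3.1) is solved only approximately by a Cauchy step, B_k is kept at a zero placeholder instead of the BFGS update, and the acceptance threshold and the λ-update rule (with the constants α_2, α_3 from Section 5) are plausible reconstructions of the steps not reproduced above.

```python
import numpy as np

def cauchy_step(g, H, radius):
    """Minimizer of g.T d + 0.5 d.T H d along -g within ||d|| <= radius."""
    gn = np.linalg.norm(g)
    if gn == 0.0:
        return np.zeros_like(g)
    gHg = g @ H @ g
    tau = 1.0 if gHg <= 0 else min(gn ** 3 / (radius * gHg), 1.0)
    return (-tau * radius / gn) * g

def atr_sketch(P, J, x0, eps=1e-5, lam0=1.0, mu=0.25,
               a2=0.3, a3=1.5, max_iter=500):
    """Hypothetical sketch of an adaptive trust region (ATR) iteration."""
    x, lam = np.asarray(x0, dtype=float), lam0
    for _ in range(max_iter):
        p, Jk = P(x), J(x)
        g = Jk.T @ p
        if np.linalg.norm(g) <= eps:          # Step 1: stopping test
            break
        B = np.zeros((len(x), len(x)))        # placeholder for BFGS B_k
        H = Jk.T @ Jk + B
        radius = lam * np.linalg.norm(g)      # assumed adaptive radius form
        d = cauchy_step(g, H, radius)         # Step 2: solve (3.1) approx.
        ared = 0.5 * (p @ p) - 0.5 * (P(x + d) @ P(x + d))
        pred = -(g @ d + 0.5 * d @ H @ d)
        r = ared / pred if pred > 0 else -1.0
        if r >= mu:                           # accept trial step
            x = x + d
            lam = a3 * lam if r > 0.75 else lam
        else:                                 # reject and shrink lambda
            lam = a2 * lam
    return x
```

On a smooth residual such as P(x) = x − b with J = I, the sketch reduces to a damped Gauss-Newton iteration and converges in a few steps.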
Remark 3.1. Our modified algorithm can be considered an extension of the method in [7] from smooth optimization to nonsmooth large-residual least squares optimization.
To establish the global convergence of Algorithm 1, we will make the following common assumption.
(iii) There exists a constant M_2 > 0 such that ‖B_k‖ ≤ M_2 holds for all k.
Similar to the famous result of Powell [14], we provide the following lemma.

Lemma 3.1. If d_k is the solution of (3.1), then
\[ Pred_k \ge \tfrac12\|J(x_k)^TP_k(x_k)\|\min\Big\{\Delta_k,\ \frac{\|J(x_k)^TP_k(x_k)\|}{\|J(x_k)^TJ(x_k)+B_k\|}\Big\}. \]

Proof. Because d_k is the solution of (3.1), for any α ∈ [0, 1] we obtain
\[ Pred_k \ge q_k(0)-q_k\Big(-\alpha\Delta_k\frac{J(x_k)^TP_k(x_k)}{\|J(x_k)^TP_k(x_k)\|}\Big), \]
and minimizing the right-hand side over α ∈ [0, 1] yields the stated bound. The proof is complete.

Theorem 3.1. Suppose that Assumption A holds. Then, Algorithm 1 either terminates finitely or satisfies
\[ \liminf_{k\to\infty}\|J(x_k)^TP_k(x_k)\|=0. \tag{3.9} \]

Proof. By contradiction, suppose that there is a positive constant ε_0 such that
\[ \|J(x_k)^TP_k(x_k)\|\ge\varepsilon_0 \quad \text{for all } k. \tag{3.10} \]
From Algorithm 1, we can define Γ as the set of indices of the successful (accepted) iterations. Then, from Assumption A, Lemma 3.1 and (3.10), we obtain a uniform lower bound on the predicted reduction. If Γ is finite, from (3.7), it is clear that λ_{k+1} = α_2 λ_k for all sufficiently large k. Because α_2 < 1, it follows that lim_{k→∞} λ_k = 0.
Then, we obtain r_k → 1. Therefore, it follows that there exists a constant λ > 0 such that λ_k > λ holds for all sufficiently large k, namely, lim_{k→∞} λ_k ≥ λ > 0, which contradicts (3.13); thus, (3.10) cannot hold. Hence, (3.9) holds, and the proof is complete.

4. Superlinear convergence. In this section, we prove the superlinear convergence under suitable conditions. From [7], we know that our new trust region radius converges to zero. Hence, to prove that the algorithm attains superlinear convergence, we need to show that the constraint is inactive when d_k is the solution of (3.1), which means that the trial step becomes the quasi-Newton step; equivalently, the sequence {λ_k} is bounded. Now, let us consider the following lemma.
Lemma 4.1. Suppose that d_k is the solution of (3.1), that {x_k} is a sequence generated by Algorithm 1 converging to x*, that ∇²F_P(x) is continuous in a neighborhood of x*, and that ∇²F_P(x*) is positive definite. Then, the sequence {λ_k} is bounded.

Proof. By contradiction, suppose that there is a subsequence {k_i} such that λ_{k_i} → +∞. Because the inequality (4.1) holds, by the positive definiteness of ∇²F_P(x*), it follows that for all large k there exist L > l > 0 such that
\[ l\|d\|^2\le d^T\nabla^2F_P(x_k)d\le L\|d\|^2 \quad \text{for all } d\in\mathbb{R}^n. \tag{4.3} \]
Therefore, from (4.1), (4.2) and (4.3), we obtain a bound that contradicts λ_{k_i} → +∞. Hence, the sequence {λ_k} is bounded, and the proof is complete.
Theorem 4.1. Suppose that the conditions of Lemma 4.1 hold and that
\[ \lim_{k\to\infty}\frac{\|(B_k-\nabla^2F_P(x^*))d_k\|}{\|d_k\|}=0 \tag{4.4} \]
holds. Then, the sequence {x_k} converges to x* Q-superlinearly.
Proof. By the positive definiteness of ∇²F_P(x*) and (4.4), it follows that for all sufficiently large k there exists a positive constant η such that the quadratic model is uniformly convex. Note that d_k is a solution of (3.1); therefore, there exists ν_k ≥ 0 such that
\[ \big(J(x_k)^TJ(x_k)+B_k+\nu_kI\big)d_k=-J(x_k)^TP_k(x_k). \]
By the continuity of ∇²F_P(x) and (4.4), we have (4.7), and thus the quasi-Newton step estimate follows. Note that the sequence {λ_k} is bounded by Lemma 4.1, and ‖d_k‖ ≤ ½Δ_k holds for all large k, so the trust region constraint is inactive. Therefore, the sequence {x_k} generated by Algorithm 1 is a quasi-Newton sequence that converges superlinearly to x* under the assumed conditions. For details regarding the superlinear convergence, see [13]. This completes the proof.
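Before turning to the numerical results, the decrease bound of Lemma 3.1 (in its standard Powell form, which is assumed here) can be checked numerically: the Cauchy point along the steepest descent direction already achieves it, and any exact solution of (3.1) therefore does as well. The names below are illustrative, not from the paper.

```python
import numpy as np

def predicted_reduction(g, H, d):
    """Pred = q(0) - q(d) for the model q(d) = g.T d + 0.5 d.T H d."""
    return -(g @ d + 0.5 * d @ H @ d)

def powell_bound(g, H, radius):
    """Right-hand side of the Powell-type decrease bound assumed for
    Lemma 3.1: 0.5 * ||g|| * min(radius, ||g|| / ||H||)."""
    gn = np.linalg.norm(g)
    return 0.5 * gn * min(radius, gn / np.linalg.norm(H, 2))

def cauchy_point(g, H, radius):
    """Minimizer of the model along -g inside the ball; this step alone
    already attains the Powell-type bound."""
    gn = np.linalg.norm(g)
    gHg = g @ H @ g
    tau = 1.0 if gHg <= 0 else min(gn ** 3 / (radius * gHg), 1.0)
    return (-tau * radius / gn) * g
```

Sampling random positive definite H and gradients g confirms that predicted_reduction at the Cauchy point never falls below powell_bound.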

5. Numerical results. In this section, we report numerical results of our adaptive trust region algorithm (ATR, for short) for solving large-residual nonsmooth least squares problems. We compare our algorithm with the traditional trust region algorithm (TTR, for short) of [5]. For both codes, the trial step d_k of the trust region subproblem (3.1) is obtained by the cgtrust method [19]. Both algorithms were implemented in Matlab R2010b, and the numerical experiments were performed on a PC with an Intel Pentium(R) Dual-Core CPU at 3.20 GHz, 2.00 GB of RAM, and the Windows 7 operating system. The parameters were chosen as follows: α_0 = 0.00001, α_1 = 0.1, α_2 = 0.3, α_3 = 1.5 and λ_0 = 1. We stopped the algorithms when the condition ‖J(x_k)^T P_k(x_k)‖ ≤ 10^{-5} was satisfied. Following Conn et al. [5], for TTR we chose Δ_0 = 1 and updated Δ_k by the rule of [5]. To show that ATR is efficient for finding global minimizers, we consider the following examples.
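The TTR baseline's radius rule from [5] is not reproduced above; as an assumed stand-in, a standard shrink/expand update of the Conn-Gould-Toint type can be sketched as follows (the thresholds and factors are conventional defaults, not the paper's values).

```python
def ttr_radius_update(r, d_norm, radius, eta1=0.25, eta2=0.75,
                      g1=0.25, g2=2.0, radius_max=100.0):
    """Hypothetical traditional trust region radius update: shrink after
    a poor actual/predicted ratio r, expand (up to radius_max) after a
    very successful step that hit the boundary, and otherwise keep the
    radius unchanged.  Unlike the adaptive rule of ATR, this radius
    stays bounded away from zero along a convergent sequence."""
    if r < eta1:
        return g1 * d_norm
    if r > eta2 and abs(d_norm - radius) < 1e-12:
        return min(g2 * radius, radius_max)
    return radius
```

This is precisely the behaviour discussed in the introduction: once r_k → 1, the rule never shrinks the radius again, so the constraint eventually becomes inactive.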
Example 5.1 (a modified problem from [11]). Let f be given by (5.2). According to Proposition 3 of Chen and Qi [3], we have the corresponding smoothing formulas for the second and third equations of (5.2).

Example 5.2 (a modified trigonometric problem from [1]). Similar to Example 5.1, we have the smoothing formula (5.6) if |x_2| < w; we also choose w = 0.625 and the corresponding initial point. The computational results, shown in Figures 1 and 2 for the ATR and TTR algorithms, indicate that our ATR algorithm is effective and suitable for solving large-residual nonsmooth least squares problems.

6. Conclusions. In this paper, we proposed an adaptive trust region algorithm for solving large-residual nonsmooth least squares problems. The algorithm uses the approximation functions of Chen and Qi [3], and it combines a new trust region strategy in which the trust region radius converges to zero. In contrast, the existing methods for nonsmooth equations use only the first-order information of the smooth approximation function. For large-residual nonsmooth least squares problems, we must consider the second-order information of the smooth approximation function because the second-order term \sum_{i=1}^{n} P_{ki}(x)\nabla^2 P_{ki}(x) does not converge to zero. Moreover, because the trust region radius of the traditional trust region algorithm remains larger than a positive constant, the trust region constraint ultimately becomes inactive. We extended the approach of Fan and Yuan [7] and applied the algorithm to large-residual nonsmooth least squares problems. Finally, global and local superlinear convergence were established under mild conditions, and the preliminary numerical results indicate promising behaviour for solving large-residual nonsmooth least squares problems.