A p-SPHERICAL SECTION PROPERTY FOR MATRIX SCHATTEN-p QUASI-NORM MINIMIZATION

Abstract. Low-rank matrix recovery has become a popular research topic with various applications in recent years. One of the most popular ways to deal with this problem, and to overcome its NP-hardness, is to relax it into a tractable optimization problem. In this paper, we consider a nonconvex relaxation, the Schatten-p quasi-norm minimization ($0 < p < 1$).


1. Introduction. The low-rank matrix recovery (LMR) problem, with numerous applications in collaborative filtering [7], machine learning [11], computer vision [19] and image denoising [18], aims at recovering an original matrix $X_0 \in \mathbb{R}^{m\times n}$ of minimal rank satisfying $b = \mathcal{A}(X_0)$, where $b \in \mathbb{R}^M$ is the measurement vector and $\mathcal{A}: \mathbb{R}^{m\times n} \to \mathbb{R}^M$ is a given linear transformation. The mathematical model of LMR is as follows:
$$\min_{X \in \mathbb{R}^{m\times n}} \ \operatorname{rank}(X) \quad \text{s.t.} \quad \mathcal{A}(X) = b. \tag{1}$$
Without loss of generality, it is usually assumed that $m \le n$. Suppose that problem (1) is feasible, i.e., there exists at least one solution to the linear system $\mathcal{A}(X) = b$.
One may be curious about the uniqueness of the solution to problem (1); this is an important issue in theory. Since problem (1) is proved to be NP-hard [3,13], one of the most popular approaches is to replace it by a tractable relaxation, and it is then necessary to investigate the equivalence between the original problem and its relaxations. Fazel, Hindi and Boyd [6] first introduced the famous convex relaxation of (1), called the nuclear norm minimization, in order to obtain a computationally tractable problem:
$$\min_{X \in \mathbb{R}^{m\times n}} \ \|X\|_* \quad \text{s.t.} \quad \mathcal{A}(X) = b, \tag{2}$$
where $\|X\|_*$ is the nuclear norm of the matrix $X$, i.e., the sum of its singular values. Recently, a nonconvex relaxation model has been studied [9,12,14], named the Schatten-p quasi-norm minimization ($0 < p < 1$), which utilizes the Schatten-p quasi-norm to approximate $\operatorname{rank}(\cdot)$:
$$\min_{X \in \mathbb{R}^{m\times n}} \ \|X\|_p^p \quad \text{s.t.} \quad \mathcal{A}(X) = b, \tag{3}$$
where $\|X\|_p$ is the Schatten-p quasi-norm of $X$, i.e., $\|X\|_p := \big(\sum_i \sigma_i^p(X)\big)^{1/p}$ with $\sigma_i(X)$ the $i$-th singular value of $X$. Although the Schatten-p quasi-norm is not actually a norm when $p \in (0,1)$, the behavior of $\|\cdot\|_p^p$ under addition helps establish some important results. In fact, $\{\, \|X\|_* \mid \|X\| \le 1 \,\}$ is the tightest convex envelope of $\{\, \operatorname{rank}(X) \mid \|X\| \le 1 \,\}$, where $\|X\|$ is the spectral norm. The Schatten-p quasi-norm, by contrast, is a nonconvex approximation to the rank function, which may be tighter than the nuclear norm. Experiments have shown that the Schatten-p quasi-norm is a powerful tool which can achieve better reconstruction than the nuclear norm [12,10], and this has also been justified theoretically [17]. Based on these results, it is natural to study whether the solution obtained from a relaxation is the unique solution of the original problem (1). If not, are there conditions that can guarantee the equivalence?
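To make the three objectives concrete, the following small numerical sketch (our illustration, not from the paper) compares $\operatorname{rank}(X)$, the nuclear norm $\|X\|_*$, and the Schatten-p quasi-norm surrogate $\|X\|_p^p$ on a low-rank matrix; the helper name `schatten_p` is ours.

```python
# Comparing rank(X), the nuclear norm ||X||_*, and the Schatten-p
# quasi-norm ||X||_p^p as surrogates for rank (illustration only).
import numpy as np

def schatten_p(X, p):
    """Return ||X||_p^p = sum_i sigma_i(X)^p, dropping numerically
    zero singular values so the p -> 0 limit is visible."""
    s = np.linalg.svd(X, compute_uv=False)
    s = s[s > 1e-10 * s[0]]          # keep only the "true" nonzeros
    return float(np.sum(s ** p))

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 10))  # rank 3

print("rank(X) =", np.linalg.matrix_rank(X))   # 3
print("||X||_* =", schatten_p(X, 1.0))         # nuclear norm (p = 1)
for p in (0.9, 0.5, 0.1):
    print(f"||X||_p^p (p={p}) =", schatten_p(X, p))
# As p -> 0, each nonzero sigma_i^p -> 1, so ||X||_p^p -> rank(X),
# which is why smaller p gives a tighter approximation to the rank.
```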
In the literature, various types of conditions have been proposed to guarantee the equivalence between problem (1) and its convex relaxation (2), such as the mutual coherence condition [2], the null space property (NSP) [15], the restricted isometry property (RIP) [1], and s-goodness [8]. However, in terms of conditions for exact recovery via the nonconvex problem (3), there are only a few results [9,17,16].
In this paper, we use a null space analysis of the Schatten-p quasi-norm minimization ($0 < p < 1$) to investigate this equivalence issue. Our analysis is related to the spherical section property discussed by Dvijotham and Fazel in [4,5]. We first introduce the definition of the p-spherical section property with constant $\Delta$ ($0 < p \le 1$); based on this definition, we then derive conditions for the uniqueness of the solution to problem (1) and conditions for the equivalence between problem (1) and its nonconvex relaxation (3). This paper proceeds as follows. In Section 2, we introduce the definition of the p-spherical section property with constant $\Delta$ ($0 < p \le 1$) and briefly discuss some properties that follow from it. In Section 3, we utilize the definition to derive conditions which guarantee that problems (1) and (3) have unique solutions, respectively; further, we obtain conditions for the equivalence between the two problems. In Section 4, we present a discussion of the error of approximate recovery. Conclusions are given in the last section.
2. p-Spherical section property. K. Dvijotham and M. Fazel first presented an analysis of the nuclear norm heuristic based on the spherical section property of the null space in [4,5]. The spherical section property essentially ensures that the subspace Null(A) does not contain low-rank matrices. In this section, we extend the definition in [4,5], while retaining its intent of describing a certain type of null space property for a linear transformation $\mathcal{A}$.
where $\|\cdot\|_F$ is the Frobenius norm. When $p = 1$, noting that $\|Z\|_1 = \|Z\|_*$, the requirement that Null(A) satisfy the p-spherical section property with constant $\Delta$ reduces to the definition of Null(A) satisfying the spherical section property in [4,5].
When $0 < p < 1$, the p-spherical section property retains a close connection to the spherical section property in [4,5]. We present their relation in the following proposition.

Proof. Since Null(A) satisfies the p-spherical section property with constant $\Delta$, the defining inequality holds for every $Z \in \mathrm{Null}(\mathcal{A}) \setminus \{0\}$. For all such $Z$, the second inequality follows from the fact that $f$ is a nonincreasing function of $p$, i.e., $f(p) > f(1)$ for $0 < p < 1$. Thus, Null(A) satisfying the p-spherical section property ($0 < p < 1$) with constant $\Delta$ implies that Null(A) satisfies the spherical section property with constant $\Delta$. This shows that Definition 2.1 is a generalization of the definition in [4,5].
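The proposition above appeals to a quantity that is nonincreasing in $p$. One general fact in this direction, which can be checked numerically, is that for a fixed matrix $Z$ the Schatten-p quasi-norm $\|Z\|_p$ is nonincreasing in $p$, so the ratio $\|Z\|_p / \|Z\|_F$ can only grow as $p$ decreases. This is our own illustration (a random $Z$ stands in for an element of Null(A); the paper's exact $f$ is not reproduced here).

```python
# For fixed Z, ||Z||_p is nonincreasing in p, so ||Z||_p / ||Z||_F
# grows as p shrinks (illustration only, not the paper's f).
import numpy as np

def schatten_quasi_norm(Z, p):
    """(sum_i sigma_i(Z)^p)^(1/p), the Schatten-p (quasi-)norm."""
    s = np.linalg.svd(Z, compute_uv=False)
    return float(np.sum(s ** p) ** (1.0 / p))

rng = np.random.default_rng(1)
Z = rng.standard_normal((6, 9))      # stands in for an element of Null(A)
fro = np.linalg.norm(Z)              # ||Z||_F, i.e. the Schatten-2 norm

# p listed in decreasing order, so the ratios should be nondecreasing.
ratios = [schatten_quasi_norm(Z, p) / fro for p in (1.0, 0.8, 0.5, 0.3)]
print(ratios)                        # ratios[0] is ||Z||_* / ||Z||_F
assert all(b >= a for a, b in zip(ratios, ratios[1:]))
```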
Proof. Based on the relation between the Schatten-p quasi-norm and the Frobenius norm:

YIFU FENG AND MIN ZHANG
we immediately obtain the stated result.

3. Exact recovery with the p-spherical section property. Based on Definition 2.1, we can derive conditions that guarantee the uniqueness of the solution to the original problem (1) and to problem (3), respectively. Further, conditions for the equivalence between those two problems can be obtained as a corollary. The following two theorems are the main results of this paper.
Theorem 3.1. Suppose Null(A) satisfies the p-spherical section property with constant $\Delta$ ($0 < p \le 1$). Let a nonzero $X_0 \in \mathbb{R}^{m\times n}$ satisfy $\mathcal{A}(X_0) = b$. If $\operatorname{rank}(X_0) < \Delta/2$, then $X_0$ is the unique solution to problem (1).
Theorem 3.2. Suppose Null(A) satisfies the p-spherical section property with constant $\Delta$ ($0 < p \le 1$), and let a nonzero $X_0 \in \mathbb{R}^{m\times n}$ satisfy $\mathcal{A}(X_0) = b$. If $\operatorname{rank}(X_0)$ satisfies condition (5), then $X_0$ is the unique solution to problem (3). An explicit sufficient condition for (5) is $\operatorname{rank}(X_0) < \Delta/12$.
Theorem 3.1 gives conditions for the uniqueness of the solution to the original problem (1), while Theorem 3.2 presents conditions for the uniqueness of the solution to the nonconvex relaxation (3). Note that the conditions in Theorem 3.2 are stricter than those in Theorem 3.1. In particular, by combining the results of Theorems 3.1 and 3.2, we can directly obtain a sufficient condition for the equivalence between problems (1) and (3).

Corollary 1. Suppose Null(A) satisfies the p-spherical section property with constant $\Delta$ ($0 < p \le 1$). Let a nonzero $X_0 \in \mathbb{R}^{m\times n}$ satisfy $\mathcal{A}(X_0) = b$. If $\operatorname{rank}(X_0) < \Delta/12$, then $X_0$ is the unique solution to both problem (1) and problem (3). Consequently, problems (1) and (3) have the same unique solution, i.e., they are equivalent.
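The proofs below repeatedly use a relation between the Schatten-p quasi-norm and the Frobenius norm of a low-rank matrix: for a rank-$\rho$ matrix $Z$ and $0 < p \le 1$, Hölder's inequality on the $\rho$ nonzero singular values gives $\|Z\|_F^p \le \|Z\|_p^p \le \rho^{1-p/2}\,\|Z\|_F^p$. The following is our numerical sketch of this standard fact, not an excerpt from the paper.

```python
# Hölder relation for a rank-rho matrix Z and 0 < p <= 1:
#   ||Z||_F^p <= ||Z||_p^p <= rho^(1 - p/2) * ||Z||_F^p.
import numpy as np

rng = np.random.default_rng(2)
rho, p = 4, 0.5
Z = rng.standard_normal((7, rho)) @ rng.standard_normal((rho, 9))  # rank rho

s = np.linalg.svd(Z, compute_uv=False)
schatten_pp = float(np.sum(s ** p))      # ||Z||_p^p
fro_p = np.linalg.norm(Z) ** p           # ||Z||_F^p

# Lower bound: ||Z||_p >= ||Z||_F for p <= 2.
# Upper bound: Hölder with exponents 2/p and 2/(2-p) on the rho nonzeros.
assert fro_p <= schatten_pp <= rho ** (1 - p / 2) * fro_p + 1e-9
print(fro_p, "<=", schatten_pp, "<=", rho ** (1 - p / 2) * fro_p)
```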
Next, we give the proofs of the two main results, Theorem 3.1 and Theorem 3.2.
which contradicts Proposition 2. The proof is complete. It is stated in Remark 1 that $f(p) > f(1)$ for $0 < p < 1$; thus $\operatorname{rank}(X_0) < f(1)$ is a sufficient condition for $\operatorname{rank}(X_0) < f(p)$, which shows that our result in Theorem 3.1 is compatible with Theorem 2.1(a) in [4].
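The core step of the uniqueness argument above is that if a second feasible solution $Y$ existed, then $Z = Y - X_0$ would be a nonzero element of Null(A) with $\operatorname{rank}(Z) \le \operatorname{rank}(Y) + \operatorname{rank}(X_0)$, i.e., rank is subadditive. A minimal numerical sketch (our illustration; random low-rank matrices stand in for $X_0$ and $Y$):

```python
# Subadditivity of rank: rank(Y - X0) <= rank(Y) + rank(X0),
# the key step of the uniqueness argument (illustration only).
import numpy as np

rng = np.random.default_rng(3)

def low_rank(m, n, r):
    """A random m x n matrix of rank r (with probability one)."""
    return rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

X0, Y = low_rank(8, 10, 2), low_rank(8, 10, 3)
Z = Y - X0                    # would lie in Null(A) if A(Y) = A(X0)

r_sum = np.linalg.matrix_rank(Y) + np.linalg.matrix_rank(X0)
assert np.linalg.matrix_rank(Z) <= r_sum   # generically 5 <= 5
```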
Proof. (Proof of Theorem 3.2) Assume there exists $Y \in \mathbb{R}^{m\times n}$ such that $\mathcal{A}(Y) = \mathcal{A}(X_0)$, $Y \ne X_0$ and $\|Y\|_p \le \|X_0\|_p$. Let $\operatorname{rank}(X_0) = r$ and consider the singular value decomposition of $X_0$. Then we obtain a chain of estimates in which the second inequality follows from the triangle inequality and the second equality follows from Lemma 2.2 in [9]; from these the desired estimate follows. Next, we consider the auxiliary optimization problem (7). Let $x_i$, $y_i$, $z_i$, $u_i$ denote the $i$-th singular values of $W_{11}$, $W_{12}$, $W_{21}$, $W_{22}$, respectively. Then problem (7) can be rewritten as problem (8), a maximization over $x_i$, $y_i$, $z_i$, $u_i$.

Note that $x_i$, $y_i$, $z_i$ appear symmetrically in both the objective function and the constraints of problem (8), so they affect the optimal value in the same way; at the optimum, they can therefore be taken to have the same value. Under this assumption, and after rescaling $x$ and $u$, problem (9) can be rewritten as problem (10). We now focus on the optimal value of problem (10). First, denote its Lagrangian function by $L$ and write down the KKT conditions of the problem. Since the Lagrangian function is convex in $(x, u) \in \mathbb{R}^2_+$, a solution of the KKT system yields a minimizer of problem (10). If $r < m/4$, we can obtain the optimal solution from the KKT system, and hence the optimal value. In order to prove that $X_0$ is the unique solution to problem (3), we need a sufficient condition for the contrary statement that Null(A) does not satisfy the p-spherical section property with constant $\Delta$ ($0 < p \le 1$); denote this inequality by (11). Note that (11) is equivalent to inequality (12). Since the function $g(x) = x^k$ ($k \ge 1$) is convex on $[0, +\infty)$, Jensen's inequality applies, and since $2/p - 1 \ge 1$, this yields a sufficient condition for (12). With simple calculations, one can verify that if $\operatorname{rank}(X_0)$ satisfies condition (5), then (11) holds, which contradicts the p-spherical section property of Null(A). Hence $X_0$ must be the unique solution to problem (3).
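The convexity step in the proof is Jensen's inequality for $g(x) = x^k$: for nonnegative $x_1, \dots, x_n$ and $k \ge 1$, $(\sum_i x_i)^k \le n^{k-1} \sum_i x_i^k$. A quick numerical check with $k = 2/p - 1$ (a sketch of the general inequality, not the paper's exact display):

```python
# Jensen's inequality for the convex map t -> t^k (k >= 1):
#   (sum_i x_i)^k <= n^(k-1) * sum_i x_i^k  for x_i >= 0.
import numpy as np

rng = np.random.default_rng(4)
p = 0.5
k = 2.0 / p - 1.0                     # k = 3, and 2/p - 1 >= 1 since p <= 1
x = rng.uniform(0.0, 5.0, size=10)    # nonnegative reals
n = x.size

lhs = x.sum() ** k
rhs = n ** (k - 1) * np.sum(x ** k)
assert lhs <= rhs + 1e-9
print(lhs, "<=", rhs)
```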
In this section, we have derived conditions for the uniqueness of the solution to problems (1) and (3), and further obtained the equivalence between these two problems. In the above discussion, if the matrix $X_0$ satisfying $\mathcal{A}(X_0) = b$ is the unique solution of problem (1) or problem (3), it is supposed to be a low-rank matrix. In practice, however, there may not exist an exact low-rank matrix $X_0$ satisfying $\mathcal{A}(X_0) = b$; instead, $X_0$ may only be approximately low-rank, in which case problems (1) and (3) need not be equivalent. Next, we consider the error bound between the approximately low-rank matrix $X_0$ and the solution to the Schatten-p quasi-norm minimization.

4. Error bound for approximation on low-rank recovery. In practice, the solution matrix $X_0$ is usually not exactly low-rank but approximately low-rank, which means that the error between $X_0$ and its best rank-r approximation is small, where the best rank-r approximation of $X_0$ is defined as
$$X_0^r := \arg\min_{\operatorname{rank}(X) \le r} \|X_0 - X\|_F.$$
In fact, for any matrix $W \in \mathbb{R}^{m\times n}$, denote the SVD of $W$ by $W = U \operatorname{Diag}(\sigma(W)) V^T$, where $\sigma(W) := (\sigma_1(W), \dots, \sigma_m(W))^T$ is the vector composed of the singular values of $W$, $U \in \mathbb{R}^{m\times m}$ is an orthogonal matrix, and $V \in \mathbb{R}^{n\times m}$ has orthonormal columns. Without loss of generality, let $\sigma_1(W) \ge \dots \ge \sigma_m(W) \ge 0$. Then the best rank-r approximation to $W$ is given by
$$W_r := U \operatorname{Diag}(\sigma_1(W), \dots, \sigma_r(W), 0, \dots, 0) V^T,$$
i.e., the matrix obtained by keeping the $r$ largest singular values and setting the rest to zero. In the following, similar to the proof of Theorem 3.2, we show that if $X_0$ is approximately low-rank, then the solution to problem (3) is close to $X_0$ within a constant factor of the error achieved by the best rank-r approximation to $X_0$, under the p-spherical section property with constant $\Delta$.
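The truncated-SVD construction of the best rank-r approximation (the Eckart-Young theorem) can be sketched as follows; the helper name `best_rank_r` is ours, not the paper's.

```python
# Best rank-r approximation via truncated SVD (Eckart-Young).
import numpy as np

def best_rank_r(W, r):
    """Best rank-r approximation of W in Frobenius norm: keep the r
    largest singular values and zero out the rest."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

rng = np.random.default_rng(5)
W = rng.standard_normal((6, 8))
r = 2
Wr = best_rank_r(W, r)

s = np.linalg.svd(W, compute_uv=False)
err = np.linalg.norm(W - Wr)
# The Frobenius error equals the tail of the singular value sequence,
# sqrt(sum_{i > r} sigma_i^2), which no rank-r matrix can beat.
assert np.isclose(err, np.sqrt(np.sum(s[r:] ** 2)))
assert np.linalg.matrix_rank(Wr) == r
```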
Theorem 4.1. Suppose Null(A) satisfies the p-spherical section property with constant $\Delta$ ($0 < p \le 1$). Let $X_0 \in \mathbb{R}^{m\times n}$ be nonzero, and $X_0^r$ be the best rank-r