
# Guarantees of Riemannian optimization for low rank matrix completion

• Corresponding author: Jian-Feng Cai.
The first author is supported by National Natural Science Foundation of China (NSFC) grant 11801088 and the Shanghai Sailing Program 18YF1401600. The second author is supported by Hong Kong Research Grants Council (HKRGC) General Research Fund (GRF) 16306317.
• We establish exact recovery guarantees for a class of Riemannian optimization methods based on the embedded manifold of low rank matrices for matrix completion. Assume $m$ entries of an $n\times n$ rank $r$ matrix are sampled independently and uniformly with replacement. We first show that, with high probability, the Riemannian gradient descent and conjugate gradient descent algorithms initialized by one-step hard thresholding are guaranteed to converge linearly to the measured matrix provided

\begin{align*} m\geq C_\kappa n^{1.5}r\log^{1.5}(n), \end{align*}

where $C_\kappa$ is a numerical constant depending on the condition number of the measured matrix. Then the sampling complexity is further improved to

\begin{align*} m\geq C_\kappa nr^2\log^{2}(n) \end{align*}

via the resampled Riemannian gradient descent initialization. The analysis of the new initialization procedure relies on an asymmetric restricted isometry property of the sampling operator and on the curvature of the low rank matrix manifold. Numerical simulations show that the algorithms are able to recover a low rank matrix from nearly the minimum number of measurements.
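To make the algorithmic setup concrete, below is a minimal NumPy sketch of RGrad with the one-step hard thresholding initialization described above. It is an illustration in standard notation, not the authors' implementation; the helper names (`hard_threshold`, `rgrad_complete`) and the 0/1 `mask` encoding of the sampling set $\Omega$ are our own choices.

```python
# Minimal sketch of Riemannian gradient descent (RGrad) for matrix
# completion. Illustrative only; not the authors' code. `mask` is a 0/1
# array marking observed entries and M_obs = mask * M.
import numpy as np

def hard_threshold(X, r):
    """H_r: best rank-r approximation via truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

def rgrad_complete(M_obs, mask, r, n_iters=500, tol=1e-10):
    """Recover a rank-r matrix from the entries observed where mask == 1.

    Each iteration: form the Euclidean gradient P_Omega(M - X), project it
    onto the tangent space of the rank-r manifold at X, take an exact
    line-search step, and retract by hard thresholding.
    """
    # One-step hard thresholding initialization: X_0 = H_r(p^{-1} P_Omega(M)).
    p = mask.mean()
    X = hard_threshold(M_obs / p, r)
    for _ in range(n_iters):
        G = mask * (M_obs - X)                 # P_Omega(M - X)
        U, _, Vt = np.linalg.svd(X, full_matrices=False)
        U, V = U[:, :r], Vt[:r, :].T
        # Tangent-space projection: P_T G = U U^T G + G V V^T - U U^T G V V^T.
        PG = U @ (U.T @ G) + (G @ V) @ V.T - U @ (U.T @ G @ V) @ V.T
        # Exact line search: alpha = ||P_T G||_F^2 / ||P_Omega(P_T G)||_F^2.
        denom = np.linalg.norm(mask * PG) ** 2
        if denom < tol:
            break
        alpha = np.linalg.norm(PG) ** 2 / denom
        X = hard_threshold(X + alpha * PG, r)  # retraction by truncated SVD
    return X
```

For a dense mask this runs a full SVD per iteration; the per-iteration costs reported in Table 1 below correspond to factored implementations that exploit the sparsity of $\Omega$ and the rank-$r$ structure rather than this dense sketch.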

Mathematics Subject Classification: Primary: 58F15, 58F17; Secondary: 53C35.


Figure 1.  Geometric comparison between NIHT (a) and RGrad (b); RGrad introduces one additional projection, as the update rules below make explicit.
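In the standard formulation (our paraphrase, not the paper's exact display; $\mathcal{H}_r$ is the rank-$r$ truncated SVD, $\mathcal{P}_\Omega$ the sampling operator, and $\mathcal{P}_{T_k}$ the projection onto the tangent space of the rank-$r$ manifold at $X_k$), the two updates differ only in the extra tangent-space projection:

\begin{align*} \text{NIHT:}\quad X_{k+1} &= \mathcal{H}_r\left(X_k + \alpha_k\, \mathcal{P}_\Omega(M - X_k)\right),\\ \text{RGrad:}\quad X_{k+1} &= \mathcal{H}_r\left(X_k + \alpha_k\, \mathcal{P}_{T_k}\mathcal{P}_\Omega(M - X_k)\right). \end{align*}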

Figure 2.  Empirical phase transition curves for (a) RGrad, (b) RCG, (c) RCG restarted, and (d) GD when $n = 800$. The horizontal axis is $p = m/n^2$ and the vertical axis is $q = (2n-r)r/m$. White denotes successful recovery in all ten random tests; black denotes failure in all tests.

Figure 3.  Relative residual (mean and standard deviation over ten random tests) as a function of the number of iterations for $n = 8000$, $r = 100$, with $1/q = 2$ (a) and $1/q = 3$ (b). The value after each algorithm name is the average computational time (in seconds) to convergence.

Figure 4.  Performance of (a) RGrad, (b) RCG, and (c) RCG restarted under different signal-to-noise ratios (SNRs).

Table 1.  Comparison of RGrad/RCG and GD on sampling complexity (SC), required assumptions (RA), per-iteration computational cost (PICC), and local convergence rate (LCR). Remarks: 1) (I) denotes RGrad/RCG with the one-step hard thresholding initialization and (II) denotes RGrad/RCG with the resampled Riemannian gradient descent initialization; 2) $v_g$ and $v_{cg}$ are absolute numerical constants less than one.

| Algorithm | SC | RA | PICC | LCR |
| --- | --- | --- | --- | --- |
| RGrad, RCG (I) | $O(\kappa n^{3/2}r\log^{3/2}(n))$ | $\textbf{A0}$ | $O(\lvert\Omega\rvert r+\lvert\Omega\rvert+nr^2+nr+r^3)$ | $v_{g},~v_{cg}$ |
| RGrad, RCG (II) | $O(\kappa^6 nr^2\log^2(n))$ | $\textbf{A0}$ | $O(\lvert\Omega\rvert r+\lvert\Omega\rvert+nr^2+nr+r^3)$ | $v_{g},~v_{cg}$ |
| GD [54] | $O(\kappa^2 nr^2\log(n))$ | $\textbf{A0}$ | $O(\lvert\Omega\rvert r+\lvert\Omega\rvert+nr^2+nr)$ | $(1-\frac{C}{\mu^2\kappa^2 r^2})^{1/2}$ |
