# American Institute of Mathematical Sciences

doi: 10.3934/ipi.2020067

## Fast algorithms for robust principal component analysis with an upper bound on the rank

1 Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA

2 School of Mathematical Sciences, Shanghai Key Laboratory for Contemporary Applied Mathematics, Key Laboratory of Mathematics for Nonlinear Sciences (Fudan University), Ministry of Education, Fudan University, Shanghai, 200433, China

3 Department of Computational Mathematics, Science and Engineering, Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA

* Corresponding author

Received February 2020; Revised August 2020; Published November 2020

Fund Project: N. Sha and M. Yan are supported by NSF grants DMS-1621798 and DMS-2012439. L. Shi is supported by NNSFC grant 11631015 and Shanghai Science and Technology Research Program grants 19JC1420101 and 20JC1412700.

Robust principal component analysis (RPCA) decomposes a data matrix into a low-rank part and a sparse part. There are mainly two types of algorithms for RPCA. The first type applies regularization terms to the singular values of a matrix to obtain a low-rank matrix. However, calculating singular values can be very expensive for large matrices. The second type represents the low-rank matrix as the product of two small matrices. These algorithms are faster than the first type because no singular value decomposition (SVD) is required. However, they need the rank of the low-rank matrix as input, and an accurate rank estimate is necessary to obtain a reasonable solution. In this paper, we propose algorithms that combine both types. Our proposed algorithms require only an upper bound on the rank and SVDs of small matrices. First, they are faster than the first type because the cost of an SVD on small matrices is negligible. Second, they are more robust than the second type because an upper bound on the rank, instead of the exact rank, is required. Furthermore, we apply the Gauss-Newton method to increase the speed of our algorithms. Numerical experiments demonstrate the better performance of our proposed algorithms.
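The claim that an SVD on small matrices is enough can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: it assumes the low-rank part is stored as a product $X Y^\top$ whose factors have $p$ columns each, and it uses plain soft-thresholding as a stand-in for whichever singular-value regularizer is applied. The names `svd_of_product`, `shrink_singular_values`, `X`, `Y`, and `tau` are hypothetical.

```python
import numpy as np


def svd_of_product(X, Y):
    """Thin SVD of L = X @ Y.T without forming the m-by-n matrix L.

    X is m-by-p and Y is n-by-p, where p is an upper bound on the rank
    (assumed here to satisfy p <= min(m, n)).
    Returns U (m-by-p), sigma (length p), V (n-by-p) with L = U diag(sigma) V^T.
    """
    Qx, Rx = np.linalg.qr(X)  # Qx: m-by-p, Rx: p-by-p
    Qy, Ry = np.linalg.qr(Y)  # Qy: n-by-p, Ry: p-by-p
    # The only SVD is taken on a p-by-p matrix.
    Us, sigma, Vst = np.linalg.svd(Rx @ Ry.T)
    return Qx @ Us, sigma, Qy @ Vst.T


def shrink_singular_values(X, Y, tau):
    """Soft-threshold the singular values of X @ Y.T by tau.

    Soft-thresholding is an illustrative choice, not necessarily the
    regularizer used in the paper.
    """
    U, sigma, V = svd_of_product(X, Y)
    sigma = np.maximum(sigma - tau, 0.0)
    return (U * sigma) @ V.T


# Small demonstration: true rank r = 5, rank upper bound p = 10.
rng = np.random.default_rng(0)
m, n, r, p = 200, 150, 5, 10
X = rng.standard_normal((m, p))
Y = rng.standard_normal((n, p))
X[:, r:] = 0.0  # only r columns contribute, so rank(X @ Y.T) <= r

U, sigma, V = svd_of_product(X, Y)
print("numerical rank:", int(np.sum(sigma > 1e-8)))
print("reconstruction error:", np.linalg.norm((U * sigma) @ V.T - X @ Y.T))
```

In this sketch the cost is two thin QR factorizations, $O((m+n)p^2)$, plus one $p \times p$ SVD, $O(p^3)$, which is small next to a full SVD of the $m \times n$ matrix when $p \ll \min(m, n)$. Overestimating the rank only leaves extra singular values that are numerically zero.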

Citation: Ningyu Sha, Lei Shi, Ming Yan. Fast algorithms for robust principal component analysis with an upper bound on the rank. Inverse Problems & Imaging, doi: 10.3934/ipi.2020067
##### References:
[1] E. Amaldi and V. Kann, On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems, Theoretical Computer Science, 209 (1998), 237-260. doi: 10.1016/S0304-3975(97)00115-1.
[2] T. Bouwmans and E. H. Zahzah, Robust PCA via principal component pursuit: A review for a comparative evaluation in video surveillance, Computer Vision and Image Understanding, 122 (2014), 22-34.
[3] H. Cai, J.-F. Cai and K. Wei, Accelerated alternating projections for robust principal component analysis, The Journal of Machine Learning Research, 20 (2019), 685-717.
[4] E. J. Candès, X. Li, Y. Ma and J. Wright, Robust principal component analysis?, Journal of the ACM (JACM), 58 (2011), 1-37. doi: 10.1145/1970392.1970395.
[5] R. Chartrand, Exact reconstruction of sparse signals via nonconvex minimization, IEEE Signal Processing Letters, 14 (2007), 707-710. doi: 10.1109/LSP.2007.898300.
[6] J. P. Cunningham and Z. Ghahramani, Linear dimensionality reduction: Survey, insights, and generalizations, The Journal of Machine Learning Research, 16 (2015), 2859-2900.
[7] J. F. P. Da Costa, H. Alonso and L. Roque, A weighted principal component analysis and its application to gene expression data, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 8 (2009), 246-252.
[8] F. De la Torre and M. J. Black, Robust principal component analysis for computer vision, in Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, vol. 1, IEEE, 2001, 362-369.
[9] E. Elhamifar and R. Vidal, Sparse subspace clustering: Algorithm, theory, and applications, IEEE Transactions on Pattern Analysis and Machine Intelligence, 35 (2013), 2765-2781. doi: 10.1109/TPAMI.2013.57.
[10] J. Fan and R. Li, Variable selection via nonconcave penalized likelihood and its oracle properties, Journal of the American Statistical Association, 96 (2001), 1348-1360. doi: 10.1198/016214501753382273.
[11] R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, 2013.
[12] X.-L. Huang, L. Shi and M. Yan, Nonconvex sorted $\ell_1$ minimization for sparse approximation, Journal of the Operations Research Society of China, 3 (2015), 207-229. doi: 10.1007/s40305-014-0069-4.
[13] G. Li and T. K. Pong, Global convergence of splitting methods for nonconvex composite optimization, SIAM Journal on Optimization, 25 (2015), 2434-2460. doi: 10.1137/140998135.
[14] H. Li and Z. Lin, Accelerated proximal gradient methods for nonconvex programming, in Advances in Neural Information Processing Systems, 2015, 379-387.
[15] Z. Lin, M. Chen and Y. Ma, The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices, arXiv preprint arXiv:1009.5055, (2010), 663-670.
[16] G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu and Y. Ma, Robust recovery of subspace structures by low-rank representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 35 (2012), 171-184.
[17] X. Liu, Z. Wen and Y. Zhang, An efficient Gauss-Newton algorithm for symmetric low-rank product matrix approximations, SIAM Journal on Optimization, 25 (2015), 1571-1608. doi: 10.1137/140971464.
[18] Y. Lou and M. Yan, Fast l1-l2 minimization via a proximal operator, Journal of Scientific Computing, 74 (2018), 767-785. doi: 10.1007/s10915-017-0463-2.
[19] N. Sha, M. Yan and Y. Lin, Efficient seismic denoising techniques using robust principal component analysis, in SEG Technical Program Expanded Abstracts 2019, Society of Exploration Geophysicists, 2019, 2543-2547.
[20] Y. Shen, H. Xu and X. Liu, An alternating minimization method for robust principal component analysis, Optimization Methods and Software, 34 (2019), 1251-1276. doi: 10.1080/10556788.2018.1496086.
[21] M. Tao and X. Yuan, Recovering low-rank and sparse components of matrices from incomplete and noisy observations, SIAM Journal on Optimization, 21 (2011), 57-81. doi: 10.1137/100781894.
[22] L. N. Trefethen and D. Bau III, Numerical Linear Algebra, vol. 50, SIAM, 1997.
[23] F. Wen, R. Ying, P. Liu and T.-K. Truong, Nonconvex regularized robust PCA using the proximal block coordinate descent algorithm, IEEE Transactions on Signal Processing, 67 (2019), 5402-5416. doi: 10.1109/TSP.2019.2940121.
[24] Z. Wen, W. Yin and Y. Zhang, Solving a low-rank factorization model for matrix completion by a nonlinear successive over-relaxation algorithm, Mathematical Programming Computation, 4 (2012), 333-361. doi: 10.1007/s12532-012-0044-1.
[25] J. Wright, A. Ganesh, S. Rao, Y. Peng and Y. Ma, Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization, in Advances in Neural Information Processing Systems, 2009, 2080-2088.
[26] X. Yuan and J. Yang, Sparse and low-rank matrix decomposition via alternating direction methods, preprint, 12 (2009).
[27] C.-H. Zhang, Nearly unbiased variable selection under minimax concave penalty, The Annals of Statistics, 38 (2010), 894-942. doi: 10.1214/09-AOS729.

Relative error to the true low-rank matrix versus the rank $p$ for Shen et al.'s algorithm and Alg. 2. Alg. 2 is robust to the choice of $p$, as long as $p$ is not smaller than the true rank, 25.
Contour map of the relative error to ${\bf{L}}^\star$ for different parameters. In this experiment, we set $r = 25$ and $s = 20$. The upper bound of the rank is set to $p = 30$.
Numerical experiment on the 'cameraman' image. (A-C) show that the proposed model performs better than Shen et al.'s, both visually and in terms of RE and PSNR. (D) compares the objective values versus time for the general SVD, Alg. 1, and Alg. 2. Here $f^\star$ is the value obtained by Alg. 2 with more iterations. The plot shows the speedup from the Gauss-Newton approach and from acceleration. With the Gauss-Newton approach, the computation time for Alg. 1 is reduced to about 1/7 of that with the standard SVD (from 65.11s to 8.43s). The accelerated Alg. 2 requires 5.2s, though the number of iterations is reduced from 3194 to 360.
Numerical experiment on the 'Barbara' image. (A-C) show that the proposed model performs better than Shen et al.'s, both visually and in terms of RE and PSNR. (D) compares the objective values versus time for the general SVD, Alg. 1, and Alg. 2. Here $f^\star$ is the value obtained by Alg. 2 with more iterations. The plot shows the speedup from the Gauss-Newton approach and from acceleration. With the Gauss-Newton approach, the computation time for Alg. 1 is reduced to less than 1/3 of that with the standard SVD (from 148.6s to 43.7s). The accelerated Alg. 2 requires 23.3s, though the number of iterations is reduced from 3210 to 300.
Algorithm 1: Proposed Algorithm. Input: ${\bf{D}}$, $\mu$, $\lambda$, $p$, $\mathcal{A}$, stepsize $t$, stopping criteria $\epsilon$, maximum number of iterations $Max\_Iter$, initialization ${\bf{L}}^0 = {\bf{0}}$. Output: ${\bf{L}}$, ${\bf{S}}$.
Algorithm 2: Accelerated algorithm with nonmonotone APG. Input: ${\bf{D}}$, $\mu$, $\lambda$, $p$, $\mathcal{A}$, stepsize $t$, $\eta \in [0,1)$, $\delta > 0$, stopping criteria $\epsilon$, maximum number of iterations $Max\_Iter$, initialization ${\bf{L}}^0 = {\bf{L}}^1 = {\bf{Z}}^1 = {\bf{0}}$, $t^0 = 0$, $t^1 = q^1 = 1$, $c^1 = F({\bf{L}}^1)$. Output: ${\bf{L}}$, ${\bf{S}}$.
Comparison of three RPCA algorithms. We compare the relative error of their solutions to the true low-rank matrix and the number of iterations. Both Alg. 1 and Alg. 2 perform better than [20] in terms of the relative error and the number of iterations. Alg. 2 needs the fewest iterations, but its relative error can be large. This is because the true low-rank matrix is not the optimal solution to the optimization problem, and the trajectory of the iterations moves close to ${\bf{L}}^\star$ before it approaches the optimal solution.

| $r$ | $s$ | Shen et al.'s [20]: $RE({\bf{L}}, {\bf{L}}^\star)$ | Shen et al.'s [20]: $\#$ iter | Alg. 1: $RE({\bf{L}}, {\bf{L}}^\star)$ | Alg. 1: $\#$ iter | Alg. 2: $RE({\bf{L}}, {\bf{L}}^\star)$ | Alg. 2: $\#$ iter |
|---|---|---|---|---|---|---|---|
| 25 | 20 | 0.0745 | 1318 | 0.0075 | 296 | 0.0075 | 68 |
| 50 | 20 | 0.0496 | 1434 | 0.0101 | 473 | 0.0088 | 77 |
| 25 | 40 | 0.0990 | 2443 | 0.0635 | 796 | 0.0915 | 187 |

Performance of Alg. 2 on low-rank matrix recovery with missing entries. We vary the level of sparsity in the sparse noise, the standard deviation of the Gaussian noise, and the ratio of missing entries.

| $s$ | $\sigma$ | Ratio of missing entries | $RE({\bf{L}}, {\bf{L}}^\star)$ by Alg. 2 |
|---|---|---|---|
| 20 | 0.05 | 10% | 0.0079 |
| 20 | 0.05 | 20% | 0.0088 |
| 20 | 0.05 | 50% | 0.0201 |
| 5 | 0.01 | 50% | 0.0015 |

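The tables report $RE({\bf{L}}, {\bf{L}}^\star)$ and the image experiments quote RE and PSNR, but this excerpt does not spell out either definition. The sketch below assumes the usual conventions: RE is the Frobenius-norm relative error, and PSNR is computed from the mean squared error against a peak intensity. The function names and the `peak` parameter are hypothetical.

```python
import numpy as np


def relative_error(L, L_star):
    """Assumed definition: ||L - L_star||_F / ||L_star||_F."""
    return np.linalg.norm(L - L_star, "fro") / np.linalg.norm(L_star, "fro")


def psnr(img, ref, peak=1.0):
    """Assumed definition: 10 * log10(peak^2 / MSE), in decibels.

    `peak` is the largest possible pixel value (1.0 for images scaled
    to [0, 1], 255 for 8-bit images).
    """
    mse = np.mean((img - ref) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)


# Tiny demonstration on synthetic data (not data from the paper).
rng = np.random.default_rng(0)
L_star = rng.standard_normal((50, 5)) @ rng.standard_normal((5, 40))
L = L_star + 0.01 * rng.standard_normal(L_star.shape)
print("RE:", relative_error(L, L_star))
```

Under these assumed definitions, a lower RE and a higher PSNR both indicate a recovery closer to the reference.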