# American Institute of Mathematical Sciences

February  2021, 15(1): 109-128. doi: 10.3934/ipi.2020067

## Fast algorithms for robust principal component analysis with an upper bound on the rank

 1 Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA 2 School of Mathematical Sciences, Shanghai Key Laboratory for Contemporary Applied Mathematics, Key Laboratory of Mathematics for Nonlinear Sciences (Fudan University), Ministry of Education, Fudan University, Shanghai, 200433, China 3 Department of Computational Mathematics, Science and Engineering, Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA

* Corresponding author

Received  February 2020 Revised  August 2020 Published  November 2020

Fund Project: N. Sha and M. Yan are supported by NSF grant DMS-1621798 and DMS-2012439. L. Shi is supported by NNSFC grant 11631015 and Shanghai Science and Technology Research Program 19JC1420101 and and 20JC1412700

The robust principal component analysis (RPCA) decomposes a data matrix into a low-rank part and a sparse part. There are mainly two types of algorithms for RPCA. The first type of algorithm applies regularization terms on the singular values of a matrix to obtain a low-rank matrix. However, calculating singular values can be very expensive for large matrices. The second type of algorithm replaces the low-rank matrix as the multiplication of two small matrices. They are faster than the first type because no singular value decomposition (SVD) is required. However, the rank of the low-rank matrix is required, and an accurate rank estimation is needed to obtain a reasonable solution. In this paper, we propose algorithms that combine both types. Our proposed algorithms require an upper bound of the rank and SVD on small matrices. First, they are faster than the first type because the cost of SVD on small matrices is negligible. Second, they are more robust than the second type because an upper bound of the rank instead of the exact rank is required. Furthermore, we apply the Gauss-Newton method to increase the speed of our algorithms. Numerical experiments show the better performance of our proposed algorithms.

Citation: Ningyu Sha, Lei Shi, Ming Yan. Fast algorithms for robust principal component analysis with an upper bound on the rank. Inverse Problems & Imaging, 2021, 15 (1) : 109-128. doi: 10.3934/ipi.2020067
##### References:
 [1] E. Amaldi and V. Kann, On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems, Theoretical Computer Science, 209 (1998), 237-260.  doi: 10.1016/S0304-3975(97)00115-1.  Google Scholar [2] T. Bouwmans and E. H. Zahzah, Robust pca via principal component pursuit: A review for a comparative evaluation in video surveillance, Computer Vision and Image Understanding, 122 (2014), 22-34.   Google Scholar [3] H. Cai, J.-F. Cai and K. Wei, Accelerated alternating projections for robust principal component analysis, The Journal of Machine Learning Research, 20 (2019), 685-717.   Google Scholar [4] E. J. Candès, X. Li, Y. Ma and J. Wright, Robust principal component analysis?, Journal of the ACM (JACM), 58 (2011), 1-37.  doi: 10.1145/1970392.1970395.  Google Scholar [5] R. Chartrand, Exact reconstruction of sparse signals via nonconvex minimization, IEEE Signal Processing Letters, 14 (2007), 707-710.  doi: 10.1109/LSP.2007.898300.  Google Scholar [6] J. P. Cunningham and Z. Ghahramani, Linear dimensionality reduction: Survey, insights, and generalizations, The Journal of Machine Learning Research, 16 (2015), 2859-2900.   Google Scholar [7] J. F. P. Da Costa, H. Alonso and L. Roque, A weighted principal component analysis and its application to gene expression data, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 8 (2009), 246-252.   Google Scholar [8] F. De la Torre and M. J. Black, Robust principal component analysis for computer vision, in Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, vol. 1, IEEE, 2001, 362-369. Google Scholar [9] E. Elhamifar and R. Vidal, Sparse subspace clustering: Algorithm, theory, and applications, IEEE transactions on pattern analysis and machine intelligence, 35 (2013), 2765-2781.  doi: 10.1109/TPAMI.2013.57.  Google Scholar [10] J. Fan and R. Li, Variable selection via nonconcave penalized likelihood and its oracle properties, Journal of the American statistical Association, 96 (2001), 1348-1360.  doi: 10.1198/016214501753382273.  Google Scholar [11] R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge university press, 2013.   Google Scholar [12] X.-L. Huang, L. Shi and M. Yan, Nonconvex sorted $\ell_1$ minimization for sparse approximation, Journal of the Operations Research Society of China, 3 (2015), 207-229.  doi: 10.1007/s40305-014-0069-4.  Google Scholar [13] G. Li and T. K. Pong, Global convergence of splitting methods for nonconvex composite optimization, SIAM Journal on Optimization, 25 (2015), 2434-2460.  doi: 10.1137/140998135.  Google Scholar [14] H. Li and Z. Lin, Accelerated proximal gradient methods for nonconvex programming, in Advances in Neural Information Processing Systems, 2015, 379-387. Google Scholar [15] Z. Lin, M. Chen and Y. Ma, The augmented lagrange multiplier method for exact recovery of corrupted low-rank matrices. 2010, arXiv preprint arXiv: 1009.5055, (2010), 663-670. Google Scholar [16] G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu and Y. Ma, Robust recovery of subspace structures by low-rank representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 35 (2012), 171-184.   Google Scholar [17] X. Liu, Z. Wen and Y. Zhang, An efficient Gauss-Newton algorithm for symmetric low-rank product matrix approximations, SIAM Journal on Optimization, 25 (2015), 1571-1608.  doi: 10.1137/140971464.  Google Scholar [18] Y. Lou and M. Yan, Fast l1-l2 minimization via a proximal operator, Journal of Scientific Computing, 74 (2018), 767-785.  doi: 10.1007/s10915-017-0463-2.  Google Scholar [19] N. Sha, M. Yan and Y. Lin, Efficient seismic denoising techniques using robust principal component analysis, in SEG Technical Program Expanded Abstracts 2019, Society of Exploration Geophysicists, 2019, 2543-2547. Google Scholar [20] Y. Shen, H. Xu and X. Liu, An alternating minimization method for robust principal component analysis, Optimization Methods and Software, 34 (2019), 1251-1276.  doi: 10.1080/10556788.2018.1496086.  Google Scholar [21] M. Tao and X. Yuan, Recovering low-rank and sparse components of matrices from incomplete and noisy observations, SIAM Journal on Optimization, 21 (2011), 57-81.  doi: 10.1137/100781894.  Google Scholar [22] L. N. Trefethen and D. Bau â…¢, Numerical linear algebra, vol. 50, SIAM, 1997.  Google Scholar [23] F. Wen, R. Ying, P. Liu and T.-K. Truong, Nonconvex regularized robust PCA using the proximal block coordinate descent algorithm, IEEE Transactions on Signal Processing, 67 (2019), 5402-5416.  doi: 10.1109/TSP.2019.2940121.  Google Scholar [24] Z. Wen, W. Yin and Y. Zhang, Solving a low-rank factorization model for matrix completion by a nonlinear successive over-relaxation algorithm, Mathematical Programming Computation, 4 (2012), 333-361.  doi: 10.1007/s12532-012-0044-1.  Google Scholar [25] J. Wright, A. Ganesh, S. Rao, Y. Peng and Y. Ma, Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization, in Advances in Neural Information Processing Systems, 2009, 2080-2088. Google Scholar [26] X. Yuan and J. Yang, Sparse and low-rank matrix decomposition via alternating direction methods, preprint, 12 (2009).  Google Scholar [27] C.-H. Zhang, Nearly unbiased variable selection under minimax concave penalty, The Annals of Statistics, 38 (2010), 894-942.  doi: 10.1214/09-AOS729.  Google Scholar

show all references

##### References:
 [1] E. Amaldi and V. Kann, On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems, Theoretical Computer Science, 209 (1998), 237-260.  doi: 10.1016/S0304-3975(97)00115-1.  Google Scholar [2] T. Bouwmans and E. H. Zahzah, Robust pca via principal component pursuit: A review for a comparative evaluation in video surveillance, Computer Vision and Image Understanding, 122 (2014), 22-34.   Google Scholar [3] H. Cai, J.-F. Cai and K. Wei, Accelerated alternating projections for robust principal component analysis, The Journal of Machine Learning Research, 20 (2019), 685-717.   Google Scholar [4] E. J. Candès, X. Li, Y. Ma and J. Wright, Robust principal component analysis?, Journal of the ACM (JACM), 58 (2011), 1-37.  doi: 10.1145/1970392.1970395.  Google Scholar [5] R. Chartrand, Exact reconstruction of sparse signals via nonconvex minimization, IEEE Signal Processing Letters, 14 (2007), 707-710.  doi: 10.1109/LSP.2007.898300.  Google Scholar [6] J. P. Cunningham and Z. Ghahramani, Linear dimensionality reduction: Survey, insights, and generalizations, The Journal of Machine Learning Research, 16 (2015), 2859-2900.   Google Scholar [7] J. F. P. Da Costa, H. Alonso and L. Roque, A weighted principal component analysis and its application to gene expression data, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 8 (2009), 246-252.   Google Scholar [8] F. De la Torre and M. J. Black, Robust principal component analysis for computer vision, in Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, vol. 1, IEEE, 2001, 362-369. Google Scholar [9] E. Elhamifar and R. Vidal, Sparse subspace clustering: Algorithm, theory, and applications, IEEE transactions on pattern analysis and machine intelligence, 35 (2013), 2765-2781.  doi: 10.1109/TPAMI.2013.57.  Google Scholar [10] J. Fan and R. Li, Variable selection via nonconcave penalized likelihood and its oracle properties, Journal of the American statistical Association, 96 (2001), 1348-1360.  doi: 10.1198/016214501753382273.  Google Scholar [11] R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge university press, 2013.   Google Scholar [12] X.-L. Huang, L. Shi and M. Yan, Nonconvex sorted $\ell_1$ minimization for sparse approximation, Journal of the Operations Research Society of China, 3 (2015), 207-229.  doi: 10.1007/s40305-014-0069-4.  Google Scholar [13] G. Li and T. K. Pong, Global convergence of splitting methods for nonconvex composite optimization, SIAM Journal on Optimization, 25 (2015), 2434-2460.  doi: 10.1137/140998135.  Google Scholar [14] H. Li and Z. Lin, Accelerated proximal gradient methods for nonconvex programming, in Advances in Neural Information Processing Systems, 2015, 379-387. Google Scholar [15] Z. Lin, M. Chen and Y. Ma, The augmented lagrange multiplier method for exact recovery of corrupted low-rank matrices. 2010, arXiv preprint arXiv: 1009.5055, (2010), 663-670. Google Scholar [16] G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu and Y. Ma, Robust recovery of subspace structures by low-rank representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 35 (2012), 171-184.   Google Scholar [17] X. Liu, Z. Wen and Y. Zhang, An efficient Gauss-Newton algorithm for symmetric low-rank product matrix approximations, SIAM Journal on Optimization, 25 (2015), 1571-1608.  doi: 10.1137/140971464.  Google Scholar [18] Y. Lou and M. Yan, Fast l1-l2 minimization via a proximal operator, Journal of Scientific Computing, 74 (2018), 767-785.  doi: 10.1007/s10915-017-0463-2.  Google Scholar [19] N. Sha, M. Yan and Y. Lin, Efficient seismic denoising techniques using robust principal component analysis, in SEG Technical Program Expanded Abstracts 2019, Society of Exploration Geophysicists, 2019, 2543-2547. Google Scholar [20] Y. Shen, H. Xu and X. Liu, An alternating minimization method for robust principal component analysis, Optimization Methods and Software, 34 (2019), 1251-1276.  doi: 10.1080/10556788.2018.1496086.  Google Scholar [21] M. Tao and X. Yuan, Recovering low-rank and sparse components of matrices from incomplete and noisy observations, SIAM Journal on Optimization, 21 (2011), 57-81.  doi: 10.1137/100781894.  Google Scholar [22] L. N. Trefethen and D. Bau â…¢, Numerical linear algebra, vol. 50, SIAM, 1997.  Google Scholar [23] F. Wen, R. Ying, P. Liu and T.-K. Truong, Nonconvex regularized robust PCA using the proximal block coordinate descent algorithm, IEEE Transactions on Signal Processing, 67 (2019), 5402-5416.  doi: 10.1109/TSP.2019.2940121.  Google Scholar [24] Z. Wen, W. Yin and Y. Zhang, Solving a low-rank factorization model for matrix completion by a nonlinear successive over-relaxation algorithm, Mathematical Programming Computation, 4 (2012), 333-361.  doi: 10.1007/s12532-012-0044-1.  Google Scholar [25] J. Wright, A. Ganesh, S. Rao, Y. Peng and Y. Ma, Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization, in Advances in Neural Information Processing Systems, 2009, 2080-2088. Google Scholar [26] X. Yuan and J. Yang, Sparse and low-rank matrix decomposition via alternating direction methods, preprint, 12 (2009).  Google Scholar [27] C.-H. Zhang, Nearly unbiased variable selection under minimax concave penalty, The Annals of Statistics, 38 (2010), 894-942.  doi: 10.1214/09-AOS729.  Google Scholar
The relative error to the true low-rank matrix vs the rank $p$ for Shen et al.'s and Alg. 2. Alg. 2 is robust to $p$, as long as $p$ is not smaller than the true rank 25
The contour map of the relative error to ${\bf{L}}^\star$ for different parameters. In this experiment, we set $r = 25$ and $s = 20$. The upper bound of the rank is set to be $p = 30$
The numerical experiment on the 'cameraman' image. (A-C) show that the proposed model performs better than Shen et al.'s both visually and in terms of RE and PSNR. (D) compares the objective values vs time for general SVD, Alg. 1, and Alg. 2. Here $f^\star$ is the value obtained by Alg. 2 with more iterations. It shows the fast speed with the Gauss-Newton approach and acceleration. With the Gauss-Newton approach, the computation time for Alg. 1 is reduced to about 1/7 of the one with standard SVD (from 65.11s to 8.43s). The accelerated Alg. 2 requires 5.2s, though the number of iterations is reduced from 3194 to 360
The numerical experiment on the 'Barbara' image. (A-C) show that the proposed model performs better than Shen et al.'s both visually and in terms of RE and PSNR. (D) compares the objective values vs time for general SVD, Alg. 1, and Alg. 2. Here $f^\star$ is the value obtained by Alg. 2 with more iterations. It shows the fast speed with the Gauss-Newton approach and acceleration. With the Gauss-Newton approach, the computation time for Alg. 1 is reduced to less than 1/3 of the one with standard SVD (from 148.6s to 43.7s). The accelerated Alg. 2 requires 23.3s, though the number of iterations is reduced from 3210 to 300
 Algorithm 1: Proposed Algorithm Input: ${\bf{D}}$, $\mu$, $\lambda$, $p$, $\mathcal{A}$, stepsize $t$, stopping criteria $\epsilon$, maximum number of iterations $Max\_Iter$, initialization ${\bf{L}}^0 = \bf{0}$Output: ${\bf{L}}$, ${\bf{S}}$
 Algorithm 1: Proposed Algorithm Input: ${\bf{D}}$, $\mu$, $\lambda$, $p$, $\mathcal{A}$, stepsize $t$, stopping criteria $\epsilon$, maximum number of iterations $Max\_Iter$, initialization ${\bf{L}}^0 = \bf{0}$Output: ${\bf{L}}$, ${\bf{S}}$
 Algorithm 2: Accelerated algorithm with nonmonotone APG Input:${\bf{D}}$, $\mu$, $\lambda$, $p$, $\mathcal{A}$, stepsize $t$, $\eta \in [0,1)$, $\delta > 0$, stopping criteria $\epsilon$, maximum number of iterations $Max\_Iter$, initialization: ${\bf{L}}^0 = {\bf{L}}^1 = {\bf{Z}}^1 = \textbf{0}$, $t^0 = 0$, $t^1 = q^1=1$, $c^1 = F( {\bf{L}}^1)$Output:${\bf{L}}$, ${\bf{S}}$
 Algorithm 2: Accelerated algorithm with nonmonotone APG Input:${\bf{D}}$, $\mu$, $\lambda$, $p$, $\mathcal{A}$, stepsize $t$, $\eta \in [0,1)$, $\delta > 0$, stopping criteria $\epsilon$, maximum number of iterations $Max\_Iter$, initialization: ${\bf{L}}^0 = {\bf{L}}^1 = {\bf{Z}}^1 = \textbf{0}$, $t^0 = 0$, $t^1 = q^1=1$, $c^1 = F( {\bf{L}}^1)$Output:${\bf{L}}$, ${\bf{S}}$
Comparison of three RPCA algorithms. We compare the relative error of their solutions to the true low-rank matrix and the number of iterations. Both Alg. 1 and Alg. 2 have better performance than [20] in terms of the relative error and the number of iterations. Alg. 2 has the fewest iterations but the relative error could be large. It is because the true low-rank matrix is not the optimal solution to the optimization problem, and the trajectory of the iterations moves close to ${\bf{L}}^\star$ before it approaches the optimal solution
 $r$ s Shen et al.'s [20] Alg. 1 Alg.2 $RE( {\bf{L}}, {\bf{L}}^\star)$ $\#$ iter $RE( {\bf{L}}, {\bf{L}}^\star)$ $\#$ iter $RE( {\bf{L}}, {\bf{L}}^\star)$ $\#$ iter 25 20 0.0745 1318 0.0075 296 0.0075 68 50 20 0.0496 1434 0.0101 473 0.0088 77 25 40 0.0990 2443 0.0635 796 0.0915 187
 $r$ s Shen et al.'s [20] Alg. 1 Alg.2 $RE( {\bf{L}}, {\bf{L}}^\star)$ $\#$ iter $RE( {\bf{L}}, {\bf{L}}^\star)$ $\#$ iter $RE( {\bf{L}}, {\bf{L}}^\star)$ $\#$ iter 25 20 0.0745 1318 0.0075 296 0.0075 68 50 20 0.0496 1434 0.0101 473 0.0088 77 25 40 0.0990 2443 0.0635 796 0.0915 187
Performance of Alg. 2 on low-rank matrix recovery with missing entries. We change the level of sparsity in the sparse noise, standard deviation of the Gaussian noise, and the ratio of missing entries
 {s} {$\sigma$} ratio of missing entries $RE( {\bf{L}}, {\bf{L}}^\star)$ by Alg. 2 20 0.05 10% 0.0079 20 0.05 20% 0.0088 20 0.05 50% 0.0201 5 0.01 50% 0.0015
 {s} {$\sigma$} ratio of missing entries $RE( {\bf{L}}, {\bf{L}}^\star)$ by Alg. 2 20 0.05 10% 0.0079 20 0.05 20% 0.0088 20 0.05 50% 0.0201 5 0.01 50% 0.0015
 [1] Tao Wu, Yu Lei, Jiao Shi, Maoguo Gong. An evolutionary multiobjective method for low-rank and sparse matrix decomposition. Big Data & Information Analytics, 2017, 2 (1) : 23-37. doi: 10.3934/bdia.2017006 [2] Manfred Einsiedler, Elon Lindenstrauss. On measures invariant under diagonalizable actions: the Rank-One case and the general Low-Entropy method. Journal of Modern Dynamics, 2008, 2 (1) : 83-128. doi: 10.3934/jmd.2008.2.83 [3] Lakmi Niwanthi Wadippuli, Ivan Gudoshnikov, Oleg Makarenkov. Global asymptotic stability of nonconvex sweeping processes. Discrete & Continuous Dynamical Systems - B, 2020, 25 (3) : 1129-1139. doi: 10.3934/dcdsb.2019212 [4] Khosro Sayevand, Valeyollah Moradi. A robust computational framework for analyzing fractional dynamical systems. Discrete & Continuous Dynamical Systems - S, 2021  doi: 10.3934/dcdss.2021022 [5] Thomas Alazard. A minicourse on the low Mach number limit. Discrete & Continuous Dynamical Systems - S, 2008, 1 (3) : 365-404. doi: 10.3934/dcdss.2008.1.365 [6] F.J. Herranz, J. de Lucas, C. Sardón. Jacobi--Lie systems: Fundamentals and low-dimensional classification. Conference Publications, 2015, 2015 (special) : 605-614. doi: 10.3934/proc.2015.0605 [7] Reza Lotfi, Yahia Zare Mehrjerdi, Mir Saman Pishvaee, Ahmad Sadeghieh, Gerhard-Wilhelm Weber. A robust optimization model for sustainable and resilient closed-loop supply chain network design considering conditional value at risk. Numerical Algebra, Control & Optimization, 2021, 11 (2) : 221-253. doi: 10.3934/naco.2020023 [8] Jan Prüss, Laurent Pujo-Menjouet, G.F. Webb, Rico Zacher. Analysis of a model for the dynamics of prions. Discrete & Continuous Dynamical Systems - B, 2006, 6 (1) : 225-235. doi: 10.3934/dcdsb.2006.6.225 [9] Mao Okada. Local rigidity of certain actions of solvable groups on the boundaries of rank-one symmetric spaces. Journal of Modern Dynamics, 2021, 17: 111-143. doi: 10.3934/jmd.2021004 [10] Sohana Jahan. Discriminant analysis of regularized multidimensional scaling. Numerical Algebra, Control & Optimization, 2021, 11 (2) : 255-267. doi: 10.3934/naco.2020024 [11] Qiang Guo, Dong Liang. An adaptive wavelet method and its analysis for parabolic equations. Numerical Algebra, Control & Optimization, 2013, 3 (2) : 327-345. doi: 10.3934/naco.2013.3.327 [12] Vieri Benci, Sunra Mosconi, Marco Squassina. Preface: Applications of mathematical analysis to problems in theoretical physics. Discrete & Continuous Dynamical Systems - S, 2021, 14 (5) : i-i. doi: 10.3934/dcdss.2020446 [13] Martial Agueh, Reinhard Illner, Ashlin Richardson. Analysis and simulations of a refined flocking and swarming model of Cucker-Smale type. Kinetic & Related Models, 2011, 4 (1) : 1-16. doi: 10.3934/krm.2011.4.1 [14] Rui Hu, Yuan Yuan. Stability, bifurcation analysis in a neural network model with delay and diffusion. Conference Publications, 2009, 2009 (Special) : 367-376. doi: 10.3934/proc.2009.2009.367 [15] Seung-Yeal Ha, Shi Jin. Local sensitivity analysis for the Cucker-Smale model with random inputs. Kinetic & Related Models, 2018, 11 (4) : 859-889. doi: 10.3934/krm.2018034 [16] Israa Mohammed Khudher, Yahya Ismail Ibrahim, Suhaib Abduljabbar Altamir. Individual biometrics pattern based artificial image analysis techniques. Numerical Algebra, Control & Optimization, 2021  doi: 10.3934/naco.2020056 [17] Dan Wei, Shangjiang Guo. Qualitative analysis of a Lotka-Volterra competition-diffusion-advection system. Discrete & Continuous Dynamical Systems - B, 2021, 26 (5) : 2599-2623. doi: 10.3934/dcdsb.2020197 [18] Hailing Xuan, Xiaoliang Cheng. Numerical analysis and simulation of an adhesive contact problem with damage and long memory. Discrete & Continuous Dynamical Systems - B, 2021, 26 (5) : 2781-2804. doi: 10.3934/dcdsb.2020205 [19] Jiangxing Wang. Convergence analysis of an accurate and efficient method for nonlinear Maxwell's equations. Discrete & Continuous Dynamical Systems - B, 2021, 26 (5) : 2429-2440. doi: 10.3934/dcdsb.2020185 [20] Carlos Fresneda-Portillo, Sergey E. Mikhailov. Analysis of Boundary-Domain Integral Equations to the mixed BVP for a compressible stokes system with variable viscosity. Communications on Pure & Applied Analysis, 2019, 18 (6) : 3059-3088. doi: 10.3934/cpaa.2019137

2019 Impact Factor: 1.373