
Stability of sampling for CUR decompositions

• * Corresponding author: Longxiu Huang
The first author is partially supported by the National Science Foundation TRIPODS program, grant number NSF CCF-1740858. The second author is partially supported by NSF CAREER DMS 1348721 and NSF BIGDATA 1740325.
• Abstract: This article studies how to form CUR decompositions of low-rank matrices, primarily via random sampling, though deterministic methods from previous works are illustrated as well. The central question is to determine when a column submatrix of a rank $k$ matrix also has rank $k$. For random column sampling schemes, there is typically a tradeoff between the number of columns that must be sampled and the cost of computing the sampling probabilities. We discuss several sampling methods and their complexities, as well as the stability of the method under perturbations of both the probabilities and the underlying matrix. As an application, we give a high-probability guarantee of the exact solution of the Subspace Clustering Problem via CUR decompositions when columns are sampled according to their Euclidean lengths.
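As a rough illustration of the length-based sampling mentioned in the abstract, the sketch below (our own illustrative code, not the paper's algorithm; the function name is hypothetical) draws columns and rows with probabilities proportional to their squared Euclidean lengths and forms a CUR approximation with middle factor $U = C^\dagger A R^\dagger$:

```python
import numpy as np

def cur_column_norm_sampling(A, n_cols, n_rows, rng=None):
    """Sketch of a CUR approximation via column/row sampling with
    probabilities proportional to squared Euclidean lengths.
    Illustrative only; not the exact scheme analyzed in the paper."""
    rng = np.random.default_rng(rng)
    # Column probabilities q_i proportional to ||A[:, i]||^2.
    q = np.sum(A**2, axis=0)
    q = q / q.sum()
    # Row probabilities p_i proportional to ||A[i, :]||^2.
    p = np.sum(A**2, axis=1)
    p = p / p.sum()
    # Sample column indices J and row indices I (with replacement).
    J = rng.choice(A.shape[1], size=n_cols, replace=True, p=q)
    I = rng.choice(A.shape[0], size=n_rows, replace=True, p=p)
    C, R = A[:, J], A[I, :]
    # Middle factor U = C^+ A R^+ (pseudoinverses).
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R
```

For an exactly rank-$k$ matrix, $CUR = A$ whenever the sampled columns and rows each have rank $k$, which is the event the sampling bounds in the paper control.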

Mathematics Subject Classification: Primary: 15A23, 65F30; Secondary: 68W20.


• Table 1.  Sampling complexities for different algorithms

| Sampling | # Rows | # Cols | Success Prob | Complexity | Ref |
|---|---|---|---|---|---|
| $p_i^\text{unif}, q_i^\text{unif}$ | $\frac{r}{\varepsilon^4\delta}\log(\frac{r}{\varepsilon^4\delta})$ | $\frac{r}{\varepsilon^4\delta}\log(\frac{r}{\varepsilon^4\delta})$ | $(1-2e^{-c\gamma^2/\delta})^2$ | $O(1)$ | Cor 5.2 |
| $p_i^\text{col}, q_i^\text{col}$ | $\frac{r}{\varepsilon^4\delta}\log(\frac{r}{\varepsilon^4\delta})$ | $\frac{r}{\varepsilon^4\delta}\log(\frac{r}{\varepsilon^4\delta})$ | $(1-2e^{-c/\delta})^2$ | $O(mn)$ | Thm 4.1 |
| $p_i^\text{col}, q_i^\text{col}$ | $r\kappa(A)^2(\log(k)+\frac{1}{\delta})$ | $r\kappa(A)^2(\log(k)+\frac{1}{\delta})$ | $(1-2e^{-c/\delta})^2$ | $O(mn)$ | Cor 6.4 |
| $p_i^{\text{lev}, k}, q_i^{\text{lev}, k}$ | $\frac{k^2}{\varepsilon^2\delta}$ | $\frac{k^2}{\varepsilon^2\delta}$ | $1-e^{-1/\delta}$ | $O(\text{SVD}(A, k))$ | [12] |
| $p_i^{\text{lev}, k}, q_i^{\text{lev}, k}$ | $k\log(k)+\frac{k}{\delta}$ | $k\log(k)+\frac{k}{\delta}$ | $(1-2e^{-1/\delta})^2$ | $O(\text{SVD}(A, k))$ | [33] |
| DEIM-CUR | $k$ | $k$ | 1 | $O(\text{SVD}(A, k) + k^4)$ | [28] |
| RRQR-CUR | $k$ | $k$ | 1 | $O(\text{RRQR}(A))$ | [32] |
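The leverage-score probabilities $p_i^{\text{lev}, k}$ appearing in the table are standardly computed from the top-$k$ right singular vectors of $A$, which accounts for the $O(\text{SVD}(A, k))$ cost. A minimal sketch (the function name is our own):

```python
import numpy as np

def rank_k_leverage_scores(A, k):
    """Rank-k leverage-score sampling probabilities:
    p_i = ||row i of V_k||^2 / k, where the columns of V_k are the
    top-k right singular vectors of A. Illustrative helper."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    Vk = Vt[:k, :].T                    # n x k matrix of top-k right singular vectors
    scores = np.sum(Vk**2, axis=1) / k  # nonnegative, sums to 1
    return scores
```

Since the columns of $V_k$ are orthonormal, the squared row norms sum to $k$, so dividing by $k$ yields a probability distribution over the columns of $A$.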
[1] A. Aldroubi, K. Hamm, A. B. Koku and A. Sekmen, CUR decompositions, similarity matrices, and subspace clustering, Frontiers in Applied Mathematics and Statistics, 4 (2019), 65. doi: 10.3389/fams.2018.00065.
[2] A. Aldroubi, A. Sekmen, A. B. Koku and A. F. Cakmak, Similarity matrix framework for data from union of subspaces, Applied and Computational Harmonic Analysis, 45 (2018), 425-435. doi: 10.1016/j.acha.2017.08.006.
[3] R. Basri and D. Jacobs, Lambertian reflectance and linear subspaces, IEEE Transactions on Pattern Analysis and Machine Intelligence, 25 (2003), 218-233. doi: 10.1109/ICCV.2001.937651.
[4] C. Boutsidis and D. P. Woodruff, Optimal CUR matrix decompositions, SIAM Journal on Computing, 46 (2017), 543-589. doi: 10.1137/140977898.
[5] S. Chaturantabut and D. C. Sorensen, Nonlinear model reduction via discrete empirical interpolation, SIAM Journal on Scientific Computing, 32 (2010), 2737-2764. doi: 10.1137/090766498.
[6] J. Chiu and L. Demanet, Sublinear randomized algorithms for skeleton decompositions, SIAM Journal on Matrix Analysis and Applications, 34 (2013), 1361-1383. doi: 10.1137/110852310.
[7] J. P. Costeira and T. Kanade, A multibody factorization method for independently moving objects, International Journal of Computer Vision, 29 (1998), 159-179. doi: 10.1023/A:1008000628999.
[8] S. Demko, Condition numbers of rectangular systems and bounds for generalized inverses, Linear Algebra and its Applications, 78 (1986), 199-206. doi: 10.1016/0024-3795(86)90024-8.
[9] J. Dongarra and F. Sullivan, Guest editors' introduction to the top 10 algorithms, Computing in Science & Engineering, 2 (2000), 22-23. doi: 10.1109/MCISE.2000.814652.
[10] P. Drineas, R. Kannan and M. W. Mahoney, Fast Monte Carlo algorithms for matrices III: Computing a compressed approximate matrix decomposition, SIAM Journal on Computing, 36 (2006), 184-206. doi: 10.1137/S0097539704442702.
[11] P. Drineas and M. W. Mahoney, On the Nyström method for approximating a Gram matrix for improved kernel-based learning, Journal of Machine Learning Research, 6 (2005), 2153-2175, http://www.jmlr.org/papers/volume6/drineas05a/drineas05a.pdf.
[12] P. Drineas, M. W. Mahoney and S. Muthukrishnan, Relative-error CUR matrix decompositions, SIAM Journal on Matrix Analysis and Applications, 30 (2008), 844-881. doi: 10.1137/07070471X.
[13] E. Elhamifar and R. Vidal, Sparse subspace clustering: Algorithm, theory, and applications, IEEE Transactions on Pattern Analysis and Machine Intelligence, 35 (2013), 2765-2781, https://ieeexplore.ieee.org/document/6482137. doi: 10.1109/TPAMI.2013.57.
[14] A. Gittens, The spectral norm error of the naive Nystrom extension, preprint, arXiv:1110.5305.
[15] A. Gittens and M. W. Mahoney, Revisiting the Nyström method for improved large-scale machine learning, The Journal of Machine Learning Research, 17 (2016), Paper No. 117, 65 pp, https://dl.acm.org/doi/abs/10.5555/2946645.3007070.
[16] G. H. Golub and C. F. van Loan, Matrix Computations, Fourth edition, Johns Hopkins Studies in the Mathematical Sciences, Johns Hopkins University Press, Baltimore, MD, 2013.
[17] L. Guttman, Enlargement methods for computing the inverse matrix, The Annals of Mathematical Statistics, 17 (1946), 336-343. doi: 10.1214/aoms/1177730946.
[18] R. Hadani and A. Singer, Representation theoretic patterns in three dimensional cryo-electron microscopy I: The intrinsic reconstitution algorithm, Annals of Mathematics (2), 174 (2011), 1219-1241. doi: 10.4007/annals.2011.174.2.11.
[19] K. Hamm and L. X. Huang, Perturbations of CUR decompositions, preprint, arXiv:1908.08101.
[20] K. Hamm and L. X. Huang, Perspectives on CUR decompositions, Applied and Computational Harmonic Analysis, 48 (2020), 1088-1099. doi: 10.1016/j.acha.2019.08.006.
[21] R. Kannan and S. Vempala, Randomized algorithms in numerical linear algebra, Acta Numerica, 26 (2017), 95-135. doi: 10.1017/S0962492917000058.
[22] G. C. Liu, Z. C. Lin, S. C. Yan, J. Sun, Y. Yu and Y. Ma, Robust recovery of subspace structures by low-rank representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 35 (2012), 171-184, https://ieeexplore.ieee.org/document/6180173. doi: 10.1109/TPAMI.2012.88.
[23] M. W. Mahoney and P. Drineas, CUR matrix decompositions for improved data analysis, Proc. Natl. Acad. Sci. USA, 106 (2009), 697-702. doi: 10.1073/pnas.0803205106.
[24] R. Penrose, On best approximate solutions of linear matrix equations, Proc. Cambridge Philos. Soc., 52 (1956), 17-19. doi: 10.1017/S0305004100030929.
[25] F. Pourkamali-Anaraki and S. Becker, Improved fixed-rank Nyström approximation via QR decomposition: Practical and theoretical aspects, Neurocomputing, 363 (2019), 261-272. doi: 10.1016/j.neucom.2019.06.070.
[26] M. Rudelson, personal communication, 2019.
[27] M. Rudelson and R. Vershynin, Sampling from large matrices: An approach through geometric functional analysis, Journal of the ACM, 54 (2007), Art. 21, 19 pp. doi: 10.1145/1255443.1255449.
[28] D. C. Sorensen and M. Embree, A DEIM induced CUR factorization, SIAM Journal on Scientific Computing, 38 (2016), A1454-A1482. doi: 10.1137/140978430.
[29] J. A. Tropp, Column subset selection, matrix factorization, and eigenvalue optimization, Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, Philadelphia, PA, (2009), 978-986. doi: 10.1137/1.9781611973068.106.
[30] M. Udell and A. Townsend, Why are big data matrices approximately low rank?, SIAM Journal on Mathematics of Data Science, 1 (2019), 144-160. doi: 10.1137/18M1183480.
[31] R. Vidal, Subspace clustering, IEEE Signal Processing Magazine, 28 (2011), 52-68, https://ieeexplore.ieee.org/document/5714408. doi: 10.1109/MSP.2010.939739.
[32] S. Voronin and P.-G. Martinsson, Efficient algorithms for CUR and interpolative matrix decompositions, Advances in Computational Mathematics, 43 (2017), 495-516. doi: 10.1007/s10444-016-9494-8.
[33] T. Yang, L. Zhang, R. Jin and S. Zhu, An explicit sampling dependent spectral error bound for column subset selection, International Conference on Machine Learning, (2015), 135-143, http://proceedings.mlr.press/v37/yanga15.pdf.
