High-dimensional data often lie in low-dimensional subspaces rather than filling the whole ambient space. Subspace clustering analyzes data drawn from a union of low-dimensional subspaces and assigns each point to the subspace it belongs to. In this work, we propose a $(k,k)$-sparse matrix factorization method for subspace clustering. In this method, the data themselves serve as the "dictionary", and each data point is represented as a linear combination of the basis of its own cluster within the dictionary. The coefficient matrix is therefore low-rank and sparse, and under an appropriate permutation it is block diagonal, with each block corresponding to one cluster. Assuming that each block is at most $k$-by-$k$, we recover a coefficient matrix that is simultaneously low-rank and $(k,k)$-sparse, which is then used to construct the affinity matrix for spectral clustering. The advantage of the proposed method is that $(k,k)$-sparsity and low-rankness are enforced simultaneously, which better fits the structure of subspace clustering. Numerical results show that the method outperforms SSC and LRR on real-world problems such as face clustering and motion segmentation.
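To make the last stage of this pipeline concrete, here is a minimal Python sketch of turning a self-expressive coefficient matrix into an affinity matrix and feeding it to spectral clustering. The symmetrization $|C| + |C|^\top$ and the use of scikit-learn's `SpectralClustering` are common illustrative choices, not details taken from this paper.

```python
import numpy as np
from sklearn.cluster import SpectralClustering


def cluster_from_coefficients(C, n_clusters):
    """Build an affinity from a self-expressive coefficient matrix and
    run spectral clustering on it.

    C : (n, n) coefficient matrix, ideally low-rank and (k, k)-sparse,
        so that |C| + |C|^T is approximately block diagonal.
    """
    # Symmetrize the coefficients: a common choice of affinity
    # in self-expressive subspace clustering models.
    W = np.abs(C) + np.abs(C).T
    labels = SpectralClustering(
        n_clusters=n_clusters,
        affinity="precomputed",
        assign_labels="kmeans",
        random_state=0,
    ).fit_predict(W)
    return labels
```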
Algorithm 1. ADMM algorithm to solve Model (11): initialize the variables, then repeat the three block updates (steps 2–4) until convergence.
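Model (11) and its closed-form updates are not reproduced in this excerpt, so the following Python sketch only illustrates the structure of Algorithm 1: three alternating block updates per iteration followed by a convergence check. The variable names `Z`, `E`, `Y` and the update callables are placeholders, not the paper's actual update rules.

```python
import numpy as np


def admm_solve(X, update_Z, update_E, update_dual, tol=1e-6, max_iter=200):
    """Structural sketch of an ADMM-style loop: alternately update the
    primal blocks and the multipliers until the iterates stop moving.

    update_Z, update_E, update_dual are placeholder callables standing in
    for the closed-form updates of Model (11), which is not shown here.
    """
    n = X.shape[1]
    Z = np.zeros((n, n))    # coefficient matrix
    E = np.zeros_like(X)    # noise / residual term
    Y = np.zeros_like(X)    # Lagrange multipliers
    for _ in range(max_iter):
        Z_old = Z
        Z = update_Z(X, Z, E, Y)      # step 2: update the first block
        E = update_E(X, Z, E, Y)      # step 3: update the second block
        Y = update_dual(X, Z, E, Y)   # step 4: update the multipliers
        # Stop once successive coefficient iterates are (relatively) close.
        if np.linalg.norm(Z - Z_old) <= tol * max(1.0, np.linalg.norm(Z_old)):
            break
    return Z, E
```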
Algorithm 2. The algorithm for the proximity operator: find indices $r$ and $l$ satisfying
$$\frac{\tilde{h}_{k-r-1}}{\alpha+1} > \frac{\gamma_{r,l}}{l-k+(\alpha+1)(r+1)} \ge \frac{\tilde{h}_{k-r}}{\alpha+1}, \qquad \tilde{u}_l > \frac{\gamma_{r,l}}{l-k+(\alpha+1)(r+1)} \ge \tilde{g}_{l+1},$$
and then set
$$q_i = \begin{cases} \dfrac{\alpha}{\alpha+1}\,\tilde{h}_i & \text{if } i = 1,\dots,k-r-1,\\[4pt] \tilde{h}_i - \dfrac{\gamma_{r,l}}{l-k+(\alpha+1)(r+1)} & \text{if } i = k-r,\dots,l,\\[4pt] 0 & \text{if } i = l+1,\dots,n. \end{cases}$$
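The piecewise definition of $q$ is straightforward to evaluate once $r$, $l$, and $\gamma_{r,l}$ are known. Below is a minimal Python sketch of just that step, assuming $\tilde{h}$ is the sorted working vector of length $n$ and that $r$, $l$, and $\gamma_{r,l}$ come from the (omitted) search in the previous step; all names are illustrative.

```python
import numpy as np


def piecewise_q(h_tilde, alpha, k, r, l, gamma_rl):
    """Evaluate the piecewise vector q from Algorithm 2.

    h_tilde  : sorted working vector of length n (assumed given).
    gamma_rl : stands for gamma_{r,l}; its definition, like the search
               for r and l, is not reproduced in this excerpt.
    Indices follow the paper's 1-based convention.
    """
    h = np.asarray(h_tilde, dtype=float)
    n = h.shape[0]
    shift = gamma_rl / (l - k + (alpha + 1) * (r + 1))
    q = np.zeros(n)
    # i = 1, ..., k - r - 1: shrink by the factor alpha / (alpha + 1).
    q[: k - r - 1] = alpha / (alpha + 1) * h[: k - r - 1]
    # i = k - r, ..., l: subtract the common shift.
    q[k - r - 1 : l] = h[k - r - 1 : l] - shift
    # i = l + 1, ..., n: zero (already zero from the initialization).
    return q
```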
Table 1. The error rate (mean % and median %) for face clustering on the Extended Yale B dataset

| # Classes | mean/median | SSC | LRR | (3, 3)-SMF error | (3, 3)-SMF s | (4, 4)-SMF error | (4, 4)-SMF s |
|---|---|---|---|---|---|---|---|
| 2 | mean | 15.83 | 6.37 | 3.38 | 18 | 3.53 | 18 |
| 2 | median | 15.63 | 6.25 | 2.34 | | 2.34 | |
| 3 | mean | 28.13 | 9.57 | 6.19 | 25 | 6.06 | 25 |
| 3 | median | 28.65 | 8.85 | 5.73 | | 5.73 | |
| 5 | mean | 37.90 | 14.86 | 11.06 | 35 | 10.04 | 35 |
| 5 | median | 38.44 | 14.38 | 9.38 | | 9.06 | |
| 8 | mean | 44.25 | 23.27 | 23.08 | 50 | 22.51 | 50 |
| 8 | median | 44.82 | 21.29 | 27.54 | | 26.06 | |
| 10 | mean | 50.78 | 29.38 | 25.36 | 65 | 23.91 | 65 |
| 10 | median | 49.06 | 32.97 | 27.19 | | 27.34 | |
Table 2. The error rate (mean %/median %) for motion segmentation on Hopkins155 dataset
| | SSC | LRR | (3, 3)-SMF | (4, 4)-SMF |
|---|---|---|---|---|
| Mean | 9.28 | 8.43 | 6.61 | 7.16 |
| Median | 0.24 | 1.54 | 1.20 | 1.32 |
[1] | P. K. Agarwal and N. H. Mustafa, k-means projective clustering, in Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, ACM, 2004, 155–165. doi: 10.1145/1055558.1055581. |
[2] | A. Argyriou, R. Foygel and N. Srebro, Sparse prediction with the k-support norm, in Advances in Neural Information Processing Systems, 2012, 1457–1465. |
[3] | J. Bolte, S. Sabach and M. Teboulle, Proximal alternating linearized minimization for nonconvex and nonsmooth problems, Mathematical Programming, 146 (2014), 459-494. doi: 10.1007/s10107-013-0701-9. |
[4] | V. Chandrasekaran, B. Recht, P. A. Parrilo and A. S. Willsky, The convex geometry of linear inverse problems, Foundations of Computational Mathematics, 12 (2012), 805-849. |
[5] | G. Chen and G. Lerman, Spectral curvature clustering (SCC), International Journal of Computer Vision, 81 (2009), 317-330. doi: 10.1007/s11263-008-0178-9. |
[6] | J. P. Costeira and T. Kanade, A multibody factorization method for independently moving objects, International Journal of Computer Vision, 29 (1998), 159-179. |
[7] | H. Derksen, Y. Ma, W. Hong and J. Wright, Segmentation of multivariate mixed data via lossy coding and compression, in Electronic Imaging 2007, International Society for Optics and Photonics, 2007, 65080H. doi: 10.1117/12.714912. |
[8] | E. Elhamifar and R. Vidal, Sparse subspace clustering: Algorithm, theory, and applications, IEEE Transactions on Pattern Analysis and Machine Intelligence, 35 (2013), 2765-2781. doi: 10.1109/TPAMI.2013.57. |
[9] | P. Favaro, R. Vidal and A. Ravichandran, A closed form solution to robust subspace estimation and clustering, in 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2011, 1801–1807. doi: 10.1109/CVPR.2011.5995365. |
[10] | J. Feng, Z. Lin, H. Xu and S. Yan, Robust subspace segmentation with block-diagonal prior, in IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2014, 3818–3825. doi: 10.1109/CVPR.2014.482. |
[11] | A. Goh and R. Vidal, Segmenting motions of different types by unsupervised manifold clustering, in IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2007, 1–6. doi: 10.1109/CVPR.2007.383235. |
[12] | L. N. Hutchins, S. M. Murphy, P. Singh and J. H. Graber, Position-dependent motif characterization using non-negative matrix factorization, Bioinformatics, 24 (2008), 2684-2690. doi: 10.1093/bioinformatics/btn526. |
[13] | K. Lee, J. Ho and D. Kriegman, Acquiring linear subspaces for face recognition under variable lighting, IEEE Transactions on Pattern Analysis and Machine Intelligence, 27 (2005), 684-698. |
[14] | G. Liu, Z. Lin and Y. Yu, Robust subspace segmentation by low-rank representation, in Proceedings of the 27th international conference on machine learning, 2010, 663–670. |
[15] | L. Lu and R. Vidal, Combined central and subspace clustering for computer vision applications, in Proceedings of the 23rd international conference on Machine learning, ACM, 2006, 593–600. doi: 10.1145/1143844.1143919. |
[16] | B. Nasihatkon and R. Hartley, Graph connectivity in sparse subspace clustering, in 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2011, 2137–2144. doi: 10.1109/CVPR.2011.5995679. |
[17] | A. Y. Ng, M. I. Jordan and Y. Weiss, On spectral clustering: Analysis and an algorithm, Advances in Neural Information Processing Systems, 2 (2002), 849-856. |
[18] | S. Oymak, A. Jalali, M. Fazel, Y. C. Eldar and B. Hassibi, Simultaneously structured models with application to sparse and low-rank matrices, IEEE Transactions on Information Theory, 61 (2015), 2886-2908. doi: 10.1109/TIT.2015.2401574. |
[19] | N. Parikh and S. Boyd, Proximal algorithms, Foundations and Trends in Optimization, 1 (2014), 127-239. doi: 10.1561/2400000003. |
[20] | M. J. D. Powell, On search directions for minimization algorithms, Mathematical Programming, 4 (1973), 193-201. doi: 10.1007/BF01584660. |
[21] | S. R. Rao, R. Tron, R. Vidal and Y. Ma, Motion segmentation via robust subspace separation in the presence of outlying, incomplete, or corrupted trajectories, in IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2008, 1–8. doi: 10.1109/CVPR.2008.4587437. |
[22] | E. Richard, G. R. Obozinski and J. -P. Vert, Tight convex relaxations for sparse matrix factorization, in Advances in Neural Information Processing Systems, 2014, 3284–3292. |
[23] | Y. Sugaya and K. Kanatani, Geometric structure of degeneracy for multi-body motion segmentation, in Statistical Methods in Video Processing, Springer, 2004, 13–25. doi: 10.1007/978-3-540-30212-4_2. |
[24] | M. E. Tipping and C. M. Bishop, Mixtures of probabilistic principal component analyzers, Neural Computation, 11 (1999), 443-482. doi: 10.1162/089976699300016728. |
[25] | R. Vidal, A tutorial on subspace clustering, IEEE Signal Processing Magazine, 28 (2010), 52-68. |
[26] | R. Vidal, Y. Ma and S. Sastry, Generalized Principal Component Analysis (GPCA), Interdisciplinary Applied Mathematics, 40. Springer, New York, 2016. doi: 10.1007/978-0-387-87811-9. |
[27] | Y. -X. Wang, H. Xu and C. Leng, Provable subspace clustering: When LRR meets SSC, in Advances in Neural Information Processing Systems, 2013, 64–72. |
[28] | J. Yan and M. Pollefeys, A general framework for motion segmentation: Independent, articulated, rigid, non-rigid, degenerate and non-degenerate, in Computer Vision–ECCV 2006, Springer, 2006, 94–106. doi: 10.1007/11744085_8. |
[29] | W. I. Zangwill, Nonlinear Programming: A Unified Approach, vol. 196, Prentice-Hall, Englewood Cliffs, NJ, 1969. |
[30] | T. Zhang, A. Szlam, Y. Wang and G. Lerman, Hybrid linear modeling via local best-fit flats, International Journal of Computer Vision, 100 (2012), 217-240. doi: 10.1007/s11263-012-0535-6. |