Side-information-induced reweighted sparse subspace clustering

  • * Corresponding author: Weiwei Wang

The first author is supported by the National Natural Science Foundation of China (Grants 61472303, 61772389).

  • Subspace clustering segments a collection of data drawn from a union of several subspaces into clusters, with each cluster corresponding to one subspace. The geometric information of a dataset reflects its intrinsic structure and can be used to assist the segmentation. In this paper, we propose side-information-induced reweighted sparse subspace clustering (SRSSC) for clustering high-dimensional data. In our method, the geometric information of the high-dimensional data points in a target space serves as side-information that guides the subspace clustering. We solve the resulting model by iterating a reweighted $ \ell_1 $-norm minimization to obtain the self-representation coefficients of the data, and then segment the data within the spectral clustering framework. We compare the proposed algorithm with several state-of-the-art algorithms on synthetic data and three well-known real datasets. The experimental results verify its effectiveness: SRSSC is simple, yet it attains the best mean clustering performance among the compared algorithms in every reported setting.

    Mathematics Subject Classification: Primary: 62H30, 68T10; Secondary: 90C26.
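
The pipeline summarized in the abstract (reweighted $ \ell_1 $ self-representation followed by spectral clustering) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the angle-based `angle_side_weights`, the ISTA solver, the parameter values (`lam`, `eps`, `n_rounds`, `floor`), and all function names are hypothetical choices made for this sketch, and the paper's actual objective and side-information weighting may differ.

```python
import numpy as np


def angle_side_weights(X, floor=0.1):
    """Hypothetical side-information: penalise pairs with a large angle more.

    X holds one data point per column. The weight is 1 - |cos(angle)|,
    clipped from below by `floor` (an assumption of this sketch).
    """
    Xn = X / (np.linalg.norm(X, axis=0, keepdims=True) + 1e-12)
    return np.maximum(1.0 - np.abs(Xn.T @ Xn), floor)


def soft_threshold(Z, T):
    """Entrywise soft-thresholding, the proximal map of a weighted l1 norm."""
    return np.sign(Z) * np.maximum(np.abs(Z) - T, 0.0)


def weighted_sparse_codes(X, W, lam=0.05, n_steps=300):
    """ISTA for min_C 0.5*||X - X C||_F^2 + lam * sum_ij W_ij |C_ij|, diag(C) = 0."""
    n = X.shape[1]
    G = X.T @ X
    step = 1.0 / np.linalg.norm(G, 2)  # 1 / Lipschitz constant of the gradient
    C = np.zeros((n, n))
    for _ in range(n_steps):
        grad = G @ C - G  # gradient of the quadratic data-fit term
        C = soft_threshold(C - step * grad, step * lam * W)
        np.fill_diagonal(C, 0.0)  # a point may not represent itself
    return C


def reweighted_sparse_affinity(X, side_weights, n_rounds=3, eps=1e-2, lam=0.05):
    """Iterate weighted l1 minimisation; small coefficients get larger penalties."""
    W = side_weights.copy()
    for _ in range(n_rounds):
        C = weighted_sparse_codes(X, W, lam)
        W = side_weights / (np.abs(C) + eps)
    return np.abs(C) + np.abs(C).T  # symmetric similarity matrix


def spectral_labels(A, k, n_iter=50):
    """Normalised spectral embedding followed by a basic k-means."""
    d = A.sum(axis=1) + 1e-12
    L = np.eye(len(A)) - A / np.sqrt(np.outer(d, d))
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    U = vecs[:, :k]
    U = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)
    # Farthest-point initialisation, then Lloyd iterations.
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((U[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = np.argmin(sq, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels
```

With a data matrix `X` whose columns are the points, a typical call sequence would be `A = reweighted_sparse_affinity(X, angle_side_weights(X))` followed by `spectral_labels(A, k)`. The reweighting step `side_weights / (|C| + eps)` is the usual device for pushing small coefficients to zero while keeping large ones lightly penalised.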


  • Figure 1.  Illustration of a simple instance of subspace clustering

    Figure 2.  Angles of pairs of data in the databases

    Figure 3.  Convergence of SRSSC on the Extended Yale B dataset

    Figure 4.  Visualization of the similarity matrices that were obtained by the different methods

    Figure 5.  Some sample images from the Extended Yale B database

    Figure 6.  The average computation times of the different algorithms using the Yale B dataset

    Figure 7.  Some sample images from the COIL 20 database (top) and the images that belong to the same object (bottom)

    Figure 8.  Some sample images from the USPS database

    Table 1.  Performance comparison of the different algorithms using the synthetic data

    Accuracy  LSR     SMR     LRR     SSC     StrSSC  RSSC    Ours
    Mean      93.57%  94.35%  92.46%  94.04%  94.08%  96.60%  97.00%
    Median    92.85%  94.07%  92.13%  93.60%  93.70%  96.10%  96.90%

    Table 2.  The clustering errors (%) of some different algorithms using the Extended Yale B dataset

    No. of subjects  Statistic  LSR    SMR    LRR    SSC    StrSSC  RSSC   Ours
    2                Mean       7.35   1.75   2.13   1.87   1.23    0.57   0.48
    2                Median     7.03   0.78   0.78   0      0       0      0
    3                Mean       9.92   3.03   3.49   3.29   2.84    1.08   0.78
    3                Median     10.41  2.08   2.08   0.52   0.52    0      0.52
    4                Mean       13.66  3.25   4.86   3.80   2.95    1.65   1.07
    4                Median     14.07  2.34   3.91   1.95   0.78    0.39   0.39
    5                Mean       17.56  3.91   5.92   4.33   3.31    2.21   1.40
    5                Median     17.81  2.50   4.99   2.50   1.25    0.62   0.62
    6                Mean       20.95  5.28   6.83   4.87   3.73    2.79   1.68
    6                Median     21.07  2.86   5.99   3.39   2.08    1.30   1.04
    7                Mean       24.31  6.38   7.75   5.40   4.01    3.43   2.03
    7                Median     24.10  3.13   7.14   4.46   2.68    1.79   1.34
    8                Mean       27.52  6.83   11.05  5.92   4.38    3.98   2.59
    8                Median     27.83  3.71   7.42   4.69   3.13    1.86   1.56
    9                Mean       31.01  7.14   10.32  6.46   4.56    4.55   2.84
    9                Median     31.42  4.51   7.81   4.77   3.65    2.43   1.65
    10               Mean       33.49  7.81   16.95  7.40   4.74    4.90   3.33
    10               Median     32.81  7.03   18.91  5.63   4.22    3.59   2.03

    Table 3.  The clustering errors (%) of some comparative algorithms using the COIL 20 dataset

    No. of subjects  Statistic      LSR    SMR    LRR    SSC    StrSSC  RSSC   Ours
    2                Mean           15.05  13.75  14.86  8.13   0.76    1.35   0.43
    2                Median         13.91  10.42  13.07  0      0       0      0
    3                Mean           22.16  21.97  21.81  10.87  1.86    1.37   0.72
    3                Median         20.22  19.44  19.65  1.85   0       0      0
    20               Mean (Median)  25.49  24.72  24.67  20.07  15.73   16.32  7.78

    Table 4.  The clustering errors (%) of some comparative algorithms using the USPS dataset

    No. of subjects  Statistic      LSR    SMR    LRR    SSC    StrSSC  RSSC   Ours
    2                Mean           15.49  15.25  14.76  11.80  11.48   10.90  10.80
    2                Median         13.91  13.42  13.07  9.00   8.50    9.50   7.00
    3                Mean           29.13  28.91  28.81  27.59  27.88   27.44  26.67
    3                Median         27.02  16.34  26.37  21.83  25.67   23.82  21.33
    10               Mean (Median)  46.25  45.97  45.53  44.89  44.25   44.03  43.95
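
The tables above report clustering errors in percent. The text here does not define the metric, so the following is only a sketch of one common convention in the subspace clustering literature, assumed rather than taken from the paper: predicted cluster labels are matched to ground-truth labels by the best one-to-one assignment, and the error is the percentage of points that remain mismatched. The function name `clustering_error` is hypothetical; the matching uses SciPy's `linear_sum_assignment`.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def clustering_error(true_labels, pred_labels):
    """Percentage of misclustered points under the best label matching.

    Assumed definition (not stated in the source): cluster labels are
    arbitrary, so predicted clusters are first matched one-to-one to
    true clusters so that the number of agreeing points is maximal.
    """
    true_labels = np.asarray(true_labels)
    pred_labels = np.asarray(pred_labels)
    true_ids, t = np.unique(true_labels, return_inverse=True)
    pred_ids, p = np.unique(pred_labels, return_inverse=True)

    # contingency[i, j] = number of points in true cluster i and predicted cluster j
    contingency = np.zeros((len(true_ids), len(pred_ids)), dtype=int)
    np.add.at(contingency, (t, p), 1)

    # Optimal one-to-one matching that maximises the number of agreeing points.
    rows, cols = linear_sum_assignment(contingency, maximize=True)
    matched = contingency[rows, cols].sum()
    return 100.0 * (1.0 - matched / len(true_labels))
```

For example, `clustering_error([0, 0, 1, 1], [1, 1, 0, 0])` is zero under this definition, because the two labelings describe the same partition with the cluster names swapped.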
