June  2021, 3(2): 225-249. doi: 10.3934/fods.2021015

Spectral clustering revisited: Information hidden in the Fiedler vector

1. 

Committee on Computational and Applied Mathematics, The University of Chicago, Chicago, IL 60637, USA

2. 

Department of Mathematics, University of Washington, Seattle, WA 98195, USA

* Corresponding author: Stefan Steinerberger

Received  September 2020 Revised  May 2021 Published  June 2021 Early access  June 2021

Fund Project: AD was supported by the 2019-2020 Yale University Emerging Scholars Initiative Post-Baccalaureate Research Education Program (ESI PREP). SS was supported by the NSF (DMS-2123224) and the Alfred P. Sloan Foundation

We study the clustering problem on graphs: it is known that if there are two underlying clusters, then the signs of the eigenvector corresponding to the second largest eigenvalue of the adjacency matrix can reliably reconstruct the two clusters. We argue that the vertices for which the eigenvector has the largest and the smallest entries, respectively, are unusually strongly connected to their own cluster and more reliably classified than the rest. This can be regarded as a discrete version of the Hot Spots conjecture and should be a useful heuristic for evaluating 'strongly clustered' versus 'liminal' nodes in applications. We give a rigorous proof for the stochastic block model and discuss several explicit examples.

Citation: Adela DePavia, Stefan Steinerberger. Spectral clustering revisited: Information hidden in the Fiedler vector. Foundations of Data Science, 2021, 3 (2) : 225-249. doi: 10.3934/fods.2021015
References:
[1]

E. Abbe, Community detection and stochastic block models: Recent developments, J. Mach. Learn. Res., 18 (2017), 86pp.  Google Scholar

[2]

E. AbbeA. S. Bandeira and G. Hall, Exact recovery in the stochastic block model, IEEE Trans. Inform. Theory, 62 (2016), 471-487.  doi: 10.1109/TIT.2015.2490670.  Google Scholar

[3]

E. AbbeJ. FanK. Wang and Y. Zhong, Entrywise eigenvector analysis of random matrices with low expected rank, Ann. Statist., 48 (2020), 1452-1474.  doi: 10.1214/19-AOS1854.  Google Scholar

[4]

E. Abbe and C. Sandon, Proof of the achievability conjectures for the general stochastic block model, Comm. Pure Appl. Math., 71 (2018), 1334-1406.  doi: 10.1002/cpa.21719.  Google Scholar

[5]

R. Andersen and K. Lang, Communities from seed sets, in Proceedings of the 15th International Conference on World Wide Web, 2006, 223–232. doi: 10.1145/1135777.1135814.  Google Scholar

[6]

A. S. Bandeira, Random Laplacian matrices and convex relaxations, Found. Comput. Math., 18 (2018), 345-379.  doi: 10.1007/s10208-016-9341-9.  Google Scholar

[7]

R. Bañuelos and K. Burdzy, On the "hot spots" conjecture of J. Rauch, J. Funct. Anal., 164 (1999), 1-33.  doi: 10.1006/jfan.1999.3397.  Google Scholar

[8]

M. Belkin and P. Niyogi, Laplacian eigenmaps and spectral techniques for embedding and clustering, Adv. Neural Info. Processing Systems, (2002), 585–591. Available from: https://papers.nips.cc/paper/2001/file/f106b7f99d2cb30c3db1c3cc0fde9ccb-Paper.pdf. Google Scholar

[9] A. BlumJ. Hopcroft and R. Kannan, Foundations of Data Science, Cambridge University Press, 2020.  doi: 10.1017/9781108755528.  Google Scholar
[10]

R. B. Boppana, Eigenvalues and graph bisection: An average-case analysis, 28th Annual Symposium on Foundations of Computer Science, Los Angeles, CA, 1987. doi: 10.1109/SFCS. 1987.22.  Google Scholar

[11]

K. Burdzy, The hot spots problem in planar domains with one hole, Duke Math. J., 129 (2005), 481-502.  doi: 10.1215/S0012-7094-05-12932-5.  Google Scholar

[12]

K. Burdzy and W. Werner, A counterexample to the "hot spots" conjecture, Ann. of Math. (2), 149 (1999), 309-317.  doi: 10.2307/121027.  Google Scholar

[13]

J. CapeM. Tang and C. E. Priebe, The two-to-infinity norm and singular subspace geometry with applications to high-dimensional statistics, Ann. Statist., 47 (2019), 2405-2439.  doi: 10.1214/18-AOS1752.  Google Scholar

[14] J. Cheeger, A lower bound for the smallest eigenvalue of the Laplacian, in Problems in Analysis, Princeton Univ. Press, Princeton, NJ, 1970.  doi: 10.1515/9781400869312-013.  Google Scholar
[15]

X. ChengG. Mishne and S. Steinerberger, The Geometry of nodal sets and outlier detection, J. Number Theory, 185 (2018), 48-64.  doi: 10.1016/j.jnt.2017.09.021.  Google Scholar

[16]

X. ChengM. Rachh and S. Steinerberger, On the diffusion geometry of graph Laplacians and applications, Appl. Comput. Harmon. Anal., 46 (2019), 674-688.  doi: 10.1016/j.acha.2018.04.001.  Google Scholar

[17]

F. R. K. Chung, Spectral graph theory, CBMS Regional Conference Series in Mathematics, 92, American Mathematical Society, Providence, RI, 1997.  Google Scholar

[18]

M. K. Chung, S. Seo, N. Adluru and H. K. Vorperian, Hot spots conjecture and its application to modeling tubular structures, in Machine Learning in Medical Imaging, Lecture Notes in Computer Science, 7009, Springer, 2011, 225–232. doi: 10.1007/978-3-642-24319-6_28.  Google Scholar

[19]

A. Damle and Y. Sun, Uniform bounds for invariant subspace perturbations, SIAM J. Matrix Anal. Appl., 41 (2020), 1208-1236.  doi: 10.1137/19M1262760.  Google Scholar

[20]

C. Davis and W. M. Kahan, The rotation of eigenvectors by a perturbation. III, SIAM J. Numer. Anal., 7 (1970), 1-46.  doi: 10.1137/0707001.  Google Scholar

[21]

W. E. Donath and A. J. Hoffman, Lower bounds for the partitioning of graphs, IBM J. Res. Develop., 17 (1973), 420-425.  doi: 10.1147/rd.175.0420.  Google Scholar

[22]

M. Fiedler, A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory, Czechoslovak Math. J., 25 (1975), 619-633.  doi: 10.21136/CMJ.1975.101357.  Google Scholar

[23]

M. Fiedler, Algebraic connectivity of graphs, Czechoslovak Math. J., 23 (1973), 298-305.   Google Scholar

[24]

M. Fiedler, Laplacian of graphs and algebraic connectivity, in Combinatorics and Graph Theory, Banach Center Publ., 25, PWN, Warsaw, 1989, 57–70.  Google Scholar

[25]

H. Gernandt and J. P. Pade, Schur reduction of trees and extremal entries of the Fiedler vector, Linear Algebra Appl., 570 (2019), 93-122.  doi: 10.1016/j.laa.2019.02.008.  Google Scholar

[26]

D. K. HammondP. Vandergheynst and R. Gribonval, Wavelets on graphs via spectral graph theory, Appl. Comput. Harmon. Anal., 30 (2011), 129-150.  doi: 10.1016/j.acha.2010.04.005.  Google Scholar

[27]

P. W. HollandK. B. Laskey and S. Leinhardt, Stochastic blockmodels: First steps, Social Networks, 5 (1983), 109-137.  doi: 10.1016/0378-8733(83)90021-7.  Google Scholar

[28]

C. Judge and S. Mondal, Euclidean triangles have no hot spots, Ann. of Math. (2), 191 (2020), 167-211.  doi: 10.4007/annals.2020.191.1.3.  Google Scholar

[29]

R. KannanS. Vempala and A. Vetta, On clusterings: Good, bad and spectral, J. ACM, 51 (2004), 497-515.  doi: 10.1145/990308.990313.  Google Scholar

[30]

T. Kato, Perturbation Theory for Linear Operators, Die Grundlehren der mathematischen Wissenschaften, Band 132, Springer-Verlag New York, Inc., New York, 1966.  Google Scholar

[31]

B. Kawohl, Rearrangements and Convexity of Level Sets in PDE, Lecture Notes in Mathematics, 1150, Springer-Verlag, Berlin, 1985. doi: 10.1007/BFb0075060.  Google Scholar

[32]

T. C. Kwok, L. C. Lau, Y. T. Lee, S. Oveis Gharan and L. Trevisan, Improved Cheeger's inequality: Analysis of spectral partitioning algorithms through higher order spectral gap, in STOC'13—Proceedings of the 2013 ACM Symposium on Theory of Computing, ACM, New York, 2013, 11–20. doi: 10.1145/2488608.2488611.  Google Scholar

[33]

R. Lederman and S. Steinerberger, Extreme values of the Fiedler vector on trees, preprint, arXiv: 1912.08327. Google Scholar

[34]

D. A. Levin and Y. Peres, Markov Chains and Mixing Times, American Mathematical Society, Providence, RI, 2017. doi: 10.1090/mbk/107.  Google Scholar

[35]

M. W. MahoneyL. Orecchia and N. K. Vishnoi, A local spectral method for graphs: With applications to improving graph partitions and exploring data graphs locally, J. Mach. Learn. Res., 13 (2012), 2339-2365.   Google Scholar

[36]

F. McSherry, Spectral partitioning of random graphs, 42nd IEEE Symposium on Foundations of Computer Science (Las Vegas, NV, 2001), IEEE Computer Soc., Los Alamitos, CA, 2001, 529–537. doi: 10.1109/SFCS. 2001.959929.  Google Scholar

[37]

M. E. J. Newman, Modularity and community structure in networks, PNAS, 103 (2006), 8577-8582.  doi: 10.1073/pnas.0601602103.  Google Scholar

[38]

M. E. J. Newman and M. Girvan, Finding and evaluating community structure in networks, Phys. Rev. E, 69 (2004). doi: 10.1103/PhysRevE. 69.026113.  Google Scholar

[39]

A. Ng, M. Jordan and Y. Weiss, On spectral clustering: Analysis and an algorithm, NIPS'01: Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and SyntheticJanuary 2001, 849–856. Google Scholar

[40]

A. PerryA. S. WeinA. S. Bandeira and A. Moitra, Message-passing algorithms for synchronization problems over compact groups, Comm. Pure Appl. Math., 71 (2018), 2275-2322.  doi: 10.1002/cpa.21750.  Google Scholar

[41]

M. Rachh and S. Steinerberger, On the location of maxima of solutions of Schroedinger's equation, Comm. Pure Appl. Math., 71 (2018), 1109-1122.  doi: 10.1002/cpa.21753.  Google Scholar

[42]

M. F. Rios, J. Calder and G. Lerman, Algorithms for $\ell_p $-based semi-supervised learning on graphs, preprint, arXiv: 1901.05031. Google Scholar

[43]

K. RoheS. Chatterjee and B. Yu, Spectral clustering and the high-dimensional stochastic blockmodel, Ann. Statist., 39 (2011), 1878-1915.  doi: 10.1214/11-AOS887.  Google Scholar

[44]

J. Shi and J. Malik, Normalized cuts and image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22 (2000), 888-905.  doi: 10.1109/34.868688.  Google Scholar

[45] D. A. Spielman and S.-H. Teng, Spectral partitioning works: Planar graphs and finite element meshes, 37th Annual Symposium on Foundations of Computer Science (Burlington, VT, 1996), IEEE Comput. Soc. Press, Los Alamitos, CA, 1996.  doi: 10.1109/SFCS.1996.548468.  Google Scholar
[46]

S. Steinerberger, Hot spots in convex domains are in the tips (up to an inradius), Comm. Partial Differential Equations, 45 (2020), 641-654.  doi: 10.1080/03605302.2020.1750427.  Google Scholar

[47]

L. Trevisan, Graph Partitioning and Expanders, CS359G Lecture 4, Stanford University, Palo Alto. Google Scholar

[48]

L. Trevisan, Max cut and the smallest eigenvalue, SIAM J. Comput., 41 (2012), 1769-1786.  doi: 10.1137/090773714.  Google Scholar

[49] R. Vershynin, High-Dimensional Probability. An Introduction with Applications in Data Science, Cambridge Series in Statistical and Probabilistic Mathematics, 47, Cambridge University Press, Cambridge, 2018.  doi: 10.1017/9781108231596.  Google Scholar
[50]

U. von Luxburg, A tutorial on spectral clustering, Stat. Comput., 17 (2007), 395-416.  doi: 10.1007/s11222-007-9033-z.  Google Scholar

[51]

X. Zhu, Z. Ghahramani and J. D. Lafferty, Semi-supervised learning using gaussian fields and harmonic functions, in Proceedings of the 20th International Conference on Machine Learning (ICML-03), 2003, 912–919. Google Scholar

show all references

References:
[1]

E. Abbe, Community detection and stochastic block models: Recent developments, J. Mach. Learn. Res., 18 (2017), 86pp.  Google Scholar

[2]

E. AbbeA. S. Bandeira and G. Hall, Exact recovery in the stochastic block model, IEEE Trans. Inform. Theory, 62 (2016), 471-487.  doi: 10.1109/TIT.2015.2490670.  Google Scholar

[3]

E. AbbeJ. FanK. Wang and Y. Zhong, Entrywise eigenvector analysis of random matrices with low expected rank, Ann. Statist., 48 (2020), 1452-1474.  doi: 10.1214/19-AOS1854.  Google Scholar

[4]

E. Abbe and C. Sandon, Proof of the achievability conjectures for the general stochastic block model, Comm. Pure Appl. Math., 71 (2018), 1334-1406.  doi: 10.1002/cpa.21719.  Google Scholar

[5]

R. Andersen and K. Lang, Communities from seed sets, in Proceedings of the 15th International Conference on World Wide Web, 2006, 223–232. doi: 10.1145/1135777.1135814.  Google Scholar

[6]

A. S. Bandeira, Random Laplacian matrices and convex relaxations, Found. Comput. Math., 18 (2018), 345-379.  doi: 10.1007/s10208-016-9341-9.  Google Scholar

[7]

R. Bañuelos and K. Burdzy, On the "hot spots" conjecture of J. Rauch, J. Funct. Anal., 164 (1999), 1-33.  doi: 10.1006/jfan.1999.3397.  Google Scholar

[8]

M. Belkin and P. Niyogi, Laplacian eigenmaps and spectral techniques for embedding and clustering, Adv. Neural Info. Processing Systems, (2002), 585–591. Available from: https://papers.nips.cc/paper/2001/file/f106b7f99d2cb30c3db1c3cc0fde9ccb-Paper.pdf. Google Scholar

[9] A. BlumJ. Hopcroft and R. Kannan, Foundations of Data Science, Cambridge University Press, 2020.  doi: 10.1017/9781108755528.  Google Scholar
[10]

R. B. Boppana, Eigenvalues and graph bisection: An average-case analysis, 28th Annual Symposium on Foundations of Computer Science, Los Angeles, CA, 1987. doi: 10.1109/SFCS. 1987.22.  Google Scholar

[11]

K. Burdzy, The hot spots problem in planar domains with one hole, Duke Math. J., 129 (2005), 481-502.  doi: 10.1215/S0012-7094-05-12932-5.  Google Scholar

[12]

K. Burdzy and W. Werner, A counterexample to the "hot spots" conjecture, Ann. of Math. (2), 149 (1999), 309-317.  doi: 10.2307/121027.  Google Scholar

[13]

J. CapeM. Tang and C. E. Priebe, The two-to-infinity norm and singular subspace geometry with applications to high-dimensional statistics, Ann. Statist., 47 (2019), 2405-2439.  doi: 10.1214/18-AOS1752.  Google Scholar

[14] J. Cheeger, A lower bound for the smallest eigenvalue of the Laplacian, in Problems in Analysis, Princeton Univ. Press, Princeton, NJ, 1970.  doi: 10.1515/9781400869312-013.  Google Scholar
[15]

X. ChengG. Mishne and S. Steinerberger, The Geometry of nodal sets and outlier detection, J. Number Theory, 185 (2018), 48-64.  doi: 10.1016/j.jnt.2017.09.021.  Google Scholar

[16]

X. ChengM. Rachh and S. Steinerberger, On the diffusion geometry of graph Laplacians and applications, Appl. Comput. Harmon. Anal., 46 (2019), 674-688.  doi: 10.1016/j.acha.2018.04.001.  Google Scholar

[17]

F. R. K. Chung, Spectral graph theory, CBMS Regional Conference Series in Mathematics, 92, American Mathematical Society, Providence, RI, 1997.  Google Scholar

[18]

M. K. Chung, S. Seo, N. Adluru and H. K. Vorperian, Hot spots conjecture and its application to modeling tubular structures, in Machine Learning in Medical Imaging, Lecture Notes in Computer Science, 7009, Springer, 2011, 225–232. doi: 10.1007/978-3-642-24319-6_28.  Google Scholar

[19]

A. Damle and Y. Sun, Uniform bounds for invariant subspace perturbations, SIAM J. Matrix Anal. Appl., 41 (2020), 1208-1236.  doi: 10.1137/19M1262760.  Google Scholar

[20]

C. Davis and W. M. Kahan, The rotation of eigenvectors by a perturbation. III, SIAM J. Numer. Anal., 7 (1970), 1-46.  doi: 10.1137/0707001.  Google Scholar

[21]

W. E. Donath and A. J. Hoffman, Lower bounds for the partitioning of graphs, IBM J. Res. Develop., 17 (1973), 420-425.  doi: 10.1147/rd.175.0420.  Google Scholar

[22]

M. Fiedler, A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory, Czechoslovak Math. J., 25 (1975), 619-633.  doi: 10.21136/CMJ.1975.101357.  Google Scholar

[23]

M. Fiedler, Algebraic connectivity of graphs, Czechoslovak Math. J., 23 (1973), 298-305.   Google Scholar

[24]

M. Fiedler, Laplacian of graphs and algebraic connectivity, in Combinatorics and Graph Theory, Banach Center Publ., 25, PWN, Warsaw, 1989, 57–70.  Google Scholar

[25]

H. Gernandt and J. P. Pade, Schur reduction of trees and extremal entries of the Fiedler vector, Linear Algebra Appl., 570 (2019), 93-122.  doi: 10.1016/j.laa.2019.02.008.  Google Scholar

[26]

D. K. HammondP. Vandergheynst and R. Gribonval, Wavelets on graphs via spectral graph theory, Appl. Comput. Harmon. Anal., 30 (2011), 129-150.  doi: 10.1016/j.acha.2010.04.005.  Google Scholar

[27]

P. W. HollandK. B. Laskey and S. Leinhardt, Stochastic blockmodels: First steps, Social Networks, 5 (1983), 109-137.  doi: 10.1016/0378-8733(83)90021-7.  Google Scholar

[28]

C. Judge and S. Mondal, Euclidean triangles have no hot spots, Ann. of Math. (2), 191 (2020), 167-211.  doi: 10.4007/annals.2020.191.1.3.  Google Scholar

[29]

R. KannanS. Vempala and A. Vetta, On clusterings: Good, bad and spectral, J. ACM, 51 (2004), 497-515.  doi: 10.1145/990308.990313.  Google Scholar

[30]

T. Kato, Perturbation Theory for Linear Operators, Die Grundlehren der mathematischen Wissenschaften, Band 132, Springer-Verlag New York, Inc., New York, 1966.  Google Scholar

[31]

B. Kawohl, Rearrangements and Convexity of Level Sets in PDE, Lecture Notes in Mathematics, 1150, Springer-Verlag, Berlin, 1985. doi: 10.1007/BFb0075060.  Google Scholar

[32]

T. C. Kwok, L. C. Lau, Y. T. Lee, S. Oveis Gharan and L. Trevisan, Improved Cheeger's inequality: Analysis of spectral partitioning algorithms through higher order spectral gap, in STOC'13—Proceedings of the 2013 ACM Symposium on Theory of Computing, ACM, New York, 2013, 11–20. doi: 10.1145/2488608.2488611.  Google Scholar

[33]

R. Lederman and S. Steinerberger, Extreme values of the Fiedler vector on trees, preprint, arXiv: 1912.08327. Google Scholar

[34]

D. A. Levin and Y. Peres, Markov Chains and Mixing Times, American Mathematical Society, Providence, RI, 2017. doi: 10.1090/mbk/107.  Google Scholar

[35]

M. W. MahoneyL. Orecchia and N. K. Vishnoi, A local spectral method for graphs: With applications to improving graph partitions and exploring data graphs locally, J. Mach. Learn. Res., 13 (2012), 2339-2365.   Google Scholar

[36]

F. McSherry, Spectral partitioning of random graphs, 42nd IEEE Symposium on Foundations of Computer Science (Las Vegas, NV, 2001), IEEE Computer Soc., Los Alamitos, CA, 2001, 529–537. doi: 10.1109/SFCS. 2001.959929.  Google Scholar

[37]

M. E. J. Newman, Modularity and community structure in networks, PNAS, 103 (2006), 8577-8582.  doi: 10.1073/pnas.0601602103.  Google Scholar

[38]

M. E. J. Newman and M. Girvan, Finding and evaluating community structure in networks, Phys. Rev. E, 69 (2004). doi: 10.1103/PhysRevE. 69.026113.  Google Scholar

[39]

A. Ng, M. Jordan and Y. Weiss, On spectral clustering: Analysis and an algorithm, NIPS'01: Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and SyntheticJanuary 2001, 849–856. Google Scholar

[40]

A. PerryA. S. WeinA. S. Bandeira and A. Moitra, Message-passing algorithms for synchronization problems over compact groups, Comm. Pure Appl. Math., 71 (2018), 2275-2322.  doi: 10.1002/cpa.21750.  Google Scholar

[41]

M. Rachh and S. Steinerberger, On the location of maxima of solutions of Schroedinger's equation, Comm. Pure Appl. Math., 71 (2018), 1109-1122.  doi: 10.1002/cpa.21753.  Google Scholar

[42]

M. F. Rios, J. Calder and G. Lerman, Algorithms for $\ell_p $-based semi-supervised learning on graphs, preprint, arXiv: 1901.05031. Google Scholar

[43]

K. RoheS. Chatterjee and B. Yu, Spectral clustering and the high-dimensional stochastic blockmodel, Ann. Statist., 39 (2011), 1878-1915.  doi: 10.1214/11-AOS887.  Google Scholar

[44]

J. Shi and J. Malik, Normalized cuts and image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22 (2000), 888-905.  doi: 10.1109/34.868688.  Google Scholar

[45] D. A. Spielman and S.-H. Teng, Spectral partitioning works: Planar graphs and finite element meshes, 37th Annual Symposium on Foundations of Computer Science (Burlington, VT, 1996), IEEE Comput. Soc. Press, Los Alamitos, CA, 1996.  doi: 10.1109/SFCS.1996.548468.  Google Scholar
[46]

S. Steinerberger, Hot spots in convex domains are in the tips (up to an inradius), Comm. Partial Differential Equations, 45 (2020), 641-654.  doi: 10.1080/03605302.2020.1750427.  Google Scholar

[47]

L. Trevisan, Graph Partitioning and Expanders, CS359G Lecture 4, Stanford University, Palo Alto. Google Scholar

[48]

L. Trevisan, Max cut and the smallest eigenvalue, SIAM J. Comput., 41 (2012), 1769-1786.  doi: 10.1137/090773714.  Google Scholar

[49] R. Vershynin, High-Dimensional Probability. An Introduction with Applications in Data Science, Cambridge Series in Statistical and Probabilistic Mathematics, 47, Cambridge University Press, Cambridge, 2018.  doi: 10.1017/9781108231596.  Google Scholar
[50]

U. von Luxburg, A tutorial on spectral clustering, Stat. Comput., 17 (2007), 395-416.  doi: 10.1007/s11222-007-9033-z.  Google Scholar

[51]

X. Zhu, Z. Ghahramani and J. D. Lafferty, Semi-supervised learning using gaussian fields and harmonic functions, in Proceedings of the 20th International Conference on Machine Learning (ICML-03), 2003, 912–919. Google Scholar

Figure 1.  A graph with two communities that have strong inter-connectedness but very few edges between them
Figure 2.  A two-dimensional domain with Neumann boundary condition and the second Laplacian eigenfunction
Figure 3.  A stochastic block graph: $n = 500$, $p = 0.05$ and $q = 0.0001$. Two clusters that are barely connected; the cluster on the left has a somewhat more 'ambiguous' region; this ambiguity should be reflected in the size of the entries of the eigenvector ${\bf{v_2}}$
Figure 4.  (top) The deviation from expected in-group affinity ($ c $, defined in Equation 2) for the vertices of a stochastic block model with $ (n,p,q) = (2000, 0.6, 0.4) $. Vertices are plotted in increasing order of the corresponding $ {\bf{v_2}} $ entry. (mid) Values of $ {\bf{v_2}} $ for corresponding vertices, ordered in increasing value. (bottom) Plot showing linear relationship between $ \Delta/\sqrt{n} $ and $ \left|{\bf{v_2}}\right| $, in accordance with the main theorem
Figure 5.  Error rates on subsets of vertices with extremal $ {\bf{v_2}} $ value, compared with the global $ {\bf{v_2}} $ label-estimation error rate. Subsets were chosen by taking the nodes with $ \varepsilon\cdot n $ largest magnitude $ {\bf{v_2}} $ entries, as in Corollary 1. This figure was generated by randomly sampling 500 independent stochastic block models, $ n = 200 $, $ p = 0.55, $ and $ q = 0.45 $
Figure 6.  Visualization of clustering experiments performed using MNIST dataset. Three hundred images of 3's and three hundred images of 8's were chosen at random from the original MNIST dataset. Pixel values were normalized and rounded to take binary values. A graph was constructed, with a vertex corresponding to each image, and an edge between two vertices if one of the vertices was within the 10% nearest neighbors of the other, using Euclidean distance. The vector $ {\bf{v_2}} $ and values of $ c $ (see Equation 3) were calculated for each vertex. The top figure was generated without noise. In the bottom figure, each pixel's binary value was reversed with independent probability $ \rho = 0.5 $, and the same calculations were performed
5], and compare the overall error rate when using the five nodes corresponding to the five largest entries of $ {\bf{v_2}} $ as our seed set (shown in blue), versus five nodes selected uniformly at random from within a single community (shown in orange). Errorbars indicate one standard deviation. Using $ {\bf{v_2}} $ to inform the initial seed set consistently outperforms use of a random seed set. We emphasize that the random seed set is selected from within a single community: this means that not only are $ {\bf{v_2}} $-extremal vertices likely to be in their correct community, as reflected in Figure 5, but that these vertices are "preferable" to others within the same community">Figure 7.  Comparing how $ {\bf{v_2}} $ can inform seed set expansion methods. We replicate the method described in [5], and compare the overall error rate when using the five nodes corresponding to the five largest entries of $ {\bf{v_2}} $ as our seed set (shown in blue), versus five nodes selected uniformly at random from within a single community (shown in orange). Errorbars indicate one standard deviation. Using $ {\bf{v_2}} $ to inform the initial seed set consistently outperforms use of a random seed set. We emphasize that the random seed set is selected from within a single community: this means that not only are $ {\bf{v_2}} $-extremal vertices likely to be in their correct community, as reflected in Figure 5, but that these vertices are "preferable" to others within the same community
[1]

Sarah Constantin, Robert S. Strichartz, Miles Wheeler. Analysis of the Laplacian and spectral operators on the Vicsek set. Communications on Pure & Applied Analysis, 2011, 10 (1) : 1-44. doi: 10.3934/cpaa.2011.10.1

[2]

Monique Dauge, Thomas Ourmières-Bonafos, Nicolas Raymond. Spectral asymptotics of the Dirichlet Laplacian in a conical layer. Communications on Pure & Applied Analysis, 2015, 14 (3) : 1239-1258. doi: 10.3934/cpaa.2015.14.1239

[3]

Antonio Rieser. A topological approach to spectral clustering. Foundations of Data Science, 2021, 3 (1) : 49-66. doi: 10.3934/fods.2021005

[4]

Rui Wang, Rundong Zhao, Emily Ribando-Gros, Jiahui Chen, Yiying Tong, Guo-Wei Wei. HERMES: Persistent spectral graph software. Foundations of Data Science, 2021, 3 (1) : 67-97. doi: 10.3934/fods.2021006

[5]

James B. Kennedy, Jonathan Rohleder. On the hot spots of quantum graphs. Communications on Pure & Applied Analysis, 2021, 20 (9) : 3029-3063. doi: 10.3934/cpaa.2021095

[6]

Xiao-Hui Li, Huo-Jun Ruan. The "hot spots" conjecture on higher dimensional Sierpinski gaskets. Communications on Pure & Applied Analysis, 2016, 15 (1) : 287-297. doi: 10.3934/cpaa.2016.15.287

[7]

Can Huang, Zhimin Zhang. The spectral collocation method for stochastic differential equations. Discrete & Continuous Dynamical Systems - B, 2013, 18 (3) : 667-679. doi: 10.3934/dcdsb.2013.18.667

[8]

Kazuhiro Ishige, Y. Kabeya. Hot spots for the two dimensional heat equation with a rapidly decaying negative potential. Discrete & Continuous Dynamical Systems - S, 2011, 4 (4) : 833-849. doi: 10.3934/dcdss.2011.4.833

[9]

Lu Yang, Guangsheng Wei, Vyacheslav Pivovarchik. Direct and inverse spectral problems for a star graph of Stieltjes strings damped at a pendant vertex. Inverse Problems & Imaging, 2021, 15 (2) : 257-270. doi: 10.3934/ipi.2020063

[10]

Dengfeng Lü, Shuangjie Peng. On the positive vector solutions for nonlinear fractional Laplacian systems with linear coupling. Discrete & Continuous Dynamical Systems, 2017, 37 (6) : 3327-3352. doi: 10.3934/dcds.2017141

[11]

Marta García-Huidobro, Raul Manásevich, J. R. Ward. Vector p-Laplacian like operators, pseudo-eigenvalues, and bifurcation. Discrete & Continuous Dynamical Systems, 2007, 19 (2) : 299-321. doi: 10.3934/dcds.2007.19.299

[12]

Wei Xi Li, Chao Jiang Xu. Subellipticity of some complex vector fields related to the Witten Laplacian. Communications on Pure & Applied Analysis, 2021, 20 (7&8) : 2709-2724. doi: 10.3934/cpaa.2021047

[13]

Blaine Keetch, Yves van Gennip. A Max-Cut approximation using a graph based MBO scheme. Discrete & Continuous Dynamical Systems - B, 2019, 24 (11) : 6091-6139. doi: 10.3934/dcdsb.2019132

[14]

Zhongming Chen, Liqun Qi. Circulant tensors with applications to spectral hypergraph theory and stochastic process. Journal of Industrial & Management Optimization, 2016, 12 (4) : 1227-1247. doi: 10.3934/jimo.2016.12.1227

[15]

Manel Hmimida, Rushed Kanawati. Community detection in multiplex networks: A seed-centric approach. Networks & Heterogeneous Media, 2015, 10 (1) : 71-85. doi: 10.3934/nhm.2015.10.71

[16]

Wenting Cong, Jian-Guo Liu. A degenerate $p$-Laplacian Keller-Segel model. Kinetic & Related Models, 2016, 9 (4) : 687-714. doi: 10.3934/krm.2016012

[17]

Shigehiro Sakata, Yuta Wakasugi. Movement of time-delayed hot spots in Euclidean space for a degenerate initial state. Discrete & Continuous Dynamical Systems, 2020, 40 (5) : 2705-2738. doi: 10.3934/dcds.2020147

[18]

Daniel Guo, John Drake. A global semi-Lagrangian spectral model for the reformulated shallow water equations. Conference Publications, 2003, 2003 (Special) : 375-385. doi: 10.3934/proc.2003.2003.375

[19]

Salvador Cruz-García, Catherine García-Reimbert. On the spectral stability of standing waves of the one-dimensional $M^5$-model. Discrete & Continuous Dynamical Systems - B, 2016, 21 (4) : 1079-1099. doi: 10.3934/dcdsb.2016.21.1079

[20]

O. A. Veliev. Essential spectral singularities and the spectral expansion for the Hill operator. Communications on Pure & Applied Analysis, 2017, 16 (6) : 2227-2251. doi: 10.3934/cpaa.2017110

 Impact Factor: 

Article outline

Figures and Tables

[Back to Top]