
-
Previous Article
Randomized learning of the second-moment matrix of a smooth function
- FoDS Home
- This Issue
-
Next Article
Modelling dynamic network evolution as a Pitman-Yor process
Power weighted shortest paths for clustering Euclidean data
1. | Department of Mathematics, University of California, Los Angeles, Los Angeles CA 90095, USA |
2. | Department of Mathematics, University of Michigan, Ann Arbor MI 48109, USA |
We study the use of power weighted shortest path metrics for clustering high dimensional Euclidean data, under the assumption that the data is drawn from a collection of disjoint low dimensional manifolds. We argue, theoretically and experimentally, that this leads to higher clustering accuracy. We also present a fast algorithm for computing these distances.
References:
[1] |
M. Alamgir and U. Von Luxburg, Shortest path distance in random k-nearest neighbor graphs, arXiv preprint, arXiv: 1206.6381. Google Scholar |
[2] |
A. Aldroubi, K. Hamm, A. Koku and A. Sekmen, Cur decompositions, similarity matrices, and subspace clustering, Front. Appl. Math. Stat., 4 (2019), p65.
doi: 10.3389/fams.2018.00065. |
[3] |
E. Arias-Castro,
Clustering based on pairwise distances when the data is of mixed dimensions, IEEE Transactions on Information Theory, 57 (2011), 1692-1706.
doi: 10.1109/TIT.2011.2104630. |
[4] |
R. Basri and D. Jacobs, Lambertian reflectance and linear subspaces, IEEE Transactions on Pattern Analysis & Machine Intelligence, 2002,218–233.
doi: 10.1109/ICCV.2001.937651. |
[5] |
J. L. Bentley,
Multidimensional binary search trees used for associative searching, Communications of the ACM, 18 (1975), 509-517.
doi: 10.1145/361002.361007. |
[6] |
A. Beygelzimer, S. Kakade and J. Langford, Cover trees for nearest neighbor, in Proceedings of the 23rd International Conference on Machine Learning, ACM, 2006, 97–104.
doi: 10.1145/1143844.1143857. |
[7] |
A. Bijral, N. Ratliff and N. Srebro, Semi-supervised learning with density based distances, in Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, AUAI Press, 2011, 43–50. Google Scholar |
[8] |
H. Chang and D.-Y. Yeung,
Robust path-based spectral clustering, Pattern Recognition, 41 (2008), 191-203.
doi: 10.1016/j.patcog.2007.04.010. |
[9] |
T. Chu, G. Miller and D. Sheehy, Exploration of a graph-based density sensitive metric, arXiv preprint, arXiv: 1709.07797. Google Scholar |
[10] |
R. Coifman and S. Lafon,
Diffusion maps, Applied and Computational Harmonic Analysis, 21 (2006), 5-30.
doi: 10.1016/j.acha.2006.04.006. |
[11] |
T. Cormen, C. Leiserson, R. Rivest and C. Stein, Introduction to Algorithms, MIT press, 2009.
![]() |
[12] |
J. Costeira and T. Kanade, A multibody factorization method for independently moving objects, International Journal of Computer Vision, 29 (1998), 159-179. Google Scholar |
[13] |
K. Diaz-Chito, A. Hernández-Sabaté and A. López,
A reduced feature set for driver head pose estimation, Applied Soft Computing, 45 (2016), 98-107.
doi: 10.1016/j.asoc.2016.04.027. |
[14] |
D. Dua and C. Graff, UCImachine learning repository, 2017, http://archive.ics.uci.edu/ml. Google Scholar |
[15] |
C. Fefferman, S. Mitter and H. Narayanan,
Testing the manifold hypothesis, Journal of the American Mathematical Society, 29 (2016), 983-1049.
doi: 10.1090/jams/852. |
[16] |
B. Fischer and J. Buhmann,
Path-based clustering for grouping of smooth curves and texture segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 25 (2003), 513-518.
doi: 10.1109/TPAMI.2003.1190577. |
[17] |
S. Har-Peled, Computing the k nearest-neighbors for all vertices via dijkstra, arXiv preprint, arXiv: 1607.07818. Google Scholar |
[18] |
J. Ho, M.-H. Yang, J. Lim, K.-C. Lee and D. Kriegman, Clustering appearances of objects under varying illumination conditions, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003, 11–18.
doi: 10.1109/CVPR.2003.1211332. |
[19] |
C. D. Howard and C. Newman,
Geodesics and spanning trees for euclidean first-passage percolation, Annals of Probability, 29 (2001), 577-623.
doi: 10.1214/aop/1008956685. |
[20] |
S. Hwang, S. Damelin and A. Hero Ⅲ,
Shortest path through random points, The Annals of Applied Probability, 26 (2016), 2791-2823.
doi: 10.1214/15-AAP1162. |
[21] |
M. Jacobs, E. Merkurjev and S. Esedoḡlu,
Auction dynamics: A volume constrained mbo scheme, Journal of Computational Physics, 354 (2018), 288-310.
doi: 10.1016/j.jcp.2017.10.036. |
[22] |
A. Little, M. Maggioni and J. Murphy, Path-based spectral clustering: Guarantees, robustness to outliers, and fast algorithms, arXiv preprint, arXiv: 1712.06206. Google Scholar |
[23] |
A. Moscovich, A. Jaffe and B. Nadler, Minimax-optimal semi-supervised regression on unknown manifolds, in Artificial Intelligence and Statistics, 2017,933–942. Google Scholar |
[24] |
S. Nene, S. Nayar, H. Murase et al., Columbia object image library (coil-20). Google Scholar |
[25] |
A. Ng, M. Jordan and Y. Weiss, On spectral clustering: Analysis and an algorithm, in Advances in Neural Information Processing Systems, 2002,849–856. Google Scholar |
[26] |
A. Orlitsky and Sajama, Estimating and computing density based distance metrics, in Proceedings of the 22nd International Conference on Machine Learning, ACM, 2005,760–767. Google Scholar |
[27] |
J. Tenenbaum, V. De Silva and J. Langford,
A global geometric framework for nonlinear dimensionality reduction, Science, 290 (2000), 2319-2323.
doi: 10.1126/science.290.5500.2319. |
[28] |
P. Vincent and Y. Bengio, Density-sensitive Metrics and Kernels, Snowbird Learning Workshop, 2003. Google Scholar |
[29] |
K. Yin and X.-C. Tai,
An effective region force for some variational models for learning and clustering, Journal of Scientific Computing, 74 (2018), 175-196.
doi: 10.1007/s10915-017-0429-4. |
[30] |
L. Zelnik-Manor and P. Perona, Self-tuning spectral clustering, in Advances in Neural Information Processing Systems, 2005, 1601–1608. Google Scholar |
show all references
References:
[1] |
M. Alamgir and U. Von Luxburg, Shortest path distance in random k-nearest neighbor graphs, arXiv preprint, arXiv: 1206.6381. Google Scholar |
[2] |
A. Aldroubi, K. Hamm, A. Koku and A. Sekmen, Cur decompositions, similarity matrices, and subspace clustering, Front. Appl. Math. Stat., 4 (2019), p65.
doi: 10.3389/fams.2018.00065. |
[3] |
E. Arias-Castro,
Clustering based on pairwise distances when the data is of mixed dimensions, IEEE Transactions on Information Theory, 57 (2011), 1692-1706.
doi: 10.1109/TIT.2011.2104630. |
[4] |
R. Basri and D. Jacobs, Lambertian reflectance and linear subspaces, IEEE Transactions on Pattern Analysis & Machine Intelligence, 2002,218–233.
doi: 10.1109/ICCV.2001.937651. |
[5] |
J. L. Bentley,
Multidimensional binary search trees used for associative searching, Communications of the ACM, 18 (1975), 509-517.
doi: 10.1145/361002.361007. |
[6] |
A. Beygelzimer, S. Kakade and J. Langford, Cover trees for nearest neighbor, in Proceedings of the 23rd International Conference on Machine Learning, ACM, 2006, 97–104.
doi: 10.1145/1143844.1143857. |
[7] |
A. Bijral, N. Ratliff and N. Srebro, Semi-supervised learning with density based distances, in Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, AUAI Press, 2011, 43–50. Google Scholar |
[8] |
H. Chang and D.-Y. Yeung,
Robust path-based spectral clustering, Pattern Recognition, 41 (2008), 191-203.
doi: 10.1016/j.patcog.2007.04.010. |
[9] |
T. Chu, G. Miller and D. Sheehy, Exploration of a graph-based density sensitive metric, arXiv preprint, arXiv: 1709.07797. Google Scholar |
[10] |
R. Coifman and S. Lafon,
Diffusion maps, Applied and Computational Harmonic Analysis, 21 (2006), 5-30.
doi: 10.1016/j.acha.2006.04.006. |
[11] |
T. Cormen, C. Leiserson, R. Rivest and C. Stein, Introduction to Algorithms, MIT press, 2009.
![]() |
[12] |
J. Costeira and T. Kanade, A multibody factorization method for independently moving objects, International Journal of Computer Vision, 29 (1998), 159-179. Google Scholar |
[13] |
K. Diaz-Chito, A. Hernández-Sabaté and A. López,
A reduced feature set for driver head pose estimation, Applied Soft Computing, 45 (2016), 98-107.
doi: 10.1016/j.asoc.2016.04.027. |
[14] |
D. Dua and C. Graff, UCImachine learning repository, 2017, http://archive.ics.uci.edu/ml. Google Scholar |
[15] |
C. Fefferman, S. Mitter and H. Narayanan,
Testing the manifold hypothesis, Journal of the American Mathematical Society, 29 (2016), 983-1049.
doi: 10.1090/jams/852. |
[16] |
B. Fischer and J. Buhmann,
Path-based clustering for grouping of smooth curves and texture segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 25 (2003), 513-518.
doi: 10.1109/TPAMI.2003.1190577. |
[17] |
S. Har-Peled, Computing the k nearest-neighbors for all vertices via dijkstra, arXiv preprint, arXiv: 1607.07818. Google Scholar |
[18] |
J. Ho, M.-H. Yang, J. Lim, K.-C. Lee and D. Kriegman, Clustering appearances of objects under varying illumination conditions, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003, 11–18.
doi: 10.1109/CVPR.2003.1211332. |
[19] |
C. D. Howard and C. Newman,
Geodesics and spanning trees for euclidean first-passage percolation, Annals of Probability, 29 (2001), 577-623.
doi: 10.1214/aop/1008956685. |
[20] |
S. Hwang, S. Damelin and A. Hero Ⅲ,
Shortest path through random points, The Annals of Applied Probability, 26 (2016), 2791-2823.
doi: 10.1214/15-AAP1162. |
[21] |
M. Jacobs, E. Merkurjev and S. Esedoḡlu,
Auction dynamics: A volume constrained mbo scheme, Journal of Computational Physics, 354 (2018), 288-310.
doi: 10.1016/j.jcp.2017.10.036. |
[22] |
A. Little, M. Maggioni and J. Murphy, Path-based spectral clustering: Guarantees, robustness to outliers, and fast algorithms, arXiv preprint, arXiv: 1712.06206. Google Scholar |
[23] |
A. Moscovich, A. Jaffe and B. Nadler, Minimax-optimal semi-supervised regression on unknown manifolds, in Artificial Intelligence and Statistics, 2017,933–942. Google Scholar |
[24] |
S. Nene, S. Nayar, H. Murase et al., Columbia object image library (coil-20). Google Scholar |
[25] |
A. Ng, M. Jordan and Y. Weiss, On spectral clustering: Analysis and an algorithm, in Advances in Neural Information Processing Systems, 2002,849–856. Google Scholar |
[26] |
A. Orlitsky and Sajama, Estimating and computing density based distance metrics, in Proceedings of the 22nd International Conference on Machine Learning, ACM, 2005,760–767. Google Scholar |
[27] |
J. Tenenbaum, V. De Silva and J. Langford,
A global geometric framework for nonlinear dimensionality reduction, Science, 290 (2000), 2319-2323.
doi: 10.1126/science.290.5500.2319. |
[28] |
P. Vincent and Y. Bengio, Density-sensitive Metrics and Kernels, Snowbird Learning Workshop, 2003. Google Scholar |
[29] |
K. Yin and X.-C. Tai,
An effective region force for some variational models for learning and clustering, Journal of Scientific Computing, 74 (2018), 175-196.
doi: 10.1007/s10915-017-0429-4. |
[30] |
L. Zelnik-Manor and P. Perona, Self-tuning spectral clustering, in Advances in Neural Information Processing Systems, 2005, 1601–1608. Google Scholar |



3 Lines | |||||
3 Moons | |||||
3 Circles | |||||
- |
3 Lines | |||||
3 Moons | |||||
3 Circles | |||||
- |
3 Lines | |||||
3 Moons | |||||
3 Circles | |||||
- |
3 Lines | |||||
3 Moons | |||||
3 Circles | |||||
- |
[1] |
Mohsen Abdolhosseinzadeh, Mir Mohammad Alipour. Design of experiment for tuning parameters of an ant colony optimization method for the constrained shortest Hamiltonian path problem in the grid networks. Numerical Algebra, Control & Optimization, 2021, 11 (2) : 321-332. doi: 10.3934/naco.2020028 |
[2] |
Antonio Rieser. A topological approach to spectral clustering. Foundations of Data Science, 2021, 3 (1) : 49-66. doi: 10.3934/fods.2021005 |
[3] |
M. Phani Sudheer, Ravi S. Nanjundiah, A. S. Vasudeva Murthy. Revisiting the slow manifold of the Lorenz-Krishnamurthy quintet. Discrete & Continuous Dynamical Systems - B, 2006, 6 (6) : 1403-1416. doi: 10.3934/dcdsb.2006.6.1403 |
[4] |
Ana Rita Nogueira, João Gama, Carlos Abreu Ferreira. Causal discovery in machine learning: Theories and applications. Journal of Dynamics & Games, 2021 doi: 10.3934/jdg.2021008 |
[5] |
Yi Gao, Rui Li, Yingjing Shi, Li Xiao. Design of path planning and tracking control of quadrotor. Journal of Industrial & Management Optimization, 2021 doi: 10.3934/jimo.2021063 |
[6] |
Mario Bukal. Well-posedness and convergence of a numerical scheme for the corrected Derrida-Lebowitz-Speer-Spohn equation using the Hellinger distance. Discrete & Continuous Dynamical Systems, 2021, 41 (7) : 3389-3414. doi: 10.3934/dcds.2021001 |
[7] |
Shan-Shan Lin. Due-window assignment scheduling with learning and deterioration effects. Journal of Industrial & Management Optimization, 2021 doi: 10.3934/jimo.2021081 |
[8] |
Sheng-I Chen, Yen-Che Tseng. A partitioning column approach for solving LED sorter manipulator path planning problems. Journal of Industrial & Management Optimization, 2021 doi: 10.3934/jimo.2021055 |
[9] |
Yongkun Wang, Fengshou He, Xiaobo Deng. Multi-aircraft cooperative path planning for maneuvering target detection. Journal of Industrial & Management Optimization, 2021 doi: 10.3934/jimo.2021050 |
[10] |
Marian Gidea, Rafael de la Llave, Tere M. Seara. A general mechanism of instability in Hamiltonian systems: Skipping along a normally hyperbolic invariant manifold. Discrete & Continuous Dynamical Systems, 2020, 40 (12) : 6795-6813. doi: 10.3934/dcds.2020166 |
[11] |
Dingheng Pi. Periodic orbits for double regularization of piecewise smooth systems with a switching manifold of codimension two. Discrete & Continuous Dynamical Systems - B, 2021 doi: 10.3934/dcdsb.2021080 |
[12] |
Wei Wang, Yang Shen, Linyi Qian, Zhixin Yang. Hedging strategy for unit-linked life insurance contracts with self-exciting jump clustering. Journal of Industrial & Management Optimization, 2021 doi: 10.3934/jimo.2021072 |
[13] |
Mehmet Duran Toksari, Emel Kizilkaya Aydogan, Berrin Atalay, Saziye Sari. Some scheduling problems with sum of logarithm processing times based learning effect and exponential past sequence dependent delivery times. Journal of Industrial & Management Optimization, 2021 doi: 10.3934/jimo.2021044 |
[14] |
Hao Li, Honglin Chen, Matt Haberland, Andrea L. Bertozzi, P. Jeffrey Brantingham. PDEs on graphs for semi-supervised learning applied to first-person activity recognition in body-worn video. Discrete & Continuous Dynamical Systems, 2021 doi: 10.3934/dcds.2021039 |
[15] |
Shanjian Tang, Fu Zhang. Path-dependent optimal stochastic control and viscosity solution of associated Bellman equations. Discrete & Continuous Dynamical Systems, 2015, 35 (11) : 5521-5553. doi: 10.3934/dcds.2015.35.5521 |
Impact Factor:
Tools
Metrics
Other articles
by authors
[Back to Top]