The $ k $-means problem is a classical combinatorial optimization problem which has lots of applications in many fields such as machine learning, data mining, etc. We consider a variant of $ k $-means problem in the spherical space, that is, spherical $ k $-means problem with penalties. In the problem, it is allowable that some nodes in the spherical space can not be clustered by paying some penalty costs. Based on local search scheme, we propose a $ \left(4 (11+4\sqrt{7})+ \epsilon\right) $-approximation algorithm using singe-swap operation, where $ \epsilon $ is a positive constant.
Citation: |
[1] | S. Ahmadian, A. Norouzi-Fard, O. Svensson and J. Ward, Better guarantees for $k$-means and Euclidean $k$-median by primal-dual algorithms, SIAM Journal on Computing, 49 (2019), FOCS17-97–FOCS17-156. doi: 10.1137/18M1171321. |
[2] | D. Aloise, A. Deshpande, P. Hansen and P. Popat, NP-hardness of Euclidean sum-of-squares clustering, Machine Learning, 75 (2009), 245-248. doi: 10.1007/s10994-009-5103-0. |
[3] | S. Ahmadian, A. Norouzi-Fard, O. Svensson and J. Ward, Better guarantees for $k$-means and Euclidean $k$-median by primal-dual algorithms, SIAM Journal on Computing, 49 (2019), FOCS17-97–FOCS17-156. doi: 10.1137/18M1171321. |
[4] | D. Arthur and S. Vassilvitskii, $K$-means++: The advantages of careful seeding, Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, 2007, 1027–1035. doi: 10.1145/1283383.1283494. |
[5] | Y. Endo and S. Miyamoto, Spherical $k$-means++ clustering, Proceedings of International Conference on Modeling Decisions for Artificial Intelligence, (2015), 103–114. doi: 10.1007/978-3-319-23240-99. |
[6] | Q. Feng, Z. Zhang, F. Shi and J. Wang, An improved approximation algorithm for the $k$-means problem with penalties, Proceedings of International Workshop on Frontiers in Algorithmics, (2019), 170–181. doi: 10.1007/978-3-030-18126-015. |
[7] | Z. Friggstad, K. Khodamoradi, M. Rezapour and M. Salavatipour, Approximation schemes for clustering with outliers, ACM Transactions on Algorithms, 15 (2019), 26: 1–26: 26. doi: 10.1145/3301446. |
[8] | A. Georgogiannis, Robust $k$-means: A theoretical revisit, Proceedings of 30th Conference on Neural Information Processing Systems, (2016), 2891–2899. doi: 10.5555/3157382.3157421. |
[9] | J. Han, M. Kamber and J. Pei, Data Mining: Concepts and Techniques, Elsevier, 2012. doi: 10.1016/C2009-0-61819-5. |
[10] | A. K. Jain, Data clustering: 50 years beyond $k$-means, Pattern Recognition Letters, 31 (2010), 651-666. doi: 10.1016/j.patrec.2009.09.011. |
[11] | S. Ji, D. Xu, L. Guo, M. Li and D. Zhang, The seeding algorithm for spherical $k$-means clustering with penalties, Journal of Combinatorial Optimization, 2 (2020), 1-18. doi: 10.1007/s10898-019-00779-w. |
[12] | T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman and A. Y. Wu, A local search approximation algorithm for $k$-means clustering, Computational Geometry-Theory and Applications, 28 (2004), 89-112. doi: 10.1016/j.comgeo.2004.03.003. |
[13] | M. Li, The bi-criteria seeding algorithms for two variants of $k$-means problem, Journal of Combinatorial Optimization. doi: 10.1007/s10878-020-00537-9. |
[14] | M. Li, D. Xu, J. Yue, D. Zhang and P. Zhang, The seeding algorithm for $k$-means problem with penalties, Journal of Combinatorial Optimization, 39 (2020), 15-32. doi: 10.1007/s10878-019-00450-w. |
[15] | S. Lloyd, Least squares quantization in PCM, IEEE Transactions on Information Theory, 28 (1982), 129-137. doi: 10.1109/TIT.1982.1056489. |
[16] | R. Ostrovsky, Y. Rabani, L. Schulman and C. Swamy, The effectiveness of Lloyd-type methods for the $k$-means problem, Journal of the ACM, 59 (2012), 28: 1–28: 22. doi: 10.1145/2395116.2395117. |
[17] | D. Wei, A constant-factor bi-criteria approximation guarantee for $k$-means++, Proceedings of the Thirtieth International Conference on Neural Information Processing Systems, (2016), 604–612. doi: 10.5555/3157096.3157164. |
[18] | X. Wu, V. Kumar, J. Quinlan, J. Ross Ghosh, Q. Yang, H. Motoda, G. J. McLachlan, A. Ng, B. Liu, P. S. Yu, Z. H. Zhou, M. Steinbach, D. J. Hand and D. Steinberg, Top 10 algorithms in data mining, Knowledge and Information Systems, 14 (2008), 1-37. doi: 10.1007/s10115-007-0114-2. |
[19] | D. Zhang, Y. Cheng, M. Li, Y. Wang and D. Xu, Local search approximation algorithms for the spherical $k$-means problem, Proceedings of International Conference on Algorithmic Applications in Management, Springer (2019), 341–351. doi: 10.1007/978-3-030-27195-4_31. |
[20] | D. Zhang, C. Hao, C. Wu, D. Xu and Z. Zhang, Local search approximation algorithms for the $k$-means problem with penalties, Journal of Combinatorial Optimization, 37 (2019), 439-453. doi: 10.1007/s10878-018-0278-6. |
Spherical
Swap operataion
Compound mapping
Cases for re-clustering
Partition of