• Previous Article
    Cooperation in traffic network problems via evolutionary split variational inequalities
  • JIMO Home
  • This Issue
  • Next Article
    A note on the paper "Sufficient optimality conditions using convexifactors for optimistic bilevel programming problem"
doi: 10.3934/jimo.2021067

Approximation algorithm for spherical $ k $-means problem with penalty

1. 

School of Business, Nankai University, Tianjin, 300071, China, College of Science, Tianjin University of Technology, Tianjin, 300384, China

2. 

Computer Science and Technology Department, Tianjin Renai College, Tianjin, 306136, China

3. 

Department of Operations Research and Information Engineering, Beijing University of Technology, Beijing, 100124, China

* Corresponding author: Wei Lv

Received  July 2020 Revised  February 2021 Published  April 2021

The $ k $-means problem is a classical combinatorial optimization problem which has lots of applications in many fields such as machine learning, data mining, etc. We consider a variant of $ k $-means problem in the spherical space, that is, spherical $ k $-means problem with penalties. In the problem, it is allowable that some nodes in the spherical space can not be clustered by paying some penalty costs. Based on local search scheme, we propose a $ \left(4 (11+4\sqrt{7})+ \epsilon\right) $-approximation algorithm using singe-swap operation, where $ \epsilon $ is a positive constant.

Citation: Chenchen Wu, Wei Lv, Yujie Wang, Dachuan Xu. Approximation algorithm for spherical $ k $-means problem with penalty. Journal of Industrial & Management Optimization, doi: 10.3934/jimo.2021067
References:
[1]

S. Ahmadian, A. Norouzi-Fard, O. Svensson and J. Ward, Better guarantees for $k$-means and Euclidean $k$-median by primal-dual algorithms, SIAM Journal on Computing, 49 (2019), FOCS17-97–FOCS17-156. doi: 10.1137/18M1171321.  Google Scholar

[2]

D. AloiseA. DeshpandeP. Hansen and P. Popat, NP-hardness of Euclidean sum-of-squares clustering, Machine Learning, 75 (2009), 245-248.  doi: 10.1007/s10994-009-5103-0.  Google Scholar

[3]

S. Ahmadian, A. Norouzi-Fard, O. Svensson and J. Ward, Better guarantees for $k$-means and Euclidean $k$-median by primal-dual algorithms, SIAM Journal on Computing, 49 (2019), FOCS17-97–FOCS17-156. doi: 10.1137/18M1171321.  Google Scholar

[4]

D. Arthur and S. Vassilvitskii, $K$-means++: The advantages of careful seeding, Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, 2007, 1027–1035. doi: 10.1145/1283383.1283494.  Google Scholar

[5]

Y. Endo and S. Miyamoto, Spherical $k$-means++ clustering, Proceedings of International Conference on Modeling Decisions for Artificial Intelligence, (2015), 103–114. doi: 10.1007/978-3-319-23240-99.  Google Scholar

[6]

Q. Feng, Z. Zhang, F. Shi and J. Wang, An improved approximation algorithm for the $k$-means problem with penalties, Proceedings of International Workshop on Frontiers in Algorithmics, (2019), 170–181. doi: 10.1007/978-3-030-18126-015.  Google Scholar

[7]

Z. Friggstad, K. Khodamoradi, M. Rezapour and M. Salavatipour, Approximation schemes for clustering with outliers, ACM Transactions on Algorithms, 15 (2019), 26: 1–26: 26. doi: 10.1145/3301446.  Google Scholar

[8]

A. Georgogiannis, Robust $k$-means: A theoretical revisit, Proceedings of 30th Conference on Neural Information Processing Systems, (2016), 2891–2899. doi: 10.5555/3157382.3157421.  Google Scholar

[9]

J. Han, M. Kamber and J. Pei, Data Mining: Concepts and Techniques, Elsevier, 2012. doi: 10.1016/C2009-0-61819-5.  Google Scholar

[10]

A. K. Jain, Data clustering: 50 years beyond $k$-means, Pattern Recognition Letters, 31 (2010), 651-666.  doi: 10.1016/j.patrec.2009.09.011.  Google Scholar

[11]

S. JiD. XuL. GuoM. Li and D. Zhang, The seeding algorithm for spherical $k$-means clustering with penalties, Journal of Combinatorial Optimization, 2 (2020), 1-18.  doi: 10.1007/s10898-019-00779-w.  Google Scholar

[12]

T. KanungoD. M. MountN. S. NetanyahuC. D. PiatkoR. Silverman and A. Y. Wu, A local search approximation algorithm for $k$-means clustering, Computational Geometry-Theory and Applications, 28 (2004), 89-112.  doi: 10.1016/j.comgeo.2004.03.003.  Google Scholar

[13]

M. Li, The bi-criteria seeding algorithms for two variants of $k$-means problem, Journal of Combinatorial Optimization. doi: 10.1007/s10878-020-00537-9.  Google Scholar

[14]

M. LiD. XuJ. YueD. Zhang and P. Zhang, The seeding algorithm for $k$-means problem with penalties, Journal of Combinatorial Optimization, 39 (2020), 15-32.  doi: 10.1007/s10878-019-00450-w.  Google Scholar

[15]

S. Lloyd, Least squares quantization in PCM, IEEE Transactions on Information Theory, 28 (1982), 129-137.  doi: 10.1109/TIT.1982.1056489.  Google Scholar

[16]

R. Ostrovsky, Y. Rabani, L. Schulman and C. Swamy, The effectiveness of Lloyd-type methods for the $k$-means problem, Journal of the ACM, 59 (2012), 28: 1–28: 22. doi: 10.1145/2395116.2395117.  Google Scholar

[17]

D. Wei, A constant-factor bi-criteria approximation guarantee for $k$-means++, Proceedings of the Thirtieth International Conference on Neural Information Processing Systems, (2016), 604–612. doi: 10.5555/3157096.3157164.  Google Scholar

[18]

X. WuV. KumarJ. QuinlanJ. Ross GhoshQ. YangH. MotodaG. J. McLachlanA. NgB. LiuP. S. YuZ. H. ZhouM. SteinbachD. J. Hand and D. Steinberg, Top 10 algorithms in data mining, Knowledge and Information Systems, 14 (2008), 1-37.  doi: 10.1007/s10115-007-0114-2.  Google Scholar

[19]

D. Zhang, Y. Cheng, M. Li, Y. Wang and D. Xu, Local search approximation algorithms for the spherical $k$-means problem, Proceedings of International Conference on Algorithmic Applications in Management, Springer (2019), 341–351. doi: 10.1007/978-3-030-27195-4_31.  Google Scholar

[20]

D. ZhangC. HaoC. WuD. Xu and Z. Zhang, Local search approximation algorithms for the $k$-means problem with penalties, Journal of Combinatorial Optimization, 37 (2019), 439-453.  doi: 10.1007/s10878-018-0278-6.  Google Scholar

show all references

References:
[1]

S. Ahmadian, A. Norouzi-Fard, O. Svensson and J. Ward, Better guarantees for $k$-means and Euclidean $k$-median by primal-dual algorithms, SIAM Journal on Computing, 49 (2019), FOCS17-97–FOCS17-156. doi: 10.1137/18M1171321.  Google Scholar

[2]

D. AloiseA. DeshpandeP. Hansen and P. Popat, NP-hardness of Euclidean sum-of-squares clustering, Machine Learning, 75 (2009), 245-248.  doi: 10.1007/s10994-009-5103-0.  Google Scholar

[3]

S. Ahmadian, A. Norouzi-Fard, O. Svensson and J. Ward, Better guarantees for $k$-means and Euclidean $k$-median by primal-dual algorithms, SIAM Journal on Computing, 49 (2019), FOCS17-97–FOCS17-156. doi: 10.1137/18M1171321.  Google Scholar

[4]

D. Arthur and S. Vassilvitskii, $K$-means++: The advantages of careful seeding, Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, 2007, 1027–1035. doi: 10.1145/1283383.1283494.  Google Scholar

[5]

Y. Endo and S. Miyamoto, Spherical $k$-means++ clustering, Proceedings of International Conference on Modeling Decisions for Artificial Intelligence, (2015), 103–114. doi: 10.1007/978-3-319-23240-99.  Google Scholar

[6]

Q. Feng, Z. Zhang, F. Shi and J. Wang, An improved approximation algorithm for the $k$-means problem with penalties, Proceedings of International Workshop on Frontiers in Algorithmics, (2019), 170–181. doi: 10.1007/978-3-030-18126-015.  Google Scholar

[7]

Z. Friggstad, K. Khodamoradi, M. Rezapour and M. Salavatipour, Approximation schemes for clustering with outliers, ACM Transactions on Algorithms, 15 (2019), 26: 1–26: 26. doi: 10.1145/3301446.  Google Scholar

[8]

A. Georgogiannis, Robust $k$-means: A theoretical revisit, Proceedings of 30th Conference on Neural Information Processing Systems, (2016), 2891–2899. doi: 10.5555/3157382.3157421.  Google Scholar

[9]

J. Han, M. Kamber and J. Pei, Data Mining: Concepts and Techniques, Elsevier, 2012. doi: 10.1016/C2009-0-61819-5.  Google Scholar

[10]

A. K. Jain, Data clustering: 50 years beyond $k$-means, Pattern Recognition Letters, 31 (2010), 651-666.  doi: 10.1016/j.patrec.2009.09.011.  Google Scholar

[11]

S. JiD. XuL. GuoM. Li and D. Zhang, The seeding algorithm for spherical $k$-means clustering with penalties, Journal of Combinatorial Optimization, 2 (2020), 1-18.  doi: 10.1007/s10898-019-00779-w.  Google Scholar

[12]

T. KanungoD. M. MountN. S. NetanyahuC. D. PiatkoR. Silverman and A. Y. Wu, A local search approximation algorithm for $k$-means clustering, Computational Geometry-Theory and Applications, 28 (2004), 89-112.  doi: 10.1016/j.comgeo.2004.03.003.  Google Scholar

[13]

M. Li, The bi-criteria seeding algorithms for two variants of $k$-means problem, Journal of Combinatorial Optimization. doi: 10.1007/s10878-020-00537-9.  Google Scholar

[14]

M. LiD. XuJ. YueD. Zhang and P. Zhang, The seeding algorithm for $k$-means problem with penalties, Journal of Combinatorial Optimization, 39 (2020), 15-32.  doi: 10.1007/s10878-019-00450-w.  Google Scholar

[15]

S. Lloyd, Least squares quantization in PCM, IEEE Transactions on Information Theory, 28 (1982), 129-137.  doi: 10.1109/TIT.1982.1056489.  Google Scholar

[16]

R. Ostrovsky, Y. Rabani, L. Schulman and C. Swamy, The effectiveness of Lloyd-type methods for the $k$-means problem, Journal of the ACM, 59 (2012), 28: 1–28: 22. doi: 10.1145/2395116.2395117.  Google Scholar

[17]

D. Wei, A constant-factor bi-criteria approximation guarantee for $k$-means++, Proceedings of the Thirtieth International Conference on Neural Information Processing Systems, (2016), 604–612. doi: 10.5555/3157096.3157164.  Google Scholar

[18]

X. WuV. KumarJ. QuinlanJ. Ross GhoshQ. YangH. MotodaG. J. McLachlanA. NgB. LiuP. S. YuZ. H. ZhouM. SteinbachD. J. Hand and D. Steinberg, Top 10 algorithms in data mining, Knowledge and Information Systems, 14 (2008), 1-37.  doi: 10.1007/s10115-007-0114-2.  Google Scholar

[19]

D. Zhang, Y. Cheng, M. Li, Y. Wang and D. Xu, Local search approximation algorithms for the spherical $k$-means problem, Proceedings of International Conference on Algorithmic Applications in Management, Springer (2019), 341–351. doi: 10.1007/978-3-030-27195-4_31.  Google Scholar

[20]

D. ZhangC. HaoC. WuD. Xu and Z. Zhang, Local search approximation algorithms for the $k$-means problem with penalties, Journal of Combinatorial Optimization, 37 (2019), 439-453.  doi: 10.1007/s10878-018-0278-6.  Google Scholar

Figure 1.  $ k $-means problem
Figure 2.  Spherical $ k $-means problem
Figure 3.  Swap operataion
Figure 4.  Compound mapping
Figure 5.  Cases for re-clustering
Figure 6.  Partition of $ \mathcal{N} $
[1]

Min Li, Yishui Wang, Dachuan Xu, Dongmei Zhang. The approximation algorithm based on seeding method for functional $ k $-means problem. Journal of Industrial & Management Optimization, 2020  doi: 10.3934/jimo.2020160

[2]

Yishui Wang, Dongmei Zhang, Peng Zhang, Yong Zhang. Local search algorithm for the squared metric $ k $-facility location problem with linear penalties. Journal of Industrial & Management Optimization, 2021, 17 (4) : 2013-2030. doi: 10.3934/jimo.2020056

[3]

Sung Ha Kang, Berta Sandberg, Andy M. Yip. A regularized k-means and multiphase scale segmentation. Inverse Problems & Imaging, 2011, 5 (2) : 407-429. doi: 10.3934/ipi.2011.5.407

[4]

Baolan Yuan, Wanjun Zhang, Yubo Yuan. A Max-Min clustering method for $k$-means algorithm of data clustering. Journal of Industrial & Management Optimization, 2012, 8 (3) : 565-575. doi: 10.3934/jimo.2012.8.565

[5]

Kien Ming Ng, Trung Hieu Tran. A parallel water flow algorithm with local search for solving the quadratic assignment problem. Journal of Industrial & Management Optimization, 2019, 15 (1) : 235-259. doi: 10.3934/jimo.2018041

[6]

Gaidi Li, Zhen Wang, Dachuan Xu. An approximation algorithm for the $k$-level facility location problem with submodular penalties. Journal of Industrial & Management Optimization, 2012, 8 (3) : 521-529. doi: 10.3934/jimo.2012.8.521

[7]

Baoli Shi, Zhi-Feng Pang, Jing Xu. Image segmentation based on the hybrid total variation model and the K-means clustering strategy. Inverse Problems & Imaging, 2016, 10 (3) : 807-828. doi: 10.3934/ipi.2016022

[8]

Behrad Erfani, Sadoullah Ebrahimnejad, Amirhossein Moosavi. An integrated dynamic facility layout and job shop scheduling problem: A hybrid NSGA-II and local search algorithm. Journal of Industrial & Management Optimization, 2020, 16 (4) : 1801-1834. doi: 10.3934/jimo.2019030

[9]

Jianjun Liu, Min Zeng, Yifan Ge, Changzhi Wu, Xiangyu Wang. Improved Cuckoo Search algorithm for numerical function optimization. Journal of Industrial & Management Optimization, 2020, 16 (1) : 103-115. doi: 10.3934/jimo.2018142

[10]

Justyna Jarczyk, Witold Jarczyk. Gaussian iterative algorithm and integrated automorphism equation for random means. Discrete & Continuous Dynamical Systems, 2020, 40 (12) : 6837-6844. doi: 10.3934/dcds.2020135

[11]

Leong-Kwan Li, Sally Shao. Convergence analysis of the weighted state space search algorithm for recurrent neural networks. Numerical Algebra, Control & Optimization, 2014, 4 (3) : 193-207. doi: 10.3934/naco.2014.4.193

[12]

Behrouz Kheirfam, Morteza Moslemi. On the extension of an arc-search interior-point algorithm for semidefinite optimization. Numerical Algebra, Control & Optimization, 2018, 8 (2) : 261-275. doi: 10.3934/naco.2018015

[13]

Mohamed A. Tawhid, Ahmed F. Ali. An effective hybrid firefly algorithm with the cuckoo search for engineering optimization problems. Mathematical Foundations of Computing, 2018, 1 (4) : 349-368. doi: 10.3934/mfc.2018017

[14]

Weihua Liu, Andrew Klapper. AFSRs synthesis with the extended Euclidean rational approximation algorithm. Advances in Mathematics of Communications, 2017, 11 (1) : 139-150. doi: 10.3934/amc.2017008

[15]

Xavier Gràcia, Xavier Rivas, Narciso Román-Roy. Constraint algorithm for singular field theories in the k-cosymplectic framework. Journal of Geometric Mechanics, 2020, 12 (1) : 1-23. doi: 10.3934/jgm.2020002

[16]

Ruiqi Yang, Dachuan Xu, Yicheng Xu, Dongmei Zhang. An adaptive probabilistic algorithm for online k-center clustering. Journal of Industrial & Management Optimization, 2019, 15 (2) : 565-576. doi: 10.3934/jimo.2018057

[17]

Ming-Jong Yao, Tien-Cheng Hsu. An efficient search algorithm for obtaining the optimal replenishment strategies in multi-stage just-in-time supply chain systems. Journal of Industrial & Management Optimization, 2009, 5 (1) : 11-32. doi: 10.3934/jimo.2009.5.11

[18]

Abdel-Rahman Hedar, Ahmed Fouad Ali, Taysir Hassan Abdel-Hamid. Genetic algorithm and Tabu search based methods for molecular 3D-structure prediction. Numerical Algebra, Control & Optimization, 2011, 1 (1) : 191-209. doi: 10.3934/naco.2011.1.191

[19]

Sen Zhang, Guo Zhou, Yongquan Zhou, Qifang Luo. Quantum-inspired satin bowerbird algorithm with Bloch spherical search for constrained structural optimization. Journal of Industrial & Management Optimization, 2020  doi: 10.3934/jimo.2020130

[20]

Zheng Chang, Haoxun Chen, Farouk Yalaoui, Bo Dai. Adaptive large neighborhood search Algorithm for route planning of freight buses with pickup and delivery. Journal of Industrial & Management Optimization, 2021, 17 (4) : 1771-1793. doi: 10.3934/jimo.2020045

2019 Impact Factor: 1.366

Metrics

  • PDF downloads (22)
  • HTML views (75)
  • Cited by (0)

Other articles
by authors

[Back to Top]