NP-hardness of quadratic Euclidean 1-Mean and 1-Median 2-Clustering problem with the constraints on the cluster sizes
- Authors: Kel’manov A.V.1,2, Pyatkin A.V.1,2, Khandeev V.I.1,2
-
Affiliations:
- Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Sciences
- Novosibirsk State University
- Issue: Vol 489, No 4 (2019)
- Pages: 339-343
- Section: Mathematics
- URL: https://journals.eco-vector.com/0869-5652/article/view/18680
- DOI: https://doi.org/10.31857/S0869-56524894339-343
- ID: 18680
Cite item
Full Text
Abstract
In the paper, we consider a problem of clustering a finite set of N points in d-dimensional Euclidean space into two clusters minimizing the sum over all clusters of the intracluster sums of the distances between clusters elements and their centers. The center of one cluster is defined as centroid (geometric center). The center of the other one is a sought point in the input set. We analyze the variant of the problem with the given clusters sizes. We have proved the strong NP-hardness of this problem.
About the authors
A. V. Kel’manov
Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Sciences; Novosibirsk State University
Author for correspondence.
Email: kelm@math.nsc.ru
Russian Federation, 4, Acad. Koptyug prospect, Novosibirsk, 630090; 1, Pirogova street, Novosibirsk, 630090
A. V. Pyatkin
Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Sciences; Novosibirsk State University
Email: artempyatkin@gmail.com
Russian Federation, 4, Acad. Koptyug prospect, Novosibirsk, 630090; 1, Pirogova street, Novosibirsk, 630090
V. I. Khandeev
Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Sciences; Novosibirsk State University
Email: khandeev@math.nsc.ru
Russian Federation, 4, Acad. Koptyug prospect, Novosibirsk, 630090; 1, Pirogova street, Novosibirsk, 630090
References
- Aloise D., Deshpande A., Hansen P., Popat P. NP Hardness of Euclidean Sum-of-Squares Clustering // Machine Learning. 2009. V. 75. № 2. P. 245-248.
- Kariv O., Hakimi S.L. An Algorithmic Approach to Network Location Problems. Pt. II: The p-Medians // SIAM J. Appl. Math. 1979. V. 37. P. 513-538.
- Гимади Э.Х., Кельманов А.В., Кельманова М.А., Хамидуллин С.А. Апостериорное обнаружение в числовой последовательности квазипериодического фрагмента при заданном числе повторов // Сиб. журн. индустр. математики. 2006. Т. 9. № 1 (25). C. 55-74.
- Gimadi E.Kh., Kel’manov A.V., Kel’manova M.A., Khamidullin S.A. A Posteriori Detecting a Quasiperiodic Fragment in a Numerical Sequence // Pattern Recognition and Image Analysis. 2008. V. 18. № 1. P. 30-42.
- Бабурин А.Е., Гимади Э.Х., Глебов Н.И., Пяткин А.В. Задача отыскания подмножества векторов с максимальным суммарным весом // Дискрет. анализ и исслед. операций. Сер. 2. 2007. Т. 14. № 1. С. 32-42.
- James G., Witten D., Hastie T., Tibshirani R. An Introduction to Statistical Learning. N.Y.: Springer Science+Business Media, LLC, 2013. 426 p.
- Shirkhorshidi A.S., Aghabozorgi S., Wah T.Y., Herawan T. Big Data Clustering: A Review // LNCS. 2014. V. 8583. P. 707-720.
- Bishop C.M. Pattern Recognition and Machine Learning. N.Y.: Springer Science+Business Media, LLC, 2006. 738 p.
- Aggarwal C.C. Data Mining: The Textbook. Springer International Publishing, 2015. 734 p.
- Edwards A.W.F., Cavalli-Sforza L.L. A Method for Cluster Analysis // Biometrics. 1965. V. 21. P. 362-375.
- Garey M.R., Johnson D.S. Computers and Intractability: A Guide to the Theory of NP-Completeness. San Francisco: Freeman, 1979. 338 p.
- Papadimitriou C.H. Computational complexity. N.Y.: Addison-Wesley, 1994. 523 p.