Harris, Simon and Cordeiro De Amorim, Renato (2022) An extensive empirical comparison of k-means initialisation algorithms. IEEE Access, 10. pp. 58752-58768. DOI https://doi.org/10.1109/access.2022.3179803
Abstract
The k-means clustering algorithm, whilst widely popular, is not without its drawbacks. In this paper, we focus on the sensitivity of k-means to its initial set of centroids. Since the cluster recovery performance of k-means can be improved by better initialisation, numerous algorithms have been proposed aiming at producing good initial centroids. However, it is still unclear which algorithm should be used in any particular clustering scenario. With this in mind, we compare 17 such algorithms on 6,000 synthetic and 28 real-world data sets. The synthetic data sets were produced under different configurations, allowing us to show which algorithm excels in each scenario. Hence, the results of our experiments can be particularly useful for those considering k-means for a non-trivial clustering scenario.
Item Type: | Article |
---|---|
Uncontrolled Keywords: | k-means; k-means initialisation; clustering |
Divisions: | Faculty of Science and Health; Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
Depositing User: | Unnamed user with email elements@essex.ac.uk |
Date Deposited: | 08 Jun 2022 08:16 |
Last Modified: | 23 Sep 2022 19:54 |
URI: | http://repository.essex.ac.uk/id/eprint/32969 |
Available files
Filename: An_extensive_empirical_comparison_of_k-means_initialisation_algorithms.pdf
Licence: Creative Commons: Attribution 3.0