Cordeiro de Amorim, R and Makarenkov, V and Mirkin, B (2016) A-Wardpβ: Effective hierarchical clustering using the Minkowski metric and a fast k-means initialisation. Information Sciences, 370-37. pp. 343-354. DOI https://doi.org/10.1016/j.ins.2016.07.076
Abstract
In this paper we make two novel contributions to hierarchical clustering. First, we introduce an anomalous pattern initialisation method for hierarchical clustering algorithms, called A-Ward, capable of substantially reducing the time they take to converge. This method generates an initial partition with a sufficiently large number of clusters. This allows the cluster merging process to start from this partition rather than from a trivial partition composed solely of singletons. Our second contribution is an extension of the Ward and Wardp algorithms to the situation where the feature weight exponent can differ from the exponent of the Minkowski distance. This new method, called A-Wardpβ, is able to generate a much wider variety of clustering solutions. We also demonstrate that its parameters can be estimated reasonably well by using a cluster validity index. We perform numerous experiments using data sets with two types of noise: insertion of noise features and blurring of within-cluster values of some features. These experiments allow us to conclude: (i) our anomalous pattern initialisation method does indeed reduce the time a hierarchical clustering algorithm takes to complete, without negatively impacting its cluster recovery ability; (ii) A-Wardpβ provides better cluster recovery than both Ward and Wardp.
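The idea of starting the merging process from a non-trivial partition can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the standard Ward criterion with squared Euclidean distance (no Minkowski exponent, no feature weights), and it assumes the initial partition is already given as a label vector rather than produced by the anomalous pattern method. The function name `ward_from_partition` and its parameters are hypothetical.

```python
import numpy as np


def ward_from_partition(X, labels, n_clusters):
    """Merge clusters by the standard Ward criterion, starting from a
    given partition instead of from singletons (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)

    # Track each cluster by its member indices, size and centroid.
    members = {k: np.flatnonzero(labels == k) for k in np.unique(labels)}
    sizes = {k: len(idx) for k, idx in members.items()}
    cents = {k: X[idx].mean(axis=0) for k, idx in members.items()}

    while len(members) > n_clusters:
        keys = list(members)
        best = None
        # Find the pair whose merge gives the smallest Ward distance:
        # (Na * Nb) / (Na + Nb) * ||ca - cb||^2
        for i, a in enumerate(keys):
            for b in keys[i + 1:]:
                gap = np.sum((cents[a] - cents[b]) ** 2)
                d = sizes[a] * sizes[b] / (sizes[a] + sizes[b]) * gap
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best

        # Merge cluster b into cluster a and update its centroid.
        total = sizes[a] + sizes[b]
        cents[a] = (sizes[a] * cents[a] + sizes[b] * cents[b]) / total
        sizes[a] = total
        members[a] = np.concatenate([members[a], members[b]])
        del members[b], sizes[b], cents[b]

    # Relabel the surviving clusters as 0, 1, 2, ...
    out = np.empty(len(X), dtype=int)
    for new_label, idx in enumerate(members.values()):
        out[idx] = new_label
    return out
```

Starting from K initial clusters rather than N singletons means only K minus `n_clusters` merges are needed, and each search for the closest pair runs over at most K clusters, which is where the time saving described in the abstract would come from.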
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Initialisation algorithm; Minkowski metric; Hierarchical clustering; Feature weighting |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Faculty of Science and Health; Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
Depositing User: | Unnamed user with email elements@essex.ac.uk |
Date Deposited: | 18 Sep 2017 11:40 |
Last Modified: | 30 Oct 2024 19:34 |
URI: | http://repository.essex.ac.uk/id/eprint/20363 |
Available files
Filename: 1611.01060v1.pdf
Licence: Creative Commons: Attribution-Noncommercial-No Derivative Works 3.0