Feature Relevance in Ward's Hierarchical Clustering Using the Lp Norm

Amorim, RC (2015) 'Feature Relevance in Ward's Hierarchical Clustering Using the Lp Norm.' Journal of Classification, 32 (1). pp. 46-62. ISSN 0176-4268

MW_Ward.pdf - Accepted Version

Abstract

In this paper we introduce a new hierarchical clustering algorithm called Wardp. Unlike the original Ward method, Wardp generates feature weights, which can be seen as feature rescaling factors thanks to the use of the Lp norm. The feature weights are cluster dependent, allowing a feature to have different degrees of relevance in different clusters. We validate our method by performing experiments on 75 real-world and synthetic datasets, with and without added features made of uniformly random noise. Our experiments show that: (i) our feature weighting method produces results that are superior to those produced by the original Ward method on datasets containing noise features; (ii) it is indeed possible to estimate a good exponent p under a fully unsupervised framework. The clusterings produced by Wardp depend on p. This makes the estimation of a good value for this exponent a requirement for this algorithm, and indeed for any other algorithm based on the Lp norm.
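
The abstract's remark that feature weights can be seen as rescaling factors under the Lp norm can be illustrated with a generic weighted Minkowski distance. The sketch below is an illustration under stated assumptions, not the paper's definition of Wardp: the function names, the example weights, and the choice to raise each weight to the power p are all hypothetical and are not taken from this record. It only shows that weighting each term by w_v^p gives the same value as first multiplying every feature by w_v and then taking the unweighted distance.

```python
def lp_distance(x, c, p):
    """Unweighted Minkowski (Lp) distance raised to the power p."""
    return sum(abs(xv - cv) ** p for xv, cv in zip(x, c))


def weighted_lp_distance(x, c, w, p):
    """Weighted form: sum over features of w_v**p * |x_v - c_v|**p.

    Assumption for illustration only; the paper may define its
    weighted distance differently.
    """
    return sum((wv ** p) * abs(xv - cv) ** p for xv, cv, wv in zip(x, c, w))


def rescale(x, w):
    """Multiply each feature value by its weight."""
    return [wv * xv for xv, wv in zip(x, w)]


# Hypothetical point, cluster centre, and weights for a single cluster.
x = [1.0, 4.0, 0.5]
c = [0.0, 2.0, 0.25]
w = [0.5, 0.25, 0.25]
p = 2.0

direct = weighted_lp_distance(x, c, w, p)
via_rescaling = lp_distance(rescale(x, w), rescale(c, w), p)

# Both routes give the same value, since w**p * |a - b|**p == |w*a - w*b|**p
# for non-negative w.
print(direct, via_rescaling)
```

In a cluster-dependent scheme such as the one the abstract describes, each cluster would carry its own weight vector, so the same feature could be stretched in one cluster and shrunk in another.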

Item Type: Article
Uncontrolled Keywords: Ward method, Hierarchical clustering, Feature weights, Feature relevance, L-p norm, Minkowski metric
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Faculty of Science and Health > Computer Science and Electronic Engineering, School of
Depositing User: Elements
Date Deposited: 18 Sep 2017 13:56
Last Modified: 18 Oct 2017 16:16
URI: http://repository.essex.ac.uk/id/eprint/20365
