Research Repository

Applying subclustering and Lp distance in Weighted K-Means with distributed centroids

Amorim, RC and Makarenkov, V (2016) 'Applying subclustering and Lp distance in Weighted K-Means with distributed centroids.' Neurocomputing, 173 (P3). 700-707. ISSN 0925-2312

MWk_Prototype.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.


Abstract

We consider the Weighted K-Means algorithm with distributed centroids, which is aimed at clustering data sets containing numerical, categorical and mixed types of data. Our approach allows a given feature (i.e., variable) to have different weights in different clusters. It thus supports the intuitive idea that a feature may have different degrees of relevance in different clusters. We use the Minkowski metric in such a way that feature weights become feature re-scaling factors for any considered exponent. Moreover, we adapt the traditional Silhouette clustering validity index to deal with both numerical and categorical features. Finally, we show that our new method usually outperforms traditional K-Means as well as the recently proposed WK-DC clustering algorithm.
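The statement that feature weights act as re-scaling factors for any exponent can be illustrated with a small sketch. It assumes numerical features only and a p-th power Minkowski dissimilarity in which each cluster-specific weight is raised to the same exponent p as the distance; the abstract does not give the exact formula, so the function names (`weighted_minkowski`, `rescale`) and this particular form are illustrative assumptions, not the authors' implementation.

```python
import math


def weighted_minkowski(x, centroid, weights, p):
    """p-th power of a weighted Minkowski distance (assumed form).

    Each weight is raised to the exponent p together with the
    feature-wise difference: sum_v (w_v * |x_v - c_v|) ** p.
    `weights` is the weight vector of one cluster, so different
    clusters may pass different vectors.
    """
    return sum((w * abs(a - b)) ** p for a, b, w in zip(x, centroid, weights))


def rescale(values, weights):
    """Multiply every feature value by its weight."""
    return [w * v for v, w in zip(values, weights)]


# Hypothetical entity, centroid and cluster-specific weights.
x = [1.0, 2.0, 4.0]
c = [0.0, 1.0, 1.0]
w = [0.2, 0.3, 0.5]
p = 3

direct = weighted_minkowski(x, c, w, p)
# Same dissimilarity computed on re-scaled data with unit weights.
rescaled = weighted_minkowski(rescale(x, w), rescale(c, w), [1.0] * len(x), p)

assert math.isclose(direct, rescaled)
```

Under this assumed form, weighting a feature is equivalent to multiplying its values by the weight before computing an unweighted Minkowski dissimilarity, which holds for every exponent p because the weight and the difference share the same power.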

Item Type: Article
Uncontrolled Keywords: Clustering; Mixed data; Feature weighting; K-Means; Minkowski metric
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Faculty of Science and Health > Computer Science and Electronic Engineering, School of
Depositing User: Elements
Date Deposited: 18 Sep 2017 12:44
Last Modified: 18 Oct 2017 16:16
URI: http://repository.essex.ac.uk/id/eprint/20361
