Olobatuyi, Kehinde and Parker, Matthew RP and Ariyo, Oludare (2023) Cluster weighted model based on TSNE algorithm for high-dimensional data. International Journal of Data Science and Analytics, 17 (3). pp. 261-273. DOI https://doi.org/10.1007/s41060-023-00422-8
Abstract
Cluster-weighted models (CWMs) are an important class of machine learning models that are commonly used for modelling complex datasets. However, they are known to suffer from reduced computing efficiency and estimator accuracy when dealing with high-dimensional data. Previous work has proposed a parsimonious technique that can improve CWMs' performance in the high-dimensional data paradigm. However, this method has limitations for very high-dimensional data, where the dimensionality is greater than 100. In this paper, we propose a new hybridised method that incorporates a dimensionality reduction technique called t-distributed stochastic neighbour embedding (TSNE) to enhance the parsimonious CWMs in high-dimensional space. Additionally, we introduce a novel heuristic for detecting the hidden components of the underlying mixture model, which can be used with the popular R package FlexCWM. We evaluated the performance of the proposed method using two real datasets and found that it improves clustering power when compared to both the parsimony methods and the TSNE methods combined with CWMs in the high-dimensional data setting. Our results suggest that the proposed method can improve the efficiency and accuracy of CWMs in dealing with high-dimensional data, making it a valuable tool for data scientists and statisticians.
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Cluster-weighted model; Expectation maximisation; FlexCWM; High-dimensional data; Parsimonious technique |
Divisions: | Faculty of Science and Health; Faculty of Science and Health > Mathematics, Statistics and Actuarial Science, School of |
SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
Depositing User: | Unnamed user with email elements@essex.ac.uk |
Date Deposited: | 03 Jul 2023 16:21 |
Last Modified: | 01 Jul 2024 01:00 |
URI: | http://repository.essex.ac.uk/id/eprint/35903 |
Available files
Filename: Cluster_Weighted_Model_Based_on_TSNE_algorithm_for_High_Dimensional_Data__Copy_ (3).pdf