Sanei, Saeid and Lee, Tracey KM and Boukhennoufa, Issam and Jarchi, Delaram and Zhai, Xiaojun and McDonald-Maier, Klaus (2025) Evaluating the Effect of Surrogate Data Generation on Healthcare Data Assessment. Big Data and Cognitive Computing, 9 (2). p. 22. DOI https://doi.org/10.3390/bdcc9020022
Abstract
In healthcare applications, it is often not possible to record enough data for deep learning or data-driven classification and feature detection systems because of the patient's condition, clinical or experimental limitations, or time constraints. Moreover, data imbalance invalidates many of the test results crucial for clinical approvals. Generating synthetic (artificial or dummy) data has become a potential solution to these problems. Such data should possess adequate information, properties, and characteristics to mimic real-world data recorded in natural circumstances. Several methods have been proposed for this purpose, and results often show that adding surrogates improves decision-making accuracy. This article evaluates the most recent surrogate data generation and data synthesis methods to investigate how the number of surrogates affects classification results. It is shown that the data analysis/classification results improve as the number of surrogates increases, but only up to a certain number, beyond which no further improvement is observed. This finding helps in deciding on the number of surrogates for each strategy, thereby reducing the computational cost.
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | data augmentation, DNN, randomizing, Siamese GAN, singular spectrum analysis, SSA, surrogate generation, window slicing, window warping |
| Divisions: | Faculty of Science and Health; Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
| SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
| Depositing User: | Unnamed user with email elements@essex.ac.uk |
| Date Deposited: | 03 Jul 2026 14:43 |
| Last Modified: | 03 Jul 2026 14:43 |
| URI: | http://repository.essex.ac.uk/id/eprint/40165 |
Available files
Filename: BDCC-09-00022.pdf
Licence: Creative Commons: Attribution 4.0