Mahmoud, Osama and Harrison, Andrew and Gul, Asma and Khan, Zardad and Metodiev, Metodi V and Lausen, Berthold (2016) Minimizing Redundancy Among Genes Selected Based on the Overlapping Analysis. In: UNSPECIFIED, ? - ?.
Abstract
In many functional genomic experiments, identifying the most discriminative genes is a major challenge. Both the prediction accuracy and the interpretability of a classifier can be improved by basing the classification only on a set of discriminative genes. Analyzing the overlap between the gene expression values of different classes is an effective criterion for identifying relevant genes. However, genes selected by maximizing a relevance score alone may be highly redundant. We propose a scheme for minimizing selection redundancy, in which the Proportional Overlapping Score (POS) technique is extended with a recursive approach that selects a set of complementary discriminative genes. The proposed scheme exploits the gene masks defined by POS to identify genes that complement one another in terms of their classification patterns. The approach is validated by comparing its classification performance with that of other feature selection methods (Wilcoxon Rank Sum, mRMR, MaskedPainter and POS) on several benchmark gene expression datasets using three different classifiers: Random Forest, k Nearest Neighbour and Support Vector Machine. The classification error rates obtained in these experiments show that our proposal performs better than the compared methods.
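The idea of choosing complementary genes from their masks can be illustrated with a minimal sketch. This is not the authors' algorithm; it is a generic greedy coverage heuristic written under stated assumptions. It assumes that a gene mask is the set of sample indices the gene classifies unambiguously, and that a lower overlapping score means a more relevant gene. The function name `select_complementary` and the toy data are hypothetical.

```python
from typing import Dict, List, Set


def select_complementary(
    masks: Dict[str, Set[int]],
    scores: Dict[str, float],
    k: int,
) -> List[str]:
    """Greedily pick up to k genes whose masks complement each other.

    masks  : gene -> set of sample indices it classifies unambiguously
             (assumed representation of a gene mask).
    scores : gene -> overlapping score; lower is assumed to be better.
    """
    selected: List[str] = []
    covered: Set[int] = set()   # samples covered in the current round
    remaining = set(masks)

    while remaining and len(selected) < k:
        # Prefer the gene that covers the most not-yet-covered samples;
        # break ties by lower score, then by gene name for determinism.
        best = min(
            remaining,
            key=lambda g: (-len(masks[g] - covered), scores[g], g),
        )
        if not masks[best] - covered:
            if not covered:
                break           # every remaining mask is empty
            covered = set()     # start a new round over all samples
            continue
        selected.append(best)
        covered |= masks[best]
        remaining.discard(best)

    return selected


# Toy example with four hypothetical genes and five samples (0..4).
toy_masks = {"g1": {0, 1, 2}, "g2": {0, 1}, "g3": {3, 4}, "g4": {2, 3}}
toy_scores = {"g1": 0.10, "g2": 0.05, "g3": 0.20, "g4": 0.30}
print(select_complementary(toy_masks, toy_scores, 3))
```

In the toy data, `g2` has the best individual score but is covered entirely by `g1`, so it is picked only after `g1` and `g3` together cover all samples. This shows how ranking by relevance alone and ranking by complementarity can differ.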
Item Type: | Conference or Workshop Item (UNSPECIFIED) |
---|---|
Additional Information: | Published proceedings: Studies in Classification, Data Analysis, and Knowledge Organization |
Subjects: | H Social Sciences > HA Statistics; Q Science > QA Mathematics; Q Science > QH Natural history > QH426 Genetics |
Divisions: | Faculty of Science and Health; Faculty of Science and Health > Life Sciences, School of; Faculty of Science and Health > Mathematics, Statistics and Actuarial Science, School of |
SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
Depositing User: | Unnamed user with email elements@essex.ac.uk |
Date Deposited: | 05 Dec 2016 21:16 |
Last Modified: | 05 Dec 2024 21:45 |
URI: | http://repository.essex.ac.uk/id/eprint/18343 |