Mahmoud, Osama and Harrison, Andrew and Perperoglou, Aris and Gul, Asma and Khan, Zardad and Metodiev, Metodi V and Lausen, Berthold (2014) A feature selection method for classification within functional genomics experiments based on the proportional overlapping score. BMC Bioinformatics, 15 (1). 274-. DOI https://doi.org/10.1186/1471-2105-15-274
Abstract
Background: Microarray technology, like other functional genomics experiments, allows simultaneous measurement of thousands of genes within each sample. Both the prediction accuracy and the interpretability of a classifier can be enhanced by performing the classification based only on selected discriminative genes. We propose a statistical method for selecting genes based on an overlapping analysis of expression data across classes. This method results in a novel measure, called the proportional overlapping score (POS), of a feature's relevance to a classification task.

Results: We apply POS, along with four widely used gene selection methods, to several benchmark gene expression datasets. Classification error rates computed using the Random Forest, k Nearest Neighbor and Support Vector Machine classifiers show that POS achieves better performance.

Conclusions: A novel gene selection method, POS, is proposed. POS analyzes the overlap of expression values across classes, taking into account the proportions of overlapping samples. It robustly defines a mask for each gene, which minimizes the effect of expression outliers. The constructed masks, along with a novel gene score, are used to produce the selected subset of genes.
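The idea of scoring a gene by the proportion of samples that fall in the region where two classes overlap can be illustrated with a minimal sketch. This is not the POS definition from the paper: the function names `core_interval` and `overlap_proportion`, the quartile-based interval and the factor `k = 1.5` are all assumptions chosen only to show how an outlier-resistant interval changes the overlap.

```python
from statistics import quantiles


def core_interval(values, k=1.5):
    """Hypothetical outlier-resistant interval for one class.

    Quartiles are widened by k * IQR and clipped to the observed range,
    so a single extreme value does not stretch the interval.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = k * (q3 - q1)
    return max(min(values), q1 - spread), min(max(values), q3 + spread)


def overlap_proportion(expr, labels):
    """Fraction of all samples lying inside the overlap of the two
    classes' core intervals (0.0 means the gene separates the classes)."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    a = [x for x, y in zip(expr, labels) if y == classes[0]]
    b = [x for x, y in zip(expr, labels) if y == classes[1]]
    lo_a, hi_a = core_interval(a)
    lo_b, hi_b = core_interval(b)
    lo, hi = max(lo_a, lo_b), min(hi_a, hi_b)
    if lo > hi:  # the intervals do not intersect
        return 0.0
    inside = sum(lo <= x <= hi for x in expr)
    return inside / len(expr)
```

Under these assumptions a lower value suggests a more discriminative gene, and an expression outlier in one class does not by itself create an apparent overlap with the other class.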
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Feature selection; Gene ranking; Microarray classification; Proportional overlap score; Gene mask; Minimum subset of genes |
Subjects: | Q Science > QA Mathematics; Q Science > QH Natural history > QH426 Genetics |
Divisions: | Faculty of Science and Health; Faculty of Science and Health > Life Sciences, School of; Faculty of Science and Health > Mathematics, Statistics and Actuarial Science, School of |
SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
Depositing User: | Unnamed user with email elements@essex.ac.uk |
Date Deposited: | 14 Aug 2014 07:54 |
Last Modified: | 30 Oct 2024 19:55 |
URI: | http://repository.essex.ac.uk/id/eprint/9960 |
Available files
Filename: 1471-2105-15-274.pdf
Licence: Creative Commons: Attribution 3.0