Marco, Antonio and Marín, Ignacio (2007) A general strategy to determine the congruence between a hierarchical and a non-hierarchical classification. BMC Bioinformatics, 8 (1). 442-. DOI https://doi.org/10.1186/1471-2105-8-442
Abstract
Background
Classification procedures are widely used in phylogenetic inference, the analysis of expression profiles, the study of biological networks, etc. Many algorithms have been proposed to establish the similarity between two different classifications of the same elements. However, methods to determine significant coincidences between hierarchical and non-hierarchical partitions are still poorly developed, even though the search for such coincidences is implicit in many analyses of massive data.
Results
We describe a novel strategy to compare a hierarchical and a dichotomic non-hierarchical classification of elements, in order to find clusters in a hierarchical tree in which elements of a given "flat" partition are overrepresented. The key improvement of our strategy with respect to previous methods is the use of permutation analyses of ranked clusters to determine whether regions of the dendrograms present a significant enrichment. We show that this method is more sensitive than previously developed strategies and illustrate how it can be applied to several real cases, including microarray and interactome data. In particular, we use it to compare a hierarchical representation of the yeast mitochondrial interactome with a catalogue of known mitochondrial protein complexes, demonstrating a high level of congruence between those two classifications. We also discuss extensions of this method to other, conceptually related cases.
Conclusion
Our method is highly sensitive and outperforms previously described strategies. A PERL script that implements it is available at http://www.uv.es/~genomica/treetracker.
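The permutation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' PERL script and does not reproduce their ranking of clusters. It only shows, under simplifying assumptions, how an empirical p-value for the enrichment of one tree cluster in elements of a "flat" class might be estimated by repeatedly drawing random clusters of the same size. The function name `cluster_enrichment_pvalue` and its parameters are hypothetical.

```python
import random


def cluster_enrichment_pvalue(cluster, flagged, universe, n_perm=1000, seed=0):
    """Estimate how often a random cluster of the same size contains at least
    as many flagged elements as the observed cluster (hypothetical helper,
    not part of the published tool)."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    universe = list(universe)
    flagged = set(flagged)
    size = len(cluster)

    # Number of flagged elements actually found in the cluster.
    observed = sum(1 for element in cluster if element in flagged)

    # Draw random clusters of the same size from all elements of the tree;
    # this is equivalent to shuffling the flat labels over the leaves.
    at_least_as_extreme = 0
    for _ in range(n_perm):
        sample = rng.sample(universe, size)
        count = sum(1 for element in sample if element in flagged)
        if count >= observed:
            at_least_as_extreme += 1

    # Add-one correction keeps the estimate strictly above zero.
    p_value = (at_least_as_extreme + 1) / (n_perm + 1)
    return observed, p_value


# Toy data: 100 leaves, 10 of which belong to the flat class of interest.
leaves = [f"gene{i}" for i in range(100)]
flat_class = leaves[:10]

enriched = cluster_enrichment_pvalue(leaves[:10], flat_class, leaves)
depleted = cluster_enrichment_pvalue(leaves[50:60], flat_class, leaves)
print(enriched, depleted)
```

In this toy case the first cluster consists entirely of flagged elements and gets a very small empirical p-value, while the second contains none and gets a p-value of 1. Applying such a test to every internal node of a dendrogram would additionally require a correction for multiple testing, which this sketch leaves out.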
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Saccharomyces cerevisiae Proteins; Mitochondrial Proteins; Oligonucleotide Array Sequence Analysis; Cluster Analysis; Reproducibility of Results; Sequence Analysis, Protein; Protein Interaction Mapping; Decision Trees; Classification; Artificial Intelligence; User-Computer Interface; Databases, Protein; Pattern Recognition, Automated |
Subjects: | Q Science > Q Science (General) |
Divisions: | Faculty of Science and Health; Faculty of Science and Health > Life Sciences, School of |
Date Deposited: | 23 Mar 2015 19:32 |
Last Modified: | 30 Oct 2024 20:35 |
URI: | http://repository.essex.ac.uk/id/eprint/8500 |
Available files
Filename: 2007_BMCBioinformatics_Treetracker.pdf
Licence: Creative Commons: Attribution 3.0