Research Repository

Rice_Phospho 1.0: a new rice-specific SVM predictor for protein phosphorylation sites

Lin, Shoukai and Song, Qi and Tao, Huan and Wang, Wei and Wan, Weifeng and Huang, Jian and Xu, Chaoqun and Chebii, Vivien and Kitony, Justine and Que, Shufu and Harrison, Andrew and He, Huaqin (2015) 'Rice_Phospho 1.0: a new rice-specific SVM predictor for protein phosphorylation sites.' Scientific Reports, 5 (1). 11940. ISSN 2045-2322

srep11940.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

Experimentally determined or computationally predicted protein phosphorylation sites for distinctive species are becoming increasingly common. In this paper, we compare the predictive performance of a novel classification algorithm with different encoding schemes to develop a rice-specific protein phosphorylation site predictor. Our results imply that the combination of Amino acid occurrence Frequency with Composition of K-Spaced Amino Acid Pairs (AF-CKSAAP) provides the best description of relevant sequence features that surround a phosphorylation site. A support vector machine (SVM) using AF-CKSAAP achieves the best performance in classifying rice protein phosphorylation sites when compared to the other algorithms. We have used SVM with AF-CKSAAP to construct a rice-specific protein phosphorylation site predictor, Rice-Phospho 1.0 (http://bioinformatics.fafu.edu.cn/rice-phospho1.0). We measure the Accuracy (ACC) and Matthews Correlation Coefficient (MCC) of Rice-Phospho 1.0 to be 82.0% and 0.64, respectively, significantly higher than those measures for other predictors such as Scansite, Musite, PlantPhos and PhosphoRice. Rice-Phospho 1.0 also successfully predicted the experimentally identified phosphorylation sites in LOC-Os03g51600.1, a protein sequence which did not appear in the training dataset. In summary, Rice-Phospho 1.0 outputs reliable predictions of protein phosphorylation sites in rice, and will serve as a useful tool for the community.
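
As a rough illustration of the terms used in the abstract, the sketch below shows one common way an AF-CKSAAP feature vector and the ACC and MCC measures can be computed. It is not the authors' implementation: the function names, the example window, the maximum gap `k_max`, and the normalisation of the pair counts are all assumptions made here for illustration, since the abstract does not specify them.

```python
from itertools import product
from math import sqrt

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def amino_acid_frequency(window):
    """Fraction of each standard residue in the sequence window (20 values)."""
    n = len(window)
    if n == 0:
        raise ValueError("window must not be empty")
    return [window.count(aa) / n for aa in AMINO_ACIDS]


def cksaap(window, k_max=2):
    """Composition of k-spaced amino acid pairs for k = 0..k_max.

    For each gap k, counts residue pairs separated by k positions and divides
    by the number of such position pairs in the window (assumed normalisation).
    Pairs containing non-standard residues are not counted.
    """
    features = []
    for k in range(k_max + 1):
        n_pairs = len(window) - k - 1
        if n_pairs <= 0:
            raise ValueError("window too short for gap k=%d" % k)
        counts = dict.fromkeys(PAIRS, 0)
        for i in range(n_pairs):
            pair = window[i] + window[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        features.extend(counts[p] / n_pairs for p in PAIRS)
    return features


def af_cksaap(window, k_max=2):
    """Concatenate both encodings: 20 + 400 * (k_max + 1) values."""
    return amino_acid_frequency(window) + cksaap(window, k_max)


def accuracy(tp, tn, fp, fn):
    """ACC = correct predictions / all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; returns 0.0 if the denominator is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical 15-residue window centred on a serine; not taken from the paper.
    vector = af_cksaap("MKRLSPASSPEDKLA", k_max=2)
    print(len(vector))
```

A vector of this kind, computed for windows around known phosphorylated and non-phosphorylated residues, is what would be passed to an SVM for training and classification.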

Item Type: Article
Uncontrolled Keywords: Plant Proteins; Area Under Curve; ROC Curve; Phosphorylation; Algorithms; Internet; User-Computer Interface; Oryza; Support Vector Machine
Subjects: Q Science > QH Natural history > QH301 Biology
Divisions: Faculty of Science and Health
Faculty of Science and Health > Mathematical Sciences, Department of
SWORD Depositor: Elements
Depositing User: Elements
Date Deposited: 23 Jul 2015 23:48
Last Modified: 13 Jan 2022 22:01
URI: http://repository.essex.ac.uk/id/eprint/14440