TY - JOUR
T1 - SVMCRYS
T2 - An SVM approach for the prediction of protein crystallization propensity from protein sequence
AU - Kandaswamy, Krishna Kumar
AU - Pugalenthi, Ganesan
AU - Suganthan, P. N.
AU - Gangal, Rajeev
PY - 2010/4/1
Y1 - 2010/4/1
N2 - X-ray crystallography is the most widely used method for protein 3-dimensional structure determination. Selection of target protein that can yield high quality crystal for X-ray crystallography is a challenging task. Prediction of protein crystallization propensity from sequence information is useful for the selection of target protein for crystallization. Recently, support vector machines have been widely used to solve various biological problems. In this work, we present a SVMCRYS method which use support vector machine to classify protein sequence into 'amenable to crystallization' and 'resistant to crystallization'. SVMCRYS was trained on a dataset containing 728 sequences that gave diffraction quality crystal and 728 sequences where work had been stopped before obtaining crystal. The performance of SVMCRYS method was compared with other sequence-based crystallization prediction methods such as SECRET, CRYSTALP, OB-Score, ParCrys and XtalPred using three different datasets. SVMCRYS achieved better prediction rate with higher sensitivity and specificity. Our analysis suggests that SVMCRYS can be used to predict proteins which are amenable to crystallization and proteins which are difficult for crystallization. The SVMCRYS software, dataset and feature set can be obtained from http://www3.ntu.edu.sg/home/EPNSugan/index_files/svmcrys.htm.
AB - X-ray crystallography is the most widely used method for protein 3-dimensional structure determination. Selection of target protein that can yield high quality crystal for X-ray crystallography is a challenging task. Prediction of protein crystallization propensity from sequence information is useful for the selection of target protein for crystallization. Recently, support vector machines have been widely used to solve various biological problems. In this work, we present a SVMCRYS method which use support vector machine to classify protein sequence into 'amenable to crystallization' and 'resistant to crystallization'. SVMCRYS was trained on a dataset containing 728 sequences that gave diffraction quality crystal and 728 sequences where work had been stopped before obtaining crystal. The performance of SVMCRYS method was compared with other sequence-based crystallization prediction methods such as SECRET, CRYSTALP, OB-Score, ParCrys and XtalPred using three different datasets. SVMCRYS achieved better prediction rate with higher sensitivity and specificity. Our analysis suggests that SVMCRYS can be used to predict proteins which are amenable to crystallization and proteins which are difficult for crystallization. The SVMCRYS software, dataset and feature set can be obtained from http://www3.ntu.edu.sg/home/EPNSugan/index_files/svmcrys.htm.
UR - http://www.scopus.com/inward/record.url?scp=77950100747&partnerID=8YFLogxK
U2 - 10.2174/092986610790963726
DO - 10.2174/092986610790963726
M3 - Journal articles
C2 - 20044918
AN - SCOPUS:77950100747
SN - 0929-8665
VL - 17
SP - 423
EP - 430
JO - Protein and Peptide Letters
JF - Protein and Peptide Letters
IS - 4
ER -