Predicting and Analyzing Lipid-Binding Proteins Using an Efficient Physicochemical Property Mining Method
2013
Lipid-binding proteins participate in many important biological processes and are closely associated with diseases such as metabolic diseases, cancer, and autoimmune diseases. Existing studies focus on predicting lipid-binding functions or lipid-binding sites, but not on distinguishing lipid-binding proteins from non-lipid-binding proteins. This study proposes a systematic approach to identify a small set of physicochemical and biochemical properties in the AAindex database and to design a support vector machine (SVM) based classifier for predicting and analyzing lipid-binding proteins. The merits of this study are three-fold. First, we establish a data set of lipid-binding proteins collected from SwissProt using gene ontology (GO) annotation terms. Second, we use IBCGA, an efficient genetic-algorithm-based optimization method, to select an informative set of features for representing sequences from the viewpoint of machine learning. Third, we analyze the selected features to identify the related physicochemical properties that may affect the binding mechanism of lipid-binding proteins. To overcome the unbalanced dataset problem caused by the putative negative dataset (537,346) being more than 500 times the size of the positive dataset (1,053), a dataset determining technique is proposed. The resulting dataset is then used to build a high-performance classifier. The prediction accuracy on the independent test is 77.75% using 18 properties. The 18 selected properties can be divided into six groups: alpha and turn propensities, beta propensity, composition, hydrophobicity, physicochemical properties, and other properties. Hydrophobicity and alpha-helix propensity are the properties most closely related to lipid-binding proteins.
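As a rough illustration of the two data-preparation ideas in the abstract (representing a sequence by per-residue physicochemical properties, and reducing a heavily unbalanced negative set), the following Python sketch may help. It rests on several assumptions that are not in the abstract: each feature is taken as the mean of a property over the sequence, the Kyte-Doolittle hydropathy scale stands in for an AAindex property (the abstract does not name the 18 selected properties), and plain random undersampling stands in for the unspecified dataset determining technique. The function names `mean_property`, `encode`, and `undersample` are hypothetical.

```python
import random

# Kyte-Doolittle hydropathy scale, used here only as an example property.
# Assumption: the abstract does not say which AAindex properties were selected.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_property(sequence, scale):
    """Average a per-residue property over a sequence.

    Residues missing from the scale (for example 'X') are skipped.
    """
    values = [scale[aa] for aa in sequence.upper() if aa in scale]
    if not values:
        raise ValueError("sequence has no residues covered by the scale")
    return sum(values) / len(values)


def encode(sequence, scales):
    """Build a feature vector with one mean-property value per scale."""
    return [mean_property(sequence, scale) for scale in scales]


def undersample(positives, negatives, ratio=1.0, seed=0):
    """Randomly keep about `ratio` negatives per positive.

    Assumption: a simple stand-in for the paper's dataset determining
    technique, whose details are not given in the abstract.
    """
    rng = random.Random(seed)
    keep = min(len(negatives), int(round(ratio * len(positives))))
    return list(positives), rng.sample(list(negatives), keep)


if __name__ == "__main__":
    features = encode("MKTAYIAKQRQISFVK", [KYTE_DOOLITTLE])
    print("feature vector:", features)

    pos = ["pos_%d" % i for i in range(5)]
    neg = ["neg_%d" % i for i in range(500)]
    pos_kept, neg_kept = undersample(pos, neg, ratio=1.0)
    print(len(pos_kept), "positives,", len(neg_kept), "negatives")
```

In a fuller pipeline, vectors built this way from several property scales would be passed to an SVM, and a feature selector such as IBCGA would choose which scales to keep; both steps are left out here because the abstract gives no parameters for them.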