Many proteins are highly modular, being assembled from globular domains and segments of natively disordered polypeptides. Linear motifs, short sequence modules functioning independently of protein tertiary structure, are most abundant in natively disordered polypeptides but are also found in accessible parts of globular domains, such as exposed loops. The prediction of novel occurrences of known linear motifs attempts the difficult task of distinguishing functional matches from stochastically occurring non-functional matches. Although functionality can only be confirmed experimentally, confidence in a putative motif is increased if a motif exhibits attributes associated with functional instances such as occurrence in the correct taxonomic range, cellular compartment, conservation in homologues and accessibility to interacting partners. Several tools now use these attributes to classify putative motifs based on confidence of functionality. Current methods assessing motif accessibility do not consider much of the information available, either predicting accessibility from primary sequence or regarding any motif occurring in a globular region as low confidence. We present a method considering accessibility and secondary structural context derived from experimentally solved protein structures to rectify this situation. Putatively functional motif occurrences are mapped onto a representative domain, given that a high quality reference SCOP domain structure is available for the protein itself or a close relative. Candidate motifs can then be scored for solvent-accessibility and secondary structure context. The scores are calibrated on a benchmark set of experimentally verified motif instances compared with a set of random matches. A combined score yields 3-fold enrichment for functional motifs assigned to high confidence classifications and 2.5-fold enrichment for random motifs assigned to low confidence classifications. 
The structure filter is implemented as a pipeline accessible both through a graphical interface at the ELM resource (http://elm.eu.org/) and through a Web Service protocol. New occurrences of known linear motifs require experimental validation, as the bioinformatics tools currently have limited reliability. The ELM structure filter will aid users in assessing candidate motifs that occur in globular structural regions. Most importantly, it will help users decide whether to expend their valuable time and resources on experimental testing of interesting motif candidates.
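The calibration idea described above (scoring candidates, then comparing verified instances with random matches) can be illustrated with a minimal sketch. This is not the ELM implementation: the weighted-mean `combined_score`, the `threshold` parameter and the definition of fold enrichment as a ratio of fractions are all assumptions made for illustration.

```python
def combined_score(accessibility, secondary_structure, weight=0.5):
    """Hypothetical combination: weighted mean of two scores scaled to [0, 1]."""
    return weight * accessibility + (1.0 - weight) * secondary_structure


def fold_enrichment(functional_scores, random_scores, threshold):
    """Fraction of functional scores at or above the threshold,
    divided by the same fraction for random matches (assumed definition)."""
    if not functional_scores or not random_scores:
        raise ValueError("both score lists must be non-empty")
    frac_functional = sum(s >= threshold for s in functional_scores) / len(functional_scores)
    frac_random = sum(s >= threshold for s in random_scores) / len(random_scores)
    if frac_random == 0:
        return float("inf")  # no random match reaches the threshold
    return frac_functional / frac_random


# Toy scores, invented for illustration only.
functional = [0.9, 0.8, 0.7, 0.2]
random_matches = [0.9, 0.3, 0.2, 0.1]
print(fold_enrichment(functional, random_matches, threshold=0.6))
```

A threshold chosen this way trades coverage for confidence: raising it usually increases enrichment among the retained candidates while discarding more true instances.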
Plasmodium falciparum and Schistosoma mansoni are the parasites responsible for most of the malaria and schistosomiasis cases in the world. Notwithstanding their many differences, the two agents are strikingly similar in that both are blood feeders and are targets of an overlapping set of drugs, including the well-known artemether molecule. Here we explore the possibility of using the known information about the mode of action of artemether in Plasmodium to identify the molecular target of the drug in Schistosoma, and provide evidence that artemether binds to SmSERCA, a putative Ca²⁺-ATPase of Schistosoma. We also predict the putative binding mode of the molecule for both its Plasmodium and Schistosoma targets. Our analysis of the mode of binding of artemether to Ca²⁺-ATPases also provides an explanation for the apparent paradox that, although the molecule has no side effects in humans, it has been shown to possess antitumoral activity.
Phosphorylation is the most widely studied post-translational modification occurring in cells. While mass spectrometry-based proteomics experiments are uncovering thousands of novel in vivo phosphorylation sites, the identification of kinase specificity rules remains a relatively slow and often ineffective task. In the last twenty years, much effort has been devoted to the experimental and computational identification of sequence and structural motifs encoding the key residues of kinase-substrate interaction and the phosphorylated amino acid itself. In this review, we retrace the road to the discovery of phosphorylation sequence motifs, examine the progress achieved in the detection of three-dimensional motifs and discuss their importance in understanding the regulation and de-regulation of many cellular processes.
False occurrences of functional motifs in protein sequences can be considered as random events due solely to the sequence composition of a proteome. Here we use a numerical approach to investigate the random appearance of functional motifs with the aim of addressing biological questions such as: How are organisms protected from undesirable occurrences of motifs otherwise selected for their functionality? Has the random appearance of functional motifs in protein sequences been affected during evolution? We analyse the occurrence of functional motifs in random sequences and compare it to that observed in biological proteomes; the behaviour of random motifs is also studied. Most motifs exhibit a number of false positives similar to the number of times they appear in randomized proteomes (i.e. the expected number of false positives). Interestingly, about 3% of the analysed motifs behave differently and appear in biological proteomes less often than they do in random sequences. In some of these cases, a mechanism of evolutionary negative selection is apparent; this helps to prevent unwanted functionalities which could interfere with cellular mechanisms. Our thorough statistical and biological analysis showed that several mechanisms and evolutionary constraints affect the appearance of functional motifs in protein sequences.
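The comparison between observed and expected motif counts can be sketched as follows. This is only an illustration under stated assumptions: the randomization used here (shuffling each sequence, which preserves its composition) and the function names are hypothetical choices, and the randomization model used in the study may differ.

```python
import random
import re


def count_matches(pattern, sequences):
    """Count non-overlapping regex matches of a motif over all sequences."""
    regex = re.compile(pattern)
    return sum(len(regex.findall(seq)) for seq in sequences)


def shuffle_sequence(seq, rng):
    """Return a shuffled copy of seq with the same residue composition."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def expected_random_matches(pattern, sequences, n_shuffles=100, seed=0):
    """Mean match count over n_shuffles composition-preserving randomizations."""
    rng = random.Random(seed)
    totals = [
        count_matches(pattern, [shuffle_sequence(s, rng) for s in sequences])
        for _ in range(n_shuffles)
    ]
    return sum(totals) / n_shuffles


# Toy sequences, invented for illustration only.
proteome = ["MAPAAPLKRT", "PPPPGSTKRK"]
observed = count_matches(r"P..P", proteome)
expected = expected_random_matches(r"P..P", proteome, n_shuffles=200)
print(observed, expected)
```

A motif whose observed count falls well below the expected count would be a candidate for the under-representation described above; deciding whether the difference is significant requires a proper statistical test on the shuffled distribution, which this sketch omits.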
Despite the investments in malaria research, an effective vaccine has not yet been developed and the causative parasites are becoming increasingly resistant to most of the available drugs. PfATP6, the sarco/endoplasmic reticulum Ca2+ pump (SERCA) of P. falciparum, has been recently genetically validated as a potential antimalarial target and cyclopiazonic acid (CPA) has been found to be a potent inhibitor of SERCAs in several organisms, including P. falciparum. In position 263, PfATP6 displays a leucine residue, whilst the corresponding position in the mammalian SERCA is occupied by a glutamic acid. The PfATP6 L263E mutation has been studied in relation to the artemisinin inhibitory effect on P. falciparum and recent studies have provided evidence that the parasite with this mutation is more susceptible to CPA. Here, we characterized, for the first time, the interaction of CPA with PfATP6 and its mammalian counterpart to understand similarities and differences in the mode of binding of the inhibitor to the two Ca2+ pumps. We found that, even though CPA does not directly interact with the residue in position 263, the presence of a hydrophobic residue in this position in PfATP6 rather than a negatively charged one, as in the mammalian SERCA, entails a conformational arrangement of the binding pocket which, in turn, determines a relaxation of CPA leading to a different binding mode of the compound. Our findings highlight differences between the plasmodial and human SERCA CPA-binding pockets that may be exploited to design CPA derivatives more selective toward PfATP6.
Relatively few protein structures are known, compared to the enormous amount of sequence data produced in the sequencing of different genomes, and relatively few protein complexes are deposited in the PDB with respect to the great amount of interaction data coming from high-throughput experiments (two-hybrid or affinity purification of protein complexes and mass spectrometry). Nevertheless, we can rely on computational techniques to extract high-quality and information-rich data from the known structures and to propagate them across the protein sequence space. We describe here the ongoing research projects in our group. We analyse the protein complexes stored in the PDB and, for each complex involving one domain belonging to a family of interaction domains for which some interaction data are available, we can calculate its probability of interaction with any protein sequence. We analyse the structures of proteins encoding a function specified in a PROSITE pattern, which exhibits relatively low selectivity and specificity, and build extended patterns. To this end, we consider residues that are well-conserved in the structure, even if their conservation cannot easily be recognized in the sequence alignment of the proteins holding the function. We also analyse protein surface regions and, through the annotation of the solvent-exposed residues, we annotate protein surface patches via a structural comparison performed with stringent parameters and independently of the residue order in the sequence. Local surface comparison may also help in identifying new sequence patterns, which could not be highlighted with other sequence-based methods.
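To make concrete what a PROSITE pattern encodes before any structure-based extension, the sketch below converts basic PROSITE syntax into a Python regular expression. The function name is hypothetical, and only the core syntax is handled (`x`, `[..]`, `{..}`, repeat counts, and the `<` and `>` anchors outside brackets); it is not part of the work described above.

```python
import re


def prosite_to_regex(pattern):
    """Translate a basic PROSITE pattern into a Python regex string."""
    pattern = pattern.strip().rstrip(".")  # PROSITE patterns end with a period
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # N-terminal anchor
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # C-terminal anchor
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        match = re.fullmatch(r"([^()]+)(?:\((\d+(?:,\d+)?)\))?", element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, repeat = match.groups()
        if core == "x":  # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):  # excluded residues
            core = "[^" + core[1:-1] + "]"
        # Single letters and [..] sets are already valid regex fragments.
        parts.append(prefix + core + ("{" + repeat + "}" if repeat else "") + suffix)
    return "".join(parts)


# PS00001 (N-glycosylation site) in PROSITE syntax.
regex = prosite_to_regex("N-{P}-[ST]-{P}.")
print(regex, bool(re.search(regex, "AANGSAA")))
```

Short patterns of this kind match many sequences by chance, which is the low selectivity that motivates adding structurally conserved residues to the pattern.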
Resistance of malaria strains to chloroquine is known to be associated with a parasite protein named PfCRT, the mutated form of which is able to reduce chloroquine accumulation in the digestive vacuole of the pathogen. Whether the protein mediates extrusion of the drug by acting as a channel or as a carrier, and what the protonation state of its chloroquine substrate is, are the subject of scientific debate. We present here an analytical approach that explores which combinations of hypotheses on the mechanism of transport and the protonation state of chloroquine are consistent with available equilibrium experimental data. We show that the available experimental data are not, by themselves, sufficient to conclude whether the protein acts as a channel or as a transporter, which explains why different authors have interpreted them differently. Interestingly, though, each of the two models is only consistent with a subset of hypotheses on the protonation state of the transported molecule. The combination of these results with a sequence and structure analysis of PfCRT, which strongly suggests that the molecule is a carrier, indicates that the transported species is the mono-protonated form of chloroquine, the di-protonated form, or both. We believe that our results, besides shedding light on the mechanism of chloroquine resistance in P. falciparum, have implications for the development of novel therapies against resistant malaria strains and demonstrate the usefulness of an approach combining systems biology strategies with structural bioinformatics and experimental data.
Background: The SH3 domain family is one of the most representative and widely studied cases of so-called Peptide Recognition Modules (PRM). The polyproline II motif PxxP that generally characterizes its ligands does not reflect the complex interaction spectrum of the over 1500 different SH3 domains, and the need for a more refined knowledge of their specificity calls for appropriate experimental and theoretical strategies. Due to the limitations of the current technology for peptide synthesis, several experimental high-throughput approaches have been devised to elucidate protein-protein interaction mechanisms. Such approaches can rely on and take advantage of computational techniques, such as regular expressions or position specific scoring matrices (PSSMs), to pre-process entire proteomes in the search for putative SH3 targets. In this regard, a reliable inference methodology for reducing the sequence space of putative binding peptides represents a valuable support for molecular and cellular biologists. Results: Using as a benchmark the peptide sequences obtained from in vitro binding experiments, we set up a neural network model that performs better than PSSMs in the detection of SH3 domain interactors. In particular, our model is more precise in its predictions, even if its performance can vary among different SH3 domains and is strongly dependent on the number of binding peptides in the benchmark. Conclusion: We show that a neural network can be more effective than standard methods in SH3 domain specificity detection. Neural classifiers identify general SH3 domain binders and domain-specific interactors from a PxxP peptide population, provided that there is a sufficient proportion of true positives in the training sets. This capability can also improve peptide selection for library definition in array experiments. Further advances can be achieved by including properly encoded domain sequences and structural information as input for a global neural network.
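The PSSM baseline mentioned above can be sketched in a few lines. This illustrates the standard technique only, not the neural network model or the matrices used in the study; the uniform background frequency, the pseudocount value and the function names are assumptions, and the training peptides are invented.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(peptides, pseudocount=1.0):
    """Build a log-odds PSSM from equal-length peptides (uniform background assumed)."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        column = [p[pos] for p in peptides]
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((column.count(aa) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score_peptide(pssm, peptide):
    """Sum the per-position log-odds scores (standard residues only)."""
    return sum(column[aa] for column, aa in zip(pssm, peptide))


def best_window(pssm, sequence):
    """Return (score, start) of the highest-scoring window in a sequence."""
    width = len(pssm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the PSSM")
    return max(
        (score_peptide(pssm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Invented PxxP-like binders, for illustration only.
pssm = build_pssm(["PPLP", "PALP", "PPVP"])
print(best_window(pssm, "GGPPLPGG"))
```

Because a PSSM scores each position independently, it cannot capture dependencies between positions, which is one reason a neural classifier may do better when enough binding peptides are available for training.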
Monitoring resistance phenotypes for Plasmodium falciparum, using in vitro growth assays, and relating findings to parasite genotype has proved particularly challenging for the study of resistance to artemisinins. Plasmodium falciparum isolates cultured from 28 returning travellers diagnosed with malaria were assessed for sensitivity to artemisinin, artemether, dihydroartemisinin and artesunate, and findings were related to mutations in pfatp6 and pfmdr1. Resistance to artemether in vitro was significantly associated with a pfatp6 haplotype encoding two amino acid substitutions, A623E and S769N (mean IC50 (95% CI) values of 8.2 (5.7 - 10.7) nM for A623/S769 versus 13.5 (9.8 - 17.3) nM for 623E/769N, a mean increase of 65%; p = 0.012). Increased copy number of pfmdr1 was not itself associated with increased IC50 values for artemether, but when interactions between the pfatp6 haplotype and increased copy number of pfmdr1 were examined together, a highly significant association was noted with IC50 values for artemether (mean IC50 (95% CI) values of 8.7 (5.9 - 11.6) versus 16.3 (10.7 - 21.8) nM, a mean increase of 87%; p = 0.0068). Previously described SNPs in pfmdr1 are also associated with differences in sensitivity to some artemisinins. These findings were further explored in molecular modelling experiments that suggest mutations in pfatp6 are unlikely to affect differential binding of artemisinins at their proposed site, whereas there may be differences in such binding associated with mutations in pfmdr1. Implications for a hypothesis that artemisinin resistance may be exacerbated by interactions between PfATP6 and PfMDR1 and for epidemiological studies to monitor emerging resistance are discussed.