    Protease-Inhibitor Interaction Predictions: Lessons on the Complexity of Protein–Protein Interactions
    Abstract:
    Protein interactions shape proteome function and thus biology. Identification of protein interactions is a major goal in molecular biology, but biochemical methods, although improving, remain limited in coverage and accuracy. Whereas computational predictions can guide biochemical experiments, low validation rates of predictions remain a major limitation. Here, we investigated computational methods in the prediction of a specific type of interaction, the inhibitory interactions between proteases and their inhibitors. Proteases generate thousands of proteoforms that dynamically shape the functional state of proteomes. Despite the important regulatory role of proteases, knowledge of their inhibitors remains largely incomplete with the vast majority of proteases lacking an annotated inhibitor. To link inhibitors to their target proteases on a large scale, we applied computational methods to predict inhibitory interactions between proteases and their inhibitors based on complementary data, including coexpression, phylogenetic similarity, structural information, co-annotation, and colocalization, and also surveyed general protein interaction networks for potential inhibitory interactions. In testing nine predicted interactions biochemically, we validated the inhibition of kallikrein 5 by serpin B12. Despite the use of a wide array of complementary data, we found a high false positive rate of computational predictions in biochemical follow-up. Based on a protease-specific definition of true negatives derived from the biochemical classification of proteases and inhibitors, we analyzed the prediction accuracy of individual features and thereby identified feature-specific limitations, which also affected general protein interaction prediction methods. Interestingly, proteases were often not coexpressed with most of their functional inhibitors, contrary to what is commonly assumed and extrapolated predominantly from cell culture experiments.
Predictions of inhibitory interactions were indeed more challenging than predictions of nonproteolytic and noninhibitory interactions. In summary, we describe a novel and well-defined but difficult protein interaction prediction task and thereby highlight limitations of computational interaction prediction methods.
Identification of protein interactions is an important goal in molecular biology yet one that remains difficult. Approaches such as yeast-2-hybrid, coimmunoprecipitation and newer experimental methods (1.Kristensen A.R. Gsponer J. Foster L.J. A high-throughput approach for measuring temporal changes in the interactome.Nat. Methods. 2012; 9: 907-909, 2.Weisbrod C.R. Chavez J.D. Eng J.K. Yang L. Zheng C. Bruce J.E. In vivo protein interaction network identified with a novel real-time cross-linked peptide identification strategy.J. Proteome Res. 2013; 12: 1569-1579) are highly productive and scalable. However, limited accuracy from false positives and coverage that is context dependent remain problematic (3.von Mering C. Krause R. Snel B. Cornell M. Oliver S.G. Fields S. Bork P. Comparative assessment of large-scale data sets of protein–protein interactions.Nature. 2002; 417: 399-403, 4.Braun P. Tasan M. Dreze M. Barrios-Rodiles M. Lemmens I. Yu H. Sahalie J.M. Murray R.R. Roncari L. de Smet A.-S. Venkatesan K. Rual J.-F. Vandenhaute J.
Cusick M.E. Pawson T. Hill D.E. Tavernier J. Wrana J.L. Roth F.P. Vidal M. An experimentally derived confidence score for binary protein–protein interactions.Nat. Methods. 2009; 6: 91-97Crossref PubMed Scopus (334) Google Scholar). Computational methods have been developed to predict protein–protein interactions, commonly linking together proteins on the basis of shared features such as patterns of conservation, expression, or annotations (5.Jansen R. Yu H. Greenbaum D. Kluger Y. Krogan N.J. Chung S. Emili A. Snyder M. Greenblatt J.F. Gerstein M. A Bayesian networks approach for predicting protein–protein interactions from genomic data.Science. 2003; 302: 449-453Crossref PubMed Scopus (1051) Google Scholar, 6.Rhodes D.R. Tomlins S.A. Varambally S. Mahavisno V. Barrette T. Kalyana-Sundaram S. Ghosh D. Pandey A. Chinnaiyan A.M. Probabilistic model of the human protein–protein interaction network.Nat. Biotechnol. 2005; 23: 951-959Crossref PubMed Scopus (353) Google Scholar, 7.Bhardwaj N. Lu H. Correlation between gene expression profiles and protein–protein interactions within and across genomes.Bioinformatics. 2005; 21: 2730-2738Crossref PubMed Scopus (135) Google Scholar, 8.Franceschini A. Szklarczyk D. Frankild S. Kuhn M. Simonovic M. Roth A. Lin J. Minguez P. Bork P. von Mering C. Jensen L.J. STRING v9.1: Protein–protein interaction networks, with increased coverage and integration.Nucleic Acids Res. 2013; 41: D808-D815Crossref PubMed Scopus (3296) Google Scholar)—a version of guilt by association. A second class of approaches uses protein structural features to identify potential physical interaction interfaces (9.Zhang Q.C. Petrey D. Deng L. Qiang L. Shi Y. Thu C.A. Bisikirska B. Lefebvre C. Accili D. Hunter T. Maniatis T. Califano A. Honig B. Structure-based prediction of protein–protein interactions on a genome-wide scale.Nature. 2012; 490: 556-560Crossref PubMed Scopus (511) Google Scholar). These approaches can be combined. 
However, their practical utility remains unclear. In the methods cited above, accuracy was estimated by cross-validation or by validating a small number of hand-picked test cases (5.Jansen R. Yu H. Greenbaum D. Kluger Y. Krogan N.J. Chung S. Emili A. Snyder M. Greenblatt J.F. Gerstein M. A Bayesian networks approach for predicting protein–protein interactions from genomic data.Science. 2003; 302: 449-453Crossref PubMed Scopus (1051) Google Scholar, 6.Rhodes D.R. Tomlins S.A. Varambally S. Mahavisno V. Barrette T. Kalyana-Sundaram S. Ghosh D. Pandey A. Chinnaiyan A.M. Probabilistic model of the human protein–protein interaction network.Nat. Biotechnol. 2005; 23: 951-959Crossref PubMed Scopus (353) Google Scholar). Estimates of the true efficacy of prediction methods in structured evaluations, such as those that exist for function prediction (critical assessment of protein function annotation algorithms (10.Radivojac P. Clark W.T. Oron T.R. Schnoes A.M. Wittkop T. Sokolov A. Graim K. Funk C. Verspoor K. Ben-Hur A. Pandey G. Yunes J.M. Talwalkar A.S. Repo S. Souza M.L. Piovesan D. Casadio R. Wang Z. Cheng J. Fang H. Gough J. Koskinen P. Törönen P. Nokso-Koivisto J. Holm L. Cozzetto D. Buchan D.W. Bryson K. Jones D.T. Limaye B. Inamdar H. Datta A. Manjari S.K. Joshi R. Chitale M. Kihara D. Lisewski A.M. Erdin S. Venner E. Lichtarge O. Rentzsch R. Yang H. Romero A.E. Bhat P. Paccanaro A. Hamp T. Kaβner R. Seemayer S. Vicedo E. Schaefer C. Achten D. Auer F. Boehm A. Braun T. Hecht M. Heron M. Hönigschmid P. Hopf T.A. Kaufmann S. Kiening M. Krompass D. Landerer C. Mahlich Y. Roos M. Björne J. Salakoski T. Wong A. Shatkay H. Gatzmann F. Sommer I. Wass M.N. Sternberg M.J. Škunca N. Supek F. Bošnjak M. Panov P. Džeroski S. Šmuc, Kourmpetis Y.A. van Dijk A.D.J. ter Braak C.J. Zhou Y. Gong Q. Dong X. Tian W. Falda M. Fontana P. Lavezzo E. Di Camillo B. Toppo S. Lan L. Djuric N. Guo Y. Vucetic S. Bairoch A. Linial M. Babbitt P.C. Brenner S.E. Orengo C. Rost B. Mooney S.D. 
Friedberg I. A large-scale evaluation of computational protein function prediction.Nat. Methods. 2013; 10: 221-227Crossref PubMed Scopus (589) Google Scholar)), structure prediction (critical assessment of protein structure prediction (11.Moult J. Fidelis K. Kryshtafovych A. Schwede T. Tramontano A. Critical assessment of methods of protein structure prediction (CASP)—Round x.Proteins Struct. Funct. Bioinform. 2014; 82: 1-6Crossref PubMed Scopus (317) Google Scholar)), or for structural docking (critical assessment of prediction of interactions (12.Janin J. Welcome to CAPRI: A critical assessment of predicted interactions.Proteins Struct. Funct. Bioinforma. 2002; 47: 257Crossref Scopus (54) Google Scholar)), are lacking for protein interaction prediction methods. If computational predictions of interactions were sufficiently accurate, biochemical assays could be targeted more efficiently by focusing on predicted pairs (9.Zhang Q.C. Petrey D. Deng L. Qiang L. Shi Y. Thu C.A. Bisikirska B. Lefebvre C. Accili D. Hunter T. Maniatis T. Califano A. Honig B. Structure-based prediction of protein–protein interactions on a genome-wide scale.Nature. 2012; 490: 556-560Crossref PubMed Scopus (511) Google Scholar), but to date, computational predictions do not appear to have played a major role in interaction discovery or prioritization (13.Pavlidis P. Gillis J. Progress and challenges in the computational prediction of gene function using networks: 2012–2013 update.F1000Research. 2013; 2: 230Crossref PubMed Scopus (15) Google Scholar). We hypothesized that studying a specific subset of protein interactions and combining computational prediction and biochemical validation will grant deeper insights into the pitfalls and state of the art for general protein interaction predictions. 
We focused on the prediction of interactions between protease inhibitors and proteases, a problem that to our knowledge has not received specific attention despite being characterized by covalent or low-KD noncovalent interactions (low nM or pM) and hence, in principle, being more tractable for identification than high-KD noncovalent, general protein–protein interactions. Previous cell culture and transcript analyses have suggested that known protease–inhibitor pairs are often coexpressed and coregulated (14.Breckon J.J. Papaioannou S. Kon L.W. Tumber A. Hembry R.M. Murphy G. Reynolds J.J. Meikle M.C. Stromelysin (MMP-3) synthesis is up-regulated in estrogen-deficient mouse osteoblasts in vivo and in vitro.J. Bone Miner. Res. 1999; 14: 1880-1890, 15.Nuttall R.K. Pennington C.J. Taplin J. Wheal A. Yong V.W. Forsyth P.A. Edwards D.R. Elevated membrane-type matrix metalloproteinases in gliomas revealed by profiling proteases and inhibitors in human cancer cells.Mol. Cancer Res. 2003; 1: 333-345). It is therefore hypothesized that protease–inhibitor coexpression plays a major role in the regulation of the detrimental activities of a protease. Inverse protease–inhibitor coexpression is thought to amplify protease activity but has only been observed for relatively few protease–inhibitor pairs (16.Overall C.M. Wrana J.L. Sodek J. Independent regulation of collagenase, 72-kDa progelatinase, and metalloendoproteinase inhibitor expression in human fibroblasts by transforming growth factor-beta.J. Biol. Chem. 1989; 264: 1860-1869, 17.Overall C.M. Sodek J. Concanavalin A produces a matrix-degradative phenotype in human fibroblasts.
Induction and endogenous activation of collagenase, 72-kDa gelatinase, and pump-1 is accompanied by the suppression of the tissue inhibitor of matrix metalloproteinases.J. Biol. Chem. 1990; 265: 21141-21151Abstract Full Text PDF PubMed Google Scholar). Overall, it is currently a common assumption that protease–inhibitor coexpression is evidence for an inhibitory interaction, but this concept has not been tested comprehensively. Proteases are a critical component of the posttranslational regulatory machinery in cells and therefore promising drug targets. However, drug targeting of proteases has been hampered by complex protease biology that is often poorly understood. One aspect of this complexity is the organization of proteases in dense interaction networks of protease cleavage and interaction (18.Fortelny N. Cox J.H. Kappelhoff R. Starr A.E. Lange P.F. Pavlidis P. Overall C.M. Network analyses reveal pervasive functional regulation between proteases in the human protease web.PLoS Biol. 2014; 12: e1001869Crossref PubMed Scopus (121) Google Scholar). Proteases regulate the activity of other proteases by direct cleavage or by cleaving their endogenous inhibitors, which in turn influences additional distal cleavage events. Thus, proteases can potentially indirectly influence the cleavage of substrates other than their direct substrates. We recently established a graph model of protease web interactions based on existing biochemical data that can be used to predict proteolytic pathways (19.Fortelny N. Yang S. Pavlidis P. Lange P.F. Overall C.M. Proteome TopFIND 3.0 with TopFINDer and PathFINDer: Database and analysis tools for the association of protein termini to pre- and post-translational events.Nucleic Acids Res. 2015; 43: D290-D297Crossref PubMed Scopus (83) Google Scholar). However, the network is far from its full potential because cleavage and inhibition interaction data underlying the model are incomplete. 
This is mainly due to the lack of studies of proteases and inhibitors but also to the lack of uploading of existing data to community databases. Computational prediction could provide a means to accelerate the addition of interactions to this network. However, large-scale computational prediction efforts in protease interaction biology have been limited to the use of molecular features of proteases and their substrates to predict protease cleavage (20.Song J. Matthews A.Y. Reboul C.F. Kaiserman D. Pike R.N. Bird P.I. Whisstock J.C. Predicting serpin/protease interactions.Methods Enzymol. 2011; 501: 237-273Crossref PubMed Scopus (9) Google Scholar) and have largely ignored protease inhibition. Therefore, the whole realm of protease inhibition is underexplored, with 354 (∼80%) of 444 human proteases lacking annotated inhibitors and 13 (∼14%) of 94 inhibitors without any annotated targets (orphan inhibitors) in the MEROPS protease database (21.Rawlings N.D. Barrett A.J. Bateman A. MEROPS: The database of proteolytic enzymes, their substrates and inhibitors.Nucleic Acids Res. 2012; 40: D343-D350Crossref PubMed Scopus (711) Google Scholar). Proteases are regulated by multiple mechanisms other than inhibition such as autodegradation, reversible activation, substrate-induced activation, and other allosteric activators. However, protease inhibitors are often present in adjacent compartments to block and clear excess proteases that could rapidly and irreversibly cleave a large number of proteins. Protease inhibitors are therefore often secreted in the plasma or distal tissues to block proteases delivered by diffusion, secretion, or leakage from tissues to the circulation. Considering the key role of proteases in cell signaling pathways, identifying additional, physiologically relevant protease–inhibitor pairs would greatly benefit our understanding of protease biology. 
Important questions in interaction prediction methods are which input data to use for predictions and how to evaluate performance (in contrast, the prediction algorithm used plays relatively little role (22.Gillis J. Pavlidis P. The role of indirect connections in gene networks in predicting function.Bioinformatics. 2011; 27: 1860-1866)). To evaluate performance of a predictor, efficacy in separating predefined true positive (TP) and true negative (TN) examples is measured. (The abbreviations used are: TP, true positives; TN, true negatives; PPI, protein-protein interaction; MMP, matrix metalloproteinase; AUC, area under the curve; ROC, receiver operating characteristic; RPKM, reads per kilobase of transcript per million mapped reads; GO, Gene Ontology; EXP, Inferred from Experiment; IDA, Inferred from Direct Assay; IPI, Inferred from Physical Interaction; IGI, Inferred from Genetic Interaction; IMP, Inferred from Mutant Phenotype; IEP, Inferred from Expression Pattern; TAS, Traceable Author Statement.) For example, in interaction prediction, if most true interacting proteins are coexpressed and noninteractors are not coexpressed, then coexpression is a good predictor of interaction. The better the separation of the two groups, the better the predictive performance.
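The idea of scoring a predictor by how well it separates TP from TN examples can be illustrated with a small Python sketch (the analyses in the paper were done in R; Python is used here only for illustration). It computes the area under the ROC curve as the fraction of TP/TN pairs in which the TP receives the higher score, counting ties as one half. The function name `roc_auc` and the score values are hypothetical and are not taken from the paper.

```python
def roc_auc(tp_scores, tn_scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen true positive scores higher than a randomly chosen true negative
    (ties count as 0.5)."""
    if not tp_scores or not tn_scores:
        raise ValueError("need at least one TP score and one TN score")
    wins = 0.0
    for p in tp_scores:
        for n in tn_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(tp_scores) * len(tn_scores))


# Hypothetical coexpression correlations for known inhibitory pairs (TP)
# and for implausible pairs (TN); illustrative values only.
tp = [0.8, 0.6, 0.1]
tn = [0.5, 0.2, -0.3]
auc = roc_auc(tp, tn)
print(f"AUC = {auc:.3f}")
```

An AUC near 1 would mean the feature separates the two groups well, whereas a value near 0.5 would mean it does no better than chance.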
In general, TPs are readily found in biological databases, but the definition of TNs is a challenge, especially for weak interactions having low mM KDs, and more practically since a lack of interaction is rarely established and documented. Common approaches therefore use unlikely interactions as TNs, for example, random interactions (based on the assumption that true interactions are a small subset of all possible interactions) or interactions between proteins localized to different cellular compartments according to annotation (4.Braun P. Tasan M. Dreze M. Barrios-Rodiles M. Lemmens I. Yu H. Sahalie J.M. Murray R.R. Roncari L. de Smet A.-S. Venkatesan K. Rual J.-F. Vandenhaute J. Cusick M.E. Pawson T. Hill D.E. Tavernier J. Wrana J.L. Roth F.P. Vidal M. An experimentally derived confidence score for binary protein–protein interactions.Nat. Methods. 2009; 6: 91-97). An advantage of the protease–inhibitor prediction task is the ability to define TP and TN inhibitions more accurately. Protease inhibitors are characterized by tight interactions with their cognate proteases, thus providing a clear separation between true and false interactors. Further, proteases and their inhibitors are organized into families based on their primary sequence and into clans based on the structure of their active site and reactive site, respectively (21.Rawlings N.D. Barrett A.J. Bateman A. MEROPS: The database of proteolytic enzymes, their substrates and inhibitors.Nucleic Acids Res. 2012; 40: D343-D350). Families and clans of inhibitors can mostly be assigned specifically to one or two target protease classes. Thus, it is possible to define TN pairs, where the inhibitor cannot inhibit the protease based on known chemical and structural constraints.
As examples, a serpin will not inhibit a metalloprotease, and a tissue inhibitor of metalloproteinases will neither inhibit a serine protease nor aspartate, threonine, or cysteine proteases. However, matrix metalloproteinases (MMPs) cleave and inactivate many serpins and so transiently are also interactors before peptide bond scission, albeit with a moderate KD (∼ Km) (18.Fortelny N. Cox J.H. Kappelhoff R. Starr A.E. Lange P.F. Pavlidis P. Overall C.M. Network analyses reveal pervasive functional regulation between proteases in the human protease web.PLoS Biol. 2014; 12: e1001869Crossref PubMed Scopus (121) Google Scholar, 23.auf dem Keller U. Prudova A. Eckhard U. Fingleton B. Overall C.M. Systems-level analysis of proteolytic events in increased vascular permeability and complement activation in skin inflammation.Sci. Signal. 2013; 6: rs2Crossref PubMed Scopus (96) Google Scholar). A further advantage of selecting this group of proteins in the analysis of prediction methods is the accuracy of biochemical testing of the predictions by measuring inhibition of the catalytic activity of the protease. Here, we defined TP inhibitions (n = 294) as those inhibitions annotated in MEROPS (21.Rawlings N.D. Barrett A.J. Bateman A. MEROPS: The database of proteolytic enzymes, their substrates and inhibitors.Nucleic Acids Res. 2012; 40: D343-D350Crossref PubMed Scopus (711) Google Scholar). We defined TN inhibitions (n = 6,990) as enzymatically implausible inhibitor/protease pairs that are known not to be inhibitory. Using this gold standard, we evaluated the predictive power of common interaction prediction methodology to predict protease–inhibitor pairs in the protease web. Predictions were based on protein–protein interaction data, coannotation, coexpression, phylogenetic similarity, and colocalization as input data. Interestingly, we report that coexpression is surprisingly low for many functional protease–inhibitor pairs, contrary to what is commonly assumed. 
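The TN definition above (enzymatically implausible inhibitor/protease pairs) can be sketched in Python as a lookup of protease classes that a given inhibitor family is not expected to inhibit. This is a minimal illustration, not the authors' procedure: the `IMPLAUSIBLE` table encodes only the two examples stated in the text (serpins versus metalloproteases; tissue inhibitors of metalloproteinases versus serine, aspartate, threonine, and cysteine proteases). The function name and the four-protein toy input are assumptions made for the example.

```python
from itertools import product

# Protease classes that each inhibitor family is assumed unable to inhibit.
# Only the two examples given in the text are encoded; a real table would be
# derived from the MEROPS classification.
IMPLAUSIBLE = {
    "serpin": {"metallo"},
    "TIMP": {"serine", "aspartic", "threonine", "cysteine"},
}


def true_negative_pairs(inhibitors, proteases):
    """Return (inhibitor, protease) pairs judged enzymatically implausible.

    inhibitors: dict mapping inhibitor name -> inhibitor family
    proteases:  dict mapping protease name -> catalytic class
    """
    pairs = []
    for (inh, family), (prot, cls) in product(
        sorted(inhibitors.items()), sorted(proteases.items())
    ):
        if cls in IMPLAUSIBLE.get(family, set()):
            pairs.append((inh, prot))
    return pairs


# Toy input chosen for illustration only.
inhibitors = {"SERPINB12": "serpin", "TIMP1": "TIMP"}
proteases = {"KLK5": "serine", "MMP3": "metallo"}
tn_pairs = true_negative_pairs(inhibitors, proteases)
print(tn_pairs)
```

Pairs that are not implausible (here SERPINB12 with KLK5, and TIMP1 with MMP3) are simply left out of the TN set; whether they are TPs depends on database annotation, not on this table.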
In particular, we employed 40 interaction predictors based on coexpression values derived from different input data and correlation metrics, all of which we found suffered from weak predictive power. Nonetheless, we predicted 270 protease–inhibitor pairs, examined 9 of these predicted inhibitions biochemically, and validated the novel inhibition of kallikrein 5 (KLK5) by serpin B12 (SERPINB12), previously an orphan inhibitor. Protease and protease inhibitor data and coexpression matrices used throughout the analyses are available for download at http://hdl.handle.net/11272/10472. Protease and inhibitor class, family, cleavage, and inhibitor information were extracted from the MEROPS database (http://merops.sanger.ac.uk/) (21.Rawlings N.D. Barrett A.J. Bateman A. MEROPS: The database of proteolytic enzymes, their substrates and inhibitors.Nucleic Acids Res. 2012; 40: D343-D350Crossref PubMed Scopus (711) Google Scholar) version 9.9 on September 30, 2013. MEROPS identifiers were used to classify proteases and inhibitors into classes and families as described on the MEROPS website. Protein-protein interaction (PPI) data from Human Integrated Protein-Protein Interaction Reference (24.Schaefer M.H. Fontaine J.-F. Vinayagam A. Porras P. Wanker E.E. Andrade-Navarro M.A. HIPPIE: Integrating protein interaction networks with experiment based quality scores.PLoS ONE. 2012; 7: e31826Crossref PubMed Scopus (237) Google Scholar) version 1.5 were downloaded on June 12, 2013. PPI data from high-throughput experiments were downloaded from BioGRID (25.Chatr-Aryamontri A. Breitkreutz B.-J. Oughtred R. Boucher L. Heinicke S. Chen D. Stark C. Breitkreutz A. Kolas N. O'Donnell L. Reguly T. Nixon J. Ramage L. Winter A. Sellam A. Chang C. Hirschman J. Theesfeld C. Rust J. Livstone M.S. Dolinski K. Tyers M. The BioGRID interaction database: 2015 update.Nucleic Acids Res. 2015; 43: D470-D478Crossref PubMed Scopus (705) Google Scholar) on October 11, 2013. PPI data from (26.Bossi A. 
Lehner B. Tissue specificity and the human protein interaction network.Mol. Syst. Biol. 2009; 5: 260Crossref PubMed Scopus (260) Google Scholar) were downloaded on October 11, 2013. Experiments with up to 100 identified PPIs were considered low throughput, those with 100–1,000 PPIs were labeled medium throughput, and those with more than 1,000 PPIs were deemed high throughput. Protein localization information was downloaded from three sources: LocDB (27.Rastogi S. Rost B. LocDB: Experimental annotations of localization for Homo sapiens Arabidopsis thaliana.Nucleic Acids Res. 2011; 39: D230-D234Crossref PubMed Scopus (39) Google Scholar) (data downloaded November 19, 2013), the Human Protein Atlas (28.Uhlen M. Oksvold P. Fagerberg L. Lundberg E. Jonasson K. Forsberg M. Zwahlen M. Kampf C. Wester K. Hober S. Wernerus H. Björling L. Ponten F. Towards a knowledge-based Human Protein Atlas.Nat. Biotechnol. 2010; 28: 1248-1250Crossref PubMed Scopus (1706) Google Scholar) (downloaded November 12, 2013.), and Gene Ontology (GO) annotation using the hgu95av2.db package in R (29.RCore Team R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria2013Google Scholar) (downloaded August 8, 2013). For each dataset, annotations were mapped to GO terms and annotation trees for each protein were generated using the GOstats package in R (29.RCore Team R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria2013Google Scholar). For LocDB, primary and secondary localization information was combined for each protein. Main and other localization data from the Human Protein Atlas were used if the reliability was annotated as High, Medium, or Supportive. GO annotations were retained if the evidence code was one of EXP, IDA, IPI, IGI, IMP, IEP, or TAS. Genome Tissue Expression Atlas (GTEx) data (30.Lonsdale J. Thomas J. Salvatore M. Phillips R. Lo E. Shad S. Hasz R. Walters G. 
Garcia F. Young N. Foster B. Moser M. Karasik E. Gillard B. Ramsey K. Sullivan S. Bridge J. Magazine H. Syron J. Fleming J. Siminoff L. Traino H. Mosavel M. Barker L. Jewell S. Rohrer D. Maxim D. Filkins D. Harbach P. Cortadillo E. Berghuis B. Turner L. Hudson E. Feenstra K. Sobin L. Robb J. Branton P. Korzeniewski G. Shive C. Tabor D. Qi L. Groch K. Nampally S. Buia S. Zimmerman A. Smith A. Burges R. Robinson K. Valentino K. Bradbury D. Cosentino M. Diaz-Mayoral N. Kennedy M. Engel T. Williams P. Erickson K. Ardlie K. Winckler W. Getz G. DeLuca D. MacArthur D. Kellis M. Thomson A. Young T. Gelfand E. Donovan M. Meng Y. Grant G. Mash D. Marcus Y. Basile M. Liu J. Zhu J. Tu Z. Cox N.J. Nicolae D.L. Gamazon E.R. Im H.K. Konkashbaev A. Pritchard J. Stevens M. Flutre T. Wen X. Dermitzakis E.T. Lappalainen T. Guigo R. Monlong J. Sammeth M. Koller D. Battle A. Mostafavi S. McCarthy M. Rivas M. Maller J. Rusyn I. Nobel A. Wright F. Shabalin A. Feolo M. Sharopova N. Sturcke A. Paschal J. Anderson J.M. Wilder E.L. Derr L.K. Green E.D. Struewing J.P. Temple G. Volpi S. Boyer J.T. Thomson E.J. Guyer M.S. Ng C. Abdallah A. Colantuoni D. Insel T.R. Koester S.E. Little A.R. Bender P.K. Lehner T. Yao Y. Compton C.C. Vaught J.B. Sawyer S. Lockhart N.C. Demchok J. Moore H.F. The Genotype-Tissue Expression (GTEx) project.Nat. Genet. 2013; 45: 580-585Crossref PubMed Scopus (4349) Google Scholar) were downloaded on January 31, 2013. Gene Expression Omnibus Series 7307 expression data were downloaded from the database Gemma (31.Zoubarev A. Hamer K.M. Keshav K.D. McCarthy E.L. Santos J.R.C. Van Rossum T. McDonald C. Hall A. Wan X. Lim R. Gillis J. Pavlidis P. Gemma: A resource for the reuse, sharing and meta-analysis of expression profiling data.Bioinformatics. 2012; 28: 2272-2273Crossref PubMed Scopus (73) Google Scholar) on June 26, 2013. 
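Two of the filtering rules stated earlier in this section, the throughput bins for PPI experiments and the GO evidence-code filter, can be written as a short Python sketch. The function names are hypothetical. The handling of the boundary at exactly 100 PPIs is an assumption, because the text says both "up to 100" and "100–1,000"; here 100 is treated as low throughput.

```python
# GO evidence codes retained according to the text.
RETAINED_EVIDENCE = {"EXP", "IDA", "IPI", "IGI", "IMP", "IEP", "TAS"}


def throughput_label(n_ppis):
    """Bin an experiment by the number of PPIs it identified.

    Assumption: exactly 100 PPIs counts as low throughput and exactly
    1,000 counts as medium throughput.
    """
    if n_ppis <= 100:
        return "low"
    if n_ppis <= 1000:
        return "medium"
    return "high"


def keep_annotation(evidence_code):
    """True if a GO annotation carries one of the retained evidence codes."""
    return evidence_code in RETAINED_EVIDENCE


print(throughput_label(42), throughput_label(500), throughput_label(5000))
print(keep_annotation("IDA"), keep_annotation("IEA"))
```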
Other microarray-based expression datasets used in meta-coexpression analysis were downloaded from Gemma on January 18, 2013 and are listed in supplemental Table S6. Gene correlation was calculated using the cor function in R (29.RCore Team R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria2013Google Scholar). Partial correlation was calculated using the ppcor package in R. Full datasets or subsets were used as inputs as explained in the results section and in supplemental Table S5. Phylogenetic profile data were constructed by downloading mappings from human proteins to other species from InParanoid (32.Östlund G. Schmitt T. Forslund K. Köstler T. Messina D.N. Roopra S. Frings O. Sonnhammer E.L. InParanoid 7: New algorithms and tools for eukaryotic orthology analysis.Nucleic Acids Res. 2010; 38: D196-D203Crossref PubMed Scopus (502) Google Scholar). Mappings were binarized into 0 (absent) and 1 (present) for the binary networks before calculating the fraction of agreement (where the genes are absent or present in both organisms), Pearson correlation (cor package in R (29.RCore Team R: A La
    Keywords:
    Proteome
    Serpin
    Protease inhibitor (pharmacology)
    Serine proteinase inhibitors (serpins), typically fold to a metastable native state and undergo a major conformational change in order to inhibit target proteases. However, conformational lability of the native serpin fold renders them susceptible to misfolding and aggregation, and underlies misfolding diseases such as α1-antitrypsin deficiency. Serpin specificity towards its protease target is dictated by its flexible and solvent exposed reactive centre loop (RCL), which forms the initial interaction with the target protease during inhibition. Previous studies have attempted to alter the specificity by mutating the RCL to that of a target serpin, but the rules governing specificity are not understood well enough yet to enable specificity to be engineered at will. In this paper, we use conserpin, a synthetic, thermostable serpin, as a model protein with which to investigate the determinants of serpin specificity by engineering its RCL. Replacing the RCL sequence with that from α1-antitrypsin fails to restore specificity against trypsin or human neutrophil elastase. Structural determination of the RCL-engineered conserpin and molecular dynamics simulations indicate that, although the RCL sequence may partially dictate specificity, local electrostatics and RCL dynamics may dictate the rate of insertion during protease inhibition, and thus whether it behaves as an inhibitor or a substrate. Engineering serpin specificity is therefore substantially more complex than solely manipulating the RCL sequence, and will require a more thorough understanding of how conformational dynamics achieves the delicate balance between stability, folding and function required by the exquisite serpin mechanism of action.
    Serpin
    Dynamics
    Citations (42)
    The Serine Protease Inhibitors (Serpins) have been a focus of research by biomedical industries due to their critical role in human health. The use of serpin in the treatment of many diseases was widely investigated through the identification of new genes encoding these proteins in all kingdoms of life. The characterization of these genes revealed that they encoded proteins having low sequence homologies. Future developments are focusing not only on the protease inhibition activity, but also on the other effects due to the interactions of serpins with other components such as hormone transport. Here we give a concise overview of the most recent patents that have been reported in this field of research.
    Serpin
    The native fold of inhibitory serpins (serpin proteinase inhibitors) is metastable and therefore does not represent the most stable conformation that the primary sequence encodes for. The most stable form is adopted when the reactive centre loop (RCL) inserts, as the fourth strand, into the A β-sheet. Currently a serpin can adopt at least four more stable conformations, termed the cleaved, delta, latent and polymeric states. The accessibility of these alternative low energy folds renders the serpin molecule susceptible to mutations that can result in dysfunction and pathology. Here, we discuss the means by which the serpin can attain and preserve this metastable conformation. We also consider the triggers for misfolding to these more stable states and the mechanisms by which it occurs.
    Serpin
    Folding (DSP implementation)
    Metastability
    Citations (22)
    The study of serpin deficiency is currently one of the most active areas in basic medical research. Recently, three hypotheses concerning serpin deficiency have been proposed: the conformational disturbance hypothesis (CDH), the loop-sheet polymerisation hypothesis (LSPH), and the multiple binding site hypothesis (MBSH). CDH was put forward to explain serpin deficiency due to conformational change of the reactive loop of serpins as a result of mutations occurring away from the reactive site residues, and LSPH was proposed to explain deficient serpins due to the formation of polymers. MBSH was proposed to explain the mechanism of the formation of a stable enzyme-serpin complex via more than one binding site, with blockage or mutation in any of the sites resulting in serpin deficiency. A combination of these mechanisms may be critical in understanding the roles of the many documented mutations and autoimmunities which result in qualitative and quantitative serpin deficiency.
    Serpin
    Citations (0)
    Human protein C inhibitor (PCI), a serpin-type protease inhibitor originally described as an inhibitor of activated protein C, has broad protease reactivity. In addition to its activities within the blood clotting and fibrinolytic cascades, it seems to participate in several biological processes including reproduction and tumor growth. This review summarizes the current understanding of PCI function, regulation, and potential biological role.
    Serpin
    Protease inhibitor (pharmacology)
    Citations (32)