Novel Multiprotein Complexes Identified in the Hyperthermophilic Archaeon Pyrococcus furiosus by Non-denaturing Fractionation of the Native Proteome

2009 
The number of sequenced genomes in the repository now exceeds 850 (Genomes Online Database). Consequently, there is a tremendous need to determine the function of gene products and to identify groups of proteins that work together as complexes in distinct cellular processes. Genome-wide functional analyses suggest that there are 200–300 core biological functions that are essential to life (1). More often than not, the functional units are assemblies composed of multiple proteins (2). Many biological processes involve multiprotein complexes that function as large and efficient machines, such as ribosomes (3, 4), flagella (5), and cellulosomes (6). In addition, the supramolecular organization of the enzymes of partial or entire metabolic pathways as “metabolons” appears to provide certain advantages, such as substrate channeling (7–9). Identifying protein-protein interactions and functional, stable associations is therefore extremely important for understanding the biology of a cell. However, predicting the nature of such complexes within a single genome, let alone for hundreds of genomes, remains a major challenge. Although several methods are currently available to study protein-protein interactions on a genome-wide scale (10–12), each has severe limitations. The in vivo two-hybrid system (13–15) requires tagged proteins, is limited to binary interactions, and is thought to generate a large percentage of false positives (16). The epitope tag affinity purification and tandem affinity purification methods have also been used extensively (17–21), but the tags can disrupt native protein-protein interactions, and the methods tend to be biased toward proteins that interact with high affinity and/or proteins of high abundance (10). The major limitation of all of these approaches is that they require genetic manipulation of the target organism, an ability limited to only a few well-studied systems.
Non-genetic techniques to identify protein-protein interactions include co-immunoaffinity precipitation to capture endogenous protein complexes, but this is not a genome-wide approach as it requires highly specific antibodies made against purified proteins (22). Two-dimensional blue native/SDS-PAGE and clear native-PAGE are two other widely used techniques that do not require genetic manipulation and allow for the analysis of protein complexes on a proteome-wide scale in a single experiment (23–26). However, they are limited in their dynamic range and typically identify only high abundance proteins (27).
The goal of this research is to develop a global method to identify novel protein complexes (PCs) independent of a genetic system and applicable to any organism with available native biomass. The approach involves multistep, non-denaturing column chromatography where the co-fractionation of proteins is used to identify potential complexes. As a model system we use the hyperthermophilic archaeon Pyrococcus furiosus, an anaerobe that grows optimally at 100 °C (28). Its genome sequence (29, 30) contains 2125 ORFs. A universal feature of prokaryotic genomes is the organization of genes into operons, which form basic transcriptional units (31) and are important in functional genomics. Using a neural network, we predicted that 1460 ORFs in the P. furiosus genome are contained within 470 operons (32), 349 of which were validated using DNA microarray data (33, 34). Operons typically encode functionally related proteins, which can include enzymes of the same pathway as well as heteromeric PCs. Herein heteromeric PCs encoded by two or more adjacent genes are referred to as Type 1 PCs, whereas heteromeric PCs encoded by two or more unlinked genes are referred to as Type 2 PCs. This pilot study focused on identifying stable, Type 1 heteromeric PCs in P. furiosus based on the co-fractionation of proteins during sequential column chromatography steps. In addition, a high-throughput (HT) system was devised to allow protein identification by nano-LC-ESI-MS/MS. Our long-term objective is to develop HT protocols for novel PC identification on a genome-wide basis using limited amounts of biomass.
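The distinction between Type 1 and Type 2 PCs, and the idea of using co-fractionation to flag candidate complexes, can be illustrated with a minimal sketch. It is not the analysis pipeline used in this study. It assumes that each protein is represented by an integer ORF number, that "adjacent genes" can be approximated by consecutive ORF numbers (strand and operon membership are ignored), and that two proteins count as co-fractionating when they are detected together in at least a chosen number of fractions. The function names, the `max_gap` and `min_shared` parameters, and the example data are all hypothetical.

```python
from itertools import combinations


def classify_pair(orf_a: int, orf_b: int, max_gap: int = 1) -> str:
    """Label a protein pair by the genomic arrangement of its genes.

    Assumption: consecutive ORF numbers stand in for adjacent genes.
    Type 1 = adjacent genes, Type 2 = unlinked genes.
    """
    return "Type 1" if abs(orf_a - orf_b) <= max_gap else "Type 2"


def cofractionating_pairs(profiles, min_shared: int = 2):
    """Return candidate pairs that share at least `min_shared` fractions.

    profiles maps an ORF number to the set of fractions in which the
    protein was detected. The threshold is a hypothetical criterion.
    """
    pairs = []
    for a, b in combinations(sorted(profiles), 2):
        shared = profiles[a] & profiles[b]
        if len(shared) >= min_shared:
            pairs.append((a, b, classify_pair(a, b)))
    return pairs


# Hypothetical detection data: (chromatography step, fraction number).
example_profiles = {
    101: {(1, 3), (2, 7)},
    102: {(1, 3), (2, 7)},
    500: {(1, 3), (2, 7)},
    900: {(1, 9)},
}

for orf_a, orf_b, pc_type in cofractionating_pairs(example_profiles):
    print(orf_a, orf_b, pc_type)
```

In this toy data set, the proteins from ORFs 101 and 102 would be reported as a candidate Type 1 pair, whereas their pairings with ORF 500 would be candidate Type 2 pairs. Requiring shared detection across more than one chromatography step reflects the sequential fractionation described above, although the appropriate threshold would have to be determined experimentally.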