Does Secondary Structure Determine Tertiary Structure in Proteins?

2005 
Is highly approximate knowledge of a protein's backbone structure sufficient to successfully identify its family, superfamily, and tertiary fold? To explore this question, backbone dihedral angles were extracted from the known three-dimensional structure of 2,439 proteins and mapped into 36 labeled, 60° × 60° bins, called mesostates. Using this coarse-grained mapping, protein conformation can be approximated by a linear sequence of mesostates. These linear strings can then be aligned and assessed by conventional sequence-comparison methods. We report that the mesostate sequence is sufficient to recognize a protein's family, superfamily, and fold with good fidelity. Proteins 2005;61:338-343.

[...] and loops. Using a simple scoring matrix, conventional pairwise sequence comparisons between these strings were performed and used to construct a Przytycka-tree (P-tree), in which the distance between any two nodes is proportional to the difference in score between their aligned secondary structure strings. The P-tree is generated completely automatically, and it reflects the global secondary structure relationships among the proteins used to construct it: the closer the nodes, the greater the similarity of secondary structure among their corresponding proteins. Surprisingly, the straightforward P-tree was found to be largely in agreement with the SCOP tree, although the latter is a complex construct based on structure, evolutionary knowledge, and human judgment. This result lends support to the hypothesis that successful fold recognition can be derived solely from knowledge of secondary structure.
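The coarse-graining and comparison steps described above can be illustrated with a minimal Python sketch. It bins each (φ, ψ) pair into one of 36 cells of 60° × 60°, writes the conformation as a string, and scores two such strings with a plain global alignment. Several details here are assumptions made for illustration and are not taken from the abstract: the bin origin at -180°, the single-character labels (A-Z, then 0-9), the names `mesostate`, `mesostate_string`, and `global_alignment_score`, and the match/mismatch/gap values, which stand in for whatever scoring matrix the authors actually used.

```python
import string

BIN_WIDTH = 60.0  # each mesostate covers a 60 x 60 degree cell
# Hypothetical labels: 26 letters + 10 digits = 36 single-character names.
LABELS = string.ascii_uppercase + string.digits


def mesostate(phi, psi):
    """Map one (phi, psi) pair in degrees to a one-character bin label."""
    # Wrap each angle into [0, 360) measured from -180, then find its cell.
    row = int(((phi + 180.0) % 360.0) // BIN_WIDTH)
    col = int(((psi + 180.0) % 360.0) // BIN_WIDTH)
    return LABELS[row * 6 + col]


def mesostate_string(dihedrals):
    """Turn a list of (phi, psi) pairs into a linear mesostate string."""
    return "".join(mesostate(phi, psi) for phi, psi in dihedrals)


def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score with an assumed identity-style scoring scheme."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


# Two helix-like residues followed by one strand-like residue.
example = mesostate_string([(-60.0, -45.0), (-60.0, -45.0), (-120.0, 130.0)])
```

Pairwise scores of this kind could then be converted to distances and passed to any standard tree-building method. Which conversion and clustering procedure fits best is a design choice that the abstract does not specify.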