Use of conditional probabilities for determining relationships between amino acid sequence and protein secondary structure

1992 
The conditional probability, P(σ|x), is a statement of the probability that the value of σ will be found given the prior information that a value of x has been observed. Here σ represents any one of the secondary structure types, α, β, τ, and ρ for helix, sheet, turn, and random, respectively, and x represents a sequence attribute, including, but not limited to: (1) hydropathy; (2) hydrophobic moments assuming helix and sheet; (3) Richardson and Richardson helical N-cap and C-cap values; (4) Chou–Fasman conformational parameters for helix, Pα, for sheet, Pβ, and for turn, Pτ; and (5) Garnier, Osguthorpe, and Robson (GOR) information values for helix, Iα, for sheet, Iβ, for turn, Iτ, and for random structure, Iρ. Plots of P(σ|x) vs. x are demonstrated to provide information about the correlation between structure and attribute, σ and x. The separations between different P(σ|x) vs. x curves indicate the capacity of a given attribute to discriminate between different secondary structural types and permit comparison of different attributes. P(α|x), P(β|x), P(τ|x), and P(ρ|x) vs. x plots show that the most useful attributes for discriminating helix are, in order: hydrophobic moment assuming helix > Pα » N-cap > C-cap ≈ Iα ≈ Iτ. The information value for turns, Iτ, was found to discriminate helix better than turns. Discrimination for sheet was found to be in the following order: Iβ » Pβ ≈ hydropathy > Iρ ≈ hydrophobic moment assuming sheet. Three attributes, at their low values, were found to give significant discrimination for the absence of helix: Iα ≈ Pα ≈ hydrophobic moment assuming helix. Also, three other attributes were found to indicate the absence of sheet: Pβ » Iτ ≈ hydropathy. Indications of the absence of σ could be as useful for some applications as the indication of the presence of σ.
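As an illustration of how a P(σ|x) vs. x curve might be tabulated, the sketch below estimates P(σ|x) by frequency counts within bins of x. The abstract does not say how the probabilities were computed, so the binning approach, the function name `conditional_probabilities`, the `bin_width` parameter, and the state labels are all assumptions made for illustration, and the data are toy values rather than results from the paper.

```python
from collections import Counter, defaultdict

# Hypothetical labels for the four structure types (alpha, beta, tau, rho).
STATES = ("alpha", "beta", "turn", "random")


def conditional_probabilities(values, states, bin_width):
    """Estimate P(state | x) for each bin of the attribute x.

    values    -- attribute value x for each residue
    states    -- observed secondary structure label for each residue
    bin_width -- width of the bins used to group x (an assumed choice)
    """
    counts = defaultdict(Counter)
    for x, state in zip(values, states):
        bin_index = int(x // bin_width)  # floor, so negative x bins correctly
        counts[bin_index][state] += 1

    table = {}
    for bin_index, counter in sorted(counts.items()):
        total = sum(counter.values())
        # Relative frequency of each state among residues in this bin.
        table[bin_index] = {s: counter[s] / total for s in STATES}
    return table


# Toy data only: seven residues with an attribute value and a label.
toy_values = [0.1, 0.2, 0.3, 1.1, 1.2, 1.4, 1.6]
toy_states = ["alpha", "alpha", "beta", "beta", "beta", "turn", "random"]

for bin_index, row in conditional_probabilities(toy_values, toy_states, 1.0).items():
    print(bin_index, row)
```

Plotting each state's column against the bin centers gives one curve per structure type. Under this sketch, wide separation between curves at a given x corresponds to the discriminating capacity described above, and a value near zero for a state corresponds to an indication of its absence.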