Deconvolution and Database Search of Complex Tandem Mass Spectra of Intact Proteins

2010 
Top-down proteomics studies intact proteins, enabling new opportunities for analyzing post-translational modifications. Because tandem mass spectra of intact proteins are very complex, spectral deconvolution (grouping peaks into isotopomer envelopes) is a key initial stage for their interpretation. In such spectra, isotopomer envelopes of different protein fragments span overlapping regions on the m/z axis and even share spectral peaks. This raises both pattern recognition and combinatorial challenges for spectral deconvolution. We present MSDeconv, a combinatorial algorithm for spectral deconvolution. The algorithm first generates a large set of candidate isotopomer envelopes for a spectrum, then represents the spectrum as a graph, and finally selects its highest scoring subset of envelopes as a heaviest path in the graph. In contrast with other approaches, the algorithm scores sets of envelopes rather than individual envelopes. We demonstrate that MS-Deconv improves on Thrash and Xtract in the number of correctly recovered monoisotopic masses and speed. We applied MS-Deconv to a large set of top-down spectra from Yersinia rohdei (with a still unsequenced genome) and further matched them against the protein database of related and sequenced bacterium Yersinia enterocolitica. MS-Deconv is available at http://proteomics.ucsd.edu/Software.html. Molecular & Cellular Proteomics 9:2772–2782, 2010.
    • Correction
    • Cite
    • Save
    • Machine Reading By IdeaReader
    40
    References
    11
    Citations
    NaN
    KQI
    []