Ion mobility brings an additional dimension of separation to LC–MS, improving identification of peptides and proteins in complex mixtures. A recently introduced timsTOF mass spectrometer (Bruker) couples trapped ion mobility separation to TOF mass analysis. With the parallel accumulation serial fragmentation (PASEF) method, the timsTOF platform achieves promising results, yet analysis of the data generated on this platform represents a major bottleneck. Currently, MaxQuant and PEAKS are most used to analyze these data. However, because of the high complexity of timsTOF PASEF data, both require substantial time to perform even standard tryptic searches. Advanced searches (e.g. with many variable modifications, semi- or non-enzymatic searches, or open searches for post-translational modification discovery) are practically impossible. We have extended our fast peptide identification tool MSFragger to support timsTOF PASEF data, and developed a label-free quantification tool, IonQuant, for fast and accurate 4-D feature extraction and quantification. Using a HeLa data set published by Meier et al. (2018), we demonstrate that MSFragger identifies significantly (∼30%) more unique peptides than MaxQuant (1.6.10.43), and performs comparably or better than PEAKS X+ (∼10% more peptides). IonQuant outperforms both in terms of number of quantified proteins while maintaining good quantification precision and accuracy. Runtime tests show that MSFragger and IonQuant can fully process a typical two-hour PASEF run in under 70 min on a typical desktop (6 CPU cores, 32 GB RAM), significantly faster than other tools. Finally, through semi-enzymatic searching, we significantly increase the number of identified peptides. Within these semi-tryptic identifications, we report evidence of gas-phase fragmentation before MS/MS analysis.
A major challenge to identification and quantification of proteins from tissue or cultured cells is the immense complexity of the peptide mixtures that result from enzymatic preparation of these samples for liquid chromatography-mass spectrometry (LC–MS) analysis.
Ion mobility spectrometry brings an additional dimension of separation to LC–MS proteomics, significantly improving peptide identification. Following electrospray ionization, ion mobility differentiates gas-phase peptide ions by their size and charge before mass analysis. Ion mobility separation occurs on the millisecond timescale, improving selectivity without adding to analysis times. Recently, a commercially available instrument that couples trapped ion mobility spectrometry (TIMS) to time-of-flight (TOF) mass analysis (1: Silveira J.A. et al. Parallel accumulation for 100% duty cycle trapped ion mobility-mass spectrometry. Int. J. Mass Spectrom. 2017; 413: 168-175) has achieved promising depth of coverage, routinely identifying over 6000 proteins from individual 120-min LC gradients (2: Meier F. et al. Parallel accumulation–serial fragmentation (PASEF): multiplying sequencing speed and sensitivity by synchronized scans in a trapped ion mobility device. J. Proteome Res. 2015; 14: 5378-5387; 3: Meier F. et al. Online Parallel Accumulation-Serial Fragmentation (PASEF) with a Novel Trapped Ion Mobility Mass Spectrometer. Mol. Cell. Proteomics. 2018; 17: 2534-2545). Owing to the dual TIMS design of this instrument, where the first region is used for storing ions and the second for ion mobility separation, peptides can be continually selected for sequencing with minimal reduction in duty cycle. This data acquisition method has been termed parallel accumulation-serial fragmentation (PASEF) (2, 3). For typical data-dependent acquisition (DDA) measurements, a survey scan is performed, and the N most abundant precursor ions are targeted for tandem mass spectrometry (MS/MS) analysis based on their m/z and mobility. Fast quadrupole switching times allow multiple peptide ions to be targeted for fragmentation during a single ion mobility scan. As a target precursor exits the TIMS region, the quadrupole switches to transmit the corresponding m/z determined by the survey scan. Synchronization of the TIMS device and quadrupole mass filter reduces chimeric spectra and enables removal of singly-charged contaminant ions. Additionally, because of the fast acquisition speed (50–200 ms for a full scan), low-abundance precursors can be repeatedly re-targeted to improve MS/MS spectrum quality (2, 3).
A current major limitation of the PASEF proteomics method is the long post-acquisition analysis time caused by the high dimensionality of the data and the large number of acquired MS/MS scans. MaxQuant (4: Cox J., Mann M. MaxQuant enables high peptide identification rates, individualized ppb-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 2008; 26: 1367-1372; 5: Prianichnikov N. et al. MaxQuant software for ion mobility enhanced shotgun proteomics. Mol. Cell. Proteomics. 2020; 19: 1058-1069) and PEAKS (6: Zhang J. et al. PEAKS DB: de novo sequencing assisted database search for sensitive and accurate peptide identification. Mol. Cell. Proteomics. 2012; 11 (M111.010587)) are both capable of processing PASEF data but require roughly three hours to perform a standard tryptic search on a raw data file from a two-hour gradient. Neither MaxQuant nor PEAKS is practical for nonspecific digest searches or open searches (7: Chick J.M. et al. A mass-tolerant database search identifies a large proportion of unassigned spectra in shotgun proteomics as modified peptides. Nat. Biotechnol. 2015; 33: 743-749; 8: Kong A.T. et al. MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry-based proteomics. Nat. Methods. 2017; 14: 513-520), which are helpful in discovering post-translational modifications.
We have recently introduced a fragment ion indexing method and its implementation in an ultrafast database search tool, MSFragger (8). The speed of MSFragger makes it well suited for the analysis of large and complex data sets such as those from timsTOF PASEF. As conversion from Bruker's raw liquid chromatography-ion mobility-MS (LC-IMS-MS) format (.d) to an open, searchable format (.mzML) represents another significant computational challenge (up to 90 min per single two-hour LC–MS gradient raw file), we also extended MSFragger to read the raw format directly. Here we demonstrate that MSFragger can now perform peptide identification from raw timsTOF PASEF data in a fraction of the time required by other tools.
A second challenge is related to quantification of timsTOF PASEF data. Because of the added ion mobility dimension, previously developed quantification tools need to be extended to LC-IMS-MS data. In MaxQuant this is done by slicing a 4-D space (ion mobility, m/z, retention time, and intensity) into multiple 3-D sub-spaces (m/z, retention time, and intensity) and tracing peaks within each sub-space (5). Although MaxQuant uses only every third TOF scan in feature detection, this step still accounts for a significant fraction of the overall analysis time. Similarly, PEAKS (6) has extended its functionality to support quantification of timsTOF PASEF data, with analysis times similar to those of MaxQuant. To address this challenge, we introduce IonQuant, a tool that takes Bruker's raw files and database search results as input to perform fast extracted ion chromatogram (XIC)-based quantification. Using spectral data indexing for XIC tracing in the retention time and ion mobility dimensions, IonQuant requires ∼10 min per file on a desktop computer. IonQuant is integrated seamlessly with MSFragger (8) and the Philosopher validation toolkit (9: Leprevost F.V. et al. Philosopher: a versatile toolkit for shotgun proteomics data analysis. Nat. Methods. 2020; 17).
Using timsTOF PASEF HeLa data published by Meier et al. (3) and three-organism mixture data published by Prianichnikov et al. (5), we apply MSFragger and IonQuant, measure analysis speed and quantitative reproducibility across replicate injections, and compare these results to PEAKS and MaxQuant. We demonstrate how more comprehensive (including semi-enzymatic and open) searches with MSFragger enable deep dives in these data, revealing interesting trends and recovering large numbers of peptides missed in the original analysis. Additionally, our pipeline has spectral library building capabilities and is fully compatible with the Skyline environment for subsequent visualization and targeted exploration of the data. Overall, we showcase a fast, flexible, and accurate computational platform for analyzing timsTOF PASEF proteomics data.
In our experiments, we used data from five experimental conditions (25, 50, 100, 150, and 200 ms TIMS accumulation time) published by Meier et al. (3). Each experimental condition has four technical replicates. Meier et al. (3) concluded that the 100 ms accumulation time gave the best identification results.
We used these four replicates with 100 ms accumulation time extensively (performing closed tryptic search, closed semi-enzymatic search, open search, and label-free quantification comparisons). We also used data generated from a mixture of three organisms (H. sapiens, S. cerevisiae, and E. coli) published by Prianichnikov et al. (5). There are two experimental conditions (A and B) that contain the following ratios of each organism with respect to one another: 1:1 (H. sapiens), 2:1 (S. cerevisiae), and 1:4 (E. coli). We used these data to evaluate the quantification accuracy of IonQuant. For identification, we estimated the false-discovery rate (FDR) using the target-decoy approach (10: Elias J.E., Gygi S.P. Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat. Methods. 2007; 4: 207-214; 11: Nesvizhskii A.I. A survey of computational methods and error rate estimation procedures for peptide and protein identification in shotgun proteomics. J. Proteomics. 2010; 73: 2092-2123). For quantification, we evaluated the quality with the coefficient of variation (CV) and the Pearson correlation coefficient.
Raw data files from four replicate injections each of HeLa lysate acquired at five different TIMS ramp (accumulation) times on a Bruker timsTOF Pro (3) were downloaded from ProteomeXchange (12: Vizcaíno J.A. et al. ProteomeXchange provides globally coordinated proteomics data submission and dissemination. Nat. Biotechnol. 2014; 32: 223-226) (PXD010012). For all searches, a protein sequence database of reviewed human proteins (accessed 09/30/2019 from UniProt; 20463 entries including 115 common contaminant sequences) was used unless otherwise noted. Decoy sequences were generated and appended to the original database for MSFragger; PEAKS and MaxQuant only need target sequences. Tryptic cleavage specificity was applied, along with variable methionine oxidation, variable protein N-terminal acetylation, and fixed carbamidomethyl cysteine modifications. The allowed peptide length and mass ranges were 7–50 residues and 500–5000 Da, respectively. PEAKS and MaxQuant search parameters were set as close as possible to those used by MSFragger.
For MSFragger searches, peptide sequence identification was performed with MSFragger version 2.2 and FragPipe version 12.1 with mass calibration and parameter optimization enabled. PeptideProphet and ProteinProphet in Philosopher (version 2.0.0; https://philosopher.nesvilab.org/) were used to filter all peptide-spectrum matches (PSMs), peptides, and proteins to 1% PSM and 1% protein FDR. Quantification analysis was performed with IonQuant (version 1.1.0). For PEAKS X+ searches, version 10.5 was used, and PSMs and peptides were filtered to 1% peptide FDR by clicking the FDR button on the "Summary" page.
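The 1% FDR filters mentioned above rest on the target-decoy idea: decoy hits above a score cutoff estimate how many target hits above that cutoff are false. The following is a minimal sketch of that idea only. It is not the PeptideProphet/Philosopher procedure, which models probabilities rather than raw scores, and the function name and the simple decoys/targets estimator are assumptions made for illustration.

```python
from typing import List, Tuple


def score_cutoff_at_fdr(psms: List[Tuple[float, bool]], max_fdr: float = 0.01) -> float:
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    psms holds (score, is_decoy) pairs, where a higher score is a better match.
    The FDR at a cutoff is estimated as (#decoys >= cutoff) / (#targets >= cutoff).
    Returns float('inf') when no cutoff satisfies the limit.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = float("inf")
    for i, (score, is_decoy) in enumerate(ranked):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Evaluate only after every PSM tied at this score has been counted.
        last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if last_of_tie and targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


if __name__ == "__main__":
    # Hypothetical toy scores, not data from the study.
    toy = [(10.0, False), (9.0, False), (8.0, True), (7.0, False)]
    print(score_cutoff_at_fdr(toy, max_fdr=0.01))
```

Scanning from the best score downward and keeping the last cutoff that satisfies the limit is the same logic as the manual "-10logP" threshold search described for PEAKS proteins, just automated.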
Because there is no option in PEAKS to automatically filter the proteins, we tried different protein "-10logP" scores from the smallest to the largest until the reported protein FDR was equal to 1%. MaxQuant version 1.6.10.43 was used. The PSMs and peptides were filtered to 1% PSM FDR, and the protein groups were filtered to 1% protein FDR, which are the default settings. Entries from decoy proteins and "only identified by site" were removed.
Raw data files from the mixture of three organisms (5) were downloaded from ProteomeXchange (12) (PXD014777). Three HeLa-only quality control samples (20190122_HeLa_QC_Slot1-47_1_3219.d, 20190122_HeLa_QC_Slot1-47_1_3220.d, and 20190122_HeLa_QC_Slot1-47_1_3221.d) from this same publication and repository were also used to examine gas-phase fragmentation in more recently acquired data. In the three-organism quantification benchmarking data set, there are two experimental conditions with three replicates each. We used MSFragger (version 2.2) coupled with FragPipe (version 12.1) and Philosopher (version 2.0.0) to perform a closed search. The protein sequence database was the combination of reviewed H. sapiens, S. cerevisiae, and E. coli proteins (accessed 04/18/2020 from UniProt; 61576 entries), with decoy sequences added.
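For the three-organism benchmark, the stated mixing ratios translate into expected log2 fold changes, and replicate precision can be summarized as a CV. The sketch below shows both calculations under two labeled assumptions: the ratios 1:1, 2:1, and 1:4 are read as condition A relative to condition B, and protein intensities are on a linear scale. The function names are illustrative and are not part of IonQuant or MSstats.

```python
import math
from statistics import mean, stdev

# Assumption: the ratios in the text are A:B, giving these expected log2(A/B) values.
EXPECTED_LOG2_A_OVER_B = {"H. sapiens": 0.0, "S. cerevisiae": 1.0, "E. coli": -2.0}


def cv_percent(intensities):
    """Coefficient of variation (%) of linear-scale replicate intensities."""
    return 100.0 * stdev(intensities) / mean(intensities)


def log2_fold_change(cond_a, cond_b):
    """log2 of the ratio of mean intensities, condition A over condition B."""
    return math.log2(mean(cond_a) / mean(cond_b))


def log2_error(organism, cond_a, cond_b):
    """Deviation of the measured log2 fold change from the expected value."""
    return log2_fold_change(cond_a, cond_b) - EXPECTED_LOG2_A_OVER_B[organism]


if __name__ == "__main__":
    # Hypothetical intensities for one E. coli protein, three replicates each.
    a, b = [95.0, 100.0, 105.0], [390.0, 400.0, 410.0]
    print(cv_percent(a), log2_error("E. coli", a, b))
```

A tool with good accuracy would give log2 errors centered on zero for all three organisms, and good precision would show up as low replicate CVs.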
We used IonQuant (version 1.1.0) to perform quantitative analysis. For benchmarking, we downloaded MaxQuant results with the folder name "Tenzer.nomatching_MaxQuant" from https://www.ebi.ac.uk/pride/archive/projects/PXD014777. We also re-analyzed these data using MaxQuant (version 1.6.14.0) and the protein database used by MSFragger. Decoy sequences were removed from the database before passing it to MaxQuant. The minimum ratio count was set to 2 (default value in MaxQuant). Remaining parameters were identical to those used in the HeLa lysate analysis.
For closed searches with MSFragger, precursor tolerance was set to 50 ppm and fragment tolerance was set to 20 ppm, with mass calibration and parameter optimization enabled. Two missed cleavages were allowed, and two enzymatic termini were specified. Isotope error was set to 0/1/2. A 50 ppm precursor tolerance coupled with 0/1/2 isotope error encompasses deamidation (0.98 Da). Deamidated peptides are a common artifact of sample preparation and handling, so there is no need to separate these peptides from unmodified ones given the aims of this study. Additionally, this slightly wider precursor tolerance results in more candidate PSMs, which benefits expectation value estimation in MSFragger. The minimum number of fragment peaks required to include a PSM in modeling was set to two, and the minimum number required to report the match was four. The top 150 most intense peaks and a minimum of 15 fragment peaks required to search a spectrum were used as initial settings. Parameters used in PEAKS and MaxQuant were set as close as possible to those used by MSFragger.
The parameters used by MSFragger for semi-tryptic searches were equivalent to those used in the closed searches (detailed above) but with only one enzymatic peptide terminus required. MaxQuant does not allow any missed cleavages with semi-tryptic searching.
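The statement that a 50 ppm precursor tolerance with 0/1/2 isotope error encompasses deamidation can be checked with a short calculation. The sketch below assumes a monoisotopic deamidation shift of +0.984016 Da and a 13C/12C spacing of 1.003355 Da, and it assumes that isotope error correction amounts to subtracting a whole number of isotope spacings from the observed mass difference. The exact arithmetic inside MSFragger may differ, so treat this as an illustration rather than a description of the tool.

```python
DEAMIDATION_SHIFT_DA = 0.984016  # assumed monoisotopic shift for N->D / Q->E
C13_SPACING_DA = 1.003355        # assumed 13C - 12C mass difference


def residual_ppm(peptide_mass_da: float, isotope_error: int = 1) -> float:
    """Mass error (ppm) left after explaining deamidation as an isotope error."""
    residual_da = abs(DEAMIDATION_SHIFT_DA - isotope_error * C13_SPACING_DA)
    return residual_da / peptide_mass_da * 1e6


def deamidation_matches(peptide_mass_da: float, tol_ppm: float = 50.0) -> bool:
    """True if any isotope error in 0/1/2 brings the residual within tolerance."""
    return any(residual_ppm(peptide_mass_da, k) <= tol_ppm for k in (0, 1, 2))


if __name__ == "__main__":
    # The allowed peptide mass range in the text is 500-5000 Da.
    for mass in (500.0, 1500.0, 5000.0):
        print(mass, round(residual_ppm(mass), 2), deamidation_matches(mass))
```

Under these assumptions the residual is about 0.0193 Da, which is largest in relative terms for the lightest peptides and still below 50 ppm at 500 Da, so a deamidated peptide anywhere in the searched mass range would be matched as a +1 isotope error of the unmodified sequence.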
For further investigation of the identified semi-tryptic peptides, variable pyro-glutamic acid and pyro-carbamidomethyl cysteine (−17.03 Da from glutamine and cysteine), and variable water loss (−18.01 Da) on any peptide N terminus were also included in the semi-enzymatic MSFragger search parameters. These same parameters were used to search three HeLa injections from PXD014777 (5).
For open searches, precursor mass tolerance was set from −150 to +500 Da, and precursor true tolerance and fragment mass tolerance were set to 20 ppm. Mass calibration and parameter optimization were enabled. Two missed cleavages were allowed, and the number of enzymatic termini was set to two. Isotope error was set to 0. The minimum number of fragment peaks required to include a PSM in modeling was set to two, and the minimum number required to report the match was four. A minimum of 15 fragment peaks and the top 100 most intense peaks were used as initial settings.
Because quantification involves numerous spectral preprocessing procedures, such as peak centroiding, mass calibration, and retention time alignment/calibration before peak tracing and feature extraction, tolerance settings are unlikely to translate directly between quantification tools. Thus, we decided to use the default settings for each tool, which have been optimized to perform best in most cases. In IonQuant, by default, mass tolerance was set to 10 ppm, retention time tolerance was set to 0.4 min, ion mobility (1/K0) tolerance was set to 0.05, normalization was enabled, and minimum isotope count was set to 2. Minimum ion counts of 1 and 2 were both tested. In PEAKS, identification-directed quantification was performed with retention time alignment, with neither a CV filter nor outlier removal. Mass error and ion mobility tolerances were set to 20 ppm and 0.05 1/K0, respectively. The retention time shift tolerance used in alignment was set to 20 min as recommended by the documentation.
In MaxQuant, Fast LFQ was performed with large ratio stabilization, minimum ratio count set to two (except where noted), three minimum neighbors, and six average number of neighbors by default. The remaining parameters were also set to default values.
MSstats (13: Choi M. et al. MSstats: an R package for statistical analysis of quantitative mass spectrometry-based proteomic experiments. Bioinformatics. 2014; 30: 2524-2526) was used to calculate protein abundances from the ion abundances reported by each tool. For MSFragger and PEAKS, ions (filtered at 1% PSM and 1% protein FDR for MSFragger; 1% peptide FDR for PEAKS) were provided to MSstats. For MaxQuant, evidence.txt (filtered at 1% PSM FDR) and proteinGroup.txt (filtered at 1% protein FDR) were provided to MSstats. The dataProcess function with log10 intensity transformation was used to calculate protein abundances.
MSFragger (version 2.2, via FragPipe version 12.1) and MaxQuant (version 1.6.10.43) were compared on a desktop with an Intel Optane SSD 900P series drive, an Intel Core i7-8700 3.2 GHz CPU with 6 cores (12 logical cores), and 32 GB of memory. Because of installation and licensing constraints, PEAKS Studio X+ was run on a workstation with an Intel Xeon Gold 2.4 GHz CPU with 20 cores (40 logical cores) and 96 GB of RAM.
An overview of the computational workflow is shown in Fig. 1. MS/MS spectral files acquired in PASEF mode can be read directly by MSFragger. MSFragger loads the raw format (.d) using our original spectral reading library MSFTBX (14: Avtonomov D.M. et al. BatMass: a Java Software Platform for LC-MS Data Visualization in Proteomics and Metabolomics. J. Proteome Res. 2016; 15: 2500-2509), extended here to interact with Bruker's native library.
During loading, Bruker's native library (timsdata.dll or libtimsdata.so) functions are called to perform scan combining, peak picking, and de-noising. MSFTBX passes the loaded scans to MSFragger without any additional processing. After loading, MSFragger writes all extracted scans into a binary format, mzBIN, for fast data access in any future re-analyses of the same data. After database searching with MSFragger (see Experimental Procedures), PSMs are saved in the pepXML file format. PSMs are processed using PeptideProphet (15: Keller A. et al. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal. Chem. 2002; 74: 5383-5392) and ProteinProphet (16: Nesvizhskii A.I. et al. A statistical model for identifying proteins by tandem mass spectrometry. Anal. Chem. 2003; 75: 4646-4658) as part of the Philosopher toolkit. Philosopher is also used for FDR filtering, and for generating summary reports at the PSM, peptide ion, peptide, and protein levels (Fig. 1A). Finally, IonQuant (see below) is used to extract peptide ion intensities for all PSMs and to add quantification information to the PSM, peptide, and protein-level tables.
Spectral files generated by timsTOF PASEF are large and structurally complex because of the fast TOF scan rate and the additional ion mobility dimension. IonQuant, written in Java, traces and quantifies features from the four-dimensional space (ion mobility, m/z, retention time, and intensity) quickly and accurately using indexing technology (Fig. 1B). IonQuant first digitizes the ion mobility dimension with a predefined bin width (0.002 1/K0; Vs/cm²). Then, IonQuant indexes all peaks within this 4-D space according to their ion mobility, m/z, and retention time, which reduces memory usage and accelerates subsequent peak tracing.
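To make the binning-and-indexing idea concrete, here is a minimal sketch of one way such an index could work: peaks are grouped by ion mobility bin (using the 0.002 1/K0 width quoted above) and sorted by m/z within each bin, so a lookup touches only a few bins and a narrow m/z slice. This is an illustration written in Python, not IonQuant's Java implementation; the class, its layout, and the query signature are all assumptions.

```python
import math
from bisect import bisect_left, bisect_right
from collections import defaultdict

IM_BIN_WIDTH = 0.002  # 1/K0 bin width quoted in the text


class PeakIndex:
    """Hypothetical index: ion mobility bins, each holding peaks sorted by m/z."""

    def __init__(self, peaks):
        # peaks: iterable of (ion_mobility, mz, retention_time, intensity)
        binned = defaultdict(list)
        for im, mz, rt, intensity in peaks:
            binned[math.floor(im / IM_BIN_WIDTH)].append((mz, im, rt, intensity))
        self._bins = {b: sorted(rows) for b, rows in binned.items()}
        self._mz_keys = {b: [row[0] for row in rows] for b, rows in self._bins.items()}

    def query(self, im, im_tol, mz, ppm_tol):
        """Return (mz, im, rt, intensity) peaks within both tolerances."""
        lo_bin = math.floor((im - im_tol) / IM_BIN_WIDTH)
        hi_bin = math.floor((im + im_tol) / IM_BIN_WIDTH)
        mz_lo = mz * (1 - ppm_tol * 1e-6)
        mz_hi = mz * (1 + ppm_tol * 1e-6)
        hits = []
        for b in range(lo_bin, hi_bin + 1):
            keys = self._mz_keys.get(b)
            if keys is None:
                continue
            # Binary search selects the m/z slice without scanning the bin.
            start, stop = bisect_left(keys, mz_lo), bisect_right(keys, mz_hi)
            for row in self._bins[b][start:stop]:
                if abs(row[1] - im) <= im_tol:  # exact check at bin edges
                    hits.append(row)
        return hits


if __name__ == "__main__":
    # Hypothetical peaks; the tolerances mirror the IonQuant defaults in the text.
    index = PeakIndex([(0.900, 500.000, 10.0, 100.0), (1.200, 500.000, 10.0, 50.0)])
    print(index.query(im=0.90, im_tol=0.05, mz=500.0, ppm_tol=10.0))
```

The returned peaks would then be traced along retention time. The design point is that both filters are index lookups, so the cost of a query depends on the number of matching peaks rather than on the size of the run.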
Given theoretical m/z, precursor ion mobility, and retention time from an identified MS/MS spectrum, IonQuant first locates the indexes corresponding to the precursor ion mobility with a user-defined tolerance. Then, it collects the m/z indexes within the tolerance of the theoretical m/z. With these two index-querying steps, IonQuant only needs to look at a small fraction of the whole data. Finally, it traverses all qualified peaks within the retention time range and generates a curve by tracing and performing Gaussian smoothing. After tracing all peaks in the retention time and m/z dimension, IonQuant traces the ion mobility dimension by clustering adjacent peaks to form 4-D features. Finally, IonQuant reports the boundaries, apex location, and volume of each detected ion feature. Given the theoretical m/z from a PSM, IonQuant tries to extract up to three 4-D features corresponding to 0, +1, and +2 isotopes. Then, it uses the summation of these features' volumes as the quantified intensity. By default, IonQuant requires at least two isotopes (minimum isotope count 2). IonQuant takes spe
The FragPipe computational proteomics platform is gaining widespread popularity among the proteomics research community because of its fast processing speed and user-friendly graphical interface. Although FragPipe produces well-formatted output tables that are ready for analysis, there is still a need for an easy-to-use and user-friendly downstream statistical analysis and visualization tool. FragPipe-Analyst addresses this need by providing an R shiny web server to assist FragPipe users in conducting downstream analyses of the resulting quantitative proteomics data. It supports major quantification workflows, including label-free quantification, tandem mass tags, and data-independent acquisition. FragPipe-Analyst offers a range of useful functionalities, such as various missing value imputation options, data quality control, unsupervised clustering, differential expression (DE) analysis using Limma, and gene ontology and pathway enrichment analysis using Enrichr. To support advanced analysis and customized visualizations, we also developed FragPipeAnalystR, an R package encompassing all FragPipe-Analyst functionalities that is extended to support site-specific analysis of post-translational modifications (PTMs). FragPipe-Analyst and FragPipeAnalystR are both open-source and freely available.
There is a growing demand to develop high-throughput and high-sensitivity mass spectrometry methods for single-cell proteomics. The commonly used isobaric labeling-based multiplexed single-cell proteomics approach suffers from distorted protein quantification due to co-isolated interfering ions during MS/MS fragmentation, also known as ratio compression. We reasoned that the use of MS3-based quantification could mitigate ratio compression and provide better quantification. However, previous studies indicated reduced proteome coverages in the MS3 method, likely due to long duty cycle time and ion losses during multilevel ion selection and fragmentation. Herein, we described an improved MS acquisition method for MS3-based single-cell proteomics by employing a linear ion trap to measure reporter ions. We demonstrated that linear ion trap can increase the proteome coverages for single-cell-level peptides with even higher gain obtained via the MS3 method. The optimized real-time search MS3 method was further applied to study the immune activation of single macrophages. Among a total of 126 single cells studied, over 1200 and 1000 proteins were quantifiable when at least 50 and 75% nonmissing data were required, respectively. Our evaluation also revealed several limitations of the low-resolution ion trap detector for multiplexed single-cell proteomics and suggested experimental solutions to minimize their impacts on single-cell analysis.
The cross-linking technique coupled with mass spectrometry (MS) is widely used in the analysis of protein structures and protein-protein interactions. To identify cross-linked peptides from MS data, we need to consider all pairwise combinations of peptides, which is computationally prohibitive when the sequence database is large. To alleviate this problem, some heuristic screening strategies are used to reduce the number of peptide pairs during the identification. However, heuristic screening strategies may miss some true cross-linked peptides. We directly tackle the combination challenge without using any screening strategies. Using a double-ended queue, the proposed algorithm reduces the quadratic time complexity of exhaustive searching to linear time complexity. We implement the algorithm in a tool named Xolik. The running time of Xolik is validated using databases with different numbers of proteins. Experiments using synthetic and empirical datasets show that Xolik outperforms existing tools in terms of running time and statistical power. Source code and binaries of Xolik are freely available at http://bioinformatics.ust.hk/Xolik.html. Supplementary data are available at Bioinformatics online.
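The idea of replacing an all-pairs search with a double-ended queue can be sketched as a sliding-window maximum over mass-sorted peptides. This is a minimal illustration under stated assumptions, not Xolik's actual implementation: it assumes each peptide already has a score against the spectrum, that a pair's score is the sum of the two peptide scores, and that a pair is valid when the two masses plus the linker mass match the precursor mass within a tolerance. The function name `best_pair` and all parameters are hypothetical, and a peptide is allowed to pair with itself for simplicity.

```python
from collections import deque


def best_pair(masses, scores, precursor_mass, linker_mass, tol):
    """Return (best_total_score, i, j) or None if no pair fits the precursor.

    `masses` must be sorted in ascending order; `scores[k]` is the score of
    peptide k. After sorting, the scan itself takes linear time.
    """
    window = deque()          # candidate partners, scores decreasing front to back
    nxt = len(masses) - 1     # next partner to admit, moving from heavy to light
    best = None
    for i, mass in enumerate(masses):
        target = precursor_mass - linker_mass - mass
        low, high = target - tol, target + tol
        # As `mass` grows, the partner window slides toward lighter peptides.
        while nxt >= 0 and masses[nxt] >= low:
            # Drop partners that the new one dominates (no better score, leave sooner).
            while window and scores[window[-1]] <= scores[nxt]:
                window.pop()
            window.append(nxt)
            nxt -= 1
        # Evict partners that are now too heavy for this and all later peptides.
        while window and masses[window[0]] > high:
            window.popleft()
        if window:
            j = window[0]     # best-scoring partner inside the mass window
            total = scores[i] + scores[j]
            if best is None or total > best[0]:
                best = (total, i, j)
    return best


masses = [500.0, 700.0, 800.0, 1000.0, 1200.0]
scores = [1.0, 3.0, 2.0, 5.0, 0.5]
print(best_pair(masses, scores, precursor_mass=1838.0, linker_mass=138.0, tol=0.5))
```

Each peptide index enters and leaves the deque at most once, so the scan performs a number of operations proportional to the number of peptides rather than the number of pairs.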
Advanced Science, Volume 12, Issue 1, 2570004. Back Cover, Open Access. Mapping Nanoscale-To-Single-Cell Phosphoproteomic Landscape by Chip-DIA (Adv. Sci. 1/2025). Gul Muneer, Sofani Tafesse Gebreyesus, Ciao-Syuan Chen, Tzu-Tsung Lee, Fengchao Yu, Chih-An Lin, Min-Shu Hsieh, Alexey I. Nesvizhskii, Chao-Chi Ho, Sung-Liang Yu, Hsiung-Lin Tu, Yu-Ju Chen. First published: 09 January 2025. https://doi.org/10.1002/advs.202570004. Graphical Abstract: Single-Cell Phosphoproteomic Analysis. In article 2402421, Hsiung-Lin Tu, Yu-Ju Chen, and co-workers present a novel phosphoproteomic chip combined with mass spectrometry to map the phosphoproteomic landscape at single-cell resolution. This cover artwork highlights the chip, symbolized as a 'vessel' that navigates through single-cell capture, lysis, protein digestion, and phospho-enrichment, revealing the first phosphoprotein network that drives cellular signaling within a single cell.
Here, we describe the implementation of the fast proteomics search engine MSFragger as a processing node in the widely used Proteome Discoverer (PD) software platform. PeptideProphet (via the Philosopher tool kit) is also implemented as an additional PD node to allow validation of MSFragger open (mass-tolerant) search results. These two nodes, along with the existing Percolator validation module, allow users to employ different search strategies and conveniently inspect search results through PD. Our results demonstrate that MSFragger coupled with Percolator identifies more PSMs, peptides, and proteins, and searches significantly faster, than the conventional SEQUEST/Percolator PD workflows. The MSFragger-PD node is available at https://github.com/nesvilab/PD-Nodes/releases/.
Chemical cross-linking coupled with mass spectrometry is a powerful tool to study protein-protein interactions and protein conformations. Two linked peptides are ionized and fragmented to produce a tandem mass spectrum. In such an experiment, a tandem mass spectrum contains ions from two peptides, so the peptide identification problem becomes a peptide-peptide pair identification problem. Currently, most existing tools do not search all possible pairs because of the quadratic time complexity. Consequently, a significant percentage of linked peptides are missed. In our earlier work, we developed a tool named ECL to search all pairs of peptides exhaustively. While ECL does not miss any linked peptides, it is very slow due to the quadratic computational complexity, especially when the database is large. Furthermore, ECL uses a score function without statistical calibration, while researchers [1,2] have demonstrated that using a statistically calibrated score function can achieve a higher sensitivity than using an uncalibrated one. Here, we propose an advanced version of ECL, named ECL 2.0. It achieves linear time and space complexity by taking advantage of the additive property of a score function. It can analyze a typical data set containing tens of thousands of spectra using a large-scale database containing thousands of proteins in a few hours. Comparison with five other state-of-the-art tools shows that ECL 2.0 is much faster than pLink, StavroX, ProteinProspector, and ECL. Kojak is the only tool that is faster than ECL 2.0, but Kojak does not exhaustively search all possible peptide pairs. We also adopt an e-value estimation method to calibrate the original score. Comparison shows that ECL 2.0 has the highest sensitivity among the state-of-the-art tools. The experiment using a large-scale in vivo cross-linking data set demonstrates that ECL 2.0 is the only tool that can find PSMs passing the false discovery rate threshold. The result illustrates that exhaustive search and a well-calibrated score function are useful for finding PSMs in a huge search space.
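The role of an additive score can be illustrated with a toy example. The sketch below is an assumption-laden illustration, not ECL 2.0's actual algorithm: it uses a plain dot product as the score (which is additive over the two chains), bins chain masses, keeps only the best-scoring chain per bin, and then looks up the single complementary bin for each chain. The names `dot` and `best_pair_by_bins`, the bin width, and all values are hypothetical, and a real search would also need to handle mass tolerance across neighboring bins.

```python
def dot(u, v):
    """Toy score: dot product between a spectrum vector and a theoretical ion vector."""
    return sum(x * y for x, y in zip(u, v))


def best_pair_by_bins(chain_masses, chain_scores, precursor_mass, linker_mass, bin_width=1.0):
    """Return (best_total_score, i, j) over chains whose mass bins sum to the target bin."""
    def to_bin(mass):
        return round(mass / bin_width)

    # Pass 1: because pair score = score(i) + score(j), only the best chain
    # in each mass bin can ever be part of the best pair for that bin.
    best_in_bin = {}
    for idx, mass in enumerate(chain_masses):
        b = to_bin(mass)
        if b not in best_in_bin or chain_scores[idx] > chain_scores[best_in_bin[b]]:
            best_in_bin[b] = idx

    # Pass 2: each occupied bin needs a single lookup of its complementary bin.
    target = to_bin(precursor_mass - linker_mass)
    best = None
    for b, i in best_in_bin.items():
        j = best_in_bin.get(target - b)
        if j is None:
            continue
        total = chain_scores[i] + chain_scores[j]
        if best is None or total > best[0]:
            best = (total, i, j)
    return best


# Additivity check: scoring the combined ions equals the sum of the chain scores.
spectrum = [0.0, 2.0, 1.0, 0.0, 3.0]
alpha, beta = [0, 1, 0, 0, 1], [0, 0, 1, 0, 0]
combined = [a + b for a, b in zip(alpha, beta)]
print(dot(spectrum, combined), dot(spectrum, alpha) + dot(spectrum, beta))

masses = [500.0, 700.0, 800.0, 1000.0, 1200.0]
scores = [1.0, 3.0, 2.0, 5.0, 0.5]
print(best_pair_by_bins(masses, scores, precursor_mass=1838.0, linker_mass=138.0))
```

Both passes touch each chain a constant number of times, so time and memory grow linearly with the number of chains, whereas scoring every pair directly would grow quadratically.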