Metabolomics has emerged as the latest of the so-called "omics" disciplines and has great potential to provide deeper understanding of fundamental biochemical processes at the biological system level. Among recent technological developments, LC-HRMS enables determination of hundreds to thousands of metabolites over a wide range of concentrations and has developed into one of the most powerful techniques in non-targeted metabolomics. Analyzing mixtures of in vivo stable-isotope-labeled samples or reference substances with unlabeled samples produces specific LC-MS data patterns that can be exploited systematically in practically all data-processing steps. These steps include recognition of true metabolite-derived analytical features in highly complex LC-MS data and characterization of the global biochemical composition of biological samples. In addition, stable isotope labeling can be used for more accurate quantification (via internal standardization) and identification of compounds in different organisms.
An untargeted metabolomics workflow for the detection of metabolites derived from endogenous or exogenous tracer substances is presented. To this end, a recently developed stable isotope-assisted LC–HRMS-based metabolomics workflow for the global annotation of biological samples has been further developed and extended. For untargeted detection of metabolites arising from labeled tracer substances, isotope pattern recognition has been adjusted to account for nonlabeled moieties conjugated to the native and labeled tracer molecules. Furthermore, the workflow has been extended by (i) an optional ion intensity ratio check, (ii) the automated combination of positive and negative ionization mode mass spectra derived from fast polarity switching, and (iii) metabolic feature annotation. These extensions enable the automated, unbiased, and global detection of tracer-derived metabolites in complex biological samples. The workflow is demonstrated with the metabolism of 13C9-phenylalanine in wheat cell suspension cultures in the presence of the mycotoxin deoxynivalenol (DON). In total, 341 metabolic features (150 in positive and 191 in negative ionization mode) corresponding to 139 metabolites were detected. The benefit of fast polarity switching was evident, with 32 and 58 of these metabolites having exclusively been detected in the positive and negative modes, respectively. Moreover, for 19 of the remaining 49 phenylalanine-derived metabolites, the assignment of ion species and, thus, molecular weight was possible only by the use of complementary features of the two ion polarity modes. Statistical evaluation showed that treatment with DON increased or decreased the abundances of many detected metabolites.
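The tracer-specific isotope pattern described above can be illustrated with a minimal sketch. It is not the published workflow: the function names, the 5 ppm tolerance, and the example m/z values are assumptions chosen for illustration. The sketch searches a list of m/z values for pairs separated by the mass shift of nine 13C atoms, regardless of how many unlabeled carbon atoms the conjugated moieties contribute.

```python
# Mass difference between 13C and 12C in Da (standard constant, rounded).
C13_C12_DELTA = 1.0033548


def labeled_partner_mz(native_mz, n_labeled_carbons, charge=1):
    """Expected m/z of the tracer-labeled ion for a given native ion."""
    return native_mz + n_labeled_carbons * C13_C12_DELTA / abs(charge)


def find_peak_pairs(mz_values, n_labeled_carbons, charge=1, ppm_tol=5.0):
    """Return (native_mz, labeled_mz) pairs spaced by the label's mass shift."""
    pairs = []
    for native in mz_values:
        expected = labeled_partner_mz(native, n_labeled_carbons, charge)
        tolerance = expected * ppm_tol * 1e-6  # ppm tolerance converted to Da
        for candidate in mz_values:
            if abs(candidate - expected) <= tolerance:
                pairs.append((native, candidate))
    return pairs


# Hypothetical m/z values from one scan (illustrative, not from the study).
scan = [166.0863, 175.1165, 180.0655, 182.0812]
pairs = find_peak_pairs(scan, n_labeled_carbons=9)
print(pairs)
```

A real implementation would also have to check chromatographic co-elution and, optionally, the intensity ratio of the two ions, which this sketch omits.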
Abstract
Untargeted metabolomics promises comprehensive characterization of small molecules in biological samples. However, the field is hampered by low annotation rates and abstract spectral data. Despite recent advances in computational metabolomics, manual annotations and manual confirmation of in-silico annotations remain important in the field. Here, exploratory data analysis methods for mass spectral data provide overviews, prioritization, and structural hypothesis starting points to researchers facing large quantities of spectral data. In this research, we propose a fluid means of dealing with mass spectral data using specXplore, an interactive Python dashboard providing complementary visualizations that facilitate mass spectral similarity matrix exploration. Specifically, specXplore provides a two-dimensional t-SNE embedding as a jumping-off point for local connectivity exploration using complementary interactive visualizations in the form of partial network drawings, similarity heatmaps, and fragmentation overview maps. specXplore uses state-of-the-art ms2deepscore pairwise spectral similarities as a quantitative backbone, while allowing fast changes of threshold and connectivity limitation settings, providing flexibility in adjusting settings to suit the localized node environment being explored. We believe that specXplore can become an integral part of mass spectral data exploration efforts and assist users in the generation of structural hypotheses for compounds of interest.
Technical Terms
A network is a collection of connected features. In our case, a network consists of MS/MS spectral features that are connected if their spectral similarity is high. Networks are represented using node-link diagrams.
Node-link diagram: a term commonly used to refer to the graphical representation of a network via nodes and links (i.e. edges). In this paper, we use node-link diagram and network-view interchangeably.
A node is a feature in a network that can be connected to other features via edges. An alternative term for node is vertex. An edge is a connection between two nodes. Another term for edge is link. Network layout refers to the spatial arrangement of nodes and edges on a usually two-dimensional plotting surface. Network layout is also sometimes referred to as embedding. This term is avoided in this paper to prevent confusion with embedding in the machine learning sense. Given a network G(V, E), where V denotes its nodes and E its (weighted) edges, we define its topology as the relationships between individual (groups of) nodes and edges or the network as a whole, irrespective of the network's layout. Molecular Networking (MN) is an exploratory data analysis technique merging spectral similarity-based topological clustering and visualization as node-link diagrams. The plain English words group/grouping are used wherever appropriate to avoid jargon terms such as clustering (as in k-medoid or k-means clustering), embedding (as in projection of groups of features into a close-by lower dimensional space), or molecular families. The latter are groups of spectral data features clustered and visualized as network-views via traditional MN or feature-based molecular networking (FBMN). Molecular families usually represent smaller, disconnected networks that are part of a larger dataset. When we refer to this disconnected nature, we use the phrasing disjoint sub-network for emphasis.
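The terms defined above can be made concrete with a small sketch. It builds a network G(V, E) from a pairwise similarity matrix using a similarity threshold and a per-node edge limit, and then lists the disjoint sub-networks. This is an illustration under stated assumptions, not specXplore's implementation: the function names, the top-k rule, and the similarity values are hypothetical.

```python
def build_edges(similarity, threshold, max_edges_per_node):
    """Keep edges with similarity >= threshold; each node proposes at most
    `max_edges_per_node` of its most similar neighbours. A node can still end
    up with more edges if other nodes propose it (assumed rule)."""
    n = len(similarity)
    edges = set()
    for i in range(n):
        neighbours = sorted(
            (j for j in range(n) if j != i and similarity[i][j] >= threshold),
            key=lambda j: similarity[i][j],
            reverse=True,
        )
        for j in neighbours[:max_edges_per_node]:
            edges.add((min(i, j), max(i, j)))  # undirected edge, stored once
    return edges


def disjoint_subnetworks(n_nodes, edges):
    """Return the connected components of the network as sorted node lists."""
    adjacency = {i: set() for i in range(n_nodes)}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in range(n_nodes):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal from the start node
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical pairwise similarities for five spectra (symmetric matrix).
similarity = [
    [1.00, 0.90, 0.80, 0.10, 0.20],
    [0.90, 1.00, 0.70, 0.20, 0.10],
    [0.80, 0.70, 1.00, 0.30, 0.20],
    [0.10, 0.20, 0.30, 1.00, 0.85],
    [0.20, 0.10, 0.20, 0.85, 1.00],
]
edges = build_edges(similarity, threshold=0.6, max_edges_per_node=1)
components = disjoint_subnetworks(len(similarity), edges)
print(sorted(edges), components)
```

Raising the threshold or lowering the edge limit removes edges without moving any node, which is why topology can be explored independently of layout.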
An untargeted screening strategy for the detection of biotransformation products of xenobiotics using stable isotopic labelling (SIL) and liquid chromatography-high resolution mass spectrometry (LC-HRMS) is reported. The organism of interest is treated with a mixture of labelled and non-labelled precursor, and samples are analysed by LC-HRMS. Raw data are processed with the recently developed MetExtract software for the automated extraction of corresponding peak pairs. The SIL-assisted approach is exemplified by the metabolisation of the Fusarium mycotoxin deoxynivalenol (DON) in planta. Flowering ears were inoculated with 100 μg of a 1 + 1 (v/v) mixture of non-labelled and fully labelled DON. Subsequent sample preparation, LC-HRMS measurements and data processing revealed a total of 57 corresponding peak pairs, which originated from ten metabolites. Besides the known DON and DON-3-glucoside, which were confirmed by measurement of authentic standards, eight further DON-biotransformation products were found by the untargeted screening approach. Based on a mass deviation of less than ±5 ppm and MS/MS measurements, one of these products was annotated as a DON-glutathione (GSH) conjugate, which is described here for the first time for wheat. Our data further suggest that two DON-GSH-related metabolites (the processing products DON-S-cysteine and DON-S-cysteinyl-glycine) and five unknown DON conjugates were formed in planta. Future MS/MS measurements shall reveal the molecular structures of the detected conjugates in more detail.
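The ±5 ppm criterion mentioned above can be sketched as follows. The example assumes the protonated ion [M+H]+ of DON (C15H20O6) and a hypothetical observed m/z; neither the ion species nor the measured value is taken from the study, and the helper names are illustrative.

```python
# Monoisotopic masses in Da (standard values, rounded).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727647  # mass of a proton in Da


def monoisotopic_mass(formula):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def ppm_deviation(observed_mz, theoretical_mz):
    """Signed mass deviation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


don = {"C": 15, "H": 20, "O": 6}  # deoxynivalenol
theoretical_mz = monoisotopic_mass(don) + PROTON  # assumed ion species: [M+H]+
observed_mz = 297.1340  # hypothetical measured value
deviation = ppm_deviation(observed_mz, theoretical_mz)
within_tolerance = abs(deviation) <= 5.0
print(f"{theoretical_mz:.4f} {deviation:.2f} ppm {within_tolerance}")
```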
Metabolomics experiments often comprise large numbers of biological samples resulting in huge amounts of data. This data needs to be inspected for plausibility before data evaluation to detect putative sources of error e.g. retention time or mass accuracy shifts. Especially in liquid chromatography-high resolution mass spectrometry (LC-HRMS) based metabolomics research, proper quality control checks (e.g. for precision, signal drifts or offsets) are crucial prerequisites to achieve reliable and comparable results within and across experimental measurement sequences. Software tools can support this process. The software tool QCScreen was developed to offer a quick and easy data quality check of LC-HRMS derived data. It allows a flexible investigation and comparison of basic quality-related parameters within user-defined target features and the possibility to automatically evaluate multiple sample types within or across different measurement sequences in a short time. It offers a user-friendly interface that allows an easy selection of processing steps and parameter settings. The generated results include a coloured overview plot of data quality across all analysed samples and targets and, in addition, detailed illustrations of the stability and precision of the chromatographic separation, the mass accuracy and the detector sensitivity. The use of QCScreen is demonstrated with experimental data from metabolomics experiments using selected standard compounds in pure solvent. The application of the software identified problematic features, samples and analytical parameters and suggested which data files or compounds required closer manual inspection. QCScreen is an open source software tool which provides a useful basis for assessing the suitability of LC-HRMS data prior to time consuming, detailed data processing and subsequent statistical analysis. 
It accepts the generic mzXML format and thus can be used with many different LC-HRMS platforms to process both multiple quality control sample types as well as experimental samples in one or more measurement sequences.
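The kind of quality check described above (mass accuracy, retention time stability, and precision for a user-defined target) can be sketched in a few lines. This is not QCScreen's algorithm: the function names, thresholds, and measurement values are hypothetical and serve only to show the principle.

```python
from statistics import mean, stdev


def relative_std_percent(values):
    """Relative standard deviation (coefficient of variation) in percent."""
    return stdev(values) / mean(values) * 100.0


def qc_report(target, measurements, max_ppm=5.0, max_rt_shift=0.1, max_rsd=20.0):
    """Compare repeated measurements of one target against assumed limits."""
    ppm = [(m["mz"] - target["mz"]) / target["mz"] * 1e6 for m in measurements]
    rt_shift = [m["rt"] - target["rt"] for m in measurements]
    areas = [m["area"] for m in measurements]
    return {
        "mass_accuracy_ok": all(abs(p) <= max_ppm for p in ppm),
        "retention_time_ok": all(abs(s) <= max_rt_shift for s in rt_shift),
        "precision_ok": relative_std_percent(areas) <= max_rsd,
    }


# Hypothetical target and three QC injections (retention time in minutes).
target = {"mz": 297.1333, "rt": 5.20}
measurements = [
    {"mz": 297.1335, "rt": 5.21, "area": 1.00e6},
    {"mz": 297.1331, "rt": 5.19, "area": 1.05e6},
    {"mz": 297.1338, "rt": 5.45, "area": 0.98e6},  # retention time drift
]
report = qc_report(target, measurements)
print(report)
```

In this example only the retention time check fails, which would point to the third injection for closer manual inspection.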
Abstract
Covalent or non-covalent heterogeneous multimerization of molecules associated with extracts from biological samples analyzed via LC-MS is difficult to recognize and annotate, and the prevalence of multimerization therefore remains largely unknown. In this study, we utilized 13C-labeled and unlabeled Pichia pastoris extracts to recognize heterogeneous multimers. More specifically, between 0.8% and 1.5% of the biologically derived features detected in our experiments were confirmed to be heteromers, about half of which we could successfully annotate with monomeric partners. Interestingly, we found that specific chemical classes such as nucleotides contribute disproportionately to heteroadducts. Furthermore, we compiled these compounds into the first MS/MS library that includes data from heteromultimers to provide a starting point for other labs to improve the annotation of such ions in other metabolomics data sets. The detected heteromers were also searched in publicly accessible LC-MS datasets available in MetaboLights, Metabolomics Workbench, and GNPS/MassIVE to demonstrate that these newly annotated ions are also relevant to other public datasets. Finally, in additional datasets (Triticum aestivum, Fusarium graminearum, and Trichoderma reesei), our workflow found that 0.5% to 4.9% of metabolite features originate from heterodimers, demonstrating that heteroadducts are present in metabolomics studies at a low percentage.
Untargeted approaches, and thus the biological interpretation of metabolomics results, are still hampered by the difficulty of reliably assigning the global metabolome and of classifying and (putatively) identifying metabolites. In this work we present a liquid chromatography-mass spectrometry (LC-MS)-based stable isotope-assisted approach that combines global metabolome and tracer-based isotope labeling for improved characterization of (unknown) metabolites and their classification into tracer-derived submetabolomes. To this end, wheat plants were cultivated in a customized growth chamber, which was kept at 400 ± 50 ppm 13CO2 to produce highly enriched uniformly 13C-labeled sample material. Additionally, native plants were grown in the greenhouse and treated with either 13C9-labeled phenylalanine (Phe) or 13C11-labeled tryptophan (Trp) to study their metabolism and biochemical pathways. After sample preparation, liquid chromatography-high resolution mass spectrometry (LC-HRMS) analysis and automated data evaluation, the results of the global metabolome- and tracer-labeling approaches were combined. A total of 1,729 plant metabolites were detected, of which 122 and 58 metabolites account for the Phe- and Trp-derived submetabolomes, respectively. In addition to m/z and retention time, the total number of carbon atoms and the number of carbon atoms in the incorporated tracer moieties were obtained for the detected metabolite ions. With this information at hand, characterization of unknown compounds was improved, as the additional knowledge from the tracer approaches considerably reduced the number of plausible sum formulas and structures of the detected metabolites. Finally, the number of putative structure formulas was further reduced by isotope-assisted annotation of tandem mass spectrometry (MS/MS)-derived product ion spectra of the detected metabolites. 
A major innovation of this paper is the classification of the metabolites into submetabolomes which turned out to be valuable information for effective filtering of database hits based on characteristic structural subparts. This allows the generation of a final list of true plant metabolites, which can be characterized at different levels of specificity.
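How carbon counts follow from the labeling data, and how they narrow down candidate sum formulas, can be illustrated with a short sketch. The m/z values and candidate formulas are hypothetical and the function names are assumptions; only the 13C/12C mass difference is a standard constant.

```python
# Mass difference between 13C and 12C in Da (standard constant, rounded).
C13_C12_DELTA = 1.0033548


def carbon_count(native_mz, labeled_mz, charge=1):
    """Number of 13C atoms implied by the m/z shift between two ions."""
    return round((labeled_mz - native_mz) * abs(charge) / C13_C12_DELTA)


def filter_formulas(candidates, total_carbons):
    """Keep only candidate sum formulas with the observed carbon count."""
    return [c for c in candidates if c["C"] == total_carbons]


# Hypothetical ions of one metabolite (illustrative values only).
native_mz = 355.1024
uniformly_labeled_mz = 371.1561  # from the globally 13C-labeled plant material
tracer_labeled_mz = 364.1326     # from the 13C9-Phe tracer experiment

total_carbons = carbon_count(native_mz, uniformly_labeled_mz)
tracer_carbons = carbon_count(native_mz, tracer_labeled_mz)

# Hypothetical database hits for the native m/z.
candidates = [
    {"formula": "C16H18O9", "C": 16},
    {"formula": "C15H18N2O8", "C": 15},
    {"formula": "C20H18O6", "C": 20},
]
plausible = filter_formulas(candidates, total_carbons)
print(total_carbons, tracer_carbons, [c["formula"] for c in plausible])
```

The tracer carbon count additionally indicates how much of the tracer skeleton is retained in the metabolite, which supports the classification into submetabolomes.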
Many metabolomics studies use mixtures of (acidified) methanol and water for sample extraction. In the present study, we investigated whether extraction with methanol can result in artifacts. To this end, wheat leaves were extracted with mixtures of native and deuterium-labeled methanol and water, with or without 0.1% formic acid. Subsequently, the extracts were analyzed immediately or after storage at 10 °C, −20 °C or −80 °C with an HPLC-HESI-QExactive HF-Orbitrap instrument. Our results showed that 88 (8%) of the >1100 detected compounds were derived from the reaction with methanol and formed either during sample extraction or during short-term storage. Artifacts were found for various substance classes such as flavonoids, carotenoids, tetrapyrroles, fatty acids and other carboxylic acids that are typically investigated in metabolomics studies. Of the 88 artifacts, 58 were common to the two tested extraction variants. Remarkably, 34 of 73 (acidified extraction solvent) and 33 of 73 (non-acidified extraction solvent) artifacts were formed de novo, as none of these meth(ox)ylated metabolites were found after extraction of native leaf samples with CD3OH/H2O. Moreover, storing sample extracts at 10 °C for several days, as can typically be the case during longer measurement sequences, led to an increase in both the number and abundance of methylated artifacts. In contrast, frozen sample extracts were relatively stable during a storage period of one week. Our study shows that caution has to be exercised if methanol is used as the extraction solvent, as the detected metabolites might be artifacts rather than natural constituents of the biological system. In addition, we recommend storing sample extracts in deep freezers immediately after extraction until measurement.
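The artifact counts reported above are related by simple inclusion-exclusion, which the following lines check. The variable names are illustrative, and 1100 is used as a lower bound because the abstract states ">1100" detected compounds.

```python
# Counts taken from the abstract above.
artifacts_acidified = 73
artifacts_non_acidified = 73
artifacts_shared = 58
detected_compounds = 1100  # lower bound; the abstract reports ">1100"

# Artifacts found in at least one extraction variant (inclusion-exclusion).
distinct_artifacts = artifacts_acidified + artifacts_non_acidified - artifacts_shared

# Artifacts specific to a single extraction variant.
only_acidified = artifacts_acidified - artifacts_shared
only_non_acidified = artifacts_non_acidified - artifacts_shared

# Approximate share of artifacts among all detected compounds, in percent.
artifact_percent = distinct_artifacts / detected_compounds * 100

print(distinct_artifacts, only_acidified, only_non_acidified, round(artifact_percent))
```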