High-resolution sweep metagenomics using ultrafast read mapping and inference

2018 
Traditional 16S ribosomal RNA sequencing and whole-genome shotgun metagenomics can determine the composition of bacterial communities at the genus and species level, but high-resolution inference at the strain level is challenging because strain genomes are closely related. We present the mSWEEP pipeline for identifying bacterial strains and estimating their relative abundances from plate sweeps of enrichment cultures. mSWEEP uses a database of biologically grouped sequence assemblies as a reference and achieves ultrafast mapping and accurate inference using pseudoalignment, Bayesian probabilistic modeling, and a control for false positive results. We use sequencing data from the major human pathogens Campylobacter jejuni, Campylobacter coli, Klebsiella pneumoniae and Staphylococcus epidermidis to demonstrate that mSWEEP significantly outperforms the previous state of the art in strain quantification and detection accuracy. The introduction of mSWEEP opens up a new field of plate sweep metagenomics and facilitates the investigation of bacterial cultures composed of mixtures of organisms at differing levels of variation.
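To illustrate the general idea of estimating relative abundances from reads that are compatible with several reference groups, the following is a minimal sketch, not mSWEEP's actual implementation. It swaps in a plain maximum-likelihood EM algorithm for a mixture model in place of the Bayesian inference described above, and it omits the false-positive control. The function name `estimate_abundances` and the input format (one row per read, one likelihood per reference group) are assumptions made for this example.

```python
def estimate_abundances(likelihoods, n_iter=200, tol=1e-10):
    """Estimate mixture proportions of reference groups with EM.

    likelihoods[i][k] is a (hypothetical) likelihood of read i under
    reference group k, e.g. 1.0 if the read pseudoaligns to the group
    and 0.0 otherwise. Returns one proportion per group, summing to 1.
    """
    n_groups = len(likelihoods[0])
    theta = [1.0 / n_groups] * n_groups  # start from uniform proportions

    for _ in range(n_iter):
        counts = [0.0] * n_groups
        # E-step: split each read across groups in proportion to
        # (current abundance) x (likelihood of the read under the group).
        for row in likelihoods:
            weighted = [t * l for t, l in zip(theta, row)]
            total = sum(weighted)
            if total == 0.0:
                continue  # read is compatible with no group; ignore it
            for k, w in enumerate(weighted):
                counts[k] += w / total

        assigned = sum(counts)
        if assigned == 0.0:
            return theta  # no informative reads; keep the current estimate

        # M-step: new proportions are the normalized expected read counts.
        new_theta = [c / assigned for c in counts]
        converged = max(abs(a - b) for a, b in zip(new_theta, theta)) < tol
        theta = new_theta
        if converged:
            break

    return theta


# Three reads unique to group 0, one unique to group 1, two ambiguous reads.
reads = [[1, 0], [1, 0], [1, 0], [0, 1], [1, 1], [1, 1]]
abundances = estimate_abundances(reads)
```

In this toy input the ambiguous reads are shared between the two groups according to the current abundance estimate, so the estimate is driven by the uniquely assigned reads.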