Evaluating and Improving SSU rRNA PCR Primer Coverage via Metagenomes from Global Ocean Surveys

2020 
Small subunit ribosomal RNA (SSU rRNA) amplicon sequencing comprehensively profiles microbiomes, but results will only be accurate if PCR primers perfectly match environmental sequences. To evaluate whether primers commonly used in microbial oceanography match naturally occurring organisms, we compared primers with >300 million rRNA sequences retrieved from globally distributed metagenomes. The best-performing 16S primers were 515Y/926R and 515Y/806RB, which perfectly matched most sequences (coverage ~0.95). Considering Cyanobacteria/Chloroplast 16S, 515Y/926R had the highest coverage (0.99), making it ideal for quantifying phytoplankton. For 18S sequences, 515Y/926R performed best (0.88), followed by V4R/V4RB (18S-specific; 0.82). Using Atlantic and Pacific BioGEOTRACES field samples, we demonstrate high correspondence between 515Y/926R amplicons (generated as part of this study) and metagenomic 16S rRNA (median R² = 0.98, n = 272), indicating that amplicons can produce community composition data as accurate as those from shotgun metagenomics. Since our pipeline identifies missed taxa, we suggest modifications to improve coverage of biogeochemically important oceanic microorganisms, a strategy applicable to any environment with metagenomic data.

Significance Statement
Quantification of taxonomically informative marker genes using PCR amplification and high-throughput sequencing is a low-cost technique for monitoring distributions and changes of microbial communities across space and time. To maximize this procedure's effectiveness, it is essential that environmental organisms match PCR primer sequences exactly. In this study, we developed a software pipeline to evaluate how well commonly used primers match primer-binding regions from globally distributed short-read oceanic metagenomes. Our results demonstrate that common primer sets vary widely in performance, and that including additional degenerate bases is a simple strategy to maximize environmental coverage. Written in the reproducible Snakemake workflow language and publicly accessible, our pipeline provides a general-purpose tool to guide rational design of PCR primers for any environment with metagenomic data.
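
The coverage values reported above are fractions of primer-binding regions that a primer matches perfectly. As an illustration only, the following minimal Python sketch shows one way such a fraction could be computed for a degenerate primer, and how adding a degenerate base can raise it. It is not the authors' pipeline: the function names, the short primer, and the target sequences are all hypothetical, and the targets are assumed to be already extracted, primer-length, and in the same orientation as the primer.

```python
# IUPAC nucleotide codes mapped to the bases each code allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_perfectly(primer, target):
    """Return True if every target base is allowed by the primer code at that position."""
    if len(primer) != len(target):
        return False
    return all(base in IUPAC[code] for code, base in zip(primer, target))


def coverage(primer, targets):
    """Fraction of target regions that the primer matches with zero mismatches."""
    if not targets:
        return 0.0
    hits = sum(matches_perfectly(primer, t) for t in targets)
    return hits / len(targets)


# Hypothetical 9-base primer and made-up binding regions (not real data).
primer = "GTGYCAGCM"
targets = ["GTGCCAGCA", "GTGTCAGCC", "GTGCCAGCG", "GTGTCAGCA"]

print(coverage(primer, targets))           # 0.75: the third target has G where M allows only A or C

# Widening the last position from M (A/C) to V (A/C/G) recovers the missed target.
more_degenerate = "GTGYCAGCV"
print(coverage(more_degenerate, targets))  # 1.0
```

For a reverse primer, the target regions (or the primer) would first need to be reverse-complemented so that both are read in the same orientation; that step is omitted here for brevity.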