Influenza Classification from Short Reads with VAPOR Facilitates Robust Mapping Pipelines and Zoonotic Strain Detection for Routine Surveillance Applications.

2019 
Motivation: Influenza viruses represent a global public health burden due to annual epidemics and pandemic potential. Due to a rapidly evolving RNA genome, inter-species transmission, intra-host variation, and noise in short-read data, reads can be lost during mapping, and de novo assembly can be time consuming and result in misassembly. We assessed read loss during mapping, and designed a graph-based classifier, VAPOR, for selecting mapping references, assembly validation, and detection of strains of non-human origin. Results: Standard human reference viruses were insufficient for mapping diverse influenza samples in simulation. VAPOR retrieved references for 257 real whole genome sequencing (WGS) samples with a mean of >99.8% identity to assemblies, and increased the proportion of mapped reads by up to 13.3% compared to standard references. VAPOR has the potential to improve the robustness of bioinformatics pipelines for surveillance and could be adapted to other RNA viruses.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    44
    References
    2
    Citations
    NaN
    KQI
    []