QColors: An algorithm for conservative viral quasispecies reconstruction from short and non-contiguous next generation sequencing reads

2011 
Next generation sequencing technologies have been successfully applied to HIV-infected patients in order to obtain the mutational spectrum of heterogeneous viral populations within individuals, known as quasispecies. However, the metagenomics problem of quasispecies sequence reconstruction from next generation sequencing reads is not yet widely applied in current practice and remains an emerging area of research. Furthermore, most HIV research methodology has focused on 454 sequencing, while many other next generation sequencing platforms are limited to shorter read lengths than 454 sequencing provides. Little work has been done to determine how best to address the read length limitations of these other platforms. The approach described here incorporates graph representations of both read differences and read overlap to conservatively determine the regions of the sequence with sufficient variability to separate quasispecies sequences. Within these tractable regions of quasispecies inference, we use constraint programming to solve for an optimal quasispecies subsequence determination via vertex coloring of the conflict graph, a representation which also lends itself to data with non-contiguous reads such as paired-end sequencing. We demonstrate the utility of the method by successfully applying it to simulations based on actual intra-patient clonal HIV-1 sequencing data.
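The vertex-coloring idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the read representation (a mapping from position to base), the function names, and the plain backtracking search used here in place of a constraint programming solver are all assumptions made for illustration. Two reads are joined by a conflict edge when they disagree at a position both cover, so reads that share a color could come from the same quasispecies sequence, and the smallest number of colors is the fewest sequences that explain the reads.

```python
from itertools import combinations

def conflicts(read_a, read_b):
    """Two reads conflict if they disagree at any position both cover.

    Each read is a dict {position: base}; this hypothetical representation
    also handles non-contiguous (e.g. paired-end) reads.
    """
    shared = read_a.keys() & read_b.keys()
    return any(read_a[p] != read_b[p] for p in shared)

def build_conflict_graph(reads):
    """Adjacency sets: an edge joins reads that cannot share a sequence."""
    graph = {i: set() for i in range(len(reads))}
    for i, j in combinations(range(len(reads)), 2):
        if conflicts(reads[i], reads[j]):
            graph[i].add(j)
            graph[j].add(i)
    return graph

def color_with_k(graph, k):
    """Backtracking search for a proper coloring using at most k colors."""
    order = sorted(graph, key=lambda v: -len(graph[v]))  # high degree first
    colors = {}

    def assign(idx):
        if idx == len(order):
            return True
        v = order[idx]
        used = {colors[u] for u in graph[v] if u in colors}
        for c in range(k):
            if c not in used:
                colors[v] = c
                if assign(idx + 1):
                    return True
                del colors[v]
        return False

    return dict(colors) if assign(0) else None

def min_coloring(graph):
    """Try k = 1, 2, ... until a proper coloring exists (exact, exponential)."""
    for k in range(1, len(graph) + 1):
        result = color_with_k(graph, k)
        if result is not None:
            return k, result
    return 0, {}

# Toy reads over positions 0..3; the last read is non-contiguous.
reads = [
    {0: "A", 1: "C", 2: "G"},
    {1: "C", 2: "G", 3: "T"},
    {0: "A", 1: "T", 2: "G"},
    {1: "T", 2: "G", 3: "A"},
    {0: "A", 3: "T"},
]
graph = build_conflict_graph(reads)
num_sequences, assignment = min_coloring(graph)
```

In this toy input the reads fall into two compatible groups, so two colors suffice. The exhaustive search is only practical for small graphs, which is consistent with restricting inference to tractable regions as the abstract describes.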