De novo assembly of transcriptome from next-generation sequencing data

2016 
Reconstruction of transcriptome by de novo assembly from next generation sequencing (NGS) short-sequence reads provides an essential mean to catalog expressed genes, identify splicing isoforms, and capture the expression detail of transcripts for organisms with no reference genome available. De novo transcriptome assembly faces many unique challenges, including alternative splicing, variable expression level covering a dynamic range of several orders of magnitude, artifacts introduced by reverse transcription, etc. In the current review, we illustrate the grand strategy in applying De Bruijn Graph (DBG) approach in de novo transcriptome assembly. We further analyze many parameters proven critical in transcriptome assembly using DBG. Among them, k-mer length, coverage depth of reads, genome complexity, performance of different programs are addressed in greater details. A multi-k-mer strategy balancing efficiency and sensitivity is discussed and highly recommended for de novo transcriptome assembly. Future direction points to the combination of NGS and third generation sequencing technology that would greatly enhance the power of de novo transcriptomics study.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    140
    References
    4
    Citations
    NaN
    KQI
    []