A revisit of RSEM generative model and its EM algorithm for quantifying transcript abundances.

2018 
RSEM is known mainly for its accuracy in transcript abundance quantification. However, its quantification time is much longer than that of recent quantification tools. In this paper, we revise RSEM's EM algorithm. In particular, we derive accurate M-step updates to eliminate incorrect heuristic updates in RSEM. We also implement optimizations that reduce the quantification time about a hundredfold while still achieving better accuracy than RSEM. Specifically, we observe that different parameters converge at different rates, so we identify and remove early-converged parameters to significantly reduce the model complexity in later iterations, and we use the SQUAREM method to further speed up convergence. We implemented these revisions in a package named Hera-EM, with source code available at: https://github.com/bioturing/hera/tree/master/hera-EM
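To make the E-step and M-step mentioned above concrete, here is a minimal sketch of a textbook EM loop for read-to-transcript assignment. It is not the RSEM or Hera-EM implementation: the function name `em_abundance`, the input layout `read_compat`, and the simplification that every alignment of a read is equally likely a priori (no effective lengths, no error model) are assumptions made for illustration. The removal of early-converged parameters and the SQUAREM acceleration described in the abstract are not shown.

```python
def em_abundance(read_compat, n_transcripts, max_iter=1000, tol=1e-8):
    """Estimate the fraction of reads originating from each transcript.

    read_compat[i] lists the transcript indices that read i aligns to.
    Assumption for this sketch: alignments carry no quality or length
    weights, so only the current abundances decide how a read is split.
    """
    # Start from a uniform abundance vector.
    theta = [1.0 / n_transcripts] * n_transcripts

    for _ in range(max_iter):
        expected = [0.0] * n_transcripts

        # E-step: fractionally assign each read to its compatible
        # transcripts in proportion to the current abundances.
        for compat in read_compat:
            total = sum(theta[t] for t in compat)
            if total == 0.0:
                continue  # read has no transcript with non-zero abundance
            for t in compat:
                expected[t] += theta[t] / total

        # M-step: the new abundances are the normalized expected counts.
        n_assigned = sum(expected)
        new_theta = [c / n_assigned for c in expected]

        # Stop once the largest change in any parameter is below tol.
        delta = max(abs(a - b) for a, b in zip(theta, new_theta))
        theta = new_theta
        if delta < tol:
            break

    return theta


# Toy input: two reads unique to transcript 0, one unique to
# transcript 1, and one read compatible with both.
reads = [[0], [0], [0, 1], [1]]
theta = em_abundance(reads, n_transcripts=2)
```

For this toy input the estimates converge toward 2/3 for transcript 0 and 1/3 for transcript 1, which is the maximum-likelihood solution under the stated simplification.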