PRAM: a novel pooling approach for discovering intergenic transcripts from large-scale RNA sequencing experiments
2019
Publicly available RNA-seq data is routinely used for retrospective analysis to elucidate new biology. Novel transcript discovery enabled by joint examination of large collections of RNA-seq datasets has emerged as one such analysis. Current methods for transcript discovery rely on a '2-Step' approach where the first step encompasses building transcripts from individual datasets, followed by the second step that merges predicted transcripts across datasets. To increase the power of transcript discovery from large collections of RNA-seq datasets, we developed a novel '1-Step' approach named Pooling RNA-seq and Assembling Models (PRAM) that builds transcript models from pooled RNA-seq datasets. We demonstrate in a computational benchmark that '1-Step' outperforms '2-Step' approaches in predicting overall transcript structures and individual splice junctions, while performing competitively in detecting exonic nucleotides. Applying PRAM to 30 human ENCODE RNA-seq datasets identified unannotated transcripts with epigenetic and RAMPAGE signatures similar to those of recently annotated transcripts. In a case study, we discovered and experimentally validated new transcripts through the application of PRAM to mouse hematopoietic RNA-seq datasets. Notably, we uncovered new transcripts that share a differential expression pattern with a neighboring gene Pik3cg implicated in human hematopoietic phenotypes, and we provided evidence for the conservation of this relationship in human. PRAM is implemented as an R/Bioconductor package and is available at https://bioconductor.org/packages/pram.
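The difference between building per dataset and then merging, versus pooling first and building once, can be illustrated with a toy sketch. This is not PRAM's implementation or API: a simple read-count threshold stands in for transcript assembly, and the region names, counts, `MIN_READS`, and both function names are hypothetical values chosen only to show why pooling can recover weakly supported transcripts.

```python
from typing import Dict, List, Set

# Hypothetical toy data: number of reads supporting each candidate
# intergenic region in each RNA-seq dataset (illustrative values only).
DATASETS: List[Dict[str, int]] = [
    {"regionA": 12, "regionB": 3},
    {"regionA": 9, "regionB": 4},
    {"regionA": 15, "regionB": 4},
]

# Assumed minimum read support needed to call a transcript model.
MIN_READS = 10


def two_step(datasets: List[Dict[str, int]], min_reads: int) -> Set[str]:
    """Call models within each dataset, then merge (union) the calls."""
    merged: Set[str] = set()
    for counts in datasets:
        merged |= {region for region, n in counts.items() if n >= min_reads}
    return merged


def one_step(datasets: List[Dict[str, int]], min_reads: int) -> Set[str]:
    """Pool reads across all datasets first, then call models once."""
    pooled: Dict[str, int] = {}
    for counts in datasets:
        for region, n in counts.items():
            pooled[region] = pooled.get(region, 0) + n
    return {region for region, n in pooled.items() if n >= min_reads}


if __name__ == "__main__":
    print("2-Step:", sorted(two_step(DATASETS, MIN_READS)))
    print("1-Step:", sorted(one_step(DATASETS, MIN_READS)))
```

In this toy setting, `regionB` never reaches the threshold in any single dataset, so the merge step has nothing to combine, whereas the pooled counts do pass it. Real assemblers work on aligned reads and splice structures rather than a single count per region.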