An RNA-Seq Bioinformatics Pipeline for Data Processing of Arabidopsis thaliana Datasets

2017 
Computational tools for RNA-Seq data analysis are often applied to Arabidopsis thaliana with their default parameters. Because these tools are designed for processing mammalian genomes, the defaults frequently yield inaccurate gene quantification and expression measurements. To measure gene expression and quantification accurately in plant genomes, this work proposes a computational workflow that processes A. thaliana genome files using command-line bioinformatics tools with custom parameter settings. Using the proposed pipeline, we identified 690 genes by overlapping the differentially expressed genes obtained from the Cuffdiff, DESeq and edgeR methods. The expression dynamics of known floral regulators from Day-1 to Day-10 are consistent with published experimental results. Gene Ontology (GO) analysis shows a decrease in expression during the transition phase. Clustering identified the expression dynamics of potential floral regulators involved in flowering and the regulation of flower development. Our analysis shows that the proposed pipeline can process A. thaliana RNA-Seq datasets and provides consistent results, which could assist in identifying novel genes involved in gene regulation. Its use can be extended to other A. thaliana datasets and tissue samples.
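The overlap step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each method's differentially expressed genes have already been exported as a plain list of identifiers. The function name `overlap_de_genes` and the gene identifiers are hypothetical placeholders, not the authors' code or results.

```python
def overlap_de_genes(*gene_lists):
    """Return the sorted gene identifiers present in every input list."""
    if not gene_lists:
        return []
    common = set(gene_lists[0])
    for genes in gene_lists[1:]:
        # Keep only identifiers also reported by this method.
        common &= set(genes)
    return sorted(common)


# Placeholder identifiers for illustration only (not data from the study).
cuffdiff_genes = ["gene_a", "gene_b", "gene_c", "gene_d"]
deseq_genes = ["gene_b", "gene_c", "gene_d", "gene_e"]
edger_genes = ["gene_c", "gene_d", "gene_f"]

consensus = overlap_de_genes(cuffdiff_genes, deseq_genes, edger_genes)
print(consensus)
```

Taking the intersection keeps only genes that all three methods call differentially expressed, which trades sensitivity for agreement across methods.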