Easing the Pain of Next-Gen Sequencing Data Evaluation and Delivery

2014 
High-throughput genome biology has created the need for a robust and scalable software system able to handle massive amounts of sequencing data. At the Data Production Facility in the USC Epigenome Center, we have addressed this problem by developing an online sequence data access site called the Epigenome Center Data Portal (ECDP). This scalable portal allows researchers to explore and download their datasets securely. From the initial LIMS sample entry (currently using Genologics) through sequencing and downstream analysis on our supercomputing cluster, all characteristics of a sample are parsed and tracked, allowing these metrics to be presented in a single integrated interface. The QC metrics generated by the analyses can be visualized in a number of ways: they can be viewed for multiple samples side by side, plotted interactively, and exported in spreadsheet format. The most important summary data are presented first, with the option to drill down into further detail. This allows for rapid assessment of library quality before proceeding with in-depth analyses. ECDP also serves as the primary means of delivering sequence data to clients, who can securely download analyzed data such as FASTQ files, BAM files, and visualization tracks. Recently, we have been collecting information on client usage of ECDP in order to tailor the site more effectively to the needs of the user community. The initial results of this tracking will be presented.