Genesis and Gappa: Library and Toolkit for Working with Phylogenetic (Placement) Data

2019 
The ever increasing amount of genomic and meta-genomic sequence data has transformed biology into a data-driven and compute-intensive discipline. Hence, there is a need for efficient algorithms and scalable implementations thereof for analysing such data.

We present Genesis, a library for working with phylogenetic data, and Gappa, an accompanying command line tool for conducting typical analyses on such data. While our tools primarily target phylogenetic trees and phylogenetic placements, they also offer a plethora of functions for handling genetic sequences, taxonomies, and other relevant data types.

The tools aim at improved usability at the production stage (conducting data analyses) as well as the development stage (rapid prototyping): The modular interface of Genesis simplifies numerous standard high-level tasks and analyses, while allowing for low-level customization at the same time. Our implementation relies on modern, multi-threaded C++11, and is substantially more computationally efficient than analogous tools. We already employed the core Genesis library in several of our tools and publications, thereby proving its flexibility and utility. Both Genesis and Gappa are available under GPLv3 at http://github.com/lczech/genesis and http://github.com/lczech/gappa.