Provenance traces from Chiron parallel workflow engine

2013 
Scientific workflows are commonly used to model and execute large-scale scientific experiments. They represent key resources for scientists and are managed by Scientific Workflow Management Systems (SWfMS). The different languages used by SWfMS may affect the way the workflow engine executes the workflow, sometimes limiting optimization opportunities. To tackle this issue, we recently proposed a scientific workflow algebra [1]. This algebra is inspired by the relational algebra of databases and enables automatic optimization of scientific workflows to be executed in parallel in high performance computing (HPC) environments. Accordingly, the experiments presented in this paper were executed in Chiron, a parallel scientific workflow engine implemented to support the scientific workflow algebra. Before executing the workflow, Chiron stores the prospective provenance [2] of the workflow in its provenance database. Each workflow is composed of several activities, and each activity consumes relations. As in relational databases, a relation has a set of attributes and is composed of a set of tuples. Each tuple in a relation contains a series of values, each one associated with a specific attribute. The tuples of a relation are distributed across the computing resources, to be consumed in parallel according to the workflow activity. During and after the execution, the retrospective provenance [2] is also stored.
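The data model described above (activities consuming relations whose tuples are processed in parallel) can be illustrated with a minimal Python sketch. This is not Chiron's API: the names `Relation` and `run_activity` are hypothetical, a local thread pool stands in for the distribution of tuples over HPC resources, and a one-to-one, map-style activity is assumed for simplicity.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Relation:
    """A relation: a list of attribute names plus a list of tuples."""
    attributes: List[str]
    tuples: List[tuple] = field(default_factory=list)

    def add(self, *values) -> None:
        # Each tuple must supply exactly one value per attribute.
        if len(values) != len(self.attributes):
            raise ValueError("tuple arity does not match the attributes")
        self.tuples.append(tuple(values))

    def as_dicts(self) -> List[Dict[str, object]]:
        # Pair every value with the attribute it belongs to.
        return [dict(zip(self.attributes, t)) for t in self.tuples]


def run_activity(
    relation: Relation,
    output_attributes: List[str],
    program: Callable[[Dict[str, object]], tuple],
    workers: int = 4,
) -> Relation:
    """Hypothetical map-style activity: one output tuple per input tuple.

    Tuples are handed to a thread pool, which here only stands in for
    distributing them over parallel computing resources.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the order of the input tuples.
        results = list(pool.map(program, relation.as_dicts()))
    output = Relation(output_attributes)
    for values in results:
        output.add(*values)
    return output


# Example: an activity that squares the attribute "x" of each tuple.
inputs = Relation(["id", "x"])
inputs.add(1, 2.0)
inputs.add(2, 3.0)
outputs = run_activity(
    inputs, ["id", "x_squared"], lambda row: (row["id"], row["x"] ** 2)
)
```

In this sketch the output relation of one activity could be passed as the input relation of the next, which is how a workflow made of several activities would be chained.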