Examining Statistics of Workflow Evolution Provenance: A First Study

2008 
Provenance (also referred to as audit trail, lineage, and pedigree) captures information about the steps used to generate a given data product. Such information provides documentation that is key to determining data quality and authorship, and is necessary for preserving, reproducing, sharing, and publishing the data. Workflow design, in particular for exploratory tasks (e.g., creating a visualization, mining a data set), requires an involved, trial-and-error process. To solve a problem, a user has to iteratively refine a workflow to experiment with different techniques and try different parameter values as she formulates and tests hypotheses. The maintenance of detailed provenance (or history) of this process has many benefits that go beyond documentation and result reproducibility. Notably, it supports several operations that facilitate exploration, including the ability to return to a previous workflow version in an intuitive way, to undo bad changes, to compare different workflows, and to be reminded of the actions that led to a particular result [2].