Scaling SPADE to Big Provenance

Ashish Gehani, Hasanat Kazmi, Hassaan Irshad

2016

Provenance middleware (such as SPADE) lets individuals and applications use a common framework for reporting, storing, and querying records that characterize the history of computational processes and resulting data artifacts. Previous efforts have addressed a range of issues, from instrumentation techniques to applications in the domains of scientific reproducibility and data security. Here we report on our experience adapting SPADE to handle large provenance data sets. In particular, we describe two motivating case studies, several challenges that arose from managing provenance at scale, and our approach to address each concern.

Keywords:

Database
Data security
Scaling
Middleware
Provenance
Data set
Data mining
Computer science
Data science
common framework
