Summarizing Provenance of Aggregate Query Results in Relational Databases

2021 
Data provenance is any information about the origin of a piece of data and the process that led to its creation. Most work on database provenance has focused on models and semantics for generating and querying this information. Although comprehensive, provenance information is often large and overwhelming, which makes it hard for provenance systems to support data exploration. We present a new approach to provenance exploration that builds on data summarization techniques. We contribute two novel summarization schemes for the provenance of aggregation queries: impact summaries and comparative summaries. Our experiments show that these techniques incur little overhead compared to basic summaries, and a user survey we conducted shows that the approaches are useful to users.