Congressional samples for approximate answering of group-by queries

Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala

2000

In large data warehousing environments, it is often advantageous to provide fast, approximate answers to complex decision support queries using precomputed summary statistics, such as samples. Decision support queries routinely segment the data into groups and then aggregate the information in each group (group-by queries). Depending on the data, there can be a wide disparity between the number of data items in each group. As a result, approximate answers based on uniform random samples of the data can result in poor accuracy for groups with very few data items, since such groups will be represented in the sample by very few (often zero) tuples.
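
The effect of group-size disparity on a uniform sample can be made concrete with a small calculation. The sketch below is illustrative only and is not taken from the paper: the function name `uniform_sample_coverage`, the group sizes, and the sample size are all hypothetical. It assumes a uniform random sample drawn without replacement and computes, for each group, the expected number of sampled tuples and the probability that the group contributes no tuples at all.

```python
from math import comb


def uniform_sample_coverage(group_sizes, sample_size):
    """Per-group coverage of a uniform random sample drawn without replacement.

    Returns {group: (expected_tuples_in_sample, probability_of_zero_tuples)}.
    Illustrative helper; not part of the paper.
    """
    total = sum(group_sizes.values())
    stats = {}
    for group, size in group_sizes.items():
        # Each tuple is equally likely to be sampled, so a group's expected
        # share of the sample is proportional to its size.
        expected = sample_size * size / total
        # Hypergeometric probability that every sampled tuple comes from
        # outside this group.
        p_missing = comb(total - size, sample_size) / comb(total, sample_size)
        stats[group] = (expected, p_missing)
    return stats


# Hypothetical skewed table: 10,000 tuples split across three groups.
group_sizes = {"large": 9_900, "medium": 95, "small": 5}
stats = uniform_sample_coverage(group_sizes, sample_size=100)

for group, (expected, p_missing) in stats.items():
    print(f"{group}: expected tuples = {expected:.2f}, "
          f"P(no tuples) = {p_missing:.3f}")
```

Under these assumed numbers, the smallest group is expected to contribute far less than one tuple to the sample and is very likely to be absent from it, so any aggregate estimated for that group from the sample alone would be unreliable or unavailable.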
