BinDex: A Two-Layered Index for Fast and Robust Scans

2020 
In modern analytical database systems, the performance of the data scan operation is of key importance to the performance of query execution. Existing approaches may be categorized into index scan and sequential scan. However, both approaches have inherent inefficiencies. Indeed, sequential scan may need to access a large amount of unneeded data, especially for queries with low selectivity. Instead, index scan may involve a large number of expensive random memory accesses when the query selectivity is high. Moreover, with the growing complexities in database query workloads, it has become hard to predict which approach is better for a particular query. In order to obtain fast and robust scans under all selectivities, this paper proposes BinDex, a two-layered index structure based on binned bitmaps that can be used to significantly accelerate the scan operations for in-memory column stores. The first layer of BinDex consists of a set of binned bitmaps which filter out most unneeded values in a column. The second layer provides some auxiliary information to correct the bits that have incorrect values. By varying the number of bit vectors in the first layer, BinDex can make a tradeoff between memory space and performance. Experimental results show that BinDex outperforms the state-of-the-art approaches with less memory than a B+-tree would use. And by enlarging the memory space, BinDex can achieve up to 2.9 times higher performance, eliminating the need for making a choice between sequential or index scans.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    37
    References
    2
    Citations
    NaN
    KQI
    []