SF-sketch: slim-fat-sketch with GPU assistance.

2017 
A sketch is a probabilistic data structure that records the frequencies of items in a multi-set. Various types of sketches have been proposed in the literature and applied in a variety of fields, such as data stream processing, natural language processing, and distributed data sets. Although several sketch variants have been proposed, existing sketches still leave significant room for improvement in accuracy. In this paper, we propose a new sketch, called the Slim-Fat (SF) sketch, which achieves significantly higher accuracy than prior art and a much smaller memory footprint, while matching the speed of the best prior sketch. The key idea behind SF-sketch is to maintain two separate sketches: a small sketch called the Slim-subsketch and a large sketch called the Fat-subsketch. The Slim-subsketch, stored in fast memory (SRAM), enables fast and accurate querying. The Fat-subsketch, stored in relatively slow memory (DRAM), assists insertions into and deletions from the Slim-subsketch. We implemented and extensively evaluated SF-sketch alongside several prior sketches and compared them side by side. Our experimental results show that SF-sketch outperforms the most commonly used sketch, the CM-sketch, by up to 33.1 times in terms of accuracy. The concise version of our paper will appear in IKDE 2017.
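For readers unfamiliar with the baseline, the following is a minimal sketch of a standard Count-Min (CM) sketch, the structure the abstract compares against. It is not the SF-sketch algorithm, whose update rules are not described here. The class name `CountMinSketch`, the parameters `depth` and `width`, and the choice of salted BLAKE2b as the hash family are assumptions made for illustration only.

```python
import hashlib


class CountMinSketch:
    """Illustrative Count-Min sketch: `depth` rows of `width` counters."""

    def __init__(self, depth=4, width=1024):
        self.depth = depth
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One hash function per row, derived by salting BLAKE2b with the row id
        # (an assumed hash family; any set of independent hashes would do).
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def insert(self, item, count=1):
        # Increment the mapped counter in every row.
        for r in range(self.depth):
            self.rows[r][self._index(r, item)] += count

    def delete(self, item, count=1):
        # Deletion decrements the same counters; valid only for items
        # that were previously inserted at least `count` times.
        for r in range(self.depth):
            self.rows[r][self._index(r, item)] -= count

    def query(self, item):
        # The minimum over the mapped counters: it can overestimate because
        # of hash collisions, but it never underestimates the true frequency.
        return min(self.rows[r][self._index(r, item)] for r in range(self.depth))


if __name__ == "__main__":
    sketch = CountMinSketch()
    for word in ["a", "b", "a", "c", "a"]:
        sketch.insert(word)
    print(sketch.query("a"))  # an estimate that is at least the true count of 3
```

The overestimation caused by collisions is the accuracy gap that the abstract refers to; the two-subsketch design described above is the paper's approach to narrowing it.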