ComBI: Compressed Binary Search Tree for Approximate k-NN Searches in Hamming Space

2021 
Abstract The space-partitioning based hashing techniques are widely used to represent high-dimensional data points as bit-codes. Although Binary Search Trees (BSTs) can be used for storing bit-codes, their size grows exponentially with code length. In practice, such a tree turns out to be highly sparse, increasing the miss-rate of nearest neighbor searches. We present Compressed BST of Inverted hash tables (ComBI), a geometrically motivated compression technique for BSTs. ComBI enables fast and approximate nearest neighbor searches without a significant memory footprint over BSTs. We show that approximate search in ComBI can perform competitively to an exact search algorithm in retrieving the nearest neighbors in a hamming space. On a database containing ∼80M samples, ComBI yields an average precision of 0.90, at ∼4X - ∼296X improvements in run-time across different code lengths when compared to MIH, a widely used exact search method. On a database consisting of 1B samples, this value of precision (0.90) is reached at ∼4X - ∼19X improvements in run-time.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    39
    References
    0
    Citations
    NaN
    KQI
    []