Accelerated Manhattan hashing via bit-remapping with location information

2017 
Hashing is a binary-code encoding method which tries to preserve the neighborhood structures in the original feature space, in order to realize efficient approximate nearest neighbor search in large-scale databases. Existing hashing methods usually adopt a two-stage strategy (projection stage and quantization stage) to encode data points, and threshold-based single-bit quantization (SBQ) is used to binarize each projected dimension into 0 or 1. Data similarity between hash codes is measured by their Hamming distance. However, SBQ may destroy the original neighborhood structures by quantizing neighboring points near threshold into different binary values. Double-bit quantization (DBQ) and its derivative, Manhattan hashing, have been proposed to fix this problem. Experimental results showed that Manhattan hashing outperformed state-of-the-art methods in terms of effectiveness, but lost the advantage of efficiency because it used decimal arithmetic instead of fast bitwise operations for similarity measurement between hash codes. In this paper, we propose an accelerated strategy of Manhattan hashing by making full use of bitwise operations. Our main contributions are: 1) a new encoding method which assigns location information to each binary digit is proposed to avoid the time-consuming decimal arithmetic; 2) a novel hash code distance measurement that accelerates the calculation of Manhattan distance is proposed to improve query efficiency. Extensive experiments on three benchmark datasets show that our approach improves the speed of data querying on 2-bit, 3-bit and 4-bit quantized hash codes by at least one order of magnitude on average, without any precision loss.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    36
    References
    3
    Citations
    NaN
    KQI
    []