Some Branches May Bear Rotten Fruits: Diversity Browsing VP-Trees.

2020 
Diversified similarity searching embeds result diversification straight into the query procedure, which boosts the computational performance by orders of magnitude. While metric indexes have a hidden potential for perfecting such procedures, the construction of a suitable, fast, and incremental solution for diversified similarity searching is still an open issue. This study presents a novel index-and-search algorithm, coined diversity browsing, that combines an optimized implementation of the vantage-point tree (VP-Tree) index with the distance browsing search strategy and coverage-based query criteria. Our proposal maps data elements into VP-Tree nodes, which are incrementally evaluated for solving diversified neighborhood searches. Such an evaluation is based not only on the distance between the query and candidate objects but also on distances from the candidate to data elements (called influencers) in the partial search result. Accordingly, we take advantage of those distance-based relationships for pruning VP-Tree branches that are themselves influenced by elements in the result set. As a result, diversity browsing benefits from data indexing for (i) eliminating nodes without valid candidate elements, and (ii) examining the minimum number of partitions regarding the query element. Experiments with real-world datasets show our approach outperformed competitors GMC and GNE by at least 4.91 orders of magnitude, as well as baseline algorithm BRID\(_k\) in at least \(87.51\%\) regarding elapsed query time.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    19
    References
    1
    Citations
    NaN
    KQI
    []