Gorder: an efficient method for KNN join processing

Chenyi Xia, Hongjun Lu, Beng Chin Ooi, Jing Hu

2004

The K-Nearest Neighbor (KNN) similarity join is an important but very expensive primitive operation in high-dimensional databases. The operation combines each point of one dataset with its KNNs in the other dataset, and it provides more meaningful query results than the range similarity join. Such an operation is useful for data mining and similarity search. In this paper, we propose a novel KNN-join algorithm called the Gorder (or G-ordering KNN) join method. Gorder is a block nested loop join method that exploits sorting, join scheduling, and distance computation filtering and reduction to reduce both I/O and CPU costs. It sorts the input datasets into the G-order and applies the scheduled block nested loop join to the G-ordered data. Distance computation reduction is employed to further reduce CPU cost. The method is simple yet efficient, and it handles high-dimensional data well. Extensive experiments on both synthetic cluster and real-life datasets were conducted, and the results show that Gorder is an efficient KNN-join method that outperforms existing methods by a wide margin.
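To make the idea of a scheduled block nested loop KNN join concrete, here is a minimal Python sketch. It is not the authors' implementation. The abstract does not define the G-order, so the sketch substitutes a plain sort on grid-cell coordinates as a stand-in ordering, and it assumes points are tuples with coordinates roughly in the unit cube and that k >= 1. The names `make_blocks`, `min_dist`, and `knn_join` are hypothetical. Block-level pruning uses bounding-box distances; the paper's distance computation reduction is not modelled.

```python
import heapq
import math


def make_blocks(points, block_size, cells=4):
    """Sort points by grid cell (a stand-in for the G-order) and cut into blocks.

    Each block is (points, lower_corner, upper_corner) of its bounding box.
    """
    def cell_key(p):
        return tuple(min(int(x * cells), cells - 1) for x in p)

    ordered = sorted(points, key=cell_key)
    blocks = []
    for i in range(0, len(ordered), block_size):
        pts = ordered[i:i + block_size]
        lo = tuple(min(c) for c in zip(*pts))
        hi = tuple(max(c) for c in zip(*pts))
        blocks.append((pts, lo, hi))
    return blocks


def min_dist(lo_a, hi_a, lo_b, hi_b):
    """Lower bound on the Euclidean distance between points of two boxes."""
    total = 0.0
    for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b):
        gap = max(lb - ha, la - hb, 0.0)
        total += gap * gap
    return math.sqrt(total)


def knn_join(r_points, s_points, k, block_size=4):
    """Return a list of (r, [k nearest points of S, closest first])."""
    r_blocks = make_blocks(r_points, block_size)
    s_blocks = make_blocks(s_points, block_size)
    result = []
    for r_pts, r_lo, r_hi in r_blocks:
        # One max-heap per R point, storing (-distance, s_point).
        heaps = [[] for _ in r_pts]
        # Join scheduling: visit S blocks nearest-first by box distance.
        schedule = sorted(
            s_blocks, key=lambda b: min_dist(r_lo, r_hi, b[1], b[2])
        )
        for s_pts, s_lo, s_hi in schedule:
            if all(len(h) == k for h in heaps):
                worst = max(-h[0][0] for h in heaps)
                if min_dist(r_lo, r_hi, s_lo, s_hi) > worst:
                    break  # every remaining S block is at least this far
            for h, r in zip(heaps, r_pts):
                for s in s_pts:
                    d = math.dist(r, s)
                    if len(h) < k:
                        heapq.heappush(h, (-d, s))
                    elif d < -h[0][0]:
                        heapq.heapreplace(h, (-d, s))
        for h, r in zip(heaps, r_pts):
            result.append((r, [s for _, s in sorted(h, reverse=True)]))
    return result
```

Because S blocks are visited in ascending order of bounding-box distance, the loop can stop as soon as the next block's lower bound exceeds the largest current k-th neighbor distance in the R block. This illustrates, in a simplified form, how ordering and scheduling can cut the number of block pairs that need to be compared.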
