Supporting set-valued joins in NoSQL using MapReduce

2015 
NoSQL systems are increasingly adopted for Web applications requiring scalability that relational database systems cannot meet. Although NoSQL systems were not designed to support joins, the need to support them has emerged as these systems are applied to a wide variety of applications. Furthermore, joins performed in NoSQL systems are generally similarity joins, which find similar pairs of records, rather than exact-match joins. Since Web applications often use the MapReduce framework, we develop a solution to perform similarity joins in NoSQL systems using the MapReduce framework.

Author Highlights
    • We developed a set-similarity join solution in NoSQL using MapReduce.
    • Our set-similarity join algorithm can avoid redundant comparisons between join attribute values in the MapReduce framework.
    • We substantially decreased the amount of network traffic in the MapReduce framework.
    • We reduced the number of comparisons needed to find all similar pairs by extending the prefix filtering technique for the MapReduce framework.
    • Our solution resulted in up to an order of magnitude improvement in performance over the most efficient existing solution.
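For readers unfamiliar with prefix filtering, the following is a minimal single-machine sketch of the general technique, not the MapReduce extension described in this paper. It assumes Jaccard similarity as the set-similarity measure and orders tokens by ascending global frequency; the function names `jaccard` and `prefix_filter_join` and the input layout (a dict from record id to token set) are hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def jaccard(a, b):
    """Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)


def prefix_filter_join(records, threshold):
    """Self-join: return (id1, id2, similarity) for pairs with Jaccard >= threshold.

    records: dict mapping a record id to a non-empty set of tokens (assumed layout).
    """
    # Global token order: rare tokens first, so prefixes hold selective tokens.
    freq = Counter(tok for toks in records.values() for tok in toks)

    def order(tok):
        return (freq[tok], tok)

    index = defaultdict(list)  # prefix token -> ids of records already indexed
    results = []
    for rid in sorted(records):
        tokens = sorted(records[rid], key=order)
        n = len(tokens)
        # Prefix length for Jaccard: two records that meet the threshold
        # must share at least one token within their prefixes.
        # The small epsilon guards against floating-point round-up.
        prefix_len = n - math.ceil(threshold * n - 1e-9) + 1
        candidates = set()
        for tok in tokens[:prefix_len]:
            candidates.update(index[tok])
            index[tok].append(rid)
        # Verify only the candidates instead of comparing against every record.
        for other in candidates:
            sim = jaccard(records[rid], records[other])
            if sim >= threshold:
                results.append((other, rid, sim))
    return sorted(results)


if __name__ == "__main__":
    sample = {
        1: {"a", "b", "c", "d"},
        2: {"a", "b", "c", "e"},
        3: {"x", "y", "z"},
    }
    for id1, id2, sim in prefix_filter_join(sample, 0.6):
        print(id1, id2, round(sim, 2))
```

In this sketch, record 3 shares no prefix token with the other records, so it is never compared against them. Reducing such unnecessary comparisons is the general purpose of prefix filtering; how the paper distributes this work across map and reduce tasks is not shown here.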