A Method of Data Distribution for Distributed Cross Join

Ping Lu, Shengmei Luo, Zhiping Wang, Wenwu Qu

2013

One of the major challenges in big data processing is the efficiency of the cross join, which arises, for example, in similarity calculation for business intelligence. A cross join combines each row of the first table with each row of the second table. In this paper we introduce an optimal data distribution algorithm for distributed cross join that reduces network traffic and guarantees a balanced computational load across the distributed system.
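To make the two goals concrete (less data shipped, equal work per node), the sketch below uses a generic grid partitioning of the two tables. It is an illustration only and is not claimed to be the algorithm of this paper, whose details the abstract does not give. The function names `grid_cross_join_plan`, `run_plan`, and `rows_shipped`, as well as the round-robin chunking, are assumptions introduced for this example.

```python
from itertools import product


def grid_cross_join_plan(rows_a, rows_b, grid_rows, grid_cols):
    """Hypothetical helper: lay workers out as a grid_rows x grid_cols grid.

    Table A is split round-robin into grid_rows chunks and table B into
    grid_cols chunks. Worker (i, j) receives A-chunk i and B-chunk j.
    """
    a_chunks = [rows_a[i::grid_rows] for i in range(grid_rows)]
    b_chunks = [rows_b[j::grid_cols] for j in range(grid_cols)]
    return {
        (i, j): (a_chunks[i], b_chunks[j])
        for i in range(grid_rows)
        for j in range(grid_cols)
    }


def run_plan(plan):
    """Simulate every worker computing its local cross join."""
    pairs = []
    for a_chunk, b_chunk in plan.values():
        pairs.extend(product(a_chunk, b_chunk))
    return pairs


def rows_shipped(plan):
    """Total number of rows sent over the network to all workers."""
    return sum(len(a_chunk) + len(b_chunk) for a_chunk, b_chunk in plan.values())


if __name__ == "__main__":
    table_a = list(range(6))   # 6 rows
    table_b = list("abcd")     # 4 rows
    plan = grid_cross_join_plan(table_a, table_b, grid_rows=2, grid_cols=2)

    # Every (a, b) pair is produced exactly once across the workers.
    assert sorted(run_plan(plan)) == sorted(product(table_a, table_b))

    # Each worker produces the same number of pairs (balanced computation).
    print({worker: len(a) * len(b) for worker, (a, b) in plan.items()})

    # Rows shipped equals |A| * grid_cols + |B| * grid_rows in this layout.
    print(rows_shipped(plan))
```

In this layout each row of the first table is replicated once per grid column and each row of the second table once per grid row, so the grid shape trades network traffic against parallelism. Sending both full tables to all four workers would ship 40 rows in this example, compared with 20 for the 2 x 2 grid.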

Keywords: Big data; Computation; Cluster analysis; Algorithm design; Business intelligence; Distributed database; Distributed algorithm; Estimation of distribution algorithm; Theoretical computer science; Computer science; Big data processing; Data mining
