Evaluating the Scaling of Graph-Algorithms for Big Data Using GraphX

2016 
Graph processing has attracted a lot of attention in different big data scenarios. In this paper, we present the design, implementation, and experimental evaluation of graph processing algorithms in two different application areas. First, we use semi-clustering as an example of an algorithm typically used in social network analysis. Then, we examine an algorithm for collaborative filtering as typically used in e-commerce scenarios. For both algorithms, we make use of Apache GraphX, an existing distributed graph processing framework based on Apache Spark. As GraphX does not include these two algorithms, we describe how to implement them using a combination of GraphX and the underlying Spark Core. Based on our implementation, we perform experiments to test the scalability of both the algorithms and the GraphX processing framework. The experiments show that different kinds of graph algorithms can be supported within the Spark framework. Furthermore, we show that for our test data the algorithms scale almost linearly when properly designed.