Sprouter: Dynamic Graph Processing over Data Streams at Scale

2018 
Graph data is becoming dominant in many applications, such as social networks, targeted advertising, and web indexing. As a result, advances in machine learning and data mining techniques depend heavily on the ability to process this data structure efficiently and reliably. Despite the importance of processing dynamic graphs in real time, maintaining such graphs and processing them over data streams remains a challenge. We propose Sprouter, an end-to-end framework that enables storing enormous graph data, allows real-time updates, and supports efficient complex analytics in addition to OLTP queries. We demonstrate that our framework can ingest and process streaming data efficiently using a scalable multi-cluster distributed architecture, apply incremental graph updates, and store the dynamic graph for fast query performance. Experiments show that the system can update graphs with up to 100 million edges in under 50 s on a moderately sized cluster. Because we use only open-source tools, the framework can easily be extended in the future with other equivalent software.