Scalable Infrastructures for Data in Motion

2014 
Analytics applications for reporting and human interaction with big data rely upon scalable frameworks for data ingest, storage, and computation. Batch processing of analytic workloads increases latency of results and can perform redundant computation. In real-world applications, new data points are continuously arriving and a suite of algorithms must be updated to reflect the changes. Reducing the latency of re-computation by keeping algorithms online and up-to-date enables fast query, experimentation, and drill-down. In this paper, we share our experiences designing and implementing scalable infrastructure around No SQL databases for social media analytics applications. We propose a new heterogeneous architecture and execution model for streaming data applications that focuses on throughput and modularity.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    11
    References
    3
    Citations
    NaN
    KQI
    []