Streaming Machine Learning Algorithms with Big Data Systems

2019 
Designing low-latency applications that can process large volumes of data efficiently is a challenging problem. Because the time available to process data is limited, online algorithms are becoming increasingly important in big-data applications. Stream processing is a well-known area that has been studied for a long time. In this research, our objective is to use state-of-the-art big-data analytics engines to implement online algorithms and compare the strengths and weaknesses of each system. We use streaming versions of Support Vector Machines (SVM) and KMeans for the analysis. The algorithms are implemented on the Apache Flink, Apache Storm, and Twister2 streaming frameworks. Our study focuses on the efficiency of online training of these algorithms, and the results show higher performance in the Twister2 framework.
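To make the notion of an online algorithm concrete, the following is a minimal sketch of a sequential KMeans update, in which each arriving point moves its nearest centroid by a step of 1/n, where n is the number of points assigned to that centroid so far. This is a generic textbook formulation and is not taken from the paper; the class name OnlineKMeans, the method name update, and the choice of plain Python rather than Flink, Storm, or Twister2 operators are all assumptions made for illustration.

```python
from typing import List, Sequence


class OnlineKMeans:
    """Hypothetical sequential KMeans: one centroid update per arriving point."""

    def __init__(self, initial_centroids: Sequence[Sequence[float]]) -> None:
        # Copy the starting centroids so the caller's data is not mutated.
        self.centroids: List[List[float]] = [list(c) for c in initial_centroids]
        # Number of points assigned to each centroid so far.
        self.counts: List[int] = [0] * len(self.centroids)

    def _nearest(self, point: Sequence[float]) -> int:
        # Index of the centroid with the smallest squared Euclidean distance.
        def sq_dist(c: Sequence[float]) -> float:
            return sum((ci - pi) ** 2 for ci, pi in zip(c, point))

        return min(range(len(self.centroids)), key=lambda j: sq_dist(self.centroids[j]))

    def update(self, point: Sequence[float]) -> int:
        """Consume one point from the stream and return its cluster index."""
        j = self._nearest(point)
        self.counts[j] += 1
        step = 1.0 / self.counts[j]  # running-mean step size
        centroid = self.centroids[j]
        for d, value in enumerate(point):
            centroid[d] += step * (value - centroid[d])
        return j


# Usage: feed points one at a time, as a stream operator would.
model = OnlineKMeans([[0.0, 0.0], [10.0, 10.0]])
assignments = [model.update(p) for p in ([1.0, 1.0], [3.0, 3.0], [9.0, 9.0])]
```

Because each update touches only one centroid and needs no access to past data, this kind of per-record state update is the sort of logic that could be placed inside a streaming operator in any of the three frameworks.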