Streaming Machine Learning Algorithms with Big Data Systems

Vibhatha Abeykoon, Gregor von Laszewski, Supun Kamburugamuve, Kannan Govindrarajan, Pulasthi Wickramasinghe, Chathura Widanage, Niranda Perera, Ahmet Uyar, Gurhan Gunduz, Selahattin Akkaş

2019

Designing low-latency applications that process large volumes of data efficiently is a challenging problem. Because the time available to process data is limited, online algorithms are becoming increasingly important in big-data applications. Stream processing is a well-established area that has been studied for a long time. In this research, our objective is to implement online algorithms on state-of-the-art big-data analytics engines and to compare the strengths and weaknesses of each system. We use streaming versions of Support Vector Machines (SVM) and KMeans for the analysis, implemented on the Apache Flink, Apache Storm, and Twister2 streaming frameworks. Our study focuses on the efficiency of online training of these algorithms, and the results show higher performance for the Twister2 framework.
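To make "online training" concrete, the following is a minimal sketch of per-record update rules for the two algorithms named above. It is not the implementation from this study. It assumes a sequential (MacQueen-style) KMeans update and a plain stochastic gradient step on the L2-regularized hinge loss for a linear SVM. The function names and the parameters `lr` and `lam` are hypothetical, chosen only for illustration.

```python
import numpy as np

def kmeans_update(centroids, counts, x):
    """Assumed sequential KMeans step: assign x to its nearest centroid,
    then move that centroid toward x by a 1/count step. Updates in place."""
    j = int(np.argmin(np.linalg.norm(centroids - x, axis=1)))
    counts[j] += 1
    centroids[j] += (x - centroids[j]) / counts[j]
    return j

def svm_sgd_update(w, x, y, lr=0.01, lam=0.001):
    """Assumed linear SVM step: one SGD update on the L2-regularized
    hinge loss for a single record (x, y), with y in {-1, +1}."""
    grad = lam * w
    if y * np.dot(w, x) < 1:  # record violates the margin
        grad = grad - y * x
    return w - lr * grad
```

In a streaming framework, update functions of this kind would typically be applied to each arriving record (or small window of records) inside an operator that holds the model state. How that state is held and synchronized differs between frameworks and is not shown here.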

Keywords:

Machine learning
Algorithm
Online algorithm
Dataflow
k-means clustering
Computer science
Big data
Latency (engineering)
Support vector machine
Stream processing
Data mining
Artificial intelligence
Strengths and weaknesses
