Implementing a Platform to Run Clustering Algorithms Using Distributed Computing

Ioan-Daniel Borlea, Daniel Iercan, Radu-Emil Precup, Florin Dragan, Alexandra-Bianca Borlea

2019

Most clustering algorithms are designed as sequential algorithms that require all of the data to be present, which restricts practical implementations to a single machine and rules out horizontal scaling. This is problematic today, when data volumes grow daily and fast processing is essential. Hence, in this paper we propose a platform that allows clustering algorithms to run in a distributed manner. This is achieved by splitting the data into smaller, equally sized partitions and by redesigning the original clustering algorithms so that each instance works on a subset of the input data without interacting with the processing of the rest of the input data. Finally, a reduce phase aggregates the partial results obtained from processing each partition and produces the global result.
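
The partition, process, and reduce flow described above can be illustrated with a minimal Python sketch. It is not the platform's implementation: the abstract does not name a specific algorithm, so k-means is assumed here, and the function names (split_equal, map_partition, reduce_partials, distributed_kmeans) are hypothetical. Each partition is clustered on its own, and the reduce step merges the local centroids by running a weighted k-means over them, which is one possible aggregation strategy among several.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def split_equal(data: Sequence[Point], n_parts: int) -> List[List[Point]]:
    """Split data into n_parts contiguous partitions whose sizes differ by at most one."""
    size, extra = divmod(len(data), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(list(data[start:end]))
        start = end
    return parts


def sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_farthest(points: Sequence[Point], k: int) -> List[Point]:
    """Deterministic seeding: start at the first point, then repeatedly add the farthest point."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids


def weighted_kmeans(points, weights, k, iters=20):
    """Lloyd's algorithm with per-point weights; returns (centroids, total weight per cluster)."""
    centroids = init_farthest(points, k)
    totals = [0.0] * k
    for _ in range(iters):
        sums = [[0.0] * len(points[0]) for _ in range(k)]
        totals = [0.0] * k
        for p, w in zip(points, weights):
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            totals[j] += w
            for d, x in enumerate(p):
                sums[j][d] += w * x
        # Keep the previous centroid if a cluster received no points.
        centroids = [
            tuple(s / totals[j] for s in sums[j]) if totals[j] > 0 else centroids[j]
            for j in range(k)
        ]
    return centroids, totals


def map_partition(partition: Sequence[Point], k: int):
    """Map step: cluster one partition using only its own data (assumes len(partition) >= k)."""
    centroids, counts = weighted_kmeans(partition, [1.0] * len(partition), k)
    return [(c, n) for c, n in zip(centroids, counts) if n > 0]


def reduce_partials(partials, k: int) -> List[Point]:
    """Reduce step: merge local centroids, weighting each by the number of points it represents."""
    points = [c for part in partials for c, _ in part]
    weights = [n for part in partials for _, n in part]
    centroids, _ = weighted_kmeans(points, weights, k)
    return centroids


def distributed_kmeans(data: Sequence[Point], k: int, n_parts: int) -> List[Point]:
    # The map calls share no state, so each could be sent to a separate worker.
    partials = [map_partition(part, k) for part in split_equal(data, n_parts)]
    return reduce_partials(partials, k)
```

Because map_partition reads nothing outside its own partition, the list comprehension in distributed_kmeans could be replaced by a process pool or a cluster scheduler without changing the reduce step. The merged result is in general an approximation of what a single-machine run over all of the data would return.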
