Analysis of parallel computational models for clustering

Małgorzata Płaza, Stanislaw Deniziak, Mirosław Płaza, Radosław Belka, Paweł Pięta

2018

Clustering is one of the main tasks of data mining: it discovers groups of similar objects and supports the detection of outliers. Processing huge datasets requires scalable models of computation and distributed computing environments, so efficient parallel clustering methods are needed. The MapReduce processing model is usually used for parallel data analytics. However, the growing computing power of heterogeneous platforms based on graphics processors and FPGA accelerators makes the CUDA and OpenCL models an interesting alternative to MapReduce. This paper presents a comparative analysis of the effectiveness of the MapReduce and CUDA/OpenCL processing models for clustering. We compare different clustering methods in terms of how well they can be parallelized under both models of computation. The conclusions indicate directions for further work in this area.
