Improving Parallel Data Mining for Different Data Distributions in IoT Systems

2020 
We aim to improve the distributed implementation of data mining algorithms in modern Internet of Things (IoT) systems. The idea of our approach is to perform as many computations as possible at the local IoT nodes, rather than transferring the data to a central compute cluster for processing, as current MapReduce-based solutions do. We study different kinds of data distribution across the nodes of an IoT system and adapt the structure of the implementation accordingly. Our formally based approach ensures the correctness of the resulting parallel implementation. We implement our approach in the Java-based data mining library DXelopes and illustrate it with the popular Naive Bayes algorithm. Experiments confirm that our approach significantly reduces application run time.
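To make the idea of node-local computation concrete, the following is a minimal sketch for Naive Bayes, not the paper's implementation and not the DXelopes API. It assumes one particular data distribution, in which each node holds complete rows (a horizontal split). Each node folds its own rows into count tables, and only those tables, rather than the raw data, are sent on and merged. The class `NaiveBayesCounts` and all of its members are hypothetical names introduced for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of node-local Naive Bayes counting.
// Assumption: data is split horizontally, so every node sees whole rows.
final class NaiveBayesCounts {
    // Number of training rows seen per class label.
    final Map<String, Integer> classCounts = new HashMap<>();
    // Number of rows per (label, attribute index, attribute value) triple.
    final Map<String, Integer> valueCounts = new HashMap<>();

    // Runs on an IoT node: fold one local row into the node's counts.
    void add(String[] attributes, String label) {
        classCounts.merge(label, 1, Integer::sum);
        for (int i = 0; i < attributes.length; i++) {
            valueCounts.merge(key(label, i, attributes[i]), 1, Integer::sum);
        }
    }

    // Runs wherever partial results meet: sums two count tables.
    // Addition is associative and commutative, so merge order is irrelevant.
    NaiveBayesCounts mergeWith(NaiveBayesCounts other) {
        NaiveBayesCounts result = new NaiveBayesCounts();
        for (NaiveBayesCounts part : new NaiveBayesCounts[] {this, other}) {
            part.classCounts.forEach((k, v) -> result.classCounts.merge(k, v, Integer::sum));
            part.valueCounts.forEach((k, v) -> result.valueCounts.merge(k, v, Integer::sum));
        }
        return result;
    }

    // Unsmoothed estimate of P(attribute[index] = value | label).
    double conditional(String label, int index, String value) {
        int rowsWithLabel = classCounts.getOrDefault(label, 0);
        if (rowsWithLabel == 0) {
            return 0.0; // label never observed
        }
        return valueCounts.getOrDefault(key(label, index, value), 0) / (double) rowsWithLabel;
    }

    // Composite map key; the NUL separator avoids collisions with ordinary text.
    static String key(String label, int index, String value) {
        return label + "\u0000" + index + "\u0000" + value;
    }
}
```

Because the merged tables equal the tables that would be obtained by counting over all rows in one place, the model built from them is the same, while the amount of data transferred depends on the number of distinct labels and attribute values rather than on the number of rows. Other distributions, such as nodes holding different attributes of the same rows, would need a different merge step.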