Data analytics algorithm benchmark on distributed systems

2018 
Big Data Analytics has become more important in Industrial Revolution 4.0 (IR4.0). Data Analytics is a superset of Data Mining, which consists of several popular methods. Rough Set or Rough Classification Modeling (RCM), statistical analysis and neural networks are among the prevalent algorithms in Data Analytics. The Satisfiable Integer Programming (SIP) algorithm in RCM takes a long time to execute, especially in a single-node environment. The strength of SIP is that it gives better results in terms of reduct calculation accuracy on huge datasets. Distributed Inter Process Communication (DIPC) is an open source distributed operating system whose services include shared memory and semaphores. A combination of the SIP algorithm and DIPC is proposed in order to reduce computational time and increase processing speed. Standardized i386 machines were used to build clusters running the distributed operating system in this experiment. The differences in computational time among the different algorithms were recorded and compared.
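
To make the notion of a reduct concrete, the following is a minimal sketch of an exhaustive reduct search on a small decision table. It is not the SIP algorithm and does not use DIPC; the function names `is_consistent` and `find_reducts` and the toy table layout are assumptions introduced only for illustration. The sketch assumes the full table is consistent, and it shows why the search is costly on one node: the number of candidate attribute subsets grows exponentially with the number of attributes.

```python
from itertools import combinations

def is_consistent(rows, decisions, attrs):
    """Return True if objects that agree on `attrs` always share a decision."""
    seen = {}
    for row, decision in zip(rows, decisions):
        key = tuple(row[a] for a in attrs)
        # setdefault stores the first decision seen for this key
        if seen.setdefault(key, decision) != decision:
            return False
    return True

def find_reducts(rows, decisions):
    """Exhaustively list minimal attribute subsets that keep the table consistent.

    Illustrative brute force only (hypothetical helper, not the paper's method).
    """
    n_attrs = len(rows[0])
    reducts = []
    for size in range(1, n_attrs + 1):
        for attrs in combinations(range(n_attrs), size):
            # A superset of a known reduct cannot be minimal, so skip it.
            if any(set(r) <= set(attrs) for r in reducts):
                continue
            if is_consistent(rows, decisions, attrs):
                reducts.append(attrs)
    return reducts

# Toy decision table: three condition attributes per object, one decision each.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
decisions = [0, 0, 1, 1]
print(find_reducts(rows, decisions))
```

In this toy table, attribute 0 alone and attribute 2 alone each determine the decision, while attribute 1 does not, so the search reports two single-attribute reducts.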