HP-Apriori: Horizontal parallel-Apriori algorithm for frequent itemset mining from big data

2017 
Because of the scale and complexity of big data, mining it on a single personal computer is difficult. As databases grow, parallel computing systems offer considerable advantages for data mining applications by running data mining algorithms in parallel. Parallelizing association rule mining algorithms is therefore an important task for mining frequent patterns from transaction databases. Existing parallel algorithms either distribute the database horizontally or increase the number of CPUs to reduce the execution time of frequent pattern mining. In this paper, a novel frequent itemset mining algorithm, Horizontal parallel-Apriori (HP-Apriori), is proposed. It partitions the database both horizontally and vertically and divides the mining process into four sub-processes that are performed in parallel. HP-Apriori also tries to speed up mining by using an index file generated in the first step of the algorithm. The proposed algorithm is compared with Count Distribution (CD) in terms of execution time and speedup on four real datasets. Experimental results show that HP-Apriori outperforms CD, achieving lower execution time and higher speedup with high scalability.
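For orientation, the following is a minimal sketch of the baseline idea behind Count Distribution: the database is split horizontally, each worker counts candidate itemsets in its own partition, and the local counts are summed before the support threshold is applied. It is not the HP-Apriori algorithm itself. The vertical split and the index file are not shown, because the abstract does not specify them. All function names, the round-robin split, and the default of four partitions are illustrative assumptions. Threads are used only to keep the example self-contained; a real deployment would likely use separate processes or machines.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def split_horizontally(transactions, n_parts):
    """Deal transactions round-robin into n_parts horizontal partitions (assumed scheme)."""
    return [transactions[i::n_parts] for i in range(n_parts)]


def local_counts(partition, candidates):
    """Count, within one partition, how many transactions contain each candidate itemset."""
    counts = Counter()
    for transaction in partition:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:
                counts[candidate] += 1
    return counts


def global_counts(partitions, candidates):
    """Count candidates in every partition concurrently, then sum the local counts."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        for partial in pool.map(lambda p: local_counts(p, candidates), partitions):
            total.update(partial)
    return total


def frequent_itemsets(transactions, min_support, n_parts=4):
    """Level-wise Apriori with Count Distribution style counting.

    min_support is an absolute transaction count. Returns {frozenset: support}.
    """
    partitions = split_horizontally(transactions, n_parts)
    all_items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in all_items]
    result = {}
    k = 1
    while candidates:
        counts = global_counts(partitions, candidates)
        level = {c: n for c, n in counts.items() if n >= min_support}
        result.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets that differ in one item.
        joined = {a | b for a, b in combinations(list(level), 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return result


if __name__ == "__main__":
    demo = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, support in sorted(
        frequent_itemsets(demo, min_support=3).items(),
        key=lambda kv: (len(kv[0]), sorted(kv[0])),
    ):
        print(sorted(itemset), support)
```

In this scheme every worker must scan its partition once per level, and the counts have to be combined after each level. That per-level synchronization is the kind of cost a parallel Apriori variant has to keep low.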