Smart Cache: An Optimized MapReduce Implementation of Frequent Itemset Mining

2015 
Frequent Itemset Mining (FIM) is a classic data mining topic with many real-world applications, such as market basket analysis. Many algorithms, including Apriori, FP-Growth, and Eclat, have been proposed in the FIM field. As dataset sizes grow, researchers have proposed MapReduce versions of FIM algorithms to meet the big data challenge. This paper proposes new improvements to the MapReduce implementation of FIM algorithms by introducing a cache layer and a selective online analyzer. We have evaluated the effectiveness and efficiency of Smart Cache via extensive experiments on four public datasets. Compared with the state-of-the-art solution, Smart Cache reduces total execution time by 45.4% on average and by up to 97.0%.