Selective Database Projections Based Approach for Mining High-Utility Itemsets

2018 
High-utility itemset mining (HUIM) is an emerging and widely used area of data mining. HUIM differs from frequent itemset mining (FIM): the latter considers only the frequency of itemsets, whereas the former is designed to account for both quantity and profit in order to reveal the most profitable products. Generating high-utility itemsets (HUIs) is challenging because of exponential complexity in both time and space. Moreover, the pruning techniques that reduce the search space in FIM rely on the monotonic and anti-monotonic properties of frequency and cannot be used in HUIM. In this paper, we propose a novel selective database projection-based HUI mining algorithm (SPHUI-Miner). We introduce an efficient data format, named HUI-RTPL, which is an optimum and compact representation of the data that requires little memory. We also propose two novel data structures, namely the selective database projection utility list and the Tail-Count list, to prune the search space for HUI mining. Selective projections of the database reduce database scanning time, making the proposed approach more efficient. The approach creates unique data instances and new projections for data with fewer dimensions, thereby resulting in faster HUI mining. We also prove upper bounds on the amount of memory consumed by these projections. Experimental comparisons on various benchmark data sets show that SPHUI-Miner outperforms state-of-the-art algorithms in terms of computation time, memory usage, scalability, and candidate generation.
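To make the contrast between frequency and utility concrete, the following minimal Python sketch computes both measures on a toy database. It uses the standard definition of itemset utility (purchased quantity times unit profit, summed over the transactions that contain the whole itemset) and a brute-force enumeration. It is not SPHUI-Miner and does not use HUI-RTPL or the proposed list structures; the data and the names `PROFIT`, `TRANSACTIONS`, `utility`, `support`, and `high_utility_itemsets` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy data (not from the paper): unit profit per item and
# the purchased quantity of each item in every transaction.
PROFIT = {"a": 5, "b": 2, "c": 1}
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"b": 4, "c": 3},
    {"a": 2, "b": 1, "c": 1},
]


def support(itemset, transactions=TRANSACTIONS):
    """Number of transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items.issubset(t))


def utility(itemset, transactions=TRANSACTIONS, profit=PROFIT):
    """Sum of quantity * unit profit over transactions containing the itemset."""
    items = set(itemset)
    total = 0
    for t in transactions:
        if items.issubset(t):
            total += sum(t[i] * profit[i] for i in items)
    return total


def high_utility_itemsets(min_util, transactions=TRANSACTIONS, profit=PROFIT):
    """Brute-force enumeration of every itemset whose utility is >= min_util.

    Exponential in the number of items; shown only to illustrate the
    problem definition, not an efficient mining strategy.
    """
    all_items = sorted(profit)
    result = {}
    for size in range(1, len(all_items) + 1):
        for combo in combinations(all_items, size):
            u = utility(combo, transactions, profit)
            if u >= min_util:
                result[frozenset(combo)] = u
    return result


if __name__ == "__main__":
    for itemset in (("b",), ("a", "b"), ("a", "b", "c")):
        print(itemset, "support =", support(itemset), "utility =", utility(itemset))
    print(high_utility_itemsets(15))
```

In this toy database, support can only stay the same or shrink as items are added, whereas utility first rises from `{b}` to `{a, b}` and then falls again for `{a, b, c}`. With a minimum utility of 15, `{b}` is not a high-utility itemset but its superset `{a, b}` is, so discarding the supersets of a low-utility itemset, as frequency-based pruning would, is not safe here.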