Efficient High-utility Itemset Mining Based on a Novel Data Structure

2021 
High-utility itemset mining (HUIM) is an important task in data mining. HUIM finds the most profitable products by considering quantity and profit rather than frequency. One-phase HUIM algorithms based on the utility-list structure are among the most efficient because they mine high-utility itemsets (HUIs) without generating candidates. However, storing itemset information in utility lists is costly in both time and memory, especially on dense datasets with long transactions. To address this problem, this paper proposes an HUIM algorithm based on a novel data structure. The structure reorganizes the transaction database so that all HUIs can be found effectively while memory usage is reduced during the depth-first search. Based on this structure, two upper bounds, built on the extensions utility and the local transaction weighted utility, are introduced to greatly reduce the search space in both width and depth. Experiments are carried out on dense and sparse benchmark datasets. Compared with several state-of-the-art HUIM algorithms, the proposed algorithm achieves better running time and memory usage when mining HUIs.
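To make the terms in the abstract concrete, the following is a minimal sketch of the standard HUIM definitions: itemset utility, transaction weighted utility (TWU), and a brute-force search for HUIs. It does not implement the paper's data structure or its two upper bounds, which the abstract does not describe in detail. The toy database, the profit table, and all function names are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "d": 4},
]
# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 2, "c": 1, "d": 3}


def transaction_utility(t):
    """Total profit of one transaction (quantity * unit profit, summed)."""
    return sum(q * profit[i] for i, q in t.items())


def utility(itemset, db):
    """Utility of an itemset: its profit summed over transactions containing it."""
    total = 0
    for t in db:
        if all(i in t for i in itemset):
            total += sum(t[i] * profit[i] for i in itemset)
    return total


def twu(itemset, db):
    """Transaction weighted utility: an upper bound on the utility of the
    itemset and of all its supersets, used for pruning in many HUIM algorithms."""
    return sum(transaction_utility(t) for t in db
               if all(i in t for i in itemset))


def brute_force_huis(db, min_util):
    """Enumerate every itemset and keep those with utility >= min_util.
    Exponential in the number of items; for illustration only."""
    items = sorted({i for t in db for i in t})
    result = {}
    for k in range(1, len(items) + 1):
        for cand in combinations(items, k):
            u = utility(cand, db)
            if u >= min_util:
                result[cand] = u
    return result


if __name__ == "__main__":
    print(utility(("a", "c"), transactions))   # 19
    print(twu(("a", "c"), transactions))       # 23
    print(brute_force_huis(transactions, 14))
```

In this toy data, item d appears in only one transaction, yet the itemset (b, d) still reaches a utility of 14. This shows why utility-based mining can surface itemsets that frequency-based mining would miss. The exhaustive enumeration is exactly the cost that upper bounds such as TWU are meant to avoid.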