A Guided FP-Growth algorithm for mining multitude-targeted item-sets and class association rules in imbalanced data

2020 
Abstract Identifying frequent item-sets is a popular data-mining task. It consists of finding sets of items frequently appearing in data. Yet, finding all frequent item-sets in large or dense datasets may be time-consuming, and a user may be interested merely in some specific item-sets rather than all of them. Recently, methods have been proposed for targeted item-set mining; that is to calculate the support of some item-sets of interest. Though this approach is often more suitable for real applications than traditional item-set mining approaches, performance remains an issue. To address that issue, this paper presents a novel algorithm for multitude-targeted mining, named Guided Frequent Pattern-Growth (GFP-Growth). The GFP-Growth algorithm is designed to quickly mine a given set of item-sets using a small amount of memory. This paper proves that GFP-Growth yields the exact frequency-counts for each item-set of interest. It further shows that GFP-Growth can boost the performance for several problems requiring item-set mining. We specifically study the problem of generating minority-class rules from imbalanced data and develop the Minority-Report Algorithm (MRA) that uses GFP-Growth to solve this problem efficiently. We prove several theoretical properties of MRA and present experimental results showing substantial performance gain.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    47
    References
    3
    Citations
    NaN
    KQI
    []