Mining Top-k Useful Negative Sequential Patterns via Learning

2019 
As an important tool for behavior informatics, negative sequential patterns (NSPs), such as missing a medical treatment, are sometimes much more informative than positive sequential patterns (PSPs), such as attending a medical treatment. However, NSP mining is at an early stage and faces many challenging problems, including 1) how to mine an expected number of NSPs; 2) how to select useful NSPs; and 3) how to reduce the high time consumption. To solve the first problem, we propose an algorithm, Topk-NSP, to mine the k most frequent negative patterns. In Topk-NSP, we first mine the top-k PSPs using existing methods, and then mine the top-k NSPs from these PSPs using an approach similar to top-k PSP mining. To solve the remaining two problems, we propose three optimization strategies for Topk-NSP. The first strategy accounts for the influence of PSPs when selecting useful top-k NSPs: we introduce two weights, wP and wN, to express the user's degree of preference for PSPs and NSPs, respectively, and select useful NSPs by a weighted support, wsup. The second strategy merges wsup with an interestingness metric to select more useful NSPs. The third strategy introduces pruning to reduce the high computational cost of Topk-NSP. Finally, we propose an optimized algorithm, Topk-NSP⁺. To the best of our knowledge, Topk-NSP⁺ is the first algorithm that can mine the top-k useful NSPs. Experimental results on four synthetic and two real-life data sets show that Topk-NSP⁺ is very efficient at mining the top-k NSPs in terms of computational cost and scalability.
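The idea of ranking candidate NSPs by a weighted support and keeping only the k best can be sketched as follows. This is a minimal illustration, not the paper's method: the abstract does not give the formula for wsup, so the linear combination below is an assumption, as are the function names, the candidate encoding (`!` marks a negative element), and the support values, which are made up for the example. Here `w_n` weights the NSP's own support and `w_p` weights the support of the PSP it was derived from.

```python
import heapq
from typing import Dict, List, Tuple


def weighted_support(sup_nsp: float, sup_psp: float,
                     w_n: float, w_p: float) -> float:
    """Hypothetical weighted support: a linear mix of the NSP's support
    and the support of its originating PSP. The real wsup definition is
    not stated in the abstract; this form is assumed for illustration."""
    return w_n * sup_nsp + w_p * sup_psp


def top_k_by_wsup(candidates: Dict[str, Tuple[float, float]],
                  k: int, w_n: float = 0.5, w_p: float = 0.5
                  ) -> List[Tuple[float, str]]:
    """Return the k candidates with the highest assumed weighted support.

    candidates maps a pattern label to (sup_nsp, sup_psp).
    The result is a list of (score, label) pairs, best first.
    """
    scored = [
        (weighted_support(sup_nsp, sup_psp, w_n, w_p), label)
        for label, (sup_nsp, sup_psp) in candidates.items()
    ]
    # heapq.nlargest avoids sorting the full candidate list when k is small.
    return heapq.nlargest(k, scored)


# Illustrative, made-up supports: label -> (sup_nsp, sup_psp)
candidates = {
    "a,!b,c": (0.6, 0.2),
    "a,!c": (0.3, 0.9),
    "!a,b": (0.1, 0.1),
}

balanced = top_k_by_wsup(candidates, k=2)
nsp_only = top_k_by_wsup(candidates, k=1, w_n=1.0, w_p=0.0)
```

With equal weights, the pattern backed by a strong PSP ranks first; setting `w_p` to zero reduces the ranking to plain NSP support, which shows how the two weights shift which patterns count as "useful".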