An efficient single pass approach to frequent episode discovery in sequence data

2008 
There is a considerable body of work on sequence mining of transactional data. Most of the related work on point data makes several passes over the entire dataset in order to discover frequently occurring episodes or patterns. The OnePass frequent episode discovery (FED) algorithm proposed in this paper takes a different approach from the traditional Apriori class of pattern detection algorithms. In our approach, significant intervals for each event (or device) are first computed independently and are then used to detect frequent patterns along with the intervals in which they occur. The advantage of this approach is that the dataset is compressed substantially in the first step, reducing the size of the input and hence the computation. Also, each event or device can be processed individually, allowing the significant intervals of different events to be computed in parallel. The OnePass FED algorithm then works on these significant intervals to discover interesting episodes in a single pass, in contrast to the multiple passes of the Apriori class of algorithms. Our approach is significantly more efficient than traditional mining algorithms and scales well. Extensive experimental analysis establishes its efficiency and scalability.
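The two-stage idea described above can be illustrated with a minimal sketch. This is not the paper's OnePass FED algorithm, whose details the abstract does not give. It assumes that a "significant interval" is a run of timestamps with small gaps and enough occurrences, and that an "episode" is an ordered pair of events whose intervals start close together. The function names and the parameters `max_gap`, `min_count`, and `window` are hypothetical.

```python
def significant_intervals(timestamps, max_gap, min_count):
    """Compress one event's timestamps into (start, end, count) intervals.

    Assumption: a run of occurrences whose consecutive gaps are at most
    max_gap, containing at least min_count points, counts as significant.
    """
    intervals = []
    run = []
    for t in sorted(timestamps):
        if run and t - run[-1] > max_gap:
            if len(run) >= min_count:
                intervals.append((run[0], run[-1], len(run)))
            run = []
        run.append(t)
    if len(run) >= min_count:
        intervals.append((run[0], run[-1], len(run)))
    return intervals


def one_pass_pairs(intervals_by_event, window):
    """Sweep once over all intervals, ordered by start time.

    Reports ((first_event, second_event), (start, end)) whenever the second
    event's interval starts within `window` of the first event's start.
    """
    tagged = sorted(
        (start, end, event)
        for event, ivs in intervals_by_event.items()
        for (start, end, _count) in ivs
    )
    active = []    # earlier intervals still within `window` of the current start
    episodes = []
    for start, end, event in tagged:
        active = [iv for iv in active if start - iv[0] <= window]
        for a_start, a_end, a_event in active:
            if a_event != event:
                episodes.append(((a_event, event), (a_start, max(a_end, end))))
        active.append((start, end, event))
    return episodes


# Hypothetical device logs: timestamps at which each device fired.
raw = {"lamp": [1, 2, 3, 50, 51, 52], "tv": [4, 5, 6, 90]}

# Stage 1: each event is compressed independently (so it could run in parallel).
compressed = {e: significant_intervals(ts, max_gap=2, min_count=3) for e, ts in raw.items()}

# Stage 2: a single sweep over the much smaller interval set.
print(one_pass_pairs(compressed, window=5))
# → [(('lamp', 'tv'), (1, 6))]
```

In this sketch the first stage touches each raw timestamp once and the second stage only ever sees the compressed intervals, which mirrors the claimed source of the efficiency gain: the expensive pattern search runs on far fewer items than the original point data.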