    Mining Maximal Frequent Itemsets over the Entire History of Data Streams
    Citations: 13 | References: 14 | Related papers: 10
    Abstract:
    Mining maximal frequent itemsets has attracted wide attention. However, mining data streams is more difficult than mining static databases because streaming data is huge, high-speed, and continuous. This paper presents an algorithm called IDSM-MFI. The algorithm uses a synopsis data structure to store the items of the transactions that have arrived in the data stream so far. It combines top-down and bottom-up search to mine the set of all maximal frequent itemsets in landmark windows over the data stream, and the result can be output in real time according to user-specified thresholds. Theoretical analysis and experimental results show that the algorithm is efficient and scalable for mining the set of all maximal frequent itemsets over the entire history of a data stream.
    Keywords:
    Data set
    Streaming Data
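The abstract above does not spell out IDSM-MFI itself, so the following is only a minimal sketch of the problem it addresses, not the authors' algorithm. It assumes a naive synopsis (a counter of distinct transactions seen since the landmark) and a level-wise, Apriori-style search followed by a maximality filter. The class name `LandmarkMFIMiner` and its methods are hypothetical.

```python
from collections import Counter
from itertools import combinations


class LandmarkMFIMiner:
    """Naive landmark-window miner: every transaction since the landmark counts.

    Illustrative only; this is NOT the IDSM-MFI synopsis structure.
    """

    def __init__(self):
        self.transactions = Counter()  # distinct transaction -> occurrence count
        self.n = 0                     # transactions seen since the landmark

    def add(self, transaction):
        """Consume one transaction from the stream."""
        self.transactions[frozenset(transaction)] += 1
        self.n += 1

    def support(self, itemset):
        """Absolute support: number of stored transactions containing itemset."""
        itemset = frozenset(itemset)
        return sum(c for t, c in self.transactions.items() if itemset <= t)

    def maximal_frequent_itemsets(self, min_support):
        """Return maximal frequent itemsets; min_support is a fraction in (0, 1]."""
        if self.n == 0:
            return []
        min_count = min_support * self.n
        items = sorted({i for t in self.transactions for i in t})
        level = [frozenset([i]) for i in items if self.support([i]) >= min_count]
        frequent = []
        while level:
            frequent.extend(level)
            level_set = set(level)
            # Join step: merge pairs of k-itemsets that differ in one item.
            candidates = {a | b for a in level for b in level
                          if len(a | b) == len(a) + 1}
            # Prune step: keep candidates whose k-subsets are all frequent,
            # then verify their support against the stored transactions.
            level = [c for c in candidates
                     if all(frozenset(s) in level_set
                            for s in combinations(c, len(c) - 1))
                     and self.support(c) >= min_count]
        # An itemset is maximal if no frequent proper superset exists.
        return sorted(sorted(f) for f in frequent
                      if not any(f < g for g in frequent))
```

Because the threshold is passed at query time, the same accumulated state can answer queries for different user-specified thresholds, which mirrors the real-time output described in the abstract. A real streaming algorithm would replace the brute-force support counting with a compact synopsis.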
    Because real-time data stream mining has a potentially large number of applications in scientific and business analysis, it has drawn the attention of many researchers in machine learning and data mining. In many cases, online learning is used for real-time data stream mining. Environments that require online learning are non-stationary, and their underlying distributions may change over time (concept drift), which makes mining real-time data streams quite challenging. Ensemble methods have been suggested for this situation. This paper reviews various online methods of drift detection. We also present experimental results that compare several online concept drift detection methods.
    Concept Drift
    Streaming Data
    Citations (16)
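The abstract above does not say which drift detectors were compared. As an illustration of what an online drift detector looks like, here is a minimal sketch in the style of the Drift Detection Method (DDM) of Gama et al., chosen here as an assumption. It monitors the running error rate of a classifier and signals drift when the rate rises well above its recorded minimum. The class name `SimpleDDM` and the default thresholds are illustrative.

```python
import math


class SimpleDDM:
    """DDM-style detector over a stream of 0/1 prediction errors (a sketch)."""

    def __init__(self, min_samples=30, warning_level=2.0, drift_level=3.0):
        self.min_samples = min_samples
        self.warning_level = warning_level
        self.drift_level = drift_level
        self.reset()

    def reset(self):
        self.n = 0               # samples seen since the last reset
        self.p = 0.0             # running error rate
        self.p_min = math.inf    # error rate at the best point so far
        self.s_min = math.inf    # standard deviation at the best point so far

    def update(self, error):
        """error is 1 for a misclassification, else 0.

        Returns 'stable', 'warning', or 'drift'.
        """
        self.n += 1
        self.p += (error - self.p) / self.n
        s = math.sqrt(max(0.0, self.p * (1.0 - self.p)) / self.n)
        if self.n < self.min_samples:
            return "stable"
        # Remember the point where error rate plus deviation was lowest.
        if self.p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, s
        # Strict comparisons avoid a false alarm when the error rate is exactly 0.
        if self.p + s > self.p_min + self.drift_level * self.s_min:
            self.reset()
            return "drift"
        if self.p + s > self.p_min + self.warning_level * self.s_min:
            return "warning"
        return "stable"
```

In use, `update` would be called once per labeled example with the current model's error; on `'drift'` the learner is typically retrained or replaced.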
    In this paper, we consider the problem of mining with weighted support over a data stream sliding window using limited memory space. The continuous nature of streaming data necessitates algorithms that require only one scan over the stream for knowledge discovery. This paper focuses on research issues concerning mining frequent itemsets in data streams, and we propose an efficient algorithm, WSFI-Mine, to mine all frequent itemsets. Our experiments show that the algorithm not only consumes less memory but also runs significantly faster than THUI-mine.
    Sliding window protocol
    Streaming Data
    Citations (0)
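The abstract above does not define weighted support or describe WSFI-Mine's internals. The sketch below assumes one common definition, the average weight of the items in an itemset multiplied by the itemset's relative support in the current window, and uses a fixed-size window that evicts the oldest transaction. The class name, the `weights` mapping, and the definition itself are assumptions for illustration.

```python
from collections import deque


class SlidingWindowWeightedSupport:
    """Weighted support over the most recent `window_size` transactions (a sketch)."""

    def __init__(self, window_size, weights):
        self.window = deque(maxlen=window_size)  # oldest transaction drops out
        self.weights = weights                   # item -> weight (assumed given)

    def add(self, transaction):
        """Consume one transaction; only a single pass over the stream is needed."""
        self.window.append(frozenset(transaction))

    def weighted_support(self, itemset):
        """Average item weight times relative support within the window."""
        itemset = frozenset(itemset)
        if not itemset or not self.window:
            return 0.0
        avg_weight = sum(self.weights[i] for i in itemset) / len(itemset)
        count = sum(1 for t in self.window if itemset <= t)
        return avg_weight * count / len(self.window)
```

The bounded `deque` keeps memory proportional to the window size, which reflects the limited-memory setting; it does not reproduce the data structure used by WSFI-Mine.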
    Concept drift is an important concept in the realm of data streams. Streaming data may consist of multiple drifting concepts, each with its own underlying data distribution. Concept drift occurs when a set of examples has legitimate class labels at one time and different legitimate labels at another time. This paper provides a comprehensive overview of existing concept-evolution in concept-drifting techniques along different dimensions, and it gives a clear picture of how ensembles behave when dealing with concept drift. Key words: data stream, ensemble, class label, concept drift.
    Concept Drift
    Streaming Data
    Realm
    Learning from data streams is a hot topic in machine learning and data mining. This article presents our recent work on the topic of learning from data streams. We focus on emerging topics, including fraud detection and hyper-parameter tuning for streaming data. The first study is a case study on interconnected by-pass fraud. This is a real-world problem from high-speed telecommunications data that clearly illustrates the need for online data stream processing. In the second study, we present an optimization algorithm for online hyper-parameter tuning from nonstationary data streams.
    Streaming Data
    Concept Drift