Stratified and time-aware sampling based adaptive ensemble learning for streaming recommendations
Citations: 8 | References: 67 | Related papers: 10
Keywords:
Streaming Data
Concept Drift
Experience sampling method
Adaptive sampling
Information Overload
Online processing of data streams is currently a very active research topic. Streaming data are usually dynamic: the underlying data distributions evolve over time. Predictive analytical tasks, such as classification, must be able to reflect these dynamics. This phenomenon is called concept drift, and multiple adaptive classification methods have been proposed to handle drifting streams. To understand how adaptive models work, it is necessary to use techniques that can visualize how the model performs and provide explanations of drift occurrences. In this paper, we present a visualization technique based on feature importance. The technique reports the importance of the input features continuously over the stream and uses it to explain possible drifts in the data. We used the widely used ADWIN adaptive streaming classifier and evaluated the technique on two real-world data streams with concept drift.
Concept Drift
Streaming Data
Feature (linguistics)
Citations (1)
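As a rough illustration of the adaptive-windowing idea behind ADWIN, which the abstract above names, the sketch below checks every split of a window of values and reports the first split whose two sub-window means differ by more than the ADWIN cut threshold. It is an exhaustive, uncompressed variant for illustration only: the function name `adwin0_cut_point`, the default `delta`, and the toy streams are assumptions, the inputs are assumed to lie in [0, 1], and the bucket compression used by practical ADWIN implementations is omitted. It does not reproduce the paper's classifier or its feature-importance visualization.

```python
import math
from typing import Optional, Sequence


def adwin0_cut_point(window: Sequence[float], delta: float = 0.002) -> Optional[int]:
    """Return the first split index whose two sub-window means differ by more
    than the ADWIN cut threshold, or None if no split qualifies.

    Values are assumed to lie in [0, 1] (for example, 0/1 error indicators).
    """
    n = len(window)
    if n < 2:
        return None
    delta_prime = delta / n  # confidence is shared across the n candidate splits
    total = sum(window)
    head_sum = 0.0
    for split in range(1, n):
        head_sum += window[split - 1]
        n0, n1 = split, n - split
        mean0 = head_sum / n0
        mean1 = (total - head_sum) / n1
        m = 1.0 / (1.0 / n0 + 1.0 / n1)  # 1/m = 1/n0 + 1/n1
        eps_cut = math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))
        if abs(mean0 - mean1) > eps_cut:
            return split
    return None


# Hypothetical toy streams: one stationary, one with an abrupt change halfway.
stable = [0.0] * 200
drifting = [0.0] * 100 + [1.0] * 100

print(adwin0_cut_point(stable))
print(adwin0_cut_point(drifting) is not None)
```

The returned index is only the first split at which the difference becomes statistically significant, so it should not be read as the exact position of the drift.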
Streaming Data
Concept Drift
Citations (2)
The world generates an immeasurable amount of data every minute, and these data need to be analyzed for better decision making. To meet this demand for faster analytics, businesses are adopting efficient stream processing and machine learning techniques. However, data streams are particularly challenging to handle. One of the prominent problems in dealing with streaming data is concept drift. Concept drift is an unexpected change in the underlying distribution of the streaming data that can be observed as time passes. In this work, we conducted a systematic literature review to identify methods that deal with the problem of concept drift. We reviewed the most frequently used supervised and unsupervised techniques and surveyed the commonly used, publicly available artificial and real-world datasets for studying concept drift.
Concept Drift
Streaming Data
Real world data
Citations (0)
Concept Drift
Streaming Data
Benchmark (surveying)
Time lag
False alarm
Citations (1)
One objective of data mining (DM) over streaming data is the detection of concept drift. Concept drift signifies diversity among the data tuples streamed in sequence. It often appears as incremental or abrupt. Incremental drift denotes a gradual increase in the drift between the tuples of the streaming data. Abrupt drift denotes a sudden drift between tuples streaming in sequence. The proposed method is an Ensemble Framework for Concept-Drift Detection in Multidimensional Streaming Data (EFCDD). In addition, EFCDD deals with recurrent concept drift in streaming data. To identify the drift, the method uses the projection diversity of the values representing the field positions or field IDs that frame the structure of the records streaming from the intended sources. The experimental study was carried out by simulating streams of records from benchmark datasets often used in DM. The outcomes of the experimental study show the scalability and effectiveness of EFCDD in detecting concept drift. The performance of the proposal is measured by comparing simulation outcomes with those of other existing models.
Streaming Data
Concept Drift
Benchmark (surveying)
Streaming algorithm
Citations (9)
In data stream analysis, detecting concept drift accurately is important for maintaining classification performance. Most drift detection methods assume that class labels become available immediately after a data sample arrives. However, this assumption is overly optimistic, as labeling costs are high and obtaining the labels of data samples takes considerable time. It is therefore unrealistic to attempt to acquire all of the labels when processing data streams. In this paper, we propose a concept drift detection method under the assumption that there is limited access to labels. The proposed method detects concept drift on unlabeled data streams based on class label information predicted by a classifier trained with a limited number of labeled data samples. Experimental results on synthetic and real streaming data show that the proposed method is able to detect concept drift using only a small amount of labeled data.
Concept Drift
Streaming Data
Labeled data
Citations (3)
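The abstract above describes detecting drift from predicted class labels but does not say which statistic is used. As one possible reading, the sketch below compares the frequencies of predicted labels in a reference window and a current window using total variation distance, and flags drift when the distance exceeds a threshold. The function names, the distance measure, and the threshold of 0.2 are all assumptions made for illustration, not the method of the paper.

```python
from collections import Counter
from typing import Hashable, Sequence


def label_distribution_distance(
    reference: Sequence[Hashable], current: Sequence[Hashable]
) -> float:
    """Total variation distance between the label frequencies of two windows."""
    if not reference or not current:
        raise ValueError("both windows must be non-empty")
    ref_counts = Counter(reference)
    cur_counts = Counter(current)
    labels = set(ref_counts) | set(cur_counts)
    # Counter returns 0 for labels missing from one window.
    return 0.5 * sum(
        abs(ref_counts[label] / len(reference) - cur_counts[label] / len(current))
        for label in labels
    )


def predicted_label_drift(
    reference: Sequence[Hashable],
    current: Sequence[Hashable],
    threshold: float = 0.2,  # hypothetical threshold, to be tuned per stream
) -> bool:
    """Flag drift when the predicted-label distribution shifts past the threshold."""
    return label_distribution_distance(reference, current) > threshold


# Hypothetical predicted labels from a classifier on two stream windows.
reference_window = ["a"] * 50 + ["b"] * 50
shifted_window = ["a"] * 90 + ["b"] * 10

print(predicted_label_drift(reference_window, reference_window))
print(predicted_label_drift(reference_window, shifted_window))
```

A check of this kind only sees changes that alter the mix of predicted labels, so drift that leaves that mix unchanged would go unnoticed.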
Data distributions in streaming environments are usually not stationary. To maintain high predictive quality at all times, online learning models need to adapt to distributional changes, which are known as concept drift. The timely and robust identification of concept drift can be difficult, as we never have access to the true distribution of streaming data. In this work, we propose a novel framework for the detection of real concept drift, called ERICS. By treating the parameters of a predictive model as random variables, we show that concept drift corresponds to a change in the distribution of optimal parameters. To this end, we adopt common measures from information theory. The proposed framework is completely model-agnostic. By choosing an appropriate base model, ERICS is also capable of detecting concept drift at the input level, which is a significant advantage over existing approaches. An evaluation on several synthetic and real-world data sets suggests that the proposed framework identifies concept drift more effectively and precisely than various existing works.
Concept Drift
Streaming Data
Identification
Base (topology)
Citations (1)
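The abstract above says that drift is read from a change in the distribution of model parameters, measured with tools from information theory, without naming the measure. A minimal sketch of that general idea, assuming each parameter is summarized by a univariate Gaussian and the change is scored with the Kullback-Leibler divergence, could look like the following. The Gaussian summary, the choice of KL divergence, and the function name `gaussian_kl` are assumptions for illustration and are not taken from ERICS.

```python
import math


def gaussian_kl(mean_p: float, std_p: float, mean_q: float, std_q: float) -> float:
    """KL divergence KL(P || Q) between two univariate Gaussians.

    Closed form: log(std_q / std_p)
                 + (std_p**2 + (mean_p - mean_q)**2) / (2 * std_q**2) - 1/2
    """
    if std_p <= 0 or std_q <= 0:
        raise ValueError("standard deviations must be positive")
    return (
        math.log(std_q / std_p)
        + (std_p ** 2 + (mean_p - mean_q) ** 2) / (2 * std_q ** 2)
        - 0.5
    )


# Hypothetical parameter summaries at two points in time.
print(gaussian_kl(0.0, 1.0, 0.0, 1.0))  # identical distributions
print(gaussian_kl(0.0, 1.0, 1.0, 1.0))  # mean shifted by one standard deviation
```

A larger divergence between consecutive parameter distributions would then be treated as stronger evidence of drift. How such a score is thresholded is left open here.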
This paper discusses concept drift in online streaming data and evaluates the performance of different Extreme Learning Machine (ELM) based techniques on classifying online streaming data in the presence of concept drift. It also compares the performance of a hybrid model called Online Recurrent ELM (OR-ELM) with traditional recurrent neural networks, in terms of training speed and accuracy, on streaming data that has concept drift. The results of our experiments show that OR-ELM has better accuracy and faster training time.
Concept Drift
Extreme Learning Machine
Streaming Data
Citations (1)
Data stream mining has become an active research topic and is attracting growing interest as a data discovery method. Several applications involve stream data processing, such as device networks and electronic networks. Our approach, AhtNODE (Adaptive Hoeffding Tree based NOvel class DEtection), detects novel classes in the presence of concept drift in streaming data. It addresses three challenges of streaming data: infinite length, concept drift, and concept evolution. This approach automatically detects a novel class whenever it arrives in the data stream. It is a multi-class approach that distinguishes a novel class from existing classes. The authors apply the Adaptive Hoeffding Tree (AHT) as the classification model, which is also used to handle concept drift. Previous approaches used an ensemble model to handle concept drift. In AHT, classification is done in a single pass. The experimental results demonstrate the effectiveness of AhtNODE compared to existing ensemble classifiers in terms of classification accuracy, speed, and memory use.
Concept Drift
Streaming Data
Citations (3)
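The abstract above does not describe the internals of AhtNODE, but Hoeffding trees in general decide when to split a leaf by comparing the gap between the two best split candidates against the Hoeffding bound. The sketch below shows only that standard split test, not novel class detection and not the adaptive mechanism of the Adaptive Hoeffding Tree. The function names and the example numbers are illustrative assumptions.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: with probability 1 - delta, the true mean of a variable
    with the given range is within this distance of the mean of n observations."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(
    best_gain: float, second_gain: float, value_range: float, delta: float, n: int
) -> bool:
    """Split when the best candidate beats the runner-up by more than the bound."""
    return best_gain - second_gain > hoeffding_bound(value_range, delta, n)


# Hypothetical gains observed at a leaf after 200 examples.
print(should_split(0.30, 0.05, value_range=1.0, delta=1e-7, n=200))
print(should_split(0.30, 0.25, value_range=1.0, delta=1e-7, n=200))
```

Because the bound shrinks as more examples arrive, a leaf that cannot yet separate two close candidates may still split later in the stream.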
In data stream analysis, detecting concept drift accurately is important for maintaining classification performance. Most drift detection methods assume that class labels become available immediately after a data sample arrives. However, it is unrealistic to attempt to acquire all of the labels when processing data streams, as labeling costs are high and labeling takes considerable time. In this paper, we propose a concept drift detection method under the assumption that there is limited or no access to class labels. The proposed method detects concept drift on unlabeled data streams based on class label information predicted by a classifier or a virtual classifier. Experimental results on synthetic and real streaming data show that the proposed method is able to detect concept drift on unlabeled data streams.
Concept Drift
Streaming Data
Labeled data
Citations (49)