language-icon Old Web
English
Sign In

Data stream clustering

In computer science, data stream clustering is defined as the clustering of data that arrive continuously such as telephone records, multimedia data, financial transactions etc. Data stream clustering is usually studied as a streaming algorithm and the objective is, given a sequence of points, to construct a good clustering of the stream, using a small amount of memory and time. Theorem —  STREAM can solve the k-Median problem on a data stream in a single pass, with time O(n1+e) and space θ(nε) up to a factor 2O(1/e), where n the number of points and e < 1 / 2 {displaystyle e<1/2} . In computer science, data stream clustering is defined as the clustering of data that arrive continuously such as telephone records, multimedia data, financial transactions etc. Data stream clustering is usually studied as a streaming algorithm and the objective is, given a sequence of points, to construct a good clustering of the stream, using a small amount of memory and time. Data stream clustering has recently attracted attention for emerging applications that involve large amounts of streaming data. For clustering, k-means is a widely used heuristic but alternate algorithms have also been developed such as k-medoids, CURE and the popular BIRCH. For data streams, one of the first results appeared in 1980 but the model was formalized in 1998.

[ "CURE data clustering algorithm", "Canopy clustering algorithm" ]
Parent Topic
Child Topic
    No Parent Topic