Chapter 32 – StatStream: Statistical Monitoring of Thousands of Data Streams in Real Time

2002 
Publisher Summary

Maintaining multistream and time-delayed statistics continuously and online is a significant challenge in data management. This chapter solves the problem in a scalable way that guarantees response time while keeping accuracy high. The Discrete Fourier Transform (DFT) reduces the enormous raw data streams to a manageable synoptic data structure and gives good I/O performance. For any pair of streams, the pairwise statistic is computed incrementally from a DFT approximation and requires constant time per update. A sliding/basic window framework is introduced to manage streaming data digests efficiently. The correlation coefficient similarity measure is reduced to a Euclidean distance measure, and a grid structure is used to detect correlations among thousands of high-speed data streams in real time. Experiments on synthetic and real data show that StatStream detects correlations efficiently and precisely.
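The reduction from correlation to Euclidean distance mentioned in the summary can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names (`normalize`, `digest`, and so on), the sample windows, and the choice of keeping the first `k` coefficients of a unitary DFT are assumptions made for illustration. For windows shifted to zero mean and scaled to unit norm, the Pearson correlation equals 1 - d²/2, where d is the Euclidean distance between the normalized windows. Because a unitary DFT preserves distance, the distance computed over any subset of coefficients can only underestimate d.

```python
import cmath
import math


def normalize(xs):
    """Shift a window to zero mean and scale it to unit Euclidean norm."""
    mean = sum(xs) / len(xs)
    centered = [x - mean for x in xs]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0:
        raise ValueError("correlation is undefined for a constant window")
    return [c / norm for c in centered]


def correlation(xs, ys):
    """Pearson correlation as the inner product of the normalized windows."""
    return sum(a * b for a, b in zip(normalize(xs), normalize(ys)))


def euclidean(us, vs):
    """Euclidean distance; abs() also handles complex DFT coefficients."""
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(us, vs)))


def unitary_dft(xs):
    """Naive O(w^2) DFT scaled by 1/sqrt(w) so that distances are preserved."""
    w = len(xs)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * f * t / w) for t, x in enumerate(xs))
        / math.sqrt(w)
        for f in range(w)
    ]


def digest(xs, k):
    """Hypothetical digest: first k DFT coefficients of the normalized window."""
    return unitary_dft(normalize(xs))[:k]


# Two made-up windows of length 8 (illustrative data, not from the chapter).
x = [1.0, 2.0, 3.5, 3.0, 4.2, 5.1, 4.8, 6.0]
y = [2.1, 2.9, 4.0, 4.4, 5.0, 6.3, 6.1, 7.2]

d = euclidean(normalize(x), normalize(y))
corr = correlation(x, y)
d_k = euclidean(digest(x, 3), digest(y, 3))

# Correlation and distance carry the same information: corr = 1 - d^2 / 2.
assert abs(corr - (1 - d * d / 2)) < 1e-12
# The digest distance never exceeds the true distance.
assert d_k <= d + 1e-12
```

Since the digest distance is a lower bound on the true distance, a pair whose digests are far apart cannot be highly correlated, so filtering on digests can produce false positives but no false negatives. That property is what makes a proximity search over small digests, such as a grid lookup, a safe first pass in this sketch.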