Chapter 32 – StatStream: Statistical Monitoring of Thousands of Data Streams in Real Time
2002
Publisher Summary
Maintaining multistream and time-delayed statistics in a continuous online fashion is a significant challenge in data management. This chapter solves the problem in a scalable way that gives a guaranteed response time with high accuracy. The Discrete Fourier Transform (DFT) reduces the enormous raw data streams to a manageable synoptic data structure and gives good I/O performance. For any pair of streams, the pairwise statistic is computed incrementally from a DFT approximation and requires constant time per update. A sliding/basic window framework is introduced to facilitate the efficient management of streaming data digests. The correlation coefficient similarity measure is reduced to a Euclidean distance measure, and a grid structure is used to detect correlations among thousands of high-speed data streams in real time. Experiments on synthetic and real data show that StatStream detects correlations efficiently and precisely.
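The reduction from correlation to Euclidean distance mentioned in the summary can be illustrated with a small sketch. It is not the chapter's implementation; the function names are hypothetical and a naive DFT stands in for an incremental one. It relies on a standard identity: if each window is shifted to zero mean and scaled to unit norm, the correlation equals 1 - d^2/2, where d is the Euclidean distance between the normalized windows. Because a unitary DFT preserves distances, a distance computed from only a few coefficients can never exceed the true distance, so it gives an upper bound on the correlation that can be used to discard uncorrelated pairs.

```python
import cmath
import math


def normalize(xs):
    """Shift a window to zero mean and scale it to unit Euclidean norm."""
    w = len(xs)
    mean = sum(xs) / w
    centered = [x - mean for x in xs]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        raise ValueError("constant window has undefined correlation")
    return [c / norm for c in centered]


def correlation(xs, ys):
    """Pearson correlation computed directly, for comparison."""
    return sum(a * b for a, b in zip(normalize(xs), normalize(ys)))


def squared_distance(us, vs):
    """Squared Euclidean distance; works for real or complex entries."""
    return sum(abs(u - v) ** 2 for u, v in zip(us, vs))


def corr_from_distance(xs, ys):
    """Correlation recovered from the distance between normalized windows."""
    return 1.0 - squared_distance(normalize(xs), normalize(ys)) / 2.0


def unitary_dft(xs, n_coeffs):
    """First n_coeffs DFT coefficients, scaled by 1/sqrt(w) (naive O(w*n))."""
    w = len(xs)
    scale = 1.0 / math.sqrt(w)
    return [
        scale * sum(x * cmath.exp(-2j * cmath.pi * k * t / w)
                    for t, x in enumerate(xs))
        for k in range(n_coeffs)
    ]


def corr_upper_bound_from_dft(xs, ys, n_coeffs):
    """Upper bound on correlation using only the first n_coeffs coefficients.

    Dropping coefficients can only shrink the distance, so the value
    returned here is never below the true correlation.
    """
    fx = unitary_dft(normalize(xs), n_coeffs)
    fy = unitary_dft(normalize(ys), n_coeffs)
    return 1.0 - squared_distance(fx, fy) / 2.0
```

In this sketch, a pair of streams whose bound falls below a chosen correlation threshold can be skipped without touching the raw data; pairs that pass the bound still need an exact check, since the bound may overestimate the correlation.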