Near Real-Time Service Monitoring Using High-Dimensional Time Series
2015
We demonstrate a near real-time service monitoring system for detecting and diagnosing issues from high-dimensional time series data. For detection, we have implemented a learning algorithm that constructs a hierarchy of detectors from data. It is scalable, does not require labelled examples of issues for learning, runs in near real-time, and identifies a subset of counter time series as being relevant for a detected issue. For diagnosis, we provide efficient algorithms as post-detection diagnosis aids to find further relevant counter time series at issue times, a SQL-like query language for writing flexible queries that apply these algorithms on the time series data, and a graphical user interface for visualizing the detection and diagnosis results. Our solution has been deployed in production as an end-to-end system for monitoring Microsoft's internal distributed data storage and computing platform, which consists of tens of thousands of machines, and currently analyses about 12,000 counter time series.