Situation-Aware Multivariate Time Series Anomaly Detection Through Active Learning and Contrast VAE-Based Models in Large Distributed Systems

2022 
The massive amounts of monitoring data generated by network applications create an urgent need for intelligent operations in large distributed systems. The key problem is to precisely detect anomalies in multivariate time series (MTS) monitoring metrics while remaining aware of different application scenarios. Unsupervised MTS anomaly detection methods aim to detect anomalies from historical MTS alone, without considering out-of-band information (including user feedback and background information such as code deployment status), which leads to poor performance in practice. To take advantage of out-of-band information, we propose ACVAE, an MTS anomaly detection algorithm based on active learning and contrast VAE-based detection models, which simultaneously learns the normal and anomalous patterns of MTS data. We also use a learnable prior to capture system status from the background information. Moreover, we propose a query model for VAE-based methods, which learns to query labels for the instances most useful for training the detection model. We evaluate our algorithm on three different monitoring situations in eBay’s search back-end systems. ACVAE achieves F1 scores ranging from 0.68 to 0.96 using only 3% of the labels, significantly outperforming the best competing methods by 0.18 to 0.50, and even outperforming a supervised ensemble method designed by domain experts at eBay.
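
To make the idea of querying labels under a small budget concrete, the following is a minimal sketch of generic uncertainty sampling: it asks for labels on the instances whose anomaly score lies closest to the decision threshold. This is not ACVAE's learned query model, which the abstract does not specify; the function name `select_queries`, the `threshold` parameter, and the example scores are all assumptions made for illustration.

```python
import math


def select_queries(scores, threshold, budget_fraction=0.03):
    """Pick the instances to send for labeling (hypothetical helper).

    Generic uncertainty sampling, NOT the paper's query model: an instance
    is considered more useful the closer its anomaly score is to the
    decision threshold. Returns indices, most uncertain first.
    """
    if not 0.0 < budget_fraction <= 1.0:
        raise ValueError("budget_fraction must be in (0, 1]")
    if not scores:
        return []
    # Label budget: a fraction of all instances, but at least one query.
    n_queries = max(1, math.ceil(budget_fraction * len(scores)))
    # Rank instance indices by distance of their score from the threshold.
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i] - threshold))
    return ranked[:n_queries]


# Made-up anomaly scores for ten MTS windows (illustrative only).
scores = [0.02, 0.91, 0.48, 0.10, 0.55, 0.97, 0.30, 0.05, 0.62, 0.15]
queried = select_queries(scores, threshold=0.5, budget_fraction=0.2)
print(queried)
```

With a budget of 20% over ten windows, the sketch selects the two windows scored 0.48 and 0.55, the ones a detector thresholded at 0.5 would be least sure about. A learned query model, as proposed in the abstract, would replace the fixed distance-to-threshold ranking with a criterion trained alongside the detector.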