Confidence guided anomaly detection model for anti-concept drift in dynamic logs

2020 
Abstract Log data records system state and runtime behavior, and is commonly used to diagnose system failures and detect anomalies. However, as systems grow more complex than ever before, their logs become dynamic and the accuracy of log-based anomaly detection algorithms drops sharply, a phenomenon known as concept drift. In this paper, we design a confidence-guided anomaly detection model that combines multiple algorithms, called Multi-CAD. First, we propose a statistical value, p_value, that measures the non-conformity between logs and establishes a link between a new log and previous logs. Multiple suitable algorithms can be chosen as non-conformity measures; each one calculates a score that feeds the combined detection instead of making a decision on its own. Second, we design a confidence-guided parameter adjustment method to counter concept drift in dynamic logs. Through a feedback mechanism, a trusted result, which contains a label, a non-conformity score, and a confidence, is used to update the score set with the corresponding label, so that it serves as prior experience for subsequent detection. Finally, we demonstrate that Multi-CAD achieves balanced performance in precision rate, recall rate, and F_measure, and detects actual anomalies on multiple datasets. An extensive set of experimental results shows that Multi-CAD improves recall rate and F_measure by almost 20% on average compared with four typical algorithms on the HDFS benchmark dataset, where it achieves 98.2% in precision rate, 95.2% in recall rate, and 96.7% in F_measure.
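The abstract does not give the exact formulas, so the following is only a minimal sketch of how a p_value-based, confidence-guided detector with feedback could look. It assumes a standard conformal-prediction style p-value (the smoothed fraction of previous scores that are at least as non-conforming as the new one), averages p-values across detectors, and takes confidence as one minus the p-value of the rejected label. The class name `ConfidenceGuidedDetector`, the `confidence_threshold` parameter, and the averaging rule are hypothetical choices for illustration, not the authors' implementation.

```python
NORMAL, ANOMALY = "normal", "anomaly"


def p_value(score, previous_scores):
    """Smoothed fraction of previous scores at least as non-conforming as `score`.

    Assumption: a conformal-prediction style p-value; the paper's exact
    definition is not given in the abstract.
    """
    at_least = sum(1 for s in previous_scores if s >= score)
    return (at_least + 1) / (len(previous_scores) + 1)


class ConfidenceGuidedDetector:
    """Hypothetical combiner: one labeled score set per underlying algorithm."""

    def __init__(self, n_detectors, confidence_threshold=0.8):
        self.threshold = confidence_threshold
        self.score_sets = [{NORMAL: [], ANOMALY: []} for _ in range(n_detectors)]

    def seed(self, label, scores_per_detector):
        """Load previously labeled anomaly scores, one list per detector."""
        for sets, scores in zip(self.score_sets, scores_per_detector):
            sets[label].extend(scores)

    def detect(self, anomaly_scores):
        """`anomaly_scores[i]` is detector i's score (higher = more anomalous)."""
        p_norm, p_anom = [], []
        for score, sets in zip(anomaly_scores, self.score_sets):
            # Non-conformity to "normal" is the anomaly score itself.
            p_norm.append(p_value(score, sets[NORMAL]))
            # Non-conformity to "anomaly" is the negated anomaly score.
            p_anom.append(p_value(-score, [-s for s in sets[ANOMALY]]))
        p_n = sum(p_norm) / len(p_norm)  # assumed combination rule: mean
        p_a = sum(p_anom) / len(p_anom)
        label = ANOMALY if p_a > p_n else NORMAL
        confidence = 1.0 - min(p_n, p_a)
        # Feedback: only trusted results extend the score set for their label.
        if confidence >= self.threshold:
            for score, sets in zip(anomaly_scores, self.score_sets):
                sets[label].append(score)
        return label, confidence


if __name__ == "__main__":
    det = ConfidenceGuidedDetector(n_detectors=2)
    det.seed(NORMAL, [[0.10, 0.20, 0.15, 0.12, 0.18]] * 2)
    det.seed(ANOMALY, [[0.90, 0.80, 0.95, 0.85]] * 2)
    print(det.detect([0.92, 0.88]))
```

Because only results above the confidence threshold are fed back, the score sets track the log distribution as it drifts while low-confidence decisions are kept out of the stored experience.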