Failure Prediction for Cloud Datacenter by Hybrid Message Pattern Learning
2014
In the operation and management of large-scale cloud datacenters, administrators must handle failures in their infrastructure before they cause service-level violations. Failure prediction techniques have been studied because they allow troubleshooting to start at an early stage and so prevent service-level violations. By its nature, however, failure prediction involves a certain amount of incorrect detection (false positives). When failure prediction is applied to the operation and management of cloud datacenters, incorrect detections can trigger unnecessary workaround tasks and incur additional costs. Existing failure prediction methods that use Bayesian inference to identify message patterns related to a given failure are difficult to apply to relatively stable systems, because their prediction accuracy deteriorates in environments where failures rarely occur. To solve this problem, we propose a novel method that improves the accuracy of failure prediction by suppressing incorrect detections using a hybrid score, which integrates the probability of co-occurrence between a message pattern and a failure with the frequency of the message pattern for that failure. We implemented this method and evaluated its accuracy in a real commercial cloud datacenter. The evaluation results show that, in the best case, it improved the precision of failure prediction by 31.9% compared with the existing method.
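The abstract does not state how the hybrid score is computed, so the following is only an illustrative sketch of the general idea: combine a co-occurrence probability with a frequency term so that a pattern seen once, by coincidence, before a failure does not trigger a prediction. The weighted average, the `weight` and `threshold` parameters, and all names (`PatternStats`, `hybrid_score`, `predict_failure`) are assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class PatternStats:
    """Hypothetical training counts for one (message pattern, failure type) pair."""
    windows_with_pattern: int              # time windows in which the pattern appeared
    windows_with_pattern_and_failure: int  # of those, windows followed by the failure
    failures_total: int                    # failures of this type in the training period
    failures_preceded_by_pattern: int      # of those, failures preceded by the pattern


def cooccurrence_probability(s: PatternStats) -> float:
    """Estimated probability that the failure follows when the pattern is observed."""
    if s.windows_with_pattern == 0:
        return 0.0
    return s.windows_with_pattern_and_failure / s.windows_with_pattern


def pattern_frequency(s: PatternStats) -> float:
    """Fraction of past failures of this type that the pattern preceded."""
    if s.failures_total == 0:
        return 0.0
    return s.failures_preceded_by_pattern / s.failures_total


def hybrid_score(s: PatternStats, weight: float = 0.5) -> float:
    """Assumed combination: a weighted average of the two terms."""
    return weight * cooccurrence_probability(s) + (1.0 - weight) * pattern_frequency(s)


def predict_failure(s: PatternStats, threshold: float = 0.6, weight: float = 0.5) -> bool:
    """Raise a prediction only when the hybrid score reaches the threshold."""
    return hybrid_score(s, weight) >= threshold


# A pattern seen once, which happened to precede one of ten failures.
coincidental = PatternStats(1, 1, 10, 1)
# A pattern seen ten times, which preceded eight of ten failures.
recurring = PatternStats(10, 8, 10, 8)
```

With these made-up counts, the coincidental pattern has a co-occurrence probability of 1.0 but a low frequency, so `predict_failure(coincidental)` stays below the threshold, while `predict_failure(recurring)` does not. This illustrates why a co-occurrence probability alone can produce false positives in a stable system where failures are rare.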