FRL-MFPG: Propagation-aware fault root cause location for microservice intelligent operation and maintenance

2023 
Because microservices are updated continuously and depend on one another in complex ways, faults have become more likely and harder to diagnose, making it difficult for operation and maintenance staff to troubleshoot a fault quickly and accurately and to locate its root cause. To meet the requirements of artificial intelligence for IT operations (AIOps), this paper studies microservice fault root cause location from two aspects: fault propagation relationships and fault root cause location. First, this paper designs MFPG-FC, a method for constructing a microservice fault propagation graph based on fault correlation. The method depicts the propagation relationships and the scope of influence of microservice faults, which improves the accuracy of locating a fault’s root cause. Second, this paper proposes FRL-MFPG, a microservice fault root cause location algorithm based on the microservice fault propagation graph. FRL-MFPG is designed to make the fault location search range more global and flexible and to improve the accuracy rate of fault location. Finally, an AIOps-oriented microservice fault root cause location framework (AIOps-MFRL) is designed. The experimental results show that the proposed method locates the root cause of a fault more accurately than the traditional method. After a fault is detected in a microservice, the method can locate the root cause of the fault and identify its root cause indicators. It offers better timeliness and accuracy, reduces troubleshooting time and the losses caused by faults, and improves the efficiency of intelligent operation and maintenance.