Fault Detection System Activated by Failure Information

2007 
We propose a fault detection system that an application activates when it recognizes the occurrence of a failure, with the aim of realizing self-managing systems that automatically find the source of a failure. Existing detection systems pose three issues for constructing self-managing applications: i) detection results are not sent to the applications, ii) they cannot identify the source failure among all of the detected failures, and iii) configuring the detection system for a networked system is laborious. To overcome these issues, the proposed system takes three approaches: i) it receives failure information from an application and returns a result set to the application, ii) it identifies the source failure using relationships among errors, and iii) it obtains information about the monitored system from a database. The relationships are expressed as a tree, called an error relationship tree. The database provides information on system entities such as hardware devices, software objects, and network topology. When the proposed system starts looking for the source of a failure, it refers to the causal relations in the error relationship tree and uses the database to derive the correspondence between error definitions and actual objects. We show the design of the detection operation activated by the failure information and the architecture of the proposed system.
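
The lookup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `ErrorNode`, `find_sources`, `entity_db`, and `is_faulty`, as well as the sample errors and objects, are hypothetical, and the rule "the deepest confirmed error is the source" is an assumption about how the causal relations might be used.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ErrorNode:
    """One error definition; `causes` are the errors that may cause it (assumed layout)."""
    name: str
    causes: List["ErrorNode"] = field(default_factory=list)


def find_sources(node, entity_db, is_faulty):
    """Return (error, object) pairs for the deepest confirmed errors under `node`."""
    # Map the error definition to actual objects via the (hypothetical) database.
    faulty = [obj for obj in entity_db.get(node.name, []) if is_faulty(node.name, obj)]
    if not faulty:
        return []  # this error is not occurring, so its causes are not explored
    deeper = []
    for cause in node.causes:
        deeper.extend(find_sources(cause, entity_db, is_faulty))
    # If no cause is confirmed, this error itself is treated as the source.
    return deeper or [(node.name, obj) for obj in faulty]


# Hypothetical error relationship tree: a missing response may be caused
# by a link failure or by a server failure.
no_response = ErrorNode("no_response", [ErrorNode("link_down"), ErrorNode("server_down")])

# Hypothetical database content: error definition -> monitored objects.
entity_db = {
    "no_response": ["web-app"],
    "link_down": ["eth0", "eth1"],
    "server_down": ["host-a"],
}

# Stand-in for real probes: the set of (error, object) pairs currently observed.
observed = {("no_response", "web-app"), ("link_down", "eth1")}

# The application reports "no_response"; the result set is returned to it.
result = find_sources(no_response, entity_db, lambda err, obj: (err, obj) in observed)
```

In this sketch the application passes the failure it recognized as the root of the search, and the returned list plays the role of the result set sent back to the application.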