On gLite WMS/LB monitoring and management through WMSMonitor

2010 
The Workload Management System (WMS) is the gLite service that supports the distributed production and analysis activities of various high-energy physics (HEP) experiments. It is responsible for dispatching computing jobs to remote computing facilities by matching job requirements against the resource status information collected from the Grid information services. Given the distributed and heterogeneous nature of the Grid, monitoring the job lifecycle and the aggregate workflow patterns generated by multiple user communities, and ensuring the reliability of the service, are of great importance. In this paper we address the problem of WMS monitoring and management. We present the architecture and implementation of WMSMonitor, a tool for WMS monitoring and management designed to meet the needs of several categories of WMS users: administrators, developers, advanced Grid users and performance testers. The tool was successfully deployed to monitor the progress of WMS job submission activities during HEP computing challenges. We also describe how, for each WMS in a cluster, WMSMonitor produces status indexes and a load metric that can be used for automated notification of critical events via Nagios, or for ranking service instances deployed in load-balancing mode.
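
The two uses of the per-instance status index and load metric mentioned above (Nagios notification and ranking of instances) can be illustrated with a minimal Python sketch. This is not WMSMonitor's actual implementation: the `WmsStatus` record, the host names, the value ranges and the thresholds are all assumptions made for illustration. Only the Nagios plugin exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL) follow the standard plugin convention.

```python
from __future__ import annotations

from dataclasses import dataclass

# Standard Nagios plugin exit codes.
NAGIOS_OK, NAGIOS_WARNING, NAGIOS_CRITICAL = 0, 1, 2


@dataclass
class WmsStatus:
    """Hypothetical per-instance record; field names and ranges are assumptions."""
    host: str
    load_metric: float   # assumed normalised to [0, 1]; higher means busier
    status_index: int    # assumed encoding: 0 = healthy, 1 = degraded, 2 = critical


def nagios_state(status: WmsStatus, warn: float = 0.7, crit: float = 0.9) -> int:
    """Map one instance's status to a Nagios exit code (illustrative thresholds)."""
    if status.status_index >= 2 or status.load_metric >= crit:
        return NAGIOS_CRITICAL
    if status.status_index == 1 or status.load_metric >= warn:
        return NAGIOS_WARNING
    return NAGIOS_OK


def rank_instances(statuses: list[WmsStatus]) -> list[str]:
    """Return usable hosts, least loaded first, skipping critical instances."""
    usable = [s for s in statuses if nagios_state(s) != NAGIOS_CRITICAL]
    return [s.host for s in sorted(usable, key=lambda s: s.load_metric)]


if __name__ == "__main__":
    cluster = [
        WmsStatus("wms01", 0.35, 0),
        WmsStatus("wms02", 0.95, 0),
        WmsStatus("wms03", 0.10, 1),
    ]
    for s in cluster:
        print(s.host, nagios_state(s))
    print(rank_instances(cluster))
```

In this sketch a critical instance is excluded from the ranking entirely, while a degraded one remains eligible; a real deployment might weight the two signals differently.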