RSV: OSG Fabric Monitoring and Interoperation with WLCG Monitoring Systems

The Open Science Grid (OSG) Resource and Service Validation (RSV) project provides solutions to several grid fabric monitoring problems while also bridging OSG operations and the monitoring infrastructure of the Worldwide LHC Computing Grid (WLCG). RSV-based OSG fabric monitoring begins at the local resource, giving local administrators tools to monitor their status on the OSG using their own monitoring infrastructure. A set of local grid status probes uploads its results to a central collector, so a system administrator can monitor their resource locally while the OSG Operations Center (GOC) watches from a centralized position. Plug-ins relay RSV results to other popular fabric monitoring software (Nagios), giving system administrators the flexibility to stay aware of their grid status through their chosen status display interface. Additional probes are easily developed and plugged into the RSV structure, and the community is encouraged to develop probes that fit the needs of different categories of users (VO, user, software developer). From the GOC, results are transmitted in a specified format to a WLCG message broker, which can then translate these records into critical statistics for the LHC collaborating projects. RSV has succeeded in meeting these initial goals; future development is centred on usability and on extending the project's scope and functionality.