Design and performance of the virtualization platform for offline computing on the ATLAS TDAQ Farm

2014 
With the LHC collider at CERN currently going through the period of Long Shutdown 1 there is an opportunity to use the computing resources of the experiments' large trigger farms for other data processing activities. In the case of the ATLAS experiment, the TDAQ farm, consisting of more than 1500 compute nodes, is suitable for running Monte Carlo (MC) production jobs that are mostly CPU and not I/O bound. This contribution gives a thorough review of the design and deployment of a virtualized platform running on this computing resource and of its use to run large groups of CernVM based virtual machines operating as a single CERN-P1 WLCG site. This platform has been designed to guarantee the security and the usability of the ATLAS private network, and to minimize interference with TDAQ's usage of the farm. Openstack has been chosen to provide a cloud management layer. The experience gained in the last 3.5 months shows that the use of the TDAQ farm for the MC simulation contributes to the ATLAS data processing at the level of a large Tier-1 WLCG site, despite the opportunistic nature of the underlying computing resources being used.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    3
    References
    3
    Citations
    NaN
    KQI
    []