Preservation System for Scientific Experiments in High Performance Computing: Challenges and Proposed Concept

2019 
The continuously growing number of research experiments using High Performance Computing (HPC) raises questions of research data management, in particular how to preserve a scientific experiment, including all related data, over the long term so that it can be reproduced in the future. This paper covers some of the challenges and possible solutions related to the preservation of scientific experiments on HPC systems and presents a concept of a preservation system for HPC computations. Storing the experiment itself with some related data is not enough for its future reproduction, especially in the long term. For that case, preservation of the experiment's whole environment (operating system, libraries used, environment variables, input data, etc.) via containerization technology (e.g. Docker, Singularity) is proposed. This approach makes it possible to preserve the entire environment, but it is not feasible on every HPC system because of security issues. It also leaves open the question of how to deal with commercial software that was used within the experiment. As a possible solution, we propose to run the preservation process outside the computing system, on a web server, and to replace all commercial software inside the created experiment image with open source analogues, which should allow future reproduction of the experiment without legal issues. A prototype of such a system was developed; the paper provides the scheme of the system, describes its main features, and reports the first experimental results and further research steps.
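
As an illustration of what recording an experiment's environment might involve, the following minimal Python sketch collects the operating system description, selected environment variables, and checksums of input data into a manifest. It is not the system described in the paper: the function names `sha256_of` and `build_manifest` and the manifest layout are assumptions made for this example only, and a containerized image would capture far more (installed libraries, file system state) than this manifest does.

```python
import hashlib
import json
import os
import platform
import sys


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(input_files, env_names):
    """Collect a hypothetical environment manifest for one experiment.

    input_files: paths of the experiment's input data files.
    env_names:   names of environment variables worth recording;
                 unset variables are stored as None.
    """
    return {
        "operating_system": platform.platform(),
        "python_version": sys.version.split()[0],
        "environment_variables": {
            name: os.environ.get(name) for name in env_names
        },
        "input_data": {str(path): sha256_of(path) for path in input_files},
    }


if __name__ == "__main__":
    # Example call with no input files; PATH is used only as a sample variable.
    print(json.dumps(build_manifest([], ["PATH"]), indent=2))
```

Storing checksums alongside the input data allows a later reproduction attempt to verify that the preserved inputs are unchanged before the experiment is rerun.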