Containerized Analyses Enable Interactive and Reproducible Statistics.

2021 
In recent decades the analysis of data has become increasingly computational, and this has changed how scientific and statistical work is shared. For example, it is now commonplace for the underlying analysis code and data to be proffered alongside journal publications and conference talks. Unfortunately, sharing code faces several challenges. First, it is often difficult to take code from one computer and run it on another, because of configuration, version, and dependency issues. Second, even if the code runs, it is often hard to understand or interact with the analysis. This makes it difficult to assess the code and its findings, for example, in a peer review process. In this paper we advocate for two practical approaches to help make sharing interactive and reproducible analyses easy: (1) analysis containerization, a technology that fully encapsulates an analysis, together with its data, code, and dependencies, in a shareable format, and (2) code notebooks, an accessible format for interacting with third-party analyses. We demonstrate that the combination of these two technologies is powerful and that containerizing interactive code notebooks can help make it easy for statisticians to share code, analyses, and ideas.