language-icon Old Web
English
Sign In

dCache, Sync-and-Share for Big Data

2015 
The availability of cheap, easy-to-use sync-and-share cloud services has split the scientific storage world into the traditional big data management systems and the very attractive sync-and-share services. With the former, the location of data is well understood while the latter is mostly operated in the Cloud, resulting in a rather complex legal situation.Beside legal issues, those two worlds have little overlap in user authentication and access protocols. While traditional storage technologies, popular in HEP, are based on X.509, cloud services and sync-and-share software technologies are generally based on username/password authentication or mechanisms like SAML or Open ID Connect. Similarly, data access models offered by both are somewhat different, with sync-and-share services often using proprietary protocols.As both approaches are very attractive, dCache.org developed a hybrid system, providing the best of both worlds. To avoid reinventing the wheel, dCache.org decided to embed another Open Source project: OwnCloud. This offers the required modern access capabilities but does not support the managed data functionality needed for large capacity data storage.With this hybrid system, scientists can share files and synchronize their data with laptops or mobile devices as easy as with any other cloud storage service. On top of this, the same data can be accessed via established mechanisms, like GridFTP to serve the Globus Transfer Service or the WLCG FTS3 tool, or the data can be made available to worker nodes or HPC applications via a mounted filesystem. As dCache provides a flexible authentication module, the same user can access its storage via different authentication mechanisms; e.g., X.509 and SAML. Additionally, users can specify the desired quality of service or trigger media transitions as necessary, thus tuning data access latency to the planned access profile. Such features are a natural consequence of using dCache.We will describe the design of the hybrid dCache/OwnCloud system, report on several months of operations experience running it at DESY, and elucidate the future road-map.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    2
    References
    0
    Citations
    NaN
    KQI
    []