Seer-Dock: A General-Purpose Dockerized Scholarly Document Collection and Management Framework

2021 
Harvesting, managing, and analyzing thematic document collections is a major challenge in a wide variety of applications. While the criteria for compiling such collections differ from case to case, the overall process is largely standardized. It is therefore inefficient to build a new system for these tasks each time. In this work, we introduce Seer-Dock, a novel and easy-to-deploy general-purpose dockerized framework for building a scholarly document harvesting and management system. It is based on CiteSeerX, the most widely used scholarly search engine. Seer-Dock uses Docker containers for all components and thus enables its users to rapidly deploy a full-fledged document collection and management system on any operating system platform and tailor it to the specific needs of an application domain. Moreover, it is easy to scale, orchestrate, maintain, and recover. In this resource paper, we introduce the architecture of Seer-Dock and its components. Like its kernel CiteSeerX, Seer-Dock is available under an Apache 2 open-source license.