The Case of a Novel Operational Distributed Storage Service for Big Data in a Semiconductor Wafer Fabrication Foundry

2018 
This paper presents a novel Hadoop-based infrastructural service for big data storage and computing in a Taiwanese semiconductor wafer fabrication foundry. The service, named Hadoop data service (HDS), has been built and operated in production systems for 3.5 years, and it has evolved over that time by incrementally accommodating users' requirements. HDS is a web-based distributed big data storage facility: users access data objects stored in Hadoop through HDS over HTTP. HDS is designed to be scalable and reliable. It also aims for efficiency by selecting either the Hadoop distributed file system (HDFS) or the Hadoop database (HBase) as the store when a data object is published. In addition, HDS is transparent to existing analytics and data inquiry applications, such as Spark and Hive. This paper discusses the design and implementation features of HDS and reports its performance metrics.
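The abstract does not say how HDS chooses between HDFS and HBase. As a minimal sketch only, assuming the choice is driven by object size (a common way to avoid many small files in HDFS), the selection step might look like the following. The threshold value, the function name `choose_backend`, and the backend labels are all hypothetical and are not taken from the paper.

```python
# Hypothetical size threshold (assumption): the abstract does not state
# the criterion HDS uses to pick a storage backend.
SMALL_OBJECT_LIMIT_BYTES = 10 * 1024 * 1024  # 10 MiB, illustrative only


def choose_backend(size_bytes: int, limit: int = SMALL_OBJECT_LIMIT_BYTES) -> str:
    """Return "hbase" for objects up to `limit` bytes, otherwise "hdfs".

    This is an illustrative size-based rule, not the selection logic of HDS.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return "hbase" if size_bytes <= limit else "hdfs"


if __name__ == "__main__":
    # Compare a small object with a large one.
    for size in (4 * 1024, 512 * 1024 * 1024):
        print(size, choose_backend(size))
```

Under this assumed rule, small objects go to a table store and large ones to the file system, so a client publishing over HTTP would not need to know which backend holds a given object.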