Metadata management in a big data infrastructure

2020 
Abstract The adoption of the Internet of Things (IoT) in industry provides the opportunity to gather valuable data. Nevertheless, this large amount of data must be analyzed to identify patterns, model the behavior of equipment, and enable prediction. Although big data emerged some years ago, many challenges remain to be solved; for example, metadata representation and management are still open research topics. The big data architecture of the RISC data analytics framework combines big data technologies with semantic approaches to process and store large volumes of data from heterogeneous sources, provided by FILL, a key machine tool provider. The proposed architecture is capable of handling sensor data using big data technologies such as Spark on Hadoop, InfluxDB and Elasticsearch. The metadata representation and management approach is adopted to define the structure of, and the relations (i.e., the connections) between, the various data sources provided by the sensors and the logging information system. Moreover, using a metadata approach in our big data environment enhances the RISC data analytics framework by making it generic, reusable and responsive to changes, thus keeping the data lakes up to date and ensuring the validity of the analytics results. The work presented here is part of an ongoing project (BOOST 4.0) currently carried out under the EU H2020 program.