Sample Driven Data Mapping for Linked Data and Web APIs

2020 
Data integration is essential for building an RDF Knowledge Base that is as comprehensive as possible. Many different data sources are used to extend a given dataset or to correct errors in the data. Nowadays, Web APIs (instead of data dumps) are common external data sources, since many data providers make their data publicly available. However, the classic problem of data integration, i.e., determining which parts of the datasets can be mapped to each other, remains. In addition, Web APIs are often more restrictive than data dumps and slower to access due to latencies and other constraints. In this paper we demonstrate the FiLiPo (Finding Linkage Points) system, which automatically finds connections (i.e., linkage points) between Web APIs and local Knowledge Bases in a reasonable amount of time. To this end, we developed a sample-driven schema matching system that models Web API services as parameterized queries. These Web API services return a view definition of their data, which subsequently needs to be connected to the local database schema. Furthermore, our approach is able to find valid input values for Web API services automatically (e.g., IDs) and can determine combined linkage points (e.g., first and last name) despite different structures. Our results on six real-world API services with two local databases show that our linkage point detection algorithm performs well in terms of precision (0.89 up to 1.0) and recall (0.69 up to 1.0).
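The general idea of sample-driven matching against a parameterized Web API can be illustrated with a minimal Python sketch. This is not FiLiPo's actual algorithm: the function names (`flatten`, `similar`, `find_linkage_points`), the similarity measure (`difflib.SequenceMatcher`), the thresholds, and the toy records and fake API below are all assumptions made for illustration only. The sketch calls the API with one property of each sampled local record, compares every returned value with every local value, and keeps the (local property, API path) pairs that match in a sufficient share of the answered samples.

```python
from collections import Counter
from difflib import SequenceMatcher


def flatten(obj, prefix=""):
    """Yield (path, value) pairs for every leaf of a nested JSON-like object."""
    if isinstance(obj, dict):
        for key, val in obj.items():
            yield from flatten(val, f"{prefix}.{key}" if prefix else key)
    elif isinstance(obj, list):
        for item in obj:
            yield from flatten(item, prefix + "[*]")
    else:
        yield prefix, str(obj)


def similar(a, b, threshold=0.9):
    """Case-insensitive string similarity check (assumed measure and threshold)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def find_linkage_points(records, call_api, input_property, min_support=0.5):
    """Return {(local_property, api_path): support} for candidate linkage points.

    `call_api` stands in for a parameterized query: it takes one input value
    and returns a JSON-like response, or None if the value is not accepted.
    """
    hits = Counter()
    answered = 0
    for record in records:
        response = call_api(record[input_property])
        if response is None:
            continue  # invalid input value for this service
        answered += 1
        leaves = list(flatten(response))
        matched = set()  # count each pair at most once per sampled record
        for prop, local_value in record.items():
            for path, api_value in leaves:
                if similar(str(local_value), api_value):
                    matched.add((prop, path))
        hits.update(matched)
    if answered == 0:
        return {}
    return {pair: n / answered for pair, n in hits.items()
            if n / answered >= min_support}


# Hypothetical sample of local records and a fake API used only for this sketch.
LOCAL_RECORDS = [
    {"isbn": "111", "title": "Dune", "author": "Frank Herbert"},
    {"isbn": "222", "title": "Neuromancer", "author": "William Gibson"},
    {"isbn": "333", "title": "Solaris", "author": "Stanislaw Lem"},
]
FAKE_API = {
    "111": {"id": "111", "name": "Dune", "year": 1965,
            "creator": {"name": "Frank Herbert"}},
    "222": {"id": "222", "name": "Neuromancer", "year": 1984,
            "creator": {"name": "William Gibson"}},
}


def call_fake_api(isbn):
    return FAKE_API.get(isbn)


links = find_linkage_points(LOCAL_RECORDS, call_fake_api, input_property="isbn")
for (prop, path), support in sorted(links.items()):
    print(f"{prop} <-> {path} (support {support:.2f})")
```

Working on a small sample keeps the number of API calls low, which matters because Web APIs are slower and more restrictive than data dumps. Combined linkage points (e.g., first and last name stored separately on one side) would need an additional step that this sketch does not attempt.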