Schema Matching Incorporating Attribute Distribution Features

2010 
This paper presents a new web schema matching algorithm that incorporates attribute distribution features. Attribute distribution features include the mutually exclusive feature and the co-occurring feature. By discovering mutually exclusive attribute pairs and various statistics of each pair, the mutually exclusive feature is calculated as an indication of the semantic similarity of the pair. To make use of features based on name similarity and value similarity, the attribute distribution features are combined with these traditional similarity-based features through machine learning techniques. After potentially matching attribute pairs are discovered, the paper introduces the co-occurring feature as a constraint on clustering and solves the web schema matching problem with constrained attribute clustering algorithms. Experiments across a wide variety of domains show F-score improvements ranging from 0.13 to 0.55.
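
The abstract does not specify the clustering procedure, so the following is only a minimal sketch of the general idea of using co-occurrence as a clustering constraint, not the paper's algorithm. It assumes that two attributes appearing in the same schema should never be matched (a cannot-link constraint) and uses a simple greedy merge. The names `co_occurring_pairs`, `constrained_clusters`, the `similarity` callback, and the `threshold` parameter are all hypothetical.

```python
from itertools import combinations


def co_occurring_pairs(schemas):
    """Return every attribute pair that appears together in some schema.

    Assumption: attributes of one schema describe different concepts,
    so such pairs are treated as cannot-link constraints.
    """
    pairs = set()
    for schema in schemas:
        for a, b in combinations(sorted(set(schema)), 2):
            pairs.add((a, b))
    return pairs


def constrained_clusters(schemas, similarity, threshold=0.5):
    """Greedily merge attribute clusters, skipping forbidden merges.

    `similarity(a, b)` is a caller-supplied score in [0, 1]; in the
    paper's setting it would come from a learned combination of
    features, which is not reproduced here.
    """
    cannot_link = co_occurring_pairs(schemas)
    attrs = sorted({a for schema in schemas for a in schema})
    clusters = [{a} for a in attrs]

    # Candidate pairs, best score first.
    candidates = sorted(
        ((similarity(a, b), a, b) for a, b in combinations(attrs, 2)),
        reverse=True,
    )
    for score, a, b in candidates:
        if score < threshold:
            break
        ca = next(c for c in clusters if a in c)
        cb = next(c for c in clusters if b in c)
        if ca is cb:
            continue
        # Reject the merge if any cross-cluster pair co-occurs in a schema.
        if any((min(x, y), max(x, y)) in cannot_link for x in ca for y in cb):
            continue
        clusters.remove(cb)
        ca |= cb
    return clusters
```

In this sketch a merge is rejected as soon as any member of one cluster co-occurs with any member of the other, so the constraint propagates through clusters that were merged earlier.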