Probabilistic model for truth discovery with mean and median check framework

2021 
Abstract: In the era of big data, information can be collected from various sources. Unfortunately, information provided by multiple sources on the same entity is inevitably conflicting. Due to the ubiquitous existence of data conflicts, truth discovery has recently attracted considerable attention. Several truth discovery methods focus on providing a point estimate for the truth of each entity and exhibit completely different performances on the same input dataset. Therefore, an appropriate truth discovery method should be adopted to fit the unknown source reliability distributions. To address this, we approach truth discovery from another perspective. We theoretically verify that if the absolute distance between the mean and median value is large, then there must be incorrect claims with large errors in the input dataset. Accordingly, we propose a mean and median check (MMC) framework for truth detection, error claim removal, and iteration-stopping criteria. The experiments demonstrate that MMC can effectively remove incorrect claims provided by unreliable sources. Furthermore, the performance of state-of-the-art truth discovery methods can be significantly improved if MMC is used for input data preprocessing.
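The mean and median check described in the abstract can be illustrated with a small sketch. The abstract does not give the MMC algorithm itself, so everything here is an assumption made for illustration: the function names `mean_median_gap` and `mmc_filter`, the fixed `threshold`, the rule of dropping the claim farthest from the median, and the sample claims. It only shows the underlying idea that a large gap between mean and median signals a claim with a large error, and that the gap can double as a stopping criterion.

```python
from statistics import mean, median


def mean_median_gap(claims):
    """Absolute distance between the mean and the median of the claims."""
    return abs(mean(claims) - median(claims))


def mmc_filter(claims, threshold):
    """Remove suspicious claims until the mean-median gap is small.

    Hypothetical rule, not the paper's algorithm: while the gap exceeds
    `threshold`, drop the single claim farthest from the current median.
    The gap falling to or below `threshold` is the stopping criterion.
    """
    kept = list(claims)
    removed = []
    # Keep at least two claims so the check remains meaningful.
    while len(kept) > 2 and mean_median_gap(kept) > threshold:
        m = median(kept)
        worst = max(kept, key=lambda x: abs(x - m))
        kept.remove(worst)
        removed.append(worst)
    return kept, removed


# Made-up claims from five sources about one entity; one is far off.
claims = [20.1, 19.8, 20.3, 20.0, 35.0]
kept, removed = mmc_filter(claims, threshold=0.5)
print("kept:", kept)
print("removed:", removed)
```

In this toy input the outlying claim pulls the mean well away from the median, so it is removed first; once it is gone the remaining claims pass the check and the loop stops. The filtered claims could then be handed to any truth discovery method as preprocessed input.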