DAST: An aggregation scheme for crowdsensed indoor data exploiting sequential long-tail features

2021 
In the era of big data, information about the same objects can be gathered and accumulated from multiple sources (i.e., crowdworkers) through so-called crowdsensing. In particular, in indoor positioning systems based on Bluetooth fingerprints, multiple crowdworkers are required to collect information about the Bluetooth beacons at the reference points and the corresponding received signal strength indicators (RSSI). Because the properties and bias of each crowdworker are unknown, it is challenging to estimate the reliability of each worker/source appropriately and to aggregate the data truthfully. Moreover, the collected data has two properties. First, it follows a long-tail distribution: most of the data is gathered by a few sources, while the large majority of crowdworkers each provide only a small amount. Second, it is time-sequential: the truth about the crowdsensing tasks evolves smoothly over time. In response to this problem and these data features, this paper proposes DAST, an accurate data aggregation mechanism that incorporates sequential long-tail characteristics. Specifically, we infer each source's credibility from an estimated confidence interval that depends on the amount of data the source has historically provided. Meanwhile, to capture the sequential characteristics of the data, the data accumulated in the previous period is used as a virtual source when computing the new aggregated value for the current period. Thorough simulations on artificial and real data demonstrate that DAST outperforms existing schemes, including Confidence-Aware Truth Discovery (CATD), Precision-Recall (PrecRec) and Dynamic Truth Discovery (DynaTD), in terms of mean absolute error (MAE) and root mean square error (RMSE).
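The two ideas in the abstract, confidence-interval-based source weights and a virtual source carrying the previous period's aggregate, can be illustrated with a minimal Python sketch. This is not the DAST algorithm itself, whose exact formulas are not given here. It assumes a normal-approximation upper confidence bound on each source's error variance (so sources with few claims are weighted conservatively) and a simple iterative weighted average. All names (`source_weights`, `aggregate`, `__previous__`, `Z`, `eps`) and the sample RSSI-like values are hypothetical.

```python
import math

Z = 1.96  # assumed normal quantile for a 95% confidence level


def source_weights(claims, truths, eps=1e-6):
    """Weight each source by the inverse of an upper bound on its error variance.

    claims: {source: {task: value}}, truths: {task: value}.
    The bound var_hat * (1 + Z * sqrt(2 / n)) is a normal approximation chosen
    for this sketch; it widens as the number of claims n shrinks, which
    penalises long-tail sources that provide little data.
    """
    weights = {}
    for src, obs in claims.items():
        n = len(obs)
        if n == 0:
            continue  # a source with no claims cannot be scored
        sse = sum((v - truths[t]) ** 2 for t, v in obs.items())
        var_hat = sse / n + eps  # eps avoids division by zero
        upper = var_hat * (1.0 + Z * math.sqrt(2.0 / n))
        weights[src] = 1.0 / upper
    return weights


def aggregate(claims, prev_truths=None, iters=20):
    """Alternate between weight estimation and weighted averaging.

    If prev_truths is given, it joins the sources as a virtual source
    (hypothetical key "__previous__") so the estimate evolves smoothly.
    """
    sources = dict(claims)
    if prev_truths:
        sources["__previous__"] = dict(prev_truths)
    tasks = {t for obs in sources.values() for t in obs}
    # Start from the unweighted mean of all claims per task.
    truths = {}
    for t in tasks:
        vals = [obs[t] for obs in sources.values() if t in obs]
        truths[t] = sum(vals) / len(vals)
    weights = {}
    for _ in range(iters):
        weights = source_weights(sources, truths)
        for t in tasks:
            num = sum(weights[s] * obs[t] for s, obs in sources.items() if t in obs)
            den = sum(weights[s] for s, obs in sources.items() if t in obs)
            truths[t] = num / den
    return truths, weights


# Hypothetical readings: two consistent workers and one sparse outlier.
claims = {
    "a": {"t1": 10.0, "t2": 20.0, "t3": 30.0},
    "b": {"t1": 10.2, "t2": 19.8, "t3": 30.1},
    "c": {"t1": 16.0},
}
truths, weights = aggregate(claims)
```

In this toy input, worker `c` contributes a single, deviating claim, so its variance bound is wide and its weight stays small. Passing the previous period's result as `prev_truths` adds it as one more weighted source rather than hard-coding a smoothing factor, which is one plausible reading of the "virtual source" idea.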