Customized Eager-Lazy Data Cleansing for Satisfactory Big Data Veracity
2021
Big data systems are becoming mainstream for big data management either for batch processing or real-time processing. In order to extract insights from data, quality issues are very important to address, particularly. A veracity assessment model is consequently needed. In this paper, we propose a model which ties quality of datasets and quality of query resultsets. We particularly examine quality issues raised by a given dataset, order attributes along their fitness for use and correlate veracity metrics to business queries. We validate our work using the open dataset NYC taxi’ trips.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
7
References
0
Citations
NaN
KQI