Variegated data swabbing: An improved purge approach for data cleaning

2017 
Errors and inconsistencies in data cause major problems in data warehousing. A robust solution is required to extract relevant data for effective and reliable decision making. In this paper, we therefore propose an efficient Variegated Data Swabbing algorithm that improves the quality of raw data by removing errors, inconsistencies, redundancies, and duplicates from structured data. The proposed algorithm takes two data-sets from two different data sources and integrates them into a single new data-set, removing all duplicate rows as well as rows with missing or NaN values. A spell checker algorithm is then applied to detect misspelled words and to provide suggestions for each of them. The proposed system gives better and more efficient results than the existing algorithm in terms of accuracy, execution time, and space.
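The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `is_missing`, `integrate_and_clean`, and `suggest_spelling`, the row-as-dictionary representation, and the use of `difflib` similarity matching in place of the paper's spell checker are all assumptions made for illustration.

```python
import math
from difflib import get_close_matches


def is_missing(value):
    """Return True for None, blank strings, and float NaN (assumed notion of 'missing')."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and value.strip() == ""


def integrate_and_clean(source_a, source_b):
    """Combine two lists of row dicts, dropping incomplete rows and exact duplicates."""
    seen = set()
    cleaned = []
    for row in list(source_a) + list(source_b):
        # Skip any row that contains a missing or NaN value.
        if any(is_missing(v) for v in row.values()):
            continue
        # A row's identity is its full set of (column, value) pairs.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


def suggest_spelling(word, vocabulary, cutoff=0.8):
    """Return up to three close matches from vocabulary, or [] if the word is known."""
    if word in vocabulary:
        return []
    return get_close_matches(word, vocabulary, n=3, cutoff=cutoff)


# Hypothetical example data from two sources.
source_a = [
    {"name": "Alice", "city": "London"},
    {"name": "Bob", "city": float("nan")},
]
source_b = [
    {"name": "Alice", "city": "London"},
    {"name": "Carol", "city": "Pariss"},
]

rows = integrate_and_clean(source_a, source_b)
suggestions = suggest_spelling("Pariss", ["London", "Paris", "Berlin"])
```

Here the duplicate "Alice" row and the row with a NaN city are dropped, and the misspelled city is flagged with a suggestion rather than corrected automatically, which matches the abstract's description of providing suggestions.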