Data Reduction using Similarity Class and Enhanced Tolerance Relation for Complete and Incomplete Information Systems

2019 
Research on data analytics faces a new challenge: huge volumes of data must be processed in a timely manner. Computational resources are strained, however, when some of the data are redundant, inconsistent, noisy, or incomplete. Reducing data size is therefore imperative, particularly by removing redundant and incomplete data. In this paper, we present data reduction approaches based on rough set theory: one uses the similarity class between two objects, and the other uses an enhanced tolerance relation for incomplete or imprecise data. Data structured as both complete and incomplete information systems are discussed. We present a comparative analysis and experimental results that compare the proposed approaches with baseline approaches in terms of reduction rate and accuracy. We found that the proposed approaches compare favorably, with a high reduction rate for complete information systems and high accuracy for incomplete information systems.
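To make the two settings concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes the standard rough-set tolerance relation for incomplete systems (two objects are tolerant when their values agree on every attribute where both are known), because the abstract does not define the enhanced relation. The function names, the use of `None` as the missing-value marker, and the definition of reduction rate as the fraction of rows removed are all assumptions made for illustration.

```python
from typing import Hashable, List, Optional, Sequence, Set

MISSING = None  # assumed marker for an unknown attribute value

Row = Sequence[Optional[Hashable]]


def tolerant(x: Row, y: Row) -> bool:
    """Standard tolerance relation (assumed, not the paper's enhanced one):
    values must agree on every attribute where both are known."""
    return all(a == b or a is MISSING or b is MISSING for a, b in zip(x, y))


def tolerance_class(table: List[Row], i: int) -> Set[int]:
    """Indices of all objects tolerant with object i (always includes i)."""
    return {j for j, row in enumerate(table) if tolerant(table[i], row)}


def drop_indiscernible(table: List[Row]) -> List[Row]:
    """Complete system: keep the first object of each group of identical rows."""
    seen = set()
    kept = []
    for row in table:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


def reduction_rate(original: List[Row], reduced: List[Row]) -> float:
    """Assumed definition: fraction of rows removed by the reduction."""
    return 1 - len(reduced) / len(original)


# Hypothetical toy data, not taken from the paper.
complete = [("a", "x"), ("a", "x"), ("b", "y"), ("a", "x")]
reduced = drop_indiscernible(complete)

incomplete = [("a", MISSING), ("a", "x"), ("b", "x")]
classes = [tolerance_class(incomplete, i) for i in range(len(incomplete))]
```

In the toy complete system, three of the four rows are identical, so one representative of that group and the distinct row remain. In the incomplete system, the first object is tolerant with the second because its unknown value could match, while the third differs on a known attribute and stays in a class of its own.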