A First Study on the Use of Noise Filtering to Clean the Bags in Multi-Instance Classification

2018 
Real-world data is far from perfect. Noise is a common issue that arises from the limitations of data acquisition mechanisms and of human knowledge. In classification, label noise hinders the performance of classifiers by inducing a bias in the model built. While label noise has recently attracted the attention of researchers in standard classification, its study in multi-instance classification has only just begun. In this work, we propose the use of a filtering algorithm for multi-instance classification that is able to reduce the impact of negative instances within the bags. To do so, we decompose the bags to form a standard classification problem that can be treated efficiently by a specialized noise filter. The bags are then rebuilt without the eliminated instances. Our experiments show that applying this approach diminishes the impact of noise and, for several classifiers, even yields better results at the 0% noise level. Our approach opens a promising way to deal with noise in the bags of multi-instance datasets and to further improve the classification rate of the models constructed.
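The decompose, filter, and rebuild procedure described above can be illustrated with a minimal sketch. The abstract does not name the specialized noise filter or say how instance labels are assigned, so two assumptions are made here purely for illustration: each instance inherits the label of its bag, and a simple k-nearest-neighbour editing rule stands in for the filter. All function names are hypothetical and not taken from the paper.

```python
from collections import Counter
from math import dist


def flatten_bags(bags, bag_labels):
    """Decompose bags into a flat instance-level problem.

    Assumption: every instance inherits the label of its bag.
    Returns the instances, their labels, and the index of the owning bag.
    """
    X, y, owner = [], [], []
    for bag_idx, (bag, label) in enumerate(zip(bags, bag_labels)):
        for instance in bag:
            X.append(instance)
            y.append(label)
            owner.append(bag_idx)
    return X, y, owner


def knn_keep_mask(X, y, k=3):
    """Stand-in filter (assumption, not the paper's filter): keep an instance
    only if its label agrees with the majority label of its k nearest neighbours."""
    keep = []
    for i, xi in enumerate(X):
        others = (j for j in range(len(X)) if j != i)
        nearest = sorted(others, key=lambda j: dist(xi, X[j]))[:k]
        majority = Counter(y[j] for j in nearest).most_common(1)[0][0]
        keep.append(majority == y[i])
    return keep


def rebuild_bags(bags, X, owner, keep):
    """Rebuild the bags without the eliminated instances.

    Assumption: a bag that the filter would empty is restored unchanged,
    so that no bag disappears from the dataset.
    """
    rebuilt = [[] for _ in bags]
    for instance, bag_idx, kept in zip(X, owner, keep):
        if kept:
            rebuilt[bag_idx].append(instance)
    return [new if new else list(old) for new, old in zip(rebuilt, bags)]


def filter_bags(bags, bag_labels, k=3):
    """Full pipeline: decompose, filter at instance level, rebuild."""
    X, y, owner = flatten_bags(bags, bag_labels)
    keep = knn_keep_mask(X, y, k)
    return rebuild_bags(bags, X, owner, keep)


# Toy data: the first positive bag contains one instance that lies
# in the region occupied by the negative bags.
bags = [
    [(5.0, 5.0), (5.1, 5.0), (0.1, 0.0)],
    [(5.0, 5.1), (5.1, 5.1)],
    [(0.0, 0.0), (0.0, 0.1), (0.1, 0.1)],
    [(0.2, 0.0), (0.2, 0.1)],
]
bag_labels = [1, 1, 0, 0]
cleaned = filter_bags(bags, bag_labels, k=3)
```

In this toy example the filter drops only the instance `(0.1, 0.0)` from the first bag, and the bag labels are left untouched. Any instance-level noise filter could be substituted for `knn_keep_mask` without changing the surrounding decompose and rebuild steps.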