ML-MDLText: An efficient and lightweight multilabel text classifier with incremental learning

2020 
Abstract Single-label text classification has been extensively studied over the last decades, and attention has usually been given to offline learning scenarios, where all of the training data is available in advance. However, real-world text classification problems often involve multilabel instances and dynamic textual patterns that can change frequently. In this context, methods must predict a subset of target labels rather than a single one, and should ideally be able to update their models incrementally, using limited time and memory, so that they scale and adapt to changes in data patterns. In this study, we present a text classification method based on the minimum description length principle that can be applied to multilabel classification without requiring transformation of the classification problem. It also takes advantage of dependency information among labels and naturally supports online learning. We evaluated its performance on fifteen datasets from different application domains and compared it with traditional benchmark classifiers under three online learning scenarios. Even without problem transformation, the results obtained by the proposed method were very competitive with existing state-of-the-art online learning methods and with methods that transform multilabel problems into several single-label ones.
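To make the general idea concrete, the sketch below illustrates one possible (and heavily simplified) reading of MDL-style multilabel text classification with incremental updates: each label keeps running term counts, a document is scored by an approximate description length under each label's counts, and learning is just a count update. The class name, Laplace smoothing, and thresholded multilabel decision rule are all assumptions for illustration; they are not the formulation or decision rule used by ML-MDLText in the paper.

```python
import math
from collections import defaultdict


class MDLMultilabelSketch:
    """Illustrative sketch only (not the authors' ML-MDLText): score each label by an
    approximate description length of a tokenized document under that label's term
    counts, and learn incrementally by accumulating counts per observed label."""

    def __init__(self):
        self.term_counts = defaultdict(lambda: defaultdict(int))  # label -> term -> count
        self.total_counts = defaultdict(int)                      # label -> total terms seen
        self.vocab = set()

    def description_length(self, tokens, label):
        # Code length (in bits) of the document under a smoothed term distribution
        # for `label`. Assumption: simple Laplace smoothing stands in for the
        # paper's own MDL formulation.
        total = self.total_counts[label] + len(self.vocab) + 1
        bits = 0.0
        for t in tokens:
            p = (self.term_counts[label].get(t, 0) + 1) / total
            bits += -math.log2(p)
        return bits

    def predict(self, tokens, max_bits_per_token=8.0):
        # Multilabel decision: assign every label whose average code length per
        # token falls below a fixed threshold. Assumption: the paper's decision
        # rule and its use of label-dependency information differ from this.
        n = max(len(tokens), 1)
        scores = {y: self.description_length(tokens, y) / n for y in self.term_counts}
        return [y for y, s in scores.items() if s < max_bits_per_token]

    def partial_fit(self, tokens, labels):
        # Incremental (online) update: no retraining, just add the document's
        # term counts to each of its observed labels.
        self.vocab.update(tokens)
        for y in labels:
            for t in tokens:
                self.term_counts[y][t] += 1
            self.total_counts[y] += len(tokens)


# Example usage: train on a stream of (tokens, labels) pairs, predicting as we go.
clf = MDLMultilabelSketch()
stream = [
    (["cheap", "pills", "offer"], ["spam"]),
    (["meeting", "agenda", "minutes"], ["work"]),
    (["offer", "meeting", "discount"], ["spam", "work"]),
]
for tokens, labels in stream:
    print(clf.predict(tokens), "->", labels)
    clf.partial_fit(tokens, labels)
```

The point of the sketch is the shape of the approach: scoring by compression cost per label avoids binary-relevance style problem transformation, and the model state is just counts, which is what makes constant-time incremental updates natural.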