Gated Neural Network with Regularized Loss for Multi-label Text Classification

2019 
Multi-label text classification is generally harder than single-label classification because of its exponentially large output label space and the variability of input documents relative to their number of labels. We observe that some long documents carry only one or two labels, while some short documents are associated with many more. In this paper, we propose a dynamic Gated Neural Network architecture that processes a document through two parts simultaneously: one extracts the most informative semantics and filters out redundant information, while the other captures context semantics and retains most of the information in the document. A gate dynamically combines the semantics from these two parts for the subsequent classification. To improve training, we incorporate label dependencies into the traditional binary cross-entropy loss by exploiting label co-occurrences. Experimental results on the AAPD and RCV1-V2 datasets show that our proposed methods achieve state-of-the-art performance. Further analysis demonstrates that the proposed methods not only capture sufficient feature information from both long and short documents with varying numbers of labels, but also exploit label dependencies to regularize the model and further improve its performance.
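The sketch below illustrates one plausible reading of the two ideas in the abstract: a sigmoid gate that mixes a "filtering" branch with a "context" branch, and a binary cross-entropy loss regularized by label co-occurrence. The concrete layer choices (a max-pooled convolution for the filtering branch, a mean-pooled BiLSTM for the context branch), the regularizer form, and the weight 0.1 are all assumptions for illustration, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class GatedTextClassifier(nn.Module):
    """Minimal sketch of a two-branch gated architecture (assumed layers)."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, num_labels):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Branch 1: extract the most informative semantics and filter
        # redundant information (here: a max-pooled convolution).
        self.conv = nn.Conv1d(embed_dim, hidden_dim, kernel_size=3, padding=1)
        # Branch 2: capture context semantics, keeping most of the
        # document's information (here: a mean-pooled BiLSTM).
        self.lstm = nn.LSTM(embed_dim, hidden_dim // 2, batch_first=True,
                            bidirectional=True)
        # Gate: dynamically mixes the two branch representations.
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, num_labels)

    def forward(self, token_ids):
        x = self.embed(token_ids)                                   # (B, T, E)
        filtered = self.conv(x.transpose(1, 2)).max(dim=2).values  # (B, H)
        context, _ = self.lstm(x)                                   # (B, T, H)
        context = context.mean(dim=1)                               # (B, H)
        g = torch.sigmoid(self.gate(torch.cat([filtered, context], dim=1)))
        fused = g * filtered + (1 - g) * context    # gated mixture
        return self.out(fused)                      # label logits

def regularized_bce(logits, targets, cooccur):
    """BCE plus an assumed co-occurrence regularizer that pulls the
    predicted probabilities of frequently co-occurring labels together."""
    bce = nn.functional.binary_cross_entropy_with_logits(logits, targets)
    probs = torch.sigmoid(logits)
    # cooccur[i, j]: normalized co-occurrence weight of labels i and j.
    diff = probs.unsqueeze(2) - probs.unsqueeze(1)  # (B, L, L) pairwise gaps
    reg = (cooccur * diff.pow(2)).sum(dim=(1, 2)).mean()
    return bce + 0.1 * reg                          # 0.1: assumed trade-off
```

Under these assumptions, the gate lets the model lean on the filtering branch for long documents with few labels and on the context branch for short documents with many labels, which matches the imbalance the abstract observes.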