IAS-BERT: An Information Gain Association Vector Semi-supervised BERT Model for Sentiment Analysis

2021 
With the popularity of large-scale corpora, statistics-based models have become the mainstream models in Natural Language Processing (NLP). Bidirectional Encoder Representations from Transformers (BERT), as one of these models, has achieved excellent results on various NLP tasks since its emergence. However, it still has shortcomings, such as a limited capability for extracting local features and exploding training gradients. After analyzing these shortcomings of BERT, this paper proposes an Information-gain Association Vector Semi-supervised Bidirectional Encoder Representations from Transformers (IAS-BERT) model, which improves the capability of capturing local features. Considering the influence of a feature's polarity on overall sentiment and the association between word embeddings, we apply information gain to the training corpus. The information-gain results are then used as annotations on the training corpus to generate new word embeddings. At the same time, we use forward matching to reduce the computational overhead of IAS-BERT. We evaluate the model on a sentiment analysis dataset, where it achieves good results.
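For context, the standard information-gain criterion for feature selection in text classification (the abstract does not give the paper's exact formulation, so this is the conventional definition the corpus-annotation step presumably builds on) measures how much observing a term $t$ reduces uncertainty about the sentiment class $C$:

$$
IG(t) = H(C) - \big[ P(t)\,H(C \mid t) + P(\bar{t})\,H(C \mid \bar{t}) \big]
      = -\sum_{c} P(c)\log P(c)
        + P(t)\sum_{c} P(c \mid t)\log P(c \mid t)
        + P(\bar{t})\sum_{c} P(c \mid \bar{t})\log P(c \mid \bar{t}),
$$

where $\bar{t}$ denotes the absence of the term. Terms with high $IG(t)$ are strongly associated with a sentiment polarity, which is consistent with using the scores to annotate the corpus before generating new word embeddings.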