An Efficient Long Chinese Text Sentiment Analysis Method Using BERT-Based Models with BiGRU

2021 
There is an urgent need, yet it remains a significant challenge, to perform sentiment analysis on long Chinese texts. BERT-based pre-trained language models (PLMs) have been demonstrated to be the state-of-the-art approach for sentiment analysis. However, BERT can only process 510 tokens at a time, limiting the accuracy of sentiment analysis on long texts. Meanwhile, existing long-text truncation methods that address this BERT deficiency remain weak at capturing core sentiments. To better solve long text sentiment analysis, we propose a BERT-based fusion model. First, we devise a new truncation method to obtain four types of embeddings, and we resample the dataset to correct the imbalanced label distribution. Second, $\mathcal{N}$ BERT-based models are leveraged to jointly learn the above four embeddings. Third, we adopt a BiGRU network to fuse the $\mathcal{N}$ BERT-based models and further apply an attention mechanism to extract the core sentiments of long Chinese texts. Finally, we employ three ensemble algorithms to optimize our model, improving Micro and Macro F1 by 1.68% and 1.53%, respectively.
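The paper's implementation is not reproduced here; the following is a minimal PyTorch sketch of the pipeline the abstract describes, under stated assumptions: the four embedding types are taken to be four hypothetical truncation views of a long input (head, tail, head+tail, and a middle window), each view is encoded by its own BERT (the $\mathcal{N}$ BERT-based models, with $\mathcal{N}=4$ here), and a BiGRU with additive attention fuses the per-view pooled vectors. All names and the choice of views are illustrative, not the authors' exact method.

    # Illustrative sketch only -- not the authors' code. Assumes four
    # truncation views, one BERT per view, and BiGRU + attention fusion.
    import torch
    import torch.nn as nn
    from transformers import BertModel

    MAX_TOKENS = 510  # BERT reserves 2 of its 512 positions for [CLS]/[SEP]

    def truncation_views(tokens, limit=MAX_TOKENS):
        """Return four hypothetical truncation views of a long token list."""
        head = tokens[:limit]
        tail = tokens[-limit:]
        head_tail = tokens[:limit // 2] + tokens[-(limit - limit // 2):]
        mid_start = max(0, (len(tokens) - limit) // 2)
        middle = tokens[mid_start:mid_start + limit]
        return [head, tail, head_tail, middle]

    class BertBiGRUFusion(nn.Module):
        def __init__(self, n_views=4, hidden=256, n_classes=2):
            super().__init__()
            # One BERT encoder per view (the N BERT-based models).
            self.encoders = nn.ModuleList(
                BertModel.from_pretrained("bert-base-chinese")
                for _ in range(n_views)
            )
            dim = self.encoders[0].config.hidden_size
            self.bigru = nn.GRU(dim, hidden, batch_first=True,
                                bidirectional=True)
            self.attn = nn.Linear(2 * hidden, 1)  # additive attention scores
            self.classifier = nn.Linear(2 * hidden, n_classes)

        def forward(self, views):
            # views: list of n_views dicts, each holding input_ids and
            # attention_mask tensors of shape (batch, seq_len).
            pooled = [enc(**v).pooler_output
                      for enc, v in zip(self.encoders, views)]
            seq = torch.stack(pooled, dim=1)   # (batch, n_views, dim)
            out, _ = self.bigru(seq)           # (batch, n_views, 2*hidden)
            weights = torch.softmax(self.attn(out), dim=1)
            fused = (weights * out).sum(dim=1) # attention-weighted fusion
            return self.classifier(fused)

Treating the four pooled vectors as a length-4 sequence lets the BiGRU model interactions between views, while the attention weights indicate which view contributes most to the core sentiment; the ensemble and resampling steps mentioned in the abstract are omitted from this sketch.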