An Active Learning Framework Based on Query-By-Committee for Sentiment Analysis

2019 
Social networks are a major source of information about people’s opinions and sentiments on different topics, since users spend hours on social media every day sharing their views. However, labeling the abundant unlabeled data takes too much time. To reduce annotation cost, this paper applies an active learning framework for sentiment analysis, called Active Learning for Sentiment Analysis (AL4SA), which integrates active learning and machine learning into a single framework. First, AL4SA trains a committee of multiple weak learners. The committee members then select the most informative samples from the unlabeled data using the vote entropy strategy, and the model’s performance is improved by incorporating the newly labeled samples in each iteration. We compare AL4SA with baseline machine learning methods, namely KNN, Naive Bayes, Gradient Boosting Decision Tree (GBDT), and Support Vector Machine (SVM), on the hotel review corpus. The results show that the cost of labeling data can be cut by at least half and that AL4SA outperforms the baselines, because the committee members select the most informative samples.
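The vote entropy selection step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy votes, the natural logarithm, and the batch size are all assumptions made for illustration. For one sample, vote entropy is computed as -Σ (V(c)/C) log(V(c)/C), where V(c) is the number of committee members voting for label c and C is the committee size; samples on which the committee disagrees most score highest and are sent for annotation.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Vote entropy of one sample, given one predicted label per committee member."""
    committee_size = len(votes)
    entropy = 0.0
    for count in Counter(votes).values():
        share = count / committee_size  # fraction of members voting for this label
        entropy -= share * math.log(share)  # natural log is an assumption
    return entropy


def select_queries(votes_per_sample, batch_size):
    """Return indices of the batch_size samples with the highest vote entropy."""
    scores = [vote_entropy(votes) for votes in votes_per_sample]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]


# Hypothetical votes from a four-member committee on three unlabeled reviews.
votes_per_sample = [
    ["pos", "pos", "pos", "pos"],  # unanimous, so no disagreement
    ["pos", "neg", "pos", "neg"],  # evenly split, so maximal disagreement
    ["neg", "neg", "neg", "pos"],  # one dissenting member
]
queries = select_queries(votes_per_sample, batch_size=2)
```

In a full loop, the selected samples would be labeled by an annotator, added to the training set, and the committee retrained before the next round of selection.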