Feature Extraction for Sentiment Analysis in Indonesian Twitter

2021 
Sentiment analysis on Twitter has recently become one of the most interesting fields of research. It combines natural language processing techniques with data mining. Many algorithms have been proposed to better understand sentiment in text. Proposed methods may focus on the preprocessing step, the dataset splitting method (training and testing), the dataset balancing method (when the data is imbalanced), or improvements to existing algorithms. The main focus of this paper, however, is feature extraction from tweets using TF-IDF. The features obtained from this process are expected to improve classification accuracy. The dataset used in this research is in Indonesian, which differs considerably in form from English. It consists of 1068 manually labeled tweets related to the "school from home" policy prompted by the COVID-19 outbreak, collected from March to July. All steps required to process this data are implemented in Python. To validate the utility of the proposed method, the performance of the classification algorithms is compared. Finally, the results are summarized by reflecting on the impact of including the proposed features on each classification algorithm for sentiment detection.
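As a rough illustration of the TF-IDF feature extraction mentioned above, the following is a minimal sketch using only the Python standard library. It assumes the textbook weighting (term frequency normalized by tweet length, multiplied by log(N / df)); the abstract does not state which TF-IDF variant, tokenizer, or library the paper uses, so the function names `tokenize` and `tfidf` and the example tweets are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase + whitespace tokenization. A real pipeline
    # for tweets would likely also handle URLs, mentions, and stopwords.
    return text.lower().split()


def tfidf(docs):
    """Return (vocabulary, matrix) with one TF-IDF row per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vocab = sorted(df)
    # Textbook inverse document frequency: log(N / df).
    idf = {term: math.log(n_docs / df[term]) for term in vocab}
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty tweets
        matrix.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, matrix


# Hypothetical example tweets, not taken from the paper's dataset.
tweets = [
    "sekolah dari rumah bagus",
    "sekolah dari rumah membosankan",
    "belajar online bagus",
]
vocab, features = tfidf(tweets)
```

Each row of `features` can then be passed to a classifier as the feature vector for the corresponding tweet. Terms that appear in every tweet get a weight of zero under this variant, which is one reason smoothed IDF formulas are often preferred in practice.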