Opinion Mining: An Approach to Feature Engineering

2019 
Sentiment analysis, or opinion mining, is the process of identifying and categorizing subjective information in source materials using natural language processing (NLP), text analytics and statistical linguistics. Its main purpose is to determine the writer’s attitude towards the topic under discussion. This is done by identifying the polarity of a given passage of text using different feature sets. Feature engineering in the pre-processing phase plays a vital role in improving the performance of a classifier. In this paper we empirically evaluate various feature-weighting mechanisms against well-established classification techniques for opinion mining, i.e. Naive Bayes Multinomial for binary polarity cases and linear SVM (SVM-LIN) for multiclass cases. To evaluate these classification techniques we train the classifiers on the publicly available Rotten Tomatoes movie reviews dataset, which is widely used by the research community for the same purpose. The experiment shows that the feature set containing noun, verb, adverb and adjective lemmas with the feature-frequency (FF) function performs best among all feature settings, with 84% and 85% correctly classified test instances for Naive Bayes and SVM, respectively.
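The best-performing setting described above, feature-frequency weighting fed to a multinomial Naive Bayes classifier, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the documents have already been lemmatized and filtered to nouns, verbs, adverbs and adjectives, it assumes Laplace smoothing with a hypothetical `alpha` parameter, and the four training reviews are invented toy data rather than Rotten Tomatoes reviews. The names `feature_frequency` and `MultinomialNB` are illustrative only.

```python
import math
from collections import Counter, defaultdict


def feature_frequency(tokens):
    """FF weighting: a feature's weight is its raw count in the document."""
    return Counter(tokens)


class MultinomialNB:
    """Toy multinomial Naive Bayes with Laplace smoothing (assumed, not from the paper)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # hypothetical smoothing parameter

    def fit(self, docs, labels):
        self.class_doc_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            ff = feature_frequency(tokens)
            self.feature_counts[label].update(ff)
            self.vocab.update(ff)
        self.total_docs = len(labels)
        return self

    def predict(self, tokens):
        ff = feature_frequency(tokens)
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_doc_counts.items():
            # Log prior of the class.
            score = math.log(n_docs / self.total_docs)
            class_total = sum(self.feature_counts[label].values())
            for feat, count in ff.items():
                if feat not in self.vocab:
                    continue  # ignore features never seen in training
                p = (self.feature_counts[label][feat] + self.alpha) / (
                    class_total + self.alpha * vocab_size
                )
                # The FF weight multiplies the feature's log-likelihood.
                score += count * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data: pre-lemmatized noun/verb/adverb/adjective tokens.
train_docs = [
    ["film", "be", "truly", "wonderful"],
    ["actor", "perform", "beautifully", "wonderful"],
    ["plot", "be", "painfully", "dull"],
    ["film", "drag", "badly", "dull"],
]
train_labels = ["pos", "pos", "neg", "neg"]

model = MultinomialNB().fit(train_docs, train_labels)
print(model.predict(["wonderful", "film"]))
print(model.predict(["dull", "plot"]))
```

With FF weighting a repeated lemma contributes its log-likelihood once per occurrence, which is how it differs from a presence-only (binary) weighting that would count each lemma at most once per document.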