Experimental study of support vector machines and Naïve Bayes classifier on automated subject area classification

2017 
Subject area classification allows researchers to identify publications by discipline or research domain. When the number of documents is large, classifying publications becomes increasingly difficult. Moreover, manually covering a broad range of subject areas at a fine granularity is a critical problem. In recent years, machine learning has emerged as an effective approach to automated classification in various domains such as text, images, and videos. The problems of classifying a large number of publications can be addressed by automating subject area classification with supervised machine learning. This paper presents an experimental study that used support vector machines and Naive Bayes for automated classification of subject areas. A text classification method is used to estimate the probability that a document belongs to a certain category based on co-words and their frequency in the document. The experiment consisted of two phases. In the first phase, a list of co-words was generated from a collection of documents in each of the selected subject areas using text pre-processing techniques. In the second phase, both support vector machine (SVM) and Naive Bayes classifiers were used to conduct the experiment, and the performance of each method was observed. It was found that SVM performs better than the Naive Bayes classifier in multi-label classification.
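The frequency-based probability estimate described above can be illustrated with a minimal multinomial Naive Bayes sketch in plain Python. This is not the authors' implementation: the whitespace tokenizer, the add-one (Laplace) smoothing, the function names, and the toy documents and subject areas are all assumptions chosen for illustration, and single words stand in for the co-word lists used in the study.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumed pre-processing: lowercase and split on whitespace.
    return text.lower().split()


def train_naive_bayes(labeled_docs):
    """Count documents per class and word frequencies per class.

    labeled_docs is a list of (text, subject_area) pairs.
    """
    class_doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_docs:
        class_doc_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_doc_counts, word_counts, vocabulary


def predict(model, text):
    """Return the subject area with the highest log posterior score."""
    class_doc_counts, word_counts, vocabulary = model
    total_docs = sum(class_doc_counts.values())
    scores = {}
    for label, doc_count in class_doc_counts.items():
        # Log prior: share of training documents in this class.
        log_prob = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen in training
            count = word_counts[label][word]
            # Add-one smoothing avoids zero probabilities.
            log_prob += math.log((count + 1) / (total_words + len(vocabulary)))
        scores[label] = log_prob
    return max(scores, key=scores.get)


# Hypothetical toy corpus; not data from the study.
TOY_DOCS = [
    ("gene protein cell expression", "biology"),
    ("cell membrane protein folding", "biology"),
    ("algorithm graph complexity proof", "computer science"),
    ("neural network training algorithm", "computer science"),
]
model = train_naive_bayes(TOY_DOCS)
```

Calling `predict(model, "protein expression in the cell")` scores each subject area by its prior and by the smoothed frequency of each known word, then returns the highest-scoring area. An SVM comparison would typically feed the same word-frequency vectors to a margin-based classifier instead.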