A Study on Improving the Performance of Document Classification Using the Context of Terms

2012 
One of the limitations of BOW method is that each term is recognized only by its form, failing to represent the term`s meaning or thematic background. To overcome the limitation, different profiles for each term were defined by thematic categories depending on contextual characteristics. In this study, a specific term was used as a classification feature based on its meaning or thematic background through the process of comparing the context in those profiles with the occurrences in an actual document. The experiment was conducted in three phases; term weighting, ensemble classifier implementation, and feature selection. The classification performance was enhanced in all the phases with the ensemble classifier showing the highest performance score. Also, the outcome showed that the proposed method was effective in reducing the performance bias caused by the total number of learning documents.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    2
    References
    2
    Citations
    NaN
    KQI
    []