Automatic classification of academic documents using text mining techniques

2012 
In this work an automatic classifier of undergraduate final projects based on text mining is presented. The dataset, comprising documents from four professional categories, was represented by means the vector space model with different index metrics. Also, a number of techniques for reduction dimensionality were applied over the word space. In order to construct the classification model the K-nearest neighbor algorithm was applied. Using 10-fold cross-validations we could obtain 82% of predictive accuracy. However, we achieved an accuracy of 95% with a recommendation of up to two categories taking into account the interdisciplinary in documents. This classifier was integrated into an application for automatic assignment of reviewers, which performs this assignation from teachers who belong to the areas recommended.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    25
    References
    3
    Citations
    NaN
    KQI
    []