The design and implementation of an excellent text categorization system

2002 
Based on a study of text classification techniques, we propose a new text categorization method that uses a weight adjustment measure to improve the vector space model and the naive Bayesian classifier, and we implement an experimental text classification system, CWZ, to compare various text classification approaches. CWZ performs much better than many commercial text classification systems. We introduce its framework, functions, main modules, and running environment, present our experimental results, and discuss several important technical issues involved in the system in order to draw some useful conclusions. We also describe how to improve the vector space model and the naive Bayesian classifier.
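For readers unfamiliar with the vector space model named in the abstract, the following is a minimal baseline sketch, not the CWZ system or its weight adjustment measure, neither of which is specified here. It assumes whitespace tokenization, a smoothed TF-IDF weighting, and nearest-centroid classification by cosine similarity; the function names (`train_centroids`, `classify`, and so on) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, purely for illustration.
    return text.lower().split()


def train_centroids(docs):
    """Build IDF weights and one summed TF-IDF vector per class.

    docs is a list of (text, label) pairs.
    """
    n = len(docs)
    df = Counter()
    for text, _ in docs:
        df.update(set(tokenize(text)))
    # Smoothed IDF (an assumed, commonly used variant).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    centroids = defaultdict(Counter)
    for text, label in docs:
        for term, tf in Counter(tokenize(text)).items():
            centroids[label][term] += tf * idf[term]
    return idf, dict(centroids)


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, idf, centroids):
    # Terms never seen in training are ignored.
    counts = Counter(tokenize(text))
    vec = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


if __name__ == "__main__":
    training = [
        ("stock market shares rise", "finance"),
        ("bank interest rates", "finance"),
        ("football match goal", "sports"),
        ("team wins the match", "sports"),
    ]
    idf, centroids = train_centroids(training)
    print(classify("market rates", idf, centroids))
```

A weight adjustment scheme such as the one the abstract mentions would presumably change how the per-term weights in `train_centroids` are computed; that step is left as the plain TF-IDF product here because the abstract gives no details.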