Text classification algorithm based on sparse distributed representation

2020 
The performance of automatic text classification depends to a great extent on the training data. However, real-world data often contains noise, and removing that noise entirely is often difficult, expensive, or time-consuming. To address this problem, a novel text classification algorithm is proposed based on sparse distributed representation (SDR), which is extremely tolerant to noise. The algorithm first creates a class-SDR for each class label by merging category feature vectors with a subsampling technique. Then, the algorithm assigns a class label to a document by comparing the overlap value between the document's SDR and each class-SDR. The experimental results show that, when trained on noisy data, the algorithm performs better than six frequently used text classification algorithms.
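The two steps described in the abstract (building one class-SDR per label, then classifying by overlap) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how feature vectors are encoded or how the subsampling is done, so the hashed token encoding, the SDR width `N_BITS`, the `BITS_PER_TOKEN` value, the `max_active` cutoff, and the toy training data are all assumptions made for illustration. Here an SDR is represented as the set of its active bit positions, and "subsampling" is approximated by keeping only the bits that are active most often across a class's documents.

```python
import hashlib
from collections import Counter

N_BITS = 2048        # assumed SDR width (not given in the abstract)
BITS_PER_TOKEN = 4   # assumed number of active bits contributed per token


def token_bits(token):
    """Map a token to a small, deterministic set of active bit positions."""
    bits = set()
    for i in range(BITS_PER_TOKEN):
        digest = hashlib.sha256(f"{token}:{i}".encode("utf-8")).digest()
        bits.add(int.from_bytes(digest[:4], "big") % N_BITS)
    return bits


def doc_sdr(tokens):
    """Document SDR: union of the active bits of its tokens."""
    sdr = set()
    for token in tokens:
        sdr |= token_bits(token)
    return sdr


def build_class_sdr(doc_sdrs, max_active=16):
    """Merge document SDRs of one class, keeping the most frequent bits.

    Stand-in for the paper's subsampling step: bits that appear in few
    documents (for example, from mislabeled ones) tend to be dropped.
    """
    counts = Counter()
    for sdr in doc_sdrs:
        counts.update(sdr)
    return {bit for bit, _ in counts.most_common(max_active)}


def classify(tokens, class_sdrs):
    """Return the label whose class-SDR overlaps most with the document SDR."""
    sdr = doc_sdr(tokens)
    return max(class_sdrs, key=lambda label: len(sdr & class_sdrs[label]))


# Toy training set (hypothetical); the last "sports" document is label noise.
train = {
    "sports": [["goal", "match", "team"], ["team", "score", "goal"],
               ["match", "league", "team"], ["stock", "market"]],
    "finance": [["stock", "market", "bank"], ["bank", "rate", "market"],
                ["stock", "rate", "bond"]],
}
class_sdrs = {label: build_class_sdr([doc_sdr(d) for d in docs])
              for label, docs in train.items()}

print(classify(["team", "goal", "league"], class_sdrs))
print(classify(["stock", "market"], class_sdrs))
```

In this toy setup the mislabeled "stock market" document contributes only low-frequency bits to the sports class-SDR, so a query with those tokens still overlaps more with the finance class-SDR. How closely this mirrors the behavior reported in the paper depends on the encoding and subsampling details, which the abstract does not specify.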