A New Multi-Phase Algorithm for Stemming in Farsi Language Based on Morphology

2011 
The main goal of stemming is to standardize words by reducing each word to its stem. This paper presents a new stemming algorithm for the Farsi (Persian) language. The stemmer works by removing suffixes and prefixes, and it stores exceptions in a database to decrease the error rate. The proposed method improves both the speed of the stemmer and its error rate. Evaluation results on the prototype document collections show a significant improvement in precision and recall in comparison with other well-known methods.
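To illustrate the general idea of affix removal backed by an exception list, here is a minimal Python sketch. It is not the paper's algorithm: the affix lists, the exception table, the minimum stem length, and the function name `stem` are all assumptions chosen for illustration, and the paper's multi-phase rules are not reproduced here.

```python
# Illustrative sketch only: the affix lists, exception table, and length
# threshold below are hypothetical and are not taken from the paper.

# Longer affixes come first so that the longest match is tried first.
PREFIXES = ["نمی", "می"]
SUFFIXES = ["ترین", "های", "ها", "تر", "ان"]

# Exception "database": words that only look affixed map to their correct stem.
EXCEPTIONS = {"تهران": "تهران"}

# Do not strip an affix if fewer than this many characters would remain.
MIN_STEM_LEN = 2


def stem(word: str) -> str:
    """Return a stem by consulting the exceptions, then stripping affixes."""
    # Phase 1: an exception lookup overrides the affix rules.
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]

    # Phase 2: strip at most one prefix.
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            word = word[len(prefix):]
            break

    # Phase 3: strip at most one suffix.
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            word = word[: -len(suffix)]
            break

    return word


if __name__ == "__main__":
    for w in ["کتابها", "بزرگترین", "میرود", "تهران"]:
        print(w, "->", stem(w))
```

In this sketch the exception lookup runs before any rule is applied, so a word such as "تهران", which ends in a string that resembles a plural suffix, is returned unchanged instead of being truncated. A dictionary lookup is also cheap, which is one plausible way an exception store can reduce errors without slowing the stemmer down.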