Vietnamese spelling detection and correction using Bi-gram, Minimum Edit Distance, SoundEx algorithms with some additional heuristics

2008 
The spelling checking problem is considered to contain two main phases: the detecting phase and the correcting phase. In this paper, we present a new approach for Vietnamese spelling checking based on Vietnamese characteristics for each phase. Our research approach includes the use of a syllable Bi-gram in combination with parts of speech (POS) to find out suspected syllables. In the correcting phase, we based on minimum edit distance, SoundEx algorithms and some heuristics to build a weight function for assessing suggestion candidates. The training corpus and the test set were collected from e-newspapers.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    12
    References
    6
    Citations
    NaN
    KQI
    []