Enhancing the Performance of Part of Speech tagging of Nepali language through Hybrid approach.

2015 
Part-of-speech (POS) tagging is the process of assigning the appropriate part of speech, or lexical category, to each word in a text (corpus), based on both the word's definition and its context, i.e. its relationship with adjacent and related words in a phrase, sentence, or paragraph. POS tagging is an important part of Natural Language Processing (NLP) and is useful for most NLP applications. It is often the first stage of processing, after which further steps such as chunking and parsing are carried out. There are a number of approaches to implementing a POS tagger (1): the rule-based approach, the statistical approach, and the hybrid approach. A rule-based tagger uses linguistic rules to assign the correct tags to the words in a sentence or file. A statistical tagger is based on the probabilities of occurrence of words for a particular tag. A hybrid tagger combines the rule-based and statistical approaches. In this paper, we propose a hybrid approach to POS tagging that integrates a Hidden Markov Model (the statistical approach) with a rule-based method, and achieves an accuracy of 93.15%.
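To make the hybrid idea concrete, the following is a minimal sketch of one way a bigram Hidden Markov Model could be combined with hand-written rules. It is not the tagger described in the paper: the integration strategy (rules constrain the candidate tags of out-of-vocabulary words before Viterbi decoding), the add-one smoothing, the tag names, the English toy corpus, and the helper names `train_hmm`, `rule_tag`, and `hybrid_tag` are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def train_hmm(tagged_sentences):
    """Count tag-transition and word-emission frequencies."""
    trans = defaultdict(lambda: defaultdict(int))  # trans[prev_tag][tag]
    emit = defaultdict(lambda: defaultdict(int))   # emit[tag][word]
    vocab = set()
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    tags = sorted(emit)
    return trans, emit, tags, vocab

def log_prob(counts, key, n_outcomes):
    """Add-one smoothed log probability (smoothing choice is an assumption)."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0) + 1) / (total + n_outcomes))

def rule_tag(word):
    """Hypothetical rules for the English toy data; not the paper's rules."""
    if word.isdigit():
        return "NUM"
    if word.endswith("ly"):
        return "ADV"
    return None

def hybrid_tag(words, trans, emit, tags, vocab):
    """Viterbi decoding in which a rule, if it fires on an unknown word,
    restricts that word's candidate tags to the rule's tag."""
    if not words:
        return []
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra bucket for unknown words
    best = []  # best[i][tag] = (log score, backpointer to previous tag)
    for i, word in enumerate(words):
        forced = rule_tag(word) if word not in vocab else None
        candidates = [forced] if forced in tags else tags
        column = {}
        for tag in candidates:
            e = log_prob(emit[tag], word, n_words)
            if i == 0:
                column[tag] = (log_prob(trans["<s>"], tag, n_tags) + e, None)
            else:
                score, prev = max(
                    (best[i - 1][p][0] + log_prob(trans[p], tag, n_tags), p)
                    for p in best[i - 1]
                )
                column[tag] = (score + e, prev)
        best.append(column)
    # Follow backpointers from the best final tag.
    tag = max(best[-1], key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

# Toy training data (illustrative only).
TRAIN = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("runs", "VERB"), ("quickly", "ADV")],
    [("dogs", "NOUN"), ("eat", "VERB"), ("3", "NUM"), ("bones", "NOUN")],
]
trans, emit, tags, vocab = train_hmm(TRAIN)
print(hybrid_tag(["the", "cat", "runs", "slowly"], trans, emit, tags, vocab))
```

In this sketch the statistical model handles every word it has seen in training, and the rules only step in for unseen words such as "slowly", where emission counts carry no information. Other integration schemes, such as applying correction rules to the HMM output, would fit the same description of a hybrid tagger.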