A Study of The Value of Parameter N in n gram Statistical Model in Chinese Language
1998
Abstract As a major statistical model, the n-gram has been applied extensively in language processing (such as POS tagging, language modeling for speech recognition, character recognition, etc.). However, there is still no definitive conclusion as to which value of N is optimal for Chinese language processing. This paper introduces a method for estimating the choice of the parameter N in n-gram models of Chinese. Three factors are analyzed to compare different values of N: the approximate expression of Chinese grammatical structure, the reconstruction of new words, and performance in transcribing Chinese Pinyin sequences to text. The conclusion is that 4 is a better choice of N for word-based n-gram models of Chinese. This result should be helpful for the development of Chinese statistical language models and language processing.
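To make the role of the parameter N concrete, the following is a minimal sketch of a word-based n-gram estimate, in which the probability of a word is conditioned on the N-1 words before it. It is not the paper's method: the function names `ngram_counts` and `mle_probability`, the unsmoothed maximum-likelihood estimate, and the tiny pre-segmented corpus are all assumptions chosen for illustration.

```python
from collections import Counter


def ngram_counts(words, n):
    """Count every n-gram (as a tuple) in a list of word tokens."""
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def mle_probability(words, history, target):
    """Unsmoothed estimate of P(target | history), with N = len(history) + 1.

    Hypothetical helper: divides the count of history + target by the
    count of all n-grams that begin with the same history.
    """
    history = tuple(history)
    counts = ngram_counts(words, len(history) + 1)
    # Total occurrences of the history followed by any word.
    total = sum(c for gram, c in counts.items() if gram[:-1] == history)
    if total == 0:
        return 0.0  # history never seen; a real model would smooth or back off
    return counts[history + (target,)] / total


# Toy corpus, already segmented into words (an assumption; real Chinese
# text needs a word segmenter first).
corpus = ["我", "爱", "北京", "天安门", "我", "爱", "北京", "烤鸭"]

p_n3 = mle_probability(corpus, ["我", "爱"], "北京")            # N = 3
p_n4 = mle_probability(corpus, ["我", "爱", "北京"], "天安门")  # N = 4
```

Increasing N lengthens the history, which captures more context but makes each history rarer in the training data; that trade-off is what the choice of N has to balance.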