Unsupervised text pattern learning using minimum description length
2010
Knowledge of the text patterns in a domain-specific corpus is valuable in many natural language processing (NLP) applications, such as information extraction and question answering. In this paper, we propose a simple but effective probabilistic language model for modeling the indecomposability of text patterns. Under the minimum description length (MDL) principle, we implement an efficient unsupervised learning algorithm, and an experiment on an English critical writing corpus shows promising coverage of patterns compared with human summaries.