Semiautomatic Acquisition of Translation Templates from Monolingual Unannotated Chinese Patent Corpus
2013
We propose a data-driven, unsupervised method that semiautomatically extracts translation templates from an unannotated Chinese patent corpus. The method comprises seven steps: morphological analysis, replace, filter, cluster, merge, sort, and edit. After extracting and forming the preliminary templates, we manually edit them to obtain the final templates, which are used in a template-based machine translation system. The experimental results show that the method is effective in improving the quality of machine translation, and that the template-based machine translation system outperforms the conventional rule-based machine translation system without templates.
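To make the order of the automatic steps concrete, the following is a minimal Python sketch of a replace, filter, cluster, merge, and sort pipeline. It is an illustration under stated assumptions, not the method as implemented in the paper: the abstract does not specify what each step does, so the slot marker `<X>`, the `keep` vocabulary, the `min_count` threshold, and the function names are all hypothetical. Morphological analysis is assumed to have been done already (the input is pre-segmented), and the final edit step is manual, so neither appears in the code.

```python
from collections import Counter

SLOT = "<X>"  # hypothetical marker for a variable slot in a template


def replace(tokens, keep):
    """Replace every token outside `keep` with SLOT, collapsing adjacent slots."""
    out = []
    for tok in tokens:
        item = tok if tok in keep else SLOT
        if item == SLOT and out and out[-1] == SLOT:
            continue  # merge a run of replaced tokens into a single slot
        out.append(item)
    return tuple(out)


def extract_templates(sentences, keep, min_count=2):
    """Return (template, count) pairs, most frequent first."""
    # Replace: turn each segmented sentence into a candidate template.
    candidates = [replace(s, keep) for s in sentences]
    # Filter: keep candidates that have at least one slot and one fixed word.
    candidates = [
        c for c in candidates
        if SLOT in c and any(t != SLOT for t in c)
    ]
    # Cluster and merge: identical candidates collapse into one counted entry.
    counts = Counter(candidates)
    # Filter again by frequency, then sort by descending count.
    frequent = [(t, n) for t, n in counts.items() if n >= min_count]
    return sorted(frequent, key=lambda pair: (-pair[1], pair[0]))


# Toy, hand-made input; assumed to be the output of a word segmenter.
sentences = [
    ["本发明", "涉及", "一种", "电池"],
    ["本发明", "涉及", "一种", "显示", "装置"],
    ["所述", "电池", "包括", "外壳"],
]
keep = {"本发明", "涉及", "一种"}  # hypothetical list of fixed template words

templates = extract_templates(sentences, keep)
for template, count in templates:
    print(count, " ".join(template))
```

In this toy run the first two sentences collapse into the same candidate, while the third is reduced to a single slot and is filtered out. The surviving candidates are what a human editor would then review in the manual edit step.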