Health Database Oriented Word Alignment for Machine Translation Based on Generalized Intersection

2013 
Data analysis and processing for health databases is highly valuable, and word alignment plays an important role in it. Health databases contain many medical terms, and existing word alignment methods perform poorly on them because adequate term dictionaries are lacking. This paper proposes a word alignment method between Chinese and Japanese for health databases. The method is based on generalized intersection over the set form of a sentence-level aligned bilingual corpus. We use the GI (generalized intersection) model to align words. The GI model includes an algorithm based on generalized intersection operations on word sets, and it uses a special stop-word set to further improve recall. Experimental results indicate that the GI model performs well on health databases with large numbers of medical terms, as well as on language pairs with few linguistic resources, such as Chinese and Japanese.
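The abstract does not define the generalized intersection operation, so the following is only a minimal sketch of the underlying idea using ordinary set intersection, not the paper's GI algorithm. It assumes each sentence pair is already tokenized into word sets; the function name `align_by_intersection`, the toy corpus, and the stop-word set are hypothetical and chosen for illustration.

```python
def align_by_intersection(corpus, source_word, target_stop_words=frozenset()):
    """Return target-side candidates for source_word.

    corpus: list of (source_word_set, target_word_set) sentence pairs.
    This uses plain set intersection as a stand-in (an assumption);
    the paper's generalized intersection is not specified in the abstract.
    """
    # Collect the target-side word set of every pair whose source side has the word.
    target_sets = [set(tgt) for src, tgt in corpus if source_word in src]
    if not target_sets:
        return set()
    # Words shared by all of those target sentences are alignment candidates.
    candidates = set.intersection(*target_sets)
    # Drop function words that co-occur with almost everything.
    return candidates - set(target_stop_words)


# Toy Chinese-Japanese corpus (hypothetical data, pre-tokenized).
corpus = [
    ({"患者", "有", "糖尿病"}, {"患者", "は", "糖尿病", "です"}),
    ({"糖尿病", "需要", "治疗"}, {"糖尿病", "は", "治療", "が", "必要"}),
]
japanese_stop_words = {"は", "が", "です"}

aligned = align_by_intersection(corpus, "糖尿病", japanese_stop_words)
print(aligned)
```

In this sketch the stop-word set only filters out spurious candidates such as particles that appear in every sentence. How the paper's special stop-word set improves recall is not described in the abstract and is not modeled here.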