Parameter Learning for Statistical Machine Translation Using CMA-ES

2015 
Minimum error rate training (MERT) is probably still the most widely used parameter learning algorithm in statistical machine translation [1] (SMT). However, it does not support the use of large number of learning features (e.g. 30 features or more). Moreover, acting on parameter space, MERT is only a local optimization algorithm. In this paper, we investigate for the first time the use of metaheuristics and global optimization techniques for the problem of learning parameters in SMT. In particular, We replace MERT with the well-known meta-heuristics for global optimization called CovarianceMatrixAdaptation Evolution Strategy (CMAES) [2]. We test the effectiveness of CMA-ES by conducting SMT experiments on an English-Vietnamese corpus. The results show that the improved SMT system using CMA-ES achieved superior BLEU scores compared to the baseline SMT system using MERT both on the dev and test data sets.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    19
    References
    0
    Citations
    NaN
    KQI
    []