Lightly supervised word-sense translation-error detection and resolution in an interactive conversational spoken language translation system

2015 
Lexical ambiguity can cause critical failure in conversational spoken language translation (CSLT) systems that rely on statistical machine translation (SMT) if the wrong sense is presented in the target language. Interactive CSLT systems offer the capability to detect and pre-empt such word-sense translation errors (WSTEs) by engaging the human operators in a precise clarification dialogue aimed at resolving the problem. This paper presents an end-to-end framework for accurate detection and interactive resolution of WSTEs to minimize communication errors due to ambiguous source words. We propose (a) a novel, extensible, two-level classification architecture for identifying potential WSTEs in SMT hypotheses; (b) a constrained phrase-pair clustering mechanism for identifying the translated sense of ambiguous source words in SMT hypotheses; and (c) an interactive strategy that integrates this information to request specific clarifying information from the operator. By leveraging unsupervised and lightly supervised learning techniques, our approach minimizes the need for expensive human annotation in developing each component of this framework. Each component, as well as the overall framework, was evaluated in the context of an interactive English-to-Iraqi Arabic CSLT system.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    42
    References
    2
    Citations
    NaN
    KQI
    []