Lightly supervised word-sense translation-error detection and resolution in an interactive conversational spoken language translation system
2015
Lexical ambiguity can cause critical failure in conversational spoken language translation (CSLT) systems that rely on statistical machine translation (SMT) if the wrong sense is presented in the target language. Interactive CSLT systems offer the capability to detect and pre-empt such word-sense translation errors (WSTEs) by engaging the human operators in a precise clarification dialogue aimed at resolving the problem. This paper presents an end-to-end framework for accurate detection and interactive resolution of WSTEs to minimize communication errors due to ambiguous source words. We propose (a) a novel, extensible, two-level classification architecture for identifying potential WSTEs in SMT hypotheses; (b) a constrained phrase-pair clustering mechanism for identifying the translated sense of ambiguous source words in SMT hypotheses; and (c) an interactive strategy that integrates this information to request specific clarifying information from the operator. By leveraging unsupervised and lightly supervised learning techniques, our approach minimizes the need for expensive human annotation in developing each component of this framework. Each component, as well as the overall framework, was evaluated in the context of an interactive English-to-Iraqi Arabic CSLT system.
Keywords:
- Machine translation
- Computer science
- Supervised learning
- Ambiguity
- Linguistics
- Natural language processing
- Artificial intelligence
- Architecture
- Computer-assisted translation
- Spoken language
- Speech translation
- Transfer-based machine translation
- Speech recognition
- Annotation
- Cluster analysis
- Operator (computer programming)
- Correction