The 2009 IBM GALE Mandarin broadcast transcription system

2010 
This paper gives an up-to-date description of the IBM Mandarin broadcast transcription system developed under the DARPA GALE program. Technical advances over our previous system include a novel acoustic modeling approach using subspace Gaussian mixture models, a speaking rate adaptation method using frame rate normalization, and an effective recipe for lattice combination. We present results on three consortium-defined test sets. It is shown that with these advances, the new system attains a 9% relative reduction in character error rate compared to our previous GALE evaluation system. The reported 9.1% error rate on the phase three evaluation set represents the state of the art in Mandarin broadcast speech transcription.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    11
    References
    12
    Citations
    NaN
    KQI
    []