Learning name pronunciations in automatic speech recognition systems

2003 
Many speech recognition systems that provide over-the-phone services, e.g., name dialers, stock quote providers, and location finders, rely on the accurate recognition of proper names. For this to happen, the systems need to know how their users will pronounce these words. However, predicting the pronunciation of a proper name is a notoriously difficult problem, as it depends on the origin of the name, the linguistic background of the speaker, and other cultural and sociological factors, in addition, of course, to the spelling of the word. In this paper, we describe a data-driven method that learns proper name pronunciations from audio samples of these words. The algorithm relies on the machinery of a general-purpose speech recognizer to find the phone sequence that best matches the sample speech waveforms. In addition, it incorporates linguistic knowledge automatically acquired from a pronunciation dictionary to ensure that the learned pronunciations are "reasonable" from a linguistic viewpoint. We show on a corporate name dialing database that the proposed algorithm reduces the call routing error rate by 40% compared to a reference letter-to-phone pronunciation engine.
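As a rough illustration of the idea of combining an acoustic match with dictionary-derived linguistic knowledge, the following minimal sketch scores candidate phone sequences by adding a hypothetical acoustic log-likelihood to a phone-bigram prior estimated from a small pronunciation dictionary. The bigram prior, the add-alpha smoothing, the `weight` parameter, the toy dictionary, and the acoustic scores are all assumptions made for illustration; the abstract does not specify how the linguistic constraints are represented or combined with the recognizer.

```python
import math
from collections import Counter


def train_phone_bigram(dictionary, alpha=1.0):
    """Return a function giving add-alpha smoothed phone-bigram log-probabilities.

    `dictionary` maps a word to a tuple of phones (a toy stand-in for a
    pronunciation dictionary; this representation is an assumption).
    """
    bigrams, context_counts, symbols = Counter(), Counter(), set()
    for pron in dictionary.values():
        seq = ["<s>"] + list(pron) + ["</s>"]
        symbols.update(seq)
        for a, b in zip(seq, seq[1:]):
            bigrams[(a, b)] += 1
            context_counts[a] += 1
    vocab = len(symbols)

    def logp(a, b):
        return math.log((bigrams[(a, b)] + alpha) / (context_counts[a] + alpha * vocab))

    return logp


def prior_score(pron, logp):
    """Log-probability of a phone sequence under the bigram prior."""
    seq = ["<s>"] + list(pron) + ["</s>"]
    return sum(logp(a, b) for a, b in zip(seq, seq[1:]))


def pick_pronunciation(candidates, logp, weight=1.0):
    """Pick the candidate maximizing acoustic score plus weighted prior.

    `candidates` maps a phone tuple to an acoustic log-likelihood, which a
    real system would obtain from a speech recognizer (hypothetical here).
    """
    return max(candidates, key=lambda p: candidates[p] + weight * prior_score(p, logp))


# Toy data, invented for illustration only.
dictionary = {
    "cat": ("k", "ae", "t"),
    "cab": ("k", "ae", "b"),
    "tack": ("t", "ae", "k"),
}
logp = train_phone_bigram(dictionary)

# The second candidate matches the audio slightly better but is phonotactically odd.
candidates = {
    ("k", "ae", "t"): -10.0,
    ("k", "t", "ae"): -9.5,
}
best = pick_pronunciation(candidates, logp)
print(best)
```

With `weight=0.0` the choice falls back to the acoustic score alone, which makes it easy to see how much the dictionary-derived prior changes the selected pronunciation.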