Phonetic clustering based confidence measure for embedded speech recognition

2010 
Confidence measures (CM) based on word posterior probability (WPP) have been applied successfully in LVCSR tasks. However, for embedded speech recognition, where system resources are limited, both the performance of the CM and the efficiency of the algorithm need to be considered. One of the most important issues in calculating WPP is how to obtain a reliable estimate of the normalization term. In this paper we investigate several methods for estimating the normalization term, focusing on methods that use different phone-based grammars. Furthermore, to achieve a good trade-off between performance and efficiency on embedded systems, we present a general approach to estimating WPP-based confidence scores using data-driven phonetic clustering, in which Kullback-Leibler divergence (KLD) is employed to group all phones into clusters. The corresponding acoustic and language models for calculating the CM score can then be re-trained on the clustered phones. Experimental results on different Mandarin command word and digit recognition tasks show that the proposed method significantly improves efficiency with little degradation in CM performance, saving more than 90% of the processing time of the CM module.
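The KLD-based grouping of phones described above can be illustrated with a small sketch. The abstract does not specify the phone models, the exact distance, or the clustering procedure, so everything here is an assumption made for illustration: each phone is represented by a single diagonal-covariance Gaussian, the distance is the symmetrized KLD, and clusters are merged bottom-up with average linkage. The phone labels, the values in `TOY_PHONES`, and the function names are hypothetical.

```python
import math
from itertools import combinations


def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two Gaussians with diagonal covariance."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def symmetric_kld(a, b):
    """Symmetrized KLD between two (mean, variance) phone models."""
    return (kl_diag_gauss(a[0], a[1], b[0], b[1])
            + kl_diag_gauss(b[0], b[1], a[0], a[1]))


def cluster_phones(phone_models, num_clusters):
    """Merge phones bottom-up until num_clusters groups remain.

    Assumption: average linkage over symmetrized KLD; the paper may
    use a different merging criterion.
    """
    clusters = [[name] for name in sorted(phone_models)]

    def linkage(c1, c2):
        dists = [symmetric_kld(phone_models[x], phone_models[y])
                 for x in c1 for y in c2]
        return sum(dists) / len(dists)

    while len(clusters) > num_clusters:
        # Find the closest pair of clusters (i < j) and merge j into i.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters


# Hypothetical 2-dimensional phone models: name -> (mean, variance).
TOY_PHONES = {
    "b": ([0.0, 0.0], [1.0, 1.0]),
    "p": ([0.2, 0.1], [1.0, 1.0]),
    "a": ([5.0, 5.0], [1.0, 1.0]),
    "o": ([5.3, 4.8], [1.0, 1.0]),
}

if __name__ == "__main__":
    print(cluster_phones(TOY_PHONES, 2))
```

With the toy values, the two acoustically close pairs end up in separate clusters. In a real system the reduced phone set would then be used to re-train the smaller acoustic and language models that compute the normalization term.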