Recent progress in the INRS speech recognition system

1995 
The INRS large-vocabulary continuous-speech recognition system employs a two-pass search: first, inexpensive models prune the search space; then a powerful language model and detailed acoustic-phonetic models scrutinize the data. A new fast match with two-phone lookahead and pruning speeds up the search. In language modeling, excluding low-count statistics reduces memory (50% fewer bigrams and 92% fewer trigrams); with Wall Street Journal training texts, excluding single-occurrence bigrams and trigrams with counts below five causes little loss in performance. In acoustic modeling, separate male and female right-context VQ models and a bigram language model are used in the first pass, while right-context continuous models and a trigram language model are used in the second pass. A shared-distribution clustering procedure uses a distortion measure based only on the Gaussian mixture weights of the HMMs. In tests with a 5000-word vocabulary, the word inclusion rate (i.e., the correct word is retained in the first...
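
As a rough illustration of the count-threshold pruning described in the abstract, the Python sketch below drops bigrams seen only once and trigrams seen fewer than five times before language-model estimation. The helper names, thresholds as defaults, and toy corpus are illustrative assumptions; the abstract does not describe the actual INRS implementation or data structures.

from collections import Counter

def count_ngrams(tokens, n):
    # Count all n-grams of order n in a token sequence.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def prune_by_count(ngram_counts, min_count):
    # Keep only n-grams observed at least min_count times.
    return {ng: c for ng, c in ngram_counts.items() if c >= min_count}

# Toy corpus standing in for the Wall Street Journal training text.
tokens = "the market rose as the market fell and the market rose again".split()
bigrams = count_ngrams(tokens, 2)
trigrams = count_ngrams(tokens, 3)

# Thresholds matching the abstract: exclude single-occurrence bigrams
# and trigrams with counts below five.
pruned_bigrams = prune_by_count(bigrams, min_count=2)
pruned_trigrams = prune_by_count(trigrams, min_count=5)

print(len(bigrams), "->", len(pruned_bigrams), "bigrams kept")
print(len(trigrams), "->", len(pruned_trigrams), "trigrams kept")

On real training text of WSJ scale, thresholds like these account for the memory reductions the abstract reports (50% fewer bigrams, 92% fewer trigrams) at little cost in recognition performance.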