Systolic arrays for dynamic programming in speech recognition systems

1983 
A systolic array for performing dynamic programming is described, assuming certain characteristics of speech and bit-serial data paths. It is assumed that words may be modelled as linear sequences of acoustic kernels, and that each iteration of the systolic array must occur every ten milliseconds (matching the frame rate of our likelihood calculating engine). We wish to do the dynamic programming in as parallel a fashion as possible, which requires a good deal of silicon, but we have a long period of time (ten milliseconds) in which to carry out the computation. Hence, bit-serial data paths are used, since they minimize interconnection and gate requirements at the expense of requiring more execution time. Gate array and custom VLSI designs are contemplated, and it is found that large vocabularies may be supported with a small number of chips.

This paper describes a systolic array approach to designing one portion of a continuous speech recognition system: a dynamic programming-based best path search. In this system, we represent the possible words which make up an utterance in a grammar graph, similar to the approach taken with the CMU DRAGON system [1]. Each instance of a word in this grammar graph is an instantiation of a word model, which consists of a linear sequence of HARPY-like acoustic kernel models [3]. Associated with each acoustic kernel model are a template, a minimum duration, and a maximum duration (all of which refer to centisecond frames). Other word-instance-specific information is kept for the best path search, as described below.
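To make the word model concrete, the following is a minimal sequential sketch in Python of a best path search over a single word model, not the systolic array itself. It assumes that each frame-to-template match is summarized by a non-negative cost (for example a negative log likelihood), and that every kernel must occupy between its minimum and maximum duration in consecutive frames. The names `Kernel`, `best_path_cost`, and `frame_costs` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List

INF = float("inf")


@dataclass
class Kernel:
    """Hypothetical acoustic kernel model: durations are counted in frames."""
    name: str
    min_dur: int
    max_dur: int


def best_path_cost(kernels: List[Kernel], frame_costs: List[List[float]]) -> float:
    """Return the minimum total cost of aligning all frames to the kernels in order.

    frame_costs[t][k] is the assumed cost of matching frame t against the
    template of kernel k. The result is INF if no alignment satisfies the
    duration limits.
    """
    num_frames = len(frame_costs)
    # best[t]: minimum cost of explaining the first t frames with the
    # kernels processed so far (none yet, so only t = 0 is reachable).
    best = [0.0] + [INF] * num_frames
    for k, kernel in enumerate(kernels):
        # Prefix sums give the cost of any run of frames on kernel k.
        prefix = [0.0]
        for t in range(num_frames):
            prefix.append(prefix[-1] + frame_costs[t][k])
        new_best = [INF] * (num_frames + 1)
        for end in range(1, num_frames + 1):
            for dur in range(kernel.min_dur, kernel.max_dur + 1):
                start = end - dur
                if start < 0:
                    break  # longer durations would start before frame 0
                if best[start] == INF:
                    continue  # previous kernels cannot end at this frame
                candidate = best[start] + prefix[end] - prefix[start]
                if candidate < new_best[end]:
                    new_best[end] = candidate
        best = new_best
    return best[num_frames]


# Two kernels, three frames; costs are made-up illustration values.
example_kernels = [Kernel("a", 1, 2), Kernel("b", 1, 3)]
example_costs = [[1, 5], [4, 1], [6, 2]]
example_result = best_path_cost(example_kernels, example_costs)
```

In this toy input the cheapest alignment assigns the first frame to kernel `a` and the remaining two to kernel `b`. A hardware realization would evaluate many such recurrences in parallel once per frame, whereas this sketch processes a whole utterance at once purely for illustration.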