A prosodically guided speech understanding strategy

1975 
Our strategy for computer understanding of speech uses prosodic features to break up continuous speech into sentences and phrases and locate stressed syllables in those phrases. The most reliable phonetic data are obtained by performing a distinguishing features analysis within the stressed syllables and by locating sibilants and other robust information in unstressed syllables. The numbers and locations of syntactic boundaries and stressed syllables are used to select likely syntactic and semantic structures, within which words are hypothesized to correspond to the partial distinguishing features matrices obtained from the segmental analyses. Portions of this strategy have been implemented and tested with hundreds of seconds of speech, involving fifteen talkers. A program for detecting syntactic boundaries from fall-rise patterns in fundamental frequency contours correctly detected over 90 percent of all predicted boundaries. An algorithm for locating stressed syllables (from fundamental frequency contours and high-energy syllabic nuclei) correctly located the nuclei of over 85 percent of all those syllables perceived as stressed by a panel of listeners. A study of segmental analysis results obtained by several other research groups showed that phonetic recognition is clearly most successful in the stressed syllables. Procedures for classification of stressed vowels, location and classification of sibilants, and location of stops, nasals, and [r]-like sounds have been implemented. Prosodic aids to parsing and semantic analysis are being investigated.
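
The idea of detecting boundaries from fall-rise patterns in a fundamental frequency contour can be illustrated with a minimal sketch. This is not the program described above: the function name, the frame-based input format, and the 7 percent relative threshold are all assumptions made for illustration. The sketch marks a candidate boundary at each F0 valley that is preceded by a sufficiently large fall and followed by a sufficiently large rise, skipping unvoiced frames.

```python
from typing import List, Optional


def find_fall_rise_boundaries(
    f0: List[Optional[float]], min_change: float = 0.07
) -> List[int]:
    """Return frame indices of F0 valleys bracketed by a fall and a rise.

    f0: one fundamental frequency value (Hz) per frame; None marks an
        unvoiced frame (hypothetical input format).
    min_change: illustrative relative threshold; the fall must drop at
        least this fraction below the preceding peak, and the rise must
        climb at least this fraction above the valley.
    """
    boundaries: List[int] = []
    peak: Optional[float] = None    # highest F0 seen since the last boundary
    valley: Optional[tuple] = None  # (frame index, F0) of the current valley

    for i, value in enumerate(f0):
        if value is None:
            continue  # unvoiced frame: no F0 information
        if peak is None:
            peak = value
            continue
        if valley is None:
            if value > peak:
                peak = value  # contour still rising
            elif value <= peak * (1.0 - min_change):
                valley = (i, value)  # significant fall has begun
        else:
            if value < valley[1]:
                valley = (i, value)  # fall continues; move the valley
            elif value >= valley[1] * (1.0 + min_change):
                boundaries.append(valley[0])  # significant rise: fall-rise found
                peak = value
                valley = None
    return boundaries


if __name__ == "__main__":
    contour = [120, 125, 130, 122, 110, 100, None, None,
               104, 115, 128, 126, 118, 105, 98, 102, 112, 120]
    print(find_fall_rise_boundaries(contour))
```

A real detector would also need smoothing of the contour and handling of F0 perturbations around consonants; the sketch omits both to keep the fall-rise logic visible.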