Automated assessment of prosody production

2009 
Assessment of prosody is important for diagnosis and remediation of speech and language disorders, for diagnosis of neurological conditions, and for foreign language instruction. Current assessment is largely auditory-perceptual, which has obvious drawbacks; however, automation of assessment faces numerous obstacles. We propose methods for automatically assessing production of lexical stress, focus, phrasing, pragmatic style, and vocal affect. Speech was analyzed from children in six tasks designed to elicit specific prosodic contrasts. The methods involve dynamic and global features, using spectral, fundamental frequency, and temporal information. The automatically computed scores were validated against mean scores from judges who, in all but one task, listened to ''prosodic minimal pairs'' of recordings, each pair containing two utterances from the same child with approximately the same phonemic material but differing on a specific prosodic dimension, such as stress. The judges identified the prosodic categories of the two utterances and rated the strength of their contrast. For almost all tasks, we found that the automated scores correlated with the mean scores approximately as well as the judges' individual scores. Real-time scores assigned during examination - as is fairly typical in speech assessment - correlated substantially less than the automated scores with the mean scores.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    58
    References
    36
    Citations
    NaN
    KQI
    []