Several analyses relating facial motion with perioral muscle behavior and speech acoustics are described. The results suggest that linguistically relevant visual information is distributed over large regions of the face and can be modeled from the same control source as the acoustics.
Keller and Ostry (J. Acoust. Soc. Am., in press) described a microcomputer-based system for the measurement of tongue dorsum movements with pulsed-echo ultrasound. We have recently completed an upgrade of this system to provide facilities for (1) simultaneous ultrasound measurement of any two of lingual, laryngeal, and lateral pharyngeal wall movements, (2) the transduction of jaw movements and force, (3) concurrent EMG sampling, and (4) acoustic sampling at rates up to 7 kHz. The presentation focuses on the pulsed-ultrasound recording, display, and analysis techniques. The transducer placement procedures for the tongue dorsum, vocal folds, and lateral pharyngeal wall are described, and several examples of simultaneous recordings are presented. Data analysis techniques involving separate application of natural cubic spline functions to each of the records are also presented, and an iterative procedure for optimizing the goodness of fit of the spline functions is described.
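The spline-based smoothing described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the movement signal, sampling rate, tolerance, and shrinkage factor are all invented, and `scipy`'s `UnivariateSpline` stands in for whatever cubic-spline routine was actually used. The iterative step simply tightens the smoothing factor until the residual error of the fit falls below a tolerance.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Hypothetical movement record: tongue dorsum displacement (mm) over 1 s,
# a slow oscillation plus measurement noise (invented for illustration).
t = np.linspace(0.0, 1.0, 100)
rng = np.random.default_rng(0)
displacement = 5.0 * np.sin(2.0 * np.pi * 2.0 * t) + rng.normal(0.0, 0.3, t.size)

# Fit a cubic smoothing spline (k=3); iteratively reduce the smoothing
# factor s until the RMS residual meets an illustrative tolerance.
s = float(t.size)  # generous initial smoothing
for _ in range(50):
    spline = UnivariateSpline(t, displacement, k=3, s=s)
    rms = float(np.sqrt(np.mean((spline(t) - displacement) ** 2)))
    if rms < 0.35:
        break  # goodness of fit accepted
    s *= 0.7  # loosen smoothing to track the record more closely

print(f"final RMS residual: {rms:.2f} mm")
```

Fitting each record (tongue, larynx, pharyngeal wall) separately, as the abstract describes, would simply repeat this loop per channel.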
The representation of speech goals was explored using an auditory feedback paradigm. When talkers produce vowels whose formant structure is perturbed in real time, they compensate to preserve the intended goal. When vowel formants are shifted up or down in frequency, participants change the formant frequencies in the direction opposite to the feedback perturbation. In this experiment, the specificity of vowel representation was explored by examining the magnitude of vowel compensation when the second formant frequency of a vowel was perturbed for speakers of two different languages (English and French). Even though the target vowel was the same for both language groups, the pattern of compensation differed. French speakers compensated in response to smaller perturbations and made larger compensations overall. Moreover, French speakers modified the third formant in their vowels to strengthen the compensation even though the third formant was not perturbed. English speakers did not alter their third formant. Changes in the perceptual goodness ratings by the two groups of participants were consistent with the threshold to initiate vowel compensation in production. These results suggest that vowel goals specify not only the quality of the vowel but also the relationship of the vowel to the vowel space of the spoken language.
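The opposing-response logic of the paradigm can be made concrete with a toy model. This is a sketch under invented assumptions, not the study's apparatus: the baseline F2 value and the compensation gain are hypothetical, and real compensation is partial, gradual, and speaker-dependent.

```python
# Toy model of compensation to altered auditory feedback: when the fed-back
# F2 is shifted by perturbation_hz, the talker moves production in the
# opposite direction, scaled by a gain (gain and values are invented).
def compensate(baseline_f2_hz: float, perturbation_hz: float, gain: float = 0.3) -> float:
    """Return the produced F2 after compensating for a feedback shift."""
    return baseline_f2_hz - gain * perturbation_hz

# Feedback shifted up by 200 Hz -> production shifts down, opposing it.
produced = compensate(1500.0, +200.0)
print(produced)  # -> 1440.0
```

The language difference reported above would correspond, in this toy model, to French speakers having both a lower perturbation threshold for engaging this response and a larger gain.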
This paper presents kinematic data and a statistical analysis of the MRC Psycholinguistic Database in a study of patterns of syllable coda structure in English bisyllables. X-ray microbeam and OPTOTRAK data were used to study spontaneous errors in rapid speech. Subjects repeated real-word and nonsense bisyllables at increasing speaking rates. All subjects made errors that harmonized the place of articulation of the codas at faster speaking rates. Errors were identified from kinematic changes in the tongue and lip gestures. Analysis of the MRC Psycholinguistic Database revealed that coda harmonies occur more frequently than chance in English even when the statistical incidence of segments is taken into account. The experimental evidence is discussed in terms of production constraints on syllables. The influences of stress pattern, manner, and place of articulation are explored. [Work supported by NIH-NIDCD Grant No. DC-00594 and NSERC.]
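The "more frequently than chance" comparison can be illustrated with a permutation test on invented data. Everything here is hypothetical: the coda-place labels, the 60% harmony rate, and the corpus size are made up for illustration and are not drawn from the MRC Psycholinguistic Database; the study's actual statistical method is not specified in the abstract.

```python
import random

random.seed(0)
places = ["labial", "alveolar", "velar"]

# Invented corpus: coda places of the first and second syllables of 200
# bisyllables, where the second coda harmonizes with the first 60% of
# the time (a made-up rate for illustration).
first = [random.choice(places) for _ in range(200)]
second = [f if random.random() < 0.6 else random.choice(places) for f in first]

observed = sum(a == b for a, b in zip(first, second))

# Permutation test: shuffling the pairing destroys any harmony relation,
# giving the chance rate implied by the segments' marginal frequencies.
null_counts = []
for _ in range(2000):
    random.shuffle(second)
    null_counts.append(sum(a == b for a, b in zip(first, second)))

p = sum(x >= observed for x in null_counts) / len(null_counts)
print(f"observed harmonies: {observed}, permutation p = {p:.4f}")
```

Because the null distribution is built by reshuffling the same segments, the segments' statistical incidence is automatically taken into account, which is the control the abstract describes.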
In altered auditory feedback experiments, participants respond rapidly and alter their speech production to compensate for the induced auditory feedback error. This response does not occur with small feedback perturbations, and thus there must be a threshold for compensation by the auditory-vocal feedback system. What is unknown is the psychophysical threshold during speech production (i.e., how large must a manipulation be to be consciously detectable?). The purpose of this experiment was to compare the threshold for conscious awareness of a perturbation with the minimal change that induces compensation. A real-time auditory feedback manipulation system was employed in a repeated measures design. The compensation threshold was determined using the change point test during a perturbation ramp of 4 Hz/utterance. In the psychophysical measurement, the size of the feedback manipulation followed a two-alternative forced-choice paradigm. With 17 individuals, the mean psychophysical threshold was 105 Hz (SE: 9 Hz). The mean compensation threshold was 64 Hz (SE: 8 Hz). The significant difference between these thresholds is consistent with the hypothesis that the auditory-vocal feedback system can operate without conscious control and suggests that it may have access to a more sensitive representation of speech formants.
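A change point analysis of the kind used to estimate the compensation threshold can be sketched on synthetic data. This is an illustration only: the response series, noise level, and onset are invented (the onset is placed at utterance 16, i.e., 64 Hz into a 4 Hz/utterance ramp, purely to echo the scale of the reported mean), the response is idealized as a step rather than a graded ramp-following compensation, and the study's exact change point test may differ from this two-segment least-squares criterion.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60          # utterances in the ramp
onset = 16      # invented true onset: 16 utterances * 4 Hz = 64 Hz

# Synthetic compensation series: no response before onset, a step change
# after it (idealized), plus trial-to-trial variability.
response = np.where(np.arange(n) >= onset, -20.0, 0.0)
response += rng.normal(0.0, 2.0, n)

def change_point(x: np.ndarray) -> int:
    """Index minimizing total within-segment variance of a two-segment,
    piecewise-constant fit (a basic change point criterion)."""
    m = len(x)
    costs = [np.var(x[:k]) * k + np.var(x[k:]) * (m - k)
             for k in range(2, m - 2)]
    return int(np.argmin(costs)) + 2

cp = change_point(response)
print(f"estimated threshold: {cp * 4.0} Hz of perturbation")
```

The psychophysical side of the comparison would instead fit a detection threshold from the two-alternative forced-choice responses, which this sketch does not cover.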