The developmental use of vowel duration, final transition, and voicing during closure as cues to voicing in final stop consonants was investigated, using 10 three-year-old, 10 six-year-old, and 10 adult subjects. The stimuli were alterations of eight stop-vowel-stop words. The presentation of each stimulus item was response contingent. The resulting data supported the ability of adults and children to use both temporal and spectral cues to acoustic/phonetic distinctions. However, three-year-olds relied more on temporal cues and six-year-olds relied more on spectral cues, while adults used both spectral and temporal cues in judging the voicing feature of final stop consonants.
The developmental use of vowel duration, final transition, and voicing during closure as cues to voicing in final stop consonants was investigated, using subjects ages 3 to 6 years and adults. The stimuli were alterations of eight stop-vowel-stop words. The presentation of each stimulus item was response contingent. There was evidence that the adults responded more to duration cues than did the children. The response to spectral cues was more similar for the three age groups. A comparison of the responses of these adults with the responses of adults in a previous study suggested that the condition of response had a very significant effect on the direction and/or magnitude of response to specific treatments. In the earlier study the stimuli were presented at an interstimulus interval (ISI) of 750 ms. The response contingent presentation of the stimuli in this study seemed to result in a very different decision strategy on the part of the listeners. For example, adult subjects judged the same stimulus item voiced (67%) at 750 ms ISI but voiceless (62%) under the open interval condition. Listeners gave much more definite judgments under the fixed interval condition, whereas they judged many stimuli ambiguous in the open interval condition. Implications of these findings for the design and the interpretation of speech perception studies will be discussed.
Wave duration and wave peak amplitude were measured for 10 Mandarin speakers of English and five native English speakers. Results revealed significant differences between the two groups on these acoustic variables. Discriminant function analysis resulted in 12 (80%) of 15 subjects being classified correctly in their respective language groups based on the wave duration and wave peak amplitude measures. In a second experiment, with the 10 Mandarin subjects as speakers and 231 native English speakers as listeners, the intelligibility of each speaker was ranked. A regression analysis revealed a high correlation (r = 0.897) between acoustic and perceptual measures; measurements of amplitude wave duration account for 89.1% of the variation in intelligibility and measurements of peaks in wave amplitude account for 68.5% of the variation in the speech samples. These findings suggest that intelligibility of Mandarin speakers of English is a function of acoustic parameters of prosody.
Vocal fundamental frequency was measured for speakers of five languages under three conditions (reading English, reading the native language, and speaking spontaneously in the native language). The samples were recorded in a sound-treated booth and analyzed by a Visipitch (Kay Elemetrics) frequency analyzer interfaced to an IBM XT computer. Preliminary analysis suggests that mean fundamental frequency was surprisingly similar across languages for the various speaking conditions, and that the mean fundamental was higher for reading than for speaking (as has been found in studies of English), but that there were significant differences between languages and by sex in the standard deviation of the fundamental under the various speaking conditions. The results suggest that fundamental frequency is determined primarily by physiological factors, with some linguistic variations.
The fricated portions of the naturally produced words ship, shop, and shoe (as produced by a standard English speaker) were systematically reduced by computer editing. These edited syllables were randomized and presented to 20 adult listeners, 10 of whom spoke a variety of standard English and 10 of whom spoke a variety of nonstandard English that has been influenced by Spanish. Listeners judged each syllable as having an initial fricative, affricate, or stop (e.g., ship, chip, or tip). Results show a cross-over in judgments from fricative to affricate (e.g., ship to chip) and from affricate to stop (e.g., chip to tip) in specific ranges of fricative durations for both groups of listeners. However, the range of durations judged ambiguous was greater for the speakers of the Spanish-influenced dialect, particularly between the fricative and affricate categories. This may be a reflection of the phonological status of this distinction in the nonstandard dialect.
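As an illustration of what a cross-over in judgments means, the sketch below estimates the frication duration at which the share of "fricative" responses passes 50%, using linear interpolation between adjacent stimulus steps. The function name `crossover`, the 50% threshold, and all numbers are assumptions made for illustration; they are not the procedure or the data of the study.

```python
def crossover(durations, pct_category, threshold=50.0):
    """Return the duration at which pct_category first crosses threshold.

    Uses linear interpolation between neighbouring stimulus steps.
    Returns None if the responses never cross the threshold.
    """
    points = list(zip(durations, pct_category))
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        if p0 == threshold:
            return float(d0)
        # A sign change means the threshold lies between the two steps.
        if (p0 - threshold) * (p1 - threshold) < 0:
            return d0 + (threshold - p0) * (d1 - d0) / (p1 - p0)
    if points and points[-1][1] == threshold:
        return float(points[-1][0])
    return None


# Hypothetical data: frication duration (ms) and percent "fricative" responses.
frication_ms = [20, 40, 60, 80, 100]
pct_fricative = [5, 20, 45, 80, 95]

boundary = crossover(frication_ms, pct_fricative)
print(f"Estimated affricate/fricative boundary: {boundary:.1f} ms")
```

A wider ambiguous range, as reported for one listener group, would show up in such data as a shallower rise around the boundary rather than as a shift in the boundary itself.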
The duration of the preceding vowel has been called a primary and even necessary cue to voicing in final stop consonants. The results of this investigation suggest that in natural speech, vowel duration differences are probably neither necessary nor adequate cues to this distinction and that voicing during closure may be required to disambiguate final voiced stops. The stimuli were 52 one-syllable words recorded by two speakers and subjected to an analog-to-digital process and to linear predictive coding. Deletions, compressions, and expansions of segments of these 104 syllables produced 521 stimulus items which were randomized and presented to 12 adult listeners who judged the syllables to end in a voiced or voiceless stop. Though syllable duration was a more significant cue to the voicing feature than was vowel duration, syllable duration was not a necessary cue in that, even at the extremes of syllable length, syllables with final transition and/or final segment information intact did not effect a crossover in voicing judgment. Syllable duration was not an adequate cue to the voicing feature in that syllables without final transition and final segment information were not heard as better than 60 percent voiced at any syllable duration. By contrast, voicing during closure determined the voicing decision across the full range of syllable durations.
In several recent studies, spectral acoustic cues (e.g., voicing during closure, preclosure transition) have been found to have more influence than temporal acoustic cues (e.g., vowel or syllable duration) on perception of the voicing distinction in final stops. Earlier studies, mostly using synthetic speech, had found vowel duration to be the primary cue to this distinction. The present stimuli were taken from a study in which spectral cues had dominated judgments in quiet listening conditions. These stimuli were presented against a high level of background noise (S/N=0) to seven normal adult subjects in a forced choice (e.g., BED–BET) format. In quiet, the correlation of the percentage of the original vowel remaining after deletion with judgments of the final stop as voiced had been 0.18. In the noise listening condition the correlation was 0.94. Syllables with voiced final stops from which 76% of the vowel (all but the final transition) had been deleted were judged voiced (82%) in quiet but voiceless (87%) in noise. These results indicate that the temporal cue of vowel duration is more resistant to degradation by noise than are spectral cues, and that vowel duration plays a dominant role only when other cues are unavailable.
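The correlations reported above relate the percentage of the original vowel remaining to the percentage of "voiced" judgments. A minimal sketch of that kind of computation follows, assuming a Pearson product-moment correlation (the abstract does not name the coefficient). The helper `pearson_r` is hypothetical. In the sample data only the first pair (24% of the vowel remaining, 13% judged voiced in noise) is derived from the figures above; the remaining values are invented placeholders.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Percent of the original vowel remaining after deletion.
vowel_remaining = [24, 40, 60, 80, 100]
# Percent of "voiced" judgments in noise; all but the first are hypothetical.
voiced_in_noise = [13, 30, 55, 75, 90]

r = pearson_r(vowel_remaining, voiced_in_noise)
print(f"r = {r:.2f}")
```

A value of r near 1 means that judgments track vowel duration closely, as reported for the noise condition; a value near 0 means duration explains little, as reported in quiet.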