Perception of emotional valences and activity levels from vowel segments of continuous speech.

2010 
Summary This study aimed to investigate the role of voice source and formant frequencies in the perception of emotional valence and psychophysiological activity level from short vowel samples (∼150 milliseconds). Nine professional actors (five males and four females) read a prose passage simulating joy, tenderness, sadness, anger, and a neutral emotional state. The stress carrying vowel [a:] was extracted from continuous speech during the Finnish word [t a: k:ahan] and analyzed for duration, fundamental frequency (F0), equivalent sound level (L eq ), alpha ratio, and formant frequencies F1–F4. Alpha ratio was calculated by subtracting the L eq (dB) in the range 50Hz–1kHz from the L eq in the range 1–5kHz. The samples were inverse filtered by Iterative Adaptive Inverse Filtering and the estimates of the glottal flow obtained were parameterized with the normalized amplitude quotient (NAQ = f AC /(d peak T)). Fifty listeners (mean age 28.5 years) identified the emotional valences from the randomized samples. Multinomial Logistic Regression Analysis was used to study the interrelations of the parameters for perception. It appeared to be possible to identify valences from vowel samples of short duration (∼150 milliseconds). NAQ tended to differentiate between the valences and activity levels perceived in both genders. Voice source may not only reflect variations of F0 and L eq , but may also have an independent role in expression, reflecting phonation types. To some extent, formant frequencies appeared to be related to valence perception but no clear patterns could be identified. Coding of valence tends to be a complicated multiparameter phenomenon with wide individual variation.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    38
    References
    36
    Citations
    NaN
    KQI
    []