Enabling Robots to Distinguish Between Aggressive and Joking Attitudes

Kota Maehama, Jani Even, Carlos Toshinori Ishi, Takayuki Kanda

2021

During a conversation, the meaning of an utterance may change drastically depending on the attitude of the speaker. For example, offensive words may be used “seriously” to threaten or “jokingly” to tease. However, robots do not yet have the capacity to understand such nuances. Therefore, we developed an attitude recognition system that allows robots to evaluate whether an utterance with offensive lexical content is aggressive (serious) or a joke. First, we created a data set of 7199 utterances (16 participants) that reproduces the different attitudes toward robots that we observed in field experiments. Second, we implemented voice quality features for breathy voice, creaky voice (or vocal fry), and pressed voice analysis, combined them with conventional prosodic features, and developed a neural network architecture to estimate the “perceived level of joking” of the utterances. Finally, we compared the performance of the proposed method to standard approaches for speech emotion recognition. We show that, on this task, the proposed combination of voice quality and prosodic features outperforms the conventional neural network used for speech emotion recognition. The proposed system predicts the “perceived level of joking” of an utterance with an accuracy comparable to that of a human's guess.
