Speech Emotion Recognition using Spectral Normalized CycleGAN

2019 
Generative Adversarial Networks (GANs) are also popular for discriminative tasks such as Speech Emotion Recognition (SER). Many SER approaches use GANs to learn mappings from parameter spaces to real data distributions for data augmentation. In this paper, we use CycleGANs to learn the comparatively simpler domain adaptation functions for data augmentation. We also apply Spectral Normalization (SN) to stabilize GAN training on speech data. In addition to the SN-discriminators, we apply SN to the generators as well. Experimental results on the IEMOCAP dataset show that the proposed SN-CycleGAN effectively improves training performance, reducing the GAN training loss from 3.40 to 0.89.
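
To make the Spectral Normalization step concrete, the following is a minimal sketch of how a single weight matrix can be rescaled by its largest singular value, estimated with power iteration. It is an illustration only and not the paper's implementation: the function names (`spectral_norm`, `spectrally_normalize`), the iteration count, and the plain list-of-lists matrix representation are assumptions made here, and a real SN-CycleGAN would apply this per layer inside the generator and discriminator networks.

```python
import math
import random


def matvec(W, v):
    """Compute W @ v for a matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def matvec_t(W, u):
    """Compute W^T @ u for a matrix stored as a list of rows."""
    return [sum(W[i][j] * u[i] for i in range(len(W))) for j in range(len(W[0]))]


def l2_normalize(x, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    norm = math.sqrt(sum(a * a for a in x))
    return [a / (norm + eps) for a in x]


def spectral_norm(W, n_iters=50, seed=0):
    """Estimate the largest singular value of W by power iteration.

    n_iters and seed are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in range(len(W))])
    v = l2_normalize(matvec_t(W, u))
    for _ in range(n_iters):
        v = l2_normalize(matvec_t(W, u))  # right singular vector estimate
        u = l2_normalize(matvec(W, v))    # left singular vector estimate
    # sigma is approximated by u^T W v
    return sum(a * b for a, b in zip(u, matvec(W, v)))


def spectrally_normalize(W, n_iters=50):
    """Return W / sigma(W), so the result has spectral norm close to 1."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]
```

Calling `spectrally_normalize` on a layer's weight matrix bounds that layer's Lipschitz constant at roughly 1, which is the property SN relies on to keep adversarial training stable. Practical implementations typically run only one power-iteration step per training update and reuse the vector `u` between updates, rather than iterating to convergence as this sketch does.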