Research on Deep Sound Source Separation

Yunuo Yang, Honghui Li

2021

The cocktail party effect is a fundamental problem in sound source separation, and many researchers have worked on it. In recent years, the most popular algorithms for sound source separation have been Support Vector Machines (SVM), Gaussian Mixture Models (GMM), non-negative matrix factorization (NMF), and Variational Autoencoders (VAE). The VAE model in particular has shown excellent ability in sound separation. In this paper, the β-VAE model combined with weakly supervised classification, proposed by Karamatli et al., is first reproduced. Since Karamatli's experiment only established the connection between sounds and words, the model is then used to learn a mapping between sounds and individual speakers and a mapping between sounds and gender, in order to capture more information about the speaker. The results show that separation can be obtained by retraining the model after introducing the new 'male' and 'female' labels. This result lays a foundation for future study of the mapping between individuals and words. When the label is specific to an individual, more data is needed to support the experiment, and the more training data is available, the better the model performs.
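For readers unfamiliar with the β-VAE objective mentioned above, the following is a minimal sketch of its general form: a reconstruction term plus a KL divergence term weighted by β. It is not the authors' or Karamatli et al.'s implementation. The function names, the squared-error reconstruction term, and the default `beta` value are illustrative assumptions, and the weakly supervised classification component is not shown.

```python
import math


def gaussian_kl(mu, logvar):
    """KL divergence between N(mu, diag(exp(logvar))) and N(0, I).

    `mu` and `logvar` are sequences of floats for one latent vector.
    """
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar)
    )


def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Generic beta-VAE loss for a single example.

    Assumption: squared error is used as the reconstruction term;
    the paper may use a different likelihood. `beta` weights the KL
    term, and beta = 1 recovers the standard VAE objective.
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + beta * gaussian_kl(mu, logvar)
```

With larger `beta`, the KL term pulls the approximate posterior more strongly toward the standard normal prior, trading reconstruction accuracy for a more constrained latent space.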
