Gaussian Modeling-Based Multichannel Audio Source Separation Exploiting Generic Source Spectral Model

2019 
As blind audio source separation has remained very challenging in real-world scenarios, some existing works, including ours, have investigated the use of a weakly informed approach where generic source spectral models (GSSM) are learned a priori based on nonnegative matrix factorization (NMF). Such an approach was derived for single-channel audio mixtures and shown to be effective in different settings. This paper proposes a multichannel source separation approach where the GSSM is combined with the source spatial covariance model within a unified Gaussian modeling framework. We present the generalized expectation-maximization (EM) algorithm for the parameter estimation. In particular, to guide the estimation of the intermediate source variances in each EM iteration, we investigate the use of two criteria: first, the estimated variances of each source are constrained by NMF individually, and second, the total variances of all sources are constrained by NMF altogether. While the former can be seen as a source variance denoising step, the latter is viewed as an additional separation step applied to the source variances. We demonstrate the speech separation performance of the proposed approach, together with its convergence and its stability with respect to parameter settings, using a benchmark dataset provided within the 2016 Signal Separation Evaluation Campaign.
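For concreteness, the following is a minimal sketch, in the standard local Gaussian model notation (the symbols used here are assumptions for illustration, not taken from the paper body), of the quantities the abstract refers to. The spatial image of each source is modeled as a zero-mean complex Gaussian whose covariance factors into a time-frequency variance and a spatial covariance matrix, and the GSSM constrains the variances via pretrained NMF dictionaries:
\[
\mathbf{c}_j(f,n) \sim \mathcal{N}_c\!\left(\mathbf{0},\; v_j(f,n)\,\mathbf{R}_j(f)\right),
\qquad
\mathbf{x}(f,n) = \sum_{j=1}^{J} \mathbf{c}_j(f,n),
\]
where $v_j(f,n)$ is the variance of source $j$ and $\mathbf{R}_j(f)$ its spatial covariance matrix. In each EM iteration, the intermediate variance estimates may be constrained either per source,
\[
v_j(f,n) \approx \big[\mathbf{W}_j \mathbf{H}_j\big]_{f,n} \quad \text{(variance denoising)},
\]
or jointly over all sources,
\[
\sum_{j=1}^{J} v_j(f,n) \approx \Big[\,[\mathbf{W}_1 \cdots \mathbf{W}_J]\,[\mathbf{H}_1^{\top} \cdots \mathbf{H}_J^{\top}]^{\top}\Big]_{f,n} \quad \text{(variance separation)},
\]
with each $\mathbf{W}_j$ a spectral dictionary learned a priori by NMF and $\mathbf{H}_j$ its activation matrix.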