Cross-domain cooperative deep stacking network for speech separation

2015 
Supervised speech separation has recently drawn much attention and shown great promise. Despite this success, existing algorithms perform the task in only one preselected representative domain. In this study, we propose to perform the task in two different time-frequency (T-F) domains simultaneously and cooperatively, which models the implicit correlations between different representations of the same speech separation task. Moreover, because many T-F units are dominated by noise under low signal-to-noise ratio (SNR) conditions, more robust features are obtained by stacking the features of the original mixtures with those extracted from the separated speech of each deep stacking network (DSN) block; the latter can be regarded as a denoised version of the original features. Quantitative experiments show that the proposed cross-domain cooperative deep stacking network (DSN-CDC) has enhanced modeling capability and generalization ability, outperforming a previous algorithm based on standard deep neural networks.
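The feature-stacking idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the block architecture, dimensions, and use of untrained random weights are all illustrative assumptions; only the input scheme (each block receives the original mixture features concatenated with the previous block's output) follows the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def dsn_block(inputs, hidden_dim, out_dim, rng):
    """One hypothetical DSN block: a single-hidden-layer network with
    randomly initialised weights (training is omitted for brevity)."""
    in_dim = inputs.shape[1]
    W1 = rng.standard_normal((in_dim, hidden_dim)) * 0.1
    W2 = rng.standard_normal((hidden_dim, out_dim)) * 0.1
    h = np.tanh(inputs @ W1)             # hidden activation
    return 1 / (1 + np.exp(-(h @ W2)))   # sigmoid output, e.g. a T-F mask

# Toy mixture features: 8 frames x 40-dim T-F features.
mixture = rng.standard_normal((8, 40))

# Stacking scheme from the abstract: each block sees the original
# mixture features concatenated with the previous block's output,
# which acts as a denoised version of the original features.
features = mixture
for _ in range(3):  # three stacked blocks (depth is an assumption)
    denoised = dsn_block(features, hidden_dim=16, out_dim=40, rng=rng)
    features = np.concatenate([mixture, denoised], axis=1)

print(features.shape)  # (8, 80): 40 original dims + 40 denoised dims
```

In a trained system each block would be learned, and in the cross-domain variant the scheme would run in two T-F domains whose outputs interact; this sketch only shows the per-domain stacking of raw and denoised features.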