Monaural Speech Enhancement using Deep Neural Network with Cross-Speech Dataset

2021 
The Deep Neural Network (DNN)-based mask estimation approach is an emerging algorithm in monaural speech enhancement. It enhances speech signals against a noisy background by determining whether speech or noise is dominant in a particular frame of the noisy speech signal, and it can construct complex models for nonlinear processing. However, a limitation of the DNN-based mask algorithm is its generalization beyond the targeted population. Previous studies focused on their own target datasets because audio recording sessions are time-consuming. Thus, in this work, different recording conditions were used to study the performance of the DNN-based mask estimation approach. The findings revealed that a test dataset in a different language, as well as different recording conditions, may not have a large impact on speech enhancement performance, since the algorithm learns only the noise information. However, speech enhancement performance is promising when the trained model has been designed properly, especially given the limited sample variation in the input dataset used during training.
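The masking idea described above can be illustrated with a minimal sketch. The abstract does not state which mask the authors estimate, so this example assumes an ideal binary mask (IBM) computed from known speech and noise power; in a DNN-based system the network would instead predict such a mask from noisy features. The function names, the 0 dB threshold, and the toy values are hypothetical.

```python
import math

def ideal_binary_mask(speech_power, noise_power, threshold_db=0.0):
    """Mark each time-frequency unit 1 if speech dominates, else 0.

    Assumption: an ideal binary mask with a local SNR threshold;
    the source does not specify the mask type or threshold.
    """
    eps = 1e-12  # avoids division by zero and log of zero
    mask = []
    for s_frame, n_frame in zip(speech_power, noise_power):
        row = []
        for s, n in zip(s_frame, n_frame):
            snr_db = 10.0 * math.log10((s + eps) / (n + eps))
            row.append(1 if snr_db > threshold_db else 0)
        mask.append(row)
    return mask

def apply_mask(noisy_power, mask):
    """Keep speech-dominant units and zero out noise-dominant ones."""
    return [[p * m for p, m in zip(p_frame, m_frame)]
            for p_frame, m_frame in zip(noisy_power, mask)]

# Toy example: 2 frames x 2 frequency bins (hypothetical values).
speech = [[4.0, 0.1], [0.2, 9.0]]
noise = [[1.0, 1.0], [1.0, 1.0]]
noisy = [[s + n for s, n in zip(sf, nf)] for sf, nf in zip(speech, noise)]

mask = ideal_binary_mask(speech, noise)
enhanced = apply_mask(noisy, mask)
```

In this toy case, units where speech power exceeds noise power are kept and the rest are suppressed. A trained model would have to estimate the mask without access to the clean speech and noise separately.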