Convolutional bidirectional long short-term memory hidden Markov model hybrid system for polyphonic sound event detection

2016 
In this study, we propose a polyphonic sound event detection method based on a hybrid system of a convolutional bidirectional long short-term memory recurrent neural network and a hidden Markov model (CBLSTM-HMM). Inspired by the state-of-the-art approach to integrating neural networks with HMMs in speech recognition, the proposed method uses the CBLSTM to estimate the HMM state output probabilities, which makes it possible to model sequential data while handling variation in event duration. The hybrid system can detect the segment of each sound event without post-processing, such as the smoothing of detection results over multiple frames that frame-wise detection methods usually require. Moreover, it can easily be applied to a multi-label classification problem to achieve polyphonic sound event detection. We conduct experimental evaluations using the DCASE 2016 Task 2 dataset to compare the performance of the proposed method to that of the conventional methods, such as non-...
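
The idea of letting a network supply the HMM state output probabilities can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a two-state HMM (inactive/active) for a single event class, treats a list of per-frame network posteriors as given, and uses the common hybrid trick from speech recognition of dividing each posterior by the state prior to obtain a scaled likelihood. The function name `viterbi_binary` and the parameters `prior_active` and `stay` are hypothetical, and the posterior values are made up for illustration.

```python
import math


def viterbi_binary(posteriors, prior_active=0.5, stay=0.9):
    """Decode the most likely inactive(0)/active(1) path for one event class.

    posteriors:   per-frame network outputs, read as p(active | frame).
    prior_active: assumed prior probability of the active state.
    stay:         assumed self-transition probability of both states.
    """
    eps = 1e-10  # guards against log(0)
    priors = (1.0 - prior_active, prior_active)
    log_trans = (
        (math.log(stay), math.log(1.0 - stay)),  # from inactive
        (math.log(1.0 - stay), math.log(stay)),  # from active
    )

    def log_emission(p_active):
        # Scaled likelihood: posterior divided by the state prior.
        post = (1.0 - p_active, p_active)
        return tuple(math.log(max(post[s], eps) / priors[s]) for s in (0, 1))

    # Initialisation: both start states are treated as equally likely.
    score = list(log_emission(posteriors[0]))
    backptrs = []

    # Recursion: keep the best predecessor for each state at each frame.
    for p in posteriors[1:]:
        emit = log_emission(p)
        new_score, ptr = [], []
        for s in (0, 1):
            best_prev = max((0, 1), key=lambda r: score[r] + log_trans[r][s])
            ptr.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s] + emit[s])
        score = new_score
        backptrs.append(ptr)

    # Backtracking from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Illustrative posteriors with a one-frame dip inside an event.
frames = [0.1, 0.2, 0.9, 0.4, 0.9, 0.8, 0.1, 0.1]
print(viterbi_binary(frames))
```

With these made-up values, thresholding each frame at 0.5 would split the event at the fourth frame, whereas the decoded path keeps frames three to six as one active segment because leaving and re-entering the active state costs more than the weak evidence gains. Under the same assumptions, running one such decoder per event class gives a multi-label, and therefore polyphonic, output.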