Two-stream convolutional neural networks based on a self-attention mechanism for environmental sound classification

2021 
In recent years, with the development of smart homes, urban sound monitoring, and machine hearing, accurate recognition of environmental sound has become particularly important. However, because environmental sounds occur suddenly and tend to mix with other noise, recognition accuracy has remained unsatisfactory. To improve it, this paper designs a decision-layer fusion module (SAD) based on the self-attention mechanism, introducing self-attention into the decision layer of a two-stream convolutional neural network. We also construct the SAD-CNN model architecture, which better fuses the decision-layer outputs of convolutional neural networks with different inputs. We investigated the performance of different fusion methods on the UrbanSound8K dataset; the experimental results validate the effectiveness of the SAD module and the SAD-CNN architecture, with a best recognition accuracy of 96.24% on UrbanSound8K.
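The idea of fusing two decision-layer outputs with self-attention can be illustrated with a minimal NumPy sketch. This is not the paper's SAD module, whose exact layers are not given in the abstract. It assumes each stream emits a 10-way class-probability vector (UrbanSound8K has 10 classes), treats the two vectors as a two-token sequence, applies scaled dot-product self-attention, and averages the attended tokens. The function name `self_attention_fusion`, the projection matrices `w_q`, `w_k`, `w_v`, and the mean pooling are all hypothetical choices for illustration. The weights here are random rather than trained.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_fusion(scores_a, scores_b, w_q, w_k, w_v):
    """Fuse two decision vectors with scaled dot-product self-attention.

    Hypothetical sketch: the two stream outputs form a 2-token sequence.
    """
    tokens = np.stack([scores_a, scores_b])          # shape (2, C)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))   # shape (2, 2), rows sum to 1
    fused = (attn @ v).mean(axis=0)                  # pool the two tokens, shape (C,)
    return softmax(fused), attn

rng = np.random.default_rng(0)
num_classes = 10  # UrbanSound8K class count

# Stand-ins for the decision-layer outputs of the two CNN streams.
scores_a = softmax(rng.normal(size=num_classes))
scores_b = softmax(rng.normal(size=num_classes))

# Untrained projection weights (assumption: square C x C projections).
w_q, w_k, w_v = (rng.normal(scale=0.1, size=(num_classes, num_classes))
                 for _ in range(3))

probs, attn = self_attention_fusion(scores_a, scores_b, w_q, w_k, w_v)
predicted_class = int(np.argmax(probs))
```

In a trained model the projections would be learned jointly with the two streams, so the attention weights could favor whichever stream is more reliable for a given input. Simple averaging or voting weights both streams identically.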