A Spiking Neural Network with Distributed Keypoint Encoding for Robust Sound Recognition

2019 
Compared to traditional artificial neural networks, spiking neural networks (SNNs) operate on an additional dimension, time, which makes them more suitable for processing sound signals. However, two major challenges in sound recognition with SNNs, neural encoding and learning, demand further research. In this paper, we propose a novel method that combines an improved local time-frequency encoding based on key-point detection with biologically plausible tempotron spike learning for robust sound recognition. In the neural encoding part, local energy peaks, called key-points, are first extracted from local temporal and spectral regions in the spectrogram. The extracted key-points in each frequency channel are then distributed to multiple sub-channels according to their energy amplitudes, with their temporal positions retained. The resulting spatio-temporal spike patterns are then used as inputs for spiking neural networks to learn and classify patterns of different categories. We use the RWCP database to evaluate the performance of our proposed system in mismatched environments. Our experimental results show that the proposed system, namely DKP-SNN, is effective and reliable for robust sound recognition, achieving improved recognition performance compared to baseline methods.
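The encoding step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `extract_keypoints` and `distribute`, the neighbourhood size `half_width`, the rule that a point counts as a key-point if it is a local maximum along either the time or the frequency axis, and the uniform split of the energy range into `n_sub` sub-channels are all assumptions made for illustration.

```python
import numpy as np

def extract_keypoints(spec, half_width=2):
    """Return (channel, frame, energy) for local energy peaks.

    Assumption: a point is a key-point if it is the maximum of its
    local window along the time axis or along the frequency axis.
    """
    n_ch, n_fr = spec.shape
    keypoints = []
    for f in range(n_ch):
        for t in range(n_fr):
            v = spec[f, t]
            t_lo, t_hi = max(0, t - half_width), min(n_fr, t + half_width + 1)
            f_lo, f_hi = max(0, f - half_width), min(n_ch, f + half_width + 1)
            is_time_peak = v >= spec[f, t_lo:t_hi].max()
            is_freq_peak = v >= spec[f_lo:f_hi, t].max()
            if v > 0 and (is_time_peak or is_freq_peak):
                keypoints.append((f, t, v))
    return keypoints

def distribute(keypoints, n_sub, max_energy):
    """Map each key-point to a spike (afferent_index, frame).

    Assumption: the energy range [0, max_energy] is split uniformly
    into n_sub sub-channels per frequency channel; the frame index
    (temporal position) is kept unchanged.
    """
    spikes = []
    for f, t, v in keypoints:
        sub = min(int(n_sub * v / max_energy), n_sub - 1)
        spikes.append((f * n_sub + sub, t))
    return spikes
```

Each returned pair names one input afferent and the time frame at which it fires, so a spectrogram with `n_ch` frequency channels yields a spike pattern over `n_ch * n_sub` afferents that a tempotron-style classifier could consume.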