At the Speed of Sound: Efficient Audio Scene Classification.

Bo Dong, Cristian Lumezanu, Yuncong Chen, Dongjin Song, Takehiko Mizoguchi, Haifeng Chen, Latifur Khan

2020

Efficient audio scene classification is essential for smart sensing platforms such as robots, medical monitors, surveillance systems, and autonomous vehicles. We propose a retrieval-based scene classification architecture that combines recurrent neural networks and attention to compute embeddings for short audio segments. We train our framework using a custom audio loss function that captures both the relevance of audio segments within a scene and the relevance of sound events within a segment. In experiments on real audio scenes, we show that we can discriminate audio scenes with high accuracy after listening for less than a second, which preserves 93% of the detection accuracy obtained after hearing the entire scene.
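
To make the retrieval idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the recurrent network has already produced one hidden-state vector per audio frame (the `frames` argument), pools those vectors with a simple dot-product attention, and labels the resulting segment embedding by majority vote over its nearest labeled embeddings. The names `attention_pool`, `classify_by_retrieval`, the `query` vector, and the choice of cosine similarity with `k` neighbors are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def attention_pool(frames, query):
    """Softmax-weighted average of frame vectors.

    frames: list of equal-length vectors (stand-ins for RNN hidden states).
    query:  vector used to score each frame by dot product (assumed learned).
    """
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in frames]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frames[0])
    return [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify_by_retrieval(embedding, database, k=3):
    """Return the majority label among the k most similar stored embeddings.

    database: list of (embedding, label) pairs for previously seen segments.
    """
    ranked = sorted(database, key=lambda item: cosine(embedding, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy two-dimensional embeddings; real embeddings would come from a trained model.
    database = [
        ([1.0, 0.0], "street"),
        ([0.9, 0.1], "street"),
        ([0.0, 1.0], "park"),
        ([0.1, 0.9], "park"),
    ]
    frames = [[1.0, 0.0], [0.8, 0.2]]
    segment_embedding = attention_pool(frames, query=[1.0, 0.0])
    print(classify_by_retrieval(segment_embedding, database, k=3))
```

In this toy setup a single short segment is enough to retrieve a label, which mirrors the abstract's point that classification can happen before the whole scene has been heard. The loss function described in the abstract is not reproduced here.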

Keywords: Pattern recognition, Speed of sound, Computer science, Computer vision, Artificial intelligence
