Deep Learning Methods for Audio Events Detection
2021
Deep learning is a branch of machine learning based on algorithms that model high-level abstractions in data. It belongs to a broader family of methods that learn representations of data. Convolutional neural networks (CNNs) are a type of neural network in which the connectivity pattern between neurons is inspired by the organization of the animal visual cortex. Individual neurons in this part of the brain respond to stimuli only within a restricted region of the field of view, called the receptive field. CNNs are designed to recognize visual patterns directly from pixel images and require little or no preprocessing. They can recognize highly variable patterns, such as handwriting and images of real-world scenes. Typically, a CNN consists of several alternating convolution and pooling layers, followed by one or more fully connected layers in the classification case, or by a number of upsampling layers in the regression case. In this chapter, we will see how to identify audio events in a complex sound scenario using convolutional neural networks. An automatic identification system for sound events can prove extremely useful in different contexts: safety at public events, home acoustic monitoring, bioacoustic monitoring, acoustic monitoring for healthcare, and more generally the needs of smart cities. The variability of sound events produces continuous changes in frequency content that are difficult to track with the instruments commonly used in traditional acoustics. CNNs are useful for learning the shift-invariant filters that are essential for modeling audio events.
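To make the notion of shift-invariant filtering concrete, the following minimal sketch applies one convolution filter, a ReLU, and global max pooling to a toy one-dimensional signal. It is an illustration under stated assumptions, not the chapter's method: the event shape, the hand-picked filter weights, and all function names are hypothetical, and a real CNN would learn its filters from data and usually operate on a time-frequency representation.

```python
def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal without padding (cross-correlation,
    which is what CNN 'convolution' layers compute)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Keep positive activations, zero out the rest."""
    return [max(0.0, v) for v in values]


def global_max_pool(values):
    """Collapse the whole feature map to its strongest activation."""
    return max(values)


def place_event(event, length, offset):
    """Return a silent signal of the given length with the event at offset."""
    signal = [0.0] * length
    signal[offset:offset + len(event)] = event
    return signal


# Hypothetical toy event and a filter assumed to match it exactly.
event = [1.0, -1.0, 1.0]
kernel = [1.0, -1.0, 1.0]

early = place_event(event, 16, 2)   # event starts at sample 2
late = place_event(event, 16, 9)    # same event, 7 samples later

map_early = relu(conv1d_valid(early, kernel))
map_late = relu(conv1d_valid(late, kernel))

score_early = global_max_pool(map_early)
score_late = global_max_pool(map_late)

print(score_early, score_late)
```

The feature map moves with the event, while the pooled score stays the same wherever the event occurs. This is the property that lets one filter detect a sound regardless of its onset time.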