Utilizing Machine Learning for Video Processing
2021
cloud services or plug-ins to video editing tools. Recent advances in deep learning, a subset of ML that uses multiple layers to extract progressively more abstract features from content (e.g., an initial layer may detect the edges of an object, while deeper layers recognize what that object is), have led to breakthroughs in ML-based video processing. Key technologies behind these advances include convolutional neural networks (CNNs), recurrent neural networks (RNNs), and generative adversarial networks (GANs). CNNs are a class of deep neural network commonly applied to the analysis of visual imagery. Inputs to the network come from a set of image pixels, usually adjacent (such as a square or rectangle of pixels). These inputs are convolved (the input pixel area is shifted across the image) to generate abstracted versions of the image in higher layers of the network, known as activation maps.
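To make the convolution step concrete, the following is a minimal sketch (not from the article) of a small CNN in PyTorch; the class name TinyCNN, the layer sizes, and the input resolution are illustrative assumptions, not details from the source.

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Early layers respond to low-level structure such as edges;
        # deeper layers produce more abstract activation maps.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # slides a 3x3 window of adjacent pixels across the image
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        maps = self.features(x)              # activation maps, shape (N, 32, 1, 1)
        return self.classifier(maps.flatten(1))

# One video frame as a batch of size 1: 3 color channels, 224x224 pixels.
frame = torch.randn(1, 3, 224, 224)
logits = TinyCNN()(frame)
print(logits.shape)  # torch.Size([1, 10])
```

In a video pipeline, a network of this shape would typically be applied frame by frame, with the resulting activation maps passed to later stages for tasks such as object or scene recognition.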
RNNs feed the output from previous steps back into the model as input to subsequent layers, which makes them well suited to time-series/temporal data (such as activity detection in sequential video understanding). GANs pit two neural networks against each other in a game: a generative network produces candidates, while a discriminative network evaluates them. The generative network learns to map to a desired data distribution, while the discriminative network distinguishes whether the candidates produced by the generator belong to that distribution or not. The generative network's training objective is to increase the error rate of the discriminative network, that is, to fool it into believing that the generated candidates are real rather than synthesized. One well-known example is a GAN that creates synthesized human faces so realistic that the discriminative network (or a person, for that matter) cannot tell that they are not real faces.
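The adversarial objective described above can be illustrated with a minimal PyTorch sketch of one training iteration; the network shapes, batch size, latent dimension, and the random stand-in for real data are assumptions made here for illustration only.

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 64

generator = nn.Sequential(          # maps random noise to synthesized candidates
    nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, data_dim)
)
discriminator = nn.Sequential(      # scores candidates as real (1) or synthesized (0)
    nn.Linear(data_dim, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1)
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real_batch = torch.randn(32, data_dim)   # stand-in for samples from the desired data distribution

# Discriminator step: learn to separate real samples from generated ones.
fake_batch = generator(torch.randn(32, latent_dim)).detach()
d_loss = bce(discriminator(real_batch), torch.ones(32, 1)) + \
         bce(discriminator(fake_batch), torch.zeros(32, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: try to fool the discriminator into labelling fakes as real,
# i.e., increase the discriminator's error rate.
g_loss = bce(discriminator(generator(torch.randn(32, latent_dim))), torch.ones(32, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

In the face-synthesis example mentioned above, the generator and discriminator would be convolutional image networks rather than the small fully connected ones sketched here, but the alternating training loop has the same structure.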