Unsupervised Video Summarization with Attentive Conditional Generative Adversarial Networks
Abstract:
With the rapid growth of video data, video summarization techniques play a key role in reducing the effort needed to explore video content by generating concise but informative summaries. Though supervised video summarization approaches have been well studied and achieve state-of-the-art performance, unsupervised methods are still in high demand due to the intrinsic difficulty of obtaining high-quality annotations. In this paper, we propose a novel yet simple unsupervised video summarization method with attentive conditional Generative Adversarial Networks (GANs). First, we build our framework upon GANs in an unsupervised manner: the generator produces high-level weighted frame features and predicts frame-level importance scores, while the discriminator tries to distinguish between weighted frame features and raw frame features. Furthermore, we utilize a conditional feature selector to guide the GAN model to focus on the more important temporal regions of the whole video. Second, we are the first to introduce frame-level multi-head self-attention for video summarization, which learns long-range temporal dependencies along the whole video sequence and overcomes the local constraints of recurrent units, e.g., LSTMs. Extensive evaluations on two datasets, SumMe and TVSum, show that our proposed framework surpasses state-of-the-art unsupervised methods by a large margin, and even outperforms most of the supervised methods. Additionally, we conduct an ablation study to unveil the influence of each component and of the parameter settings in our framework.
Keywords: Discriminator; Margin (machine learning); Feature (linguistics); Key frame; Feature learning; Generative model
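To make the architecture concrete, here is a minimal PyTorch sketch of the generator/discriminator interplay described in the abstract. The layer sizes, the sigmoid score head, and the plain BCE adversarial loss are illustrative assumptions, and the conditional feature selector is omitted; this is a sketch of the idea, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Predicts frame-level importance scores with multi-head self-attention."""
    def __init__(self, feat_dim=1024, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        self.score_head = nn.Sequential(nn.Linear(feat_dim, 1), nn.Sigmoid())

    def forward(self, frames):                      # frames: (B, T, feat_dim)
        ctx, _ = self.attn(frames, frames, frames)  # long-range temporal context
        scores = self.score_head(ctx).squeeze(-1)   # (B, T) importance in [0, 1]
        return frames * scores.unsqueeze(-1), scores  # weighted features, scores

class Discriminator(nn.Module):
    """Tries to tell raw frame features from generator-weighted ones."""
    def __init__(self, feat_dim=1024):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 256),
                                 nn.LeakyReLU(0.2),
                                 nn.Linear(256, 1))

    def forward(self, feats):                       # feats: (B, T, feat_dim)
        return self.net(feats).mean(dim=1)          # one realness logit per video

# One adversarial step with a plain BCE GAN loss (an assumption):
G, D, bce = Generator(), Discriminator(), nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)

frames = torch.randn(2, 120, 1024)                 # e.g. CNN features of 120 frames
weighted, scores = G(frames)

# Discriminator step: raw features are "real", weighted features are "fake".
loss_d = (bce(D(frames), torch.ones(2, 1)) +
          bce(D(weighted.detach()), torch.zeros(2, 1)))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: make the weighted features indistinguishable from raw ones.
loss_g = bce(D(G(frames)[0]), torch.ones(2, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

Because the scores come from self-attention rather than a recurrent unit, each frame's importance can depend on the whole sequence at once, which is the stated advantage over LSTM-style recurrence.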
Identifying anomalous samples in highly complex and unstructured data is a crucial but challenging task in a variety of intelligent systems. In this paper, we present a novel deep anomaly detection framework named AnoDM (standing for Anomaly detection based on unsupervised Disentangled representation learning and Manifold learning). Disentanglement learning is implemented with a β-VAE, which automatically discovers interpretable factorized latent representations in a completely unsupervised manner; manifold learning is realized by t-SNE, which projects the latent representations to a 2D map. We define a new anomaly score function by combining the β-VAE's reconstruction error in the raw feature space with a local density estimate in the t-SNE space. AnoDM was evaluated on both image and time-series data and achieved better results than both single-measure variants and other deep learning methods.
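A rough sketch of how such a two-term score could be computed, assuming a trained (β-)VAE has already produced reconstructions and latent codes; the k-nearest-neighbour density proxy and the mixing weight `alpha` are our illustrative choices, not the paper's exact estimator.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.neighbors import NearestNeighbors

def anodm_style_scores(x, recon, latents, k=10, alpha=0.5):
    """x, recon: (N, D) raw inputs and VAE reconstructions; latents: (N, d)."""
    # Term 1: per-sample reconstruction error in the raw feature space.
    rec_err = np.mean((x - recon) ** 2, axis=1)

    # Term 2: local sparsity in the 2D t-SNE map of the latent codes --
    # a large mean distance to the k nearest neighbours means low density.
    z2d = TSNE(n_components=2).fit_transform(latents)
    dist, _ = NearestNeighbors(n_neighbors=k + 1).fit(z2d).kneighbors(z2d)
    sparsity = dist[:, 1:].mean(axis=1)     # column 0 is the self-distance

    # Blend the two normalized terms into one anomaly score per sample.
    norm = lambda v: (v - v.min()) / (v.max() - v.min() + 1e-12)
    return alpha * norm(rec_err) + (1 - alpha) * norm(sparsity)
```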
Unsupervised learning methods have recently shown their competitiveness against supervised training. Typically, these methods use a single objective to train the entire network. But one distinct advantage of unsupervised over supervised learning is that the former possesses more variety and freedom in designing the objective. In this work, we explore new dimensions of unsupervised learning by proposing the Progressive Stage-wise Learning (PSL) framework. For a given unsupervised task, we design multi-level tasks and define different learning stages for the deep network. Early learning stages are forced to focus on low-level tasks, while late stages are guided to extract deeper information through harder tasks. We discover that unsupervised feature representations can be effectively enhanced by progressive stage-wise learning. Our extensive experiments show that PSL consistently improves results for the leading unsupervised learning methods.
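The stage schedule itself can be summarized in a few lines; a minimal sketch, assuming a list of pretext objectives ordered from low-level to harder tasks (the task list and epoch budgets are placeholders, not the paper's actual hierarchy):

```python
def train_progressive(model, optimizer, stages):
    """stages: list of (name, loss_fn, dataloader, epochs), ordered easy -> hard."""
    for name, loss_fn, loader, epochs in stages:
        print(f"stage: {name}")                    # e.g. low-level -> harder pretext tasks
        for _ in range(epochs):
            for batch in loader:
                optimizer.zero_grad()
                loss_fn(model, batch).backward()   # stage-specific pretext objective
                optimizer.step()
```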
Introduction: Representations play an essential role in learning, in both artificial and biological systems, by producing informative structures associated with characteristic patterns in the sensory environment. In this work, we examined unsupervised latent representations of images of basic geometric shapes with neural network models of unsupervised generative self-learning.
Background: Unsupervised concept learning with generative neural network models.
Objective: Investigation of the structure, geometry, and topology of the latent representations of generative models that emerge as a result of unsupervised self-learning with minimization of generative error; examination of the capacity of generative models to abstract and generalize essential data characteristics, including the type of shape, size, contrast, position, and orientation.
Methods: Generative neural network models, direct visualization, density clustering, and probing and scanning of latent positions and regions.
Results: Structural consistency of latent representations; geometrical and topological characteristics of latent representations examined and analysed with unsupervised methods; development and verification of methods for unsupervised analysis of latent representations.
Conclusion: Generative models can be instrumental in producing informative, compact representations of complex sensory data correlated with characteristic patterns.
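As an illustration of the analysis methods listed above, the sketch below clusters latent codes by density and probes a latent region by decoding along one latent dimension; the `encoder`/`decoder` interfaces and the DBSCAN settings are assumptions for illustration, not the study's actual tooling.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_latents(encoder, images, eps=0.5, min_samples=10):
    """Density clustering of latent codes; label -1 marks low-density outliers."""
    z = encoder(images)                        # latent codes, shape (N, d)
    return z, DBSCAN(eps=eps, min_samples=min_samples).fit_predict(z)

def scan_latent_axis(decoder, z0, axis, span=3.0, steps=7):
    """Probe a latent region: decode points along one latent dimension."""
    zs = np.repeat(z0[None, :], steps, axis=0)
    zs[:, axis] = np.linspace(-span, span, steps)
    return decoder(zs)                         # images along the traversal
```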
Unsupervised representation learning aims at describing raw data efficiently so as to solve various downstream tasks. It has been approached with many techniques, such as manifold learning, diffusion maps, and, more recently, self-supervised learning. These techniques are arguably all based on the underlying assumption that target functions, associated with future downstream tasks, have low variations in densely populated regions of the input space. Unveiling minimal variations as a guiding principle behind unsupervised representation learning paves the way to better practical guidelines for self-supervised learning algorithms.
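One hedged way to make the low-variation assumption operational is a consistency penalty that discourages the learned function from varying under small input perturbations; this is our illustrative reading of the principle, not a method proposed in the paper:

```python
import torch

def variation_penalty(f, x, sigma=0.05):
    """Penalize variation of f between an input and a nearby neighbour."""
    x_near = x + sigma * torch.randn_like(x)   # stays in the dense region around x
    return ((f(x) - f(x_near)) ** 2).mean()    # small value => locally flat f
```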
The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms implementing such priors. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, auto-encoders, manifold learning, and deep networks. This motivates longer-term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation and manifold learning.
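As one concrete instance of the auto-encoder family surveyed here, a minimal denoising autoencoder in PyTorch; the sizes and noise level are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

class DenoisingAE(nn.Module):
    def __init__(self, dim=784, hidden=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.dec = nn.Linear(hidden, dim)

    def forward(self, x, noise=0.3):
        x_corrupt = x + noise * torch.randn_like(x)   # corrupt the input...
        return self.dec(self.enc(x_corrupt))          # ...and reconstruct the clean x

model = DenoisingAE()
x = torch.rand(32, 784)
loss = nn.functional.mse_loss(model(x), x)            # unsupervised reconstruction objective
```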
From the intuitive notion of disentanglement, the image variations corresponding to different factors should be distinct from each other, and a disentangled representation should reflect those variations with separate dimensions. To discover the factors and learn a disentangled representation, previous methods typically leverage an extra regularization term when learning to generate realistic images. However, this term usually results in a trade-off between disentanglement and generation quality. For generative models pretrained without any disentanglement term, the generated images show semantically meaningful variations when traversing along different directions in the latent space. Based on this observation, we argue that it is possible to mitigate the trade-off by (i) leveraging pretrained generative models with high generation quality and (ii) focusing on discovering the traversal directions as factors for disentangled representation learning. To achieve this, we propose Disentanglement via Contrast (DisCo), a framework that models the variations based on the target disentangled representations and contrasts the variations to jointly discover disentangled directions and learn disentangled representations. DisCo achieves state-of-the-art disentangled representation learning and distinct direction discovery, given pretrained non-disentangled generative models including GANs, VAEs, and Flows. Source code is at https://github.com/xrenaa/DisCo.
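A simplified reading of the direction-discovery idea, sketched below: learn K candidate directions for a frozen pretrained generator and train an encoder/classifier pair to tell apart the image variations each direction induces. The classification proxy stands in for DisCo's actual contrastive objective; `G`, `encoder`, and `classifier` are assumed components, not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

K, zdim = 16, 128
directions = nn.Parameter(torch.randn(K, zdim))        # candidate latent factors

def direction_loss(G, encoder, classifier, batch=8, step=2.0):
    """G: frozen pretrained generator; classifier maps feature deltas to K logits."""
    z = torch.randn(batch, zdim)
    k = torch.randint(0, K, (batch,))                  # pick one direction per sample
    d = F.normalize(directions, dim=1)[k]
    delta = encoder(G(z + step * d)) - encoder(G(z))   # variation the direction causes
    return F.cross_entropy(classifier(delta), k)       # directions must stay tellable apart
```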
Unsupervised learning is a useful way to train neural networks, but unsupervised learning algorithms are rare. The generative model is an interesting class of algorithm that can generate data similar to the sample data by building a probabilistic model of the input data, and it can be used for unsupervised learning. The variational autoencoder is a typical generative model; it differs from a common autoencoder in that a probabilistic parameter layer follows the hidden layer. New data can be reconstructed from the probabilistic model parameters, which are the latent variables. In this paper, we investigate how well the variational autoencoder reconstructs data under different numbers of latent variables. According to the simulations, the more latent variables there are, the more styles of the sample can be reconstructed.
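For concreteness, a minimal PyTorch VAE sketch in which `latent_dim` is the number of latent variables whose effect the paper studies; the layer sizes are arbitrary assumptions.

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, dim=784, latent_dim=8):
        super().__init__()
        self.enc = nn.Linear(dim, 256)
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)       # the probabilistic parameter layer
        self.dec = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                 nn.Linear(256, dim), nn.Sigmoid())

    def forward(self, x):
        h = torch.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization trick
        return self.dec(z), mu, logvar

def vae_loss(x, recon, mu, logvar):
    rec = nn.functional.binary_cross_entropy(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld
```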