A Face Identification System Based on Self-Supervised Learning Using Triplet-based Variational Autoencoder
Citations: 0 | References: 0 | Related papers: 20
Keywords: Autoencoder, Identification, Supervised Learning, Instance-based learning, Competitive learning
Disentangled representation learning aims to extract explanatory features or factors while retaining salient information. The factorized hierarchical variational autoencoder (FHVAE) disentangles a speech signal into sequence-level and segment-level features, which represent speaker identity and speech content, respectively. Autoregressive predictive coding (APC), a self-supervised objective, has in turn been used to extract meaningful and transferable speech features for multiple downstream tasks. Inspired by the success of these two representation learning methods, this paper proposes integrating the APC objective into the FHVAE framework to benefit from the additional self-supervision target. The proposed method requires neither more training data nor more computational cost at test time, yet obtains more meaningful representations while maintaining disentanglement. Experiments were conducted on the TIMIT dataset. Results demonstrate that the FHVAE equipped with the additional self-supervised objective learns features that provide superior performance on speech recognition and speaker recognition tasks. Furthermore, voice conversion, one application of disentangled representation learning, was applied and evaluated; on this task the new framework performs on par with the baseline.
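As a rough illustration of the idea above, the sketch below attaches an APC-style future-frame prediction loss to a toy sequence VAE in PyTorch. It is not the authors' FHVAE implementation: the network shapes, the prediction shift, and the choice of an L1 loss for the APC term are all assumptions for illustration.

```python
# Minimal sketch: a sequence VAE with an APC-style auxiliary objective.
# Hypothetical names and dimensions; not the paper's FHVAE code.
import torch
import torch.nn as nn

class SeqVAEWithAPC(nn.Module):
    def __init__(self, feat_dim=80, latent_dim=32, shift=3):
        super().__init__()
        self.shift = shift  # APC predicts the frame `shift` steps ahead
        self.encoder = nn.GRU(feat_dim, 2 * latent_dim, batch_first=True)
        self.decoder = nn.GRU(latent_dim, feat_dim, batch_first=True)
        self.apc_head = nn.Linear(latent_dim, feat_dim)

    def forward(self, x):  # x: (batch, time, feat_dim)
        h, _ = self.encoder(x)
        mu, logvar = h.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        recon, _ = self.decoder(z)
        # APC term: predict the frame `shift` steps ahead from z at time t.
        pred = self.apc_head(z[:, :-self.shift])
        target = x[:, self.shift:]
        apc_loss = (pred - target).abs().mean()   # L1, as commonly used in APC
        recon_loss = (recon - x).pow(2).mean()
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
        return recon_loss + kl + apc_loss
```

The key point is that the extra APC head only affects training; at test time the encoder is used as before, which is why the method adds no inference cost.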
Semi-supervised learning is of practical importance because of the difficulty of obtaining large amounts of labeled data. In this paper, we apply an extension of the adversarial autoencoder to semi-supervised learning tasks. In an attempt to separate style and content, we divide the latent representation of the autoencoder into two parts and regularize the autoencoder by imposing a prior distribution on both parts to make them independent. As a result, one of the latent representations is associated with content, which is useful for classifying the images. We demonstrate that our method disentangles the style and content of the input images and achieves a lower test error rate than a vanilla autoencoder on MNIST semi-supervised classification tasks.
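A minimal sketch of the latent split described above, assuming MNIST-sized inputs and a softmax "content" part; the names, dimensions, and architecture are illustrative, not the paper's code.

```python
# Minimal sketch: an autoencoder whose latent code is split into a
# continuous "style" part and a class-like "content" part.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SplitLatentAE(nn.Module):
    def __init__(self, in_dim=784, style_dim=10, n_classes=10):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, style_dim + n_classes))
        self.dec = nn.Sequential(nn.Linear(style_dim + n_classes, 256),
                                 nn.ReLU(), nn.Linear(256, in_dim))
        self.style_dim = style_dim

    def forward(self, x):
        h = self.enc(x)
        style, content_logits = h[:, :self.style_dim], h[:, self.style_dim:]
        content = F.softmax(content_logits, dim=-1)  # acts like a class posterior
        recon = self.dec(torch.cat([style, content], dim=-1))
        return recon, style, content_logits

# Training would add (i) a reconstruction loss, (ii) adversarial losses that
# push `style` toward a Gaussian prior and `content` toward a categorical
# prior (making the two parts independent), and (iii) a cross-entropy term
# on `content_logits` for the labelled subset.
```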
Autoencoders are deep learning architectures that learn feature representations by minimizing the reconstruction error. Using an autoencoder as the baseline, this paper presents a novel formulation for a class-sparsity-based supervised encoder, termed CSSE. We postulate that features from the same class will have a common sparsity pattern/support in the latent space. Therefore, a supervision penalty is introduced into the autoencoder formulation as a joint-sparsity-promoting l2,1-norm. The CSSE formulation is derived for a single hidden layer and applied to multiple hidden layers using a greedy layer-by-layer learning approach. The proposed CSSE approach is applied to learning face representations, and verification experiments are performed on the LFW and PaSC face databases. The experiments show that the proposed approach yields improved results compared to autoencoders and results comparable to state-of-the-art face recognition algorithms.
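The l2,1 penalty above is concrete enough to sketch: for each class, stack that class's latent codes into a matrix, take the l2 norm of each column (latent dimension) across samples, and sum those norms. This is a generic rendering of the penalty, not the authors' implementation; `z`, `labels`, and `lam` are placeholders.

```python
# Minimal sketch of a class-wise l2,1 joint-sparsity penalty.
import torch

def class_sparsity_penalty(z, labels):
    """z: (batch, latent_dim) latent codes; labels: (batch,) class ids."""
    penalty = z.new_zeros(())
    for c in labels.unique():
        z_c = z[labels == c]                  # codes of one class
        # l2 across samples per latent dim, then sum over dims -> l2,1 norm,
        # which encourages whole latent dimensions to switch off per class.
        penalty = penalty + z_c.norm(p=2, dim=0).sum()
    return penalty

# Usage: loss = reconstruction_loss + lam * class_sparsity_penalty(z, y)
```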
This work tackles the problem of semi-supervised learning of image classifiers. Our main insight is that the field of semi-supervised learning can benefit from the quickly advancing field of self-supervised visual representation learning. Unifying these two approaches, we propose the framework of self-supervised semi-supervised learning (S4L) and use it to derive two novel semi-supervised image classification methods. We demonstrate the effectiveness of these methods in comparison to both carefully tuned baselines and existing semi-supervised learning methods. We then show that S4L and existing semi-supervised methods can be jointly trained, yielding a new state-of-the-art result on semi-supervised ILSVRC-2012 with 10% of the labels.
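The S4L recipe can be sketched as one shared backbone trained with two losses at once. The version below uses rotation prediction as the self-supervised task (one of the pretext tasks studied in the paper); `backbone`, the dimensions, and the equal loss weighting are assumptions for illustration.

```python
# Minimal sketch: supervised cross-entropy on labelled images plus a
# self-supervised rotation-prediction loss on unlabelled images.
import torch
import torch.nn as nn
import torch.nn.functional as F

class S4LModel(nn.Module):
    def __init__(self, backbone, feat_dim, n_classes):
        super().__init__()
        self.backbone = backbone                 # any image encoder
        self.cls_head = nn.Linear(feat_dim, n_classes)
        self.rot_head = nn.Linear(feat_dim, 4)   # 0/90/180/270 degrees

    def loss(self, x_lab, y, x_unlab):
        sup = F.cross_entropy(self.cls_head(self.backbone(x_lab)), y)
        # Rotate each unlabelled image by a random multiple of 90 degrees
        # and ask the model to recover which rotation was applied.
        k = torch.randint(0, 4, (x_unlab.size(0),), device=x_unlab.device)
        rotated = torch.stack([torch.rot90(img, int(r), dims=(-2, -1))
                               for img, r in zip(x_unlab, k)])
        selfsup = F.cross_entropy(self.rot_head(self.backbone(rotated)), k)
        return sup + selfsup
```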
Semi-supervised learning seeks to leverage unlabelled data when labelled data are difficult or expensive to acquire. Deep generative models (e.g., the variational autoencoder, VAE) and semi-supervised generative adversarial networks (GANs) have recently shown promising performance in semi-supervised classification thanks to their excellent discriminative representation ability. However, the latent code learned by the traditional VAE is not exclusive (repeatable) for a specific input sample, which limits classification performance. In particular, the learned latent representation depends on a non-exclusive component that is stochastically sampled from the prior distribution. Moreover, semi-supervised GAN models generate data from a pre-defined distribution (e.g., Gaussian noise) that is independent of the input data distribution, which may obstruct convergence and makes the distribution of the generated data difficult to control. To address these issues, we propose a novel Adversarial Variational Embedding (AVAE) framework for robust and effective semi-supervised learning that leverages both the advantage of the GAN as a high-quality generative model and that of the VAE as a posterior-distribution learner. The proposed approach first produces an exclusive latent code with a model we call VAE++ and, at the same time, provides a meaningful prior distribution for the generator of the GAN. The approach is evaluated on four different real-world applications, and we show that it outperforms state-of-the-art models, confirming that the combination of VAE++ and GAN yields significant improvements in semi-supervised classification.
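A loose sketch of the coupling described above, under stated assumptions: the encoder's posterior mean is used as an "exclusive" (repeatable) latent code, and the same code supplies the GAN generator with a data-dependent prior. All module shapes and names here are illustrative, not the paper's VAE++ architecture.

```python
# Minimal sketch: a deterministic VAE code feeding a GAN generator.
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 64))   # -> mu, logvar
gen = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 784))   # GAN generator
disc = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 1))   # GAN discriminator

def exclusive_code(x):
    mu, logvar = enc(x).chunk(2, dim=-1)
    return mu  # deterministic for a given input, unlike a sampled z

x = torch.rand(8, 784)
z = exclusive_code(x)           # data-dependent prior for the generator
fake = gen(z)
real_score, fake_score = disc(x), disc(fake)
# Training would combine the VAE losses, adversarial losses on
# real_score/fake_score, and a classifier on z for the labelled subset.
```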