Deep Multiview Clustering by Contrasting Cluster Assignments
3 Citations · 39 References · 10 Related Papers
Abstract:
Multiview clustering (MVC) aims to reveal the underlying structure of multiview data by categorizing data samples into clusters. Deep learning-based methods exhibit strong feature learning capabilities on large-scale datasets. For most existing deep MVC methods, however, learning view-invariant representations remains an intractable problem. In this paper, we propose a cross-view contrastive learning (CVCL) method that learns view-invariant representations and produces clustering results by contrasting the cluster assignments among multiple views. Specifically, we first employ deep autoencoders to extract view-dependent features in the pretraining stage. Then, a cluster-level CVCL strategy is presented to explore consistent semantic label information among the multiple views in the fine-tuning stage. The proposed CVCL method is thus able to produce more discriminative cluster assignments by virtue of this learning strategy. Moreover, we provide a theoretical analysis of soft cluster assignment alignment. Extensive experimental results on several datasets demonstrate that the proposed CVCL method outperforms several state-of-the-art approaches.
Keywords: Discriminative model; Feature (linguistics)
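The cluster-level contrastive idea in the abstract can be illustrated with a minimal NumPy sketch, not the authors' implementation: each column of a soft assignment matrix is treated as one cluster's representation, the same cluster across two views forms a positive pair, and all other columns act as negatives. The column-as-cluster treatment and the temperature value are assumptions for illustration.

```python
import numpy as np

def cluster_contrastive_loss(p1, p2, tau=0.5):
    """Cluster-level contrastive loss between two views' soft assignments.

    p1, p2: (n_samples, n_clusters) soft cluster-assignment matrices.
    Each column is one cluster's representation; the same cluster across
    the two views is a positive pair, all other columns are negatives.
    """
    q1 = p1.T / np.linalg.norm(p1.T, axis=1, keepdims=True)  # (k, n)
    q2 = p2.T / np.linalg.norm(p2.T, axis=1, keepdims=True)
    sim = q1 @ q2.T / tau                                    # (k, k) cosine / tau
    # InfoNCE-style cross-entropy against the identity pairing:
    # matching clusters should be more similar than mismatched ones.
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

Minimizing this loss pulls each cluster's assignment pattern together across the two views while pushing different clusters apart, which is the sense in which the learned representations become view-invariant.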
Currently, most top-performing Weakly supervised Fine-grained Image Classification (WFGIC) schemes pick out discriminative patches. However, these patches usually contain considerable noise, which hurts classification accuracy. Moreover, such schemes rely on a large number of candidate patches to discover the discriminative ones, leading to high computational cost. To address these problems, we propose a novel end-to-end Self-regressive Localization with Discriminative Prior Network (SDN) model, which learns more accurate sizes for discriminative patches and classifies images in real time. Specifically, we design a multi-task discriminative learning network, comprising a self-regressive localization sub-network and a discriminative prior sub-network, with a guided loss and a consistent loss to simultaneously learn self-regressive coefficients and discriminative prior maps. The self-regressive coefficients suppress noise in the discriminative patches, and the discriminative prior maps, by learning discriminative probability values, filter thousands of candidate patches down to a single one. Extensive experiments demonstrate that the proposed SDN model achieves state-of-the-art performance in both accuracy and efficiency.
Keywords: Discriminative model; Contextual image classification · Citations: 3
Fine-grained image classification aims to recognize hundreds of subcategories within each basic-level category. Existing methods employ discriminative localization to find the key distinctions between similar subcategories. However, they generally have two limitations: 1) discriminative localization relies on region proposal methods to hypothesize the locations of discriminative regions, which are time-consuming and a bottleneck to improving classification speed; and 2) the training of discriminative localization depends on object or part annotations, which are labor-intensive and an obstacle to practical application. Addressing both limitations simultaneously is highly challenging, and existing methods focus on only one of them. Therefore, we propose a weakly supervised discriminative localization approach (WSDL) for fast fine-grained image classification that addresses both limitations at the same time. Its main advantages are: 1) multi-level attention-guided localization learning, which localizes discriminative regions with different focuses automatically, without object or part annotations, avoiding labor-intensive labeling; different attention levels focus on different, complementary characteristics of the image and boost classification accuracy; and 2) an n-pathway end-to-end discriminative localization network, which improves classification speed by simultaneously localizing multiple discriminative regions per image to boost accuracy, and by sharing full-image convolutional features generated by a region proposal network to accelerate proposal generation and reduce convolutional computation. Both are jointly employed to simultaneously improve classification speed and eliminate dependence on object and part annotations. Compared with state-of-the-art methods on two widely used fine-grained image classification datasets, our WSDL approach achieves the best classification accuracy and efficiency.
Keywords: Discriminative model; Contextual image classification · Citations: 76
Local discriminative regions play important roles in fine-grained image analysis tasks. How to locate local discriminative regions using only category labels, and how to learn discriminative representations from these regions, have been active research questions. In this work, we propose the Searching Discriminative Regions (SDR) and Learning Discriminative Regions (LDR) methods to search for and learn local discriminative regions in images. The SDR method adopts an attention mechanism to iteratively search for high-response regions in images and uses these as clues to locate local discriminative regions. The LDR method then learns representations that are compact within categories and sparse between categories from both the raw image and the local images. Experimental results show that our proposed approach achieves excellent performance in both fine-grained image retrieval and classification tasks, demonstrating its effectiveness.
Keywords: Discriminative model; Representation; Contextual image classification · Citations: 5
The advancements in generative modeling, particularly the advent of diffusion models, have sparked a fundamental question: how can these models be effectively used for discriminative tasks? In this work, we find that generative models can be great test-time adapters for discriminative models. Our method, Diffusion-TTA, adapts pre-trained discriminative models, such as image classifiers, segmenters, and depth predictors, to each unlabelled example in the test set using generative feedback from a diffusion model. We achieve this by modulating the conditioning of the diffusion model using the output of the discriminative model. We then maximize the image likelihood objective by backpropagating the gradients to the discriminative model's parameters. We show that Diffusion-TTA significantly enhances the accuracy of various large-scale pre-trained discriminative models, such as ImageNet classifiers, CLIP models, image pixel labellers, and image depth predictors. Diffusion-TTA outperforms existing test-time adaptation methods, including TTT-MAE and TENT, and particularly shines in online adaptation setups, where the discriminative model is continually adapted to each example in the test set. We provide access to code, results, and visualizations on our website: https://diffusion-tta.github.io/.
Keywords: Discriminative model; Generative model · Citations: 0
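The adaptation loop described above can be illustrated with a toy stand-in: here a class-conditional Gaussian replaces the diffusion model, and plain gradient ascent on a single weight matrix replaces backprop through a denoiser. The function name `tta_step`, the class means, and the learning rate are all illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def tta_step(W, x, class_means, lr=0.1, sigma=1.0):
    """One test-time adaptation step on a single unlabelled example x.

    The classifier's soft output p conditions a generative model (a
    class-conditional Gaussian standing in for a diffusion model), and
    the gradient of the expected log-likelihood sum_c p_c * ll_c w.r.t.
    the classifier weights W is followed, nudging the classifier toward
    the class whose generative model explains x best.
    """
    p = softmax(W @ x)                                             # (k,) prediction
    ll = -0.5 * np.sum((x - class_means) ** 2, axis=1) / sigma**2  # (k,) cond. log-lik.
    # gradient of sum_c p_c * ll_c through the softmax
    g = (p * (ll - p @ ll))[:, None] * x[None, :]
    return W + lr * g, p
```

In this sketch each unlabelled test example would receive a few such steps; in the paper the likelihood term is the diffusion training objective and the gradient flows through the diffusion model's conditioning instead.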
Discriminative localization is essential for the fine-grained image classification task, which aims to recognize hundreds of subcategories within the same basic-level category. The key differences among subcategories are subtle and local, reflected in the discriminative regions of objects. Existing methods generally adopt a two-stage learning framework: the first stage localizes the discriminative regions of objects, and the second encodes the discriminative features for training classifiers. However, these methods have two limitations: (1) the separation of the two learning stages is time-consuming, and (2) the dependence on object and part annotations for discriminative localization learning requires heavy labeling labor. Addressing both limitations simultaneously is highly challenging; existing methods focus on only one of them. Therefore, this paper proposes a discriminative localization approach via saliency-guided Faster R-CNN that addresses both limitations at once. Our main novelties and advantages are: (1) an end-to-end network based on Faster R-CNN that simultaneously localizes discriminative regions and encodes discriminative features, accelerating classification; and (2) saliency-guided localization learning that localizes the discriminative region automatically, avoiding labor-intensive labeling. Both are jointly employed to simultaneously accelerate classification and eliminate dependence on object and part annotations. Compared with state-of-the-art methods on the widely used CUB-200-2011 dataset, our approach achieves both the best classification accuracy and the best efficiency.
Keywords: Discriminative model; ENCODE · Citations: 52
Discriminative learning of sparse-code-based dictionaries tends to be inherently unstable. We show that using a discriminative version of the deviation function to learn such dictionaries leads to a more stable formulation that can handle the reconstruction/discrimination trade-off in a principled manner. Results on the Graz02 and UCF Sports datasets validate the proposed formulation.
Keywords: Discriminative model; Dictionary Learning; Code (set theory) · Citations: 6
This paper presents a framework that fuses discriminative and generative models for Chinese dialect identification. The generative models are employed to produce language feature vectors, and the discriminative models are used to perform classification. Four Chinese dialects are tested with this system. The experimental results show that the proposed system outperforms the GMM-based system; meanwhile, the SVM-based discriminative method has stronger discriminative ability than the ANN-based one.
Keywords: Discriminative model; Generative model; Identification; Feature (linguistics) · Citations: 0
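The fusion this abstract describes, where generative models produce feature vectors and a discriminative model classifies them, can be sketched as follows. A one-component Gaussian per dialect stands in for the GMMs and a linear scorer for the SVM, so every name and shape here is an illustrative assumption.

```python
import numpy as np

def generative_features(x, means, sigma=1.0):
    """Generative front-end: log-likelihood of utterance features x under
    each dialect's Gaussian (a one-component stand-in for the GMMs)."""
    return -0.5 * np.sum((x - means) ** 2, axis=1) / sigma**2

def fused_predict(x, means, W, b):
    """Discriminative back-end: a linear classifier (SVM-like stand-in)
    applied to the generative feature vector."""
    scores = W @ generative_features(x, means) + b
    return int(np.argmax(scores))
```

The design point is the division of labor: the generative stage compresses raw features into a small, class-aware vector of likelihoods, and the discriminative stage learns decision boundaries over that vector rather than over the raw input.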