Blended Latent Diffusion
Abstract:
The tremendous progress in neural image generation, coupled with the emergence of seemingly omnipotent vision-language models, has finally enabled text-based interfaces for creating and editing images. Handling generic images requires a diverse underlying generative model, hence the latest works utilize diffusion models, which were shown to surpass GANs in terms of diversity. One major drawback of diffusion models, however, is their relatively slow inference time. In this paper, we present an accelerated solution to the task of local text-driven editing of generic images, where the desired edits are confined to a user-provided mask. Our solution leverages a recent text-to-image Latent Diffusion Model (LDM), which speeds up diffusion by operating in a lower-dimensional latent space. We first convert the LDM into a local image editor by incorporating Blended Diffusion into it. Next, we propose an optimization-based solution for the inherent inability of this LDM to accurately reconstruct images. Finally, we address the scenario of performing local edits using thin masks. We evaluate our method against the available baselines both qualitatively and quantitatively and demonstrate that, in addition to being faster, our method achieves better precision than the baselines while mitigating some of their artifacts.
Keywords:
Image editing
Generative model
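
As a rough illustration of the masked latent-space blending the abstract describes, the sketch below performs one text-conditioned denoising step and then blends the result with a noised version of the source latent. The diffusers-style interface (`unet`, `scheduler`, `encoder_hidden_states`) and the downsampled `latent_mask` are assumptions for illustration, not the authors' actual code.

```python
# Hedged sketch of one blended denoising step in latent space.
import torch

def blended_latent_step(z_t, t, source_latent, latent_mask, unet, scheduler, text_emb):
    # One text-conditioned denoising step on the edited latent.
    noise_pred = unet(z_t, t, encoder_hidden_states=text_emb).sample
    z_edited = scheduler.step(noise_pred, t, z_t).prev_sample

    # Noise the source image's latent to the matching timestep so the
    # two latents are statistically comparable before blending.
    z_source = scheduler.add_noise(source_latent, torch.randn_like(source_latent), t)

    # Keep generated content inside the (downsampled) mask,
    # original content outside it.
    return latent_mask * z_edited + (1 - latent_mask) * z_source
```

Repeating this at every step keeps the region outside the mask anchored to the source image, so only the masked area is ever edited.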
Generative adversarial networks are neural-network-based generative models, predominantly used for generating data samples close to the data distribution they have been trained on. A model for generating realistic blood cell images, along with their corresponding segmentation masks, is developed based on cycle-consistent generative adversarial networks.
Generative model
Citations (0)
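
For readers unfamiliar with the cycle-consistency idea this abstract builds on, here is a minimal sketch of the reconstruction loss; the mapping directions (`G`: image to mask, `F`: mask to image) are an assumption about this particular model.

```python
# Minimal cycle-consistency loss sketch (CycleGAN-style).
import torch.nn.functional as nnf

def cycle_loss(G, F, real_image, real_mask):
    # Translating to the other domain and back should reconstruct the input.
    return (nnf.l1_loss(F(G(real_image)), real_image)
            + nnf.l1_loss(G(F(real_mask)), real_mask))
```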
Uncovering data generative factors is the ultimate goal of disentanglement learning. Although many works have proposed disentangling generative models able to uncover the underlying generative factors of a dataset, so far none has been able to uncover OOD generative factors (i.e., factors of variation that are not explicitly shown in the dataset). Moreover, the datasets used to validate these models are synthetically generated using a balanced mixture of some predefined generative factors, implicitly assuming that generative factors are uniformly distributed across the datasets. However, real datasets do not exhibit this property. In this work we analyse the effect of using datasets with unbalanced generative factors, providing qualitative and quantitative results for widely used generative models. Moreover, we propose TC-VAE, a generative model optimized using a lower bound of the joint total correlation between the learned latent representations and the input data. We show that the proposed model is able to uncover OOD generative factors on different datasets and, on average, outperforms the related baselines in terms of downstream disentanglement metrics.
Generative model
Generative Design
Citations (0)
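
The total-correlation term that TC-VAE bounds can be estimated on a minibatch. Below is one common estimator in the style of beta-TC-VAE; whether TC-VAE's joint bound reduces to exactly this estimator is an assumption, so treat it as a sketch of the quantity, not the paper's objective.

```python
# Minibatch estimator sketch of the total-correlation penalty.
import math
import torch

def log_gaussian(z, mu, logvar):
    # Elementwise log N(z; mu, exp(logvar)).
    return -0.5 * (logvar + (z - mu) ** 2 / logvar.exp() + math.log(2 * math.pi))

def total_correlation(z, mu, logvar):
    # z, mu, logvar: (B, D). Pairwise densities: (B, B, D).
    B = z.shape[0]
    log_q_pair = log_gaussian(z.unsqueeze(1), mu.unsqueeze(0), logvar.unsqueeze(0))
    # log q(z_i) via a batch logsumexp over the joint density ...
    log_qz = torch.logsumexp(log_q_pair.sum(dim=2), dim=1) - math.log(B)
    # ... and the product of marginals, one dimension at a time.
    log_qz_prod = (torch.logsumexp(log_q_pair, dim=1) - math.log(B)).sum(dim=1)
    # TC = E_q[ log q(z) - sum_d log q(z_d) ].
    return (log_qz - log_qz_prod).mean()
```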
GANs provide a framework for training generative models which mimic a data distribution. However, in many cases we wish to train these generative models to optimize some auxiliary objective function within the data it generates, such as making more aesthetically pleasing images. In some cases, these objective functions are difficult to evaluate, e.g. they may require human interaction. Here, we develop a system for efficiently improving a GAN to target an objective involving human interaction, specifically generating images that increase rates of positive user interactions. To improve the generative model, we build a model of human behavior in the targeted domain from a relatively small set of interactions, and then use this behavioral model as an auxiliary loss function to improve the generative model. We show that this system is successful at improving positive interaction rates, at least on simulated data, and characterize some of the factors that affect its performance.
Generative model
Training set
Citations (1)
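
A condensed sketch of the training signal described above: the generator keeps its usual adversarial loss and adds an auxiliary term from the fitted behavior model. `behavior_model`, which outputs a predicted positive-interaction probability, is a hypothetical stand-in for the paper's learned behavioral model.

```python
# Illustrative generator objective with a behavioral auxiliary loss.
import torch

def generator_loss(G, D, behavior_model, z, aux_weight=0.1):
    fake = G(z)
    # Standard non-saturating adversarial term.
    adv = -torch.log(torch.sigmoid(D(fake)) + 1e-8).mean()
    # Auxiliary term: favor samples with high predicted interaction rates.
    aux = -torch.log(behavior_model(fake) + 1e-8).mean()
    return adv + aux_weight * aux
```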
Approaches to zero-shot learning have typically involved finding embeddings in a latent space between visual and textual features and using these embeddings with nearest-neighbor searches to perform classification. An alternative approach is to convert this problem into a fully supervised one by introducing generated features using generative models. This can be done with any form of generative approach, ranging from simple GANs to cycle GANs, as well as approaches such as out-of-distribution detection models. However, generative approaches are typically unstable in training, and recent research in vision-and-language training has seen significant progress in zero-shot learning, making the use of generative approaches rather obsolete. Based on our studies, we see a few primary concerns behind the drop in the use of generative approaches. As mentioned before, the use of generative models typically leads to unstable training. Further, the training process is expensive and time-consuming. A possible direction we are considering is to replace the typically used I3D backbone with transformer-based backbones, as we believe this will lead to better features for the seen classes and will significantly boost the training process of generative models.
Generative model
Generative Design
Citations (0)
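
To make the "convert ZSL to supervised learning" step concrete, here is a hedged sketch of feature synthesis with a conditional generator; the generator signature and the class-embedding format are assumptions for illustration.

```python
# Sketch: synthesize visual features for unseen classes, then train
# any standard classifier on the resulting labeled set.
import torch

@torch.no_grad()
def synthesize_unseen_features(G, class_embeddings, n_per_class, z_dim):
    feats, labels = [], []
    for label, emb in enumerate(class_embeddings):
        z = torch.randn(n_per_class, z_dim)
        cond = emb.expand(n_per_class, -1)     # repeat the class embedding
        feats.append(G(z, cond))               # synthetic visual features
        labels.append(torch.full((n_per_class,), label))
    return torch.cat(feats), torch.cat(labels)
```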
The rapid adoption of generative Artificial Intelligence (AI) tools that can generate realistic images or text, such as DALL-E, MidJourney, or ChatGPT, has put the societal impacts of these technologies at the center of public debate. These tools are possible due to the massive amount of data (text and images) that is publicly available through the Internet. At the same time, these generative AI tools become content creators that are already contributing to the data that is available to train future models. Therefore, future versions of generative AI tools will be trained with a mix of human-created and AI-generated content, causing a potential feedback loop between generative AI and public data repositories. This interaction raises many questions: how will future versions of generative AI tools behave when trained on a mixture of real and AI-generated data? Will they evolve and improve with the new data sets, or on the contrary will they degrade? Will evolution introduce biases or reduce diversity in subsequent generations of generative AI tools? What are the societal implications of the possible degradation of these models? Can we mitigate the effects of this feedback loop? In this document, we explore the effect of this interaction and report some initial results using simple diffusion models trained with various image datasets. Our results show that the quality and diversity of the generated images can degrade over time, suggesting that incorporating AI-created data can have undesired effects on future versions of generative models.
Generative model
Generative Design
Citations (4)
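
A purely conceptual sketch of the feedback loop studied above, with training and sampling passed in as caller-supplied callables since the paper's actual pipeline is not shown here.

```python
# Each generation trains on real data mixed with samples from the
# previous model; `train_fn` and `sample_fn` are assumed stand-ins.
def generational_loop(train_fn, sample_fn, real_data, n_generations, synth_fraction=0.5):
    model = train_fn(real_data)
    for _ in range(n_generations):
        synthetic = sample_fn(model, int(len(real_data) * synth_fraction))
        # Later generations see a mixture of human-made and AI-made data.
        model = train_fn(real_data + synthetic)
    return model
```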
This paper proposes two novel knowledge transfer techniques for class-incremental learning (CIL). First, we propose data-free generative replay (DF-GR) to mitigate catastrophic forgetting in CIL by using synthetic samples from a generative model. In the conventional generative replay, the generative model is pre-trained for old data and shared in extra memory for later incremental learning. In our proposed DF-GR, we train a generative model from scratch without using any training data, based on the pre-trained classification model from the past, so we curtail the cost of sharing pre-trained generative models. Second, we introduce dual-teacher information distillation (DT-ID) for knowledge distillation from two teachers to one student. In CIL, we use DT-ID to learn new classes incrementally based on the pre-trained model for old classes and another model (pre-)trained on the new data for new classes. We implemented the proposed schemes on top of one of the state-of-the-art CIL methods and showed the performance improvement on CIFAR-100 and ImageNet datasets.
Generative model
Generative Design
Citations (0)
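
A rough sketch of the dual-teacher distillation idea (DT-ID) described above: the student matches the old-class teacher on its old logits and the new-class teacher on its new logits. The exact losses, split, and temperature are assumptions, not the paper's specification.

```python
# Dual-teacher distillation sketch: two KL terms, one per teacher.
import torch.nn.functional as F

def dt_id_loss(student_logits, old_teacher_logits, new_teacher_logits,
               n_old, T=2.0):
    # Split the student's output into old- and new-class logits.
    s_old, s_new = student_logits[:, :n_old], student_logits[:, n_old:]
    kd = lambda s, t: F.kl_div(F.log_softmax(s / T, dim=1),
                               F.softmax(t / T, dim=1),
                               reduction="batchmean") * T * T
    # Distill old classes from the old teacher, new classes from the new one.
    return kd(s_old, old_teacher_logits) + kd(s_new, new_teacher_logits)
```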
In this thesis, a novel generative-evaluative method is proposed to solve the problem of dexterous grasping of novel objects from a single view. The generative model is learned from human demonstration, and the grasps it generates are used to train the evaluative model. Two novel evaluative network architectures are proposed. The evaluative model is a deep evaluative network that is trained in simulation. The generative-evaluative method is tested on a real grasp data set with 49 previously unseen challenging objects, achieving a success rate of 78% and outperforming the purely generative method, which achieves a success rate of 57%. The thesis provides insights into the strengths and weaknesses of the generative-evaluative method by comparing different deep network architectures.
Generative model
Strengths and weaknesses
Generative Design
Citations (5)
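
A hypothetical sketch of the generative-evaluative pipeline summarized above: sample candidate grasps from the demonstration-trained generator, then keep the candidate the evaluative network scores highest. All interfaces here are assumptions, not the thesis' actual code.

```python
# Generate-then-rank grasp selection sketch.
import torch

@torch.no_grad()
def select_grasp(generator, evaluator, view, n_candidates=100):
    grasps = generator.sample(view, n_candidates)  # candidate grasp poses
    scores = evaluator(view, grasps)               # predicted success scores
    return grasps[scores.argmax()]
```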