Blended Latent Diffusion
Abstract:
The tremendous progress in neural image generation, coupled with the emergence of seemingly omnipotent vision-language models, has finally enabled text-based interfaces for creating and editing images. Handling generic images requires a diverse underlying generative model, hence the latest works utilize diffusion models, which were shown to surpass GANs in terms of diversity. One major drawback of diffusion models, however, is their relatively slow inference time. In this paper, we present an accelerated solution to the task of local text-driven editing of generic images, where the desired edits are confined to a user-provided mask. Our solution leverages a recent text-to-image Latent Diffusion Model (LDM), which speeds up diffusion by operating in a lower-dimensional latent space. We first convert the LDM into a local image editor by incorporating Blended Diffusion into it. Next, we propose an optimization-based solution for the inherent inability of this LDM to accurately reconstruct images. Finally, we address the scenario of performing local edits using thin masks. We evaluate our method against the available baselines both qualitatively and quantitatively and demonstrate that, in addition to being faster, our method achieves better precision than the baselines while mitigating some of their artifacts.
Keywords:
Image editing
Generative model
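
As a rough illustration of the masked latent-space blending the abstract describes, the sketch below performs one text-conditioned denoising step and then blends the result with a noised version of the source latent. The diffusers-style interface (`unet`, `scheduler`, `encoder_hidden_states`) and the downsampled `latent_mask` are assumptions for illustration, not the authors' actual code.

```python
# Hedged sketch of one blended denoising step in latent space.
import torch

def blended_latent_step(z_t, t, source_latent, latent_mask, unet, scheduler, text_emb):
    # One text-conditioned denoising step on the edited latent.
    noise_pred = unet(z_t, t, encoder_hidden_states=text_emb).sample
    z_edited = scheduler.step(noise_pred, t, z_t).prev_sample

    # Noise the source image's latent to the matching timestep so the
    # two latents are statistically comparable before blending.
    z_source = scheduler.add_noise(source_latent, torch.randn_like(source_latent), t)

    # Keep generated content inside the (downsampled) mask,
    # original content outside it.
    return latent_mask * z_edited + (1 - latent_mask) * z_source
```

Repeating this at every step keeps the region outside the mask anchored to the source image, so only the masked area is ever edited.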
Generative adversarial networks are neural-network-based generative models, predominantly used for generating data samples close to the data distribution they have been trained on. A model for generating realistic blood cell images, along with their corresponding segmentation masks, is developed based on cycle-consistent generative adversarial networks.
Generative model
Citations (0)
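
For readers unfamiliar with the cycle-consistency idea this abstract builds on, here is a minimal sketch of the reconstruction loss; the mapping directions (`G`: image to mask, `F`: mask to image) are an assumption about this particular model.

```python
# Minimal cycle-consistency loss sketch (CycleGAN-style).
import torch.nn.functional as nnf

def cycle_loss(G, F, real_image, real_mask):
    # Translating to the other domain and back should reconstruct the input.
    return (nnf.l1_loss(F(G(real_image)), real_image)
            + nnf.l1_loss(G(F(real_mask)), real_mask))
```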
Uncovering data generative factors is the ultimate goal of disentanglement learning. Although many works have proposed disentangling generative models able to uncover the underlying generative factors of a dataset, so far none has been able to uncover OOD generative factors (i.e., factors of variation that are not explicitly shown in the dataset). Moreover, the datasets used to validate these models are synthetically generated using a balanced mixture of some predefined generative factors, implicitly assuming that generative factors are uniformly distributed across the datasets. However, real datasets do not exhibit this property. In this work we analyse the effect of using datasets with unbalanced generative factors, providing qualitative and quantitative results for widely used generative models. Moreover, we propose TC-VAE, a generative model optimized using a lower bound of the joint total correlation between the learned latent representations and the input data. We show that the proposed model is able to uncover OOD generative factors on different datasets and, on average, outperforms the related baselines in terms of downstream disentanglement metrics.
Generative model
Generative Design
Citations (0)
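
The total-correlation term that TC-VAE bounds can be estimated on a minibatch. Below is one common estimator in the style of beta-TC-VAE; whether TC-VAE's joint bound reduces to exactly this estimator is an assumption, so treat it as a sketch of the quantity, not the paper's objective.

```python
# Minibatch estimator sketch of the total-correlation penalty.
import math
import torch

def log_gaussian(z, mu, logvar):
    # Elementwise log N(z; mu, exp(logvar)).
    return -0.5 * (logvar + (z - mu) ** 2 / logvar.exp() + math.log(2 * math.pi))

def total_correlation(z, mu, logvar):
    # z, mu, logvar: (B, D). Pairwise densities: (B, B, D).
    B = z.shape[0]
    log_q_pair = log_gaussian(z.unsqueeze(1), mu.unsqueeze(0), logvar.unsqueeze(0))
    # log q(z_i) via a batch logsumexp over the joint density ...
    log_qz = torch.logsumexp(log_q_pair.sum(dim=2), dim=1) - math.log(B)
    # ... and the product of marginals, one dimension at a time.
    log_qz_prod = (torch.logsumexp(log_q_pair, dim=1) - math.log(B)).sum(dim=1)
    # TC = E_q[ log q(z) - sum_d log q(z_d) ].
    return (log_qz - log_qz_prod).mean()
```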
GANs provide a framework for training generative models which mimic a data distribution. However, in many cases we wish to train these generative models to optimize some auxiliary objective function within the data it generates, such as making more aesthetically pleasing images. In some cases, these objective functions are difficult to evaluate, e.g. they may require human interaction. Here, we develop a system for efficiently improving a GAN to target an objective involving human interaction, specifically generating images that increase rates of positive user interactions. To improve the generative model, we build a model of human behavior in the targeted domain from a relatively small set of interactions, and then use this behavioral model as an auxiliary loss function to improve the generative model. We show that this system is successful at improving positive interaction rates, at least on simulated data, and characterize some of the factors that affect its performance.
Generative model
Training set
Citations (1)
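
A condensed sketch of the training signal described above: the generator keeps its usual adversarial loss and adds an auxiliary term from the fitted behavior model. `behavior_model`, which outputs a predicted positive-interaction probability, is a hypothetical stand-in for the paper's learned behavioral model.

```python
# Illustrative generator objective with a behavioral auxiliary loss.
import torch

def generator_loss(G, D, behavior_model, z, aux_weight=0.1):
    fake = G(z)
    # Standard non-saturating adversarial term.
    adv = -torch.log(torch.sigmoid(D(fake)) + 1e-8).mean()
    # Auxiliary term: favor samples with high predicted interaction rates.
    aux = -torch.log(behavior_model(fake) + 1e-8).mean()
    return adv + aux_weight * aux
```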
Approaches to zero-shot learning have typically involved finding embeddings in a latent space between visual and textual features and using these embeddings with nearest-neighbor searches to perform classification. An alternative approach is to convert this problem into a fully supervised one by introducing generated features using generative models. This can be done with any form of generative approach, ranging from simple GANs to cycle GANs, as well as approaches such as out-of-distribution detection models. However, generative approaches are typically unstable in training, and recent research in vision-and-language training has seen significant progress in zero-shot learning, making the use of generative approaches rather obsolete. Based on our studies, we see a few primary concerns behind the drop in the use of generative approaches. As mentioned before, the use of generative models typically leads to unstable training. Further, the training process is expensive and time-consuming. A possible direction we are considering is to replace the typically used I3D backbone with transformer-based backbones, as we believe this will lead to better features for the seen classes and will significantly boost the training process of generative models.
Generative model
Generative Design
Citations (0)
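
To make the "convert ZSL to supervised learning" step concrete, here is a hedged sketch of feature synthesis with a conditional generator; the generator signature and the class-embedding format are assumptions for illustration.

```python
# Sketch: synthesize visual features for unseen classes, then train
# any standard classifier on the resulting labeled set.
import torch

@torch.no_grad()
def synthesize_unseen_features(G, class_embeddings, n_per_class, z_dim):
    feats, labels = [], []
    for label, emb in enumerate(class_embeddings):
        z = torch.randn(n_per_class, z_dim)
        cond = emb.expand(n_per_class, -1)     # repeat the class embedding
        feats.append(G(z, cond))               # synthetic visual features
        labels.append(torch.full((n_per_class,), label))
    return torch.cat(feats), torch.cat(labels)
```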
The rapid adoption of generative Artificial Intelligence (AI) tools that can generate realistic images or text, such as DALL-E, MidJourney, or ChatGPT, has put the societal impacts of these technologies at the center of public debate. These tools are possible due to the massive amount of data (text and images) that is publicly available through the Internet. At the same time, these generative AI tools become content creators that are already contributing to the data that is available to train future models. Therefore, future versions of generative AI tools will be trained with a mix of human-created and AI-generated content, causing a potential feedback loop between generative AI and public data repositories. This interaction raises many questions: how will future versions of generative AI tools behave when trained on a mixture of real and AI-generated data? Will they evolve and improve with the new data sets, or on the contrary will they degrade? Will evolution introduce biases or reduce diversity in subsequent generations of generative AI tools? What are the societal implications of the possible degradation of these models? Can we mitigate the effects of this feedback loop? In this document, we explore the effect of this interaction and report some initial results using simple diffusion models trained with various image datasets. Our results show that the quality and diversity of the generated images can degrade over time, suggesting that incorporating AI-created data can have undesired effects on future versions of generative models.
Generative model
Generative Design
Citations (4)
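
A purely conceptual sketch of the feedback loop studied above, with training and sampling passed in as caller-supplied callables since the paper's actual pipeline is not shown here.

```python
# Each generation trains on real data mixed with samples from the
# previous model; `train_fn` and `sample_fn` are assumed stand-ins.
def generational_loop(train_fn, sample_fn, real_data, n_generations, synth_fraction=0.5):
    model = train_fn(real_data)
    for _ in range(n_generations):
        synthetic = sample_fn(model, int(len(real_data) * synth_fraction))
        # Later generations see a mixture of human-made and AI-made data.
        model = train_fn(real_data + synthetic)
    return model
```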
This paper proposes two novel knowledge transfer techniques for class-incremental learning (CIL). First, we propose data-free generative replay (DF-GR) to mitigate catastrophic forgetting in CIL by using synthetic samples from a generative model. In the conventional generative replay, the generative model is pre-trained for old data and shared in extra memory for later incremental learning. In our proposed DF-GR, we train a generative model from scratch without using any training data, based on the pre-trained classification model from the past, so we curtail the cost of sharing pre-trained generative models. Second, we introduce dual-teacher information distillation (DT-ID) for knowledge distillation from two teachers to one student. In CIL, we use DT-ID to learn new classes incrementally based on the pre-trained model for old classes and another model (pre-)trained on the new data for new classes. We implemented the proposed schemes on top of one of the state-of-the-art CIL methods and showed the performance improvement on CIFAR-100 and ImageNet datasets.
Generative model
Generative Design
Citations (0)
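
A rough sketch of the dual-teacher distillation idea (DT-ID) described above: the student matches the old-class teacher on its old logits and the new-class teacher on its new logits. The exact losses, split, and temperature are assumptions, not the paper's specification.

```python
# Dual-teacher distillation sketch: two KL terms, one per teacher.
import torch.nn.functional as F

def dt_id_loss(student_logits, old_teacher_logits, new_teacher_logits,
               n_old, T=2.0):
    # Split the student's output into old- and new-class logits.
    s_old, s_new = student_logits[:, :n_old], student_logits[:, n_old:]
    kd = lambda s, t: F.kl_div(F.log_softmax(s / T, dim=1),
                               F.softmax(t / T, dim=1),
                               reduction="batchmean") * T * T
    # Distill old classes from the old teacher, new classes from the new one.
    return kd(s_old, old_teacher_logits) + kd(s_new, new_teacher_logits)
```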
In this thesis, a novel generative-evaluative method is proposed to solve the problem of dexterous grasping of novel objects from a single view. The generative model is learned from human demonstration, and the grasps it generates are used to train the evaluative model. Two novel evaluative network architectures are proposed. The evaluative model is a deep evaluative network that is trained in simulation. The generative-evaluative method is tested on a real grasp data set with 49 previously unseen challenging objects, achieving a success rate of 78% and outperforming the purely generative method, which achieves a success rate of 57%. The thesis provides insights into the strengths and weaknesses of the generative-evaluative method by comparing different deep network architectures.
Generative model
Strengths and weaknesses
Generative Design
Citations (5)
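
A hypothetical sketch of the generative-evaluative pipeline summarized above: sample candidate grasps from the demonstration-trained generator, then keep the candidate the evaluative network scores highest. All interfaces here are assumptions, not the thesis' actual code.

```python
# Generate-then-rank grasp selection sketch.
import torch

@torch.no_grad()
def select_grasp(generator, evaluator, view, n_candidates=100):
    grasps = generator.sample(view, n_candidates)  # candidate grasp poses
    scores = evaluator(view, grasps)               # predicted success scores
    return grasps[scores.argmax()]
```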