FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization

Last update: Dec 31, 2022

Overview

FuseDream

This repo contains the code for our paper (arXiv:2112.01573):

FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization

by Xingchao Liu, Chengyue Gong, Lemeng Wu, Shujian Zhang, Hao Su and Qiang Liu from UCSD and UT Austin.

Introduction

FuseDream uses pre-trained GANs (we support BigGAN-256 and BigGAN-512 for now) and CLIP to achieve high-fidelity text-to-image generation.

Requirements

Use pip or conda to install the following packages: PyTorch==1.7.1, torchvision==0.8.2, and lpips==0.1.4, along with the requirements from BigGAN.
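Before running anything, it can help to confirm that the installed versions match the pins above. The following is a minimal sketch, not part of the repo: it assumes the packages are installed under the distribution names torch, torchvision and lpips, and the helper names check_requirements and mismatches are hypothetical.

```python
from importlib import metadata

# Version pins copied from the Requirements text. The distribution names
# ("torch", "torchvision", "lpips") are an assumption about how they are installed.
PINS = {"torch": "1.7.1", "torchvision": "0.8.2", "lpips": "0.1.4"}


def check_requirements(pins):
    """Return {name: (expected, installed)}; installed is None if missing."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (expected, installed)
    return report


def mismatches(report):
    """Return sorted names whose installed version is missing or differs."""
    # A local build tag such as "1.7.1+cu110" is treated as matching "1.7.1".
    return sorted(
        name
        for name, (expected, installed) in report.items()
        if installed is None or installed.split("+")[0] != expected
    )
```

Calling mismatches(check_requirements(PINS)) returns an empty list when all three pins are satisfied. The BigGAN requirements are not covered, since they are not listed here.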

Getting Started

We converted the pre-trained BigGAN weights from TFHub to PyTorch. To save time, you can download the converted checkpoints from:

https://drive.google.com/drive/folders/1nJ3HmgYgeA9NZr-oU-enqbYeO7zBaANs?usp=sharing

Put the checkpoints into ./BigGAN_utils/weights/
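A quick way to confirm the checkpoints landed in the right place is to list the files in that directory. This is a minimal sketch under the assumption that you run it from the repo root. The checkpoint file names are not stated here, so the hypothetical helper list_checkpoints reports every regular file it finds instead of looking for specific names.

```python
from pathlib import Path

# Directory named in the step above, relative to the repo root.
WEIGHTS_DIR = Path("./BigGAN_utils/weights")


def list_checkpoints(weights_dir=WEIGHTS_DIR):
    """Return sorted names of regular files in weights_dir, or [] if absent."""
    weights_dir = Path(weights_dir)
    if not weights_dir.is_dir():
        return []
    return sorted(p.name for p in weights_dir.iterdir() if p.is_file())
```

An empty list means the directory is missing or contains no files, in which case the download step should be repeated.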

Run the following command to generate images from a text query:

python fusedream_generator.py --text 'YOUR TEXT' --seed YOUR_SEED

For example, to get an image of a blue dog:

python fusedream_generator.py --text 'A photo of a blue dog.' --seed 1234

The generated image will be stored in ./samples
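To compare several seeds for the same prompt, the command above can be wrapped in a small loop. This is a sketch, not part of the repo: build_command and run_seeds are hypothetical helpers, and they use only the --text and --seed flags shown above.

```python
import subprocess
import sys


def build_command(text, seed, script="fusedream_generator.py"):
    """Build the argument list for one run, using only --text and --seed."""
    return [sys.executable, script, "--text", text, "--seed", str(seed)]


def run_seeds(text, seeds):
    """Run the generator once per seed; raise on the first failing run."""
    for seed in seeds:
        subprocess.run(build_command(text, seed), check=True)
```

From the repo root, run_seeds("A photo of a blue dog.", [1234, 1235, 1236]) would launch three runs in sequence. How output files in ./samples are named is not stated here, so check the directory after each run in case later runs overwrite earlier ones.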

Colab Notebook

For a quick test of FuseDream, we provide Colab notebooks for FuseDream (Single Image) and FuseDream-Composition (TODO). Have fun!

Citations

If you use the code, please cite:

@inproceedings{
brock2018large,
title={Large Scale {GAN} Training for High Fidelity Natural Image Synthesis},
author={Andrew Brock and Jeff Donahue and Karen Simonyan},
booktitle={International Conference on Learning Representations},
year={2019},
url={https://openreview.net/forum?id=B1xsqj09Fm},
}

and

@misc{
liu2021fusedream,
title={FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization}, 
author={Xingchao Liu and Chengyue Gong and Lemeng Wu and Shujian Zhang and Hao Su and Qiang Liu},
year={2021},
eprint={2112.01573},
archivePrefix={arXiv},
primaryClass={cs.CV}
}

Owner

XCL
