Official PyTorch implementation of BlobGAN: Spatially Disentangled Scene Representations

Related tags

Deep Learningblobgan
Overview

BlobGAN: Spatially Disentangled Scene Representations
Official PyTorch Implementation

Paper | Project Page | Video | Interactive Demo Open in Colab

BlobGAN.mp4

This repository contains:

  • 🚂 Pre-trained BlobGAN models on three datasets: bedrooms, conference rooms, and a combination of kitchens, living rooms, and dining rooms
  • 💻 Code based on PyTorch Lightning and Hydra 🐍 which fully supports CPU, single GPU, or multi GPU/node training and inference

We also provide an 📓 interactive demo notebook to help get started using our model. Download this notebook and run it on your own Python environment, or test it out on Colab. You can:

  • 🖌️ ️ Generate and edit realistic images with an interactive UI
  • 📹 Create animated videos showing off your edited scenes

And, coming soon!

  • 📸 Upload your own image and convert it into blobs!
  • 🧬 Programmatically modify images and reproduce results from our paper

Setup

Run the commands below one at a time to download the latest version of the BlobGAN code, create a Conda environment, and install necessary packages and utilities.

git clone https://github.com/dave-epstein/blobgan.git
mkdir -p blobgan/logs/wandb
conda create -n blobgan python=3.9
conda activate blobgan
conda install pytorch=1.11.0 torchvision=0.12.0 torchaudio cudatoolkit=11.3 -c pytorch
conda install cudatoolkit-dev=11.3 -c conda-forge
pip install tqdm==4.64.0 hydra-core==1.1.2 omegaconf==2.1.2 clean-fid==0.1.23 wandb==0.12.11 ipdb==0.13.9 lpips==0.1.4 einops==0.4.1 inputimeout==1.0.4 pytorch-lightning==1.5.10 matplotlib==3.5.2 mpl_interactions[jupyter]==0.21.0
wget -q --show-progress https://github.com/ninja-build/ninja/releases/download/v1.10.2/ninja-linux.zip
sudo unzip -q ninja-linux.zip -d /usr/local/bin/
sudo update-alternatives --install /usr/bin/ninja ninja /usr/local/bin/ninja 1 --force

Running pretrained models

See scripts/load_model.py for an example of how to load a pre-trained model (using the provided load_model function, which can be called from elsewhere) and generate images with it. You can also run the file from the command line to generate images and save them to disk. For example:

python scripts/load_model.py --model_name bed --dl_dir models --save_dir out --n_imgs 32 --save_blobs --label_blobs

See the command's help for more details and options: scripts/load_model.py --help

Training your own model

Before training your model, you'll need to modify src/configs/experiments/local.yaml to include your WandB information and machine-specific configuration (such as path to data -- dataset.path or dataset.basepath -- and number of GPUs trainer.gpus). To turn off logging entirely, pass logger=false, or to only log to disk but not write to server, pass wandb.offline=true. Our code currently only supports WandB logging.

Here's an example command which will train a model on LSUN bedrooms. We list the configuration modules to load for this experiment (blobgan, local, jitter) and then specify any other options as we desire. For example, if we wanted to train a model without jitter, we could just remove that module from the experiments array.

python src/run.py +experiment=[blobgan,local,jitter] wandb.name='10-blob BlobGAN on bedrooms'

In some shells, you may need to add extra quotes around some of these options to prevent them from being parsed immediately on the command line.

Train on the LSUN category of your choice by passing in dataset.category, e.g. dataset.category=church. Tackle multiple categories at once with dataset=multilsun and dataset.categories=[kitchen,bedroom].

You can also train on any collection of images by selecting dataset=imagefolder and passing in the path. The code expects at least a subfolder named train and optional subfolders named validate and test. The below command also illustrates how to set arbitrary options using Hydra syntax, such as turning off FID logging or changing dataloader batch size:

python src/run.py +experiment=[blobgan,local,jitter] wandb.name='20-blob BlobGAN on Places' dataset.dataloader.batch_size=24 +model.log_fid_every_epoch=false dataset=imagefolder +dataset.path=/path/to/places/ model.n_features=20

Other parameters of interest are likely trainer.log_every_n_steps and model.log_images_every_n_steps which control frequency of logging scalars and images, and checkpoint.every_n_train_steps and checkpoint.save_top_k which dictate checkpoint saving frequency and decide how many most recent checkpoints to keep (-1 means keep everything).

Citation

If our code or models aided your research, please cite our paper:

@misc{epstein2022blobgan,
      title={BlobGAN: Spatially Disentangled Scene Representations},
      author={Dave Epstein and Taesung Park and Richard Zhang and Eli Shechtman and Alexei A. Efros},
      year={2022},
      eprint={2205.02837},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}		

Code acknowledgments

This repository is built on top of rosinality's excellent PyTorch re-implementation of StyleGAN2 and Bill Peebles' GANgealing codebase.

Owner
PhD student at UC Berkeley
Python package to generate image embeddings with CLIP without PyTorch/TensorFlow

imgbeddings A Python package to generate embedding vectors from images, using OpenAI's robust CLIP model via Hugging Face transformers. These image em

Max Woolf 81 Jan 04, 2023
An University Project of Quera Web Crawling.

WebCrawlerProject An University Project of Quera Web Crawling. خزشگر اینستاگرام در این پروژه شما باید با استفاده از کتابخانه های زیر یک خزشگر اینستاگر

Mahdi 3 Aug 12, 2022
Code, Models and Datasets for OpenViDial Dataset

OpenViDial This repo contains downloading instructions for the OpenViDial dataset in 《OpenViDial: A Large-Scale, Open-Domain Dialogue Dataset with Vis

119 Dec 08, 2022
A hifiasm fork for metagenome assembly using Hifi reads.

hifiasm_meta - de novo metagenome assembler, based on hifiasm, a haplotype-resolved de novo assembler for PacBio Hifi reads.

44 Jul 10, 2022
TensorFlow (Python API) implementation of Neural Style

neural-style-tf This is a TensorFlow implementation of several techniques described in the papers: Image Style Transfer Using Convolutional Neural Net

Cameron 3.1k Jan 02, 2023
Forecasting directional movements of stock prices for intraday trading using LSTM and random forest

Forecasting directional movements of stock-prices for intraday trading using LSTM and random-forest https://arxiv.org/abs/2004.10178 Pushpendu Ghosh,

Pushpendu Ghosh 270 Dec 24, 2022
Visual Tracking by TridenAlign and Context Embedding

Visual Tracking by TridentAlign and Context Embedding (TACT) Test code for "Visual Tracking by TridentAlign and Context Embedding" Janghoon Choi, Juns

Janghoon Choi 32 Aug 25, 2021
Ready-to-use code and tutorial notebooks to boost your way into few-shot image classification.

Easy Few-Shot Learning Ready-to-use code and tutorial notebooks to boost your way into few-shot image classification. This repository is made for you

Sicara 399 Jan 08, 2023
Python PID Tuner - Makes a model of the System from a Process Reaction Curve and calculates PID Gains

PythonPID_Tuner_SOPDT Step 1: Takes a Process Reaction Curve in csv format - assumes data at 100ms interval (column names CV and PV) Step 2: Makes a r

1 Jan 18, 2022
Demonstrational Session git repo for H SAF User Workshop (28/1)

5th H SAF User Workshop The 5th H SAF User Workshop supported by EUMeTrain will be held in online in January 24-28 2022. This repository contains inst

H SAF 4 Aug 04, 2022
This repository contains the code for "SBEVNet: End-to-End Deep Stereo Layout Estimation" paper by Divam Gupta, Wei Pu, Trenton Tabor, Jeff Schneider

SBEVNet: End-to-End Deep Stereo Layout Estimation This repository contains the code for "SBEVNet: End-to-End Deep Stereo Layout Estimation" paper by D

Divam Gupta 19 Dec 17, 2022
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers Authors: Jaemin Cho, Abhay Zala, and Mohit Bansal (

Jaemin Cho 98 Dec 15, 2022
Probabilistic Gradient Boosting Machines

PGBM Probabilistic Gradient Boosting Machines (PGBM) is a probabilistic gradient boosting framework in Python based on PyTorch/Numba, developed by Air

Olivier Sprangers 112 Dec 28, 2022
IAUnet: Global Context-Aware Feature Learning for Person Re-Identification

IAUnet This repository contains the code for the paper: IAUnet: Global Context-Aware Feature Learning for Person Re-Identification Ruibing Hou, Bingpe

30 Jul 14, 2022
Code for paper "Extract, Denoise and Enforce: Evaluating and Improving Concept Preservation for Text-to-Text Generation" EMNLP 2021

The repo provides the code for paper "Extract, Denoise and Enforce: Evaluating and Improving Concept Preservation for Text-to-Text Generation" EMNLP 2

Yuning Mao 18 May 24, 2022
Code accompanying our NeurIPS 2021 traffic4cast challenge

Traffic forecasting on traffic movie snippets This repo contains all code to reproduce our approach to the IARAI Traffic4cast 2021 challenge. In the c

Nina Wiedemann 2 Aug 09, 2022
Social Fabric: Tubelet Compositions for Video Relation Detection

Social-Fabric Social Fabric: Tubelet Compositions for Video Relation Detection This repository contains the code and results for the following paper:

Shuo Chen 7 Aug 09, 2022
[ICML 2020] Prediction-Guided Multi-Objective Reinforcement Learning for Continuous Robot Control

PG-MORL This repository contains the implementation for the paper Prediction-Guided Multi-Objective Reinforcement Learning for Continuous Robot Contro

MIT Graphics Group 65 Jan 07, 2023
A script helps the user to update Linux and Mac systems through the terminal

Description This script helps the user to update Linux and Mac systems through the terminal. All the user has to install some requirements and then ru

Roxcoder 2 Jan 23, 2022
CSPML (crystal structure prediction with machine learning-based element substitution)

CSPML (crystal structure prediction with machine learning-based element substitution) CSPML is a unique methodology for the crystal structure predicti

8 Dec 20, 2022