[NeurIPS 2021 Spotlight] Code for Learning to Compose Visual Relations

Overview

Learning to Compose Visual Relations

This is the pytorch codebase for the NeurIPS 2021 Spotlight paper Learning to Compose Visual Relations.

Demo

Image Generation Demo

Please use the following command to generate images on the CLEVR dataset. Please use --num_rels to control the input relational descriptions.

python demo.py --checkpoint_folder ./checkpoint --model_name clevr --output_folder ./ --dataset clevr \
--resume_iter best --batch_size 25 --num_steps 80 --num_rels 1 --data_folder ./data --mode generation
GIF Final Generated Image

Image Editing Demo

Please use the following command to edit images on the CLEVR dataset. Please use --num_rels to control the input relational descriptions.

python demo.py --checkpoint_folder ./checkpoint --model_name clevr --output_folder ./ --dataset clevr \
--resume_iter best --batch_size 25 --num_steps 80 --num_rels 1 --data_folder ./data --mode editing
Input Image GIF Final Edited Image

Training

Data Preparation

Please utilize the following data link to download the CLEVR data utilized in our experiments. Then place all data files under ./data folder. Downloads for additional datasets and precomputed feature files will be posted soon. Feel free to raise an issue if there is a particular dataset you would like to download.

Model Training

To train your own model, please run following command. Please use --dataset to train your model on different datasets, e.g. --dataset clevr.

python -u train.py --cond --dataset=${dataset} --exp=${dataset} --batch_size=10 --step_lr=300 \
--num_steps=60 --kl --gpus=1 --nodes=1 --filter_dim=128 --im_size=128 --self_attn \
--multiscale --norm --spec_norm --slurm --lr=1e-4 --cuda --replay_batch \
--numpy_data_path ./data/clevr_training_data.npz

Evaluation

To evaluate our model, you can use your own trained models or download the pre-trained models model_best.pth from ${dataset}_model folder from link and put it under the project folder ./checkpoints/${dataset}. Only clevr_model is currently available. More pretrained-models will be posted soon.

Evaluate Image Generation Results Using the Pretrained Classifiers

Please use the following command to generate images on the test set first. Please use --dataset and --num_rels to control the dataset and the number of input relational descriptions. Note that 1 <= num_rels <= 3.

python inference.py --checkpoint_folder ./checkpoints --model_name ${dataset} \
--output_folder ./${dataset}_gen_images --dataset ${dataset} --resume_iter best \
--batch_size 32 --num_steps 80 --num_rels ${num_rels} --data_folder ./data --mode generation

In order to evaluate the binary classification scores of the generated images, you can train one binary classifier or download a pretrained one from link under the binary_classifier folder.

To train your own binary classifier, please use following command:

python train_classifier.py --train --spec_norm --norm \
--dataset ${dataset} --lr 3e-4 --checkpoint_dir ./binary_classifier

Please use following command to evaluate on generated images conditioned on selected number of relations. Please use --num_rels to specify the number of relations.

python classification_scores.py --dataset ${dataset} --checkpoint_dir ./binary_classifier/ \
--data_folder ./data --generated_img_folder ./${dataset}_gen_images/num_rels_${num_rels} \
--mode generation --num_rels ${num_rels}

Evaluate Image Editing Results Using the Pretrained Classifiers

Please use the following command to edit images on the test set first. Please use --dataset and --num_rels to select the dataset and the number of input relational descriptions.

python inference.py --checkpoint_folder ./checkpoints --model_name ${dataset} \
--output_folder ./${dataset}_edit_images --dataset ${dataset} --resume_iter best \
--batch_size 32 --num_steps 80 --num_rels 1 --data_folder ./data --mode editing

To evaluate classification scores of image editing results, please change the --mode to editing.

python classification_scores.py --dataset ${dataset} --checkpoint_dir ./binary_classifier/ \
--data_folder ./data --generated_img_folder ./${dataset}_edit_images/num_rels_${num_rels} \
--mode editing --num_rels ${num_rels}

Acknowledgements

The code for training EBMs is from https://github.com/yilundu/improved_contrastive_divergence.


Citation

Please consider citing our papers if you use this code in your research:

@article{liu2021learning,
  title={Learning to Compose Visual Relations},
  author={Liu, Nan and Li, Shuang and Du, Yilun and Tenenbaum, Josh and Torralba, Antonio},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}
Owner
Nan Liu
MS CS @uiuc; BS CS @umich
Nan Liu
[SIGGRAPH 2020] Attribute2Font: Creating Fonts You Want From Attributes

Attr2Font Introduction This is the official PyTorch implementation of the Attribute2Font: Creating Fonts You Want From Attributes. Paper: arXiv | Rese

Yue Gao 200 Dec 15, 2022
CMT: Convolutional Neural Networks Meet Vision Transformers

CMT: Convolutional Neural Networks Meet Vision Transformers [arxiv] 1. Introduction This repo is the CMT model which impelement with pytorch, no refer

FlyEgle 83 Dec 30, 2022
A python library for face detection and features extraction based on mediapipe library

FaceAnalyzer A python library for face detection and features extraction based on mediapipe library Introduction FaceAnalyzer is a library based on me

Saifeddine ALOUI 14 Dec 30, 2022
Simple implementation of Mobile-Former on Pytorch

Simple-implementation-of-Mobile-Former At present, only the model but no trained. There may be some bug in the code, and some details may be different

Acheung 103 Dec 31, 2022
Compute execution plan: A DAG representation of work that you want to get done. Individual nodes of the DAG could be simple python or shell tasks or complex deeply nested parallel branches or embedded DAGs themselves.

Hello from magnus Magnus provides four capabilities for data teams: Compute execution plan: A DAG representation of work that you want to get done. In

12 Feb 08, 2022
Rot-Pro: Modeling Transitivity by Projection in Knowledge Graph Embedding

Rot-Pro : Modeling Transitivity by Projection in Knowledge Graph Embedding This repository contains the source code for the Rot-Pro model, presented a

Tewi 9 Sep 28, 2022
Exposure Time Calculator (ETC) and radial velocity precision estimator for the Near InfraRed Planet Searcher (NIRPS) spectrograph

NIRPS-ETC Exposure Time Calculator (ETC) and radial velocity precision estimator for the Near InfraRed Planet Searcher (NIRPS) spectrograph February 2

Nolan Grieves 2 Sep 15, 2022
2021 National Underwater Robotics Vision Optics

2021-National-Underwater-Robotics-Vision-Optics 2021年全国水下机器人算法大赛-光学赛道-B榜精度第18名 (Kilian_Di的团队:A榜[email pro

Di Chang 9 Nov 04, 2022
Layer 7 DDoS Panel with Cloudflare Bypass ( UAM, CAPTCHA, BFM, etc.. )

Blood Deluxe DDoS DDoS Attack Panel includes CloudFlare Bypass (UAM, CAPTCHA, BFM, etc..)(It works intermittently. Working on it) Don't attack any web

272 Nov 01, 2022
The repository offers the official implementation of our BMVC 2021 paper in PyTorch.

CrossMLP Cascaded Cross MLP-Mixer GANs for Cross-View Image Translation Bin Ren1, Hao Tang2, Nicu Sebe1. 1University of Trento, Italy, 2ETH, Switzerla

Bingoren 16 Jul 27, 2022
Greedy Gaussian Segmentation

GGS Greedy Gaussian Segmentation (GGS) is a Python solver for efficiently segmenting multivariate time series data. For implementation details, please

Stanford University Convex Optimization Group 72 Dec 07, 2022
Pytorch implementation of ICASSP 2022 paper Attention Probe: Vision Transformer Distillation in the Wild

Attention Probe: Vision Transformer Distillation in the Wild Jiahao Wang, Mingdeng Cao, Shuwei Shi, Baoyuan Wu, Yujiu Yang In ICASSP 2022 This code is

IIGROUP 6 Sep 21, 2022
The PyTorch implementation of Directed Graph Contrastive Learning (DiGCL), NeurIPS-2021

Directed Graph Contrastive Learning Paper | Poster | Supplementary The PyTorch implementation of Directed Graph Contrastive Learning (DiGCL). In this

Tong Zekun 28 Jan 08, 2023
Gesture Volume Control Using OpenCV and MediaPipe

This Project Uses OpenCV and MediaPipe Hand solutions to identify hands and Change system volume by taking thumb and index finger positions

Pratham Bhatnagar 6 Sep 12, 2022
[ICCV 2021] A Simple Baseline for Semi-supervised Semantic Segmentation with Strong Data Augmentation

[ICCV 2021] A Simple Baseline for Semi-supervised Semantic Segmentation with Strong Data Augmentation

CodingMan 45 Dec 12, 2022
Pytorch implementation of various High Dynamic Range (HDR) Imaging algorithms

Deep High Dynamic Range Imaging Benchmark This repository is the pytorch impleme

Tianhong Dai 5 Nov 16, 2022
For encoding a text longer than 512 tokens, for example 800. Set max_pos to 800 during both preprocessing and training.

LongScientificFormer For encoding a text longer than 512 tokens, for example 800. Set max_pos to 800 during both preprocessing and training. Some code

Athar Sefid 6 Nov 02, 2022
Code for Estimating Multi-cause Treatment Effects via Single-cause Perturbation (NeurIPS 2021)

Estimating Multi-cause Treatment Effects via Single-cause Perturbation (NeurIPS 2021) Single-cause Perturbation (SCP) is a framework to estimate the m

Zhaozhi Qian 9 Sep 28, 2022
[ICCV '21] In this repository you find the code to our paper Keypoint Communities

Keypoint Communities In this repository you will find the code to our ICCV '21 paper: Keypoint Communities Duncan Zauss, Sven Kreiss, Alexandre Alahi,

Duncan Zauss 262 Dec 13, 2022
CLADE - Efficient Semantic Image Synthesis via Class-Adaptive Normalization (TPAMI 2021)

Efficient Semantic Image Synthesis via Class-Adaptive Normalization (Accepted by TPAMI)

tzt 49 Nov 17, 2022