PyTorch implementation of Stochastic Multi-Label Image-to-image Translation (SMIT)

Last update: Mar 01, 2022

Overview

SMIT: Stochastic Multi-Label Image-to-image Translation

This repository provides a PyTorch implementation of SMIT. SMIT stochastically translates an input image to multiple domains using a single generator and a single discriminator. It needs only a target domain, given as a binary vector (e.g., [0,1,0,1,1] for 5 different domains), and random Gaussian noise.
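As a rough illustration of those two inputs, the sketch below builds a binary target vector and draws a Gaussian noise vector using only the Python standard library. The attribute names, the `style_dim` value, and both helper functions are hypothetical and not taken from the repository, which presumably passes these inputs as PyTorch tensors.

```python
import random

# Hypothetical attribute names; the real list depends on the dataset config.
ATTRIBUTES = ["smile", "age", "gender", "sunglasses", "bangs"]


def make_target(active, attributes=ATTRIBUTES):
    """Return a binary vector with 1 for every attribute named in `active`."""
    unknown = set(active) - set(attributes)
    if unknown:
        raise ValueError(f"unknown attributes: {sorted(unknown)}")
    return [1 if name in active else 0 for name in attributes]


def sample_noise(style_dim, rng=random):
    """Draw `style_dim` independent samples from a standard Gaussian N(0, 1)."""
    return [rng.gauss(0.0, 1.0) for _ in range(style_dim)]


target = make_target({"age", "sunglasses", "bangs"})
noise = sample_noise(style_dim=8)  # style_dim=8 is an arbitrary choice
print(target)  # [0, 1, 0, 1, 1]
```

Sampling a new noise vector for the same target is what would make the translation stochastic: one label, many plausible outputs.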

Paper

SMIT: Stochastic Multi-Label Image-to-image Translation
Andrés Romero¹, Pablo Arbelaez¹, Luc Van Gool², Radu Timofte²
¹Biomedical Computer Vision (BCV) Lab, Universidad de Los Andes.
²Computer Vision Lab (CVL), ETH Zürich.

Citation

@article{romero2019smit,
  title={SMIT: Stochastic Multi-Label Image-to-Image Translation},
  author={Romero, Andr{\'e}s and Arbel{\'a}ez, Pablo and Van Gool, Luc and Timofte, Radu},
  journal={ICCV Workshops},
  year={2019}
}

Dependencies

Python (2.7, 3.5+)
PyTorch (0.3, 0.4, 1.0)

Usage

Cloning the repository

$ git clone https://github.com/BCV-Uniandes/SMIT.git
$ cd SMIT

Downloading the dataset

To download the CelebA dataset:

$ bash generate_data/download.sh

Train command:

./main.py --GPU=$gpu_id --dataset_fake=CelebA

Each dataset must have datasets/ .py and datasets/ .yaml files. All models and figures are stored at snapshot/models/$dataset_fake/ _ .pth and snapshot/samples/$dataset_fake/ _ .jpg, respectively.

Test command:

./main.py --GPU=$gpu_id --dataset_fake=CelebA --mode=test

SMIT expects the .pth weights to be stored in snapshot/models/$dataset_fake/ (otherwise, --pretrained_model=location/model.pth should be provided). If there are several models, it loads the one that comes last in alphabetical order.
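The selection rule described above can be sketched as follows. This is a minimal illustration, not the repository's code: `pick_checkpoint` is a hypothetical helper, and letting an explicit path take precedence over the directory search is an assumption.

```python
import glob
import os


def pick_checkpoint(dataset_fake, root="snapshot/models", pretrained_model=None):
    """Return the checkpoint path to load.

    Assumption: an explicit `pretrained_model` wins; otherwise the .pth file
    that sorts last alphabetically under root/dataset_fake/ is chosen.
    """
    if pretrained_model is not None:
        return pretrained_model
    folder = os.path.join(root, dataset_fake)
    candidates = sorted(glob.glob(os.path.join(folder, "*.pth")))
    if not candidates:
        raise FileNotFoundError(f"no .pth files under {folder}")
    return candidates[-1]
```

Note that plain alphabetical sorting places a name containing "10" before one containing "9", so zero-padded names are the safer choice if file names carry epoch or iteration counters.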

Demo:

./main.py --GPU=$gpu_id --dataset_fake=CelebA --mode=test --DEMO_PATH=location/image_jpg/or/location/dir

The demo performs one transformation per attribute, that is, it swaps each attribute in turn with respect to the original input, as in the qualitative results below. If DEMO_PATH is a single image, --DEMO_LABEL can be passed to specify its real attributes; if it is not provided, the discriminator acts as a classifier for the real attributes.
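The per-attribute swap can be pictured with a short sketch. It assumes the real attributes are available as a binary list; `swap_each_attribute` is a hypothetical helper and not a function from the repository.

```python
def swap_each_attribute(real_label):
    """Return one target vector per attribute, each with a single bit flipped."""
    targets = []
    for index, value in enumerate(real_label):
        if value not in (0, 1):
            raise ValueError("labels must be binary (0 or 1)")
        target = list(real_label)  # copy so the input is left untouched
        target[index] = 1 - value  # swap only this attribute
        targets.append(target)
    return targets


real = [0, 1, 0, 1, 1]
for target in swap_each_attribute(real):
    print(target)
```

Each resulting vector differs from the real label in exactly one position, so every generated image would isolate the effect of a single attribute.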

Pretrained models

Models trained using PyTorch 1.0.

Multi-GPU

For multiple GPUs we use Horovod. Example for training with 4 GPUs:

mpirun -n 4 ./main.py --dataset_fake=CelebA

Qualitative Results. Multi-Domain Continuous Interpolation.

First column (original input) -> last column (opposite attributes: smile, age, gender, sunglasses, bangs, hair color). Top: continuous interpolation for the fake image. Bottom: continuous interpolation for the attention mechanism.
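One way to picture the label sequence behind such a figure is a linear blend between the original label vector and its opposite. The sketch below assumes linear interpolation in label space, which the text does not confirm; `interpolate_labels` and the number of steps are hypothetical.

```python
def interpolate_labels(start, end, steps):
    """Linearly interpolate between two label vectors, endpoints included."""
    if len(start) != len(end):
        raise ValueError("label vectors must have the same length")
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")
    frames = []
    for k in range(steps):
        alpha = k / (steps - 1)  # 0.0 at the first column, 1.0 at the last
        frames.append([(1 - alpha) * s + alpha * e for s, e in zip(start, end)])
    return frames


original = [0, 1, 0, 1, 1]
opposite = [1 - v for v in original]  # every attribute swapped
columns = interpolate_labels(original, opposite, steps=5)
```

Under this assumption, each intermediate vector would be fed to the generator in place of a binary target, giving one image per column.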

Qualitative Results. Random sampling.

Datasets shown: CelebA, EmotionNet, RafD, Edges2Shoes, Edges2Handbags, Yosemite, Painters.

Qualitative Results. Style Interpolation between first and last row.

Datasets shown: CelebA, EmotionNet, RafD, Edges2Shoes, Edges2Handbags, Yosemite, Painters.

Qualitative Results. Label continuous inference between first and last row.

Datasets shown: CelebA, EmotionNet.

Owner

Biomedical Computer Vision Group @ Uniandes
