Invertible conditional GANs for image editing

Related tags

Deep LearningIcGAN
Overview

Invertible Conditional GANs

A real image is encoded into a latent representation z and conditional information y, and then decoded into a new image. We fix z for every row, and modify y for each column to obtain variations in real samples.

This is the implementation of the IcGAN model proposed in our paper:

Invertible Conditional GANs for image editing. November 2016.

This paper is a summarized and updated version of my master thesis, which you can find here:

Master thesis: Invertible Conditional Generative Adversarial Networks. September 2016.

The baseline used is the Torch implementation of the DCGAN by Radford et al.

  1. Training the model
    1. Face dataset: CelebA
    2. Digit dataset: MNIST
  2. Visualize the results
    1. Reconstruct and modify real images
    2. Swap attributes
    3. Interpolate between faces

Requisites

Please refer to DCGAN torch repository to know the requirements and dependencies to run the code. Additionally, you will need to install the threads and optnet package:

luarocks install threads

luarocks install optnet

In order to interactively display the results, follow these steps.

1. Training the model

Model overview

The IcGAN is trained in four steps.

  1. Train the generator.
  2. Create a dataset of generated images with the generator.
  3. Train the encoder Z to map an image x to a latent representation z with the dataset generated images.
  4. Train the encoder Y to map an image x to a conditional information vector y with the dataset of real images.

All the parameters of the training phase are located in cfg/mainConfig.lua.

There is already a pre-trained model for CelebA available in case you want to skip the training part. Here you can find instructions on how to use it.

1.1 Train with a face dataset: CelebA

Note: for speed purposes, the whole dataset will be loaded into RAM during training time, which requires about 10 GB of RAM. Therefore, 12 GB of RAM is a minimum requirement. Also, the dataset will be stored as a tensor to load it faster, make sure that you have around 25 GB of free space.

Preprocess

mkdir celebA; cd celebA

Download img_align_celeba.zip here under the link "Align&Cropped Images". Also, you will need to download list_attr_celeba.txt from the same link, which is found under Anno folder.

unzip img_align_celeba.zip; cd ..
DATA_ROOT=celebA th data/preprocess_celebA.lua

Now move list_attr_celeba.txt to celebA folder.

mv list_attr_celeba.txt celebA

Training

  • Conditional GAN: parameters are already configured to run CelebA (dataset=celebA, dataRoot=celebA).

     th trainGAN.lua
  • Generate encoder dataset:

     net=[GENERATOR_PATH] outputFolder=celebA/genDataset/ samples=182638 th data/generateEncoderDataset.lua

    (GENERATOR_PATH example: checkpoints/celebA_25_net_G.t7)

  • Train encoder Z:

     datasetPath=celebA/genDataset/ type=Z th trainEncoder.lua
    
  • Train encoder Y:

     datasetPath=celebA/ type=Y th trainEncoder.lua
    

1.2 Train with a digit dataset: MNIST

Preprocess

Download MNIST as a luarocks package: luarocks install mnist

Training

  • Conditional GAN:

     name=mnist dataset=mnist dataRoot=mnist th trainGAN.lua
  • Generate encoder dataset:

     net=[GENERATOR_PATH] outputFolder=mnist/genDataset/ samples=60000 th data/generateEncoderDataset.lua

    (GENERATOR_PATH example: checkpoints/mnist_25_net_G.t7)

  • Train encoder Z:

     datasetPath=mnist/genDataset/ type=Z th trainEncoder.lua
    
  • Train encoder Y:

     datasetPath=mnist type=Y th trainEncoder.lua
    

2 Pre-trained CelebA model:

CelebA model is available for download here. The file includes the generator and both encoders (encoder Z and encoder Y).

3. Visualize the results

For visualizing the results you will need an already trained IcGAN (i.e. a generator and two encoders). The parameters for generating results are in cfg/generateConfig.lua.

3.1 Reconstruct and modify real images

Reconstrucion example

decNet=celeba_24_G.t7 encZnet=celeba_encZ_7.t7 encYnet=celeba_encY_5.t7 loadPath=[PATH_TO_REAL_IMAGES] th generation/reconstructWithVariations.lua

3.2 Swap attributes

Swap attributes

Swap the attribute information between two pairs of faces.

decNet=celeba_24_G.t7 encZnet=celeba_encZ_7.t7 encYnet=celeba_encY_5.t7 im1Path=[IM1] im2Path=[IM2] th generation/attributeTransfer.lua

3.3 Interpolate between faces

Interpolation

decNet=celeba_24_G.t7 encZnet=celeba_encZ_7.t7 encYnet=celeba_encY_5.t7 im1Path=[IM1] im2Path=[IM2] th generation/interpolate.lua

Do you like or use our work? Please cite us as

@inproceedings{Perarnau2016,
  author    = {Guim Perarnau and
               Joost van de Weijer and
               Bogdan Raducanu and
               Jose M. \'Alvarez},
  title     = {{Invertible Conditional GANs for image editing}},
  booktitle   = {NIPS Workshop on Adversarial Training},
  year      = {2016},
}
Owner
Guim
Guim
Weakly Supervised Text-to-SQL Parsing through Question Decomposition

Weakly Supervised Text-to-SQL Parsing through Question Decomposition The official repository for the paper "Weakly Supervised Text-to-SQL Parsing thro

14 Dec 19, 2022
Official PyTorch implementation for paper Context Matters: Graph-based Self-supervised Representation Learning for Medical Images

Context Matters: Graph-based Self-supervised Representation Learning for Medical Images Official PyTorch implementation for paper Context Matters: Gra

49 Nov 23, 2022
Training data extraction on GPT-2

Training data extraction from GPT-2 This repository contains code for extracting training data from GPT-2, following the approach outlined in the foll

Florian Tramer 62 Dec 07, 2022
A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch

This repository holds NVIDIA-maintained utilities to streamline mixed precision and distributed training in Pytorch. Some of the code here will be included in upstream Pytorch eventually. The intenti

NVIDIA Corporation 6.9k Jan 03, 2023
Analyses of the individual electric field magnitudes with Roast.

Aloi Davide - PhD Student (UoB) Analysis of electric field magnitudes (wp2a dataset only at the moment) and correlation analysis with Dynamic Causal M

Davide Aloi 7 Dec 15, 2022
Text Extraction Formulation + Feedback Loop for state-of-the-art WSD (EMNLP 2021)

ConSeC is a novel approach to Word Sense Disambiguation (WSD), accepted at EMNLP 2021. It frames WSD as a text extraction task and features a feedback loop strategy that allows the disambiguation of

Sapienza NLP group 36 Dec 13, 2022
Sync2Gen Code for ICCV 2021 paper: Scene Synthesis via Uncertainty-Driven Attribute Synchronization

Sync2Gen Code for ICCV 2021 paper: Scene Synthesis via Uncertainty-Driven Attribute Synchronization 0. Environment Environment: python 3.6 and cuda 10

Haitao Yang 62 Dec 30, 2022
Code for CVPR 2021 paper: Anchor-Free Person Search

Introduction This is the implementationn for Anchor-Free Person Search in CVPR2021 License This project is released under the Apache 2.0 license. Inst

158 Jan 04, 2023
PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

PaddlePaddle Vision Transformers State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 🤖 PaddlePaddle Visual Transformers (PaddleViT or

1k Dec 28, 2022
Project repo for Learning Category-Specific Mesh Reconstruction from Image Collections

Learning Category-Specific Mesh Reconstruction from Image Collections Angjoo Kanazawa*, Shubham Tulsiani*, Alexei A. Efros, Jitendra Malik University

438 Dec 22, 2022
Improving Transferability of Representations via Augmentation-Aware Self-Supervision

Improving Transferability of Representations via Augmentation-Aware Self-Supervision Accepted to NeurIPS 2021 TL;DR: Learning augmentation-aware infor

hankook 38 Sep 16, 2022
PyTorch implementation for our NeurIPS 2021 Spotlight paper "Long Short-Term Transformer for Online Action Detection".

Long Short-Term Transformer for Online Action Detection Introduction This is a PyTorch implementation for our NeurIPS 2021 Spotlight paper "Long Short

77 Dec 16, 2022
LLVIP: A Visible-infrared Paired Dataset for Low-light Vision

LLVIP: A Visible-infrared Paired Dataset for Low-light Vision Project | Arxiv | Abstract It is very challenging for various visual tasks such as image

CVSM Group - email: <a href=[email protected]"> 377 Jan 07, 2023
Python scripts form performing stereo depth estimation using the high res stereo model in PyTorch .

PyTorch-High-Res-Stereo-Depth-Estimation Python scripts form performing stereo depth estimation using the high res stereo model in PyTorch. Stereo dep

Ibai Gorordo 26 Nov 24, 2022
[ICLR 2022] DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR

DAB-DETR This is the official pytorch implementation of our ICLR 2022 paper DAB-DETR. Authors: Shilong Liu, Feng Li, Hao Zhang, Xiao Yang, Xianbiao Qi

336 Dec 25, 2022
FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection

FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection arXi

59 Nov 29, 2022
Differentiable architecture search for convolutional and recurrent networks

Differentiable Architecture Search Code accompanying the paper DARTS: Differentiable Architecture Search Hanxiao Liu, Karen Simonyan, Yiming Yang. arX

Hanxiao Liu 3.7k Jan 09, 2023
EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction

EquiBind: geometric deep learning for fast predictions of the 3D structure in which a small molecule binds to a protein

Hannes Stärk 355 Jan 03, 2023
Python program that works as a contact list

Lista de Contatos Programa em Python que funciona como uma lista de contatos. Features Adicionar novo contato Remover contato Atualizar contato Pesqui

Victor B. Lino 3 Dec 16, 2021
Rafael Project- Classifying rockets to different types using data science algorithms.

Rocket-Classify Rafael Project- Classifying rockets to different types using data science algorithms. In this project we received data base with data

Hadassah Engel 5 Sep 18, 2021