StyleMapGAN - Official PyTorch Implementation

Overview

StyleMapGAN - Official PyTorch Implementation

StyleMapGAN: Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing
Hyunsu Kim, Yunjey Choi, Junho Kim, Sungjoo Yoo, Youngjung Uh
In CVPR 2021.

Paper: https://arxiv.org/abs/2104.14754
Video: https://youtu.be/qCapNyRA_Ng

Abstract: Generative adversarial networks (GANs) synthesize realistic images from random latent vectors. Although manipulating the latent vectors controls the synthesized outputs, editing real images with GANs suffers from i) time-consuming optimization for projecting real images to the latent vectors, ii) or inaccurate embedding through an encoder. We propose StyleMapGAN: the intermediate latent space has spatial dimensions, and a spatially variant modulation replaces AdaIN. It makes the embedding through an encoder more accurate than existing optimization-based methods while maintaining the properties of GANs. Experimental results demonstrate that our method significantly outperforms state-of-the-art models in various image manipulation tasks such as local editing and image interpolation. Last but not least, conventional editing methods on GANs are still valid on our StyleMapGAN. Source code is available at https://github.com/naver-ai/StyleMapGAN.

Demo

Youtube video Click the figure to watch the teaser video.

Interactive demo app Run demo in your local machine.

All test images are from CelebA-HQ, AFHQ, and LSUN.

python demo.py --ckpt expr/checkpoints/celeba_hq_256_8x8.pt --dataset celeba_hq

Installation

ubuntu gcc 7.4.0 CUDA CUDA-driver cudnn7 conda Python 3.6.12 pytorch 1.4.0

Clone this repository:

git clone https://github.com/naver-ai/StyleMapGAN.git
cd StyleMapGAN/

Install the dependencies:

conda create -y -n stylemapgan python=3.6.12
conda activate stylemapgan
./install.sh

Datasets and pre-trained networks

We provide a script to download datasets used in StyleMapGAN and the corresponding pre-trained networks. The datasets and network checkpoints will be downloaded and stored in the data and expr/checkpoints directories, respectively.

CelebA-HQ. To download the CelebA-HQ dataset and parse it, run the following commands:

# Download raw images and create LMDB datasets using them
# Additional files are also downloaded for local editing
bash download.sh create-lmdb-dataset celeba_hq

# Download the pretrained network (256x256)
bash download.sh download-pretrained-network-256 celeba_hq

# Download the pretrained network (1024x1024 image / 16x16 stylemap / Light version of Generator)
bash download.sh download-pretrained-network-1024 ffhq_16x16

AFHQ. For AFHQ, change above commands from 'celeba_hq' to 'afhq'.

Train network

Implemented using DistributedDataParallel.

# CelebA-HQ
python train.py --dataset celeba_hq --train_lmdb data/celeba_hq/LMDB_train --val_lmdb data/celeba_hq/LMDB_val

# AFHQ
python train.py --dataset afhq --train_lmdb data/afhq/LMDB_train --val_lmdb data/afhq/LMDB_val

# CelebA-HQ / 1024x1024 image / 16x16 stylemap / Light version of Generator
python train.py --size 1024 --latent_spatial_size 16 --small_generator --dataset celeba_hq --train_lmdb data/celeba_hq/LMDB_train --val_lmdb data/celeba_hq/LMDB_val 

Generate images

Reconstruction Results are saved to expr/reconstruction.

# CelebA-HQ
python generate.py --ckpt expr/checkpoints/celeba_hq_256_8x8.pt --mixing_type reconstruction --test_lmdb data/celeba_hq/LMDB_test

# AFHQ
python generate.py --ckpt expr/checkpoints/afhq_256_8x8.pt --mixing_type reconstruction --test_lmdb data/afhq/LMDB_test

W interpolation Results are saved to expr/w_interpolation.

# CelebA-HQ
python generate.py --ckpt expr/checkpoints/celeba_hq_256_8x8.pt --mixing_type w_interpolation --test_lmdb data/celeba_hq/LMDB_test

# AFHQ
python generate.py --ckpt expr/checkpoints/afhq_256_8x8.pt --mixing_type w_interpolation --test_lmdb data/afhq/LMDB_test

Local editing Results are saved to expr/local_editing. We pair images using a target semantic mask similarity. If you want to see details, please follow preprocessor/README.md.

# Using GroundTruth(GT) segmentation masks for CelebA-HQ dataset.
python generate.py --ckpt expr/checkpoints/celeba_hq_256_8x8.pt --mixing_type local_editing --test_lmdb data/celeba_hq/LMDB_test --local_editing_part nose

# Using half-and-half masks for AFHQ dataset.
python generate.py --ckpt expr/checkpoints/afhq_256_8x8.pt --mixing_type local_editing --test_lmdb data/afhq/LMDB_test

Unaligned transplantation Results are saved to expr/transplantation. It shows local transplantations examples of AFHQ. We recommend the demo code instead of this.

python generate.py --ckpt expr/checkpoints/afhq_256_8x8.pt --mixing_type transplantation --test_lmdb data/afhq/LMDB_test

Random Generation Results are saved to expr/random_generation. It shows random generation examples.

python generate.py --mixing_type random_generation --ckpt expr/checkpoints/celeba_hq_256_8x8.pt

Style Mixing Results are saved to expr/stylemixing. It shows style mixing examples.

python generate.py --mixing_type stylemixing --ckpt expr/checkpoints/celeba_hq_256_8x8.pt --test_lmdb data/celeba_hq/LMDB_test

Semantic Manipulation Results are saved to expr/semantic_manipulation. It shows local semantic manipulation examples.

python semantic_manipulation.py --ckpt expr/checkpoints/celeba_hq_256_8x8.pt --LMDB data/celeba_hq/LMDB --svm_train_iter 10000

Metrics

  • Reconstruction: LPIPS, MSE
  • W interpolation: FIDlerp
  • Generation: FID
  • Local editing: MSEsrc, MSEref, Detectability (Refer to CNNDetection)

If you want to see details, please follow metrics/README.md.

License

The source code, pre-trained models, and dataset are available under Creative Commons BY-NC 4.0 license by NAVER Corporation. You can use, copy, tranform and build upon the material for non-commercial purposes as long as you give appropriate credit by citing our paper, and indicate if changes were made.

For business inquiries, please contact [email protected].
For technical and other inquires, please contact [email protected].

Citation

If you find this work useful for your research, please cite our paper:

@inproceedings{kim2021stylemapgan,
  title={Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing},
  author={Kim, Hyunsu and Choi, Yunjey and Kim, Junho and Yoo, Sungjoo and Uh, Youngjung},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2021}
}

Related Projects

Model code starts from StyleGAN2 PyTorch unofficial code, which refers to StyleGAN2 official code. LPIPS, FID, and CNNDetection codes are used for evaluation. In semantic manipulation, we used StyleGAN pretrained network to get positive and negative samples by ranking. The demo code starts from Neural-Collage.

Owner
NAVER AI
Official account of NAVER AI, Korea No.1 Industrial AI Research Group
NAVER AI
Self-Supervised Generative Style Transfer for One-Shot Medical Image Segmentation

Self-Supervised Generative Style Transfer for One-Shot Medical Image Segmentation This repository contains the Pytorch implementation of the proposed

Devavrat Tomar 19 Nov 10, 2022
Release of the ConditionalQA dataset

ConditionalQA Datasets accompanying the paper ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers. Disclaimer This dataset

14 Oct 17, 2022
This project implements "virtual speed" from heart rate monito

ANT+ Virtual Stride Based Speed and Distance Monitor Overview This project imple

2 May 20, 2022
Discord bot for notifying on github events

Git-Observer Discord bot for notifying on github events ⚠️ This bot is meant to write messages to only one channel (implementing this for multiple pro

ilu_vatar_ 0 Apr 19, 2022
Contrastive Fact Verification

VitaminC This repository contains the dataset and models for the NAACL 2021 paper: Get Your Vitamin C! Robust Fact Verification with Contrastive Evide

47 Dec 19, 2022
Repository of the paper Compressing Sensor Data for Remote Assistance of Autonomous Vehicles using Deep Generative Models at ML4AD @ NeurIPS 2021.

Compressing Sensor Data for Remote Assistance of Autonomous Vehicles using Deep Generative Models Code and supplementary materials Repository of the p

Daniel Bogdoll 4 Jul 13, 2022
DL course co-developed by YSDA, HSE and Skoltech

Deep learning course This repo supplements Deep Learning course taught at YSDA and HSE @fall'21. For previous iteration visit the spring21 branch. Lec

Yandex School of Data Analysis 1.3k Dec 30, 2022
Multi-Modal Machine Learning toolkit based on PaddlePaddle.

简体中文 | English PaddleMM 简介 飞桨多模态学习工具包 PaddleMM 旨在于提供模态联合学习和跨模态学习算法模型库,为处理图片文本等多模态数据提供高效的解决方案,助力多模态学习应用落地。 近期更新 2022.1.5 发布 PaddleMM 初始版本 v1.0 特性 丰富的任务

njustkmg 520 Dec 28, 2022
[CVPR 2021] "Multimodal Motion Prediction with Stacked Transformers": official code implementation and project page.

mmTransformer Introduction This repo is official implementation for mmTransformer in pytorch. Currently, the core code of mmTransformer is implemented

DeciForce: Crossroads of Machine Perception and Autonomy 232 Dec 31, 2022
Learning to trade under the reinforcement learning framework

Trading Using Q-Learning In this project, I will present an adaptive learning model to trade a single stock under the reinforcement learning framework

Uirá Caiado 470 Nov 28, 2022
Comp445 project - Data Communications & Computer Networks

COMP-445 Data Communications & Computer Networks Change Python version in Conda

Peng Zhao 2 Oct 03, 2022
Repository for scripts and notebooks from the book: Programming PyTorch for Deep Learning

Repository for scripts and notebooks from the book: Programming PyTorch for Deep Learning

Ian Pointer 368 Dec 17, 2022
The description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts.

FMFCC-A This project is the description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts. The FMFCC-A dataset is shared through BaiduCl

18 Dec 24, 2022
Predict Breast Cancer Wisconsin (Diagnostic) using Naive Bayes

Naive-Bayes Predict Breast Cancer Wisconsin (Diagnostic) using Naive Bayes Downloading Data Set Use our Breast Cancer Wisconsin Data Set Also you can

Faeze Habibi 0 Apr 06, 2022
Fast and robust clustering of point clouds generated with a Velodyne sensor.

Depth Clustering This is a fast and robust algorithm to segment point clouds taken with Velodyne sensor into objects. It works with all available Velo

Photogrammetry & Robotics Bonn 957 Dec 21, 2022
Source Code for our paper: Understand me, if you refer to Aspect Knowledge: Knowledge-aware Gated Recurrent Memory Network

KaGRMN-DSG_ABSA This repository contains the PyTorch source Code for our paper: Understand me, if you refer to Aspect Knowledge: Knowledge-aware Gated

XingBowen 4 May 20, 2022
Reviatalizing Optimization for 3D Human Pose and Shape Estimation: A Sparse Constrained Formulation

Reviatalizing Optimization for 3D Human Pose and Shape Estimation: A Sparse Constrained Formulation This is the implementation of the approach describ

Taosha Fan 47 Nov 15, 2022
A deep learning framework for historical document image analysis

DIVA-DAF Description A deep learning framework for historical document image analysis. How to run Install dependencies # clone project git clone https

9 Aug 04, 2022
Kindle is an easy model build package for PyTorch.

Kindle is an easy model build package for PyTorch. Building a deep learning model became so simple that almost all model can be made by copy and paste from other existing model codes. So why code? wh

Jongkuk Lim 77 Nov 11, 2022
Official repo for BMVC2021 paper ASFormer: Transformer for Action Segmentation

ASFormer: Transformer for Action Segmentation This repo provides training & inference code for BMVC 2021 paper: ASFormer: Transformer for Action Segme

42 Dec 23, 2022