Pytorch implementation of few-shot semantic image synthesis

Last update: Sep 26, 2022

Related tags

Overview

Few-shot Semantic Image Synthesis Using StyleGAN Prior

Our method can synthesize photorealistic images from dense or sparse semantic annotations using a few training pairs and a pre-trained StyleGAN.

Prerequisites

Python3
PyTorch

Preparation

Download and decompress the file containing StyleGAN pre-trained models and put the "pretrained_models" directory in the parent directory.

Inference with our pre-trained models

Download and decompress the file containing our pretrained encoders and put the "results" directory in the parent directory.
For example, our results for celebaMaskHQ in a one-shot setting can be generated as follows:

python scripts/inference.py --exp_dir=results/celebaMaskHQ_oneshot --checkpoint_path=results/celebaMaskHQ_oneshot/checkpoints/iteration_100000.pt --data_path=./data/CelebAMask-HQ/test/labels/ --couple_outputs --latent_mask=8,9,10,11,12,13,14,15,16,17

Inference results are generated in results/celebaMaskHQ_oneshot. If you use other datasets, please specify --exp_dir, --checkpoint_path, and --data_path appropriately.

Training

For each dataset, you can train an encoder as follows:

CelebAMask

python scripts/train.py --exp_dir=[result_dir] --dataset_type=celebs_seg_to_face --stylegan_weights pretrained_models/stylegan2-ffhq-config-f.pt --start_from_latent_avg --label_nc 19 --input_nc 19

CelebALandmark

python scripts/train.py --exp_dir=[result_dir] --dataset_type=celebs_landmark_to_face --stylegan_weights pretrained_models/stylegan2-ffhq-config-f.pt --start_from_latent_avg --label_nc 71 --input_nc 71 --sparse_labeling

Intermediate training outputs with the StyleGAN pre-trained with the CelebA-HQ dataset. It can be seen that the layouts of the bottom-row images reconstructed from the middle-row pseudo semantic masks gradually become close to those of the top-row StyleGAN samples as the training iterations increase.

LSUN church

python scripts/train.py --exp_dir=[result_dir] --dataset_type=lsunchurch_seg_to_img --stylegan_weights pretrained_models/stylegan2-church-config-f.pt --style_num 14 --start_from_latent_avg --label_nc 151 --input_nc 151

LSUN car

python scripts/train.py --exp_dir=[result_dir] --dataset_type=lsuncar_seg_to_img --stylegan_weights pretrained_models/stylegan2-car-config-f.pt --style_num 16 --start_from_latent_avg --label_nc 5 --input_nc 5

LSUN cat

python scripts/train.py --exp_dir=[result_dir] --dataset_type=lsuncat_scribble_to_img --stylegan_weights pretrained_models/stylegan2-cat-config-f.pt --style_num 14 --start_from_latent_avg --label_nc 9 --input_nc 9 --sparse_labeling

Ukiyo-e

python scripts/train.py --exp_dir=[result_dir] --dataset_type=ukiyo-e_scribble_to_img --stylegan_weights pretrained_models/ukiyoe-256-slim-diffAug-002789.pt --style_num 14 --channel_multiplier 1 --start_from_latent_avg --label_nc 8 --input_nc 8 --sparse_labeling

Anime

python scripts/train.py --exp_dir=[result_dir] --dataset_type=anime_cross_to_img --stylegan_weights pretrained_models/2020-01-11-skylion-stylegan2-animeportraits-networksnapshot-024664.pt --style_num 16 --start_from_latent_avg --label_nc 2 --input_nc 2 --sparse_labeling

Using StyleGAN samples as few-shot training data

Run the following script:

python scripts/generate_stylegan_samples.py --exp_dir=[result_dir] --stylegan_weights ./pretrained_models/stylegan2-ffhq-config-f.pt --style_num 18 --channel_multiplier 2

Then a StyleGAN image (*.png) and a corresponding latent code (*.pt) are obtained in [result_dir]/data/images and [result_dir]/checkpoints.

Manually annotate the generated image in [result_dir]/data/images and save the annotated mask in [result_dir]/data/labels.
Edit ./config/data_configs.py and ./config/paths_config.py appropriately to use the annotated pairs as a training set.
Run a training command above with appropriate options.

Citation

Please cite our paper if you find the code useful:

@article{endo2021fewshotsmis,
  title = {Few-shot Semantic Image Synthesis Using StyleGAN Prior},
  author = {Yuki Endo and Yoshihiro Kanamori},
  journal   = {CoRR},
  volume    = {abs/2103.14877},
  year      = {2021}
}

Acknowledgements

This code heavily borrows from the pixel2style2pixel repository.

Pytorch implementation of few-shot semantic image synthesis

Related tags

Overview

Few-shot Semantic Image Synthesis Using StyleGAN Prior

Prerequisites

Preparation

Inference with our pre-trained models

Training

Using StyleGAN samples as few-shot training data

Citation

Acknowledgements

Owner

[CVPR'21] FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space

A booklet on machine learning systems design with exercises

The Unsupervised Reinforcement Learning Benchmark (URLB)

my graduation project is about live human face augmentation by projection mapping by using CNN

My implementation of Fully Convolutional Neural Networks in Keras

A GUI for Face Recognition, based upon Docker, Tkinter, GPU and a camera device.

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

GNEE - GAT Neural Event Embeddings

Implementation of CVPR 2020 Dual Super-Resolution Learning for Semantic Segmentation

Homepage of paper: Paint Transformer: Feed Forward Neural Painting with Stroke Prediction, ICCV 2021.

Rainbow: Combining Improvements in Deep Reinforcement Learning

Codes for AAAI 2022 paper: Context-aware Health Event Prediction via Transition Functions on Dynamic Disease Graphs

Categorizing comments on YouTube into different categories.

Nerf pl - NeRF (Neural Radiance Fields) and NeRF in the Wild using pytorch-lightning

Single-Stage Instance Shadow Detection with Bidirectional Relation Learning (CVPR 2021 Oral)

pytorch implementation of Attention is all you need

Traditional deepdream with VQGAN+CLIP and optical flow. Ready to use in Google Colab

PG2Net: Personalized and Group PreferenceGuided Network for Next Place Prediction

Intel® Nervana™ reference deep learning framework committed to best performance on all hardware

Joint Learning of 3D Shape Retrieval and Deformation, CVPR 2021