GAN encoders in PyTorch that could match PGGAN, StyleGAN v1/v2, and BigGAN. Code also integrates the implementation of these GANs.

Overview

MTV-TSA: Adaptable GAN Encoders for Image Reconstruction via Multi-type Latent Vectors with Two-scale Attentions.

Python 3.7.3 PyTorch 1.8.1 Apache-2.0

cxx1 cxx2 msk dy zy

This is the official code release for "Adaptable GAN Encoders for Image Reconstruction via Multi-type Latent Vectors with Two-scale Attentions".

The code contains a set of encoders that match pre-trained GANs (PGGAN, StyleGANv1, StyleGANv2, BigGAN) via multi-scale vectors with two-scale attentions.

Usage

  • training encoder with center attentions (align image)

python E_align.py

  • training encoder with Gram-based attentions (misalign image)

python E_mis_align.py

  • embedding real images to latent space (using StyleGANv1 and w).

    a. You can put real images at './checkpoint/realimg_file/' (default file as args.img_dir)

    b. You should load pre-trained Encoder at './checkpoint/E/E_blur(case2)_styleganv1_FFHQ_state_dict.pth'

    c. Then run:

python embedding_img.py

  • discovering attribute directions with latent space : embedded_img_processing.py

Note: Pre-trained Model should be download first , and default save to './chechpoint/'

Metric

  • validate performance (Pre-trained GANs and baseline)

    1. using generations.py to generate reconstructed images (generate GANs images if needed)
    2. Files in the directory "./baseline/" could help you to quickly format images and latent vectors (w).
    3. Put comparing images to different files, and run comparing-baseline.py
  • ablation study : look at ''./ablations-study/''

Setup

Encoders

  • Case 1: Training most pre-trained GANs with encoders. at './model/E/E.py' (quickly converge for reconstructed GANs' image)
  • Case 2: Training StyleGANv1 on FFHQ for ablation study and real face image process at './model/E/E_Blur.py' (margin blur and more GPU memory)

Pre-Trained GANs

note: put pre-trained GANs weight file at ''./checkpoint/' directory

  • StyleGAN_V1 (should contain 3 files: Gm, Gs, center-tensor):
    • Cat 256:
      • ./checkpoint/stylegan_V1/cat/cat256_Gs_dict.pth
      • ./checkpoint/stylegan_V1/cat/cat256_Gm_dict.pth
      • ./checkpoint/stylegan_V1/cat/cat256_tensor.pt
    • Car 256: same above
    • Bedroom 256:
  • StyleGAN_V2 (Only one files : pth):
    • FFHQ 1024:
      • ./checkpoint/stylegan_V2/stylegan2_ffhq1024.pth
  • PGGAN ((Only one files : pth)):
    • Horse 256:
      • ./checkpoint/PGGAN/
  • BigGAN (Two files : model as .pt and config as .json ):
    • Image-Net 256:
      • ./checkpoint/biggan/256/G-256.pt
      • ./checkpoint/biggan/256/biggan-deep-256-config.json

Options and Setting

note: different GANs should set different parameters carefully.

  • choose --mtype for StyleGANv1=1, StyleGANv2=2, PGGAN=3, BIGGAN=4
  • choose Encoder start_features (--z_dim) carefully, the value are: 16->1024x1024, 32->512x512, 64->256x256
  • if go on training, set --checkpoint_dir_E which path save pre-trained Encoder model
  • --checkpoint_dir_GAN is needed, StyleGANv1 is a directory(contains 3 filers: Gm, Gs, center-tensor) , others are file path (.pth or .pt)
    parser = argparse.ArgumentParser(description='the training args')
    parser.add_argument('--iterations', type=int, default=210000) # epoch = iterations//30000
    parser.add_argument('--lr', type=float, default=0.0015)
    parser.add_argument('--beta_1', type=float, default=0.0)
    parser.add_argument('--batch_size', type=int, default=2)
    parser.add_argument('--experiment_dir', default=None) #None
    parser.add_argument('--checkpoint_dir_GAN', default='./checkpoint/stylegan_v2/stylegan2_ffhq1024.pth') #None  ./checkpoint/stylegan_v1/ffhq1024/ or ./checkpoint/stylegan_v2/stylegan2_ffhq1024.pth or ./checkpoint/biggan/256/G-256.pt
    parser.add_argument('--config_dir', default='./checkpoint/biggan/256/biggan-deep-256-config.json') # BigGAN needs it
    parser.add_argument('--checkpoint_dir_E', default=None)
    parser.add_argument('--img_size',type=int, default=1024)
    parser.add_argument('--img_channels', type=int, default=3)# RGB:3 ,L:1
    parser.add_argument('--z_dim', type=int, default=512) # PGGAN , StyleGANs are 512. BIGGAN is 128
    parser.add_argument('--mtype', type=int, default=2) # StyleGANv1=1, StyleGANv2=2, PGGAN=3, BigGAN=4
    parser.add_argument('--start_features', type=int, default=16)  # 16->1024 32->512 64->256

Pre-trained Model

We offered pre-trainned GANs and their corresponding encoders here: models (default setting is the case1 ).

GANs:

  • StyleGANv1-(FFHQ1024, Car512, Cat256) models which contain 3 files Gm, Gs and center-tensor.
  • PGGAN and StyleGANv2. A single .pth file gets Gm, Gs and center-tensor together.
  • BigGAN 128x128 ,256x256, and 512x512: each type contain a config file and model (.pt)

Encoders:

  • StyleGANv1 FFHQ (case 2) for real-image embedding and process.
  • StyleGANv2 LSUN Cat 256, they are one models from case 1 (Grad-CAM based attentions) and both models from case 2 (Grad-Cam based and Center-aligned Attentions for ablation study):
  • StyleGANv2 FFHQ (case 1)
  • Biggan-256 (case 1)

If you want to try more GANs, cite more pre-trained GANs below:

Acknowledgements

Pre-trained GANs:

StyleGANv1: https://github.com/podgorskiy/StyleGan.git, ( Converting code for official pre-trained model is here: https://github.com/podgorskiy/StyleGAN_Blobless.git) StyleGANv2 and PGGAN: https://github.com/genforce/genforce.git BigGAN: https://github.com/huggingface/pytorch-pretrained-BigGAN

Comparing Works:

In-Domain GAN: https://github.com/genforce/idinvert_pytorch pSp: https://github.com/eladrich/pixel2style2pixel ALAE: https://github.com/podgorskiy/ALAE.git

Related Works:

Grad-CAM & Grad-CAM++: https://github.com/yizt/Grad-CAM.pytorch SSIM Index: https://github.com/Po-Hsun-Su/pytorch-ssim

Our method implementation partly borrow from the above works (ALAE and Related Works). We would like to thank those authors.

If you have any questions, please contact us by E-mail ( [email protected]). Pull request or any comment is also welcome.

License

The code of this repository is released under the Apache 2.0 license.
The directories models/biggan and models/stylegan2 are provided under the MIT license.

Cite

@misc{yu2021adaptable,
      title={Adaptable GAN Encoders for Image Reconstruction via Multi-type Latent Vectors with Two-scale Attentions}, 
      author={Cheng Yu and Wenmin Wang},
      year={2021},
      eprint={2108.10201},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

简体中文:

如何应用于编辑人脸

Owner
owl
Be a strong man & Try to be a great man
owl
Negative Interactions for Improved Collaborative Filtering:

Negative Interactions for Improved Collaborative Filtering: Don’t go Deeper, go Higher This notebook provides an implementation in Python 3 of the alg

Harald Steck 21 Mar 05, 2022
Determined: Deep Learning Training Platform

Determined: Deep Learning Training Platform Determined is an open-source deep learning training platform that makes building models fast and easy. Det

Determined AI 2k Dec 31, 2022
Implementation of the Point Transformer layer, in Pytorch

Point Transformer - Pytorch Implementation of the Point Transformer self-attention layer, in Pytorch. The simple circuit above seemed to have allowed

Phil Wang 501 Jan 03, 2023
OBBDetection: an oriented object detection toolbox modified from MMdetection

OBBDetection note: If you have questions or good suggestions, feel free to propose issues and contact me. introduction OBBDetection is an oriented obj

MIXIAOXIN_HO 3 Nov 11, 2022
The audio-video synchronization of MKV Container Format is exploited to achieve data hiding

The audio-video synchronization of MKV Container Format is exploited to achieve data hiding, where the hidden data can be utilized for various management purposes, including hyper-linking, annotation

Maxim Zaika 1 Nov 17, 2021
A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning

LABES This is the code for EMNLP 2020 paper "A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised L

17 Sep 28, 2022
Code for Quantifying Ignorance in Individual-Level Causal-Effect Estimates under Hidden Confounding

🍐 quince Code for Quantifying Ignorance in Individual-Level Causal-Effect Estimates under Hidden Confounding 🍐 Installation $ git clone

Andrew Jesson 19 Jun 23, 2022
CVPR2022 (Oral) - Rethinking Semantic Segmentation: A Prototype View

Rethinking Semantic Segmentation: A Prototype View Rethinking Semantic Segmentation: A Prototype View, Tianfei Zhou, Wenguan Wang, Ender Konukoglu and

Tianfei Zhou 239 Dec 26, 2022
Official PyTorch implementation of our AAAI22 paper: TransMEF: A Transformer-Based Multi-Exposure Image Fusion Framework via Self-Supervised Multi-Task Learning. Code will be available soon.

Official-PyTorch-Implementation-of-TransMEF Official PyTorch implementation of our AAAI22 paper: TransMEF: A Transformer-Based Multi-Exposure Image Fu

117 Dec 27, 2022
Perfect implement. Model shared. x0.5 (Top1:60.646) and 1.0x (Top1:69.402).

Shufflenet-v2-Pytorch Introduction This is a Pytorch implementation of faceplusplus's ShuffleNet-v2. For details, please read the following papers:

423 Dec 07, 2022
This project is a re-implementation of MASTER: Multi-Aspect Non-local Network for Scene Text Recognition by MMOCR

This project is a re-implementation of MASTER: Multi-Aspect Non-local Network for Scene Text Recognition by MMOCR,which is an open-source toolbox based on PyTorch. The overall architecture will be sh

Jianquan Ye 82 Nov 17, 2022
Soft actor-critic is a deep reinforcement learning framework for training maximum entropy policies in continuous domains.

This repository is no longer maintained. Please use our new Softlearning package instead. Soft Actor-Critic Soft actor-critic is a deep reinforcement

Tuomas Haarnoja 752 Jan 07, 2023
An open source library for face detection in images. The face detection speed can reach 1000FPS.

libfacedetection This is an open source library for CNN-based face detection in images. The CNN model has been converted to static variables in C sour

Shiqi Yu 11.4k Dec 27, 2022
Code for Low-Cost Algorithmic Recourse for Users With Uncertain Cost Functions

EMS-COLS-recourse Initial Code for Low-Cost Algorithmic Recourse for Users With Uncertain Cost Functions Folder structure: data folder contains raw an

Prateek Yadav 1 Nov 25, 2022
YOLOX + ROS(1, 2) object detection package

YOLOX + ROS(1, 2) object detection package

Ar-Ray 158 Dec 21, 2022
Code and data of the Fine-Grained R2R Dataset proposed in paper Sub-Instruction Aware Vision-and-Language Navigation

Fine-Grained R2R Code and data of the Fine-Grained R2R Dataset proposed in the EMNLP2020 paper Sub-Instruction Aware Vision-and-Language Navigation. C

YicongHong 34 Nov 15, 2022
[ICCV 2021] Official Tensorflow Implementation for "Single Image Defocus Deblurring Using Kernel-Sharing Parallel Atrous Convolutions"

KPAC: Kernel-Sharing Parallel Atrous Convolutional block This repository contains the official Tensorflow implementation of the following paper: Singl

Hyeongseok Son 50 Dec 29, 2022
SCU OlympicsRunning Baseline

Competition 1v1 running Environment check details in Jidi Competition RLChina2021智能体竞赛 做出的修改: 奖励重塑:修改了环境,重新设置了奖励的分配,使得奖励组成不只有零和博弈,还有探索环境的奖励。 算法微调:修改了官

ZiSeoi Wong 2 Nov 23, 2021
Utility code for use with PyXLL

pyxll-utils There is no need to use this package as of PyXLL 5. All features from this package are now provided by PyXLL. If you were using this packa

PyXLL 10 Dec 18, 2021
Tools for robust generative diffeomorphic slice to volume reconstruction

RGDSVR Tools for Robust Generative Diffeomorphic Slice to Volume Reconstructions (RGDSVR) This repository provides tools to implement the methods in t

Lucilio Cordero-Grande 0 Oct 29, 2021