[CVPR 2021] Generative Hierarchical Features from Synthesizing Images

Related tags

Deep Learningghfeat
Overview

GH-Feat - Generative Hierarchical Features from Synthesizing Images

image Figure: Training framework of GH-Feat.

Generative Hierarchical Features from Synthesizing Images
Yinghao Xu*, Yujun Shen*, Jiapeng Zhu, Ceyuan Yang, Bolei Zhou
Computer Vision and Pattern Recognition (CVPR), 2021 (Oral)

[Paper] [Project Page]

In this work, we show that well-trained GAN generators can be used as training supervision to learn hierarchical visual features. We call this feature as Generative Hierarchical Feature (GH-Feat). Properly learned from a novel hierarchical encoder, GH-Feat is able to facilitate both discriminative and generative visual tasks, including face verification, landmark detection, layout prediction, transfer learning, style mixing, image editing, etc.

Usage

Environment

Before running the code, please setup the environment with

conda env create -f environment.yml
conda activate ghfeat

Testing

The following script can be used to extract GH-Feat from a list of images.

python extract_ghfeat.py ${ENCODER_PATH} ${IMAGE_LIST} -o ${OUTPUT_DIR}

We provide some well-learned encoders for inference.

Path Description
face_256x256 GH-Feat encoder trained on FF-HQ dataset.
tower_256x256 GH-Feat encoder trained on LSUN Tower dataset.
bedroom_256x256 GH-Feat encoder trained on LSUN Bedroom dataset.

Training

Given a well-trained StyleGAN generator, our hierarchical encoder is trained with the objective of image reconstruction.

python train_ghfeat.py \
       ${TRAIN_DATA_PATH} \
       ${VAL_DATA_PATH} \
       ${GENERATOR_PATH} \
       --num_gpus ${NUM_GPUS}

Here, the train_data and val_data can be created by this script. Note that, according to the official StyleGAN repo, the dataset is prepared in the multi-scale manner, but our encoder training only requires the data at the largest resolution. Hence, please specify the path to the tfrecords with the target resolution instead of the directory of all the tfrecords files.

Users can also train the encoder with slurm:

srun.sh ${PARTITION} ${NUM_GPUS} \
        python train_ghfeat.py \
               ${TRAIN_DATA_PATH} \
               ${VAL_DATA_PATH} \
               ${GENERATOR_PATH} \
               --num_gpus ${NUM_GPUS}

We provide some pre-trained generators as follows.

Path Description
face_256x256 StyleGAN trained on FFHQ dataset.
tower_256x256 StyleGAN trained on LSUN Tower dataset.
bedroom_256x256 StyleGAN trained on LSUN Bedroom dataset.

Codebase Description

  • Most codes are directly borrowed from StyleGAN repo.
  • Structure of the proposed hierarchical encoder: training/networks_ghfeat.py
  • Training loop of the encoder: training/training_loop_ghfeat.py
  • To feed GH-Feat produced by the encoder to the generator as layer-wise style codes, we slightly modify training/networks_stylegan.py. (See Line 263 and Line 477).
  • Main script for encoder training: train_ghfeat.py.
  • Script for extracting GH-Feat from images: extract_ghfeat.py.
  • VGG model for computing perceptual loss: perceptual_model.py.

Results

We show some results achieved by GH-Feat on a variety of downstream visual tasks.

Discriminative Tasks

Indoor scene layout prediction image

Facial landmark detection image

Face verification (face reconstruction) image

Generative Tasks

Image harmonization image

Global editing image

Local Editing image

Multi-level style mixing image

BibTeX

@inproceedings{xu2021generative,
  title     = {Generative Hierarchical Features from Synthesizing Images},
  author    = {Xu, Yinghao and Shen, Yujun and Zhu, Jiapeng and Yang, Ceyuan and Zhou, Bolei},
  booktitle = {CVPR},
  year      = {2021}
}
Owner
GenForce: May Generative Force Be with You
Research on Generative Modeling in Zhou Group
GenForce: May Generative Force Be with You
RepVGG: Making VGG-style ConvNets Great Again

RepVGG: Making VGG-style ConvNets Great Again (PyTorch) This is a super simple ConvNet architecture that achieves over 80% top-1 accuracy on ImageNet

2.8k Jan 04, 2023
Implementation of paper "DCS-Net: Deep Complex Subtractive Neural Network for Monaural Speech Enhancement"

DCS-Net This is the implementation of "DCS-Net: Deep Complex Subtractive Neural Network for Monaural Speech Enhancement" Steps to run the model Edit V

Jack Walters 10 Apr 04, 2022
PyTorch implementation of Value Iteration Networks (VIN): Clean, Simple and Modular. Visualization in Visdom.

VIN: Value Iteration Networks This is an implementation of Value Iteration Networks (VIN) in PyTorch to reproduce the results.(TensorFlow version) Key

Xingdong Zuo 215 Dec 07, 2022
PyTorch code for DriveGAN: Towards a Controllable High-Quality Neural Simulation

PyTorch code for DriveGAN: Towards a Controllable High-Quality Neural Simulation

76 Dec 24, 2022
Official implementation of the paper Label-Efficient Semantic Segmentation with Diffusion Models

Label-Efficient Semantic Segmentation with Diffusion Models Official implementation of the paper Label-Efficient Semantic Segmentation with Diffusion

Yandex Research 355 Jan 06, 2023
The official implementation of ICCV paper "Box-Aware Feature Enhancement for Single Object Tracking on Point Clouds".

Box-Aware Tracker (BAT) Pytorch-Lightning implementation of the Box-Aware Tracker. Box-Aware Feature Enhancement for Single Object Tracking on Point C

Kangel Zenn 5 Mar 26, 2022
AI创造营 :Metaverse启动机之重构现世,结合PaddlePaddle 和 Wechaty 创造自己的聊天机器人

paddle-wechaty-Zodiac AI创造营 :Metaverse启动机之重构现世,结合PaddlePaddle 和 Wechaty 创造自己的聊天机器人 12星座若穿越科幻剧,会拥有什么超能力呢?快来迎接你的专属超能力吧! 现在很多年轻人都喜欢看科幻剧,像是复仇者系列,里面有很多英雄、超

105 Dec 22, 2022
SemEval2022 Patronizing and Condescending Language (PCL) Detection

SemEval2022 Patronizing and Condescending Language (PCL) Detection This task is from SemEval 2022. What is Patronizing and Condescending Language (PCL

Daniel Saeedi 0 Aug 05, 2022
SelfAugment extends MoCo to include automatic unsupervised augmentation selection.

SelfAugment extends MoCo to include automatic unsupervised augmentation selection. In addition, we've included the ability to pretrain on several new datasets and included a wandb integration.

Colorado Reed 24 Oct 26, 2022
Face Detection & Age Gender & Expression & Recognition

Face Detection & Age Gender & Expression & Recognition

Sajjad Ayobi 188 Dec 28, 2022
Tf alloc - Simplication of GPU allocation for Tensorflow2

tf_alloc Simpliying GPU allocation for Tensorflow Developer: korkite (Junseo Ko)

Junseo Ko 3 Feb 10, 2022
Get 2D point positions (e.g., facial landmarks) projected on 3D mesh

points2d_projection_mesh Input 2D points (e.g. facial landmarks) on an image Camera parameters (extrinsic and intrinsic) of the image Aligned 3D mesh

5 Dec 08, 2022
yolov5目标检测模型的知识蒸馏(基于响应的蒸馏)

代码地址: https://github.com/Sharpiless/yolov5-knowledge-distillation 教师模型: python train.py --weights weights/yolov5m.pt \ --cfg models/yolov5m.ya

52 Dec 04, 2022
Monitor your ML jobs on mobile devices📱, especially for Google Colab / Kaggle

TF Watcher TF Watcher is a simple to use Python package and web app which allows you to monitor 👀 your Machine Learning training or testing process o

Rishit Dagli 54 Nov 01, 2022
Lava-DL, but with PyTorch-Lightning flavour

Deep learning project seed Use this seed to start new deep learning / ML projects. Built in setup.py Built in requirements Examples with MNIST Badges

Sami BARCHID 4 Oct 31, 2022
Drone detection using YOLOv5

This drone detection system uses YOLOv5 which is a family of object detection architectures and we have trained the model on Drone Dataset. Overview I

Tushar Sarkar 27 Dec 20, 2022
Corruption Invariant Learning for Re-identification

Corruption Invariant Learning for Re-identification The official repository for Benchmarks for Corruption Invariant Person Re-identification (NeurIPS

Minghui Chen 73 Dec 08, 2022
As-ViT: Auto-scaling Vision Transformers without Training

As-ViT: Auto-scaling Vision Transformers without Training [PDF] Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wang, Denny Zhou In ICLR 2

VITA 68 Sep 05, 2022
[NeurIPS 2021] Official implementation of paper "Learning to Simulate Self-driven Particles System with Coordinated Policy Optimization".

Code for Coordinated Policy Optimization Webpage | Code | Paper | Talk (English) | Talk (Chinese) Hi there! This is the source code of the paper “Lear

DeciForce: Crossroads of Machine Perception and Autonomy 81 Dec 19, 2022
Official source code of paper 'IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo'

IterMVS official source code of paper 'IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo' Introduction IterMVS is a novel lear

Fangjinhua Wang 127 Jan 04, 2023