[CVPR 2021] Generative Hierarchical Features from Synthesizing Images

Related tags

Deep Learningghfeat
Overview

GH-Feat - Generative Hierarchical Features from Synthesizing Images

image Figure: Training framework of GH-Feat.

Generative Hierarchical Features from Synthesizing Images
Yinghao Xu*, Yujun Shen*, Jiapeng Zhu, Ceyuan Yang, Bolei Zhou
Computer Vision and Pattern Recognition (CVPR), 2021 (Oral)

[Paper] [Project Page]

In this work, we show that well-trained GAN generators can be used as training supervision to learn hierarchical visual features. We call this feature as Generative Hierarchical Feature (GH-Feat). Properly learned from a novel hierarchical encoder, GH-Feat is able to facilitate both discriminative and generative visual tasks, including face verification, landmark detection, layout prediction, transfer learning, style mixing, image editing, etc.

Usage

Environment

Before running the code, please setup the environment with

conda env create -f environment.yml
conda activate ghfeat

Testing

The following script can be used to extract GH-Feat from a list of images.

python extract_ghfeat.py ${ENCODER_PATH} ${IMAGE_LIST} -o ${OUTPUT_DIR}

We provide some well-learned encoders for inference.

Path Description
face_256x256 GH-Feat encoder trained on FF-HQ dataset.
tower_256x256 GH-Feat encoder trained on LSUN Tower dataset.
bedroom_256x256 GH-Feat encoder trained on LSUN Bedroom dataset.

Training

Given a well-trained StyleGAN generator, our hierarchical encoder is trained with the objective of image reconstruction.

python train_ghfeat.py \
       ${TRAIN_DATA_PATH} \
       ${VAL_DATA_PATH} \
       ${GENERATOR_PATH} \
       --num_gpus ${NUM_GPUS}

Here, the train_data and val_data can be created by this script. Note that, according to the official StyleGAN repo, the dataset is prepared in the multi-scale manner, but our encoder training only requires the data at the largest resolution. Hence, please specify the path to the tfrecords with the target resolution instead of the directory of all the tfrecords files.

Users can also train the encoder with slurm:

srun.sh ${PARTITION} ${NUM_GPUS} \
        python train_ghfeat.py \
               ${TRAIN_DATA_PATH} \
               ${VAL_DATA_PATH} \
               ${GENERATOR_PATH} \
               --num_gpus ${NUM_GPUS}

We provide some pre-trained generators as follows.

Path Description
face_256x256 StyleGAN trained on FFHQ dataset.
tower_256x256 StyleGAN trained on LSUN Tower dataset.
bedroom_256x256 StyleGAN trained on LSUN Bedroom dataset.

Codebase Description

  • Most codes are directly borrowed from StyleGAN repo.
  • Structure of the proposed hierarchical encoder: training/networks_ghfeat.py
  • Training loop of the encoder: training/training_loop_ghfeat.py
  • To feed GH-Feat produced by the encoder to the generator as layer-wise style codes, we slightly modify training/networks_stylegan.py. (See Line 263 and Line 477).
  • Main script for encoder training: train_ghfeat.py.
  • Script for extracting GH-Feat from images: extract_ghfeat.py.
  • VGG model for computing perceptual loss: perceptual_model.py.

Results

We show some results achieved by GH-Feat on a variety of downstream visual tasks.

Discriminative Tasks

Indoor scene layout prediction image

Facial landmark detection image

Face verification (face reconstruction) image

Generative Tasks

Image harmonization image

Global editing image

Local Editing image

Multi-level style mixing image

BibTeX

@inproceedings{xu2021generative,
  title     = {Generative Hierarchical Features from Synthesizing Images},
  author    = {Xu, Yinghao and Shen, Yujun and Zhu, Jiapeng and Yang, Ceyuan and Zhou, Bolei},
  booktitle = {CVPR},
  year      = {2021}
}
Owner
GenForce: May Generative Force Be with You
Research on Generative Modeling in Zhou Group
GenForce: May Generative Force Be with You
Leaderboard and Visualization for RLCard

RLCard Showdown This is the GUI support for the RLCard project and DouZero project. RLCard-Showdown provides evaluation and visualization tools to hel

Data Analytics Lab at Texas A&M University 246 Dec 26, 2022
Global Filter Networks for Image Classification

Global Filter Networks for Image Classification Created by Yongming Rao, Wenliang Zhao, Zheng Zhu, Jiwen Lu, Jie Zhou This repository contains PyTorch

Yongming Rao 273 Dec 26, 2022
Code for "Learning Skeletal Graph Neural Networks for Hard 3D Pose Estimation" ICCV'21

Skeletal-GNN Code for "Learning Skeletal Graph Neural Networks for Hard 3D Pose Estimation" ICCV'21 Various deep learning techniques have been propose

37 Oct 23, 2022
LSTM built using Keras Python package to predict time series steps and sequences. Includes sin wave and stock market data

LSTM Neural Network for Time Series Prediction LSTM built using the Keras Python package to predict time series steps and sequences. Includes sine wav

Jakob Aungiers 4.1k Jan 02, 2023
MLP-Like Vision Permutator for Visual Recognition (PyTorch)

Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition (arxiv) This is a Pytorch implementation of our paper. We present Vision

Qibin (Andrew) Hou 162 Nov 28, 2022
Implementation of Invariant Point Attention, used for coordinate refinement in the structure module of Alphafold2, as a standalone Pytorch module

Invariant Point Attention - Pytorch Implementation of Invariant Point Attention as a standalone module, which was used in the structure module of Alph

Phil Wang 113 Jan 05, 2023
Get 2D point positions (e.g., facial landmarks) projected on 3D mesh

points2d_projection_mesh Input 2D points (e.g. facial landmarks) on an image Camera parameters (extrinsic and intrinsic) of the image Aligned 3D mesh

5 Dec 08, 2022
code for Multi-scale Matching Networks for Semantic Correspondence, ICCV

MMNet This repo is the official implementation of ICCV 2021 paper "Multi-scale Matching Networks for Semantic Correspondence.". Pre-requisite conda cr

joey zhao 25 Dec 12, 2022
Back to Basics: Efficient Network Compression via IMP

Back to Basics: Efficient Network Compression via IMP Authors: Max Zimmer, Christoph Spiegel, Sebastian Pokutta This repository contains the code to r

IOL Lab @ ZIB 1 Nov 19, 2021
Contextual Attention Network: Transformer Meets U-Net

Contextual Attention Network: Transformer Meets U-Net Contexual attention network for medical image segmentation with state of the art results on skin

Reza Azad 67 Nov 28, 2022
Off-policy continuous control in PyTorch, with RDPG, RTD3 & RSAC

arXiv technical report soon available. we are updating the readme to be as comprehensive as possible Please ask any questions in Issues, thanks. Intro

Zhihan 31 Dec 30, 2022
Source code for "MusCaps: Generating Captions for Music Audio" (IJCNN 2021)

MusCaps: Generating Captions for Music Audio Ilaria Manco1 2, Emmanouil Benetos1, Elio Quinton2, Gyorgy Fazekas1 1 Queen Mary University of London, 2

Ilaria Manco 57 Dec 07, 2022
FairMOT for Multi-Class MOT using YOLOX as Detector

FairMOT-X Project Overview FairMOT-X is a multi-class multi object tracker, which has been tailored for training on the BDD100K MOT Dataset. It makes

Jonathan Tan 33 Dec 28, 2022
[ICCV 2021] Deep Hough Voting for Robust Global Registration

Deep Hough Voting for Robust Global Registration, ICCV, 2021 Project Page | Paper | Video Deep Hough Voting for Robust Global Registration Junha Lee1,

57 Nov 28, 2022
Export CenterPoint PonintPillars ONNX Model For TensorRT

CenterPoint-PonintPillars Pytroch model convert to ONNX and TensorRT Welcome to CenterPoint! This project is fork from tianweiy/CenterPoint. I impleme

CarkusL 149 Dec 13, 2022
Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)

Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)

Packt 1.5k Jan 03, 2023
Wordplay, an artificial Intelligence based crossword puzzle solver.

Wordplay, AI based crossword puzzle solver A crossword is a word puzzle that usually takes the form of a square or a rectangular grid of white- and bl

Vaibhaw 4 Nov 16, 2022
Official PyTorch Implementation for InfoSwap: Information Bottleneck Disentanglement for Identity Swapping

InfoSwap: Information Bottleneck Disentanglement for Identity Swapping Code usage Please check out the user manual page. Paper Gege Gao, Huaibo Huang,

Grace Hešeri 56 Dec 20, 2022
Decompose to Adapt: Cross-domain Object Detection via Feature Disentanglement

Decompose to Adapt: Cross-domain Object Detection via Feature Disentanglement In this project, we proposed a Domain Disentanglement Faster-RCNN (DDF)

19 Nov 24, 2022
Another pytorch implementation of FCN (Fully Convolutional Networks)

FCN-pytorch-easiest Trying to be the easiest FCN pytorch implementation and just in a get and use fashion Here I use a handbag semantic segmentation f

Y. Dong 158 Dec 21, 2022