The PyTorch implementation for paper "Neural Texture Extraction and Distribution for Controllable Person Image Synthesis" (CVPR2022 Oral)

Last update: Dec 10, 2022

Overview

ArXiv | Get Started

Neural-Texture-Extraction-Distribution

We propose a Neural-Texture-Extraction-Distribution operation for controllable person image synthesis. Our model can be used to control the pose and appearance of a reference image:

Pose Control

Appearance Control

News

2022.4.30 Colab demos are provided for quick exploration.
2022.4.28 Code for PyTorch is available now!

Installation

Requirements

Python 3
PyTorch 1.7.1
CUDA 10.2

Conda Installation

# 1. Create a conda virtual environment.
conda create -n NTED python=3.6
conda activate NTED
conda install -c pytorch pytorch=1.7.1 torchvision cudatoolkit=10.2

# 2. Clone the repo and install dependencies (run pip from inside the cloned folder)
git clone --recursive https://github.com/RenYurui/Neural-Texture-Extraction-Distribution.git
cd Neural-Texture-Extraction-Distribution
pip install -r requirements.txt

# 3. Install mmfashion (for appearance control only)
pip install mmcv==0.5.1
pip install pycocotools==2.0.4
cd ./scripts
chmod +x insert_mmfashion2mmdetection.sh
./insert_mmfashion2mmdetection.sh
cd ../third_part/mmdetection
pip install -v -e .
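
After installation, it can help to confirm that the active environment matches the requirements listed above (PyTorch 1.7.1, CUDA 10.2). The following is a minimal sketch, not part of the repository; the helper name `check_environment` is hypothetical, and it reports missing packages instead of failing.

```python
import importlib
import sys


def check_environment():
    """Collect the interpreter and PyTorch versions relevant to this README."""
    info = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not importable: the conda install step has not run in this env.
        info["torch"] = None
        info["cuda"] = None
        info["cuda_available"] = False
        return info
    info["torch"] = torch.__version__
    # CUDA version PyTorch was built against (None for CPU-only builds).
    info["cuda"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in check_environment().items():
        print("{}: {}".format(key, value))
```

If `torch` is reported as `None`, the NTED conda environment is probably not activated.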

Demo

Several demos are provided. Please first download the resources by running:

cd scripts
./download_demos.sh

Pose Transfer

Run the following command to generate the pose transfer results.

PATH_TO_OUTPUT=./demo_results
python demo.py \
--config ./config/fashion_512.yaml \
--which_iter 495400 \
--name fashion_512 \
--file_pairs ./txt_files/demo.txt \
--input_dir ./demo_images \
--output_dir $PATH_TO_OUTPUT

Appearance Control

Run the following command for the appearance control demo.

python appearance_control.py \
--config ./config/fashion_512.yaml \
--name fashion_512 \
--which_iter 495400 \
--input_dir ./demo_images \
--file_pairs ./txt_files/appearance_control.txt

Colab Demo

Please check the Colab Demos for pose control and appearance control.

Dataset

Download img_highres.zip of the DeepFashion dataset from the In-shop Clothes Retrieval Benchmark.
Unzip img_highres.zip. You will need to request the password from the dataset maintainers. Then rename the extracted folder to img and put it under the ./dataset/deepfashion directory.
We split the train/test sets following GFLA. Several images with significant occlusions are removed from the training set. Download the train/test pairs and the keypoints pose.zip extracted with OpenPose by running:
```
cd scripts
./download_dataset.sh
```
Alternatively, download these files manually:
- Download the train/test pairs from Google Drive, including train_pairs.txt, test_pairs.txt, train.lst, and test.lst. Put these files under the ./dataset/deepfashion directory.
- Download the keypoints pose.rar extracted with OpenPose from Google Drive. Extract the archive and put the resulting folder under the ./dataset/deepfashion directory.
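
Before converting the data, a quick layout check can catch a misplaced file early. This is a sketch under stated assumptions, not a script from the repository: the four file names and the `img` folder come from the steps above, while `pose` is only a guess at the name of the extracted keypoint folder. The helper `missing_dataset_entries` is hypothetical.

```python
from pathlib import Path

# File names are taken from the download steps above.
REQUIRED_FILES = ["train_pairs.txt", "test_pairs.txt", "train.lst", "test.lst"]
# "img" is the renamed image folder; "pose" is an ASSUMED name for the
# extracted keypoint folder and may differ in practice.
REQUIRED_DIRS = ["img", "pose"]


def missing_dataset_entries(root):
    """Return the expected files and folders that are absent under root."""
    root = Path(root)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in REQUIRED_DIRS if not (root / name).is_dir()]
    return missing


if __name__ == "__main__":
    missing = missing_dataset_entries("./dataset/deepfashion")
    print("missing:", missing if missing else "nothing")
```

An empty result only means the expected names exist; it does not verify the contents of the files.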

Run the following command to convert the images into an LMDB dataset.

python -m scripts.prepare_data \
--root ./dataset/deepfashion \
--out ./dataset/deepfashion

Training

This project supports multi-GPU training. The following command shows an example of training the model with 512x352 images using 4 GPUs.

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch \
--nproc_per_node=4 \
--master_port 1234 train.py \
--config ./config/fashion_512.yaml \
--name $name_of_your_experiment

All settings for this experiment are stored in ./config/fashion_512.yaml. If you change the number of GPUs, you may need to modify batch_size in ./config/fashion_512.yaml so that the overall batch size stays the same.
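
As a worked example of that adjustment, the sketch below splits a fixed overall batch size across a different number of GPUs. It assumes that batch_size in the config is a per-GPU value, which this README does not confirm, so check how the data loader uses it before relying on the numbers. The function name and the overall batch size of 16 are hypothetical.

```python
def per_gpu_batch_size(total_batch_size, num_gpus):
    """Split a fixed overall batch size evenly across GPUs."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    if total_batch_size % num_gpus != 0:
        raise ValueError(
            "total batch size {} is not divisible by {} GPUs".format(
                total_batch_size, num_gpus
            )
        )
    return total_batch_size // num_gpus


if __name__ == "__main__":
    # Hypothetical numbers: keep an overall batch of 16 while varying the GPU count.
    for gpus in (1, 2, 4, 8):
        print(gpus, "GPU(s) ->", per_gpu_batch_size(16, gpus), "samples per GPU")
```

When the overall batch size cannot be divided evenly, the function raises an error rather than silently changing the effective batch size.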

Inference

Download the trained weights for 512x352 images and 256x176 images. Put the obtained checkpoints under ./result/fashion_512 and ./result/fashion_256, respectively.

Run the following code to evaluate the trained model:

# run evaluation for 512x352 images
python -m torch.distributed.launch \
--nproc_per_node=1 \
--master_port 12345 inference.py \
--config ./config/fashion_512.yaml \
--name fashion_512 \
--no_resume \
--output_dir ./result/fashion_512/inference 

# run evaluation for 256x176 images
python -m torch.distributed.launch \
--nproc_per_node=1 \
--master_port 12345 inference.py \
--config ./config/fashion_256.yaml \
--name fashion_256 \
--no_resume \
--output_dir ./result/fashion_256/inference

The result images are saved in ./result/fashion_512/inference and ./result/fashion_256/inference.
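
The two evaluation commands differ only in the experiment name, so they can be generated from one template when scripting several runs. The sketch below only builds the command strings and does not launch anything; the helper `inference_command` is hypothetical, and every flag is copied from the commands above.

```python
import shlex


def inference_command(name, master_port=12345):
    """Build the single-GPU evaluation command for one experiment name."""
    args = [
        "python", "-m", "torch.distributed.launch",
        "--nproc_per_node=1",
        "--master_port", str(master_port), "inference.py",
        "--config", "./config/{}.yaml".format(name),
        "--name", name,
        "--no_resume",
        "--output_dir", "./result/{}/inference".format(name),
    ]
    # Quote each argument so the string is safe to paste into a shell.
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    for name in ("fashion_512", "fashion_256"):
        print(inference_command(name))
```

The printed lines can be pasted into a shell or passed to a job scheduler.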

Owner

Ren Yurui
