Exploring the Dual-task Correlation for Pose Guided Person Image Generation

Last update: Dec 15, 2022

Overview

Dual-task Pose Transformer Network

The source code for our paper "Exploring Dual-task Correlation for Pose Guided Person Image Generation" (CVPR 2022).

Get Started

1) Requirements

Python 3.7.9
PyTorch 1.7.1
torchvision 0.8.2
CUDA 11.1
NVIDIA A100 40GB PCIe

2) Data Preparation

Following PATN, the dataset split files and extracted keypoints files can be obtained as follows:

DeepFashion

Download the DeepFashion dataset (In-shop Clothes Retrieval Benchmark) and put it under the ./dataset/fashion directory.
Download the train/test pairs and train/test keypoint annotations from Google Drive, including fasion-resize-pairs-train.csv, fasion-resize-pairs-test.csv, fasion-resize-annotation-train.csv, fasion-resize-annotation-test.csv, train.lst and test.lst, and put them under the ./dataset/fashion directory.
Split the raw images into the training set (./dataset/fashion/train) and the test set (./dataset/fashion/test):

python data/generate_fashion_datasets.py
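Before running the split script, it can help to confirm that the downloaded files are where the steps above expect them. The following is a minimal sketch, not part of the repository: the helper name missing_files is hypothetical, and fasion-resize-annotation-test.csv is assumed to be the test counterpart of the train annotation file.

```python
from pathlib import Path

# File names follow the list in the steps above.
# ASSUMPTION: "fasion-resize-annotation-test.csv" is the test counterpart
# of the train annotation file.
EXPECTED_FASHION_FILES = [
    "fasion-resize-pairs-train.csv",
    "fasion-resize-pairs-test.csv",
    "fasion-resize-annotation-train.csv",
    "fasion-resize-annotation-test.csv",
    "train.lst",
    "test.lst",
]


def missing_files(root, expected=EXPECTED_FASHION_FILES):
    """Return the expected file names that are not present directly under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files("./dataset/fashion")
    if missing:
        print("Missing under ./dataset/fashion:", ", ".join(missing))
    else:
        print("All expected DeepFashion files are present.")
```

An empty result only means the files exist; it does not validate their contents.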

Market1501

Download the Market1501 dataset from here. Rename bounding_box_train and bounding_box_test as train and test, and put them under the ./dataset/market directory.
Download the train/test pairs and keypoint annotations from Google Drive, including market-pairs-train.csv, market-pairs-test.csv, market-annotation-train.csv and market-annotation-test.csv. Put these files under the ./dataset/market directory.
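The renaming step for Market1501 can also be scripted. The sketch below is an illustration under assumptions, not code from this repository: the function name arrange_market and the location of the extracted archive are hypothetical, and only the folder names bounding_box_train, bounding_box_test, train and test come from the instructions above.

```python
import shutil
from pathlib import Path

# Source folder name in the Market1501 archive -> name expected by this README.
RENAMES = (("bounding_box_train", "train"), ("bounding_box_test", "test"))


def arrange_market(extracted_root, dataset_root="./dataset/market"):
    """Move the two bounding-box folders into dataset_root as train/ and test/.

    extracted_root is wherever the Market1501 archive was unpacked
    (a hypothetical location chosen by the user).
    Returns the list of destination folders that were created.
    """
    extracted_root = Path(extracted_root)
    dataset_root = Path(dataset_root)
    dataset_root.mkdir(parents=True, exist_ok=True)

    moved = []
    for src_name, dst_name in RENAMES:
        src = extracted_root / src_name
        dst = dataset_root / dst_name
        if dst.exists():
            continue  # already arranged, leave it untouched
        if not src.is_dir():
            raise FileNotFoundError(f"expected folder not found: {src}")
        shutil.move(str(src), str(dst))
        moved.append(dst)
    return moved
```

Calling arrange_market a second time is a no-op, because existing train/ and test/ folders are skipped rather than overwritten.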

3) Train a model

DeepFashion

python train.py --name=DPTN_fashion --model=DPTN --dataset_mode=fashion --dataroot=./dataset/fashion --batchSize 32 --gpu_id=0

Market1501

python train.py --name=DPTN_market --model=DPTN --dataset_mode=market --dataroot=./dataset/market --dis_layer=3 --lambda_g=5 --lambda_rec 2 --t_s_ratio=0.8 --save_latest_freq=10400 --batchSize 32 --gpu_id=0
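The two training commands share most of their options and differ only in the dataset-specific ones. As a rough sketch, the options can be kept in one place and the command lines generated from them. The helper build_train_command is hypothetical; the flag names and values are copied from the commands above, and the sketch assumes train.py parses its options with argparse, which treats --key=value and --key value as equivalent.

```python
# Options shared by both training commands (values copied from the commands above).
COMMON_OPTIONS = {"model": "DPTN", "batchSize": 32, "gpu_id": 0}

# Dataset-specific options (values copied from the commands above).
TRAIN_OPTIONS = {
    "fashion": {
        "name": "DPTN_fashion",
        "dataset_mode": "fashion",
        "dataroot": "./dataset/fashion",
    },
    "market": {
        "name": "DPTN_market",
        "dataset_mode": "market",
        "dataroot": "./dataset/market",
        "dis_layer": 3,
        "lambda_g": 5,
        "lambda_rec": 2,
        "t_s_ratio": 0.8,
        "save_latest_freq": 10400,
    },
}


def build_train_command(dataset):
    """Return the train.py invocation for dataset as an argument list."""
    options = {**TRAIN_OPTIONS[dataset], **COMMON_OPTIONS}
    # ASSUMPTION: train.py uses argparse, so the --key=value form is accepted.
    return ["python", "train.py"] + [f"--{key}={value}" for key, value in options.items()]


if __name__ == "__main__":
    for dataset in TRAIN_OPTIONS:
        print(" ".join(build_train_command(dataset)))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting issues.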

4) Test the model

You can directly download our test results from Google Drive: DeepFashion, Market1501.

DeepFashion

python test.py --name=DPTN_fashion --model=DPTN --dataset_mode=fashion --dataroot=./dataset/fashion --which_epoch latest --results_dir ./results/DPTN_fashion --batchSize 1 --gpu_id=0

Market1501

python test.py --name=DPTN_market --model=DPTN --dataset_mode=market --dataroot=./dataset/market --which_epoch latest --results_dir=./results/DPTN_market --batchSize 1 --gpu_id=0

5) Evaluation

We adopt SSIM, PSNR, FID and LPIPS for the evaluation.

DeepFashion

python -m metrics.metrics --gt_path=./dataset/fashion/test --distorated_path=./results/DPTN_fashion --fid_real_path=./dataset/fashion/train --name=./fashion

Market1501

python -m metrics.metrics --gt_path=./dataset/market/test --distorated_path=./results/DPTN_market --fid_real_path=./dataset/market/train --name=./market --market
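Of the four metrics, PSNR is the simplest to state: it is 10 * log10(MAX^2 / MSE), where MSE is the mean squared error between the ground-truth and generated pixel values and MAX is the largest possible pixel value. The sketch below illustrates that textbook definition on flat lists of pixel values. It is not the implementation in metrics/metrics.py, which may differ in details such as colour handling or value range, and the function name psnr and the default max_value of 255 are assumptions made for this example.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Textbook definition: 10 * log10(max_value^2 / MSE).
    ASSUMPTION: pixels are in the range [0, max_value], 8-bit by default.
    """
    if len(reference) != len(generated) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    mse = sum((a - b) ** 2 for a, b in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [0, 128, 255, 64]
    prediction = [0, 130, 255, 60]
    print(f"PSNR: {psnr(ground_truth, prediction):.2f} dB")
```

Higher PSNR and SSIM indicate closer agreement with the ground truth, while lower FID and LPIPS are better.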

6) Pre-trained Model

Our pre-trained models can be downloaded from Google Drive: DeepFashion, Market1501.

Citation

Acknowledgement

We build our project based on pix2pix. Some dataset preprocessing methods are derived from PATN.
