Official Implementation and Dataset of "PPR10K: A Large-Scale Portrait Photo Retouching Dataset with Human-Region Mask and Group-Level Consistency", CVPR 2021

Portrait Photo Retouching with PPR10K

Paper | Supplementary Material

PPR10K: A Large-Scale Portrait Photo Retouching Dataset with Human-Region Mask and Group-Level Consistency
Jie Liang*, Hui Zeng*, Miaomiao Cui, Xuansong Xie and Lei Zhang.
In CVPR 2021.

The proposed Portrait Photo Retouching dataset (PPR10K) is a large-scale and diverse dataset that contains:

  • 11,161 high-quality raw portrait photos (resolutions from 4K to 8K) in 1,681 groups;
  • 3 versions of manually retouched targets for all photos, provided by 3 expert retouchers;
  • full-resolution human-region masks of all photos.

Samples

Two example groups of photos from the PPR10K dataset. Top: the raw photos; Bottom: the retouched results from expert-a and the human-region masks. The raw photos exhibit poor visual quality and large variance in subject views, background contexts, lighting conditions and camera settings. In contrast, the retouched results demonstrate both good visual quality (with human-region priority) and group-level consistency.

This dataset is the first of its kind to consider the two special and practical requirements of the portrait photo retouching task, i.e., Human-Region Priority and Group-Level Consistency. Three main challenges are expected to be tackled in follow-up research:

  • Flexible and content-adaptive models for such a diverse task regarding both image contents and lighting conditions;
  • Highly efficient models to process practical resolutions from 4K to 8K;
  • Robust and stable models to meet the requirement of group-level consistency.

Agreement

  • All files in the PPR10K dataset are available for non-commercial research purposes only.
  • You agree not to reproduce, duplicate, copy, sell, trade, resell or exploit for any commercial purposes, any portion of the images and any portion of derived data.

Overview

All data is hosted on Google Drive, OneDrive and Baidu Netdisk (百度网盘, access code: mrwn):

Path | Size | Files | Format | Description
PPR10K-dataset | 406 GB | 176,072 | | Main folder
├  raw | 313 GB | 11,161 | RAW | All photos in raw format (.CR2, .NEF, .ARW, etc.)
├  xmp_source | 130 MB | 11,161 | XMP | Default meta-file of the raw photos in CameraRaw, used in our data augmentation
├  xmp_target_a | 130 MB | 11,161 | XMP | CameraRaw meta-file of the raw photos recording the full adjustments by expert a
├  xmp_target_b | 130 MB | 11,161 | XMP | CameraRaw meta-file of the raw photos recording the full adjustments by expert b
├  xmp_target_c | 130 MB | 11,161 | XMP | CameraRaw meta-file of the raw photos recording the full adjustments by expert c
├  masks_full | 697 MB | 11,161 | PNG | Full-resolution human-region masks in binary format
├  masks_360p | 56 MB | 11,161 | PNG | 360p human-region masks for fast training and validation
├  train_val_images_tif_360p | 91 GB | 97,894 | TIF | 360p source (16-bit TIFF, with 5 versions of augmented images) and target (8-bit TIFF) images for fast training and validation
├  pretrained_models | 268 MB | 12 | PTH | Pretrained models for all 3 versions
└  hists | 624 KB | 39 | PNG | Overall statistics of the dataset
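
After downloading, the per-folder file counts in the table can be compared against what is on disk. The following is a minimal sketch, not part of the official code: the names `EXPECTED_COUNTS`, `count_files` and `check_counts` are hypothetical, and it assumes that every file of a folder lives somewhere beneath it, so files are counted recursively.

```python
from pathlib import Path

# Expected file counts per subfolder, copied from the table above.
EXPECTED_COUNTS = {
    "raw": 11161,
    "xmp_source": 11161,
    "xmp_target_a": 11161,
    "xmp_target_b": 11161,
    "xmp_target_c": 11161,
    "masks_full": 11161,
    "masks_360p": 11161,
    "train_val_images_tif_360p": 97894,
    "pretrained_models": 12,
    "hists": 39,
}


def count_files(folder):
    """Count regular files under `folder`, recursing into subfolders."""
    folder = Path(folder)
    if not folder.is_dir():
        return 0
    return sum(1 for p in folder.rglob("*") if p.is_file())


def check_counts(dataset_root, expected=EXPECTED_COUNTS):
    """Return {subfolder: (found, expected)} for every mismatching subfolder."""
    root = Path(dataset_root)
    mismatches = {}
    for name, want in expected.items():
        found = count_files(root / name)
        if found != want:
            mismatches[name] = (found, want)
    return mismatches


# The per-folder counts add up to the total listed for the main folder.
assert sum(EXPECTED_COUNTS.values()) == 176072
```

Calling `check_counts("/path/to/PPR10K-dataset")` returns an empty dictionary when every folder matches. Hidden files created by the operating system or a sync client are counted too, so a small surplus does not necessarily mean a broken download.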

You can directly use the 360p training and validation files we provide (540x360 or 360x540 resolution, sRGB color space), which comprise the photos, 5 versions of augmented photos and the corresponding human-region masks. Following the settings in our paper, train with the first 8,875 files and validate with the last 2,286 files.
To customize your data (e.g., augment the training samples with respect to illumination and color, or obtain photos at higher or full resolution), see the instructions.
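
The split described above (first 8,875 files for training, last 2,286 for validation) can be expressed as a positional slice. This is a minimal sketch under stated assumptions, not the repository's data loader: `split_train_val` and `is_360p` are hypothetical helpers, and the caller is assumed to pass the file names already in the dataset's own order, since a plain lexicographic sort may not reproduce it.

```python
TRAIN_COUNT = 8875  # first files, used for training
VAL_COUNT = 2286    # last files, used for validation


def split_train_val(ordered_names, train_count=TRAIN_COUNT, val_count=VAL_COUNT):
    """Split file names into (train, val) by position.

    Assumption: `ordered_names` is already in the dataset's file order.
    """
    names = list(ordered_names)
    if len(names) != train_count + val_count:
        raise ValueError(
            f"expected {train_count + val_count} files, got {len(names)}"
        )
    return names[:train_count], names[train_count:]


def is_360p(width, height):
    """True for the two 360p shapes named above: 540x360 or 360x540."""
    return (width, height) in {(540, 360), (360, 540)}
```

The length check makes an incomplete download fail loudly instead of silently shifting the boundary between the two subsets.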

Training and Validating the PPR using 3DLUT

Installation

  • Clone this repo.
git clone https://github.com/csjliang/PPR10K
cd PPR10K/code_3DLUT/
  • Install dependencies.
pip install -r requirements.txt
  • Build. Modify the CUDA path in trilinear_cpp/setup.sh to match your system, then run:
cd trilinear_cpp
sh setup.sh

Training

  • Training without the HRP (human-region priority) and GLC (group-level consistency) strategies, saving models:
python train.py --data_path [path_to_dataset] --gpu_id [gpu_id] --use_mask False --output_dir [path_to_save_models]
  • Training with HRP and without GLC, saving models:
python train.py --data_path [path_to_dataset] --gpu_id [gpu_id] --use_mask True --output_dir [path_to_save_models]
  • Training without HRP and with GLC, saving models:
python train_GLC.py --data_path [path_to_dataset] --gpu_id [gpu_id] --use_mask False --output_dir [path_to_save_models]
  • Training with both HRP and GLC, saving models:
python train_GLC.py --data_path [path_to_dataset] --gpu_id [gpu_id] --use_mask True --output_dir [path_to_save_models]
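
The four commands above differ only in the script (train.py or train_GLC.py) and in the value of --use_mask. A small wrapper can make that mapping explicit when running all four setups. This is a sketch, not part of the repository: `training_command` and `run_all` are hypothetical helpers, the output folder names such as `hrp1_glc0` are an invented naming scheme, and only the flags shown above are used.

```python
import subprocess


def training_command(data_path, gpu_id, output_dir, use_hrp, use_glc):
    """Build the command line for one of the four training setups.

    use_hrp sets the --use_mask value; use_glc picks train_GLC.py over train.py.
    """
    script = "train_GLC.py" if use_glc else "train.py"
    return [
        "python", script,
        "--data_path", str(data_path),
        "--gpu_id", str(gpu_id),
        "--use_mask", "True" if use_hrp else "False",
        "--output_dir", str(output_dir),
    ]


def run_all(data_path, gpu_id, output_root):
    """Run the four HRP/GLC combinations one after another."""
    for use_hrp in (False, True):
        for use_glc in (False, True):
            # Hypothetical naming scheme for output folders, e.g. "hrp1_glc0".
            out = f"{output_root}/hrp{int(use_hrp)}_glc{int(use_glc)}"
            subprocess.run(
                training_command(data_path, gpu_id, out, use_hrp, use_glc),
                check=True,  # stop at the first failing run
            )
```

`run_all` must be called from PPR10K/code_3DLUT/ so that the two scripts resolve; `training_command` alone can be used to print or log the exact command before launching it.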

Evaluation

  • Generate the retouched results:
python validation.py --data_path [path_to_dataset] --gpu_id [gpu_id] --model_dir [path_to_models]
  • Use MATLAB to calculate the measures reported in our paper:
calculate_metrics(source_dir, target_dir, mask_dir)

Pretrained Models

  • Move the downloaded pretrained models into saved_models/:
mv your/path/to/pretrained_models/* saved_models/
  • Specify --model_dir and --epoch (-1) to validate with the pretrained models or to initialize training from them, e.g.,
python validation.py --data_path [path_to_dataset] --gpu_id [gpu_id] --model_dir mask_noglc_a --epoch -1
python train.py --data_path [path_to_dataset] --gpu_id [gpu_id] --use_mask True --output_dir mask_noglc_a --epoch -1

Citation

If you use this dataset or code for your research, please cite our paper.

@inproceedings{jie2021PPR10K,
  title={PPR10K: A Large-Scale Portrait Photo Retouching Dataset with Human-Region Mask and Group-Level Consistency},
  author={Liang, Jie and Zeng, Hui and Cui, Miaomiao and Xie, Xuansong and Zhang, Lei},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2021}
}

Related Projects

3D LUT

Contact

Should you have any questions, please contact me via [email protected].
