Official implementation of the paper 'High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network' in CVPR 2021

Last update: Dec 26, 2022

LPTN

Paper | Supplementary Material | Poster

High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network
Jie Liang*, Hui Zeng*, and Lei Zhang.
In CVPR 2021.

Abstract

Existing image-to-image translation (I2IT) methods are either constrained to low-resolution images or long inference time due to their heavy computational burden on the convolution of high-resolution feature maps. In this paper, we focus on speeding-up the high-resolution photorealistic I2IT tasks based on closed-form Laplacian pyramid decomposition and reconstruction. Specifically, we reveal that the attribute transformations, such as illumination and color manipulation, relate more to the low-frequency component, while the content details can be adaptively refined on high-frequency components. We consequently propose a Laplacian Pyramid Translation Network (LPTN) to simultaneously perform these two tasks, where we design a lightweight network for translating the low-frequency component with reduced resolution and a progressive masking strategy to efficiently refine the high-frequency ones. Our model avoids most of the heavy computation consumed by processing high-resolution feature maps and faithfully preserves the image details. Extensive experimental results on various tasks demonstrate that the proposed method can translate 4K images in real-time using one normal GPU while achieving comparable transformation performance against existing methods.

Overall pipeline of the LPTN: (figure not reproduced here)

For more details, please refer to our paper.
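
The closed-form decomposition and reconstruction that the abstract relies on can be illustrated with a short NumPy sketch. It is not the repository's implementation: the 2x2 box-filter downsampling, the nearest-neighbour upsampling, and the function names `downsample`, `upsample`, `decompose` and `reconstruct` are all assumptions made for illustration. It only shows that an image splits into high-frequency residuals plus a small low-frequency component, and that the split is exactly invertible.

```python
import numpy as np


def downsample(img):
    """Halve height and width by averaging each 2x2 block (H and W must be even)."""
    h, w = img.shape[:2]
    return img.reshape(h // 2, 2, w // 2, 2, -1).mean(axis=(1, 3))


def upsample(img):
    """Double height and width by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


def decompose(img, levels=3):
    """Split an HxWxC float array into [high_0, ..., high_{levels-1}, low]."""
    pyramid = []
    current = img
    for _ in range(levels):
        smaller = downsample(current)
        # High-frequency residual: what is lost by going down one level.
        pyramid.append(current - upsample(smaller))
        current = smaller
    pyramid.append(current)  # low-frequency component at reduced resolution
    return pyramid


def reconstruct(pyramid):
    """Invert decompose(): upsample the low band and add back each residual."""
    current = pyramid[-1]
    for high in reversed(pyramid[:-1]):
        current = upsample(current) + high
    return current


# Hypothetical input: a random 64x64 RGB image with values in [0, 1).
image = np.random.default_rng(0).random((64, 64, 3))
bands = decompose(image, levels=3)
print([band.shape for band in bands])
print("exact reconstruction:", np.allclose(reconstruct(bands), image))
```

In this sketch the low-frequency band has 1/64 of the pixels of the input, which is why translating only that band with a network, and refining the residuals cheaply, avoids most of the high-resolution computation.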

Getting started

Clone this repo.

git clone https://github.com/csjliang/LPTN
cd LPTN

Install dependencies. (Requires Python 3, an NVIDIA GPU, and CUDA. Anaconda is recommended.)

pip install -r requirement.txt

Download the dataset (FiveK in 480p) and create the LMDB files (to accelerate training).

PYTHONPATH="./:${PYTHONPATH}" python scripts/data_preparation/download_datasets.py
PYTHONPATH="./:${PYTHONPATH}" python scripts/data_preparation/create_lmdb.py

Training

First, check and adapt the yml file options/train/LPTN/train_FiveK.yml, then run one of the following.

Single GPU:

PYTHONPATH="./:${PYTHONPATH}" CUDA_VISIBLE_DEVICES=0 python codes/train.py -opt options/train/LPTN/train_FiveK.yml

Distributed Training:

PYTHONPATH="./:${PYTHONPATH}" CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 codes/train.py -opt options/train/LPTN/train_FiveK.yml --launcher pytorch

Training files (logs, models, training states and visualizations) will be saved in the directory ./experiments/{name}
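
The two training commands above differ only in the launcher and in how many GPUs are exposed. The following sketch makes that relationship explicit by deriving `--nproc_per_node` from the GPU list. The helper `build_train_command` is hypothetical and not part of the repository; it only assembles the same environment variables and arguments shown above.

```python
import os
import shlex


def build_train_command(gpu_ids, opt_path, master_port=4321):
    """Return (env, argv) mirroring the single-GPU or distributed command."""
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    env = dict(os.environ)
    # Same effect as PYTHONPATH="./:${PYTHONPATH}" in the shell commands.
    env["PYTHONPATH"] = "./:" + env.get("PYTHONPATH", "")
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)
    if len(gpu_ids) == 1:
        argv = ["python", "codes/train.py", "-opt", opt_path]
    else:
        argv = [
            "python", "-m", "torch.distributed.launch",
            f"--nproc_per_node={len(gpu_ids)}",  # one process per visible GPU
            f"--master_port={master_port}",
            "codes/train.py", "-opt", opt_path,
            "--launcher", "pytorch",
        ]
    return env, argv


env, argv = build_train_command([0, 1, 2, 3], "options/train/LPTN/train_FiveK.yml")
print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], shlex.join(argv))
# To actually launch: subprocess.run(argv, env=env, check=True)
```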

Evaluation

First, check and adapt the yml files options/test/LPTN/test_FiveK.yml and options/test/LPTN/test_speed_FiveK.yml, then

Calculate metrics and save visual results:

PYTHONPATH="./:${PYTHONPATH}" CUDA_VISIBLE_DEVICES=0 python codes/test.py -opt options/test/LPTN/test_FiveK.yml
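
For readers unfamiliar with decibel-valued image metrics, here is a minimal PSNR sketch in NumPy. It assumes that PSNR is among the metrics reported by the test script, which the dB figure in the Notes suggests but this README does not state. The function `psnr` is written for illustration, and the repository's own metric code may differ in color space, cropping, or value range.

```python
import numpy as np


def psnr(result, target, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    result = np.asarray(result, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if result.shape != target.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((result - target) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


# Hypothetical data: a random 8-bit target and a noisy copy of it.
rng = np.random.default_rng(0)
target = rng.integers(0, 256, size=(48, 64, 3)).astype(np.float64)
result = np.clip(target + rng.normal(0.0, 5.0, size=target.shape), 0, 255)
print(f"PSNR: {psnr(result, target):.2f} dB")
```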

Test inference speed:

PYTHONPATH="./:${PYTHONPATH}" CUDA_VISIBLE_DEVICES=0 python codes/test_speed.py -opt options/test/LPTN/test_speed_FiveK.yml
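
As a rough illustration of what a speed test measures, the sketch below times a callable after a few warm-up calls and reports the average latency. It is not the repository's test_speed.py; `measure_latency` and its parameters are hypothetical. The optional `sync` argument is there because GPU work is asynchronous: with PyTorch one would typically pass `torch.cuda.synchronize` so the clock is read only after the queued kernels have finished.

```python
import time


def measure_latency(fn, warmup=5, runs=20, sync=None):
    """Return the average seconds per call of fn() over `runs` timed calls.

    sync: optional callable invoked before each clock read, for example
    torch.cuda.synchronize when fn launches GPU work.
    """
    for _ in range(warmup):
        fn()  # warm-up calls are excluded from the measurement
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    if sync is not None:
        sync()
    return (time.perf_counter() - start) / runs


# Hypothetical CPU workload standing in for a model's forward pass.
latency = measure_latency(lambda: sum(range(10_000)))
print(f"{latency * 1e3:.3f} ms per call")
```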

Evaluation files (logs and visualizations) will be saved in the directory ./results/{name}

Use Pretrained Models

Download the pretrained model from GoogleDrive and move it to the directory experiments/pretrained_models.
Then set the path option pretrain_network_g in test_FiveK.yml to the downloaded file and run the evaluation.

Notes

We have optimized the training process and improved the performance (now 22.9 dB on FiveK at 480p).
We will release the day2night and sum2win datasets later.

Citation

If you use this dataset or code for your research, please cite our paper.

@inproceedings{jie2021LPTN,
  title={High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network},
  author={Liang, Jie and Zeng, Hui and Zhang, Lei},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2021}
}

Acknowledgement

We borrowed the training and validation framework from the excellent BasicSR project.

Contact

Should you have any questions, please contact me via [email protected].
