PyTorch implementation of paper "StarEnhancer: Learning Real-Time and Style-Aware Image Enhancement" (ICCV 2021 Oral)

Last update: Dec 28, 2022

Overview

StarEnhancer

StarEnhancer: Learning Real-Time and Style-Aware Image Enhancement (ICCV 2021 Oral)

Abstract: Image enhancement is a subjective process whose targets vary with user preferences. In this paper, we propose a deep learning-based image enhancement method covering multiple tonal styles using only a single model dubbed StarEnhancer. It can transform an image from one tonal style to another, even if that style is unseen. With a simple one-time setting, users can customize the model to make the enhanced images more in line with their aesthetics. To make the method more practical, we propose a well-designed enhancer that can process a 4K-resolution image at over 200 FPS yet surpasses contemporaneous single-style image enhancement methods in terms of PSNR, SSIM, and LPIPS. Finally, our proposed enhancement method has good interactability, which allows the user to fine-tune the enhanced image using intuitive options.

Getting started

Install

We tested the code on PyTorch 1.8.1 + CUDA 11.1 + cuDNN 8.0.5; nearby versions should also work.

Install PyTorch and torchvision from http://pytorch.org.
Install other requirements:

pip install -r requirements.txt

We mainly trained the model on four RTX 2080 Ti GPUs, but training with a smaller mini-batch size also works.

Prepare

You can generate the dataset yourself or download the one we generated.

The final directory layout should look like this:

┬─ save_model
│   ├─ stylish.pth.tar
│   └─ ... (model & embedding)
└─ data
    ├─ train
    │   ├─ 01-Experts-A
    │   │   ├─ a0001.jpg
    │   │   └─ ... (id.jpg)
    │   └─ ... (style folder)
    ├─ valid
    │   └─ ... (style folder)
    └─ test
        └─ ... (style folder)
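
Before training or testing, it can help to confirm that the layout above is in place. The following is a minimal sketch and not part of the repository: the function name `check_layout` and its return format are hypothetical, and it only checks the entries named in the tree above.

```python
from pathlib import Path


def check_layout(root):
    """Return a list of expected entries that are missing under `root`.

    Hypothetical helper: it only checks the paths named in the tree above.
    """
    root = Path(root)
    missing = []

    # The style encoder checkpoint listed in the tree.
    if not (root / "save_model" / "stylish.pth.tar").is_file():
        missing.append("save_model/stylish.pth.tar")

    # Each split must exist and contain at least one style folder.
    for split in ("train", "valid", "test"):
        split_dir = root / "data" / split
        if not split_dir.is_dir():
            missing.append(f"data/{split}")
        elif not any(p.is_dir() for p in split_dir.iterdir()):
            missing.append(f"data/{split}/<style folder>")

    return missing
```

Calling `check_layout(".")` from the repository root returns an empty list when every expected entry is found.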

Download

Data and pretrained models are available on Google Drive.

Generate

Download raw data from MIT-Adobe FiveK Dataset.
Download the modified Lightroom database fivek.lrcat, and replace the original database with it.
Export the dataset in JPEG format with quality 100; see this issue for details.
Run generate_dataset.py in the data folder to generate the dataset.

Train

Firstly, train the style encoder:

python train_stylish.py

Secondly, fetch the style embedding for each sample in the train set:

python fetch_embedding.py

Lastly, train the curve encoder and mapping network:

python train_enhancer.py
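
Each stage appears to build on the output of the previous one, so the three scripts need to run in the order shown. As a convenience, a small driver such as the sketch below could chain them and stop at the first failure. The `run_pipeline` function and its `runner` parameter are hypothetical; only the three script names come from this README.

```python
import subprocess
import sys

# Script names as given in this README, in the order they must run.
STAGES = ["train_stylish.py", "fetch_embedding.py", "train_enhancer.py"]


def run_pipeline(stages=STAGES, runner=subprocess.run):
    """Run each stage in order; return the failing script name, or None."""
    for script in stages:
        # Use the current interpreter so the active environment is reused.
        result = runner([sys.executable, script])
        if result.returncode != 0:
            return script
    return None
```

The `runner` argument exists only so the function can be exercised without launching real training runs; calling `run_pipeline()` with no arguments uses `subprocess.run`.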

Test

Just run:

python test.py

Computing LPIPS requires about 10 GB of GPU memory. If an out-of-memory (OOM) error occurs, replace the following line

lpips_val = loss_fn_alex(output * 2 - 1, target_img * 2 - 1).item()

with

lpips_val = 0
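
As an alternative to editing the line by hand, the LPIPS call could be wrapped so that it falls back to 0 only when memory actually runs out. This is a sketch under assumptions: `safe_metric` is a hypothetical helper, and it relies on PyTorch reporting a CUDA OOM as a `RuntimeError` whose message contains "out of memory".

```python
def safe_metric(compute, fallback=0.0):
    """Call `compute()`; return `fallback` if it fails with an out-of-memory error."""
    try:
        return compute()
    except RuntimeError as err:
        # Assumption: CUDA OOM surfaces as a RuntimeError mentioning "out of memory".
        if "out of memory" in str(err).lower():
            return fallback
        raise  # Any other error is a real problem and should not be hidden.
```

In test.py this could be used as `lpips_val = safe_metric(lambda: loss_fn_alex(output * 2 - 1, target_img * 2 - 1).item())`. As with the manual replacement, any image that falls back to 0 makes the averaged LPIPS unreliable.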

Notes

Due to agreements, we are unable to release part of the source code. This repository provides a pure Python implementation for research use. It differs from the paper in the following ways:

The repository uses a ResNet-18 w/o BN as the curve encoder's backbone, and the paper uses a more lightweight model.
The paper uses CUDA to implement the color transform function, and the repository uses torch.gather to implement it.
The repository removes some tricks used in training lightweight models.

Overall, this repository can achieve higher performance, but will be slightly slower.
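
To illustrate the second difference above: a gather-based color transform amounts to an index lookup into a sampled curve. The sketch below shows that idea in plain Python for a single channel. It is an assumption about the general technique rather than the repository's code, and `apply_curve` is a hypothetical name; `torch.gather` performs the same kind of indexed lookup on whole tensors at once.

```python
def apply_curve(pixels, curve):
    """Map pixel values in [0, 1] through a sampled tone curve by index lookup."""
    last = len(curve) - 1
    out = []
    for p in pixels:
        p = min(max(p, 0.0), 1.0)   # clamp to the valid range
        idx = int(p * last + 0.5)   # nearest sample index
        out.append(curve[idx])      # the "gather" step
    return out
```

With a three-sample curve `[0.0, 0.3, 1.0]`, a mid-gray input of 0.5 maps to the middle sample, 0.3.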

Comments

Multi-style, unpaired setting

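Hello. In the multi-style, unpaired setting, would it be possible to swap the positions of source and target, and then pass the resulting output_A and output_B through the enhancer again to obtain recover_A and recover_B? The losses would then be l1_loss(source, recover_A) and l1_loss(target, recover_B), plus Triplet_loss(output_A, target, source) and Triplet_loss(output_B, source, target).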

import torch

# AverageMeter, Feature_Extract_Model, and args are not shown in the post;
# they are assumed to be defined elsewhere in the poster's code.
def train(train_loader, mapping, enhancer, criterion, optimizer):
    losses = AverageMeter()
    criterionTriplet = torch.nn.TripletMarginLoss(margin=1.0, p=2)
    FEModel = Feature_Extract_Model().cuda()

    mapping.train()
    enhancer.train()

    for (source_img, source_center, target_img, target_center) in train_loader:
        source_img = source_img.cuda(non_blocking=True)
        source_center = source_center.cuda(non_blocking=True)
        target_img = target_img.cuda(non_blocking=True)
        target_center = target_center.cuda(non_blocking=True)

        # Map each style center to a style code.
        style_A = mapping(source_center)
        style_B = mapping(target_center)

        # Translate A -> B and B -> A, then translate each result back (cycle).
        output_A = enhancer(source_img, style_A, style_B)
        output_B = enhancer(target_img, style_B, style_A)
        recoverA = enhancer(output_A, style_B, style_A)
        recoverB = enhancer(output_B, style_A, style_B)

        # Features used by the triplet loss.
        source_img_feature = FEModel(source_img)
        target_img_feature = FEModel(target_img)
        output_A_feature = FEModel(output_A)
        output_B_feature = FEModel(output_B)

        # Cycle-consistency L1 loss on the recovered images.
        loss_l1 = criterion(recoverA, source_img) + criterion(recoverB, target_img)
        # Triplet loss (anchor, positive, negative): pull each output toward the
        # image in the desired style and away from its own input.
        loss_triplet = criterionTriplet(output_B_feature, source_img_feature, target_img_feature) + \
                       criterionTriplet(output_A_feature, target_img_feature, source_img_feature)
        loss = loss_l1 + loss_triplet

        losses.update(loss.item(), args.t_batch_size)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    return losses.avg

opened by jxust01 4

Questions about dataset preparation

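Hello. I would like to run your project on my own data. I currently have one set of input/output pairs. How were the remaining four of the A-E effects in the training data generated? Can these target-effect images be unpaired? If there is only one style, can the same data be copied into all of the A-E targets? The single-style training in train_enhancer.py requires an embeddings.npy file; is this file necessary for single-style training?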

opened by zener90818 4
Dataset processing

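Hello. In the fivek.lrcat you provide, I could not find the "(default) input with ExpertC" mentioned in the DeepUPE issue. For the single-style experiment, is the input "InputAsShotZeroed" or "(Q)InputZeroed with ExpertC WhiteBalance", as shown in the figure below?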

opened by madfff 2
The results are not the same as the paper
I am the author.

Several people have emailed me to ask why the performance of the open-source model does not match the results in the paper. As stated in the README, the released model is not the model from the paper, but its performance is similar. The expected results are: PSNR: 25.41, SSIM: 0.942, LPIPS: 0.085

If your results differ from these, the JPEG codec may be different; this depends on the OpenCV version and how it was installed.

You can uninstall OpenCV (whether it was installed with pip or conda) and reinstall it with pip (it must be pip, because conda installs a different JPEG codec):

pip install opencv-python==4.5.5.62
opened by IDKiro 0
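
For the codec issue described above, the installed pip version can be read with the standard library before rerunning the test. This is a minimal sketch: `opencv_matches` is a hypothetical helper, and it only looks up the pip distribution `opencv-python`, so an OpenCV installed through conda would most likely be reported as missing.

```python
from importlib import metadata

EXPECTED = "4.5.5.62"  # version given in the issue above


def opencv_matches(expected=EXPECTED, lookup=metadata.version):
    """Return True if the pip package opencv-python is installed at `expected`."""
    try:
        return lookup("opencv-python") == expected
    except metadata.PackageNotFoundError:
        return False
```

A `False` result suggests reinstalling with the pip command above before comparing metrics.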

Releases(others)

others(May 25, 2022)

Source code(tar.gz)
Source code(zip)
BasicEnhancer.zip(79.10 KB)

Owner

IDKiro

Stroll in the abyss
