Revisiting Global Statistics Aggregation for Improving Image Restoration

Related tags

Deep Learningtlsc
Overview

PWC PWC

Revisiting Global Statistics Aggregation for Improving Image Restoration

Xiaojie Chu, Liangyu Chen, Chengpeng Chen, Xin Lu

Paper: https://arxiv.org/pdf/2112.04491.pdf

Introduction

This repository is an official implementation of the TLSC. We propose Test-time Local Statistics Converter (TLSC), which replaces the statistic aggregation region from the entire spatial dimension to the local window, to mitigate the issue between training and testing. Our approach has no requirement of retraining or finetuning, and only induces marginal extra costs.

arch

Illustration of training and testing schemes of image restoration. From left to right: image from the dataset; input for the restorer (patches or entire-image depend on the scheme); aggregating statistics from the feature map. For (a), (b), and (c), statistics are aggregated along the entire spatial dimension. (d) Ours, statistics are aggregated in a local region for each pixel.

Abstract

Global spatial statistics, which are aggregated along entire spatial dimensions, are widely used in top-performance image restorers. For example, mean, variance in Instance Normalization (IN) which is adopted by HINet, and global average pooling (ie, mean) in Squeeze and Excitation (SE) which is applied to MPRNet. This paper first shows that statistics aggregated on the patches-based/entire-image-based feature in the training/testing phase respectively may distribute very differently and lead to performance degradation in image restorers. It has been widely overlooked by previous works. To solve this issue, we propose a simple approach, Test-time Local Statistics Converter (TLSC), that replaces the region of statistics aggregation operation from global to local, only in the test time. Without retraining or finetuning, our approach significantly improves the image restorer's performance. In particular, by extending SE with TLSC to the state-of-the-art models, MPRNet boost by 0.65 dB in PSNR on GoPro dataset, achieves 33.31 dB, exceeds the previous best result 0.6 dB. In addition, we simply apply TLSC to the high-level vision task, ie, semantic segmentation, and achieves competitive results. Extensive quantity and quality experiments are conducted to demonstrate TLSC solves the issue with marginal costs while significant gain.

Usage

Installation

This implementation based on BasicSR which is a open source toolbox for image/video restoration tasks.

git clone https://github.com/megvii-research/tlsc.git
cd tlsc
pip install -r requirements.txt
python setup.py develop

Quick Start (Single Image Inference)

Main Results

Method GoPro GoPro HIDE HIDE REDS REDS
PSNR SSIM PSNR SSIM PSNR SSIM
HINet 32.71 0.959 30.33 0.932 28.83 0.863
HINet-local (ours) 33.08 0.962 30.66 0.936 28.96 0.865
MPRNet 32.66 0.959 30.96 0.939 - -
MPRNet-local (ours) 33.31 0.964 31.19 0.942 - -

Evaluation

Image Deblur - GoPro dataset (Click to expand)
  • prepare data

    • mkdir ./datasets/GoPro

    • download the test set in ./datasets/GoPro/test (refer to MPRNet)

    • it should be like:

      ./datasets/
      ./datasets/GoPro/test/
      ./datasets/GoPro/test/input/
      ./datasets/GoPro/test/target/
  • eval

    • download pretrained HINet to ./experiments/pretrained_models/HINet-GoPro.pth

    • python basicsr/test.py -opt options/test/HIDE/MPRNetLocal-HIDE.yml

    • download pretrained MPRNet to ./experiments/pretrained_models/MPRNet-GoPro.pth

    • python basicsr/test.py -opt options/test/HIDE/MPRNetLocal-HIDE.yml

Image Deblur - HIDE dataset (Click to expand)
  • prepare data

    • mkdir ./datasets/HIDE

    • download the test set in ./datasets/HIDE/test (refer to MPRNet)

    • it should be like:

      ./datasets/
      ./datasets/HIDE/test/
      ./datasets/HIDE/test/input/
      ./datasets/HIDE/test/target/
  • eval

    • download pretrained HINet to ./experiments/pretrained_models/HINet-GoPro.pth

    • python basicsr/test.py -opt options/test/GoPro/MPRNetLocal-GoPro.yml

    • download pretrained MPRNet to ./experiments/pretrained_models/MPRNet-GoPro.pth

    • python basicsr/test.py -opt options/test/GoPro/MPRNetLocal-GoPro.yml

Image Deblur - REDS dataset (Click to expand)
  • prepare data

    • mkdir ./datasets/REDS

    • download the val set from val_blur, val_sharp to ./datasets/REDS/ and unzip them.

    • it should be like

      ./datasets/
      ./datasets/REDS/
      ./datasets/REDS/val/
      ./datasets/REDS/val/val_blur_jpeg/
      ./datasets/REDS/val/val_sharp/
      
    • python scripts/data_preparation/reds.py

      • flatten the folders and extract 300 validation images.
  • eval

    • download pretrained HINet to ./experiments/pretrained_models/HINet-REDS.pth
    • python basicsr/test.py -opt options/test/REDS/HINetLocal-REDS.yml

Tricks: Change the 'fast_imp: false' (naive implementation) to 'fast_imp: true' (faster implementation) in MPRNetLocal config can achieve faster inference speed.

License

This project is under the MIT license, and it is based on BasicSR which is under the Apache 2.0 license.

Citations

If TLSC helps your research or work, please consider citing TLSC.

@article{chu2021tlsc,
  title={Revisiting Global Statistics Aggregation for Improving Image Restoration},
  author={Chu, Xiaojie and Chen, Liangyu and and Chen, Chengpeng and Lu, Xin},
  journal={arXiv preprint arXiv:2112.04491},
  year={2021}
}

Contact

If you have any questions, please contact [email protected] or [email protected].

Owner
MEGVII Research
Power Human with AI. 持续创新拓展认知边界 非凡科技成就产品价值
MEGVII Research
Sandbox for training deep learning networks

Deep learning networks This repo is used to research convolutional networks primarily for computer vision tasks. For this purpose, the repo contains (

Oleg Sémery 2.7k Jan 01, 2023
[CVPR-2021] UnrealPerson: An adaptive pipeline for costless person re-identification

UnrealPerson: An Adaptive Pipeline for Costless Person Re-identification In our paper (arxiv), we propose a novel pipeline, UnrealPerson, that decreas

ZhangTianyu 70 Oct 10, 2022
Temporal Dynamic Convolutional Neural Network for Text-Independent Speaker Verification and Phonemetic Analysis

TDY-CNN for Text-Independent Speaker Verification Official implementation of Temporal Dynamic Convolutional Neural Network for Text-Independent Speake

Seong-Hu Kim 16 Oct 17, 2022
GAN example for Keras. Cuz MNIST is too small and there should be something more realistic.

Keras-GAN-Animeface-Character GAN example for Keras. Cuz MNIST is too small and there should an example on something more realistic. Some results Trai

160 Sep 20, 2022
Supporting code for "Autoregressive neural-network wavefunctions for ab initio quantum chemistry".

naqs-for-quantum-chemistry This repository contains the codebase developed for the paper Autoregressive neural-network wavefunctions for ab initio qua

Tom Barrett 24 Dec 23, 2022
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

Pretrained Language Model This repository provides the latest pretrained language models and its related optimization techniques developed by Huawei N

HUAWEI Noah's Ark Lab 2.6k Jan 01, 2023
Machine Learning Platform for Kubernetes

Reproduce, Automate, Scale your data science. Welcome to Polyaxon, a platform for building, training, and monitoring large scale deep learning applica

polyaxon 3.2k Dec 23, 2022
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

DeCLIP Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm. Our paper is available in arxiv Updates ** Ou

Sense-GVT 470 Dec 30, 2022
Collection of NLP model explanations and accompanying analysis tools

Thermostat is a large collection of NLP model explanations and accompanying analysis tools. Combines explainability methods from the captum library wi

126 Nov 22, 2022
*ObjDetApp* deploys a pytorch model for object detection

*ObjDetApp* deploys a pytorch model for object detection

Will Chao 1 Dec 26, 2021
Applications using the GTN library and code to reproduce experiments in "Differentiable Weighted Finite-State Transducers"

gtn_applications An applications library using GTN. Current examples include: Offline handwriting recognition Automatic speech recognition Installing

Facebook Research 68 Dec 29, 2022
Useful materials and tutorials for 110-1 NTU DBME5028 (Application of Deep Learning in Medical Imaging)

Useful materials and tutorials for 110-1 NTU DBME5028 (Application of Deep Learning in Medical Imaging)

7 Jun 22, 2022
Pretrained models for Jax/Flax: StyleGAN2, GPT2, VGG, ResNet.

Pretrained models for Jax/Flax: StyleGAN2, GPT2, VGG, ResNet.

Matthias Wright 169 Dec 26, 2022
Api's bulid in Flask perfom to manage Todo Task.

Citymall-task Api's bulid in Flask perfom to manage Todo Task. Installation Requrements : Python: 3.10.0 MongoDB create .env file with variables DB_UR

Aisha Tayyaba 1 Dec 17, 2021
Distributed Arcface Training in Pytorch

Distributed Arcface Training in Pytorch

3 Nov 23, 2021
Tandem Mass Spectrum Prediction with Graph Transformers

MassFormer This is the original implementation of MassFormer, a graph transformer for small molecule MS/MS prediction. Check out the preprint on arxiv

Röst Lab 13 Oct 27, 2022
DSTC10 Track 2 - Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations

DSTC10 Track 2 - Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations This repository contains the data, scripts and baseline co

Alexa 51 Dec 17, 2022
The codes and related files to reproduce the results for Image Similarity Challenge Track 2.

The codes and related files to reproduce the results for Image Similarity Challenge Track 2.

Wenhao Wang 89 Jan 02, 2023
Photo2cartoon - 人像卡通化探索项目 (photo-to-cartoon translation project)

人像卡通化 (Photo to Cartoon) 中文版 | English Version 该项目为小视科技卡通肖像探索项目。您可使用微信扫描下方二维码或搜索“AI卡通秀”小程序体验卡通化效果。

Minivision_AI 3.5k Dec 30, 2022
LIMEcraft: Handcrafted superpixel selectionand inspection for Visual eXplanations

LIMEcraft LIMEcraft: Handcrafted superpixel selectionand inspection for Visual eXplanations The LIMEcraft algorithm is an explanatory method based on

MI^2 DataLab 4 Aug 01, 2022