Pre-Trained Image Processing Transformer (IPT)

Overview

Pre-Trained Image Processing Transformer (IPT)

By Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao. [arXiv]

We study the low-level computer vision task (such as denoising, super-resolution and deraining) and develop a new pre-trained model, namely, image processing transformer (IPT). We present to utilize the well-known ImageNet benchmark for generating a large amount of corrupted image pairs. The IPT model is trained on these images with multi-heads and multi-tails. The pre-trained model can therefore efficiently employed on desired task after fine-tuning. With only one pre-trained model, IPT outperforms the current state-of-the-art methods on various low-level benchmarks.

MindSpore Code

Requirements

  • python 3
  • pytorch == 1.4.0
  • torchvision

Dataset

The benchmark datasets can be downloaded as follows:

For super-resolution:

Set5, Set14, B100, Urban100.

For denoising:

CBSD68, Urban100.

For deraining:

Rain100L.

The result images are converted into YCbCr color space. The PSNR is evaluated on the Y channel only.

Script Description

This is the inference script of IPT, you can following steps to finish the test of image processing tasks, like SR, denoise and derain, via the corresponding pretrained models.

Script Parameter

For details about hyperparameters, see option.py.

Evaluation

Pretrained models

The pretrained models are available in google drive

Evaluation Process

Inference example: For SR x2,x3,x4:

python main.py --dir_data $DATA_PATH --pretrain $MODEL_PATH --data_test Set5+Set14+B100+Urban100 --scale $SCALE

For Denoise 30,50:

python main.py --dir_data $DATA_PATH --pretrain $MODEL_PATH --data_test CBSD68+Urban100 --scale 1 --denoise --sigma $NOISY_LEVEL

For derain:

python main.py --dir_data $DATA_PATH --pretrain $MODEL_PATH --scale 1 --derain

Results

  • Detailed results on image super-resolution task.
Method Scale Set5 Set14 B100 Urban100
VDSR X2 37.53 33.05 31.90 30.77
EDSR X2 38.11 33.92 32.32 32.93
RCAN X2 38.27 34.12 32.41 33.34
RDN X2 38.24 34.01 32.34 32.89
OISR-RK3 X2 38.21 33.94 32.36 33.03
RNAN X2 38.17 33.87 32.32 32.73
SAN X2 38.31 34.07 32.42 33.1
HAN X2 38.27 34.16 32.41 33.35
IGNN X2 38.24 34.07 32.41 33.23
IPT (ours) X2 38.37 34.43 32.48 33.76
Method Scale Set5 Set14 B100 Urban100
VDSR X3 33.67 29.78 28.83 27.14
EDSR X3 34.65 30.52 29.25 28.80
RCAN X3 34.74 30.65 29.32 29.09
RDN X3 34.71 30.57 29.26 28.80
OISR-RK3 X3 34.72 30.57 29.29 28.95
RNAN X3 34.66 30.52 29.26 28.75
SAN X3 34.75 30.59 29.33 28.93
HAN X3 34.75 30.67 29.32 29.10
IGNN X3 34.72 30.66 29.31 29.03
IPT (ours) X3 34.81 30.85 29.38 29.49
Method Scale Set5 Set14 B100 Urban100
VDSR X4 31.35 28.02 27.29 25.18
EDSR X4 32.46 28.80 27.71 26.64
RCAN X4 32.63 28.87 27.77 26.82
SAN X4 32.64 28.92 27.78 26.79
RDN X4 32.47 28.81 27.72 26.61
OISR-RK3 X4 32.53 28.86 27.75 26.79
RNAN X4 32.49 28.83 27.72 26.61
HAN X4 32.64 28.90 27.80 26.85
IGNN X4 32.57 28.85 27.77 26.84
IPT (ours) X4 32.64 29.01 27.82 27.26
  • Super-resolution result

  • Denoising result

  • Derain result

Citation

@misc{chen2020pre,
      title={Pre-Trained Image Processing Transformer}, 
      author={Chen, Hanting and Wang, Yunhe and Guo, Tianyu and Xu, Chang and Deng, Yiping and Liu, Zhenhua and Ma, Siwei and Xu, Chunjing and Xu, Chao and Gao, Wen},
      year={2021},
      eprint={2012.00364},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

Owner
HUAWEI Noah's Ark Lab
Working with and contributing to the open source community in data mining, artificial intelligence, and related fields.
HUAWEI Noah's Ark Lab
Pytorch implementations of Bayes By Backprop, MC Dropout, SGLD, the Local Reparametrization Trick, KF-Laplace, SG-HMC and more

Bayesian Neural Networks Pytorch implementations for the following approximate inference methods: Bayes by Backprop Bayes by Backprop + Local Reparame

1.4k Jan 07, 2023
Multi-Horizon-Forecasting-for-Limit-Order-Books

Multi-Horizon-Forecasting-for-Limit-Order-Books This jupyter notebook is used to demonstrate our work, Multi-Horizon Forecasting for Limit Order Books

Zihao Zhang 116 Dec 23, 2022
Multiple paper open-source codes of the Microsoft Research Asia DKI group

📫 Paper Code Collection (MSRA DKI Group) This repo hosts multiple open-source codes of the Microsoft Research Asia DKI Group. You could find the corr

Microsoft 249 Jan 08, 2023
ACV is a python library that provides explanations for any machine learning model or data.

ACV is a python library that provides explanations for any machine learning model or data. It gives local rule-based explanations for any model or data and different Shapley Values for tree-based mod

Salim Amoukou 85 Dec 27, 2022
Research using Cirq!

ReCirq Research using Cirq! This project contains modules for running quantum computing applications and experiments through Cirq and Quantum Engine.

quantumlib 230 Dec 29, 2022
Retina blood vessel segmentation with a convolutional neural network

Retina blood vessel segmentation with a convolution neural network (U-net) This repository contains the implementation of a convolutional neural netwo

Orobix 1.2k Jan 06, 2023
Pytorch cuda extension of grid_sample1d

Grid Sample 1d pytorch cuda extension of grid sample 1d. Since pytorch only supports grid sample 2d/3d, I extend the 1d version for efficiency. The fo

lyricpoem 24 Dec 03, 2022
Official repository for "Intriguing Properties of Vision Transformers" (2021)

Intriguing Properties of Vision Transformers Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, & Ming-Hsuan Yang P

Muzammal Naseer 155 Dec 27, 2022
My solutions for Stanford University course CS224W: Machine Learning with Graphs Fall 2021 colabs (GNN, GAT, GraphSAGE, GCN)

machine-learning-with-graphs My solutions for Stanford University course CS224W: Machine Learning with Graphs Fall 2021 colabs Course materials can be

Marko Njegomir 7 Dec 14, 2022
O2O-Afford: Annotation-Free Large-Scale Object-Object Affordance Learning (CoRL 2021)

O2O-Afford: Annotation-Free Large-Scale Object-Object Affordance Learning Object-object Interaction Affordance Learning. For a given object-object int

Kaichun Mo 26 Nov 04, 2022
Official Implementation for the paper DeepFace-EMD: Re-ranking Using Patch-wise Earth Mover’s Distance Improves Out-Of-Distribution Face Identification

DeepFace-EMD: Re-ranking Using Patch-wise Earth Mover’s Distance Improves Out-Of-Distribution Face Identification Official Implementation for the pape

Anh M. Nguyen 36 Dec 28, 2022
The official codes for the ICCV2021 Oral presentation "Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework"

P2PNet (ICCV2021 Oral Presentation) This repository contains codes for the official implementation in PyTorch of P2PNet as described in Rethinking Cou

Tencent YouTu Research 208 Dec 26, 2022
Cmsc11 arcade - Final Project for CMSC11

cmsc11_arcade Final Project for CMSC11 Developers: Limson, Mark Vincent Peñafiel

Gregory 1 Jan 18, 2022
A Deep Learning based project for creating line art portraits.

ArtLine The main aim of the project is to create amazing line art portraits. Sounds Intresting,let's get to the pictures!! Model-(Smooth) Model-(Quali

Vijish Madhavan 3.3k Jan 07, 2023
Semi Supervised Learning for Medical Image Segmentation, a collection of literature reviews and code implementations.

Semi-supervised-learning-for-medical-image-segmentation. Recently, semi-supervised image segmentation has become a hot topic in medical image computin

Healthcare Intelligence Laboratory 1.3k Jan 03, 2023
Mmrotate - OpenMMLab Rotated Object Detection Benchmark

OpenMMLab website HOT OpenMMLab platform TRY IT OUT 📘 Documentation | 🛠️ Insta

OpenMMLab 1.2k Jan 04, 2023
A pytorch implementation of Pytorch-Sketch-RNN

Pytorch-Sketch-RNN A pytorch implementation of https://arxiv.org/abs/1704.03477 In order to draw other things than cats, you will find more drawing da

Alexis David Jacq 172 Dec 12, 2022
This repository is the code of the paper "Sparse Spatial Transformers for Few-Shot Learning".

🌟 Sparse Spatial Transformers for Few-Shot Learning This code implements the Sparse Spatial Transformers for Few-Shot Learning(SSFormers). Our code i

chx_nju 38 Dec 13, 2022
Semantic Edge Detection with Diverse Deep Supervision

Semantic Edge Detection with Diverse Deep Supervision This repository contains the code for our IJCV paper: "Semantic Edge Detection with Diverse Deep

Yun Liu 12 Dec 31, 2022
Sound Source Localization for AI Grand Challenge 2021

Sound-Source-Localization Sound Source Localization study for AI Grand Challenge 2021 (sponsored by NC Soft Vision Lab) Preparation 1. Place the data-

sanghoon 19 Mar 29, 2022