Official PyTorch code for Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021)

Last update: Dec 29, 2022

Overview

This repository is the official PyTorch implementation of Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (arxiv, supplementary).

🚀 🚀 🚀 News:

Aug. 17, 2021: See our previous spatially invariant kernel estimation work: Flow-based Kernel Prior with Application to Blind Super-Resolution (FKP), CVPR2021.
Aug. 17, 2021: See our recent work for flow-based image SR: Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling (HCFlow), ICCV2021
Aug. 17, 2021: See our recent work for real-world image SR: Designing a Practical Degradation Model for Deep Blind Image Super-Resolution (BSRGAN), ICCV2021

Existing blind image super-resolution (SR) methods mostly assume that blur kernels are spatially invariant across the whole image. However, this assumption rarely holds for real images, whose blur kernels are usually spatially variant due to factors such as object motion and defocus. Hence, existing blind SR methods inevitably perform poorly in real applications. To address this issue, this paper proposes a mutual affine network (MANet) for spatially variant kernel estimation. Specifically, MANet has two distinctive features. First, it has a moderate receptive field so as to preserve the locality of degradation. Second, it involves a new mutual affine convolution (MAConv) layer that enhances feature expressiveness without increasing the receptive field, model size, or computation burden. This is made possible by exploiting channel interdependence: each channel split is transformed by an affine transformation module whose inputs are the remaining channel splits. Extensive experiments on synthetic and real images show that the proposed MANet not only performs favorably for both spatially variant and invariant kernel estimation, but also leads to state-of-the-art blind SR performance when combined with non-blind SR methods.
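The channel-split idea described above can be sketched in PyTorch as follows. This is a minimal illustration and not the repository's layer: the class name MAConvSketch, the default channel counts, the reduction factor, and the sigmoid on the scale are all assumptions made for the example. Only 1x1 convolutions are used to predict the affine parameters, so the receptive field is set by the single 3x3 convolution per split.

```python
import torch
import torch.nn as nn


class MAConvSketch(nn.Module):
    """Illustrative mutual affine convolution (hypothetical, not the official layer)."""

    def __init__(self, in_channels=32, out_channels=32, splits=2, reduction=2):
        super().__init__()
        assert in_channels % splits == 0 and out_channels % splits == 0
        self.in_split = in_channels // splits
        out_split = out_channels // splits
        rest = in_channels - self.in_split      # channels held by the other splits
        hidden = max(rest // reduction, 1)      # assumed bottleneck width
        # One small 1x1 network per split predicts (scale, shift) from the other splits.
        self.affine = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(rest, hidden, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(hidden, 2 * self.in_split, kernel_size=1),
            )
            for _ in range(splits)
        ])
        # One 3x3 convolution per split; padding keeps the spatial size unchanged.
        self.conv = nn.ModuleList([
            nn.Conv2d(self.in_split, out_split, kernel_size=3, padding=1)
            for _ in range(splits)
        ])

    def forward(self, x):
        chunks = torch.split(x, self.in_split, dim=1)
        outputs = []
        for i, chunk in enumerate(chunks):
            # Concatenate every split except the current one.
            rest = torch.cat(chunks[:i] + chunks[i + 1:], dim=1)
            scale, shift = torch.split(self.affine[i](rest), self.in_split, dim=1)
            modulated = chunk * torch.sigmoid(scale) + shift
            outputs.append(self.conv[i](modulated))
        return torch.cat(outputs, dim=1)
```

For example, MAConvSketch()(torch.randn(1, 32, 8, 8)) returns a tensor with the same shape as its input.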

Requirements

Python 3.7, PyTorch >= 1.6, scipy >= 1.6.3
Requirements: opencv-python
Platforms: Ubuntu 16.04, CUDA 10.0 and cuDNN 7.5

Note: this repository is based on BasicSR. Please refer to their repository for a better understanding of the code framework.

Quick Run

Download stage3_MANet+RRDB_x4.pth from the release page and put it in ./pretrained_models. Then, run this command:

cd codes
python test.py --opt options/test/test_stage3.yml

Data Preparation

To prepare data, place the training and testing sets in ./datasets, following the layout ./datasets/DIV2K/HR/0801.png. Commonly used datasets can be downloaded here.
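Before training, it can help to confirm that the images are where the layout above expects them. The helper below is a hypothetical convenience script, not part of the repository; its function name and default arguments are assumptions that simply mirror the example path ./datasets/DIV2K/HR/0801.png.

```python
from pathlib import Path


def list_hr_images(dataset_root="./datasets", name="DIV2K", subdir="HR"):
    """Return the sorted PNG paths found under <dataset_root>/<name>/<subdir>."""
    hr_dir = Path(dataset_root) / name / subdir
    if not hr_dir.is_dir():
        raise FileNotFoundError(f"expected an image folder at {hr_dir}")
    return sorted(hr_dir.glob("*.png"))


if __name__ == "__main__":
    try:
        images = list_hr_images()
        print(f"found {len(images)} HR images")
    except FileNotFoundError as err:
        print(err)
```

Run it from the directory that contains ./datasets; an empty result or a FileNotFoundError suggests the folder layout does not match.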

Training

Step1: to train MANet, run this command:

python train.py --opt options/train/train_stage1.yml

Step2: to train non-blind RRDB, run this command:

python train.py --opt options/train/train_stage2.yml

Step3: to fine-tune RRDB with MANet, run this command:

python train.py --opt options/train/train_stage3.yml

All trained models can be downloaded from the release page. For testing, downloading the stage3 models is enough.

Testing

To test MANet (stage1, kernel estimation only), run this command:

python test.py --opt options/test/test_stage1.yml

To test RRDB-SFT (stage2, non-blind SR with ground-truth kernel), run this command:

python test.py --opt options/test/test_stage2.yml

To test MANet+RRDB (stage3, blind SR), run this command:

python test.py --opt options/test/test_stage3.yml

Note: the above commands generate LR images on the fly. To generate the testing sets used in the paper, run this command:

python prepare_testset.py --opt options/test/prepare_testset.yml

Interactive Exploration of Kernels

To explore spatially variant kernels on an image, first run this command with --save_kernel to save the kernels:

python test.py --opt options/test/test_stage1.yml --save_kernel

Then, run this command to create an interactive window:

python interactive_explore.py --path ../results/001_MANet_aniso_x4_test_stage1/toy_dataset1/npz/toy1.npz
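If the interactive window is not convenient, the saved .npz file can also be inspected directly with NumPy. The sketch below makes no assumption about which array names the file contains; it only lists whatever is stored. The function name summarize_npz is hypothetical.

```python
import numpy as np


def summarize_npz(path):
    """Map each array name stored in an .npz file to its (shape, dtype string)."""
    with np.load(path) as data:
        return {name: (data[name].shape, str(data[name].dtype)) for name in data.files}
```

Calling summarize_npz on the toy1.npz path from the command above prints the stored array names with their shapes, which is a quick way to see how the kernels are laid out before plotting them.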

Results

We conducted experiments on both spatially variant and invariant blind SR. Please refer to the paper and the supplementary material for results.

Citation

@inproceedings{liang21manet,
  title={Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution},
  author={Liang, Jingyun and Sun, Guolei and Zhang, Kai and Van Gool, Luc and Timofte, Radu},
  booktitle={IEEE International Conference on Computer Vision},
  year={2021}
}

License & Acknowledgement

This project is released under the Apache 2.0 license. The code is based on BasicSR, MMSR, IKC and KAIR. Please also follow their licenses. Thanks for their great work.

Comments

Training and OOM

Thanks for your code. I tried to train the model with train_stage1.yml and ran into a CUDA out-of-memory (OOM) error. I am using a 2080 Ti. I reduced the batch size from 16 to 2 and GT_size from 192 to 48, but training still runs out of memory. Is there anything I missed? Thanks.

opened by hcleung3325 9

How to get SR image by spatially variant estimated blur kernels

Hi, thank you for your excellent and interesting work! After reading your paper, I am still not clear about how the estimated kernels are used during SR reconstruction. Could you please explain?

opened by CaptainEven 7

The method of creating kernels

I noticed that the function for creating kernels ('anisotropic_gaussian_kernel_matlab') differs from the standard Gaussian distribution (e.g. the method used in IKC, https://github.com/yuanjunchai/IKC/blob/2a846cf1194cd9bace08973d55ecd8fd3179fe48/codes/utils/util.py#L244). I am wondering why a different approach is used here. In fact, a test dataset created by IKC with the same sigma range seems to perform poorly with MANet, and vice versa.

opened by zhiqiangfu 3

[import error]

    k = scipy.stats.multivariate_normal.pdf(pos, mean=[0, 0], cov=cov)
AttributeError: module 'scipy' has no attribute 'stats'

scipy version error? So, which version of scipy is required?

opened by CaptainEven 2
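One common cause of this AttributeError, independent of this repository, is that in many SciPy versions a plain import scipy does not load the scipy.stats submodule; importing the submodule explicitly avoids the problem. The sketch below shows the explicit import around the same multivariate_normal.pdf call. The function name gaussian_kernel_2d, the kernel size, and the covariance are hypothetical values chosen for illustration, not the repository's settings.

```python
import numpy as np
import scipy.stats  # explicit submodule import; plain `import scipy` may not expose it


def gaussian_kernel_2d(size=21, cov=((4.0, 0.0), (0.0, 1.0))):
    """Evaluate a zero-mean 2D Gaussian on a size x size grid, normalized to sum to 1."""
    half = (size - 1) / 2.0
    axis = np.arange(size) - half
    xx, yy = np.meshgrid(axis, axis)
    pos = np.stack([xx, yy], axis=-1)  # shape (size, size, 2), last axis holds (x, y)
    k = scipy.stats.multivariate_normal.pdf(pos, mean=[0, 0], cov=np.asarray(cov))
    return k / k.sum()
```

If the error persists after adding the explicit import, checking the installed version against the scipy >= 1.6.3 requirement listed above is a reasonable next step.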

A letter from afar

Good evening, boss! I recently discovered your work on MANet. I found that the length of the Gaussian kernel generated by your method is 18. Does this setting have any specific meaning?

opened by fenghao195 0

New Super-Resolution Benchmarks

Hello,

MSU Graphics & Media Lab Video Group has recently launched two new Super-Resolution Benchmarks.

Video Upscalers Benchmark: Quality Enhancement determines the best upscaling methods for increasing video resolution and improving visual quality.

Super-Resolution for Video Compression benchmark aims to test Super-Resolution methods on compressed videos and select the best model for each video codec standard.

If you are interested in participating, you can add your algorithm following the submission steps:

Submit for Video Upscalers Benchmark: Quality Enhancement

Submit for Super-Resolution for Video Compression benchmark

We would be grateful for your feedback on our work!
opened by EvgeneyBogatyrev 0

About LR_Image PSNR/SSIM

Many thanks for your excellent work!

I wonder what the LR_Image PSNR/SSIM is in the ablation study that evaluates MANet's kernel prediction, and how it is computed.

opened by Shaosifan 0
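For reference, PSNR itself is a standard metric and can be computed as below. How the LR images are produced for the ablation is not stated on this page; one plausible reading, which is an assumption here, is that an LR image is re-synthesized from the HR image with the estimated kernels and then compared against the LR image obtained with the ground-truth kernels. The function below only covers the metric and uses plain nested lists so that it has no dependencies.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    same_rows = len(reference) == len(estimate)
    if not same_rows or any(len(r) != len(e) for r, e in zip(reference, estimate)):
        raise ValueError("images must have the same shape")
    count = sum(len(row) for row in reference)
    squared_error = sum(
        (a - b) ** 2
        for row_ref, row_est in zip(reference, estimate)
        for a, b in zip(row_ref, row_est)
    )
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

With 8-bit images, pass max_value=255.0; for images scaled to [0, 1], pass max_value=1.0.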

Questions about the paper

Thanks again for your great work. I have several questions about the paper. In Figure 2, you mention that the input to MANet is an LR image, but the input in your code seems to be the DIV2K GT. Is there a further processing step that I am missing? Also, is it possible to train the whole model on the Y channel only, since my deployment environment only handles the Y channel? Thanks.

opened by mrgreen3325 0

Issue about class BatchBlur_SV in utils.util

MANet/codes/utils/util.py, line 661:
kernel = kernel.flatten(2).unsqueeze(0).expand(3,-1,-1,-1)
The kernel shape: [B, HW, l, l] -> [B, HW, l^2] -> [1, B, HW, l^2] -> [C, B, HW, l^2]
I think this is wrong, because it does not match the shape of pad.

Line 661 should be:
kernel = kernel.flatten(2).unsqueeze(1).expand(-1, 3,-1,-1)
The kernel shape: [B, HW, l, l] -> [B, HW, l^2] -> [B, 1, HW, l^2] -> [B, C, HW, l^2]

opened by jiangmengyu18 0
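The two shape chains described in this issue can be reproduced with a few lines of PyTorch. The sizes below are small hypothetical values; whether either layout is the right one depends on the layout of pad in BatchBlur_SV, which is not shown on this page.

```python
import torch

B, C, H, W, l = 2, 3, 4, 5, 3  # small hypothetical sizes
kernel = torch.randn(B, H * W, l, l)

# Variant quoted from line 661: the channel axis is placed first.
channel_first = kernel.flatten(2).unsqueeze(0).expand(3, -1, -1, -1)
# Variant proposed in the issue: the batch axis stays first.
batch_first = kernel.flatten(2).unsqueeze(1).expand(-1, 3, -1, -1)

print(tuple(channel_first.shape))  # (3, 2, 20, 9)
print(tuple(batch_first.shape))    # (2, 3, 20, 9)
```

The two tensors hold the same values with the first two axes swapped, so channel_first.transpose(0, 1) equals batch_first. The shapes only coincide when B equals C, which can hide the difference during testing.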

Releases(v0.0)

v0.0(Aug 18, 2021)

Pretrained models and supplementary.

Note: Downloading stage3 models is enough for testing MANet.
Source code(tar.gz)
Source code(zip)
MANet_supplementary.pdf(4.12 MB)
stage1_MANet_x2.pth(6.93 MB)
stage1_MANet_x3.pth(6.93 MB)
stage1_MANet_x4.pth(6.93 MB)
stage1_MANet_x4_noise15.pth(6.93 MB)
stage2_RRDB_x2.pth(30.92 MB)
stage2_RRDB_x3.pth(30.96 MB)
stage2_RRDB_x4.pth(31.00 MB)
stage2_RRDB_x4_noise15.pth(31.00 MB)
stage3_MANet+RRDB_x2.pth(37.86 MB)
stage3_MANet+RRDB_x3.pth(37.89 MB)
stage3_MANet+RRDB_x4.pth(37.94 MB)
stage3_MANet+RRDB_x4_noise15.pth(37.94 MB)

Owner

Jingyun Liang

PhD Student at Computer Vision Lab, ETH Zurich

GitHub Repository https://arxiv.org/abs/2108.05302
