Official PyTorch code for Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021)

Overview

Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021)

This repository is the official PyTorch implementation of Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (arxiv, supplementary).

🚀 🚀 🚀 News:


Existing blind image super-resolution (SR) methods mostly assume blur kernels are spatially invariant across the whole image. However, such an assumption is rarely applicable for real images whose blur kernels are usually spatially variant due to factors such as object motion and out-of-focus. Hence, existing blind SR methods would inevitably give rise to poor performance in real applications. To address this issue, this paper proposes a mutual affine network (MANet) for spatially variant kernel estimation. Specifically, MANet has two distinctive features. First, it has a moderate receptive field so as to keep the locality of degradation. Second, it involves a new mutual affine convolution (MAConv) layer that enhances feature expressiveness without increasing receptive field, model size and computation burden. This is made possible through exploiting channel interdependence, which applies each channel split with an affine transformation module whose input are the rest channel splits. Extensive experiments on synthetic and real images show that the proposed MANet not only performs favorably for both spatially variant and invariant kernel estimation, but also leads to state-of-the-art blind SR performance when combined with non-blind SR methods.

Requirements

  • Python 3.7, PyTorch >= 1.6, scipy >= 1.6.3
  • Requirements: opencv-python
  • Platforms: Ubuntu 16.04, cuda-10.0 & cuDNN v-7.5

Note: this repository is based on BasicSR. Please refer to their repository for a better understanding of the code framework.

Quick Run

Download stage3_MANet+RRDB_x4.pth from release and put it in ./pretrained_models. Then, run this command:

cd codes
python test.py --opt options/test/test_stage3.yml

Data Preparation

To prepare data, put training and testing sets in ./datasets as ./datasets/DIV2K/HR/0801.png. Commonly used datasets can be downloaded here.

Training

Step1: to train MANet, run this command:

python train.py --opt options/train/train_stage1.yml

Step2: to train non-blind RRDB, run this command:

python train.py --opt options/train/train_stage2.yml

Step3: to fine-tune RRDB with MANet, run this command:

python train.py --opt options/train/train_stage3.yml

All trained models can be downloaded from release. For testing, downloading stage3 models is enough.

Testing

To test MANet (stage1, kernel estimation only), run this command:

python test.py --opt options/test/test_stage1.yml

To test RRDB-SFT (stage2, non-blind SR with ground-truth kernel), run this command:

python test.py --opt options/test/test_stage2.yml

To test MANet+RRDB (stage3, blind SR), run this command:

python test.py --opt options/test/test_stage3.yml

Note: above commands generate LR images on-the-fly. To generate testing sets used in the paper, run this command:

python prepare_testset.py --opt options/test/prepare_testset.yml

Interactive Exploration of Kernels

To explore spaitally variant kernels on an image, use --save_kernel and run this command to save kernel:

python test.py --opt options/test/test_stage1.yml --save_kernel

Then, run this command to creat an interactive window:

python interactive_explore.py --path ../results/001_MANet_aniso_x4_test_stage1/toy_dataset1/npz/toy1.npz

Results

We conducted experiments on both spatially variant and invariant blind SR. Please refer to the paper and supp for results.

Citation

@inproceedings{liang21manet,
  title={Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution},
  author={Liang, Jingyun and Sun, Guolei and Zhang, Kai and Van Gool, Luc and Timofte, Radu},
  booktitle={IEEE Conference on International Conference on Computer Vision},
  year={2021}
}

License & Acknowledgement

This project is released under the Apache 2.0 license. The codes are based on BasicSR, MMSR, IKC and KAIR. Please also follow their licenses. Thanks for their great works.

Comments
  • Training and OOM

    Training and OOM

    Thanks for your code. I tried to train the model with train_stage1.yml, and the Cuda OOM. I am using 2080 Ti, I tried to reduce the batch size from 16 to 2 and the GT_size from 192 to 48. However, the training still OOM. May I know is there anything I missed? Thanks.

    opened by hcleung3325 9
  • [How to get SR image by spatially variant estimated blur kernels]

    [How to get SR image by spatially variant estimated blur kernels]

    Hi, Thank you for your excellent and interesting work! I'm not so clear about the process after kernels estimation during SR reconstruction after reading your paper. Could you please explain?

    opened by CaptainEven 7
  • The method of creating kernels

    The method of creating kernels

    I noticed that the function for creating kernel ('anisotropic_gaussian_kernel_matlab') is different from the standard gaussian distribution (e.g. the method that used in IKC, https://github.com/yuanjunchai/IKC/blob/2a846cf1194cd9bace08973d55ecd8fd3179fe48/codes/utils/util.py#L244). I am wondering why a different way is used here. Actually, a test dataset created by IKC with same sigma range seems to have poor performance on MANet, and vice versa.

    opened by zhiqiangfu 3
  • [import error]

    [import error]

        k = scipy.stats.multivariate_normal.pdf(pos, mean=[0, 0], cov=cov)
    AttributeError: module 'scipy' has no attribute 'stats'
    

    scipy version error? So, which version of scipy is required?

    opened by CaptainEven 2
  • A letter from afar

    A letter from afar

    Good evening, boss! I recently discovered your work about MANet.I found that the length of the gaussian kernel your method generated is equal to 18.Does this setting have any specific meaning? image

    opened by fenghao195 0
  • New Super-Resolution Benchmarks

    New Super-Resolution Benchmarks

    Hello,

    MSU Graphics & Media Lab Video Group has recently launched two new Super-Resolution Benchmarks.

    If you are interested in participating, you can add your algorithm following the submission steps:

    We would be grateful for your feedback on our work!

    opened by EvgeneyBogatyrev 0
  • About LR_Image PSNR/SSIM

    About LR_Image PSNR/SSIM

    Many thanks for your excellent work!

    I wonder what is the LR_Image PSNR/SSIM in the ablation study to evaluate the MANet about kernel prediction, and how to compute these?

    opened by Shaosifan 0
  • Questions about the paper

    Questions about the paper

    Thanks again for your great work. I have several questions about the paper. In Figure 2, you mentioned the input for MANet is a LR, but the input for your code seems to be DIV2K GT. Is there any further process I miss? Also, is that possible for the whole model trained in y-channel since my deployed environment only deals with y-channel? Thanks.

    opened by mrgreen3325 0
  • Issue about class BatchBlur_SV in utils.util

    Issue about class BatchBlur_SV in utils.util

    MANet/codes/utils/util.py Line 661: kernel = kernel.flatten(2).unsqueeze(0).expand(3,-1,-1,-1) The kernel shape: [B, HW, l, l] ->[B, HW, l^2] ->[1, B, HW, l^2] ->[C, B, HW, l^2] I think it is wrong, because it is not corresponding to the shape of pad.

    The line 661 should be kernel = kernel.flatten(2).unsqueeze(1).expand(-1, 3,-1,-1) The kernel shape: [B, HW, l, l] ->[B, HW, l^2] ->[B, 1, HW, l^2] ->[B, C, HW, l^2]

    opened by jiangmengyu18 0
Owner
Jingyun Liang
PhD Student at Computer Vision Lab, ETH Zurich
Jingyun Liang
PyKaldi GOP-DNN on Epa-DB

PyKaldi GOP-DNN on Epa-DB This repository has the tools to run a PyKaldi GOP-DNN algorithm on Epa-DB, a database of non-native English speech by Spani

18 Dec 14, 2022
Mouse Brain in the Model Zoo

Deep Neural Mouse Brain Modeling This is the repository for the ongoing deep neural mouse modeling project, an attempt to characterize the representat

Colin Conwell 15 Aug 22, 2022
Hydra: an Extensible Fuzzing Framework for Finding Semantic Bugs in File Systems

Hydra: An Extensible Fuzzing Framework for Finding Semantic Bugs in File Systems Paper Finding Semantic Bugs in File Systems with an Extensible Fuzzin

gts3.org (<a href=[email protected])"> 129 Dec 15, 2022
MobileNetV1-V2,MobileNeXt,GhostNet,AdderNet,ShuffleNetV1-V2,Mobile+ViT etc.

MobileNetV1-V2,MobileNeXt,GhostNet,AdderNet,ShuffleNetV1-V2,Mobile+ViT etc. ⭐⭐⭐⭐⭐

568 Jan 04, 2023
DROPO: Sim-to-Real Transfer with Offline Domain Randomization

DROPO: Sim-to-Real Transfer with Offline Domain Randomization Gabriele Tiboni, Karol Arndt, Ville Kyrki. This repository contains the code for the pap

Gabriele Tiboni 8 Dec 19, 2022
Code for EMNLP 2021 paper Contrastive Out-of-Distribution Detection for Pretrained Transformers.

Contra-OOD Code for EMNLP 2021 paper Contrastive Out-of-Distribution Detection for Pretrained Transformers. Requirements PyTorch Transformers datasets

Wenxuan Zhou 27 Oct 28, 2022
Vector Quantized Diffusion Model for Text-to-Image Synthesis

Vector Quantized Diffusion Model for Text-to-Image Synthesis Due to company policy, I have to set microsoft/VQ-Diffusion to private for now, so I prov

Shuyang Gu 294 Jan 05, 2023
Pytorch Implementations of large number classical backbone CNNs, data enhancement, torch loss, attention, visualization and some common algorithms.

Torch-template-for-deep-learning Pytorch implementations of some **classical backbone CNNs, data enhancement, torch loss, attention, visualization and

Li Shengyan 270 Dec 31, 2022
Official PyTorch code of Holistic 3D Scene Understanding from a Single Image with Implicit Representation (CVPR 2021)

Implicit3DUnderstanding (Im3D) [Project Page] Holistic 3D Scene Understanding from a Single Image with Implicit Representation Cheng Zhang, Zhaopeng C

Cheng Zhang 149 Jan 08, 2023
Libraries, tools and tasks created and used at DeepMind Robotics.

Libraries, tools and tasks created and used at DeepMind Robotics.

DeepMind 270 Nov 30, 2022
Python3 Implementation of (Subspace Constrained) Mean Shift Algorithm in Euclidean and Directional Product Spaces

(Subspace Constrained) Mean Shift Algorithms in Euclidean and/or Directional Product Spaces This repository contains Python3 code for the mean shift a

Yikun Zhang 0 Oct 19, 2021
Collapse by Conditioning: Training Class-conditional GANs with Limited Data

Collapse by Conditioning: Training Class-conditional GANs with Limited Data Moha

Mohamad Shahbazi 33 Dec 06, 2022
Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation

OSCAR Project Page | Paper This repository contains the codebase used in OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Ma

NVIDIA Research Projects 74 Dec 22, 2022
TorchIO is a Medical image preprocessing and augmentation toolkit for deep learning. Part of the PyTorch Ecosystem.

Medical image preprocessing and augmentation toolkit for deep learning. Part of the PyTorch Ecosystem.

Fernando Pérez-García 1.6k Jan 06, 2023
Official repository of the paper Privacy-friendly Synthetic Data for the Development of Face Morphing Attack Detectors

SMDD-Synthetic-Face-Morphing-Attack-Detection-Development-dataset Official repository of the paper Privacy-friendly Synthetic Data for the Development

10 Dec 12, 2022
A Python package for generating concise, high-quality summaries of a probability distribution

GoodPoints A Python package for generating concise, high-quality summaries of a probability distribution GoodPoints is a collection of tools for compr

Microsoft 28 Oct 10, 2022
[ICCV 2021] Official PyTorch implementation for Deep Relational Metric Learning.

Ranking Models in Unlabeled New Environments Prerequisites This code uses the following libraries Python 3.7 NumPy PyTorch 1.7.0 + torchivision 0.8.1

Borui Zhang 39 Dec 10, 2022
Visualizer using audio and semantic analysis to explore BigGAN (Brock et al., 2018) latent space.

BigGAN Audio Visualizer Description This visualizer explores BigGAN (Brock et al., 2018) latent space by using pitch/tempo of an audio file to generat

Rush Kapoor 2 Nov 21, 2022
Data for "Driving the Herd: Search Engines as Content Influencers" paper

herding_data Data for "Driving the Herd: Search Engines as Content Influencers" paper Dataset description The collection contains 2250 documents, 30 i

0 Aug 17, 2021
Pytorch implementation of the paper "COAD: Contrastive Pre-training with Adversarial Fine-tuning for Zero-shot Expert Linking."

Expert-Linking Pytorch implementation of the paper "COAD: Contrastive Pre-training with Adversarial Fine-tuning for Zero-shot Expert Linking." This is

BoChen 12 Jan 01, 2023