Official PyTorch code for Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021)

Overview

This repository is the official PyTorch implementation of Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (arxiv, supplementary).

Existing blind image super-resolution (SR) methods mostly assume that blur kernels are spatially invariant across the whole image. However, this assumption rarely holds for real images, whose blur kernels are usually spatially variant due to factors such as object motion and out-of-focus blur. Hence, existing blind SR methods inevitably perform poorly in real applications. To address this issue, this paper proposes a mutual affine network (MANet) for spatially variant kernel estimation. Specifically, MANet has two distinctive features. First, it has a moderate receptive field so as to keep the locality of degradation. Second, it involves a new mutual affine convolution (MAConv) layer that enhances feature expressiveness without increasing receptive field, model size, or computation burden. This is made possible by exploiting channel interdependence: each channel split is transformed by an affine transformation module whose inputs are the remaining channel splits. Extensive experiments on synthetic and real images show that the proposed MANet not only performs favorably for both spatially variant and invariant kernel estimation, but also leads to state-of-the-art blind SR performance when combined with non-blind SR methods.
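
The mutual affine idea described above can be illustrated with a small NumPy sketch. This is not the repository's MAConv layer: the two-way channel split, the per-pixel linear maps standing in for 1x1 convolutions, the random weights, the `1 + scale` form of the affine transform, and the names `make_params` and `mutual_affine` are all assumptions made for illustration, and the convolution that would follow each split is omitted.

```python
import numpy as np

def make_params(channels, num_splits, rng):
    """Random weights for the hypothetical per-split affine predictors."""
    split = channels // num_splits
    rest = channels - split
    # Each predictor maps the `rest` channels to a scale and a shift
    # for the `split` channels it modulates (hence 2 * split outputs).
    return [rng.standard_normal((2 * split, rest)) * 0.1
            for _ in range(num_splits)]

def mutual_affine(x, params):
    """x: feature map of shape (C, H, W); returns an array of the same shape."""
    splits = np.split(x, len(params), axis=0)
    out = []
    for i, w in enumerate(params):
        # Input of the affine module: all splits except the i-th one.
        rest = np.concatenate(splits[:i] + splits[i + 1:], axis=0)
        # Per-pixel linear map, i.e. a 1x1 convolution without bias.
        pred = np.einsum("oc,chw->ohw", w, rest)
        scale, shift = np.split(pred, 2, axis=0)
        # Affine transform of split i, conditioned on the other splits.
        out.append(splits[i] * (1.0 + scale) + shift)
    return np.concatenate(out, axis=0)

rng = np.random.default_rng(0)
features = rng.standard_normal((8, 5, 5))
params = make_params(channels=8, num_splits=2, rng=rng)
result = mutual_affine(features, params)
print(result.shape)
```

Because the predictors in this sketch act per pixel, each output location depends only on the features at that same location, which is one way to read "without increasing receptive field".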

Requirements

  • Python 3.7, PyTorch >= 1.6, scipy >= 1.6.3
  • Additional packages: opencv-python
  • Platforms: Ubuntu 16.04, CUDA 10.0 and cuDNN 7.5

Note: this repository is based on BasicSR. Please refer to their repository for a better understanding of the code framework.
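
Before running anything, it can help to check which of the listed packages are importable and at what version. The following is a minimal sketch; the helper name `module_versions` is made up for this example, and `cv2` is the import name of the opencv-python package.

```python
import importlib

# Import names of the packages listed under Requirements.
MODULES = ["torch", "scipy", "cv2"]

def module_versions(modules):
    """Map each module name to its __version__, or None if it cannot be imported."""
    versions = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions

report = module_versions(MODULES)
for name, version in report.items():
    print(name, version if version is not None else "not installed")
```

The script only reports versions; comparing them against the minimums above is left to the reader.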

Quick Run

Download stage3_MANet+RRDB_x4.pth from the releases page and put it in ./pretrained_models. Then, run these commands:

cd codes
python test.py --opt options/test/test_stage3.yml

Data Preparation

To prepare data, put the training and testing sets in ./datasets, following the layout ./datasets/DIV2K/HR/0801.png. Commonly used datasets can be downloaded here.
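
A quick way to confirm that images landed in the expected place is to list them. The sketch below only encodes the single example path given above (`<root>/<dataset>/HR/*.png`); the helper name `list_hr_images` is hypothetical, and the demo runs against a temporary directory rather than a real dataset.

```python
import tempfile
from pathlib import Path

def list_hr_images(root="./datasets", dataset="DIV2K"):
    """Return the sorted PNG files under <root>/<dataset>/HR."""
    hr_dir = Path(root) / dataset / "HR"
    if not hr_dir.is_dir():
        raise FileNotFoundError(f"missing directory: {hr_dir}")
    return sorted(hr_dir.glob("*.png"))

# Demo on a throwaway directory that mimics the example layout.
with tempfile.TemporaryDirectory() as tmp:
    hr = Path(tmp) / "DIV2K" / "HR"
    hr.mkdir(parents=True)
    (hr / "0801.png").touch()
    found_names = [p.name for p in list_hr_images(root=tmp)]
    print(found_names)
```

For a real check, call `list_hr_images()` from the repository root after copying the data.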

Training

Step 1: To train MANet, run this command:

python train.py --opt options/train/train_stage1.yml

Step 2: To train non-blind RRDB, run this command:

python train.py --opt options/train/train_stage2.yml

Step 3: To fine-tune RRDB with MANet, run this command:

python train.py --opt options/train/train_stage3.yml
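
The three training commands can also be chained from a small driver script. This is only a sketch: the helper names `build_commands` and `run_all` are made up, running from the `codes` directory is assumed from the Quick Run section, and whether a later stage picks up the checkpoints of an earlier one depends on paths in the YAML files that are not covered here. The default is a dry run that only prints the commands.

```python
import subprocess
import sys

STAGE_CONFIGS = [
    "options/train/train_stage1.yml",  # MANet (kernel estimation)
    "options/train/train_stage2.yml",  # non-blind RRDB
    "options/train/train_stage3.yml",  # RRDB fine-tuned with MANet
]

def build_commands(configs=STAGE_CONFIGS, python=sys.executable):
    """Build one `python train.py --opt <config>` command per stage."""
    return [[python, "train.py", "--opt", cfg] for cfg in configs]

def run_all(commands, cwd="codes", dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence as soon as one stage fails.
            subprocess.run(cmd, cwd=cwd, check=True)

run_all(build_commands())
```

Pass `dry_run=False` to actually launch the stages one after another.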

All trained models can be downloaded from the releases page. For testing, the stage 3 models are sufficient.

Testing

To test MANet (stage1, kernel estimation only), run this command:

python test.py --opt options/test/test_stage1.yml

To test RRDB-SFT (stage2, non-blind SR with ground-truth kernel), run this command:

python test.py --opt options/test/test_stage2.yml

To test MANet+RRDB (stage3, blind SR), run this command:

python test.py --opt options/test/test_stage3.yml

Note: the above commands generate LR images on the fly. To generate the test sets used in the paper, run this command:

python prepare_testset.py --opt options/test/prepare_testset.yml

Interactive Exploration of Kernels

To explore spatially variant kernels on an image, add --save_kernel to the test command to save the estimated kernels:

python test.py --opt options/test/test_stage1.yml --save_kernel

Then, run this command to create an interactive window:

python interactive_explore.py --path ../results/001_MANet_aniso_x4_test_stage1/toy_dataset1/npz/toy1.npz
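
To look at a saved .npz file outside the interactive window, NumPy can list the arrays it contains. The sketch below makes no assumption about the array names that test.py writes; it simply reports whatever is stored. The helper name `summarize_npz` is hypothetical, and the demo uses a stand-in file with a made-up array.

```python
import os
import tempfile

import numpy as np

def summarize_npz(path):
    """Return {array_name: shape} for every array stored in an .npz file."""
    with np.load(path) as data:
        return {name: data[name].shape for name in data.files}

# Demo on a stand-in file; the array name and shape are invented for this example.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.npz")
    np.savez(demo_path, demo_array=np.zeros((6, 5, 5)))
    summary = summarize_npz(demo_path)
    print(summary)
```

Pointing `summarize_npz` at the file saved by the --save_kernel run shows which arrays and shapes are available for further analysis.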

Results

We conducted experiments on both spatially variant and invariant blind SR. Please refer to the paper and supplementary material for results.

Citation

@inproceedings{liang21manet,
  title={Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution},
  author={Liang, Jingyun and Sun, Guolei and Zhang, Kai and Van Gool, Luc and Timofte, Radu},
  booktitle={IEEE/CVF International Conference on Computer Vision},
  year={2021}
}

License & Acknowledgement

This project is released under the Apache 2.0 license. The code is based on BasicSR, MMSR, IKC and KAIR. Please also follow their licenses. Thanks to their authors for the great work.

Comments
  • Training and OOM

    Thanks for your code. I tried to train the model with train_stage1.yml and got a CUDA out-of-memory (OOM) error. I am using a 2080 Ti. I tried reducing the batch size from 16 to 2 and the GT_size from 192 to 48, but training still runs out of memory. Is there anything I missed? Thanks.

    opened by hcleung3325 9
  • [How to get SR image by spatially variant estimated blur kernels]

    Hi, thank you for your excellent and interesting work! After reading your paper, I'm not clear about what happens after kernel estimation during SR reconstruction. Could you please explain?

    opened by CaptainEven 7
  • The method of creating kernels

    I noticed that the function for creating kernels ('anisotropic_gaussian_kernel_matlab') differs from the standard Gaussian distribution (e.g. the method used in IKC, https://github.com/yuanjunchai/IKC/blob/2a846cf1194cd9bace08973d55ecd8fd3179fe48/codes/utils/util.py#L244). I am wondering why a different approach is used here. In fact, a test dataset created by IKC with the same sigma range seems to perform poorly on MANet, and vice versa.

    opened by zhiqiangfu 3
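
For reference when comparing kernel generators, here is a textbook construction of a normalized anisotropic Gaussian kernel in NumPy. It is a sketch under stated assumptions and is not claimed to match either `anisotropic_gaussian_kernel_matlab` or the IKC function: here `sigma_x` and `sigma_y` are standard deviations along the rotated axes, `theta` is in radians, the grid is centred on the kernel, and the function name is made up. Parameterization conventions (for example, whether a sigma denotes a standard deviation or a covariance eigenvalue) vary between implementations, so kernels generated with the same nominal sigma range may not be comparable.

```python
import numpy as np

def anisotropic_gaussian_kernel(size, sigma_x, sigma_y, theta):
    """Normalized 2-D Gaussian kernel of shape (size, size)."""
    # Covariance = R diag(sigma_x^2, sigma_y^2) R^T, with R a rotation by theta.
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    cov = rot @ np.diag([sigma_x ** 2, sigma_y ** 2]) @ rot.T
    inv_cov = np.linalg.inv(cov)
    # Pixel-centre coordinates, symmetric about the kernel centre.
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    pos = np.stack([xx, yy], axis=-1)
    # Unnormalized density exp(-0.5 * p^T cov^-1 p) at every pixel.
    expo = np.einsum("hwi,ij,hwj->hw", pos, inv_cov, pos)
    kernel = np.exp(-0.5 * expo)
    # Normalize so the kernel sums to one.
    return kernel / kernel.sum()

k = anisotropic_gaussian_kernel(size=21, sigma_x=4.0, sigma_y=1.0, theta=np.pi / 4)
print(k.shape)
```

Generating the same parameters with two codebases and comparing the resulting arrays directly is a simple way to see whether their conventions agree.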
  • [import error]

        k = scipy.stats.multivariate_normal.pdf(pos, mean=[0, 0], cov=cov)
    AttributeError: module 'scipy' has no attribute 'stats'
    

    scipy version error? So, which version of scipy is required?

    opened by CaptainEven 2
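
The AttributeError above is the typical symptom of accessing a SciPy subpackage that has not been imported: in many SciPy versions, `import scipy` alone does not load `scipy.stats`. Assuming that is the cause here (the thread does not confirm it), an explicit import avoids the error regardless of version. The covariance and grid below are arbitrary demo values.

```python
import numpy as np
import scipy.stats  # explicit submodule import; `import scipy` alone may not load it

# Arbitrary 2x2 covariance and a 5x5 grid of (x, y) positions for the demo.
cov = np.array([[2.0, 0.5], [0.5, 1.0]])
ax = np.arange(-2, 3)
xx, yy = np.meshgrid(ax, ax)
pos = np.dstack([xx, yy])

# Same call as in the traceback, now with scipy.stats imported.
k = scipy.stats.multivariate_normal.pdf(pos, mean=[0, 0], cov=cov)
print(k.shape)
```

If the error persists after adding the import, checking the installed version against the scipy >= 1.6.3 requirement is the next step.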
  • A letter from afar

    Good evening, boss! I recently discovered your work on MANet. I found that the length of the Gaussian kernel your method generates is equal to 18. Does this setting have any specific meaning?

    opened by fenghao195 0
  • New Super-Resolution Benchmarks

    Hello,

    MSU Graphics & Media Lab Video Group has recently launched two new Super-Resolution Benchmarks.

    If you are interested in participating, you can add your algorithm following the submission steps:

    We would be grateful for your feedback on our work!

    opened by EvgeneyBogatyrev 0
  • About LR_Image PSNR/SSIM

    Many thanks for your excellent work!

    I wonder what the LR_Image PSNR/SSIM used in the ablation study to evaluate MANet's kernel prediction is, and how it is computed.

    opened by Shaosifan 0
  • Questions about the paper

    Thanks again for your great work. I have several questions about the paper. In Figure 2, you mention that the input to MANet is an LR image, but the input in your code seems to be the DIV2K GT. Is there a further processing step I missed? Also, is it possible to train the whole model on the Y channel only, since my deployment environment only handles the Y channel? Thanks.

    opened by mrgreen3325 0
  • Issue about class BatchBlur_SV in utils.util

    MANet/codes/utils/util.py, line 661:

    kernel = kernel.flatten(2).unsqueeze(0).expand(3,-1,-1,-1)

    The kernel shape goes [B, HW, l, l] -> [B, HW, l^2] -> [1, B, HW, l^2] -> [C, B, HW, l^2]. I think this is wrong, because it does not correspond to the shape of pad.

    Line 661 should be:

    kernel = kernel.flatten(2).unsqueeze(1).expand(-1, 3,-1,-1)

    The kernel shape then goes [B, HW, l, l] -> [B, HW, l^2] -> [B, 1, HW, l^2] -> [B, C, HW, l^2].

    opened by jiangmengyu18 0
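
The two shape chains in the BatchBlur_SV issue can be reproduced with a NumPy analogue of the tensor operations: `reshape` stands in for `flatten(2)`, indexing with `None` for `unsqueeze`, and `np.broadcast_to` for `expand`. The sizes are toy values, and the sketch only shows which axis ends up holding the batch and which the channels; whether the original line is actually a bug depends on the layout of `pad`, which is not shown here.

```python
import numpy as np

B, HW, l, C = 2, 6, 5, 3  # toy sizes; C = 3 colour channels
kernel = np.arange(B * HW * l * l, dtype=float).reshape(B, HW, l, l)
flat = kernel.reshape(B, HW, l * l)  # analogue of flatten(2)

# Current line: unsqueeze(0).expand(3, -1, -1, -1) puts channels first.
original = np.broadcast_to(flat[None], (C, B, HW, l * l))
# Proposed line: unsqueeze(1).expand(-1, 3, -1, -1) keeps the batch first.
proposed = np.broadcast_to(flat[:, None], (B, C, HW, l * l))

print(original.shape)  # (3, 2, 6, 25)
print(proposed.shape)  # (2, 3, 6, 25)
```

In both cases every channel sees the same kernels for a given batch element; only the axis order differs, and the two happen to have identical shapes when B equals 3.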
Owner
Jingyun Liang
PhD Student at Computer Vision Lab, ETH Zurich