Official PyTorch Implementation for "Recurrent Video Deblurring with Blur-Invariant Motion Estimation and Pixel Volumes"

Last update: Nov 06, 2022

Overview

PVDNet: Recurrent Video Deblurring with Blur-Invariant Motion Estimation and Pixel Volumes

This repository contains the official PyTorch implementation of the following paper:

Recurrent Video Deblurring with Blur-Invariant Motion Estimation and Pixel Volumes
Hyeongseok Son, Junyong Lee, Jonghyeop Lee, Sunghyun Cho, Seungyong Lee, TOG 2021 (presented at SIGGRAPH 2021)

About the Research

Overall Framework

Our video deblurring framework consists of three modules: a blur-invariant motion estimation network (BIMNet), a pixel volume generator, and a pixel volume-based deblurring network (PVDNet). We first train BIMNet; after it has converged, we combine the two networks with the pixel volume generator. We then fix the parameters of BIMNet and train PVDNet by running the entire network end to end, so that only PVDNet is updated.
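The second training stage can be sketched in PyTorch as follows. This is a minimal sketch, not the repository's training code: `bimnet` and `pvdnet` are hypothetical single-layer stand-ins, the pixel volume generator is omitted, and the optimizer, learning rate, and loss are assumptions chosen only for illustration. It shows one way to keep BIMNet fixed while the forward pass still runs through both networks.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins; the real BIMNet and PVDNet are much larger networks.
bimnet = nn.Conv2d(6, 2, kernel_size=3, padding=1)  # two RGB frames -> 2-channel flow
pvdnet = nn.Conv2d(5, 3, kernel_size=3, padding=1)  # current frame + flow -> output frame

# Fix BIMNet: no gradients are stored for its parameters.
for p in bimnet.parameters():
    p.requires_grad_(False)
bimnet.eval()

# Only PVDNet's parameters are handed to the optimizer (assumed: Adam, lr=1e-4).
optimizer = torch.optim.Adam(pvdnet.parameters(), lr=1e-4)

# Random tensors standing in for a blurry frame pair and its sharp target.
prev_frame = torch.rand(1, 3, 16, 16)
curr_frame = torch.rand(1, 3, 16, 16)
sharp_target = torch.rand(1, 3, 16, 16)

bimnet_before = [p.clone() for p in bimnet.parameters()]
pvdnet_before = [p.clone() for p in pvdnet.parameters()]

# One training step through the combined pipeline.
flow = bimnet(torch.cat([curr_frame, prev_frame], dim=1))
output = pvdnet(torch.cat([curr_frame, flow], dim=1))
loss = nn.functional.mse_loss(output, sharp_target)

optimizer.zero_grad()
loss.backward()
optimizer.step()

bimnet_unchanged = all(torch.equal(a, b) for a, b in zip(bimnet_before, bimnet.parameters()))
pvdnet_changed = any(not torch.equal(a, b) for a, b in zip(pvdnet_before, pvdnet.parameters()))
print(bimnet_unchanged, pvdnet_changed)
```

Passing only `pvdnet.parameters()` to the optimizer and disabling `requires_grad` on BIMNet are two independent safeguards; either one alone would already keep BIMNet's weights fixed.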

Blur-Invariant Motion Estimation Network (BIMNet)

To estimate motion between frames accurately, we adopt LiteFlowNet and train it with a blur-invariant loss so that the trained network can estimate blur-invariant optical flow between frames. The blur-invariant loss is defined in Eq. 1 of the main paper.

The figure shows a qualitative comparison of different optical flow methods. The results of the other methods contain severely distorted structures due to errors in their optical flow maps. In contrast, the results of BIMNet show far fewer distortions.

Pixel Volume for Motion Compensation

We propose a novel pixel volume that provides multiple candidates for matching pixels between images. Moreover, a pixel volume provides an additional cue for motion compensation based on the majority.

Our pixel volume approach improves video deblurring performance by exploiting the multiple candidates in a pixel volume in two ways: 1) in most cases, the majority cue points to the correct match, as the statistics in Sec. 4.4 of the main paper show, and 2) in the remaining cases, PVDNet can use the multiple candidates to estimate the correct match by referring to nearby pixels that have majority cues.
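The idea of keeping several matching candidates per pixel can be illustrated with a small NumPy sketch. It assumes that the candidates for a pixel are the values in a k x k window of the motion-compensated previous frame; the function name `build_pixel_volume`, the `(k*k, H, W)` layout, the edge padding, the single-channel input, and the default window size are all assumptions made for illustration. Refer to the main paper for the exact construction.

```python
import numpy as np

def build_pixel_volume(warped_prev, k=5):
    """Stack the k x k neighbourhood of every pixel into a (k*k, H, W) volume.

    warped_prev: 2D array, a motion-compensated previous frame (one channel).
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that the window has a centre pixel")
    r = k // 2
    h, w = warped_prev.shape
    # Replicate border pixels so that every position has a full window.
    padded = np.pad(warped_prev, r, mode="edge")
    # Each slice is the frame shifted by one offset inside the window.
    candidates = [padded[dy:dy + h, dx:dx + w] for dy in range(k) for dx in range(k)]
    return np.stack(candidates, axis=0)

frame = np.arange(20, dtype=np.float32).reshape(4, 5)
volume = build_pixel_volume(frame, k=3)
print(volume.shape)
```

In this layout the centre slice (index `k*k // 2`) is the unshifted frame, and the remaining slices are the alternative candidates that a deblurring network could fall back on when the estimated motion is slightly off.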

Getting Started

Prerequisites

Tested environment

Environment setup

$ git clone https://github.com/codeslake/PVDNet.git
$ cd PVDNet

$ conda create -y --name PVDNet python=3.8 && conda activate PVDNet
# for CUDA10.2
$ sh install_CUDA10.2.sh
# for CUDA11.1
$ sh install_CUDA11.1.sh

Datasets
- Download and unzip Su et al.'s dataset and Nah et al.'s dataset under [DATASET_ROOT]:
```
├── [DATASET_ROOT]
│   ├── train_DVD
│   ├── test_DVD
│   ├── train_nah
│   ├── test_nah
```
  Note:
  - [DATASET_ROOT] is currently set to ./datasets/video_deblur. It can be specified by modifying config.data_offset in ./configs/config.py.
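Before running anything, it can help to confirm that the folders are where the code expects them. The following is a small sketch using only the standard library; the helper `missing_splits` is hypothetical and not part of this repository, and it only checks that the four directories listed above exist, not their contents.

```python
from pathlib import Path

# Split folders expected directly under [DATASET_ROOT], as listed above.
EXPECTED_SPLITS = ("train_DVD", "test_DVD", "train_nah", "test_nah")

def missing_splits(dataset_root):
    """Return the names of expected split folders that are not present."""
    root = Path(dataset_root)
    return [name for name in EXPECTED_SPLITS if not (root / name).is_dir()]

missing = missing_splits("./datasets/video_deblur")
print("all splits found" if not missing else f"missing splits: {missing}")
```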

Pre-trained models

Download and unzip pretrained weights under ./ckpt/:

├── ./ckpt
│   ├── BIMNet.pytorch
│   ├── PVDNet_DVD.pytorch
│   ├── PVDNet_nah.pytorch
│   ├── PVDNet_large_nah.pytorch

Testing models of TOG2021

For the PSNRs and SSIMs reported in the paper, we follow Su et al. and use the approach of Koehler et al., which first aligns the two images using a global translation to account for the ambiguity in pixel location caused by blur.
Refer here for the evaluation code.
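The alignment step can be illustrated with a simplified NumPy sketch. It is not the evaluation code referred to above: it assumes integer-pixel shifts only, a brute-force search, single-channel images in the range [0, 1], and picks the shift that maximizes PSNR. The function names and the `max_shift` parameter are hypothetical.

```python
import numpy as np

def psnr(a, b, peak=1.0):
    """PSNR in dB between two arrays of the same shape."""
    mse = np.mean((a - b) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))

def best_shift_psnr(result, target, max_shift=2):
    """Search integer translations and return (best PSNR, (dy, dx))."""
    h, w = target.shape[:2]
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Crop both images to the region where they overlap after shifting.
            res = result[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)]
            tgt = target[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
            score = psnr(res, tgt)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_score, best_shift

# Synthetic check: the "result" is the target shifted by (1, -2) pixels.
rng = np.random.default_rng(0)
target = rng.random((32, 32))
result = np.roll(target, shift=(1, -2), axis=(0, 1))
score, shift = best_shift_psnr(result, target)
print(shift, psnr(result, target))
```

Without alignment, a small translation like this one is penalized as if the content were wrong; searching over shifts removes that penalty before the metric is computed.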

## Table 4 in the main paper (Evaluation on Su et al.'s dataset)
# Our final model
CUDA_VISIBLE_DEVICES=0 python run.py --mode PVDNet_DVD --config config_PVDNet --data DVD --ckpt_abs_name ckpt/PVDNet_DVD.pytorch

## Table 5 in the main paper (Evaluation on Nah et al.'s dataset)
# Our final model
CUDA_VISIBLE_DEVICES=0 python run.py --mode PVDNet_nah --config config_PVDNet --data nah --ckpt_abs_name ckpt/PVDNet_nah.pytorch

# Larger model
CUDA_VISIBLE_DEVICES=0 python run.py --mode PVDNet_large_nah --config config_PVDNet_large --data nah --ckpt_abs_name ckpt/PVDNet_large_nah.pytorch

Note:

Testing results will be saved in [LOG_ROOT]/PVDNet_TOG2021/[mode]/result/quanti_quali/[mode]_[epoch]/[data]/.

[LOG_ROOT] is set to ./logs/ by default. Refer here for more details about the logging.

Options:
- --data: The name of a dataset to evaluate: DVD | nah | random. Default: DVD
  - The data structure can be modified in the function set_eval_path(..) in ./configs/config.py.
  - random is for testing models with any video frames, which should be placed as [DATASET_ROOT]/random/[video_name]/*.[jpg|png].
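To check that custom videos are laid out as `--data random` expects, the folder can be scanned with a short standard-library sketch. The helper `list_random_videos` is hypothetical and not part of this repository; it assumes one sub-folder per video and sorts frames by file name.

```python
from pathlib import Path

FRAME_SUFFIXES = {".jpg", ".png"}

def list_random_videos(dataset_root):
    """Map each video folder under [DATASET_ROOT]/random to its sorted frame files."""
    random_root = Path(dataset_root) / "random"
    videos = {}
    if not random_root.is_dir():
        return videos
    for video_dir in sorted(p for p in random_root.iterdir() if p.is_dir()):
        frames = sorted(
            f for f in video_dir.iterdir() if f.suffix.lower() in FRAME_SUFFIXES
        )
        videos[video_dir.name] = frames
    return videos

for name, frames in list_random_videos("./datasets/video_deblur").items():
    print(f"{name}: {len(frames)} frames")
```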

Wiki

Citation

If you find this code useful, please consider citing:

@article{Son_2021_TOG,
    author = {Son, Hyeongseok and Lee, Junyong and Lee, Jonghyeop and Cho, Sunghyun and Lee, Seungyong},
    title = {Recurrent Video Deblurring with Blur-Invariant Motion Estimation and Pixel Volumes},
    journal = {ACM Transactions on Graphics},
    year = {2021}
}

Contact

Open an issue for any inquiries. You may also contact [email protected] or [email protected].

Resources

All material related to our paper is available via the following links:

- The main paper
- arXiv
- Supplementary Files
- Checkpoint Files
- Su et al. [2017]'s dataset (reference)
- Nah et al. [2017]'s dataset (reference)

License

This software is being made available under the terms in the LICENSE file.

Any exemptions to these terms require a license from the Pohang University of Science and Technology.

About Coupe Project

Project ‘COUPE’ aims to develop software that evaluates and improves the quality of images and videos based on big visual data. To achieve this goal, we extract sharpness, color, and composition features from images and develop technologies that use them to restore and improve images. In addition, personalization technology through user reference analysis is under study.

Please check out other Coupe repositories in our Posgraph GitHub organization.
