Towards Flexible Blind JPEG Artifacts Removal (FBCNN, ICCV 2021)

Overview

Towards Flexible Blind JPEG Artifacts Removal (FBCNN, ICCV 2021)

paper download python version pytorch version License

Jiaxi Jiang, Kai Zhang, Radu Timofte

Computer Vision Lab, ETH Zurich, Switzerland


๐Ÿ”ฅ ๐Ÿ”ฅ This repository is the official PyTorch implementation of paper "Towards Flexible Blind JPEG Artifacts Removal". paper download FBCNN achieves state-of-the-art performance in blind JPEG artifacts removal on

  • Single JPEG images (color/grayscale)
  • Double JPEG images (aligned/non-aligned)
  • Real-world JPEG images

Training a single deep blind model to handle different quality factors for JPEG image artifacts removal has been attracting considerable attention due to its convenience for practical usage. However, existing deep blind methods usually directly reconstruct the image without predicting the quality factor, thus lacking the flexibility to control the output as the non-blind methods. To remedy this problem, in this paper, we propose a flexible blind convolutional neural network, namely FBCNN, that can predict the adjustable quality factor to control the trade-off between artifacts removal and details preservation. Specifically, FBCNN decouples the quality factor from the JPEG image via a decoupler module and then embeds the predicted quality factor into the subsequent reconstructor module through a quality factor attention block for flexible control. Besides, we find existing methods are prone to fail on non-aligned double JPEG images even with only a one-pixel shift, and we thus propose a double JPEG degradation model to augment the training data. Extensive experiments on single JPEG images, more general double JPEG images, and real-world JPEG images demonstrate that our proposed FBCNN achieves favorable performance against state-of-the-art methods in terms of both quantitative metrics and visual quality.

๐Ÿš€ ๐Ÿš€ Some Visual Examples (Click for full images)


Training

We will release the training code at KAIR.

Testing

  • Grayscale JPEG images
python main_test_fbcnn_gray.py
  • Grayscale JPEG images, trained with double JPEG degradation model
python main_test_fbcnn_gray_doublejpeg.py
  • Color JPEG images
python main_test_fbcnn_color.py
  • Real-World JPEG images
python main_test_fbcnn_color_real.py

Contents

Motivations

JPEG is one of the most widely-used image compression algorithms and formats due to its simplicity and fast encoding/decoding speeds. However, it is a lossy compression algorithm and can introduce annoying artifacts. Existing methods for JPEG artifacts removal generally have four limitations in real applications:

  • Most existing learning-based methods [e.g. ARCNN, MWCNN, SwinIR] trained a specific model for each quality factor, lacking the flexibility to learn a single model for different JPEG quality factors.

  • DCT-based methods [e.g. DMCNN, QGAC] need to obtain the DCT coefficients or quantization table as input, which is only stored in JPEG format. Besides, when images are compressed multiple times, only the most recent compression information is stored.

  • Existing blind methods [e.g. DnCNN, DCSC, QGAC] can only provide a deterministic reconstruction result for each input, ignoring the need for user preferences.

  • Existing methods are all trained with synthetic images which assumes that the low-quality images are compressed only once. However, most images from the Internet are compressed multiple times. Despite some progress for real recompressed images, e.g. from Twitter [ARCNN, DCSC], a detailed and complete study on double JPEG artifacts removal is still missing.

Network Architecture

We propose a flexible blind convolutional neural network (FBCNN) that predicts the quality factor of a JPEG image and embed it into the decoder to guide image restoration. The quality factor can be manually adjusted for flexible JPEG restoration according to the user's preference. architecture

Analysis of Double JPEG Restoration

1. What is non-aligned double JPEG compression?

Non-aligned double JPEG compression means that the 8x8 blocks of two JPEG compression are not aligned. For example, when we crop a JPEG image and save it also as JPEG, it is highly possible we get a non-aligned double JPEG image. real There are many other common scenarios including, but not limited to:

  • take a picture by smartphone and upload it online. Most social media platforms, e.g. Wechat, Twitter, Facebook, resize the uploaded images by downsampling and then apply JPEG compression to save storage space.
  • Edit a JPEG image that introduces cropping, rotation, or resizing, and save it as JPEG.
  • Zoom in/out a JPEG image, then take a screenshot, save it as JPEG.
  • Group different JPEG image and save it as a single JPEG image.
  • Most memes are compressed many times with non-aligned cases.

2. Limitation of existing blind methods on restoration of non-aligned double JPEG images

We find that existing blind methods always do not work when the 8x8 blocks of two JPEG compression are not aligned and QF1 <= QF2, even with just a one-pixel shift. Other cases such as non-aligned double JPEG with QF1>QF2, or aligned double JPEG compression, are actually equivalent to single JPEG compression.

Here is an example of the restoration result of DnCNN and QGAC on a JPEG image with different degradation settings. '*' means there is a one-pixel shift between two JPEG blocks. lena_doublejpeg

3. Our solutions

We find for non-aligned double JPEG images with QF1 < QF2, FBCNN always predicts the quality factor as QF2. However, it is the smaller QF1 that dominants the compression artifacts. By manually changing the predicted quality factor to QF1, we largely improve the result.

Besides, to get a fully blind model, we propose two blind solutions to solve this problem:

(1) FBCNN-D: Train a model with a single JPEG degradation model + automatic dominant QF correction. By utilizing the property of JPEG images, we find the quality factor of a single JPEG image can be predicted by applying another JPEG compression. When QF1 = QF2, the MSE of two JPEG images is minimal. In our paper, we also extend this method to non-aligned double JPEG cases to get a fully blind model.

(2) FBCNN-A: Augment training data with double JPEG degradation model, which is given by:

y = JPEG(shift(JPEG(x, QF1)),QF2)

By reducing the misalignment of training data and real-world JPEG images, FBCNN-A further improves the results on complex double JPEG restoration. This proposed double JPEG degradation model can be easily integrated into other image restoration tasks, such as single image super-resolution (e.g. BSRGAN), for better general real image restoration.

Experiments

1. Single JPEG restoration

single_table *: Train a specific model for each quality factor. single_compare

2. Non-aligned double JPEG restoration

There is a pixel shift of (4,4) between the blocks of two JPEG compression. double_table double_compare

3. Real-world JPEG restoration

real

4. Flexibility of FBCNN

By setting different quality factors, we can control the trade-off between artifacts removal and details preservation, according to the user's preference. flexible

Citation

@inproceedings{jiang2021towards,
title={Towards Flexible Blind {JPEG} Artifacts Removal},
author={Jiang, Jiaxi and Zhang, Kai and Timofte, Radu},
booktitle={IEEE International Conference on Computer Vision},
year={2021}
}

License and Acknowledgement

This project is released under the Apache 2.0 license. This work was partly supported by the ETH Zรผrich Fund (OK) and a Huawei Technologies Oy (Finland) project.

Comments
  • About the results of fbcnn_color.pth

    About the results of fbcnn_color.pth

    Hi, Great works! I test your model(fbcnn_color.pth) in 'testset/Real' dataset. The results are not so remarkable as the picture in this master. The output of model(fbcnn_color.pth) without qf_input are as follow(left is input, right is output):

    merge_1 Uploading merge_2.jpgโ€ฆ merge_3

    merge_4 merge_5 merge_6 I don't kown if there are something wrong with my results. And the output of model(fbcnn_color.pth) with qf_input are also not so good. When zoom out, I can find obvious artifacts. Hope for your reply.

    question 
    opened by YangGangZhiQi 6
  • Inference is causing Out of Memory error (even for V100)

    Inference is causing Out of Memory error (even for V100)

    Hello, I tried running 'main_test_fbcnn_color.py' on a real JPEG image using one 16 GB V100 but the code threw ยดOut of Memoryยด error. Any idea on how to use this code with large images, say 12 MPix or more?

    opened by wind-surfer 4
  • Real data testing

    Real data testing

    hello, it seems the results of the real dataset are not run from the fbcnn_color.pth ? Would you mind providing the corresponding fbcnn model for the real dataset? Thank you!

    opened by yiyunchen 3
  • Problems about re-produce the  training results  in  QF-estimation loss

    Problems about re-produce the training results in QF-estimation loss

    Hi, It is a nice job to enable flexible QF embedding in JPEG deblocking! I want to re-produce the results following the directions in the paper, However, there are some questions.

    1. qf_loss Following the paper, I calculate the loss as:
                input = batch["degree"].type(Tensor)
                label = batch["label"].type(Tensor)
                qf_label = batch["qf"].type(Tensor)
                qf_label = (1-qf_label/100) # 0-1
                out, qf_pred = model(input)
    
                mse = mse_pixelwise(out, label)
    
                cls_loss = l1_pixelwise2(qf_pred, qf_label)
    
                loss = l1_pixelwise(out, label) * 0.5  + mse * 0.5 + cls_loss * 0.1
    

    However, during the training phase, the cls_loss is not changed at all

    2021-10-09 22:38:47,481 - __main__ - INFO - 
    [QF 10 lr 0.000100 cls_loss:0.23448  loss:0.04852 Epoch 0/120] [Batch 19/203] [psnr: 19.300194 ]  ETA: 6:18:37.362892   [mse: 0.005858 ] [mse_original: 0.000583 ]
    2021-10-09 22:39:08,319 - __main__ - INFO - 
    [QF 10 lr 0.000100 cls_loss:0.20090  loss:0.03645 Epoch 0/120] [Batch 39/203] [psnr: 21.449274 ]  ETA: 7:02:01.486693   [mse: 0.003679 ] [mse_original: 0.000384 ]
    2021-10-09 22:39:29,365 - __main__ - INFO - 
    
    2021-10-10 05:11:26,158 - __main__ - INFO - 
    [QF 10 lr 0.000020 cls_loss:0.20545  loss:0.02607 Epoch 119/120] [Batch 159/203] [psnr: 34.804854 ]  ETA: 0:00:41.660994   [mse: 0.000345 ] [mse_original: 0.000475 ]
    2021-10-10 05:11:45,303 - __main__ - INFO - 
    [QF 10 lr 0.000020 cls_loss:0.22516  loss:0.02851 Epoch 119/120] [Batch 179/203] [psnr: 34.770472 ]  ETA: 0:00:22.570621   [mse: 0.000385 ] [mse_original: 0.000513 ]
    2021-10-10 05:12:04,498 - __main__ - INFO - 
    [QF 10 lr 0.000020 cls_loss:0.18704  loss:0.02385 Epoch 119/120] [Batch 199/203] [psnr: 34.775089 ]  ETA: 0:00:03.771435   [mse: 0.000276 ] [mse_original: 0.000377 ]
    2021-10-10 05:12:12,970 - __main__ - INFO - QF: 10 [PSNR: live1: 29.005289    classic: 29.159697]  
     [SSIM: live1: 0.806543    classic: 0.795164] 
     [max PSNR in live1: 29.005289, max epoch: 119]
    

    The cls_loss is always at 2, and when I test the qf estimation with input image in qf=[10,20,..,90], the qf estimation is always 53,

    which shows that the cls_loss does not work well. It is reasonable that qf estimation is 50 since it is good for L1 loss when not trained well.

    Due to the GPU limitation, I set the Batchsize = 96, and training with DIV2K patches for about 2.5W iters

    Is there anything wrong in my implementation, thanks!

    opened by JuZiSYJ 3
  • About the result  under rgb2ycbcr of opencv version

    About the result under rgb2ycbcr of opencv version

    Hi, I test the result on classic5 and live1 in gray with pre-trained weight

    However, I find some mistakes caused by rgb2ycbcr function

    Here is my offline test result image

    where classic5 is ok because it is gray input, and the live1 result is strange because of the rgb2ycbcr function.

    The GT I compare is produced by Matlab version YCbCr, which is different from OpenCV version which is utilized in your training and eval process. if n_channels == 3: img_L = cv2.cvtColor(img_L, cv2.COLOR_RGB2BGR) _, encimg = cv2.imencode('.jpg', img_L, [int(cv2.IMWRITE_JPEG_QUALITY), quality_factor]) img_L = cv2.imdecode(encimg, 0) if n_channels == 1 else cv2.imdecode(encimg, 3)

    The logging result in live1_qf10 is 21-12-23 15:43:59.031 : Average PSNR/SSIM/PSNRB - live1_fbcnn_gray_10 -: 28.96$\vert$0.8254$\vert$28.64. which is a little higher than 26.97 but much lower than 29.75, because the model is for OpenCV version Y, not Matlab version

    However, the common setting in deblocking is using Matlab version Y

    I test ICCV21-Learning Dual Priors for JPEG Compression Artifacts Removal, the offline results are image

    It is similar to the results produced in paper

    I hope it may be helpful.

    opened by JuZiSYJ 2
  • run color real error

    run color real error

    run default code, show me error: model.load_state_dict(torch.load(model_path), strict=True): error in loading state_stict for FBCNN: Missing keys in state_stict : ......

    but when run fbcnn_color.py , it is ok. why ?

    opened by gao123qiang 1
  • Slight color(chroma) brightness change

    Slight color(chroma) brightness change

    Hello @jiaxi-jiang, Sorry to bother you,

    I test some non-photo content jpeg(2d drawing),
    high FBCNN QF can preserve detail and noise,
    but I notice some area have slight color(chroma) brightness change.

    In this case, FBCNN QF 70 have slight brightness change on dark red area(red circle),
    could you teach me how to improve color accurate for non-photo content?

    original image (jpeg q75 420),
    153967648-8051d8d9-38ec-4e2a-a4d7-fd0cd3e81c4d

    png 8 bit depth (FBCNN QF 70)
    153967648-8051d8d9-38ec-4e2a-a4d7-fd0cd3e81c4d_qf_70_red

    Other sample(QF 30,50,70)

    sample.zip

    opened by Lee-lithium 2
  • Image tile and output 16 bit depth png

    Image tile and output 16 bit depth png

    Hello @jiaxi-jiang, Sorry to bother you,

    I plan apply FBCNN for my non-photo content jpeg(2d drawing),
    but I have some question about tool implement,
    could you teach me about those question?

    1. I have some large jpeg(4441x6213) want apply FBCNN,
      but I don't have enough RAM to process this image,
      probably can implement tile function for FBCNN?

    I find a split_imageset function in FBCNN utils_image.py,
    I try to invoke this function,
    but I haven't find a good method to implement split and merge function.

    1. I notice FBCNN output is 8 bit depth png, probably implement output(input) 16 bit depth png function, can get better result?

    Thank you produce this amazing tool. :)

    opened by Lee-lithium 2
  • Colab

    Colab

    I'm excited to watch the demo video. Please release the code for Google colab so that we can try it easily. I am looking forward to your next update. Thank you.

    opened by osushilover 0
Releases(v1.0)
Edge-oriented Convolution Block for Real-time Super Resolution on Mobile Devices, ACM Multimedia 2021

Codes for ECBSR Edge-oriented Convolution Block for Real-time Super Resolution on Mobile Devices Xindong Zhang, Hui Zeng, Lei Zhang ACM Multimedia 202

xindong zhang 236 Dec 26, 2022
Pytorch implementation of DeePSiM

Pytorch implementation of DeePSiM

1 Nov 05, 2021
Open source implementation of AceNAS: Learning to Rank Ace Neural Architectures with Weak Supervision of Weight Sharing

AceNAS This repo is the experiment code of AceNAS, and is not considered as an official release. We are working on integrating AceNAS as a built-in st

Yuge Zhang 6 Sep 07, 2022
Selective Wavelet Attention Learning for Single Image Deraining

SWAL Code for Paper "Selective Wavelet Attention Learning for Single Image Deraining" Prerequisites Python 3 PyTorch Models We provide the models trai

Bobo 9 Jun 17, 2022
PyTorch META-DATASET (Few-shot classification benchmark)

PyTorch META-DATASET (Few-shot classification benchmark) This repo contains a PyTorch implementation of meta-dataset and a unified implementation of s

Malik Boudiaf 39 Oct 31, 2022
๐Ÿ€ Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.โญโญโญ

๐Ÿ€ Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.โญโญโญ

xmu-xiaoma66 7.7k Jan 05, 2023
๐Ÿฅ‡ LG-AI-Challenge 2022 1์œ„ ์†”๋ฃจ์…˜ ์ž…๋‹ˆ๋‹ค.

LG-AI-Challenge-for-Plant-Classification Dacon์—์„œ ์ง„ํ–‰๋œ ๋†์—… ํ™˜๊ฒฝ ๋ณ€ํ™”์— ๋”ฐ๋ฅธ ์ž‘๋ฌผ ๋ณ‘ํ•ด ์ง„๋‹จ AI ๊ฒฝ์ง„๋Œ€ํšŒ ์— ๋Œ€ํ•œ ์ฝ”๋“œ์ž…๋‹ˆ๋‹ค. (colab directory์— ์ฝ”๋“œ๊ฐ€ ์ž˜ ์ •๋ฆฌ ๋˜์–ด์žˆ์Šต๋‹ˆ๋‹ค.) Requirements python

siwooyong 10 Jun 30, 2022
Code for paper "A Critical Assessment of State-of-the-Art in Entity Alignment" (https://arxiv.org/abs/2010.16314)

A Critical Assessment of State-of-the-Art in Entity Alignment This repository contains the source code for the paper A Critical Assessment of State-of

Max Berrendorf 16 Oct 14, 2022
RADIal is available now! Check the download section

Latest news: RADIal is available now! Check the download section. However, because we are currently working on the data anonymization, we provide for

valeo.ai 55 Jan 03, 2023
ColossalAI-Examples - Examples of training models with hybrid parallelism using ColossalAI

ColossalAI-Examples This repository contains examples of training models with Co

HPC-AI Tech 185 Jan 09, 2023
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.

TensorFlowOnSpark TensorFlowOnSpark brings scalable deep learning to Apache Hadoop and Apache Spark clusters. By combining salient features from the T

Yahoo 3.8k Jan 04, 2023
AI Based Smart Exam Proctoring Package

AI Based Smart Exam Proctoring Package It takes image (base64) as input: Provide Output as: Detection of Mobile phone. Detection of More than 1 person

NARENDER KESWANI 3 Sep 09, 2022
A Fast and Stable GAN for Small and High Resolution Imagesets - pytorch

A Fast and Stable GAN for Small and High Resolution Imagesets - pytorch The official pytorch implementation of the paper "Towards Faster and Stabilize

Bingchen Liu 455 Jan 08, 2023
TorchGRL is the source code for our paper Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Mixed Traffic Environments for IV 2022.

TorchGRL TorchGRL is the source code for our paper Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Mixed Traffi

XXQQ 42 Dec 09, 2022
Supplemental learning materials for "Fourier Feature Networks and Neural Volume Rendering"

Fourier Feature Networks and Neural Volume Rendering This repository is a companion to a lecture given at the University of Cambridge Engineering Depa

Matthew A Johnson 133 Dec 26, 2022
[CVPR 2021] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion

[CVPR 2021] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion

Rex Cheng 364 Jan 03, 2023
Long Expressive Memory (LEM)

Long Expressive Memory for Sequence Modeling This repository contains the implementation to reproduce the numerical experiments of the paper Long Expr

Konstantin Rusch 47 Dec 17, 2022
A tool to estimate time varying instantaneous reproduction number during epidemics

EpiEstim A tool to estimate time varying instantaneous reproduction number during epidemics. It is described in the following paper: @article{Cori2013

MRC Centre for Global Infectious Disease Analysis 78 Dec 19, 2022
PyTorch implementation of MSBG hearing loss model and MBSTOI intelligibility metric

PyTorch implementation of MSBG hearing loss model and MBSTOI intelligibility metric This repository contains the implementation of MSBG hearing loss m

BUT <a href=[email protected]"> 9 Nov 08, 2022
FG-transformer-TTS Fine-grained style control in transformer-based text-to-speech synthesis

LST-TTS Official implementation for the paper Fine-grained style control in transformer-based text-to-speech synthesis. Submitted to ICASSP 2022. Audi

Li-Wei Chen 64 Dec 30, 2022