[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis

Overview

Focal Frequency Loss - Official PyTorch Implementation

teaser

This repository provides the official PyTorch implementation for the following paper:

Focal Frequency Loss for Image Reconstruction and Synthesis
Liming Jiang, Bo Dai, Wayne Wu and Chen Change Loy
In ICCV 2021.
Project Page | Paper | Poster | Slides | YouTube Demo

Abstract: Image reconstruction and synthesis have witnessed remarkable progress thanks to the development of generative models. Nonetheless, gaps could still exist between the real and generated images, especially in the frequency domain. In this study, we show that narrowing gaps in the frequency domain can ameliorate image reconstruction and synthesis quality further. We propose a novel focal frequency loss, which allows a model to adaptively focus on frequency components that are hard to synthesize by down-weighting the easy ones. This objective function is complementary to existing spatial losses, offering great impedance against the loss of important frequency information due to the inherent bias of neural networks. We demonstrate the versatility and effectiveness of focal frequency loss to improve popular models, such as VAE, pix2pix, and SPADE, in both perceptual quality and quantitative performance. We further show its potential on StyleGAN2.

Updates

  • [09/2021] The code of Focal Frequency Loss is released.

  • [07/2021] The paper of Focal Frequency Loss is accepted by ICCV 2021.

Quick Start

Run pip install focal-frequency-loss for installation. Then, the following code is all you need.

from focal_frequency_loss import FocalFrequencyLoss as FFL
ffl = FFL(loss_weight=1.0, alpha=1.0)  # initialize nn.Module class

import torch
fake = torch.randn(4, 3, 64, 64)  # replace it with the predicted tensor of shape (N, C, H, W)
real = torch.randn(4, 3, 64, 64)  # replace it with the target tensor of shape (N, C, H, W)

loss = ffl(fake, real)  # calculate focal frequency loss

Tips:

  1. Current supported PyTorch version: torch>=1.1.0. Warnings can be ignored. Please note that experiments in the paper were conducted with torch<=1.7.1,>=1.1.0.
  2. Arguments to initialize the FocalFrequencyLoss class:
    • loss_weight (float): weight for focal frequency loss. Default: 1.0
    • alpha (float): the scaling factor alpha of the spectrum weight matrix for flexibility. Default: 1.0
    • patch_factor (int): the factor to crop image patches for patch-based focal frequency loss. Default: 1
    • ave_spectrum (bool): whether to use minibatch average spectrum. Default: False
    • log_matrix (bool): whether to adjust the spectrum weight matrix by logarithm. Default: False
    • batch_matrix (bool): whether to calculate the spectrum weight matrix using batch-based statistics. Default: False
  3. Experience shows that the main hyperparameters you need to adjust are loss_weight and alpha. The loss weight may always need to be adjusted first. Then, a larger alpha indicates that the model is more focused. We use alpha=1.0 as default.

Exmaple: Image Reconstruction (Vanilla AE)

As a guide, we provide an example of applying the proposed focal frequency loss (FFL) for Vanilla AE image reconstruction on CelebA. Applying FFL is pretty easy. The core details can be found here.

Installation

After installing Anaconda, we recommend you to create a new conda environment with python 3.8.3:

conda create -n ffl python=3.8.3 -y
conda activate ffl

Clone this repo, install PyTorch 1.4.0 (torch>=1.1.0 may also work) and other dependencies:

git clone https://github.com/EndlessSora/focal-frequency-loss.git
cd focal-frequency-loss
pip install -r VanillaAE/requirements.txt

Dataset Preparation

In this example, please download img_align_celeba.zip of the CelebA dataset from its official website. Then, we highly recommend you to unzip this file and symlink the img_align_celeba folder to ./datasets/celeba by:

bash scripts/datasets/prepare_celeba.sh [PATH_TO_IMG_ALIGN_CELEBA]

Or you can simply move the img_align_celeba folder to ./datasets/celeba. The resulting directory structure should be:

├── datasets
│    ├── celeba
│    │    ├── img_align_celeba  
│    │    │    ├── 000001.jpg
│    │    │    ├── 000002.jpg
│    │    │    ├── 000003.jpg
│    │    │    ├── ...

Test and Evaluation Metrics

Download the pretrained models and unzip them to ./VanillaAE/experiments.

We have provided the example test scripts. If you only have a CPU environment, please specify --no_cuda in the script. Run:

bash scripts/VanillaAE/test/celeba_recon_wo_ffl.sh
bash scripts/VanillaAE/test/celeba_recon_w_ffl.sh

The Vanilla AE image reconstruction results will be saved at ./VanillaAE/results by default.

After testing, you can further calculate the evaluation metrics for this example. We have implemented a series of evaluation metrics we used and provided the metric scripts. Run:

bash scripts/VanillaAE/metrics/celeba_recon_wo_ffl.sh
bash scripts/VanillaAE/metrics/celeba_recon_w_ffl.sh

You will see the scores of different metrics. The metric logs will be saved in the respective experiment folders at ./VanillaAE/results.

Training

We have provided the example training scripts. If you only have a CPU environment, please specify --no_cuda in the script. Run:

bash scripts/VanillaAE/train/celeba_recon_wo_ffl.sh
bash scripts/VanillaAE/train/celeba_recon_w_ffl.sh 

After training, inference on the newly trained models is similar to Test and Evaluation Metrics. The results could be better reproduced on NVIDIA Tesla V100 GPUs with torch<=1.7.1,>=1.1.0.

More Results

Here, we show other examples of applying the proposed focal frequency loss (FFL) under diverse settings.

Image Reconstruction (VAE)

reconvae

Image-to-Image Translation (pix2pix | SPADE)

consynI2I

Unconditional Image Synthesis (StyleGAN2)

256x256 results (without truncation) and the mini-batch average spectra (adjusted to better contrast):

unsynsg2res256

1024x1024 results (without truncation) synthesized by StyleGAN2 with FFL:

unsynsg2res1024

Citation

If you find this work useful for your research, please cite our paper:

@inproceedings{jiang2021focal,
  title={Focal Frequency Loss for Image Reconstruction and Synthesis},
  author={Jiang, Liming and Dai, Bo and Wu, Wayne and Loy, Chen Change},
  booktitle={ICCV},
  year={2021}
}

Acknowledgments

The code of Vanilla AE is inspired by PyTorch DCGAN and MUNIT. Part of the evaluation metric code is borrowed from MMEditing. We also apply LPIPS and pytorch-fid as evaluation metrics.

License

All rights reserved. The code is released under the MIT License.

Copyright (c) 2021

Owner
Liming Jiang
Ph.D. student, [email protected]
Liming Jiang
This is the official implementation for the paper "Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization" in NeurIPS 2021.

MPMAB_BEACON This is code used for the paper "Decentralized Multi-player Multi-armed Bandits: Beyond Linear Reward Functions", Neurips 2021. Requireme

Cong Shen Research Group 0 Oct 26, 2021
Code repository for "Free View Synthesis", ECCV 2020.

Free View Synthesis Code repository for "Free View Synthesis", ECCV 2020. Setup Install the following Python packages in your Python environment - num

Intelligent Systems Lab Org 253 Dec 07, 2022
RoMa: A lightweight library to deal with 3D rotations in PyTorch.

RoMa: A lightweight library to deal with 3D rotations in PyTorch. RoMa (which stands for Rotation Manipulation) provides differentiable mappings betwe

NAVER 90 Dec 27, 2022
ZeroGen: Efficient Zero-shot Learning via Dataset Generation

ZEROGEN This repository contains the code for our paper “ZeroGen: Efficient Zero

Jiacheng Ye 31 Dec 30, 2022
Using contrastive learning and OpenAI's CLIP to find good embeddings for images with lossy transformations

Creating Robust Representations from Pre-Trained Image Encoders using Contrastive Learning Sriram Ravula, Georgios Smyrnis This is the code for our pr

Sriram Ravula 26 Dec 10, 2022
SOTR: Segmenting Objects with Transformers [ICCV 2021]

SOTR: Segmenting Objects with Transformers [ICCV 2021] By Ruohao Guo, Dantong Niu, Liao Qu, Zhenbo Li Introduction This is the official implementation

186 Dec 20, 2022
Official code for "EagerMOT: 3D Multi-Object Tracking via Sensor Fusion" [ICRA 2021]

EagerMOT: 3D Multi-Object Tracking via Sensor Fusion Read our ICRA 2021 paper here. Check out the 3 minute video for the quick intro or the full prese

Aleksandr Kim 276 Dec 30, 2022
Code base for reproducing results of I.Schubert, D.Driess, O.Oguz, and M.Toussaint: Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics. NeurIPS (2021)

Learning to Execute (L2E) Official code base for completely reproducing all results reported in I.Schubert, D.Driess, O.Oguz, and M.Toussaint: Learnin

3 May 18, 2022
Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.

HAWQ: Hessian AWare Quantization HAWQ is an advanced quantization library written for PyTorch. HAWQ enables low-precision and mixed-precision uniform

Zhen Dong 293 Dec 30, 2022
RLDS stands for Reinforcement Learning Datasets

RLDS RLDS stands for Reinforcement Learning Datasets and it is an ecosystem of tools to store, retrieve and manipulate episodic data in the context of

Google Research 135 Jan 01, 2023
一套完整的微博舆情分析流程代码,包括微博爬虫、LDA主题分析和情感分析。

已经将项目的关键文件上传,包含微博爬虫、LDA主题分析和情感分析三个部分。 1.微博爬虫 实现微博评论爬取和微博用户信息爬取,一天大概十万条。 2.LDA主题分析 实现文档主题抽取,包括数据清洗及分词、主题数的确定(主题一致性和困惑度)和最优主题模型的选择(暴力搜索)。 3.情感分析 实现评论文本的

182 Jan 02, 2023
PyVideoAI: Action Recognition Framework

This reposity contains official implementation of: Capturing Temporal Information in a Single Frame: Channel Sampling Strategies for Action Recognitio

Kiyoon Kim 22 Dec 29, 2022
CM building dataset Timisoara

CM_building_dataset_Timisoara Date created: Febr-2020 The Timi\c{s}oara Building Dataset - TMBuD - is composed of 160 images with the resolution of 76

Orhei Ciprian 5 Sep 07, 2022
Code for "Optimizing risk-based breast cancer screening policies with reinforcement learning"

Tempo: Optimizing risk-based breast cancer screening policies with reinforcement learning Introduction This repository was used to develop Tempo, as d

Adam Yala 12 Oct 11, 2022
Deep and online learning with spiking neural networks in Python

Introduction The brain is the perfect place to look for inspiration to develop more efficient neural networks. One of the main differences with modern

Jason Eshraghian 447 Jan 03, 2023
A Context-aware Visual Attention-based training pipeline for Object Detection from a Webpage screenshot!

CoVA: Context-aware Visual Attention for Webpage Information Extraction Abstract Webpage information extraction (WIE) is an important step to create k

Keval Morabia 41 Jan 01, 2023
Make a Turtlebot3 follow a figure 8 trajectory and create a robot arm and make it follow a trajectory

HW2 - ME 495 Overview Part 1: Makes the robot move in a figure 8 shape. The robot starts moving when launched on a real turtlebot3 and can be paused a

Devesh Bhura 0 Oct 21, 2022
FaRL for Facial Representation Learning

FaRL for Facial Representation Learning This repo hosts official implementation of our paper General Facial Representation Learning in a Visual-Lingui

Microsoft 19 Jan 05, 2022
Official Tensorflow implementation of U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation (ICLR 2020)

U-GAT-IT — Official TensorFlow Implementation (ICLR 2020) : Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization fo

Junho Kim 6.2k Jan 04, 2023
Node for thenewboston digital currency network.

Project setup For project setup see INSTALL.rst Community Join the community to stay updated on the most recent developments, project roadmaps, and ra

thenewboston 27 Jul 08, 2022