[ACM MM 2021] Diverse Image Inpainting with Bidirectional and Autoregressive Transformers

Related tags

Deep LearningBAT-Fill
Overview

Diverse Image Inpainting with Bidirectional and Autoregressive Transformers

Installation

pip install -r requirements.txt

Dataset Preparation

Given the dataset, please prepare the images paths in a folder named by the dataset with the following folder strcuture.

    flist/dataset_name
        ├── train.flist    # paths of training images
        ├── valid.flist    # paths of validation images
        └── test.flist     # paths of testing images

In this work, we use CelebA-HQ (Download availbale here), Places2 (Download availbale here), ParisStreet View (need author's permission to download)

ImageNet K-means Cluster: The kmeans_centers.npy is downloaded from image-gpt, it's used to quantitize the low-resolution images.

Testing with Pre-trained Models

  1. Download pre-trained models:
  1. Put the pre-trained model under the checkpoints folder, e.g.
    checkpoints
        ├── celebahq_bat_pretrain
            ├── latest_net_G.pth 
  1. Prepare the input images and masks to test.
python bat_sample.py --num_sample [1] --tran_model [bat name] --up_model [upsampler name] --input_dir [dir of input] --mask_dir [dir of mask] --save_dir [dir to save results]

Training New Models

Pretrained VGG model Download from here, move it to models/. This model is used to calculate training loss for the upsampler.

New models can be trained with the following commands.

  1. Prepare dataset. Use --dataroot option to locate the directory of file lists, e.g. ./flist, and specify the dataset name to train with --dataset_name option. Identify the types and mask ratio using --mask_type and --pconv_level options.

  2. Train the transformer.

# To specify your own dataset or settings in the bash file.
bash train_bat.sh

Please note that some of the transformer settings are defined in train_bat.py instead of options/, and this script will take every available gpus for training, please define the GPUs via CUDA_VISIBLE_DEVICES instead of --gpu_ids, which is used for the upsampler.

  1. Train the upsampler.
# To specify your own dataset or settings in the bash file.
bash train_up.sh

The upsampler is typically trained by the low-resolution ground truth, we find that using some samples from the trained BAT might be helpful to improve the performance i.e. PSNR, SSIM. But the sampling process is quite time consuming, training with ground truth also could yield reasonable results.

Citation

If you find this code helpful for your research, please cite our papers.

@inproceedings{yu2021diverse,
  title={Diverse Image Inpainting with Bidirectional and Autoregressive Transformers},
  author={Yu, Yingchen and Zhan, Fangneng and Wu, Rongliang and Pan, Jianxiong and Cui, Kaiwen and Lu, Shijian and Ma, Feiying and Xie, Xuansong and Miao, Chunyan},
  booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
  year={2021}
}

Acknowledgments

This code borrows heavily from SPADE and minGPT, we apprecite the authors for sharing their codes.

Owner
Yingchen Yu
Yingchen Yu
FG-transformer-TTS Fine-grained style control in transformer-based text-to-speech synthesis

LST-TTS Official implementation for the paper Fine-grained style control in transformer-based text-to-speech synthesis. Submitted to ICASSP 2022. Audi

Li-Wei Chen 64 Dec 30, 2022
REGTR: End-to-end Point Cloud Correspondences with Transformers

REGTR: End-to-end Point Cloud Correspondences with Transformers This repository contains the source code for REGTR. REGTR utilizes multiple transforme

Zi Jian Yew 108 Dec 17, 2022
Re-implememtation of MAE (Masked Autoencoders Are Scalable Vision Learners) using PyTorch.

mae-repo PyTorch re-implememtation of "masked autoencoders are scalable vision learners". In this repo, it heavily borrows codes from codebase https:/

Peng Qiao 1 Dec 14, 2021
Existing Literature about Machine Unlearning

Machine Unlearning Papers 2021 Brophy and Lowd. Machine Unlearning for Random Forests. In ICML 2021. Bourtoule et al. Machine Unlearning. In IEEE Symp

Jonathan Brophy 213 Jan 08, 2023
Implementation for paper MLP-Mixer: An all-MLP Architecture for Vision

MLP Mixer Implementation for paper MLP-Mixer: An all-MLP Architecture for Vision. Give us a star if you like this repo. Author: Github: bangoc123 Emai

Ngoc Nguyen Ba 86 Dec 10, 2022
Example for AUAV 2022 with obstacle avoidance.

AUAV 2022 Sample This is a sample PX4 based quadrotor path planning framework based on Ubuntu 20.04 and ROS noetic for the IEEE Autonomous UAS 2022 co

James Goppert 11 Sep 16, 2022
Text completion with Hugging Face and TensorFlow.js running on Node.js

Katana ML Text Completion 🤗 Description Runs with with Hugging Face DistilBERT and TensorFlow.js on Node.js distilbert-model - converter from Hugging

Katana ML 2 Nov 04, 2022
PyTorch Implementation of Region Similarity Representation Learning (ReSim)

ReSim This repository provides the PyTorch implementation of Region Similarity Representation Learning (ReSim) described in this paper: @Article{xiao2

Tete Xiao 74 Jan 03, 2023
Practical Blind Denoising via Swin-Conv-UNet and Data Synthesis

Practical Blind Denoising via Swin-Conv-UNet and Data Synthesis [Paper] [Online Demo] The following results are obtained by our SCUNet with purely syn

Kai Zhang 312 Jan 07, 2023
Temporal Knowledge Graph Reasoning Triggered by Memories

MTDM Temporal Knowledge Graph Reasoning Triggered by Memories To alleviate the time dependence, we propose a memory-triggered decision-making (MTDM) n

4 Sep 25, 2022
Author's PyTorch implementation of Randomized Ensembled Double Q-Learning (REDQ) algorithm.

REDQ source code Author's PyTorch implementation of Randomized Ensembled Double Q-Learning (REDQ) algorithm. Paper link: https://arxiv.org/abs/2101.05

109 Dec 16, 2022
A 2D Visual Localization Framework based on Essential Matrices [ICRA2020]

A 2D Visual Localization Framework based on Essential Matrices This repository provides implementation of our paper accepted at ICRA: To Learn or Not

Qunjie Zhou 27 Nov 07, 2022
A Bayesian cognition approach for belief updating of correlation judgement through uncertainty visualizations

Overview Code and supplemental materials for Karduni et al., 2020 IEEE Vis. "A Bayesian cognition approach for belief updating of correlation judgemen

Ryan Wesslen 1 Feb 08, 2022
A blender add-on that automatically re-aligns wrong axis objects.

Auto Align A blender add-on that automatically re-aligns wrong axis objects. Usage There are three options available in the 3D Viewport Sidebar It

29 Nov 25, 2022
Source code for paper "Deep Diffusion Models for Robust Channel Estimation", TBA.

diffusion-channels Source code for paper "Deep Diffusion Models for Robust Channel Estimation". Generic flow: Use 'matlab/main.mat' to generate traini

The University of Texas Computational Sensing and Imaging Lab 15 Dec 22, 2022
Train a state-of-the-art yolov3 object detector from scratch!

TrainYourOwnYOLO: Building a Custom Object Detector from Scratch This repo let's you train a custom image detector using the state-of-the-art YOLOv3 c

AntonMu 616 Jan 08, 2023
BboxToolkit is a tiny library of special bounding boxes.

BboxToolkit is a light codebase collecting some practical functions for the special-shape detection, such as oriented detection

jbwang1997 73 Jan 01, 2023
Semantically Contrastive Learning for Low-light Image Enhancement

Semantically Contrastive Learning for Low-light Image Enhancement Here, we propose an effective semantically contrastive learning paradigm for Low-lig

48 Dec 16, 2022
TrackFormer: Multi-Object Tracking with Transformers

TrackFormer: Multi-Object Tracking with Transformers This repository provides the official implementation of the TrackFormer: Multi-Object Tracking wi

Tim Meinhardt 321 Dec 29, 2022
Parallel and High-Fidelity Text-to-Lip Generation; AAAI 2022 ; Official code

Parallel and High-Fidelity Text-to-Lip Generation This repository is the official PyTorch implementation of our AAAI-2022 paper, in which we propose P

Zhying 77 Dec 21, 2022