[ACM MM 2021] Diverse Image Inpainting with Bidirectional and Autoregressive Transformers

Related tags

Deep LearningBAT-Fill
Overview

Diverse Image Inpainting with Bidirectional and Autoregressive Transformers

Installation

pip install -r requirements.txt

Dataset Preparation

Given the dataset, please prepare the images paths in a folder named by the dataset with the following folder strcuture.

    flist/dataset_name
        ├── train.flist    # paths of training images
        ├── valid.flist    # paths of validation images
        └── test.flist     # paths of testing images

In this work, we use CelebA-HQ (Download availbale here), Places2 (Download availbale here), ParisStreet View (need author's permission to download)

ImageNet K-means Cluster: The kmeans_centers.npy is downloaded from image-gpt, it's used to quantitize the low-resolution images.

Testing with Pre-trained Models

  1. Download pre-trained models:
  1. Put the pre-trained model under the checkpoints folder, e.g.
    checkpoints
        ├── celebahq_bat_pretrain
            ├── latest_net_G.pth 
  1. Prepare the input images and masks to test.
python bat_sample.py --num_sample [1] --tran_model [bat name] --up_model [upsampler name] --input_dir [dir of input] --mask_dir [dir of mask] --save_dir [dir to save results]

Training New Models

Pretrained VGG model Download from here, move it to models/. This model is used to calculate training loss for the upsampler.

New models can be trained with the following commands.

  1. Prepare dataset. Use --dataroot option to locate the directory of file lists, e.g. ./flist, and specify the dataset name to train with --dataset_name option. Identify the types and mask ratio using --mask_type and --pconv_level options.

  2. Train the transformer.

# To specify your own dataset or settings in the bash file.
bash train_bat.sh

Please note that some of the transformer settings are defined in train_bat.py instead of options/, and this script will take every available gpus for training, please define the GPUs via CUDA_VISIBLE_DEVICES instead of --gpu_ids, which is used for the upsampler.

  1. Train the upsampler.
# To specify your own dataset or settings in the bash file.
bash train_up.sh

The upsampler is typically trained by the low-resolution ground truth, we find that using some samples from the trained BAT might be helpful to improve the performance i.e. PSNR, SSIM. But the sampling process is quite time consuming, training with ground truth also could yield reasonable results.

Citation

If you find this code helpful for your research, please cite our papers.

@inproceedings{yu2021diverse,
  title={Diverse Image Inpainting with Bidirectional and Autoregressive Transformers},
  author={Yu, Yingchen and Zhan, Fangneng and Wu, Rongliang and Pan, Jianxiong and Cui, Kaiwen and Lu, Shijian and Ma, Feiying and Xie, Xuansong and Miao, Chunyan},
  booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
  year={2021}
}

Acknowledgments

This code borrows heavily from SPADE and minGPT, we apprecite the authors for sharing their codes.

Owner
Yingchen Yu
Yingchen Yu
Finetune SSL models for MOS prediction

Finetune SSL models for MOS prediction This is code for our paper under review for ICASSP 2022: "Generalization Ability of MOS Prediction Networks" Er

Yamagishi and Echizen Laboratories, National Institute of Informatics 32 Nov 22, 2022
(NeurIPS 2021) Realistic Evaluation of Transductive Few-Shot Learning

Realistic evaluation of transductive few-shot learning Introduction This repo contains the code for our NeurIPS 2021 submitted paper "Realistic evalua

Olivier Veilleux 14 Dec 13, 2022
Deeply Supervised, Layer-wise Prediction-aware (DSLP) Transformer for Non-autoregressive Neural Machine Translation

Non-Autoregressive Translation with Layer-Wise Prediction and Deep Supervision Training Efficiency We show the training efficiency of our DSLP model b

Chenyang Huang 36 Oct 31, 2022
Distance Encoding for GNN Design

Distance-encoding for GNN design This repository is the official PyTorch implementation of the DEGNN and DEAGNN framework reported in the paper: Dista

172 Nov 08, 2022
XViT - Space-time Mixing Attention for Video Transformer

XViT - Space-time Mixing Attention for Video Transformer This is the official implementation of the XViT paper: @inproceedings{bulat2021space, title

Adrian Bulat 33 Dec 23, 2022
Phy-Q: A Benchmark for Physical Reasoning

Phy-Q: A Benchmark for Physical Reasoning Cheng Xue*, Vimukthini Pinto*, Chathura Gamage* Ekaterina Nikonova, Peng Zhang, Jochen Renz School of Comput

29 Dec 19, 2022
Official PyTorch implementation of paper: Standardized Max Logits: A Simple yet Effective Approach for Identifying Unexpected Road Obstacles in Urban-Scene Segmentation (ICCV 2021 Oral Presentation)

SML (ICCV 2021, Oral) : Official Pytorch Implementation This repository provides the official PyTorch implementation of the following paper: Standardi

SangHun 61 Dec 27, 2022
CLUES: Few-Shot Learning Evaluation in Natural Language Understanding

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding This repo contains the data and source code for baseline models in the NeurIPS 2

Microsoft 29 Dec 29, 2022
Tensorflow Implementation of SMU: SMOOTH ACTIVATION FUNCTION FOR DEEP NETWORKS USING SMOOTHING MAXIMUM TECHNIQUE

SMU A Tensorflow Implementation of SMU: SMOOTH ACTIVATION FUNCTION FOR DEEP NETWORKS USING SMOOTHING MAXIMUM TECHNIQUE arXiv https://arxiv.org/abs/211

Fuhang 5 Jan 18, 2022
Kaggle | 9th place single model solution for TGS Salt Identification Challenge

UNet for segmenting salt deposits from seismic images with PyTorch. General We, tugstugi and xuyuan, have participated in the Kaggle competition TGS S

Erdene-Ochir Tuguldur 276 Dec 20, 2022
An open software package to develop BCI based brain and cognitive computing technology for recognizing user's intention using deep learning

An open software package to develop BCI based brain and cognitive computing technology for recognizing user's intention using deep learning

deepbci 272 Jan 08, 2023
A framework for attentive explainable deep learning on tabular data

🧠 kendrite A framework for attentive explainable deep learning on tabular data 💨 Quick start kedro run 🧱 Built upon Technology Description Links ke

Marnix Koops 3 Nov 06, 2021
a delightful machine learning tool that allows you to train, test and use models without writing code

igel A delightful machine learning tool that allows you to train/fit, test and use models without writing code Note I'm also working on a GUI desktop

Nidhal Baccouri 3k Jan 05, 2023
WormMovementSimulation - 3D Simulation of Worm Body Movement with Neurons attached to its body

Generate 3D Locomotion Data This module is intended to create 2D video trajector

1 Aug 09, 2022
Attendance Monitoring with Face Recognition using Python

Attendance Monitoring with Face Recognition using Python A python GUI integrated attendance system using face recognition to take attendance. In this

Vaibhav Rajput 2 Jun 21, 2022
Pytorch implementation of MLP-Mixer with loading pre-trained models.

MLP-Mixer-Pytorch PyTorch implementation of MLP-Mixer: An all-MLP Architecture for Vision with the function of loading official ImageNet pre-trained p

Qiushi Yang 2 Sep 29, 2022
Free-duolingo-plus - Duolingo account creator that uses your invite code to get you free duolingo plus

free-duolingo-plus duolingo account creator that uses your invite code to get yo

1 Jan 06, 2022
PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks

Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)

Wenwen Yu 498 Dec 24, 2022
Code accompanying the paper "ProxyFL: Decentralized Federated Learning through Proxy Model Sharing"

ProxyFL Code accompanying the paper "ProxyFL: Decentralized Federated Learning through Proxy Model Sharing" Authors: Shivam Kalra*, Junfeng Wen*, Jess

Layer6 Labs 14 Dec 06, 2022
Normal Learning in Videos with Attention Prototype Network

Codes_APN Official codes of CVPR21 paper: Normal Learning in Videos with Attention Prototype Network (https://arxiv.org/abs/2108.11055) Overview of ou

11 Dec 13, 2022