MAT: Mask-Aware Transformer for Large Hole Image Inpainting

Related tags

Deep LearningMAT
Overview

MAT: Mask-Aware Transformer for Large Hole Image Inpainting (CVPR2022, Oral)

Wenbo Li, Zhe Lin, Kun Zhou, Lu Qi, Yi Wang, Jiaya Jia

[Paper]


News

This is the official implementation of MAT. The training and testing code is released. We also provide our masks for CelebA-HQ-val and Places-val here.


Visualization

We present a transformer-based model (MAT) for large hole inpainting with high fidelity and diversity.

large hole inpainting with pluralistic generation

Compared to other methods, the proposed MAT restores more photo-realistic images with fewer artifacts.

comparison with sotas

Usage

  1. Clone the repository.
    git clone https://github.com/fenglinglwb/MAT.git 
  2. Install the dependencies.
    • Python 3.7
    • PyTorch 1.7.1
    • Cuda 11.0
    • Other packages
    pip install -r requirements.txt

Quick Test

  1. We provide models trained on CelebA-HQ and Places365-Standard at 512x512 resolution. Download models from One Drive and put them into the 'pretrained' directory. The released models are retrained, and hence the visualization results may slightly differ from the paper.

  2. Obtain inpainted results by running

    python generate_image.py --network model_path --dpath data_path --outdir out_path [--mpath mask_path]

    where the mask path is optional. If not assigned, random 512x512 masks will be generated. Note that 0 and 1 values in a mask refer to masked and remained pixels.

    For example, run

    python generate_image.py --network pretrained/CelebA-HQ.pkl --dpath test_sets/CelebA-HQ/images --mpath test_sets/CelebA-HQ/masks --outdir samples

    Note. Our implementation only supports generating an image whose size is a multiple of 512. You need to pad or resize the image to make its size a multiple of 512. Please pad the mask with 0 values.

Train

For example, if you want to train a model on Places, run a bash script with

python train.py \
    --outdir=output_path \
    --gpus=8 \
    --batch=32 \
    --metrics=fid36k5_full \
    --data=training_data_path \
    --data_val=val_data_path \
    --dataloader=datasets.dataset_512.ImageFolderMaskDataset \
    --mirror=True \
    --cond=False \
    --cfg=places512 \
    --aug=noaug \
    --generator=networks.mat.Generator \
    --discriminator=networks.mat.Discriminator \
    --loss=losses.loss.TwoStageLoss \
    --pr=0.1 \
    --pl=False \
    --truncation=0.5 \
    --style_mix=0.5 \
    --ema=10 \
    --lr=0.001

Description of arguments:

  • outdir: output path for saving logs and models
  • gpus: number of used gpus
  • batch: number of images in all gpus
  • metrics: find more metrics in 'metrics/metric_main.py'
  • data: training data
  • data_val: validation data
  • dataloader: you can define your own dataloader
  • mirror: use flip augmentation or not
  • cond: use class info, default: false
  • cfg: configuration, find more details in 'train.py'
  • aug: use augmentation of style-gan-ada or not, default: false
  • generator: you can define your own generator
  • discriminator: you can define your own discriminator
  • loss: you can define your own loss
  • pr: ratio of perceptual loss
  • pl: use path length regularization or not, default: false
  • truncation: truncation ratio proposed in stylegan
  • style_mix: style mixing ratio proposed in stylegan
  • ema: exponoential moving averate, ~K samples
  • lr: learning rate

Evaluation

We provide evaluation scrtips for FID/U-IDS/P-IDS/LPIPS/PSNR/SSIM/L1 metrics in the 'evaluation' directory. Only need to give paths of your results and GTs.

Citation

@inproceedings{li2022mat,
    title={MAT: Mask-Aware Transformer for Large Hole Image Inpainting},
    author={Li, Wenbo and Lin, Zhe and Zhou, Kun and Qi, Lu and Wang, Yi and Jia, Jiaya},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2022}
}

License and Acknowledgement

The code and models in this repo are for research purposes only. Our code is bulit upon StyleGAN2-ADA.

Towards Understanding Quality Challenges of the Federated Learning: A First Look from the Lens of Robustness

FL Analysis This repository contains the code and results for the paper "Towards Understanding Quality Challenges of the Federated Learning: A First L

3 Oct 17, 2022
This is the repository for CVPR2021 Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales

Intro This is the repository for CVPR2021 Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales Vehicle Sam

39 Jul 21, 2022
GarmentNets: Category-Level Pose Estimation for Garments via Canonical Space Shape Completion

GarmentNets This repository contains the source code for the paper GarmentNets: Category-Level Pose Estimation for Garments via Canonical Space Shape

Columbia Artificial Intelligence and Robotics Lab 43 Nov 21, 2022
Modular Gaussian Processes

Modular Gaussian Processes for Transfer Learning đź§© Introduction This repository contains the implementation of our paper Modular Gaussian Processes f

Pablo Moreno-Muñoz 10 Mar 15, 2022
Pyramid Scene Parsing Network, CVPR2017.

Pyramid Scene Parsing Network by Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia, details are in project page. Introduction This

Hengshuang Zhao 1.5k Jan 05, 2023
Sample code from the Neural Networks from Scratch book.

Neural Networks from Scratch (NNFS) book code Code from the NNFS book (https://nnfs.io) separated by chapter.

Harrison 172 Dec 31, 2022
A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning

LABES This is the code for EMNLP 2020 paper "A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised L

17 Sep 28, 2022
My freqtrade strategies

My freqtrade-strategies Hi there! This is repo for my freqtrade-strategies. My name is Ilya Zelenchuk, I'm a lecturer at the SPbU university (https://

171 Dec 05, 2022
Users can free try their models on SIDD dataset based on this code

SIDD benchmark 1 Train python train.py If you want to train your network, just modify the yaml in the options folder. 2 Validation python validation.p

Yuzhi ZHAO 2 May 20, 2022
Differentiable Optimizers with Perturbations in Pytorch

Differentiable Optimizers with Perturbations in PyTorch This contains a PyTorch implementation of Differentiable Optimizers with Perturbations in Tens

Jake Tuero 54 Jun 22, 2022
Boost learning for GNNs from the graph structure under challenging heterophily settings. (NeurIPS'20)

Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs Jiong Zhu, Yujun Yan, Lingxiao Zhao, Mark Heimann, Leman Akoglu,

GEMS Lab: Graph Exploration & Mining at Scale, University of Michigan 70 Dec 18, 2022
Automatic Number Plate Recognition using Contours and Convolution Neural Networks (CNN)

Cite our paper if you find this project useful https://www.ijariit.com/manuscripts/v7i4/V7I4-1139.pdf Abstract Image processing technology is used in

Adithya M 2 Jun 28, 2022
Sharpened cosine similarity torch - A Sharpened Cosine Similarity layer for PyTorch

Sharpened Cosine Similarity A layer implementation for PyTorch Install At your c

Brandon Rohrer 203 Nov 30, 2022
An alarm clock coded in Python 3 with Tkinter

Tkinter-Alarm-Clock An alarm clock coded in Python 3 with Tkinter. Run python3 Tkinter Alarm Clock.py in a terminal if you have Python 3. NOTE: This p

CodeMaster7000 1 Dec 25, 2021
GitHub repository for the ICLR Computational Geometry & Topology Challenge 2021

ICLR Computational Geometry & Topology Challenge 2022 Welcome to the ICLR 2022 Computational Geometry & Topology challenge 2022 --- by the ICLR 2022 W

42 Dec 13, 2022
Super Pix Adv - Offical implemention of Robust Superpixel-Guided Attentional Adversarial Attack (CVPR2020)

Super_Pix_Adv Offical implemention of Robust Superpixel-Guided Attentional Adver

DLight 8 Oct 26, 2022
A custom DeepStack model that has been trained detecting ONLY the USPS logo

This repository provides a custom DeepStack model that has been trained detecting ONLY the USPS logo. This was created after I discovered that the Deepstack OpenLogo custom model I was using did not

Stephen Stratoti 9 Dec 27, 2022
《Train in Germany, Test in The USA: Making 3D Object Detectors Generalize》(CVPR 2020)

Train in Germany, Test in The USA: Making 3D Object Detectors Generalize This paper has been accpeted by Conference on Computer Vision and Pattern Rec

Xiangyu Chen 101 Jan 02, 2023
A GUI to automatically create a TOPAS-readable MLC simulation file

Python script to create a TOPAS-readable simulation file descriring a Multi-Leaf-Collimator. Builds the MLC using the data from a 3D .stl file.

Sebastian Schäfer 0 Jun 19, 2022
How Effective is Incongruity? Implications for Code-mix Sarcasm Detection.

Code for the paper: How Effective is Incongruity? Implications for Code-mix Sarcasm Detection - ICON ACL 2021

2 Jun 05, 2022