πŸ¦™ LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022

Overview

πŸ¦™ LaMa: Resolution-robust Large Mask Inpainting with Fourier Convolutions

Official implementation by Samsung Research

by Roman Suvorov, Elizaveta Logacheva, Anton Mashikhin, Anastasia Remizova, Arsenii Ashukha, Aleksei Silvestrov, Naejin Kong, Harshith Goka, Kiwoong Park, Victor Lempitsky.

πŸ”₯ πŸ”₯ πŸ”₯
LaMa generalizes surprisingly well to much higher resolutions (~2k ❗️ ) than it saw during training (256x256), and achieves the excellent performance even in challenging scenarios, e.g. completion of periodic structures.

[Project page] [arXiv] [Supplementary] [BibTeX]


Try out in Google Colab

Environment setup

Clone the repo: git clone https://github.com/saic-mdal/lama.git

There are three options of an environment:

  1. Python virtualenv:

    virtualenv inpenv --python=/usr/bin/python3
    source inpenv/bin/activate
    pip install torch==1.8.0 torchvision==0.9.0
    
    cd lama
    pip install -r requirements.txt 
    
  2. Conda

    % Install conda for Linux, for other OS download miniconda at https://docs.conda.io/en/latest/miniconda.html
    wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
    bash Miniconda3-latest-Linux-x86_64.sh -b -p $HOME/miniconda
    $HOME/miniconda/bin/conda init bash
    
    cd lama
    conda env create -f conda_env.yml
    conda activate lama
    conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch -y
    pip install pytorch-lightning==1.2.9
    
  3. Docker: No actions are needed πŸŽ‰ .

Inference

Run

cd lama
export TORCH_HOME=$(pwd) && export PYTHONPATH=.

1. Download pre-trained models

Install tool for yandex disk link extraction:

pip3 install wldhx.yadisk-direct

The best model (Places2, Places Challenge):

curl -L $(yadisk-direct https://disk.yandex.ru/d/ouP6l8VJ0HpMZg) -o big-lama.zip
unzip big-lama.zip

All models (Places & CelebA-HQ):

curl -L $(yadisk-direct https://disk.yandex.ru/d/EgqaSnLohjuzAg) -o lama-models.zip
unzip lama-models.zip

2. Prepare images and masks

Download test images:

curl -L $(yadisk-direct https://disk.yandex.ru/d/xKQJZeVRk5vLlQ) -o LaMa_test_images.zip
unzip LaMa_test_images.zip
OR prepare your data: 1) Create masks named as `[images_name]_maskXXX[image_suffix]`, put images and masks in the same folder.
  • You can use the script for random masks generation.
  • Check the format of the files:
    image1_mask001.png
    image1.png
    image2_mask001.png
    image2.png
    
  1. Specify image_suffix, e.g. .png or .jpg or _input.jpg in configs/prediction/default.yaml.

3. Predict

On the host machine:

python3 bin/predict.py model.path=$(pwd)/big-lama indir=$(pwd)/LaMa_test_images outdir=$(pwd)/output

OR in the docker

The following command will pull the docker image from Docker Hub and execute the prediction script

bash docker/2_predict.sh $(pwd)/big-lama $(pwd)/LaMa_test_images $(pwd)/output device=cpu

Docker cuda: TODO

Train and Eval

⚠️ Warning: The training is not fully tested yet, e.g., did not re-training after refactoring ⚠️

Make sure you run:

cd lama
export TORCH_HOME=$(pwd) && export PYTHONPATH=.

Then download models for perceptual loss:

mkdir -p ade20k/ade20k-resnet50dilated-ppm_deepsup/
wget -P ade20k/ade20k-resnet50dilated-ppm_deepsup/ http://sceneparsing.csail.mit.edu/model/pytorch/ade20k-resnet50dilated-ppm_deepsup/encoder_epoch_20.pth

Places

On the host machine:

# Download data from http://places2.csail.mit.edu/download.html
# Places365-Standard: Train(105GB)/Test(19GB)/Val(2.1GB) from High-resolution images section
wget http://data.csail.mit.edu/places/places365/train_large_places365standard.tar
wget http://data.csail.mit.edu/places/places365/val_large.tar
wget http://data.csail.mit.edu/places/places365/test_large.tar

# Unpack and etc.
bash fetch_data/places_standard_train_prepare.sh
bash fetch_data/places_standard_test_val_prepare.sh
bash fetch_data/places_standard_evaluation_prepare_data.sh

# Sample images for test and viz at the end of epoch
bash fetch_data/places_standard_test_val_sample.sh
bash fetch_data/places_standard_test_val_gen_masks.sh

# Run training
# You can change bs with data.batch_size=10
python bin/train.py -cn lama-fourier location=places_standard

# Infer model on thick/thin/medium masks in 256 and 512 and run evaluation 
# like this:
python3 bin/predict.py \
model.path=$(pwd)/experiments/
   
    _
    
     _lama-fourier_/ \
indir=$(pwd)/places_standard_dataset/evaluation/random_thick_512/ \
outdir=$(pwd)/inference/random_thick_512 model.checkpoint=last.ckpt

python3 bin/evaluate_predicts.py \
$(pwd)/configs/eval_2gpu.yaml \
$(pwd)/places_standard_dataset/evaluation/random_thick_512/ \
$(pwd)/inference/random_thick_512 $(pwd)/inference/random_thick_512_metrics.csv

    
   

Docker: TODO

CelebA

On the host machine:

# Make shure you are in lama folder
cd lama
export TORCH_HOME=$(pwd) && export PYTHONPATH=.

# Download CelebA-HQ dataset
# Download data256x256.zip from https://drive.google.com/drive/folders/11Vz0fqHS2rXDb5pprgTjpD7S2BAJhi1P

# unzip & split into train/test/visualization & create config for it
bash fetch_data/celebahq_dataset_prepare.sh

# generate masks for test and visual_test at the end of epoch
bash fetch_data/celebahq_gen_masks.sh

# Run training
python bin/train.py -cn lama-fourier-celeba data.batch_size=10

# Infer model on thick/thin/medium masks in 256 and run evaluation 
# like this:
python3 bin/predict.py \
model.path=$(pwd)/experiments/
   
    _
    
     _lama-fourier-celeba_/ \
indir=$(pwd)/celeba-hq-dataset/visual_test_256/random_thick_256/ \
outdir=$(pwd)/inference/celeba_random_thick_256 model.checkpoint=last.ckpt

    
   

Docker: TODO

Places Challenge

On the host machine:

# This script downloads multiple .tar files in parallel and unpacks them
# Places365-Challenge: Train(476GB) from High-resolution images (to train Big-Lama) 
bash places_challenge_train_download.sh

TODO: prepare
TODO: train 
TODO: eval

Docker: TODO

Create your data

On the host machine:

Explain explain explain

TODO: format
TODO: configs 
TODO: run training
TODO: run eval

OR in the docker:

TODO: train
TODO: eval

Hints

Generate different kinds of masks

The following command will execute a script that generates random masks.

bash docker/1_generate_masks_from_raw_images.sh \
    configs/data_gen/random_medium_512.yaml \
    /directory_with_input_images \
    /directory_where_to_store_images_and_masks \
    --ext png

The test data generation command stores images in the format, which is suitable for prediction.

The table below describes which configs we used to generate different test sets from the paper. Note that we do not fix a random seed, so the results will be slightly different each time.

Places 512x512 CelebA 256x256
Narrow random_thin_512.yaml random_thin_256.yaml
Medium random_medium_512.yaml random_medium_256.yaml
Wide random_thick_512.yaml random_thick_256.yaml

Feel free to change the config path (argument #1) to any other config in configs/data_gen or adjust config files themselves.

Override parameters in configs

Also you can override parameters in config like this:

python3 bin/train.py -cn 
   
     data.batch_size=10 run_title=my-title

   

Where .yaml file extension is omitted

Models options

Config names for models from paper (substitude into the training command):

* big-lama
* big-lama-regular
* lama-fourier
* lama-regular
* lama_small_train_masks

Which are seated in configs/training/folder

Links

Training time & resources

TODO

Acknowledgments

Citation

If you found this code helpful, please consider citing:

@article{suvorov2021resolution,
  title={Resolution-robust Large Mask Inpainting with Fourier Convolutions},
  author={Suvorov, Roman and Logacheva, Elizaveta and Mashikhin, Anton and Remizova, Anastasia and Ashukha, Arsenii and Silvestrov, Aleksei and Kong, Naejin and Goka, Harshith and Park, Kiwoong and Lempitsky, Victor},
  journal={arXiv preprint arXiv:2109.07161},
  year={2021}
}
Owner
Advanced Image Manipulation Lab @ Samsung AI Center Moscow
Advanced Image Manipulation Lab @ Samsung AI Center Moscow
Attention-guided gan for synthesizing IR images

SI-AGAN Attention-guided gan for synthesizing IR images This repository contains the Tensorflow code for "Pedestrian Gender Recognition by Style Trans

1 Oct 25, 2021
UnpNet - Rethinking 3-D LiDAR Point Cloud Segmentation(IEEE TNNLS)

UnpNet Citation Please cite the following paper if you use this repository in your reseach. @article {PMID:34914599, Title = {Rethinking 3-D LiDAR Po

Shijie Li 4 Jul 15, 2022
CSPML (crystal structure prediction with machine learning-based element substitution)

CSPML (crystal structure prediction with machine learning-based element substitution) CSPML is a unique methodology for the crystal structure predicti

8 Dec 20, 2022
Grow Function: Generate 3D Stacked Bifurcating Double Deep Cellular Automata based organisms which differentiate using a Genetic Algorithm...

Grow Function: A 3D Stacked Bifurcating Double Deep Cellular Automata which differentiates using a Genetic Algorithm... TLDR;High Def Trees that you can mint as NFTs on Solana

Nathaniel Gibson 4 Oct 08, 2022
A Protein-RNA Interface Predictor Based on Semantics of Sequences

PRIP PRIP:A Protein-RNA Interface Predictor Based on Semantics of Sequences installation gensim==3.8.3 matplotlib==3.1.3 xgboost==1.3.3 prettytable==2

李优 0 Mar 25, 2022
Winners of DrivenData's Overhead Geopose Challenge

Winners of DrivenData's Overhead Geopose Challenge

DrivenData 22 Aug 04, 2022
Implementation of Neural Style Transfer in Pytorch

PytorchNeuralStyleTransfer Code to run Neural Style Transfer from our paper Image Style Transfer Using Convolutional Neural Networks. Also includes co

Leon Gatys 396 Dec 01, 2022
Experiments for Fake News explainability project

fake-news-explainability Experiments for fake news explainability project This repository only contains the notebooks used to train the models and eva

Lorenzo Flores (Lj) 1 Dec 03, 2022
Unsupervised Representation Learning by Invariance Propagation

Unsupervised Learning by Invariance Propagation This repository is the official implementation of Unsupervised Learning by Invariance Propagation. Pre

FengWang 15 Jul 06, 2022
Video-Music Transformer

VMT Video-Music Transformer (VMT) is an attention-based multi-modal model, which generates piano music for a given video. Paper https://arxiv.org/abs/

Chin-Tung Lin 5 Jul 13, 2022
Global Filter Networks for Image Classification

Global Filter Networks for Image Classification Created by Yongming Rao, Wenliang Zhao, Zheng Zhu, Jiwen Lu, Jie Zhou This repository contains PyTorch

Yongming Rao 273 Dec 26, 2022
Implementation of "Efficient Regional Memory Network for Video Object Segmentation" (Xie et al., CVPR 2021).

RMNet This repository contains the source code for the paper Efficient Regional Memory Network for Video Object Segmentation. Cite this work @inprocee

Haozhe Xie 76 Dec 14, 2022
Revisiting, benchmarking, and refining Heterogeneous Graph Neural Networks.

Heterogeneous Graph Benchmark Revisiting, benchmarking, and refining Heterogeneous Graph Neural Networks. Roadmap We organize our repo by task, and on

THUDM 176 Dec 17, 2022
DGN pymarl - Implementation of DGN on Pymarl, which could be trained by VDN or QMIX

This is the implementation of DGN on Pymarl, which could be trained by VDN or QM

4 Nov 23, 2022
TGS Salt Identification Challenge

TGS Salt Identification Challenge This is an open solution to the TGS Salt Identification Challenge. Note Unfortunately, we can no longer provide supp

neptune.ai 123 Nov 04, 2022
Binary Stochastic Neurons in PyTorch

Binary Stochastic Neurons in PyTorch http://r2rt.com/binary-stochastic-neurons-in-tensorflow.html https://github.com/pytorch/examples/tree/master/mnis

Onur Kaplan 54 Nov 21, 2022
HSC4D: Human-centered 4D Scene Capture in Large-scale Indoor-outdoor Space Using Wearable IMUs and LiDAR. CVPR 2022

HSC4D: Human-centered 4D Scene Capture in Large-scale Indoor-outdoor Space Using Wearable IMUs and LiDAR. CVPR 2022 [Project page | Video] Getting sta

51 Nov 29, 2022
ReSSL: Relational Self-Supervised Learning with Weak Augmentation

ReSSL: Relational Self-Supervised Learning with Weak Augmentation This repository contains PyTorch evaluation code, training code and pretrained model

mingkai 45 Oct 25, 2022
Boosted CVaR Classification (NeurIPS 2021)

Boosted CVaR Classification Runtian Zhai, Chen Dan, Arun Sai Suggala, Zico Kolter, Pradeep Ravikumar NeurIPS 2021 Table of Contents Quick Start Train

Runtian Zhai 4 Feb 15, 2022