2021:"Bridging Global Context Interactions for High-Fidelity Image Completion"

Related tags

Deep LearningTFill
Overview

TFill

arXiv | Project

This repository implements the training, testing and editing tools for "Bridging Global Context Interactions for High-Fidelity Image Completion" by Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai and Dinh Phung. Given masked images, the proposed TFill model is able to generate high-fidelity plausible results on various settings.

Examples

teaser

Framework

We propose the two-stages image completion framework, where the upper content inference network (TFill-Coarse) generates semantically correct content using a transformer encoder to directly capture the global context information; the lower appearance refinement network (TFill-refined) copies global visible and generated features to holes.

teaser

Getting started

  • Clone this repo:
git clone https://github.com/lyndonzheng/TFill
cd TFill

Requirements

The original model is trained and evaluated with Pytorch v1.9.1, which cannot be visited in current PyTorch. Therefore, we create a new environment with Pytorch v1.10.0 to test the model, where the performance is the same.

A suitable conda environment named Tfill can be created and activated with:

conda env create -f environment.yaml
conda activate TFill

Runing pretrained models

Download the pre-trained models using the following links (CelebA-HQ, FFHQ, ImageNet, Plcases2 ) and put them undercheckpoints/ directory. It should have the following structure:

./checkpoints/
├── celeba
│   ├── latest_net_D.pth
│   ├── latest_net_D_Ref.pth
│   ├── latest_net_E.pth
│   ├── latest_net_G.pth
│   ├── latest_net_G_Ref.pth
│   ├── latest_net_T.pth
├── ffhq
│   ├── ...
├── ...
  • Test the model
sh ./scripts/test.sh

For different models, the users just need to modify lines 2-4, including name,img_file,mask_file. For instance, we can replace the celeba to imagenet.

The default results will be stored under the results/ folder, in which:

  • examples/: shows original and masked images;
  • img_out/: shows upsampled Coarse outputs;
  • img_ref_out/: shows the final Refined outputs.

Datasets

  • face dataset:
    • 24,183 training images and 2,824 test images from CelebA and use the algorithm of Growing GANs to get the high-resolution CelebA-HQ dataset.
    • 60,000 training images and 10,000 test images from FFHQ provided by StyleGAN.
  • natural scenery: original training and val images from Places2.
  • object original training images from ImageNet.

Traning

  • Train a model (two stage: Coarse and Refinement)
sh ./scripts/train.sh

The default setting is for the top Coarse training. The users just need to replace the coarse with refine at line 6. Then, the model can continue training for high-resolution image completion. More hyper-parameter can be in options/.

The coarse results using transformer and restrictive CNN is impressive, which provides plausible results for both foreground objects and background scene.

teaser teaser

GUI

The GUI operation is similar to our previous GUI in PIC, where steps are also the same.

Basic usage is:

sh ./scripts/ui.sh 

In gui/ui_model.py, users can modify the img_root(line 30) and the corresponding img_files(line 31) to randomly edit images from the testing dataset.

Editing Examples

  • Results (original, output) for face editing

teaser

  • Results (original, masked input, output) for nature scene editing

teaser

Next

  • Higher-resolution pluralistic image completion

License

This work is licensed under a MIT License.

This software is for educational and academic research purpose only. If you wish to obtain a commercial royalty bearing license to this software, please contact us at [email protected].

Citation

The code also uses our previous PIC. If you use this code for your research, please cite our papers.

@misc{zheng2021tfill,
      title={Bridging Global Context Interactions for High-Fidelity Image Completion},
      author={Zheng, Chuanxia and Cham, Tat-Jen and Cai, Jianfei and Phung, Dinh},
      year={2021},
      eprint={2104.00845},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

@inproceedings{zheng2019pluralistic,
  title={Pluralistic Image Completion},
  author={Zheng, Chuanxia and Cham, Tat-Jen and Cai, Jianfei},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={1438--1447},
  year={2019}
}

@article{zheng2021pluralistic,
  title={Pluralistic Free-From Image Completion},
  author={Zheng, Chuanxia and Cham, Tat-Jen and Cai, Jianfei},
  journal={International Journal of Computer Vision},
  pages={1--20},
  year={2021},
  publisher={Springer}
}
Owner
Chuanxia Zheng
Chuanxia Zheng
Source code for our paper "Empathetic Response Generation with State Management"

Source code for our paper "Empathetic Response Generation with State Management" this repository is maintained by both Jun Gao and Yuhan Liu Model Ove

Yuhan Liu 3 Oct 08, 2022
TransCD: Scene Change Detection via Transformer-based Architecture

TransCD: Scene Change Detection via Transformer-based Architecture

wangzhixue 29 Dec 11, 2022
Implementation for our ICCV 2021 paper: Dual-Camera Super-Resolution with Aligned Attention Modules

DCSR: Dual Camera Super-Resolution Implementation for our ICCV 2021 oral paper: Dual-Camera Super-Resolution with Aligned Attention Modules paper | pr

Tengfei Wang 110 Dec 20, 2022
HeartRate detector with ArduinoandPython - Use Arduino and Python create a heartrate detector.

Syllabus of Contents Syllabus of Contents Introduction Of Project Features Develop With Python code introduction Installation License Developer Contac

1 Jan 05, 2022
Code and data for "TURL: Table Understanding through Representation Learning"

TURL This Repo contains code and data for "TURL: Table Understanding through Representation Learning". Environment and Setup Data Pretraining Finetuni

SunLab-OSU 63 Nov 23, 2022
Python wrapper to access the amazon selling partner API

PYTHON-AMAZON-SP-API Amazon Selling-Partner API If you have questions, please join on slack Contributions very welcome! Installation pip install pytho

Michael Primke 330 Jan 06, 2023
Answer a series of contextually-dependent questions like they may occur in natural human-to-human conversations.

SCAI-QReCC-21 [leaderboards] [registration] [forum] [contact] [SCAI] Answer a series of contextually-dependent questions like they may occur in natura

19 Sep 28, 2022
Official implementation of paper "Query2Label: A Simple Transformer Way to Multi-Label Classification".

Introdunction This is the official implementation of the paper "Query2Label: A Simple Transformer Way to Multi-Label Classification". Abstract This pa

Shilong Liu 274 Dec 28, 2022
Posterior temperature optimized Bayesian models for inverse problems in medical imaging

Posterior temperature optimized Bayesian models for inverse problems in medical imaging Max-Heinrich Laves*, Malte Tölle*, Alexander Schlaefer, Sandy

Artificial Intelligence in Cardiovascular Medicine (AICM) 6 Sep 19, 2022
OCR-D wrapper for detectron2 based segmentation models

ocrd_detectron2 OCR-D wrapper for detectron2 based segmentation models Introduction Installation Usage OCR-D processor interface ocrd-detectron2-segm

Robert Sachunsky 13 Dec 06, 2022
Structure Information is the Key: Self-Attention RoI Feature Extractor in 3D Object Detection

Structure Information is the Key: Self-Attention RoI Feature Extractor in 3D Object Detection abstract:Unlike 2D object detection where all RoI featur

DK. Zhang 2 Oct 07, 2022
PyTorch implementation of MSBG hearing loss model and MBSTOI intelligibility metric

PyTorch implementation of MSBG hearing loss model and MBSTOI intelligibility metric This repository contains the implementation of MSBG hearing loss m

BUT <a href=[email protected]"> 9 Nov 08, 2022
Sarus implementation of classical ML models. The models are implemented using the Keras API of tensorflow 2. Vizualization are implemented and can be seen in tensorboard.

Sarus published models Sarus implementation of classical ML models. The models are implemented using the Keras API of tensorflow 2. Vizualization are

Sarus Technologies 39 Aug 19, 2022
[ICCV 2021] Relaxed Transformer Decoders for Direct Action Proposal Generation

RTD-Net (ICCV 2021) This repo holds the codes of paper: "Relaxed Transformer Decoders for Direct Action Proposal Generation", accepted in ICCV 2021. N

Multimedia Computing Group, Nanjing University 80 Nov 30, 2022
Joint project of the duo Hacker Ninjas

Project Smoothie Společný projekt dua Hacker Ninjas. První pokus o hříčku po třech týdnech učení se programování. Jakub Kolář e:\

Jakub Kolář 2 Jan 07, 2022
Projects for AI/ML and IoT integration for games and other presented at re:Invent 2021.

Playground4AWS Projects for AI/ML and IoT integration for games and other presented at re:Invent 2021. Architecture Minecraft and Lamps This project i

Vinicius Senger 5 Nov 30, 2022
The official implementation of EIGNN: Efficient Infinite-Depth Graph Neural Networks (NeurIPS 2021)

EIGNN: Efficient Infinite-Depth Graph Neural Networks The official implementation of EIGNN: Efficient Infinite-Depth Graph Neural Networks (NeurIPS 20

Juncheng Liu 14 Nov 22, 2022
UCSD Oasis platform

oasis UCSD Oasis platform Local project setup Install Docker Compose and make sure you have Pip installed Clone the project and go to the project fold

InSTEDD 4 Jun 16, 2021
MicRank is a Learning to Rank neural channel selection framework where a DNN is trained to rank microphone channels.

MicRank: Learning to Rank Microphones for Distant Speech Recognition Application Scenario Many applications nowadays envision the presence of multiple

Samuele Cornell 20 Nov 10, 2022
An official implementation of the paper Exploring Sequence Feature Alignment for Domain Adaptive Detection Transformers

Sequence Feature Alignment (SFA) By Wen Wang, Yang Cao, Jing Zhang, Fengxiang He, Zheng-jun Zha, Yonggang Wen, and Dacheng Tao This repository is an o

WangWen 79 Dec 24, 2022