The pytorch implementation of the paper "text-guided neural image inpainting" at MM'2020

Overview

TDANet: Text-Guided Neural Image Inpainting, MM'2020 (Oral)

MM | ArXiv

This repository implements the paper "Text-Guided Neural Image Inpainting" by Lisai Zhang, Qingcai Chen, Baotian Hu and Shuoran Jiang. Given one masked image, the proposed TDANet generates diverse plausible results according to guidance text.

Inpainting example

Manipulation Extension example

Getting started

Installation

This code was tested with Pytoch 1.2.0, CUDA 10.1, Python 3.6 and Ubuntu 16.04 with a 2080Ti GPU

pip install visdom dominate
  • Clone this repo (we suggest to only clone the depth 1 version):
git clone https://github.com/idealwhite/tdanet --depth 1
cd tdanet
  • Download the dataset and pre-processed files as in following steps.

Datasets

  • CUB_200: dataset from Caltech-UCSD Birds 200.
  • COCO: object detection 2014 datset from MS COCO.
  • pre-processed datafiles: train/test split, caption-image mapping, image sampling and pre-trained DAMSM from GoogleDrive and extarct them to dataset/ directory as specified in config.bird.yml/config.coco.yml.

Training Demo

python train.py --name tda_bird  --gpu_ids 0 --model tdanet --mask_type 0 1 2 3 --img_file ./datasets/CUB_200_2011/train.flist --mask_file ./datasets/CUB_200_2011/train_mask.flist --text_config config.bird.yml
  • Important: Add --mask_type in options/base_options.py for different training masks. --mask_file path is needed for object mask, use train_mask.flist for CUB and image_mask_coco_all.json for COCO. --text_config refer to the yml configuration file for text setup, --img_file is the image file dir or file list.
  • To view training results and loss plots, run python -m visdom.server and copy the URL http://localhost:8097.
  • Training models will be saved under the ./checkpoints folder.
  • More training options can be found in ./options folder.
  • Suggestion: use mask type 0 1 2 3 for CUB dataset and 0 1 2 4 for COCO dataset. Train more than 2000 epochs for CUB and 200 epochs for COCO.

Evaluation Demo

Test

python test.py --name tda_bird  --img_file datasets/CUB_200_2011/test.flist --results_dir results/tda_bird  --mask_file datasets/CUB_200_2011/test_mask.flist --mask_type 3 --no_shuffle --gpu_ids 0 --nsampling 1 --no_variance

Note:

  • Remember to add the --no_variance option to get better performance.
  • For COCO object mask, use image_mask_coco_all.json as the mask file..

A eval_tda_bird.flist will be generated after the test. Then in the evaluation, this file is used as the ground truth file list:

python evaluation.py --batch_test 60 --ground_truth_path eval_tda_bird.flist --save_path results/tda_bird
  • Add --ground_truth_path to the dir of ground truth image path or list. --save_path as the result dir.

Pretrained Models

Download the pre-trained models bird inpainting or coco inpainting and put them undercheckpoints/ directory.

GUI

  • Install the PyQt5 for GUI operation
pip install PyQt5

The GUI could now only avaliable in debug mode, please refer to this issues for detailed instructions. The author is not good at solving PyQt5 problems, wellcome contrbutions.

TODO

  • Debug the GUI application
  • Further improvement on COCO quality.

License

This software is for educational and academic research purpose only. If you wish to obtain a commercial royalty bearing license to this software, please contact us at [email protected].

Acknowledge

We would like to thanks Zheng et al. for providing their source code. This project is fit from their greate Pluralistic Image Completion Project.

Citation

If you use this code for your research, please cite our paper.

@inproceedings{10.1145/3394171.3414017,
author = {Zhang, Lisai and Chen, Qingcai and Hu, Baotian and Jiang, Shuoran},
title = {Text-Guided Neural Image Inpainting},
year = {2020},
booktitle = {Proceedings of the 28th ACM International Conference on Multimedia},
pages = {1302–1310},
location = {Seattle, WA, USA},
}
Owner
LisaiZhang
Enjoy thinking about everything.
LisaiZhang
[CVPR 2022] Official PyTorch Implementation for "Reference-based Video Super-Resolution Using Multi-Camera Video Triplets"

Reference-based Video Super-Resolution (RefVSR) Official PyTorch Implementation of the CVPR 2022 Paper Project | arXiv | RealMCVSR Dataset This repo c

Junyong Lee 151 Dec 30, 2022
Convert openmmlab (not only mmdetection) series model to tensorrt

MMDet to TensorRT This project aims to convert the mmdetection model to TensorRT model end2end. Focus on object detection for now. Mask support is exp

JinTian 4 Dec 17, 2021
Visualizing lattice vibration information from phonon dispersion to atoms (For GPUMD)

Phonon-Vibration-Viewer (For GPUMD) Visualizing lattice vibration information from phonon dispersion for primitive atoms. In this tutorial, we will in

Liangting 6 Dec 10, 2022
DeepFaceEditing: Deep Face Generation and Editing with Disentangled Geometry and Appearance Control

DeepFaceEditing: Deep Face Generation and Editing with Disentangled Geometry and Appearance Control One version of our system is implemented using the

260 Nov 28, 2022
Uncertainty Estimation via Response Scaling for Pseudo-mask Noise Mitigation in Weakly-supervised Semantic Segmentation

Uncertainty Estimation via Response Scaling for Pseudo-mask Noise Mitigation in Weakly-supervised Semantic Segmentation Introduction This is a PyTorch

XMed-Lab 30 Sep 23, 2022
Bald-to-Hairy Translation Using CycleGAN

GANiry: Bald-to-Hairy Translation Using CycleGAN Official PyTorch implementation of GANiry. GANiry: Bald-to-Hairy Translation Using CycleGAN, Fidan Sa

Fidan Samet 10 Oct 27, 2022
A high-performance anchor-free YOLO. Exceeding yolov3~v5 with ONNX, TensorRT, NCNN, and Openvino supported.

YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and industrial communities. For more details, please refer to our rep

7.7k Jan 06, 2023
An implementation of Video Frame Interpolation via Adaptive Separable Convolution using PyTorch

This work has now been superseded by: https://github.com/sniklaus/revisiting-sepconv sepconv-slomo This is a reference implementation of Video Frame I

Simon Niklaus 984 Dec 16, 2022
Codes for our paper The Stem Cell Hypothesis: Dilemma behind Multi-Task Learning with Transformer Encoders published to EMNLP 2021.

The Stem Cell Hypothesis Codes for our paper The Stem Cell Hypothesis: Dilemma behind Multi-Task Learning with Transformer Encoders published to EMNLP

Emory NLP 5 Jul 08, 2022
Lightweight Salient Object Detection in Optical Remote Sensing Images via Feature Correlation

CorrNet This project provides the code and results for 'Lightweight Salient Object Detection in Optical Remote Sensing Images via Feature Correlation'

Gongyang Li 13 Nov 03, 2022
We envision models that are pre-trained on a vast range of domain-relevant tasks to become key for molecule property prediction

We envision models that are pre-trained on a vast range of domain-relevant tasks to become key for molecule property prediction. This repository aims to give easy access to state-of-the-art pre-train

GMUM 90 Jan 08, 2023
A simple AI that will give you si ple task and this is made with python

Crystal-AI A simple AI that will give you si ple task and this is made with python Prerequsites: Python3.6.2 pyttsx3 pip install pyttsx3 pyaudio pip i

CrystalAnd 1 Dec 25, 2021
Pytorch Implementation for NeurIPS (oral) paper: Pixel Level Cycle Association: A New Perspective for Domain Adaptive Semantic Segmentation

Pixel-Level Cycle Association This is the Pytorch implementation of our NeurIPS 2020 Oral paper Pixel-Level Cycle Association: A New Perspective for D

87 Oct 19, 2022
Official PyTorch implementation of "Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble" (NeurIPS'21)

Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble This is the code for reproducing the results of the paper Uncertainty-Bas

43 Nov 23, 2022
A curated list of automated deep learning (including neural architecture search and hyper-parameter optimization) resources.

Awesome AutoDL A curated list of automated deep learning related resources. Inspired by awesome-deep-vision, awesome-adversarial-machine-learning, awe

D-X-Y 2k Dec 30, 2022
Physical Anomalous Trajectory or Motion (PHANTOM) Dataset

Physical Anomalous Trajectory or Motion (PHANTOM) Dataset Description This dataset contains the six different classes as described in our paper[]. The

0 Dec 16, 2021
Official PyTorch implementation of "Meta-Learning with Task-Adaptive Loss Function for Few-Shot Learning" (ICCV2021 Oral)

MeTAL - Meta-Learning with Task-Adaptive Loss Function for Few-Shot Learning (ICCV2021 Oral) Sungyong Baik, Janghoon Choi, Heewon Kim, Dohee Cho, Jaes

Sungyong Baik 44 Dec 29, 2022
Merlion: A Machine Learning Framework for Time Series Intelligence

Merlion: A Machine Learning Library for Time Series Table of Contents Introduction Installation Documentation Getting Started Anomaly Detection Foreca

Salesforce 2.8k Dec 30, 2022
Bianace Prediction Pytorch Model

Bianace Prediction Pytorch Model Main Results ETHUSDT from 2021-01-01 00:00:00 t

RoyYang 4 Jul 20, 2022
Complete U-net Implementation with keras

U Net Lowered with Keras Complete U-net Implementation with keras Original Paper Link : https://arxiv.org/abs/1505.04597 Special Implementations : The

Sagnik Roy 14 Oct 10, 2022