Code for CVPR'2022 paper ✨ "Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model"

Last update: Nov 28, 2022


PPE ✨

Repository for our CVPR'2022 paper:

Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model. Zipeng Xu, Tianwei Lin, Hao Tang, Fu Li, Dongliang He, Nicu Sebe, Radu Timofte, Luc Van Gool, Errui Ding. To appear in CVPR 2022.

The PyTorch implementation is available at zipengxuc/PPE-Pytorch.

Updates

24 Mar 2022: We updated the arXiv version of our paper.

30 Mar 2022: We changed how the code is released. The PyTorch implementation is now at zipengxuc/PPE-Pytorch.

14 Apr 2022: We updated the PaddlePaddle inference code in this repository.

To reproduce our results:

Setup:

Install CLIP:

conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=<CUDA_VERSION>
pip install ftfy regex tqdm gdown
pip install git+https://github.com/openai/CLIP.git
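
After running the commands above, it can help to confirm that the packages are importable before moving on. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the module list is an assumption derived from the install commands (the `pytorch` conda package is imported as `torch`, and the CLIP repository installs a module named `clip`).

```python
from importlib.util import find_spec

# Top-level module names that the install commands above are assumed to provide.
REQUIRED_MODULES = ["torch", "torchvision", "ftfy", "regex", "tqdm", "gdown", "clip"]


def missing_modules(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

The check only looks up each module without importing it, so it does not verify CUDA support or package versions.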

Download pre-trained models:

The code relies on PaddleGAN, which provides a PaddlePaddle implementation of StyleGAN2. Download the pre-trained StyleGAN2 generator from here.

We provide several pre-trained PPE models here.

Invert real images:

The mapper is trained on latent vectors, so real images must first be inverted into the latent space. For editing human faces, StyleCLIP provides the CelebA-HQ test set, already inverted by e4e: test set.

Usage:

First, place the downloaded pre-trained models and data in the ckpt folder.
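
Before running inference, one way to confirm that everything is in place is a small script like the sketch below. It is not part of the repository. The file names in `EXPECTED_FILES` are hypothetical placeholders, since this README does not state the names of the downloaded files; replace them with the names of the files you actually downloaded. The helper `missing_files` is also an illustrative name.

```python
from pathlib import Path

# Hypothetical file names, for illustration only: substitute the names of
# the generator, PPE model, and latent files you actually downloaded.
EXPECTED_FILES = [
    "stylegan2_generator.pdparams",
    "ppe_mapper.pdparams",
    "test_latents.pdparams",
]


def missing_files(ckpt_dir, expected):
    """Return the expected file names that are not present in ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files("ckpt", EXPECTED_FILES)
    if missing:
        print("Missing from ckpt/:", ", ".join(missing))
    else:
        print("All expected files found in ckpt/.")
```

Run it from the repository root so that the relative path `ckpt` resolves to the folder described above.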

Inference

The PaddlePaddle version provides only inference code, which generates the editing results:

python mapper/evaluate.py

Reference

@article{xu2022ppe,
author = {Zipeng Xu and Tianwei Lin and Hao Tang and Fu Li and Dongliang He and Nicu Sebe and Radu Timofte and Luc Van Gool and Errui Ding},
title = {Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model},
journal = {arXiv preprint arXiv:2111.13333},
year = {2021}
}

If you have any questions, please contact [email protected]. :)


