Chunkmogrify: Real image inversion via Segments

Overview

Chunkmogrify: Real image inversion via Segments

Logo

Teaser video with live editing sessions can be found here

This code demonstrates the ideas discussed in arXiv submission Real Image Inversion via Segments.
http://arxiv.org/abs/2110.06269
(David Futschik, Michal Lukáč, Eli Shechtman, Daniel Sýkora)

Abstract:
We present a simple, yet effective approach to editing real images via generative adversarial networks (GAN). Unlike previous techniques, that treat all editing tasks as an operation that affects pixel values in the entire image in our approach we cut up the image into a set of smaller segments. For those segments corresponding latent codes of a generative network can be estimated with greater accuracy due to the lower number of constraints. When codes are altered by the user the content in the image is manipulated locally while the rest of it remains unaffected. Thanks to this property the final edited image better retains the original structures and thus helps to preserve natural look.

before after

before after

What do I need?

You will need a local machine with a relatively recent GPU - I wouldn't recommend trying Chunkmogrify with anything older than RTX 2080. It is technically possible to run even on CPU, but the operations become so slow that the user experience is not enjoyable.

Quick startup guide

Requirements:
Python 3.7 or newer

Note: If you are using Anaconda, I recommend creating a new environment to run this project. Packages installed with conda and pip often don't play together very nicely.

Steps to be able to successfully run the project:

  1. Clone or download the repository and open a terminal / Powershell instance in the directory.
  2. Install the required python packages by running pip install -r requirements.txt. This might take a while, since it will download a few packages which will be several hundred MBs of data. Some packages might need to compile their extensions (as well as this project itself), so a C++ compiler needs to be present. On Linux, this is typically not an issue, but running on Windows might require Visual Studio and CUDA installations to successfully setup the project.
  3. Run python app.py. When running for the first time, it will automatically download required resources, which are also several hundred megabytes. Progression of the download can be monitored in the command line window.

To see if everything installed and configured properly, load up a photo and try running a projection step. If there are no errors, you are good to go.

Possible problems:

Torch not compiled with CUDA enabled.
Run

pip uninstall torch
pip cache purge
pip install torch -f https://download.pytorch.org/whl/torch_stable.html

Explanation of usage

Tutorial video: click below

Open an image using File -> Image from File. There is a sample image provided to check functionality.

Mask painting:
Left click paints, right click unpaints. Mouse wheel controls the size of the brush.

Projection:
Input a number of steps (100 or 200 is ok, 500 is max before LR goes to 0 currently) and press Projection Steps. Wait until projection finishes, you can observe the global image view by choosing output mode Projection Only during this process. To fine-tune, you can perform a small number of Pivotal Tuning steps.

Editing:
To add an edit, click the double arrow down icon in the Attribute Editor on the left side. Choose the type of edit (W, S, Styleclip), the direction of the edit, and drag the sliders to change the currently masked region. Usually it's necessary to increase the multiplier before noticeable changes are reflected via the direction slider.

Multiple different edits can be composed on top of each other at the same time. Their order is largely irrelevant. Currently in the default mode, only one region is being edited, and so all selected edits apply to the same region. If you would like to change the region, you can Freeze the current image, and perform a new projection, but you will lose the ability to change existing edits.

To save the current image, click the Save Current Image button. If the Unalign checkbox is active, the program will attempt to compose the aligned face back into the original image. Saved images can be found in the SavedImages directory by default. This can be changed in _config.yaml.

Keyboard shortcuts

Current keyboard shortcuts include:

Show/Hide mask :: Alt+M
Toggle mask painting :: Alt+N

W-space editing

Source for some of the basic directions:
(https://twitter.com/robertluxemburg/status/1207087801344372736)

To add your own directions, save them in a numpy pickle format as a (num_ws, 512) or (1, 512) format and specify their path in w_directions.py.

Style-space editing (S space edits)

Source:
StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation
(https://arxiv.org/abs/2011.12799)
(https://github.com/betterze/StyleSpace)

The presets can be found in s_presets.py, some were taken directly from the paper, others I found by manual exploration. You can perform similar exploration by choosing the Custom preset once you have a projection.

StyleCLIP editing

Source:
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
(https://arxiv.org/abs/2103.17249)
(https://github.com/orpatashnik/StyleCLIP)

Pretrained models taken from (https://github.com/orpatashnik/StyleCLIP/blob/main/utils.py) and manually removed the decoder from the state dict, since it's not used and takes up majority of file size.

PTI Optimization

Source:
Pivotal Tuning for Latent-based Editing of Real Images
(https://arxiv.org/abs/2106.05744)

This method allows you to match the target photo very closely, while retaining editing capacities.

It's often good to run 30-50 iterations of PTI to get very close matching of the source image, which won't cause a very noticeable drop in the editing capabilities.

Attribution

This repository makes use of code provided by the various repositories linked above, plus additionally code from:

styleganv2-ada-pytorch (https://github.com/NVlabs/stylegan2-ada-pytorch)
poisson-image-editing (https://github.com/PPPW/poisson-image-editing) for optional support of idempotent blend (slow implementation of blending that only changes the masked part which can be accessed by uncommenting the option in synthesis.py)

Citation

If you find this code useful for your research, please cite the arXiv submission linked above.

Owner
David Futschik
PhD student @ CTU Prague, Czech Republic.
David Futschik
Pytorch implementation of "A simple neural network module for relational reasoning" (Relational Networks)

Pytorch implementation of Relational Networks - A simple neural network module for relational reasoning Implemented & tested on Sort-of-CLEVR task. So

Kim Heecheol 800 Dec 05, 2022
TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL, and utterance id

TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL

3 Dec 26, 2022
A PoC Corporation Relationship Knowledge Graph System on top of Nebula Graph.

Corp-Rel is a PoC of Corpartion Relationship Knowledge Graph System. It's built on top of the Open Source Graph Database: Nebula Graph with a dataset

Wey Gu 20 Dec 11, 2022
Official Code Implementation of the paper : XAI for Transformers: Better Explanations through Conservative Propagation

Official Code Implementation of The Paper : XAI for Transformers: Better Explanations through Conservative Propagation For the SST-2 and IMDB expermin

Ameen Ali 23 Dec 30, 2022
A library for efficient similarity search and clustering of dense vectors.

Faiss Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any

Meta Research 18.8k Jan 08, 2023
Equivariant Imaging: Learning Beyond the Range Space

Equivariant Imaging: Learning Beyond the Range Space Equivariant Imaging: Learning Beyond the Range Space Dongdong Chen, Julián Tachella, Mike E. Davi

Dongdong Chen 46 Jan 01, 2023
Phylogeny Partners

Phylogeny-Partners Two states models Instalation You may need to install the cython, networkx, numpy, scipy package: pip install cython, networkx, num

1 Sep 19, 2022
Real-time pose estimation accelerated with NVIDIA TensorRT

trt_pose Want to detect hand poses? Check out the new trt_pose_hand project for real-time hand pose and gesture recognition! trt_pose is aimed at enab

NVIDIA AI IOT 803 Jan 06, 2023
ClevrTex: A Texture-Rich Benchmark for Unsupervised Multi-Object Segmentation

ClevrTex This repository contains dataset generation code for ClevrTex benchmark from paper: ClevrTex: A Texture-Rich Benchmark for Unsupervised Multi

Laurynas Karazija 26 Dec 21, 2022
ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees

ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees This repository is the official implementation of the empirica

Kuan-Lin (Jason) Chen 2 Oct 02, 2022
Multi-atlas segmentation (MAS) is a promising framework for medical image segmentation

Multi-atlas segmentation (MAS) is a promising framework for medical image segmentation. Generally, MAS methods register multiple atlases, i.e., medical images with corresponding labels, to a target i

NanYoMy 13 Oct 09, 2022
Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal, multi-exposure and multi-focus image fusion.

U2Fusion Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal (VIS-IR, medical), multi

Han Xu 129 Dec 11, 2022
Training and Evaluation Code for Neural Volumes

Neural Volumes This repository contains training and evaluation code for the paper Neural Volumes. The method learns a 3D volumetric representation of

Meta Research 370 Dec 08, 2022
Unofficial PyTorch implementation of MobileViT based on paper "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer".

MobileViT RegNet Unofficial PyTorch implementation of MobileViT based on paper MOBILEVIT: LIGHT-WEIGHT, GENERAL-PURPOSE, AND MOBILE-FRIENDLY VISION TR

Hong-Jia Chen 91 Dec 02, 2022
A robotic arm that mimics hand movement through MediaPipe tracking.

La-Z-Arm A robotic arm that mimics hand movement through MediaPipe tracking. Hardware NVidia Jetson Nano Sparkfun Pi Servo Shield Micro Servos Webcam

Alfred 1 Jun 05, 2022
Final project for Intro to CS class.

Financial Analysis Web App https://share.streamlit.io/mayurk1/fin-web-app-final-project/webApp.py 1. Project Description This project is a technical a

Mayur Khanna 1 Dec 10, 2021
Multi-layer convolutional LSTM with Pytorch

Convolution_LSTM_pytorch Thanks for your attention. I haven't got time to maintain this repo for a long time. I recommend this repo which provides an

Zijie Zhuang 733 Dec 30, 2022
Code for the IJCAI 2021 paper "Structure Guided Lane Detection"

SGNet Project for the IJCAI 2021 paper "Structure Guided Lane Detection" Abstract Recently, lane detection has made great progress with the rapid deve

Jinming Su 27 Dec 08, 2022
MQBench Quantization Aware Training with PyTorch

MQBench Quantization Aware Training with PyTorch I am using MQBench(Model Quantization Benchmark)(http://mqbench.tech/) to quantize the model for depl

Ling Zhang 29 Nov 18, 2022
ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs

(Comet-) ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs Paper Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jeff Da, Keisuke Sa

AI2 152 Dec 27, 2022