pix2pix in tensorflow.js

Overview

pix2pix in tensorflow.js

This repo is moved to https://github.com/yining1023/pix2pix_tensorflowjs_lite

See a live demo here: https://yining1023.github.io/pix2pix_tensorflowjs/

Screen_Shot_2018_06_17_at_11_06_09_PM

Try it yourself: Download/clone the repository and run it locally:

git clone https://github.com/yining1023/pix2pix_tensorflowjs.git
cd pix2pix_tensorflowjs
python3 -m http.server

Credits: This project is based on affinelayer's pix2pix-tensorflow. I want to thank christopherhesse, nsthorat, and dsmilkov for their help and suggestions from this Github issue.

How to train a pix2pix(edges2xxx) model from scratch

    1. Prepare the data
    1. Train the model
    1. Test the model
    1. Export the model
    1. Port the model to tensorflow.js
    1. Create an interactive interface in the browser

1. Prepare the data

  • 1.1 Scrape images from google search
  • 1.2 Remove the background of the images
  • 1.3 Resize all images into 256x256 px
  • 1.4 Detect edges of all images
  • 1.5 Combine input images and target images
  • 1.6 Split all combined images into two folders: train and val

Before we start, check out affinelayer's Create your own dataset. I followed his instrustion for steps 1.3, 1.5 and 1.6.

1.1 Scrape images from google search

We can create our own target images. But for this edge2pikachu project, I downloaded a lot of images from google. I'm using this google_image_downloader to download images from google. After downloading the repo above, run -

$ python image_download.py <query> <number of images>

It will download images and save it to the current directory.

1.2 Remove the background of the images

Some images have some background. I'm using grabcut with OpenCV to remove background Check out the script here: https://github.com/yining1023/pix2pix-tensorflow/blob/master/tools/grabcut.py To run the script-

$ python grabcut.py <filename>

It will open an interactive interface, here are some instructions: https://github.com/symao/InteractiveImageSegmentation Here's an example of removing background using grabcut:

Screen Shot 2018 03 13 at 7 03 28 AM

1.3 Resize all images into 256x256 px

Download pix2pix-tensorflow repo. Put all images we got into photos/original folder Run -

$ python tools/process.py --input_dir photos/original --operation resize --output_dir photos/resized

We should be able to see a new folder called resized with all resized images in it.

1.4 Detect edges of all images

The script that I use to detect edges of images from one folder at once is here: https://github.com/yining1023/pix2pix-tensorflow/blob/master/tools/edge-detection.py, we need to change the path of the input images directory on line 31, and create a new empty folder called edges in the same directory. Run -

$ python edge-detection.py

We should be able to see edged-detected images in the edges folder. Here's an example of edge detection: left(original) right(edge detected)

0_batch2 0_batch2_2

1.5 Combine input images and target images

python tools/process.py --input_dir photos/resized --b_dir photos/blank --operation combine --output_dir photos/combined

Here is an example of the combined image: Notice that the size of the combined image is 512x256px. The size is important for training the model successfully.

0_batch2

Read more here: affinelayer's Create your own dataset

1.6 Split all combined images into two folders: train and val

python tools/split.py --dir photos/combined

Read more here: affinelayer's Create your own dataset

I collected 305 images for training and 78 images for testing.

2. Train the model

# train the model
python pix2pix.py --mode train --output_dir pikachu_train --max_epochs 200 --input_dir pikachu/train --which_direction BtoA

Read more here: https://github.com/affinelayer/pix2pix-tensorflow#getting-started

I used the High Power Computer(HPC) at NYU to train the model. You can see more instruction here: https://github.com/cvalenzuela/hpc. You can request GPU and submit a job to HPC, and use tunnels to tranfer large files between the HPC and your computer.

The training takes me 4 hours and 16 mins. After train, there should be a pikachu_train folder with checkpoint in it. If you add --ngf 32 --ndf 32 when training the model: python pix2pix.py --mode train --output_dir pikachu_train --max_epochs 200 --input_dir pikachu/train --which_direction BtoA --ngf 32 --ndf 32, the model will be smaller 13.6 MB, and it will take less time to train.

3. Test the model

# test the model
python pix2pix.py --mode test --output_dir pikachu_test --input_dir pikachu/val --checkpoint pikachu_train

After testing, there should be a new folder called pikachu_test. In the folder, if you open the index.html, you should be able to see something like this in your browser:

Screen_Shot_2018_03_15_at_8_42_48_AM

Read more here: https://github.com/affinelayer/pix2pix-tensorflow#getting-started

4. Export the model

python pix2pix.py --mode export --output_dir /export/ --checkpoint /pikachu_train/ --which_direction BtoA

It will create a new export folder

5. Port the model to tensorflow.js

I followed affinelayer's instruction here: https://github.com/affinelayer/pix2pix-tensorflow/tree/master/server#exporting

cd server
python tools/export-checkpoint.py --checkpoint ../export --output_file static/models/pikachu_BtoA.pict

We should be able to get a file named pikachu_BtoA.pict, which is 54.4 MB. If you add --ngf 32 --ndf 32 when training the model: python pix2pix.py --mode train --output_dir pikachu_train --max_epochs 200 --input_dir pikachu/train --which_direction BtoA --ngf 32 --ndf 32, the model will be smaller 13.6 MB, and it will take less time to train.

6. Create an interactive interface in the browser

Copy the model we get from step 5 to the models folder.

Owner
Yining Shi
Creative Coding 👩‍💻+ Machine Learning 🤖
Yining Shi
Complete U-net Implementation with keras

U Net Lowered with Keras Complete U-net Implementation with keras Original Paper Link : https://arxiv.org/abs/1505.04597 Special Implementations : The

Sagnik Roy 14 Oct 10, 2022
Code for KHGT model, AAAI2021

KHGT Code for KHGT accepted by AAAI2021 Please unzip the data files in Datasets/ first. To run KHGT on Yelp data, use python labcode_yelp.py For Movi

32 Nov 29, 2022
Optimal space decomposition based-product quantization for approximate nearest neighbor search

Optimal space decomposition based-product quantization for approximate nearest neighbor search Abstract Product quantization(PQ) is an effective neare

Mylove 1 Nov 19, 2021
"Structure-Augmented Text Representation Learning for Efficient Knowledge Graph Completion"(WWW 2021)

STAR_KGC This repo contains the source code of the paper accepted by WWW'2021. "Structure-Augmented Text Representation Learning for Efficient Knowled

Bo Wang 60 Dec 26, 2022
Code for the paper "SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness" (NeurIPS 2021)

SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness (NeurIPS2021) This repository contains code for the paper "Smo

Jongheon Jeong 17 Dec 27, 2022
GitHub repository for the ICLR Computational Geometry & Topology Challenge 2021

ICLR Computational Geometry & Topology Challenge 2022 Welcome to the ICLR 2022 Computational Geometry & Topology challenge 2022 --- by the ICLR 2022 W

42 Dec 13, 2022
Conformer: Local Features Coupling Global Representations for Visual Recognition

Conformer: Local Features Coupling Global Representations for Visual Recognition (arxiv) This repository is built upon DeiT and timm Usage First, inst

Zhiliang Peng 378 Jan 08, 2023
Doge-Prediction - Coding Club prediction ig

Doge-Prediction Coding Club prediction ig Basically: Create an application that

1 Jan 10, 2022
Pytorch implementation of CoCon: A Self-Supervised Approach for Controlled Text Generation

COCON_ICLR2021 This is our Pytorch implementation of COCON. CoCon: A Self-Supervised Approach for Controlled Text Generation (ICLR 2021) Alvin Chan, Y

alvinchangw 79 Dec 18, 2022
An implementation of IMLE-Net: An Interpretable Multi-level Multi-channel Model for ECG Classification

IMLE-Net: An Interpretable Multi-level Multi-channel Model for ECG Classification The repostiory consists of the code, results and data set links for

12 Dec 26, 2022
RP-GAN: Stable GAN Training with Random Projections

RP-GAN: Stable GAN Training with Random Projections This repository contains a reference implementation of the algorithm described in the paper: Behna

Ayan Chakrabarti 20 Sep 18, 2021
Implementation of Pooling by Sliced-Wasserstein Embedding (NeurIPS 2021)

PSWE: Pooling by Sliced-Wasserstein Embedding (NeurIPS 2021) PSWE is a permutation-invariant feature aggregation/pooling method based on sliced-Wasser

Navid Naderializadeh 3 May 06, 2022
Deep Semisupervised Multiview Learning With Increasing Views (IEEE TCYB 2021, PyTorch Code)

Deep Semisupervised Multiview Learning With Increasing Views (ISVN, IEEE TCYB) Peng Hu, Xi Peng, Hongyuan Zhu, Liangli Zhen, Jie Lin, Huaibai Yan, Dez

3 Nov 19, 2022
A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval

CLIP4CMR A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval The original data and pre-calculate

24 Dec 26, 2022
Official Pytorch implementation of MixMo framework

MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks Official PyTorch implementation of the MixMo framework | paper | docs Alexandr

79 Nov 07, 2022
Proximal Backpropagation - a neural network training algorithm that takes implicit instead of explicit gradient steps

Proximal Backpropagation Proximal Backpropagation (ProxProp) is a neural network training algorithm that takes implicit instead of explicit gradient s

Thomas Frerix 40 Dec 17, 2022
The codes and models in 'Gaze Estimation using Transformer'.

GazeTR We provide the code of GazeTR-Hybrid in "Gaze Estimation using Transformer". We recommend you to use data processing codes provided in GazeHub.

65 Dec 27, 2022
Covid-19 Test AI (Deep Learning - NNs) Software. Accuracy is the %96.5, loss is the 0.09 :)

Covid-19 Test AI (Deep Learning - NNs) Software I developed a segmentation algorithm to understand whether Covid-19 Test Photos are positive or negati

Emirhan BULUT 28 Dec 04, 2021
Pure python implementations of popular ML algorithms.

Minimal ML algorithms This repo includes minimal implementations of popular ML algorithms using pure python and numpy. The purpose of these notebooks

Alexis Gidiotis 3 Jan 10, 2022
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning. In ICCV, 2021.

ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning This repository contains the code for our ICCV 202

sangho.lee 28 Nov 08, 2022