pix2pix in tensorflow.js

Last update: Oct 04, 2022

This repo has moved to https://github.com/yining1023/pix2pix_tensorflowjs_lite

See a live demo here: https://yining1023.github.io/pix2pix_tensorflowjs/

Try it yourself: Download/clone the repository and run it locally:

git clone https://github.com/yining1023/pix2pix_tensorflowjs.git
cd pix2pix_tensorflowjs
python3 -m http.server

Credits: This project is based on affinelayer's pix2pix-tensorflow. I want to thank christopherhesse, nsthorat, and dsmilkov for their help and suggestions in this GitHub issue.

How to train a pix2pix(edges2xxx) model from scratch

1. Prepare the data
2. Train the model
3. Test the model
4. Export the model
5. Port the model to tensorflow.js
6. Create an interactive interface in the browser

1. Prepare the data

1.1 Scrape images from google search
1.2 Remove the background of the images
1.3 Resize all images into 256x256 px
1.4 Detect edges of all images
1.5 Combine input images and target images
1.6 Split all combined images into two folders: train and val

Before we start, check out affinelayer's Create your own dataset. I followed his instructions for steps 1.3, 1.5 and 1.6.

1.1 Scrape images from google search

We can create our own target images, but for this edge2pikachu project I downloaded a lot of images from Google. I'm using this google_image_downloader to download them. After downloading that repo, run:

$ python image_download.py <query> <number of images>

It will download the images and save them to the current directory.

1.2 Remove the background of the images

Some images have a background. I'm using grabcut with OpenCV to remove it. Check out the script here: https://github.com/yining1023/pix2pix-tensorflow/blob/master/tools/grabcut.py

To run the script:

$ python grabcut.py <filename>

It will open an interactive interface. Here are some instructions: https://github.com/symao/InteractiveImageSegmentation

Here's an example of removing the background using grabcut:

1.3 Resize all images into 256x256 px

Download the pix2pix-tensorflow repo and put all the images we got into the photos/original folder. Run:

$ python tools/process.py --input_dir photos/original --operation resize --output_dir photos/resized

We should be able to see a new folder called resized with all resized images in it.

1.4 Detect edges of all images

I use this script to detect the edges of all images in a folder at once: https://github.com/yining1023/pix2pix-tensorflow/blob/master/tools/edge-detection.py

Before running it, change the path of the input images directory on line 31 of the script, and create a new empty folder called edges in the same directory. Run:

$ python edge-detection.py

We should be able to see edge-detected images in the edges folder. Here's an example of edge detection: left (original), right (edge detected).
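To give a feel for what an edge detector does, here is a minimal pure-Python sketch. It uses a Sobel filter as a stand-in; that choice is an assumption made for illustration, and it is not necessarily the algorithm that edge-detection.py uses. The function name `sobel_edges` and the threshold value are hypothetical.

```python
# Minimal Sobel edge detector on a grayscale image stored as a list of rows.
# Illustrative stand-in only; not the algorithm from edge-detection.py.

def sobel_edges(gray, threshold=128):
    """Return a binary edge map (0 or 255) with the same size as `gray`."""
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    # Border pixels are left at 0 because the 3x3 kernel does not fit there.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            magnitude = (gx * gx + gy * gy) ** 0.5
            edges[y][x] = 255 if magnitude >= threshold else 0
    return edges


# A 5x5 test image: dark on the left, bright on the right.
image = [[0, 0, 255, 255, 255] for _ in range(5)]
edge_map = sobel_edges(image)
```

Only the pixels next to the dark-to-bright boundary are marked as edges; flat regions stay black.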

1.5 Combine input images and target images

python tools/process.py --input_dir photos/resized --b_dir photos/blank --operation combine --output_dir photos/combined

Here is an example of the combined image: Notice that the size of the combined image is 512x256px. The size is important for training the model successfully.

Read more here: affinelayer's Create your own dataset
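Since the 512x256 size matters for training, it can help to check the combined images before moving on. The sketch below reads the width and height straight from a PNG header using only the standard library. It assumes the combined images are PNG files; the helper names `png_size` and `is_valid_pair` and the example path are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(header):
    """Return (width, height) read from the first 24 bytes of a PNG file."""
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    # Width and height are big-endian 32-bit integers at the start of IHDR.
    return struct.unpack(">II", header[16:24])


def is_valid_pair(header, expected=(512, 256)):
    """True if the image has the size expected for a combined A/B pair."""
    return png_size(header) == expected


# Usage with a real file (the path is hypothetical):
#   with open("photos/combined/example.png", "rb") as f:
#       print(is_valid_pair(f.read(24)))

# Synthetic header for a 512x256 image, so the sketch runs without any files.
sample = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 512, 256)
print(png_size(sample))
```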

1.6 Split all combined images into two folders: `train` and `val`

python tools/split.py --dir photos/combined

Read more here: affinelayer's Create your own dataset

I collected 305 images for training and 78 images for testing.
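For reference, a train/val split of this kind can be sketched in a few lines of Python. This is not the code of tools/split.py; the function name `split_train_val`, the 0.8 training fraction, and the fixed seed are assumptions chosen for illustration.

```python
import random
import shutil
from pathlib import Path


def split_train_val(combined_dir, train_fraction=0.8, seed=0):
    """Move the files in `combined_dir` into train/ and val/ subfolders.

    Returns a (train_count, val_count) tuple.
    """
    combined = Path(combined_dir)
    # Only plain files are moved, so existing subfolders are left alone.
    images = sorted(p for p in combined.iterdir() if p.is_file())
    # A seeded shuffle keeps the split reproducible between runs.
    random.Random(seed).shuffle(images)
    n_train = int(len(images) * train_fraction)
    for index, path in enumerate(images):
        subset = "train" if index < n_train else "val"
        target_dir = combined / subset
        target_dir.mkdir(exist_ok=True)
        shutil.move(str(path), str(target_dir / path.name))
    return n_train, len(images) - n_train
```

Calling `split_train_val("photos/combined")` would move roughly 80% of the images into photos/combined/train and the rest into photos/combined/val.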

2. Train the model

# train the model
python pix2pix.py --mode train --output_dir pikachu_train --max_epochs 200 --input_dir pikachu/train --which_direction BtoA

I used the high-performance computing (HPC) cluster at NYU to train the model. You can see more instructions here: https://github.com/cvalenzuela/hpc. You can request a GPU, submit a job to the HPC, and use tunnels to transfer large files between the HPC and your computer.

Training took me 4 hours and 16 minutes. After training, there should be a pikachu_train folder with a checkpoint in it. If you add --ngf 32 --ndf 32 when training the model (python pix2pix.py --mode train --output_dir pikachu_train --max_epochs 200 --input_dir pikachu/train --which_direction BtoA --ngf 32 --ndf 32), the model will be smaller (13.6 MB) and will take less time to train.

3. Test the model

# test the model
python pix2pix.py --mode test --output_dir pikachu_test --input_dir pikachu/val --checkpoint pikachu_train

After testing, there should be a new folder called pikachu_test. In the folder, if you open the index.html, you should be able to see something like this in your browser:

4. Export the model

python pix2pix.py --mode export --output_dir /export/ --checkpoint /pikachu_train/ --which_direction BtoA

It will create a new export folder.
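Steps 2 to 4 all call pix2pix.py with a different --mode, so they can be collected in one place. The sketch below only builds and prints the three commands, using the flags shown above. The helper name `pix2pix_commands` is hypothetical, and writing the export and checkpoint folders as relative paths (export, pikachu_train) is an assumption; adjust them to wherever your folders live.

```python
import shlex


def pix2pix_commands(name="pikachu", direction="BtoA", max_epochs=200):
    """Return the train, test and export commands as argument lists."""
    train_dir = f"{name}_train"
    return [
        ["python", "pix2pix.py", "--mode", "train", "--output_dir", train_dir,
         "--max_epochs", str(max_epochs), "--input_dir", f"{name}/train",
         "--which_direction", direction],
        ["python", "pix2pix.py", "--mode", "test", "--output_dir", f"{name}_test",
         "--input_dir", f"{name}/val", "--checkpoint", train_dir],
        # Relative paths here are an assumption, see the note above.
        ["python", "pix2pix.py", "--mode", "export", "--output_dir", "export",
         "--checkpoint", train_dir, "--which_direction", direction],
    ]


# Print the commands; pass each list to subprocess.run(cmd, check=True)
# from inside the pix2pix-tensorflow folder to actually execute them.
for command in pix2pix_commands():
    print(shlex.join(command))
```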

5. Port the model to tensorflow.js

I followed affinelayer's instructions here: https://github.com/affinelayer/pix2pix-tensorflow/tree/master/server#exporting

cd server
python tools/export-checkpoint.py --checkpoint ../export --output_file static/models/pikachu_BtoA.pict

We should get a file named pikachu_BtoA.pict, which is 54.4 MB. If you add --ngf 32 --ndf 32 when training the model (see step 2), the file will be smaller (13.6 MB), and the model will take less time to train.
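The two file sizes differ by a factor of four, which matches a rough back-of-the-envelope estimate. The sketch below assumes that the default filter count is 64 and that the layers use 4x4 convolution kernels; both are assumptions about pix2pix-tensorflow, not something stated in this README. The weight count of a convolution is proportional to input channels times output channels, so halving both divides it by four. Layers that touch the 3-channel image only shrink by half, but they are small, so the estimate is approximate.

```python
def conv_weight_count(in_channels, out_channels, kernel_size=4):
    """Number of weights in one convolution layer (biases ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


# Example layer going from ngf to 2 * ngf channels.
full = conv_weight_count(64, 128)   # assumed default, ngf = 64
half = conv_weight_count(32, 64)    # with --ngf 32
ratio = full // half

# Scale the reported 54.4 MB file by the same ratio.
estimated_small_mb = 54.4 / ratio

print(ratio, round(estimated_small_mb, 1))
```

The estimate lands on the 13.6 MB reported for the smaller model.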

6. Create an interactive interface in the browser

Copy the model we get from step 5 to the models folder.

Owner

Yining Shi
