Synthesize photos from PhotoDNA using machine learning 🌱

Last update: Nov 23, 2022

Overview

Ribosome

Synthesize photos from PhotoDNA.

See the blog post for more information.

Installation

Dependencies

You can install Python dependencies using pip install -r requirements.txt. If you want to install the packages manually, here is a list:

PyTorch (torch, torchvision)
NumPy (numpy)
Pillow (Pillow)
tqdm (tqdm)

Pre-trained models

Ribosome is released with 4 pre-trained models:

coco-model.pt: trained on COCO 2017 train images
celeba-model.pt: trained on CelebA aligned+cropped images
nsfw-model.pt: trained on 100K SFW+NSFW images scraped from Reddit
coco+celeba+nsfw-model.pt: trained on the combination of the above

Use the models trained on NSFW data at your own risk.

Usage

Inference

Use the infer.py script to produce images from hashes:

python infer.py [--model MODEL] [--output OUTPUT] hash

The hash is a base64-encoded string, e.g. cVwhQ58OSCEOIwF+AigAkT0GAWdwAQs8o04KGYMfHBUANRUOAycUEFABCh6PABIghDBzCa4RTysQYVcvMDdkMypBPSyNAgRCcTf2AC9PfiYSWDw3KTcxPxM2HSqTDSIsgxJFFA+iihERcU4fHEY4Lj0xhw3QJN4OXQwbIzJjVTsUodIVVy3/FY8I/wcui11O.
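To make the input format concrete, the following is a minimal sketch of decoding such a string into raw bytes with the Python standard library. The helper name decode_hash is hypothetical and is not part of infer.py. The default length of 144 bytes is an assumption inferred from the 192-character example above, not something this README states.

```python
import base64
import binascii

# Assumption: 144 bytes, inferred from the 192-character example hash above.
EXPECTED_HASH_BYTES = 144


def decode_hash(hash_b64: str, expected_len: int = EXPECTED_HASH_BYTES) -> bytes:
    """Decode a base64-encoded hash string and check its length."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        raw = base64.b64decode(hash_b64, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"not valid base64: {exc}") from exc
    if len(raw) != expected_len:
        raise ValueError(f"expected {expected_len} bytes, got {len(raw)}")
    return raw


if __name__ == "__main__":
    # Round-trip a synthetic 144-byte value rather than a real hash.
    sample = base64.b64encode(bytes(range(144))).decode("ascii")
    print(len(sample), len(decode_hash(sample)))
```

A check like this can catch a truncated or mis-pasted hash before it is handed to the script.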

Training

Datasets

Datasets consist of images paired with hashes: a CSV file listing paths and hashes, plus the image files in a directory. The CSV file has two columns, path and hash, with no header row. Each hash is base64-encoded. Images are 100x100 pixels. After producing such a CSV, it may be convenient to shuffle it and split it into a training set and a validation set.
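The shuffle-and-split step could be sketched as follows, using only the standard library. The function names, the 10% validation fraction, and the fixed seed are illustrative assumptions and are not taken from Ribosome itself.

```python
import csv
import random
from typing import List, Sequence, Tuple

Row = Tuple[str, str]  # (relative image path, base64-encoded hash)


def split_rows(
    rows: Sequence[Row], val_fraction: float = 0.1, seed: int = 0
) -> Tuple[List[Row], List[Row]]:
    """Shuffle rows deterministically and split them into (train, val)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def write_csv(path: str, rows: Sequence[Row]) -> None:
    """Write path,hash rows with no header row."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
```

For example, calling split_rows on the full list of rows and passing each half to write_csv would produce a pair of files analogous to a training CSV and a validation CSV.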

Example dataset

Ribosome includes an example dataset in this format, produced from COCO:

coco100x100.tar.gz: image files
coco-train.csv: training set hashes
coco-val.csv: validation set hashes

Preparing a dataset

To produce 100x100 images from an existing dataset, it may be convenient to use ImageMagick.

To resize image.jpg to 100x100 ignoring the original aspect ratio:

mogrify -resize '100x100!' image.jpg

To resize image.jpg to 100x100 by taking a center crop:

mogrify -resize '100x100^' -gravity Center -extent '100x100' image.jpg

You can process files in parallel using find / xargs, e.g. to convert all .jpg images using 24 parallel processes:

find . -name '*.jpg' -print0 | xargs -0 -n 1 -P 24 mogrify -resize '100x100!'
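If you would rather stay in Python, the center-crop variant can be approximated with Pillow, which is already a dependency. This is a sketch under stated assumptions: the function names are hypothetical, the output is written to a separate file instead of modifying the input in place, and the crop happens before the resize, so the result may differ slightly from the ImageMagick command.

```python
from typing import Tuple


def center_crop_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def resize_center_crop(src: str, dst: str, size: int = 100) -> None:
    """Center-crop src to a square, resize it to size x size, and save to dst."""
    # Pillow is imported lazily so center_crop_box has no third-party dependency.
    from PIL import Image

    with Image.open(src) as img:
        box = center_crop_box(*img.size)
        img.crop(box).resize((size, size), Image.LANCZOS).save(dst)
```

For a 640x480 input, center_crop_box selects the middle 480x480 region, which is then scaled down to 100x100.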

Ribosome does not provide code to compute PhotoDNA hashes, but such code is available in pyPhotoDNA.

Train a model

Use the train.py script to train a model on a dataset:

python train.py --train-data TRAIN_DATA ...

--train-data is the path to the training data CSV
Paths in the CSV are interpreted relative to --data-dir (or . if not supplied)
--val-data is the path to the validation data CSV; if provided, the script will report the validation loss after every epoch
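To illustrate how CSV paths relate to the data directory, here is a small sketch that reads a dataset CSV and resolves each path against a base directory. It is not the code used by train.py; the function name load_rows and its return type are assumptions made for illustration.

```python
import base64
import csv
import os
from typing import List, Tuple


def load_rows(csv_path: str, data_dir: str = ".") -> List[Tuple[str, bytes]]:
    """Read path,hash rows and return (resolved image path, raw hash bytes)."""
    rows = []
    with open(csv_path, newline="") as f:
        for record in csv.reader(f):
            if not record:
                continue  # skip blank lines
            path, hash_b64 = record
            # Paths in the CSV are taken relative to data_dir.
            rows.append((os.path.join(data_dir, path), base64.b64decode(hash_b64)))
    return rows
```

With data_dir left at its default of ".", the paths are resolved relative to the current working directory.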

See python train.py --help for all the options.

License

Releases

v1.0.0 (Dec 20, 2021)

Owner

Anish Athalye
