A new mini-batch framework for optimal transport in deep generative models, deep domain adaptation, approximate Bayesian computation, color transfer, and gradient flow.

Related tags

MiscellaneousBoMb-OT
Overview

BoMb-OT

Python3 implementation of the papers On Transportation of Mini-batches: A Hierarchical Approach and Improving Mini-batch Optimal Transport via Partial Transportation.

Please CITE our papers whenever this repository is used to help produce published results or incorporated into other software.

@article{nguyen2021transportation,
      title={On Transportation of Mini-batches: A Hierarchical Approach}, 
      author={Khai Nguyen and Dang Nguyen and Quoc Nguyen and Tung Pham and Hung Bui and Dinh Phung and Trung Le and Nhat Ho},
      journal={arXiv preprint arXiv:2102.05912},
      year={2021},
}
@article{nguyen2021improving,
      title={Improving Mini-batch Optimal Transport via Partial Transportation}, 
      author={Khai Nguyen and Dang Nguyen and Tung Pham and Nhat Ho},
      journal={arXiv preprint arXiv:2108.09645},
      year={2021},
}

This implementation is made by Khai Nguyen and Dang Nguyen. README is on updating process.

Requirement

  • python 3.6
  • pytorch 1.7.1
  • torchvision
  • numpy
  • tqdm
  • geomloss
  • POT
  • matplotlib
  • cvxpy

What is included?

The scalable implementation of the batch of mini-batches scheme and the conventional averaging scheme of mini-batch transportation types: optimal transport (OT), partial optimal transport (POT), unbalanced optimal transport (UOT), sliced optimal transport for:

  • Deep Generative Models
  • Deep Domain Adaptation
  • Approximate Bayesian Computation
  • Color Transfer
  • Gradient Flow

Deep Adaptation on digits datasets (DeepDA/digits)

Code organization

cfg.py : this file contains arguments for training.

methods.py : this file implements the training process of the deep DA.

models.py : this file contains the architecture of the genertor and the classifier.

train_digits.py: running file for deep DA.

utils.py : this file contains implementation of utility functions.

Terminologies

--method : type of mini-batch deep DA method (jdot, jumbot, jpmbot)

--source_ds : source dataset

--target_ds : target dataset

--epsilon : OT regularization coefficient for Sinkhorn algorithm

--tau : marginal penalization coefficient in UOT

--mass : fraction of masses in POT

--eta1 : weight of embedding loss

--eta2 : weight of transportation loss

--k : number of mini-batches

--mbsize : mini-batch size

--n_epochs : number of running epochs

--test_interval : interval of two continuous test phase

--lr : initial learning rate

--data_dir : path to dataset

--reg : OT regularization coefficient for Sinkhorn algorithm

--bomb : Using Batch of Mini-batches

--ebomb : Using entropic Batch of Mini-batches

--breg : OT regularization coefficient for entropic Batch of Mini-batches

Change the number of mini-batches $k$

bash sh/exp_mOT_change_k.sh
bash sh/exp_BoMbOT_change_k.sh

Change the mini-batch size $m$

bash sh/exp_mOT_change_m.sh
bash sh/exp_BoMbOT_change_m.sh

Deep Adaptation on Office-Home and VisDA datasets (DeepDA/office)

Code organization

data_list.py : this file contains functions to create dataset.

evaluate.py : this file is used to evaluate model trained on VisDA dataset.

lr_schedule.py : this file implements the learning rate scheduler.

network.py : this file contains the architecture of the genertor and the classifier.

pre_process.py : this file implements preprocessing techniques.

train.py : this file implements the training process for both datasets.

Terminologies

--net : architecture type of the generator

--dset : name of the dataset

--test_interval : interval of two continuous test phase

--s_dset_path : path to source dataset

--stratify_source : use stratify sampling

--s_dset_path : path to target dataset

--batch_size : training batch size

--stop_step : number of iterations

--ot_type : type of OT loss (balanced, unbalanced, partial)

--eta1 : weight of embedding loss ($\alpha$ in equation 10)

--eta2 : weight of transportation loss ($\lambda_t$ in equation 10)

--epsilon : OT regularization coefficient for Sinkhorn algorithm

--tau : marginal penalization coefficient in UOT

--mass : fraction of masses in POT

--bomb : Using Batch of Mini-batches

--ebomb : Using entropic Batch of Mini-batches

--breg : OT regularization coefficient for entropic Batch of Mini-batches

Train on Office-Home

bash sh/train_home.sh

Train on VisDA

bash sh/train_visda.sh

Deep Generative model (DeepGM)

Code organization

Celeba_generator.py, Cifar_generator.py : these files contain the architecture of the generator on CelebA and CIFAR10 datasets, and include some self-function to compute losses of corresponding baselines.

experiments.py : this file contains some functions for generating images.

fid_score.py: this file is used to compute the FID score.

gen_images.py: read saved models to produce 10000 images to calculate FID.

inception.py: this file contains the architecture of Inception Net V3.

main_celeba.py, main_cifar.py : running files on the corresponding datasets.

utils.py : this file contains implementation of utility functions.

Terminologies

--method : type of OT loss (OT, UOT, POT, sliced)

--reg : OT regularization coefficient for Sinkhorn algorithm

--tau : marginal penalization coefficient in UOT

--mass : fraction of masses in POT

--k : number of mini-batches

--m : mini-batch size

--epochs : number of epochs at k = 1. The actual running epochs is calculated by multiplying this value by the value of k.

--lr : initial learning rate

--latent-size : latent size of the generator

--datadir : path to dataset

--L : number of projections when using slicing approach

--bomb : Using Batch of Mini-batches

--ebomb : Using entropic Batch of Mini-batches

--breg : OT regularization coefficient for entropic Batch of Mini-batches

Train on CIFAR10

CUDA_VISIBLE_DEVICES=0 python main_cifar.py --method POT --reg 0 --tau 1 \
    --mass 0.7 --k 2 --m 100 --epochs 100 --lr 5e-4 --latent-size 32 --datadir ./data

Train on CELEBA

CUDA_VISIBLE_DEVICES=0 python main_celeba.py --method POT --reg 0 --tau 1 \
    --mass 0.7 --k 2 --m 200 --epochs 100 --lr 5e-4 --latent-size 32 --datadir ./data

Gradient Flow (GradientFlow)

python main.py

Color Transfer (Color Transfer)

python main.py  --m=100 --T=10000 --source images/s1.bmp --target images/t1.bmp --cluster

Terminologies

--k : number of mini-batches

--m : the size of mini-batches

--T : the number of steps

--cluster: K mean clustering to compress images

--palette: show color palette

--source: Path to the source image

Acknowledgment

The structure of DeepDA is largely based on JUMBOT and ALDA. The structure of ABC is largely based on SlicedABC. We are very grateful for their open sources.

Owner
Khai Ba Nguyen
I am currently an AI Resident at VinAI Research, Vietnam.
Khai Ba Nguyen
Do you need a screensaver for CircuitPython? Of course you do

circuitpython_screensaver Do you need a screensaver for CircuitPython? Of course you do Demo video of dvdlogo screensaver: screensaver_dvdlogo.mp4 Dem

Tod E. Kurt 8 Sep 02, 2021
Get a list of the top-10 rejected libraries in your WhiteSource inventory

WhiteSource Top 10 Rejected Libraries Generate a spreadsheet listing the 10 most common libraries in your WhiteSource inventory that were rejected by

WhiteSource-PS-tools 10 Mar 23, 2022
ALSPAC data analysis studying links between screen-usage and mental health issues in children. Provided data has been synthesised.

ADSMH - Mental Health and Screen Time Group coursework for Applied Data Science at the University of Bristol. Overview The data set that you have was

Kai 1 Jan 13, 2022
An animal facts python module

An animal facts python module

Fayas Noushad 3 Dec 19, 2021
A wrapper script to make working with ADB (Android Debug Bridge) easier

Python-ADB-Wrapper A wrapper script to make working with ADB (Android Debug Bridge) easier This project was just a simple test to see if I could wrap

18iteration 1 Nov 25, 2021
Home Assistant integration for spanish electrical data providers (e.g., datadis)

homeassistant-edata Esta integración para Home Assistant te permite seguir de un vistazo tus consumos y máximas potencias alcanzadas. Para ello, se ap

VMG 163 Jan 05, 2023
Groupe du projet Python en 2TL2-4

Présentation Projet EpheCom Ce logiciel a été développé dans le cadre scolaire. EpheCom est un logiciel de communications - vocale et écrite - en temp

1 Dec 26, 2021
Comprehensive Python Cheatsheet

Comprehensive Python Cheatsheet

Jure Šorn 31.3k Dec 30, 2022
Architecture example simulator

SCADA architecture Example of a SCADA-like console application, used to serve as a minimal example of a standard architecture of an IIoT system. Insta

1 Nov 06, 2021
A script to generate NFT art living on the Solana blockchain.

NFT Generator This script generates NFT art based on its desired traits with their specific rarities. It has been used to generate the full collection

Rude Golems 24 Oct 08, 2022
OntoSeer is a tool to help users build better quality ontologies

Ontoseer This document provides documentation for the first version of OntoSeer.OntoSeer is a tool that monitors the ontology development process andp

Knowledgeable Computing and Reasoning Lab 9 Aug 15, 2022
Demo of using DataLoader to prevent out of memory

Demo of using DataLoader to prevent out of memory

3 Jun 25, 2022
Osintgram by Datalux but i fixed some errors i found and made it look cleaner

OSINTgram-V2 OSINTgram-V2 is made from Osintgram which is made by Datalux originally but i took the script and fixed some errors i found and made the

2 Feb 02, 2022
Rotazioni: a linear programming workout split optimizer

Rotazioni: a linear programming workout split optimizer Dependencies Dependencies for the frontend and backend are respectively listed in client/packa

Marco 3 Oct 13, 2022
aaencode for python,把python代码转换为颜文字

py-aaencode aaencode for python,把python代码转换为颜文字 compile.py: 将python编译成颜文字,编译结果有随机性,可以选择BPE词表压缩代码 compile_min.py: 最小化的编译器 compiled_min.txt: 编译得到的最小的com

11 Dec 30, 2021
PyPIContents is an application that generates a Module Index from the Python Package Index (PyPI) and also from various versions of the Python Standard Library.

PyPIContents is an application that generates a Module Index from the Python Package Index (PyPI) and also from various versions of the Python Standar

Collage Labs 10 Nov 19, 2022
Simple AoC helper program you can use to develop your own solutions in python.

AoC-Compabion Simple AoC helper program you can use to develop your own solutions in python. Simply install it in your python environment using pip fr

Alexander Vollmer 1 Dec 20, 2021
Ontario-Covid-Screening - An automated Covid-19 School Screening Tool for Ontario

Ontario-Covid19-Screening An automated Covid-19 School Screening Tool for Ontari

Rayan K 0 Feb 20, 2022
dragmap-meth: Fast and accurate aligner for bisulfite sequencing reads using dragmap

dragmap_meth (dragmap_meth.py) Alignment of BS-Seq reads using dragmap. Intro This works for single-end reads and for paired-end reads from the direct

Shaojun Xie 3 Jul 14, 2022
Junos PyEZ is a Python library to remotely manage/automate Junos devices.

The repo is under active development. If you take a clone, you are getting the latest, and perhaps not entirely stable code. DOCUMENTATION Official Do

Juniper Networks 623 Dec 10, 2022