Contrastive Learning Inverts the Data Generating Process

Overview

Contrastive Learning Inverts the Data Generating Process

Official code to reproduce the results and data presented in the paper Contrastive Learning Inverts the Data Generating Process.

3DIdent dataset example images

Experiments

To reproduce the disentanglement results for the MLP mixing, use the main_mlp.py script. For the experiments on KITTI Masks use the main_kitti.py script. For those on 3DIdent, use main_3dident.py.

MLP Mixing

> python main_mlp.py --help
usage: main_mlp.py
       [-h] [--sphere-r SPHERE_R] [--box-min BOX_MIN] [--box-max BOX_MAX]
       [--sphere-norm] [--box-norm] [--only-supervised] [--only-unsupervised]
       [--more-unsupervised MORE_UNSUPERVISED] [--save-dir SAVE_DIR]
       [--num-eval-batches NUM_EVAL_BATCHES] [--rej-mult REJ_MULT]
       [--seed SEED] [--act-fct ACT_FCT] [--c-param C_PARAM]
       [--m-param M_PARAM] [--tau TAU] [--n-mixing-layer N_MIXING_LAYER]
       [--n N] [--space-type {box,sphere,unbounded}] [--m-p M_P] [--c-p C_P]
       [--lr LR] [--p P] [--batch-size BATCH_SIZE] [--n-log-steps N_LOG_STEPS]
       [--n-steps N_STEPS] [--resume-training]

Disentanglement with InfoNCE/Contrastive Learning - MLP Mixing

optional arguments:
  -h, --help            show this help message and exit
  --sphere-r SPHERE_R
  --box-min BOX_MIN     For box normalization only. Minimal value of box.
  --box-max BOX_MAX     For box normalization only. Maximal value of box.
  --sphere-norm         Normalize output to a sphere.
  --box-norm            Normalize output to a box.
  --only-supervised     Only train supervised model.
  --only-unsupervised   Only train unsupervised model.
  --more-unsupervised MORE_UNSUPERVISED
                        How many more steps to do for unsupervised compared to
                        supervised training.
  --save-dir SAVE_DIR
  --num-eval-batches NUM_EVAL_BATCHES
                        Number of batches to average evaluation performance at
                        the end.
  --rej-mult REJ_MULT   Memory/CPU trade-off factor for rejection resampling.
  --seed SEED
  --act-fct ACT_FCT     Activation function in mixing network g.
  --c-param C_PARAM     Concentration parameter of the conditional
                        distribution.
  --m-param M_PARAM     Additional parameter for the marginal (only relevant
                        if it is not uniform).
  --tau TAU
  --n-mixing-layer N_MIXING_LAYER
                        Number of layers in nonlinear mixing network g.
  --n N                 Dimensionality of the latents.
  --space-type {box,sphere,unbounded}
  --m-p M_P             Type of ground-truth marginal distribution. p=0 means
                        uniform; all other p values correspond to (projected)
                        Lp Exponential
  --c-p C_P             Exponent of ground-truth Lp Exponential distribution.
  --lr LR
  --p P                 Exponent of the assumed model Lp Exponential
                        distribution.
  --batch-size BATCH_SIZE
  --n-log-steps N_LOG_STEPS
  --n-steps N_STEPS
  --resume-training

KITTI Masks

>python main_kitti.py --help
usage: main_kitti.py [-h] [--box-norm BOX_NORM] [--p P] [--experiment-dir EXPERIMENT_DIR] [--evaluate] [--specify SPECIFY] [--random-search] [--random-seeds] [--seed SEED] [--beta BETA] [--gamma GAMMA]
                     [--rate-prior RATE_PRIOR] [--data-distribution DATA_DISTRIBUTION] [--rate-data RATE_DATA] [--data-k DATA_K] [--betavae] [--search-beta] [--output-dir OUTPUT_DIR] [--log-dir LOG_DIR]
                     [--ckpt-dir CKPT_DIR] [--max-iter MAX_ITER] [--dataset DATASET] [--batch-size BATCH_SIZE] [--num-workers NUM_WORKERS] [--image-size IMAGE_SIZE] [--use-writer] [--z-dim Z_DIM] [--lr LR]
                     [--beta1 BETA1] [--beta2 BETA2] [--ckpt-name CKPT_NAME] [--log-step LOG_STEP] [--save-step SAVE_STEP] [--kitti-max-delta-t KITTI_MAX_DELTA_T] [--natural-discrete] [--verbose] [--cuda]
                     [--num_runs NUM_RUNS]

Disentanglement with InfoNCE/Contrastive Learning - KITTI Masks

optional arguments:
  -h, --help            show this help message and exit
  --box-norm BOX_NORM
  --p P
  --experiment-dir EXPERIMENT_DIR
                        specify path
  --evaluate            evaluate instead of train
  --specify SPECIFY     use argument to only compute a subset of metrics
  --random-search       whether to random search for params
  --random-seeds        whether to go over random seeds with UDR params
  --seed SEED           random seed
  --beta BETA           weight for kl to normal
  --gamma GAMMA         weight for kl to laplace
  --rate-prior RATE_PRIOR
                        rate (or inverse scale) for prior laplace (larger -> sparser).
  --data-distribution DATA_DISTRIBUTION
                        (laplace, uniform)
  --rate-data RATE_DATA
                        rate (or inverse scale) for data laplace (larger -> sparser). (-1 = rand).
  --data-k DATA_K       k for data uniform (-1 = rand).
  --betavae             whether to do standard betavae training (gamma=0)
  --search-beta         whether to do rand search over beta
  --output-dir OUTPUT_DIR
                        output directory
  --log-dir LOG_DIR     log directory
  --ckpt-dir CKPT_DIR   checkpoint directory
  --max-iter MAX_ITER   maximum training iteration
  --dataset DATASET     dataset name (dsprites, cars3d,smallnorb, shapes3d, mpi3d, kittimasks, natural
  --batch-size BATCH_SIZE
                        batch size
  --num-workers NUM_WORKERS
                        dataloader num_workers
  --image-size IMAGE_SIZE
                        image size. now only (64,64) is supported
  --use-writer          whether to use a log writer
  --z-dim Z_DIM         dimension of the representation z
  --lr LR               learning rate
  --beta1 BETA1         Adam optimizer beta1
  --beta2 BETA2         Adam optimizer beta2
  --ckpt-name CKPT_NAME
                        load previous checkpoint. insert checkpoint filename
  --log-step LOG_STEP   numer of iterations after which data is logged
  --save-step SAVE_STEP
                        number of iterations after which a checkpoint is saved
  --kitti-max-delta-t KITTI_MAX_DELTA_T
                        max t difference between frames sampled from kitti data loader.
  --natural-discrete    discretize natural sprites
  --verbose             for evaluation
  --cuda
  --num_runs NUM_RUNS   when searching over seeds, do 10

3DIdent

>python main_3dident.py --help
usage: main_3dident.py [-h] [--batch-size BATCH_SIZE] [--n-eval-samples N_EVAL_SAMPLES] [--lr LR] [--optimizer {adam,sgd}] [--iterations ITERATIONS]
                                                                   [--n-log-steps N_LOG_STEPS] [--load-model LOAD_MODEL] [--save-model SAVE_MODEL] [--save-every SAVE_EVERY] [--no-cuda] [--position-only]
                                                                   [--rotation-and-color-only] [--rotation-only] [--color-only] [--no-spotlight-position] [--no-spotlight-color] [--no-spotlight]
                                                                   [--non-periodic-rotation-and-color] [--dummy-mixing] [--identity-solution] [--identity-mixing-and-solution]
                                                                   [--approximate-dataset-nn-search] --offline-dataset OFFLINE_DATASET [--faiss-omp-threads FAISS_OMP_THREADS]
                                                                   [--box-constraint {None,fix,learnable}] [--sphere-constraint {None,fix,learnable}] [--workers WORKERS]
                                                                   [--mode {supervised,unsupervised,test}] [--supervised-loss {mse,r2}] [--unsupervised-loss {l1,l2,l3,vmf}]
                                                                   [--non-periodical-conditional {l1,l2,l3}] [--sigma SIGMA] [--encoder {rn18,rn50,rn101,rn151}]

Disentanglement with InfoNCE/Contrastive Learning - 3DIdent

optional arguments:
  -h, --help            show this help message and exit
  --batch-size BATCH_SIZE
  --n-eval-samples N_EVAL_SAMPLES
  --lr LR
  --optimizer {adam,sgd}
  --iterations ITERATIONS
                        How long to train the model
  --n-log-steps N_LOG_STEPS
                        How often to calculate scores and print them
  --load-model LOAD_MODEL
                        Path from where to load the model
  --save-model SAVE_MODEL
                        Path where to save the model
  --save-every SAVE_EVERY
                        After how many steps to save the model (will always be saved at the end)
  --no-cuda
  --position-only
  --rotation-and-color-only
  --rotation-only
  --color-only
  --no-spotlight-position
  --no-spotlight-color
  --no-spotlight
  --non-periodic-rotation-and-color
  --dummy-mixing
  --identity-solution
  --identity-mixing-and-solution
  --approximate-dataset-nn-search
  --offline-dataset OFFLINE_DATASET
  --faiss-omp-threads FAISS_OMP_THREADS
  --box-constraint {None,fix,learnable}
  --sphere-constraint {None,fix,learnable}
  --workers WORKERS     Number of workers to use (0=#cpus)
  --mode {supervised,unsupervised,test}
  --supervised-loss {mse,r2}
  --unsupervised-loss {l1,l2,l3,vmf}
  --non-periodical-conditional {l1,l2,l3}
  --sigma SIGMA         Sigma of the conditional distribution (for vMF: 1/kappa)
  --encoder {rn18,rn50,rn101,rn151}

3DIdent Dataset

We introduce 3Dident, a dataset with hallmarks of natural environments (shadows, different lighting conditions, 3D rotations, etc.). A preliminary version of the dataset is released along with our pre-print.

3DIdent dataset example images

You can access the dataset here. The training and test datasets consists of 250000 and 25000 samples, respectively. To load, you can use the ThreeDIdentDataset class defined in datasets/threedident_dataset.py.

BibTeX

If you find our analysis helpful, please cite our pre-print:

@article{zimmermann2021cl,
  author = {
    Zimmermann, Roland S. and
    Sharma, Yash and
    Schneider, Steffen and
    Bethge, Matthias and
    Brendel, Wieland
  },
  title = {
    Contrastive Learning Inverts the Data Generating Process
  },
  journal = {CoRR},
  volume = {abs/2102.08850},
  year = {2021},
}
Face and Pose detector that emits MQTT events when a face or human body is detected and not detected.

Face Detect MQTT Face or Pose detector that emits MQTT events when a face or human body is detected and not detected. I built this as an alternative t

Jacob Morris 38 Oct 21, 2022
Official Pytorch implementation of Online Continual Learning on Class Incremental Blurry Task Configuration with Anytime Inference (ICLR 2022)

The Official Implementation of CLIB (Continual Learning for i-Blurry) Online Continual Learning on Class Incremental Blurry Task Configuration with An

NAVER AI 34 Oct 26, 2022
Posterior predictive distributions quantify uncertainties ignored by point estimates.

Posterior predictive distributions quantify uncertainties ignored by point estimates.

DeepMind 177 Dec 06, 2022
Request execution of Galaxy SARS-CoV-2 variation analysis workflows on input data you provide.

SARS-CoV-2 processing requests Request execution of Galaxy SARS-CoV-2 variation analysis workflows on input data you provide. Prerequisites This autom

useGalaxy.eu 17 Aug 13, 2022
Plugin for Gaffer providing direct acess to asset from PolyHaven.com. Only HDRIs at the moment, Cycles and Arnold supported

GafferHaven Plugin for Gaffer providing direct acess to asset from PolyHaven.com. Only HDRIs are supported at the moment, in Cycles and Arnold lights.

Jakub Vondra 6 Jan 26, 2022
Rotary Transformer

[中文|English] Rotary Transformer Rotary Transformer is an MLM pre-trained language model with rotary position embedding (RoPE). The RoPE is a relative

325 Jan 03, 2023
CondNet: Conditional Classifier for Scene Segmentation

CondNet: Conditional Classifier for Scene Segmentation Introduction The fully convolutional network (FCN) has achieved tremendous success in dense vis

ycszen 31 Jul 22, 2022
202 Jan 06, 2023
Code for AutoNL on ImageNet (CVPR2020)

Neural Architecture Search for Lightweight Non-Local Networks This repository contains the code for CVPR 2020 paper Neural Architecture Search for Lig

Yingwei Li 104 Aug 31, 2022
Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation

Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation This is the official repository for our paper Neural Reprojection Error

Hugo Germain 78 Dec 01, 2022
Official implementation of the paper "AAVAE: Augmentation-AugmentedVariational Autoencoders"

AAVAE Official implementation of the paper "AAVAE: Augmentation-AugmentedVariational Autoencoders" Abstract Recent methods for self-supervised learnin

Grid AI Labs 48 Dec 12, 2022
Inverse Optimal Control Adapted to the Noise Characteristics of the Human Sensorimotor System

Inverse Optimal Control Adapted to the Noise Characteristics of the Human Sensorimotor System This repository contains code for the paper Schultheis,

2 Oct 28, 2022
Unofficial implementation of "Coordinate Attention for Efficient Mobile Network Design"

Unofficial implementation of "Coordinate Attention for Efficient Mobile Network Design". CoordAttention tensorflow slim

Billy 9 Aug 22, 2022
Code for Environment Inference for Invariant Learning (ICML 2020 UDL Workshop Paper)

Environment Inference for Invariant Learning This code accompanies the paper Environment Inference for Invariant Learning, which appears at ICML 2021.

Elliot Creager 40 Dec 09, 2022
SIMULEVAL A General Evaluation Toolkit for Simultaneous Translation

SimulEval SimulEval is a general evaluation framework for simultaneous translation on text and speech. Requirement python = 3.7.0 Installation git cl

Facebook Research 48 Dec 28, 2022
Official Repo for Ground-aware Monocular 3D Object Detection for Autonomous Driving

Visual 3D Detection Package: This repo aims to provide flexible and reproducible visual 3D detection on KITTI dataset. We expect scripts starting from

Yuxuan Liu 305 Dec 19, 2022
Cold Brew: Distilling Graph Node Representations with Incomplete or Missing Neighborhoods

Cold Brew: Distilling Graph Node Representations with Incomplete or Missing Neighborhoods Introduction Graph Neural Networks (GNNs) have demonstrated

37 Dec 15, 2022
X-VLM: Multi-Grained Vision Language Pre-Training

X-VLM: learning multi-grained vision language alignments Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts. Yan Zeng, Xi

Yan Zeng 286 Dec 23, 2022
This repository contains the segmentation user interface from the OpenSurfaces project, extracted as a lightweight tool

OpenSurfaces Segmentation UI This repository contains the segmentation user interface from the OpenSurfaces project, extracted as a lightweight tool.

Sean Bell 66 Jul 11, 2022
Categorizing comments on YouTube into different categories.

Youtube Comments Categorization This repo is for categorizing comments on a youtube video into different categories. negative (grievances, complaints,

Rhitik 5 Nov 26, 2022