Viewmaker Networks: Learning Views for Unsupervised Representation Learning

Last update: Dec 01, 2022

Alex Tamkin, Mike Wu, and Noah Goodman

Paper link: https://arxiv.org/abs/2010.07432

0) Background

Viewmaker networks are a new, more general method for self-supervised learning that enables pretraining with the same algorithm across a diverse range of modalities, including images, speech, and sensor data.

Viewmaker networks learn a family of data transformations with a generative model, in contrast to prior approaches, which rely on data transformations developed by domain experts through trial and error.

Viewmakers are trained adversarially with respect to the pretraining loss: the encoder is trained to minimize the loss, while the viewmaker is trained to maximize it. Because this only requires a loss computed on views, viewmakers are compatible with many different pretraining objectives. We present results for SimCLR and InstDisc, but viewmakers are compatible with any view-based objective, including MoCo, BYOL, SimSiam, and SwAV.
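The adversarial setup can be illustrated with a minimal sketch in plain Python. It is not the repository's implementation: the loss below is a simplified, one-directional InfoNCE-style loss rather than the exact SimCLR or InstDisc objective, and the names contrastive_loss, encoder_objective, and viewmaker_objective are hypothetical. The only point it shows is that the viewmaker's objective is the negation of the encoder's.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(views_a, views_b, temperature=0.5):
    """Simplified InfoNCE-style loss (an assumption, not the repo's loss).

    views_a[i] and views_b[i] are embeddings of two views of example i.
    Each anchor in views_a should be most similar to its own partner
    in views_b and less similar to the partners of other examples.
    """
    total = 0.0
    for i, anchor in enumerate(views_a):
        logits = [cosine(anchor, other) / temperature for other in views_b]
        log_denominator = math.log(sum(math.exp(x) for x in logits))
        total += log_denominator - logits[i]
    return total / len(views_a)


def encoder_objective(views_a, views_b):
    """The encoder is trained to minimize the pretraining loss."""
    return contrastive_loss(views_a, views_b)


def viewmaker_objective(views_a, views_b):
    """The viewmaker minimizes the negated loss, i.e. maximizes the loss."""
    return -contrastive_loss(views_a, views_b)


# Toy embeddings: matched views are easy for the encoder,
# mismatched views are what an adversary would prefer.
matched_a = [[1.0, 0.0], [0.0, 1.0]]
matched_b = [[1.0, 0.0], [0.0, 1.0]]
mismatched_b = [[0.0, 1.0], [1.0, 0.0]]

easy_loss = encoder_objective(matched_a, matched_b)
hard_loss = encoder_objective(matched_a, mismatched_b)
```

In this toy setting hard_loss is larger than easy_loss, so views that confuse the encoder score better under viewmaker_objective. In actual training the two objectives would be optimized by gradient steps on the encoder and viewmaker parameters, which this sketch omits.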

Some example distortions learned for images (each frame is generated with a different random noise input to the viewmaker)

1) Install Dependencies

We used the following PyTorch libraries for CUDA 10.1; you may have to adapt for your own CUDA version:

pip install torch==1.7.1+cu101 torchvision==0.8.2+cu101 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

Install other dependencies:

pip install -r requirements.txt

2) Running experiments

Start by running

source init_env.sh

Now, you can run experiments for the different modalities as follows:

python scripts/run_sensor.py config/sensor/pretrain_viewmaker_pamap2_simclr.json --gpu-device 0

This command runs viewmaker pretraining on the Pamap2 wearable sensor dataset using GPU #0. (If you have a multi-GPU node, you can specify other GPUs.)

The scripts directory holds:

run_image.py: for pretraining and running linear evaluation on CIFAR-10
run_meta_transfer.py: for running linear evaluation on a range of transfer datasets, including many from MetaDataset
run_audio.py: for pretraining on LibriSpeech and running linear evaluation on a range of transfer datasets
run_sensor.py: for pretraining on Pamap2 and running transfer, supervised, and semi-supervised learning on different splits of Pamap2
eval_cifar10_c.py: for evaluating a linear evaluation model on the CIFAR-10-C dataset for assessing robustness to common corruptions

The config directory holds configuration files for the different experiments, specifying the hyperparameters for each experiment. The first field in every config file is exp_base, which specifies the base directory where experiment outputs are saved. Change it to match your own setup.
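One way to set exp_base without editing the shipped configs by hand is to write a local copy with only that field replaced. The sketch below assumes the config files are plain JSON objects with a top-level exp_base key, as described above. The helper name write_local_config and the output paths in the commented usage are hypothetical.

```python
import json
from pathlib import Path


def write_local_config(src_path, dst_path, exp_base):
    """Copy a JSON config to dst_path, replacing only its exp_base field.

    Assumes the config is a JSON object with a top-level "exp_base" key.
    All other fields, and their order, are left untouched.
    """
    config = json.loads(Path(src_path).read_text())
    if "exp_base" not in config:
        raise KeyError(f"{src_path} has no exp_base field")
    config["exp_base"] = str(exp_base)
    Path(dst_path).write_text(json.dumps(config, indent=2) + "\n")
    return config


# Example usage (paths other than the source config are placeholders):
# write_local_config(
#     "config/sensor/pretrain_viewmaker_pamap2_simclr.json",
#     "config/sensor/local_pamap2_simclr.json",
#     "/path/to/your/experiments",
# )
```

The resulting file can then be passed to the run scripts in place of the original config.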

You are responsible for downloading the datasets. Update the paths in src/datasets/root_paths.py.

Training curves and other metrics are logged using wandb.ai.
