TensorFlow 2 implementations of the C-SimCLR and C-BYOL self-supervised visual representation methods from "Compressive Visual Representations" (NeurIPS 2021)

Last update: Nov 23, 2022

Overview

Compressive Visual Representations

This repository contains the source code for our paper, Compressive Visual Representations. Using the Conditional Entropy Bottleneck, we developed information-compressed versions of the SimCLR and BYOL self-supervised learning algorithms, which we call C-SimCLR and C-BYOL. Compression yields significant improvements in accuracy and robustness, with linear evaluation performance competitive with fully supervised models.

We include implementations of the C-SimCLR and C-BYOL algorithms developed in our paper, as well as SimCLR and BYOL baselines.

Getting Started

Install the necessary dependencies with pip install -r requirements.txt. We recommend creating a new virtual environment.

To train a model with C-SimCLR on ImageNet, run bash scripts/csimclr.sh. To train a model with C-BYOL, run bash scripts/cbyol.sh.

Refer to the scripts for further configuration options and for how to train the corresponding SimCLR and BYOL baselines.

These commands use the same hyperparameters that we used to train the models in our paper. In particular, we used a batch size of 4096 on 32 Cloud TPUs. Training on different accelerators may require reducing the batch size. To get started with Google Cloud TPUs, we recommend following this tutorial.
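As a rough illustration of the batch-size arithmetic, the sketch below splits a global batch across replicas and scales the global batch for a smaller setup. It assumes that "32 Cloud TPUs" corresponds to 32 data-parallel replicas and that keeping the per-replica batch constant is an acceptable heuristic. Both are assumptions, and the helper names are hypothetical rather than part of this repository. Other hyperparameters, such as the learning rate, may need retuning when the batch size changes.

```python
def per_replica_batch_size(global_batch: int, num_replicas: int) -> int:
    """Split a global batch evenly across data-parallel replicas."""
    if num_replicas <= 0:
        raise ValueError("num_replicas must be positive")
    if global_batch % num_replicas != 0:
        raise ValueError("global batch must be divisible by the number of replicas")
    return global_batch // num_replicas


def scaled_global_batch(num_replicas: int,
                        reference_batch: int = 4096,
                        reference_replicas: int = 32) -> int:
    """Global batch that keeps the per-replica batch of the reference setup.

    The defaults (4096 examples on 32 replicas) mirror the setup described
    above; treating each Cloud TPU as one replica is an assumption.
    """
    per_replica = per_replica_batch_size(reference_batch, reference_replicas)
    return per_replica * num_replicas


if __name__ == "__main__":
    print(per_replica_batch_size(4096, 32))  # examples per replica in the reference setup
    print(scaled_global_batch(8))            # global batch for a hypothetical 8-replica setup
```

Under these assumptions, each replica in the reference setup processes 128 examples per step, so an 8-replica setup would use a global batch of 1024.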

Checkpoints

The following table contains pretrained checkpoints for C-SimCLR and C-BYOL, as well as their respective baselines, SimCLR and BYOL. All models are trained on ImageNet. The Top-1 accuracy is obtained by training a linear classifier on top of a "frozen" backbone while the network undergoes self-supervised training.
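One common way to train a linear classifier on a "frozen" backbone while the backbone itself keeps training is to place a stop-gradient between the features and the classifier. The sketch below shows this in TensorFlow 2 with toy weight matrices. It is an illustration under that assumption, not the code in this repository, and all variable names and shapes are hypothetical.

```python
import tensorflow as tf

# Toy stand-ins for a backbone and a linear classifier (hypothetical shapes).
w_backbone = tf.Variable(tf.random.normal([16, 8]))
w_head = tf.Variable(tf.zeros([8, 3]))
b_head = tf.Variable(tf.zeros([3]))

images = tf.random.normal([4, 16])   # fake batch of flattened inputs
labels = tf.constant([0, 1, 2, 1])   # fake class labels

with tf.GradientTape() as tape:
    features = tf.nn.relu(tf.matmul(images, w_backbone))
    # stop_gradient keeps the classification loss from updating the backbone.
    logits = tf.matmul(tf.stop_gradient(features), w_head) + b_head
    loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=labels, logits=logits))

grad_backbone, grad_w_head, grad_b_head = tape.gradient(
    loss, [w_backbone, w_head, b_head])

# The backbone receives no gradient from the classifier loss (None),
# while the linear head does.
print(grad_backbone is None, grad_w_head.shape, grad_b_head.shape)
```

In a full training step, a self-supervised loss would be added on the same features without the stop-gradient, so the backbone would be updated by that loss only.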

Algorithm	Backbone	Training epochs	ImageNet Top-1 (%)	Checkpoint
SimCLR	ResNet 50	1000	71.1	link
SimCLR	ResNet 50 2x	1000	74.6	link
C-SimCLR	ResNet 50	1000	71.8	link
C-SimCLR	ResNet 50 2x	1000	74.7	link
BYOL	ResNet 50	1000	74.4	link
BYOL	ResNet 50 2x	1000	77.3	link
C-BYOL	ResNet 50	1000	75.9	link
C-BYOL	ResNet 50 2x	1000	79.1	link
C-BYOL	ResNet 101	1000	78.0	link
C-BYOL	ResNet 152	1000	78.8	link
C-BYOL	ResNet 50	1500	76.0	link
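To make the comparison with the baselines explicit, the following short script computes the Top-1 difference between each compressed model and its baseline for the backbones where both are listed. The numbers are copied from the 1000-epoch rows of the table above; the dictionary layout and function name are only illustrative.

```python
# Top-1 accuracies (%) copied from the checkpoint table (1000 epochs).
top1 = {
    ("SimCLR", "ResNet 50"): 71.1,
    ("SimCLR", "ResNet 50 2x"): 74.6,
    ("C-SimCLR", "ResNet 50"): 71.8,
    ("C-SimCLR", "ResNet 50 2x"): 74.7,
    ("BYOL", "ResNet 50"): 74.4,
    ("BYOL", "ResNet 50 2x"): 77.3,
    ("C-BYOL", "ResNet 50"): 75.9,
    ("C-BYOL", "ResNet 50 2x"): 79.1,
}


def compression_gain(baseline: str, backbone: str) -> float:
    """Top-1 gain, in percentage points, of the C- variant over its baseline."""
    compressed = top1[("C-" + baseline, backbone)]
    return round(compressed - top1[(baseline, backbone)], 1)


# One entry per (baseline, backbone) pair that has a compressed counterpart.
gains = {
    key: compression_gain(*key)
    for key in top1
    if not key[0].startswith("C-")
}

for (baseline, backbone), gain in gains.items():
    print(f"C-{baseline} vs {baseline} ({backbone}): +{gain}")
```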

Reference

If you use C-SimCLR or C-BYOL, please cite our paper using the following BibTeX entry.

@InProceedings{lee2021compressive,
  title={Compressive Visual Representations},
  author={Lee, Kuang-Huei and Arnab, Anurag and Guadarrama, Sergio and Canny, John and Fischer, Ian},
  booktitle={NeurIPS},
  year={2021}
}

Credits

This repository is based on SimCLR. Our TensorFlow 2 implementation of BYOL matches the original JAX implementation of BYOL.

Disclaimer: This is not an official Google product.

Owner

Google Research
