Scaling and Benchmarking Self-Supervised Visual Representation Learning

Last update: Dec 31, 2022

Overview

FAIR Self-Supervision Benchmark is deprecated. Please see VISSL, a ground-up rewrite of the benchmark in PyTorch.

FAIR Self-Supervision Benchmark

This code provides various benchmark (and legacy) tasks for evaluating the quality of visual representations learned by various self-supervision approaches. It corresponds to our work on Scaling and Benchmarking Self-Supervised Visual Representation Learning. The code is written in Python and can be used to evaluate both PyTorch and Caffe2 models. We hope that this benchmark release will provide a consistent evaluation strategy that makes it easy to measure progress in self-supervision.

Introduction

The goal of fair_self_supervision_benchmark is to standardize the methodology for evaluating the quality of visual representations learned by various self-supervision approaches. It also provides evaluation on a variety of tasks, as follows:

Benchmark tasks: The benchmark tasks are based on the principle that a good representation (1) transfers to many different tasks and (2) transfers with limited supervision and limited fine-tuning. The tasks are as follows.

Image Classification
- VOC07
- COCO2014
- Places205
Low-Shot Image Classification
- VOC07
- Places205
Object Detection on VOC07 and VOC07+12 with frozen backbone for detectors:
- Fast R-CNN
- Faster R-CNN
Surface Normal Estimation
Visual Navigation in Gibson Environment
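
The principle behind the benchmark tasks (transfer with limited supervision and limited fine-tuning) can be illustrated with a toy sketch. The code below is not taken from this repository. It assumes a hypothetical `frozen_backbone` function standing in for a pretrained network, a hypothetical `sample_low_shot` helper that keeps only k labeled examples per class, and a simple perceptron as the linear classifier. Only the linear weights are trained; the backbone is never updated.

```python
import random


def frozen_backbone(x):
    # Hypothetical stand-in for a pretrained feature extractor.
    # It has no trainable state, so evaluation can never update it.
    return [x[0] + x[1], x[0] - x[1]]


def sample_low_shot(dataset, k, seed=0):
    # Keep at most k labeled examples per class (the low-shot setting).
    rng = random.Random(seed)
    by_class = {}
    for example in dataset:
        by_class.setdefault(example[1], []).append(example)
    subset = []
    for label in sorted(by_class):
        items = by_class[label]
        rng.shuffle(items)
        subset.extend(items[:k])
    return subset


def score(model, x):
    # Linear score computed on top of the frozen features.
    w, b = model
    features = frozen_backbone(x)
    return sum(wi * fi for wi, fi in zip(w, features)) + b


def train_linear_probe(dataset, epochs=200, lr=0.1):
    # Perceptron updates touch only the linear weights w and bias b.
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in dataset:  # labels are -1 or +1
            if y * score((w, b), x) <= 0:
                features = frozen_backbone(x)
                w = [wi + lr * y * fi for wi, fi in zip(w, features)]
                b += lr * y
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return w, b


def accuracy(model, dataset):
    correct = sum(1 for x, y in dataset
                  if (1 if score(model, x) > 0 else -1) == y)
    return correct / len(dataset)


# Toy, linearly separable data on a grid; the label is the sign of x0 + x1.
data = [((i / 5, j / 5), 1 if i + j > 0 else -1)
        for i in range(-5, 6) for j in range(-5, 6) if i + j != 0]

subset = sample_low_shot(data, k=4)   # 4 labeled examples per class
model = train_linear_probe(subset)    # backbone stays fixed
print("train examples:", len(subset))
print("accuracy on all data:", accuracy(model, data))
```

The better the frozen features separate the classes, the fewer labeled examples the linear classifier needs, which is the property the benchmark tasks are designed to measure. The actual tasks and training procedures are described in GETTING_STARTED.md.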

These Benchmark tasks use the network architectures:

Legacy tasks: We also classify some commonly used evaluation tasks as legacy tasks, for the reasons given in Section 7 of the paper. The tasks are as follows:

ImageNet-1K classification task
VOC07 full finetuning
Object Detection on VOC07 and VOC07+12 with full tuning for detectors:
- Fast R-CNN
- Faster R-CNN

License

fair_self_supervision_benchmark is licensed under CC-NC 4.0 International, as found in the LICENSE file.

Citation

If you use fair_self_supervision_benchmark in your research or wish to refer to the baseline results published in the paper, please use the following BibTeX entry.

@article{goyal2019scaling,
  title={Scaling and Benchmarking Self-Supervised Visual Representation Learning},
  author={Goyal, Priya and Mahajan, Dhruv and Gupta, Abhinav and Misra, Ishan},
  journal={arXiv preprint arXiv:1905.01235},
  year={2019}
}

Installation

Please find installation instructions in INSTALL.md.

Getting Started

After installation, please see GETTING_STARTED.md for how to run various benchmark tasks.

Model Zoo

We provide models used in our paper in the MODEL_ZOO.

References

Scaling and Benchmarking Self-Supervised Visual Representation Learning. Priya Goyal, Dhruv Mahajan, Abhinav Gupta*, Ishan Misra*. Tech report, arXiv, May 2019.

Owner

Meta Research
