Code for generating a single image pretraining dataset

Overview

Single Image Pretraining of Visual Representations

As shown in the paper

A critical analysis of self-supervision, or what we can learn from a single image, Asano et al. ICLR 2020

Example images from our dataset

Why?

Self-supervised representation learning has made enormous strides in recent years. In this paper we show that a large part why self-supervised learning works are the augmentations. We show this by pretraining various SSL methods on a dataset generated solely from augmenting a single source image and find that various methods still pretrain quite well and even yield representations as strong as using the whole dataset for the early layers of networks.

Abstract

We look critically at popular self-supervision techniques for learning deep convolutional neural networks without manual labels. We show that three different and representative methods, BiGAN, RotNet and DeepCluster, can learn the first few layers of a convolutional network from a single image as well as using millions of images and manual labels, provided that strong data augmentation is used. However, for deeper layers the gap with manual supervision cannot be closed even if millions of unlabelled images are used for training. We conclude that: (1) the weights of the early layers of deep networks contain limited information about the statistics of natural images, that (2) such low-level statistics can be learned through self-supervision just as well as through strong supervision, and that (3) the low-level statistics can be captured via synthetic transformations instead of using a large image dataset.

Usage

Here we provide the code for generating a dataset from using just a single source image. Since the publication, I have slightly modified the dataset generation script to make it easier to use. Dependencies: torch, torchvision, joblib, PIL, numpy, any recent version should do.

Run like this:

python make_dataset_single.py --imgpath images/ameyoko.jpg --targetpath ./out/ameyoko_dataset

Here is the full description of the usage:

usage: make_dataset_single.py [-h] [--img_size IMG_SIZE]
                              [--batch_size BATCH_SIZE] [--num_imgs NUM_IMGS]
                              [--threads THREADS] [--vflip] [--deg DEG]
                              [--shear SHEAR] [--cropfirst]
                              [--initcrop INITCROP] [--scale SCALE SCALE]
                              [--randinterp] [--imgpath IMGPATH] [--debug]
                              [--targetpath TARGETPATH]

Single Image Pretraining, Asano et al. 2020

optional arguments:
  -h, --help            show this help message and exit
  --img_size IMG_SIZE
  --batch_size BATCH_SIZE
  --num_imgs NUM_IMGS   number of images to be generated
  --threads THREADS     how many CPU threads to use for generation
  --vflip               use vflip?
  --deg DEG             max rot angle
  --shear SHEAR         max shear angle
  --cropfirst           usage of initial crop to not focus too much on center
  --initcrop INITCROP   initial crop size relative to image
  --scale SCALE SCALE   data augmentation inverse scale
  --randinterp          For RR crops: use random interpolation method or just bicubic?
  --imgpath IMGPATH
  --debug
  --targetpath TARGETPATH

Reference

If you find this code/idea useful, please consider citing our paper:

@inproceedings{asano2020a,
title={A critical analysis of self-supervision, or what we can learn from a single image},
author={Asano, Yuki M. and Rupprecht, Christian and Vedaldi, Andrea},
booktitle={International Conference on Learning Representations (ICLR)},
year={2020},
}
Owner
Yuki M. Asano
I'm a PhD student in the Visual Geometry Group at the University of Oxford. I work with @chrirupp and @vedaldi.
Yuki M. Asano
repro_eval is a collection of measures to evaluate the reproducibility/replicability of system-oriented IR experiments

repro_eval repro_eval is a collection of measures to evaluate the reproducibility/replicability of system-oriented IR experiments. The measures were d

IR Group at Technische Hochschule Köln 9 May 25, 2022
Intent parsing and slot filling in PyTorch with seq2seq + attention

PyTorch Seq2Seq Intent Parsing Reframing intent parsing as a human - machine translation task. Work in progress successor to torch-seq2seq-intent-pars

Sean Robertson 160 Jan 07, 2023
Split your patch similarly to `git add -p` but supporting multiple buckets

split-patch.py This is git add -p on steroids for patches. Given a my.patch you can run ./split-patch.py my.patch You can choose in which bucket to p

102 Oct 06, 2022
GUI for TOAD-GAN, a PCG-ML algorithm for Token-based Super Mario Bros. Levels.

If you are using this code in your own project, please cite our paper: @inproceedings{awiszus2020toadgan, title={TOAD-GAN: Coherent Style Level Gene

Maren A. 13 Dec 14, 2022
This repo is about to create the Streamlit application for given ML model.

HR-Attritiion-using-Streamlit This repo is about to create the Streamlit application for given ML model. Problem Statement: Managing peoples at workpl

Pavan Giri 0 Dec 10, 2021
ICCV2021 Paper: AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection

ICCV2021 Paper: AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection

Zongdai 107 Dec 20, 2022
Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation

Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation This is the official repository for our paper Neural Reprojection Error

Hugo Germain 78 Dec 01, 2022
Deep Learning segmentation suite designed for 2D microscopy image segmentation

Deep Learning segmentation suite dessigned for 2D microscopy image segmentation This repository provides researchers with a code to try different enco

7 Nov 03, 2022
Compact Bidirectional Transformer for Image Captioning

Compact Bidirectional Transformer for Image Captioning Requirements Python 3.8 Pytorch 1.6 lmdb h5py tensorboardX Prepare Data Please use git clone --

YE Zhou 19 Dec 12, 2022
Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Paper | Blog OFA is a unified multimodal pretrained model that unifies modalities (i.e., cross-modality, vision, language) and tasks (e.g., image gene

OFA Sys 1.4k Jan 08, 2023
Code for the prototype tool in our paper "CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning".

CoProtector Code for the prototype tool in our paper "CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning".

Zhensu Sun 1 Oct 26, 2021
Phy-Q: A Benchmark for Physical Reasoning

Phy-Q: A Benchmark for Physical Reasoning Cheng Xue*, Vimukthini Pinto*, Chathura Gamage* Ekaterina Nikonova, Peng Zhang, Jochen Renz School of Comput

29 Dec 19, 2022
Time should be taken seer-iously

TimeSeers seers - (Noun) plural form of seer - A person who foretells future events by or as if by supernatural means TimeSeers is an hierarchical Bay

279 Dec 26, 2022
Stitch it in Time: GAN-Based Facial Editing of Real Videos

STIT - Stitch it in Time [Project Page] Stitch it in Time: GAN-Based Facial Edit

1.1k Jan 04, 2023
An implementation of DeepMind's Relational Recurrent Neural Networks in PyTorch.

relational-rnn-pytorch An implementation of DeepMind's Relational Recurrent Neural Networks (Santoro et al. 2018) in PyTorch. Relational Memory Core (

Sang-gil Lee 241 Nov 18, 2022
A library for optimization on Riemannian manifolds

TensorFlow RiemOpt A library for manifold-constrained optimization in TensorFlow. Installation To install the latest development version from GitHub:

Oleg Smirnov 83 Dec 27, 2022
Deep learning based hand gesture recognition using LSTM and MediaPipie.

Hand Gesture Recognition Deep learning based hand gesture recognition using LSTM and MediaPipie. Demo video using PingPong Robot Files Pretrained mode

Brad 24 Nov 11, 2022
LaBERT - A length-controllable and non-autoregressive image captioning model.

Length-Controllable Image Captioning (ECCV2020) This repo provides the implemetation of the paper Length-Controllable Image Captioning. Install conda

bearcatt 53 Nov 13, 2022
GAT - Graph Attention Network (PyTorch) 💻 + graphs + 📣 = ❤️

GAT - Graph Attention Network (PyTorch) 💻 + graphs + 📣 = ❤️ This repo contains a PyTorch implementation of the original GAT paper ( 🔗 Veličković et

Aleksa Gordić 1.9k Jan 09, 2023
PyTorch code of "SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks"

SLAPS-GNN This repo contains the implementation of the model proposed in SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks

60 Dec 22, 2022