Repo for the paper Extrapolating from a Single Image to a Thousand Classes using Distillation

Overview

Extrapolating from a Single Image to a Thousand Classes using Distillation

by Yuki M. Asano* and Aaqib Saeed* (*Equal Contribution)

Our-method

Extrapolating from one image. Strongly augmented patches from a single image are used to train a student (S) to distinguish semantic classes, such as those in ImageNet. The student neural network is initialized randomly and learns from a pretrained teacher (T) via KL-divergence. Although almost none of target categories are present in the image, we find student performances of > 59% for classifying ImageNet's 1000 classes. In this paper, we develop this single datum learning framework and investigate it across datasets and domains.

Key contributions

  • A minimal framework for training neural networks with a single datum from scratch using distillation.
  • Extensive ablations of the proposed method, such as the dependency on the source image, the choice of augmentations and network architectures.
  • Large scale empirical evidence of neural networks' ability to extrapolate on > 13 image, video and audio datasets.
  • Qualitative insights on what and how neural networks trained with a single image learn.

Neuron visualizations

Neurons

We compare activation-maximization-based visualizations using the Lucent library. Even though the model has never seen an image of a panda, the model trained with a teacher and only single-image inputs has a good idea of how a panda looks like.

Running the experiments

Installation

In each folder cifar\in1k\video you will find a requirements.txt file. Install packages as follows:

pip3 install -r requirements.txt

1. Prepare Dataset:

To generate single image data, we refer to the data_generation folder

2. Run Experiments:

There is a main "distill.py" file for each experiment type: small-scale and large-scale images and video. Note: 2a uses tensorflow and 2b, 2c use pytorch.

2a. Run distillation experiments for CIFAR-10/100

e.g. with Animal single-image dataset as follows:

# in cifar folder:
python3 distill.py --dataset=cifar10 --image=/path/to/single_image_dataset/ \
                   --student=wrn_16_4 --teacher=wrn_40_4 

Note that we provide a pretrained teacher model for reproducibility.

2b. Run distillation experiments for ImageNet with single-image dataset as follows:

# in in1k folder:
python3 distill.py --dataset=in1k --testdir /ILSVRC12/val/ \
                   --traindir=/path/to/dataset/ --student_arch=resnet50 --teacher_arch=resnet18 

Note that teacher models are automatically downloaded from torchvision or timm.

2c. Run distillation experiments for Kinetics with single-image-created video dataset as follows:

# in video folder:
python3 distill.py --dataset=k400 --traindir=/dataset/with/vids --test_data_path /path/to/k400/val 

Note that teacher models are automatically downloaded from torchvideo when you distill a K400 model.

Pretrained models

Large-scale (224x224-sized) image ResNet-50 models trained for 200ep:

Dataset Teacher Student Performance Checkpoint
ImageNet-12 R18 R50 59.1% R50 weights
ImageNet-12 R50 R50 53.5% R50 weights
Places365 R18 R50 54.7% R50 weights
Flowers101 R18 R50 58.1% R50 weights
Pets37 R18 R50 83.7% R50 weights
IN100 R18 R50 74.1% R50 weights
STL-10 R18 R50 93.0% R50 weights

Video x3d_s_e (expanded) models (160x160 crop, 4frames) trained for 400ep:

Dataset Teacher Student Performance Checkpoint
K400 x3d_xs x3d_xs_e 53.57% weights
UCF101 x3d_xs x3d_xs_e 77.32% weights

Citation

@inproceedings{asano2021extrapolating,
  title={Extrapolating from a Single Image to a Thousand Classes using Distillation},
  author={Asano, Yuki M. and Saeed, Aaqib},
  journal={arXiv preprint arXiv:2112.00725},
  year={2021}
}
Owner
Yuki M. Asano
I'm an Computer Vision researcher at the University of Amsterdam. Did my PhD at the Visual Geometry Group in Oxford.
Yuki M. Asano
Implementation of: "Exploring Randomly Wired Neural Networks for Image Recognition"

RandWireNN Unofficial PyTorch Implementation of: Exploring Randomly Wired Neural Networks for Image Recognition. Results Validation result on Imagenet

Seung-won Park 684 Nov 02, 2022
Vehicle Detection Using Deep Learning and YOLO Algorithm

VehicleDetection Vehicle Detection Using Deep Learning and YOLO Algorithm Dataset take or find vehicle images for create a special dataset for fine-tu

Maryam Boneh 96 Jan 05, 2023
Dataset and Code for ICCV 2021 paper "Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme"

Dataset and Code for RealVSR Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme Xi Yang, Wangmeng Xiang,

Xi Yang 92 Jan 04, 2023
A concise but complete implementation of CLIP with various experimental improvements from recent papers

x-clip (wip) A concise but complete implementation of CLIP with various experimental improvements from recent papers Install $ pip install x-clip Usag

Phil Wang 515 Dec 26, 2022
PiRapGenerator - Make anyone rap the digits of pi

PiRapGenerator Make anyone rap the digits of pi (sample files are of Ted Nivison

7 Oct 02, 2022
Official PyTorch implementation of the paper "Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs Theory (SB-FBSDE)"

Official PyTorch implementation of the paper "Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs Theory (SB-FBSDE)" which introduces a new class of deep generative models that gene

Guan-Horng Liu 43 Jan 03, 2023
The repo contains the code of the ACL2020 paper `Dice Loss for Data-imbalanced NLP Tasks`

Dice Loss for NLP Tasks This repository contains code for Dice Loss for Data-imbalanced NLP Tasks at ACL2020. Setup Install Package Dependencies The c

223 Dec 17, 2022
A large-scale video dataset for the training and evaluation of 3D human pose estimation models

ASPset-510 ASPset-510 (Australian Sports Pose Dataset) is a large-scale video dataset for the training and evaluation of 3D human pose estimation mode

Aiden Nibali 36 Oct 30, 2022
Python implementation of Wu et al (2018)'s registration fusion

reg-fusion Projection of a central sulcus probability map using the RF-ANTs approach (right hemisphere shown). This is a Python implementation of Wu e

Dan Gale 26 Nov 12, 2021
Learning Representations that Support Robust Transfer of Predictors

Transfer Risk Minimization (TRM) Code for Learning Representations that Support Robust Transfer of Predictors Prepare the Datasets Preprocess the Scen

Yilun Xu 15 Dec 07, 2022
(CVPR 2021) Lifting 2D StyleGAN for 3D-Aware Face Generation

Lifting 2D StyleGAN for 3D-Aware Face Generation Official implementation of paper "Lifting 2D StyleGAN for 3D-Aware Face Generation". Requirements You

Yichun Shi 66 Nov 29, 2022
Official code for CVPR2022 paper: Depth-Aware Generative Adversarial Network for Talking Head Video Generation

📖 Depth-Aware Generative Adversarial Network for Talking Head Video Generation (CVPR 2022) 🔥 If DaGAN is helpful in your photos/projects, please hel

Fa-Ting Hong 503 Jan 04, 2023
PyTorch implementation of neural style transfer algorithm

neural-style-pt This is a PyTorch implementation of the paper A Neural Algorithm of Artistic Style by Leon A. Gatys, Alexander S. Ecker, and Matthias

770 Jan 02, 2023
CoReD: Generalizing Fake Media Detection with Continual Representation using Distillation (ACMMM'21 Oral Paper)

CoReD: Generalizing Fake Media Detection with Continual Representation using Distillation (ACMMM'21 Oral Paper) (Accepted for oral presentation at ACM

Minha Kim 1 Nov 12, 2021
BEGAN in PyTorch

BEGAN in PyTorch This project is still in progress. If you are looking for the working code, use BEGAN-tensorflow. Requirements Python 2.7 Pillow tqdm

Taehoon Kim 260 Dec 07, 2022
Here I will explain the flow to deploy your custom deep learning models on Ultra96V2.

Xilinx_Vitis_AI This repo will help you to Deploy your Deep Learning Model on Ultra96v2 Board. Prerequisites Vitis Core Development Kit 2019.2 This co

Amin Mamandipoor 1 Feb 08, 2022
Implementation of Bottleneck Transformer in Pytorch

Bottleneck Transformer - Pytorch Implementation of Bottleneck Transformer, SotA visual recognition model with convolution + attention that outperforms

Phil Wang 621 Jan 06, 2023
PyoMyo - Python Opensource Myo library

PyoMyo Python module for the Thalmic Labs Myo armband. Cross platform and multithreaded and works without the Myo SDK. pip install pyomyo Documentati

PerlinWarp 81 Jan 08, 2023
Freecodecamp Scientific Computing with Python Certification; Solution for Challenge 2: Time Calculator

Assignment Write a function named add_time that takes in two required parameters and one optional parameter: a start time in the 12-hour clock format

Hellen Namulinda 0 Feb 26, 2022
Deep Anomaly Detection with Outlier Exposure (ICLR 2019)

Outlier Exposure This repository contains the essential code for the paper Deep Anomaly Detection with Outlier Exposure (ICLR 2019). Requires Python 3

Dan Hendrycks 464 Dec 27, 2022