DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort

Last update: Jan 05, 2023

DatasetGAN

This is the official code and data release for:

DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort

Yuxuan Zhang*, Huan Ling*, Jun Gao, Kangxue Yin, Jean-Francois Lafleche, Adela Barriuso, Antonio Torralba, Sanja Fidler

CVPR'21, Oral [paper] [supplementary] [Project Page]

News

Benchmark Challenge - A benchmark with diverse testing images is coming soon -- stay tuned!
Generated dataset for downstream tasks is coming soon -- stay tuned!

License

Any code dependency related to StyleGAN is licensed under the Creative Commons BY-NC 4.0 license by NVIDIA Corporation. To view a copy of this license, visit LICENSE.

The code of DatasetGAN is released under the MIT license. See LICENSE for additional details.

The dataset of DatasetGAN is released under the Creative Commons BY-NC 4.0 license by NVIDIA Corporation. You can use, redistribute, and adapt the material for non-commercial purposes, as long as you give appropriate credit by citing our paper and indicating any changes that you've made.

Requirements

Python 3.6 and 3.7 are supported.
PyTorch 1.4.0+ is recommended.
This code is tested with the CUDA 10.2 toolkit and cuDNN 7.5.
Check the Python package requirements in requirements.txt and install them with:

pip install -r requirements.txt
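
Before installing, it can help to confirm that the interpreter and PyTorch versions match the list above. The following is a minimal sketch and not part of the repository: the helper names `parse_version` and `check_environment` are made up for illustration, and the PyTorch check is skipped when `torch` cannot be imported.

```python
import sys


def parse_version(text):
    """Turn a string such as '1.4.0+cu102' into (1, 4, 0)."""
    parts = []
    for piece in text.split("+")[0].split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1' or 'a0'
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def check_environment(python_version, torch_version=None):
    """Return a list of warnings; the list is empty if every check passes."""
    warnings = []
    if tuple(python_version[:2]) not in {(3, 6), (3, 7)}:
        warnings.append("Python 3.6 or 3.7 is expected")
    if torch_version is not None and parse_version(torch_version) < (1, 4, 0):
        warnings.append("PyTorch 1.4.0 or newer is recommended")
    return warnings


if __name__ == "__main__":
    try:
        import torch  # optional: only needed for the PyTorch check
        found_torch = torch.__version__
    except ImportError:
        found_torch = None
    for message in check_environment(sys.version_info, found_torch):
        print("warning:", message)
```

The CUDA and cuDNN versions are not checked here; they are the versions the code was tested with rather than hard requirements stated by this README.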

Download the dataset from Google Drive and put it in the folder ./datasetGAN/dataset_release. Please be aware that the DatasetGAN dataset is released under the Creative Commons BY-NC 4.0 license by NVIDIA Corporation.

Download the pretrained checkpoint from StyleGAN and convert the TensorFlow checkpoint to PyTorch. Put the checkpoints in the folder ./datasetGAN/dataset_release/stylegan_pretrain. Please be aware that any code dependency and checkpoint related to StyleGAN is licensed under the Creative Commons BY-NC 4.0 license by NVIDIA Corporation.

Note: a good example of converting a StyleGAN TensorFlow checkpoint to PyTorch is available at this Link.
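
The two folders named above can be checked before training starts. This is a small sketch under the assumption that it is run with the repository root as the argument; the function name `missing_paths` is hypothetical, and the sketch checks only that the directories exist, since the file names inside them are not listed here.

```python
from pathlib import Path


def missing_paths(repo_root):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    expected = [
        root / "datasetGAN" / "dataset_release",
        root / "datasetGAN" / "dataset_release" / "stylegan_pretrain",
    ]
    return [str(path) for path in expected if not path.is_dir()]


if __name__ == "__main__":
    for path in missing_paths("."):
        print("missing directory:", path)
```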

Training

To reproduce the results of the paper DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort:

cd datasetGAN

Run Step 1: Interpreter training.
Run Step 2: Sampling to generate a massive annotation-image dataset.
Run Step 3: Train the downstream task.

1. Interpreter Training

python train_interpreter.py --exp experiments/.json

Note: Training on 16 images takes around one hour and requires 160 GB of RAM. One can cache the data returned from the prepare_data function to disk, but this will increase training time due to the I/O burden.
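
The disk-caching idea in the note can be sketched with a generic wrapper. This is an illustration only: `load_or_compute` is a hypothetical helper, not a function from this repository, and it assumes the value returned by prepare_data can be pickled. The arguments that prepare_data takes are not shown here, so they are passed through unchanged.

```python
import os
import pickle


def load_or_compute(cache_path, compute_fn, *args, **kwargs):
    """Return the cached result if cache_path exists, else compute and store it."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as handle:
            return pickle.load(handle)
    result = compute_fn(*args, **kwargs)
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "wb") as handle:
        # Protocol 4 supports objects larger than 4 GiB.
        pickle.dump(result, handle, protocol=4)
    # Renaming only after a complete write avoids half-written cache files.
    os.replace(tmp_path, cache_path)
    return result
```

A call would look like `load_or_compute("prepare_data.pkl", prepare_data, ...)`, with the same arguments the training script already passes. As the note says, reading the cache back adds I/O time, so this only helps when a later run can reuse the stored data.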

Example of the annotation schema for the Face class. Please refer to the paper for other classes.

2. Run GAN Sampling

python train_interpreter.py \
--generate_data True --exp experiments/.json \
--resume [path-to-trained-interpreter from step 1] \
--num_sample [num-samples]

To run sampling processes in parallel:

sh datasetGAN/script/generate_face_dataset.sh
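
The contents of generate_face_dataset.sh are not reproduced here. As a rough sketch of the idea, a total sample count can be divided across several sampling runs, each using the flags from the command in this step. The helper names `split_samples` and `build_commands` are hypothetical, and the experiment and checkpoint paths are placeholders. Whether separate runs write to distinct output locations is not stated here, so check that before launching them at the same time.

```python
def split_samples(total, workers):
    """Split `total` samples into `workers` near-equal chunks."""
    base, extra = divmod(total, workers)
    return [base + (1 if i < extra else 0) for i in range(workers)]


def build_commands(exp_path, checkpoint, total, workers):
    """Build one argument list per worker; nothing is launched here."""
    commands = []
    for count in split_samples(total, workers):
        if count == 0:
            continue  # more workers than samples: skip idle workers
        commands.append([
            "python", "train_interpreter.py",
            "--generate_data", "True",
            "--exp", exp_path,
            "--resume", checkpoint,
            "--num_sample", str(count),
        ])
    return commands


if __name__ == "__main__":
    # Placeholder paths: substitute your own experiment file and checkpoint.
    for argv in build_commands("EXPERIMENT.json", "INTERPRETER_CHECKPOINT", 10, 3):
        print(" ".join(argv))
```

Each argument list could then be passed to `subprocess.Popen` to start the runs concurrently.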

Example of sampled images and annotations:

3. Train Downstream Task

python train_deeplab.py \
--data_path [path-to-generated-dataset from step 2] \
--exp experiments/.json

Inference

python test_deeplab_cross_validation.py --exp experiments/face_34.json \
--resume [path-to-downstream task checkpoint] --cross_validate True
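
Segmentation results for this project are reported as mIOU (mean intersection over union). The following is a minimal sketch of that metric on flat lists of class ids. It is not the repository's evaluation code, and it assumes that classes absent from both prediction and ground truth are left out of the average; the actual script may treat such classes differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes; pred and target are equal-length lists of class ids."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for cls in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union:  # skip classes that appear in neither list
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    # Class 0: IoU 1/2, class 1: IoU 2/3, so the mean is 7/12.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```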

June 21st Update:

For training the interpreter, we changed the upsampling method from nearest upsampling to bilinear upsampling in line and updated the results in Table 1. The table reports mIOU.
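
The difference between the two upsampling methods can be seen on a tiny grid. This pure-Python sketch is for illustration only and is not the repository's code: the function names are hypothetical, and the bilinear version assumes the corner-aligned convention, which may differ from the setting used in the interpreter.

```python
def upsample_nearest(grid, factor):
    """Repeat every value `factor` times along both axes."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def upsample_bilinear(grid, out_h, out_w):
    """Bilinear resize that maps output corners onto input corners."""
    in_h, in_w = len(grid), len(grid[0])
    out = []
    for i in range(out_h):
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            # Interpolate along x on the two neighboring rows, then along y.
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bottom = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


if __name__ == "__main__":
    grid = [[0.0, 1.0], [2.0, 3.0]]
    print(upsample_nearest(grid, 2))      # blocky: each value is copied
    print(upsample_bilinear(grid, 3, 3))  # smooth: neighbors are blended
```

Nearest upsampling produces constant blocks with hard edges, while bilinear upsampling blends neighboring values, which gives smoother feature maps at object boundaries.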

Citations

Please use the following citation if you use our data or code:

@inproceedings{zhang2021datasetgan,
  title={Datasetgan: Efficient labeled data factory with minimal human effort},
  author={Zhang, Yuxuan and Ling, Huan and Gao, Jun and Yin, Kangxue and Lafleche, Jean-Francois and Barriuso, Adela and Torralba, Antonio and Fidler, Sanja},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={10145--10155},
  year={2021}
}
